Always On Availability Group Detail

Check the state of an Always On Availability Group

This dashboard shows detailed health and replication telemetry for a single Always On Availability Group (AG). Use it to verify replica roles, track failovers, monitor data movement, and identify databases that need attention.

Top: AG summary

  • Availability Group: AG name
  • Primary Replica: current primary replica host
  • Secondary Replicas: configured secondaries
  • Total Nodes: count of configured replicas
  • Online Nodes: replicas currently online and reachable
  • N. Databases: number of databases in the AG
  • Synchronization Health: overall sync state for the AG
  • Listener Name: cluster listener DNS name (if configured)
  • Listener IP: listener IP address(es)

Primary Replica Failovers timeline

  • A timeline that shows which replica was primary at each point in time.
  • Use it to review recent failovers and to correlate role changes with events or performance anomalies.
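
The timeline is, in effect, a series of "who was primary" samples, and a failover is any point where the value changes. A minimal sketch of that idea, assuming the samples are exported as `(timestamp, primary_name)` pairs sorted by time; the replica names and the `find_failovers` helper are hypothetical and not part of the dashboard:

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple, Tuple


class Failover(NamedTuple):
    at: datetime        # first sample that shows the new primary
    old_primary: str
    new_primary: str


def find_failovers(samples: Iterable[Tuple[datetime, str]]) -> List[Failover]:
    """Return one Failover for every change of primary between consecutive samples."""
    events: List[Failover] = []
    previous = None
    for timestamp, primary in samples:
        if previous is not None and primary != previous:
            events.append(Failover(timestamp, previous, primary))
        previous = primary
    return events


# Hypothetical samples: SQL01 is primary until SQL02 takes over.
samples = [
    (datetime(2024, 1, 1, 10, 0), "SQL01"),
    (datetime(2024, 1, 1, 10, 5), "SQL01"),
    (datetime(2024, 1, 1, 10, 10), "SQL02"),
    (datetime(2024, 1, 1, 10, 15), "SQL02"),
]
failovers = find_failovers(samples)
```

Each detected timestamp is only as precise as the sampling interval, so treat it as the upper bound of the window in which the role change happened.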

Availability Group Nodes table

  • Replica Instance: instance name for each replica
  • Replica Role: Primary / Secondary
  • Sync. Health: per-replica synchronization status
  • Availability Mode: synchronous / asynchronous
  • Failover Mode: automatic / manual
  • Seeding Mode: automatic / manual
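  • Secondary Allow Connections: whether the replica accepts client connections while in the secondary role (none, read-intent only, or all)
  • Backup Priority: relative priority of the replica when choosing where to run backups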
  • Endpoint URL: data movement endpoint
  • R/O Routing URL: read-only routing address (if configured)
  • R/W Routing URL: read-write routing address (if configured)

Nodes KPIs and online history

  • KPIs: Total Nodes and Offline Nodes for quick situational awareness.
  • Online Nodes chart: time-series showing the number of online replicas over the selected interval to spot outages or flapping nodes.
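
One way to quantify "flapping" is to count how many separate times the online count dropped below the configured total during the interval. A minimal sketch, assuming the chart data is available as a plain list of online-node counts; the `count_dips` helper and the sample values are hypothetical:

```python
from typing import Sequence


def count_dips(online_counts: Sequence[int], total_nodes: int) -> int:
    """Count distinct episodes in which fewer than total_nodes replicas were online."""
    dips = 0
    in_dip = False
    for count in online_counts:
        below = count < total_nodes
        if below and not in_dip:
            dips += 1  # a new episode starts at this sample
        in_dip = below
    return dips


# Hypothetical 3-node AG: one short drop, then a longer one.
series = [3, 3, 2, 3, 2, 2, 3, 3]
dips = count_dips(series, total_nodes=3)
```

A single long episode suggests an outage, while many short episodes in the same interval suggest a flapping node.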

Transfer rates and queues

  • Transfer Rates chart: Send Rate (how fast the primary sends changes) and Redo Rate (how fast secondaries apply changes). Use it to spot slow secondaries or network saturation.
  • Transfer Queue Size chart: Send Queue Size and Redo Queue Size. Growing queues indicate replication lag or bottlenecks that may affect failover readiness.
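
Queue size and rate can be combined into a rough time estimate: dividing a queue by its rate gives the approximate number of seconds needed to drain it. The sketch below assumes queues are reported in KB and rates in KB/s, which matches the units SQL Server exposes in `sys.dm_hadr_database_replica_states`; check the units your charts actually use. The `estimate_lag` helper and the sample numbers are hypothetical:

```python
from typing import Dict, Optional


def _drain_seconds(queue_kb: float, rate_kb_s: float) -> Optional[float]:
    """Seconds to drain a queue at the given rate; None if it is not draining."""
    if queue_kb == 0:
        return 0.0
    if rate_kb_s <= 0:
        return None  # queue is non-empty but nothing is moving
    return queue_kb / rate_kb_s


def estimate_lag(send_queue_kb: float, send_rate_kb_s: float,
                 redo_queue_kb: float, redo_rate_kb_s: float) -> Dict[str, Optional[float]]:
    """Rough estimates: time to ship unsent log, and time to apply received log."""
    return {
        "send_seconds": _drain_seconds(send_queue_kb, send_rate_kb_s),
        "redo_seconds": _drain_seconds(redo_queue_kb, redo_rate_kb_s),
    }


# Hypothetical reading: 2 MB waiting to be sent, 50 MB waiting to be redone.
lag = estimate_lag(send_queue_kb=2048, send_rate_kb_s=1024,
                   redo_queue_kb=51200, redo_rate_kb_s=512)
```

With these sample numbers the send queue would drain in about 2 seconds and the redo queue in about 100 seconds. The redo estimate approximates how long a secondary would need to catch up before it could serve current data after a failover. Both figures assume the rates stay constant.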

Health history charts

  • Online node history: online vs total nodes over time to visualize availability trends.
  • Database Health History: healthy databases vs total databases to track when databases become unsynchronized or unhealthy.

Databases Replication Status table

  • SQL Instance: instance hosting the database replica
  • Database Name: database name
  • Sync. Health: synchronization status for the database
  • Is Primary Replica: indicates whether this row is the primary
  • Availability Mode: availability mode that applies to the database (inherited from the replica that hosts it)

Usage and investigation tips

  • Correlate the failover times shown in the Primary Replica Failovers timeline with performance metrics (CPU, I/O) to find the causes of role changes.
  • Growing send or redo queues, or sustained low redo rates, often point to network, disk, or resource contention on the secondaries. Investigate those hosts before initiating a failover or taking other corrective action.
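
To tell a queue that is genuinely growing from one that is merely noisy, it can help to fit a straight line through recent samples and look at the slope. A minimal sketch using an ordinary least-squares fit, assuming samples are `(seconds_since_start, queue_kb)` pairs; the `queue_growth_kb_per_s` helper and the sample values are hypothetical:

```python
from typing import Sequence, Tuple


def queue_growth_kb_per_s(points: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of queue size over time, in KB per second."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in points) / n
    mean_q = sum(q for _, q in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (q - mean_q) for t, q in points)
    return cov / var_t


# Hypothetical redo queue sampled once a minute.
redo_queue = [(0, 1000), (60, 1600), (120, 2200), (180, 2800)]
slope = queue_growth_kb_per_s(redo_queue)
```

In this sample the queue grows by about 10 KB every second. A slope that stays positive across several intervals means the secondary is falling further behind; a slope near zero or negative means it is keeping up or recovering.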