Always On Availability Groups
Check High Availability of databases
This dashboard provides an at-a-glance overview of Always On Availability
Groups (AGs) across the selected instances. Use it to verify AG health,
replica roles, and synchronization status, and to quickly locate groups that
need attention.
Availability Groups table
- Availability Group: AG name (click to open the AG detail dashboard).
- Primary Replica: the current primary replica host for the AG.
- Secondary Replicas: comma-separated list of configured secondary replicas.
- Total Nodes: number of replicas configured in the AG.
- Online Nodes: replicas currently online and reachable.
- N. Databases: number of databases protected by the AG.
- Synchronization Health: overall sync state (Healthy or Not Healthy),
derived from replica synchronization and failover readiness.
- Listener DNS Name: cluster listener DNS name, if configured.
- Listener IP: listener IP address or addresses.
Usage
- Click an AG name to open the detail dashboard, which shows per-replica
metrics, database synchronization progress, and failover readiness.
- Filter or sort the table to find AGs with offline nodes, unsynchronized
databases, or other anomalies that require investigation.
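If the table rows are exported for scripting, the same filter can be sketched in Python. This is only an illustration: the `AgRow` record and its field names (`total_nodes`, `online_nodes`, `sync_health`) are hypothetical names that mirror the columns above, not an API exposed by the dashboard.

```python
from dataclasses import dataclass


@dataclass
class AgRow:
    # Hypothetical field names mirroring the table columns.
    name: str
    total_nodes: int
    online_nodes: int
    sync_health: str  # "Healthy" or "Not Healthy"


def needs_attention(row: AgRow) -> bool:
    """Flag an AG that has offline replicas or a non-healthy sync state."""
    has_offline = row.online_nodes < row.total_nodes
    return has_offline or row.sync_health != "Healthy"


# Sample rows (invented values, for illustration only).
rows = [
    AgRow("AG_Sales", 3, 3, "Healthy"),
    AgRow("AG_HR", 2, 1, "Healthy"),
    AgRow("AG_Finance", 3, 3, "Not Healthy"),
]
flagged = [r.name for r in rows if needs_attention(r)]
print(flagged)
```

With these sample rows, AG_HR is flagged for an offline node and AG_Finance for its sync state.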
1 - Always On Availability Group Detail
Check the state of an Always On Availability Group
This dashboard shows detailed health and replication telemetry for a single
Always On Availability Group (AG). Use it to verify replica roles, track
failovers, monitor data movement, and identify databases that need attention.
Top: AG summary
- Availability Group: AG name
- Primary Replica: current primary replica host
- Secondary Replicas: configured secondaries
- Total Nodes: count of configured replicas
- Online Nodes: replicas currently online and reachable
- N. Databases: number of databases in the AG
- Synchronization Health: overall sync state for the AG
- Listener Name: cluster listener DNS name (if configured)
- Listener IP: listener IP address(es)
Primary Replica Failovers timeline
- A timeline that shows which replica was primary at each point in time.
- Use it to review recent failovers and to correlate role changes with events
or performance anomalies.
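To make the idea concrete, failover events can be derived from the timeline data by looking for changes in the primary between consecutive samples. The sketch below assumes the timeline is available as (timestamp, primary replica name) pairs; the `find_failovers` helper and the sample values are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple

# (sample time, name of the replica that was primary at that time)
Sample = Tuple[datetime, str]


def find_failovers(samples: List[Sample]) -> List[Tuple[datetime, str, str]]:
    """Return (time, old_primary, new_primary) for each change of primary."""
    ordered = sorted(samples, key=lambda s: s[0])
    events = []
    # Compare each sample with the one before it.
    for (_, prev), (ts, curr) in zip(ordered, ordered[1:]):
        if curr != prev:
            events.append((ts, prev, curr))
    return events


# Invented sample data, for illustration only.
timeline = [
    (datetime(2024, 1, 1, 10, 0), "SQL01"),
    (datetime(2024, 1, 1, 10, 5), "SQL01"),
    (datetime(2024, 1, 1, 10, 10), "SQL02"),
    (datetime(2024, 1, 1, 10, 15), "SQL02"),
]
failovers = find_failovers(timeline)
print(failovers)
```

The reported time is that of the first sample taken under the new primary, so the actual role change happened at some point within the preceding sampling interval.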
Availability Group Nodes table
- Replica Instance: instance name for each replica
- Replica role: Primary / Secondary
- Sync. Health: per-replica synchronization status
- Availability Mode: synchronous / asynchronous
- Failover Mode: automatic / manual
- Seeding Mode: automatic / manual
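- Secondary Allow Connections: whether the replica accepts connections while
in the secondary role (none, read-intent only, or all)
- Backup Priority: relative priority (0 to 100) of the replica when choosing
where to run backups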
- Endpoint URL: data movement endpoint
- R/O Routing URL: read-only routing address (if configured)
- R/W Routing URL: read-write routing address (if configured)
Nodes KPIs and online history
- KPIs: Total Nodes and Offline Nodes for quick situational awareness.
- Online Nodes chart: time-series showing the number of online replicas over
the selected interval to spot outages or flapping nodes.
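One way to distinguish a single outage from a flapping node is to count how often the series switches between "all replicas online" and "degraded". The sketch below assumes the chart data is available as a list of online-replica counts; the `max_transitions` threshold is an arbitrary illustrative value, not a dashboard setting.

```python
from typing import List


def count_transitions(online_counts: List[int], total_nodes: int) -> int:
    """Count switches between 'all online' and 'degraded' in the series."""
    states = [count >= total_nodes for count in online_counts]
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


def is_flapping(online_counts: List[int], total_nodes: int,
                max_transitions: int = 2) -> bool:
    """Treat more than max_transitions state changes as flapping (assumed rule)."""
    return count_transitions(online_counts, total_nodes) > max_transitions


# Invented series for a 3-replica AG.
single_outage = [3, 3, 2, 2, 2, 3, 3]   # goes down once and recovers
flapping = [3, 2, 3, 2, 3, 2, 3]        # repeatedly drops and recovers

print(is_flapping(single_outage, 3), is_flapping(flapping, 3))
```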
Transfer rates and queues
- Transfer Rates chart: Send Rate (how fast the primary sends changes) and
Redo Rate (how fast secondaries apply changes). Use it to spot slow
secondaries or network saturation.
- Transfer Queue Size chart: Send Queue Size and Redo Queue Size. Growing
queues indicate replication lag or bottlenecks that may affect failover
readiness.
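The relationship between queue size and rate can be turned into a rough catch-up estimate: queue size divided by the current rate gives the time needed to drain the queue if nothing new arrives. The sketch below assumes sizes in KB and rates in KB per second; check the units shown on the charts before reusing it. The `is_growing` heuristic and its 1.2 factor are illustrative assumptions.

```python
from typing import List, Optional


def catch_up_seconds(queue_kb: float, rate_kb_per_s: float) -> Optional[float]:
    """Estimate seconds to drain a queue at the current rate.

    Returns 0.0 for an empty queue and None when the rate is zero
    (the queue is stalled, so no finite estimate exists).
    """
    if queue_kb <= 0:
        return 0.0
    if rate_kb_per_s <= 0:
        return None
    return queue_kb / rate_kb_per_s


def is_growing(samples_kb: List[float], factor: float = 1.2) -> bool:
    """True when the later half of the window averages above the earlier half."""
    if len(samples_kb) < 4:
        return False  # too few points to judge a trend
    mid = len(samples_kb) // 2
    early = sum(samples_kb[:mid]) / mid
    late = sum(samples_kb[mid:]) / (len(samples_kb) - mid)
    return late > early * factor


# Invented values: a 50 MB redo queue applied at 1 MB/s.
estimate = catch_up_seconds(51200, 1024)
growing = is_growing([100, 120, 180, 260, 400, 650])
stable = is_growing([100, 90, 110, 95, 105, 100])
print(estimate, growing, stable)
```

A large redo queue mainly lengthens recovery time after failover, while a large send queue on an asynchronous replica represents changes that have not yet reached the secondary.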
Health history charts
- Online Node History: online vs total nodes over time to visualize
availability trends.
- Database Health History: healthy databases vs total databases to track when
databases become unsynchronized or unhealthy.
Databases Replication Status table
- SQL Instance: instance hosting the database replica
- Database Name: database name
- Sync. Health: synchronization status for the database
- Is Primary Replica: indicates whether the instance in this row is the
primary replica for the database
- Availability Mode: availability mode of the database replica (inherited
from the AG replica configuration)
Usage and investigation tips
- Correlate failover times from the Primary Replica Failovers timeline with
performance metrics (CPU, I/O) to find the causes of role changes.
- Growing send or redo queues, or sustained low redo rates, often point to
network, disk, or resource contention on secondaries. Investigate those
hosts before initiating a failover or taking corrective action.
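As a rough illustration of the first tip, the sketch below looks up the peak value of a metric in a window just before a failover. It assumes the metric is available as (timestamp, value) pairs; the `peak_before` helper, the 15-minute window, and the sample CPU values are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

MetricSample = Tuple[datetime, float]  # (sample time, metric value)


def peak_before(event: datetime, samples: List[MetricSample],
                window: timedelta) -> Optional[float]:
    """Return the highest value in [event - window, event], or None if empty."""
    values = [v for ts, v in samples if event - window <= ts <= event]
    return max(values) if values else None


# Invented CPU samples (percent) around a failover at 10:10.
cpu = [
    (datetime(2024, 1, 1, 9, 50), 35.0),
    (datetime(2024, 1, 1, 10, 0), 62.0),
    (datetime(2024, 1, 1, 10, 5), 97.0),
    (datetime(2024, 1, 1, 10, 20), 40.0),
]
failover_at = datetime(2024, 1, 1, 10, 10)
peak = peak_before(failover_at, cpu, timedelta(minutes=15))
print(peak)
```

A high peak shortly before the role change suggests, but does not prove, that resource pressure contributed to the failover.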