Always On Availability Groups

Check the high availability of databases

This dashboard provides an at-a-glance overview of Always On Availability Groups (AGs) across the selected instances. Use it to verify AG health, replica roles, and synchronization status, and to quickly locate groups that need attention.

Availability Groups table

Availability Group: AG name (click to open the AG detail dashboard).
Primary Replica: the current primary replica host for the AG.
Secondary Replicas: comma-separated list of configured secondary replicas.
Total Nodes: number of replicas configured in the AG.
Online Nodes: replicas currently online and reachable.
N. Databases: number of databases protected by the AG.
Synchronization Health: overall sync state (Healthy, Not Healthy) based on replica synchronization and failover readiness.
Listener DNS Name: cluster listener DNS name, if configured.
Listener IP: listener IP address or addresses.

Usage

Click an AG name to open the detail dashboard, which shows per-replica metrics, database synchronization progress, and failover readiness.
Filter or sort the table to find AGs with offline nodes, unsynchronized databases, or other anomalies that require investigation.
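
The filtering described above can also be sketched in code, assuming the table rows have been exported as a list of dictionaries. The field names (`total_nodes`, `online_nodes`, `sync_health`), the sample rows, and the `needs_attention` helper are hypothetical and are not part of the dashboard itself.

```python
def needs_attention(ag):
    """Return True if an AG row has offline nodes or is not reported as Healthy."""
    offline = ag["total_nodes"] - ag["online_nodes"]
    return offline > 0 or ag["sync_health"] != "Healthy"


# Hypothetical rows mirroring the Availability Groups table columns.
rows = [
    {"name": "AG_Sales", "total_nodes": 3, "online_nodes": 3, "sync_health": "Healthy"},
    {"name": "AG_HR", "total_nodes": 2, "online_nodes": 1, "sync_health": "Not Healthy"},
]

flagged = [ag["name"] for ag in rows if needs_attention(ag)]
print(flagged)
```

With these sample rows, only the AG that has an offline node is flagged for investigation.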

1 - Always On Availability Group Detail

Check the state of an Always On Availability Group

This dashboard shows detailed health and replication telemetry for a single Always On Availability Group (AG). Use it to verify replica roles, track failovers, monitor data movement, and identify databases that need attention.

Top: AG summary

Availability Group: AG name
Primary Replica: current primary replica host
Secondary Replicas: configured secondaries
Total Nodes: count of configured replicas
Online Nodes: replicas currently online and reachable
N. Databases: number of databases in the AG
Synchronization Health: overall sync state for the AG
Listener Name: cluster listener DNS name (if configured)
Listener IP: listener IP address(es)

Primary Replica Failovers timeline

A timeline that shows which replica was primary at each point in time.
Use it to review recent failovers and to correlate role changes with events or performance anomalies.
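
As a rough illustration of how failovers can be read off such a timeline, the sketch below derives role-change events from periodic samples of the primary replica. The sample format, the host names, and the `failover_events` helper are assumptions made for this example; the dashboard may collect and store this data differently.

```python
def failover_events(samples):
    """Return (timestamp, old_primary, new_primary) for each change of primary.

    samples: list of (timestamp, primary_replica) tuples sorted by timestamp.
    """
    events = []
    # Compare each sample with the one before it and record every change.
    for (_, prev), (ts, curr) in zip(samples, samples[1:]):
        if curr != prev:
            events.append((ts, prev, curr))
    return events


# Hypothetical samples taken every five minutes.
samples = [
    ("2024-01-01T10:00", "SQL01"),
    ("2024-01-01T10:05", "SQL01"),
    ("2024-01-01T10:10", "SQL02"),
    ("2024-01-01T10:15", "SQL02"),
]
print(failover_events(samples))
```

Each event carries the timestamp of the first sample taken under the new primary, which is the point to line up against other metrics.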

Availability Group Nodes table

Replica Instance: instance name for each replica
Replica role: Primary / Secondary
Sync. Health: per-replica synchronization status
Availability Mode: synchronous / asynchronous
Failover Mode: automatic / manual
Seeding Mode: automatic / manual
Secondary Allow Connections: whether the replica accepts connections while in the secondary role (none, read-intent only, or all)
Backup Priority: priority used to choose the preferred replica for backups
Endpoint URL: data movement endpoint
R/O Routing URL: read-only routing address (if configured)
R/W Routing URL: read-write routing address (if configured)

Nodes KPIs and online history

KPIs: Total Nodes and Offline Nodes for quick situational awareness.
Online Nodes chart: time-series showing the number of online replicas over the selected interval to spot outages or flapping nodes.

Transfer rates and queues

Transfer Rates chart: Send Rate (how fast the primary sends changes) and Redo Rate (how fast secondaries apply changes). Use it to spot slow secondaries or network saturation.
Transfer Queue Size chart: Send Queue Size and Redo Queue Size. Growing queues indicate replication lag or bottlenecks that may affect failover readiness.
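
A queue size becomes easier to interpret when divided by the matching rate: redo queue over redo rate approximates how long a secondary needs to catch up, and send queue over send rate approximates how long the primary needs to ship pending changes. The sketch below assumes sizes in KB and rates in KB/s, which are the units SQL Server uses in `sys.dm_hadr_database_replica_states`; the units shown on the charts may differ. The `estimated_seconds` helper is hypothetical.

```python
def estimated_seconds(queue_kb, rate_kb_per_s):
    """Estimate the seconds needed to drain a queue at a steady rate.

    Returns None when the rate is zero or missing, since no estimate is possible.
    """
    if not rate_kb_per_s:
        return None
    return queue_kb / rate_kb_per_s


# A 51200 KB redo queue applied at 1024 KB/s.
print(estimated_seconds(51200, 1024))  # 50.0
```

The estimate assumes the rate stays constant and no new changes arrive, so treat it as a lower bound on a busy system.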

Health history charts

Online Node History: online vs total nodes over time to visualize availability trends.
Database Health History: healthy databases vs total databases to track when databases become unsynchronized or unhealthy.

Databases Replication Status table

SQL Instance: instance hosting the database replica
Database Name: database name
Sync. Health: synchronization status for the database
Is Primary Replica: indicates whether this row is the primary
Availability Mode: availability mode that applies to the database (inherited from the configuration of its hosting replica in the AG)

Usage and investigation tips

Correlate failover times from the Primary Replica Failovers timeline with performance metrics (CPU, I/O) to find the causes of role changes.
Growing send or redo queues, or sustained low redo rates, often point to network, disk, or resource contention on secondaries. Investigate those hosts before initiating a failover or taking other corrective action.
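
One simple way to tell a sustained increase from a momentary spike is to check whether the queue grew in each of the last few sampling intervals. This is a minimal sketch under that assumption; the `is_growing` helper, the `min_run` threshold, and the sample values are hypothetical and should be tuned to the sampling interval in use.

```python
def is_growing(values, min_run=3):
    """Return True if each of the last `min_run` intervals shows a strict increase."""
    if len(values) < min_run + 1:
        return False  # not enough samples to judge
    tail = values[-(min_run + 1):]
    return all(b > a for a, b in zip(tail, tail[1:]))


# Hypothetical redo queue sizes (KB) from consecutive samples.
print(is_growing([10, 10, 20, 35, 60]))  # True: the last three intervals all increase
print(is_growing([10, 50, 20, 25, 30]))  # False: the queue dropped within the window
```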