Problem description
Customer has configured multiple SQL server as AAG on the Druva console
E.g AAG name:SQLserver
Primary node: SQLserver 1
Secondary node : SQLserver 2
DB discovery works successful on SQL server1 but it fails on SQLserver2
Traceback:
1.Please retrieve the client logs from both the primary and secondary node servers where the EnterpriseWorkload client is installed. You can find the logs in the following directory:
%ProgramData%\Druva
Check if in the main service logs or control service logs you see below error for any of the SQL sever
[2024-01-26 14:35:11,265] [DEBUG] SqlClusterHelper : instances = [u'<SQL_instance name>'], command_params=None
[2024-01-26 14:35:11,328] [INFO] SQL Server instance <SQL_instance name> has edition Developer Edition (64-bit)
[2024-01-26 14:35:11,328] [INFO] SQL Server instance <SQL_instance name>version SQL_2019 - 15.0.2000.5 RTM
[2024-01-26 14:35:11,328] [DEBUG] CLUSTER_NAME result = [(u'<SQL_cluster_ name>r', )]
[2024-01-26 14:35:11,344] [DEBUG] DISCOVERY_NODE_NAME result = [(u'<primary/secondary node>', )]
[2024-01-26 14:35:11,828] [DEBUG] AG_LIST [u'<SQL_AAG_name>']
[2024-01-26 14:35:11,828] [ERROR] Error in Executing sql discovery
[2024-01-26 14:35:11,828] [ERROR] Error <type 'exceptions.IndexError'>:list index out of range. Traceback -Traceback (most recent call last):
File "agents\sqlserver\SqlAgent.pyc", line 165, in run_discovery
File "agents\sqlserver\SqlHelper.pyc", line 169, in get_discovery_info
File "agents\sqlserver\SqlHelper.pyc", line 214, in _get_instances_info
File "agents\sqlserver\SqlHelper.pyc", line 292, in _get_availability_groups
IndexError: list index out of range
2. Check the sql_agent.log from both the primary and secondary server and check for respective entry
"replica_info": {
"SQLserver1": {
"instance_name": "SQLserver1",
"exclude_replica": false,
"backup_priority": 50,
"replica_connected_state": 1,
"secondary_role_allow_connections": 2,
"replica_role": 1,
"replica_sync_health": 2,
"replica_id": "621D9441-xxxx-xxxx-xxxx-06FD2A23xxxx",
"is_discovery_node": true
},
"SQLserver2": {
"instance_name": "SQLserver2",
"exclude_replica": false,
"backup_priority": 50,
"replica_connected_state": 1,
"secondary_role_allow_connections": 2,
"replica_role": 2,
"replica_sync_health": 0,
"replica_id": "211F603D-xxxx-xxxx-xxxx0-19092DCExxxx",
"is_discovery_node": false
}
}
Resolution
In the context of Microsoft SQL Server, replica_sync_health = 2 indicates that the synchronization health of a replica in an Always On Availability Group is marked as Not Healthy.
Meaning:
This value is part of the sys.dm_hadr_availability_replica_states DMV (Dynamic Management View) and represents the replica_sync_health column. The possible values are:
0: Not Healthy
The replica is not synchronizing data properly or has encountered errors.1: Partially Healthy
Some synchronization is occurring, but there may be issues affecting full health.2: Healthy
The replica is fully synchronized and functioning as expected.
What replica_sync_health = 2 signifies:
If you see replica_sync_health = 2, it means:
The replica is in a good state.
Data is synchronized between the primary and the replica.
There are no issues reported regarding the health of the synchronization.
Reference Article.
From the above traceback we see the "replica_sync_health": 0, is shown for SQLserver2 which indicates that the replica is not synchronizing data properly or has encountered errors. Thus, db discovery was failing for SQLserver2.
To remediate this issue, the customer should involve the SQL administrator to resolve the synchronization issue for the specified AG SQL server. Once the synchronization issue is fixed, we can proceed with initiating an on-demand database discovery from the Phoenix console.