Possible causes of a stuck Cloud Cache decommissioning process

This article applies to:

  • Product edition: Phoenix

Problem description

The Cloud Cache decommissioning process gets stuck.

Causes and Resolution

This article describes the different scenarios that can cause this issue. The stages of decommissioning are as follows:

  1. Backup sets are unmapped from CloudCache.

  2. Backups and restores to and from the Phoenix CloudCache stop.

  3. Phoenix CloudCache waits for the next scheduled synchronization operation to flush the unsynced data from the CloudCache to the Phoenix Cloud.

  4. Phoenix removes the data blocks from Cache Store.

  5. Phoenix removes the Cache Entries from Phoenix UI and database.

Scenario-1: When the Cloud Cache is disconnected

Ensure that the Phoenix CloudCache remains connected to the Phoenix Cloud until the entire decommissioning process, including removal from the UI, is complete. Do not disconnect the CloudCache while decommissioning is in progress.

Traceback

[2019-03-11 14:47:52,650] [ERROR] [wpid 91-4228-1548748451] Failed to connect, error:Failed to connect to client. (#100000011)
[2019-03-11 14:47:52,650] [INFO] [wpid 91-4228-1548748451] CacheFlush activity disconnected. wid 0
[2019-03-11 14:47:52,650] [INFO] Cache CacheFlush activity disconnected. Bytes read 0.00 B. Bytes written 0.00 B.
[2019-03-11 14:47:55,650] [ERROR] [wpid 226-940-1481039643] Error <class 'inSyncLib.inSyncError.SyncError'>:Failed to connect to client. (#100000011). Traceback -Traceback (most recent call last):
  File "roboCacheWorker.py", line 284, in runserver
  File "inSyncLib\inSyncRPCServer.pyc", line 557, in serve
SyncError: Failed to connect to client. (#100000011)
Snip

Resolution

Ensure that the Phoenix CloudCache is connected to the Phoenix Cloud. Fix any connectivity issues to resume decommissioning.

The Cache status remains stuck on Decommissioning in Progress if the CloudCache is disconnected.
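
As a quick first check, you can verify basic network reachability from the CloudCache server to the Phoenix Cloud. The following is a minimal sketch, not a supported tool: the hostname is a placeholder and port 443 is an assumption, so substitute the Phoenix Cloud endpoint configured for your environment.

# Minimal reachability check, run on the CloudCache server.
# PHOENIX_CLOUD_HOST is a placeholder, not the real endpoint; port 443 is assumed.
import socket

PHOENIX_CLOUD_HOST = "phoenix-cloud.example.com"
PHOENIX_CLOUD_PORT = 443

def can_reach_phoenix_cloud(host, port, timeout=10):
    """Return True if a TCP connection to the endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError as err:
        print("Connection failed: %s" % err)
        return False

if __name__ == "__main__":
    print(can_reach_phoenix_cloud(PHOENIX_CLOUD_HOST, PHOENIX_CLOUD_PORT))

If this check fails, resolve the firewall, proxy, or DNS issue first; the decommissioning status will not progress until the CloudCache can reach the cloud again.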


Scenario-2: Allocated network bandwidth and sync schedule are insufficient

According to the decommissioning workflow, Phoenix syncs the pending data to the Phoenix Cloud in the next cache schedule cycle. Because this takes time, the CloudCache can continue to display Decommissioning in Progress for a duration that depends on:

  • The size of the data to be synchronized

  • The available network bandwidth

  • The duration specified in the sync schedule


💡 Tip

Configure the CloudCache synchronization schedule to run 24 hours a day, 7 days a week, for an uninterrupted decommission. Ensure that you select the maximum available bandwidth in your environment (bandwidth is measured in Megabits per second).
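
To gauge whether the schedule and bandwidth are sufficient, you can estimate how long the final flush will take from the amount of pending data. The sketch below is purely illustrative arithmetic; the efficiency factor and the sample figures are assumptions, not product values.

# Rough estimate of the final flush duration, to help size the sync window.
def estimated_flush_hours(pending_data_gb, bandwidth_mbps, efficiency=0.8):
    """Convert pending data (GB) and bandwidth (Megabits/second) into hours.

    efficiency is an assumed factor for protocol and scheduling overhead.
    """
    pending_megabits = pending_data_gb * 8 * 1000   # GB -> Megabits (decimal units)
    seconds = pending_megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600

# Example: 2 TB pending at 100 Mbps takes roughly 55 hours,
# i.e. more than two full days even with a 24-hour sync window.
print(round(estimated_flush_hours(2000, 100), 1))

If the estimate exceeds your configured sync window, the Decommissioning in Progress status is expected to persist for that long; this is normal behavior rather than a stuck process.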


Scenario-3: Phoenix CacheStore is unavailable

Decommissioning can get stuck if the process has been initiated and all the data has been synced to the cloud, but the PhoenixCacheStore is unavailable. This can occur when:

  • The volume on which the CacheStore resides has been formatted.

  • The disk on which the CacheStore resides has crashed.

Traceback

[2019-03-11 15:47:56,732] [ERROR] Error <type 'exceptions.Exception'>:CRITICAL: Cache store folder E:\PhoenixCacheStore does not exist on file system. Exiting. Traceback -Traceback (most recent call last):
File "roboCacheServer.py", line 236, in server_main
File "roboCacheServer.py", line 389, in _server_main
Exception: CRITICAL: Cache store folder E:\PhoenixCacheStore does not exist on file system. Exiting.

Resolution

Do not format the volume on which the CacheStore resides until the decommissioning process is complete. Contact Druva Support to troubleshoot this scenario further.
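
Before contacting support, you can confirm whether the folder reported in the traceback still exists and is readable on the server. This is a minimal sketch; the path shown is taken from the example traceback above, so substitute the CacheStore location configured on your CloudCache server.

# Quick check that the CacheStore folder exists and can be listed.
# The path below is the example from the traceback; replace it with your own.
import os

CACHE_STORE_PATH = r"E:\PhoenixCacheStore"

def cache_store_available(path):
    """Return True if the CacheStore directory exists and is readable."""
    if not os.path.isdir(path):
        print("CacheStore folder %s does not exist on the file system." % path)
        return False
    try:
        os.listdir(path)
    except OSError as err:
        print("CacheStore folder %s is not readable: %s" % (path, err))
        return False
    return True

if __name__ == "__main__":
    print(cache_store_available(CACHE_STORE_PATH))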

Scenario-4: Mapped storage does not exist

Cloud Cache decommissioning can get stuck when the process is initiated but the storage mapped to the Cloud Cache has been deleted. In this case, unflushed data accumulates in the database and is never synced.

Resolution

Contact Druva Support to troubleshoot this scenario.


Scenario-5: Phoenix CloudCache is deleted from the customer environment

CloudCache decommissioning can become blocked if the CloudCache has been completely deleted from the customer environment before the decommissioning process is initiated from the console.

Note: Deleting the CloudCache before initiating decommissioning is not the recommended way to decommission. Any data that is still unsynced at that point can result in data loss.

Resolution

  1. Ensure that the decommission process is initiated only after the CloudCache console shows no pending data to upload.

  2. Wait for the decommission to complete, and only then delete the CloudCache from the environment.

  3. If any issues persist, contact Druva Support for further investigation.
