Problem Description
VMware backups were consistently failing to connect to the designated CloudCache server.
As a result, backups were bypassing the cache and falling back to a direct-to-cloud transfer.
The backup progress logs displayed "could not connect to cloudcache" messages, confirming the connectivity failure between the VMware Backup Proxy and the CloudCache.
This issue was identified after the customer switched to a new Certificate Authority (CA), GlobalSign, and replaced the default self-signed certificates on the CloudCache with the new custom certificates.
Traceback
Initial network connectivity checks, such as ping and telnet on port 443 between the VMware Backup Proxy and the CloudCache server, were successful, ruling out basic network or firewall issues.
The root of the issue was identified in the Phoenix CloudCache logs and through SSL connection tests.
1. Phoenix CloudCache Logs
The following errors were found in the CloudCache log file:
Linux:
/var/log/PhoenixCloudCache/PhoenixCloudCache.log
Windows:
C:/ProgramData/PhoenixCloudCache/PhoenixCloudCache.log
[ERROR] (#15113) Error while reading key
[INFO] (#15113) No workers were available for use due to connection issue
[INFO] (#15113) Server is busy
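To check quickly whether a CloudCache log contains these signatures, a small helper along the following lines may be useful. This is a minimal sketch: the function name count_cloudcache_errors is hypothetical, the default path is the Linux location given above, and the two patterns are taken from the log excerpt in this article rather than from any official list of error strings.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that match the two failure
# signatures quoted in this article. Pass a log path as the first
# argument, or omit it to use the Linux default location.
count_cloudcache_errors() {
    log_file="${1:-/var/log/PhoenixCloudCache/PhoenixCloudCache.log}"
    # -c prints the number of matching lines; -E enables alternation.
    grep -cE 'Error while reading key|No workers were available for use' "$log_file"
}

# Example (run on the CloudCache server):
#   count_cloudcache_errors
```

A non-zero count suggests the service is still failing to load its key file. A count of 0 only means these two strings were not found.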
2. SSL Handshake Failure
An openssl command executed from the VMware Backup Proxy to test the SSL connection to the CloudCache failed, producing the following output.
Command:
Bash
openssl s_client -connect CloudCache_FQDN:443 -showcerts
Output:
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 325 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
The key indicators of failure in this output are:
write:errno=104: This is a "Connection reset by peer" error, indicating the CloudCache server terminated the TCP connection prematurely.
no peer certificate available: This is the critical error, confirming that the CloudCache server did not present its SSL certificate during the handshake, making secure communication impossible.
Note that the "Verification: OK" and "Verify return code: 0 (ok)" lines do not indicate success here: the handshake read 0 bytes, so no certificate was received to verify.
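The two indicators above can also be checked mechanically. The sketch below reads saved or piped openssl s_client output from standard input and prints which of the two markers it found. The function name classify_s_client_output and the labels it prints are hypothetical, and it recognizes only the two strings discussed in this article.

```shell
#!/bin/sh
# Hypothetical helper: report which failure markers from this article
# appear in `openssl s_client` output supplied on standard input.
classify_s_client_output() {
    output="$(cat)"
    found=0
    # errno 104 is "Connection reset by peer" on Linux.
    case "$output" in
        *"errno=104"*) echo "connection-reset"; found=1 ;;
    esac
    # The server sent no certificate during the handshake.
    case "$output" in
        *"no peer certificate available"*) echo "no-peer-certificate"; found=1 ;;
    esac
    [ "$found" -eq 1 ] || echo "no-known-marker"
}

# Example (run on the VMware Backup Proxy; </dev/null closes stdin so
# s_client exits once the handshake attempt finishes):
#   openssl s_client -connect CloudCache_FQDN:443 -showcerts </dev/null 2>&1 \
#       | classify_s_client_output
```

A result of no-known-marker means only that neither string was present. It is not proof that the handshake succeeded.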
Cause
The root cause was an improper installation or configuration of the new custom SSL certificate from GlobalSign on the CloudCache server.
The PhoenixCacheServer service was unable to read or load its configured SSL key and certificate file (PhoenixCloudCacheSSL.key).
This failure prevented the service from presenting its identity to connecting clients like the VMware Backup Proxy, leading to the observed SSL handshake failure and the subsequent fallback of backups to the cloud.
Resolution
The resolution involves correctly generating a new key and Certificate Signing Request (CSR), obtaining the signed certificate from the CA, and properly installing it on the CloudCache and the trusting proxy.
Phase A: On Your CloudCache Server (CentOS Machine)
Generate a new Private Key (.key file):
Bash
sudo openssl genrsa -out /etc/ssl/private/cloudcache_server.key 2048
Generate a Certificate Signing Request (CSR - .csr file):
Bash
sudo openssl req -new -key /etc/ssl/private/cloudcache_server.key -out /tmp/cloudcache_server.csr
Note: When prompted, ensure the Common Name (CN) matches the exact Fully Qualified Domain Name (FQDN) that the VMware Backup Proxy uses to connect to the CloudCache (e.g., cloudcache1.yourcompany.com).
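Before submitting the CSR, it can help to confirm the Common Name that was actually recorded. The sketch below extracts the CN from the subject line printed by openssl req -noout -subject. The function name csr_subject_cn is hypothetical, and the sed pattern is an assumption based on the two common subject layouts (slash-separated and comma-separated); other OpenSSL builds may format the line differently.

```shell
#!/bin/sh
# Hypothetical helper: print the CN value from an OpenSSL subject line
# read on standard input. Handles both "subject=/C=US/CN=host" and
# "subject=C = US, CN = host" layouts.
csr_subject_cn() {
    # Capture everything after the last "CN=" up to the next comma or slash.
    sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}

# Example (run on the CloudCache server):
#   openssl req -in /tmp/cloudcache_server.csr -noout -subject | csr_subject_cn
```

The printed value should equal the FQDN that the VMware Backup Proxy uses to reach the CloudCache.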
Phase B: Obtain Certificate from the Certificate Authority (e.g., GlobalSign)
This phase is performed by the customer via their CA's web portal.
Order/Reissue Certificate: Log in to the CA's platform.
Submit CSR: Paste the full content of the cloudcache_server.csr file (generated in Phase A) into the portal.
Complete Validation: Fulfill any domain or organization validation steps required by the CA.
Download Certificates: Once issued, download the following files:
The Server Certificate (e.g., cloudcache_server.crt)
The Intermediate CA Certificate bundle (e.g., intermediate.crt)
The Root CA Certificate (e.g., GlobalSignRoot.crt)
Phase C: On CloudCache Server & VMware Backup Proxy
Copy Certificates to CloudCache (CC): Securely copy the downloaded server certificate (cloudcache_server.crt) and the intermediate certificate bundle from the customer to a temporary location on the CloudCache machine, such as /tmp/.
Stop Phoenix CloudCache Service (On CC):
Bash
sudo service PhoenixCacheServer stop
Remove the Old SSL Key File (On CC):
Bash
sudo rm /etc/PhoenixCloudCache/PhoenixCloudCacheSSL.key
Combine and Install the New Certificate & Key (On CC): Concatenate the private key (from Phase A), the new server certificate, and any intermediate certificates into the single file required by the Phoenix service.
Bash
sudo sh -c '(cat /etc/ssl/private/cloudcache_server.key; echo ""; cat /tmp/cloudcache_server.crt; echo ""; cat /tmp/intermediate.crt) > /etc/PhoenixCloudCache/PhoenixCloudCacheSSL.key'
Note: The command above assumes an intermediate certificate was provided. Adjust if your CA provides the full chain in the server certificate file.
Start Phoenix CloudCache Service (On CC):
Bash
sudo service PhoenixCacheServer start
Update Trusted CA Certificate on the VMware Backup Proxy: To ensure the proxy trusts the new GlobalSign certificate, add the GlobalSign Root CA to its system-wide trust store.
Bash
sudo cp /path/to/downloaded/GlobalSignRoot.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
Restart VMware Backup Proxy Service: Restart the backup proxy service for the new CA trust settings to take effect.
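As an optional sanity check on the combined PhoenixCloudCacheSSL.key file built earlier in this phase, the PEM blocks it contains can be counted. This is a minimal sketch: the function name pem_block_counts is hypothetical, and it inspects only the BEGIN markers. It does not confirm that the key matches the certificate or that the chain is valid.

```shell
#!/bin/sh
# Hypothetical helper: print "<private keys> <certificates>" for a PEM
# file, counted from its BEGIN markers.
pem_block_counts() {
    pem_file="$1"
    # Matches "BEGIN PRIVATE KEY" as well as "BEGIN RSA PRIVATE KEY" etc.
    # "|| true" keeps the count of 0 when grep finds no match.
    keys="$(grep -cE -e '-----BEGIN ([A-Z]+ )?PRIVATE KEY-----' "$pem_file" || true)"
    certs="$(grep -c -e '-----BEGIN CERTIFICATE-----' "$pem_file" || true)"
    echo "$keys $certs"
}

# Example (run as root on the CloudCache server, since the file holds
# the private key):
#   pem_block_counts /etc/PhoenixCloudCache/PhoenixCloudCacheSSL.key
```

For a server certificate combined with a single intermediate certificate, the expected result would be one private key and two certificates. A count of zero keys or zero certificates suggests the concatenation step did not produce the intended file.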
Verification
Test SSL Communication: From the VMware Backup Proxy, re-run the openssl command. A successful connection will now display the full certificate chain of the CloudCache server instead of the "no peer certificate available" error.
Bash
openssl s_client -connect CloudCache_FQDN:443 -showcerts
Check Worker Registration: Monitor PhoenixCloudCache.log on the CloudCache server. You should now see informational messages indicating that workers are successfully registering and communication is established.
Bash
tail -f /var/log/PhoenixCloudCache/PhoenixCloudCache.log
Run Test Backups: Trigger new VMware backups. They should now successfully connect to the CloudCache and complete without falling back to the cloud.