VMware backups failed to connect to the CloudCache due to an SSL issue, causing them to fall back to the cloud.

Updated this week

Problem Description

  • VMware backups were consistently failing to connect to the designated CloudCache server.

  • As a result, backups were bypassing the cache and falling back to a direct-to-cloud transfer.

  • The backup progress logs displayed "could not connect to cloudcache" messages, confirming the connectivity failure between the VMware Backup Proxy and the CloudCache.

  • This issue was identified after the customer switched to a new Certificate Authority (CA), GlobalSign, and replaced the default self-signed certificates on the CloudCache with the new custom certificates.

Traceback

  • Initial network connectivity checks, such as ping and telnet on port 443 between the VMware Backup Proxy and the CloudCache server, were successful, ruling out basic network or firewall issues.

  • The root of the issue was identified in the Phoenix CloudCache logs and through SSL connection tests.
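
The port check described above can be scripted. The following is a minimal sketch, assuming bash (for its /dev/tcp feature) and the coreutils timeout command are available on the VMware Backup Proxy; the function name check_tcp_port is hypothetical. A successful result only shows that the TCP port is open. It says nothing about whether the SSL handshake will succeed, which is why this check passed while backups still failed.

```shell
#!/usr/bin/env bash
# check_tcp_port HOST PORT [TIMEOUT_SECONDS]
# Hypothetical helper: returns 0 if a TCP connection to HOST:PORT can be
# opened within the timeout, non-zero otherwise. It does not test SSL/TLS.
check_tcp_port() {
    local host="$1" port="$2" wait="${3:-5}"
    # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (replace CloudCache_FQDN with the real host name):
#   check_tcp_port CloudCache_FQDN 443 && echo "port 443 reachable"
```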

1. Phoenix CloudCache Logs

The following errors were found in the CloudCache log file:

  • Linux: /var/log/PhoenixCloudCache/PhoenixCloudCache.log

  • Windows: C:/ProgramData/PhoenixCloudCache/PhoenixCloudCache.log

[ERROR] (#15113) Error while reading key
[INFO] (#15113) No workers were available for use due to connection issue
[INFO] (#15113) Server is busy
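
To see how often these messages occur, the log can be summarized with grep. This is a sketch only; the function name summarize_cloudcache_errors is hypothetical, and the patterns are taken from the log lines above.

```shell
# summarize_cloudcache_errors LOGFILE
# Hypothetical helper: prints one "message: count" line for each of the
# three messages shown in this article.
summarize_cloudcache_errors() {
    local log="$1" pattern
    for pattern in 'Error while reading key' \
                   'No workers were available' \
                   'Server is busy'; do
        # grep -c prints the number of matching lines (0 if none).
        printf '%s: %s\n' "$pattern" "$(grep -c -- "$pattern" "$log")"
    done
}

# Example on Linux:
#   summarize_cloudcache_errors /var/log/PhoenixCloudCache/PhoenixCloudCache.log
```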

2. SSL Handshake Failure

An openssl command executed from the VMware Backup Proxy to test the SSL connection to the CloudCache failed, producing the following output.

Command:

Bash

openssl s_client -connect CloudCache_FQDN:443 -showcerts

Output:

CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 325 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)

The key indicators of failure in this output are:

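  • write:errno=104: On Linux, errno 104 is ECONNRESET ("Connection reset by peer"). The CloudCache server reset the TCP connection before the SSL handshake completed, which is why the handshake read 0 bytes.

  • no peer certificate available: This is the critical error. The CloudCache server never presented its SSL certificate, so secure communication was impossible. The "Verification: OK" and "Verify return code: 0 (ok)" lines do not indicate success here: no certificate was received, so there was nothing to verify.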
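
Because the verify return code is misleading in this case, it can help to classify the output by the indicators above. The sketch below is illustrative only: classify_s_client_output is a hypothetical name, and it simply searches the text for the strings shown in this article.

```shell
# classify_s_client_output
# Hypothetical helper: reads `openssl s_client` output on stdin and prints
# "no-certificate", "certificate-presented", or "unknown".
classify_s_client_output() {
    local output
    output="$(cat)"
    if printf '%s\n' "$output" | grep -q 'no peer certificate available'; then
        echo "no-certificate"
    elif printf '%s\n' "$output" | grep -q -- '-----BEGIN CERTIFICATE-----'; then
        echo "certificate-presented"
    else
        echo "unknown"
    fi
}

# Example (</dev/null makes s_client exit instead of waiting for input):
#   openssl s_client -connect CloudCache_FQDN:443 -showcerts </dev/null 2>&1 \
#       | classify_s_client_output
```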

Cause

  • The root cause was an improper installation or configuration of the new custom SSL certificate from GlobalSign on the CloudCache server.

  • The PhoenixCacheServer service was unable to read or load its configured SSL key and certificate file (PhoenixCloudCacheSSL.key), which matches the "Error while reading key" entry in the CloudCache log.

  • This failure prevented the service from presenting its identity to connecting clients like the VMware Backup Proxy, leading to the observed SSL handshake failure and the subsequent fallback of backups to the cloud.

Resolution

The resolution involves generating a new private key and Certificate Signing Request (CSR), obtaining the signed certificate from the CA, installing it correctly on the CloudCache, and adding the CA to the trust store of the VMware Backup Proxy.

Phase A: On Your CloudCache Server (CentOS Machine)

  1. Generate a new Private Key (.key file):

    Bash

    sudo openssl genrsa -out /etc/ssl/private/cloudcache_server.key 2048
  2. Generate a Certificate Signing Request (CSR - .csr file):

    Bash

    sudo openssl req -new -key /etc/ssl/private/cloudcache_server.key -out /tmp/cloudcache_server.csr

    Note: When prompted, ensure the Common Name (CN) matches the exact Fully Qualified Domain Name (FQDN) that the VMware Backup Proxy uses to connect to the CloudCache (e.g., cloudcache1.yourcompany.com).
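
Before submitting the CSR, its subject can be checked against the FQDN. This is a minimal sketch, assuming openssl is installed; csr_matches_fqdn is a hypothetical helper that does a simple substring match on the subject line, so treat a match as a sanity check rather than proof. The commented command shows how the -subj option can supply the CN without interactive prompts.

```shell
# csr_matches_fqdn CSR_FILE FQDN
# Hypothetical helper: returns 0 if the CSR subject contains FQDN.
csr_matches_fqdn() {
    local csr="$1" fqdn="$2"
    # -noout suppresses the encoded request; -subject prints the subject line.
    openssl req -in "$csr" -noout -subject | grep -qF -- "$fqdn"
}

# Non-interactive alternative to the prompt-driven CSR command (example CN):
#   sudo openssl req -new -key /etc/ssl/private/cloudcache_server.key \
#       -out /tmp/cloudcache_server.csr -subj "/CN=cloudcache1.yourcompany.com"
#
# Example check:
#   csr_matches_fqdn /tmp/cloudcache_server.csr cloudcache1.yourcompany.com \
#       && echo "CN matches"
```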

Phase B: Obtain Certificate from the Certificate Authority (e.g., GlobalSign)

This phase is performed by the customer via their CA's web portal.

  1. Order/Reissue Certificate: Log in to the CA's platform.

  2. Submit CSR: Paste the full content of the cloudcache_server.csr file (generated in Phase A) into the portal.

  3. Complete Validation: Fulfill any domain or organization validation steps required by the CA.

  4. Download Certificates: Once issued, download the following files:

    • The Server Certificate (e.g., cloudcache_server.crt)

    • The Intermediate CA Certificate bundle (e.g., intermediate.crt)

    • The Root CA Certificate (e.g., GlobalSignRoot.crt)
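
Once the files are downloaded, the chain can be checked locally before anything is installed. The sketch below assumes all three files are PEM encoded and uses the example file names from the list above; verify_chain is a hypothetical wrapper around openssl verify.

```shell
# verify_chain SERVER_CRT INTERMEDIATE_CRT ROOT_CRT
# Hypothetical helper: returns 0 if the server certificate chains to the
# root through the supplied intermediate bundle.
verify_chain() {
    local server_crt="$1" intermediate_crt="$2" root_crt="$3"
    # -CAfile is the trusted root; -untrusted supplies intermediates that
    # are used to build the chain but are not trusted on their own.
    openssl verify -CAfile "$root_crt" -untrusted "$intermediate_crt" "$server_crt"
}

# Example:
#   verify_chain cloudcache_server.crt intermediate.crt GlobalSignRoot.crt
```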

Phase C: On CloudCache Server & VMware Backup Proxy

  1. Copy Certificates to CloudCache (CC): Securely copy the server certificate (cloudcache_server.crt) and the intermediate certificate bundle (intermediate.crt) that the customer downloaded in Phase B to a temporary location on the CloudCache machine, such as /tmp/.

  2. Stop Phoenix CloudCache Service (On CC):

    Bash

    sudo service PhoenixCacheServer stop
  3. Remove the Old SSL Key File (On CC):

    Bash

    sudo rm /etc/PhoenixCloudCache/PhoenixCloudCacheSSL.key
  4. Combine and Install the New Certificate & Key (On CC): Concatenate the private key (from Phase A), the new server certificate, and any intermediate certificates into the single file required by the Phoenix service.

    Bash

    sudo sh -c '(cat /etc/ssl/private/cloudcache_server.key; echo ""; cat /tmp/cloudcache_server.crt; echo ""; cat /tmp/intermediate.crt) > /etc/PhoenixCloudCache/PhoenixCloudCacheSSL.key'

    Note: The command above assumes an intermediate certificate was provided. Adjust if your CA provides the full chain in the server certificate file.

  5. Start Phoenix CloudCache Service (On CC):

    Bash

    sudo service PhoenixCacheServer start
  6. Update Trusted CA Certificate on the VMware Backup Proxy: To ensure the proxy trusts the new GlobalSign certificate, add the GlobalSign Root CA to its system-wide trust store.

    Bash

    sudo cp /path/to/downloaded/GlobalSignRoot.crt /usr/local/share/ca-certificates/
    sudo update-ca-certificates
  7. Restart VMware Backup Proxy Service: Restart the backup proxy service for the new CA trust settings to take effect.
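
A common cause of a key or certificate failing to load is a private key that does not belong to the issued certificate, for example when the CSR was generated from a different key. The sketch below compares the public key derived from each file. It assumes openssl is installed; key_matches_cert is a hypothetical helper, and reading the private key will usually require root.

```shell
# key_matches_cert KEY_FILE CERT_FILE
# Hypothetical helper: returns 0 if the private key and the certificate
# share the same public key, 1 if they differ, 2 if either cannot be read.
key_matches_cert() {
    local key="$1" cert="$2" key_pub cert_pub
    key_pub="$(openssl pkey -in "$key" -pubout 2>/dev/null)" || return 2
    cert_pub="$(openssl x509 -in "$cert" -noout -pubkey 2>/dev/null)" || return 2
    [ -n "$key_pub" ] && [ "$key_pub" = "$cert_pub" ]
}

# Example (run as root so the key is readable):
#   key_matches_cert /etc/ssl/private/cloudcache_server.key \
#       /tmp/cloudcache_server.crt && echo "key and certificate match"
```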

Verification

  1. Test SSL Communication: From the VMware Backup Proxy, re-run the openssl command. A successful connection will now display the full certificate chain of the CloudCache server instead of the "no peer certificate available" error.

    Bash

    openssl s_client -connect CloudCache_FQDN:443 -showcerts
  2. Check Worker Registration: Monitor the PhoenixCloudCache.log file on the CloudCache server. You should now see informational messages indicating that workers are registering successfully and communication is established.

    Bash

    tail -f /var/log/PhoenixCloudCache/PhoenixCloudCache.log
  3. Run Test Backups: Trigger new VMware backups. They should now successfully connect to the CloudCache and complete without falling back to the cloud.
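
For the SSL test in step 1, the number of certificates the server presents can be counted instead of reading the output by eye. This is a sketch under the assumption that the combined file was built as in Phase C, in which case the server would be expected to present at least the server certificate and one intermediate; count_chain_certs is a hypothetical name.

```shell
# count_chain_certs
# Hypothetical helper: reads `openssl s_client -showcerts` output on stdin
# and prints how many PEM certificates it contains.
count_chain_certs() {
    # grep -c prints the number of matching lines, 0 if there are none.
    grep -c -- '-----BEGIN CERTIFICATE-----'
}

# Example (</dev/null makes s_client exit instead of waiting for input):
#   openssl s_client -connect CloudCache_FQDN:443 -showcerts </dev/null 2>/dev/null \
#       | count_chain_certs
```

A count of 0 corresponds to the original "no peer certificate available" failure.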
