Configure a disaster recovery site for CloudBees CD/RO

This guide describes how to configure a disaster recovery (DR) site for CloudBees CD/RO running in a Kubernetes environment.

Before you begin

Ensure the following environment prerequisites are met:

  1. CloudBees CD/RO is fully installed in the primary Kubernetes cluster, with all components running as expected.

  2. A Global Server Load Balancer (GSLB) is available, along with its fully qualified domain name (FQDN).

  3. A secondary (DR) cluster—warm or cold standby—is ready, with a CloudBees CD/RO installation that mirrors the primary site configuration.

  4. A recent backup of the primary site’s database is available.

Configure GSLB on the primary site

To enable automatic failover between the primary and DR sites, configure the global server load balancer (GSLB) using the load balancer’s FQDN.

Automated configuration

CloudBees recommends this method only if other CloudBees CD/RO components need to access the server through DNS or an external load balancer. With this method, internal components communicate with the flow-server through the configured endpoint. If that endpoint resolves over the network, such as through DNS or an external load balancer, it can add latency and degrade CloudBees CD/RO performance.

If your environment does not require other CloudBees CD/RO components to access the server through DNS or an external load balancer, CloudBees recommends using Manual configuration.

To use Helm to automatically set up your GSLB:

  1. Set the server name in the Common images configurations section of your values file:

    serverName: <FLOW_SERVER_LOAD_BALANCER_FQDN>
  2. Run your helm upgrade command to apply the changes in your CloudBees CD/RO deployment.

Manual configuration

To manually set up the GSLB:

  1. Define required variables:

    NAMESPACE=<YOUR-NAMESPACE>
    FQDN=<FLOW_SERVER_LOAD_BALANCER_FQDN>
    FLOW_SERVER_POD=$(kubectl get pods -n $NAMESPACE -l app=flow-server -o jsonpath='{.items[0].metadata.name}')
    REPO_POD=$(kubectl get pods -n $NAMESPACE -l app=repository -o jsonpath='{.items[0].metadata.name}')
  2. Update the server name on the primary Flow server:

    kubectl exec -it $FLOW_SERVER_POD -n $NAMESPACE -- \
      ecconfigure --serverName $FQDN
  3. Update the repository target hostname:

    kubectl exec -it $REPO_POD -n $NAMESPACE -- \
      ecconfigure --repositoryTargetHostName $FQDN
  4. Update Server settings in the CloudBees CD/RO UI:

    1. Go to Administration > Server settings.

    2. Update the following fields:

      • CloudBees CD/RO server IP address:

        <FLOW_SERVER_LOAD_BALANCER_FQDN>
      • Stomp connection URL:

        stomp+ssl://<FLOW_SERVER_LOAD_BALANCER_FQDN>:61613
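
The `ecconfigure` commands in steps 2 and 3 depend on the variables from step 1, and the pod lookups return nothing usable when no pod matches the label. A minimal sketch of a guard that can be run before those steps is shown below. The `require_vars` helper is hypothetical and not part of CloudBees CD/RO.

```shell
# Hypothetical helper: fail early if any named variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage, after defining the variables from step 1:
# require_vars NAMESPACE FQDN FLOW_SERVER_POD REPO_POD || exit 1
```

The helper reports every missing variable rather than stopping at the first one, so a single run shows everything that needs fixing.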

Extract credentials from the primary site

Secure communication with the DR site requires the passkey and keystore files from the primary flow-server.

  1. Define required variables:

    NAMESPACE=<YOUR-NAMESPACE>
    FLOW_SERVER_POD=$(kubectl get pods -n $NAMESPACE -l app=flow-server -o jsonpath='{.items[0].metadata.name}')
  2. Locate the passkey file:

    kubectl exec -n $NAMESPACE $FLOW_SERVER_POD -- ls /tmp/ | grep passkey
  3. Copy the passkey file to your local machine:

    kubectl cp $NAMESPACE/$FLOW_SERVER_POD:/tmp/<passkey-filename> ./passkey
  4. Locate the keystore file:

    kubectl exec -n $NAMESPACE $FLOW_SERVER_POD -- ls /tmp/ | grep keystore
  5. Copy the keystore file to your local machine:

    kubectl cp $NAMESPACE/$FLOW_SERVER_POD:/tmp/<keystore-filename> ./keystore
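
The locate and copy steps above can be combined so that the filename does not have to be pasted by hand. The following is a sketch only: `copy_from_pod_tmp` is a hypothetical helper, and it assumes `NAMESPACE` and `FLOW_SERVER_POD` are already defined and that the first match in /tmp is the file you want.

```shell
# Hypothetical helper: find the first file in /tmp on the flow-server pod
# whose name contains PATTERN, then copy it to DEST on the local machine.
copy_from_pod_tmp() {
  pattern=$1
  dest=$2
  # List /tmp on the pod and keep the first matching filename, if any.
  file=$(kubectl exec -n "$NAMESPACE" "$FLOW_SERVER_POD" -- ls /tmp/ \
    | grep "$pattern" | head -n 1) || true
  if [ -z "$file" ]; then
    echo "No file matching '$pattern' found in /tmp" >&2
    return 1
  fi
  kubectl cp "$NAMESPACE/$FLOW_SERVER_POD:/tmp/$file" "$dest"
}

# Example usage:
# copy_from_pod_tmp passkey ./passkey
# copy_from_pod_tmp keystore ./keystore
```

If the helper reports no match, the files are probably not in /tmp on that pod.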

Optional: Retrieve credentials from ZooKeeper

Use the following steps if the files are not found in /tmp.

  1. Open a shell session on the flow-server pod:

    kubectl exec -it $FLOW_SERVER_POD -n $NAMESPACE -- bash
  2. Create a temporary directory:

    mkdir -p /tmp/configs && cd /tmp/configs
  3. Download the passkey file:

    /opt/cbflow/jre/bin/java -DCOMMANDER_ZK_CONNECTION=zookeeper:2181 \
      -jar /opt/cbflow/server/bin/zk-config-tool-jar-with-dependencies.jar \
      com.electriccloud.commander.cluster.ZKConfigTool --readFile /commander/conf/passkey ./passkey
  4. Download the keystore file:

    /opt/cbflow/jre/bin/java -DCOMMANDER_ZK_CONNECTION=zookeeper:2181 \
      -jar /opt/cbflow/server/bin/zk-config-tool-jar-with-dependencies.jar \
      com.electriccloud.commander.cluster.ZKConfigTool --readFile /commander/conf/keystore ./keystore
  5. Copy the files to your local machine:

    kubectl cp $NAMESPACE/$FLOW_SERVER_POD:/tmp/configs/passkey ./passkey
    kubectl cp $NAMESPACE/$FLOW_SERVER_POD:/tmp/configs/keystore ./keystore

Convert credentials

The Helm installation for the DR site requires the credentials in base64 format.

  1. Convert the passkey file:

    base64 -i passkey -o passkey.b64
  2. Convert the keystore file:

    base64 -i keystore -o keystore.b64
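
The `-i` and `-o` options shown above match the `base64` command bundled with macOS. GNU coreutils `base64`, common on Linux, has no `-o` option and treats `-i` as "ignore garbage". A sketch that reads from standard input and redirects the output should work with both, assuming the chart accepts unwrapped, single-line base64 (this guide does not confirm that). The `encode_b64` and `verify_b64` helpers are hypothetical.

```shell
# Hypothetical helper: write FILE as single-line base64 to FILE.b64.
encode_b64() {
  base64 < "$1" | tr -d '\n' > "$1.b64"
}

# Hypothetical helper: confirm FILE.b64 decodes back to FILE byte for byte.
verify_b64() {
  base64 --decode < "$1.b64" | cmp -s - "$1"
}

# Example usage:
# encode_b64 passkey && verify_b64 passkey
# encode_b64 keystore && verify_b64 keystore
```

The round-trip check is a cheap way to catch a truncated or corrupted copy before the encoded files are passed to Helm.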

Install the DR site

Before installing the DR site, create a Kubernetes secret for admin credentials.

  1. Define the namespace variable:

    NAMESPACE=<YOUR-NAMESPACE>
  2. Create the admin credentials secret:

    kubectl create secret generic flow-admin-cred \
      --from-literal=CBF_SERVER_ADMIN_PASSWORD='XXXXXXX' \
      -n $NAMESPACE
  3. Configure the admin credentials in your values-dr.yaml file:

    flowCredentials:
      existingSecret: flow-admin-cred
    boundAgent:
      flowCredentials:
        existingSecret: flow-admin-cred
  4. Deploy the DR site using Helm:

    helm upgrade --install cloudbees-dr cloudbees/cloudbees-flow \
      -f values-dr.yaml -n $NAMESPACE \
      --set-file server.customConfig.passkey\.b64=passkey.b64 \
      --set-file server.customConfig.keystore\.b64=keystore.b64 \
      --timeout 1400s

Restore the DR site database

  1. Define the namespace variable:

    NAMESPACE=<YOUR-NAMESPACE>
  2. Scale down the DR flow-server:

    kubectl scale deployment flow-server -n $NAMESPACE --replicas=0
  3. Restore the database using your vendor-recommended tool (for example, psql or mysql).

  4. Scale the flow-server back up:

    kubectl scale deployment flow-server -n $NAMESPACE --replicas=1
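
Scaling up returns immediately, before the flow-server pod is ready. One way to wait for it is `kubectl rollout status`, sketched below. The `wait_for_flow_server` helper is hypothetical, and the 600s default timeout is an arbitrary choice, not a CloudBees recommendation.

```shell
# Hypothetical helper: block until the flow-server deployment reports ready,
# or fail once TIMEOUT elapses (default 600s, an arbitrary value).
wait_for_flow_server() {
  timeout=${1:-600s}
  kubectl rollout status deployment/flow-server \
    -n "$NAMESPACE" --timeout="$timeout"
}

# Example usage, after scaling the deployment back up:
# wait_for_flow_server || echo "flow-server did not become ready" >&2
```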

Validate the DR site

After restoration, verify that the DR site is functional:

  1. Confirm that the CloudBees CD/RO web interface is accessible.

  2. Verify that connected agents appear and pipelines execute as expected.

  3. Ensure that GSLB failover works correctly.

  4. Validate that licensing and security credentials are restored.

  5. Confirm that the DR configuration matches the primary site.
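
The first check, web interface accessibility, can be scripted as a rough smoke test. The sketch below uses `curl` to read the HTTP status code. The helper names are hypothetical, the URL to probe is an assumption that depends on your ingress setup, and a 2xx or 3xx response only shows that something answered, not that CloudBees CD/RO is fully functional.

```shell
# Hypothetical helper: print the HTTP status code returned by URL.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Hypothetical helper: succeed when URL answers with a 2xx or 3xx status.
is_reachable() {
  case "$(http_status "$1")" in
    2??|3??) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage (the URL is an assumption; substitute your own endpoint):
# is_reachable "https://$FQDN/" && echo "web interface reachable"
```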