Roll back CloudBees Analytics via snapshot

Rolling back CloudBees Analytics is not recommended except as a last resort. Whenever possible, upgrade to a supported or compatible version instead.

Snapshot-based rollback carries significant risk. If the snapshot is missing, incomplete, or corrupted, this process could result in permanently losing all CloudBees Analytics data.

Before proceeding:

Review this guide thoroughly.
Take complete, verified backups.
Ensure you have a valid and restorable snapshot.
Proceed only if you fully understand the risks involved.

Before you start

CloudBees Analytics uses OpenSearch to store data. Rolling back this data is only possible via snapshot restore, and only if CloudBees Analytics snapshots were enabled in your environment prior to the upgrade. However, downgrading OpenSearch is not officially supported once a cluster has been upgraded, due to:

Internal index format changes.
Cluster state and metadata updates.
Incompatible data schema evolution.

Older OpenSearch versions cannot read upgraded cluster metadata or index formats, so the existing data volumes cannot simply be reused after a downgrade. If the snapshot is incomplete or corrupted, this process could result in permanently losing all CloudBees Analytics data.

Manually roll back CloudBees Analytics

Scale down CloudBees Analytics StatefulSet

kubectl scale sts flow-analytics -n <release-namespace> --replicas=0
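Scaling down is asynchronous, so it can help to confirm that no pods remain before touching the data volumes. The following is a minimal sketch, not part of the official procedure: the function name `wait_for_scale_down` and its retry parameters are hypothetical, and it assumes the StatefulSet reports its pod count in `.status.replicas`.

```shell
# Poll the StatefulSet until it reports zero pods (illustrative sketch).
# Usage: wait_for_scale_down <release-namespace> [max-attempts] [sleep-seconds]
wait_for_scale_down() (
  ns="$1"; attempts="${2:-60}"; delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # .status.replicas is the number of pods the controller currently manages.
    current="$(kubectl get sts flow-analytics -n "$ns" -o jsonpath='{.status.replicas}')"
    if [ "$current" = "0" ]; then
      echo "flow-analytics is scaled down"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for flow-analytics to scale down" >&2
  return 1
)
```

For example, `wait_for_scale_down <release-namespace>` polls every 5 seconds for up to 60 attempts and returns a non-zero status on timeout.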

Delete OpenSearch data PVCs

Do NOT delete the backup PVC where snapshots are stored. Back up the data volumes before deleting them, so that you can still recover if the snapshot turns out to be corrupted.
- If this is a single-node cluster, delete:
  kubectl delete pvc -n <release-namespace> analytics-data-flow-analytics-0
- If this is a multi-node cluster, delete all data volumes:
  kubectl delete pvc -n <release-namespace> analytics-data-flow-analytics-0
  kubectl delete pvc -n <release-namespace> analytics-data-flow-analytics-1
  kubectl delete pvc -n <release-namespace> analytics-data-flow-analytics-2
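For clusters with a different node count, the per-replica deletion can be sketched as a loop. This is an illustration under one assumption: that every data PVC follows the `analytics-data-flow-analytics-<ordinal>` naming shown above. The function names are hypothetical. Listing the PVCs first with `kubectl get pvc -n <release-namespace>` is a sensible check that the backup PVC is not among the names to be deleted.

```shell
# Print the data PVC name for each replica ordinal 0..N-1.
# Assumes the analytics-data-flow-analytics-<ordinal> naming pattern.
data_pvc_names() (
  count="$1"
  i=0
  while [ "$i" -lt "$count" ]; do
    echo "analytics-data-flow-analytics-$i"
    i=$((i + 1))
  done
)

# Delete the data PVCs for a cluster with <replica-count> nodes.
# Usage: delete_data_pvcs <release-namespace> <replica-count>
delete_data_pvcs() (
  ns="$1"; count="$2"
  for pvc in $(data_pvc_names "$count"); do
    kubectl delete pvc -n "$ns" "$pvc"
  done
)
```

Running `data_pvc_names 3` on its own prints the three names without deleting anything, which makes it easy to review the list before calling `delete_data_pvcs`.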
Temporarily disable the CloudBees Analytics backup cronjobs:

This step is critical to prevent any backup jobs from running after PVCs are deleted, which could overwrite valid snapshots or cause job failures.
```
kubectl patch cronjob flow-analytics-backup-job -n <release-namespace> -p '{"spec" : {"suspend" : true }}'
```
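Because the rest of the procedure depends on the backup job staying idle, it may be worth confirming that the patch took effect. A minimal sketch, assuming the cronjob name used above; the function name `cronjob_suspended` is hypothetical:

```shell
# Succeed only if the backup cronjob reports spec.suspend=true.
# Usage: cronjob_suspended <release-namespace>
cronjob_suspended() (
  ns="$1"
  state="$(kubectl get cronjob flow-analytics-backup-job -n "$ns" -o jsonpath='{.spec.suspend}')"
  [ "$state" = "true" ]
)
```

For example, `cronjob_suspended <release-namespace> && echo "backups are paused"` prints the message only when the cronjob is suspended.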
Scale up CloudBees Analytics StatefulSet:

CloudBees suggests scaling back up to one replica to start.
```
kubectl scale sts flow-analytics -n <release-namespace> --replicas=1
```
Wait for the pod to initialize and reach the Running state.
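One way to wait without polling by hand is `kubectl wait`. The sketch below assumes the first replica is named `flow-analytics-0`, as used later in this guide; the function name and the default timeout are hypothetical. Note that `kubectl wait` returns an error immediately if the pod has not been created yet, so it may need to be retried a few seconds after scaling up.

```shell
# Block until the first analytics pod reports the Ready condition.
# Usage: wait_for_analytics_pod <release-namespace> [timeout]
wait_for_analytics_pod() (
  ns="$1"; timeout="${2:-600s}"
  kubectl wait --for=condition=Ready pod/flow-analytics-0 -n "$ns" --timeout="$timeout"
)
```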

Manually recreate snapshot repository (if using file-system storage):

If you are using file-system-based backups (rather than object storage such as S3), you must create the snapshot repository manually because of a known issue. This issue was fixed in CloudBees CD/RO v2025.06.0.

Open an interactive bash shell in the flow-analytics-0 pod by running:
```
kubectl exec -it -n <release-namespace> flow-analytics-0 -- bash
```

In the pod, confirm that the repository path is set, and then register the snapshot repository:

echo $CBF_ANALYTICS_PATH_REPO

curl -XPUT -u admin:<analytics-admin-credentials> --insecure \
  --header 'Content-Type: application/json' \
  https://flow-analytics:9201/_snapshot/analytics_snap \
  -d '{"type":"fs","settings":{"location":"'"$CBF_ANALYTICS_PATH_REPO"'","compress":true}}'
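Before triggering the restore, it can be useful to confirm that the repository is registered and that it actually contains snapshots. The following sketch is meant to run inside the pod and uses the standard OpenSearch snapshot endpoints with the same host, port, and repository name as the command above; the function name `check_snapshot_repo` is hypothetical.

```shell
# Run inside the flow-analytics-0 pod.
# Usage: check_snapshot_repo 'admin:<analytics-admin-credentials>'
check_snapshot_repo() (
  creds="$1"
  base="https://flow-analytics:9201"
  # Confirm the repository is registered; -f makes curl fail on HTTP errors such as 404.
  curl -sS -f -u "$creds" --insecure "$base/_snapshot/analytics_snap" || return 1
  echo
  # List the snapshots available in the repository.
  curl -sS -f -u "$creds" --insecure "$base/_cat/snapshots/analytics_snap?v"
)
```

An empty snapshot list at this point suggests the restore has nothing to work from, which is a good reason to stop and investigate before continuing.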

Edit the Helm release to enable snapshot restore:

Get current Helm values:

helm get values <release-name> -n <release-namespace> > my-values.yaml

Trigger restore job:

helm upgrade <release-name> cloudbees/cloudbees-flow \
  -n <release-namespace> \
  --version <current-chart-version> \
  -f my-values.yaml \
  --set analytics.backup.restoreSnapshot=true --timeout 10000s

Verify snapshot restore job:
1. Check logs of the restore job pod to verify restoration:
  kubectl logs -n <release-namespace> <restore-job-pod-name>
2. In CloudBees CD/RO, ensure your CloudBees Analytics data is restored and the UI reflects the expected historical data.
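If the restore job pod name is not known, it can usually be looked up through the `job-name` label that Kubernetes adds to pods created by a Job. The sketch below is illustrative: the function name is hypothetical, and the job name is not stated in this guide, so find it first with `kubectl get jobs -n <release-namespace>`.

```shell
# Print the logs of the first pod created by the given Job.
# Usage: restore_job_logs <release-namespace> <restore-job-name>
restore_job_logs() (
  ns="$1"; job="$2"
  # Pods created by a Job carry the label job-name=<job>.
  pod="$(kubectl get pods -n "$ns" -l "job-name=$job" -o jsonpath='{.items[0].metadata.name}')"
  if [ -z "$pod" ]; then
    echo "no pod found for job $job" >&2
    return 1
  fi
  kubectl logs -n "$ns" "$pod"
)
```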
After confirming a successful restore, re-enable CloudBees Analytics backup cronjobs:

This is critical to ensure new backups are made, and to prevent data loss should you need to restore a backup in the future.
```
kubectl patch cronjob flow-analytics-backup-job -n <release-namespace> -p '{"spec" : {"suspend" : false }}'
```

Disable restore flag post-rollback:

This is a critical step to prevent unintended restores in future upgrades.

helm upgrade <release-name> cloudbees/cloudbees-flow \
  -n <release-namespace> \
  --version <current-chart-version> \
  -f my-values.yaml \
  --set analytics.backup.restoreSnapshot=false --timeout 10000s

If you run CloudBees Analytics in cluster mode, this command should also scale the StatefulSet back to its original replica count. Once the deployment is complete, check the replica count by running:

kubectl get sts flow-analytics -n <release-namespace> -o jsonpath='{.spec.replicas}'
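The same query can be wrapped in a small check that compares the result with the replica count you expect. This is a sketch only; the function name `check_replicas` is hypothetical, and the expected value is whatever your cluster used before the rollback.

```shell
# Compare the StatefulSet's configured replica count with an expected value.
# Usage: check_replicas <release-namespace> <expected-replicas>
check_replicas() (
  ns="$1"; expected="$2"
  actual="$(kubectl get sts flow-analytics -n "$ns" -o jsonpath='{.spec.replicas}')"
  if [ "$actual" = "$expected" ]; then
    echo "replicas OK: $actual"
  else
    echo "replicas mismatch: expected $expected, got ${actual:-none}" >&2
    return 1
  fi
)
```

If the check fails, the count can be corrected with the same `kubectl scale sts flow-analytics` command used earlier, passing the original number of replicas.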

Best practices for snapshots and restores

The following list provides a summary of best practices for taking and recovering snapshots:

Always enable snapshot backups before upgrades.
Keep a backup of both data PVCs and snapshot PVCs before performing rollbacks.
Use external storage (e.g., S3) for more flexible backup/restore capabilities.