Backup and restore on AWS

This chapter covers backup and restore information specific to running CloudBees CI on modern cloud platforms on Amazon Web Services (AWS).

"Backup to S3" with Amazon S3 compatible storage systems

This feature is included in the CloudBees Backup Plugin starting with version 3.33.

The "Backup to S3" destination can be used to store backups in storage systems that are compatible with Amazon S3 (for example, EMC Atmos storage or OpenStack Swift).

To do this, use the standard AWS SDK mechanism for modifying the list of Amazon S3 endpoints known to the SDK: define a file named com/amazonaws/partitions/override/endpoints.json on the Jenkins classpath. The modification can either add a new Amazon S3 endpoint or replace the existing Amazon S3 endpoints.

Notes on endpoints.json:

  • CloudBees recommends that customers ask their storage system vendor how to customize the standard AWS SDK endpoints.json configuration file. Note that this endpoints.json file is used by all the AWS SDKs, including aws-sdk-net, aws-sdk-js, aws-sdk-java, and aws-sdk-ruby.

  • The desired S3 compatible endpoint MUST be declared in the section partitions.regions and in the section partitions.services.s3.endpoints.

  • The section partitions.services.s3.endpoints.my-s3-compatible-endpoint.signatureVersions (where my-s3-compatible-endpoint is the new section) must be filled in according to the specifications of the vendor of the S3 compatible storage system.

Customization of the Jenkins startup command line to add the endpoints.json file to the classpath

One way to add this endpoints.json file to the Jenkins classpath is to use the Java command-line parameter -Xbootclasspath/a:/path/to/boot/classpath/folder/ and to place the file at com/amazonaws/partitions/override/endpoints.json under the folder /path/to/boot/classpath/folder/.

Sample:

Jenkins startup command line
java -Xbootclasspath/a:/opt/jenkins/boot-classpath/ -jar /opt/jenkins/jenkins.war
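The folder layout described above can be sketched as a small shell helper. This is a minimal sketch, not part of CloudBees CI: the function name stage_endpoints_override is hypothetical, and it assumes you already have a customized endpoints.json file at hand.

```shell
# Hypothetical helper: stage a customized endpoints.json under a folder
# that is later passed to java via -Xbootclasspath/a:.
# The body runs in a subshell ( ... ) so its variables stay local.
stage_endpoints_override() (
  src=$1        # path to the customized endpoints.json
  boot_dir=$2   # boot classpath folder, for example /opt/jenkins/boot-classpath
  # The override file must sit at this package path inside the folder.
  target_dir="$boot_dir/com/amazonaws/partitions/override"
  mkdir -p "$target_dir" &&
    cp "$src" "$target_dir/endpoints.json" &&
    printf '%s\n' "$target_dir/endpoints.json"
)
```

For example, running stage_endpoints_override ./endpoints.json /opt/jenkins/boot-classpath would create /opt/jenkins/boot-classpath/com/amazonaws/partitions/override/endpoints.json, which matches the sample startup command line.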

Customization of endpoints.json

To add the Amazon S3 Compatible Storage Endpoint while keeping all the existing AWS endpoints, we recommend editing the endpoints.json file and adding a partition to the out-of-the-box AWS partitions.

To remove all the existing AWS endpoints and keep only the Amazon S3 Compatible Storage Endpoint, we recommend editing the endpoints.json file, deleting all the existing AWS partitions, and adding a partition with the Amazon S3 Compatible Storage Endpoint.

Sample AWS SDK endpoints.json adding a "My Company" partition containing a "us-mycompany-east-1" region with the s3 endpoint "s3-us-mycompany-east-1.amazonaws.com":

com/amazonaws/partitions/override/endpoints.json
{
  "partitions": [
    {"_comment": "OUT OF THE BOX AWS PARTITIONS ..."},
    {
      "defaults": {
        "hostname": "{service}.{region}.{dnsSuffix}",
        "protocols": ["https"],
        "signatureVersions": ["v4"]
      },
      "dnsSuffix": "cloud.mycompany.com",
      "partition": "mycompany",
      "partitionName": "My Company",
      "regionRegex": "^(us|eu|ap|sa|ca)\\-\\w+\\-\\d+$",
      "regions": {
        "us-mycompany-east-1": {
          "description": "My Company US East 1"
        }
      },
      "services": {
        "s3": {
          "defaults": {
            "protocols": ["http", "https"],
            "signatureVersions": ["s3", "s3v4"]
          },
          "endpoints": {
            "us-mycompany-east-1": {
              "hostname": "s3-us-mycompany-east-1.amazonaws.com"
            }
          }
        }
      }
    }
  ],
  "version": 3
}

Backup and restore from EBS volume on AWS

Amazon Web Services (AWS) uses Amazon Elastic Block Store (Amazon EBS) for high-performance block storage. This chapter describes how to find your $JENKINS_HOME on AWS, back up to an EBS volume, and restore to an EBS volume.

Accessing $JENKINS_HOME

Accessing Jenkins Home Directory (Pod Running)

The following sequence of commands shows how to determine the path of the $JENKINS_HOME inside a given pod of a specific CloudBees CI instance.

# Get the location of the $JENKINS_HOME
$ kubectl describe pod controller2-0 | grep " jenkins-home " | awk '{print $1}'
/var/jenkins_home

# Access the bash of a given pod
$ kubectl exec controller2-0 -i -t -- bash -i -l
controller2-0:/$ cd /var/jenkins_home/
controller2-0:~$ ps -ef
PID   USER     TIME   COMMAND
    1 jenkins    0:00 /sbin/tini -- /usr/local/bin/launch.sh
    5 jenkins    1:46 java -Dhudson.agents.NodeProvisioner.initialDelay=0 -Duser.home=/var/jenkins_home -Xmx1433m -Xms1433m -Djenkins.model.Jenkins.agentAgentPortEnforce=true -Djenkins.model.Jenkins.slav
  516 jenkins    0:00 bash -i -l
  524 jenkins    0:00 ps -ef
controller2-0:~$ ps -ef | grep java
    5 jenkins    1:46 java -Dhudson.agents.NodeProvisioner.initialDelay=0 -Duser.home=/var/jenkins_home -Xmx1433m -Xms1433m -Djenkins.model.Jenkins.agentAgentPortEnforce=true -Djenkins.model.Jenkins.agentAgentPort=50000 -DMASTER_GRANT_ID=270bd80c-3e5c-498c-88fe-35ac9e11f3d3 -Dcb.IMProp.warProfiles.cje=kubernetes.json -DMASTER_INDEX=1 -Dcb.IMProp.warProfiles=kubernetes.json -DMASTER_OPERATIONSCENTER_ENDPOINT=http://cjoc/cjoc -DMASTER_NAME=controller2 -DMASTER_ENDPOINT=http://cje.support-core.beescloud.k8s.local/controller2/ -jar -Dcb.distributable.name=Docker Common CJE -Dcb.distributable.commit_sha=888f01a54c12cfae5c66ec27fd4f2a7346097997 /usr/share/jenkins/jenkins.war --webroot=/tmp/jenkins/war --pluginroot=/tmp/jenkins/plugins --prefix=/controller2/
  528 jenkins    0:00 grep java

# Operations to be done. This is an example
$ kubectl cp controller2-0:/var/jenkins_home/jobs/ ./jobs/
tar: removing leading '/' from member names

Accessing Jenkins Home Directory (Pod Not Running)

# Stop a pod
$ kubectl scale statefulset/controller2 --replicas=0
statefulset "controller2" scaled

# Create a new rescue-pod running something that has no effect
# on the $JENKINS_HOME
$ cat <<EOF | kubectl create -f -
kind: Pod
apiVersion: v1
metadata:
  name: rescue-pod
spec:
  volumes:
    - name: rescue-storage
      persistentVolumeClaim:
        claimName: jenkins-home-controller2-0
  containers:
    - name: rescue-container
      image: nginx
      volumeMounts:
        - mountPath: "/tmp/jenkins-home"
          name: rescue-storage
EOF

# Access the bash of the rescue-pod
$ kubectl exec rescue-pod -i -t -- bash -i -l
mesg: ttyname failed: Success
root@rescue-pod:/# cd /tmp/jenkins-home/
root@rescue-pod:/tmp/jenkins-home#

# Operations to be done. This is an example
# (the path matches the mountPath of the rescue pod)
$ kubectl cp rescue-pod:/tmp/jenkins-home/jobs/ ./jobs/
tar: removing leading '/' from member names

# Delete the rescue pod
$ kubectl delete pod rescue-pod
pod "rescue-pod" deleted

# Start the pod
$ kubectl scale statefulset/controller2 --replicas=1
statefulset "controller2" scaled

Backup in an EBS Volume

The following section describes the steps to back up a $JENKINS_HOME directory tree to an EBS volume.

Backup can be performed with the CloudBees Backup plugin or the AWS CLI. When an interactive backup is needed, log in as a CloudBees CI administrator and follow this procedure:

# Stop the controller before backup (either from the UI or from the command line)
$ kubectl scale statefulset/controller2 --replicas=0
statefulset "controller2" scaled

# Find the current persistent volume (pv)
$ kubectl get pv/pvc-1ad36f90-2607-11e8-aa97-128b794e99f4 -o go-template='{{.spec.awsElasticBlockStore.volumeID}}'
aws://us-east-1e/vol-0263c2e40981587ed
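The volumeID returned above has the form aws://<availability-zone>/<volume-id>, while the AWS Console search box and the AWS CLI expect the bare volume ID. A minimal sketch of splitting the value with POSIX parameter expansion follows; the function names are hypothetical.

```shell
# Hypothetical helpers for a PV volumeID such as
# aws://us-east-1e/vol-0263c2e40981587ed

# Print the bare EBS volume ID (everything after the last "/").
ebs_volume_id() {
  printf '%s\n' "${1##*/}"
}

# Print the availability zone (the part between "aws://" and the next "/").
# The body runs in a subshell ( ... ) so the helper variable stays local.
ebs_volume_zone() (
  rest=${1#aws://}
  printf '%s\n' "${rest%%/*}"
)
```

With the value from the example, ebs_volume_id prints vol-0263c2e40981587ed and ebs_volume_zone prints us-east-1e.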

Then, go to AWS Console / EBS Volumes, create a snapshot of the volume, and add the following new tags:

KubernetesCluster
Name
kubernetes.io/cluster/<clustername>   # Replace with your cluster name
kubernetes.io/created-for/pv/name
kubernetes.io/created-for/pvc/name
kubernetes.io/created-for/pvc/namespace

# Start the instance (either from the UI or from the command line)
$ kubectl scale statefulset/controller2 --replicas=1
statefulset "controller2" scaled
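Because the backup can also be performed with the AWS CLI, the console snapshot step can be sketched as a shell helper that should run while the controller is still stopped, before it is started again. This is a minimal sketch under stated assumptions: the function names are hypothetical, the AWS CLI is assumed to be configured with credentials and a region, and the tag values are not given in this chapter, so they are passed in as KEY=VALUE arguments.

```shell
# Hypothetical helper: turn KEY=VALUE arguments into the shorthand value
# expected by the --tag-specifications option of "aws ec2 create-snapshot".
# Function bodies run in a subshell ( ... ) so their variables stay local.
snapshot_tag_spec() (
  tags=""
  for pair in "$@"; do
    # Key is the text before the first "=", value is everything after it.
    tags="${tags:+$tags,}{Key=${pair%%=*},Value=${pair#*=}}"
  done
  printf 'ResourceType=snapshot,Tags=[%s]\n' "$tags"
)

# Hypothetical helper: snapshot one EBS volume and tag the snapshot.
create_tagged_snapshot() (
  volume_id=$1   # bare volume ID, for example vol-0263c2e40981587ed
  shift
  aws ec2 create-snapshot \
    --volume-id "$volume_id" \
    --description "JENKINS_HOME backup of $volume_id" \
    --tag-specifications "$(snapshot_tag_spec "$@")"
)
```

A call could look like create_tagged_snapshot vol-0263c2e40981587ed KubernetesCluster=<clustername> Name=<name>, followed by the remaining tags from the list above with the values that apply to your cluster.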

Restore to an EBS Volume

This section provides the steps to perform a disaster recovery operation given a snapshot in AWS.

First, stop the instance you would like to restore.

# Stop the instance (either from the operations center UI or from the command line)
$ kubectl scale statefulset <MM_STATEFULSET_NAME> --replicas=0
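The scale command only requests the change; the pod may still be terminating when the command returns. A minimal sketch that also waits for the pod to disappear is shown here. It assumes the pod is named <statefulset>-0, as controller2-0 is earlier in this chapter; the function name and the 120-second timeout are arbitrary choices.

```shell
# Hypothetical helper: scale a controller StatefulSet to zero and wait
# until its single pod has been deleted.
# The body runs in a subshell ( ... ) so its variable stays local.
stop_controller() (
  name=$1   # StatefulSet name, for example controller2
  kubectl scale "statefulset/${name}" --replicas=0 &&
    kubectl wait --for=delete "pod/${name}-0" --timeout=120s
)
```

Depending on the kubectl version, the wait step may report an error if the pod is already gone when it starts; in that case the controller is stopped and the procedure can continue.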

Then, export the current configuration of the PV and PVC of the managed controller you would like to restore.

# Export the PVC and the PV
# (--export was removed in kubectl 1.18; on newer clients, omit the flag)
$ kubectl get pv <PV> --export -o=yaml > mm-pv.yaml
$ kubectl get pvc <PVC> --export -o=yaml > mm-pvc.yaml

Now you can proceed to delete the PVC and the PV.

# Delete the PVC and the PV
$ kubectl delete pvc <PVC>
$ kubectl delete pv <PV>

Go to AWS Console > EBS > Snapshots and create a volume from the snapshot. Before doing this operation, ensure that all tags from the snapshot were copied, so they can be applied to the new volume.
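The console step above can also be sketched with the AWS CLI. This is a minimal sketch, not a definitive procedure: the function name is hypothetical, the AWS CLI is assumed to be configured, and the gp2 volume type is an assumption taken from the storageClassName in the sample manifests at the end of this chapter. EBS volumes are tied to one availability zone, so the zone passed here should match the zone recorded in the PV.

```shell
# Hypothetical helper: create an EBS volume from a snapshot, apply the tags
# copied from the snapshot, and print the ID of the new volume.
# The body runs in a subshell ( ... ) so its variables stay local.
create_volume_from_snapshot() (
  snapshot_id=$1   # snapshot to restore from
  zone=$2          # availability zone, for example us-east-1b
  shift 2
  tags=""
  for pair in "$@"; do
    # Each remaining argument is a KEY=VALUE tag copied from the snapshot.
    tags="${tags:+$tags,}{Key=${pair%%=*},Value=${pair#*=}}"
  done
  if [ -z "$tags" ]; then
    echo "pass the snapshot tags as KEY=VALUE arguments" >&2
    exit 1   # leaves only the subshell, so the function returns 1
  fi
  aws ec2 create-volume \
    --snapshot-id "$snapshot_id" \
    --availability-zone "$zone" \
    --volume-type gp2 \
    --tag-specifications "ResourceType=volume,Tags=[$tags]" \
    --query VolumeId --output text
)
```

The printed volume ID is the value that goes into the volumeID field of mm-pv.yaml, after the aws://<zone>/ prefix.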

Before manually creating the PV and the PVC, edit mm-pv.yaml and mm-pvc.yaml to ensure that dynamic information such as selfLink and uid is deleted.

In mm-pv.yaml, the volumeID must contain the new reference if it changed. Also ensure that the claimRef section is deleted; claimRef is a field of the PV spec and still points at the PVC that was just removed.

In mm-pvc.yaml, the volumeName must correspond to the name of the PV being recreated.
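Once both files are edited, the PV and the PVC can be recreated from them. The following is a minimal sketch, assuming the objects live in the current namespace; the function name is hypothetical, and the check for resourceVersion is an addition to the fields named in this chapter, since that field is also generated by the cluster.

```shell
# Hypothetical helper: refuse to apply manifests that still contain
# cluster-generated fields, then recreate the PV before the PVC.
# The body runs in a subshell ( ... ) so its variables stay local.
recreate_pv_and_pvc() (
  pv_file=$1    # for example mm-pv.yaml
  pvc_file=$2   # for example mm-pvc.yaml
  if grep -Eq '^[[:space:]]*(selfLink|uid|resourceVersion):' "$pv_file" "$pvc_file"; then
    echo "remove selfLink, uid and resourceVersion before applying" >&2
    exit 1   # leaves only the subshell, so the function returns 1
  fi
  kubectl create -f "$pv_file" &&
    kubectl create -f "$pvc_file" &&
    kubectl get pvc   # the recreated claim should eventually show as Bound
)
```

A call such as recreate_pv_and_pvc mm-pv.yaml mm-pvc.yaml creates the PV first so that the PVC can bind to it by volumeName.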

At this point, we can start the instance again.

# Start the instance (either from the operations center UI or from the command line)
$ kubectl scale statefulset/controller2 --replicas=1
statefulset "controller2" scaled

Below are examples of mm-pv.yaml and mm-pvc.yaml as they look before being applied.

mm-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: aws-ebs-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
  creationTimestamp: null
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    failure-domain.beta.kubernetes.io/region: us-east-1
    failure-domain.beta.kubernetes.io/zone: us-east-1b
  name: pvc-cf8db87b-34de-11e9-86e4-1201fa18b3da
spec:
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://us-east-1b/<VOLUME_ID>
  capacity:
    storage: 20Gi
  persistentVolumeReclaimPolicy: Delete
  storageClassName: gp2
status: {}

mm-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
  creationTimestamp: null
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    com.cloudbees.cje.tenant: mm-1
    com.cloudbees.cje.type: master
    com.cloudbees.pse.tenant: mm-1
    com.cloudbees.pse.type: master
  name: jenkins-home-mm-1-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: gp2
  volumeName: pvc-cf8db87b-34de-11e9-86e4-1201fa18b3da
status: {}

In August 2020, the Jenkins project voted to replace the term master with controller. We have taken a pragmatic approach to cleaning these up, ensuring the least downstream impact possible. CloudBees is committed to ensuring a culture and environment of inclusiveness and acceptance; this includes ensuring the changes are not just cosmetic ones, but pervasive. As this change happens, please note that the term master has been replaced throughout the latest versions of the CloudBees documentation with controller (as in managed controller, client controller, team controller) except when still used in the UI or in code.