Resource management and scaling


This page covers configurations for allocating CPU and memory resources, implementing horizontal pod autoscaling, customizing health probe settings, and managing container constraints. These configurations enable you to tune CloudBees CD/RO component behavior, handle varying workloads, and prevent resource contention issues in your Kubernetes cluster.

Configure custom resources for init job

If using CloudBees CD/RO v2024.06.0 or later, you can configure the resources allocated to the CloudBees CD/RO init job within your values file using jobInit.resources. To get started:

  1. If you do not already have a v2024.06.0 or later values file, update your existing values file with the following fields for jobInit:

    resources:
      limits:
        cpu: 4
        memory: 6Gi
      requests:
        cpu: 2
        memory: 6Gi
  2. Set your custom values for the fields.

    The default values are the minimum suggested values. In some environments, these settings may not be sufficient, causing the init job to fail. If this occurs, increase the values for your environment and run the installation or upgrade again.
  3. Ensure the YAML is valid, and save your changes.

  4. Deploy the updated chart to your environment using your helm upgrade command.

The CloudBees CD/RO init job will now be allocated the configured resources.
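
As an illustration, a values file for an environment where the init job needs more headroom might look like the sketch below. Only the jobInit.resources structure comes from the steps above; the raised numbers are assumptions chosen for the example, not recommended settings.

```yaml
# myvalues.yaml (excerpt)
jobInit:
  resources:
    limits:
      cpu: 6        # assumed value, raised from the default of 4
      memory: 8Gi   # assumed value, raised from the default of 6Gi
    requests:
      cpu: 2        # default
      memory: 8Gi   # assumed value; kept equal to the limit, as in the defaults
```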

Configure custom probing values

In CloudBees CD/RO v2023.12.0 and later Helm charts, liveness and readiness probe values were added for:

  • CloudBees CD/RO server jobInit

  • CloudBees CD/RO web server

  • CloudBees CD/RO repository server

To override the default values in the CloudBees CD/RO Helm charts, follow the instructions below to configure custom values in your myvalues.yaml.

Configure probing values for CloudBees CD/RO server jobInit

Add custom jobInit probe values

To configure custom jobInit.livenessProbe values:

  1. In your myvalues.yaml, navigate to values.jobInit.

  2. Add the following fields under jobInit:

    ## Kubernetes Liveness Probes:
    livenessProbe:
      initialDelaySeconds: <INITIAL-DELAY-IN-SECONDS>
      periodSeconds: <PROBE-PERIOD-IN-SECONDS>
      timeoutSeconds: <TIMEOUT-LENGTH-IN-SECONDS>

    • The default values are:

      ## Kubernetes Liveness Probes:
      livenessProbe:
        initialDelaySeconds: 60
        periodSeconds: 60
        timeoutSeconds: 10
  3. Install or upgrade your CloudBees CD/RO instance to apply these values.
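
For reference, this is roughly how the override sits in myvalues.yaml once the placeholders are filled in. The numbers are assumed values for a slow-starting environment, not recommendations.

```yaml
# myvalues.yaml (excerpt)
jobInit:
  ## Kubernetes Liveness Probes:
  livenessProbe:
    initialDelaySeconds: 120   # assumed value; the default is 60
    periodSeconds: 60          # default
    timeoutSeconds: 20         # assumed value; the default is 10
```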

Configure probing values for CloudBees CD/RO web server

Add custom web probe values

To configure custom web.livenessProbe or web.readinessProbe values:

  1. In your myvalues.yaml, navigate to values.web.

  2. Add the following applicable fields under web:

    web.livenessProbe:

    ## Kubernetes Liveness Probes:
    livenessProbe:
      initialDelaySeconds: <INITIAL-DELAY-IN-SECONDS>
      periodSeconds: <PROBE-PERIOD-IN-SECONDS>
      timeoutSeconds: <TIMEOUT-LENGTH-IN-SECONDS>

    web.readinessProbe:

    ## Kubernetes Readiness Probes:
    readinessProbe:
      initialDelaySeconds: <INITIAL-DELAY-IN-SECONDS>
      periodSeconds: <PROBE-PERIOD-IN-SECONDS>
      timeoutSeconds: <TIMEOUT-LENGTH-IN-SECONDS>
      failureThreshold: <FAILURE-THRESHOLD-IN-SECONDS>

    • The default values are:

      web.livenessProbe defaults:

      ## Kubernetes Liveness Probes:
      livenessProbe:
        initialDelaySeconds: 10
        periodSeconds: 60
        timeoutSeconds: 10
        failureThreshold: 3

      web.readinessProbe defaults:

      ## Kubernetes Readiness Probes:
      readinessProbe:
        initialDelaySeconds: 10
        periodSeconds: 5
        timeoutSeconds: 10
        failureThreshold: 3
  3. Install or upgrade your CloudBees CD/RO instance to apply these values.
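
The sketch below shows both web probes set together. The values are assumptions for illustration. In Kubernetes, a pod is marked not ready only after failureThreshold consecutive failed checks, so with the assumed readiness settings here that takes on the order of a minute (6 checks at a 10-second period).

```yaml
# myvalues.yaml (excerpt)
web:
  ## Kubernetes Liveness Probes:
  livenessProbe:
    initialDelaySeconds: 30   # assumed value; the default is 10
    periodSeconds: 60         # default
    timeoutSeconds: 10        # default
  ## Kubernetes Readiness Probes:
  readinessProbe:
    initialDelaySeconds: 30   # assumed value; the default is 10
    periodSeconds: 10         # assumed value; the default is 5
    timeoutSeconds: 10        # default
    failureThreshold: 6       # assumed value; the default is 3
```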

Configure probing values for CloudBees CD/RO repository server

Add custom repository probe values

To configure custom repository.livenessProbe or repository.readinessProbe values:

  1. In your myvalues.yaml, navigate to values.repository.

  2. Add the following applicable fields under repository:

    repository.livenessProbe:

    ## Kubernetes Liveness Probes:
    livenessProbe:
      initialDelaySeconds: <INITIAL-DELAY-IN-SECONDS>
      periodSeconds: <PROBE-PERIOD-IN-SECONDS>
      timeoutSeconds: <TIMEOUT-LENGTH-IN-SECONDS>
      failureThreshold: <FAILURE-THRESHOLD-IN-SECONDS>

    repository.readinessProbe:

    ## Kubernetes Readiness Probes:
    readinessProbe:
      initialDelaySeconds: <INITIAL-DELAY-IN-SECONDS>
      periodSeconds: <PROBE-PERIOD-IN-SECONDS>
      timeoutSeconds: <TIMEOUT-LENGTH-IN-SECONDS>
      failureThreshold: <FAILURE-THRESHOLD-IN-SECONDS>

    • The default values are:

      repository.livenessProbe defaults:

      ## Kubernetes Liveness Probes:
      livenessProbe:
        initialDelaySeconds: 120
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 3

      repository.readinessProbe defaults:

      ## Kubernetes Readiness Probes:
      readinessProbe:
        initialDelaySeconds: 120
        periodSeconds: 5
        timeoutSeconds: 5
        failureThreshold: 3
  3. Install or upgrade your CloudBees CD/RO instance to apply these values.
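
Helm merges the values you supply with the chart's defaults key by key, so it is usually enough to set only the fields you want to change. The following is a minimal sketch, assuming the chart reads each probe field individually from the values file; the 180-second delay is an illustrative value for a repository that starts slowly.

```yaml
# myvalues.yaml (excerpt)
repository:
  livenessProbe:
    initialDelaySeconds: 180   # assumed value; other fields keep their defaults
  readinessProbe:
    initialDelaySeconds: 180   # assumed value; other fields keep their defaults
```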

Configure autoscale server pods

A HorizontalPodAutoscaler (HPA) automatically updates a workload resource to scale the workload to match demand. HPA deploys additional pods in response to an increased load.

For more information, refer to Horizontal Pod Autoscaling.
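
For orientation, a generic Kubernetes HPA object that scales a workload on CPU and memory utilization looks roughly like the sketch below. It is not the manifest rendered by the CloudBees CD/RO chart; the object and workload names are hypothetical.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-web            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-web          # hypothetical workload to scale
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale up above 80% average CPU
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # scale up above 80% average memory
```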

CloudBees CD/RO includes horizontal pod autoscaling support for the following deployment components:

  • CloudBees CD/RO server

  • Web server

  • Repository server

CloudBees CD/RO server

The CloudBees CD/RO server supports HPA only when clusteredMode is true.

To enable HPA for the CloudBees CD/RO server, add the following parameter values:

server:
  autoscaling:
    enabled: true # enable: true to enable HPA for server
    minReplicas: 1 # Min Number of Replicas
    maxReplicas: 3 # Max Number of Replicas to scale
    targetCPUUtilizationPercentage: 80 # CPU Threshold to scale up
    targetMemoryUtilizationPercentage: 80 # Memory Threshold to scale up
    templates: []
    # Custom or additional autoscaling metrics
    # ref: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-custom-metrics
    # - type: Pods
    #   pods:
    #     metric:
    #       name: repository_process_requests_total
    #     target:
    #       type: AverageValue
    #       averageValue: 10000m

server.autoscaling.minReplicas must match server.replicas.
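
Putting the two server requirements together, a values excerpt might look like the following sketch. It assumes clusteredMode is a top-level key in the values file, and the replica counts are illustrative.

```yaml
# myvalues.yaml (excerpt)
clusteredMode: true        # assumption: top-level key; required for server HPA

server:
  replicas: 2              # assumed value
  autoscaling:
    enabled: true
    minReplicas: 2         # matches server.replicas
    maxReplicas: 4         # assumed value
    targetCPUUtilizationPercentage: 80
    targetMemoryUtilizationPercentage: 80
```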

Web server

The web server supports scaling in both cluster and non-cluster modes.

To enable HPA for the web server, add the following parameter values:

web:
  autoscaling:
    enabled: true # enable: true to enable HPA for web
    minReplicas: 1 # Min Number of Replicas
    maxReplicas: 3 # Max Number of Replicas to scale
    targetCPUUtilizationPercentage: 80 # CPU Threshold to scale up
    targetMemoryUtilizationPercentage: 80 # Memory Threshold to scale up
    templates: []
    # Custom or additional autoscaling metrics
    # ref: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-custom-metrics
    # - type: Pods
    #   pods:
    #     metric:
    #       name: repository_process_requests_total
    #     target:
    #       type: AverageValue
    #       averageValue: 10000m

web.autoscaling.minReplicas must match web.replicas.

Repository server

The repository server supports scaling in both cluster and non-cluster modes.

To enable HPA for the repository server, add the following parameter values:

repository:
  autoscaling:
    enabled: true # enable: true to enable HPA for repository
    minReplicas: 1 # Min Number of Replicas
    maxReplicas: 3 # Max Number of Replicas to scale
    targetCPUUtilizationPercentage: 80 # CPU Threshold to scale up
    targetMemoryUtilizationPercentage: 80 # Memory Threshold to scale up
    templates: []
    # Custom or additional autoscaling metrics
    # ref: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-custom-metrics
    # - type: Pods
    #   pods:
    #     metric:
    #       name: repository_process_requests_total
    #     target:
    #       type: AverageValue
    #       averageValue: 10000m

repository.autoscaling.minReplicas must match repository.replicas.
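
The templates field accepts additional autoscaling metrics. The sketch below uses the Pods metric from the chart's own comments in place of the empty list. It assumes your cluster exposes that metric through the Kubernetes custom metrics API (for example, via a metrics adapter), which is outside the scope of the chart.

```yaml
# myvalues.yaml (excerpt)
repository:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
    targetCPUUtilizationPercentage: 80
    targetMemoryUtilizationPercentage: 80
    templates:
      - type: Pods
        pods:
          metric:
            name: repository_process_requests_total   # must be served by the custom metrics API
          target:
            type: AverageValue
            averageValue: 10000m
```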

Configure memory limits for CloudBees CD/RO components

During periods of high workload, a server component could run out of memory if it requests more memory than is allocated to the JVM. To increase the memory for a component, first allocate more memory to the component’s container. Then, depending on the component, increase the memory allocation for the component running in the container accordingly. Refer to Cluster capacity for default container memory settings.

The following configurations can be used to change the memory allocation for each container and component.

| Component | Container memory limit | Component memory setting | Example |
|---|---|---|---|
| CloudBees CD/RO server | server.resources.limits.memory | server.ecconfigure | server.ecconfigure: "--serverInitMemoryMB=4096 --serverMaxMemoryMB=4096" |
| CloudBees CD/RO web server | web.resources.limits.memory | N/A | |
| Repository server | repository.resources.limits.memory | repository.ecconfigure | ecconfigure: " --repositoryInitMemoryMB=256 --repositoryMaxMemoryMB=512" |
| CloudBees Analytics server | analytics.resources.limits.memory | analytics.heapSize (heap size in MB for CloudBees Analytics) | |
| Bound agent | boundAgent.resources.limits.memory | boundAgent.ecconfigure | ecconfigure: "--repositoryInitMemoryMB=256 --repositoryMaxMemoryMB=512" |

The CloudBees CD/RO bound agent (flow-bound-agent) is an internal component used specifically by CloudBees CD/RO for internal operations. While it is possible to schedule user jobs on bound agents, they are not intended for this purpose, and CloudBees CD/RO agents should be used instead.

If operations other than CloudBees CD/RO internal operations run on bound agents, CloudBees CD/RO performance may become unpredictable. Additionally, system requirements for CloudBees CD/RO instances assume that bound agents are used exclusively by CloudBees CD/RO, and are not reliable for instances where user jobs are also running.

Inject new memory limits using Helm. Update your local values file (here called myvalues.yaml) with the new values and run the helm upgrade command.

helm upgrade <releaseName> <chartName> \
  -f <valuesFile> --namespace <nameSpace> --timeout 10000s
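
As a sketch of what the values file passed to this command might contain, the excerpt below raises the container limit and the component memory for the server and repository together. The ecconfigure strings are the examples from this section; the container limits are assumed values, chosen to stay above the component's maximum memory so that the process has some headroom inside the container.

```yaml
# myvalues.yaml (excerpt)
server:
  resources:
    limits:
      memory: 6Gi    # assumed container limit
  ecconfigure: "--serverInitMemoryMB=4096 --serverMaxMemoryMB=4096"

repository:
  resources:
    limits:
      memory: 1Gi    # assumed container limit
  ecconfigure: "--repositoryInitMemoryMB=256 --repositoryMaxMemoryMB=512"
```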

Configure VM memory maps

The default configuration of your Linux kernel’s vm.max_map_count may not support the needs of your Docker containers. There are multiple scenarios, such as logging and analytics, where you may need to increase the vm.max_map_count.

To increase the vm.max_map_count in a Docker container, run:

helm upgrade --install node-level-sysctl node-level-sysctl -n kube-system \
  --set "parameters.vm\.max_map_count=<replace_with_your_value>"
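
The escaped dot in the --set argument tells Helm that vm.max_map_count is a single key under parameters. If you prefer to keep the setting in a file, the equivalent values fragment would look like the sketch below. The file name is hypothetical and 262144 is only an example value; use the value your workloads require.

```yaml
# sysctl-values.yaml (hypothetical file name)
parameters:
  vm.max_map_count: 262144   # example value only
```

You would then pass the file with -f sysctl-values.yaml instead of the --set argument.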

Configuration to remove pod affinity

By default, CloudBees CD/RO components are configured with a podAntiAffinity constraint. This influences the Kubernetes scheduler to prefer distributing component pods across multiple nodes, reducing the risk of a single node failure impacting multiple instances of a service and improving high availability.

However, the podAntiAffinity constraint is optional. You can remove it for specific or all CloudBees CD/RO components by setting podAntiAffinity to an empty value in the component Helm chart.

Once removed, the scheduler instead considers other constraints and ultimately assigns pods to a node based on resource availability and scheduling rules. If no constraints exist, it defaults to placing the pod on the least loaded node. In either case, this may result in multiple CloudBees CD/RO component pods being assigned to the same node.

If podAntiAffinity is removed, the Kubernetes scheduler may assign multiple CloudBees CD/RO component pods to the same node. If this node fails, multiple CloudBees CD/RO instances may be unavailable until Kubernetes reschedules them, or they are manually recovered.

To remove pod affinity from a component:

The following steps apply to all CloudBees CD/RO components, including the CloudBees CD/RO agent, which is configured separately in the agent values file.

  1. Open your values file and navigate to the component section.

  2. Locate the <component>.affinity field.

  3. Update the affinity field from:

    affinity: {}

    to:

    affinity:
      podAntiAffinity: {}
  4. Ensure your values file is properly formatted, then save your changes.

  5. (Optional) Repeat for each component as needed.

  6. Run your Helm upgrade command to apply the changes and update the deployment.

You have removed the pod affinity constraint, and the Kubernetes scheduler will assign CloudBees CD/RO component pods based on your cluster configuration.
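
For example, a values file that removes the constraint from the server and web server, while leaving other components at their defaults, might contain the following. The choice of components is illustrative.

```yaml
# myvalues.yaml (excerpt)
server:
  affinity:
    podAntiAffinity: {}   # removes the default anti-affinity constraint for the server

web:
  affinity:
    podAntiAffinity: {}   # removes the default anti-affinity constraint for the web server
```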

If you experience scheduling issues or increased downtime after making these changes, CloudBees recommends restoring the values to their default settings.

How to add additional container values for sidecar injectors

For the default CloudBees CD/RO Helm charts, refer to cloudbees-flow chart configuration values for each component. You can find these cloudbees-flow values referenced in the values.yaml under the *.additionalContainers tag for the corresponding component.

Table 1. Sidecar injector additional container descriptions
Key Description/Default

server.additionalContainers

To add additional containers for the server, uncomment the name, image, and command in your values file.

server:
  additionalContainers:
  # additionalContainers:
  #   - name: container-name
  #     image: image:version
  #     command:
  #       - "/container-command"

web.additionalContainers

To add additional containers, uncomment the name, image, and command in your values file.

web:
  additionalContainers:
  # additionalContainers:
  #   - name: container-name
  #     image: image:version
  #     command:
  #       - "/container-command"

analytics.additionalContainers

To add additional containers, uncomment the name, image, and command in your values file.

analytics:
  additionalContainers:
  # additionalContainers:
  #   - name: container-name
  #     image: image:version
  #     command:
  #       - "/container-command"

repository.additionalContainers

To add additional containers, uncomment the name, image, and command in your values file.

repository:
  additionalContainers:
  # additionalContainers:
  #   - name: container-name
  #     image: image:version
  #     command:
  #       - "/container-command"

boundAgent.additionalContainers

To add additional containers, uncomment the name, image, and command in your values file.

boundAgent:
  additionalContainers:
  # additionalContainers:
  #   - name: container-name
  #     image: image:version
  #     command:
  #       - "/container-command"
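
Once uncommented and filled in, an entry could look like the sketch below. The container name, image, and command are hypothetical and stand in for whatever sidecar you need to run; the example assumes the chart passes each list item through as a standard Kubernetes container spec.

```yaml
# myvalues.yaml (excerpt)
server:
  additionalContainers:
    - name: example-sidecar                              # hypothetical container name
      image: registry.example.com/example-sidecar:1.0    # hypothetical image
      command:
        - "/usr/local/bin/example-sidecar"               # hypothetical command
```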

Configure additional volume values for sidecar injectors

For the default CloudBees CD/RO Helm charts, refer to cloudbees-flow chart configuration values for each component. You can find these cloudbees-flow values referenced in the values.yaml under the *.additionalVolumes and *.additionalVolumeMounts tags for the corresponding component.

Table 2. Sidecar injector additional volume parameter descriptions and default values
Key Description/Default

server.additionalVolumes

server.additionalVolumeMounts

You can use ConfigMaps, Secrets, and PersistentVolumeClaims as additional volumes. The example below uses PersistentVolumeClaims. To add an additional volume and mount for the server, update the settings in the myvalues.yaml file.

server:
  additionalVolumes:
    # - name: volume0
    #   persistentVolumeClaim:
    #     claimName: volume0
  additionalVolumeMounts:
    # - name: volume0
    #   mountPath: /tmp/volume0

web.additionalVolumes

web.additionalVolumeMounts

To add an additional volume and mount, update the settings in the myvalues.yaml file.

web:
  additionalVolumes:
    # - name: volume0
    #   persistentVolumeClaim:
    #     claimName: volume0
  additionalVolumeMounts:
    # - name: volume0
    #   mountPath: /tmp/volume0

repository.additionalVolumes

repository.additionalVolumeMounts

To add an additional volume and mount, update the settings in the myvalues.yaml file.

repository:
  additionalVolumes:
    # - name: volume0
    #   persistentVolumeClaim:
    #     claimName: volume0
  additionalVolumeMounts:
    # - name: volume0
    #   mountPath: /tmp/volume0

boundAgent.additionalVolumes

boundAgent.additionalVolumeMounts

To add an additional volume and mount, update the settings in the myvalues.yaml file.

boundAgent:
  additionalVolumes:
    # - name: volume0
    #   persistentVolumeClaim:
    #     claimName: volume0
  additionalVolumeMounts:
    # - name: volume0
    #   mountPath: /tmp/volume0
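
The examples in this table use PersistentVolumeClaims. As a sketch of the ConfigMap case, the excerpt below mounts a ConfigMap into the server container as a read-only directory. The ConfigMap name, volume name, and mount path are hypothetical, and the example assumes the chart passes these entries through as standard Kubernetes volume and volumeMount definitions.

```yaml
# myvalues.yaml (excerpt)
server:
  additionalVolumes:
    - name: extra-config               # hypothetical volume name
      configMap:
        name: example-extra-config     # hypothetical ConfigMap that already exists in the namespace
  additionalVolumeMounts:
    - name: extra-config               # must match the volume name above
      mountPath: /tmp/extra-config     # hypothetical mount path
      readOnly: true
```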

Pre-provision volume snapshots as a PVC in StatefulSets

This how-to describes using a pre-provisioned volume snapshot as a PersistentVolumeClaim (PVC) within a PersistentVolume (PV) for a Kubernetes StatefulSet. The instructions on how to perform these actions may differ depending on your cloud provider; however, the general steps are:

  1. Create a PVC manifest.

  2. Create a PV manifest that references your PVC and snapshot.

  3. Apply these manifests to your cluster.

  4. Test your cluster to ensure the PV and PVC are present with the desired values.

Even though this how-to describes specific steps for CloudBees Analytics (flow-analytics), you can modify the steps here to apply to other CloudBees CD/RO components.

  1. Create a PVC manifest (pvc.yaml):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: PVCNAME
    spec:
      storageClassName: STORAGE_CLASS
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: STORAGE_SIZE

  2. Create a PV manifest (pv.yaml) that references the PVCNAME from your PVC manifest and your snapshot:

    The following example is based on using GCP’s gcePersistentDisk. Use the format required by your provider to create a reference for your snapshot.

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: PVNAME
    spec:
      storageClassName: STORAGE_CLASS
      capacity:
        storage: STORAGE_SIZE
      accessModes:
        - ReadWriteOnce
      claimRef:
        namespace: NAMESPACE
        name: PVCNAME
      # Use the directive from your provider to reference your snapshot
      gcePersistentDisk:
        pdName: CLOUD_DISK_NAME
        fsType: ext4

  3. Update your cluster with the manifest files:

    # Apply the PVC (pvc.yaml) and
    # PV (pv.yaml) to your cluster:
    kubectl apply -f pvc.yaml -f pv.yaml

  4. Assign the following variables:

    pvcName="<PVCNAME-from-pvc.yaml>"
    pvName="<PVNAME-from-pv.yaml>"

  5. Check if the PV is available in your cluster and has the desired values:

    kubectl get pv $pvName

  6. Check if the PVC is available in your cluster and has the desired values:

    kubectl get pvc $pvcName
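
On providers other than GCP, typically only the volume source block of the PV manifest changes. As a sketch, assuming a cluster that uses the AWS EBS CSI driver and an EBS volume that has already been restored from your snapshot, the PV might look like the following. CLOUD_VOLUME_ID is a hypothetical placeholder for the volume ID; check your provider's documentation for the exact fields it requires.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: PVNAME
spec:
  storageClassName: STORAGE_CLASS
  capacity:
    storage: STORAGE_SIZE
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: NAMESPACE
    name: PVCNAME
  # Assumption: AWS EBS CSI driver; this block takes the place of gcePersistentDisk
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: CLOUD_VOLUME_ID   # hypothetical placeholder for the EBS volume ID
    fsType: ext4
```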

Resolve volume node affinity conflicts

Sometimes pods can hang in the Pending state with the following error:

x/y nodes are available: y node(s) had volume node affinity conflict.

This can happen when the availability zone for the persistent volume claim is different from the availability zone of the node on which the pod gets scheduled.

A cluster administrator can address this issue by specifying the WaitForFirstConsumer mode, which delays the binding and provisioning of a PersistentVolume until a pod using the PersistentVolumeClaim is created. PersistentVolumes are selected or provisioned conforming to the topology that is specified by the pod’s scheduling constraints.

For more information, refer to Volume binding mode in Kubernetes storage classes.
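
A minimal StorageClass that uses this mode could look like the sketch below. The class name is hypothetical, and the provisioner shown is the GKE persistent disk CSI driver, used here only as an assumption; substitute the provisioner for your own platform.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-wait-for-consumer        # hypothetical name
provisioner: pd.csi.storage.gke.io       # assumption; use your provider's provisioner
volumeBindingMode: WaitForFirstConsumer  # delay binding until a pod is scheduled
```

PVCs that reference this class stay unbound until a pod that uses them is scheduled, so the volume is provisioned in the same zone as the node chosen for the pod.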