Setup wizard
When a controller running in HA mode starts for the first time, one of the controller replicas acquires a lock in the shared `JENKINS_HOME`. This replica is the only one available, and the lock remains until the Setup wizard is ended by a user.
When the Setup wizard ends, the remaining replicas continue the startup process. During this process the remaining replicas, one by one and automatically, acquire the lock, start, and release the lock until all of them are available.
However, if the controller is created using a CasC bundle, the Setup wizard is not displayed and all the replicas automatically follow the same process described above without any human confirmation. One by one, they acquire the lock, start, and release the lock until all of them are up and running.
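The startup sequence described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not how CloudBees CI implements the lock: `start_replicas`, `finish_wizard`, and the replica names are hypothetical, and a `threading.Lock` stands in for the lock held in the shared `JENKINS_HOME`.

```python
import threading


def start_replicas(replica_names, finish_wizard=None):
    """Simulate replicas starting one at a time under a shared lock.

    finish_wizard: hypothetical callable standing in for the user completing
    the Setup wizard. Pass None to model a controller created from a CasC
    bundle, where no wizard is shown.
    """
    lock = threading.Lock()  # stand-in for the lock in the shared JENKINS_HOME
    available = []           # replicas in the order they became available
    wizard_pending = [finish_wizard is not None]

    def start(name):
        with lock:  # only one replica may start at a time
            if wizard_pending[0]:
                # The first replica keeps the lock until the wizard ends.
                finish_wizard()
                wizard_pending[0] = False
            available.append(name)
        # Leaving the "with" block releases the lock for the next replica.

    threads = [threading.Thread(target=start, args=(n,)) for n in replica_names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return available
```

Calling `start_replicas(["replica-0", "replica-1"], finish_wizard=some_callable)` models the interactive case, and omitting `finish_wizard` models the CasC case. The order in which replicas obtain the lock is not deterministic in this sketch.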
Workload distribution in HA
HA distributes the different pipeline builds among the replicas, and if a replica fails, running builds continue and are adopted by another replica.
Starting with version 2.426.1.2, CloudBees CI provides explicit load balancing for controllers running in HA mode.
Explicit load balancing redirects new builds to the controller replica with the least load.
CloudBees CI calculates the load using a simple metric that considers several factors.
CloudBees CI provides explicit load balancing in most cases. The table below summarizes supported and unsupported cases:
| Job type | Scheduling strategy |
|---|---|
| Interactive trigger (Build Now) | Replica with the least workload |
| Scheduled build (Cron job) | Replica with the least workload |
| Branch indexing (Multibranch and Organization folder jobs) | Replica with the least workload |
| Webhooks (including multibranch events) | Replica with the least workload |
| REST API triggers | Replica with the least workload |
| | Replica with the least workload |
| | Always the same replica as the upstream build. |
| | Replica with the least workload |
| Any other trigger type | Same replica that processed the trigger |
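The scheduling rules in the table can be sketched as a small selection function. This is a minimal illustration, not the CloudBees CI implementation: the trigger identifiers, the `choose_replica` function, and the single numeric load value per replica are all assumptions, since the actual load metric is not specified here. Rows whose job type is not given in the table are left out.

```python
# Hypothetical identifiers for the trigger types named in the table.
LEAST_LOADED_TRIGGERS = {
    "build_now",        # Interactive trigger (Build Now)
    "cron",             # Scheduled build (Cron job)
    "branch_indexing",  # Multibranch and Organization folder jobs
    "webhook",          # Webhooks, including multibranch events
    "rest_api",         # REST API triggers
}


def choose_replica(trigger, loads, receiving_replica):
    """Pick the replica that should run a new build.

    loads: dict mapping replica name to an assumed numeric load value.
    receiving_replica: the replica that processed the trigger.
    """
    if trigger in LEAST_LOADED_TRIGGERS:
        # Explicit load balancing: choose the replica with the least load.
        return min(loads, key=loads.get)
    # Any other trigger type stays on the replica that processed it.
    return receiving_replica
```

For example, with loads of `{"replica-0": 5, "replica-1": 2, "replica-2": 7}`, a webhook received by `replica-2` would be routed to `replica-1`, while an unlisted trigger type would stay on `replica-2`.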
Plugin installation and HA
Plugins can be managed and installed from the plugin management screen. When using HA with multiple replicas, dynamic loading of plugins (plugin installation without restarting CloudBees CI) is not supported. Therefore, you must restart each replica of the controller to install or upgrade plugins.
In CloudBees CI on modern cloud platforms with a managed controller running in HA mode, selecting the "Restart Jenkins when installation is complete and no jobs are running" option performs a rolling restart. When the rolling restart completes, the new plugin versions are available in all replicas.
In CloudBees CI on traditional platforms running in HA mode with multiple replicas, you must restart all controller replicas either manually or using your own automation.
When the controller is running in HA mode with only one replica, the behavior is the same as for a non-HA controller.
HA and REST-API endpoints
When running a controller in HA mode, requests to pull-based API endpoints may return information only about the controller replica that responds to the request, instead of aggregated information about all the controller replicas that are part of the HA cluster.
Examples of these endpoints are:
- The `/metrics` endpoint provided by the Metrics plugin.
- The `/monitoring` endpoint provided by the Monitoring plugin.
For example, when using those plugins, if you make an HTTP API query for JVM heap usage, the returned value would only correspond to the replica that processed the request and not provide insight into other replicas. However, other information, like the number of projects, is accurate because it is automatically synchronized among all the controller replicas.
In general, responses are accurate and display aggregated replica information for:
- Global settings.
- List of jobs, folders, etc., and their configuration.
- List of permanent or static agents and their configuration.
- Set of completed builds for a given job.
However, with limited exceptions, endpoints display information only about the replica responding to the request for:
- JVM information (current heap usage, CPU, etc.).
- Queue items.
- List of running builds.
- List of ephemeral agents connected to the replica.
- Status of static agents connected to the replica.
CloudBees CI overrides the following Jenkins core endpoints to provide aggregated information about running builds and agents:
- The endpoint `/job/xxx/api/json?tree=builds[number,building,result]` returns aggregated information about running builds in all the controller replicas.
- The endpoint `/computer/api/json?tree=computer[displayName,offline]` returns aggregated information about agents connected to all the controller replicas.
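As a minimal sketch of how a client might consume the two endpoints above, the following helpers build the request URL for the builds endpoint and extract running builds and offline agents from the JSON responses. The base URL, job name, and function names are hypothetical, and authentication and the HTTP request itself are left out. The payload shapes assume the standard Jenkins JSON API response for these `tree` expressions.

```python
import json
from urllib.parse import quote


def builds_url(base_url, job_name):
    """Build the URL of the builds endpoint for one job."""
    tree = "builds[number,building,result]"
    return f"{base_url.rstrip('/')}/job/{quote(job_name)}/api/json?tree={tree}"


def running_builds(payload):
    """Return the build numbers that are still running, from a JSON response."""
    data = json.loads(payload)
    return [b["number"] for b in data.get("builds", []) if b.get("building")]


def offline_agents(payload):
    """Return the display names of offline agents, from a JSON response."""
    data = json.loads(payload)
    return [c["displayName"] for c in data.get("computer", []) if c.get("offline")]
```

The square brackets in the `tree` expression are kept unencoded, as in the endpoint paths listed above. Some HTTP clients need an option to accept them literally.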
These endpoints do not present aggregated information without the Requests to these endpoints where the
You can also configure third-party monitoring solutions, such as Prometheus with the CloudBees Prometheus Metrics plugin, to provide aggregated information from all the controller replicas.
When using pull-based endpoints, whether responses provide aggregated or single-replica information depends on the implementation of the plugins and the entry points that provide the information. CloudBees recommends testing those pull-based entry points beforehand to verify which specific data is returned. The scenario is different for push-based monitoring plugins, where data is sent directly from your CloudBees CI instance to the monitoring application. In that case, depending on your specific requirements, the data from the various replicas can either be consolidated by sending it to the same container or kept separate.