Upgrade Notes
- Operations center CloudBees Assurance Program plugin changes since 2.504.3.28224
-
The following plugins have been added to the operations center CloudBees Assurance Program since 2.504.3.28224:
- CloudBees License Tracker Plugin (cloudbees-license-tracker)
- jsoup API Plugin (jsoup)
- Controller CloudBees Assurance Program plugin changes since 2.504.3.28224
-
The following plugins have been added to the controller CloudBees Assurance Program since 2.504.3.28224:
- CloudBees License Tracker Plugin (cloudbees-license-tracker)
New Features
- CloudBees CI on modern cloud platforms now supports Kubernetes 1.33
-
CloudBees CI on modern cloud platforms now supports Kubernetes 1.33 on Azure Kubernetes Service, Amazon Elastic Kubernetes Service, Google Kubernetes Engine, and CNCF-certified Kubernetes platforms. For more information, refer to Supported platforms for CloudBees CI on modern cloud platforms.
Feature Enhancements
- CloudBees CI streamlined navigation and relocated global actions
-
CloudBees CI introduces a refreshed user interface, redesigned to align with the Jenkins 2.516.1 LTS release. Key navigation and header enhancements include:
- The Manage Jenkins page is now exclusively accessible via the Manage Jenkins icon in the upper-right corner.
- Alerts are now conveniently located under the Alerts icon in the upper-right corner.
- Global actions, such as Role-Based Access Control (RBAC) groups and roles, CloudBees CI Teams, Configuration as Code (CasC) bundle loading, and the CloudBees Support page, have moved to the More actions icon in the upper-right corner.
These options are no longer available from the left pane. Contextual actions relevant to the current page remain in the left pane, ensuring quick access to page-specific features, while global actions are centralized in the header for a more streamlined experience.
- The CloudBees CyberArk Credentials Provider plugin (cloudbees-cyberark-credentials) now only supports the CyberArk REST API
-
As of December 31, 2024, CyberArk no longer supports calling the Web service using SOAP requests, and only supports calling the Web service using the REST API. As a result, the CloudBees CyberArk Credentials Provider plugin (cloudbees-cyberark-credentials) has been updated to remove SOAP support, and now only supports the REST API. CyberArk version 10 or later is required. For more information, refer to Call the Web Service using REST.
- More stringent health check for High Availability (HA) controllers
-
As of version 2.504.1.6, the /health endpoint (which should be used for Kubernetes probes in the future) in a High Availability (HA) controller would verify that the replica’s Hazelcast networking library was running, but not that it was usable for communicating with other replicas. For example, if port 5701 were blocked, the replica would still be considered healthy, even though every replica thought it was part of a cluster of size one. Now, the health check tracks the set of running replicas via a directory in the shared $JENKINS_HOME folder. Therefore, if a split-brain scenario occurs in Hazelcast, the controller automatically scales itself down to one replica until the situation is resolved.
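For reference, a minimal sketch of Kubernetes probes pointed at this endpoint; the container port and timing values below are assumptions to adapt to your deployment, not chart defaults:

```yaml
# Sketch only: probe /health on the controller container.
# Port and timing values are assumptions.
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 300
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  periodSeconds: 10
  failureThreshold: 3
```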
- Custom security contexts for the Configuration as Code Bundle Retriever containers
-
The Helm chart used to install CloudBees CI now includes two properties to customize the Kubernetes security contexts for the Configuration as Code Bundle Retriever containers (see the sketch below):
- OperationsCenter.CasC.Retriever.containerSecurityContext for the sidecar container.
- OperationsCenter.CasC.Retriever.initContainerSecurityContext for the initContainer.
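For illustration, a minimal values-file sketch using these two properties; the property paths come from this note, while the individual securityContext fields are example values only:

```yaml
# Example values.yaml fragment. The two property paths are as documented
# above; the securityContext fields themselves are illustrative.
OperationsCenter:
  CasC:
    Retriever:
      containerSecurityContext:          # applied to the sidecar container
        runAsNonRoot: true
        allowPrivilegeEscalation: false
      initContainerSecurityContext:      # applied to the initContainer
        runAsNonRoot: true
        readOnlyRootFilesystem: true
```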
- Configurable API URL support for the CloudBees Platform Insights plugin (cloudbees-platform-insights)
-
The CloudBees Platform Insights plugin (cloudbees-platform-insights) uses a default, standard CloudBees Unify endpoint, which only supports controllers integrated with CloudBees Unify. This update introduces support for single-tenant CloudBees instances by allowing users to configure a custom CloudBees Platform API URL in the plugin configuration UI. With this enhancement, the plugin can now send data to the correct CloudBees Unify instance for processing and visualization, based on the specified endpoint. If no custom URL is configured, the plugin continues to default to the standard CloudBees Unify endpoint.
- CasC Controller Bundle Service allows loading credentials from Kubernetes secrets
-
To do this, set the chart value CascBundleService.credentialSecretsNamespace to the namespace containing the secrets you want to use. Then, specify the source attribute with the secret name for the credentials you want to load from Kubernetes secrets.
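As a rough sketch, assuming a hypothetical namespace cloudbees-secrets and secret github-token (only the credentialSecretsNamespace chart value and the source attribute are named in this note; the exact bundle schema is documented with the CasC Controller Bundle Service):

```yaml
# Helm values fragment: point the CasC Controller Bundle Service at the
# namespace holding the credential secrets. Namespace name is illustrative.
CascBundleService:
  credentialSecretsNamespace: cloudbees-secrets
---
# A standard Kubernetes secret in that namespace; a bundle credential
# entry can then reference it by name through its source attribute.
apiVersion: v1
kind: Secret
metadata:
  name: github-token
  namespace: cloudbees-secrets
stringData:
  token: replace-me
```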
- Allow skipping creation of CasC Controller Bundle Service configuration secret via Helm values
-
A new Helm value, CascBundleService.createConfig, has been introduced. Setting this value to false prevents Helm from creating the casc-bundle-service-config Kubernetes secret. This feature is intended for CloudBees CI deployments where secrets are handled by an external secret manager.
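For example, a values-file fragment such as the following; note that your external secret manager must then provide the casc-bundle-service-config secret itself:

```yaml
# Prevent Helm from creating the casc-bundle-service-config secret so
# that an external secret manager can supply it instead.
CascBundleService:
  createConfig: false
```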
- Multi-executor inbound agents in a High Availability (HA) controller now run from a single process
-
Previously, when configuring multiple virtual executors on an inbound permanent agent in a High Availability (HA) controller, a separate Java process was launched for each cloned agent. Now, all of the agent Java code runs from the same process as the original. This reduces memory usage and simplifies some platform-dependent code for looking up which processes are already running. It also allows a service wrapper to stop all executors at once without special setup.
For best results, run the agent with the version of Remoting bundled with the controller (or a newer version).
There is no change to the behavior of outbound permanent agents with multiple executors. As before, each physical agent (executor) is launched on demand by the replica that needs it and then shut down when idle.
Resolved Issues
- Agent unloading bug on ownership change during rolling restart in High Availability (HA) controllers
-
This fix addresses a bug that occurred when a build was running on an ephemeral agent and the replica handling it went down. In these scenarios, before another replica could adopt the build, the agent ownership was temporarily set to "pending for adoption" (introduced in 2.492.3.5). However, the agent was prematurely unloaded when its ownership status changed to "pending for adoption". As a result, when a new replica attempted to adopt the build, the agent had already been removed, resulting in the build failing.
This update ensures that the agent isn’t unloaded when its ownership is set to "pending for adoption." This allows builds to be successfully adopted and continued by another replica after failover.
- Unable to launch High Availability (HA) multi-executor permanent agents on Windows
-
When a permanent agent with multiple executors was running on Windows, the path for extra nodes wasn’t computed correctly. This led to errors that prevented the startup of these extra agents.
Now, the path is computed correctly on Windows, allowing normal usage of these agents.
- sh step failure when user process exited during High Availability (HA) rolling restart
-
When using the "watching" mode of durable Pipeline steps such as sh (enabled by default in High Availability (HA) controllers), if the user script exited after Jenkins had begun shutting down (for example, during a rolling restart), and specifically after the corresponding build had printed Pausing (Preparing for shutdown), the exit notification might be processed on the agent side but not on the controller side. This printed a FatalRejectedExecutionException error. In this case, after the build was resumed or adopted, it would print process apparently never started in … and eventually fail. Now, the exit event is ignored during shutdown and left for whichever controller process resumes the build.
- Ensure consistency of job builds REST API across replicas in High Availability (HA) controllers
-
When running a High Availability (HA) controller, the list of builds served through the REST API could differ slightly depending on the replica being browsed. This difference depended on the number of builds currently running for a given job.
This list is now trimmed to a maximum of 100 builds, which matches the default Jenkins implementation.
- Excessive threads consumed by WebSocket agents in High Availability (HA) controllers
-
When an inbound WebSocket agent connected to a replica of a High Availability (HA) controller and the connection had to be forwarded to another replica, eleven native threads were consumed by the first replica’s JVM. Because most of these threads were unused, this could lead to excessive memory consumption or ulimit exhaustion under heavy load. The WebSocket reverse proxy has now been reimplemented with a different client that permits more efficient thread pool usage.
- HazelcastInstanceNotActiveException could break a build starting during High Availability (HA) replica shutdown
-
A Pipeline build starting just as a replica of a High Availability (HA) controller was in its shutdown sequence could have failed immediately with a HazelcastInstanceNotActiveException stack trace. This error was related to FlowExecutionListStorage.register, because it was too late to add the build to the list of running builds recognized by other replicas. Now, this warning is non-fatal.
- Declining to create placeholder Pipeline step structure in High Availability (HA) controllers
-
When a controller attempted to load a Pipeline build from disk but failed to reconstruct its flow graph (the list of stages and steps), it would normally save a copy of the original metadata and then replace it with a placeholder. The placeholder indicated that the original metadata was broken and couldn’t be used. However, in High Availability (HA) controllers, timing-dependent issues sometimes caused replicas to temporarily observe inconsistent metadata. In these cases, attempts to clean up and swap in the placeholder caused worse problems than the ones the system was intended to solve. Therefore, the placeholder system is now disabled by default in High Availability (HA) controllers. Any persistent metadata corruption in a build is displayed as such whenever the build is viewed, without attempting to make further changes on disk.
- HTTP health check replaced prior to WebSocket agent proxied connections
-
Whenever a WebSocket agent connected to a High Availability (HA) controller replica that wasn’t the owner of that agent, a health check was made via HTTP to the owning replica before proxying the connection. With a large number of WebSocket agents attempting to connect, this could result in a high volume of HTTP connections.
Now, the health check is replaced with a readiness check through Hazelcast.
- Pipeline builds with numerous echo steps not saved in a timely manner during shutdown
-
If a Pipeline build runs many echo steps in close succession (which isn’t recommended for performance reasons) instead of encapsulating them in a single sh step, saving its state during controller shutdown could take a long time. This could potentially exceed, for example, a Kubernetes termination grace period, resulting in corrupted metadata and issues resuming the build. These problems could include failed adoption by another replica in High Availability (HA) controllers.
- Failure to resume Declarative Pipeline builds paused in a post block in High Availability (HA) controllers
-
Under certain conditions (specifically, build adoption in High Availability (HA) controllers), a Declarative Pipeline build paused in a post block could fail to resume.
- Corrupted builds during High Availability (HA) controller rolling restart
-
Under certain timing conditions, a rolling restart of a High Availability (HA) controller could result in two replicas claiming the same build at once. This could cause issues such as CloudBees Pipeline Explorer reporting malformed metadata. One cause was tracked to an unnecessary attempt by a new replica to resume all known builds, which is properly handled solely by the build adoption code.
- “Also cancelling shell steps” could abort pipeline builds due to much older corrupt builds
-
It was reported that, under certain conditions, an old Pipeline build (which the user didn’t realize was still running but whose metadata was corrupt) might persistently abort new and otherwise correct builds using the same permanent agent. This resulted in the message "Also cancelling shell steps running on…". Now, this action is limited to sh steps running in the same build where the corruption was observed.
- Queue items no longer lost when agents disconnect
-
Occasionally, if an agent disconnected while a queue item was being assigned to it, the queue item could be lost, leaving the job in an erroneous state.
Now, the queue item is returned to the queue, allowing the job to proceed as expected.
- -name @path.txt syntax not supported by multi-executor permanent agents in a High Availability (HA) controller
-
If an inbound permanent agent was configured on a High Availability (HA) controller with multiple executors, and the -name argument was given with @path.txt syntax rather than a literal agent name, the clones failed to connect.
- Lost launch log output for High Availability (HA) multi-executor permanent agent clones
-
The launch log for an inbound permanent agent set to use multiple executors on a High Availability (HA) controller shows the command used to launch each clone and that command’s output. Since version 2.504.3, this output was accidentally lost and sent instead to the controller’s standard output.
- High load prevented some permanent WebSocket agents from immediately reconnecting after a High Availability (HA) replica exited
-
This bug only affected permanent WebSocket agents under load. A race condition caused some of these agents to attempt to establish a connection with a High Availability (HA) replica that had exited. This could occur if a replica were deleted by Kubernetes for infrastructure reasons or during a rolling restart or upgrade. The condition would resolve itself in approximately 10 to 25 minutes. Now, all permanent WebSocket agents reconnect immediately under load.
- Reverse proxy in High Availability (HA) controllers may delegate to a non-ready replica
-
In High Availability (HA) controllers, when accessing an object that could be owned by a different replica (such as a build or node), the proxying replica could attempt to delegate to a replica that was already gone.
This change ensures that the proxy only targets valid replicas or fails immediately, giving a more meaningful error message.
- Misleading warnings from MultipleExecutorsProperty.launchExtraAgentsProcess in High Availability (HA) controllers
-
In version 2.504.3, a change intended to launch or shut down executor clones of inbound permanent agents in a High Availability (HA) controller was mistakenly also called on outbound permanent agents, which led to a warning. A misleading warning was also printed under some conditions for inbound agents that weren’t currently connected.
- Inbound High Availability (HA) multi-executor permanent agents did not work on some Windows versions
-
The system for tracking which clones of an inbound multi-executor permanent agent were already running on a High Availability (HA) controller relied on the wmic binary. This binary isn’t present by default in some newer versions of Windows (Windows 11 or Windows Server), which caused the main agent to misreport the clones’ status. Now, the status check is performed using the WinP library, which was already in use for other purposes in the same system and doesn’t depend on any specific executable.
- Compatibility with Kubernetes 1.33
-
The new version of the Kubernetes Client API Plugin (kubernetes-client-api) is required for compatibility with Kubernetes (K8S) 1.33.
- Client certificate credential authentication was not properly handled for the Bitbucket Branch Source plugin (cloudbees-bitbucket-branch-source)
-
The Bitbucket Branch Source plugin (cloudbees-bitbucket-branch-source) supports client certificate credential authentication. However, when attempting to connect to a Bitbucket server configured for mutual TLS (mTLS), the client certificate was not sent with requests, resulting in an HTTP error.
- Email Extension plugin (email-ext) issues with Configuration as Code
-
When exporting any element with a post-build action of type "Editable Email Notification" using Configuration as Code, the fields weren’t properly exported. Modifications were made to the Email Extension plugin (email-ext) to address this issue.
- Token Review Role-Based Access Control permissions only created when needed
-
The Helm chart automatically created the Token Review ClusterRole and ClusterRoleBinding. However, they are only needed when either the CasC Controller Bundle Service or CloudBees Pluggable Storage is enabled.
The Token Review ClusterRole and ClusterRoleBinding are now created only when needed.
- Errors resuming declarative builds from older releases after extra restart
-
If a Declarative Pipeline build was running in version 2.492.2.3 (or earlier) and the controller was then upgraded, the build would resume. However, if the controller was restarted a second time, the build would fail. This issue also impacted most running Declarative Pipelines during High Availability (HA) controller rolling upgrades.
Known Issues
- Duplicate plugins in the Operations center Plugin Manager UI
-
When you search for a specific plugin under the Available tab in the Operations center Plugin Manager, the search results show duplicate entries for the plugin.
- Corruption of the IdentityStorage file halts event processing and blocks system entry points
-
The file com.cloudbees.ci.license.tracker.consolidation.IdentityStorage may become corrupted during certain operations, resulting in lines that contain only the null literal.
When this file is corrupted, event processing is halted, which can block critical system entry points such as authentication and cause significant disruptions.
Refer to A bug corrupts a file from ULC (User License Counting) causing delayed logins and other potential side effects in CloudBees CI release 2.516.1.28665 for a workaround to apply on affected instances.