Upgrade Notes
- Removed controller remoting transport option for directory lookup
-
Controllers sometimes need to contact the operations center to inspect folders and jobs on the operations center or other connected controllers. As of CloudBees CI 2.426.1.2, this mechanism was switched from remoting to using HTTP, to better support HA controllers, but a system property
-Dcom.cloudbees.opscenter.client.plugin.RemoteDirectoryServiceImpl.useHttp=false was retained to permit switching back to the original implementation. This mode has been removed and the property is ignored if set.
Feature Enhancements
- WebSocket enabled by default for new client controller items
-
When creating a new client controller item, WebSocket is now enabled by default for the connection to the operations center, requiring no special network configuration. WebSocket uses the standard HTTP or HTTPS port, and for deployments using Kubernetes Gateway API, it works with no additional configuration because Gateway API routes HTTP/HTTPS traffic through
HTTPRoute resources. This does not affect new controllers defined via Configuration as Code.
- Administrative monitor for misuse of build artifact discard options
-
A new administrative monitor detects jobs configured to discard artifacts from older builds without deleting the builds altogether, which can cause serious performance problems.
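The condition this monitor looks for can be illustrated with a minimal sketch. The DiscardSettings class below is hypothetical and is not the monitor's actual implementation; it assumes four retention fields and the convention that -1 means "no limit", and it takes the simplest reading of the note: artifacts are limited while builds are not.

```java
// Hypothetical model of a job's build discarder settings (not the product's real classes).
final class DiscardSettings {
    // Assumption: -1 means "keep without limit".
    static final int UNLIMITED = -1;

    final int daysToKeep;
    final int numToKeep;
    final int artifactDaysToKeep;
    final int artifactNumToKeep;

    DiscardSettings(int daysToKeep, int numToKeep, int artifactDaysToKeep, int artifactNumToKeep) {
        this.daysToKeep = daysToKeep;
        this.numToKeep = numToKeep;
        this.artifactDaysToKeep = artifactDaysToKeep;
        this.artifactNumToKeep = artifactNumToKeep;
    }

    /** True when artifacts of older builds are discarded but the builds themselves are never deleted. */
    boolean discardsArtifactsOnly() {
        boolean artifactsLimited = artifactDaysToKeep != UNLIMITED || artifactNumToKeep != UNLIMITED;
        boolean buildsLimited = daysToKeep != UNLIMITED || numToKeep != UNLIMITED;
        return artifactsLimited && !buildsLimited;
    }
}
```

Under these assumptions, a job that keeps only the last five builds' artifacts but sets no build limit would be flagged, while a job that also limits the number of builds would not.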
Resolved Issues
- bat Pipeline steps hung on Windows SSH agents after controller restart
-
bat Pipeline steps would hang indefinitely on Windows agents connected via SSH after a controller restart. This requires the following variable to be set: -Dorg.jenkinsci.plugins.durabletask.WindowsBatchScript.USE_BINARY_WRAPPER=true
- Amazon EC2 plugin performance improvements
-
The Amazon EC2 plugin now uses a single SSH client to benefit from an NIO performance enhancement that reduces the number of threads created on controllers using the plugin, and prevents thread leaks when connection errors occur.
- Potential deadlocks in HA-specific administrative monitors
-
A coding pattern used in several administrative monitors present in HA controllers was prone to deadlocks and has been fixed. (An actual deadlock of this form was only observed in a newly introduced monitor prior to its release.)
- HTTP 503 outages caused by slow SecurityRealm blocking permission checks
-
Previously, a slow SecurityRealm (for example, Crowd or LDAP) when adding a group member could block all permission checks, causing HTTP 503 outages. The SecurityRealm lookup is now performed outside the write lock to prevent this cascade.
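The general pattern behind this fix, doing the slow lookup before taking the write lock, can be sketched as follows. This is an illustration only, not the product code: GroupMembership and its slowDirectoryLookup function are hypothetical stand-ins for the group store and the SecurityRealm call.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical group store; the lookup function stands in for a slow realm (Crowd, LDAP).
final class GroupMembership {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Set<String>> members = new HashMap<>();
    private final Function<String, String> slowDirectoryLookup;

    GroupMembership(Function<String, String> slowDirectoryLookup) {
        this.slowDirectoryLookup = slowDirectoryLookup;
    }

    void addMember(String group, String rawName) {
        // Slow call happens first, with no lock held, so readers are not blocked by it.
        String canonical = slowDirectoryLookup.apply(rawName);
        lock.writeLock().lock();
        try {
            // Only the quick in-memory update runs under the write lock.
            members.computeIfAbsent(group, g -> new HashSet<>()).add(canonical);
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean isMember(String group, String name) {
        lock.readLock().lock();
        try {
            Set<String> set = members.get(group);
            return set != null && set.contains(name);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With the lookup inside the write lock, every isMember call would wait for the directory server; with it outside, readers wait only for the map update.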
- Controllers stuck offline after an operations center restart due to a setChannel race condition
-
A race condition could cause controllers to get permanently stuck in a reconnection loop after an operations center restart. When multiple WebSocket connections arrived for the same controller simultaneously, the server incorrectly closed the new connection instead of the stale one, causing the controller to appear offline despite being reachable.
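The corrected behavior, keeping the newest connection and closing the one it displaces, can be sketched roughly as below. ControllerChannels and FakeConnection are hypothetical names used only for illustration; the sketch assumes that "newest" means the connection registered last.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a WebSocket connection from a controller.
final class FakeConnection {
    final String id;
    private volatile boolean open = true;

    FakeConnection(String id) {
        this.id = id;
    }

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

// Hypothetical registry of the active connection per controller.
final class ControllerChannels {
    private final ConcurrentMap<String, FakeConnection> active = new ConcurrentHashMap<>();

    void setChannel(String controller, FakeConnection incoming) {
        // put() atomically installs the new connection and returns the one it displaced.
        FakeConnection previous = active.put(controller, incoming);
        if (previous != null && previous != incoming) {
            previous.close(); // close the stale connection, never the incoming one
        }
    }

    FakeConnection current(String controller) {
        return active.get(controller);
    }
}
```

Because the swap is a single atomic operation, two simultaneous registrations cannot both conclude that the other is the one to keep.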
- HA controller replicas can get stuck at startup when a crashed replica leaves a stale NFSv3 file lock
-
During HA managed controller startup, replicas come online sequentially, coordinated by an exclusive lock on
$JENKINS_HOME/.launching. In rare cases, if a replica was killed abruptly while holding this lock, NFSv3 may not have released it, leaving the lock held by a process that no longer existed. Any subsequent replica attempting startup would block indefinitely waiting to acquire the lock.
- Rendering failure in cjoc-networkpolicy.yaml when ingressControllerSelector was set with CasC Bundle Retriever enabled
-
The cjoc-networkpolicy.yaml produced invalid YAML when NetworkPolicy.ingressControllerSelector was configured and OperationsCenter.CasC.Retriever.Enabled was true.
- File lock acquisition now retried during HA emergency adoption
-
If an HA replica crashed while running a build and left behind a file lock that was not immediately released, another replica could have attempted to adopt it, failed to acquire the lock, and permanently given up. The cluster now repeatedly attempts to adopt the build, assuming the lock will eventually be released.
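A retry loop of this kind can be sketched with the standard java.nio file locking API. This is not the HA implementation: LockRetry, the attempt count, and the pause are assumptions, and the sketch is bounded so that it terminates, whereas the note describes retrying until the lock is released.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.time.Duration;

// Hypothetical helper: keeps trying to take an exclusive file lock instead of giving up at once.
final class LockRetry {
    /** Returns the lock, or null if it was still unavailable after maxAttempts tries. */
    static FileLock acquireWithRetry(FileChannel channel, int maxAttempts, Duration pause)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // tryLock() returns null when another process holds the lock.
                FileLock lock = channel.tryLock();
                if (lock != null) {
                    return lock;
                }
            } catch (OverlappingFileLockException e) {
                // The lock is held elsewhere in this JVM; treat it as "not yet available".
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pause.toMillis());
            }
        }
        return null;
    }
}
```

A caller that must eventually succeed could wrap this in an outer loop or pass a large attempt count; the pause keeps the retries from hammering the shared file system.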
- LinkageError in Bitbucket Branch Source plugin caused by duplicate Jackson annotations
-
A LinkageError in the Bitbucket Branch Source plugin was caused by duplicate jackson-annotations classes shipped by both the Jackson 2 API plugin and the Jackson Annotations 2 API plugin.
- CasC Bundle Retriever may not correctly update the version of the bundle
-
Previously, if the CasC Bundle Retriever was configured with a scmBundlePath starting with / and automatic versioning (ocBundleAutomaticVersion) was enabled, the version of the bundle would not be updated.
- CasC Bundle Retriever sidecar failed to authenticate with the operations center for automatic bundle reload
-
The CasC Bundle Retriever sidecar failed to authenticate with the operations center and could not trigger automatic bundle reloads after an SCM update. The operations center is now notified immediately when a new bundle is available.
- Race condition aborting an sh step
-
When a Pipeline build running an sh (or bat or powershell) step is aborted, either directly by a user or in response to an event such as a timeout, the sh step should fail and the abort reason should propagate up to stop the build as a whole. Due to a race condition, if the user process happened to exit with code zero (success) within milliseconds of the abort request, the sh step could succeed, the build would continue (potentially to success), and the abort would be lost. Any abort received by the step now always fails the step, regardless of exit status or step configuration.
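The rule described here, that a received abort takes precedence over the exit code, reduces to a small decision function. The StepOutcome class below is a hypothetical illustration and not the durable task implementation.

```java
// Hypothetical decision logic: a received abort always wins over the process exit code.
final class StepOutcome {
    enum Result { SUCCESS, FAILURE, ABORTED }

    static Result resolve(int exitCode, boolean abortRequested) {
        if (abortRequested) {
            // Checked first, so an exit code of 0 arriving at the same moment cannot mask the abort.
            return Result.ABORTED;
        }
        return exitCode == 0 ? Result.SUCCESS : Result.FAILURE;
    }
}
```

The race in the original behavior corresponds to evaluating the exit code before the abort flag, which let a zero exit code win.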
- Reduced timeout for closing User Activity Monitoring database
-
A managed controller is by default given 30 seconds to shut down upon receiving a termination signal. The User Activity Monitoring plugin was allowing up to five seconds to finish writing a single JSON file, which was excessive, especially since this data collection is best-effort. The timeout has been reduced by default to one second to ensure that other more critical shutdown processes have plenty of time to run.
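A bounded close of this kind can be sketched with a standard ExecutorService. BoundedClose is a hypothetical helper, not the plugin's code, and it assumes the pending writes run on a dedicated executor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: wait briefly for pending writes, then give up so shutdown can proceed.
final class BoundedClose {
    /** Returns true if all pending work finished within the timeout. */
    static boolean close(ExecutorService writer, long timeoutMillis) throws InterruptedException {
        writer.shutdown(); // stop accepting new work; already submitted tasks still run
        if (writer.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return true;
        }
        writer.shutdownNow(); // best-effort data: interrupt the writer instead of waiting longer
        return false;
    }
}
```

Calling close(writer, 1000) would mirror the one-second default described above, leaving most of the 30-second termination window to other shutdown work.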
- Credentials fingerprints disabled by default in HA controllers
-
The default fingerprint storage is not compatible with HA managed controllers, yet Jenkins credentials attempted to record build fingerprints by default. Aside from the files becoming potentially corrupt from concurrent access, this could be a performance bottleneck in high-volume builds. Therefore, HA managed controllers now default to
-DCredentialsProvider.fingerprintEnabled=false.
- Startup error on Amazon Elastic File System about /var/jenkins/.gitconfig
-
Under certain conditions, an HA controller starting on Amazon Elastic File System (EFS) could fail with the error "could not lock config file /var/jenkins_home/.gitconfig: File exists". This step is now treated as best-effort rather than fatal.
- HA multi-executor agent scale-down occasionally triggered unnecessary JVM restart
-
When using multiple executors on a permanent agent in an HA controller and reducing the scale, background threads occasionally remained running after the scale-down. These threads appeared to indicate that stopping the extraneous executors had failed, when in fact there was no real issue, forcing the agent JVM to restart unnecessarily.
- Deprecated hibernation monitor POST endpoint removed
-
The hibernation monitor used to access hibernated managed controllers has a deprecated /proxy/<NAME>/ endpoint, which is documented to handle GET requests. As of CloudBees CI 2.528.1.29783, this was extended to handle POST requests to work around a limitation in the MCP Server plugin. The new request method was not documented or tested, and did not support the non-deprecated namespaced endpoint format (or the protected variant). The MCP Server plugin has since been improved so that this temporary workaround is no longer needed, and it has been removed.
- Performance issues caused by ExpiringTokensMonitor
-
Previously, the expiring service account tokens monitor triggered expensive Role-Based Access Control permission checks on every folder during page load, causing timeouts on instances with many folders. This has been fixed.
- Job instance list failed to load
-
Previously, the list of job instances failed to load when viewing a Multibranch Pipeline Template via the Pipeline Template Catalog folder path.
- Test SSH Connection incorrectly reported missing credentials for folder-scoped and restricted credentials on shared agents
-
The Test SSH Connection validation button on shared agents configured with folder-scoped or restricted SSH credentials incorrectly reported "credentials cannot be found".
- master-networkpolicy.yaml invalid YAML due to incorrect JMX selector indentation
-
Previously, JMX network policy rules in master-networkpolicy.yaml rendered with incorrect indentation, producing invalid YAML.
- pluggable-storage-service-deployment.yaml invalid YAML due to incorrect pod annotation indentation
-
Pod annotations in pluggable-storage-service-deployment.yaml rendered with incorrect indentation, producing invalid YAML.
- Restart from stage option missing when viewed from a different HA replica
-
The Restart from stage build action was not loaded correctly on HA replicas after a Declarative Pipeline build completed. This caused the Restart from stage button to appear missing when viewing the build from a different replica than the replica it ran on.
- Removed Tomcat-specific code that counted HTTP worker threads
-
Support for running CloudBees CI in Tomcat ended in October 2025. A support bundle component that counted HTTP worker threads was still checking for Tomcat configurations and has been simplified to check for Jetty only.