Issue
After upgrading to version 2.277.3.1 or 2.277.4.3, you may have trouble accessing the web UI of your controller. In the slow-requests directory of the support bundle, you will find a stack trace similar to:
"Handling GET / from IP_REDACTED: Jetty (winstone)-95 View/index.jelly View/sidepanel.jelly" Id=95 BLOCKED on java.util.concurrent.ConcurrentHashMap$Node@71db1029 owned by "Handling GET / from IP_REDACTED: Jetty (winstone)-19 View/index.jelly View/sidepanel.jelly" Id=19
    at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1868)
    - blocked on java.util.concurrent.ConcurrentHashMap$Node@71db1029
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2344)
    at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2327)
    at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
    at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
    at nectar.plugins.rbac.groups.GroupContainerACL.hasPermission(GroupContainerACL.java:139)
    ...
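If the support bundle contains many slow-request records, a small script can help confirm whether any of them show this signature. The following is a minimal sketch, not an official tool: it assumes the bundle has already been extracted, that the records are plain text files, and that a match means both markers from the trace above appear in the same file. The function names and the default directory name are hypothetical.

```python
import sys
from pathlib import Path
from typing import List

# Markers copied from the stack trace shown above.
RBAC_FRAME = "nectar.plugins.rbac.groups.GroupContainerACL.hasPermission"
BLOCKED_MARKER = "BLOCKED on java.util.concurrent.ConcurrentHashMap$Node"


def matches_rbac_cache_block(text: str) -> bool:
    """Return True if the text contains both markers of this issue."""
    return RBAC_FRAME in text and BLOCKED_MARKER in text


def find_matching_records(slow_requests_dir: str) -> List[str]:
    """Return the paths of files under the directory that match the signature."""
    hits = []
    for path in sorted(Path(slow_requests_dir).rglob("*")):
        if path.is_file():
            # Replace undecodable bytes instead of failing on odd encodings.
            text = path.read_text(encoding="utf-8", errors="replace")
            if matches_rbac_cache_block(text):
                hits.append(str(path))
    return hits


if __name__ == "__main__":
    # Hypothetical default; pass the real path of the extracted directory.
    directory = sys.argv[1] if len(sys.argv) > 1 else "slow-requests"
    for hit in find_matching_records(directory):
        print(hit)
```

A match only indicates that a request was blocked in the RBAC permission cache at the time the record was written. It does not by itself prove that this bug is the cause of the unresponsive UI.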
Environment
- CloudBees CI (CloudBees Core) on modern cloud platforms - Managed controller
- CloudBees CI (CloudBees Core) on modern cloud platforms - Operations Center
- CloudBees CI (CloudBees Core) on traditional platforms - Client controller
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
Resolution
This is caused by a bug which was fixed in 2.289.1.2:
RBAC caching issue (BEE-3553): An RBAC function was occasionally causing livelock issues, where a request for access is repeatedly denied until the system stops responding. This caching issue has been resolved. The RBAC function no longer causes the system to stop responding.
Workaround
You can work around this bug without upgrading by adding the following startup argument:
-Dnectar.plugins.rbac.strategy.RoleMatrixAuthorizationPlugin.cacheSize=0
To add this argument, follow: How to add Java arguments to Jenkins?
Disabling this cache may have a small performance impact, so be sure to remove this startup argument after you upgrade to a release that includes the fix (2.289.1.2 or later).
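When deciding whether the workaround argument can be removed, the installed version has to be compared against the release that contains the fix. The following is a minimal sketch of that comparison, assuming version strings are purely numeric and dot-separated (as in 2.277.4.3). The helper names are hypothetical and not part of any CloudBees tooling.

```python
from typing import Tuple

# Release named in the Resolution section as containing the fix.
FIXED_IN = (2, 289, 1, 2)


def parse_version(version: str) -> Tuple[int, ...]:
    """Convert a dotted numeric version such as '2.289.1.2' to a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def has_fix(version: str) -> bool:
    """Return True if the version is at or after the fixed release."""
    # Tuples compare element by element, so 2.289.1.2 sorts after 2.277.4.3.
    return parse_version(version) >= FIXED_IN


if __name__ == "__main__":
    for candidate in ("2.277.3.1", "2.277.4.3", "2.289.1.2"):
        status = "workaround can be removed" if has_fix(candidate) else "keep workaround"
        print(candidate, status)
```

Comparing integer tuples rather than raw strings avoids ordering mistakes when a component has a different number of digits.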