This article references an issue that affects a product version that is no longer supported. Please verify the version listed in the article applies to your situation. If unsure, please submit a support ticket at: https://support.cloudbees.com/.
Issue
- The Jenkins server hangs unexpectedly and only recovers after a restart.
- A stack trace similar to the ones below is observed when the issue occurs.
Thread updating group members:
"Handling POST /groups/Jenkins-users/addMember/api/json from 10.15.132.73 : Jetty (winstone)-37013694" id=37013694 (0x234c8be) state=WAITING cpu=73%
    - waiting on <0x595f0447> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    - locked <0x595f0447> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at nectar.plugins.rbac.groups.Group.doAddMember(Group.java:1710)
Many threads blocked on:
    - waiting on <0x595f0447> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    - locked <0x595f0447> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at sun.misc.Unsafe.park(Native Method)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
    at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
    at nectar.plugins.rbac.groups.Group.isMatch(Group.java:510)
Environment
- CloudBees CI (CloudBees Core) on modern cloud platforms - Managed controller
- CloudBees CI (CloudBees Core) on modern cloud platforms - Operations Center
- CloudBees CI (CloudBees Core) on traditional platforms - Client controller
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
Resolution
Concurrent modifications to RBAC groups occasionally caused Jenkins to become unresponsive. The cause was a deadlock resulting from the reentrant locking strategy used in RBAC groups: in the thread dump above, the thread in Group.doAddMember waits for the write lock while many threads in Group.isMatch wait for the read lock on the same ReentrantReadWriteLock.
The deadlock has been resolved, and concurrent modifications to RBAC groups no longer cause Jenkins to become unresponsive. The fix is available in version 2.303.1.5 and later.
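This article does not describe the exact code path inside the plugin, so the following is only a minimal sketch of one well-known way a reentrant read/write locking strategy can block indefinitely: java.util.concurrent.locks.ReentrantReadWriteLock does not support upgrading a held read lock to the write lock. The class and method names (LockUpgradeDemo, upgradeWhileHoldingRead, releaseReadThenWrite) are hypothetical and are not taken from the RBAC plugin. A timed tryLock is used so that the demo returns instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: not the RBAC plugin's code.
public class LockUpgradeDemo {

    /** Asks for the write lock while this thread still holds the read lock. */
    public static boolean upgradeWhileHoldingRead(ReentrantReadWriteLock lock) {
        lock.readLock().lock();
        try {
            // A plain writeLock().lock() here would park forever, because the
            // write lock cannot be granted while any read lock is held, even
            // by the requesting thread. The timed variant gives up instead.
            return tryWrite(lock);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Releases the read lock first, then asks for the write lock. */
    public static boolean releaseReadThenWrite(ReentrantReadWriteLock lock) {
        lock.readLock().lock();
        lock.readLock().unlock();
        return tryWrite(lock);
    }

    // Tries the write lock for a short time and releases it again on success.
    private static boolean tryWrite(ReentrantReadWriteLock lock) {
        try {
            boolean acquired = lock.writeLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        // The default constructor creates a non-fair lock, as seen in the trace.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        System.out.println("upgrade while holding read lock: " + upgradeWhileHoldingRead(lock));
        System.out.println("release read lock, then write: " + releaseReadThenWrite(lock));
    }
}
```

In this sketch the first call returns false and the second returns true. With an untimed lock() call, the first case would leave the thread parked in WriteLock.lock, and later readers could then queue behind it, which resembles the pattern in the thread dump.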
Workaround
If you are unable to upgrade your instance, the fix has also been backported to version 5.51.1 of the CloudBees Role-Based Access Control Plugin.