Issue
Pipeline builds intermittently stop working. When you try to trigger a build, log entries like the following appear:
XXXXXX [id=12594374] WARNING j.model.lazy.LazyBuildMixIn#newBuild: A new build could not be created in job XXXXXX
java.lang.IllegalStateException: JENKINS-23152: /var/lib/jenkins/jobs/DXXXXXX/X already existed; will not overwrite with XXXXXX #X
    at hudson.model.RunMap.put(RunMap.java:189)
    at jenkins.model.lazy.LazyBuildMixIn.newBuild(LazyBuildMixIn.java:182)
    at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.createExecutable(ParameterizedJobMixIn.java:510)
    at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.createExecutable(ParameterizedJobMixIn.java:320)
    at hudson.model.Executor$1.call(Executor.java:365)
    at hudson.model.Executor$1.call(Executor.java:347)
    at hudson.model.Queue._withLock(Queue.java:1457)
    at hudson.model.Queue.withLock(Queue.java:1318)
    at hudson.model.Executor.run(Executor.java:347)
XXXXXX [id=12594374] SEVERE hudson.model.Executor#run: Executor #-1 for controller: Unexpected executor death
java.lang.IllegalStateException: JENKINS-23152: /var/lib/jenkins/jobs/DXXXXXX/X already existed; will not overwrite with XXXXXX #X
    at hudson.model.RunMap.put(RunMap.java:189)
    at jenkins.model.lazy.LazyBuildMixIn.newBuild(LazyBuildMixIn.java:182)
Caused: java.lang.Error
    at jenkins.model.lazy.LazyBuildMixIn.newBuild(LazyBuildMixIn.java:190)
    at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.createExecutable(ParameterizedJobMixIn.java:510)
    at jenkins.model.ParameterizedJobMixIn$ParameterizedJob.createExecutable(ParameterizedJobMixIn.java:320)
    at hudson.model.Executor$1.call(Executor.java:365)
    at hudson.model.Executor$1.call(Executor.java:347)
    at hudson.model.Queue._withLock(Queue.java:1457)
    at hudson.model.Queue.withLock(Queue.java:1318)
    at hudson.model.Executor.run(Executor.java:347)
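To confirm that a controller log contains this symptom, the log can be searched for the JENKINS-23152 message. The following is a minimal sketch, not an official tool: the function name find_build_collisions is hypothetical, and the pattern assumes the build directory path contains no spaces.

```python
import re

# Pattern derived from the log excerpt above. Assumption: the path has no spaces.
SIGNATURE = re.compile(r"JENKINS-23152: (?P<path>\S+) already existed")


def find_build_collisions(log_text):
    """Return the build directories named in JENKINS-23152 messages, first seen first."""
    seen = []
    for match in SIGNATURE.finditer(log_text):
        path = match.group("path")
        if path not in seen:  # the same build is usually reported more than once
            seen.append(path)
    return seen


if __name__ == "__main__":
    # Line copied from the redacted excerpt above.
    sample = (
        "java.lang.IllegalStateException: JENKINS-23152: "
        "/var/lib/jenkins/jobs/DXXXXXX/X already existed; "
        "will not overwrite with XXXXXX #X\n"
    )
    print(find_build_collisions(sample))
```

An empty result only means that the message was not found in the text supplied; it does not rule out the memory leak described in the Resolution section.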
Resolution
Under some circumstances, Pub-sub light, which its documentation describes as "a light-weight Publish-Subscribe (async) event notification module for Jenkins", can run into problems when one of the subscribing or publishing items is no longer in the queue. This can cause a memory leak that blocks pipeline builds and makes the instance unstable.
In a heap dump taken on the instance while the issue is occurring, you should see a large region of the heap occupied by org.jenkinsci.plugins.pubsub.Message objects.
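If a full heap dump is impractical, a class histogram such as the one printed by the JDK's jmap -histo command may give a quicker first indication. Note that this is a substitute for the heap dump analysis described above, not the same procedure. The sketch below assumes the usual histogram row layout (rank, instance count, byte count, class name); the function name class_usage is hypothetical.

```python
import re

# Assumed row layout: "   1:   <instances>   <bytes>   <class name> [(module)]"
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")

LEAK_CLASS = "org.jenkinsci.plugins.pubsub.Message"


def class_usage(histogram_text, class_name=LEAK_CLASS):
    """Return (instances, bytes) for class_name, or None if it is not listed."""
    for line in histogram_text.splitlines():
        row = ROW.match(line)  # header and footer lines do not match and are skipped
        if row and row.group(3) == class_name:
            return int(row.group(1)), int(row.group(2))
    return None
```

A typical use would be to save the histogram to a file, read it as text, and pass it to class_usage. What counts as "large" depends on the heap size of the instance, so compare the byte count against the other top entries rather than against a fixed threshold.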
The fix for this potential issue is included in Pub-sub light 1.16.
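When auditing several instances, it can help to compare each installed plugin version against the fixed release. The following is a minimal sketch that assumes purely numeric dotted version strings such as 1.14; the helper names parse_version and needs_upgrade are hypothetical, and how the installed version is obtained is left to the reader.

```python
FIXED_VERSION = "1.16"  # release that contains the fix, per this article


def parse_version(version):
    """Turn '1.16' into (1, 16). Assumption: numeric dotted versions only."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(installed, fixed=FIXED_VERSION):
    """Return True when the installed version is older than the fixed one."""
    return parse_version(installed) < parse_version(fixed)


if __name__ == "__main__":
    # 1.14 is the affected version listed under "Tested product/plugin versions".
    print(needs_upgrade("1.14"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating 1.9 as newer than 1.16.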
Workaround
A temporary workaround is to restart the instance. The long-term fix requires upgrading the plugin to version 1.16, as mentioned above.
Tested product/plugin versions
- Pub-sub light 1.14
- Jenkins LTS 2.263.2