Issue
Jenkins jobs sit in the build queue and never start, even though build agents are available for the chosen 'label', and the logs contain stack traces similar to the one shown below:
SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXXX failed
How do we know what is causing this queue freeze?
Resolution
Queue.MaintainTask is a periodic task that runs on the Jenkins controller and is responsible for queue maintenance operations such as adding elements to the queue and assigning queued elements to nodes or executors. If this task fails for any reason, the queue becomes unresponsive and builds eventually stop running, as they remain stuck in the queue.
To determine the cause of the problem, pay special attention to the full stack trace of the error that appears in the logs.
The sections below list some known causes. The intent of this list is to show the pattern you can follow to determine the root cause of the failure, so you can recover the instance as quickly as possible. Whenever possible, Workaround or Solution details are also provided.
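Before looking at the individual causes, it can help to confirm from the Script Console that items really are stuck and to trigger a maintenance cycle so the failing exception surfaces again in the logs. The following is a minimal diagnostic sketch (Groovy Script Console, Java-compatible syntax); it only reads and nudges the queue, and it assumes administrator access:
import hudson.model.Queue
import jenkins.model.Jenkins

Queue queue = Jenkins.get().getQueue()

// List every item currently sitting in the queue together with the reason it is not running yet.
for (Queue.Item item : queue.getItems()) {
    System.out.println(item.task.getFullDisplayName() + " -> " + item.getWhy())
}

// Ask the queue to run a maintenance cycle; because MaintainTask also runs every few seconds,
// a failing maintenance cycle will keep logging the same SEVERE entry and full stack trace.
queue.scheduleMaintenance()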
A CloudBees Nodes Plus plugin
SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from XXXX/XXX:XXX
at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
at hudson.remoting.Channel.call(Channel.java:957)
at hudson.Launcher$RemoteLauncher.launch(Launcher.java:1059)
at hudson.Launcher$ProcStarter.start(Launcher.java:455)
at com.cloudbees.jenkins.plugins.nodesplus.CustomNodeProbeBuildFilterProperty.getProbeResult(CustomNodeProbeBuildFilterProperty.java:180)
In the stack trace we can clearly see the correlation between the CustomNodeProbeBuildFilterProperty.getProbeResult method of the CloudBees Nodes Plus plugin and the task failure.
A.1 Solution/Workaround
As an initial remediation step, check your nodes for any custom node probe and disable it; the sketch after this list shows one way to locate the affected nodes.
Verify that you are using cloudbees-nodes-plus 1.18 or higher, as this version added extra verification that prevents a faulty custom probe from locking the queue.
If the issue persists after upgrading the plugin, disable any custom node probes and open a support ticket.
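On instances with many agents, the following Script Console sketch can help locate the nodes that carry the custom probe property. It is illustrative only: the property class name is taken from the stack trace above and loaded reflectively, and disabling or removing the probe should still be done through the node configuration UI.
import hudson.model.Node
import jenkins.model.Jenkins

// Property class name taken from the stack trace above; loaded reflectively so the
// script still compiles even if the plugin classes are not directly importable.
Class probeClass = Class.forName("com.cloudbees.jenkins.plugins.nodesplus.CustomNodeProbeBuildFilterProperty")

for (Node node : Jenkins.get().getNodes()) {
    for (Object property : node.getNodeProperties()) {
        if (probeClass.isInstance(property)) {
            System.out.println("Custom node probe configured on: " + node.getNodeName())
        }
    }
}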
B OpenText Application Automation Tools plugin
SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.lang.NoClassDefFoundError: Could not initialize class org.apache.logging.log4j.core.impl.Log4jLogEvent
at org.apache.logging.log4j.core.impl.DefaultLogEventFactory.createEvent(DefaultLogEventFactory.java:54)
at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:401)
at org.apache.logging.log4j.core.config.DefaultReliabilityStrategy.log(DefaultReliabilityStrategy.java:49)
at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:146)
at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2116)
at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2100)
at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:1994)
at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1966)
at org.apache.logging.log4j.spi.AbstractLogger.error(AbstractLogger.java:739)
at com.microfocus.application.automation.tools.octane.events.WorkflowListenerOctaneImpl.onNewHead(WorkflowListenerOctaneImpl.java:79)
In this case the exception thrown is different, but the effect is the same: the periodic task starts failing.
B.1 Solution/Workaround
For this plugin, the recommended workaround is to upgrade to at least version 5.6.2, as previous versions of the plugin are also impacted by JENKINS-6070.
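To confirm which version of a plugin is installed before and after an upgrade, a quick Script Console check such as the sketch below can be used. The plugin ID is a placeholder: substitute the short name shown for the plugin in the plugin manager.
import hudson.PluginWrapper
import jenkins.model.Jenkins

// Replace "your-plugin-id" with the plugin's short name as shown in the plugin manager.
PluginWrapper plugin = Jenkins.get().getPluginManager().getPlugin("your-plugin-id")
if (plugin != null) {
    System.out.println(plugin.getShortName() + " " + plugin.getVersion() + (plugin.isEnabled() ? " (enabled)" : " (disabled)"))
} else {
    System.out.println("Plugin not installed")
}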
C Build Blocker plugin
SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*.XXX.*XXX
^
at java.util.regex.Pattern.error(Pattern.java:1957)
at java.util.regex.Pattern.sequence(Pattern.java:2125)
at java.util.regex.Pattern.expr(Pattern.java:1998)
at java.util.regex.Pattern.compile(Pattern.java:1698)
at java.util.regex.Pattern.<init>(Pattern.java:1351)
at java.util.regex.Pattern.compile(Pattern.java:1028)
at java.util.regex.Pattern.matches(Pattern.java:1133)
at java.lang.String.matches(String.java:2121)
at hudson.plugins.buildblocker.BlockingJobsMonitor.checkForPlannedBuilds(BlockingJobsMonitor.java:162)
at hudson.plugins.buildblocker.BlockingJobsMonitor.checkForQueueEntries(BlockingJobsMonitor.java:86)
at hudson.plugins.buildblocker.BuildBlockerQueueTaskDispatcher.checkAccordingToProperties(BuildBlockerQueueTaskDispatcher.java:151)
Again, the stack trace allows us to determine the source of the problem affecting the queue: hudson.plugins.buildblocker.BuildBlockerQueueTaskDispatcher is the important stack frame that identifies the plugin.
C.1 Solution/Workaround
If you find yourself impacted by this issue, the remediation step is to fix the invalid regular expression on the configuration page of the job referenced in the stack trace (see the example below). Alternatively, you can try replacing this plugin with the Lockable Resources plugin.
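The pattern in the error above starts with *, which is invalid because a quantifier must follow something it can repeat. The sketch below reproduces the failure and shows a corrected expression; the job names used are purely illustrative.
import java.util.regex.Pattern
import java.util.regex.PatternSyntaxException

try {
    // A leading '*' has nothing to repeat, so compilation fails with
    // "Dangling meta character '*' near index 0", as seen in the log above.
    Pattern.compile("*.deploy.*prod")
} catch (PatternSyntaxException e) {
    System.out.println(e.getMessage())
}

// Starting the expression with '.*' gives the quantifier something to repeat.
Pattern valid = Pattern.compile(".*deploy.*prod.*")
System.out.println(valid.matcher("team-deploy-to-prod").matches())   // prints true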
D Pipeline Graph Analysis plugin
SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.lang.IndexOutOfBoundsException: Index: 0
at java.util.Collections$EmptyList.get(Collections.java:4454)
at org.jenkinsci.plugins.workflow.graph.StandardGraphLookupView.bruteForceScanForEnclosingBlock(StandardGraphLookupView.java:150)
at org.jenkinsci.plugins.workflow.graph.StandardGraphLookupView.findEnclosingBlockStart(StandardGraphLookupView.java:197)
at org.jenkinsci.plugins.workflow.graph.StandardGraphLookupView.findAllEnclosingBlockStarts(StandardGraphLookupView.java:217)
D.1 Solution/Workaround
The solution for this error is to upgrade the workflow-api plugin to version 2.35 or higher, which contains the fix for the edge case that triggers this issue.
E Block Queued Job plugin
SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.lang.NullPointerException
at org.jenkinsci.plugins.blockqueuedjob.condition.JobResultBlockQueueCondition.isBlocked(JobResultBlockQueueCondition.java:70)
at org.jenkinsci.plugins.blockqueuedjob.BlockItemQueueTaskDispatcher.canRun(BlockItemQueueTaskDispatcher.java:35)
at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
E.1 Solution/Workaround
The solution for this error is to disable the plugin. At the time of writing, the last release of the plugin was in 2016 and it does not have many installations, so if you can disable it, that is the most direct way to resolve the problem (see the sketch below).
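If you prefer to disable the plugin from the Script Console rather than through the plugin manager UI, a sketch along these lines can be used. The plugin ID shown is an assumption: verify the actual short name in the plugin manager before running it, and note that a restart is required for the change to take effect.
import hudson.PluginWrapper
import jenkins.model.Jenkins

// The plugin ID below is an assumption; confirm the short name in the plugin manager first.
PluginWrapper plugin = Jenkins.get().getPluginManager().getPlugin("block-queued-job")
if (plugin != null && plugin.isEnabled()) {
    plugin.disable()   // takes effect after the controller is restarted
    System.out.println("Disabled " + plugin.getShortName())
} else {
    System.out.println("Plugin not found or already disabled")
}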
F Kubernetes plugin
The queue is blocked, and no builds are being processed. Shortly after, the instance goes down. After capturing a thread dump for the instance, you get a stack trace similar to the one shown below:
java.lang.Object.wait(Native Method)
hudson.remoting.Request.call(Request.java:177)
hudson.remoting.Channel.call(Channel.java:954)
org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave._terminate(KubernetesSlave.java:236)
hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:67)
hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:59)
hudson.slaves.CloudRetentionStrategy.check(CloudRetentionStrategy.java:43)
hudson.slaves.SlaveComputer$4.run(SlaveComputer.java:843)
hudson.model.Queue._withLock(Queue.java:1380)
The queue is locked due to the KubernetesSlave._terminate() call.
F.1 Solution/Workaround
This is a known issue that was reported in JENKINS-54988, and it is due to a problem in the Kubernetes plugin.
The fix for this issue was released in Kubernetes plugin 1.21.1.
G High Availability and Horizontal Scalability
2025-06-10 12:35:00.034+0000 [id=51] SEVERE hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@4118451d failed
com.hazelcast.core.HazelcastInstanceNotActiveException: HazelcastInstance[[IP]:5701] is not active!
at com.hazelcast.spi.impl.proxyservice.impl.ProxyRegistry.getService(ProxyRegistry.java:91)
...
at com.hazelcast.instance.impl.HazelcastInstanceImpl.getDistributedObject(HazelcastInstanceImpl.java:364)
at com.hazelcast.instance.impl.HazelcastInstanceImpl.getMap(HazelcastInstanceImpl.java:183)
at com.hazelcast.instance.impl.HazelcastInstanceProxy.getMap(HazelcastInstanceProxy.java:96)
at com.cloudbees.jenkins.plugins.replication.builds.Adoption.owners(Adoption.java:97)
at com.cloudbees.jenkins.plugins.replication.builds.Adoption$HAListener.allowLoad(Adoption.java:261)
at hudson.model.RunMap.allowLoad(RunMap.java:258)
at jenkins.model.lazy.AbstractLazyLoadRunMap.getByNumber(AbstractLazyLoadRunMap.java:578)
at hudson.model.RunMap.getById(RunMap.java:237)
at hudson.model.RunMap.getById(RunMap.java:65)
at jenkins.model.lazy.BuildReferenceMapAdapter.unwrap(BuildReferenceMapAdapter.java:40)
at jenkins.model.lazy.BuildReferenceMapAdapter$CollectionAdapter$1.adapt(BuildReferenceMapAdapter.java:187)
at jenkins.model.lazy.BuildReferenceMapAdapter$CollectionAdapter$1.adapt(BuildReferenceMapAdapter.java:184)
...
at jenkins.model.lazy.LazyBuildMixIn.getEstimatedDurationCandidates(LazyBuildMixIn.java:279)
at hudson.model.AbstractProject.getEstimatedDurationCandidates(AbstractProject.java:961)
at hudson.model.Job.getEstimatedDuration(Job.java:1064)
...
at hudson.model.Queue.maintain(Queue.java:1665)
at hudson.model.Queue$MaintainTask.doRun(Queue.java:2919)
at hudson.triggers.SafeTimerTask.run(SafeTimerTask.java:92)
...
Hazelcast is the technology used by High Availability and Horizontal Scalability (HA/HS). If it shows up in the Queue$MaintainTask stack trace, the root cause is related to CloudBees HA.
G.1 Solution/Workaround
Restarting the affected replica should reset the Hazelcast state. Ensure you are running the latest release to benefit from the latest reliability improvements.
On modern platforms, ensure that HA managed controllers have "Use new health check" enabled. This flag was introduced in release 2.504.1.6 and includes the Hazelcast state in the health check, so if a replica has a problem with Hazelcast it is restarted by the Kubernetes liveness probe.