Issue
- After upgrading CloudBees Core to release 2.190.2.2, Kubernetes planned agents are stuck in "Pending" and builds hang indefinitely with the message "Waiting for next available executor" while they wait for the planned agents to come online.
- In such cases, the agent pods are not even scheduled.
Environment
- CloudBees CI (CloudBees Core) 2.190.2.2
- CloudBees CI (CloudBees Core) on Modern Cloud Platforms - Managed controller 2.190.2.2
- CloudBees CI (CloudBees Core) on Modern Cloud Platforms - Operations Center 2.190.2.2
- CloudBees CI (CloudBees Core) on Traditional Platforms - Client controller 2.190.2.2
- CloudBees CI (CloudBees Core) on Traditional Platforms - Operations Center 2.190.2.2
- Kubernetes Plugin versions from 1.19.1 up to, but not including, 1.21.2
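The affected plugin range above can be checked with a small script. This is only a sketch: the function names `parse_version` and `is_affected` are hypothetical, and it assumes the plugin version is a purely numeric dotted string (no suffixes or build qualifiers).

```python
# Sketch: decide whether a Kubernetes plugin version falls in the affected range.
# Assumption: versions are plain dotted integers such as "1.20.3".

def parse_version(text):
    """Split a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))

FIRST_AFFECTED = parse_version("1.19.1")  # regression introduced (JENKINS-56307)
FIRST_FIXED = parse_version("1.21.2")     # fix released (JENKINS-60055)

def is_affected(plugin_version):
    """Return True if the version is >= 1.19.1 and strictly below 1.21.2."""
    return FIRST_AFFECTED <= parse_version(plugin_version) < FIRST_FIXED

if __name__ == "__main__":
    for version in ("1.19.0", "1.19.1", "1.21.1", "1.21.2"):
        print(version, "affected" if is_affected(version) else "not affected")
```

Tuple comparison is used here so that, for example, 1.20 sorts above 1.19.1 without any string tricks.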
Related Issue(s)
- JENKINS-56307 (regression)
- JENKINS-60055 (fix)
Explanation
This is a bug introduced by JENKINS-56307 in version 1.19.1 of the kubernetes plugin.
This version introduces a new Node Provisioner strategy, NoDelayProvisionerStrategy, that is enabled by default. The strategy provisions a node as soon as the Node Provisioner detects a need for more agents, as opposed to the default strategy, which makes its decision based on load estimates.
There is a bug in the implementation that causes agents to never be provisioned and builds to hang.
Resolution
This issue has been fixed in version 1.21.2 of the kubernetes plugin.
Solution
The solution is to upgrade CloudBees CI to version 2.190.3.2 or later.
Workaround
The workaround is to disable the NoDelayProvisionerStrategy. This can be done by adding the system property -Dio.jenkins.plugins.kubernetes.disableNoDelayProvisioning=true to the controller’s startup arguments. The controller must be restarted for the change to take effect. See How to add Java arguments to Jenkins to understand how to do this.
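As an illustration of the workaround, the sketch below adds the system property to an existing Java options string without duplicating it. The helper name `with_workaround` is hypothetical, and how the resulting string reaches the controller's JVM depends on the installation, so treat this as a way to compose the argument rather than as the procedure itself.

```python
# Sketch: make sure the workaround flag appears exactly once in a Java options string.
# Assumption: the options are a shell-style, whitespace-separated string.
import shlex

PROPERTY = "-Dio.jenkins.plugins.kubernetes.disableNoDelayProvisioning"
FLAG = PROPERTY + "=true"

def with_workaround(java_opts):
    """Return java_opts with any earlier setting of the property replaced by =true."""
    args = shlex.split(java_opts)
    # Drop previous occurrences of the property, whatever value they carried.
    kept = [a for a in args if a != PROPERTY and not a.startswith(PROPERTY + "=")]
    kept.append(FLAG)
    return " ".join(shlex.quote(a) for a in kept)

if __name__ == "__main__":
    print(with_workaround("-Xmx2g"))
```

Applying the helper twice yields the same string, so it is safe to run against options that already contain the flag.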