Gradle Plugin Build Scan Injection causes Performance Issues

Issue

  • After upgrading to CloudBees CI 2.346.2.3, the Jenkins queue eventually stops scheduling tasks

  • After upgrading to CloudBees CI 2.346.2.3, Jenkins eventually becomes unresponsive

  • Thread dumps show a thread holding the queue lock (Queue._withLock) and spending time under hudson.plugins.gradle.injection.BuildScanInjectionListener while other threads are waiting for that lock:

      "Computer.threadPoolForRemoting [#2290] / waiting for JNLP4-connect connection from 127.0.0.1/127.0.0.1:45678 id=851584" id=27728 (0x6c50) state=TIMED_WAITING cpu=84%
        - waiting on <0x1ed30886> (a hudson.remoting.UserRequest)
        - locked <0x1ed30886> (a hudson.remoting.UserRequest)
        at java.base@11.0.16/java.lang.Object.wait(Native Method)
        at hudson.remoting.Request.call(Request.java:177)
        at hudson.remoting.Channel.call(Channel.java:999)
        at hudson.FilePath.act(FilePath.java:1285)
        at hudson.plugins.gradle.injection.MavenBuildScanInjection.removeMavenExtension(MavenBuildScanInjection.java:76)
        at hudson.plugins.gradle.injection.MavenBuildScanInjection.inject(MavenBuildScanInjection.java:49)
        at hudson.plugins.gradle.injection.BuildScanInjectionListener.lambda$inject$0(BuildScanInjectionListener.java:57)
        at hudson.plugins.gradle.injection.BuildScanInjectionListener$$Lambda$611/0x00000008412d6c40.accept(Unknown Source)
        at java.base@11.0.16/java.util.Arrays$ArrayList.forEach(Arrays.java:4390)
        at hudson.plugins.gradle.injection.BuildScanInjectionListener.inject(BuildScanInjectionListener.java:57)
        at hudson.plugins.gradle.injection.BuildScanInjectionListener.onConfigurationChange(BuildScanInjectionListener.java:49)
        at hudson.model.AbstractCIBase$$Lambda$609/0x00000008412d7440.accept(Unknown Source)
        at jenkins.util.Listeners.lambda$notify$0(Listeners.java:59)
        at jenkins.util.Listeners$$Lambda$610/0x00000008412d6840.run(Unknown Source)
        at jenkins.util.Listeners.notify(Listeners.java:70)
        at hudson.model.AbstractCIBase.updateComputerList(AbstractCIBase.java:277)
        at jenkins.model.Jenkins.updateComputerList(Jenkins.java:1670)
        at jenkins.model.Nodes$5.run(Nodes.java:279)
        at hudson.model.Queue._withLock(Queue.java:1395)
        at hudson.model.Queue.withLock(Queue.java:1269)
        at jenkins.model.Nodes.removeNode(Nodes.java:270)
        at jenkins.model.Jenkins.removeNode(Jenkins.java:2215)
        at hudson.slaves.AbstractCloudSlave.terminate(AbstractCloudSlave.java:91)
        at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$1$1.run(OnceRetentionStrategy.java:128)
        at hudson.model.Queue._withLock(Queue.java:1395)
        at hudson.model.Queue.withLock(Queue.java:1269)
        at org.jenkinsci.plugins.durabletask.executors.OnceRetentionStrategy$1.run(OnceRetentionStrategy.java:123)
      [...]
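
To check a collected thread dump for this pattern, a small script can help. The following is a minimal sketch in Python, not a CloudBees tool: it assumes the dump is plain text in the layout shown above (a quoted thread name on the header line, followed by "at ..." frames), and the function names are hypothetical. It reports threads whose stack contains both the Queue lock frame and the injection listener frame. Threads that are merely waiting for the lock are not reported because they lack the listener frame.

```python
QUEUE_LOCK_FRAME = "hudson.model.Queue._withLock"
INJECTION_FRAME = "hudson.plugins.gradle.injection.BuildScanInjectionListener"


def _holds_lock_during_injection(frames):
    """True if the stack contains both the Queue lock and the injection listener."""
    return (any(QUEUE_LOCK_FRAME in frame for frame in frames)
            and any(INJECTION_FRAME in frame for frame in frames))


def find_injection_lock_holders(dump_text):
    """Return the names of threads that run build scan injection under the Queue lock."""
    holders = []
    name, frames = None, []
    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('"'):
            # A new thread header: evaluate the previous thread first.
            if name is not None and _holds_lock_during_injection(frames):
                holders.append(name)
            name = stripped.split('"')[1]
            frames = []
        elif stripped.startswith("at "):
            frames.append(stripped[3:])
    # Evaluate the last thread in the dump.
    if name is not None and _holds_lock_during_injection(frames):
        holders.append(name)
    return holders
```

Calling find_injection_lock_holders(open("threaddump.txt").read()) on a saved dump (the file name is only an example) returns a list of thread names. An empty list means the pattern was not found in that dump.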

Explanation

Gradle Plugin version 1.39 introduced a feature that automatically injects Gradle / Maven build scans on agents (see Injecting Build Scans). The injection is triggered every time an agent comes online or the node configuration changes, which includes the removal of a node. This adds overhead to the creation and deletion of agents, and because those events are processed while holding the Queue lock, it can cause serious contention on that lock. The feature is known to have a very detrimental impact in environments that use ephemeral node provisioning.

In version 1.39.4, the feature was disabled by default.

Resolution

The recommended solution is to upgrade to CloudBees CI 2.346.3.4, which includes Gradle Plugin 1.39.4 or later.

Workaround

Upgrade the Gradle plugin to version 1.39.4 or later.

If running Gradle Plugin 1.39.4 or later, make sure that the injection is NOT enabled. That is, make sure that the environment variable JENKINSGRADLEPLUGIN_GRADLE_ENTERPRISE_INJECTION is NOT set.
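
The version and variable checks above can be expressed as a short decision rule. The following Python sketch is illustrative only. It assumes plugin versions are purely numeric dotted strings, and it assumes the caller supplies the relevant environment as a mapping (where the variable is defined in a given installation is not covered here). The function names are hypothetical.

```python
INJECTION_ENV_VAR = "JENKINSGRADLEPLUGIN_GRADLE_ENTERPRISE_INJECTION"


def parse_version(version):
    """Turn a dotted numeric version such as '1.39.4' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def injection_may_be_active(plugin_version, env):
    """Apply the rules from this article to a Gradle Plugin version and an environment mapping."""
    version = parse_version(plugin_version)
    if version < (1, 39):
        # The injection feature was introduced in 1.39.
        return False
    if version < (1, 39, 4):
        # Before 1.39.4 the feature was not disabled by default.
        return True
    # From 1.39.4 on, the feature is off unless the variable is set.
    return INJECTION_ENV_VAR in env
```

For example, injection_may_be_active("1.39.4", {}) returns False, while the same version with the variable present in the mapping returns True.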