SCM Polling Threads hung with Perforce

Article ID:217104428

Issue

  • Perforce SCM polling hangs (and I need to restart Jenkins)

  • Should I perform Perforce Polling on agent or controller?

Environment

  • CloudBees Jenkins Enterprise

  • Jenkins

Resolution

Perforce SCM polling requires the execution of a CLI tool and is demanding in terms of I/O. When setting up instances with lots of jobs performing SCM polling, it is important to understand its limitations. The purpose of this article is to provide more knowledge about Perforce polling in a Jenkins environment, as well as a few tips to overcome overload issues.

Perforce Plugins

  • The Perforce plugin spawns a process running the CLI tool. Its code is heavy and error-prone.

  • The P4 Plugin is known to have a few I/O and deadlock issues, but it is well maintained and is therefore preferred to the Perforce plugin.

Polling on Agents

Polling is done by dedicated threads and doesn’t consume executors on the agent. However, our licensing counts all executors of every agent that is up as used. Thus, if you keep an agent up for polling, all of its executors count toward the license usage.

Polling on controller

Our recommendation for large Jenkins instances with many Perforce-dependent jobs is to not perform polling on the controller. If polling does run on the controller, it is critical to limit the number of threads doing the polling; otherwise the controller can use all its resources for polling. This can be done by setting the Max SCM Polling Threads limit in Manage Jenkins/Configure System (this option appears when more than 10 jobs have been configured with SCM polling).
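As a rough sketch, the same limit can probably also be inspected and set from the Script Console instead of the UI. This assumes that the descriptor of the core SCM trigger exposes getPollingThreadCount() and setPollingThreadCount(int), which may differ between Jenkins versions; the value 20 is only an illustration.

```groovy
import jenkins.model.Jenkins
import hudson.triggers.SCMTrigger

// Assumption: the SCM trigger descriptor holds the "Max SCM Polling Threads" setting.
def descriptor = Jenkins.instance.getDescriptorByType(SCMTrigger.DescriptorImpl)

println "Current polling thread limit: ${descriptor.getPollingThreadCount()}"

// Illustrative value only; pick a limit that suits the controller's resources.
descriptor.setPollingThreadCount(20)
descriptor.save()

println "New polling thread limit: ${descriptor.getPollingThreadCount()}"
```

Try this on a test instance first, and check afterwards that the value shown in Manage Jenkins/Configure System matches.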

Reduce Polling Jobs

For large instances, and when possible, it is a good idea to create a "trigger job" that performs the polling and triggers other jobs. This applies, for example, when multiple jobs are configured with polling on the same workspace/repository and therefore need to be triggered by the same event. This can be done with the Parameterized Trigger Plugin.

SCM Polling

SCM polling threads are triggered by your instance based on the crontab schedule specified. When the scheduled time arrives, a thread executes the polling and terminates when polling is finished. Depending on the changes in the repository and the reliability of the network, this can take a few seconds.

Max SCM Polling Threads limit

Because intense polling on the controller requires a large number of concurrent threads, it is critical to set the Max SCM Polling Threads limit for the health of the platform.

How does the limit work?

Let’s say that you set this limit to 20. When you have 20 concurrent polling threads running, any additional polling request will be queued. It is only when one of the concurrent threads terminates that a request in the queue can be executed.
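The queueing behaviour described above can be illustrated with a plain fixed-size thread pool. This is a standalone sketch, not the Jenkins implementation: the limit of 3, the 10 requests, and the 100 ms sleep that stands in for a poll are all hypothetical values.

```groovy
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger
import java.util.function.IntBinaryOperator

int limit = 3       // stands in for the Max SCM Polling Threads limit (hypothetical value)
int requests = 10   // hypothetical number of polling requests arriving at once

def pool = Executors.newFixedThreadPool(limit)
def running = new AtomicInteger(0)    // polls executing right now
def peak = new AtomicInteger(0)       // highest concurrency observed
def completed = new AtomicInteger(0)  // polls that finished

requests.times {
    // Requests beyond the limit wait in the pool's queue until a thread is free.
    pool.execute {
        int now = running.incrementAndGet()
        peak.accumulateAndGet(now, { a, b -> Math.max(a, b) } as IntBinaryOperator)
        Thread.sleep(100)             // simulated polling work
        running.decrementAndGet()
        completed.incrementAndGet()
    }
}

pool.shutdown()
pool.awaitTermination(10, TimeUnit.SECONDS)

println "Completed polls: ${completed.get()}"
println "Peak concurrent polls: ${peak.get()} (limit ${limit})"
```

Every request is eventually served, but the observed concurrency never exceeds the limit; the price is that queued requests start later than scheduled.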

Even Load Polling

When specifying a polling interval, it is good practice to use the H symbol wherever possible. This prevents multiple jobs from polling at exactly the same time.

For example, 0 0 * * * for 20 jobs will cause 20 threads to start simultaneously at midnight. If the Max SCM Polling Threads limit is lower than this, some polling requests will have to wait before being executed. In contrast, H H * * * executes each job once a day, but not simultaneously. More information can be found here.
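For jobs defined as Declarative Pipelines, a hashed schedule might look like the following sketch. The article does not say which job types are in use, so the Pipeline form is an assumption; for Freestyle jobs the same schedule string goes into the job's Poll SCM schedule field. The stage name and its content are placeholders.

```groovy
pipeline {
    agent any
    triggers {
        // Once a day, at a minute and hour derived from a hash of the job name,
        // so jobs sharing this schedule are spread out instead of all firing at midnight.
        pollSCM('H H * * *')
    }
    stages {
        stage('Build') {   // placeholder stage
            steps {
                echo 'Triggered by SCM polling'
            }
        }
    }
}
```

The H symbol can be used the same way in shorter intervals, for example H/15 * * * * to poll every 15 minutes at an offset that differs per job.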

Tools/Housekeeping

If the Jenkins controller is overloaded with hung SCM polling threads, the following Groovy scripts can help reduce the load. They might interrupt healthy threads, but they can at least prevent the Jenkins controller from crashing. These scripts could be automated in a Freestyle job.

  • The following script interrupts any SCM polling thread that has been running for more than 3 minutes; the threshold can be tuned:

// Interrupt SCM polling threads that have been running longer than the threshold.
long thresholdMillis = 1000 * 60 * 3 // 1000 millis in a second * 60 seconds in a minute * 3 minutes

jenkins.model.Jenkins.instance.getTrigger("SCMTrigger").getRunners().each { runner ->
    String jobName = runner.getTarget().asItem().name
    println(jobName)
    println(runner.getDuration())
    println(runner.getStartTime())
    long millis = System.currentTimeMillis() - runner.getStartTime()

    if (millis > thresholdMillis) {
        // Match threads whose name contains both "SCM polling" and the job name
        Thread.getAllStackTraces().keySet().each { tItem ->
            if (tItem.getName().contains("SCM polling") && tItem.getName().contains(jobName)) {
                println "Interrupting thread " + tItem.getId() + " " + tItem.getName()
                tItem.interrupt()
            }
        }
    }
}

  • The following script lists every SCM polling thread currently running, and interrupts them once the marked line is uncommented:

Thread.getAllStackTraces().keySet().each { item ->
    if (item.getName().contains("SCM polling")) {
        println "Interrupting thread " + item.getId() + " " + item.getName()
        // Uncomment the next line to interrupt these threads
        //item.interrupt()
    }
}
return;