Agent sporadically fails with 'unable to create new native thread'


Issue

  • Provisioned agents sporadically fail to start with the following exception:

Also: hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from XXXX/XXX.XXX.XXX.XXX:XXXXX
	at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1800)
	at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
	at hudson.remoting.Channel.call(Channel.java:1001)
	at hudson.FilePath.act(FilePath.java:1157)
	at hudson.FilePath.act(FilePath.java:1146)
	at org.jenkinsci.plugins.gitclient.Git.getClient(Git.java:121)
	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:904)
	at hudson.plugins.git.GitSCM.createClient(GitSCM.java:835)
	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1288)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep.checkout(SCMStep.java:125)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:93)
	at org.jenkinsci.plugins.workflow.steps.scm.SCMStep$StepExecutionImpl.run(SCMStep.java:80)
	at org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.OutOfMemoryError: unable to create new native thread
    [...]
  • JDK 11 agents show the following warning:

[14.901s][warning][os,thread] Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached.
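
To confirm that an agent is hitting this issue, its log can be searched for the two messages shown above. The following is a minimal sketch in Python, not an official tool: the function name `find_thread_creation_failures` is hypothetical, and it assumes the agent log is available as a sequence of text lines.

```python
# Substrings taken from the two symptoms quoted in this article.
SIGNATURES = (
    "java.lang.OutOfMemoryError: unable to create new native thread",
    "pthread_create failed (EAGAIN)",
)


def find_thread_creation_failures(lines):
    """Return (line_number, text) pairs for lines that contain a known signature.

    `lines` is any iterable of strings, for example an open log file.
    Line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(signature in line for signature in SIGNATURES):
            hits.append((number, line.rstrip("\n")))
    return hits
```

For example, `find_thread_creation_failures(open("agent.log"))` would list the matching lines, where `agent.log` is a placeholder for wherever the agent output is stored.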

Explanation

The issue is sporadic and generally happens during class loading for the first user requests sent through Remoting, typically in the git checkout step. It is caused by the JDK issue JDK-8268773: Improvements related to: Failed to start thread - pthread_create failed (EAGAIN). Class loading was also improved in Remoting 4.14, and that improvement was backported to 4.13.3.

Solution

  • Upgrade Java to version 11.0.16 or later on agent machines. Version 11.0.16.1 or later is recommended because 11.0.16 has a known memory leak.

  • Upgrade Jenkins Agent (Remoting) to version 4.13.3 or later.

Note: CloudBees CI 2.346.3.4 contains Java 11.0.16 and Remoting 4.13.3. However, upgrading CloudBees CI to 2.361.1.2 is recommended to avoid the issue "Memory leak caused by regression in C2 JIT Compiler in Java 11.0.16".
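
The version thresholds in the Solution can be checked with a small script. The following is a minimal sketch in Python under stated assumptions: the helper names `parse_version` and `at_least` are hypothetical, the version strings are assumed to have been collected from the agent already, and the comparison is purely numeric, so it is only meaningful within the Java 11 line and for dotted Remoting versions such as 4.13.3.

```python
import re

# Minimum versions named in the Solution section.
RECOMMENDED_JAVA = "11.0.16.1"
MINIMUM_REMOTING = "4.13.3"


def parse_version(text):
    """Turn '11.0.16.1' (optionally followed by a suffix such as '+1') into a tuple of ints."""
    numeric = re.match(r"\d+(?:\.\d+)*", text.strip())
    if numeric is None:
        raise ValueError(f"not a dotted version: {text!r}")
    return tuple(int(part) for part in numeric.group(0).split("."))


def at_least(actual, minimum):
    """Return True if `actual` is the same as or newer than `minimum`.

    Shorter versions are padded with zeros, so '11.0.16' is treated
    as '11.0.16.0' when compared with '11.0.16.1'.
    """
    a, m = parse_version(actual), parse_version(minimum)
    width = max(len(a), len(m))
    a += (0,) * (width - len(a))
    m += (0,) * (width - len(m))
    return a >= m
```

With these helpers, `at_least("11.0.16", RECOMMENDED_JAVA)` is false, which reflects the recommendation to move past 11.0.16, while `at_least("4.13.3", MINIMUM_REMOTING)` is true.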