Lack of ephemeral storage in the controller pod causes frequent instance restarts

Last Reviewed: 2025-08-22

Issue

My CloudBees CI managed controller restarts frequently. The Jenkins logs don’t show any error that would explain the restart, only INFO messages reporting the JVM shutdown:

...
2025-04-14 21:21:03.887+0000 [id=25] INFO winstone.Logger#logInternal: JVM is terminating. Shutting down Jetty
2025-04-14 21:21:03.887+0000 [id=25] INFO org.eclipse.jetty.server.Server#doStop: Stopped Server@1cfd1875{STOPPING}[10.0.18,sto=0]
2025-04-14 21:21:03.892+0000 [id=25] INFO o.e.j.server.AbstractConnector#doStop: Stopped ServerConnector@425357dd{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2025-04-14 21:21:03.895+0000 [id=25] INFO hudson.lifecycle.Lifecycle#onStatusUpdate: Stopping Jenkins
2025-04-14 21:21:03.896+0000 [id=25] INFO o.c.j.p.k.p.r.Reaper$CloudPodWatcher#stop: Stopping watch for kubernetes cloud kubernetes
2025-04-14 21:21:03.912+0000 [id=25] INFO jenkins.model.Jenkins$16#onAttained: Started termination
2025-04-14 21:21:03.913+0000 [id=25] INFO c.c.j.p.d.events.JenkinsEvents#stopSubmitter: Stopped accepting events
2025-04-14 21:21:03.913+0000 [id=25] INFO c.c.j.p.d.events.JenkinsEvents#stopSubmitter: Shut-down
2025-04-14 21:21:03.915+0000 [id=25] INFO c.c.o.c.MapDBMessagingStore#close: Messaging Stopped
2025-04-14 21:21:03.929+0000 [id=25] INFO jenkins.model.Jenkins$16#onAttained: Completed termination
...

The operations center provisioning logs show the following warnings:

[Mon Apr 14 09:37:42 UTC 2025] Connected ManagedMaster{id=6, name='my-controller', encodedName='my-controller', idName='6-my-controller', timeStamp=0, grantId='3828e261-3391-43c1-bea4-700aaa3bcc91', approved=true, localHome='null', localEndpoint=https://cloudbees.my-company.net/my-controller/, identity=X.509, RSA}
[Mon Apr 14 09:49:14 UTC 2025][Warning][Pod][my-controller-0][Evicted] Pod ephemeral local storage usage exceeds the total limit of containers 4Gi.
[Mon Apr 14 09:49:14 UTC 2025][Normal][Pod][my-controller-0][Killing] Stopping container jenkins
ERROR: [Mon Apr 14 09:49:15 UTC 2025] Disconnected Error ManagedMaster{id=6, name='my-controller', encodedName='my-controller', idName='6-my-controller', timeStamp=0, grantId='3828e261-3391-43c1-bea4-700aaa3bcc91', approved=true, localHome='null', localEndpoint=https://cloudbees.my-company.net/my-controller/, identity=X.509, RSA}
java.nio.channels.ClosedChannelException
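
If the provisioning log is long, it can help to pull out only the eviction events. The following is a minimal sketch, not a CloudBees tool: it assumes the log has been saved as text and that eviction entries follow the `[timestamp][Warning][Pod][pod-name][Evicted] message` shape shown above. The names `Eviction` and `find_evictions` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Eviction(NamedTuple):
    timestamp: str
    pod: str
    message: str
    limit: Optional[str]  # e.g. "4Gi", or None if the message does not state it


# Assumed entry shape, taken from the sample log above:
# [Mon Apr 14 09:49:14 UTC 2025][Warning][Pod][my-controller-0][Evicted] Pod ephemeral ...
# The message ends at the next "[...][" entry or at the end of the line.
EVICTION_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\[Warning\]\[Pod\]\[(?P<pod>[^\]]+)\]\[Evicted\]\s*"
    r"(?P<message>.*?)(?=\s*\[[^\]]+\]\[|$)",
    re.MULTILINE,
)
LIMIT_RE = re.compile(r"total limit of containers (\d+(?:\.\d+)?[A-Za-z]*)")


def find_evictions(log_text: str) -> List[Eviction]:
    """Return every pod eviction event found in the provisioning log text."""
    evictions = []
    for match in EVICTION_RE.finditer(log_text):
        message = match.group("message").strip()
        limit = LIMIT_RE.search(message)
        evictions.append(
            Eviction(
                timestamp=match.group("timestamp"),
                pod=match.group("pod"),
                message=message,
                limit=limit.group(1) if limit else None,
            )
        )
    return evictions
```

Calling `find_evictions(open("provisioning.log").read())` (the file name is an assumption) returns one entry per eviction, which makes it easier to correlate restarts with specific times and with the limit that was in force.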

Resolution

The operations center provisioning logs show that the pod was evicted because its ephemeral storage usage exceeded the configured limit (4Gi in this example). At the moment, we don’t have any guidelines for ephemeral-storage configuration, because the right values depend heavily on the number of builds, libraries, and jobs on a controller and can vary widely from one controller to another. For instance, checking out a large repository in a pipeline can noticeably increase the controller’s ephemeral storage usage and cause a restart every time the pipeline runs. You need to fine-tune the controller’s ephemeral-storage requests and limits to meet your needs.
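
To get a rough idea of how close a directory is to a given limit, you could compare its size against the limit value. The sketch below is an illustration under stated assumptions rather than a supported procedure: the function names are hypothetical, only binary suffixes (Ki, Mi, Gi, Ti) and plain byte counts are handled, and it sums apparent file sizes, which can differ from what the kubelet accounts for. Which path to measure (for example a temporary directory inside the controller container) is also an assumption you need to validate for your environment.

```python
import os

# Binary suffixes used by Kubernetes resource quantities such as "4Gi".
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity such as '4Gi' to bytes.

    Only binary suffixes and plain byte counts are supported in this sketch;
    decimal suffixes such as '4G' raise ValueError.
    """
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def directory_usage(path: str) -> int:
    """Sum the apparent size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # the file disappeared or is unreadable; skip it
    return total


def usage_ratio(path: str, limit: str) -> float:
    """Return the fraction of the limit currently used by files below path."""
    return directory_usage(path) / parse_quantity(limit)
```

For example, `usage_ratio("/tmp", "4Gi")` returns a value close to 1.0 when the files under `/tmp` are approaching 4Gi. Sampling this while a heavy pipeline runs can suggest whether the limit needs to be raised.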

Follow the instructions in Adding ephemeral storage requests and limits to a managed controller to customize these values for controllers from your CloudBees CI operations center.

Tested product/plugin versions

  • CloudBees CI on modern cloud platforms - 2.504.3.28227