Issue
After upgrading CloudBees CI Modern, the operations center and / or managed controllers restart unexpectedly. The node / pod description shows:
The node was low on resource: ephemeral-storage. Threshold quantity: ..., available: .... Container jenkins was using ..., request is 0, has larger consumption of ephemeral-storage.
Explanation
Starting with version 2.401.3.3, the CI Modern container images automatically set the following system properties on startup:
jenkins.plugins.git.AbstractGitSCMSource.cacheRootDir=/tmp/jenkins/caches/git
org.jenkinsci.plugins.github_branch_source.GitHubSCMSource.cacheRootDir=/tmp/jenkins/caches/github-branch-source
This is a requirement for High Availability (active/active) controllers, to avoid corruption issues, and it was implemented in the container image. Consequently, it applies to any kind of managed controller.
It moves the Git plugin cache and the GitHub Branch Source okhttp cache from JENKINS_HOME to the /tmp directory. Although this has little impact in most environments, it can cause stability issues due to ephemeral storage usage, especially in environments that use Git SCM and require lightweight checkout of large Git repositories.
Resolution
The solution is to add the following System Properties to the operations center / managed controller configuration to restore the default behavior of those features:
jenkins.plugins.git.AbstractGitSCMSource.cacheRootDir=/var/jenkins_home/caches
org.jenkinsci.plugins.github_branch_source.GitHubSCMSource.cacheRootDir=/var/jenkins_home/org.jenkinsci.plugins.github_branch_source.GitHubSCMProbe.cache
If these cache directories are put back into /var/jenkins_home using these system properties, ensure that your backups omit these directories.
See How to add Java arguments to Jenkins on CloudBees CI Modern? for details.
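Java system properties are conventionally passed to the JVM as -Dname=value arguments. As a minimal sketch, assuming that convention applies to your configuration, the snippet below renders the two properties above in that form. The helper name to_java_args is hypothetical, and where the resulting string is configured is covered by the article referenced above.

```python
# Hypothetical helper: render system properties as -Dname=value JVM arguments.
# The property names and paths are the ones listed in this article.
RESTORED_CACHE_DIRS = {
    "jenkins.plugins.git.AbstractGitSCMSource.cacheRootDir":
        "/var/jenkins_home/caches",
    "org.jenkinsci.plugins.github_branch_source.GitHubSCMSource.cacheRootDir":
        "/var/jenkins_home/org.jenkinsci.plugins.github_branch_source.GitHubSCMProbe.cache",
}


def to_java_args(props):
    """Join a mapping of system properties into a single argument string."""
    return " ".join(f"-D{name}={value}" for name, value in props.items())


if __name__ == "__main__":
    print(to_java_args(RESTORED_CACHE_DIRS))
```

Printing the string first makes it easy to review both paths for typos before pasting them into the controller configuration.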
If, despite this change, operations center / managed controller pods are still being evicted due to low available ephemeral-storage on the node, then the problem is not related to the change discussed here. Usage of /tmp needs to be evaluated (through monitoring and investigation of the files in the /tmp directory), and ephemeral-storage resource requests / limits may be applied as a solution. See Adding ephemeral storage requests and limits to a managed controller.
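To investigate what is consuming space under /tmp, a rough sketch such as the following can list the largest top-level entries. It assumes Python 3 is available wherever it is run, which may not be the case inside the controller container. The function names are illustrative, and the sizes are apparent file sizes rather than allocated disk blocks.

```python
import os
import sys


def dir_size_bytes(path):
    """Sum the apparent sizes of all files below path, skipping unreadable ones."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is not readable
    return total


def largest_entries(base, top=10):
    """Return up to `top` (size, path) pairs for the entries directly under base."""
    sizes = []
    for entry in os.scandir(base):
        if entry.is_dir(follow_symlinks=False):
            size = dir_size_bytes(entry.path)
        else:
            size = entry.stat(follow_symlinks=False).st_size
        sizes.append((size, entry.path))
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    # Default to /tmp; pass another directory as the first argument.
    base = sys.argv[1] if len(sys.argv) > 1 else "/tmp"
    for size, path in largest_entries(base):
        print(f"{size / 1024 / 1024:10.1f} MiB  {path}")
```

Running it against /tmp/jenkins/caches instead of /tmp narrows the report to the Git and GitHub Branch Source caches discussed in this article.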