Symptoms
The Jenkins instance becomes slow at some point. For example, pages that usually load quickly suddenly take several seconds. The UI is still displayed, but it is not usable.
Diagnosis/Treatment
Precondition
Before taking any further action, ensure that the recommended JVM parameters are in place, as described in the guide Prepare CloudBees CI for Support.
Data Collection when the problem is exposed
It is essential to collect the data while the problem is occurring.
Automatic Data collection
This is the preferred method if you are using a product supported by the cbsupport CLI.
Current products supporting collecting performance data are:
Steps to follow are:
-
Install and configure cbsupport following Using cbsupport CLI to collect the requested data
-
Run cbsupport required-data performance
-
Collect the archive generated in the working directory of cbsupport and attach it to the ticket using our upload service
Manual Data collection
-
Use the script collectPerformanceData.sh to take several thread dumps while the problem is occurring and correlate them with the output of a process list.
-
A thread dump alone is NOT ENOUGH because it does not show the percentage of CPU used by each thread. You also need the process list with the CPU usage of each thread.
-
A single thread dump is NOT ENOUGH because it only represents a static snapshot of one moment. We need several sequential thread dumps to understand behavior over time.
-
Attach at least one support bundle generated while the issue is occurring or right after.
-
Attach the output of the garbage collector covering the time of the issue. This is the file you specified when following Prepare CloudBees CI for Support, via -Xlog:gc[...]:file=<some-log-file>:[...] for JDK 11+ or -Xloggc:<some-log-file> for JDK 8.
JVM tuning is known to improve performance, especially with a large Java heap. For this we recommend:
-
Because Jenkins response time is very important, memory consumption is usually higher than 2G, and garbage collection pauses must be kept shorter than approximately 1 second, use the proper GC as previously suggested and see also Tuning Jenkins GC For Responsiveness and Stability.
Most users need 4G or less of heap memory for Jenkins. It is best to ensure you really need additional heap (either from low garbage collection throughput and excessive garbage collection activity, measured by GC logs, or from getting OutOfMemory errors in Jenkins).
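To check which GC log file a running JVM is actually writing to, you can look for the options above in its command line. The following is a minimal sketch, not part of the CloudBees tooling: the function name gc_log_from_cmdline is hypothetical, and it assumes a Unix-style path without spaces or colons.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the GC log file path from a JVM command line.
# $1: full command line string, for example the output of: ps -o args= -p <PID>
# Prints the path and returns 0 if one is found, returns 1 otherwise.
gc_log_from_cmdline() {
    local cmdline="$1"
    local path
    # JDK 11+ unified logging: -Xlog:gc[...]:file=<path>[:...]
    path="$(printf '%s\n' "$cmdline" | sed -n 's/.*-Xlog:gc[^ ]*:file=\([^: ]*\).*/\1/p')"
    if [ -z "$path" ]; then
        # JDK 8: -Xloggc:<path>
        path="$(printf '%s\n' "$cmdline" | sed -n 's/.*-Xloggc:\([^ ]*\).*/\1/p')"
    fi
    [ -n "$path" ] && printf '%s\n' "$path"
}
```

For example, gc_log_from_cmdline "$(ps -o args= -p <PID>)" prints the configured log file if either option is present, and prints nothing when no GC log option is set.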
The script is run as follows:
# Syntax
collectPerformanceData.sh <PID> <TOTAL_TIME> <SLOTS>
# Example
sudo -i -u jenkins-oc /home/jenkins/collectPerformanceData.sh <PID> 300 30
Note that the script must be run by the user that runs the Jenkins process. You can identify that user with ps -aux | grep jenkins.
We recommend running the script prior to troubleshooting any issue to confirm that all the elements generated have content.
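One way to confirm that every generated element has content is to list the empty files in the output directory after a trial run. The sketch below is illustrative only: the function name list_empty_files is hypothetical and is not part of collectPerformanceData.sh, and the directory to inspect is whatever location the script wrote to in your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every empty regular file under a directory.
# $1: directory to inspect.
# Returns 0 when no empty file is found, 1 otherwise.
list_empty_files() {
    local dir="$1"
    local empty
    empty="$(find "$dir" -type f -empty)"
    if [ -n "$empty" ]; then
        printf '%s\n' "$empty"
        return 1
    fi
    return 0
}
```

Any file name printed points to a collector (for example a missing iostat or nfsiostat binary) that produced no output and should be fixed before a real incident.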
Generated bundle’s directory structure:
Contents:
-
iostat: performance monitoring of storage devices
-
jstack: thread dumps
-
mode.txt: script output
-
netstat.out: network connections for the Transmission Control Protocol (both incoming and outgoing), routing tables, and a number of network interfaces
-
nfsiostat: performance monitoring of NFS mounts
-
nfsstat: NFS statistics for client and server activity
-
topdashHOutput: per-thread list with information such as CPU usage and memory consumption
-
topOutput: processes running on the machine
-
vmstat: summary information about operating system memory, processes, interrupts, paging and block I/O
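To correlate the two, note that top -H reports thread IDs in decimal, whereas thread dumps from JDK 8 and JDK 11 print each thread's native ID as a hexadecimal nid=0x... field (other JDK versions may format it differently). A minimal sketch of the conversion follows; the function names tid_to_nid and find_thread are hypothetical, and the file you pass in is one of the thread dumps from the bundle.

```shell
#!/usr/bin/env bash
# Convert a decimal thread ID (as shown by top -H) to the
# hexadecimal form used in the nid= field of a thread dump.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Print the thread header line(s) in a thread dump that match a thread ID.
# $1: decimal thread ID taken from the top -H output
# $2: path to a thread dump file
find_thread() {
    local nid
    nid="$(tid_to_nid "$1")"
    # -w avoids matching a longer nid that merely starts with the same digits
    grep -w -- "nid=${nid}" "$2"
}
```

For example, if the top -H output shows a thread with high CPU usage, passing its ID and a thread dump taken at the same time to find_thread prints the header of that thread, and its stack trace follows in the dump.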