Issue
- I need to increase the default timeout for the CloudBees High Availability (active/passive) Plugin
Description
A High Availability (active/passive) failover is often a sign of an underlying issue that should be addressed. You can adjust this timeout while you debug the issue, or you can disable HA by following How to disable High Availability (active/passive) in Jenkins?.
Commonly, a failover is caused by long-running JVM garbage collection cycles that last longer than the default timeout (10s for versions lower than 2.303.2.5, 30s for version 2.303.2.5 and greater). Therefore, following the Best Practices is a must.
If you are experiencing HA failovers too often, we encourage you to Submit a Support Request so we can diagnose the root cause.
Resolution
If you are running product version 2.303.2.3 or higher, you can adjust the timeout from the product by going to Manage Jenkins -> Configure System -> High Availability Configuration -> Enable customize JGroups configuration. You will need to configure the ports you would like to use, and you can also configure the timeout (default is 30 seconds).
For product versions lower than 2.303.2.3, you can place a custom jgroups.xml file inside ${JENKINS_HOME}. By default this file is not present; if you want to customize the JGroups settings, you will need to create the file. This article has reference copies of the file which you can use as a basis. Be sure to choose the file that matches your version of CloudBees CI.
The following <FD> node within jgroups.xml determines the timeout period before failover. It essentially works like: timeout*max_tries (+ verify_suspect). Therefore, with the default settings:
<FD timeout="3000" max_tries="3"/><VERIFY_SUSPECT timeout="1500"/>
3000*3(+1500) = 10500 ms, or ~10 seconds
To increase the timeout, you can increase the values:
<FD timeout="3000" max_tries="10"/><VERIFY_SUSPECT timeout="1500"/>
3000*10(+1500) = 31500 ms, or ~30 seconds
This 30-second timeout became the default in release 2.303.2.5 with the change Increased High Availability (HA) default timeout (BEE-106).
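To check a proposed change before editing jgroups.xml, the arithmetic above can be sketched in a few lines of Python. This is only an illustration of the timeout*max_tries (+ verify_suspect) formula described in this article: the function name failover_timeout_ms is hypothetical, and the sketch assumes it receives just the <FD> and <VERIFY_SUSPECT> elements as a fragment, not a complete jgroups.xml file.

```python
import xml.etree.ElementTree as ET


def failover_timeout_ms(jgroups_fragment: str) -> int:
    """Estimate the failover timeout in milliseconds.

    Applies the formula from this article:
    FD timeout * FD max_tries + VERIFY_SUSPECT timeout.
    Hypothetical helper, not part of any CloudBees or JGroups tooling.
    """
    # Wrap the fragment in a dummy root so sibling elements parse as one document.
    root = ET.fromstring(f"<root>{jgroups_fragment}</root>")
    fd = root.find("FD")
    if fd is None:
        raise ValueError("no <FD> element found in the fragment")
    total = int(fd.get("timeout")) * int(fd.get("max_tries"))
    # VERIFY_SUSPECT adds its own timeout on top, when present.
    verify = root.find("VERIFY_SUSPECT")
    if verify is not None:
        total += int(verify.get("timeout"))
    return total


DEFAULT = '<FD timeout="3000" max_tries="3"/><VERIFY_SUSPECT timeout="1500"/>'
INCREASED = '<FD timeout="3000" max_tries="10"/><VERIFY_SUSPECT timeout="1500"/>'

print(failover_timeout_ms(DEFAULT))    # 10500
print(failover_timeout_ms(INCREASED))  # 31500
```

The results are in milliseconds, so 10500 corresponds to the ~10 second default and 31500 to the ~30 second setting.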