Troubleshooting Pipelines

Inserting checkpoints

All Pipelines are durable: if Jenkins is restarted (or crashes, or the server reboots) while a flow is running, the flow resumes at the same point in its Pipeline script once Jenkins is back up. Similarly, if a flow is running a lengthy sh or bat step when an agent unexpectedly disconnects, no progress should be lost when the agent reconnects, so long as the agent itself keeps running. The step running on the agent continues to execute until it completes, then waits for Jenkins to reconnect the agent.

However, in some cases a Pipeline will have done a great deal of work and then hit a transient error: one that has nothing to do with the inputs to the build, such as source code changes. For example, after a lengthy build and test of a software component, the final deployment to a server might fail for a trivial reason, such as a DNS error or low disk space. After correcting the problem, you might prefer to rerun just the last portion of the Pipeline without redoing everything that came before.

The checkpoint step makes this possible. Simply place a checkpoint at a safe point in your script, after performing some work and before doing something that might fail randomly:

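node {
    // Long-running work that should not be repeated on a retry
    sh './build-and-test'
}
checkpoint 'Completed tests'
node {
    // Step that might fail for transient reasons
    sh './deploy'
}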

Whenever build-and-test completes normally, this checkpoint is recorded as part of the Pipeline, along with any program state at that point, such as local variables. If deploy fails in this build (or just behaves differently than you wanted), you can later go back and restart from this checkpoint in this build. (You can use the Checkpoints link in the sidebar of the original build, or the Retry icon in the stage view, mentioned below.) A new flow build (with a fresh build number) is then started; it skips all the steps preceding checkpoint and runs only the remainder of the flow.
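
To illustrate what "program state, such as local variables" can mean in practice, here is a minimal sketch in which a value computed before the checkpoint is used after it. The file name version.txt and the argument passed to ./deploy are hypothetical and are not part of the example above; readFile is the standard Pipeline step for reading a workspace file.

```groovy
// Sketch only: version.txt and the argument to ./deploy are hypothetical.
def version   // declared outside the node blocks so both can see it

node {
    sh './build-and-test'
    // Assumes the build script writes its version string to version.txt
    version = readFile('version.txt').trim()
}

checkpoint 'Completed tests'

node {
    // After a restart from the checkpoint, the first node block is skipped,
    // but the value of version recorded at the checkpoint is still set here.
    sh "./deploy ${version}"
}
```

The variable here is a plain String, which keeps the saved state simple. If the deploy step depends on more than a small value like this, treat the sketch as a starting point rather than a complete recipe.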