Managing build agents with the Nodes Plus plugin

The Nodes Plus plugin provides additional functionality for managing Jenkins build agents (nodes). With it you can:

  • Assign owners to agents

  • Notify owners when agent availability changes

  • Accept tasks only when a custom probe command succeeds

Agent owners

In a Jenkins instance shared by a large team (or by several teams), particular individuals are often responsible for particular build agents. If a build agent goes off-line and fails to come back on-line, builds that are tied to that agent can end up stuck in the build queue. Most Jenkins users eventually filter out the e-mail notifications for successful builds (either by configuring the build to send e-mail only when it fails, or by setting up rules in their mail client). Thus, if a specific project’s builds are stuck in the build queue, nobody may notice until someone browses the Jenkins instance in a web browser.

The Node Owners property introduced by the Nodes Plus plugin sends e-mail notifications to the designated owners based on a configurable set of availability triggers.

In some cases Jenkins can take a number of minutes to confirm that an agent is off-line. E-mail notifications are only sent after Jenkins has confirmed that the agent is off-line, which may involve waiting for socket connections to time-out.

Configuring owners

On the agent configuration screen, enable the Node owners checkbox in the Node Properties section. The configuration options then become visible.

Configuring the agent owners


In the Owner(s) input box, enter the list of people who should receive e-mails when the agent availability changes. Separate e-mail addresses with whitespace, blank lines, or commas.

Send email when connected

This trigger fires when the communication channel has been established with the agent. If the agent has been marked as Temporarily off-line then no build jobs will be accepted by the agent.

NOTE: This trigger will only fire on transition from disconnected to connected.

Send email when disconnected

This trigger fires when the communication channel with the agent has been confirmed dead. Where an agent is configured to be kept on-line as much as possible, Jenkins will immediately try to reconnect the agent, so the e-mail from this trigger will be followed immediately by either the launch failure e-mail or the successful connection e-mail. This trigger is therefore most useful where one of the other availability strategies has been selected for the agent.

NOTE: This trigger will only fire on transition from connected to disconnected.

Send email on launch failure

This trigger fires when an attempt to establish a communication channel with the agent fails.

NOTE: This trigger will fire each and every time a connection attempt fails.

Send email when temporary off-line mark applied

This trigger fires when the agent is marked as temporarily off-line.

NOTE: This trigger will fire within the first 5 seconds of the agent being marked off-line.

Send email when temporary off-line mark removed

This trigger fires when the temporarily off-line mark has been removed from the agent.

NOTE: This trigger will fire within the first 5 seconds of the agent ceasing to be marked off-line.

Save the agent configuration to apply the changes.

Custom probe before accepting tasks

Some agents may be dedicated to specific tasks like code signing, VLSI evaluation, static analysis, javadoc generation or other tasks that are specific to a capability on the specific agent. If the capability is not available, then tasks should not be accepted on that agent.

The Nodes Plus plugin allows the administrator to optionally define a custom probe command to determine if the node should accept tasks. When the custom probe command returns 0, the agent is allowed to accept tasks. When the custom probe command returns a non-zero value, the agent will not accept tasks.

The command is executed at most once per minute, and only when there is a job that could run on the node. When the command exits with a non-zero code, its standard output is reported as the reason the node is blocked.
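As a minimal sketch of the exit-code behavior described above, the following shell function succeeds only when a given file exists and is executable, and otherwise prints a reason to standard output. The function name probe_tool and the wording of the message are illustrative assumptions, not part of the plugin.

```shell
#!/bin/sh
# Hypothetical probe helper: succeed only when the given tool exists
# and is executable by the current user.
probe_tool() {
    tool_path="$1"
    if [ -x "$tool_path" ]; then
        # Exit code 0: the agent may accept tasks.
        return 0
    fi
    # Print the reason on standard output, since that is what is
    # reported when the probe blocks the agent.
    echo "required tool is missing or not executable: $tool_path"
    # Non-zero exit code: the agent should not accept tasks.
    return 1
}
```

A probe script built on this function could end with the line probe_tool "$1" and be configured as the custom probe command with the tool path as its argument. The exit status of the script is then the status returned by the function.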

For example, if the agent is dedicated to javadoc generation, then a custom probe command might be:

ls /usr/lib/jvm/java-1.8-openjdk/bin/javadoc

If the javadoc command does not exist at that location on that agent, then the agent will not accept tasks. The agent status includes the output of the custom probe command. In this example, the agent status would include the message:

Not accepting tasks: ls: cannot access '/usr/lib/jvm/java-1.8-openjdk/bin/javadoc': No such file or directory

The custom probe command might check that the special tool on that agent is correctly licensed and ready to perform tasks. It might check that the code signing infrastructure on the agent is correctly configured.
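Where readiness depends on several conditions at once, one possible approach is a probe that runs a list of check commands and stops at the first failure. The sketch below assumes each check is passed as a single shell command string. The function name run_checks and the message format are hypothetical.

```shell
#!/bin/sh
# Hypothetical composite probe: run each check command in turn and
# report the first one that fails.
run_checks() {
    for check in "$@"; do
        # Discard the check's own output so that only the summary
        # line below reaches standard output.
        if ! sh -c "$check" >/dev/null 2>&1; then
            echo "check failed: $check"
            return 1
        fi
    done
    # Every check passed: the agent may accept tasks.
    return 0
}
```

For example, a code signing agent might call run_checks "test -r /path/to/signing-cert.pem" "command -v jarsigner", where the certificate path is a placeholder to be replaced with the real location on that agent.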