Using Cluster Manager Administration Tools


To manage and administer an Accelerator host cluster, three tools are available to perform operations across all hosts simultaneously. The tools (clusterexec, clusterupload, and clusterdownload) are platform-independent and are part of the eRunner package. By default, the eRunner daemon (UNIX) and eRunner service (Windows) are installed on agent machines and eMake machines. The tools are installed on Cluster Manager machines and eMake machines. Only the server side (erunnerd) is installed on agents to allow them to serve requests from the tools. If you chose not to install eRunner during Cluster Manager installation, you do not have access to these tools. These Cluster Manager administration tools allow you to:

  • Start and stop agents

  • Reboot hosts

  • Run commands on hosts

  • Upload files

  • Download files

These tools are particularly useful for automating the cluster upgrade process to update the build environment (for example, build tools, compiler, header files, system libraries, and so on) and for collecting debug information. Other uses include restarting agents, restarting a subset of agents, or obtaining host status information.

Specific command-line tool functions are:

  • clusterexec —Lets you run arbitrary commands on a host machine or all hosts in a cluster

  • clusterupload —Lets you upload executables and files (for example, compiler, libraries, and so on) to all hosts in a cluster

  • clusterdownload —Lets you download files from host machines to a central location

clusterupload, clusterexec, and clusterdownload communicate with the eRunner service on Windows hosts or the eRunner daemon on UNIX machines. The eRunner service/daemon listens for connections on port 2411 by default.
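Before running the tools against a host, it can help to confirm that the eRunner port is reachable. The following is a minimal Python sketch, not part of the eRunner package: the helper name erunner_port_open is hypothetical, and the check only opens a TCP connection to the port without speaking the eRunner protocol.

```python
import socket

# Default eRunner port as stated in the text above.
ERUNNER_DEFAULT_PORT = 2411


def erunner_port_open(host: str, port: int = ERUNNER_DEFAULT_PORT,
                      timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it only shows that something accepts
    connections on the port, not that the listener is eRunner.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False
```

A result of False can also mean that a firewall blocks the port, so treat it as a hint rather than proof that eRunner is stopped.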

Because these tools are platform-neutral, clusterexec running on Windows can communicate with eRunner on Linux hosts so those hosts can execute the commands. Conversely, you can use clusterupload on Linux to upload files to a Windows host. However, some rules should be followed when using these tools cross-platform from UNIX to Windows. For more detailed information, see Using clusterexec.

clusterupload and clusterdownload do not support ACLs. During upload/download, they are ignored. An effect of ignoring ACLs is that Cygwin symlinks are not handled properly.

Using clusterexec

This command executes shell commands given by cmd1, cmd2, and so on, on one or more hosts—typically, all hosts in a cluster. The syntax for running clusterexec is:

% clusterexec [options] "cmd1 arg2; cmd2; …​"

For example, if you are running Linux and want to find out how long the operating system on each host in the cluster has been running since the last reboot, you could run the uptime command on each host in the cluster, lin-cluster. Using clusterexec, the syntax would be:

% clusterexec --cm=lin-cluster uptime

Each command can be one of the following types:

  • A fully-qualified path to an executable on the hosts.

  • The executable name on the hosts. The eRunner service that processes the clusterexec command does a PATH search based on the following:

      • For Linux and Solaris, the PATH that erunner searches is defined in /etc/init.d/erunner and includes the well-known binary and system binary installations on the host.

      • For Windows, the path that erunner searches is the PATH system environment variable on the host.

  • A built-in command defined in the eRunner service. See the “eRunner Built-In Commands” table below for a description of these commands.

Some platform-specific rules to observe:

  • clusterexec commands run as the root user on Solaris and Linux, or as the LocalSystem user on Windows

  • When using clusterexec on Solaris/Linux to run a command on a Windows host, the shell may alter the command. For example, the UNIX shell treats backslash (\) characters as escapes. This issue can also apply if you are running in a UNIX-like shell on Windows (for example, Cygwin bash). To correct this situation, use one of the following methods:

  • Use single quotes instead of double quotes, for example: clusterexec --hosts=win1 'c:\\winnt\\system32\\xcopy c:\\a.txt

  • Use double-escape backslashes, for example: clusterexec --hosts=win1 "c:\\\\winnt\\\\system32\\\\xcopy

  • Use forward slashes (/) instead of backslashes where possible, for example: clusterexec --hosts=win1 "c:/winnt/system32/xcopy c:/a.txt

The --hosts argument can take patterns in the form [X-Y] to indicate a range of host names or IP addresses. It can also take patterns in the form [X,Y,Z] to indicate a list of hosts, with X, Y, and Z substituted at the pattern location. For example, to operate on host1, host2, host3, host5, host7, host-a, host-b, and host-c, you can use a --hosts argument such as host[1-3] host[5,7] host-[a-c]. Patterns must be specified in [ ] or { } brackets.
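To preview which hosts a pattern covers before you run a command, you can expand it locally. The following Python sketch illustrates the expansion described above. It is an approximation written for this example, not the tools' own parser: the function names are hypothetical, and it assumes that ranges are either integers or single letters.

```python
import re
from itertools import product

# Matches one bracketed pattern such as [1-3], [5,7], or {a-c}.
_PATTERN = re.compile(r"[\[{]([^\]}]*)[\]}]")


def _expand_item(item: str) -> list[str]:
    """Expand 'X-Y' into a range (integers or single letters)."""
    lo, sep, hi = item.partition("-")
    if sep and lo.isdigit() and hi.isdigit():
        return [str(n) for n in range(int(lo), int(hi) + 1)]
    if sep and len(lo) == 1 and len(hi) == 1 and lo.isalpha() and hi.isalpha():
        return [chr(c) for c in range(ord(lo), ord(hi) + 1)]
    return [item]  # a plain list entry such as "5"


def expand_host(spec: str) -> list[str]:
    """Expand a single host specification such as 'host[1-3]'."""
    # re.split with one capture group alternates literal text (even
    # indexes) with the contents of each bracket pair (odd indexes).
    pieces = _PATTERN.split(spec)
    choices = []
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            choices.append([piece])
        else:
            expanded = []
            for item in piece.split(","):
                expanded.extend(_expand_item(item.strip()))
            choices.append(expanded)
    return ["".join(combo) for combo in product(*choices)]


def expand_hosts(value: str) -> list[str]:
    """Expand a whitespace-separated --hosts value."""
    hosts = []
    for spec in value.split():
        hosts.extend(expand_host(spec))
    return hosts


print(expand_hosts("host[1-3] host[5,7] host-[a-c]"))
```

For the pattern in the text, the sketch yields host1, host2, host3, host5, host7, host-a, host-b, and host-c, in that order.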

If a command fails, clusterexec does not run subsequent commands by default. Use the -k option to override this behavior and keep going after failure.

clusterexec command-line options are described in the following table:

clusterexec Command-Line Options

Switch Options Description

-h, --help

Prints a usage message summarizing information in this table

--hosts=<value>

Set of hosts on which to run commands. value should be in the form host1 host2

--cm=<host>:<port>

Cluster Manager that is contacted to get the relevant hosts. This is used only when --hosts is not specified. If this option is specified, commands run on all hosts in the cluster, subject to the --platform and --good-hosts-only options below. Defaults to the value of the EMAKE_CM environment variable (if present)

--platform=<value>

Platform of desired hosts. This is used only when --cm is specified and --hosts is not specified. value is either windows, solaris, solarisx86, or linux. Commands are executed on hosts of the named platform type only. Defaults to the platform on which this client program is running

--good-hosts-only

Only use hosts with at least one pingable enabled agent in Cluster Manager. This option applies only when --cm is specified. Default behavior uses all hosts regardless of their state

--file=<value>

Name of a file containing commands to run. If this option is specified, an inline script (as shown above) must not be specified

-k, --keep-going

Continue running commands after failure. If more than one command is specified, continue to run subsequent commands, even if an earlier command fails

--mergestreams=<0/1>

Default = 1 (true): the stdout and stderr of commands are merged and written to the clusterexec stdout stream. If set to 0 (false), the stdout and stderr of commands are sent to the clusterexec stdout and stderr, respectively.

You cannot disable mergestreams if you enable annotation. Enabling annotation automatically enables mergestreams, even if it was explicitly disabled

-v, --version

Displays clusterexec version

-s, --use-shell

Send the entire command string to the shell (sh on UNIX, cmd on Windows)

--timeout=<value>

Abort execution after value seconds
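If you drive clusterexec from a script, it can be convenient to assemble the argument list in one place. The following Python sketch uses only options listed in the table above. The function name build_clusterexec_argv and its parameters are hypothetical, and the sketch only builds and prints the argument list; it does not run clusterexec.

```python
import shlex


def build_clusterexec_argv(script, hosts=None, cm=None, platform=None,
                           good_hosts_only=False, keep_going=False,
                           use_shell=False, timeout=None):
    """Build an argument list for clusterexec (hypothetical helper)."""
    argv = ["clusterexec"]
    if hosts:
        # --hosts takes a space-separated list and takes precedence over --cm.
        argv.append("--hosts=" + " ".join(hosts))
    else:
        if cm:
            argv.append("--cm=" + cm)
        # --platform and --good-hosts-only apply only when hosts come
        # from the Cluster Manager.
        if platform:
            argv.append("--platform=" + platform)
        if good_hosts_only:
            argv.append("--good-hosts-only")
    if keep_going:
        argv.append("--keep-going")
    if use_shell:
        argv.append("--use-shell")
    if timeout is not None:
        argv.append(f"--timeout={timeout}")
    argv.append(script)  # the "cmd1; cmd2; ..." string is passed last
    return argv


argv = build_clusterexec_argv("uptime", cm="lin-cluster")
print(shlex.join(argv))
```

Passing the list to a process launcher (for example, subprocess.run) avoids a second round of shell quoting on the machine where you run the tool.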

Commands may refer to executables on hosts or commands built into the eRunner service/daemon on the hosts. Valid built-in commands are listed in the following table:

eRunner Built-In Commands

Commands Description

stopAgent

Stops the agent service on the hosts

startAgent

Starts the agent service on the hosts

restartAgent

Restarts the agent service on the hosts

reboot [<delay>]

Reboot the hosts. If delay is specified, reboot each host delay ms (milliseconds) apart

stopErunner

Shuts down eRunner service on the hosts

restartErunner

Restarts eRunner service on the hosts

printVersion

Prints the eRunner service version on the hosts

logLevel <level>

Query or set the eRunner service log level on the hosts. Valid values for level are DEBUG and INFO

Using clusterupload

The clusterupload command is a convenient tool for transferring a file or files to all hosts in a cluster. This command can upload one or more files to one or more hosts in the cluster. The syntax for running clusterupload is:

clusterupload [options] <source> <target>

where target can be a file or a directory. The interpretation depends on whether target already exists on the host (as a file or as a directory). In the rules below, source is the relative or absolute path of the file or directory on the local machine, and target is the absolute path to the file or directory on the hosts:

  • If source is a file and target is a directory, source is copied into target.

  • If source is a file and target is a file, source is copied over target.

  • If source is a directory and target is a file, an error is reported.

  • If source is a directory ending in / or \, the source contents are copied into target if target is a directory. If target is not a directory, an error is reported.

  • If source is a directory not ending in / or \, the last component of the source path becomes a subdirectory under target, and the source contents are copied into that subdirectory.

Target paths need not exist in advance; relevant directories are created as needed. If you specify a relative path for target, clusterupload reports an error because relative paths are not allowed.
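The source and target rules above can be summarized as a small decision function. The following Python sketch is an illustration only, not clusterupload's implementation: the function name is hypothetical, it covers only targets that already exist on the host as a file or a directory, and it assumes that an absolute path starts with / or with a drive letter.

```python
import re

# Assumption: "absolute" means a leading slash or a drive letter (c:/ or c:\).
_ABSOLUTE = re.compile(r"^(/|[A-Za-z]:[\\/])")


def upload_destination(source, source_is_dir, target, target_kind):
    """Return where source ends up on the host, or raise ValueError.

    target_kind is "file" or "dir": what target already is on the host.
    """
    if not _ABSOLUTE.match(target):
        raise ValueError("relative target paths are not allowed")
    if target_kind not in ("file", "dir"):
        raise ValueError("this sketch covers existing targets only")

    # Last path component of source, ignoring any trailing separator.
    name = re.split(r"[\\/]", source.rstrip("/\\"))[-1]
    inside_target = target.rstrip("/\\") + "/" + name

    if not source_is_dir:
        # File into a directory, or file over an existing file.
        return inside_target if target_kind == "dir" else target
    if target_kind == "file":
        raise ValueError("cannot upload a directory onto a file")
    if source.endswith(("/", "\\")):
        return target  # the directory contents go directly into target
    return inside_target  # last component becomes a subdirectory of target


print(upload_destination("~/download/bash", False, "/usr/bin", "dir"))
```

For a directory source, the returned path is the directory on the host that receives the source contents.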

To upload multiple sources, the syntax is:

clusterupload [options] <source1> <source2> <target>

where target is assumed to be a directory. (The target is always assumed to be a directory when multiple sources are specified.)

Switch option values are described in the following table:

clusterupload Command-Line Options

Switch Options Description

-h, --help

Prints a usage message summarizing information in this table

--verbose

Verbose mode

--hosts=<value>

Set of hosts within the cluster to update. value should be in the form host1 host2[:port]…​

--cm=<host>:<port>

The Cluster Manager contacted to get relevant hosts. This is used only when --hosts is not specified. If this option is specified, commands are run on all hosts in the cluster, subject to the --platform and --good-hosts-only options below. Defaults to the value of the EMAKE_CM environment variable (if present)

--platform=<value>

Platform of desired hosts. This is used only when --cm is specified. value is either windows, solaris, or linux. Commands are executed on hosts of the specific platform type only. Defaults to the platform on which this client program is running

--good-hosts-only

Only upload to hosts pingable and enabled in Cluster Manager. This option applies only when --cm is specified. Default behavior is to upload to all hosts regardless of their state

--filelist=<value>

Filename containing a list of source and target files to upload. If this option is specified, source and target must not be specified. Each line in the file must be in the form: sourcePath => targetPath. If --filelist is specified with a dash value, file list data is read from stdin

--stop-agents

Stops agents when committing uploaded data to its final location

-v, --version

Displays clusterupload version

--timeout=<value>

Aborts execution after value seconds

Using clusterdownload

The clusterdownload command is a convenient tool for transferring a file or files from hosts in a cluster to a central location. This command downloads one or more files from one or more hosts in the cluster. The syntax for running clusterdownload is:

clusterdownload [options] <sourcedir> <targetdir>

The target directory does not need to exist in advance; it is created as needed. Relative paths are supported for targetdir but not for sourcedir.

For downloading multiple sources, you can use a pattern in sourcedir or use the --include option:

clusterdownload [options] "/opt/ecloud/i686_Linux/logs/*.log" <target>

If wildcard characters are used, quotes are required so the pattern is not expanded first by the shell.

 — or use — 

clusterdownload [options] --include=*.log /opt/ecloud/i686_Linux/logs <target>

The second example above copies all .log files from the specified directory.

Switch option values are described in the following table:

clusterdownload Command-Line Options

Switch Options Description

-h, --help

Prints a usage message summarizing information in this table

-v, --version

Print version information

--verbose

Verbose mode: shows progress information as well as connection requests

--hosts=<value>

Set of hosts within the cluster from which to download files. value should be in the form host1 host2[:port]… Patterns can also be used, for example: host[1-3] host[5,7] host-[a-c]. You can use curly brackets instead of square brackets. The --hosts option overrides the --cm option.

--cm=<host>:<port>

Cluster Manager that is contacted to get the relevant hosts. This is used only when --hosts is not specified. If this option is specified, the commands run on all hosts in the cluster, subject to the --platform and --good-hosts options below. Defaults to the value of the EMAKE_CM environment variable (if present)

--platform=<value>

Platform of desired hosts. This is used only when --cm is specified. value is either windows, solaris, or linux. Commands are executed only on hosts of the specific platform type. Defaults to the platform on which this client program is running

--timeout=<value>

Abort the connection after value seconds. Any partially downloaded files are discarded

-r, --recursive

Recurse into subdirectories

-u, --update

Update only. Do not overwrite newer files. File modification times are used

-t, --times

Preserve file timestamps

-n, --dry-run

Show what would have been transferred

--existing

Only update existing files

--progress

Show progress during the file transfer. When this option is used, clusterdownload lists each file when it begins its download. Progress also prints dots (…​.) as files are being written to identify an active process

--exclude=<pattern>

Exclude files matching pattern. Patterns can include the usual ? and * meta-characters. Patterns apply to filenames, not to paths. When a file matches both the include and exclude patterns, it is excluded

--include=<pattern>

Include files matching pattern. Patterns can include the usual ? and * meta-characters. Patterns apply to filenames, not to paths. When a file matches both the include and exclude patterns, it is excluded

--logfile=<file>

Set the logfile name. This can include an absolute or relative path. If no path is specified, the default location on Solaris and Linux is /var/log. On Windows, the default location is where Accelerator is installed (default: c:\ECloud\i686_win32). If the target file is not writable for some reason, the log is written to STDOUT. If you specify --logfile, you must also specify --loglevel because there is no default log level.

--loglevel=<info|debug>

There is no default log level. If you specify --loglevel but not --logfile, the default log filename is clusterdownload.log in the location described above. To set log levels on Cluster Manager or on agent hosts see clusterexec.
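The interaction between --include and --exclude can be illustrated with a short filter. The following Python sketch is not clusterdownload's implementation, and the function name is hypothetical. It assumes that a file is selected when no --include pattern is given. It uses Python's fnmatch module, which also understands [seq] character classes that the table above does not mention.

```python
from fnmatch import fnmatchcase


def is_selected(path, include=(), exclude=()):
    """Decide whether a remote file would be selected for download."""
    # Patterns apply to the filename only, not to the directory part.
    name = path.replace("\\", "/").rsplit("/", 1)[-1]
    # A file that matches an exclude pattern is always excluded,
    # even if it also matches an include pattern.
    if any(fnmatchcase(name, pattern) for pattern in exclude):
        return False
    if include:
        return any(fnmatchcase(name, pattern) for pattern in include)
    return True  # assumption: no include pattern selects everything


print(is_selected("/opt/ecloud/i686_Linux/logs/ecagent1.log", include=["*.log"]))
```

Because matching uses the filename only, a pattern such as \*log\* does not select readme.txt inside a directory named logs.d.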

Sample Uses of Cluster Tools

In the following example, clusterupload is used to upload a new version of bash , and clusterexec is used to run a command on Accelerator cluster hosts.

  1. Upload bash to /usr/bin on a Linux/Solaris cluster:

     % clusterupload --cm=dilbert-cm ~/download/bash /usr/bin/bash

  2. Find out how long hosts have been up on a Linux/Solaris cluster:

     % clusterexec --cm=dilbert-cm uptime

     The result would be similar to:

     dilbert1.cloudbees.com output:
     16:02:46 up 5:47, 0 users, load average: 0.00, 0.04, 0.07
     dilbert2.cloudbees.com output:
     16:02:45 up 5:47, 0 users, load average: 0.08, 0.04, 0.06
     dilbert3.electric-cloud.com output:
     16:02:46 up 6:16, 0 users, load average: 0.00, 0.02, 0.05
     dilbert4.cloudbees.com output:
     16:02:45 up 6:16, 0 users, load average: 0.00, 0.01, 0.02
     dilbert6.electric-cloud.com output:
     16:02:45 up 6:17, 0 users, load average: 0.00, 0.04, 0.05
     dilbert5.electric-cloud.com output:
     16:02:45 up 6:17, 0 users, load average: 0.00, 0.00, 0.01

In the next examples, clusterexec is used to gather various statistics on an Accelerator host cluster.

  1. Find out how long hosts have been up on a Windows cluster:

     C:\> clusterexec --cm=win-cm "net statistics workstation"

     The output would be similar to:

     WIN2 output:
     Workstation Statistics for \\
     Statistics since 10/25/2005 3:54 PM
     . . .
     The command completed successfully.
     WIN1 output:
     Workstation Statistics for \\
     Statistics since 10/25/2005 4:02 PM
     . . .
     The command completed successfully.

  2. Get the last 5 lines of agent log files on host1:

     % clusterexec --hosts=host1 -s "tail -5 /var/log/ecagent*.log"

     The result would be:

     host1 output:
     ==> /var/log/ecagent1.log <==
     * -numagents 2
     * -version
     * -webport 8001
     -------------------------------------------------------
     ==> /var/log/ecagent2.log <==
     -numagents 2
     * -version
     * -webport 2421
     -------------------------------------------------------

     In the example above:

      • To obtain the last 5 lines of log files for all agents, an sh invocation is necessary. Use the wildcard to achieve this result.

      • To use the wildcard, a shell must perform the expansion; clusterexec does not automatically invoke a shell in which to run commands.

      • clusterexec does not handle pipelines. For pipelines, the -s option should be used, for example: clusterexec --hosts=host1 -s "ps -ef | grep agent"

By default, clusterexec stops running commands after the first failure, for example:

clusterexec --hosts="lin1" "badCmd.sh; /bin/echo hi"

will not run the echo command on lin1 if badCmd.sh returns a non-zero exit code.
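The stop-on-first-failure behavior, and the effect of -k, can be modeled with a short local simulation. The following Python sketch runs commands on the local machine only and is not how clusterexec is implemented; the function name run_sequence is hypothetical.

```python
import subprocess
import sys


def run_sequence(commands, keep_going=False):
    """Run argument lists in order and return the exit codes of those that ran.

    Without keep_going, the sequence stops after the first non-zero
    exit code, mirroring the default behavior described above.
    """
    codes = []
    for argv in commands:
        code = subprocess.run(argv).returncode
        codes.append(code)
        if code != 0 and not keep_going:
            break
    return codes


# Stand-ins for badCmd.sh and /bin/echo hi that work on any platform.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
passing = [sys.executable, "-c", "print('hi')"]

print(run_sequence([failing, passing]))                   # second command is skipped
print(run_sequence([failing, passing], keep_going=True))  # both commands run
```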

A simple example for a Windows cluster download:

clusterdownload --cm=mycm c:/ECloud/ecagent*log c:/tmp

This example downloads all agent log files in c:\ECloud from all agent hosts in the cluster to directory /tmp. Suppose your cluster has host1 and host2, and each host has 2 agents. On the host where you ran clusterdownload, the result is:

/tmp/host1/ecagent1.log
/tmp/host1/ecagent2.log
/tmp/host2/ecagent1.log
/tmp/host2/ecagent2.log
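The per-host layout in the listing above (one subdirectory per host under the target directory) can be sketched as follows. This Python example only reproduces the layout shown for a non-recursive download. The function name is hypothetical, and the layout produced with -r is not covered.

```python
def local_paths(target_dir, hosts, remote_files):
    """List where each downloaded file would land: <target>/<host>/<filename>."""
    paths = []
    for host in hosts:
        for remote in remote_files:
            # Keep only the filename; accept both / and \ as separators.
            name = remote.replace("\\", "/").rsplit("/", 1)[-1]
            paths.append(f"{target_dir.rstrip('/')}/{host}/{name}")
    return paths


for path in local_paths("/tmp", ["host1", "host2"],
                        ["c:/ECloud/ecagent1.log", "c:/ECloud/ecagent2.log"]):
    print(path)
```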

To get agent logs from Solaris or Linux hosts use:

clusterdownload --cm=mycm "/var/log/ecagent*.log" /tmp

If you want to get all the log files under /opt/ecloud, you could use:

clusterdownload --cm=mycm -r "/opt/ecloud/*.log" /tmp

or:

clusterdownload --cm=mycm --include=*.log -r /opt/ecloud /tmp

If wildcard characters are used, quotes are required so the pattern is not expanded first by the shell.

Stopping, Starting, or Restarting the eRunner Daemon or Checking Its Status

Stopping, Starting, or Restarting the Daemon

On Linux and Solaris platforms, you can stop, start, or restart the eRunner daemon by using the following commands:

/etc/init.d/erunner stop
/etc/init.d/erunner start
/etc/init.d/erunner restart

Checking Whether the Daemon is Running

On Linux and Solaris platforms, you can check whether the eRunner daemon is running by using the following command:

/etc/init.d/erunner status

If the daemon is running, the following message appears:

erunner running with PID <PROCESS_ID>

If the daemon is not running, the following message appears:

erunner is not running
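If you monitor the daemon from a script, the two status messages above can be parsed as follows. This Python sketch assumes that the messages appear exactly as shown; the function name parse_erunner_status is hypothetical.

```python
import re

# Matches the "running" message shown above and captures the process ID.
_RUNNING = re.compile(r"erunner running with PID (\d+)")


def parse_erunner_status(output):
    """Return the PID if the daemon is running, or None if it is not.

    Raises ValueError when the text matches neither documented message.
    """
    match = _RUNNING.search(output)
    if match:
        return int(match.group(1))
    if "erunner is not running" in output:
        return None
    raise ValueError("unrecognized status output: " + repr(output))


print(parse_erunner_status("erunner running with PID 1234"))
print(parse_erunner_status("erunner is not running"))
```

You could feed this function the captured output of /etc/init.d/erunner status on a Linux or Solaris host.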