CloudBees Accelerator Ledger File


Traditional Make facilities rely exclusively on a comparison of file system timestamps to determine whether a target is up-to-date. More specifically, an existing target is considered out-of-date if any of its inputs has a “last-modified” timestamp later than that of the target.

For typical interactive development, this scheme is adequate: As a developer makes changes to source files, their modification timestamps are updated, which signals Make that dependent targets must be rebuilt. There is, however, a class of workflow styles that cause file timestamps to move arbitrarily into the past or future, and therefore circumvent Make’s ability to correctly rebuild targets.

Two common examples are:

  • Using a version control system that preserves timestamps on checkout (also known as “sync” or “update”).

The default mode for most source control systems is to set the last-modified timestamp of every file updated in a checkout or sync operation to the current date and time. If you change this behavior to preserve timestamps (or if your tool's default mode is to preserve them), then updating your source files can result in modified contents with a timestamp in the past (typically, the time of the checkin).

  • Using file or directory synchronization tools (even simple recursive directory copies) to keep files updated against some other repository.

Here again, while it is easy to modify source file content, the timestamp for modifications might be any of several possibilities: time of copy, last-modified time of source, last-modified time of destination, and so on.

The Problem

Whenever the contents of a source file are modified, we would like the Make system to rebuild any dependent objects. However, because the timestamps of modified files are not set reliably, Make might or might not force a target update. Here is an example Makefile:

foo.o: foo.c
	gcc -c foo.c
foo.o: foo.h

Suppose a build is run without an existing foo.o object:

% make
gcc -c foo.c
% ls -lt
total 4
-rw-r--r-- 1 jdoe None 21 May 29 13:50 foo.o
-rw-r--r-- 1 jdoe None 21 May 29 13:50 foo.c
-rw-r--r-- 1 jdoe None 20 Apr 25 17:34 foo.h
-rw-r--r-- 1 jdoe None 41 Jan 19 09:27 Makefile

The foo.o target is updated. Next, suppose we ask our source control system to update the working directory, and it responds by giving us a newer copy of foo.h, one that is several weeks newer than what we have, and that timestamp is preserved:

% <sync>
% ls -lt
total 4
-rw-r--r-- 1 jdoe None 21 May 29 13:50 foo.o
-rw-r--r-- 1 jdoe None 21 May 29 13:50 foo.c
-rw-r--r-- 1 jdoe None 29 May 17 11:21 foo.h   <-- notice timestamp change
-rw-r--r-- 1 jdoe None 41 Jan 19 09:27 Makefile

Traditional Make programs (here, GNU Make) will not notice the change, because the new timestamp of foo.h (May 17) is still earlier than that of foo.o (May 29), and will incorrectly report that the target is up-to-date.

% make
make: `foo.o' is up to date.
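The failure above can be reproduced outside of Make. The following is a minimal Python sketch, not GNU Make's actual implementation: the helper names `make_thinks_stale` and `write` and the epoch values are assumptions chosen for illustration. It forces file timestamps with `os.utime` to imitate a sync that preserves checkin times.

```python
import os
import tempfile

def make_thinks_stale(target, inputs):
    """Classic rule: stale if the target is missing or any input is newer."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in inputs)

def write(path, text, mtime):
    """Create a file with the given content and a forced last-modified time."""
    with open(path, "w") as f:
        f.write(text)
    os.utime(path, (mtime, mtime))

with tempfile.TemporaryDirectory() as d:
    foo_c, foo_h, foo_o = (os.path.join(d, n) for n in ("foo.c", "foo.h", "foo.o"))
    build_time = 1_000_000_000  # arbitrary epoch seconds for the first build

    write(foo_h, "#define N 1\n", build_time - 3_000_000)  # old header
    write(foo_c, "int x;\n", build_time)
    write(foo_o, "object\n", build_time)
    stale_before_sync = make_thinks_stale(foo_o, [foo_c, foo_h])

    # "Sync": the header content changes and its preserved timestamp moves
    # forward, but it is still older than foo.o.
    write(foo_h, "#define N 2 /* changed */\n", build_time - 1_000_000)
    stale_after_sync = make_thinks_stale(foo_o, [foo_c, foo_h])

print(stale_before_sync, stale_after_sync)
```

Both checks report that foo.o is up to date, even though the content of foo.h changed between them.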

Some Make facilities (notably, Rational 'clearmake' in conjunction with Rational ClearCase) have the ability to track timestamp information because they are integrated with the source control system.

The eMake Solution

eMake solves this problem at the file level, completely independent of the source control system, by keeping a separate database of inputs and outputs called a ledger. To use the Ledger, you specify which file aspects to check for changes when considering a rebuild. To do so, use the --emake-ledger=<valuelist> command-line switch (or the EMAKE_LEDGER environment variable). <valuelist> is a comma-separated list that includes one or more of: timestamp, size, command, nobackup, nonlocal, and unknown. For more information, see Ledger options in eMake Command-Line Options, Variables, and Configuration File.

  • timestamp – Any timestamp change to either the target or an explicitly declared dependency, regardless of how it relates to the last-modified time of the target input file, triggers a target rebuild.

  • size – Any size change, regardless of the timestamp in the input file, triggers a target rebuild.

  • command – Records the text of the command used to create the target. If the makefile or its variables change in a way that alters this text, using command rebuilds the target. Important caveat: If you initialize a variable using the $(shell) function, be careful to use a ':=' assignment so that the function is not re-evaluated every time the variable is referenced. Variables assigned with ':=' are simply expanded, which means they are expanded once, immediately upon reading the line.

  • nobackup – Suppresses the automatic backup of the ledger file before its use.

  • nonlocal – Instructs eMake to operate on the ledger file in its current location, even if it is on a network volume. By default, if the file specified by --emake-ledgerfile (emake.ledger in the current working directory, by default) is not on a local disk, eMake copies that file (if it already exists) to the system temporary directory and opens the copy, then copies it back to the specified location when the build is complete.

    Using nonlocal removes a safety and might cause problems if the non-local file system has issues with memory-mapped I/O (IBM Rational ClearCase MVFS is known to have issues with memory-mapped I/O). If you are confident that you will get efficient and reliable memory-mapped I/O performance from the non-local file system, you can remove the safety for improved efficiency because eMake does not spend time at startup and shutdown copying ledger files. CloudBees strongly recommends against using nonlocal with ClearCase dynamic views. CloudBees does not support Ledger-related problems that occur when nonlocal is used in conjunction with the MVFS.

  • unknown – Specifies that the Ledger feature considers a target to be out of date if the Ledger database contains no entry for the target.
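To make the timestamp, size, command, and unknown options more concrete, here is a conceptual Python sketch of a ledger-style check. It is an illustration only and does not reflect eMake's actual ledger format or algorithm: the in-memory dictionary, the function names `snapshot`, `record`, and `ledger_says_rebuild`, and the uniform treatment of the target and its inputs are all assumptions.

```python
import os
import tempfile

def snapshot(path, aspects):
    """Record only the requested aspects of one file."""
    st = os.stat(path)
    snap = {}
    if "timestamp" in aspects:
        snap["timestamp"] = st.st_mtime_ns
    if "size" in aspects:
        snap["size"] = st.st_size
    return snap

def record(ledger, target, inputs, command, aspects):
    """Store the state observed when the target was built."""
    entry = {"files": {p: snapshot(p, aspects) for p in [target, *inputs]}}
    if "command" in aspects:
        entry["command"] = command
    ledger[target] = entry

def ledger_says_rebuild(ledger, target, inputs, command, aspects):
    """Return True if any recorded aspect differs from the current state."""
    entry = ledger.get(target)
    if entry is None:
        # No entry: only "unknown" turns this into a rebuild.
        return "unknown" in aspects
    if "command" in aspects and entry.get("command") != command:
        return True
    for p in [target, *inputs]:
        old = entry["files"].get(p)
        if old is not None and old != snapshot(p, aspects):
            return True  # any change counts, earlier or later
    return False

with tempfile.TemporaryDirectory() as d:
    foo_h, foo_o = os.path.join(d, "foo.h"), os.path.join(d, "foo.o")
    for path, text in ((foo_h, "#define N 1\n"), (foo_o, "object\n")):
        with open(path, "w") as f:
            f.write(text)
    os.utime(foo_h, (900_000_000, 900_000_000))
    os.utime(foo_o, (1_000_000_000, 1_000_000_000))

    cmd, aspects, ledger = "gcc -c foo.c", {"timestamp", "size"}, {}
    no_entry_unknown = ledger_says_rebuild(ledger, foo_o, [foo_h], cmd, aspects | {"unknown"})
    record(ledger, foo_o, [foo_h], cmd, aspects)
    unchanged = ledger_says_rebuild(ledger, foo_o, [foo_h], cmd, aspects)

    # Preserved-timestamp sync: foo.h moves forward but stays older than foo.o.
    os.utime(foo_h, (950_000_000, 950_000_000))
    after_sync = ledger_says_rebuild(ledger, foo_o, [foo_h], cmd, aspects)

print(no_entry_unknown, unchanged, after_sync)
```

The key design point is that the comparison is against the previously recorded state of each file, not against the target's timestamp, so a change is detected even when the new timestamp is still in the past.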

In the foo.o example from The Problem section, the Ledger can detect that a rebuild is necessary because the timestamp of foo.h changed. If the original build had been run with the Ledger enabled, the sequence would be:

% emake --emake-ledger=timestamp
gcc -c foo.c
% <sync>   <-- notice timestamp change
% emake --emake-ledger=timestamp
gcc -c foo.c

In the second build, eMake consults the Ledger and concludes that the target needs to be rebuilt.

Important Notes for the Ledger Feature

  • The Ledger feature works by comparing an earlier input state with the current state. If the Ledger has no information about a particular input (for example, during the first build after the input was added to a makefile), that input does not contribute to the up-to-dateness check.

  • Only one Ledger is used per build.

  • The default ledger file is called emake.ledger. You can change it with the --emake-ledgerfile=<path> command-line option or the EMAKE_LEDGERFILE=<path> environment variable.

  • If you specify --emake-ledgerfile=<path> but not --emake-ledger=<valuelist>, the Ledger still hashes the file names, so the Ledger is triggered when the file name order changes or a file is added or removed.

  • The Ledger automatically backs up the ledger file before using it. This ensures a non-corrupt file is available. If the ledger file is large, copying it could take some time on incremental builds. The ledger option, nobackup, suppresses the backup.

  • The Ledger works for local builds and for builds that use a cluster, as well as for local submakes in a runlocal job (see Running a Local Job on the Make Machine).

  • It is not possible, however, to share a Ledger between top-level make instances and local-mode submakes running on the cluster. See EMAKE_BUILD_MODE=local in eMake Command-Line Options, Variables, and Configuration File.

  • eMake consults Ledger information to trigger a rebuild only when a target would otherwise be considered up-to-date. Information in the Ledger never prevents a target from being rebuilt.

  • In a GNU Make emulation, the Ledger feature changes the meaning of the '$?' automatic variable to be synonymous with '$^' (all prerequisites, regardless of up-to-dateness).

  • You cannot change Ledger options for a particular ledger file—you must use the same combination of timestamp, size, and command that was used to create the ledger file.

  • If you turn on --emake-ledger and --emake-autodepend at the same time, the Ledger keeps track of both implicit and explicit dependencies. This feature is comparable to using ClearMake under ClearCase, but is independent of ClearCase information records.

  • Order-only prerequisites, in keeping with their semantic meaning, never affect Ledger behavior.

  • Because the Ledger automatically rebuilds a target when there is no existing entry in the ledger file, a build that is using the Ledger for the first time might take longer than expected.
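Two of the notes above lend themselves to a short illustration: the Ledger can only add rebuilds, never suppress one, and hashing the list of file names detects reordering, additions, and removals. The Python sketch below shows one way such logic could look. The function names and the choice of SHA-256 are assumptions for illustration and say nothing about how eMake implements these checks.

```python
import hashlib

def input_list_digest(inputs):
    """Hash the ordered input names; reordering, adding, or removing
    a name produces a different digest."""
    h = hashlib.sha256()
    for name in inputs:
        h.update(name.encode("utf-8"))
        h.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    return h.hexdigest()

def needs_rebuild(stale_by_timestamp, ledger_detects_change):
    """Ledger information is consulted only as an additional trigger:
    a target that is already stale is rebuilt regardless."""
    return stale_by_timestamp or ledger_detects_change

original = input_list_digest(["foo.c", "foo.h"])
reordered = input_list_digest(["foo.h", "foo.c"])
extended = input_list_digest(["foo.c", "foo.h", "bar.h"])

print(original != reordered, original != extended)
print(needs_rebuild(True, False), needs_rebuild(False, True), needs_rebuild(False, False))
```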