Job Caching


JobCache is a feature that can substantially reduce compilation time. JobCache lets a build avoid recompiling object files that it previously built, if their inputs have not changed. JobCache works even after you clean the build output tree (for example, by using “make clean”). By caching and reusing object files, JobCache can significantly speed up full builds.

JobCache uses cache “slots.” When JobCache is enabled, eMake maintains a slot for each combination of command line options, relevant environment variable assignments, and current working directory. A slot can be empty or can hold a previously-cached result. If the appropriate slot holds an up-to-date result, a cache “hit” occurs, and compilation is avoided.
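The slot concept can be illustrated with a minimal sketch. It assumes, purely for illustration, that a slot identifier is a hash over the command line, the relevant environment variables, and the working directory. The function name `slot_id`, the choice of MD5, and the way the inputs are joined are all hypothetical; this document does not describe the algorithm that eMake actually uses.

```python
import hashlib

def slot_id(argv, env, cwd, relevant_vars):
    """Derive a hypothetical slot identifier from the three slot inputs."""
    # Keep only the variables treated as relevant, sorted so that the
    # result does not depend on dictionary ordering.
    env_part = [f"{name}={env[name]}" for name in sorted(relevant_vars) if name in env]
    # Label each section so that a value cannot drift between sections.
    material = "\0".join(["cwd", cwd, "argv", *argv, "env", *env_part])
    # MD5 is an arbitrary choice here (it yields 32 hex digits).
    return hashlib.md5(material.encode("utf-8")).hexdigest()

env = {"C_INCLUDE_PATH": "/opt/inc", "TERM": "xterm"}
optimized = slot_id(["gcc", "-c", "-O2", "foo.c"], env, "src", ["C_INCLUDE_PATH"])
unoptimized = slot_id(["gcc", "-c", "-O0", "foo.c"], env, "src", ["C_INCLUDE_PATH"])
# Changing a variable that is not treated as relevant keeps the same slot.
other_term = slot_id(["gcc", "-c", "-O2", "foo.c"], {**env, "TERM": "vt100"}, "src", ["C_INCLUDE_PATH"])
```

In this sketch, a different compiler option selects a different slot, while a change to an irrelevant variable does not.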

A cached result becomes obsolete if eMake detects file system changes that might have caused a different result (with the same command line options, environment variable assignments, and current working directory). Such file system changes include changes to any file that is read during compilation, which means all source files, gcc precompiled headers, and compilation tools included in the eMake root.

CloudBees recommends that all components (Cluster Manager, Electric Agent/EFS, and eMake) on all machines in the cluster are upgraded to the latest version. However, you can still use JobCache with Agents on Electric Agent/EFS machines running versions as old as 7.2, as long as an appropriate backward compatibility package (BCP) is installed on those Electric Agent/EFS machines. For more information, see Installing the Backward-Compatibility Package on Agent Machines.

Benefits

  • Speeds long, full builds (for example, when you do a “make clean” then a “make,” or when you run a build in a new workspace)

  • Builds faster than ccache

  • Avoids certain false cache hits that might occur when you use ccache

  • Uses intelligent hashing for some types of files to avoid spurious cache misses because of changes in unimportant segments of those files.

    For example, JobCache ignores the .NT_GNU_BUILD_ID tag in ELF executables and libraries, and input file timestamps in byte-compiled Python scripts.
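The idea of hashing a file while ignoring an unimportant segment can be sketched as follows. This is not eMake's implementation. It assumes the timestamp-based .pyc header layout used by CPython 3.7 and later (4-byte magic number, 4-byte flags, 4-byte source modification time, 4-byte source size), and the function name `pyc_content_hash` is hypothetical.

```python
import hashlib

# Assumed header layout: magic(4) flags(4) mtime(4) size(4), then the code.
PYC_MTIME_FIELD = slice(8, 12)

def pyc_content_hash(data: bytes) -> str:
    """Hash a .pyc image with its source-timestamp field zeroed out."""
    if len(data) < 16:
        raise ValueError("too short to contain a .pyc header")
    masked = bytearray(data)
    masked[PYC_MTIME_FIELD] = b"\0\0\0\0"
    return hashlib.sha256(bytes(masked)).hexdigest()

# Two synthetic images that differ only in the timestamp field.
magic_and_flags = b"\x42\x0d\x0d\x0a" + b"\0\0\0\0"
size_and_code = b"\x10\x00\x00\x00" + b"<marshalled code>"
first = pyc_content_hash(magic_and_flags + b"\x01\x02\x03\x04" + size_and_code)
second = pyc_content_hash(magic_and_flags + b"\x05\x06\x07\x08" + size_and_code)
```

Both images produce the same hash, so a rebuild that changes only the recorded timestamp would not register as a changed input in this sketch.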

Limitations

  • JobCache does not cache results from jobs that invoke eMake. JobCache stores results from particular jobs, but if a rule includes an invocation of eMake (which might spawn other jobs), then that job is not cached, as in the following example:

    gcc -o foo.o foo.c && $(MAKE) -C subdir

  • The date and time in an object file will still reflect the original compilation (not the current date or time) if you use a C preprocessor macro that expands to the date or time when the compiler runs, and CloudBees Accelerator re-uses the resulting object file in a subsequent build.

  • Source paths embedded in debugging information in object files will reflect the original compilation (even if you re-use the object file while building in a different workspace).

  • JobCache is incompatible with incremental builds and should be used only for full builds.

Supported Tools

JobCache supports the following tools:

| Tool | Supported Platforms | Files Cached | Notes |
|---|---|---|---|
| GNU ar or GNU ar-compatible archivers | Linux | All .a and .la | GNU ar-compatible archivers are those that use the same command-line convention and environment variable sensitivity as GNU ar. |
| cl (Microsoft Visual C/C++) | Windows | All .obj | Should not be used to cache results from linking. The only supported debug option is /Z7. |
| clang and clang++ | Linux and Windows | All .o and .lo | Should not be used to cache results from linking. |
| gcc and g++ | Linux and Windows | All .o and .lo | Should not be used to cache results from linking. |
| Java Android Compiler Kit (Jack) | Linux | All .jack and .dex | |
| javac | Linux | All .jar | |
| GNU ld or GNU ld-compatible linkers | Linux | All .so | GNU ld-compatible linkers are those that use the same command-line convention and environment variable sensitivity as GNU ld. |
| Metalava | Linux | All .srcjar and .timestamp jobs that run Metalava | |

When you use --emake-jobcache= to enable JobCache (rather than using #pragma jobcache), eMake ignores trailing digits and version numbers in filename extensions (such as 29 in framework.jar29 and .8.3.0 in libcryptopp.so.8.3.0) so that it can use those extensions to try to determine the type of JobCache to apply to a job.
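The extension handling described above can be approximated with a short sketch. The regular expression and the function name `effective_extension` are assumptions made for illustration. eMake's actual matching rules are not documented here and might differ for edge cases.

```python
import re

# Trailing digits and dotted version numbers, such as "29" or ".8.3.0".
_VERSION_SUFFIX = re.compile(r"[.\d]*\d$")

def effective_extension(filename: str) -> str:
    """Return the extension that remains after the version suffix is dropped."""
    stripped = _VERSION_SUFFIX.sub("", filename)
    dot = stripped.rfind(".")
    return stripped[dot:] if dot != -1 else ""

jar_ext = effective_extension("framework.jar29")
so_ext = effective_extension("libcryptopp.so.8.3.0")
obj_ext = effective_extension("foo.o")
```

With the two file names from the paragraph above, this sketch yields .jar and .so.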

Running a “Learning” Build to Populate the Cache

You must first populate the cache by running a “learning” build with JobCache enabled. Because the cache is empty during the learning build, JobCache only saves new results to the cache. For subsequent builds, JobCache re-uses cached results and saves new results to the cache as appropriate. If you do not enable JobCache, then the job cache is not accessed.

Extending JobCache to Teams Via a Shared Cache

The Shared JobCache feature extends JobCache to teams of developers by using a shared cache. As with non-shared JobCache, Shared JobCache accelerates builds by reusing outputs from a build in the next build, which avoids costly redundant work across builds. Shared JobCache extends this concept by giving developers read-only access to a cache that was previously populated by another user or process (such as a nightly build). With this feature enabled, only one user must actually run the compilations; other team members simply reuse the output from that “golden” build.

A shared cache gives JobCache two tiers: shared and private (“local”). The shared cache strictly augments the traditional (“local”) cache but does not replace it.

Shared JobCache tries to find matching cache entries in the following order:

  1. During build execution with Shared JobCache enabled, eMake tries to find matching cache entries in the shared asset directory. This directory is specified by the --emake-shared-assetdir option. For example, --emake-shared-assetdir=/net/nightlybuild/.

    Developers can never modify cache entries in the shared asset directory.

  2. If there is no matching slot for a job in the shared asset directory, or if the input files for the job differ, eMake uses the developer’s local cache instead by checking for a hit there. This cache is specified by the --emake-assetdir=<directory> option. For example, --emake-assetdir=/net/home/bob/.

  3. If there is no local cache match, eMake creates a slot in the developer’s local cache if needed. The developer will get a hit in their local cache during the next compilation.
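The lookup order above can be modeled with a small sketch. It is a simplification under stated assumptions: each cache is represented as a dictionary that maps a slot identifier to a single digest of the job's inputs, and the names `lookup`, `shared`, and `local` are hypothetical. The real caches are directories on disk.

```python
from typing import Dict, Optional, Tuple

def lookup(slot: str, inputs_digest: str,
           shared: Dict[str, str], local: Dict[str, str]) -> Tuple[str, Optional[str]]:
    """Return (status, source) for one job, following the order listed above."""
    # 1. Try the shared cache first. It is read-only, so it is never updated.
    if shared.get(slot) == inputs_digest:
        return "hit", "shared"
    # 2. Fall back to the developer's local cache.
    previous = local.get(slot)
    if previous == inputs_digest:
        return "hit", "local"
    # 3. No match: store the result locally so that the next build can hit.
    local[slot] = inputs_digest
    return ("newslot" if previous is None else "miss"), None

shared_cache = {"slot-a": "digest-1"}
local_cache: Dict[str, str] = {}
from_shared = lookup("slot-a", "digest-1", shared_cache, local_cache)
first_local = lookup("slot-a", "digest-2", shared_cache, local_cache)
second_local = lookup("slot-a", "digest-2", shared_cache, local_cache)
```

The second and third calls model a developer whose inputs differ from the shared build: the first compilation populates the local cache, and the next one hits there.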

Prerequisites

Shared JobCache requires all participating developers to have access to a shared file system (such as NFS) where the shared asset directory will reside.

Populating the Shared Cache

For the builds that populate the shared cache, use the following eMake options:

--emake-jobcache=<types> --emake-assetdir=<assetdir>

Using the Shared Cache

Developer builds use the following eMake options:

--emake-jobcache=<types> --emake-shared-assetdir=<assetdir>

where <assetdir> is the asset directory of the populated cache.

Configuring JobCache

Licensing

JobCache is licensed based on the maximum number of builds that may use it simultaneously. This number is read from the jobcacheMax property in the Accelerator license file. Simultaneous builds beyond this number run without JobCache.

If the JobCache license entry is invalid, or if the number of simultaneous builds has exceeded the license limit, a WARNING EC1181: Your license does not permit object caching message appears when a build tries to use JobCache. eMake will continue to work normally.

Choosing a Disk for the Job Cache

To estimate the disk space required for the job cache, add the sizes of your object files together and multiply by 0.7. A specific example is that the Android KitKat Open Source Project (AOSP) requires about 4 GB. For best performance, choose a disk that is local to the eMake client host. For Shared JobCache, users must have access to a shared file system such as NFS.
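The estimate above can be computed with a short script. This is a sketch only: the function name `estimate_cache_bytes` and the default extension list are assumptions, and the temporary directory stands in for a real build output tree.

```python
import os
import tempfile

def estimate_cache_bytes(build_dir, extensions=(".o", ".obj")):
    """Sum the sizes of object files under build_dir and multiply by 0.7."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(build_dir):
        for name in filenames:
            if name.endswith(extensions):
                total += os.path.getsize(os.path.join(dirpath, name))
    # Integer arithmetic for the 0.7 factor avoids floating-point rounding.
    return total * 7 // 10

# Demonstration on a throwaway directory with two object files.
with tempfile.TemporaryDirectory() as workspace:
    for name, size in (("a.o", 600), ("b.o", 400), ("notes.txt", 5000)):
        with open(os.path.join(workspace, name), "wb") as handle:
            handle.write(b"\0" * size)
    estimate = estimate_cache_bytes(workspace)
```

To use it on a real build, pass the root of the build output tree instead of the temporary directory.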

You can use the --emake-assetdir= eMake option to specify the directory for your job cache. The default name of this directory is .emake. By default, this directory is in the working directory in which eMake is invoked. (This option also determines the cache location for the parse avoidance feature and the location of the saved dependency information for the dependency optimization feature.)

Building Multiple Branches

To maximize your cache hits, use the --emake-assetdir= option to specify a separate asset directory for each branch of code that you build.

Setting the eMake Root

JobCache does not detect changes to compilation inputs that are not under your eMake root. You must ensure that your eMake root contains all sources and tools that might change.

Job Caching for Non-cl Tools

These tools are: ar, clang, clang-cl, gcc, Jack, javac, ld, and Metalava. For information about cl, see Job Caching for cl.

Enabling JobCache for All Make Invocations in a Build

To enable JobCache for all make invocations in a build, use the --emake-jobcache=<types> eMake option, where <types> is a comma-separated list of any combination of ar, clang, clang-cl, gcc, jack, javac, ld, metalava. The list cannot contain spaces. (As an alternative, you can use --emake-jobcache=all as a shortcut to cache all files.) The --emake-jobcache=<type> option works for recursive and nonrecursive builds.

  • --emake-jobcache=clang is an alias for --emake-jobcache=gcc. In the eMake annotation file, the JobCache type appears as jobcache type="gcc".

  • --emake-jobcache=clang-cl is an alias for --emake-jobcache=cl. In the eMake annotation file, the JobCache type appears as jobcache type="cl".

Following is an example command that enables JobCache for gcc, Jack, and javac:

% emake --emake-cm=mycm --emake-root=/src/mysource --emake-jobcache=gcc,jack,javac --emake-annodetail=basic
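If a wrapper script assembles the eMake command line, a small helper can enforce the list format described above. The helper name `jobcache_option` is hypothetical, and the set of accepted names is taken from the types listed in this document. Lowercasing the input is a convenience of this sketch, not a documented eMake behavior.

```python
# Type names taken from this document, plus the "all" shortcut handled below.
VALID_TYPES = {"ar", "clang", "clang-cl", "cl", "gcc", "jack", "javac", "ld", "metalava"}

def jobcache_option(types):
    """Build an --emake-jobcache= argument from a list of type names."""
    names = [t.strip().lower() for t in types]
    if names == ["all"]:
        return "--emake-jobcache=all"
    unknown = [n for n in names if n not in VALID_TYPES]
    if unknown:
        raise ValueError("unsupported JobCache type(s): " + ", ".join(unknown))
    # Comma-separated with no spaces, as the option requires.
    return "--emake-jobcache=" + ",".join(names)

option = jobcache_option(["gcc", "Jack", "javac"])
```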

If some of your makefile targets are built by rules that do not invoke ar, gcc/g++, clang/clang++, Jack, javac, ld, or Metalava, and those rules do not behave enough like those tools for job caching to be suitable, then use the jobcache pragma to be more selective about which jobs are cached, either by enabling JobCache for fewer jobs or by selectively disabling it for particular jobs with #pragma jobcache none. For example, the rules that build a particular “.o” file might use an environment variable that is not used by gcc/g++, and you might want to miss the cache if that environment variable changes its value.

On Linux, Electrify supports JobCache. In most cases, just add --emake-jobcache=all to your Electrify command-line options.

The --emake-annodetail=basic option is not required to invoke eMake, but it is recommended for troubleshooting. For details, see the Troubleshooting section below.

For information about how to use this functionality with Visual Studio, see the Job Caching for cl when Using Visual Studio section.

Using the jobcache pragma with patterns on targets in a pragma addendum file is another way to enable JobCache on all make invocations in a build. For details, see Using Patterns in a Pragma Addendum File.

Using the Response File Command Line Option for Cacheable GNU Tools (gcc, ld, and ar)

JobCache supports the response file (@<file>) command-line option for all JobCache-supported GNU tools (gcc, ld, and ar). This option reads command-line options from a separate file specified by <file>.

Enabling JobCache for All Object File Targets in a Make Invocation

If you do not want to enable JobCache for all make invocations in a build, you can still enable JobCache for specific object file targets by applying the jobcache pragma inside the makefile. To enable the feature for all targets with a particular extension, use the following:

#pragma jobcache <type> -exist *
%.<object_file_extension> :

where <type> is ar, clang, clang-cl, gcc, jack, javac, ld, or metalava.

Following is an example makefile excerpt with the feature enabled for all targets ending in “.o”:

...
#pragma jobcache gcc -exist *
%.o :
...

(You can apply the pragma to a pattern, but not to a suffix rule.) If cache misses occur because an insignificant file exists or does not exist, you can delete -exist *, and eMake will check only the existence of files whose names suggest that they are source files or gcc precompiled headers. You can augment the set of files that matters to eMake by adding -exist options that specify the desired glob patterns.

If you do not want to add the pragma to the main makefile, you can add it to an “addendum” makefile. This is a small makefile that you include in the eMake invocation from the command line. For example, you could create a makefile named jobcache.mak and then add -f Makefile -f jobcache.mak to your eMake invocation. An addendum makefile is useful when you do not control the content of your makefiles (such as when you include open source components that use build tools such as configure, CMake, or qmake to generate makefiles).

Enabling or Disabling JobCache for Specific Object File Targets

To enable JobCache only for some object file targets, do one of the following:

  • If an object file is built by an explicit rule (one without “%” in it), you can enable JobCache for that rule by inserting a pragma just above it in the makefile. You can also use an addendum makefile in this case. For example, when using ar, gcc/g++, clang/clang++, Jack, javac, ld, or Metalava to cache an explicit rule producing target.o, you can use this addendum:

    #pragma jobcache <type> -exist *
    target.<object_file_extension> :

    For example, to enable JobCache using gcc only for some object file targets:

    #pragma jobcache gcc -exist *
    target.o :

  • You can use narrower target patterns than “%.o” after a jobcache pragma. For example, “abc%.o” applies the pragma to “abcdef.o” but not to “xyzdef.o”. (This is true even if the rule to build “abcdef.o” appears elsewhere in the makefile.)

Similarly, you can explicitly disable caching by using the #pragma jobcache none pragma.

If jobcache pragmas with different options apply to a target, then eMake selects one of them as follows:

  • If the target is built by an explicit rule with a jobcache pragma, then eMake chooses that pragma.

  • Otherwise, eMake chooses the pragma whose pattern has the most in common with the target name.

  • If eMake still encounters ties, then it chooses the pragma that it encountered last during makefile parsing.

For example, the following gcc makefile applies job caching to hello1.o and hello2.o, but not to hello3.o or date2.o:

CC=gcc

all : hello
	./hello

#pragma jobcache gcc -exist *
hello1.o: hello1.c hello1.h
	$(CC) -c -o $@ $< $(CFLAGS)

hello2.o: hello2.c hello2.h
	$(CC) -c -o $@ $< $(CFLAGS)

# Uses __DATE__ to record when build occurred--do not cache.
#pragma jobcache none
date2.o: date2.c date2.h
	$(CC) -c -o $@ $< $(CFLAGS)

hello3.o: hello3.c hello3.h
	$(CC) -c -o $@ $< $(CFLAGS)

hello: hello1.o hello2.o hello3.o
	$(CC) -o $@ $^

clean:
	rm hello hello1.o hello2.o hello3.o

.PHONY: all clean

#pragma jobcache gcc -exist *
%2.o :
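The selection rules can be expressed as a short sketch and checked against the targets in this makefile. Two points are assumptions: “most in common” is interpreted here as the pattern with the most literal (non-“%”) characters, and the names `pattern_matches` and `choose_pragma` are hypothetical. eMake's exact scoring is not specified in this document.

```python
def pattern_matches(pattern, target):
    """Match a make-style pattern in which "%" stands for a non-empty string."""
    prefix, _, suffix = pattern.partition("%")
    return (len(target) > len(prefix) + len(suffix)
            and target.startswith(prefix) and target.endswith(suffix))

def choose_pragma(target, pragmas):
    """Pick the jobcache type for target from (pattern, type) pairs in parse order."""
    best = None
    for pattern, cache_type in pragmas:
        if "%" not in pattern:
            if pattern != target:
                continue
            score = float("inf")      # a pragma on an explicit rule always wins
        elif pattern_matches(pattern, target):
            score = len(pattern) - 1  # assumed measure of "most in common"
        else:
            continue
        if best is None or score >= best[0]:  # ">=" lets the last pragma win ties
            best = (score, cache_type)
    return best[1] if best else None

# The three pragmas from the makefile above, in the order they are parsed.
pragmas = [("hello1.o", "gcc"), ("date2.o", "none"), ("%2.o", "gcc")]
choices = {t: choose_pragma(t, pragmas)
           for t in ("hello1.o", "hello2.o", "hello3.o", "date2.o")}
```

Under these assumptions, hello1.o and hello2.o resolve to gcc, date2.o resolves to none because its explicit pragma outranks the %2.o pattern, and hello3.o matches no pragma.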

Job Caching for cl

Enabling cl JobCache for All Make Invocations in a Build

On Windows, setting --emake-jobcache=cl or --emake-jobcache=clang-cl enables JobCache for all jobs that produce .obj files. These are normally created by the Visual C++ compiler (cl.exe). You can set the option on the command line, in an emake.conf file, or in the EMAKEFLAGS environment variable.

When using the Visual Studio IDE extension, add --emake-jobcache=cl or --emake-jobcache=clang-cl to Electrify Options in the command line settings.

For example, if you create a simple C++ Console Application using Visual Studio, it creates two compile jobs for Example1.cpp and stdafx.cpp.

[Image: using Visual Studio with JobCache]

The extension converts this project into a makefile. For example:

".\Example1\Debug\Example1.obj":: ".\Example1\Example1.cpp"
	@cl.exe /Od /Oy- /sdl /D WIN32 /D _DEBUG /D _CONSOLE /D _LIB /D _UNICODE /D UNICODE /EHsc /RTC1 /analyze- /MDd /GS /Zc:wchar_t /Zc:forScope /Yu"stdafx.h" /Fp".\Example1\Debug\Example1.pch" /Fo".\Example1\Debug\Example1.obj" /W3 /WX- /nologo /c /Z7 /Gd /errorReport:none /fp:precise /TP ".\Example1\Example1.cpp"

".\Example1\Debug\Example1.pch" :: ".\Example1\Debug\stdafx.obj"

".\Example1\Debug\stdafx.obj" :: ".\Example1\stdafx.cpp"
	@cl.exe /Od /Oy- /sdl /D WIN32 /D _DEBUG /D _CONSOLE /D _LIB /D _UNICODE /D UNICODE /EHsc /RTC1 /analyze- /MDd /GS /Zc:wchar_t /Zc:forScope /Yc"stdafx.h" /Fp".\Example1\Debug\Example1.pch" /Fo".\Example1\Debug\stdafx.obj" /W3 /WX- /nologo /c /Z7 /Gd /errorReport:none /fp:precise /TP ".\Example1\stdafx.cpp"

When Electrify runs with JobCache, it stores Example1.obj and stdafx.obj and reuses them in the next build if it gets a “hit” (meaning that it detects no relevant changes since the last build).

Troubleshooting

Enabling Basic Annotation

Basic annotation helps you to resolve your cache hit/miss problems by providing information about whether a cache miss occurred and the cause. Annotation that is related to JobCache is included in basic annotation. Basic annotation is not enabled by default, so you must enable it by using the --emake-annodetail=basic eMake option. By default, the eMake annotation file is created in the working directory in which eMake is invoked; the file is named emake.xml by default.

Interpreting Job Cache Annotation Information

Examine the relevant job XML element in the annotation file for a subelement named jobcache.

Following is an example of a jobcache subelement for a job with a job cache hit. The status attribute indicates whether eMake used a cached object file in that job (whether it had a cache hit), and if not, what else occurred:

...
<job id=...>
  ...
  <jobcache type="gcc" options=" -exist *" slot="6d00a0d9242610a075a98bc9400f8f11" duration="0.155831" status="hit">
    ...
  </jobcache>
  ...
</job>
...

The duration attribute is the duration (in seconds) of the job that populated the cache. If the current build is updating the cache, then the value of duration is the same as the duration of the job containing that jobcache subelement. If the current build is replaying from the cache, then it is the duration of the corresponding job from a previous build whose results were saved into the cache. eMake retrieves the figure from the cache.

Following is an example of a jobcache subelement for a job with a shared job cache hit. Note the src="shared" tag, which appears when Shared JobCache is used:

...
<job id=...>
  ...
  <jobcache type="gcc" options=" -exist *" slot="86e319e5cd837ad78360145fb3a933f1" duration="0.244847" status="hit" src="shared">
    <triggers>
      <trigger type="commandline" option="--emake-jobcache=gcc"/>
    </triggers>
  </jobcache>
  ...
</job>
...

The local cache was used unless src="shared" appears.

Following is an example of a jobcache subelement for a job with a cache miss:

...
<job id=...>
  ...
  <jobcache type="gcc" options=" -exist *" slot="989324a736c583a306630930da80ba1e" duration="0.266177" status="miss">
    ...
    <differences>
      <diff name="/c/src/mysql-5.6.21/include/mysql.h" old="md5:d55ae58fd7eec6055a50fcf6b83af99c" new="md5:071ed8e989b0d0e14d0b0bd96e94cd35"/>
    </differences>
  </jobcache>
  ...
</job>
...

Following is an example of a jobcache subelement for a job for which caching was disabled by using #pragma jobcache none:

...
<job id=...>
  ...
  <jobcache type="none" options="" status="uncacheable">
    <triggers>
      <trigger type="pragma" file="Makefile" line="11" options="none"/>
    </triggers>
  </jobcache>
  ...
</job>
...

Following is a complete list of the possible values of the status attribute:

  • hit —eMake had a cache hit for that job.

  • miss —eMake had a cache miss for that job. Each diff subelement shows a file system input to compilation that differs since the object file was cached and shows the difference that was observed.

  • newslot —A new slot was created. This is because the previously-cached object files were from compilations that used different command-line arguments, environment variables, working directories, or any combination of these. To see the options used for that compilation, see the “key” file for the slot identified by the slot attribute. For example, the key file for slot 2b253a890c9745a0b500d888349ec2e2 has the following path name:

    .emake/cache.16/i686_Linux/2b/25/3a/890c9745a0b500d888349ec2e2/key

    The version number in the cache.<number> directory might vary with your CloudBees Accelerator software version. The key file specifies the relevant environment variables, the working directory, and the command line. To allow cache hits when building in a new workspace, path names are specified relative to your eMake roots. If a target repeatedly has newslot status, get the slot identifiers from two consecutive builds for that same target and compare the key files.

  • uncacheable —Caching was disabled by the #pragma jobcache none pragma, or an error occurred during the update of the relevant cache slot. Examine any ERROR and WARNING messages in the console output from eMake.

  • rootschanged —There is no natural mapping from the old eMake roots to the new eMake roots.

  • unneeded —JobCache was enabled for the job but not needed (because the target was already up to date according to ordinary GNU Make rules). The cache was not consulted, even though caching was requested for that target.
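When an annotation file contains many jobs, a script can tally the status values and collect the inputs that caused misses. The following is a minimal sketch: the function name `summarize_jobcache`, the `build` wrapper element, and the job IDs in the sample are assumptions, while the jobcache and diff attributes follow the examples above. For a large emake.xml file, a streaming parser such as `xml.etree.ElementTree.iterparse` might be a better fit than loading the whole document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def summarize_jobcache(annotation_xml):
    """Count jobcache status values and list the inputs named in diff elements."""
    root = ET.fromstring(annotation_xml)
    statuses = Counter()
    changed_inputs = []
    for jobcache in root.iter("jobcache"):
        statuses[jobcache.get("status", "unknown")] += 1
        for diff in jobcache.iter("diff"):
            changed_inputs.append(diff.get("name"))
    return statuses, changed_inputs

# Well-formed sample assembled from the fragments shown in this section.
sample = """
<build>
  <job id="J1">
    <jobcache type="gcc" options=" -exist *" status="hit"/>
  </job>
  <job id="J2">
    <jobcache type="gcc" options=" -exist *" status="miss">
      <differences>
        <diff name="/c/src/mysql-5.6.21/include/mysql.h" old="md5:d55a" new="md5:071e"/>
      </differences>
    </jobcache>
  </job>
  <job id="J3"/>
</build>
"""
statuses, changed_inputs = summarize_jobcache(sample)
```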

If Job Cache Annotation Information Is Missing

If a particular job element in the annotation file has no XML jobcache subelement, this is because any combination of the following has occurred:

  • The target name for that job does not match the pattern following any jobcache pragma, and if the target is built by an explicit rule, that rule does not follow a jobcache pragma.

  • The intended jobcache pragma is misspelled.

  • Either the --emake-jobcache= command line option was not used, or none of the targets matched.

  • The appropriate licensing is not available to the eMake client.

If eMake Unexpectedly Used a Cached Object File

If eMake should not have used a particular cached object file, check the following:

  • If eMake should have detected a change to a particular file, compare its path to your eMake roots.

  • If a relevant environment variable changed, check whether the key file mentions it (see above), and if it does not, notify CloudBees.
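Checking a key file, or comparing the key files of two slots, can be scripted. The sketch below makes two assumptions. First, the directory layout is inferred from the single example path in this document (three two-character directory levels followed by the rest of the slot identifier), so it might not hold for other versions. Second, the key contents in the sample are made up, because the key file format is not described here. The function names are hypothetical.

```python
import difflib
import posixpath

def key_file_path(assetdir, cache_dir, platform, slot):
    """Build the key file path for a slot, following the example path layout."""
    return posixpath.join(assetdir, cache_dir, platform,
                          slot[0:2], slot[2:4], slot[4:6], slot[6:], "key")

def diff_keys(old_text, new_text):
    """Return a unified diff between the contents of two key files."""
    return list(difflib.unified_diff(old_text.splitlines(), new_text.splitlines(),
                                     "old key", "new key", lineterm=""))

path = key_file_path(".emake", "cache.16", "i686_Linux",
                     "2b253a890c9745a0b500d888349ec2e2")
# Made-up key contents, used only to show the comparison.
changes = diff_keys("CFLAGS=-O2\ncwd=src", "CFLAGS=-O0\ncwd=src")
```

In practice, you would read the two key files from disk and pass their contents to the comparison function.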

Profiling Debug Logging

Annotation files include profiling metrics to help troubleshoot performance issues. These are the same metrics that are in the debug log file when the --emake-debug=g option is set. The metrics appear in annotation whether or not --emake-debug=g is used. The metrics are in the <profile> tag and appear exactly as they do in the debug log file.

CloudBees engineering and support staff use profiling debug logging as well as other information in the eMake debug logs to help troubleshoot problems. For more information about debug logging and log levels, see eMake Debug Log Levels.

Viewing JobCache Metrics

The annotation file includes metrics about job cache activity. Following is an example that lists the metrics. This example shows that Shared JobCache is used:

...
<metrics>
  ...
  <metric name="jobcache.hit.local">382</metric>
  <metric name="jobcache.hit.shared">0</metric>
  <metric name="jobcache.hit">382</metric>
  <metric name="jobcache.miss">666</metric>
  <metric name="jobcache.newslot">98</metric>
  <metric name="jobcache.sharedmiss">0</metric>
  <metric name="jobcache.sharednewslot">0</metric>
  <metric name="jobcache.rootschanged">0</metric>
  <metric name="jobcache.uncacheable">4</metric>
  <metric name="jobcache.unneeded">6</metric>
  <metric name="jobcache.na">3150</metric>
  <metric name="jobcache.workloadsaved">367.393489</metric>
  ...
</metrics>
...

For descriptions of these metrics, see Metrics in Annotation Files.
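A single hit rate can be derived from these metrics with a few lines of code. Note that the definition used here (hits divided by the sum of hits, misses, and new slots) is an assumption chosen for illustration and is not an official CloudBees metric. The function name `jobcache_hit_rate` is hypothetical.

```python
import xml.etree.ElementTree as ET

def jobcache_hit_rate(metrics_xml):
    """Return hits / (hits + misses + new slots) from a metrics element."""
    root = ET.fromstring(metrics_xml)
    values = {m.get("name"): float(m.text) for m in root.iter("metric")
              if m.get("name", "").startswith("jobcache.")}
    lookups = (values["jobcache.hit"] + values["jobcache.miss"]
               + values["jobcache.newslot"])
    return values["jobcache.hit"] / lookups if lookups else 0.0

# Values copied from the metrics example in this section.
sample_metrics = """
<metrics>
  <metric name="jobcache.hit">382</metric>
  <metric name="jobcache.miss">666</metric>
  <metric name="jobcache.newslot">98</metric>
  <metric name="jobcache.workloadsaved">367.393489</metric>
</metrics>
"""
rate = jobcache_hit_rate(sample_metrics)
```

Under this definition, the example build reused a cached result for about one third of the jobs that consulted the cache.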

Moving Your Workspace

If you want to move your workspace, make sure that the new eMake roots correspond to the old eMake roots. Also, because the asset directory defaults to .emake in the current working directory, you must either copy that directory to the new workspace or use --emake-assetdir= to specify an asset directory that you want the two workspaces to share. If you already use --emake-assetdir= to point to an asset directory within your old workspace and also want to move the asset directory, you must update its value to point to the new asset directory location.

Deleting the Cache

In general, content in the cache is not deleted automatically (although it might be replaced by newer content). If the cache grows significantly beyond the size expected for a full build, you can delete the cache to save disk space.

For example, if you change the value of the C_INCLUDE_PATH environment variable, then the cache will grow to contain results for both the old and new values of that variable. In this case, you might want to clear the cache when you permanently change the value of this variable and therefore no longer need the old cache results.

To delete the cache, you delete the <assetdir>/cache.* directories. For example, if you are using the default asset directory on Linux, enter

rm -r .emake/cache.*