Job Caching


JobCache is a feature that can substantially reduce compilation time. JobCache lets a build avoid recompiling object files that it previously built, if their inputs have not changed. JobCache works even after you clean the build output tree (for example, by using “make clean”). By caching and reusing object files, JobCache can significantly speed up full builds.

JobCache uses cache “slots.” When JobCache is enabled, eMake maintains a slot for each combination of command line options, relevant environment variable assignments, and current working directory. A slot can be empty or can hold a previously-cached result. If the appropriate slot holds an up-to-date result, a cache “hit” occurs, and compilation is avoided.
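As a rough mental model only, a slot identifier can be pictured as a digest over the three inputs listed above. The sketch below is not eMake's actual algorithm, which this section does not describe. The function name slot_key, the use of md5sum, and the example paths and values are all illustrative assumptions.

```shell
#!/bin/sh
# Illustrative model only: derive a stable key from the three inputs
# that select a slot. This is NOT eMake's real derivation.
slot_key() {
    # $1 = working directory
    # $2 = command line
    # $3 = relevant environment variable assignments
    printf '%s\n%s\n%s\n' "$1" "$2" "$3" | md5sum | cut -d' ' -f1
}

# The same inputs always give the same key; changing any one input
# (here, the optimization flag) gives a different key, hence a different slot.
slot_key /src/app "gcc -c -O2 foo.c" "C_INCLUDE_PATH=/opt/inc"
slot_key /src/app "gcc -c -O0 foo.c" "C_INCLUDE_PATH=/opt/inc"
```

This is why a change of workspace options, environment, or directory tends to show up as a new slot rather than as a miss in an existing slot.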

A cached result becomes obsolete if eMake detects file system changes that might have caused a different result (with the same command line options, environment variable assignments, and current working directory). Such changes include a change to any file that is read during compilation, which means all source files, gcc precompiled headers, and compilation tools that are under the eMake root.

Electric Cloud recommends that all components (Cluster Manager, Electric Agent/EFS, and eMake) on all machines in the cluster are upgraded to the latest version. However, you can still use JobCache with Agents on Electric Agent/EFS machines running versions as old as 7.2, as long as an appropriate backward compatibility package (BCP) is installed on those Electric Agent/EFS machines. For more information, see the ElectricAccelerator Installation Guide and ElectricAccelerator Configuration Guide.

Benefits

  • Speeds long, full builds (for example, when you do a “make clean” then a “make,” or when you run a build in a new workspace)

  • Builds faster than ccache

  • Avoids certain false cache hits that might occur when you use ccache

  • Uses intelligent hashing for some types of files to avoid spurious cache misses because of changes in unimportant segments of those files.

    For example, JobCache ignores the .NT_GNU_BUILD_ID tag in ELF executables and libraries, and it ignores input file timestamps in byte-compiled Python scripts.

Limitations

  • JobCache does not cache results from compilation jobs that invoke eMake. JobCache stores results from particular compilation jobs, but if a rule includes an invocation of eMake (which might spawn other jobs), then that job is not cached, as in the following example:

    gcc -o foo.o foo.c && $(MAKE) -C subdir

  • If you use a C preprocessor macro that expands to the date or time when the compiler runs (such as __DATE__ or __TIME__), and ElectricAccelerator reuses the resulting object file in a subsequent build, the date and time in the object file still reflect the original compilation, not the current date or time.

  • Source paths embedded in debugging information in object files will reflect the original compilation (even if you re-use the object file while building in a different workspace).

Supported Tools

JobCache supports the following tools.

Tool: gcc and g++
Supported platforms: Linux, Windows
Notes:

  • All .o and .lo files are cached

  • Should not be used to cache results from linking

Tool: clang and clang++
Supported platforms: Linux, Windows
Notes:

  • All .o and .lo files are cached

  • Should not be used to cache results from linking

Tool: cl (Microsoft Visual C/C++)
Supported platforms: Windows
Notes:

  • All .obj files are cached

  • The only supported debug option is /Z7

  • Should not be used to cache results from linking

Tool: Java Android Compiler Kit (Jack)
Supported platforms: Linux
Notes:

  • All .jack and .dex files are cached

Tool: javac
Supported platforms: Linux
Notes:

  • All .jar files are cached

Running a “Learning” Build to Populate the Cache

You must first populate the cache by running a “learning” build with JobCache enabled. Because the cache is empty during the learning build, JobCache only saves new results to the cache. In subsequent builds, JobCache reuses cached results and saves new results to the cache as appropriate. If you do not enable JobCache, the job cache is not accessed.

Extending JobCache to Teams Via a Shared Cache

The Shared JobCache feature extends JobCache to teams of developers by using a shared cache. As with non-shared JobCache, Shared JobCache accelerates builds by reusing outputs from a build in the next build, which avoids costly redundant work across builds. Shared JobCache extends this concept by giving developers read-only access to a cache that was previously populated by another user or process (such as a nightly build). With this feature enabled, only one user must actually run the compilations; other team members simply reuse the output from that “golden” build.

A shared cache gives JobCache two tiers: shared and private (“local”). The shared cache strictly augments the traditional (“local”) cache but does not replace it.

Shared JobCache tries to find matching cache entries in the following order:

  1. During build execution with Shared JobCache enabled, eMake tries to find matching cache entries in the shared asset directory. This directory is specified by the --emake-shared-assetdir option. For example, --emake-shared-assetdir=/net/nightlybuild/.

    Developers can never modify cache entries in the shared asset directory.

  2. If there is no matching slot for a job in the shared asset directory, or if the input files for the job differ, eMake uses the developer’s local cache instead by checking for a hit there. This cache is specified by the --emake-assetdir=<directory> option. For example, --emake-assetdir=/net/home/bob/.

  3. If there is no local cache match, eMake creates a slot in the developer’s local cache if needed. The developer will get a hit in their local cache during the next compilation.
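The lookup order above can be sketched as a toy model. This is only an illustration of the three steps, not eMake's implementation: here each tier is a plain directory, a slot is a file named after a hypothetical slot identifier, and the file content stands in for a digest of the job's inputs.

```shell
#!/bin/sh
# Toy model: a tier "hits" when the slot file exists and its stored
# input digest equals the digest of the current inputs.
tier_has_hit() {
    # $1 = tier directory, $2 = slot id, $3 = digest of current inputs
    [ -f "$1/$2" ] && [ "$(cat "$1/$2")" = "$3" ]
}

lookup() {
    # $1 = shared dir, $2 = local dir, $3 = slot id, $4 = digest of current inputs
    if tier_has_hit "$1" "$3" "$4"; then
        echo "hit shared"             # step 1: shared cache (never modified)
    elif tier_has_hit "$2" "$3" "$4"; then
        echo "hit local"              # step 2: fall back to the local cache
    else
        mkdir -p "$2"
        printf '%s\n' "$4" > "$2/$3"  # step 3: save the new result locally only
        echo "miss"
    fi
}
```

In this model, a job that misses in both tiers is written to the local directory, so the same lookup repeated with unchanged inputs reports "hit local", while the shared directory is never written.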

Prerequisites

Shared JobCache requires all participating developers to have access to a shared file system (such as NFS) where the shared asset directory will reside.

Populating the Shared Cache

For the builds that populate the shared cache, use the following eMake options:

--emake-jobcache=<types> --emake-assetdir=<assetdir>

Using the Shared Cache

Developer builds use the following eMake options:

--emake-jobcache=<types> --emake-shared-assetdir=<assetdir>

where <assetdir> is the asset directory of the populated cache.
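As a sketch of how the two kinds of builds might be invoked, the following shell fragment assembles the option strings. The directories are the example paths used earlier in this section, the gcc cache type is just one possible value for <types>, and the helper names are hypothetical. The commands are printed rather than run, so the fragment works without eMake installed.

```shell
#!/bin/sh
# Example paths from earlier in this section; adjust for your site.
SHARED_DIR=/net/nightlybuild
LOCAL_DIR=/net/home/bob

# Options for the build that populates the shared cache (for example, a nightly build).
populate_opts() {
    echo "--emake-jobcache=gcc --emake-assetdir=$SHARED_DIR"
}

# Options for a developer build: read from the shared cache,
# and keep the developer's own results in a local cache.
developer_opts() {
    echo "--emake-jobcache=gcc --emake-shared-assetdir=$SHARED_DIR --emake-assetdir=$LOCAL_DIR"
}

echo "emake $(populate_opts)"
echo "emake $(developer_opts)"
```

The developer command passes both directory options because, as described above, the local cache remains in use alongside the shared one.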

Configuring JobCache

Licensing

JobCache is licensed based on the maximum number of builds that can use it simultaneously. This number is read from the jobcacheMax property in the Accelerator license file. Simultaneous builds beyond this number run without JobCache.

If the JobCache license entry is invalid, or if the number of simultaneous builds has exceeded the license limit, a WARNING EC1181: Your license does not permit object caching message appears when a build tries to use JobCache. eMake will continue to work normally.

Choosing a Disk for the Job Cache

To estimate the disk space required for the job cache, add the sizes of your object files together and multiply by 0.7. For example, the KitKat release of the Android Open Source Project (AOSP) requires about 4 GB. For best performance, choose a disk that is local to the eMake client host. For Shared JobCache, users must have access to a shared file system such as NFS.
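The estimate can be computed from an existing build tree. The following is a minimal sketch that assumes a POSIX shell and that your object files end in ".o"; the function name is hypothetical, and you would adjust the pattern for other extensions such as .lo or .obj.

```shell
#!/bin/sh
# Estimate job cache size: total bytes of all ".o" files under a
# directory, multiplied by 0.7 (integer arithmetic, rounded down).
estimate_cache_bytes() {
    # $1 = root of a completed build tree
    total=$(find "$1" -type f -name '*.o' -exec cat {} + | wc -c)
    echo $(( total * 7 / 10 ))
}

# Usage: estimate_cache_bytes /path/to/build/tree
```

Concatenating the files is simple and portable but reads every object file once, so expect it to take a while on a large tree.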

You can use the --emake-assetdir= eMake option to specify the directory for your job cache. The default name of this directory is .emake. By default, this directory is in the working directory in which eMake is invoked. (This option also determines the cache location for the parse avoidance feature and the location of the saved dependency information for the dependency optimization feature.)

Building Multiple Branches

To maximize your cache hits, use the --emake-assetdir= option to specify a separate asset directory for each branch of code that you build.
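One way to keep branches apart is to derive the asset directory from the branch name. The sketch below assumes a hypothetical base directory and helper name; only the --emake-assetdir= option itself comes from this document.

```shell
#!/bin/sh
# Build an --emake-assetdir option that is unique per branch.
assetdir_opt_for_branch() {
    # $1 = base directory for caches (hypothetical), $2 = branch name
    # Replace "/" so that a name such as "release/2.0" maps to one directory.
    safe=$(printf '%s' "$2" | tr '/' '_')
    echo "--emake-assetdir=$1/$safe/.emake"
}

assetdir_opt_for_branch /var/cache/emake release/2.0
```

If your sources are in Git, the branch name could come from git rev-parse --abbrev-ref HEAD.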

Setting the eMake Root

JobCache does not detect changes to compilation inputs that are not under your eMake root. You must ensure that your eMake root contains all sources and tools that might change.

Job Caching for gcc, clang, Jack, and javac

Enabling JobCache for All Make Invocations in a Build

To enable JobCache for all make invocations in a build, use the --emake-jobcache=<types> eMake option, where <types> is a comma-separated list of any combination of gcc, clang, clang-cl, jack, or javac. The list cannot contain spaces. The --emake-jobcache=<types> option works for recursive and nonrecursive builds.

  • --emake-jobcache=clang is an alias for --emake-jobcache=gcc. In the eMake annotation file, the JobCache type appears as jobcache type="gcc".

  • --emake-jobcache=clang-cl is an alias for --emake-jobcache=cl. In the eMake annotation file, the JobCache type appears as jobcache type="cl".

Following is an example command that enables JobCache for gcc, Jack, and javac:

% emake --emake-cm=mycm --emake-root=/src/mysource --emake-jobcache=gcc,jack,javac --emake-annodetail=basic

If some of your makefile targets are built by rules that do not invoke gcc/g++, clang/clang++, Jack, or javac, and those rules do not behave enough like those tools for job caching to be suitable, then use the jobcache pragma to be more selective about which jobs are cached. You can either enable JobCache for fewer jobs or selectively disable it for particular jobs with #pragma jobcache none.

For example, the rules that build a particular “.o” file might use an environment variable that is not used by gcc/g++, and you might want to miss the cache if that environment variable changes its value.

The --emake-annodetail=basic option is not required to invoke eMake, but it is recommended for troubleshooting. For details, see the Troubleshooting section below.

For information about how to use this functionality with Visual Studio, see the Job Caching for cl section.

Using the Response File gcc Command Line Option

JobCache supports the gcc response file command-line option (@<file>), which reads gcc command-line options from the separate file specified by <file>.

Enabling JobCache for All Object File Targets in a Make Invocation

If you do not want to enable JobCache for all make invocations in a build, you can still enable JobCache for specific object file targets by applying the jobcache pragma inside the makefile. To enable the feature for all targets with a particular extension, use the following:

#pragma jobcache <type> -exist *
%.<object_file_extension> :

where <type> is gcc, clang, clang-cl, jack, or javac.

Following is an example makefile excerpt with the feature enabled for all targets ending in “.o”:

...
#pragma jobcache gcc -exist *
%.o :
...

(You can apply the pragma to a pattern, but not to a suffix rule.) If cache misses occur because an insignificant file exists or does not exist, you can delete -exist *, and eMake will check only the existence of files whose names suggest that they are source files or gcc precompiled headers. You can augment the set of files that matters to eMake by adding -exist options that specify the desired glob patterns.

If you do not want to add the pragma to the main makefile, you can add it to an “addendum” makefile. This is a small makefile that you include in the eMake invocation from the command line. For example, you could create a makefile named jobcache.mak and then add -f Makefile -f jobcache.mak to your eMake invocation. An addendum makefile is useful when you do not control the content of your makefiles (such as when you include open source components that use build tools such as configure, CMake, or qmake to generate makefiles).
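As a sketch of the addendum approach, the following shell function writes a two-line addendum makefile that contains the pattern pragma shown above. The function name is hypothetical; the file name jobcache.mak and the -f options are the ones used in the preceding paragraph.

```shell
#!/bin/sh
# Write a minimal addendum makefile that enables gcc-type caching
# for all targets ending in ".o".
write_addendum() {
    # $1 = path of the addendum makefile to create
    cat > "$1" <<'EOF'
#pragma jobcache gcc -exist *
%.o :
EOF
}

# Usage:
#   write_addendum jobcache.mak
#   emake -f Makefile -f jobcache.mak
```

Because the generated makefiles are left untouched, the addendum survives regeneration by tools such as configure or CMake.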

Enabling or Disabling JobCache for Specific Object File Targets

To enable JobCache only for some object file targets, do one of the following:

  • If an object file is built by an explicit rule (one without “%” in it), you can enable JobCache for that rule by inserting a pragma just above it in the makefile. You can also use an addendum makefile in this case. For example, when using gcc/g++, clang/clang++, Jack, or javac to cache an explicit rule that produces a single target, you can use this addendum:

#pragma jobcache <type> -exist *
target.<object_file_extension> :

For example, to enable JobCache using gcc only for the target.o object file:

#pragma jobcache gcc -exist *
target.o :

  • You can use narrower target patterns than “%.o” after a jobcache pragma. For example, “abc%.o” applies the pragma to “abcdef.o” but not to “xyzdef.o”. (This is true even if the rule to build “abcdef.o” appears elsewhere in the makefile.)

Similarly, you can explicitly disable caching by using the #pragma jobcache none pragma.

If jobcache pragmas with different options apply to a target, then eMake selects one of them as follows:

  • If the target is built by an explicit rule with a jobcache pragma, then eMake chooses that pragma.

  • Otherwise, eMake chooses the pragma whose pattern has the most in common with the target name.

  • If eMake still encounters ties, then it chooses the pragma that it encountered last during makefile parsing.

For example, the following gcc makefile applies job caching to hello1.o and hello2.o, but not to hello3.o or date2.o:

CC=gcc

all : hello
    ./hello

#pragma jobcache gcc -exist *
hello1.o: hello1.c hello1.h
    $(CC) -c -o $@ $< $(CFLAGS)

hello2.o: hello2.c hello2.h
    $(CC) -c -o $@ $< $(CFLAGS)

# Uses __DATE__ to record when the build occurred, so do not cache.
#pragma jobcache none
date2.o: date2.c date2.h
    $(CC) -c -o $@ $< $(CFLAGS)

hello3.o: hello3.c hello3.h
    $(CC) -c -o $@ $< $(CFLAGS)

hello: hello1.o hello2.o hello3.o
    $(CC) -o $@ $^

clean:
    rm hello hello1.o hello2.o hello3.o

.PHONY: all clean

#pragma jobcache gcc -exist *
%2.o :

Job Caching for cl

Enabling cl JobCache for All Make Invocations in a Build

On Windows, setting --emake-jobcache=cl enables JobCache for all jobs that produce .obj files. These are normally created by the Visual C++ compiler (cl.exe). You can set the option on the command line, in an emake.conf file, or in the EMAKEFLAGS environment variable.

When using the Visual Studio IDE extension, add the --emake-jobcache=cl option to EMake Options in the command line settings.

For example, if you create a simple C++ Console Application using Visual Studio, it creates two compile jobs for Example1.cpp and stdafx.cpp.


The extension converts this project into a makefile. For example:

".\Example1\Debug\Example1.obj":: ".\Example1\Example1.cpp"
    @cl.exe /Od /Oy- /sdl /D WIN32 /D _DEBUG /D _CONSOLE /D _LIB /D _UNICODE /D UNICODE /EHsc /RTC1 /analyze- /MDd /GS /Zc:wchar_t /Zc:forScope /Yu"stdafx.h" /Fp".\Example1\Debug\Example1.pch" /Fo".\Example1\Debug\Example1.obj" /W3 /WX- /nologo /c /Z7 /Gd /errorReport:none /fp:precise /TP ".\Example1\Example1.cpp"

".\Example1\Debug\Example1.pch" :: ".\Example1\Debug\stdafx.obj"

".\Example1\Debug\stdafx.obj" :: ".\Example1\stdafx.cpp"
    @cl.exe /Od /Oy- /sdl /D WIN32 /D _DEBUG /D _CONSOLE /D _LIB /D _UNICODE /D UNICODE /EHsc /RTC1 /analyze- /MDd /GS /Zc:wchar_t /Zc:forScope /Yc"stdafx.h" /Fp".\Example1\Debug\Example1.pch" /Fo".\Example1\Debug\stdafx.obj" /W3 /WX- /nologo /c /Z7 /Gd /errorReport:none /fp:precise /TP ".\Example1\stdafx.cpp"

When eMake runs with JobCache, it stores Example1.obj and stdafx.obj and reuses them in the next build if it gets a “hit” (meaning that no relevant changes have been detected since the last build).

Using the /Z7 Option with JobCache

JobCache does not work with the /Zi or /ZI compiler debug options when the same PDB file is updated between compilations. Each compilation must create a unique PDB file if debugging is enabled. Use the /Z7 option with JobCache instead. When using the Visual Studio IDE extension, check Set Debug Information Format to C7 Compatible.

Troubleshooting

Enabling Basic Annotation

Basic annotation helps you to resolve cache hit/miss problems by showing whether a cache miss occurred and what caused it. Annotation that is related to JobCache is included in basic annotation. Basic annotation is not enabled by default, so you must enable it by using the --emake-annodetail=basic eMake option. By default, the eMake annotation file is named emake.xml and is created in the working directory in which eMake is invoked.

Interpreting Job Cache Annotation Information

Examine the relevant job XML element in the annotation file for a subelement named jobcache.

Following is an example of a jobcache subelement for a job with a job cache hit. The status attribute indicates whether eMake used a cached object file in that job (whether it had a cache hit), and if not, what else occurred:

...
<job id=...>
...
     <jobcache type="gcc" options=" -exist *" slot="6d00a0d9242610a075a98bc9400f8f11" duration="0.155831" status="hit">
...
     </jobcache>
...
</job>
...

The duration attribute is the duration (in seconds) of the job that populated the cache. If the current build is updating the cache, then the value of duration is the same as the duration of the job containing that jobcache subelement. If the current build is replaying from the cache, then it is the duration of the corresponding job from a previous build whose results were saved into the cache. eMake retrieves the figure from the cache.

Following is an example of a jobcache subelement for a job with a shared job cache hit. Note the src="shared" tag, which appears when Shared JobCache is used:

...
<job id=...>
...
     <jobcache type="gcc" options=" -exist *" slot="86e319e5cd837ad78360145fb3a933f1" duration="0.244847" status="hit" src="shared">
          <triggers>
               <trigger type="commandline" option="--emake-jobcache=gcc"/>
          </triggers>
     </jobcache>
...
</job>
...

The local cache was used unless src="shared" appears.

Following is an example of a jobcache subelement for a job with a cache miss:

...
<job id=...>
...
     <jobcache type="gcc" options=" -exist *" slot="989324a736c583a306630930da80ba1e" duration="0.266177" status="miss">
...
          <differences>
               <diff name="/c/src/mysql-5.6.21/include/mysql.h" old="md5:d55ae58fd7eec6055a50fcf6b83af99c" new="md5:071ed8e989b0d0e14d0b0bd96e94cd35"/>
          </differences>
     </jobcache>
...
</job>
...

Following is an example of a jobcache subelement for a job for which caching was disabled by using #pragma jobcache none :

...
<job id=...>
...
     <jobcache type="none" options="" status="uncacheable">
          <triggers>
               <trigger type="pragma" file="Makefile" line="11" options="none"/>
          </triggers>
     </jobcache>
...
</job>
...

Following is a complete list of the possible values of the status attribute:

  • hit —eMake had a cache hit for that job.

  • miss —eMake had a cache miss for that job. Each diff subelement shows a file system input to compilation that differs since the object file was cached and shows the difference that was observed.

  • newslot —A new slot was created. This is because the previously-cached object files were from compilations that used different command-line arguments, environment variables, working directories, or any combination of these. To see the options used for that compilation, see the “key” file for the slot identified by the slot attribute. For example, the key file for slot 2b253a890c9745a0b500d888349ec2e2 has the following path name:

    .emake/cache.16/i686_Linux/2b/25/3a/890c9745a0b500d888349ec2e2/key

    The version number in the cache.<number> directory might vary with your ElectricAccelerator software version. The key file specifies the relevant environment variables, the working directory, and the command line. To allow cache hits when building in a new workspace, path names are specified relative to your eMake roots. If a target repeatedly has newslot status, get the slot identifiers from two consecutive builds for that same target and compare the key files.

  • uncacheable —Caching was disabled by the #pragma jobcache none pragma, or an error occurred during the update of the relevant cache slot. Examine any ERROR and WARNING messages in the console output from eMake.

  • rootschanged —There is no natural mapping from the old eMake roots to the new eMake roots.

  • unneeded —JobCache was enabled for the job but not needed (because the target was already up to date according to ordinary GNU Make rules). The cache was not consulted, even though caching was requested for that target.
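To see how these status values are distributed across a build, you can tally them from the annotation file. The following is a minimal sketch, not an official tool: it uses grep and sed rather than an XML parser, and it assumes that each jobcache start tag sits on a single line, as in the examples above. The function name is hypothetical.

```shell
#!/bin/sh
# Print "<status> <count>" for each jobcache status value found
# in an eMake annotation file.
jobcache_status_counts() {
    # $1 = annotation file (emake.xml by default)
    grep -o '<jobcache [^>]*status="[a-z]*"' "$1" |
        sed 's/.*status="\([a-z]*\)".*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage: jobcache_status_counts emake.xml
```

A build dominated by newslot or miss entries is a good candidate for the key file and diff checks described in the list above.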

If Job Cache Annotation Information Is Missing

If a particular job element in the annotation file has no jobcache subelement, one or more of the following has occurred:

  • The target name for that job does not match the pattern following any jobcache pragma, and if the target is built by an explicit rule, that rule does not follow a jobcache pragma.

  • The intended jobcache pragma is misspelled.

  • Either the --emake-jobcache= command line option was not used, or none of the targets matched.

  • The appropriate licensing is not available to the eMake client.

If eMake Unexpectedly Used a Cached Object File

If eMake should not have used a particular cached object file, check the following:

  • If eMake should have detected a change to a particular file, compare its path to your eMake roots.

  • If a relevant environment variable changed, check whether the key file mentions it (see above), and if it does not, notify Electric Cloud.
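To inspect or compare key files, you first need the path of the key file for a given slot. The sketch below infers the directory layout from the single example path shown earlier (slot 2b253a890c9745a0b500d888349ec2e2 under .emake/cache.16/i686_Linux), so treat the mapping as an assumption that might not hold for every ElectricAccelerator version. The function names are hypothetical.

```shell
#!/bin/sh
# Map a slot identifier to its key file path, relative to the
# per-platform cache directory (layout inferred from one example).
slot_key_relpath() {
    # $1 = slot identifier
    printf '%s\n' "$1" | sed 's|^\(..\)\(..\)\(..\)\(.*\)$|\1/\2/\3/\4/key|'
}

# Diff the key files of two slots under one cache directory,
# for example .emake/cache.16/i686_Linux.
diff_slot_keys() {
    # $1 = cache directory, $2 and $3 = slot identifiers
    diff "$1/$(slot_key_relpath "$2")" "$1/$(slot_key_relpath "$3")"
}

# Usage: diff_slot_keys .emake/cache.16/i686_Linux <slot-a> <slot-b>
```

Any line that diff reports is an environment variable, working directory, or command-line difference that caused the two builds to use different slots.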

Profiling Debug Logging

Annotation files include profiling metrics to help troubleshoot performance issues. These are the same metrics that are in the debug log file when the --emake-debug=g option is set. The metrics appear in annotation whether or not --emake-debug=g is used. The metrics are in the <profile> tag and appear exactly as they do in the debug log file.

Electric Cloud engineering and support staff use profiling debug logging as well as other information in the eMake debug logs to help troubleshoot problems. For more information about debug logging and log levels, see eMake Debug Log Levels.

Viewing JobCache Metrics

The annotation file includes metrics about job cache activity. Following is an example that lists the metrics. This example shows that Shared JobCache is used:

...
<metrics>
...
     <metric name="jobcache.hit.local">382</metric>
     <metric name="jobcache.hit.shared">0</metric>
     <metric name="jobcache.hit">382</metric>
     <metric name="jobcache.miss">666</metric>
     <metric name="jobcache.newslot">98</metric>
     <metric name="jobcache.sharedmiss">0</metric>
     <metric name="jobcache.sharednewslot">0</metric>
     <metric name="jobcache.rootschanged">0</metric>
     <metric name="jobcache.uncacheable">4</metric>
     <metric name="jobcache.unneeded">6</metric>
     <metric name="jobcache.na">3150</metric>
     <metric name="jobcache.workloadsaved">367.393489</metric>
...
</metrics>
...

For descriptions of these metrics, see Metrics in Annotation Files.
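For a quick summary, you can pull individual metric values out of the annotation file. The following sketch uses sed instead of an XML parser and assumes one metric element per line, as in the example above. The hit percentage it computes (hits divided by hits plus misses plus new slots) is a hypothetical summary for illustration, not a figure that ElectricAccelerator defines.

```shell
#!/bin/sh
# Print the value of one <metric> element from an annotation file.
metric_value() {
    # $1 = annotation file, $2 = metric name such as jobcache.hit
    sed -n 's|.*<metric name="'"$2"'">\([^<]*\)</metric>.*|\1|p' "$1"
}

# Hits as a whole-number percentage of hit + miss + newslot.
# Fails with a division error if all three metrics are zero.
jobcache_hit_percent() {
    # $1 = annotation file
    hit=$(metric_value "$1" jobcache.hit)
    miss=$(metric_value "$1" jobcache.miss)
    newslot=$(metric_value "$1" jobcache.newslot)
    echo $(( hit * 100 / (hit + miss + newslot) ))
}

# Usage: jobcache_hit_percent emake.xml
```

With the example metrics above (382 hits, 666 misses, 98 new slots), this summary comes to 33 percent.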

Moving Your Workspace

If you want to move your workspace, make sure that the new eMake roots correspond to the old eMake roots. Also, because the asset directory defaults to .emake in the current working directory, you must either copy that directory to the new workspace or use --emake-assetdir= to specify an asset directory that you want the two workspaces to share. If you already use --emake-assetdir= to point to an asset directory within your old workspace and also want to move the asset directory, you must update its value to point to the new asset directory location.

Deleting the Cache

In general, content in the cache is not deleted automatically (although it might be replaced by newer content). If the cache grows significantly beyond the size expected for a full build, you can delete the cache to save disk space.

For example, if you change the value of the C_INCLUDE_PATH environment variable, then the cache will grow to contain results for both the old and new values of that variable. In this case, you might want to clear the cache when you permanently change the value of this variable and therefore no longer need the old cache results.

To delete the cache, delete the <assetdir>/cache.* directories. For example, if you are using the default asset directory on Linux, enter:

rm -r .emake/cache.*