Postprocessors: Collecting Data for Reports

Overview

In CloudBees CD/RO, reporting is divided into two phases.

  • The first phase is data collection: interesting information is extracted from job step logs and saved in the CloudBees CD/RO database.

  • The second phase is report generation: data collected previously is retrieved and organized into reports.

The report phases are separated so data is gathered once but then available to use in a variety of different reports either immediately or later. For example, one report might summarize errors within a particular job, and another report might display error trends from all jobs over the past month.

CloudBees CD/RO implements data collection with a postprocessor.

  • A postprocessor is a command associated with a particular procedure step. If a postprocessor is specified for a step, it executes concurrently with the main step command.

  • The postprocessor runs on the same machine as the main command and in the same working directory, and it receives the step's log file as its standard input.

CloudBees CD/RO includes a standard postprocessor called postp that you can use and extend. postp scans the step’s log file looking for interesting output such as error messages and then sets properties on the job step to describe what it found. For example, postp might create a property named "errors" whose value is a count of the number of error messages in the log file, or a property named "tests" that counts the number of tests executed by the step. Also, postp can extract portions of the step log that contain useful diagnostic information and save this information for reporting.

Standard CloudBees CD/RO reports, such as those on the Job Details page, display information collected by postp such as properties named "errors" and diagnostic log extracts. This information is available immediately, even before the step completes execution, so you can view it via CloudBees CD/RO’s web interface to monitor step execution. Also, you can create additional report generators of your own, which can use the same information displayed by CloudBees CD/RO and/or any other additional information of your choice.

Postp

CloudBees CD/RO’s general-purpose postprocessor, postp, uses regular expression patterns to detect interesting lines in a step log.

  • postp is already configured with patterns to handle many common cases such as error messages and warnings from gcc, gmake, cl, junit, and cppunit, or any error message containing the string "error." You can use the empty matcher group named "none" if you just want to run a postpEndHook.

  • postp is easy to use: simply set the postprocessor for a step to "postp".

  • postp also supports several useful command-line options.

To explore these options, invoke "postp --help" from your command line.

Extending postp: matchers

If you find useful patterns in your log files undetected by postp, you can extend postp with additional patterns. This feature is easily implemented, but the extension interface currently involves writing simple Perl scripts and you need to look at the Perl source file for postp as you do this. The postp source code is installed during CloudBees CD/RO installation and located in the src/postp.pl file in the distribution directory.

Postp is driven by a collection of matchers, which are regular expression patterns that select certain lines from step logs, and by a collection of Perl functions the matchers invoke to handle lines of interest. A matcher is a Perl hash with three values similar to the following example:

{
   id      => "error",
   pattern => q{ERROR:|[Ee]rror:},
   action  => q{
      incValue("errors"); diagnostic("", "error", -4)
   },
}

Explanation of the values in the Perl hash example above:

  • id —A unique name for the matcher—used to identify the matcher in command-line arguments and a few other places.

  • pattern —A regular expression tested against each line of the step log file. This particular pattern matches lines containing any of the strings "ERROR:", "Error:" or "error:"

  • action —A Perl script that executes whenever a log line matches the pattern for this matcher. The script in this example increments a variable named "errors" that is copied automatically to a job step property with the same name. The script also saves a portion of the step log beginning 4 lines prior and extending through the current line and associates those lines with this error so it can be displayed in the web interface. See below for more information on the incValue and diagnostic functions.

To extend postp with patterns of your own, write a Perl script to add new patterns to the @::gMatchers array. Here is a simple example:

my @newMatchers = (
    {
      id      => "coreDump",
      pattern => "core dumped",
      action  => q{
                      incValue("coreDumps")
                 }
    },
    {
      id      => "segFault",
      pattern => "segmentation fault",
      action  => q{
                      incValue("segFaults")
                 }
    },
);
push @::gMatchers, @newMatchers;

These matchers detect lines containing the strings "core dumped" or "segmentation fault" and increment separate variables for each line type.
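
To see how a matcher's pattern and action fit together, the following is a simplified, standalone model of the matching loop. It is not postp's actual implementation: the incValue stand-in, the %values hash, and the sample log lines are assumptions made for illustration. Inside postp, the real incValue function and the @::gMatchers array are already provided.

```perl
use strict;
use warnings;

# Stand-in for postp's incValue so the sketch runs outside postp.
my %values;
sub incValue {
    my ($name, $increment) = @_;
    $values{$name} = 0 unless exists $values{$name};
    $values{$name} += defined $increment ? $increment : 1;
}

# The two extension matchers from the example above.
our @gMatchers = (
    { id => "coreDump", pattern => "core dumped",        action => q{incValue("coreDumps")} },
    { id => "segFault", pattern => "segmentation fault", action => q{incValue("segFaults")} },
);

# Hypothetical step log.
my @log = (
    "running test suite",
    "test_io: segmentation fault",
    "test_net: Aborted (core dumped)",
    "test_db: segmentation fault",
);

# Simplified model of the loop: test each line against each pattern and
# evaluate the action string of every matcher whose pattern matches.
for my $line (@log) {
    for my $matcher (@gMatchers) {
        next unless $line =~ /$matcher->{pattern}/;
        eval $matcher->{action};
        die "action for $matcher->{id} failed: $@" if $@;
    }
}

print "$_ = $values{$_}\n" for sort keys %values;
```

Run on its own, the sketch prints "coreDumps = 1" and "segFaults = 2", which are the values postp would eventually copy to job step properties of the same names.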

After writing extension code, you must ask postp to execute the code when it starts up. You can execute extension code in one of two ways:

  • Place the code into a file and invoke postp with the --load option. For example, postp --load fileName

  • Copy the code into a property in CloudBees CD/RO and use the --loadProperty option to postp. For example, postp --loadProperty /myProject/extraMatchers

Postp extensions can contain arbitrary Perl code, which means you can use this mechanism to define additional functions to invoke in matcher actions if existing functions do not provide what you need.

Additional sample postp matchers are available for your convenience. They are installed during CloudBees CD/RO installation in the src/samples/postp directory.

Postp functions

Postp contains several built-in functions to invoke in your matchers. The most useful functions are summarized below. Also, you can scan postp code for additional functions.

debugLog(format, arg, arg, …)

Outputs information to the debug log, if the --debugLog command-line switch is set. Format provides a format string similar to printf, and each argument provides a value to substitute into the format string.

backTo(pattern, start)

This function searches backward to find the first line in the step log matching "pattern" (a regular expression) and returns the offset of that line relative to the current line. The result is normally used as an argument to the "diagnostic" function. "Start" is optional; it specifies the first line to check and is specified as an offset relative to the current line (it defaults to -1).

backWhile(pattern, start)

This function searches backward to find the first line in the step log that does not match "pattern" (a regular expression) and returns the offset relative to the current line of the line just after the first non-matching line. The result is normally used as an argument to the "diagnostic" function. "Start" is optional; it specifies the first line to check and is specified as an offset relative to the current line (it defaults to -1).

currentModule()

Returns the name of the current module (as determined by previous calls to pushModule and popModule), or an empty string if there is no current module.

diagnostic(name, type, first, last)

This function extracts a group of contiguous lines from the log file and saves them along with additional information for reports such as the log extracts, which appear at the bottom of the Job Details web page. The range of lines to extract is indicated by "first" and "last," each of which is an offset relative to the current line. For example, if "first" is -3 and "last" is 2, then 6 lines will be recorded: 3 lines before the current line, the current line, and 2 lines after the current line. In many cases, the values for "first" and "last" are computed by calling functions such as "forwardTo" or "backWhile." "Name" provides an identifier for this particular diagnostic, such as the name of a test that failed or a file that did not compile. "Type" specifies which kind of information this is, and must be "error," "warning," or "info." In addition to log lines, this function records "name," "type," the current module, if any, and the name of the current matcher.
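
The offset arithmetic can be checked with a few lines of Perl. This is only a sketch of the calculation described above, not code taken from postp; extractRange is a hypothetical helper name, and the current line number (10) is an arbitrary example value.

```perl
use strict;
use warnings;

# Convert diagnostic-style offsets into an absolute line range.
# $current is the 1-based number of the current log line; $first and
# $last are offsets relative to it, as described for diagnostic().
sub extractRange {
    my ($current, $first, $last) = @_;
    my $firstLine = $current + $first;       # first extracted log line
    my $numLines  = $last - $first + 1;      # both end points are included
    return ($firstLine, $numLines);
}

my ($firstLine, $numLines) = extractRange(10, -3, 2);
print "firstLine=$firstLine numLines=$numLines\n";   # firstLine=7 numLines=6
```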

forwardTo(pattern, start)

This function searches forward to find the first line in the step log matching "pattern" (a regular expression) and returns the offset of that line relative to the current line. The result is normally used as an argument to the "diagnostic" function. "Start" is optional; it specifies the first line to check and is specified as an offset relative to the current line (it defaults to 1).

forwardWhile(pattern, start)

This function searches forward to find the first line in the step log that does not match "pattern" (a regular expression) and returns the offset relative to the current line of the line just before the first non-matching line. The result is normally used as an argument to the "diagnostic" function. "Start" is optional; it specifies the first line to check and is specified as an offset relative to the current line (it defaults to 1).
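
The following standalone sketch models the offsets returned by backTo and forwardWhile, based only on the descriptions above. The modelBackTo, modelForwardWhile, and lineAt names, the sample log, and the behavior when no line matches or the log ends are all assumptions of the sketch; consult the postp source for the real behavior.

```perl
use strict;
use warnings;

# Hypothetical step log; element 0 is log line 1.
my @log = (
    "gcc -c testa.c",
    "testa.c: In function 'proc1':",
    "testa.c:12: error: parse error before ';' token",
    "testa.c:14: error: parse error before ';' token",
    "make: *** [testa.o] Error 1",
);
my $current = 3;    # pretend the matcher fired on log line 3

# Return the log line at a 1-based line number, or undef outside the log.
sub lineAt {
    my ($n) = @_;
    return ($n >= 1 && $n <= @log) ? $log[$n - 1] : undef;
}

# Model of backTo: offset of the nearest earlier line that matches.
# Returning undef when nothing matches is an assumption of this sketch.
sub modelBackTo {
    my ($pattern, $start) = @_;
    my $offset = defined $start ? $start : -1;
    while (defined(my $line = lineAt($current + $offset))) {
        return $offset if $line =~ /$pattern/;
        $offset--;
    }
    return;
}

# Model of forwardWhile: offset of the last line in the run of matching
# lines that begins at $start (stopping at end of log is an assumption).
sub modelForwardWhile {
    my ($pattern, $start) = @_;
    my $offset = defined $start ? $start : 1;
    while (defined(my $line = lineAt($current + $offset))) {
        last if $line !~ /$pattern/;
        $offset++;
    }
    return $offset - 1;
}

my $first = modelBackTo(q{In function});
my $last  = modelForwardWhile(q{^testa\.c:\d+:});
print "first=$first last=$last\n";    # first=-1 last=1
```

In a real matcher action, offsets like these would typically be passed straight to diagnostic, for example diagnostic("testa.c", "error", backTo(...), forwardWhile(...)), which here would cover log lines 2 through 4.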

incValue(name, increment)

Adds "increment" to a value named "name" and arranges for that value to be written eventually to a property by the same name on the current job step. If this is the first call for "name," its value is initialized to 0. "Increment" is optional and defaults to 1.

postp does not check the job step for a pre-existing property with the same name; it simply overwrites it.

logLine(lineNumber)

Returns the line from the step log given by lineNumber. 1 corresponds to the first line in the step log and the Perl variable $::gCurrentLine holds the number of the current line. This function caches a sliding window of lines in the file, allowing you to go back to retrieve lines preceding the current line (as long as they do not precede it by too many lines). If the requested line is off the end of the file then undef is returned. If the requested line is before the beginning window of cached lines, an empty string is returned.

popModule()

Cancels the effect of the most recent call to pushModule, resetting the current module name to whatever it was before the corresponding call to pushModule.

postpEndHook()

If you define a function with this name, postp invokes it after it has finished processing the log file, but before it makes its final properties update on the job step. Use this function to perform your own operations, such as generating an error if the log file did not contain a particular line you were expecting.
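
As an illustration, an extension file along the following lines could flag a step whose log never reports a successful deployment. This is a sketch intended to be loaded with --load or --loadProperty; the matcher id, the "Deployment complete" message, and the $::gSawDeployOk flag are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical flag, set by the matcher below when the expected line appears.
$::gSawDeployOk = 0;

push @::gMatchers, {
    id      => "deployOk",
    pattern => q{Deployment complete},
    action  => q{$::gSawDeployOk = 1},
};

# Invoked by postp after the whole log has been processed.
sub postpEndHook {
    # Count a missing success message as an error on the job step.
    incValue("errors") unless $::gSawDeployOk;
}
```

Because the hook runs before the final properties update, the incremented "errors" value is still written to the job step.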

pushModule(name)

In some situations it is possible to divide the log file into parts corresponding to different modules. For example, with recursive make invocations, there are typically notifications in the log output before and after each recursive make. This function is invoked to indicate a new module is being entered, where "name" is the name of the module. After this function is called, the "diagnostic" function will include "name" with error or warning messages to provide additional information in job reports. The previous module name, if any, is saved; you can return to it by calling popModule.
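
For the recursive make case, a pair of extension matchers might look like the sketch below. It assumes GNU make's "Entering directory" and "Leaving directory" messages and re-reads the current line with logLine and $::gCurrentLine to capture the directory name; the matcher ids are hypothetical, and postp may already ship comparable matchers, so treat this as an illustration only.

```perl
use strict;
use warnings;

push @::gMatchers, (
    {
        id      => "makeEnterDir",
        pattern => q{make(\[\d+\])?: Entering directory},
        # Re-read the current line to capture the directory name.
        action  => q{
            pushModule($1)
                if logLine($::gCurrentLine) =~ m{Entering directory [`'](.+)'};
        },
    },
    {
        id      => "makeLeaveDir",
        pattern => q{make(\[\d+\])?: Leaving directory},
        action  => q{popModule()},
    },
);
```

Diagnostics recorded between the two messages would then carry the directory name as their module.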

setProperty(name, value)

This function sets a property in the CloudBees CD/RO server. If the property name is a relative path, like moduleCount, the property is set or created on the current job step. You can also use an absolute path, like /myJob/fileLocation. Calling setProperty does not result in an immediate call to the CloudBees CD/RO server. Instead, the property is added to an update list that is sent at the next "update interval", typically every 30 seconds.
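
A matcher could use setProperty roughly as sketched below. The two-argument form setProperty(name, value) is an assumption here, as are the matcher id and the "Installed to:" log message; check the postp source for the exact signature before relying on it.

```perl
use strict;
use warnings;

push @::gMatchers, {
    id      => "installLocation",
    pattern => q{^Installed to: },
    action  => q{
        # Absolute path: the property is set on the job, not the job step.
        setProperty("/myJob/fileLocation", $1)
            if logLine($::gCurrentLine) =~ m{^Installed to: (\S+)};
    },
};
```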

Integration with the CloudBees CD/RO user interface

postp interacts with the CloudBees CD/RO UI in two ways. First, it creates custom properties with special names that the UI itself recognizes. Second, it creates a file that contains “diagnostics”, which the UI uses to display errors or warnings and link them to specific sections in the step’s log file.

Custom property names and values

postp can create properties on the job step or anywhere else in CloudBees CD/RO. However, the following properties are used by the standard CloudBees CD/RO UI, so you should use these property names whenever possible, and avoid using them in ways that conflict with the definitions below.

  • compiles —the number of files compiled during the job step

  • diagFile —the filename in the top-level directory of the job’s workspace, containing diagnostics extracted from the step’s log file

  • errors —the number of errors (compilation failures, test failures, application crashes, and so on) that occurred during the job step. When the errors property is set by postp, the step outcome is also set to error.

  • tests —the number of tests executed by the job step, including successes and failures

  • testsSkipped —the number of tests skipped during the job step

  • warnings —the number of warnings that occurred during the job step. When the warnings property is set by postp, the step outcome is also set to warning.

  • preSummary —if this property exists, its value is displayed in the "Status" field (on the Job Details page) for this step. This property appears before whatever would normally be displayed for status. If the property contains multiple lines separated by newline characters, each line is displayed on a separate line in the status field.

  • postSummary —if this property exists, its value is displayed in the "Status" field (on the Job Details page) for this step. This property appears after whatever would normally be displayed for status. If the property contains multiple lines separated by newline characters, each line is displayed on a separate line in the status field.

  • summary —if this property exists, its value is displayed in the "Status" field (in the job reports) for this step, replacing whatever would normally be displayed for status. If the property contains multiple lines separated by newline characters, each line is displayed on a separate line in the status field.

Diagnostic information

The second form of postprocessor-generated output contains diagnostic extracts from the step’s log file, typically providing additional information about problems. The postprocessor stores diagnostic information in an XML file in the top-level directory of the job workspace, then sets the step’s diagFile property to the name of the file. Diagnostic files must have a format like the following example:

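   <diagnostics>
      <diagnostic>
         <matcher>compileError</matcher>
         <name>testa.c</name>
         <type>error</type>
         <module>util/timeLib</module>
         <firstLine>2</firstLine>
         <numLines>7</numLines>
         <message>testa.c: In function 'proc1':
           testa.c:12: error: parse error before ';' token
           testa.c:13: error: 'for' loop initial declaration used outside C99 mode
           testa.c:14: error: parse error before ';' token
           testa.c:16: error: too few arguments to function 'exit'
           testa.c:18:2: warning: no newline at end of file
           testa.c:18: error: parse error at end of input
         </message>
      </diagnostic>
      <diagnostic>
         <name>testa.o</name>
         <type>error</type>
         <module>util/timeLib</module>
         <firstLine>9</firstLine>
         <numLines>1</numLines>
         <message>make: *** [testa.o] Error 1
         </message>
      </diagnostic>
   </diagnostics>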

XML elements in the diagnostic file are as follows:

  • diagnostics —Overall container; its children consist of all of the diagnostic elements.

  • diagnostic —Describes one diagnostic extract; the elements described below are all children of this element.

  • matcher —(optional) Identifier for the matcher that triggered this diagnostic; used primarily for debugging.

  • name —(optional) Identifier that indicates the problem or situation that resulted in this diagnostic, such as the name of a failed test or the name of a file that did not compile.

  • type —Type of message: must be "error," "warning," or "info."

  • module —(optional) Name of the module in which the issue occurred, such as the name of a source code module being compiled at the time of a compile error.

  • firstLine —Line number in the log file for the first line of the diagnostic extract (1 means the first line of the file). This element is used to provide a link from the diagnostic extract to the full log file.

  • numLines —Total number of lines included in the diagnostic extract.

  • message —The actual lines from the log file.
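
If you write your own postprocessor instead of using postp, a diagnostics file could be produced with a few lines of core Perl, as in the sketch below. The function names are hypothetical, the element order simply follows the list above, and whether CloudBees CD/RO requires a particular order or an XML declaration is not confirmed here.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xmlEscape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Render a list of diagnostic hashes as a diagnostics document.
# Optional elements (matcher, name, module) are skipped when absent.
sub diagnosticsXml {
    my (@diagnostics) = @_;
    my @order = qw(matcher name type module firstLine numLines message);
    my $xml = "<diagnostics>\n";
    for my $diag (@diagnostics) {
        $xml .= "  <diagnostic>\n";
        for my $element (@order) {
            next unless defined $diag->{$element};
            $xml .= "    <$element>" . xmlEscape($diag->{$element}) . "</$element>\n";
        }
        $xml .= "  </diagnostic>\n";
    }
    return $xml . "</diagnostics>\n";
}

# Values taken from the make error in the example above.
my $xml = diagnosticsXml({
    name      => "testa.o",
    type      => "error",
    module    => "util/timeLib",
    firstLine => 9,
    numLines  => 1,
    message   => "make: *** [testa.o] Error 1",
});
print $xml;
```

The resulting text would be written to a file in the top-level directory of the job workspace, with the step's diagFile property set to that file name.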

Postp integration with Java Tools

CloudBees CD/RO integrates with three standard Java tools through postp matchers. These Java tools are:

  • EMMA—an open source toolkit for measuring and reporting Java code coverage (Emma v2.0)

  • JUnit—a framework for writing and running automated tests (tested with Ant v1.7)

  • Clover—Atlassian’s Java code coverage (generated by system, functional, or unit tests) analysis tool (Clover v2.4)

If one of these Java tools is invoked in a CloudBees CD/RO job step, CloudBees CD/RO automatically detects the invocation (if you are using Postp) and adds any reports generated by these tools to the list of links at the top of the Job Details page.

The postp Process

Postp parses the output of the invoked Java tool to find the paths of the reports the tool has created. Then, postp copies all files that make up each report to a unique location in the artifacts directory. Finally, postp generates a link to that location so the report is available in the Links section on the Job Details page.

The following example of generated output from Emma illustrates this process:

emma:

init:
    [mkdir] Created dir: C:\Documents and Settings\ptharani\Desktop\emma\emma-2.0.5312\examples\out

compile:
    [javac] Compiling 4 source files to C:\Documents and Settings\ptharani\Desktop\emma\emma-2.0.5312\examples\out

run:
    [emmajava] EMMA: processing classpath ...
    [emmajava] EMMA: [3 class(es) processed in 47 ms]
    [emmajava] main(): running doSearch()...
    [emmajava] main(): done
    [emmajava] EMMA: writing [txt] report to [C:\Documents and Settings\ptharani\Desktop\emma\emma-2.0.5312\examples\coverage\coverage.txt] ...
    [emmajava] EMMA: writing [html] report to [C:\Documents and Settings\ptharani\Desktop\emma\emma-2.0.5312\examples\coverage\coverage.html] ...

all:

BUILD SUCCESSFUL
Total time: 1 second

From this output, the actual link target may reside here:

/home/commanderWorkspace/job_112_200901081732/artifacts/javaTools/238/emmaCoverage/2/coverage.html

While the link name may be:

Step Id 238 ant-on-the-fly—emma report# 2

And the value of the link may be:

jobSteps/238//javaTools/238/emmaCoverage/2/coverage.html

Java Tool matcher examples

Two examples of postp Emma matchers:

{
   id      => "emmaReport1",
   pattern => q{EMMA: writing},
   action  => q{emmaExtractReport()},
},
{
   id      => "emmaReport2",
   pattern => q{\[report\] writing},
   action  => q{emmaValidateOutput()},
},

An example of a postp JUnit matcher:

{
   id      => "junitReportCapture",
   pattern => q{\[junitreport\] Processing},
   action  => q{junitExtractReport()},
},

This matcher corresponds to the following output:

junit.report:
    [junitreport] Processing /net/WinStor2home/ptharani/junitDemo1/sample/testreport/TESTS-TestSuites.xml to /tmp/null1214791178
    [junitreport] Loading stylesheet jar:file:/usr/local/tools/common/apache-ant-1.7.0/lib/ant-junit.jar!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl
    [junitreport] Transform time: 622ms
    [junitreport] Deleting: /tmp/null1214791178

An example of a postp Clover matcher:

{
   id      => "cloverHtmlReportAntTask",
   pattern => q{\[clover-html-report\] Writing HTML report to},
   action  => q{cloverExtractReport()},
},

This matcher corresponds to the following output:

clover.report:
    [clover-html-report] Clover Version 2.4.0, built on November 05 2008 (build-747)
    [clover-html-report] Loaded from: /home/cloverDemo/clover-ant-2.4.0/lib/clover.jar
    [clover-html-report] Clover: Evaluation License registered to electric cloud.
    [clover-html-report] You have 27 day(s) before your license expires.
    [clover-html-report] Loading coverage database from: '/home/cloverDemo/clover-ant-2.4.0/tutorial/.clover/clover2_4_0.db'
    [clover-html-report] Writing HTML report to '/home/cloverDemo/clover-ant-2.4.0/tutorial/clover_html'
    [clover-html-report] Done. Processed 1 packages in 2559ms (2559ms per package).

Artifacts directory

The value of the artifacts directory determines the scope of what is visible in the CloudBees CD/RO UI from a job’s workspace. By default, the Artifacts Directory is set to "artifacts", so the artifacts directory would have the form:

<path to job workspace>/artifacts

The value of the artifacts directory can be set in any of the following properties:

/myJob/artifactsDirectory

/myProject/artifactsDirectory

/server/settings/artifactsDirectory

Postp queries these properties (in the order listed) to determine the location of the artifacts directory. If no value is found (because the property was never set), the default "artifacts" is used.
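
The lookup order can be modeled as a simple first-match search with a default. The sketch below is a standalone illustration, not postp's code: resolveArtifactsDirectory is a hypothetical name, and the properties are passed in as a plain hash rather than read from the server.

```perl
use strict;
use warnings;

# Return the first artifactsDirectory value that has been set, checking
# the properties in the documented order; fall back to "artifacts".
sub resolveArtifactsDirectory {
    my ($properties) = @_;    # hash ref: property path => value
    for my $path (qw(
        /myJob/artifactsDirectory
        /myProject/artifactsDirectory
        /server/settings/artifactsDirectory
    )) {
        return $properties->{$path} if defined $properties->{$path};
    }
    return "artifacts";
}

my $fromProject = resolveArtifactsDirectory({ "/myProject/artifactsDirectory" => "reports" });
my $fallback    = resolveArtifactsDirectory({});
print "$fromProject $fallback\n";    # reports artifacts
```

In this model an empty string counts as a value that has been set, so it takes precedence over the default.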

Postp recognizes output that belongs to certain standard tools (such as JUnit), copies the logs produced by that tool to the artifacts directory, and then creates the report-url properties. For JUnit, these actions are performed by the junitReportCapture matcher.

To disable this matcher in the postp invocation for the step that runs JUnit tests, use the following:

postp --dontCheck junitReportCapture

The artifacts directory might still be created even if nothing is put in it. If that occurs, set the artifactsDirectory property on your job (or the owning project) to an empty string.