System of Record GraphQL API

CloudBees SDM is in preview, with early access for select preview members. Product features and documentation are frequently updated. If you find an issue or have a suggestion, please contact CloudBees Support. Learn more about the preview program.

Clients interact with the System of Record using a GraphQL API that is available at the system’s /data/api/v1/a/{accountName}/graphql end-point, where {accountName} is the DevOptics account name of the caller. The end-point produces JSON data and accepts either JSON or GraphQL payloads. You can find more information about GraphQL on the project’s website.

You can find the GraphQL schema provided by the System of Record in Appendix: System of Record GraphQL schemas. The rest of this document explains how to use the GraphQL types, mutations, and queries defined in the schema.

Defining data types

The System of Record is a key-value store that allows clients to store blobs of JSON data and then retrieve that data by key or via typed queries.

Data stored in the System of Record has types. See Application types, data, and relationships for how to define new data types via an application’s manifest.

Storing data entities

To store data of a given type, you must provide a key that uniquely identifies the data entity within its type and the data itself, which is a blob of JSON data. You can use the Data.add mutation, for example:

mutation createData( $type: String!, $key: String!, $data: JsonObject!, $validation: DataValidationOptions) {
    data(type: $type) {
        add(input: {key: $key, data: $data, validation: $validation}) {
            data {
                key
                data
            }
        }
    }
}

To store a Jenkins build under the jenkins_builds type, post the mutation above with the following variables, including the whole blob of JSON data from Jenkins:

{
  "type":"jenkins_builds",
  "key": "https://gauntlet-2.cloudbees.com/elroy/job/Admin/job/access/job/testAccessGitHub/2/",
  "data": {
    "url": "https://gauntlet-2.cloudbees.com/elroy/job/Admin/job/access/job/testAccessGitHub/2/",
    "number": 2,
    "etc": "etc"
  }
}
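The request for this mutation can be assembled as in the following minimal Python sketch. It assumes the conventional GraphQL-over-HTTP JSON envelope with query and variables members. The host sor.example.com, the account name acme, and both helper function names are hypothetical; only the endpoint path comes from the text above. The sketch builds the URL and body but does not send anything.

```python
import json

# Mutation text condensed from the example above.
CREATE_DATA = """
mutation createData($type: String!, $key: String!, $data: JsonObject!, $validation: DataValidationOptions) {
    data(type: $type) {
        add(input: {key: $key, data: $data, validation: $validation}) {
            data { key data }
        }
    }
}
"""

def graphql_url(host, account_name):
    """Build the endpoint URL; the host is a placeholder, the path is documented."""
    return f"https://{host}/data/api/v1/a/{account_name}/graphql"

def build_request_body(query, variables):
    """Serialize a query and its variables into a JSON request body (assumed envelope)."""
    return json.dumps({"query": query, "variables": variables})

build_url = "https://gauntlet-2.cloudbees.com/elroy/job/Admin/job/access/job/testAccessGitHub/2/"
variables = {
    "type": "jenkins_builds",
    "key": build_url,                           # the key must be unique within the type
    "data": {"url": build_url, "etc": "etc"},   # the JSON blob to store
}
url = graphql_url("sor.example.com", "acme")
body = build_request_body(CREATE_DATA, variables)
```

The resulting body would be posted to url with a JSON content type by whatever HTTP client you already use.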

Data validation

The optional validation field controls the level of validation performed on the provided data. It takes an object with two fields: level and strict.

The level field takes one of the following three enumeration values:

  • OFF (the default) indicates that no validation is performed

  • WARN indicates that validation is performed and any errors are returned in the response but invalid data will still be ingested

  • FAIL indicates that validation is performed and any errors are returned in the response but invalid data will not be ingested

The strict field is a boolean. When it is true, fields that are not defined in the schema result in errors; when it is false (the default), such fields are ignored.
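The interaction between level and strict can be modeled with the short Python sketch below. It is not the server's implementation: the flat field-to-type map, the error strings, and the functions validate and ingest are all hypothetical, and the only behavior taken from the text is which levels report errors, which levels store invalid data, and what strict adds.

```python
def validate(data, schema_fields, strict=False):
    """Return error strings for `data` checked against a flat field -> type map."""
    errors = []
    for name, value in data.items():
        if name not in schema_fields:
            if strict:  # unknown fields are errors only in strict mode
                errors.append(f"unknown field: {name}")
        elif not isinstance(value, schema_fields[name]):
            errors.append(f"wrong type for field: {name}")
    return errors

def ingest(store, key, data, schema_fields, level="OFF", strict=False):
    """Model the OFF / WARN / FAIL levels and return the reported errors."""
    if level == "OFF":
        store[key] = data  # no validation at all
        return []
    errors = validate(data, schema_fields, strict)
    if level == "WARN" or not errors:
        store[key] = data  # WARN stores invalid data, FAIL does not
    return errors
```

With level FAIL and strict false, an entity carrying an extra field is still stored, because the extra field is not treated as an error.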

Querying raw data

To retrieve data of a given type, use the data query. Here’s an example that will return the key, ts, and data for each data entity of a specified type:

query getRawData( $where: TopLevelDataQuery! ) {
    data( where: $where ) {
        nodes {
            key
            ts
            data
        }
    }
}

For example, to query for all data of type github_repositories, post the above query with the following variables:

{
    "where": {
        "type": "github_repositories"
    }
}

Mutating raw data

Here is an example of a raw data mutation:

mutation updateData( $type: String!, $key: String!, $data: Json!) {
    data(type: $type) {
        update( input: { key: $key, data: $data } ) {
            data {
                id
                key
                data
            }
        }
    }
}

An update or upsert mutation replaces all of the existing data, whereas merge will update only those fields supplied in the new data.

For example, assume that an entity already exists in the System of Record with the following data fields:

{
  "a": 1,
  "inner": {"keep": "keepme", "change": "changeme"}
}

If we apply an update or upsert with the following data:

{
  "b": 2,
  "inner": {"change": "changed", "new": "newdata"}
}

Then the resulting entity stored in the System of Record will be the following:

{
  "b": 2,
  "inner": {"change": "changed", "new": "newdata"}
}

However, if you apply a merge mutation with the same data, then the resulting entity stored in the System of Record will be:

{
  "a": 1,
  "b": 2,
  "inner": {"keep": "keepme", "change": "changed", "new": "newdata"}
}
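The difference between the two behaviors can be reproduced with a small Python sketch. The functions replace and merge are hypothetical models, not the server's code, and the text does not say how merge treats arrays or null values; this sketch simply overwrites any value that is not a nested object.

```python
import copy

def replace(existing, new):
    """Model update/upsert: the new data replaces the stored data entirely."""
    return copy.deepcopy(new)

def merge(existing, new):
    """Model merge: recursively overlay `new` on a copy of `existing`."""
    result = copy.deepcopy(existing)
    for name, value in new.items():
        if isinstance(value, dict) and isinstance(result.get(name), dict):
            result[name] = merge(result[name], value)  # descend into nested objects
        else:
            result[name] = copy.deepcopy(value)        # overwrite or add the field
    return result

existing = {"a": 1, "inner": {"keep": "keepme", "change": "changeme"}}
new = {"b": 2, "inner": {"change": "changed", "new": "newdata"}}

replaced = replace(existing, new)
merged = merge(existing, new)
```

Here replaced equals the new data, while merged keeps a and inner.keep from the existing entity.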

Querying typed data

Data can also be queried via typed GraphQL queries generated based on the schema provided for each data type.

Designing your schema

Let’s say you want to store a data item for each GitHub repository you have. For each such repository you have a JSON object similar to the one below (which is a subset of actual GitHub repository data).

{
  "id": 34641857,
  "url": "https:\/\/api.github.com\/repos\/cloudbees\/cloudbees-public-key",
  "fork": false,
  "name": "cloudbees-public-key",
  "owner": {
    "id": 235526,
    "url": "https:\/\/api.github.com\/users\/cloudbees",
    "login": "cloudbees",
    "node_id": "MDEyOk9yZ2FuaXphdGlvbjIzNTUyNg=="
  },
  "permissions": {
    "pull": true,
    "push": true,
    "admin": true
  },
  "organization": {
    "id": 235526,
    "url": "https:\/\/api.github.com\/users\/cloudbees",
    "login": "cloudbees",
    "node_id": "MDEyOk9yZ2FuaXphdGlvbjIzNTUyNg=="
  }
}

To make that data accessible to GraphQL queries, you create a GraphQL schema for that data. You can choose what fields you want to expose and what to name your types, but, by default, there must be one type that matches the type key of the Data Type, but capitalized and singular (type keys are snake-cased and pluralized).

Below is an example schema that matches the above GitHub data, assuming the type key is repositories:

type Repository {
    key: String!
    ts: Timestamp
    id: String
    name: String
    fork: String
    git_url: String
    owner: RepositoryOwner
    organization: RepositoryOrganization
}

type RepositoryOwner {
    id: String
    url: String
    login: String
    node_id: String
}

type RepositoryOrganization {
    id: String
    url: String
    login: String
    node_id: String
}

Schema field mappings

What if you want your schema to look different from what is shown above? Maybe you would prefer that all fields are camel-case instead of snake-case. Or, maybe you would prefer that the permissions fields appear inside the owner object.

To accomplish this, you can add GraphQL @field directives, defined by the System of Record, to map the data to the schema you want. For each field that needs a mapping, you specify a JSON Path expression that defines where the field gets its value from within the data field of each data entity.

The example below shows a custom schema that maps the GitHub repository data to the shape you want:

type Repository {
    key: String!
    ts: Timestamp
    id: String
    name: String
    fork: String
    gitUrl: String @field(path:"@.git_url")
    owner: RepositoryOwner
    organization: RepositoryOrganization
}

type RepositoryOwner {
    id: String
    url: String
    login: String
    nodeId: String @field(path:"@.node_id")
    permissionAdmin: String @field(path:"$.permissions.admin")
    permissionPull: String @field(path:"$.permissions.pull")
    permissionPush: String @field(path:"$.permissions.push")
}

type RepositoryOrganization {
    id: String
    url: String
    login: String
    nodeId: String @field(path:"@.node_id")
}

There are a couple of things to note about the schema above:

  1. The syntax of the path attribute ($, @ etc) follows a subset of the JSON Path "spec". See below.

  2. By default, there must be a GraphQL type that matches the capitalized, singular form of your System of Record type key. In the above example the System of Record type key is "repositories" and the corresponding GraphQL type is "Repository".

  3. For each field that must be renamed from snake-case to camel-case, there is a GraphQL @field directive which provides a JSON Path expression that tells where to find the data for the field. This expression is absolute (it starts at the top of your data object) if it starts with $ and relative (to the current object) if it starts with @. For example, the nodeId field of RepositoryOrganization declares that data should come from the node_id field at that same level of nesting.

  4. The directives can also be used to transform the data. In the above example, the three fields of the permissions object are pulled into the owner. Since the JSON Path expression may be absolute, it can map data from any part of the data field into the schema field you want.

  5. Path tokens that reference a field/column name that contains spaces are supported e.g. epicLink: String @field(path:"@.fields.Epic Link"), where Epic Link is a field on fields.

A note about JSON Path

The System of Record supports a tiny subset of the JSON Path syntax.

JSON Path expressions must be of the form $.field.subfield.subsubfield or @.field.subfield.subsubfield where the former is relative to the root data object and the latter is relative to the current object.
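A resolver for this subset fits in a few lines of Python. The sketch below is illustrative only: resolve_path is a hypothetical helper, and returning None for a missing step is an assumption, since the text does not say what the System of Record does in that case.

```python
def resolve_path(path, root, current):
    """Resolve '$.a.b' against `root` or '@.a.b' against `current`.

    Returns None when any step is missing (an assumption of this sketch).
    """
    anchor, _, rest = path.partition(".")
    if anchor == "$":
        node = root      # absolute: start at the top of the data object
    elif anchor == "@":
        node = current   # relative: start at the object being mapped
    else:
        raise ValueError(f"path must start with $ or @: {path!r}")
    for name in (rest.split(".") if rest else []):
        if not isinstance(node, dict) or name not in node:
            return None
        node = node[name]
    return node

data = {
    "owner": {"login": "cloudbees", "node_id": "MDEyOk9yZ2FuaXphdGlvbjIzNTUyNg=="},
    "permissions": {"admin": True},
    "fields": {"Epic Link": "EPIC-1"},  # hypothetical field name containing a space
}
node_id = resolve_path("@.node_id", data, data["owner"])
admin = resolve_path("$.permissions.admin", data, data["owner"])
```

Because the path is split only on dots, a field name containing a space resolves without special handling.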

Indexed fields

Fields that are used to define natural relationships are indexed by default. Additional fields can be annotated for indexing by using the @index directive. For example:

type JiraIssue implements PolicyTarget @queryable {
    jiraIssueKey: String @field(path:"$.key") @index
    ...
}

Custom type and field names

A default algorithm is used for mapping lower-case, snake-case, plural data type key names into GraphQL type names and singular and plural field names. If this algorithm does not match your requirements then you can use the @queryable directive on a type to define a custom mapping.

For example, the algorithm does not handle capital letters in the middle of names (e.g. GitHub) nor the correct pluralization of "branch" which could be rectified as follows:

type GitHubBranch @queryable(key: "github_branches", singular: "gitHubBranch", plural: "gitHubBranches") {
    ...
}
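To see why a custom mapping is sometimes needed, consider the naive naming rule sketched below in Python. This is a guess for illustration, not the algorithm the System of Record actually uses: default_type_name and its pluralization rules are assumptions chosen only to reproduce the two shortcomings mentioned above.

```python
def default_type_name(type_key):
    """Guess a GraphQL type name from a snake-case, plural type key."""
    words = type_key.split("_")
    last = words[-1]
    if last.endswith("ies"):
        last = last[:-3] + "y"  # repositories -> repository
    elif last.endswith("s"):
        last = last[:-1]        # builds -> build, but branches -> branche
    words[-1] = last
    # capitalize() lower-cases the rest of each word, so "github" becomes
    # "Github"; the type key carries no hint that the H should be upper-case.
    return "".join(word.capitalize() for word in words)
```

Under this rule github_branches becomes GithubBranche, which is the kind of result the @queryable arguments key, singular, and plural let you override.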

@display field directive

Field definitions can be annotated with a @display directive. The directive is defined as follows:

directive @display(label: String, visualizer: String, priority: Int) on FIELD_DEFINITION

Querying DataType data (including fields)

Information about the data types defined on an account, including their fields and display settings, can be queried.

All types

DataTypeDefinition for all types defined on an account can be queried via the dataTypes query as follows:

query {
  dataTypes {
    key
    fields {
      name
      type
      display {
        label
        visualizer
        priority
      }
    }
  }
}

By type or key

The dataType query is defined as follows:

dataType(key: String, type: String): DataTypeDefinition

The DataTypeDefinition for an individual type can be queried by key or type:

query {
  dataType(key: "github_pull_request") {
    key
    fields {
      name
      type
      display {
        label
        visualizer
        priority
      }
    }
  }
}

And by type:

query {
  dataType(type: "GithubPullRequest") {
    key
    fields {
      name
      type
    }
  }
}

Mutating typed data

Data types that are marked as mutable with the @mutable directive will have a mutator with remove, add, and update mutations defined. For example:

extend type Mutation {
    product: ProductMutator
}


type ProductMutator {
  add(input: ProductMutatorAddInput): ProductMutatorAddPayload
  remove(input: ProductMutatorRemoveInput): ProductMutatorRemovePayload
  update(input: ProductMutatorUpdateInput): ProductMutatorUpdatePayload
}

input ProductMutatorAddInput {
  description: String
  name: String!
}

type ProductMutatorAddPayload {
  product: Product
}

input ProductMutatorRemoveInput {
  id: ID!
}

type ProductMutatorRemovePayload {
  result: Boolean
}

input ProductMutatorUpdateInput {
  description: String
  id: ID!
  name: String!
}

type ProductMutatorUpdatePayload {
  product: Product
}

Current limitations:

  • An update replaces all of the existing data i.e. it does not just update fields provided in the input.

Relationships

Explicit relationships

The API includes a feature that allows you to define bi-directional relationships between data entities, and then query for those relationships.

To create a relationship between two data entities use the addRelation mutation, for example:

mutation createRelationship( $from: ID!, $to: ID! ) {
    addRelation( from: $from, to: $to )
}

As shown in the schema, the InputDataType is an object with string fields type and key. For example, if you have an entity of type sdm-product and key 342345 with id 123 and you want to relate it to another entity of type githubrepo and key git://github.com/cloudbees/ps-customers.git with id 456 you would post the above mutation with the following data:

{
    "from": "123",
    "to": "456"
}

Then, to query for all GitHub repos related to a product, use a data query like this:

query getFieldData( $where: TopLevelDataQuery! ) {
    data( where: $where ) {
        nodes {
            key
            fields
        }
    }
}

In the variables you would include type githubrepo and use the relatedTo field to find repos related to the product, for example:

{
    "where": {
        "type": "githubrepo",
        "relatedTo": {
            "type": "sdm-product",
            "key": "342345"
        }
    }
}

The GraphQL schema for a type may also define a field that represents the results of querying for one or more related entities of another type by adding the @relationship directive. The type of the desired related entity is derived from the type of the field. For example, the repos field in the following schema will return all objects matching the GraphQL type Repository for which a relationship has been added to the given product:

type Product {
    id: ID!
    name: String!
    description: String
    repos: [Repository] @relationship
}

If the type is a single instance, then null will be returned if no relationship is found and the first result is returned if more than one is found. If, as in the above example, the type is wrapped in a list, it will be automatically translated to return the Relay-style connection type e.g. RepositoryConnection.

For types that are mutable, having a relationship field defined will also cause an additional field to be added to the input type for the add and update operations. Depending on the cardinality of the relationship, this field either takes one ID or a list of IDs. For example, given the Product schema above, a new instance with a relationship to an existing repository with GraphQL ID 1234 could be created as follows:

mutation {
    product {
        add(input: {name: "Product hub", repos: ["1234"]}) {
            product {
                name
            }
        }
    }
}

Note that, on an update, the field will replace all existing relationships between the two types.

Natural relationships

The @relationship directive may also be used to define a "natural" relationship field based on one or more matching fields between two types. For example:

type JenkinsMaster {
    id: ID!
    instanceId: String!
    allJobs: [JenkinsJob] @relationship(matches: [{source: "instanceId", target: "masterInstanceId"}])
}

type JenkinsJob {
    id: ID!
    masterInstanceId: String!
    master: JenkinsMaster! @relationship(matches: [{source: "masterInstanceId", target: "instanceId"}])
}
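Conceptually, a natural relationship is a join on the listed field pairs. The Python sketch below models that idea with plain dictionaries; the function related is hypothetical and says nothing about how the System of Record evaluates the relationship internally.

```python
def related(source_entity, candidates, matches):
    """Return the candidates whose target fields equal the source's source fields."""
    return [
        candidate
        for candidate in candidates
        if all(
            source_entity.get(match["source"]) == candidate.get(match["target"])
            for match in matches
        )
    ]

master = {"id": "m1", "instanceId": "abc"}
jobs = [
    {"id": "j1", "masterInstanceId": "abc"},
    {"id": "j2", "masterInstanceId": "xyz"},
]

# JenkinsMaster.allJobs: instanceId on the master matches masterInstanceId on the job.
all_jobs = related(master, jobs, [{"source": "instanceId", "target": "masterInstanceId"}])

# JenkinsJob.master: the same pair of fields, declared from the job's side.
masters = related(jobs[0], [master], [{"source": "masterInstanceId", "target": "instanceId"}])
```

When several pairs are listed in matches, every pair must agree for an entity to be considered related.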

Cascading deletion

The @relationship directive has an optional boolean argument called cascadeDelete, which defaults to false. Setting the argument to true indicates that, when an entity of the type to which the directive is applied is deleted, any instances of the type matching the relationship criteria should also be deleted. For example, updating the JenkinsMaster type defined above as follows indicates that, when the master is deleted, the jobs related to that master should also be deleted automatically.

type JenkinsMaster {
    id: ID!
    instanceId: String!
    allJobs: [JenkinsJob] @relationship(matches: [{source: "instanceId", target: "masterInstanceId"}], cascadeDelete: true)
}

The deletion is performed by a trigger and stored procedure in the database so will be applied whether the parent entity is deleted via the GraphQL API or directly in the database. As a consequence, deletions will also cascade further to any descendants of the parent entity.

Filtering on relationships

A filter may be specified when querying a relationship field. The filter is strongly typed enabling tab-completion when using GraphQL Playground. Logic operators are distinguished from field names by the use of an underscore. The comparators supported vary by type, for example _eq and _ne for a String but an Int also has _gt, _lt, _gte and _lte. An example query that retrieves a product, its repositories, and the count of pull requests that are ready to merge, looks as follows:

query {
  product(
    id: "Y2xvdWRiZWVzL3Byb2R1Y3RzL2YwNGVhYzMzLWI0YjItNGI2YS04OTdkLWUwMmViNTk4Mzc3Zg=="
  ) {
    name
    repositories {
      nodes {
        owner {
          login
        }
        name
        readyToMerge: pullRequests(
          filter: {
            _and: [
              { state: { _eq: OPEN } }
              { mergeStateStatus: { _eq: CLEAN } }
            ]
          }
        ) {
          totalCount
        }
      }
    }
  }
}

When filtering the contents of an embedded list field, only the _eq operator is currently supported.

It is possible to filter based on the size of a nested list or a child relationship using the _size field. For example:

query {
  product(
    id: "Y2xvdWRiZWVzL3Byb2R1Y3RzL2YwNGVhYzMzLWI0YjItNGI2YS04OTdkLWUwMmViNTk4Mzc3Zg=="
  ) {
    name
    repositories {
      nodes {
        owner {
          login
        }
        name
        noReviewers: pullRequests(
          filter: {
            _and: [
              { state: { _eq: OPEN } }
              { reviewRequests: {_size: { _eq: 0 } } }
            ]
          }
        ) {
          totalCount
        }
      }
    }
  }
}
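The filter objects used in the two queries above can be read as small predicate trees. The following Python sketch evaluates such a tree against a plain dictionary to make the semantics concrete. It is a model only: the function matches is hypothetical, the comparator names are taken from the text, and the real API restricts the available comparators by field type, which this sketch does not enforce.

```python
import operator

COMPARATORS = {
    "_eq": operator.eq, "_ne": operator.ne,
    "_gt": operator.gt, "_lt": operator.lt,
    "_gte": operator.ge, "_lte": operator.le,
}

def matches(entity, flt):
    """Evaluate a filter object against a dict; every entry must hold."""
    for name, condition in flt.items():
        if name == "_and":
            if not all(matches(entity, sub) for sub in condition):
                return False
            continue
        value = entity.get(name)
        for op, operand in condition.items():
            if op == "_size":
                # Compare the length of the list using a nested comparator object.
                if not matches({"size": len(value or [])}, {"size": operand}):
                    return False
            elif not COMPARATORS[op](value, operand):
                return False
    return True

pull_request = {"state": "OPEN", "mergeStateStatus": "CLEAN", "reviewRequests": []}
ready_to_merge = {"_and": [{"state": {"_eq": "OPEN"}}, {"mergeStateStatus": {"_eq": "CLEAN"}}]}
no_reviewers = {"_and": [{"state": {"_eq": "OPEN"}}, {"reviewRequests": {"_size": {"_eq": 0}}}]}
```

Both filters match the sample pull request; changing its state to CLOSED makes both fail.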

Relationship fields can also be filtered using the filterString argument using the syntax described in Simplified query filter syntax.

Ordering of list results

Queries on top-level types and fields for natural relationships both support use of an orderBy argument to sort responses. This argument takes a list of objects where each object consists of a single field: the name of the field is the column to sort on and the value indicates the direction of sorting (ASC or DESC). Fields of the following GraphQL types are sortable: Int, Float, String, Boolean, Timestamp, Url and DateTime. The following is an example query that returns pull requests for the given repository in ascending order of creation time:

query {
    githubRepository(id: "123") {
        pullRequests(orderBy: [{createdAt: ASC}]) {
            nodes {
                title
            }
        }
    }
}

Alternatively, the results can be ordered by specifying the sort argument on selected fields. This argument takes a direction of ASC or DESC and an optional index. When multiple sort fields are specified, they are applied in ascending order of index. For example, the following query returns the first ten pull requests ordered first by creation time and then by number:

query {
  githubPullRequests(first: 10) {
    nodes {
      title
      createdAt(sort: {direction: ASC, index: 1})
      number(sort: {direction: ASC, index: 2})
    }
  }
}
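The effect of the index values can be modeled with a stable multi-key sort, as in the Python sketch below. The function apply_sorts and the in-memory rows are hypothetical, and treating a missing index as 0 is an assumption, because the text only says that the index is optional.

```python
def apply_sorts(rows, sorts):
    """Sort dict rows by `sorts`, a map of field -> {"direction": ..., "index": ...}."""
    ordered = sorted(sorts.items(), key=lambda item: item[1].get("index", 0))
    result = list(rows)
    # Sort by the least significant field first. list.sort is stable, so the
    # field with the lowest index, applied last, takes precedence.
    for field, spec in reversed(ordered):
        result.sort(key=lambda row: row[field], reverse=spec["direction"] == "DESC")
    return result

rows = [
    {"title": "c", "createdAt": 2, "number": 7},
    {"title": "a", "createdAt": 1, "number": 9},
    {"title": "b", "createdAt": 1, "number": 3},
]
by_time_then_number = apply_sorts(rows, {
    "createdAt": {"direction": "ASC", "index": 1},
    "number": {"direction": "ASC", "index": 2},
})
```

The two rows that share a creation time are ordered by number, and the later row comes last.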

Selecting only distinct results

Similar to the orderBy argument above, you can also use distinctOn with top-level types and fields for natural relationships. This argument takes a list of Strings, each specifying a field to perform distinct filtering on. The returned results will only contain a single result for each combination of the specified fields with all duplicates removed. Note that where multiple fields are specified, only duplicates of the combination of all of them are removed.

Often it makes sense to use distinctOn in combination with orderBy to ensure that the returned results are predictable (otherwise, any of the distinct results could be returned). When orderBy is used in conjunction with distinctOn then the initial fields for each must match. That is, the leftmost fields of both must be identical, but once that is satisfied then either the distinctOn or orderBy may contain additional fields.

For example, to find the set of Job types in use on each Jenkins Master:

query {
  jenkinsMasters {
    nodes {
      url
      displayName
      allJobs(distinctOn: ["type"]) {
        nodes {
          type
        }
      }
    }
  }
}
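The rule about matching leftmost fields, and the de-duplication itself, can be expressed in Python as follows. Both functions are hypothetical models. In particular, keeping the first row seen for each combination is an assumption of this sketch; as noted above, without an orderBy the API may return any of the duplicates.

```python
def leading_fields_match(distinct_on, order_by):
    """True when the shorter of the two field lists is a prefix of the longer one."""
    order_fields = [next(iter(entry)) for entry in order_by]  # [{"type": "ASC"}] -> ["type"]
    shared = min(len(distinct_on), len(order_fields))
    return distinct_on[:shared] == order_fields[:shared]

def keep_distinct(rows, fields):
    """Keep one row per combination of `fields` (here, the first one seen)."""
    seen = set()
    result = []
    for row in rows:
        combination = tuple(row[field] for field in fields)
        if combination not in seen:
            seen.add(combination)
            result.append(row)
    return result

jobs = [
    {"type": "pipeline", "name": "a"},
    {"type": "freestyle", "name": "b"},
    {"type": "pipeline", "name": "c"},
]
job_types = keep_distinct(jobs, ["type"])
```

A distinctOn of type combined with an orderBy on type and then name passes the check, whereas an orderBy that starts with name does not.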

Limiting list results

As a first step towards supporting Relay-style pagination, and to prevent unneeded results being returned to the client, queries on top-level types and fields for natural relationships both support use of a first argument to limit the number of results returned. For example:

query {
    githubRepository(id: "123") {
        newestPullRequest: pullRequests(orderBy: [{createdAt: DESC}], first: 1) {
            nodes {
                title
            }
        }
    }
}

Retrieving aggregate data for lists

When querying for a list of entities of a given data type or a relationship field that returns a list of related entities, you can retrieve aggregate information about the entities in the list. This aggregate information is retrieved using GraphQL fields that are peers of the nodes field.

The totalCount field can be used to return the total number of entities in the list. For example, the following query can be used to count the total number of open GitHub pull requests:

query {
    githubPullRequests(filterString: "state = 'OPEN'") {
        totalCount
    }
}

If the datatype contains numeric fields, the average, minimum, maximum, and sum fields can be used to retrieve the aggregate values for those fields. For example, the following query returns the average and maximum number of changed files per closed GitHub pull request:

query {
    githubPullRequests(filterString: "state = 'CLOSED'") {
        average {
            changedFiles
        }
        maximum {
            changedFiles
        }
    }
}

An average field will return a Float whereas the type of the minimum, maximum, and sum fields will match the type of the numeric field being aggregated.

Specifying the first argument on a query does not restrict the value of any aggregate field: the aggregation is still performed across the entire list. For example, totalCount returns the size of the list that would have been returned had first not been specified.
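The relationship between first and the aggregate fields can be modeled in Python as shown below. The function query_list and the numeric field changedFiles are hypothetical stand-ins; the sketch only demonstrates that the limit applies to nodes and not to the aggregates.

```python
def query_list(rows, field, first=None):
    """Return limited nodes plus aggregates computed over every row."""
    values = [row[field] for row in rows]
    nodes = rows if first is None else rows[:first]  # only nodes honors `first`
    return {
        "nodes": nodes,
        "totalCount": len(rows),
        "average": sum(values) / len(values) if values else None,  # always a float
        "minimum": min(values) if values else None,
        "maximum": max(values) if values else None,
        "sum": sum(values),
    }

rows = [{"changedFiles": 2}, {"changedFiles": 4}, {"changedFiles": 9}]
result = query_list(rows, "changedFiles", first=1)
```

Here result contains a single node, while totalCount, average, maximum, and sum still describe all three rows.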

Simplified query filter syntax

The System of Record supports a simplified query filter syntax that you can use instead of the GraphQL filter syntax described above. Instead of specifying a filter argument in the GraphQL query, you specify a filterString. The filter string resembles the WHERE clause of an SQL query.

The product documentation contains a description of the SDM query language along with examples used in the Sample policies.

Appendix: System of Record GraphQL schemas

The System of Record includes a set of GraphQL schemas that define the system’s internal built-in types, types needed for SDM features, and types needed for integrations. Here is a breakdown of those three categories of schemas:

  • Internal built-in types: These types define the base-types of Data and DataType and provide untyped query and mutation of those types.

  • Types needed for SDM features: These types are needed to implement SDM notions of Product, Feature, etc.

  • Types needed by integrations: These types are needed by integrations such as Jenkins, JIRA, and GitHub. Many of these are provided by CloudBees SDM Applications.