High-level architecture

CloudBees SDM is in preview, with early access for select preview members. Product features and documentation are frequently updated. If you find an issue or have a suggestion, please contact CloudBees Support. Learn more about the preview program.

CloudBees SDM contains four main components: data ingestion, System of Record, web user interface (UI), and an authentication service. The app framework is documented on a separate page.

This product uses Google Cloud Platform (GCP) services, including Google Kubernetes Engine (GKE), as well as the open-source Apache Flink and Apache Kafka projects.

Figure 1. High-level architecture diagram

Data ingestion

CloudBees SDM uses integrations, or data apps, to import data from third-party applications. A data app allows CloudBees SDM to synchronize with the current state of an application and enables data manipulation and retrieval on CloudBees SDM.

Currently, CloudBees SDM supports data integrations with GitHub, Jira, DevOptics, and Jenkins. The exact ingestion mechanism depends on the integration. At a high level, however, each integration uses both webhooks and fetch requests to ingest raw data into the System of Record database.

The data ingestion services are Java applications that run within a Kubernetes cluster and are, at a minimum, in charge of deploying and monitoring the Apache Flink topologies used to process the integration data. These services may also expose endpoints for handling webhooks and configuring client authentication. The raw data received from a webhook is put onto a queue via Apache Kafka and consumed by the associated Flink topology.
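The handoff from webhook to queue to consumer can be illustrated with a minimal sketch. It is not the CloudBees SDM implementation: an in-process queue stands in for Apache Kafka, a plain function stands in for the Flink topology, and the names handle_webhook and drain are hypothetical.

```python
import json
import queue

# Stand-in for a Kafka topic. The real system uses Apache Kafka, not an in-process queue.
raw_events = queue.Queue()


def handle_webhook(integration: str, body: str) -> None:
    """Hypothetical webhook endpoint: enqueue the raw payload without interpreting it."""
    raw_events.put({"integration": integration, "payload": json.loads(body)})


def drain(events: queue.Queue) -> list:
    """Hypothetical consumer, playing the role of the topology that reads from the queue."""
    consumed = []
    while not events.empty():
        consumed.append(events.get())
    return consumed


# Two made-up webhook deliveries from different integrations.
handle_webhook("github", '{"action": "opened", "number": 7}')
handle_webhook("jira", '{"issue": "SDM-1", "status": "Done"}')

consumed = drain(raw_events)
print([event["integration"] for event in consumed])
```

Keeping the endpoint limited to enqueuing means a slow or failing processing step does not block the third-party service that delivers the webhook.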

Apache Flink is a highly scalable stream-processing framework for JVM languages such as Java and Scala. A Flink topology defines how one or more streams of data are processed. Each integration has an associated topology that is used to process raw data and keep the System of Record in sync with the third-party application. Currently, the Flink cluster is deployed on Google Kubernetes Engine.
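As a rough illustration of what keeping the System of Record in sync means, the sketch below chains two stream operators over a list of change events. It uses plain Python generators rather than the Flink API, and the event fields (id, op, data) and the function names are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator


def parse(raw: Iterable[dict]) -> Iterator[dict]:
    """Normalize raw events into id/op/data records (hypothetical field names)."""
    for event in raw:
        yield {
            "id": event["id"],
            "op": event.get("op", "upsert"),
            "data": event.get("data", {}),
        }


def sync(records: Iterable[dict], store: Dict[str, dict]) -> Dict[str, dict]:
    """Sink operator: mirror each change into the store so it tracks the source application."""
    for record in records:
        if record["op"] == "delete":
            store.pop(record["id"], None)
        else:
            # Merge new fields over whatever is already known about this record.
            store[record["id"]] = {**store.get(record["id"], {}), **record["data"]}
    return store


# Made-up change events for two pull requests.
stream = [
    {"id": "PR-1", "data": {"state": "open"}},
    {"id": "PR-2", "data": {"state": "open"}},
    {"id": "PR-1", "data": {"state": "merged"}},
    {"id": "PR-2", "op": "delete"},
]

system_of_record = sync(parse(stream), {})
print(system_of_record)
```

After the stream is consumed, the store holds only the latest state of the records that still exist in the source application.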

System of Record

The System of Record (SOR) provides dynamic association and querying of data in CloudBees SDM. SOR is composed of a PostgreSQL database and a Java GraphQL API service. Raw data from integrations is stored as JSON blobs, enabling SOR to use PostgreSQL JSON functions for retrieval and manipulation. Each data record is associated with a specific data type, or schema, indicating the structure of the record along with any implicit relationships with other schemas. The dynamic nature of the schemas enables flexibility in querying and establishing relationships across disparate data types. GraphQL, an open-source data query and manipulation language originally developed at Facebook, provides flexible querying across both CloudBees and third-party data.
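The idea of typed JSON records with implicit relationships can be sketched in a few lines. This example runs in memory rather than against PostgreSQL or GraphQL, and the schema names, record fields, and the rule linking a pull request to an issue through its title are all hypothetical.

```python
import json

# Each row pairs a schema name with a raw JSON blob, loosely mirroring the storage model.
rows = [
    ("github.pull_request", '{"number": 7, "title": "SDM-1 fix login", "state": "merged"}'),
    ("github.pull_request", '{"number": 8, "title": "Update docs", "state": "open"}'),
    ("jira.issue", '{"key": "SDM-1", "status": "Done"}'),
]


def records_of(schema: str) -> list:
    """Decode every blob stored under the given schema."""
    return [json.loads(blob) for name, blob in rows if name == schema]


def pull_requests_for(issue_key: str) -> list:
    """Hypothetical implicit relationship: a pull request whose title starts with the issue key."""
    return [
        pr for pr in records_of("github.pull_request")
        if pr["title"].split(" ")[0] == issue_key
    ]


# Associate each Jira issue with the numbers of the pull requests that reference it.
linked = {
    issue["key"]: [pr["number"] for pr in pull_requests_for(issue["key"])]
    for issue in records_of("jira.issue")
}
print(linked)
```

Because the blobs are decoded only at query time, a new schema can be added without changing how existing records are stored.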

Web UI

When users connect to CloudBees SDM, their browser receives a ReactJS front-end web bundle that is hosted by an NGINX web server running in a Kubernetes container. The UI redirects the user to authenticate before allowing the user to access their user profile on CloudBees SDM.

App authentication service

Users are authenticated through an authentication service, which provides a short-lived access token for making requests to the System of Record GraphQL API. This same service is in charge of authorizing integrations to write data for a specific user profile that has enabled the integration.
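The source does not describe the token format, so the following is only a sketch of how a short-lived access token can work in general: the token carries an expiry time and a signature, and the verifier rejects it once it has expired or been altered. The HMAC-based format, the secret, and the function names are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical signing key, for illustration only


def _sign(body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature over the encoded claims."""
    return hmac.new(SECRET, body, hashlib.sha256).hexdigest()


def issue_token(user: str, now: float, ttl_seconds: int = 300) -> str:
    """Issue a token that expires ttl_seconds after 'now' (seconds since the epoch)."""
    claims = {"sub": user, "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode())
    return body.decode() + "." + _sign(body)


def verify_token(token: str, now: float):
    """Return the user name if the token is authentic and unexpired, otherwise None."""
    body, _, signature = token.partition(".")
    if not hmac.compare_digest(signature, _sign(body.encode())):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["sub"] if now < claims["exp"] else None


token = issue_token("alice", now=1000.0)
print(verify_token(token, now=1100.0))  # within the five-minute lifetime
print(verify_token(token, now=1400.0))  # past the expiry time
```

Passing the current time in as a parameter keeps the sketch deterministic. A real service would read the clock itself and would more likely use an established token standard than a hand-rolled format.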

When installing an integration, the user enables the integration to access their user profile in the third-party service, through the authorization mechanism outlined by that service. For cloud providers, this likely includes a redirect to a third-party service to accept an authorization request.

Deployment architecture

CloudBees SDM supports integrations with customer-managed servers deployed inside secured enterprise environments.

Using NGINX as a data center gateway

In enterprise environments where software delivery services are deployed inside private networks with limited Internet access, NGINX can be used as a gateway that forwards requests from inside the data center to CloudBees SDM. Once the NGINX proxy server is configured, you can set the proxy URL in the CloudBees SDM plugin settings. Refer to connecting your CloudBees SDM user profile using a proxy for more information.

When using this deployment architecture, only the NGINX proxy server needs to be configured to allow network connections to CloudBees SDM.

The following diagram shows the physical view of the NGINX architecture.

Figure 2. Physical view of architecture

The following diagram shows the logical view of the NGINX architecture.

Figure 3. Logical view