Maintaining DevOps Insight server data

The DevOps Insight server uses the Elasticsearch search engine and the Logstash data-collection and log-parsing engine to gather data from the CloudBees Flow server for use in the Deployments, Releases, and Release Command Center dashboards. The DevOps Insight server also receives predictive analytics data (based on raw Elasticsearch data) from the DevOps Foresight server. For information about the DevOps Foresight server (packaged and licensed separately), see DevOps Foresight Overview.

Backing up DevOps Insight Server Elasticsearch data

You should back up your existing DevOps Insight server data frequently. We recommend regular (nightly) full backups and an additional backup before each upgrade. For further details on archiving and restoring Elasticsearch indices, see https://www.elastic.co.

You should consider the following points for the DevOps Insight server when you set up the Elasticsearch snapshot repository:

When you register the location of the shared file system repository in the path.repo setting in the elasticsearch.yml file, you must specify the setting in the Custom Settings section to ensure that it is preserved during upgrades.

Following is an example for Linux platforms:
```
path.repo: ["/home/ecloud/bb", "/mount/backups", "/mount/longterm_backups"]
```
Following is an example for a remote shared folder location on Windows platforms using a Windows UNC path:
```
path.repo: ["\\\\<MY_SERVER>\\Snapshots"]
```

Because the DevOps Insight server is configured with SSL authentication, the curl command format must be as follows:

curl -k -X <POST|PUT> -E <data_dir>/conf/reporting/elasticsearch/admin.crtfull.pem --key <data_dir>/conf/reporting/elasticsearch/admin.key.pem \
  https://<DevOps-Insight-server-host-name>:<Elasticsearch-port>/<request-URI>

For example:

curl -k -X POST -E /opt/ef/conf/reporting/elasticsearch/admin.crtfull.pem --key /opt/ef/conf/reporting/elasticsearch/admin.key.pem \
  https://localhost:<Elasticsearch-port>/_snapshot/my_backup/snapshot_1/_restore
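
Because every request repeats the same certificate options, it can help to wrap the curl format above in a small shell function. The following is a minimal sketch, not part of the product: the function name es_curl and the variables DATA_DIR, ES_HOST, ES_PORT, and CURL are hypothetical, and /opt/ef is used only because it appears in the example above.

```shell
#!/bin/sh
# Hypothetical settings; adjust them to match your installation.
DATA_DIR="${DATA_DIR:-/opt/ef}"     # DevOps Insight data directory (assumed)
ES_HOST="${ES_HOST:-localhost}"     # DevOps Insight server host name
CURL="${CURL:-curl}"                # overridable so the command can be inspected

# es_curl METHOD REQUEST_URI [extra curl arguments...]
# Sends a request using the admin certificate and key, as in the format above.
# ES_PORT has no default here and must be set to your Elasticsearch port.
es_curl() {
  method="$1"
  uri="$2"
  shift 2
  "$CURL" -k -X "$method" \
    -E "$DATA_DIR/conf/reporting/elasticsearch/admin.crtfull.pem" \
    --key "$DATA_DIR/conf/reporting/elasticsearch/admin.key.pem" \
    "$@" \
    "https://$ES_HOST:${ES_PORT:?set ES_PORT to the Elasticsearch port}/$uri"
}

# Example calls (not executed here):
#   es_curl PUT  _snapshot/my_backup/snapshot_1
#   es_curl POST _snapshot/my_backup/snapshot_1/_restore
```

Setting CURL=echo prints the arguments instead of sending the request, which is a simple way to check the command before running it against the server.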

The Elasticsearch indices created by CloudBees Flow through the DevOps Insight server begin with ef- so they can be selected using the ef-* index pattern.
Most Elasticsearch indices follow a time-based index naming scheme and use -yyyy as the suffix for the index name, where yyyy is the year associated with the document.

For example, all deployments for the year 2018 will be stored in the index named ef-deployment-2018. This time-based naming scheme can be used in your archiving strategy for the DevOps Insight server.
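
One way to use the naming scheme in an archiving strategy is to snapshot one year of indices at a time. The sketch below only builds the request body for the Elasticsearch create-snapshot API. The function names are hypothetical, and the pattern ef-*-yyyy covers only the indices that follow the time-based scheme, so any ef- indices without a year suffix would need separate handling.

```shell
# year_index_pattern YEAR
# Prints the index pattern that matches all time-based ef- indices for YEAR.
year_index_pattern() {
  printf 'ef-*-%s\n' "$1"
}

# snapshot_body YEAR
# Prints a JSON body that limits a snapshot to the indices for YEAR
# and leaves the cluster-wide state out of the snapshot.
snapshot_body() {
  printf '{"indices": "%s", "include_global_state": false}\n' \
    "$(year_index_pattern "$1")"
}

# Example (not executed here); the repository and snapshot names are placeholders:
#   curl -k -X PUT -E <...>/admin.crtfull.pem --key <...>/admin.key.pem \
#     -H 'Content-Type: application/json' -d "$(snapshot_body 2018)" \
#     https://localhost:<Elasticsearch-port>/_snapshot/my_backup/snapshot_2018
```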

Removing Old DevOps Insight Elasticsearch Data

DevOps Insight provides insight and visibility into not just your ongoing releases and deployments but also historic releases, so you must retain old data in the DevOps Insight server.

You can provide sufficient disk space for the DevOps Insight server based on the usage requirements described in Disk Usage. However, if you must remove very old data from the DevOps Insight server to reclaim disk space, follow the recommendations explained below.

Ensuring Sufficient Disk Space for Storing DevOps Insight Data

Make sure that enough disk space is provided for storing DevOps Insight data for the last n years based on your data retention requirements. For details about calculating disk usage requirements for the DevOps Insight server based on your data-generation patterns, see Disk Usage.

Removing the Old Data

Elasticsearch is the underlying analytics store for the DevOps Insight server. The DevOps Insight server data is stored as indices in Elasticsearch. If you must remove old data, you should use Elasticsearch Curator to delete old indices. For more information about Elasticsearch Curator, see https://www.elastic.co/guide/en/elasticsearch/client/curator/5.7/index.html.

Install Elasticsearch Curator on the system where the DevOps Insight server is installed.

The curator CLIs curator_cli and curator use a configuration file that contains Elasticsearch connection settings.

Following is a sample YAML configuration file that you can use for connecting to an Elasticsearch cluster or instance that is backing the DevOps Insight server:
```
client:
  hosts:
    - 127.0.0.1
  port: Elasticsearch_port
  use_ssl: True
  certificate: data_dir/conf/reporting/elasticsearch/chain-ca.pem
  client_cert: data_dir/conf/reporting/elasticsearch/admin.crtfull.pem
  client_key: data_dir/conf/reporting/elasticsearch/admin.key.pem
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False
```
where Elasticsearch_port is the Elasticsearch port number and data_dir is the DevOps Insight server data directory path.
Run the following command to verify that you can connect to Elasticsearch using the configuration file:
```
curator_cli --config curator-config.yml show_indices
```
The Elasticsearch indices created by CloudBees Flow begin with ef-. Most of the CloudBees Flow indices follow a time-based index naming scheme and use -yyyy as the suffix for the index name, where yyyy is the year associated with the record. For example, all deployments for the year 2018 are stored in the index named ef-deployment-2018.

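Following is a sample YAML action file to delete CloudBees Flow indices older than seven years. The period filter in this sample selects indices whose year suffix falls between eight and seven years before the current year, so the action should run at least once a year to catch each index as it ages out. You can increase the number of years for which to retain the old indices based on your data retention policies.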
```
actions:
  1:
    action: delete_indices
    description: >-
      Delete CloudBees Flow DevOps Insight indices older than 7 years
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: ef-
    - filtertype: period
      period_type: relative
      source: name
      range_from: -8
      range_to: -7
      timestring: '-%Y'
      unit: years
```
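
To reason about which index names a relative period of years would select, the following sketch mimics the filters in shell. It is an approximation of the prefix and period filters under the assumption that the period is evaluated in whole calendar years. It is not how Curator is implemented, and the function names are hypothetical.

```shell
# index_year INDEX_NAME
# Prints the trailing four-digit year of an index name, or nothing if there is none.
index_year() {
  printf '%s\n' "$1" | sed -n 's/.*-\([0-9][0-9][0-9][0-9]\)$/\1/p'
}

# in_period_window INDEX_NAME RANGE_FROM RANGE_TO CURRENT_YEAR
# Succeeds if the name starts with ef- and its year suffix lies between
# CURRENT_YEAR + RANGE_FROM and CURRENT_YEAR + RANGE_TO, inclusive.
in_period_window() {
  case "$1" in
    ef-*) ;;
    *) return 1 ;;
  esac
  year=$(index_year "$1")
  [ -n "$year" ] || return 1
  [ "$year" -ge "$(( $4 + $2 ))" ] && [ "$year" -le "$(( $4 + $3 ))" ]
}

# Example (not executed here), using the values from the sample action file:
#   in_period_window ef-deployment-2018 -8 -7 2025 && echo "would be selected"
```

The dry run described next remains the authoritative check of what Curator will actually delete.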
Run the following command to do a dry run using the configuration file and the action file:
```
curator --config curator-config.yml --dry-run curator-action.yml
```
The dry run lists the indices that would be deleted without actually deleting them.
Verify the dry run output.
Schedule the following curator command to run periodically so that old indices are deleted according to your YAML action file:
```
curator --config curator-config.yml curator-action.yml
```
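
For a scheduled job, the two commands above can be combined in a small wrapper that runs the dry run first and performs the deletion only if the dry run exits successfully. This is a sketch under assumptions: the function name run_cleanup, the CURATOR variable, and the cron line in the comment are hypothetical, and a successful dry run only shows that Curator accepted the files, so you should still review the dry run output yourself before scheduling the job.

```shell
CURATOR="${CURATOR:-curator}"   # overridable so the wrapper can be tried safely

# run_cleanup CONFIG_FILE ACTION_FILE
# Runs a dry run, then the real action if the dry run did not fail.
run_cleanup() {
  "$CURATOR" --config "$1" --dry-run "$2" || return 1
  "$CURATOR" --config "$1" "$2"
}

# Example (not executed here):
#   run_cleanup curator-config.yml curator-action.yml
#
# Hypothetical crontab entry that runs a script containing this wrapper
# at 02:00 on the first day of each month:
#   0 2 1 * * /path/to/insight-cleanup.sh
```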

Removing Incorrect DevOps Insight Elasticsearch Data

If incorrect data is loaded into the DevOps Insight server (for example, while building or testing a script meant to send reporting data to the DevOps Insight server), you can delete this data using these steps:

Identify the Elasticsearch index from which incorrect data needs to be deleted.

DevOps Insight server indices are named using the pattern ef-report-object-name-yyyy. For example, if you used the sendReportingData API to send the data to the DevOps Insight server in 2019 and the report object name was test, the corresponding index name would be ef-test-2019.

Back up the index before deleting any data in case something goes wrong and you need to restore the data.

Log in to the system running the DevOps Insight server.
Open a terminal window and change to the DevOps Insight server conf/reporting directory.

On Linux, the default path is:
```
/opt/electriccloud/electriccommander/conf/reporting
```

Run the following commands:

# Create backup index
curl -vk -XPUT 'https://127.0.0.1:Elasticsearch_port/backup-test' -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem

# Copy the data from the original index to the backup index
curl -vk -XPOST 'https://127.0.0.1:Elasticsearch_port/_reindex?pretty' -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "ef-test-2019"
  },
  "dest": {
    "index": "backup-test"
  }
}'
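
Before deleting anything, it is worth confirming that the backup index holds as many documents as the original. The sketch below compares the responses of the Elasticsearch _count API for the two indices. The function names are hypothetical, and the parsing assumes the usual response shape with a numeric "count" field. Newly copied documents may take a moment to become visible to _count.

```shell
# extract_count
# Reads a _count API response on stdin and prints the value of "count".
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# counts_match SOURCE_RESPONSE BACKUP_RESPONSE
# Succeeds only if both responses contain a count and the counts are equal.
counts_match() {
  src=$(printf '%s\n' "$1" | extract_count)
  dst=$(printf '%s\n' "$2" | extract_count)
  [ -n "$src" ] && [ "$src" = "$dst" ]
}

# Example (not executed here), run from the conf/reporting directory:
#   src=$(curl -sk 'https://127.0.0.1:Elasticsearch_port/ef-test-2019/_count' \
#         -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem)
#   dst=$(curl -sk 'https://127.0.0.1:Elasticsearch_port/backup-test/_count' \
#         -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem)
#   counts_match "$src" "$dst" && echo "backup is complete"
```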

Use the Elasticsearch _delete_by_query API to delete the data from the original index based on criteria that uniquely identify the data to be deleted.

For example, if data with a field named projectName and a value of motorbike-backend needs to be deleted, the following command deletes the documents matching that criterion in the index ef-test-2019:
```
curl -vk -XPOST "https://127.0.0.1:Elasticsearch_port/ef-test-2019/_delete_by_query?pretty" \
  -H 'Content-Type: application/json' -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem -d'
{
  "query": {
    "term": {
      "projectName": "motorbike-backend"
    }
  }
}
'
```
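
Because a delete cannot be undone without the backup, it can be useful to send the same query to the Elasticsearch _count API first and check how many documents match. The sketch below builds the term query body once so that the preview and the delete use identical criteria. The function name term_query is hypothetical, and the function does no JSON escaping, so it assumes that the field name and value contain no quotation marks or backslashes.

```shell
# term_query FIELD VALUE
# Prints a JSON request body containing a single term query.
term_query() {
  printf '{"query": {"term": {"%s": "%s"}}}\n' "$1" "$2"
}

# Example (not executed here), run from the conf/reporting directory:
#   body=$(term_query projectName motorbike-backend)
#   # Preview how many documents match:
#   curl -sk -XPOST 'https://127.0.0.1:Elasticsearch_port/ef-test-2019/_count?pretty' \
#     -H 'Content-Type: application/json' -d "$body" \
#     -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem
#   # Then delete with the same body:
#   curl -sk -XPOST 'https://127.0.0.1:Elasticsearch_port/ef-test-2019/_delete_by_query?pretty' \
#     -H 'Content-Type: application/json' -d "$body" \
#     -E elasticsearch/admin.crtfull.pem --key elasticsearch/admin.key.pem
```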