In CloudBees CD/RO v2024.06.0, CloudBees Analytics was upgraded from Elasticsearch to OpenSearch. The data formats of these search engines are not fully compatible; to preserve data from your legacy CloudBees Analytics servers, you must migrate it to an updated server. In the following content, these servers are referenced as:

- Source server: A legacy CloudBees Analytics server using Elasticsearch.
- Destination server: An updated CloudBees Analytics server using OpenSearch.

Before starting, review Kubernetes migrations overview.
Kubernetes migrations overview
The following information provides an overview of Kubernetes migrations:

- Before migrating your data, back up the legacy CloudBees Analytics data. For more information on this process, refer to Maintain CloudBees Analytics server data on Kubernetes. Failing to back up the legacy CloudBees Analytics data could result in permanent data loss if issues arise during data migration.
- Starting in v2024.06.0, separate CloudBees Analytics services are included: `flow-devopsinsight`, using Elasticsearch, and `flow-analytics`, using OpenSearch. Although both CloudBees Analytics services are included, only `flow-analytics` can communicate with other CloudBees CD/RO services.
- Before performing the Migrate data with CloudBees CD/RO procedure, ensure you have copied any custom settings from the `dois` chart to the `analytics` chart and deployed them to your environment. Also ensure an authentication method is configured for `analytics`. For more information, refer to Update CloudBees Analytics authentication methods.
- If you have not already done so, in your updated values file, change `analytics.autoRegister: true` to `analytics.autoRegister: false`. Setting `analytics.autoRegister: false` in your values file prevents the CloudBees Analytics server configuration from being created on the CloudBees Software Delivery Automation server. This is critical to prevent unexpected issues while migrating CloudBees Analytics data from Elasticsearch to OpenSearch.
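As a sketch, the corresponding values-file fragment (assuming the default chart key layout) looks like:

```yaml
# Prevents the CloudBees Analytics server configuration from being
# created during migration; set back to true after migration completes.
analytics:
  autoRegister: false
```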
- The following is an explanation of using the Migrate data with CloudBees CD/RO procedure for Kubernetes migrations:
  - Using this procedure transfers your data between the two services using their URL endpoints. By default, the URL endpoints are:
    - Source URL: `https://flow-devopsinsight.<namespace>:9200`
    - Destination URL: `https://flow-analytics.<namespace>:9201`
  - For both CloudBees Analytics instances, select Runtime credential.
  - For the Runtime credential, use `reportuser` as the username for both CloudBees Analytics instances.
  - To retrieve the `reportuser` password, first get the `<secret-name>` for the following commands by running:

    ```shell
    kubectl get secrets --namespace <namespace>
    ```

    - For the source server (`flow-devopsinsight`) password, run:

      ```shell
      kubectl get secret --namespace <namespace> <dois-secret-name> -o jsonpath="{.data.CBF_DOIS_PASSWORD}" | base64 --decode; echo
      ```

    - For the destination server (`flow-analytics`) password, run:

      ```shell
      kubectl get secret --namespace <namespace> <analytics-secret-name> -o jsonpath="{.data.CBF_ANALYTICS_PASSWORD}" | base64 --decode; echo
      ```
- (Optional) To avoid resource overhead, CloudBees recommends disabling the legacy `dois` service after you complete migration and confirm your updated CloudBees Analytics server is operating as expected. For more information, refer to Disable legacy server after migration.
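Values stored in Kubernetes secrets are base64-encoded, which is why the commands above pipe through `base64 --decode`. A standalone sketch of the decode step (the encoded value below is a made-up example, not a real password):

```shell
# Kubernetes secrets store data base64-encoded; kubectl's jsonpath output
# must be decoded before use. Example payload only:
encoded="cmVwb3J0dXNlci1wYXNz"
printf '%s' "$encoded" | base64 --decode; echo    # prints: reportuser-pass
```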
Migrate data with CloudBees CD/RO procedure
The EC-Utilities project comes with a Reindex Analytics Data procedure that can be used to migrate your data from Elasticsearch to OpenSearch. This procedure copies data from the source server to the destination server.
To migrate your CloudBees Analytics from Elasticsearch to OpenSearch using the Reindex Analytics Data procedure:
Before running the Reindex Analytics Data procedure, review Kubernetes migrations overview.
- In CloudBees CD/RO, navigate to the procedures list.
- For the project, in the filtering options, change from All projects to EC-Utilities.
- Select the Run icon for the Reindex Analytics Data procedure, and then select New run.
- Provide the following data:
  The section Kubernetes migrations overview provides specific information on the values required for the Reindex Analytics Data procedure parameters.

  Table 1. Reindex Analytics Data procedure parameters

  - Source URL: Specify the URL for the data source in the format `<protocol>://<hostname>:<portnumber>`.
  - Source Credential: Specify the username and password on the data source server for reindexing:
    - Username: `reportuser`
    - For the `reportuser` password, refer to Runtime credential.
  - Destination URL: Specify the URL for the destination server in the format `<protocol>://<hostname>:<portnumber>`.
  - Destination Credential: Specify the username and password on the destination server for reindexing:
    - Username: `reportuser`
    - For the `reportuser` password, refer to Runtime credential.
  - Allow Mismatched Indices: This setting controls the behavior if indices on the source and destination have the same name:
    - If selected, indices with the same name are automatically handled as described in Handling of indexes with the same name.
    - If unselected, and indices with the same name are encountered during migration, the procedure terminates with an error.
  - Debug: Specify the verbosity level of debug messages. The levels are:
    - `-2`: Critical
    - `-1`: Error
    - `0`: Info (default)
    - `1`: Debug
    - `2`: Trace
- Select OK to start the migration.

After you start the procedure, the provided parameters are checked and the data copying process starts. The jobstep log shows the progress and result of this copying. For more information, refer to Data migration log example.
Data migration log example
When you run the CloudBees CD/RO procedure, the log is similar to the following CloudBees Analytics example data migration log:

```
Checking available indices from the source server...
[ 1/ 22] Checking the index 'ef-build-2020' ...
[ 2/ 22] Checking the index 'ef-build-2021' ...
...
[ 21/ 22] Checking the index 'ef-pipelinerun-2023' ...
[ 22/ 22] Checking the index 'ef-release' ...
The source server contains 22 indices with 260,000 documents.
[ 1/ 22] Transferring the index 'ef-build-2020' with 2 documents...
Done 2 documents in 882 msecs. Created: 2; Updated: 0; Deleted: 0; Batches: 1; Conflicts: 0; Noops: 0
Verifying the index 'ef-build-2020' in the destination server...
The resulting index 'ef-build-2020' on the destination server contains 2 documents.
[ 2/ 22] Transferring the index 'ef-build-2021' with 21 documents...
Done 21 documents in 549 msecs. Created: 21; Updated: 0; Deleted: 0; Batches: 1; Conflicts: 0; Noops: 0
Verifying the index 'ef-build-2021' in the destination server...
The resulting index 'ef-build-2021' on the destination server contains 21 documents.
.....
[ 21/ 22] Transferring the index 'ef-pipelinerun-2023' with 50,061 documents...
Done 50,061 documents in 13 secs 609 msecs. Created: 50,061; Updated: 0; Deleted: 0; Batches: 51; Conflicts: 0; Noops: 0
Verifying the index 'ef-pipelinerun-2023' in the destination server...
The resulting index 'ef-pipelinerun-2023' on the destination server contains 50,061 documents.
[ 22/ 22] Transferring the index 'ef-release' with 66 documents...
Done 66 documents in 340 msecs. Created: 66; Updated: 0; Deleted: 0; Batches: 1; Conflicts: 0; Noops: 0
Verifying the index 'ef-release' in the destination server...
The resulting index 'ef-release' on the destination server contains 66 documents.
Reindexing has been successfully completed. Processed 22 indices and 260,000 documents in 1 min 33 secs.
```
To view the data migration log, open the jobstep log for the Reindex Analytics Data job.

In the first step, the Reindex Wizard checks the available indices on the source server and displays basic statistics:

```
Checking available indices from the source server...
[ 1/ 22] Checking the index 'ef-build-2020' ...
[ 2/ 22] Checking the index 'ef-build-2021' ...
...
[ 21/ 22] Checking the index 'ef-pipelinerun-2023' ...
[ 22/ 22] Checking the index 'ef-release' ...
```

In the second stage, each index is copied individually from the source to the destination server. The progress is shown similar to:

```
[ 1/ 22] Transferring the index 'ef-build-2020' with 2 documents...
Done 2 documents in 882 msecs. Created: 2; Updated: 0; Deleted: 0; Batches: 1; Conflicts: 0; Noops: 0
Verifying the index 'ef-build-2020' in the destination server...
The resulting index 'ef-build-2020' on the destination server contains 2 documents.
```

The third step outputs the result of the migration, which includes the total number of transferred indices and documents. This is similar to:

```
Reindexing has been successfully completed. Processed 22 indices and 260,000 documents in 1 min 33 secs.
```

In this example, 22 indices with 260,000 documents were detected on the source server and migrated to the destination server.
Enable the CloudBees Analytics configuration after migration
After migrating your data from `flow-devopsinsight` to `flow-analytics`, and completing Disable legacy server after migration, you must update your deployment with the CloudBees Analytics server configuration for the CloudBees Software Delivery Automation server.

To enable the CloudBees Analytics server, in your values file, set `analytics.autoRegister: true`, and rerun your `helm upgrade` command.

If you have previously changed your CloudBees CD/RO administrator password in the UI, ensure your values file supplies the current administrator password. If you fail to do this, this step fails silently, and once you log into CloudBees CD/RO, CloudBees Analytics is still disabled. To fix this issue, update your values file with the correct value, and rerun your `helm upgrade` command.
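As a minimal sketch, the values-file change for this step (assuming the default chart key layout) is:

```yaml
# Registers the CloudBees Analytics server configuration on the next
# helm upgrade; only set this after the data migration has completed.
analytics:
  autoRegister: true
```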
Your CloudBees Software Delivery Automation server is now configured with the CloudBees Analytics server configuration.
Disable legacy server after migration
Both the `flow-devopsinsight` and `flow-analytics` services must be running to complete the Migrate data with CloudBees CD/RO procedure. However, after you have completed the procedure, and confirmed the updated `flow-analytics` service is operating as expected, `flow-devopsinsight` is no longer required.

Although it is optional, to avoid resource overhead, CloudBees recommends disabling the legacy `dois` workload to stop the `flow-devopsinsight` service. To do so, either:
If you have not yet created a backup of your legacy CloudBees Analytics data, do so before disabling the `flow-devopsinsight` service. For more information, refer to Maintain CloudBees Analytics server data on Kubernetes.
- From the command line, rerun your `helm upgrade` command and include `--set dois.enabled=false`. On your next `helm upgrade`, this is overwritten if `dois.enabled: false` has not been updated in your values file.
- In your values file, set `dois.enabled: false`, and rerun your `helm upgrade` command.
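For the second option, the values-file fragment (assuming the default chart key layout) looks like:

```yaml
# Stops the legacy flow-devopsinsight service on the next helm upgrade.
dois:
  enabled: false
```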
You have now disabled the legacy `flow-devopsinsight` service. Next, follow the instructions in Enable the CloudBees Analytics configuration after migration.
Known issues for data migration
This section provides information about known issues you may encounter while migrating data from Elasticsearch to OpenSearch.
Increased disk space requirements for data migration
During the migration from Elasticsearch to OpenSearch, disk space requirements may need to be increased. This is caused by the simultaneous existence of indexes for both the legacy and updated CloudBees Analytics instances.
This issue typically only applies to:
To roughly calculate the space needed during migration, CloudBees provides a utility. To use this utility:

- Navigate to the CloudBees examples repository.
- Download the `reporting-data-reindex.pl` utility.
- Follow the instructions in the README.md.
- Based on the value returned for Indices Size, totaled for all nodes, double the disk space.
  - Example: If the total returned for all nodes was `20GB`, then an additional `20GB` is required only for the migration. After the migration is completed, you can return the disk space to the desired level.
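The doubling rule above can be sketched as simple shell arithmetic (the `20` below is a hypothetical Indices Size total, not a measured value):

```shell
# Hypothetical total "Indices Size" across all nodes, in GB, as reported
# by the reporting-data-reindex.pl utility.
indices_size_gb=20

# The same amount again is needed temporarily while both copies exist.
extra_gb=$indices_size_gb
echo "Additional disk space needed during migration: ${extra_gb} GB"
echo "Provision at least: $((indices_size_gb + extra_gb)) GB total for indices"
```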
Timeouts reached when migrating large indexes
The migration options provided by CloudBees have a timeout of 180 minutes per index to avoid unexpected hangs. If an index contains a considerably large amount of data and its migration does not complete within the timeout duration, the migration process fails.

This may mean you have to split such indexes into multiple smaller indexes. If you encounter multiple timeout issues, contact CloudBees support.
To calculate the size of indexes, use the `reporting-data-reindex.pl` utility, as described in Increased disk space requirements for data migration.
Handling of indexes with the same name
During reindexing, there are several scenarios that can occur:
- An index is copied from the source server, and no index on the destination server has the same name. A new index is created on the destination server with the same settings as on the source server, and the data is copied to it.
- An index is copied from the source server, and an index on the destination server has the same name with the same settings. The index on the destination server is updated to include any new data from the source server.
- An index is copied from the source server, and an index on the destination server has the same name, but with different settings. When this occurs:
  - The existing index on the destination server is backed up as a new index with a new name using the scheme `ef-reindex_backup-<timestamp>-<index name>`.
  - A new index is created with the name and settings from the source server.
  - An entry for each such event appears in the job log, similar to:
```
[ 6/ 22] Transferring the index 'ef-defect-2021' with 175 documents...
Properties with mismatched types were found in the destination index. This index will be saved under a different name.
Renaming the existing index 'ef-defect-2021' on the destination server to the new name 'ef-reindex_backup-20240510130805-defect-2021'...
Done 175 documents in 186 msecs. Created: 138; Updated: 37; Deleted: 0; Batches: 1; Conflicts: 0; Noops: 0
Verifying the index 'ef-defect-2021' in the destination server...
The resulting index 'ef-defect-2021' on the destination server contains 138 documents.
```
In this case:

- Both the source and destination server have an index named `ef-defect-2021`.
- Different settings are detected for the index on each server.
- The existing `ef-defect-2021` index on the destination server is backed up as `ef-reindex_backup-20240510130805-defect-2021`.
- A new `ef-defect-2021` index is created on the destination server with the data and settings from the source server.
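The backup naming scheme can be sketched in shell (the timestamp format `YYYYMMDDHHMMSS` is inferred from the example log; in practice the procedure generates it):

```shell
# Sketch of the ef-reindex_backup-<timestamp>-<index name> scheme.
index="ef-defect-2021"
timestamp="20240510130805"            # e.g. "$(date +%Y%m%d%H%M%S)"
# The example log shows the 'ef-' prefix stripped from the index name:
backup="ef-reindex_backup-${timestamp}-${index#ef-}"
echo "$backup"    # ef-reindex_backup-20240510130805-defect-2021
```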
Handling of removed or deprecated Elasticsearch query syntax
With the upgrade from Elasticsearch to OpenSearch, query DSL changes for all default CloudBees reports that used Elasticsearch query DSL syntax are handled automatically. This includes the following changes:

- Replacing the deprecated field `[inline]` with `[source]` in the `[script]` section.
- Replacing the deprecated field `[interval]` with `[calendar_interval]` in the `[date_histogram]` section.
- Replacing the deprecated order key `[_term]` with `[_key]` in the aggregation section.
- Replacing `["field": "_type"]` with `["script", "_doc"]`, because the `"_type"` field was removed.
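For illustration only, here is a hypothetical `date_histogram` aggregation (aggregation and field names invented) in the updated form; a legacy Elasticsearch report would have used `"interval": "month"` where the rewritten query uses `"calendar_interval"`:

```json
{
  "aggs": {
    "builds_per_month": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "month"
      }
    }
  }
}
```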
The changes described above are also automatically handled within custom reports.
However, for custom reports that use other query constructs, this upgrade may introduce breaking changes caused by deprecated or removed Elasticsearch fields. As a result, such queries must be updated with DSL syntax that is compatible with OpenSearch. For more information, refer to the OpenSearch Query DSL documentation.
For any other breaking changes that may impact your custom reports, refer to the OpenSearch v2.14 breaking changes documentation.