Back Up On-Prem Deployment With Object Storage
Note
This document shows how to configure, back up, and restore a Chef Automate high availability deployment with object storage.
During deployment of Chef Automate, if you set backup_config = "object_storage" or backup_config = "file_system" in the Automate configuration TOML file, then backup is already configured and you don’t need to configure data backup for Chef Automate.
If a backup wasn’t configured during the initial deployment, then follow these instructions to configure it manually.
Chef Automate supports backing up data to the following platforms:
- S3 (AWS S3, MinIO, non-AWS S3)
- Google Cloud Storage (GCS)
Configure backup for S3
This section shows how to configure a Chef Automate high availability deployment to back up data to object storage on AWS S3, MinIO, or non-AWS S3.
Configure OpenSearch nodes
Add a secret key and access key for your S3 backup provider on every OpenSearch node.
Set the OpenSearch path configuration location.
export OPENSEARCH_PATH_CONF="/hab/svc/automate-ha-opensearch/config"
Add your S3 access and secret keys to the OpenSearch keystore.
hab pkg exec chef/automate-ha-opensearch opensearch-keystore add s3.client.default.access_key
hab pkg exec chef/automate-ha-opensearch opensearch-keystore add s3.client.default.secret_key
Change ownership of the keystore.
chown -RL hab:hab /hab/svc/automate-ha-opensearch/config/opensearch.keystore
Load the secure settings into the OpenSearch keystore.
curl -X POST "https://localhost:9200/_nodes/reload_secure_settings?pretty" --cacert /hab/svc/automate-ha-opensearch/config/certificates/root-ca.pem --key /hab/svc/automate-ha-opensearch/config/certificates/admin-key.pem --cert /hab/svc/automate-ha-opensearch/config/certificates/admin.pem -k
Repeat these steps on all OpenSearch nodes until they are all updated.
OpenSearch health check
Use the following commands on OpenSearch nodes to verify their health status.
Verify that the Habitat service is running.
hab svc status
Check the status of OpenSearch indices.
curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin
View logs of the Chef Habitat services.
journalctl -u hab-sup -f | grep 'automate-ha-opensearch'
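The index listing above does not summarize the overall cluster state. As a minimal sketch, the helper below extracts the status field from a response of the OpenSearch _cluster/health API. The function names cluster_status and is_cluster_green are hypothetical, and the admin:admin credentials in the usage comment only mirror the command shown above.

```sh
#!/bin/sh
# Print the "status" field (green, yellow, or red) from a
# _cluster/health JSON response read on stdin.
cluster_status() {
  # Collapse whitespace so pretty-printed and compact JSON parse alike.
  tr -d ' \n' | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p'
}

# Return 0 only when the reported status is green.
is_cluster_green() {
  [ "$(cluster_status)" = "green" ]
}

# Usage on an OpenSearch node (not executed here):
# curl -sk -u admin:admin "https://localhost:9200/_cluster/health" | cluster_status
```

A yellow status usually means some replica shards are unassigned, so you may want to investigate before taking a backup.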
Patch the Automate configuration
On the bastion host, update the S3 and OpenSearch configuration.
Before starting, make sure the frontend nodes and OpenSearch nodes have access to the object storage endpoint.
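One way to check that access is to attempt an HTTPS connection from each frontend and OpenSearch node. The following is a minimal sketch: the function name endpoint_reachable is hypothetical, and s3.example.com is a placeholder for your own endpoint. A successful connection only shows that the network path is open, not that your credentials or bucket permissions are valid.

```sh
#!/bin/sh
# Return 0 if an HTTPS connection to the given host completes within 10 seconds.
# curl is run without --fail, so an HTTP error status such as 403 still counts
# as reachable; only connection-level failures return non-zero.
endpoint_reachable() {
  curl --silent --output /dev/null --max-time 10 "https://$1"
}

# Usage (not executed here):
# endpoint_reachable s3.example.com && echo "endpoint reachable"
```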
Create a TOML file on the bastion host with the following settings.
[s3]
[s3.client.default]
protocol = "https"
read_timeout = "60s"
max_retries = "3"
use_throttle_retries = true
endpoint = "s3.example.com"
Replace the value of endpoint with the URL of your S3 storage endpoint.
Add the following content to the TOML file to configure OpenSearch.
[global.v1]
[global.v1.external.opensearch.backup]
enable = true
location = "s3"
[global.v1.external.opensearch.backup.s3]
# bucket (required): The name of the bucket
bucket = "<BUCKET_NAME>"
# base_path (optional): The path within the bucket where backups should be stored
# If base_path is not set, backups will be stored at the root of the bucket.
base_path = "opensearch"
# name of an s3 client configuration you create in your opensearch.yml
# see https://www.open.co/guide/en/opensearch/plugins/current/repository-s3-client.html
# for full documentation on how to configure client settings on your
# OpenSearch nodes
client = "default"
[global.v1.external.opensearch.backup.s3.settings]
## The meaning of these settings is documented in the S3 Repository Plugin
## documentation. See the following links:
## https://www.open.co/guide/en/opensearch/plugins/current/repository-s3-repository.html
## Backup repo settings
# compress = false
# server_side_encryption = false
# buffer_size = "100mb"
# canned_acl = "private"
# storage_class = "standard"
## Snapshot settings
# max_snapshot_bytes_per_sec = "40mb"
# max_restore_bytes_per_sec = "40mb"
# chunk_size = "null"
## S3 client settings
# read_timeout = "50s"
# max_retries = 3
# use_throttle_retries = true
# protocol = "https"
[global.v1.backups]
location = "s3"
[global.v1.backups.s3.bucket]
# name (required): The name of the bucket
name = "<BUCKET_NAME>"
# endpoint (required): The endpoint for the region the bucket lives in for Automate Version 3.x.y
# endpoint (required): For Automate Version 4.x.y, use this https://s3.amazonaws.com
endpoint = "<OBJECT_STORAGE_URL>"
# base_path (optional): The path within the bucket where backups should be stored
# If base_path is not set, backups will be stored at the root of the bucket.
base_path = "automate"
[global.v1.backups.s3.credentials]
access_key = "<ACCESS_KEY>"
secret_key = "<SECRET_KEY>"
Use the patch subcommand to patch the Automate configuration.
./chef-automate config patch --frontend /PATH/TO/FILE_NAME.TOML
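Because the template uses placeholders such as <BUCKET_NAME> and <ACCESS_KEY>, it can help to confirm that none remain before patching. The following is a minimal sketch; the function name check_placeholders is hypothetical, and it assumes placeholders are written as upper-case names in angle brackets, as they are in the template above.

```sh
#!/bin/sh
# Print every line (with its line number) that still contains an
# <UPPER_CASE> placeholder. Return 1 if any are found, 0 otherwise.
check_placeholders() {
  if grep -n '<[A-Z_][A-Z_]*>' "$1"; then
    return 1
  fi
  return 0
}

# Usage on the bastion host (not executed here):
# check_placeholders /PATH/TO/FILE_NAME.TOML && ./chef-automate config patch --frontend /PATH/TO/FILE_NAME.TOML
```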
Configure backup on Google Cloud Storage
This section shows how to configure a Chef Automate high availability deployment to back up data to object storage on Google Cloud Storage (GCS).
Configure OpenSearch nodes
Add a GCS service account file that gives access to the GCS bucket to every OpenSearch node.
Log in to an OpenSearch node and set the OpenSearch path and GCS service account file locations.
export OPENSEARCH_PATH_CONF="/hab/svc/automate-ha-opensearch/config"
export GCS_SERVICE_ACCOUNT_JSON_FILE_PATH="/PATH/TO/GOOGLESERVICEACCOUNT.JSON"
Change ownership of the GCS service account file.
chown -RL hab:hab $GCS_SERVICE_ACCOUNT_JSON_FILE_PATH
Add the GCS service account file to OpenSearch.
hab pkg exec chef/automate-ha-opensearch opensearch-keystore add-file --force gcs.client.default.credentials_file $GCS_SERVICE_ACCOUNT_JSON_FILE_PATH
Change ownership of the keystore.
chown -RL hab:hab /hab/svc/automate-ha-opensearch/config/opensearch.keystore
Load the secure settings into the OpenSearch keystore.
curl -X POST "https://localhost:9200/_nodes/reload_secure_settings?pretty" --cacert /hab/svc/automate-ha-opensearch/config/certificates/root-ca.pem --key /hab/svc/automate-ha-opensearch/config/certificates/admin-key.pem --cert /hab/svc/automate-ha-opensearch/config/certificates/admin.pem -k
Repeat these steps on all OpenSearch nodes until they are all updated.
After updating all nodes, the above curl command will return an output similar to this:
{
  "_nodes": {
    "total": 3,
    "successful": 3,
    "failed": 0
  },
  "cluster_name": "chef-insights",
  "nodes": {
    "lenRTrZ1QS2uv_vJIwL-kQ": {
      "name": "lenRTrZ"
    },
    "Us5iBo4_RoaeojySjWpr9A": {
      "name": "Us5iBo4"
    },
    "qtz7KseqSlGm2lEm0BiUEg": {
      "name": "qtz7Kse"
    }
  }
}
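In the response above, the failed count under _nodes shows whether any node failed to reload its secure settings. As a minimal sketch, the helper below extracts that count so a script can stop when it is not zero. The function name failed_nodes is hypothetical, and the parsing assumes the response has the shape shown above.

```sh
#!/bin/sh
# Read a reload_secure_settings response on stdin and print the
# value of the "failed" field from its "_nodes" summary.
failed_nodes() {
  # Collapse whitespace so pretty-printed and compact JSON parse alike.
  tr -d ' \n' | sed -n 's/.*"failed":\([0-9]*\).*/\1/p'
}

# Usage (not executed here): pipe the output of the reload request into the helper.
# [ "$(curl -s ... | failed_nodes)" = "0" ] || echo "some nodes failed to reload"
```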
OpenSearch health check
Use the following commands on OpenSearch nodes to verify their health status.
Verify that the Habitat service is running.
hab svc status
Check the status of OpenSearch indices.
curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin
View logs of the Chef Habitat services.
journalctl -u hab-sup -f | grep 'automate-ha-opensearch'
Patch the Automate configuration
On the bastion host, update the OpenSearch configuration.
Before starting, make sure the frontend nodes and OpenSearch nodes have access to the object storage endpoint.
Create a TOML file on the bastion host with the following settings.
[global.v1]
[global.v1.external.opensearch.backup]
enable = true
location = "gcs"
[global.v1.external.opensearch.backup.gcs]
# bucket (required): The name of the bucket
bucket = "bucket-name"
# base_path (optional): The path within the bucket where backups should be stored
# If base_path is not set, backups will be stored at the root of the bucket.
base_path = "opensearch"
client = "default"
[global.v1.backups]
location = "gcs"
[global.v1.backups.gcs.bucket]
# name (required): The name of the bucket
name = "bucket-name"
# endpoint = ""
# base_path (optional): The path within the bucket where backups should be stored
# If base_path is not set, backups will be stored at the root of the bucket.
base_path = "automate"
[global.v1.backups.gcs.credentials]
json = '''{
  "type": "service_account",
  "project_id": "chef-automate-ha",
  "private_key_id": "7b1e77baec247a22a9b3****************f",
  "private_key": "<PRIVATE KEY>",
  "client_email": "myemail@chef.iam.gserviceaccount.com",
  "client_id": "1******************1",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/myemail@chef.iam.gserviceaccount.com",
  "universe_domain": "googleapis.com"
}'''
Patch the Automate configuration to trigger the deployment.
./chef-automate config patch --frontend /PATH/TO/FILE_NAME.TOML
Backup and Restore
Backup
To create a backup, run the backup command from the bastion host.
chef-automate backup create
Restore
Restore a backup from external object storage.
Check the status of the Automate HA cluster from the bastion host.
chef-automate status
Restore the backup by running the restore command from the bastion host.
For S3:
chef-automate backup restore s3://BUCKET_NAME/PATH/TO/BACKUPS/BACKUP_ID --skip-preflight --s3-access-key "ACCESS_KEY" --s3-secret-key "SECRET_KEY"
For GCS:
chef-automate backup restore gs://BUCKET_NAME/PATH/TO/BACKUPS/BACKUP_ID --gcs-credentials-path "PATH/TO/GOOGLE_SERVICE_ACCOUNT.JSON"
In an airgapped environment:
chef-automate backup restore <OBJECT-STORAGE-BUCKET-PATH>/BACKUPS/BACKUP_ID --skip-preflight --airgap-bundle </PATH/TO/BUNDLE>
Note
- If you are restoring a backup from an older version, you need to provide the --airgap-bundle </path/to/current/bundle> flag.
- Large Compliance Report is not supported in Automate HA.
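The restore commands above take a location of the form SCHEME://BUCKET_NAME/PATH/TO/BACKUPS/BACKUP_ID. As a minimal sketch, the helper below composes that location from its parts so that stray slashes in the base path do not produce a malformed path. The function name restore_uri is hypothetical, and the bucket, path, and backup ID in the usage comment are made-up values.

```sh
#!/bin/sh
# Compose a restore location from a scheme (s3 or gs), a bucket name,
# a base path (may be empty), and a backup ID.
restore_uri() {
  scheme=$1
  bucket=$2
  backup_id=$4
  # Strip leading and trailing slashes so the parts join with exactly one "/".
  base_path=$(printf '%s' "$3" | sed 's#^/*##; s#/*$##')
  if [ -n "$base_path" ]; then
    printf '%s://%s/%s/%s\n' "$scheme" "$bucket" "$base_path" "$backup_id"
  else
    printf '%s://%s/%s\n' "$scheme" "$bucket" "$backup_id"
  fi
}

# Usage (not executed here; all values are placeholders):
# chef-automate backup restore "$(restore_uri s3 my-bucket automate 20240101120000)" --skip-preflight --s3-access-key "ACCESS_KEY" --s3-secret-key "SECRET_KEY"
```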
Troubleshooting
Try these steps if Chef Automate returns an error while restoring data.
Check the Chef Automate status.
chef-automate status
Check the status of your Habitat service on the Automate node.
hab svc status
If the deployment services are not healthy, reload them.
hab svc load chef/deployment-service
Now check the status of the Automate node and then try running the restore command from the bastion host.
How to change the base_path or path
The steps for a file system backup are as follows:
- At deployment, the default value of backup_mount is /mnt/automate_backups.
- If you modify backup_mount in config.toml before deployment, the deployment process applies the updated value.
- If you change the backup_mount value after deployment, you must patch the configuration manually on all frontend and backend nodes. The following example assumes that backup_mount was changed to /bkp/backps.
Update the frontend nodes using the following command and template (saved here as fe.toml):
chef-automate config patch fe.toml --fe
[global.v1.backups]
[global.v1.backups.filesystem]
path = "/bkp/backps"
[global.v1.external.opensearch.backup]
[global.v1.external.opensearch.backup.fs]
path = "/bkp/backps"
Update the OpenSearch nodes using the following command and template (saved here as os.toml):
chef-automate config patch os.toml --os
[path]
repo = "/bkp/backps"
Run the following curl request on one of the Automate frontend nodes:
curl "localhost:10144/_snapshot?pretty"
If the response is empty ({}), no further action is needed.
If the response contains JSON output, check the location value of each entry. It should start with the backup_mount value, which is /bkp/backps in this example.
{
  "chef-automate-es6-event-feed-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service"
    }
  },
  "chef-automate-es6-compliance-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service"
    }
  },
  "chef-automate-es6-ingest-service" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service"
    }
  },
  "chef-automate-es6-automate-cs-oc-erchef" : {
    "type" : "fs",
    "settings" : {
      "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef"
    }
  }
}
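Checking every location value by eye is error-prone when several snapshot repositories are registered. As a minimal sketch, the helper below reads a _snapshot response and prints only the locations that do not start with the expected mount path. The function name mismatched_locations is hypothetical, and the sketch relies on grep -o, which GNU and BSD grep provide.

```sh
#!/bin/sh
# Read a _snapshot response on stdin and print each "location" value
# that does not begin with the expected backup mount path ($1).
mismatched_locations() {
  expected=$1
  grep -o '"location" *: *"[^"]*"' |
    sed 's/.*: *"\(.*\)"/\1/' |
    awk -v p="$expected" 'index($0, p) != 1 { print }'
}

# Usage on an Automate frontend node (not executed here):
# curl -s "localhost:10144/_snapshot?pretty" | mismatched_locations /bkp/backps
```

Empty output means every repository already points at the expected path.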
- If the prefix of the location value does not match backup_mount, you need to delete the existing snapshots. Use the following script on one of the Automate frontend nodes to delete them.
# List the snapshot repository names (jq -r prints them without quotes).
snapshot=$(curl -s -XGET "http://localhost:10144/_snapshot?pretty" | jq -r 'keys[]')
# Delete each snapshot repository in turn.
for name in $snapshot; do
  curl -XDELETE "localhost:10144/_snapshot/$name?pretty"
done
- The script above requires jq to be installed. You can install it from the airgap bundle. Run the following command on one of the Automate frontend nodes to locate the jq package.
ls -ltrh /hab/cache/artifacts/ | grep jq
-rw-r--r--. 1 ec2-user ec2-user 730K Dec 8 08:53 core-jq-static-1.6-20220312062012-x86_64-linux.hart
-rw-r--r--. 1 ec2-user ec2-user 730K Dec 8 08:55 core-jq-static-1.6-20190703002933-x86_64-linux.hart
- If multiple jq versions are present, install the latest one. Use the following command to install the jq package on the Automate frontend node.
hab pkg install /hab/cache/artifacts/core-jq-static-1.6-20190703002933-x86_64-linux.hart -bf
Steps for object storage as a backup option
- At deployment, backup_config is set to object_storage.
- To use object_storage, the following template is used at deployment:
[object_storage.config]
google_service_account_file = ""
location = ""
bucket_name = ""
access_key = ""
secret_key = ""
endpoint = ""
region = ""
- If you configured this before deployment, no further action is needed.
- If you want to change the bucket or base_path, use the following template for the frontend nodes:
[global.v1]
[global.v1.external.opensearch.backup.s3]
bucket = "<BUCKET_NAME>"
base_path = "opensearch"
[global.v1.backups.s3.bucket]
name = "<BUCKET_NAME>"
base_path = "automate"
- You can choose any value for base_path. The base_path patch is only required for the frontend nodes.
- Use the following command to apply the template above:
chef-automate config patch frontend.toml --fe
- After patching the configuration, use the following curl request to validate it:
curl "localhost:10144/_snapshot?pretty"
- If the response is empty (`{}`), no further action is needed.
- If the response contains JSON output, it should have the correct value for `base_path`:
```json
{
  "chef-automate-es6-event-feed-service" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service",
      "readonly" : "false",
      "compress" : "false"
    }
  },
  "chef-automate-es6-compliance-service" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service",
      "readonly" : "false",
      "compress" : "false"
    }
  },
  "chef-automate-es6-ingest-service" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service",
      "readonly" : "false",
      "compress" : "false"
    }
  },
  "chef-automate-es6-automate-cs-oc-erchef" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "MY-BUCKET",
      "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef",
      "readonly" : "false",
      "compress" : "false"
    }
  }
}
```
- If the `base_path` value does not match, you need to delete the existing snapshots. Refer to the deletion steps in the file system section.