# Elasticsearch Backup and Restore

This guide provides detailed procedures for backing up and restoring Elasticsearch in on-premises Kubernetes PCA deployments.

## Prerequisites

  • kubectl access to the PCA cluster
  • SSH access to cluster nodes (for restoration)
  • Familiarity with OpenEBS local storage paths

## Backup Overview

Elasticsearch backups run automatically via a Kubernetes CronJob at 01:00 UTC daily. Backups are stored in MinIO at:

<bucket-name>/elasticsearch/v2/elasticsearch-Backup/

Verify the backup job exists:

kubectl get cronjobs -n pca | grep elasticsearch

## Manually Trigger a Backup

From any pod that has curl available (for example, the airflow pod):

kubectl exec -it -n pca airflow-0 -- sh
curl -vvv -f elasticsearch01:10005/backup

Expected output: Successfully ran Backup on elasticsearch

## Access Backups in MinIO

  1. Connect to the MinIO pod:

    kubectl exec -it -n pca pca-minio-pool-0-0 -- sh
    . /tmp/minio/config.env
    
  2. Configure the MinIO client:

    mc alias set --insecure pca https://localhost:9000 "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"
    
  3. List available backups:

    mc --insecure ls pca/<bucket-name>/elasticsearch/v2/elasticsearch-Backup/
    
  4. Copy a backup to the pod filesystem:

    mc --insecure cp -r \
      pca/<bucket-name>/elasticsearch/v2/elasticsearch-Backup/<backup-file>.tar.gz \
      /tmp/
    

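When several backups are listed, a small helper can pick the newest one instead of copying the name by hand. This is a sketch, not part of the product: it assumes backup object names embed a sortable timestamp (as the <backup-timestamp>.tar.gz naming below suggests) and that `mc ls` prints the object name as the last field of each line.

```shell
# Hypothetical helper: print the newest backup from `mc ls` output.
# Assumes filenames embed a sortable timestamp, so a lexical sort puts
# the newest entry last.
latest_backup() {
  awk '{ print $NF }' | grep '\.tar\.gz$' | sort | tail -n 1
}
```

Example use inside the MinIO pod: `mc --insecure ls pca/<bucket-name>/elasticsearch/v2/elasticsearch-Backup/ | latest_backup`.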
## Export Backup to Admin VM

The MinIO container image does not include tar, so stream the file with cat instead:

  1. Set the backup filename:

    export ES_BKUP="<backup-timestamp>.tar.gz"
    mkdir ES_RESTORE
    
  2. Copy the file:

    kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$ES_BKUP > ES_RESTORE/$ES_BKUP
    
  3. Verify the checksum:

    kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$ES_BKUP
    cksum ES_RESTORE/$ES_BKUP
    

    Checksums must match exactly.
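The comparison above can be scripted so a mismatch is caught mechanically. The helper below is a sketch: `cksum` prints "<checksum> <size> <path>", and since the paths differ between pod and admin VM, only the first two fields are compared.

```shell
# Hypothetical helper: compare two `cksum` output lines on checksum and
# size only, ignoring the (different) file paths in the third field.
cksums_match() {
  a=$(printf '%s' "$1" | awk '{ print $1, $2 }')
  b=$(printf '%s' "$2" | awk '{ print $1, $2 }')
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Example: capture `remote=$(kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$ES_BKUP)` and `local_copy=$(cksum ES_RESTORE/$ES_BKUP)`, then run `cksums_match "$remote" "$local_copy" || echo "checksum MISMATCH - re-copy the file"`.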

## Identify Storage Location

Embedded and air-gapped deployments use OpenEBS local storage. PVC data resides on the node filesystem.

  1. Identify the Elasticsearch pod's node and PVC:

    kubectl get pods -n pca -o wide | grep elasticsearch-0 | awk '{ print $7 }'
    kubectl get pvc -n pca | grep elasticsearch | awk '{ print $3 }'
    
  2. Set environment variables:

    export ES_HOST=<node-name>
    export ES_PVC=<pvc-id>
    
  3. Copy the backup to the target node:

    scp -r ./ES_RESTORE/ \
      ${ES_HOST}:/var/lib/embedded-cluster/openebs-local/$ES_PVC/
    

## Extract Backup on Target Node

On the node where the PVC resides:

export ES_PVC=<pvc-id>
export ES_BKUP="<backup-timestamp>.tar.gz"

cd /var/lib/embedded-cluster/openebs-local/$ES_PVC/ES_RESTORE
tar zxvf $ES_BKUP
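Before this archive replaces any live data, it is worth confirming it is intact. A minimal sketch: `tar tzf` lists the archive contents without extracting, so a truncated or corrupt copy fails here instead of mid-restore.

```shell
# Hypothetical helper: return success only if the gzipped tarball can be
# fully listed, i.e. it is not truncated or corrupt.
archive_ok() {
  tar tzf "$1" > /dev/null 2>&1
}

# Example (run in the ES_RESTORE directory before `tar zxvf`):
# archive_ok "$ES_BKUP" || echo "re-copy the backup before restoring"
```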

## Restore Procedure

Warning: This procedure deletes all existing Elasticsearch data. Ensure you have a valid backup before proceeding.

### Step 1: Stop Dependent Services

for deploy in gather alert-service es-exporter; do
  kubectl scale deployment -n pca $deploy --replicas=0
done
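Before touching any Elasticsearch data, confirm the dependents actually reached zero replicas. This is a sketch under the assumption that the three deployment names above are correct for your install; it parses the READY column ("ready/desired") of `kubectl get deployment` output.

```shell
# Hypothetical helper: succeed only if every input line reports "0/0" in
# the READY column (field 2 of `kubectl get deployment --no-headers`).
scaled_to_zero() {
  awk '$2 != "0/0" { bad = 1 } END { exit bad }'
}

# Example:
# kubectl get deployment -n pca gather alert-service es-exporter --no-headers \
#   | scaled_to_zero && echo "all dependent services stopped"
```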

### Step 2: Delete Existing Data

  1. Connect to the Elasticsearch pod:

    kubectl exec -it -n pca elasticsearch-0 -- sh
    
  2. Delete all indices:

    curl -X DELETE 'http://localhost:9200/_all'
    

    Expected output: {"acknowledged":true}

  3. Delete the snapshot repository:

    curl -X DELETE "localhost:9200/_snapshot/backup_repository?pretty"
    

    Expected output: { "acknowledged" : true }

  4. Exit the pod:

    exit
    
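Rather than eyeballing the responses, the acknowledgement can be checked in a script. A sketch, tolerant of the spacing difference between the two expected outputs shown above:

```shell
# Hypothetical helper: succeed only if the Elasticsearch response contains
# "acknowledged": true (with or without whitespace around the colon).
es_acknowledged() {
  printf '%s' "$1" | grep -q '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Example inside the pod:
# resp=$(curl -s -X DELETE 'http://localhost:9200/_all')
# es_acknowledged "$resp" || echo "delete was NOT acknowledged"
```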

### Step 3: Scale Down Elasticsearch

kubectl scale statefulset -n pca elasticsearch --replicas=0

### Step 4: Replace Data Directories

On the target node:

cd /var/lib/embedded-cluster/openebs-local/$ES_PVC
rm -rf backup nodes

cp -r ES_RESTORE/backup ./
chmod 775 backup
chown -R debian:admin backup

### Step 5: Start Elasticsearch

kubectl scale statefulset -n pca elasticsearch --replicas=1
kubectl logs -f -n pca elasticsearch-0

### Step 6: Configure Permissions and Repository

  1. Connect to the Elasticsearch pod:

    kubectl exec -it -n pca elasticsearch-0 -- sh
    
  2. Set directory permissions:

    chown -R elasticsearch:root /opt/backups
    chown elasticsearch:root /usr/share/elasticsearch/data
    chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data/nodes /usr/share/elasticsearch/data/backup
    
  3. Create the snapshot repository:

    curl -X PUT "127.0.0.1:9200/_snapshot/backup_repository?pretty" \
      -H 'Content-Type: application/json' \
      -d '{"type": "fs","settings": {"location": "/usr/share/elasticsearch/data/backup"}}'
    

    Expected output: { "acknowledged" : true }

  4. Verify the snapshot is available:

    curl -X GET "localhost:9200/_snapshot/backup_repository/*?verbose=false&pretty" | grep <backup-date>
    
  5. Restore from the snapshot:

    curl -X POST "localhost:9200/_snapshot/backup_repository/<snapshot-name>/_restore?pretty&wait_for_completion=true" \
      -H 'Content-Type: application/json' \
      -d '{
        "indices": "*",
        "include_global_state": true
      }'
    
  6. Exit the pod:

    exit
    
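Because the restore call uses wait_for_completion=true, its response includes a shards summary, and a clean restore reports zero failed shards. The helper below is a sketch that pulls the "failed" count out of that JSON for scripted checking.

```shell
# Hypothetical helper: print the first "failed" count found in a snapshot
# restore response; a healthy restore yields 0.
restore_failed_shards() {
  printf '%s' "$1" | sed -n 's/.*"failed"[[:space:]]*:[[:space:]]*\([0-9]*\).*/\1/p' | head -n 1
}

# Example inside the pod (resp captured from the _restore call):
# [ "$(restore_failed_shards "$resp")" = "0" ] || echo "some shards FAILED to restore"
```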

### Step 7: Start Dependent Services

for deploy in gather alert-service es-exporter; do
  kubectl scale deployment -n pca $deploy --replicas=1
done

### Step 8: Clean Up

Remove the temporary restore directory on the target node:

rm -rf ES_RESTORE

## Post-Restore Validation

  1. Verify Elasticsearch cluster health:

    kubectl exec -it -n pca elasticsearch-0 -- curl -s 'localhost:9200/_cluster/health?pretty'
    
  2. Confirm indices are present:

    kubectl exec -it -n pca elasticsearch-0 -- curl -s localhost:9200/_cat/indices
    
  3. Check that alerts and incidents are visible in the PCA UI
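The health check in step 1 can be scripted by extracting the "status" field from the cluster-health JSON. A sketch: a restored single-node cluster typically reports "yellow" or "green", while "red" means indices failed to recover.

```shell
# Hypothetical helper: print the cluster "status" value (green, yellow,
# or red) from a _cluster/health JSON response.
health_status() {
  printf '%s' "$1" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example:
# resp=$(kubectl exec -n pca elasticsearch-0 -- curl -s 'localhost:9200/_cluster/health')
# case "$(health_status "$resp")" in
#   green|yellow) echo "cluster healthy" ;;
#   *)            echo "cluster NOT healthy - check elasticsearch-0 logs" ;;
# esac
```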
