# Elasticsearch Backup and Restore

This guide provides detailed procedures for backing up and restoring Elasticsearch in on-premises Kubernetes PCA deployments.

## Prerequisites

  • kubectl access to the PCA cluster
  • SSH access to cluster nodes (for restoration)
  • Familiarity with OpenEBS local storage paths

## Backup Overview

Elasticsearch backups run automatically via a Kubernetes CronJob at 01:00 UTC daily. Backups are stored in MinIO at:

<bucket-name>/elasticsearch/v2/elasticsearch-Backup/

Verify the backup job exists:

kubectl get cronjobs -n pca | grep elasticsearch

## Manually Trigger a Backup

From any pod that has curl available (for example, the airflow pod):

kubectl exec -it -n pca airflow-0 -- sh
curl -vvv -f elasticsearch01:10005/backup

Expected output: Successfully ran Backup on elasticsearch

## Access Backups in MinIO

  1. Connect to the MinIO pod:

    kubectl exec -it -n pca pca-minio-pool-0-0 -- sh
    . /tmp/minio/config.env
    
  2. Configure the MinIO client:

    mc alias set --insecure pca https://localhost:9000 "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"
    
  3. List available backups:

    mc --insecure ls pca/<bucket-name>/elasticsearch/v2/elasticsearch-Backup/
    
  4. Copy a backup to the pod filesystem:

    mc --insecure cp -r \
      pca/<bucket-name>/elasticsearch/v2/elasticsearch-Backup/<backup-file>.tar.gz \
      /tmp/
    

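When several backups are listed, a small helper can pick the newest one instead of copying the name by hand. This is a sketch, not part of the product: it assumes backup object names embed a sortable timestamp (as the <backup-timestamp>.tar.gz naming below suggests) and that `mc ls` prints the object name as the last field of each line.

```shell
# Hypothetical helper: print the newest backup from `mc ls` output.
# Assumes filenames embed a sortable timestamp, so a lexical sort puts
# the newest entry last.
latest_backup() {
  awk '{ print $NF }' | grep '\.tar\.gz$' | sort | tail -n 1
}
```

Example use inside the MinIO pod: `mc --insecure ls pca/<bucket-name>/elasticsearch/v2/elasticsearch-Backup/ | latest_backup`.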
## Export Backup to Admin VM

The MinIO container image does not include tar, so stream the file with cat instead:

  1. Set the backup filename:

    export ES_BKUP="<backup-timestamp>.tar.gz"
    mkdir ES_RESTORE
    
  2. Copy the file:

    kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$ES_BKUP > ES_RESTORE/$ES_BKUP
    
  3. Verify the checksum:

    kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$ES_BKUP
    cksum ES_RESTORE/$ES_BKUP
    

    Checksums must match exactly.
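The comparison above can be scripted so a mismatch is caught mechanically. The helper below is a sketch: `cksum` prints "<checksum> <size> <path>", and since the paths differ between pod and admin VM, only the first two fields are compared.

```shell
# Hypothetical helper: compare two `cksum` output lines on checksum and
# size only, ignoring the (different) file paths in the third field.
cksums_match() {
  a=$(printf '%s' "$1" | awk '{ print $1, $2 }')
  b=$(printf '%s' "$2" | awk '{ print $1, $2 }')
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Example: capture `remote=$(kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$ES_BKUP)` and `local_copy=$(cksum ES_RESTORE/$ES_BKUP)`, then run `cksums_match "$remote" "$local_copy" || echo "checksum MISMATCH - re-copy the file"`.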

## Identify Storage Location

Embedded and air-gapped deployments use OpenEBS local storage. PVC data resides on the node filesystem.

  1. Identify the Elasticsearch pod's node and PVC:

    kubectl get pods -n pca -o wide | grep elasticsearch-0 | awk '{ print $7 }'
    kubectl get pvc -n pca | grep elasticsearch | awk '{ print $3 }'
    
  2. Set environment variables:

    export ES_HOST=<node-name>
    export ES_PVC=<pvc-id>
    
  3. Copy the backup to the target node:

    scp -r ./ES_RESTORE/ \
      ${ES_HOST}:/var/lib/embedded-cluster/openebs-local/$ES_PVC/
    

## Extract Backup on Target Node

On the node where the PVC resides:

export ES_PVC=<pvc-id>
export ES_BKUP="<backup-timestamp>.tar.gz"

cd /var/lib/embedded-cluster/openebs-local/$ES_PVC/ES_RESTORE
tar zxvf $ES_BKUP
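Before this archive replaces any live data, it is worth confirming it is intact. A minimal sketch: `tar tzf` lists the archive contents without extracting, so a truncated or corrupt copy fails here instead of mid-restore.

```shell
# Hypothetical helper: return success only if the gzipped tarball can be
# fully listed, i.e. it is not truncated or corrupt.
archive_ok() {
  tar tzf "$1" > /dev/null 2>&1
}

# Example (run in the ES_RESTORE directory before `tar zxvf`):
# archive_ok "$ES_BKUP" || echo "re-copy the backup before restoring"
```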

## Restore Procedure

Warning: This procedure deletes all existing Elasticsearch data. Ensure you have a valid backup before proceeding.

### Step 1: Stop Dependent Services

for deploy in gather alert-service es-exporter; do
  kubectl scale deployment -n pca $deploy --replicas=0
done
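Before touching any Elasticsearch data, confirm the dependents actually reached zero replicas. This is a sketch under the assumption that the three deployment names above are correct for your install; it parses the READY column ("ready/desired") of `kubectl get deployment` output.

```shell
# Hypothetical helper: succeed only if every input line reports "0/0" in
# the READY column (field 2 of `kubectl get deployment --no-headers`).
scaled_to_zero() {
  awk '$2 != "0/0" { bad = 1 } END { exit bad }'
}

# Example:
# kubectl get deployment -n pca gather alert-service es-exporter --no-headers \
#   | scaled_to_zero && echo "all dependent services stopped"
```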

### Step 2: Delete Existing Data

  1. Connect to the Elasticsearch pod:

    kubectl exec -it -n pca elasticsearch-0 -- sh
    
  2. Delete all indices:

    curl -X DELETE 'http://localhost:9200/_all'
    

    Expected output: {"acknowledged":true}

  3. Delete the snapshot repository:

    curl -X DELETE "localhost:9200/_snapshot/backup_repository?pretty"
    

    Expected output: { "acknowledged" : true }

  4. Exit the pod:

    exit
    
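Rather than eyeballing the responses, the acknowledgement can be checked in a script. A sketch, tolerant of the spacing difference between the two expected outputs shown above:

```shell
# Hypothetical helper: succeed only if the Elasticsearch response contains
# "acknowledged": true (with or without whitespace around the colon).
es_acknowledged() {
  printf '%s' "$1" | grep -q '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Example inside the pod:
# resp=$(curl -s -X DELETE 'http://localhost:9200/_all')
# es_acknowledged "$resp" || echo "delete was NOT acknowledged"
```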

### Step 3: Scale Down Elasticsearch

kubectl scale statefulset -n pca elasticsearch --replicas=0

### Step 4: Replace Data Directories

On the target node:

cd /var/lib/embedded-cluster/openebs-local/$ES_PVC
rm -rf backup nodes

cp -r ES_RESTORE/backup ./
chmod 775 backup
chown -R debian:admin backup

### Step 5: Start Elasticsearch

kubectl scale statefulset -n pca elasticsearch --replicas=1
kubectl logs -f -n pca elasticsearch-0

### Step 6: Configure Permissions and Repository

  1. Connect to the Elasticsearch pod:

    kubectl exec -it -n pca elasticsearch-0 -- sh
    
  2. Set directory permissions:

    chown -R elasticsearch:root /opt/backups
    chown elasticsearch:root /usr/share/elasticsearch/data
    chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/data/nodes /usr/share/elasticsearch/data/backup
    
  3. Create the snapshot repository:

    curl -X PUT "127.0.0.1:9200/_snapshot/backup_repository?pretty" \
      -H 'Content-Type: application/json' \
      -d '{"type": "fs","settings": {"location": "/usr/share/elasticsearch/data/backup"}}'
    

    Expected output: { "acknowledged" : true }

  4. Verify the snapshot is available:

    curl -X GET "localhost:9200/_snapshot/backup_repository/*?verbose=false&pretty" | grep <backup-date>
    
  5. Restore from the snapshot:

    curl -X POST "localhost:9200/_snapshot/backup_repository/<snapshot-name>/_restore?pretty&wait_for_completion=true" \
      -H 'Content-Type: application/json' \
      -d '{
        "indices": "*",
        "include_global_state": true
      }'
    
  6. Exit the pod:

    exit
    
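Because the restore call uses wait_for_completion=true, its response includes a shards summary, and a clean restore reports zero failed shards. The helper below is a sketch that pulls the "failed" count out of that JSON for scripted checking.

```shell
# Hypothetical helper: print the first "failed" count found in a snapshot
# restore response; a healthy restore yields 0.
restore_failed_shards() {
  printf '%s' "$1" | sed -n 's/.*"failed"[[:space:]]*:[[:space:]]*\([0-9]*\).*/\1/p' | head -n 1
}

# Example inside the pod (resp captured from the _restore call):
# [ "$(restore_failed_shards "$resp")" = "0" ] || echo "some shards FAILED to restore"
```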

### Step 7: Start Dependent Services

for deploy in gather alert-service es-exporter; do
  kubectl scale deployment -n pca $deploy --replicas=1
done

### Step 8: Clean Up

Remove the temporary restore directory on the target node:

rm -rf ES_RESTORE

## Post-Restore Validation

  1. Verify Elasticsearch cluster health:

    kubectl exec -it -n pca elasticsearch-0 -- curl -s 'localhost:9200/_cluster/health?pretty'
    
  2. Confirm indices are present:

    kubectl exec -it -n pca elasticsearch-0 -- curl -s localhost:9200/_cat/indices
    
  3. Check that alerts and incidents are visible in the PCA UI
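The health check in step 1 can be scripted by extracting the "status" field from the cluster-health JSON. A sketch: a restored single-node cluster typically reports "yellow" or "green", while "red" means indices failed to recover.

```shell
# Hypothetical helper: print the cluster "status" value (green, yellow,
# or red) from a _cluster/health JSON response.
health_status() {
  printf '%s' "$1" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example:
# resp=$(kubectl exec -n pca elasticsearch-0 -- curl -s 'localhost:9200/_cluster/health')
# case "$(health_status "$resp")" in
#   green|yellow) echo "cluster healthy" ;;
#   *)            echo "cluster NOT healthy - check elasticsearch-0 logs" ;;
# esac
```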
