
PostgreSQL Backup and Restore


This guide provides detailed procedures for backup and restore of PostgreSQL in on-premises Kubernetes Provider Connectivity Assurance deployments.

Note: This procedure applies to both postgres and yang-postgres instances.

Prerequisites

  • kubectl access to the Provider Connectivity Assurance cluster

  • SSH access to cluster nodes (for restore)

  • Familiarity with OpenEBS local storage paths

Backup Overview

PostgreSQL backups run automatically via Kubernetes CronJobs at 01:00 UTC daily. Backups are stored in MinIO at:

<bucket-name>/postgres/v2/

Verify the backup jobs exist:

kubectl get cronjobs -n pca | grep postgres

Trigger Backup Manually

From any pod with curl (such as airflow):

kubectl exec -it -n pca airflow-0 -- sh
curl http://postgres:10004/backup

Expected output: Successfully ran Backup on postgres
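Because the same procedure covers both instances, the manual trigger can be looped. The sketch below assumes yang-postgres exposes the same /backup endpoint on port 10004 as postgres (verify this on your deployment); the curl call is commented out so the response-checking helper can be shown on its own, with an illustrative response in its place:

```shell
#!/bin/sh
# Sketch: trigger an on-demand backup on both instances from inside a pod
# that has curl (e.g. airflow-0). Assumes yang-postgres serves the same
# /backup endpoint on port 10004 as postgres -- confirm before relying on it.
ok() {
  case "$1" in
    "Successfully ran Backup"*) echo yes ;;
    *) echo no ;;
  esac
}

for svc in postgres yang-postgres; do
  # resp=$(curl -s "http://$svc:10004/backup")   # run this inside the pod
  resp="Successfully ran Backup on $svc"         # illustrative response text
  echo "$svc: $(ok "$resp")"
done
```

Anything other than the "Successfully ran Backup" response should be treated as a failed trigger.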

Access Backups in MinIO

  1. Connect to the MinIO pod:

    kubectl exec -it -n pca pca-minio-pool-0-0 -- sh
    . /tmp/minio/config.env
    
  2. Configure the MinIO client:

    mc alias set --insecure pca https://localhost:9000 "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"
    
  3. List available backups:

    mc --insecure ls pca/<bucket-name>/postgres/v2/
    
  4. Copy a backup to the pod filesystem:

    mc --insecure cp -r \
      pca/<bucket-name>/postgres/v2/postgresBackup-<timestamp>/ \
      /tmp/
    

Export Backup to Admin VM

The MinIO image does not include tar, so use cat to stream files:

  1. Set the backup directory name:

    export PG_BKUP="postgresBackup-<timestamp>"
    mkdir $PG_BKUP
    
  2. Copy the backup files:

    kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$PG_BKUP/pg_wal.tar > $PG_BKUP/pg_wal.tar
    kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$PG_BKUP/base.tar > $PG_BKUP/base.tar
    
  3. Verify checksums:

    kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$PG_BKUP/base.tar
    kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$PG_BKUP/pg_wal.tar
    cksum $PG_BKUP/*.tar
    

    Checksums must match exactly.
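Comparing the outputs by eye is error-prone because the path field differs between the pod and the local copy. A small helper can compare only the checksum and byte-count fields; the sample values below are illustrative:

```shell
#!/bin/sh
# Sketch: compare only the first two cksum fields (CRC and byte count),
# ignoring the differing path field.
same_cksum() {
  [ "$(printf '%s' "$1" | awk '{print $1, $2}')" = \
    "$(printf '%s' "$2" | awk '{print $1, $2}')" ]
}

# Illustrative values; substitute the real cksum output lines.
remote="1938745632 524288000 /tmp/postgresBackup-x/base.tar"
local_="1938745632 524288000 postgresBackup-x/base.tar"
if same_cksum "$remote" "$local_"; then
  echo "base.tar OK"
else
  echo "base.tar MISMATCH"
fi
```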

Identify Storage Location

Embedded and air-gapped deployments use OpenEBS local storage. PVC data resides on the node filesystem.

  1. Identify the PostgreSQL pod's node and PVC:

    kubectl get pods -n pca -o wide | grep postgres-0
    kubectl get pvc -n pca | grep postgres-data-postgres-0
    
  2. Set environment variables:

    export PG_HOST=<node-name>
    export PG_PVC=<pvc-id>
    export PG_BKUP="postgresBackup-<timestamp>"
    
  3. Copy the backup to the target node:

    scp -r ./$PG_BKUP \
      ${PG_HOST}:/var/lib/embedded-cluster/openebs-local/$PG_PVC/
    
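The node and path can also be derived rather than read off by hand. The base path below matches the scp target above; the commented kubectl lines assume an OpenEBS local PV that records its host path in `.spec.local.path` (verify with `kubectl get pv <pv> -o yaml` on your cluster before relying on that field):

```shell
#!/bin/sh
# Sketch: build the on-node data path from the PVC id. The base directory
# matches the scp destination used in this guide.
pv_dir() { printf '/var/lib/embedded-cluster/openebs-local/%s\n' "$1"; }

# pv=$(kubectl get pvc postgres-data-postgres-0 -n pca -o jsonpath='{.spec.volumeName}')
# kubectl get pv "$pv" -o jsonpath='{.spec.local.path}{"\n"}'
pv_dir "pvc-1234"   # illustrative PVC id
```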

Extract Backup on Target Node

On the node where the PVC resides (re-export PG_PVC and PG_BKUP there first):

cd /var/lib/embedded-cluster/openebs-local/$PG_PVC/$PG_BKUP

mkdir -p pgdata
tar xf base.tar -C pgdata
tar xf pg_wal.tar -C pgdata/pg_wal/

chmod 700 pgdata
chown 70:root pgdata
chown -R 70:70 pgdata/*

Note: The data directory must be mode 700; PostgreSQL refuses to start if it is group- or world-accessible, and a directory with mode 600 cannot be entered at all.

Restore Procedure

Warning: This procedure replaces existing data. Ensure you have a valid backup before proceeding.

Step 1: Stop Dependent Services

for deploy in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
  kubectl scale deployment -n pca $deploy --replicas=0
done

for sts in airflow airflow-scheduler; do
  kubectl scale statefulset -n pca $sts --replicas=0
done

Step 2: Verify No Active Connections

kubectl exec -it -n pca postgres-0 -- psql -U postgres

At the psql prompt, run:

SELECT DISTINCT client_addr, usename, datname
FROM pg_stat_activity
WHERE client_addr IS NOT NULL;

This query should return 0 rows.
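The check can also be scripted so a restore runbook can fail fast. `psql -Atc` prints just the count with no headers; the kubectl line is commented here and an illustrative value is used in its place:

```shell
#!/bin/sh
# Sketch: non-interactive version of the connection check.
no_clients() { [ "${1:-}" = "0" ]; }

# active=$(kubectl exec -n pca postgres-0 -- psql -U postgres -Atc \
#   "SELECT count(*) FROM pg_stat_activity WHERE client_addr IS NOT NULL;")
active=0   # illustrative value; use the query result above on the cluster
if no_clients "$active"; then
  echo "no active connections"
else
  echo "connections remain: $active"
fi
```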

Step 3: Scale Down PostgreSQL

kubectl scale statefulset -n pca postgres --replicas=0

Step 4: Replace Data Directory

On the target node:

cd /var/lib/embedded-cluster/openebs-local/$PG_PVC
mv pgdata pgdata_$(date +%Y-%m-%d_%H-%M-%S)
cp -r $PG_BKUP/pgdata ./

Step 5: Restart Services

Start PostgreSQL:

kubectl scale statefulset -n pca postgres --replicas=1
kubectl logs -f -n pca postgres-0

Start dependent services:

for deploy in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
  kubectl scale deployment -n pca $deploy --replicas=1
done

for sts in airflow airflow-scheduler; do
  kubectl scale statefulset -n pca $sts --replicas=1
done
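After scaling back up, it is worth waiting for each workload to report ready before declaring the restore complete. The sketch below only prints the commands by default; set RUN=1 to execute them (the workload names are those scaled in this guide):

```shell
#!/bin/sh
# Sketch: readiness checks after scaling back up. Dry-run by default;
# RUN=1 executes the kubectl commands.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

for d in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
  run kubectl rollout status deployment/$d -n pca --timeout=300s
done
for s in postgres airflow airflow-scheduler; do
  run kubectl rollout status statefulset/$s -n pca --timeout=300s
done
```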

Post-Restore Validation

Expected Data Gaps

Backups run at 01:00 UTC. Any data generated after that time (typically via Airflow jobs) will be missing.

Backfill Airflow Jobs

Determine the missing hours and re-run ingestion jobs:

kubectl exec -it -n pca airflow-0 -- sh
for hour in 05 06 07 08 09 10 11 12 13 14 15 16 17; do
  payload=$(printf '{"logical_date":"<date>T%s:00:00Z"}' "$hour")
  curl -X POST http://airflow:8080/airflowui/api/v1/dags/Batch_Ingestion_v1.1_contemporary/dagRuns \
    -H 'Content-Type: application/json' \
    -d "$payload"
  sleep 180
done

Note: Only two Airflow job slots are available. Adjust the sleep interval based on job duration and system load.
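The hour list can be generated instead of typed by hand. The helper below builds the logical_date payloads for a range of missing hours; the date argument shown is illustrative, so substitute the actual date of the gap:

```shell
#!/bin/sh
# Sketch: build the logical_date payloads for a range of missing hours.
payloads() {
  date=$1; h=$2; end=$3
  while [ "$h" -le "$end" ]; do
    printf '{"logical_date":"%sT%02d:00:00Z"}\n' "$date" "$h"
    h=$((h + 1))
  done
}

payloads 2025-01-15 5 17   # illustrative date: 13 payloads, 05:00 through 17:00
```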


© 2026 Cisco and/or its affiliates. All rights reserved.
