This guide provides detailed procedures for backup and restore of PostgreSQL in on-premises Kubernetes Provider Connectivity Assurance deployments.
Note: This procedure applies to both postgres and yang-postgres instances.
Prerequisites
kubectl access to the Provider Connectivity Assurance cluster
SSH access to cluster nodes (for restore)
Familiarity with OpenEBS local storage paths
Backup Overview
PostgreSQL backups run automatically via Kubernetes CronJobs at 01:00 UTC daily. Backups are stored in MinIO at:
<bucket-name>/postgres/v2/
Verify the backup jobs exist:
kubectl get cronjobs -n pca | grep postgres
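A daily 01:00 UTC run corresponds to a CronJob schedule of the form below. This is an illustrative fragment only; the CronJob name and the rest of the spec are placeholders, not the deployed manifest.

```yaml
# Illustrative fragment -- only the schedule field reflects the
# documented 01:00 UTC daily run; other values are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup   # hypothetical name
  namespace: pca
spec:
  schedule: "0 1 * * *"   # 01:00 UTC daily
```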
Trigger Backup Manually
From any pod with curl (such as airflow):
kubectl exec -it -n pca airflow-0 -- sh
curl http://postgres:10004/backup
Expected output: Successfully ran Backup on postgres
Access Backups in MinIO
Connect to the MinIO pod:
kubectl exec -it -n pca pca-minio-pool-0-0 -- sh
. /tmp/minio/config.env
Configure the MinIO client:
mc alias set --insecure pca https://localhost:9000 "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"
List available backups:
mc --insecure ls pca/<bucket-name>/postgres/v2/
Copy a backup to the pod filesystem:
mc --insecure cp -r \
  pca/<bucket-name>/postgres/v2/postgresBackup-<timestamp>/ \
  /tmp/
Export Backup to Admin VM
The MinIO image does not include tar, so use cat to stream files:
Set the backup directory name:
export PG_BKUP="postgresBackup-<timestamp>"
mkdir $PG_BKUP
Copy the backup files:
kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$PG_BKUP/pg_wal.tar > $PG_BKUP/pg_wal.tar
kubectl exec -n pca pca-minio-pool-0-0 -- cat /tmp/$PG_BKUP/base.tar > $PG_BKUP/base.tar
Verify checksums:
kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$PG_BKUP/base.tar
kubectl exec -n pca pca-minio-pool-0-0 -- cksum /tmp/$PG_BKUP/pg_wal.tar
cksum $PG_BKUP/*.tar
Checksums must match exactly.
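If you compare checksums often, a small helper keeps the comparison honest. This is a sketch (the function name is ours, not part of the product): cksum prints the CRC, the byte count, and the file name, and only the first two fields should be compared, since the local and in-pod paths differ.

```shell
# Hypothetical helper: succeed only when two files have the same
# cksum CRC and byte count (the file-name field is ignored).
cksum_match() {
  a=$(cksum "$1" | awk '{print $1, $2}')
  b=$(cksum "$2" | awk '{print $1, $2}')
  [ "$a" = "$b" ]
}
```

For the in-pod file, capture the output of the kubectl exec cksum command and compare its first two fields the same way.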
Identify Storage Location
Embedded and air-gapped deployments use OpenEBS local storage. PVC data resides on the node filesystem.
Identify the PostgreSQL pod's node and PVC:
kubectl get pods -n pca -o wide | grep postgres-0
kubectl get pvc -n pca | grep postgres-data-postgres-0
Set environment variables:
export PG_HOST=<node-name>
export PG_PVC=<pvc-id>
export PG_BKUP="postgresBackup-<timestamp>"
Copy the backup to the target node:
scp -r ./$PG_BKUP \
  ${PG_HOST}:/var/lib/embedded-cluster/openebs-local/$PG_PVC/
Extract Backup on Target Node
On the node where the PVC resides:
cd /var/lib/embedded-cluster/openebs-local/$PG_PVC/$PG_BKUP
mkdir -p pgdata
tar xf base.tar -C pgdata
tar xf pg_wal.tar -C pgdata/pg_wal/
chmod 700 pgdata
chown 70:root pgdata
chown 70:70 pgdata/*
Restore Procedure
Warning: This procedure replaces existing data. Ensure you have a valid backup before proceeding.
Step 1: Stop Dependent Services
for deploy in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
kubectl scale deployment -n pca $deploy --replicas=0
done
for sts in airflow airflow-scheduler; do
kubectl scale statefulset -n pca $sts --replicas=0
done
Step 2: Verify No Active Connections
kubectl exec -it -n pca postgres-0 -- psql -U postgres
SELECT DISTINCT client_addr, usename, datname
FROM pg_stat_activity
WHERE client_addr IS NOT NULL;
This query should return 0 rows.
Step 3: Scale Down PostgreSQL
kubectl scale statefulset -n pca postgres --replicas=0
Step 4: Replace Data Directory
On the target node:
cd /var/lib/embedded-cluster/openebs-local/$PG_PVC
mv pgdata pgdata_$(date +%Y-%m-%d_%H-%M-%S)
cp -r $PG_BKUP/pgdata ./
Step 5: Restart Services
Start PostgreSQL:
kubectl scale statefulset -n pca postgres --replicas=1
kubectl logs -f -n pca postgres-0
Start dependent services:
for deploy in cerbos coordinator gather overlord sky-zitadel-gw skylight-aaa zitadel; do
kubectl scale deployment -n pca $deploy --replicas=1
done
for sts in airflow airflow-scheduler; do
kubectl scale statefulset -n pca $sts --replicas=1
done
Post-Restore Validation
Expected Data Gaps
Backups run at 01:00 UTC. Any data generated after that time (typically via Airflow jobs) will be missing.
Backfill Airflow Jobs
Determine the missing hours and re-run ingestion jobs:
kubectl exec -it -n pca airflow-0 -- sh
for hour in 05 06 07 08 09 10 11 12 13 14 15 16 17; do
payload=$(printf '{"logical_date":"<date>T%s:00:00Z"}' "$hour")
curl -X POST http://airflow:8080/airflowui/api/v1/dags/Batch_Ingestion_v1.1_contemporary/dagRuns \
-H 'Content-Type: application/json' \
-d "$payload"
sleep 180
done
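The hour list above is hardcoded; it can also be derived from the backup time and the restore time. A sketch assuming GNU date is available (both timestamps below are placeholders, not values from your deployment):

```shell
# Generate hourly logical dates from the 01:00 UTC backup up to (but
# not including) the restore time. Both timestamps are examples.
start="2026-01-15T01:00:00Z"
end="2026-01-15T05:00:00Z"
t=$(date -u -d "$start" +%s)
e=$(date -u -d "$end" +%s)
dates=""
while [ "$t" -lt "$e" ]; do
  dates="$dates $(date -u -d "@$t" +%Y-%m-%dT%H:00:00Z)"
  t=$((t + 3600))
done
echo "$dates"
```

Each generated value can then be substituted for the logical_date in the backfill loop.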
Note: Only two Airflow job slots are available. Adjust the sleep interval based on job duration and system load.
© 2026 Cisco and/or its affiliates. All rights reserved.
For more information about trademarks, please visit: Cisco trademarks
For more information about legal terms, please visit: Cisco legal terms
For legal information about Accedian Skylight products, please visit: Accedian legal terms and trademarks