# Backup and Restore (Kubernetes)
For on-premises Kubernetes deployments, Cisco Provider Connectivity Assurance provides automated backup solutions and restoration procedures for two classes of issues:
- Targeted application issues — Database corruption or accidental data deletion
- Infrastructure issues — Node or storage loss requiring rebuild
## Backup Overview
PCA automatically backs up critical data stores using Kubernetes CronJobs that run daily. Backups are stored in the local MinIO object storage deployed as part of the solution.
### Backed-Up Components
| Component | CronJob Name | Default Schedule | Storage Path |
|---|---|---|---|
| CouchDB | backup-create-couchdb | 01:00 UTC | couchDB/v2/couchDB-Backup/ |
| PostgreSQL | postgres-backup-create | 01:00 UTC | postgres/v2/ |
| Yang PostgreSQL | yang-postgres-backup-create | 01:00 UTC | postgres/v2/ |
| Elasticsearch | backup-create-elasticsearch | 01:00 UTC | elasticsearch/v2/elasticsearch-Backup/ |
| Dgraph | Managed by stitchit-env-config | 04:00 UTC | dgraphBackup/v2/ |
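For scripting against MinIO, the table above can be folded into a small lookup helper. This is a sketch: the function name `backup_path_for` and the component keys are our own; the storage prefixes come directly from the table.

```shell
#!/bin/sh
# Map a component name to its MinIO storage prefix, per the table above.
# backup_path_for is a hypothetical helper; prefixes are from the table.
backup_path_for() {
  case "$1" in
    couchdb)        echo "couchDB/v2/couchDB-Backup/" ;;
    postgres)       echo "postgres/v2/" ;;
    yang-postgres)  echo "postgres/v2/" ;;
    elasticsearch)  echo "elasticsearch/v2/elasticsearch-Backup/" ;;
    dgraph)         echo "dgraphBackup/v2/" ;;
    *)              echo "unknown component: $1" >&2; return 1 ;;
  esac
}

backup_path_for couchdb
```

A helper like this keeps the prefixes in one place when you assemble `mc ls` paths later in this page.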
To verify that the backup jobs are running:

```shell
kubectl get cronjobs -n pca
```
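As a sketch, that check can be automated by comparing the expected CronJob names from the table against the live list. `missing_backup_cronjobs` is a hypothetical helper of our own; it reads `kubectl get cronjobs -n pca -o name` output on stdin so the logic can be exercised without a cluster.

```shell
#!/bin/sh
# Print any expected backup CronJob that is absent from the list on stdin.
# Usage: kubectl get cronjobs -n pca -o name | missing_backup_cronjobs
# missing_backup_cronjobs is a hypothetical helper; job names are from the table.
missing_backup_cronjobs() {
  expected="backup-create-couchdb postgres-backup-create yang-postgres-backup-create backup-create-elasticsearch"
  live=$(cat)
  for job in $expected; do
    case "$live" in
      *"$job"*) ;;                 # present in the live list
      *) echo "missing: $job" ;;   # not found
    esac
  done
}
```

No output means every expected job exists; any `missing:` line names a job to investigate.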
## Targeted Backup and Restoration
Configuration and application data is automatically backed up daily to MinIO. These backups address specific database corruption or accidental data deletion scenarios.
### Accessing Backups
You access backups through the MinIO pod using the MinIO client (mc).

1. Connect to the MinIO pod:

   ```shell
   kubectl exec -it -n pca pca-minio-pool-0-0 -- sh
   ```

2. Load the MinIO credentials:

   ```shell
   . /tmp/minio/config.env
   ```

3. Configure the MinIO client:

   ```shell
   mc alias set --insecure pca https://localhost:9000 "$MINIO_ROOT_USER" "$MINIO_ROOT_PASSWORD"
   ```

4. List the available backups (example for CouchDB):

   ```shell
   mc --insecure ls pca/<bucket-name>/couchDB/v2/couchDB-Backup/
   ```

   Replace `<bucket-name>` with your deployment's bucket name (visible in the backup CronJob configuration).
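The interactive steps above can also be collapsed into a single non-interactive call. The sketch below only builds the command string (so the pieces can be checked without a cluster); `build_mc_ls` is a name of our own, and you still supply your deployment's bucket name yourself.

```shell
#!/bin/sh
# Build a one-shot command that lists backups without an interactive shell
# in the MinIO pod. build_mc_ls is a hypothetical helper; the pod name,
# config path, and mc invocations are from the steps above.
build_mc_ls() {
  bucket="$1"; prefix="$2"
  echo "kubectl exec -n pca pca-minio-pool-0-0 -- sh -c '. /tmp/minio/config.env && mc alias set --insecure pca https://localhost:9000 \"\$MINIO_ROOT_USER\" \"\$MINIO_ROOT_PASSWORD\" && mc --insecure ls pca/$bucket/$prefix'"
}

build_mc_ls my-bucket couchDB/v2/couchDB-Backup/
```

Running the printed command performs steps 1 through 4 in one pass.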
### Manually Triggering a Backup
You can trigger an immediate backup from any pod with curl installed (such as airflow):

CouchDB:

```shell
kubectl exec -it -n pca airflow-0 -- curl -f couchdb:10003/backup
```

PostgreSQL:

```shell
kubectl exec -it -n pca airflow-0 -- curl http://postgres:10004/backup
```

Elasticsearch:

```shell
kubectl exec -it -n pca airflow-0 -- curl -f elasticsearch01:10005/backup
```
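The three trigger commands differ only in the service endpoint, so they can be wrapped in one helper. `trigger_backup_cmd` is a name of our own; the host:port pairs come from the commands above, and we apply `curl -f` uniformly (an assumption — the PostgreSQL example above omits it) so a failed trigger returns a nonzero exit code.

```shell
#!/bin/sh
# Print the kubectl command that triggers an immediate backup for a component.
# trigger_backup_cmd is a hypothetical helper; endpoints are from the text
# above. -f is applied to all components here as a deliberate assumption.
trigger_backup_cmd() {
  case "$1" in
    couchdb)       ep="couchdb:10003/backup" ;;
    postgres)      ep="http://postgres:10004/backup" ;;
    elasticsearch) ep="elasticsearch01:10005/backup" ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
  echo "kubectl exec -it -n pca airflow-0 -- curl -f $ep"
}

trigger_backup_cmd couchdb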
## Restoration Procedures
Restoration procedures vary by component and require careful coordination to avoid data loss. The general process is:

1. Export the backup files from MinIO to the admin VM.
2. Identify the target node and persistent volume claim (PVC).
3. Stop the affected service and its dependent services.
4. Replace the data files with the backup contents.
5. Restart the services in the correct order.
6. Validate data integrity.
For detailed step-by-step restoration procedures, see the component-specific guides:
- CouchDB Backup and Restore
- PostgreSQL Backup and Restore
- Elasticsearch Backup and Restore
- Dgraph Backup and Restore
Important: For complex recovery scenarios or if you encounter issues, contact Cisco TAC for assistance.
## Infrastructure Issues Requiring Rebuild
For catastrophic scenarios involving node loss or storage failures, application backups alone may not be sufficient for rapid recovery.
### Recommended Snapshots
Maintain regular snapshots of the following for disaster recovery:

All cluster nodes:
- `/var/lib/embedded-cluster/` — contains the OpenEBS local storage PVCs

Admin node only:
- Replicated Admin Console configuration
- Deployment configuration files
### Storage Considerations
Embedded and air-gapped deployments use OpenEBS local storage (openebs-hostpath), meaning persistent volume data resides directly on the node filesystems at:

```
/var/lib/embedded-cluster/openebs-local/<pvc-id>/
```
To identify which node hosts a specific service's data:

```shell
kubectl get pods -n pca -o wide | grep <service-name>
kubectl get pvc -n pca | grep <service-name>
```
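Rather than eyeballing the wide output, the node column can be extracted directly. The sketch below parses `kubectl get pods -o wide`-style text from stdin (in the default wide layout, column 7 is NODE); `node_for_pod` is a name of our own.

```shell
#!/bin/sh
# Print the node hosting each pod whose name matches the given pattern.
# Reads `kubectl get pods -n pca -o wide` output on stdin; column 7 is NODE
# in the default wide layout. node_for_pod is a hypothetical helper.
node_for_pod() {
  awk -v pat="$1" 'NR > 1 && $1 ~ pat { print $1, "->", $7 }'
}

# Usage: kubectl get pods -n pca -o wide | node_for_pod couchdb
```

Pair the result with the PVC listing to locate the matching directory under `/var/lib/embedded-cluster/openebs-local/` on that node.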
### Recovery from Infrastructure Loss
Recovery from node or storage loss requires:

1. Restore the node or provision a replacement.
2. Restore the filesystem snapshots to the appropriate paths.
3. Rejoin the node to the cluster (if applicable).
4. Verify the PVC bindings and service health.
Important: Contact Cisco TAC for infrastructure recovery. TAC provides deployment-specific guidance based on your cluster topology and the nature of the failure.
## Post-Restoration Validation
After any restoration:

1. Verify service health through the PCA UI.
2. Check that data is processing correctly.
3. Review logs for errors:

   ```shell
   kubectl logs -f -n pca <pod-name>
   ```

4. For time-sensitive data, you may need to run backfill jobs for the period between the backup timestamp and the restoration time.
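To size that backfill window, the gap between the backup timestamp and the restoration time can be computed from epoch seconds. `backfill_hours` is a helper of our own, and it assumes GNU `date` (the `-d` flag) for the conversion.

```shell
#!/bin/sh
# Print the whole hours of data to backfill between a backup timestamp and
# the restoration time. Assumes GNU date (-d). backfill_hours is a
# hypothetical helper, not part of the product tooling.
backfill_hours() {
  start=$(date -u -d "$1" +%s)   # backup timestamp, e.g. "2024-05-01 01:00:00"
  end=$(date -u -d "$2" +%s)     # restoration time
  echo $(( (end - start) / 3600 ))
}

backfill_hours "2024-05-01 01:00:00" "2024-05-01 14:30:00"
```

With the default 01:00 UTC schedule, a restoration at 14:30 UTC the same day leaves roughly a 13-hour window to backfill.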