Backup & Disaster Recovery
Contract Lucidity processes and stores high-value legal documents. A robust backup strategy is essential to meet your organisation's RPO (Recovery Point Objective) and RTO (Recovery Time Objective) requirements.
What to Back Up
| Component | Priority | Contains | Backup Method |
|---|---|---|---|
| PostgreSQL database | Critical | All document metadata, analysis results, clause data, embeddings, user accounts, AI provider config, playbook entries | pg_dump or continuous WAL archiving |
| Document storage (/data/storage) | Critical | Original uploaded files (PDF, DOCX, etc.) | File-level backup or volume snapshot |
| .env configuration | Critical | Database credentials, JWT secret, CORS settings | Version control or secrets manager |
| Redis (cl-redisdata) | Low | Celery task queue, result cache | Not critical -- rebuilds automatically. Active tasks will be lost and need reprocessing |
| Container images | Low | Application code | Rebuilt from source code with docker compose build |
The .env file contains your JWT_SECRET_KEY. If this is lost, all existing user sessions and refresh tokens become invalid. Users will need to log in again. If POSTGRES_PASSWORD is lost, you cannot connect to the database.
PostgreSQL Backup
Option 1: pg_dump (Simple, Scheduled)
Best for small to mid-size deployments with nightly backup windows.
#!/bin/bash
# backup-db.sh -- Run nightly via cron
set -euo pipefail
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/opt/backups/cl-postgres"
RETENTION_DAYS=30
mkdir -p "$BACKUP_DIR"
# Dump the entire database (compressed)
docker exec cl-postgres pg_dump \
-U cl_user \
-d contract_lucidity \
--format=custom \
--compress=9 \
> "$BACKUP_DIR/cl_backup_${TIMESTAMP}.dump"
# Verify the backup is not empty
FILESIZE=$(stat -c%s "$BACKUP_DIR/cl_backup_${TIMESTAMP}.dump" 2>/dev/null || echo 0)
if [ "$FILESIZE" -lt 1000 ]; then
echo "ERROR: Backup file suspiciously small ($FILESIZE bytes)" >&2
exit 1
fi
# Remove backups older than retention period
find "$BACKUP_DIR" -name "cl_backup_*.dump" -mtime +$RETENTION_DAYS -delete
echo "Backup complete: cl_backup_${TIMESTAMP}.dump ($FILESIZE bytes)"
Schedule with cron:
# Run at 2:00 AM daily
0 2 * * * /opt/scripts/backup-db.sh >> /var/log/cl-backup.log 2>&1
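A dump that exists is not necessarily a dump you can restore. A minimal sanity-check sketch to pair with the cron job (BACKUP_DIR and the 26-hour freshness window are assumptions -- tune the window to your schedule plus some slack):

```shell
#!/bin/sh
# check-backup.sh -- flag a stale or unreadable latest dump
BACKUP_DIR="${BACKUP_DIR:-/opt/backups/cl-postgres}"
MAX_AGE_MIN=$((26 * 60))   # nightly job + 2 hours of slack

# Find a dump newer than the freshness window
LATEST=$(find "$BACKUP_DIR" -name 'cl_backup_*.dump' -mmin -"$MAX_AGE_MIN" 2>/dev/null | head -n 1)
if [ -n "$LATEST" ]; then
  STATUS=FRESH
  # Structural check: pg_restore --list parses the dump's table of
  # contents without touching any database
  if command -v pg_restore >/dev/null 2>&1; then
    pg_restore --list "$LATEST" >/dev/null || STATUS=CORRUPT
  fi
else
  STATUS=STALE
fi
echo "Backup status: $STATUS"
```

Wiring this into monitoring (exit non-zero unless STATUS is FRESH) turns a silent backup failure into an alert the next morning.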
Option 2: Continuous WAL Archiving (Point-in-Time Recovery)
Best for enterprise deployments requiring sub-hour RPO.
- Enable WAL archiving in PostgreSQL:
# In docker-compose.yml -- the official postgres image does not read these
# settings from environment variables, so override the container command:
command: >
  postgres
  -c wal_level=replica
  -c archive_mode=on
  -c archive_command='cp %p /backups/wal/%f'
# /backups/wal must be a persistent mounted volume
- Take a base backup periodically:
docker exec cl-postgres pg_basebackup \
-U cl_user \
-D /backups/base \
--format=tar \
--gzip \
--checkpoint=fast
- WAL files are archived continuously, enabling point-in-time recovery to any moment.
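With a base backup plus archived WAL, a point-in-time restore comes down to two recovery settings. A sketch for PostgreSQL 12 and later (the target timestamp is a placeholder; the archive path assumes the archive_command shown above):

```
# After extracting the base backup into an empty data directory,
# set in postgresql.conf:
restore_command = 'cp /backups/wal/%f %p'
recovery_target_time = '2026-03-19 01:55:00'

# Then create an empty recovery.signal file in the data directory and
# start the server; PostgreSQL replays WAL up to the target time.
```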
The backup includes all pgvector embedding data. Embeddings are stored as standard PostgreSQL columns and are fully captured by pg_dump and WAL archiving. No special handling is required.
Option 3: Cloud-Managed Database
If you deploy PostgreSQL as a managed service (AWS RDS, Azure Database for PostgreSQL, GCP Cloud SQL), automated backups are typically included:
| Cloud | Service | Automated Backups | Point-in-Time Recovery |
|---|---|---|---|
| AWS | RDS for PostgreSQL | Daily snapshots, up to 35-day retention | Yes, within the retention window |
| Azure | Azure Database for PostgreSQL | Daily snapshots, up to 35-day retention | Yes, within the retention window |
| GCP | Cloud SQL for PostgreSQL | Daily snapshots, up to 365-day retention | Yes, within the transaction-log retention window |
Document Storage Backup
The cl-storage Docker volume (mounted at /data/storage inside containers) contains all original uploaded documents.
Option 1: Volume-Level Backup
#!/bin/bash
# backup-storage.sh
set -euo pipefail
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_DIR="/opt/backups/cl-storage"
mkdir -p "$BACKUP_DIR"
# Create a tarball of the Docker volume
docker run --rm \
-v cl-storage:/source:ro \
-v "$BACKUP_DIR":/backup \
alpine tar czf "/backup/cl_storage_${TIMESTAMP}.tar.gz" -C /source .
echo "Storage backup complete: cl_storage_${TIMESTAMP}.tar.gz"
Option 2: Rsync to Remote Storage
# Sync to a remote backup server (incremental)
rsync -avz --delete \
/var/lib/docker/volumes/contract-lucidity_cl-storage/_data/ \
backup-server:/backups/cl-storage/
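To make the sync recurring, a crontab entry in the same style as the database job works; the hourly schedule here is an assumption -- match it to your RPO target:

```
# Hourly incremental sync to the backup server
0 * * * * rsync -az --delete /var/lib/docker/volumes/contract-lucidity_cl-storage/_data/ backup-server:/backups/cl-storage/ >> /var/log/cl-storage-sync.log 2>&1
```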
Option 3: Cloud Object Storage
For cloud deployments, sync documents to S3, Azure Blob, or GCS:
# AWS S3
aws s3 sync /data/storage/ s3://cl-backups/storage/ --storage-class STANDARD_IA
# Azure Blob
az storage blob upload-batch \
--destination cl-backups \
--source /data/storage/ \
--account-name clbackupstorage
# Google Cloud Storage
gsutil -m rsync -r /data/storage/ gs://cl-backups/storage/
Configuration Backup
# Back up the .env file (contains secrets -- encrypt at rest!)
mkdir -p /opt/backups/cl-config
cp .env /opt/backups/cl-config/env_$(date +%Y%m%d_%H%M%S)
# Or better: use a secrets manager
# AWS: aws secretsmanager create-secret --name cl-env --secret-string file://.env
# Azure: az keyvault secret set --vault-name cl-vault --name cl-env --file .env
Never commit .env to a Git repository. Use a secrets manager or encrypted backup for production credentials.
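Restoring the file from a secrets manager is the mirror image of storing it. A sketch using the same secret names as the examples above (cl-env and cl-vault are assumptions):

```
# AWS: write the stored secret back to .env
aws secretsmanager get-secret-value --secret-id cl-env \
  --query SecretString --output text > .env
# Azure:
az keyvault secret download --vault-name cl-vault --name cl-env --file .env
```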
Recovery Procedures
Full Recovery from Backup
Step-by-Step: Restore PostgreSQL
# 1. Stop services that write to the database
docker compose stop cl-backend cl-worker
# 2. Drop and recreate the database
docker exec cl-postgres psql -U cl_user -d postgres -c "DROP DATABASE IF EXISTS contract_lucidity;"
docker exec cl-postgres psql -U cl_user -d postgres -c "CREATE DATABASE contract_lucidity;"
# 3. Ensure pgvector extension exists
docker exec cl-postgres psql -U cl_user -d contract_lucidity \
-c "CREATE EXTENSION IF NOT EXISTS vector;"
# 4. Restore from backup
docker exec -i cl-postgres pg_restore \
-U cl_user \
-d contract_lucidity \
--no-owner \
--no-privileges \
< /opt/backups/cl-postgres/cl_backup_20260319_020000.dump
# 5. Restart services (migrations will run automatically on backend startup)
docker compose up -d cl-backend cl-worker
# 6. Verify
curl -s https://contractlucidity.com/api/health
docker exec cl-postgres psql -U cl_user -d contract_lucidity \
-c "SELECT count(*) FROM documents;"
Step-by-Step: Restore Document Storage
# 1. Stop services
docker compose stop cl-backend cl-worker
# 2. Clear existing volume and restore
docker run --rm \
-v cl-storage:/target \
-v /opt/backups/cl-storage:/backup:ro \
alpine sh -c "find /target -mindepth 1 -delete && tar xzf /backup/cl_storage_20260319_020000.tar.gz -C /target"
# 3. Restart services
docker compose up -d cl-backend cl-worker
RTO / RPO Guidelines
| Deployment Tier | RPO Target | RTO Target | Backup Strategy |
|---|---|---|---|
| Demo / POC | 24 hours | 4 hours | Daily pg_dump + manual storage backup |
| Small firm | 12 hours | 2 hours | Daily pg_dump + nightly storage sync |
| Mid-size | 1 hour | 1 hour | WAL archiving + hourly storage sync + warm standby |
| Am Law 200 | 15 minutes | 30 minutes | Managed DB with PITR + cloud storage replication + hot standby |
| Am Law 100 | < 5 minutes | < 15 minutes | Multi-region managed DB + real-time storage replication + failover automation |
Backup Verification
Schedule monthly backup restoration tests. An untested backup is not a backup.
Monthly Verification Checklist
- Restore the database backup to a test environment
- Verify document counts match production
- Verify a random sample of documents can be opened and viewed
- Confirm embeddings are present (SELECT count(*) FROM document_embeddings;)
- Test the full pipeline by uploading a new document
- Document the restoration time (this is your actual RTO)
- Record results in your DR runbook
Automated Verification Script
#!/bin/bash
# verify-backup.sh -- Run monthly
BACKUP_FILE=$(ls -t /opt/backups/cl-postgres/cl_backup_*.dump | head -1)
echo "Testing backup: $BACKUP_FILE"
# Restore to a test database
docker exec cl-postgres psql -U cl_user -d postgres -c "DROP DATABASE IF EXISTS cl_backup_test;"
docker exec cl-postgres psql -U cl_user -d postgres -c "CREATE DATABASE cl_backup_test;"
docker exec cl-postgres psql -U cl_user -d cl_backup_test \
-c "CREATE EXTENSION IF NOT EXISTS vector;"
docker exec -i cl-postgres pg_restore \
-U cl_user -d cl_backup_test --no-owner \
< "$BACKUP_FILE"
# Verify key tables
for TABLE in documents document_metadata document_embeddings users; do
COUNT=$(docker exec cl-postgres psql -U cl_user -d cl_backup_test -t \
-c "SELECT count(*) FROM $TABLE;" 2>/dev/null | tr -d ' ')
echo " $TABLE: $COUNT rows"
done
# Cleanup
docker exec cl-postgres psql -U cl_user -d postgres -c "DROP DATABASE cl_backup_test;"
echo "Verification complete."
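The script lends itself to the same cron treatment as the nightly backup; the first-of-the-month 3:00 AM slot is an assumption:

```
# Run on the 1st of each month at 3:00 AM
0 3 1 * * /opt/scripts/verify-backup.sh >> /var/log/cl-verify.log 2>&1
```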