At 2:23 AM last Tuesday, something went wrong with our database. Our most recent backup was a few days old, so recovery took longer than it should have. That’s when I realized our backup system had to be better.

https://cdn.hashnode.com/res/hashnode/image/upload/v1765867594016/6a876012-b7bd-40de-a696-d3908ccf69ab.png?w=1600&h=840&fit=crop&crop=entropy&auto=compress,format&format=webp

Managed databases can be expensive to run. Going self-managed is the alternative, but it comes with its own responsibilities to achieve resiliency. Part of that for me involves building a backup system that:

Runs automatically at 2 AM
Retries up to 10 times if something fails
Sends Slack notifications on success or failure
Stores backups in S3
Makes restoration a single command

No database was harmed in the making of this system. Here's how I did it.

The Stack

Database: PostgreSQL 17 in Docker
Storage: AWS S3 with lifecycle policies
Orchestration: Bash scripts + cron
Notifications: Slack webhooks
Backup tool: pg_dump (PostgreSQL's built-in tool)

Architecture Overview

The backup system has three main components:

The Backup Script - Handles the actual pg_dump, S3 upload, and local cleanup. Includes retry logic for resilience.
Cron Jobs - Schedules backups (production daily, staging weekly).
Notification Webhook - POSTs status updates directly to a Slack webhook.

The flow is straightforward: cron triggers the script → the script backs up the database → uploads to S3 → sends a notification → cleans up old local files

Implementation

Step 1: Configure AWS and Docker

AWS Setup:

First, configure the AWS CLI on your host machine and create S3 buckets for your backups. Install the AWS CLI first if it isn’t already installed.

aws configure
# Enter your credentials and region

# Test access
aws s3 ls
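Creating the bucket and its lifecycle policy can be scripted too. The sketch below is one way to do it, not necessarily how my buckets were set up: the region, the 30-day expiry, and the helper names write_lifecycle_policy and create_backup_bucket are all placeholder assumptions.

```bash
# Placeholder values; adjust to your account.
BUCKET="${BUCKET:-myapp-backups}"
REGION="${REGION:-us-east-1}"            # assumed region
RETENTION_DAYS="${RETENTION_DAYS:-30}"   # assumed retention period

# Write a lifecycle policy that expires every object after RETENTION_DAYS.
write_lifecycle_policy() {
    cat > "$1" <<EOF
{
  "Rules": [
    {
      "ID": "expire-old-backups",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": ${RETENTION_DAYS} }
    }
  ]
}
EOF
}

# Create the bucket and attach the policy (needs valid AWS credentials).
create_backup_bucket() {
    local policy_file
    policy_file=$(mktemp) &&
    write_lifecycle_policy "$policy_file" &&
    aws s3 mb "s3://${BUCKET}" --region "$REGION" &&
    aws s3api put-bucket-lifecycle-configuration \
        --bucket "$BUCKET" \
        --lifecycle-configuration "file://${policy_file}"
    local status=$?
    rm -f "$policy_file"
    return "$status"
}

# create_backup_bucket   # uncomment to run against your AWS account
```

Letting S3 expire old objects means the backup script itself only has to worry about local files.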

Docker Setup:

Add a volume mount to your database service in compose.yml:

services:
  db:
    image: postgres:17.2-bookworm
    volumes:
      - db_data:/var/lib/postgresql/data
      - ./backups:/backups  # Add this for backup files
    environment:
      POSTGRES_DB: myapp_db
      POSTGRES_USER: myapp_user
      POSTGRES_PASSWORD: ${DB_PASSWORD}
volumes:
  db_data:

Restart your containers to apply the changes:

docker compose up -d

> Production Note: I actually use Docker Swarm in production/staging for better orchestration. The backup strategy is pretty much the same. Just update container filters to docker ps -qf "name=stackname_db.1" instead of "name=projectname_db". I'll cover this in the Production Considerations section.
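With the /backups mount in place, a dump written inside the container shows up in ./backups on the host, ready to upload. Here is a minimal sketch of that dump-and-upload step. It assumes pg_dump's custom format (-Fc) and uses placeholder values throughout; CONTAINER_FILTER, HOST_BACKUP_DIR, and backup_filename are names made up for this illustration.

```bash
# All values below are placeholders for illustration.
DB_NAME="${DB_NAME:-myapp_db}"
DB_USER="${DB_USER:-myapp_user}"
S3_BUCKET="${S3_BUCKET:-myapp-backups}"
CONTAINER_FILTER="${CONTAINER_FILTER:-name=projectname_db}"
HOST_BACKUP_DIR="${HOST_BACKUP_DIR:-./backups}"   # host side of the /backups mount

# Timestamped file name, e.g. backup_2025-12-13T02-00-00.dump
backup_filename() {
    echo "backup_$(date +%Y-%m-%dT%H-%M-%S).dump"
}

run_backup() {
    local file container_id
    file=$(backup_filename)
    container_id=$(docker ps -qf "$CONTAINER_FILTER")
    if [ -z "$container_id" ]; then
        echo "No running container matches $CONTAINER_FILTER" >&2
        return 1
    fi

    # Dump inside the container, in custom format, into the mounted directory.
    docker exec "$container_id" \
        pg_dump -U "$DB_USER" -d "$DB_NAME" -Fc -f "/backups/$file" || return 1

    # The same file is visible on the host through the bind mount.
    aws s3 cp "$HOST_BACKUP_DIR/$file" "s3://$S3_BUCKET/$file" || return 1
}

# run_backup   # uncomment on a host with Docker and the AWS CLI configured
```

Each step returns explicitly on failure so the function reports an error even when it is called from a retry loop or an if condition.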

Step 2: The Backup Script

Create ~/scripts/backup.sh with three key functions:

run_backup() - Executes pg_dump via docker exec, uploads to S3, cleans up old local files
send_notification() - POSTs backup status directly to Slack
Retry logic - attempts backup up to 10 times with 10-minute intervals
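The retry behaviour in the last item can be expressed as a small generic wrapper. This is a sketch of the idea rather than the exact loop in my script; the function name retry and its argument order are assumptions.

```bash
# retry MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    local max_attempts=$1 interval=$2
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after $attempt attempts" >&2
            return 1
        fi
        echo "Attempt $attempt failed; retrying in ${interval}s" >&2
        sleep "$interval"
        attempt=$((attempt + 1))
    done
}

# Example: 10 attempts, 10 minutes apart.
#   retry 10 600 run_backup
```

One caveat: bash ignores set -e inside a command that is used as an until or if condition, so the function being retried should return a non-zero status explicitly when one of its steps fails.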

Here's the simplified structure:

#!/bin/bash
set -euo pipefail

ENV="${1:-prod}"
MAX_RETRIES=10
RETRY_INTERVAL=600  # 10 minutes

# load configs (Slack webhook, AWS region, etc.)
source ~/.backup_env

# environment-specific config
if [ "$ENV" == "prod" ]; then
    DB_NAME="myapp_prod"
    DB_USER="myapp_user"
    S3_BUCKET="myapp-backups"
else
    DB_NAME="myapp_staging"
    DB_USER="myapp_user"
    S3_BUCKET="myapp-staging-backups"
fi

BACKUP_FILE="backup_$(date +%Y-%m-%dT%H-%M-%S).dump"

send_notification() {
    local status=$1
    local error_msg=${2:-""}

    # determine color
    local color="good"
    [[ "$status" == *"failure"* ]] && color="danger"

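    # build the Slack payload
    # local payload=$(cat <...   [the rest of backup.sh is cut off in the source]

The script loads its settings from ~/.backup_env, so create that file with your Slack webhook URL:

cat > ~/.backup_env << 'EOF'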
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
EOF

chmod 600 ~/.backup_env # only the file's owner has read and write access
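With SLACK_WEBHOOK_URL available, posting a status message takes one curl call. The sketch below is an assumption about how the payload could be built, using Slack's legacy attachments format (which is where the "good" and "danger" colors come from); json_escape and build_payload are hypothetical helper names, and the escaping only handles backslashes and double quotes, not newlines.

```bash
# Escape backslashes and double quotes so the text is safe inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body: "good" (green) unless the status contains "failure".
build_payload() {
    local status=$1 error_msg=${2:-}
    local color="good"
    case "$status" in *failure*) color="danger" ;; esac
    local text="Database backup: $status"
    if [ -n "$error_msg" ]; then
        text="$text ($error_msg)"
    fi
    printf '{"attachments":[{"color":"%s","text":"%s"}]}' \
        "$color" "$(json_escape "$text")"
}

# POST the payload to the webhook loaded from ~/.backup_env.
send_notification() {
    curl -sS -X POST -H 'Content-Type: application/json' \
        --data "$(build_payload "$@")" "$SLACK_WEBHOOK_URL"
}

# Example:
#   send_notification "failure" "pg_dump exited with status 1"
```

Keeping the payload builder separate from the curl call makes it easy to print and inspect the JSON without sending anything to Slack.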

Step 3: Automation with Cron

Schedule the backups to run automatically:

crontab -e

# add these lines:
# Production - daily at 2 AM
0 2 * * * /home/username/scripts/backup.sh prod >> /home/username/.db_backup_logs/backup-prod.log 2>&1

# Staging - Sundays at 2 AM
0 2 * * 0 /home/username/scripts/backup.sh staging >> /home/username/.db_backup_logs/backup-staging.log 2>&1

# Cleanup old logs - daily at 3 AM
0 3 * * * find /home/username/.db_backup_logs -name "*.log" -mtime +30 -delete

Create the log directory that the cron entries write to:

mkdir -p ~/.db_backup_logs
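The log cleanup entry in the crontab relies on find with -mtime, and the same pattern works for pruning old local dump files. A minimal sketch, assuming dumps end in .dump and sit directly in one directory; cleanup_old_backups and the 7-day value are placeholders.

```bash
# Delete .dump files directly inside DIR that are older than DAYS full days
# (find rounds a file's age down to whole 24-hour periods).
cleanup_old_backups() {
    local dir=$1 days=$2
    find "$dir" -maxdepth 1 -type f -name '*.dump' -mtime +"$days" -delete
}

# Example: keep roughly a week of local dumps.
#   cleanup_old_backups "$HOME/backups" 7
```

Restricting the match to *.dump and to a single directory level keeps the function from deleting anything else that happens to live nearby.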

Step 4: The Restoration Script

Backups are useless if you can't restore them. You could do it manually, but why not just use Bash scripting as well? Create ~/scripts/db-restore.sh:

```bash
#!/bin/bash
set -euo pipefail

# Usage: ./db-restore.sh prod backup_2025-12-13T02-00-00.dump [--full]

ENV="${1}"
BACKUP_FILE="${2}"
FULL_RESTORE="${3:-}"

# Download from S3
aws s3 cp "s3://myapp-backups/$BACKUP_FILE" ~/restore/

# Copy to container
CONTAINER_ID=$(docker ps -qf "name=db")
docker cp ~/restore/"$BACKUP_FILE" "$CONTAINER_ID":/tmp/

# Restore (with confirmation prompt)
if [ "$FULL_RESTORE" == "--full" ]; then
    # Drop and recreate database
    docker exec -i "$CONTAINER_ID" psql -U postgres
    # [the input to psql and the rest of the script are cut off in the source]
```

Originally published at blog.theolujay.dev

PostgreSQL Backup System for Docker with S3 and Slack
