A Minimal Backup System Using Standard Linux Tools
When deploying a server, especially a public-facing one, one of the things we absolutely cannot forget is backups. Backups are not optional. They are a safeguard against data loss, human error, hardware failure, and attacks such as ransomware.
Nowadays, many infrastructures rely on complex backup solutions that integrate deeply with clusters, virtualization platforms, and cloud storage. Those systems make sense at scale. But securing a server does not have to be complex.
This article documents a simple backup strategy that reliably protected our servers for years before more advanced solutions were introduced. The tools involved are basic, widely available, easy to automate, and, most importantly, easy to understand and restore from when things go wrong.
This is not about the “best possible” backup system. It is about having a correct, efficient, and dependable baseline.
Core Backup Principles
Before talking about tools, it is important to understand the principles behind a solid backup strategy. The tools themselves are interchangeable; the ideas are not.
Backups must be off the server
A backup that lives on the same server as the original data is not a backup. It is merely a copy.
If the server is compromised, corrupted, or completely lost, both the data and the backup disappear together. A meaningful backup must be physically and logically separated from the system it protects.
This can be:
- another machine in your infrastructure,
- a storage server or NAS,
- or an off-site system reachable over SSH.
Backups must be encrypted
Backups often contain extremely sensitive data: configuration files, credentials, private keys, and database dumps. Storing them unencrypted, even temporarily, is a serious security risk.
Encryption ensures that even if backup files are leaked or stolen, their contents remain protected. This is especially important when backups are transferred or stored on systems you do not fully trust.
Backups must be consistent and automatic
A simple backup that runs every day without fail is far more valuable than a complex system that silently fails or is rarely checked.
Automation and predictability matter more than cleverness.
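A cheap way to notice silent failures is a small freshness check that runs independently of the backup job itself. Below is a minimal sketch, assuming backups land as *.gpg files in /var/backups/out as in the example scripts later in this article; the directory and age threshold are placeholders to adjust.
#!/usr/bin/env bash
# Warn if no sufficiently recent encrypted backup exists.
# BACKUP_DIR and the age threshold are assumptions; adjust to your own layout.
set -euo pipefail
BACKUP_DIR="/var/backups/out"
MAX_AGE_MINUTES=$((26 * 60))  # slightly more than one day, to allow for slow runs
recent_count="$(find "$BACKUP_DIR" -type f -name '*.gpg' -mmin "-${MAX_AGE_MINUTES}" | wc -l)"
if [[ "$recent_count" -eq 0 ]]; then
    echo "WARNING: no backup newer than ${MAX_AGE_MINUTES} minutes in ${BACKUP_DIR}" >&2
    exit 1
fi
echo "OK: ${recent_count} recent backup(s) found in ${BACKUP_DIR}"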
Choosing the Right Tools
The following tools are standard on almost every Linux system. They are boring, battle-tested, and well understood, which is exactly what you want when recovering from a disaster.
Rsync vs scp
When copying data between servers, scp is often the first tool that comes to mind. It sends files over SSH, is easy to use, and works well for single files or one-off transfers.
However, scp blindly copies files every time. For backups, this means re-transferring unchanged files over and over again, wasting bandwidth, time, and disk I/O. There is no notion of delta transfers, resumable copies, or structured exclusions.
SCP Examples
# Copy a directory recursively to a remote host
scp -r /var/backups/app user@backup-host:/srv/offsite/app/
# Copy a single file to a remote host
scp /var/backups/app/latest.tar.gz.gpg user@backup-host:/srv/offsite/app/
# Copy from remote to local (restore/download)
scp user@backup-host:/srv/offsite/app/latest.tar.gz.gpg /tmp/
rsync, on the other hand, was practically designed for backup workflows. It transfers only the differences between source and destination, making repeated backups extremely efficient. It supports compression, exclusions, dry runs, permission preservation, and detailed logging. Like scp, it can run over SSH.
For backing up large directory trees or frequently changing data, rsync should be the default choice. scp still has its place for small, simple, one-off transfers — and we will use it later for exactly that.
Rsync Examples
# Sync a directory efficiently over SSH (preserve perms/owner/time, compress, verbose)
rsync -azv -e "ssh -p 22" /var/backups/app/ user@backup-host:/srv/offsite/app/
# Same, but delete files on destination that no longer exist on source (mirror)
rsync -azv --delete -e "ssh" /var/backups/app/ user@backup-host:/srv/offsite/app/
# Dry run (shows what would change without copying anything)
rsync -azv --dry-run -e "ssh" /var/backups/app/ user@backup-host:/srv/offsite/app/
# Exclude patterns (caches, tmp, sockets, etc.)
rsync -azv --exclude='*.sock' --exclude='cache/' --exclude='tmp/' -e "ssh" /data/ user@backup-host:/srv/offsite/data/
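rsync can also keep several point-in-time copies on the destination without storing full duplicates: with --link-dest, files that have not changed since the previous snapshot are hard-linked instead of copied again. A hedged sketch, assuming a snapshots directory already exists on the backup host and that a 'latest' symlink tracks the most recent snapshot:
# Push a dated snapshot, hard-linking unchanged files against the previous one.
# Host and paths are illustrative.
TODAY="$(date +%F)"
rsync -az --delete \
  --link-dest=../latest \
  /srv/app/ \
  user@backup-host:/srv/offsite/snapshots/${TODAY}/
# Point 'latest' at the snapshot that was just created
ssh user@backup-host "ln -sfn ${TODAY} /srv/offsite/snapshots/latest"
On the first run the 'latest' link does not exist yet; rsync prints a warning and simply performs a full copy.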
Packaging with tar
Once you have decided what data to back up, the next step is packaging it properly. This is where tar shines.
With a single command, you can bundle entire directory trees, preserve permissions, ownership, and timestamps, and produce a single archive that is easy to move and store. You can also exclude caches, temporary files, sockets, and other irrelevant data.
Tar Examples
# Create a gzipped tar archive of a folder
tar -czf /var/backups/app-$(date +%F).tar.gz /srv/app
# Create archive with exclusions
tar -czf /var/backups/app-$(date +%F).tar.gz \
--exclude='/srv/app/tmp' \
--exclude='/srv/app/cache' \
--exclude='*.sock' \
/srv/app
# List contents of an archive
tar -tzf /var/backups/app-2025-12-25.tar.gz
# Extract an archive to a target directory
tar -xzf /var/backups/app-2025-12-25.tar.gz -C /restore/
Encrypting with gpg
Encryption is a non-negotiable step. Backups frequently contain the keys to your entire infrastructure.
Encrypting tar archives with gpg ensures that even if backup files are leaked, the data remains unreadable. The tar + gpg combination is simple, scriptable, and fits perfectly into automated workflows.
gpg Examples
# Encrypt for a recipient (recommended for automation if you use a GPG public key)
gpg --encrypt --recipient "Backup Key" --output app.tar.gz.gpg app.tar.gz
# Decrypt
gpg --decrypt --output app.tar.gz app.tar.gz.gpg
# Encrypt with a passphrase (non-interactive; convenient but handle secrets carefully)
gpg --symmetric --cipher-algo AES256 --batch --yes \
--pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--output app.tar.gz.gpg app.tar.gz
# Verify what's inside after decrypting (no extraction)
gpg --decrypt app.tar.gz.gpg | tar -tz
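The recipient-based examples assume a GPG key pair named "Backup Key" already exists. If you do not have one, a possible setup is to generate the pair on a trusted machine, export only the public key to the server that creates the backups, and keep the private key (needed for restores) somewhere safe. Names and hosts below are placeholders:
# On a trusted machine: generate the key pair (interactive prompts)
gpg --full-generate-key
# Export the public key and copy it to the server that creates backups
gpg --export --armor "Backup Key" > backup-key.pub
scp backup-key.pub user@app-server:/tmp/
# On the backup-creating server: import the public key
gpg --import /tmp/backup-key.pub
Note that gpg may refuse to encrypt to an imported key it does not consider trusted when running non-interactively; either mark the key as trusted (for example via gpg --edit-key and its trust command) or add --trust-model always to the encrypt command used in your scripts.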
Scheduling and Retention
Backups should run automatically and regularly. The correct frequency depends on how often your data changes and how much data loss you are willing to tolerate.
As a rough rule of thumb:
- Daily backups are a solid baseline
- Hourly backups make sense for frequently changing data
- Weekly backups can complement daily ones for longer retention
Consistency matters more than frequency. Cron is more than sufficient for scheduling reliable backup jobs on most Linux systems.
# Edit the user crontab
crontab -e
# Edit the root crontab
sudo crontab -e
# Daily encrypted offsite backup at 05:00 local server time
0 5 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
Retention is just as important. There is rarely a need to keep dozens of daily backups. A small number of recent backups combined with one or two longer-term snapshots is usually enough.
Storage is never free — even if it feels that way.
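Pruning can be automated just like the backups themselves. A minimal sketch, assuming the remote layout used in the example scripts below and SSH access to the backup host; retention periods are placeholders:
# Prune encrypted backups older than 30 days on the backup host
ssh -p 22 user@backup-host \
  "find /srv/offsite/app -type f -name '*.gpg' -mtime +30 -print -delete"
# Same idea locally, if copies are also kept on the server
find /var/backups/out -type f -name '*.gpg' -mtime +7 -print -delete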
Putting Everything Together
Using the tools described above, designing a solid backup system becomes straightforward:
- Decide which directories need to be backed up and where backups will be stored.
- Define a clear backup cadence and retention policy.
- Create a snapshot by archiving the data with tar.
- Encrypt the snapshot and remove the plaintext archive.
- Transfer the encrypted backup off the server.
Because the data is encrypted, the destination does not need to be fully trusted. Even if the backup server is compromised, the attacker still has to break the encryption.
Layering simple backups with other mechanisms, such as filesystem snapshots, database-native backups, or VM snapshots, further increases resilience. If a complex system fails, simple backups remain understandable, inspectable, and restorable with standard Unix tools.
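Before looking at full scripts, here is a condensed sketch of the whole chain as a single pipeline. It streams the archive through gpg and ssh so no plaintext archive ever touches the local disk; host, path, and key name are the same placeholders used throughout this article:
# Archive, encrypt, and ship in one stream (no intermediate plaintext file)
tar -czf - /srv/app \
  | gpg --encrypt --recipient "Backup Key" \
  | ssh user@backup-host "cat > /srv/offsite/app/app-$(date +%F).tar.gz.gpg"
The trade-off is that a failed transfer has to start over and there is no local copy to fall back on, which is why the scripts below stage an encrypted file first and upload it with rsync.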
Example Scripts
Backing up
#!/usr/bin/env bash
set -euo pipefail
# -------------------------
# CONFIG
# -------------------------
SOURCE_DIR="/srv/app"
WORKDIR="/var/backups/work"
OUTDIR="/var/backups/out"
STAMP="$(date -u +%Y%m%dT%H%M%SZ)"
ARCHIVE_BASENAME="app-backup-${STAMP}.tar.gz"
ARCHIVE_PATH="${OUTDIR}/${ARCHIVE_BASENAME}"
ENCRYPTED_PATH="${ARCHIVE_PATH}.gpg"
# Offsite destination (SSH)
REMOTE_USER="user"
REMOTE_HOST="backup-host"
REMOTE_DIR="/srv/offsite/app"
SSH_PORT="22"
# Encryption options:
# Option A (recommended): recipient-based encryption (set to your key's uid/email)
GPG_RECIPIENT="Backup Key"
# Option B: symmetric encryption (uncomment + export BACKUP_PASSPHRASE)
# USE_SYMMETRIC="1"
# Retention (local)
KEEP_DAYS_LOCAL=7
# Exclusions
EXCLUDES=(
"--exclude=${SOURCE_DIR}/tmp"
"--exclude=${SOURCE_DIR}/cache"
"--exclude=*.sock"
)
# -------------------------
# PREP
# -------------------------
mkdir -p "$WORKDIR" "$OUTDIR"
echo "[1/6] Creating tar archive..."
tar -czf "${ARCHIVE_PATH}" "${EXCLUDES[@]}" "$SOURCE_DIR"
echo "[2/6] Encrypting archive with gpg..."
if [[ "${USE_SYMMETRIC:-0}" == "1" ]]; then
: "${BACKUP_PASSPHRASE:?BACKUP_PASSPHRASE must be set for symmetric encryption}"
gpg --symmetric --cipher-algo AES256 --batch --yes \
--pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--output "${ENCRYPTED_PATH}" "${ARCHIVE_PATH}"
else
gpg --encrypt --recipient "${GPG_RECIPIENT}" \
--output "${ENCRYPTED_PATH}" "${ARCHIVE_PATH}"
fi
echo "[3/6] (Optional) Quick integrity check: list files inside without writing plaintext to disk..."
gpg --decrypt "${ENCRYPTED_PATH}" | tar -tz >/dev/null
echo "[4/6] Uploading encrypted backup using rsync (efficient + resumable)..."
rsync -azv --partial --progress -e "ssh -p ${SSH_PORT}" \
"${ENCRYPTED_PATH}" \
"${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/"
echo "[5/6] Uploading a 'latest' marker using scp (simple one-off copy)..."
# Create a small manifest/marker file locally, then scp it
LATEST_MARKER="${WORKDIR}/latest.txt"
printf "%s\n" "$(basename "${ENCRYPTED_PATH}")" > "${LATEST_MARKER}"
scp -P "${SSH_PORT}" "${LATEST_MARKER}" "${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/latest.txt"
echo "[6/6] Cleanup: remove plaintext tar + prune old local files..."
rm -f "${ARCHIVE_PATH}" "${LATEST_MARKER}"
# Prune old encrypted backups locally
find "${OUTDIR}" -type f -name "*.gpg" -mtime "+${KEEP_DAYS_LOCAL}" -print -delete
echo "Done."
echo "Uploaded: $(basename "${ENCRYPTED_PATH}") to ${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/"
Restoring
#!/usr/bin/env bash
set -euo pipefail
# -------------------------
# CONFIG (match backup script)
# -------------------------
WORKDIR="/var/backups/restore-work"
RESTORE_DIR="/restore" # where to extract
REMOTE_USER="user"
REMOTE_HOST="backup-host"
REMOTE_DIR="/srv/offsite/app"
SSH_PORT="22"
# Encryption options (must match how you encrypted)
# Option A: recipient-based (GPG private key must be available on this machine)
GPG_RECIPIENT="Backup Key"
# Option B: symmetric encryption (uncomment + load/export BACKUP_PASSPHRASE, e.g. from .env)
# USE_SYMMETRIC="1"
# If you want to restore a specific file, set it like:
# BACKUP_FILE="app-backup-20251225T000000Z.tar.gz.gpg"
# Otherwise it will use remote latest.txt
BACKUP_FILE="${BACKUP_FILE:-}"
# -------------------------
# PREP
# -------------------------
mkdir -p "$WORKDIR" "$RESTORE_DIR"
echo "[1/7] Selecting backup to restore..."
if [[ -z "$BACKUP_FILE" ]]; then
# Fetch latest marker (created by the backup script via scp)
echo "No BACKUP_FILE provided; fetching latest.txt from remote..."
scp -P "${SSH_PORT}" \
"${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/latest.txt" \
"${WORKDIR}/latest.txt"
BACKUP_FILE="$(cat "${WORKDIR}/latest.txt" | tr -d '\r\n')"
fi
if [[ -z "$BACKUP_FILE" ]]; then
echo "ERROR: Could not determine backup file name." >&2
exit 1
fi
REMOTE_PATH="${REMOTE_DIR}/${BACKUP_FILE}"
LOCAL_ENCRYPTED="${WORKDIR}/${BACKUP_FILE}"
echo "Selected backup: ${BACKUP_FILE}"
echo "[2/7] Downloading encrypted backup (rsync resumable)..."
rsync -azv --partial --progress -e "ssh -p ${SSH_PORT}" \
"${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}" \
"${LOCAL_ENCRYPTED}"
echo "[3/7] Verifying archive is readable (decrypt + list, no extraction)..."
if [[ "${USE_SYMMETRIC:-0}" == "1" ]]; then
: "${BACKUP_PASSPHRASE:?BACKUP_PASSPHRASE must be set for symmetric decryption}"
gpg --batch --yes --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--decrypt "${LOCAL_ENCRYPTED}" | tar -tz >/dev/null
else
# Recipient-based: will use your private key available in local keyring/agent
gpg --decrypt "${LOCAL_ENCRYPTED}" | tar -tz >/dev/null
fi
echo "[4/7] Extracting into restore directory..."
# Optional: restore into a unique subfolder per run
RUN_DIR="${RESTORE_DIR}/restored-$(date -u +%Y%m%dT%H%M%SZ)"
mkdir -p "$RUN_DIR"
if [[ "${USE_SYMMETRIC:-0}" == "1" ]]; then
gpg --batch --yes --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--decrypt "${LOCAL_ENCRYPTED}" | tar -xzf - -C "$RUN_DIR"
else
gpg --decrypt "${LOCAL_ENCRYPTED}" | tar -xzf - -C "$RUN_DIR"
fi
echo "[5/7] Showing top-level restored contents:"
ls -la "$RUN_DIR" | head -n 50 || true  # '|| true' avoids aborting under pipefail if head closes the pipe early
echo "[6/7] Cleanup (optional): removing downloaded encrypted file..."
# Comment this out if you want to keep the encrypted backup locally.
rm -f "${LOCAL_ENCRYPTED}" "${WORKDIR}/latest.txt" 2>/dev/null || true
echo "[7/7] Done."
echo "Restored backup '${BACKUP_FILE}' into: ${RUN_DIR}"
echo "Next: validate app config/data, then perform your service-specific restore steps (db import, secrets, etc.)."
Executing the scripts
Give execution permissions to your scripts.
chmod +x backup_script.sh
Then add the backup script to the crontab:
crontab -e
# Daily encrypted offsite backup at 05:00 local server time
0 5 * * * /usr/local/bin/backup_script.sh >> /var/log/backup.log 2>&1
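For the cron job to run unattended, the server also needs non-interactive SSH access to the backup host. One possible setup uses a dedicated key pair; file names and hosts below are placeholders, and you would point rsync/scp at the key either with ssh -i or an entry in ~/.ssh/config:
# Generate a dedicated key pair for backups (empty passphrase so cron can use it)
ssh-keygen -t ed25519 -f ~/.ssh/backup_ed25519 -N ""
# Install the public key on the backup host
ssh-copy-id -i ~/.ssh/backup_ed25519.pub user@backup-host
# Optional: make the key the default for that host
cat >> ~/.ssh/config <<'EOF'
Host backup-host
    User user
    IdentityFile ~/.ssh/backup_ed25519
EOF
On the backup host, you can additionally restrict what this key is allowed to do from authorized_keys (forced command, no port forwarding, and so on).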
Other Tools Worth Exploring
Shell-based backups are powerful and reliable, but dedicated tools can add features such as deduplication, versioning, and repository management. They build on the same principles discussed in this article, but package them into more integrated systems. Depending on your needs and scale, they may be worth exploring.
Restic
Restic is a modern backup tool designed with security in mind. Encryption is enabled by default, and backups are stored in deduplicated repositories, which saves a significant amount of space over time. It supports multiple backends, including local storage, SSH servers, and various cloud providers.
Restic is particularly attractive if you want a clean command-line interface, fast incremental backups, and strong cryptographic guarantees without having to glue multiple tools together manually.
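To give a feel for the workflow, here is a short sketch with restic; the repository location and exclude pattern are illustrative, and restic reads the repository password interactively or from RESTIC_PASSWORD / --password-file:
# Initialise a repository on a remote host over SFTP
restic -r sftp:user@backup-host:/srv/restic init
# Back up a directory (deduplicated, encrypted, incremental)
restic -r sftp:user@backup-host:/srv/restic backup /srv/app --exclude /srv/app/cache
# List snapshots and restore one
restic -r sftp:user@backup-host:/srv/restic snapshots
restic -r sftp:user@backup-host:/srv/restic restore latest --target /restore
# Apply a retention policy and remove unreferenced data
restic -r sftp:user@backup-host:/srv/restic forget --keep-daily 7 --keep-weekly 4 --prune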
Borg
Borg (BorgBackup) is a well-established backup solution focused on efficiency and reliability. It offers powerful deduplication, compression, and encryption, making it ideal for large datasets and long retention periods.
Borg repositories are designed to be robust against corruption, and the tool provides excellent performance for repeated backups. It does, however, introduce its own repository format, which means restores require Borg itself to be available.
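A comparable sketch with Borg; repository path, compression, and archive names are illustrative, and Borg must also be installed on the remote side for ssh:// repositories:
# Initialise an encrypted repository on a remote host
borg init --encryption=repokey ssh://user@backup-host/srv/borg
# Create a dated archive with compression
borg create --compression lz4 ssh://user@backup-host/srv/borg::app-{now} /srv/app
# List archives and extract one (extracts into the current directory)
borg list ssh://user@backup-host/srv/borg
borg extract ssh://user@backup-host/srv/borg::app-2025-12-25T05:00:00
# Apply retention and free space (borg compact requires Borg 1.2+)
borg prune --keep-daily 7 --keep-weekly 4 ssh://user@backup-host/srv/borg
borg compact ssh://user@backup-host/srv/borg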
Duplicity
Duplicity takes a slightly different approach by building encrypted, incremental backups on top of standard tar archives. This makes it easier to integrate with existing workflows and storage backends, while still providing versioned backups and encryption.
It is a solid option if you want incremental, encrypted backups that remain relatively transparent and compatible with traditional Unix tools.
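And a sketch with Duplicity, which encrypts with GPG by default; the key ID, host, and target URL are placeholders:
# Full then incremental backups to a remote host over SFTP, encrypted to a GPG key
duplicity full --encrypt-key "YOUR_KEY_ID" /srv/app sftp://user@backup-host//srv/offsite/duplicity
duplicity incremental --encrypt-key "YOUR_KEY_ID" /srv/app sftp://user@backup-host//srv/offsite/duplicity
# List what is in the backup and restore it
duplicity list-current-files sftp://user@backup-host//srv/offsite/duplicity
duplicity restore sftp://user@backup-host//srv/offsite/duplicity /restore/app
# Drop backup chains older than 30 days
duplicity remove-older-than 30D --force sftp://user@backup-host//srv/offsite/duplicity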
Comparison
| Feature / Tool | Restic | Borg (BorgBackup) | Duplicity |
|---|---|---|---|
| Backup type | Snapshot-based | Snapshot-based | Incremental (tar-based) |
| Encryption | Yes (default, mandatory) | Yes (optional but common) | Yes (via GPG) |
| Deduplication | Yes (content-defined) | Yes (very efficient) | No |
| Compression | Yes | Yes | Yes |
| Incremental backups | Yes | Yes | Yes |
| Restore granularity | File / directory | File / directory | File / directory |
| Storage backend | Local, SSH, S3, many cloud providers | Local or SSH | Local, SSH, cloud via backends |
| Repository format | Custom | Custom | Standard tar volumes |
| Setup complexity | Low | Medium | Medium |
| Performance | Very good | Excellent | Moderate |
| Bandwidth efficiency | High | Very high | Low–moderate |
| Ease of scripting | Easy | Easy | Easy |
| Learning curve | Low | Medium | Medium |
| Tool availability | Single static binary | Package-based | Python-based |
| Best suited for | Simple, secure modern backups | Large datasets & long retention | Legacy & tar-friendly workflows |
Conclusion
There are better tools for managing backups at scale, especially in clustered or virtualized environments. This article is not about those tools.
It is about showing how easy it is to implement a correct backup strategy using simple, standard Linux utilities. For any experienced sysadmin, this post will read as “water is wet” obvious. But I have been called into too many catastrophic situations where no backup policy existed at all.
More than the tools, the backup principles mentioned here are the most important takeaway. Frequency, snapshots, encryption, and off-site storage are more than pointers. They are rules to live by, regardless of the tools you use.
Do not disregard the need for backups. They are as important as your security policies. Efficient solutions are far simpler to design and apply than most people assume, and they not only protect you from system crashes, accidental deletions, and dreaded ransomware attacks, but also buy you something invaluable: peace of mind.