A Minimal Backup System Using Standard Linux Tools
When deploying a server, especially a public-facing one, one of the things we absolutely cannot forget is backups. Backups are not optional. They are a safeguard against data loss, human error, hardware failure, and attacks such as ransomware.
Nowadays, many infrastructures rely on complex backup solutions that integrate deeply with clusters, virtualization platforms, and cloud storage. Those systems make sense at scale. But securing a server does not have to be complex.
This article documents a simple backup strategy that reliably protected our servers for years before more advanced solutions were introduced. The tools involved are basic, widely available, easy to automate, and, most importantly, easy to understand and restore from when things go wrong.
This is not about the “best possible” backup system. It is about having a correct, efficient, and dependable baseline.
Core Backup Principles
Before talking about tools, it is important to understand the principles behind a solid backup strategy. The tools themselves are interchangeable; the ideas are not.
Backups must be off the server
A backup that lives on the same server as the original data is not a backup. It is merely a copy.
If the server is compromised, corrupted, or completely lost, both the data and the backup disappear together. A meaningful backup must be physically and logically separated from the system it protects.
This can be:
- another machine in your infrastructure,
- a storage server or NAS,
- or an off-site system reachable over SSH.
Backups must be encrypted
Backups often contain extremely sensitive data: configuration files, credentials, private keys, and database dumps. Storing them unencrypted, even temporarily, is a serious security risk.
Encryption ensures that even if backup files are leaked or stolen, their contents remain protected. This is especially important when backups are transferred or stored on systems you do not fully trust.
Backups must be consistent and automatic
A simple backup that runs every day without fail is far more valuable than a complex system that silently fails or is rarely checked.
Automation and predictability matter more than cleverness.
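A cheap way to notice silent failures is a small freshness check that runs independently of the backup job itself. Below is a minimal sketch, assuming backups land as *.gpg files in /var/backups/out as in the example scripts later in this article; the directory and age threshold are placeholders to adjust.
#!/usr/bin/env bash
# Warn if no sufficiently recent encrypted backup exists.
# BACKUP_DIR and the age threshold are assumptions; adjust to your own layout.
set -euo pipefail
BACKUP_DIR="/var/backups/out"
MAX_AGE_MINUTES=$((26 * 60))  # slightly more than one day, to allow for slow runs
recent_count="$(find "$BACKUP_DIR" -type f -name '*.gpg' -mmin "-${MAX_AGE_MINUTES}" | wc -l)"
if [[ "$recent_count" -eq 0 ]]; then
    echo "WARNING: no backup newer than ${MAX_AGE_MINUTES} minutes in ${BACKUP_DIR}" >&2
    exit 1
fi
echo "OK: ${recent_count} recent backup(s) found in ${BACKUP_DIR}"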
Choosing the Right Tools
The following tools are standard on almost every Linux system. They are boring, battle-tested, and well understood, which is exactly what you want when recovering from a disaster.
Rsync vs scp
When copying data between servers, scp is often the first tool that comes to mind. It sends files over SSH, is easy to use, and works well for single files or one-off transfers.
However, scp blindly copies files every time. For backups, this means re-transferring unchanged files over and over again, wasting bandwidth, time, and disk I/O. There is no notion of delta transfers, resumable copies, or structured exclusions.
SCP Examples
# Copy a directory recursively to a remote host
scp -r /var/backups/app user@backup-host:/srv/offsite/app/
# Copy a single file to a remote host
scp /var/backups/app/latest.tar.gz.gpg user@backup-host:/srv/offsite/app/
# Copy from remote to local (restore/download)
scp user@backup-host:/srv/offsite/app/latest.tar.gz.gpg /tmp/
rsync, on the other hand, was practically designed for backup workflows. It transfers only the differences between source and destination, making repeated backups extremely efficient. It supports compression, exclusions, dry runs, permission preservation, and detailed logging. Like scp, it can run over SSH.
For backing up large directory trees or frequently changing data, rsync should be the default choice. scp still has its place for small, simple, one-off transfers — and we will use it later for exactly that.
Rsync Examples
# Sync a directory efficiently over SSH (preserve perms/owner/time, compress, verbose)
rsync -azv -e "ssh -p 22" /var/backups/app/ user@backup-host:/srv/offsite/app/
# Same, but delete files on destination that no longer exist on source (mirror)
rsync -azv --delete -e "ssh" /var/backups/app/ user@backup-host:/srv/offsite/app/
# Dry run (shows what would change without copying anything)
rsync -azv --dry-run -e "ssh" /var/backups/app/ user@backup-host:/srv/offsite/app/
# Exclude patterns (caches, tmp, sockets, etc.)
rsync -azv --exclude='*.sock' --exclude='cache/' --exclude='tmp/' -e "ssh" /data/ user@backup-host:/srv/offsite/data/
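rsync can also keep several point-in-time copies on the destination without storing full duplicates: with --link-dest, files that have not changed since the previous snapshot are hard-linked instead of copied again. A hedged sketch, assuming a snapshots directory already exists on the backup host and that a 'latest' symlink tracks the most recent snapshot:
# Push a dated snapshot, hard-linking unchanged files against the previous one.
# Host and paths are illustrative.
TODAY="$(date +%F)"
rsync -az --delete \
  --link-dest=../latest \
  /srv/app/ \
  user@backup-host:/srv/offsite/snapshots/${TODAY}/
# Point 'latest' at the snapshot that was just created
ssh user@backup-host "ln -sfn ${TODAY} /srv/offsite/snapshots/latest"
On the first run the 'latest' link does not exist yet; rsync prints a warning and simply performs a full copy.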
Packaging with tar
Once you have decided what data to back up, the next step is packaging it properly. This is where tar shines.
With a single command, you can bundle entire directory trees, preserve permissions, ownership, and timestamps, and produce a single archive that is easy to move and store. You can also exclude caches, temporary files, sockets, and other irrelevant data.
Tar Examples
# Create a gzipped tar archive of a folder
tar -czf /var/backups/app-$(date +%F).tar.gz /srv/app
# Create archive with exclusions
tar -czf /var/backups/app-$(date +%F).tar.gz \
--exclude='/srv/app/tmp' \
--exclude='/srv/app/cache' \
--exclude='*.sock' \
/srv/app
# List contents of an archive
tar -tzf /var/backups/app-2025-12-25.tar.gz
# Extract an archive to a target directory
tar -xzf /var/backups/app-2025-12-25.tar.gz -C /restore/
Encrypting with gpg
Encryption is a non-negotiable step. Backups frequently contain the keys to your entire infrastructure.
Encrypting tar archives with gpg ensures that even if backup files are leaked, the data remains unreadable. The tar + gpg combination is simple, scriptable, and fits perfectly into automated workflows.
gpg Examples
# Encrypt for a recipient (recommended for automation if you use a GPG public key)
gpg --encrypt --recipient "Backup Key" --output app.tar.gz.gpg app.tar.gz
# Decrypt
gpg --decrypt --output app.tar.gz app.tar.gz.gpg
# Encrypt with a passphrase (non-interactive; convenient but handle secrets carefully)
gpg --symmetric --cipher-algo AES256 --batch --yes \
--pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--output app.tar.gz.gpg app.tar.gz
# Verify what's inside after decrypting (no extraction)
gpg --decrypt app.tar.gz.gpg | tar -tz
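The recipient-based examples assume a GPG key pair named "Backup Key" already exists. If you do not have one, a possible setup is to generate the pair on a trusted machine, export only the public key to the server that creates the backups, and keep the private key (needed for restores) somewhere safe. Names and hosts below are placeholders:
# On a trusted machine: generate the key pair (interactive prompts)
gpg --full-generate-key
# Export the public key and copy it to the server that creates backups
gpg --export --armor "Backup Key" > backup-key.pub
scp backup-key.pub user@app-server:/tmp/
# On the backup-creating server: import the public key
gpg --import /tmp/backup-key.pub
Note that gpg may refuse to encrypt to an imported key it does not consider trusted when running non-interactively; either mark the key as trusted (for example via gpg --edit-key and its trust command) or add --trust-model always to the encrypt command used in your scripts.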
Scheduling and Retention
Backups should run automatically and regularly. The correct frequency depends on how often your data changes and how much data loss you are willing to tolerate.
As a rough rule of thumb:
- Daily backups are a solid baseline
- Hourly backups make sense for frequently changing data
- Weekly backups can complement daily ones for longer retention
Consistency matters more than frequency. Cron is more than sufficient for scheduling reliable backup jobs on most Linux systems.
# Edit the user crontab
crontab -e
# Edit the root crontab
sudo crontab -e
# Daily encrypted offsite backup at 05:00 local server time
0 5 * * * /usr/local/bin/backup.sh >> /var/log/backup.log 2>&1
Retention is just as important. There is rarely a need to keep dozens of daily backups. A small number of recent backups combined with one or two longer-term snapshots is usually enough.
Storage is never free — even if it feels that way.
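Pruning can be automated just like the backups themselves. A minimal sketch, assuming the remote layout used in the example scripts below and SSH access to the backup host; retention periods are placeholders:
# Prune encrypted backups older than 30 days on the backup host
ssh -p 22 user@backup-host \
  "find /srv/offsite/app -type f -name '*.gpg' -mtime +30 -print -delete"
# Same idea locally, if copies are also kept on the server
find /var/backups/out -type f -name '*.gpg' -mtime +7 -print -delete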
Putting Everything Together
Using the tools described above, designing a solid backup system becomes straightforward:
- Decide which directories need to be backed up and where backups will be stored.
- Define a clear backup cadence and retention policy.
- Create a snapshot by archiving the data with tar.
- Encrypt the snapshot and remove the plaintext archive.
- Transfer the encrypted backup off the server.
Because the data is encrypted, the destination does not need to be fully trusted. Even if the backup server is compromised, the attacker still has to break the encryption.
Layering simple backups with other mechanisms, such as filesystem snapshots, database-native backups, or VM snapshots, further increases resilience. If a complex system fails, simple backups remain understandable, inspectable, and restorable with standard Unix tools.
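Before looking at full scripts, here is a condensed sketch of the whole chain as a single pipeline. It streams the archive through gpg and ssh so no plaintext archive ever touches the local disk; host, path, and key name are the same placeholders used throughout this article:
# Archive, encrypt, and ship in one stream (no intermediate plaintext file)
tar -czf - /srv/app \
  | gpg --encrypt --recipient "Backup Key" \
  | ssh user@backup-host "cat > /srv/offsite/app/app-$(date +%F).tar.gz.gpg"
The trade-off is that a failed transfer has to start over and there is no local copy to fall back on, which is why the scripts below stage an encrypted file first and upload it with rsync.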
Example Scripts
Backing up
#!/usr/bin/env bash
set -euo pipefail
# -------------------------
# CONFIG
# -------------------------
SOURCE_DIR="/srv/app"
WORKDIR="/var/backups/work"
OUTDIR="/var/backups/out"
STAMP="$(date -u +%Y%m%dT%H%M%SZ)"
ARCHIVE_BASENAME="app-backup-${STAMP}.tar.gz"
ARCHIVE_PATH="${OUTDIR}/${ARCHIVE_BASENAME}"
ENCRYPTED_PATH="${ARCHIVE_PATH}.gpg"
# Offsite destination (SSH)
REMOTE_USER="user"
REMOTE_HOST="backup-host"
REMOTE_DIR="/srv/offsite/app"
SSH_PORT="22"
# Encryption options:
# Option A (recommended): recipient-based encryption (set to your key's uid/email)
GPG_RECIPIENT="Backup Key"
# Option B: symmetric encryption (uncomment + export BACKUP_PASSPHRASE)
# USE_SYMMETRIC="1"
# Retention (local)
KEEP_DAYS_LOCAL=7
# Exclusions
EXCLUDES=(
"--exclude=${SOURCE_DIR}/tmp"
"--exclude=${SOURCE_DIR}/cache"
"--exclude=*.sock"
)
# -------------------------
# PREP
# -------------------------
mkdir -p "$WORKDIR" "$OUTDIR"
echo "[1/6] Creating tar archive..."
tar -czf "${ARCHIVE_PATH}" "${EXCLUDES[@]}" "$SOURCE_DIR"
echo "[2/6] Encrypting archive with gpg..."
if [[ "${USE_SYMMETRIC:-0}" == "1" ]]; then
: "${BACKUP_PASSPHRASE:?BACKUP_PASSPHRASE must be set for symmetric encryption}"
gpg --symmetric --cipher-algo AES256 --batch --yes \
--pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--output "${ENCRYPTED_PATH}" "${ARCHIVE_PATH}"
else
gpg --encrypt --recipient "${GPG_RECIPIENT}" \
--output "${ENCRYPTED_PATH}" "${ARCHIVE_PATH}"
fi
echo "[3/6] (Optional) Quick integrity check: list files inside without writing plaintext to disk..."
gpg --decrypt "${ENCRYPTED_PATH}" | tar -tz >/dev/null
echo "[4/6] Uploading encrypted backup using rsync (efficient + resumable)..."
rsync -azv --partial --progress -e "ssh -p ${SSH_PORT}" \
"${ENCRYPTED_PATH}" \
"${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/"
echo "[5/6] Uploading a 'latest' marker using scp (simple one-off copy)..."
# Create a small manifest/marker file locally, then scp it
LATEST_MARKER="${WORKDIR}/latest.txt"
printf "%s\n" "$(basename "${ENCRYPTED_PATH}")" > "${LATEST_MARKER}"
scp -P "${SSH_PORT}" "${LATEST_MARKER}" "${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/latest.txt"
echo "[6/6] Cleanup: remove plaintext tar + prune old local files..."
rm -f "${ARCHIVE_PATH}" "${LATEST_MARKER}"
# Prune old encrypted backups locally
find "${OUTDIR}" -type f -name "*.gpg" -mtime "+${KEEP_DAYS_LOCAL}" -print -delete
echo "Done."
echo "Uploaded: $(basename "${ENCRYPTED_PATH}") to ${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/"
Restoring
#!/usr/bin/env bash
set -euo pipefail
# -------------------------
# CONFIG (match backup script)
# -------------------------
WORKDIR="/var/backups/restore-work"
RESTORE_DIR="/restore" # where to extract
REMOTE_USER="user"
REMOTE_HOST="backup-host"
REMOTE_DIR="/srv/offsite/app"
SSH_PORT="22"
# Encryption options (must match how you encrypted)
# Option A: recipient-based (GPG private key must be available on this machine)
GPG_RECIPIENT="Backup Key"
# Option B: symmetric encryption (uncomment + load/export BACKUP_PASSPHRASE, e.g. from .env)
# USE_SYMMETRIC="1"
# If you want to restore a specific file, set it like:
# BACKUP_FILE="app-backup-20251225T000000Z.tar.gz.gpg"
# Otherwise it will use remote latest.txt
BACKUP_FILE="${BACKUP_FILE:-}"
# -------------------------
# PREP
# -------------------------
mkdir -p "$WORKDIR" "$RESTORE_DIR"
echo "[1/7] Selecting backup to restore..."
if [[ -z "$BACKUP_FILE" ]]; then
# Fetch latest marker (created by the backup script via scp)
echo "No BACKUP_FILE provided; fetching latest.txt from remote..."
scp -P "${SSH_PORT}" \
"${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_DIR}/latest.txt" \
"${WORKDIR}/latest.txt"
BACKUP_FILE="$(cat "${WORKDIR}/latest.txt" | tr -d '\r\n')"
fi
if [[ -z "$BACKUP_FILE" ]]; then
echo "ERROR: Could not determine backup file name." >&2
exit 1
fi
REMOTE_PATH="${REMOTE_DIR}/${BACKUP_FILE}"
LOCAL_ENCRYPTED="${WORKDIR}/${BACKUP_FILE}"
echo "Selected backup: ${BACKUP_FILE}"
echo "[2/7] Downloading encrypted backup (rsync resumable)..."
rsync -azv --partial --progress -e "ssh -p ${SSH_PORT}" \
"${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH}" \
"${LOCAL_ENCRYPTED}"
echo "[3/7] Verifying archive is readable (decrypt + list, no extraction)..."
if [[ "${USE_SYMMETRIC:-0}" == "1" ]]; then
: "${BACKUP_PASSPHRASE:?BACKUP_PASSPHRASE must be set for symmetric decryption}"
gpg --batch --yes --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--decrypt "${LOCAL_ENCRYPTED}" | tar -tz >/dev/null
else
# Recipient-based: will use your private key available in local keyring/agent
gpg --decrypt "${LOCAL_ENCRYPTED}" | tar -tz >/dev/null
fi
echo "[4/7] Extracting into restore directory..."
# Optional: restore into a unique subfolder per run
RUN_DIR="${RESTORE_DIR}/restored-$(date -u +%Y%m%dT%H%M%SZ)"
mkdir -p "$RUN_DIR"
if [[ "${USE_SYMMETRIC:-0}" == "1" ]]; then
gpg --batch --yes --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
--decrypt "${LOCAL_ENCRYPTED}" | tar -xzf - -C "$RUN_DIR"
else
gpg --decrypt "${LOCAL_ENCRYPTED}" | tar -xzf - -C "$RUN_DIR"
fi
echo "[5/7] Showing top-level restored contents:"
ls -la "$RUN_DIR" | head -n 50 || true  # '|| true' avoids aborting under pipefail if head closes the pipe early
echo "[6/7] Cleanup (optional): removing downloaded encrypted file..."
# Comment this out if you want to keep the encrypted backup locally.
rm -f "${LOCAL_ENCRYPTED}" "${WORKDIR}/latest.txt" 2>/dev/null || true
echo "[7/7] Done."
echo "Restored backup '${BACKUP_FILE}' into: ${RUN_DIR}"
echo "Next: validate app config/data, then perform your service-specific restore steps (db import, secrets, etc.)."
Executing the scripts
Give execution permissions to your scripts.
chmod +x backup_script.sh
Then add the backup script to the crontab:
crontab -e
# Daily encrypted offsite backup at 05:00 local server time
0 5 * * * /usr/local/bin/backup_script.sh >> /var/log/backup.log 2>&1
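For the cron job to run unattended, the server also needs non-interactive SSH access to the backup host. One possible setup uses a dedicated key pair; file names and hosts below are placeholders, and you would point rsync/scp at the key either with ssh -i or an entry in ~/.ssh/config:
# Generate a dedicated key pair for backups (empty passphrase so cron can use it)
ssh-keygen -t ed25519 -f ~/.ssh/backup_ed25519 -N ""
# Install the public key on the backup host
ssh-copy-id -i ~/.ssh/backup_ed25519.pub user@backup-host
# Optional: make the key the default for that host
cat >> ~/.ssh/config <<'EOF'
Host backup-host
    User user
    IdentityFile ~/.ssh/backup_ed25519
EOF
On the backup host, you can additionally restrict what this key is allowed to do from authorized_keys (forced command, no port forwarding, and so on).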
Other Tools Worth Exploring
Shell-based backups are powerful and reliable, but dedicated tools can add features such as deduplication, versioning, and repository management. They build on the same principles discussed in this article, but package them into more integrated systems. Depending on your needs and scale, they may be worth exploring.
Restic
Restic is a modern backup tool designed with security in mind. Encryption is enabled by default, and backups are stored in deduplicated repositories, which saves a significant amount of space over time. It supports multiple backends, including local storage, SSH servers, and various cloud providers.
Restic is particularly attractive if you want a clean command-line interface, fast incremental backups, and strong cryptographic guarantees without having to glue multiple tools together manually.
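To give a feel for the workflow, here is a short sketch with restic; the repository location and exclude pattern are illustrative, and restic reads the repository password interactively or from RESTIC_PASSWORD / --password-file:
# Initialise a repository on a remote host over SFTP
restic -r sftp:user@backup-host:/srv/restic init
# Back up a directory (deduplicated, encrypted, incremental)
restic -r sftp:user@backup-host:/srv/restic backup /srv/app --exclude /srv/app/cache
# List snapshots and restore one
restic -r sftp:user@backup-host:/srv/restic snapshots
restic -r sftp:user@backup-host:/srv/restic restore latest --target /restore
# Apply a retention policy and remove unreferenced data
restic -r sftp:user@backup-host:/srv/restic forget --keep-daily 7 --keep-weekly 4 --prune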
Borg
Borg (BorgBackup) is a well-established backup solution focused on efficiency and reliability. It offers powerful deduplication, compression, and encryption, making it ideal for large datasets and long retention periods.
Borg repositories are designed to be robust against corruption, and the tool provides excellent performance for repeated backups. It does, however, introduce its own repository format, which means restores require Borg itself to be available.
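A comparable sketch with Borg; repository path, compression, and archive names are illustrative, and Borg must also be installed on the remote side for ssh:// repositories:
# Initialise an encrypted repository on a remote host
borg init --encryption=repokey ssh://user@backup-host/srv/borg
# Create a dated archive with compression
borg create --compression lz4 ssh://user@backup-host/srv/borg::app-{now} /srv/app
# List archives and extract one (extracts into the current directory)
borg list ssh://user@backup-host/srv/borg
borg extract ssh://user@backup-host/srv/borg::app-2025-12-25T05:00:00
# Apply retention and free space (borg compact requires Borg 1.2+)
borg prune --keep-daily 7 --keep-weekly 4 ssh://user@backup-host/srv/borg
borg compact ssh://user@backup-host/srv/borg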
Duplicity
Duplicity takes a slightly different approach by building encrypted, incremental backups on top of standard tar archives. This makes it easier to integrate with existing workflows and storage backends, while still providing versioned backups and encryption.
It is a solid option if you want incremental, encrypted backups that remain relatively transparent and compatible with traditional Unix tools.
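And a sketch with Duplicity, which encrypts with GPG by default; the key ID, host, and target URL are placeholders:
# Full then incremental backups to a remote host over SFTP, encrypted to a GPG key
duplicity full --encrypt-key "YOUR_KEY_ID" /srv/app sftp://user@backup-host//srv/offsite/duplicity
duplicity incremental --encrypt-key "YOUR_KEY_ID" /srv/app sftp://user@backup-host//srv/offsite/duplicity
# List what is in the backup and restore it
duplicity list-current-files sftp://user@backup-host//srv/offsite/duplicity
duplicity restore sftp://user@backup-host//srv/offsite/duplicity /restore/app
# Drop backup chains older than 30 days
duplicity remove-older-than 30D --force sftp://user@backup-host//srv/offsite/duplicity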
Comparison
| Feature / Tool | Restic | Borg (BorgBackup) | Duplicity |
|---|---|---|---|
| Backup type | Snapshot-based | Snapshot-based | Incremental (tar-based) |
| Encryption | Yes (default, mandatory) | Yes (optional but common) | Yes (via GPG) |
| Deduplication | Yes (content-defined) | Yes (very efficient) | No |
| Compression | Yes | Yes | Yes |
| Incremental backups | Yes | Yes | Yes |
| Restore granularity | File / directory | File / directory | File / directory |
| Storage backend | Local, SSH, S3, many cloud providers | Local or SSH | Local, SSH, cloud via backends |
| Repository format | Custom | Custom | Standard tar volumes |
| Setup complexity | Low | Medium | Medium |
| Performance | Very good | Excellent | Moderate |
| Bandwidth efficiency | High | Very high | Low–moderate |
| Ease of scripting | Easy | Easy | Easy |
| Learning curve | Low | Medium | Medium |
| Tool availability | Single static binary | Package-based | Python-based |
| Best suited for | Simple, secure modern backups | Large datasets & long retention | Legacy & tar-friendly workflows |
Conclusion
There are better tools for managing backups at scale, especially in clustered or virtualized environments. This article is not about those tools.
It is about showing how easy it is to implement a correct backup strategy using simple, standard Linux utilities. For any experienced sysadmin, this post will read as “water is wet” obvious. But I have been called into too many catastrophic situations where no backup policy existed at all.
More than the tools, the backup principles mentioned here are the most important takeaway. Frequency, snapshots, encryption, and off-site storage are more than pointers. They are rules to live by, regardless of the tools you use.
Do not disregard the need for backups. They are as important as your security policies. Efficient solutions are far simpler to design and apply than most people assume, and they not only protect you from system crashes, accidental deletions, and dreaded ransomware attacks, but also buy you something invaluable: peace of mind.