docs(readme): rewrite README to reflect deterministic backup design

- clarify separation between file backups (always) and SQL dumps (explicit only)
- document correct nested backup directory layout
- remove legacy script-based usage and outdated sections
- add explicit explanation of database definition scope
- update usage examples to current baudolo CLI

https://chatgpt.com/share/694ef6d2-7584-800f-a32b-27367f234d1d
2025-12-26 21:57:46 +01:00
parent bbb2dd1732
commit e0b2e8934e

196
README.md

@@ -1,80 +1,196 @@
# Backup Docker Volumes to Local (baudolo) 📦🔄
[![GitHub Sponsors](https://img.shields.io/badge/Sponsor-GitHub%20Sponsors-blue?logo=github)](https://github.com/sponsors/kevinveenbirkenbach) [![Patreon](https://img.shields.io/badge/Support-Patreon-orange?logo=patreon)](https://www.patreon.com/c/kevinveenbirkenbach) [![Buy Me a Coffee](https://img.shields.io/badge/Buy%20me%20a%20Coffee-Funding-yellow?logo=buymeacoffee)](https://buymeacoffee.com/kevinveenbirkenbach) [![PayPal](https://img.shields.io/badge/Donate-PayPal-blue?logo=paypal)](https://s.veen.world/paypaldonate)
# baudolo - Deterministic Backup & Restore for Docker Volumes 📦🔄
[![GitHub Sponsors](https://img.shields.io/badge/Sponsor-GitHub%20Sponsors-blue?logo=github)](https://github.com/sponsors/kevinveenbirkenbach) [![Patreon](https://img.shields.io/badge/Support-Patreon-orange?logo=patreon)](https://www.patreon.com/c/kevinveenbirkenbach) [![Buy Me a Coffee](https://img.shields.io/badge/Buy%20me%20a%20Coffee-Funding-yellow?logo=buymeacoffee)](https://buymeacoffee.com/kevinveenbirkenbach) [![PayPal](https://img.shields.io/badge/Donate-PayPal-blue?logo=paypal)](https://s.veen.world/paypaldonate) [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) [![Docker Version](https://img.shields.io/badge/Docker-Yes-blue.svg)](https://www.docker.com) [![Python Version](https://img.shields.io/badge/Python-3.x-blue.svg)](https://www.python.org) [![GitHub stars](https://img.shields.io/github/stars/kevinveenbirkenbach/backup-docker-to-local.svg?style=social)](https://github.com/kevinveenbirkenbach/backup-docker-to-local/stargazers)
**Backup Docker Volumes to Local** is a set of Python and shell scripts that enable you to perform incremental backups of all your Docker volumes using rsync. It is designed to integrate seamlessly with [Kevin's Package Manager](https://github.com/kevinveenbirkenbach/package-manager) under the alias **baudolo**, making it easy to install and manage. The tool supports both file and database recoveries with a clear, automated backup scheme.
`baudolo` is a backup and restore system for Docker volumes with
**mandatory file backups** and **explicit, deterministic database dumps**.
It is designed for environments with many Docker services where:
- file-level backups must always exist
- database dumps must be intentional, predictable, and auditable
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) [![Docker Version](https://img.shields.io/badge/Docker-Yes-blue.svg)](https://www.docker.com) [![Python Version](https://img.shields.io/badge/Python-3.x-blue.svg)](https://www.python.org) [![GitHub stars](https://img.shields.io/github/stars/kevinveenbirkenbach/backup-docker-to-local.svg?style=social)](https://github.com/kevinveenbirkenbach/backup-docker-to-local/stargazers)
## ✨ Key Features
## 🎯 Goal
- 📦 Incremental Docker volume backups using `rsync --link-dest`
- 🗄 Optional SQL dumps for:
- PostgreSQL
- MariaDB / MySQL
- 🌱 Explicit database definition for SQL backups (no auto-discovery)
- 🧾 Backup integrity stamping via `dirval` (Python API)
- ⏸ Automatic container stop/start when required for consistency
- 🚫 Whitelisting of containers that do not require stopping
- ♻️ Modular, maintainable Python architecture
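
The incremental behavior behind `rsync --link-dest` can be illustrated with a minimal sketch. It only builds the argument list for one backup generation; the function name, the chosen rsync options besides `--link-dest`, and the example paths are assumptions for illustration, not baudolo's actual invocation.

```python
from pathlib import Path
from typing import List, Optional


def build_rsync_command(source: Path, destination: Path,
                        previous: Optional[Path] = None) -> List[str]:
    """Build an rsync argv for one incremental backup generation.

    With --link-dest, files that are unchanged compared to `previous`
    are hard-linked into `destination` instead of being copied again,
    so every generation looks complete but only changes use new space.
    """
    cmd = ["rsync", "--archive", "--delete"]
    if previous is not None:
        # Use an absolute path: a relative --link-dest is resolved
        # relative to the destination directory.
        cmd.append(f"--link-dest={previous}")
    # Trailing slashes copy the *contents* of source into destination.
    cmd += [f"{source}/", f"{destination}/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_rsync_command(
        Path("/var/lib/docker/volumes/my-volume/_data"),
        Path("/Backups/new-generation/my-volume/files"),
        Path("/Backups/old-generation/my-volume/files"),
    ))
```

On the first run there is no previous generation, so `previous` stays `None` and rsync performs a full copy.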
This project automates the backup of Docker volumes using incremental backups (rsync) and supports recovering both files and database dumps (MariaDB/PostgreSQL). A robust directory stamping mechanism ensures data integrity, and the tool also handles restarting Docker Compose services when necessary.
## 🚀 Features
## 🧠 Core Concept (Important!)
- **Incremental Backups:** Uses rsync with `--link-dest` for efficient, versioned backups.
- **Database Backup Support:** Backs up MariaDB and PostgreSQL databases from running containers.
- **Volume Recovery:** Provides scripts to recover volumes and databases from backups.
- **Docker Compose Integration:** Option to automatically restart Docker Compose services after backup.
- **Flexible Configuration:** Easily integrated with your Docker environment with minimal setup.
- **Comprehensive Logging:** Detailed command output and error handling for safe operations.
`baudolo` **separates file backups from database dumps**.
## 🛠 Requirements
- **Docker volumes are always backed up at file level**
- **SQL dumps are created only for explicitly defined databases**
- **Linux Operating System** (with Docker installed) 🐧
- **Python 3.x** 🐍
- **Docker & Docker Compose** 🔧
- **rsync** installed on your system
This results in the following behavior:
## 📥 Installation
| Database defined | File backup | SQL dump |
|------------------|-------------|----------|
| No | ✔ yes | ✘ no |
| Yes | ✔ yes | ✔ yes |
You can install **Backup Docker Volumes to Local** easily via [Kevin's Package Manager](https://github.com/kevinveenbirkenbach/package-manager) using the alias **baudolo**:
## 📁 Backup Layout
```bash
pkgmgr install baudolo
Backups are stored in a deterministic, fully nested structure:
```text
<backups-dir>/
└── <machine-hash>/
    └── <repo-name>/
        └── <timestamp>/
            └── <volume-name>/
                ├── files/
                └── sql/
                    └── <database>.backup.sql
```
Alternatively, clone the repository directly:
### Meaning of each level
* `<machine-hash>`
  SHA256 hash of `/etc/machine-id` (host separation)
* `<repo-name>`
  Logical backup namespace (project / stack)
* `<timestamp>`
  Backup generation (`YYYYMMDDHHMMSS`)
* `<volume-name>`
  Docker volume name
* `files/`
  Incremental file backup (rsync)
* `sql/`
  Optional SQL dumps (only for defined databases)
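
As a rough illustration of how the levels above combine into one path, the following sketch derives a volume's backup directory. The function names are hypothetical, and it assumes the hash is taken over the raw file bytes (as `sha256sum /etc/machine-id` would do); whether baudolo strips the trailing newline first is not stated here.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def machine_hash(machine_id: bytes) -> str:
    """SHA256 hex digest of the raw bytes read from /etc/machine-id."""
    return hashlib.sha256(machine_id).hexdigest()


def volume_backup_dir(backups_dir: Path, machine_id: bytes, repo_name: str,
                      when: datetime, volume_name: str) -> Path:
    """Compose <backups-dir>/<machine-hash>/<repo-name>/<timestamp>/<volume-name>."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # YYYYMMDDHHMMSS
    return (backups_dir / machine_hash(machine_id) / repo_name
            / timestamp / volume_name)


if __name__ == "__main__":
    base = volume_backup_dir(
        Path("/Backups"),
        Path("/etc/machine-id").read_bytes(),
        "my-repo",
        datetime.now(),
        "my-volume",
    )
    print(base / "files")                       # incremental file backup
    print(base / "sql" / "appdb.backup.sql")    # optional SQL dump
```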
## 🚀 Installation
### Local (editable install)
```bash
git clone https://github.com/kevinveenbirkenbach/backup-docker-to-local.git
cd backup-docker-to-local
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
```
## 🚀 Usage
## 🌱 Database Definition (SQL Backup Scope)
### Backup All Volumes
### How SQL backups are defined
To back up all Docker volumes, simply run:
`baudolo` creates SQL dumps **only** for databases that are **explicitly defined**
via configuration (e.g. a databases definition file or seeding step).
If a database is **not defined**:
* its Docker volume is still backed up (files)
* **no SQL dump is created**
> No database definition → file backup only
> Database definition present → file backup + SQL dump
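
The rule above can be expressed as a small decision function. This is a simplified sketch: the `BackupPlan` type and the mapping from volume name to defined database names are hypothetical and only model the documented behavior, not baudolo's internal data structures.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class BackupPlan:
    volume: str
    file_backup: bool      # always True: file backups are mandatory
    sql_dumps: List[str]   # names of explicitly defined databases


def plan_backups(volumes: Iterable[str],
                 defined: Dict[str, List[str]]) -> List[BackupPlan]:
    """Every volume gets a file backup; SQL dumps only where defined."""
    return [
        BackupPlan(volume=v, file_backup=True,
                   sql_dumps=sorted(defined.get(v, [])))
        for v in volumes
    ]


if __name__ == "__main__":
    for plan in plan_backups(["app-data", "pg-data"], {"pg-data": ["appdb"]}):
        print(plan)
```

Nothing is inferred from the running system: a volume without an entry in `defined` never receives a SQL dump.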
### Why explicit definition?
`baudolo` does **not** inspect running containers to guess databases.
Databases must be explicitly defined to guarantee:
* deterministic backups
* predictable restore behavior
* reproducible environments
* zero accidental production data exposure
### Required database metadata
Each database definition provides:
* database instance (container or logical instance)
* database name
* database user
* database password
This information is used by `baudolo` to execute
`pg_dump`, `pg_dumpall`, or `mariadb-dump`.
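
To make the role of these four fields concrete, here is a hedged sketch that turns one definition into a dump command run through `docker exec`. The `DatabaseDefinition` type and both helper functions are hypothetical, and the exact options baudolo passes may differ; the sketch only uses standard `docker exec`, `pg_dump`, and `mariadb-dump` options.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DatabaseDefinition:
    instance: str   # container (or logical instance) running the database
    database: str
    username: str
    password: str


def postgres_dump_command(db: DatabaseDefinition) -> List[str]:
    """argv that runs pg_dump inside the container; the dump goes to stdout."""
    return [
        "docker", "exec",
        "-e", f"PGPASSWORD={db.password}",  # libpq reads the password from here
        db.instance,
        "pg_dump", f"--username={db.username}", f"--dbname={db.database}",
    ]


def mariadb_dump_command(db: DatabaseDefinition) -> List[str]:
    """argv that runs mariadb-dump inside the container; the dump goes to stdout."""
    return [
        "docker", "exec", db.instance,
        # Note: a password on the command line is visible in the process list.
        "mariadb-dump", f"--user={db.username}", f"--password={db.password}",
        db.database,
    ]


if __name__ == "__main__":
    # Example values only; a caller would redirect stdout to
    # sql/<database>.backup.sql.
    print(postgres_dump_command(
        DatabaseDefinition("central-postgres", "appdb", "app", "secret")))
```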
## 💾 Running a Backup
```bash
./backup-docker-to-local.sh
baudolo \
--compose-dir /srv/docker \
--databases-csv /etc/baudolo/databases.csv \
--database-containers central-postgres central-mariadb \
--images-no-stop-required alpine postgres mariadb mysql \
--images-no-backup-required redis busybox
```
### Recovery
### Common Backup Flags
#### Recover Volume Files
| Flag | Description |
| --------------- | ------------------------------------------- |
| `--everything` | Always stop containers and re-run rsync |
| `--dump-only` | Only create SQL dumps, skip file backups |
| `--shutdown` | Do not restart containers after backup |
| `--backups-dir` | Backup root directory (default: `/Backups`) |
| `--repo-name` | Backup namespace under machine hash |
## ♻️ Restore Operations
### Restore Volume Files
```bash
bash ./recover-docker-from-local.sh "{{volume_name}}" "$(sha256sum /etc/machine-id | head -c 64)" "{{version_to_recover}}"
baudolo-restore files \
my-volume \
<machine-hash> \
<version> \
--backups-dir /Backups \
--repo-name my-repo
```
#### Recover Database
For example, to recover a MySQL/MariaDB database:
Restore into a **different target volume**:
```bash
docker exec -i mysql_container mysql -uroot -psecret database < db.sql
baudolo-restore files \
target-volume \
<machine-hash> \
<version> \
--source-volume source-volume
```
#### Debug Mode
To inspect what's happening inside a container:
### Restore PostgreSQL
```bash
docker run -it --entrypoint /bin/sh --rm --volumes-from {{container_name}} -v /Backups/:/Backups/ kevinveenbirkenbach/alpine-rsync
baudolo-restore postgres \
my-volume \
<machine-hash> \
<version> \
--container postgres \
--db-name appdb \
--db-password secret \
--empty
```
### Restore MariaDB / MySQL
```bash
baudolo-restore mariadb \
my-volume \
<machine-hash> \
<version> \
--container mariadb \
--db-name shopdb \
--db-password secret \
--empty
```
> `baudolo` automatically detects whether `mariadb` or `mysql`
> is available inside the container.
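
One way such a detection could work is to probe the container for each client binary in turn. The sketch below is an assumption about the approach, not baudolo's actual code: the function names, the probe via `command -v`, and the preference for `mariadb` over `mysql` are all illustrative.

```python
import subprocess
from typing import Callable, Optional, Sequence


def binary_exists_in_container(container: str, binary: str) -> bool:
    """Return True if `binary` is on PATH inside the running container."""
    result = subprocess.run(
        # "$1" receives `binary`; the extra "sh" fills $0.
        ["docker", "exec", container, "sh", "-c", 'command -v "$1"', "sh", binary],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def detect_sql_client(
    container: str,
    candidates: Sequence[str] = ("mariadb", "mysql"),
    exists: Callable[[str, str], bool] = binary_exists_in_container,
) -> Optional[str]:
    """Return the first available client binary, or None if none is found."""
    for binary in candidates:
        if exists(container, binary):
            return binary
    return None
```

The `exists` parameter is injectable so the selection logic can be exercised without a running Docker daemon.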
## 🔍 Backup Scheme
The backup mechanism uses incremental backups with rsync and stamps directories with a unique hash. For more details on the backup scheme, check out [this blog post](https://blog.veen.world/blog/2020/12/26/how-i-backup-dedicated-root-servers/).