113 Commits

Author SHA1 Message Date
3b39a6ef02 Release version 1.0.0 2025-12-27 09:30:38 +01:00
e0b2e8934e docs(readme): rewrite README to reflect deterministic backup design
- clarify separation between file backups (always) and SQL dumps (explicit only)
- document correct nested backup directory layout
- remove legacy script-based usage and outdated sections
- add explicit explanation of database definition scope
- update usage examples to current baudolo CLI

https://chatgpt.com/share/694ef6d2-7584-800f-a32b-27367f234d1d
2025-12-26 21:57:46 +01:00
bbb2dd1732 Removed .travis 2025-12-26 21:03:00 +01:00
159502af5e Added mirrors 2025-12-26 20:50:29 +01:00
698d1e7a9e ci: add Makefile-driven CI with unit, integration and e2e tests
- add GitHub Actions CI workflow using Makefile targets exclusively
- run unit, integration and e2e tests via `make test`
- publish Docker image to GHCR on SemVer tags
- force-update `stable` git tag after successful release
- add integration test for seed CLI (CSV upsert behavior)
- extend Makefile with test-unit and test-integration targets

https://chatgpt.com/share/694ee54f-b814-800f-a714-e87563e538b7
2025-12-26 20:43:06 +01:00
f8420c8bea renamed configure to seed 2025-12-26 19:58:39 +01:00
8e1a53e1f9 Deleted Starting file 2025-12-26 19:47:54 +01:00
7b55d59300 fix(restore): handle bytes stdin correctly in subprocess wrapper
Avoid passing raw bytes/str via stdin to subprocess.run(), which caused
"'bytes' object has no attribute 'fileno'" and
"stdin and input arguments may not both be used" errors.

If stdin is bytes or str, pass it via input= instead; otherwise forward
stdin unchanged. This fixes Postgres restore failures in E2E tests
without changing productive restore logic.

https://chatgpt.com/share/694ed70d-9e04-800f-8dec-edf08e6e2082
2025-12-26 19:42:17 +01:00
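
A minimal sketch of the stdin handling described above (illustrative only; the helper name and signature are assumptions, not the repo's actual wrapper):

```python
import subprocess

def run_with_stdin(cmd, stdin=None, **kwargs):
    # bytes/str payloads have no fileno(), so they must go through input=
    # (str is encoded first); real file objects are forwarded via stdin=.
    if isinstance(stdin, str):
        stdin = stdin.encode()
    if isinstance(stdin, bytes):
        return subprocess.run(cmd, input=stdin, **kwargs)
    return subprocess.run(cmd, stdin=stdin, **kwargs)
```
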
cf6f4d8326 test(e2e): stabilize MariaDB 11 tests and fix restore empty-mode client selection
- add `make clean` and run it before `test-e2e` to avoid stale artifacts
- restore: do not hardcode `mysql` for --empty; use detected mariadb/mysql client
- e2e: improve subprocess error output for easier CI debugging
- e2e: adjust MariaDB readiness checks for socket-only root on MariaDB 11
- e2e: add `wait_for_mariadb_sql` and run SQL readiness checks for dedicated TCP user
- e2e: update MariaDB full/no-copy tests to use dedicated user over TCP and consistent credentials

https://chatgpt.com/share/694ecfb8-f8a8-800f-a1c9-a5f410d4ba02
2025-12-26 19:10:55 +01:00
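
A sketch of what the `wait_for_mariadb_sql` readiness probe might look like (the helper name comes from the commit message; the implementation below is an assumption, not the actual test code):

```python
import subprocess
import time

def wait_for_mariadb_sql(container, user, password, timeout=120):
    # Poll with a trivial query over TCP until the dedicated user can actually
    # run SQL, instead of only waiting for the server port to open.
    deadline = time.time() + timeout
    while time.time() < deadline:
        probe = subprocess.run(
            ["docker", "exec", container,
             "mariadb", "-h", "127.0.0.1", "-u", user,
             f"--password={password}", "-e", "SELECT 1;"],
            capture_output=True,
        )
        if probe.returncode == 0:
            return
        time.sleep(2)
    raise TimeoutError(f"MariaDB in container '{container}' did not become ready")
```
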
4af15d9074 fix(restore): allow restoring files from different source volume
The file restore command previously assumed that the target volume name
was identical to the volume name used in the backup path. This caused
restores to fail (exit code 2) when restoring data from one volume backup
into a different target volume.

Introduce an optional --source-volume argument for `baudolo-restore files`
to explicitly specify the backup source volume while keeping the target
volume unchanged.

- Default behavior remains fully backward-compatible
- Enables restoring backups from volume A into volume B
- Fixes E2E test scenario restoring into a new volume

Tests:
- Update E2E file restore test to use --source-volume

https://chatgpt.com/share/694ec70f-1d7c-800f-b221-9d22e4b0775e
2025-12-26 18:33:58 +01:00
c30b4865d4 refactor: migrate to src/ package + add DinD-based E2E runner with debug artifacts
- Replace legacy standalone scripts with a proper src-layout Python package
  (baudolo backup/restore/configure entrypoints via pyproject.toml)
- Remove old scripts/files (backup-docker-to-local.py, recover-docker-from-local.sh,
  databases.csv.tpl, Todo.md)
- Add Dockerfile to build the project image for local/E2E usage
- Update Makefile: build image and run E2E via external runner script
- Add scripts/test-e2e.sh:
  - start DinD + dedicated network
  - recreate DinD data volume (and shared /tmp volume)
  - pre-pull helper images (alpine-rsync, alpine)
  - load local baudolo:local image into DinD
  - run unittest E2E suite inside DinD and abort on first failure
  - on failure: dump host+DinD diagnostics and archive shared /tmp into artifacts/
- Add artifacts/ debug outputs produced by failing E2E runs (logs, events, tmp archive)

https://chatgpt.com/share/694ec23f-0794-800f-9a59-8365bc80f435
2025-12-26 18:13:26 +01:00
41910aece2 Optimized entry columns 2025-10-18 09:58:06 +02:00
b6dd624f97 Added install hints to pass install requirements 2025-09-11 20:45:30 +02:00
47828c44db Use dirval CLI instead of direct Python script reference in backup-docker-to-local.py and declare dirval in requirements.yml
Details:
- Replaced python call to directory-validator.py with direct 'dirval' command
- Updated error message accordingly
- Added dirval as dependency in requirements.yml

Conversation: https://chatgpt.com/share/68c31763-6a84-800f-a697-50fa40e9841b
2025-09-11 20:40:21 +02:00
a538e537cb Added emoji 2025-07-17 00:26:22 +02:00
8f72d61300 Bypass buffer 2025-07-17 00:19:10 +02:00
c754083cec Added more detailed info to is_image_whitelisted 2025-07-17 00:13:38 +02:00
84d0fd6346 Added more detailed output on why containers were stopped 2025-07-17 00:05:44 +02:00
627187cecb Solved variable bug 2025-07-16 23:41:17 +02:00
978e153723 Changed special instances to database containers for more clarity 2025-07-16 14:50:32 +02:00
2bf2b0798e Solved bug 2025-07-16 14:47:32 +02:00
8196a0206b Optimized parameters 2025-07-16 14:43:43 +02:00
c4cbb290b3 Made IMAGES_NO_STOP_REQUIRED mandatory 2025-07-16 10:55:21 +02:00
2d2376eac8 Implemented new parameters to make it more flexible for cymais 2025-07-14 18:47:25 +02:00
8c4ae60a6a Left hint 2025-07-10 22:12:29 +02:00
18d6136de0 Used streaming instead of memory to prevent overflow 2025-07-03 14:37:36 +02:00
3ed89a59a8 Added failure handling for busy databases 2025-07-03 12:09:44 +02:00
7d3f0a3ae3 Added roles, and stop on failure 2025-07-03 12:06:38 +02:00
5762754ed7 Added restore_backup.py restore_postgres_databases.py 2025-07-03 11:59:40 +02:00
556cb17433 Implemented check for empty database name 2025-04-21 10:52:06 +02:00
2e2c8131c4 Implemented fallback_pg_dumpall 2025-04-19 00:21:45 +02:00
5005d577cc Update README.md 2025-04-19 00:04:29 +02:00
327b666237 Removed setup instructions, because they are managed now by pkgmgr 2025-04-01 13:20:47 +02:00
a7c6fa861a Moved requirements from cymais to this role 2025-04-01 13:18:30 +02:00
f6c57be1b7 Added Funding 2025-03-12 20:52:47 +01:00
9d990a728d Update README.md 2025-03-12 11:14:40 +01:00
a355f34e6e Update README.md 2025-03-04 22:35:22 +01:00
f847c8dd74 Merge branch 'main' of github.com:kevinveenbirkenbach/docker-volume-backup 2024-12-03 11:19:40 +01:00
3e225b0317 Added hard restart for mailu 2024-12-03 11:19:12 +01:00
6537626d77 optimized code formatting 2024-02-05 19:18:54 +01:00
da7e5cc9be changed loading of directory-validator 2024-02-05 19:16:09 +01:00
69a1ea30aa Implemented stamp function 2024-01-29 20:04:21 +01:00
e9588b0e31 Added exception catching 2024-01-14 01:46:47 +01:00
42566815c4 solved start bug 2024-01-14 01:25:54 +01:00
8bc2b068ff Added condition to stop containers 2024-01-12 18:58:02 +01:00
25d428fc9c Removed skipping of unused volumes 2024-01-12 18:42:26 +01:00
0077efa63c Implemented stop on first error 2024-01-12 16:20:17 +01:00
9d8e80f793 Implemented better parameter check 2024-01-12 15:42:51 +01:00
d2b699c271 Implemented postgres recovery 2024-01-12 11:47:46 +01:00
b7dcb17fd5 Optimized logic for central databases 2024-01-11 20:51:55 +01:00
7f6f5f6dc8 Optimized logic for central databases 2024-01-11 20:47:57 +01:00
75d48fb3e9 Removed unnecessary warning 2024-01-11 20:40:07 +01:00
bb3d20c424 Solved path comparison bug 2024-01-11 16:21:39 +01:00
f057104a65 Solved order bug 2024-01-11 15:56:34 +01:00
7fe1886ff9 Solved trailing spaces bug 2024-01-11 12:20:38 +01:00
35e28f31d2 Changed logic so that volume is not created for db recoveries 2024-01-11 11:04:03 +01:00
15a1f17184 Changed logic so that volume is not created for db recoveries 2024-01-11 10:58:35 +01:00
ace1a70488 Optimized database recovery function 2024-01-11 03:04:13 +01:00
d537393da8 Solved array bug 2024-01-09 13:18:17 +01:00
2b716e5d90 Optimized code performance 2024-01-09 12:59:53 +01:00
7702b17a9d Changed from jq to format 2024-01-08 20:43:42 +01:00
489b5796b7 Implemented shutdown parameter 2024-01-08 19:48:50 +01:00
bf9986f282 Removed unnecessary host from seeder function 2024-01-06 14:23:23 +01:00
e2e62c5835 Added comment to prevent second-guessing in the future. PS: Thanks ChatGPT :-* 2024-01-06 14:02:13 +01:00
4388e09937 Implemented support of multiple databases per instance 2024-01-06 13:51:30 +01:00
31133f251e Added images where no restart is required 2024-01-06 13:36:35 +01:00
850fc3bf0c Implemented dynamic storage path identification 2024-01-06 13:27:17 +01:00
00fd102f81 removed parameter bug 2023-12-27 23:33:17 +01:00
f369a13d37 changed function structure 2023-12-27 23:31:21 +01:00
f505be35d3 Refactored variables to global to reduce complexity 2023-12-27 23:24:31 +01:00
49c442b299 changed parameter 2023-12-27 21:37:06 +01:00
0322eee107 Removed redundant code 2023-12-27 21:36:07 +01:00
9a5b544e0b Implemented SQL dumps for backup everything 2023-12-27 21:26:11 +01:00
15d7406b7e Implemented global variables to reduce code complexity 2023-12-27 21:04:25 +01:00
9dd58f3ee4 Implemented forced file backup 2023-12-27 20:46:56 +01:00
7f383fcce2 Solved undefined variable bug 2023-12-26 20:33:35 +01:00
a72753921a Execute backup just once per volume and not once per volume container 2023-12-26 20:20:26 +01:00
407eddc2c3 Ignored images 2023-12-26 20:07:49 +01:00
3fedf49f4e Solved bug 2023-12-26 19:46:20 +01:00
fb2e1df233 Solved parameter bug 2023-12-26 19:21:01 +01:00
47922f53fa Solved versions bug 2023-12-26 18:53:58 +01:00
162b3eec06 Reimplemented incremental backup and improved downtime 2023-12-26 18:22:06 +01:00
e0fc263dcb Removed duplicated code 2023-12-26 16:09:18 +01:00
581ff501fc Implemented create_volume_directory 2023-12-26 14:34:32 +01:00
540797f244 Made password optional 2023-12-26 04:01:29 +01:00
7853283ef3 Added condition for passwordless login 2023-12-26 02:56:12 +01:00
5e91e298c4 Added instance attribute 2023-12-26 00:27:27 +01:00
de59646fc0 Removed deprecated warning 2023-12-25 23:39:28 +01:00
bcc8a7fb00 Solved header bug 2023-12-25 23:36:21 +01:00
8c4785dfe6 Solved replacement bug 2023-12-25 23:29:54 +01:00
d4799af904 Added database seeder 2023-12-25 23:20:30 +01:00
d1f942bc58 Escaped {{ }} correctly 2023-12-25 22:46:14 +01:00
397e242e5b Implemented whitelist 2023-12-25 22:33:28 +01:00
b06317ad48 Added more prints 2023-12-25 22:19:26 +01:00
79f4cb5e7f Refactored database logic 2023-12-25 21:57:23 +01:00
50db914c36 removed bugs 2023-12-25 21:49:06 +01:00
02062c7d49 Implemented postgres backup and a check whether stop/restart is necessary for file backup 2023-12-25 21:39:50 +01:00
a1c33c1747 refactored. untested. 2023-12-25 20:31:56 +01:00
b83e481d01 implemented exception handling for rsync 24 2023-12-13 08:55:02 +01:00
c4107d91b0 removed -t parameter 2023-12-13 00:53:30 +01:00
bff513d639 Solved spacing bug 2023-12-11 09:32:59 +01:00
be0bdff4d8 Raised the SIGTERM timeout to 2h 2023-12-09 14:29:49 +01:00
d8aa5f7d79 renamed 2023-11-16 23:02:09 +01:00
cdd3d88202 Update README.md 2023-11-16 17:32:30 +01:00
7488262c4c Update README.md 2023-11-16 17:31:44 +01:00
3a8f002f85 Updated README.md 2023-09-02 16:56:34 +02:00
978a8f93e3 Updated recovery script 2023-07-09 19:42:11 +02:00
5786e21c11 Optimized exits 2023-06-28 21:32:09 +02:00
7832c85de7 Optimized version by ChatGPT 2023-06-28 21:25:15 +02:00
9fa37046ab Optimized mysql recovery 2023-06-28 21:22:02 +02:00
eddccb1936 Changed mysql to mariadb 2023-06-25 22:15:00 +02:00
02449cb501 removed versions folder from recovery script 2023-04-19 13:21:09 +02:00
4290464986 removed version dir and latest link from backup script 2023-04-19 13:20:07 +02:00
46 changed files with 2674 additions and 196 deletions

.github/FUNDING.yml (new file, 7 lines)

@@ -0,0 +1,7 @@
github: kevinveenbirkenbach
patreon: kevinveenbirkenbach
buy_me_a_coffee: kevinveenbirkenbach
custom: https://s.veen.world/paypaldonate

.github/workflows/ci.yml (new file, 91 lines)

@@ -0,0 +1,91 @@
name: CI (make tests, stable, publish)
on:
push:
branches: ["**"]
tags: ["v*.*.*"] # SemVer tags like v1.2.3
pull_request:
permissions:
contents: write # push/update 'stable' tag
packages: write # push to GHCR
env:
IMAGE_NAME: baudolo
REGISTRY: ghcr.io
IMAGE_REPO: ${{ github.repository }}
jobs:
test:
name: make test
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Show docker info
run: |
docker version
docker info
- name: Run all tests via Makefile
run: |
make test
- name: Upload E2E artifacts (always)
if: always()
uses: actions/upload-artifact@v4
with:
name: e2e-artifacts
path: artifacts
if-no-files-found: ignore
stable_and_publish:
name: Mark stable + publish image (SemVer tags only)
needs: [test]
runs-on: ubuntu-latest
if: startsWith(github.ref, 'refs/tags/v')
steps:
- name: Checkout (full history for tags)
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Derive version from tag
id: ver
run: |
TAG="${GITHUB_REF#refs/tags/}" # v1.2.3
echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
- name: Mark 'stable' git tag (force update)
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git tag -f stable "${GITHUB_SHA}"
git push -f origin stable
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build image (Makefile)
run: |
make build
- name: Tag image for registry
run: |
# local image built by Makefile is: baudolo:local
docker tag "${IMAGE_NAME}:local" "${REGISTRY}/${IMAGE_REPO}:${{ steps.ver.outputs.tag }}"
docker tag "${IMAGE_NAME}:local" "${REGISTRY}/${IMAGE_REPO}:stable"
docker tag "${IMAGE_NAME}:local" "${REGISTRY}/${IMAGE_REPO}:sha-${GITHUB_SHA::12}"
- name: Push image
run: |
docker push "${REGISTRY}/${IMAGE_REPO}:${{ steps.ver.outputs.tag }}"
docker push "${REGISTRY}/${IMAGE_REPO}:stable"
docker push "${REGISTRY}/${IMAGE_REPO}:sha-${GITHUB_SHA::12}"
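
For reference, the publish job above only runs for SemVer tags, so a release is typically cut by pushing such a tag (version number below is hypothetical):

```bash
git tag v1.2.3
git push origin v1.2.3   # triggers stable_and_publish
```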

.gitignore (modified)

@@ -1 +1,2 @@
databases.csv
__pycache__
artifacts/

.travis.yml (deleted)

@@ -1,2 +0,0 @@
language: shell
script: shellcheck $(find . -type f -name '*.sh')

CHANGELOG.md (new file, 4 lines)

@@ -0,0 +1,4 @@
## [1.0.0] - 2025-12-27
* Official Release 🥳

Dockerfile (new file, 34 lines)

@@ -0,0 +1,34 @@
# syntax=docker/dockerfile:1
FROM python:3.11-slim
WORKDIR /app
# Runtime + build essentials:
# - rsync: required for file backup/restore
# - ca-certificates: TLS
# - docker-cli: needed if you want to control the host Docker engine (via /var/run/docker.sock mount)
# - make: to delegate install logic to Makefile
#
# Notes:
# - On Debian slim, the docker client package is typically "docker.io".
# - If you only want restore-without-docker, you can drop docker.io later.
RUN apt-get update && apt-get install -y --no-install-recommends \
make \
rsync \
ca-certificates \
docker.io \
&& rm -rf /var/lib/apt/lists/*
# Fail fast if docker client is missing
RUN command -v docker
COPY . .
# All install decisions are handled by the Makefile.
RUN make install
# Sensible defaults (can be overridden at runtime)
ENV PYTHONUNBUFFERED=1
# Default: show CLI help
CMD ["baudolo", "--help"]

MIRRORS (new file, 4 lines)

@@ -0,0 +1,4 @@
git@github.com:kevinveenbirkenbach/backup-docker-to-local.git
ssh://git@git.veen.world:2201/kevinveenbirkenbach/backup-docker-to-local.git
ssh://git@code.infinito.nexus:2201/kevinveenbirkenbach/backup-docker-to-local.git
https://pypi.org/project/baudolo/

Makefile (new file, 57 lines)

@@ -0,0 +1,57 @@
.PHONY: install build \
test-e2e test test-unit test-integration
# Default python if no venv is active
PY_DEFAULT ?= python3
IMAGE_NAME ?= baudolo
IMAGE_TAG ?= local
IMAGE := $(IMAGE_NAME):$(IMAGE_TAG)
install:
@set -eu; \
PY="$(PY_DEFAULT)"; \
if [ -n "$${VIRTUAL_ENV:-}" ] && [ -x "$${VIRTUAL_ENV}/bin/python" ]; then \
PY="$${VIRTUAL_ENV}/bin/python"; \
fi; \
echo ">>> Using python: $$PY"; \
"$$PY" -m pip install --upgrade pip; \
"$$PY" -m pip install -e .; \
command -v baudolo >/dev/null 2>&1 || { \
echo "ERROR: baudolo not found on PATH after install"; \
exit 2; \
}; \
baudolo --help >/dev/null 2>&1 || true
# ------------------------------------------------------------
# Build the baudolo Docker image
# ------------------------------------------------------------
build:
@echo ">> Building Docker image $(IMAGE)"
docker build -t $(IMAGE) .
clean:
git clean -fdX .
# ------------------------------------------------------------
# Run E2E tests inside the container (Docker socket required)
# ------------------------------------------------------------
# E2E via isolated Docker-in-Docker (DinD)
# - depends on local image build
# - starts a DinD daemon container on a dedicated network
# - loads the freshly built image into DinD
# - runs the unittest suite inside a container that talks to DinD via DOCKER_HOST
test-e2e: clean build
@bash scripts/test-e2e.sh
test: test-unit test-integration test-e2e
test-unit: clean build
@echo ">> Running unit tests"
@docker run --rm -t $(IMAGE) \
sh -lc 'python -m unittest discover -t . -s tests/unit -p "test_*.py" -v'
test-integration: clean build
@echo ">> Running integration tests"
@docker run --rm -t $(IMAGE) \
sh -lc 'python -m unittest discover -t . -s tests/integration -p "test_*.py" -v'
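
Typical local usage of these targets, assuming Docker is available on the host:

```bash
make build        # build the baudolo:local image
make test-unit    # unit tests inside the container
make test         # unit + integration + e2e (e2e runs via scripts/test-e2e.sh in DinD)
```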

README.md (rewritten, 239 lines changed)

@@ -1,62 +1,217 @@
# docker-volume-backup
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](./LICENSE.txt) [![Travis CI](https://api.travis-ci.org/kevinveenbirkenbach/docker-volume-backup.svg?branch=main)](https://travis-ci.org/kevinveenbirkenbach/docker-volume-backup)
# baudolo: Deterministic Backup & Restore for Docker Volumes 📦🔄
[![GitHub Sponsors](https://img.shields.io/badge/Sponsor-GitHub%20Sponsors-blue?logo=github)](https://github.com/sponsors/kevinveenbirkenbach) [![Patreon](https://img.shields.io/badge/Support-Patreon-orange?logo=patreon)](https://www.patreon.com/c/kevinveenbirkenbach) [![Buy Me a Coffee](https://img.shields.io/badge/Buy%20me%20a%20Coffee-Funding-yellow?logo=buymeacoffee)](https://buymeacoffee.com/kevinveenbirkenbach) [![PayPal](https://img.shields.io/badge/Donate-PayPal-blue?logo=paypal)](https://s.veen.world/paypaldonate) [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) [![Docker Version](https://img.shields.io/badge/Docker-Yes-blue.svg)](https://www.docker.com) [![Python Version](https://img.shields.io/badge/Python-3.x-blue.svg)](https://www.python.org) [![GitHub stars](https://img.shields.io/github/stars/kevinveenbirkenbach/backup-docker-to-local.svg?style=social)](https://github.com/kevinveenbirkenbach/backup-docker-to-local/stargazers)
## goal
This script backups all docker-volumes with the help of rsync.
## scheme
It is part of the following scheme:
![backup scheme](https://www.veen.world/wp-content/uploads/2020/12/server-backup-768x567.jpg)
Further information you will find [in this blog post](https://www.veen.world/2020/12/26/how-i-backup-dedicated-root-servers/).
`baudolo` is a backup and restore system for Docker volumes with
**mandatory file backups** and **explicit, deterministic database dumps**.
It is designed for environments with many Docker services where:
- file-level backups must always exist
- database dumps must be intentional, predictable, and auditable
## Backup all volumes
Execute:
## ✨ Key Features
```bash
./docker-volume-backup.sh
- 📦 Incremental Docker volume backups using `rsync --link-dest`
- 🗄 Optional SQL dumps for:
- PostgreSQL
- MariaDB / MySQL
- 🌱 Explicit database definition for SQL backups (no auto-discovery)
- 🧾 Backup integrity stamping via `dirval` (Python API)
- ⏸ Automatic container stop/start when required for consistency
- 🚫 Whitelisting of containers that do not require stopping
- ♻️ Modular, maintainable Python architecture
## 🧠 Core Concept (Important!)
`baudolo` **separates file backups from database dumps**.
- **Docker volumes are always backed up at file level**
- **SQL dumps are created only for explicitly defined databases**
This results in the following behavior:
| Database defined | File backup | SQL dump |
|------------------|-------------|----------|
| No | ✔ yes | ✘ no |
| Yes | ✔ yes | ✔ yes |
## 📁 Backup Layout
Backups are stored in a deterministic, fully nested structure:
```text
<backups-dir>/
└── <machine-hash>/
    └── <repo-name>/
        └── <timestamp>/
            └── <volume-name>/
                ├── files/
                └── sql/
                    └── <database>.backup.sql
```
## Recover
### Meaning of each level
### database
```bash
docker exec -i mysql_container mysql -uroot -psecret database < db.sql
```
* `<machine-hash>`
SHA256 hash of `/etc/machine-id` (host separation)
### volume
Execute:
* `<repo-name>`
Logical backup namespace (project / stack)
* `<timestamp>`
Backup generation (`YYYYMMDDHHMMSS`)
* `<volume-name>`
Docker volume name
* `files/`
Incremental file backup (rsync)
* `sql/`
Optional SQL dumps (only for defined databases)
## 🚀 Installation
### Local (editable install)
```bash
bash ./docker-volume-recover.sh "{{volume_name}}" "$(sha256sum /etc/machine-id | head -c 64)" "{{version_to_recover}}"
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
```
### Database
## 🌱 Database Definition (SQL Backup Scope)
## Debug
To checkout what's going on in the mount container type in the following command:
### How SQL backups are defined
`baudolo` creates SQL dumps **only** for databases that are **explicitly defined**
via configuration (e.g. a databases definition file or seeding step).
If a database is **not defined**:
* its Docker volume is still backed up (files)
* **no SQL dump is created**
> No database definition → file backup only
> Database definition present → file backup + SQL dump
### Why explicit definition?
`baudolo` does **not** inspect running containers to guess databases.
Databases must be explicitly defined to guarantee:
* deterministic backups
* predictable restore behavior
* reproducible environments
* zero accidental production data exposure
### Required database metadata
Each database definition provides:
* database instance (container or logical instance)
* database name
* database user
* database password
This information is used by `baudolo` to execute
`pg_dump`, `pg_dumpall`, or `mariadb-dump`.
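A minimal definition file might look like this (semicolon-separated, with the `instance`, `database`, `username`, and `password` columns used by the backup code in this changeset; the values are examples only):
```text
instance;database;username;password
central-postgres;appdb;appdb;secret
central-mariadb;shopdb;shopdb;secret
```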
## 💾 Running a Backup
```bash
docker run -it --entrypoint /bin/sh --rm --volumes-from {{container_name}} -v /Backups/:/Backups/ kevinveenbirkenbach/alpine-rsync
baudolo \
--compose-dir /srv/docker \
--databases-csv /etc/baudolo/databases.csv \
--database-containers central-postgres central-mariadb \
--images-no-stop-required alpine postgres mariadb mysql \
--images-no-backup-required redis busybox
```
## Setup
Install pandas
### Common Backup Flags
## Optimation
This setup script is not optimized yet for performance. Please optimized this script for performance if you want to use it in a professional environment.
| Flag | Description |
| --------------- | ------------------------------------------- |
| `--everything` | Always stop containers and re-run rsync |
| `--dump-only` | Only create SQL dumps, skip file backups |
| `--shutdown` | Do not restart containers after backup |
| `--backups-dir` | Backup root directory (default: `/Backups`) |
| `--repo-name` | Backup namespace under machine hash |
## Stucking rsync
- https://stackoverflow.com/questions/20773118/rsync-suddenly-hanging-indefinitely-during-transfers
## ♻️ Restore Operations
## More information
- https://docs.docker.com/storage/volumes/
- https://blog.ssdnodes.com/blog/docker-backup-volumes/
- https://www.baculasystems.com/blog/docker-backup-containers/
- https://gist.github.com/spalladino/6d981f7b33f6e0afe6bb
- https://stackoverflow.com/questions/26331651/how-can-i-backup-a-docker-container-with-its-data-volumes
- https://netfuture.ch/2013/08/simple-versioned-timemachine-like-backup-using-rsync/
- https://zwischenzugs.com/2016/08/29/bash-to-python-converter/
- https://en.wikipedia.org/wiki/Incremental_backup#Incremental
- https://unix.stackexchange.com/questions/567837/linux-backup-utility-for-incremental-backups
### Restore Volume Files
```bash
baudolo-restore files \
my-volume \
<machine-hash> \
<version> \
--backups-dir /Backups \
--repo-name my-repo
```
Restore into a **different target volume**:
```bash
baudolo-restore files \
target-volume \
<machine-hash> \
<version> \
--source-volume source-volume
```
### Restore PostgreSQL
```bash
baudolo-restore postgres \
my-volume \
<machine-hash> \
<version> \
--container postgres \
--db-name appdb \
--db-password secret \
--empty
```
### Restore MariaDB / MySQL
```bash
baudolo-restore mariadb \
my-volume \
<machine-hash> \
<version> \
--container mariadb \
--db-name shopdb \
--db-password secret \
--empty
```
> `baudolo` automatically detects whether `mariadb` or `mysql`
> is available inside the container
## 🔍 Backup Scheme
The backup mechanism uses incremental backups with rsync and stamps directories with a unique hash. For more details on the backup scheme, check out [this blog post](https://blog.veen.world/blog/2020/12/26/how-i-backup-dedicated-root-servers/).
![Backup Scheme](https://blog.veen.world/wp-content/uploads/2020/12/server-backup-1024x755.jpg)
## 👨‍💻 Author
**Kevin Veen-Birkenbach**
- 📧 [kevin@veen.world](mailto:kevin@veen.world)
- 🌐 [https://www.veen.world/](https://www.veen.world/)
## 📜 License
This project is licensed under the **GNU Affero General Public License v3.0**. See the [LICENSE](./LICENSE) file for details.
## 🔗 More Information
- [Docker Volumes Documentation](https://docs.docker.com/storage/volumes/)
- [Docker Backup Volumes Blog](https://blog.ssdnodes.com/blog/docker-backup-volumes/)
- [Backup Strategies](https://en.wikipedia.org/wiki/Incremental_backup#Incremental)
---
Happy Backing Up! 🚀🔐

databases.csv.tpl (deleted)

@@ -1 +0,0 @@
database;username;password;container

backup-docker-to-local.py (deleted, 118 lines)

@@ -1,118 +0,0 @@
#!/bin/python
# Backups volumes of running containers
#
import subprocess
import os
import re
import pathlib
import pandas
from datetime import datetime
def bash(command):
print(command)
process = subprocess.Popen([command], stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
out, err = process.communicate()
stdout = out.splitlines()
output = []
for line in stdout:
output.append(line.decode("utf-8"))
if process.wait() > bool(0):
print(command, out, err)
raise Exception("Exitcode is greater then 0")
return output
def print_bash(command):
output = bash(command)
print(list_to_string(output))
return output
def list_to_string(list):
return str(' '.join(list))
print('start backup routine...')
dirname = os.path.dirname(__file__)
repository_name = os.path.basename(dirname)
# identifier of this backups
machine_id = bash("sha256sum /etc/machine-id")[0][0:64]
# Folder in which all Backups are stored
backups_dir = '/Backups/'
# Folder in which docker volume backups are stored
backup_type_dir = backups_dir + machine_id + "/" + repository_name + "/"
# Folder containing all versions
versions_dir = backup_type_dir + "versions/"
# Time when the backup started
backup_time = datetime.now().strftime("%Y%m%d%H%M%S")
# Folder containing the current version
version_dir = versions_dir + backup_time + "/"
# Define latest path
latest_link = backup_type_dir + "latest/"
# Create folder to store version in
pathlib.Path(version_dir).mkdir(parents=True, exist_ok=True)
if pathlib.Path(latest_link).is_symlink():
print("Unlink " + latest_link + "...")
pathlib.Path(latest_link).unlink()
# Link latest to current version
pathlib.Path(latest_link).symlink_to(version_dir)
print('start volume backups...')
print('load connection data...')
databases = pandas.read_csv(dirname + "/databases.csv", sep=";")
volume_names = bash("docker volume ls --format '{{.Name}}'")
for volume_name in volume_names:
print('start backup routine for volume: ' + volume_name)
containers = bash("docker ps --filter volume=\"" + volume_name + "\" --format '{{.Names}}'")
if len(containers) == 0:
print('skipped due to no running containers using this volume.')
else:
container = containers[0]
# Folder to which the volumes are copied
volume_destination_dir = version_dir + volume_name
# Database name
database_name = re.split("(_|-)(database|db)", container)[0]
# Entries with database login data concerning this container
databases_entries = databases.loc[databases['database'] == database_name]
# Exception for akaunting due to fast implementation
if len(databases_entries) == 1 and container != 'akaunting':
print("Backup database...")
mysqldump_destination_dir = volume_destination_dir + "/sql"
mysqldump_destination_file = mysqldump_destination_dir + "/backup.sql"
pathlib.Path(mysqldump_destination_dir).mkdir(parents=True, exist_ok=True)
database_entry = databases_entries.iloc[0]
database_backup_command = "docker exec " + container + " /usr/bin/mysqldump -u " + database_entry["username"] + " -p" + database_entry["password"] + " " + database_entry["database"] + " > " + mysqldump_destination_file
print_bash(database_backup_command)
print("Backup files...")
files_rsync_destination_path = volume_destination_dir + "/files"
pathlib.Path(files_rsync_destination_path).mkdir(parents=True, exist_ok=True)
versions = os.listdir(versions_dir)
versions.sort(reverse=True)
if len(versions) > 1:
last_version = versions[1]
last_version_files_dir = versions_dir + last_version + "/" + volume_name + "/files"
if os.path.isdir(last_version_files_dir):
link_dest_parameter="--link-dest='" + last_version_files_dir + "' "
else:
print("No previous version exists in path "+ last_version_files_dir + ".")
link_dest_parameter=""
else:
print("No previous version exists in path "+ last_version_files_dir + ".")
link_dest_parameter=""
source_dir = "/var/lib/docker/volumes/" + volume_name + "/_data/"
rsync_command = "rsync -abP --delete --delete-excluded " + link_dest_parameter + source_dir + " " + files_rsync_destination_path
print_bash(rsync_command)
print("stop containers...")
print("Backup data after container is stopped...")
print_bash("docker stop " + list_to_string(containers))
print_bash(rsync_command)
print("start containers...")
print_bash("docker start " + list_to_string(containers))
print("end backup routine for volume:" + volume_name)
print('finished volume backups.')
print('restart docker service...')
print_bash("systemctl restart docker")
print('finished backup routine.')

recover-docker-from-local.sh (deleted, 32 lines)

@@ -1,32 +0,0 @@
#!/bin/bash
volume_name="$1" # Volume-Name
backup_hash="$2" # Hashed Machine ID
version="$3" # version to backup
container="$4" # optional
mysql_root_password="$5" # optional
database="$6" # optional
backup_folder="Backups/$backup_hash/docker-volume-backup/versions/$version/$volume_name"
backup_files="/$backup_folder/files"
backup_sql="/$backup_folder/sql/backup.sql"
echo "Inspect volume $volume_name"
docker volume inspect "$volume_name"
exit_status_volume_inspect=$?
if [ $exit_status_volume_inspect -eq 0 ]; then
echo "Volume $volume_name allready exists"
else
echo "Create volume $volume_name"
docker volume create "$volume_name"
fi
if [ -f "$backup_sql" ]; then
echo "recover mysql dump"
cat $backup_sql | docker exec -i "$container" /usr/bin/mysql -u root --password="$mysql_root_password" $database
exit 0
else
if [ -d "$backup_files" ]; then
echo "recover files"
docker run --rm -v "$volume_name:/recover/" -v "$backup_files:/backup/" "kevinveenbirkenbach/alpine-rsync" sh -c "rsync -avv --delete /backup/ /recover/"
fi
fi
echo "ERROR: $backup_files and $backup_sql don't exist"
exit 1

pyproject.toml (new file, 29 lines)

@@ -0,0 +1,29 @@
[build-system]
requires = ["setuptools>=69", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "backup-docker-to-local"
version = "1.0.0"
description = "Backup Docker volumes to local with rsync and optional DB dumps."
readme = "README.md"
requires-python = ">=3.9"
license = { text = "AGPL-3.0-or-later" }
authors = [{ name = "Kevin Veen-Birkenbach" }]
dependencies = [
"pandas",
"dirval",
]
[project.scripts]
baudolo = "baudolo.backup.__main__:main"
baudolo-restore = "baudolo.restore.__main__:main"
baudolo-seed = "baudolo.seed.__main__:main"
[tool.setuptools]
package-dir = { "" = "src" }
[tool.setuptools.packages.find]
where = ["src"]
exclude = ["tests*"]

scripts/test-e2e.sh (new executable file, 234 lines)

@@ -0,0 +1,234 @@
#!/usr/bin/env bash
set -euo pipefail
# -----------------------------------------------------------------------------
# E2E runner using Docker-in-Docker (DinD) with debug-on-failure
#
# Debug toggles:
# E2E_KEEP_ON_FAIL=1 -> keep DinD + volumes + network if tests fail
# E2E_KEEP_VOLUMES=1 -> keep volumes even on success/cleanup
# E2E_DEBUG_SHELL=1 -> open an interactive shell in the test container instead of running tests
# E2E_ARTIFACTS_DIR=./artifacts
# -----------------------------------------------------------------------------
NET="${E2E_NET:-baudolo-e2e-net}"
DIND="${E2E_DIND_NAME:-baudolo-e2e-dind}"
DIND_VOL="${E2E_DIND_VOL:-baudolo-e2e-dind-data}"
E2E_TMP_VOL="${E2E_TMP_VOL:-baudolo-e2e-tmp}"
DIND_HOST="${E2E_DIND_HOST:-tcp://127.0.0.1:2375}"
DIND_HOST_IN_NET="${E2E_DIND_HOST_IN_NET:-tcp://${DIND}:2375}"
IMG="${E2E_IMAGE:-baudolo:local}"
RSYNC_IMG="${E2E_RSYNC_IMAGE:-ghcr.io/kevinveenbirkenbach/alpine-rsync}"
READY_TIMEOUT_SECONDS="${E2E_READY_TIMEOUT_SECONDS:-120}"
ARTIFACTS_DIR="${E2E_ARTIFACTS_DIR:-./artifacts}"
KEEP_ON_FAIL="${E2E_KEEP_ON_FAIL:-0}"
KEEP_VOLUMES="${E2E_KEEP_VOLUMES:-0}"
DEBUG_SHELL="${E2E_DEBUG_SHELL:-0}"
FAILED=0
TS="$(date +%Y%m%d%H%M%S)"
mkdir -p "${ARTIFACTS_DIR}"
log() { echo ">> $*"; }
dump_debug() {
log "DEBUG: collecting diagnostics into ${ARTIFACTS_DIR}"
{
echo "=== Host docker version ==="
docker version || true
echo
echo "=== Host docker info ==="
docker info || true
echo
echo "=== DinD reachable? (docker -H ${DIND_HOST} version) ==="
docker -H "${DIND_HOST}" version || true
echo
} > "${ARTIFACTS_DIR}/debug-host-${TS}.txt" 2>&1 || true
# DinD logs
docker logs --tail=5000 "${DIND}" > "${ARTIFACTS_DIR}/dind-logs-${TS}.txt" 2>&1 || true
# DinD state
{
echo "=== docker -H ps -a ==="
docker -H "${DIND_HOST}" ps -a || true
echo
echo "=== docker -H images ==="
docker -H "${DIND_HOST}" images || true
echo
echo "=== docker -H network ls ==="
docker -H "${DIND_HOST}" network ls || true
echo
echo "=== docker -H volume ls ==="
docker -H "${DIND_HOST}" volume ls || true
echo
echo "=== docker -H system df ==="
docker -H "${DIND_HOST}" system df || true
} > "${ARTIFACTS_DIR}/debug-dind-${TS}.txt" 2>&1 || true
# Try to capture recent events (best effort; might be noisy)
docker -H "${DIND_HOST}" events --since 10m --until 0s \
> "${ARTIFACTS_DIR}/dind-events-${TS}.txt" 2>&1 || true
# Dump shared /tmp content from the tmp volume:
# We create a temporary container that mounts the volume, then tar its content.
# (Does not rely on host filesystem paths.)
log "DEBUG: archiving shared /tmp (volume ${E2E_TMP_VOL})"
docker -H "${DIND_HOST}" run --rm \
-v "${E2E_TMP_VOL}:/tmp" \
alpine:3.20 \
sh -lc 'cd /tmp && tar -czf /out.tar.gz . || true' \
>/dev/null 2>&1 || true
# The above writes inside the container FS, not to host. So do it properly:
# Use "docker cp" from a temp container.
local tmpc="baudolo-e2e-tmpdump-${TS}"
docker -H "${DIND_HOST}" rm -f "${tmpc}" >/dev/null 2>&1 || true
docker -H "${DIND_HOST}" create --name "${tmpc}" -v "${E2E_TMP_VOL}:/tmp" alpine:3.20 \
sh -lc 'cd /tmp && tar -czf /tmpdump.tar.gz . || true' >/dev/null
docker -H "${DIND_HOST}" start -a "${tmpc}" >/dev/null 2>&1 || true
docker -H "${DIND_HOST}" cp "${tmpc}:/tmpdump.tar.gz" "${ARTIFACTS_DIR}/e2e-tmp-${TS}.tar.gz" >/dev/null 2>&1 || true
docker -H "${DIND_HOST}" rm -f "${tmpc}" >/dev/null 2>&1 || true
log "DEBUG: artifacts written:"
ls -la "${ARTIFACTS_DIR}" | sed 's/^/ /' || true
}
cleanup() {
if [ "${FAILED}" -eq 1 ] && [ "${KEEP_ON_FAIL}" = "1" ]; then
log "KEEP_ON_FAIL=1 and failure detected -> skipping cleanup."
log "Next steps:"
echo " - Inspect DinD logs: docker logs ${DIND} | less"
echo " - Use DinD daemon: docker -H ${DIND_HOST} ps -a"
echo " - Shared tmp vol: docker -H ${DIND_HOST} run --rm -v ${E2E_TMP_VOL}:/tmp alpine:3.20 ls -la /tmp"
echo " - DinD docker root: docker -H ${DIND_HOST} run --rm -v ${DIND_VOL}:/var/lib/docker alpine:3.20 ls -la /var/lib/docker/volumes"
return 0
fi
log "Cleanup: stopping ${DIND} and removing network ${NET}"
docker rm -f "${DIND}" >/dev/null 2>&1 || true
docker network rm "${NET}" >/dev/null 2>&1 || true
if [ "${KEEP_VOLUMES}" != "1" ]; then
docker volume rm -f "${DIND_VOL}" >/dev/null 2>&1 || true
docker volume rm -f "${E2E_TMP_VOL}" >/dev/null 2>&1 || true
else
log "Keeping volumes (E2E_KEEP_VOLUMES=1): ${DIND_VOL}, ${E2E_TMP_VOL}"
fi
}
trap cleanup EXIT INT TERM
log "Creating network ${NET} (if missing)"
docker network inspect "${NET}" >/dev/null 2>&1 || docker network create "${NET}" >/dev/null
log "Removing old ${DIND} (if any)"
docker rm -f "${DIND}" >/dev/null 2>&1 || true
log "(Re)creating DinD data volume ${DIND_VOL}"
docker volume rm -f "${DIND_VOL}" >/dev/null 2>&1 || true
docker volume create "${DIND_VOL}" >/dev/null
log "(Re)creating shared /tmp volume ${E2E_TMP_VOL}"
docker volume rm -f "${E2E_TMP_VOL}" >/dev/null 2>&1 || true
docker volume create "${E2E_TMP_VOL}" >/dev/null
log "Starting Docker-in-Docker daemon ${DIND}"
docker run -d --privileged \
--name "${DIND}" \
--network "${NET}" \
-e DOCKER_TLS_CERTDIR="" \
-v "${DIND_VOL}:/var/lib/docker" \
-v "${E2E_TMP_VOL}:/tmp" \
-p 2375:2375 \
docker:dind \
--host=tcp://0.0.0.0:2375 \
--tls=false >/dev/null
log "Waiting for DinD to be ready..."
for i in $(seq 1 "${READY_TIMEOUT_SECONDS}"); do
if docker -H "${DIND_HOST}" version >/dev/null 2>&1; then
log "DinD is ready."
break
fi
sleep 1
if [ "${i}" -eq "${READY_TIMEOUT_SECONDS}" ]; then
echo "ERROR: DinD did not become ready in time"
docker logs --tail=200 "${DIND}" || true
FAILED=1
dump_debug || true
exit 1
fi
done
log "Pre-pulling helper images in DinD..."
log " - Pulling: ${RSYNC_IMG}"
docker -H "${DIND_HOST}" pull "${RSYNC_IMG}"
log "Ensuring alpine exists in DinD (for debug helpers)"
docker -H "${DIND_HOST}" pull alpine:3.20 >/dev/null
log "Loading ${IMG} image into DinD..."
docker save "${IMG}" | docker -H "${DIND_HOST}" load >/dev/null
log "Running E2E tests inside DinD"
set +e
if [ "${DEBUG_SHELL}" = "1" ]; then
log "E2E_DEBUG_SHELL=1 -> opening shell in test container"
docker run --rm -it \
--network "${NET}" \
-e DOCKER_HOST="${DIND_HOST_IN_NET}" \
-e E2E_RSYNC_IMAGE="${RSYNC_IMG}" \
-v "${DIND_VOL}:/var/lib/docker:ro" \
-v "${E2E_TMP_VOL}:/tmp" \
"${IMG}" \
sh -lc '
set -e
if [ ! -f /etc/machine-id ]; then
mkdir -p /etc
cat /proc/sys/kernel/random/uuid > /etc/machine-id
fi
echo ">> DOCKER_HOST=${DOCKER_HOST}"
docker ps -a || true
exec sh
'
rc=$?
else
docker run --rm \
--network "${NET}" \
-e DOCKER_HOST="${DIND_HOST_IN_NET}" \
-e E2E_RSYNC_IMAGE="${RSYNC_IMG}" \
-v "${DIND_VOL}:/var/lib/docker:ro" \
-v "${E2E_TMP_VOL}:/tmp" \
"${IMG}" \
sh -lc '
set -euo pipefail
set -x
export PYTHONUNBUFFERED=1
export TMPDIR=/tmp TMP=/tmp TEMP=/tmp
if [ ! -f /etc/machine-id ]; then
mkdir -p /etc
cat /proc/sys/kernel/random/uuid > /etc/machine-id
fi
python -m unittest discover -t . -s tests/e2e -p "test_*.py" -v -f
'
rc=$?
fi
set -e
if [ "${rc}" -ne 0 ]; then
FAILED=1
echo "ERROR: E2E tests failed (exit code: ${rc})"
dump_debug || true
exit "${rc}"
fi
log "E2E tests passed."

src/baudolo/__init__.py (new, empty file)

src/baudolo/backup/__init__.py (new file, 1 line)

@@ -0,0 +1 @@
"""Baudolo backup package."""

src/baudolo/backup/__main__.py (new file, 9 lines)

@@ -0,0 +1,9 @@
#!/usr/bin/env python3
from __future__ import annotations
from .app import main
if __name__ == "__main__":
raise SystemExit(main())

src/baudolo/backup/app.py (new file, 183 lines)

@@ -0,0 +1,183 @@
from __future__ import annotations
import os
import pathlib
from datetime import datetime
import pandas
from dirval import create_stamp_file
from .cli import parse_args
from .compose import handle_docker_compose_services
from .db import backup_database
from .docker import (
change_containers_status,
containers_using_volume,
docker_volume_names,
get_image_info,
has_image,
)
from .shell import execute_shell_command
from .volume import backup_volume
def get_machine_id() -> str:
return execute_shell_command("sha256sum /etc/machine-id")[0][0:64]
def stamp_directory(version_dir: str) -> None:
"""
Use dirval as a Python library to stamp the directory (no CLI dependency).
"""
create_stamp_file(version_dir)
def create_version_directory(versions_dir: str, backup_time: str) -> str:
version_dir = os.path.join(versions_dir, backup_time)
pathlib.Path(version_dir).mkdir(parents=True, exist_ok=True)
return version_dir
def create_volume_directory(version_dir: str, volume_name: str) -> str:
path = os.path.join(version_dir, volume_name)
pathlib.Path(path).mkdir(parents=True, exist_ok=True)
return path
def is_image_ignored(container: str, images_no_backup_required: list[str]) -> bool:
if not images_no_backup_required:
return False
img = get_image_info(container)
return any(pat in img for pat in images_no_backup_required)
def volume_is_fully_ignored(containers: list[str], images_no_backup_required: list[str]) -> bool:
"""
Skip file backup only if all containers linked to the volume are ignored.
"""
if not containers:
return False
return all(is_image_ignored(c, images_no_backup_required) for c in containers)
def requires_stop(containers: list[str], images_no_stop_required: list[str]) -> bool:
"""
Stop is required if ANY container image is NOT in the whitelist patterns.
"""
for c in containers:
img = get_image_info(c)
if not any(pat in img for pat in images_no_stop_required):
return True
return False
def backup_mariadb_or_postgres(
*,
container: str,
volume_dir: str,
databases_df: "pandas.DataFrame",
database_containers: list[str],
) -> bool:
"""
Returns True if the container is a DB container we handled.
"""
for img in ["mariadb", "postgres"]:
if has_image(container, img):
backup_database(
container=container,
volume_dir=volume_dir,
db_type=img,
databases_df=databases_df,
database_containers=database_containers,
)
return True
return False
def _backup_dumps_for_volume(
*,
containers: list[str],
vol_dir: str,
databases_df: "pandas.DataFrame",
database_containers: list[str],
) -> bool:
"""
Create DB dumps for any mariadb/postgres containers attached to this volume.
Returns True if at least one dump was produced.
"""
dumped_any = False
for c in containers:
if backup_mariadb_or_postgres(
container=c,
volume_dir=vol_dir,
databases_df=databases_df,
database_containers=database_containers,
):
dumped_any = True
return dumped_any
def main() -> int:
args = parse_args()
machine_id = get_machine_id()
backup_time = datetime.now().strftime("%Y%m%d%H%M%S")
versions_dir = os.path.join(args.backups_dir, machine_id, args.repo_name)
version_dir = create_version_directory(versions_dir, backup_time)
databases_df = pandas.read_csv(args.databases_csv, sep=";")
print("💾 Start volume backups...", flush=True)
for volume_name in docker_volume_names():
print(f"Start backup routine for volume: {volume_name}", flush=True)
containers = containers_using_volume(volume_name)
vol_dir = create_volume_directory(version_dir, volume_name)
# Old behavior: DB dumps are additional to file backups.
_backup_dumps_for_volume(
containers=containers,
vol_dir=vol_dir,
databases_df=databases_df,
database_containers=args.database_containers,
)
# dump-only: skip ALL file rsync backups
if args.dump_only:
continue
# skip file backup if all linked containers are ignored
if volume_is_fully_ignored(containers, args.images_no_backup_required):
print(
f"Skipping file backup for volume '{volume_name}' (all linked containers are ignored).",
flush=True,
)
continue
if args.everything:
# "everything": always do pre-rsync, then stop + rsync again
backup_volume(versions_dir, volume_name, vol_dir)
change_containers_status(containers, "stop")
backup_volume(versions_dir, volume_name, vol_dir)
if not args.shutdown:
change_containers_status(containers, "start")
continue
# default: rsync, and if needed stop + rsync
backup_volume(versions_dir, volume_name, vol_dir)
if requires_stop(containers, args.images_no_stop_required):
change_containers_status(containers, "stop")
backup_volume(versions_dir, volume_name, vol_dir)
if not args.shutdown:
change_containers_status(containers, "start")
# Stamp the backup version directory using dirval (python lib)
stamp_directory(version_dir)
print("Finished volume backups.", flush=True)
print("Handling Docker Compose services...", flush=True)
handle_docker_compose_services(args.compose_dir, args.docker_compose_hard_restart_required)
return 0

src/baudolo/backup/cli.py (new file, 93 lines)

@@ -0,0 +1,93 @@
from __future__ import annotations
import argparse
import os
from pathlib import Path
def _default_repo_name() -> str:
"""
Derive the repository name from the folder that contains `src/`.
Expected layout:
<repo-root>/src/baudolo/backup/cli.py
=> parents[0]=backup, [1]=baudolo, [2]=src, [3]=repo-root
"""
try:
return Path(__file__).resolve().parents[3].name
except Exception:
return "backup-docker-to-local"
def parse_args() -> argparse.Namespace:
dirname = os.path.dirname(__file__)
default_databases_csv = os.path.join(dirname, "databases.csv")
p = argparse.ArgumentParser(description="Backup Docker volumes.")
p.add_argument(
"--compose-dir",
type=str,
required=True,
help="Path to the parent directory containing docker-compose setups",
)
p.add_argument(
"--docker-compose-hard-restart-required",
nargs="+",
default=["mailu"],
help="Compose dir names that require 'docker-compose down && up -d' (default: mailu)",
)
p.add_argument(
"--repo-name",
default=_default_repo_name(),
help="Backup repo folder name under <backups-dir>/<machine-id>/ (default: git repo folder name)",
)
p.add_argument(
"--databases-csv",
default=default_databases_csv,
help=f"Path to databases.csv (default: {default_databases_csv})",
)
p.add_argument(
"--backups-dir",
default="/Backups",
help="Backup root directory (default: /Backups)",
)
p.add_argument(
"--database-containers",
nargs="+",
required=True,
help="Container names treated as special instances for database backups",
)
p.add_argument(
"--images-no-stop-required",
nargs="+",
required=True,
help="Image name patterns for which containers should not be stopped during file backup",
)
p.add_argument(
"--images-no-backup-required",
nargs="+",
default=[],
help="Image name patterns for which no backup should be performed",
)
p.add_argument(
"--everything",
action="store_true",
help="Force file backup for all volumes and also execute database dumps (like old script)",
)
p.add_argument(
"--shutdown",
action="store_true",
help="Do not restart containers after backup",
)
p.add_argument(
"--dump-only",
action="store_true",
help="Only create DB dumps (skip ALL file rsync backups)",
)
return p.parse_args()

src/baudolo/backup/compose.py (new file, 31 lines)

@@ -0,0 +1,31 @@
from __future__ import annotations
import os
import subprocess
def hard_restart_docker_services(dir_path: str) -> None:
print(f"Hard restart docker-compose services in: {dir_path}", flush=True)
subprocess.run(["docker-compose", "down"], cwd=dir_path, check=True)
subprocess.run(["docker-compose", "up", "-d"], cwd=dir_path, check=True)
def handle_docker_compose_services(parent_directory: str, hard_restart_required: list[str]) -> None:
for entry in os.scandir(parent_directory):
if not entry.is_dir():
continue
dir_path = entry.path
name = os.path.basename(dir_path)
compose_file = os.path.join(dir_path, "docker-compose.yml")
print(f"Checking directory: {dir_path}", flush=True)
if not os.path.isfile(compose_file):
print("No docker-compose.yml found. Skipping.", flush=True)
continue
if name in hard_restart_required:
print(f"{name}: hard restart required.", flush=True)
hard_restart_docker_services(dir_path)
else:
print(f"{name}: no restart required.", flush=True)

src/baudolo/backup/db.py (new file, 73 lines)

@@ -0,0 +1,73 @@
from __future__ import annotations
import os
import pathlib
import re
import pandas
from .shell import BackupException, execute_shell_command
def get_instance(container: str, database_containers: list[str]) -> str:
if container in database_containers:
return container
return re.split(r"(_|-)(database|db|postgres)", container)[0]
def fallback_pg_dumpall(container: str, username: str, password: str, out_file: str) -> None:
cmd = (
f"PGPASSWORD={password} docker exec -i {container} "
f"pg_dumpall -U {username} -h localhost > {out_file}"
)
execute_shell_command(cmd)
def backup_database(
*,
container: str,
volume_dir: str,
db_type: str,
databases_df: "pandas.DataFrame",
database_containers: list[str],
) -> None:
instance_name = get_instance(container, database_containers)
entries = databases_df.loc[databases_df["instance"] == instance_name]
if entries.empty:
raise BackupException(f"No entry found for instance '{instance_name}'")
out_dir = os.path.join(volume_dir, "sql")
pathlib.Path(out_dir).mkdir(parents=True, exist_ok=True)
for row in entries.iloc:
db_name = row["database"]
user = row["username"]
password = row["password"]
dump_file = os.path.join(out_dir, f"{db_name}.backup.sql")
if db_type == "mariadb":
cmd = (
f"docker exec {container} /usr/bin/mariadb-dump "
f"-u {user} -p{password} {db_name} > {dump_file}"
)
execute_shell_command(cmd)
continue
if db_type == "postgres":
cluster_file = os.path.join(out_dir, f"{instance_name}.cluster.backup.sql")
if not db_name:
fallback_pg_dumpall(container, user, password, cluster_file)
return
try:
cmd = (
f"PGPASSWORD={password} docker exec -i {container} "
f"pg_dump -U {user} -d {db_name} -h localhost > {dump_file}"
)
execute_shell_command(cmd)
except BackupException as e:
print(f"pg_dump failed: {e}", flush=True)
print(f"Falling back to pg_dumpall for instance '{instance_name}'", flush=True)
fallback_pg_dumpall(container, user, password, cluster_file)
continue

src/baudolo/backup/docker.py (new file, 43 lines)

@@ -0,0 +1,43 @@
from __future__ import annotations
from .shell import execute_shell_command
def get_image_info(container: str) -> str:
return execute_shell_command(
f"docker inspect --format '{{{{.Config.Image}}}}' {container}"
)[0]
def has_image(container: str, pattern: str) -> bool:
"""Return True if container's image contains the pattern."""
return pattern in get_image_info(container)
def docker_volume_names() -> list[str]:
return execute_shell_command("docker volume ls --format '{{.Name}}'")
def containers_using_volume(volume_name: str) -> list[str]:
return execute_shell_command(
f"docker ps --filter volume=\"{volume_name}\" --format '{{{{.Names}}}}'"
)
def change_containers_status(containers: list[str], status: str) -> None:
"""Stop or start a list of containers."""
if not containers:
print(f"No containers to {status}.", flush=True)
return
names = " ".join(containers)
print(f"{status.capitalize()} containers: {names}...", flush=True)
execute_shell_command(f"docker {status} {names}")
def docker_volume_exists(volume: str) -> bool:
# Avoid throwing exceptions for exists checks.
try:
execute_shell_command(f"docker volume inspect {volume} >/dev/null 2>&1 && echo OK")
return True
except Exception:
return False

src/baudolo/backup/shell.py (new file, 26 lines)

@@ -0,0 +1,26 @@
from __future__ import annotations
import subprocess
class BackupException(Exception):
"""Generic exception for backup errors."""
def execute_shell_command(command: str) -> list[str]:
"""Execute a shell command and return its output lines."""
print(command, flush=True)
process = subprocess.Popen(
[command],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
shell=True,
)
out, err = process.communicate()
if process.returncode != 0:
raise BackupException(
f"Error in command: {command}\n"
f"Output: {out}\nError: {err}\n"
f"Exit code: {process.returncode}"
)
return [line.decode("utf-8") for line in out.splitlines()]

src/baudolo/backup/volume.py (new file, 42 lines)

@@ -0,0 +1,42 @@
from __future__ import annotations
import os
import pathlib
from .shell import BackupException, execute_shell_command
def get_storage_path(volume_name: str) -> str:
path = execute_shell_command(
f"docker volume inspect --format '{{{{ .Mountpoint }}}}' {volume_name}"
)[0]
return f"{path}/"
def get_last_backup_dir(versions_dir: str, volume_name: str, current_backup_dir: str) -> str | None:
versions = sorted(os.listdir(versions_dir), reverse=True)
for version in versions:
candidate = os.path.join(versions_dir, version, volume_name, "files", "")
if candidate != current_backup_dir and os.path.isdir(candidate):
return candidate
return None
def backup_volume(versions_dir: str, volume_name: str, volume_dir: str) -> None:
"""Perform incremental file backup of a Docker volume."""
dest = os.path.join(volume_dir, "files") + "/"
pathlib.Path(dest).mkdir(parents=True, exist_ok=True)
last = get_last_backup_dir(versions_dir, volume_name, dest)
link_dest = f"--link-dest='{last}'" if last else ""
source = get_storage_path(volume_name)
cmd = f"rsync -abP --delete --delete-excluded {link_dest} {source} {dest}"
try:
execute_shell_command(cmd)
except BackupException as e:
if "file has vanished" in str(e):
print("Warning: Some files vanished before transfer. Continuing.", flush=True)
else:
raise

src/baudolo/restore/__init__.py (new file, 1 line)

@@ -0,0 +1 @@
__all__ = ["main"]

src/baudolo/restore/__main__.py (new file, 144 lines)

@@ -0,0 +1,144 @@
from __future__ import annotations
import argparse
import sys
from .paths import BackupPaths
from .files import restore_volume_files
from .db.postgres import restore_postgres_sql
from .db.mariadb import restore_mariadb_sql
def _add_common_backup_args(p: argparse.ArgumentParser) -> None:
p.add_argument("volume_name", help="Docker volume name (target volume)")
p.add_argument("backup_hash", help="Hashed machine id")
p.add_argument("version", help="Backup version directory name")
p.add_argument(
"--backups-dir",
default="/Backups",
help="Backup root directory (default: /Backups)",
)
p.add_argument(
"--repo-name",
default="backup-docker-to-local",
help="Backup repo folder name under <backups-dir>/<hash>/ (default: backup-docker-to-local)",
)
def main(argv: list[str] | None = None) -> int:
parser = argparse.ArgumentParser(
prog="baudolo-restore",
description="Restore docker volume files and DB dumps.",
)
sub = parser.add_subparsers(dest="cmd", required=True)
# ------------------------------------------------------------------
# files
# ------------------------------------------------------------------
p_files = sub.add_parser("files", help="Restore files into a docker volume")
_add_common_backup_args(p_files)
p_files.add_argument(
"--rsync-image",
default="ghcr.io/kevinveenbirkenbach/alpine-rsync",
)
p_files.add_argument(
"--source-volume",
default=None,
help=(
"Volume name used as backup source path key. "
"Defaults to <volume_name> (target volume). "
"Use this when restoring from one volume backup into a different target volume."
),
)
# ------------------------------------------------------------------
# postgres
# ------------------------------------------------------------------
p_pg = sub.add_parser("postgres", help="Restore a single PostgreSQL database dump")
_add_common_backup_args(p_pg)
p_pg.add_argument("--container", required=True)
p_pg.add_argument("--db-name", required=True)
p_pg.add_argument("--db-user", default=None, help="Defaults to db-name if omitted")
p_pg.add_argument("--db-password", required=True)
p_pg.add_argument("--empty", action="store_true")
# ------------------------------------------------------------------
# mariadb
# ------------------------------------------------------------------
p_mdb = sub.add_parser("mariadb", help="Restore a single MariaDB/MySQL-compatible dump")
_add_common_backup_args(p_mdb)
p_mdb.add_argument("--container", required=True)
p_mdb.add_argument("--db-name", required=True)
p_mdb.add_argument("--db-user", default=None, help="Defaults to db-name if omitted")
p_mdb.add_argument("--db-password", required=True)
p_mdb.add_argument("--empty", action="store_true")
args = parser.parse_args(argv)
try:
if args.cmd == "files":
# target volume = args.volume_name
# source volume (backup key) defaults to target volume
source_volume = args.source_volume or args.volume_name
bp_files = BackupPaths(
source_volume,
args.backup_hash,
args.version,
repo_name=args.repo_name,
backups_dir=args.backups_dir,
)
return restore_volume_files(
args.volume_name,
bp_files.files_dir(),
rsync_image=args.rsync_image,
)
if args.cmd == "postgres":
user = args.db_user or args.db_name
restore_postgres_sql(
container=args.container,
db_name=args.db_name,
user=user,
password=args.db_password,
sql_path=BackupPaths(
args.volume_name,
args.backup_hash,
args.version,
repo_name=args.repo_name,
backups_dir=args.backups_dir,
).sql_file(args.db_name),
empty=args.empty,
)
return 0
if args.cmd == "mariadb":
user = args.db_user or args.db_name
restore_mariadb_sql(
container=args.container,
db_name=args.db_name,
user=user,
password=args.db_password,
sql_path=BackupPaths(
args.volume_name,
args.backup_hash,
args.version,
repo_name=args.repo_name,
backups_dir=args.backups_dir,
).sql_file(args.db_name),
empty=args.empty,
)
return 0
parser.error("Unhandled command")
return 2
except Exception as e:
print(f"ERROR: {e}", file=sys.stderr)
return 1
if __name__ == "__main__":
raise SystemExit(main())
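
An illustrative programmatic invocation of the main() entrypoint above; the hash and version values are placeholders, and the e2e tests later in this commit show the equivalent console-script calls.

# Restore the files backed up from "vol-a" into a different target volume "vol-b".
argv = [
    "files",
    "vol-b",                        # target volume
    "<machine-hash>",               # backup_hash (placeholder)
    "<version>",                    # version directory (placeholder)
    "--backups-dir", "/Backups",
    "--repo-name", "backup-docker-to-local",
    "--source-volume", "vol-a",
]
exit_code = main(argv)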


@@ -0,0 +1 @@
"""Database restore handlers (Postgres, MariaDB/MySQL)."""


@@ -0,0 +1,89 @@
from __future__ import annotations
import os
import sys
from ..run import docker_exec, docker_exec_sh
def _pick_client(container: str) -> str:
"""
Prefer 'mariadb' and fall back to 'mysql'.
Some MariaDB images no longer ship a 'mysql' binary, so we must not assume it exists.
"""
script = r"""
set -eu
if command -v mariadb >/dev/null 2>&1; then echo mariadb; exit 0; fi
if command -v mysql >/dev/null 2>&1; then echo mysql; exit 0; fi
exit 42
"""
try:
out = docker_exec_sh(container, script, capture=True).stdout.decode().strip()
if not out:
raise RuntimeError("empty client detection output")
return out
except Exception as e:
print("ERROR: neither 'mariadb' nor 'mysql' found in container.", file=sys.stderr)
raise e
def restore_mariadb_sql(
*,
container: str,
db_name: str,
user: str,
password: str,
sql_path: str,
empty: bool,
) -> None:
client = _pick_client(container)
if not os.path.isfile(sql_path):
raise FileNotFoundError(sql_path)
if empty:
# IMPORTANT:
# Do NOT hardcode 'mysql' here. Use the detected client.
# MariaDB 11 images may not contain the mysql binary at all.
docker_exec(
container,
[client, "-u", user, f"--password={password}", "-e", "SET FOREIGN_KEY_CHECKS=0;"],
)
result = docker_exec(
container,
[
client,
"-u",
user,
f"--password={password}",
"-N",
"-e",
f"SELECT table_name FROM information_schema.tables WHERE table_schema = '{db_name}';",
],
capture=True,
)
tables = result.stdout.decode().split()
for tbl in tables:
docker_exec(
container,
[
client,
"-u",
user,
f"--password={password}",
"-e",
f"DROP TABLE IF EXISTS `{db_name}`.`{tbl}`;",
],
)
docker_exec(
container,
[client, "-u", user, f"--password={password}", "-e", "SET FOREIGN_KEY_CHECKS=1;"],
)
with open(sql_path, "rb") as f:
docker_exec(container, [client, "-u", user, f"--password={password}", db_name], stdin=f)
print(f"MariaDB/MySQL restore complete for db '{db_name}'.")


@@ -0,0 +1,53 @@
from __future__ import annotations
import os
from ..run import docker_exec
def restore_postgres_sql(
*,
container: str,
db_name: str,
user: str,
password: str,
sql_path: str,
empty: bool,
) -> None:
if not os.path.isfile(sql_path):
raise FileNotFoundError(sql_path)
# Make password available INSIDE the container for psql.
docker_env = {"PGPASSWORD": password}
if empty:
drop_sql = r"""
DO $$ DECLARE r RECORD;
BEGIN
FOR r IN (
SELECT table_name AS name, 'TABLE' AS type FROM information_schema.tables WHERE table_schema='public'
UNION ALL
SELECT routine_name AS name, 'FUNCTION' AS type FROM information_schema.routines WHERE specific_schema='public'
UNION ALL
SELECT sequence_name AS name, 'SEQUENCE' AS type FROM information_schema.sequences WHERE sequence_schema='public'
) LOOP
EXECUTE format('DROP %s public.%I CASCADE', r.type, r.name);
END LOOP;
END $$;
"""
docker_exec(
container,
["psql", "-v", "ON_ERROR_STOP=1", "-U", user, "-d", db_name],
stdin=drop_sql.encode(),
docker_env=docker_env,
)
with open(sql_path, "rb") as f:
docker_exec(
container,
["psql", "-v", "ON_ERROR_STOP=1", "-U", user, "-d", db_name],
stdin=f,
docker_env=docker_env,
)
print(f"PostgreSQL restore complete for db '{db_name}'.")


@@ -0,0 +1,37 @@
from __future__ import annotations
import os
import sys
from .run import run, docker_volume_exists
def restore_volume_files(volume_name: str, backup_files_dir: str, *, rsync_image: str) -> int:
if not os.path.isdir(backup_files_dir):
print(f"ERROR: backup files dir not found: {backup_files_dir}", file=sys.stderr)
return 2
if not docker_volume_exists(volume_name):
print(f"Volume {volume_name} does not exist. Creating...")
run(["docker", "volume", "create", volume_name])
else:
print(f"Volume {volume_name} already exists.")
# Keep behavior close to the old script: rsync -avv --delete
run(
[
"docker",
"run",
"--rm",
"-v",
f"{volume_name}:/recover/",
"-v",
f"{backup_files_dir}:/backup/",
rsync_image,
"sh",
"-lc",
"rsync -avv --delete /backup/ /recover/",
]
)
print("File restore complete.")
return 0
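
A hedged usage sketch: restore the "files" snapshot of one backup into a (possibly new) target volume. The directory layout matches BackupPaths below; the hash and version segments are placeholders.

rc = restore_volume_files(
    "target-volume",
    "/Backups/<hash>/backup-docker-to-local/<version>/source-volume/files",
    rsync_image="ghcr.io/kevinveenbirkenbach/alpine-rsync",
)
# rc is 0 on success and 2 when the backup files directory does not exist.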


@@ -0,0 +1,29 @@
from __future__ import annotations
import os
from dataclasses import dataclass
@dataclass(frozen=True)
class BackupPaths:
volume_name: str
backup_hash: str
version: str
repo_name: str
backups_dir: str = "/Backups"
def root(self) -> str:
# Always build an absolute path under backups_dir
return os.path.join(
self.backups_dir,
self.backup_hash,
self.repo_name,
self.version,
self.volume_name,
)
def files_dir(self) -> str:
return os.path.join(self.root(), "files")
def sql_file(self, db_name: str) -> str:
return os.path.join(self.root(), "sql", f"{db_name}.backup.sql")


@@ -0,0 +1,89 @@
from __future__ import annotations
import subprocess
import sys
from typing import Optional
def run(
cmd: list[str],
*,
stdin=None,
capture: bool = False,
env: Optional[dict] = None,
) -> subprocess.CompletedProcess:
try:
kwargs: dict = {
"check": True,
"capture_output": capture,
"env": env,
}
# If stdin is raw data (bytes/str), pass it via input=.
# IMPORTANT: when using input=..., do NOT pass stdin=... as well.
if isinstance(stdin, (bytes, str)):
kwargs["input"] = stdin
else:
kwargs["stdin"] = stdin
return subprocess.run(cmd, **kwargs)
except subprocess.CalledProcessError as e:
msg = f"ERROR: command failed ({e.returncode}): {' '.join(cmd)}"
print(msg, file=sys.stderr)
if e.stdout:
try:
print(e.stdout.decode(), file=sys.stderr)
except Exception:
print(e.stdout, file=sys.stderr)
if e.stderr:
try:
print(e.stderr.decode(), file=sys.stderr)
except Exception:
print(e.stderr, file=sys.stderr)
raise
def docker_exec(
container: str,
argv: list[str],
*,
stdin=None,
capture: bool = False,
env: Optional[dict] = None,
docker_env: Optional[dict[str, str]] = None,
) -> subprocess.CompletedProcess:
cmd: list[str] = ["docker", "exec", "-i"]
if docker_env:
for k, v in docker_env.items():
cmd.extend(["-e", f"{k}={v}"])
cmd.extend([container, *argv])
return run(cmd, stdin=stdin, capture=capture, env=env)
def docker_exec_sh(
container: str,
script: str,
*,
stdin=None,
capture: bool = False,
env: Optional[dict] = None,
docker_env: Optional[dict[str, str]] = None,
) -> subprocess.CompletedProcess:
return docker_exec(
container,
["sh", "-lc", script],
stdin=stdin,
capture=capture,
env=env,
docker_env=docker_env,
)
def docker_volume_exists(volume: str) -> bool:
p = subprocess.run(
["docker", "volume", "inspect", volume],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
return p.returncode == 0
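
Two hedged usage sketches of the stdin handling above: raw bytes are forwarded via input=, while real file objects are passed through as stdin (container name and paths are placeholders).

# bytes payload -> forwarded as input=..., avoiding the
# "'bytes' object has no attribute 'fileno'" failure mode
docker_exec(
    "my-postgres",
    ["psql", "-U", "postgres", "-d", "appdb"],
    stdin=b"SELECT 1;",
    docker_env={"PGPASSWORD": "pgpw"},
)

# file object -> streamed through as stdin=
with open("/tmp/appdb.backup.sql", "rb") as f:
    docker_exec(
        "my-postgres",
        ["psql", "-U", "postgres", "-d", "appdb"],
        stdin=f,
        docker_env={"PGPASSWORD": "pgpw"},
    )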


@@ -0,0 +1,50 @@
import pandas as pd
import argparse
import os
def check_and_add_entry(file_path, instance, database, username, password):
# Check if the file exists and is not empty
if os.path.exists(file_path) and os.path.getsize(file_path) > 0:
# Read the existing CSV file with header
df = pd.read_csv(file_path, sep=';')
else:
# Create a new DataFrame with columns if file does not exist
df = pd.DataFrame(columns=['instance', 'database', 'username', 'password'])
# Check if the entry exists and remove it
mask = (
(df['instance'] == instance) &
((df['database'] == database) |
(((df['database'].isna()) | (df['database'] == '')) & (database == ''))) &
(df['username'] == username)
)
if not df[mask].empty:
print("Replacing existing entry.")
df = df[~mask]
else:
print("Adding new entry.")
# Create a new DataFrame for the new entry
new_entry = pd.DataFrame([{'instance': instance, 'database': database, 'username': username, 'password': password}])
# Append the new entry (any matching old entry was removed above)
df = pd.concat([df, new_entry], ignore_index=True)
# Save the updated CSV file
df.to_csv(file_path, sep=';', index=False)
def main():
parser = argparse.ArgumentParser(description="Check and replace (or add) a database entry in a CSV file.")
parser.add_argument("file_path", help="Path to the CSV file")
parser.add_argument("instance", help="Database instance")
parser.add_argument("database", help="Database name")
parser.add_argument("username", help="Username")
parser.add_argument("password", nargs='?', default="", help="Password (optional)")
args = parser.parse_args()
check_and_add_entry(args.file_path, args.instance, args.database, args.username, args.password)
if __name__ == "__main__":
main()
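
A short behavioural sketch of the upsert logic above; the file name and values are illustrative.

check_and_add_entry("databases.csv", "my-db-container", "appdb", "alice", "old-pw")
check_and_add_entry("databases.csv", "my-db-container", "appdb", "alice", "new-pw")
# databases.csv now contains a single, replaced row:
# instance;database;username;password
# my-db-container;appdb;alice;new-pw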

tests/__init__.py (empty file)

tests/e2e/__init__.py (empty file)

tests/e2e/helpers.py

@@ -0,0 +1,222 @@
# tests/e2e/helpers.py
from __future__ import annotations
import shutil
import subprocess
import time
import uuid
from pathlib import Path
def run(
cmd: list[str],
*,
capture: bool = True,
check: bool = True,
cwd: str | None = None,
) -> subprocess.CompletedProcess:
try:
return subprocess.run(
cmd,
check=check,
cwd=cwd,
text=True,
capture_output=capture,
)
except subprocess.CalledProcessError as e:
# Print captured output so failing E2E tests are easy to debug from CI logs
print(">>> command failed:", " ".join(cmd))
print(">>> exit code:", e.returncode)
if e.stdout:
print(">>> STDOUT:\n" + e.stdout)
if e.stderr:
print(">>> STDERR:\n" + e.stderr)
raise
def sh(cmd: str, *, capture: bool = True, check: bool = True) -> subprocess.CompletedProcess:
return run(["sh", "-lc", cmd], capture=capture, check=check)
def unique(prefix: str) -> str:
return f"{prefix}-{uuid.uuid4().hex[:10]}"
def require_docker() -> None:
run(["docker", "version"], capture=True, check=True)
def machine_hash() -> str:
out = sh("sha256sum /etc/machine-id | awk '{print $1}'").stdout.strip()
if len(out) < 16:
raise RuntimeError("Could not determine machine hash from /etc/machine-id")
return out
def wait_for_log(container: str, pattern: str, timeout_s: int = 60) -> None:
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(["docker", "logs", container], capture=True, check=False)
if pattern in (p.stdout or ""):
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for log pattern '{pattern}' in {container}")
def wait_for_postgres(container: str, *, user: str = "postgres", timeout_s: int = 90) -> None:
"""
Docker-outside-of-Docker friendly readiness: check from inside the DB container.
"""
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(
["docker", "exec", container, "sh", "-lc", f"pg_isready -U {user} -h localhost"],
capture=True,
check=False,
)
if p.returncode == 0:
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for Postgres readiness in container {container}")
def wait_for_mariadb(container: str, *, root_password: str, timeout_s: int = 90) -> None:
"""
Liveness probe for MariaDB.
IMPORTANT (MariaDB 11):
Root TCP auth is often restricted (unix_socket auth), so a TCP ping like
`mariadb-admin -uroot -p... -h localhost ping` can fail even though the server is up.
We therefore check readiness via a socket-based query.
"""
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(
["docker", "exec", container, "sh", "-lc", "mariadb -uroot --protocol=socket -e \"SELECT 1;\""],
capture=True,
check=False,
)
if p.returncode == 0:
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for MariaDB readiness in container {container}")
def wait_for_mariadb_sql(container: str, *, user: str, password: str, timeout_s: int = 90) -> None:
"""
SQL login readiness for the *dedicated test user* over TCP.
This is separate from wait_for_mariadb(root) because root may be socket-only,
while the tests use a normal user that should work via TCP.
"""
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(
[
"docker",
"exec",
container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{user} -p{password} -e \"SELECT 1;\"",
],
capture=True,
check=False,
)
if p.returncode == 0:
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for MariaDB SQL login readiness in container {container}")
def backup_run(
*,
backups_dir: str,
repo_name: str,
compose_dir: str,
databases_csv: str,
database_containers: list[str],
images_no_stop_required: list[str],
images_no_backup_required: list[str] | None = None,
dump_only: bool = False,
) -> None:
cmd = [
"baudolo",
"--compose-dir", compose_dir,
"--docker-compose-hard-restart-required", "mailu",
"--repo-name", repo_name,
"--databases-csv", databases_csv,
"--backups-dir", backups_dir,
"--database-containers", *database_containers,
"--images-no-stop-required", *images_no_stop_required,
]
if images_no_backup_required:
cmd += ["--images-no-backup-required", *images_no_backup_required]
if dump_only:
cmd += ["--dump-only"]
try:
run(cmd, capture=True, check=True)
except subprocess.CalledProcessError as e:
print(">>> baudolo failed (exit code:", e.returncode, ")")
if e.stdout:
print(">>> baudolo STDOUT:\n" + e.stdout)
if e.stderr:
print(">>> baudolo STDERR:\n" + e.stderr)
raise
def latest_version_dir(backups_dir: str, repo_name: str) -> tuple[str, str]:
"""
Returns (hash, version) for the latest backup.
"""
h = machine_hash()
root = Path(backups_dir) / h / repo_name
if not root.is_dir():
raise FileNotFoundError(str(root))
versions = sorted([p.name for p in root.iterdir() if p.is_dir()])
if not versions:
raise RuntimeError(f"No versions found under {root}")
return h, versions[-1]
def backup_path(backups_dir: str, repo_name: str, version: str, volume: str) -> Path:
h = machine_hash()
return Path(backups_dir) / h / repo_name / version / volume
def create_minimal_compose_dir(base: str) -> str:
"""
baudolo requires --compose-dir. Create a minimal compose root containing a single non-compose subdirectory.
"""
p = Path(base) / "compose-root"
p.mkdir(parents=True, exist_ok=True)
(p / "noop").mkdir(parents=True, exist_ok=True)
return str(p)
def write_databases_csv(path: str, rows: list[tuple[str, str, str, str]]) -> None:
"""
rows: (instance, database, username, password)
database may be '' (empty) to trigger pg_dumpall-style behavior; these tests always pass an explicit database name.
"""
Path(path).parent.mkdir(parents=True, exist_ok=True)
with open(path, "w", encoding="utf-8") as f:
f.write("instance;database;username;password\n")
for inst, db, user, pw in rows:
f.write(f"{inst};{db};{user};{pw}\n")
def cleanup_docker(*, containers: list[str], volumes: list[str]) -> None:
for c in containers:
run(["docker", "rm", "-f", c], capture=True, check=False)
for v in volumes:
run(["docker", "volume", "rm", "-f", v], capture=True, check=False)
def ensure_empty_dir(path: str) -> None:
p = Path(path)
if p.exists():
shutil.rmtree(p)
p.mkdir(parents=True, exist_ok=True)


@@ -0,0 +1,94 @@
import unittest
from pathlib import Path
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
)
class TestE2EFilesFull(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-files-full")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.volume_src = f"{cls.prefix}-vol-src"
cls.volume_dst = f"{cls.prefix}-vol-dst"
cls.containers = []
cls.volumes = [cls.volume_src, cls.volume_dst]
# create source volume with a file
run(["docker", "volume", "create", cls.volume_src])
run([
"docker", "run", "--rm",
"-v", f"{cls.volume_src}:/data",
"alpine:3.20",
"sh", "-lc", "mkdir -p /data && echo 'hello' > /data/hello.txt",
])
# databases.csv (unused, but required by CLI)
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [])
# Run backup (files should be copied)
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=["dummy-db"],
images_no_stop_required=["alpine", "postgres", "mariadb", "mysql"],
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_exists(self) -> None:
p = (
backup_path(
self.backups_dir,
self.repo_name,
self.version,
self.volume_src,
)
/ "files"
/ "hello.txt"
)
self.assertTrue(p.is_file(), f"Expected backed up file at: {p}")
def test_restore_files_into_new_volume(self) -> None:
# restore files from volume_src backup into volume_dst
run([
"baudolo-restore", "files",
self.volume_dst, self.hash, self.version,
"--backups-dir", self.backups_dir,
"--repo-name", self.repo_name,
"--source-volume", self.volume_src,
"--rsync-image", "ghcr.io/kevinveenbirkenbach/alpine-rsync",
])
# verify restored file exists in dst volume
p = run([
"docker", "run", "--rm",
"-v", f"{self.volume_dst}:/data",
"alpine:3.20",
"sh", "-lc", "cat /data/hello.txt",
])
self.assertEqual((p.stdout or "").strip(), "hello")


@@ -0,0 +1,72 @@
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
)
class TestE2EFilesNoCopy(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-files-nocopy")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.volume_src = f"{cls.prefix}-vol-src"
cls.volume_dst = f"{cls.prefix}-vol-dst"
cls.containers = []
cls.volumes = [cls.volume_src, cls.volume_dst]
run(["docker", "volume", "create", cls.volume_src])
run([
"docker", "run", "--rm",
"-v", f"{cls.volume_src}:/data",
"alpine:3.20",
"sh", "-lc", "echo 'hello' > /data/hello.txt",
])
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [])
# dump-only => NO file rsync backups
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=["dummy-db"],
images_no_stop_required=["alpine", "postgres", "mariadb", "mysql"],
dump_only=True,
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_not_present(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.volume_src) / "files"
self.assertFalse(p.exists(), f"Did not expect files backup dir at: {p}")
def test_restore_files_fails_expected(self) -> None:
p = run([
"baudolo-restore", "files",
self.volume_dst, self.hash, self.version,
"--backups-dir", self.backups_dir,
"--repo-name", self.repo_name,
], check=False)
self.assertEqual(p.returncode, 2, f"Expected exitcode 2, got {p.returncode}\nSTDOUT={p.stdout}\nSTDERR={p.stderr}")


@@ -0,0 +1,155 @@
# tests/e2e/test_e2e_mariadb_full.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_mariadb,
wait_for_mariadb_sql,
)
class TestE2EMariaDBFull(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-mariadb-full")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.db_container = f"{cls.prefix}-mariadb"
cls.db_volume = f"{cls.prefix}-mariadb-vol"
cls.containers = [cls.db_container]
cls.volumes = [cls.db_volume]
cls.db_name = "appdb"
cls.db_user = "test"
cls.db_password = "testpw"
cls.root_password = "rootpw"
run(["docker", "volume", "create", cls.db_volume])
# Start MariaDB with a dedicated TCP-capable user for tests.
run(
[
"docker",
"run",
"-d",
"--name",
cls.db_container,
"-e",
f"MARIADB_ROOT_PASSWORD={cls.root_password}",
"-e",
f"MARIADB_DATABASE={cls.db_name}",
"-e",
f"MARIADB_USER={cls.db_user}",
"-e",
f"MARIADB_PASSWORD={cls.db_password}",
"-v",
f"{cls.db_volume}:/var/lib/mysql",
"mariadb:11",
]
)
# Liveness + actual SQL login readiness (TCP)
wait_for_mariadb(cls.db_container, root_password=cls.root_password, timeout_s=90)
wait_for_mariadb_sql(cls.db_container, user=cls.db_user, password=cls.db_password, timeout_s=90)
# Create table + data via the dedicated user (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"CREATE TABLE {cls.db_name}.t (id INT PRIMARY KEY, v VARCHAR(50)); "
f"INSERT INTO {cls.db_name}.t VALUES (1,'ok');\"",
]
)
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
# IMPORTANT: baudolo backup expects credentials for the DB dump.
write_databases_csv(cls.databases_csv, [(cls.db_container, cls.db_name, cls.db_user, cls.db_password)])
# Backup with file+dump
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.db_container],
images_no_stop_required=["mariadb", "mysql", "alpine", "postgres"],
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
# Wipe DB via the dedicated user (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"DROP TABLE {cls.db_name}.t;\"",
]
)
# Restore DB (uses baudolo-restore which execs mysql/mariadb inside the container)
run(
[
"baudolo-restore",
"mariadb",
cls.db_volume,
cls.hash,
cls.version,
"--backups-dir",
cls.backups_dir,
"--repo-name",
cls.repo_name,
"--container",
cls.db_container,
"--db-name",
cls.db_name,
"--db-user",
cls.db_user,
"--db-password",
cls.db_password,
"--empty",
]
)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_dump_file_exists(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.db_volume) / "sql" / f"{self.db_name}.backup.sql"
self.assertTrue(p.is_file(), f"Expected dump file at: {p}")
def test_data_restored(self) -> None:
p = run(
[
"docker",
"exec",
self.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{self.db_user} -p{self.db_password} "
f"-N -e \"SELECT v FROM {self.db_name}.t WHERE id=1;\"",
]
)
self.assertEqual((p.stdout or "").strip(), "ok")


@@ -0,0 +1,153 @@
# tests/e2e/test_e2e_mariadb_no_copy.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_mariadb,
wait_for_mariadb_sql,
)
class TestE2EMariaDBNoCopy(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-mariadb-nocopy")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.db_container = f"{cls.prefix}-mariadb"
cls.db_volume = f"{cls.prefix}-mariadb-vol"
cls.containers = [cls.db_container]
cls.volumes = [cls.db_volume]
cls.db_name = "appdb"
cls.db_user = "test"
cls.db_password = "testpw"
cls.root_password = "rootpw"
run(["docker", "volume", "create", cls.db_volume])
run(
[
"docker",
"run",
"-d",
"--name",
cls.db_container,
"-e",
f"MARIADB_ROOT_PASSWORD={cls.root_password}",
"-e",
f"MARIADB_DATABASE={cls.db_name}",
"-e",
f"MARIADB_USER={cls.db_user}",
"-e",
f"MARIADB_PASSWORD={cls.db_password}",
"-v",
f"{cls.db_volume}:/var/lib/mysql",
"mariadb:11",
]
)
wait_for_mariadb(cls.db_container, root_password=cls.root_password, timeout_s=90)
wait_for_mariadb_sql(cls.db_container, user=cls.db_user, password=cls.db_password, timeout_s=90)
# Create table + data (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"CREATE TABLE {cls.db_name}.t (id INT PRIMARY KEY, v VARCHAR(50)); "
f"INSERT INTO {cls.db_name}.t VALUES (1,'ok');\"",
]
)
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [(cls.db_container, cls.db_name, cls.db_user, cls.db_password)])
# dump-only => no files
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.db_container],
images_no_stop_required=["mariadb", "mysql", "alpine", "postgres"],
dump_only=True,
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
# Wipe table (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"DROP TABLE {cls.db_name}.t;\"",
]
)
# Restore DB
run(
[
"baudolo-restore",
"mariadb",
cls.db_volume,
cls.hash,
cls.version,
"--backups-dir",
cls.backups_dir,
"--repo-name",
cls.repo_name,
"--container",
cls.db_container,
"--db-name",
cls.db_name,
"--db-user",
cls.db_user,
"--db-password",
cls.db_password,
"--empty",
]
)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_not_present(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.db_volume) / "files"
self.assertFalse(p.exists(), f"Did not expect files backup dir at: {p}")
def test_data_restored(self) -> None:
p = run(
[
"docker",
"exec",
self.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{self.db_user} -p{self.db_password} "
f"-N -e \"SELECT v FROM {self.db_name}.t WHERE id=1;\"",
]
)
self.assertEqual((p.stdout or "").strip(), "ok")


@@ -0,0 +1,102 @@
# tests/e2e/test_e2e_postgres_full.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_postgres,
)
class TestE2EPostgresFull(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-postgres-full")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.pg_container = f"{cls.prefix}-pg"
cls.pg_volume = f"{cls.prefix}-pg-vol"
cls.containers = [cls.pg_container]
cls.volumes = [cls.pg_volume]
run(["docker", "volume", "create", cls.pg_volume])
run([
"docker", "run", "-d",
"--name", cls.pg_container,
"-e", "POSTGRES_PASSWORD=pgpw",
"-e", "POSTGRES_DB=appdb",
"-e", "POSTGRES_USER=postgres",
"-v", f"{cls.pg_volume}:/var/lib/postgresql/data",
"postgres:16",
])
wait_for_postgres(cls.pg_container, user="postgres", timeout_s=90)
# Create a table + data
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"CREATE TABLE t (id int primary key, v text); INSERT INTO t VALUES (1,'ok');\"",
])
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [(cls.pg_container, "appdb", "postgres", "pgpw")])
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.pg_container],
images_no_stop_required=["postgres", "mariadb", "mysql", "alpine"],
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
# Wipe schema
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"DROP TABLE t;\"",
])
# Restore
run([
"baudolo-restore", "postgres",
cls.pg_volume, cls.hash, cls.version,
"--backups-dir", cls.backups_dir,
"--repo-name", cls.repo_name,
"--container", cls.pg_container,
"--db-name", "appdb",
"--db-user", "postgres",
"--db-password", "pgpw",
"--empty",
])
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_dump_file_exists(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.pg_volume) / "sql" / "appdb.backup.sql"
self.assertTrue(p.is_file(), f"Expected dump file at: {p}")
def test_data_restored(self) -> None:
p = run([
"docker", "exec", self.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -t -c \"SELECT v FROM t WHERE id=1;\"",
])
self.assertEqual((p.stdout or "").strip(), "ok")


@@ -0,0 +1,99 @@
# tests/e2e/test_e2e_postgres_no_copy.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_postgres,
)
class TestE2EPostgresNoCopy(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-postgres-nocopy")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.pg_container = f"{cls.prefix}-pg"
cls.pg_volume = f"{cls.prefix}-pg-vol"
cls.containers = [cls.pg_container]
cls.volumes = [cls.pg_volume]
run(["docker", "volume", "create", cls.pg_volume])
run([
"docker", "run", "-d",
"--name", cls.pg_container,
"-e", "POSTGRES_PASSWORD=pgpw",
"-e", "POSTGRES_DB=appdb",
"-e", "POSTGRES_USER=postgres",
"-v", f"{cls.pg_volume}:/var/lib/postgresql/data",
"postgres:16",
])
wait_for_postgres(cls.pg_container, user="postgres", timeout_s=90)
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"CREATE TABLE t (id int primary key, v text); INSERT INTO t VALUES (1,'ok');\"",
])
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [(cls.pg_container, "appdb", "postgres", "pgpw")])
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.pg_container],
images_no_stop_required=["postgres", "mariadb", "mysql", "alpine"],
dump_only=True,
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"DROP TABLE t;\"",
])
run([
"baudolo-restore", "postgres",
cls.pg_volume, cls.hash, cls.version,
"--backups-dir", cls.backups_dir,
"--repo-name", cls.repo_name,
"--container", cls.pg_container,
"--db-name", "appdb",
"--db-user", "postgres",
"--db-password", "pgpw",
"--empty",
])
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_not_present(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.pg_volume) / "files"
self.assertFalse(p.exists(), f"Did not expect files backup dir at: {p}")
def test_data_restored(self) -> None:
p = run([
"docker", "exec", self.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -t -c \"SELECT v FROM t WHERE id=1;\"",
])
self.assertEqual((p.stdout or "").strip(), "ok")


@@ -0,0 +1,88 @@
import csv
import subprocess
import sys
import tempfile
import unittest
from pathlib import Path
def run_seed(csv_path: Path, instance: str, database: str, username: str, password: str = "") -> subprocess.CompletedProcess:
# Run the real CLI module (integration-style).
return subprocess.run(
[
sys.executable,
"-m",
"baudolo.seed",
str(csv_path),
instance,
database,
username,
password,
],
text=True,
capture_output=True,
check=True,
)
def read_csv_semicolon(path: Path) -> list[dict]:
with path.open("r", encoding="utf-8", newline="") as f:
reader = csv.DictReader(f, delimiter=";")
return list(reader)
class TestSeedIntegration(unittest.TestCase):
def test_creates_file_and_adds_entry_when_missing(self) -> None:
with tempfile.TemporaryDirectory() as td:
p = Path(td) / "databases.csv"
self.assertFalse(p.exists())
cp = run_seed(p, "docker.test", "appdb", "alice", "secret")
self.assertEqual(cp.returncode, 0, cp.stderr)
self.assertTrue(p.exists())
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["instance"], "docker.test")
self.assertEqual(rows[0]["database"], "appdb")
self.assertEqual(rows[0]["username"], "alice")
self.assertEqual(rows[0]["password"], "secret")
def test_replaces_existing_entry_same_keys(self) -> None:
with tempfile.TemporaryDirectory() as td:
p = Path(td) / "databases.csv"
# First add
run_seed(p, "docker.test", "appdb", "alice", "oldpw")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["password"], "oldpw")
# Replace (same instance+database+username)
run_seed(p, "docker.test", "appdb", "alice", "newpw")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1, "Expected replacement, not a duplicate row")
self.assertEqual(rows[0]["instance"], "docker.test")
self.assertEqual(rows[0]["database"], "appdb")
self.assertEqual(rows[0]["username"], "alice")
self.assertEqual(rows[0]["password"], "newpw")
def test_database_empty_string_matches_existing_empty_database(self) -> None:
with tempfile.TemporaryDirectory() as td:
p = Path(td) / "databases.csv"
# Add with empty database
run_seed(p, "docker.test", "", "alice", "pw1")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["database"], "")
# Replace with empty database again
run_seed(p, "docker.test", "", "alice", "pw2")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["database"], "")
self.assertEqual(rows[0]["password"], "pw2")

tests/unit/__init__.py (empty file)

tests/unit/test_backup.py

@@ -0,0 +1,36 @@
import unittest
from unittest.mock import patch
from baudolo.backup.app import requires_stop
class TestRequiresStop(unittest.TestCase):
@patch("baudolo.backup.app.get_image_info")
def test_requires_stop_false_when_all_images_are_whitelisted(self, mock_get_image_info):
# All containers use images containing allowed substrings
mock_get_image_info.side_effect = [
"repo/mastodon:v4",
"repo/wordpress:latest",
]
containers = ["c1", "c2"]
whitelist = ["mastodon", "wordpress"]
self.assertFalse(requires_stop(containers, whitelist))
@patch("baudolo.backup.app.get_image_info")
def test_requires_stop_true_when_any_image_is_not_whitelisted(self, mock_get_image_info):
mock_get_image_info.side_effect = [
"repo/mastodon:v4",
"repo/nginx:latest",
]
containers = ["c1", "c2"]
whitelist = ["mastodon", "wordpress"]
self.assertTrue(requires_stop(containers, whitelist))
@patch("baudolo.backup.app.get_image_info")
def test_requires_stop_true_when_whitelist_empty(self, mock_get_image_info):
mock_get_image_info.return_value = "repo/anything:latest"
self.assertTrue(requires_stop(["c1"], []))
if __name__ == "__main__":
unittest.main()