11 Commits

Author SHA1 Message Date
3b39a6ef02 Release version 1.0.0 2025-12-27 09:30:38 +01:00
e0b2e8934e docs(readme): rewrite README to reflect deterministic backup design
- clarify separation between file backups (always) and SQL dumps (explicit only)
- document correct nested backup directory layout
- remove legacy script-based usage and outdated sections
- add explicit explanation of database definition scope
- update usage examples to current baudolo CLI

https://chatgpt.com/share/694ef6d2-7584-800f-a32b-27367f234d1d
2025-12-26 21:57:46 +01:00
bbb2dd1732 Removed .travis 2025-12-26 21:03:00 +01:00
159502af5e Added mirrors 2025-12-26 20:50:29 +01:00
698d1e7a9e ci: add Makefile-driven CI with unit, integration and e2e tests
- add GitHub Actions CI workflow using Makefile targets exclusively
- run unit, integration and e2e tests via `make test`
- publish Docker image to GHCR on SemVer tags
- force-update `stable` git tag after successful release
- add integration test for seed CLI (CSV upsert behavior)
- extend Makefile with test-unit and test-integration targets

https://chatgpt.com/share/694ee54f-b814-800f-a714-e87563e538b7
2025-12-26 20:43:06 +01:00
f8420c8bea renamed configure to seed 2025-12-26 19:58:39 +01:00
8e1a53e1f9 Deleted Starting file 2025-12-26 19:47:54 +01:00
7b55d59300 fix(restore): handle bytes stdin correctly in subprocess wrapper
Avoid passing raw bytes/str via stdin to subprocess.run(), which caused
"'bytes' object has no attribute 'fileno'" and
"stdin and input arguments may not both be used" errors.

If stdin is bytes or str, pass it via input= instead; otherwise forward
stdin unchanged. This fixes Postgres restore failures in E2E tests
without changing productive restore logic.
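A minimal sketch of the described behavior, assuming a wrapper named `run()` that forwards to `subprocess.run()` (the name and signature are illustrative, not the project's exact wrapper):

```python
import subprocess

def run(cmd, stdin=None, check=True, **kwargs):
    """Run a command, accepting either a file object or raw bytes/str as stdin."""
    if isinstance(stdin, str):
        stdin = stdin.encode()  # normalize str to bytes so it can go through input=
    if isinstance(stdin, bytes):
        # Raw data has no fileno(); hand it to subprocess.run() via input=,
        # never via stdin= (and never both at once).
        return subprocess.run(cmd, input=stdin, check=check, **kwargs)
    # A real file object (or None) can be forwarded as stdin unchanged.
    return subprocess.run(cmd, stdin=stdin, check=check, **kwargs)
```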

https://chatgpt.com/share/694ed70d-9e04-800f-8dec-edf08e6e2082
2025-12-26 19:42:17 +01:00
cf6f4d8326 test(e2e): stabilize MariaDB 11 tests and fix restore empty-mode client selection
- add `make clean` and run it before `test-e2e` to avoid stale artifacts
- restore: do not hardcode `mysql` for --empty; use detected mariadb/mysql client
- e2e: improve subprocess error output for easier CI debugging
- e2e: adjust MariaDB readiness checks for socket-only root on MariaDB 11
- e2e: add `wait_for_mariadb_sql` and run SQL readiness checks for dedicated TCP user
- e2e: update MariaDB full/no-copy tests to use dedicated user over TCP and consistent credentials

https://chatgpt.com/share/694ecfb8-f8a8-800f-a1c9-a5f410d4ba02
2025-12-26 19:10:55 +01:00
4af15d9074 fix(restore): allow restoring files from different source volume
The file restore command previously assumed that the target volume name
was identical to the volume name used in the backup path. This caused
restores to fail (exit code 2) when restoring data from one volume backup
into a different target volume.

Introduce an optional --source-volume argument for `baudolo-restore files`
to explicitly specify the backup source volume while keeping the target
volume unchanged.

- Default behavior remains fully backward-compatible
- Enables restoring backups from volume A into volume B
- Fixes E2E test scenario restoring into a new volume

Tests:
- Update E2E file restore test to use --source-volume

https://chatgpt.com/share/694ec70f-1d7c-800f-b221-9d22e4b0775e
2025-12-26 18:33:58 +01:00
c30b4865d4 refactor: migrate to src/ package + add DinD-based E2E runner with debug artifacts
- Replace legacy standalone scripts with a proper src-layout Python package
  (baudolo backup/restore/configure entrypoints via pyproject.toml)
- Remove old scripts/files (backup-docker-to-local.py, recover-docker-from-local.sh,
  databases.csv.tpl, Todo.md)
- Add Dockerfile to build the project image for local/E2E usage
- Update Makefile: build image and run E2E via external runner script
- Add scripts/test-e2e.sh:
  - start DinD + dedicated network
  - recreate DinD data volume (and shared /tmp volume)
  - pre-pull helper images (alpine-rsync, alpine)
  - load local baudolo:local image into DinD
  - run unittest E2E suite inside DinD and abort on first failure
  - on failure: dump host+DinD diagnostics and archive shared /tmp into artifacts/
- Add artifacts/ debug outputs produced by failing E2E runs (logs, events, tmp archive)

https://chatgpt.com/share/694ec23f-0794-800f-9a59-8365bc80f435
2025-12-26 18:13:26 +01:00
47 changed files with 2563 additions and 848 deletions

.github/workflows/ci.yml vendored Normal file

@@ -0,0 +1,91 @@
name: CI (make tests, stable, publish)
on:
push:
branches: ["**"]
tags: ["v*.*.*"] # SemVer tags like v1.2.3
pull_request:
permissions:
contents: write # push/update 'stable' tag
packages: write # push to GHCR
env:
IMAGE_NAME: baudolo
REGISTRY: ghcr.io
IMAGE_REPO: ${{ github.repository }}
jobs:
test:
name: make test
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Show docker info
run: |
docker version
docker info
- name: Run all tests via Makefile
run: |
make test
- name: Upload E2E artifacts (always)
if: always()
uses: actions/upload-artifact@v4
with:
name: e2e-artifacts
path: artifacts
if-no-files-found: ignore
stable_and_publish:
name: Mark stable + publish image (SemVer tags only)
needs: [test]
runs-on: ubuntu-latest
if: startsWith(github.ref, 'refs/tags/v')
steps:
- name: Checkout (full history for tags)
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Derive version from tag
id: ver
run: |
TAG="${GITHUB_REF#refs/tags/}" # v1.2.3
echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
- name: Mark 'stable' git tag (force update)
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git tag -f stable "${GITHUB_SHA}"
git push -f origin stable
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build image (Makefile)
run: |
make build
- name: Tag image for registry
run: |
# local image built by Makefile is: baudolo:local
docker tag "${IMAGE_NAME}:local" "${REGISTRY}/${IMAGE_REPO}:${{ steps.ver.outputs.tag }}"
docker tag "${IMAGE_NAME}:local" "${REGISTRY}/${IMAGE_REPO}:stable"
docker tag "${IMAGE_NAME}:local" "${REGISTRY}/${IMAGE_REPO}:sha-${GITHUB_SHA::12}"
- name: Push image
run: |
docker push "${REGISTRY}/${IMAGE_REPO}:${{ steps.ver.outputs.tag }}"
docker push "${REGISTRY}/${IMAGE_REPO}:stable"
docker push "${REGISTRY}/${IMAGE_REPO}:sha-${GITHUB_SHA::12}"

.gitignore vendored

@@ -1,2 +1,2 @@
databases.csv
__pycache__
artifacts/


@@ -1,2 +0,0 @@
language: shell
script: shellcheck $(find . -type f -name '*.sh')

CHANGELOG.md Normal file

@@ -0,0 +1,4 @@
## [1.0.0] - 2025-12-27
* Official Release 🥳

Dockerfile Normal file

@@ -0,0 +1,34 @@
# syntax=docker/dockerfile:1
FROM python:3.11-slim
WORKDIR /app
# Runtime + build essentials:
# - rsync: required for file backup/restore
# - ca-certificates: TLS
# - docker-cli: needed if you want to control the host Docker engine (via /var/run/docker.sock mount)
# - make: to delegate install logic to Makefile
#
# Notes:
# - On Debian slim, the docker client package is typically "docker.io".
# - If you only want restore-without-docker, you can drop docker.io later.
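# Illustrative (not part of the build): controlling the host engine requires
# mounting the Docker socket when running this image, e.g.:
#   docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
#     -v /Backups:/Backups baudolo:local baudolo --help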
RUN apt-get update && apt-get install -y --no-install-recommends \
make \
rsync \
ca-certificates \
docker-cli \
&& rm -rf /var/lib/apt/lists/*
# Fail fast if docker client is missing
RUN command -v docker
COPY . .
# All install decisions are handled by the Makefile.
RUN make install
# Sensible defaults (can be overridden at runtime)
ENV PYTHONUNBUFFERED=1
# Default: show CLI help
CMD ["baudolo", "--help"]

MIRRORS Normal file

@@ -0,0 +1,4 @@
git@github.com:kevinveenbirkenbach/backup-docker-to-local.git
ssh://git@git.veen.world:2201/kevinveenbirkenbach/backup-docker-to-local.git
ssh://git@code.infinito.nexus:2201/kevinveenbirkenbach/backup-docker-to-local.git
https://pypi.org/project/baudolo/


@@ -1,12 +1,57 @@
.PHONY: install build \
	test-e2e test test-unit test-integration

# Default python if no venv is active
PY_DEFAULT ?= python3

IMAGE_NAME ?= baudolo
IMAGE_TAG ?= local
IMAGE := $(IMAGE_NAME):$(IMAGE_TAG)

install:
	@set -eu; \
	PY="$(PY_DEFAULT)"; \
	if [ -n "$${VIRTUAL_ENV:-}" ] && [ -x "$${VIRTUAL_ENV}/bin/python" ]; then \
		PY="$${VIRTUAL_ENV}/bin/python"; \
	fi; \
	echo ">>> Using python: $$PY"; \
	"$$PY" -m pip install --upgrade pip; \
	"$$PY" -m pip install -e .; \
	command -v baudolo >/dev/null 2>&1 || { \
		echo "ERROR: baudolo not found on PATH after install"; \
		exit 2; \
	}; \
	baudolo --help >/dev/null 2>&1 || true

# ------------------------------------------------------------
# Build the baudolo Docker image
# ------------------------------------------------------------
build:
	@echo ">> Building Docker image $(IMAGE)"
	docker build -t $(IMAGE) .

clean:
	git clean -fdX .

# ------------------------------------------------------------
# Run E2E tests inside the container (Docker socket required)
# ------------------------------------------------------------
# E2E via isolated Docker-in-Docker (DinD)
# - depends on local image build
# - starts a DinD daemon container on a dedicated network
# - loads the freshly built image into DinD
# - runs the unittest suite inside a container that talks to DinD via DOCKER_HOST
test-e2e: clean build
	@bash scripts/test-e2e.sh

test: test-unit test-integration test-e2e

test-unit: clean build
	@echo ">> Running unit tests"
	@docker run --rm -t $(IMAGE) \
		sh -lc 'python -m unittest discover -t . -s tests/unit -p "test_*.py" -v'

test-integration: clean build
	@echo ">> Running integration tests"
	@docker run --rm -t $(IMAGE) \
		sh -lc 'python -m unittest discover -t . -s tests/integration -p "test_*.py" -v'
README.md

@@ -1,80 +1,196 @@
# baudolo: Deterministic Backup & Restore for Docker Volumes 📦🔄

[![GitHub Sponsors](https://img.shields.io/badge/Sponsor-GitHub%20Sponsors-blue?logo=github)](https://github.com/sponsors/kevinveenbirkenbach) [![Patreon](https://img.shields.io/badge/Support-Patreon-orange?logo=patreon)](https://www.patreon.com/c/kevinveenbirkenbach) [![Buy Me a Coffee](https://img.shields.io/badge/Buy%20me%20a%20Coffee-Funding-yellow?logo=buymeacoffee)](https://buymeacoffee.com/kevinveenbirkenbach) [![PayPal](https://img.shields.io/badge/Donate-PayPal-blue?logo=paypal)](https://s.veen.world/paypaldonate) [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) [![Docker Version](https://img.shields.io/badge/Docker-Yes-blue.svg)](https://www.docker.com) [![Python Version](https://img.shields.io/badge/Python-3.x-blue.svg)](https://www.python.org) [![GitHub stars](https://img.shields.io/github/stars/kevinveenbirkenbach/backup-docker-to-local.svg?style=social)](https://github.com/kevinveenbirkenbach/backup-docker-to-local/stargazers)

`baudolo` is a backup and restore system for Docker volumes with
**mandatory file backups** and **explicit, deterministic database dumps**.
It is designed for environments with many Docker services where:

- file-level backups must always exist
- database dumps must be intentional, predictable, and auditable

## ✨ Key Features

- 📦 Incremental Docker volume backups using `rsync --link-dest`
- 🗄 Optional SQL dumps for:
  - PostgreSQL
  - MariaDB / MySQL
- 🌱 Explicit database definition for SQL backups (no auto-discovery)
- 🧾 Backup integrity stamping via `dirval` (Python API)
- ⏸ Automatic container stop/start when required for consistency
- 🚫 Whitelisting of containers that do not require stopping
- ♻️ Modular, maintainable Python architecture

## 🧠 Core Concept (Important!)

`baudolo` **separates file backups from database dumps**.

- **Docker volumes are always backed up at file level**
- **SQL dumps are created only for explicitly defined databases**

This results in the following behavior:

| Database defined | File backup | SQL dump |
|------------------|-------------|----------|
| No               | ✔ yes       | ✘ no     |
| Yes              | ✔ yes       | ✔ yes    |

## 📁 Backup Layout

Backups are stored in a deterministic, fully nested structure:

```text
<backups-dir>/
└── <machine-hash>/
    └── <repo-name>/
        └── <timestamp>/
            └── <volume-name>/
                ├── files/
                └── sql/
                    └── <database>.backup.sql
```

### Meaning of each level

* `<machine-hash>`
  SHA256 hash of `/etc/machine-id` (host separation)
* `<repo-name>`
  Logical backup namespace (project / stack)
* `<timestamp>`
  Backup generation (`YYYYMMDDHHMMSS`)
* `<volume-name>`
  Docker volume name
* `files/`
  Incremental file backup (rsync)
* `sql/`
  Optional SQL dumps (only for defined databases)
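For reference, `<machine-hash>` is the first 64 hex characters of the SHA256 of `/etc/machine-id`, which is exactly what `get_machine_id()` computes in the backup code:

```bash
sha256sum /etc/machine-id | head -c 64
```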
## 🚀 Installation

### Local (editable install)

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
```

## 🌱 Database Definition (SQL Backup Scope)

### How SQL backups are defined

`baudolo` creates SQL dumps **only** for databases that are **explicitly defined**
via configuration (e.g. a databases definition file or seeding step).

If a database is **not defined**:

* its Docker volume is still backed up (files)
* **no SQL dump is created**

> No database definition → file backup only
> Database definition present → file backup + SQL dump

### Why explicit definition?

`baudolo` does **not** inspect running containers to guess databases.
Databases must be explicitly defined to guarantee:

* deterministic backups
* predictable restore behavior
* reproducible environments
* zero accidental production data exposure

### Required database metadata

Each database definition provides:

* database instance (container or logical instance)
* database name
* database user
* database password

This information is used by `baudolo` to execute
`pg_dump`, `pg_dumpall`, or `mariadb-dump`.
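For illustration, a definition file in the semicolon-separated format read via `--databases-csv` might look like this; the header matches the legacy `databases.csv.tpl` shown later in this diff, and the instance names and credentials below are placeholders:

```csv
instance;database;username;password
central-postgres;appdb;appdb;secret
central-mariadb;shopdb;shopdb;secret
```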
## 💾 Running a Backup

```bash
baudolo \
  --compose-dir /srv/docker \
  --databases-csv /etc/baudolo/databases.csv \
  --database-containers central-postgres central-mariadb \
  --images-no-stop-required alpine postgres mariadb mysql \
  --images-no-backup-required redis busybox
```

### Common Backup Flags

| Flag | Description |
| --------------- | ------------------------------------------- |
| `--everything`  | Always stop containers and re-run rsync     |
| `--dump-only`   | Only create SQL dumps, skip file backups    |
| `--shutdown`    | Do not restart containers after backup      |
| `--backups-dir` | Backup root directory (default: `/Backups`) |
| `--repo-name`   | Backup namespace under machine hash         |

## ♻️ Restore Operations

### Restore Volume Files

```bash
baudolo-restore files \
  my-volume \
  <machine-hash> \
  <version> \
  --backups-dir /Backups \
  --repo-name my-repo
```

Restore into a **different target volume**:

```bash
baudolo-restore files \
  target-volume \
  <machine-hash> \
  <version> \
  --source-volume source-volume
```

### Restore PostgreSQL

```bash
baudolo-restore postgres \
  my-volume \
  <machine-hash> \
  <version> \
  --container postgres \
  --db-name appdb \
  --db-password secret \
  --empty
```

### Restore MariaDB / MySQL

```bash
baudolo-restore mariadb \
  my-volume \
  <machine-hash> \
  <version> \
  --container mariadb \
  --db-name shopdb \
  --db-password secret \
  --empty
```

> `baudolo` automatically detects whether `mariadb` or `mysql`
> is available inside the container.
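A minimal sketch of how such client detection can work; this is an illustration only, and `detect_sql_client` is a hypothetical name rather than the restore module's actual API:

```python
import subprocess

def detect_sql_client(container: str) -> str:
    """Return 'mariadb' if that client exists in the container, else fall back to 'mysql'."""
    for client in ("mariadb", "mysql"):
        probe = subprocess.run(
            ["docker", "exec", container, "sh", "-lc", f"command -v {client}"],
            capture_output=True,
        )
        if probe.returncode == 0:
            return client
    raise RuntimeError(f"no mariadb/mysql client found in container {container!r}")
```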
## 🔍 Backup Scheme

The backup mechanism uses incremental backups with rsync and stamps directories with a unique hash. For more details on the backup scheme, check out [this blog post](https://blog.veen.world/blog/2020/12/26/how-i-backup-dedicated-root-servers/).


@@ -1,2 +0,0 @@
# Todo
- Verify that restore backup is correct implemented


@@ -1,382 +0,0 @@
#!/bin/python
# Backups volumes of running containers
import subprocess
import os
import re
import pathlib
import pandas
from datetime import datetime
import argparse
class BackupException(Exception):
"""Generic exception for backup errors."""
pass
def execute_shell_command(command):
"""Execute a shell command and return its output."""
print(command)
process = subprocess.Popen(
[command],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
shell=True
)
out, err = process.communicate()
if process.returncode != 0:
raise BackupException(
f"Error in command: {command}\n"
f"Output: {out}\nError: {err}\n"
f"Exit code: {process.returncode}"
)
return [line.decode("utf-8") for line in out.splitlines()]
def create_version_directory():
"""Create necessary directories for backup."""
version_dir = os.path.join(VERSIONS_DIR, BACKUP_TIME)
pathlib.Path(version_dir).mkdir(parents=True, exist_ok=True)
return version_dir
def get_machine_id():
"""Get the machine identifier."""
return execute_shell_command("sha256sum /etc/machine-id")[0][0:64]
### GLOBAL CONFIGURATION ###
# Container names treated as special instances for database backups
DATABASE_CONTAINERS = ['central-mariadb', 'central-postgres']
# Images which do not require container stop for file backups
IMAGES_NO_STOP_REQUIRED = []
# Images to skip entirely
IMAGES_NO_BACKUP_REQUIRED = []
# Compose dirs requiring hard restart
DOCKER_COMPOSE_HARD_RESTART_REQUIRED = ['mailu']
# DEFINE CONSTANTS
DIRNAME = os.path.dirname(__file__)
SCRIPTS_DIRECTORY = pathlib.Path(os.path.realpath(__file__)).parent.parent
DATABASES = pandas.read_csv(os.path.join(DIRNAME, "databases.csv"), sep=";")
REPOSITORY_NAME = os.path.basename(DIRNAME)
MACHINE_ID = get_machine_id()
BACKUPS_DIR = '/Backups/'
VERSIONS_DIR = os.path.join(BACKUPS_DIR, MACHINE_ID, REPOSITORY_NAME)
BACKUP_TIME = datetime.now().strftime("%Y%m%d%H%M%S")
VERSION_DIR = create_version_directory()
def get_instance(container):
"""Extract the database instance name based on container name."""
if container in DATABASE_CONTAINERS:
instance_name = container
else:
instance_name = re.split("(_|-)(database|db|postgres)", container)[0]
print(f"Extracted instance name: {instance_name}")
return instance_name
def stamp_directory():
"""Stamp a directory using directory-validator."""
stamp_command = f"dirval {VERSION_DIR} --stamp"
try:
execute_shell_command(stamp_command)
print(f"Successfully stamped directory: {VERSION_DIR}")
except BackupException as e:
print(f"Error running 'dirval' for {VERSION_DIR}: {e}")
exit(1)
def backup_database(container, volume_dir, db_type):
"""Backup database (MariaDB or PostgreSQL) if applicable."""
print(f"Starting database backup for {container} using {db_type}...")
instance_name = get_instance(container)
database_entries = DATABASES.loc[DATABASES['instance'] == instance_name]
if database_entries.empty:
raise BackupException(f"No entry found for instance '{instance_name}'")
for database_entry in database_entries.iloc:
database_name = database_entry['database']
database_username = database_entry['username']
database_password = database_entry['password']
backup_destination_dir = os.path.join(volume_dir, "sql")
pathlib.Path(backup_destination_dir).mkdir(parents=True, exist_ok=True)
backup_destination_file = os.path.join(
backup_destination_dir,
f"{database_name}.backup.sql"
)
if db_type == 'mariadb':
cmd = (
f"docker exec {container} "
f"/usr/bin/mariadb-dump -u {database_username} "
f"-p{database_password} {database_name} > {backup_destination_file}"
)
execute_shell_command(cmd)
if db_type == 'postgres':
cluster_file = os.path.join(
backup_destination_dir,
f"{instance_name}.cluster.backup.sql"
)
if not database_name:
fallback_pg_dumpall(
container,
database_username,
database_password,
cluster_file
)
return
try:
if database_password:
cmd = (
f"PGPASSWORD={database_password} docker exec -i {container} "
f"pg_dump -U {database_username} -d {database_name} "
f"-h localhost > {backup_destination_file}"
)
else:
cmd = (
f"docker exec -i {container} pg_dump -U {database_username} "
f"-d {database_name} -h localhost --no-password "
f"> {backup_destination_file}"
)
execute_shell_command(cmd)
except BackupException as e:
print(f"pg_dump failed: {e}")
print(f"Falling back to pg_dumpall for instance '{instance_name}'")
fallback_pg_dumpall(
container,
database_username,
database_password,
cluster_file
)
print(f"Database backup for database {container} completed.")
def get_last_backup_dir(volume_name, current_backup_dir):
"""Get the most recent backup directory for the specified volume."""
versions = sorted(os.listdir(VERSIONS_DIR), reverse=True)
for version in versions:
backup_dir = os.path.join(
VERSIONS_DIR, version, volume_name, "files", ""
)
if backup_dir != current_backup_dir and os.path.isdir(backup_dir):
return backup_dir
print(f"No previous backups available for volume: {volume_name}")
return None
def getStoragePath(volume_name):
path = execute_shell_command(
f"docker volume inspect --format '{{{{ .Mountpoint }}}}' {volume_name}"
)[0]
return f"{path}/"
def getFileRsyncDestinationPath(volume_dir):
path = os.path.join(volume_dir, "files")
return f"{path}/"
def fallback_pg_dumpall(container, username, password, backup_destination_file):
"""Fallback function to run pg_dumpall if pg_dump fails or no DB is defined."""
print(f"Running pg_dumpall for container '{container}'...")
cmd = (
f"PGPASSWORD={password} docker exec -i {container} "
f"pg_dumpall -U {username} -h localhost > {backup_destination_file}"
)
execute_shell_command(cmd)
def backup_volume(volume_name, volume_dir):
"""Perform incremental file backup of a Docker volume."""
try:
print(f"Starting backup routine for volume: {volume_name}")
dest = getFileRsyncDestinationPath(volume_dir)
pathlib.Path(dest).mkdir(parents=True, exist_ok=True)
last = get_last_backup_dir(volume_name, dest)
link_dest = f"--link-dest='{last}'" if last else ""
source = getStoragePath(volume_name)
cmd = (
f"rsync -abP --delete --delete-excluded "
f"{link_dest} {source} {dest}"
)
execute_shell_command(cmd)
except BackupException as e:
if "file has vanished" in str(e):
print("Warning: Some files vanished before transfer. Continuing.")
else:
raise
print(f"Backup routine for volume: {volume_name} completed.")
def get_image_info(container):
return execute_shell_command(
f"docker inspect --format '{{{{.Config.Image}}}}' {container}"
)
def has_image(container, image):
"""Check if the container is using the image"""
info = get_image_info(container)[0]
return image in info
def change_containers_status(containers, status):
"""Stop or start a list of containers."""
if containers:
names = ' '.join(containers)
print(f"{status.capitalize()} containers: {names}...")
execute_shell_command(f"docker {status} {names}")
else:
print(f"No containers to {status}.")
def is_image_whitelisted(container, images):
"""
Return True if the container's image matches any of the whitelist patterns.
Also prints out the image name and the match result.
"""
# fetch the image (e.g. "nextcloud:23-fpm-alpine")
info = get_image_info(container)[0]
# check against each pattern
whitelisted = any(pattern in info for pattern in images)
# log the result
print(f"Container {container!r} → image {info!r} → whitelisted? {whitelisted}", flush=True)
return whitelisted
def is_container_stop_required(containers):
"""
Check if any of the containers are using images that are not whitelisted.
If so, print them out and return True; otherwise return False.
"""
# Find all containers whose image isnt on the whitelist
not_whitelisted = [
c for c in containers
if not is_image_whitelisted(c, IMAGES_NO_STOP_REQUIRED)
]
if not_whitelisted:
print(f"Containers requiring stop because they are not whitelisted: {', '.join(not_whitelisted)}")
return True
return False
def create_volume_directory(volume_name):
"""Create necessary directories for backup."""
path = os.path.join(VERSION_DIR, volume_name)
pathlib.Path(path).mkdir(parents=True, exist_ok=True)
return path
def is_image_ignored(container):
"""Check if the container's image is one of the ignored images."""
return any(has_image(container, img) for img in IMAGES_NO_BACKUP_REQUIRED)
def backup_with_containers_paused(volume_name, volume_dir, containers, shutdown):
change_containers_status(containers, 'stop')
backup_volume(volume_name, volume_dir)
if not shutdown:
change_containers_status(containers, 'start')
def backup_mariadb_or_postgres(container, volume_dir):
"""Performs database image specific backup procedures"""
for img in ['mariadb', 'postgres']:
if has_image(container, img):
backup_database(container, volume_dir, img)
return True
return False
def default_backup_routine_for_volume(volume_name, containers, shutdown):
"""Perform backup routine for a given volume."""
vol_dir = ""
for c in containers:
if is_image_ignored(c):
print(f"Ignoring volume '{volume_name}' linked to container '{c}'.")
continue
vol_dir = create_volume_directory(volume_name)
if backup_mariadb_or_postgres(c, vol_dir):
return
if vol_dir:
backup_volume(volume_name, vol_dir)
if is_container_stop_required(containers):
backup_with_containers_paused(volume_name, vol_dir, containers, shutdown)
def backup_everything(volume_name, containers, shutdown):
"""Perform file backup routine for a given volume."""
vol_dir = create_volume_directory(volume_name)
for c in containers:
backup_mariadb_or_postgres(c, vol_dir)
backup_volume(volume_name, vol_dir)
backup_with_containers_paused(volume_name, vol_dir, containers, shutdown)
def hard_restart_docker_services(dir_path):
"""Perform a hard restart of docker-compose services in the given directory."""
try:
print(f"Performing hard restart for docker-compose services in: {dir_path}")
subprocess.run(["docker-compose", "down"], cwd=dir_path, check=True)
subprocess.run(["docker-compose", "up", "-d"], cwd=dir_path, check=True)
print(f"Hard restart completed successfully in: {dir_path}")
except subprocess.CalledProcessError as e:
print(f"Error during hard restart in {dir_path}: {e}")
exit(2)
def handle_docker_compose_services(parent_directory):
"""Iterate through directories and restart or hard restart services as needed."""
for entry in os.scandir(parent_directory):
if entry.is_dir():
dir_path = entry.path
name = os.path.basename(dir_path)
print(f"Checking directory: {dir_path}")
compose_file = os.path.join(dir_path, "docker-compose.yml")
if os.path.isfile(compose_file):
print(f"Found docker-compose.yml in {dir_path}.")
if name in DOCKER_COMPOSE_HARD_RESTART_REQUIRED:
print(f"Directory {name} detected. Performing hard restart...")
hard_restart_docker_services(dir_path)
else:
print(f"No restart required for services in {dir_path}...")
else:
print(f"No docker-compose.yml found in {dir_path}. Skipping.")
def main():
global DATABASE_CONTAINERS, IMAGES_NO_STOP_REQUIRED
parser = argparse.ArgumentParser(description='Backup Docker volumes.')
parser.add_argument('--everything', action='store_true',
help='Force file backup for all volumes and additional execute database dumps')
parser.add_argument('--shutdown', action='store_true',
help='Doesn\'t restart containers after backup')
parser.add_argument('--compose-dir', type=str, required=True,
help='Path to the parent directory containing docker-compose setups')
parser.add_argument(
'--database-containers',
nargs='+',
required=True,
help='List of container names treated as special instances for database backups'
)
parser.add_argument(
'--images-no-stop-required',
nargs='+',
required=True,
help='List of image names for which containers should not be stopped during file backup'
)
parser.add_argument(
'--images-no-backup-required',
nargs='+',
help='List of image names for which no backup should be performed (optional)'
)
args = parser.parse_args()
DATABASE_CONTAINERS = args.database_containers
IMAGES_NO_STOP_REQUIRED = args.images_no_stop_required
if args.images_no_backup_required is not None:
global IMAGES_NO_BACKUP_REQUIRED
IMAGES_NO_BACKUP_REQUIRED = args.images_no_backup_required
print('💾 Start volume backups...', flush=True)
volume_names = execute_shell_command("docker volume ls --format '{{.Name}}'")
for volume_name in volume_names:
print(f'Start backup routine for volume: {volume_name}')
containers = execute_shell_command(
f"docker ps --filter volume=\"{volume_name}\" --format '{{{{.Names}}}}'"
)
if args.everything:
backup_everything(volume_name, containers, args.shutdown)
else:
default_backup_routine_for_volume(volume_name, containers, args.shutdown)
stamp_directory()
print('Finished volume backups.')
print('Handling Docker Compose services...')
handle_docker_compose_services(args.compose_dir)
if __name__ == "__main__":
main()


@@ -1 +0,0 @@
instance;database;username;password

pyproject.toml Normal file

@@ -0,0 +1,29 @@
[build-system]
requires = ["setuptools>=69", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "backup-docker-to-local"
version = "1.0.0"
description = "Backup Docker volumes to local with rsync and optional DB dumps."
readme = "README.md"
requires-python = ">=3.9"
license = { text = "AGPL-3.0-or-later" }
authors = [{ name = "Kevin Veen-Birkenbach" }]
dependencies = [
"pandas",
"dirval",
]
[project.scripts]
baudolo = "baudolo.backup.__main__:main"
baudolo-restore = "baudolo.restore.__main__:main"
baudolo-seed = "baudolo.seed.__main__:main"
[tool.setuptools]
package-dir = { "" = "src" }
[tool.setuptools.packages.find]
where = ["src"]
exclude = ["tests*"]


@@ -1,85 +0,0 @@
#!/bin/bash
# Check minimum number of arguments
if [ $# -lt 3 ]; then
echo "ERROR: Not enough arguments. Please provide at least a volume name, backup hash, and version."
exit 1
fi
volume_name="$1" # Volume-Name
backup_hash="$2" # Hashed Machine ID
version="$3" # version to recover
# DATABASE PARAMETERS
database_type="$4" # Valid values; mariadb, postgress
database_container="$5" # optional
database_password="$6" # optional
database_name="$7" # optional
database_user="$database_name"
backup_folder="Backups/$backup_hash/backup-docker-to-local/$version/$volume_name"
backup_files="/$backup_folder/files"
backup_sql="/$backup_folder/sql/$database_name.backup.sql"
# DATABASE RECOVERY
if [ ! -z "$database_type" ]; then
if [ "$database_type" = "postgres" ]; then
if [ -n "$database_container" ] && [ -n "$database_password" ] && [ -n "$database_name" ]; then
echo "Recover PostgreSQL dump"
export PGPASSWORD="$database_password"
cat "$backup_sql" | docker exec -i "$database_container" psql -v ON_ERROR_STOP=1 -U "$database_user" -d "$database_name"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to recover PostgreSQL dump"
exit 1
fi
exit 0
fi
elif [ "$database_type" = "mariadb" ]; then
if [ -n "$database_container" ] && [ -n "$database_password" ] && [ -n "$database_name" ]; then
echo "recover mysql dump"
cat "$backup_sql" | docker exec -i "$database_container" mariadb -u "$database_user" --password="$database_password" "$database_name"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to recover mysql dump"
exit 1
fi
exit 0
fi
fi
echo "A database backup exists, but a parameter is missing."
exit 1
fi
# FILE RECOVERY
echo "Inspect volume $volume_name"
docker volume inspect "$volume_name"
exit_status_volume_inspect=$?
if [ $exit_status_volume_inspect -eq 0 ]; then
echo "Volume $volume_name already exists"
else
echo "Create volume $volume_name"
docker volume create "$volume_name"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to create volume $volume_name"
exit 1
fi
fi
if [ -d "$backup_files" ]; then
echo "recover files"
docker run --rm -v "$volume_name:/recover/" -v "$backup_files:/backup/" "kevinveenbirkenbach/alpine-rsync" sh -c "rsync -avv --delete /backup/ /recover/"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to recover files"
exit 1
fi
exit 0
else
echo "ERROR: $backup_files doesn't exist"
exit 1
fi
echo "ERROR: Unhandled case"
exit 1


@@ -1,5 +0,0 @@
pacman:
- lsof
- python-pandas
pkgmgr:
- dirval


@@ -1,170 +0,0 @@
#!/usr/bin/env python3
# @todo Not tested yet. Needs to be tested
"""
restore_backup.py
A script to recover Docker volumes and database dumps from local backups.
Supports an --empty flag to clear the database objects before import (drops all tables/functions etc.).
"""
import argparse
import os
import sys
import subprocess
def run_command(cmd, capture_output=False, input=None, **kwargs):
"""Run a subprocess command and handle errors."""
try:
result = subprocess.run(cmd, check=True, capture_output=capture_output, input=input, **kwargs)
return result
except subprocess.CalledProcessError as e:
print(f"ERROR: Command '{' '.join(cmd)}' failed with exit code {e.returncode}")
if e.stdout:
print(e.stdout.decode())
if e.stderr:
print(e.stderr.decode())
sys.exit(1)
def recover_postgres(container, password, db_name, user, backup_sql, empty=False):
print("Recovering PostgreSQL dump...")
os.environ['PGPASSWORD'] = password
if empty:
print("Dropping existing PostgreSQL objects...")
# Drop all tables, views, sequences, functions in public schema
drop_sql = """
DO $$ DECLARE r RECORD;
BEGIN
FOR r IN (
SELECT table_name AS name, 'TABLE' AS type FROM information_schema.tables WHERE table_schema='public'
UNION ALL
SELECT routine_name AS name, 'FUNCTION' AS type FROM information_schema.routines WHERE specific_schema='public'
UNION ALL
SELECT sequence_name AS name, 'SEQUENCE' AS type FROM information_schema.sequences WHERE sequence_schema='public'
) LOOP
-- Use %s for type to avoid quoting the SQL keyword
EXECUTE format('DROP %s public.%I CASCADE', r.type, r.name);
END LOOP;
END
$$;
"""
run_command([
'docker', 'exec', '-i', container,
'psql', '-v', 'ON_ERROR_STOP=1', '-U', user, '-d', db_name
], input=drop_sql.encode())
print("Existing objects dropped.")
print("Importing the dump...")
with open(backup_sql, 'rb') as f:
run_command([
'docker', 'exec', '-i', container,
'psql', '-v', 'ON_ERROR_STOP=1', '-U', user, '-d', db_name
], stdin=f)
print("PostgreSQL recovery complete.")
def recover_mariadb(container, password, db_name, user, backup_sql, empty=False):
print("Recovering MariaDB dump...")
if empty:
print("Dropping existing MariaDB tables...")
# Disable foreign key checks
run_command([
'docker', 'exec', container,
'mysql', '-u', user, f"--password={password}", '-e', 'SET FOREIGN_KEY_CHECKS=0;'
])
# Get all table names
result = run_command([
'docker', 'exec', container,
'mysql', '-u', user, f"--password={password}", '-N', '-e',
f"SELECT table_name FROM information_schema.tables WHERE table_schema = '{db_name}';"
], capture_output=True)
tables = result.stdout.decode().split()
for tbl in tables:
run_command([
'docker', 'exec', container,
'mysql', '-u', user, f"--password={password}", '-e',
f"DROP TABLE IF EXISTS `{db_name}`.`{tbl}`;"
])
# Enable foreign key checks
run_command([
'docker', 'exec', container,
'mysql', '-u', user, f"--password={password}", '-e', 'SET FOREIGN_KEY_CHECKS=1;'
])
print("Existing tables dropped.")
print("Importing the dump...")
with open(backup_sql, 'rb') as f:
run_command([
'docker', 'exec', '-i', container,
'mariadb', '-u', user, f"--password={password}", db_name
], stdin=f)
print("MariaDB recovery complete.")
def recover_files(volume_name, backup_files):
print(f"Inspecting volume {volume_name}...")
inspect = subprocess.run(['docker', 'volume', 'inspect', volume_name], stdout=subprocess.DEVNULL)
if inspect.returncode != 0:
print(f"Volume {volume_name} does not exist. Creating...")
run_command(['docker', 'volume', 'create', volume_name])
else:
print(f"Volume {volume_name} already exists.")
if not os.path.isdir(backup_files):
print(f"ERROR: Backup files folder '{backup_files}' does not exist.")
sys.exit(1)
print("Recovering files...")
run_command([
'docker', 'run', '--rm',
'-v', f"{volume_name}:/recover/",
'-v', f"{backup_files}:/backup/",
'kevinveenbirkenbach/alpine-rsync',
'sh', '-c', 'rsync -avv --delete /backup/ /recover/'
])
print("File recovery complete.")
def main():
parser = argparse.ArgumentParser(
description='Recover Docker volumes and database dumps from local backups.'
)
parser.add_argument('volume_name', help='Name of the Docker volume')
parser.add_argument('backup_hash', help='Hashed Machine ID')
parser.add_argument('version', help='Version to recover')
parser.add_argument('--db-type', choices=['postgres', 'mariadb'], help='Type of database backup')
parser.add_argument('--db-container', help='Docker container name for the database')
parser.add_argument('--db-password', help='Password for the database user')
parser.add_argument('--db-name', help='Name of the database')
parser.add_argument('--empty', action='store_true', help='Drop existing database objects before importing')
args = parser.parse_args()
volume = args.volume_name
backup_hash = args.backup_hash
version = args.version
backup_folder = os.path.join('Backups', backup_hash, 'backup-docker-to-local', version, volume)
backup_files = os.path.join(os.sep, backup_folder, 'files')
backup_sql = None
if args.db_name:
backup_sql = os.path.join(os.sep, backup_folder, 'sql', f"{args.db_name}.backup.sql")
# Database recovery
if args.db_type:
if not (args.db_container and args.db_password and args.db_name):
print("ERROR: A database backup exists, aber ein Parameter fehlt.")
sys.exit(1)
user = args.db_name
if args.db_type == 'postgres':
recover_postgres(args.db_container, args.db_password, args.db_name, user, backup_sql, empty=args.empty)
else:
recover_mariadb(args.db_container, args.db_password, args.db_name, user, backup_sql, empty=args.empty)
sys.exit(0)
# File recovery
recover_files(volume, backup_files)
if __name__ == '__main__':
main()


@@ -1,96 +0,0 @@
#!/usr/bin/env python3
"""
Restore multiple PostgreSQL databases from .backup.sql files via a Docker container.
Usage:
./restore_databases.py /path/to/backup_dir container_name
"""
import argparse
import subprocess
import sys
import os
import glob
def run_command(cmd, stdin=None):
"""
Run a subprocess command and abort immediately on any failure.
:param cmd: list of command parts
:param stdin: file-like object to use as stdin
"""
subprocess.run(cmd, stdin=stdin, check=True)
def main():
parser = argparse.ArgumentParser(
description="Restore Postgres databases from backup SQL files via Docker container."
)
parser.add_argument(
"backup_dir",
help="Path to directory containing .backup.sql files"
)
parser.add_argument(
"container",
help="Name of the Postgres Docker container"
)
args = parser.parse_args()
backup_dir = args.backup_dir
container = args.container
pattern = os.path.join(backup_dir, "*.backup.sql")
sql_files = sorted(glob.glob(pattern))
if not sql_files:
print(f"No .backup.sql files found in {backup_dir}", file=sys.stderr)
sys.exit(1)
for sqlfile in sql_files:
# Extract database name by stripping the full suffix '.backup.sql'
filename = os.path.basename(sqlfile)
if not filename.endswith('.backup.sql'):
continue
dbname = filename[:-len('.backup.sql')]
print(f"=== Processing {sqlfile} → database: {dbname} ===")
# Drop the database, forcing disconnect of sessions if necessary
run_command([
"docker", "exec", "-i", container,
"psql", "-U", "postgres", "-c",
f"DROP DATABASE IF EXISTS \"{dbname}\" WITH (FORCE);"
])
# Create a fresh database
run_command([
"docker", "exec", "-i", container,
"psql", "-U", "postgres", "-c",
f"CREATE DATABASE \"{dbname}\";"
])
# Ensure the ownership role exists
print(f"Ensuring role '{dbname}' exists...")
run_command([
"docker", "exec", "-i", container,
"psql", "-U", "postgres", "-c",
(
"DO $$BEGIN "
f"IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = '{dbname}') THEN "
f"CREATE ROLE \"{dbname}\"; "
"END IF; "
"END$$;"
)
])
# Restore the dump into the database by streaming file (will abort on first error)
print(f"Restoring dump into {dbname} (this may take a while)…")
with open(sqlfile, 'rb') as infile:
run_command([
"docker", "exec", "-i", container,
"psql", "-U", "postgres", "-d", dbname
], stdin=infile)
print(f"{dbname} restored.")
print("All databases have been restored.")
if __name__ == "__main__":
main()

scripts/test-e2e.sh Executable file

@@ -0,0 +1,234 @@
#!/usr/bin/env bash
set -euo pipefail
# -----------------------------------------------------------------------------
# E2E runner using Docker-in-Docker (DinD) with debug-on-failure
#
# Debug toggles:
# E2E_KEEP_ON_FAIL=1 -> keep DinD + volumes + network if tests fail
# E2E_KEEP_VOLUMES=1 -> keep volumes even on success/cleanup
# E2E_DEBUG_SHELL=1 -> open an interactive shell in the test container instead of running tests
# E2E_ARTIFACTS_DIR=./artifacts
# -----------------------------------------------------------------------------
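# Illustrative invocations (values are examples only):
#   E2E_KEEP_ON_FAIL=1 E2E_ARTIFACTS_DIR=./artifacts make test-e2e
#   E2E_DEBUG_SHELL=1 bash scripts/test-e2e.sh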
NET="${E2E_NET:-baudolo-e2e-net}"
DIND="${E2E_DIND_NAME:-baudolo-e2e-dind}"
DIND_VOL="${E2E_DIND_VOL:-baudolo-e2e-dind-data}"
E2E_TMP_VOL="${E2E_TMP_VOL:-baudolo-e2e-tmp}"
DIND_HOST="${E2E_DIND_HOST:-tcp://127.0.0.1:2375}"
DIND_HOST_IN_NET="${E2E_DIND_HOST_IN_NET:-tcp://${DIND}:2375}"
IMG="${E2E_IMAGE:-baudolo:local}"
RSYNC_IMG="${E2E_RSYNC_IMAGE:-ghcr.io/kevinveenbirkenbach/alpine-rsync}"
READY_TIMEOUT_SECONDS="${E2E_READY_TIMEOUT_SECONDS:-120}"
ARTIFACTS_DIR="${E2E_ARTIFACTS_DIR:-./artifacts}"
KEEP_ON_FAIL="${E2E_KEEP_ON_FAIL:-0}"
KEEP_VOLUMES="${E2E_KEEP_VOLUMES:-0}"
DEBUG_SHELL="${E2E_DEBUG_SHELL:-0}"
FAILED=0
TS="$(date +%Y%m%d%H%M%S)"
mkdir -p "${ARTIFACTS_DIR}"
log() { echo ">> $*"; }
dump_debug() {
log "DEBUG: collecting diagnostics into ${ARTIFACTS_DIR}"
{
echo "=== Host docker version ==="
docker version || true
echo
echo "=== Host docker info ==="
docker info || true
echo
echo "=== DinD reachable? (docker -H ${DIND_HOST} version) ==="
docker -H "${DIND_HOST}" version || true
echo
} > "${ARTIFACTS_DIR}/debug-host-${TS}.txt" 2>&1 || true
# DinD logs
docker logs --tail=5000 "${DIND}" > "${ARTIFACTS_DIR}/dind-logs-${TS}.txt" 2>&1 || true
# DinD state
{
echo "=== docker -H ps -a ==="
docker -H "${DIND_HOST}" ps -a || true
echo
echo "=== docker -H images ==="
docker -H "${DIND_HOST}" images || true
echo
echo "=== docker -H network ls ==="
docker -H "${DIND_HOST}" network ls || true
echo
echo "=== docker -H volume ls ==="
docker -H "${DIND_HOST}" volume ls || true
echo
echo "=== docker -H system df ==="
docker -H "${DIND_HOST}" system df || true
} > "${ARTIFACTS_DIR}/debug-dind-${TS}.txt" 2>&1 || true
# Try to capture recent events (best effort; might be noisy)
docker -H "${DIND_HOST}" events --since 10m --until 0s \
> "${ARTIFACTS_DIR}/dind-events-${TS}.txt" 2>&1 || true
# Dump shared /tmp content from the tmp volume:
# We create a temporary container that mounts the volume, then tar its content.
# (Does not rely on host filesystem paths.)
log "DEBUG: archiving shared /tmp (volume ${E2E_TMP_VOL})"
docker -H "${DIND_HOST}" run --rm \
-v "${E2E_TMP_VOL}:/tmp" \
alpine:3.20 \
sh -lc 'cd /tmp && tar -czf /out.tar.gz . || true' \
>/dev/null 2>&1 || true
# The above writes inside the container FS, not to host. So do it properly:
# Use "docker cp" from a temp container.
local tmpc="baudolo-e2e-tmpdump-${TS}"
docker -H "${DIND_HOST}" rm -f "${tmpc}" >/dev/null 2>&1 || true
docker -H "${DIND_HOST}" create --name "${tmpc}" -v "${E2E_TMP_VOL}:/tmp" alpine:3.20 \
sh -lc 'cd /tmp && tar -czf /tmpdump.tar.gz . || true' >/dev/null
docker -H "${DIND_HOST}" start -a "${tmpc}" >/dev/null 2>&1 || true
docker -H "${DIND_HOST}" cp "${tmpc}:/tmpdump.tar.gz" "${ARTIFACTS_DIR}/e2e-tmp-${TS}.tar.gz" >/dev/null 2>&1 || true
docker -H "${DIND_HOST}" rm -f "${tmpc}" >/dev/null 2>&1 || true
log "DEBUG: artifacts written:"
ls -la "${ARTIFACTS_DIR}" | sed 's/^/ /' || true
}
cleanup() {
if [ "${FAILED}" -eq 1 ] && [ "${KEEP_ON_FAIL}" = "1" ]; then
log "KEEP_ON_FAIL=1 and failure detected -> skipping cleanup."
log "Next steps:"
echo " - Inspect DinD logs: docker logs ${DIND} | less"
echo " - Use DinD daemon: docker -H ${DIND_HOST} ps -a"
echo " - Shared tmp vol: docker -H ${DIND_HOST} run --rm -v ${E2E_TMP_VOL}:/tmp alpine:3.20 ls -la /tmp"
echo " - DinD docker root: docker -H ${DIND_HOST} run --rm -v ${DIND_VOL}:/var/lib/docker alpine:3.20 ls -la /var/lib/docker/volumes"
return 0
fi
log "Cleanup: stopping ${DIND} and removing network ${NET}"
docker rm -f "${DIND}" >/dev/null 2>&1 || true
docker network rm "${NET}" >/dev/null 2>&1 || true
if [ "${KEEP_VOLUMES}" != "1" ]; then
docker volume rm -f "${DIND_VOL}" >/dev/null 2>&1 || true
docker volume rm -f "${E2E_TMP_VOL}" >/dev/null 2>&1 || true
else
log "Keeping volumes (E2E_KEEP_VOLUMES=1): ${DIND_VOL}, ${E2E_TMP_VOL}"
fi
}
trap cleanup EXIT INT TERM
log "Creating network ${NET} (if missing)"
docker network inspect "${NET}" >/dev/null 2>&1 || docker network create "${NET}" >/dev/null
log "Removing old ${DIND} (if any)"
docker rm -f "${DIND}" >/dev/null 2>&1 || true
log "(Re)creating DinD data volume ${DIND_VOL}"
docker volume rm -f "${DIND_VOL}" >/dev/null 2>&1 || true
docker volume create "${DIND_VOL}" >/dev/null
log "(Re)creating shared /tmp volume ${E2E_TMP_VOL}"
docker volume rm -f "${E2E_TMP_VOL}" >/dev/null 2>&1 || true
docker volume create "${E2E_TMP_VOL}" >/dev/null
log "Starting Docker-in-Docker daemon ${DIND}"
docker run -d --privileged \
--name "${DIND}" \
--network "${NET}" \
-e DOCKER_TLS_CERTDIR="" \
-v "${DIND_VOL}:/var/lib/docker" \
-v "${E2E_TMP_VOL}:/tmp" \
-p 2375:2375 \
docker:dind \
--host=tcp://0.0.0.0:2375 \
--tls=false >/dev/null
log "Waiting for DinD to be ready..."
for i in $(seq 1 "${READY_TIMEOUT_SECONDS}"); do
if docker -H "${DIND_HOST}" version >/dev/null 2>&1; then
log "DinD is ready."
break
fi
sleep 1
if [ "${i}" -eq "${READY_TIMEOUT_SECONDS}" ]; then
echo "ERROR: DinD did not become ready in time"
docker logs --tail=200 "${DIND}" || true
FAILED=1
dump_debug || true
exit 1
fi
done
log "Pre-pulling helper images in DinD..."
log " - Pulling: ${RSYNC_IMG}"
docker -H "${DIND_HOST}" pull "${RSYNC_IMG}"
log "Ensuring alpine exists in DinD (for debug helpers)"
docker -H "${DIND_HOST}" pull alpine:3.20 >/dev/null
log "Loading ${IMG} image into DinD..."
docker save "${IMG}" | docker -H "${DIND_HOST}" load >/dev/null
log "Running E2E tests inside DinD"
set +e
if [ "${DEBUG_SHELL}" = "1" ]; then
log "E2E_DEBUG_SHELL=1 -> opening shell in test container"
docker run --rm -it \
--network "${NET}" \
-e DOCKER_HOST="${DIND_HOST_IN_NET}" \
-e E2E_RSYNC_IMAGE="${RSYNC_IMG}" \
-v "${DIND_VOL}:/var/lib/docker:ro" \
-v "${E2E_TMP_VOL}:/tmp" \
"${IMG}" \
sh -lc '
set -e
if [ ! -f /etc/machine-id ]; then
mkdir -p /etc
cat /proc/sys/kernel/random/uuid > /etc/machine-id
fi
echo ">> DOCKER_HOST=${DOCKER_HOST}"
docker ps -a || true
exec sh
'
rc=$?
else
docker run --rm \
--network "${NET}" \
-e DOCKER_HOST="${DIND_HOST_IN_NET}" \
-e E2E_RSYNC_IMAGE="${RSYNC_IMG}" \
-v "${DIND_VOL}:/var/lib/docker:ro" \
-v "${E2E_TMP_VOL}:/tmp" \
"${IMG}" \
sh -lc '
set -euo pipefail
set -x
export PYTHONUNBUFFERED=1
export TMPDIR=/tmp TMP=/tmp TEMP=/tmp
if [ ! -f /etc/machine-id ]; then
mkdir -p /etc
cat /proc/sys/kernel/random/uuid > /etc/machine-id
fi
python -m unittest discover -t . -s tests/e2e -p "test_*.py" -v -f
'
rc=$?
fi
set -e
if [ "${rc}" -ne 0 ]; then
FAILED=1
echo "ERROR: E2E tests failed (exit code: ${rc})"
dump_debug || true
exit "${rc}"
fi
log "E2E tests passed."


@@ -0,0 +1 @@
"""Baudolo backup package."""


@@ -0,0 +1,9 @@
#!/usr/bin/env python3
from __future__ import annotations
from .app import main
if __name__ == "__main__":
raise SystemExit(main())

src/baudolo/backup/app.py Normal file

@@ -0,0 +1,183 @@
from __future__ import annotations
import os
import pathlib
from datetime import datetime
import pandas
from dirval import create_stamp_file
from .cli import parse_args
from .compose import handle_docker_compose_services
from .db import backup_database
from .docker import (
change_containers_status,
containers_using_volume,
docker_volume_names,
get_image_info,
has_image,
)
from .shell import execute_shell_command
from .volume import backup_volume
def get_machine_id() -> str:
return execute_shell_command("sha256sum /etc/machine-id")[0][0:64]
def stamp_directory(version_dir: str) -> None:
"""
Use dirval as a Python library to stamp the directory (no CLI dependency).
"""
create_stamp_file(version_dir)
def create_version_directory(versions_dir: str, backup_time: str) -> str:
version_dir = os.path.join(versions_dir, backup_time)
pathlib.Path(version_dir).mkdir(parents=True, exist_ok=True)
return version_dir
def create_volume_directory(version_dir: str, volume_name: str) -> str:
path = os.path.join(version_dir, volume_name)
pathlib.Path(path).mkdir(parents=True, exist_ok=True)
return path
def is_image_ignored(container: str, images_no_backup_required: list[str]) -> bool:
if not images_no_backup_required:
return False
img = get_image_info(container)
return any(pat in img for pat in images_no_backup_required)
def volume_is_fully_ignored(containers: list[str], images_no_backup_required: list[str]) -> bool:
"""
Skip file backup only if all containers linked to the volume are ignored.
"""
if not containers:
return False
return all(is_image_ignored(c, images_no_backup_required) for c in containers)
def requires_stop(containers: list[str], images_no_stop_required: list[str]) -> bool:
"""
Stop is required if ANY container image is NOT in the whitelist patterns.
"""
for c in containers:
img = get_image_info(c)
if not any(pat in img for pat in images_no_stop_required):
return True
return False
def backup_mariadb_or_postgres(
*,
container: str,
volume_dir: str,
databases_df: "pandas.DataFrame",
database_containers: list[str],
) -> bool:
"""
Returns True if the container is a DB container we handled.
"""
for img in ["mariadb", "postgres"]:
if has_image(container, img):
backup_database(
container=container,
volume_dir=volume_dir,
db_type=img,
databases_df=databases_df,
database_containers=database_containers,
)
return True
return False
def _backup_dumps_for_volume(
*,
containers: list[str],
vol_dir: str,
databases_df: "pandas.DataFrame",
database_containers: list[str],
) -> bool:
"""
Create DB dumps for any mariadb/postgres containers attached to this volume.
Returns True if at least one dump was produced.
"""
dumped_any = False
for c in containers:
if backup_mariadb_or_postgres(
container=c,
volume_dir=vol_dir,
databases_df=databases_df,
database_containers=database_containers,
):
dumped_any = True
return dumped_any
def main() -> int:
args = parse_args()
machine_id = get_machine_id()
backup_time = datetime.now().strftime("%Y%m%d%H%M%S")
versions_dir = os.path.join(args.backups_dir, machine_id, args.repo_name)
version_dir = create_version_directory(versions_dir, backup_time)
databases_df = pandas.read_csv(args.databases_csv, sep=";")
print("💾 Start volume backups...", flush=True)
for volume_name in docker_volume_names():
print(f"Start backup routine for volume: {volume_name}", flush=True)
containers = containers_using_volume(volume_name)
vol_dir = create_volume_directory(version_dir, volume_name)
# Old behavior: DB dumps are additional to file backups.
_backup_dumps_for_volume(
containers=containers,
vol_dir=vol_dir,
databases_df=databases_df,
database_containers=args.database_containers,
)
# dump-only: skip ALL file rsync backups
if args.dump_only:
continue
# skip file backup if all linked containers are ignored
if volume_is_fully_ignored(containers, args.images_no_backup_required):
print(
f"Skipping file backup for volume '{volume_name}' (all linked containers are ignored).",
flush=True,
)
continue
if args.everything:
# "everything": always do pre-rsync, then stop + rsync again
backup_volume(versions_dir, volume_name, vol_dir)
change_containers_status(containers, "stop")
backup_volume(versions_dir, volume_name, vol_dir)
if not args.shutdown:
change_containers_status(containers, "start")
continue
# default: rsync, and if needed stop + rsync
backup_volume(versions_dir, volume_name, vol_dir)
if requires_stop(containers, args.images_no_stop_required):
change_containers_status(containers, "stop")
backup_volume(versions_dir, volume_name, vol_dir)
if not args.shutdown:
change_containers_status(containers, "start")
# Stamp the backup version directory using dirval (python lib)
stamp_directory(version_dir)
print("Finished volume backups.", flush=True)
print("Handling Docker Compose services...", flush=True)
handle_docker_compose_services(args.compose_dir, args.docker_compose_hard_restart_required)
return 0

src/baudolo/backup/cli.py Normal file

@@ -0,0 +1,93 @@
from __future__ import annotations
import argparse
import os
from pathlib import Path
def _default_repo_name() -> str:
"""
Derive the repository name from the folder that contains `src/`.
Expected layout:
<repo-root>/src/baudolo/backup/cli.py
=> parents[0]=backup, [1]=baudolo, [2]=src, [3]=repo-root
"""
try:
return Path(__file__).resolve().parents[3].name
except Exception:
return "backup-docker-to-local"
def parse_args() -> argparse.Namespace:
dirname = os.path.dirname(__file__)
default_databases_csv = os.path.join(dirname, "databases.csv")
p = argparse.ArgumentParser(description="Backup Docker volumes.")
p.add_argument(
"--compose-dir",
type=str,
required=True,
help="Path to the parent directory containing docker-compose setups",
)
p.add_argument(
"--docker-compose-hard-restart-required",
nargs="+",
default=["mailu"],
help="Compose dir names that require 'docker-compose down && up -d' (default: mailu)",
)
p.add_argument(
"--repo-name",
default=_default_repo_name(),
help="Backup repo folder name under <backups-dir>/<machine-id>/ (default: git repo folder name)",
)
p.add_argument(
"--databases-csv",
default=default_databases_csv,
help=f"Path to databases.csv (default: {default_databases_csv})",
)
p.add_argument(
"--backups-dir",
default="/Backups",
help="Backup root directory (default: /Backups)",
)
p.add_argument(
"--database-containers",
nargs="+",
required=True,
help="Container names treated as special instances for database backups",
)
p.add_argument(
"--images-no-stop-required",
nargs="+",
required=True,
help="Image name patterns for which containers should not be stopped during file backup",
)
p.add_argument(
"--images-no-backup-required",
nargs="+",
default=[],
help="Image name patterns for which no backup should be performed",
)
p.add_argument(
"--everything",
action="store_true",
help="Force file backup for all volumes and also execute database dumps (like old script)",
)
p.add_argument(
"--shutdown",
action="store_true",
help="Do not restart containers after backup",
)
p.add_argument(
"--dump-only",
action="store_true",
help="Only create DB dumps (skip ALL file rsync backups)",
)
return p.parse_args()
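
For orientation, a minimal invocation that satisfies the required flags might look like this (paths and container names are illustrative, not taken from the repo):

baudolo --compose-dir /opt/compose \
  --database-containers central-mariadb central-postgres \
  --images-no-stop-required mastodon wordpress

Everything else (backups dir, repo name, databases.csv) falls back to the defaults defined above.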

View File

@@ -0,0 +1,31 @@
from __future__ import annotations
import os
import subprocess
def hard_restart_docker_services(dir_path: str) -> None:
print(f"Hard restart docker-compose services in: {dir_path}", flush=True)
subprocess.run(["docker-compose", "down"], cwd=dir_path, check=True)
subprocess.run(["docker-compose", "up", "-d"], cwd=dir_path, check=True)
def handle_docker_compose_services(parent_directory: str, hard_restart_required: list[str]) -> None:
for entry in os.scandir(parent_directory):
if not entry.is_dir():
continue
dir_path = entry.path
name = os.path.basename(dir_path)
compose_file = os.path.join(dir_path, "docker-compose.yml")
print(f"Checking directory: {dir_path}", flush=True)
if not os.path.isfile(compose_file):
print("No docker-compose.yml found. Skipping.", flush=True)
continue
if name in hard_restart_required:
print(f"{name}: hard restart required.", flush=True)
hard_restart_docker_services(dir_path)
else:
print(f"{name}: no restart required.", flush=True)

73
src/baudolo/backup/db.py Normal file
View File

@@ -0,0 +1,73 @@
from __future__ import annotations
import os
import pathlib
import re
import pandas
from .shell import BackupException, execute_shell_command
def get_instance(container: str, database_containers: list[str]) -> str:
if container in database_containers:
return container
return re.split(r"(_|-)(database|db|postgres)", container)[0]
def fallback_pg_dumpall(container: str, username: str, password: str, out_file: str) -> None:
# Pass the password into the container via `docker exec -e`; a host-side
# PGPASSWORD prefix would not be visible to pg_dumpall inside the container.
cmd = (
f"docker exec -i -e PGPASSWORD={password} {container} "
f"pg_dumpall -U {username} -h localhost > {out_file}"
)
execute_shell_command(cmd)
def backup_database(
*,
container: str,
volume_dir: str,
db_type: str,
databases_df: "pandas.DataFrame",
database_containers: list[str],
) -> None:
instance_name = get_instance(container, database_containers)
entries = databases_df.loc[databases_df["instance"] == instance_name]
if entries.empty:
raise BackupException(f"No entry found for instance '{instance_name}'")
out_dir = os.path.join(volume_dir, "sql")
pathlib.Path(out_dir).mkdir(parents=True, exist_ok=True)
for _, row in entries.iterrows():
db_name = row["database"]
user = row["username"]
password = row["password"]
dump_file = os.path.join(out_dir, f"{db_name}.backup.sql")
if db_type == "mariadb":
cmd = (
f"docker exec {container} /usr/bin/mariadb-dump "
f"-u {user} -p{password} {db_name} > {dump_file}"
)
execute_shell_command(cmd)
continue
if db_type == "postgres":
cluster_file = os.path.join(out_dir, f"{instance_name}.cluster.backup.sql")
if not db_name:
fallback_pg_dumpall(container, user, password, cluster_file)
return
try:
cmd = (
f"docker exec -i -e PGPASSWORD={password} {container} "
f"pg_dump -U {user} -d {db_name} -h localhost > {dump_file}"
)
execute_shell_command(cmd)
except BackupException as e:
print(f"pg_dump failed: {e}", flush=True)
print(f"Falling back to pg_dumpall for instance '{instance_name}'", flush=True)
fallback_pg_dumpall(container, user, password, cluster_file)
continue
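
For context, backup_database selects rows from databases.csv by the instance column; with the ';' separator used by the backup CLI, a row might look like this (values illustrative):

instance;database;username;password
central-postgres;appdb;appuser;secret

An empty database field for a Postgres instance routes the dump through the pg_dumpall fallback above.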

View File

@@ -0,0 +1,43 @@
from __future__ import annotations
from .shell import execute_shell_command
def get_image_info(container: str) -> str:
return execute_shell_command(
f"docker inspect --format '{{{{.Config.Image}}}}' {container}"
)[0]
def has_image(container: str, pattern: str) -> bool:
"""Return True if container's image contains the pattern."""
return pattern in get_image_info(container)
def docker_volume_names() -> list[str]:
return execute_shell_command("docker volume ls --format '{{.Name}}'")
def containers_using_volume(volume_name: str) -> list[str]:
return execute_shell_command(
f"docker ps --filter volume=\"{volume_name}\" --format '{{{{.Names}}}}'"
)
def change_containers_status(containers: list[str], status: str) -> None:
"""Stop or start a list of containers."""
if not containers:
print(f"No containers to {status}.", flush=True)
return
names = " ".join(containers)
print(f"{status.capitalize()} containers: {names}...", flush=True)
execute_shell_command(f"docker {status} {names}")
def docker_volume_exists(volume: str) -> bool:
# Avoid throwing exceptions for exists checks.
try:
execute_shell_command(f"docker volume inspect {volume} >/dev/null 2>&1 && echo OK")
return True
except Exception:
return False

View File

@@ -0,0 +1,26 @@
from __future__ import annotations
import subprocess
class BackupException(Exception):
"""Generic exception for backup errors."""
def execute_shell_command(command: str) -> list[str]:
"""Execute a shell command and return its output lines."""
print(command, flush=True)
process = subprocess.Popen(
command,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
shell=True,
)
out, err = process.communicate()
if process.returncode != 0:
raise BackupException(
f"Error in command: {command}\n"
f"Output: {out.decode(errors='replace')}\n"
f"Error: {err.decode(errors='replace')}\n"
f"Exit code: {process.returncode}"
)
return [line.decode("utf-8") for line in out.splitlines()]

View File

@@ -0,0 +1,42 @@
from __future__ import annotations
import os
import pathlib
from .shell import BackupException, execute_shell_command
def get_storage_path(volume_name: str) -> str:
path = execute_shell_command(
f"docker volume inspect --format '{{{{ .Mountpoint }}}}' {volume_name}"
)[0]
return f"{path}/"
def get_last_backup_dir(versions_dir: str, volume_name: str, current_backup_dir: str) -> str | None:
versions = sorted(os.listdir(versions_dir), reverse=True)
for version in versions:
candidate = os.path.join(versions_dir, version, volume_name, "files", "")
if candidate != current_backup_dir and os.path.isdir(candidate):
return candidate
return None
def backup_volume(versions_dir: str, volume_name: str, volume_dir: str) -> None:
"""Perform incremental file backup of a Docker volume."""
dest = os.path.join(volume_dir, "files") + "/"
pathlib.Path(dest).mkdir(parents=True, exist_ok=True)
last = get_last_backup_dir(versions_dir, volume_name, dest)
link_dest = f"--link-dest='{last}'" if last else ""
source = get_storage_path(volume_name)
cmd = f"rsync -abP --delete --delete-excluded {link_dest} {source} {dest}"
try:
execute_shell_command(cmd)
except BackupException as e:
if "file has vanished" in str(e):
print("Warning: Some files vanished before transfer. Continuing.", flush=True)
else:
raise
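
When a previous version exists, the assembled command takes roughly this shape (hash, timestamps and mountpoint are illustrative; Docker volumes typically live under /var/lib/docker/volumes/<name>/_data):

rsync -abP --delete --delete-excluded --link-dest='/Backups/<hash>/<repo>/20251226120000/myvol/files/' /var/lib/docker/volumes/myvol/_data/ /Backups/<hash>/<repo>/20251227093000/myvol/files/

Unchanged files are hard-linked against the previous version via --link-dest, so each backup directory is complete while only changed data consumes additional space.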

View File

@@ -0,0 +1 @@
__all__ = ["main"]

View File

@@ -0,0 +1,144 @@
from __future__ import annotations
import argparse
import sys
from .paths import BackupPaths
from .files import restore_volume_files
from .db.postgres import restore_postgres_sql
from .db.mariadb import restore_mariadb_sql
def _add_common_backup_args(p: argparse.ArgumentParser) -> None:
p.add_argument("volume_name", help="Docker volume name (target volume)")
p.add_argument("backup_hash", help="Hashed machine id")
p.add_argument("version", help="Backup version directory name")
p.add_argument(
"--backups-dir",
default="/Backups",
help="Backup root directory (default: /Backups)",
)
p.add_argument(
"--repo-name",
default="backup-docker-to-local",
help="Backup repo folder name under <backups-dir>/<hash>/ (default: backup-docker-to-local)",
)
def main(argv: list[str] | None = None) -> int:
parser = argparse.ArgumentParser(
prog="baudolo-restore",
description="Restore docker volume files and DB dumps.",
)
sub = parser.add_subparsers(dest="cmd", required=True)
# ------------------------------------------------------------------
# files
# ------------------------------------------------------------------
p_files = sub.add_parser("files", help="Restore files into a docker volume")
_add_common_backup_args(p_files)
p_files.add_argument(
"--rsync-image",
default="ghcr.io/kevinveenbirkenbach/alpine-rsync",
)
p_files.add_argument(
"--source-volume",
default=None,
help=(
"Volume name used as backup source path key. "
"Defaults to <volume_name> (target volume). "
"Use this when restoring from one volume backup into a different target volume."
),
)
# ------------------------------------------------------------------
# postgres
# ------------------------------------------------------------------
p_pg = sub.add_parser("postgres", help="Restore a single PostgreSQL database dump")
_add_common_backup_args(p_pg)
p_pg.add_argument("--container", required=True)
p_pg.add_argument("--db-name", required=True)
p_pg.add_argument("--db-user", default=None, help="Defaults to db-name if omitted")
p_pg.add_argument("--db-password", required=True)
p_pg.add_argument("--empty", action="store_true")
# ------------------------------------------------------------------
# mariadb
# ------------------------------------------------------------------
p_mdb = sub.add_parser("mariadb", help="Restore a single MariaDB/MySQL-compatible dump")
_add_common_backup_args(p_mdb)
p_mdb.add_argument("--container", required=True)
p_mdb.add_argument("--db-name", required=True)
p_mdb.add_argument("--db-user", default=None, help="Defaults to db-name if omitted")
p_mdb.add_argument("--db-password", required=True)
p_mdb.add_argument("--empty", action="store_true")
args = parser.parse_args(argv)
try:
if args.cmd == "files":
# target volume = args.volume_name
# source volume (backup key) defaults to target volume
source_volume = args.source_volume or args.volume_name
bp_files = BackupPaths(
source_volume,
args.backup_hash,
args.version,
repo_name=args.repo_name,
backups_dir=args.backups_dir,
)
return restore_volume_files(
args.volume_name,
bp_files.files_dir(),
rsync_image=args.rsync_image,
)
if args.cmd == "postgres":
user = args.db_user or args.db_name
restore_postgres_sql(
container=args.container,
db_name=args.db_name,
user=user,
password=args.db_password,
sql_path=BackupPaths(
args.volume_name,
args.backup_hash,
args.version,
repo_name=args.repo_name,
backups_dir=args.backups_dir,
).sql_file(args.db_name),
empty=args.empty,
)
return 0
if args.cmd == "mariadb":
user = args.db_user or args.db_name
restore_mariadb_sql(
container=args.container,
db_name=args.db_name,
user=user,
password=args.db_password,
sql_path=BackupPaths(
args.volume_name,
args.backup_hash,
args.version,
repo_name=args.repo_name,
backups_dir=args.backups_dir,
).sql_file(args.db_name),
empty=args.empty,
)
return 0
parser.error("Unhandled command")
return 2
except Exception as e:
print(f"ERROR: {e}", file=sys.stderr)
return 1
if __name__ == "__main__":
raise SystemExit(main())
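
As a usage sketch (hash and version values are placeholders), restoring a backup of volume old-vol into a different target volume new-vol looks like:

baudolo-restore files new-vol <machine-hash> 20251227093000 --repo-name backup-docker-to-local --source-volume old-vol

Without --source-volume, the backup path is keyed by the target volume name, which matches the pre-existing single-volume behavior.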

View File

@@ -0,0 +1 @@
"""Database restore handlers (Postgres, MariaDB/MySQL)."""

View File

@@ -0,0 +1,89 @@
from __future__ import annotations
import os
import sys
from ..run import docker_exec, docker_exec_sh
def _pick_client(container: str) -> str:
"""
Prefer 'mariadb', fallback to 'mysql'.
Some MariaDB images no longer ship a 'mysql' binary, so we must not assume it exists.
"""
script = r"""
set -eu
if command -v mariadb >/dev/null 2>&1; then echo mariadb; exit 0; fi
if command -v mysql >/dev/null 2>&1; then echo mysql; exit 0; fi
exit 42
"""
try:
out = docker_exec_sh(container, script, capture=True).stdout.decode().strip()
if not out:
raise RuntimeError("empty client detection output")
return out
except Exception as e:
print("ERROR: neither 'mariadb' nor 'mysql' found in container.", file=sys.stderr)
raise e
def restore_mariadb_sql(
*,
container: str,
db_name: str,
user: str,
password: str,
sql_path: str,
empty: bool,
) -> None:
client = _pick_client(container)
if not os.path.isfile(sql_path):
raise FileNotFoundError(sql_path)
if empty:
# IMPORTANT:
# Do NOT hardcode 'mysql' here. Use the detected client.
# MariaDB 11 images may not contain the mysql binary at all.
docker_exec(
container,
[client, "-u", user, f"--password={password}", "-e", "SET FOREIGN_KEY_CHECKS=0;"],
)
result = docker_exec(
container,
[
client,
"-u",
user,
f"--password={password}",
"-N",
"-e",
f"SELECT table_name FROM information_schema.tables WHERE table_schema = '{db_name}';",
],
capture=True,
)
tables = result.stdout.decode().split()
for tbl in tables:
docker_exec(
container,
[
client,
"-u",
user,
f"--password={password}",
"-e",
f"DROP TABLE IF EXISTS `{db_name}`.`{tbl}`;",
],
)
docker_exec(
container,
[client, "-u", user, f"--password={password}", "-e", "SET FOREIGN_KEY_CHECKS=1;"],
)
with open(sql_path, "rb") as f:
docker_exec(container, [client, "-u", user, f"--password={password}", db_name], stdin=f)
print(f"MariaDB/MySQL restore complete for db '{db_name}'.")

View File

@@ -0,0 +1,53 @@
from __future__ import annotations
import os
from ..run import docker_exec
def restore_postgres_sql(
*,
container: str,
db_name: str,
user: str,
password: str,
sql_path: str,
empty: bool,
) -> None:
if not os.path.isfile(sql_path):
raise FileNotFoundError(sql_path)
# Make password available INSIDE the container for psql.
docker_env = {"PGPASSWORD": password}
if empty:
drop_sql = r"""
DO $$ DECLARE r RECORD;
BEGIN
FOR r IN (
SELECT table_name AS name, 'TABLE' AS type FROM information_schema.tables WHERE table_schema='public'
UNION ALL
SELECT routine_name AS name, 'FUNCTION' AS type FROM information_schema.routines WHERE specific_schema='public'
UNION ALL
SELECT sequence_name AS name, 'SEQUENCE' AS type FROM information_schema.sequences WHERE sequence_schema='public'
) LOOP
EXECUTE format('DROP %s public.%I CASCADE', r.type, r.name);
END LOOP;
END $$;
"""
docker_exec(
container,
["psql", "-v", "ON_ERROR_STOP=1", "-U", user, "-d", db_name],
stdin=drop_sql.encode(),
docker_env=docker_env,
)
with open(sql_path, "rb") as f:
docker_exec(
container,
["psql", "-v", "ON_ERROR_STOP=1", "-U", user, "-d", db_name],
stdin=f,
docker_env=docker_env,
)
print(f"PostgreSQL restore complete for db '{db_name}'.")

View File

@@ -0,0 +1,37 @@
from __future__ import annotations
import os
import sys
from .run import run, docker_volume_exists
def restore_volume_files(volume_name: str, backup_files_dir: str, *, rsync_image: str) -> int:
if not os.path.isdir(backup_files_dir):
print(f"ERROR: backup files dir not found: {backup_files_dir}", file=sys.stderr)
return 2
if not docker_volume_exists(volume_name):
print(f"Volume {volume_name} does not exist. Creating...")
run(["docker", "volume", "create", volume_name])
else:
print(f"Volume {volume_name} already exists.")
# Keep behavior close to the old script: rsync -avv --delete
run(
[
"docker",
"run",
"--rm",
"-v",
f"{volume_name}:/recover/",
"-v",
f"{backup_files_dir}:/backup/",
rsync_image,
"sh",
"-lc",
"rsync -avv --delete /backup/ /recover/",
]
)
print("File restore complete.")
return 0

View File

@@ -0,0 +1,29 @@
from __future__ import annotations
import os
from dataclasses import dataclass
@dataclass(frozen=True)
class BackupPaths:
volume_name: str
backup_hash: str
version: str
repo_name: str
backups_dir: str = "/Backups"
def root(self) -> str:
# Always build an absolute path under backups_dir
return os.path.join(
self.backups_dir,
self.backup_hash,
self.repo_name,
self.version,
self.volume_name,
)
def files_dir(self) -> str:
return os.path.join(self.root(), "files")
def sql_file(self, db_name: str) -> str:
return os.path.join(self.root(), "sql", f"{db_name}.backup.sql")

View File

@@ -0,0 +1,89 @@
from __future__ import annotations
import subprocess
import sys
from typing import Optional
def run(
cmd: list[str],
*,
stdin=None,
capture: bool = False,
env: Optional[dict] = None,
) -> subprocess.CompletedProcess:
try:
kwargs: dict = {
"check": True,
"capture_output": capture,
"env": env,
}
# If stdin is raw data (bytes/str), pass it via input=.
# IMPORTANT: when using input=..., do NOT pass stdin=... as well.
if isinstance(stdin, (bytes, str)):
kwargs["input"] = stdin
else:
kwargs["stdin"] = stdin
return subprocess.run(cmd, **kwargs)
except subprocess.CalledProcessError as e:
msg = f"ERROR: command failed ({e.returncode}): {' '.join(cmd)}"
print(msg, file=sys.stderr)
if e.stdout:
try:
print(e.stdout.decode(), file=sys.stderr)
except Exception:
print(e.stdout, file=sys.stderr)
if e.stderr:
try:
print(e.stderr.decode(), file=sys.stderr)
except Exception:
print(e.stderr, file=sys.stderr)
raise
def docker_exec(
container: str,
argv: list[str],
*,
stdin=None,
capture: bool = False,
env: Optional[dict] = None,
docker_env: Optional[dict[str, str]] = None,
) -> subprocess.CompletedProcess:
cmd: list[str] = ["docker", "exec", "-i"]
if docker_env:
for k, v in docker_env.items():
cmd.extend(["-e", f"{k}={v}"])
cmd.extend([container, *argv])
return run(cmd, stdin=stdin, capture=capture, env=env)
def docker_exec_sh(
container: str,
script: str,
*,
stdin=None,
capture: bool = False,
env: Optional[dict] = None,
docker_env: Optional[dict[str, str]] = None,
) -> subprocess.CompletedProcess:
return docker_exec(
container,
["sh", "-lc", script],
stdin=stdin,
capture=capture,
env=env,
docker_env=docker_env,
)
def docker_volume_exists(volume: str) -> bool:
p = subprocess.run(
["docker", "volume", "inspect", volume],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
return p.returncode == 0
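
As a usage sketch (container name, database and SQL are placeholders): raw bytes are routed through input=, while a real file object is forwarded as stdin, which is what the restore handlers above rely on:

# bytes payload -> passed via input=, never together with stdin=
docker_exec("db", ["psql", "-U", "postgres", "-d", "appdb"], stdin=b"SELECT 1;")

# open file handle -> forwarded unchanged as stdin=
with open("appdb.backup.sql", "rb") as f:
    docker_exec("db", ["psql", "-U", "postgres", "-d", "appdb"], stdin=f)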

0
tests/e2e/__init__.py Normal file
View File

222
tests/e2e/helpers.py Normal file
View File

@@ -0,0 +1,222 @@
# tests/e2e/helpers.py
from __future__ import annotations
import shutil
import subprocess
import time
import uuid
from pathlib import Path
def run(
cmd: list[str],
*,
capture: bool = True,
check: bool = True,
cwd: str | None = None,
) -> subprocess.CompletedProcess:
try:
return subprocess.run(
cmd,
check=check,
cwd=cwd,
text=True,
capture_output=capture,
)
except subprocess.CalledProcessError as e:
# Print captured output so that failing E2E tests are easy to debug from CI logs
print(">>> command failed:", " ".join(cmd))
print(">>> exit code:", e.returncode)
if e.stdout:
print(">>> STDOUT:\n" + e.stdout)
if e.stderr:
print(">>> STDERR:\n" + e.stderr)
raise
def sh(cmd: str, *, capture: bool = True, check: bool = True) -> subprocess.CompletedProcess:
return run(["sh", "-lc", cmd], capture=capture, check=check)
def unique(prefix: str) -> str:
return f"{prefix}-{uuid.uuid4().hex[:10]}"
def require_docker() -> None:
run(["docker", "version"], capture=True, check=True)
def machine_hash() -> str:
out = sh("sha256sum /etc/machine-id | awk '{print $1}'").stdout.strip()
if len(out) < 16:
raise RuntimeError("Could not determine machine hash from /etc/machine-id")
return out
def wait_for_log(container: str, pattern: str, timeout_s: int = 60) -> None:
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(["docker", "logs", container], capture=True, check=False)
if pattern in (p.stdout or ""):
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for log pattern '{pattern}' in {container}")
def wait_for_postgres(container: str, *, user: str = "postgres", timeout_s: int = 90) -> None:
"""
Docker-outside-of-Docker friendly readiness: check from inside the DB container.
"""
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(
["docker", "exec", container, "sh", "-lc", f"pg_isready -U {user} -h localhost"],
capture=True,
check=False,
)
if p.returncode == 0:
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for Postgres readiness in container {container}")
def wait_for_mariadb(container: str, *, root_password: str, timeout_s: int = 90) -> None:
"""
Liveness probe for MariaDB.
IMPORTANT (MariaDB 11):
Root TCP auth is often restricted (unix_socket auth), so a TCP ping like
`mariadb-admin -uroot -p... -h localhost ping` can fail even though the server is up.
We therefore check readiness via a socket-based query.
"""
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(
["docker", "exec", container, "sh", "-lc", "mariadb -uroot --protocol=socket -e \"SELECT 1;\""],
capture=True,
check=False,
)
if p.returncode == 0:
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for MariaDB readiness in container {container}")
def wait_for_mariadb_sql(container: str, *, user: str, password: str, timeout_s: int = 90) -> None:
"""
SQL login readiness for the *dedicated test user* over TCP.
This is separate from wait_for_mariadb(root) because root may be socket-only,
while the tests use a normal user that should work via TCP.
"""
deadline = time.time() + timeout_s
while time.time() < deadline:
p = run(
[
"docker",
"exec",
container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{user} -p{password} -e \"SELECT 1;\"",
],
capture=True,
check=False,
)
if p.returncode == 0:
return
time.sleep(1)
raise TimeoutError(f"Timed out waiting for MariaDB SQL login readiness in container {container}")
def backup_run(
*,
backups_dir: str,
repo_name: str,
compose_dir: str,
databases_csv: str,
database_containers: list[str],
images_no_stop_required: list[str],
images_no_backup_required: list[str] | None = None,
dump_only: bool = False,
) -> None:
cmd = [
"baudolo",
"--compose-dir", compose_dir,
"--docker-compose-hard-restart-required", "mailu",
"--repo-name", repo_name,
"--databases-csv", databases_csv,
"--backups-dir", backups_dir,
"--database-containers", *database_containers,
"--images-no-stop-required", *images_no_stop_required,
]
if images_no_backup_required:
cmd += ["--images-no-backup-required", *images_no_backup_required]
if dump_only:
cmd += ["--dump-only"]
try:
run(cmd, capture=True, check=True)
except subprocess.CalledProcessError as e:
print(">>> baudolo failed (exit code:", e.returncode, ")")
if e.stdout:
print(">>> baudolo STDOUT:\n" + e.stdout)
if e.stderr:
print(">>> baudolo STDERR:\n" + e.stderr)
raise
def latest_version_dir(backups_dir: str, repo_name: str) -> tuple[str, str]:
"""
Returns (hash, version) for the latest backup.
"""
h = machine_hash()
root = Path(backups_dir) / h / repo_name
if not root.is_dir():
raise FileNotFoundError(str(root))
versions = sorted([p.name for p in root.iterdir() if p.is_dir()])
if not versions:
raise RuntimeError(f"No versions found under {root}")
return h, versions[-1]
def backup_path(backups_dir: str, repo_name: str, version: str, volume: str) -> Path:
h = machine_hash()
return Path(backups_dir) / h / repo_name / version / volume
def create_minimal_compose_dir(base: str) -> str:
"""
baudolo requires --compose-dir. Create a minimal compose root containing one subdirectory without a docker-compose.yml.
"""
p = Path(base) / "compose-root"
p.mkdir(parents=True, exist_ok=True)
(p / "noop").mkdir(parents=True, exist_ok=True)
return str(p)
def write_databases_csv(path: str, rows: list[tuple[str, str, str, str]]) -> None:
"""
rows: (instance, database, username, password)
database may be '' (empty) to trigger the pg_dumpall fallback; the tests in this suite always pass a concrete database name.
"""
Path(path).parent.mkdir(parents=True, exist_ok=True)
with open(path, "w", encoding="utf-8") as f:
f.write("instance;database;username;password\n")
for inst, db, user, pw in rows:
f.write(f"{inst};{db};{user};{pw}\n")
def cleanup_docker(*, containers: list[str], volumes: list[str]) -> None:
for c in containers:
run(["docker", "rm", "-f", c], capture=True, check=False)
for v in volumes:
run(["docker", "volume", "rm", "-f", v], capture=True, check=False)
def ensure_empty_dir(path: str) -> None:
p = Path(path)
if p.exists():
shutil.rmtree(p)
p.mkdir(parents=True, exist_ok=True)

View File

@@ -0,0 +1,94 @@
import unittest
from pathlib import Path
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
)
class TestE2EFilesFull(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-files-full")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.volume_src = f"{cls.prefix}-vol-src"
cls.volume_dst = f"{cls.prefix}-vol-dst"
cls.containers = []
cls.volumes = [cls.volume_src, cls.volume_dst]
# create source volume with a file
run(["docker", "volume", "create", cls.volume_src])
run([
"docker", "run", "--rm",
"-v", f"{cls.volume_src}:/data",
"alpine:3.20",
"sh", "-lc", "mkdir -p /data && echo 'hello' > /data/hello.txt",
])
# databases.csv (unused, but required by CLI)
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [])
# Run backup (files should be copied)
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=["dummy-db"],
images_no_stop_required=["alpine", "postgres", "mariadb", "mysql"],
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_exists(self) -> None:
p = (
backup_path(
self.backups_dir,
self.repo_name,
self.version,
self.volume_src,
)
/ "files"
/ "hello.txt"
)
self.assertTrue(p.is_file(), f"Expected backed up file at: {p}")
def test_restore_files_into_new_volume(self) -> None:
# restore files from volume_src backup into volume_dst
run([
"baudolo-restore", "files",
self.volume_dst, self.hash, self.version,
"--backups-dir", self.backups_dir,
"--repo-name", self.repo_name,
"--source-volume", self.volume_src,
"--rsync-image", "ghcr.io/kevinveenbirkenbach/alpine-rsync",
])
# verify restored file exists in dst volume
p = run([
"docker", "run", "--rm",
"-v", f"{self.volume_dst}:/data",
"alpine:3.20",
"sh", "-lc", "cat /data/hello.txt",
])
self.assertEqual((p.stdout or "").strip(), "hello")

View File

@@ -0,0 +1,72 @@
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
)
class TestE2EFilesNoCopy(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-files-nocopy")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.volume_src = f"{cls.prefix}-vol-src"
cls.volume_dst = f"{cls.prefix}-vol-dst"
cls.containers = []
cls.volumes = [cls.volume_src, cls.volume_dst]
run(["docker", "volume", "create", cls.volume_src])
run([
"docker", "run", "--rm",
"-v", f"{cls.volume_src}:/data",
"alpine:3.20",
"sh", "-lc", "echo 'hello' > /data/hello.txt",
])
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [])
# dump-only => NO file rsync backups
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=["dummy-db"],
images_no_stop_required=["alpine", "postgres", "mariadb", "mysql"],
dump_only=True,
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_not_present(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.volume_src) / "files"
self.assertFalse(p.exists(), f"Did not expect files backup dir at: {p}")
def test_restore_files_fails_expected(self) -> None:
p = run([
"baudolo-restore", "files",
self.volume_dst, self.hash, self.version,
"--backups-dir", self.backups_dir,
"--repo-name", self.repo_name,
], check=False)
self.assertEqual(p.returncode, 2, f"Expected exitcode 2, got {p.returncode}\nSTDOUT={p.stdout}\nSTDERR={p.stderr}")

View File

@@ -0,0 +1,155 @@
# tests/e2e/test_e2e_mariadb_full.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_mariadb,
wait_for_mariadb_sql,
)
class TestE2EMariaDBFull(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-mariadb-full")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.db_container = f"{cls.prefix}-mariadb"
cls.db_volume = f"{cls.prefix}-mariadb-vol"
cls.containers = [cls.db_container]
cls.volumes = [cls.db_volume]
cls.db_name = "appdb"
cls.db_user = "test"
cls.db_password = "testpw"
cls.root_password = "rootpw"
run(["docker", "volume", "create", cls.db_volume])
# Start MariaDB with a dedicated TCP-capable user for tests.
run(
[
"docker",
"run",
"-d",
"--name",
cls.db_container,
"-e",
f"MARIADB_ROOT_PASSWORD={cls.root_password}",
"-e",
f"MARIADB_DATABASE={cls.db_name}",
"-e",
f"MARIADB_USER={cls.db_user}",
"-e",
f"MARIADB_PASSWORD={cls.db_password}",
"-v",
f"{cls.db_volume}:/var/lib/mysql",
"mariadb:11",
]
)
# Liveness + actual SQL login readiness (TCP)
wait_for_mariadb(cls.db_container, root_password=cls.root_password, timeout_s=90)
wait_for_mariadb_sql(cls.db_container, user=cls.db_user, password=cls.db_password, timeout_s=90)
# Create table + data via the dedicated user (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"CREATE TABLE {cls.db_name}.t (id INT PRIMARY KEY, v VARCHAR(50)); "
f"INSERT INTO {cls.db_name}.t VALUES (1,'ok');\"",
]
)
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
# IMPORTANT: baudolo backup expects credentials for the DB dump.
write_databases_csv(cls.databases_csv, [(cls.db_container, cls.db_name, cls.db_user, cls.db_password)])
# Backup with file+dump
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.db_container],
images_no_stop_required=["mariadb", "mysql", "alpine", "postgres"],
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
# Wipe DB via the dedicated user (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"DROP TABLE {cls.db_name}.t;\"",
]
)
# Restore DB (uses baudolo-restore which execs mysql/mariadb inside the container)
run(
[
"baudolo-restore",
"mariadb",
cls.db_volume,
cls.hash,
cls.version,
"--backups-dir",
cls.backups_dir,
"--repo-name",
cls.repo_name,
"--container",
cls.db_container,
"--db-name",
cls.db_name,
"--db-user",
cls.db_user,
"--db-password",
cls.db_password,
"--empty",
]
)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_dump_file_exists(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.db_volume) / "sql" / f"{self.db_name}.backup.sql"
self.assertTrue(p.is_file(), f"Expected dump file at: {p}")
def test_data_restored(self) -> None:
p = run(
[
"docker",
"exec",
self.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{self.db_user} -p{self.db_password} "
f"-N -e \"SELECT v FROM {self.db_name}.t WHERE id=1;\"",
]
)
self.assertEqual((p.stdout or "").strip(), "ok")

View File

@@ -0,0 +1,153 @@
# tests/e2e/test_e2e_mariadb_no_copy.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_mariadb,
wait_for_mariadb_sql,
)
class TestE2EMariaDBNoCopy(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-mariadb-nocopy")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.db_container = f"{cls.prefix}-mariadb"
cls.db_volume = f"{cls.prefix}-mariadb-vol"
cls.containers = [cls.db_container]
cls.volumes = [cls.db_volume]
cls.db_name = "appdb"
cls.db_user = "test"
cls.db_password = "testpw"
cls.root_password = "rootpw"
run(["docker", "volume", "create", cls.db_volume])
run(
[
"docker",
"run",
"-d",
"--name",
cls.db_container,
"-e",
f"MARIADB_ROOT_PASSWORD={cls.root_password}",
"-e",
f"MARIADB_DATABASE={cls.db_name}",
"-e",
f"MARIADB_USER={cls.db_user}",
"-e",
f"MARIADB_PASSWORD={cls.db_password}",
"-v",
f"{cls.db_volume}:/var/lib/mysql",
"mariadb:11",
]
)
wait_for_mariadb(cls.db_container, root_password=cls.root_password, timeout_s=90)
wait_for_mariadb_sql(cls.db_container, user=cls.db_user, password=cls.db_password, timeout_s=90)
# Create table + data (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"CREATE TABLE {cls.db_name}.t (id INT PRIMARY KEY, v VARCHAR(50)); "
f"INSERT INTO {cls.db_name}.t VALUES (1,'ok');\"",
]
)
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [(cls.db_container, cls.db_name, cls.db_user, cls.db_password)])
# dump-only => no files
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.db_container],
images_no_stop_required=["mariadb", "mysql", "alpine", "postgres"],
dump_only=True,
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
# Wipe table (TCP)
run(
[
"docker",
"exec",
cls.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{cls.db_user} -p{cls.db_password} "
f"-e \"DROP TABLE {cls.db_name}.t;\"",
]
)
# Restore DB
run(
[
"baudolo-restore",
"mariadb",
cls.db_volume,
cls.hash,
cls.version,
"--backups-dir",
cls.backups_dir,
"--repo-name",
cls.repo_name,
"--container",
cls.db_container,
"--db-name",
cls.db_name,
"--db-user",
cls.db_user,
"--db-password",
cls.db_password,
"--empty",
]
)
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_not_present(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.db_volume) / "files"
self.assertFalse(p.exists(), f"Did not expect files backup dir at: {p}")
def test_data_restored(self) -> None:
p = run(
[
"docker",
"exec",
self.db_container,
"sh",
"-lc",
f"mariadb -h 127.0.0.1 -u{self.db_user} -p{self.db_password} "
f"-N -e \"SELECT v FROM {self.db_name}.t WHERE id=1;\"",
]
)
self.assertEqual((p.stdout or "").strip(), "ok")

View File

@@ -0,0 +1,102 @@
# tests/e2e/test_e2e_postgres_full.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_postgres,
)
class TestE2EPostgresFull(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-postgres-full")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.pg_container = f"{cls.prefix}-pg"
cls.pg_volume = f"{cls.prefix}-pg-vol"
cls.containers = [cls.pg_container]
cls.volumes = [cls.pg_volume]
run(["docker", "volume", "create", cls.pg_volume])
run([
"docker", "run", "-d",
"--name", cls.pg_container,
"-e", "POSTGRES_PASSWORD=pgpw",
"-e", "POSTGRES_DB=appdb",
"-e", "POSTGRES_USER=postgres",
"-v", f"{cls.pg_volume}:/var/lib/postgresql/data",
"postgres:16",
])
wait_for_postgres(cls.pg_container, user="postgres", timeout_s=90)
# Create a table + data
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"CREATE TABLE t (id int primary key, v text); INSERT INTO t VALUES (1,'ok');\"",
])
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [(cls.pg_container, "appdb", "postgres", "pgpw")])
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.pg_container],
images_no_stop_required=["postgres", "mariadb", "mysql", "alpine"],
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
# Wipe schema
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"DROP TABLE t;\"",
])
# Restore
run([
"baudolo-restore", "postgres",
cls.pg_volume, cls.hash, cls.version,
"--backups-dir", cls.backups_dir,
"--repo-name", cls.repo_name,
"--container", cls.pg_container,
"--db-name", "appdb",
"--db-user", "postgres",
"--db-password", "pgpw",
"--empty",
])
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_dump_file_exists(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.pg_volume) / "sql" / "appdb.backup.sql"
self.assertTrue(p.is_file(), f"Expected dump file at: {p}")
def test_data_restored(self) -> None:
p = run([
"docker", "exec", self.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -t -c \"SELECT v FROM t WHERE id=1;\"",
])
self.assertEqual((p.stdout or "").strip(), "ok")

View File

@@ -0,0 +1,99 @@
# tests/e2e/test_e2e_postgres_no_copy.py
import unittest
from .helpers import (
backup_run,
backup_path,
cleanup_docker,
create_minimal_compose_dir,
ensure_empty_dir,
latest_version_dir,
require_docker,
unique,
write_databases_csv,
run,
wait_for_postgres,
)
class TestE2EPostgresNoCopy(unittest.TestCase):
@classmethod
def setUpClass(cls) -> None:
require_docker()
cls.prefix = unique("baudolo-e2e-postgres-nocopy")
cls.backups_dir = f"/tmp/{cls.prefix}/Backups"
ensure_empty_dir(cls.backups_dir)
cls.compose_dir = create_minimal_compose_dir(f"/tmp/{cls.prefix}")
cls.repo_name = cls.prefix
cls.pg_container = f"{cls.prefix}-pg"
cls.pg_volume = f"{cls.prefix}-pg-vol"
cls.containers = [cls.pg_container]
cls.volumes = [cls.pg_volume]
run(["docker", "volume", "create", cls.pg_volume])
run([
"docker", "run", "-d",
"--name", cls.pg_container,
"-e", "POSTGRES_PASSWORD=pgpw",
"-e", "POSTGRES_DB=appdb",
"-e", "POSTGRES_USER=postgres",
"-v", f"{cls.pg_volume}:/var/lib/postgresql/data",
"postgres:16",
])
wait_for_postgres(cls.pg_container, user="postgres", timeout_s=90)
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"CREATE TABLE t (id int primary key, v text); INSERT INTO t VALUES (1,'ok');\"",
])
cls.databases_csv = f"/tmp/{cls.prefix}/databases.csv"
write_databases_csv(cls.databases_csv, [(cls.pg_container, "appdb", "postgres", "pgpw")])
backup_run(
backups_dir=cls.backups_dir,
repo_name=cls.repo_name,
compose_dir=cls.compose_dir,
databases_csv=cls.databases_csv,
database_containers=[cls.pg_container],
images_no_stop_required=["postgres", "mariadb", "mysql", "alpine"],
dump_only=True,
)
cls.hash, cls.version = latest_version_dir(cls.backups_dir, cls.repo_name)
run([
"docker", "exec", cls.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -c \"DROP TABLE t;\"",
])
run([
"baudolo-restore", "postgres",
cls.pg_volume, cls.hash, cls.version,
"--backups-dir", cls.backups_dir,
"--repo-name", cls.repo_name,
"--container", cls.pg_container,
"--db-name", "appdb",
"--db-user", "postgres",
"--db-password", "pgpw",
"--empty",
])
@classmethod
def tearDownClass(cls) -> None:
cleanup_docker(containers=cls.containers, volumes=cls.volumes)
def test_files_backup_not_present(self) -> None:
p = backup_path(self.backups_dir, self.repo_name, self.version, self.pg_volume) / "files"
self.assertFalse(p.exists(), f"Did not expect files backup dir at: {p}")
def test_data_restored(self) -> None:
p = run([
"docker", "exec", self.pg_container,
"sh", "-lc",
"psql -U postgres -d appdb -t -c \"SELECT v FROM t WHERE id=1;\"",
])
self.assertEqual((p.stdout or "").strip(), "ok")

View File

View File

@@ -0,0 +1,88 @@
import csv
import subprocess
import sys
import tempfile
import unittest
from pathlib import Path
def run_seed(csv_path: Path, instance: str, database: str, username: str, password: str = "") -> subprocess.CompletedProcess:
# Run the real CLI module (integration-style).
return subprocess.run(
[
sys.executable,
"-m",
"baudolo.seed",
str(csv_path),
instance,
database,
username,
password,
],
text=True,
capture_output=True,
check=True,
)
def read_csv_semicolon(path: Path) -> list[dict]:
with path.open("r", encoding="utf-8", newline="") as f:
reader = csv.DictReader(f, delimiter=";")
return list(reader)
class TestSeedIntegration(unittest.TestCase):
def test_creates_file_and_adds_entry_when_missing(self) -> None:
with tempfile.TemporaryDirectory() as td:
p = Path(td) / "databases.csv"
self.assertFalse(p.exists())
cp = run_seed(p, "docker.test", "appdb", "alice", "secret")
self.assertEqual(cp.returncode, 0, cp.stderr)
self.assertTrue(p.exists())
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["instance"], "docker.test")
self.assertEqual(rows[0]["database"], "appdb")
self.assertEqual(rows[0]["username"], "alice")
self.assertEqual(rows[0]["password"], "secret")
def test_replaces_existing_entry_same_keys(self) -> None:
with tempfile.TemporaryDirectory() as td:
p = Path(td) / "databases.csv"
# First add
run_seed(p, "docker.test", "appdb", "alice", "oldpw")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["password"], "oldpw")
# Replace (same instance+database+username)
run_seed(p, "docker.test", "appdb", "alice", "newpw")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1, "Expected replacement, not a duplicate row")
self.assertEqual(rows[0]["instance"], "docker.test")
self.assertEqual(rows[0]["database"], "appdb")
self.assertEqual(rows[0]["username"], "alice")
self.assertEqual(rows[0]["password"], "newpw")
def test_database_empty_string_matches_existing_empty_database(self) -> None:
with tempfile.TemporaryDirectory() as td:
p = Path(td) / "databases.csv"
# Add with empty database
run_seed(p, "docker.test", "", "alice", "pw1")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["database"], "")
# Replace with empty database again
run_seed(p, "docker.test", "", "alice", "pw2")
rows = read_csv_semicolon(p)
self.assertEqual(len(rows), 1)
self.assertEqual(rows[0]["database"], "")
self.assertEqual(rows[0]["password"], "pw2")

View File

@@ -1,64 +1,36 @@
# tests/unit/test_backup.py
import unittest
from unittest.mock import patch
-import importlib.util
-import sys
-import os
-import pathlib
-
-# Prevent actual directory creation in backup script import
-dummy_mkdir = lambda self, *args, **kwargs: None
-original_mkdir = pathlib.Path.mkdir
-pathlib.Path.mkdir = dummy_mkdir
-
-# Create a virtual databases.csv in the project root for the module import
-test_dir = os.path.dirname(__file__)
-project_root = os.path.abspath(os.path.join(test_dir, '../../'))
-sys.path.insert(0, project_root)
-db_csv_path = os.path.join(project_root, 'databases.csv')
-with open(db_csv_path, 'w') as f:
-    f.write('instance;database;username;password\n')
-
-# Dynamically load the hyphenated script as module 'backup'
-script_path = os.path.join(project_root, 'backup-docker-to-local.py')
-spec = importlib.util.spec_from_file_location('backup', script_path)
-backup = importlib.util.module_from_spec(spec)
-sys.modules['backup'] = backup
-spec.loader.exec_module(backup)
-
-# Restore original mkdir
-pathlib.Path.mkdir = original_mkdir
-
-class TestIsImageWhitelisted(unittest.TestCase):
-    @patch('backup.get_image_info')
-    def test_returns_true_when_image_matches(self, mock_get_image_info):
-        # Simulate a container image containing 'mastodon'
-        mock_get_image_info.return_value = ['repo/mastodon:v4']
-        images = ['mastodon', 'wordpress']
-        self.assertTrue(
-            backup.is_image_whitelisted('any_container', images),
-            "Should return True when at least one image substring matches"
-        )
-
-    @patch('backup.get_image_info')
-    def test_returns_false_when_no_image_matches(self, mock_get_image_info):
-        # Simulate a container image without matching substrings
-        mock_get_image_info.return_value = ['repo/nginx:latest']
-        images = ['mastodon', 'wordpress']
-        self.assertFalse(
-            backup.is_image_whitelisted('any_container', images),
-            "Should return False when no image substring matches"
-        )
-
-    @patch('backup.get_image_info')
-    def test_returns_false_with_empty_image_list(self, mock_get_image_info):
-        # Even if get_image_info returns something, an empty list yields False
-        mock_get_image_info.return_value = ['repo/element:1.0']
-        self.assertFalse(
-            backup.is_image_whitelisted('any_container', []),
-            "Should return False when the images list is empty"
-        )
-
-if __name__ == '__main__':
+
+from baudolo.backup.app import requires_stop
+
+
+class TestRequiresStop(unittest.TestCase):
+    @patch("baudolo.backup.app.get_image_info")
+    def test_requires_stop_false_when_all_images_are_whitelisted(self, mock_get_image_info):
+        # All containers use images containing allowed substrings
+        mock_get_image_info.side_effect = [
+            "repo/mastodon:v4",
+            "repo/wordpress:latest",
+        ]
+        containers = ["c1", "c2"]
+        whitelist = ["mastodon", "wordpress"]
+        self.assertFalse(requires_stop(containers, whitelist))
+
+    @patch("baudolo.backup.app.get_image_info")
+    def test_requires_stop_true_when_any_image_is_not_whitelisted(self, mock_get_image_info):
+        mock_get_image_info.side_effect = [
+            "repo/mastodon:v4",
+            "repo/nginx:latest",
+        ]
+        containers = ["c1", "c2"]
+        whitelist = ["mastodon", "wordpress"]
+        self.assertTrue(requires_stop(containers, whitelist))
+
+    @patch("baudolo.backup.app.get_image_info")
+    def test_requires_stop_true_when_whitelist_empty(self, mock_get_image_info):
+        mock_get_image_info.return_value = "repo/anything:latest"
+        self.assertTrue(requires_stop(["c1"], []))
+
+
+if __name__ == "__main__":
    unittest.main()