Compare commits

..

1 Commits

Author SHA1 Message Date
932595128c Added draft mysql dump 2021-08-19 21:02:52 +02:00
8 changed files with 75 additions and 326 deletions

1
.gitignore vendored
View File

@@ -1 +0,0 @@
databases.csv

View File

@@ -1,5 +1,5 @@
# Backup Docker Volumes to Local
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](./LICENSE.txt)
# docker-volume-backup
[![License: GPL v3](https://img.shields.io/badge/License-GPL%20v3-blue.svg)](./LICENSE.txt) [![Travis CI](https://travis-ci.org/kevinveenbirkenbach/docker-volume-backup.svg?branch=master)](https://travis-ci.org/kevinveenbirkenbach/docker-volume-backup)
## goal
This script backups all docker-volumes with the help of rsync.
@@ -9,59 +9,50 @@ It is part of the following scheme:
![backup scheme](https://www.veen.world/wp-content/uploads/2020/12/server-backup-768x567.jpg)
Further information you will find [in this blog post](https://www.veen.world/2020/12/26/how-i-backup-dedicated-root-servers/).
## Backup all volumes
## Backup
Execute:
```bash
./backup-docker-to-local.sh
./docker-volume-backup.sh
```
## Recover
### database
```bash
docker exec -i mysql_container mysql -uroot -psecret database < db.sql
```
### volume
Execute:
```bash
bash ./recover-docker-from-local.sh "{{volume_name}}" "$(sha256sum /etc/machine-id | head -c 64)" "{{version_to_recover}}"
./docker-volume-recover.sh {{volume_name}} {{backup_path}}
```
### Database
## Debug
To checkout what's going on in the mount container type in the following command:
```bash
docker run -it --entrypoint /bin/sh --rm --volumes-from {{container_name}} -v /Backups/:/Backups/ kevinveenbirkenbach/alpine-rsync
```
## Manual Backup
rsync -aPvv '***{{source_path}}***/' ***{{destination_path}}***";
## Setup
Install pandas
## Test
Delete the volume.
## Author
```bash
docker rm -f container-name
docker volume rm volume-name
```
Kevin Veen-Birkenbach
- 📧 Email: [kevin@veen.world](mailto:kevin@veen.world)
- 🌍 Website: [https://www.veen.world/](https://www.veen.world/)
Recover the volume:
## License
```bash
docker volume create volume-name
docker run --rm -v volume-name:/recover/ -v ~/backup/:/backup/ "kevinveenbirkenbach/alpine-rsync" sh -c "rsync -avv /backup/ /recover/"
```
This project is licensed under the GNU Affero General Public License v3.0. The full license text is available in the `LICENSE` file of this repository.
Restart the container.
## Optimation
This setup script is not optimized yet for performance. Please optimized this script for performance if you want to use it in a professional environment.
## More information
- https://docs.docker.com/storage/volumes/
- https://blog.ssdnodes.com/blog/docker-backup-volumes/
- https://www.baculasystems.com/blog/docker-backup-containers/
- https://gist.github.com/spalladino/6d981f7b33f6e0afe6bb
- https://stackoverflow.com/questions/26331651/how-can-i-backup-a-docker-container-with-its-data-volumes
- https://netfuture.ch/2013/08/simple-versioned-timemachine-like-backup-using-rsync/
- https://zwischenzugs.com/2016/08/29/bash-to-python-converter/
- https://en.wikipedia.org/wiki/Incremental_backup#Incremental
- https://unix.stackexchange.com/questions/567837/linux-backup-utility-for-incremental-backups
- https://chat.openai.com/share/6d10f143-3f7c-4feb-8ae9-5644c3433a65
- https://hub.docker.com/_/mariadb

View File

@@ -1,186 +0,0 @@
#!/bin/python
# Backups volumes of running containers
import subprocess
import os
import re
import pathlib
import pandas
from datetime import datetime
class BackupException(Exception):
"""Generic exception for backup errors."""
pass
def execute_shell_command(command):
"""Execute a shell command and return its output."""
print(command)
process = subprocess.Popen([command], stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
out, err = process.communicate()
if process.returncode != 0:
raise BackupException(f"Error in command: {command}\nOutput: {out}\nError: {err}\nExit code: {process.returncode}")
return [line.decode("utf-8") for line in out.splitlines()]
def get_machine_id():
"""Get the machine identifier."""
return execute_shell_command("sha256sum /etc/machine-id")[0][0:64]
def create_backup_directories(base_dir, machine_id, repository_name, backup_time):
"""Create necessary directories for backup."""
version_dir = os.path.join(base_dir, machine_id, repository_name, backup_time)
pathlib.Path(version_dir).mkdir(parents=True, exist_ok=True)
return version_dir
def get_instance(container):
instance_name = re.split("(_|-)(database|db|postgres)", container)[0]
print(f"Extracted instance name: {instance_name}")
return instance_name
def backup_database(container, databases, version_dir, db_type):
"""Backup database (MariaDB or PostgreSQL) if applicable."""
print(f"Starting database backup for {container} using {db_type}...")
instance_name = get_instance(container)
# Filter the DataFrame for the given instance_name
database_entries = databases.loc[databases['instance'] == instance_name]
# Check if there are more than one entries
if len(database_entries) > 1:
raise BackupException(f"More than one entry found for instance '{instance_name}'")
# Check if there is no entry
if database_entries.empty:
raise BackupException(f"No entry found for instance '{instance_name}'")
# Get the first (and only) entry
database_entry = database_entries.iloc[0]
backup_destination_dir = os.path.join(version_dir, "sql")
pathlib.Path(backup_destination_dir).mkdir(parents=True, exist_ok=True)
backup_destination_file = os.path.join(backup_destination_dir, f"backup.sql")
if db_type == 'mariadb':
backup_command = f"docker exec {container} /usr/bin/mariadb-dump -u {database_entry['username']} -p{database_entry['password']} {database_entry['database']} > {backup_destination_file}"
elif db_type == 'postgres':
if database_entry['password']:
# Include PGPASSWORD in the command when a password is provided
backup_command = (
f"PGPASSWORD={database_entry['password']} docker exec -i {container} "
f"pg_dump -U {database_entry['username']} -d {database_entry['database']} "
f"-h localhost > {backup_destination_file}"
)
else:
# Exclude PGPASSWORD and use --no-password when the password is empty
backup_command = (
f"docker exec -i {container} pg_dump -U {database_entry['username']} "
f"-d {database_entry['database']} -h localhost --no-password "
f"> {backup_destination_file}"
)
execute_shell_command(backup_command)
print(f"Database backup for {container} completed.")
def backup_volume(volume_name, version_dir):
"""Backup files of a volume."""
print(f"Starting backup routine for volume: {volume_name}")
files_rsync_destination_path = os.path.join(version_dir, volume_name, "files")
pathlib.Path(files_rsync_destination_path).mkdir(parents=True, exist_ok=True)
source_dir = f"/var/lib/docker/volumes/{volume_name}/_data/"
rsync_command = f"rsync -abP --delete --delete-excluded {source_dir} {files_rsync_destination_path}"
execute_shell_command(rsync_command)
print(f"Backup routine for volume: {volume_name} completed.")
def has_image(container,image):
"""Check if the container is using the image"""
image_info = execute_shell_command(f"docker inspect {container} | jq -r '.[].Config.Image'")
return image in image_info[0]
def stop_containers(containers):
"""Stop a list of containers."""
for container in containers:
print(f"Stopping container {container}...")
execute_shell_command(f"docker stop {container}")
def start_containers(containers):
"""Start a list of stopped containers."""
for container in containers:
print(f"Starting container {container}...")
execute_shell_command(f"docker start {container}")
def get_container_with_image(containers,image):
for container in containers:
if has_image(container,image):
return container
return False
def is_image_whitelisted(container, images):
"""Check if the container's image is one of the whitelisted images."""
image_info = execute_shell_command(f"docker inspect {container} | jq -r '.[].Config.Image'")
container_image = image_info[0]
for image in images:
if image in container_image:
return True
return False
def is_any_image_not_whitelisted(containers, images):
"""Check if any of the containers are using images that are not whitelisted."""
return any(not is_image_whitelisted(container, images) for container in containers)
def backup_routine_for_volume(volume_name, containers, databases, version_dir, whitelisted_images):
"""Perform backup routine for a given volume."""
for container in containers:
if has_image(container, 'mariadb'):
backup_database(container, databases, version_dir, 'mariadb')
elif has_image(container, 'postgres'):
backup_database(container, databases, version_dir, 'postgres')
else:
if is_any_image_not_whitelisted(containers, whitelisted_images):
stop_containers(containers)
backup_volume(volume_name, version_dir)
start_containers(containers)
else:
backup_volume(volume_name, version_dir)
def main():
print('Start backup routine...')
dirname = os.path.dirname(__file__)
repository_name = os.path.basename(dirname)
machine_id = get_machine_id()
backups_dir = '/Backups/'
backup_time = datetime.now().strftime("%Y%m%d%H%M%S")
version_dir = create_backup_directories(backups_dir, machine_id, repository_name, backup_time)
print('Start volume backups...')
databases = pandas.read_csv(os.path.join(dirname, "databases.csv"), sep=";")
volume_names = execute_shell_command("docker volume ls --format '{{.Name}}'")
# This whitelist is configurated for https://github.com/kevinveenbirkenbach/backup-docker-to-local
stop_and_restart_not_needed = [
# 'baserow', Doesn't use an extra database
'element',
'gitea',
'listmonk',
'mastodon',
'matomo',
'memcached',
'nextcloud',
'openproject',
'pixelfed',
'redis',
'wordpress'
]
for volume_name in volume_names:
print(f'Start backup routine for volume: {volume_name}')
containers = execute_shell_command(f"docker ps --filter volume=\"{volume_name}\" --format '{{{{.Names}}}}'")
if not containers:
print('Skipped due to no running containers using this volume.')
continue
backup_routine_for_volume(volume_name, containers, databases, version_dir, stop_and_restart_not_needed)
print('Finished volume backups.')
if __name__ == "__main__":
main()

View File

@@ -1,45 +0,0 @@
import pandas as pd
import argparse
import os
def check_and_add_entry(file_path, instance, host, database, username, password):
# Check if the file exists and is not empty
if os.path.exists(file_path) and os.path.getsize(file_path) > 0:
# Read the existing CSV file with header
df = pd.read_csv(file_path, sep=';')
else:
# Create a new DataFrame with columns if file does not exist
df = pd.DataFrame(columns=['instance','host', 'database', 'username', 'password'])
# Check if the entry exists and remove it
mask = (df['instance'] == instance) & (df['host'] == host) & (df['database'] == database) & (df['username'] == username)
if not df[mask].empty:
print("Replacing existing entry.")
df = df[~mask]
else:
print("Adding new entry.")
# Create a new DataFrame for the new entry
new_entry = pd.DataFrame([{'instance': instance, 'host': host, 'database': database, 'username': username, 'password': password}])
# Add (or replace) the entry using concat
df = pd.concat([df, new_entry], ignore_index=True)
# Save the updated CSV file
df.to_csv(file_path, sep=';', index=False)
def main():
parser = argparse.ArgumentParser(description="Check and replace (or add) a database entry in a CSV file.")
parser.add_argument("file_path", help="Path to the CSV file")
parser.add_argument("instance", help="Database instance")
parser.add_argument("host", help="Database host")
parser.add_argument("database", help="Database name")
parser.add_argument("username", help="Username")
parser.add_argument("password", nargs='?', default="", help="Password (optional)")
args = parser.parse_args()
check_and_add_entry(args.file_path, args.instance, args.host, args.database, args.username, args.password)
if __name__ == "__main__":
main()

View File

@@ -1 +0,0 @@
database;username;password;container

46
docker-volume-backup.sh Normal file
View File

@@ -0,0 +1,46 @@
#!/bin/bash
# Just backups volumes of running containers
# If rsync stucks consider:
# @see https://stackoverflow.com/questions/20773118/rsync-suddenly-hanging-indefinitely-during-transfers
#
backup_time="$(date '+%Y%m%d%H%M%S')";
backups_folder="/Backups/";
repository_name="$(cd "$(dirname "$(readlink -f "${0}")")" && basename `git rev-parse --show-toplevel`)";
machine_id="$(sha256sum /etc/machine-id | head -c 64)";
backup_repository_folder="$backups_folder$machine_id/$repository_name/";
for volume_name in $(docker volume ls --format '{{.Name}}');
do
echo "start backup routine: $volume_name";
for container_name in $(docker ps -a --filter volume="$volume_name" --format '{{.Names}}');
do
echo "stop container: $container_name" && docker stop "$container_name"
for source_path in $(docker inspect --format "{{ range .Mounts }}{{ if eq .Type \"volume\"}}{{ if eq .Name \"$volume_name\"}}{{ println .Destination }}{{ end }}{{ end }}{{ end }}" "$container_name");
do
destination_path="$backup_repository_folder""latest/$volume_name";
raw_destination_path="$destination_path/raw"
prepared_destination_path="$destination_path/prepared"
log_path="$backup_repository_folder""log.txt";
backup_dir_path="$backup_repository_folder""diffs/$backup_time/$volume_name";
raw_backup_dir_path="$backup_dir_path/raw";
prepared_backup_dir_path="$backup_dir_path/prepared";
if [ -d "$destination_path" ]
then
echo "backup volume: $volume_name";
else
echo "first backup volume: $volume_name"
mkdir -vp "$raw_destination_path";
mkdir -vp "$raw_backup_dir_path";
mkdir -vp "$prepared_destination_path";
mkdir -vp "$prepared_backup_dir_path";
fi
docker run --rm --volumes-from "$container_name" -v "$backups_folder:$backups_folder" "kevinveenbirkenbach/alpine-rsync" sh -c "
rsync -abP --delete --delete-excluded --log-file=$log_path --backup-dir=$raw_backup_dir_path '$source_path/' $raw_destination_path";
done
echo "start container: $container_name" && docker start "$container_name";
if [ "mariadb" == "$(docker inspect --format='{{.Config.Image}}' $container_name)"]
then
docker exec some-mariadb sh -c 'exec mysqldump --all-databases -uroot -p"$MARIADB_ROOT_PASSWORD"' > /some/path/on/your/host/all-databases.sql
fi
done
echo "end backup routine: $volume_name";
done

6
docker-volume-recover.sh Normal file
View File

@@ -0,0 +1,6 @@
#!/bin/bash
# @param $1 Volume-Name
volume_name="$1"
backup_path="$2"
docker volume create "$volume_name"
docker run --rm -v "$volume_name:/recover/" -v "$backup_path:/backup/" "kevinveenbirkenbach/alpine-rsync" sh -c "rsync -avv /backup/ /recover/"

View File

@@ -1,61 +0,0 @@
#!/bin/bash
# Check minimum number of arguments
if [ $# -lt 3 ]; then
echo "ERROR: Not enough arguments. Please provide at least a volume name, backup hash, and version."
exit 1
fi
volume_name="$1" # Volume-Name
backup_hash="$2" # Hashed Machine ID
version="$3" # version to backup
container="${4:-}" # optional
mysql_root_password="${5:-}" # optional
database="${6:-}" # optional
backup_folder="Backups/$backup_hash/backup-docker-to-local/$version/$volume_name"
backup_files="/$backup_folder/files"
backup_sql="/$backup_folder/sql/backup.sql"
echo "Inspect volume $volume_name"
docker volume inspect "$volume_name"
exit_status_volume_inspect=$?
if [ $exit_status_volume_inspect -eq 0 ]; then
echo "Volume $volume_name already exists"
else
echo "Create volume $volume_name"
docker volume create "$volume_name"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to create volume $volume_name"
exit 1
fi
fi
if [ -f "$backup_sql" ]; then
if [ -n "$container" ] && [ -n "$mysql_root_password" ] && [ -n "$database" ]; then
echo "recover mysql dump"
cat "$backup_sql" | docker exec -i "$container" mariadb -u root --password="$mysql_root_password" "$database"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to recover mysql dump"
exit 1
fi
exit 0
fi
echo "A database backup exists, but a parameter is missing. Files will be recovered instead."
fi
if [ -d "$backup_files" ]; then
echo "recover files"
docker run --rm -v "$volume_name:/recover/" -v "$backup_files:/backup/" "kevinveenbirkenbach/alpine-rsync" sh -c "rsync -avv --delete /backup/ /recover/"
if [ $? -ne 0 ]; then
echo "ERROR: Failed to recover files"
exit 1
fi
exit 0
else
echo "ERROR: $backup_files doesn't exist"
exit 1
fi
echo "ERROR: Unhandled case"
exit 1