# Duplicate File Handler
This repository contains a Python script that identifies duplicate files in a directory and its subdirectories by comparing their MD5 hashes. It can filter by file type and handle duplicates by deleting them or replacing them with hard links or symbolic links.
## Author
- Kevin Veen-Birkenbach
- Email: kevin@veen.world
- Website: [https://www.veen.world](https://www.veen.world)

This repository was enhanced with the help of [OpenAI's ChatGPT](https://chat.openai.com/share/825931d6-1e33-40b0-8dfc-914b3f852eeb).
## Setup
To use the script, ensure you have Python installed on your system. No third-party packages are required; the script uses only the Python standard library.
## Usage
### Identifying and Handling Duplicates
`main.py` is a Python script to identify all duplicate files in the specified directories. It can also filter by file type and handle duplicates by deleting them or replacing them with hard or symbolic links.
```bash
python main.py [options] directories
```
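The hash-based detection can be illustrated with a minimal sketch. This is not the code in `main.py`; the function names `md5_of_file` and `find_duplicates` and the `file_type` parameter are hypothetical and chosen only for illustration. It groups files by MD5 digest and reports every digest shared by two or more files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directories, file_type=None):
    """Map each MD5 digest to the files sharing it, keeping digests with 2+ files.

    Hypothetical helper: `file_type` is a suffix such as ".txt"; None means no filter.
    """
    by_hash = defaultdict(list)
    seen = set()  # resolved paths, so overlapping directories are not counted twice
    for directory in directories:
        for path in sorted(Path(directory).rglob("*")):
            if not path.is_file():
                continue
            if file_type and not path.name.endswith(file_type):
                continue
            resolved = path.resolve()
            if resolved in seen:
                continue
            seen.add(resolved)
            by_hash[md5_of_file(path)].append(path)
    return {digest: paths for digest, paths in by_hash.items() if len(paths) > 1}


if __name__ == "__main__":
    # Print each group of identical files found below the current directory.
    for digest, paths in find_duplicates(["."]).items():
        print(digest, *paths, sep="\n  ")
```

MD5 is fast and adequate for spotting accidental duplicates, but it is not collision-resistant against deliberately crafted files, so a byte-for-byte comparison before deleting anything is a reasonable extra safeguard.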
#### Options
- `--apply-to`: Directories to apply modifications to.
- `--modification`: Action to perform on duplicates - `delete`, `hardlink`, `symlink`, or `show` (default).
- `--mode`: How to apply the modifications - `act`, `preview`, or `interactive` (default: `preview`).
- `-f`, `--file-type`: Filter by file type (e.g., `.txt` for text files).
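To make the `--modification` and `--mode` values more concrete, here is a hedged sketch of what each action could do to a single duplicate. The function `handle_duplicate` is hypothetical and is not taken from `main.py`; the interactive confirmation prompt is omitted, so any mode other than `act` only reports what would happen.

```python
import os
from pathlib import Path

ACTIONS = {"show", "delete", "hardlink", "symlink"}


def handle_duplicate(original, duplicate, modification="show", mode="preview"):
    """Apply one action to `duplicate`, leaving `original` untouched.

    Hypothetical helper. Returns a description of the action; the file system
    is changed only when mode is "act" and the action is not "show".
    """
    if modification not in ACTIONS:
        raise ValueError(f"unknown modification: {modification!r}")
    original, duplicate = Path(original), Path(duplicate)
    description = f"{modification}: {duplicate} (duplicate of {original})"
    if modification == "show" or mode != "act":
        return description

    if modification == "delete":
        duplicate.unlink()
        return description

    # Create the link under a temporary name first, then swap it into place,
    # so the duplicate is never lost if creating the link fails.
    temporary = duplicate.with_name(duplicate.name + ".tmp-link")
    if modification == "hardlink":
        os.link(original, temporary)
    else:
        os.symlink(original.resolve(), temporary)
    os.replace(temporary, duplicate)
    return description
```

Hard links only work when both paths are on the same file system, and symbolic links break if the original is later moved or deleted, so running in `preview` mode first is a sensible default.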
### Creating Test File Structure
`create_file_structure.py` is a utility script to create a test file structure with duplicate files for testing purposes.
```bash
python create_file_structure.py
```
## Example
To preview duplicate `.txt` files in `test_dir1` and `test_dir2`:
```bash
python main.py --file-type .txt --mode preview test_dir1 test_dir2
```
To interactively delete duplicates in `test_dir2`:
```bash
python main.py --apply-to test_dir2 --modification delete --mode interactive test_dir1 test_dir2
```
## License
This project is licensed under the terms of the [MIT License](LICENSE).