Mirror of https://github.com/kevinveenbirkenbach/duplicate-file-handler.git (synced 2025-04-22 16:22:25 +02:00)
Update README.md
parent 32669ba528
commit 8239040659
95
README.md
@ -1,55 +1,94 @@
|
|||||||
# Duplicate File Handler
|
# Duplicate File Handler (dufiha) 🔍
|
||||||
|
|
||||||
This repository contains a Python script for identifying and handling duplicate files in a directory and its subdirectories based on their MD5 hash. It allows for filtering by file type and provides options for handling duplicates such as deletion, hard linking, or sym linking.
|
[License](./LICENSE) [GitHub stars](https://github.com/kevinveenbirkenbach/duplicate-file-handler/stargazers)
|
||||||
|
|
||||||
## Author
|
Duplicate File Handler is a Python CLI tool for identifying and handling duplicate files within one or more directories based on their MD5 hashes. With flexible file-type filtering and multiple action modes, you can efficiently delete duplicates or replace them with hard or symbolic links.
|
||||||
- Kevin Veen-Birkenbach
|
|
||||||
- Email: kevin@veen.world
|
|
||||||
- Website: [https://www.veen.world](https://www.veen.world)
|
|
||||||
|
|
||||||
This repository was enhanced with the help of [OpenAI's ChatGPT](https://chat.openai.com/share/825931d6-1e33-40b0-8dfc-914b3f852eeb).
|
---
|
||||||
|
|
||||||
## Setup
|
## 🛠 Features
|
||||||
To use the script, ensure you have Python installed on your system. No additional libraries are required as the script uses standard Python libraries.
|
|
||||||
|
|
||||||
## Usage
|
- **Duplicate Detection:** Computes MD5 hashes for files to find duplicates.
|
||||||
|
- **File Type Filtering:** Process only files with a specified extension.
|
||||||
|
- **Multiple Modification Options:** Choose to delete duplicates, replace them with hard links, or create symbolic links.
|
||||||
|
- **Flexible Modes:** Operate in preview, interactive, or active mode to suit your workflow.
|
||||||
|
- **Parallel Processing:** Utilizes process pooling for efficient scanning of large directories.
|
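The detection, filtering, and parallel-processing features listed above could be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the names `md5_of`, `iter_files`, and `find_duplicates`, the `workers` parameter, and the choice of `concurrent.futures.ProcessPoolExecutor` for the process pool are all assumptions made for this example.

```python
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor


def md5_of(path, chunk_size=1 << 20):
    """Return (path, hex MD5 digest), reading the file in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return path, digest.hexdigest()


def iter_files(directories, file_type=None):
    """Yield regular files below the directories, optionally filtered by suffix."""
    for directory in directories:
        for root, _dirs, names in os.walk(directory):
            for name in names:
                path = os.path.join(root, name)
                if file_type is not None and not name.endswith(file_type):
                    continue
                # Skip symbolic links so a link is never reported as a duplicate.
                if os.path.isfile(path) and not os.path.islink(path):
                    yield path


def find_duplicates(directories, file_type=None, workers=None):
    """Group paths by MD5 digest and keep only groups with more than one file.

    workers=0 hashes in the current process (hypothetical convenience switch);
    any other value is passed to ProcessPoolExecutor(max_workers=...).
    """
    paths = list(iter_files(directories, file_type))
    if workers == 0:
        hashed = [md5_of(path) for path in paths]
    else:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            hashed = list(pool.map(md5_of, paths))
    groups = defaultdict(list)
    for path, digest in hashed:
        groups[digest].append(path)
    return {digest: sorted(group) for digest, group in groups.items() if len(group) > 1}
```

Calling `find_duplicates(["test_dir1", "test_dir2"], file_type=".txt")` would return a dictionary that maps each MD5 digest to the sorted list of `.txt` paths sharing it. When the process pool is used from a script on platforms that spawn worker processes, the call should sit under an `if __name__ == "__main__":` guard.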
||||||
|
|
||||||
### Identifying and Handling Duplicates
|
---
|
||||||
|
|
||||||
`main.py` is a Python script to identify all duplicate files in the specified directories. It can also filter by file type and handle duplicates by deleting them or replacing them with hard or symbolic links.
|
## 📥 Installation
|
||||||
|
|
||||||
|
Install Duplicate File Handler via [Kevin's Package Manager](https://github.com/kevinveenbirkenbach/package-manager) under the alias `dufiha`:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python main.py [options] directories
|
package-manager install dufiha
|
||||||
```
|
```
|
||||||
|
|
||||||
#### Options
|
This command installs the tool globally, making it available as `dufiha` in your terminal. 🚀
|
||||||
- `--apply-to`: Directories to apply modifications to.
|
|
||||||
- `--modification`: Action to perform on duplicates - `delete`, `hardlink`, `symlink`, or `show` (default).
|
|
||||||
- `--mode`: How to apply the modifications - `act`, `preview`, `interactive` (default: `preview`).
|
|
||||||
- `-f`, `--file-type`: Filter by file type (e.g., `.txt` for text files).
|
|
||||||
|
|
||||||
### Creating Test File Structure
|
---
|
||||||
|
|
||||||
`create_file_structure.py` is a utility script to create a test file structure with duplicate files for testing purposes.
|
## 🚀 Usage
|
||||||
|
|
||||||
|
Run Duplicate File Handler by specifying one or more directories to scan for duplicates:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python create_file_structure.py
|
dufiha [options] directory1 directory2 ...
|
||||||
```
|
```
|
||||||
|
|
||||||
## Example
|
### Options
|
||||||
|
|
||||||
To preview duplicate `.txt` files in `test_dir1` and `test_dir2`:
|
- **`--apply-to`**: Directories to which modifications should be applied.
|
||||||
|
- **`--modification`**: Action to perform on duplicates:
|
||||||
|
- `delete` – Delete duplicate files.
|
||||||
|
- `hardlink` – Replace duplicates with hard links.
|
||||||
|
- `symlink` – Replace duplicates with symbolic links.
|
||||||
|
- `show` – Only display duplicate files (default).
|
||||||
|
- **`--mode`**: How to apply modifications:
|
||||||
|
- `act` – Execute changes immediately.
|
||||||
|
- `preview` – Preview changes without making any modifications.
|
||||||
|
- `interactive` – Ask for confirmation before processing each duplicate.
|
||||||
|
- **`-f, --file-type`**: Filter by file type (e.g., `.txt` for text files).
|
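How the `--modification` and `--mode` values listed above might combine for a single duplicate is sketched below. This is an assumption-laden illustration rather than dufiha's actual implementation: the function `apply_modification`, its `confirm` parameter, and the temporary `.tmp-link` suffix are hypothetical, and the `--apply-to` restriction is left out.

```python
import os

MODIFICATIONS = ("show", "delete", "hardlink", "symlink")
MODES = ("preview", "interactive", "act")


def apply_modification(original, duplicate, modification="show",
                       mode="preview", confirm=input):
    """Handle one duplicate of `original`; return True if a file was changed."""
    if modification not in MODIFICATIONS or mode not in MODES:
        raise ValueError(f"unsupported combination: {modification!r}, {mode!r}")
    if modification == "show":
        print(f"duplicate: {duplicate} (same content as {original})")
        return False
    # Same path, or already linked to the original: nothing to do.
    if os.path.samefile(original, duplicate):
        return False
    if mode == "preview":
        print(f"would {modification}: {duplicate} (keeping {original})")
        return False
    if mode == "interactive":
        answer = confirm(f"{modification} {duplicate}? [y/N] ")
        if answer.strip().lower() != "y":
            return False
    if modification == "delete":
        os.remove(duplicate)
        return True
    # Create the link under a temporary name, then swap it into place, so the
    # duplicate's path is never left empty if creating the link fails.
    temp = duplicate + ".tmp-link"
    if modification == "hardlink":
        os.link(original, temp)
    else:
        os.symlink(os.path.abspath(original), temp)
    os.replace(temp, duplicate)
    return True
```

In this sketch `preview` only prints what would happen, `interactive` asks before each change, and `act` applies the change directly. Hard links require both paths to be on the same file system, so a real tool would need to handle the resulting `OSError`.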
||||||
|
|
||||||
|
### Example Commands
|
||||||
|
|
||||||
|
- **Preview duplicate `.txt` files in two directories:**
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python main.py --file-type .txt --mode preview test_dir1 test_dir2
|
dufiha --file-type .txt --mode preview test_dir1 test_dir2
|
||||||
```
|
```
|
||||||
|
|
||||||
To interactively delete duplicates in `test_dir2`:
|
- **Interactively delete duplicates in a specific directory:**
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
python main.py --apply-to test_dir2 --modification delete --mode interactive test_dir1 test_dir2
|
dufiha --apply-to test_dir2 --modification delete --mode interactive test_dir1 test_dir2
|
||||||
```
|
```
|
||||||
|
|
||||||
## License
|
- **Show duplicates without modifying any files:**
|
||||||
|
|
||||||
This project is licensed under the terms of the [MIT License](LICENSE).
|
```bash
|
||||||
|
dufiha --modification show test_dir1
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🧑‍💻 Author
|
||||||
|
|
||||||
|
Developed by **Kevin Veen-Birkenbach**
|
||||||
|
- 📧 [kevin@veen.world](mailto:kevin@veen.world)
|
||||||
|
- 🌐 [https://www.veen.world](https://www.veen.world)
|
||||||
|
|
||||||
|
This project was enhanced with assistance from [OpenAI's ChatGPT](https://chat.openai.com/share/825931d6-1e33-40b0-8dfc-914b3f852eeb).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📜 License
|
||||||
|
|
||||||
|
This project is licensed under the **GNU Affero General Public License, Version 3, 19 November 2007**.
|
||||||
|
See the [LICENSE](./LICENSE) file for details.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🤝 Contributions
|
||||||
|
|
||||||
|
Contributions are welcome! Please feel free to fork the repository, submit pull requests, or open issues to help improve Duplicate File Handler. Let’s make file management smarter and more efficient! 😊
|
||||||
|