Added content

Kevin Veen-Birkenbach 2023-06-28 11:51:06 +02:00
parent 447375cc98
commit c1cad6afe1
3 changed files with 98 additions and 0 deletions

README.md Normal file

@@ -0,0 +1,39 @@
# Duplicate File Handler
This repository contains two bash scripts for handling duplicate files in a directory and its subdirectories.
## Author
**Kevin Veen-Birkenbach**
- Email: kevin@veen.world
- Website: [https://www.veen.world](https://www.veen.world)
This repository was created with the help of [OpenAI's ChatGPT](https://openai.com/research/chatgpt) (Link to the conversation).
## Usage
### 1. List Duplicate Files
`list_duplicates.sh` lists all duplicate files in a specified directory and its subdirectories. For text files, it also displays the diffs between the duplicates.
```bash
./list_duplicates.sh /path/to/directory
```
### 2. Delete Duplicate Files
`delete_duplicates.sh` finds duplicate files in a specified directory and its subdirectories. For each file, it displays the paths of its duplicates and asks for confirmation before deleting it.
```bash
./delete_duplicates.sh /path/to/directory
```
## License
This project is licensed under the terms of the [GNU Affero General Public License v3.0](https://www.gnu.org/licenses/agpl-3.0.de.html).
These scripts help you manage duplicate files in your directories. Make them executable with `chmod +x list_duplicates.sh delete_duplicates.sh` before running them.
The scripts may need adjusting for your system or use case. They identify duplicates by comparing the MD5 hash of files, which is a common but not foolproof method: distinct files can, in rare cases, produce the same MD5 hash.
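For reference, the detection step both scripts rely on boils down to a single pipeline (a minimal sketch assuming GNU coreutils, where the MD5 occupies the first 32 characters of `md5sum` output):
```bash
# Hash every file, sort so identical hashes end up on adjacent lines,
# then print one representative line per hash that occurs more than once.
find /path/to/directory -type f -exec md5sum {} + | sort | uniq -d -w32
```
Because a hash match is strong evidence but not proof of identical content, a suspected duplicate can be confirmed byte for byte with `cmp` (a sketch, not part of the committed scripts; `file1` and `file2` are placeholders):
```bash
# cmp -s is silent and returns exit status 0 only for identical files.
cmp -s file1 file2 && echo "byte-for-byte identical"
```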
Please be aware that these scripts are provided as is, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the scripts or the use or other dealings in the scripts.

delete_duplicates.sh Executable file

@@ -0,0 +1,35 @@
#!/bin/bash
# Find duplicate files by MD5 hash and delete them interactively.
if [ -z "$1" ]
then
    echo "Directory path not provided"
    exit 1
fi
dir="$1"
# Hash every file once; sorting puts identical hashes on adjacent lines.
hashes=$(find "$dir" -type f -exec md5sum {} + | sort)
# Keep only hashes that occur more than once (the MD5 is the first 32 characters).
duplicate_hashes=$(uniq -d -w32 <<< "$hashes" | awk '{print $1}')
echo "Duplicates found:"
for hash in $duplicate_hashes
do
    # Collect every file sharing this hash (paths with spaces are not supported).
    mapfile -t files < <(grep "^$hash" <<< "$hashes" | awk '{print $2}')
    for file in "${files[@]}"
    do
        echo "File: $file"
        echo "Duplicate(s) of this file:"
        for duplicate in "${files[@]}"
        do
            if [ "$duplicate" != "$file" ]
            then
                echo "$duplicate"
            fi
        done
        # stdin is the terminal here, since the outer loop is not fed by a pipe.
        echo "Do you want to delete this file? [y/N]"
        read answer
        if [[ $answer == [yY] || $answer == [yY][eE][sS] ]]
        then
            rm -i "$file"
        fi
    done
done

list_duplicates.sh Executable file

@@ -0,0 +1,24 @@
#!/bin/bash
# List duplicate files by MD5 hash; for text files, also show their diffs.
if [ -z "$1" ]
then
    echo "Directory path not provided"
    exit 1
fi
dir="$1"
# Hash every file once; sorting puts identical hashes on adjacent lines.
hashes=$(find "$dir" -type f -exec md5sum {} + | sort)
# Keep only hashes that occur more than once (the MD5 is the first 32 characters).
duplicate_hashes=$(uniq -d -w32 <<< "$hashes" | awk '{print $1}')
echo "Duplicates found:"
for hash in $duplicate_hashes
do
    # Collect every file sharing this hash (paths with spaces are not supported).
    mapfile -t files < <(grep "^$hash" <<< "$hashes" | awk '{print $2}')
    file_type=$(file -b --mime-type "${files[0]}")
    if [[ $file_type == text/* ]]
    then
        # diff compares two files at a time, so diff each duplicate against the first.
        for other in "${files[@]:1}"
        do
            diff "${files[0]}" "$other"
        done
    else
        printf '%s\n' "${files[@]}"
    fi
done