Mirror of https://github.com/kevinveenbirkenbach/duplicate-file-handler.git

Commit c1cad6afe1 (parent 447375cc98): Added content
README.md (new file, 39 lines)
# Duplicate File Handler

This repository contains two bash scripts for handling duplicate files in a directory and its subdirectories.
## Author

**Kevin Veen-Birkenbach**

- Email: kevin@veen.world
- Website: [https://www.veen.world](https://www.veen.world)

This repository was created with the help of [OpenAI's ChatGPT](https://openai.com/research/chatgpt) (Link to the conversation).
## Usage

### 1. List Duplicate Files

`list_duplicates.sh` lists all duplicate files in a specified directory and its subdirectories. For text files, it also displays the diffs.

```bash
./list_duplicates.sh /path/to/directory
```
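For example, with two identical copies of a photo (hypothetical paths), the output might look like this:

```bash
$ ./list_duplicates.sh ~/pictures
Duplicates found:
/home/user/pictures/photo.jpg
/home/user/pictures/backup/photo.jpg
```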
### 2. Delete Duplicate Files

`delete_duplicates.sh` finds duplicate files in a specified directory and its subdirectories. For each file it displays the paths of its duplicates and asks for confirmation before deleting.

```bash
./delete_duplicates.sh /path/to/directory
```
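A session might look like this (hypothetical paths):

```bash
$ ./delete_duplicates.sh ~/pictures
Duplicates found:
File: /home/user/pictures/photo.jpg
Duplicate(s) of this file:
/home/user/pictures/backup/photo.jpg
Do you want to delete this file? [y/N]
y
```

Answering `y` removes the file via `rm -i`, which asks once more before deleting.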
## License

This project is licensed under the terms of the [GNU Affero General Public License v3.0](https://www.gnu.org/licenses/agpl-3.0.de.html).
These scripts will help you manage duplicate files in your directories. Make sure the scripts are executable before running them: `chmod +x list_duplicates.sh delete_duplicates.sh`.

The scripts may need to be modified for your system or use case. They currently compare the MD5 hashes of files to find duplicates, which is fast and common but not foolproof: MD5 collisions are possible, so files with the same hash are not strictly guaranteed to have identical content.
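If stronger guarantees are wanted, the same core pipeline the scripts use can be switched to SHA-256. This is a minimal sketch, not something the scripts currently do; `sha256sum` prints 64-character hashes, so `uniq` must compare 64 columns instead of 32:

```bash
# Sketch: the scripts' core hashing pipeline with sha256sum swapped in for md5sum.
find /path/to/directory -type f -exec sha256sum {} + | sort | uniq -d -w64
```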
Please be aware that these scripts are provided as is, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the scripts or the use or other dealings in the scripts.
delete_duplicates.sh (new executable file, 35 lines)
```bash
#!/bin/bash

# Find duplicate files under the given directory and interactively delete them.

if [ -z "$1" ]
then
    echo "Directory path not provided"
    exit 1
fi

dir="$1"

# Hash every file once; files with identical MD5 hashes are treated as duplicates.
hashes=$(find "$dir" -type f -exec md5sum {} +)

# Keep one representative line per group of duplicate hashes
# (-w32 compares only the 32-character hash prefix).
duplicates=$(echo "$hashes" | sort | uniq -d -w32)

echo "Duplicates found:"

echo "$duplicates" | while read -r line
do
    # Skip the empty line echo produces when there are no duplicates.
    [ -z "$line" ] && continue
    hash=${line%% *}
    # Collect every file sharing this hash; md5sum prints "<32-char hash>  <path>",
    # so the path starts at column 35.
    mapfile -t files < <(grep "^$hash" <<< "$hashes" | cut -c35-)
    for file in "${files[@]}"
    do
        echo "File: $file"
        echo "Duplicate(s) of this file:"
        for duplicate in "${files[@]}"
        do
            if [ "$duplicate" != "$file" ]
            then
                echo "$duplicate"
            fi
        done
        echo "Do you want to delete this file? [y/N]"
        # Read the answer from the terminal; stdin of this loop is the pipe above.
        read -r answer < /dev/tty
        if [[ $answer == [yY] || $answer == [yY][eE][sS] ]]
        then
            rm -i "$file"
        fi
    done
done
```
list_duplicates.sh (new executable file, 24 lines)
```bash
#!/bin/bash

# List all duplicate files under the given directory.
# For text files, show the diff of each duplicate against the first copy.

if [ -z "$1" ]
then
    echo "Directory path not provided"
    exit 1
fi

dir="$1"

# Hash every file once; files with identical MD5 hashes are treated as duplicates.
hashes=$(find "$dir" -type f -exec md5sum {} +)

# Keep one representative line per group of duplicate hashes
# (-w32 compares only the 32-character hash prefix).
duplicates=$(echo "$hashes" | sort | uniq -d -w32)

echo "Duplicates found:"

echo "$duplicates" | while read -r line
do
    # Skip the empty line echo produces when there are no duplicates.
    [ -z "$line" ] && continue
    hash=${line%% *}
    # Collect every file sharing this hash; the path starts at column 35
    # of md5sum's "<hash>  <path>" output.
    mapfile -t files < <(grep "^$hash" <<< "$hashes" | cut -c35-)
    file_type=$(file -b --mime-type "${files[0]}")
    if [[ $file_type == text/* ]]
    then
        # diff compares exactly two files, so pair the first copy with each duplicate.
        for duplicate in "${files[@]:1}"
        do
            diff "${files[0]}" "$duplicate"
        done
    else
        printf '%s\n' "${files[@]}"
    fi
done
```