I have about 250,000 files in several folders with sub-folders.
I am looking for a way to find duplicate filenames across all folders and their sub-folders. My OS is Ubuntu 22.04, using bash.
I prefer a bash command/script solution. However, suggestions about tools similar to fdupes -r (but checking whether filenames are the same, not their content) are welcome too.
About the files and their names:
- All files are images and have file extensions.
- The content of the file is not important and may be different.
- The extensions of the files are not important and may be different.
- Letter casing for both file names and their extensions is not consistent.
- Some of the files have more than one . (period) in their filename. Example: file_Name2.1.png
- The file extensions are 3 or 4 characters long. Examples: .png, .JPG, .jpeg
Structure:
The directory structure is pretty simple: ./[YEAR]/[MONTH]/[IMAGE_NAME].[EXTENSION]. For example:
tree -a
.
├── 2022
│   └── 12
│       ├── file1.png
│       └── File2.png
└── 2023
    ├── 01
    │   ├── file1.jpg
    │   ├── file3.png
    │   └── file4.png
    └── 02
        ├── FILE1.png
        ├── FILE4.PNG
        ├── File5.png
        └── File6.png
Expected result:
file1: ./2022/12/file1.png ./2023/01/file1.jpg ./2023/02/FILE1.png
file4: ./2023/01/file4.png ./2023/02/FILE4.PNG
Assuming your paths:
You can use something like:
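A minimal sketch of one way to do this, not necessarily the exact command originally posted. It assumes that no path contains a newline character and that every file has an extension, as described in the question. The function name find_dup_names is hypothetical. It lists all files, strips the directory part and the last extension, lowercases what is left, and prints every name that occurs more than once together with its paths.

```shell
# Hypothetical helper, written for bash: group files by lowercased name
# without the final extension and print only names seen more than once.
# Assumption: no path contains a newline character.
find_dup_names() {
    local root=${1:-.}
    # Sort the paths so the output order does not depend on directory order.
    find "$root" -type f | LC_ALL=C sort | awk '
        {
            path = $0
            name = path
            sub(/.*\//, "", name)        # keep the part after the last slash
            sub(/\.[^.]*$/, "", name)    # drop only the final extension
            key = tolower(name)          # compare names case-insensitively
            if (count[key]++ == 0) {
                order[++n] = key         # remember first-seen order of names
                paths[key] = path
            } else {
                paths[key] = paths[key] " " path
            }
        }
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 1)
                    print order[i] ": " paths[order[i]]
        }'
}
```

Run it from the top-level directory as find_dup_names . to get one line per duplicated name. Only the last extension is removed, so a name such as file_Name2.1.png is compared as file_name2.1. Paths containing spaces are still grouped correctly, but they become hard to tell apart in the space-separated output; a different separator in the awk script would help in that case.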