Yes, I'm sorting out my music. I've got everything arranged beautifully in the following mantra: `/Artist/Album/Track - Artist - Title.ext` and, if one exists, the cover sits in `/Artist/Album/cover.(jpg|png)`.
I want to scan through all the second-level directories and find the ones that don't have a cover. By second level, I mean I don't care if `/Britney Spears/` doesn't have a cover.jpg, but I would care if `/Britney Spears/In The Zone/` didn't have one.
Don't worry about the cover-downloading (that's a fun project for me tomorrow); I only care about the glorious bash-fuiness of an inverse-ish `find` example.
Case 1: You know the exact file name to look for
Use `find` with `test -e your_file` to check whether a file exists. For example, you can look for directories which have no `cover.jpg` in them. It's case sensitive though.
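A minimal sketch of that idea, assuming a `find` that supports `-mindepth`/`-maxdepth` (GNU and BSD do). The function name `albums_without_exact_cover` is made up for illustration, and the directory is handed to `sh` as `$1` rather than embedded in the command string, which is a choice of this sketch:

```shell
# Sketch: print the Artist/Album directories (exactly two levels below the
# base directory) that do not contain a file named exactly cover.jpg.
# "albums_without_exact_cover" is a hypothetical helper name.
albums_without_exact_cover() {
    # -mindepth 2 -maxdepth 2 limits the search to second-level directories.
    # "! -exec ... \;" is true when the test command fails, i.e. no cover.jpg.
    find "$1" -mindepth 2 -maxdepth 2 -type d \
        ! -exec sh -c 'test -e "$1/cover.jpg"' sh {} \; -print
}
```

Calling it as `albums_without_exact_cover ~/Music` would print one directory per line.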
Case 2: You want to be more flexible
You're not sure of the case, and the extension might be `jPg`, `png`, ...
Explanation:
- `sh` is spawned for each directory, since piping isn't possible when using `find`.
- `ls -1 "{}"` outputs just the filenames of the directory `find` is currently traversing.
- `egrep` (instead of `grep`) uses extended regular expressions; `-i` makes the search case insensitive, `-q` makes it omit any output.
- `"^cover\.(jpg|png)$"` is the search pattern. In this example, it matches e.g. `cOver.png`, `Cover.JPG` or `cover.png`. The `.` must be escaped, otherwise it matches any character. `^` marks the start of the line, `$` its end.
Other search pattern examples for egrep:
Substitute the `egrep -i -q "^cover\.(jpg|png)$"` part with:
- `egrep -i -q "cover\.(jpg|png)$"`: also matches `cd_cover.png`, `album_cover.JPG`, ...
- `egrep -q "^cover\.(jpg|png)$"`: matches `cover.png`, `cover.jpg`, but NOT `Cover.jpg` (case sensitivity is not turned off)
- `egrep -iq "^(cover|front)\.jpg$"`: matches e.g. `front.jpg`, `Cover.JPG` but not `Cover.PNG`
For more info on this, check out Regular Expressions.
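Putting the Case 2 pieces together, the command might look roughly like the sketch below. It assumes a `find` with `-mindepth`/`-maxdepth`; the function name `albums_without_cover_ci` is hypothetical, and `grep -E` is used here as the equivalent spelling of `egrep`:

```shell
# Sketch: print second-level directories with no cover.jpg / cover.png,
# ignoring case. "albums_without_cover_ci" is a hypothetical helper name.
albums_without_cover_ci() {
    # sh is spawned per directory because -exec cannot run a pipeline itself.
    # grep -E is the same as egrep; -i ignores case, -q suppresses output.
    find "$1" -mindepth 2 -maxdepth 2 -type d \
        ! -exec sh -c 'ls -1 "$1" | grep -E -i -q "^cover\.(jpg|png)$"' sh {} \; -print
}
```

Swapping in one of the other patterns only means changing the quoted regular expression.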
Simple, it transpires. The following gets a list of directories with the cover and compares that with a list of all the second-level directories. Lines that appear in both "files" are suppressed, leaving a list of directories that need covers.
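A rough sketch of such a pipeline, assuming Bash (for `<(...)`) and a `find` with `-mindepth`/`-maxdepth`. The function name `albums_missing_cover` and the choice to `cd` into the music directory first are assumptions of this sketch:

```shell
# Sketch (needs bash for process substitution): print Artist/Album for every
# second-level directory that holds no cover.* file.
# "albums_missing_cover" is a hypothetical helper name.
albums_missing_cover() (
    cd "$1" || exit 1
    # Left input: directories that do have a cover.* file, deduplicated.
    # Right input: every second-level directory.
    # comm -3 drops the lines common to both, leaving the coverless albums.
    comm -3 \
        <(find . -mindepth 3 -maxdepth 3 -iname 'cover.*' -exec dirname {} \; | sort -u) \
        <(find . -mindepth 2 -maxdepth 2 -type d | sort) |
        sed 's|^[[:space:]]*\./||'   # strip comm's leading tab and find's "./"
)
```

Something like `albums_missing_cover ~/Music` would then list the albums that still need a cover.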
Hooray.
Notes:
- `comm`'s arguments are as follows:
  - `-1` suppress lines unique to file1
  - `-2` suppress lines unique to file2
  - `-3` suppress lines that appear in both files
- `comm` only takes files, hence the kooky `<(...)` input method. This hands `comm` a file name (such as `/dev/fd/63`) that is backed by a pipe rather than by a regular temporary file.
- `comm` needs sorted input or it doesn't work, and `find` by no means guarantees an order. The input also needs to be unique. The first `find` operation could find multiple files for `cover.*`, so there could be duplicate entries. `sort -u` quickly ruffles those down to one. The second find is always going to be unique.
- `dirname` is a handy tool for getting a file's dir without resorting to `sed` (et al).
- `find` and `comm` are both a bit messy with their output. The final `sed` is there to clean things up so you're left with `Artist/Album`. This may or may not be desirable for you.
This is much nicer to solve with globbing than with find.
Now suppose you have no stray files in this nice structure. The current directory contains only artist subdirectories, and those contain only album subdirectories. Then we can do something like this:
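A minimal sketch of that approach, assuming Bash and covers named `cover.jpg`. The function name `diff_covers` is hypothetical, and wrapping the command in a subshell function is only done here so the `cd` and `shopt` don't leak into your session:

```shell
# Sketch (needs bash): diff the album directories that have a cover.jpg
# against all album directories. "diff_covers" is a hypothetical helper name.
diff_covers() (
    cd "$1" || exit 1
    shopt -s nullglob   # unmatched globs expand to nothing, not to themselves
    # Left: directory of each cover found. Right: every Artist/Album entry.
    diff <(for cover in */*/cover.jpg; do dirname "$cover"; done) \
         <(printf '%s\n' */*)
)
```

Lines starting with `>` in the output are albums that appear only on the right-hand side, that is, albums without a cover. `diff` exits with status 1 when it finds differences, which is the expected case here.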
The `<(...)` syntax is Bash process substitution: it lets you use a command in place of a file argument, treating the output of the command as a file. So we can run two programs and take their diff without saving their output in temporary files. The `diff` program thinks it is working with two files, but in fact it's reading from two pipes.
The command that produces the right-hand input to `diff`, `printf "%s\n" */*`, just lists the album directories. The left-hand command iterates through the cover paths (`*/*/cover.jpg` in this layout) and prints their directory names.
Test run: aha, the `a/b` and `foo/bar` directories have no `cover.jpg`.
There are some broken corner cases, like that by default `*` expands to itself if it matches nothing. This can be addressed with Bash's `shopt -s nullglob` (`nullglob` is a `shopt` option, not a `set -o` one).
Will show all directories that do not have txt files in them.
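One way to get such a listing, as a rough sketch only: it assumes a `find` with `-maxdepth`, matches `*.txt` case-sensitively, and uses the made-up function name `dirs_without_txt`.

```shell
# Sketch: print every directory under the given root that has no *.txt file
# directly inside it. "dirs_without_txt" is a hypothetical helper name.
dirs_without_txt() {
    find "$1" -type d -exec sh -c '
        for dir in "$@"; do
            # The inner find prints any .txt files; empty output means none.
            if [ -z "$(find "$dir" -maxdepth 1 -type f -name "*.txt")" ]; then
                printf "%s\n" "$dir"
            fi
        done
    ' sh {} +
}
```

Changing the `-name` pattern to `cover.*` would adapt it to the cover question.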
Let's make it simple and easy to understand.
It can also be made into one line (note that we need an extra `;` in the one-line version).
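As an illustration of a simple version and its one-line form, here is a sketch built on a plain glob loop. The loop itself, the function names and the `cover.jpg` file name are assumptions of this sketch, so treat it as one possible reading rather than the exact command:

```shell
# Multi-line form: print each Artist/Album directory without a cover.jpg.
# Both function names are hypothetical.
list_albums_without_cover() {
    for album in "$1"/*/*/; do
        [ -d "$album" ] || continue          # skip the unexpanded pattern
        if [ ! -e "${album}cover.jpg" ]; then
            printf '%s\n' "$album"
        fi
    done
}

# One-line form: every line break inside the loop turns into a ';',
# including the ones before 'fi' and 'done'.
list_albums_without_cover_oneline() { for album in "$1"/*/*/; do [ -d "$album" ] || continue; if [ ! -e "${album}cover.jpg" ]; then printf '%s\n' "$album"; fi; done; }
```

Because the glob ends in `/`, each printed path keeps its trailing slash.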