I'm trying to gzip all files on Ubuntu that have the file extension `.css`, `.html`, or `.js` in a top directory and all its subdirectories. I want to keep the original files and overwrite the `.gz` file if it already exists.

So when I have n files, I want to keep those n files and create n additional archive files, not just one.
My try was to run a script that looks like this:
```shell
gzip -rkf *.css
gzip -rkf *.html
# ... one line for each file extension
```
First: I need one line in that script for each file extension I want to gzip. That's OK, but I hope to find a better way.

Second, and more important: it does not work. Although `-r` should do the job, the subdirectories are unchanged; the `.gz` files are only created in the top directory.

What am I missing here?
By the way: the following is a bug in the verbose output, right? When using the `-k` and `-v` options

```
-k, --keep        keep (don't delete) input files
-v, --verbose     verbose mode
```

the verbose output says it *replaces* the file, although "replaced" implies that the original file no longer exists afterwards. Anyway, this is only an output issue:
```
$ ls
index.html      subdir1  testfile      testfile.css.gz
javaclass.java  subdir2  testfile.css
$ gzip -fkv *.css
testfile.css:    6.6% -- replaced with testfile.css.gz
$ ls
index.html      subdir1  testfile      testfile.css.gz
javaclass.java  subdir2  testfile.css
```
I would use `find`:
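The command itself did not survive in this copy; a sketch of the kind of `find` invocation meant here, assuming the three extensions from the question (demonstrated below on a throwaway tree standing in for `/path/to/dir`):

```shell
# demo tree standing in for /path/to/dir (the names are just examples)
dir=$(mktemp -d)
mkdir -p "$dir/subdir1"
echo 'body { margin: 0; }' > "$dir/testfile.css"
echo '<!doctype html>'     > "$dir/subdir1/index.html"

# -type f restricts to regular files; \( ... -o ... \) ORs the name tests;
# -exec runs gzip on each match: -k keeps the original, -v reports progress,
# -f overwrites an existing .gz
find "$dir" -type f \( -name '*.css' -o -name '*.html' -o -name '*.js' \) \
    -exec gzip -kvf {} \;
```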
Change `-name` to `-iname` if you want to match the extensions case-insensitively (i.e. to include `.CSS` and/or `.HTML` extensions). You can omit the `/path/to/dir` if you want to start the recursive search from the current directory.

You can do that with a `for` loop to `find` every file and then compress it:
To get the list of files:
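A sketch of such a listing command, assuming the search starts from the current directory (shown here on a throwaway demo tree):

```shell
cd "$(mktemp -d)"          # throwaway demo directory
mkdir -p subdir2
echo 'h1 { color: red; }' > style.css
echo 'console.log(1);'    > subdir2/app.js

# print every .css/.html/.js file; quoting the patterns keeps the
# shell from expanding them before find sees them
find . -type f \( -name '*.css' -o -name '*.html' -o -name '*.js' \)
```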
And to gzip all those files:
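A loop along those lines; it is sketched here with `find -print0` feeding a `while read` loop rather than `for f in $(find ...)`, so that filenames containing spaces survive:

```shell
cd "$(mktemp -d)"                    # throwaway demo directory
mkdir -p subdir1
echo 'body {}'       > 'testfile.css'
echo '<html></html>' > 'subdir1/my page.html'   # note the space

# -print0 separates paths with NUL bytes; read -d '' consumes them,
# so unusual filenames are handled correctly
find . -type f \( -name '*.css' -o -name '*.html' -o -name '*.js' \) -print0 |
while IFS= read -r -d '' file; do
    gzip -kf "$file"                 # -k keeps the original, -f overwrites .gz
done
```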
**You can use `globstar`.**

With the `globstar` shell option enabled, all you need is `gzip -vk **/*.{css,html}`.

The Bash shell has a `globstar` option that lets you write recursive globs with `**`. `shopt -s globstar` enables it. But you might not want it to affect other commands you run later, so you can run it and your `gzip` command in a subshell instead.

This command `gzip`s all `.css` and `.html` files in the current directory, any of its subdirectories, any of their subdirectories, and so on, keeping the original files (`-k`) and telling you what it's doing (`-v`):
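The command puts both steps in one subshell, so `globstar` is enabled only for this single `gzip` run (the files created below are just a demo):

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
mkdir -p subdir1
echo 'body {}' > testfile.css
echo '<html>'  > subdir1/index.html

# the parentheses run both commands in a subshell, so globstar is
# enabled only for this one gzip invocation
(shopt -s globstar; gzip -vk **/*.{css,html})
```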
If you want to match filenames case-insensitively, so that those extensions with some or all letters capitalized are included, then you can also enable the `nocaseglob` shell option:
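For instance, extending the same subshell form (the uppercase file below is just a demo):

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
echo 'body {}' > TESTFILE.CSS      # uppercase extension

# nocaseglob makes the glob patterns match case-insensitively
(shopt -s globstar nocaseglob; gzip -vk **/*.{css,html})
```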
`;` separates the two commands, and the outer `(` `)` cause them to be run in a subshell. Setting a shell option in a subshell does not cause it to be set in the calling shell. If you do want to enable `globstar`, then you can run `shopt -s globstar`; after that, you can just run the command `gzip -vk **/*.{css,html}` on its own.

You can disable `globstar` with `shopt -u globstar`. You can check if it's currently enabled with `shopt globstar`.
**How It Works**

The key to how this `gzip` command works is that the shell performs expansions on it to produce a list of each file in the directory hierarchy with a matching name, then passes each of these filenames as arguments to `gzip`:

- Brace expansion turns `**/*.{css,html}` into `**/*.css **/*.html`.
- Pathname expansion (globbing) then matches each of those patterns against files anywhere in the directory tree (recursively, via `**`, thanks to `globstar`) whose filenames consist of anything (`*`) followed by the specified suffix (`.css` or `.html` in this case).
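You can watch both expansions happen by handing the same pattern to `echo` instead of `gzip` (the files below are just a demo):

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
mkdir -p sub
echo x > a.css
echo y > sub/b.html

shopt -s globstar
# brace expansion: **/*.{css,html} -> **/*.css **/*.html
# pathname expansion then matches files at any depth
echo **/*.{css,html}
```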
This does not match files whose names start with `.` or those that reside in directories named this way. You probably don't have any such HTML and CSS files and, if you do, you probably don't want to include them. But if you do want to include them, then you can match them explicitly depending on your needs. For example, changing `**/*.{css,html}` to `**/{,.}*.{css,html}` includes files that start with `.` while still not searching in folders that do.

If you want both files whose names start with `.` and files in directories whose names start with `.` to be included, there's a cleaner and simpler way: enable the `dotglob` shell option, optionally combined with the case-insensitive matching above:
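For example, extending the subshell form used above (the dotfile below is just a demo):

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
echo 'x' > .hidden.css             # dotfile that a plain glob would skip

# dotglob lets globs match names that start with a dot
(shopt -s globstar dotglob; gzip -vk **/*.{css,html})

# or, with case-insensitive matching as well:
# (shopt -s globstar dotglob nocaseglob; gzip -vk **/*.{css,html})
```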
**It's possible, though very rare, for `**` to expand to something too long.**

If you have a huge number of files named this way, then this may fail with an error message explaining that the shell cannot build the command line because it would be too long. (Even with thousands of files, this usually is not a problem.) In that case `gzip` won't be called at all, so you won't get a half-done job.

If this error happens, or if you're worried that it might, you can use `find` with `-exec` instead, either as steeldriver describes (with `{} \;`) or as I describe below (with `{} +`).
**You can use `find` with the `-exec` action and `+` for efficiency.**

The `gzip` command supports being given the names of multiple files to compress. But this `find` command, although it works well and won't be slow unless you have many files, runs the `gzip` command once for each file:
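A sketch of that per-file form, assuming a search from the current directory (add `-o -name '*.js'` and so on for more extensions; the files below are just a demo):

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
mkdir -p subdir1
echo 'body {}' > testfile.css
echo '<html>'  > subdir1/index.html

# {} \; runs one gzip process per matching file
find . -type f \( -name '*.css' -o -name '*.html' \) -exec gzip -vk {} \;
```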
This works, and you can definitely use it. (`.` searches from the current directory. Besides that, it's really a slightly different way of writing the command in steeldriver's very good answer; you can use whichever style you prefer.)

You can also make `find` pass multiple filenames to `gzip` and run it only as many times as necessary, which is nearly always just once. To do that, use `+` instead of `\;`. The `+` argument should come just after `{}`:
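The batched form might look like this (same demo setup as before):

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
mkdir -p subdir1
echo 'body {}' > testfile.css
echo '<html>'  > subdir1/index.html

# {} + passes as many filenames as fit to a single gzip invocation
find . -type f \( -name '*.css' -o -name '*.html' \) -exec gzip -vk {} +
```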
`find` replaces `+` with additional filenames, if any. It's fine to use `+` even if there are only a few matching files, and when there are many of them, it can be noticeably faster than a separate `gzip` invocation for each file.

As steeldriver mentions, you can use `-iname` instead of `-name` to match files whose names end like `.css` or `.html` but with different capitalization. This corresponds to enabling `nocaseglob` in the `globstar`-based method described above.

Finally, you probably don't have any matching files or directories that start with `.`. But if you do, `find` automatically includes them. If you want to exclude them (as happens with the `globstar`-based method detailed above when `dotglob` is off), you can:
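One way to do that is with a `-path` test; the exact pattern here is an illustration consistent with the surrounding text, not the only option:

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
mkdir -p .git visible
echo 'x' > .git/a.css              # under a dot directory: should be skipped
echo 'y' > visible/b.css

# ! -path '*/.*' rejects any path containing a component that starts
# with a dot, so dotfiles and files inside dot directories are excluded
find . -type f \( -name '*.css' -o -name '*.html' \) ! -path '*/.*' \
    -exec gzip -vk {} +
```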
The `globstar`-based way described above is simpler to write, especially if you're excluding directories and files that begin with `.`, since that's the default.

**What not to do...**
Filenames can contain any character except the path separator `/` and the null character. Many techniques that break on weird filenames exist, and they are usually more complicated than the techniques that always just work, so I suggest avoiding them even when you know (or think you know) that they're okay in your specific situation. And of course you must not use them if you might have filenames with characters that may be treated specially, including spaces.

It is possible to safely pipe the output of `find` to another command that processes it if you use `-print0` or a similar action to make `find` place a null character between paths instead of a newline, and not otherwise. Filenames can contain newlines (though I discourage you from deliberately naming files with them). A `find` command with the `-print` action (including `find` commands with no explicit action, since then `-print` is the default) does not produce output that can safely be piped or otherwise provided to another command that performs an action on the files.

The output `find` produces with the `-print0` action may safely be piped to `xargs -0` (the `-0` flag tells `xargs` to expect null-separated input).

I used steeldriver's answer, but I like to complete it with the `--best` and `--force` options.
- `--best` for the best compression ratio.
- `--force` for overwriting without asking if there is already a gzipped file.

`cd` into any folder and run the commands below; all your matching files will be gzipped.
for overwriting without asking if there is already a gzipped file.To zip all files in a folder/subfolder recursively:
To unzip:
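A matching decompression sketch; `gzip -d` decompresses, and its `-r` option recurses just as it does when compressing:

```shell
cd "$(mktemp -d)"                  # throwaway demo directory
echo 'body {}' > testfile.css
gzip -k testfile.css               # make a .gz to decompress
rm testfile.css

# -d decompresses, -r recurses into directories, -k keeps the .gz files
gzip -dkr .
```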