Is the expansion of a wildcard in Bash guaranteed to be in alphabetical order? I am forced to split a large file into 10 MB pieces so that they can be accepted by my Mercurial repository.
So I was thinking I could use:
split -b 10485760 Big.file BigFilePiece.
and then in place of:
cat Big.file | bigFileProcessor
I could do:
cat BigFilePiece.* | bigFileProcessor
in its place.
However, I could not find anything that guaranteed that the expansion of the asterisk (aka wildcard, aka *) would always be in alphabetical order, so that .aa came before .ab (as opposed to timestamp ordering or something like that).
Also, are there any flaws in my plan? How great is the performance cost of cat-ing the pieces together?
Yes, globbing expansion is alphabetical.
The Bash man page documents this in its section on pathname expansion: a word containing a pattern character is replaced with an alphabetically sorted list of the filenames matching the pattern.
It is documented behavior for bash, so you can depend upon it in your scripts. It has also been true of other Bourne-compatible shells for a very long time, though there may be corner cases regarding case folding or non-alphanumeric characters.
(The resulting list in bash will be in almost "ASCII-betical" order, except that lowercase and uppercase letters will be collated together as if there were no case differences, but with lowercase collated before their uppercase equivalents. All non-alphabetics should collate into the same order as they appear in ASCII.)
As others have pointed out, this could be perturbed by your language-related environment settings: LANG generally and LC_COLLATE more specifically. It might be safest to run commands that depend on glob expansion ordering under an env command to clear the environment (using -i or -u as appropriate), or to pipe the results through sort to ensure robust sequencing.
While glob expansions are sorted alphabetically, they also obey the shell's language setting.
Make sure to set this to "C" (for example, LC_ALL=C) in your script if you intend this to be portable.
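To make the advice above concrete, here is a minimal sketch of the split-and-reassemble plan with the collation order pinned to the C locale. The function name split_and_rejoin, the temporary directory, and the 2500-byte random test file are assumptions made for illustration; a real script would use the actual file and the 10485760-byte piece size from the question.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: splits a throwaway test file into pieces of $1 bytes,
# lists the pieces in glob order, and checks that cat-ing them back together
# reproduces the original byte for byte.
split_and_rejoin() (
  # The ( ) body runs in a subshell, so the locale change and the cd
  # do not leak into the caller.
  # LC_ALL overrides LANG and LC_COLLATE; "C" sorts by plain byte value.
  export LC_ALL=C

  workdir=$(mktemp -d)
  trap 'rm -rf "$workdir"' EXIT
  cd "$workdir"

  # Stand-in for the large file: 2500 bytes of random data.
  head -c 2500 /dev/urandom > Big.file

  split -b "$1" Big.file BigFilePiece.

  # Print the pieces in the order the glob expands them.
  printf '%s\n' BigFilePiece.*

  # Reassemble through the same glob and compare with the original.
  cat BigFilePiece.* | cmp - Big.file && echo "round trip OK"
)

split_and_rejoin 1000
```

With 1000-byte pieces this should list BigFilePiece.aa, BigFilePiece.ab and BigFilePiece.ac, followed by "round trip OK". Since split names its pieces with lowercase letters only, the order is the same in most locales, but setting LC_ALL=C removes the dependency on the caller's environment.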