I have a directory containing different text files such as:
ajac001a00.24o
ajac001a15.24o
ajac001a30.24o
.
.
areg001a00.24o
areg001a15.24o
areg001a30.24o
.
.
I need to combine the files that start with the same four characters into one file per prefix, such as:
cat *ajac* > ajac_combined
cat *areg* > areg_combined
How can I do this using a loop? There are too many files with different prefixes, so this cannot be done by running the cat command manually for each one.
You could collect all files in an array, then cut the first 4 characters to get the list of prefixes, and then iterate over the prefixes to combine the files. Like this:
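A minimal sketch of that approach, assuming Bash. The function name combine_by_prefix is hypothetical, and narrowing the globs to *.24o is my assumption based on the names in the question, so that the *_combined outputs are never picked up as inputs on a rerun:

```shell
#!/usr/bin/env bash
# Hypothetical helper: combine the files in the current directory by 4-character prefix.
combine_by_prefix() {
    local files prefix
    # Collect the input files in an array (assumes at least one *.24o file exists).
    files=( *.24o )
    # Print one name per line, keep the first 4 characters, drop duplicate prefixes,
    # then loop over the unique prefixes.
    printf '%s\n' "${files[@]}" | cut -c1-4 | sort -u |
    while IFS= read -r prefix; do
        # Concatenate every input that starts with this prefix into one output file.
        cat "$prefix"*.24o > "${prefix}_combined"
    done
}
```

Call combine_by_prefix from inside the directory that holds the files. Because it writes with >, each run overwrites the previous *_combined files.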
The printf '%s\n' "${files[@]}" | cut -c1-4 | sort -u pipeline is doing the heavy lifting. First, the printf command prints each element of the files array on a separate line. This gives us the list of file names, and we then select the first 4 characters with cut -c1-4. Note that this assumes simple ASCII file names, no Unicode, so that each character is a single byte. We then pass the list of prefixes through sort -u to remove duplicates, and then feed them to the loop.
I used cat "$prefix"* instead of cat *"$prefix"* as you had in the question, since these are all prefixes and there is nothing to match before them.
In Bash you could do:
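A sketch of such a loop, assuming Bash (the ${f:0:4} substring expansion is Bash-specific). The function name append_by_prefix and the guard that skips leftover *_combined files are my additions, not part of the approach as described:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append each file in the current directory to
# "<first 4 characters of its name>_combined".
append_by_prefix() {
    local f
    # The glob is expanded once, before the loop starts, so outputs created
    # during this run are not visited.
    for f in *; do
        # Skip anything that is not a regular file, and outputs from an earlier run.
        [[ -f $f && $f != *_combined ]] || continue
        # ${f:0:4} is the first 4 characters of the file name.
        cat -- "$f" >> "${f:0:4}_combined"
    done
}
```

Since >> appends, remove any old *_combined files before running it a second time, or their contents will be duplicated.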
Which means: for every file in the current working directory, taken in the order Bash's globbing produces (names are sorted according to the current locale's collating sequence, so digits are compared character by character rather than as numbers), append the contents of the file to a file named after the first 4 characters of the file's name, followed by _combined.
For reference, here's how those filenames would sort:
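One way to check the order yourself is to let the shell expand the glob and print one name per line. This is only a sketch; the function name list_in_glob_order is hypothetical, and the expected output in the comments assumes the directory holds exactly the six names shown in the question:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the *.24o names in the order the shell's glob returns them.
list_in_glob_order() {
    printf '%s\n' *.24o
}

# With the six example names from the question, list_in_glob_order prints:
#   ajac001a00.24o
#   ajac001a15.24o
#   ajac001a30.24o
#   areg001a00.24o
#   areg001a15.24o
#   areg001a30.24o
```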
Which means that files named ajac*, sorted alphanumerically, will be joined into ajac_combined, that files named areg*, sorted alphanumerically, will be joined into areg_combined, and so on and so forth.
If you need to restrict this to filenames ending in .24o, loop over *.24o instead of *.
You can do it in a single-line command:
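A sketch of what such a one-liner could look like, wrapped in a hypothetical function (combine_with_awk) only so that it can be reused; the function body is the one-liner itself. It assumes GNU ls (for -w 1) and file names without spaces or shell metacharacters, because awk splits each line on whitespace and system() hands the string to sh unquoted:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the one-liner; run it inside the directory with the files.
combine_with_awk() {
    # For each listed name ($1), build "cat NAME >> PREFIX_combined" and run it,
    # where PREFIX is the first 4 characters of the name.
    ls -w 1 | awk '{ cmd = "cat " $1 " >> " substr($1, 1, 4) "_combined"; system(cmd) }'
}
```

As with any >> approach, remove old *_combined files before running it again, since ls would list them as inputs too.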
"ls -w 1" will list all files in a single column and only the file name no other details.
The awk command takes each line and uses awk's system function to run cat on that file ($1 = filename), appending it to the file whose name is the same first 4 characters followed by _combined.
The >> in the middle means append: it adds to the file, and creates the file if it does not exist.
Note the single quotes around the whole awk command, and the double quotes around the static text in cmd.