Until now I have only been reading here, but now I have to ask a complicated question. I have been googling like crazy but cannot find an answer, and the solution must be in Bash. (Thank you for ideas like Perl, but unfortunately that is not an option.)
I have a text file where the data is separated by the | character, like this:
DETAIL||||||||||103|line1
DETAIL||||||||||103|line2
DETAIL||||||||||105|line3
DETAIL||||||||||433|line4
DETAIL||||||||||433|line5
I managed to split it into new files by the 11th field (the key) using this:
cat extract_GL2_*.txt | grep DETAIL | awk -F\| '{print>>"SPLIT/"$11".txt"}'
There are two problems with this:
1. I need the file names to be assigned from another file called Company.txt (placed in the parent folder of SPLIT), which maps the values of the key column to names like this (so basically I need to replace the number with something meaningful):
Company.txt:
103|US100E1
104|US100E1
105|US100E1
433|EMEAE1
- As you can see from the example, the target name is not unique: several keys may be merged into one file based on the mapping above. Note: I'd prefer to keep this file without the ".txt" suffix that the output needs, but I am happy to rework Company.txt if that makes the script easier.
A second step that looks up the file name by key, merges the files and deletes the old ones would be possible, but it would be more elegant to do it in the first step by "simply" taking the target file name from the second file. I failed with both methods, but I'm fine with whichever is simpler/faster.
So the split must be based on the value in column 11 of the original file(s), and the file name must come from the second file. There can be several source files; their output must be appended, and a split file may contain more than one key value, depending on the filename assignment.
Company.txt and the extract_GL2* files are in the same folder; the split files need to go to a SPLIT subfolder.
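A minimal sketch of the single-step variant described above, assuming awk is acceptable inside the Bash script (the original command already uses it). It reads Company.txt first to build a key-to-name lookup, then appends every DETAIL row to SPLIT/<name>.txt. The scratch directory and the file name extract_GL2_sample.txt are hypothetical and only there to make the sketch runnable on the sample data; rows whose key is missing from Company.txt are skipped, which is also an assumption.

```shell
#!/bin/bash
# Sketch only: build the sample data in a scratch directory (hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > Company.txt <<'EOF'
103|US100E1
104|US100E1
105|US100E1
433|EMEAE1
EOF

cat > extract_GL2_sample.txt <<'EOF'
DETAIL||||||||||103|line1
DETAIL||||||||||103|line2
DETAIL||||||||||105|line3
DETAIL||||||||||433|line4
DETAIL||||||||||433|line5
EOF

mkdir -p SPLIT

# NR == FNR is true only while the first file (Company.txt) is being read:
# store key -> name. For all later files, append DETAIL rows whose key is
# known to the file named after the mapped value.
awk -F'|' '
    NR == FNR { name[$1] = $2; next }
    $1 == "DETAIL" && ($11 in name) {
        out = "SPLIT/" name[$11] ".txt"
        print >> out
    }
' Company.txt extract_GL2_*.txt
```

With the sample data this should leave lines 1 to 3 in SPLIT/US100E1.txt and lines 4 and 5 in SPLIT/EMEAE1.txt. Because the rows are appended with >>, running it twice duplicates them, so the SPLIT folder would need to be emptied between runs.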
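A little code showing the steps I am after (I could not test it properly):
#!/bin/bash
# Run from the folder that contains Company.txt and the extract_GL2_*.txt files.
mkdir -p SPLIT
cat extract_GL2_*.txt | grep DETAIL | while IFS= read -r line; do
    # Key is the 11th |-separated field of the row
    company=$(printf '%s\n' "$line" | awk -F'|' '{print $11}')
    # First matching name for that key in Company.txt
    newfilename=$(awk -F'|' -v key="$company" '$1 == key {print $2; exit}' Company.txt)
    [ -n "$newfilename" ] || continue    # skip rows whose key is not in Company.txt
    line=${line//\"/ }    # replace chr(34) (double quote) with a space
    line=${line//,/ }     # replace , with a space
    line=${line//|/,}     # replace | with ,
    printf '%s\n' "$line" >> "SPLIT/$newfilename.txt"
done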
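The three character replacements can also be done in awk with gsub, which avoids handling each line in the shell. The sketch below only demonstrates the replacement order on one made-up row; the sample value is hypothetical and not taken from the real extract files.

```shell
#!/bin/bash
# Hypothetical sample row containing a double quote and a comma.
sample='DETAIL|"Acme, Inc"|103|line1'

cleaned=$(printf '%s\n' "$sample" | awk '{
    gsub(/"/, " ")     # chr(34) -> space
    gsub(/,/, " ")     # existing commas -> space
    gsub(/\|/, ",")    # separator | -> ,
    print
}')

printf '%s\n' "$cleaned"
```

The order matters: the existing commas have to be removed before | is turned into a comma. If this is combined with a split on column 11, the key needs to be saved in a variable before the gsub calls, since changing the whole record makes awk re-split the fields and $11 no longer holds the key.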
Many thanks: Tamas