So far I have only been reading, but now I have to start with a complicated question. I have been googling like crazy but can't find an answer, and it must be in Bash. (Thank you for ideas like Perl, but unfortunately that is not an option.)
I have a text file where the data is separated with the | character, like this:
DETAIL||||||||||103|line1
DETAIL||||||||||103|line2
DETAIL||||||||||105|line3
DETAIL||||||||||433|line4
DETAIL||||||||||433|line5
I managed to split it into new files by the key in the 11th field using this:
cat extract_GL2_*.txt | grep DETAIL | awk -F\| '{print>>"SPLIT/"$11".txt"}'
There are two problems with this:
1. I need the file names to be assigned from another file called Company.txt (placed in the parent folder of SPLIT), which maps the values of the key column to names like this (so basically I need to replace the number with something meaningful):
Company.txt:
103|US100E1
104|US100E1
105|US100E1
433|EMEAE1
- As you can see from the example, the target name is not unique: several key values may map to the same name and so be merged into one file. Note: I'd prefer to keep this file without the ".txt" extension that the output needs, but I am happy to rework Company.txt if that makes the script easier.
It is possible to have a second step which finds the filename by key and does the merge, deleting the old files, but it would be more elegant to do it in the first step by "simply" taking the target file name from the second file. I failed with both methods, but I'm fine with whichever is simpler/faster.
So the split must be based on the value in column 11 of the original file(s), and the filename on the second file. There can be several source files, and their lines must be appended; a split file may contain more than one key value, depending on the filename assignment.
Company.txt and extract_GL2* files are in the same folder, the split files need to go to a SPLIT subfolder.
A little code (the parts I am unable to do are only pseudocode, and I can't test the rest either):
#!/bin/bash
while read line; do
company="${line|awk -F\| '{print $11}'}"
newfilename="${cat Company.txt | grep $company | awk -F\| '{print $2}' | head -1}" + ".txt"
_replace chr(34) to space in $line_
_replace , to space in $line_
_replace | to , in $line_
echo "$line" >> "SPLIT\$newfilename.txt"
done < "extract_GL2_*.txt"
Many thanks: Tamas
Place your files in the parent directory of the SPLIT directory, create a script file there following the steps below, and call it my_script.sh, so that this directory holds Company.txt, the extract_GL2_*.txt files, my_script.sh and the SPLIT subdirectory.
IMPORTANT: Files in the SPLIT directory will get created and deleted, so DO NOT PLACE ANY FILES IN THE SPLIT DIRECTORY. Run the script ONLY ONCE, otherwise you will get duplicate entries in the result files.
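The duplicates presumably come from the result files being appended to rather than overwritten. If the script does have to be run again, one way to start clean is sketched below. It assumes that every .txt file in SPLIT is generated output, and reset_split is a hypothetical helper name chosen for this example only.

```shell
#!/bin/bash
# Sketch: empty the SPLIT directory before a re-run so appended lines do not pile up.
# reset_split is a hypothetical helper name; it assumes that every .txt file
# in the given directory is generated output and may be deleted.
reset_split() {
    local dir=${1:-SPLIT}
    # Create the directory if it is missing, so a first run also works.
    mkdir -p "$dir"
    # Remove only the generated .txt files; the directory itself stays.
    rm -f -- "$dir"/*.txt
}
```

Calling `reset_split SPLIT` right before the splitting step would make a second run produce the same files as the first.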
To create and use the script file, please follow these steps:
Create and edit the script file in your parent directory by running the following command in the terminal (cd to the directory first):
nano my_script.sh
Copy and paste the following code into the editor:
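A script for this step could look like the sketch below. It is a minimal version under assumptions, not a tested production script: it does the name lookup in a single awk pass instead of splitting by key first and merging afterwards, `split_by_company` is a name chosen here for illustration, and writing lines with an unmapped key to a file named after the raw key is likewise an assumption.

```shell
#!/bin/bash
# Sketch: write every DETAIL line to SPLIT/<name>.txt, where <name> is looked
# up in Company.txt by the key in field 11. Both files are '|'-separated.
# split_by_company is an illustrative name. The body runs in a subshell
# (note the parentheses), so cd and shopt do not leak into the caller.
split_by_company() (
    cd "$1" || exit 1
    shopt -s nullglob
    inputs=(extract_GL2_*.txt)
    if [[ ! -f Company.txt ]] || (( ${#inputs[@]} == 0 )); then
        echo "Need Company.txt and extract_GL2_*.txt in $1" >&2
        exit 1
    fi
    mkdir -p SPLIT
    awk -F'|' '
        # Company.txt is read first: remember key -> output name.
        FILENAME == "Company.txt" { name[$1] = $2; next }
        # Extract files: keep lines containing DETAIL, like the grep in the question.
        /DETAIL/ {
            # Assumption: keys missing from Company.txt fall back to the key itself.
            target = ($11 in name) ? name[$11] : $11
            out = "SPLIT/" target ".txt"
            # Append, so several keys and several source files share one output file.
            print >> out
        }
    ' Company.txt "${inputs[@]}"
)

# Process the current directory when it contains Company.txt.
if [[ -f Company.txt ]]; then
    split_by_company .
fi
```

With the sample data from the question, lines with keys 103 and 105 would end up in SPLIT/US100E1.txt and lines with key 433 in SPLIT/EMEAE1.txt. Company.txt can stay without the ".txt" extension in its second column, since the script adds it.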
Save the script file and exit the editor by pressing Ctrl + X, then Y, then Enter to confirm the file name.
Make the script file executable by running the following command in the terminal:
chmod +x my_script.sh
Run the script by running the following command in the terminal:
bash my_script.sh
Done
Best of luck