file1.csv
A,,C,D
A,,C,D
A,,C,D
A,,C,D
file2.csv
A,B
A,B
A,B
A,B
desired Output.csv
A,B,C,D
A,B,C,D
A,B,C,D
A,B,C,D
I've tried using "join" and "paste" to no avail. Is there a bash command to do this? Column "A" is the same in both .csv
files.
With only
awk
command:Get a line from file1 and store it into local variable
f1
, then print the line that stored inf1
and finally print the third($3
) and forth($3
) fields from file1 which delimited with comma,
altogether, and change the OFS(output field separator [space by default]) to comma(,
).The short command would be like this:
paste the file2, then cut and paste the third column to the next(
-f3-
) from file1.With
awk
andpaste
(option A)Below command also copies the last two columns (
C,D
) from file1 at the end of each line in file2:Above command paste the file2 content then print a comma delimiter(
-d','
) then paste the two last field(NF
is the index of last field and$NF
is the string which its index isNF
. So$(NF-1)
is the second field before last field ) from file1 when those index redefines or splits with comma spectator(-F','
).With
awk
andpaste
(option B)This command also is the same as above(
$3
and$4
points to third and forth field of each line from file1 ):Or another solution with
cut
command:cut command in above command first cut the first field(
-f1
which indexed with comma delimiter(-d.
)) from file1(cut -d, -f1 file1
), then cut and paste the second field of file2(cut -d, -f2 file2
) and finally cut and paste the third column(-f3
) to the nexts(-
) from file1(cut -d, -f3- file1
) again.This command also returns the same result:
paste the second field from file1(
awk -F',' '{print $1}' file1
) then print a comma(-d,
), then paste the second column from file2(awk -F',' '{print $2}' file2
), finally paste the second and last column of file1(awk -F',' '{print $3","$4}' file1
) again.Here's a beauty (I think):
Broken down in steps:
Step 1. Install csvkit:
Step 2. Use the join command with a comma as separator
Step 3. Feed it the actual columns you want to. Note how you feed it the first column twice, because that is the one the join is actually performed on (default behavior of
join
).or in shorthand:
You can redirect that standard output to a file (desiredOutput) if wanted.
Advantages
This method has several advantages over the others proposed.
First and foremost: it performs a real join. That means that it can be used for more complex data as well. It is very easy to do a join on another field, for instance. It does not simply look at the position of the field, but it really takes the column into consideration. It actually works with the format of the data (csv) and does not treat it like text.
Second, it uses the very powerful csv toolkit which also allows you to a) display statistics with one command (
csvstats)
, b) check whether the data is clean (csvclean
), but also to transform it into json, into sql, or even load it into python! This toolkit is heavily used in data science for data preparation.Here is another beautiful one. I think it is the easiest of all suggestions, thus far.
If you have not installed csvtool already in the past, you have to
sudo apt-get install csvtool
.From the docs:
Note how in our case we are replacing the second columns of the files.
Examples
file1.csv
file2.csv
Combining the two files:
What you essentially do is paste the column two of
file2.csv
as column 2 infile1.csv
.Note that this also works on the same document. If you want to swap two columns, you can do so by using the same file as input.csv and update.vsc.
To move a chosen number of columns from one file to another:
from two files:
file_1
file_2
When you set
cols = 1
:But When you set
cols = 2
:cols = 3
:How to use
Copy it into an empty file, set the path to
file1
,file2
and the number of columns to move, save it asmove.py
and run it by:It is also possible to add one or more columns from the middle of the source file's colums this way.
Another method in python through csv module.
script.py
To run the above script,
Output: