Joining two files with sed/awk, separated by a comma - bash

I have two files.
file1.txt
example1
example2
example3
file2.txt
testing1
testing2
testing3
I am trying to join the values from these two files, line by line, into a new comma-separated file.
Desired output:
example1,testing1
example2,testing2
example3,testing3
Could anyone help me do this with awk or sed?
Thank you.

You can just use paste:
paste -d, file1 file2
example1,testing1
example2,testing2
example3,testing3
Or, you can use awk:
awk -v OFS=, 'FNR==NR{a[++i]=$0; next} {print a[FNR], $0}' file1 file2
example1,testing1
example2,testing2
example3,testing3

This might work for you (GNU sed):
sed 'Rfile2' file1 | sed 'N;y/\n/,/'
The first sed script reads a line from file1 then a line from file2. The second script takes this output and reads two lines at a time replacing the newline between the lines by a comma.
N.B. This expects file1 and file2 to have the same number of lines.
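To see why the same-length caveat matters, here is a small sketch with made-up data in which file2 is one line shorter than file1. It uses the paste command from the first answer (the temporary directory and variable names are only for illustration); paste treats the exhausted file as supplying empty lines, so the last row ends with a bare comma.

```shell
# Illustrative data only: file2 is deliberately one line short.
tmp=$(mktemp -d)
printf '%s\n' example1 example2 example3 > "$tmp/file1"
printf '%s\n' testing1 testing2 > "$tmp/file2"

# paste keeps going until the longest file is exhausted,
# filling the missing field with an empty string.
joined=$(paste -d, "$tmp/file1" "$tmp/file2")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

If a trailing empty field is not acceptable, comparing the line counts of both files (for example with wc -l) before joining is one way to catch the mismatch early.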

You can also use pr instead of the paste command:
[akshay@localhost tmp]$ cat file1
example1
example2
example3
[akshay@localhost tmp]$ cat file2
testing1
testing2
testing3
[akshay@localhost tmp]$ pr -mtJS',' file1 file2
example1,testing1
example2,testing2
example3,testing3

Related

Extracting lines from 2 files using AWK just return the last match

I'm fairly new to awk and I'm trying to print the lines of file2 whose first field exists in file1. I copied examples that I found here exactly, but I don't know why it prints only the last match from file1.
File1
58000
72518
94850
File2
58000;123;abc
69982;456;rty
94000;576;ryt
94850;234;wer
84850;576;cvb
72518;345;ert
Result Expected
58000;123;abc
94850;234;wer
72518;345;ert
What I'm getting
94850;234;wer
awk -F';' 'NR==FNR{a[$1]++; next} $1 in a' file1 file2
What am I doing wrong?
awk, while usable here, isn't the best tool for the job; grep with the -f option is. The -f file option reads the patterns from file, one per line, and searches the input for matches. Note that the patterns are matched anywhere in the line, not only in the first field.
So in your case you want:
$ grep -f file1 file2
58000;123;abc
94850;234;wer
72518;345;ert
(note: I removed the trailing '\' from the data file, replace it if it wasn't a typo)
Using awk
If you did want to reproduce what grep is doing in awk, that is fairly simple. Read the contents of file1 into an array, then, for each record of the second file, check whether field 1 is in the array; if so, print the record (the default action), e.g.
$ awk -F';' 'FNR==NR {a[$1]=1; next} $1 in a' file1 file2
58000;123;abc
94850;234;wer
72518;345;ert
(same note about the trailing slash)
Thanks @RavinderSingh13!
file1 really did have some hidden characters, which I could see using cat -v.
$ cat -v file1
58000^M
72518^M
94850^M
I removed them using sed -e "s/\r//g" file1 and the awk command worked perfectly.
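For reference, the whole fix can be sketched end to end as below. This is only an illustration under a few assumptions: the sample data is recreated with printf, the carriage returns are stripped with tr instead of sed (either should work), and the name file1.clean is made up for the example.

```shell
# Recreate the situation: file1 has DOS line endings (\r\n), file2 does not.
tmp=$(mktemp -d)
printf '58000\r\n72518\r\n94850\r\n' > "$tmp/file1"
printf '%s\n' '58000;123;abc' '69982;456;rty' '94000;576;ryt' \
              '94850;234;wer' '84850;576;cvb' '72518;345;ert' > "$tmp/file2"

# Strip the carriage returns into a cleaned copy (hypothetical file name).
tr -d '\r' < "$tmp/file1" > "$tmp/file1.clean"

# Same awk idiom as above: remember the keys from the first file,
# then print the records of the second file whose field 1 is a known key.
matches=$(awk -F';' 'FNR==NR {a[$1]; next} $1 in a' "$tmp/file1.clean" "$tmp/file2")
printf '%s\n' "$matches"

rm -rf "$tmp"
```

The matching records come out in file2 order, since file2 is the file being scanned.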

Shell script for merging dotenv files with duplicate keys

Given two dotenv files,
# file1
FOO="X"
BAR="B"
and
# file2
FOO="A"
BAZ="C"
I want to run
$ ./merge.sh file1.env file2.env > file3.env
to get the following output:
# file3
FOO="A"
BAR="B"
BAZ="C"
So far, I used the python-dotenv module to parse the files into dictionaries, merge them and write them back. However, I feel like there should be a simple solution in shell that rids myself of a third-party module for such a basic task.
Answer
Alright, so I ended up using
$ sort -u -t '=' -k 1,1 file1 file2 | grep -v '^$\|^\s*\#' > file3
which omits blank lines and comments. Nevertheless, the proposed awk solution works just as well.
Another quite simple approach is to use sort:
sort -u -t '=' -k 1,1 file1 file2 > file3
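results in a file where the keys from file1 take precedence over the keys from file2 (with GNU sort, -u keeps the first line of each group with equal keys, so list file2 first if its values should win).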
Using a simple awk script:
awk -F= '{a[$1]=$2}END{for(i in a) print i "=" a[i]}' file1 file2
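This stores each key's value in the array a and prints the array contents once both files have been read.
Keys in file2 override those in file1. Note that for (i in a) visits the keys in an unspecified order, and that $2 holds only the text up to the next "=", so a value that itself contains "=" is cut short.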
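A variation on the awk idea, as a rough sketch rather than a definitive implementation: it keeps keys in first-seen order, splits each line at the first "=" only, and skips comments and blank lines. The function name merge_env and the sample files are made up for this example, and it assumes simple one-line KEY=VALUE entries.

```shell
# merge_env FILE...: later files override earlier ones (hypothetical helper).
merge_env() {
  awk '
    /^[ \t]*(#|$)/ { next }                   # skip comments and blank lines
    {
      eq = index($0, "=")                     # position of the first "="
      if (eq == 0) next                       # ignore lines without a key
      key = substr($0, 1, eq - 1)
      if (!(key in val)) order[++n] = key     # remember first-seen order
      val[key] = substr($0, eq + 1)           # the last assignment wins
    }
    END { for (i = 1; i <= n; i++) print order[i] "=" val[order[i]] }
  ' "$@"
}

# Sample data matching the question.
tmp=$(mktemp -d)
printf '%s\n' '# file1' 'FOO="X"' 'BAR="B"' > "$tmp/file1.env"
printf '%s\n' '# file2' 'FOO="A"' 'BAZ="C"' > "$tmp/file2.env"

merged=$(merge_env "$tmp/file1.env" "$tmp/file2.env")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

Saved as merge.sh with a final line calling merge_env "$@", this could be run as in the question: ./merge.sh file1.env file2.env > file3.env.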
To add only the new keys from file2 without overwriting the values already in file1, first append the non-blank lines of file2 to file1 (this modifies file1), then keep the first occurrence of each key:
grep "\S" file2 >> file1
awk -F "=" '!a[$1]++' file1 > file3

Diff to get changed line from second file

I have two files file1 and file2. I want to print the new line added to file2 using diff.
file1
/root/a
/root/b
/root/c
/root/d
file2
/root/new
/root/new_new
/root/a
/root/b
/root/c
/root/d
Expected output
/root/new
/root/new_new
I looked at the man page but found no information on this.
If you don't need to preserve the order, you could use the comm command like:
comm -13 <(sort file1) <(sort file2)
comm compares two sorted files and prints three columns of output: first the lines unique to file1, then the lines unique to file2, then the lines common to both. You can suppress any of the columns, so in this example we turn off columns 1 and 3 with -13 and see only the lines unique to the second file.
or you could use grep:
grep -wvFf file1 file2
Here we use -f to have grep get its patterns from file1. We then tell it to treat them as fixed strings with -F instead of as patterns, match whole words with -w, and print only lines with no matches with -v
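One thing to watch with -w is that "/" is not a word character, so a pattern such as /root/a also counts as a whole-word match inside /root/a/b. If the intent is to compare complete lines, -x (match the whole line) may be the safer choice. The sketch below uses made-up data (the extra /root/a/b entry is not in the question) to show the difference.

```shell
# Illustrative data: file2 contains a path that merely starts with /root/a.
tmp=$(mktemp -d)
printf '%s\n' /root/a /root/b > "$tmp/file1"
printf '%s\n' /root/new /root/a /root/a/b /root/b > "$tmp/file2"

# -x: a line is dropped only if it equals a pattern exactly.
with_x=$(grep -vxFf "$tmp/file1" "$tmp/file2")
# -w: /root/a/b is dropped too, because /root/a matches as a whole word in it.
with_w=$(grep -wvFf "$tmp/file1" "$tmp/file2")

printf 'with -x:\n%s\n' "$with_x"
printf 'with -w:\n%s\n' "$with_w"

rm -rf "$tmp"
```

Both variants keep the lines in the order they appear in file2.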
The following awk command may help. It prints all lines that are present in Input_file2 but not in Input_file1.
awk 'FNR==NR{a[$0];next} !($0 in a)' Input_file1 Input_file2
Try using a combination of diff and sed.
The raw diff output is:
$ diff file1 file2
0a1,2
> /root/new
> /root/new_new
Add sed to strip out everything but the lines beginning with ">":
$ diff file1 file2 | sed -n -e 's/^> //p'
/root/new
/root/new_new
This preserves the order. Note that it also assumes you are only adding lines to the second file.

How to search for a pattern having the special characters in awk

I have to match file1 against file2 line by line, but file1 is in the format below. If I use an awk command to search file2 with one of these lines, it throws a syntax error at '='.
File1 :
Country_code=US/base_div_nbr=18/retail_channel_code=1/visit_date=2010-01-02/load_time_stamp=20100102058100
Country_code=US/base_div_nbr=18/retail_channel_code=1/visit_date=2010-01-02/load_time_stamp=20100102091000
Country_code=US/base_div_nbr=18/retail_channel_code=1/visit_date=2010-01-02/load_time_stamp=20100102067000
File2:
Country_code=US/base_div_nbr=18/retail_channel_code=1/visit_date=2010-01-02/load_time_stamp=20100102058100
Country_code=US/base_div_nbr=18/retail_channel_code=1/visit_date=2010-01-02/load_time_stamp=20100102091000
I took each whole line from file1 as the search pattern and searched file2 with the command below:
awk "/$line/ {print ;}" file2
Here the third record of file1 is not found in file2, so I need to find these differences.
I am very new to shell scripting, so please advise.
This is really a job for comm, assuming you can sort both input files, but if you want to use awk something like this might do it depending on your unstated requirements:
awk 'NR==FNR {file1[NR]=$0; next} $0 != file1[FNR]' file1 file2
If I understand correctly, you want to print the lines that are common to both files. In that case, awk is really not the best tool. You could instead use either
comm -12 <(sort file1) <(sort file2)
or
grep -Fxf file1 file2
If you really want to do it with awk, you could try
awk 'FNR==NR{a[$0]; next} $0 in a' file1 file2
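The syntax error in the question comes from pasting a line full of "/" characters into an awk /regex/. A minimal sketch of two ways around that, assuming the data contains no backslashes (awk's -v option interprets backslash escapes): compare whole lines as plain strings, either through an array or through a variable passed with -v. The variable names prefix, missing, found and want are made up for the example.

```shell
# Recreate the sample data; prefix is just a shorthand for this example.
tmp=$(mktemp -d)
prefix='Country_code=US/base_div_nbr=18/retail_channel_code=1/visit_date=2010-01-02/load_time_stamp='
printf '%s\n' "${prefix}20100102058100" "${prefix}20100102091000" \
              "${prefix}20100102067000" > "$tmp/file1"
printf '%s\n' "${prefix}20100102058100" "${prefix}20100102091000" > "$tmp/file2"

# Lines of file1 that do not occur in file2: file2 is read first into an
# array keyed by the whole line, so no regex is involved at all.
missing=$(awk 'FNR==NR {seen[$0]; next} !($0 in seen)' "$tmp/file2" "$tmp/file1")
printf 'missing from file2:\n%s\n' "$missing"

# Looking up a single line: pass it with -v and compare as a string.
# (These values are not numeric, so == is a string comparison.)
line="${prefix}20100102058100"
found=$(awk -v want="$line" '$0 == want' "$tmp/file2")
printf 'found in file2:\n%s\n' "$found"

rm -rf "$tmp"
```

With this data, only the third record of file1 is reported as missing.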

UNIX - Simple merging of two files as in the input

Input File1:
HELLO
HOW
Input File2:
ARE
YOU
output file should be
HELLO
HOW
ARE
YOU
My input files will be in one folder and my script has to fetch the input files from that folder and merge all the files as in the above given order.
Thanks
You can simply use cat as shown below:
cat file1 file2
or, to concatenate all files in a folder (assuming there are not too many):
cat folder/*
sed '' file1 file2
Hope this works fine.
cat:
cat file1 file2 >output
perl:
perl -plne '' file1 file2 >output
awk:
awk '1' file1 file2 >output
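For the "all files in one folder" part of the question, here is a small sketch under a couple of assumptions: the folder contains only the input files, and the output is written somewhere outside it, so that the output file is not picked up by the glob itself. The names dir, out and merged are illustrative.

```shell
# Illustrative setup: a folder holding the two input files.
dir=$(mktemp -d)
out=$(mktemp)
printf '%s\n' HELLO HOW > "$dir/file1"
printf '%s\n' ARE YOU > "$dir/file2"

# The shell expands * in sorted order, so file1 comes before file2.
cat "$dir"/* > "$out"
merged=$(cat "$out")
printf '%s\n' "$merged"

rm -rf "$dir" "$out"
```

If the files need a different order than the sorted glob gives, listing them explicitly on the cat command line is the simplest fix.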
