Shell script for merging dotenv files with duplicate keys - shell

Given two dotenv files,
# file1
FOO="X"
BAR="B"
and
# file2
FOO="A"
BAZ="C"
I want to run
$ ./merge.sh file1.env file2.env > file3.env
to get the following output:
# file3
FOO="A"
BAR="B"
BAZ="C"
So far, I have used the python-dotenv module to parse the files into dictionaries, merge them and write them back. However, I feel there should be a simple shell solution that removes the need for a third-party module for such a basic task.
Answer
Alright, so I ended up using
$ sort -u -t '=' -k 1,1 file1 file2 | grep -v '^$\|^\s*\#' > file3
which omits blank lines and comments. Nevertheless, the proposed awk solution works just as well.

Another quite simple approach is to use sort:
sort -u -t '=' -k 1,1 file1 file2 > file3
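This results in a file where the keys from file1 take precedence over the keys from file2 (with GNU sort, -u keeps the first line of each run of equal keys). To let file2 override file1, as asked in the question, list file2 first. Note that the output is sorted by key rather than kept in the original order.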

Using a simple awk script:
awk -F= '{a[$1]=$2}END{for(i in a) print i "=" a[i]}' file1 file2
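This stores every key/value pair in the array a and prints the array once both files have been read.
The keys that are in file2 override the ones in file1. Note that for (i in a) iterates in an unspecified order, and that a value which itself contains = is truncated, because only $2 is stored.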
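If output order or values containing = matter, the one-liner above can be extended. Here is a minimal sketch, assuming later files should override earlier ones and that comments and blank lines can be dropped. The function name merge_env is just a hypothetical helper for illustration, and the sample files are the ones from the question.

```shell
#!/bin/sh
# Sketch: merge dotenv files so that later files override earlier ones,
# while keys keep the order in which they were first seen.
# merge_env is a hypothetical helper name, not part of any existing tool.
merge_env() {
    awk '
        /^[ \t]*(#|$)/ { next }                # skip comments and blank lines
        {
            eq = index($0, "=")                # split on the first "=" only
            if (eq == 0) next                  # ignore lines without a key
            key = substr($0, 1, eq - 1)
            if (!(key in val)) order[++n] = key
            val[key] = substr($0, eq + 1)      # later occurrences overwrite
        }
        END { for (i = 1; i <= n; i++) print order[i] "=" val[order[i]] }
    ' "$@"
}

# Sample data from the question, written to a temporary directory
dir=$(mktemp -d)
printf '%s\n' '# file1' 'FOO="X"' 'BAR="B"' > "$dir/file1.env"
printf '%s\n' '# file2' 'FOO="A"' 'BAZ="C"' > "$dir/file2.env"

merge_env "$dir/file1.env" "$dir/file2.env" > "$dir/file3.env"
cat "$dir/file3.env"
```

Splitting with index() instead of -F= keeps everything after the first = as the value, so a line like URL="a=b" survives intact.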

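To add only the new keys from file2 without overwriting the values already in file1, first append the non-blank lines of file2 to file1, then keep the first occurrence of each key. Note that this modifies file1.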
grep "\S" file2 >> file1
awk -F "=" '!a[$1]++' file1 > file3
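The same "first occurrence wins" idea can be done in one pass without appending to file1. This is only a sketch under the assumption that the keys contain no surrounding whitespace; the sample data is the data from the question plus one blank line.

```shell
#!/bin/sh
# Sketch: keep the first occurrence of each key across both files,
# leaving file1 untouched.
dir=$(mktemp -d)
printf '%s\n' 'FOO="X"' 'BAR="B"' > "$dir/file1"
printf '%s\n' 'FOO="A"' '' 'BAZ="C"' > "$dir/file2"

# /[^ \t]/ skips blank and whitespace-only lines;
# !seen[$1]++ is true only the first time a key is encountered.
awk -F= '/[^ \t]/ && !seen[$1]++' "$dir/file1" "$dir/file2" > "$dir/file3"
cat "$dir/file3"
```

Because file1 is read first, its values win and file2 only contributes keys that file1 lacks.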

Related

Extracting lines from 2 files using AWK just return the last match

I'm a bit new to AWK and I'm trying to print the lines of file2 whose first field exists in file1. I copied examples that I found here exactly, but I don't know why it prints only the match for the last line of file1.
File1
58000
72518
94850
File2
58000;123;abc
69982;456;rty
94000;576;ryt
94850;234;wer
84850;576;cvb
72518;345;ert
Result Expected
58000;123;abc
94850;234;wer
72518;345;ert
What I'm getting
94850;234;wer
awk -F';' 'NR==FNR{a[$1]++; next} $1 in a' file1 file2
What am I doing wrong?
awk, while usable here, isn't the best tool for the job; grep with the -f option is. The -f file option reads patterns from file, one per line, and searches the input for lines matching any of them. Note that the patterns match anywhere on the line, not only in the first field.
So in your case you want:
$ grep -f file1 file2
58000;123;abc
94850;234;wer
72518;345;ert
(note: I removed the trailing '\' from the data file, replace it if it wasn't a typo)
Using awk
If you did want to rewrite what grep is doing using awk, that is fairly simple. Read the contents of file1 into an array, then for each record of the second file check whether field 1 is in the array. If so, print the record (the default action), e.g.
$ awk -F';' 'FNR==NR {a[$1]=1; next} $1 in a' file1 file2
58000;123;abc
94850;234;wer
72518;345;ert
(same note about the trailing backslash)
Thanks @RavinderSingh13!
file1 really did have some hidden characters (carriage returns), which I could see using cat -v:
$ cat -v file1
58000^M
72518^M
94850^M
I removed them using sed -e "s/\r//g" file1 and then the awk command worked perfectly.
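As an alternative to cleaning the file first, the carriage returns can be stripped inside awk itself. The following is a sketch, assuming the only hidden characters are a trailing \r on each line; it recreates the sample data with DOS line endings in a temporary directory.

```shell
#!/bin/sh
# Sketch: tolerate CRLF line endings by removing a trailing \r
# from every record before comparing fields.
dir=$(mktemp -d)
printf '58000\r\n72518\r\n94850\r\n' > "$dir/file1"
printf '%s\n' '58000;123;abc' '69982;456;rty' '94000;576;ryt' \
              '94850;234;wer' '84850;576;cvb' '72518;345;ert' > "$dir/file2"

result=$(awk -F';' '
    { sub(/\r$/, "") }              # strip a trailing carriage return, if any
    FNR == NR { keys[$1]; next }    # first file: remember each key
    $1 in keys                      # second file: print lines whose key is known
' "$dir/file1" "$dir/file2")
printf '%s\n' "$result"
```

Modifying $0 with sub() makes awk re-split the fields, so $1 no longer carries the invisible \r.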

How to compare entries in one file to two files?

I have a file (named file1) which consists of names and their IPs. It looks something like this :-
VM1 2.33.4.22
VM2 88.43.21.34
VM3 120.3.45.66
VM4 99.100.34.5
VM5 111.3.4.66
I also have two files (file2 and file3) which consist solely of IPs.
File 2 consists of:-
120.3.45.66
88.43.21.34
File 3 consists of :-
99.100.34.5
I want to compare file1 to file2 and file3 and get the names and IPs that are not present in file2 and file3. So output would be:
VM1 2.33.4.22
VM5 111.3.4.66
How can I get the desired output?
sed 's/\./\\./g; s/.*/ &$/' file2 file3 | grep -vf - file1
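Use sed to turn the entries in files 2 and 3 into appropriate regexes: the first command escapes the dots, and the second wraps each IP as " IP$" so that it only matches as the last field of a line.
Pipe this regex list to grep, with -f - to read the pattern list from standard input and -v to print the non-matching lines of file1.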
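You can write a shell script that will do it for you (it assumes the files are named file1.txt, file2.txt and file3.txt):
#!/bin/sh
# Combine both IP lists into one pattern file
cat "$1.txt" "$2.txt" > mergedFile.txt
# -F: fixed strings (dots are literal), -w: whole words only, -v: print lines that do not match
grep -v -w -F -f mergedFile.txt "$3.txt"
You can run the script with the following command:
sh check.sh file2 file3 file1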
awk 'NR==FNR { out[$1]=1; next} !out[$2]' <(/bin/cat file2 file3) file1
This does basically the same thing as the sed solution, using awk instead. The <(...) process substitution requires bash or another shell that supports it.
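If process substitution is not available, the combined IP list can be piped in and read as - (standard input). This is a sketch under the assumption that the combined list is never empty, since NR == FNR would otherwise also be true for file1.

```shell
#!/bin/sh
# Sketch: same lookup as above, but using a pipe instead of <(...).
dir=$(mktemp -d)
printf '%s\n' 'VM1 2.33.4.22' 'VM2 88.43.21.34' 'VM3 120.3.45.66' \
              'VM4 99.100.34.5' 'VM5 111.3.4.66' > "$dir/file1"
printf '%s\n' '120.3.45.66' '88.43.21.34' > "$dir/file2"
printf '%s\n' '99.100.34.5' > "$dir/file3"

# "-" makes awk read the piped IP list first; file1 is then filtered against it.
result=$(cat "$dir/file2" "$dir/file3" |
    awk 'NR == FNR { seen[$1]; next } !($2 in seen)' - "$dir/file1")
printf '%s\n' "$result"
```

Using ($2 in seen) instead of !seen[$2] avoids creating empty array entries for every IP that is looked up.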

Diff to get changed line from second file

I have two files file1 and file2. I want to print the new line added to file2 using diff.
file1
/root/a
/root/b
/root/c
/root/d
file2
/root/new
/root/new_new
/root/a
/root/b
/root/c
/root/d
Expected output
/root/new
/root/new_new
I looked in the man page but could not find any information on this.
If you don't need to preserve the order, you could use the comm command like:
comm -13 <(sort file1) <(sort file2)
comm compares two sorted files and prints three columns of output: lines unique to file1, lines unique to file2, and lines common to both. You can suppress any column, so -13 turns off columns 1 and 3 and leaves only the lines unique to the second file.
or you could use grep:
grep -wvFf file1 file2
Here we use -f to have grep get its patterns from file1. We then tell it to treat them as fixed strings with -F instead of as patterns, match whole words with -w, and print only lines with no matches with -v
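One caveat with -w is that a path such as /root/a also counts as a whole-word match inside a longer path. If the lines should be compared in full, -x (match the entire line) may be the safer choice. The sketch below shows the difference; the line /root/a/sub is a hypothetical addition to the sample data.

```shell
#!/bin/sh
# Sketch: compare -w (whole word) with -x (whole line) for fixed-string patterns.
dir=$(mktemp -d)
printf '%s\n' /root/a /root/b > "$dir/file1"
printf '%s\n' /root/new /root/a/sub /root/a /root/b > "$dir/file2"

# -w: /root/a/sub is dropped because it contains /root/a as a whole word
with_w=$(grep -wvFf "$dir/file1" "$dir/file2")
# -x: only lines identical to a line of file1 are dropped
with_x=$(grep -xvFf "$dir/file1" "$dir/file2")

printf 'with -w:\n%s\n' "$with_w"
printf 'with -x:\n%s\n' "$with_x"
```

For the data in the question both variants give the same result; they only differ when one path is a prefix of another.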
The following awk command may help. It prints all lines that are present in Input_file2 but not in Input_file1.
awk 'FNR==NR{a[$0];next} !($0 in a)' Input_file1 Input_file2
Try using a combination of diff and sed.
The raw diff output is:
$ diff file1 file2
0a1,2
> /root/new
> /root/new_new
Add sed to strip out everything but the lines beginning with ">":
$ diff file1 file2 | sed -n -e 's/^> //p'
/root/new
/root/new_new
This preserves the order. Note that it also assumes you are only adding lines to the second file.

Compare file1 and file2 but show only new lines which are not in file2

I am currently struggling with the task to compare two files. Both files have values which have differences and new lines. Example:
file1:
Germany=Munich
Swiss=Bern
Austria=Wien
Italy=Rom
file2:
Germany=Berlin
Swiss=Bern
Italy=Rom
The result of my action should be the following:
outputfile:
Austria=Wien
How can I write to my output file only those lines of file1 whose key does not appear in file2 at all? I am not interested in lines that merely differ, just in complete lines that are missing.
I already experimented with diff and sdiff but without the desired results.
thanks
This should work:
awk -F= 'NR==FNR{a[$1]=$0;next}!($1 in a)' file2 file1
Austria=Wien
We read all of file2 first, indexing each line by country. Then, for each line of file1, we print it if its country is not present in file2. This won't give you the lines that are in file2 but not in file1, but it can be adjusted to give you that as well. I am not sure whether that is your requirement; if it is, please update your question to cover all your use cases so you can get a more complete answer.
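To illustrate the adjustment mentioned above, swapping the order of the two files reverses the direction of the check. This is only a sketch; France=Paris is a hypothetical extra line added to file2 so that the reverse direction has something to print.

```shell
#!/bin/sh
# Sketch: report keys missing from either side by swapping the file order.
dir=$(mktemp -d)
printf '%s\n' 'Germany=Munich' 'Swiss=Bern' 'Austria=Wien' 'Italy=Rom' > "$dir/file1"
printf '%s\n' 'Germany=Berlin' 'Swiss=Bern' 'Italy=Rom' 'France=Paris' > "$dir/file2"

# The first file named supplies the known keys; the second file is filtered.
only_in_file1=$(awk -F= 'NR == FNR { seen[$1]; next } !($1 in seen)' "$dir/file2" "$dir/file1")
only_in_file2=$(awk -F= 'NR == FNR { seen[$1]; next } !($1 in seen)' "$dir/file1" "$dir/file2")

printf 'only in file1: %s\n' "$only_in_file1"
printf 'only in file2: %s\n' "$only_in_file2"
```

Germany is not reported in either direction, because only the keys are compared and not the values.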
If you don't care about ordering, you can sort the files and then use join:
sort file1 > file1.srt
sort file2 > file2.srt
join -t'=' -v1 file1.srt file2.srt
The flags tell join to use the equals sign as the field separator and, with -v1, to print only the unpairable lines from file1.srt instead of the joined lines.
This might work for you (GNU sed):
sed -r 's#([^=]*=).*#/^\1/d#' file2 | sed -f - file1
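This uses file2 as the basis for a sed script: each line such as Germany=Berlin becomes the command /^Germany=/d. That script is then run against file1, deleting every line whose key also appears in file2.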

Unix: One line bash command to merge 3 files together. extracting only the first line of each

I am having a hard time with my syntax here:
I have 3 files with various content file1 file2 file3 (100+ lines). I am trying to merge them together, but only the first line of each file should be merged. The point is to do it using one line of bash code:
sed -n 1p file1 file2 file3 returns only the first line of file1
You might want to try
head -n1 -q file1 file2 file3
It's not clear whether by merge you mean concatenate or join.
In awk by joining (each first line in the files printed side by side):
$ awk 'FNR==1{printf "%s ",$0}' file1 file2 file3
1 2 3
In awk by concatenating (each first line in the files printed one after another):
$ awk 'FNR==1' file1 file2 file3
1
2
3
I suggest you use head as explained in themel's answer. However, if you insist on using sed, you cannot simply pass all the files to it: they are implicitly concatenated, so you lose the information about which line is the first line of each file. If you really want to do it in sed, you need bash to help you out:
for f in file1 file2 file3; do sed -n 1p "$f"; done
You can avoid calling external processes by using the read built-in command:
for f in file1 file2 file3; do IFS= read -r l < "$f"; printf '%s\n' "$l"; done > merged.txt
