I need to get the unique lines when comparing 2 files. The files use ":" as a field separator, and it should be treated as the end of the line when comparing strings.
The file1 contains these lines
apple:tasty
apple:red
orange:nice
kiwi:awesome
kiwi:expensive
banana:big
grape:green
orange:oval
banana:long
The file2 contains these lines
orange:nice
banana:long
The output file should be (2 occurrences of orange and 2 occurrences of banana deleted)
apple:tasty
apple:red
kiwi:awesome
kiwi:expensive
grape:green
So only the strings before : should be compared.
Is it possible to complete this task in 1 command?
I tried to do it this way, but the field separator has no effect here:
awk -F: 'FNR==NR {a[$0]++; next} !a[$0]' file1 file2 > outputfile
You basically had it, but $0 refers to the whole line when you want to deal with only the first field, which is $1.
Also you need to take care with the order of the input files. To use the values from file2 for deciding which lines to include from file1, process file2 first:
$ awk -F: 'FNR==NR {a[$1]++; next} !a[$1]' file2 file1
apple:tasty
apple:red
kiwi:awesome
kiwi:expensive
grape:green
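A minimal, self-contained sketch of the same idea, assuming the sample data from the question is written to a scratch directory. It uses the membership test ($1 in seen) instead of !a[$1]; this is a variation, not the command above, and the array name seen is only illustrative. The difference is that "in" does not create an empty array entry for every key it looks up.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' apple:tasty apple:red orange:nice kiwi:awesome kiwi:expensive \
    banana:big grape:green orange:oval banana:long > file1
printf '%s\n' orange:nice banana:long > file2

# Pass 1 (file2): record each first field as an array key; no value is needed.
# Pass 2 (file1): print the lines whose first field was never recorded.
# "in" tests membership without adding the key to the array.
result=$(awk -F: 'NR==FNR {seen[$1]; next} !($1 in seen)' file2 file1)
printf '%s\n' "$result"
```

Like the original, this relies on NR==FNR being true only for the first file, so it assumes file2 is not empty.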
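One comment: awk keeps every key in an in-memory array, which can become a problem with very large files. In that case a sort-based approach may work better, something like:
comm -23 <(cut -d : -f 1 f1 | sort -u) <(cut -d : -f 1 f2 | sort -u) | sed 's/.*/^&:/' | grep -f /dev/stdin f1
Here comm -23 keeps the keys that occur only in f1, and sed anchors each key as ^key: so that it matches the first field only. The keys are treated as regular expressions, so this assumes they contain no regex metacharacters.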
I'm trying to compare two text files and keep only the lines that are unique. The problem is that I want to compare the lines only on the first word, the one that ends with the ; character, because that is all I'm interested in.
This is the example line:
000000423B;Name;26.46;32.55;0;06;pc.
I need to find out whether 000000423B also appears in the other text file and, if it does not, display the line or save it to a file.
awk 'NR == FNR {exclude[$0]; next} !($0 in exclude)' 1.txt 2.txt
and
grep -xvFf 1.txt 2.txt > 3.txt
Both give nice results, but they compare the whole line and I need to compare only up to the first ; character.
Any idea?
My input
1.txt:
000000423B;Name;27.47;33.79;0;06;szt.
000010001;Name2;4.42;5.44;0;08;szt.
000010001D;Name3;1.68;2.06;0;06;szt.
2.txt
000000423B;Name;97.47;33.79;0;06;szt.
000010001;Name2;4.99;5.44;0;08;szt.
000010001D;Name3:8778;1.68;2.06;0;06;szt.
009999999;Name4:99999;1.68;2.06;0;96;szt.
I want to get this result:
009999999;Name4:99999;1.68;2.06;0;96;szt.
In 1.txt and 2.txt the first three lines have the same "product id" but a different price, which I do not care about. I only need to find the new "product id" values; the id is everything up to the first " ; " character.
To fix your command and make it compare only the first column, you can do this:
awk -F';' 'NR == FNR {exclude[$1]; next} !($1 in exclude)' 1.txt 2.txt
join -t';' -11 -21 -v1 -v2 <(sort 1.txt) <(sort 2.txt)
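sort both files. Note that I didn't specify -t';' -k1,1 for sort. Whole-line sorting can order lines differently from the first field alone (depending on the locale and on ids that are prefixes of other ids, such as 000010001 and 000010001D), and join expects its input sorted on the join field, so sort -t';' -k1,1 is the safer choice.
join the sorted files
-t';' uses ; as the separator
-11 -21 joins on the first field of both files
-v1 -v2 prints the unmatched lines from the first and the second file. Actually -v2 alone would be enough; I don't know whether you are also interested in the unmatched lines from the first file. If not, remove -v1.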
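As a hedged sketch of the join approach with the sort key made explicit, the script below writes the sample data to a scratch directory, sorts both files on the first ;-separated field under a fixed collation (LC_ALL=C is an assumption made here so that sort and join agree on key order), and prints only the lines of 2.txt whose id is new. The intermediate file names 1.sorted and 2.sorted are illustrative.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > 1.txt <<'EOF'
000000423B;Name;27.47;33.79;0;06;szt.
000010001;Name2;4.42;5.44;0;08;szt.
000010001D;Name3;1.68;2.06;0;06;szt.
EOF
cat > 2.txt <<'EOF'
000000423B;Name;97.47;33.79;0;06;szt.
000010001;Name2;4.99;5.44;0;08;szt.
000010001D;Name3:8778;1.68;2.06;0;06;szt.
009999999;Name4:99999;1.68;2.06;0;96;szt.
EOF

# Use one collation for both sort and join so they agree on key order.
LC_ALL=C
export LC_ALL

# Sort on the first ;-separated field only, which is the field join compares.
sort -t';' -k1,1 1.txt > 1.sorted
sort -t';' -k1,1 2.txt > 2.sorted

# -v2 prints only the lines of the second file whose key has no match.
result=$(join -t';' -v2 1.sorted 2.sorted)
printf '%s\n' "$result"
```

Temporary files are used instead of <( ... ) so that the sketch also runs in a plain POSIX sh.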
I have 2 files with an identical structure and the content below:
File1:
1,Abhi,Ban,20180921T09:09:01,EmpId1,SalaryX
4,Bbhi,Dan,20180922T09:09:03,EmpId2,SalaryY
7,Cbhi,Ean,20180923T09:09:05,EmpId3,SalaryZ
9,Dbhi,Fan,20180924T09:09:09,EmpId4,SalaryQ
File2:
11,Ebhi,Gan,20180922T09:09:02,EmpId5,SalaryA
12,Fbhi,Han,20180923T09:09:04,EmpId6,SalaryB
3,Gbhi,Ian,20180924T09:09:06,EmpId7,SalaryC
5,Hbhi,Jan,20180925T09:09:08,EmpId8,SalaryD
I want to merge all of File1's content into File2 (ordered by the date, ascending)
Outcome:
1,Abhi,Ban,20180921T09:09:01,EmpId1,SalaryX
11,Ebhi,Gan,20180922T09:09:02,EmpId5,SalaryA
4,Bbhi,Dan,20180922T09:09:03,EmpId2,SalaryY
12,Fbhi,Han,20180923T09:09:04,EmpId6,SalaryB
7,Cbhi,Ean,20180923T09:09:05,EmpId3,SalaryZ
3,Gbhi,Ian,20180924T09:09:06,EmpId7,SalaryC
9,Dbhi,Fan,20180924T09:09:09,EmpId4,SalaryQ
5,Hbhi,Jan,20180925T09:09:08,EmpId8,SalaryD
You can use the awk construct below to do this:
awk -F "," '{print $4, $0}' f1.txt f2.txt | sort | awk '{print $2}'
Explanation:
Prefix every line ($0) of both files with its date column ($4), separated by a space.
Sort the result, then print $2, which is the whole original line (this relies on the lines containing no spaces).
The printed lines are therefore in sorted order by date.
f1.txt and f2.txt are the two file names.
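An alternative worth considering is to let sort pick the key itself, without decorating the lines first. The sketch below is an assumption-laden illustration, not the answer's method: it assumes the timestamps are fixed-width (yyyymmddThh:mm:ss), so that plain string order equals chronological order, and it sets LC_ALL=C to get byte-wise comparison.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > f1.txt <<'EOF'
1,Abhi,Ban,20180921T09:09:01,EmpId1,SalaryX
4,Bbhi,Dan,20180922T09:09:03,EmpId2,SalaryY
7,Cbhi,Ean,20180923T09:09:05,EmpId3,SalaryZ
9,Dbhi,Fan,20180924T09:09:09,EmpId4,SalaryQ
EOF
cat > f2.txt <<'EOF'
11,Ebhi,Gan,20180922T09:09:02,EmpId5,SalaryA
12,Fbhi,Han,20180923T09:09:04,EmpId6,SalaryB
3,Gbhi,Ian,20180924T09:09:06,EmpId7,SalaryC
5,Hbhi,Jan,20180925T09:09:08,EmpId8,SalaryD
EOF

# -t, sets the field separator; -k4,4 restricts the sort key to field 4.
# Both files are read and merged into one date-ordered stream.
result=$(LC_ALL=C sort -t, -k4,4 f1.txt f2.txt)
printf '%s\n' "$result"
```

Unlike the decorate-and-strip pipeline, this keeps lines intact even if they contain spaces.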
You can try the following command
awk 'FNR==NR{a[FNR]=$0;next}{print a[FNR]"\n"$0}' file1 file2
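This stores file1's lines in an array a, keyed by FNR (the line number within the file), and then prints each line of file2 preceded by the file1 line with the same number. It simply interleaves the two files, so the result is in date order only when both files are already sorted and their dates alternate, as in the example.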
I have 2 files:
file1.txt
rs142159069:45000079:TACTTCTTGGACATTTCC:T 45000079
rs111285978:45000103:A:AT 45000103
rs190363568:45000168:C:T 45000168
file2.txt
rs142159069:45000079:TACTTCTTGGACATTTCC:T rs142159069
rs111285978:45000103:A:AT rs111285978
rs190363568:45000168:C:T rs190363568
Using file2.txt, I want to replace the names (column 1 of file1.txt, which is also column 1 of file2.txt) with the entry in column 2 of file2.txt. The output file would then be:
rs142159069 45000079
rs111285978 45000103
rs190363568 45000168
I have tried reading in the columns of file2.txt, but without success:
while read -r a b
do
cat file1.txt | sed s'/$a/$b/'
done < file2.txt
I am quite new to bash. Also, not sure how to write an output file with my command. Any help would be deeply appreciated.
In your case, using awk or perl would be easier, if you are willing to accept an answer without sed:
awk '(NR==FNR){out[$1]=$2;next}{out[$1]=out[$1]" "$2}END{for (i in out){print out[i]} }' file2.txt file1.txt > output.txt
output.txt :
rs142159069 45000079
rs111285978 45000103
rs190363568 45000168
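Note: this assumes all symbols in column 1 are unique and that they are all present in both files. Also, for (i in out) visits the keys in an unspecified order, so the output lines may not follow the order of the input files.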
explanation:
(NR==FNR){out[$1]=$2;next} : while you are parsing the first file, create a map with the name from the first column as key
{out[$1]=out[$1]" "$2} : append the value from the second column
END{for (i in out){print out[i]} } : print all the values in the map
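A possible variation on the same lookup idea, sketched here under the assumption that the output should follow the line order of file1.txt: print while reading the second file instead of collecting everything for an END block. The array name name is only illustrative, and lines of file1.txt whose key is missing from file2.txt are silently skipped.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > file1.txt <<'EOF'
rs142159069:45000079:TACTTCTTGGACATTTCC:T 45000079
rs111285978:45000103:A:AT 45000103
rs190363568:45000168:C:T 45000168
EOF
cat > file2.txt <<'EOF'
rs142159069:45000079:TACTTCTTGGACATTTCC:T rs142159069
rs111285978:45000103:A:AT rs111285978
rs190363568:45000168:C:T rs190363568
EOF

# Pass 1 (file2.txt): map the long identifier to its short name.
# Pass 2 (file1.txt): print the short name and the position, in file order.
result=$(awk 'NR==FNR {name[$1]=$2; next} ($1 in name) {print name[$1], $2}' \
    file2.txt file1.txt)
printf '%s\n' "$result" > output.txt
cat output.txt
```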
Apparently $2 of file2 is part of $1 of file1, so you could use awk and redefine FS:
$ awk -F"[: ]" '{print $1,$NF}' file1
rs142159069 45000079
rs111285978 45000103
rs190363568 45000168
I have 2 files like the ones below. I need a script that finds each string from File2 in File1, deletes the lines of File1 that contain the string, and puts the result in another file (Output1.txt). It should also print the lines deleted, and the string if the string doesn't exist in File1 (Output2.txt).
File1:
Apple
Boy: Goes to school
Cat
File2:
Boy
Dog
I need output like below.
Output1.txt:
Apple
Cat
Output2.txt:
Dog
Can anyone help, please?
If you have awk available on your system:
awk -v FS='[ :]' 'NR==FNR{a[$1]}NR>FNR&&!($1 in a){print $1}' File2 File1 > Output1.txt
awk -v FS='[ :]' 'NR==FNR{a[$1]}NR>FNR&&!($1 in a){print $1}' File1 File2 > Output2.txt
The script stores the first field ($1) of every line of the first file given as an argument in an array a.
If the first field of a line in the second file is not a key of that array, it is printed.
Note that the delimiter is either a space or a : and that only the first field is printed, not the whole line.
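If the strings from File2 may appear anywhere in a line rather than only as the first field, a grep-based sketch could look like the following. This is an assumption about the intent ("the line which contains the string"), not part of the answer above; it treats every line of File2 as a fixed string and assumes File2 has no empty lines, since an empty pattern would match every line.

```shell
#!/bin/sh
# Sample data copied from the question, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Apple' 'Boy: Goes to school' 'Cat' > File1
printf '%s\n' 'Boy' 'Dog' > File2

# Output1.txt: lines of File1 that contain none of the strings in File2.
# -F = fixed strings, -f = read patterns from a file, -v = invert the match.
grep -vFf File2 File1 > Output1.txt

# Output2.txt: strings of File2 that do not occur anywhere in File1.
while IFS= read -r s; do
    grep -qF -e "$s" File1 || printf '%s\n' "$s"
done < File2 > Output2.txt

cat Output1.txt Output2.txt
```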