I have two files:
File 1
2,1,1,1,Test1,1540584051,52
6,5,1,1,Test2,1540579206,54
3,3,0,0,Test3,1540591243,36
File 2
2,1,0,2,Test1,1540584051,52
6,5,0,2,Test2,1540579206,54
I want to look up the column 7 value from File 1 and check whether it matches the column 7 value in File 2; when it matches, replace that line in File 2 with the corresponding line from File 1.
So the output would be
2,1,1,1,Test1,1540584051,52
6,5,1,1,Test2,1540579206,54
Thanks in advance.
You can do that with the following script:
BEGIN { FS="," }
NR==FNR {
lookup[$7] = $0
next
}
{
if (lookup[$7] != "") {
$0 = lookup[$7]
}
print
}
END {
print ""
print "Lookup table used was:"
for (i in lookup) {
print " Key '"i"', Value '"lookup[i]"'"
}
}
The BEGIN section simply sets the field separator to a comma so that the individual fields can be processed easily.
The NR and FNR variables are, respectively, the line number of the full input stream (all files) and the line number of the current file in the input stream. When you are processing the first (or only) file, these will be equal, so we use this as a means to simply store the lines from the first file, keyed on field seven.
When NR and FNR are not equal, it's because you've started the second file and this is where we want to replace lines if their key exists in the first file.
This is done by simply checking if a line exists in the lookup table with the desired key and, if it does, replacing the current line with the lookup-table line. Then we print the (original or replaced) line.
The END section is there just for debugging purposes: it outputs the lookup table that was created and used, and you can remove it once you're satisfied the script works as expected.
You'll see the output in the following transcript, which hopefully illustrates that it works correctly:
pax$ cat file1
2,1,1,1,Test1,1540584051,52
6,5,1,1,Test2,1540579206,54
3,3,0,0,Test3,1540591243,36
pax$ cat file2
2,1,0,2,Test1,1540584051,52
6,5,0,2,Test2,1540579206,54
pax$ awk -f sudarshan.awk file1 file2
2,1,1,1,Test1,1540584051,52
6,5,1,1,Test2,1540579206,54
Lookup table used was:
Key '36', Value '3,3,0,0,Test3,1540591243,36'
Key '52', Value '2,1,1,1,Test1,1540584051,52'
Key '54', Value '6,5,1,1,Test2,1540579206,54'
If you need it as a "short as possible" one-liner to use from your script, just use:
awk -F, 'NR==FNR{x[$7]=$0;next}{if(x[$7]!=""){$0=x[$7]};print}' file1 file2
though I prefer the readable version myself.
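If the aim is actually to update file2 in place from within a shell script, one minimal sketch (assuming a temporary file such as file2.new is acceptable) is:
awk -F, 'NR==FNR{x[$7]=$0;next}{if(x[$7]!=""){$0=x[$7]};print}' file1 file2 > file2.new && mv file2.new file2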
This might work for you (GNU sed):
sed -r 's|^([^,]*,){6}([^,]*).*|/^([^,]*,){6}\2/s/.*/&/p|' file1 | sed -rnf - file2
This turns file1 into a sed script and, using the 7th field as a lookup key, replaces any matching line in file2 with its file1 counterpart.
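For illustration, the script that the first sed generates from the example file1 looks like this (the & in the outer replacement expands to the whole matched file1 line, so each generated command rewrites a matching file2 line to its file1 counterpart):
/^([^,]*,){6}52/s/.*/2,1,1,1,Test1,1540584051,52/p
/^([^,]*,){6}54/s/.*/6,5,1,1,Test2,1540579206,54/p
/^([^,]*,){6}36/s/.*/3,3,0,0,Test3,1540591243,36/p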
In your example the 7th field is the last one, so a short version of the above solution is:
sed -r 's|.*,(.*)|/.*,\1/s/.*/&/p|' file1 | sed -nf - file2
I have two files as below:
File1:
a1|f1|c1|d1|e1
a2|f1|c2|d2|e2
a3|f2|c3|d3|e3
a4|f2|c4|d4|e4
a5|f4|c5|d5|e5
File2:
z1|f1|c1|d1|e1
z2|f1|c2|d2|e2
z3|f2|c3|d3|e3
z4|f2|c4|d4|e4
z5|f3|c5|d5|e5
The output file should have the lines from both files interleaved such that the rows are sorted by the 2nd field.
Output file:
a1|f1|c1|d1|e1
a2|f1|c2|d2|e2
z1|f1|c1|d1|e1
z2|f1|c2|d2|e2
a3|f2|c3|d3|e3
a4|f2|c4|d4|e4
z3|f2|c3|d3|e3
z4|f2|c4|d4|e4
z5|f3|c5|d5|e5
a5|f4|c5|d5|e5
I tried appending File2 to File1 and then sorting on the 2nd field, but that does not maintain the order the lines have in the source files.
file_1:
a1|f1|c1|d1|e1
a2|f1|c2|d2|e2
a3|f2|c3|d3|e3
a4|f2|c4|d4|e4
a5|f4|c5|d5|e5
file_2:
z1|f1|c1|d1|e1
z2|f1|c2|d2|e2
z3|f2|c3|d3|e3
z4|f2|c4|d4|e4
z5|f3|c5|d5|e5
awk -F"|" '{a[$2] = a[$2]"\n"$0;} END {for (var in a) print a[var]}' file_1 file_2 | sed '/^\s*$/d'
awk
-F : tokenize the data on the '|' character.
a[$2] : builds a hash table whose key is the string in $2 and whose value is the previous data at a[$2] plus the current complete line ($0), separated by a newline.
sed
used to remove the empty lines (introduced by the leading newline the first time a key is seen) from the output.
Output:
a1|f1|c1|d1|e1
a2|f1|c2|d2|e2
z1|f1|c1|d1|e1
z2|f1|c2|d2|e2
a3|f2|c3|d3|e3
a4|f2|c4|d4|e4
z3|f2|c3|d3|e3
z4|f2|c4|d4|e4
z5|f3|c5|d5|e5
a5|f4|c5|d5|e5
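If you'd rather skip the sed step, a minimal sketch of the same idea that simply never introduces the leading newline (so no empty lines need removing) could be:
awk -F'|' '{a[$2] = ($2 in a) ? a[$2] "\n" $0 : $0} END {for (k in a) print a[k]}' file_1 file_2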
I have 2 files like below. I need a script to find each string from File2 in File1, delete the line that contains the string from File1, and put the result in another file (Output1.txt). It should also print the lines that were deleted, and write the string to Output2.txt if the string doesn't exist in File1.
File1:
Apple
Boy: Goes to school
Cat
File2:
Boy
Dog
I need output like below.
Output1.txt:
Apple
Cat
Output2.txt:
Dog
Can anyone help, please?
If you have awk available on your system:
awk -v FS='[ :]' 'NR==FNR{a[$1]}NR>FNR&&!($1 in a){print $1}' File2 File1 > Output1.txt
awk -v FS='[ :]' 'NR==FNR{a[$1]}NR>FNR&&!($1 in a){print $1}' File1 File2 > Output2.txt
The script stores in an array a the first field ($1) of each line of the first file given as an argument.
If the first field of a line in the second file is not in the array, it is printed.
Note that the delimiter is either a space or a colon (:).
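For readability, here is the first command spread out (a sketch of the same logic; the second command is identical with the file arguments swapped):
awk -v FS='[ :]' '
NR == FNR              { a[$1] }        # first file: remember its first field
NR > FNR && !($1 in a) { print $1 }     # second file: print first fields not seen above
' File2 File1 > Output1.txt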
I have five files as shown below, each containing a single line of comma-separated values.
File 1
abc,100
File 2
abc,200
File 3
abc,300
File 4
abc,700
File 5
abc,800
I need the output to be produced by adding the numbers from all the above files.
The script should be a single line of code.
Output file
abc,2100
awk -F, '{code=$1; total += $2} END {printf("%s,%d\n", code, total)}' file1 file2 file3 file4 file5 > outputfile
Try:
awk -F\, '{a[$1]+=$2}END{for (i in a){print i","a[i]}}' file* > target
This will also work for input files with multiple keys.
For the new expected output:
awk -F\, '{a[$1]+=$2}END{for (i in a){key=key"_"i;cont+=a[i]};sub(/^_/,"",key);print key","cont}' file*
Results
abc_bbc,2100
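Spelled out with comments, the second command does the following (same code, only reformatted):
awk -F, '
{ a[$1] += $2 }                 # sum field 2 per key in field 1
END {
    for (i in a) {              # join all keys with "_" and add up the totals
        key = key "_" i
        cont += a[i]
    }
    sub(/^_/, "", key)          # drop the leading underscore
    print key "," cont
}' file*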
I have a small file with around 50 lines and 2 fields, like below:
file1
-----
12345 8373
65236 7376
82738 2872
..
..
..
I have around 100 files which are comma (,) separated, as below:
file2
-----
1,3,4,4,12345,,,23,3,,,2,8373,1,1
Each file has many lines similar to the above line.
I want to extract from all these 100 files the lines whose
5th field is equal to the 1st field in the first file and
13th field is equal to the 2nd field in the first file.
I want to search all 100 files using that single file.
I came up with the below in the case of a single comma-separated file. I am not even sure whether this is correct!
But I have multiple comma-separated files.
awk -F"\t|," 'FNR==NR{a[$1$2]++;next}($5$13 in a)' file1 file2
Can anyone help me please?
EDIT:
The above command works fine in the case of a single file.
Here is another approach, using an array and avoiding multiple work files:
#!/bin/awk -f
FILENAME == "file1" {
keys[$1] = ""
keys[$2] = ""
next
}
{
split($0, fields, "," )
if (fields[5] in keys && fields[13] in keys) print "*:",$0
}
I am using split because the field separators in the two files are different. You could swap it around if necessary. You should call the script thus:
./runit.awk file1 file2
An alternative is to read the first file explicitly in a BEGIN block, using getline.
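A minimal sketch of that alternative (assuming the lookup file is literally named file1 and is in the current directory):
#!/bin/awk -f
BEGIN {
    # read the small file once, remembering both of its fields as keys
    while ((getline line < "file1") > 0) {
        split(line, f)
        keys[f[1]] = ""
        keys[f[2]] = ""
    }
    close("file1")
}
{
    split($0, fields, ",")
    if (fields[5] in keys && fields[13] in keys) print "*:", $0
}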
Here is a simple approach. Extract each line from the small file, split it into fields and then use awk to print lines from the other files which match those fields:
while read -r line
do
    f1=$(echo "$line" | awk '{print $1}')
    f2=$(echo "$line" | awk '{print $2}')
    awk -v f1="$f1" -v f2="$f2" -F, '$5==f1 && $13==f2' file*
done < small_file
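Since read can split whitespace-separated fields on its own, a slightly leaner sketch of the same loop is:
while read -r f1 f2
do
    awk -v f1="$f1" -v f2="$f2" -F, '$5==f1 && $13==f2' file*
done < small_file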