awk find and replace variable file2 into file1 when matched


I tried a couple of answers from similar questions but am not quite getting correct results. I'm trying to look up each value of the first file in the second file and replace it with the second column's value if it is there, otherwise keep the original.
File1.txt
a
2
c
4
e
f
File2.txt
2 b
4 d
Wanted Output.txt
a
b
c
d
e
f
What I have so far sort of works, but wherever a replacement should happen I get a blank row instead of the new value.
Current Output.txt
a
c
e
f
Current code:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} {print (($1 in a) ? a[$1] : $1)}' file2.txt file1.txt > output.txt
I also tried this and got the same results:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} {$1 = a[$1]}1' file2.txt file1.txt > output.txt
Sorry, I first wrote this incorrectly; I've fixed the key/value issue.
I did try what you suggested, but the replaced values are still missing in output.txt:
awk -F'\t' 'NR==FNR{a[$1]=$2;next} $1 in a{$1 = a[$1]}1' file2.txt file1.txt > output.txt

Your key/value pair is not right: $1 is the key, $2 is the value.
$ awk -F'\t' 'NR==FNR{a[$1]=$2;next} $1 in a{$1=a[$1]}1' file.2 file.1
a
b
c
d
e
f

Try the solution below:
awk 'NR==FNR{a[$1]=$NF;next} {print (a[$NF]?a[$NF]:$1)}' file2.txt file1.txt
a
b
c
d
e
f
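A minimal sketch combining the two answers above, assuming the files are whitespace-separated as shown in the question (so no -F'\t' is needed). It tests membership with `in` rather than the truthiness of a[$NF], which would skip a replacement value of 0 or an empty string. The temporary directory and the array name `map` are illustrative, not from the original posts.

```shell
# Recreate the sample data from the question in a temporary directory
tmp=$(mktemp -d)
printf '%s\n' a 2 c 4 e f > "$tmp/file1.txt"
printf '%s\n' '2 b' '4 d' > "$tmp/file2.txt"

# First pass (file2.txt): build a key -> replacement map.
# Second pass (file1.txt): replace $1 only when it is a known key.
result=$(awk '
    NR == FNR   { map[$1] = $2; next }
    ($1 in map) { $1 = map[$1] }
                { print }
' "$tmp/file2.txt" "$tmp/file1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

If the real files are tab-separated and values may contain spaces, adding -F'\t' back should be all that changes.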

Related

Bash: compare 2 files and show the unique content of one file with 'hierachy'

So basically, these are two files I need to compare
file1.txt
1 a
2 b
3 c
44 d
file2.txt
11 a
123 a
3 b
445 d
To show the lines unique to file 1, I run 'comm -23' after 'sort -u' on both files. Additionally, I would like '11 a' and '123 a' in file 2 to count as subsets of '1 a' in file 1; similarly, '445 d' is a subset of '44 d'. These subsets are considered the same as their superset, so the desired output is
2 b
3 c
I'm a beginner and my loop is way too slow. Here is my code:
comm -23 <( awk '{print $1,$2}' file1.txt | sort -u ) <( awk '{print $1,$2}' file2.txt | sort -u ) >output.txt
array=($( awk -F ',' '{print $1}' file1.txt ))
for i in "${array[@]}";do
awk -v pattern="$i" 'match($0, "^" pattern)' output.txt > repeat.txt
done
comm -23 <( cat output.txt | sort -u ) <( cat repeat.txt | sort -u )
Anyone got any good ideas?
Another question: is there any way to show the row numbers from the original file in the output? For example,
(row num from file 1)
2 2 b
3 3 c

With GNU awk for arrays of arrays:
$ cat tst.awk
NR==FNR {
    vals[$2][$1]
    next
}
$2 in vals {
    for (i in vals[$2]) {
        if ( index(i,$1) == 1 ) {
            next
        }
    }
}
{ print FNR, $0 }
$ awk -f tst.awk file2 file1
2 2 b
3 3 c
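For an awk without arrays of arrays, the same prefix test can be sketched with a space-joined string per letter. This is a variant under assumptions, not the answerer's method: it assumes the first column never contains whitespace, and the names `ids` and `id` are made up for illustration.

```shell
# Recreate the sample data from the question in a temporary directory
tmp=$(mktemp -d)
printf '%s\n' '1 a' '2 b' '3 c' '44 d' > "$tmp/file1"
printf '%s\n' '11 a' '123 a' '3 b' '445 d' > "$tmp/file2"

result=$(awk '
    # First pass (file2): collect every number seen for each letter
    NR == FNR { ids[$2] = ids[$2] " " $1; next }
    # Second pass (file1): skip the line if any collected number starts with $1
    {
        n = split(ids[$2], id, " ")
        for (j = 1; j <= n; j++)
            if (index(id[j], $1) == 1) next
        print FNR, $0
    }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```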

How to print lines with the specified word in the path?

Let's say I have file abc.txt which contains the following lines:
a b c /some/path/123/path/120
a c b /some/path/312/path/098
a p t /some/path/123/path/321
a b c /some/path/098/path/123
and numbers.txt:
123
321
123
098
I want to print every whole line that contains "123" in the third path component only, i.e. under "/some/path/123/path".
I don't want to print the line "a c b/some/path/312/path" or
"a b c /some/path/098/path/123/". I want to save all lines with "123" in the third place to a new file.
I tried several methods and the best way seems to be to use awk. Here is my example code, which is not working correctly:
for i in `cat numbers.txt | xargs`
do
cat abc.txt | awk -v i=$i '$4 ~ /i/ {print $0}' > ${i}_number.txt;
done
It is wrong because it also catches, for example, "a b c /some/path/098/path/123/".
Example:
For number "123" I want to save only one line from abc.txt in 123_number.txt:
a b c /some/path/123/path/120
For number "312" I want to save only one line from abc.txt in 312_number.txt:
a c b /some/path/312/path/098

This can be accomplished in a single awk call:
$ awk -F'/' 'NR==FNR{a[$0];next} ($4 in a){f=$4"_number.txt";print >>f;close(f)}' numbers.txt abc.txt
$ cat 098_number.txt
a b c /some/path/098/path/123
$ cat 123_number.txt
a b c /some/path/123/path/120
a p t /some/path/123/path/321
Keep the numbers in an array and use it to match lines; append matching lines to the corresponding files.
If your files are huge, you may speed up the process by sorting first:
sort -t'/' -k4 abc.txt | awk -F'/' 'NR==FNR{a[$0];next} ($4 in a){if($4!=p){close(f);f=(p=$4)"_number.txt"};print >>f}' numbers.txt -
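As a side note on why the loop in the question misfires: inside /i/ the i is a literal letter, not the awk variable passed with -v, so the pattern never uses the number at all. A minimal sketch of comparing one path component against a variable follows; the shell variable `n` and the temporary file are assumptions for illustration.

```shell
# Recreate abc.txt from the question in a temporary directory
tmp=$(mktemp -d)
cat > "$tmp/abc.txt" <<'EOF'
a b c /some/path/123/path/120
a c b /some/path/312/path/098
a p t /some/path/123/path/321
a b c /some/path/098/path/123
EOF

n=123   # hypothetical: the number to look for

# Split on "/" so that $4 is the third path component, then compare it
# with the variable; (n "") forces a string comparison, so 098 stays 098.
result=$(awk -F'/' -v n="$n" '$4 == (n "")' "$tmp/abc.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With n=123 this selects the two lines whose third component is 123 and ignores the line that only ends in 123.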

Reading two columns in two files and outputting them to another file

I recently posted this question - paste -d " " command outputting a return separated file
However, I am concerned that some formatting in the text files is causing the error. For this reason I am attempting to do it with awk.
I am not very experienced with awk but currently I have the following:
awk {print $1} file1 | {print $1} file2 > file 3
Is this the kind of syntax I should be using? It gives an error saying a } is missing. Each file contains a single column of numbers, and both files have the same number of rows.

Judging by your earlier post, your files may contain control-M (carriage return) characters. To remove them, either use the dos2unix utility or one of the following commands.
1st: to remove carriage returns everywhere:
tr -d '\r' < Input_file > temp_file && mv temp_file Input_file
2nd: to remove them only at the end of lines:
awk '{sub(/\r$/,"")} 1' Input_file > temp_file && mv temp_file Input_file
I believe that once you remove the carriage returns your paste command should work properly too. Run the following after fixing them in your Input_file(s):
paste -d " " Input_file1 Input_file2 > Output_file
Or, to concatenate the 2 files one after the other (assuming your Input_files have a single column, or that you want full lines in the output), simply use:
cat Input_file1 Input_file2 > output_file

awk to the rescue:
awk 'FNR==NR{a[FNR]=$1;next}{print a[FNR],$1}' a.txt b.txt > output.txt
a.txt:
1
2
3
4
5
b.txt:
A
B
C
D
E
output.txt:
1 A
2 B
3 C
4 D
5 E
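If the inputs might carry Windows line endings, as the other answer suspects, the carriage-return cleanup can be folded into the same awk call. This is a sketch under that assumption; the sample files below are deliberately written with \r\n to simulate the problem.

```shell
# Create two single-column files with CRLF line endings
tmp=$(mktemp -d)
printf '1\r\n2\r\n3\r\n' > "$tmp/a.txt"
printf 'A\r\nB\r\nC\r\n' > "$tmp/b.txt"

result=$(awk '
    { sub(/\r$/, "") }               # strip a trailing carriage return, if any
    FNR == NR { a[FNR] = $1; next }  # first file: remember each row by line number
    { print a[FNR], $1 }             # second file: pair it with the stored row
' "$tmp/a.txt" "$tmp/b.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```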

Concatenation of two columns from the same file

From a text file
file
a d
b e
c f
how can the tab-delimited columns be concatenated into one column?
a
b
c
d
e
f
Currently I use awk to write the columns to two files, which I then concatenate using cat. But there must be a better one-line command?

For a generalized approach:
$ f() { awk '{print $'$1'}' file; }; f 1; f 2
a
b
c
d
e
f
If the file is tab-delimited, perhaps simply use cut (the inverse operation of paste):
$ cut -f1 file.t; cut -f2 file.t

This simple awk command should do the job:
awk '{print $1; s=s $2 ORS} END{printf "%s", s}' file
a
b
c
d
e
f

You can use process substitution; that would eliminate the need to create file for each column.
$ cat file
a d
b e
c f
$ cat <(awk '{print $1}' file) <(awk '{print $2}' file)
a
b
c
d
e
f
$
OR
As per the comment, you can just group multiple commands and redirect their combined output to a file like this:
$ cat file
a d
b e
c f
$ (awk '{print $1}' file; awk '{print $2}' file) > output
$ cat output
a
b
c
d
e
f
$

Try this: a single awk that reads the file only once and makes no external calls to other commands, assuming your Input_file looks like the sample shown.
awk '{VAL1=VAL1?VAL1 ORS $1:$1;VAL2=VAL2?VAL2 ORS $2:$2} END{print VAL1 ORS VAL2}' Input_file
Explanation: VAL1 accumulates the values of $1, appending each new value to its own contents separated by ORS; VAL2 does the same for $2. The END section of awk prints VAL1 followed by VAL2.
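The same single-pass idea can be sketched for any number of columns by keeping one accumulator per column in an array. This assumes tab-delimited input as stated in the question; the names `col` and `max` are illustrative.

```shell
# Feed the sample data (tab-delimited) straight into awk
result=$(printf 'a\td\nb\te\nc\tf\n' | awk -F'\t' '
    {
        # Append each field to the accumulator of its column
        for (c = 1; c <= NF; c++) col[c] = col[c] $c ORS
        if (NF > max) max = NF
    }
    END {
        # Emit the columns one after another
        for (c = 1; c <= max; c++) printf "%s", col[c]
    }
')

printf '%s\n' "$result"
```

Like the VAL1/VAL2 version, this holds the whole file in memory, so it suits small to medium inputs.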

You can combine bash commands with ; to get a single stream:
$ awk '{print $1}' file; awk '{print $2}' file
a
b
c
d
e
f
Use command substitution if you want to capture that stream as a single string:
$ txt=$(awk '{print $1}' file; awk '{print $2}' file)
$ echo "$txt"
a
b
c
d
e
f
Or use process substitution to feed a Bash while loop:
$ while read -r line; do echo "line: $line"; done < <(awk '{print $1}' file; awk '{print $2}' file)
line: a
line: b
line: c
line: d
line: e
line: f

If you're using Notepad++, you could replace all tab characters with the newline sequence "\r\n".

Another approach:
for i in $(seq 1 2); do
awk '{print $'$i'}' file
done
output:
a
b
c
d
e
f

compare the value from file in awk

I have two files named file1 and file2. I want to pick each value from file1 and search for it in the whole of file2; if a record is found, apply one operation to the file1 record, otherwise apply some other operation to it.
file 1
a
s
d
f
g
file 2
q
e
r
a
g
Earlier I was using something like this:
awk -F'|' '{if($1=="abc" || $1=="a") print ......}' file1
Now I have multiple values to compare, and I have put the values in a file (abc,a.....),
but I don't know how to use it.
Please help.

A common way to do this would be to store the values you are searching for in a hash and use it as a lookup table. Something like this might work for you:
search.awk
FNR==NR {
seen[$0]
next
}
$0 in seen {
print $0 " is in file2"
next
}
{
print $0 " is not in file2"
}
Run it like this:
awk -f search.awk file2 file1
Output:
a is in file2
s is not in file2
d is not in file2
f is not in file2
g is in file2

Using awk
awk 'NR==FNR{a[$1];next}{print $0,($1 in a)?"Yes":"No"}' file2 file1
a Yes
s No
d No
f No
g Yes
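When only the two groups of lines are needed and no per-record awk logic is required, grep can do the lookup too. This is a sketch of an alternative, not something proposed in the answers above: -F treats the patterns as fixed strings, -x requires whole-line matches, -f reads the patterns from a file, and -v inverts the selection.

```shell
# Recreate the sample files from the question in a temporary directory
tmp=$(mktemp -d)
printf '%s\n' a s d f g > "$tmp/file1"
printf '%s\n' q e r a g > "$tmp/file2"

# Lines of file1 that also appear in file2
found=$(grep -Fxf "$tmp/file2" "$tmp/file1")
# Lines of file1 that do not appear in file2
missing=$(grep -vFxf "$tmp/file2" "$tmp/file1")

printf 'found:\n%s\nmissing:\n%s\n' "$found" "$missing"
rm -rf "$tmp"
```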
