Awk results not working - shell

Hi, I have a text file with these values:
A VAL|1|2|3|
C VAL|2|2|3|
D VAL|1|2|3|
(There are no blank lines between the rows.)
I want to replace values in these rows depending on the first column, i.e. A VAL, C VAL, D VAL.
So I want to:
1. replace 3 in the A VAL row
2. replace the 2 value in the C VAL row
3. replace the 1 value in the D VAL row
Basically, I want to modify the values above using awk, since awk helps with CSV and pipe-delimited files.
So I tried this awk command:
`awk 'BEGIN {OFS=FS="|"} {if ($1="A") sub($4,"A1") ;elseif ($1="C") sub($2,"B1"); print }' myval.txt`
But I am getting wrong results:
C|B1|2|A1|B1C
C|B1|2|A1|B1C
C|B1|2|3|B1C
The first column itself is getting replaced, and the substitution is in the wrong position.
Expected output:
A VAL|1|2|A1|
C VAL|2|2|B1|
D VAL|1|2|3|

You can try this awk:
awk 'BEGIN{OFS=FS="|"} $1 ~ /^A/{$(NF-1)="A1"} $1 ~ /^C/{$(NF-1)="B1"} 1' file.csv
A VAL|1|2|A1|
C VAL|2|2|B1|
D VAL|1|2|3|

awk 'BEGIN{OFS=FS="|"}{if(substr($1,1,1)=="A")sub($3,"A1",$3);else if(substr($1,1,1)=="C")sub($3,"B1",$3);else if(substr($1,1,1)=="D")sub($3,"3",$3);print }' inputtext.txt > outtext.txt
This works fine.
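For reference, a minimal sketch of why the original attempt misbehaves. This is based on general awk semantics rather than anything stated in the answers: $1="A" is an assignment (it overwrites the first column and is always true), whereas == compares; elseif is not an awk keyword, so it is parsed as an ordinary variable name and the code after it runs for every line; and sub($4,"A1") treats the contents of $4 as a regex and replaces its first match anywhere in the line. The version below assigns to the field directly and assumes the fourth column is the one to change, as in the expected output.

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' 'A VAL|1|2|3|' 'C VAL|2|2|3|' 'D VAL|1|2|3|' > "$input"

# Compare with ==, then assign to the target field instead of calling sub().
result=$(awk 'BEGIN { FS = OFS = "|" }
  $1 == "A VAL" { $4 = "A1" }
  $1 == "C VAL" { $4 = "B1" }
  { print }' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Assigning to $4 makes awk rebuild the line with OFS, which is why OFS is set to "|" as well.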

Related

awk to get first column if a specific number in the line is greater than a digit

I have a data file (file.txt) that contains the lines below:
123 pro=tegs, ETA=12:00, team=xyz,user1=tom,dom=dby.com
345 pro=rbs, team=abc,user1=chan,dom=sbc.int,ETA=23:00
456 team=efg, pro=bvy,ETA=22:00,dom=sss.co.uk,user2=lis
I'm expecting to get the first column ($1) only if the ETA= number is greater than 15. Here, only the first column of the 2nd and 3rd lines is expected:
345
456
I tried cat file.txt | awk -F [,TPF=]' '{print $1}' but it prints the whole line, which has ETA at the end.
Using awk:
$ awk -F"[=, ]" '{for (i=1;i<NF;i++) if ($i=="ETA") if ($(i+1)+0 > 15) print $1}' input_file
345
456
With your shown samples, please try the following GNU awk code. It uses GNU awk's three-argument match function with the regex (^[0-9]+).*\<ETA=([0-9]+):[0-9]+, which has two capturing groups whose values are saved into the array arr. If the second element of arr is greater than 15, it prints the first element of arr, as required.
awk '
match($0,/(^[0-9]+).*\<ETA=([0-9]+):[0-9]+/,arr) && arr[2]+0>15{
print arr[1]
}
' Input_file
I would use GNU AWK for this task in the following way. Let the content of file.txt be
123 pro=tegs, ETA=12:00, team=xyz,user1=tom,dom=dby.com
345 pro=rbs, team=abc,user1=chan,dom=sbc.int,ETA=23:00
456 team=efg, pro=bvy,ETA=02:00,dom=sss.co.uk,user2=lis
then
awk 'substr($0,index($0,"ETA=")+4,2)+0>15{print $1}' file.txt
gives output
345
Explanation: I use two string functions: index, to find where ETA= starts, and substr, to get the 2 characters after it. The 4 is there because ETA= is 4 characters long and index returns the start position. Adding 0 converts the result to a number, which is then compared with 15. Disclaimer: this solution assumes every row has ETA= followed by exactly 2 digits.
(tested in GNU Awk 5.0.1)
Whenever input contains tag=value pairs, as yours does, it's best to first create an array of those mappings (v[] below); then you can access the values by their tags (names):
$ cat tst.awk
BEGIN {
    FS = "[, =]+"
    OFS = ","
}
{
    delete v
    for ( i=2; i<NF; i+=2 ) {
        v[$i] = $(i+1)
    }
}
v["ETA"]+0 > 15 {
    print $1
}
$ awk -f tst.awk file
345
456
With that approach you can trivially enhance the script in future to access whatever values you like by their names, test them in whatever combinations you like, output them in whatever order you like, etc. For example:
$ cat tst.awk
BEGIN {
    FS = "[, =]+"
    OFS = ","
}
{
    delete v
    for ( i=2; i<NF; i+=2 ) {
        v[$i] = $(i+1)
    }
}
(v["pro"] ~ /b/) && (v["ETA"]+0 > 15) {
    print $1, v["team"], v["dom"]
}
$ awk -f tst.awk file
345,abc,sbc.int
456,efg,sss.co.uk
Think about how you'd enhance any other solution to do the above or anything remotely similar.
It's unclear why you think your attempt would do anything of the sort. Your attempt uses a completely different field separator and does not compare anything against the number 15.
You'll also want to get rid of the useless use of cat.
When you specify a column separator with -F, that changes what the first column $1 actually means; it is then everything before the first occurrence of the separator. It is probably simplest to split the line separately, on whitespace, to obtain the first column.
awk -F 'ETA=' '$2+0 > 15 { split($0, n, /[ \t]+/); print n[1] }' file.txt
The value in $2 is the data after the first separator (up until the next one, if any). Adding 0 forces a numeric conversion, which ignores any non-numeric text after the number at the beginning of the field; without it, awk would compare the field to 15 as a string. So on the first line, for example, the field is literally 12:00, team=xyz,user1=tom,dom=dby.com, but the condition effectively checks whether 12 is larger than 15 (which is false).
When the condition is true, we split the original line $0 into the array n on sequences of whitespace, and then print the first element of this array.
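A point worth checking in any of these comparisons is whether awk compares numerically or as strings. A field such as 9:00 does not look like a number, so comparing it to 15 directly is a string comparison, while adding 0 first converts it. A minimal sketch with a made-up input line (the 789 record is an assumption, not part of the question's data):

```shell
# Hypothetical one-line input; the ETA hour has a single digit.
line='789 pro=abc,ETA=9:00'

# String comparison: "9:00" sorts after "15", so the test is true.
as_string=$(printf '%s\n' "$line" |
  awk -F 'ETA=' '{ print (($2 > 15) ? "yes" : "no") }')

# Numeric comparison: "9:00"+0 is 9, so the test is false.
as_number=$(printf '%s\n' "$line" |
  awk -F 'ETA=' '{ print (($2+0 > 15) ? "yes" : "no") }')

echo "string: $as_string, numeric: $as_number"
```

With two-digit hours such as 12:00 or 23:00 both forms happen to agree, which is why the difference is easy to miss on the sample data.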
Using awk you could match ETA= followed by 1 or more digits. Then get the match without the ETA= part and check if the number is greater than 15 and print the first field.
awk 'match($0, /ETA=[0-9]+/) {
if(substr($0, RSTART+4, RLENGTH-4)+0 > 15) print $1
}' file
Output
345
456
If the first field should start with a number:
awk '/^[0-9]/ && match($0, /ETA=[0-9]+/) {
if(substr($0, RSTART+4, RLENGTH-4)+0 > 15) print $1
}' file

Formatting output using awk

I have a file with the following content:
A 28713.64 27736.1000
B 9835.32
C 38548.96
Now I need to check whether the first column of the last row is 'C'; if so, the third-column value of the first row should be printed in the third column of the 'C' row.
Expected Output:
A 28713.64 27736.1000
B 9835.32
C 38548.96 27736.1000
I tried the following, but it's not working:
awk '{if ($1 == "C") ; print $1,$2,$3}' file_name
Any help is most welcome!!!
This works for the given example:
awk 'NR==1{v=$3}$1=="C"{$0=$0 FS v}7' file|column -t
If you want to append the 3rd column value from A row to C row, change NR==1 into $1=="A"
The column -t part is just for making output pretty. :-)
EDIT: As per the OP's comment, the value should come from the very first line, and C should be matched only on the very last line of Input_file. If that is the case, try the following.
awk '
FNR==1{
  value=$NF
  print
  next
}
prev{
  print prev
}
{
  prev=$0
  prev_first=$1
}
END{
  if(prev_first=="C"){
    print prev,value
  }
  else{
    print
  }
}' file | column -t
Assuming that your actual Input_file is the same as the shown samples, and you want to pick the value from the row whose first column is A:
awk '$1=="A" && FNR==1{value=$NF} $1=="C"{print $0,value;next} 1' Input_file| column -t
Output will be as follows.
A 28713.64 27736.1000
B 9835.32
C 38548.96 27736.1000
POSIX dictates that "assigning to a nonexistent field (for example, $(NF+2)=5) shall increase the value of NF; create any intervening fields with the uninitialized value; and cause the value of $0 to be recomputed, with the fields being separated by the value of OFS."
So...
awk 'NR==1{x=$3} $1=="C"{$3=x} 1' input.txt
Note that the output is not formatted well, but that's likely the case with most of the solutions here. You could pipe the output through column, as Ravinder suggested. Or you could control things precisely by printing your data with printf.
awk 'NR==1{x=$3} $1=="C"{$3=x} {printf "%-2s%-26s%s\n",$1,$2,$3}' input.txt
If your lines can be expressed in a printf format, you'll be able to avoid the unpredictability of column -t and save the overhead of a pipe.

Unix - search a value and print before line

I have the input below and need the output shown.
input :
insert xxx to table xx.xxx.
1
insert yyy to table yy.yyy
10000
output:
insert yyy to table yy.yyy
10000
I want to print the line before any value >= 10000.
What I tried, which isn't working:
awk '($1>10000) {print$1)' < log2 > log3 | awk '/[0-9]$/ {print $1)' < log
You can use this awk:
awk '/10000/{print line ORS $0} {line=$0}' file
insert yyy to table yy.yyy
10000
You can use this code:
awk '{if (($1+0==$1) && $1 >= 10000) {print a; print}; a=$0}' < input
Here, ($1+0==$1) makes sure that $1 is a number, and $1 >= 10000 checks that it is not smaller than 10000. If both conditions are met, it prints the previous line, which was saved using a=$0, followed by the current line.
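A small sketch of that check on a slightly extended input. The zzz line and the 25000 value are made up here to show that the numeric test also catches values above 10000, which a literal /10000/ match would not:

```shell
# Question's sample plus one hypothetical extra pair (zzz / 25000).
log=$(mktemp)
printf '%s\n' \
  'insert xxx to table xx.xxx.' '1' \
  'insert yyy to table yy.yyy' '10000' \
  'insert zzz to table zz.zzz' '25000' > "$log"

# Print the previous line and the current one when $1 is a number >= 10000.
result=$(awk '($1+0 == $1) && $1 >= 10000 { print prev; print }
  { prev = $0 }' "$log")

printf '%s\n' "$result"
rm -f "$log"
```

The text lines fail the first test because "insert" compared with 0 is a string comparison, so only the purely numeric lines reach the >= check.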
As stated in the question, you need the matching line (10000 in your example) and the line before it, right?
I think the easiest way to do it is using grep:
grep -E -B 1 '^[1-9][0-9]{4,}$' file
or, if you want to write it to an output file:
grep -E -B 1 '^[1-9][0-9]{4,}$' file > outputFile
It matches values that start with a digit from 1 to 9 and have 4 or more digits after it, i.e. numbers >= 10000. The ^ and $ anchors guarantee that the line contains only the numeric value (^ is the line beginning and $ is the line end). The -E option enables the extended regex syntax needed for {4,}, and -B 1 prints one line of context before each match.
For the sample input, this outputs:
insert yyy to table yy.yyy
10000

Shell Script add column values

I have a text file with contents like this:
{"userId":"f1fcab","count":"3","type":"Stack"}
{"userId":"fcab","count":"2","type":"Stack"}
{"userId":"abcd","count":"5","type":"Stack"}
I want to get the sum of the count values.
I am using awk to achieve this, like below:
$ awk -F "," '{print $4}' test.txt
How can I extract only the integer values with awk and add them all up?
My script should give me:
sum=10
You could try the below,
$ awk -F'"' '{sum = sum + $8;}END{print "sum="sum+0}' file
sum=10
-F'"' sets the double quote as the FS value. Awk splits each row into columns according to the value of the FS variable.
sum = sum + $8 adds the value in column 8 to a variable called sum.
Finally, printing sum in the END block gives the desired output.
You can get the value of count key using double quotes (") as delimiter so that the eighth column will be the value to count on:
$ awk -F"\"" 'BEGIN {sum=0} {sum+=$8} END {print sum}' fd
10
Assuming consistent use of double quote characters, you can use:
awk -F\" '{s += $8} END{print "sum=" s+0}' inputFile
This will generate:
sum=10
This works because a quote delimiter gives you the fields:
1 2      3 4      5 6     7 8 ...
{"userId":"f1fcab","count":"3","type":"Stack"}
awk -F'[:"]' '{sum+=$10} END{print "sum=" sum}' File
This sets ':' and '"' as delimiters, then takes the 10th field, which is the count value, adds the values up into sum, and prints the total at the end.
Example:
sdlcb#ubuntu:~/AMD_C/SO$ cat File
{"userId":"f1fcab","count":"3","type":"Stack"}
{"userId":"fcab","count":"2","type":"Stack"}
{"userId":"abcd","count":"5","type":"Stack"}
sdlcb#ubuntu:~/AMD_C/SO$ awk -F'[:"]' '{sum+=$10} END{print "sum=" sum}' File
sum=10
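All of the answers above rely on count sitting at a fixed field position. As a hedged alternative, here is a sketch that finds the key by name with match(), assuming the values are always quoted integers as in the sample. The reordered keys in the second and third lines are made up to show that position no longer matters:

```shell
# Sample-like data; key order is deliberately varied (an assumption).
data=$(mktemp)
printf '%s\n' \
  '{"userId":"f1fcab","count":"3","type":"Stack"}' \
  '{"count":"2","userId":"fcab","type":"Stack"}' \
  '{"userId":"abcd","type":"Stack","count":"5"}' > "$data"

total=$(awk 'match($0, /"count":"[0-9]+"/) {
    # Skip the 9 characters of the key prefix and drop the closing quote.
    sum += substr($0, RSTART + 9, RLENGTH - 10)
  }
  END { print "sum=" (sum + 0) }' "$data")

echo "$total"
rm -f "$data"
```

This is still pattern matching rather than real JSON parsing, so it only suits input as regular as the sample.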

replace a particular row and column value of one file with another

I have a file containing
a b c d
g h i j
d e f f
and another file containing
1 2 3 4
5 6 7 8
9 1 0 1
I know that I can extract a particular row and column using
awk 'FNR == 2 {print $3}' fit_detail.txt
But I need to replace the 2nd column and 3rd row of the first file with the 2nd row and 3rd column of the second file. How could I do this and save the result to another file?
Finally, my output should look like
a b c d
g h i j
d 1 f f
$ awk 'NR==FNR && NR==3 {a=$2} NR==FNR {next} FNR==3 {$2=a} {print}' file2 file1
a b c d
g h i j
d 1 f f
Explanation:
NR==FNR && NR==3 {a=$2}
In awk, NR is the number of records (lines) that have been read in total and FNR is the number of records (lines) that have been read in from the current file. So, when NR==FNR, then we know that we are working on the first file named on the command line. For that file, we select only the third row (NR==3) and save the value of its second column in the variable a.
NR==FNR {next}
If we are processing the first named file on the command line, skip to next line.
FNR==3 {$2=a}
Because of the preceding next statement, it is only possible to get to this command if we are now working on the second named file. For this file, if we are on the third row, change the 2nd column to the value a.
{print}
All lines from the second named file are printed.
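The same two-file idea can be written with the row and column numbers passed in as variables. This is only a sketch: the names srow, scol, drow, and dcol are hypothetical, and the values below reproduce the command above (row 3, column 2 of file2 into row 3, column 2 of file1).

```shell
# Recreate the two sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'a b c d' 'g h i j' 'd e f f' > "$dir/file1"
printf '%s\n' '1 2 3 4' '5 6 7 8' '9 1 0 1' > "$dir/file2"

# srow/scol: source cell in file2; drow/dcol: target cell in file1.
result=$(awk -v srow=3 -v scol=2 -v drow=3 -v dcol=2 '
  NR == FNR { if (FNR == srow) val = $scol; next }  # first file: remember the cell
  FNR == drow { $dcol = val }                       # second file: overwrite the cell
  { print }' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -rf "$dir"
```

Changing the -v values picks a different source or target cell without touching the program text.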
Controlling the output format
By default, awk separates output fields with a space. If another output field separator, such as a tab, is desired, it can be specified as follows:
$ awk -v OFS="\t" 'NR==FNR && NR==3 {a=$2} NR==FNR {next} {$2=$2} FNR==3 {$2=a} {print}' file2 file1
a b c d
g h i j
d 1 f f
To accomplish this, we made two changes:
The output field separator (OFS) was specified as a tab with the -v option: -v OFS="\t"
When using a simple print statement, such as {print}, awk will normally apply the new output field separator only if the line had been changed in some way. That is accomplished here with the statement $2=$2. This assigns the second field to itself. Even though this leaves the second field unchanged, it is enough to trigger awk to replace the old field separators with new ones on output.

Resources