process second column if first column matches

I just want the second column to be multiplied by exp(3) if the first column matches the parameter I define.
cat inputfile.i
100 2
200 3
300 1
100 5
200 2
300 3
I want the output to be:
100 2
200 60.25
300 1
100 5
200 40.17
300 3
I tried this code:
awk ' $1 == "200" {print $2*exp(3)}' inputfile
but nothing actually shows

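You are not printing the unmatched lines, and you don't need to quote numbers. The trailing 1 is an always-true pattern whose default action prints every line, modified or not: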
$ awk '$1==200{$2*=exp(3)}1' file
100 2
200 60.2566
300 1
100 5
200 40.1711
300 3

Is there a difference between inputfile.i and inputfile?
Anyway, here is my solution for you:
awk '$1 == 200 {printf "%s %.2f\n",$1,$2*exp(3)};$1 != 200 {print $0}' inputfile.i
100 2
200 60.26
300 1
100 5
200 40.17
300 3
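
Since the question asks for a match against "the parameter I define", here is a minimal sketch that passes the key in with -v instead of hard-coding 200. The function name scale_matching and the two-decimal rounding are assumptions made for illustration; they are not part of either answer above.

```shell
# Hypothetical helper: multiply column 2 by exp(3) on rows whose column 1 equals "$1".
# Reads the table on stdin; every row is printed, changed or not.
scale_matching() {
    awk -v key="$1" '
        $1 == key { $2 = sprintf("%.2f", $2 * exp(3)) }  # rewrite field 2 on a match
        { print }                                        # print every row
    '
}

# Example call with 200 as the key.
printf '100 2\n200 3\n300 1\n' | scale_matching 200
```

Assigning to $2 makes awk rebuild the line with single spaces between fields, which is harmless for this input but would collapse any wider column alignment.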

Related

Handling ties when ranking in bash

Let's say I have a list of numbers that are already sorted as below
100
222
343
423
423
500
What I want is to create a rank field such that same values are assigned the same rank
100 1
222 2
343 3
423 4
423 4
500 5
I have been using the following piece of code to mimic a rank field
awk '{print $0, NR}' file
That gives me the output below, but it's technically a row number.
100 1
222 2
343 3
423 4
423 5
500 6
How do I go about this? I am an absolute beginner in bash, so I would really appreciate it if you could add a little explanation for learning's sake.

That's a job for awk:
$ awk '{if($0!=p)++r;print $0,r;p=$0}' file
Output:
100 1
222 2
343 3
423 4
423 4
500 5
Explained:
$ awk '{ # using awk
if($0!=p) # if the value does not equal the previous value
++r # increase the rank
print $0,r # output value and rank
p=$0 # store value for next round
}' file

Please try the following.
awk 'prev==$0{--count} {print $0,++count;prev=$0}' Input_file
Explanation of the above code:
awk ' ##Start of the awk program.
prev==$0 ##If prev equals the current line, do the following.
{
--count ##Decrement count by 1, cancelling the increment below.
}
{
print $0,++count ##Print the current line and the incremented count.
prev=$0 ##Store the current line in prev for the next comparison.
}
' Input_file ##Name of the input file.

another awk
$ awk '{print $1, a[$1]=a[$1]?a[$1]:++c}' file
100 1
222 2
343 3
423 4
423 4
500 5
Here the file does not need to be sorted; for example, after adding a new 423 at the end of the file:
$ awk '{print $1, a[$1]=a[$1]?a[$1]:++c}' file
100 1
222 2
343 3
423 4
423 4
500 5
423 4
Increment the rank counter c for each new value observed; otherwise use the rank already registered in a for that key. Since c starts at zero, pre-increment it. This assigns the same rank to the same key regardless of its position.
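
The array approach can also be written with an explicit membership test, which some readers find easier to follow than the ternary. This is only a sketch: the names dense_rank, rank and n are made up for illustration, and ranks follow order of first appearance, so they match sorted order only when the first occurrences arrive sorted.

```shell
# Hypothetical helper: rank the first column of stdin by order of first appearance.
dense_rank() {
    awk '
        !($1 in rank) { rank[$1] = ++n }  # first time this value is seen: take the next rank
        { print $1, rank[$1] }            # a repeated value reuses its stored rank
    '
}

# Example call: the late 423 reuses the rank given to the earlier 423 lines.
printf '100\n222\n423\n423\n500\n423\n' | dense_rank
```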

Find header value for first occurrence of "1" instance in column

I have a matrix example:
1 3 5 8 10 12
50 1 1 1 1 1 1
100 0 0 1 1 1 1
150 0 0 1 1 1 1
200 0 0 0 1 1 1
250 0 0 0 0 1 1
300 0 0 0 0 1 1
350 0 0 0 0 0 1
For each row name (50, 100, 150, 200, etc.) I want to know what is the "header" value when the instance "1" first occurs. Based on the example the answer is:
50 1
100 5
150 5
200 8
250 10
300 10
350 12
I am not sure how to play with IFs and WHENs to get my answer from this format. R, Excel, bash, awk, all welcome as solutions.

You can do this using awk as follows:
$ awk 'FNR==1{for(i=1; i<=NF; i++){a[i]=$i}; next} {for(i=2; i<=NF; i++){if($i=="1"){print $1, a[i-1]; break}}} ' file
50 1
100 5
150 5
200 8
250 10
300 10
350 12
Explanation:
For the header line, i.e. FNR==1, we store every field in the array a.
For every following line we look for the first field equal to 1; when found, we print the column 1 value ($1) together with the corresponding header value from a, then break out of the loop. The index is i-1 because the header line has no row-name column.

Awk solution:
awk 'NR==1{ for(i=1;i<=NF;i++) h[i]=$i; next }
{
for(i=2;i<=NF;i++) { if($i==1) { n=h[i-1]; break } }
print $1,(n)?n:"None"; n=""
}' file
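
Both answers above report nothing useful for a row without any 1, or, in the second answer, for a header value of 0, because (n)?n:"None" treats 0 as false. A possible variant, assuming the same layout (header line without a row-name column) and using the hypothetical name first_one_header, sets the fallback before the loop instead:

```shell
# Hypothetical helper: for each data row on stdin, print the row name and the
# header value of the first column containing 1, or "None" when there is none.
first_one_header() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) h[i] = $i; next }  # remember header values
        {
            found = "None"                                     # default when no 1 is present
            for (i = 2; i <= NF; i++)
                if ($i == 1) { found = h[i - 1]; break }       # header is shifted by one column
            print $1, found
        }
    '
}

# Example call: the last row contains no 1 at all.
printf '1 3 5\n50 1 1 1\n100 0 0 1\n150 0 0 0\n' | first_one_header
```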

Arithmetic calculation in shell scripting-bash

I have an input notepad file as shown below:
sample input file:
vegetables and rates
kg rate total
Tomato 4 50 100
potato 2 60 120
Beans 3 80 240
Overalltotal: (100+120++240) = 460
I need to multiply column 2 by column 3 and check whether each row's total is right, and the overall total as well. If something is wrong, the corrected values and an error message must be written back to the same file, as shown below.
sample output file:
vegetables and rates
kg rate vegtotal
Tomato 4 50 200
potato 2 60 120
Beans 3 80 240
Overalltotal: (200+120++240) = 560
Error in calculations:
Vegtotal for tomato is wrong: It should be 200 instead of 100
Overalltotal is wrong: It should be 560 instead of 460
Code so far:
for f in Date*.log; do
awk 'NR>1{ a[$1]=$2*$3 }{ print }END{ printf("\n");
for(i in a)
{ if(a[i]!=$4)
{ print i,"Error in calculations",a[i] }
} }' "$f" > tmpfile && mv tmpfile "$f";
done
It calculates the total but does not compare the values. How can I compare them and print to the same file?

Complex awk solution:
awk 'NF && NR>1 && $0!~/total:/{
r=$2*$3; v=(v!="")? v"+"r : r;
if(r!=$4){ veg_er[$1]=r" instead of "$4 }
err_t+=$4; t+=r; $4=r
}
$0~/total/ && err_t {
print $1,"("v")",$3,t; print "Error in calculations:";
for(i in veg_er) { print "Veg total for "i" is wrong: it should be "veg_er[i] }
print "Overalltotal is wrong: It should be "t" instead of "err_t; next
}1' inputfile
The output:
kg rate total
Tomato 4 50 200
potato 2 60 120
Beans 3 80 240
Overalltotal: (200+120+240) = 560
Error in calculations:
Veg total for Tomato is wrong: it should be 200 instead of 100
Overalltotal is wrong: It should be 560 instead of 460
Details:
NF && NR>1 && $0!~/total:/ - considering veg lines (excluding header and total lines)
r=$2*$3 - the result of product of the 2nd and 3rd fields
v=(v!="")? v"+"r : r - concatenating resulting product values
veg_er - the array containing erroneous vegs info (veg name, erroneous product value, and real product value)
err_t+=$4 - accumulating erroneous total value
t+=r - accumulating real total value
$0~/total/ && err_t - processing total line and error events
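
The per-row comparison at the heart of the details above can be isolated into a much smaller check. The sketch below only reports wrong row totals and does not rewrite the file or the overall total. The name check_totals is hypothetical, and it assumes data rows have exactly four fields with whole-number kg and rate columns.

```shell
# Hypothetical helper: report rows on stdin whose 4th field is not field2 * field3.
check_totals() {
    awk '
        # Only rows shaped like "name kg rate total" with numeric kg and rate;
        # this skips the title, the column header and the Overalltotal line.
        NF == 4 && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            want = $2 * $3
            if (want != $4)
                print "Vegtotal for " $1 " is wrong: It should be " want " instead of " $4
        }
    '
}

# Example call on two rows, the first of which has a wrong total.
printf 'kg rate total\nTomato 4 50 100\npotato 2 60 120\n' | check_totals
```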

Input
akshay@db-3325:/tmp$ cat file
kg rate total
Tomato 4 50 100
potato 2 60 120
Beans 3 80 240
Output
akshay@db-3325:/tmp$ awk 'FNR>1{sum+= $2 * $3 }1;END{print "Total : "sum}' file
kg rate total
Tomato 4 50 100
potato 2 60 120
Beans 3 80 240
Total : 560
Explanation
awk ' # call awk
FNR>1{ # if the line number within the current file is greater than 1,
# i.e. skip the first (header) row
sum+= $2 * $3 # add the product of column 2 and column 3
# to the running total
}1; # the 1 pattern is always true and triggers the default action,
# which prints the current record ( print $0 );
# remove it if the script should print only the total
END{ # end block
print "Total : "sum # print sum
}
' file

Divide column values of different files by a constant then output one minus the other

I have two files of the form
file1:
#fileheader1
0 123
1 456
2 789
3 999
4 112
5 131
6 415
etc.
file2:
#fileheader2
0 442
1 232
2 542
3 559
4 888
5 231
6 322
etc.
How can I take the second column of each file, divide each by its own constant, subtract one result from the other, and write the new values to a third file?
I want the output file to have the form
#outputheader
0 123/c-442/k
1 456/c-232/k
2 789/c-542/k
etc.
where c and k are numbers I can plug into the script
I have seen this question: subtract columns from different files with awk
But I don't know how to use awk to do this by myself. Does anyone know how to do this, or could someone explain what is going on in the linked question so I can try to modify it?

I'd write:
awk -v c=10 -v k=20 ' ;# pass values to awk variables
/^#/ {next} ;# skip headers
FNR==NR {val[$1]=$2; next} ;# store values from file1
$1 in val {print $1, (val[$1]/c - $2/k)} ;# perform the calc and print
' file1 file2
output
0 -9.8
1 34
2 51.8
3 71.95
4 -33.2
5 1.55
6 25.4
etc. 0
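
The answer relies on the FNR==NR idiom: NR counts lines across all files while FNR restarts for each file, so the two are equal only while the first file is being read. A sketch that wraps the same logic in a function and also emits the #outputheader line the question asked for is shown below. The name scaled_diff and its argument order are assumptions, and the idiom misbehaves if the first file is empty.

```shell
# Hypothetical helper. Usage: scaled_diff C K FILE1 FILE2 > FILE3
scaled_diff() {
    awk -v c="$1" -v k="$2" '
        BEGIN { print "#outputheader" }               # header line for the new file
        /^#/ { next }                                 # skip header lines in both inputs
        FNR == NR { val[$1] = $2; next }              # first file: store column 2 by key
        $1 in val { print $1, val[$1] / c - $2 / k }  # second file: combine matching keys
    ' "$3" "$4"
}
```

A call such as scaled_diff 10 20 file1 file2 > file3 would then write the result to a third file.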

Search for a value in a file and remove subsequent lines

I'm developing a shell script but I am stuck with the below part.
I have the file sample.txt:
S.No Sub1 Sub2
1 100 200
2 100 200
3 100 200
4 100 200
5 100 200
6 100 200
7 100 200
I want to search the S.No column in sample.txt. For example, if I search for the value 5, I need only the rows up to and including 5; I don't want the rows after it.
The output, output.txt, must look like this:
S.No Sub1 Sub2
1 100 200
2 100 200
3 100 200
4 100 200
5 100 200

Print the first line and any other line where the first field is less than or equal to 5:
$ awk 'NR==1||$1<=5' file
S.No Sub1 Sub2
1 100 200
2 100 200
3 100 200
4 100 200
5 100 200

Using perl:
perl -ane 'print if $F[0]<=5' file

And the sed solution
n=5
sed "/^$n[[:space:]]/q" filename
The sed q command exits after printing the current line

The suggested awk relies on column 1 being numerically sorted. A generic awk that fulfills the question title would be:
gawk -v p=5 '$1==p {print; exit} {print}' sample.txt
However, in this situation, sed is better IMO. Use sed -i to modify the input file in place.

sed '6q' sample.txt > output.txt
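
The answers fall into two groups: filtering by value ($1<=5) and stopping at the first match (sed q, awk exit). They agree only when S.No is sorted. A minimal sketch of the stop-at-match behaviour with the search value as a parameter follows; the name rows_up_to is hypothetical.

```shell
# Hypothetical helper. Usage: rows_up_to VALUE < sample.txt > output.txt
rows_up_to() {
    awk -v n="$1" '
        { print }                   # emit the header and every row as it is read
        NR > 1 && $1 == n { exit }  # stop right after the first matching data row
    '
}

# Example call on unsorted input: the row numbered 1 after the match is dropped.
printf 'S.No Sub1 Sub2\n3 100 200\n5 100 200\n1 100 200\n' | rows_up_to 5
```

With the unsorted input above, a value filter such as $1<=5 would keep all three data rows, while this version stops after the row numbered 5.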
