How to color the output of awk depending on a condition - shell

I have an input file test.txt containing the following:
a 1 34
f 2 1
t 3 16
g 4 11
j 5 16
I use awk to print only fields 2 and 3:
awk '{print $2 " " $3}' test.txt
Is there a way to color only the second field of my output depending on a condition: if the value is higher than 15, print it in orange; if it is higher than 20, print it in red? The output would be the same, but colored:
1 34(red)
2 1
3 16(orange)
4 11
5 16(orange)
The input could contain many more lines in a different order.

This awk command should do what you want:
awk -v red="$(tput setaf 1)" -v yellow="$(tput setaf 3)" -v reset="$(tput sgr0)" '{printf "%s"OFS"%s%s%s\n", $2, ($3>20)?red:($3>15?yellow:""), $3, reset}' test.txt
The key bits here are:
the use of tput to get the correct color-setting sequence for the current terminal (as opposed to hard-coding a specific escape sequence)
the use of -v to set the values of the variables the awk command uses to construct its output
The above script is terse; it could be written more verbosely like this:
{
    printf "%s"OFS, $2
    if ($3 > 20) {
        printf "%s", red
    } else if ($3 > 15) {
        printf "%s", yellow
    }
    printf "%s%s\n", $3, reset
}
Edit: Ed Morton correctly points out that the awk programs above could be simplified by using a color variable and separating the color choice from the printing. Like this:
awk -v red="$(tput setaf 1)" -v yellow="$(tput setaf 3)" -v reset="$(tput sgr0)" \
'{
    if ($3>20) color=red; else if ($3>15) color=yellow; else color=""
    printf "%s %s%s%s\n", $2, color, $3, reset
}' test.txt
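One thing the tput approach leaves open is what happens when the output is redirected to a file or a pipe, where escape sequences are usually unwanted. A minimal sketch of one way to handle that, assuming plain output is preferred whenever stdout is not a terminal. The function name print_colored is made up for illustration and is not part of the answer above.

```shell
# print_colored (hypothetical name): print fields 2 and 3, coloring field 3.
# ASSUMPTION: when stdout is not a terminal, or tput cannot supply a
# sequence, the output should fall back to plain text.
print_colored() {
    red= yellow= reset=
    if [ -t 1 ]; then
        red=$(tput setaf 1 2>/dev/null) || red=
        yellow=$(tput setaf 3 2>/dev/null) || yellow=
        reset=$(tput sgr0 2>/dev/null) || reset=
    fi
    awk -v red="$red" -v yellow="$yellow" -v reset="$reset" '
    {
        color = ($3 > 20) ? red : ($3 > 15 ? yellow : "")
        # Emit the reset sequence only when a color was actually set.
        printf "%s %s%s%s\n", $2, color, $3, (color == "" ? "" : reset)
    }' "$@"
}

# Usage: reads the named files, or stdin when none are given.
printf 'a 1 34\nf 2 1\nt 3 16\n' | print_colored
```

With this shape, `print_colored test.txt` colors the values on a terminal, while `print_colored test.txt > out.txt` writes plain text.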

You can do the following:
awk '{printf "%s ", $2}
$3 > 15 {printf "\033[33m"}
$3 > 20 {printf "\033[31m"}
{printf "%s\033[0m\n", $3}' test.txt
Unfortunately, I don't know the ANSI escape code for orange, so yellow (33) is used here.
Another approach:
awk -v color1="\033[33m" -v color2="\033[31m" -v reset="\033[0m" '
$3 > 15 && $3 <= 20 {$3=color1 $3 reset}
$3 > 20 {$3=color2 $3 reset}
{print $2, $3}' test.txt
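On the orange question: the basic 8-color ANSI palette has no orange, which is why yellow (33) stands in for it in the commands above. A minimal sketch of an alternative, assuming the terminal supports 256-color sequences, where index 208 is an orange shade in the common xterm palette. The function name colorize_third is made up for illustration.

```shell
# colorize_third (hypothetical name): print fields 2 and 3, coloring field 3.
# ASSUMPTION: the terminal understands 256-color SGR sequences
# (ESC[38;5;Nm); 208 is an orange shade in the usual xterm palette.
colorize_third() {
    awk -v orange='\033[38;5;208m' -v red='\033[31m' -v reset='\033[0m' '
    {
        color = ""
        if ($3 > 20)      color = red
        else if ($3 > 15) color = orange
        # Emit the reset sequence only when a color was actually set.
        printf "%s %s%s%s\n", $2, color, $3, (color == "" ? "" : reset)
    }' "$@"
}

printf 'a 1 34\nf 2 1\nt 3 16\n' | colorize_third
```

On terminals limited to 8 colors the orange sequence may be ignored or misrendered, so yellow remains the safer default.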

awk '{if($2>15 && $2<=20){$2="\033[1;33m" $2 "\033[0m"};if($2>20){$2="\033[1;31m" $2 "\033[0m"};print}' file
Breakdown:
if($2>15 && $2<=20){$2="\033[1;33m" $2 "\033[0m"} # if field 2 is >15 and <=20, wrap it in the yellow color code
if($2>20){$2="\033[1;31m" $2 "\033[0m"} # if field 2 is >20, wrap it in the red color code
print # print the (possibly modified) line

Related

Linux - loop through each element on each line

I have a text file with the following information:
cat test.txt
a,e,c,d,e,f,g,h
d,A,e,f,g,h
I wish to iterate through each line and, for each line, print the index of every character that differs from e. So the ideal output would use either a tab separator or a comma separator:
1 3 4 6 7 8
1 2 4 5 6
or
1,3,4,6,7,8
1,2,4,5,6
I have managed to iterate through each line and print the index, but the results are all printed on the same line and not separated.
while read line;do echo "$line" | awk -F, -v ORS=' ' '{for(i=1;i<=NF;i++) if($i!="e") {print i}}' ;done<test.txt
With the result being
1 3 4 6 7 8 1 2 4 5 6
If I do it only using awk
awk -F, -v ORS=' ' '{for(i=1;i<=NF;i++) if($i!="e") {print i}}'
I get the same output.
Could anyone help with this specific issue of separating the lines?
If you don't mind some trailing whitespace, you can just do:
while read line;do echo "$line" | awk -F, '{for(i=1;i<=NF;i++) if($i!="e") {printf i " "}; print ""}' ;done<test.txt
but it would be more typical to omit the while loop and do:
awk -F, '{for(i=1;i<=NF;i++) if($i!="e") {printf i " "}; print ""}' <test.txt
You can avoid the trailing whitespace with the slightly cryptic:
awk -F, '{m=0; for(i=1;i<=NF;i++) if($i!="e") {printf "%c%d", m++ ? " " : "", i }; print ""}' <test.txt
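For the comma-separated variant of the desired output, the same idea can be written by building each output line in a variable and printing it once per record. A minimal sketch; the function name indexes_not_matching and the variables skip and sep are made up for illustration.

```shell
# indexes_not_matching (hypothetical name): for each comma-separated input
# line, print the positions of the fields that differ from the value in $1.
indexes_not_matching() {
    awk -F, -v skip="$1" -v sep="," '
    {
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i != skip)
                # Prepend the separator only after the first index.
                out = out (out == "" ? "" : sep) i
        print out
    }'
}

printf 'a,e,c,d,e,f,g,h\nd,A,e,f,g,h\n' | indexes_not_matching e
```

Changing sep to "\t" gives the tab-separated form, and because the line is printed once per record there is no trailing separator to clean up.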

awk: sort file based on user input

I have this simple awk code:
awk -F, 'BEGIN{OFS=FS} {print $2,$1,$3}' $1
Works great, except I've hardcoded how I want to sort the comma-delimited fields of my plaintext file. I want to be able to specify at run time in which order I'd like to sort my fields.
One hacky way I thought about doing this was this:
read first
read second
read third
TOTAL=$first","$second","$third
awk -F, 'BEGIN{OFS=FS} {print $TOTAL}' $1
But this doesn't actually work:
awk: illegal field $(), name "TOTAL"
Also, I know a bit about awk's ability to accept user input:
BEGIN {
getline first < "-"
}
$1 == first {
}
But I wonder whether the variables created can in turn be used as variables in the original print command? Is there a better way?
You have to let bash expand $TOTAL before awk is called, so that awk sees the value of $TOTAL, not the literal string $TOTAL. This means using double, not single, quotes.
read first
read second
read third
# Dynamically construct the awk script to run
TOTAL="\$$first,\$$second,\$$third"
SCRIPT="BEGIN{OFS=FS} {print $TOTAL}"
awk -F, "$SCRIPT" "$1"
A safer method is to pass the field numbers as awk variables.
awk -F, -v c1="$first" -v c2="$second" -v c3="$third" 'BEGIN{OFS=FS} {print $c1, $c2, $c3}' "$1"
All you need is:
awk -v order='3 1 2' 'BEGIN{split(order,o)} {for (i=1;i<=NF;i++) printf "%s%s", $(o[i]), (i<NF?OFS:ORS)}'
e.g.:
$ echo 'a b c' | awk -v order='3 1 2' 'BEGIN{split(order,o)} {for (i=1;i<=NF;i++) printf "%s%s", $(o[i]), (i<NF?OFS:ORS)}'
c a b
$ echo 'a b c' | awk -v order='2 3 1' 'BEGIN{split(order,o)} {for (i=1;i<=NF;i++) printf "%s%s", $(o[i]), (i<NF?OFS:ORS)}'
b c a
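A variation on the same idea for the comma-delimited file from the question: loop over the number of entries in the requested order rather than over NF, so that selecting only some of the fields also works. A minimal sketch; the function name reorder_fields is made up for illustration.

```shell
# reorder_fields (hypothetical name): print the fields of comma-separated
# input in the order given by $1 (space-separated field numbers).
# Remaining arguments are input files; stdin is read when none are given.
reorder_fields() {
    order=$1
    shift
    awk -F, -v order="$order" '
    BEGIN { OFS = FS; n = split(order, o, " ") }
    {
        # Loop over the requested positions, not over NF.
        for (i = 1; i <= n; i++)
            printf "%s%s", $(o[i]), (i < n ? OFS : ORS)
    }' "$@"
}

printf 'a,b,c\n' | reorder_fields "2 1 3"
```

The order could come from read, as in the question, for example `read order; reorder_fields "$order" "$1"`.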

using sed, awk, or sort for csv manipulation

I have a csv file that needs a lot of manipulation. Maybe by using awk and sed?
input:
"Sequence","Fat","Protein","Lactose","Other Solids","MUN","SCC","Batch Name"
1,4.29,3.3,4.69,5.6,11,75,"35361305a"
2,5.87,3.58,4.41,5.32,10.9,178,"35361305a"
3,4.01,3.75,4.75,5.66,12.2,35,"35361305a"
4,6.43,3.61,3.56,4.41,9.6,275,"35361305a"
final output:
43330075995647
59360178995344
40380035995748
64360275964436
I'm able to get through some of it going step by step.
How do I test specific columns for a value over 9.9 and replace it with 9.9 ?
Also, is there a way to combine any of these steps?
remove first line:
tail -n +2 test.csv > test1.txt
remove commas:
sed 's/,/ /g' test1.txt > test2.txt
remove quotes:
sed 's/"//g' test2.txt > test3.txt
remove columns 1 and 8 and
reorder remaining columns as 1,2,6,5,4,3:
sort test3.txt | uniq -c | awk '{print $3 "\t" $4 "\t" $8 "\t" $7 "\t" $6 "\t" $5}' > test4.txt
test new columns 1,2,4,5,6 - if the value is over 9.9, replace it with 9.9
How should I do this step?
The solution for the following parts was found in a previous question (reformatting a text file):
columns 1,2,4,5,6 round decimals to tenths
column 3 needs to be four characters long, using zero to left fill
remove periods and spaces
awk '{$0=sprintf("%.1f%.1f%4s%.1f%.1f%.1f", $1,$2,$3,$4,$5,$6);gsub(/ /,"0");gsub(/\./,"")}1' test5.txt > test6.txt
This produces the output you want from the original file. Note that in the question you specified "column 4 round to whole number", but in the desired output you had rounded it to one decimal place instead:
awk -F'[,"]+' 'function m(x) { return x < 9.9 ? x : 9.9 }
NR > 1 {
s = sprintf("%.1f%.1f%04d%.1f%.1f%.1f", m($2),m($3),$7,m($6),m($5),m($4))
gsub(/\./, "", s)
print s
}' test.csv
I have specified the field separator as any number of commas and double quotes together, so this "parses" your CSV format for you without requiring any additional steps.
The function m returns the minimum of 9.9 and the number you pass to it.
Output:
43330075995647
59360178995344
40380035995748
64360275964436
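For the narrower question of capping only specific columns at 9.9 as a step of its own, the same minimum idea can be applied to a list of column numbers. A minimal sketch, assuming whitespace-separated input such as the asker's intermediate files; the function name cap_columns is made up for illustration.

```shell
# cap_columns (hypothetical name): replace values above a limit in the
# listed columns. $1 is the limit, $2 a space-separated list of columns.
# ASSUMPTION: input is whitespace-separated and read from stdin.
cap_columns() {
    awk -v limit="$1" -v cols="$2" '
    BEGIN { n = split(cols, c, " ") }
    {
        for (k = 1; k <= n; k++)
            # "+ 0" forces a numeric comparison.
            if ($(c[k]) + 0 > limit + 0)
                $(c[k]) = limit
        print
    }'
}

printf '4.29 3.3 75 11 5.6 4.69\n' | cap_columns 9.9 "1 2 4 5 6"
```

Column 3 is left out of the list here because it is the zero-padded four-character field and must not be capped.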
The first three steps in one go:
awk -F, '{gsub(/"/,"");$1=$1} NR>1' test.csv
1 4.29 3.3 4.69 5.6 11 75 35361305a
2 5.87 3.58 4.41 5.32 10.9 178 35361305a
3 4.01 3.75 4.75 5.66 12.2 35 35361305a
4 6.43 3.61 3.56 4.41 9.6 275 35361305a
tail -n +2 file | sort -u | awk -F , '
{
$0 = $1 FS $2 FS $6 FS $5 FS $4 FS $3
for (i = 1; i <= 6; ++i)
if ($i > 9.9)
$i = 9.9
$0 = sprintf("%.1f%.1f%4s%.0f%.1f%.1f", $1, $2, $3, $4, $5, $6)
gsub(/ /, "0"); gsub(/[.]/, "")
print
}
'
Or
< file awk -F , '
NR > 1 {
$0 = $1 FS $2 FS $6 FS $5 FS $4 FS $3
for (i = 1; i <= 6; ++i)
if ($i > 9.9)
$i = 9.9
$0 = sprintf("%.1f%.1f%4s%.0f%.1f%.1f", $1, $2, $3, $4, $5, $6)
gsub(/ /, "0"); gsub(/[.]/, "")
print
}
'
Output:
104309964733
205909954436
304009964838
406409643636

Multiple condition in nawk command

I have a nawk command that needs to format the data based on its length. I always need to keep the first 6 digits and the last 4 digits and replace the middle with x characters. Can you help fine-tune the script below?
#!/bin/bash
FILES=/export/home/input.txt
cat $FILES | nawk -F '|' '{
    if (length($3) >= 13)
        print $1 "|" $2 "|" substr($3,1,6) "xxxxxx" substr($3,13,4) "|" $4 "|" $5
    else
        print $1 "|" $2 "|" $3 "|" $4 "|" $5
}' > output.txt
input.txt
"2"|"X"|"A"|"ST"|"245552544555201"|"1111-11-11"|75.00
"6"|"Y"|"D"|"VT"|"245652544555200"|"1111-11-11"|95.00
"5"|"X"|"G"|"ST"|"3445625445552023"|"1111-11-11"|75.00
"3"|"Y"|"S"|"VT"|"24532254455524"|"1111-11-11"|95.00
output.txt
"X"|"ST"|"245552544555201"|"245552xxxxx5201"
"Y"|"VT"|"245652544555200"|"245652xxxxx5200"
"X"|"ST"|"3445625445552023"|"344562xxxxxx2023"
"Y"|"VT"|"24532254455524"|"245322xxxx5524"
Try this:
$ awk '
BEGIN {FS = OFS = "|"}
length($5)>=13 {
fld5=$5
start = substr($5,1,7)
end = substr($5,length($5)-4)
gsub(/./,"x",fld5)
sub(/^......./,start,fld5)
sub(/.....$/,end,fld5)
$1=$2; $2=$4; $3=$5; $4=fld5; NF-=3;
}1' file
"X"|"ST"|"245552544555201"|"245552xxxxx5201"
"Y"|"VT"|"245652544555200"|"245652xxxxx5200"
"X"|"ST"|"3445625445552023"|"344562xxxxxx2023"
"Y"|"VT"|"24532254455524"|"245322xxxx5524"

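The masking step from the nawk question can also be looked at on its own: keep the first 6 and last 4 characters and replace whatever lies between with one x per character. A minimal sketch, assuming unquoted values, one per line, and the 13-character threshold from the question; the function name mask_middle is made up for illustration.

```shell
# mask_middle (hypothetical name): keep the first 6 and last 4 characters
# of each line and replace everything in between with "x".
# ASSUMPTION: values are unquoted, one per line; lines shorter than 13
# characters are passed through unchanged, as in the question.
mask_middle() {
    awk -v head=6 -v tail=4 '
    {
        n = length($0)
        if (n < 13) { print; next }
        mid = ""
        # One "x" for each character between the kept head and tail.
        for (i = head + tail; i < n; i++)
            mid = mid "x"
        print substr($0, 1, head) mid substr($0, n - tail + 1)
    }'
}

printf '245552544555201\n3445625445552023\n24532254455524\n123\n' | mask_middle
```

Because the number of x characters is derived from the length, the same code handles 14-, 15- and 16-digit values without separate branches.
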
set shell variable in awk and reuse

How can I pass a shell variable to awk, set it, use it in another awk in same line and print it?
I want to save $0 (all fields) into a variable first, parse $6 (ABC 123456M123000) to get '123000', do a range check on it, and, if it passes, print all fields ($0).
part 1: I am trying to do:
line="hello"
java class .... | awk -F, -v '{line=$0}' | awk 'begin my range check code' | if(p>100) print $line }
part2:
$6="ABC 123456M123000" ( string that I will parse)
Once I store all fields into a variable, I can parse $6 using this:
awk 'begin {FS=" "} { print $2; len=length($2); p=substr($2,8,len)+0 ; print len,p ; if(p>100) print $line }'
But my question is in part1: how to store $0 into a variable so that after my check is done, I can print them?
It's not clear why you need multiple invocations of awk. From your description, it looks like you are just trying to do:
... | awk -F, '{split( $6, f, "M" )} f[2] > min' min=100
or, if you can't split on 'M' but need to use substr (or some other method to extract the desired value):
... | awk -F, '{ split( $6, f, " " )} 0+substr( f[2], 8 ) > min' min=100
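The point behind both one-liners is that $0 does not need to be saved anywhere: it stays available for the whole rule, so the record can be parsed and printed in a single awk invocation. A minimal sketch written out in long form, assuming the sixth comma-separated field always looks like "ABC 123456M123000"; the function name filter_by_suffix is made up for illustration.

```shell
# filter_by_suffix (hypothetical name): print records whose numeric suffix
# in field 6 exceeds the minimum given in $1.
# ASSUMPTION: field 6 has the form "ABC 123456M123000", so the number
# starts at character 8 of its second word.
filter_by_suffix() {
    awk -F, -v min="$1" '
    {
        split($6, parts, " ")
        p = substr(parts[2], 8) + 0   # "+ 0" converts the text to a number
        if (p > min)
            print                     # print with no arguments prints $0
    }'
}

printf 'a,b,c,d,e,ABC 123456M123000\na,b,c,d,e,ABC 123456M000050\n' | filter_by_suffix 100
```

Only the first record is printed here, since the second one yields 50.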
With the shell:
java ... | while IFS= read -r line ; do
sixth=$(IFS=,; set -- $line; echo "$6")
val=${sixth:11}
(( $val > 100 )) && echo "$line"
done
There are some bash-isms there (the ${sixth:11} substring expansion and the (( )) arithmetic test).
