Awk one liners into script - shell

I have some awk one-liners. How can I turn these three lines into a script?
awk -F":|," 'FNR==NR && /INFO - AId:/ {a[$6$8]=$0;next} END {for (i in a) print i "|" a[i]}' log >t1
awk '/<?xml version/ {f=1} /<\/iSig>/ {f=0;print $0 "\n" } f' log >t2
awk -F\| 'FNR==NR {a[$1]=$2;next} FNR==1 {RS="\n\n"} { for (i in a) {if ($0~i) {print a[i] $0 > i".log";close(i".log")}}}' t1 t2
Thanks for helping!

How can I turn these three lines into a script?
By learning awk! The best place to start is by reading Effective Awk Programming.

$ cat > myscript.sh <<'EOF'
#!/bin/sh
# 'EOF' is quoted above so that $1, $0, $6$8 etc. are written to the file unexpanded.
awk -F":|," 'FNR==NR && /INFO - AId:/ {a[$6$8]=$0;next} END {for (i in a) print i "|" a[i]}' log > "$1"
awk '/<?xml version/ {f=1} /<\/iSig>/ {f=0;print $0 "\n" } f' log > "$2"
awk -F\| 'FNR==NR {a[$1]=$2;next} FNR==1 {RS="\n\n"} { for (i in a) {if ($0~i) {print a[i] $0 > i".log";close(i".log")}}}' "$1" "$2"
EOF
$ chmod +x myscript.sh
$ ./myscript.sh file1 file2
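One thing to watch when creating a script through a here-document: whether the delimiter is quoted decides if the shell expands $1, $0 and friends while it writes the file. A minimal sketch of the difference follows; demo_dir, the two script names and the sample input are made up for illustration.

```shell
# demo_dir and the file names below are hypothetical, for illustration only.
demo_dir=$(mktemp -d)

# Unquoted delimiter: the shell that writes the file expands $2 and $1
# right away (usually to empty strings), so the awk program is mangled.
cat > "$demo_dir/unquoted.sh" <<EOF
awk '{ print $2 }' "$1"
EOF

# Quoted delimiter: the text is written verbatim, so awk still sees $2
# and the generated script still sees its own first argument as $1.
cat > "$demo_dir/quoted.sh" <<'EOF'
awk '{ print $2 }' "$1"
EOF

printf 'a b\nc d\n' > "$demo_dir/in.txt"

cat "$demo_dir/unquoted.sh"                   # the $2 and $1 are gone
cat "$demo_dir/quoted.sh"                     # the line is intact
sh "$demo_dir/quoted.sh" "$demo_dir/in.txt"   # prints the second column
```

Remove the temporary directory with rm -r "$demo_dir" when you are done.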

In bash, you could use process substitution to skip the intermediate files t1 and t2:
awk -F\| 'FNR==NR {a[$1]=$2;next} FNR==1 {RS="\n\n"} { for (i in a) {if ($0~i) {print a[i] $0 > i".log";close(i".log")}}}' <(awk -F":|," 'FNR==NR && /INFO - AId:/ {a[$6$8]=$0;next} END {for (i in a) print i "|" a[i]}' log) <(awk '/<?xml version/ {f=1} /<\/iSig>/ {f=0;print $0 "\n" } f' log)

Related

How to avoid generating intermediate files in bash script

I would like to know if it is possible to change the following script, such that "intermediate.tmp" is not generated as output:
To call the script on the command line:
./script.sh file1 file2
script.sh:
#!/bin/bash
FILE_1=$1
FILE_2=$2
awk '{print $1,$2}' $FILE_1 > intermediate.tmp
awk 'NR==FNR {h[$1] = $0; next} {print $0,h[$1]}' intermediate.tmp $FILE_2 > output.file
The awk scripts are not really important per se. I just want to know how to "feed" intermediate.tmp into the second awk command without generating an intermediate.tmp output file in addition to the desired output.file.
Thanks.
awk 'NR==FNR {h[$1] = $1 OFS $2; next} {print $0,h[$1]}' "$FILE_1" "$FILE_2" > output.file
or less sensibly:
awk '{print $1,$2}' "$FILE_1" |
awk 'NR==FNR {h[$1] = $0; next} {print $0,h[$1]}' - "$FILE_2" > output.file
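To see that both variants give the same result without a temporary file in between, here is a small sketch on made-up data. The work directory, the sample contents and the function names one_pass and piped are assumptions for the demo, not part of the original script.

```shell
# Hypothetical sample data; names and contents are invented for the demo.
work=$(mktemp -d)
printf 'a 1 x\nb 2 y\n' > "$work/file1"
printf 'a foo\nc bar\n' > "$work/file2"

# Single awk: build the lookup from the first two columns of file 1 directly.
one_pass() {
    awk 'NR==FNR {h[$1] = $1 OFS $2; next} {print $0, h[$1]}' "$1" "$2"
}

# Pipe version: "-" tells the second awk to read the piped data as its
# first input, so no intermediate file is written.
piped() {
    awk '{print $1, $2}' "$1" |
    awk 'NR==FNR {h[$1] = $0; next} {print $0, h[$1]}' - "$2"
}

one_pass "$work/file1" "$work/file2"
piped "$work/file1" "$work/file2"
```

Keys that are missing from the first file (c in this sample) are printed with an empty lookup value, exactly as in the original two-step script.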

Awk multiple search terms with a variable and negation

I have a little test file containing:
awk this
and not awk this
but awk this
so do awk this
And I've tried the following awk commands, in bash, but each produces no output:
f=awk; awk '/$f/ && !/not/' test.txt
f=awk; awk '/\$f/ && !/not/' test.txt
f=awk; awk '/"$f"/ && !/not/' test.txt
f=awk; awk -v f="$f" '/f/ && !/not/' test.txt
Using double quotes (") produces an "event not found" error in the shell because of the !.
How can I search on a variable and negate another string in the same command?
Use awk like this:
f='awk'
awk -v f="$f" -v n='not' '$0 ~ f && $0 !~ n' file
awk this
but awk this
so do awk this
Or if you don't want to pass n='not' to awk:
awk -v f="$f" '$0 ~ f && $0 !~ /not/' file
awk this
but awk this
so do awk this
awk points to gawk for me and the following worked just fine:
awk -vf=awk '$0 ~ f && !/not/' file
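The reason the -v attempt in the question printed nothing is that /f/ is a regex literal looking for the letter f, not for the variable's value. A short sketch on the question's test data; the variable names sample, literal, dynamic and plain are made up here, and the index() variant is an extra option for when no regex semantics are wanted.

```shell
# Recreate the question's test file in a temporary location.
sample=$(mktemp)
printf '%s\n' 'awk this' 'and not awk this' 'but awk this' 'so do awk this' > "$sample"

# /f/ is a regex literal: it searches for the letter "f", which none of
# the sample lines contain, so nothing is printed.
literal=$(awk -v f="awk" '/f/ && !/not/' "$sample")

# $0 ~ f uses the contents of the variable f as a dynamic regex.
dynamic=$(awk -v f="awk" '$0 ~ f && !/not/' "$sample")

# index() does a plain substring test instead of a regex match.
plain=$(awk -v f="awk" 'index($0, f) && !index($0, "not")' "$sample")

printf 'literal: [%s]\n' "$literal"
printf 'dynamic:\n%s\n' "$dynamic"
printf 'plain:\n%s\n' "$plain"
```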

How to print the remaining columns using awk?

Right now I have a command that prints my log file with the columns delimited by |.
cat ambari-alerts.log | awk -F '[ ]' '{print $1 "|" $2 "|" $3 "|" $4 "|" $5 "|"}' |
grep "$(date +"%Y-%m-%d")"
Sample of the log file data is this:
2016-02-11 09:40:33,875 [OK] [MAPREDUCE2] [mapreduce_history_server_rpc_latency] (History Server RPC Latency) Average Queue Time:[0.0], Average Processing Time:[0.0]
The result of my command is this:
2016-02-11|09:40:33,875|[OK]|[MAPREDUCE2]|[mapreduce_history_server_rpc_latency]
I want to print the remaining columns. How can I do that? I tried this syntax adding $0, but unfortunately it just prints the whole line again.
awk -F '[ ]' '{print $1 "|" $2 "|" $3 "|" $4 "|" $5 "|" $0}'
Hope you can help me, newbie here in using awk.
This seems to be all you need:
$ awk '{for (i=1;i<=5;i++) sub(/ /,"|")} 1' file
2016-02-11|09:40:33,875|[OK]|[MAPREDUCE2]|[mapreduce_history_server_rpc_latency]|(History Server RPC Latency) Average Queue Time:[0.0], Average Processing Time:[0.0]
This is a bit of a hassle with awk
awk -F '[ ]' '{
printf "%s|%s|%s|%s|%s|", $1, $2, $3, $4, $5
for (i=6; i<=NF; i++) printf "%s ", $i
print ""
}'
or, replace the first 5 spaces:
awk -F '[ ]' '{
sub(/ /, "|");sub(/ /, "|");sub(/ /, "|");sub(/ /, "|");sub(/ /, "|")
print
}'
This is actually easier in bash
while IFS=" " read -r a b c d e rest; do
echo "$a|$b|$c|$d|$e|$rest"
done < file.log
Folding in your grep:
awk -F '[ ]' -v date="$(date +%Y-%m-%d)" '
    $0 ~ date {
        printf "%s|%s|%s|%s|%s|", $1, $2, $3, $4, $5
        for (i=6; i<=NF; i++) printf "%s ", $i
        print ""
    }
' ambari-alerts.log
Here is some awk that provides a somewhat more generalized approach than brute-forcing the first 5 columns:
awk '{
    for (i = 1; i < 6; i++)
        printf "%s|", $i
    for (i = 6; i <= NF; i++)
        printf "%s%s", $i, (i < NF ? " " : "")
    print ""
}' ambari-alerts.log | grep "$(date +"%Y-%m-%d")"
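The sub() idea and the date filter can also be combined in one awk call. This is only a sketch: the function name pipe_first_columns and its argument order are made up, and it assumes the date always sits at the very start of the line, as in the sample.

```shell
# Hypothetical wrapper: on lines that start with the given date, turn the
# first n spaces into "|" and leave the rest of the line untouched.
pipe_first_columns() {
    # $1 = date prefix (YYYY-MM-DD), $2 = number of columns, $3 = log file
    awk -v d="$1" -v n="$2" '
        index($0, d) == 1 {
            for (i = 1; i <= n; i++) sub(/ /, "|")
            print
        }
    ' "$3"
}

# Demo on the sample line from the question plus one line from another day.
log=$(mktemp)
printf '%s\n' \
    '2016-02-11 09:40:33,875 [OK] [MAPREDUCE2] [mapreduce_history_server_rpc_latency] (History Server RPC Latency) Average Queue Time:[0.0], Average Processing Time:[0.0]' \
    '2016-02-10 08:00:00,000 [OK] [HDFS] [some_check] (Some Check) fine' > "$log"

pipe_first_columns 2016-02-11 5 "$log"
```

For today's entries you would call it as pipe_first_columns "$(date +%Y-%m-%d)" 5 ambari-alerts.log.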

awk working with intervals

I have this file
goodtime 20:30 21:40
badtime 19:52 24:00
and when I enter for example 21:00 and 21:15 I should get goodtime
So here's my script
#!/bin/sh
last > duom.txt
grep -F 'stud.if.ktu.lt' duom.txt > ktu.txt
echo "Enter the time interval "
read h
read min
read h2
read min2
awk '{if ($2 ~ /$h.$m/ && $3 ~ /$h2.$min2/) print $1}' data.txt
But I don't get any results.
The problem with this:
awk '{if ($2 ~ /$h.$m/ && $3 ~ /$h2.$min2/) print $1}' data.txt
is that you're trying to use shell variables inside a single-quoted string, where the shell never expands them (and $m is never set anyway; the script reads min). You need to pass the shell variables into awk with its -v option:
awk -v patt1="$h.$min" -v patt2="$h2.$min2" '
$2 ~ patt1 && $3 ~ patt2 {print $1}
' data.txt
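But, given your sample input, this will not match anything: it tests whether the stored times match the entered times as regular expressions, not whether the entered times fall inside the stored interval.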
Until your requirements are clarified, I can't help with the logic.
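For what it's worth, here is a sketch under one possible reading of the requirement: print every name whose stored interval fully contains the entered interval. That reading is an assumption, not something the question confirms, and the function name intervals_containing is made up. Note that with the sample data both rows contain 21:00 to 21:15, so if only goodtime is expected there must be an extra rule that has not been stated.

```shell
# Hypothetical helper, ASSUMING "match" means the stored interval fully
# contains the entered one. Times are compared as minutes since midnight.
intervals_containing() {
    # $1 = entered start (HH:MM), $2 = entered end (HH:MM), $3 = data file
    awk -v qs="$1" -v qe="$2" '
        function mins(t,   p) { split(t, p, ":"); return p[1] * 60 + p[2] }
        mins($2) <= mins(qs) && mins(qe) <= mins($3) { print $1 }
    ' "$3"
}

# Demo on the sample data from the question.
data=$(mktemp)
printf 'goodtime 20:30 21:40\nbadtime 19:52 24:00\n' > "$data"

intervals_containing 21:00 21:15 "$data"   # both rows contain this interval
intervals_containing 20:00 20:10 "$data"   # only the 19:52 to 24:00 row does
```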

How to use for i loop in search pattern in awk

I am trying to count strings containing a number at the end in a large data file, and for this use the "for i loop" to search all of them consecutively. Here is my code:
#!/bin/bash
for (( i=2; i<=253; i++ ))
do
awk -F "\t" '$3 ~ /^names.i$/ {++c} END {print c}' myfile >> output.txt
done
For some reason, although running awk on its own gives the right output, the script produces only empty lines. What am I doing wrong?
Just do the whole thing in 1 awk invocation:
awk -F '\t' '
{ split($3,arr,/\./); ++c[arr[2]] }
END { for (i=2;i <= 253;i++) print c[i]+0 }
' myfile > output.txt
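You can't use the shell variable i directly inside the single-quoted awk program like that. Pass it to awk with -v and build the regex from a string. Printing c+0 also gives 0 instead of an empty line when nothing matches, which is where the blank output comes from:
for (( i=2; i<=253; i++ ))
do
awk -v i="$i" -F "\t" '$3 ~ ("^names\\." i "$") {++c} END {print c+0}' myfile >> output.txt
done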
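Or loop inside awk, building the regex from i (inside /.../ the i is just a literal letter) and keeping one counter per suffix:
awk -F "\t" '{for (i=2;i<=253;i++) if ($3 ~ ("^names\\." i "$")) ++c[i]} END {for (i=2;i<=253;i++) print c[i]+0}' myfile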
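A small worked example of the single-invocation approach on made-up data may help. It assumes the third tab-separated column looks like names.N; the function name count_suffixes, the range arguments and the sample rows are hypothetical, and unlike the answer above it prints the suffix next to each count for readability.

```shell
# Hypothetical helper: count rows whose third tab-separated column is
# names.N, for every N in the range lo..hi, in a single pass over the file.
count_suffixes() {
    # $1 = file, $2 = lowest suffix, $3 = highest suffix
    awk -F '\t' -v lo="$2" -v hi="$3" '
        $3 ~ /^names\.[0-9]+$/ { split($3, parts, /\./); ++c[parts[2] + 0] }
        END { for (i = lo + 0; i <= hi + 0; i++) print i, c[i] + 0 }
    ' "$1"
}

# Demo data: names.2 appears twice, names.3 once, names.4 never.
# other.2 and names.22 must not be counted under suffix 2.
rows=$(mktemp)
printf 'a\tb\tnames.2\na\tb\tnames.3\na\tb\tnames.2\na\tb\tother.2\na\tb\tnames.22\n' > "$rows"

count_suffixes "$rows" 2 4
```

Reading the file once and counting into an array avoids starting awk 252 times on a large file.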
