Invoking 'date' command inside awk string, with +%a formatting - bash

Still newish to the site, but here goes...
Basically I'm storing events in multiple files, one event per line, with each line containing the date ($1), start ($2) and stop ($3) times and several other pieces of data. I use a double underscore ("__") as the field separator. I've been using a variety of shell scripts to manage the data and awk to calculate stats, but I'm having trouble invoking the date command from awk so I can get a total by day of the week. After much tinkering and scanning of discussion boards I got to this:
ls /home/specified/folder/MBRS.db/* |
xargs -n 1 -I % awk -F"__" '$6 == "CLOSED" && $1 >= "'$backDATE'" { print $0 }' % |
awk 'BEGIN{FS="__"}{specDATE=system("date --date="$1" +%a")} specDATE == "Tue" {print $2" "$3}'
or
ls /home/lingotech/Einstein/data/MBRS.db/* |
xargs -n 1 -I % awk -F"__" '$6 == "CLOSED" && $1 >= "'$backDATE'" { print $0 }' % |
awk 'BEGIN{FS="__"}system("date --date="$1" +%a") == "Mon" {print $2" "$3}'
Instead of outputting the start and stop times, I'm getting a list of all the different days of the week for each entry.
I've tried more variations of the date usage than I care to admit, including:
for y in Sun Mon Tue Wed Thu Fri Sat; do
for directory in $( ls /home/specified/directory/MBRS.db/* | xargs -n 1 ); do
printf "."
[[ $( cat $directory | awk -F"__" '$6 == "CLOSED" && $1 >= "'$backDATE'" { print $1 }' | xargs -n 1 -I z date +%a -d z ) == "$y" ]] && echo BLAH
done
done
Some explanation of what I'm screwing up would be much appreciated. Thanks in advance. Oh, and I'm storing the date in YYMMDD format, but that doesn't seem to be an issue for Ubuntu Server's version of 'date'.

I don't know about all the rest of it (too much text for my reading tastes!) but wrt the answer you posted, this part of it:
awk 'BEGIN{FS="__"} NF == 10 && $1 >= "'$backDATE'" && $4 == "'$x'" && $6 == "CLOSED" {while ( "date +%a -d "$1"" | getline newDAY){if (newDAY == "'$y'") print $2" "$3}}' /home/absolute/path/*
assuming it does what you want would be written as:
awk -v backDATE="$backDATE" -v x="$x" -v y="$y" '
BEGIN { FS="__" }
(NF == 10) && ($1 >= backDATE) && ($4 == x) && ($6 == "CLOSED") {
cmd = "date +%a -d \"" $1 "\""
while ( (cmd | getline newDAY) > 0 ) {
if (newDAY == y) {
print $2, $3
}
}
close(cmd)
}
' /home/absolute/path/*
wrt why use awk variables instead of letting shell variables expand to become part of the body of an awk script, the answer is robustness and simplicity.
This is letting a shell variable expand to become part of the body of an awk script:
$ x="hello world"
$ awk 'BEGIN{ print '$x' }'
awk: cmd. line:1: BEGIN{ print hello
awk: cmd. line:1: ^ unexpected newline or end of string
$ awk 'BEGIN{ print "'$x'" }'
awk: cmd. line:1: BEGIN{ print "hello
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: BEGIN{ print "hello
awk: cmd. line:1: ^ syntax error
$ awk 'BEGIN{ print "'"$x"'" }'
hello world
$ x="hello
world"
$ awk 'BEGIN{ print "'"$x"'" }'
awk: cmd. line:1: BEGIN{ print "hello
awk: cmd. line:1: ^ unterminated string
awk: cmd. line:1: BEGIN{ print "hello
awk: cmd. line:1: ^ syntax error
and this is using an awk variable initialized with the value of a shell variable:
$ x="hello world"
$ awk -v x="$x" 'BEGIN{ print x }'
hello world
$ x="hello
world"
$ awk -v x="$x" 'BEGIN{ print x }'
hello
world
See the difference?
As for why store the command in a variable - because you have to close it after you use it and it must be spelled exactly the same way in the close command as it was when you opened the pipe. Compare:
cmd = "date +%a -d \"" $1 "\""
cmd | getline
close(cmd)
vs:
"date +%a -d \"" $1 "\"" | getline
close("date +%a -d \"" $l "\"")
and take an extremely close second look to spot the bug in the 2nd version.
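To make the difference between system() and cmd | getline concrete, here is a minimal sketch. It assumes GNU date (for -d) and uses a made-up sample record with an ISO date rather than the asker's YYMMDD format; LC_ALL=C is only there so the weekday abbreviation comes out in English.

```shell
#!/bin/sh
# Hypothetical sample record: date__start__stop
sample='2015-01-06__09:00__17:00'

# system() returns the command's exit status; the command's own output
# goes straight to stdout (discarded here), so it can never equal "Tue".
status=$(printf '%s\n' "$sample" | LC_ALL=C awk -F'__' '{
    rc = system("date -d \"" $1 "\" +%a >/dev/null")
    print rc
}')

# cmd | getline reads the command's output into an awk variable instead.
day=$(printf '%s\n' "$sample" | LC_ALL=C awk -F'__' '{
    cmd = "date -d \"" $1 "\" +%a"
    if ((cmd | getline d) > 0) print d
    close(cmd)   # same string as the one used to open the pipe
}')

echo "system() returned: $status"
echo "getline captured: $day"
```

That is why the original pipeline printed a weekday for every line: date wrote it directly to the terminal, and the comparison inside awk only ever saw the exit status.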

Ok, so I ended up using this:
>backDATE=150000;
> for x in $listOFsites; do
> for y in Sun Mon Tue Wed Thu Fri Sat; do
> totalHOURS=$( awk 'BEGIN{FS="__"} NF == 10 && $1 >= "'$backDATE'" && $4 == "'$x'" && $6 == "CLOSED" {while ( ( "date +%a -d \""$1"\"" | getline newDAY) > 0 ){if (newDAY == "'$y'") print $2" "$3}}' /home/absolute/path/* | xargs -I % /home/custom/duration/calc % | paste -sd+ | bc ); printf ".";
> done
> done
I had to use date inside the single quotes (so that I could pass $1 to it) rather than outside (using -F"__" -v newDAY=...), but inside the single quotes getting the output of system() is problematic. After seeing "How can I pass variables from awk to a shell command?" I finally saw the while (cmd | getline x) format, which was the crux of my issue. Props to Ed Morton.
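If spawning date once per record and looping once per weekday gets slow, one possible variation is to cache the weekday per distinct date and tally everything in a single awk pass. This is only a sketch: it assumes GNU date, the sample records and the weekday() helper are made up for illustration, and it counts events rather than calling the custom duration calculator.

```shell
#!/bin/sh
# Hypothetical records: date__start__stop
tally=$(printf '%s\n' \
    '2015-01-05__09:00__17:00' \
    '2015-01-06__10:00__12:00' \
    '2015-01-06__13:00__15:00' |
LC_ALL=C awk -F'__' '
# Look up the weekday for a date, running date only once per distinct value.
function weekday(d,    cmd, out) {
    if (!(d in cache)) {
        cmd = "date -d \"" d "\" +%a"
        if ((cmd | getline out) > 0) cache[d] = out
        close(cmd)
    }
    return cache[d]
}
{ count[weekday($1)]++ }
END { for (k in count) print k, count[k] }' | sort)

printf '%s\n' "$tally"
```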

Related

Unable to subtract two variables in shell scripting

I am writing a script that picks up two values from a file and then subtracts them. But I am unable to do the subtraction, as it throws an error.
res1= awk 'FNR == '${num1}' {print $1}' /home/shell/test.txt
res2= awk 'FNR == '${num2}' {print $1}' /home/shell/test.txt
res= $((res2 - res1))
echo $res
I also tried expr = `expr $res2 -$res1` but it didn't work. Please help me get the desired result.
Your assignments for res1/res2 are wrong. It should be
res1=$(awk 'FNR == '${num1}' {print $1}' /home/shell/test.txt)
However, you can do it all in awk
$ num1=5; num2=2; awk -v n1=${num1} -v n2=${num2} 'FNR==n1{r1=$1;f1=1}
FNR==n2{r2=$1;f2=1}
f1&&f2{print r1-r2; exit}' <(seq 5)
3
This is because there is one space char after each = sign: res1= awk
Remove the spaces and use $( command ) to execute a command and gather its output.
Give a try to this:
res1="$(awk -v num=${num1} 'FNR == num {print $1}' /home/shell/test.txt)"
res2="$(awk -v num=${num2} 'FNR == num {print $1}' /home/shell/test.txt)"
res=$(( res2 - res1 ))
printf "%d\n" ${res}
I had read in another answer that it is preferred to pass variable's value to awk script using -v var_name=value, rather than concatenating strings.
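A small sketch of what the space after = actually does, using printf as a stand-in for the awk command: the shell treats res1= as a temporary, empty assignment for that one command, so nothing is stored afterwards.

```shell
#!/bin/sh
unset res1

# With a space: printf runs with res1='' in its environment only.
res1= printf 'ran\n' >/dev/null
before=${res1-unset}          # res1 is still unset in the calling shell

# Without a space, using command substitution: the output is captured.
res1=$(printf '42\n')

echo "with a space:  $before"
echo "with \$( ... ): $res1"
```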

awk working with intervals

I have this file
goodtime 20:30 21:40
badtime 19:52 24:00
and when I enter for example 21:00 and 21:15 I should get goodtime
So here's my script
#!/bin/sh
last > duom.txt
grep -F 'stud.if.ktu.lt' duom.txt > ktu.txt
echo "Nurodykite laiko intervala "
read h
read min
read h2
read min2
awk '{if ($2 ~ /$h.$m/ && $3 ~ /$h2.$min2/) print $1}' data.txt
But I don't get any results.
The problem with this:
awk '{if ($2 ~ /$h.$m/ && $3 ~ /$h2.$min2/) print $1}' data.txt
is that you're trying to use shell variables in a single-quoted string, where the shell does not expand them. You need to pass the shell variables into awk with its -v option:
awk -v patt1="$h.$min" -v patt2="$h2.$min2" '
$2 ~ patt1 && $3 ~ patt2 {print $1}
' data.txt
But, given your sample input, this will not match anything.
Until your requirements are clarified, I can't help with the logic.
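To show why the -v version runs but still prints nothing for 21:00 and 21:15, here is a sketch using the sample data from the question. The match_times helper is hypothetical; the patterns are plain regex matches against the start and stop columns, so they only hit when the entered times equal the interval endpoints.

```shell
#!/bin/sh
data='goodtime 20:30 21:40
badtime 19:52 24:00'

# Hypothetical helper: args are start hour, start minute, end hour, end minute.
match_times() {
    printf '%s\n' "$data" |
    awk -v patt1="$1.$2" -v patt2="$3.$4" '$2 ~ patt1 && $3 ~ patt2 { print $1 }'
}

exact=$(match_times 20 30 21 40)    # equals the goodtime endpoints
inside=$(match_times 21 00 21 15)   # lies inside the interval, matches no endpoint

echo "endpoints entered: [$exact]"
echo "inner times entered: [$inside]"
```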

How can I specify a row in awk in for loop?

I'm using the following awk command:
my_command | awk -F "[[:space:]]{2,}+" 'NR>1 {print $2}' | egrep "^[[:alnum:]]"
which successfully returns my data like this:
fileName1
file Name 1
file Nameone
f i l e Name 1
So as you can see some file names have spaces. This is fine as I'm just trying to echo the file name (nothing special). The problem is calling that specific row within a loop. I'm trying to do it this way:
i=1
for num in $rows
do
fileName=$(my_command | awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]])"
echo "$num $fileName"
$((i++))
done
But my output is always null
I've also tried using awk -v record=$i and then printing $record but I get the below results.
f i l e Name 1
EDIT
Sorry for the confusion: rows is a variable that list ids like this 11 12 13
and each one of those ids ties to a file name. My command without doing any parsing looks like this:
id File Info OS
11 File Name1 OS1
12 Fi leNa me2 OS2
13 FileName 3 OS3
I can only use the id field to run a the command that I need, but I want to use the File Info field to notify the user of the actual File that the command is being executed against.
I think your $i does not expand as expected. You should quote your arguments this way:
fileName=$(my_command | awk -F "[[:space:]]{2,}+" "NR==$i {print \$2}" | egrep "^[[:alnum:]]")
And you forgot the other ).
EDIT
As an update to your requirement, you could just pass the rows to a single awk command instead of a repetitive one inside a loop:
#!/bin/bash
ROWS=(11 12)
function my_command {
# This function just emulates my_command and should be removed later.
echo " id File Info OS
11 File Name1 OS1
12 Fi leNa me2 OS2
13 FileName 3 OS3"
}
awk -- '
BEGIN {
input = ARGV[1]
while (getline line < input) {
sub(/^ +/, "", line)
split(line, a, / +/)
for (i = 2; i < ARGC; ++i) {
if (a[1] == ARGV[i]) {
printf "%s %s\n", a[1], a[2]
break
}
}
}
exit
}
' <(my_command) "${ROWS[@]}"
That awk command could be condensed to one line as:
awk -- 'BEGIN { input = ARGV[1]; while (getline line < input) { sub(/^ +/, "", line); split(line, a, / +/); for (i = 2; i < ARGC; ++i) { if (a[1] == ARGV[i]) {; printf "%s %s\n", a[1], a[2]; break; }; }; }; exit; }' <(my_command) "${ROWS[@]}"
Or better yet just use Bash instead as a whole:
#!/bin/bash
ROWS=(11 12)
while IFS=$' ' read -r LINE; do
IFS='|' read -ra FIELDS <<< "${LINE// +( )/|}"
for R in "${ROWS[@]}"; do
if [[ ${FIELDS[0]} == "$R" ]]; then
echo "${R} ${FIELDS[1]}"
break
fi
done
done < <(my_command)
It should give an output like:
11 File Name1
12 Fi leNa me2
Shell variables aren't expanded inside single-quoted strings. Use the -v option to set an awk variable to the shell variable:
fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]")
This method avoids having to escape all the $ characters in the awk script, as required in konsolebox's answer.
As you already heard, you need to populate an awk variable from your shell variable to be able to use the desired value within the awk script, so this:
awk -F "[[:space:]]{2,}+" 'NR==$i {print $2}' | egrep "^[[:alnum:]]"
should be this:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
Also, though, you don't need awk AND grep, since awk can do anything grep can do, so you can change this part of your script:
awk -v i="$i" -F "[[:space:]]{2,}+" 'NR==i {print $2}' | egrep "^[[:alnum:]]"
to this:
awk -v i="$i" -F "[[:space:]]{2,}+" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
and you don't need a + after a numeric range so you can change {2,}+ to just {2,}:
awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}'
Most importantly, though, instead of invoking awk once for every invocation of my_command, you can just invoke it once for all of them, i.e. instead of this (assuming this does what you want):
i=1
for num in $rows
do
fileName=$(my_command | awk -v i="$i" -F "[[:space:]]{2,}" '(NR==i) && ($2~/^[[:alnum:]]/){print $2}')
echo "$num $fileName"
$((i++))
done
you can do something more like this:
for num in $rows
do
my_command
done |
awk -F '[[:space:]]{2,}' '$2~/^[[:alnum:]]/{print NR, $2}'
I say "something like" because you don't tell us what "my_command", "rows" or "num" are so I can't be precise but hopefully you see the pattern. If you give us more info we can provide a better answer.
It's pretty inefficient to rerun my_command (and awk) every time through the loop just to extract one line from its output. Especially when all you're doing is printing out part of each line in order. (I'm assuming that my_command really is exactly the same command and produces the same output every time through your loop.)
If that's the case, this one-liner should do the trick:
paste -d' ' <(printf '%s\n' $rows) <(my_command |
awk -F '[[:space:]]{2,}+' '($2 ~ /^[[:alnum:]]/) {print $2}')
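Since the edit to the question says the ids are what tie a row to a file name, another single-pass option is to match on the id column rather than on the row number. The sketch below makes several assumptions: my_command is a stand-in that prints columns separated by two or more spaces, rows is a space-separated id list, and the field separator is written as '  +' (two spaces followed by +) only so it does not depend on interval-expression support in a particular awk.

```shell
#!/bin/sh
# Hypothetical stand-in for the real my_command.
my_command() {
    printf '%s\n' \
        'id  File Info  OS' \
        '11  File Name1  OS1' \
        '12  Fi leNa me2  OS2' \
        '13  FileName 3  OS3'
}

rows='11 13'

result=$(my_command | awk -v ids="$rows" -F '  +' '
BEGIN {
    # Turn the id list into a lookup table.
    n = split(ids, list, " ")
    for (i = 1; i <= n; i++) want[list[i]]
}
NR > 1 && ($1 in want) { print $1, $2 }')

printf '%s\n' "$result"
```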

How to pass a date range to awk as a variable

Here is my case.
bash ~]# TIME="2012-05-25 06:42:57"
bash ~]# echo "2012-05-25 00:16:51,610" | awk -v var=$TIME '{if ($0 < var) print $0}'
Then, here is the error message
awk: 06:42:57
awk: ^ syntax error
I just want to pass a date range to my awk command. How can I achieve this? Please help. Thanks.
Modify the case
START_TIME="2012-05-24 00:00:00"
END_TIME="2012-05-24 01:00:00"
echo "2012-05-24 00:10:10" | awk -v "START=$START_TIME" -v "END=$END_TIME" '{ if ( $0 > START && $0 < END) print $0 }'
It doesn't seem to work in the if condition.
awk: { if ( $0 < START && $0 > END) print $0 }
awk: ^ syntax error
After several tries, I seem to have found a solution with another approach.
echo "2012-05-24 00:10:10" | awk '{ if ( $0 > "'"$START_TIME"'" && $0 < "'"$END_TIME"'" ) print $0 }'
Not sure how to do it with an awk variable passed via -v. Anyone have ideas?
Quote your variable when passing it to AWK:
echo "2012-05-25 00:16:51,610" | awk -v "var=$TIME" '{if ($0 < var) print $0}'
At least on my system, it appears that "END" is a reserved word even for variables. Awk uses END as it uses BEGIN, but I hadn't seen attempts to use it as a variable before. Note:
[ghoti@pc ~]$ START_TIME="2012-05-24 00:00:00"
[ghoti@pc ~]$ END_TIME="2012-05-24 01:00:00"
[ghoti@pc ~]$ echo "2012-05-24 00:10:10" | awk -v "START=$START_TIME" -v "END=$END_TIME" '{ if ( $0 < START && $0 > END) print $0 }'
awk: syntax error at source line 1
context is
{ if ( $0 < START && $0 > >>> END <<< ) print $0 }
awk: illegal statement at source line 1
[ghoti@pc ~]$ echo "2012-05-24 00:10:10" | awk -v s_time="$START_TIME" -v e_time="$END_TIME" '{ if ( $0 < s_time && $0 > e_time) print $0 }'
[ghoti@pc ~]$
Obviously this still isn't working, but now it's not working because of a misunderstanding about how comparisons work, rather than because we're trying to use a reserved word as a variable.
Looking at your if statement, it seems that you're trying to evaluate TRUE only if the comparison string is both BEFORE the start date and AFTER the end date. Barring theories of time being circular, I think we can assume that this logic is flawed.
So here's what I came up with. Note that this uses gawk's mktime() function, so it won't work everywhere.
[ghoti@pc ~]$ START_TIME="2012-05-24 00:00:00"
[ghoti@pc ~]$ END_TIME="2012-05-24 01:00:00"
[ghoti@pc ~]$ printf '2012-05-23 22:10:10\n2012-05-24 00:10:10\n2012-05-24 01:10:10\n' | gawk -v s_time="$START_TIME" -v e_time="$END_TIME" 'BEGIN { s=mktime(gensub(/[^0-9]/," ","G",s_time)); e=mktime(gensub(/[^0-9]/," ","G",e_time)); } { now=mktime(gensub(/[^0-9]/," ","G")); if ( now > s && now < e) print $0 }'
2012-05-24 00:10:10
[ghoti@pc ~]$
Spaced out for easier reading, the gawk script looks like this:
BEGIN {
s=mktime(gensub(/[^0-9]/," ","G",s_time));
e=mktime(gensub(/[^0-9]/," ","G",e_time));
}
{
now=mktime(gensub(/[^0-9]/," ","G"));
if ( now > s && now < e) print $0;
}
Obviously, this relies completely on the fact that your date/time specification matches mktime()'s input format so closely. But it works with the sample data in your question.
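For timestamps in exactly this zero-padded "YYYY-MM-DD HH:MM:SS" layout, a plain string comparison also sorts chronologically, so a portable sketch without mktime() is possible. It reuses the s_time and e_time variable names from above and assumes every input line has the same fixed format.

```shell
#!/bin/sh
START_TIME='2012-05-24 00:00:00'
END_TIME='2012-05-24 01:00:00'

# The values do not look like numbers, so awk compares them as strings.
in_range=$(printf '%s\n' \
    '2012-05-23 22:10:10' \
    '2012-05-24 00:10:10' \
    '2012-05-24 01:10:10' |
awk -v s_time="$START_TIME" -v e_time="$END_TIME" '$0 > s_time && $0 < e_time')

printf '%s\n' "$in_range"
```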

Eval awk command with single quotes

I have a function "checkExist" that takes in a command and executes it based on whether or not the output file already exists. I pass a command like this, where file1 and file2 are just the names of output files that the commands create, so if they already exist it will ask if you want to overwrite, else it will skip:
checkExist file1 file2 command1 command2
In actual use like this:
checkExist 1.txt 2.txt "echo $1 | awk '$5 <= 10 {print $3, $4}'" "echo $2 | awk '$5 <= 10 {print $3, $4}'"
$1 and $2 above are inputs to the script "smartfilter.sh" containing the function checkExist within. So they are just file inputs.
Later in the checkExist function if the user types 'Y/y' to overwrite, or the files don't already exist then it will
eval $3 &
eval $4 &
wait
And I get an error like so:
awk: >= 10 {print , }
awk: ^ syntax error
awk: >= 10 {print , }
awk: ^ syntax error
awk: >= 10 {print , }
awk: ^ syntax error
awk: cmd. line:1: >= 10 {print , }
awk: cmd. line:1: ^ unexpected newline or end of string
I know it is to do with the single quotations ' around the awk and eval not parsing them correctly. I have tried \' but that doesn't work either. Is there a proper way to do this?
checkExist 1.txt 2.txt "echo $1 | awk '\$5 <= 10 {print \$3, \$4}'" "echo $2 | awk '\$5 <= 10 {print \$3, \$4}'"
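The escaped form works because inside double quotes the shell expands $5, $3 and $4 as the script's own positional parameters before eval or awk ever sees them. A minimal sketch of the difference, with a made-up script argument and made-up input lines:

```shell
#!/bin/sh
set -- input.txt    # hypothetical: the script was called with one argument

# Unescaped: $5, $3 and $4 are unset positional parameters and expand to nothing.
unescaped="awk '$5 <= 10 {print $3, $4}'"

# Escaped: the dollar signs survive, so awk receives its field references.
escaped="awk '\$5 <= 10 {print \$3, \$4}'"

printf '%s\n' "$unescaped" "$escaped"

# Running the escaped command string through eval behaves as intended.
result=$(printf '%s\n' 'a b c d 7' 'a b x y 20' | eval "$escaped")
echo "$result"
```

The first printed line matches the broken program text shown in the error messages, with the field references missing.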
