I have a file with a date field, like this:
20|1|124|Mar 19 2016 3:00AM
20|1|144|Mar 19 2016 2:00PM
43|1|146|Mar 19 2016 5:30AM
42|1|158|Mar 19 2016 1:50PM
40|1|15|Mar 19 2016 2:30AM
I want to sort by the date field, such that AM comes before PM. So far I have this:
sort -t"|" -k4 testfile
But I am not sure how to sort the "AM" and "PM" portion. Any help is appreciated.
You can use:
# decorate each line with its epoch time, sort on that, then strip it off again
while read -r; do
    IFS='|' read -ra arr <<< "$REPLY"       # split the line on |
    date -d "${arr[-1]}" "+$REPLY#%s"       # print "<original line>#<epoch seconds>"
done < file | sort -t# -k2 | cut -d# -f1
40|1|15|Mar 19 2016 2:30AM
20|1|124|Mar 19 2016 3:00AM
43|1|146|Mar 19 2016 5:30AM
42|1|158|Mar 19 2016 1:50PM
20|1|144|Mar 19 2016 2:00PM
Using the date command we parse the last pipe-delimited field and append its epoch value to each line, separated by #. Then sort sorts on the 2nd field (the epoch value), and finally cut discards everything after the #.
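This relies on GNU date being able to parse the last field directly. A quick way to check that on your system (the exact epoch value printed depends on your timezone):
date -d "Mar 19 2016 3:00AM" "+%s"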
You can use a temporary delimiter (i.e. an extra |) to turn AM/PM into a column that can be used as a sort field:
$ cat sourcefile | sed 's/\(.\)M$/|\1M/' | sort -t"|" -k5 -k4 | sed 's/|\(.\)M/\1M/'
40|1|15|Mar 19 2016 2:30AM
20|1|124|Mar 19 2016 3:00AM
43|1|146|Mar 19 2016 5:30AM
42|1|158|Mar 19 2016 1:50PM
20|1|144|Mar 19 2016 2:00PM
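To see what the temporary delimiter does, here is the first sed stage applied to one of the sample lines (before sorting, and before the second sed removes the extra |):
$ echo '20|1|124|Mar 19 2016 3:00AM' | sed 's/\(.\)M$/|\1M/'
20|1|124|Mar 19 2016 3:00|AM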
Related
I need to add the timestamp of all remote servers as part of the output and check whether the timestamps are the same or not.
So far I am able to print the machine IP and date.
#!/bin/bash
all_ip=(192.168.1.121 192.168.1.122 192.168.1.123)
for ip_addr in "${all_ip[@]}"; do
    aws_ip=$"ip route get 1 | sed -n 's/^.*src \([0-9.]*\) .*$/\1/p'"
    date=date
    sshpass -p "password" ssh root@$ip_addr "$aws_ip & $date"
    echo "==================================================="
done
Getting output as:
Wed 27 Jul 2022 05:48:15 AM PDT
192.168.1.121
===================================================
Wed Jul 27 05:48:15 PDT 2022
192.168.1.122
===================================================
Wed Jul 27 05:48:15 PDT 2022
192.168.1.123
===================================================
How can I check whether the timestamp (ignoring seconds) of all machines is the same or not?
e.g. (Wed 27 Jul 2022 05:48:15 || Wed 27 Jul 2022 05:48:15 || Wed 27 Jul 2022 05:48:15)
Expected Output:
|| Time are in sync on all machines || # if in sync
|| Time are not in sync on all machines || # if not sync
tmpdir=$(mktemp -d)
trap 'rm -r "$tmpdir"' EXIT
for ip in "${allips[@]}"; do
    # Do N connections, in parallel; each one writes to a separate file.
    sshpass -p "password" ssh root@"$ip" "date +%Y-%m-%d_%H:%M" > "$tmpdir/$ip.txt" &
done
wait
times=$(
    for i in "$tmpdir"/*.txt; do
        # print filename with file contents.
        echo "$i $(<"$i")"
    done |
    # Sort on the second column (the timestamp)
    sort -k2 |
    # Deduplicate on the timestamp (skip the first field, the filename)
    uniq -f 1
)
echo "$times"
timeslines=$(wc -l <<<"$times")
if ((timeslines == 1)); then
    echo "YAY! minutes on all servers the same"
else
    echo "Times are not in sync on all machines"
fi
First, you may adjust your date command as follows in order to exclude the seconds:
date +%Y-%m-%d_%H:%M
Then simply grep your output and validate that all the timestamps are identical. You may dump the output to a temporary file, or use any other method you like.
Ex:
grep [aPatternSpecificToTheLinewithTheDate] [yourTemporaryFile] | sort | uniq | wc -l
If the result is 1, it means that all the timestamps are identical.
However, you will have to deal with the corner case where the minute changes while you are fetching the time from all your servers.
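If you want to sidestep that corner case, a rough sketch (reusing the all_ip array and the sshpass/ssh invocation from the question; the 60-second tolerance is an assumption) is to compare epoch seconds and allow a small amount of drift:
#!/bin/bash
all_ip=(192.168.1.121 192.168.1.122 192.168.1.123)
tolerance=60   # assumed: allow up to 60 seconds of drift
stamps=()
for ip in "${all_ip[@]}"; do
    # collect the epoch seconds reported by each server
    stamps+=( "$(sshpass -p "password" ssh root@"$ip" 'date +%s')" )
done
min=${stamps[0]}; max=${stamps[0]}
for t in "${stamps[@]}"; do
    (( t < min )) && min=$t
    (( t > max )) && max=$t
done
if (( max - min <= tolerance )); then
    echo "|| Time are in sync on all machines ||"
else
    echo "|| Time are not in sync on all machines ||"
fi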
How can I get the number of logins for each day, from the beginning of the wtmp file, using awk?
I thought about using an associative array but I don't know how to implement it in awk.
myscript.sh
#!/bin/bash
awk 'BEGIN{numberoflogins=0}
#code goes here'
The output of the last command:
[fnorbert#localhost Documents]$ last
fnorbert tty2 /dev/tty2 Mon Apr 24 13:25 still logged in
reboot system boot 4.8.6-300.fc25.x Mon Apr 24 16:25 still running
reboot system boot 4.8.6-300.fc25.x Mon Apr 24 13:42 still running
fnorbert tty2 /dev/tty2 Fri Apr 21 16:14 - 21:56 (05:42)
reboot system boot 4.8.6-300.fc25.x Fri Apr 21 19:13 - 21:56 (02:43)
fnorbert tty2 /dev/tty2 Tue Apr 4 08:31 - 10:02 (01:30)
reboot system boot 4.8.6-300.fc25.x Tue Apr 4 10:30 - 10:02 (00:-27)
fnorbert tty2 /dev/tty2 Tue Apr 4 08:14 - 08:26 (00:11)
reboot system boot 4.8.6-300.fc25.x Tue Apr 4 10:13 - 08:26 (-1:-47)
wtmp begins Mon Mar 6 09:39:43 2017
The shell script's output should be:
Apr 4: 4
Apr 21: 2
Apr 24: 3
(using an associative array, if possible).
In awk, arrays can be indexed by strings as well as numbers, so you can use them like associative arrays.
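For instance, counting occurrences of arbitrary strings (with some made-up input) is as simple as:
$ printf 'Apr 4\nApr 4\nApr 21\n' | awk '{count[$0]++} END {for (d in count) print d": "count[d]}'
This prints Apr 4: 2 and Apr 21: 1, although the iteration order of for (d in count) is unspecified, which is why the pipeline below ends with a sort.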
However, what you're asking will be hard to do reliably with awk alone, because the delimiters are whitespace: empty fields will throw off the columns, and if you use FIELDWIDTHS you will also be thrown off by columns longer than their assigned width.
If all you're looking for is just the number of logins per day you might want to use a combination of sed and awk (and sort):
last | \
sed -E 's/^.*(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) ([ 0-9]{2}).*$/\2 \3/p;d' | \
awk '{arr[$0]++} END { for (a in arr) print a": " arr[a]}' | \
sort -M
The sed -E uses extended regular expressions, and the pattern prints just the date portion of each line emitted by last (it matches on the day of the week, but only prints the month and day of month).
We could have used uniq -c to get the counts, but using awk we can do an associative array as you hinted.
Finally using sort -M we're sorting on the abbreviated date formats like Apr 24, Mar 16, etc.
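For completeness, the uniq -c variant mentioned above would look like this (the count then comes first on each line, and the chronological sort -M step is dropped):
last | \
sed -E 's/^.*(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) ([ 0-9]{2}).*$/\2 \3/p;d' | \
sort | uniq -c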
Try the following awk script (it assumes all entries are in the same, current month):
myscript.awk:
#!/bin/awk -f
{
    a[NR]=$0;  # save each line into an array indexed by line number
}
END {
    for (i=NR-1;i>1;i--) {  # iterate the lines in reverse order (skipping the first/last line)
        # the 3-argument form of match() requires GNU awk (gawk)
        if (match(a[i],/[A-Z][a-z]{2} ([A-Z][a-z]{2}) *([0-9]{1,2}) [0-9]{2}:[0-9]{2}/, b)) {
            m=b[1];     # save the month name
            c[b[2]]++;  # accumulate the number of occurrences per day
        }
    }
    for (i in c) print m,i": "c[i]
}
Usage:
last | awk -f myscript.awk
The output:
Apr 4: 4
Apr 21: 2
Apr 24: 3
I'm trying to write a script in which a while loop reads a file line by line, runs a command on two values from that line (to get two new values), replaces the two old values with the two new values, and then moves on to the next line.
For example, the txt file example.txt contains the following data:
1432771200 != 1432800000 OPTION VALUE
1432771200 != 1432800210 OPTION VALUE
1432771200 != 1432800033 OPTION VALUE
And I run the following script:
#!/bin/bash -x
#
while read line
do
arr=($line)
CURRENTDATE=`h2e ${arr[0]}`
PROPOSEDDATE=`h2e ${arr[2]}`
echo $CURRENTDATE
echo $PROPOSEDDATE
# echo $line |
sed -i "s/${arr[0]}/$CURRENTDATE/"
# echo $line |
sed -i "s/${arr[2]}/$PROPOSEDDATE/"
done < /srg/pro/data/example.txt
What I expect to see in example.txt now is the replacement of the first and third values on each line, so it should look like this:
Thu May 28 01:00:00.000 2015 BST != Thu May 28 09:00:00.000 2015 BST OPTION VALUE
Thu May 28 01:00:00.000 2015 BST != Wed May 27 01:00:00.000 2015 BST OPTION VALUE
Fri May 28 01:00:00.000 2015 BST != Fri May 29 06:00:00.000 2015 BST OPTION VALUE
And so on and so forth.
When I run the shell script with bash -x Interface.sh I'm getting this:
read line
+ arr=($line)
++ h2e 1432771200
+ CURRENTDATE='1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST (1)'
++ h2e 1432800000
+ PROPOSEDDATE='1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST (1)'
+ echo 1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST '(1)'
1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST (1)
+ echo 1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST '(1)'
1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST (1)
+ sed -i 's/1432771200/1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST (1)/'
sed: no input files
+ sed -i 's/1432800000/1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST (1)/'
sed: no input files
+ read line
+ arr=($line)
++ h2e 1432771200
+ CURRENTDATE='1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST (1)'
++ h2e 1432800000
+ PROPOSEDDATE='1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST (1)'
+ echo 1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST '(1)'
1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST (1)
+ echo 1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST '(1)'
1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST (1)
+ sed -i 's/1432771200/1432771200.000 2015148 Thu May 28 01:00:00.000 2015 BST (1)/'
sed: no input files
+ sed -i 's/1432800000/1432800000.000 2015148 Thu May 28 09:00:00.000 2015 BST (1)/'
sed: no input files
Please help! I don't know how to fix this!
The read command can split the line into fields:
outfile=/tmp/some_tmp_file_you_will_move
while read date1 marker date2 otherfields; do
    CURRENTDATE=`h2e ${date1}`
    PROPOSEDDATE=`h2e ${date2}`
    echo $CURRENTDATE
    echo $PROPOSEDDATE
    echo "${CURRENTDATE} ${marker} ${PROPOSEDDATE} ${otherfields}" >> ${outfile}
done < /srg/pro/data/example.txt
When you don't need to see CURRENTDATE/PROPOSEDDATE inside your loop, you can redirect the output outside the loop for better performance:
outfile=/tmp/some_tmp_file_you_will_move
while read date1 marker date2 otherfields; do
    CURRENTDATE=`h2e ${date1}`
    PROPOSEDDATE=`h2e ${date2}`
    echo "${CURRENTDATE} ${marker} ${PROPOSEDDATE} ${otherfields}"
done < /srg/pro/data/example.txt > ${outfile}
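Once the loop has finished, you put the temporary file in place of the original (as the some_tmp_file_you_will_move name hints), for example:
mv "${outfile}" /srg/pro/data/example.txt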
Note: Consider using $(h2e ...) instead of backticks.
Using your approach (there are other ways):
#!/bin/bash -x
#
cp /srg/pro/data/example.txt /tmp/temp.file
cat /srg/pro/data/example.txt \
| while read line
do
    arr=($line)
    CURRENTDATE=`h2e ${arr[0]}`
    PROPOSEDDATE=`h2e ${arr[2]}`
    echo $CURRENTDATE
    echo $PROPOSEDDATE
    sed -i "s/${arr[0]}\(.*\)${arr[2]}/${CURRENTDATE}\1${PROPOSEDDATE}/" /tmp/temp.file
done
mv /tmp/temp.file /srg/pro/data/example.txt
sed -i does not work on a stream; it needs a file.
I use a temporary file to avoid the problem of reading the same file that is being modified inside the loop, which often creates unexpected results (like an empty file).
This works for your sample, but it depends on the line structure; adapt the array indices and the sed substitution to the real structure of your data file.
I have this list of files. Now I have to pick the latest file based on some conditions.
3679 Jul 21 23:59 belk_rpo_error_**po9324892**_07212014.log
0 Jul 22 23:59 belk_rpo_error_**po9324892**_07222014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
0 Jul 20 05:50 belk_rpo_error_**po9999992**_07202014.log
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
742 Jul 21 07:30 belk_rpo_error_**po9999991**_07212014.log
0 Jul 23 2014 belk_rpo_error_**po9999991**_07232014.log
For a PARTICULAR Order_No (marked with ** **):
If the latest file is 0 kB then we will discard it (and the rest of the files with the same Order_No as well).
If the latest file is non-zero then I will take it (only the latest one).
Then append the contents to a txt file.
My expected output would be:
411 Jul 21 06:30 belk_rpo_error_**po9999992**_07212014.log
3679 Jul 23 23:59 belk_rpo_error_**po9324892**_07232014.log
22 Jul 22 06:30 belk_rpo_error_**po9324267**_07012014.log
I am at my wits' end here. I can't seem to figure out how to compare dates in Unix. Any help is very much appreciated.
You can try something like:
touch test.txt
for var in `find . ! -empty -exec ls -r {} \;`
do
    cat $var >> test.txt
done
untested
use stat to emit date (epoch time), size and filename.
use awk to filter out zero-length files and extract order number.
sort by order number and date
awk to pick up the last filename for each order number
stat -c $'%Y\t%s\t%n' *.log |
awk -F'\t' -v OFS='\t' '
$2 > 0 {
split($3, a, /_/)
print a[4], $1, $3
}' |
sort -t $'\t' -k1,1 -k2,2n |
awk -F'\t' '
NR > 1 && $1 != prev_order {print filename}
{filename = $3; prev_order = $1}
END {print filename}
'
The sort command might be wrong: In order to group by order number, you might need to sort first by file time then by order number.
If I understand your question, the resulting files need to be concatenated and appended to a file. If the above pipeline is working OK, then pipe into | xargs cat >> something.log
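As a concrete sketch of that last step: assuming the pipeline above is saved in a script called pick_latest.sh (a made-up name) and prints one filename per line, appending the selected files would be:
./pick_latest.sh | xargs cat >> something.log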
I'm just wondering how we can use awk to do exact matches.
For example:
$ cal 09 09 2009
September 2009
Su Mo Tu We Th Fr Sa
1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30
$ cal 09 09 2009 | awk '{day="9"; col=index($0,day); print col }'
17
0
0
11
20
0
8
0
As you can see, the above command outputs the index number for every line that contains the string/number "9". Is there a way to make awk output the index number for only the 4th line of the cal output above? Or maybe there is an even more elegant solution?
I'm using awk to get the day name from the cal command's output. Here's the whole line of code:
$ dayOfWeek=$(cal $day $month $year | awk '{day='$day'; split("Sunday Monday Tuesday Wednesday Thursday Friday Saturday", array); column=index($0,day); dow=int((column+2)/3); print array[dow]}')
The problem with the above code is that if multiple matches are found then I get multiple results, whereas I want it to output only one result.
Thanks!
Limit the processing to only those lines which have your "day" as a standalone word (i.e. surrounded by word boundaries):
awk -v day=$day 'BEGIN{split("Sunday Monday Tuesday Wednesday Thursday Friday Saturday", array)} $0 ~ "\\<"day"\\>"{for(i=1;i<=NF;i++)if($i == day){print array[i]}}'
Proof of Concept
$ cal 02 1956
February 1956
Su Mo Tu We Th Fr Sa
1 2 3 4
5 6 7 8 9 10 11
12 13 14 15 16 17 18
19 20 21 22 23 24 25
26 27 28 29
$ day=18; cal 02 1956 | awk -v day=$day 'BEGIN{split("Sunday Monday Tuesday Wednesday Thursday Friday Saturday", array)} $0 ~ "\\<"day"\\>"{for(i=1;i<=NF;i++)if($i == day){print array[i]}}'
Saturday
Update
If all you are looking for is to get the day of the week from a certain date, you should really be using the date command like so:
$ day=9;month=9;year=2009;
$ dayOfWeek=$(date +%A -d "$day/$month/$year")
$ echo $dayOfWeek
Wednesday
You wrote:
cal 09 09 2009
I'm not aware of a version of cal that accepts the day of month as an input, only
cal ${mon} (optional) ${year} (optional)
But that doesn't affect your main issue.
You wrote:
is there a way to make awk output the index number for only the 4th line of the cal output above?
NR (the current record number) is your friend, and there are numerous ways to use it.
cal 09 09 2009 | awk 'NR==4{day="9"; col=index($0,day); print col }'
OR
cal 09 09 2009 | awk '{day="9"; if (NR==4) {col=index($0,day); print col } }'
ALSO
In awk, if you have variable assignments that should be used throughout your whole program, it is better to put them in the BEGIN section so that the assignment is only performed once. Not a big deal in your example, but why pick up bad habits ;-)?
HENCE
cal 09 2009 | awk 'BEGIN{day="9"}; NR==4 {col=index($0,day); print col }'
FINALLY
It is not completely clear what problem you are trying to solve. Are you sure you always want to grab line 4? If not, then how do you propose to solve that?
Problems stated as " 1. I am trying to do X. 2. Here is my input. 3. Here is my output. 4. Here is the code that generated that output" are much easier to respond to.
It looks like you're trying to do date calculations. You can get much more robust and general solutions by using the GNU date command. I have seen numerous useful discussions of this tagged as bash, shell, and date.
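For example, with GNU date, the lookup for the date in the question is a single call:
$ date -d 2009-09-09 +%A
Wednesday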
I hope this helps.
This is so much easier to do in a language that has time functionality built-in. Tcl is great for that, but many other languages are too:
$ echo 'puts [clock format [clock scan 9/9/2009] -format %a]' | tclsh
Wed
If you want awk to only output for line 4, restrict the rule to line 4:
$ awk 'NR == 4 { ... }'
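Applied to the question's example (where line 4 of the cal output yielded index 11), that would be:
$ cal 09 2009 | awk 'NR == 4 { print index($0, "9") }'
11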