How to increment hours in a datetime stamp? - shell

I have a file which contains time in YYYYMMDDhhmmss.sss
I am fetching only hours/minutes from the file using the following command
start=$(grep -i "XYZ" | head -1 | awk '{print $3}' | cut -c9-12)
The start variable would contain number of hours/minutes (Example: - 1041 [HHMM])
My task is to increment this time by 60 minutes.
Please help me to do so. I am not using system date.
Here's what I tried:
start=$(grep -i "XYZ" | head -1 | awk '{print $3}' | cut -c9-12)
end=$(( start + 3600 ))
But this logic is wrong because it adds the values as plain numbers. Converting the time to seconds would also be tedious. Is there any way to increment the time using system commands? Please suggest.

60 minutes is 1 hour, so all you need to do is increment the hour by one. This can be done by adding 100 to the HHMM time as shown below:
start="$1"
# add 100 to the start time to increment the hour
newTime=$((10#$start+100))
# check if we have crossed midnight
if (( newTime >= 2400 )); then
newTime=$((newTime-2400))
fi
#pad with leading zero
newTime="$(printf %04d $newTime)"
echo "$newTime"
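The same idea can be generalized to increments that are not a whole hour. The following is a minimal sketch, not part of the original answer: the function name add_minutes is hypothetical, and it assumes a four-digit HHMM input and a non-negative number of minutes.

```shell
#!/usr/bin/env bash
# add_minutes HHMM MINUTES
# Hypothetical helper: prints the HHMM time that lies MINUTES later,
# wrapping around at midnight. Assumes MINUTES is not negative.
add_minutes() {
    local hhmm=$1 delta=$2
    # 10# forces base 10 so that values such as "08" are not read as octal
    local hours=$(( 10#${hhmm:0:2} ))
    local minutes=$(( 10#${hhmm:2:2} ))
    # work in minutes since midnight; one day has 1440 minutes
    local total=$(( (hours * 60 + minutes + delta) % 1440 ))
    printf '%02d%02d\n' "$(( total / 60 ))" "$(( total % 60 ))"
}

add_minutes 1041 60   # prints 1141
add_minutes 2330 45   # prints 0015
```

Working in minutes since midnight avoids the special case for the hour rollover and also handles increments such as 45 or 90 minutes.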

Related

Get a percentage of randomly chosen lines from a text file

I have a text file (bigfile.txt) with thousands of rows. I want to make a smaller text file with 1 % of the rows which are randomly chosen. I tried the following
output=$(wc -l bigfile.txt)
ds1=$(0.01*output)
sort -r bigfile.txt|shuf|head -n ds1
It gives the following error:
head: invalid number of lines: ‘ds1’
I don't know what is wrong.
Even after you fix the issues in your script, bash itself cannot do floating-point arithmetic. You need an external tool such as Awk, which I would use as
randomCount=$(awk 'END{print int((NR==0)?0:(NR/100))}' bigfile.txt)
(( randomCount )) && sort -r bigfile.txt | shuf | head -n "$randomCount"
For example, writing a file with 221 lines using the loop below and trying to get random lines,
tmpfile=$(mktemp /tmp/abc-script.XXXXXX)
for i in {1..221}; do echo $i; done >> "$tmpfile"
randomCount=$(awk 'END{print int((NR==0)?0:(NR/100))}' "$tmpfile")
Printing the count gives the integer 2; using that in the next command,
sort -r "$tmpfile" | shuf | head -n "$randomCount"
86
126
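For a whole-number percentage, integer arithmetic in the shell is enough, and shuf can pick the lines directly. This is a rough sketch under the assumption that GNU shuf is available; sample_percent is a hypothetical name, not something from the answers here.

```shell
#!/usr/bin/env bash
# sample_percent FILE [PERCENT]
# Hypothetical helper: prints PERCENT % (default 1) of the lines of FILE,
# chosen at random. Assumes GNU coreutils shuf is installed.
sample_percent() {
    local file=$1 percent=${2:-1}
    local total count
    total=$(wc -l < "$file")
    # integer division rounds down, so small files may yield zero lines
    count=$(( total * percent / 100 ))
    if (( count > 0 )); then
        shuf -n "$count" "$file"
    fi
}

# Example (assumes bigfile.txt exists):
#   sample_percent bigfile.txt 1 > smallfile.txt
```

Unlike the rand() approach, this always returns exactly the rounded-down line count, at the cost of reading the file twice.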
Roll a die (with rand()) for each line of the file and get a number between 0 and 1. Print the line if the die shows less than 0.01:
awk 'rand()<0.01' bigFile
Quick test - generate 100,000,000 lines and count how many get through:
seq 1 100000000 | awk 'rand()<0.01' | wc -l
999308
Pretty close to 1%.
If you want the order random as well as the selection, you can pass this through shuf afterwards:
seq 1 100000000 | awk 'rand()<0.01' | shuf
On the subject of efficiency which came up in the comments, this solution takes 24s on my iMac with 100,000,000 lines:
time { seq 1 100000000 | awk 'rand()<0.01' > /dev/null; }
real 0m23.738s
user 0m31.787s
sys 0m0.490s
The only other solution that works here, heavily based on OP's original code, takes 13 minutes 19s.
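One caveat with the rand() filter: some awk implementations (gawk is one) start rand() from the same seed on every run unless srand() is called, so repeated runs can select the same lines. A small sketch that makes the seed explicit follows; the name sample_bernoulli and its two parameters are illustrative assumptions.

```shell
#!/usr/bin/env bash
# sample_bernoulli PROBABILITY SEED
# Hypothetical helper: reads lines on stdin and prints each one with the
# given probability, seeding awk's random generator explicitly.
sample_bernoulli() {
    local prob=$1 seed=$2
    awk -v p="$prob" -v seed="$seed" 'BEGIN { srand(seed) } rand() < p'
}

# Example: a different ~1 % sample on each run, using bash's $RANDOM as seed
#   sample_bernoulli 0.01 "$RANDOM" < bigFile
# Example: a reproducible sample, using a fixed seed
#   sample_bernoulli 0.01 42 < bigFile
```

Passing a fixed seed gives a repeatable sample, which can be handy when comparing runs.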

Bash - convert time interval string to nr. of seconds

I'm trying to convert strings, describing a time interval, to the corresponding number of seconds.
After some experimenting I figured out that I can use date like this:
soon=$(date -d '5 minutes 10 seconds' +%s); now=$(date +%s)
echo $(( $soon-$now ))
but I think there should be an easier way to convert strings like "5 minutes 10 seconds" to the corresponding number of seconds, in this example 310. Is there a way to do this in one command?
Note: although portability would be useful, it isn't my top priority.
You could start at epoch
date -d"1970-01-01 00:00:00 UTC 5 minutes 10 seconds" "+%s"
310
You could also easily sub in times
Time="1 day"
date -d"1970-01-01 00:00:00 UTC $Time" "+%s"
86400
There is also a way to do this in pure bash, without the date command (for portability).
Assuming the interval "5 minutes 10 seconds" is available in a bash variable as a string delimited by :, as below:
$ convertString="00:05:10"
$ IFS=: read -r hour minute second <<< "$convertString"
$ secondsValue=$(( (10#$hour * 60 + 10#$minute) * 60 + 10#$second ))
$ printf "%s\n" "$secondsValue"
310
You can run the above commands directly on the command-line without the $ mark.
This will do (add the epoch 19700101):
$ date -ud '19700101 5 minutes 10 seconds' +%s
310
It is important to add a -u to avoid local time (and DST) effects.
$ TZ=America/Los_Angeles date -d '19700101 5 minutes 10 seconds' +%s
29110
Note that date could do some math:
$ date -ud '19700101 +5 minutes 10 seconds -47 seconds -1 min' +%s
203
The previous suggestions didn't work properly on Alpine Linux, so here's a small helper function that is intended to be POSIX compliant, is easy to use, and also supports calculations (as a side effect of the implementation).
The function always returns an integer based on the provided parameters.
$ durationToSeconds '<value>' '<fallback>'
$ durationToSeconds "1h 30m"
5400
$ durationToSeconds "$someemptyvar" 1h
3600
$ durationToSeconds "$someemptyvar" "1h 30m"
5400
# Calculations also work
$ durationToSeconds "1h * 3"
10800
$ durationToSeconds "1h - 1h"
0
# And also supports long forms for year, day, hour, minute, second
$ durationToSeconds "3 days 1 hour"
262800
# It's also case insensitive
$ durationToSeconds "3 Days"
259200
durationToSeconds () {
    set -f
    # Lowercase the input, strip quotes and spaces, and shorten long unit names to y/d/h/m/s.
    # Months are not mapped: they have no fixed length and "m" already means minutes.
    normalize () { echo $1 | tr '[:upper:]' '[:lower:]' | tr -d "\"\\\'" | sed 's/years\{0,1\}/y/g; s/days\{0,1\}/d/g; s/hours\{0,1\}/h/g; s/minutes\{0,1\}/m/g; s/min/m/g; s/seconds\{0,1\}/s/g; s/sec/s/g; s/ //g;'; }
    local value=$(normalize "$1")
    local fallback=$(normalize "$2")
    # Reject anything that is not digits, unit letters, or arithmetic operators.
    echo $value | grep -v '^[-+*/0-9ydhms]\{0,30\}$' > /dev/null 2>&1
    if [ $? -eq 0 ]
    then
        >&2 echo Invalid duration pattern \"$value\"
    else
        if [ "$value" = "" ]; then
            [ "$fallback" != "" ] && durationToSeconds "$fallback"
        else
            # Build a sed script that turns e.g. "1h30m" into "(01 * 3600) + (030 * 60)" for bc.
            sedtmpl () { echo "s/\([0-9]\+\)$1/(0\1 * $2)/g;"; }
            local template="$(sedtmpl '\( \|$\)' 1) $(sedtmpl y '365 * 86400') $(sedtmpl d 86400) $(sedtmpl h 3600) $(sedtmpl m 60) $(sedtmpl s 1) s/) *(/) + (/g;"
            echo $value | sed "$template" | bc
        fi
    fi
    set +f
}
Edit: After the OP's comment I wrote a one-liner, POSIX-compliant command that converts the "X minutes Y seconds" format to seconds, and checked it on Mac OS X, CentOS and Ubuntu. That was the question.
echo $(($(echo "5 minutes 10 seconds" | cut -c1-2)*60 + $(echo "5 minutes 10 seconds" | cut -c1-12 | awk '{print substr($0,11)}')))
The OP told me in a comment that the input is in "X minutes Y seconds" format, not HH:MM:SS. The date command with "+%s" throws an error on my Mac. The task is to grab the numeric values from the "X minutes Y seconds" string and convert them to seconds. First I extract the minutes (call this A):
echo "5 minutes 10 seconds" | cut -c1-2
then the seconds (call this B):
echo "5 minutes 10 seconds" | cut -c1-12 | awk '{print substr($0,11)}'
Now multiply the minutes by 60 and add the seconds, where A and B stand for the two commands above:
echo $(( (A) * 60 + (B) ))
Note that the character offsets assume a one-digit minute value and a two-digit second value, as in this example. The OP should ask others to check this working but still developmental command before relying on it for repeated automatic use, such as from cron on a production server.
To run this on a log file whose values are in "X minutes Y seconds" format, replace echo "5 minutes 10 seconds" with something like cat file | .... I also kept a gist of it, so that I or others can use it with cat on server log files in that format.
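Because the cut offsets depend on the width of the numbers, splitting on whitespace may be more forgiving. The following is a minimal bash sketch, assuming the input always has the shape "X minutes Y seconds"; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# minutes_seconds_to_seconds "X minutes Y seconds"
# Hypothetical helper: splits the string on whitespace instead of fixed
# character positions, so "5 minutes 10 seconds" and
# "12 minutes 3 seconds" both work.
minutes_seconds_to_seconds() {
    local minutes seconds
    # fields: 1 = X, 2 = "minutes", 3 = Y, 4 = "seconds"; "_" discards the unit words
    read -r minutes _ seconds _ <<< "$1"
    # 10# forces base 10 in case a value has a leading zero
    echo $(( 10#$minutes * 60 + 10#$seconds ))
}

minutes_seconds_to_seconds "5 minutes 10 seconds"   # prints 310
```

To process a log file with one such value per line, the function could be called inside a while read loop.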
Although off-topic (as I understand it, the question has little to do with the current time), the following does not work on every POSIX-compliant OS:
date -d "1970-01-01 00:00:00 UTC 5 minutes 10 seconds" "+%s"
It throws an error on Mac OS X but works on most GNU/Linux distributions, and the +%s part can also fail on a POSIX-compliant OS in more complicated usage. To get the current time in seconds on practically any POSIX-compliant, Unix-like OS, these commands are more suitable:
awk 'BEGIN{srand(); print srand()}'
perl -le 'print time'
If needed, the OP can extend this by generating the current time in seconds and subtracting. I hope it helps.
---- OLD Answer before EDIT ----
You can get the current time without date, using echo | awk '{print systime();}' or wget -qO- http://www.timeapi.org/utc/now?\\s. Another way to convert a time to seconds is echo "00:20:40.25" | awk -F: '{ print ($1 * 3600) + ($2 * 60) + $3 }'.
The printf example shown in another answer is near perfect.
This kind of conversion is a recurring need in the basic GNU/Linux utilities; see gnu.org/../../../../../Setting-an-Alarm.html
The right approach really depends on how foolproof it needs to be.

Why does awk skip the second field in first entry?

I have a manually created log file of the format
date start duration description
2/5 10:00p 1:45 Did this and that.
2/6 2:00a 0:20 Woke up from my slumber.
==============================================
2:05 TOTAL time spent
There are many entries in the log. To avoid manually recomputing total time every time an entry is added, I wrote the following script:
#!/bin/bash
file=`ls | grep log`
head -n -1 $file | egrep -o [0-9]:[0-9]{2}[^ap] \
| awk '{ FS = ":" ; SUM += 60*$1 ; SUM += $2 } END { print SUM }'
First, the script assumes there is exactly one file with log in its name, and that's the file I'm after. Second, it takes all lines other than the line with the current total, greps the time information from the line, and feeds it to awk, which converts it to minutes.
This is where I run into problems. The final sum would always be slightly off. Through trial and error, I discovered that awk will never count the second field of the very first record, e.g. the 45 minutes in this case. It will count the hour; it won't count the minutes. It has no such problem with the other records, but it's always off by the minutes in the first record.
What could be causing this behavior? How do I debug it?
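You set FS inside the main rule, and by then it is already too late for the first line: awk has split that record on the default whitespace separator before the assignment runs, so $2 is empty. The new FS only takes effect from the second record onward.
The right way to do it is to set FS in a BEGIN block: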
echo -e "1:45\n0:20" | awk 'BEGIN { FS=":" } { SUM += 60*$1 + $2 } END { print SUM }'
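Setting the separator with -F: on the command line has the same effect as the BEGIN block, because it is applied before the first record is read. As a small sketch (the function name total_duration is made up for illustration), the sum can also be printed back in the H:MM form used by the log's TOTAL line:

```shell
#!/usr/bin/env bash
# total_duration
# Hypothetical helper: reads H:MM durations on stdin, one per line, and
# prints their sum as H:MM.
total_duration() {
    # -F: sets the field separator before any record is split
    awk -F: '{ sum += 60 * $1 + $2 }
             END { printf "%d:%02d\n", int(sum / 60), sum % 60 }'
}

printf '1:45\n0:20\n' | total_duration   # prints 2:05
```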
You did not show us what output you expect.
Is it something like this?
$ cat log
date start duration description
2/5 10:00p 1:45 Did this and that.
2/6 2:00a 0:20 Woke up from my slumber.
==============================================
2:05 TOTAL time spent
Awk Code
awk '$3~/([[:digit:]]):([[:digit:]])/ && !/TOTAL/{
split($3,A,":")
sum+=A[1]*60+A[2]
}
END{
print "Total",sum,"Minutes"
}' log
Resulting
Total 125 Minutes

How can I convert a logfile entry date into a date in the future in bash

I'm sure this answer is obvious but I'm banging my head on it and getting a headache and my Search Foo is failing me…
I have a log file with this date format:
Sep 1 16:55:00 stuff happening
Sep 1 16:55:01 THIS IS THE LINE YOU WANT at this time stamp
Sep 1 16:55:02 more stuff
Sep 1 16:55:02 THIS IS THE LINE YOU WANT at this time stamp
Sep 1 16:55:03 blah
Sep 1 16:55:04 blah and so on…..
My ultimate goal is to:
Find the last line in the log file with a given string eg: "THIS IS THE LINE…" this is my "magic time" that I will do calculations on later.
Take the date of that line and set a variable that is the date +NN seconds. The time in the future will usually be just short of 24 hours after the time in step 1, so crossing into the next day may happen, if that is important.
At some point in the script, advance the system clock to the new date/time after which I will be checking for certain events to fire.
I know this is way wrong but so far I have figured out how to:
Grab the last date stamp for my event.
logDate=$(cat /logdir/my.log | grep "THIS IS THE LINE" | tail -1 | cut -f1,2,3 -d" ")
Returns: Sept 1 16:55:02
Convert the date into a more usable format
logDate2="$(date -d "$logDate" +"%m-%d %H:%M:%S")"; echo $logDate2
Returns: 09-17 16:55:02
I'm stuck here - what I want is:
futuredate=$logdate2 + XXXSeconds
Could someone help me with the time calculation or perhaps point out a better way to do all of this?
Thanks.
I'm stuck here - what I want is:
futuredate=$logdate2 + XXXSeconds
You can do it by converting through timestamps:
# convert log date to timestamp
logts="$(date -d "$logDate" '+%s')"
# add timestamp with seconds
futurets=$(( logts + XXXSeconds ))
# get date based from timestamp, optionally you can add a format.
futuredate=$(date -d "@${futurets}")
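The conversion through epoch seconds can be wrapped in a small function. This is only a sketch, assuming GNU date (for -d and the @ epoch syntax); future_date and the output format are illustrative choices, and a log date without a year is interpreted in the current year.

```shell
#!/usr/bin/env bash
# future_date "LOG DATE" SECONDS
# Hypothetical helper: prints the log-style date that lies SECONDS after
# the given date. Assumes GNU date.
future_date() {
    local log_date=$1 offset=$2 ts
    # parse the log date into seconds since the epoch
    ts=$(date -d "$log_date" '+%s') || return 1
    # "@N" tells GNU date to read N as seconds since the epoch
    date -d "@$(( ts + offset ))" '+%b %e %H:%M:%S'
}

# Example: 86000 seconds (just under 24 hours) after the log entry
future_date "Sep 1 16:55:02" 86000
```

The result crosses into the next day automatically, since the addition happens on the epoch value and not on the formatted string.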
# Get time in seconds since the epoch (1970-01-01 00:00:00 UTC)
dateinseconds=$(date +"%s" -d "$(grep "THIS IS THE LINE" logfile | tail -1 | awk '{print $1, $2, $3}')")
# You can also use just awk, without grep and tail, to print the date of the last matching line
dateinseconds=$(date +"%s" -d "$(awk '/THIS IS THE LINE/ {d = $1 " " $2 " " $3} END {print d}' logfile)")
gotofuture=$(( dateinseconds + 2345 )) # Add 2345 seconds
newdate=$(date -d "@${gotofuture}")
echo "$newdate"

pull last 5 minutes of syslog data (750mb) with tac combo sed/awk/grep/?

I'm trying to pull the last 5 minutes of logs (with grep matches),
so I do something like tac syslog.log | sed / date -d "5 minutes ago"
Every line in the log starts with a timestamp in this format:
Jun 14 14:03:58
Jul 3 08:04:35
so I really want to get the chunk of data between
Jul 4 08:12
Jul 4 08:17
I tried this method, which kind of works, though it still goes through every day in the log on which 08:12 through 08:17 occurs:
e=""
for (( i = 5; i >= 0; i-- ))
do
e='-e /'`date +\%R -d "-$i min"`':/p '$e;
done
tac /var/log/syslog.log | sed -n $e
e=""
for (( i = 5; i >= 0; i-- ))
do
if [[ -z $e ]]
then e=`date +\%R -d "-$i min"`
else e=$e'\|'`date +\%R -d "-$i min"`
fi
done
re=' \('$e'\):'
tac /var/log/syslog.log | sed -n -e "/$re/p" -e "/$re/!q"
This creates a single regular expression listing all the times from the last 5 minutes, joined with \|. sed prints the lines that match it, then uses the ! modifier to quit on the first line that doesn't match the RE.
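The regex construction can be isolated in a function that takes the reference time as an argument, which makes it easy to check. This is a sketch assuming GNU date; minutes_regex is a hypothetical name, and it emits an extended regular expression (for sed -E or grep -E), not the \| form used above.

```shell
#!/usr/bin/env bash
# minutes_regex NOW_EPOCH COUNT
# Hypothetical helper: prints an extended regex matching the HH:MM stamps
# of NOW_EPOCH and the COUNT minutes before it, e.g. " (08:17|08:16|...):".
# Assumes GNU date for the "@epoch" syntax.
minutes_regex() {
    local now=$1 count=$2 i re=""
    for (( i = 0; i <= count; i++ )); do
        # append "|" only when re already holds an earlier alternative
        re+="${re:+|}$(date -d "@$(( now - i * 60 ))" '+%R')"
    done
    printf ' (%s):\n' "$re"
}

# Example (assumes /var/log/syslog.log exists):
#   re=$(minutes_regex "$(date +%s)" 5)
#   tac /var/log/syslog.log | sed -E -n -e "/$re/p" -e "/$re/!q"
```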
If you know the format of the dates then why not do:
tac syslog.log | awk '/Jul 4 08:17/,/Jul 4 08:12/ { print } /Jul 4 08:11/ {exit}'
/ .. /,/ .. / is a regex range: awk prints everything in that range. As soon as /Jul 4 08:11/ appears on a line, your 5-minute window has been captured, so you stop reading the file.
So the above method didn't really work, but I think I got it working.
I added a range for the {exit} as well:
awk '/'"$dtnow"'/,/'"$dt6min"'/ { print } /'"$dt7min"'/,/'"$dt11min"'/ {exit}'
It seems to work; I'm testing it again.
OK, it finally looks like it really works this time, exiting after the hour and using sed instead of awk. I ran it through some tests:
tac /var/log/syslog.log | sed -e "$( date -d '-1 hour -6 minutes' '+/^%b %e %H:/q;'
date -d '-1 day -6 minutes' '+/^%b %e /q;'
date -d '-1 month -6 minutes' '+/^%b /q;'
for ((o=0;o<=5;o++)) do date -d "-$o minutes" '+/^%b %e %R:/p;'; done ; echo d)"
This works if log entries begin with a timestamp such as "May 14 11:41". The variable LASTMINUTES sets how many of the last minutes of the log to print:
cat log | awk 'BEGIN{ LASTMINUTES=30; for (L=0;L<=LASTMINUTES;L++) TAB[strftime("%b %d %H:%M",systime()-L*60)] } { if (substr($0,1,12) in TAB) print $0 }'
To run the above script you need gawk, which can be installed with:
apt-get install gawk
or
yum install gawk
