Shell script - is there a faster way to write date/time per second between start and end time? - bash

I have this script (which works fine) that writes one date/time line per second, from a start date/time to an end date/time, to a file:
while read line; do
FIRST_TIMESTAMP="20230109-05:00:01" #this is normally a variable that changes with each $line
LAST_TIMESTAMP="20230112-07:00:00" #this is normally a variable that changes with each $line
date=$FIRST_TIMESTAMP
while [[ $date < $LAST_TIMESTAMP || $date == $LAST_TIMESTAMP ]]; do
date2=$(echo $date |sed 's/ /-/g' |sed "s/^/'/g" |sed "s/$/', /g")
echo "$date2" >> "OUTPUTFOLDER/output_LABELS_$line"
date=$(date -d "$date +1 sec" +"%Y%m%d %H:%M:%S")
done
done < external_file
However, this sometimes needs to run 10 times, and the start and end date/time sometimes lie days apart, which makes the script take a long time to write all that data.
Now I am wondering if there is a faster way to do this.

Avoid using a separate date call for each date. In the next example I added a safety parameter, maxloop, to avoid wasting resources when the dates are wrong. Note that mktime() and strftime() are gawk extensions.
#!/bin/bash
awkdates() {
maxloop=1000000
awk \
-v startdate="${first_timestamp:0:4} ${first_timestamp:4:2} ${first_timestamp:6:2} ${first_timestamp:9:2} ${first_timestamp:12:2} ${first_timestamp:15:2}" \
-v enddate="${last_timestamp:0:4} ${last_timestamp:4:2} ${last_timestamp:6:2} ${last_timestamp:9:2} ${last_timestamp:12:2} ${last_timestamp:15:2}" \
-v maxloop="${maxloop}" \
'BEGIN {
T1=mktime(startdate);
T2=mktime(enddate);
linenr=1;
while (T1 <= T2) {
printf("%s\n", strftime("%Y%m%d %H:%M:%S",T1));
T1+=1;
if (linenr++ > maxloop) break;
}
}'
}
mkdir -p OUTPUTFOLDER
while IFS= read -r line; do
first_timestamp="20230109-05:00:01" #this is normally a variable that changes with each $line
last_timestamp="20230112-07:00:00" #this is normally a variable that changes with each $line
awkdates >> "OUTPUTFOLDER/output_LABELS_$line"
done < <(printf "%s\n" "line1" "line2")

Using epoch time (+%s and @) with GNU date and GNU seq to
produce datetimes in ISO 8601 date format:
begin=$(date -ud '2023-01-12T00:00:00' +%s)
end=$(date -ud '2023-01-12T00:00:12' +%s)
seq -f "@%.0f" "$begin" 1 "$end" |
date -uf - -Isec
2023-01-12T00:00:00+00:00
2023-01-12T00:00:01+00:00
2023-01-12T00:00:02+00:00
2023-01-12T00:00:03+00:00
2023-01-12T00:00:04+00:00
2023-01-12T00:00:05+00:00
2023-01-12T00:00:06+00:00
2023-01-12T00:00:07+00:00
2023-01-12T00:00:08+00:00
2023-01-12T00:00:09+00:00
2023-01-12T00:00:10+00:00
2023-01-12T00:00:11+00:00
2023-01-12T00:00:12+00:00

If you're using the macOS/BSD date utility instead of the GNU one, the equivalent command to parse would be:
(bsd)date -uj -f '%FT%T' '2023-01-12T23:34:45' +%s
1673566485
...and the reverse process uses the -r flag instead of -d, without the "@" prefix:
(bsd)date -uj -r '1673566485' -Iseconds
2023-01-12T23:34:45+00:00
(gnu)date -u -d '@1673566485' -Iseconds
2023-01-12T23:34:45+00:00
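Applied to the timestamp format from the question, the epoch-based approach could look roughly like the sketch below. The function name per_second_labels is made up for illustration, and the sketch assumes GNU date and GNU seq and treats the timestamps as UTC; it is not taken from any of the answers.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): print one line per second between
# two timestamps given as YYYYmmdd-HH:MM:SS. Assumes GNU date and GNU seq,
# and interprets the timestamps as UTC.
per_second_labels() {
    local first=$1 last=$2 begin end
    # Replace the first "-" with a space so GNU date can parse the timestamp.
    begin=$(date -ud "${first/-/ }" +%s) || return 1
    end=$(date -ud "${last/-/ }" +%s) || return 1
    # seq emits one @EPOCH line per second; a single date process formats them all.
    seq -f '@%.0f' "$begin" 1 "$end" | date -uf - +"'%Y%m%d-%H:%M:%S', "
}

per_second_labels "20230109-05:00:01" "20230109-05:00:03"
```

Because only one seq and one date process are started for the formatting step, the cost no longer grows with one process per second of the range.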

Related

Time difference in seconds between given two dates

I have two dates as follows:
2019-01-06 00:02:10 | END
2019-01-05 23:52:00 | START
How could I calculate and print the difference between START and END dates in seconds?
For above case I would like to get something like:
610
Assuming an OS with the GNU implementation of date, you can use date's -d option and %s format to calculate the time difference in seconds, using command substitution and arithmetic expansion.
START="2019-01-05 23:52:00"
END="2019-01-06 00:02:10"
Time_diff_in_secs=$(($(date -d "$END" +%s) - $(date -d "$START" +%s)))
echo $Time_diff_in_secs
Output:
610
Hope this helps!!!
With bash and GNU date:
while read d t x x; do
[[ $x == "END" ]] && end="$d $t"
[[ $x == "START" ]] && start="$d $t"
done < file
end=$(date -u -d "$end" '+%s')
start=$(date -u -d "$start" '+%s')
diff=$(($end-$start))
echo "$diff"
Output:
610
See: man date
What you're asking for is difficult verging on impossible using pure bash. Bash doesn't have any date functions of its own. For date processing, most recommendations you'll get will be to use your operating system's date command, but the usage of this command varies by operating system.
In BSD (including macOS):
start="2019-01-05 23:52:00"; end="2019-01-06 00:02:10"
printf '%d\n' $(( $(date -j -f '%F %T' "$end" '+%s') - $(date -j -f '%F %T' "$start" '+%s') ))
In Linux, or anything using GNU date (possibly also Cygwin):
printf '%d\n' $(( $(date -d "$end" '+%s') - $(date -d "$start" '+%s') ))
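If the same script has to run on both kinds of system, the two variants can be wrapped in one function. The sketch below is only an illustration: the names to_epoch and seconds_between are invented, and it assumes that `date --version` succeeds only on GNU date. It also adds -u so that both timestamps are read as UTC, which the commands above do not do.

```shell
#!/bin/bash
# Hypothetical helper: convert "YYYY-mm-dd HH:MM:SS" to epoch seconds with
# whichever date implementation is installed.
# Assumption: only GNU date accepts --version.
to_epoch() {
    if date --version >/dev/null 2>&1; then
        date -u -d "$1" +%s                          # GNU date
    else
        date -u -j -f '%Y-%m-%d %H:%M:%S' "$1" +%s   # BSD/macOS date
    fi
}

# Print the number of seconds from the first timestamp to the second.
seconds_between() {
    local start_epoch end_epoch
    start_epoch=$(to_epoch "$1") || return 1
    end_epoch=$(to_epoch "$2") || return 1
    echo $(( end_epoch - start_epoch ))
}

seconds_between "2019-01-05 23:52:00" "2019-01-06 00:02:10"   # prints 610
```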
And just for the fun of it, if you can't (or would prefer not to) use date for some reason, you might be able to get away with gawk:
gawk 'END{ print mktime(gensub(/[^0-9]/," ","g",end)) - mktime(gensub(/[^0-9]/," ","g",start)) }' start="$start" end="$end" /dev/null
The mktime() function parses a date string in almost exactly the format you're providing, making the math easy.
START="2019-01-05 23:52:00"
END="2019-01-06 00:02:10"
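parse () {
# Split the timestamp into its six numeric fields (needs GNU grep for -P).
local data=($(grep -oP '\d+' <<< "$1"))
# 10# forces base 10 so that fields such as 08 and 09 are not read as octal.
# Months are approximated as 30 days, so the result is only reliable when
# both timestamps fall in the same month.
local y=$((10#${data[0]}*12*30*24*60*60))
local m=$((10#${data[1]}*30*24*60*60))
local d=$((10#${data[2]}*24*60*60))
local h=$((10#${data[3]}*60*60))
local mm=$((10#${data[4]}*60))
echo $((y+m+d+h+mm+10#${data[5]}))
}
START=$(parse "$START")
END=$(parse "$END")
echo $((END-START)) # OUTPUT: 610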
I was trying to solve the same problem on a non-GNU OS, i.e. macOS. I couldn't apply any of the solutions above, although they inspired me to come up with the following solution. I am using some inline Ruby from within my shell script, which should work out of the box on macOS.
START="2019-01-05 23:52:00"
END="2019-01-06 00:02:10"
# Avoid the name SECONDS: bash treats it as a special, auto-incrementing variable.
diff_secs=$(ruby << RUBY
require 'date'
puts ((DateTime.parse('${END}') - DateTime.parse('${START}')) * 60 * 60 * 24).to_i
RUBY
)
echo "${diff_secs}"
# 610

Using a Loop To Search Only Logs In A Time Window

I'm trying to find the pattern "INFO: Server startup in" within the last 5 minutes of a log file.
Here is the line in which I'm trying to find the pattern: "INFO | jvm 1 | main | 2018/07/09 00:11:29.077 | INFO: Server startup in 221008 ms"
The pattern is found, but I need to shorten the code or create a loop for it.
I tried to create a loop, but it is not working. Here is my code without loops, which works:
#!/bin/bash
#Written by Ashutosh
#We will declare variables with date and time of last 5 mins.
touch /tmp/a.txt;
ldt=$(date +"%Y%m%d");
cdt=$(date +"%Y/%m/%d %H:%M");
odtm5=$(date +"%Y/%m/%d %H:%M" --date "-5 min");
odtm4=$(date +"%Y/%m/%d %H:%M" --date "-4 min");
odtm3=$(date +"%Y/%m/%d %H:%M" --date "-3 min");
odtm2=$(date +"%Y/%m/%d %H:%M" --date "-2 min");
odtm1=$(date +"%Y/%m/%d %H:%M" --date "-1 min");
## Finding the pattern and storing it in a file
grep -e "$odtm1" -e "$cdt" -e "$odtm2" -e "$odtm3" -e "$odtm4" -e "$odtm5" /some/log/path/console-$ldt.log > /tmp/a.txt;
out=$(grep 'INFO: Server startup in' /tmp/a.txt);
echo "$out"
## remove the file that contains the pattern
rm /tmp/a.txt;
I have tried to use sed also, but date function is not working with it.
Can someone please give me the new changed script with loops?
Adopting your original logic:
time_re='('
for ((count=5; count>0; count--)); do
time_re+="$(date +'%Y/%m/%d %H:%M' --date "-$count min")|"
done
time_re+="$(date +'%Y/%m/%d %H:%M'))"
ldt=$(date +'%Y%m%d')
awk -v time_re="$time_re" '
$0 ~ time_re && /INFO: Server startup in/ { print $0 }
' "/some/log/path/console-$ldt.log"
Performance enhancements are certainly possible -- this could be made much faster by bisecting the log for the start time -- but the above addresses the explicit question (about using a loop to generate the time window). Note that it will get unwieldy -- you wouldn't want to use this to search for the last day, for example, as the regex would become utterly unreasonable.
Sounds like all you need is:
awk -v start="$(date +'%Y/%m/%d %H:%M' --date '-5 min')" -F'[[:space:]]*[|][[:space:]]*' '
($4>=start) && /INFO: Server startup in/
' file
No explicit loops or multiple calls to date required.
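This works because the timestamps are zero-padded and ordered from year down to seconds, so comparing them as plain strings gives the same result as comparing them as times. A minimal bash sketch of that comparison follows; the helper name in_window is hypothetical, and forcing the C locale is my own precaution rather than something the answer requires.

```shell
#!/bin/bash
# Hypothetical helper: succeed when a log timestamp is at or after the start
# of the window. Both arguments must use the same zero-padded layout.
in_window() {
    local LC_ALL=C            # byte-wise comparison, independent of the locale
    local stamp=$1 start=$2
    [[ ! $stamp < $start ]]   # "not less than" means "greater than or equal"
}

if in_window "2018/07/09 00:11:29.077" "2018/07/09 00:06"; then
    echo "inside the window"
fi
if ! in_window "2018/07/08 23:59:59.000" "2018/07/09 00:06"; then
    echo "outside the window"
fi
```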
Here is a bash script that does the job (thanks to Charles for its improvement):
#!/bin/bash
limit=$(date -d '5 minutes ago' +%s)
today_logs="/some/log/path/console-$(date +'%Y%m%d').log"
yesterday_logs="/some/log/path/console-$(date +'%Y%m%d' -d yesterday).log"
tac "$today_logs" "$yesterday_logs" \
| while IFS='|' read -r prio jvm app date log; do
[ "$(date -d "$date" +%s)" -lt "$limit" ] && break
echo "$prio|$jvm|$app|$date|$log"
done \
| grep -F 'INFO: Server startup in' \
| tac
It has the following advantages over your original script:
- Optimized: it parses log lines starting from the most recent ones and stops at the first line that is more than 5 minutes old. At 23:59, there is no need to parse the log lines from 0:00 to 23:53.
- Arbitrary time window: you can replace "5 minutes" with "18 hours" and it will still work. A time window of more than one day needs adaptation, since each day has its own log file.
- Works correctly when the day changes: at 0:00 the original script will never parse the log lines from 23:55:00 to 23:59:59.
Mixing the above code with Ed Morton's answer, you get:
#!/bin/bash
limit=$(date -d '5 minutes ago' +'%Y/%m/%d %H:%M')
today_logs="/some/log/path/console-$(date +'%Y%m%d').log"
yesterday_logs="/some/log/path/console-$(date +'%Y%m%d' -d yesterday).log"
tac "$today_logs" "$yesterday_logs" \
| awk -v stop="$limit" -F'[[:space:]]*[|][[:space:]]*' '
($4 < stop) { exit }
/INFO: Server startup in/
' \
| tac

Mac OSX Shell script parse ISO 8601 date and add one second?

I am trying to figure out how to parse a file with ISO 8601-formatted time stamps, add one second and then output them to a file.
All the examples I have found don't really tell me how to do it with ISO 8601 date/time strings.
Example:
read a csv of times like: "2017-02-15T18:47:59" (some are correct, others are not)
and spit out in a new file "2017-02-15T18:48:00"
mainly just trying to correct a bunch of dates that have 59 seconds at the end to round up to the 1 second mark.
This is my current progress:
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
# startd=$(date -j -f '%Y%m%d' "$line" +'%Y%m%d');
# echo "$startd";
startd=$(date -j -u -f "%a %b %d %T %Z %Y" $line)
#startd=$(date -j -f '%Y%m%d' "$line" +'%Y%m%d');
echo "$startd";
done < "$1"
Any help would be appreciated
jm666's helpful perl answer will be much faster than your shell loop-based approach.
That said, if you want to make your bash code work on macOS, with its BSD date implementation, here's a solution:
# Define the input date format, which is also used for output.
fmt='%Y-%m-%dT%H:%M:%S'
# Note: -j in all date calls below is needed to suppress setting the
# system date.
while IFS= read -r line || [[ -n "$line" ]]; do
# Parse the line at hand using input format specifier (-f) $fmt,
# and output-format specifier (+) '%s', which outputs a Unix epoch
# timestamp (in seconds).
ts=$(date -j -f "$fmt" "$line" +%s)
# See if the seconds-component (%S) is 59...
if [[ $(date -j -f %s "$ts" +%S) == '59' ]]; then
# ... and, if so, add 1 second (-v +1S).
line=$(date -j -f %s -v +1S "$ts" +"$fmt")
fi
# Output the possibly adjusted timestamp.
echo "$line"
done < "$1"
Note that input dates such as 2017-02-15T18:47:59 are interpreted as local time, because they contain no time-zone information.
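For comparison, the same "bump timestamps ending in :59" logic could be sketched for GNU date as follows. This is an assumption-laden variant, not part of the answer above: the function name round_up_59 is invented, it requires GNU date, and it reads the timestamps as UTC (-u) rather than local time. It goes through epoch seconds instead of appending "+1 second" to the date string, because GNU date can misread a bare "+1" after a time as a time-zone offset.

```shell
#!/bin/bash
# Input and output format, as in the BSD version.
fmt='%Y-%m-%dT%H:%M:%S'

# Hypothetical helper: print the timestamp, advanced by one second if its
# seconds component is 59. Assumes GNU date; timestamps are read as UTC.
round_up_59() {
    local stamp=$1 epoch
    if [[ $stamp == *:59 ]]; then
        epoch=$(date -u -d "$stamp" +%s) || return 1
        date -u -d "@$((epoch + 1))" +"$fmt"
    else
        printf '%s\n' "$stamp"
    fi
}

round_up_59 "2017-02-15T18:47:59"   # prints 2017-02-15T18:48:00
round_up_59 "2010-10-10T10:10:10"   # prints 2010-10-10T10:10:10
```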
This could do the job
perl -MTime::Piece -nlE '$f=q{%Y-%m-%dT%H:%M:%S};$t=Time::Piece->strptime($_,$f)+1;say $t->strftime($f)' < dates.txt
if the dates.txt contains
2017-02-15T18:47:59
2016-02-29T23:59:59
2017-02-28T23:59:59
2015-12-31T23:59:59
2010-10-10T10:10:10
the above produces
2017-02-15T18:48:00
2016-03-01T00:00:00
2017-03-01T00:00:00
2016-01-01T00:00:00
2010-10-10T10:10:11

Parsing date and time format - Bash

I have a date and time in this format (year, month, day):
20141105 11:30:00
I need to assign the year, month, day, hour and minute values to variables.
I can do it for year, day and hour like this:
year=$(awk '{print $1}' log.log | sed 's/^\(....\).*/\1/')
day=$(awk '{print $1}' log.log | sed 's/^.*\(..\).*/\1/')
hour=$(awk '{print $2}' log.log | sed 's/^\(..\).*/\1/')
How can I do this for month and minute?
--
And I need to do that for every line of my log file:
20141105 11:30:00 /bla/text.1
20141105 11:35:00 /bla/text.2
20141105 11:40:00 /bla/text.3
....
I'm trying to read this log file line by line and do this:
mkdir -p "/bla/backup/$year/$month/$day/$hour/$minute"
mv $file "/bla/backup/$year/$month/$day/$hour/$minute"
Here is my code, which does not work:
#!/bin/bash
LOG=/var/log/LOG
while read line
do
year=${line:0:4}
month=${line:4:2}
day=${line:6:2}
hour=${line:9:2}
minute=${line:12:2}
file=$(awk '{print $3}')
if [ -f "$file" ]; then
printf -v path "%s/%s/%s/%s/%s" $year $month $day $hour $minute
mkdir -p "/bla/backup/$path"
mv $file "/bla/backup/$path"
fi
done < $LOG
You don't need to call out to awk or date at all; use bash's substring operations:
d="20141105 11:30:00"
yr=${d:0:4}
mo=${d:4:2}
dy=${d:6:2}
hr=${d:9:2}
mi=${d:12:2}
printf -v dir "/bla/%s/%s/%s/%s/%s/" $yr $mo $dy $hr $mi
echo "$dir"
/bla/2014/11/05/11/30/
Or directly, without all the variables.
printf -v dir "/bla/%s/%s/%s/%s/%s/" ${d:0:4} ${d:4:2} ${d:6:2} ${d:9:2} ${d:12:2}
Given your log file:
while read -r date time file; do
d="$date $time"
printf -v dir "/bla/%s/%s/%s/%s/%s/" ${d:0:4} ${d:4:2} ${d:6:2} ${d:9:2} ${d:12:2}
mkdir -p "$dir"
mv "$file" "$dir"
done < filename
or, making a big assumption that there are no whitespace or globbing characters in your filenames:
sed -r 's#(....)(..)(..) (..):(..):.. (.*)#mkdir -p /bla/\1/\2/\3/\4/\5 \&\& mv \6 /bla/\1/\2/\3/\4/\5#' filename | sh
The date command can also do this:
#!/bin/bash
year=$(date +'%Y' -d'20141105 11:30:00')
day=$(date +'%d' -d'20141105 11:30:00')
month=$(date +'%m' -d'20141105 11:30:00')
minutes=$(date +'%M' -d'20141105 11:30:00')
echo "$year---$day---$month---$minutes"
You can do it with a single awk call per field (note that awk's substr() counts from 1):
month=$(awk '{print substr($1,5,2)}' log.log)
year=$(awk '{print substr($1,1,4)}' log.log)
minute=$(awk '{print substr($2,4,2)}' log.log)
etc
I guess you are processing a log file in which each line starts with the date string. You may have already written a loop to handle each line; in your loop, you could do:
d="$(awk '{print $1,$2}' <<<"$line")"
year=$(date -d"$d" +%Y)
month=$(date -d"$d" +%m)
day=$(date -d"$d" +%d)
min=$(date -d"$d" +%M)
Don't repeat yourself.
d='20141105 11:30:00'
IFS=' ' read -r year month day min < <(date -d"$d" '+%Y %m %d %M')
echo "year: $year"
echo "month: $month"
echo "day: $day"
echo "min: $min"
The trick is to ask date to output the fields you want, separated by a character (here a space), to put this character in IFS and ask read to do the splitting for you. Like so, you're only executing date once and only spawn one subshell.
If the date comes from the first line of the file log.log, here's how you can assign it to the variable d:
IFS= read -r d < log.log
eval "$(
echo '20141105 11:30:00' \
| sed 'G;s/\(....\)\(..\)\(..\) \(..\):\(..\):\(..\) *\(.\)/Year=\1\7Month=\2\7Day=\3\7Hour=\4\7Min=\5\7Sec=\6/'
)"
This builds a string of assignments and passes it to eval. You could easily adapt it to also validate the content by replacing the dots with more specific patterns, such as [0-5][0-9] for minutes and seconds.
This is the POSIX version, so use --posix with GNU sed.
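The same idea of splitting and validating in one step can be sketched in pure bash with a regular expression and BASH_REMATCH, which avoids eval entirely. The function name parse_stamp and the exact patterns are illustrative choices, not part of the answer above.

```shell
#!/bin/bash
# Hypothetical helper: split "YYYYmmdd HH:MM:SS ..." into variables and reject
# lines whose fields are out of range. Sets the global variables year, month,
# day, hour, minute and sec on success; returns 1 otherwise.
parse_stamp() {
    local re='^([0-9]{4})(0[1-9]|1[0-2])(0[1-9]|[12][0-9]|3[01]) ([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9])'
    [[ $1 =~ $re ]] || return 1
    year=${BASH_REMATCH[1]}
    month=${BASH_REMATCH[2]}
    day=${BASH_REMATCH[3]}
    hour=${BASH_REMATCH[4]}
    minute=${BASH_REMATCH[5]}
    sec=${BASH_REMATCH[6]}
}

if parse_stamp '20141105 11:30:00 /bla/text.1'; then
    echo "$year/$month/$day/$hour/$minute"   # prints 2014/11/05/11/30
fi
```

The pattern only checks ranges field by field, so a date such as 20140231 would still be accepted.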
I wrote a function that I usually cut and paste into my script files
function getdate()
{
local a
a=($(date "+%Y %m %d %H %M %S"))
year=${a[0]}
month=${a[1]}
day=${a[2]}
hour=${a[3]}
minute=${a[4]}
sec=${a[5]}
}
Then, in the script file, on a line of its own:
getdate
echo "year=$year,month=$month,day=$day,hour=$hour,minute=$minute,second=$sec"
Of course, you can modify what I provided or use answer [6] above.
The function takes no arguments.

Is there a way to delete all log entries in a file older than a certain date? (BASH)

I have a log file from which I'm trying to delete all entries older than a specified date, but I haven't succeeded yet. What I've tested so far takes an input for how old the entries must be to be deleted, and then loops like this:
#!/bin/bash
COUNTER=7
DATE=$(date -d "-${COUNTER} days" +%s)
DATE=$(date -d "@${DATE}" "+%Y-%m-%d")
while [ -n "$(grep $DATE test.txt)" ]; do
sed -i "/$DATE/d" test.txt
COUNTER=$((${COUNTER}+1))
DATE=$(date -d "-${COUNTER} days" +%s)
DATE=$(date -d "@${DATE}" +"%Y-%m-%d")
done
This kind of works, except when no log entry exists for a date. When it doesn't find a match, it aborts the loop and the even older entries are kept.
Update
This was how I solved it:
#!/bin/bash
COUNTER=$((7+1))
DATE=$(date -d "-${COUNTER} days" +%s)
DATE=$(date -d "@${DATE}" "+%Y-%m-%d")
if [ -z "$(grep $DATE test.txt)" ]; then
exit 1
fi
sed -i "1,/$DATE/d" test.txt
Sorry for answering my own question but I went with Martin Frost's suggestion in the comments. It was much easier than the other suggestions.
This was my implementation:
#!/bin/bash
# requirements for script script
COUNTER=$((7+1))
DATE=$(date -d "-${COUNTER} days" +%s)
DATE=$(date -d "@${DATE}" "+%Y-%m-%d")
sed -i "1,/$DATE/d" test.txt
Thanks for all the help!
Depending on your logfile format, assuming that the timestamp is the first column in the file you can do it like this with (g)awk.
awk 'BEGIN { OneWeekEarlier=strftime("%Y-%m-%d",systime()-7*24*60*60) }
$1 <= OneWeekEarlier { next }
1' INPUTLOG > OUTPUTLOG
This computes the date - surprise, surprise - one week earlier, then checks if the first column (white space separated columns by default) is less than or equal, and if true, skips the line, otherwise prints.
The hard part is doing the "in place" editing with awk. But it can be done:
{ rm LOGFILE && awk 'BEGIN { OneWeekEarlier=strftime("%Y-%m-%d",systime()-7*24*60*60) }
$1 <= OneWeekEarlier { next }
1' > LOGFILE ; } < LOGFILE
HTH
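For systems without gawk, the same comparison on the first column can be sketched in bash. As with the awk version, it relies on the dates being in YYYY-MM-DD form so that string order equals date order. The function name keep_newer_than is made up for this example, and it reads the log on standard input rather than editing in place.

```shell
#!/bin/bash
# Hypothetical helper: copy stdin to stdout, dropping every line whose first
# space-separated column (a YYYY-MM-DD date) is less than or equal to the
# cutoff date given as the first argument.
keep_newer_than() {
    local LC_ALL=C cutoff=$1 line stamp
    while IFS= read -r line || [[ -n $line ]]; do
        stamp=${line%% *}                      # first column
        [[ $stamp > $cutoff ]] && printf '%s\n' "$line"
    done
    return 0
}

printf '%s\n' '2023-01-01 old entry' '2023-01-05 boundary entry' '2023-01-06 recent entry' |
    keep_newer_than '2023-01-05'               # prints only the 2023-01-06 line
```

With GNU date, a cutoff of one week ago could be obtained with cutoff=$(date -d '7 days ago' +%Y-%m-%d) and passed as the argument.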
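I delete log records from 60 days ago in syslog-ng files with the following code. Note that it removes only the entries stamped with that one day, so it has to run daily to keep the log trimmed.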
#!/bin/bash
LOGFILE=/var/log/syslog
DATE=`date +"%b %e" --date="-60days"`
sed -i "/$DATE/d" $LOGFILE

Resources