Grep dates from file and format them - bash

I have a file "tmp.txt" that looks like this:
random text random text 25/06/2021 15:15:15
random text random text 26/06/2021 15:15:15
random text random text 26/06/2021 15:15:15
and I would like to:
extract all datetimes
add 4 hours
display them as timestamp
I haven't figured out yet how to add the hours, as I'm facing an issue with the date format not being recognized by the date command.
(I would like to be able to do it with a single line command if possible)
Here is my current command:
egrep -o "[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}" tmp.txt | while read -r line ; do echo $(date -d "$line" +%s);done
Help appreciated!

Tried and Tested, Minimal Solution
You can use the below command line to get the desired result. I have tested it with your example and it worked as expected on my Linux machine.
egrep -o "[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}" tmp.txt | while read -r line; do dd=${line:0:2}; mm=${line:3:2}; yyyy=${line:6:4}; time=${line:11:8}; date -d "${yyyy}-${mm}-${dd} ${time} 4 hours" +'%Y-%m-%d %H:%M:%S'; done
I'll break it down into multiple lines so it's easy to understand:
egrep -o "[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}" tmp.txt \
| \
while read -r line; do
# Reading date and time into separate variables
dd=${line:0:2};
mm=${line:3:2};
yyyy=${line:6:4};
time=${line:11:8};
# Adding 4 hours and displaying datetime in desired format
date -d "${yyyy}-${mm}-${dd} ${time} 4 hours" +'%Y-%m-%d %H:%M:%S';
done
To add 4 hours, just append the offset after the datetime in the -d option, as shown above. I tried it with hours, minutes and days, and it worked as expected.
For your input file tmp.txt:
random text random text 25/06/2021 15:15:15
random text random text 26/06/2021 15:15:15
random text random text 26/06/2021 15:15:15
On running my command, the output was:
2021-06-25 19:15:15
2021-06-26 19:15:15
2021-06-26 19:15:15
I tested it with edge cases such as times close to midnight and leap years, and it worked fine.
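The question asked for the result as a timestamp, while the command above prints a formatted date. Here is a minimal sketch of that variant, assuming GNU date and a sed that supports -E; the function name shifted_epochs is made up for illustration. It rewrites each match to ISO order, lets a single date process convert all of them to epoch seconds, and adds the 4 hours with shell arithmetic:

```bash
# Hypothetical helper: print epoch seconds (+4 hours) for every
# dd/mm/yyyy hh:mm:ss found in the file given as $1.
# Assumes GNU date (for -f -) and a sed that supports -E.
shifted_epochs() {
  grep -Eo '[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}' "$1" |
    # Reorder dd/mm/yyyy into yyyy-mm-dd, which date parses unambiguously
    sed -E 's#^([0-9]{2})/([0-9]{2})/([0-9]{4}) #\3-\2-\1 #' |
    # One date process converts every line it reads from stdin
    date -f - +%s |
    while read -r epoch; do
      echo "$(( epoch + 4 * 3600 ))"
    done
}
```

Called as shifted_epochs tmp.txt, it prints one number per matched datetime. The values depend on the local time zone, because the datetimes in the file carry no zone information.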

Let me adjust the timestamps to make the output more interesting:
$ cat tmp.txt
random text random text 25/06/2021 15:15:15
random text random text 26/06/2021 20:15:15
random text random text 26/06/2021 23:15:15
@jhnc has the right idea: use a language that's both good at text manipulation and can do date arithmetic. I'd use Time::Piece.
perl -MTime::Piece -lne '
m{(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d)} or next;
$t = Time::Piece->strptime($1, "%d/%m/%Y %T");
$t += 4 * 3600;
print $t->strftime("%F %T")
' tmp.txt
2021-06-25 19:15:15
2021-06-27 00:15:15
2021-06-27 03:15:15
Or, here's perl piping into xargs for the date stuff
perl -pe 's{.*(\d{2})/(\d{2})/(\d{4}) (\d{2}:\d{2}:\d{2}).*}
{$2/$1/$3 +4 hours $4}
' tmp.txt | xargs -I DT date -d DT '+%F %T'
2021-06-25 19:15:15
2021-06-27 00:15:15
2021-06-27 03:15:15

Related

Append number of days since the date in the line to each line in the file using Bash

I have a file that consists of the following...
false|aaa|user|aaa001|2014-12-11|
false|bbb|user|bbb||
false|ccc|user|ccc|2021-10-19|
false|ddd|user|ddd|2018-11-16|
false|eee|user|eee|2020-06-02|
I want to use the date in the 5th column to calculate the number of days from the current date and append it to each line in the file.
The end result would be a file that looks like the following, assuming the current date is 1/13/2022...
false|aaa|user|aaa001|2014-12-11|2590
false|bbb|user|bbb||
false|ccc|user|ccc|2021-10-19|86
false|ddd|user|ddd|2018-11-16|1154
false|eee|user|eee|2020-06-02|590
Some lines in the file will not contain a date value (which is expected). I need a solution for a Bash script on Linux.
I am able to submit a command using echo for a single line and then calculate the number of days from the current date by using cut on the 5th field (see below)...
echo "false|aaa|user|aaa001|2014-12-11" | echo $(( ($(date --date="$(date +"%Y-%m-%d")" +%s) - $(date --date="$(cut -d'|' -f5)" +%s) )/(60*60*24) ))
2590
I don't know how to do this one line at a time, capture the 'number of days' value and then append it to each line in the file.
Here's an approach using
paste to append the outputs
sed to arrange the empty lines and
awk to calculate the desired days.
This works with GNU date. BSD date has to use something like date -jf x +%s.
EDIT: Updated the date to compare with to current day.
% current=$(date +%m/%d/%Y)
% paste -d"\0" file <(cut -d"|" -f5 file |
sed 's/^$/#/' |
xargs -Ix date -d x +%s 2>&1 |
awk -v cur="$(date -d "$current" +%s)" '/invalid/{print 0; next}
{print int((cur-$1)/3600/24)}')
false|aaa|user|aaa001|2014-12-11|2590
false|bbb|user|bbb||0
false|ccc|user|ccc|2021-10-19|86
false|ddd|user|ddd|2018-11-16|1154
false|eee|user|eee|2020-06-02|590
Note that date prints date: invalid date ‘#’ in the empty case. If another implementation words this error differently, the awk regex has to be adjusted accordingly.
Data
% cat file
false|aaa|user|aaa001|2014-12-11|
false|bbb|user|bbb||
false|ccc|user|ccc|2021-10-19|
false|ddd|user|ddd|2018-11-16|
false|eee|user|eee|2020-06-02|
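If lines without a date should stay unchanged, as in the desired output, a plain read loop is another option. This is a minimal sketch assuming GNU date; the function name append_days and its optional second argument are invented for illustration:

```bash
# Hypothetical helper: append the age in days to every line whose
# 5th '|'-separated field holds a date; other lines pass through unchanged.
# Usage: append_days FILE [TODAY]   (TODAY defaults to the current date)
append_days() {
  today=${2:-$(date +%F)}
  # -u pins both values to UTC midnight, so the difference is a whole number of days
  today_s=$(date -u -d "$today" +%s)
  while IFS= read -r line; do
    day=$(printf '%s\n' "$line" | cut -d'|' -f5)
    if [ -n "$day" ]; then
      day_s=$(date -u -d "$day" +%s)
      line="$line$(( (today_s - day_s) / 86400 ))"
    fi
    printf '%s\n' "$line"
  done < "$1"
}
```

Running append_days file > file.new writes the result to a new file. The loop starts one date process per dated line, so it is slower than the pipeline above on large files.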

convert timestamp to date in bash

I have time logs in timestamp (Unix epoch time) format:
1515365117236
1515365123162
1515365139963
I would like to convert them to regular dates like
2017-01-07 23:48:01
2017-01-07 23:48:02
2017-01-07 23:48:03
Any ideas which approach would be the fastest?
cat ff1.csv | while read line ; do echo $line\;$(date -d +"%Y-%m-%d %H:%M:%S") ; done > somefile.csv
This takes an awful lot of time and just appends the current time.
Another approach, which should be much faster, uses the printf builtin of bash 4.2 or later:
$ printf '%(datefmt)T\n' epoch
For datefmt you need a string accepted by strftime(3) - see man 3 strftime
Testing:
$ cat file10
1515365117236
1515365123162
1515365139963
$ printf '%(%F %H:%M:%S)T\n' $(cat file10)
49990-01-04 04:47:16
49990-01-04 06:26:02
49990-01-04 11:06:03
In this case, the printf format string uses:
%F Equivalent to %Y-%m-%d (the ISO 8601 date format). (C99)
%H The hour as a decimal number using a 24-hour clock (range 00 to 23).(Calculated from tm_hour.)
%M The minute as a decimal number (range 00 to 59). (Calculated from tm_min.)
%S The second as a decimal number (range 00 to 60). (The range is up to 60 to allow for occasional leap seconds.- Calculated from tm_sec.)
Update to remove milliseconds:
$ printf '%(%F %T)T\n' $(printf '%s/1000\n' $(<file10) |bc)
2018-01-08 00:45:17
2018-01-08 00:45:23
2018-01-08 00:45:39
The way to transform epoch to date is date -d @epochtime +format
An alternative way is to use date --file switch to read dates from a file directly.
$ cat file10
1515365117236
1515365123162
1515365139963
For date to understand that these lines are epoch times, you need to add @ at the beginning of each line.
This can be done as shown below:
$ sed -i 's/^/@/g' file10 #caution - this will make changes in your file
$ date --file file10 +"%Y-%m-%d %H:%M:%S"
Alternatively, you can do it on the fly without affecting the original file:
$ sed 's/^/@/g' file10 |date --file - +"%Y-%m-%d %H:%M:%S"
PS: in this case --file reads from - == stdin == pipe
In both cases, the result is
49990-01-04 04:47:16
49990-01-04 06:26:02
49990-01-04 11:06:03
PS: By the way, the timestamps you provided look invalid as epoch seconds, since they refer to the year 49990.
Your input data isn't Unix epoch time in seconds; it includes milliseconds. To use any of these methods in bash, first convert the values to seconds:
cat ff1.csv | while read LINE; do echo "@$(expr $LINE \/ 1000)" | date +"%Y-%m-%d %H:%M:%S" --file - ; done
First divide by 1000 to drop the milliseconds part; the rest is the same as George Vasiliou explained.
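The two ideas above (drop the milliseconds, then feed @-prefixed values to date --file) can be combined so that only one date process runs for the whole file. This is a sketch assuming GNU date and that every line holds nothing but a millisecond timestamp of at least four digits; ms_to_dates is a hypothetical name:

```bash
# Hypothetical helper: convert a file of millisecond epoch values to dates.
# Assumes GNU date and one all-digit timestamp per line.
ms_to_dates() {
  # Strip the last three digits (the milliseconds) and prefix each line with @
  sed 's/[0-9]\{3\}$//; s/^/@/' "$1" |
    date -f - '+%Y-%m-%d %H:%M:%S'
}
```

Called as ms_to_dates file10, it prints the dates in the local time zone, like the date --file examples above.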

To print past 2years weekends dates using shell scripting

I want a Bash script.
"It should print the dates in the following manner
From : 2015-October-03 2015-October-04(in the next line again it should print)
2015-October-10 2015-October-11
" "
" "
To :2017-October-21 2017-October-22
2017-October-28 2017-October-29
So it should print all weekend dates, for every month from 2015 until today, in the above format only. Please help me as soon as possible.
The following is the solution for your query.
Solution:-
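#!/bin/bash
# Number of days from the start date until today
Date_Diff_Count=$(( ($(date +%s) - $(date -d "2015-01-01" +%s)) / 60 / 60 / 24 ))
for i in $(seq -"$Date_Diff_Count" 0)
do
# Keep only Saturdays and Sundays, as "month day year" taken from date's default output
VALUE=$(date -d "+$i day" | egrep -i "Sat|Sun" | awk -F" " '{print $2" "$3" "$6}')
# Print the weekend date as YYYY-Month-DD
[[ -n ${VALUE} ]] && date -d "${VALUE}" +%Y-%B-%d
done > sample.txt
# Join every two lines (Saturday and Sunday) into one
paste -d " " - - < sample.txt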
Output
2015-January-03 2015-January-04
2015-January-10 2015-January-11
2015-January-17 2015-January-18
2015-January-24 2015-January-25
2015-January-31 2015-February-01
...
2016-May-07 2016-May-08
2016-May-14 2016-May-15
2016-May-21 2016-May-22
2016-May-28 2016-May-29
...
2017-October-07 2017-October-08
2017-October-14 2017-October-15
2017-October-21 2017-October-22
2017-October-28 2017-October-29
Explanation
Date_Diff_Count holds the number of days between the start date and the current date. You can edit the start date as you wish.
The for loop runs from -Date_Diff_Count to 0. For example, if Date_Diff_Count is 500, the sequence runs from -500 to 0.
VALUE keeps only the month, day and year, extracted by piping the output of date through egrep and awk.
If VALUE is not empty, the date is converted into the format YYYY-month-DD.
The final output is saved in the file sample.txt.
The final paste command merges every 2 consecutive lines into a single line. If you want to merge 3 lines, use paste -d " " - - -
-d sets the delimiter that separates the merged lines. You can use any other character, depending on your requirements.
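The script above parses date's default output, which depends on the locale. A sketch of an alternative that jumps straight from one Saturday to the next, assuming GNU date; print_weekends and its two arguments are hypothetical names:

```bash
# Hypothetical helper: print "Saturday Sunday" pairs between two dates.
# Usage: print_weekends START END   (dates as YYYY-MM-DD; GNU date assumed)
print_weekends() {
  dow=$(date -d "$1" +%u)    # day of week: 1 = Monday ... 7 = Sunday
  # First Saturday on or after START
  sat=$(date -d "$1 +$(( (13 - dow) % 7 )) days" +%F)
  end=$(date -d "$2" +%Y%m%d)
  # Compare dates as YYYYMMDD numbers
  while [ "$(date -d "$sat" +%Y%m%d)" -le "$end" ]; do
    echo "$(date -d "$sat" +%Y-%B-%d) $(date -d "$sat +1 day" +%Y-%B-%d)"
    sat=$(date -d "$sat +7 days" +%F)
  done
}
```

For example, print_weekends 2015-10-01 "$(date +%F)" covers October 2015 until today. The month names come from %B, so they follow the current locale, and a Saturday that falls exactly on END is still printed together with the following Sunday.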

Bash script: using variables / parameter in sed

I am trying to write a little bash script, where you can specify a number of minutes and it will show the lines of a log file from those last X minutes.
To get the lines, I am using sed
sed -n '/time/,/time/p' LOGFILE
On CLI this works perfectly, in my script however, it does not.
# Get date
now=$(date "+%Y-%m-%d %T")
# Get date minus X number of minutes -- $1 first argument, minutes
then=$(date -d "-$1 minutes" +"%Y-%m-%d %T")
# Filter logs -- $2 second argument, filename
sed -n '/'$then'/,/'$now'/p' $2
I have tried different approaches and none of them seem to work:
result=$(sed -n '/"$then"/,/"$now"/p' $2)
sed -n "/'$then'/,/'$now'/p" "$2"
sed -n "/$then/,/$now/p" $2
sed -n "/$then/,/$now/p" "$2
Any suggestions?
I am on Debian 5, echo $SHELL says /bin/sh
EDIT : The script produces no output, so there is no error showing up.
In the logfile every entry starts with a date like this 2013-05-15 14:21:42,794
I assume that the main problem is that you try to perform an arithmetic comparison by string matching. sed -n '/23/,/27/p' gives you the lines between the first line that contains 23 and the next line that contains 27 (and then again from the next line that contains 23 to the next line that contains 27, and so on). It does not give you all lines that contain a number between 23 and 27. If the input looks like
19
22
24
26
27
30
it does not output anything (since there is no 23). An awk solution that uses string matching has the same problem. So, unless your then date string occurs verbatim in the log file, your method will fail. You have to convert your date strings into numbers (drop the -, <space>, and :) and then check whether the resulting number is in the right range, using an arithmetical comparison rather than a string match. This goes beyond the capabilities of sed; awk and perl can do it rather easily. Here is a perl solution:
#!/bin/bash
NOW=$(date "+%Y%m%d%H%M%S")
THEN=$(date -d "-$1 minutes" "+%Y%m%d%H%M%S")
perl -wne '
if (m/^(....)-(..)-(..) (..):(..):(..)/) {
$date = "$1$2$3$4$5$6";
if ($date >= '"$THEN"' && $date <= '"$NOW"') {
print;
}
}' "$2"
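Because the log timestamps are zero-padded and ordered from year down to second, a plain string comparison on the first 19 characters sorts the same way as the times themselves. Here is a minimal awk sketch of that idea, assuming GNU date and that every line of interest starts with a timestamp like 2013-05-15 14:21:42; the function name last_minutes is invented for illustration:

```bash
# Hypothetical helper: print log lines whose leading timestamp
# falls within the last N minutes.
# Usage: last_minutes MINUTES LOGFILE
last_minutes() {
  since=$(date -d "-$1 minutes" '+%Y-%m-%d %H:%M:%S')
  # Both operands are strings, so awk compares them lexicographically
  awk -v since="$since" 'substr($0, 1, 19) >= since' "$2"
}
```

Lines that do not begin with a timestamp (continuation lines, for example) are compared as arbitrary strings, so they may be kept or dropped unpredictably.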
Don't give yourself a headache with nested quotes. Use the -v option with awk to pass the value of a shell variable into the script:
#!/bin/bash
# Get date
now=$(date "+%Y-%m-%d %T")
# Get date minus X number of minutes -- $1 first argument, minutes
delta=$(date -d "-$1 minutes" +"%Y-%m-%d %T")
# Filter logs -- $2 second argument, filename
awk -v n="$now" -v d="$delta" '$0~d,$0~n' "$2"
Also, avoid naming variables after shell keywords such as then.

UNIX shell-scripting: Split a textfile by its entries

I'm trying to analyze an enormous text file (1.6GB), whose data lines look like this:
20090118025859 -2.400000 78.100000 1023.200000 0.000000
20090118025900 -2.500000 78.100000 1023.200000 0.000000
20090118025901 -2.400000 78.100000 1023.200000 0.000000
I don't even know how many lines there are. But I'm trying to split the file by date. The left number is a time stamp (these lines for example are from 2009, january 18th).
How can I split this file into pieces according to the date?
The number of entries per date differs, so using split with a constant number won't work.
Everything I know would be to grep file '20090118*' > data20090118.dat , but there sure is a way to do all the dates at once, right?
Thanks in advance,
Alex
Using awk:
awk '{print > ("data" substr($1,1,8) ".dat")}' myfile
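With many distinct dates, keeping one output file open per date can run into the open-file limit of some awk implementations. Here is a sketch that closes the previous file whenever the date changes; split_by_date is a hypothetical name, and the data<date>.dat naming follows the one-liner above:

```bash
# Hypothetical helper: split a log into one file per date (first 8 characters
# of field 1), keeping only one output file open at a time.
split_by_date() {
  awk '{
    day = substr($1, 1, 8)            # YYYYMMDD part of the timestamp
    if (day != prev) {
      if (prev != "") close(out)      # done with the previous date
      out = "data" day ".dat"
      prev = day
    }
    print >> out                      # append, in case a date shows up again later
  }' "$1"
}
```

Because the files are opened in append mode, output files left over from an earlier run should be removed first.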
This reads the file line by line and appends each line to a file named after the first 8 characters of its timestamp:
while IFS= read -r line
do
date=${line:0:8} # YYYYMMDD prefix of the timestamp
printf '%s\n' "$line" >> "$date.dat"
done < log.dat
With the caveats that each day needs to have more than 1 record,
and that the output file will have blank lines:
uniq --all-repeated=separate -w8 file | csplit -s - '/^$/' '{*}'
uniq really should have an option to also output unique records.
Also, csplit should have an option to suppress the matched line.
