date and time format for x axis in gnuplot

I have a csv file that looks like:
1,Fri Jun 27 23:22:17 2014
...
3500,Sat Jun 28 09:21:55 2014
I want to plot column 1 as y and column 2 as x:
set datafile separator ","
set xdata time
set timefmt "%a %b %d %T %Y"
set format x "%d-%b\n%H:%M"
plot "file.csv" u 2:1
For every line I get an error:
warning: Bad time format in string
warning: Bad abbreviated month name
warning: Skipping data file with no valid points
At the end, the x range is invalid.
I really do not know what is going on. My time format looks good, right?
Thanks!

Gnuplot does not support the %a and %T specifiers in timefmt (they can, however, be used in set format x); see p. 168 in the documentation.
While %T can be directly replaced with %H:%M:%S, %a is a bit more problematic. A workaround is to preprocess the file in order to get rid of the day name in the second column, e.g.
set xdata time
set timefmt "%b %d %H:%M:%S %Y"
set format x "%d-%b\n%H:%M"
plot "<(gawk -F, '{print $1, substr($2, 5)}' file.csv)" u 2:1
Note that since the file was preprocessed by gawk, the command set datafile separator "," is no longer needed here.

Related

Batch change accessed and modified date, with date from another file's content?

I'm migrating old notes from a SQL database based note taking app to separate text files.
I've managed to export the notes and date codes as separate text files.
The files are ordered like this:
$ ls -1
Note0001.txt
Note0001-date.txt
Note0002.txt
Note0002-date.txt
Note0003.txt
Note0003-date.txt
The contents of the date files look like this:
$ cat Note0001-date.txt
388766121.742373
$ cat Note0002-date.txt
274605766.273638
$ cat Note0003-date.txt
384996285.436197
The dates are seconds since the epoch 2001-01-01. See this other question about the format: What type of date format is this? And how to convert it?
How do I batch change the accessed and modified date of the notes files, NoteNNNN.txt, to the date in the contents of respective date file, NoteNNNN-date.txt?
How do I convert the date to UTC+1, preferably with consideration of DST (daylight saving time)?
I am trying to convert the dates with the method described in this question:
https://unix.stackexchange.com/questions/2987/
But it outputs an error message in bash 3.2.57 (macOS):
$ date -d '2001-01-01 UTC+1 + 388766121 seconds'
usage: date [-jnRu] [-d dst] [-r seconds] [-t west] [-v[+|-]val[ymwdHMS]] ...
[-f fmt date | [[[mm]dd]HH]MM[[cc]yy][.ss]] [+format]
I am new to working with the dates and timestamps in the terminal.
Iterate over each file pair, read the timestamp, shift the timestamp so it's something Unix tools can understand, then touch the files. I.e., big problems are composed of a sum of small problems.
# find all files named .txt but not -date.txt
find . -name '*.txt' '!' -name '*-date.txt' |
# remove the .txt suffix
sed 's/\.txt$//' |
{
# the reference point of the files' content
start=$(date -d "2001-01-01" +%s) # will not work with BSD date
# I guess just precompute the value:
start=978303600
# for each file
while IFS= read -r f; do
# get the timestamp
diff=$(<"$f"-date.txt)
# shift the timestamp to seconds since the Unix epoch
ref=$(<<<"scale=6; $start + $diff" bc)
# convert the epoch seconds to a touch-compatible ccyy-mm-ddTHH:MM:SS[.frac]
# format (BSD date; -r takes epoch seconds, the fraction is spliced back in)
ref=$(date -r "${ref%.*}" +%Y-%m-%dT%H:%M:%S).${ref#*.}
# change access and modification times of the .txt file
touch -d "$ref" "$f".txt
done
}
Assuming your OS's local timezone is what you want for your output, and you have a version of awk that supports the GNU awk time functions, you could use the following script. Also, from the gawk documentation on mktime():
If the DST daylight-savings flag is positive, the time is assumed to
be daylight savings time; if zero, the time is assumed to be standard
time; and if negative (the default), mktime() attempts to determine
whether daylight savings time is in effect for the specified time.
file tst.awk:
BEGIN {
epoch = mktime("2001 01 01 00 00 00")
}
FNR==1 {
close(out)
out = substr(FILENAME, 1, length(FILENAME)-9) ".txt"
}
{
print strftime("%F %T %Z", epoch+$0) > out
}
Usage:
awk -f tst.awk *-date.txt
Example
Here is an example with the script, without the I/O part, just converting the datetimes.
test file:
> cat file
388766121.742373
274605766.273638
384996285.436197
script tst.awk:
BEGIN { epoch = mktime("2001 01 01 00 00 00") }
{ print strftime("%F %T %Z", epoch+$0) }
Output:
> awk -f tst.awk file
2013-04-27 15:35:21 EEST
2009-09-14 08:22:46 EEST
2013-03-14 23:24:45 EET
The timezone of my box (EET) is used by default. If we'd like to print in a different timezone, we have to set the TZ environment variable accordingly. DST is also applied by default; notice that some dates are printed as EEST (summer time).

Is there a straightforward way to convert millisecond-based timestamps on macOS?

EpochConverter turns a timestamp value like 1586775709496 into Monday, April 13, 2020 11:01:49.496 AM.
Unfortunately, the date tool on macOS expects seconds, not milliseconds, and gives a wrong year:
> date -r 1586775709496
Thu Dec 2 15:24:56 CET 52252
This existing question only explains the obvious: you can divide by 1000 (cut off the trailing three digits) and the built-in date tool will work.
But that is not what I am looking for. I am looking for a "straightforward" way to turn such millisecond-based timestamps into a "human readable" form, including the milliseconds. Are there ways to achieve that?
timestamp=1586775709496
ms=$(( $timestamp % 1000 ))
echo "$(date -r $(( $timestamp / 1000 )) +"%a, %b %d, %Y %H:%M:%S").$ms"
Mon, Apr 13, 2020 12:01:49.496
You can edit the date format string to get exactly the result you need.
With gnu date I believe that would be:
$ a=1586775709496
$ LC_ALL=C date -u --date=@"$((a/1000)).$(printf "%03d" $((a%1000)))" +"%A, %B %2d, %Y %H:%M:%S.%3N %p"
Monday, April 13, 2020 11:01:49.496 AM
The %3N is something that GNU date supports and it prints only milliseconds.
I guess that because the last three digits of the input appear verbatim in the output, you can simply splice them in where they belong, removing the need for the %N extension:
$ a=1586775709496
$ LC_ALL=C date -u --date=@"$((a/1000))" +"%A, %B %2d, %Y %H:%M:%S.$(printf "%03d" $((a%1000))) %p"
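The splicing approach can be wrapped in a small function for reuse (a sketch assuming GNU date; ms2date is a hypothetical name, and %-d is a GNU extension that drops the day's leading zero):

```shell
# hypothetical helper: render a millisecond epoch timestamp (GNU date)
ms2date() {
    # @N is GNU date's epoch-seconds input; the milliseconds are spliced
    # back into the format string
    LC_ALL=C date -u --date=@"$(($1 / 1000))" \
        +"%A, %B %-d, %Y %H:%M:%S.$(printf '%03d' $(($1 % 1000)))"
}
ms2date 1586775709496
# prints: Monday, April 13, 2020 11:01:49.496
```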

Date subtraction in bash

I want to get successive differences of dates shown sample below. How do I convert each of these dates into epoch seconds?
7/21/17 6:39:12:167 GMT
7/21/17 6:39:12:168 GMT
7/21/17 6:39:12:168 GMT
7/21/17 6:39:12:205 GMT
7/21/17 6:39:12:206 GMT
7/21/17 6:39:12:206 GMT
Once each line gets converted into epoch seconds, I can simply run another script to get successive differentials of each. Thanks.
You can convert times using the date command. Given a line like this:
7/21/17 6:39:12:167 GMT
You first need to strip everything at and after the seconds part, to get this:
7/21/17 6:39:12
You can use cut -d: -f1-3 for that. Then, convert to epoch seconds, if you're using FreeBSD or Mac OS:
date -ujf "%m/%d/%y %H:%M:%S" "7/21/17 6:39:12" +%s
Which gives:
1500619152
If you are using GNU date (e.g. on Linux), you can feed an entire file of dates to it. Since the input file isn't in the right format, we can do this:
date --file <(cut -d: -f1-3 infile) +%s
That will read the entire file with only a single invocation of date, which is much more efficient, but only works with GNU date.
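Once every line is converted to epoch seconds, the successive differences the question asks for are one short awk step away (a sketch using already-converted values; the pipeline above would feed it directly):

```shell
# successive differences of epoch seconds, one value per input line
printf '%s\n' 1500619152 1500619152 1500619189 |
awk 'NR > 1 { print $1 - prev } { prev = $1 }'
# prints:
# 0
# 37
```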
Here is one in GNU awk. It converts the timestamps to epoch time in seconds and subtracts the former from the latter. The mktime function used for the conversion doesn't accept fractions of a second, but the fractions are stored in a[7], and nothing stops you from adding them to the t variable before subtracting:
$ awk '
function zeropad(s) { # zeropad function
return sprintf("%02d", s)
}
{
split($0,a,"[/ :]") # split timestamp to a
for(i in a)
a[i]=zeropad(a[i]) # zeropad all components
t=mktime(20 a[3] " " a[1] " " a[2] " " a[4] " " a[5] " " a[6])
# add the fractions in a[7] here
if(NR>1) # might be unnecessary
print t-p # output subtracted seconds
p=t # set current time to previous
}' file
0
0
0
0
0
Since you didn't include the expected output or proper data sample, that's the best I can do for now.
EDIT:
Since your data does not make clear whether fractions of a second are presented like 0:0:0:100 or 0:0:0:1, I modified the zeropad function to left- or right-pad given values. You now call it like zeropad(value, count, left/right), e.g. zeropad(a[7],3,"r"):
function zeropad(s,c,d) {
return sprintf("%" (d=="l"? "0" c:"") "d" (d=="r"?"%0" c-length(s) "d":""), s,"")
}
{
split($0,a,"[/ :]") # split timestamp to a
for(i in a)
a[i]=zeropad(a[i],2,"l") # left-pad all components with 0s
t=mktime(20 a[3] " " a[1] " " a[2] " " a[4] " " a[5] " " a[6])
t=t+zeropad(a[7],3,"r")/1000 # right-pad fractions with 0s
if(NR>1) # might be unnecessary
print t-p # output subtracted seconds
p=t # set current time to previous
}
0.00999999
0
0.37
0.00999999
0
printf with proper modifiers should probably be used for the output to get sane values.
Let's suppose that you have one of those dates in a variable:
$ d='7/21/17 6:39:12:167 GMT'
With GNU date, we need to remove the milliseconds part. That can be done with bash's pattern substitution:
$ echo "${d/:??? / }"
7/21/17 6:39:12 GMT
To convert that to seconds-since-epoch, we use the -d option to set the date and the %s format to request seconds-since-epoch:
$ date -d "${d/:??? / }" +%s
1500619152
Compatibility: macOS does not support GNU's -d option, but it has the similar -j and -f options. On many operating systems, including macOS, there is also the option to install the GNU utilities.

Delete an entire row if date is less than 50 days of current date

I need help deleting rows whose date in a specified column is newer than n days. My file contains the following. In the file below, I need to find the entries in column 4 whose date is less than 50 days before the current date and delete those entire rows.
ABC, 2017-02-03, 123, 2012-09-08
BDC, 2017-01-01, 456, 2015-09-05
Test, 2017-01-05, 789, 2017-02-03
My desired output is follows.
ABC, 2017-02-03, 123, 2012-09-08
BDC, 2017-01-01, 456, 2015-09-05
Note: I have an existing script and need to integrate this to the existing one.
You can leverage the date command for this task, which simplifies the script:
$ awk -v t="$(date -d '-50 day' +%Y-%m-%d)" '$4<t' input > output
which will have this content in the output file
ABC, 2017-02-03, 123, 2012-09-08
BDC, 2017-01-01, 456, 2015-09-05
replace input/output with your file names
You can use a gawk logic something like below,
gawk '
BEGIN {FS=OFS=",";date=strftime("%Y %m %d %H %M %S")}
{
split($4, d, "-")
epoch = mktime(d[1] " " d[2] " " d[3] " " "00" " " "00" " " "00")
if ( ((mktime(date) - epoch)/86400 ) > 50) print
}' file
The idea is to use the GNU Awk time functions strftime() and mktime() for the date conversion. The former produces a timestamp in YYYY MM DD HH MM SS format, which mktime() converts to epoch time.
Once the two times, i.e. the current timestamp (date) and the date from $4 in the file, are converted to epoch seconds, the difference is divided by 86400 to get the difference in days, and only those lines whose difference is greater than 50 are printed.
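The days-difference arithmetic can be verified in isolation (a sketch with two dates from the sample file, using GNU date in UTC so DST transitions can't skew the integer division):

```shell
# whole days between two dates, via epoch seconds (GNU date, UTC)
d1=$(date -ud '2017-02-03' +%s)
d2=$(date -ud '2012-09-08' +%s)
echo $(( (d1 - d2) / 86400 ))
# prints: 1609
```

1609 days is well over 50, which is why that row survives in the desired output.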

gawk - suppress output of matched lines

I'm running into an issue where gawk prints unwanted output. I want to find lines in a file that match an expression, test to see if the information in the line matches a certain condition, and then print the line if it does. I'm getting the output that I want, but gawk is also printing every line that matches the expression rather than just the lines that meet the condition.
I'm trying to search through files containing dates and times for certain actions to be executed. I want to show only lines that contain times in the future. The dates are formatted like so:
text... 2016-01-22 10:03:41 more text...
I tried using sed to just print all lines starting with ones that had the current hour, but there is no guarantee that the file contains a line with that hour (plus there is no guarantee that the lines all have any particular year, month, day, etc.), so I needed something more robust. I decided to try converting the times into seconds since the epoch and comparing that to the current systime. If the conversion produces a number greater than systime, I want to print that line.
Right now it seems like gawk's mktime() function is the key to this. Unfortunately, it requires input in the following format:
yyyy mm dd hh mm ss
I'm currently searching a test file (called timecomp) for a regular expression matching the date format.
Edit: the test file only contains a date and time on each line, no other text.
I used sed to replace the date separators (i.e. /, -, and :) with a space, and then piped the output to a gawk script called stime using the following statement:
sed -e 's/[-://_]/ /g' timecomp | gawk -f stime
Here is the script
# stime
BEGIN { tsec=systime(); }
/.*20[1-9][0-9] [0-1][1-9] [0-3][0-9] [0-2][0-9] [0-6][0-9] [0-6][0-9]/ {
    if (tsec < mktime($0))
        print "\t" $0 # the tab is just to differentiate the desired output from the other lines that are being printed
}
$1
Right now this is getting the basic information that I want, but it is also printing every line that matches the original expression, rather than just the lines containing a time in the future. Sample output:
2016 01 22 13 23 20
2016 01 22 14 56 57
2016 01 22 15 46 46
2016 01 22 16 32 30
2016 01 22 18 56 23
2016 01 22 18 56 23
2016 01 22 22 22 28
2016 01 22 22 22 28
2016 01 22 23 41 06
2016 01 22 23 41 06
2016 01 22 20 32 33
How can I print only the lines in the future?
Note: I'm doing this on a Mac, but I want it to be portable to Linux because I'm ultimately making this for some tasks I have to do at work.
I'd like to accomplish this in one script rather than requiring the sed statement to reformat the dates, but I'm running into other issues that probably require a different question, so I'm sticking with this for now.
Any help would be greatly appreciated! Thanks!
Answered: I had a $1 at the last line of my script, and that was the cause of the additional output.
Instead of awk, this is an (almost) pure Bash solution:
#!/bin/bash
# Regex for time string
re='[0-9]{4}-[0-9]{2}-[0-9]{2} ([0-9]{2}:){2}[0-9]{2}'
# Current time, in seconds since epoch
now=$(date +%s)
while IFS= read -r line; do
# Match time string
[[ $line =~ $re ]]
time_string="${BASH_REMATCH[0]}"
# Convert time string to seconds since epoch
time_secs=$(date -d "$time_string" +%s)
# If time is in the future, print line
if (( time_secs > now )); then
echo "$line"
fi
done < <(grep 'pattern' "$1")
This takes advantage of the Coreutils date formatting to convert a date to seconds since epoch for easy comparison of two dates:
$ date
Fri, Jan 22, 2016 11:23:59 PM
$ date +%s
1453523046
And the -d argument to take a string as input:
$ date -d '2016-01-22 10:03:41' +%s
1453475021
The script does the following:
Filter the input file with grep (for lines containing a generic pattern, but could be anything)
Loop over lines containing pattern
Match the line with a regex that matches the date/time string yyyy-mm-dd hh:mm:ss and extract the match
Convert the time string to seconds since epoch
Compare that value to the time in $now, which is the current date/time in seconds since epoch
If the time from the logfile is in the future, print the line
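The core comparison from the steps above can be tried on its own (a sketch with a fixed reference instant standing in for $now, assuming GNU date):

```shell
# compare a log timestamp against a reference instant, both as epoch seconds
now=$(date -ud '2016-01-22 23:36:54' +%s)
time_secs=$(date -ud '2017-01-22 10:03:41' +%s)
if (( time_secs > now )); then
    echo "future"
else
    echo "past"
fi
# prints: future
```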
For an example input file like this one
text 2016-01-22 10:03:41 with time in the past
more text 2016-01-22 10:03:41 matching pattern but in the past
other text 2017-01-22 10:03:41 in the future matching pattern
some text 2017-01-23 10:03:41 in the future but not matching
blahblah 2022-02-22 22:22:22 pattern and also in the future
the result is
$ date
Fri, Jan 22, 2016 11:36:54 PM
$ ./future_time logfile
other text 2017-01-22 10:03:41 in the future matching pattern
blahblah 2022-02-22 22:22:22 pattern and also in the future
This is what I have working now. It works for a few different date formats and on the actual files that have more than just the date and time. The default format that it works for is yyyy/mm/dd, but it takes an argument to specify a mm/dd/yyyy format if needed.
BEGIN { tsec=systime(); dtstr=""; dt[1]="" }
/.*[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
    cur=$0
    if ( fm=="mdy" ) {
        match($0,/[0-1][0-9][-_\/][0-3][0-9][-_\/]20[1-9][0-9]/) # mm dd yyyy
        section=substr($0,RSTART,RLENGTH)
        split(section, dt, "[-_//]")
        dtstr=dt[3] " " dt[1] " " dt[2]
        gsub(/[0-1][0-9][-\/][0-3][0-9][-\/]20[1-9][0-9]/, dtstr, cur)
    }
    gsub(/[-_:/,]/, " ", cur)
    match(cur,/20[1-9][0-9] [0-1][0-9] [0-3][0-9][[:space:]]*[0-2][0-9] [0-6][0-9] [0-6][0-9]/)
    t=mktime(substr(cur,RSTART,RLENGTH))
    if ( tsec < t )
        print $0
}
I'll be adding more format options as I find more formats, but this works for all the different files I've tested so far. If they have a mm/dd/yyyy format, you call it with:
gawk -f stime fm=mdy filename
I plan on adding an option to specify the time window that you want to see, but this is an excellent start. Thank you guys again, this is going to drastically simplify a few tasks at work ( I basically have to retrieve a great deal of data, often under time pressure depending on the situation ).
