I have a set of data from a data logger that I want to graph in gnuplot. I get new data every day, so it would be very convenient to just append each txt file as it arrives. What I want is a plot where the x axis is built from the date in the first column combined with the time in the second column.
The text files look like this (each spans 24 hours):
Date Time Value
30/07/2014 00:59:38 0.075
30/07/2014 00:58:34 0.102
30/07/2014 00:57:31 0.058
30/07/2014 00:56:31 0.089
30/07/2014 00:55:28 0.119
30/07/2014 00:54:26 0.151
30/07/2014 00:53:22 0.17
30/07/2014 00:52:19 0.171
30/07/2014 00:51:17 0.221
30/07/2014 00:50:17 0
30/07/2014 00:49:13 0
30/07/2014 00:48:11 0
30/07/2014 00:47:09 0
This solution for mixing date and time on the gnuplot x axis would suit me perfectly, but it's very complex and I have no idea what is going on in it, let alone how to apply it to multiple files.
Here's the code I tried, but I get an "illegal day of the month" error:
#!/gnuplot
set timefmt '%d/%m/%Y %H:%M:%S'
set xdata time
set format x '%d/%m/%Y %H:%M:%S'
#DATA FILES
plot '30.07.2014 Soli.txt' using 1:3 title '30/07/2014' with points pt 5 lc rgb 'red',\
'31.07.2014 Soli.txt' using 1:3 title '31/07/2014' with points pt 5 lc rgb 'blue'
All help appreciated! Thanks
Such an error is triggered when some unexpected data appears in the data file, like the uncommented and unused header line in your case.
The following file.dat
Date Time Value
30/07/2014 00:59:38 0.075
30/07/2014 00:58:34 0.102
gives such an error with the minimal script
set xdata time
set timefmt '%d/%m/%Y %H:%M:%S'
plot 'file.dat' using 1:3
To solve the error, remove the first line (or similar lines in between).
Since version 4.6.6 you could also use the skip option to skip some lines at the beginning of the data file like:
set xdata time
set timefmt '%d/%m/%Y %H:%M:%S'
plot 'file.dat' using 1:3 skip 1
I wanted to note that you probably don't have to remove non-compliant lines like the "Date Time Value" header line; you can just comment it out instead with the hash/pound/octothorp:
#Date Time Value
This will then be ignored by Gnuplot at plot time, but stays visible to you when you want to remember which data is in which column.
You get the "illegal day of the month" error because of the first line: "Date Time Value" doesn't match the time format.
In my humble opinion, you have two options.
Option 1: delete the first line, set the titles manually, and don't change anything else in your code:
30/07/2014 00:59:38 0.075
30/07/2014 00:58:34 0.102
30/07/2014 00:57:31 0.058
Option 2: keep the data file unchanged and modify the titles in your gnuplot code, setting columnhead to ignore the first line of your data file:
plot '30.07.2014 Soli.txt' using 1:3 title columnhead with points pt 5 lc rgb 'red',\
'31.07.2014 Soli.txt' using 1:3 title columnhead with points pt 5 lc rgb 'blue'
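For readers who prefer to sanity-check the parsing outside gnuplot, here is a minimal Python sketch (sample rows taken from the question) that combines the date column and the time column into a single datetime, skipping the header line just as columnhead or skip does:

```python
from datetime import datetime

def parse_logger_line(line):
    # Combine the date (column 1) and time (column 2) into one datetime,
    # matching gnuplot's timefmt '%d/%m/%Y %H:%M:%S'.
    date_s, time_s, value_s = line.split()
    ts = datetime.strptime(date_s + " " + time_s, "%d/%m/%Y %H:%M:%S")
    return ts, float(value_s)

rows = """Date Time Value
30/07/2014 00:59:38 0.075
30/07/2014 00:58:34 0.102""".splitlines()

parsed = [parse_logger_line(r) for r in rows[1:]]  # rows[1:] skips the header
print(len(parsed), parsed[0][1])  # 2 0.075
```

Feeding the header line to parse_logger_line would raise a ValueError, which is the Python analogue of gnuplot's "illegal day of the month" complaint.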
I have a list of times in a decimal format of seconds, and I know what time the series started. I would like to convert them to a time of day with the offset of the start time applied. There must be a simple way to do this that I am really missing!
Sample source data:
\Name of source file : 260521-11_58
\Recording from 26.05.2021 11:58
\Channels : 1
\Scan rate : 101 ms = 0.101 sec
\Variable 1: n1(rpm)
\Internal identifier: 63
\Information1:
\Information2:
\Information3:
\Information4:
0.00000 3722.35645
0.10100 3751.06445
0.20200 1868.33350
0.30300 1868.36487
0.40400 3722.39355
0.50500 3722.51831
0.60600 3722.50464
0.70700 3722.32446
0.80800 3722.34277
0.90900 3722.47729
1.01000 3722.74048
1.11100 3722.66650
1.21200 3722.39355
1.31300 3751.02710
1.41400 1868.27539
1.51500 3722.49097
1.61600 3750.93286
1.71700 1868.30334
1.81800 3722.29224
The start time & date is 26.05.2021 11:58, and the left-hand column is elapsed time in seconds, with the column name [Time]. So I just want to convert the decimal/real value to a time or timespan and add the start time to it.
I have tried lots of ways that are really hacky and ultimately flawed; the below works, but it ignores the milliseconds.
TimeSpan(0,0,0,Integer(Floor([Time])),[Time] - Integer(Floor([Time])))
The last part works to get the milli/microseconds on their own, but not as part of the above.
Your formula isn't really ignoring the milliseconds: you are passing the decimal part of your time (in seconds) directly as milliseconds, so the value added is 1000 times too small to show up in the format mask.
You need to convert the seconds to milliseconds, so something like this should work
TimeSpan(0,0,0,Integer(Floor([Time])),([Time] - Integer(Floor([Time]))) * 1000)
To add it to the time, this would work
DateAdd(Date("26-May-2021"),TimeSpan(0,0,0,Integer([Time]),([Time] - Integer([Time])) * 1000))
You will need to set the column format to
dd-MMM-yyyy HH:mm:ss:fff
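The same arithmetic can be sketched in plain Python (the start time is taken from the question; the elapsed value is an assumed sample): add the elapsed seconds, as a float so the milliseconds survive, to the recording start time.

```python
from datetime import datetime, timedelta

start = datetime(2021, 5, 26, 11, 58)  # recording start: 26.05.2021 11:58
elapsed = 1.817                        # assumed sample from the [Time] column

# timedelta accepts fractional seconds, so the milliseconds are preserved
stamp = start + timedelta(seconds=elapsed)
print(stamp.strftime("%d-%b-%Y %H:%M:%S.%f")[:-3])  # 26-May-2021 11:58:01.817
```

The `[:-3]` trims the six-digit microseconds field down to milliseconds, mirroring the `fff` in the column format above.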
In Gnuplot, I'm plotting timestamped values. The data looks like this:
2019-06-01 18:23:56, 508.56
2019-06-07 18:23:56, 508.56
2019-06-08 13:00:00, 492.95
...
I don't want the xtics labels to be too dense, so I print them only every 2 weeks:
set xdata time
set timefmt "%Y-%m-%d %H:%M:%S"
set xtics format "%m/%d"
set xtics 3600*24*15
unset mxtics
It works, the labels are in 2-week intervals, but they're not pleasingly aligned to the beginning of a month. The xtics labels I see are: 06/13, 06/28, 07/13 etc. I'd like to start the labeling from the first day of a month, then the 15th of a month etc. My actual data starts on 2019-06-07. I added a fake first entry for 2019-06-01 (see above), but I get the same labeling sequence (06/13, 06/28 etc). Is there a way to set where the first xtics point should be (06/01 or 06/15) without having to manually specify "xticslabels" in the data file?
You can set the tics yourself by defining a filtering function that returns a label for the first occurrence of "01" or "15" within each month, and NaN otherwise.
Edit: test data changed to illustrate that some months have 30, some 31 and February 2020 has 29 days.
Edit2: code revised after comment.
Code:
### Fixed time step "xtics" on 01 and 15 of each month
reset session
myTimeFmt = "%Y-%m-%d %H:%M:%S"
# create some test data
set samples 181
set table $Data
plot [0:1] '+' u (a=strftime(myTimeFmt, time(0)+$1*3600*24*180)):(a[9:10]) w table
unset table
set grid xtics, ytics
flag=0
myTic(s) = (s[9:10] eq "01") && (flag==0 || flag==15) ? \
(flag=1, strftime("%m/%d",strptime(myTimeFmt,s))) : \
(s[9:10] eq "15") && (flag==0 || flag==1) ? \
(flag=15, strftime("%m/%d",strptime(myTimeFmt,s))) : NaN
set xdata time
set timefmt myTimeFmt
set xrange["2019-10-01":"2020-04-30"]
plot $Data u (timecolumn(1,myTimeFmt)):3: \
xtic(myTic(strftime(myTimeFmt,timecolumn(1,myTimeFmt)))) w lp pt 7 notitle, \
### end of code
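If you'd rather compute the tick positions outside gnuplot, here is a small Python sketch (using the same date range as the script above) that lists the 1st and the 15th of each month in a range; the resulting dates could then be handed to gnuplot as explicit xtics:

```python
from datetime import date

def month_half_ticks(start, end):
    # Collect every 1st and 15th of each month between start and end.
    ticks = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        for d in (1, 15):
            t = date(y, m, d)
            if start <= t <= end:
                ticks.append(t)
        m += 1
        if m == 13:
            y, m = y + 1, 1
    return ticks

ticks = month_half_ticks(date(2019, 10, 1), date(2020, 4, 30))
print([t.strftime("%m/%d") for t in ticks[:4]])  # ['10/01', '10/15', '11/01', '11/15']
```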
I am converting [ss] seconds to mm:ss format.
But, I also have to round off the value to the nearest minute.
For example, 19:29 -> 19 minutes and 19:32-> 20 minutes
I have tried using the MROUND function, but it did not work: =MROUND(19.45,15/60/24) gives 19.44791667 when it should come out as 20 minutes.
Try it like this, where column B is formatted as Time:
=ARRAYFORMULA(IF(LEN(A1:A), MROUND(A1:A, "00:01:00"), ))
=TEXT(MROUND("00:"&TO_TEXT(B5), "00:01:00"), "mm:ss")
=ARRAYFORMULA(TEXT(MROUND(SUM(TIME(0,
REGEXEXTRACT(TO_TEXT(C3:C11), "(.+):"),
REGEXEXTRACT(TO_TEXT(C3:C11), ":(.+)"))), "00:01:00"), "[mm]:ss"))
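The rounding these formulas perform can be sketched in Python for clarity (half-up to the nearest whole minute, which is what MROUND with a "00:01:00" step does for a duration; the function name is illustrative):

```python
def round_to_minute(total_seconds):
    # Half-up rounding: adding 30 seconds before the integer division
    # pushes anything at :30 or above into the next minute.
    return int((total_seconds + 30) // 60)

# 19:29 -> 19 minutes, 19:32 -> 20 minutes
print(round_to_minute(19 * 60 + 29))  # 19
print(round_to_minute(19 * 60 + 32))  # 20
```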
I am interested in rounding off timestamps to full hours. What I got so far is to round to the nearest hour. For example with this:
df.withColumn("Full Hour", hour((round(unix_timestamp("Timestamp")/3600)*3600).cast("timestamp")))
But this round function uses HALF_UP rounding. This means 23:56 results in 00:00, but I would prefer to have 23:00 instead. Is this possible? I didn't find an option to set the rounding behaviour in the function.
I think you're overcomplicating things. The hour function returns the hour component of a timestamp by default.
from pyspark.sql.functions import hour, to_timestamp, unix_timestamp
from pyspark.sql import Row
df = (sc
.parallelize([Row(Timestamp='2016_08_21 11_59_08')])
.toDF()
.withColumn("parsed", to_timestamp("Timestamp", "yyyy_MM_dd hh_mm_ss")))
df2 = df.withColumn("Full Hour", hour(unix_timestamp("parsed").cast("timestamp")))
df2.show()
Output:
+-------------------+-------------------+---------+
| Timestamp| parsed|Full Hour|
+-------------------+-------------------+---------+
|2016_08_21 11_59_08|2016-08-21 11:59:08| 11|
+-------------------+-------------------+---------+
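The distinction can be sketched in plain Python, assuming a naive timestamp: truncation keeps the hour component (what the asker wants), while half-up rounding moves 23:56 into the next day.

```python
from datetime import datetime, timedelta

ts = datetime(2019, 6, 1, 23, 56)

# Truncate to the full hour: just zero out the smaller components.
floored = ts.replace(minute=0, second=0, microsecond=0)

# Half-up rounding (what round(unix_ts/3600)*3600 does): bump to the
# next hour when 30 or more minutes have passed.
rounded = floored + timedelta(hours=1) if ts.minute >= 30 else floored

print(floored)  # 2019-06-01 23:00:00
print(rounded)  # 2019-06-02 00:00:00
```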
I have a file with date, end time and duration in decimal format and I need to calculate the start time. The file looks like:
20140101;1212;1.5
20140102;1515;1.58
20140103;1759;.69
20140104;1100;12.5
...
The duration 1.5 for the time 12:12 means one and a half hours, so the start time would be 12:12 - 1:30 = 10:42. Similarly, 11:00 - 12.5 = 11:00 - 12:30 = 22:30 on the previous day. Is there an easy way to calculate such time differences in Awk, or is it the good ol' split-multiply-subtract-and-handle-the-day-break-yourself all over again?
Since the values are in hours and minutes, only the minutes matter and the seconds can be discarded; for example, duration 1.58 means 1:34 and the leftover 0.8 minutes (48 seconds) can be discarded.
I'm on GNU Awk 4.1.3
As you are using gawk, take advantage of its native time functions:
gawk -F\; '{tmst=sprintf("%s %s %s %s %s 00",\
substr($1,1,4),\
substr($1,5,2),\
substr($1,7,2),\
substr($2,1,2),\
substr($2,3,2))
t1=mktime(tmst)
seconds=sprintf("%f",$3)+0
seconds*=60*60
difference=strftime("%H%M",t1-seconds)
print $0""FS""difference}' file
Results:
20140101;1212;1.5;1042
20140102;1515;1.58;1340
20140103;1759;.69;1717
20140104;1100;12.5;2230
Check: https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html
Explanation:
tmst=sprintf(...): create a date string from the file that conforms to the datespec of the mktime function, YYYY MM DD HH MM SS [DST].
t1=mktime(tmst): turn the datespec into a timestamp that gawk can handle (the number of seconds elapsed since 1 January 1970).
seconds=sprintf("%f",$3)+0: convert the third field to a float.
seconds*=60*60: convert hours (as a float) to seconds.
difference=strftime("%H%M",t1-seconds): format the difference in a human-readable way, as hours and minutes.
I highly recommend using a programming language that supports datetime calculations, because the details can get tricky due to daylight saving shifts. You can use Python, for example:
start_times.py:
import csv
from datetime import datetime, timedelta
with open('input.txt', 'rb') as csvfile:
reader = csv.reader(csvfile, delimiter=';', quotechar='|')
for row in reader:
end_day = row[0]
end_time = row[1]
# Create a datetime object
end = datetime.strptime(end_day + end_time, "%Y%m%d%H%M")
# Translate duration into minutes
duration=float(row[2])*60
# Calculate start time
start = end - timedelta(minutes=duration)
# Column 3 is the start day (can differ from end day!)
row.append(start.strftime("%Y%m%d"))
# Column 4 is the start time
row.append(start.strftime("%H%M"))
print ';'.join(row)
Run:
python start_times.py
Output:
20140101;1212;1.5;20140101;1042
20140102;1515;1.58;20140102;1340
20140103;1759;.69;20140103;1717
20140104;1100;12.5;20140103;2230 <-- you see, the day matters!
The above example uses the system's timezone. If the input data refers to a different timezone, Python's datetime module allows you to specify it.
I would do something like this:
awk 'BEGIN{FS=OFS=";"}
{ h=substr($2,1,2); m=substr($2,3,2); mins=h*60 + m; diff=mins - $3*60;
printf "%s%s%02d:%02d\n", $0, OFS, diff/60, diff%60
}' file
That is, convert everything to minutes and then back to hours/minutes.
Test
$ awk 'BEGIN{FS=OFS=";"}{h=substr($2,1,2); m=substr($2,3,2); mins=h*60 + m; diff=mins - $3*60; printf "%s%s%02d:%02d\n", $0, OFS, diff/60, diff%60}' a
20140101;1212;1.5;10:42
20140102;1515;1.58;13:40
20140103;1759;.69;17:17
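The minutes-only arithmetic above can be sketched in Python, with the day break handled by a modulo over 24*60 minutes (the function name is illustrative):

```python
def start_time(end_hhmm, duration_hours):
    # Convert the HHMM end time to minutes, subtract the duration in
    # minutes, and wrap negative results around midnight.
    end_minutes = int(end_hhmm[:2]) * 60 + int(end_hhmm[2:])
    diff = (end_minutes - duration_hours * 60) % (24 * 60)
    # Truncate leftover fractional minutes, as the awk version does.
    return "%02d:%02d" % (int(diff // 60), int(diff % 60))

print(start_time("1212", 1.5))   # 10:42
print(start_time("1100", 12.5))  # 22:30 (on the previous day)
```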