gnuplot - set xrange with date [1:31] - time

Hej!
My automated script runs through many months. I'd like the monthly plots to always show the window [1:31], i.e., YYYY-MM-01 to YYYY-MM-31, for every month my script handles. (I don't mind that some months, e.g., February, have no data at the end.)
Unfortunately, I don't know how to pass this range to gnuplot correctly. Since the data is time, something like [1:31] doesn't work, of course. But I can't hard-code the full date in my script either, since it will be different again next month.
Any ideas?
Thanks!
EDIT: I edited the original post to include some more information. Sorry for not including it right away.
DATA: My data basically looks like this (with many more lines in between). I have a separate data file per month, running from the 1st at 00:00 to the 31st at 24:00:
2022-07-01,00:00:16,27.3,3,28.0,9.0,995.6
2022-07-01,00:05:16,27.3,3,28.0,9.0,995.5
2022-07-01,00:10:16,27.3,3,28.0,9.0,995.4
2022-07-01,00:15:16,27.3,3,28.0,9.0,995.3
2022-07-01,00:20:16,27.3,3,28.0,9.0,995.3
2022-07-01,00:25:16,27.3,3,28.0,9.0,995.3
...
2022-07-31,23:54:16,27.1,3,27.9,12.1,994.9
2022-07-31,23:55:16,27.1,3,27.9,12.1,994.9
2022-07-31,23:56:16,27.1,3,27.9,12.1,995.1
2022-07-31,23:57:16,27.1,3,27.9,11.9,995.0
2022-07-31,23:58:16,27.1,3,27.9,11.9,995.0
EXAMPLE IMAGE: An example (for the month of July) is attached. As I'm not allowed to have embedded images, see the link:
CODE EXAMPLE: The following code is used to modify the x-axis. A very basic attempt (set xrange ["01":"31"]) is included to set the range between 1 and 31. But since the x data are times rather than integers, this attempt fails, of course.
set xdata time
set timefmt '%Y-%m-%d,%H:%M:%S'
set format x "%d"
#set xrange ["01":"31"]
set xtics 21600*4*7
set mxtics 7
set grid xtics
set grid mxtics
Of course, the xdata changes from month to month. But I'd like to have the day of the month only, not the full date. And, like I said, fixed to 1-31 for easier comparison between months.

It's not fully clear to me how you want to specify the month you want to plot. Maybe the following example gets us a step further.
Check help tm_year, help tm_mon, help strftime, help strptime, help time_specifiers. Note that tm_mon() will return a month index from 0 to 11, not from 1 to 12.
The script sets the range from the first of a given month to the first of the next month.
If you give more information about your actual task, we could also put the plots into a loop over several months.
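For readers who want to check the window arithmetic outside gnuplot, here is a minimal Python sketch of the same idea: the range runs from the first of the given month to the first of the next month. The function name month_window and the use of UTC are assumptions made for this illustration; they are not part of the gnuplot script.

```python
from datetime import datetime, timezone


def month_window(month):
    """Return (start, end) in epoch seconds for a 'YYYY-MM' string.

    start is the first of the month, end is the first of the next month.
    """
    start = datetime.strptime(month, "%Y-%m").replace(tzinfo=timezone.utc)
    # Python months are 1-based (gnuplot's tm_mon() is 0-based), so the
    # December case has to roll over into January of the next year.
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start.timestamp(), end.timestamp()


if __name__ == "__main__":
    for m in ("2022-01", "2022-02", "2022-12"):
        t0, t1 = month_window(m)
        print(m, "spans", (t1 - t0) / 86400, "days")
```

The window length follows the calendar (28 to 31 days), which is why the shorter months end earlier on the axis than a fixed 31-day window would.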
Script:
### select only a month from time data
reset session
myTimeFmt = "%Y-%m-%d"
# create some random test data
set table $Data
set samples 365
d0 = strptime(myTimeFmt,"2022-01-01")
plot '+' u (strftime("%Y-%m-%d",d0+$0*24*3600)):(rand(0)*10) w table
unset table
t0(m) = strptime("%Y-%m",m)
t1(m) = (t=strptime("%Y-%m",m),strptime("%Y-%m",sprintf("%04d-%02d",tm_year(t),tm_mon(t)+2)))
monthName(m) = strftime("%B %Y",strptime("%Y-%m",m))
set format x "%1d" timedate
set grid x,y
set xtic 24*3600
set key noautotitle
set multiplot layout 3,1
myMonth = "2022-01"
set xrange[t0(myMonth):t1(myMonth)]
set title monthName(myMonth)
plot $Data u (timecolumn(1,myTimeFmt)):2 w l
myMonth = "2022-02"
set xrange[t0(myMonth):t1(myMonth)]
set title monthName(myMonth)
plot $Data u (timecolumn(1,myTimeFmt)):2 w l
myMonth = "2022-09"
set xrange[t0(myMonth):t1(myMonth)]
set title monthName(myMonth)
plot $Data u (timecolumn(1,myTimeFmt)):2 w l
unset multiplot
### end of script
Result:
Addition:
Now that you have shown your data, I noticed that your date and your time are in different columns. That's why it's important to always include the data or describe its format in detail.
Here is a suggestion where the graphs always span 31 days, but the xtics and labels are suppressed for the shorter months. Admittedly, it's not straightforward and maybe not so easy to understand, but working through it is a good way to improve your gnuplot skills.
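The core of this approach is to plot against the day of the month plus the fraction of the day that has elapsed, so every month lands on the same 1 to 32 axis. A minimal Python sketch of that mapping follows; the function name day_fraction is hypothetical, and the input format is taken from the sample data in the question.

```python
from datetime import datetime


def day_fraction(date_str, time_str):
    """Map 'YYYY-MM-DD' and 'HH:MM:SS' to day-of-month plus fraction of day.

    For example, noon on the 31st maps to 31.5.
    """
    stamp = datetime.strptime(date_str + "," + time_str, "%Y-%m-%d,%H:%M:%S")
    seconds_into_day = stamp.hour * 3600 + stamp.minute * 60 + stamp.second
    return stamp.day + seconds_into_day / 86400.0


if __name__ == "__main__":
    # First and last sample rows from the question's data file
    print(day_fraction("2022-07-01", "00:00:16"))
    print(day_fraction("2022-07-31", "23:58:16"))
```

Because the result no longer contains the month or year, a fixed range of [1:32] works for every month without any time formatting on the x-axis.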
Script:
### select only a month from time data
reset session
myMonth = "2022-02"
# create some random test data
set table $Data separator comma
t0(m) = strptime("%Y-%m",m)
t1(m) = (t=strptime("%Y-%m",m),strptime("%Y-%m",sprintf("%04d-%02d",tm_year(t),tm_mon(t)+2)))
set xrange[t0(myMonth):t1(myMonth)-1]
set samples 100
plot '+' u (strftime("%Y-%m-%d",$1)):(strftime("%H:%M:%S",$1)):(rand(0)+18):(rand(0)+20):(rand(0)+22) w table
unset table
myDay(col) = tm_mday(timecolumn(col,"%Y-%m-%d"))
myTime(col) = timecolumn(col,"%H:%M:%S")/24./3600.
monthName(m) = strftime("%B %Y",strptime("%Y-%m",m))
set datafile separator comma
set ytic 1
set grid x,y
set key noautotitle
set title monthName(myMonth)
set xrange[1:32]
plot for [col=3:5] $Data u (d0=myDay(1), d0+myTime(2)):col w l lc col-2, \
'+' u (t0=$0+1):(0):xtic(t0==d0+1?'':sprintf("%d",t0)) every ::::d0 w p ps 0 noautoscale
### end of script
Result: (for January, February and September)

@theozh
I had a look at your code, and, actually, the part concerning my original question was not as complicated as I thought. I am using the following code:
t0(m) = strptime("%Y-%m",m)
#t1(m) = (t=strptime("%Y-%m",m),strptime("%Y-%m",sprintf("%04d-%02d",tm_year(t),tm_mon(t)+2)))
t1(m) = t0(m)+31*24*3600
set xrange[t0(myMonth):t1(myMonth)]
The results are as follows for July and September:
The only thing that "remains" would be to set all months in the range 1-31, but that's cosmetics and not really necessary. It works pretty well as it is - and it fits my script so that the automation is working as well!
NICE!

Related

Wrong value from a second time interval in pine script

Good morning,
I have a small problem.
I would like to work with data from two different time intervals,
for example the BTC chart at a 1-day interval and at a 4-hour interval.
The main time interval is 4 hours. The value "HA_C" is the close value of "BTC 1 Day".
The "close BTC 1 Day time interval" value is displayed correctly in the 4-hour chart.
But the value "test", which results from a simple calculation, differs greatly and is wrong.
You can test this as follows:
Load the strategy on "BTC" with the time interval "1 Day" and
note the "BTC Close" value and the "test" value for one day.
Then switch "BTC" to the 4-hour time interval.
You will see that the "HA_C Close" from the 1-day time interval is the correct value,
but the "test" value is displayed incorrectly.
Why is the "test" value incorrect after a calculation, although the "Close" value is correct?
I have found out that the problem is the "ta.ema(source, length)" function. Can someone give me a formula that calculates the same value as the "ta.ema(source, length)" function?
// This source code is subject to the terms of the Mozilla Public License 2.0 at https://mozilla.org/MPL/2.0/
// © flashpit
//@version=5
strategy("TEST", process_orders_on_close=true, overlay=true, calc_on_every_tick=true, pyramiding=30)
varip test = 0.0
HA_Symbol = ticker.heikinashi("BINANCE:BTCUSDT")
HA_C = request.security(HA_Symbol, "1D", close)
test := ta.ema(HA_C, 7) * 1.05
plot(HA_C)
plot(test)
I have found the correct code. Here it is:
c2_1D = request.security(ticker.heikinashi('BINANCE:BTCUSDT'), "1D", t3_D (close, T3Length_1D, T3FactorCalc_1D))
It is due to the context in which you call the ema function. If your chart is H4 and you perform your test calculation in the global scope, it uses 7 x H4 bars of HA_C. On BTCUSDT, the last 7 H4 bars are made up of repeats of only 2 daily values, hence the incorrect result.
When you change the chart to D1, it shows the correct result because the global context of the script is then operating in the same timeframe as the security call.
If you want the correct value from the ema using 7 x 1D bars it has to be done within the context of the security call. For example :
test = request.security(ticker.heikinashi("BINANCE:BTCUSDT"), "D", ta.ema(close, 7))
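To see why the context matters, the effect can be reproduced outside Pine Script. The following Python sketch uses the usual recursive EMA with alpha = 2 / (length + 1), seeded with the first value, which is how the Pine reference describes ta.ema; treat it as an approximation rather than the exact built-in. The closing prices are made-up numbers, and the assumption that each daily value appears on 6 consecutive 4-hour bars is a simplification of what request.security returns.

```python
def ema(values, length):
    """Recursive EMA with alpha = 2 / (length + 1), seeded with the first value."""
    alpha = 2.0 / (length + 1)
    result = []
    previous = None
    for value in values:
        if previous is None:
            previous = value
        else:
            previous = alpha * value + (1 - alpha) * previous
        result.append(previous)
    return result


# Hypothetical daily closes (illustrative numbers, not real BTC data).
daily_closes = [100, 102, 101, 105, 107, 110, 108, 111, 115, 113]

# Assumption: on a 4-hour chart each daily value is seen on 6 consecutive bars.
h4_series = [close for close in daily_closes for _ in range(6)]

# Length 7 counted in daily bars, as when ta.ema runs inside request.security.
ema_daily_context = ema(daily_closes, 7)[-1]
# Length 7 counted in 4-hour bars, as when ta.ema runs in the chart's global scope.
ema_chart_context = ema(h4_series, 7)[-1]

if __name__ == "__main__":
    print("EMA over daily bars:", ema_daily_context)
    print("EMA over 4-hour bars:", ema_chart_context)
```

The two results differ because the 4-hour version weights the same daily value six times in a row, so it tracks the latest close much more closely than a true 7-day EMA.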
If you need to perform multiple operations using the same ticker, you can also wrap them in a function and just pass the one function to a single security call. For example, this will return the daily close and the daily ema 7 :
f_ema_and_close(_src, _len) =>
    _ema = ta.ema(_src, _len)
    [_src, _ema]
[D1_close, D1_ema7] = request.security(ticker.heikinashi("BINANCE:BTCUSDT"), "D", f_ema_and_close(close, 7))
plot(D1_close, color = color.yellow)
plot(D1_ema7, color = color.red)

gnuplot : variable paths to data file in a for loop

I would like to plot multiple curves on the same graph using a for loop. Each data file (named stat_coupe) is located in a different folder (fwal055wal055/rep16/ and fwal055wal055_c2/rep20/). fwal055wal055 and fwal055wal055_c2 are simulation names. First, I need to get a previous result, a single number (Utau), from other files (named file_fwal055wal055 and file_fwal055wal055_c2). This is done successfully with awk. The result depends on the file: Utaufwal055wal055=10.5 and Utaufwal055wal055_c2=12.2.
Then I need to divide the 1st column of the file stat_coupe under the path fwal055wal055/rep16/ by the value of Utaufwal055wal055 and do the same for the file stat_coupe under the path fwal055wal055_c2/rep20/ with the value of Utaufwal055wal055_c2. Moreover, each plot should have a specific format which depends on the type of simulation run (fwal055wal055 or fwal055wal055_c2).
The problem presented here is reduced to 2 simulations (fwal055wal055 and fwal055wal055_c2) and 1 plot, but I have about 20 simulations and 15 different graphs to plot, which is why I would like to use a for loop.
To summarize, at each iteration I have:
a specific format,
a specific path,
a specific value of Utau
I want to indicate the right format, path and value of Utau at each iteration of the for loop. The solution I propose below successfully obtains the value of Utau for each simulation, but the code @path_.i and @format_.i does not work.
#!/bin/bash
for elem in fwal055wal055 fwal055wal055_c2;
do
Utau[${elem}]=$(awk 'FNR==5{print $1}' file_$elem)
done
gnuplot -persist <<-EOFMarker
format_fwal055wal055='pt 1 ps 1.0 lc 0 title "WALE"'
format_fwal055wal055_c2='pt 2 ps 1.0 lc 0 title "WALE c2"'
path_fwal055wal055='"fwal055wal055/rep16/stat_coupe"'
path_fwal055wal055_c2='"fwal055wal055_c2/rep20/stat_coupe"'
list="fwal055wal055 fwal055wal055_c2"
plot for [i in list] @path_.i u 1:(\$2/${Utau[${i}]}) @format_.i
EOFMarker
I would like to obtain something equivalent to:
plot @path_fwal055wal055 u 1:(\$2/${Utau[${i}]}) @format_fwal055wal055,\
@path_fwal055wal055_c2 u 1:(\$2/${Utau[${i}]}) @format_fwal055wal055_c2
Can someone help me to solve this issue ?
Thank you very much,
Martin
Check help sprintf, help words and help word.
I would create two strings with the same number of items and then combine them with sprintf(). From gnuplot 5.2 on you could also do it with arrays.
# Version 1
PATHS = '"fwal055wal055/rep16/stat_coupe" "fwal055wal055_c2/rep20/stat_coupe"'
FILES = "fwal055wal055 fwal055wal055_c2"
plot for [i=1:words(FILES)] sprintf("%s_%s",word(PATHS,i),word(FILES,i)) u 1:2
or you could define a function for your filenames to keep the plot command short and readable.
# Version 2
PATHS = '"rep16/stat_coupe" "rep20/stat_coupe"'
FILES = "fwal055wal055 fwal055wal055_c2"
myFilename(i) = sprintf("%s/%s_%s",word(FILES,i),word(PATHS,i),word(FILES,i))
plot for [i=1:words(FILES)] myFilename(i) u 1:2
Addition (after some clarifications...)
If I understand your question now correctly, the following code should do the job.
To extract the UTAUS, you run a separate loop before plotting and store the extracted values in a string. During plotting you get these values back via word(UTAUS,i). Since you do the mathematical operation column(2)/word(UTAUS,i), gnuplot will interpret them as numbers. Check help words, help word, help sprintf, help every.
Code:
### extract and normalize in a loop with individual files and directories
reset session
FILES = 'fwal055wal055 fwal055wal055_c2'
DIRS = 'rep16 rep20'
TITLES = '"WALE" "WALE c2"' # if you have spaces you need to put it into double quotes
UTAUS = ''
# define functions for better readability
myExtractionFile(i) = sprintf("file_%s",word(FILES,i))
myDataFile(i) = sprintf("%s/%s/stat_coupe",word(FILES,i),word(DIRS,i))
myTitle(i) = word(TITLES,i)
# define point or line appearance. Add more if you have more files
set style line 1 pt 1 ps 1.0 lc 0
set style line 2 pt 2 ps 1.0 lc 1
# extract the UTAUs
do for [i=1:words(FILES)] {
    set table $Dummy
    plot myExtractionFile(i) u (utau=$1) every ::4::4 w table # extract value row 5, column 1 (not counting header lines)
    unset table
    UTAUS = UTAUS.sprintf(" %g",utau) # append the extracted value as string
}
plot for [i=1:words(FILES)] myDataFile(i) u 1:(column(2)/word(UTAUS,i)) ls i title myTitle(i)
### end of code
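The bookkeeping behind this answer is three parallel lists indexed by the same i: the simulation name, its directory, and its Utau stored as text. The following Python sketch restates that pattern for clarity; the helper names data_file and normalized are hypothetical, and the Utau values are the ones quoted in the question rather than values read from the files.

```python
# Parallel, space-separated lists, as in the gnuplot answer.
FILES = "fwal055wal055 fwal055wal055_c2".split()
DIRS = "rep16 rep20".split()
UTAUS = "10.5 12.2".split()  # values quoted in the question, kept as strings


def data_file(i):
    """Build <simulation>/<dir>/stat_coupe for a 1-based index, like word()."""
    return "%s/%s/stat_coupe" % (FILES[i - 1], DIRS[i - 1])


def normalized(value, i):
    """Divide a column value by the i-th Utau, converting the string to a number."""
    return value / float(UTAUS[i - 1])


if __name__ == "__main__":
    for i in range(1, len(FILES) + 1):
        print(data_file(i), "is normalized by", UTAUS[i - 1])
```

Adding a simulation then means appending one item to each list; the loop itself does not change.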

Change All Value Labels to Numerics in SPSS

I need to change all the value labels of all my variables in my spss file to be the value itself.
I first tried -
Value Labels ALL.
EXECUTE.
This removes the value labels, but also removes the value entirely. I need each value to have a label of some sort because I am converting the file, and when no value labels are defined the conversion turns the variable into a numeric. Therefore, I need all value labels changed into numbers so that each value's label is just the value itself, e.g., if value = 1 then label = 1.
Any ideas to do this across all my variables??
Thanks in advance!!
Here is a solution to get you started:
get file="C:\Program Files\IBM\SPSS\Statistics\23\Samples\English\Employee data.sav".
begin program.
import spss, spssaux, spssdata
spss.Submit("set mprint on.")
vd=spssaux.VariableDict(variableType ="numeric")
for v in vd:
    allvalues = list(set(item[0] for item in spssdata.Spssdata(v.VariableName, names=False).fetchall()))
    if allvalues:
        cmd="value labels " + v.VariableName + "\n".join([" %(i)s '%(i)s'" %locals() for i in allvalues if i is not None]) + "."
        spss.Submit(cmd)
spss.Submit("set mprint off.")
end program.
You may want to read this to understand the behaviour of fetchall in reading date variables (or simply exclude date variables from having their values labelled also, if they cause no problems?)
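To make the generated syntax easier to inspect, the string-building step can be tried without the SPSS modules. The sketch below only assembles the VALUE LABELS command text for one variable; the function name value_labels_command and the variable name myvar are hypothetical, and sorting the values is an addition made here for readable output.

```python
def value_labels_command(var_name, values):
    """Build a VALUE LABELS command in which each distinct value labels itself.

    Missing values (None) are skipped. Returns None if nothing is left to label.
    """
    distinct = sorted(set(v for v in values if v is not None))
    if not distinct:
        return None
    # Each pair starts with a space, so the first one follows the name directly.
    pairs = "\n".join(" %s '%s'" % (v, v) for v in distinct)
    return "value labels " + var_name + pairs + "."


if __name__ == "__main__":
    print(value_labels_command("myvar", [1, 2, 2, None, 3]))
```

Note that numeric values fetched from SPSS typically arrive as floats, so the labels produced by the original program may read 1.0 rather than 1.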

Can I calculate something inside a for loop and then plot those values on the same graph?

I have the following code, which plots 4 lines:
plot for [i=1:4] \
path_to_file using 1:(column(i)) , \
I also want to plot 8 horizontal lines on this graph, the values of which come from mydata.txt.
I have seen, from the answer to Gnuplot: How to load and display single numeric value from data file, that I can use the stats command to access the constant values I am interested in. I think I can access the cell (row, col) as follows:
stats 'mydata.txt' every ::row::row using col nooutput
value = int(STATS_min)
But their location is a function of i. So, inside the plot command, I want to add something like:
for [i=1:4] \
stats 'mydata.txt' every ::(1+i*10)::(1+i*10) using 1 nooutput
mean = int(STATS_min)
stats 'mydata.txt' every ::(1+i*10)::(1+i*10) using 2 nooutput
SE = int(STATS_min)
upper = mean + 2 * SE
lower = mean - 2 * SE
and then plot upper and lower, as horizontal lines on the graph, above.
I think I can plot them separately by typing plot upper, lower but how do I plot them on the graph, above, for all i?
Thank you.
You can create an array and store the values in it, then using an index that refers to the value's position in the array you can access it inside a loop.
You can create the array as follows:
array=""
do for [i=1:4] {
    val = i / 9.
    array = sprintf("%s %g",array,val)
}
where I have stored 4 values: 1/9, 2/9, 3/9 and 4/9. In your case you would run stats and store your upper and/or lower variables. You can check what the array looks like in this way:
gnuplot> print array
0.111111 0.222222 0.333333 0.444444
For plotting, you can access the different elements in the array using word(array,i), where i refers to the position. Since the array is a string, you need to convert it to float, which can be done multiplying by 1.:
plot for [i=1:4] 1.*word(array,i)
If you have values stored in a data file, you can process it with awk or even with gnuplot:
array = ""
plot for [i=1:4] "data" every ::i::i u (array=sprintf("%s %g",array,$1), 1/0), \
for [i=1:4] 1.*word(array,i)
The first plot instance creates the array from the first column data entries without plotting the points (the 1/0 option tells gnuplot to ignore them, so expect warning messages) and the second plot instance uses the values stored in array as variables (hence as horizontal lines in this case). Note that every takes 0 as the first entry, so [i=1:4] runs from the second through to the fifth lines of the file.
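The string-as-array trick and the mean +/- 2 * SE levels from the question can be mirrored in a few lines of Python for checking the logic. The helper names word, append_value and bounds are chosen for this sketch; word imitates gnuplot's 1-based word() function.

```python
def word(text, i):
    """Return the i-th (1-based) whitespace-separated item, like gnuplot's word()."""
    return text.split()[i - 1]


def append_value(text, value):
    """Append a number to the string 'array', mirroring sprintf("%s %g", array, val)."""
    return "%s %g" % (text, value)


def bounds(mean, se):
    """Lower and upper horizontal-line levels: mean -/+ 2 * SE, as in the question."""
    return mean - 2 * se, mean + 2 * se


# Same four values as in the gnuplot example: 1/9, 2/9, 3/9 and 4/9.
array = ""
for i in range(1, 5):
    array = append_value(array, i / 9.0)

if __name__ == "__main__":
    print(array)
    print(float(word(array, 2)))
    print(bounds(10.0, 1.5))
```

As in gnuplot, the stored items are text, so they must be converted back to numbers (float here, multiplying by 1. in gnuplot) before use.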

calculate standard deviation of daily data within a year

I have a question,
In Matlab, I have a vector of 20 years of daily data (X) and a vector of the relevant dates (DATES). In order to find the mean value of the daily data per year, I use the following script:
A = fints(DATES,X); %convert to financial time series
B = toannual(A,'CalcMethod', 'SimpAvg'); %calculate average value per year
C = fts2mat(B); %Convert fts object to vector
C is a vector of 20 values showing the average value of the daily data for each of the 20 years. So far, so good. Now I am trying to do the same thing, but instead of calculating mean values annually, I need to calculate the std annually, and it seems there is no such option in the function "toannual".
Any ideas on how to do this?
THANK YOU IN ADVANCE
I'm assuming that X is the financial information and it is an even distribution across each year. You'll have to modify this if that isn't the case. Just to clarify, by even distribution, I mean that if there are 20 years and X has 200 values, each year has 10 values to it.
You should be able to do something like this:
num_years = length(C);
span_size = length(X)/num_years;
for n = 0:num_years-1
std_dev(n+1,1) = std(X(1+(n*span_size):(n+1)*span_size));
end
The idea is that you simply pass the data for the given year (the day-to-day values) into MATLAB's standard deviation function. That will return the std-dev for that year. std_dev should be a column vector that corresponds 1:1 with your C vector of yearly averages.
If the years do not all contain the same number of values, select each year's data by date instead:
dv = datevec(DATES); % split each date into [year month day hour minute second]
yrs = dv(:,1); % year of each daily value
unique_years = unique(yrs); % This should return a vector of 20 elements since you have 20 years.
std_dev = zeros(size(unique_years)); % Just preallocating the standard deviation vector.
for n = 1:length(unique_years)
std_dev(n) = std(X(yrs==unique_years(n)));
end
This assumes that DATES can be passed to datevec, which accepts serial date numbers as well as date strings in a standard format. If your dates are in numeric form, I know this will work; I'm only concerned about dates stored as strings in an unusual format.
In the event they are in a string form you can look at using regexp to parse the information and replace matching dates with a numeric identifier and use the above code. Or you can take the basic theory behind this and adapt it to what works best for you!
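The grouping idea is independent of MATLAB: collect the daily values by year, then take the standard deviation of each group. Here is a minimal Python sketch using only the standard library; the function name yearly_std and the sample values are made up for illustration. statistics.stdev uses the N-1 normalization, which matches the default of MATLAB's std.

```python
from collections import defaultdict
from datetime import date
from statistics import stdev


def yearly_std(dates, values):
    """Return {year: sample standard deviation of that year's values}."""
    groups = defaultdict(list)
    for day, value in zip(dates, values):
        groups[day.year].append(value)
    # stdev needs at least two values per year
    return {year: stdev(vals) for year, vals in sorted(groups.items())}


if __name__ == "__main__":
    # Hypothetical data: two observations in each of two years
    sample_dates = [date(2000, 1, 1), date(2000, 1, 2),
                    date(2001, 1, 1), date(2001, 1, 2)]
    sample_values = [1.0, 3.0, 10.0, 14.0]
    print(yearly_std(sample_dates, sample_values))
```

This version does not require the years to contain the same number of observations, so leap years and gaps in the data are handled automatically.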

Resources