Sum of values between start and end date in MATLAB - algorithm

If, in MATLAB, I have a set of start and end dates that represent a contiguous period of time, how would I take a separate daily time series and accumulate any values from said time series over each start/end pair?
Is this something that can be done with accumarray()? It wasn't clear to me whether it would be efficient to construct the vector that groups each element of the time series by start/end pair.
Inputs
Start Date End Date
01/01/12 01/31/12
02/01/12 02/28/12
...
Date Value
01/01/12 23
01/02/12 87
01/03/12 5
01/04/12 12
...
02/01/12 4
Output
Start Date End Date Value
01/01/12 01/31/12 127
02/01/12 02/28/12 4
...

For consecutive periods of time, the following approach might work. Note that the strings containing dates are cellstrings and, for consecutive data, only the first column of your start date/end date matrix is necessary.
Furthermore, note that I separated your time series data into two variables for the sake of clarity.
dateBins = {...
'01/01/12';
'02/01/12';
'03/01/12';
'04/01/12'};
dates = {
'01/01/12'
'01/02/12'
'01/03/12'
'01/04/12'
'02/01/12' };
values = [
23
87
5
12
4];
With these variables, the following code
% idx(k) is the index of the bin (period) that dates{k} falls into
[~, idx] = histc(datenum(dates), datenum(dateBins));
% sum the values per bin; the size argument pads trailing empty bins with zeros
result = accumarray(idx, values, [numel(dateBins) 1])
will result in:
result =
127
4
0
0
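For readers outside MATLAB, the same binning idea can be sketched in Python. This is only an illustration under stated assumptions, not the answer's code: the function name `sum_by_period` and the variable `bin_starts` are made up for the example, and `bisect_right` stands in for `histc` by locating the last period start that is on or before each date.

```python
from bisect import bisect_right
from datetime import date

def sum_by_period(bin_starts, dates, values):
    """Sum values per period, where period i starts at bin_starts[i].

    Assumes bin_starts is sorted and the periods are consecutive, so each
    period ends just before the next start (the last one is open-ended).
    """
    totals = [0] * len(bin_starts)
    for d, v in zip(dates, values):
        i = bisect_right(bin_starts, d) - 1  # index of the last start <= d
        if i >= 0:  # skip dates that fall before the first period
            totals[i] += v
    return totals

bin_starts = [date(2012, 1, 1), date(2012, 2, 1), date(2012, 3, 1), date(2012, 4, 1)]
dates = [date(2012, 1, 1), date(2012, 1, 2), date(2012, 1, 3),
         date(2012, 1, 4), date(2012, 2, 1)]
values = [23, 87, 5, 12, 4]
print(sum_by_period(bin_starts, dates, values))  # [127, 4, 0, 0]
```

Because each lookup is a binary search, no grouping vector has to be built by hand, which was the concern raised in the question.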

Assuming you already have two vectors with the start dates and end dates in a format that you can compare, and you just want to count how many occurrences there are in each category, then it is quite straightforward:
% Your data
Dates = 25*rand(10,1);
StartDate = [1 10 20];
EndDate = [9 19 29];
% The Calculation
Result = zeros(size(StartDate)); % Initialization
for d = 1:length(StartDate)
% logical mask of the dates that fall inside the d-th interval
idx = Dates >= StartDate(d) & Dates <= EndDate(d);
Result(d) = sum(idx);
end
Note that this will require that you store your dates in a comparable format.
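If the goal is the sum of the values rather than the count of occurrences, the same interval test can be applied to the values. A minimal Python sketch, assuming the dates are already comparable numbers (the name `sum_in_ranges` and the sample data are hypothetical):

```python
def sum_in_ranges(starts, ends, dates, values):
    """For each inclusive [start, end] pair, sum the values whose date falls inside."""
    return [
        sum(v for d, v in zip(dates, values) if s <= d <= e)
        for s, e in zip(starts, ends)
    ]

# Numeric "dates" (e.g. day numbers), mirroring the intervals used above
starts = [1, 10, 20]
ends = [9, 19, 29]
dates = [1, 2, 3, 4, 12]
values = [23, 87, 5, 12, 4]
print(sum_in_ranges(starts, ends, dates, values))  # [127, 4, 0]
```

This scans the whole series once per interval, so it is simple rather than fast; it also works when the intervals are not consecutive.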

I would iterate over each pair of start/end dates. Then pick out the index start/stop pairs and sum them. If you use datestrs, you can make the following less brittle, while allowing for more flexibility in how you represent times that cross years, etc.
start_date = {'01/01/12'};
end_date={'01/31/12'};
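dates={'01/01/12','01/02/12','01/03/12','01/31/12'}; % named "dates" to avoid shadowing the built-in datevec function
values=[23,87,5,12];
for i=1:numel(start_date)
is = find(ismember(dates,start_date{i})); % index of the start date
ie = find(ismember(dates,end_date{i})); % index of the end date
disp(sum(values(is:ie)))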
end

Related

How can I add minutes and seconds to a datetime in lua?

I have a lua function to attempt to convert the time duration of the currently playing song e.g. hh:mm:ss to seconds.
function toSeconds (inputstr)
local mytable = string.gmatch(inputstr, "([^"..":".."]+)");
local conversion = { 60, 60, 24}
local seconds = 0;
--iterate backwards
local count = 0;
for i=1, v in mytable do
count = i+1
end
for i=1, v in mytable do
mytable[count-i]
seconds = seconds + v*conversion[i]
end
return seconds
end
in order to add it to os.time to get the estimated end time of a song.
but the hours may be missing, or the minutes may be missing on a short track.
When running against https://www.lua.org/cgi-bin/demo All I get is input:10: 'do' expected near 'in'
for the test script
function toSeconds (inputstr)
local mytable = string.gmatch(inputstr, "([^"..":".."]+)");
local conversion = { 60, 60, 24}
local seconds = 0;
--iterate backwards
local count = 0;
for i=1, v in mytable do
count = i+1
end
for i=1, v in mytable do
mytable[count-i]
seconds = seconds + v*conversion[i]
end
return seconds
end
print(toSeconds("1:1:1")
You're mixing up the two possible ways of writing a for loop:
a)
for i=1,10 do
print(i, "This loop is for counting up (or down) a number")
end
b)
for key, value in ipairs({"hello", "world"}) do
print(key, value, "This loop is for using an iterator function")
end
The first one, as you can see, simply counts up a number, i in this case. The second one is very generic and can be used to iterate over almost anything (for example using io.lines), but is most often used with pairs and ipairs to iterate over tables.
You also don't write for ... in tab, where tab is a table; you have to use ipairs for that, which then returns an iterator for the table (which is a function)
You're also using string.gmatch incorrectly; it doesn't return a table, but an iterator function over the matches of the pattern in the string, so you can use it like this:
local matches = {}
for word in some_string:gmatch("[^ ]+") do
table.insert(matches, word)
end
which gives you an actual table containing the matches, but if you're only going to iterate over that table, you might as well use the gmatch loop directly.
for i=1, v in mytable do
count = i+1
end
I think you're just trying to count the elements in the table here? You can easily get the length of a table with the # operator, so #mytable
If you have a string like hh:mm:ss, but the hours and the minutes can be missing, the easiest thing might be to just fill them with 0. A somewhat hacky but short way to achieve this is to just append "00:00:" to your string, and look for the last 3 numbers in it:
local hours, minutes, seconds = ("00:00:"..inputstr):match("(%d+):(%d+):(%d+)$")
If nothing is missing, you'll end up with something like 00:00:hh:mm:ss, which you only take the last 3 values of to end up with the correct time.
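The conversion itself can also be done without padding, by folding the fields from left to right. The following Python sketch only illustrates that arithmetic; it is not a Lua translation of the code above, and the name `to_seconds` is made up for the example.

```python
def to_seconds(duration):
    """Convert 'hh:mm:ss', 'mm:ss' or 'ss' to a number of seconds."""
    parts = duration.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected duration format: {duration!r}")
    total = 0
    for part in parts:
        # every field to the left is worth 60 times the field to its right
        total = total * 60 + int(part)
    return total

print(to_seconds("1:1:1"))  # 3661
print(to_seconds("4:05"))   # 245
print(to_seconds("59"))     # 59
```

Missing hours or minutes need no special case here, because a shorter string simply runs through fewer multiplications by 60.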

How to compare alternating rows in CSV using RUBY

I have a data set that consists of thousands of rows. I would like to count how many times an alarm toggles between ALARM_OPENED and ALARM_NORMALIZED.
Here is a data sample. The alarm toggles twice, so ideally count = 2.
The issue is that I cannot figure out how to
1) compare ALARM_OPENED and ALARM_NORMALIZED for the event type
2) compare the difference in time between the change in event (the toggling should happen within a time frame of two seconds).
count = 0
#loop this
if event_type[0] == 'ALARM_OPENED'
if event_type[1] == 'ALARM_NORMALIZED'
#time[0] - time[1] = 2 seconds
count = count + 1
end
end
p count
If you can assume that you always have a bunch of OPENED/NORMALIZED pairs, you can slice the array into pairs:
event_type.each_slice(2) do |opened, normalized|
break unless normalized # unpaired event at the end
# whatever you want to do with the two events here
end
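To show how the pairing combines with the two-second window from the question, here is a minimal Python sketch. It assumes each row has already been parsed into a `(timestamp, event_type)` tuple and that rows strictly alternate, as the answer assumes; the function name `count_toggles` and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta

def count_toggles(events, window=timedelta(seconds=2)):
    """Count OPENED/NORMALIZED pairs whose two events lie within `window`.

    Rows are taken in non-overlapping pairs (rows 0-1, 2-3, ...), like
    each_slice(2); an unpaired final row is ignored.
    """
    count = 0
    for (t_open, kind_open), (t_norm, kind_norm) in zip(events[0::2], events[1::2]):
        if (kind_open == "ALARM_OPENED" and kind_norm == "ALARM_NORMALIZED"
                and t_norm - t_open <= window):
            count += 1
    return count

events = [
    (datetime(2020, 1, 1, 0, 0, 0), "ALARM_OPENED"),
    (datetime(2020, 1, 1, 0, 0, 1), "ALARM_NORMALIZED"),
    (datetime(2020, 1, 1, 0, 5, 0), "ALARM_OPENED"),
    (datetime(2020, 1, 1, 0, 5, 2), "ALARM_NORMALIZED"),
    (datetime(2020, 1, 1, 0, 9, 0), "ALARM_OPENED"),
    (datetime(2020, 1, 1, 0, 9, 30), "ALARM_NORMALIZED"),  # too slow, not counted
    (datetime(2020, 1, 1, 0, 10, 0), "ALARM_OPENED"),      # unpaired, ignored
]
print(count_toggles(events))  # 2
```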

How to write dates (MM/DD/YY) into a matrix (SAS)

I have following problem:
I need to write a begin and end date into a matrix, where the matrix contains the yearly quarters (1-4) in the columns and the rows are the years.
E.g.
Matrix:
Q1 Q2 Q3 Q4
2010
2011
Now the date 01.01.2010 should be put in the first element and the date 09.20.2011 in the sixth element.
Thanks in advance.
You first have to consider that SAS does not actually have date/time/datetime variables. It just uses numeric variables formatted as date/time/datetime. The actual value being:
days since 1/1/1960 for dates
seconds since 00:00 for times
seconds since 1/1/1960 00:00 for datetimes
SAS does not even distinguish between integer and float numeric types. So a date value can contain a fractional part.
What you do or can do with a SAS numeric variable is completely up to you, and mostly depends on the format you apply. You could mistakenly format a variable containing a date value with a datetime format... or even with a currency format... SAS won't notice or complain.
You also have to consider that SAS does not actually have matrices and arrays either. It does provide a way to simulate their use to read and write to dataset variables.
That said, SAS does provide a whole lot of formats and informats that allow you to implement date and time manipulation.
Assuming you are coding within a data step, and assuming the "dates" are in dataset numeric variables, then the PUT function can extract the datepart you need to calculate row, column of the matrix element to write to, like so:
DATA table;
ARRAY dm{2,4} dm_r1c1-dm_r1c4 dm_r2c1-dm_r2c4;
beg_row = PUT(beg_date, YEAR4.)-2009;
end_row = PUT(end_date, YEAR4.)-2009;
beg_col = PUT(beg_date, QTR1.);
end_col = PUT(end_date, QTR1.);
dm{beg_row,beg_col} = beg_date;
dm{end_row,end_col} = end_date;
RUN;
... or if you are using a one-dimensional array:
DATA table;
ARRAY da{8} da_1-da_8;
beg_index = 4 * (PUT(beg_date, YEAR4.)-2010) + PUT(beg_date, QTR1.);
end_index = 4 * (PUT(end_date, YEAR4.)-2010) + PUT(end_date, QTR1.);
da{beg_index} = beg_date;
da{end_index} = end_date;
RUN;
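The index arithmetic used in both data steps can be checked outside SAS. This Python sketch assumes 1-based rows, columns and flat index, with 2010 as the first year, matching the array declarations above; the name `quarter_cell` is made up for the example.

```python
from datetime import date

def quarter_cell(d, first_year=2010):
    """Return (row, col, flat_index), all 1-based, for a year-by-quarter grid."""
    row = d.year - first_year + 1
    col = (d.month - 1) // 3 + 1   # quarter number, 1..4
    flat = 4 * (row - 1) + col     # position in a row-major 1-D array
    return row, col, flat

print(quarter_cell(date(2010, 1, 1)))   # (1, 1, 1)
print(quarter_cell(date(2011, 9, 20)))  # (2, 3, 7)
```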

calculate standard deviation of daily data within a year

I have a question,
In Matlab, I have a vector of 20 years of daily data (X) and a vector of the relevant dates (DATES). In order to find the mean value of the daily data per year, I use the following script:
A = fints(DATES,X); %convert to financial time series
B = toannual(A,'CalcMethod', 'SimpAvg'); %calculate average value per year
C = fts2mat(B); %Convert fts object to vector
C is a vector of 20 values showing the average value of the daily data for each of the 20 years. So far, so good. Now I am trying to do the same thing, but instead of calculating mean values annually, I need to calculate the standard deviation annually, and there seems to be no such option in the function "toannual".
Any ideas on how to do this?
THANK YOU IN ADVANCE
I'm assuming that X is the financial information and it is an even distribution across each year. You'll have to modify this if that isn't the case. Just to clarify, by even distribution, I mean that if there are 20 years and X has 200 values, each year has 10 values to it.
You should be able to do something like this:
num_years = length(C);
span_size = length(X)/num_years;
for n = 0:num_years-1
std_dev(n+1,1) = std(X(1+(n*span_size):(n+1)*span_size));
end
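The idea is that you simply pass the data for the given year (the day-to-day values) into MATLAB's standard deviation function, which returns the standard deviation for that year. std_dev should be a column vector that corresponds 1:1 with your C vector of yearly averages.
If the values are not evenly distributed across the years, you can instead select each year's values by matching on DATES: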
unique_Dates = unique(DATES) %This should return a vector of 20 elements since you have 20 years.
std_dev = zeros(size(unique_Dates)); %Just pre allocating the standard deviation vector.
for n = 1:length(unique_Dates)
std_dev(n) = std(X(DATES==unique_Dates(n)));
end
Now this is assuming that your DATES matrix is passable to the unique function and that it will return the expected list of dates. If you have the dates in a numeric form I know this will work, I'm just concerned about the dates being in a string form.
In the event they are in a string form you can look at using regexp to parse the information and replace matching dates with a numeric identifier and use the above code. Or you can take the basic theory behind this and adapt it to what works best for you!
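The grouping idea can be made independent of how many values each year has by keying on the year of each date. A minimal Python sketch, assuming the dates are available as date objects (the name `yearly_std` and the sample data are hypothetical); `statistics.stdev` is the sample standard deviation, which normalizes by N-1 like MATLAB's default `std`.

```python
from collections import defaultdict
from datetime import date
from statistics import stdev

def yearly_std(dates, values):
    """Group values by calendar year and return {year: sample std-dev}.

    Each year needs at least two values, otherwise stdev raises an error.
    """
    by_year = defaultdict(list)
    for d, v in zip(dates, values):
        by_year[d.year].append(v)
    return {year: stdev(vals) for year, vals in sorted(by_year.items())}

# Small made-up sample spanning two years of unequal length
dates = [date(2019, 12, 30), date(2019, 12, 31),
         date(2020, 1, 1), date(2020, 1, 2), date(2020, 1, 3)]
values = [1.0, 3.0, 2.0, 4.0, 6.0]
print(yearly_std(dates, values))
```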

number of days in a period that fall within another period

I have 2 independent but contiguous date ranges. The first range is the start and end date for a project. Let's say start = 3/21/10 and end = 5/16/10. The second range is a month boundary (say 3/1/10 to 3/31/10, 4/1/10 to 4/30/10, etc.). I need to figure out how many days in each month fall into the first range.
The answer to my example above is March = 11, April = 30, May = 16 (counting both the first and last day of each range).
I am trying to figure out an excel formula or VBA function that will give me this value.
Any thoughts on an algorithm for this? I feel it should be rather easy but I can't seem to figure it out.
I have a formula which will return TRUE/FALSE if ANY part of the month range is within the project start/end but not the number of days. That function is below.
return month_start <= project_end And month_end >= project_start
Think I figured it out.
=MAX( MIN(project_end, month_end) - MAX(project_start,month_start) + 1 , 0 )
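The same formula can be written out in Python to check it against the example dates. This is a direct transcription of the formula, counting both endpoints as whole days; the name `overlap_days` and the list of months are assumptions made for the example.

```python
from datetime import date

def overlap_days(project_start, project_end, month_start, month_end):
    """Number of days shared by two inclusive date ranges (0 if they are disjoint)."""
    days = (min(project_end, month_end) - max(project_start, month_start)).days + 1
    return max(days, 0)

project = (date(2010, 3, 21), date(2010, 5, 16))
months = [
    (date(2010, 3, 1), date(2010, 3, 31)),
    (date(2010, 4, 1), date(2010, 4, 30)),
    (date(2010, 5, 1), date(2010, 5, 31)),
    (date(2010, 6, 1), date(2010, 6, 30)),
]
print([overlap_days(*project, start, end) for start, end in months])  # [11, 30, 16, 0]
```

The outer `max(..., 0)` matters for months entirely outside the project, where the inner difference goes negative.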
