number of days in a period that fall within another period - algorithm

I have two independent date ranges. The first range is the start and end date of a project. Let's say start = 3/21/10 and end = 5/16/10. The second range is a month boundary (say 3/1/10 to 3/31/10, 4/1/10 to 4/30/10, etc.). I need to figure out how many days in each month fall into the first range.
The answer to my example above is March = 11 (the 21st through the 31st, counting both ends), April = 30, May = 16.
I am trying to figure out an Excel formula or VBA function that will give me this value.
Any thoughts on an algorithm for this? I feel it should be rather easy but I can't seem to figure it out.
I have a formula which will return TRUE/FALSE if ANY part of the month range is within the project start/end but not the number of days. That function is below.
return month_start <= project_end And month_end >= project_start

Think I figured it out.
=MAX( MIN(project_end, month_end) - MAX(project_start,month_start) + 1 , 0 )
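For readers who want to check the formula outside Excel, here is a minimal Python sketch of the same idea. The function name `overlap_days` and the sample ranges are only illustrative. Both endpoints are counted, so March 21 through March 31 comes to 11 days.

```python
from datetime import date


def overlap_days(project_start, project_end, month_start, month_end):
    """Count the days shared by two inclusive date ranges (0 if none)."""
    latest_start = max(project_start, month_start)
    earliest_end = min(project_end, month_end)
    # +1 because both the first and the last day are counted.
    return max((earliest_end - latest_start).days + 1, 0)


project = (date(2010, 3, 21), date(2010, 5, 16))
months = {
    "March": (date(2010, 3, 1), date(2010, 3, 31)),
    "April": (date(2010, 4, 1), date(2010, 4, 30)),
    "May": (date(2010, 5, 1), date(2010, 5, 31)),
    "June": (date(2010, 6, 1), date(2010, 6, 30)),
}
for name, (m_start, m_end) in months.items():
    print(name, overlap_days(*project, m_start, m_end))
```

The outer `max(..., 0)` is what keeps months that do not touch the project at all (June here) from going negative.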


How to find the largest number of times a candlestick pattern appears within 2 hours to 15 minute timeframes

I am trying to figure out how to search for a pattern within a range of timeframes. The pattern will likely occur several times depending on the timeframe, which is why I’m particularly interested in the largest number of times it repeats.
To explain further: say I am searching for a pattern from the 2 hour chart down to the 15 minute chart, and I find it on the 2 hour chart. I then drill into the next timeframe, 1 hour, and find two of the patterns there. I continue to the 30 minute chart (inside both 1 hour patterns) and then to 15 minutes, until I have the largest number of times it occurs.
I believe a method that returns the next lower timeframe is needed. I’ve been able to write that; see the code below. I would really appreciate some help.
ENUM_TIMEFRAMES findLowerTimeframe(ENUM_TIMEFRAMES timePeriod)
{
   int timeFrames[5] = {15, 20, 30, 60, 120};
   int TFIndex=ArrayBsearch(timeFrames, (int)timePeriod);
   return((ENUM_TIMEFRAMES) timeFrames[TFIndex - 1]);
}
I didn't add the specific candlestick pattern because I believe it isn't the most important part of my problem. The crux of the question is how to search for a pattern on several consecutive timeframes to find the largest number of times it occurs within the range of times.
ENUM_TIMEFRAMES findLowerTimeframe(ENUM_TIMEFRAMES timePeriod)
{
   // DEFAULT_TIMEFRAMES is assumed to be a sorted array of timeframes declared elsewhere
   int TFIndex=ArrayBsearch(DEFAULT_TIMEFRAMES,timePeriod);
   return(TFIndex>0 ? DEFAULT_TIMEFRAMES[TFIndex - 1] : PERIOD_CURRENT);
}
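As a rough illustration of the overall search (not MQL, and not tied to any particular candlestick pattern), the drill-down can be sketched in Python. It steps to the next lower timeframe until none is left and remembers where the count was highest. `count_pattern` is a hypothetical callback that returns how many times the pattern appears on a given timeframe, and the sample counts are made up.

```python
from bisect import bisect_left

# Timeframes in minutes, sorted ascending (same values as the question's array).
TIMEFRAMES = [15, 20, 30, 60, 120]


def find_lower_timeframe(period):
    """Return the next lower timeframe, or None when period is already the lowest."""
    i = bisect_left(TIMEFRAMES, period)
    return TIMEFRAMES[i - 1] if i > 0 else None


def max_occurrences(start_period, count_pattern):
    """Walk down from start_period and return (timeframe, count) for the
    timeframe on which the hypothetical count_pattern callback is largest."""
    best = (start_period, count_pattern(start_period))
    period = find_lower_timeframe(start_period)
    while period is not None:
        n = count_pattern(period)
        if n > best[1]:
            best = (period, n)
        period = find_lower_timeframe(period)
    return best


# Made-up counts standing in for a real pattern detector.
sample_counts = {120: 1, 60: 2, 30: 3, 20: 3, 15: 5}
print(max_occurrences(120, sample_counts.get))
```

Returning `None` at the bottom of the list plays the same role as the `TFIndex>0` guard: it prevents indexing before the first element.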

Making complex new variables

I have troubles finding a solution to the following problem:
I have an age variable (e.g. 18, 20, 56) and the year the survey was taken (2012). For each respondent I want to create a set of 0/1 variables, one per year, indicating whether the respondent was alive in that year. For example, for a respondent who is 10 years old: age2002 = 1, age2003 = 1, ... age2012 = 1, but age2000 = 0 and age1990 = 0.
How can I do this in SPSS syntax for every respondent? I have many varying ages, but the year of the survey is always the same.
This is for all ages from 1 to 100:
do repeat NewVr=age1912 to age2012/vl=1912 to 2012.
compute NewVr=(2012-age<=vl).
end repeat.
If you only want the years that cover ages 1 to 10, plus 2000, 1990, 1980, etc.:
do repeat NewVr=age1970 age1980 age1990 age2000 age2002 to age2012
/vl=1970 1980 1990 2000 2002 to 2012.
compute NewVr=(2012-age<=vl).
end repeat.
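To make the logic of `2012-age<=vl` concrete, here is a small Python sketch of the same comparison. The function name `alive_indicators` and its defaults are illustrative: a year gets 1 when the respondent's birth year (survey year minus age) is less than or equal to that year.

```python
SURVEY_YEAR = 2012


def alive_indicators(age, first_year=1912, last_year=SURVEY_YEAR):
    """Map 'ageYYYY' to 1 if the respondent was already born in YYYY, else 0."""
    birth_year = SURVEY_YEAR - age
    return {
        f"age{year}": int(birth_year <= year)
        for year in range(first_year, last_year + 1)
    }


flags = alive_indicators(10)
print(flags["age2001"], flags["age2002"], flags["age2012"])
```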
What is the actual problem you are attempting to solve? Creating a bunch (100) 0/1 dummy variables doesn't seem like a very sound data management practice.
If you do go with the suggested
compute NewVr=(2012-age<=vl).
I would rewrite that as
COMPUTE NewVr = ( (2012-age) LE vl ).
It just seems clearer to me to parse.

Finding the day in which a given year begins

This question arose when I was trying to understand Sakamoto's algorithm for finding the day of a given date.
I found the working of the algorithm to be difficult to comprehend even after reading the following Stackoverflow answer
So, I decided to first solve the more specific problem of finding the day on which a given year begins (Jan 1).
From Sakamoto's algorithm, I took just the part that adds the additional days contributed by the leap and non-leap years.
My code is as follows:
public String getDay(String date)
{
    String[] days = { "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday" };
    int day = Integer.parseInt(date.split("/")[0]);
    int month = Integer.parseInt(date.split("/")[1]);
    int year = Integer.parseInt(date.split("/")[2]);
    year--; // to calculate the additional days till the previous year
    int dayOfTheWeek = (year + year/4 - year/100 + year/400) % 7;
    return days[dayOfTheWeek];
}
Thus, for the date "1/1/0001", it returns Sunday.
To verify its correctness, I implemented Sakamoto's algorithm and compared the results. My program's result always seems to be one day before the day returned by Sakamoto's algorithm.
For the date "1/1/0001" my program returns Sunday, while Sakamoto's returns Monday.
1) Does it mean that the Gregorian calendar started on Monday instead of Sunday??
2) If yes, does it mean I should add 1 to the result to get the right day or is my program logically incorrect?
Finally, I used TimeAndDate site's day calculator tool and "1/1/0001" starts on Saturday.
My final question is
3) On what day does the Gregorian calendar start?
Any light on these questions is much appreciated.
What exactly is the point of reinventing the wheel?
Joda-Time is a de facto standard for date-time operations in Java, and it provides dayOfWeek method for its DateTime objects. See e.g.
If you are then still interested in details how to get the computation right, see
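As a quick cross-check, here is a Python sketch that compares the question's formula against the standard library. Python's `datetime` uses the proleptic Gregorian calendar, in which 1 January 0001 is a Monday, so the formula needs a constant offset of 1 when Sunday is index 0. The function name `first_day_of_year` is only illustrative. The Saturday reported by the online calculator is probably a Julian-calendar result for a date that early, but that is an assumption.

```python
from datetime import date

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def first_day_of_year(year):
    """Weekday of 1 January in the proleptic Gregorian calendar."""
    y = year - 1  # only complete years before this one contribute days
    # The leading 1 accounts for 1 January 0001 being a Monday.
    return DAYS[(1 + y + y // 4 - y // 100 + y // 400) % 7]


# Compare against the standard library for a few years.
for year in (1, 2000, 2024):
    expected = DAYS[date(year, 1, 1).isoweekday() % 7]
    print(year, first_day_of_year(year), expected)
```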

Converting numbers/string to time - PROLOG

I am a beginner in prolog and was wondering if there was an easy way to convert numbers to time, for comparison.
For example:
The two facts below show the bus name, capacity, the time it arrives at the city, and the time it departs the city.
bus_info(bus1,150, 12:30, 14:30).
bus_info(bus2, 200, 16:00, 18:00).
passenger_info(mike, 21, 17:30). % name, age, and time available
I want to check which bus Mike can catch. The answer is bus 2, but how do I calculate this in prolog?
You're just comparing times for a given day so you don't need to convert the numbers to any kind of system time encoding. You only need, say "minutes past midnight" or something like that. For example, 12:30 would be (12*60)+30 minutes past midnight. And you can use that as your comparison units for a daily schedule.
To capture your hours and minutes to do this calculation, if you were to "ask" in Prolog:
bus_info(Bus, Num, StartHH:StartMM, EndHH:EndMM).
You would get two results:
Bus = bus1
Num = 150
StartHH = 12
StartMM = 30
EndHH = 14
EndMM = 30
Bus = bus2
Num = 200
StartHH = 16
StartMM = 0
EndHH = 18
EndMM = 0
To assign a numeric value of an expression in Prolog, you need the is predicate. For example:
StartTime is (StartHH * 60) + StartMM.
That basic information should get you started if you've learned how Prolog predicates basically work.
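The "minutes past midnight" comparison can be sketched in Python as follows (not Prolog, just to show the arithmetic). The rule that a bus is catchable when it departs at or after the time the passenger becomes available is an assumption about what the question intends, and the names `to_minutes` and `catchable` are illustrative.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes past midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


# (name, capacity, arrival, departure), mirroring the bus_info facts.
BUSES = [
    ("bus1", 150, "12:30", "14:30"),
    ("bus2", 200, "16:00", "18:00"),
]


def catchable(available, buses=BUSES):
    """Buses that depart at or after the passenger becomes available (assumed rule)."""
    t = to_minutes(available)
    return [name for name, _capacity, _arrive, depart in buses
            if to_minutes(depart) >= t]


print(to_minutes("12:30"))
print(catchable("17:30"))
```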

calculate standard deviation of daily data within a year

I have a question,
In Matlab, I have a vector of 20 years of daily data (X) and a vector of the relevant dates (DATES). In order to find the mean value of the daily data per year, I use the following script:
A = fints(DATES,X); %convert to financial time series
B = toannual(A,'CalcMethod', 'SimpAvg'); %calculate average value per year
C = fts2mat(B); %Convert fts object to vector
C is a vector of 20 values, showing the average value of the daily data for each of the 20 years. So far, so good. Now I am trying to do the same thing, but instead of calculating the mean annually I need to calculate the standard deviation annually, and there does not seem to be such an option in the function "toannual".
Any ideas on how to do this?
I'm assuming that X is the financial information and it is an even distribution across each year. You'll have to modify this if that isn't the case. Just to clarify, by even distribution, I mean that if there are 20 years and X has 200 values, each year has 10 values to it.
You should be able to do something like this:
num_years = length(C);
span_size = length(X)/num_years; % number of daily values per year (must be a whole number)
for n = 0:num_years-1
    std_dev(n+1,1) = std(X(1+(n*span_size):(n+1)*span_size));
end
The idea is that you simply pass the data for the given year (the day-to-day values) into MATLAB's standard deviation function, which returns the standard deviation for that year. std_dev should be a column vector that corresponds 1:1 with your C vector of yearly averages.
If the years do not all have the same number of values, you can group by date value instead:
unique_Dates = unique(DATES); % returns 20 elements only if DATES holds the year of each observation, not the full date
std_dev = zeros(size(unique_Dates)); % preallocate the standard deviation vector
for n = 1:length(unique_Dates)
    std_dev(n) = std(X(DATES==unique_Dates(n)));
end
Now this is assuming that your DATES matrix is passable to the unique function and that it will return the expected list of dates. If you have the dates in a numeric form I know this will work, I'm just concerned about the dates being in a string form.
In the event they are in a string form you can look at using regexp to parse the information and replace matching dates with a numeric identifier and use the above code. Or you can take the basic theory behind this and adapt it to what works best for you!
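The group-by-year idea can also be sketched in Python, assuming the dates are available as `datetime.date` objects (an assumption, since the original data are MATLAB date values). `statistics.stdev` is the sample standard deviation (N-1 normalization), which matches the default of MATLAB's `std`. The function name `yearly_std` and the tiny data set are illustrative only.

```python
from collections import defaultdict
from datetime import date
from statistics import stdev


def yearly_std(dates, values):
    """Group daily values by calendar year and return {year: sample std dev}."""
    groups = defaultdict(list)
    for d, v in zip(dates, values):
        groups[d.year].append(v)
    # stdev needs at least two values per year.
    return {year: stdev(vals) for year, vals in sorted(groups.items())}


dates = [date(2000, 1, 1), date(2000, 1, 2), date(2001, 1, 1), date(2001, 1, 2)]
values = [1.0, 3.0, 10.0, 10.0]
print(yearly_std(dates, values))
```

Grouping on the year extracted from each date avoids the assumption that every year contributes the same number of observations (leap years alone break that).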
