I am having trouble finding a solution to the following problem:
I have an age variable (e.g. 18, 20, 56) and the year the survey was taken (2012). For each respondent I want to create one 0/1 variable per year that indicates whether the respondent was alive in that year. For example, if the respondent is 10 years old, then age2002 = 1, age2003 = 1, ... age2012 = 1, but age2000 = 0 and age1990 = 0.
How can I do this in SPSS syntax for every respondent? The ages vary widely, but the year of the survey is always the same.
This is for all ages up to 100 (birth years 1912 to 2012):
do repeat NewVr=age1912 to age2012/vl=1912 to 2012.
compute NewVr=(2012-age<=vl).
end repeat.
execute.
If you only want the individual years for ages 1 to 10, plus 2000, 1990, 1980, etc.:
do repeat NewVr=age1970 age1980 age1990 age2000 age2002 to age2012
/vl=1970 1980 1990 2000 2002 to 2012.
compute NewVr=(2012-age<=vl).
end repeat.
execute.
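For comparison, here is a minimal Python sketch of the same comparison the DO REPEAT loop performs. The function name alive_indicators and the dictionary output are assumptions made for illustration; only the survey year 2012 and the ageYYYY naming come from the question.

```python
SURVEY_YEAR = 2012  # survey year given in the question

def alive_indicators(age, first_year=1912, last_year=SURVEY_YEAR):
    """Return {"age1912": 0/1, ..., "age2012": 0/1} for one respondent.

    A year gets 1 when the respondent was already born in that year,
    which is the same test as (2012 - age <= year) in the SPSS syntax.
    """
    birth_year = SURVEY_YEAR - age
    return {
        f"age{year}": int(birth_year <= year)
        for year in range(first_year, last_year + 1)
    }

flags = alive_indicators(10)
print(flags["age2000"], flags["age2002"], flags["age2012"])
```

A respondent aged 10 gets 1 for 2002 through 2012 and 0 for every earlier year.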
What is the actual problem you are attempting to solve? Creating a bunch (100) 0/1 dummy variables doesn't seem like a very sound data management practice.
If you do go with the suggested
DO REPEAT ...
compute NewVr=(2012-age<=vl).
....
I would rewrite that as
COMPUTE NewVr = ((2012 - age) LE vl).
It just seems clearer to parse in my brain.
I am trying to figure out how to search for a pattern within a range of timeframes. The pattern will likely occur several times across those timeframes, which is why I'm particularly interested in the largest number of times it repeats.
To explain further what I'm trying to achieve: say I am searching for a pattern from the 2-hour chart down to the 15-minute chart, and I find it on the 2-hour chart. I then drill into the next timeframe, 1 hour, and end up with two of the patterns on the 1-hour chart. I continue to the 30-minute chart (within both 1-hour patterns) and on to 15 minutes, until I get the largest number of times it occurs.
I believe that a method that returns the next lower timeframe would be needed. I’ve been able to write that, see code below. I would really appreciate some help.
ENUM_TIMEFRAMES findLowerTimeframe(ENUM_TIMEFRAMES timePeriod)
{
int timeFrames[5] = {15, 20, 30, 60, 120};
int TFIndex=ArrayBsearch(timeFrames, (int)timePeriod);
return((ENUM_TIMEFRAMES) timeFrames[TFIndex - 1]);
}
EDIT
I didn't add the specific candlestick pattern because I believe it isn't the most important part of my problem. The crux of the question is how to search for a pattern on several consecutive timeframes to find the largest number of times it occurs within the range of times.
const ENUM_TIMEFRAMES DEFAULT_TIMEFRAMES[5] = {PERIOD_M15, PERIOD_M20, PERIOD_M30, PERIOD_H1, PERIOD_H2};
ENUM_TIMEFRAMES findLowerTimeframe(ENUM_TIMEFRAMES timePeriod)
{
int TFIndex=ArrayBsearch(DEFAULT_TIMEFRAMES,timePeriod);
return(TFIndex>0 ? DEFAULT_TIMEFRAMES[TFIndex - 1] : PERIOD_CURRENT);
}
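To illustrate the drill-down idea in a language-neutral way, here is a rough Python sketch. It assumes timeframes are plain minute counts, and count_pattern is a hypothetical callback (not part of MQL or of the question) that returns how many times the pattern occurs on a given timeframe.

```python
from bisect import bisect_left

# Timeframes in minutes, sorted ascending (same values as in the question).
TIMEFRAMES = [15, 20, 30, 60, 120]

def find_lower_timeframe(minutes):
    """Return the next lower timeframe, or None when there is none."""
    index = bisect_left(TIMEFRAMES, minutes)
    return TIMEFRAMES[index - 1] if index > 0 else None

def max_occurrences(start, count_pattern):
    """Walk from `start` down to the lowest timeframe.

    `count_pattern` is a hypothetical function: timeframe -> occurrence count.
    Returns (timeframe, count) for the timeframe with the most occurrences.
    """
    best_tf, best_count = None, 0
    timeframe = start
    while timeframe is not None:
        count = count_pattern(timeframe)
        if count > best_count:
            best_tf, best_count = timeframe, count
        timeframe = find_lower_timeframe(timeframe)
    return best_tf, best_count

# Made-up counts, for illustration only.
fake_counts = {120: 1, 60: 2, 30: 5, 20: 3, 15: 4}
print(max_occurrences(120, fake_counts.get))
```

The guard on index > 0 plays the same role as the TFIndex>0 check above: the lowest timeframe has nothing below it.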
TLDR: I'm effectively looking for an algorithm that gives me the combination with the minimum total number of messages, whether "sequential" and/or "layered", needed to get to the final result.
===
For a hotel imagine 12 consecutive weeks.
For each of these weeks a price of 100$ exists.
The hotel’s manager decides to change the prices of all these weeks as such
His system currently allows him only to send “price change” messages “sequentially” like so:
Week 1 to Week 2 = 120 $
Week 3 to Week 4 = 150 $
Week 5 to Week 6 = 120 $
Week 7 to Week 9 = 200 $
Week 10 = 120$
Week 11 = 250$
Week 12 = 120$
However, in this case he understands that it would be more efficient to send out the messages
in a “layered” manner like so:
Week 1 to Week 12 = 120 $
Week 3 to Week 4 = 150 $
Week 7 to Week 9 = 200 $
Week 11 = 250$
Which algorithm allows the manager to always calculate the optimal "layered" option, so that he can systematically choose the most efficient way of sending out the messages, no matter how many weeks are concerned and bearing in mind that some weeks will not necessarily have their prices changed?
I'm effectively looking for an algorithm that gives me the combination with the minimum total number of messages, whether "sequential" and/or "layered", needed to get to the final result. Does such an algorithm exist?
Here is a top-down memoized recursion in Python that should solve this problem in O(n^4) time (actually slightly longer because it is also keeping track of the moves to make - but this could be optimized away):
class Memoize:
    """Cache results of fn, keyed by its positional arguments."""
    def __init__(self, fn):
        self.fn = fn
        self.memo = {}
    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.fn(*args)
        return self.memo[args]

@Memoize
def best_messages(a, b, value=None):
    """Return moves needed to make range old[a:b] have the target values

    If value is not None, it means the range has been set to the given value
    """
    if value is None:
        # Skip values that already match the old prices
        while a < b and new[a] == old[a]:
            a += 1
        while a < b and new[b-1] == old[b-1]:
            b -= 1
    else:
        # Skip values that are correct
        while a < b and new[a] == value:
            a += 1
        while a < b and new[b-1] == value:
            b -= 1
    if a == b:
        return []  # Nothing to change
    best = None
    for s in range(a, b):
        for e in range(s+1, b+1):
            target = new[s]
            # Only consider updates whose first and last week share the target value
            if target == new[e-1]:
                moves = [(s, e, target)] + best_messages(s, e, target) + best_messages(a, s, value) + best_messages(e, b, value)
                if best is None or len(moves) < len(best):
                    best = moves
    return best

old = [100,100,100,100,100,100,100,100,100,100,100,100]
new = [120,120,150,150,120,120,200,200,200,120,250,120]
for s, e, value in best_messages(0, len(old)):
    print("Week {} to Week {} = {}".format(s+1, e, value))
The basic principle is that it only makes sense to consider updates where we set the first and last in the update to the final target value because otherwise we can make the update shorter and still take the same number of moves.
Update
I think it can be optimized to work in O(n^3) time if you change:
for s in range(a,b):
to
s=a
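As a sanity check on any proposed list of messages, a small Python helper can replay them in order and compare the result with the target prices. The name apply_messages is an assumption for illustration; the (start, end, value) tuples use the same zero-based, end-exclusive convention as the moves in the code above.

```python
def apply_messages(prices, messages):
    """Apply (start, end, value) messages in order; `end` is exclusive."""
    result = list(prices)
    for start, end, value in messages:
        result[start:end] = [value] * (end - start)
    return result

old = [100] * 12
new = [120, 120, 150, 150, 120, 120, 200, 200, 200, 120, 250, 120]

# The four "layered" messages from the question, zero-based.
layered = [(0, 12, 120), (2, 4, 150), (6, 9, 200), (10, 11, 250)]

print(apply_messages(old, layered) == new)
```

Order matters here: the broad message for weeks 1 to 12 has to be applied before the narrower ones that overwrite parts of it.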
I am a beginner in Prolog and was wondering if there is an easy way to convert numbers to times, for comparison.
For example:
The two bus_info facts below give the bus name, capacity, the time it arrives at the city, and the time it departs the city.
bus_info(bus1,150, 12:30, 14:30).
bus_info(bus2, 200, 16:00, 18:00).
passenger_info(mike, 21, 17:30). % shows name, age, and time available
I want to check which bus Mike can catch. The answer is bus 2, but how do I calculate this in prolog?
You're just comparing times for a given day so you don't need to convert the numbers to any kind of system time encoding. You only need, say "minutes past midnight" or something like that. For example, 12:30 would be (12*60)+30 minutes past midnight. And you can use that as your comparison units for a daily schedule.
To capture your hours and minutes to do this calculation, if you were to "ask" in Prolog:
bus_info(Bus, Num, StartHH:StartMM, EndHH:EndMM).
You would get two results:
Bus = bus1
Num = 150
StartHH = 12
StartMM = 30
EndHH = 14
EndMM = 30
And
Bus = bus2
Num = 200
StartHH = 16
StartMM = 0
EndHH = 18
EndMM = 0
To assign a numeric value of an expression in Prolog, you need the is predicate. For example:
StartTime is (StartHH * 60) + StartMM.
That basic information should get you started if you've learned how Prolog predicates basically work.
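The "minutes past midnight" idea can be sketched outside Prolog as well. The Python snippet below is only an illustration of the arithmetic. It assumes that "can catch" means the passenger's available time falls between the bus's arrival and departure, which the question does not state explicitly.

```python
def minutes_past_midnight(hh, mm):
    """Convert a clock time to minutes past midnight."""
    return hh * 60 + mm

# name -> (capacity, arrival, departure), taken from the bus_info facts
buses = {
    "bus1": (150, minutes_past_midnight(12, 30), minutes_past_midnight(14, 30)),
    "bus2": (200, minutes_past_midnight(16, 0), minutes_past_midnight(18, 0)),
}

available = minutes_past_midnight(17, 30)  # mike's available time

# Assumption: a bus is catchable while it is in the city.
catchable = [
    name
    for name, (_capacity, arrive, depart) in buses.items()
    if arrive <= available <= depart
]
print(catchable)
```

In Prolog the same comparison would be written with is for the arithmetic and =< / >= for the tests.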
Can anyone help me with a method that calculates the IRR of a series of stock trades?
Let's say the scenario is:
$10,000 of stock #1 purchased 1/1 and sold 1/7 for $11,000 (+10%)
$20,000 of stock #2 purchased 1/1 and sold 1/20 for $21,000 (+5%)
$15,000 of stock #3 purchased on 1/5 and sold 1/18 for $14,000 (-6.7%)
This should be helpful: http://www.rubyquiz.com/quiz156.html
But I couldn't figure out how to adapt any of the solutions since they assume the period of each return is over a consistent period (1 year).
I finally found exactly what I was looking for: http://rubydoc.info/gems/finance/1.1.0/Finance/Cashflow
gem install finance
To solve the scenario I posted originally:
include Finance
trans = []
trans << Transaction.new( -10000, date: Time.new(2012,1,1) )
trans << Transaction.new( 11000, date: Time.new(2012,1,7) )
trans << Transaction.new( -20000, date: Time.new(2012,1,1) )
trans << Transaction.new( 21000, date: Time.new(2012,1,20) )
trans << Transaction.new( -15000, date: Time.new(2012,1,5) )
trans << Transaction.new( 14000, date: Time.new(2012,1,18) )
trans.xirr.apr.to_f.round(2)
I also found this simple method: https://gist.github.com/1364990
However, it gave me some trouble. I tried a half dozen different test cases and one of them would raise an exception that I was never able to debug. But the xirr() method in this Finance gem worked for every test case I could throw at it.
For an investment that has an initial value and final value, as is the case with your example data that includes purchase price, sell price and a holding period, you only need to find holding period yield.
Holding period yield is calculated by subtracting 1 from holding period return
HPY = HPR - 1
HPR = final value/initial value
HPY = 11,000/10,000 - 1 = 1.1 - 1 = 0.10 = 10%
HPY = 21,000/20,000 - 1 = 1.05 - 1 = 0.05 = 5%
HPY = 14,000/15,000 - 1 = 0.9333 - 1 = -0.0667 = -6.7%
This article explains holding period return and yield
You can also annualize the holding period return and holding period yield using the following formulas, where n is the holding period in years
AHPR = HPR^(1/n)
AHPY = AHPR - 1
The above formulas only apply if you have a single period return as is the case with your example stock purchase and sale.
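As a quick illustration of these formulas, here is a small Python sketch. The function names are made up for this example, and the annualized version assumes n is expressed as days / 365.

```python
def holding_period_yield(initial, final):
    """HPY = HPR - 1, where HPR = final value / initial value."""
    return final / initial - 1

def annualized_yield(initial, final, days, days_per_year=365):
    """AHPY = HPR ** (1 / n) - 1, assuming n = days / days_per_year."""
    years = days / days_per_year
    return (final / initial) ** (1 / years) - 1

# The three trades from the question.
for initial, final in [(10000, 11000), (20000, 21000), (15000, 14000)]:
    print(f"{holding_period_yield(initial, final):.1%}")
```

Note that annualizing a holding period of only a few days produces very large numbers, because the short-term gain is compounded over the whole year.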
Yet if you had multiple returns, for example, you purchased stock A on 1/1 for 100 and its closing price over the next week climbed and fell to 98, 103, 101, 100, 99, 104,
then you will have to look beyond HPR and HPY. In this case you can calculate ARR and GRR. Try out these online calculators for arithmetic rate of return and geometric rate of return.
But then if you had a date schedule for your investments then none of these would apply. You would then have to resort to finding IRR for irregular cash flows. IRR is the internal rate of return for periodic cash flows. For irregular cash flows such as for stock trade, the term XIRR is used. XIRR is an Excel function that calculates internal rate of return for irregular cash flows. To find XIRR you would need a series of cash flows and a date schedule for the cash flows.
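To make the XIRR idea concrete, here is a minimal Python sketch that finds the rate at which the date-weighted NPV is zero. It uses plain bisection rather than the Newton-Raphson or secant methods, simply because bisection is easier to keep robust in a short example. The function names, the 365-day year, the search bracket, and the year 2012 for the trade dates are all assumptions for illustration.

```python
from datetime import date

def xnpv(rate, flows):
    """Net present value of (date, amount) flows at an annual rate."""
    start = min(day for day, _ in flows)
    return sum(
        amount / (1 + rate) ** ((day - start).days / 365)
        for day, amount in flows
    )

def xirr(flows, low=-0.99, high=10.0, iterations=200):
    """Find a rate with xnpv == 0 by bisection inside [low, high]."""
    f_low = xnpv(low, flows)
    if f_low * xnpv(high, flows) > 0:
        raise ValueError("no sign change in the search bracket")
    for _ in range(iterations):
        mid = (low + high) / 2
        f_mid = xnpv(mid, flows)
        if f_low * f_mid <= 0:
            high = mid                 # root lies in [low, mid]
        else:
            low, f_low = mid, f_mid    # root lies in [mid, high]
    return (low + high) / 2

# The three trades from the question, as dated cash flows.
flows = [
    (date(2012, 1, 1), -10000), (date(2012, 1, 7), 11000),
    (date(2012, 1, 1), -20000), (date(2012, 1, 20), 21000),
    (date(2012, 1, 5), -15000), (date(2012, 1, 18), 14000),
]
rate = xirr(flows)
print(round(rate, 2))
```

The result is an annualized rate, so a modest gain over less than three weeks shows up as a large yearly figure.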
Finance.ThinkAndDone.com explains IRR in much more detail than the articles you cited on RubyQuiz and Wiki. The IRR article on Think & Done explains IRR calculation with Newton Raphson method and Secant method using either the NPV equation set to 0 or the profitability index equation set to 1. The site also provides online IRR and XIRR calculators
I don't know anything about finance, but it makes sense to me that if you want to know the rate of return over 6 months, it should be the rate which equals the yearly rate when compounded twice. If you want to know the rate for 3 months, it should be the rate which equals the yearly rate when compounded 4 times, etc. This implies that converting from a yearly return rate to a rate for an arbitrary period is closely related to calculating roots. If you express the yearly return rate as a proportion of the original amount (i.e. express 20% return as 1.2, 100% return as 2.0, etc), then you can get the 6-month return rate by taking the square root of that number.
Ruby has a very handy way to calculate all kinds of complex roots: the exponentiation operator, **.
n ** 0.5 # square root
n ** (1.0/3.0) # 3rd root
...and so on.
So I think you should be able to convert a yearly rate of return to one for an arbitrary period by:
yearly_return ** (days.to_f / 365)
Likewise, to convert a daily, weekly, or monthly rate of return to a yearly rate:
yearly_return = daily_return ** 365
yearly_return = weekly_return ** 52
yearly_return = monthly_return ** 12
...and so on.
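The same conversions can be sketched in Python, which also uses ** for exponentiation. The function names are made up for illustration, and every rate is expressed as a growth factor (1.2 for a 20% return), as described above.

```python
def yearly_to_period(yearly_factor, days):
    """Growth factor over `days`, given a yearly growth factor."""
    return yearly_factor ** (days / 365)

def period_to_yearly(period_factor, periods_per_year):
    """Yearly growth factor, given the factor for one shorter period."""
    return period_factor ** periods_per_year

# A 21% yearly return corresponds to about 10% per half year.
print(yearly_to_period(1.21, 365 / 2))

# A 1% monthly return compounds to roughly 12.7% per year.
print(period_to_yearly(1.01, 12))
```

Subtract 1 from a factor to get back to the usual percentage form.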
As far as I can see (from reading the Wikipedia article), the IRR calculation is not actually dependent on the time period used. If you give a series of yearly cash flows as input, you get a yearly rate. If you give a series of daily cash flows as input, you get a daily rate, and so on.
I suggest you use one of the solutions you linked to to calculate IRR for daily or weekly cash flows (whatever is convenient), and convert that to a yearly rate using exponentiation. You will have to add 1 to the output of the irr() method (so that 10% return will be 1.1 rather than 0.1, etc).
Using the daily cash flows for the example you gave, you could do this to get daily IRR:
irr([-30000,0,0,0,-15000,0,11000,0,0,0,0,0,0,0,0,0,0,14000,0,21000])
You can use the Exonio library:
https://github.com/Noverde/exonio
and use it like this:
Exonio.irr([-100, 39, 59, 55, 20]) # ==> 0.28095
I believe the main obstacle to understanding your scenario is the lack of a cash flow for each of the stocks. Cash flows are an essential ingredient for computing any type of IRR, and without them none of the formulas can be used. If you clarify this, I can help you solve your problem.
Heberto del Rio
There is a new gem, 'finance_math', that solves this problem very easily:
https://github.com/kolosek/finance_math
I have 2 independent but contiguous date ranges. The first range is the start and end date for a project. Let's say start = 3/21/10 and end = 5/16/10. The second range is a month boundary (say 3/1/10 to 3/31/10, 4/1/10 to 4/30/10, etc.). I need to figure out how many days in each month fall into the first range.
The answer to my example above is March = 11 (counting both 3/21 and 3/31), April = 30, May = 16.
I am trying to figure out an excel formula or VBA function that will give me this value.
Any thoughts on an algorithm for this? I feel it should be rather easy but I can't seem to figure it out.
I have a formula which will return TRUE/FALSE if ANY part of the month range is within the project start/end but not the number of days. That function is below.
return month_start <= project_end And month_end >= project_start
I think I figured it out.
=MAX( MIN(project_end, month_end) - MAX(project_start,month_start) + 1 , 0 )
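The same overlap calculation can be sketched in Python to check the numbers. The function name overlap_days is an assumption for illustration, and both endpoints are counted, which is what the + 1 in the formula does.

```python
from datetime import date

def overlap_days(project_start, project_end, month_start, month_end):
    """Days shared by the two ranges, counting both endpoints."""
    latest_start = max(project_start, month_start)
    earliest_end = min(project_end, month_end)
    return max((earliest_end - latest_start).days + 1, 0)

project = (date(2010, 3, 21), date(2010, 5, 16))
months = {
    "March": (date(2010, 3, 1), date(2010, 3, 31)),
    "April": (date(2010, 4, 1), date(2010, 4, 30)),
    "May": (date(2010, 5, 1), date(2010, 5, 31)),
    "June": (date(2010, 6, 1), date(2010, 6, 30)),
}
result = {name: overlap_days(*project, *bounds) for name, bounds in months.items()}
print(result)
```

With inclusive counting, March comes out as 11 days (3/21 through 3/31), and a month entirely outside the project, such as June, is clamped to 0 by the outer max.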