How to approach Vertical Sticks challenge? - algorithm

This problem is taken from interviewstreet.com
Given an array of integers Y = [y1, ..., yn], we have n line segments such that
the endpoints of segment i are (i, 0) and (i, yi). Imagine that from the
top of each segment a horizontal ray is shot to the left; this ray
stops when it touches another segment or hits the y-axis. We
construct an array of n integers, v1, ..., vn, where vi is equal to the
length of the ray shot from the top of segment i. We define V(y1, ..., yn)
= v1 + ... + vn.
For example, if we have Y=[3,2,5,3,3,4,1,2], then v1, ..., v8 =
[1,1,3,1,1,3,1,2] (the original statement includes a picture illustrating this):
For each permutation p of [1,...,n], we can calculate V(yp1, ...,
ypn). If we choose a uniformly random permutation p of [1,...,n], what
is the expected value of V(yp1, ..., ypn)?
Input Format
The first line of input contains a single integer T (1 <= T <= 100). T
test cases follow.
The first line of each test case is a single integer N (1 <= N <= 50).
The next line contains positive integers y1, ..., yN separated by
single spaces (0 < yi <= 1000).
Output Format
For each test case, output the expected value of V(yp1, ..., ypn), rounded
to two digits after the decimal point.
Sample Input
6
3
1 2 3
3
3 3 3
3
2 2 3
4
10 2 4 4
5
10 10 10 5 10
6
1 2 3 4 5 6
Sample Output
4.33
3.00
4.00
6.00
5.80
11.15
Explanation
Case 1: We have V(1,2,3) = 1+2+3 = 6, V(1,3,2) = 1+2+1 = 4, V(2,1,3) =
1+1+3 = 5, V(2,3,1) = 1+2+1 = 4, V(3,1,2) = 1+1+2 = 4, V(3,2,1) =
1+1+1 = 3. Average of these values is 4.33.
Case 2: No matter what the permutation is, V(yp1, yp2, yp3) = 1+1+1 =
3, so the answer is 3.00.
Case 3: V(y1, y2, y3) = V(y2, y1, y3) = 5, V(y1, y3, y2) = V(y2, y3, y1) = 4,
V(y3, y1, y2) = V(y3, y2, y1) = 3, and the average of these values is
4.00.
A naive solution that enumerates all permutations will run forever for N = 50. I believe the problem can be solved by calculating a value for each stick independently, but I would like to know what justifies treating the sticks independently, and whether there is any other efficient approach.
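For reference, the definition can be brute-forced directly in Python (exponential in N, so only usable for tiny inputs, but handy as a baseline to test faster solutions against):

from itertools import permutations

def V(ys):
    # total ray length for one arrangement: ray i stops at the nearest
    # stick to its left that is at least as high, or at the y-axis
    total = 0
    for i, y in enumerate(ys):
        length = i + 1                    # hits the y-axis if nothing blocks it
        for j in range(i - 1, -1, -1):
            if ys[j] >= y:
                length = i - j
                break
        total += length
    return total

def expected_V(ys):
    perms = list(permutations(ys))
    return sum(V(p) for p in perms) / len(perms)

print(round(expected_V([1, 2, 3]), 2))   # 4.33
print(round(expected_V([2, 2, 3]), 2))   # 4.0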

We can solve this problem by figuring out:
if the k-th stick is put in the i-th position, what is the expected ray length of this stick?
Then the problem can be solved by adding up the expected lengths for all sticks in all positions.
Let expected[k][i] be the expected ray length of the k-th stick put in the i-th position, and let num[k][i][length] be the number of permutations in which the k-th stick is put in the i-th position with ray length equal to length. Then
expected[k][i] = sum( num[k][i][length] * length ) / N!
How do we compute num[k][i][length]? For example, for length=3, consider the following diagram:
...GxxxI...
where I is the position, the three x's are sticks that are strictly lower than I, and G is a stick that is at least as high as I.
Let s_i be the number of sticks that are smaller than the k-th stick, and g_i the number of sticks that are greater than or equal to the k-th stick. We can choose any one of the g_i sticks to put in the G position and any ordered choice of length of the s_i smaller sticks to fill the x positions, so we have:
num[k][i][length] = P(s_i, length) * g_i * (n - length - 2)!
In the case where all the positions before I hold smaller sticks, i.e. xxxI..., we don't need a taller stick at G, and we have:
num[k][i][length] = P(s_i, length) * (n - length - 1)!
And here's a piece of Python code that solves this problem. (Note that here length counts the smaller sticks the ray passes over, so the actual ray length is length + 1; the trailing + n adds the mandatory first unit of every ray.)
import math

factorial = [math.factorial(i) for i in range(51)]   # precomputed up to N = 50

def solve(n, ys):
    ret = 0
    for y_i in ys:
        s_i = sum(1 for x in ys if x < y_i)          # sticks strictly smaller
        g_i = sum(1 for x in ys if x >= y_i) - 1     # possible blockers, excluding itself
        for i in range(n):                           # position of the stick
            for length in range(1, i + 1):           # smaller sticks crossed by the ray
                if length == i:                      # ray reaches the y-axis
                    t_ret = math.comb(s_i, length) * factorial[length] * factorial[n - length - 1]
                else:                                # ray stopped by a blocker
                    t_ret = math.comb(s_i, length) * factorial[length] * g_i * factorial[n - length - 2]
                ret += t_ret * length
    return ret / factorial[n] + n
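A quick check against two of the samples (using the code above):

print(round(solve(3, [1, 2, 3]), 2))  # 4.33
print(round(solve(3, [3, 3, 3]), 2))  # 3.0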

This is the same question as https://cs.stackexchange.com/questions/1076/how-to-approach-vertical-sticks-challenge and my answer there (which is a little simpler than those given earlier here) was:
Imagine a different problem: if you had to place k sticks of equal height in n slots, then the expected distance between adjacent sticks (and between the first stick and a notional slot 0, and between the last stick and a notional slot n+1) is (n+1)/(k+1), since there are k+1 gaps to fit into a length of n+1.
Returning to this problem, a particular stick is interested in how many sticks (including itself) are as high or higher. If this number is k, then the expected gap before it is also (n+1)/(k+1).
So the algorithm is simply to find this value for each stick and add up the expectations. For example, starting with heights of 3,2,5,3,3,4,1,2, the numbers of sticks with greater or equal height are 5,7,1,5,5,2,8,7, so the expectation is 9/6+9/8+9/2+9/6+9/6+9/3+9/9+9/8 = 15.25.
This is easy to program: for example a single line in R
V <- function(Y){(length(Y) + 1) * sum(1 / (rowSums(outer(Y, Y, "<=")) + 1) )}
gives the values in the sample output in the original problem
> V(c(1,2,3))
[1] 4.333333
> V(c(3,3,3))
[1] 3
> V(c(2,2,3))
[1] 4
> V(c(10,2,4,4))
[1] 6
> V(c(10,10,10,5,10))
[1] 5.8
> V(c(1,2,3,4,5,6))
[1] 11.15
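The same formula ports directly to Python (a sketch; for each stick it counts how many sticks are at least as high, itself included):

def V_expected(Y):
    n = len(Y)
    return sum((n + 1) / (1 + sum(1 for x in Y if x >= y)) for y in Y)

print(round(V_expected([3, 2, 5, 3, 3, 4, 1, 2]), 2))  # 15.25
print(round(V_expected([1, 2, 3, 4, 5, 6]), 2))        # 11.15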

As you correctly noted, we can solve the problem independently for each stick.
Let F(i, len) be the number of permutations in which the ray from stick i has length exactly len.
Then the answer is
(Sum over i, len of F(i, len) * len) / n!
All that is left is to count F(i, len). Let a(i) be the number of sticks j with y_j < y_i, and b(i) the number of other sticks with y_j >= y_i (these are the ones that can block the ray; sample case 2 shows equal heights block).
In order to get a ray of length len, we need a situation like this:
B, l...l, O
   (len-1 times)
where O is stick #i, B is either a stick at least as high as O or the beginning of the row, and l is a stick lower than the i-th.
This gives us 2 cases:
1) B is the beginning; this can be achieved in P(a(i), len-1) * (b(i)+a(i)-(len-1))! ways.
2) B is a blocking stick; this can be achieved in P(a(i), len-1) * b(i) * (b(i)+a(i)-len)! * (n-len) ways.
edit: corrected b(i) as the 2nd factor of the product in case 2, in place of a(i).
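Here is that counting in Python (a sketch based on my reading of the two cases, with a(i) counting strictly smaller sticks and b(i) the possible blockers; note b(i)+a(i) = n-1, so the factorials below match the formulas above):

from math import factorial, perm

def expected_V(ys):
    n = len(ys)
    total = 0
    for i, y in enumerate(ys):
        a = sum(1 for j, x in enumerate(ys) if j != i and x < y)
        b = sum(1 for j, x in enumerate(ys) if j != i and x >= y)
        for ln in range(1, n + 1):
            f = perm(a, ln - 1) * factorial(n - ln)    # case 1: B is the beginning
            if n - 1 - ln >= 0:                        # case 2: B is a blocking stick
                f += perm(a, ln - 1) * b * factorial(n - 1 - ln) * (n - ln)
            total += f * ln
    return total / factorial(n)

print(round(expected_V([1, 2, 3]), 2))  # 4.33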

Related

Finding Probability of Multinomial distribution

A bag contains 5 dice, each with six faces having probabilities p_1 = p_2 = p_3 = 2*p_4 = 2*p_5 = 3*p_6. What is the probability of selecting two dice with face 4 and three dice with face 1?
Someone has tried R code for this problem (shown in a picture), but I do not understand how the probability is obtained. Kindly explain the answer to this problem.
First, find the p_i values based on the given equalities and the following constraint (rewrite every p_i in terms of p_6 using their stated relations, solve for p_6, then recover each p_i):
p_1 + p_2 + p_3 + p_4 + p_5 + p_6 = 1
To find the probability, we need to select 2 out of the 5 dice with face 4, which has probability p_4^2, and the other dice should have face 1, which has probability p_1^3.
Now, to understand the last step, you should read the explanation of the dmultinom(x, size, prob) function (from this post):
Generate multinomially distributed random number vectors and compute multinomial probabilities. If x is a K-component vector, dmultinom(x, prob) is the probability
P(X[1]=x[1], … , X[K]=x[k]) = C * prod(j=1 , …, K) p[j]^x[j]. where C is the ‘multinomial coefficient’ C = N! / (x[1]! * … * x[K]!) and N = sum(j=1, …, K) x[j].
By definition, each component X[j] is binomially distributed as Bin(size, prob[j]) for j = 1, …, K.
Therefore, dmultinom(j, n, p) means C * p_1^3 * p_4^2 with C = 5!/(3! * 2!) = 10, as j = (3, 0, 0, 2, 0, 0), n = 5, and p = (p_1, p_2, p_3, p_4, p_5, p_6).
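The same number can be checked directly in Python (a sketch; the fractions just encode the stated relations, from which p_1 = 3/13):

from fractions import Fraction
from math import comb

# p_1 = p_2 = p_3 = 2*p_4 = 2*p_5 = 3*p_6 and the six probabilities sum to 1,
# so 3*p_1 + 2*(p_1/2) + p_1/3 = 13*p_1/3 = 1, giving p_1 = 3/13
p1 = Fraction(3, 13)
p4 = p1 / 2                        # = 3/26

prob = comb(5, 2) * p1**3 * p4**2  # C = 5!/(3!*2!) = 10
print(prob, float(prob))           # 1215/742586, about 0.00164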

Haskell Performance Optimization

I am writing code to find the nth Ramanujan-Hardy number. A Ramanujan-Hardy number is defined as
n = a^3 + b^3 = c^3 + d^3
meaning n can be expressed as a sum of two cubes in two different ways.
I wrote the following code in Haskell:
-- my own implementation of cube root. Expected time complexity is O(n^(1/3))
cube_root n = chelper 1 n
  where
    chelper i n = if i*i*i > n then (i-1) else chelper (i+1) n

-- checks if the given number can be expressed as a^3 + b^3 = c^3 + d^3 (is it a Ramanujan-Hardy number?)
is_ram n = length [a | a<-[1..crn], b<-[(a+1)..crn], c<-[(a+1)..crn], d<-[(c+1)..crn], a*a*a + b*b*b == n && c*c*c + d*d*d == n] /= 0
  where
    crn = cube_root n

-- finds the nth Ramanujan number by iterating from 1 until the nth number is found. In the recursion, if x is a Ramanujan number, decrement n, else increment x. When n reaches 0, the preceding number was the desired Ramanujan number.
ram n = give_ram 1 n
  where
    give_ram x 0 = (x-1)
    give_ram x n = if is_ram x then give_ram (x+1) (n-1) else give_ram (x+1) n
In my opinion, time complexity to check if a number is Ramanujan number is O(n^(4/3)).
On running this code in ghci, it is taking time even to find 2nd Ramanujan number.
What are possible ways to optimize this code?
First a small clarification of what we're looking for. A Ramanujan-Hardy number is one which may be written two different ways as a sum of two cubes, i.e. a^3+b^3 = c^3 + d^3 where a < b and a < c < d.
An obvious idea is to generate all of the cube-sums in sorted order and then look for adjacent sums which are the same.
Here's a start - a function which generates all of the cube sums with a given first cube:
cubes a = [ (a^3+b^3, a, b) | b <- [a+1..] ]
All of the possible cube sums in order would then be just:
allcubes = sort $ concat [ cubes 1, cubes 2, cubes 3, ... ]
but of course this won't work, since concat and sort don't work
on infinite lists.
However, since cubes a is an increasing sequence we can sort all of
the sequences together by merging them:
allcubes = cubes 1 `merge` cubes 2 `merge` cubes 3 `merge` ...
Here we are taking advantage of Haskell's lazy evaluation. The definition
of merge is just:
merge [] bs = bs
merge as [] = as
merge as@(a:at) bs@(b:bt)
  = case compare a b of
      LT -> a : merge at bs
      EQ -> a : b : merge at bt
      GT -> b : merge as bt
We still have a problem since we don't know where to stop. We can solve that
by having cubes a initiate cubes (a+1) at the appropriate time, i.e.
cubes a = ...an initial part... ++ (...the rest... `merge` cubes (a+1) )
The definition is accomplished using span:
cubes a = first ++ (rest `merge` cubes (a+1))
  where
    s = (a+1)^3 + (a+2)^3
    (first, rest) = span (\(x,_,_) -> x < s) [ (a^3+b^3,a,b) | b <- [a+1..] ]
So now cubes 1 is the infinite series of all the possible sums a^3 + b^3 where a < b in sorted order.
To find the Ramanujan-Hardy numbers, we just group adjacent elements of the list together which have the same first component:
sameSum (x,a,b) (y,c,d) = x == y
rjgroups = groupBy sameSum $ cubes 1    -- requires: import Data.List (groupBy)
The groups we are interested in are those whose length is > 1:
rjnumbers = filter (\g -> length g > 1) rjgroups
The first 10 solutions are:
ghci> take 10 rjnumbers
[(1729,1,12),(1729,9,10)]
[(4104,2,16),(4104,9,15)]
[(13832,2,24),(13832,18,20)]
[(20683,10,27),(20683,19,24)]
[(32832,4,32),(32832,18,30)]
[(39312,2,34),(39312,15,33)]
[(40033,9,34),(40033,16,33)]
[(46683,3,36),(46683,27,30)]
[(64232,17,39),(64232,26,36)]
[(65728,12,40),(65728,31,33)]
Your is_ram function checks for a Ramanujan number by trying all values of a, b, c, d up to the cube root, and then you loop over all n.
An alternative approach would be to simply loop over values of a and b up to some limit and increment an array at index a^3+b^3 by 1 for each choice.
The Ramanujan numbers can then be found by iterating over the non-zero values in this array and returning places where the array content is >= 2 (meaning that at least 2 ways have been found of computing that result).
I believe this would be O(n^(2/3)), compared to your method, which is O(n * n^(4/3)).
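That counting idea looks like this in Python (a sketch of the approach just described, using a dictionary in place of the array):

from collections import defaultdict

def ramanujan_numbers(limit):
    # count representations n = a^3 + b^3 with a < b, for all n <= limit
    counts = defaultdict(int)
    a = 1
    while a**3 + (a + 1)**3 <= limit:
        b = a + 1
        while a**3 + b**3 <= limit:
            counts[a**3 + b**3] += 1
            b += 1
        a += 1
    return sorted(n for n, c in counts.items() if c >= 2)

print(ramanujan_numbers(70000)[:10])
# [1729, 4104, 13832, 20683, 32832, 39312, 40033, 46683, 64232, 65728]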

Number of ways to reach N from 0 using only 2 or 3?

I am solving this problem where we need to reach from X = 0 to X = N. We can only take a step of 2 or 3 at a time.
Each step of 2 has a probability of 0.2 and each step of 3 has a probability of 0.8. How can we find the total probability of reaching N?
e.g. for reaching 5:
2+3 with probability 0.2 * 0.8 = 0.16
3+2 with probability 0.8 * 0.2 = 0.16, total = 0.32.
My initial thoughts:
The number of ways can be found out by a simple Fibonacci-like recurrence:
f(n) = f(n-3) + f(n-2)
But how do we remember the numbers so that we can multiply them to find the probability?
This can be solved using dynamic programming.
Let F(N) = the probability of reaching 0 using only steps of 2 and 3 when the starting number is N:
F(N) = 0.2*F(N-2) + 0.8*F(N-3)
Base cases:
F(0) = 1 and F(k) = 0 for k < 0
So the DP code would be something like this:
F[0] = 1;
for (int i = 1; i <= N; i++) {
    if (i >= 3)
        F[i] = 0.2*F[i-2] + 0.8*F[i-3];
    else if (i >= 2)
        F[i] = 0.2*F[i-2];
    else
        F[i] = 0;
}
return F[N];
This algorithm runs in O(N).
Some clarifications about this solution: I assume the only allowed operation for generating the number from 2s and 3s is addition (your definition would allow subtraction as well) and that the input numbers are always valid (2 <= input). Definition: a unique row of numbers means no other row with the same numbers of 3s and 2s in another order is in scope.
We can reduce the problem into multiple smaller problems:
Problem A: finding all sequences of numbers that can sum up to the given number (unique rows of numbers only).
Start by finding the minimum number of 3s required to build the given number, which is simply input % 2. The maximum number of 3s that can be used to build the input can be calculated this way:
int max_3 = (int) (input / 3);
if (input - 3 * max_3 == 1)
    --max_3;
Now all sequences of numbers that sum up to input must hold between input % 2 and max_3 3s (stepping by 2, so the remainder is always an even number of 2s). The number of 2s can be easily calculated from a given number of 3s.
Problem B: calculating the probability for a given list and its permutations to be the result.
For each unique row of numbers, we can easily derive all permutations. Since these consist of the same numbers, they are equally likely to appear and produce the same sum. The likelihood can be calculated easily from the row: 0.8 ^ number_of_3s * 0.2 ^ number_of_2s. The next step is to calculate the number of distinct permutations, i.e. all possible distributions of the 2s in the sequence: (number_of_2s + number_of_3s)! / (number_of_3s! * number_of_2s!).
Now from theory to practice.
Since the math is given, the rest is pretty straightforward:
define prob:
    input: int num
    output: double

    double result = 0.0
    int min_3s = num % 2
    int max_3s = num / 3
    if (num - 3 * max_3s == 1)
        --max_3s
    for int c3s in [min_3s, max_3s] step 2
        int c2s = (num - (c3s * 3)) / 2
        double p = 0.8 ^ c3s * 0.2 ^ c2s
        p *= (c3s + c2s)! / (c3s! * c2s!)
        result += p
    return result
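A direct Python translation of this pseudocode (a sketch; note the loop steps c3s by 2 so the leftover amount is always divisible by 2):

from math import comb

def prob(num):
    result = 0.0
    max_3s = num // 3
    if num - 3 * max_3s == 1:
        max_3s -= 1
    for c3s in range(num % 2, max_3s + 1, 2):  # c3s must match num's parity
        c2s = (num - 3 * c3s) // 2
        p = 0.8**c3s * 0.2**c2s
        result += p * comb(c3s + c2s, c3s)     # number of distinct orderings
    return result

print(prob(5))  # 0.32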
Instead of jumping into the programming, you can use math.
Let p(n) be the probability that you reach the location that is n steps away.
Base cases:
p(0)=1
p(1)=0
p(2)=0.2
Linear recurrence relation
p(n+3)=0.2 p(n+1) + 0.8 p(n)
You can solve this in closed form by finding the exponential solutions to the linear recurrent relation.
c^3 = 0.2 c + 0.8
c = 1, (-5 +- sqrt(55)i)/10
Although this was cubic, c=1 will always be a solution in this type of problem since there is a constant nonzero solution.
Because the roots are distinct, all solutions are of the form a1(1)^n + a2((-5+sqrt(55)i)/10)^n + a3((-5-sqrt(55)i)/10)^n. You can solve for a1, a2, and a3 using the initial conditions:
a1 = 5/14
a2 = (99+sqrt(55)i)/308
a3 = (99-sqrt(55)i)/308
This gives you a nonrecursive formula for p(n):
p(n) = 5/14 + ((99+sqrt(55)i)/308)((-5+sqrt(55)i)/10)^n + ((99-sqrt(55)i)/308)((-5-sqrt(55)i)/10)^n
(Each coefficient is paired with its root; plugging in n = 0, 1, 2 recovers 1, 0, 0.2.)
One nice property of the non-recursive formula is that you can read off the asymptotic value of 5/14. That is also clear because the average length of a jump is 2(1/5) + 3(4/5) = 14/5, so you almost surely hit a set with density 1/(14/5) = 5/14 among the integers. You can use the magnitude of the other roots, 2/sqrt(5) ~ 0.894, to see how rapidly the probabilities approach the asymptote:
5/14 - (|a2|+|a3|) 0.894^n < p(n) < 5/14 + (|a2|+|a3|) 0.894^n
|5/14 - p(n)| < (|a2|+|a3|) 0.894^n
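A quick numerical cross-check of the closed form against the recurrence (a sketch in Python):

a2 = (99 + 55**0.5 * 1j) / 308
a3 = (99 - 55**0.5 * 1j) / 308
r2 = (-5 + 55**0.5 * 1j) / 10
r3 = (-5 - 55**0.5 * 1j) / 10

def p_closed(n):
    # the imaginary parts cancel, so taking the real part is safe
    return (5/14 + a2 * r2**n + a3 * r3**n).real

p = [1.0, 0.0, 0.2]                       # p(0), p(1), p(2)
for n in range(3, 11):
    p.append(0.2 * p[n-2] + 0.8 * p[n-3])

print(p[5], round(p_closed(5), 10))       # both 0.32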
f(n, p) = f(n-3, p*0.8) + f(n-2, p*0.2)
Start p at 1.
If n = 0, return p; if n < 0, return 0.
Instead of using the (terribly inefficient) recursive algorithm, start from the start and calculate in how many ways you can reach subsequent steps, i.e. using 'dynamic programming'. This way, you can easily calculate the probabilities and also have a complexity of only O(n) to calculate everything up to step n.
For each step, memorize the possible ways of reaching that step, if any (no matter how), and the probability of reaching that step. For the zeroth step (the start) this is (1, 1.0).
steps = [(1, 1.0)]
Now, for each consecutive step n, get the previously computed possible ways poss and probability prob to reach steps n-2 and n-3 (or (0, 0.0) in case of n < 2 or n < 3 respectively), add those to the combined possibilities and probability to reach that new step, and add them to the list.
for n in range(1, 10):
    poss2, prob2 = steps[n-2] if n >= 2 else (0, 0.0)
    poss3, prob3 = steps[n-3] if n >= 3 else (0, 0.0)
    steps.append((poss2 + poss3, prob2 * 0.2 + prob3 * 0.8))
Now you can just get the numbers from that list:
>>> for n, (poss, prob) in enumerate(steps):
...     print("%s\t%s\t%s" % (n, poss, prob))
0 1 1.0
1 0 0.0
2 1 0.2
3 1 0.8
4 1 0.04
5 2 0.32 <-- 2 ways to get to 5 with combined prob. of 0.32
6 2 0.648
7 3 0.096
8 4 0.3856
9 5 0.5376
(Code is in Python)
Note that this will get you both the number of possible ways of reaching a certain step (e.g. "first 2, then 3" or "first 3, then 2" for 5), and the probability to reach that step in one go. Of course, if you need only the probability, you can just use single numbers instead of tuples.

Number of Paths in a Triangle

I recently encountered a much more difficult variation of this problem, but realized I couldn't generate a solution for this very simple case. I searched Stack Overflow but couldn't find a resource that previously answered this.
You are given a triangle ABC, and you must compute the number of paths of a certain length that start and end at 'A'. Say our function f(3) is called; it must return the number of paths of length 3 that start and end at A: 2 (ABA, ACA).
I'm having trouble formulating an elegant solution. Right now, I've written a solution that generates all possible paths, but for larger lengths the program is just too slow. I know there must be a nice dynamic programming solution that reuses sequences we've previously computed, but I can't quite figure it out. All help greatly appreciated.
My dumb code:
def paths(n, sequence):
    t = ['A', 'B', 'C']
    if len(sequence) < n:
        for node in set(t) - set(sequence[-1]):
            paths(n, sequence + node)
    else:
        if sequence[0] == 'A' and sequence[-1] == 'A':
            print(sequence)   # e.g. paths(3, 'A') prints ABA and ACA
Let PA(n) be the number of paths from A back to A in exactly n steps.
Let P!A(n) be the number of paths from B (or C) to A in exactly n steps.
Then:
PA(1) = 1
PA(n) = 2 * P!A(n - 1)
P!A(1) = 0
P!A(2) = 1
P!A(n) = P!A(n - 1) + PA(n - 1)
= P!A(n - 1) + 2 * P!A(n - 2) (for n > 2) (substituting for PA(n-1))
We can solve the difference equations for P!A analytically, as we do for Fibonacci, by noting that (-1)^n and 2^n are both solutions of the difference equation, and then finding coefficients a, b such that P!A(n) = a*2^n + b*(-1)^n.
We end up with the equation P!A(n) = 2^n/6 + (-1)^n/3, and PA(n) being 2^(n-1)/3 - 2(-1)^n/3.
This gives us code:
def PA(n):
    # integer division is exact: 2^(n-1) + 2*(-1)^(n-1) is always divisible by 3
    return (pow(2, n-1) + 2*pow(-1, n-1)) // 3

for n in range(1, 30):
    print(n, PA(n))
Which gives output:
1 1
2 0
3 2
4 2
5 6
6 10
7 22
8 42
9 86
10 170
11 342
12 682
13 1366
14 2730
15 5462
16 10922
17 21846
18 43690
19 87382
20 174762
21 349526
22 699050
23 1398102
24 2796202
25 5592406
26 11184810
27 22369622
28 44739242
29 89478486
The trick is not to try to generate all possible sequences. The number of them increases exponentially, so the memory required would be too great.
Instead, let f(n) be the number of sequences of length n beginning and ending with A, and let g(n) be the number of sequences of length n beginning with A but ending with B. To get things started, clearly f(1) = 1 and g(1) = 0. For n > 1 we have f(n) = 2g(n - 1), because the penultimate letter will be B or C and there are equal numbers of each. We also have g(n) = f(n - 1) + g(n - 1), because if a sequence begins with A and ends with B, the penultimate letter is either A or C.
These rules allow you to compute the numbers really quickly using memoization.
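In Python the two recurrences need only a pair of rolling variables (a sketch):

def f(n):
    # fn = length-n paths A..A; gn = length-n paths A..B (A..C is equal by symmetry)
    fn, gn = 1, 0   # n = 1: just the path "A"
    for _ in range(n - 1):
        fn, gn = 2 * gn, fn + gn
    return fn

print([f(n) for n in range(1, 8)])  # [1, 0, 2, 2, 6, 10, 22]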
My method is like this:
Define DP(l, end) = number of paths that end at end and have length l.
Then DP(l, 'A') = DP(l-1, 'B') + DP(l-1, 'C'), and similarly for DP(l, 'B') and DP(l, 'C').
For the base case, i.e. l = 1, I check whether end is 'A': if it is not, I return 0, otherwise 1, so that all bigger states only count paths that start at 'A'.
The answer is simply DP(n, 'A'), where n is the length.
Below is sample code in C++; calling it with 3 gives the answer 2, and calling it with 5 gives the answer 6:
ABCBA, ACBCA, ABABA, ACACA, ABACA, ACABA
#include <bits/stdc++.h>
using namespace std;

int dp[500][500], n;

int DP(int l, int end){
    if(l <= 0) return 0;
    if(l == 1){
        if(end != 'A') return 0;
        return 1;
    }
    if(dp[l][end] != -1) return dp[l][end];
    if(end == 'A') return dp[l][end] = DP(l-1, 'B') + DP(l-1, 'C');
    else if(end == 'B') return dp[l][end] = DP(l-1, 'A') + DP(l-1, 'C');
    else return dp[l][end] = DP(l-1, 'A') + DP(l-1, 'B');
}

int main() {
    memset(dp, -1, sizeof(dp));
    scanf("%d", &n);
    printf("%d\n", DP(n, 'A'));
    return 0;
}
EDITED
To answer OP's comment below:
Firstly, DP (dynamic programming) is always about state.
Remember that here our state is DP(l, end), representing the number of paths having length l and ending at end. To implement states in a program we usually use an array, so DP[500][500] is nothing special, just space to store the states DP(l, end) for all possible l and end (that's why I said: if you need a bigger length, change the size of the array).
But then you may ask: I understand the first dimension, which is for l, and 500 means l can be as large as 500, but what about the second dimension? I only need 'A', 'B', 'C'; why use 500?
Here is another trick (of C/C++): the char type can be used as an int type by default, and its value is equal to its ASCII code. I do not remember the full ASCII table, of course, but I know that around 300 is enough to represent all the ASCII characters, including A (65), B (66), C (67).
So I just declare any size large enough to represent 'A', 'B', 'C' in the second dimension (actually 100 is more than enough, but I did not think much about it and declared 500, as they are almost the same in terms of order).
So you asked what DP[3][1] means: it means nothing, as I never compute the second dimension with value 1. (One can think of the state dp(3, 1) as having no physical meaning in our problem.)
In fact, I am always using 65, 66, 67,
so DP[3][65] means the number of paths of length 3 that end at char(65) = 'A'.
You can do better than the dynamic programming/recursion solution others have posted, for the given triangle and more general graphs. Whenever you are trying to compute the number of walks in a (possibly directed) graph, you can express this in terms of the entries of powers of a transfer matrix. Let M be a matrix whose entry m[i][j] is the number of paths of length 1 from vertex i to vertex j. For a triangle, the transfer matrix is
0 1 1
1 0 1
1 1 0
Then M^n is a matrix whose i,j entry is the number of paths of length n from vertex i to vertex j. If A corresponds to vertex 1, you want the 1,1 entry of M^n.
Dynamic programming and recursion for the counts of paths of length n in terms of the paths of length n-1 are equivalent to computing M^n with n multiplications, M * M * M * ... * M, which can be fast enough. However, if you want to compute M^100, instead of doing 100 multiplies, you can use repeated squaring: Compute M, M^2, M^4, M^8, M^16, M^32, M^64, and then M^64 * M^32 * M^4. For larger exponents, the number of multiplies is about c log_2(exponent).
Instead of using that a path of length n is made up of a path of length n-1 and then a step of length 1, this uses that a path of length n is made up of a path of length k and then a path of length n-k.
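A minimal Python sketch of repeated squaring on this transfer matrix (note the exponent counts steps, i.e. edges, so the 3-letter paths ABA and ACA of the original example correspond to M^2):

def mat_mult(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_pow(M, n):
    # repeated squaring: about log2(n) multiplications instead of n
    R = [[int(i == j) for j in range(3)] for i in range(3)]
    while n:
        if n & 1:
            R = mat_mult(R, M)
        M = mat_mult(M, M)
        n >>= 1
    return R

M = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(mat_pow(M, 2)[0][0])  # 2 closed walks A->A in 2 steps: ABA, ACA
print(mat_pow(M, 4)[0][0])  # 6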
We can solve this with a for loop, although Anonymous described a closed form for it.
function f(n){
    var as = 0, abcs = 1;
    for (n = n - 3; n > 0; n--){
        as = abcs - as;
        abcs *= 2;
    }
    return 2*(abcs - as);
}
Here's why:
Look at one strand of the decision tree (the other one is symmetrical):
A
B C...
A C
B C A B
A C A B B C A C
B C A B B C A C A C A B B C A B
Num A's        Num ABC's (starting with first B on the left)
0              1
1 (1-0)        2
1 (2-1)        4
3 (4-1)        8
5 (8-3)        16
11 (16-5)      32
Clearly, we can't use the strands that end with the A's...
You can write a recursive brute force solution and then memoize it (aka top down dynamic programming). Recursive solutions are more intuitive and easy to come up with. Here is my version:
from functools import lru_cache

# search space (we have a triangle with nodes)
nodes = ["A", "B", "C"]

@lru_cache(maxsize=None)  # memoize!
def recurse(length, steps):
    # if the length of the path is n and the last node is "A", then it's
    # a valid path and we can count it. Tracking the net number of steps
    # around the triangle, we are back at A exactly when steps % 3 == 0.
    if length == n:
        return 1 if steps % 3 == 0 else 0
    # we don't want paths having len > n.
    if length > n:
        return 0
    # from each position, we have two possibilities: either go to the next
    # node or the previous node. Total paths will be the sum of both the
    # possibilities. We do this recursively.
    return recurse(length + 1, steps + 1) + recurse(length + 1, steps - 1)

n = 3
print(recurse(1, 0))  # 2 (ABA and ACA)

Find the sum of least common multiples of all subsets of a given set

Given: set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
Asked: Find the sum of the least common multiples (LCM) of all subsets of A of size at least 2.
The LCM of a set B = {b0, b1, ..., bk-1} is defined as the minimum integer Bmin such that bi | Bmin for all 0 ≤ i < k.
Example:
Let N = 3 and A = {2, 6, 7}, then:
LCM({2, 6}) = 6
LCM({2, 7}) = 14
LCM({6, 7}) = 42
LCM({2, 6, 7}) = 42
----------------------- +
answer 104
The naive approach would be to simply calculate the LCM for all O(2^N) subsets, which is not feasible for reasonably large N.
Solution sketch:
The problem is obtained from a competition*, which also provided a solution sketch. This is where my problem comes in: I do not understand the hinted approach.
The solution reads (modulo some small fixed grammar issues):
The solution is a bit tricky. If we observe carefully we see that the integers are between 2 and 500. So, if we prime factorize the numbers, we get the following maximum powers:
2 8
3 5
5 3
7 3
11 2
13 2
17 2
19 2
Other than this, all primes have power 1. So, we can easily calculate all possible states, using these integers, leaving 9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 states, which is nearly 70000. For other integers we can make a dp like the following: dp[70000][i], where i can be 0 to 100. However, as dp[i] is dependent on dp[i-1], so dp[70000][2] is enough. This leaves the complexity to n * 70000 which is feasible.
I have the following concrete questions:
What is meant by these states?
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
How is dp[i] computed from dp[i-1]?
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
*The original problem description can be found from this source (problem F). This question is a simplified version of that description.
Discussion
After reading the actual contest description (page 10 or 11) and the solution sketch, I have to conclude the author of the solution sketch is quite imprecise in their writing.
The high level problem is to calculate an expected lifetime if components are chosen randomly by fair coin toss. This is what's leading to computing the LCM of all subsets -- all subsets effectively represent the sample space. You could end up with any possible set of components. The failure time for the device is based on the LCM of the set. The expected lifetime is therefore the average of the LCM of all sets.
Note that this ought to include the LCM of sets with only one item (in which case we'd take the LCM to be the element itself). The solution sketch seems to skip this, perhaps because they handled it in a less elegant manner.
What is meant by these states?
The sketch author only uses the word state twice, but apparently manages to switch meanings. In the first use of the word state it appears they're talking about a possible selection of components. In the second use they're likely talking about possible failure times. They could be muddling this terminology because their dynamic programming solution initializes values from one use of the word and the recurrence relation stems from the other.
Does dp stand for dynamic programming?
I would say either it does or it's a coincidence as the solution sketch seems to heavily imply dynamic programming.
If so, what recurrence relation is being solved? How is dp[i] computed from dp[i-1]?
All I can think is that in their solution, state i represents a time to failure, T(i), with the number of times this time to failure has been counted, dp[i]. The resulting sum would then be the sum over all i of dp[i] * T(i).
dp[i][0] would then be the failure times counted for only the first component. dp[i][1] would then be the failure times counted for the first and second component. dp[i][2] would be for the first, second, and third. Etc..
Initialize dp[i][0] with zeroes except for dp[T(c)][0] (where c is the first component considered) which should be 1 (since this component's failure time has been counted once so far).
To populate dp[i][n] from dp[i][n-1] for each component c:
For each i, copy dp[i][n-1] into dp[i][n].
Add 1 to dp[T(c)][n].
For each i, add dp[i][n-1] to dp[LCM(T(i), T(c))][n].
What is this doing? Suppose you knew that you had a time to failure of j, but you added a component with a time to failure of k. Regardless of what components you had before, your new time to fail is LCM(j, k). This follows from the fact that for two sets A and B, LCM(A ∪ B) = LCM(LCM(A), LCM(B)).
Similarly, if we're considering a time to failure of T(i) and our new component's time to failure of T(c), the resultant time to failure is LCM(T(i), T(c)). Note that we recorded this time to failure for dp[i][n-1] configurations, so we should record that many new times to failure once the new component is introduced.
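The same update is easy to express in Python with a dictionary keyed directly by LCM value, skipping the editorial's factorization-state compression (a sketch, not the editorial's exact method):

from math import lcm   # Python 3.9+

def sum_of_subset_lcms(a):
    # counts[v] = number of nonempty subsets processed so far with LCM v
    counts = {}
    for x in a:
        new = dict(counts)
        new[x] = new.get(x, 0) + 1            # the subset {x} itself
        for v, c in counts.items():
            m = lcm(v, x)
            new[m] = new.get(m, 0) + c        # extend every earlier subset by x
        counts = new
    total = sum(v * c for v, c in counts.items())
    return total - sum(a)                     # drop the size-1 subsets

print(sum_of_subset_lcms([2, 6, 7]))  # 104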
Why do the big primes not contribute to the number of states?
Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
You're right, of course. However, the solution sketch states that numbers with large primes are handled in another (unspecified) fashion.
What would happen if we did include them? The number of states we would need to represent would explode into an impractical number. Hence the author accounts for such numbers differently. Note that if a number less than or equal to 500 includes a prime larger than 19 the other factors multiply to 21 or less. This makes such numbers amenable for brute forcing, no tables necessary.
The first part of the editorial seems useful, but the second part is rather vague (and perhaps unhelpful; I'd rather finish this answer than figure it out).
Let's suppose for the moment that the input consists of pairwise distinct primes, e.g., 2, 3, 5, and 7. Then the answer (for summing all sets, where the LCM of 0 integers is 1) is
(1 + 2) (1 + 3) (1 + 5) (1 + 7),
because the LCM of a subset is exactly equal to the product here, so just multiply it out.
Let's relax the restriction that the primes be pairwise distinct. If we have an input like 2, 2, 3, 3, 3, and 5, then the multiplication looks like
(1 + (2^2 - 1) 2) (1 + (2^3 - 1) 3) (1 + (2^1 - 1) 5),
because 2 appears with multiplicity 2, and 3 appears with multiplicity 3, and 5 appears with multiplicity 1. With respect to, e.g., just the set of 3s, there are 2^3 - 1 ways to choose a subset that includes a 3, and 1 way to choose the empty set.
Call a prime small if it's 19 or less and large otherwise. Note that integers 500 or less are divisible by at most one large prime (with multiplicity). The small primes are more problematic. What we're going to do is to compute, for each possible small portion of the prime factorization of the LCM (i.e., one of the ~70,000 states), the sum of LCMs for the problem derived by discarding the integers that could not divide such an LCM and leaving only the large prime factor (or 1) for the other integers.
For example, if the input is 2, 30, 41, 46, and 51, and the state is 2, then we retain 2 as 1, discard 30 (= 2 * 3 * 5; 3 and 5 are small), retain 41 as 41 (41 is large), retain 46 as 23 (= 2 * 23; 23 is large), and discard 51 (= 3 * 17; 3 and 17 are small). Now, we compute the sum of LCMs using the previously described technique. Use inclusion-exclusion to get rid of the subsets whose LCM's small portion properly divides the state instead of being exactly equal to it. Maybe I'll work a complete example later.
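The product formulas above are easy to sanity-check by brute force (a sketch; both printed values match the hand multiplication):

from itertools import combinations
from math import lcm

def total_all_subsets(a):
    # sum of LCMs over all 2^n subsets of positions, the empty set counting as 1
    return sum(lcm(*sub) if sub else 1
               for r in range(len(a) + 1)
               for sub in combinations(a, r))

print(total_all_subsets([2, 3, 5, 7]))        # 576 = (1+2)(1+3)(1+5)(1+7)
print(total_all_subsets([2, 2, 3, 3, 3, 5]))  # 924 = (1 + 3*2)(1 + 7*3)(1 + 1*5)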
What is meant by these states?
I think here, states refer to whether a number is in the set B = {b0, b1, ..., bk-1} of LCMs of subsets of A.
Does dp stand for dynamic programming and if so, what recurrence relation is being solved?
dp in the solution sketch stands for dynamic programming, I believe.
How is dp[i] computed from dp[i-1]?
It is feasible to figure out the next group of LCM states from the previous states, so we only need an array of two layers and can toggle back and forth between them.
Why do the big primes not contribute to the number of states? Each of them occurs either 0 or 1 times. Should the number of states not be multiplied by 2 for each of these primes (leading to a non-feasible state space again)?
We can represent a number using only its prime factorization exponents.
Here is one example.
6 = (2^1)(3^1)(5^0) -> state "1 1 0" to represent 6
18 = (2^1)(3^2)(5^0) -> state "1 2 0" to represent 18
Here is how we can get the LCM of 6 and 18 using prime factorization:
LCM(6,18) = (2^max(1,1)) (3^max(1,2)) (5^max(0,0)) = (2^1)(3^2)(5^0) = 18
2^9 > 500, 3^6 > 500, 5^4 > 500, 7^4>500, 11^3 > 500, 13^3 > 500, 17^3 > 500, 19^3 > 500
we can use only the counts of exponents of the primes 2, 3, 5, 7, 11, 13, 17, 19 to represent the LCMs in the set B = {b0, b1, ..., bk-1}
for the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500.
9 * 6 * 4 * 4 * 3 * 3 * 3 * 3 = 69984 <= 70000, so we only need two copies of dp[9][6][4][4][3][3][3][3] to keep track of all the LCM states. So dp[70000][2] is enough.
I put together a small C++ program to illustrate how we can get the sum of LCMs of the given set A = {a0, a1, ..., aN-1} (1 ≤ N ≤ 100), with 2 ≤ ai ≤ 500. Following the solution sketch, we loop through at most 70000 possible LCMs per element.
#include <iostream>
#include <cstring>
using namespace std;

int gcd(int a, int b) {
    int remainder = 0;
    do {
        remainder = a % b;
        a = b;
        b = remainder;
    } while (b != 0);
    return a;
}

int lcm(int a, int b) {
    if (a == 0 || b == 0) {
        return 0;
    }
    return (a * b) / gcd(a, b);
}

int sum_of_lcm(int A[], int N) {
    // get the max LCM from the array
    int max = A[0];
    for (int i = 1; i < N; i++) {
        max = lcm(max, A[i]);
    }
    max++;

    int dp[max][2];
    memset(dp, 0, sizeof(dp));
    int pri = 0;
    int cur = 1;

    // loop through n x 70000
    for (int i = 0; i < N; i++) {
        for (int v = 1; v < max; v++) {
            int x = A[i];
            if (dp[v][pri] > 0) {
                x = lcm(A[i], v);
                dp[v][cur] = (dp[v][cur] == 0) ? dp[v][pri] : dp[v][cur];
                if (x % A[i] != 0) {
                    dp[x][cur] += dp[v][pri] + dp[A[i]][pri];
                } else {
                    dp[x][cur] += (x == v) ? (dp[v][pri] + dp[v][pri]) : (dp[v][pri]);
                }
            }
        }
        dp[A[i]][cur]++;
        pri = cur;
        cur = (pri + 1) % 2;
    }

    for (int i = 0; i < N; i++) {
        dp[A[i]][pri] -= 1;
    }

    long total = 0;
    for (int j = 0; j < max; j++) {
        if (dp[j][pri] > 0) {
            total += dp[j][pri] * j;
        }
    }
    cout << "total:" << total << endl;
    return total;
}

int test() {
    int a[] = {2, 6, 7};
    int n = sizeof(a)/sizeof(a[0]);
    int total = sum_of_lcm(a, n);
    return 0;
}
Output
total:104
The states are one more than the maximum powers of the primes. You have numbers up to 2^8, so the power of 2 is in [0..8], which is 9 states. Similarly for the other primes.
"dp" could well stand for dynamic programming, I'm not sure.
The recurrence relation is the heart of the problem, so you will learn more by solving it yourself. Start with some small, simple examples.
For the large primes, try solving a reduced problem without using them (or their equivalents) and then add them back in to see their effect on the final result.
