I have a sorted list of overlapping intervals; no interval is ever fully contained in another, e.g.,
[(7, 11), (9, 14), (12, 17)]
The constraint for the output is to keep every element as close as possible to its
origin (the middle of the interval), preserve the order of the input, and remove all overlap. Only an
approximate solution is necessary. The expected result for the example input would be:
[(5,9), (9, 14), (14, 19)]
I'm only aware of solutions that go about this in some simulation
style: shift each element by some value in a free direction and
iterate until all overlap has been removed.
Is there an existing algorithm to solve this?
find the overall average:
in our example:
(7 + 11 + 9 + 14 + 12 + 17)/6 = 11.667
find the total length:
(11-7) + (14-9) + (17-12) = 4 + 5 + 5 = 14;
find the new min/max;
14/2 = 7
11.667 - 7 = 4.667
11.667 + 7 = 18.667
you can round 'em
4.667 ~ 5
18.667 ~ 19
start from the min, creating the sections by the intervals
(5, (11-7)+5) = (5,9)
(9, (14-9)+9) = (9,14)
(14, (17-12)+14) = (14,19)
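The steps above can be sketched in Python roughly as follows. This is only a minimal sketch of the procedure just described; the function name pack_around_mean and the decision to round the start point are illustrative choices, not something the answer prescribes.

```python
def pack_around_mean(intervals):
    """Lay the intervals end to end, centred on the mean of all endpoints."""
    endpoints = [p for interval in intervals for p in interval]
    mean = sum(endpoints) / len(endpoints)
    total_length = sum(hi - lo for lo, hi in intervals)
    # New minimum; rounding is optional, as in the answer.
    start = round(mean - total_length / 2)
    result = []
    for lo, hi in intervals:
        end = start + (hi - lo)  # keep the original length
        result.append((start, end))
        start = end              # the next interval begins where this one ends
    return result


print(pack_around_mean([(7, 11), (9, 14), (12, 17)]))
# [(5, 9), (9, 14), (14, 19)]
```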
NOTE:
this method will not keep each interval as close as possible to its own original position, but it keeps the whole group centred on the original overall average (preserving the center)
EDIT:
if you want to keep the averages of all intervals as close as possible to the original, you can implement a mathematical solution.
our problem's input is:
a_1 = (a_{1,1}, a_{1,2}), ..., a_n = (a_{n,1}, a_{n,2})
we will define:
ai_k = a_{k,2} - a_{k,1} // the length of the k-th interval
b_1 = (d, d + ai_1)
b_n = (d + sum(ai_1..ai_{n-1}), d + sum(ai_1..ai_n))
bi_k = b_{k,2} - b_{k,1} = ai_k // the lengths are preserved
we need to find a 'd' such that:
s = sum over k of abs((a_{k,1} + a_{k,2})/2 - (b_{k,1} + b_{k,2})/2)
is minimal; min(s) is what we want
in our example:
a1 = (7,11), ai1 = 4, Aavg1 = 9
a2 = (9,14), ai2 = 5, Aavg2 = 11.5
a3 = (12,17), ai3 = 5, Aavg3 = 14.5
b1 = (d, d+4) Bavg1 = d+2
b2 = (d+4, d+9) Bavg2 = d+6.5
b3 = (d+9, d+14) Bavg3 = d+11.5
s = abs(9-(d+2)) + abs(11.5-(d+6.5)) + abs(14.5-(d+11.5)) = abs(7-d) + abs(5-d) + abs(3-d)
now calculate the derivative to find the min/max OR iterate over d to get a result. in our case you will need to iterate from 3 to 7 (s is smallest at d = 5, where s = 2 + 0 + 2 = 4)
that should do the trick
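Instead of iterating over d, one can use the fact that a sum of absolute values abs(t_k - d) is minimised when d is a median of the t_k. The sketch below relies on that fact; the name pack_min_abs_shift is hypothetical, and the code assumes a non-empty input list.

```python
from statistics import median


def pack_min_abs_shift(intervals):
    """Pack intervals end to end, choosing d to minimise the summed centre shifts."""
    offsets = []
    prefix = 0
    for lo, hi in intervals:
        length = hi - lo
        # Original centre minus the packed centre for d = 0.
        offsets.append((lo + hi) / 2 - (prefix + length / 2))
        prefix += length
    d = median(offsets)  # minimises sum(abs(t - d) for t in offsets)
    result = []
    start = d
    for lo, hi in intervals:
        result.append((start, start + (hi - lo)))
        start += hi - lo
    return result


print(pack_min_abs_shift([(7, 11), (9, 14), (12, 17)]))
```

For the example the offsets are 7, 5 and 3, so d = 5 and the packed intervals start at 5, 9 and 14.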
Given that the solution must be order-preserving, we can formulate this problem as a linear program. Let [a_i, b_i] be the ith interval. Let variables x_i be the left shift of the ith interval and y_i be the right shift of the ith interval.
minimize sum_i (x_i + y_i)
subject to
(*) for all i: b_i - x_i + y_i ≤ a_{i+1} - x_{i+1} + y_{i+1}
for all i: x_i, y_i ≥ 0
Rewrite constraint (*) by introducing a variable z_i.
for all i: x_i - y_i - x_{i+1} + y_{i+1} - z_i = 0
for all i: z_i ≥ b_i - a_{i+1}
Now the problem is reduced to computing a minimum-cost circulation, which can be done in poly-time. I have a feeling, however, that there's a more direct solution to this problem.
The graph looks something like
(*)
---- | ----
/ z| \
/ i| \
/ xi | xi+1 \
|/ <---- v <---- \|
... (*) ...
----> ---->
yi yi+1
I have been working on a Hackerearth Problem. Here is the problem statement:
We have three variables a, b and c. We need to convert a to b and following operations are allowed:
1. Can decrement by 1.
2. Can decrement by 2.
3. Can multiply by c.
Find the minimum number of steps required to convert a to b.
Here is the algorithm I came up with:
Initialize count to 0.
Loop through till a === b:
1. Perform (x = a * c), (y = a - 1) and (z = a - 2).
2. Among x, y and z, choose the one whose absolute difference with b is the least.
3. Update the value of a to the value chosen among x, y and z.
4. Increment the count by 1.
I can get past the basic test case, but all my advanced cases are failing. I think my logic is correct, but it seems to fail because of its complexity.
Can someone suggest a more optimized solution?
Edit 1
Sample Code
function findMinStep(arr) {
    let a = parseInt(arr[0]);
    let b = parseInt(arr[1]);
    let c = parseInt(arr[2]);
    let numOfSteps = 0;
    while(a !== b) {
        let multiply = Math.abs(b - (a * c));
        let decrement = Math.abs(b - (a - 1));
        let doubleDecrement = Math.abs(b - (a - 2));
        let abs = Math.min(multiply, decrement, doubleDecrement);
        if(abs === multiply) a = a * c;
        else if(abs === decrement) a -= 1;
        else a -= 2;
        numOfSteps += 1;
    }
    return numOfSteps.toString()
}
Sample Input: a = 3, b = 10, c = 2
Explanation: Multiply 3 with 2 to get 6, subtract 1 from 6 to get 5, multiply 5 with 2 to get 10.
Reason for tagging both Python and JS: Comfortable with both but I am not looking for code, just an optimized algorithm and analytical thinking.
Edit 2:
function findMinStep(arr) {
    let a = parseInt(arr[0]);
    let b = parseInt(arr[1]);
    let c = parseInt(arr[2]);
    let depth = 0;
    let queue = [a, 'flag'];
    if(a === b ) return 0
    if(a > b) {
        let output = Math.floor((a - b) / 2);
        if((a - b) % 2) return output + 1;
        return output
    }
    while(true) {
        let current = queue.shift();
        if(current === 'flag') {
            depth += 1;
            queue.push('flag');
            continue;
        }
        let multiple = current * c;
        let decrement = current - 1;
        let doubleDecrement = current -2;
        if (multiple !== b) queue.push(multiple);
        else return depth + 1
        if (decrement !== b) queue.push(decrement);
        else return depth + 1
        if (doubleDecrement !== b) queue.push(doubleDecrement);
        else return depth + 1
    }
}
Still times out. Any more suggestions?
Link to the question for your reference.
BFS
A greedy approach won't work here.
However it is already on the right track. Consider the graph G, where each node represents a value and each edge represents one of the operations and connects two values that are related by that operation (e.g.: 4 and 3 are connected by "subtract 1"). Using this graph, we can easily perform a BFS-search to find the shortest path:
def a_to_b(a, b, c):
    visited = set()
    state = {a}
    depth = 0
    while b not in state:
        visited |= state
        # every value reachable with one more operation that was not seen before
        state = {v - 1 for v in state if v - 1 not in visited} | \
                {v - 2 for v in state if v - 2 not in visited} | \
                {v * c for v in state if v * c not in visited}
        depth += 1
    return depth  # number of operations on the shortest path
This search systematically tests all possible combinations of operations, step by step, until it reaches b. I.e. generate all values that can be reached with a single operation from a, then all values that can be reached with two operations, etc., until b is among the generated values.
In depth analysis
(Assuming c >= 0, but can be generalized)
So much for the standard approach, which works with little analysis. It has the advantage that it works for any problem of this kind and is easy to implement. However, it isn't very efficient and will reach its limits fairly fast once the numbers grow. So instead I'll show a way to analyze the problem in depth and obtain a (far) more performant solution:
In a first step this answer will analyze the problem:
We need operations -->op such that a -->op b and -->op is a sequence of
subtract 1
subtract 2
multiply by c
First of all, what happens if we first subtract and afterwards multiply?
(a - x) * c = a * c - x * c
Next what happens, if we first multiply and afterwards subtract?
a * c - x'
Positional systems
Well, there's no simplifying transformation for this. But we've got the basic pieces to analyze more complicated chains of operations. Let's see what happens when we chain subtractions and multiplications alternatingly:
(((a - x) * c - x') * c - x'') * c - x'''=
((a * c - x * c - x') * c - x'') * c - x''' =
(a * c^2 - x * c^2 - x' * c - x'') * c - x''' =
a * c^3 - x * c^3 - x' * c^2 - x'' * c - x'''
Looks familiar? We're one step away from defining the difference between a and b in a positional system base c:
a * c^3 - x * c^3 - x' * c^2 - x'' * c - x''' = b
x * c^3 + x' * c^2 + x'' * c + x''' = a * c^3 - b
Unfortunately the above is still not quite what we need. All we can tell is that the LHS of the equation will always be >=0. In general, we first need to derive the proper exponent n (3 in the above example), s.t. it is minimal, nonnegative and a * c^n - b >= 0. Solving this for the individual coefficients (x, x', ...), where all coefficients are non-negative is a fairly trivial task.
We can show two things from the above:
if a < b and a < 0, there is no solution
solving as above and transforming all coefficients into the appropriate operations leads to the optimal solution
Proof of optimality
The second statement above can be proven by induction over n.
n = 0: In this case a - b < c, so there is only one -->op
n + 1: let d = a * c^(n + 1) - b. Let d' = d - m * c^(n + 1), where m is chosen, such that d' is minimal and nonnegative. Per induction-hypothesis d' can be generated optimally via a positional system. Leaving a difference of exactly m * c^n. This difference can not be covered more efficiently via lower-order terms than by m / 2 subtractions.
Algorithm (The TLDR-part)
Consider a * c^n - b as a number in base c and find its digits. The final number should have n + 1 digits, where each digit represents a certain number of subtractions. Multiple subtractions are represented by a single digit by adding up the subtracted values, e.g. 5 means -2 -2 -1. Working from the most significant to the least significant digit, the algorithm operates as follows:
1. perform the subtractions as specified by the digit
2. if the current digit was the last, terminate
3. multiply by c and repeat from 1. with the next digit
E.g.:
a = 3, b = 10, c = 2
choose n = 2
a * c^n - b = 3 * 4 - 10 = 2
2 in binary is 010
steps performed: 3 - 0 = 3, 3 * 2 = 6, 6 - 1 = 5, 5 * 2 = 10
or
a = 2, b = 25, c = 6
choose n = 2
a * c^n - b = 47
47 base 6 is 115
steps performed: 2 - 1 = 1, 1 * 6 = 6, 6 - 1 = 5, 5 * 6 = 30, 30 - 2 - 2 - 1 = 25
in python:
def a_to_b(a, b, c):
    # calculate n
    n = 0
    pow_c = 1
    while a * pow_c - b < 0:
        n += 1
        pow_c *= c
    # calculate coefficients
    d = a * pow_c - b
    coeff = []
    for i in range(0, n + 1):
        coeff.append(d // pow_c)  # calculate x and append to terms
        d %= pow_c  # remainder after eliminating ith term
        pow_c //= c
    # sum up subtractions and multiplications as defined by the coefficients
    return n + sum(x // 2 + x % 2 for x in coeff)
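To see the actual operations rather than only their count, the same digit decomposition can be replayed step by step. The following is a rough sketch, not part of the original answer: the name a_to_b_steps and the string encoding of the operations are made up for illustration, and it assumes c >= 2 and a >= 1 (or a >= b) so that the search for n terminates.

```python
def a_to_b_steps(a, b, c):
    """Return the operations ('-1', '-2', '*c') chosen by the digit method."""
    # Smallest n with a * c**n - b >= 0.
    n, pow_c = 0, 1
    while a * pow_c - b < 0:
        n += 1
        pow_c *= c
    d = a * pow_c - b
    steps = []
    for i in range(n + 1):
        digit, d = divmod(d, pow_c)   # current base-c digit, most significant first
        twos, ones = divmod(digit, 2)  # cover the digit with as many -2 as possible
        steps += ["-2"] * twos + ["-1"] * ones
        if i < n:                      # multiply between digits, not after the last
            steps.append(f"*{c}")
        pow_c //= c
    return steps


print(a_to_b_steps(3, 10, 2))  # ['*2', '-1', '*2']
print(a_to_b_steps(2, 25, 6))  # ['-1', '*6', '-1', '*6', '-2', '-2', '-1']
```

The lengths of the two lists (3 and 7) match what a_to_b returns for the same inputs.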
I am given a set A and have to find the sum of Fibonacci(sum of S) over all non-empty subsets S of A.
Fibonacci(X) - Is the Xth Element of Fibonacci Series
For example, for A = {1,2,3}:
Fibonacci(1) + Fibonacci(2) + Fibonacci(3) + Fibonacci(1+2) + Fibonacci(2+3) + Fibonacci(1+3) + Fibonacci(1+2+3)
1 + 1 + 2 + 2 + 5 + 3 + 8 = 22
Is there any way I can find the sum without generating the subsets?
I can already find the sum of all subset sums easily,
i.e. sum of all subsets = (1+2+3)*pow(2, length of set - 1)
There surely is.
First, let's recall that the nth Fibonacci number equals
φ(n) = [φ^n - (-φ)^(-n)]/√5
where φ = (√5 + 1)/2 (Golden Ratio) and (-φ)^(-1) = (1-√5)/2. But to make this shorter, let me denote φ as A and (-φ)^(-1) as B.
Next, let's notice that a sum of Fibonacci numbers is a sum of powers of A and B:
[φ(n) + φ(m)]*√5 = A^n + A^m - B^n - B^m
Now what is enough to calc (in the {1,2,3} example) is
A^1 + A^2 + A^3 + A^{1+2} + A^{1+3} + A^{2+3} + A^{1+2+3}.
But hey, there's a simpler expression for this:
(A^1 + 1)(A^2 + 1)(A^3 + 1) - 1
Now, it is time to get the whole result.
Let our set be {n1, n2, ..., nk}. Then our sum will be equal to
Sum = 1/√5 * [(A^n1 + 1)(A^n2 + 1)...(A^nk + 1) - (B^n1 + 1)(B^n2 + 1)...(B^nk + 1)]
I think, mathematically, this is the "simplest" form of the answer as there's no relation between the n_i. However, there could be some room for computational optimization of this expression. In fact, I'm not sure at all if this (using real numbers) will work faster than the "straightforward" summing, but the question was about avoiding subset generation, so here's the answer.
I tested the answer from YakovL using Python 2.7. It works very well and is plenty quick. I cannot imagine that summing the sequence values would be quicker. Here's the implementation.
_phi = (5.**0.5 + 1.)/2.
A = lambda n: _phi**n
B = lambda n: (-_phi)**(-n)
prod = lambda it: reduce(lambda x, y: x*y, it)
subset_sum = lambda s: (prod(A(n)+1 for n in s) - prod(B(n)+1 for n in s))/5**0.5
And here are some test results:
print subset_sum({1, 2, 3})
# 22.0
# [Finished in 0.1s]
print subset_sum({1, 2, 4, 8, 16, 32, 64, 128, 256, 512})
# 7.29199318438e+213
# [Finished in 0.1s]
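If exact integer results are wanted for large inputs, the same product trick can be carried over from the real numbers A and B to the 2x2 Fibonacci matrix M = [[1, 1], [1, 0]], whose n-th power has Fibonacci(n) as its off-diagonal entry. Note that this swaps the floating-point closed form for integer matrix arithmetic, so it is a variation on the answer above rather than the method the answers describe; the helper names are hypothetical and the code is written for Python 3.

```python
def mat_mul(p, q):
    """Multiply two 2x2 integer matrices."""
    return [
        [p[0][0] * q[0][0] + p[0][1] * q[1][0], p[0][0] * q[0][1] + p[0][1] * q[1][1]],
        [p[1][0] * q[0][0] + p[1][1] * q[1][0], p[1][0] * q[0][1] + p[1][1] * q[1][1]],
    ]


def mat_pow(m, e):
    """Raise a 2x2 matrix to a non-negative integer power by repeated squaring."""
    result = [[1, 0], [0, 1]]
    while e:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def fib_subset_sum(s):
    """Sum of Fibonacci(sum of T) over all non-empty subsets T of s, exactly."""
    prod = [[1, 0], [0, 1]]
    for n in s:
        p = mat_pow([[1, 1], [1, 0]], n)
        p[0][0] += 1  # p becomes I + M**n
        p[1][1] += 1
        prod = mat_mul(prod, p)
    # prod = I + (sum of M**(sum of T)); the off-diagonal entry holds the answer.
    return prod[0][1]


print(fib_subset_sum({1, 2, 3}))  # 22
```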
Given the matrix A x A and a number of movements N.
And walking like a spiral:
right while possible, then
down while possible, then
left while possible, then
up while possible, repeat until got N.
Image with example (A = 8; N = 36)
In this example case, the final square is (4; 7).
My question is: Is it possible to use a generic formula to solve this?
Yes, it is possible to calculate the answer.
To do so, it will help to split up the problem into three parts.
(Note: I start counting at zero to simplify the math. This means that you'll have to add 1 to some parts of the answer. For instance, my answer to A = 8, N = 36 would be the final square (3; 6), which has the label 35.)
(Another note: this answer is quite similar to Nyavro's answer, except that I avoid the recursion here.)
In the first part, you calculate the labels on the diagonal:
(0; 0) has label 0.
(1; 1) has label 4*(A-1). The cycle can be evenly split into four parts (with your labels: 1..7, 8..14, 15..21, 22..28).
(2; 2) has label 4*(A-1) + 4*(A-3). After taking one cycle around the A x A matrix, your next cycle will be around a (A - 2) x (A - 2) matrix.
And so on. There are plenty of ways to now figure out the general rule for (K; K) (when 0 < K < A/2). I'll just pick the one that's easiest to show:
4*(A-1) + 4*(A-3) + 4*(A-5) + ... + 4*(A-(2*K-1)) =
4*A*K - 4*(1 + 3 + 5 + ... + (2*K-1)) =
4*A*K - 4*(K + (0 + 2 + 4 + ... + (2*K-2))) =
4*A*K - 4*(K + 2*(0 + 1 + 2 + ... + (K-1))) =
4*A*K - 4*(K + 2*(K*(K-1)/2)) =
4*A*K - 4*(K + K*(K-1)) =
4*A*K - 4*(K + K*K - K) =
4*A*K - 4*K*K =
4*(A-K)*K
(Note: check that 4*(A-K)*K = 28 when A = 8 and K = 1. Compare this to the label at (2; 2) in your example.)
Now that we know what labels are on the diagonal, we can figure out how many layers (say K) we have to remove from our A x A matrix so that the final square is on the edge. If we do this, then answering our question
What are the coordinates (X; Y) when I take N steps in a A x A matrix?
can be done by calculating this K and instead solve the question
What are the coordinates (X - K; Y - K) when I take N - 4*(A-K)*K steps in a (A - 2*K) x (A - 2*K) matrix?
To do this, we should find the largest integer K such that K < A/2 and 4*(A-K)*K <= N.
The solution to this is K = floor(A/2 - sqrt(A*A-N)/2).
All that remains is to find out the coordinates of a square that is N along the edge of some A x A matrix:
if 0*E <= N < 1*E, the coordinates are (0; N);
if 1*E <= N < 2*E, the coordinates are (N - E; E);
if 2*E <= N < 3*E, the coordinates are (E; 3*E - N); and
if 3*E <= N < 4*E, the coordinates are (4*E - N; 0).
Here, E = A - 1.
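The three parts can be put together in a short Python sketch. It follows the formulas above with 0-based counting, but it uses integer square roots (math.isqrt, Python 3.8+) instead of floating point, and it adds a special case for a remaining 1 x 1 matrix (E = 0), which the edge rules above do not cover. The function name final_square is an assumption for illustration.

```python
from math import isqrt


def final_square(a, n):
    """0-based (row, col) of label n on the clockwise spiral of an a x a matrix."""
    if not 0 <= n < a * a:
        return None
    # Largest k with 4*(a-k)*k <= n, i.e. k = floor((a - sqrt(a*a - n)) / 2).
    r = isqrt(a * a - n)
    if r * r < a * a - n:
        r += 1                      # r = ceil(sqrt(a*a - n))
    k = (a - r) // 2
    m = n - 4 * (a - k) * k         # steps left on the inner (a-2k) x (a-2k) matrix
    e = a - 2 * k - 1
    if e == 0:                      # inner matrix is a single cell
        row, col = 0, 0
    elif m < e:
        row, col = 0, m
    elif m < 2 * e:
        row, col = m - e, e
    elif m < 3 * e:
        row, col = e, 3 * e - m
    else:
        row, col = 4 * e - m, 0
    return (row + k, col + k)


print(final_square(8, 35))  # (3, 6)
```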
To conclude, here is a naive (layerNumber gives incorrect answers for large values of a due to float inaccuracy) Haskell implementation of this answer:
finalSquare :: Integer -> Integer -> Maybe (Integer, Integer)
finalSquare a n
  | Just (x', y') <- edgeSquare a' n' = Just (x' + k, y' + k)
  | otherwise = Nothing
  where
    k = layerNumber a n
    a' = a - 2*k
    n' = n - 4*(a-k)*k

edgeSquare :: Integer -> Integer -> Maybe (Integer, Integer)
edgeSquare a n
  | e == 0 && n == 0 = Just (0, 0)  -- 1 x 1 matrix: centre of an odd-sized spiral
  | n < 1*e = Just (0, n)
  | n < 2*e = Just (n - e, e)
  | n < 3*e = Just (e, 3*e - n)
  | n < 4*e = Just (4*e - n, 0)
  | otherwise = Nothing
  where
    e = a - 1

layerNumber :: Integer -> Integer -> Integer
layerNumber a n = floor $ aa/2 - sqrt(aa*aa-nn)/2
  where
    aa = fromInteger a
    nn = fromInteger n
Here is the possible solution:
f a n | n < (a-1)*1 = (0, n)
      | n < (a-1)*2 = (n-(a-1), a-1)
      | n < (a-1)*3 = (a-1, 3*(a-1)-n)
      | n < (a-1)*4 = (4*(a-1)-n, 0)
      | otherwise = add (1,1) (f (a-2) (n - 4*(a-1))) where
    add (x1, y1) (x2, y2) = (x1+x2, y1+y2)
This is a basic solution, it may be generalized further - I just don't know how much generalization you need. So you can get the idea.
Edit
Notes:
The solution is for 0-based index
Some check for existence is required (n >= a*a)
I'm going to propose a relatively simple workaround here which generates all the indices in O(A^2) time so that they can later be accessed in O(1) for any N. If A changes, however, we would have to execute the algorithm again, which would once more consume O(A^2) time.
I suggest you use a structure like this to store the indices to access your matrix:
Coordinate[] indices = new Coordinate[A*A]
Where Coordinate is just a pair of int.
You can then fill your indices array by using some loops:
(This implementation uses 1-based array access. Correct expressions containing i, sentinel and currentDirection accordingly if this is an issue.)
Coordinate[] directions = { {1, 0}, {0, 1}, {-1, 0}, {0, -1} };
Coordinate c = new Coordinate(1, 1);
int currentDirection = 1;
int i = 1;
int sentinel = A;
int sentinelIncrement = A - 1;
boolean sentinelToggle = false;
while(i <= A * A) {
    indices[i] = c;
    if (i >= sentinel) {
        sentinel += sentinelIncrement;
        if (sentinelToggle) {
            // sentinel has now advanced twice by this value
            sentinelIncrement -= 1;
        }
        sentinelToggle = !sentinelToggle;
        currentDirection = currentDirection mod 4 + 1;
    }
    c += directions[currentDirection];
    i++;
}
Alright, off to the explanation: I'm using a variable called sentinel to keep track of where I need to switch directions (directions are simply switched by cycling through the array directions).
The value of sentinel is incremented in such a way that it always has the index of a corner in our spiral. In your example the sentinel would take on the values 8, 15, 22, 28, 34, 39... and so on.
Note that the index of "sentinel" increases twice by 7 (8, 15 = 8 + 7, 22 = 15 + 7), then by 6 (28 = 22 + 6, 34 = 28 + 6), then by 5 and so on. In my while loop I used the boolean sentinelToggle for this. Each time we hit a corner of the spiral (this is exactly iff i == sentinel, which is where the if-condition comes in) we increment the sentinel by sentinelIncrement and change the direction we're heading. If sentinel has been incremented twice by the same value, the if-condition if (sentinelToggle) will be true, so sentinelIncrement is decreased by one. We have to decrease sentinelIncrement because our spiral gets smaller as we go on.
This goes on as long as i <= A*A, that is, as long as our array indices still has entries that have not been filled.
Note that this does not give you a closed formula for a spiral coordinate in respect to N (which would be O(1) ); instead it generates the indices for all N which takes up O(A^2) time and after that guarantees access in O(1) by simply calling indices[N].
O(A^2) hopefully shouldn't hurt too badly because I'm assuming that you'll also need to fill your matrix at some point, which also takes O(A^2).
If efficiency is a problem, consider getting rid of sentinelToggle so it doesn't mess up branch prediction. Instead, decrement sentinelIncrement every time the while condition is met. To get the same effect for your sentinel value, simply start sentinelIncrement at (A - 1) * 2 and every time the if-condition is met, execute:
sentinel += sentinelIncrement / 2
The integer division will have the same effect as only decreasing sentinelIncrement every second time. I didn't do this whole thing in my version because I think it might be more easily understandable with just a boolean value.
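For reference, here is a runnable Python sketch of the sentinel idea with the boolean toggle. It is not the author's code: it uses 0-based indices and (row, col) pairs, so the first corner sits at index A - 1, and the name spiral_indices is made up for illustration.

```python
def spiral_indices(a):
    """List of (row, col) cells of an a x a matrix in clockwise spiral order."""
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    indices = []
    row, col = 0, 0
    direction = 0
    sentinel = a - 1    # 0-based index of the next corner
    increment = a - 1   # distance from that corner to the following one
    toggle = False
    for i in range(a * a):
        indices.append((row, col))
        if i >= sentinel:
            sentinel += increment
            if toggle:  # the same increment has now been used twice
                increment -= 1
            toggle = not toggle
            direction = (direction + 1) % 4
        row += directions[direction][0]
        col += directions[direction][1]
    return indices


table = spiral_indices(8)
print(table[35])  # (3, 6)
```

After the O(A^2) precomputation, table[N] answers any query for that A in O(1).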
Hope this helps!
How to minimize function y1^2 + y2^2 + ... + yn^2 with constraints y1*y2*...*yn = c; y1,y2,...,yn > 0 using dynamic programming? I have tried to solve this problem, but I have no idea how to create a recurrent function.
You need to think how to reduce the problem into "smaller problem"
D(i,c) = min { D(i-1, c/y) + y^2 | 1 <= y <= c }
In the above, you reduce the problem from y1,y2,....,yi to y1,...,y_{i-1}, and check all possible assignments for y_i - and chose the best out of them.
Base clauses will be:
D(0,1) = 0
D(0,c) = Infinity c != 1
You can do a top-down or bottom-up DP solution with these recurrence formulas, assuming i,c are integers. (You might need to add a stop clause of D(i,c) = Infinity if c is not a natural number.)
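A top-down version of this recurrence might look like the sketch below. It assumes that all y values are positive integers, that only divisors of c need to be tried for y (otherwise c/y would not be a natural number), and that an empty product equals 1, so the recursion bottoms out at D(0, 1) = 0. The function name is hypothetical.

```python
from functools import lru_cache
import math


@lru_cache(maxsize=None)
def min_square_sum(i, c):
    """Minimum of y1^2 + ... + yi^2 over positive integers with product c."""
    if i == 0:
        return 0 if c == 1 else math.inf
    best = math.inf
    for y in range(1, c + 1):
        if c % y == 0:  # y must divide c so that c // y stays a natural number
            best = min(best, min_square_sum(i - 1, c // y) + y * y)
    return best


print(min_square_sum(4, 60))  # 42, from 2, 2, 3, 5
```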
Based on your question and the clarification given, I think dynamic programming is not necessary. The minimal solution is to choose y1, ..., yn to be the prime factors of c (with repeats).
Example: Given c = 60, we let y1 = 2, y2 = 2, y3 = 3, y4 = 5.
Then the sum is y1^2 + y2^2 + y3^2 + y4^2 = 4 + 4 + 9 + 25 = 42.
If we took fewer factors, then the sum would be bigger:
y1 = 60, then sum = 60^2 = 3600.
y1 = 6, y2 = 10, then sum = 6^2 + 10^2 = 136.
y1 = 3, y2 = 4, y3 = 5, then sum = 3^2 + 4^2 + 5^2 = 50.
Informal justification - it is always beneficial to split the product:
Suppose c = ab, with a and b ≥ 2.
Then c^2 = a^2 b^2. (a^2 b^2 > 2a^2) and (a^2 b^2 > 2b^2) are both true.
Adding these inequalities we get 2a^2 b^2 > 2a^2 + 2b^2.
Therefore c^2 = a^2 b^2 > a^2 + b^2.
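Under the assumption made in this answer (the number of factors is free and every factor is an integer of at least 2), the minimum can be computed by plain trial division. A minimal sketch, with a hypothetical helper name:

```python
def prime_factors(c):
    """Prime factors of c with repeats, by trial division."""
    factors = []
    p = 2
    while p * p <= c:
        while c % p == 0:
            factors.append(p)
            c //= p
        p += 1
    if c > 1:  # whatever remains is itself prime
        factors.append(c)
    return factors


ys = prime_factors(60)
print(ys, sum(y * y for y in ys))  # [2, 2, 3, 5] 42
```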
This problem is taken from interviewstreet.com
Given array of integers Y=y1,...,yn, we have n line segments such that
endpoints of segment i are (i, 0) and (i, yi). Imagine that from the
top of each segment a horizontal ray is shot to the left, and this ray
stops when it touches another segment or it hits the y-axis. We
construct an array of n integers, v1, ..., vn, where vi is equal to
length of ray shot from the top of segment i. We define V(y1, ..., yn)
= v1 + ... + vn.
For example, if we have Y=[3,2,5,3,3,4,1,2], then v1, ..., v8 =
[1,1,3,1,1,3,1,2], as shown in the picture below:
For each permutation p of [1,...,n], we can calculate V(yp1, ...,
ypn). If we choose a uniformly random permutation p of [1,...,n], what
is the expected value of V(yp1, ..., ypn)?
Input Format
First line of input contains a single integer T (1 <= T <= 100). T
test cases follow.
First line of each test-case is a single integer N (1 <= N <= 50).
Next line contains positive integer numbers y1, ..., yN separated by a
single space (0 < yi <= 1000).
Output Format
For each test-case output expected value of V(yp1, ..., ypn), rounded
to two digits after the decimal point.
Sample Input
6
3
1 2 3
3
3 3 3
3
2 2 3
4
10 2 4 4
5
10 10 10 5 10
6
1 2 3 4 5 6
Sample Output
4.33
3.00
4.00
6.00
5.80
11.15
Explanation
Case 1: We have V(1,2,3) = 1+2+3 = 6, V(1,3,2) = 1+2+1 = 4, V(2,1,3) =
1+1+3 = 5, V(2,3,1) = 1+2+1 = 4, V(3,1,2) = 1+1+2 = 4, V(3,2,1) =
1+1+1 = 3. Average of these values is 4.33.
Case 2: No matter what the permutation is, V(yp1, yp2, yp3) = 1+1+1 =
3, so the answer is 3.00.
Case 3: V(y1 ,y2 ,y3)=V(y2 ,y1 ,y3) = 5, V(y1, y3, y2)=V(y2, y3, y1) =
4, V(y3, y1, y2)=V(y3, y2, y1) = 3, and average of these values is
4.00.
A naive solution to the problem will run forever for N=50. I believe that the problem can be solved by independently calculating a value for each stick. I still need to know if there is any other efficient approach for this problem. On what basis do we have to independently calculate value for each stick?
We can solve this problem by figuring out:
if the k-th stick is put in the i-th position, what is the expected ray length of this stick?
Then the problem can be solved by adding up the expected lengths of all sticks over all positions.
Let expected[k][i] be the expected ray length of the k-th stick when put in the i-th position, and let num[k][i][length] be the number of permutations in which the k-th stick is in the i-th position with ray length equal to length; then
expected[k][i] = sum( num[k][i][length] * length ) / N!
How to compute num[k][i][length]? For example, for length=3, consider the following graph:
...GxxxI...
Where I is the position, the 3 'x' mean we need 3 sticks that are strictly lower than I, and G means we need a stick that is at least as high as I.
Let s_i be the number of sticks that are lower than the k-th stick, and g_i be the number of other sticks that are at least as high as the k-th stick. Then we can choose any one of the g_i sticks to put in the G position, and any ordered selection of length sticks out of the s_i to fill the x positions, so we have:
num[k][i][length] = P(s_i, length) * g_i * (n-length-1-1)!
In case all the positions before I hold sticks lower than I, we don't need a greater stick in G, i.e. xxxI...., and we have:
num[k][i][length] = P(s_i, length) * (n-length-1)!
And here's a piece of Python code that can solve this problem:
from math import comb, factorial

def solve(n, ys):
    ret = 0
    for y_i in ys:
        s_i = sum(1 for y in ys if y < y_i)       # sticks strictly lower
        g_i = sum(1 for y in ys if y >= y_i) - 1  # other sticks at least as high
        for i in range(n):                        # i sticks stand to the left
            for length in range(1, i + 1):        # lower sticks directly to the left
                perms = comb(s_i, length) * factorial(length)  # P(s_i, length)
                if length == i:
                    t_ret = perms * factorial(n - length - 1)
                else:
                    t_ret = perms * g_i * factorial(n - length - 1 - 1)
                ret += t_ret * length
    # every ray has length at least 1, hence the "+ n"
    return ret * 1.0 / factorial(n) + n
This is the same question as https://cs.stackexchange.com/questions/1076/how-to-approach-vertical-sticks-challenge and my answer there (which is a little simpler than those given earlier here) was:
Imagine a different problem: if you had to place k sticks of equal heights in n slots then the expected distance between sticks (and the expected distance between the first stick and a notional slot 0, and the expected distance between the last stick and a notional slot n+1) is (n+1)/(k+1) since there are k+1 gaps to fit in a length n+1.
Returning to this problem, a particular stick is interested in how many sticks (including itself) are as high or higher. If this is k, then the expected gap before it is also (n+1)/(k+1).
So the algorithm is simply to find this value for each stick and add up the expectation. For example, starting with heights of 3,2,5,3,3,4,1,2, the number of sticks with a greater or equal height is 5,7,1,5,5,2,8,7 so the expectation is 9/6+9/8+9/2+9/6+9/6+9/3+9/9+9/8 = 15.25.
This is easy to program: for example a single line in R
V <- function(Y){(length(Y) + 1) * sum(1 / (rowSums(outer(Y, Y, "<=")) + 1) )}
gives the values in the sample output in the original problem
> V(c(1,2,3))
[1] 4.333333
> V(c(3,3,3))
[1] 3
> V(c(2,2,3))
[1] 4
> V(c(10,2,4,4))
[1] 6
> V(c(10,10,10,5,10))
[1] 5.8
> V(c(1,2,3,4,5,6))
[1] 11.15
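For readers who do not use R, the same (n+1)/(k+1) computation can be written in Python. This is a direct transcription sketch; the function name expected_v is made up here.

```python
def expected_v(ys):
    """Expected V over random permutations: sum of (n+1)/(k+1) per stick."""
    n = len(ys)
    total = 0.0
    for y in ys:
        k = sum(1 for other in ys if other >= y)  # sticks at least as high, incl. itself
        total += (n + 1) / (k + 1)
    return total


print(round(expected_v([3, 2, 5, 3, 3, 4, 1, 2]), 2))  # 15.25
```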
As you correctly noted, we can solve the problem independently for each stick.
Let F(i, len) be the number of permutations in which the ray from stick i has length exactly len.
Then the answer is
(Sum(by i, len) F(i,len)*len)/(n!)
All that is left is to count F(i, len). Let a(i) be the number of sticks j such that y_j<=y_i, and b(i) the number of sticks j such that y_j>y_i.
In order to get ray of length len, we need to have situation like this.
B, l...l, O
len-1 times
Here O is stick #i, B is a bigger stick or the beginning, and l is a stick with height less than that of the i-th.
This gives us 2 cases:
1) B is the beginning, this can be achieved in P(a(i), len-1) * (b(i)+a(i)-(len-1))! ways.
2) B is bigger stick, this can be achieved in P(a(i), len-1)*b(i)*(b(i)+a(i)-len)!*(n-len) ways.
edit: corrected b(i) as 2nd term in (mul)in place of a(i) in case 2.