I am reading about dynamic programming in Cormen et al.'s book on algorithms. The following is text from the book:
Suppose we have a motor car factory with two assembly lines, called line 1 and line 2. We have to determine the fastest time to get a chassis all the way through.
The ultimate goal is to determine the fastest time to get a chassis all the way through the factory, which we denote by Fn. The chassis has to get all the way through station n on either line 1 or line 2 and then to the factory exit. Since the faster of these ways is the fastest way through the entire factory, we have
Fn = min(f1[n] + x1, f2[n] + x2)    (Eq. 1)
Above, x1 and x2 are the additional exit times for coming out of line 1 and line 2, respectively.
I have the following recurrence equations (call them Eq. 2):
f1[j] = e1 + a1,1                                          if j = 1
        min(f1[j-1] + a1,j, f2[j-1] + t2,j-1 + a1,j)       if j >= 2
f2[j] = e2 + a2,1                                          if j = 1
        min(f2[j-1] + a2,j, f1[j-1] + t1,j-1 + a2,j)       if j >= 2
Let Ri(j) be the number of references made to fi[j] in a recursive algorithm.
From Eq. 1 we have R1(n) = R2(n) = 1.
From Eq. 2 above we have
R1(j) = R2(j) = R1(j+1) + R2(j+1)    for j = 1, 2, ..., n-1
My question is: how did the author come up with R(n) = 1? Usually we have the base case at 0 rather than at n; in that case, how would we write the recursive functions in code, for example in C?
Another question is: how did the author come up with R1(j) and R2(j)?
Thanks for all the help.
If you solve the problem in a recursive way, what would you do?
You'd start by calculating F(n). F(n) calls f1(n) and f2(n), which recursively call f1(n-1) and f2(n-1), and so on until reaching the leaves (f1(1), f2(1)), right?
So that's the reason the number of references to f1(n) and f2(n) in the recursive solution is 1: you need to compute f1(n) and f2(n) only once. This is not true for f1(n-1), which is referenced both when you compute f1(n) and when you compute f2(n).
Now, how did he come up with R1(j) = R2(j) = R1(j+1) + R2(j+1)?
Well, computing it recursively, every time you need f1(i) you have to compute f1(j) and f2(j) for every j in the interval [1, i) -- AKA for every j smaller than i.
In other words, the value of f1,2(i) depends on the values of f1,2(1..i-1), so every time you compute an f_(i) you're computing EVERY f1,2(1..i-1) (because it depends on their values).
For this reason, the number of times you compute f_(i) depends on how many f1,2 there are "above it".
Hope that's clear.
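To make this concrete, here is a minimal sketch of the naive recursion (mine, not the book's code; in Python for brevity, but the same structure carries over directly to C), instrumented with counters so you can watch the reference counts grow as j decreases. The station times a, transfer times t, entry times e and exit times x are made-up example values. Note the base case sits at j = 1 while the top-level call starts at n, which is why the counting recurrence is anchored at R(n) = 1:

def fastest_naive(a, t, e, x):
    # a[i][j-1]: time at station j on line i+1; t[i][j-1]: transfer time
    # after station j on line i+1; e, x: entry and exit times per line.
    n = len(a[0])
    refs = [[0] * (n + 1) for _ in range(2)]  # refs[i][j] counts references to f_{i+1}[j] (index 0 unused)

    def f(line, j):
        refs[line][j] += 1                    # one more reference to f_line[j]
        if j == 1:                            # base case is j = 1, not 0
            return e[line] + a[line][0]
        other = 1 - line
        # Eq. 2: stay on the same line, or transfer in from the other line
        return min(f(line, j - 1) + a[line][j - 1],
                   f(other, j - 1) + t[other][j - 2] + a[line][j - 1])

    best = min(f(0, n) + x[0], f(1, n) + x[1])  # Eq. 1
    return best, refs

a = [[7, 9, 3], [8, 5, 6]]  # made-up station times, 2 lines x 3 stations
t = [[2, 3], [2, 1]]        # made-up transfer times
e, x = [2, 4], [3, 2]       # made-up entry and exit times
best, refs = fastest_naive(a, t, e, x)
print(best)  # fastest time through the factory
print(refs)  # counts double as j decreases: R(3) = 1, R(2) = 2, R(1) = 4

Running it shows refs ending in ..., 4, 2, 1 on both lines, exactly the doubling R1(j) = R2(j) = R1(j+1) + R2(j+1) predicts; with memoization (or the book's bottom-up table) each fi[j] would be computed only once.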
Problem: Show that RANDOMIZED-SELECT never makes a recursive call to a 0-length array.
Hint: Don't assume that the input array is empty, i.e., p > r. Rather, show that if an empty (sub-)array is ever generated by RANDOMIZED-PARTITION, then a recursive call will not be made on such an empty (sub-)array.
This is exercise 9.2-1 from Chapter 9 ("Medians and Order Statistics") of Cormen's Introduction to Algorithms.
The answer should be:
Calling a 0-length array would mean that the second and third arguments are equal. So, if the call is made on line 8, we would need that p=q−1, which means that q - p + 1 = 0.
However, i is assumed to be a nonnegative number, and to be executing line 8, we would need that i < k = q - p + 1 = 0, a contradiction. The other possibility is that the bad recursive call occurs on line 9. This would mean that q + 1 = r. To be executing line 9, we need that i > k = q - p + 1 = r - p. This would be a nonsensical original call to the array though because we are asking for the ith element from an array of strictly less size.
This solution can be found at this link.
The algorithm it refers to can be found in Cormen's Introduction to Algorithms, Chapter 9 ("Medians and Order Statistics"), Section 9.2 ("Selection in expected linear time").
Line 8 of the algorithm says: return RANDOMIZED-SELECT(A, p, q - 1, i)
The solution says the 2nd and 3rd arguments should be equal, so p = q - 1, which means p - q + 1 = 0; but the solution states q - p + 1 = 0. How could they get that?
Then again, for line 9 they calculated q - p + 1 = r - p. Since I cannot figure out how they got q - p + 1 = 0, the equation q - p + 1 = r - p is also meaningless to me.
Can anyone please clarify my doubts?
Thank you.
Algorithm 1: RANDOMIZED-SELECT
RANDOMIZED-SELECT(A, p, r, i)
1  if p == r
2      return A[p]
3  q = RANDOMIZED-PARTITION(A, p, r)
4  k = q - p + 1
5  if i == k  // the pivot value is the answer
6      return A[q]
7  elseif i < k
8      return RANDOMIZED-SELECT(A, p, q - 1, i)
9  else return RANDOMIZED-SELECT(A, q + 1, r, i - k)
Algorithm 2: RANDOMIZED-PARTITION
RANDOMIZED-PARTITION(A, p, r)
1  i = RANDOM(p, r)
2  exchange A[r] with A[i]
3  return PARTITION(A, p, r)
Yes, I think you are right that the proposed solution is incorrect.
The solutions you are looking at are not part of the textbook, nor were they written by any of the textbook's authors, nor were they reviewed by the textbook's authors. In short, they are, like this site, the unverified opinions of uncertified contributors of uncertain value. It hardly seems necessary to observe that the internet is full of inexact, imprecise and plainly incorrect statements, some of them broadcast maliciously with intent to deceive, but the vast majority simple errors with no greater fault than sloppiness or ignorance. The result is the same: you have the responsibility to carefully evaluate the veracity of anything you read.
One aid in this particular repository of proposed solutions is the bug list, which is also not authored by infallible and reliable reviewers, but still allows some kind of triangulation since it largely consists of peer reviews. So it should be your first port of call when you suspect that a solution is buggy. And, indeed, there you will find this issue, which seems quite similar to your complaint. I'll quote the second comment in that issue (from "Alice-182"), because I don't think I can say it better; lightly edited, it reads:
Calling a 0-length array would mean that the second argument is larger than the third argument by 1. So, if the call is made on line 8, we would need that p = q - 1 + 1 = q.
However, i is assumed to be a positive number, and to be executing line 8, we would need that i < k = q - p + 1 = 1, which means that i ≤ 0, a contradiction. The other possibility is that the bad recursive call occurs on line 9. This would mean that q + 1 = r + 1. But if line 9 runs, it must be that i > k = q - p + 1 = r - p + 1. This would be a nonsensical original call to the array though for i should be in [1, r - p + 1].
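For concreteness, here is a minimal Python translation of the two procedures (mine, not from the book or the solutions repository), with an assertion that checks the invariant argued above: no recursive call ever receives an empty subarray. Indices are 0-based here, while the rank i stays 1-based as in the book.

import random

def partition(A, p, r):
    # Lomuto partition around the pivot A[r]; returns the pivot's final index.
    x = A[r]
    i = p - 1
    for j in range(p, r):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_partition(A, p, r):
    i = random.randint(p, r)  # RANDOM(p, r), both ends inclusive
    A[r], A[i] = A[i], A[r]
    return partition(A, p, r)

def randomized_select(A, p, r, i):
    assert p <= r, "a recursive call received an empty subarray"
    if p == r:
        return A[p]
    q = randomized_partition(A, p, r)
    k = q - p + 1                 # rank of the pivot within A[p..r]
    if i == k:                    # the pivot value is the answer
        return A[q]
    elif i < k:
        return randomized_select(A, p, q - 1, i)      # line 8
    else:
        return randomized_select(A, q + 1, r, i - k)  # line 9

A = [3, 1, 4, 1, 5, 9, 2, 6]
print(randomized_select(A, 0, len(A) - 1, 3))  # 3rd smallest element: 2

The assertion never fires for any 1 <= i <= r - p + 1, which is exactly the claim of exercise 9.2-1.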
I am trying to find a solution in which a given resource (e.g. a budget) is best distributed among different options, each of which yields a different result for the resource it is given.
Let's say I have N = 1200 and some functions. (a, b, c, d are some unknown variables)
f1(x) = a * x
f2(x) = b * x^c
f3(x) = a*x + b*x^2 + c*x^3
f4(x) = d^x
f5(x) = log x^d
...
Also, let's say there are n such functions, each yielding a different result based on its input x, where x = 0 or x >= m, for some constant m.
Although I am not able to find an exact formula for these functions, I am able to evaluate their output. This means I can compute:
X = f1(N1) + f2(N2) + f3(N3) + ... + fn(Nn), where N1 + ... + Nn = N, once for every way of distributing N into n numbers, and find the specific case where X is greatest.
How would I actually go about finding the best distribution of N with the least computation power, using whatever libraries currently available?
If you are happy with allocations constrained to be whole numbers then there is a dynamic programming solution with a table of O(Nn) entries (filling each entry considers up to N splits, so O(N^2 n) time) - so you can increase accuracy by scaling if you want, but this will increase cpu time.
For each i=1 to n maintain an array where element j gives the maximum yield using only the first i functions giving them a total allowance of j.
For i=1 this is simply the result of f1().
For i = k+1, when working out the result for j, consider each possible way of splitting the j units between f_{k+1}() and the first k functions: the table built for i = k already tells you the best return from any distribution among the first k functions, so you can compute the table for i = k+1 from the table for i = k.
At the end you get the best possible return for n functions and N resources. It is easier to find out what that best answer is if you maintain a set of arrays telling you the best way to distribute k units among the first i functions, for all possible values of i and k. Then you can look up the best allocation for f100(), subtract the amount allocated to f100() from N, look up the best allocation for f99() given the resulting resources, and carry on like this until you have worked out the best allocations for all the f().
As an example suppose f1(x) = 2x, f2(x) = x^2 and f3(x) = 3 if x>0 and 0 otherwise. Suppose we have 3 units of resource.
The first table is just f1(x) which is 0, 2, 4, 6 for 0,1,2,3 units.
The second table is the best you can do using f1(x) and f2(x) for 0,1,2,3 units and is 0, 2, 4, 9, switching from f1 to f2 at x=2.
The third table is 0, 3, 5, 9. I can get 3 and 5 by using 1 unit for f3() and the rest for the best solution in the second table. 9 is simply the best solution in the second table - there is no better solution using 3 resources that gives any of them to f3().
So 9 is the best answer here. One way to work out how to get there is to keep the tables around and recalculate that answer. 9 comes from f3(0) + 9 from the second table, so all 3 units are available to f2() + f1(). The second table's 9 comes from f2(3), so there are no units left for f1(), and we get f1(0) + f2(3) + f3(0).
When you are working out the resources to use at stage i = k+1, you have a table from i = k that tells you exactly the result to expect from whatever resources are left over after you decide how many to use at stage i = k+1. The best distribution cannot turn out to be wrong, because at stage i = k you have already worked out the best distribution for every possible number of remaining resources.
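Here is a rough Python sketch (mine, not the answerer's code) of the table-filling just described, run on the f1/f2/f3 example with 3 units of resource. It reproduces the three tables above and recovers the allocation f1(0) + f2(3) + f3(0):

def best_allocation(funcs, N):
    # table[j] = best yield distributing j units among the functions seen so far
    table = [funcs[0](j) for j in range(N + 1)]
    choice = [list(range(N + 1))]  # units handed to funcs[0] for each budget j
    for f in funcs[1:]:
        new, picks = [], []
        for j in range(N + 1):
            # try every split: s units to f, the remaining j - s to earlier functions
            best_s = max(range(j + 1), key=lambda s: f(s) + table[j - s])
            new.append(f(best_s) + table[j - best_s])
            picks.append(best_s)
        table = new
        choice.append(picks)
    # walk back through the per-stage choices to recover the full allocation
    alloc, j = [], N
    for picks in reversed(choice[1:]):
        alloc.append(picks[j])
        j -= picks[j]
    alloc.append(j)  # whatever is left goes to the first function
    return table[N], list(reversed(alloc))

funcs = [lambda x: 2 * x,              # f1
         lambda x: x ** 2,             # f2
         lambda x: 3 if x > 0 else 0]  # f3
print(best_allocation(funcs, 3))  # (9, [0, 3, 0]): f1(0) + f2(3) + f3(0) = 9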
I have to implement an algorithm that solves the Towers of Hanoi game for k pods and d rings in a limited number of moves (let's say 4 pods, 10 rings, 50 moves for example) using Bellman dynamic programming equation (if the problem is solvable of course).
Now, I understand the logic behind the Bellman equation, where V^T is the objective function at time T, a^0 is the action at time 0, x^0 is the starting configuration, H_t is the cumulative gain, and f(x^0, a^0) = x^1 is the state transition.
The cardinality of the state space is k^d, and I get that a good representation for a state is a number in base k: d digits, each going from 0 to k-1. Each digit represents a ring, and its value (from 0 to k-1) is the label of the pod that ring sits on.
I want to minimize the number of moves for going from the initial configuration (10 rings on the first pod) to the end one (10 rings on the last pod).
What I don't get is: how do I write my objective function?
The first thing you need to do is choose a reward function H_t(x, a), which will define your goal. Once this function is chosen, the (optimal) value function is defined and all you have to do is compute it.
The idea of dynamic programming for the Bellman equation is that you should compute V_t(s) bottom-up: you start with t=T, then t=T-1 and so on until t=0.
The initial case is simply given by:
V_T(s) = 0, ∀s
You can compute V_{T-1}(x) ∀x from V_T:
V_{T-1}(x) = max_a [ H_{T-1}(x,a) ]  (the V_T(f(x,a)) term is omitted since V_T ≡ 0)
Then you can compute V_{T-2}(x) ∀x from V_{T-1}:
V_{T-2}(x) = max_a [ H_{T-2}(x,a) + V_{T-1}(f(x,a)) ]
And you keep on computing V_{t-1}(x) ∀x from V_t:
V_{t-1}(x) = max_a [ H_{t-1}(x,a) + V_{t}(f(x,a)) ]
until you reach V_0.
Which gives the algorithm:
forall x:
    V[T](x) ← 0
for t from T-1 to 0:
    forall x:
        V[t](x) ← max_a { H[t](x,a) + V[t+1](f(x,a)) }
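As an illustration, here is a small Python version of this backward induction (mine, not the answerer's) on a made-up toy problem: walk along positions 0..3 and reach position 3. The reward is time-independent for simplicity; giving -1 per move, and 0 for staying at the goal, turns "maximize cumulative H" into "minimize the number of moves", which is the kind of objective the Hanoi question needs:

def backward_induction(states, actions, H, f, T):
    # V[t][x] = best achievable total reward from state x at time t
    V = [{x: 0.0 for x in states} for _ in range(T + 1)]  # V_T(x) = 0 for all x
    for t in range(T - 1, -1, -1):                        # t = T-1 down to 0
        for x in states:
            V[t][x] = max(H(x, a) + V[t + 1][f(x, a)] for a in actions(x))
    return V

states = [0, 1, 2, 3]
goal = 3
actions = lambda x: [-1, 0, +1]             # step left, stay, step right
f = lambda x, a: min(max(x + a, 0), goal)   # transition, clamped to the board
H = lambda x, a: 0 if (x == goal and a == 0) else -1

V = backward_induction(states, actions, H, f, T=5)
print(V[0][0])  # -3: three moves are needed to get from 0 to 3

For the Hanoi problem you would replace states by the k^d base-k configurations, actions by the legal ring moves, and keep the same -1-per-move reward.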
What actually was requested was this:
def k_hanoi(npods, nrings):
    if nrings == 1 and npods > 1:  # one remaining ring: just one move
        return 1
    if npods == 3:
        return 2 ** nrings - 1  # the optimal solution with 3 pods takes 2^d - 1 moves
    if npods > 3 and nrings > 0:
        sol = []
        for pivot in range(1, nrings):  # loop over all possible pivots
            sol.append(2 * k_hanoi(npods, pivot) + k_hanoi(npods - 1, nrings - pivot))
        return min(sol)  # minimize over the pivot

k = 4
d = 10
print(k_hanoi(k, d))
I think it is the Frame-Stewart algorithm, with optimization on the pivot chosen to divide the disks into two subgroups. I also think someone proved it is optimal for 4 pegs (in 2014 or something like that? Not sure, btw), and it is conjectured to be optimal for more than 4 pegs. The limitation on the number of moves can be implemented easily.
The value function in this case was the number of steps needed to go from the initial configuration to the ending one and it needed be minimized. Thank you all for the contribution.
Just out of curiosity I tried to do the following, which turned out to be not so obvious to me.
Suppose I have nested loops with runtime bounds, for example:
t = 0 // trip count
for l in 0:N
    for k in 0:N
        for j in max(l,k):N
            for i in k:j+1
                t += 1
t is loop trip count
Is there a general algorithm/way (better than O(N^4), obviously) to calculate the loop trip count?
If not, I would be curious to know how you would approach just this particular loop. The above loop is symmetric (it loops over a symmetric rank-4 tensor), and I am also interested in methods to detect loop symmetry.
I am working on the assumption that the iteration bounds depend only on constants or previous loop variables. A link/journal article, if you know of one, would be great.
I believe the inner loop will run
t = 1/8 * (N^4 + 6 * N^3 + 7 * N^2 + 2 * N)
times.
I did not really solve the problem directly; I fitted a 4th-order polynomial to exactly calculated values of t for N from 1 to 50, hoping to get an exact fit.
To calculate exact t I used
sum(sum(sum(sum(1,i,k,j+1),j,max(l,k),N),k,1,N),l,1,N)
which should be the equivalent of actually running your loops.
data fit, log scale http://img714.imageshack.us/img714/2313/plot3.png
The fit for N from 1 to 50 matches exactly and calculating it for N=100 gives 13258775 using both methods.
EDIT:
The exercise was done using open source algebra system maxima, here's the actual source (output discarded):
nr(n):=sum(sum(sum(sum(1,i,k,j+1),j,max(l,k),n),k,1,n),l,1,n);
M : genmatrix( lambda([i,j],if j=1 then i else nr(i)), 50, 2 );
coefs : lsquares_estimates(M, [x,y], y = A*x^4+B*x^3+C*x^2+D*x+E, [A,B,C,D,E]);
sol(x):=ev(A*x^4+B*x^3+C*x^2+D*x+E, coefs);
sol(N);
S : genmatrix( lambda([i,j], if j=1 then i else sol(i)), 50, 2);
M-S;
plot2d([[discrete,makelist([M[N][1],M[N][2]],N,1,50)], sol(N)], [N, 1, 60], [style, points, lines], [color, red, blue], [legend, "simulation", sol(N)], [logy]);
compare(nr(100),sol(100));
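As a quick cross-check (not part of the original answer), a brute-force count in Python agrees with the fitted polynomial; all ranges are taken as closed (from a to b inclusive) to match the Maxima sums above:

def brute_force(N):
    t = 0
    for l in range(1, N + 1):
        for k in range(1, N + 1):
            for j in range(max(l, k), N + 1):
                for i in range(k, j + 2):  # i runs from k to j + 1 inclusive
                    t += 1
    return t

def formula(N):
    return (N ** 4 + 6 * N ** 3 + 7 * N ** 2 + 2 * N) // 8

for N in (1, 2, 10, 50):
    assert brute_force(N) == formula(N)
print(formula(100))  # 13258775, matching the value quoted above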
If you want to know how many times the inner loop:
for j in max(l,k):N
would be executed, just compute N - max(l, k) assuming an open range, or N + 1 - max(l, k) assuming a closed range.
For example, if:
l = 2
k = 7
N = 10
then it will run on 7, 8, 9, 10 (closed range), so indeed 10 + 1 - 7 = 4 times.
The answer is no, as long as the loop bounds can depend on the outer variables in an arbitrary fashion, as this would provide a general means of getting closed-form formulations of arbitrary series.
To see this, consider the following:
for x in 0:N
    for y in 0:f(x)
        t += 1
The trip count t(N) equals the sum t(N) = f(0)+f(1)+f(2)+f(3)+...+f(N-1).
So if you could get a closed-form formulation for t(N) regardless of f(), you would have found a very general method of producing closed forms - too general, I would say, because what you have here corresponds to an integral, and it is known that not all integrals admit closed-form formulations.
I have five values, A, B, C, D and E.
Given the constraint A + B + C + D + E = 1, and five functions F(A), F(B), F(C), F(D), F(E), I need to solve for A through E such that F(A) = F(B) = F(C) = F(D) = F(E).
What's the best algorithm/approach to use for this? I don't care if I have to write it myself, I would just like to know where to look.
EDIT: These are nonlinear functions. Beyond that, they can't be characterized. Some of them may eventually be interpolated from a table of data.
There is no general answer to this question. A solver that finds the solution to any arbitrary equation does not exist. As Lance Roberts already says, you have to know more about the functions. Just a few examples:
If the functions are twice differentiable, and you can compute the first derivative, you might try a variant of Newton-Raphson
Have a look at the Lagrange Multiplier Method for implementing the constraint.
If the function F is continuous (which it probably is, if it is an interpolant), you could also try the Bisection Method, which is a lot like binary search.
Before you can solve the problem, you really need to know more about the function you're studying.
As others have already posted, we do need some more information on the functions. However, given that, we can still try to solve the following relaxation with a standard non-linear programming toolbox.
min k
s.t.
A + B + C + D + E = 1
F1(A) - k = 0
F2(B) - k = 0
F3(C) - k = 0
F4(D) - k = 0
F5(E) - k = 0
Now we can solve this in any manner we wish, such as a penalty method
min k + mu * sum((Fi(x_i) - k)^2)
s.t.
A + B + C + D + E = 1
or a straightforward SQP or interior-point method.
More details and I can help advise as to a good method.
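For illustration, here is a rough Python/scipy sketch (mine, not part of the original answer) of the penalty relaxation above, with five made-up monotone functions standing in for F1..F5 and an arbitrarily chosen penalty weight:

import numpy as np
from scipy.optimize import minimize

# Made-up monotonically increasing stand-ins for F1..F5:
F = [lambda v: 2 * v,
     lambda v: v ** 2 + v,
     lambda v: np.expm1(v),
     lambda v: 3 * v / (1 + v),
     lambda v: np.sqrt(abs(v))]

mu = 1e3  # penalty weight, chosen arbitrarily

def objective(z):
    *xs, k = z  # z = (A, B, C, D, E, k)
    return k + mu * sum((F[i](xs[i]) - k) ** 2 for i in range(5))

cons = {"type": "eq", "fun": lambda z: sum(z[:5]) - 1}  # A + B + C + D + E = 1
res = minimize(objective, x0=[0.2] * 5 + [0.3], constraints=cons)
print(res.x)  # A..E followed by the (roughly) common value k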
The functions are all monotonically increasing with their argument. Beyond that, they can't be characterized. The approach that worked turned out to be:
1) Start with A = B = C = D = E = 1/5
2) Compute F1(A) through F5(E), average the five values, and recalculate A through E so that each function equals that average.
3) Rescale the new A through E so that they all sum to 1, and recompute F1 through F5.
4) Repeat until satisfied.
It converges surprisingly fast - just a few iterations. Of course, each iteration requires 5 root finds for step 2.
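In case it helps anyone else, here is a rough Python sketch of the approach (mine, with the same made-up increasing functions as in the sketch above), using bisection on [0, 1] for the root finds in step 2:

import math

# Made-up monotonically increasing stand-ins for F1..F5:
F = [lambda v: 2 * v,
     lambda v: v ** 2 + v,
     lambda v: math.expm1(v),
     lambda v: 3 * v / (1 + v),
     lambda v: math.sqrt(v)]

def bisect_inverse(f, target, lo=0.0, hi=1.0, iters=60):
    # Solve f(v) = target for v in [lo, hi], assuming f is increasing.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def equalize(F, rounds=20):
    xs = [1.0 / len(F)] * len(F)  # step 1: start with an equal split
    for _ in range(rounds):       # step 4: repeat until satisfied
        avg = sum(f(v) for f, v in zip(F, xs)) / len(F)  # average of the F_i
        xs = [bisect_inverse(f, avg) for f in F]         # step 2: root finds
        total = sum(xs)
        xs = [v / total for v in xs]                     # step 3: rescale to sum 1
    return xs

xs = equalize(F)
print(xs, sum(xs))                    # the allocation; sums to 1
print([f(v) for f, v in zip(F, xs)])  # the F_i values, roughly equal

For these made-up functions it settles in a handful of rounds; whether it converges for others depends on the functions.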
One solution of the equations
A + B + C + D + E = 1
F(A) = F(B) = F(C) = F(D) = F(E)
is to take A, B, C, D and E all equal to 1/5. Not sure though whether that is what you want ...
Added after John's comment (thanks!)
Assuming the second equation should read F1(A) = F2(B) = F3(C) = F4(D) = F5(E), I'd use the Newton-Raphson method (see Martijn's answer). You can eliminate one variable by setting E = 1 - A - B - C - D. At every step of the iteration you need to solve a 4x4 system. The biggest problem is probably where to start the iteration. One possibility is to start at a random point, do some iterations, and if you're not getting anywhere, pick another random point and start again.
Keep in mind that if you really don't know anything about the function then there need not be a solution.
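Here is a rough sketch (mine, not the answerer's code) of that Newton-Raphson iteration with E eliminated, using a forward-difference Jacobian and the same made-up functions as above; each step solves a 4x4 linear system, as described:

import numpy as np

# Made-up monotonically increasing stand-ins for F1..F5:
F = [lambda v: 2 * v,
     lambda v: v ** 2 + v,
     lambda v: np.expm1(v),
     lambda v: 3 * v / (1 + v),
     lambda v: np.sqrt(abs(v))]

def G(v):
    # Residuals G_i = F_i(v_i) - F5(E), with E = 1 - A - B - C - D.
    E = 1 - v.sum()
    return np.array([F[i](v[i]) - F[4](E) for i in range(4)])

def jacobian(v, h=1e-7):
    # Forward-difference approximation of the 4x4 Jacobian of G.
    J = np.empty((4, 4))
    g0 = G(v)
    for j in range(4):
        vp = v.copy()
        vp[j] += h
        J[:, j] = (G(vp) - g0) / h
    return J

v = np.full(4, 0.2)  # start at A = B = C = D = E = 1/5
for _ in range(50):
    step = np.linalg.solve(jacobian(v), G(v))
    v -= step
    if np.abs(step).max() < 1e-12:
        break

A, B, C, D = v
E = 1 - v.sum()
print(A, B, C, D, E)
print([f(val) for f, val in zip(F, [A, B, C, D, E])])  # roughly equal

If Newton diverges (a bad Jacobian or a wild step), restart from a different random point, as suggested above.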
ALGENCAN (part of TANGO) is really nice. There are Python bindings, too.
http://www.ime.usp.br/~egbirgin/tango/codes.php - " general nonlinear programming that does not use matrix manipulations at all and, so, is able to solve extremely large problems with moderate computer time. The general algorithm is of Augmented Lagrangian type ... "
http://pypi.python.org/pypi/TANGO%20Project%20-%20ALGENCAN/1.0
Google OPTIF9 or ALLUNC. We use these for general optimization.
You could use standard search techniques, as the others mentioned. There are a few optimizations you can make use of while doing the search.
First of all, you only need to solve for A, B, C, D, because E = 1 - (A + B + C + D).
Second, you have F(A) = F(B) = F(C) = F(D), so you can search for A. Once you get F(A), you can solve for B, C, D if that is possible. If it is not possible to solve the functions directly, you need to keep searching each variable, but now you have a limited range to search in, because A + B + C + D <= 1.
If your search is discrete and finite, the above optimizations should work reasonably well.
I would try Particle Swarm Optimization first. It is very easy to implement and tweak. See the Wiki page for it.