In which order the woman should bring the cats back in order to minimize the time? - algorithm

A woman watches her cats leave one by one, at different speeds and in different directions. She takes a motorcycle with one extra seat, follows the cats, picks up one cat at a time, and brings each one back home. Each cat moves with a constant individual speed Vi and left home at time Ti. In what order should she bring the cats back to minimize the total time?
I am trying to solve this problem but do not know how to begin.

Summary:
Sort the cats according to the metric v / x in descending order, where v is the cat's constant speed and x is the cat's initial displacement at time t = 0. It doesn't matter how you break ties. Once the order is initially established, it will remain the most efficient order in which to get cats as long as it is followed; so follow it.
Candidates debunked:
In both cases, allow the motorcycle speed to be w = 20.
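It is proposed that you get cats in order from fastest to slowest. Counterexample: Cat #1 (x, v) = (1, 9) and Cat #2 (x, v) = (100, 10). Getting the faster Cat #2 first takes about 52.9 time units in total; getting Cat #1 first takes about 20.5.
It is proposed that you get cats in order from closest to farthest. Counterexample: Cat #1 (x, v) = (1, 1) and Cat #2 (x, v) = (2, 10). Getting the closer Cat #1 first takes about 0.72 time units in total; getting Cat #2 first takes about 0.55.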
Detailed Derivation:
Let c(k) refer to the kth cat the lady picks up, v(k) refer to the speed of that cat and x(k) to the cat's initial displacement (at time t = 0, which we set as the time the lady starts her motorcycle initially in pursuit of the first cat).
The total time taken to get the first cat is:
t(1) = 2 * x(1) / (w - v(1))
where w is the constant speed of the motorcycle. Since this expression is going to be important we can motivate every part of it:
2 * comes from the fact that the lady must catch the cat, and then spend the same amount of time to return the cat home;
x(1) / (w - v(1)) is the time taken to reach the cat, that is, close the distance x(1) by traveling w - v(1) faster than the cat's v(1).
The time to get the first two cats is:
t(2) = t(1) + 2 * (x(2) + v(2)t(1)) / (w - v(2))
That is, it takes time equal to the time to get the first cat plus the time to get the second cat. The extra v(2)t(1) term accounts for the fact that the second cat moves while the lady is getting the first cat; otherwise, this part is the same.
Rearranging this expression, we get:
t(2) = t(1)(1 + 2 * v(2) / (w - v(2))) + 2 * x(2) / (w - v(2))
We define the following derivative terms:
T(k) = 2 * x(k) / (w - v(k))
s(k) = 2 * v(k) / (w - v(k)) + 1
Now we rewrite:
t(1) = T(1)
t(2) = s(2)T(1) + T(2)
and continue
t(1) = T(1)
t(2) = s(2)T(1) + T(2)
t(3) = s(3)s(2)T(1) + s(3)T(2) + T(3)
...
t(n) = s(n)...s(2)T(1) + s(n)...s(3)T(2) + ... + T(n)
This last expression gives us the total time to get all n cats:
s(n)...s(2)T(1) + s(n)...s(3)T(2) + ... + T(n)
Now we assume that we have an optimal solution in that the cats are picked up in the most efficient order possible. To derive useful properties about this hypothetical optimal solution, we can use the supposed optimality to infer that swapping cats produces a solution that is no better. Imagine swapping cats j and j+1:
... + s(n)...s(j+1)T(j) + s(n)...s(j+2)T(j+1) + ...
<= ... + s(n)...s(j)T(j+1) + s(n)...s(j+2)T(j) + ...
Terms involving T(k) for k < j have both s(j) and s(j+1) and by the commutativity of multiplication they are unaffected by the swap. Terms involving T(k) for k > j + 1 have neither s(j) nor s(j+1) and so cannot be affected by the swap. Only the terms with T(k) such that j <= k <= j + 1 are affected by the swap, so we can remove like terms:
s(n)...s(j+2)s(j+1)T(j) + s(n)...s(j+2)T(j+1)
<= s(n)...s(j+2)s(j)T(j+1) + s(n)...s(j+2)T(j)
The partial product s(n)...s(j+2) is common to all remaining terms and must be positive, so we can remove this like term by dividing both sides of the inequality:
s(j+1)T(j) + T(j+1) <= s(j)T(j+1) + T(j)
Rearrange this as follows:
(s(j+1) - 1)T(j) <= (s(j) - 1)T(j+1)
Finally:
(s(j+1) - 1) / T(j+1) <= (s(j) - 1) / T(j)
Recalling our definitions of s(k) and T(k), simplify to put this in terms of v and x:
v(j+1) / x(j+1) <= v(j) / x(j)
That is: if we have an optimal solution, it must be the case that the ratio of cats' speeds to initial displacements must be in descending order. This is a necessary, but perhaps not sufficient, condition.
Note that this result agrees with intuition:
Get still cats last (v = 0)
Get cats that haven't left yet first (x = 0)
Get cats approaching the motorcycle's speed first (or never)
Get cats that are really far away last (x -> +inf)
It also gives the correct result for the two-cat case; and in that case, if the ratios of speed to displacement are equal, then it can be easily shown that it doesn't matter which order you get the cats in (if they are unequal, you must get the cat with the higher ratio first).
Now - I have not addressed the case where cats may have the same ratio. It's not immediately obvious to me that the order in which you get cats with the same ratio doesn't matter.
However, suppose you have chosen optimally up until some point k < n. Now you need to decide which of two cats with the same ratio to go after. As we already mentioned, for the two-cat problem, it's a wash: so I think the answer is that it can't matter which one you choose, as either order among the two will take the same time and "look" the same afterwards. To see that two cats that start with the same ratio keep the same ratio:
v(i) / x(i) = c; X(i) = x(i) + v(i)t = x(i) + x(i)ct = x(i)(1 + ct)
v(j) / x(j) = c; X(j) = x(j) + v(j)t = x(j) + x(j)ct = x(j)(1 + ct)
So the ratio changes over time (if you take X as the new initial displacement) but two cats that start out with the same ratio will keep it. The new ratio will be:
v / x = c; v / X = v / (x(1 + ct)) = c / (1 + ct)
It is important to note that these ratios don't "cross over" each other either; if you start out with a higher or lower ratio, it will change over time, but it will not become higher or lower than other cats' ratios:
c(i) / (1 + c(i)t) > c(j) / (1 + c(j)t)
<=> c(i) + c(i)c(j)t > c(j) + c(i)c(j)t
<=> c(i) > c(j)
Based on all of these considerations, my best answer is:
Sort the cats according to the metric v / x in descending order. It doesn't matter how you break ties. Get the cats in that order.
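To make the rule concrete, here is a minimal Java sketch that simulates a pickup order and sorts by v / x. It assumes the same model as the derivation (every cat is slower than the motorcycle, and the trip back takes as long as the chase); the class and method names (CatOrder, totalTime, byRatio) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CatOrder {
    /** A cat with initial displacement x (at t = 0) and constant speed v. */
    static final class Cat {
        final double x;
        final double v;
        Cat(double x, double v) { this.x = x; this.v = v; }
    }

    /** Total time to fetch the cats in the given order; assumes w > v for every cat. */
    static double totalTime(List<Cat> order, double w) {
        double t = 0.0;
        for (Cat c : order) {
            // By time t the cat is at x + v*t; the chase and the return take equally long.
            t += 2.0 * (c.x + c.v * t) / (w - c.v);
        }
        return t;
    }

    /** Sorts by v / x in descending order; cross-multiplying avoids dividing by x = 0. */
    static List<Cat> byRatio(List<Cat> cats) {
        List<Cat> sorted = new ArrayList<>(cats);
        sorted.sort((a, b) -> Double.compare(b.v * a.x, a.v * b.x));
        return sorted;
    }

    public static void main(String[] args) {
        // The "fastest first" counterexample from above, with w = 20.
        List<Cat> fastestFirst = Arrays.asList(new Cat(100, 10), new Cat(1, 9));
        System.out.println("fastest first: " + totalTime(fastestFirst, 20));
        System.out.println("by v / x:      " + totalTime(byRatio(fastestFirst), 20));
    }
}
```

For small inputs, comparing totalTime over every permutation against the sorted order is a quick sanity check of the derivation.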

Update
Thanks to @Patrick87's counterexample, I know my solution does not work in the general case. However, I'm going to leave it here because it provides a simpler solution under the extra assumption that all cats start moving from home at time 0. Please see @Patrick87's solution for the general case.
Short Answer:
She must start with the fastest cat. i.e, order cats by velocity (in decreasing order).
Simplify the problem: Assume there are only two cats, one is running the other one walking very slowly. Which one you will go after first?
Detailed answer:
The total distance of all cats from home at time 0 is 0:
X(0) = 0
Therefore, if we assume that the woman catches the last cat at time Tn then the total distance the woman has traveled at Tn is:
X(Tn) = (V1 * T1) + ... + (Vn * Tn)
Where Ti is the time she catches the i-th cat. Vi values are predetermined, so, we need to minimize this equation based on values of Ti.
We have n POSITIVE V values with n positive T coefficients to assign to them. Minimizing it under these conditions is easy:
Give the largest V, the smallest coefficient T and so on.
Which means start with the fastest cat (largest V) and bring it back first (multiply by smallest T) and continue.

Related

Probabilistic algorithm

We are given a binary array that has either n zeros or floor(n/2) zeros and ceiling(n/2) ones.
We want to decide whether the array includes ones.
Q. Suggest a randomized algorithm that has time complexity O(1) and gives the correct answer with a probability of at least 3/4. The algorithm may give a wrong answer, but for no more than 1/4 of the possible inputs.
I would like to get some direction on how to solve this question.
Check random item in the array:
If item == 0 return first possibility (n zeroes)
If item == 1 return second possibility (n/2 zeroes and n/2 ones)
Let's have a look what's going on: the only possibility to give incorrect answer is when we have second possibility,
but we get item == 0 and answer is first possibility. The conditional (second possibility) probability is
p = 1/2
If we check two random items
p = 1/4 (two items are zeroes)
If we check three random items
p = 1/8 (three items are zeroes)
Now, let's compute bayesian probability of incorrect answer, let
P0 - probability of the 1st (all zeroes) outcome
P1 - probability of the 2nd (half zeroes, half ones) outcome
Perror = P1 * p / (P0 + P1) <= 1/4
Or
P1 * p / (P0 + P1) <= 1/4
p <= (P0 + P1) / 4 / P1
p <= P0 / (4 * P1) + 1/4
From the worst case, P0 = 0 (P1 = 1) we get condition for p:
p <= 1/4
So far so good: we should check two random items of the array and then
If both items are 0, we answer "All zeroes case"
If any item is 1, we answer "Half zeroes, half ones case"
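A minimal Java sketch of the two-sample check described above. It samples with replacement, which is an assumption on my part (the answer does not say either way), and the names ZeroOrHalf and containsOnes are hypothetical.

```java
import java.util.Random;

final class ZeroOrHalf {
    /**
     * Returns true if the array is judged to contain ones, false if it is judged all zeros.
     * It is never wrong on an all-zero array; on a half-ones array it is wrong only
     * when every sampled item happens to be a zero.
     */
    static boolean containsOnes(int[] a, int samples, Random rng) {
        for (int s = 0; s < samples; s++) {
            if (a[rng.nextInt(a.length)] == 1) {
                return true; // seeing a single 1 settles the question
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] half = {0, 1, 0, 1, 0, 1, 0, 1};
        System.out.println(containsOnes(half, 2, new Random()));
    }
}
```

With samples = 2 the work is constant regardless of n, and the chance of missing the ones is at most 1/2 * 1/2 = 1/4.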

Solve recurrence relation in which there is a separate relation for even and odd values

Can someone help me how to solve these type of questions? What kind of approach should I follow?
Looking over the question, since you will be asked to
evaluate the recurrence lots of times
for very large inputs,
you will likely need to either
find a closed-form solution to the recurrence, or
find a way to evaluate the nth term of the recurrence in sublinear time.
The question, now, is how to do this. Let's take a look at the recurrence, which was defined as
f(1) = f(2) = 1,
f(n+2) = 3f(n) if n is odd, and
f(n+2) = 2f(n+1) - f(n) + 2 if n is even.
Let's start off by just exploring the recurrence to see if any patterns arise. Something that stands out here - the odd terms of this recurrence only depend on other odd terms in the recurrence. This means that we can imagine trying to split this recurrence into two smaller recurrences: one that purely deals with the odd terms, and one that purely deals with the even terms. Let's have D(n) be the sequence of the odd terms, and E(n) be the sequence of the even terms. Then we have
D(1) = 1
D(n+2) = 3D(n)
We only need to evaluate D on odd numbers, so we can play around with that to see if a pattern emerges:
D(2·0 + 1) = 1 = 3^0
D(2·1 + 1) = 3 = 3^1
D(2·2 + 1) = 9 = 3^2
D(2·3 + 1) = 27 = 3^3
The pattern here is that D(2n+1) = 3^n. And hey, that's great news! That means that we have a direct way of computing D(2n+1).
With that in mind, notice that E(n) is defined as
E(2) = 1 = D(1)
E(n+2) = 2D(n+1) - E(n) + 2
Remember that we know the exact value of D(n+1), which is going to make our lives a lot easier. Let's see what happens if we iterate on this recurrence a bit. For example, notice that
E(8)
= 2D(7) - E(6) + 2
= 2D(7) + 2 - (2D(5) - E(4) + 2)
= 2D(7) - 2D(5) + E(4)
= 2D(7) - 2D(5) + (2D(3) - E(2) + 2)
= 2D(7) - 2D(5) + 2D(3) + 2 - D(1)
= 2D(7) - 2D(5) + 2D(3) - D(1) + 2
Okay... that's really, really interesting. It seems like we're getting an alternating sum of the D recurrence, where we alternate between including and excluding 2. At this point, if I had to make a guess, I'd say that the way to solve this recurrence is going to be to think about subdividing the even case further into cases where the inputs are 2n for an even n and 2n for an odd n. In fact, notice that if the input is 2n for even n, then there will be a +2 term at the end, whereas if the input is 2n for odd n, then there won't be (all the +2's are balanced out by -2's).
Now, let's turn to a different aspect of the problem. You weren't asked to query for individual terms of the recurrence. You were asked to query for the sum of the recurrence, evaluated over a range of inputs. The fact that we're getting alternating sums and differences of the D terms here is really, really interesting. For example, what is f(10) + f(11) + f(12)? Well, we know that f(11) = D(11), which we can compute directly. And we also know that f(10) and f(12) are E(10) and E(12). And watch what happens if we evaluate E(10) + E(12):
E(10) + E(12)
= (2D(9) - 2D(7) + 2D(5) - 2D(3) + D(1)) + (2D(11) - 2D(9) + 2D(7) - 2D(5) + 2D(3) - D(1) + 2)
= 2D(11) + (2D(9) - 2D(9)) + (2D(7) - 2D(7)) + (2D(5) - 2D(5)) + (2D(3) - 2D(3)) + (D(1) - D(1)) + 2
= 2D(11) + 2.
Now that's interesting. Notice that all of the terms have cancelled out except for the 2D(11) term and the +2 term! More generally, this might lead us to guess that there's some rule about how to simplify E(n+2) + E(n). In fact, there is, and it falls straight out of the definition E(n+2) = 2D(n+1) - E(n) + 2. Specifically:
E(2n) + E(2n+2) = 2D(2n+1) + 2
This means that if we're summing up lots of consecutive values in a range, every pair of adjacent even terms will simplify instantly to something of the form 2D(2n+1) + 2.
There's still some more work to be done here. For example, you'll need to be able to sum up enormous numbers of D(n) terms, and you'll need to factor in the effects of all the +2 terms. I'll leave those to you to figure out.
One hint: all the values you're asked to return are modulo some number P. This means that the sequence of values 0, D(1), D(1) + D(3), D(1) + D(3) + D(5), D(1) + D(3) + D(5) + D(7), etc. eventually has to reach 0 again (mod P). You can both compute how many terms have to happen before this occurs and write down all the values encountered when doing this by just computing these values explicitly. That will enable you to sum up huge numbers of consecutive D terms in a row - you can mod the number of terms by the length of the cycle, then look up the residual sum in the table.
Hope this helps!
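A quick way to sanity-check patterns like these is to tabulate f directly from its definition and compare. The Java sketch below does that for two facts: the odd terms are powers of 3, and moving E(n) to the left-hand side of E(n+2) = 2D(n+1) - E(n) + 2 gives E(n) + E(n+2) = 2D(n+1) + 2. The class and method names are made up, and it ignores the modulus P and the large inputs mentioned in the question.

```java
final class EvenOddRecurrence {
    /** f(1..max) computed straight from the definition; index 0 is unused. Requires max >= 2. */
    static long[] table(int max) {
        long[] f = new long[max + 1];
        f[1] = 1;
        f[2] = 1;
        for (int n = 1; n + 2 <= max; n++) {
            // Odd n: f(n+2) = 3 f(n). Even n: f(n+2) = 2 f(n+1) - f(n) + 2.
            f[n + 2] = (n % 2 == 1) ? 3 * f[n] : 2 * f[n + 1] - f[n] + 2;
        }
        return f;
    }

    /** 3^n by repeated multiplication (fine for the small n used here). */
    static long pow3(int n) {
        long result = 1;
        for (int i = 0; i < n; i++) {
            result *= 3;
        }
        return result;
    }

    public static void main(String[] args) {
        long[] f = table(20);
        for (int n = 1; 2 * n + 2 <= 20; n++) {
            boolean oddOk = f[2 * n + 1] == pow3(n);
            boolean pairOk = f[2 * n] + f[2 * n + 2] == 2 * f[2 * n + 1] + 2;
            System.out.println("n=" + n + " odd term ok: " + oddOk + ", even pair ok: " + pairOk);
        }
    }
}
```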

How do you determine the average-case complexity of this algorithm?

It's usually easy to calculate the time complexity for the best case and the worst case, but when it comes to the average case especially when there's a probability p given, I don't know where to start.
Let's look at the following algorithm to compute the product of all the elements in a matrix:
int computeProduct(int[][] A, int m, int n) {
    int product = 1;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (A[i][j] == 0) return 0; // early exit: the whole product is 0
            product = product * A[i][j];
        }
    }
    return product;
}
Suppose p is the probability of A[i][j] being 0 (i.e. the algorithm terminates there, return 0); how do we derive the average case time complexity for this algorithm?
Let’s consider a related problem. Imagine you have a coin that flips heads with probability p. How many times, on expectation, do you need to flip the coin before it comes up heads? The answer is 1/p, since
There’s a p chance that you need one flip.
There’s a p(1-p) chance that you need two flips (the first flip has to go tails and the second has to go heads).
There’s a p(1-p)^2 chance that you need three flips (the first two flips need to go tails and the third has to go heads)
...
There’s a p(1-p)^(k-1) chance that you need k flips (the first k-1 flips need to go tails and the kth needs to go heads.)
So this means the expected value of the number of flips is
p + 2p(1 - p) + 3p(1 - p)^2 + 4p(1 - p)^3 + ...
= p(1(1 - p)^0 + 2(1 - p)^1 + 3(1 - p)^2 + ...)
So now we need to work out what this summation is. The general form is
p sum from k = 1 to infinity (k(1 - p)^(k-1)).
Rather than solving this particular summation, let's make this more general. Let x be some variable that, later, we'll set equal to 1 - p, but which for now we'll treat as a free value. Then we can rewrite the above summation as
p sum from k = 1 to infinity (kx^(k-1)).
Now for a cute trick: notice that the inside of this expression is the derivative of x^k with respect to x. Therefore, this sum is
p sum from k = 1 to infinity (d/dx x^k).
The derivative is a linear operator, so we can move it out to the front:
p d/dx sum from k = 1 to infinity (x^k)
That inner sum (x + x^2 + x^3 + ...) is the Taylor series for 1 / (1 - x) - 1, so we can simplify this to get
p d/dx (1 / (1 - x) - 1)
= p / (1 - x)^2
And since we picked x = 1 - p, this simplifies to
p / (1 - (1 - p))^2
= p / p^2
= 1 / p
Whew! That was a long derivation. But it shows that the expected number of coin tosses needed is 1/p.
Now, in your case, your algorithm can be thought of as tossing mn coins that come up heads with probability p and stopping if any of them come up heads. Surely, the expected number of coins you’d need to toss won’t be more than the case where you’re allowed to flip infinitely often, so your expected runtime is at most O(1 / p) (assuming p > 0).
If we assume that p is independent of m and n, then we can notice that after some initial growth, each term added to our summation as we increase the number of flips is exponentially smaller than the previous ones. More specifically, assuming the entries are independent, stopping after at most mn flips gives an expected count of exactly (1 - (1 - p)^(mn)) / p, which is within a constant factor of 1 / p as soon as mn is at least about 1 / p. Therefore, provided that mn is that large, the sum ends up being Θ(1 / p); for smaller matrices it is Θ(mn). So in a big-O sense, if p is a fixed constant that doesn't depend on m or n, the expected runtime is Θ(1 / p).
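To see how quickly the truncated sum approaches 1 / p, here is a small Java sketch that adds up the probability of reaching each entry. It assumes every entry is 0 independently with probability p; the names ExpectedInspections and expected are hypothetical.

```java
final class ExpectedInspections {
    /**
     * Expected number of entries the loop reads when each of `total` entries
     * is 0 independently with probability p.
     */
    static double expected(int total, double p) {
        double expectation = 0.0;
        double survive = 1.0; // probability that no zero was seen among the earlier entries
        for (int k = 1; k <= total; k++) {
            expectation += survive; // entry k is read exactly when all earlier ones were non-zero
            survive *= 1.0 - p;
        }
        return expectation;
    }

    public static void main(String[] args) {
        double p = 0.01;
        for (int total : new int[] {10, 100, 1000, 10000}) {
            System.out.println("mn = " + total + ": " + expected(total, p) + " (1/p = " + 1.0 / p + ")");
        }
    }
}
```

The loop computes the sum of (1 - p)^(k-1) for k = 1..mn, which is another way of writing the expectation of a geometric variable capped at mn.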

Hot and Cold Binary Search Game

Hot or cold.
I think you have to do some sort of binary search but I'm not sure how.
Your goal is to guess a secret integer between 1 and N. You
repeatedly guess integers between 1 and N. After each guess you learn
if it equals the secret integer (and the game stops); otherwise
(starting with the second guess), you learn if the guess is hotter
(closer to) or colder (farther from) the secret number than your
previous guess. Design an algorithm that finds the secret number in lg
N + O(1) guesses.
Hint: Design an algorithm that solves the problem in lg N + O(1)
guesses assuming you are permitted to guess integers in the range -N
to 2N.
I've been racking my brain and I can't seem to come up with a lg N + O(1) solution.
I found this: http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1316188034 but could not understand the diagram and it did not describe other possible cases.
Suppose you know that your secret integer is in [a,b], and that your last guess is c.
You want to divide your interval by two, and to know whether your secret integer lies in between [a,m] or [m,b], with m=(a+b)/2.
The trick is to guess d, such that (c+d)/2 = (a+b)/2.
Without loss of generality, we can suppose that d is bigger than c. Then, if d is hotter than c, your secret integer will be bigger than (c+d)/2 = (a+b)/2 = m, and so your secret integer will lie in [m,b]. If d is cooler than c, your secret integer will belong to [a,m].
You need to be able to guess between -N and 2N because you can't guarantee that c and d as defined above will always be in [a,b]. Your first two guesses can be 1 and N.
So you are dividing your interval by two at each guess, and the complexity is log(N) + O(1).
A short example to illustrate this (results chosen randomly):
Guess   Result   Interval of the secret number
1       ***      [1, N]                          // d = a + b - c
N       cooler   [1, N/2]                        // N = 1 + N - 1
-N/2    cooler   [N/4, N/2]                      // -N/2 = 1 + N/2 - N
5N/4    hotter   [3N/8, N/2]                     // 5N/4 = N/4 + N/2 + N/2
-3N/8   hotter   [3N/8, 7N/16]                   // -3N/8 = 3N/8 + N/2 - 5N/4
...     ...      ...
Edit, suggested by @tmyklebu:
We still need to prove that our guess will always fall in [-N, 2N].
By induction, suppose that c (our previous guess) is in [a-(a+b), b+(a+b)] = [-b, a+2b].
Then d = a+b-c <= a+b-(-b) = a+2b and d = a+b-c >= a+b-(a+2b) = -b.
Base case: a=1, b=N, c=1; c is indeed in [-b, a+2*b].
QED
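The mirroring rule d = a + b - c can be sketched in Java as follows. This is one possible reading of the answer, not a reference solution: the Game class is a made-up stand-in for the judge, the return codes of ask are my own convention, and the d == c branch (shift the mirror by one) is an added assumption to keep the loop from stalling when the previous guess sits exactly on the midpoint.

```java
final class HotColdMirror {
    /** Simulated judge (hypothetical): remembers the previous guess and counts guesses. */
    static final class Game {
        private final long secret;
        private long prev;
        private boolean hasPrev = false;
        int guesses = 0;

        Game(long secret) { this.secret = secret; }

        /** Returns 2 on a hit, +1 if hotter than the previous guess, -1 if colder, 0 if equally far (or first guess). */
        int ask(long g) {
            guesses++;
            int result;
            if (g == secret) {
                result = 2;
            } else if (!hasPrev) {
                result = 0;
            } else {
                result = Integer.signum(Long.compare(Math.abs(prev - secret), Math.abs(g - secret)));
            }
            prev = g;
            hasPrev = true;
            return result;
        }
    }

    /** Finds the secret in [1, n]; guesses may fall outside [1, n]. */
    static long solve(Game game, long n) {
        long a = 1, b = n, c = 1;
        if (game.ask(c) == 2) return c;      // the first guess only sets a reference point
        while (a < b) {
            long sum = a + b;                // we want c + d == a + b, so (c + d) / 2 is the midpoint
            long d = sum - c;
            if (d == c) {                    // c is exactly the midpoint: mirror about midpoint + 1/2 instead
                d = c + 1;
                sum = c + d;
            }
            int r = game.ask(d);
            if (r == 2) return d;
            if (r == 0) return sum / 2;      // equally far from c and d: the secret is their midpoint
            // The secret is above the mirror point if we moved up and got hotter,
            // or moved down and got colder.
            boolean upper = (r > 0) == (d > c);
            if (upper) {
                a = sum / 2 + 1;
            } else {
                b = (sum - 1) / 2;
            }
            c = d;
        }
        return a;
    }

    public static void main(String[] args) {
        Game game = new Game(7);
        System.out.println(solve(game, 10) + " found in " + game.guesses + " guesses");
    }
}
```

Every pass through the loop spends one guess and leaves at most half of the interval (rounded up), which is where the lg N + O(1) bound comes from.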
This was a task at IOI 2010, for which I sat on the Host Scientific Committee. (We asked for an optimal solution instead of simply lg N + O(1), and what follows is not quite optimal.)
Not swinging outside -N .. 2N and using lg N + 2 guesses is straightforward; all you need to do is show that the obvious translation of binary search works.
Once you have something that doesn't swing outside -N .. 2N and takes lg N + 2 guesses, do this:
Guess N/2, then N/2+1. This tells you which half of the array the answer is in. Then guess the end of that half-array. You're either in one of the two "middle" quarters or you're in one of the two "end" quarters. If you're in a middle quarter, do the thing before and you win in lg N + 4 guesses. The ends are slightly trickier.
Suppose I need to guess a number in 1 .. K without straying outside 1 .. N and my last guess was 1. If I guess K/2 and I'm colder, then I next guess 1; I spent two guesses to get a similar subproblem that's 1/4 the size. If K/2 is hotter, I know the answer is in K/4 .. K. Guess K/2-1 next. The two subcases are K/4 .. K/2-1 and K/2 .. K, both of which are nice. But it took me three guesses to (in the worst-case) halve the size of the problem; if I ever do this, I wind up doing lg N + 6 guesses.
The solution is close to binary search. At each step you have an interval that the number can be in. Start with the whole interval [1, N]. First guess both ends - that is the numbers 1 and N. One of them will be closer, thus you will know that now the number you are searching for is in [1, N/2] or in [N/2 + 1, N](considering N even for simplicity). Now you go to the next step having a twice smaller interval. Continue using the same approach. Keep in mind that you've already probed one of the ends, however it may not be your last guess.
I am not sure what you mean by lg N + O(1), but the approach I suggest will perform O(log(N)) operations; since each halving can cost two probes, in the worst case it will use about 2 * log2(N) probes.
Here're my two cents for this problem, since I got obsessed with it for two days. I'm not going to say anything new to what others have already said, but I'm going to explain it in a way that might get some people to understand the solution easily(or at least that was the way I managed to understand it).
Drawing from the ~ 2 lg N solution, if I knew that solution existed in [a, b] I'd want to know if it's in the left half [a, (a + b) / 2] or the right half [(a + b) / 2, b], with the point (a + b) / 2 separating the two halves. So what do I do? I guess a then b; if I get colder with b I know I'm in the first(left) half, if I get hotter I know I'm in the second(right) one. So guessing a and b is the way to know the secret integer position with respect to the mid point (a + b) / 2. However a and b aren't the only points that I can guess at to know the secret position. (a - 1, b + 1), (a - 2, b + 2), ... etc are all valid pairs of points to guess at to know the secret position, as the mid point of all these pairs is (a + b) / 2, the mid point of the original interval [a, b]. In fact any two numbers c and d such that (c + d) / 2 = (a + b) / 2 can be used.
So considering [a, b] as the interval we know the secret integer exists within, take c to be the last number we guessed. We want to determine the position of the secret with respect to the mid point (a + b) / 2, so we need a new number d to guess at to learn the secret's position relative to (a + b) / 2. How do we find such a number d? By solving the equation (c + d) / 2 = (a + b) / 2, which yields d = a + b - c. Guessing at that d, we shrink the range [a, b] appropriately based on the answer (colder or hotter) and then repeat the process, taking d as our last guess and trying a new guess at a number e, for example, under the same conditions.
To establish the initial conditions, we should start with a = 1, b = N, and c = 1. We guess at c to establish a reference (since the first guess can't tell you anything useful as there were no prior guesses). We then proceed with new guesses, adjusting the enclosing interval as appropriate with each guess. The table in @R2B2's answer explains it all.
You have to be vigilant, however, when trying to code this solution. When I tried to code it in Python, I first ran into the mistake of getting [a, b] stuck when it was small enough (like [a, a + 1]), where neither a nor b would move inwards. I had to pull the cases where the interval size was 2 out of the loop and handle them separately (like I did with intervals of size 1 also).
the task is quite a brain-cracker
but I'll try to keep it simple
to solve the problem in 2 lg N guesses, let's make the following guesses:
1 and N : to decide which half is hotter, (1, N/2) or (N/2, N); say (N/2, N) is the hotter half, then let's make the next guesses
N/2 and N : again to decide which half is hotter, (N/2, 3/4N) or (3/4N, N), and so on
... so we need 2 guesses for every half-division, therefore we make 2 * lg N guesses.
here we see that each time we need to repeat one of the previous interval's borders one more time - in our example the point 'N' is repeated.
to solve the problem in 1 * lg N guesses instead of 2 * lg N, we need to find a way to spend only one guess on each half-division; a good illustration of such a method is the image at http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1316188034
the idea is to avoid repeating one of the border points of each interval in subsequent steps, and instead spend only one guess per half-division step by mirroring points.
the smart idea is that when we want to decide which half of the current interval is hotter, we don't have to try only its borders; we can just as well try points located outside its borders, as long as the two guessing points are at equal distances from (in other words, mirrored about) the center of the interval of interest. it does no harm if these guessing points are negative (i.e. < 0) or > N (this is not prohibited by the conditions of the task)
so to guess the secret we can freely make guesses using points in the interval (-N, 2N)
public class HotOrCold {
    static int num = 5000;       // the secret number to search for
    private static int prev = 0; // the previous guess, updated by isHot

    public static void main(String[] args) {
        System.out.println(guess(1, Integer.MAX_VALUE));
    }

    // Halves [lo, hi] by guessing both ends, so it spends two guesses per halving.
    public static int guess(int lo, int hi) {
        while (hi >= lo) {
            if (hi == lo) {
                return lo;
            }
            boolean one = isHot(lo); // is lo closer than the previous guess?
            boolean two = isHot(hi); // is hi closer than lo?
            if (!(one || two)) {
                return lo + (hi - lo) / 2; // equal distance; written this way to avoid int overflow
            }
            if (two) {
                lo = hi - (hi - lo) / 2; // hi is hotter than lo, so drop the lower half
                continue;
            }
            if (one) {
                hi = lo + (hi - lo) / 2;
            }
        }
        return 0;
    }

    // Returns true if curr is strictly closer to the secret than the previous guess.
    public static boolean isHot(int curr) {
        boolean hot = Math.abs(num - curr) < Math.abs(num - prev);
        prev = curr;
        return hot;
    }
}

Algorithm to compute k fractions of form 1/r summing up to 1

Given k, we need to write 1 as a sum of k fractions of the form 1/r.
For example,
For k=2, 1 can uniquely be written as 1/2 + 1/2.
For k=3, 1 can be written as 1/3 + 1/3 + 1/3 or 1/2 + 1/4 + 1/4 or 1/6 + 1/3 + 1/2
Now, we need to consider all such sets of k fractions that sum up to 1 and return the highest denominator among all such sets; for instance, for the second example above (k=3), our algorithm should return 6.
I came across this problem in a coding competition and couldn't come up with an algorithm for it. A bit of Google searching later revealed that such fractions are called Egyptian fractions, but those are probably sets of distinct fractions summing up to a particular value (not like 1/2 + 1/2). Also, I couldn't find an algorithm to compute Egyptian fractions (if they are at all helpful for this problem) when their number is restricted to k.
If all you want to do is find the largest denominator, there's no reason to find all the possibilities. You can do this very simply:
public long largestDenominator(int k){
    long denominator = 1;
    for(int i=1;i<k;i++){
        denominator *= denominator + 1;
    }
    return denominator;
}
For you recursive types:
public long largestDenominator(int k){
    if(k == 1)
        return 1;
    long last = largestDenominator(k-1);
    return last * (last + 1); // or (last * last) + last
}
Why is it that simple?
To create the set, you need to insert the largest fraction that will keep it under 1 at each step(except the last). By "largest fraction", I mean by value, meaning the smallest denominator.
For the simple case k=3, that means you start with 1/2. You can't fit another half, so you go with 1/3. Then 1/6 is left over, giving you three terms.
For the next case k=4, you take that 1/6 off the end, since it won't fit under one, and we need room for another term. Replace it with 1/7, since that's the biggest value that fits. The remainder is 1/42.
Repeat as needed.
For example:
2 : [2,2]
3 : [2,3,6]
4 : [2,3,7,42]
5 : [2,3,7,43,1806]
6 : [2,3,7,43,1807,3263442]
As you can see, it rapidly becomes very large. Rapidly enough that you'll overflow a long if k>7. If you need to do so, you'll need to find an appropriate container (ie. BigInteger in Java/C#).
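For larger k, the same loop can be written with java.math.BigInteger. This is only a sketch of that substitution; the wrapper class name LargestDenominator is made up.

```java
import java.math.BigInteger;

final class LargestDenominator {
    /** Same recurrence as the long version, d -> d * (d + 1) starting from 1, without overflow. */
    static BigInteger largestDenominator(int k) {
        BigInteger d = BigInteger.ONE;
        for (int i = 1; i < k; i++) {
            d = d.multiply(d.add(BigInteger.ONE));
        }
        return d;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 9; k++) {
            System.out.println(k + " : " + largestDenominator(k));
        }
    }
}
```

The number of digits roughly doubles with each step, so even with BigInteger only fairly small k are practical.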
It maps perfectly to this sequence:
a(n) = a(n-1)^2 + a(n-1), a(0)=1.
You can also see the relationship to Sylvester's sequence:
a(n+1) = a(n)^2 - a(n) + 1, a(0) = 2
Wikipedia has a very nice article explaining the relationship between the two, as pointed out by Peter in the comments.
I never heard of Egyptian fractions before but here are some thoughts:
Idea
You can think of them geometrically:
Start with a unit square (1x1)
Draw either vertical or horizontal lines dividing the square into equal parts.
Repeat optionally the drawing of lines inside any of the sub-boxes evenly.
Stop any time you want.
The rectangles present will form a set of fractions of the form 1/n that add to 1.
You can count them and they might equal your 'k'.
Depending on how many equal sections you divided a rectangle into, it will tell whether you have 1/2 or 1/3 or whatever. 1/6 is 1/2 of 1/3 or 1/3 of 1/2. (i.e. You divided by 2 and then one of the sub-boxes by 3 OR the other way around.)
Idea 2
You start with 1 box. This is the fraction 1/1 with k=1.
When you sub-divide by n you add n to the count of boxes (k or of fractions summed) and subtract 1.
When you sub-divide any of those boxes, again, subtract 1 and add n, the number of divisions. Note that n-1 is the number of lines you drew to divide them.
More
You are going to start searching for the answer with k. Obviously k * 1/k = 1 so you have one solution right there.
How about k-1?
There's a solution there: (k-2) * 1/(k-1) + 2 * (1/((k-1)*2))
How did I get that? I made k-1 equal sections (with k-2 vertical lines) and then divided the last one in half horizontally.
Each solution is going to consist of:
taking a prior solution
using j fewer lines at some stage and dividing one of the boxes or sub-boxes into j+1 equal sections.
I don't know if all solutions can be formed by repeating this rule starting from k * 1/k
I do know you can get effective duplicates this way. For example: k * 1/k with j = 1 => (k-2) * 1/(k-1) + 2 * (1/((k-1)*2)) [from above] but k * 1/k with j = (k-2) => 2 * (1/((k-1)*2)) + (k-2) * 1/(k-1) [which just reverses the order of the parts]
Interesting
k = 7 can be represented by 1/2 + 1/4 + 1/8 + ... + 1/(2^6) + 1/(2^6) and the general case is 1/2 + ... + 1/(2^(k-1)) + 1/(2^(k-1)).
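Similarly, any odd k can be represented by 2/3 + 2/9 + ... + 3 * [1/(3^((k-1)/2))], that is, two copies of each power of 1/3 followed by three copies of the last one (for k = 5: 1/3 + 1/3 + 1/9 + 1/9 + 1/9).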
I suspect there are similar patterns for all integers up to k.

Resources