Recursion Puzzle - algorithm

Recently, one of my friends challenged me to solve this puzzle which goes as follows:
Suppose that you have two variables x and y. These are the only variables which can be used for storage in the program. There are three operations which can be done:
Operation 1: x = x+y
Operation 2: x = x-y
Operation 3: y = x-y
Now, you are given two number n1 and n2 and a target number k. Starting with x = n1 and y = n2, is there a way to arrive at x = k using the operations mentioned above? If yes, what is the sequence of operations which can generate x = k.
Example: If n1 = 16, n2 = 6 and k = 28 then the answer is YES. The sequence is:
Operation 1
Operation 1
If n1 = 19, n2 = 7 and k = 22 then the answer is YES. The sequence is:
Operation 2
Operation 3
Operation 1
Operation 1
Now, I have wrapped my head around the problem for too long but I am not getting any initial thoughts. I have a feeling that this is recursion but I do not know what should be the boundary conditions. It would be very helpful if someone can direct me towards an approach which can be used to solve this problem. Thanks!

Maybe not a complete answer, but a proof that a sequence exists if and only if k is a multiple of the greatest common divisor (GCD) of n1 and n2. Let's write G = GCD(n1, n2) for brevity.
First I'll prove that x and y are always integer multiples of the G. This proof is really straightforward by induction. Hypothesis: x = p * G and y = q * G, for some integers p and q.
Initially, the hypothesis holds by definition of G.
Each of the rules respects the induction hypothesis. The rules yield:
x + y = p * G + q * G = (p + q) * G
x - y = p * G - q * G = (p - q) * G
y - x = q * G - p * G = (q - p) * G
Due to this result, there can only be a sequence to k if k is an integer multiple of the GCD of n1 and n2.
For the other direction we need to show that any integer multiple of G can be achieved by the rules. This is definitely the case if we can reach x = G and y = G. For this we use Euclid's algorithm. Consider the second implementation in the linked wiki article:
function gcd(a, b)
while a ≠ b
if a > b
a := a − b
else
b := b − a
return a
This is a repetitive application of rules 2 and 3 and results in x = G and y = G.
Knowing that a solution exists, you can apply a BFS, as shown in Amit's answer, to find the shortest sequence.

Assuming a solution exists, finding the shortest sequence to get to it can be done using a BFS.
The pseudo code should be something like:
queue <- new empty queue
parent <- new map of type map:pair->pair
parent[(x,y)] = 'root' //special indicator to stop the search there
queue.enqueue(pair(x,y))
while !queue.empty():
curr <- queue.dequeue()
x <- curr.first
y <- curr.second
if x == target or y == target:
printSolAccordingToMap(parent,(x,y))
return
x1 <- x+y
x2 <- x-y
y1 <- x-y
if (x1,y) is not a key in parent:
parent[(x1,y)] = (x,y)
queue.enqueue(pair(x1,y))
//similarly to (x2,y) and (x,y1)
The function printSolAccordingToMap() simply traces back on the map until it finds the root, and prints it.
Note that this solution only finds the optimal sequence if one exists, but will cause infinite loop if one does not exist, so this is only partial answer yet.

Consider that you have both (x,y) always <= target & >0 if not you can always bring them in the range by simple operations. If you consider this constraints you can make a graph where there are O(target*target) nodes and edge you can find by doing an operation among three on that node. You now need to evaluate the shortest path from start position node to target node which is (target,any). The assumption here is (x,y) values always stay within (0,target). The time complexity is O(target*target*log(target)) using djikstra.

In the Vincent's answer, I think the proof is not complete.
Let us suppose two relatively prime numbers suppose n1=19 and n2=13 whose GCD will be 1. According to him, sequence exits if k is multiple of GCD.Since every number is multiple of 1. I think it is not possible for every k.

Related

Finding distinct pairs {x, y} that satisfies the equation 1/x + 1/y = 1/n with x, y, and n being whole numbers

The task is to find the amount of distinct pairs of {x, y} that fits the equation 1/x + 1/y = 1/n, with n being the input given by the user. Different ordering of x and y does not count as a new pair.
For example, the value n = 2 will mean 1/n = 1/2. 1/2 can be formed with two pairs of {x, y}, whcih are 6 and 3 and 4 and 4.
The value n = 3 will mean 1/n = 1/3. 1/3 can be formed with two pairs of {x, y}, which are 4 and 12 and 6 and 6.
The mathematical equation of 1/x + 1/y = 1/n can be converted to y = nx/(x-n) where if y and x in said converted equation are whole, they count as a pair of {x, y}. Using said converted formula, I will iterate n times starting from x = n + 1 and adding x by 1 per iteration to find whether nx % (x - n) == 0; if it yields true, the x and y are a new distinct pair.
I found the answer to limit my iteration by n times by manually computing the answers and finding the number of repetitions 'pattern'. x also starts with n+1 because otherwise, division by zero will happen or y will result in a negative number. The modulo operator is to indicate that the y attained is whole.
Questions:
Is there a mathematical explanation behind why the iteration is limited to n times? I found out that the limit of iteration is n times by doing manual computation and finding the pattern: that I only need to iterate n times to find the amount of distinct pairs.
Is there another way to find the amount of distinct pairs {x, y} other than my method above, which is by finding the VALUES of distinct pairs itself and then summing the amount of distinct pair? Is there a quick mathematical formula I'm not aware of?
For reference, my code can be seen here: https://gist.github.com/TakeNoteIAmHere/596eaa2ccf5815fe9bbc20172dce7a63
Assuming that x,y,n > 0 we have
Observation 1: both, x and y must be greater than n
Observation 2: since (x,y) and (y,x) do not count as distinct, we can assume that x <= y.
Observation 3: x = y = 2n is always a solution and if x > 2n then y < x (thus no new solution)
This means the possible values for x are from n+1 up to 2n.
A little algebra convers the equation
1/x + 1/y = n
into
(x-n)*(y-n) = n*n
Since we want a solution in integers, we seek integers f, g so that
f*g = n*n
and then the solution for x and y is
x = f+n, y = g+n
I think the easiest way to proceed is to factorise n, ie write
n = (P[1]^k[1]) * .. *(P[m]^k[m])
where the Ps are distinct primes, the ks positive integers and ^ denotes exponentiation.
Then the possibilities for f and g are
f = P[1]^a[1]) * .. *(P[m]^a[m])
g = P[1]^b[1]) * .. *(P[m]^b[m])
where the as and bs satisfy, for each i=1..m
0<=a[i]<=2*k[i]
b[i] = 2*k[i] - a[i]
If we just wanted to count the number of solutions, we would just need to count the number of fs, ie the number of distinct sequences a[]. But this is just
Nall = (2*k[1]+1)*... (2*[k[m]+1)
However we want to count the solution (f,g) and (g,f) as being the same. There is only one case where f = g (because the factorisation into primes is unique, we can only have f=g if the a[] equal the b[]) and so the number we seek is
1 + (Nall-1)/2

Find the value of f(T) for big value T

I am trying to solve a problem which is described below,
Given value of f(0) and k , which are integers.
I need to find value of f( T ). where T<=1010
Recursive function is,
f(n) = 2*f(n-1) , if 4*f(n-1) <=k
k - ( 2*f(n-1) ) , if 4*f(n-1) > k
My efforts,
#include<iostream>
using namespace std;
int main(){
long k,f0,i;
cin>>k>>f0;
long operation ;
cin>>operation;
long answer=f0;
for(i=1;i<=operation;i++){
answer=(4*answer <= k )?(2*answer):(k-(2*answer));
}
cout<<answer;
return 0;
}
My code gives me right answer. But, The code will run 1010 time in worst case that gives me Time Limit Exceed. I need more efficient solution for this problem. Please help me. I don't know the correct algorithm.
If 2f(0) < k then you can compute this function in O(log n) time (using exponentiation by squaring modulo k).
r = f(0) * 2^n mod k
return 2 * r >= k ? k - r : r
You can prove this by induction. The induction hypothesis is that 0 <= f(n) < k/2, and that the above code fragment computes f(n).
Here's a Python program which checks random test cases, comparing a naive implementation (f) with an optimized one (g).
def f(n, k, z):
r = z
for _ in xrange(n):
if 4*r <= k:
r = 2 * r
else:
r = k - 2 * r
return r
def g(n, k, z):
r = (z * pow(2, n, k)) % k
if 2 * r >= k:
r = k - r
return r
import random
errs = 0
while errs < 20:
k = random.randrange(100, 10000000)
n = random.randrange(100000)
z = random.randrange(k//2)
a1 = f(n, k, z)
a2 = g(n, k, z)
if a1 != a2:
print n, k, z, a1, a2
errs += 1
print '.',
Can you use methmetical solution before progamming and compulating?
Actually,
f(n) = f0*2^(n-1) , if f(n-1)*4 <= k
k - f0*2^(n-1) , if f(n-1)*4 > k
thus, your code will write like this:
condition = f0*pow(2, operation-2)
answer = condition*4 =< k? condition*2: k - condition*2
For a simple loop, your answer looks pretty tight; one could optimise a little bit using answer<<2 instead of 4*answer, and answer<<1 for 2*answer, but quite possibly your compiler is already doing that. If you're blowing the time with this, it might be necessary to reduce the loop itself somehow.
I can't figure out a mathematical pattern that #Shannon was going for, but I'm thinking we could exploit the fact that this function will sooner or later cycle. If the cycle is short enough, then we could short the loop by just getting the answer at the same point in the cycle.
So let's get some cycle detection equipment in the form of Brent's algorithm, and see if we can cut the loop to reasonable levels.
def brent(f, x0):
# main phase: search successive powers of two
power = lam = 1
tortoise = x0
hare = f(x0) # f(x0) is the element/node next to x0.
while tortoise != hare:
if power == lam: # time to start a new power of two?
tortoise = hare
power *= 2
lam = 0
hare = f(hare)
lam += 1
# Find the position of the first repetition of length λ
mu = 0
tortoise = hare = x0
for i in range(lam):
# range(lam) produces a list with the values 0, 1, ... , lam-1
hare = f(hare)
# The distance between the hare and tortoise is now λ.
# Next, the hare and tortoise move at same speed until they agree
while tortoise != hare:
tortoise = f(tortoise)
hare = f(hare)
mu += 1
return lam, mu
f0 = 2
k = 198779
t = 10000000000
def f(x):
if 4 * x <= k:
return 2 * x
else:
return k - 2 * x
lam, mu = brent(f, f0)
t2 = t
if t >= mu + lam: # if T is past the cycle's first loop,
t2 = (t - mu) % lam + mu # find the equivalent place in the first loop
x = f0
for i in range(t2):
x = f(x)
print("Cycle start: %d; length: %d" % (mu, lam))
print("Equivalent result at index: %d" % t2)
print("Loop iterations skipped: %d" % (t - t2))
print("Result: %d" % x)
As opposed to the other proposed answers, this approach actually could use a memo array to speed up the process, since the start of the function is actually calculated multiple times (in particular, inside brent), or it may be irrelevant, depending on how big the cycle happens to be.
The algorithm you proposed already has O(n).
To come up with more efficient algorithms, there is not that much direction we can go about. Some typical options we have
1.Decease the coefficients of the linear term( but I doubt it would make a difference in this case
2.Change to O(Logn)(typically use some sort of divide and conquer technique)
3.Change to O(1)
In this case, we can do the last one.
The recursion function is a piece-wise function
f(n) = 2*f(n-1) , if 4*f(n-1) <=k
k - ( 2*f(n-1) ) , if 4*f(n-1) > k
Let's tackle it by case:
case 1: if 4*f(n-1) <= k (1)(assuming the starting index is zero)
this is a obvious a geometry series
a_n = 2*a_n-1
Therefore, have the formula
Sn = 2^(n-1)f(0) ----()
Case 2: if 4*f(n-1) > k (2), we have
a_n = -2a_n-1 + k
Assuming, a_j is the element in the sequence which just satisfy condition (2)
Nestedly sub in an_1 to the formula, you will obtain the equation
an = k -2k +4k -8k... +(-2)^(n-j)* a_j
k -2k 4k -8... is another gemo series
Sn = k*(1-2^(n-j))/(1-2) ---gemo series sum formula with starting value k and ratio = -2
Therefore, we have a formula for an in the case 2
an = k * (1-2^(n-j))/(1-2) + (-2)^(n-j) * a_j ----(**)
All we left to do it to find aj which just dissatisfy condition (1) and satisfy (2)
This can be obtained in constant time again using the formula we have for case 1:
find n such that, 4*an = 4*Sn = 4*2^(n-1)*f(0)
solve for n: 4*2^(n-1)*f(0) = k, if n is not integer, take ceiling of n
In my first attempt to solve this question, I had wrong assumption that the value of the sequence is monotonically increasing but in fact the sequence might jump between case 1 and case 2. Therefore, there might not be constant algorithm to solve the problem.
However, we can use utilize the result above to skip iterative update complexity.
The overall algorithm will look something like:
start with T, K, and f(0)
compute n that make the condition switch using either (*) or (**)
update f(0) with f(n), update T - n
repeat
terminate when T-n = 0(the last iteration might over compute causing T-n<0, therefore, you need to go back a little bit if that happen)
Create a map that can store your results. Before finding f(n) check in that map, if solution is already existed or not.
If exists, use that solution.
Otherwise find it, store it for future use.
For C++:
Definition:
map<long,long>result;
Insertion:
result[key]=value
Accessing:
value=result[key];
Checking:
map<long,long>::iterator it=result.find(key);
if(it==result.end())
{
//key was not found, find the solution and insert into result
}
else
{
return result[key];
}
Use above technique for better solution.

Find the closest to a given natural number, which satisfies non-linear condition

I have three integers: A, B, C i need to find integer X, which is closest to C. If N is any natural number, then A, B and X should satisfy the following equation:
A*B*X=sqrt(N)
Could you help with algorithm?
we can do a binary search over all possible values of N, and compare corresponding value of X = sqrt(N)/(A*B)) to decide which half to carry on search.
A possible implementation in python could be -
A = 5
B = 2
C = 3
left = -10000000000000
right = 10000000000000 #assuming that's the maximum value N can take
while right-left>1:
N = (left+right)//2
X = N**.5/(A*B)
if X>C:
right = N
else:
left = N
N = (left+right)//2
print(N)
which in this case outputs: 900

Recursive division algorithm for two n bit numbers

In the below division algorithm, I am not able to understand why multiplying q and r by two works and also why r is incremented if x is odd.
Please give a theoretical justification of this recursive division algorithm.
Thanks in advance.
function divide(x, y)
if x = 0:
return (q, r) = (0, 0)
(q, r) = divide(floor(x/2), y)
q = 2q, r = 2r
if x is odd:
r = r + 1
if r ≥ y:
r = r − y, q = q + 1
return (q, r)
Let's assume you want to divide x by y, i.e. represent x = Q * y + R
Let's assume that x is even. You recursively divide x / 2 by y and get your desired representation for a smaller case: x / 2 = q * y + r.
By multiplying it by two, you would get: x = 2q * y + 2r. Looking at the representation you wanted to get for x in the first place, you see that you have found it! Let Q = 2q and R = 2r and you found the desired Q and R.
If x is odd, you again first get the desired representation for a smaller case: (x - 1) / 2 = q * y + r, multiply it by two: x - 1 = 2q * y + 2r, and send 1 to the right: x = 2q * y + 2r + 1. Again, you have found Q and R you wanted: Q = 2q, R = 2r + 1.
The final part of the algorithm is just normalization so that r < y. r can become bigger than y when you perform multiplication by two.
Algorithm PuzzleSolve(k,S,U) :
Input: An integer k, sequence S, and set U
Output: An enumeration of all k-length extensions to S using elements in U without repetitions
for each e in U do
Add e to the end of S
Remove e from U /e is now being used/
if k == 1 then
Test whether S is a configuration that solves the puzzle
if S solves the puzzle then
return "Solution found: " S
else
PuzzleSolve(k-1,S,U) /a recursive call/
Remove e from the end of S
Add e back to U e is now considered as unused
This algorithm enumerates every possible size-k ordered subset of U, and tests each subset for being
a possible solution to our puzzle. For summation puzzles, U = 0,1,2,3,4,5,6,7,8,9 and each position
in the sequence corresponds to a given letter. For example, the first position could stand for b, the
second for o, the third for y, and so on.

Dijkstra Algorithm on a graph modeling a network

We have a directed graph G = (V, E) for a comm. network with each edge having a probability of not failing r(u, v) (defined as edge weight) which lies in interval [0, 1]. The probabilities are independent, so that from one vertex to another, if we multiply all probabilities, we get the the probability of the entire path not failing.
I need an efficient algorithm to find a most reliable path from one given vertex to another given vertex (i.e., a path from the first vertex to the second that is least likely to fail). I am given that log(r · s) = log r + log s will be helpful.
This is what I have so far -:
DIJKSTRA-VARIANT (G, s, t)
for v in V:
val[v] ← ∞
A ← ∅
Q ← V to initialize Q with vertices in V.
val[s] ← 0
while Q is not ∅ and t is not in A
do x ← EXTRACT-MIN (Q)
A ← A ∪ {x}
for each vertex y ∈ Adj[x]
do if val[x] + p(x, y) < val[y]:
val[y] = val[x] + p(x, y)
s is the source vertex and t is the destination vertex. Of course, I have not exploited the log property as I am not able to understand how to use it. The relaxation portion of the algorithm at the bottom needs to be modified, and the val array will capture the results. Without log, it would probably be storing the next highest probability. How should I modify the algorithm to use log?
Right now, your code has
do if val[x] + p(x, y) < val[y]:
val[y] = val[x] + p(x, y)
Since the edge weights in this case represent probabilities, you need to multiply them together (rather than adding):
do if val[x] * p(x, y) > val[y]:
val[y] = val[x] * p(x, y)
I've changed the sign to >, since you want the probability to be as large as possible.
Logs are helpful because (1) log(xy) = log(x) + log(y) (as you said) and sums are easier to compute than products, and (2) log(x) is a monotonic function of x, so log(x) and x have their maximum in the same place. Therefore, you can deal with the logarithm of the probability, instead of the probability itself:
do if log_val[x] + log(p(x, y)) > log_val[y]:
log_val[y] = log_val[x] + log(p(x, y))
Edited to add (since I don't have enough rep to leave a comment): you'll want to initialize your val array to 0, rather than Infinity, because you're calculating a maximum instead of a minimum. (Since you want the largest probability of not failing.) So, after log transforming, the initial log_val array values should be -Infinity.
In order to calculate probabilities you should multiply (instead of add) in the relaxation phase, which means changing:
do if val[x] + p(x, y) < val[y]:
val[y] = val[x] + p(x, y)
to:
do if val[x] * p(x, y) < val[y]:
val[y] = val[x] * p(x, y)
Using the Log is possible if the range is (0,1] since log(0) = -infinity and log(1) = 0, it means that for every x,y in (0,1]: probability x < probability y than: log(x) < log(y). Since we are maintaining the same relation (between probabilities) this modification will provide the correct answer.
I think you'll be able to take it from here.
I think I may have solved the question partially.
Here is my attempt. Edits and pointers are welcome -:
DIJKSTRA-VARIANT (G, s, t)
for v in V:
val[v] ← 0
A ← ∅
Q ← V to initialize Q with vertices in V.
val[s] ← 1
while Q is not ∅ and t is not in A
do x ← EXTRACT-MAX (Q)
A ← A ∪ {x}
for each vertex y ∈ Adj[x]
do if log(val[x]) + log(p(x, y)) > log(val[y]):
log(val[y]) = log(val[x]) + log(p(x, y))
Since I am to find the highest possible probability values, I believe I should be using >. The following questions remain -:
What should the initial values in the val array be?
Is there anything else I need to add?
EDIT: I have changed the initial val values to 0. However, log is undefined at 0. I am open to a better alternative. Also, I changed the priority queue's method to EXTRACT-MAX since it is the larger probabilities that need to be extracted. This would ideally be implemented on a binary max-heap.
FURTHER EDIT: I have marked tinybike's answer as accepted, since they have posted most of the necessary details that I require. The algorithm should be as I have posted here.

Resources