Find n fibonacci numbers after a given number - algorithm

Is there a way to find n fibonacci numbers starting from a given k?
I know that the basic method would be to find all fibonacci numbers starting from 0, keep track of when a number in the series is greater than k, and then find n numbers from that point. But is there a simpler way?
What if I want to find only 3 fibonacci numbers after 5,000,000? Do I have to find all the numbers in the series starting from 0?
Also, if the only way to solve this would be to start from 0, then which approach would be better? The iterative or the recursive one?
Thanks.

Using the golden ratio you can calculate Nth fibonacci.
phi = 1.61803...
Xn=(phi^n - (1-phi)^n) / sqrt(5)
Where n starts with 0.
http://en.wikipedia.org/wiki/Golden_ratio#Relationship_to_Fibonacci_sequence
Inverting this formula gives you the position of a number relative to the neighbouring Fibonacci numbers. That is, if it yields a natural number n, your number is the nth Fibonacci number. If it yields a number with decimals, your number lies between the Fibonacci numbers at the previous and next natural indices. If the result is 2.7, it is between 2 and 3, so you are looking for fib(3), fib(4) and fib(5).
Or you can use Gessel's test.
A number n is a Fibonacci number if and only if
5*n^2+4 is a square number or 5*n^2-4 is a square number
So you could start counting up from your starting number (in this example 5*10^6) until you hit the first two Fibonacci numbers; every later one is the sum of the previous two.
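As a rough sketch of the Gessel-test idea, the following scans upward from k until two Fibonacci numbers are found and then continues by addition. The function names (`is_perfect_square`, `is_fibonacci`, `fibs_after`) are made up for illustration, and k is assumed to be a non-negative integer. Note that the scan is linear in the gap to the next Fibonacci number, so for large k it may well be slower than plain iteration from 0.

```python
from math import isqrt

def is_perfect_square(x):
    # isqrt is exact for arbitrarily large non-negative integers.
    if x < 0:
        return False
    r = isqrt(x)
    return r * r == x

def is_fibonacci(n):
    # Gessel's test: n is Fibonacci iff 5n^2+4 or 5n^2-4 is a perfect square.
    return is_perfect_square(5 * n * n + 4) or is_perfect_square(5 * n * n - 4)

def fibs_after(k, count):
    """Return `count` Fibonacci values strictly greater than k (k >= 0)."""
    found = []
    n = k + 1
    # Scan upward until two consecutive Fibonacci values are known.
    while len(found) < 2:
        if is_fibonacci(n):
            found.append(n)
        n += 1
    # Everything after that follows from the recurrence.
    while len(found) < count:
        found.append(found[-2] + found[-1])
    return found[:count]

print(fibs_after(50, 3))
```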

The Fibonacci sequence grows exponentially, which means you don't have to do very many iterations before you're above 5 million. In fact, counting from F(0) = 0, F(34) = 5,702,887 is already the first Fibonacci number above 5 million.
So I wouldn't look further than naive iteration, here in Python:
def fib(a0, k):
    # Skip ahead to the first Fibonacci number >= a0.
    a, b = 0, 1
    while a < a0:
        a, b = b, a + b
    # Yield the next k Fibonacci numbers from there.
    for _ in range(k):
        yield a
        a, b = b, a + b

print(list(fib(5000000, 3)))

You might want to check this out:
http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/fibFormula.html
This method uses Binet's formula, although I don't think it is worth using for large values of N.

Related

Find four factors of a number such that their product is maximum and their sum is the original number

Given number of test cases T and an integer N, you need to find four integers A,B,C,D , such that they're all factors of N(A|N,B|N,C|N,D|N), and N=A+B+C+D. Goal is to maximize A * B * C * D. If it's not possible to find such four factors simply return -1.
Input format for the problem is:
First line contains an integer T(1<=T<=40000), represents the number of test cases.
Each of the next T lines contains an integer N (1<=N<=40000, N^4 will not exceed 64 bit integer).
This question is on Hackerearth under the recursion category, but I'm not able to understand the algorithm in the editorial (editorial link: https://www.hackerearth.com/practice/basic-programming/recursion/recursion-and-backtracking/practice-problems/algorithm/divide-number-a410603f/editorial/).
In the editorial it is solved using unit fractions, but I'm not able to understand the algorithm (I've reproduced the editorial below in case you cannot open the link; the points I don't understand are marked with ***). A brute-force solution results in TLE (Time Limit Exceeded). Please provide an algorithm or pseudo-code using DFS or backtracking.
My brute-force approach: calculate the factors of a number 'n' in O(sqrt(n)) and store them in an array, then traverse the array to get A, B, C, D using four for loops. But for T (1<=T<=40000) test cases it gets TLE.
Editorial(If you are not able to open the above link):-
Consider the equation N = A+B+C+D. If we divide the equation by N, we get 1 = 1/A' + 1/B' + 1/C' + 1/D', where A' = N/A, B' = N/B, C' = N/C, D' = N/D are all integers, because A, B, C, D are factors of N.
So the original problem is equal to divide 1 into four unit fractions.
We can enumerate the unit fractions from large to small.
*** If we need to divide X into Y unit fractions, and the last unit fraction is 1/Z, we can enumerate unit fractions between 1/Z and X/Y(because we are enumerating the largest remaining fraction), and recursively solve.
*** After find all solutions to 1 = 1/A' + 1/B' + 1/C' + 1/D' (about 20 solutions if the numbers are in order), we can enumerate them in each test case. If A',B',C',D' are all factors of N, we can use this solution to update the answer.
Time Complexity: O(T), where T is the number of Test cases.
*** If we need to divide X into Y unit fractions, and the last unit fraction is 1/Z, we can enumerate unit fractions between 1/Z and X/Y(because we are enumerating the largest remaining fraction), and recursively solve.
Answer: We are trying to find all the combinations satisfying 1 = 1/A + 1/B + 1/C + 1/D. Initially, we have X=1 and Y=4, and we are enumerating 1/A as the largest fraction, which should be no less than X/Y = 1/4. Because this is the first element, there is no last fraction 1/Z. Suppose we choose A=3, so the last fraction 1/Z is 1/A=1/3, and X=1-1/3=2/3, and Y=3. Now we choose 1/B from [X/Y, 1/Z] = [2/9, 1/3], and do the same thing for the next steps.
*** After find all solutions to 1 = 1/A' + 1/B' + 1/C' + 1/D' (about 20 solutions if the numbers are in order), we can enumerate them in each test case. If A',B',C',D' are all factors of N, we can use this solution to update the answer.
Answer: Because 1/A should be no less than 1/4, A can only be 2, 3 or 4. If A==4, then A=B=C=D, which gives only one solution. If A==3, [X/Y, 1/Z] = [2/9, 1/3], so B can only be 3 or 4. If B==4, then in the next round C must be 4, since [X/Y, 1/Z] = [5/24, 1/4]; if B==3, then C can be 4, 5 or 6, because [X/Y, 1/Z] = [1/6, 1/3]. If A==2, [X/Y, 1/Z] = [1/6, 1/2], so B can be 3, 4, 5 or 6. You can do the rest of the calculation in code; it looks like many search branches can be cut off. (Ignore my enumeration order; you should start from A=2.)
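The enumeration described above can be sketched as follows. This is only an illustration of the unit-fraction idea, not the editorial's own code; the names `unit_fraction_splits`, `SOLUTIONS` and `best_product` are made up here. It enumerates denominators in non-decreasing order (so fractions go from large to small), keeps each next fraction 1/d inside [X/Y, 1/Z], and then checks the precomputed solutions against each N.

```python
from fractions import Fraction

def unit_fraction_splits(x, parts, min_den=1):
    """Yield non-decreasing denominator tuples whose unit fractions sum to x."""
    if parts == 1:
        # The remainder itself must be a unit fraction no larger than the last one.
        if x.numerator == 1 and x.denominator >= min_den:
            yield (x.denominator,)
        return
    # Smallest d with 1/d < x, but never below the previous denominator (1/d <= 1/Z).
    d = max(min_den, x.denominator // x.numerator + 1)
    # The largest remaining fraction must be at least x/parts, i.e. parts/d >= x.
    while Fraction(parts, d) >= x:
        for rest in unit_fraction_splits(x - Fraction(1, d), parts - 1, d):
            yield (d,) + rest
        d += 1

# All ways to write 1 as a sum of four unit fractions (computed once).
SOLUTIONS = list(unit_fraction_splits(Fraction(1), 4))

def best_product(n):
    """Max A*B*C*D with A+B+C+D = n and all four dividing n, or -1."""
    best = -1
    for dens in SOLUTIONS:
        if all(n % d == 0 for d in dens):
            a, b, c, e = (n // d for d in dens)
            best = max(best, a * b * c * e)
    return best

print(len(SOLUTIONS), best_product(12))
```

Because the solution list is fixed and small, each test case costs a constant number of divisibility checks, which is where the editorial's O(T) bound comes from.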
The time complexity of your code can be improved by using only 3 for loops and applying binary search over the sorted factors to find the fourth number, since binary search takes O(log(n)) time.
Time complexity = O(n^3 * log(n)), where n is the number of factors, and according to the constraints of the question it should be able to pass all the test cases.

How many times variable m is updated

Given the following pseudo-code, the question is how many times on average is the variable m being updated.
a[1...n]: array with n random elements
m = a[1]
for i = 2 to n do
    if a[i] < m then m = a[i]
end for
One might answer that since all elements are random, then the variable will be updated on average on half the number of iterations of the for loop plus one for the initialization.
However, I suspect that there must be a better (and possibly the only correct) way to prove it using the binomial distribution with p = 1/2. This way, the average number of updates of m would be
$$M = 1 + \sum_{k=1}^{n-1} k \binom{n}{k} p^k (1-p)^{n-k}$$
where $\binom{n}{k}$ is the binomial coefficient. I have tried to solve this, but I got stuck after a few steps and do not know how to continue.
Could someone explain which of the two answers is correct and, if it is the second one, show me how to calculate M?
Thank you for your time
Assuming the elements of the array are distinct, the expected number of updates of m is the nth harmonic number, H_n, which is the sum of 1/k for k ranging from 1 to n.
The summation formula can also be represented by the recursion:
H_1 = 1
H_n = H_{n-1} + 1/n (n > 1)
It's easy to see that the recursion corresponds to the problem.
Consider all permutations of n-1 numbers, and assume that the expected number of assignments is H_{n-1}. Now, every permutation of n numbers consists of a permutation of n-1 numbers, with a new largest number inserted in one of n possible insertion points: either at the beginning, or after one of the n-1 existing values. Since it is larger than every number in the existing series, it will only be assigned to m in the case that it was inserted at the beginning, and it never changes which of the other values get assigned. That has a probability of 1/n, and so the expected number of assignments of a permutation of n numbers is H_{n-1} + 1/n.
Since the expected number of assignments for a vector of length one is obviously 1, which is H1, we have an inductive proof of the recursion.
H_n is asymptotically equal to ln n + γ where γ is the Euler-Mascheroni constant, approximately 0.577. So it increases without limit, but quite slowly.
The values for which m is updated are called left-to-right minima (the mirror image of left-to-right maxima, also known as records), and you'll probably find more information about them by searching for those terms.
I liked @rici's answer, so I decided to elaborate on its central argument a little to make it clearer to me.
Let H[k] be the expected number of assignments needed to compute the min m of an array of length k, as indicated in the algorithm under consideration. We know that
H[1] = 1.
Now assume we have an array of length n > 1. The min can be in the last position of the array or not. It is in the last position with probability 1/n. It is not with probability 1 - 1/n. In the first case the expected number of assignments is H[n-1] + 1. In the second, H[n-1].
If we multiply the expected number of assignments of each case by their probabilities and sum, we get
H[n] = (H[n-1] + 1)*1/n + H[n-1]*(1 - 1/n)
= H[n-1]*1/n + 1/n + H[n-1] - H[n-1]*1/n
= 1/n + H[n-1]
which shows the recursion.
Note that the argument is valid only if the min is either in the last position or in one of the first n-1 positions, not in both places. Thus we are using the fact that all the elements of the array are different.
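The recursion can be checked by brute force for small n. The sketch below (function names are made up for illustration) averages the number of assignments over all permutations of n distinct values, counting the initial m = a[1] as one assignment, and compares the result with H_n using exact fractions.

```python
from fractions import Fraction
from itertools import permutations

def count_updates(a):
    """Number of assignments to m, including the initial m = a[0]."""
    m = a[0]
    updates = 1
    for x in a[1:]:
        if x < m:
            m = x
            updates += 1
    return updates

def expected_updates(n):
    """Exact average over all n! orderings of n distinct values."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_updates(p) for p in perms), len(perms))

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in range(1, 7):
    print(n, expected_updates(n), harmonic(n))
```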

Find only two numbers in array that evenly divide each other

Find the only two numbers in an array where one evenly divides the other - that is, where the result of the division operation is a whole number
Input array    Output
5 9 2 8        8/2 = 4
9 4 7 3        9/3 = 3
3 8 6 5        6/3 = 2
The brute force approach of having nested loops has time complexity of O(n^2). Is there any better way with less time complexity?
This question is part of advent of code.
Given an array of numbers A, you can identify the denominator by multiplying all the numbers together to give E, then testing each ith element by dividing E by A_i^2. If this is a whole number, you have found the denominator, as no other factors can be introduced by multiplication.
Once you have the denominator, it's a simple task to do a second, independent loop searching for the paired numerator.
This eliminates the n^2 comparisons.
Why does this work? First, we have a collection of n-2 non-divisors: a, b, c, d, e, ...
To complete the array, we also have numerator x and denominator y.
However, we know that x and only x has a factor of y, so it can be expressed as yz (z being the whole quotient from the division of x by y).
When we multiply out all the numbers, we end up with x*y*a*b*c*d*e..., but as x = yz, we can also write this as y^2*z*a*b*c*d*e...
When we loop through dividing by the square of the ith element of the array, for most of the elements we create a fraction, e.g. for a:
y^2*z*a*b*c*d*e... / a^2 = y^2*z*b*c*d*e... / a
However, for y and y only:
y^2*z*a*b*c*d*e... / y^2 = z*a*b*c*d*e...
Why doesn't this work? The same can be true of the other numbers. There's no guarantee that a and b can't produce another common factor when multiplied. Take the example of [9, 8, 6, 4]: 9 and 8 multiplied equals 72, and as between them they contribute the prime factors 3 and 2, 72 has a factor of 6, which is also in the array. When we multiply it all out to 1728, those factors combine with the original 6 so that the product divides evenly by 36.
How might this be fixed? More accurately, if y is a factor of x, then y's prime factors will uniquely be a subset of x's prime factors, so maybe things can be refined along those lines. Obtaining a prime factorization should not scale according to the size of the array, but comparing subsets would, so it's not clear to me if this is at all useful.
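A quick way to see both the idea and its failure mode is to list every element whose square divides the product. The helper name `product_test_candidates` is made up for this sketch, and positive integers are assumed.

```python
from math import prod

def product_test_candidates(values):
    """Elements v for which prod(values) is divisible by v squared."""
    total = prod(values)
    return [v for v in values if total % (v * v) == 0]

# Works on the first example: only the true denominator 2 passes.
print(product_test_candidates([5, 9, 2, 8]))
# Fails on the counterexample: 8 and 6 pass as well as the true denominator 4.
print(product_test_candidates([9, 8, 6, 4]))
```

So the product test can only narrow the field; any candidate it reports still has to be confirmed against the other elements.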
I think that O(n^2) is the best time complexity you can get without any assumptions on the data.
If you can't tell anything about the numbers, knowing that x and y do not divide each other tells you nothing about x and z or y and z for any x, y, z. Therefore, in the worst case you must check all pairs of numbers - equal to n Choose 2 = n*(n-1)/2 = O(n^2).
Clearly, we can get O(n * sqrt(m)), where m is the absolute value range, by listing the pairs of divisors of each element against a hash of unique values in the array. This can be more efficient than O(n^2) depending on the input.
5 9 2 8
list divisor pairs (at most sqrt m iterations per element m)
5 (1,5)
9 (1,9), (3,3)
2 (1,2)
8 (1,8), (2,4) BINGO!
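The divisor-pair listing above could look roughly like this in code. It is a sketch under the assumption that the array holds distinct positive integers; `find_dividing_pair` is a name invented here.

```python
from math import isqrt

def find_dividing_pair(values):
    """Return (x, y) with y dividing x, both taken from values, or None."""
    present = set(values)  # hash of the values for O(1) membership tests
    for x in values:
        # Enumerate divisor pairs (d, x // d); at most sqrt(x) iterations.
        for d in range(1, isqrt(x) + 1):
            if x % d != 0:
                continue
            for y in (d, x // d):
                if y != x and y in present:
                    return x, y
    return None

print(find_dividing_pair([5, 9, 2, 8]))
```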
If you prime factorise all the numbers in the array progressively into a tree, when we discover a completely factored number leaf while factoring another number, we know we've found the divisor.
However, given we don't know which number is the divisor, we do need to test all primes up to the divisor's largest factor. For a number of magnitude m, trial division only needs primes up to sqrt(m), while the number of primes below x is approximately x / ln(x). This means we will make at most about n * sqrt(m) / ln(sqrt(m)) operations with very basic factorization and no optimization.
To be a little more specific, the algorithm should keep track of four things: a common tree of explored prime factors, the original number from the array, its current partial factorization, and its position in the tree.
For each prime number, we should test all numbers in the array (repeatedly to account for repeated factors). If the number divides evenly, we a) update the partial factorization, b) add/navigate to the corresponding child to the tree, c) if the partial factorization is 1, we have found the last factor and can indicate a leaf by adding the terminating '1' child, and d) if not, we can check for other numbers having left a child '1' to indicate they are completely factored.
When we find a child '1', we can identify the other number by multiplying out the partial factorization (e.g. all the parents up the tree) and exit.
For further optimization, we can cache the factorization (both partial and full) of numbers. We can also stop checking further factors of numbers that have a unique factor, narrowing the field of candidates over time.

Number of Fibonacci numbers smaller than number k. Sub O(n)

Interview question: How many Fibonacci numbers exist that are less than a given number k? Can you find a function in terms of k that gives the number of Fibonacci numbers less than k?
Example: k = 6
Answer: 6, as (0, 1, 1, 2, 3, 5)
Easy enough, write a loop or use the recursive definition of Fibonacci. However, that sounds too easy... is there a way to do this using the closed-form definition? (https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression)
Here is a closed-form Python solution which is O(1). It uses Binet's formula (from the Wikipedia article that you linked to):
>>> from math import sqrt,log
>>> def numFibs(n): return int(log(sqrt(5)*n)/log((1+sqrt(5))/2))
>>> numFibs(10)
6
Which tracks with 1,1,2,3,5,8
The point is that the second term in Binet's formula is negligible and it is easy enough to invert the result of neglecting it.
The above formula counts the number of Fibonacci numbers which are less than or equal to n. It jumps by 1 with each new Fibonacci number. So, for example, numFibs(12) = 6 and numFibs(13) = 7. 13 is the 7th Fibonacci number, so if you want the number of Fibonacci numbers which are strictly smaller than n you have to introduce a lag. (One caveat: because the second term is dropped, the jump can arrive one step late at even-indexed Fibonacci numbers; for example, numFibs(8) evaluates to 5 rather than 6.) Something like:
def smallerFibs(n):
    if n <= 1:
        return 0
    else:
        return min(numFibs(n-1), numFibs(n))
Now smallerFibs(13) is still 6 but then smallerFibs(14) = 7. This is of course still O(1).
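To see how the closed form behaves, it can be compared against a plain loop. The sketch below is an illustration rather than part of the answer above: `num_fibs_original` mirrors numFibs, while `num_fibs_rounded` is a variant that shifts n by 0.5, relying on the fact that F_k is the nearest integer to phi^k / sqrt(5). It assumes n >= 1, counts terms of the sequence 1, 1, 2, 3, 5, ..., and uses floating point, so it should only be trusted for moderately sized n.

```python
from math import log, sqrt

PHI = (1 + sqrt(5)) / 2

def count_fibs_loop(n):
    """Count terms of 1, 1, 2, 3, 5, ... that are <= n by direct iteration."""
    count, a, b = 0, 1, 1
    while a <= n:
        count += 1
        a, b = b, a + b
    return count

def num_fibs_original(n):
    # Same expression as numFibs: invert Binet's formula without the psi term.
    return int(log(sqrt(5) * n) / log(PHI))

def num_fibs_rounded(n):
    # Variant: F_k <= n exactly when phi^k / sqrt(5) < n + 0.5.
    return int(log(sqrt(5) * (n + 0.5)) / log(PHI))

for n in (8, 10, 13):
    print(n, count_fibs_loop(n), num_fibs_original(n), num_fibs_rounded(n))
```

At n = 8 the loop and the rounded variant give 6, while the original expression gives 5, which is the off-by-one at even-indexed Fibonacci numbers.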
I think it's fairly easy to see the growth of this number, at least. By the Binet / De Moivre formula,
$f_n = (\varphi^n - \psi^n) / \sqrt{5}$
Since $|\psi| < 1 < \varphi$, then
$f_n \sim \varphi^n / \sqrt{5}$.
From this it follows that the number of Fibonacci numbers smaller than x grows like $\log_\varphi(\sqrt{5}\,x)$.

Greatest GCD between some numbers

We've got some nonnegative numbers. We want to find the pair with the maximum gcd. Actually, this maximum is more important than the pair!
For example if we have:
2 4 5 15
gcd(2,4)=2
gcd(2,5)=1
gcd(2,15)=1
gcd(4,5)=1
gcd(4,15)=1
gcd(5,15)=5
The answer is 5.
You can use the Euclidean Algorithm to find the GCD of two numbers.
// Euclidean algorithm: repeatedly replace (a, b) with (b, a mod b).
int gcd(int a, int b)
{
    while (b != 0)
    {
        int m = a % b;
        a = b;
        b = m;
    }
    return a;
}
If you want an alternative to the obvious algorithm, then assuming your numbers are in a bounded range, and you have plenty of memory, you can beat O(N^2) time, N being the number of values:
Create an array of a small integer type, indexed from 1 to the max input. O(1)
For each value, increment the count at every index which is a factor of that value (make sure the counts don't wrap around). O(N).
Starting at the end of the array, scan back until you find a count >= 2. O(1)
That tells you the max gcd, but doesn't tell you which pair produced it. For your example input, the computed array looks like this:
index:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
count:  4  2  1  1  2  0  0  0  0  0  0  0  0  0  1
I don't know whether this is actually any faster for the inputs you have to handle. The constant factors involved are large: the bound on your values and the time to factorise a value within that bound.
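A minimal sketch of the counting array, assuming positive integers and using trial division up to the square root to list each value's factors (the function names are invented for this example):

```python
def divisor_counts(values):
    """count[d] = how many input values are divisible by d (index 0 unused)."""
    limit = max(values)
    count = [0] * (limit + 1)
    for v in values:
        d = 1
        while d * d <= v:
            if v % d == 0:
                count[d] += 1
                if d != v // d:
                    count[v // d] += 1  # the paired factor, counted once
            d += 1
    return count

def max_pair_gcd(values):
    """Largest d dividing at least two of the values, or None if there is none."""
    count = divisor_counts(values)
    # Scan back from the largest index until a count of 2 or more appears.
    for g in range(len(count) - 1, 0, -1):
        if count[g] >= 2:
            return g
    return None

print(divisor_counts([2, 4, 5, 15])[1:])
print(max_pair_gcd([2, 4, 5, 15]))
```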
You don't have to factorise each value - you could use memoisation and/or a pregenerated list of primes. Which gives me the idea that if you are memoising the factorisation, you don't need the array:
Create an empty set of int, and a best-so-far value 1.
For each input integer:
    if it's less than or equal to best-so-far, continue.
    check whether it's in the set. If so, best-so-far = max(best-so-far, this-value), continue. If not:
        add it to the set
        repeat for all of its factors (larger than best-so-far).
Add/lookup in a set could be O(log N), although it depends what data structure you use. Each value has O(f(k)) factors, where k is the max value and I can't remember what the function f is...
The reason that you're finished with a value as soon as you encounter it in the set is that you've found a number which is a common factor of two input values. If you keep factorising, you'll only find smaller such numbers, which are not interesting.
I'm not quite sure what the best way is to repeat for the larger factors. I think in practice you might have to strike a balance: you don't want to do them quite in decreasing order because it's awkward to generate ordered factors, but you also don't want to actually find all the factors.
Even in the realms of O(N^2), you might be able to beat the use of the Euclidean algorithm:
Fully factorise each number, storing it as a sequence of exponents of primes (so for example 2 is {1}, 4 is {2}, 5 is {0, 0, 1}, 15 is {0, 1, 1}). Then you can calculate gcd(a,b) by taking the min value at each index and multiplying them back out. No idea whether this is faster than Euclid on average, but it might be. Obviously it uses a load more memory.
The optimisations I can think of are:
1) start with the two biggest numbers since they are likely to have most prime factors and thus likely to have the most shared prime factors (and thus the highest GCD).
2) When calculating the GCDs of other pairs you can stop your Euclidean algorithm loop if you get below your current greatest GCD.
Off the top of my head I can't think of a way that you can work out the greatest GCD of a pair without trying to work out each pair individually (and optimise a bit as above).
Disclaimer: I've never looked at this problem before and the above is off the top of my head. There may be better ways and I may be wrong. I'm happy to discuss my thoughts in more length if anybody wants. :)
There is no O(n log n) solution to this problem in general. In fact, the worst case is O(n^2) in the number of items in the list. Consider the following set of numbers:
2^20 3^13 5^9 7^2*11^4 7^4*11^3
Only the GCD of the last two is greater than 1, but the only way to know that from looking at the GCDs is to try out every pair and notice that one of them is greater than 1.
So you're stuck with the boring brute-force try-every-pair approach, perhaps with a couple of clever optimizations to avoid doing needless work when you've already found a large GCD (while making sure that you don't miss anything).
With some constraints, e.g. the numbers in the array are within a given range, say 1-1e7, it is doable in O(N log N) / O(MAX * log MAX), where MAX is the maximum possible value in A.
This is inspired by the sieve algorithm; I came across it in a Hackerrank challenge, where it is done for two arrays. Check their editorial.
find min(A) and max(A) - O(N)
create a binary mask, to mark which elements of A appear in the given range, for O(1) lookup; O(N) to build; O(MAX_RANGE) storage.
for every candidate a in the range [1, max(A)]:
    for aa = a; aa <= max(A); aa += a:
        if aa in A, increment a counter for a, and once the counter reaches 2 (i.e., you have two numbers divisible by a) compare a to the current max_gcd;
        store the top two candidates for each GCD candidate.
        could also ignore elements which are less than the current max_gcd;
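The sieve-style steps above might be sketched like this. It assumes positive integers with a modest maximum value, uses a frequency table instead of a binary mask so that duplicates count as a pair, and scans candidates from large to small so it can stop at the first hit; `max_pair_gcd_sieve` is a made-up name.

```python
def max_pair_gcd_sieve(values):
    """Largest g such that at least two input values are multiples of g."""
    limit = max(values)
    freq = [0] * (limit + 1)  # freq[v] = number of occurrences of v
    for v in values:
        freq[v] += 1
    # Try candidate gcds from the top; the first with two multiples wins.
    for g in range(limit, 0, -1):
        multiples = 0
        for m in range(g, limit + 1, g):
            multiples += freq[m]
            if multiples >= 2:
                return g
    return None

print(max_pair_gcd_sieve([2, 4, 5, 15]))
```

The inner loops together visit about MAX/1 + MAX/2 + ... + MAX/MAX indices in the worst case, which is the harmonic sum behind the O(MAX * log MAX) bound.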
Previous answer:
Still O(N^2) -- sort the array; this should eliminate some of the unnecessary comparisons;
max_gcd = 1
# assuming you want pairs of distinct elements.
sort(a) # assume in place
for ii = n - 1 : -1 : 0 do
    if a[ii] <= max_gcd
        break
    for jj = ii - 1 : -1 : 0 do
        if a[jj] <= max_gcd
            break
        current_gcd = GCD(a[ii], a[jj])
        if current_gcd > max_gcd:
            max_gcd = current_gcd
This should save some unnecessary computation.
There is a solution that would take O(n):
Let our numbers be a_i. First, calculate m=a_0*a_1*a_2*.... For each number a_i, calculate gcd(m/a_i, a_i). The number you are looking for is the maximum of these values.
I haven't proved that this is always true, but in your example, it works:
m=2*4*5*15=600,
max(gcd(m/2,2), gcd(m/4,4), gcd(m/5,5), gcd(m/15,15))=max(2, 2, 5, 5)=5
NOTE: This is not correct. If the number a_i has a factor p_j repeated twice, and if two other numbers also contain this factor, p_j, then you get the incorrect result p_j^2 instead of p_j. For example, for the set 3, 5, 15, 25, you get 25 as the answer instead of 5.
However, you can still use this to quickly filter out numbers. For example, in the above case, once you determine the 25, you can first do the exhaustive search for a_3=25 with gcd(a_3, a_i) to find the real maximum, 5, then filter out gcd(m/a_i, a_i), i!=3 which are less than or equal to 5 (in the example above, this filters out all others).
Added for clarification and justification:
To see why this should work, note that gcd(a_i, a_j) divides gcd(m/a_i, a_i) for all j!=i.
Let's call gcd(m/a_i, a_i) as g_i, and max(gcd(a_i, a_j),j=1..n, j!=i) as r_i. What I say above is g_i=x_i*r_i, and x_i is an integer. It is obvious that r_i <= g_i, so in n gcd operations, we get an upper bound for r_i for all i.
The above claim is not very obvious. Let's examine it a bit deeper to see why it is true: the gcd of a_i and a_j is the product of all prime factors that appear in both a_i and a_j (by definition). Now, multiply a_j with another number, b. The gcd of a_i and b*a_j is either equal to gcd(a_i, a_j), or is a multiple of it, because b*a_j contains all prime factors of a_j, and some more prime factors contributed by b, which may also be included in the factorization of a_i. In fact, gcd(a_i, b*a_j)=gcd(a_i/gcd(a_i, a_j), b)*gcd(a_i, a_j), I think. But I can't see a way to make use of this. :)
Anyhow, in our construction, m/a_i is simply a shortcut to calculate the product of all a_j, where j=1..n, j!=i. As a result, gcd(m/a_i, a_i) contains all gcd(a_i, a_j) as a factor. So, obviously, the maximum of these individual gcd results will divide g_i.
Now, the largest g_i is of particular interest to us: it is either the maximum gcd itself (if x_i is 1), or a good candidate for being one. To do that, we do another n-1 gcd operations, and calculate r_i explicitly. Then, we drop all g_j less than or equal to r_i as candidates. If we don't have any other candidate left, we are done. If not, we pick up the next largest g_k, and calculate r_k. If r_k <= r_i, we drop g_k, and repeat with another g_k'. If r_k > r_i, we filter out remaining g_j <= r_k, and repeat.
I think it is possible to construct a number set that will make this algorithm run in O(n^2) (if we fail to filter out anything), but on random number sets, I think it will quickly get rid of large chunks of candidates.
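A sketch of the bound-and-filter idea, assuming at least two positive integers. The names `upper_bounds` and `max_pair_gcd_filtered` are invented here; g_i = gcd(m/a_i, a_i) is used only as an upper bound on r_i, and the real r_i is computed explicitly for the most promising candidates first.

```python
from math import gcd, prod

def upper_bounds(values):
    """g_i = gcd(m / a_i, a_i), an upper bound on any gcd(a_i, a_j)."""
    m = prod(values)
    return [gcd(m // a, a) for a in values]

def max_pair_gcd_filtered(values):
    bounds = upper_bounds(values)
    # Visit indices from the largest bound to the smallest.
    order = sorted(range(len(values)), key=lambda i: bounds[i], reverse=True)
    best = 0
    for i in order:
        if bounds[i] <= best:
            break  # no remaining candidate can beat the current best
        # r_i: the true best gcd involving a_i, found with n-1 gcd calls.
        r = max(gcd(values[i], values[j])
                for j in range(len(values)) if j != i)
        best = max(best, r)
    return best

print(upper_bounds([3, 5, 15, 25]))
print(max_pair_gcd_filtered([3, 5, 15, 25]))
```

On 3, 5, 15, 25 the bounds are 3, 5, 15, 25, so the bound alone would report 25, while the explicit check settles on 5 after examining only the two largest candidates.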
pseudocode
function getGcdMax(array[])
    arrayUB=upperbound(array)
    if (arrayUB<1)
        error
    pointerA=0
    pointerB=1
    gcdMax=0
    do
        gcdMax=MAX(gcdMax,gcd(array[pointerA],array[pointerB]))
        pointerB++
        if (pointerB>arrayUB)
            pointerA++
            pointerB=pointerA+1
    until (pointerB>arrayUB)
    return gcdMax
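For comparison, the same try-every-pair approach is only a few lines in Python. This is a sketch equivalent in spirit to the pseudocode, with an invented function name.

```python
from itertools import combinations
from math import gcd

def max_pair_gcd_bruteforce(values):
    """Maximum gcd over all unordered pairs; needs at least two values."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    return max(gcd(a, b) for a, b in combinations(values, 2))

print(max_pair_gcd_bruteforce([2, 4, 5, 15]))
```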
