Interview question - Finding numbers - algorithm

I just got this question on a SE position interview, and I'm not quite sure how to answer it, other than brute force:
Given a natural number N, find two numbers, A and P, such that:
N = A + (A+1) + (A+2) + ... + (A+P-1)
P should be the maximum possible.
Ex: For N=14, A = 2 and P = 4
N = 2 + (2+1) + (2+2) + (2+4-1)
N = 2 + 3 + 4 + 5
Any ideas?

If N is even/odd, we need an even/odd number of odd numbers in the sum. This already halves the number of possible solutions. E.g. for N=14, there is no point in checking any combinations where P is odd.
Rewriting the formula given, we get:
N = A + (A+1) + (A+2) + ... + (A+P-1)
= P*A + 1 + 2 + ... + (P-1)
= P*A + (P-1)P/2 *
= P*(A + (P-1)/2)
= P/2*(2*A + P-1)
The last line means that N must be divisible by P/2, which also rules out a number of possibilities. E.g. 14 only has these divisors: 1, 2, 7, 14. So possible values for P would be 2, 4, 14 and 28. 14 and 28 are ruled out for obvious reasons (in fact, any P above N/2 can be ignored).
This should be a lot faster than the brute-force approach.
(* The sum of the first n natural numbers is n(n+1)/2)

With interview questions, it is often wise to think about the likely purpose of the question. If I were asking you this question, it would not be because I think you know the solution, but because I want to see you find the solution. Reformulating the problem, drawing implications, working out what is known, ... this is what I would like to see.
If you just sit and tell me "I do not know how to solve it", you immediately fail the interview.
If you say "I know how to solve it by brute force, and I am aware it will probably be slow", I will give you some hints or help to get you started. If that does not help, you most likely fail (unless you show some extraordinary skills to compensate for the fact that you are probably lacking something in the field of general problem analysis, e.g. you show how to implement a solution parallelized for many cores or implemented on a GPU).
If you bring me a ready solution but are unable to derive it, I will give you another similar problem, because I am not interested in the solution, I am interested in your thinking.

A + (A+1) + (A+2) + ... + (A+P-1) simplifies to P*A + (P*(P-1)/2), or equivalently P*(A+(P-1)/2).
Thus 2N = P*(2A+P-1), so P must be a divisor of 2N (not necessarily of N: for N=14 the answer is P=4). You could just enumerate all divisors of 2N, and then test each candidate P against the following:
Is A = (N-(P*(P-1)/2))/P (solved the first simplification for A) an integral number? (I assume it should be an integral number, otherwise it would be trivial.) If so, return it as a solution.
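To make the integrality test concrete, here is a minimal Python sketch. The function name max_run is made up for illustration, and I assume A must be at least 1. Rather than enumerating divisors, it simply tries every feasible P from the largest down, which is already O(sqrt(N)) because P terms starting at 1 sum to P(P+1)/2.

```python
def max_run(n):
    """Return (A, P) with n = A + (A+1) + ... + (A+P-1) and P as large as possible.

    Assumes n >= 1 and A >= 1 (hypothetical helper, not from the question).
    """
    # Largest p whose smallest possible sum 1 + 2 + ... + p still fits in n.
    p = 1
    while p * (p + 1) // 2 <= n:
        p += 1
    p -= 1
    # Try candidates from the largest down; the first hit maximizes P.
    for q in range(p, 0, -1):
        rest = n - q * (q - 1) // 2  # equals q * A if a solution exists
        if rest % q == 0:
            return rest // q, q
    return None  # unreachable for n >= 1, since q = 1 always works


print(max_run(14))
```

For N=14 this tries P=4 first, finds (14 - 6) / 4 = 2, and stops there.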

This can be solved using a 0-1 knapsack solution.
Observation : N/2 + N/2 + 1 > N
so our series is 1,2,...,N/2
Consider the constraints of W=N and vi =1 for all elements, I think this trivially maps to 0-1 knapsack, O(n^2)

Here is an O(n) solution.
It uses the formula for the sum of an arithmetic progression:
S = number_of_terms*(first_term + last_term)/2
Here our sum is N, the number of terms is P, the first term is A and the last term is A+P-1, so 2N = 2AP + P^2 - P.
Manipulating the above equation to solve for A, we can iterate P downwards from n until we get a valid A.
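    def solve(n, p):
        # Solve 2*n = 2*a*p + p**2 - p for a; integer division may round,
        # so condition() re-checks that the result is exact.
        return (2*n - p**2 + p) // (2*p)

    def condition(n, p, a):
        # a is valid only if it reproduces n exactly and is positive
        return 2*n == 2*a*p + p**2 - p and a > 0

    def find(n):
        # Largest p first, so the first valid (p, a) maximizes p
        for p in range(n, 0, -1):
            a = solve(n, p)
            if condition(n, p, a):
                return n, p, a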

Related

Understanding a Particular Recursive Algorithm

Which problem does the algorithm Delta below solve, where m, n >= 0 are integers?
So I'm finding the algorithm very hard to break down due to the nature of the nested recursion and how it calls on another recursive algorithm. If I had to guess I would say that Delta solves the LCS (longest common subsequence) problem, but I'm not able to give a good explanation as to why.
Could someone help me break down the algorithm and explain the recursion and how it works?
As you found out yourself, delta computes the product of two integers.
The recursion indeed makes this confusing to look at but the best way to gain intuition is to perform the computation by hand on some example data. But by looking at the functions separately, you will find that:
Gamma is just summation. Gamma(n, m) = Gamma(n, m - 1) + 1 essentially performs a naive summation where you count down the second term, while adding 1 to the result. Example:
3 + 3 =
(3 + 2) + 1 =
((3 + 1) + 1) + 1 =
(((3 + 0) + 1) + 1) + 1 =
6
Knowing this, we can simplify Delta:
Delta(n, m) = n + Delta(n, m - 1) (if m!=0, else return 0).
In the same way, we are counting down on the second factor, but instead of adding 1, we add n. This is indeed one definition of multiplication. It is easy to understand this if you manually solve an example just like above.
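Since the original pseudocode is not reproduced in this thread, the following Python sketch is only a reconstruction from the description above. The base cases and the exact shape of the nested call are my assumptions: Gamma(n, 0) = n, Delta(n, 0) = 0, and Delta(n, m) = Gamma(n, Delta(n, m - 1)).

```python
def gamma(n, m):
    # Assumed base case: adding zero leaves n unchanged.
    if m == 0:
        return n
    # Count the second argument down, adding 1 each time: computes n + m.
    return gamma(n, m - 1) + 1


def delta(n, m):
    # Assumed base case: multiplying by zero gives zero.
    if m == 0:
        return 0
    # n + delta(n, m - 1), with the addition itself done by gamma.
    return gamma(n, delta(n, m - 1))


print(gamma(3, 3), delta(3, 4))
```

Tracing delta(3, 2) by hand gives gamma(3, gamma(3, 0)) = gamma(3, 3) = 6, which is the nested recursion that makes the original hard to read.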

Solve recurrence relation in which there is a separate relation for even and odd values

Can someone help me how to solve these type of questions? What kind of approach should I follow?
Looking over the question, since you will be asked to
evaluate the recurrence lots of times
for very large inputs,
you will likely need to either
find a closed-form solution to the recurrence, or
find a way to evaluate the nth term of the recurrence in sublinear time.
The question, now, is how to do this. Let's take a look at the recurrence, which was defined as
f(1) = f(2) = 1,
f(n+2) = 3f(n) if n is odd, and
f(n+2) = 2f(n+1) - f(n) + 2 if n is even.
Let's start off by just exploring the recurrence to see if any patterns arise. Something that stands out here - the odd terms of this recurrence only depend on other odd terms in the recurrence. This means that we can imagine trying to split this recurrence into two smaller recurrences: one that purely deals with the odd terms, and one that purely deals with the even terms. Let's have D(n) be the sequence of the odd terms, and E(n) be the sequence of the even terms. Then we have
D(1) = 1
D(n+2) = 3D(n)
We only need to evaluate D on odd numbers, so we can play around with that to see if a pattern emerges:
D(2·0 + 1) = 1 = 3^0
D(2·1 + 1) = 3 = 3^1
D(2·2 + 1) = 9 = 3^2
D(2·3 + 1) = 27 = 3^3
The pattern here is that D(2n+1) = 3^n. And hey, that's great news! That means that we have a direct way of computing D(2n+1).
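If you want to convince yourself of that pattern, a small Python sketch that evaluates f straight from its definition is enough. The helper name f_terms is made up for illustration; it only serves as a check, not as a way to handle the huge inputs in the actual problem.

```python
def f_terms(limit):
    """Return a list f where f[m] is the m-th term (index 0 is unused)."""
    f = [0, 1, 1]  # f(1) = f(2) = 1
    for m in range(3, limit + 1):
        n = m - 2  # we are computing f(n + 2)
        if n % 2 == 1:
            f.append(3 * f[n])                 # n odd
        else:
            f.append(2 * f[n + 1] - f[n] + 2)  # n even
    return f


f = f_terms(21)
# Odd-indexed terms should be exact powers of three.
print(all(f[2 * k + 1] == 3 ** k for k in range(11)))
```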
With that in mind, notice that E(n) is defined as
E(2) = 1 = D(1)
E(n+2) = 2D(n+1) - E(n) + 2
Remember that we know the exact value of D(n+1), which is going to make our lives a lot easier. Let's see what happens if we iterate on this recurrence a bit. For example, notice that
E(8)
= 2D(7) - E(6) + 2
= 2D(7) + 2 - (2D(5) - E(4) + 2)
= 2D(7) - 2D(5) + E(4)
= 2D(7) - 2D(5) + (2D(3) - E(2) + 2)
= 2D(7) - 2D(5) + 2D(3) + 2 - D(1)
= 2D(7) - 2D(5) + 2D(3) - D(1) + 2
Okay... that's really, really interesting. It seems like we're getting an alternating sum of the D recurrence, where we alternate between including and excluding 2. At this point, if I had to make a guess, I'd say that the way to solve this recurrence is going to be to think about subdividing the even case further into cases where the inputs are 2n for an even n and 2n for an odd n. In fact, notice that if the input is 2n for odd n, then there won't be a +2 term at the end (all the +2's are balanced out by -2's), whereas if the input is 2n for even n, as with E(8) above, then there will be a +2 term left over at the end.
Now, let's turn to a different aspect of the problem. You weren't asked to query for individual terms of the recurrence. You were asked to query for the sum of the recurrence, evaluated over a range of inputs. The fact that we're getting alternating sums and differences of the D terms here is really, really interesting. For example, what is f(10) + f(11) + f(12)? Well, we know that f(11) = D(11), which we can compute directly. And we also know that f(10) and f(12) are E(10) and E(12). And watch what happens if we evaluate E(10) + E(12):
E(10) + E(12)
= (2D(9) - 2D(7) + 2D(5) - 2D(3) + D(1)) + (2D(11) - 2D(9) + 2D(7) - 2D(5) + 2D(3) - D(1) + 2)
= 2D(11) + (2D(9) - 2D(9)) + (2D(7) - 2D(7)) + (2D(5) - 2D(5)) + (2D(3) - 2D(3)) + (D(1) - D(1)) + 2
= 2D(11) + 2.
Now that's interesting. Notice that all of the terms have cancelled out except for the 2D(11) term and the +2 term! More generally, this might lead us to guess that there's some rule about how to simplify E(n+2) + E(n). In fact, there is, and it is just the recurrence for E rearranged. Specifically:
E(2n) + E(2n+2) = 2D(2n+1) + 2
This means that if we're summing up lots of consecutive values in a range, every pair of adjacent even terms will simplify instantly to something of the form 2D(2n+1) + 2.
There's still some more work to be done here. For example, you'll need to be able to sum up enormous numbers of D(n) terms, and you'll need to factor in the effects of all the +2 terms. I'll leave those to you to figure out.
One hint: all the values you're asked to return are modulo some number P. This means that the sequence of values 0, D(1), D(1) + D(3), D(1) + D(3) + D(5), D(1) + D(3) + D(5) + D(7), etc. eventually has to reach 0 again (mod P). You can both compute how many terms have to happen before this occurs and write down all the values encountered when doing this by just computing these values explicitly. That will enable you to sum up huge numbers of consecutive D terms in a row - you can mod the number of terms by the length of the cycle, then look up the residual sum in the table.
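Here is a rough Python sketch of that lookup-table idea. The function names are made up, and it assumes gcd(3, P) = 1, which is what guarantees the partial sums actually return to 0. It uses the fact that D(2n+1) = 3^n, so each partial sum is 3 times the previous one plus 1.

```python
def prefix_sum_table(p):
    """Partial sums 0, D(1), D(1)+D(3), ... (mod p), up to the first return to 0.

    Assumes gcd(3, p) == 1; otherwise the loop may never terminate.
    """
    table = [0]
    s = 1 % p  # D(1) = 1
    while s != 0:
        table.append(s)
        s = (3 * s + 1) % p  # 1 + 3 + ... + 3^k = 3 * (1 + ... + 3^(k-1)) + 1
    return table


def sum_first_odd_terms(k, table):
    """D(1) + D(3) + ... over the first k odd terms, modulo p."""
    return table[k % len(table)]


table = prefix_sum_table(7)
print(len(table), sum_first_odd_terms(10, table))
```

The table costs one pass over the cycle to build, so this only pays off when the cycle length is manageable and there are many queries.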
Hope this helps!

Efficiently grab some subsets that meet criteria [duplicate]

Given a set of consecutive numbers from 1 to n, I'm trying to find the number of subsets that do not contain consecutive numbers.
E.g., for the set [1, 2, 3], some possible subsets are [1, 2] and [1, 3]. The former would not be counted while the latter would be, since 1 and 3 are not consecutive numbers.
Here is what I have:
    def f(n)
      consecutives = Array(1..n)
      stop = (n / 2.0).round
      (1..stop).flat_map { |x|
        consecutives.combination(x).select { |combo|
          consecutive = false
          combo.each_cons(2) do |l, r|
            consecutive = l.next == r
            break if consecutive
          end
          combo.length == 1 || !consecutive
        }
      }.size
    end
It works, but I need it to work faster, under 12 seconds for n <= 75. How do I optimize this method so I can handle high n values no sweat?
I looked at:
Check if array is an ordered subset
How do I return a group of sequential numbers that might exist in an array?
Check if an array is subset of another array in Ruby
and some others. I can't seem to find an answer.
Suggested duplicate is Count the total number of subsets that don't have consecutive elements, although that question is slightly different as I was asking for this optimization in Ruby and I do not want the empty subset in my answer. That question would have been very helpful had I initially found that one though! But SergGr's answer is exactly what I was looking for.
Although @user3150716's idea is correct, the details are wrong. In particular, you can see that for n = 3 there are 4 subsets: [1],[2],[3],[1,3], while his formula gives only 3. That is because he missed the subset [3] (i.e. the subset consisting of just [i]), and that error accumulates for larger n. Also I think it is easier to think about if you start from 1 rather than n. So the correct formulas would be
f(1) = 1
f(2) = 2
f(n) = f(n-1) + f(n-2) + 1
Those formulas are easy to code using a simple loop in constant space and O(n) speed:
    def f(n)
      return 1 if n == 1
      return 2 if n == 2
      # calculate
      # f(n) = f(n-1) + f(n - 2) + 1
      # using simple loop
      v2 = 1
      v1 = 2
      i = 3
      while i <= n do
        i += 1
        v1, v2 = v1 + v2 + 1, v1
      end
      v1
    end
You can see this online together with the original code here
This should be pretty fast for any n <= 75. For much larger n you might require some additional tricks like noticing that f(n) is actually one less than a Fibonacci number
f(n) = Fib(n+2) - 1
and there is a closed formula for Fibonacci number that theoretically can be computed faster for big n.
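As a sanity check on the recurrence (not a replacement for the Ruby above), here is the same loop in Python next to a brute-force count over all non-empty subsets. Both function names are made up for illustration, and the brute force is only usable for small n.

```python
from itertools import combinations


def count_loop(n):
    """f(n) = f(n-1) + f(n-2) + 1 with f(1) = 1, f(2) = 2, in O(n) time."""
    if n == 1:
        return 1
    if n == 2:
        return 2
    v2, v1 = 1, 2
    for _ in range(3, n + 1):
        v1, v2 = v1 + v2 + 1, v1
    return v1


def count_brute(n):
    """Count non-empty subsets of 1..n with no two consecutive numbers."""
    total = 0
    for size in range(1, n + 1):
        for combo in combinations(range(1, n + 1), size):
            # combinations() yields sorted tuples, so check neighbouring gaps.
            if all(b - a > 1 for a, b in zip(combo, combo[1:])):
                total += 1
    return total


print(all(count_loop(n) == count_brute(n) for n in range(1, 13)))
```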
Let the number of subsets of {i, ..., n} with no consecutive numbers be f(i). Then f(i) is the sum of:
1) f(i+1) , the number of such subsets without i in them.
2) f(i+2) + 1 , the number of such subsets with i in them (hence leaving out i+1 from the subset)
So,
f(i)=f(i+1)+f(i+2)+1
f(n)=1
f(n-1)=2
f(1) will be your answer.
You can solve it using matrix exponentiation (http://zobayer.blogspot.in/2010/11/matrix-exponentiation.html) in O(log n) time.
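A minimal Python sketch of the matrix exponentiation route, under one assumption: it leans on the identity f(n) = Fib(n+2) - 1 (with Fib(1) = Fib(2) = 1) mentioned in the other answer, rather than exponentiating a matrix for the +1 recurrence directly. The helper names are made up.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def fib(k):
    """Fib(k) with Fib(1) = Fib(2) = 1, using O(log k) matrix multiplications."""
    result = [[1, 0], [0, 1]]  # identity matrix
    base = [[1, 1], [1, 0]]
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    # [[1, 1], [1, 0]]^k = [[Fib(k+1), Fib(k)], [Fib(k), Fib(k-1)]]
    return result[0][1]


def count_subsets(n):
    """Non-empty subsets of 1..n without consecutive numbers."""
    return fib(n + 2) - 1


print(count_subsets(3), count_subsets(75))
```

For n <= 75 the plain loop is already instant, so this only matters for much larger n.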

Max subset of arrays whose mean is larger than threshold

I recently came across the following problem, and so far got no insight on how to solve it.
Let S = {v1, v2, v3, ..., vn} be a set of n arrays defined on ℝ^6. That is, each array has 6 entries.
For a given set of arrays, let the mean of a dimension be the average between the coordinates corresponding to that dimension for all elements in the set.
Also, let us define a certain property P of a set of arrays as the lowest value amongst all means of a set (there is a total of 6 means, one for each dimension). For instance, if a certain set has {10, 4, 1, 5, 6, 3} as means for its dimensions, then P for this set is 1.
Now to the definition of the problem: Return the maximum cardinality amongst all the subsets S' of S such that P(S') ≥ T, T a known threshold value, or 0 if such subset does not exist. Additionally, output any maximal S' (such that P(S') ≥ T).
Summarising: Inputs: the set S and the threshold value T. Output: A certain subset S' (|S'| is evidently immediate).
I first began trying to come up with a greedy solution, but got no success. Then, I moved on to a dynamic programming approach, but could not establish a recursion that solved the problem. I could expand a little more on my thoughts on the solution, but I don't think they would be of much use, given how far I got (or didn't get).
Any help would be greatly appreciated!
Bruteforce evaluation through recursion would have a time complexity of O(2^n) because each array can either be present in the subset or not.
One (still inefficient but slightly better) way to solve this problem is by taking the help of Integer Linear Programming.
Define Xi = { 1 if array Vi is present in the maximal subset, 0 otherwise }, and let dji be the j-th coordinate of array Vi.
Hence cardinality k = summation(Xi) {i = 1 to n }
Also, since the mean of every dimension must be >= T, this means:
d11X1 + d12X2 + ... + d1nXn >= T*k
d21X1 + d22X2 + ... + d2nXn >= T*k
d31X1 + d32X2 + ... + d3nXn >= T*k
d41X1 + d42X2 + ... + d4nXn >= T*k
d51X1 + d52X2 + ... + d5nXn >= T*k
d61X1 + d62X2 + ... + d6nXn >= T*k
Objective function: Maximize( k )
Actually you should eliminate k by the cardinality equation but I have included it here for clarity.
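To show what the constraints look like once k is eliminated, here is a brute-force Python reference. It is not an ILP solver, it is exponential, and it is only meant for checking small inputs; the function name is made up. Substituting k = sum(Xi) turns each constraint into sum over i of (dji - T)*Xi >= 0, which is the test used below.

```python
from itertools import combinations


def largest_subset(arrays, t):
    """Return a largest subset whose mean in every dimension is >= t.

    Brute force over subsets, largest first. Returns [] if no subset works.
    """
    dims = len(arrays[0]) if arrays else 0
    for size in range(len(arrays), 0, -1):
        for subset in combinations(arrays, size):
            # mean_j >= t  <=>  sum over chosen arrays of (d_ji - t) >= 0
            if all(sum(v[j] - t for v in subset) >= 0 for j in range(dims)):
                return list(subset)
    return []


data = [(5, 5, 5, 5, 5, 5), (1, 9, 9, 9, 9, 9), (0, 0, 0, 0, 0, 0)]
print(largest_subset(data, 3))
```

With T = 3 the full set fails on the first dimension (mean 2), while the first two arrays together have a first-dimension mean of exactly 3.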

Computing number of permutations of two values, with a restriction on runs

I was thinking about ways to solve this other question about counting the number of values whose digits sum to a target, and decided to try the case where the range was of the form [0, base^N). So essentially you get N independent digits to work with, which is a simpler problem.
The number of ways N natural numbers can sum to a target T is easy to compute. If you think of it as placing N-1 dividers among T sticks, you should see the answer is (T+N-1)!/(T!(N-1)!).
However, our N natural numbers are restricted to [0, base) and so there will be fewer possibilities. I want to find a simple formula for this case as well.
The first thing I considered was deducting the number of possibilities where 'base' of the sticks had been replaced with a 'big stick'. Unfortunately, some possibilities are double counted because they have multiple places a 'big stick' could be inserted.
Any ideas?
You can use generating functions.
Assuming that the order matters, then you are looking for the coefficient of x^T in
(1 + x + x^2 + ... + x^b)(1 + x + x^2 + .. + x^b) ... n times
= (x^(b+1) - 1)^n/(x-1)^n
Using the binomial theorem (which works even for exponent -n), you should be able to write your answer as a sum of products of binomial coefficients.
Let b+1 = B.
Using the binomial theorem we have
(x^(b+1) - 1)^n = Sum_{r=0}^{n} (-1)^(n-r) * (n choose r) x^(Br)
1/(x-1)^n = (-1)^n * Sum_{s>=0} (n+s-1 choose s) x^s
The signs combine to (-1)^(n-r) * (-1)^n = (-1)^r, so the answer we need is:
Sum (-1)^r * (n choose r)*(n+s-1 choose s)
over all r and s subject to the condition that
Br + s = T.
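A short Python sketch of that sum, checked against direct enumeration. I am assuming each digit lies in [0, base), so the largest digit is b = base - 1 and B = base, and that n >= 1. The sign used is (-1)^r, which is what you get after accounting for 1/(x-1)^n = (-1)^n/(1-x)^n. The function names are made up, and math.comb needs Python 3.8 or later.

```python
from itertools import product
from math import comb


def count_by_formula(n, base, target):
    """Number of n-digit tuples with digits in [0, base) summing to target."""
    total = 0
    for r in range(n + 1):
        s = target - base * r  # enforce B*r + s = T with B = base
        if s < 0:
            break
        total += (-1) ** r * comb(n, r) * comb(n + s - 1, s)
    return total


def count_by_enumeration(n, base, target):
    """Direct count over all base**n tuples, for checking small cases only."""
    return sum(1 for digits in product(range(base), repeat=n) if sum(digits) == target)


print(count_by_formula(2, 10, 9), count_by_formula(2, 10, 18))
```

For two decimal digits this gives 10 ways to reach 9 and exactly one way (9 + 9) to reach 18.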
