The 1000th element which is product of 2, 3, 5 - algorithm

There is a sequence S.
Every element of S is a product of 2s, 3s, and 5s only (no other prime factors).
S = {2, 3, 4, 5, 6, 8, 9, 10, 12, 15, 16, 18, 20, 24 ...}
How to get the 1000th element in this sequence efficiently?
I check each number from 1, but this method is too slow.

A geometric approach:
Let s = 2^i . 3^j . 5^k, where the triple (i, j, k) belongs to the first octant of a 3D state space.
Taking the logarithm,
ln(s) = i.ln(2) + j.ln(3) + k.ln(5)
so that in the state space the iso-s surfaces are planes, which intersect the first octant along a triangle. On the other hand, the feasible solutions are the nodes of a square grid.
If one wants to produce the s-values in increasing order, one can keep a list of the grid nodes closest to the current s-plane*, on its "greater than" side.
If I am right, to move from one s-value to the next, it suffices to discard the current (i, j, k) and replace it by the three triples (i+1, j, k), (i, j+1, k) and (i, j, k+1), unless they are already there, and pick the next smallest s.
An efficient implementation will be by storing the list as a binary tree with the log(s)-value as the key.
If you are asking for the first N values, you will explore a pyramidal volume of state-space of height O(³√N), and base area O(³√N²), which is the number of tree nodes, hence the spatial complexity. Every query in the tree will take O(log(N)) comparisons (and O(1) operations to fetch the minimum), for a total of O(N.log(N)).
*More precisely, the list will contain all triples on the "greater than" side and such that no index can be decreased without getting on the other side of the plane.
Here is Python code that implements these ideas.
You will notice that the logarithms are converted to fixed point (7 decimals) to avoid floating-point inaccuracies that could make equal log(s)-values compare as different. Values of s recovered from these keys would be inexact in the last digits, but this does not matter as long as the ordering of the values is preserved. Recomputing the s-values from the indexes yields exact values.
import math
import bintrees

# Constants: logarithms in fixed point (7 decimals)
ln2 = round(10000000 * math.log(2))
ln3 = round(10000000 * math.log(3))
ln5 = round(10000000 * math.log(5))

# Initial list: key = fixed-point log(s), value = exponents (i, j, k)
t = bintrees.FastAVLTree()
t.insert(0, (0, 0, 0))

# Find the N first products
N = 100
for _ in range(N):
    # Current s: the entry with the smallest key
    key, (i, j, k) = t.pop_min()
    # Recompute s exactly from the exponents
    print(2 ** i * 3 ** j * 5 ** k)
    # Update the list, skipping keys that are already present
    if key + ln2 not in t:
        t.insert(key + ln2, (i + 1, j, k))
    if key + ln3 not in t:
        t.insert(key + ln3, (i, j + 1, k))
    if key + ln5 not in t:
        t.insert(key + ln5, (i, j, k + 1))
The 100 first values are
1 2 3 4 5 6 8 9 10 12
15 16 18 20 24 25 27 30 32 36
40 45 48 50 54 60 64 72 75 80
81 90 96 100 108 120 125 128 135 144
150 160 162 180 192 200 216 225 240 243
250 256 270 288 300 320 324 360 375 384
400 405 432 450 480 486 500 512 540 576
600 625 640 648 675 720 729 750 768 800
810 864 900 960 972 1000 1024 1080 1125 1152
1200 1215 1250 1280 1296 1350 1440 1458 1500 1536
The plot of the number of tree nodes confirms the O(³√N²) spatial behavior.
Update:
When there is no risk of overflow, a much simpler version (not using logarithms) is possible:
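import bintrees

# Initial list
t = bintrees.FastAVLTree()
t[1] = None

# Find the N first products
N = 100
for _ in range(N):
    # Current s (the stored value is unused)
    s, _unused = t.pop_min()
    print(s)
    # Update the list; assigning to an existing key changes nothing,
    # so duplicates such as 2*3 and 3*2 collapse into one node
    t[2 * s] = None
    t[3 * s] = None
    t[5 * s] = None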
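For readers without the third-party bintrees package, the same idea can be sketched with the standard library. This is only an illustration under stated assumptions: a binary heap (heapq) plus a set of already queued values stands in for the balanced tree, and the function name first_products is made up for this example.

```python
import heapq

def first_products(n):
    """Return the n smallest numbers of the form 2^i * 3^j * 5^k, starting at 1."""
    heap = [1]      # candidates not yet emitted
    seen = {1}      # values already queued, to avoid duplicates such as 2*3 and 3*2
    out = []
    while len(out) < n:
        s = heapq.heappop(heap)         # next smallest product
        out.append(s)
        for p in (2, 3, 5):
            if p * s not in seen:
                seen.add(p * s)
                heapq.heappush(heap, p * s)
    return out

print(first_products(10))
```

Python integers do not overflow, so the "no risk of overflow" condition only matters when porting this to a language with fixed-width integers.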

Simply put, you just have to generate the ith number for consecutive values of i. Let Z be the set {2, 3, 5}. At the ith iteration, assume you have all (i-1) values generated in the previous iterations. To generate the next one, try every element of Z and, for each of them, find the least product it can form with an already generated value that is larger than the value generated at the (i-1)th iteration. Then take the smallest of these candidates as the ith value. A simple and not so efficient implementation is given below.
def generate_simple(N, Z):
    generated = [1]
    for i in range(1, N+1):
        minFound = -1
        minElem = -1
        for j in range(0, len(Z)):
            for k in range(0, len(generated)):
                candidateVal = Z[j] * generated[k]
                if candidateVal > generated[-1]:
                    if minFound == -1 or minFound > candidateVal:
                        minFound = candidateVal
                        minElem = j
                    break
        generated.append(minFound)
    return generated[-1]
As you may observe, this approach has a time complexity of O(N^2 * |Z|). An improvement in terms of efficiency would be to store, for each element of Z, where we left off scanning the array of generated values, in a second array, indicesToStart. Then, for each element of Z, we would scan the N values of the generated array only once over the whole run of the algorithm, which means the time complexity after such an improvement would be O(N * |Z|).
A simple implementation of the improvement based on the simple version provided above, is given below.
def generate_improved(N, Z):
    generated = [1]
    indicesToStart = [0] * len(Z)
    for i in range(1, N+1):
        minFound = -1
        minElem = -1
        for j in range(0, len(Z)):
            for k in range(indicesToStart[j], len(generated)):
                candidateVal = Z[j] * generated[k]
                if candidateVal > generated[-1]:
                    if minFound == -1 or minFound > candidateVal:
                        minFound = candidateVal
                        minElem = j
                    break
                indicesToStart[j] += 1
        generated.append(minFound)
        indicesToStart[minElem] += 1
    return generated[-1]
If you have a hard time understanding how complexity decreases with this algorithm, try looking into the difference in time complexity of any graph traversal algorithm when an adjacency list is used, and when an adjacency matrix is used. The improvement adjacency lists help achieve is almost exactly the same kind of improvement we get here. In a nutshell, you have an index for each element and instead of starting to scan from the beginning you continue from wherever you left the last time you scanned the generated array for that element. Consequently, even though there are N iterations in the algorithm(i.e. the outermost loop) the overall number of operations you make is O(N * |Z|).
Important Note: All the code above is a simple implementation for demonstration purposes, and you should consider it just as a pseudocode you can test. While implementing this in real life, based on the programming language you choose to use, you will have to consider issues like integer overflow when computing candidateVal.
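The index-per-element idea can also be written more compactly. The following is a minimal sketch, not the code above: the name nth_product is hypothetical, and it assumes the convention used in the question, where the sequence starts at 2 (so 1 sits at position 0 and is not counted).

```python
def nth_product(n, primes=(2, 3, 5)):
    """Return the n-th element of S = 2, 3, 4, 5, 6, 8, ... (1 is not counted)."""
    seq = [1]                    # seq[0] is the seed, seq[n] is the answer
    idx = [0] * len(primes)      # idx[j]: position in seq that primes[j] multiplies next
    while len(seq) <= n:
        candidates = [p * seq[i] for p, i in zip(primes, idx)]
        nxt = min(candidates)
        seq.append(nxt)
        # Advance every index that produced nxt, so ties (6 = 2*3 = 3*2) appear once.
        idx = [i + (c == nxt) for i, c in zip(idx, candidates)]
    return seq[n]

print([nth_product(k) for k in range(1, 15)])
```

Each index only ever moves forward, which is where the O(N * |Z|) bound comes from.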


Falling segment reunites with other segments with a probability, determine the expected medium length of the segment

This is a really tough problem, just a heads-up.
We have N segments, numbered from 1 to N and defined by their left and right points, {Left[i],Right[i]}.
The i-th segment is at height N-i. The first segment (the highest one) starts falling while the others remain fixed. If during the fall a segment i intersects another segment j in at least one point, then the two will reunite with the probability P[j]/Q[j], and the obtained segment will keep falling. From the reunion of two segments, {A,B} and {C,D}, the obtained segment will be {min(A,C),max(B,D)}.
You are asked to determine the expected (mean) length of the first segment once it has stopped interacting (i.e. after it has reached a height smaller than the height of any of the other segments). If this answer is a rational number U/V, you are asked to determine X such that X*V=U (mod 10^9+7)
Restrictions :
0 < P < Q < 1 000
0 < Left < Right < 1 000 000
N ≤ 100 000
time : 2.5 sec
memory : 32768 kbytes
The input contains N on the first line, then on the following N lines there are 4 integers : Left, Right, P, Q, representing the i-th segment [Left, Right] with a probability P/Q to reunite with the falling segment.
Example:
input:
5
35 64 58 873
41 70 407 729
18 90 165 628
10 57 33 104
60 69 152 466
output:
779316733
The answer is approximately 49.813963.
Idea 1
The length of the final segment is R-L where R is the location of the right end, and L is the location of the left end.
Expectation is a linear operation so
E(length) = E(R) - E(L)
We can compute E(R) and E(L) separately, then combine the results.
Idea 2
We can iteratively compute the PDF for the position of the left end.
It starts off being at the left end of the first segment (Left[1]) with probability 1.
When it falls past segment i, there will be an interesting collision if the left end is between Left[i] and Right[i]. We define an interesting collision to be one that affects the position of the left end.
The key point here is that if we need to know the current position of the right end to determine if there is a collision, then it is not an interesting collision! This is because if we need to know the right end, then the segment i must be completely to the right of the start point, and therefore it does not affect the position of the left edge.
So to update the PDF we collect up all the probability mass between Left[i] and Right[i], multiply by the probability of collision, and add the result to Left[i]. (The existing mass in those locations is scaled down by one minus the probability of collision.)
Idea 3
At the moment we have an O(n^2) algorithm made of n iterations of O(n) to count and modify the mass in each range.
However, we can use a data structure such as a segment tree to allow us to perform each iteration in O(logn) time for a total time complexity of O(nlogn).
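Ideas 1 and 2 can be checked on the example with a direct O(n^2) sketch. Everything here is an assumption made for illustration: the function names, the use of exact Fraction arithmetic, and a plain dictionary for the probability mass (the segment tree of Idea 3 is not implemented). The right end is handled symmetrically to the left end: mass inside [Left[i], Right[i]] moves to Right[i].

```python
from fractions import Fraction

def expected_end(start, obstacles, use_left):
    """Expected final position of one end of the falling segment.

    obstacles: (left, right, prob) tuples in the order they are passed.
    use_left:  True tracks the left end, False tracks the right end.
    """
    mass = {start: Fraction(1)}                  # position -> probability
    for left, right, prob in obstacles:
        target = left if use_left else right
        moved = Fraction(0)
        for pos in list(mass):
            # Interesting collision: the tracked end lies inside the obstacle.
            if left <= pos <= right and pos != target:
                moved += mass[pos] * prob
                mass[pos] *= 1 - prob
        if moved:
            mass[target] = mass.get(target, Fraction(0)) + moved
    return sum(pos * m for pos, m in mass.items())

def expected_length(segments):
    """segments: (Left, Right, P, Q) rows in input order; the first one falls."""
    left0, right0 = segments[0][0], segments[0][1]
    obstacles = [(l, r, Fraction(p, q)) for l, r, p, q in segments[1:]]
    return (expected_end(right0, obstacles, False)
            - expected_end(left0, obstacles, True))

example = [(35, 64, 58, 873), (41, 70, 407, 729), (18, 90, 165, 628),
           (10, 57, 33, 104), (60, 69, 152, 466)]
print(float(expected_length(example)))
```

Because the result is an exact Fraction U/V, the required X could then be obtained with a modular inverse, for instance U * pow(V, -1, 10**9 + 7) % (10**9 + 7) on Python 3.8 or later.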

3n+1 Optimization Idea for Larger Integers

I recently got into the book "Programming Challenges" by Skiena and Revilla and was somewhat surprised when I saw the solution to the 3n+1 problem, which was simply brute forced. Basically it's an algorithm that generates a list of numbers, dividing by 2 if even and multiplying by 3 and adding 1 if odd. This occurs until n=1 is reached, its base case. Now the trick is to find the maximum length of a list between integers i and j which in the problem ranges between 1 and 1,000,000 for both variables. So I was wondering how much more efficient (if so) a program would be with Dynamic Programming. Basically, the program would do one pass on the first number, i, find the total length, and then check each individual number within the array and store the associated lengths within a HashMap or other dictionary data type.
For Example:
Let's say i = 22 and j = 23
For 22:
22 11 34 17 52 26 13 40 20 10 5 16 8 4 2 1
This means that the dictionary would store
(22,16) , (11,15) , (34,14) and so on... until (1,1)
Now for 23:
23 70 35 106 53 160 80 40 ...
Since 40 was hit and is already in the dictionary, the program would take the length from 23 to 80, which is 7, and add it to the length previously stored for 40, which is 9, resulting in a total list length of 16. And of course the program would also store the lengths of 23, 70, 35, etc., so that later and bigger numbers should be computed faster.
So what are the opinions of approaching such a question in this manner?
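The dictionary scheme described above might look like the following sketch. The names cycle_length and cache are made up for this illustration, and the loop is iterative rather than recursive to avoid deep recursion on long chains.

```python
def cycle_length(n, cache):
    """Length of the 3n+1 list starting at n, filling cache along the way."""
    path = []
    while n not in cache:            # walk until a known value is hit
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = cache[n]
    for m in reversed(path):         # back-fill lengths for every value visited
        length += 1
        cache[m] = length
    return length

cache = {1: 1}                       # base case: the list for 1 is just [1]
print(cycle_length(22, cache), cycle_length(23, cache))
```

After the call for 22, the cache already holds (40, 9), so the call for 23 only walks 23, 70, 35, 106, 53, 160, 80 before stopping.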
I tried both approaches and submitted them to UVaOJ, the brute force solution got runtime ~0.3s and the dp solution ~0.0s. It gets pretty slow when the range gets long (like over 1e7 elements).
I just used an array (memo) to be able to memorize the first 5 million (SIZE) values:
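#define SIZE 5000000      /* number of memoized values */
static int memo[SIZE];    /* memo[n] == 0 means "not computed yet" */

int cycleLength(long long n)
{
    if (n < 1) // overflow
        return 0;
    if (n == 1)
        return 1;
    if (n < SIZE && memo[n] != 0)
        return memo[n];
    int res = 1 + cycleLength(n % 2 == 0 ? n / 2 : 3 * n + 1);
    if (n < SIZE)
        memo[n] = res;
    return res;
}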

How to scale down the values so they could fit inside the min and max values

I have 6 graph bars with the prices.
Each price number will represent its graphbar's height by respecting min and max heights.
What i want is that graph bar's height wouldn't go below or above the min and the max value.
So i have values of min = 55 and max = 110.
And price numbers are:
49
212
717
1081
93
By which mathematical algorithm I could achieve expected results ?
It's some sort of dynamic scalable bar graphs.
Modified
So the min and max values from the price list will be: 49(min price) => 55(min) and 1081 (max price) => 110(max)
The solution is simple:
Pick the smallest, and largest item and find the difference.
(largest_item - smallest_item) maps to (max-min).
Compute ratio = (max-min)/(largest_item-smallest_item)
final_value = min_value + ratio*(value-smallest_item)
As a mathematical function:
f(x,max,min,largest,smallest) = min + (max-min)/(largest-smallest)*(x-smallest)
where:
x : Input item's price
max: Maximum value (here, 110)
min: Minimum value (here, 55)
largest: Largest item in input (Here, 1081)
smallest: Smallest item in input (Here, 49)
One check, as @amit correctly points out: ensure the largest and smallest items are distinct.
So let x = 93. The other four parameters are as listed above.
f(x,max,min,largest,smallest) = min + (max-min)/(largest-smallest)*(x-smallest)
value = 55 + ((110-55)/(1081-49)) * (93-49)
value = 57.344961
Further,
f(93,110,55,1081,49) = 57.344961
f(49,110,55,1081,49) = 55
f(1081,110,55,1081,49) = 110
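As a quick way to try the formula on the whole price list, here is a small sketch. The function name scale_to_range is made up, and returning the midpoint when all prices are equal is only one possible choice for the degenerate case, not something the answer prescribes.

```python
def scale_to_range(values, lo, hi):
    """Map values linearly so that min(values) -> lo and max(values) -> hi."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # Assumption: with a single distinct price, draw every bar at mid height.
        return [(lo + hi) / 2 for _ in values]
    ratio = (hi - lo) / (largest - smallest)
    return [lo + ratio * (v - smallest) for v in values]

prices = [49, 212, 717, 1081, 93]
heights = scale_to_range(prices, 55, 110)
print([round(h, 2) for h in heights])
```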
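The function (where min and max denote the smallest and largest input prices, and 55 is both the height span 110-55 and the minimum height):
[(x - min ) / (max-min)*55] + 55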
ensures the boundaries you are after - but you should also consider - what should the graph show? What do you want the reader to understand from it?
Why?
(x-min) / (max-min) gives a number in range [0,1] - 0 for min,
1 for max.
Multiplying it with 55 ensures a number in range [0,55].
Adding 55 ensures a number in range [55,110] - as expected.
(*) Note: for max = min the above fails because of division by 0; handle that case separately.

Google Interview: Arrangement of Blocks

You are given N blocks of height 1…N. In how many ways can you arrange these blocks in a row such that when viewed from left you see only L blocks (rest are hidden by taller blocks) and when seen from right you see only R blocks? Example given N=3, L=2, R=1 there is only one arrangement {2, 1, 3} while for N=3, L=2, R=2 there are two ways {1, 3, 2} and {2, 3, 1}.
How should we solve this problem by programming? Any efficient ways?
This is a counting problem, not a construction problem, so we can approach it using recursion. Since the problem has two natural parts, looking from the left and looking from the right, break it up and solve for just one part first.
Let b(N, L, R) be the number of solutions, and let f(N, L) be the number of arrangements of N blocks so that L are visible from the left. First think about f because it's easier.
APPROACH 1
Let's get the initial conditions and then go for recursion. If all are to be visible, then they must be ordered increasingly, so
f(N, N) = 1
If there are supposed to be more visible blocks than available blocks, then there is nothing we can do, so
f(N, M) = 0 if N < M
If only one block should be visible, then put the largest first and then the others can follow in any order, so
f(N,1) = (N-1)!
Finally, for the recursion, think about the position of the tallest block, say N is in the kth spot from the left. Then choose the blocks to come before it in (N-1 choose k-1) ways, arrange those blocks so that exactly L-1 are visible from the left, and order the N-k blocks behind N in any way you like, giving:
f(N, L) = sum_{1<=k<=N} (N-1 choose k-1) * f(k-1, L-1) * (N-k)!
In fact, since f(k-1,L-1) = 0 for k<L, we may as well start k at L instead of 1:
f(N, L) = sum_{L<=k<=N} (N-1 choose k-1) * f(k-1, L-1) * (N-k)!
Right, so now that the easier bit is understood, let's use f to solve for the harder bit b. Again, use recursion based on the position of the tallest block, again say N is in position k from the left. As before, choose the blocks before it in N-1 choose k-1 ways, but now think about each side of that block separately. For the k-1 blocks left of N, make sure that exactly L-1 of them are visible. For the N-k blocks right of N, make sure that R-1 are visible and then reverse the order you would get from f. Therefore the answer is:
b(N,L,R) = sum_{1<=k<=N} (N-1 choose k-1) * f(k-1, L-1) * f(N-k, R-1)
where f is completely worked out above. Again, many terms will be zero, so we only want to take k such that k-1 >= L-1 and N-k >= R-1 to get
b(N,L,R) = sum_{L <= k <= N-R+1} (N-1 choose k-1) * f(k-1, L-1) * f(N-k, R-1)
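The two summations of Approach 1 translate almost directly into code. This is a sketch under a few assumptions: memoization via functools.lru_cache, math.comb (Python 3.8 or later), and the extra base case f(0, 0) = 1 so that an empty side counts as one arrangement.

```python
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def f(n, l):
    """Arrangements of n blocks with exactly l visible from the left."""
    if l <= 0:
        return 1 if (n == 0 and l == 0) else 0
    if l > n:
        return 0
    # k is the position of the tallest block, counted from the left.
    return sum(comb(n - 1, k - 1) * f(k - 1, l - 1) * factorial(n - k)
               for k in range(l, n + 1))

def b(n, l, r):
    """Arrangements with l blocks visible from the left and r from the right."""
    return sum(comb(n - 1, k - 1) * f(k - 1, l - 1) * f(n - k, r - 1)
               for k in range(l, n - r + 2))

print(b(3, 2, 1), b(3, 2, 2))
```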
APPROACH 2
I thought about this problem again and found a somewhat nicer approach that avoids the summation.
If you work the problem the opposite way, that is think of adding the smallest block instead of the largest block, then the recurrence for f becomes much simpler. In this case, with the same initial conditions, the recurrence is
f(N,L) = f(N-1,L-1) + (N-1) * f(N-1,L)
where the first term, f(N-1,L-1), comes from placing the smallest block in the leftmost position, thereby adding one more visible block (hence L decreases to L-1), and the second term, (N-1) * f(N-1,L), accounts for putting the smallest block in any of the N-1 non-front positions, in which case it is not visible (hence L stays fixed).
This recursion has the advantage of always decreasing N, though it makes it more difficult to see some formulas, for example f(N,N-1) = (N choose 2). This formula is fairly easy to show from the previous formula, though I'm not certain how to derive it nicely from this simpler recurrence.
Now, to get back to the original problem and solve for b, we can also take a different approach. Instead of the summation before, think of the visible blocks as coming in packets, so that if a block is visible from the left, then its packet consists of all blocks right of it and in front of the next block visible from the left, and similarly if a block is visible from the right then its packet contains all blocks left of it until the next block visible from the right. Do this for all but the tallest block, which is visible from both sides. This makes for L+R-2 packets built from the other N-1 blocks. Given the packets, you can move one from the left side to the right side simply by reversing the order of the blocks. Therefore the general case b(N,L,R) actually reduces to solving the one-sided case, b(N,M,1) = f(N-1,M-1) with M = L+R-1, and then choosing which L-1 of the packets to put on the left and which on the right. Therefore we have
b(N,L,R) = (L+R-2 choose L-1) * f(N-1,L+R-2)
As a check, N=3, L=2, R=2 gives (2 choose 1) * f(2,2) = 2, matching the example in the question.
Again, this reformulation has some advantages over the previous version. Putting these latter two formulas together, it's much easier to see the complexity of the overall problem. However, I still prefer the first approach for constructing solutions, though perhaps others will disagree. All in all it just goes to show there's more than one good way to approach the problem.
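A sketch of Approach 2 follows. Note the assumption it makes explicit: the packets are counted over the N-1 blocks other than the tallest, which gives L+R-2 packets and the closed form b(N,L,R) = (L+R-2 choose L-1) * f(N-1, L+R-2). That is the form that reproduces the two examples in the question (1 and 2 arrangements for N=3). The function names are hypothetical.

```python
from math import comb

def stirling_table(n_max):
    """f[n][l] = arrangements of n blocks with l visible from the left,
    built with f(n,l) = f(n-1,l-1) + (n-1) * f(n-1,l)."""
    f = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    f[0][0] = 1
    for n in range(1, n_max + 1):
        for l in range(1, n + 1):
            f[n][l] = f[n - 1][l - 1] + (n - 1) * f[n - 1][l]
    return f

def count_arrangements(n, l, r):
    """Blocks 1..n with l visible from the left and r visible from the right."""
    if l < 1 or r < 1 or l + r - 2 > n - 1:
        return 0
    f = stirling_table(n - 1)
    return comb(l + r - 2, l - 1) * f[n - 1][l + r - 2]

print(count_arrangements(3, 2, 1), count_arrangements(3, 2, 2))
```

Building the table costs O(N^2) additions, with no summation per entry.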
What's with the Stirling numbers?
As Jason points out, the f(N,L) numbers are precisely the (unsigned) Stirling numbers of the first kind. One can see this immediately from the recursive formulas for each. However, it's always nice to be able to see it directly, so here goes.
The (unsigned) Stirling numbers of the First Kind, denoted S(N,L) count the number of permutations of N into L cycles. Given a permutation written in cycle notation, we write the permutation in canonical form by beginning the cycle with the largest number in that cycle and then ordering the cycles increasingly by the first number of the cycle. For example, the permutation
(2 6) (5 1 4) (3 7)
would be written in canonical form as
(5 1 4) (6 2) (7 3)
Now drop the parentheses and notice that if these are the heights of the blocks, then the number of visible blocks from the left is exactly the number of cycles! This is because the first number of each cycle blocks all other numbers in the cycle, and the first number of each successive cycle is visible behind the previous cycle. Hence this problem is really just a sneaky way to ask you to find a formula for Stirling numbers.
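The correspondence can be tried out on the example permutation with a few lines of code. This is only an illustrative sketch; the helper names are made up, and cycles are represented as plain lists.

```python
def canonical_heights(cycles):
    """Rotate each cycle to start at its largest entry, sort the cycles by that
    first entry, then drop the parentheses."""
    rotated = []
    for cycle in cycles:
        top = cycle.index(max(cycle))
        rotated.append(cycle[top:] + cycle[:top])
    rotated.sort(key=lambda c: c[0])
    return [x for c in rotated for x in c]

def visible_from_left(heights):
    """Count blocks that are taller than everything to their left."""
    count = tallest = 0
    for h in heights:
        if h > tallest:
            count += 1
            tallest = h
    return count

heights = canonical_heights([[2, 6], [5, 1, 4], [3, 7]])
print(heights, visible_from_left(heights))
```

The three cycles become three visible blocks (5, 6 and 7), one per cycle.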
well, just as an empirical solution for small N:
blocks.py:
import itertools
from collections import defaultdict

def countPermutation(p):
    # Number of blocks visible when looking along p from its first element
    n = 0
    tallest = 0
    for block in p:
        if block > tallest:
            n += 1
            tallest = block
    return n

def countBlocks(n):
    # Tally every permutation of 1..n by its (left count, right count)
    count = defaultdict(int)
    for p in itertools.permutations(range(1, n+1)):
        fwd = countPermutation(p)
        rev = countPermutation(reversed(p))
        count[(fwd, rev)] += 1
    return count

def printCount(count, n, places):
    for i in range(1, n+1):
        for j in range(1, n+1):
            c = count[(i, j)]
            if c > 0:
                print("%*d" % (places, c), end=" ")
            else:
                print(" " * places, end=" ")
        print()

def countAndPrint(nmax, places):
    for n in range(1, nmax+1):
        printCount(countBlocks(n), n, places)
        print()
and sample output:
blocks.countAndPrint(10)
1
1
1
1 1
1 2
1
2 3 1
2 6 3
3 3
1
6 11 6 1
6 22 18 4
11 18 6
6 4
1
24 50 35 10 1
24 100 105 40 5
50 105 60 10
35 40 10
10 5
1
120 274 225 85 15 1
120 548 675 340 75 6
274 675 510 150 15
225 340 150 20
85 75 15
15 6
1
720 1764 1624 735 175 21 1
720 3528 4872 2940 875 126 7
1764 4872 4410 1750 315 21
1624 2940 1750 420 35
735 875 315 35
175 126 21
21 7
1
5040 13068 13132 6769 1960 322 28 1
5040 26136 39396 27076 9800 1932 196 8
13068 39396 40614 19600 4830 588 28
13132 27076 19600 6440 980 56
6769 9800 4830 980 70
1960 1932 588 56
322 196 28
28 8
1
40320 109584 118124 67284 22449 4536 546 36 1
40320 219168 354372 269136 112245 27216 3822 288 9
109584 354372 403704 224490 68040 11466 1008 36
118124 269136 224490 90720 19110 2016 84
67284 112245 68040 19110 2520 126
22449 27216 11466 2016 126
4536 3822 1008 84
546 288 36
36 9
1
You'll note a few obvious (well, mostly obvious) things from the problem statement:
the total # of permutations is always N!
with the exception of N=1, there is no solution for L,R = (1,1): if a count in one direction is 1, then it implies the tallest block is on that end of the stack, so the count in the other direction has to be >= 2
the situation is symmetric (reverse each permutation and you reverse the L,R count)
if p is a permutation of N-1 blocks and has count (Lp,Rp), then the N permutations of block N inserted in each possible spot can have a count ranging from L = 1 to Lp+1, and R = 1 to Rp + 1.
From the empirical output:
the leftmost column or topmost row (where L = 1 or R = 1) with N blocks is the sum of the rows/columns with N-1 blocks: i.e. in @PengOne's notation,
b(N,1,R) = sum(b(N-1,k,R-1) for k = 1 to N-R+1)
Each diagonal is a row of Pascal's triangle, times a constant factor K for that diagonal -- I can't prove this, but I'm sure someone can -- i.e.:
b(N,L,R) = K * (L+R-2 choose L-1) where K = b(N,1,L+R-1)
So the computational complexity of computing b(N,L,R) is the same as the computational complexity of computing b(N,1,L+R-1) which is the first column (or row) in each triangle.
This observation is probably 95% of the way towards an explicit solution (the other 5% I'm sure involves standard combinatoric identities, I'm not too familiar with those).
A quick check with the Online Encyclopedia of Integer Sequences shows that b(N,1,R) appears to be OEIS sequence A094638:
A094638 Triangle read by rows: T(n,k) =|s(n,n+1-k)|, where s(n,k) are the signed Stirling numbers of the first kind (1<=k<=n; in other words, the unsigned Stirling numbers of the first kind in reverse order).
1, 1, 1, 1, 3, 2, 1, 6, 11, 6, 1, 10, 35, 50, 24, 1, 15, 85, 225, 274, 120, 1, 21, 175, 735, 1624, 1764, 720, 1, 28, 322, 1960, 6769, 13132, 13068, 5040, 1, 36, 546, 4536, 22449, 67284, 118124, 109584, 40320, 1, 45, 870, 9450, 63273, 269325, 723680, 1172700
As far as how to efficiently compute the Stirling numbers of the first kind, I'm not sure; Wikipedia gives an explicit formula but it looks like a nasty sum. This question (computing Stirling #s of the first kind) shows up on MathOverflow and it looks like O(n^2), as PengOne hypothesizes.
Based on @PengOne's answer, here is my JavaScript implementation:
function g(N, L, R) {
  var acc = 0;
  for (var k = 1; k <= N; k++) {
    acc += comb(N - 1, k - 1) * f(k - 1, L - 1) * f(N - k, R - 1);
  }
  return acc;
}
function f(N, L) {
  if (N == L) return 1;
  else if (N < L) return 0;
  else {
    var acc = 0;
    for (var k = 1; k <= N; k++) {
      acc += comb(N - 1, k - 1) * f(k - 1, L - 1) * fact(N - k);
    }
    return acc;
  }
}
function comb(n, k) {
  return fact(n) / (fact(k) * fact(n - k));
}
function fact(n) {
  var acc = 1;
  for (var i = 2; i <= n; i++) {
    acc *= i;
  }
  return acc;
}
$("#go").click(function () {
  // .val() returns strings, so convert the inputs to integers first
  var N = parseInt($("#N").val(), 10);
  var L = parseInt($("#L").val(), 10);
  var R = parseInt($("#R").val(), 10);
  alert(g(N, L, R));
});
Here is my construction solution inspired by @PengOne's ideas.
import itertools

def f(blocks, m):
    # All orderings of blocks with exactly m of them visible from the left
    n = len(blocks)
    if m > n:
        return []
    if m < 0:
        return []
    if n == m:
        return [sorted(blocks)]
    maximum = max(blocks)
    blocks = list(set(blocks) - set([maximum]))
    results = []
    for k in range(0, n):
        for left_set in itertools.combinations(blocks, k):
            for left in f(left_set, m - 1):
                rights = itertools.permutations(list(set(blocks) - set(left)))
                for right in rights:
                    results.append(list(left) + [maximum] + list(right))
    return results

def b(n, l, r):
    # All orderings of 1..n with l visible from the left and r from the right
    blocks = range(1, n + 1)
    results = []
    maximum = max(blocks)
    blocks = list(set(blocks) - set([maximum]))
    for k in range(0, n):
        for left_set in itertools.combinations(blocks, k):
            for left in f(left_set, l - 1):
                other = list(set(blocks) - set(left))
                rights = f(other, r - 1)
                for right in rights:
                    results.append(list(left) + [maximum] + list(right))
    return results

# Sample
print(b(4, 3, 2))  # -> [[1, 2, 4, 3], [1, 3, 4, 2], [2, 3, 4, 1]]
We derive a general solution F(N, L, R) by examining a specific testcase: F(10, 4, 3).
We first consider 10 in the leftmost possible position, the 4th ( _ _ _ 10 _ _ _ _ _ _ ).
Then we find the product of the number of valid sequences in the left and in the right of 10.
Next, we'll consider 10 in the 5th slot, calculate another product and add it to the previous one.
This process will go on until 10 is in the last possible slot, the 8th.
We'll use the variable named pos to keep track of N's position.
Now suppose pos = 6 ( _ _ _ _ _ 10 _ _ _ _ ). In the left of 10, there are 9C5 = (N-1)C(pos-1) sets of numbers to be arranged.
Since only the order of these numbers matters, we could look at 1, 2, 3, 4, 5.
To construct a sequence with these numbers so that 3 = L-1 of them are visible from the left, we can begin by placing 5 in the leftmost possible slot ( _ _ 5 _ _ ) and follow similar steps to what we did before.
So if F were defined recursively, it could be used here.
The only difference now is that the order of numbers in the right of 5 is immaterial.
To resolve this issue, we'll use a signal, INF (infinity), for R to indicate its unimportance.
Turning to the right of 10, there will be 4 = N-pos numbers left.
We first consider 4 in the last possible slot, position 2 = R-1 from the right ( _ _ 4 _ ).
Here what appears in the left of 4 is immaterial.
But counting arrangements of 4 blocks with the mere condition that 2 of them should be visible from the right is no different than counting arrangements of the same blocks with the mere condition that 2 of them should be visible from the left.
ie. instead of counting sequences like 3 1 4 2, one can count sequences like 2 4 1 3
So the number of valid arrangements in the right of 10 is F(4, 2, INF).
Thus the number of arrangements when pos == 6 is 9C5 * F(5, 3, INF) * F(4, 2, INF) = (N-1)C(pos-1) * F(pos-1, L-1, INF)* F(N-pos, R-1, INF).
Similarly, in F(5, 3, INF), 5 will be considered in a succession of slots with L = 2 and so on.
Since the function calls itself with L or R reduced, it must return a value when L = 1, that is F(N, 1, INF) must be a base case.
Now consider the arrangement _ _ _ _ _ 6 7 10 _ _.
The only slot 5 can take is the first, and the following 4 slots may be filled in any manner; thus F(5, 1, INF) = 4!.
Then clearly F(N, 1, INF) = (N-1)!.
Other (trivial) base cases and details could be seen in the C implementation below.
#include <limits.h>

#define INF UINT_MAX  /* marks a side whose visible count does not matter */

long long unsigned fact(unsigned n) { return n ? n * fact(n-1) : 1; }
unsigned C(unsigned n, unsigned k) { return fact(n) / (fact(k) * fact(n-k)); }

unsigned F(unsigned N, unsigned L, unsigned R)
{
    unsigned pos, sum = 0;
    if(R != INF)
    {
        if(L == 0 || R == 0 || N < L || N < R) return 0;
        if(L == 1) return F(N-1, R-1, INF);
        if(R == 1) return F(N-1, L-1, INF);
        /* pos is the position of the tallest block, counted from the left */
        for(pos = L; pos <= N-R+1; ++pos)
            sum += C(N-1, pos-1) * F(pos-1, L-1, INF) * F(N-pos, R-1, INF);
    }
    else
    {
        /* zero visible blocks is only possible with zero blocks;
           this also keeps pos-1 from wrapping around below */
        if(L == 0) return N == 0;
        if(L == 1) return fact(N-1);
        for(pos = L; pos <= N; ++pos)
            sum += C(N-1, pos-1) * F(pos-1, L-1, INF) * fact(N-pos);
    }
    return sum;
}

Finding the minimum and maximum element from one of many arrays

I received a question during an Amazon interview and would like assistance with solving it.
Given N arrays of size K each, where the K elements of each array are sorted and all N*K elements are unique. Choose a single element from each of the N arrays, and from the chosen subset of N elements subtract the minimum element from the maximum element. This difference should be as small as possible.
Sample:
N=3, K=3
N=1 : 6, 16, 67
N=2 : 11,17,68
N=3 : 10, 15, 100
here if 16, 17, 15 are chosen, we get the minimum difference as
17-15=2.
I can think of O(N*K*N)(edited after correctly pointed out by zivo, not a good solution now :( ) solution.
1. Take N pointers, each initially pointing to the first element of one of the N arrays.
6, 16, 67
^
11, 17, 68
^
10, 15, 100
^
2. Find the highest and lowest elements among those currently pointed to, in O(N) (here 11 and 6), and compute the difference between them (5).
3. Advance by one the pointer that points to the lowest element, within its array.
6, 16, 67
   ^
11, 17, 68
^
10, 15, 100 (difference:6)
^
4. Keep repeating steps 2 and 3 and store the minimum difference.
6, 16, 67
   ^
11, 17, 68
^
10, 15, 100 (difference:5)
    ^
6, 16, 67
   ^
11, 17, 68
    ^
10, 15, 100 (difference:2)
    ^
Above will be the required solution.
6, 16, 67
   ^
11, 17, 68
    ^
10, 15, 100 (difference:84)
        ^
6, 16, 67
       ^
11, 17, 68
    ^
10, 15, 100 (difference:83)
        ^
And so on......
EDIT:
Its complexity can be reduced by using a heap (as suggested by Uri). I thought of it but faced a problem: each time an element is extracted from the heap, its array number has to be found in order to increment the corresponding pointer for that array. An efficient way to find the array number can reduce the complexity to O(K*N*log(N)), since the heap never holds more than N elements. One naive way is to use a data structure like this
struct Entry
{
    int element;
    int arraynumber;
};
and reconstruct the initial data like
6|0,16|0,67|0
11|1,17|1,68|1
10|2,15|2,100|2
Initially keep the current max for first column and insert the pointed elements in heap. Now each time an element is extracted, its array number can be found out, pointer in that array is incremented , the newly pointed element can be compared to current max and max pointer can be adjusted accordingly.
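The heap variant can be sketched as follows. This is an illustration under stated assumptions, not the answerer's code: Python's heapq with (value, array number, position) tuples plays the role of the struct above, and the function name smallest_spread is made up.

```python
import heapq

def smallest_spread(arrays):
    """Smallest max-min over all ways of picking one element per sorted array."""
    # One entry per array: (value, array number, position inside that array)
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays)]
    heapq.heapify(heap)
    current_max = max(arr[0] for arr in arrays)
    best = current_max - heap[0][0]
    while True:
        value, i, j = heapq.heappop(heap)      # current minimum and its array
        best = min(best, current_max - value)
        if j + 1 == len(arrays[i]):
            # The minimum cannot be raised any further, so we are done.
            return best
        nxt = arrays[i][j + 1]
        current_max = max(current_max, nxt)    # the maximum can only grow
        heapq.heappush(heap, (nxt, i, j + 1))

print(smallest_spread([[6, 16, 67], [11, 17, 68], [10, 15, 100]]))
```

Each of the at most N*K pointer advances costs one pop and one push on a heap of N entries.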
So here is an algorithm that solves this problem in two steps:
The first step is to merge all your arrays into one sorted array, which would look like this:
combined_val[] - which holds all numbers
combined_ind[] - which holds the index of the array each number originally belonged to
This step can be done easily in O(K*N*log(N)), but I think you can do better than that too (maybe not; you can look up variants of merge sort because they do a step similar to that).
Now the second step:
It is easier to just show code instead of explaining, so here is the pseudocode:
int count[N] = { 0 };
int head = 0;
int diffcnt = 0;
// mindiff is initialized to overall maximum value - overall minimum value
int mindiff = combined_val[N * K - 1] - combined_val[0];
for (int i = 0; i < N * K; i++)
{
    count[combined_ind[i]]++;
    if (count[combined_ind[i]] == 1) {
        // diffcnt counts how many arrays have at least one element between
        // indexes of "head" and "i". Once diffcnt reaches N it will stay N and
        // not increase anymore
        diffcnt++;
    } else {
        while (count[combined_ind[head]] > 1) {
            // We try to move head index as forward as possible while keeping diffcnt constant.
            // i.e. if count[combined_ind[head]] is 1, then if we would move head forward
            // diffcnt would decrease, that is something we dont want to do.
            count[combined_ind[head]]--;
            head++;
        }
    }
    if (diffcnt == N) {
        // i.e. we got at least one element from all arrays
        if (combined_val[i] - combined_val[head] < mindiff) {
            mindiff = combined_val[i] - combined_val[head];
            // if you want to save actual numbers too, you can save this (i.e. i and head)
            // and then extract data from that
        }
    }
}
the result is in mindiff.
The running time of the second step is O(N * K). This is because the "head" index moves at most N*K times in total, so the inner loop does not make this quadratic; it is still linear.
So total algorithm running time is O(N * K * log(N)), however this is because of merging step, if you can come up with better merging step you can probably bring it down to O(N * K).
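Both steps can be sketched in Python as follows. This is only an illustration under the assumption that every input array is non-empty and sorted ascending; heapq.merge stands in for the merging step, and the function name min_spread_sliding_window is made up for this example. The names combined_val, combined_ind, count, head, diffcnt and mindiff follow the pseudocode.

```python
import heapq

def min_spread_sliding_window(arrays):
    """Smallest max - min over all picks of one element per array (arrays sorted ascending)."""
    n = len(arrays)
    # Step 1: k-way merge of (value, array index) pairs.
    tagged = [[(v, a) for v in arr] for a, arr in enumerate(arrays)]
    merged = list(heapq.merge(*tagged))
    combined_val = [v for v, _ in merged]
    combined_ind = [a for _, a in merged]

    # Step 2: slide a window [head, i] over the merged sequence.
    count = [0] * n        # how many elements of each array lie inside the window
    head = 0
    diffcnt = 0            # number of distinct arrays represented in the window
    mindiff = combined_val[-1] - combined_val[0]
    for i, a in enumerate(combined_ind):
        count[a] += 1
        if count[a] == 1:
            diffcnt += 1
        # Drop leading elements whose array is still covered elsewhere in the window.
        while count[combined_ind[head]] > 1:
            count[combined_ind[head]] -= 1
            head += 1
        if diffcnt == n:
            mindiff = min(mindiff, combined_val[i] - combined_val[head])
    return mindiff
```

Running the shrinking loop on every iteration, rather than only in the else branch, does not change the result: right after a new array first appears, the element at head is never removable.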
This problem is for managers
You have 3 developers (N1), 3 testers (N2) and 3 DBAs (N3)
Choose the least divergent team that can run a project successfully.
int[n] result; // where result[i] keeps the element from bucket N_i
int[n] latest; // where latest[i] keeps the latest element visited from bucket N_i
Iterate elements in (N_1 + N_2 + N_3) in sorted order
{
    Keep track of latest element visited from each bucket N_i by updating 'latest' array;
    if boundary(latest) < boundary(result)
    {
        result = latest;
    }
}
int boundary(int[] array)
{
    return Max(array) - Min(array);
}
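The pseudocode above could look like this in Python. It is a sketch under a few assumptions that the post does not spell out: each bucket is a list of numbers, comparison only starts once every bucket has been visited at least once, and the name least_divergent_team is invented for the example.

```python
import heapq

def boundary(array):
    """Spread of a candidate team: largest value minus smallest value."""
    return max(array) - min(array)

def least_divergent_team(buckets):
    """Return one value per bucket such that max - min is as small as this scan finds."""
    n = len(buckets)
    latest = [None] * n    # latest[i]: most recently visited element of bucket i
    result = None          # best complete team seen so far
    tagged = [[(v, b) for v in sorted(bucket)] for b, bucket in enumerate(buckets)]
    for value, b in heapq.merge(*tagged):      # all elements in sorted order
        latest[b] = value
        if None in latest:
            continue                           # some bucket not visited yet
        if result is None or boundary(latest) < boundary(result):
            result = list(latest)              # copy, since latest keeps changing
    return result
```

For example, least_divergent_team([[6, 16, 67], [11, 17, 68], [10, 15, 100]]) returns [16, 17, 15], a team with spread 2.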
I have an O(K*N*log(K)) solution, with typical execution much less. I currently cannot think of anything better. I'll first explain the version that is easier to describe (with somewhat longer execution):
For each element f in the first array (loop through K elements):
  For each array, starting from the second array (loop through N-1 arrays):
    Do a binary search on the array and find the element closest to f. This is your element (log(K)).
This algorithm can be optimized if, for each array, you add a new Floor index. When performing the binary search, search between 'Floor' and 'K-1'.
Initially the Floor index is 0, and for the first element you search through the entire arrays. Once you find the element closest to 'f', update the Floor index with the index of that element. The worst case is the same (Floor may never update, if the maximum element of the first array is smaller than the minimum of every other array), but the average case will improve.
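Here is a rough Python sketch of the binary-search idea with the Floor index, assuming all arrays are non-empty and sorted ascending. The helper names closest_index and closest_pick_spread are hypothetical. Note that picking the element closest to f in each array does not always minimise max - min: with the arrays [10], [8, 13] and [13], the closest pick gives 10, 8, 13 (spread 5), while 10, 13, 13 has spread 3. So the sketch returns the best combination this procedure finds, nothing more.

```python
from bisect import bisect_left

def closest_index(arr, target, lo=0):
    """Index within arr[lo:] of the value closest to target (ties go to the smaller value)."""
    pos = bisect_left(arr, target, lo)
    if pos == lo:
        return lo
    if pos == len(arr):
        return len(arr) - 1
    # Here arr[pos - 1] < target <= arr[pos]; keep whichever is nearer.
    return pos - 1 if target - arr[pos - 1] <= arr[pos] - target else pos

def closest_pick_spread(arrays):
    """For each f in the first array, pick the closest element of every other array."""
    first, others = arrays[0], arrays[1:]
    floors = [0] * len(others)     # per-array Floor index, only ever moves right
    best = None
    for f in first:                # first is sorted, so f never decreases
        picks = [f]
        for a, arr in enumerate(others):
            idx = closest_index(arr, f, floors[a])
            floors[a] = idx        # later searches start from here
            picks.append(arr[idx])
        spread = max(picks) - min(picks)
        if best is None or spread < best[0]:
            best = (spread, picks)
    return best
```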
Correctness proof for the accepted answer (Terminal's solution)
Assume that the algorithm finds a series A=<A[1],A[2],...,A[N]> which isn't the optimal solution (R).
Consider the index j such that R[j] is the first item of R that the algorithm examines and replaces with the next item in its row.
Let A' denote the candidate solution at that phase (prior to the replacement). Since R[j]=A'[j] is the minimum value of A', it's also the minimum of R.
Now, consider the maximum value of R, R[m]. If A'[m]<R[m], then R can be improved by replacing R[m] with A'[m], which contradicts the fact that R is optimal. Therefore, A'[m]=R[m].
In other words, R and A' share the same maximum and minimum, therefore they are equivalent. This completes the proof: if R is an optimal solution, then the algorithm is guaranteed to find a solution as good as R.
for every element in 1st array
{
    choose the element in 2nd array that is closest to the element in 1st array
    current_array = 2;
    while (current_array < n)
    {
        choose the element in current_array+1 that is closest to the element chosen in current_array
        current_array++;
    }
}
complexity: O(k^2*n)
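As an illustration, the chained selection above might be written like this in Python. The function name chained_closest is made up, and the linear scan for the closest element is an assumption chosen to match the stated O(k^2*n) cost. The sketch returns the best combination this greedy procedure finds.

```python
def chained_closest(arrays):
    """For each start in the first array, pick from each next array the value
    closest to the previous pick; return (spread, picks) for the best chain."""
    best = None
    for start in arrays[0]:
        picks = [start]
        for arr in arrays[1:]:
            # Linear scan: element of arr closest to the most recent pick.
            picks.append(min(arr, key=lambda v: abs(v - picks[-1])))
        spread = max(picks) - min(picks)
        if best is None or spread < best[0]:
            best = (spread, picks)
    return best
```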
Here is my logic for solving this problem, keeping in mind that we need to pick one element from each of the N arrays (to compute the least minimum).
// if we take the above values as an example!
// then the idea would be to sort all three arrays while keeping another
// array to keep the reference to their sets (1 or 2 or 3, could be
// extended to n sets)
1 3 2 3 1 2 1 2 3 // this is the array that holds the set index
6 10 11 15 16 17 67 68 100 // this is the sorted combined array.
| |
5 2 33 // these are the computed differences used for the least minimum.
// The rule is to make sure the set indexes of the values
// we are comparing are different (so that we compare elements
// from different sets). For example, the first element is
// index:1|value:6, so we hold the value 6 (the value we will use
// to compute the least minimum). Then we go to the edge of the
// comparison, which is the second different index: we skip
// index:3|value:10 (we remove it from the array) and compare
// index:2|value:11 to index:1|value:6. We obtain 5, which goes
// into a variable named leastMinimum = 5. Now we remove the
// indexes and values we already used, and redo the same steps.
Step 1:
1 3 2 3 1 2 1 2 3
6 10 11 15 16 17 67 68 100
|
5
leastMinimum = 5
Step 2:
3 1 2 1 2 3
15 16 17 67 68 100
|
2
leastMinimum = min(2, leastMinimum) // which is equal to 2
Step 3:
1 2 3
67 68 100
33
leastMinimum = min(33, leastMinimum) // which is equal to the old leastMinimum, which is 2
Now suppose we have elements from the same array that are very close to each other (k=2 this time, which means we have 3 sets with only two values each):
// After sorting the n arrays we will have the below indexes array and values array
1 1 2 3 2 3
6 7 8 12 15 16
* * *
* We skip the second index of 1 (1|7) and take the least minimum of 1|6 and 3|12. (index:2|value:8 is removed as it is not at the edges; we pick the minimum and maximum of the unique-index subset of n elements.)
1 3
6 12
=6
* In the second step we remove the values we already used, so the array becomes:
1 2 3
7 15 16
* * *
16 - 7
= 9
Note:
Another approach, which consumes more memory, would consist of creating N sub-arrays and comparing maximum - minimum within each.
So from the sorted values array below and its corresponding indexes array, we extract three other sub-arrays:
1 3 2 3 1 2 1 2 3
6 10 11 15 16 17 67 68 100
First Array:
1 3 2
6 10 11
11-6 = 5
Second Array:
3 1 2
15 16 17
17-15 = 2
Third Array:
1 2 3
67 68 100
100 - 67 = 33
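The sub-array variant from the note can be sketched as below. It assumes the two parallel arrays (set indexes and sorted values) are already built, and it skips any chunk in which a set index repeats; that skipping rule is an assumption, since the note only shows chunks where every set appears once. The name chunk_spreads is illustrative.

```python
def chunk_spreads(set_index, values, n):
    """Cut the sorted values into consecutive chunks of n and return max - min
    for every chunk that contains exactly one element of each set."""
    spreads = []
    for start in range(0, len(values) - n + 1, n):
        ids = set_index[start:start + n]
        chunk = values[start:start + n]
        if len(set(ids)) == n:                     # all n sets are represented
            spreads.append(chunk[-1] - chunk[0])   # chunk is sorted ascending
    return spreads
```

For the example arrays in the note, the call returns the three differences 5, 2 and 33, and the minimum of those is the answer this approach reports.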
