Correctness of greedy algorithm - algorithm

In non-decreasing sequence of (positive) integers two elements can be removed when . How many pairs can be removed at most from this sequence?
So I have thought of the following solution:
I take given sequence and divide into two parts (first and second).
Assign to each of them iterator - it_first := 0 and it_second := 0, respectively. count := 0
when it_second != second.length
if 2 * first[it_first] <= second[it_second]
count++, it_first++, it_second++
else
it_second++
count is the answer
Example:
count := 0
[1,5,8,10,12,13,15,24] --> first := [1,5,8,10], second := [12,13,15,24]
2 * 1 ?< 12 --> true, count++, it_first++ and it_second++
2 * 5 ?< 13 --> true, count++, it_first++ and it_second++
2 * 8 ?< 15 --> false, it_second++
8 ?<24 --> true, count ++it_second reach the last element - END.
count == 3
Linear complexity (the worst case when there are no such elements to be removed. n/2 elements compare with n/2 elements).
So my missing part is 'correctness' of algorithm - I've read about greedy algorithms proof - but mostly with trees and I cannot find analogy. Any help would be appreciated. Thanks!
EDIT:
By correctness I mean:
* It works
* It cannot be done faster(in logn or constant)
I would like to put some graphics but due to reputation points < 10 - I can't.
(I've meant one latex at the beginning ;))

Correctness:
Let's assume that the maximum number of pairs that can be removed is k. Claim: there is an optimal solution where the first elements of all pairs are k smallest elements of the array.
Proof: I will show that it is possible to transform any solution into the one that contains the first k elements as the first elements of all pairs.
Let's assume that we have two pairs (a, b), (c, d) such that a <= b <= c <= d, 2 * a <= b and 2 * c <= d. In this case, pairs (a, c) and (b, d) are valid, too. And now we have a <= c <= b <= d. Thus, we can always transform out pairs in such a way that the first element from any pair is not greater than the second element of any pair.
When we have this property, we can simply substitute the smallest element among all first all elements of all pairs with the smallest element in the array, the second smallest among all first elements - with the second smallest element in the array and so on without invalidating any pair.
Now we know that there is an optimal solution that contains k smallest elements. It is clear that we cannot make the answer worse by taking the smallest unused element(making it bigger can only reduce the answer for the next elements) which fits each of them. Thus, this solution is correct.
A note about the case when the length of the array is odd: it doesn't matter where the middle element goes: to the first or to the second half. In the first half it is useless(there are not enough elements in the second half). If we put it to the second half, it is useless two(let's assume that we took it. It means that there is "free space" somewhere in the second half. Thus, we can shift some elements by one and get rid of it).
Optimality in terms of time complexity: the time complexity of this solution is O(n). We cannot find the answer without reading the entire input in the worst case and reading is already O(n) time. Thus, this algorithm is optimal.

Presuming your method. Indices are 0-based.
Denote in general:
end_1 = floor(N/2) boundary (inclusive) of first part.
Denote while iterating:
i index in first part, j index in second part,
optimal solution until this point sol(i,j) (using algorithm from front),
pairs that remain to be paired-up optimally behind (i,j) point i.e. from
(i+1,j+1) onward rem(i,j) (can be calculated using algorithm from back),
final optimal solution can be expressed as the function of any point as sol(i,j) + rem(i,j).
Observation #1: when doing algorithm from front all points in [0, i] range are used, some points from [end_1+1, j] range are not used (we skip a(j) not large engough). When doing algorithm from back some [i+1, end_1] points are not used, and all [j+1, N] points are used (we skip a(i) not small enough).
Observation #2: rem(i,j) >= rem(i,j+1), because rem(i,j) = rem(i,j+1) + M, where M can be 0 or 1 depending on whether we can pair up a(j) with some unused element from [i+1, end_1] range.
Argument (by contradiction): let's assume 2*a(i) <= a(j) and that not pairing up a(i) and a(j) gives at least as good final solution. By the algorithm we would next try to pair up a(i) and a(j+1). Since:
rem(i,j) >= rem(i,j+1) (see above),
sol(i,j+1) = sol(i,j) (since we didn't pair up a(i) and a(j))
we get that sol(i,j) + rem(i,j) >= sol(i,j+1) + rem(i,j+1) which contradicts the assumption.

Related

Searching for time complexity of my recursive algorithm

So I have an assignment from my school, and I only wondered what time complexity my algorithm has (not needed for answer per se, only the algorithm need to run faster than O(n) worst case))
The question is: Given n (n ≥ 3) distinct elements, design a divide and conquer algorithm to compute the first three smallest elements. Your algorithm should return a triple (x, y, z) such that x < y < z < the rest n − 3 input elements, and run in linear time in the worst case.
And my solution is as follows:
Solution: Since “n” must be ≥ 3, the given array cannot have less than 3
elements.
If n == 3, simply compare the elements to each other (maximum of 3
comparisons needed).
Compare the first 2 elements to each other and assign the elements with a
“smallest element” and “2nd smallest element”, we’ll call
them “x” (smallest) and “y” (2nd smallest) from here on out. Compare “x”
(smallest) to the 3rd element.
If x > 3rd element, we know that the 3rd element is the smallest in this
array, therefore: “y” becomes the new “z”, “x” becomes the new “y”, and the
3rd element becomes the new “x”. You’re now left with the desired output.
Return the triple (x<y<z). (2 comparisons used)
Else if x < 3rd, make the 3rd element be “z”. Compare y with 3rd element
(now “z”).
If y < z, do nothing, you already have a triple with the desired outcome
(x<y<z).
If y > z, swap the elements. You now have a triple with the desired outcome
(x<y<z).
Now that the first case (n == 3) has been handled, let’s handle what should happen when n > 3.
If n > 3, split the array recursively into parts of n/2 (if n is an uneven
number, split the array into parts of n/2 + 1 & n/2).
Keep doing the 1st step until the split arrays have a size of ≥ 2.
Compare the elements in each sub-array to each other, and assign a lowest
and 2nd lowest value to them. (“x” and “y”)
When merging the sub-array, compare the lowest element of one sub-array, to
the other array’s lowest value.
If a1(lowest) < a2(lowest), make a1(lowest) be “x”, a2(lowest) the new “y”,
and a2(2nd lowest) the new “z”, then compare a1(2nd lowest) with the highest
value in a1 at that moment (“z” in this case).
If a1(2ndlowest) > “z”, do nothing. (“x”, “y” and “z” are already the three
lowest values in the sub-array)
Else if a1(2ndlowest) < “z”, swap the elements and make a1(2ndlowest) the
new “z”, then compare the new “z” with “y”
If “z” < “y”, swap the elements.
No more comparisons needed, as the “x” is the lowest element from a1, which
means the 3 elements, “x”, “y” & “z” are now the lowest possible from this
subsection of the array.
Repeat from step 4 until reaching the highest layer of the call, and you now
have a triple that satisfies the condition (x<y<z).
Sorry about the wall of text (and pseudocode). I've read about time complexity, and I understand the simple ones (like a statement has constant time, a for-loop has time depending on how long it is, etc.) but I have a hard time understanding the time complexity for my own algorithm. Thanks in advance.

Sample number with equal probability which is not part of a set

I have a number n and a set of numbers S ∈ [1..n]* with size s (which is substantially smaller than n). I want to sample a number k ∈ [1..n] with equal probability, but the number is not allowed to be in the set S.
I am trying to solve the problem in at worst O(log n + s). I am not sure whether it's possible.
A naive approach is creating an array of numbers from 1 to n excluding all numbers in S and then pick one array element. This will run in O(n) and is not an option.
Another approach may be just generating random numbers ∈[1..n] and rejecting them if they are contained in S. This has no theoretical bound as any number could be sampled multiple times even if it is in the set. But on average this might be a practical solution if s is substantially smaller than n.
Say s is sorted. Generate a random number between 1 and n-s, call it k. We've chosen the k'th element of {1,...,n} - s. Now we need to find it.
Use binary search on s to find the count of the elements of s <= k. This takes O(log |s|). Add this to k. In doing so, we may have passed or arrived at additional elements of s. We can adjust for this by incrementing our answer for each such element that we pass, which we find by checking the next larger element of s from the point we found in our binary search.
E.g., n = 100, s = {1,4,5,22}, and our random number is 3. So our approach should return the third element of [2,3,6,7,...,21,23,24,...,100] which is 6. Binary search finds that 1 element is at most 3, so we increment to 4. Now we compare to the next larger element of s which is 4 so increment to 5. Repeating this finds 5 in so we increment to 6. We check s once more, see that 6 isn't in it, so we stop.
E.g., n = 100, s = {1,4,5,22}, and our random number is 4. So our approach should return the fourth element of [2,3,6,7,...,21,23,24,...,100] which is 7. Binary search finds that 2 elements are at most 4, so we increment to 6. Now we compare to the next larger element of s which is 5 so increment to 7. We check s once more, see that the next number is > 7, so we stop.
If we assume that "s is substantially smaller than n" means |s| <= log(n), then we will increment at most log(n) times, and in any case at most s times.
If s is not sorted then we can do the following. Create an array of bits of size s. Generate k. Parse s and do two things: 1) count the number of elements < k, call this r. At the same time, set the i'th bit to 1 if k+i is in s (0 indexed so if k is in s then the first bit is set).
Now, increment k a number of times equal to r plus the number of set bits is the array with an index <= the number of times incremented.
E.g., n = 100, s = {1,4,5,22}, and our random number is 4. So our approach should return the fourth element of [2,3,6,7,...,21,23,24,...,100] which is 7. We parse s and 1) note that 1 element is below 4 (r=1), and 2) set our array to [1, 1, 0, 0]. We increment once for r=1 and an additional two times for the two set bits, ending up at 7.
This is O(s) time, O(s) space.
This is an O(1) solution with O(s) initial setup that works by mapping each non-allowed number > s to an allowed number <= s.
Let S be the set of non-allowed values, S(i), where i = [1 .. s] and s = |S|.
Here's a two part algorithm. The first part constructs a hash table based only on S in O(s) time, the second part finds the random value k ∈ {1..n}, k ∉ S in O(1) time, assuming we can generate a uniform random number in a contiguous range in constant time. The hash table can be reused for new random values and also for new n (assuming S ⊂ { 1 .. n } still holds of course).
To construct the hash, H. First set j = 1. Then iterate over S(i), the elements of S. They do not need to be sorted. If S(i) > s, add the key-value pair (S(i), j) to the hash table, unless j ∈ S, in which case increment j until it is not. Finally, increment j.
To find a random value k, first generate a uniform random value in the range s + 1 to n, inclusive. If k is a key in H, then k = H(k). I.e., we do at most one hash lookup to insure k is not in S.
Python code to generate the hash:
def substitute(S):
H = dict()
j = 1
for s in S:
if s > len(S):
while j in S: j += 1
H[s] = j
j += 1
return H
For the actual implementation to be O(s), one might need to convert S into something like a frozenset to insure the test for membership is O(1) and also move the len(S) loop invariant out of the loop. Assuming the j in S test and the insertion into the hash (H[s] = j) are constant time, this should have complexity O(s).
The generation of a random value is simply:
def myrand(n, s, H):
k = random.randint(s + 1, n)
return (H[k] if k in H else k)
If one is only interested in a single random value per S, then the algorithm can be optimized to improve the common case, while the worst case remains the same. This still requires S be in a hash table that allows for a constant time "element of" test.
def rand_not_in(n, S):
k = random.randint(len(S) + 1, n);
if k not in S: return k
j = 1
for s in S:
if s > len(S):
while j in S: j += 1
if s == k: return j
j += 1
Optimizations are: Only generate the mapping if the random value is in S. Don't save the mapping to a hash table. Short-circuit the mapping generation when the random value is found.
Actually, the rejection method seems like the practical approach.
Generate a number in 1...n and check whether it is forbidden; regenerate until the generated number is not forbidden.
The probability of a single rejection is p = s/n.
Thus the expected number of random number generations is 1 + p + p^2 + p^3 + ... which is 1/(1-p), which in turn is equal to n/(n-s).
Now, if s is much less than n, or even more up to s = n/2, this expected number is at most 2.
It would take s almost equal to n to make it infeasible in practice.
Multiply the expected time by log s if you use a tree-set to check whether the number is in the set, or by just 1 (expected value again) if it is a hash-set. So the average time is O(1) or O(log s) depending on the set implementation. There is also O(s) memory for storing the set, but unless the set is given in some special way, implicitly and concisely, I don't see how it can be avoided.
(Edit: As per comments, you do this only once for a given set.
If, additionally, we are out of luck, and the set is given as a plain array or list, not some fancier data structure, we get O(s) expected time with this approach, which still fits into the O(log n + s) requirement.)
If attacks against the unbounded algorithm are a concern (and only if they truly are), the method can include a fall-back algorithm for the cases when a certain fixed number of iterations didn't provide the answer.
Similarly to how IntroSort is QuickSort but falls back to HeapSort if the recursion depth gets too high (which is almost certainly a result of an attack resulting in quadratic QuickSort behavior).
Find all numbers that are in a forbidden set and less or equal then n-s. Call it array A.
Find all numbers that are not in a forbidden set and greater then n-s. Call it array B. It may be done in O(s) if set is sorted.
Note that lengths of A and B are equal, and create mapping map[A[i]] = B[i]
Generate number t up to n-s. If there is map[t] return it, otherwise return t
It will work in O(s) insertions to a map + 1 lookup which is either O(s) in average or O(s log s)

Place "sum" and "multiply" operators between the elements of a given list of integers so that the expression results in a specified value

I was given a tricky question.
Given:
A = [a1,a2,...an] (list of positive integers with length "n")
r (positive integer)
Find a list of { *, + } operators
O = [o1,o2,...on-1]
so that if we placed those operators between the elements of "A", the resulting expression would evaluate to "r". Only one solution is required.
So for example if
A = [1,2,3,4]
r = 14
then
O = [*, +, *]
I've implemented a simple recursive solution with some optimisation, but of course it's exponential O(2^n) time, so for an input with length 40, it works for ages.
I wanted to ask if any of you know a sub-exponential solution for this?
Update
Elements of A are between 0-10000,
r can be arbitrarily big
Let A and B be positive integers. Then A + B ≤ A × B + 1.
This little fact can be used to construct a very efficient algorithm.
Let's define a graph. The graph nodes correspond to operations lists, for example, [+, ×, +, +, ×]. There is an edge from graph node X to graph node Y if the Y can be obtained by changing a single + to a × in X. The graph has a source at the node corresponding to [+, +, ..., +].
Now perform a breadth-first search from the source node, constructing the graph as you go. When expanding a node [+, ×, +, +, ×], for example, you (optionally construct then) connect to the nodes [×, ×, +, +, ×], [+, ×, ×, +, ×], and [+, ×, +, ×, ×]. Do not expand to a node if the result of evaluating it is greater than r + k(O), where k(O) is the number of +'s in the operation list O. This is because of the "+ 1" in the fact at the beginning of the answer - consider the case of a = [1, 1, 1, 1, 1], r = 1.
This approach uses O(n 2n) time and O(2n) space (where both are potentially very-loose worst case bounds). This is still an exponential algorithm, however I think you will find it performs very reasonably for non-sinister inputs. (I suspect this problem is NP-complete, which is why I am happy with this "non-sinister inputs" escape clause.)
Here's an O(rn^2)-time, O(rn)-space DP approach. If r << 2^n then this will have better worst-case behaviour than exponential-time branch-and-bound approaches, though even then the latter may still be faster on many instances. This is pseudo-polynomial time, because it takes time proportional to the value of part of its input (r), not its size (which would be log2(r)). Specifically it needs rn bits of memory, so it should give answers in a few seconds for up to around rn < 1,000,000,000 and n < 1000 (e.g. n = 100, r = 10,000,000).
The key observation is that any formula involving all n numbers has a final term that consists of some number i of factors, where 1 <= i <= n. That is, any formula must be in one of the following n cases:
(a formula on the first n-1 terms) + a[n]
(a formula on the first n-2 terms) + a[n-1] * a[n]
(a formula on the first n-3 terms) + a[n-2] * a[n-1] * a[n]
...
a[1] * a[2] * ... * a[n]
Let's call the "prefix" of a[] consisting of the first i numbers P[i]. If we record, for each 0 <= i <= n-1, the complete set of values <= r that can be reached by some formula on P[i], then based on the above, we can quite easily compute the complete set of values <= r that can be reached by P[n]. Specifically, let X[i][j] be a true or false value that indicates whether the prefix P[i] can achieve the value j. (X[][] could be stored as an array of n size-(r+1) bitmaps.) Then what we want to do is compute X[n][r], which will be true if r can be reached by some formula on a[], and false otherwise. (X[n][r] isn't quite the full answer yet, but it can be used to get the answer.)
X[1][a[1]] = true. X[1][j] = false for all other j. For any 2 <= i <= n and 0 <= j <= r, we can compute X[i][j] using
X[i][j] = X[i - 1][j - a[i]] ||
X[i - 2][j - a[i-1]*a[i]] ||
X[i - 3][j - a[i-2]*a[i-1]*a[i]] ||
... ||
X[1][j - a[2]*a[3]*...*a[i]] ||
(a[1]*a[2]*...*a[i] == j)
Note that the last line is an equality test that compares the product of all i numbers in P[i] to j, and returns true or false. There are i <= n "terms" (rows) in the expression for X[i][j], each of which can be computed in constant time (note in particular that the multiplications can be built up in constant time per row), so computing a single value X[i][j] can be done in O(n) time. To find X[n][r], we need to calculate X[i][j] for every 1 <= i <= n and every 0 <= j <= r, so there is O(rn^2) overall work to do. (Strictly speaking we may not need to compute all of these table entries if we use memoization instead of a bottom-up approach, but many inputs will require us to compute a large fraction of them anyway, so it's likely that the latter is faster by a small constant factor. Also a memoization approach requires keeping an "already processed" flag for each DP cell -- which doubles the memory usage when each cell is just 1 bit!)
Reconstructing a solution
If X[n][r] is true, then the problem has a solution (satisfying formula), and we can reconstruct one in O(n^2) time by tracing back through the DP table, starting from X[n][r], at each location looking for any term that enabled the current location to assume the value "true" -- that is, any true term. (We could do this reconstruction step faster by storing more than a single bit per (i, j) combination -- but since r is allowed to be "arbitrarily big", and this faster reconstruction won't improve the overall time complexity, it probably makes more sense to go with the approach that uses the fewest bits per DP table entry.) All satisfying solutions can be reconstructed this way, by backtracking through all true terms instead of just picking any one -- but there may be an exponential number of them.
Speedups
There are two ways that calculation of an individual X[i][j] value can be sped up. First, because all the terms are combined with ||, we can stop as soon as the result becomes true, since no later term can make it false again. Second, if there is no zero anywhere to the left of i, we can stop as soon as the product of the final numbers becomes larger than r, since there's no way for that product to be decreased again.
When there are no zeroes in a[], that second optimisation is likely to be very important in practice: it has the potential to make the inner loop much smaller than the full i-1 iterations. In fact if a[] contains no zeroes, and its average value is v, then after k terms have been computed for a particular X[i][j] value the product will be around v^k -- so on average, the number of inner loop iterations (terms) needed drops from n to log_v(r) = log(r)/log(v). That might be much smaller than n, in which case the average time complexity for this model drops to O(rn*log(r)/log(v)).
[EDIT: We actually can save multiplications with the following optimisation :)]
8/32/64 X[i][j]s at a time: X[i][j] is independent of X[i][k] for k != j, so if we are using bitsets to store these values, we can calculate 8, 32 or 64 of them (or maybe more, with SSE2 etc.) in parallel using simple bitwise OR operations. That is, we can calculate the first term of X[i][j], X[i][j+1], ..., X[i][j+31] in parallel, OR them into the results, then calculate their second terms in parallel and OR them in, etc. We still need to perform the same number of subtractions this way, but the products are all the same, so we can reduce the number of multiplications by a factor of 8/32/64 -- as well as, of course, the number of memory accesses. OTOH, this makes the first optimisation from the previous paragraph harder to accomplish -- you have to wait until an entire block of 8/32/64 bits have become true before you can stop iterating.
Zeroes: Zeroes in a[] may allow us to stop early. Specifically, if we have just computed X[i][r] for some i < n and found it to be true, and there is a zero anywhere to the right of position i in a[], then we can stop: we already have a formula on the first i numbers that evaluates to r, and we can use that zero to "kill off" all numbers to the right of position i by creating one big product term that includes all of them.
Ones: An interesting property of any a[] entry containing the value 1 is that it can be moved to any other position in a[] without affecting whether or not there is a solution. This is because every satisfying formula either has a * on at least one side of this 1, in which case it multiplies some other term and has no effect there, and would likewise have no effect anywhere else; or it has a + on both sides (imagine extra + signs before the first position and after the last), in which case it might as well be added in anywhere.
So, we can safely shunt all 1 values to the end of a[] before doing anything else. The point of doing this is that now we don't have to evaluate these rows of X[][] at all, because they only influence the outcome in a very simple way. Suppose there are m < n ones in a[], which we have moved to the end. Then after computing the m+1 values X[n-m][r-m], X[n-m][r-m+1], X[n-m][r-m+2], ..., X[n-m][r], we already know what X[n][r] must be: if any of them are true, then X[n][r] must be true, otherwise (if they are all false) it must be false. This is because the final m ones can add anywhere from 0 up to m to a formula on the first n-m values. (But if a[] consists entirely of 1s, then at least 1 must be "added" -- they can't all multiply some other term.)
Here is another approach that might be helpful. It is sometimes known as a "meet-in-the-middle" algorithm and runs in O(n * 2^(n/2)). The basic idea is this. Suppose n = 40 and you know that the middle slot is a +. Then, you can brute force all N := 2^20 possibilities for each side. Let A be a length N array storing the possible values of the left side, and similarly let B be a length N array storing the values for the right side.
Then, after sorting A and B, it is not hard to efficiently check for whether any two of them sum to r (e.g. for each value in A, do a binary search on B, or you can even do it in linear time if both arrays are sorted). This part takes O(N * log N) = O(n * 2^(n/2)) time.
Now, this was all assuming the middle slot is a +. If not, then it has to be a *, and you can combine the middle two elements into one (their product), reducing the problem to n = 39. Then you try the same thing, and so on. If you analyze it carefully, you should get O(n * 2^(n/2)) as the asymptotic complexity, since actually the largest term dominates.
You need to do some bookkeeping to actually recover the +'s and *'s, which I have left out to simplify the explanation.

How to find pair with kth largest sum?

Given two sorted arrays of numbers, we want to find the pair with the kth largest possible sum. (A pair is one element from the first array and one element from the second array). For example, with arrays
[2, 3, 5, 8, 13]
[4, 8, 12, 16]
The pairs with largest sums are
13 + 16 = 29
13 + 12 = 25
8 + 16 = 24
13 + 8 = 21
8 + 12 = 20
So the pair with the 4th largest sum is (13, 8). How to find the pair with the kth largest possible sum?
Also, what is the fastest algorithm? The arrays are already sorted and sizes M and N.
I am already aware of the O(Klogk) solution , using Max-Heap given here .
It also is one of the favorite Google interview question , and they demand a O(k) solution .
I've also read somewhere that there exists a O(k) solution, which i am unable to figure out .
Can someone explain the correct solution with a pseudocode .
P.S.
Please DON'T post this link as answer/comment.It DOESN'T contain the answer.
I start with a simple but not quite linear-time algorithm. We choose some value between array1[0]+array2[0] and array1[N-1]+array2[N-1]. Then we determine how many pair sums are greater than this value and how many of them are less. This may be done by iterating the arrays with two pointers: pointer to the first array incremented when sum is too large and pointer to the second array decremented when sum is too small. Repeating this procedure for different values and using binary search (or one-sided binary search) we could find Kth largest sum in O(N log R) time, where N is size of the largest array and R is number of possible values between array1[N-1]+array2[N-1] and array1[0]+array2[0]. This algorithm has linear time complexity only when the array elements are integers bounded by small constant.
Previous algorithm may be improved if we stop binary search as soon as number of pair sums in binary search range decreases from O(N2) to O(N). Then we fill auxiliary array with these pair sums (this may be done with slightly modified two-pointers algorithm). And then we use quickselect algorithm to find Kth largest sum in this auxiliary array. All this does not improve worst-case complexity because we still need O(log R) binary search steps. What if we keep the quickselect part of this algorithm but (to get proper value range) we use something better than binary search?
We could estimate value range with the following trick: get every second element from each array and try to find the pair sum with rank k/4 for these half-arrays (using the same algorithm recursively). Obviously this should give some approximation for needed value range. And in fact slightly improved variant of this trick gives range containing only O(N) elements. This is proven in following paper: "Selection in X + Y and matrices with sorted rows and columns" by A. Mirzaian and E. Arjomandi. This paper contains detailed explanation of the algorithm, proof, complexity analysis, and pseudo-code for all parts of the algorithm except Quickselect. If linear worst-case complexity is required, Quickselect may be augmented with Median of medians algorithm.
This algorithm has complexity O(N). If one of the arrays is shorter than other array (M < N) we could assume that this shorter array is extended to size N with some very small elements so that all calculations in the algorithm use size of the largest array. We don't actually need to extract pairs with these "added" elements and feed them to quickselect, which makes algorithm a little bit faster but does not improve asymptotic complexity.
If k < N we could ignore all the array elements with index greater than k. In this case complexity is equal to O(k). If N < k < N(N-1) we just have better complexity than requested in OP. If k > N(N-1), we'd better solve the opposite problem: k'th smallest sum.
I uploaded simple C++11 implementation to ideone. Code is not optimized and not thoroughly tested. I tried to make it as close as possible to pseudo-code in linked paper. This implementation uses std::nth_element, which allows linear complexity only on average (not worst-case).
A completely different approach to find K'th sum in linear time is based on priority queue (PQ). One variation is to insert largest pair to PQ, then repeatedly remove top of PQ and instead insert up to two pairs (one with decremented index in one array, other with decremented index in other array). And take some measures to prevent inserting duplicate pairs. Other variation is to insert all possible pairs containing largest element of first array, then repeatedly remove top of PQ and instead insert pair with decremented index in first array and same index in second array. In this case there is no need to bother about duplicates.
OP mentions O(K log K) solution where PQ is implemented as max-heap. But in some cases (when array elements are evenly distributed integers with limited range and linear complexity is needed only on average, not worst-case) we could use O(1) time priority queue, for example, as described in this paper: "A Complexity O(1) Priority Queue for Event Driven Molecular Dynamics Simulations" by Gerald Paul. This allows O(K) expected time complexity.
Advantage of this approach is a possibility to provide first K elements in sorted order. Disadvantages are limited choice of array element type, more complex and slower algorithm, worse asymptotic complexity: O(K) > O(N).
EDIT: This does not work. I leave the answer, since apparently I am not the only one who could have this kind of idea; see the discussion below.
A counter-example is x = (2, 3, 6), y = (1, 4, 5) and k=3, where the algorithm gives 7 (3+4) instead of 8 (3+5).
Let x and y be the two arrays, sorted in decreasing order; we want to construct the K-th largest sum.
The variables are: i the index in the first array (element x[i]), j the index in the second array (element y[j]), and k the "order" of the sum (k in 1..K), in the sense that S(k)=x[i]+y[j] will be the k-th greater sum satisfying your conditions (this is the loop invariant).
Start from (i, j) equal to (0, 0): clearly, S(1) = x[0]+y[0].
for k from 1 to K-1, do:
if x[i+1]+ y[j] > x[i] + y[j+1], then i := i+1 (and j does not change) ; else j:=j+1
To see that it works, consider you have S(k) = x[i] + y[j]. Then, S(k+1) is the greatest sum which is lower (or equal) to S(k), and such as at least one element (i or j) changes. It is not difficult to see that exactly one of i or j should change.
If i changes, the greater sum you can construct which is lower than S(k) is by setting i=i+1, because x is decreasing and all the x[i'] + y[j] with i' < i are greater than S(k). The same holds for j, showing that S(k+1) is either x[i+1] + y[j] or x[i] + y[j+1].
Therefore, at the end of the loop you found the K-th greater sum.
tl;dr: If you look ahead and look behind at each iteration, you can start with the end (which is highest) and work back in O(K) time.
Although the insight underlying this approach is, I believe, sound, the code below is not quite correct at present (see comments).
Let's see: first of all, the arrays are sorted. So, if the arrays are a and b with lengths M and N, and as you have arranged them, the largest items are in slots M and N respectively, the largest pair will always be a[M]+b[N].
Now, what's the second largest pair? It's going to have perhaps one of {a[M],b[N]} (it can't have both, because that's just the largest pair again), and at least one of {a[M-1],b[N-1]}. BUT, we also know that if we choose a[M-1]+b[N-1], we can make one of the operands larger by choosing the higher number from the same list, so it will have exactly one number from the last column, and one from the penultimate column.
Consider the following two arrays: a = [1, 2, 53]; b = [66, 67, 68]. Our highest pair is 53+68. If we lose the smaller of those two, our pair is 68+2; if we lose the larger, it's 53+67. So, we have to look ahead to decide what our next pair will be. The simplest lookahead strategy is simply to calculate the sum of both possible pairs. That will always cost two additions, and two comparisons for each transition (three because we need to deal with the case where the sums are equal);let's call that cost Q).
At first, I was tempted to repeat that K-1 times. BUT there's a hitch: the next largest pair might actually be the other pair we can validly make from {{a[M],b[N]}, {a[M-1],b[N-1]}. So, we also need to look behind.
So, let's code (python, should be 2/3 compatible):
def kth(a,b,k):
M = len(a)
N = len(b)
if k > M*N:
raise ValueError("There are only %s possible pairs; you asked for the %sth largest, which is impossible" % M*N,k)
(ia,ib) = M-1,N-1 #0 based arrays
# we need this for lookback
nottakenindices = (0,0) # could be any value
nottakensum = float('-inf')
for i in range(k-1):
optionone = a[ia]+b[ib-1]
optiontwo = a[ia-1]+b[ib]
biggest = max((optionone,optiontwo))
#first deal with look behind
if nottakensum > biggest:
if optionone == biggest:
newnottakenindices = (ia,ib-1)
else: newnottakenindices = (ia-1,ib)
ia,ib = nottakenindices
nottakensum = biggest
nottakenindices = newnottakenindices
#deal with case where indices hit 0
elif ia <= 0 and ib <= 0:
ia = ib = 0
elif ia <= 0:
ib-=1
ia = 0
nottakensum = float('-inf')
elif ib <= 0:
ia-=1
ib = 0
nottakensum = float('-inf')
#lookahead cases
elif optionone > optiontwo:
#then choose the first option as our next pair
nottakensum,nottakenindices = optiontwo,(ia-1,ib)
ib-=1
elif optionone < optiontwo: # choose the second
nottakensum,nottakenindices = optionone,(ia,ib-1)
ia-=1
#next two cases apply if options are equal
elif a[ia] > b[ib]:# drop the smallest
nottakensum,nottakenindices = optiontwo,(ia-1,ib)
ib-=1
else: # might be equal or not - we can choose arbitrarily if equal
nottakensum,nottakenindices = optionone,(ia,ib-1)
ia-=1
#+2 - one for zero-based, one for skipping the 1st largest
data = (i+2,a[ia],b[ib],a[ia]+b[ib],ia,ib)
narrative = "%sth largest pair is %s+%s=%s, with indices (%s,%s)" % data
print (narrative) #this will work in both versions of python
if ia <= 0 and ib <= 0:
raise ValueError("Both arrays exhausted before Kth (%sth) pair reached"%data[0])
return data, narrative
For those without python, here's an ideone: http://ideone.com/tfm2MA
At worst, we have 5 comparisons in each iteration, and K-1 iterations, which means that this is an O(K) algorithm.
Now, it might be possible to exploit information about differences between values to optimise this a little bit, but this accomplishes the goal.
Here's a reference implementation (not O(K), but will always work, unless there's a corner case with cases where pairs have equal sums):
import itertools
def refkth(a,b,k):
(rightia,righta),(rightib,rightb) = sorted(itertools.product(enumerate(a),enumerate(b)), key=lamba((ia,ea),(ib,eb):ea+eb)[k-1]
data = k,righta,rightb,righta+rightb,rightia,rightib
narrative = "%sth largest pair is %s+%s=%s, with indices (%s,%s)" % data
print (narrative) #this will work in both versions of python
return data, narrative
This calculates the cartesian product of the two arrays (i.e. all possible pairs), sorts them by sum, and takes the kth element. The enumerate function decorates each item with its index.
The max-heap algorithm in the other question is simple, fast and correct. Don't knock it. It's really well explained too. https://stackoverflow.com/a/5212618/284795
Might be there isn't any O(k) algorithm. That's okay, O(k log k) is almost as fast.
If the last two solutions were at (a1, b1), (a2, b2), then it seems to me there are only four candidate solutions (a1-1, b1) (a1, b1-1) (a2-1, b2) (a2, b2-1). This intuition could be wrong. Surely there are at most four candidates for each coordinate, and the next highest is among the 16 pairs (a in {a1,a2,a1-1,a2-1}, b in {b1,b2,b1-1,b2-1}). That's O(k).
(No it's not, still not sure whether that's possible.)
[2, 3, 5, 8, 13]
[4, 8, 12, 16]
Merge the 2 arrays and note down the indexes in the sorted array. Here is the index array looks like (starting from 1 not 0)
[1, 2, 4, 6, 8]
[3, 5, 7, 9]
Now start from end and make tuples. sum the elements in the tuple and pick the kth largest sum.
public static List<List<Integer>> optimization(int[] nums1, int[] nums2, int k) {
// 2 * O(n log(n))
Arrays.sort(nums1);
Arrays.sort(nums2);
List<List<Integer>> results = new ArrayList<>(k);
int endIndex = 0;
// Find the number whose square is the first one bigger than k
for (int i = 1; i <= k; i++) {
if (i * i >= k) {
endIndex = i;
break;
}
}
// The following Iteration provides at most endIndex^2 elements, and both arrays are in ascending order,
// so k smallest pairs must can be found in this iteration. To flatten the nested loop, refer
// 'https://stackoverflow.com/questions/7457879/algorithm-to-optimize-nested-loops'
for (int i = 0; i < endIndex * endIndex; i++) {
int m = i / endIndex;
int n = i % endIndex;
List<Integer> item = new ArrayList<>(2);
item.add(nums1[m]);
item.add(nums2[n]);
results.add(item);
}
results.sort(Comparator.comparing(pair->pair.get(0) + pair.get(1)));
return results.stream().limit(k).collect(Collectors.toList());
}
Key to eliminate O(n^2):
Avoid cartesian product(or 'cross join' like operation) of both arrays, which means flattening the nested loop.
Downsize iteration over the 2 arrays.
So:
Sort both arrays (Arrays.sort offers O(n log(n)) performance according to Java doc)
Limit the iteration range to the size which is just big enough to support k smallest pairs searching.

How can I compute the average cost for this solution of the element uniqueness problem?

In the book Introduction to the Design & Analysis of Algorithms, the following solution is proposed to the element uniqueness problem:
ALGORITHM UniqueElements(A[0 .. n-1])
// Determines whether all the elements in a given array are distinct
// Input: An array A[0 .. n-1]
// Output: Returns "true" if all the elements in A are distinct
// and false otherwise.
for i := 0 to n - 2 do
for j := i + 1 to n - 1 do
if A[i] = A[j] return false
return true
How can I compute the average cost (i.e. number of comparisons for a given n) for this algorithm? What is a reasonable assumption about the input?
If you don't know anything else about the input, then a reasonable assumption is that it's random. If so, and if the space of possible choices is large (e.g. the set of all real numbers), then the likelihood of two elements being the same is vanishingly small. (Mathematically, we say that the event of two randomly selected real numbers being distinct is almost sure.)
That means that your average case is equal to your worst case: you'll have to scan every element in the array to be sure that each one is distinct. Then the number of comparisons is n * (n - 1) / 2, or the sum of 1 ... n.
I think it's hard to talk about an average cost. The worst case cost is O(n2) and happens either when the repeated elements are towards the end of the array, for example something like this:
2 3 4 5 ... 1 1
Or when the array contains nothing but distinct elements.
The best case is when the array starts with two repeated elements, like this:
1 1 ...
In which case the cost is a single comparison. Another good case is when there exists an element near the beginning of the array that repeats at the end of the array, something like this:
2 3 4 1 ... 1
This will be (closer to) O(n).
The fact is the cost depends on the input, so you might as well assume you're going to always hit a worst case and try to find a better algorithm, maybe something based on sorting the array or on using hash tables, giving you O(nlog n) worst case and O(n) average case respectively.
Since you iterate twice over the array in a nested way, worst case cost should be O(n²)..
a closer look would show you that since you start second loop from the element after the one you are checking you have:
N-1 + (N-2) + (N-3) + (N-4) + (N-5) + .... + 1
comparisons so the exact average cost would be N*(N-1) / 2
According to your comment I think that you should assume that every element is uniformely chosen between the set of possible values.
This means that the element A[i] has the probability 1/n of being exactly a specified value. Starting from here you can do your considerations:
first of all you choose a whatever element of the array A[i]. What is the probability of having A[i] == A[i+1]? It's 1/n² since both elements are supposed to be random.
what is the probability of having A[i] == A[i+2]? You have 1/n * (n-1/n) * 1/n because you have respectively a specified element, anything except the specified one, and the same specified element
you can extend the argumentation over any element A[k] with k>i, then you add all probabilities and you will have which is the average probability of having two unique element in the array starting from a specified one.
you extend thing thing further considering that you can start from any A[i] with i = 0..l-1. Of course every different i will have different probabilities because array will be shorter as i increases.
NOTE: n is the number of different items that can be inserted into the array, not its length.
After this you can easily estimate your average comparison cost..
If you need an exact value for a given input length then this will work (thought it is overkill):
ALGORITHM complexity_counter_of_UniqueElements(A[0 .. n-1])
// Determines whether all the elements in a given array are distinct
// Input: An array A[0 .. n-1]
// Output: Returns "true" if all the elements in A are distinct
// and false otherwise.
counter acc = 0;
for i := 0 to n - 2 do
for j := i + 1 to n - 1 do
//if A[i] = A[j] return false
acc := 1 + acc
return acc
It is easy to see that this algorithm is O(nn) though, which is probably what you're interested in. The algorithm compares every element by every other element. If you created a table with the results of this the table would have to be at least ((nn)/2) to hold all of the results.
edit:
I see now what you were really asking.
You need to compute the probability that each comparison may result in a match. This depends on the size of your elements (things that live in A) and what kind of distribution they have.
Assuming a random distribution the chance that any two random A[x] == A[y] where x != y would be 1.0/(number of possible values of element).
P(n)
total_chance := 0.0
for i:= 0 to n - 2 do
for j := i + 1 to n - 1 do
this_chance := 1.0/(number_of_possible_values_of_element)
total_chance := total_chance + ((1-total_chance)*this_chance)
// This should be the the probability of the newly compared pair being equal weighted
// to account for the chance that it actually mattered (ie, hadn't found a match earlier)
return total_chance
O((1-P(n))nn), but P(n) is <= 1, so it is less than n*n

Resources