Number of Binary Search Trees with given right arm stretch - algorithm

For a given array of distinct (unique) integers, I want to know, over all permutations of the array, the number of BSTs whose rightmost arm has length k.
(If k = 3, root->right->right is a leaf node.)
(For my current requirement, I cannot afford an algorithm with cost greater than N^3.)
Two identical BSTs generated from different permutations are considered different.
My approach so far is:
Assume a function:
F(arr) = {a1, a2, a3...}
where a1 is the count of permutations with k = 1, a2 is the count with k = 2, etc.
F(arr[1:n]) = for i in range 1 to n: (1 + df * F(subarr of elements larger than arr[i]))
where df is a dynamic factor: (n-1)C(count of elements smaller than arr[i])
I am trying to create a dp to solve the problem:
Sort the array.
Start from the largest number and move toward smaller numbers.
dp[i][i] = 1
for (j in range i-1 to 1): dp[j][i] = some function of dp[j][i-1], but I am unable to formulate it.
For example, for arr {4, 3, 2, 1}, I expect the following dp:
arr[i]               4    3    2    1
                  +----+----+----+----+
k = 1             |  1 |  1 |  2 |  6 |
                  +----+----+----+----+
k = 2             |  - |  1 |  3 | 11 |
                  +----+----+----+----+
k = 3             |  - |  - |  1 |  6 |
                  +----+----+----+----+
k = 4             |  - |  - |  - |  1 |
                  +----+----+----+----+
verification (n!)    1    2    6   24
Any hints, suggestions, pointers or redirections to a good source where I can satisfy my curiosity are welcome.
Thank you.
edit: It seems I may need a 3D dp array. I am working on it.
edit: Corrected column 3 of the dp table.

The good news is that if you don't want the permutations themselves but only their number, there is a formula for that: these counts are known as the (unsigned) Stirling numbers of the first kind. The reason is that the arms of a binary search tree are the left-to-right records of the insertion order. The left arm consists of the left-to-right minima, that is, the values i such that every number appearing before i is greater than i; symmetrically, the right arm consists of the left-to-right maxima, and both statistics have the same distribution. Here is an example where the minima are marked:
6 8 3 5 4 2 7 1 9
^   ^     ^   ^
This gives the tree
        6
      /   \
     3     8
    / \   / \
   2   5 7   9
  /   /
 1   4
These numbers are known to count permutations according to various characteristics (number of cycles, ...), and left-to-right maxima and minima are among those characteristics. You can find more information in entry A008275 of The On-Line Encyclopedia of Integer Sequences.
Now to answer the question of computing them. Let S(n,k) be the number of permutations of n numbers with k left to right minima. You can use the recurrence:
S(n, 0) = 0 for all n
S(n+1, k) = n*S(n, k) + S(n, k-1) for all n>0 and k>0
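To make the recurrence concrete, here is a small Java sketch (my own illustration, not part of the original answer; it assumes 1 <= k <= n):

static long stirling(int n, int k) {
    // S[m][j] = number of permutations of m values with j left-to-right minima
    long[][] S = new long[n + 1][n + 1];
    S[1][1] = 1;  // the single permutation of one element
    for (int m = 1; m < n; m++)
        for (int j = 1; j <= m + 1; j++)
            // S(m+1, j) = m * S(m, j) + S(m, j-1); S(m, 0) = 0 is the array default
            S[m + 1][j] = m * S[m][j] + S[m][j - 1];
    return S[n][k];
}

For n = 4 this reproduces the last column of the question's dp table: S(4,1) = 6, S(4,2) = 11, S(4,3) = 6, S(4,4) = 1.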

If I understand your problem correctly:
You do not need to sort the array. Since all numbers in your array are unique, you can assume that every possible subtree is unique.
Therefore you just need to count how many unique trees you can build having N - k unique elements, where N is the length of your array and k is the length of the rightmost arm. In other words, it is the number of permutations of your left subtree if you fix your right subtree to the structure (root (node1 (node2 ... nodeK))).
Here is a way to calculate the number of binary trees of size N:
public int numTrees(int n) {
    int[] ut = new int[Math.max(n + 1, 3)];
    ut[1] = 1;
    ut[2] = 2;
    for (int i = 3; i <= n; i++) {
        int u = 0;
        for (int j = 0; j < i; j++) {
            // Math.max(1, ...) treats the empty subtree (ut[0] == 0) as one shape
            u += Math.max(1, ut[j]) * Math.max(1, ut[i - j - 1]);
        }
        ut[i] = u;
    }
    return ut[n];
}
It has O(n^2) time complexity and O(n) space complexity.
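For instance, numTrees(4) returns 14, the number of distinct binary tree shapes on 4 nodes (the 4th Catalan number).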

Related

Could anyone tell me a better solution for this problem? I could only think of a brute-force way, which is O(n^2).

Recently I was attempting the following problem:
Given an array of integers, arr.
Find the sum of floor(arr[i] / arr[j]) over all pairs of indices (i, j).
e.g.
Input: arr[]={1,2,3,4,5}
Output: Sum=27.
Explanation:
(1/1)+(1/5)+(1/4)+(1/2)+(1/3) = 1+0+0+0+0 = 1
(5/1)+(5/5)+(5/4)+(5/2)+(5/3) = 5+1+1+2+1 = 10
(4/1)+(4/5)+(4/4)+(4/2)+(4/3) = 4+0+1+2+1 = 8
(2/1)+(2/5)+(2/4)+(2/2)+(2/3) = 2+0+0+1+0 = 3
(3/1)+(3/5)+(3/4)+(3/2)+(3/3) = 3+0+0+1+1 = 5
I could only think of the naive O(n^2) solution. Is there any better approach?
Thanks in advance.
A possibility resides in "quickly" skipping runs of elements whose ratio to a given element is the same integer (after flooring).
For the given example, the vertical bars below delimit runs of equal ratios (the lower triangle is all zeroes and ignored; I show the elements on the left and the ratios on the right):
1 -> 2 | 3 | 4 | 5 ≡ 2 | 3 | 4 | 5
2 -> 3 | 4 5 ≡ 1 | 2 2
3 -> 4 5 ≡ 1 1
4 -> 5 ≡ 1
For bigger arrays, the constant runs can be longer.
So the algorithm principle is
sort all elements increasingly;
process the elements from smallest to largest;
for a given element, find the index of the first double and count the number of skipped elements;
from there, find the index of the first triple and count twice the number of skipped elements;
continue with higher multiples until you exhaust the tail of the array.
A critical operation is to "find the next multiple". This should be done by an exponential search followed by a dichotomic search, so that the number of operations remains logarithmic in the number of elements to skip (a pure dichotomic search would be logarithmic in the total number of remaining elements). Hence the cost of a search will be proportional to the sum of the logarithms of the distances between the multiples.
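For illustration, here is one possible Java sketch of that search step (my own example, not from the answer; the name gallop and its parameters are invented). It returns the first index at or after from whose value is at least target, in time logarithmic in the distance skipped:

static int gallop(int[] a, int from, int target) {
    int step = 1, hi = from;
    while (hi < a.length && a[hi] < target) {  // exponential phase: steps of 1, 2, 4, ...
        from = hi;                             // a[from] is still below target
        hi += step;
        step *= 2;
    }
    int lo = from, end = Math.min(hi, a.length);
    while (lo < end) {                         // dichotomic phase on the last gap
        int mid = (lo + end) >>> 1;
        if (a[mid] < target) lo = mid + 1;
        else end = mid;
    }
    return lo;
}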
Hopefully, this sum of logarithms will be smaller than the sum of the distances themselves, though in the worst case the cost per element remains O(N); in the best case it is O(log(N)).
A global analysis is difficult and in theory the worst-case complexity remains O(N²); but in practice it could go down to O(N log N), because the worst case would require that the elements grow faster than a geometric progression of common ratio 2.
Addendum:
If the array contains numerous repeated values, it can be beneficial to compress it by storing a repetition count and a single instance of every value. This can be done after sorting.
int[] arr = { 1, 2, 3, 4, 5 };
int result = 0;
int BuffSize = arr.Max() * 2;
int[] b = new int[BuffSize + 1];   // b[i] = number of elements with value >= i
int[] count = new int[BuffSize];   // count[v] = number of occurrences of value v
for (int i = 0; i < arr.Length; ++i)
    count[arr[i]]++;
for (int i = BuffSize - 1; i >= 1; i--)
{
    b[i] = b[i + 1] + count[i];
}
for (int i = 1; i < BuffSize; i++)
{
    if (count[i] == 0)
    {
        continue;
    }
    // all elements with value in [j, j + i) have floor(value / i) == mul
    for (int j = i, mul = 1; j < BuffSize; j += i, mul++)
    {
        result += (b[j] - b[Math.Min(BuffSize - 1, j + i)]) * mul * count[i];
    }
}
This code takes advantage of knowing the difference between successive values ahead of time, and only processes the remaining portion of the array rather than redundantly processing the entire thing n^2 times.
I believe it has a worst-case runtime of O(n*sqrt(n)*log(n)).

How to prove the optimality of this greedy algo?

Given N integers, each of which can be increased or decreased once by no more than a given positive integer L. If any numbers become equal after these operations, we consider them as one number. The problem is to calculate the cardinality of the minimal resulting set of distinct integers.
Constraints: N <= 100, L <= 3200, integers are in the range [-32000, 32000]
Example: N = 3, L = 10
11 21 27
1) increase 11 by 10 => 21 21 27
2) decrease 27 by 6 => 21 21 21
The answer is 1.
Algo in C++ language:
sort(v.begin(), v.end());
// the algo tries to include elements in an interval of length 2 * L
int ans = 0;
int first = 0;
for (int i = 1; i < N; ++i) {
    if (v[i] - v[first] > 2 * L) { // if we can't include the i-th element
        ans++;                     // into the current interval,
        first = i;                 // the algo starts a new one
    }
}
ans++;
printf("%d", ans);
I am trying to understand why this greedy algo is optimal. Any help is appreciated.
Reframed, we're trying to cover the set of numbers that appear in the input with as few intervals of cardinality 2*L + 1 as possible. You can imagine that, for an interval [C - L, C + L], all numbers in it are adjusted to C.
Given any list of intervals in sorted order, we can show by induction on k that, considering only the first k intervals, the first k intervals chosen by greedy cover at least as much of the input. The base case k = 0 is trivial. Inductively, greedy's next interval covers the next uncovered input element and extends as far beyond it as possible; the interval in the arbitrary solution that covers that same next uncovered element cannot reach further right than greedy's, so the arbitrary solution has no more coverage.
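To convince yourself on small cases, here is an O(n^2) dynamic-programming cross-check in Java (my own sketch, not from the answer; minWindows is an invented name). It computes the same minimum directly and should agree with the greedy on every input:

static int minWindows(int[] v, int L) {  // requires java.util.Arrays
    java.util.Arrays.sort(v);
    int n = v.length;
    int[] dp = new int[n + 1];  // dp[i] = min number of width-2L windows covering the first i sorted values
    for (int i = 1; i <= n; i++) {
        dp[i] = Integer.MAX_VALUE;
        // one window of width 2*L can cover the sorted run v[j-1 .. i-1]
        for (int j = i; j >= 1 && v[i - 1] - v[j - 1] <= 2 * L; j--)
            dp[i] = Math.min(dp[i], dp[j - 1] + 1);
    }
    return dp[n];
}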

Finding the Closest Divisable in an Array

I have an array of N integers, where N <= 10^6. For every index i, I have to find the closest left neighbor j of i such that A[j] % A[i] == 0, with 0 <= j < i.
Example
3 4 2 6 7 3
Nearest Neighbor array
-1 -1 1 -1 -1 3
For the last element, i.e. 3: 6 % 3 == 0 and the index of 6 is 3, so the answer is 3.
Similarly for 2, the nearest neighbor is 4 (at index 1).
Brute Force Approach
int[] Neg = new int[n];
Arrays.fill(Neg, -1);
for (int i = 1; i < n; i++)
    for (int j = i - 1; j >= 0; j--)
        if (A[j] % A[i] == 0) {
            Neg[i] = j;
            break;
        }
This is an O(n^2) approach and will fail; how can I come up with a better approach, probably O(n log n)?
There is a simple O(n.sqrt(n)) algorithm as follows:
Initialize an array D to all -1.
For i = 1, 2, ..., n:
    Let a = A[i]
    Output D[a]
    For each divisor d of A[i]:
        set D[d] = i
You can find all the divisors of a number in O(sqrt(n)) with a simple loop, or you may find it faster to do some precomputing of factorizations.
The algorithm works by using D[x] to store the position j of the most recent number A[j] that is a multiple of x.
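Here is one possible Java rendering of that idea (a sketch, assuming all values are >= 1 and small enough to index an array; nearestDivisible is an invented name):

static int[] nearestDivisible(int[] A) {  // requires java.util.Arrays
    int max = 0;
    for (int v : A) max = Math.max(max, v);
    int[] D = new int[max + 1];    // D[x] = index of the most recent multiple of x
    java.util.Arrays.fill(D, -1);
    int[] ans = new int[A.length];
    for (int i = 0; i < A.length; i++) {
        ans[i] = D[A[i]];          // closest j < i with A[j] % A[i] == 0
        for (int d = 1; d * d <= A[i]; d++)
            if (A[i] % d == 0) {   // A[i] is a multiple of both d and A[i] / d
                D[d] = i;
                D[A[i] / d] = i;
            }
    }
    return ans;
}

On the example 3 4 2 6 7 3 this returns -1 -1 1 -1 -1 3, matching the expected output.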

Fastest unconditional sort algorithm

I have a function, which can take two elements and return them back in ascending order:
void Sort2(int &a, int &b) {
    if (a < b) return;
    int t = a;
    a = b;
    b = t;
}
What is the fastest way to sort an array with N entries using this function if I am not allowed to use extra conditional operators?
That means that my whole program should look like this:
int main() {
    int a[N];
    // fill the array a
    const int NS = ...;  // number of comparisons, depending on N
    const int c[NS][2] = { {0,1}, {0,2}, ... };  // sequence of index pairs generated depending on N
    for (int i = 0; i < NS; i++) {
        Sort2(a[c[i][0]], a[c[i][1]]);
    }
    // sort is finished
    return 1;
}
Most of the fast sort algorithms use conditions to decide what to do. There is bubble sort, of course, but it takes M = N(N-1)/2 comparisons. This is not optimal; for instance, with N = 4 it takes M = 6 comparisons, while 4 entries can be sorted with 5:
Sort2(a[0],a[1]);
Sort2(a[2],a[3]);
Sort2(a[1],a[3]);
Sort2(a[0],a[2]);
Sort2(a[1],a[2]);
The standard approach is known as Bitonic Mergesort. It is hella efficient when parallelized, and only slightly less efficient than conventional algorithms when not parallelized. Bitonic mergesort is a special kind of a wider class of algorithms known as "sorting networks"; it is unusual among sorting networks in that some of its reorderings are in reverse order of the desired sort (though everything is in the correct order once the algorithm completes). You can do that with your Sort2 by passing in a higher array slot for the first argument than for the second.
For N a power of 2 you can generalize the approach you used, by using a "merge-sortish" kind of approach: you sort the first half and the last half separately, and then merge these using a few comparisons.
For instance, consider an array of size 8. And assume that the first half is sorted and the last half is sorted (by applying this same approach recursively):
A B C D P Q R S
In the first round, you do a comparison of 1 vs 1, 2 vs 2, etc:
+---------------+
|   +---------------+
|   |           |   |
A   B   C   D   P   Q   R   S
        |   |           |   |
        +---------------+
            +---------------+
After this round, the first and the last element are in the right place, so you need to repeat the process for the inner 6 elements (I keep the names of the elements the same, because it is unknown where they end up):
    +-----------+
    |   +-----------+
    |   |       |   |
A   B   C   D   P   Q   R   S
            |           |
            +-----------+
In the next round, the inner 4 elements are compared, and in the last round the inner 2.
Let f(n) be the number of comparisons needed to sort an array of length n (where n is a power of 2, for the moment). Clearly, an array consisting of 1 element is sorted already:
f(1) = 0
For a longer array, you first need to sort both halves, and then perform the procedure described above. For n=8, that took 4+3+2+1 = 10 = (n/2)(n/2+1)/2 comparisons. Hence in general:
f(n) = 2 f(n/2) + (n/2)(n/2+1)/2
Note that for n=4, this indeed gives:
f(4) = 2 f(2) + 2*3/2
= 2 * (2 f(1) + 1*2/2) + 3
= 5
To accommodate values of n that are not a power of 2, the important thing is to do the merging step on an odd-length array. The simplest strategy seems to be to compare the smallest elements of both subarrays (which yields the smallest element overall) and then just continue on the rest of the array (which now has even length).
If we write g(k) = k(k+1)/2, we can now have a short way of writing the recursive formula (I use 2k and 2k+1 to distinguish even and odd):
f(1) = 0
f(2k) = 2 f(k) + g(k)
f(2k+1) = f(k+1) + f(k) + 1 + g(k)
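The recurrence is easy to transcribe and check; a quick Java rendering of the formulas above (my own sketch):

static int g(int k) { return k * (k + 1) / 2; }

static int f(int n) {
    if (n == 1) return 0;
    int k = n / 2;
    if (n % 2 == 0) return 2 * f(k) + g(k);  // f(2k)
    return f(k + 1) + f(k) + 1 + g(k);       // f(2k+1)
}

f(4) evaluates to 5, matching the hand-built network above, and f(2) gives 1.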
Some pseudocode on how to approach this:
function sort(A, start, length) {
    if (length == 1) {
        // do nothing
    } else if (length is even) {
        sort(A, start, length/2)
        sort(A, start + length/2, length/2)
        merge(A, start, length)
    } else if (length is odd) {
        sort(A, start, length/2 + 1)
        sort(A, start + length/2 + 1, length/2)
        Sort2(A[start], A[start + length/2 + 1])
        merge(A, start + 1, length - 1)
    }
}

function merge(A, start, length) {
    if (length > 0) {
        for (i = 0; i < length/2; i++)
            Sort2(A[start + i], A[start + i + length/2])
        merge(A, start + 1, length - 2)
    }
}
And you would run this on your array by
sort(A, 0, A.length)
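For reference, here is a direct Java translation of that pseudocode (my own sketch; Sort2 becomes a two-index swap on an array). For N = 4, sort(a, 0, 4) issues Sort2 on the same five index pairs as the question's example, in a slightly different order:

static void sort2(int[] a, int i, int j) {
    if (a[i] > a[j]) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}

static void sort(int[] a, int start, int length) {
    if (length <= 1) return;
    int half = length / 2;
    if (length % 2 == 0) {
        sort(a, start, half);
        sort(a, start + half, half);
        merge(a, start, length);
    } else {
        sort(a, start, half + 1);
        sort(a, start + half + 1, half);
        sort2(a, start, start + half + 1);  // compare the two subarray minima
        merge(a, start + 1, length - 1);
    }
}

static void merge(int[] a, int start, int length) {
    if (length <= 0) return;
    int half = length / 2;
    for (int i = 0; i < half; i++)
        sort2(a, start + i, start + i + half);  // i-th of each half
    merge(a, start + 1, length - 2);
}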

nth smallest element in a union of an array of intervals with repetition

I want to know if there is a more efficient solution than what I came up with (not coded yet, but the gist is described at the bottom).
Write a function calcNthSmallest(n, intervals) which takes as input a non-negative int n and a list of intervals [[a_1, b_1], ..., [a_m, b_m]], and calculates the nth smallest number (0-indexed) when taking the union of all the intervals with repetition. For example, if the intervals were [1,5], [2,4], [7,9], their union with repetition would be [1, 2, 2, 3, 3, 4, 4, 5, 7, 8, 9] (note 2, 3, 4 each appear twice since they're in both [1,5] and [2,4]). For this list of intervals, the 0th smallest number would be 1, and the 3rd and 4th smallest would both be 3. Your implementation should run quickly even when the a_i, b_i are very large (like one trillion) and there are several intervals.
The way I thought to go about it is the straightforward solution: build the union array and traverse it.
This problem can be solved in O(N log N) where N is the number of intervals in the list, regardless of the actual values of the interval endpoints.
The key to solving this problem efficiently is to transform the list of possibly-overlapping intervals into a list of intervals which are either disjoint or identical. In the given example, only the first interval needs to be split:
{ [1,5],               [2,4], [7,9]} =>
  +-----------------+  +---+  +---+
{ [1,1], [2,4], [5,5], [2,4], [7,9]}
(This doesn't have to be done explicitly, though: see below.) Now, we can sort the new intervals, replacing duplicates with a count. From that, we can compute the number of values each (possibly-duplicated) interval represents. Now, we simply need to accumulate the values to figure out which interval the solution lies in:
interval   count   size   values        cumulative
                          in interval   values
[1,1]        1      1          1        [0, 1)
[2,4]        2      3          6        [1, 7)   (e.g. n=1 through n=6 land here)
[5,5]        1      1          1        [7, 8)
[7,9]        1      3          3        [8, 11)
I wrote the cumulative values as a list of half-open intervals, but obviously we only need the end-points. We can then find which interval holds value n by, for example, binary-searching the cumulative values list, and we can figure out which value in the interval we want by subtracting the start of the interval from n and then integer-dividing by the count.
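For example, with n = 4: the cumulative column shows that 4 lies in [1, 7), the row for [2,4]; integer-dividing gives (4 - 1) / 2 = 1, so the value is 2 + 1 = 3, matching the union list above.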
It should be clear that the maximum size of the above table is twice the number of original intervals, because every row must start and end at either the start or end of some interval in the original list. If we'd written the intervals as half-open instead of closed, this would be even clearer; in that case, we can assert that the precise size of the table will be the number of unique values in the collection of end-points. And from that insight, we can see that we don't really need the table at all; we just need the sorted list of end-points (although we need to know which endpoint each value represents). We can simply iterate through that list, maintaining the count of the number of active intervals, until we reach the value we're looking for.
Here's a quick Python (2) implementation. It could be improved.
def combineIntervals(intervals):
    # endpoints will map each endpoint to a count
    endpoints = {}
    # These two loops collect the start and (1+end) of each interval:
    # each start adds 1 to the count, and each limit subtracts 1
    for start in (i[0] for i in intervals):
        endpoints[start] = endpoints.setdefault(start, 0) + 1
    for limit in (i[1]+1 for i in intervals):
        endpoints[limit] = endpoints.setdefault(limit, 0) - 1
    # Filtering is a possibly premature optimization but it was easy
    return sorted(filter(lambda kv: kv[1] != 0,
                         endpoints.iteritems()))

def nthSmallestInIntervalList(n, intervals):
    limits = combineIntervals(intervals)
    cumulative = 0
    count = 0
    index = 0
    here = limits[0][0]
    while index < len(limits):
        size = limits[index][0] - here
        if n < cumulative + count * size:
            # [here, next) contains the value we're searching for
            return here + (n - cumulative) / count
        # advance
        cumulative += count * size
        count += limits[index][1]
        here += size
        index += 1
    # We didn't find it. We could throw an error
So, as I said, the running time of this algorithm is independent of the actual values of the intervals; it only depends on the length of the interval list. This particular solution is O(N log N) because of the cost of the sort (in combineIntervals); if we used a priority queue instead of a full sort, we could construct the heap in O(N), making the scan cost O(log N) per scanned endpoint. Unless N is really big and the expected value of the argument n is relatively small, this would be counter-productive. There might be other ways to reduce the complexity, though.
Edit 2:
Here's yet another take on your question.
Let's consider the intervals graphically:
          1  1   1 2  2  2  3
0-2-4--7--0--3---7-0--4--7--0
  [-------]
    [-----------------]
       [---------]
             [--------------]
                   [-----]
When sorted in increasing order of the lower bound, we could get something that looks like the above for the interval list ([2,10], [4,24], [7,17], [13,30], [20,27]). Each lower bound marks the start of a new interval, and also the beginning of one more "level" of duplication of the numbers. Conversely, each upper bound marks the end of that level and decreases the duplication level by one.
We could therefore convert the above into the following list:
[2,+], [4,+], [7,+], [10,-], [13,+], [17,-], [20,+], [24,-], [27,-], [30,-]
where the first value indicates the rank of the bound, and the second value whether the bound is a lower (+) or an upper (-) one. The computation of the nth element is done by following the list, raising or lowering the duplication level when encountering a lower or upper bound, and using the duplication level as a counting factor.
Let's consider the list again graphically, but as a histogram:
       3333  44444 5555
    2222222333333344444555
  111111111222222222222444444
          1  1   1 2  2  2  3
0-2-4--7--0--3---7-0--4--7--0
The view above is the same as the first one, with all the intervals packed vertically: 1 marks the elements of the 1st interval, 2 those of the 2nd, and so on. In fact, what matters here is the height at each index, corresponding to the number of times each index is duplicated in the union of all intervals.
Relabelling each cell with the number of the histogram block it belongs to:
       3333  55555 7777
    2223333445555567777888
  112223333445555567777888999
          1  1   1 2  2  2  3
0-2-4--7--0--3---7-0--4--7--0
  | |  |   | |    ||   |  |
We can see that histogram blocks start at lower bounds of intervals, and end either on upper bounds or one unit before lower bounds (the tick marks above), so the new notation must be modified accordingly.
With a list containing n intervals, the first step converts the list into the notation above (O(n)) and sorts it in increasing bound order (O(n log(n))). The second step, computing the number, is then O(n), for a total average time of O(n log(n)).
Here's a simple implementation in OCaml, using 1 and -1 instead of '+' and '-'.
(* transform the list into the notation above *)
let rec convert = function
    [] -> []
  | (l,u)::xs -> (l,1)::(u+1,-1)::convert xs;;

(* the counting function *)
let rec count r f = function
    [] -> raise Not_found
  | [a,x] -> (match f + x with
        0 -> if r = 0 then a else raise Not_found
      | _ -> a + (r / f))
  | (a,x)::(b,y)::l ->
      if a = b
      then count r f ((b,x+y)::l)
      else
        let f = f + x in
        if f > 0 then
          let range = (b - a) * f in
          if range > r
          then a + (r / f)
          else count (r - range) f ((b,y)::l)
        else count r f ((b,y)::l);;

(* the compute function *)
let compute l =
  let compare (x,_) (y,_) = compare x y in
  let l = List.sort compare (convert l) in
  fun m -> count m 0 l;;
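For the question's example, compute [(1,5); (2,4); (7,9)] 3 evaluates to 3, the 3rd smallest (0-indexed) element of the union with repetition.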
Notes:
- the function above will raise an exception if the sought number is beyond the intervals. This corner case isn't taken into account by the other methods below.
- the list sorting function used in OCaml is merge sort, which effectively performs in O(n log(n)).
Edit:
Seeing that you might have very large intervals, the solution I gave initially (see down below) is far from optimal.
Instead, we could make things much faster by transforming the list: we compress the interval list by searching for overlapping intervals and replacing them with prefixing intervals, several repetitions of the overlapping part, and suffixing intervals. We can then directly compute the number of entries covered by each element of the list.
Looking at the splitting above (prefix, infix, suffix), we see that the optimal structure to do the processing is a binary tree. A node of that tree may optionally have a prefix and a suffix. So a node must contain:
an interval i,
an integer giving the number of repetitions of i in the list,
a left subtree of all the intervals below i,
a right subtree of all the intervals above i.
With this structure in place, the tree is automatically sorted.
Here's an example of an OCaml type embodying that tree:
type tree = Empty | Node of int * interval * tree * tree
Now the transformation algorithm boils down to building the tree.
This function creates a tree out of its components:
let cons k r lt rt =
    the tree made of count k, interval r, left tree lt and right tree rt
This function recursively inserts an interval into a tree:
let rec insert i it =
    let r = root of it
    let lt = the left subtree of it
    let rt = the right subtree of it
    let k = the count of r
    let prf, inf, suf = the prefix, infix and suffix of i according to r
    return cons (k+1) inf (insert prf lt) (insert suf rt)
Once the tree is built, we do a pre-order traversal of the tree, using the count of the node to accelerate the computation of the nth element.
Below is my previous answer.
Here are the steps of my solution:
you need to sort the interval list in increasing order on the lower bound of each interval
you need a deque dq (or a list which will be reversed at some point) to store the intervals
here's the code:
let lower i = lower bound of interval i
let upper i = upper bound of i
let il = sorted interval list
i <- 0
j <- lower (head of il)
loop on il:
    i <- i + 1
    let h = the head of il
    let il = the tail of il
    if upper h > j then push h to dq
    if lower h > j then
        il <- concat dq and il
        j <- j + 1
        dq <- empty
        loop
    if i = k then return j
    loop
This algorithm works by simply iterating through the intervals, taking into account only the relevant ones, and counting both the rank i of the element in the union and the value j of that element. When the targeted rank k has been reached, the value is returned.
The complexity is roughly O(k) + O(sort(l)).
If I have understood your question correctly, you want to find the kth smallest element in the union of a list of intervals.
If we assume that the number of lists = 2, the question becomes:
find the kth smallest element in the union of two sorted arrays (where an interval [2,5] is nothing but the elements from 2 to 5, i.e. {2,3,4,5}). This can be solved in (n+m)log(n+m) time, where n and m are the sizes of the lists, and i and j are list iterators.
Maintaining the invariant
i + j = k - 1:
if B[j-1] < A[i] < B[j], then A[i] must be the k-th smallest;
or else, if A[i-1] < B[j] < A[i], then B[j] must be the k-th smallest.
For details click here
Now, if the number of lists is 3, maintain the invariant
i + j + x = k - 1, i.e.
i + j = k - x - 1.
The value k - x - 1 can take y values (y being the size of the third list, because x iterates from the start of that list to its end).
So a problem on 3 lists reduces to y instances of the problem on 2 lists, and the complexity is y*((n+m)log(n+m)).
So for a problem of n lists, the cost of this approach grows exponentially.
But yes, we can make a minor improvement if we know that k < size of some of the lists: we can chop off the elements from the (k+1)th to the end (from our search space) in those lists whose size is bigger than k (I think it doesn't help for large k). If there is any mistake please let me know.
Let me explain with an example:
Assume we are given the intervals [5,12], [3,9], [8,13].
The union of these intervals is:
number : 3 4 5 5 6 6 7 7 8  8  8  9  9  9 10 10 11 11 12 12 13
indices: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
lowest will return 11 when 9 is passed as input.
highest will return 14 when 9 is passed as input.
The lowest and highest functions just check whether x is present in an interval; if it is, they add x - a (where a is the lower bound of the interval) to the return value for that particular interval. If an interval is completely smaller than x, they add the total number of elements in that interval to the return value.
The find function will return 9 when 13 is passed.
The find function uses binary search to find the kth smallest element. In the given range [0, N] (if the range is not given, we can find the high end in O(n)), find the mid and calculate lowest and highest for mid. If the given k falls between lowest and highest, return mid; else if k is less than or equal to lowest, search the lower half (low, mid-1); else search the upper half (mid+1, high).
If the number of intervals is n and the range is N, then the running time of this algorithm is O(n log(N)): we compute lowest and highest (each O(n)) log(N) times.
//Function call will be `find(0,N,k,in)`
//Retrieves the no. of elements smaller than x (excluding x) in the union
public static int lowest(List<List<Integer>> in, int x) {
    int sum = 0;
    for (List<Integer> lst : in) {
        if (x > lst.get(1))
            sum += lst.get(1) - lst.get(0) + 1;
        else if (x >= lst.get(0) && x <= lst.get(1)) {
            sum += x - lst.get(0);
        }
    }
    return sum;
}

//Retrieves the no. of elements smaller than or equal to x (including x) in the union.
public static int highest(List<List<Integer>> in, int x) {
    int sum = 0;
    for (List<Integer> lst : in) {
        if (x > lst.get(1))
            sum += lst.get(1) - lst.get(0) + 1;
        else if (x >= lst.get(0) && x <= lst.get(1)) {
            sum += x - lst.get(0) + 1;
        }
    }
    return sum;
}

//Do binary search on the range.
public static int find(int low, int high, int k, List<List<Integer>> in) {
    if (low > high)
        return -1;
    int mid = low + (high - low) / 2;
    int lowIdx = lowest(in, mid);
    int highIdx = highest(in, mid);
    // k lies between the current number's low and high indices
    if (k > lowIdx && k <= highIdx) return mid;
    // k less than or equal to the lower index: go to the left side
    if (k <= lowIdx) return find(low, mid - 1, k, in);
    // k greater than the higher index: go to the right
    if (k > highIdx) return find(mid + 1, high, k, in);
    else
        return -1; // catch statement
}
It's possible to count how many numbers in the list are less than some chosen number X (by iterating through all of the intervals). Now, if this number is greater than n, the solution is certainly smaller than X. Similarly, if this number is less than or equal to n, the solution is greater than or equal to X. Based on these observation we can use binary search.
Below is a Java implementation:
public int nthElement(int[] lowerBound, int[] upperBound, int n)
{
    int lo = Integer.MIN_VALUE, hi = Integer.MAX_VALUE;
    while (lo < hi) {
        int X = (int)(((long)lo + hi + 1) / 2);
        long count = 0;
        for (int i = 0; i < lowerBound.length; ++i) {
            if (X >= lowerBound[i] && X <= upperBound[i]) {
                // part of interval i is less than X
                count += (long)X - lowerBound[i];
            }
            if (X >= lowerBound[i] && X > upperBound[i]) {
                // all numbers in interval i are less than X
                count += (long)upperBound[i] - lowerBound[i] + 1;
            }
        }
        if (count <= n) lo = X;
        else hi = X - 1;
    }
    return lo;
}
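A quick check against the question's earlier example (my own, not from the answer):

// intervals [1,5], [2,4], [7,9]; the 3rd smallest (0-indexed) is 3
int v = nthElement(new int[]{1, 2, 7}, new int[]{5, 4, 9}, 3);  // v == 3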
