I am reading "Insertion Sort is O(n log n)" by Michael A. Bender, Martín Farach-Colton, and Miguel Mosteiro, and I don't quite understand how the algorithm works or how to implement it, even with the help of Wikipedia. The following is the description of the algorithm, extracted from the original article.
1) Let A be an n-element array to be sorted. These elements are inserted one at
a time in random order into a sorting array S of size (1 + ε)n.
So the first step is creating an array of size (1 + ε)n. Let ε = 1; then I need to create an array twice the size of the original array.
2) The insertions proceed in log(n) rounds as follows.
Each round doubles the number of elements inserted into S and doubles the prefix of S where elements reside.
I understand that there will be an outer loop that runs log(n) times. In each round, I need to double the number of elements copied from A (the original array) into the S array. What I don't really understand is "double the prefix of S".
3) Specifically, round i ends when element 2^i is inserted and the elements are rebalanced. Before the rebalance, the 2^i elements are in the first (1 + ε)2^i positions.
A rebalance moves them into the first (2 + 2ε)2^i positions, spreading
the elements as evenly as possible. We call 2 + 2ε the spreading factor.
From what I understand, we do a "rebalance" in every round. A rebalance spreads the existing elements uniformly across the S array so that gaps are left between them. The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.
4) The insertion of 2^(i−1) intercalated elements within round i is performed the
brute force way: search for the target position of the element to be inserted by
binary search (amongst the 2^(i−1) support positions in S), and move elements
of higher rank to make room for the new element. Not all elements of higher
rank need to be moved, only those in adjacent array positions until the nearest
gap is found.
This part shows how to insert each element into the S array. First, use binary search to find where the element belongs. Then, shift the higher-ranked elements to the right until the nearest gap is reached.
This is my translation of the algorithm as I understand it (where A is the array to sort and arrays are indexed from 1):
def LibrarySort(A)
    n ← length(A)
    S ← array of size (1 + ε) * n
    for i ← 1 to length(S)
        S[i] = null
    for i ← 1 to floor(log(n) + 1)
        for j ← 2^(i-1) to 2^i
            index = binarysearch(S, A[j])
            insert(S, A[j], index)
        rebalance()
The insert() function takes 3 parameters: the array, the item to insert, and the location.
def insert(S, item, index)
    if S[index] != null
        tmp ← S[index]
        i ← index + 1
        while i <= length(S) and S[i] != null
            swap(tmp, S[i])
            i++
        S[i] ← tmp
    S[index] ← item
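For comparison, here is a minimal Python sketch of the gap-shifting insertion described in step 4. It assumes empty slots are represented by None, uses 0-based indexes, and expects `index` to come from a separate search; the name `insert_with_gaps` is illustrative and does not come from the paper.

```python
def insert_with_gaps(S, item, index):
    """Place item at S[index], shifting occupied neighbours right to the nearest gap."""
    gap = index
    # Find the nearest empty slot at or to the right of index.
    while gap < len(S) and S[gap] is not None:
        gap += 1
    if gap == len(S):
        raise IndexError("no gap to the right of the insertion point")
    # Shift the occupied run S[index:gap] one slot to the right, filling the gap.
    S[index + 1:gap + 1] = S[index:gap]
    S[index] = item


S = [1, 3, 5, None, 9, None]
insert_with_gaps(S, 2, 1)
print(S)
```

Only the run of occupied slots between the insertion point and the nearest gap is moved, which is what keeps each insertion cheap when gaps are evenly spread.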
Questions
Is what I understand correct?
What is "double the prefix of S"?
Ad "double the prefix of S": The array (memory) is allocated once at the beginning with size (1 + ε) n, where n is the total number of elements to be sorted. But the elements are added gradually, and as they are added they are not spread across the whole array, only across some prefix of it. When m elements are rebalanced, they are spread across the first (1 + ε) m elements of the array. This is the prefix. When m doubles, so does (1 + ε) m.
Ad correctness: I see one slight mistake:
The formula to spread the element is: k = i * (1 + ε) where i is old index, and k is a new index.
The quoted description does not say what the formula is, but it can't be this one, because this would map an array of length m to length (1 + ε) m, whereas the description says you are mapping an array of length (1 + ε) m to an array of length 2 (1 + ε) m.
A simple expression would be k = 2 i where i is the old index, but that would not spread the elements evenly. To spread the elements evenly, the formula is k = (2 + 2 ε) i, where i is the element's index not counting gaps (that is, its rank among the elements inserted so far).
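The rebalance step can be sketched in Python as follows. This is only an illustration under stated assumptions: empty slots are None, the function returns a new list instead of moving elements in place, and the target prefix is capped at the length of S so the last round cannot overflow. The name `rebalance` and its signature are hypothetical.

```python
def rebalance(S, eps):
    """Spread the non-empty entries of S evenly over a prefix of (2 + 2*eps) * m slots."""
    items = [x for x in S if x is not None]   # elements in rank order, gaps removed
    if not items:
        return list(S)
    # Target prefix length, capped so it never exceeds the allocated array.
    prefix = min(len(S), int((2 + 2 * eps) * len(items)))
    step = prefix / len(items)                # spacing between consecutive ranks (>= 1)
    out = [None] * len(S)
    for rank, item in enumerate(items):
        out[int(rank * step)] = item          # k = (2 + 2*eps) * rank when not capped
    return out


S = [3, None, 5, 8] + [None] * 12
print(rebalance(S, 1))
```

With ε = 1 and three elements, the elements land at indexes 0, 4 and 8, matching k = (2 + 2ε) i with i taken as the rank.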
I want to maximize the length of a zigzag sequence in an array (without reordering).
I have a main array containing a random sequence of integers. I want a sub-array of indexes of the main array that has a zigzag pattern.
A sequence of integers is called a zigzag sequence if each of its elements is either strictly less than or strictly greater than its neighbors (and two adjacent of neighbors).
Example : The sequence 4 2 3 1 5 2 forms a zigzag, but 7 3 5 5 2 and 3 8 6 4 5
and 4 2 3 1 5 3 don't.
For a given array of integers we need to find (contiguous) sub-array of indexes that forms a zigzag sequence.
Can this be done in O(N) ?
Yes, this would seem to be solvable in O(n) time. I'll describe the algorithm as a dynamic program.
Setup
Let the array containing potential zig-zags be called Z.
Let U be an array such that len(U) == len(Z), and U[i] is an integer representing the length of the longest contiguous left-to-right subsequence starting at i that is a zig-zag such that Z[i] < Z[i+1] (it zigs up).
Let D be similar to U, except that D[i] is an integer representing the length of the longest contiguous left-to-right subsequence starting at i that is a zig-zag such that Z[i] > Z[i+1] (it zags down).
Subproblem
The subproblem is to find both U[i] and D[i] at each i. This can be done as follows:
U[i] = {
    1 + D[i+1]   if Z[i] < Z[i+1]
    0            otherwise
}
D[i] = {
    1 + U[i+1]   if Z[i] > Z[i+1]
    0            otherwise
}
The top version says that if we're looking for the largest sequence beginning with an up-zig, we see if the next element is larger (goes up), and then add a single zig to the size of the next down-zag sequence. The next one is the reverse.
Base Cases
If i is the index of the last element of Z, U[i] = D[i] = 0. The last element cannot have a left-to-right sequence after it because there is nothing after it.
Solution
To get the solution, first we find max(U[i]) and max(D[i]) over all i. Then get the maximum of those two values, store i, and store the length of this largest zig-zag (in a variable called length). The sequence begins at index i and ends at index i + length.
Runtime
There are n indexes, so there are 2n subproblems between U and D. Each subproblem takes O(1) time to solve, given that solutions to previously solved subproblems are memoized. Finally, iterating through U and D to get the final answer takes O(2n) time.
We thus have O(2n) + O(2n) time, or O(n).
This may be an overly complex solution, but it demonstrates that it can be done in O(n).
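As a rough illustration, the recurrence above can be written bottom-up in Python. The function name `longest_zigzag` and the inclusive `(start, end)` return value are choices made for this sketch, and it implements only the strictly alternating up/down pattern captured by U and D.

```python
def longest_zigzag(Z):
    """Return (start, end) inclusive indexes of a longest contiguous zig-zag in Z."""
    n = len(Z)
    if n == 0:
        return None
    U = [0] * n  # U[i]: steps in the longest zig-zag starting at i that goes up first
    D = [0] * n  # D[i]: steps in the longest zig-zag starting at i that goes down first
    # Fill right to left so U[i+1] and D[i+1] are known when i is processed.
    for i in range(n - 2, -1, -1):
        if Z[i] < Z[i + 1]:
            U[i] = 1 + D[i + 1]
        if Z[i] > Z[i + 1]:
            D[i] = 1 + U[i + 1]
    start = max(range(n), key=lambda i: max(U[i], D[i]))
    length = max(U[start], D[start])
    return start, start + length


print(longest_zigzag([4, 2, 3, 1, 5, 2]))
print(longest_zigzag([7, 3, 5, 5, 2]))
```

Each index is visited a constant number of times, so the sketch runs in O(n) time and O(n) extra space.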
Recently I got stuck on a problem. Part of the algorithm requires computing the sum of the maximum elements of all sliding windows of length K, for every K in the range 1<=K<=N (where N is the length of the array).
For example, suppose I have an array A = 5,3,12,4
Sliding window of length 1: 5 + 3 + 12 + 4 = 24
Sliding window of length 2: 5 + 12 + 12 = 29
Sliding window of length 3: 12 + 12 = 24
Sliding window of length 4: 12
Final answer is 24,29,24,12.
I have tried an O(N^2) approach. For each window length K, I can calculate the sum of the maximums in O(N). Since K goes up to N, the overall complexity turns out to be O(N^2).
I am looking for an O(N) or O(N log N) algorithm, or something similar, as N may be up to 10^5.
Note: Elements in array can be as large as 10^9 so output the final answer as modulo 10^9+7
EDIT: What I actually want is to find the answer for each and every value of K (i.e. from 1 to N) in overall linear time or in O(N log N), not in O(KN) or O(KN log N), where K = {1, 2, 3, ..., N}.
Here's an abbreviated sketch of O(n).
For each element, determine how many contiguous elements to the left are no greater (call this a), and how many contiguous elements to the right are lesser (call this b). This can be done for all elements in time O(n) -- see MBo's answer.
A particular element is the maximum in its window if the window contains the element and otherwise only elements among the a to its left and the b to its right. Usefully, the number of such windows of length k (and hence the total contribution of these windows) is piecewise linear in k, with at most five pieces. For example, if a = 5 and b = 3, there are
1 window of size 1
2 windows of size 2
3 windows of size 3
4 windows of size 4
4 windows of size 5
4 windows of size 6
3 windows of size 7
2 windows of size 8
1 window of size 9.
The data structure that we need to encode this contribution efficiently is a Fenwick tree whose values are not numbers but linear functions of k. For each linear piece of the piecewise linear contribution function, we add it to the cell at beginning of its interval and subtract it from the cell at the end (closed beginning, open end). At the end, we retrieve all of the prefix sums and evaluate them at their index k to get the final array.
(OK, have to run for now, but we don't actually need a Fenwick tree for step two, which drops the complexity to O(n) for that, and there may be a way to do step one in linear time as well.)
Python 3, lightly tested:
def left_extents(lst):
    result = []
    stack = [-1]
    for i in range(len(lst)):
        while stack[-1] >= 0 and lst[i] >= lst[stack[-1]]:
            del stack[-1]
        result.append(stack[-1] + 1)
        stack.append(i)
    return result


def right_extents(lst):
    result = []
    stack = [len(lst)]
    for i in range(len(lst) - 1, -1, -1):
        while stack[-1] < len(lst) and lst[i] > lst[stack[-1]]:
            del stack[-1]
        result.append(stack[-1])
        stack.append(i)
    result.reverse()
    return result


def sliding_window_totals(lst):
    delta_constant = [0] * (len(lst) + 2)
    delta_linear = [0] * (len(lst) + 2)
    for l, i, r in zip(left_extents(lst), range(len(lst)), right_extents(lst)):
        a = i - l
        b = r - (i + 1)
        if a > b:
            a, b = b, a
        delta_linear[1] += lst[i]
        delta_linear[a + 1] -= lst[i]
        delta_constant[a + 1] += lst[i] * (a + 1)
        delta_constant[b + 2] += lst[i] * (b + 1)
        delta_linear[b + 2] -= lst[i]
        delta_linear[a + b + 2] += lst[i]
        delta_constant[a + b + 2] -= lst[i] * (a + 1)
        delta_constant[a + b + 2] -= lst[i] * (b + 1)
    result = []
    constant = 0
    linear = 0
    for j in range(1, len(lst) + 1):
        constant += delta_constant[j]
        linear += delta_linear[j]
        result.append(constant + linear * j)
    return result


print(sliding_window_totals([5, 3, 12, 4]))
Let's determine for every element an interval where this element is dominating (maximum). We can do this in linear time with forward and backward runs using a stack. Arrays L and R will contain the indexes just outside the domination interval.
To get right and left indexes:
Stack.Push(0)   //(1st element index)
for i = 1 to Len - 1 do
    while not Stack.Empty and X[Stack.Peek] < X[i] do
        j = Stack.Pop
        R[j] = i   //j-th position is dominated by i-th one from the right
    Stack.Push(i)
while not Stack.Empty
    R[Stack.Pop] = Len   //the rest of elements are not dominated from the right
//now right to left
Stack.Push(Len - 1)   //(last element index)
for i = Len - 2 to 0 do
    while not Stack.Empty and X[Stack.Peek] < X[i] do
        j = Stack.Pop
        L[j] = i   //j-th position is dominated by i-th one from the left
    Stack.Push(i)
while not Stack.Empty
    L[Stack.Pop] = -1   //the rest of elements are not dominated from the left
Result for (5,7,3,9,4) array.
For example, 7 dominates at 0..2 interval, 9 at 0..4
i    0   1   2   3   4
X    5   7   3   9   4
R    1   3   3   5   5
L   -1  -1   1  -1   3
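The two stack passes can be sketched in Python like this. The function name `domination_bounds` is made up for the example; the comparisons are strict on both sides, exactly as in the pseudocode, so arrays with equal elements would need an extra tie-breaking rule that is not shown here.

```python
def domination_bounds(X):
    """Return (L, R): for each index, the nearest strictly greater element's index on each side."""
    n = len(X)
    R = [n] * n    # R[j] == n means nothing to the right dominates X[j]
    L = [-1] * n   # L[j] == -1 means nothing to the left dominates X[j]
    stack = []
    for i in range(n):                      # left-to-right pass fills R
        while stack and X[stack[-1]] < X[i]:
            R[stack.pop()] = i
        stack.append(i)
    stack = []
    for i in range(n - 1, -1, -1):          # right-to-left pass fills L
        while stack and X[stack[-1]] < X[i]:
            L[stack.pop()] = i
        stack.append(i)
    return L, R


L, R = domination_bounds([5, 7, 3, 9, 4])
print("R", R)
print("L", L)
```

Every index is pushed and popped at most once per pass, which is why both passes together take linear time.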
Now for every element we can count its impact in every possible sum.
Element 5 dominates at (0,0) interval, it is summed only in k=1 sum entry
Element 7 dominates at (0,2) interval, it is summed once in k=1 sum entry, twice in k=2 entry, once in k=3 entry.
Element 3 dominates at (2,2) interval, it is summed only in k=1 sum entry
Element 9 dominates at (0,4) interval, it is summed once in k=1 sum entry, twice in k=2, twice in k=3, twice in k=4, once in k=5.
Element 4 dominates at (4,4) interval, it is summed only in k=1 sum entry.
In general, an element with a long domination interval in the center of a long array may contribute up to k*Value to the k-length sum (it depends on its position relative to the array ends and to other dominating elements).
k        1      2      3      4      5
--------------------------------------
         5
         7    2*7      7
         3
         9    2*9    2*9    2*9      9
         4
--------------------------------------
S(k)    28     32     25     18      9
Note that the sum of coefficients is N*(N+1)/2 (equal to the number of possible windows), and most of the table entries are empty, so the complexity seems better than O(N^2)
(I still doubt about exact complexity)
The sum of maximum in sliding windows for a given window size can be computed in linear time using a double ended queue that keeps elements from the current window. We maintain the deque such that the first (index 0, left most) element in the queue is always the maximum of the current window.
This is done by iterating over the array and in each iteration, first we remove the first element in the deque if it is no longer in the current window (we do that by checking its original position, which is also saved in the deque together with its value). Then, we remove any elements from the end of the deque that are smaller than the current element, and finally we add the current element to the end of the deque.
The complexity is O(N) for computing the maximum for all sliding windows of size K. If you want to do that for all values of K from 1..N, then the time complexity will be O(N^2). O(N) is the best possible time to compute the sum of maximum values of all windows of size K (that is easy to see). To compute the sum for other values of K, the simple approach is to repeat the computation for each different value of K, which would lead to an overall time of O(N^2). Is there a better way? No, because even if we save the result from a computation for one value of K, we would not be able to use it to compute the result for a different value of K in less than O(N) time. So best time is O(N^2).
The following is an implementation in python:
from collections import deque


def slide_win(l, k):
    dq = deque()
    for i in range(len(l)):
        # drop the front element once it has left the current window
        if len(dq) > 0 and dq[0][1] <= i - k:
            dq.popleft()
        # drop elements from the back that cannot be a maximum any more
        while len(dq) > 0 and l[i] >= dq[-1][0]:
            dq.pop()
        dq.append((l[i], i))
        if i >= k - 1:
            yield dq[0][0]


def main():
    l = [5, 3, 12, 4]
    print("l=" + str(l))
    for k in range(1, len(l) + 1):
        s = 0
        for x in slide_win(l, k):
            s += x
        print("k=" + str(k) + " Sum=" + str(s))


main()
We are given N elements in the form of an array A. We have to choose K of the N given indexes such that the minimum value of |A[i]-A[j]| over any 2 chosen indexes i and j is as large as possible. We need to report this maximum value.
Let's take an example: let N=5, K=2 and the array be [1,5,3,7,11]. Here the answer is 10, as we can simply choose the first and last positions, and the difference is 11-1=10.
Example 2: let N=7, K=3 and the array A be [3 9 6 11 15 20 23]. Here the answer is 8, as we can select [3,11,23] or [3,15,23].
Now given N , K and Array A we need to find this maximum difference.
We are given that 1 ≤ N ≤ 10^5 and 1 ≤ S ≤ 10^7
Let's sort the array.
Now we can do a binary search over the answer.
For a fixed candidate x, we can just pick the elements greedily (iterating over the sorted array and taking each element if we can). If the number of elements we have picked is not less than K, x is feasible. Otherwise, it is not.
The time complexity is O(N * log N + N * log (MAX_ELEMENT - MIN_ELEMENT))
A pseudo code:
bool isFeasible(int x):
    cnt = 1
    last = a[0]
    for i <- 1 ... n - 1:
        if a[i] - last >= x:
            last = a[i]
            cnt++
    return cnt >= k

sort(a)
low = 0
high = a[n - 1] - a[0] + 1
while high - low > 1:
    mid = low + (high - low) / 2
    if isFeasible(mid):
        low = mid
    else
        high = mid
print(low)
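A runnable Python version of the pseudocode above might look like this. It assumes integer inputs and 1 <= k <= len(values); the name `max_min_distance` is chosen for the sketch only.

```python
def max_min_distance(values, k):
    """Largest x such that k elements can be chosen with pairwise differences >= x."""
    a = sorted(values)

    def is_feasible(x):
        # Greedy: always keep the smallest element, then take each element
        # that is at least x away from the last one taken.
        count, last = 1, a[0]
        for v in a[1:]:
            if v - last >= x:
                last = v
                count += 1
        return count >= k

    low, high = 0, a[-1] - a[0] + 1   # invariant: low is feasible, high is not
    while high - low > 1:
        mid = (low + high) // 2
        if is_feasible(mid):
            low = mid
        else:
            high = mid
    return low


print(max_min_distance([1, 5, 3, 7, 11], 2))
print(max_min_distance([3, 9, 6, 11, 15, 20, 23], 3))
```

The binary search works because feasibility is monotone: if a gap of x can be achieved, so can any smaller gap.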
I think this can be dealt with as a dynamic programming problem. Start off by sorting A, and then the problem is to mark K elements in A such that the minimum difference between adjacent marked items is as large as possible. As a starter, you can always mark the first and last elements.
Moving from left to right, at each position and for each k=1..K, work out the largest minimum difference you can get by marking k elements in the sub-array terminating at this position. You can work out the largest minimum difference for k items terminating at this position by considering the largest minimum difference for k-1 items terminating at each position to the left of the position you are working on. The obvious thing to do is to consider each possible position up to the position you are currently working on as ending a stretch of k-1 items with minimum difference, but you may be able to do a binary search here to speed things up.
Once you have worked all the way to the right hand end you know the maximum possible value for the original problem. If you need to know where to put the K elements, you can take notes as you go along so that you can backtrack to find out the elements chosen that lead to this solution, working from right to left.
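A straightforward, unoptimized Python sketch of this dynamic program is shown below. It uses the "obvious" inner loop over all earlier positions, so it runs in O(K * N^2) time and is meant only to illustrate the recurrence; the name `max_min_gap_dp` and the requirement 2 <= k <= len(values) are assumptions of the sketch, and backtracking to recover the chosen elements is omitted.

```python
def max_min_gap_dp(values, k):
    """Largest achievable minimum gap when marking k elements (2 <= k <= len(values))."""
    if k < 2 or k > len(values):
        raise ValueError("this sketch needs 2 <= k <= len(values)")
    a = sorted(values)
    n = len(a)
    inf = float("inf")
    # best[p]: largest minimum gap with the current number of marks, last mark at p.
    best = [inf] * n                      # one mark: no gap constraint yet
    for _ in range(k - 1):                # add one more mark per iteration
        nxt = [-inf] * n                  # -inf marks an impossible state
        for p in range(n):
            for q in range(p):            # q is the previous marked position
                nxt[p] = max(nxt[p], min(best[q], a[p] - a[q]))
        best = nxt
    return max(best)


print(max_min_gap_dp([1, 5, 3, 7, 11], 2))
print(max_min_gap_dp([3, 9, 6, 11, 15, 20, 23], 3))
```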
Given an array A with N elements, I need to find a pair (i,j), with i not equal to j, such that if we write out the sums A[i]+A[j] for all pairs (i,j) in sorted order, this pair's sum comes at the Kth position.
Example: let N=4 and the array A=[1 2 3 4]. If K=3 then the answer is 5, as we can see clearly that the sum array becomes: [3,4,5,5,6,7]
I can't go through all pairs of i and j, as N can go up to 100000. Please help with how to solve this problem.
I mean something like this :
int len=N*(N+1)/2;
int sum[len];
int count=0;
for(int i=0;i<N;i++){
    for(int j=i+1;j<N;j++){
        sum[count]=A[i]+A[j];
        count++;
    }
}
//Then just find kth element.
We can't go with this approach
A solution that is based on a fact that K <= 50: Let's take the first K + 1 elements of the array in a sorted order. Now we can just try all their combinations. Proof of correctness: let's assume that a pair (i, j) is the answer, where j > K + 1. But there are K pairs with the same or smaller sum: (1, 2), (1, 3), ..., (1, K + 1). Thus, it cannot be the K-th pair.
It is possible to achieve an O(N + K ^ 2) time complexity by choosing the K + 1 smallest numbers using a quickselect algorithm (it is possible to do even better, but it is not required). You can also just sort the array and get an O(N * log N + K ^ 2 * log K) complexity.
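A short Python sketch of this idea follows. Note one substitution: it uses `heapq.nsmallest` to pick the K + 1 smallest numbers, which costs O(N log K) and is not the quickselect mentioned above. The name `kth_pair_sum` and the 1-based k are choices made for the example.

```python
import heapq
from itertools import combinations


def kth_pair_sum(a, k):
    """Return the k-th smallest (1-based) sum a[i] + a[j] over all pairs i < j."""
    # Only the k + 1 smallest values can take part in the k smallest sums.
    smallest = heapq.nsmallest(k + 1, a)
    # Brute force over the O(k^2) pairs among those values.
    sums = sorted(x + y for x, y in combinations(smallest, 2))
    return sums[k - 1]


print(kth_pair_sum([1, 2, 3, 4], 3))
```

If the array has fewer than K + 1 elements, `heapq.nsmallest` simply returns all of them in sorted order, so the sketch degrades to the plain brute force.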
I assume that you got this question from http://www.careercup.com/question?id=7457663.
If k is close to 0 then the accepted answer to How to find kth largest number in pairwise sums like setA + setB? can be adapted quite easily to this problem and be quite efficient. You need O(n log(n)) to sort the array, O(n) to set up a priority queue, and then O(k log(k)) to iterate through the elements. The reversed solution is also efficient if k is near n*n - n.
If k is close to n*n/2 then that won't be very good. But you can adapt the pivot approach of http://en.wikipedia.org/wiki/Quickselect to this problem. First, in time O(n log(n)), you can sort the array. In time O(n) you can set up a data structure representing the various contiguous ranges of columns. Then you'll need to select pivots O(log(n)) times. (Remember, log(n*n) = O(log(n)).) For each pivot, you can do a binary search of each column to figure out where the pivot splits it, in time O(log(n)) per column, for a total cost of O(n log(n)) over all columns.
The resulting algorithm will be O(n log(n) log(n)).
Update: I do not have time to do the finger exercise of supplying code. But I can outline some of the classes you might have in an implementation.
The implementation will be a bit verbose, but that is sometimes the cost of a good general-purpose algorithm.
ArrayRangeWithAddend. This represents a range of an array, summed with one value. It has an array (reference or pointer so the underlying data can be shared between objects), a start and an end to the range, and a shiftValue for the value to add to every element in the range.
It should have a constructor. A method to give the size. A method to partition(n) it into a range less than n, the count equal to n, and a range greater than n. And value(i) to give the i'th value.
ArrayRangeCollection. This is a collection of ArrayRangeWithAddend objects. It should have methods to give its size, pick a random element, and a method to partition(n) it into an ArrayRangeCollection that is below n, count of those equal to n, and an ArrayRangeCollection that is larger than n. In the partition method it will be good to not include ArrayRangeWithAddend objects that have size 0.
Now your main program can sort the array, and create an ArrayRangeCollection covering all pairs of sums that you are interested in. Then the random and partition method can be used to implement the standard quickselect algorithm that you will find in the link I provided.
Here is how to do it (in pseudo-code). I have now confirmed that it works correctly.
//A is the original array, such as A=[1,2,3,4]
//k (an integer) is the element in the 'sum' array to find
N = A.length
//first we find i
i = -1
nl = N
k2 = k
while (k2 >= 0) {
    i++
    nl--
    k2 -= nl
}
//then we find j
j = k2 + nl + i + 1
//now compute the sum at index position k
kSum = A[i] + A[j]
EDIT:
I have now tested this works. I had to fix some parts... basically the k input argument should use 0-based indexing. (The OP seems to use 1-based indexing.)
EDIT 2:
I'll try to explain my theory then. I began with the concept that the sum array should be visualised as a 2D jagged array (diminishing in width as the height increases), with the coordinates (as mentioned in the OP) being i and j. So for an array such as [1,2,3,4,5] the sum array would be conceived as this:
3,4,5,6,
5,6,7,
7,8,
9.
The top row are all values where i would equal 0. The second row is where i equals 1. To find the value of 'j' we do the same but in the column direction.
... Sorry I cannot explain this any better!
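For reference, the index arithmetic in the pseudocode above translates directly to Python. The sketch below only maps a 0-based position k in the jagged array, read row by row, back to its coordinates (i, j); the function name `pair_at` is hypothetical.

```python
from itertools import combinations


def pair_at(n, k):
    """Map 0-based position k in the row-by-row jagged array to coordinates (i, j)."""
    i = -1
    row_len = n          # 'nl' in the pseudocode: length of the row about to be skipped
    k2 = k
    while k2 >= 0:       # walk down the rows, subtracting each row's length
        i += 1
        row_len -= 1
        k2 -= row_len
    j = k2 + row_len + i + 1
    return i, j


A = [1, 2, 3, 4]
i, j = pair_at(len(A), 2)
print(i, j, A[i] + A[j])
# The mapping visits pairs in the same order as itertools.combinations.
print([pair_at(4, k) for k in range(6)] == list(combinations(range(4), 2)))
```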
You are given N and an int K[].
The task at hand is to generate a uniformly distributed random number between 0 and N-1 which doesn't exist in K.
N is strictly an integer >= 0.
K.length is < N-1, and 0 <= K[i] <= N-1. Also assume K is sorted and each element of K is unique.
You are given a function uniformRand(int M) which generates a uniform random number in the range 0 to M-1. Assume this function's complexity is O(1).
Example:
N = 7
K = {0, 1, 5}
the function should return any random number { 2, 3, 4, 6 } with equal
probability.
I could get an O(N) solution for this: first generate a random number between 0 and N - K.length, then map the generated random number to a number not in K. The second step takes the complexity to O(N). Can it be done better, maybe in O(log N)?
You can use the fact that all the numbers in K[] are between 0 and N-1 and they are distinct.
For your example case, you generate a random number from 0 to 3. Say you get a random number r. Now you conduct binary search on the array K[].
Initialize i = K.length/2.
Find K[i] - i. This will give you the number of numbers missing from the array in the range 0 to K[i].
For example K[2] = 5. So 3 elements are missing from K[0] to K[2] (2,3,4)
Hence you can decide whether you have to conduct the remaining search in the first part of array K or the next part. This is because you know r.
This search will give you a complexity of O(log(K.length)).
EDIT: For example,
N = 7
K = {0, 1, 4} // modified the array to clarify the algorithm steps.
the function should return any random number { 2, 3, 5, 6 } with equal probability.
Random number generated between 0 and N-K.length = random{0-3}. Say we get 3. Hence we require the 4th missing number in array K.
Conduct binary search on array K[].
Initial i = K.length/2 = 1.
Now we see K[1] - 1 = 0. Hence no number is missing up to i = 1. Hence we search on the latter part of the array.
Now i = 2. K[2] - 2 = 4 - 2 = 2. Hence there are 2 missing numbers up to index i = 2. But we need the 4th missing element. So we again have to search in the latter part of the array.
Now we reach an empty array. What should we do now? If we reach an empty array between say K[j] & K[j+1] then it simply means that all elements between K[j] and K[j+1] are missing from the array K.
Hence all elements above K[2] are missing from the array, namely 5 and 6. We need the 4th element out of which we have already discarded 2 elements. Hence we will choose the second element which is 6.
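The search traced above can be sketched in Python as follows. The helper names `kth_missing` and `random_not_in` are invented for the example, r is 0-based (so r = 3 means the 4th missing number), and `random.randrange` stands in for the question's `uniformRand`.

```python
import random


def kth_missing(K, r):
    """Return the r-th (0-based) smallest non-negative integer that is not in sorted K."""
    lo, hi = 0, len(K)
    # K[i] - i counts the missing numbers below K[i]; it never decreases with i,
    # so we can binary search for the first index where it exceeds r.
    while lo < hi:
        mid = (lo + hi) // 2
        if K[mid] - mid > r:
            hi = mid
        else:
            lo = mid + 1
    # Exactly lo elements of K are smaller than the answer.
    return r + lo


def random_not_in(K, N):
    """Uniformly pick a number in 0..N-1 that is not in K."""
    return kth_missing(K, random.randrange(N - len(K)))


print([kth_missing([0, 1, 4], r) for r in range(4)])
print(random_not_in([0, 1, 4], 7))
```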
Binary search.
The basic algorithm:
(not quite the same as the other answer - the number is only generated at the end)
Start in the middle of K.
By looking at the current value and its index, we can determine the number of pickable numbers (numbers not in K) to the left.
Similarly, by including N, we can determine the number of pickable numbers to the right.
Now randomly go either left or right, weighted based on the count of pickable numbers on each side.
Repeat in the chosen subarray until the subarray is empty.
Then generate a random number in the range consisting of the numbers before and after the subarray in the array.
The running time would be O(log |K|), and, since |K| < N-1, O(log N).
The exact mathematics for number counts and weights can be derived from the example below.
Extension with K containing a bigger range:
Now let's say (for enrichment purposes) K can also contain values N or larger.
Then, instead of starting with the entire K, we start with a subarray up to position min(N, |K|), and start in the middle of that.
It's easy to see that the N-th position in K (if one exists) will be >= N, so this chosen range includes any possible number we can generate.
From here, we need to do a binary search for N (which would give us a point where all values to the left are < N, even if N could not be found) (the above algorithm doesn't deal with K containing values greater than N).
Then we just run the algorithm as above with the subarray ending at the last value < N.
The running time would be O(log N), or, more specifically, O(log min(N, |K|)).
Example:
N = 10
K = {0, 1, 4, 5, 8}
So we start in the middle - 4.
Given that we're at index 2, we know there are 2 elements to the left, and the value is 4, so there are 4 - 2 = 2 pickable values to the left.
Similarly, there are 10 - (4+1) - 2 = 3 pickable values to the right.
So now we go left with probability 2/(2+3) and right with probability 3/(2+3).
Let's say we went right, and our next middle value is 5.
We are at the first position in this subarray, and the previous value is 4, so we have 5 - (4+1) = 0 pickable values to the left.
And there are 10 - (5+1) - 1 = 3 pickable values to the right.
We can't go left (0 probability). If we go right, our next middle value would be 8.
There would be 2 pickable values to the left, and 1 to the right.
If we go left, we'd have an empty subarray.
So then we'd generate a number between 5 and 8, which would be 6 or 7 with equal probability.
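The weighted descent walked through in this example could be sketched in Python as below. This covers only the basic algorithm (all values of K below N); the name `pick_not_in` and the `uniform_rand` parameter are illustrative, with `random.randrange` standing in for the question's `uniformRand`.

```python
import random


def pick_not_in(K, N, uniform_rand=random.randrange):
    """Pick a uniform random number in 0..N-1 that is not in the sorted, distinct list K."""
    lo, hi = 0, len(K)          # current subarray is K[lo:hi]
    low_val, high_val = -1, N   # exclusive bounds on the values still reachable
    while lo < hi:
        mid = (lo + hi) // 2
        # Pickable numbers on each side of K[mid] within the current bounds.
        left = K[mid] - low_val - 1 - (mid - lo)
        right = high_val - K[mid] - 1 - (hi - mid - 1)
        # Go left with probability left / (left + right), otherwise go right.
        if uniform_rand(left + right) < left:
            hi, high_val = mid, K[mid]
        else:
            lo, low_val = mid + 1, K[mid]
    # The subarray is empty: every value strictly between the bounds is pickable.
    return low_val + 1 + uniform_rand(high_val - low_val - 1)


print(pick_not_in([0, 1, 4, 5, 8], 10))
```

Each pickable number is reached with probability equal to the product of the branch weights times the final uniform draw, which telescopes to one over the total count of pickable numbers.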
This can be solved by basically solving this:
Find the rth smallest number not in the given array, K, subject to
conditions in the question.
For that consider the implicit array D, defined by
D[i] = K[i] - i for 0 <= i < L, where L is length of K
We also set D[-1] = 0 and D[L] = N
We also define K[-1] = -1 (consistent with D[-1] = K[-1] - (-1) = 0).
Note, we don't actually need to construct D. Also note that D is sorted (and all elements non-negative), as the numbers in K[] are unique and increasing.
Now we make the following claim:
CLAIM: To find the rth smallest number not in K[], we need to find right most occurrence of r' in D (which occurs at position defined by j), where r' is the largest number in D, which is < r. Such an r' exists, because D[-1] = 0. Once we find such an r' (and j), the number we are looking for is r-r' + K[j].
Proof: Basically the definition of r' and j tells us that there are exactly r' numbers missing from 0 to K[j], and at least r numbers missing from 0 to K[j+1]. Thus all the numbers from K[j]+1 to K[j+1]-1 are missing (and these missing numbers are at least r-r' in number), and the number we seek is among them, given by K[j] + r-r'.
Algorithm:
In order to find (r',j) all we need to do is a (modified) binary search for r in D, where we keep moving to the left even if we find r in the array.
This is an O(log K) algorithm.
If you are running this many times, it probably pays to speed up your generation operation: O(log N) time just isn't acceptable.
Make an empty array G. Starting at zero, count upwards while progressing through the values of K. If a value isn't in K add it to G. If it is in K don't add it and progress your K pointer. (This relies on K being sorted.)
Now you have an array G which has only acceptable numbers.
Use your random number generator to choose a value from G.
This requires O(N) preparatory work and each generation happens in O(1) time. After N look-ups the amortized time of all operations is O(1).
A Python mock-up:
import random


class PRNG:
    def __init__(self, K, N):
        # G holds every number in 0..N-1 that is not in the sorted list K
        self.G = []
        kptr = 0
        for i in range(N):
            if kptr < len(K) and K[kptr] == i:
                kptr += 1
            else:
                self.G.append(i)

    def getRand(self):
        rn = random.randint(0, len(self.G) - 1)
        return self.G[rn]


prng = PRNG([0, 1, 5], 7)
for i in range(20):
    print(prng.getRand())