Interview Question: Remove repeating numbers at the end of an array - algorithm

I got a surprising interview question today at a big Bay Area tech company that I was absolutely stumped by despite seeming so easy. Was wondering if anyone has seen it or can offer a simpler solution as the interviewer didn't want to show me the answer. The solution can be written in any language or pseudocode.
Question:
Given a list of numbers, remove any extraneous repeating suffix sequences of numbers that appear at the end of the list until it has no repeating suffix sequences. The repeating sequence can be cut-off.
For example:
[1,2,3,4,5,6,7,5,6,7,5,6] -> [1,2,3,4,5,6,7]
explanation: [5, 6, 7] were repeating
Also consider the situation
[1,2,3,4,5,4,5,1,4,5,4,5,1,4,5,4,5,] -> [1,2,3,4,5,4,5,1] # not [1,2,3,4,5,4,5,1,4,5,4,5,1]
explanation: [4,5,4,5,1] is a repeating sequence

There are always two ways to approach this topic. Finding any solution and finding an efficient one. It is usually better to start with any and then think on how to optimize it.
Now as we can see in the second example, the problem is complicated by the fact that the repeating pattern is not known. So we could just do it for all the possible patterns at the end. Then we would need to check two things
is it actually repeating
how long is the result
Then we could just take the shortest result. Here is the Python code:
def remove_repeating_tail(a: list) -> list:
results = []
for i in range(len(a)):
tail = a[i:]
results.append(remove_repeats(a, tail))
if len(results) == 0:
return a
return sorted(results, key=len)[0]
Also we made sure we cover all the cases. Empty list, no repeating pattern. Next we need to write remove_repeats. Also we check the empty repeating pattern, so we need to be aware of that.
def remove_repeats(a: list, tail: list) -> list:
assert len(tail) <= len(a)
if len(tail) == 0:
return a
remainder = a
count = 0
while remainder[-len(tail):] == tail:
remainder = remainder[:-len(tail)]
count += 1
if count <= 1:
return a
return remainder
We remove the repeating pattern and then add it back at the end. Now it's time to test the code if it actually works, if that is possible in the interview.
remove_repeating_tail([1,2,3,4,5,6,7,5,6,7,5,6])
-> [1, 2, 3, 4, 5, 6]
remove_repeating_tail([1,2,3,4,5,4,5,1,4,5,4,5,1,4,5,4,5])
-> [1, 2, 3, 4, 5, 4, 5]
Also good to check some other cases:
remove_repeating_tail([1,2,3,4])
-> [1, 2, 3, 4]
remove_repeating_tail([])
-> []
After quite a bit of fixing we got the above, which I think is correct. In particular I missed:
first I had an infinite loop in remove_repeats for an empty tail
remove_repeats removed always the tail and sometimes everything, as I wasn't checking that there is at least one repeat. I then added the counting.
I made simple mistakes like writing results = res instead of results.append(res) leading to some Exceptions.
Then a lot of simplification. First I used some sentinel None to communicate back that it is not repeating, but we could just return the whole list. Then I checked the repeating with some if before the while loop, but realized its basically doing the same as the first iteration, so I used counting.
Similarly I don't like the if len(results) == 0: check. I would probably add a to the result in the beginning and remove the check, as now there is always a result. Then we could start the counting from 1 instead of 0. Still I kept it in.
If we want something fast, we first need to analyze the complexity.
So remove repeating tails for a list of size n and tail size k is: O(n / k). Then we call this function n times. And then we sort it. Wait why do we sort it, we could just take the minimum return min(results, key=len). That's better.
In each loop we call remove_repeats starting with k = 1 to n. So we have:
sum(k = 1 .. n) O(n / k). This is n / 1 + n / 2 + n / 3 + .. n / n. I had to look this up on Wikipedia, but these are called harmonic numbers. We can also just make our live easy and say its less than O(n^2) for now. Otherwise I found an approximation of H_n = n ln(n) + 0.5 n here. So the complexity overall is O(n log n). Not to bad I would say. Is it the optimal? Maybe. Here I would compare it to some other similar algorithms (like substring search, etc).
Before going there, at this point, I would check with the interviewer, where he would like to go next. As there are many directions.

This seems a tricky question and there may not be a simple solution. Best solution I can think of would be O(n) time and O(n) and that is if I am not missing any edge case.
Let's take as example
[1,2,3,4,5,4,5,1,4,5,4,5,1,4,5,4,5] -> [1,2,3,4,5,4,5,1]
Steps would be as follows:
Iterate over the input array from last index to first and build a dictionary (hashtable) with every number in the array being a key and value: a list of positions where the specific number is found in the array.
Occurrences dictionary will become:
{
5: [14, 11, 9, 6, 4],
4: [13, 10, 8, 5, 3],
1: [12, 7, 0],
3: [2]
2: [1]
}
Find the possible suffix lengths by calculating deltas between every position and first position for every number. This way we take into consideration the case in which a specific number repeats in the suffix or in the prefix.
We then add each distinct possible suffix length to a set.
We sort the possible suffix lengths in descending order.
We get following suffix lengths:
[12, 10, 7, 5, 2]
For every possible length l, we test if arr[n-1] == arr[n-1-l]. If l is our suffix's length, it means that the number at last position is repeated at exactly l positions before. We then check the last l elements to respect the same condition. If they do, we found the maximum suffix length. If not, the max suffix length is even smaller, so we check the next possible length.
After finding the correct suffix length, we delete the remaining numbers that repeat at positions pos-l. We then return the slice of array with suffix removed.
def removeRepeatingSuffixes(arr):
if not arr:
return []
n = len(arr)
occurrences = {}
for i in range(n - 1, -1, -1):
c = arr[i]
if c not in occurrences:
occurrences[c] = []
occurrences[c].append(i)
# treat edge case: no repeating suffix
if len(occurrences[arr[n-1]]) == 1:
return arr
# create a set of possible suffix lengths,
# based on the differences between the positions of each number.
possible_suffixes_lengths_set = set()
for c, olist in occurrences.items():
if len(olist) >= 2:
for i in range(len(olist)-1):
delta = olist[i] - olist[len(olist)-1]
possible_suffixes_lengths_set.add(delta)
suff_lengths = sorted(possible_suffixes_lengths_set, reverse=True)
for l in suff_lengths:
if arr[n - 1] == arr[n - 1 - l]:
# possible suffix length, check if last l characters repeat
ok_length = True
for j in range(n-2, n-1-l, -1):
if arr[j] != arr[j-l]:
ok_length = False
break
if ok_length:
last_i = n-1-l
while last_i > 0 and arr[last_i] == arr[last_i - l]:
last_i -= 1
# return non-repeating slice, from 0 to last_i
return arr[0:last_i + 1]

quick way to remove repeating or dedupe is change to a type set() instead of a list

Related

Daily Coding Problem 316 : Coin Change Problem - determination of denomination?

I'm going through the Daily Coding Problems and am currently stuck in one of the problems. It goes by:
You are given an array of length N, where each element i represents
the number of ways we can produce i units of change. For example, [1,
0, 1, 1, 2] would indicate that there is only one way to make 0, 2, or
3 units, and two ways of making 4 units.
Given such an array, determine the denominations that must be in use.
In the case above, for example, there must be coins with values 2, 3,
and 4.
I'm unable to figure out how to determine the denomination from the total number of ways array. Can you work it out?
Somebody already worked out this problem here, but it's devoid of any explanation.
From what I could gather is that he collects all the elements whose value(number of ways == 1) and appends it to his answer, but I think it doesn't consider the fact that the same number can be formed from a combination of lower denominations for which still the number of ways would come out to be 1 irrespective of the denomination's presence.
For example, in the case of arr = [1, 1, a, b, c, 1]. We know that denomination 1 exists since arr[1] = 1. Now we can also see that arr[5] = 1, this should not necessarily mean that denomination 5 is available since 5 can be formed using coins of denomination 1, i.e. (1 + 1 + 1 + 1 + 1).
Thanks in advance!
If you're solving the coin change problem, the best technique is to maintain an array of ways of making change with a partial set of the available denominations, and add in a new denomination d by updating the array like this:
for i = d upto N
a[i] += a[i-d]
Your actual problem is the reverse of this: finding denominations based on the total number of ways. Note that if you know one d, you can remove it from the ways array by reversing the above procedure:
for i = N downto d
a[i] -= a[i-d]
You can find the lowest denomination available by looking for the first 1 in the array (other than the value at index 0, which is always 1). Then, once you've found the lowest denomination, you can remove its effect on the ways array, and repeat until the array is zeroed (except for the first value).
Here's a full solution in Python:
def rways(A):
dens = []
for i in range(1, len(A)):
if not A[i]: continue
dens.append(i)
for j in range(len(A)-1, i-1, -1):
A[j] -= A[j-i]
return dens
print(rways([1, 0, 1, 1, 2]))
You might want to add error-checking: if you find a non-zero value that's not 1 when searching for the next denomination, then the original array isn't valid.
For reference and comparison, here's some code for computing the ways of making change from a set of denominations:
def ways(dens, N):
A = [1] + [0] * N
for d in dens:
for i in range(d, N+1):
A[i] += A[i-d]
return A
print(ways([2, 3, 4], 4))

Can someone explain how recursive insertion sort works?

Assuming A is an array, and n is the number of elements in A,
recursive_insertion_sort(A, n)
IF n > 1 THEN
recursive_insertion_sort(A, n-1)
key = A[n]
i = n - 1
DOWHILE A[i] > key AND i > 0
A[i+1] = A[i]
i = i - 1
ENDDO
A[i+1] = temp
ENDIF
END
Can someone explain how recursion works in this case? There are a few things I don't understand:
I don't understand why we have to call the function again if n > 1.
Why do we input (n-1) when we call the function again? Is it so that we start the entire process from n = 2, the first 2 elements?
How does recursion in general work? Like, once we call the function again, do we ignore the code from line 4 onwards, and jump straight into the second call? Or do we run the 2nd call in conjunction with the first call?
Before discussing the implementation, let's explain what this function does: it does not sort the entire array A, but only its initial n elements. You can pass the length of the array for n to sort the whole thing, but the fact that you pass the length separately is essential to understanding the rest of the answer.
I don't understand why we have to call the function again if n > 1.
Perhaps a better way to explain the meaning of this condition would be that we do not call this function again when n is one or less. This is called the base case of recursive algorithm, i.e. the case when you don't have to do anything. In case of sorting it means that an array of only one element is already sorted.
Why do we input (n-1) when we call the function again?
Since n is the number of elements that we need to sort, We pass n-1 to sort the front of the array. Once the function returns, we know that the portion A[1..n-1] is already sorted. All we need to do is to move A[n] to its right place. We do that in the DOWHILE loop that follows: we go backward one element at a time, moving elements that are bigger than A[n] to the right. Once the loop is over, we place A[n] to its new place. Now the range A[1..n] is sorted.
How does recursion in general work?
The function has two cases - the trivial base case, when everything is done, and a reduction step, when you use recursive invocation to solve a simpler problem, and then use the results of the simpler solution to construct your final solution.
once we call the function again, do we ignore the code from line 4 onwards, and jump straight into the second call?
No, once the function returns, we continue where we left. In your case, the function waits for the A[1..n-1] range to be sorted before placing A[n] to the right place.
Small example to understand how this works :
recursive_insertion_sort([1, 7, 5, 2], 4)
| recursive_insertion_sort([1, 7, 5, 2], 3)
| | recursive_insertion_sort([1, 7, 5, 2], 2)
| | | recursive_insertion_sort([1, 7, 5, 2], 1)
| | puts 7 in the right position between it's ORDERED left values [1] -> [1,7]
| puts 5 in the right position between it's ORDERED left values [1,7] -> [1,5,7]
puts 2 in the right position between it's ORDERED left values [1,5,7] -> [1,2,5,7]

Disperse Duplicates in an Array

Source : Google Interview Question
Write a routine to ensure that identical elements in the input are maximally spread in the output?
Basically, we need to place the same elements,in such a way , that the TOTAL spreading is as maximal as possible.
Example:
Input: {1,1,2,3,2,3}
Possible Output: {1,2,3,1,2,3}
Total dispersion = Difference between position of 1's + 2's + 3's = 4-1 + 5-2 + 6-3 = 9 .
I am NOT AT ALL sure, if there's an optimal polynomial time algorithm available for this.Also,no other detail is provided for the question other than this .
What i thought is,calculate the frequency of each element in the input,then arrange them in the output,each distinct element at a time,until all the frequencies are exhausted.
I am not sure of my approach .
Any approaches/ideas people .
I believe this simple algorithm would work:
count the number of occurrences of each distinct element.
make a new list
add one instance of all elements that occur more than once to the list (order within each group does not matter)
add one instance of all unique elements to the list
add one instance of all elements that occur more than once to the list
add one instance of all elements that occur more than twice to the list
add one instance of all elements that occur more than trice to the list
...
Now, this will intuitively not give a good spread:
for {1, 1, 1, 1, 2, 3, 4} ==> {1, 2, 3, 4, 1, 1, 1}
for {1, 1, 1, 2, 2, 2, 3, 4} ==> {1, 2, 3, 4, 1, 2, 1, 2}
However, i think this is the best spread you can get given the scoring function provided.
Since the dispersion score counts the sum of the distances instead of the squared sum of the distances, you can have several duplicates close together, as long as you have a large gap somewhere else to compensate.
for a sum-of-squared-distances score, the problem becomes harder.
Perhaps the interview question hinged on the candidate recognizing this weakness in the scoring function?
In perl
#a=(9,9,9,2,2,2,1,1,1);
then make a hash table of the counts of different numbers in the list, like a frequency table
map { $x{$_}++ } #a;
then repeatedly walk through all the keys found, with the keys in a known order and add the appropriate number of individual numbers to an output list until all the keys are exhausted
#r=();
$g=1;
while( $g == 1 ) {
$g=0;
for my $n (sort keys %x)
{
if ($x{$n}>1) {
push #r, $n;
$x{$n}--;
$g=1
}
}
}
I'm sure that this could be adapted to any programming language that supports hash tables
python code for algorithm suggested by Vorsprung and HugoRune:
from collections import Counter, defaultdict
def max_spread(data):
cnt = Counter()
for i in data: cnt[i] += 1
res, num = [], list(cnt)
while len(cnt) > 0:
for i in num:
if num[i] > 0:
res.append(i)
cnt[i] -= 1
if cnt[i] == 0: del cnt[i]
return res
def calc_spread(data):
d = defaultdict()
for i, v in enumerate(data):
d.setdefault(v, []).append(i)
return sum([max(x) - min(x) for _, x in d.items()])
HugoRune's answer takes some advantage of the unusual scoring function but we can actually do even better: suppose there are d distinct non-unique values, then the only thing that is required for a solution to be optimal is that the first d values in the output must consist of these in any order, and likewise the last d values in the output must consist of these values in any (i.e. possibly a different) order. (This implies that all unique numbers appear between the first and last instance of every non-unique number.)
The relative order of the first copies of non-unique numbers doesn't matter, and likewise nor does the relative order of their last copies. Suppose the values 1 and 2 both appear multiple times in the input, and that we have built a candidate solution obeying the condition I gave in the first paragraph that has the first copy of 1 at position i and the first copy of 2 at position j > i. Now suppose we swap these two elements. Element 1 has been pushed j - i positions to the right, so its score contribution will drop by j - i. But element 2 has been pushed j - i positions to the left, so its score contribution will increase by j - i. These cancel out, leaving the total score unchanged.
Now, any permutation of elements can be achieved by swapping elements in the following way: swap the element in position 1 with the element that should be at position 1, then do the same for position 2, and so on. After the ith step, the first i elements of the permutation are correct. We know that every swap leaves the scoring function unchanged, and a permutation is just a sequence of swaps, so every permutation also leaves the scoring function unchanged! This is true at for the d elements at both ends of the output array.
When 3 or more copies of a number exist, only the position of the first and last copy contribute to the distance for that number. It doesn't matter where the middle ones go. I'll call the elements between the 2 blocks of d elements at either end the "central" elements. They consist of the unique elements, as well as some number of copies of all those non-unique elements that appear at least 3 times. As before, it's easy to see that any permutation of these "central" elements corresponds to a sequence of swaps, and that any such swap will leave the overall score unchanged (in fact it's even simpler than before, since swapping two central elements does not even change the score contribution of either of these elements).
This leads to a simple O(nlog n) algorithm (or O(n) if you use bucket sort for the first step) to generate a solution array Y from a length-n input array X:
Sort the input array X.
Use a single pass through X to count the number of distinct non-unique elements. Call this d.
Set i, j and k to 0.
While i < n:
If X[i+1] == X[i], we have a non-unique element:
Set Y[j] = Y[n-j-1] = X[i].
Increment i twice, and increment j once.
While X[i] == X[i-1]:
Set Y[d+k] = X[i].
Increment i and k.
Otherwise we have a unique element:
Set Y[d+k] = X[i].
Increment i and k.

Algorithm to count the number of valid blocks in a permutation [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Finding sorted sub-sequences in a permutation
Given an array A which holds a permutation of 1,2,...,n. A sub-block A[i..j]
of an array A is called a valid block if all the numbers appearing in A[i..j]
are consecutive numbers (may not be in order).
Given an array A= [ 7 3 4 1 2 6 5 8] the valid blocks are [3 4], [1,2], [6,5],
[3 4 1 2], [3 4 1 2 6 5], [7 3 4 1 2 6 5], [7 3 4 1 2 6 5 8]
So the count for above permutation is 7.
Give an O( n log n) algorithm to count the number of valid blocks.
Ok, I am down to 1 rep because I put 200 bounty on a related question: Finding sorted sub-sequences in a permutation
so I cannot leave comments for a while.
I have an idea:
1) Locate all permutation groups. They are: (78), (34), (12), (65). Unlike in group theory, their order and position, and whether they are adjacent matters. So, a group (78) can be represented as a structure (7, 8, false), while (34) would be (3,4,true). I am using Python's notation for tuples, but it is actually might be better to use a whole class for the group. Here true or false means contiguous or not. Two groups are "adjacent" if (max(gp1) == min(gp2) + 1 or max(gp2) == min(gp1) + 1) and contigous(gp1) and contiguos(gp2). This is not the only condition, for union(gp1, gp2) to be contiguous, because (14) and (23) combine into (14) nicely. This is a great question for algo class homework, but a terrible one for interview. I suspect this is homework.
Just some thoughts:
At first sight, this sounds impossible: a fully sorted array would have O(n2) valid sub-blocks.
So, you would need to count more than one valid sub-block at a time. Checking the validity of a sub-block is O(n). Checking whether a sub-block is fully sorted is O(n) as well. A fully sorted sub-block contains n·(n - 1)/2 valid sub-blocks, which you can count without further breaking this sub-block up.
Now, the entire array is obviously always valid. For a divide-and-conquer approach, you would need to break this up. There are two conceivable breaking points: the location of the highest element, and that of the lowest element. If you break the array into two at one of these points, including the extremum in the part that contains the second-to-extreme element, there cannot be a valid sub-block crossing this break-point.
By always choosing the extremum that produces a more even split, this should work quite well (average O(n log n)) for "random" arrays. However, I can see problems when your input is something like (1 5 2 6 3 7 4 8), which seems to produce O(n2) behaviour. (1 4 7 2 5 8 3 6 9) would be similar (I hope you see the pattern). I currently see no trick to catch this kind of worse case, but it seems that it requires other splitting techniques.
This question does involve a bit of a "math trick" but it's fairly straight forward once you get it. However, the rest of my solution won't fit the O(n log n) criteria.
The math portion:
For any two consecutive numbers their sum is 2k+1 where k is the smallest element. For three it is 3k+3, 4 : 4k+6 and for N such numbers it is Nk + sum(1,N-1). Hence, you need two steps which can be done simultaneously:
Create the sum of all the sub-arrays.
Determine the smallest element of a sub-array.
The dynamic programming portion
Build two tables using the results of the previous row's entries to build each successive row's entries. Unfortunately, I'm totally wrong as this would still necessitate n^2 sub-array checks. Ugh!
My proposition
STEP = 2 // amount of examed number
B [0,0,0,0,0,0,0,0]
B [1,1,0,0,0,0,0,0]
VALID(A,B) - if not valid move one
B [0,1,1,0,0,0,0,0]
VALID(A,B) - if valid move one and step
B [0,0,0,1,1,0,0,0]
VALID (A,B)
B [0,0,0,0,0,1,1,0]
STEP = 3
B [1,1,1,0,0,0,0,0] not ok
B [0,1,1,1,0,0,0,0] ok
B [0,0,0,0,1,1,1,0] not ok
STEP = 4
B [1,1,1,1,0,0,0,0] not ok
B [0,1,1,1,1,0,0,0] ok
.....
CON <- 0
STEP <- 2
i <- 0
j <- 0
WHILE(STEP <= LEN(A)) DO
j <- STEP
WHILE(STEP <= LEN(A) - j) DO
IF(VALID(A,i,j)) DO
CON <- CON + 1
i <- j + 1
j <- j + STEP
ELSE
i <- i + 1
j <- j + 1
END
END
STEP <- STEP + 1
END
The valid method check that all elements are consecutive
Never tested but, might be ok
The original array doesn't contain duplicates so must itself be a consecutive block. Lets call this block (1 ~ n). We can test to see whether block (2 ~ n) is consecutive by checking if the first element is 1 or n which is O(1). Likewise we can test block (1 ~ n-1) by checking whether the last element is 1 or n.
I can't quite mould this into a solution that works but maybe it will help someone along...
Like everybody else, I'm just throwing this out ... it works for the single example below, but YMMV!
The idea is to count the number of illegal sub-blocks, and subtract this from the total possible number. We count the illegal ones by examining each array element in turn and ruling out sub-blocks that include the element but not its predecessor or successor.
Foreach i in [1,N], compute B[A[i]] = i.
Let Count = the total number of sub-blocks with length>1, which is N-choose-2 (one for each possible combination of starting and ending index).
Foreach i, consider A[i]. Ignoring edge cases, let x=A[i]-1, and let y=A[i]+1. A[i] cannot participate in any sub-block that does not include x or y. Let iX=B[x] and iY=B[y]. There are several cases to be treated independently here. The general case is that iX<i<iY<i. In this case, we can eliminate the sub-block A[iX+1 .. iY-1] and all intervening blocks containing i. There are (i - iX + 1) * (iY - i + 1) such sub-blocks, so call this number Eliminated. (Other cases left as an exercise for the reader, as are those edge cases.) Set Count = Count - Eliminated.
Return Count.
The total cost appears to be N * (cost of step 2) = O(N).
WRINKLE: In step 2, we must be careful not to eliminate each sub-interval more than once. We can accomplish this by only eliminating sub-intervals that lie fully or partly to the right of position i.
Example:
A = [1, 3, 2, 4]
B = [1, 3, 2, 4]
Initial count = (4*3)/2 = 6
i=1: A[i]=1, so need sub-blocks with 2 in them. We can eliminate [1,3] from consideration. Eliminated = 1, Count -> 5.
i=2: A[i]=3, so need sub-blocks with 2 or 4 in them. This rules out [1,3] but we already accounted for it when looking right from i=1. Eliminated = 0.
i=3: A[i] = 2, so need sub-blocks with [1] or [3] in them. We can eliminate [2,4] from consideration. Eliminated = 1, Count -> 4.
i=4: A[i] = 4, so we need sub-blocks with [3] in them. This rules out [2,4] but we already accounted for it when looking right from i=3. Eliminated = 0.
Final Count = 4, corresponding to the sub-blocks [1,3,2,4], [1,3,2], [3,2,4] and [3,2].
(This is an attempt to do this N.log(N) worst case. Unfortunately it's wrong -- it sometimes undercounts. It incorrectly assumes you can find all the blocks by looking at only adjacent pairs of smaller valid blocks. In fact you have to look at triplets, quadruples, etc, to get all the larger blocks.)
You do it with a struct that represents a subblock and a queue for subblocks.
struct
c_subblock
{
int index ; /* index into original array, head of subblock */
int width ; /* width of subblock > 0 */
int lo_value;
c_subblock * p_above ; /* null or subblock above with same index */
};
Alloc an array of subblocks the same size as the original array, and init each subblock to have exactly one item in it. Add them to the queue as you go. If you start with array [ 7 3 4 1 2 6 5 8 ] you will end up with a queue like this:
queue: ( [7,7] [3,3] [4,4] [1,1] [2,2] [6,6] [5,5] [8,8] )
The { index, width, lo_value, p_above } values for subbblock [7,7] will be { 0, 1, 7, null }.
Now it's easy. Forgive the c-ish pseudo-code.
loop {
c_subblock * const p_left = Pop subblock from queue.
int const right_index = p_left.index + p_left.width;
if ( right_index < length original array ) {
// Find adjacent subblock on the right.
// To do this you'll need the original array of length-1 subblocks.
c_subblock const * p_right = array_basic_subblocks[ right_index ];
do {
Check the left/right subblocks to see if the two merged are also a subblock.
If they are add a new merged subblock to the end of the queue.
p_right = p_right.p_above;
}
while ( p_right );
}
}
This will find them all I think. It's usually O(N log(N)), but it'll be O(N^2) for a fully sorted or anti-sorted list. I think there's an answer to this though -- when you build the original array of subblocks you look for sorted and anti-sorted sequences and add them as the base-level subblocks. If you are keeping a count increment it by (width * (width + 1))/2 for the base-level. That'll give you the count INCLUDING all the 1-length subblocks.
After that just use the loop above, popping and pushing the queue. If you're counting you'll have to have a multiplier on both the left and right subblocks and multiply these together to calculate the increment. The multiplier is the width of the leftmost (for p_left) or rightmost (for p_right) base-level subblock.
Hope this is clear and not too buggy. I'm just banging it out, so it may even be wrong.
[Later note. This doesn't work after all. See note below.]

finding a triplet having a given sum

I have been struggling with this questions for sometime now. The question goes like this:-
We have n^2 numbers. We need to find out if there exists a triplet a,b,c such that a+b+c = 0. For a more generic case, a+b+c = k. (k is given)
There exists a solution with O(n^2log(n)) complexity.
Any help would be greatly appreciated.
thanks
To get this in O(n²logn), you'd have to sort the numbers. Find all combinations of 2 numbers, and do a binary search to find the third.
The upper bound is much higher for the general version of the problem.
I wrote a rough solution.
It can definitely be done in O(n^2).
You don't have to sort this.
It's an extension of the problem which requires summing two numbers to x and the trick is to use the hash table.
def triplets(l, total):
"""Sum of 3 numbers to get to total
Basically an extension of the 2 table
"""
l = set( l)
d = { }
for i in l:
remain = total - i
inside = {}
for j in l:
if i == j:
continue
inside[j] = remain -j
d[i] = inside
good = set()
for first, dic in d.iteritems():
for second, third in dic.iteritems():
if third in l:
good.add( tuple(sorted([first, second, third])) )
for each in good:
print each
triplets( [2, 3, 4, 5, 6], 3+4+5)
NOTE: we can use a fast sorting method for triplets which will be O(1).

Resources