design a recursive algorithm that find the minimum of an array - algorithm

I was thinking about a recursive algorithm (it's a theoretical question, so it's not important the programming language). It consists of finding the minimum of a set of numbers
I was thinking of this way: let "n" be the number of elements in the set. let's rearrange the set as:
(a, (b, c, ..., z) ).
the function moves from left to right, and the first element is assumed as minimum in the first phase (it's, of course, the 0-th element, a). next steps are defined as follows:
(a, min(b, c, ..., z) ), check if a is still minimum, or if b is to be assumed as minimum, then (a or b, min(c, d, ..., z) ), another check condition, (a or b or c, min(d, e, ..., z)), check condition, etc.
I think the theoretical pseudocode may be as follows:
f(x) {
// base case
if I've reached the last element, assume it's a possible minimum, and check if y < z. then return a value to stop recursive calls.
// inductive steps
if ( f(i-th element) < f(i+1, next element) ) {
/* just assume the current element is the current minimum */
}
}
I'm having trouble with the base case. I don't know how to formalize it. I think I've understood the basic idea about it: it's basically what I've written in the pseudocode, right?
does what I've written so far make sense? Sorry if it's not clear but I'm a beginner, and I'm studying recursion for the first time, and I personally find it confusing. So, I've tried my best to explain it. If it's not clear, let me know, and I'll try to explain it better with different words.

Recursive problems can be hard to visualize. Let's take an example : arr = [3,5,1,6]
This is a relatively small array but still it's not easy to visualize how recursion will work here from start to end.
Tip : Try to reduce the size of the input. This will make it easy to visualize and help you with finding the base case. First decide what our function should do. In our case it finds the minimum number from an array. If our function works for array of size n then it should also work for array of size n-1 (Recursive leap of faith). Now using this we can reduce the size of input until we cannot reduce it any further, which should give us our base case.
Let's use the above example: arr = [3,5,1,6]
Let create a function findMin(arr, start) which takes an array and a start index and returns the minimum number from start index to end of array.
1st Iteration : [3,5,1,6]
// arr[start] = 3, If we can somehow find minimum from the remaining array,
// then we can compare it with current element and return the minimum of the two.
// so our input is now reduced to the remaining array [5,1,6]
2nd Iteration : [5,1,6]
// arr[start] = 5, If we can somehow find minimum from the remaining array,
// then we can compare it with current element and return the minimum of the two.
// so our input is now reduced to the remaining array [1,6]
3rd Iteration : [1,6]
// arr[start] = 1, If we can somehow find minimum from the remaining array,
// then we can compare it with current element and return the minimum of the two.
// so our input is now reduced to the remaining array [6]
4th Iteration : [6]
// arr[start] = 6, Since it is the only element in the array, it is the minimum.
// This is our base case as we cannot reduce the input any further.
// We will simply return 6.
------------ Tracking Back ------------
3rd Iteration : [1,6]
// 1 will be compared with whatever the 4th Iteration returned (6 in this case).
// This iteration will return minimum(1, 4th Iteration) => minimum(1,6) => 1
2nd Iteration : [5,1,6]
// 5 will be compared with whatever the 3th Iteration returned (1 in this case).
// This iteration will return minimum(5, 3rd Iteration) => minimum(5,1) => 1
1st Iteration : [3,5,1,6]
// 3 will be compared with whatever the 2nd Iteration returned (1 in this case).
// This iteration will return minimum(3, 2nd Iteration) => minimum(3,1) => 1
Final answer = 1
function findMin(arr, start) {
if (start === arr.length - 1) return arr[start];
return Math.min(arr[start], findMin(arr, start + 1));
}
const arr = [3, 5, 1, 6];
const min = findMin(arr, 0);
console.log('Minimum element = ', min);
This is a good problem for practicing recursion for beginners. You can also try these problems for practice.
Reverse a string using recursion.
Reverse a stack using recursion.
Sort a stack using recursion.

To me, it's more like this:
int f(int[] x)
{
var minimum = head of X;
if (x has tail)
{
var remainder = f(tail of x);
if (remainder < minimum)
{
minimum = remainder;
}
}
return minimum;
}

You have the right idea.
You've correctly observed that
min_recursive(array) = min(array[0], min_recursive(array[1:]))
The function doesn't care about who's calling it or what's going on outside of it -- it just needs to return the minimum of the array passed in. The base case is when the array has a single value. That value is the minimum of the array so it should just return it. Otherwise find the minimum of the rest of the array by calling itself again and compare the result with the head of the array.
The other answers show some coding examples.

This is a recursive solution to the problem that you posed, using JavaScript:
a = [5,12,3,5,34,12]
const min = a => {
if (!a.length) { return 0 }
if (a.length === 1) { return a[0] }
return Math.min(a[0], min(a.slice(1)))
}
min(a)
Note the approach which is to first detect the simplest case (empty array), then a more complex case (single element array), then finally a recursive call which will reduce more complex cases to functions of simpler ones.
However, you don't need recursion to traverse a one dimensional array.

Related

How can I stop recursion after I succeed my insertion?

The problem definition is that
Given an additional digit 0 ≤ x ≤ 9, write a function that returns
the integer that results from inserting x in n, such that its digits
also appear in ascending order from left to right. For instance, if n
= 24667 and x = 5, the function should return 245667.
My code
// the divisions are integer division, no floating point
int x(int n, int insertValue)
{
if (n == 0) return 0;
int val = x(n/10, insertValue);
if((n%10) > insertValue)
{
int q = insertValue * 10 + (n%10);
return val * 100 + q;
}
return val*10 + (n%10);
}
For the case of, for example, x(2245,3), it outputs 223435. But I have already done with it while processing 224. It shouldn't go on adding the value to be inserted any more, I mean the 3 shouldn't be there before 5 .
I can come up with a solution that I can put in each recursion step a boolean flag that identify by taking modulo by 10 and dividing by 10 up to reach the single digit case. If there is no any identification, go in to that if block, else not. But it sounds too silly.
When you're dividing recursively, you are actually going right to left, not left to right, so you should check if a digit is smaller than the one inserted and not greater (unless you always let the recursion reach n==0 condition and make your comparisons on your way out of it but that would be ineffective).
The second thing is that you do not break recursion once you inserted the digit (which you were aware of as I can see now in the question title), so it gets inserted repeatedly before every digit that is larger than insertValue. As to how to do it: you were already stopping recursion with if(n==0) condition, i.e. if n==0 the function stops calling itself (it returns immediately). When inserting the digit the difference is that you need to use the original value (n) to return from the function instead of passing it further.
At this point it works well for your example but there's also one edge case you need to consider if you want the function to work property. When you have nothing to divide anymore [if(n==0)] you need to insert your digit anyway (return insertValue) so it does not get lost on the left edge like in x(2245,1) call.
Improvements for brevity:
% has the same precedence as * and /, so brackets around it are not needed here.
I removed val variable as it was now used only once and its calculation was not always necessary.
Here's the working code:
int x(int n, int insertValue){
if(n == 0) return insertValue;
//if insertion point was found, use original value (n)
if(n%10 <= insertValue)
return n*10 + insertValue;
//if not there yet, keep calling x()
return x(n/10, insertValue)*10 + n%10;
}

Partition algorithm without loop, only recursion

Given a list of integers. Find out whether it can be divided into two sublists with equal sum. The numbers in the list is not sorted.
For example:
A list like [1, 2, 3, 4, 6] will return true, because 2 + 6 = 1 + 3 + 4 = 8
A list like [2, 1, 8, 3] will return false.
I saw this problem in an online-practice platform. I know how to do it with for loop + recursion, but how do I solve it with no loop (or iterator)?
The way to think about this it to try to think in which ways you can break up this problem into smaller problems which are dependent on each other. The first thing that comes to mind is to split the problem according to the index. Let's try just going from left to right (increase index by 1 with every call). Our base case would be after we've gone through all elements, i.e. the end of the array. What do we need to do there? Compare the sums with each other, so we need to pass those through. For the actual recursive call, there are 2 possibilities, those being the current element added to either sum. We just need to check both of those calls and return true if one returned true.
This leads to a solution without any loops by having your recursive function take the current index along with both sums as parameters, and having 2 recursive cases - one where you add the current element to the one sum, and one where you add it to the other sum. And then compare the sums when you get to the end of the array.
In Java, it would look like this:
boolean canBeDividedEqually(int[] array)
{
return canBeDividedEquallyHelper(array, 0, 0, 0);
}
boolean canBeDividedEquallyHelper(int[] array, int i, int sum1, int sum2)
{
if (i == array.length)
return sum1 == sum2;
return canBeDividedEquallyHelper(array, i+1, sum1 + array[i], sum2) ||
canBeDividedEquallyHelper(array, i+1, sum1, sum2 + array[i]);
}
This can be improved slightly by only passing through 1 sum and either adding or subtracting from it and then comparing against 0 at the end.
Note that this is the partition problem and this brute force solution takes exponential time (i.e. it gets very slow very quickly as input size increases).

algorithm to find longest non-overlapping sequences

I am trying to find the best way to solve the following problem. By best way I mean less complex.
As an input a list of tuples (start,length) such:
[(0,5),(0,1),(1,9),(5,5),(5,7),(10,1)]
Each element represets a sequence by its start and length, for example (5,7) is equivalent to the sequence (5,6,7,8,9,10,11) - a list of 7 elements starting with 5. One can assume that the tuples are sorted by the start element.
The output should return a non-overlapping combination of tuples that represent the longest continuous sequences(s). This means that, a solution is a subset of ranges with no overlaps and no gaps and is the longest possible - there could be more than one though.
For example for the given input the solution is:
[(0,5),(5,7)] equivalent to (0,1,2,3,4,5,6,7,8,9,10,11)
is it backtracking the best approach to solve this problem ?
I'm interested in any different approaches that people could suggest.
Also if anyone knows a formal reference of this problem or another one that is similar I'd like to get references.
BTW - this is not homework.
Edit
Just to avoid some mistakes this is another example of expected behaviour
for an input like [(0,1),(1,7),(3,20),(8,5)] the right answer is [(3,20)] equivalent to (3,4,5,..,22) with length 20. Some of the answers received would give [(0,1),(1,7),(8,5)] equivalent to (0,1,2,...,11,12) as right answer. But this last answer is not correct because is shorter than [(3,20)].
Iterate over the list of tuples using the given ordering (by start element), while using a hashmap to keep track of the length of the longest continuous sequence ending on a certain index.
pseudo-code, skipping details like items not found in a hashmap (assume 0 returned if not found):
int bestEnd = 0;
hashmap<int,int> seq // seq[key] = length of the longest sequence ending on key-1, or 0 if not found
foreach (tuple in orderedTuples) {
int seqLength = seq[tuple.start] + tuple.length
int tupleEnd = tuple.start+tuple.length;
seq[tupleEnd] = max(seq[tupleEnd], seqLength)
if (seqLength > seq[bestEnd]) bestEnd = tupleEnd
}
return new tuple(bestEnd-seq[bestEnd], seq[bestEnd])
This is an O(N) algorithm.
If you need the actual tuples making up this sequence, you'd need to keep a linked list of tuples hashed by end index as well, updating this whenever the max length is updated for this end-point.
UPDATE: My knowledge of python is rather limited, but based on the python code you pasted, I created this code that returns the actual sequence instead of just the length:
def get_longest(arr):
bestEnd = 0;
seqLengths = dict() #seqLengths[key] = length of the longest sequence ending on key-1, or 0 if not found
seqTuples = dict() #seqTuples[key] = the last tuple used in this longest sequence
for t in arr:
seqLength = seqLengths.get(t[0],0) + t[1]
tupleEnd = t[0] + t[1]
if (seqLength > seqLengths.get(tupleEnd,0)):
seqLengths[tupleEnd] = seqLength
seqTuples[tupleEnd] = t
if seqLength > seqLengths.get(bestEnd,0):
bestEnd = tupleEnd
longestSeq = []
while (bestEnd in seqTuples):
longestSeq.append(seqTuples[bestEnd])
bestEnd -= seqTuples[bestEnd][1]
longestSeq.reverse()
return longestSeq
if __name__ == "__main__":
a = [(0,3),(1,4),(1,1),(1,8),(5,2),(5,5),(5,6),(10,2)]
print(get_longest(a))
Revised algorithm:
create a hashtable of start->list of tuples that start there
put all tuples in a queue of tupleSets
set the longestTupleSet to the first tuple
while the queue is not empty
take tupleSet from the queue
if any tuples start where the tupleSet ends
foreach tuple that starts where the tupleSet ends
enqueue new tupleSet of tupleSet + tuple
continue
if tupleSet is longer than longestTupleSet
replace longestTupleSet with tupleSet
return longestTupleSet
c# implementation
public static IList<Pair<int, int>> FindLongestNonOverlappingRangeSet(IList<Pair<int, int>> input)
{
var rangeStarts = input.ToLookup(x => x.First, x => x);
var adjacentTuples = new Queue<List<Pair<int, int>>>(
input.Select(x => new List<Pair<int, int>>
{
x
}));
var longest = new List<Pair<int, int>>
{
input[0]
};
int longestLength = input[0].Second - input[0].First;
while (adjacentTuples.Count > 0)
{
var tupleSet = adjacentTuples.Dequeue();
var last = tupleSet.Last();
int end = last.First + last.Second;
var sameStart = rangeStarts[end];
if (sameStart.Any())
{
foreach (var nextTuple in sameStart)
{
adjacentTuples.Enqueue(tupleSet.Concat(new[] { nextTuple }).ToList());
}
continue;
}
int length = end - tupleSet.First().First;
if (length > longestLength)
{
longestLength = length;
longest = tupleSet;
}
}
return longest;
}
tests:
[Test]
public void Given_the_first_problem_sample()
{
var input = new[]
{
new Pair<int, int>(0, 5),
new Pair<int, int>(0, 1),
new Pair<int, int>(1, 9),
new Pair<int, int>(5, 5),
new Pair<int, int>(5, 7),
new Pair<int, int>(10, 1)
};
var result = FindLongestNonOverlappingRangeSet(input);
result.Count.ShouldBeEqualTo(2);
result.First().ShouldBeSameInstanceAs(input[0]);
result.Last().ShouldBeSameInstanceAs(input[4]);
}
[Test]
public void Given_the_second_problem_sample()
{
var input = new[]
{
new Pair<int, int>(0, 1),
new Pair<int, int>(1, 7),
new Pair<int, int>(3, 20),
new Pair<int, int>(8, 5)
};
var result = FindLongestNonOverlappingRangeSet(input);
result.Count.ShouldBeEqualTo(1);
result.First().ShouldBeSameInstanceAs(input[2]);
}
This is a special case of the longest path problem for weighted directed acyclic graphs.
The nodes in the graph are the start points and the points after the last element in a sequence, where the next sequence could start.
The problem is special because the distance between two nodes must be the same independently of the path.
Just thinking about the algorithm in basic terms, would this work?
(apologies for horrible syntax but I'm trying to stay language-independent here)
First the simplest form: Find the longest contiguous pair.
Cycle through every member and compare it to every other member with a higher startpos. If the startpos of the second member is equal to the sum of the startpos and length of the first member, they are contiguous. If so, form a new member in a new set with the lower startpos and combined length to represent this.
Then, take each of these pairs and compare them to all of the single members with a higher startpos and repeat, forming a new set of contiguous triples (if any exist).
Continue this pattern until you have no new sets.
The tricky part then is you have to compare the length of every member of each of your sets to find the real longest chain.
I'm pretty sure this is not as efficient as other methods, but I believe this is a viable approach to brute forcing this solution.
I'd appreciate feedback on this and any errors I may have overlooked.
Edited to replace pseudocode with actual Python code
Edited AGAIN to change the code; The original algorithm was on the solution, but I missunderstood what the second value in the pairs was! Fortunatelly the basic algorithm is the same, and I was able to change it.
Here's an idea that solves the problem in O(N log N) and doesn't use a hash map (so no hidden times). For memory we're going to use N * 2 "things".
We're going to add two more values to each tuple: (BackCount, BackLink). In the successful combination BackLink will link from right to left from the right-most tuple to the left-most tuple. BackCount will be the value accumulated count for the given BackLink.
Here's some python code:
def FindTuplesStartingWith(tuples, frm):
# The Log(N) algorithm is left as an excersise for the user
ret=[]
for i in range(len(tuples)):
if (tuples[i][0]==frm): ret.append(i)
return ret
def FindLongestSequence(tuples):
# Prepare (BackCount, BackLink) array
bb=[] # (BackCount, BackLink)
for OneTuple in tuples: bb.append((-1,-1))
# Prepare
LongestSequenceLen=-1
LongestSequenceTail=-1
# Algorithm
for i in range(len(tuples)):
if (bb[i][0] == -1): bb[i] = (0, bb[i][1])
# Is this single pair the longest possible pair all by itself?
if (tuples[i][1] + bb[i][0]) > LongestSequenceLen:
LongestSequenceLen = tuples[i][1] + bb[i][0]
LongestSequenceTail = i
# Find next segment
for j in FindTuplesStartingWith(tuples, tuples[i][0] + tuples[i][1]):
if ((bb[j][0] == -1) or (bb[j][0] < (bb[i][0] + tuples[i][1]))):
# can be linked
bb[j] = (bb[i][0] + tuples[i][1], i)
if ((bb[j][0] + tuples[j][1]) > LongestSequenceLen):
LongestSequenceLen = bb[j][0] + tuples[j][1]
LongestSequenceTail=j
# Done! I'll now build up the solution
ret=[]
while (LongestSequenceTail > -1):
ret.insert(0, tuples[LongestSequenceTail])
LongestSequenceTail = bb[LongestSequenceTail][1]
return ret
# Call the algoritm
print FindLongestSequence([(0,5), (0,1), (1,9), (5,5), (5,7), (10,1)])
>>>>>> [(0, 5), (5, 7)]
print FindLongestSequence([(0,1), (1,7), (3,20), (8,5)])
>>>>>> [(3, 20)]
The key for the whole algorithm is where the "THIS IS THE KEY" comment is in the code. We know our current StartTuple can be linked to EndTuple. If a longer sequence that ends at EndTuple.To exists, it was found by the time we got to this point, because it had to start at an smaller StartTuple.From, and the array is sorted on "From"!
I removed the previous solution because it was not tested.
The problem is finding the longest path in a "weighted directed acyclic graph", it can be solved in linear time:
http://en.wikipedia.org/wiki/Longest_path_problem#Weighted_directed_acyclic_graphs
Put a set of {start positions} union {(start position + end position)} as vertices. For your example it would be {0, 1, 5, 10, 11, 12}
for vertices v0, v1 if there is an end value w that makes v0 + w = v1, then add a directed edge connecting v0 to v1 and put w as its weight.
Now follow the pseudocode in the wikipedia page. since the number of vertices is the maximum value of 2xn (n is number of tuples), the problem can still be solved in linear time.
This is a simple reduce operation. Given a pair of consecutive tuples, they either can or can't be combined. So define the pairwise combination function:
def combo(first,second):
if first[0]+first[1] == second[0]:
return [(first[0],first[1]+second[1])]
else:
return [first,second]
This just returns a list of either one element combining the two arguments, or the original two elements.
Then define a function to iterate over the first list and combine pairs:
def collapse(tupleList):
first = tupleList.pop(0)
newList = []
for item in tupleList:
collapsed = combo(first,item)
if len(collapsed)==2:
newList.append(collapsed[0])
first = collapsed.pop()
newList.append(first)
return newList
This keeps a first element to compare with the current item in the list (starting at the second item), and when it can't combine them it drops the first into a new list and replaces first with the second of the two.
Then just call collapse with the list of tuples:
>>> collapse( [(5, 7), (12, 3), (0, 5), (0, 7), (7, 2), (9, 3)] )
[(5, 10), (0, 5), (0, 12)]
[Edit] Finally, iterate over the result to get the longest sequence.
def longest(seqs):
collapsed = collapse(seqs)
return max(collapsed, key=lambda x: x[1])
[/Edit]
Complexity O(N). For bonus marks, do it in reverse so that the initial pop(0) becomes a pop() and you don't have to reindex the array, or move the iterator instead. For top marks make it run as a pairwise reduce operation for multithreaded goodness.
This sounds like a perfect "dynamic programming" problem...
The simplest program would be to do it brute force (e.g. recursive), but this has exponential complexity.
With dynamic programming you can set up an array a of length n, where n is the maximum of all (start+length) values of your problem, where a[i] denotes the longest non-overlapping sequence up to a[i]. You can then step trought all tuples, updating a. The complexity of this algorithm would be O(n*k), where k is the number of input values.
Create an ordered array of all start and end points and initialise all of them to one
For each item in your tuple, compare the end point (start and end) to the ordered items in your array, if any point is between them (e.g. point in the array is 5 and you have start 2 with length 4) change value to zero.
After finishing the loop, start moving across the ordered array and create a strip when you see 1 and while you see 1, add to the existing strip, with any zero, close the strip and etc.
At the end check the length of strips
I think complexity is around O(4-5*N)
(SEE UPDATE)
with N being number of items in the tuple.
UPDATE
As you figured out, the complexity is not accurate but definitely very small since it is a function of number of line stretches (tuple items).
So if N is number of line stretches, sorting is O(2N * log2N). Comparison is O(2N). Finding line stretches is also O(2N). So all in all O(2N(log2N + 2)).

Find unique common element from 3 arrays

Original Problem:
I have 3 boxes each containing 200 coins, given that there is only one person who has made calls from all of the three boxes and thus there is one coin in each box which has same fingerprints and rest of all coins have different fingerprints. You have to find the coin which contains same fingerprint from all of the 3 boxes. So that we can find the fingerprint of the person who has made call from all of the 3 boxes.
Converted problem:
You have 3 arrays containing 200 integers each. Given that there is one and only one common element in these 3 arrays. Find the common element.
Please consider solving this for other than trivial O(1) space and O(n^3) time.
Some improvement in Pelkonen's answer:
From converted problem in OP:
"Given that there is one and only one common element in these 3 arrays."
We need to sort only 2 arrays and find common element.
If you sort all the arrays first O(n log n) then it will be pretty easy to find the common element in less than O(n^3) time. You can for example use binary search after sorting them.
Let N = 200, k = 3,
Create a hash table H with capacity ≥ Nk.
For each element X in array 1, set H[X] to 1.
For each element Y in array 2, if Y is in H and H[Y] == 1, set H[Y] = 2.
For each element Z in array 3, if Z is in H and H[Z] == 2, return Z.
throw new InvalidDataGivenByInterviewerException();
O(Nk) time, O(Nk) space complexity.
Use a hash table for each integer and encode the entries such that you know which array it's coming from - then check for the slot which has entries from all 3 arrays. O(n)
Use a hashtable mapping objects to frequency counts. Iterate through all three lists, incrementing occurrence counts in the hashtable, until you encounter one with an occurrence count of 3. This is O(n), since no sorting is required. Example in Python:
def find_duplicates(*lists):
num_lists = len(lists)
counts = {}
for l in lists:
for i in l:
counts[i] = counts.get(i, 0) + 1
if counts[i] == num_lists:
return i
Or an equivalent, using sets:
def find_duplicates(*lists):
intersection = set(lists[0])
for l in lists[1:]:
intersection = intersection.intersect(set(l))
return intersection.pop()
O(N) solution: use a hash table. H[i] = list of all integers in the three arrays that map to i.
For all H[i] > 1 check if three of its values are the same. If yes, you have your solution. You can do this check with the naive solution even, it should still be very fast, or you can sort those H[i] and then it becomes trivial.
If your numbers are relatively small, you can use H[i] = k if i appears k times in the three arrays, then the solution is the i for which H[i] = 3. If your numbers are huge, use a hash table though.
You can extend this to work even if you can have elements that can be common to only two arrays and also if you can have elements repeating elements in one of the arrays. It just becomes a bit more complicated, but you should be able to figure it out on your own.
If you want the fastest* answer:
Sort one array--time is N log N.
For each element in the second array, search the first. If you find it, add 1 to a companion array; otherwise add 0--time is N log N, using N space.
For each non-zero count, copy the corresponding entry into the temporary array, compacting it so it's still sorted--time is N.
For each element in the third array, search the temporary array; when you find a hit, stop. Time is less than N log N.
Here's code in Scala that illustrates this:
import java.util.Arrays
val a = Array(1,5,2,3,14,1,7)
val b = Array(3,9,14,4,2,2,4)
val c = Array(1,9,11,6,8,3,1)
Arrays.sort(a)
val count = new Array[Int](a.length)
for (i <- 0 until b.length) {
val j =Arrays.binarySearch(a,b(i))
if (j >= 0) count(j) += 1
}
var n = 0
for (i <- 0 until count.length) if (count(i)>0) { count(n) = a(i); n+= 1 }
for (i <- 0 until c.length) {
if (Arrays.binarySearch(count,0,n,c(i))>=0) println(c(i))
}
With slightly more complexity, you can either use no extra space at the cost of being even more destructive of your original arrays, or you can avoid touching your original arrays at all at the cost of another N space.
Edit: * as the comments have pointed out, hash tables are faster for non-perverse inputs. This is "fastest worst case". The worst case may not be so unlikely unless you use a really good hashing algorithm, which may well eat up more time than your sort. For example, if you multiply all your values by 2^16, the trivial hashing (i.e. just use the bitmasked integer as an index) will collide every time on lists shorter than 64k....
//Begineers Code using Binary Search that's pretty Easy
// bool BS(int arr[],int low,int high,int target)
// {
// if(low>high)
// return false;
// int mid=low+(high-low)/2;
// if(target==arr[mid])
// return 1;
// else if(target<arr[mid])
// BS(arr,low,mid-1,target);
// else
// BS(arr,mid+1,high,target);
// }
// vector <int> commonElements (int A[], int B[], int C[], int n1, int n2, int n3)
// {
// vector<int> ans;
// for(int i=0;i<n2;i++)
// {
// if(i>0)
// {
// if(B[i-1]==B[i])
// continue;
// }
// //The above if block is to remove duplicates
// //In the below code we are searching an element form array B in both the arrays A and B;
// if(BS(A,0,n1-1,B[i]) && BS(C,0,n3-1,B[i]))
// {
// ans.push_back(B[i]);
// }
// }
// return ans;
// }

Is it possible to rearrange an array in place in O(N)?

If I have a size N array of objects, and I have an array of unique numbers in the range 1...N, is there any algorithm to rearrange the object array in-place in the order specified by the list of numbers, and yet do this in O(N) time?
Context: I am doing a quick-sort-ish algorithm on objects that are fairly large in size, so it would be faster to do the swaps on indices than on the objects themselves, and only move the objects in one final pass. I'd just like to know if I could do this last pass without allocating memory for a separate array.
Edit: I am not asking how to do a sort in O(N) time, but rather how to do the post-sort rearranging in O(N) time with O(1) space. Sorry for not making this clear.
I think this should do:
static <T> void arrange(T[] data, int[] p) {
boolean[] done = new boolean[p.length];
for (int i = 0; i < p.length; i++) {
if (!done[i]) {
T t = data[i];
for (int j = i;;) {
done[j] = true;
if (p[j] != i) {
data[j] = data[p[j]];
j = p[j];
} else {
data[j] = t;
break;
}
}
}
}
}
Note: This is Java. If you do this in a language without garbage collection, be sure to delete done.
If you care about space, you can use a BitSet for done. I assume you can afford an additional bit per element because you seem willing to work with a permutation array, which is several times that size.
This algorithm copies instances of T n + k times, where k is the number of cycles in the permutation. You can reduce this to the optimal number of copies by skipping those i where p[i] = i.
The approach is to follow the "permutation cycles" of the permutation, rather than indexing the array left-to-right. But since you do have to begin somewhere, everytime a new permutation cycle is needed, the search for unpermuted elements is left-to-right:
// Pseudo-code
N : integer, N > 0 // N is the number of elements
swaps : integer [0..N]
data[N] : array of object
permute[N] : array of integer [-1..N] denoting permutation (used element is -1)
next_scan_start : integer;
next_scan_start = 0;
while (swaps < N )
{
// Search for the next index that is not-yet-permtued.
for (idx_cycle_search = next_scan_start;
idx_cycle_search < N;
++ idx_cycle_search)
if (permute[idx_cycle_search] >= 0)
break;
next_scan_start = idx_cycle_search + 1;
// This is a provable invariant. In short, number of non-negative
// elements in permute[] equals (N - swaps)
assert( idx_cycle_search < N );
// Completely permute one permutation cycle, 'following the
// permutation cycle's trail' This is O(N)
while (permute[idx_cycle_search] >= 0)
{
swap( data[idx_cycle_search], data[permute[idx_cycle_search] )
swaps ++;
old_idx = idx_cycle_search;
idx_cycle_search = permute[idx_cycle_search];
permute[old_idx] = -1;
// Also '= -idx_cycle_search -1' could be used rather than '-1'
// and would allow reversal of these changes to permute[] array
}
}
Do you mean that you have an array of objects O[1..N] and then you have an array P[1..N] that contains a permutation of numbers 1..N and in the end you want to get an array O1 of objects such that O1[k] = O[P[k]] for all k=1..N ?
As an example, if your objects are letters A,B,C...,Y,Z and your array P is [26,25,24,..,2,1] is your desired output Z,Y,...C,B,A ?
If yes, I believe you can do it in linear time using only O(1) additional memory. Reversing elements of an array is a special case of this scenario. In general, I think you would need to consider decomposition of your permutation P into cycles and then use it to move around the elements of your original array O[].
If that's what you are looking for, I can elaborate more.
EDIT: Others already presented excellent solutions while I was sleeping, so no need to repeat it here. ^_^
EDIT: My O(1) additional space is indeed not entirely correct. I was thinking only about "data" elements, but in fact you also need to store one bit per permutation element, so if we are precise, we need O(log n) extra bits for that. But most of the time using a sign bit (as suggested by J.F. Sebastian) is fine, so in practice we may not need anything more than we already have.
If you didn't mind allocating memory for an extra hash of indexes, you could keep a mapping of original location to current location to get a time complexity of near O(n). Here's an example in Ruby, since it's readable and pseudocode-ish. (This could be shorter or more idiomatically Ruby-ish, but I've written it out for clarity.)
#!/usr/bin/ruby
objects = ['d', 'e', 'a', 'c', 'b']
order = [2, 4, 3, 0, 1]
cur_locations = {}
order.each_with_index do |orig_location, ordinality|
# Find the current location of the item.
cur_location = orig_location
while not cur_locations[cur_location].nil? do
cur_location = cur_locations[cur_location]
end
# Swap the items and keep track of whatever we swapped forward.
objects[ordinality], objects[cur_location] = objects[cur_location], objects[ordinality]
cur_locations[ordinality] = orig_location
end
puts objects.join(' ')
That obviously does involve some extra memory for the hash, but since it's just for indexes and not your "fairly large" objects, hopefully that's acceptable. Since hash lookups are O(1), even though there is a slight bump to the complexity due to the case where an item has been swapped forward more than once and you have to rewrite cur_location multiple times, the algorithm as a whole should be reasonably close to O(n).
If you wanted you could build a full hash of original to current positions ahead of time, or keep a reverse hash of current to original, and modify the algorithm a bit to get it down to strictly O(n). It'd be a little more complicated and take a little more space, so this is the version I wrote out, but the modifications shouldn't be difficult.
EDIT: Actually, I'm fairly certain the time complexity is just O(n), since each ordinality can have at most one hop associated, and thus the maximum number of lookups is limited to n.
#!/usr/bin/env python
def rearrange(objects, permutation):
"""Rearrange `objects` inplace according to `permutation`.
``result = [objects[p] for p in permutation]``
"""
seen = [False] * len(permutation)
for i, already_seen in enumerate(seen):
if not already_seen: # start permutation cycle
first_obj, j = objects[i], i
while True:
seen[j] = True
p = permutation[j]
if p == i: # end permutation cycle
objects[j] = first_obj # [old] p -> j
break
objects[j], j = objects[p], p # p -> j
The algorithm (as I've noticed after I wrote it) is the same as the one from #meriton's answer in Java.
Here's a test function for the code:
def test():
import itertools
N = 9
for perm in itertools.permutations(range(N)):
L = range(N)
LL = L[:]
rearrange(L, perm)
assert L == [LL[i] for i in perm] == list(perm), (L, list(perm), LL)
# test whether assertions are enabled
try:
assert 0
except AssertionError:
pass
else:
raise RuntimeError("assertions must be enabled for the test")
if __name__ == "__main__":
test()
There's a histogram sort, though the running time is given as a bit higher than O(N) (N log log n).
I can do it given O(N) scratch space -- copy to new array and copy back.
EDIT: I am aware of the existance of an algorithm that will proceed through. The idea is to perform the swaps on the array of integers 1..N while at the same time mirroring the swaps on your array of large objects. I just cannot find the algorithm right now.
The problem is one of applying a permutation in place with minimal O(1) extra storage: "in-situ permutation".
It is solvable, but an algorithm is not obvious beforehand.
It is described briefly as an exercise in Knuth, and for work I had to decipher it and figure out how it worked. Look at 5.2 #13.
For some more modern work on this problem, with pseudocode:
http://www.fernuni-hagen.de/imperia/md/content/fakultaetfuermathematikundinformatik/forschung/berichte/bericht_273.pdf
I ended up writing a different algorithm for this, which first generates a list of swaps to apply an order and then runs through the swaps to apply it. The advantage is that if you're applying the ordering to multiple lists, you can reuse the swap list, since the swap algorithm is extremely simple.
void make_swaps(vector<int> order, vector<pair<int,int>> &swaps)
{
// order[0] is the index in the old list of the new list's first value.
// Invert the mapping: inverse[0] is the index in the new list of the
// old list's first value.
vector<int> inverse(order.size());
for(int i = 0; i < order.size(); ++i)
inverse[order[i]] = i;
swaps.resize(0);
for(int idx1 = 0; idx1 < order.size(); ++idx1)
{
// Swap list[idx] with list[order[idx]], and record this swap.
int idx2 = order[idx1];
if(idx1 == idx2)
continue;
swaps.push_back(make_pair(idx1, idx2));
// list[idx1] is now in the correct place, but whoever wanted the value we moved out
// of idx2 now needs to look in its new position.
int idx1_dep = inverse[idx1];
order[idx1_dep] = idx2;
inverse[idx2] = idx1_dep;
}
}
template<typename T>
void run_swaps(T data, const vector<pair<int,int>> &swaps)
{
for(const auto &s: swaps)
{
int src = s.first;
int dst = s.second;
swap(data[src], data[dst]);
}
}
void test()
{
vector<int> order = { 2, 3, 1, 4, 0 };
vector<pair<int,int>> swaps;
make_swaps(order, swaps);
vector<string> data = { "a", "b", "c", "d", "e" };
run_swaps(data, swaps);
}

Resources