In what seems to me a common implementation of quicksort, the program is composed of a partitioning subroutine and two recursive calls to quicksort those (two) partitions.
So the flow of control, in the quickest and pseudo-est of pseudocode, goes something like this:
quicksort[list, some parameters]
    ...
    q = partition[some other parameters]
    quicksort[1, q]
    quicksort[q+1, length[list]]
    ...
End
The q is the "pivot" after a partitioning. That second quicksort call (the one that quicksorts the second part of the list) also uses q. This is what I don't understand. If the "flow of control" goes through the first quicksort first, q is going to be updated. How is the same q going to work in the second quicksort, when it comes time to do the second parts of all those partitions?
I think my misunderstanding comes from the limitations of pseudocode. Details have likely been left out in expressing this implementation of the quicksort algorithm as pseudocode.
Edit 1: This seems related to my problem:
For[i = 1, i < 5, i = i + 1, Print[i]]
The first time through, we would get i=1, the test gives True, i is incremented to 2, and then the body prints 1. Even though i was updated to 2, i is still 1 in the body (i.e., Print[i] outputs 1). This "flow of control" is what I don't understand. Where is the i=1 being stored when i increments to 2 and before control gets to the body?
Edit 2
As an example of what I'm trying to get at, I'm pasting this here. It's from here.
Partition(A,p,r)
    x = A[r]
    i = p-1
    j = r+1
    while TRUE
        repeat j = j-1
        until A[j] <= x
        repeat i = i+1
        until A[i] >= x
        if i < j
            then exchange A[i] with A[j]
        else return j
Quicksort(A,1,length[A])

Quicksort(A,p,r)
    if p < r
        then q = Partition(A,p,r)
             Quicksort(A,p,q)
             Quicksort(A,q+1,r)
Another example can be found here.
Where or when in these algorithms is q being put onto a stack?
q is not updated. The pivot remains in its place. In each iteration of quicksort, the only element that is guaranteed to be in its correct place is the pivot.
Also, note that the q which is "changed" during the recursive call is NOT actually changed, since it is a different variable, stored in a different area. This is so because q is a local variable of the function and is created anew for each call.
EDIT: [response to the question edit]
In quicksort, the algorithm actually generates a number of qs, which are stored on the stack. Every variable is 'alive' only within its own function call and is accessible [in this example] only from it. When the function ends, its local variables are released automatically. So you don't have only one pivot; you have a number of pivots, one for each recursive step.
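To make that concrete, here is a minimal runnable Python sketch (the Hoare-style partition below is written only for this demonstration, not taken from the question): each recursive call binds its own local q, and printing it before and after the inner calls shows that those calls never disturb it.

def partition(a, p, r):
    # Hoare-style partition on the inclusive range [p, r]; pivot = a[p].
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def quicksort(a, p, r, depth=0):
    if p < r:
        q = partition(a, p, r)            # q is local to THIS call
        print("  " * depth + "call on [%d,%d] got q = %d" % (p, r, q))
        quicksort(a, p, q, depth + 1)     # inner calls bind their own q's
        quicksort(a, q + 1, r, depth + 1)
        print("  " * depth + "back in [%d,%d], q is still %d" % (p, r, q))

data = [5, 3, 8, 1, 9, 2]
quicksort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 5, 8, 9]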
It turns out Quicksort demands extra memory precisely in order to do the bookkeeping you mentioned. Perhaps the following (pseudocode) iterative version of the algorithm might clear things up:
quicksort(array, begin, end) =
    intervals_to_sort = {(begin, end)} //a set
    while there are intervals to sort:
        (begin, end) = remove an interval from intervals_to_sort
        if length of (begin, end) >= 2:
            q = partition(array, begin, end)
            add (begin, q) to intervals_to_sort
            add (q+1, end) to intervals_to_sort
You may notice that now the intervals to sort are being explicitly kept in a data structure (usually just an array, inserting and removing at the end, in a stack-like fashion) so there is no risk of "forgetting" about old intervals.
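Here is the same idea as runnable Python, a sketch assuming inclusive indices and a Hoare-style partition (written inline for the demonstration) that returns a q such that the two pieces are [begin, q] and [q+1, end]:

def partition(a, lo, hi):
    # Hoare-style partition on the inclusive range [lo, hi]; pivot = a[lo].
    x = a[lo]
    i, j = lo - 1, hi + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def quicksort_iterative(array, begin, end):
    intervals_to_sort = [(begin, end)]         # the explicit stack
    while intervals_to_sort:
        begin, end = intervals_to_sort.pop()   # remove an interval
        if end - begin + 1 >= 2:               # length of (begin, end) >= 2
            q = partition(array, begin, end)
            intervals_to_sort.append((begin, q))
            intervals_to_sort.append((q + 1, end))

data = [4, 7, 4, 2, 4, 6, 5]
quicksort_iterative(data, 0, len(data) - 1)
print(data)  # [2, 4, 4, 4, 5, 6, 7]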
What might confuse you is that the most common description of Quicksort is recursive so the q variable appears multiple times. The answer to this is that every time a function is called it creates a new batch of local variables so it doesn't touch the old ones. In the end, the explicit stack from that previous imperative example ends up being implemented as an implicit stack with function variables.
(An interesting side note: some early programming languages didn't implement neat local variables like that, and Quicksort was actually first described using the iterative version with the explicit stack. It was only later that it was seen how Quicksort could be elegantly described as a recursive algorithm in Algol.)
As for the part after your edit, the i=1 is forgotten since assignment will destructively update the variable.
The partition code picks some value from the array (such as the value at the midpoint of the array ... your example code picks the last element) -- this is the pivot. It then puts all the values <= pivot on the left and all values >= pivot on the right, and then stores the pivot in the one remaining slot between them. At that point, the pivot is necessarily in the correct slot, q. Then the algorithm sorts the partition [p, q) and the partition [q+1, r), which are disjoint but cover all of A except q, resulting in the entire array being sorted.
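For concreteness, here is a short Lomuto-style sketch of the partition step just described (the sample array is arbitrary): the pivot is the last element, everything <= pivot is swapped to the left, and the pivot then lands in its final slot q.

def partition_lomuto(a, p, r):
    x = a[r]                           # pivot: the last element
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:                  # value belongs on the left
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]    # drop the pivot into the gap
    return i + 1                       # q: the pivot's final position

a = [4, 7, 4, 2, 4, 6, 5]
q = partition_lomuto(a, 0, len(a) - 1)
print(a, q)  # [4, 4, 2, 4, 5, 6, 7] 4: index 4 is the pivot's final slot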
Related
I was going over the implementation of quicksort (from CLRS 3rd Edition). I found that the recursive divide of the array goes from the low index to middle-1 and then again from middle+1 to high.
QUICKSORT(A,p,r)
    if p < r
        q = PARTITION(A,p,r)
        QUICKSORT(A,p,q-1)
        QUICKSORT(A,q+1,r)
And the implementation of the merge sort is given as follows:
MERGE-SORT(A,p,r)
    if p < r
        q = floor((p+r)/2)
        MERGE-SORT(A,p,q)
        MERGE-SORT(A,q+1,r)
        MERGE(A,p,q,r)
As both of them use the same divide strategy, why does quicksort exclude the middle element (its calls go from p to q-1 and from q+1 to r, so q is not included) while merge sort includes it?
Quicksort puts all the elements smaller than the pivot on one side and all elements bigger on the other side. After this step we know the final position of the pivot will be between those two, and that's where we put it, so we don't need to look at it again.
Thus we can exclude the pivot element in the recursive calls.
Mergesort just picks the middle position and doesn't do anything with that element until later. There's no guarantee that the element in that position will already be in the right place, thus we need to look at that element again later on.
Thus we must include the middle element in the recursive calls.
Both methods use a divide-and-conquer strategy, but in different ways.
Mergesort (the most common implementation) divides the array recursively into equal (if possible) size parts; the middle indexes are fixed positions (for a given array length). The recursive calls treat the left part and the right part of the array in their entirety.
Quicksort's partition subroutine places the pivot element in its needed (final) position (in most cases the pivot index is not the middle). There is no need to treat this element further, and the recursive calls treat the pieces before and after that element.
In many quicksort algorithms, the programming involves placing the elements from each array into three groups: (less, pivot, more), and sometimes placing the groups back together. What if I do not want to use this? Is there a simpler approach to sorting a list with quicksort manually?
Basically, I plan to keep the array as one and swap all the elements based on a partition (for example, given a list x and pivot position r, we could have the reference ranges [0:r] and [r:len(x)]). However, as the sorting continues, how do I keep referencing each smaller "subarray"?
So this is my code, but I'm not sure how to continue from here:
x = [4,7,4,2,4,6,5]
# r is the pivot POSITION
r = len(x) - 1
i = -1
for a in range(0, r+1):
    if x[a] <= x[r]:
        i += 1
        x[i], x[a] = x[a], x[i]
You can implement quicksort purely by swapping the locations of items in a list, rather than actually creating new lists.
But unless this is some sort of homework assignment, the best option is generally to use Python's built-in sort() method, which uses a highly optimized hybrid sorting algorithm (Timsort, which is mergesort-based rather than quicksort) under the hood.
There's something not right in here. You need to have two definitions, one for the partition and one for the quicksort process itself. The quicksort will then need to have some sort of loop, so that it will continue applying the partition to subarrays of the array. Go and check the Wikipedia article to understand how this works.
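As a sketch of one way to continue from the snippet above (names and index conventions here are illustrative, not prescribed): wrap the same swapping loop in a partition function that works on any index range [lo, hi], then recurse on the two sides of the returned pivot position. The "subarrays" are never separate lists, just narrower index ranges into the one list.

def partition(x, lo, hi):
    # The question's swapping loop, generalized to a sub-range [lo, hi].
    i = lo - 1
    for a in range(lo, hi):
        if x[a] <= x[hi]:              # x[hi] is the pivot value
            i += 1
            x[i], x[a] = x[a], x[i]
    x[i + 1], x[hi] = x[hi], x[i + 1]  # put the pivot in its final place
    return i + 1

def quicksort(x, lo, hi):
    if lo < hi:
        r = partition(x, lo, hi)       # pivot is now fixed at index r
        quicksort(x, lo, r - 1)        # sort the left range
        quicksort(x, r + 1, hi)        # sort the right range

x = [4, 7, 4, 2, 4, 6, 5]
quicksort(x, 0, len(x) - 1)
print(x)  # [2, 4, 4, 4, 5, 6, 7]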
We have a linked list of size L, and we want to retrieve the nth to the last element.
Solution 1: naive solution
make a first pass from the beginning to the end to compute L
make a second pass from the beginning to the expected position
Solution 2: use 2 pointers p1, p2
p1 starts iterating from the beginning, p2 does not move.
when there are n elements between p1 and p2, p2 starts iterating as well
when p1 arrives at the end of the list, p2 is at the expected position
Both solutions seem to have the same time complexity (i.e., 2L - n iterations over list elements).
Which one is better?
Both those algorithms are two-pass. The second may have better performance for reasonably small n because the second pass accesses memory that is already cached by the first pass. (The passes are interleaved.)
A one-pass solution would store the pointers in a circular buffer or queue, and return the "head" of the queue once the end of the list is reached.
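A sketch of that one-pass approach in Python, using collections.deque with maxlen as the circular buffer (the Node class here is illustrative):

from collections import deque

class Node:
    def __init__(self, value, next=None):
        self.value, self.next = value, next

def nth_to_last(head, n):
    window = deque(maxlen=n)        # circular buffer of the last n nodes
    node = head
    while node is not None:
        window.append(node)         # old nodes fall off the front
        node = node.next
    if len(window) < n:
        return None                 # the list has fewer than n elements
    return window[0]                # the "head" of the queue

# build 1 -> 2 -> 3 -> 4 -> 5, then fetch the 2nd to last element
head = None
for v in [5, 4, 3, 2, 1]:
    head = Node(v, head)
print(nth_to_last(head, 2).value)   # 4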
How about using 3 pointers p, q, r and a counter.
Iterate through the list with p, updating the counter.
Every n nodes, assign r to q and q to p.
When you hit the end of the list, you can figure out how far r is from the end of the list.
You can get the answer in no more than O(L + n) operations.
If n << L, solution 2 is typically faster, because of caching, i.e. the memory blocks containing p1 and p2 are copied to the CPU cache once and the pointers moved for a bunch of iterations before RAM needs to be accessed again.
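For reference, here is solution 2 as code (a sketch reusing the Node shape from the snippet above): p1 gets a head start of n nodes, then both pointers advance together, so they stay close in memory.

def nth_to_last_two_pointers(head, n):
    p1 = head
    for _ in range(n):               # give p1 a head start of n nodes
        if p1 is None:
            return None              # the list has fewer than n elements
        p1 = p1.next
    p2 = head
    while p1 is not None:            # when p1 falls off the end,
        p1, p2 = p1.next, p2.next    # p2 is n nodes from the end
    return p2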
Would it not be much cheaper to simply store the length of the linked list in O(1) memory? The only reason you have to do a "first pass" at all is because you don't know the length of your linked list. If you store the length, you could iterate over the (|L|-n) elements every time and retrieve the element easily. For higher values of n relative to L, this would save you a substantial amount of time. For example, if n were equal to |L|, you could simply return the head of the list with no iteration at all.
This method uses slightly more memory than your first algorithm since it stores the length in memory, but your second algorithm uses two pointers, whereas this method only uses 1 pointer. If you have the memory for a second pointer, you probably have the memory to store the length of your linked list.
Granted, O(|L|-n) is equivalent to O(|L|) in pure theory, but there are "fast" linear algorithms and then there are "slow" ones. Two-pass algorithms for this kind of problem are slow.
As @HotLicks pointed out in the comments, "One needs to understand that "big O" complexity is only loosely related to actual performance in many cases, since it ignores additive factors and constant multipliers." IMO just go for the laziest method in this case and don't overthink it.
As seen in the answers to Linear time majority algorithm?, it is possible to compute the majority of an array of elements in linear time and log(n) space.
As anybody who has seen that algorithm will probably agree, it is a cool technique. But does the idea generalize to new algorithms?
It seems the hidden power of this algorithm is in keeping a counter that plays a complex role -- such as "(count of majority element so far) - (count of second majority so far)". Are there other algorithms based on the same idea?
Umh, let's first try to understand why the algorithm works, in order to "isolate" the ideas in it.
The point of the algorithm is that if you have a majority element, then you can match each occurrence of it with a different element and still have some occurrences "spare".
So, we just keep a counter which counts the number of "spare" occurrences of our current guess.
If it reaches 0, then the guess is not a majority element for the subsequence that starts from when we "elected" it and runs to the "current" position.
Also, since our guess matches every other element occurrence in that subsequence, there is no majority element in that subsequence either.
Now, since:
our algorithm gives a correct answer only if there is a majority element, and
if there is a majority element, then it will still be one if we ignore the "current" subsequence whenever the counter goes to zero,
it is easy to see by contradiction that, if a majority element exists, then there is a suffix of the whole sequence over which the counter never gets to zero.
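Putting that into code, here is a compact Python version of the majority-vote algorithm as described above, with the second verification pass that is needed whenever a majority element is not guaranteed to exist:

def majority_element(seq):
    candidate, spares = None, 0
    for x in seq:
        if spares == 0:
            candidate, spares = x, 1   # "elect" a new candidate
        elif x == candidate:
            spares += 1                # one more unmatched occurrence
        else:
            spares -= 1                # pair it off with another element
    # Verification pass: the survivor is only a candidate.
    if candidate is not None and seq.count(candidate) > len(seq) // 2:
        return candidate
    return None                        # no majority element

print(majority_element([3, 1, 3, 2, 3, 3, 4]))  # 3
print(majority_element([1, 2, 3]))              # None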
Now: what's the idea that can be exploited in new O(1)-space, O(n)-time algorithms?
To me, you can apply this technique whenever you have to compute a property P of a sequence of elements which:
can be extended from seq[n, m] to seq[n, m+1] in O(1) time if Q(seq[n, m+1]) doesn't hold
can be computed in O(1) time and space from P(seq[n, j]) and P(seq[j, m]) if Q(seq[n, j]) holds
In our case, P is the number of "spare" occurrences of our "elected" candidate and Q is "P is zero".
If you see things in that way, longest common subsequence exploits the same idea (dunno about its "coolness factor" ;))
Jayadev Misra and David Gries have a paper called Finding Repeated Elements (ACM page) which generalizes it to an element repeating more than n/k times (k=2 is the majority problem).
Of course, this is probably very similar to the original problem, and you are probably looking for 'different' algorithms.
Here is an example which is possibly different.
Give an algorithm which will detect if a string of parentheses ( '(' and ')') is well formed.
I believe the standard solution is to maintain a counter.
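For reference, the counter solution as a short Python sketch: the counter tracks how many '(' are currently unmatched, must never go negative, and must end at zero.

def well_formed(s):
    depth = 0
    for c in s:
        if c == '(':
            depth += 1
        elif c == ')':
            depth -= 1
            if depth < 0:          # a ')' with no matching '('
                return False
    return depth == 0              # every '(' must have been closed

print(well_formed("(()())"))   # True
print(well_formed("())("))     # False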
Side note:
As to answers which claim cannot be constant space etc, ask them for the model of computation. In the WORD RAM model for instance, you assume the integers/array indices etc are O(1).
A lot of folks incorrectly mix and match models. For instance, they will happily have the input array of n integers be O(n), have an array index be O(1) space, but a counter they consider Omega(log n) etc, which is nonsense. If they want to consider the size in bits, then the input itself is Omega(n log n) etc.
For people who want to understand what this algorithm does and why it works: look at my detailed answer.
Here I will describe a natural extension of this algorithm (or a generalization). In the standard majority voting algorithm you have to find an element which appears at least n/2 times in the stream, where n is the size of the stream. You can do this in O(n) time (with a tiny constant) and O(log(n)) space (worst case, and even that is highly unlikely).
The generalized algorithm allows you to find the k most frequent items, where each of them appears at least n/(k+1) times in the original stream. Note that if k=1, you end up with the original problem.
The solution to this problem is really similar to the original one, except that instead of one counter and one possible element you maintain k counters and k possible elements. The logic goes in a similar way: you iterate through the array, and if the element is among the possible elements, you increase its counter; if one of the counters is zero, you substitute that counter's element with the new element; otherwise you just decrease the values.
As with the original majority voting algorithm, you need a guarantee that you really have these k majority elements; otherwise you have to do another pass over the array to verify that the possible elements you found are correct. Here is my Python attempt (I have not tested it thoroughly).
from collections import defaultdict

def majority_element_general(arr, k=1):
    counter, i = defaultdict(int), 0
    # Seed the counters with the first few elements.
    while len(counter) < k and i < len(arr):
        counter[arr[i]] += 1
        i += 1
    for item in arr[i:]:
        if item in counter:
            counter[item] += 1
        elif len(counter) < k:
            counter[item] = 1
        else:
            # Decrement every counter; drop the ones that reach zero.
            fields_to_remove = []
            for el in counter:
                if counter[el] > 1:
                    counter[el] -= 1
                else:
                    fields_to_remove.append(el)
            for el in fields_to_remove:
                del counter[el]
    potential_elements = counter.keys()
    # might want to check that they are really frequent.
    return potential_elements
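A quick usage sketch (the sample array is arbitrary), including the verification pass that the final comment alludes to:

arr = [1, 2, 1, 3, 1, 2, 2]
candidates = majority_element_general(arr, k=2)
threshold = len(arr) // (2 + 1)    # an item must appear > n/(k+1) times
confirmed = [el for el in candidates if arr.count(el) > threshold]
print(confirmed)                   # [1, 2]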
I'm taking an Algorithms class and the latest homework is really stumping me. Essentially, the assignment is to implement a version of merge sort that doesn't allocate as much temporary memory as the implementation in CLRS. I'm supposed to do this by creating only 1 temp array at the beginning, and put all the temp stuff in it while splitting and merging.
I should also note that the language of the class is Lua, which is important because the only available data structure is the table. Tables are like Java maps in that they hold key-value pairs, but they're like arrays in that you don't have to insert things in pairs: if you insert only one thing it's treated as a value, and its key will be what its index would be in a language with real arrays. At least that's how I understand it, since I'm new to Lua as well. Also, anything at all (primitives, strings, objects, etc.) can be a key, even different types in the same table.
Anyway, 2 things that are confusing me:
First, well, how is it done? Do you just keep overwriting the temp array with each recursion of splitting and merging?
Second, I'm really confused about the homework instructions (I'm auditing the class for free so I can't ask any of the staff). Here are the instructions:
Write a top level procedure merge_sort that takes as its argument the array to sort. It should declare a temporary array and then call merge_sort_1, a procedure of four arguments: The array to sort, the one to use as temporary space, and the start and finish indexes within which this call to merge_sort_1 should work.

Now write merge_sort_1, which computes the midpoint of the start–finish interval, and makes a recursive call to itself for each half. After that it calls merge to merge the two halves.

The merge procedure you write now will be a function of the permanent array and the temporary array, the start, the midpoint, and the finish. It maintains an index into the temporary array and indices i, j into each (sorted) half of the permanent array.

It needs to walk through the temporary array from start to finish, copying a value either from the lower half of the permanent array or from the upper half of the permanent array. It chooses the value at i in the lower half if that is less than or equal to the value at j in the upper half, and advances i. It chooses the value at j in the upper half if that is less than the value at i in the lower half, and advances j.

After one part of the permanent array is used up, be sure to copy the rest of the other part. The textbook uses a trick with an infinite value ∞ to avoid checking whether either part is used up. However, that trick is hard to apply here, since where would you put it?

Finally, copy all the values from start to finish in the temporary array back to the permanent array.
Number 2 is confusing because I have no idea what merge_sort_1 is supposed to do, and why it has to be a different method from merge_sort. I also don't know why it needs to be passed starting and ending indexes. In fact, maybe I misread something, but the instructions sound like merge_sort_1 doesn't do any real work.
Also, the whole assignment is confusing because I don't see from the instructions where the splitting is done to make 2 halves of the original array. Or am I misunderstanding mergesort?
I hope I've made some sense. Thanks everyone!
First, I would make sure you understand mergesort.
Look at this explanation, with fancy animations to help you understand it.
This is their pseudo code version of it:
# split in half
m = n / 2

# recursive sorts
sort a[1..m]
sort a[m+1..n]

# merge sorted sub-arrays using temp array
b = copy of a[1..m]
i = 1, j = m+1, k = 1
while i <= m and j <= n,
    a[k++] = (a[j] < b[i]) ? a[j++] : b[i++]
    → invariant: a[1..k] in final position
while i <= m,
    a[k++] = b[i++]
    → invariant: a[1..k] in final position
See how they use b to hold a temporary copy of the data?
What your teacher wants is for you to pass one table in to be used for this temporary storage.
Does that clear up the assignment?
Your main sort routine would look like this: (sorry, I don't know Lua, so I'll write some Javaish code)
void merge_sort(int[] array) {
    int[] t = ...allocate a temporary array...
    merge_sort_1(array, 0, array.length, t);
}
merge_sort_1 takes an array to sort, some start and finish indexes, and an array to use for some temporary space. It does the actual divide-and-conquer calls and calls to the merge routine. Note that the recursive calls need to go to merge_sort_1 and not merge_sort because you don't want to allocate the array on each recursive level, just once at the start of the merge sort procedure. (This is the whole point in dividing the merge sort into two routines.)
I'll leave it up to you to write a merge routine. It should take the original array that contains 2 sorted sub-parts and a temporary array, and sorts the original array. The easiest way to do that would be to merge into the temporary array, then just copy it back when done.
First, well, how is it done? Do you just keep overwriting the temp array with each recursion of splitting and merging?
Yes, the temp array keeps getting overwritten. The temp array is used during the merge phase to hold the merge results that are then copied back into the permanent array at the end of the merge.
Number 2 is confusing because I have no idea what merge_sort_1 is supposed to do, and why it has to be a different method from merge_sort.
merge_sort_1 is the recursive center of the recursive merge sort. merge_sort will only be a convenience function, creating the temp array and populating the initial start and finish positions.
I also don't know why it needs to be passed starting and ending indexes. In fact, maybe I misread something, but the instructions sound like merge_sort_1 doesn't do any real work.

Also, the whole assignment is confusing because I don't see from the instructions where the splitting is done to make 2 halves of the original array. Or am I misunderstanding mergesort?
The recursive function merge_sort_1 will only work on a portion of the passed in array. The portion it works on is defined by the start and ending indexes. The mid-point between the start and end is how the array is split and then split again on recursive calls. After the recursive calls for the upper and lower half are complete the two halves are merged into the temp array and then copied back to the permanent array.
I was able to write the merge sort in Lua as described and can comment on my implementation. It does seem as though the instructions were written as comments in or about the teacher's implementation.
Here is the merge_sort function. As I said, it is only a convenience function and I feel is not the meat of the problem.
-- Write a top level procedure merge_sort that takes as its argument
-- the array to sort.
function merge_sort(a)
    -- It should declare a temporary array and then call merge_sort_1,
    -- a procedure of four arguments: The array to sort, the one to use
    -- as temporary space, and the start and finish indexes within which
    -- this call to merge_sort_1 should work.
    merge_sort_1(a, {}, 1, #a)
end
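For comparison, here is a hedged Python sketch of the remaining two procedures, following the assignment's description step by step (the names and the inclusive-index convention are my assumptions; the Lua translation is left as the exercise intends):

def merge_sort(a):
    # Declare the temporary array once, then hand everything to merge_sort_1.
    merge_sort_1(a, [None] * len(a), 0, len(a) - 1)

def merge_sort_1(a, temp, start, finish):
    if start >= finish:                   # zero or one element: done
        return
    mid = (start + finish) // 2           # compute the midpoint...
    merge_sort_1(a, temp, start, mid)     # ...recurse on each half...
    merge_sort_1(a, temp, mid + 1, finish)
    merge(a, temp, start, mid, finish)    # ...then merge the two halves

def merge(a, temp, start, mid, finish):
    i, j = start, mid + 1                 # indices into the two sorted halves
    for k in range(start, finish + 1):    # walk the temporary array
        if j > finish or (i <= mid and a[i] <= a[j]):
            temp[k] = a[i]                # take from the lower half
            i += 1
        else:
            temp[k] = a[j]                # take from the upper half
            j += 1
    # Copy the merged values back to the permanent array.
    a[start:finish + 1] = temp[start:finish + 1]

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]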