I'm taking an Algorithms class and the latest homework is really stumping me. Essentially, the assignment is to implement a version of merge sort that doesn't allocate as much temporary memory as the implementation in CLRS. I'm supposed to do this by creating only 1 temp array at the beginning, and put all the temp stuff in it while splitting and merging.
I should also note that the language of the class is Lua, which is important because the only available data structures are tables. They're like Java maps in that they hold key-value pairs, but they're like arrays in that you don't have to insert things in pairs - if you insert only a value, its key will be what its index would be in a language with real arrays. At least that's how I understand it, since I'm new to Lua as well. Also, anything at all - primitives, strings, objects, etc. - can be a key, even different types in the same table.
Anyway, 2 things that are confusing me:
First, well, how is it done? Do you just keep overwriting the temp array with each recursion of splitting and merging?
Second, I'm really confused about the homework instructions (I'm auditing the class for free so I can't ask any of the staff). Here are the instructions:
Write a top level procedure merge_sort that takes as its argument the array to sort. It should declare a temporary array and then call merge_sort_1, a procedure of four arguments: The array to sort, the one to use as temporary space, and the start and finish indexes within which this call to merge_sort_1 should work.
Now write merge_sort_1, which computes the midpoint of the start–finish interval, and makes a recursive call to itself for each half. After that it calls merge to merge the two halves.
The merge procedure you write now will be a function of the permanent array and the temporary array, the start, the midpoint, and the finish. It maintains an index into the temporary array and indices i, j into each (sorted) half of the permanent array.
It needs to walk through the temporary array from start to finish, copying a value either from the lower half of the permanent array or from the upper half of the permanent array. It chooses the value at i in the lower half if that is less than or equal to the value at j in the upper half, and advances i. It chooses the value at j in the upper half if that is less than the value at i in the lower half, and advances j.
After one part of the permanent array is used up, be sure to copy the rest of the other part. The textbook uses a trick with an infinite value ∞ to avoid checking whether either part is used up. However, that trick is hard to apply here, since where would you put it?
Finally, copy all the values from start to finish in the temporary array back to the permanent array.
Number 2 is confusing because I have no idea what merge_sort_1 is supposed to do, and why it has to be a different method from merge_sort. I also don't know why it needs to be passed starting and ending indexes. In fact, maybe I misread something, but the instructions sound like merge_sort_1 doesn't do any real work.
Also, the whole assignment is confusing because I don't see from the instructions where the splitting is done to make 2 halves of the original array. Or am I misunderstanding mergesort?
I hope I've made some sense. Thanks everyone!
First, I would make sure you understand mergesort.
Look at this explanation, with fancy animations to help you understand it.
This is their pseudo code version of it:
# split in half
m = n / 2

# recursive sorts
sort a[1..m]
sort a[m+1..n]

# merge sorted sub-arrays using temp array
b = copy of a[1..m]
i = 1, j = m+1, k = 1
while i <= m and j <= n,
    a[k++] = (a[j] < b[i]) ? a[j++] : b[i++]
    → invariant: a[1..k] in final position
while i <= m,
    a[k++] = b[i++]
    → invariant: a[1..k] in final position
See how they use b to hold a temporary copy of the data?
What your teacher wants is for you to pass one table in to be used for this temporary storage.
Does that clear up the assignment?
Your main sort routine would look like this: (sorry, I don't know Lua, so I'll write some Javaish code)
void merge_sort(int[] array) {
    int[] t = ...allocate a temporary array...
    merge_sort_1(array, 0, array.length, t);
}
merge_sort_1 takes an array to sort, some start and finish indexes, and an array to use for some temporary space. It does the actual divide-and-conquer calls and calls to the merge routine. Note that the recursive calls need to go to merge_sort_1 and not merge_sort because you don't want to allocate the array on each recursive level, just once at the start of the merge sort procedure. (This is the whole point in dividing the merge sort into two routines.)
I'll leave it up to you to write a merge routine. It should take the original array that contains 2 sorted sub-parts and a temporary array, and sorts the original array. The easiest way to do that would be to merge into the temporary array, then just copy it back when done.
First, well, how is it done? Do you just keep overwriting the temp array with each recursion of splitting and merging?
Yes, the temp array keeps getting overwritten. The temp array is used during the merge phase to hold the merge results that are then copied back into the permanent array at the end of the merge.
Number 2 is confusing because I have no idea what merge_sort_1 is supposed to do, and why it has to be a different method from merge_sort.
merge_sort_1 is the recursive core of the merge sort. merge_sort is only a convenience function: it creates the temp array and supplies the initial start and finish positions.
I also don't know why it needs to be passed starting and ending indexes. In fact, maybe I misread something, but the instructions sound like merge_sort_1 doesn't do any real work.

Also, the whole assignment is confusing because I don't see from the instructions where the splitting is done to make 2 halves of the original array. Or am I misunderstanding mergesort?
The recursive function merge_sort_1 only works on the portion of the passed-in array defined by the start and end indexes. The midpoint between the start and end is where the array is split, and each half is split again on the recursive calls. After the recursive calls for the lower and upper halves are complete, the two halves are merged into the temp array and then copied back to the permanent array.
I was able to write the merge sort in Lua as described and can comment on my implementation. It does seem as though the instructions were written as if they were comments in or about the teacher's implementation.
Here is the merge_sort function. As I said, it is only a convenience function and, I feel, not the meat of the problem.
-- Write a top level procedure merge_sort that takes as its argument
-- the array to sort.
function merge_sort(a)
    -- It should declare a temporary array and then call merge_sort_1,
    -- a procedure of four arguments: The array to sort, the one to use
    -- as temporary space, and the start and finish indexes within which
    -- this call to merge_sort_1 should work.
    merge_sort_1(a, {}, 1, #a)
end
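And just as a sketch of where the rest goes (this follows the instructions above, but the details are my own choices, not necessarily the teacher's reference implementation):
function merge_sort_1(a, tmp, start, finish)
    -- Nothing to do for zero or one element.
    if finish - start < 1 then return end
    local mid = math.floor((start + finish) / 2)
    merge_sort_1(a, tmp, start, mid)        -- sort the lower half in place
    merge_sort_1(a, tmp, mid + 1, finish)   -- sort the upper half in place
    merge(a, tmp, start, mid, finish)       -- merge the two sorted halves
end

function merge(a, tmp, start, mid, finish)
    local i, j = start, mid + 1             -- indexes into the two sorted halves
    for k = start, finish do                -- walk the temp array from start to finish
        if j > finish or (i <= mid and a[i] <= a[j]) then
            tmp[k] = a[i]; i = i + 1        -- take from the lower half
        else
            tmp[k] = a[j]; j = j + 1        -- take from the upper half
        end
    end
    for k = start, finish do a[k] = tmp[k] end  -- copy back to the permanent array
end
The pair of conditions in the merge loop is what replaces the book's ∞ sentinel: once one half is used up, the rest of the other half gets copied automatically.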
Related
AFAIK, counting sort uses the following algorithm:
// A: input array
// B: output array
// C: counting array
sort(A,B,n,k)
1. for(i:k) C[i]=0;
2. for(i:n) ++C[A[i]];
3. for(i:k) C[i]+=C[i-1];
4. for(i:n-1..0) { B[C[A[i]]-1]=A[i]; --C[A[i]]; }
What if I remove steps 3 and 4, and do the following instead?
3. t=0; for(i:k) while(C[i]) { --C[i]; B[t++]=i; }
Full code is here; it looks fine, but I don't know which one has better performance.
Questions:
I guess the complexity of these two versions would be the same - is that true?
In steps 3 and 4 the first version needs to iterate n+k times, while the second one only needs to iterate n times. So does the second one have better performance?
Your code seems to be correct, and it will work for sorting numbers. But suppose you had an array of structures that you were sorting according to their keys. Your method will not work in that case, because it simply counts the frequency of a number and, while that count remains positive, assigns it to increasing indices in the output array. The classical method, however, will work for arrays of structures, objects, etc., because it calculates the position that each element should go to and then copies the data from the initial array to the output array.
To answer your questions:
1> Yes, the runtime complexity of your code will be the same, because for an array of size n and range 0...k, your inner loops run a total of f(0)+f(1)+...+f(k) = n times (where f denotes the frequency of a number) and the outer loop runs k times. Therefore the runtime is O(n+k), just like the classical version.
2> In terms of asymptotic complexity, both methods have the same performance. Due to the extra pass, the classical method's constants may be a bit higher, but that extra work is also what makes it a stable sort and gives it the benefits I pointed out earlier.
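To make the stability point concrete, here is a rough sketch of the classical version in Lua (my own translation, not from the question's linked code). The names mirror the pseudocode above: A is the input, B the output, C the counts, and the values are assumed to be integers in 0..k.
-- Classical, stable counting sort (1-based arrays, values in 0..k).
function counting_sort(A, k)
    local n = #A
    local B, C = {}, {}
    for v = 0, k do C[v] = 0 end                  -- step 1: clear counts
    for i = 1, n do C[A[i]] = C[A[i]] + 1 end     -- step 2: count occurrences
    for v = 1, k do C[v] = C[v] + C[v - 1] end    -- step 3: prefix sums = final positions
    for i = n, 1, -1 do                           -- step 4: walk backwards to stay stable
        B[C[A[i]]] = A[i]
        C[A[i]] = C[A[i]] - 1
    end
    return B
end
Because step 4 copies whole elements from A into their computed slots, the same loop keeps working when A holds records sorted by a key field, which is exactly what the shortcut version loses.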
In what seems to me a common implementation of quicksort, the program is composed of a partitioning subroutine and two recursive calls to quicksort those (two) partitions.
So the flow of control, in the quickest and pseudo-est of pseudocode, goes something like this:
quicksort[list, some parameters]
.
.
.
q=partition[some other parameters]
quicksort[1,q]
quicksort[q+1,length[list]]
.
.
.
End
The q is the "pivot" after a partitioning. That second quicksort call (the one that'll quicksort the second part of the list) also uses q. This is what I don't understand: if the "flow of control" goes through the first quicksort first, q is going to be updated. How is the same q going to work in the second quicksort, when it comes time to do the second parts of all those partitions?
I think my misunderstanding comes from the limitations of pseudocode. There are details that have been likely left out by expressing this implementation of the quicksort algorithm in pseudocode.
Edit 1 This seems related to my problem:
For[i = 1, i < 5, i = i + 1, Print[i]]
The first time through, we would get i=1, true, i=2, 1. Even though i was updated to 2, i is still 1 in body (i.e., Print[i]=1). This "flow of control" is what I don't understand. Where is the i=1 being stored when it increments to 2 and before it gets to body?
Edit 2
As an example of what I'm trying to get at, I'm pasting this here. It's from here.
Partition(A, p, r)
    x = A[r]
    i = p + 1
    j = r + 1
    while TRUE
        repeat j = j - 1
        until A[j] <= x
        repeat i = i + 1
        until A[i] >= x
        if i < j
            then exchange A[i] with A[j]
            else return j

Quicksort(A, 1, length[A])

Quicksort(A, p, r)
    if p < r
        then q = Partition(A, p, r)
             Quicksort(A, p, q)
             Quicksort(A, q+1, r)
Another example can be found here.
Where or when in these algorithms is q being put onto a stack?
q is not updated. The pivot remains in its place; in each iteration of quicksort, the only element that is guaranteed to be in its correct place is the pivot.
Also, note that the q which is "changed" during the recursive call is NOT actually changed, since it is a different variable stored in a different place: q is a local variable of the function, and a new one is generated for each call.
EDIT: [response to the question edit]
In quicksort, the algorithm actually generates a number of qs, which are stored on the stack. Every variable is 'alive' only within its own function call and is accessible [in this example] only from it. When the function ends, the local variable is released automatically, so you don't have just one pivot; you actually have a number of pivots, one for each recursive step.
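If it helps to see it in code, here is a sketch in Lua (the language of the other thread here) just to show the scoping; partition is assumed to return a split point q such that everything in a[p..q] is no larger than everything in a[q+1..r], as in the pseudocode quoted in the question.
function quicksort(a, p, r)
    if p < r then
        local q = partition(a, p, r)   -- q is local to *this* call
        quicksort(a, p, q)             -- inner calls create their own q...
        quicksort(a, q + 1, r)         -- ...so this call's q is still intact here
    end
end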
It turns out Quicksort demands extra memory precisely in order to do the bookkeeping you mentioned. Perhaps the following (pseudocode) iterative version of the algorithm might clear things up:
quicksort(array, begin, end) =
    intervals_to_sort = {(begin, end)}; //a set
    while there are intervals to sort:
        (begin, end) = remove an interval from intervals_to_sort
        if length of (begin, end) >= 2:
            q = partition(array, begin, end)
            add (begin, q) to intervals_to_sort
            add (q+1, end) to intervals_to_sort
You may notice that now the intervals to sort are being explicitly kept in a data structure (usually just an array, inserting and removing at the end, in a stack-like fashion) so there is no risk of "forgetting" about old intervals.
What might confuse you is that the most common description of Quicksort is recursive so the q variable appears multiple times. The answer to this is that every time a function is called it creates a new batch of local variables so it doesn't touch the old ones. In the end, the explicit stack from that previous imperative example ends up being implemented as an implicit stack with function variables.
(An interesting side note: some early programming languages didn't implement neat local variables like that, and Quicksort was actually first described using the iterative version with the explicit stack. It was only later that it was seen how Quicksort could be elegantly described as a recursive algorithm in Algol.)
As for the part after your edit, the i=1 is forgotten since assignment will destructively update the variable.
The partition code picks some value from the array (such as the value at the midpoint of the array ... your example code picks the last element) -- this is the pivot. It then puts all the values <= pivot on the left and all values >= pivot on the right, and then stores the pivot in the one remaining slot between them. At that point, the pivot is necessarily in the correct slot, q. Then the algorithm sorts the partition [p, q) and the partition [q+1, r), which are disjoint but cover all of A except q, resulting in the entire array being sorted.
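Here is a sketch of that kind of partition in Lua - the Lomuto-style scheme described here, with the last element as pivot. (Note this is my own illustration: the code quoted in the question uses a different, Hoare-style partition that does not place the pivot at its final slot.)
function partition(a, p, r)
    local pivot = a[r]
    local q = p                         -- q will become the pivot's final slot
    for i = p, r - 1 do
        if a[i] <= pivot then
            a[i], a[q] = a[q], a[i]     -- move small values to the left
            q = q + 1
        end
    end
    a[q], a[r] = a[r], a[q]             -- drop the pivot into the remaining slot
    return q
end
With this scheme the recursive calls become quicksort(a, p, q - 1) and quicksort(a, q + 1, r), since a[q] is already in its final position.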
The setup: I have two arrays which are not sorted and are not of the same length. I want to see if one of the arrays is a subset of the other. Each array is a set in the sense that there are no duplicates.
Right now I am doing this sequentially in a brute-force manner, so it isn't very fast. I have been having trouble finding any algorithms online that A) go faster and B) are in parallel. Say the maximum size of either array is N; right now it scales something like N^2. I was thinking maybe if I sorted them and did something clever I could bring it down to something like N log(N), but I'm not sure.
The main thing is I have no idea how to parallelize this operation at all. I could just do something like each processor looks at an equal amount of the first array and compares those entries to all of the second array, but I'd still be doing N^2 work. But I guess it'd be better since it would run in parallel.
Any Ideas on how to improve the work and make it parallel at the same time?
Thanks
Suppose you are trying to decide if A is a subset of B, and let len(A) = m and len(B) = n.
If m is a lot smaller than n, then it makes sense to me that you sort A, and then iterate through B, doing a binary search into A for each element to see if there is a match or not. You can partition B into k parts and have a separate thread iterate through each part doing the binary searches.
To count the matches you can do 2 things. Either you could have a num_matched variable that is incremented every time you find a match (you would need to guard this variable with a mutex, though, which might hinder your program's concurrency) and then check if num_matched == m at the end of the program. Or you could have another array or bit vector of size m, and have a thread set the k'th entry if it found a match for the k'th element of A. Then at the end, you make sure this array is all 1's. (On second thought, the bit vector might not work out without a mutex, because threads might overwrite each other's updates when they write the integer containing the bit relevant to them.) The array approach, at least, would not need any mutex that could hinder concurrency.
Sorting would cost you mLog(m) and then, if you only had a single thread doing the matching, that would cost you nLog(m). So if n is a lot bigger than m, this would effectively be nLog(m). Your worst case still remains NLog(N), but I think concurrency would really help you a lot here to make this fast.
Summary: Just sort the smaller array.
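A single-threaded sketch of this idea in Lua (the function name is my own; parallelizing would mean handing each thread its own slice of B): sort the smaller array A, binary-search each element of B in it, and mark the matches.
function is_subset(A, B)                          -- is every element of A also in B?
    local sorted = {}
    for i, v in ipairs(A) do sorted[i] = v end
    table.sort(sorted)                            -- m log m
    local matched = {}                            -- matched[i] = true once sorted[i] was seen in B
    for _, v in ipairs(B) do                      -- n iterations...
        local lo, hi = 1, #sorted
        while lo <= hi do                         -- ...each doing a log m binary search
            local mid = math.floor((lo + hi) / 2)
            if sorted[mid] == v then matched[mid] = true; break
            elseif sorted[mid] < v then lo = mid + 1
            else hi = mid - 1 end
        end
    end
    for i = 1, #sorted do
        if not matched[i] then return false end   -- some element of A never showed up in B
    end
    return true
end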
Alternatively if you are willing to consider converting A into a HashSet (or any equivalent Set data structure that uses some sort of hashing + probing/chaining to give O(1) lookups), then you can do a single membership check in just O(1) (in amortized time), so then you can do this in O(n) + the cost of converting A into a Set.
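The hashed version is even shorter in Lua, since tables already hash their keys. This is just a sketch of one way to arrange it: here the set is built from B and A is tested against it; building it from A and marking matches while scanning B works just as well.
function is_subset_hashed(A, B)
    local inB = {}
    for _, v in ipairs(B) do inB[v] = true end    -- O(n) to build the set
    for _, v in ipairs(A) do                      -- O(m) membership checks
        if not inB[v] then return false end
    end
    return true
end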
If I have N arrays, what is the best (in time complexity; space is not important) way to find the common elements? You could just find 1 element and stop.
Edit: The elements are all Numbers.
Edit: These are unsorted. Please do not sort and scan.
This is not a homework problem. Somebody asked me this question a long time ago. He was using a hash to solve the problem and asked me if I had a better way.
Create a hash index, with elements as keys, counts as values. Loop through all values and update the count in the index. Afterwards, run through the index and check which elements have count = N. Looking up an element in the index should be O(1), combined with looping through all M elements should be O(M).
If you want to keep order specific to a certain input array, loop over that array and test the element counts in the index in that order.
Some special cases:
if you know that the elements are (positive) integers with a maximum value that is not too high, you could just use a normal array as the "hash" index to keep counts, where the numbers themselves are the array indexes.
I've assumed that in each array each number occurs only once. Adapting it for more occurrences should be easy (set the i-th bit in the count for the i-th array, or only update if the current element count == i-1).
EDIT: when I answered this, the question did not yet include the part about "a better way" than hashing.
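A rough sketch of the counting approach in Lua, under the same assumption stated above that no element repeats within a single array:
function common_elements(arrays)
    local count = {}
    for _, arr in ipairs(arrays) do               -- loop through all N arrays
        for _, v in ipairs(arr) do
            count[v] = (count[v] or 0) + 1        -- tally how many arrays contain v
        end
    end
    local common = {}
    for v, c in pairs(count) do
        if c == #arrays then common[#common + 1] = v end
    end
    return common                                 -- values present in every array
end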
The most direct method is to intersect the first 2 arrays and then intersect the result with each of the remaining N-2 arrays.
If 'intersection' is not defined in the language in which you're working, or you require a more specific answer (i.e. you need the answer to 'how do you do the intersection'), then modify your question accordingly.
Without sorting there isn't an optimized way to do this based on the information given (i.e. sorting would position all elements relative to each other, and you could then iterate over the arrays checking for elements present in all of them at once).
The question asks if there is a better way than hashing. There is no better way (i.e. better time complexity) than doing a hash, as the time to hash each element is typically constant. Empirical performance is also favorable, particularly if the range of values can be mapped one-to-one to an array maintaining counts. The time is then proportional to the number of elements across all the arrays. Sorting will not give better complexity, since it still needs to visit each element at least once, and then there is the log N for sorting each array.
Back to hashing, from a performance standpoint, you will get the best empirical performance by not processing each array fully, but processing only a block of elements from each array before proceeding onto the next array. This will take advantage of the CPU cache. It also results in fewer elements being hashed in favorable cases when common elements appear in the same regions of the array (e.g. common elements at the start of all arrays.) Worst case behaviour is no worse than hashing each array in full - merely that all elements are hashed.
I don't think the approach suggested by catchmeifyoutry will work.
Let us say you have two arrays
1: {1,1,2,3,4,5}
2: {1,3,6,7}
then the answer should be 1 and 3. But if we use the hashtable approach, 1 will have count 3 and we will never find 1 in this situation.
Also, the problem becomes more complex if we have input like this:
1: {1,1,1,2,3,4}
2: {1,1,5,6}
Here I think we should give the output as 1,1. The suggested approach fails in both cases.
Solution: read the first array and put it into a hashtable. If we find the same key again, don't increment the counter. Read the second array in the same manner. Now the hashtable holds the common elements, which have a count of 2.
But again, this approach will fail on the second input set which I gave earlier.
I'd first start with the degenerate case, finding common elements between 2 arrays (more on this later). From there I'll have a collection of common values which I will use as an array itself and compare it against the next array. This check would be performed N-1 times or until the "carry" array of common elements drops to size 0.
One could speed this up, I'd imagine, by divide-and-conquer, splitting the N arrays into the end nodes of a tree. The next level up the tree is N/2 common element arrays, and so forth and so on until you have an array at the top that is either filled or not. In either case, you'd have your answer.
Without sorting and scanning the best operational speed you'll get for comparing 2 arrays for common elements is O(N^2).
Say, I employ merge sort to sort an array of Integers. Now I need to also remember the positions that elements had in the unsorted array, initially. What would be the best way to do this?
A very naive and space-consuming way to do this (in C) would be to maintain each number as a "structure" with another number storing its index:
struct integer {
    int value;
    int orig_pos;
};
But, obviously there are better ways. Please share your thoughts and solution if you have already tackled such problems. Let me know if you would need more context. Thank you.
Clearly for an N-long array you do need to store SOMEwhere N integers -- the original position of each item, for example; any other way to encode "1 out of N!" possibilities (i.e., what permutation has in fact occurred) will also take at least O(N) space (since, by Stirling's approximation, log(N!) is about N log(N)...).
So, I don't see why you consider it "space consuming" to store those indices most simply and directly. Of course there are other possibilities (taking similar space): for example, you might make a separate auxiliary array of the N indices and sort THAT auxiliary array (based on the value at that index) leaving the original one alone. This means an extra level of indirectness for accessing the data in sorted order, but can save you a lot of data movement if you're sorting an array of large structures, so there's a performance tradeoff... but the space consumption is basically the same!-)
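For what it's worth, that auxiliary-index idea is only a few lines in Lua (a sketch; the original array a is left untouched):
function index_sort(a)
    local idx = {}
    for i = 1, #a do idx[i] = i end
    -- sort the indexes by the values they refer to; a itself is not moved
    table.sort(idx, function(x, y) return a[x] < a[y] end)
    return idx   -- a[idx[1]] is the smallest value, and idx[1] is its original position
end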
Is the struct such a bad idea? The alternative, to me, would be an array of pointers.
It feels to me that in this question you have to consider the age old question: speed vs size. In either case, you are keeping both a new representation of your data (the sorted array) and an old representation of your data (the way the array use to look), so inherently your solution will have some data replication. If you are sorting n numbers, and you need to remember after they were sorted where those n numbers were, you will have to store n amount of information somewhere, there is no getting around that.
As long as you accept that you are doubling the amount of space you need to be able to keep this old data, then you should consider the specific application and decide what will be faster. One option is to just make a copy of the array before you sort it, however resolving which was where later might turn into a O(N) problem. From that point of view your suggestion of adding another int to your struct doesn't seem like such a bad idea, if it fits with the way you will be using the data later.
This looks like the case where I use an index sort. The following C# example shows how to do it with a lambda expression. I am new at using lambdas, but they can do some complex tasks very easily.
// first, some data to work with
List<double> anylist = new List<double>();
anylist.Add(2.18); // add a value
... // add many more values
// index sort
IEnumerable<int> serial = Enumerable.Range(0, anylist.Count);
int[] index = serial.OrderBy(item => (anylist[item])).ToArray();
// how to use
double FirstValue = anylist[index[0]];
double SecondValue = anylist[index[1]];
And, of course, anylist is still in the original order.
you can do it the way you proposed
you can also retain a copy of the original unsorted array (meaning you may use a sorting algorithm that is not in-place)
you can create an additional array containing only the original indices
All three ways are equally space consuming; there is no "better" way. You may use short instead of int to save space if your array won't get >65k elements (but be aware of structure padding with your suggestion).