I'm new to OpenMP and I'm trying to implement quicksort with it.
I understand that I have to select one pivot and then have the threads work on equal parts of the array in parallel. Then I have to reorder the array. However, after doing this I need to know the indexes at which the threads are supposed to start and stop in the next step.
I've found that you have to use a prefix sum somehow to keep track of those indexes, but I couldn't figure out how to use it.
I would very much appreciate any help.
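A minimal sketch of how the prefix sum fits in, with my own made-up function name and layout (treat it as an assumption, not the canonical implementation): each thread counts how many elements of its chunk fall on each side of the pivot, and an exclusive prefix sum over those per-thread counts gives every thread the exact output index at which it starts writing. The running total is also the split point where the threads start and stop in the next step.

    #include <omp.h>
    #include <cstddef>
    #include <vector>

    // Hypothetical helper: partition a[] around `pivot` into out[] (same size)
    // using per-thread counts and an exclusive prefix sum for write offsets.
    // Returns the split index where the ">= pivot" half begins.
    std::size_t parallel_partition(const std::vector<int>& a, int pivot,
                                   std::vector<int>& out) {
        const std::size_t n = a.size();
        const int maxT = omp_get_max_threads();
        std::vector<std::size_t> less(maxT, 0), geq(maxT, 0);
        std::vector<std::size_t> lessOff(maxT, 0), geqOff(maxT, 0);
        std::size_t totalLess = 0;

        #pragma omp parallel
        {
            const int t = omp_get_thread_num();
            const int nt = omp_get_num_threads();

            // Pass 1: each thread counts its "< pivot" and ">= pivot" elements.
            // schedule(static) guarantees both passes get the same iteration split.
            #pragma omp for schedule(static)
            for (std::size_t i = 0; i < n; ++i)
                ++(a[i] < pivot ? less[t] : geq[t]);

            // One thread turns the counts into exclusive prefix sums; the
            // implicit barriers around `single` keep all threads in step.
            #pragma omp single
            {
                for (int k = 1; k < nt; ++k) {
                    lessOff[k] = lessOff[k - 1] + less[k - 1];
                    geqOff[k]  = geqOff[k - 1]  + geq[k - 1];
                }
                totalLess = lessOff[nt - 1] + less[nt - 1];
            }

            // Pass 2: every thread scatters into its own disjoint output
            // ranges, so no locking is needed.
            std::size_t lo = lessOff[t];
            std::size_t hi = totalLess + geqOff[t];
            #pragma omp for schedule(static)
            for (std::size_t i = 0; i < n; ++i)
                out[a[i] < pivot ? lo++ : hi++] = a[i];
        }
        return totalLess;  // recurse on out[0, totalLess) and out[totalLess, n)
    }

You would then recurse on the two halves (copying out back into a, or ping-ponging between the two buffers), splitting the thread team accordingly.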
I have an algorithm that stores elements in a boost::ptr_vector. It is important for the algorithm that, once allocated, pointers to elements do not change until the ptr_vector is destroyed. On the other hand, I need to sort the ptr_vector. I assumed (maybe naively) that since a regular std::sort swaps elements, it would simply swap the order of the pointers inside the ptr_vector without new and delete. On the other hand, I see in this post indications that sorting a ptr_vector may actually change element pointers. Can someone confirm that reallocation actually happens? Is there a way to avoid it?
I now think that my fears may not be justified. It seems that the post that prompted my concerns referred to the standard library sort, which indeed would cause swapping and reallocation of elements. But ptr_vector has a member-function implementation of sort, and I have every reason to expect that it preserves element pointers and avoids reallocations.
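As a quick sanity check, here is a minimal sketch (I haven't verified the Boost internals, only the observable behaviour the member sort should have if it merely permutes the stored pointers):

    #include <boost/ptr_container/ptr_vector.hpp>
    #include <cassert>

    int main() {
        boost::ptr_vector<int> v;
        v.push_back(new int(3));
        v.push_back(new int(1));
        v.push_back(new int(2));

        const int* p = &v[1];  // address of the element holding 1

        // Member sort: expected to reorder the internal pointer array
        // without copying or reallocating the pointees themselves.
        v.sort();

        assert(v[0] == 1 && v[1] == 2 && v[2] == 3);
        assert(&v[0] == p);  // the element's own address is unchanged
        return 0;
    }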
The pointer vector, as a container, is obviously what is being sorted.
If you want all iterators and element references to remain valid, you should be using
boost::stable_vector
or possibly some combination with Boost.MultiIndex.
Sadly (?) I don't think there is currently anything that combines the two concepts. Of course such a thing could be written, and Boost.Intrusive could be highly instrumental in managing the implementation.
In short, is the cost (in time and CPU) higher to call kind_of? twice, or to create a new array with one value and then iterate through it? The 'backstory' below simply details why I need to know this, but it is not a necessary read to answer the question.
Backstory:
I have a bunch of location data. Latitude/longitude pairs and the name of the place they represent. I need to sort these lat/lon values by distance from another lat/lon pair provided by a user. I have to calculate the distances on the fly, and they aren't known before.
I was thinking it would be easy to do this by adding the distance => placename mapping to a hash, then getting the keys, sorting them, and reading out the values in that order. However, two distances could potentially be equal, which would make two keys collide.
I have come up with two solutions to this. Either I check on insert:

    if hash.has_key?(distance)
      if hash[distance].kind_of?(Array)
        hash[distance] << placename
      else
        hash[distance] = [hash[distance], placename]
      end
    else
      hash[distance] = placename
    end
then each time I read a value I have to check whether it is a single placename or an array, e.g.

    names = hash[distance].kind_of?(Array) ? hash[distance] : [hash[distance]]

Or I could make each value an array from the start, even if it holds only one placename.
You've probably spent more time thinking about the issue than you will ever save in CPU time. Developer brain time (both yours and others who will maintain the code when you're gone) is often much more precious than CPU cycles. Focus on code clarity.
If you get indications that your code is a bottleneck, it may be a good idea to benchmark it, but don't forget to benchmark both before and after any changes you make, to make sure that you are actually improving the code. It is surprising how often "optimizations" don't improve the code at all and just make it harder to read.
To be honest, this sounds like a very negligible performance issue, so I'd say just go with whatever feels better to you.
If you really believe that this has a real world performance impact (and frankly, there are other areas of Ruby you should worry more about speed-wise), reduce your problem to the simplest form that still resembles your problem and use the Benchmark module:
http://www.ruby-doc.org/stdlib/libdoc/benchmark/rdoc/index.html
I would bet that you'll achieve both higher performance and better legibility using the built-in Enumerable#group_by method.
As others have said, it's likely that this isn't a bottleneck, that gains will be negligible in any case and that you should focus on other things!
As a learning exercise, I've just had an attempt at implementing my own merge sort algorithm. I did this on a std::list, which apparently already has the member functions sort() and merge() built in. However, I'm planning on moving this over to a linked list of my own making, so the implementation is not particularly important.
The problem lies with the fact that a std::list doesn't provide random access to its nodes, only access to the front/back and stepping through. I was originally planning on somehow performing a simple binary search through this list and finding my answer in a few steps.
The fact that a std::list already has built-in functions for performing this kind of ordering leads me to believe that there is an equally easy way to access the list in the way I want.
Anyway, thanks for your help in advance!
The way a linked list works is that you step through the items in the list one at a time; by its very nature there is no constant-time way to access a "random" element. The sort method you refer to works by stepping through the nodes one at a time and relinking them into the correct order, so it never needs random access either.
You'll need to store the data differently if you want to access it randomly. Perhaps an array of the elements you're storing.
Further information on linked lists: http://en.wikipedia.org/wiki/Linked_list
A merge sort doesn't require access to random elements, only to elements from one end of the list.
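To illustrate, here is a minimal merge sort over a hand-rolled singly linked list (the Node type is hypothetical, standing in for "a linked list of my own making"). The split uses a slow/fast pointer walk instead of a binary search, and the merge only ever inspects the front of each half:

    // Hypothetical node type for a hand-rolled singly linked list.
    struct Node {
        int value;
        Node* next;
    };

    // Split the list in two: when `fast` hits the end, `slow` is at the
    // midpoint. Sequential stepping only; no random access required.
    static Node* split(Node* head) {
        Node* slow = head;
        Node* fast = head->next;
        while (fast && fast->next) {
            slow = slow->next;
            fast = fast->next->next;
        }
        Node* second = slow->next;
        slow->next = nullptr;  // cut the list in two
        return second;
    }

    // Merge two sorted lists by repeatedly taking the smaller front element.
    static Node* merge(Node* a, Node* b) {
        Node dummy{0, nullptr};
        Node* tail = &dummy;
        while (a && b) {
            Node*& smaller = (a->value <= b->value) ? a : b;
            tail->next = smaller;
            tail = smaller;
            smaller = smaller->next;
        }
        tail->next = a ? a : b;
        return dummy.next;
    }

    Node* merge_sort(Node* head) {
        if (!head || !head->next) return head;
        Node* second = split(head);
        return merge(merge_sort(head), merge_sort(second));
    }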
I am doing an assignment using MPI to implement the Game of Life. I was wondering if I should use block-row partitioning, cyclic row partitioning, or block-checkerboard partitioning?
What are the pros and cons of these types of partitioning? I tried to find references on partitioning (which seems to tie in with parallel processing in general), but it was difficult to find any without going way over my head. :)
Try the one that fits your needs best; since it is an assignment, you should try the simplest one first and do the others as time allows.
However you do it, don't forget to make your partitions bigger on each side with some overlap (often called ghost or halo cells).
This means duplicating some data, but it also means each partition can compute a step independently. At the end of each tick, the partitions copy their overlap to their neighbours.
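A sketch of that overlap exchange under block-row partitioning; the function name and storage layout are my own assumptions. Each rank stores its block with one extra ghost row above and below and refreshes them from its neighbours every tick:

    #include <mpi.h>
    #include <vector>

    // grid holds (local_rows + 2) * width cells: row 0 and row local_rows + 1
    // are ghost rows mirroring the neighbours' boundary rows.
    void exchange_ghost_rows(std::vector<int>& grid, int local_rows, int width,
                             MPI_Comm comm) {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        int up   = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;  // no-op at edges
        int down = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        // Send my top real row up; receive my bottom ghost row from below.
        MPI_Sendrecv(&grid[1 * width], width, MPI_INT, up, 0,
                     &grid[(local_rows + 1) * width], width, MPI_INT, down, 0,
                     comm, MPI_STATUS_IGNORE);
        // Send my bottom real row down; receive my top ghost row from above.
        MPI_Sendrecv(&grid[local_rows * width], width, MPI_INT, down, 1,
                     &grid[0], width, MPI_INT, up, 1,
                     comm, MPI_STATUS_IGNORE);
    }

With the ghost rows in place, every rank can compute a full Game of Life step on its own block without further communication.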
I am now looking at my old school assignment and want to find the solution of a question.
Which sorting method is most suitable for parallel processing?
Bubble sort
Quick sort
Merge sort
Selection sort
I guess quick sort (or merge sort?) is the answer.
Am I correct?
Like merge sort, quicksort can also be easily parallelized due to its divide-and-conquer nature. Individual in-place partition operations are difficult to parallelize, but once divided, different sections of the list can be sorted in parallel.
One advantage of parallel quicksort over other parallel sort algorithms is that no synchronization is required. A new thread is started as soon as a sublist is available for it to work on and it does not communicate with other threads. When all threads complete, the sort is done.
http://en.wikipedia.org/wiki/Quicksort
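To make that concrete, here is a hedged OpenMP sketch of the idea (the cutoff and structure are my own choices, not part of the answer above): after each partition step the two halves are independent, so one of them is handed off as a task:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    void quicksort(std::vector<int>& a, std::ptrdiff_t lo, std::ptrdiff_t hi) {
        if (hi - lo < 2) return;
        const int pivot = a[lo + (hi - lo) / 2];
        auto first = a.begin() + lo, last = a.begin() + hi;
        // Three-way split [< pivot][== pivot][> pivot] guarantees progress,
        // since the pivot elements are excluded from both recursions.
        auto mid1 = std::partition(first, last, [&](int x) { return x < pivot; });
        auto mid2 = std::partition(mid1, last, [&](int x) { return x == pivot; });
        const std::ptrdiff_t m1 = mid1 - a.begin(), m2 = mid2 - a.begin();

        // Spawn a task for the left half only when it is big enough to pay off.
        #pragma omp task shared(a) if (m1 - lo > 1024)
        quicksort(a, lo, m1);
        quicksort(a, m2, hi);   // current thread keeps the right half
        #pragma omp taskwait    // wait for the spawned half before returning
    }

    void parallel_sort(std::vector<int>& a) {
        #pragma omp parallel
        #pragma omp single      // one thread seeds the task tree
        quicksort(a, 0, static_cast<std::ptrdiff_t>(a.size()));
    }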
It depends completely on the method of parallelization. For multithreaded general computing, a merge sort provides pretty reliable load balancing and memory localization properties. For a large sorting network in hardware, a form of Batcher, Bitonic, or Shell sort is actually best if you want good O(log² n) performance.
I think merge sort: you can divide the dataset and perform the operations on the parts in parallel.
I think merge sort would be the best answer here, because the basic idea behind merge sort is to divide the problem into independent subproblems, solve them, and merge the results.
That's what we actually do in parallel processing too: divide the whole problem into small units that can be computed in parallel and then join the results.
Just a couple of random remarks:
Many discussions of how easy it is to parallelize quicksort ignore the pivot selection. If you traverse the array to find it, you've introduced a linear time sequential component.
Quicksort is not easy to implement at all in distributed memory. There is a discussion in the Kumar book (Introduction to Parallel Computing by Grama, Gupta, Karypis, and Kumar).
Yeah, I know, one should not use bubble sort. But "odd-even transposition sort", which is more or less equivalent, is actually a pretty good parallel programming exercise, in particular for distributed-memory parallelism. It is the easiest example of a sorting network, which is very doable in MPI and the like.
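For reference, a shared-memory sketch of odd-even transposition sort (an MPI version would exchange boundary elements between ranks instead; this OpenMP form is just the simplest illustration). In each phase the compared pairs are disjoint, so the whole phase runs in parallel:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // n phases suffice to sort n elements: even phases compare (0,1), (2,3), ...
    // and odd phases compare (1,2), (3,4), ... No pair overlaps within a phase.
    void odd_even_sort(std::vector<int>& a) {
        const std::size_t n = a.size();
        if (n < 2) return;
        for (std::size_t phase = 0; phase < n; ++phase) {
            const std::size_t start = phase % 2;
            #pragma omp parallel for
            for (std::size_t i = start; i < n - 1; i += 2)
                if (a[i] > a[i + 1])
                    std::swap(a[i], a[i + 1]);
        }
    }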
It is merge sort, since the sorting is done on two sub-arrays that are then compared and merged at the end; the two halves can be sorted in parallel.