Sorting techniques actually fall into two types according to memory usage: internal and external.
Insertion, selection, and exchange sorts are internal sorts; that means they are processed entirely in internal memory.
But I don't know where merge sort fits: is it internal or external?
You can certainly write a completely internal merge sort. See https://www.geeksforgeeks.org/merge-sort/ for an example.
People often talk about an "external merge sort", but that often works out to a two-pass sorting technique where you successively load small portions of a large file into memory, sort them, and write them to disk. In the second pass, you merge those multiple portions into a single sorted file. See https://en.wikipedia.org/wiki/External_sorting for details.
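To make the two-pass idea concrete, here is a minimal Python sketch, assuming one sortable record per line of a text file that ends with a newline; the chunk size and file handling are illustrative, not tuned:

```python
import heapq
import itertools
import os
import tempfile

def external_sort(input_path, output_path, chunk_size=100_000):
    """Two-pass external merge sort for a text file with one record per line."""
    run_paths = []

    # Pass 1: read fixed-size chunks, sort each one in memory,
    # and write every sorted run to its own temporary file.
    with open(input_path) as src:
        while True:
            chunk = list(itertools.islice(src, chunk_size))
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as run:
                run.writelines(chunk)
            run_paths.append(path)

    # Pass 2: k-way merge of the sorted runs into a single sorted output file.
    runs = [open(p) for p in run_paths]
    try:
        with open(output_path, "w") as dst:
            dst.writelines(heapq.merge(*runs))
    finally:
        for f in runs:
            f.close()
        for p in run_paths:
            os.remove(p)
```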
My motivation for this is to write some Z80 Assembly code to sort the TI-83+ series' Variable Allocation Table (VAT), but I am also interested in this as a general problem.
The part of the VAT that I want to sort is arranged in contiguous memory, with each element consisting of some fixed-size data, followed by a size byte for the name, and then the name itself. To complicate matters, there are two stacks located on either side of the VAT, offering no wiggle room to safely pad it with allocated RAM.
Ideally, I'd want to use O(1) space, as I have ready access to two 768-byte non-user RAM buffers. I also want to make it fast, as the table can contain many entries and this is a 6MHz processor (effectively 1 MIPS, though, since there is no instruction pipeline). It's also important to note that each entry is at least 8 bytes and at most 15 bytes.
The best approach that I've been able to think up relies on block memory transfers, which aren't particularly fast on the Z80. In the past, others have implemented an insertion sort, but it wasn't particularly efficient. And while I can write (and have written) code to collect pointers to all of the entries into an array and sort those, it requires a variable amount of space, so I have to allocate user RAM, which is already in short supply.
The problem vaguely reminds me of some combinatorial trick I came across once, but for the life of me, a good solution has evaded me. Any help would be much appreciated.
Divide the table into N pieces, each of which is small enough to be sorted by your existing code using the fixed-size temporary buffers available. Then perform a merge sort on the N sorted pieces to produce the final result.
Instead of an N-way merge, it may be easiest to combine the N pieces pairwise using 2-way merges.
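To illustrate the pairwise idea in a language-agnostic way (Python here purely for readability; the real thing would of course be Z80 code working on the table in place), the pieces can be combined in rounds of 2-way merges:

```python
def merge_two(a, b):
    """Standard 2-way merge of two already-sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def merge_pairwise(runs):
    """Merge adjacent sorted runs in rounds of 2-way merges until one remains."""
    while len(runs) > 1:
        nxt = [merge_two(runs[k], runs[k + 1]) for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:          # odd run out this round: carry it forward
            nxt.append(runs[-1])
        runs = nxt
    return runs[0] if runs else []
```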
When sorting each piece, it may be advantageous to use hash codes to avoid string comparisons. Radix sorting might also provide some benefit.
For copying data the Z-80's block move instructions LDIR and LDDR are fairly expensive but hard to beat. Unrolling LDIR into a series of LDI can be faster. Pointing the stack pointer at source and destination and using multiple POP and then PUSH can be faster but requires interrupts be disabled and a guarantee of no non-maskable interrupts occurring.
Sometimes I have a data structure whose membership check is O(N), such as a queue, stack, or heap. I use one of these structures in a program that just needs to check whether a certain element is present in it, but because lookup is O(N), that check becomes the bottleneck in my algorithm.
If memory isn't much of a worry, would it be poor design to keep a hash map that tracks the elements currently in the restricted data structure? Doing this would essentially remove the O(N) restriction and make the check O(1).
Having a supplemental hash table is warranted in many situations. However, maintaining a "parallel hash" can become a liability, since both structures have to be updated together on every insertion and removal or they will fall out of sync.
The situation that you describe, where you need to check membership quickly, is often modeled with a hash-based set (HashSet<T>, std::unordered_set<T>, and so on, depending on the language). The disadvantages of these structures are that the order of elements is not specified and that they cannot contain duplicates.
Depending on the library, you may have access to data structures that fix these shortcomings. For example, Java offers LinkedHashSet<T> which provides a predictable order of enumeration, and C++ provides std::unordered_multiset<T>, which allows duplicates.
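As a sketch of the supplemental-structure idea, here the class and its method names are invented for illustration (Python, but the same pattern applies to a Java ArrayDeque plus HashMap or the C++ equivalents): a queue carries a parallel count map that is updated on every push and pop, so a membership check is O(1) on average and duplicates are still allowed:

```python
from collections import deque

class QueueWithMembership:
    """FIFO queue plus a parallel count map of its elements for O(1) 'contains'.

    Both structures must be updated together, or they fall out of sync.
    """
    def __init__(self):
        self._items = deque()
        self._counts = {}

    def push(self, x):
        self._items.append(x)
        self._counts[x] = self._counts.get(x, 0) + 1

    def pop(self):
        x = self._items.popleft()
        if self._counts[x] == 1:
            del self._counts[x]
        else:
            self._counts[x] -= 1
        return x

    def __contains__(self, x):      # average O(1) instead of O(N)
        return x in self._counts
```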
Given a cloud storage folder with, say, 1PB of data in it, what would be the quickest way to sort all of that data? It's easy to sort small chunks of it, but merging them into a larger sorted output will take longer, since at some point a single process has to merge the whole thing. I would like to avoid this and have a fully distributed solution. Is there a way? If so, is there any implementation that would be suitable for sorting data in S3?
Since the amount of data you need to sort exceeds RAM (by a lot), the only reasonable way (to my knowledge) is to sort chunks first and then merge them together.
Merge Sort is the best way to accomplish this task. You can sort separate chunks of data at the same time with parallel processes, which should speed up your sort.
The thing is, after you are done sorting the chunks, you don't need a single process doing all of the merging; you can have several processes merging different chunks at the same time:
This algorithm uses a parallel merge algorithm to not only parallelize the recursive division of the array, but also the merge operation. It performs well in practice when combined with a fast stable sequential sort, such as insertion sort, and a fast sequential merge as a base case for merging small arrays.
Here is a link that gives a bit more info about Merge Algorithm (just in case).
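As a toy illustration of that shape (in-memory Python lists standing in for chunk files, so this is nowhere near an S3-scale implementation), each round of merging can be farmed out to a pool of worker processes, with only the last round touching all of the data in a single merge:

```python
import heapq
from multiprocessing import Pool

def merge_pair(pair):
    """2-way merge of two already-sorted chunks (real code would stream from storage)."""
    a, b = pair
    return list(heapq.merge(a, b))

def parallel_sort(chunks, workers=4):
    """Sort every chunk in parallel, then merge pairs of chunks in parallel rounds."""
    with Pool(workers) as pool:
        runs = pool.map(sorted, chunks)               # all chunks sorted concurrently
        while len(runs) > 1:
            pairs = list(zip(runs[0::2], runs[1::2])) # disjoint pairs for this round
            leftover = [runs[-1]] if len(runs) % 2 else []
            runs = pool.map(merge_pair, pairs) + leftover
    return runs[0] if runs else []

if __name__ == "__main__":
    chunks = [[5, 1, 9], [3, 3, 7], [8, 2], [6, 4, 0]]
    print(parallel_sort(chunks, workers=2))  # [0, 1, 2, 3, 3, 4, 5, 6, 7, 8, 9]
```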
The bad news is that you cannot avoid a k-way merge of the multiple sorted files.
The good news is that you can do some of the operations in parallel.
I am confronted with a problem where I have a massive list of information (287,843 items) that must be sorted for display. Which is more efficient: using a self-balancing red-black binary tree to keep the items sorted, or building an array and then sorting it? My keys are strings, if that helps. The algorithm should make use of multiple processor cores.
Thank you!
This really depends on the particulars of your setup. If you have a multicore machine, you can probably sort the strings extremely quickly by using a parallel version of quicksort, in which each recursive call is executed in parallel with each other call. With many cores, this can take the already fast quicksort and make it substantially faster. Other sorting algorithms like merge sort can also be parallelized, though parallel quicksort has the advantage of requiring less extra memory. Since you know that you're sorting strings, you may also want to look into parallel radix sort, which could potentially be extremely fast.
Most binary search trees cannot easily be multithreaded, because rebalance operations often require changing multiple parts of the tree at once, so a balanced red/black tree may not be the best approach here. However, you may want to look into a concurrent skiplist, which is a data structure that can be made to work efficiently in parallel. There are some newer binary search trees designed for parallelism that sometimes outperform the skiplist (here is one such data structure), though I expect that there will be fewer existing implementations and discussion of these newer structures.
If the elements are not changing frequently or you only need sorted order once, then just sorting once with parallel quicksort is probably the best bet. If the elements are changing frequently, then a concurrent data structure like the parallel skiplist will probably be a better bet.
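A small single-file illustration of that trade-off, with Python's bisect module standing in for a balanced tree or skiplist (insertions into a Python list are O(N) because of element shifting, whereas a real tree or skiplist would be O(log N) per insert):

```python
import bisect

keys = ["delta", "alpha", "charlie", "bravo"]

# One-shot sort: cheapest when the data is static or sorted order is needed only once.
display_order = sorted(keys)              # O(N log N), done once

# Incremental maintenance: keep the collection sorted as items arrive, which is
# what a balanced tree or concurrent skiplist buys you when elements change often.
live = []
for k in keys:
    bisect.insort(live, k)

assert live == display_order == ["alpha", "bravo", "charlie", "delta"]
```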
Hope this helps!
Assuming that you're reading that list from a file or some other data source, it seems quite reasonable to read it all into an array and then sort it. If you have a GUI of some sort, it's even more sensible to do both the reading and the sorting in a background thread while the GUI sits in a "waiting to complete" state. Keeping a tree of the values is worthwhile only if you're going to do a lot of insertions and deletions, which would make an array less suitable in this case.
When it comes to multi-core sorting, I believe the merge sort is the easiest to parallelize. But I'm no expert when it comes to this, so don't take my word for a definite answer.
Say I have 50 million features, each of which comes from disk.
At the beginning of my program, I handle each feature and, depending on some conditions, apply modifications to some of them.
At this point in my program, I read a feature from disk, process it, and write it back, because I don't have enough RAM to hold all 50 million features at once.
Now say I want to sort these 50 million features. Is there an efficient algorithm to do this, given that I can't load them all at the same time?
Like a partial sorting algorithm or something like that?
In general, the class of algorithms you're looking for is called external sorting. Perhaps the most widely known example of such an algorithm is external merge sort.
The idea of this algorithm (the external version) is that you split the data into blocks that you can sort in memory (say, 100 thousand items each) and sort each block independently (using some standard algorithm such as quicksort). Then you take the sorted blocks and merge them pairwise (so you merge two 100k blocks into one 200k block), which can be done by reading elements from both blocks through small buffers, since the blocks are already sorted. You keep merging until the final step combines the last two blocks into one that contains all the elements in the right order.
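A minimal sketch of that merge step for two sorted text files, assuming one record per line (the function and file names are illustrative):

```python
def merge_files(path_a, path_b, out_path):
    """Merge two files whose lines are already sorted into one sorted file.

    Only the current line of each input is held in memory, so the blocks can
    be far larger than RAM.
    """
    with open(path_a) as fa, open(path_b) as fb, open(out_path, "w") as out:
        line_a, line_b = fa.readline(), fb.readline()
        while line_a and line_b:
            if line_a <= line_b:
                out.write(line_a)
                line_a = fa.readline()
            else:
                out.write(line_b)
                line_b = fb.readline()
        # One of the inputs is exhausted; copy the rest of the other.
        while line_a:
            out.write(line_a)
            line_a = fa.readline()
        while line_b:
            out.write(line_b)
            line_b = fb.readline()
```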
If you are on Unix, use sort ;)
It may seem stupid, but the command-line tool has been written to handle exactly this case (it sorts chunks and merges temporary files), so you won't have to reimplement it.