Multi-threaded Counting Sort

Multi-threaded Counting Sort - sorting

I have a university assignment that requires me to code the Counting Sort algorithm with n threads in Java. We haven't really been given more information than that. I thought that the best way would be to partition the array into n sections, then each thread sorts a section. The problem is that I am unsure of how to partition the array properly; I have only seen examples on how to partition into 2 sections, not n sections.
I would appreciate it if someone could provide me with the logic on how to partition it like I've explained, or give some pseudo-code. No source code please, this is an assignment I have to do.
I have no problem with the actual sorting, just the partitioning.
Thanks.

Definitions
Let's say that you have an array a[0..n-1] to sort and you want to do it using k threads.
For simplicity, let's assume that the smallest element has value 0 and the largest have a value m. If the smallest is not equal 0, then you can scale the values during assigning elements to threads.
Splitting into threads
Partition your array into k chunks each consisting of at most floor(m/k) + 1 different values of elements.
The i-th chunk consists of elements a[j] such that:
(i - 1) * (floor(m/k) + 1) <= a[j] < i * (floor(m/k) + 1)
For example, if you have an array with 10 elements:
a[0..9] = {1, 2, 5, 0, 3, 7, 2, 3 ,4, 6} and k = 3, then m = 7 and the 3 chunks are:
chunk_1: elements in range [0,3) -> [1, 2, 0, 2]
chunk_2: elements in range [3,6) -> [5, 3, 3, 4]
chunk_3: elements in range [6,9) -> [6, 7]
Next, assign each chunk to a separated thread. Each thread sorts one chunk and to get the whole array sorted, just concatenate the results from all threads in order:
thread_1 thread_2 ... thread_k
Complexity:
As you know, the complexity of the count sort is O(n + L) where n is the number of elements to sort, and L is the maximum value of element.
First, notice that you can scale down values in each thread in such a way, that L < floor(m/k) + 1 in that thread, so the complexity of count sort in each thread always depends on the number of elements in that thread.
If you assume that the distribution of the values is uniform, then the expected number of elements in each thread is also floor(m/k) so the total complexity of each thread is O(m/k).

The first idea popping into my mind is to partition the array recursively. That means if you can partition into 2, you can also partition into 4 , right?
A more advanced and modern approach is to partition into many more parts than you have threads or processes. Then assign these parts dynamically to the threads.

Related

Find all combinations of 3 elements from array of n elements

I was thinking about the fastest possible algorithm to return all combinations of unique 3 elements from an array of n elements. The obvious one is the O(n^3) solution, which takes under consideration all possible combinations, but this is bruteforce, and I intend to find something much quicker. Looking for an answer in C++

In the worst case (all the items of the array are different) you have
n ! / ((n - 3)! * 3!) == n * (n - 1) * (n - 2) / 6
distinct items to output and thus O(n**3) is all you can achieve.
If the array has many items, but few distinct ones, you can preprocess it:
remove all the item ocurrencies, but three:
[0, 1, 1, 1, 1, 0, 2, 1, 2, 2, 2, 1] -> [0, 0, 1, 1, 1, 2, 2, 2]
If you have a good hash function for the array's items the preprocess stage takes O(N). In the best case (all the items are same) the prepocess takes O(N)
and the only answer output is O(1) so you have O(N) for the entire
routine.
For the arbitrary array, you can't have complexity better than O(N) (since you have to scan the entire array). Finally, the complexity in case of preprocessing and good hash function for the array's items is in the range of
[O(N)..O(N**3)]
If you're lucky the process will be much quicker; if you have large data to output, well, you have but output the large collection...

There is no way to do that. Because no matter what you do you need to get those 3 different elements. which has size nC3 = n*(n-1)*(n-2)/6. You need to iterate this any times for sure. So you will have O(n^3) complexity minimum.

How to determine how many elements from a range are within another given range?

I need a little help trying to figure out something:
Given a sequence of unordered numbers (less than 15.000) - A - I must answer Q queries (Q <= 100000) of the form i, j, x, y that translate as the following:
How many numbers in range [i,j] from A are bigger (or equal) than x but smaller than y with all numbers in the sequence smaller than 5000.
I am under the impression this requires something like O(logN) because of the big length of the sequence and this got me thinking about BIT (binary indexed trees - because of the queries) but a 2D BIT is too big and requires way to much time to run even on the update side. So the only solution I see here should be 1D BIT or Segment Trees but I can't figure how to work out a solution based on these data structures. I tried retaining the positions in the ordered set of numbers but I can't figure out how to make a BIT that responds to queries of the given form.
Also the algorithm should fit in like 500ms for the given limits.
Edit 1: 500ms for all of the operations on preprocessing and answering the queries
EDIT 2: Where i, j are positions of first and last element in the sequence A to look for elements bigger than x and smaller than y
EDIT 3: Example:
Let there be 1, 3, 2, 4, 6, 3 and query 1, 4, 3, 5 so between positions 1 and 4 (inclusive) there are 2 elements (3 and 4) bigger (or equal) than 3 and smaller than 5
Thank you in advance! P.S: Sorry for the poor English!

Implement 2D-range counting by making a BIT-organized array of sorted subarrays. For example, on the input
[1, 3, 2, 4, 6, 3]
the oracle would be
[[1]
,[1, 3]
,[2]
,[1, 2, 3, 4]
,[6]
,[3, 6]
].
The space usage is O(N log N) (hopefully fine). Construction takes O(N log N) time if you're careful, or O(N log^2 N) time if not (no reason to be for your application methinks).
To answer a query with maximums on sequence index and value (four of these can be used to answer the input queries), do the BIT read procedure for the maximum index, using binary search in the array to count the number of elements not exceeding the maximum value. The query time is O(log^2 N).

Generate m random numbers out of a given array of n numbers

I have one doubt. Does creating random set m integers out of n array elements means that all the m elements have to be unique, because the probability of selecting each number is equal.
For example, If I have original array as {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} (n = 10) and I am selecting 5 elements(m = 5) randomly. So does that mean that {1, 1, 5, 7, 9} is an unacceptable solution because 1 has occurred twice.

If you really are selecting randomly from an array, then each selection might be the same element, so a procedure that tried to build a set of m elements from an array of n elements (where n>=m) might never terminate. For example, it might just pick the same element over and over, thus never increasing the size of the set of selected elements.

I think that depends on the use case. The normal way to get random numbers is by using Fisher-Yates algorithm. So basically you pick a number and then move that number to the end of the array and reduce the size of the array by 1 so that the next number you choose doesn't repeat.
Basic pseudo code:
for i from n − 1 downto 1 do
j ← random integer with 0 ≤ j ≤ i
exchange a[j] and a[i]

A short answer to your question is NO. Selecting means choosing an element ( not copying/replicating) from a given set and by doing so we are not allowed to change the original set. m elements don't have to be unique but the original element set ( Space ) should remain unchanged.

find kth smallest number in O(logn) time

Here is the problem, an unsorted array a[n], and I need to find the kth smallest number in range [i, j], and absolutely 1<=i<=j<=n, k<=j-i+1.
Typically I will use quick-find to do the job, but it is not fast enough if there many query requests with different range [i, j], I hardly to figure out a algorithm to do the query in O(logn) time (preprocessing is allowed).
Any idea is appreciated.
PS
Let me make the problem easier to understand. Any kinds of preprocessing is allowed, but the query needs to be done in O(logn) time. And there will be many (more than 1) queries, like find the 1st in range [3,7], or 3rd in range [10,17], or 11th in range [33, 52].
By range [i, j] I mean in the original array, not sorted or something.
For example, a[5] = {3,1,7,5,9}, query 1st in range [3,4] is 5, 2nd in range [1,3] is 5, 3rd in range [0,2] is 7.

If pre-processing is allowed and not counted towards the time complexity, just use that to construct sub-lists so that you can efficiently find the element you're looking for. As with most optimisations, this trades space for time.
Your pre-processing step is to take your original list of n numbers and create a number of new sublists.
Each of these sublists is a portion of the original, starting with the nth element, extending for m elements and then sorted. So your original list of:
{3, 1, 7, 5, 9}
gives you:
list[0][0] = {3}
list[0][1] = {1, 3}
list[0][2] = {1, 3, 7}
list[0][3] = {1, 3, 5, 7}
list[0][4] = {1, 3, 5, 7, 9}
list[1][0] = {1}
list[1][1] = {1, 7}
list[1][2] = {1, 5, 7}
list[1][3] = {1, 5, 7, 9}
list[2][0] = {7}
list[2][1] = {5, 7}
list[2][2] = {5, 7, 9}
list[3][0] = {5}
list[3][1] = {5,9}
list[4][0] = {9}
This isn't a cheap operation (in time or space) so you may want to maintain a "dirty" flag on the list so you only perform it the first time after you do an modifying operation (insert, delete, change).
In fact, you can use lazy evaluation for even more efficiency. Basically set all sublists to an empty list when you start and whenever you perform a modifying operation. Then, whenever you attempt to access a sublist and it's empty, calculate that sublist (and that one only) before trying to get the kth value out of it.
That ensures sublists are evaluated only when needed and cached to prevent unnecessary recalculation. For example, if you never ask for a value from the 3-through-6 sublist, it's never calculated.
The pseudo-code for creating all the sublists is basically (for loops inclusive at both ends):
for n = 0 to a.lastindex:
create array list[n]
for m = 0 to a.lastindex - n
create array list[n][m]
for i = 0 to m:
list[n][m][i] = a[n+i]
sort list[n][m]
The code for lazy evaluation is a little more complex (but only a little), so I won't provide pseudo-code for that.
Then, in order to find the kth smallest number in the range i through j (where i and j are the original indexes), you simply look up lists[i][j-i][k-1], a very fast O(1) operation:
+--------------------------+
| |
| v
1st in range [3,4] (values 5,9), list[3][4-3=1][1-1-0] = 5
2nd in range [1,3] (values 1,7,5), list[1][3-1=2][2-1=1] = 5
3rd in range [0,2] (values 3,1,7), list[0][2-0=2][3-1=2] = 7
| | ^ ^ ^
| | | | |
| +-------------------------+----+ |
| |
+-------------------------------------------------+
Here's some Python code which shows this in action:
orig = [3,1,7,5,9]
print orig
print "====="
list = []
for n in range (len(orig)):
list.append([])
for m in range (len(orig) - n):
list[-1].append([])
for i in range (m+1):
list[-1][-1].append(orig[n+i])
list[-1][-1] = sorted(list[-1][-1])
print "(%d,%d)=%s"%(n,m,list[-1][-1])
print "====="
# Gives xth smallest in index range y through z inclusive.
x = 1; y = 3; z = 4; print "(%d,%d,%d)=%d"%(x,y,z,list[y][z-y][x-1])
x = 2; y = 1; z = 3; print "(%d,%d,%d)=%d"%(x,y,z,list[y][z-y][x-1])
x = 3; y = 0; z = 2; print "(%d,%d,%d)=%d"%(x,y,z,list[y][z-y][x-1])
print "====="
As expected, the output is:
[3, 1, 7, 5, 9]
=====
(0,0)=[3]
(0,1)=[1, 3]
(0,2)=[1, 3, 7]
(0,3)=[1, 3, 5, 7]
(0,4)=[1, 3, 5, 7, 9]
(1,0)=[1]
(1,1)=[1, 7]
(1,2)=[1, 5, 7]
(1,3)=[1, 5, 7, 9]
(2,0)=[7]
(2,1)=[5, 7]
(2,2)=[5, 7, 9]
(3,0)=[5]
(3,1)=[5, 9]
(4,0)=[9]
=====
(1,3,4)=5
(2,1,3)=5
(3,0,2)=7
=====

Current solution is O( (logn)^2 ). I am pretty sure it can be modified to run on O(logn). The main advantage of this algorithm over paxdiablo's algorithm is space efficiency. This algorithm needs O(nlogn) space, not O(n^2) space.
First, the complexity of finding kth smallest element from two sorted arrays of length m and n is O(logm + logn). Complexity of finding kth smallest element from arrays of lengths a,b,c,d.. is O(loga+logb+.....).
Now, sort the whole array and store it. Sort the first half and second half of the array and store it and so on. You will have 1 sorted array of length n, 2 sorted of arrays of length n/2, 4 sorted arrays of length n/4 and so on. Total memory required = 1*n+2*n/2+4*n/4+8*n/8...= nlogn.
Once you have i and j figure out the list of of subarrays which, when concatenated, give you range [i,j]. There are going to be logn number of arrays. Finding kth smallest number among them would take O( (logn)^2) time.
Example for the last paragraph:
Assume the array is of size 8 (indexed from 0 to 7). You have the following sorted lists:
A:0-7, B:0-3, C:4-7, D:0-1, E:2-3, F:4-5, G:6-7.
Now construct a tree with pointers to these arrays such that every node contains its immediate constituents. A will be root, B and C are its children and so on.
Now implement a recursive function that returns a list of arrays.
def getArrays(node, i, j):
if i==node.min and j==node.max:
return [node];
if i<=node.left.max:
if j<=node.left.max:
return [getArrays(node.left, i, j)]; # (i,j) is located within left node
else:
return [ getArrays(node.left, i, node.left.max), getArrays(node.right, node.right.min, j) ]; # (i,j) is spread over left and right node
else:
return [getArrays(node.right, i, j)]; # (i,j) is located within right node

Preprocess: Make an nxn array where the [k][r] element is the kth smallest element of the first r elements (1-indexed for convenience).
Then, given some particular range [i,j] and value for k, do the following:
Find the element at the [k][j] slot of the matrix; call this x.
go down the i-1 column of your matrix and find how many values in it are smaller than or equal to x (treat column 0 as having 0 smaller entries). By construction, this column will be sorted (all columns will be sorted), so it can be found in log time. Call this value s
Find the element in the [k+s][j] slot of the matrix. This is your answer.
E.g., given 3 1 7 5 9
3 1 1 1 1
X 3 3 3 3
X X 7 5 5
X X X 7 7
X X X X 9
Now, if we're asked for the 2nd smallest in [2,4] range (again, 1-indexing), I first find the 2nd smallest in [1,4] range which is 3. I then look at column 1 and see that there is 1 element less than or equal to 3. Finally, I find the 3rd smallest in [1,4] range at [3][5] slot which is 5, as desired.
This takes n^2 space, and log(n) lookup time.

This one does not require pre-process but is somehow slower than O(logN). It's significantly faster than a naive iterate&count, and could support dynamic modification on the sequence.
It goes like this. Suppose the length n has n=2^x for some x. Construct a segment-tree whose root node represent [0,n-1]. For each of the node, if it represent a node [a,b], b>a, let it has two child nodes each representing [a,(a+b)/2], [(a+b)/2+1,b]. (That is, do a recursive divide-by-two).
Then, on each node, maintain a separate binary search tree for the numbers within that segment. Therefore, each modification on the sequence takes O(logN)[on the segement]*O(logN)[on the BST]. Queries can be done like this, Let Q(a,b,x) be rank of x within segment [a,b]. Obviously, if Q(a,b,x) can be computed efficiently, a binary search on x can compute the answer desired effectively (with an extra O(logE) factor.
Q(a,b,x) can be computed as: find smallest number of segments that make up [a,b], which can be done in O(logN) on the segment tree. For each segment, query on the binary search tree for that segment for the number of elements less than x. Add all these numbers to get Q(a,b,x).
This should be O(logN*logE*logN). Well not exactly what you have asked for though.

In O(log n) time it's not possible to read all of the elements of the array. Since it's not sorted, and there's no other provided information, this is impossible.

There's no way you can do better than O(n) in both worst and average case. You have to look at every single element.

Sorting Algorithm For Array with Integers of at most n spots away

Given an array with integers, with each integer being at most n positions away from its final position, what would be the best sorting algorithm?
I've been thinking for a while about this and I can't seem to get a good strategy to start dealing with this problem. Can someone please guide me?

I'd split the list (of size N) into 2n sublists (using zero-based indexing):
list 0: elements 0, 2n, 4n, ...
list 1: elements 1, 2n+1, 4n+1, ...
...
list 2n-1: elements 2n-1, 4n-1, ...
Each of these lists is obviously sorted.
Now merge these lists (repeatedly merging 2 lists at a time, or using a min heap with one element of each of these lists).
That's all. Time complexity is O(N log(n)).
This is easy in Python:
>>> a = [1, 0, 5, 4, 3, 2, 6, 8, 9, 7, 12, 13, 10, 11]
>>> n = max(abs(i - x) for i, x in enumerate(a))
>>> n
3
>>> print(*heapq.merge(*(a[i::2 * n] for i in range(2 * n))))
0 1 2 3 4 5 6 7 8 9 10 11 12 13

The Heap Sort is very fast for initially random array/collection of elements. In pseudo code this sort would be imlemented as follows:
# heapify
for i = n/2:1, sink(a,i,n)
→ invariant: a[1,n] in heap order
# sortdown
for i = 1:n,
swap a[1,n-i+1]
sink(a,1,n-i)
→ invariant: a[n-i+1,n] in final position
end
# sink from i in a[1..n]
function sink(a,i,n):
# {lc,rc,mc} = {left,right,max} child index
lc = 2*i
if lc > n, return # no children
rc = lc + 1
mc = (rc > n) ? lc : (a[lc] > a[rc]) ? lc : rc
if a[i] >= a[mc], return # heap ordered
swap a[i,mc]
sink(a,mc,n)
For different cases like "Nearly Sorted" of "Few Unique" the algorithms can work differently and be more efficent. For a complete list of the algorithms with animations in the various cases see this brilliant site.
I hope this helps.
Ps. For nearly sorted sets (as commented above) the insertion sort is your winner.

I'd recommend using a comb sort, just start it with a gap size equal to the maximum distance away (or about there). It's expected O(n log n) (or in your case O(n log d) where d is the maximum displacement), easy to understand, easy to implement, and will work even when the elements are displaced more than you expect. If you need the guaranteed execution time you can use something like heap sort, but in the past I've found the overhead in space or computation time usually isn't worth it and end up implementing nearly anything else.

Since each integer being at most n positions away from its final position:
1) for the smallest integer (aka. the 0th integer in the final sorted array), its current position must be in A[0...n] because the nth element is n positions away from the 0th position
2) for the second smallest integer (aka. the 1st integer in the final sorted array, zero based), its current position must be in A[0...n+1]
3) for the ith smallest integer, its current position must be in A[i-n...i+n]
We could use a (n+1)-size min heap, containing a rolling window to get the array sorted. And you could find more details here:
http://www.geeksforgeeks.org/nearly-sorted-algorithm/

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Multi-threaded Counting Sort - sorting

Related

Find all combinations of 3 elements from array of n elements

How to determine how many elements from a range are within another given range?

Generate m random numbers out of a given array of n numbers

find kth smallest number in O(logn) time

Sorting Algorithm For Array with Integers of at most n spots away

Categories

Resources