sorting a component-wise multi-value (SIMD) array - algorithm

I'm trying to find an O(n∙log(n)) method to sort several arrays simultaneously, so that each element of a multi-value array represents one element from each of 4 different single-value arrays, and the sorting method sorts these multi-value elements.
For example:
For a given 4 single value arrays An, Bn, Cn and Dn,
I'd set a new array Qn
so that Qᵢ = [ Aᵢ Bᵢ Cᵢ Dᵢ ].
Qᵢ may be changed during the process so that Qᵢ = [ Aaᵢ Bbᵢ Ccᵢ Ddᵢ ]
where aᵢ, bᵢ, cᵢ and dᵢ are index lists
and of course that Qᵢ ≤ Qᵢ₊₁ = [ Aaᵢ₊₁ Bbᵢ₊₁ Ccᵢ₊₁ Ddᵢ₊₁ ] so that Aaᵢ ≤ Aaᵢ₊₁, Bbᵢ ≤ Bbᵢ₊₁ and so on.
The motivation, of course, is to use SIMD instructions to exploit this structure and sort the 4 arrays separately in a single pass.
I tried to use a SIMD comparer (_mm_cmplt_ps for example) and a masked swap (_mm_blendv_ps for example)
to make a modified version of traditional sorting algorithms (quick sort, heap sort, merge sort etc)
but I always encounter the problem that an O(n∙log(n)) algorithm in theory has O(n∙log(n)) data-dependent steps in its decision tree.
So a decision, whether to place a pivot (quick sort) or whether to exchange a parent with one of its children (heap sort),
is generally not correct for all 4 components at the same time (and thus the next step - go right or go left - is incorrect for some of the lanes).
For now I only have O(n²) methods working.
Any ideas?

It sounds as though a sorting network is the answer to the question that you asked, since the positions of the comparators are not data-dependent. Batcher's bitonic mergesort is O(n·log²(n)).
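To illustrate why a network sidesteps the decision-tree problem, here is a Python sketch (not intrinsics) of a bitonic sorting network applied lane-wise to 4-component vectors; `cmpx` plays the role of the `_mm_min_ps`/`_mm_max_ps` pair, and the comparator positions (i, l) depend only on n, never on the data, so every step is branch-free:

```python
def cmpx(u, v):
    # Lane-wise compare-exchange: each of the 4 components is ordered
    # independently, like _mm_min_ps / _mm_max_ps on an SSE register.
    return (tuple(min(a, b) for a, b in zip(u, v)),
            tuple(max(a, b) for a, b in zip(u, v)))

def bitonic_lanewise_sort(Q):
    # Bitonic sorting network over n = 2^m vectors; O(n log^2 n) comparators.
    n = len(Q)
    assert n & (n - 1) == 0, "this form of the network needs a power-of-two size"
    k = 2
    while k <= n:
        j = k // 2
        while j > 0:
            for i in range(n):
                l = i ^ j
                if l > i:
                    lo, hi = cmpx(Q[i], Q[l])
                    # sort direction alternates between blocks of size k
                    Q[i], Q[l] = (lo, hi) if (i & k) == 0 else (hi, lo)
            j //= 2
        k *= 2
    return Q
```

After the network runs, lane 0 of Q holds the sorted A array, lane 1 the sorted B array, and so on; each lane is sorted independently because `cmpx` never mixes components.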

Related

Sorting given pairwise orderings

I have n variables (Var 1 ... Var n) and do not know their exact values. The n choose 2 pairwise orderings between these n variables are known. For instance, it is known that Var 5 <= Var 9, Var 9 <= Var 10, and so on for all pairs. Further, it is also known that these pairwise orderings are consistent and do not lead to a degenerate case of equality throughout; that is, in the above example the inequality Var 10 <= Var 5 will not be present.
What is the most efficient sorting algorithm for such problems which gives a sorting of all variables?
Pairwise ordering is the only thing that any (comparison-based) sort needs anyway, so your question boils down to "what's the most efficient comparison-based sorting algorithm".
In answer to that, I recommend you look into Quicksort, Heapsort, Timsort, possibly Mergesort and see what will work well for your case in terms of memory requirements, programming complexity etc.
I find Quicksort the quickest to implement for a once-off program.
The question is not so much how to sort (use the standard sort of your language) but how to feed the sort criterion to the sorting algorithm.
In most languages you need to provide an int comparison(T a, T b), where T is the type of the elements, that returns -1, 0 or 1 depending on which is larger.
So you need a fast access to the data structure storing (all) pairwise orderings, given a pair of elements.
So the question is not so much whether Var 10 <= Var 5 will be present (inconsistent) but whether Var 5 <= Var 10 is guaranteed to be present. If it is, you can test for the presence of the constraint in O(1) with a hash set of element pairs; otherwise, you need to find a transitive relationship between a and b, which might not even exist (it's unclear from the OP whether we are talking about a partial or a total order, i.e. whether for all a, b we ensure a < b, b < a or a = b).
With roughly N² entries in the worst case, this hash is pretty big, and building it still requires exploring transitive links, which is costly.
Following links probably means a map from each element to the set of its (immediately) smaller elements; when comparing a to b, if map(a) contains b or map(b) contains a, you can answer immediately, otherwise you need to recurse on the elements of map(a) and map(b), with pretty bad complexity. Ultimately you will still be accumulating sets of smaller values to build your test.
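If the full set of n-choose-2 orderings really is given, as the question states, the hash-set lookup is enough to drive a standard sort. A minimal Python sketch, assuming `pairs` holds every (smaller, larger) pair:

```python
from functools import cmp_to_key

# Hypothetical input: every pairwise ordering as a (smaller, larger) pair.
pairs = {(5, 9), (5, 10), (9, 10)}   # Var5 <= Var9, Var5 <= Var10, Var9 <= Var10

def cmp_vars(a, b):
    # O(1) comparator backed by the hash set of known orderings.
    if a == b:
        return 0
    return -1 if (a, b) in pairs else 1

sorted_vars = sorted([10, 5, 9], key=cmp_to_key(cmp_vars))
```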
Perhaps, if you have a low number of constraints a <= b, just swapping a and b whenever they do not respect a constraint, and iterating over the constraints to a fixpoint (a full round applying all constraints with no effect), could be more efficient. At least it's O(1) in memory.
A variant of that could be sorting with a stable sort (which preserves the order of incomparable entries) several times, using subsets of the constraints.
Last idea: computing a Max from your input data is O(number of constraints), so you could repeatedly compute the Max, append it to the target, remove the constraints that use it, rinse and repeat. I'd use a stack to store the largest element up to a given constraint index, so you can backtrack to that rather than restart from scratch.
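The repeated-Max idea can be sketched like this (Python; `sort_by_constraints` is a hypothetical name, constraints are (smaller, larger) pairs and are assumed consistent; this simple version rescans for the Max each round instead of using the stack optimization):

```python
def sort_by_constraints(elems, constraints):
    # Repeatedly extract the Max: an element that never appears on the
    # "smaller" side of a remaining constraint is maximal.
    remaining = set(elems)
    cons = set(constraints)
    out = []
    while remaining:
        smaller = {a for (a, b) in cons}
        m = next(e for e in remaining if e not in smaller)
        out.append(m)                       # collect maxima, largest first
        remaining.remove(m)
        cons = {(a, b) for (a, b) in cons if b != m}
    return out[::-1]                        # reverse into ascending order
```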

Performance of counting sort

AFAIK counting sort uses the following algorithm:
// A: input array
// B: output array
// C: counting array
sort(A,B,n,k)
1. for(i:k) C[i]=0;
2. for(i:n) ++C[A[i]];
3. for(i:k) C[i]+=C[i-1];
4. for(i:n-1..0) { B[C[A[i]]-1]=A[i]; --C[A[i]]; }
What if I remove steps 3 and 4 and do the following instead?
3. t=0; for(i:k) while(C[i]) { --C[i]; B[t++]=i; }
Full code here; it looks fine, but I don't know which one has better performance.
Questions:
I guess the complexity of these two versions would be the same, is that true?
In steps 3 and 4, the first version needs to iterate n+k times, while the second one only needs n iterations. So does the second one have better performance?
Your code seems to be correct and it will work in case of sorting numbers. But, suppose you had an array of structures that you were sorting according to their keys. Your method will not work in that case because it simply counts the frequency of a number and while it remains positive assigns it to increasing indices in the output array. The classical method however will work for arrays of structures and objects etc. because it calculates the position that each element should go to and then copies data from the initial array to the output array.
To answer your question:
1> Yes, the runtime complexity of your code will be the same, because for an array of size n with values in the range 0...k, your inner loops run f(0)+f(1)+...+f(k) times in total, where f denotes the frequency of a number, while the outer loop runs k times. The runtime is therefore O(n+k), just like the classical version.
2> In terms of asymptotic complexity, both methods have the same performance. Due to the extra pass, the constants of the classical method may be slightly higher, but that extra pass is also what makes it a stable sort and gives it the benefits that I pointed out earlier.
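For concreteness, here are both variants as Python sketches (assuming values in range(k)); the classical version keeps records stable, while the short version can only reproduce keys:

```python
def counting_sort_classic(A, k):
    C = [0] * k
    for x in A:                  # step 2: count frequencies
        C[x] += 1
    for i in range(1, k):        # step 3: prefix sums give final positions
        C[i] += C[i - 1]
    B = [0] * len(A)
    for x in reversed(A):        # step 4: backwards pass keeps the sort stable
        C[x] -= 1
        B[C[x]] = x
    return B

def counting_sort_short(A, k):
    C = [0] * k
    for x in A:
        C[x] += 1
    B = []
    for i in range(k):           # emit each value count-many times;
        B.extend([i] * C[i])     # fine for plain numbers, useless for records
    return B
```

Both run in O(n + k); timing them on your data is the only way to see which constant factor wins.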

Algorithm for finding mutual name in lists

I've been reading up on Algorithms from the book Algorithms by Robert Sedgewick and I've been stuck on an exercise problem for a while. Here is the question :
Given 3 lists of N names each, find an algorithm to determine if there is any name common to all three lists. The algorithm must have O(NlogN) complexity. You're only allowed to use sorting algorithms and the only data structures you can use are stacks and queues.
I figured I could solve this problem using a HashMap, but the questions restricts us from doing so. Even then that still wouldn't have a complexity of NlogN.
If you sort each of the lists, you can check whether all three share a name in O(n) time: compare the first name of list A to the first name of list B, and while the front of B is less than the front of A, pop elements from B. If you find a match, repeat the process on C; if you find a match in C as well, return true; otherwise move on to the next element of A.
Now you have to sort all of the lists in O(n log n) time, which you can do with your favorite sorting algorithm, though you will have to be a little creative using just stacks and queues; I would recommend merge sort.
The pseudo code below is a little messed up because I am changing lists that I am iterating over.
pseudo code:
assume listA, listB and listC are sorted queues where the smallest name is at the front of the queue.
eltB = listB.pop()
eltC = listC.pop()
for eltA in listA:
    while eltB <= eltA:
        if eltB == eltA:
            while eltC <= eltB:
                if eltB == eltC:
                    return true
                if eltC < eltB:
                    eltC = listC.pop()
        eltB = listB.pop()
Steps:
1. Sort the three lists using an O(N lgN) sorting algorithm.
2. Pop one item from each list.
3. If any of the lists from which you tried to pop is empty, then you are done, i.e. no common element exists.
4. Else, compare the three elements.
5. If the elements are equal, you are done - you have found the common element.
6. Else, keep the maximum of the three elements (constant time) and replenish from the lists whose two elements were discarded.
7. Go to step 3.
Step 1 takes O(N lgN) and the rest of the steps take O(N), so the overall complexity is O(N lgN).
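The steps above can be sketched with queues only (Python `deque` standing in for the allowed queue; `common_name` is an illustrative name, and it pops every front smaller than the current maximum, which is equivalent to replenishing the discarded lists):

```python
from collections import deque

def common_name(a, b, c):
    # Step 1: sort each list (O(N lg N)), then scan with three queues.
    qa, qb, qc = (deque(sorted(x)) for x in (a, b, c))
    while qa and qb and qc:              # an empty queue => no common element
        x, y, z = qa[0], qb[0], qc[0]
        if x == y == z:                  # all three fronts match
            return x
        m = max(x, y, z)
        for q in (qa, qb, qc):           # discard fronts below the maximum
            if q[0] < m:
                q.popleft()
    return None
```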

Union of inverted lists

Given k sorted inverted lists, I want an efficient algorithm to compute the union of these k lists.
Each inverted list is a read-only array in memory, and each list contains integers in sorted order.
The result will be saved in a predefined array which is large enough. Is there any algorithm better than a k-way merge?
A k-way merge is optimal. It takes O(n·log(k)) operations [where n is the number of elements in all lists combined].
It is easy to see that it cannot be done better: as @jpalecek mentioned, otherwise you could sort any array faster than O(n·log(n)) by splitting it into chunks [inverted indexes] of size 1.
Note: This answer assumes it is important that inverted indexes
[resulting array] will be sorted. This assumption is true for most
applications that use inverted indexes, especially in the
Information-Retrieval area. This feature [sorted indexes] allows
elegant and quick intersection of indexes.
Note that the standard k-way merge allows duplicates; you will have to
make sure that if an element appears in two lists, it is added only once
[easy to do by simply checking the last element in the target array
before adding].
If you don't need the resulting array to be sorted, the best approach would be using a hash table to mark which of the elements you have seen. This way, you can get O(n) (n being the total number of elements) time complexity.
Something along the lines of (Perl):
my %seen;
my @merged = grep { exists $seen{$_} ? 0 : ($seen{$_} = 1) } (map { (@$_) } @inputs);
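For the sorted-output case, the k-way merge with duplicate suppression described above reads naturally in Python, where `heapq.merge` does the O(n·log(k)) merging:

```python
import heapq

def union_sorted(lists):
    # k-way merge of pre-sorted lists, dropping duplicates by comparing
    # each element with the last one written to the target array.
    out = []
    for x in heapq.merge(*lists):
        if not out or out[-1] != x:
            out.append(x)
    return out
```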

Find a common element within N arrays

If I have N arrays, what is the best way (in time complexity; space is not important) to find the common elements? You could just find 1 element and stop.
Edit: The elements are all Numbers.
Edit: These are unsorted. Please do not sort and scan.
This is not a homework problem. Somebody asked me this question a long time ago. He was using a hash to solve the problem and asked me if I had a better way.
Create a hash index, with elements as keys, counts as values. Loop through all values and update the count in the index. Afterwards, run through the index and check which elements have count = N. Looking up an element in the index should be O(1), combined with looping through all M elements should be O(M).
If you want to keep order specific to a certain input array, loop over that array and test the element counts in the index in that order.
Some special cases:
if you know that the elements are (positive) integers with a maximum that is not too high, you could just use a normal array as the "hash" index to keep counts, with the numbers themselves serving as array indices.
I've assumed that in each array each number occurs only once. Adapting it for more occurrences should be easy (set the i-th bit in the count for the i-th array, or only update if the current element count == i-1).
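A Python sketch of that count-based index (`common_element` is an illustrative name; deduplicating each array with `set()` enforces the once-per-array assumption):

```python
from collections import Counter

def common_element(arrays):
    counts = Counter()
    for arr in arrays:
        counts.update(set(arr))   # set() guards against duplicate entries
    n = len(arrays)
    # any element counted once per array is common to all of them
    return next((x for x, c in counts.items() if c == n), None)
```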
EDIT when I answered the question, the question did not have the part of "a better way" than hashing in it.
The most direct method is to intersect the first 2 arrays and then intersect the result with each of the remaining N-2 arrays.
If 'intersection' is not defined in the language in which you're working or you require a more specific answer (ie you need the answer to 'how do you do the intersection') then modify your question as such.
Without sorting there isn't an optimized way to do this based on the information given (i.e. sorting and positioning all elements relative to each other, then iterating over the arrays while checking for elements present in all of them at once).
The question asks whether there is a better way than hashing. There is no better way (i.e. better time complexity), since the time to hash each element is typically constant. Empirical performance is also favorable, particularly if the range of values can be mapped one-to-one onto an array maintaining counts; the time is then proportional to the number of elements across all the arrays. Sorting will not give better complexity, since it still needs to visit each element at least once and then adds the log N factor for sorting each array.
Back to hashing, from a performance standpoint, you will get the best empirical performance by not processing each array fully, but processing only a block of elements from each array before proceeding onto the next array. This will take advantage of the CPU cache. It also results in fewer elements being hashed in favorable cases when common elements appear in the same regions of the array (e.g. common elements at the start of all arrays.) Worst case behaviour is no worse than hashing each array in full - merely that all elements are hashed.
I don't think the approach suggested by catchmeifyoutry will work.
Let us say you have two arrays
1: {1,1,2,3,4,5}
2: {1,3,6,7}
then the answer should be 1 and 3. But if we use the hashtable approach, 1 will have count 3 and we will never find 1 in this situation.
Also problems becomes more complex if we have input something like this:
1: {1,1,1,2,3,4}
2: {1,1,5,6}
Here I think we should give the output as 1,1. The suggested approach fails in both cases.
Solution:
Read the first array and put it into a hashtable; if we find the same key again, don't increment the counter. Read the second array in the same manner. Now the hashtable holds the common elements, which have a count of 2.
But again, this approach will fail on the second input set which I gave earlier.
I'd first start with the degenerate case, finding common elements between 2 arrays (more on this later). From there I'll have a collection of common values which I will use as an array itself and compare it against the next array. This check would be performed N-1 times or until the "carry" array of common elements drops to size 0.
One could speed this up, I'd imagine, by divide-and-conquer, splitting the N arrays into the end nodes of a tree. The next level up the tree is N/2 common element arrays, and so forth and so on until you have an array at the top that is either filled or not. In either case, you'd have your answer.
Without sorting and scanning, the best operational speed you'll get for comparing 2 arrays for common elements is O(N²).