Which one will be faster? (algorithm)

Let's say I have an array of size 40, and the element I'm looking for is at position 38.
With a simple loop, it will take 38 steps, right?
But suppose I have two loops running in parallel, plus a variable "found" that starts as false and is set to true when the element is found.
The first loop starts from index 0, and the second loop starts from the other end of the array.
So basically it will take only 4 steps to find the element, right? And the worst case will be when the element is in the middle of the array, right?

It depends how much work it takes to synchronize the state between the two threads.
If it takes zero work, then this will be, on average, 50% faster than a straight-through scan.
On the other hand, if the synchronization takes more work than the split saves, it will start to get slower (which is very likely the case).
From an algorithm standpoint, I don't think this is how you want to go. Even with 2 threads the runtime is still O(n). You would want to sort the data (O(n log n)) and then do a binary search to get the element, especially since you can sort it once and use it for many searches.

If you're talking about algorithmic complexity, this is still a linear search. Just because you're searching two elements per iteration doesn't change the fact that the algorithm is O(n).
In terms of actual performance you would see, this algorithm is likely to be slower than a linear search with a single processor. Since very little work is done per-element in a search, the algorithm would be memory bound, so there would be no benefit to using multiple processors. Also, since you're searching in two locations, this algorithm would not be as cache efficient. And then, as bwawok points out, there would be a lot of time lost in synchronization.

When you run the two loops in parallel you are dividing your CPU power in two, plus creating some overhead. If you mean you are running the search on, say, a multicore machine with your proposed algorithm, then the worst case is 20 steps. You are not changing the complexity class at all. So where do those 4 steps you mentioned come from?

On average there is no difference in runtime.
Take, for example, searching for an item among 10.
The original algorithm takes the following number of steps to find the item at positions 1 through 10:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
The worst case is the last item (taking 10 steps).
The second algorithm (alternating one check from the front with one from the back, on a single processor) takes the following number of steps for the same positions:
1, 3, 5, 7, 9, 10, 8, 6, 4, 2
The worst case in this scenario is item 6 (taking 10 steps).
There are some cases where algorithm 1 is faster.
There are some cases where algorithm 2 is faster.
Both take the same time on average - O(n).
On a side note, it is interesting to compare this to the step counts of a binary search (on a sorted array):
4, 3, 2, 3, 1, 4, 3, 2, 4, 3
Taking at most 4 steps to complete.
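To make those step counts concrete, here is a minimal Python sketch (the function names are mine, not from any answer) that reproduces both sequences for n = 10, assuming the alternating scan runs on a single processor:

def linear_steps(n, target):
    # steps a front-to-back scan takes to find item `target` among 1..n
    return target

def two_ended_steps(n, target):
    # alternate one check from the front and one from the back on a single processor
    front, back = 1, n
    for step in range(1, n + 1):
        current = front if step % 2 == 1 else back
        if current == target:
            return step
        if step % 2 == 1:
            front += 1
        else:
            back -= 1

print([linear_steps(10, i) for i in range(1, 11)])     # -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print([two_ended_steps(10, i) for i in range(1, 11)])  # -> [1, 3, 5, 7, 9, 10, 8, 6, 4, 2]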


Sorting with limited stack operations

I am working on a sorting machine, and to minimize complexity, I would like to keep the moving parts to a minimum. I've come to the following design:
1 Input Stack
2+ Output Stacks
When starting, the machine already knows all the items, their current order, and their desired order.
The machine can move one item from the bottom of the input stack to the bottom of an output stack of its choice.
The machine can move all items from an output stack to the top of the input stack. This is called a "return". (In my machine, I plan for this to be done by the user.)
The machine only accesses the bottom of a stack, except by a return. When a stack is returned to the input, the "new" items will be the last items out of the input. This also means that if the machine moves a set of items from the input to one output, the order of those items is reversed.
The goal of the machine is to take all the items from the input stack, and eventually move them all to an output stack in sorted order. A secondary goal is to reduce the number of "stack returns" to a minimum, because in my machine, this is the part that requires user intervention. Ideally, the machine should do as much sorting as it can without the user's help.
The issue I'm encountering is that I can't seem to find an appropriate algorithm for doing the actual sorting. Pretty much all algorithms I can find rely on being able to swap arbitrary elements. Distribution/external sorting seems promising, but all the algorithms I can find seem to rely on accessing multiple inputs at once.
Since the machine already knows all the items, I can take advantage of this and sort them "in memory". I experimented with "path-finding" from the unsorted state to the sorted state, but I'm unable to get it to actually converge on a solution. (It commonly just gets stuck in a loop moving stacks back and forth.)
Preferably, I would like a solution that works with a minimum of 2 output stacks, but is able to use more if available.
Interestingly, this is a "game" you can play with standard playing cards:
Get as many cards as you would like to sort. (I usually get 13 of a suit.)
Shuffle them and put them in your hand. Decide how many output stacks you get.
You have two valid moves:
You may move the front-most card in your hand and put it on top of any output stack.
You may pick up all the cards in an output stack and put them at the back of the cards you have in hand.
You win when the cards are in order in an output stack. Your score is the number of times you picked up a stack. Lower scores are better.
This can be done in O(log(n)) returns of an output to an input. More precisely in no more than 2 ceil(log_2(n)) - 1 returns if 1 < n.
Let's call the output stacks A and B.
First consider the simplest algorithm that works. We run through the hand, putting the smallest remaining card on B and the rest on A. Then return A to the input and repeat. After n passes you've got them all on B in sorted order. Not very efficient, but it works.
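For concreteness, here is a minimal Python sketch of that baseline, phrased in terms of the card game (lists model the stacks, with the top at the end; the naming is mine). The order in which A comes back does not matter here, since each pass only looks for the smallest remaining card:

def baseline_sort(hand):
    # one pass per card: deal the hand, sending the smallest remaining card to B
    # and everything else to A, then return A to the hand
    a, b = [], []
    returns = 0
    while hand:
        smallest = min(hand)
        for card in hand:
            (b if card == smallest else a).append(card)
        hand = []
        if a:                       # return stack A to the hand
            hand, a = a, []
            returns += 1
    return b, returns

print(baseline_sort([3, 1, 4, 2]))  # -> ([1, 2, 3, 4], 3), i.e. n - 1 returns for n cards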
Now can we make it so that we pull out 2 cards per pass? Well, if we had cards 1, 4, 5, 8, 9, 12, ... in the top half and the rest in the bottom half, then the first pass will find card 1 before card 2, reverse them, the second finds card 3 before card 4, reverses them, and so on - 2 cards per pass. But with 1 pass and 2 returns we can put all the cards we want in the top half on stack A and the rest on stack B, return stack A, return stack B, and then start extracting. This takes about 2 + n/2 returns.
How about 4 cards per pass? Well, we want it divided into quarters, with the top quarter having cards 1, 8, 9, 16, ..., the second quarter having 2, 7, 10, 15, ..., the third having 3, 6, 11, 14, ..., and the last having 4, 5, 12, 13, .... Basically, if you were dealing them, you deal the first 4 in order, the second 4 in reverse, the next four in order.
We can divide them into quarters in 2 passes. Can we figure out how to get there? Well working backwards, after the second pass we want A to have quarters 2,1. And B to have quarters 4,3. Then we return A, return B, and we're golden. So after the first pass we wanted A to have quarters 2,4 and B to have quarters 1,3, return A return B.
Turning that around to work forwards, in pass 1 we put groups 2,4 on A, 1,3 on B. Return A, return B. Then in pass 2 we put groups 1,2 on A, 3,4 on B, return A, return B. Then we start dealing and we get 4 cards out per pass. So now we're using 4 + n/4 returns.
If you continue the logic forward, in 3 passes (6 returns) you can figure out how to get 8 cards per pass on the extract phase. In 4 passes (8 returns) you can get 16 cards per pass. And so on. The logic is complex, but all you need to do is remember that you want them to wind up in order ... 5, 4, 3, 2, 1. Work backwards from the last pass to the first figuring out how you must have done it. And then you have your forward algorithm.
If you play with the numbers: if n is a power of 2, you do equally well to take log_2(n) - 2 passes with 2 log_2(n) - 4 returns and then take 4 extraction passes with 3 returns between them, for 2 log_2(n) - 1 returns; or to take log_2(n) - 1 passes with 2 log_2(n) - 2 returns and then 2 extraction passes with 1 return between them, for 2 log_2(n) - 1 returns. (This is assuming, of course, that n is sufficiently large that it can be so divided. Which means "not 1" for the second version of the algorithm.) We'll see shortly a small reason to prefer the former version of the algorithm when 2 < n.
OK, this is great if the number of cards is a power of 2. But what if you have, say, 10 cards? Well, insert imaginary cards until you reach the next power of 2, rounding up. We follow the algorithm for that and simply don't actually perform the operations that we would have done on the imaginary cards, and we get exactly the results we would have gotten, except with the imaginary cards not there.
So we have a general solution which takes no more than 2 ceil(log_2(n)) - 1 returns.
And now we see why to prefer breaking that into 4 groups instead of 2. If we break into 4 groups, it is possible that the 4th group is only imaginary cards and we get to skip one more return. If we break into 2 groups, there always are real cards in each group and we don't get to save a return.
This speeds us up by 1 if n is 3, 5, 6, 9, 10, 11, 12, 17, 18, ....
Calculating the exact rules is going to be complicated, and I won't try to write code to do it. But you should be able to figure it out from here.
I can't prove it, but there is a chance that this algorithm is optimal in the sense that there are permutations of cards which you can't do better than this on. (There are permutations that you can beat this algorithm with, of course. For example if I hand you everything in reverse, just extracting them all is better than this algorithm.) However I expect that finding the optimal strategy for a given permutation is an NP-complete problem.

Algorithm for seeing if many different arrays are subsets of another one?

Let's say I have an array of ~20-100 integers, for example [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] (actually numbers more like [106511349 , 173316561, ...], all nonnegative 64-bit integers under 2^63, but for demonstration purposes let's use these).
And many (~50,000) smaller arrays of usually 1-20 terms to match or not match:
1=[2, 3, 8, 20]
2=[2, 3, NOT 8]
3=[2, 8, NOT 16]
4=[2, 8, NOT 16] (there will be duplicates with different list IDs)
I need to find which of these are subsets of the array being tested. A matching list must have all of the positive matches, and none of the negative ones. So for this small example, I would need to get back something like [3, 4]. List 1 fails to match because it requires 20, and list 2 fails to match because it has NOT 8. The NOT can easily be represented by using the high bit/making the number negative in those cases.
I need to do this quickly, up to 10,000 times per second. The small arrays are "fixed" (they change infrequently, like once every few seconds), while the large array is different for each data item to be scanned (so 10,000 different large arrays per second).
This has become a bit of a bottleneck, so I'm looking into ways to optimize it.
I'm not sure of the best data structure or way to represent this. One solution would be to turn it around and see which small lists we even need to consider:
2=[1, 2, 3, 4]
3=[1, 2]
8=[1, 2, 3, 4]
16=[3, 4]
20=[1]
Then we'd build up a list of lists to check, and do the full subset matching on these. However, certain terms (often the more frequent ones) are going to end up in many of the lists, so there's not much of an actual win here.
I was wondering if anyone is aware of a better algorithm for solving this sort of problem?
You could try to make a tree out of the smaller arrays, since they change less frequently, such that each subtree tries to halve the number of small arrays left.
For example, do frequency analysis on the numbers in the smaller arrays and find the number that occurs in closest to half of them. Make that the first check in the tree; in your example that would be '3', since it occurs in half the small arrays, so it becomes the head node. Put all the small lists that contain 3 into the left subtree and all the other lists into the right subtree, then repeat this process recursively on each subtree. When a large array comes in, reverse-index it and traverse the tree to collect the candidate lists.
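A rough Python sketch of that tree (the naming is mine; the NOT terms are ignored when building the tree and only checked at the leaves):

from collections import Counter

def build_tree(filters):
    # filters: list of (fid, terms), with NOT terms stored as negated values
    if len(filters) <= 1:
        return filters                                        # leaf
    counts = Counter(t for _, terms in filters for t in terms if t >= 0)
    if not counts:
        return filters
    half = len(filters) / 2.0
    split = min(counts, key=lambda t: abs(counts[t] - half))  # term in closest to half the filters
    with_split = [f for f in filters if split in f[1]]
    without = [f for f in filters if split not in f[1]]
    if not with_split or not without:
        return filters                                        # cannot split any further
    return (split, build_tree(with_split), build_tree(without))

def matches(terms, present):
    return all((t in present) if t >= 0 else (-t not in present) for t in terms)

def search(node, present):
    if isinstance(node, list):                                # leaf: test each filter fully
        return [fid for fid, terms in node if matches(terms, present)]
    split, with_split, without = node
    result = search(without, present)
    if split in present:                                      # filters requiring `split` can only match now
        result += search(with_split, present)
    return result

filters = [(1, [2, 3, 8, 20]), (2, [2, 3, -8]), (3, [2, 8, -16]), (4, [2, 8, -16])]
print(search(build_tree(filters), set(range(10))))            # -> [3, 4], as in the question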
You did not state which of your arrays are sorted - if any.
Since your data is not that big, I would use a hash map to store the entries of the source set (the one with ~20-100 integers). That would basically let you test whether an integer is present in O(1).
Then, given that 50,000 (arrays) * 20 (terms each) * 8 (bytes per term) = 8 megabytes plus hash-map overhead, which does not seem large for most systems either, I would use another hash map to store already-tested arrays. This way you don't have to re-test duplicates.
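A minimal Python sketch of both ideas together (the names are mine; NOT terms are stored as negated values, as the question suggests):

def match_all(large_array, filters):
    # one set for O(1) membership tests, one cache so duplicate filters are evaluated once
    present = set(large_array)
    cache = {}
    matched = []
    for fid, terms in filters.items():
        key = frozenset(terms)
        if key not in cache:
            cache[key] = all((t in present) if t >= 0 else (-t not in present)
                             for t in terms)
        if cache[key]:
            matched.append(fid)
    return matched

filters = {1: [2, 3, 8, 20], 2: [2, 3, -8], 3: [2, 8, -16], 4: [2, 8, -16]}
print(match_all(range(10), filters))   # -> [3, 4]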
I realize this may be less satisfying from a CS point of view, but if you're doing a huge number of tiny tasks that don't affect each other, you might want to consider parallelizing them (multithreading). 10,000 tasks per second, comparing a different array in each task, should fit the bill; you don't give any details about what else you're doing (e.g., where all these arrays are coming from), but it's conceivable that multithreading could improve your throughput by a large factor.
First, do what you were suggesting: make a hashmap from input integer to the IDs of the filter arrays it exists in. That lets you say "input #27 is in these 400 filters", and toss those 400 into a sorted set. You then have to do an intersection of the sorted sets for each one.
Optional: make a second hashmap from each input integer to its frequency in the set of filters. When an input comes in, sort its terms using the second hashmap. Then take the least common input integer and start with it, so you have less overall work to do at each step. Also compute the frequencies for the "not" cases, so you basically get the most bang for your buck at each step.
Finally: this could be pretty easily made into a parallel programming problem; if it's not fast enough on one machine, it seems you could put more machines on it pretty easily, if whatever it's returning is useful enough.
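Here is a minimal Python sketch of that approach (names are mine). It uses a counting pass over the inverted index rather than explicit sorted-set intersections, which amounts to the same thing for this purpose; the negative terms are checked at the end:

from collections import defaultdict

def build_filter_index(filters):
    # filters: {fid: terms}, with NOT terms stored as negated values
    index = defaultdict(list)       # positive term -> IDs of filters containing it
    required = {}                   # filter ID -> number of positive terms
    negatives = {}                  # filter ID -> set of forbidden values
    for fid, terms in filters.items():
        positives = [t for t in terms if t >= 0]
        required[fid] = len(positives)
        negatives[fid] = {-t for t in terms if t < 0}
        for t in positives:
            index[t].append(fid)
    return index, required, negatives

def query(large_array, index, required, negatives):
    present = set(large_array)
    hits = defaultdict(int)
    for value in present:                      # count positive matches per filter
        for fid in index.get(value, ()):
            hits[fid] += 1
    # a filter matches if all its positive terms were hit and no negative term is present
    # (filters with no positive terms would need separate handling; none appear here)
    return [fid for fid, count in hits.items()
            if count == required[fid] and not (negatives[fid] & present)]

filters = {1: [2, 3, 8, 20], 2: [2, 3, -8], 3: [2, 8, -16], 4: [2, 8, -16]}
print(sorted(query(range(10), *build_filter_index(filters))))   # -> [3, 4]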

Unique combination of 10 questions

How can I form a combination of, say, 10 questions so that each student (total students = 10) gets a unique combination?
I don't want to use factorial.
You can use a circular queue data structure.
Put the questions 1 through 10 in the queue; you can then cut it at any point you like, and iterating from that point gives you a unique ordering.
For example, if you cut it at the point between 2 and 3 and then iterate around the queue, you will get:
3, 4, 5, 6, 7, 8, 9, 10, 1, 2
So you need to implement a circular queue and then cut it at 10 different points (after 1, after 2, after 3, ...).
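A minimal Python sketch of the rotation idea (assuming the questions are simply numbered 1 through 10):

def rotations(n=10):
    # each of the n cut points of a circular queue 1..n gives one unique ordering
    base = list(range(1, n + 1))
    return [base[i:] + base[:i] for i in range(n)]

for student, order in enumerate(rotations(), start=1):
    print(student, order)
# student 3 gets [3, 4, 5, 6, 7, 8, 9, 10, 1, 2], the cut between 2 and 3 shown above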
There are 3,628,800 different permutations of 10 items taken 10 at a time.
If you only need 10 of them you could start with an array that has the values 1-10 in it. Then shuffle the array. That becomes your first permutation. Shuffle the array again and check to see that you haven't already generated that permutation. Repeat that process: shuffle, check, save, until you have 10 unique permutations.
It's highly unlikely (although possible) that you'll generate a duplicate permutation in only 10 tries.
The likelihood that you generate a duplicate increases as you generate more permutations, increasing to 50% by the time you've generated about 2,000. But if you just want a few hundred or less, then this method will do it for you pretty quickly.
The proposed circular queue technique works, too, and has the benefit of simplicity, but the resulting sequences are simply rotations of the original order, and it can't produce more than 10 without a shuffle. The technique I suggest will produce more "random" looking orderings.
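A minimal Python sketch of the shuffle-and-check method (names are mine):

import random

def unique_shuffles(n=10, count=10):
    # generate `count` distinct random orderings of 1..n by shuffling and rejecting duplicates
    seen = set()
    result = []
    items = list(range(1, n + 1))
    while len(result) < count:
        random.shuffle(items)
        perm = tuple(items)
        if perm not in seen:
            seen.add(perm)
            result.append(list(perm))
    return result

for student, order in enumerate(unique_shuffles(), start=1):
    print(student, order)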

Subset calculation of list of integers

I'm currently implementing an algorithm where one particular step requires me to calculate subsets in the following way.
Imagine I have sets (possibly millions of them) of integers, where each set could potentially contain around 1000 elements:
Set1: [1, 3, 7]
Set2: [1, 5, 8, 10]
Set3: [1, 3, 11, 14, 15]
...,
Set1000000: [1, 7, 10, 19]
Imagine a particular input set:
InputSet: [1, 7]
I now want to quickly calculate which of these sets the InputSet is a subset of. In this particular case, it should return Set1 and Set1000000.
Now, brute-forcing it takes too much time. I could also parallelise via Map/Reduce, but I'm looking for a more intelligent solution. Also, to a certain extent, it should be memory-efficient. I already optimised the calculation by making use of Bloom filters to quickly eliminate sets of which the input set could never be a subset.
Any smart technique I'm missing out on?
Thanks!
Well - it seems that the bottleneck is the number of sets, so instead of finding a match by iterating over all of them, you could enhance performance by mapping from each element to all sets containing it, and returning the sets that contain all the elements you searched for.
This is very similar to what is done for an AND query when searching an inverted index in the field of information retrieval.
In your example, you will have:
1 -> [set1, set2, set3, ..., set1000000]
3 -> [set1, set3]
5 -> [set2]
7 -> [set1, set1000000]
8 -> [set2]
...
EDIT:
In inverted index in IR, to save space we sometimes use d-gaps - meaning we store the offset between documents and not the actual number. For example, [2,5,10] will become [2,3,5]. Doing so and using delta encoding to represent the numbers tends to help a lot when it comes to space.
(Of course there is also a downside: you need to read the entire list in order to find out whether a specific set/document is in it, and cannot use binary search, but it is sometimes worth it, especially if it makes the difference between fitting the index into RAM or not.)
How about storing a list of the sets which contain each number?
1 -- 1, 2, 3, 1000000
3 -- 1, 3
5 -- 2
etc.
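A minimal Python sketch of this element-to-sets index and the AND-style intersection query (the names are mine):

from collections import defaultdict

def build_element_index(sets):
    # sets: {name: iterable of ints}; returns element -> names of sets containing it
    index = defaultdict(set)
    for name, values in sets.items():
        for v in values:
            index[v].add(name)
    return index

def supersets_of(input_set, index):
    # sets containing every element of input_set = intersection of the posting lists
    posting_lists = [index.get(v, set()) for v in input_set]
    return set.intersection(*posting_lists) if posting_lists else set()

sets = {"Set1": [1, 3, 7], "Set2": [1, 5, 8, 10],
        "Set3": [1, 3, 11, 14, 15], "Set1000000": [1, 7, 10, 19]}
print(supersets_of([1, 7], build_element_index(sets)))   # -> {'Set1', 'Set1000000'}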
Extending amit's solution, instead of storing the actual numbers, you could just store intervals and their associated sets.
For example, using an interval size of 5:
(1-5): [1,2,3,1000000]
(6-10): [1,2,1000000]
(11-15): [3]
(16-20): [1000000]
In the case of (1,7) you should consider intervals (1-5) and (6-10) (which can be determined simply from the interval size). Intersecting those candidate lists gives you [1,2,1000000]. A binary search within each of those sets then shows that (1,7) actually exists only in Set1 and Set1000000.
Though you'll want to check the min and max values for each set to get a better idea of what the interval size should be. For example, 5 is probably a bad choice if the min and max values go from 1 to a million.
You should probably keep it so that a binary search can be used to check for values, so the interval size should be something like (max - min)/N, where 2N is the maximum number of values that will need to be binary searched in each set. For example, "does set 3 contain any values from 5 to 10?" is answered by finding the values in Set3 closest to 5 and 10, which are 3 and 11, so in this case, no, it does not. You would have to go through each candidate set and do binary searches for the interval boundaries that could be within the set. This means ensuring that you don't go searching for 100 when the set only goes up to 10.
You could also just store the range (min and max) of each set. However, the issue is that I suspect your numbers are going to be clustered, so that might not provide much use. Although, as mentioned, it will probably be useful for determining how to set up the intervals.
It will still be troublesome to pick what interval size to use: too large and it will take a long time to build the data structure (1000 * a million * log(N)); too small and you will start to run into space issues. The ideal interval size is probably one that ensures that the number of sets related to each interval is approximately equal, while also ensuring that the total number of intervals isn't too high.
Edit:
One benefit is that you don't actually need to store all intervals, just the ones you need. Although, if you have too many unused intervals, it might be wise to increase the interval size and split the current intervals to ensure that the search stays fast. This is especially true if preprocessing time isn't a major issue.
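A rough Python sketch of the interval index (my own naming; interval width 5, matching the example above), including the binary-search verification step:

import bisect
from collections import defaultdict

def build_interval_index(sets, width=5):
    # interval number -> names of sets with at least one value in that interval
    index = defaultdict(set)
    sorted_sets = {name: sorted(values) for name, values in sets.items()}
    for name, values in sorted_sets.items():
        for v in values:
            index[(v - 1) // width].add(name)
    return index, sorted_sets

def supersets_via_intervals(input_values, index, sorted_sets, width=5):
    if not input_values:
        return []
    # candidate sets: those with some value in every interval an input value falls into
    candidates = set.intersection(*[index.get((v - 1) // width, set())
                                    for v in input_values])
    result = []
    for name in candidates:                        # verify candidates with binary searches
        values = sorted_sets[name]
        ok = True
        for v in input_values:
            i = bisect.bisect_left(values, v)
            if i == len(values) or values[i] != v:
                ok = False
                break
        if ok:
            result.append(name)
    return result

sets = {"Set1": [1, 3, 7], "Set2": [1, 5, 8, 10],
        "Set3": [1, 3, 11, 14, 15], "Set1000000": [1, 7, 10, 19]}
index, sorted_sets = build_interval_index(sets)
print(supersets_via_intervals([1, 7], index, sorted_sets))   # -> ['Set1', 'Set1000000'] (order may vary)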
Start searching from the biggest number (7) of the input set and eliminate the sets that don't contain it (only Set1 and Set1000000 remain).
Then search for the other input elements (1) in the remaining sets.

Sorting an array in minimum cost

I have an array A[] with 4 elements, A = {8, 1, 2, 4}. How do I sort it with minimized cost? The criteria are defined as follows:
a. It is possible to swap any 2 elements.
b. The cost of any swap is the sum of the two element values. For example, if I swap 8 and 4 the cost is 12, and the resulting array looks like A = {4, 1, 2, 8}, which is still unsorted, so more swaps are needed.
c. I need to find a way to sort the array with minimum cost.
From my observation, greedy will not work (e.g., at each step placing some element at its sorted position with minimum cost), so a DP solution seems needed.
Can anyone help?
Swap 2 and 1, and then 1 and 4, and then 1 and 8? Or is it a general question?
For a more general approach you could try:
Swap every pair of elements (with the highest sum) that form a perfect swap (i.e. swapping them will put them both at their right spots).
Use the lowest element as a pivot for swaps (by swapping it with the element whose spot it occupies) until it reaches its final spot (a small sketch of this step appears after the examples below).
Then, you have two possibilities:
Repeat step 2: use the lowest element not in its final spot as a pivot until it reaches its final spot, then go back to step 3
Or swap the lowest element not in its final spot (l2) with the lowest element (l1), repeat step 2 until l1 reaches the final spot of l2. Then:
Either swap l1 and l2 again, go to step 3.1
Or go to step 3.2 again, with the next lowest element not in its final spot being used.
When all this is done, if some opposite swaps are performed one next to another (for example it could happen from going to step 2. to step 3.2.), remove them.
There are still some things to watch out for, but this is already a pretty good approximation. Steps one and two should always work, though; step three is the one to improve in some borderline cases.
Example of the algorithm being used:
With {8 4 5 3 2 7}: (target array {2 3 4 5 7 8})
Step 2: 2 <> 7, 2 <> 8
Array is now {2, 4, 5, 3, 7, 8}
Choice between 3.1 and 3.2:
3.1 gives 3 <> 5, 3 <> 4 (total: 15)
3.2 gives 2 <> 3, 2 <> 5, 2 <> 4, 2 <> 3 (total: 23)
3 <> 5, 3 <> 4 is the better result
Conclusion: 2 <> 7, 2 <> 8, 3 <> 5, 3 <> 4 is the best answer.
With {1 8 9 7 6} (resulting array {1 6 7 8 9})
You're beginning at step three already
Choice between 3.1 and 3.2:
3.1 gives 6 <> 9, 6 <> 7, 6 <> 8 (total: 42)
3.2 gives 1 <> 6, 1 <> 9, 1 <> 7, 1 <> 8, 1 <> 6 (total: 41)
So 1 <> 6, 1 <> 9, 1 <> 7, 1 <> 8, 1 <> 6 is the best result
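As a reference, here is a small Python sketch of step 2 (the pivot swaps), with my own naming and assuming distinct values; it reproduces the swap sequences used in the examples above:

def settle_with_pivot(a, pivot):
    # repeatedly swap `pivot` with the value whose sorted spot it currently occupies,
    # until the pivot lands in its own spot; returns the swaps made and their total cost
    target = sorted(a)
    cost, swaps = 0, []
    while a.index(pivot) != target.index(pivot):
        pos = a.index(pivot)
        other = target[pos]              # the value that belongs at the pivot's position
        j = a.index(other)
        a[pos], a[j] = a[j], a[pos]
        cost += pivot + other
        swaps.append((pivot, other))
    return swaps, cost

a = [8, 4, 5, 3, 2, 7]
print(settle_with_pivot(a, 2))           # -> ([(2, 7), (2, 8)], 19), leaving [2, 4, 5, 3, 7, 8]
print(settle_with_pivot(a, 3))           # -> ([(3, 5), (3, 4)], 15), i.e. option 3.1 above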
This smells like homework. What you need to do is sort the array while minimizing the cost of the swaps. So it's an optimization problem rather than a sorting problem.
A greedy algorithm would work despite this: all you do is build the solution by making the cheapest swap first (figuring out where in the list each element belongs). This is, however, not necessarily optimal.
As long as you never swap the same element twice a greedy algorithm should be optimal though.
Anyway, back to the dynamic programming stuff: just build your solution tree using recursion and then prune the tree as you find more optimal solutions. This is pretty basic recursion.
If you use a more complicated sorting algorithm, you'll have a lot more difficulty puzzling it together with the dynamic programming, so I suggest you start out with a simple, slow O(n^2) sort and build on top of that.
Rather than to provide you with a solution, I'd like to explain how dynamic programming works in my own words.
The first thing you need to do, is to figure out an algorithm that will explore all possible solutions (this can be a really stupid brute force algorithm).
You then implement this using recursion because dynamic programming is based around being able to figure out overlapping sub problems quickly, ergo recursion.
At each recursive call you look at where you are in your solution and check whether you've computed this part of the solution tree before. If you have, you can test whether the current solution is more optimal; if it is, you continue, otherwise you're done with this branch of the problem.
When you arrive at the final solution you will have solved the problem.
Think of each recursive call as a snapshot of a partial solution. It's your job to figure how each recursive call fits together in the final optimal solution.
This is what I recommend you do:
Write a recursive sort algorithm
Add a parameter to your recursive function that maintains the cost of this execution path; as you sort the array, add to this cost. For every possible swap at any given point, do another recursive call (this will branch your solution tree).
Whenever you realize that the cost of the solution you are currently exploring exceeds what you already have somewhere else, abort (just return).
To be able to answer the last question, you need to maintain a shared memory area (a memo table) that you can index by where you are in your recursive algorithm. If there's a precomputed cost there, you just return that value and don't continue processing (this is the pruning, which makes it fast).
Using this method you can even base your solution on a brute-force permutation algorithm. It will probably be very slow or memory-intensive, because it is naive about when to branch or prune, but you don't really need a specific sorting algorithm to make this work; it will just be more efficient if you use one.
Good luck!
If you do a high-low selection sort, you can guarantee that the Nth greatest element isn't swapped more than N times. This is a simple algorithm with a pretty easy and enticing guarantee... Maybe check this on a few examples and see how it could be tweaked. Note: this may not lead to an optimal answer...
To find the absolute minimal cost you'll have to try every sequence of swaps and keep the cheapest one that sorts the array. Roughly along these lines (using memoization on the array state as the pruning criterion):
import itertools

def min_cost_sort(arr):
    # exhaustively try swap sequences, keeping the cheapest one that sorts arr
    target = sorted(arr)
    best = {"cost": float("inf"), "swaps": None}
    seen = {}                                      # array state -> cheapest cost seen so far

    def recsort(l, cost, swaps):
        if cost >= best["cost"]:                   # prune: cannot beat the best solution found
            return
        state = tuple(l)
        if seen.get(state, float("inf")) <= cost:  # prune: reached this state more cheaply before
            return
        seen[state] = cost
        if l == target:
            best["cost"], best["swaps"] = cost, swaps
            return
        for p1, p2 in itertools.combinations(range(len(l)), 2):
            l2 = l[:]
            l2[p1], l2[p2] = l2[p2], l2[p1]        # swap positions p1 and p2
            recsort(l2, cost + l[p1] + l[p2], swaps + [(p1, p2)])
    recsort(list(arr), 0, [])
    return best["cost"], best["swaps"]

print(min_cost_sort([8, 1, 2, 4])[0])              # -> 17 for the array in the question
An approach that will be pretty good is to recursively place the biggest value at the top.
