I am working on a sorting machine, and to minimize complexity, I would like to keep the moving parts to a minimum. I've come to the following design:
1 Input Stack
2+ Output Stacks
When starting, the machine already knows all the items, their current order, and their desired order.
The machine can move one item from the bottom of the input stack to the bottom of an output stack of its choice.
The machine can move all items from an output stack to the top of the input stack. This is called a "return". (In my machine, I plan for this to be done by the user.)
The machine only accesses the bottom of a stack, except by a return. When a stack is returned to the input, the "new" items will be the last items out of the input. This also means that if the machine moves a set of items from the input to one output, the order of those items is reversed.
The goal of the machine is to take all the items from the input stack, and eventually move them all to an output stack in sorted order. A secondary goal is to reduce the number of "stack returns" to a minimum, because in my machine, this is the part that requires user intervention. Ideally, the machine should do as much sorting as it can without the user's help.
The issue I'm encountering is that I can't seem to find an appropriate algorithm for doing the actual sorting. Pretty much all algorithms I can find rely on being able to swap arbitrary elements. Distribution/external sorting seems promising, but all the algorithms I can find seem to rely on accessing multiple inputs at once.
Since the machine already knows all the items, I can take advantage of this and plan the whole sort "in-memory". I experimented with "path-finding" from the unsorted state to the sorted state, but I'm unable to get it to converge on a solution. (It commonly gets stuck in a loop, moving stacks back and forth.)
Preferably, I would like a solution that works with a minimum of 2 output stacks, but is able to use more if available.
Interestingly, this is a "game" you can play with standard playing cards:
Get as many cards as you would like to sort. (I usually get 13 of a suit.)
Shuffle them and put them in your hand. Decide how many output stacks you get.
You have two valid moves:
You may move the front-most card in your hand and put it on top of any output stack.
You may pick up all the cards in an output stack and put them at the back of the cards you have in hand.
You win when the cards are in order in an output stack. Your score is the number of times you picked up a stack. Lower scores are better.
This can be done in O(log n) returns of an output to the input. More precisely, it takes no more than 2 ceil(log_2(n)) - 1 returns for n > 1.
Let's call the output stacks A and B.
First consider the simplest algorithm that works. We run through them, putting the smallest card on B and the rest on A. Then put A on input and repeat. After n passes you've got them in sorted order. Not very efficient, but it works.
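To make the mechanics concrete, here is a minimal Python sketch of the machine and of this one-smallest-card-per-pass baseline. The representation (lists whose index 0 is the bottom, i.e. the side the machine accesses) and the names deal and naive_sort are my own, not part of the original design.

def deal(stack, out):
    out.insert(0, stack.pop(0))      # bottom of input -> bottom of output

def naive_sort(items):
    inp, a, b = list(items), [], []
    returns = 0
    while inp:
        smallest = min(inp)
        while inp:                   # one full pass over the input
            deal(inp, b if inp[0] == smallest else a)
        if a:                        # "return": user puts A back on top of the input
            inp += a
            a = []
            returns += 1
    return b, returns

print(naive_sort([3, 1, 4, 2]))     # ([4, 3, 2, 1], 3): stack B reads 1,2,3,4 top-down

For n cards this baseline needs n - 1 returns; the refinements below cut that to O(log n).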
Now can we make it so that we pull out 2 cards per pass? Well, if we had cards 1, 4, 5, 8, 9, 12, ... in the top half and the rest in the bottom half, then the first pass will find card 1 before card 2 and reverse them, the second finds card 3 before card 4 and reverses them, and so on: 2 cards per pass. And with 1 pass and 2 returns we can put all the cards we want in the top half on stack A and the rest on stack B, return stack A, return stack B, and then start extracting. This takes 2 + n/2 returns.
How about 4 cards per pass? Well, we want the input divided into quarters, with the top quarter having cards 1, 8, 9, 16, ..., the second quarter having 2, 7, 10, 15, ..., the third having 3, 6, 11, 14, ..., and the last having 4, 5, 12, 13, .... Basically, if you were dealing them out, you'd deal the first 4 in order, the second 4 in reverse, the next four in order, and so on.
We can divide them into quarters in 2 passes. Can we figure out how to get there? Well working backwards, after the second pass we want A to have quarters 2,1. And B to have quarters 4,3. Then we return A, return B, and we're golden. So after the first pass we wanted A to have quarters 2,4 and B to have quarters 1,3, return A return B.
Turning that around to work forwards, in pass 1 we put groups 2,4 on A, 1,3 on B. Return A, return B. Then in pass 2 we put groups 1,2 on A, 3,4 on B, return A, return B. Then we start dealing and we get 4 cards out per pass. So now we're using 4 + n/4 returns.
If you continue the logic forward, in 3 passes (6 returns) you can figure out how to get 8 cards per pass on the extract phase. In 4 passes (8 returns) you can get 16 cards per pass. And so on. The logic is complex, but all you need to do is remember that you want them to wind up in order ... 5, 4, 3, 2, 1. Work backwards from the last pass to the first figuring out how you must have done it. And then you have your forward algorithm.
If you play with the numbers, when n is a power of 2 you do equally well to take log_2(n) - 2 passes with 2 log_2(n) - 4 returns and then 4 extraction passes with 3 returns between them, for 2 log_2(n) - 1 returns, or to take log_2(n) - 1 passes with 2 log_2(n) - 2 returns and then 2 extraction passes with 1 return between them, for 2 log_2(n) - 1 returns. (This is assuming, of course, that n is sufficiently large that it can be so divided. Which means "not 1" for the second version of the algorithm.) We'll see shortly a small reason to prefer the former version of the algorithm when n > 2.
OK, this is great if n is a power of 2. But what if you have, say, 10 cards? Well, insert imaginary cards until n reaches the next power of 2. We follow the algorithm for that, simply skipping the operations that we would have done on the imaginary cards, and we get the exact results we would have gotten, except with the imaginary cards not there.
So we have a general solution which takes no more than 2 ceil(log_2(n)) - 1 returns.
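As a sanity check, here is the claimed bound written as a small helper (a sketch; the function name max_returns is mine):

import math

def max_returns(n):                  # worst-case returns claimed above
    return 2 * math.ceil(math.log2(n)) - 1 if n > 1 else 0

print([max_returns(n) for n in (2, 4, 10, 13, 52)])   # [1, 3, 7, 7, 11]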
And now we see why to prefer breaking that into 4 groups instead of 2. If we break into 4 groups, it is possible that the 4th group is only imaginary cards and we get to skip one more return. If we break into 2 groups, there always are real cards in each group and we don't get to save a return.
This speeds us up by 1 if n is 3, 5, 6, 9, 10, 11, 12, 17, 18, ....
Calculating the exact rules is going to be complicated, and I won't try to write code to do it. But you should be able to figure it out from here.
I can't prove it, but there is a chance that this algorithm is optimal in the sense that there are permutations of cards which you can't do better than this on. (There are permutations that you can beat this algorithm with, of course. For example if I hand you everything in reverse, just extracting them all is better than this algorithm.) However I expect that finding the optimal strategy for a given permutation is an NP-complete problem.
I need help understanding the following paragraph from a book on algorithms -
"Search spaces for natural combinatorial problems tend to grow exponentially in the size N of the input; if the input size increases by one, the number of possibilities increases multiplicatively. We'd like a good algorithm for such a problem to have a better scaling property: when the input size increases by a constant factor—say, a factor of 2—the algorithm should only slow down by some constant factor C."
I don't really get why one is better than the other. If anyone can formulate any examples to aid my understanding, it's greatly appreciated.
Let's consider the following problem: you're given a list of numbers, and you want to find the longest subsequence of that list where the numbers are in ascending order. For example, given the sequence
2 7 1 8 3 9 4 5 0 6
you could form the subsequence [2, 7, 8, 9] as follows:
2 7 1 8 3 9 4 5 0 6
^ ^   ^   ^
but there's an even longer one, [1, 3, 4, 5, 6] available here:
2 7 1 8 3 9 4 5 0 6
    ^   ^   ^ ^   ^
That one happens to be the longest subsequence that's in increasing order, I believe, though please let me know if I'm mistaken.
Now that we have this problem, how would we go about solving it in the general case where you have a list of n numbers? Let's start with a not-so-great option. One possibility would be to list all the subsequences of the original list of numbers, filter out everything that isn't in increasing order, and then take the longest one of all the ones we find. For example, given this short list:
2 7 1 8
we'd form all the possible subsequences, which are shown here:
[]
[8]
[1]
[1, 8]
[7]
[7, 8]
[7, 1]
[7, 1, 8]
[2]
[2, 8]
[2, 1]
[2, 1, 8]
[2, 7]
[2, 7, 8]
[2, 7, 1]
[2, 7, 1, 8]
Yikes, that list is pretty long. But by looking at it, we can see that the longest increasing subsequence is [2, 7, 8], which has length three.
Now, how well is this going to scale as our input list gets longer and longer? Here's something to think about - how many subsequences are there of this new list, which I made by adding 3 to the end of the existing list?
2 7 1 8 3
Well, every existing subsequence is still a perfectly valid subsequence here. But on top of that, we can form a bunch of new subsequences. In fact, we could take any existing subsequence and then tack a 3 onto the end of it. That means that if we had S subsequences for our length-four list, we'll have 2S subsequences for our length-five list.
More generally, you can see that if you take a list and add one more element onto the end of it, you'll double the number of subsequences available. That's a mathematical fact, and it's neither good nor bad by itself, but if we're in the business of listing all those subsequences and checking each one of them to see whether it has some property, we're going to be in trouble because that means there's going to be a ton of subsequences. We already see that there are 16 subsequences of a four-element list. That means there are 32 subsequences of a five-element list, 64 subsequences of a six-element list, and, more generally, 2^n subsequences of an n-element list.
With that insight, let's make a quick calculation. How many subsequences are we going to have to check if we have, say, a 300-element list? We'd have to potentially check 2^300 of them - a number that's bigger than the number of atoms in the observable universe! Oops. That's going to take way more time than we have.
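Here is that brute-force idea written out in Python, just to make it concrete (a sketch; the function name is mine). itertools.combinations conveniently enumerates subsequences in their original order:

from itertools import combinations

def lis_brute_force(seq):
    best = []
    for length in range(len(seq) + 1):
        for sub in combinations(seq, length):          # every subsequence
            if all(a < b for a, b in zip(sub, sub[1:])) and len(sub) > len(best):
                best = list(sub)                       # longest increasing one so far
    return best

print(lis_brute_force([2, 7, 1, 8, 3, 9, 4, 5, 0, 6]))   # [2, 3, 4, 5, 6]

(That answer ties [1, 3, 4, 5, 6] at length 5.) It's fine on ten elements, 2^10 = 1024 checks, but don't try it on 300.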
On the other hand, there's a beautiful algorithm called patience sorting that will always find the longest increasing subsequence, and which does so quite easily. You can do this by playing a little game. You'll place each of the items in the list into one of many piles. To determine what pile to pick, look for the first pile whose top number is bigger than the number in question and place it on top. If you can't find a pile this way, put the number into its own pile on the far right.
For example, given this original list:
2 7 1 8 3 9 4 5 0 6
after playing the game we'd end up with these piles:
0
1 3 4 5
2 7 8 9 6
And here's an amazing fact: the number of piles used equals the length of the longest increasing subsequence. Moreover, you can find that subsequence in the following way: every time you place a number on top of a pile, make a note of the number that was on top of the pile to its left. If we do this with the above numbers, here's what we'll find; the parenthesized number tells us what was on top of the stack to the left at the time we put the number down:
0
1 3 (1) 4 (3) 5 (4)
2 7 (2) 8 (7) 9 (8) 6 (5)
To find the subsequence we want, start with the top of the rightmost pile. Write that number down, then find the number in parentheses and repeat this process. Doing that here gives us 6, 5, 4, 3, 1, which, if reversed, is 1, 3, 4, 5, 6, the longest increasing subsequence! (Wow!) You can prove that this works in all cases, and it's a really beautiful exercise to actually go and do this.
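If you want to try it, here is a straightforward Python version of the pile game (strictly increasing, linear scan over the piles; all names are mine). Back-pointers are stored by position, so duplicates don't confuse it:

def longest_increasing_subsequence(seq):
    if not seq:
        return []
    pile_tops = []                      # index (into seq) of each pile's top card
    back = [None] * len(seq)            # back[i]: top of the pile to the left when i was placed
    for i, x in enumerate(seq):
        for p, t in enumerate(pile_tops):
            if seq[t] > x:              # first pile whose top card is bigger
                pile_tops[p] = i
                break
        else:                           # no such pile: start a new one on the right
            p = len(pile_tops)
            pile_tops.append(i)
        if p > 0:
            back[i] = pile_tops[p - 1]  # note the top of the pile to the left
    out, i = [], pile_tops[-1]          # walk back from the rightmost pile's top
    while i is not None:
        out.append(seq[i])
        i = back[i]
    return out[::-1]

print(longest_increasing_subsequence([2, 7, 1, 8, 3, 9, 4, 5, 0, 6]))   # [1, 3, 4, 5, 6]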
So now the question is how fast this process is. Placing the first number down takes one unit of work - just place it in its own pile. Placing the second number down takes at most two units of work - we have to look at the top of the first pile, and optionally put the number into a second pile. Placing the third number takes at most three units of work - we have to look at up to two piles, and possibly place the number into its own third pile. More generally, placing the kth number down takes k units of work. Overall, this means that the work we're doing is roughly
1 + 2 + 3 + ... + n
if we have n total elements. That's a famous sum called Gauss's sum, and it works out to n(n+1)/2, approximately n^2 / 2. So we can say that we'll need to do roughly n^2 / 2 units of work to solve things this way.
How does that compare to our 2^n solution from before? Well, unlike 2^n, which grows stupidly fast as a function of n, n^2 / 2 is actually a pretty nice function. If we plug in n = 300, which previously in 2^n land gave back "the number of atoms in the universe," we get back a more modest 45,000. If that's a number of nanoseconds, that's nothing; that'll take a computer under a second to do. In fact, you have to plug in a pretty big value of n before you're looking at something that's going to take the computer quite a while to complete.
The function n^2 / 2 has an interesting property compared with 2^n. With 2^n, if you increase n by one, as we saw earlier, 2^n will double. On the other hand, if you take n^2 / 2 and increase n by one, then n^2 / 2 will get bigger, but not by much (specifically, by n + 1/2).
By contrast, if you take 2^n and then double n, then 2^n squares in size - yikes! But if you take n^2 / 2 and double n, then n^2 / 2 goes up only by a factor of four - not that bad, actually, given that we doubled our input size!
This gets at the heart of what the quote you mentioned is talking about. Algorithms with runtimes like 2^n, n!, etc. scale terribly as a function of n, since increasing n by one causes a huge jump in the runtime. On the other hand, functions like n, n log n, n^2, etc. have the property that if you double n, the runtime only goes up by some constant factor. They therefore scale much more nicely as a function of input size.
I was doing some coding challenges and a problem came up that said roughly this:
"Two players each taking turns starting with player one. There are N
sticks given, each player takes 1, 2 or 3 sticks on their turn, the
player to take the last stick loses, the goal is to find an algorithm
that lets player one win with certainty (not always possible, player two is supposed to take turns that will ensure victory) and output 1, 2 or 3 as
the starting amount of sticks taken or 0 if it's impossible to win.
Input is N. Example: Input:2 Output:1"
I tried to think about it, but all I came up with is that it would take checking every possible outcome, because of all the possibilities that can be chained together when N is big. I also thought that the last stick has to be taken by player 2 so as not to lose; that is, stick N-1 is taken by player 1 (whether by taking N-1 alone, or N-2 and N-1, or N-3, N-2 and N-1), leaving stick N to player 2. That is the only way to ensure victory.
It turned out that the solution was (N-1) mod 4, but I can't understand why that is the case.
So my question is: how do you approach a problem like that, and why is the solution a modulo? Also, is there a way to spot modulo problems like these? Other coders did it fairly quickly, so I suppose practice makes perfect, but I have no idea where to start.
It is modulo 4 because, once one player has the advantage, he can keep that advantage by taking 3 sticks if the other player took 1, 2 sticks if the other player took 2, and 1 stick if the other player took 3. Each round then removes exactly 4 sticks, and the other player simply doesn't have any control anymore.
Work the problem backwards:
You don't have to care about a big N; you just need to analyze what the situation looks like when only 4 or fewer sticks are left.
Who will win when there are 1, 2, 3 or 4 sticks left?
Who will win when there are 4n+1, 4n+2, 4n+3 or 4n+4 sticks left?
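If you'd rather see it than prove it, a few lines of Python can compute the winner backwards and check it against the (N - 1) mod 4 formula (a sketch; the variable names are mine):

N = 50
win = [False] * (N + 1)          # win[n]: can the player to move force a win?
for n in range(1, N + 1):
    # winning move: leave the opponent a losing position with at least 1 stick
    win[n] = any(take < n and not win[n - take] for take in (1, 2, 3))

assert all(win[n] == ((n - 1) % 4 != 0) for n in range(1, N + 1))
print([(n, (n - 1) % 4) for n in range(1, 8)])
# [(1, 0), (2, 1), (3, 2), (4, 3), (5, 0), (6, 1), (7, 2)] -- 0 means "can't win"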
How do I form a combination of, say, 10 questions so that each student (total students = 10) gets a unique combination?
I don't want to use factorial.
You can use a circular queue data structure: put the numbers 1 through 10 in it, arranged in a circle.
Now you can cut the circle at any point you like, and iterating from that point will give you a unique ordering.
For example, if you cut at the point between 2 and 3 and then iterate your queue, you will get:
3, 4, 5, 6, 7, 8, 9, 10, 1, 2
So you need to implement a circular queue, then cut it at 10 different points (after 1, after 2, after 3, ....).
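In Python, collections.deque does this directly (a sketch; the variable names are mine):

from collections import deque

q = deque(range(1, 11))          # the circle: 1..10
orderings = []
for _ in range(10):
    orderings.append(list(q))    # the ordering read off from the current cut
    q.rotate(-1)                 # move the cut one position along the circle

print(orderings[2])              # [3, 4, 5, 6, 7, 8, 9, 10, 1, 2]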
There are 3,628,800 different permutations of 10 items taken 10 at a time.
If you only need 10 of them you could start with an array that has the values 1-10 in it. Then shuffle the array. That becomes your first permutation. Shuffle the array again and check to see that you haven't already generated that permutation. Repeat that process: shuffle, check, save, until you have 10 unique permutations.
It's highly unlikely (although possible) that you'll generate a duplicate permutation in only 10 tries.
The likelihood that you generate a duplicate increases as you generate more permutations, increasing to 50% by the time you've generated about 2,000. But if you just want a few hundred or less, then this method will do it for you pretty quickly.
The proposed circular queue technique works, too, and has the benefit of simplicity, but the resulting sequences are simply rotations of the original order, and it can't produce more than 10 without a shuffle. The technique I suggest will produce more "random" looking orderings.
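Here is a sketch of the shuffle-and-check method in Python (the names are mine); a set of tuples handles the duplicate detection:

import random

def unique_orderings(items, count):
    seen = set()
    while len(seen) < count:
        perm = list(items)
        random.shuffle(perm)         # shuffle...
        seen.add(tuple(perm))        # ...and keep it only if it's new
    return [list(p) for p in seen]

for ordering in unique_orderings(range(1, 11), 10):
    print(ordering)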
I have an array A[] with 4 elements, A = {8 1 2 4}. How do I sort it with minimized cost? The criteria are defined as follows:
a. It is possible to swap any 2 elements.
b. The cost of any swap is the sum of the two elements' values. For example, if I swap 8 and 4 the cost is 12 and the resultant array looks like A = {4 1 2 8}, which is still unsorted, so more swaps are needed.
c. Need to find a way to sort the array with minimum cost.
From my observation, greedy will not work (e.g., at each step placing some element into its sorted position at minimum cost), so a DP solution seems needed.
Can anyone help?
Swap 2 and 1, and then 1 and 4, and then 1 and 8? Or is it a general question?
For a more general approach, you could try the following:
1. Swap every pair of elements (highest sum first) that form a perfect swap, i.e. swapping them puts both at their right spot.
2. Use the lowest element as a pivot for swaps (repeatedly swapping it with the element whose spot it occupies) until it reaches its final spot.
3. Then you have two possibilities:
3.1. Repeat step 2: use the lowest element not in its final spot as a pivot until it reaches its final spot, then go back to step 3.
3.2. Or swap the lowest element not in its final spot (l2) with the lowest element overall (l1), and repeat step 2 until l1 reaches the final spot of l2. Then either swap l1 and l2 again and go to step 3.1, or go to step 3.2 again, using the next lowest element not in its final spot.
4. When all this is done, if some opposite swaps were performed one right after the other (this can happen, for example, when going from step 2 to step 3.2), remove them.
There are still some things to watch out for, but this is already a pretty good approximation. Steps 1 and 2 should always work, though; step 3 would be the one to improve in some borderline cases.
Example of the algorithm being used:
With {8 4 5 3 2 7}: (target array {2 3 4 5 7 8})
Step 2: 2 <> 7, 2 <> 8
Array is now {2, 4, 5, 3, 7, 8}
Choice between 3.1 and 3.2:
3.1 gives 3 <> 5, 3 <> 4
3.2 gives 2 <> 3, 2 <> 5, 2 <> 4, 2 <> 3
3 <> 5, 3 <> 4 is the better result
Conclusion: 2 <> 7, 2 <> 8, 3 <> 5, 3 <> 4 is the best answer.
With {1 8 9 7 6} (target array {1 6 7 8 9}):
You're beginning at step three already
Choice between 3.1 and 3.2:
3.1 gives 6 <> 9, 6 <> 7, 6 <> 8 (total: 42)
3.2 gives 1 <> 6, 1 <> 9, 1 <> 7, 1 <> 8, 1 <> 6 (total: 41)
So 1 <> 6, 1 <> 9, 1 <> 7, 1 <> 8, 1 <> 6 is the best result
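A quick way to double-check those totals (a throwaway helper, not part of the algorithm itself):

def total_cost(swaps):
    return sum(a + b for a, b in swaps)     # each swap costs the sum of the pair

print(total_cost([(6, 9), (6, 7), (6, 8)]))                    # 42
print(total_cost([(1, 6), (1, 9), (1, 7), (1, 8), (1, 6)]))    # 41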
This smells like homework. What you need to do is sort the array while minimizing the cost of the swaps, so it's an optimization problem rather than a sorting problem.
A greedy algorithm would work despite this: you fix the solution by swapping the cheapest element first (figuring out where in the list it belongs). This is, however, not necessarily optimal.
As long as you never swap the same element twice a greedy algorithm should be optimal though.
Anyway, back to the dynamic programming stuff: just build your solution tree using recursion and then prune the tree as you find more optimal solutions. This is pretty basic recursion.
If you use a more complicated sorting algorithm, you'll have a lot more difficulty puzzling it together with the dynamic programming, so I suggest you start out with a simple, slow O(n^2) sort and build on top of that.
Rather than to provide you with a solution, I'd like to explain how dynamic programming works in my own words.
The first thing you need to do, is to figure out an algorithm that will explore all possible solutions (this can be a really stupid brute force algorithm).
You then implement this using recursion, because dynamic programming is based around being able to figure out overlapping subproblems quickly; ergo, recursion.
At each recursive call you look up where you are in your solution and check whether you've computed this part of the solution tree before. If you have, you can test whether the current solution is more optimal; if it is, you continue, otherwise you're done with this branch of the problem.
When you arrive at the final solution you will have solved the problem.
Think of each recursive call as a snapshot of a partial solution. It's your job to figure how each recursive call fits together in the final optimal solution.
This is what I recommend you do:
Write a recursive sort algorithm
Add a parameter to your recursive function that maintains the cost of this execution path, as you sort the array, add to this cost. For every possible swap at any given point do another recursive call (this will branch your solution tree)
Whenever you realize that the cost of the solution you are currently exploring exceeds what you already have somewhere else, abort (just return).
To be able to answer the last question, you need to maintain a shared memory area which you can index according to where you are in your recursive algorithm. If there's a precomputed cost there, you just return that value and don't continue processing (this is the pruning, which makes it fast).
Using this method you can even base your solution on a brute-force permutation algorithm. It will probably be very slow or memory intensive, because it is naive about when to branch or prune, but you don't really need a specific sort algorithm to make this work; it will just be more efficient to go about it that way.
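As a concrete (and deliberately naive) illustration of that shared-memory idea, here is a sketch in Python with invented names; it remembers the cheapest cost at which each arrangement was ever reached and abandons any path that gets there at equal or higher cost:

def min_sort_cost(arr):
    cheapest = {}                            # arrangement -> cheapest cost reaching it
    best = [float('inf')]

    def explore(l, cost):
        key = tuple(l)
        if cost >= cheapest.get(key, float('inf')) or cost >= best[0]:
            return                           # pruned: seen cheaper, or over budget
        cheapest[key] = cost
        if all(l[i] <= l[i + 1] for i in range(len(l) - 1)):
            best[0] = cost                   # reached a sorted arrangement
            return
        for i in range(len(l)):
            for j in range(i + 1, len(l)):
                l[i], l[j] = l[j], l[i]      # branch: try this swap...
                explore(l, cost + l[i] + l[j])
                l[i], l[j] = l[j], l[i]      # ...and undo it

    explore(list(arr), 0)
    return best[0]

print(min_sort_cost([8, 1, 2, 4]))           # 17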
Good luck!
If you do a high-low selection sort, you can guarantee that the Nth greatest element isn't swapped more than N times. This is a simple algorithm with a pretty easy and enticing guarantee... Maybe check it on a few examples and see how it could be tweaked. Note: this may not lead to an optimal answer...
To find the absolute minimal cost you'll have to try all ways to swap and then pick the cheapest one.
best_cost = float('inf')   # cheapest total swap cost found so far
best_swaps = None          # the swap sequence that achieved it

def recsort(l, swaps, cost):
    global best_cost, best_swaps
    if cost >= best_cost:                        # prune: already at least as costly
        return
    if all(l[i] <= l[i + 1] for i in range(len(l) - 1)):
        best_cost, best_swaps = cost, swaps      # sorted: record the new best
        return
    if len(swaps) >= 2 * len(l):                 # depth limit; 2n swaps always suffice
        return
    for p1 in range(len(l)):
        for p2 in range(p1 + 1, len(l)):
            l2 = l[:]                            # try swapping positions p1 and p2
            l2[p1], l2[p2] = l2[p2], l2[p1]
            recsort(l2, swaps + [(p1, p2)], cost + l[p1] + l[p2])

recsort([8, 1, 2, 4], [], 0)
print(best_cost, best_swaps)                     # 17 and one cheapest swap sequence
An approach that will be pretty good is to recursively place the biggest value at the top.
I have a 2D array that holds unique integers - this represents a physical container with rows/columns - in each position there is a vial.
I know the integers that should be in the array and where they should be located.
My array however is shuffled with potentially many/all unique integers in the wrong positions.
I now need to sort the array - however this maps to a physical process and therefore I really want to reduce the number of sort steps involved due to potential human error.
Is this just a plain sort? Or is there a more specific name for this scenario? Are there well-known solutions?
My colleague has suggested just creating a list of swap [1][1] with [2][1] type instructions, which seems reasonable however I can't quite get my head around if the order of swaps is important.
All assistance gratefully received.
If you really can tell, just by looking at the vial, where it belongs, the shortest way is to take the first vial that is in the wrong place out, then put it where it belongs, take whatever was there, put it to its proper place, etc., until you happen to get the vial that belongs where you originally made a "hole". Then repeat.
Since you take out each vial at most once, and only if it is in the wrong place, I think that this is optimal with respect to physical motion.
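Here is a sketch of that procedure in Python (the grid flattened to a list; target[p] holds the vial id that belongs at position p; all names are mine):

def rearrange(current, target):
    current = list(current)
    dest_of = {vial: p for p, vial in enumerate(target)}
    moves = []                                            # (vial, destination), in order
    for start in range(len(current)):
        if current[start] == target[start]:
            continue
        in_hand, current[start] = current[start], None    # take the vial out: a "hole"
        while True:
            d = dest_of[in_hand]
            moves.append((in_hand, d))                    # place it where it belongs
            in_hand, current[d] = current[d], in_hand     # pick up whatever was there
            if d == start:                                # the hole is filled; cycle closed
                break
    return moves

print(rearrange([3, 1, 2, 4], [1, 2, 3, 4]))   # [(3, 2), (2, 1), (1, 0)]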
Sorting algorithms are analysed by the number of comparisons and the number of swaps required. Since for a human operator the cost of a swap is much higher than the cost of a comparison, you want a 2D sort that minimizes the number of swaps required.
"I can't quite get my head around if the order of swaps is important."
In general, yes, it is. For a simple example consider the starting list of 3 elements, X Y Z.
The result of "swap 1 with 2, then 2 with 3" is Y Z X.
The result of "swap 2 with 3, then 1 with 2" is Z X Y.
The list of swaps you come up with will probably be (at most) 1 for each element that is out of place, and will swap that element with whatever is in its correct place. So for example you might swap [0][0] with wherever it belongs. Unless the place where it belongs happens to contain the element that belongs in [0][0], then your next swap could be, again [0][0] with wherever that belongs. So certainly the order of swaps is important - this second swap is only correct because the first swap has already happened, and moved some particular element into [0][0].
If two consecutive swaps are disjoint, though, then you can reverse their order: (1 2)(3 4) is equivalent to (3 4)(1 2), where (x y) is a mathematical notation for "swap x with y".
It's a theorem that any permutation can be written as a set of disjoint cycles. This decomposition into cycles is unique apart from which element in your cycle you choose to list first, and the order the cycles are listed, both of which are irrelevant to the result. The notation (1 2 3) means "move 1 to 2, 2 to 3, and 3 to 1", and is a 3-cycle. It's exactly the same as (2 3 1), but different from (1 3 2).
Depending how your human operative works, it might well be more efficient for them to carry out an n-cycle rather than an equivalent n swaps. So once you know how to sort your array (that is, you know what permutation must be performed on it to get it into order), it may be that the best thing to do is to generate that decomposition.
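For completeness, here is a small Python sketch (invented names) that produces that cycle decomposition from the current and target arrangements:

def cycles(current, target):
    dest = {v: p for p, v in enumerate(target)}   # where each element belongs
    seen, out = set(), []
    for p in range(len(current)):
        if p in seen or current[p] == target[p]:
            continue                               # already visited, or a fixed point
        cycle, q = [], p
        while q not in seen:
            seen.add(q)
            cycle.append(q)
            q = dest[current[q]]                   # follow the element to its slot
        out.append(tuple(cycle))
    return out

print(cycles(list("YZX"), list("XYZ")))            # [(0, 1, 2)]: one 3-cycle

The operator can then execute each cycle as one pick-up-and-shift chain instead of a list of separate swaps.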