Algorithm(s) / approach - algorithm

Recently I came across this question and I have no clue where or how to start solving it. Here is the question:
There are 8 statues 0,1,2,3,4,5,6,7 . Each statue is pointing in one of the following four direction North, South, East or West. John would like to arrange the statues so that they all point in same direction. However John is restricted to the following 8 moves which correspond to rotation each statue listed 90 degrees clockwise. (N to E, E to S, S to W, W to N)
Moves
A: 0,1
B: 0,1,2
C: 1,4,5,6
D: 2,5
E: 3,5
F: 3,7
G: 5,7
H: 6,7
Help John figure out fewest number of moves to help point all statues in one direction.
Input : A string initialpos consisting of 8 chars. Each char is either 'N,'S,'E,'W'
Output: An integer which represents fewest no. of moves needed to arrange statues in same direction. If no sequence possible then return -1.
Sample test cases:
input: SSSSSSSS
Output: 0
Explanation: All statues point in same direction. So it takes 0 moves
Test case 1:
Input : WWNNNNNN
Output: 1
Exp: John can use Move A which will make all statues point to North
Test Case 3:
input: NNSEWSWN
Output: 6
Exp: John uses Move A twice, B once, F twice, G once. This will result in all statues facing W.
The only approach I was able to think of was to brute force it. But since the moves can be done multiple times (test case 3), what would be the limit to applying the moves before we conclude that such an arrangement is not possible (i.e output -1)? I am looking for specific types of algorithms that can be used to solve this, also what part of the problem is used in identifying an algorithm.

Note that the order of moves makes no difference, only the set (with repetition). Also note that making the same move 4 times is equivalent to doing nothing, so there is never any reason to make the same move more than 3 times. This reduces the space to 48 possible sequences, which isn't too terrible, but we can still do better than brute force.
The only move that treats 0 and 1 differently is C, so apply C as many times as is necessary to bring 0 and 1 into alignment. We mustn't use C any more than that, and C is the only thing that can move 4, so the remaining task is to align everything to 4.
The only way to move 6 is with H; apply H to align 6.
Now to align 3 and 7. We could do it with E and G, but we may have the option to use F as a short-cut. The optimal number of F moves is not yet clear, so we'll use E and G, and come back to F later.
Apply D to align 5.
Apply B to align 2.
Apply A to align 0 and 1.
Now revisit F, and see whether the short-cut actually saves moves. Pick the optimal number of F moves. (This is easy even by brute force, since there are only 4 possibilities to test.)

The directions N, E, W, S with operation of turning are congruent with Z mod 4 with succ: turn N = (succ 0) mod 4, turn W twice = (succ succ 2) mod 4 etc.
Each move is a vector of zeros (no change) and ones (turn by one) being added to inputs: say you have your example of NNSEWSWN, which would be [0, 0, 2, 1, 3, 2, 3, 0], and you push the button A, which is [1, 1, 0, 0, 0, 0, 0, 0], resulting in [1, 1, 2, 1, 3, 2, 3, 0], or EESEWSWN.
Now if you do a bunch of different operations, they all add up. Thus, you can represent the whole system with this matrix equation:
(start + move_matrix * applied_moves) mod 4 = finish
where start and finish are position vectors as described above, move_matrix the 8x8 matrix with all the moves, and applied_moves a 8-element vector saying how many times we push each button (in range 0..3).
From this, you can get:
applied_moves = (inverse(move_matrix) * (finish - start)) mod 4
Number of applied moves is then just this:
num_applied_moves = sum((inverse(move_matrix) * (finish - start)) mod 4)
Now just plug in the four different values for finish and see which one is least.
You can use matlab, numpy, octave, APL, whatever rocks your boat, as long as it supports matrix algebra, to get your answer very quickly and very easily.

This sounds a little like homework... but I would go with this line of logic.
Run a loop seeing how many moves it would take to move all the statues to face one direction. You would get something like allEast = 30, allWest = 5, etc. Take the lowest sum and corresponding direction would be the answer. With that mindset its pretty easy to build an algorithm to handle computation.

Brute-force could work. A move applied 4 times is the same as not applying the move at all, so each move can only be applied 0, 1, 2, or 3 times.
The order of the moves does not matter. Move a followed by b is the same as b followed by a.
So there are only 4^8 = 65536 possible combinations of moves.

A general solution is to note that there are only 4^8 = 64k different configurations. Each move can therefore be represented as a table of 64k 2 byte indices taking one configuration to the next. The 2 bytes e.g. are divided into 8 2-bit fields 0=N, 1=E, 2=S, 3=W. Further we can use one more table of bits to say which of the 64k configurations have all statues pointing in the same direction.
These tables don't need to be computed at run time. They can be preprocessed while writing the program and stored.
Let table A[c] give the configuration resulting after applying move A to configuration c. and Z[c] return true iff c is a successful config.
So we can use a kind of BFS:
1. Let C be a set of configurations, initially C = { s } where s is the starting config
2. Let n = 0
3. If Z[c] is true for any c in C, return n
4. Let C' be the result of applying A, B, ... H to each element of C.
5. Set C = C', n = n + 1 and go to 3
C can theoretically grow to be almost 64k in size, but the bigger it gets, the better the chances of success, so this ought to be quite fast in practice.

Related

what is the best algorithm to solve 'toy matching puzzle'?

Imagine puzzle like this :
puzzle
I have several shapes, for example :
10 circles
8 triangles
9 squares
I also have some plates to put shapes, for example :
plate A : 2 circle holes, 3 triangle holes, 1 square holes
plate B : 1 circle holes, 0 triangle hole, 3 square holes
plate C : 2 circle holes, 2 triangle holes, 2 square holes
I want to find minimum numbers of plates to put shapes all (plates do not need to fill completely)
for example :
I can pick 6 plates [A, A, A, B, B, C], and I can insert all shapes
but I also can pick [A, A, C, C, C] and this is okay too,
so answer of this problem is : 5
If this problem generalized to N-types of shapes, and M-types of plates,
What is the best algorithm to solve this problem and what is time complexity of the answer?
This problem is a NP-hard problem, it is easier to see it once you realize that there is a very simple polynomial time reduction from the bin packing problem to this problem.
What I would suggest is for you to use integer linear programming techniques in order to solve it.
An ILP that solves your problem can be the following:
// Data
Shapes // array of integers of size n, contains the number of each shape to fit
Plates // 2D array of size n * m, Plates[i][j] represents the number of shape of type i
// that fit on a plate of type j
// Decision variables
X // array of integer of size m, will represent the number of plates of each type to use
// Constraints
For all j in 1 .. m, X[j] >= 0 // number of plates cannot be negative
For all i in 1 .. n, sum(j in 1..m) Plates[i][j] * X[j] >= Shapes[i] // all shapes must fit
Objective function:
minimize sum(j in 1..n) X[j]
Write the pseudo code in OPL, feed it to a linear programming solver, and you should get a solution reasonably fast, given the similarity of this problem with bin packing.
Edit: if you do not want to go though the trouble of learning LP basics, OPL, LP solvers, etc .... then the best and easiest approach for this problem would be a good old branch and bound implementation of this problem. Branch and bound is a very simple and powerful algorithm that can be used to solve a wide range of problem .... a must-know.
A solution to this problem should be done using dynamic programming I think.
Here is a solution in pseudo-code (I haven't tested it, but I think it should work):
parts = the number of shapes we want to fit as a vector
plates = the of plates we can use as a matrix (vector of vector)
function findSolution(parts, usedPlates):
if parts < 0: //all elements < 0
return usedPlates;
else:
bestSolution = null //or anything that shows that there is no solution yet
for X in plates:
if (parts > 0 on any index where X is > 0): //prevents an infinite loop (or stack overflow because of the recursion) that would occur using only e.g. the plate B from your question
used = findParts(parts - X, used.add(X)); //elementwise subtraction; recursion
if (used.length < best.length):
//the solution is better than the current best one
best = used;
//return the best solution that was found
return best
using the values from your question the initial variables would be:
parts = [10, 8, 9]
plates = [[2, 3, 1], [1, 0, 3], [2, 2, 2]]
and you would start the function like this:
solution = findSolution(parts /*= [10, 8, 9]*/, new empty list);
//solution would probably be [A, A, C, C, C], but also [C, C, C, C, C] would be possible (but in every case the solution has the optimal length of 5)
Using this algorithm you divide the problem in smaller problems using recursion (which is what most dynamic programming algorithms do).
The time complexity of this is not realy good, because you have to search every possible solution.
According to the master theorem the time complexity should be something like: O(n^(log_b(a))) where n = a = the number of plates used (in your example 3). b (the base of the logarithm) can't be calculated here (or at least I don't know how) but I assume it would be close to 1 which makes it a quite big exponent. But it also depends on the size of the entries in the parts vector and the entries in the plates vectores (less plates needed -> better time complexity, much plates needed -> bad time complexity).
So the time complexity is not very good. For bigger problems this will take very very long, but for small problems like in your question it should work.

Algorithm to separate items of the same type

I have a list of elements, each one identified with a type, I need to reorder the list to maximize the minimum distance between elements of the same type.
The set is small (10 to 30 items), so performance is not really important.
There's no limit about the quantity of items per type or quantity of types, the data can be considered random.
For example, if I have a list of:
5 items of A
3 items of B
2 items of C
2 items of D
1 item of E
1 item of F
I would like to produce something like:
A, B, C, A, D, F, B, A, E, C, A, D, B, A
A has at least 2 items between occurences
B has at least 4 items between occurences
C has 6 items between occurences
D has 6 items between occurences
Is there an algorithm to achieve this?
-Update-
After exchanging some comments, I came to a definition of a secondary goal:
main goal: maximize the minimum distance between elements of the same type, considering only the type(s) with less distance.
secondary goal: maximize the minimum distance between elements on every type. IE: if a combination increases the minimum distance of a certain type without decreasing other, then choose it.
-Update 2-
About the answers.
There were a lot of useful answers, although none is a solution for both goals, specially the second one which is tricky.
Some thoughts about the answers:
PengOne: Sounds good, although it doesn't provide a concrete implementation, and not always leads to the best result according to the second goal.
Evgeny Kluev: Provides a concrete implementation to the main goal, but it doesn't lead to the best result according to the secondary goal.
tobias_k: I liked the random approach, it doesn't always lead to the best result, but it's a good approximation and cost effective.
I tried a combination of Evgeny Kluev, backtracking, and tobias_k formula, but it needed too much time to get the result.
Finally, at least for my problem, I considered tobias_k to be the most adequate algorithm, for its simplicity and good results in a timely fashion. Probably, it could be improved using Simulated annealing.
First, you don't have a well-defined optimization problem yet. If you want to maximized the minimum distance between two items of the same type, that's well defined. If you want to maximize the minimum distance between two A's and between two B's and ... and between two Z's, then that's not well defined. How would you compare two solutions:
A's are at least 4 apart, B's at least 4 apart, and C's at least 2 apart
A's at least 3 apart, B's at least 3 apart, and C's at least 4 apart
You need a well-defined measure of "good" (or, more accurately, "better"). I'll assume for now that the measure is: maximize the minimum distance between any two of the same item.
Here's an algorithm that achieves a minimum distance of ceiling(N/n(A)) where N is the total number of items and n(A) is the number of items of instance A, assuming that A is the most numerous.
Order the item types A1, A2, ... , Ak where n(Ai) >= n(A{i+1}).
Initialize the list L to be empty.
For j from k to 1, distribute items of type Ak as uniformly as possible in L.
Example: Given the distribution in the question, the algorithm produces:
F
E, F
D, E, D, F
D, C, E, D, C, F
B, D, C, E, B, D, C, F, B
A, B, D, A, C, E, A, B, D, A, C, F, A, B
This sounded like an interesting problem, so I just gave it a try. Here's my super-simplistic randomized approach, done in Python:
def optimize(items, quality_function, stop=1000):
no_improvement = 0
best = 0
while no_improvement < stop:
i = random.randint(0, len(items)-1)
j = random.randint(0, len(items)-1)
copy = items[::]
copy[i], copy[j] = copy[j], copy[i]
q = quality_function(copy)
if q > best:
items, best = copy, q
no_improvement = 0
else:
no_improvement += 1
return items
As already discussed in the comments, the really tricky part is the quality function, passed as a parameter to the optimizer. After some trying I came up with one that almost always yields optimal results. Thank to pmoleri, for pointing out how to make this a whole lot more efficient.
def quality_maxmindist(items):
s = 0
for item in set(items):
indcs = [i for i in range(len(items)) if items[i] == item]
if len(indcs) > 1:
s += sum(1./(indcs[i+1] - indcs[i]) for i in range(len(indcs)-1))
return 1./s
And here some random result:
>>> print optimize(items, quality_maxmindist)
['A', 'B', 'C', 'A', 'D', 'E', 'A', 'B', 'F', 'C', 'A', 'D', 'B', 'A']
Note that, passing another quality function, the same optimizer could be used for different list-rearrangement tasks, e.g. as a (rather silly) randomized sorter.
Here is an algorithm that only maximizes the minimum distance between elements of the same type and does nothing beyond that. The following list is used as an example:
AAAAA BBBBB CCCC DDDD EEEE FFF GG
Sort element sets by number of elements of each type in descending order. Actually only largest sets (A & B) should be placed to the head of the list as well as those element sets that have one element less (C & D & E). Other sets may be unsorted.
Reserve R last positions in the array for one element from each of the largest sets, divide the remaining array evenly between the S-1 remaining elements of the largest sets. This gives optimal distance: K = (N - R) / (S - 1). Represent target array as a 2D matrix with K columns and L = N / K full rows (and possibly one partial row with N % K elements). For example sets we have R = 2, S = 5, N = 27, K = 6, L = 4.
If matrix has S - 1 full rows, fill first R columns of this matrix with elements of the largest sets (A & B), otherwise sequentially fill all columns, starting from last one.
For our example this gives:
AB....
AB....
AB....
AB....
AB.
If we try to fill the remaining columns with other sets in the same order, there is a problem:
ABCDE.
ABCDE.
ABCDE.
ABCE..
ABD
The last 'E' is only 5 positions apart from the first 'E'.
Sequentially fill all columns, starting from last one.
For our example this gives:
ABFEDC
ABFEDC
ABFEDC
ABGEDC
ABG
Returning to linear array we have:
ABFEDCABFEDCABFEDCABGEDCABG
Here is an attempt to use simulated annealing for this problem (C sources): http://ideone.com/OGkkc.
I believe you could see your problem like a bunch of particles that physically repel eachother. You could iterate to a 'stable' situation.
Basic pseudo-code:
force( x, y ) = 0 if x.type==y.type
1/distance(x,y) otherwise
nextposition( x, force ) = coined?(x) => same
else => x + force
notconverged(row,newrow) = // simplistically
row!=newrow
row=[a,b,a,b,b,b,a,e];
newrow=nextposition(row);
while( notconverged(row,newrow) )
newrow=nextposition(row);
I don't know if it converges, but it's an idea :)
I'm sure there may be a more efficient solution, but here is one possibility for you:
First, note that it is very easy to find an ordering which produces a minimum-distance-between-items-of-same-type of 1. Just use any random ordering, and the MDBIOST will be at least 1, if not more.
So, start off with the assumption that the MDBIOST will be 2. Do a recursive search of the space of possible orderings, based on the assumption that MDBIOST will be 2. There are a number of conditions you can use to prune branches from this search. Terminate the search if you find an ordering which works.
If you found one that works, try again, under the assumption that MDBIOST will be 3. Then 4... and so on, until the search fails.
UPDATE: It would actually be better to start with a high number, because that will constrain the possible choices more. Then gradually reduce the number, until you find an ordering which works.
Here's another approach.
If every item must be kept at least k places from every other item of the same type, then write down items from left to right, keeping track of the number of items left of each type. At each point put down an item with the largest number left that you can legally put down.
This will work for N items if there are no more than ceil(N / k) items of the same type, as it will preserve this property - after putting down k items we have k less items and we have put down at least one of each type that started with at ceil(N / k) items of that type.
Given a clutch of mixed items you could work out the largest k you can support and then lay out the items to solve for this k.

How to implement Random(a,b) with only Random(0,1)? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
how to get uniformed random between a, b by a known uniformed random function RANDOM(0,1)
In the book of Introduction to algorithms, there is an excise:
Describe an implementation of the procedure Random(a, b) that only makes calls to Random(0,1). What is the expected running time of your procedure, as a function of a and b? The probability of the result of Random(a,b) should be pure uniformly distributed, as Random(0,1)
For the Random function, the results are integers between a and b, inclusively. For e.g., Random(0,1) generates either 0 or 1; Random(a, b) generates a, a+1, a+2, ..., b
My solution is like this:
for i = 1 to b-a
r = a + Random(0,1)
return r
the running time is T=b-a
Is this correct? Are the results of my solutions uniformly distributed?
Thanks
What if my new solution is like this:
r = a
for i = 1 to b - a //including b-a
r += Random(0,1)
return r
If it is not correct, why r += Random(0,1) makes r not uniformly distributed?
Others have explained why your solution doesn't work. Here's the correct solution:
1) Find the smallest number, p, such that 2^p > b-a.
2) Perform the following algorithm:
r=0
for i = 1 to p
r = 2*r + Random(0,1)
3) If r is greater than b-a, go to step 2.
4) Your result is r+a
So let's try Random(1,3).
So b-a is 2.
2^1 = 2, so p will have to be 2 so that 2^p is greater than 2.
So we'll loop two times. Let's try all possible outputs:
00 -> r=0, 0 is not > 2, so we output 0+1 or 1.
01 -> r=1, 1 is not > 2, so we output 1+1 or 2.
10 -> r=2, 2 is not > 2, so we output 2+1 or 3.
11 -> r=3, 3 is > 2, so we repeat.
So 1/4 of the time, we output 1. 1/4 of the time we output 2. 1/4 of the time we output 3. And 1/4 of the time we have to repeat the algorithm a second time. Looks good.
Note that if you have to do this a lot, two optimizations are handy:
1) If you use the same range a lot, have a class that computes p once so you don't have to compute it each time.
2) Many CPUs have fast ways to perform step 1 that aren't exposed in high-level languages. For example, x86 CPUs have the BSR instruction.
No, it's not correct, that method will concentrate around (a+b)/2. It's a binomial distribution.
Are you sure that Random(0,1) produces integers? it would make more sense if it produced floating point values between 0 and 1. Then the solution would be an affine transformation, running time independent of a and b.
An idea I just had, in case it's about integer values: use bisection. At each step, you have a range low-high. If Random(0,1) returns 0, the next range is low-(low+high)/2, else (low+high)/2-high.
Details and complexity left to you, since it's homework.
That should create (approximately) a uniform distribution.
Edit: approximately is the important word there. Uniform if b-a+1 is a power of 2, not too far off if it's close, but not good enough generally. Ah, well it was a spontaneous idea, can't get them all right.
No, your solution isn't correct. This sum'll have binomial distribution.
However, you can generate a pure random sequence of 0, 1 and treat it as a binary number.
repeat
result = a
steps = ceiling(log(b - a))
for i = 0 to steps
result += (2 ^ i) * Random(0, 1)
until result <= b
KennyTM: my bad.
I read the other answers. For fun, here is another way to find the random number:
Allocate an array with b-a elements.
Set all the values to 1.
Iterate through the array. For each nonzero element, flip the coin, as it were. If it is came up 0, set the element to 0.
Whenever, after a complete iteration, you only have 1 element remaining, you have your random number: a+i where i is the index of the nonzero element (assuming we start indexing on 0). All numbers are then equally likely. (You would have to deal with the case where it's a tie, but I leave that as an exercise for you.)
This would have O(infinity) ... :)
On average, though, half the numbers would be eliminated, so it would have an average case running time of log_2 (b-a).
First of all I assume you are actually accumulating the result, not adding 0 or 1 to a on each step.
Using some probabilites you can prove that your solution is not uniformly distibuted. The chance that the resulting value r is (a+b)/2 is greatest. For instance if a is 0 and b is 7, the chance that you get a value 4 is (combination 4 of 7) divided by 2 raised to the power 7. The reason for that is that no matter which 4 out of the 7 values are 1 the result will still be 4.
The running time you estimate is correct.
Your solution's pseudocode should look like:
r=a
for i = 0 to b-a
r+=Random(0,1)
return r
As for uniform distribution, assuming that the random implementation this random number generator is based on is perfectly uniform the odds of getting 0 or 1 are 50%. Therefore getting the number you want is the result of that choice made over and over again.
So for a=1, b=5, there are 5 choices made.
The odds of getting 1 involves 5 decisions, all 0, the odds of that are 0.5^5 = 3.125%
The odds of getting 5 involves 5 decisions, all 1, the odds of that are 0.5^5 = 3.125%
As you can see from this, the distribution is not uniform -- the odds of any number should be 20%.
In the algorithm you created, it is really not equally distributed.
The result "r" will always be either "a" or "a+1". It will never go beyond that.
It should look something like this:
r=0;
for i=0 to b-a
r = a + r + Random(0,1)
return r;
By including "r" into your computation, you are including the "randomness" of all the previous "for" loop runs.

Algorithm to find streets and same kind in a hand

This is actually a Mahjong-based question, but a Romme- or even Poker-based background will also easily suffice to understand.
In Mahjong 14 tiles (tiles are like cards in Poker) are arranged to 4 sets and a pair. A street ("123") always uses exactly 3 tiles, not more and not less. A set of the same kind ("111") consists of exactly 3 tiles, too. This leads to a sum of 3 * 4 + 2 = 14 tiles.
There are various exceptions like Kan or Thirteen Orphans that are not relevant here. Colors and value ranges (1-9) are also not important for the algorithm.
I'm trying to determine if a hand can be arranged in the way described above. For certain reasons it should not only be able to deal with 14 but any number of tiles. (The next step would be to find how many tiles need to be exchanged to be able to complete a hand.)
Examples:
11122233344455 - easy enough, 4 sets and a pair.
12345555678999 - 123, 456, 789, 555, 99
11223378888999 - 123, 123, 789, 888, 99
11223344556789 - not a valid hand
My current and not yet implemented idea is this: For each tile, try to make a) a street b) a set c) a pair. If none works (or there would be > 1 pair), go back to the previous iteration and try the next option, or, if this is the highest level, fail. Else, remove the used tiles from the list of remaining tiles and continue with the next iteration.
I believe this approach works and would also be reasonably fast (performance is a "nice bonus"), but I'm interested in your opinion on this. Can you think of alternate solutions? Does this or something similar already exist?
(Not homework, I'm learning to play Mahjong.)
The sum of the values in a street and in a set can be divided by 3:
n + n + n = 3n
(n-1) + n + (n + 1) = 3n
So, if you add together all the numbers in a solved hand, you would get a number of the form 3N + 2M where M is the value of the tile in the pair. The remainder of the division by three (total % 3) is, for each value of M :
total % 3 = 0 -> M = {3,6,9}
total % 3 = 1 -> M = {2,5,8}
total % 3 = 2 -> M = {1,4,7}
So, instead of having to test nine possible pairs, you only have to try three based on a simple addition. For each possible pair, remove two tiles with that value and move on to the next step of the algorithm to determine if it's possible.
Once you have this, start with the lowest value. If there are less than three tiles with that value, it means they're necessarily the first element of a street, so remove that street (if you can't because tiles n+1 or n+2 are missing, it means the hand is not valid) and move on to the next lowest value.
If there are at least three tiles with the lowest value, remove them as a set (if you ask "what if they were part of a street?" consider that if they were, then there are also three of tile n+1 and three of tile n+2, which can also be turned into sets) and continue.
If you reach an empty hand, the hand is valid.
For example, for your invalid hand the total is 60, which means M = {3,6,9}:
Remove the 3: 112244556789
- Start with 1: there are less than three, so remove a street
-> impossible: 123 needs a 3
Remove the 6: impossible, there is only one
Remove the 9: impossible, there is only one
With your second example 12345555678999, the total is 78, which means M = {3,6,9}:
Remove the 3: impossible, there is only one
Remove the 6: impossible, there is only one
Remove the 9: 123455556789
- Start with 1: there is only one, so remove a street
-> 455556789
- Start with 4: there is only one, so remove a street
-> 555789
- Start with 5: there are three, so remove a set
-> 789
- Start with 7: there is only one, so remove a street
-> empty : hand is valid, removals were [99] [123] [456] [555] [789]
Your third example 11223378888999 also has a total of 78, which causes backtracking:
Remove the 3: 11227888899
- Start with 1: there are less than three, so remove a street
-> impossible: 123 needs a 3
Remove the 6: impossible, there are none
Remove the 9: 112233788889
- Start with 1: there are less than three, so remove streets
-> 788889
- Start with 7: there is only one, so remove a street
-> 888
- Start with 8: there are three, so remove a set
-> empty, hand is valid, removals were : [99] [123] [123] [789] [888]
There is a special case that you need to do some re-work to get it right. This happens when there is a run-of-three and a pair with the same value (but in different suit).
Let b denates bamboo, c donates character, and d donates dot, try this hand:
b2,b3,b4,b5,b6,b7,c4,c4,c4,d4,d4,d6,d7,d8
d4,d4 should serve as the pair, and c4,c4,c4 should serve as the run-of-3 set.
But because the 3 "c4" tiles appear before the 2 d4 tiless, the first 2 c4 tiles will be picked up as the pair, leaving an orphan c4 and 2 d4s, and these 3 tiles won't form a valid set.
In this case, you'll need to "return" the 2 c4 tiles back to the hand (and keep the hand sorted), and search for next tile that meets the criteria (value == 4). To do that you'll need to make the code "remember" that it had tried c4 so in next iteration it should skip c4 and looks for other tiles with value == 4. The code will be a bit messy, but doable.
I would break it down into 2 steps.
Figure out possible combinations. I think exhaustive checking is feasible with these numbers. The result of this step is a list of combinations, where each combination has a type (set, street, or pair) and a pattern with the cards used (could be a bitmap).
With the previous information, determine possible collections of combinations. This is where a bitmap would come in handy. Using bitwise operators, you could see overlaps in usage of the same tile for different combinators.
You could also do a step 1.5 where you just check to see if enough of each type is available. This step and step 2 would be where you would be able to create a general algorithm. The first step would be the same for all numbers of tiles and possible combinations quickly.

How to master in-place array modification algorithms?

I am preparing for a software job interview, and I am having trouble with in-place array modifications.
For example, in the out-shuffle problem you interleave two halves of an array so that 1 2 3 4 5 6 7 8 would become 1 5 2 6 3 7 4 8. This question asks for a constant-memory solution (and linear-time, although I'm not sure that's even possible).
First I thought a linear algorithm is trivial, but then I couldn't work it out. Then I did find a simple O(n^2) algorithm but it took me a long time. And I still don't find a faster solution.
I remember also having trouble solving a similar problem from Bentley's Programming Pearls, column 2:
Rotate an array left by i positions (e.g. abcde rotated by 2 becomes cdeab), in time O(n) and with just a couple of bytes extra space.
Does anyone have tips to help wrap my head around such problems?
About an O(n) time, O(1) space algorithm for out-shuffle
Doing an out-shuffle in O(n) time and O(1) space is possible, but it is tough. Not sure why people think it is easy and are suggesting you try something else.
The following paper has an O(n) time and O(1) space solution (though it is for in-shuffle, doing in-shuffle makes out-shuffle trivial):
http://arxiv.org/PS_cache/arxiv/pdf/0805/0805.1598v1.pdf
About a method to tackle in-place array modification algorithms
In-place modification algorithms could become very hard to handle.
Consider a couple:
Inplace out-shuffle in linear time. Uses number theory.
In-place merge sort, was open for a few years. An algorithm came but was too complicated to be practical. Uses very complicated bookkeeping.
Sorry, if this sounds discouraging, but there is no magic elixir that will solve all in-place algorithm problems for you. You need to work with the problem, figure out its properties, and try to exploit them (as is the case with most algorithms).
That said, for array modifications where the result is a permutation of the original array, you can try the method of following the cycles of the permutation. Basically, any permutation can be written as a disjoint set of cycles (see John's answer too). For instance the permutation:
1 4 2 5 3 6
of 1 2 3 4 5 6 can be written as
1 -> 1
2 -> 3 -> 5 -> 4 -> 2
6 -> 6.
you can read the arrow as 'goes to'.
So to permute the array 1 2 3 4 5 6 you follow the three cycles:
1 goes to 1.
6 goes to 6.
2 goes to 3, 3 goes to 5, 5 goes to 4, and 4 goes to 2.
To follow this long cycle, you can use just one temp variable. Store 3 in it. Put 2 where 3 was. Now put 3 in 5 and store 5 in the temp and so on. Since you only use constant extra temp space to follow a particular cycle, you are doing an in-place modification of the array for that cycle.
Now if I gave you a formula for computing where an element goes to, all you now need is the set of starting elements of each cycle.
A judicious choice of the starting points of the cycles can make the algorithm easy. If you come up with the starting points in O(1) space, you now have a complete in-place algorithm. This is where you might actually have to get familiar with the problem and exploit its properties.
Even if you didn't know how to compute the starting points of the cycles, but had a formula to compute the next element, you could use this method to get an O(n) time in-place algorithm in some special cases.
For instance: if you knew the array of unsigned integers held only positive integers.
You can now follow the cycles, but negate the numbers in them as an indicator of 'visited' elements. Now you can walk the array and pick the first positive number you come across and follow the cycles for that, making the elements of the cycle negative and continue to find untouched elements. In the end, you just make all the elements positive again to get the resulting permutation.
You get an O(n) time and O(1) space algorithm! Of course, we kind of 'cheated' by using the sign bits of the array integers as our personal 'visited' bitmap.
Even if the array was not necessarily integers, this method (of following the cycles, not the hack of sign bits :-)) can actually be used to tackle the two problems you state:
The in-shuffle (or out-shuffle) problem: When 2n+1 is a power of 3, it can be shown (using number theory) that 1,3,3^2, etc are in different cycles and all cycles are covered using those. Combine this with the fact that the in-shuffle is susceptible to divide and conquer, you get an O(n) time, O(1) space algorithm (the formula is i -> 2*i modulo 2n+1). Refer to the above paper for more details.
The cyclic shift an array problem: Cyclic shift an array of size n by k also gives a permutation of the resulting array (given by the formula i goes to i+k modulo n), and can also be solved in linear time and in-place using the following the cycle method. In fact, in terms of the number of element exchanges this following cycle method is better than the 3 reverses algorithm. Of course, following the cycle method can kill the cache because of the access patterns, and in practice, the 3 reverses algorithm might actually fare better.
As for interviews, if the interviewer is a reasonable person, they will be looking at how you think and approach the problem and not whether you actually solve it. So even if you don't solve a problem, I think you should not be discouraged.
The basic strategy with in place algorithms is to figure out the rule for moving a entry from slot N to slot M.
So, your shuffle, for instance. if A and B are cards and N is the number of chards. the rules for the first half of the deck are different than the rules for the second half of the deck
// A is the current location, B is the new location.
// this math assumes that the first card is card 0
if (A < N/2)
B = A * 2;
else
B = (A - N/2) * 2 + 1;
Now we know the rule, we just have to move each card, each time we move a card, we calculate the new location, then remove the card that is currently in B. place A in slot B, then let B be A, and loop back to the top of the algorithm. Each card moved displaces the new card which becomes the next card to be moved.
I think the analysis is easier if we are 0 based rather than 1 based, so
0 1 2 3 4 5 6 7 // before
0 4 1 5 2 6 3 7 // after
So we want to move 1->2 2->4 4->1 and that completes a cycle
then move 3->6 6->5 5->3 and that completes a cycle
and we are done.
Now we know that card 0 and card N-1 don't move, so we can ignore those,
so we know that we only need to swap N-2 cards in total. The only sticky bit
is that there are 2 cycles, 1,2,4,1 and 3,6,5,3. when we get to card 1 the
second time, we need to move on to card 3.
int A = 1;
int N = 8;
card ary[N]; // Our array of cards
card a = ary[A];
for (int i = 0; i < N/2; ++i)
{
if (A < N/2)
B = A * 2;
else
B = (A - N/2) * 2 + 1;
card b = ary[B];
ary[B] = a;
a = b;
A = B;
if (A == 1)
{
A = 3;
a = ary[A];
}
}
Now this code only works for the 8 card example, because of that if test that moves us from 1 to 3 when we finish the first cycle. What we really need is a general rule to recognize the end of the cycle, and where to go to start the next one.
That rule could be mathematical if you can think of a way, or you could keep track of which places you had visited in a separate array, and when A is back to a visited place, you could then scan forward in your array looking for the first non-visited place.
For your in-place algorithm to be 0(n), the solution will need to be mathematical.
I hope this breakdown of the thinking process is helpful to you. If I was interviewing you, I would expect to see something like this on the whiteboard.
Note: As Moron points out, this doesn't work for all values of N, it's just an example of the sort of analysis that an interviewer is looking for.
Frank,
For programming with loops and arrays, nothing beats David Gries's textbook The Science of Programming. I studied it over 20 years ago, and there are ideas that I still use every day. It is very mathematical and will require real effort to master, but that effort will repay you many times over for your whole career.
Complementing Aryabhatta's answer:
There is a general method to "follow the cycles" even without knowing the starting positions for each cycle or using memory to know visited cycles. This is specially useful if you need O(1) memory.
For each position i in the array, follow the cycle without moving any data yet, until you reach...
the starting position i: end of the cyle. this is a new cycle: follow it again moving the data this time.
a position lower than i: this cycle was already visited, nothing to do with it.
Of course this has a time overhead (O(n^2), I believe) and has the cache problems of the general "following cycles" method.
For the first one, let's assume n is even. You have:
first half: 1 2 3 4
second : 5 6 7 8
Let x1 = first[1], x2 = second[1].
Now, you have to print one from the first half, one from the second, one from the first, one from the second...
Meaning first[1], second[1], first[2], second[2], ...
Obviously, you don't keep two halves in memory, as that will be O(n) memory. You keep pointers to the two halves. Do you see how you'd do that?
The second is a bit harder. Consider:
12345
abcde
..cde
.....ab
..cdeab
cdeab
Do you notice anything? You should notice that the question basically asks you to move the first i characters to the end of your string, without affording the luxury of copying the last n - i in a buffer then appending the first i and then returning the buffer. You need to do with O(1) memory.
To figure how to do this you basically need a lot of practice with these kinds of problems, as with anything else. Practice makes perfect basically. If you've never done these kinds of problems before, it's unlikely you'll figure it out. If you have, then you have to think about how you can manipulate the substrings and or indices such that you solve your problem under the given constraints. The general rule is to work and learn as much as possible so you'll figure out the solutions to these problems very fast when you see them. But the solution differs quite a bit from problem to problem. There's no clear recipe for success I'm afraid. Just read a lot and understand the stuff you read before you move on.
The logic for the second problem is this: what happens if we reverse the substring [1, 2], the substring [3, 5] and then concatenate them and reverse that? We have, in general:
1, 2, 3, 4, ..., i, i + 1, i + 2, ..., N
reverse [1, i] =>
i, i - 1, ..., 4, 3, 2, 1, i + 1, i + 2, ..., N
reverse [i + 1, N] =>
i, i - 1, ..., 4, 3, 2, 1, N, ..., i + 1
reverse [1, N] =>
i + 1, ..., N, 1, 2, 3, 4, ..., i - 1, i
which is what you wanted. Writing the reverse function using O(1) memory should be trivial.
Generally speaking, the idea is to loop through the array once, while
storing the value at the position you are at in a temporary variable
finding the correct value for that position and writing it
either move on to the next value, or figure out what to do with your temporary value before continuing.
A general approach could be as follows:
Construct a positions array int[] pos, such that pos[i] refers to the position (index) of a[i] in the shuffled array.
Rearrange the original array int[] a, according to this positions array pos.
/** Shuffle the array a. */
void shuffle(int[] a) {
// Step 1
int [] pos = contructRearrangementArray(a)
// Step 2
rearrange(a, pos);
}
/**
* Rearrange the given array a according to the positions array pos.
*/
private static void rearrange(int[] a, int[] pos)
{
// By definition 'pos' should not contain any duplicates, otherwise rearrange() can run forever.
// Do the above sanity check.
for (int i = 0; i < pos.length; i++) {
while (i != pos[i]) {
// This while loop completes one cycle in the array
swap(a, i, pos[i]);
swap(pos, i, pos[i]);
}
}
}
/** Swap ith element in a with jth element. */
public static void swap(int[] a, int i, int j)
{
int temp = a[i];
a[i] = a[j];
a[j] = temp;
}
As an example, for the case of outShuffle the following would be an implementation of contructRearrangementArray().
/**
* array : 1 2 3 4 5 6 7 8
* pos : 0 2 4 6 1 3 5 7
* outshuffle: 1 5 2 6 3 7 4 8 (outer boundaries remain same)
*/
public int[] contructRearrangementArray(int[] a)
{
if (a.length % 2 != 0) {
throw new IllegalArgumentException("Cannot outshuffle odd sized array");
}
int[] pos = new int[a.length];
for (int i = 0; i < pos.length; i++) {
pos[i] = i * 2 % (pos.length - 1);
}
pos[a.length - 1] = a.length - 1;
return pos;
}

Resources