Can't implement a difficult Sudoku elimination function - algorithm

While solving a Sudoku I can remove the candidate digits (1) and (2) from cells D[1,2] and D[2,2], because (8) and (9) are possible only in those two cells, so those cells must be (8 and 9) or (9 and 8). This means that digits (1) and (2) are in the 3rd line of block D, which is why I can eliminate digit (1) as a possibility for cell A[3,3].
I have spent the last 40 hours trying to write a function that does this, but couldn't manage it. Can anyone write a function that detects this kind of logical pattern (eliminating some possibilities because some other n candidates can exist only in n cells; in our case the 2 digits 8 and 9 can exist only in the 2 cells D[1,2] and D[2,2])?
Please don't advise me about the other Sudoku functions; I have already written them. This is the only algorithm I couldn't program. By the way, you can use r[i] (a string holding the possibilities for row i), c[i] for column i, and b[i] for the blocks (e.g. b[4] (block A in this image) = 1,2,3,4,5,6,7 because 8 and 9 are already placed). Thanks.

I really don't see the problem; you have basically already answered it yourself.
In general you should do the following:
Step 1: Loop over all 9 cells of one block and check whether (1) is contained in exactly two cells.
Step 2: If not, try the next number. If yes, loop over all 9 cells and check whether (2) is also in those two cells but not in any of the remaining 7.
Step 3: If not, check the next number. If yes, remove all other possibilities from those two cells except for the two numbers you found, and you are basically done.
Step 4: If no matching number could be found for (1) (or for whichever larger number was chosen in the "not" branch of step 2), start over from step 1 with the next number, unless you are already at 8, in which case you can stop.
In the end you could extend the same pattern dynamically to 3 numbers in 3 cells, 4 numbers in 4 cells, and so on; a rough sketch of the pair case follows below.
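For illustration, here is a rough C++ sketch of those steps (my own code, not based on your r[i]/c[i]/b[i] strings; it assumes each cell's candidates are stored as a 9-bit mask in cand, and that block lists the 9 cell indices of one block):

#include <array>
#include <vector>

// cand[c] is a bitmask: bit (d-1) set means digit d is still possible in cell c.
using Cands = std::array<unsigned, 81>;

// Looks for a "hidden pair" in one block and, if found, removes every other
// candidate from the two cells involved. Returns true if anything was eliminated.
bool hiddenPairInBlock(Cands& cand, const std::array<int, 9>& block)
{
    bool changed = false;
    for (int d1 = 1; d1 <= 8; ++d1) {
        // Step 1: collect the cells of this block that still allow d1.
        std::vector<int> cells1;
        for (int c : block)
            if (cand[c] & (1u << (d1 - 1))) cells1.push_back(c);
        if (cells1.size() != 2) continue;              // d1 is not confined to two cells

        for (int d2 = d1 + 1; d2 <= 9; ++d2) {
            // Step 2: d2 must be possible in exactly the same two cells.
            std::vector<int> cells2;
            for (int c : block)
                if (cand[c] & (1u << (d2 - 1))) cells2.push_back(c);
            if (cells2 != cells1) continue;

            // Step 3: those two cells can only hold d1 or d2; drop everything else.
            unsigned keep = (1u << (d1 - 1)) | (1u << (d2 - 1));
            for (int c : cells1) {
                if (cand[c] & ~keep) changed = true;
                cand[c] &= keep;
            }
        }
    }
    return changed;
}

You would call this once per block (and write the analogous loops for rows and columns); follow-up steps such as the block-line elimination you describe for A[3,3] can then work from the reduced candidate sets.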

Related

Possible NxN matrices, t 1's in each row and column, none in diagonal?

Background:
This is extra credit in a logic and algorithms class. We are currently covering propositional logic (P implies Q, that kind of thing), so I think the Prof wanted to give us an assignment out of our depth.
I will implement this in C++, but right now I just want to understand what's going on in the example... which I don't.
Example
Enclosed is a walkthrough for the Lefty algorithm which computes the number
of nxn 0-1 matrices with t ones in each row and column, but none on the main
diagonal.
The algorithm used to verify the equations presented counts all the possible
matrices, but does not construct them.
It is called "Lefty", it is reasonably simple, and is best described with an
example.
Suppose we wanted to compute the number of 6x6 0-1 matrices with 2 ones
in each row and column, but no ones on the main diagonal. We first create a
state vector of length 6, filled with 2s:
(2 2 2 2 2 2)
This state vector symbolizes the number of ones we must yet place in each
column. We accompany it with an integer which we call the "puck", which is
initialized to 1. This puck will increase by one each time we perform a ones
placement in a row of the matrix (a "round"), and we will think of the puck as
"covering up" the column that we wonít be able to place ones in for that round.
Since we are starting with the first row (and hence the first round), we place
two ones in any column, but since the puck is 1, we cannot place ones in the
first column. This corresponds to the forced zero that we must place in the first
column, since the 1,1 entry is part of the matrix's main diagonal.
The algorithm will iterate over all possible choices, but to show each round,
we shall make a choice, say the 2nd and 6th columns. We then drop the state
vector by subtracting 1 from the 2nd and 6th values, and advance the puck:
(2 1 2 2 2 1); 2
For the second round, the puck is 2, so we cannot place a one in that column.
We choose to place ones in the 4th and 6th columns instead and advance the
puck:
(2 1 2 1 2 0); 3
Now at this point, we can place two ones anywhere but the 3rd and 6th
columns. At this stage the algorithm treats the possibilities differently: We
can place some ones before the puck (in the column indexes less than the puck
value), and/or some ones after the puck (in the column indexes greater than
the puck value). Before the puck, we can place a one where there is a 1, or
where there is a 2; after the puck, we can place a one in the 4th or 5th columns.
Suppose we place ones in the 4th and 5th columns. We drop the state vector
and advance the puck once more:
(2 1 2 0 1 0); 4
For the 4th round, we once again notice we can place some ones before the
puck, and/or some ones after.
Before the puck, we can place:
(a) two ones in columns of value 2 (1 choice)
(b) one one in a column of value 2 (2 choices)
(c) one one in the column of value 1 (1 choice)
(d) one one in a column of value 2 and one one in a column of value 1 (2
choices).
After we choose one of the options (a)-(d), we must multiply the listed
number of choices by one for each way to place any remaining ones to the right
of the puck.
So, for option (a), there is only one way to place the ones.
For option (b), there are two possible ways for each possible placement of
the remaining one to the right of the puck. Since there is only one nonzero value
remaining to the right of the puck, there are two ways total.
For option (c), there is one possible way for each possible placement of the
remaining one to the right of the puck. Again, since there is only one nonzero
value remaining, there is one way total.
For option (d), there are two possible ways to place the ones.
We choose option (a). We drop the state vector and advance the puck:
(1 1 1 0 1 0); 5
Since the puck is "covering" the 1 in the 5th column, we can only place
ones before the puck. There are (3 take 2) ways to place two ones in the three
columns of value 1, so we multiply 3 by the number of ways to get remaining
possibilities. After choosing the 1st and 3rd columns (though it doesn't matter
since we're left of the puck; any two of the three will do), we drop the state
vector and advance the puck one final time:
(0 1 0 0 1 0); 6
There is only one way to place the ones in this situation, so we terminate
with a count of 1. But we must take into account all the multiplications along
the way: 1*1*1*1*3*1 = 3.
Another way of thinking of the varying row is to start with the first matrix,
focus on the lower-left 2x3 submatrix, and note how many ways there were to
permute the columns of that submatrix. Since there are only 3 such ways, we
get 3 matrices.
What I think I understand
This algorithm counts all possible 6x6 arrays with two 1's in each row and column and none on the descending diagonal.
Instead of constructing the matrices it uses a "state_vector" filled with six 2's, representing how many 1's still have to be placed in each column, and a "puck" that represents the index of the diagonal and the current row as the algorithm iterates.
What I don't understand
The algorithm comes up with a value of 1 for each row except row 5, which is assigned a 3; at the end these values are multiplied for the final result. These values are supposed to be the possible placements for each row, but there are many possibilities for row 1, so why was it given a 1? And why did the algorithm wait until row 5 to count all the possible permutations?
Any help will be much appreciated!
I think what is going on is a tradeoff between doing combinatorics and doing recursion.
The algorithm is using recursion to add up all the counts for each choice of placing the 1's. The example considers a single choice at each stage, but to get the full count it needs to add the results for all possible choices.
Now it is quite possible to get the final answer simply using recursion all the way down. Every time we reach the bottom we just add 1 to the total count.
The normal next step is to cache the result of calling the recursive function as this greatly improves the speed. However, the memory use for such a dynamic programming approach depends on the number of states that need to be expanded.
The combinatorics in the later stages is making use of the fact that once the puck has passed a column, the exact arrangement of counts in the columns doesn't matter so you only need to evaluate one representative of each type and then add up the resulting counts multiplied by the number of equivalent ways.
This both reduces the memory use and improves the speed of the algorithm.
Note that you cannot use combinatorics for counts to the right of the puck, as for these the order of the counts is still important due to the restriction about the diagonal.
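For what it's worth, here is a minimal C++ sketch of that plain recursion (my own code, with no memoization and no combinatorial shortcut, so it is exponential and only practical for small n such as the 6x6 example; the names countRows and placeRow are mine):

#include <cstdint>
#include <iostream>
#include <vector>

std::uint64_t placeRow(int n, int t, int row, int col, int left,
                       std::vector<int>& remaining);

// Count the ways to complete the matrix once rows 0..row-1 have been filled.
std::uint64_t countRows(int n, int t, int row, std::vector<int>& remaining)
{
    if (row == n) return 1;                  // all rows placed; the columns are forced full
    return placeRow(n, t, row, 0, t, remaining);
}

// Choose `left` more columns for row `row`, scanning columns from `col` onward.
std::uint64_t placeRow(int n, int t, int row, int col, int left,
                       std::vector<int>& remaining)
{
    if (left == 0) return countRows(n, t, row + 1, remaining);
    if (col == n) return 0;                  // ran out of columns for this row
    std::uint64_t ways = placeRow(n, t, row, col + 1, left, remaining);   // leave this column empty
    if (col != row && remaining[col] > 0) {  // the diagonal entry must stay 0
        --remaining[col];                    // "drop the state vector"
        ways += placeRow(n, t, row, col + 1, left - 1, remaining);
        ++remaining[col];                    // undo for backtracking
    }
    return ways;
}

int main()
{
    int n = 6, t = 2;
    std::vector<int> remaining(n, t);        // the state vector (2 2 2 2 2 2)
    std::cout << countRows(n, t, 0, remaining) << "\n";   // total count for the 6x6, t = 2 case
}

One way to memoize this, consistent with the observation above, is to key the cache on the current row, the exact counts to the right of the puck, and only the multiset of counts to the left of it.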
P.S. You can actually compute the number of n*n matrices with 2 1's in each column (and no diagonal entries) with pure combinatorics as:
a(n) = Sum_{k=0..n} Sum_{s=0..k} Sum_{j=0..n-k} (-1)^(k+j-s)*n!*(n-k)!*(2n-k-2j-s)!/(s!*(k-s)!*(n-k-j)!^2*j!*2^(2n-2k-j))
According to OEIS.

Algorithm to find largest identical-row square in matrix

I have a matrix of 100x100 size and need to find the largest set of rows and columns that create a square having equal rows. Example:
    A B C D E F          C D E
a   0 1 2 3 4 5      a   2 3 4
b   2 9 7 9 8 2
c   9 0 6 8 9 7  ==>
d   8 9 2 3 4 8      d   2 3 4
e   7 2 2 3 4 5      e   2 3 4
f   0 3 6 8 7 2
Currently I am using this algorithm:
candidates = []    // element type is {rows, cols}
foreach row
    foreach col
        candidates[] = {[row], [col]}
do
    retval = candidates.first
    newCandidates = []
    foreach candidates as candidate
        foreach newRow > candidate.rows.max
            foreach newCol > candidate.cols.max
                // compare matrix cells in candidate to newRow and newCol
                if (newCandidateHasEqualRows)
                    newCandidates[] = {candidate.rows + newRow, candidate.cols + newCol}
    candidates = newCandidates
while candidates.count
return retval
Has anyone else come across a problem similar to this? And is there a better algorithm to solve it?
Here's the NP-hardness reduction I mentioned, from biclique. Given a bipartite graph, make a matrix with a row for each vertex in part A and a column for each vertex in part B. For every edge that is present, put a 0 in the corresponding matrix entry. Put a unique positive integer for each other matrix entry. For all s > 1, there is a Ks,s subgraph if and only if there is a square of size s (which necessarily is all zero).
Given a fixed set of rows, the optimal set of columns is easily determined. You could try the Apriori algorithm on sets of rows, where a set of rows is considered frequent iff there exist at least as many columns that, together with those rows, form a valid square.
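As a concrete illustration of that first sentence (my own sketch, not part of either answer): for a fixed set of rows, the best possible column set is simply every column on which those rows all agree.

#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Returns the indices of all columns whose entries are identical across `rows`.
std::vector<int> identicalColumns(const Matrix& m, const std::vector<int>& rows)
{
    std::vector<int> cols;
    if (rows.empty()) return cols;
    int nCols = static_cast<int>(m[rows[0]].size());
    for (int c = 0; c < nCols; ++c) {
        bool same = true;
        for (int r : rows)
            if (m[r][c] != m[rows[0]][c]) { same = false; break; }
        if (same) cols.push_back(c);
    }
    return cols;
}

The answer is then the largest row set R for which identicalColumns returns at least |R| columns.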
I've implemented a branch and bound solver for this problem in C++ at http://pastebin.com/J1ipWs5b. To my surprise, it actually solves randomly-generated puzzles of size up to 100x100 quite quickly: on one problem with each matrix cell chosen randomly from 0-9, an optimal 4x4 solution is found in about 750ms on my old laptop. As the range of cell entries is reduced down to just 0-1, the solution times get drastically longer -- but still, at 157s (for the one problem I tried, which had an 8x8 optimal solution), this isn't terrible. It seems to be very sensitive to the size of the optimal solution.
At any point in time, we have a partial solution consisting of a set of rows that are definitely included, and a set of rows that are definitely excluded. (The inclusion status of the remaining rows is yet to be determined.) First, we pick a remaining row to "try". We try including the row; then (if necessary; see below) we try excluding it. "Trying" here means recursively solving the corresponding subproblem. We record the set of columns that are identical across all rows that are definitely included in the solution. As rows are added to the partial solution, this set of columns can only shrink.
There are a couple of improvements beyond the standard B&B idea of pruning the search when we determine that we can't develop the current partial solution into a better (i.e. larger) complete solution than some complete solution we have already found:
A dominance rule. If there are any rows that can be added to the current partial solution without shrinking the set of identical columns at all, then we can safely add them immediately, and we never have to consider not adding them. This potentially saves a lot of branching, especially if there are many similar rows in the input.
We can reorder the remaining (not definitely included or definitely excluded) rows arbitrarily. So in particular, we can always pick as the next row to consider the row that would most shrink the set of identical columns: this (perhaps counterintuitive) strategy has the effect of eliminating bad combinations of rows near the top of the search tree, which speeds up the search a lot. It also happens to complement the dominance rule above, because it means that if there are ever two rows X and Y such that X preserves a strict subset of the identical columns that Y preserves, then X will be added to the solution first, which in turn means that whenever X is included, Y will be forced in by the dominance rule and we don't need to consider the possibility of including X but excluding Y.
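To make the dominance rule concrete, here is a tiny self-contained sketch (my own code, not taken from the pastebin solver): a remaining row can be added for free exactly when it agrees with an already-included row on every column that is still in the identical set.

#include <vector>

using Matrix = std::vector<std::vector<int>>;

// True iff adding candidateRow would not shrink the current set of identical
// columns, i.e. it matches an already-included row on all of those columns.
bool addableWithoutShrinking(const Matrix& m, int includedRow, int candidateRow,
                             const std::vector<int>& identicalCols)
{
    for (int c : identicalCols)
        if (m[candidateRow][c] != m[includedRow][c]) return false;
    return true;
}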

Generate multiple sequences of numbers with unique values at each index

I have a row with numbers 1:n. I'm looking to add a second row also with the numbers 1:n but these should be in a random order while satisfying the following:
No positions have the same number in both rows
No combination of numbers occurs twice
For example, in the following
Row 1: 1 2 3 4 5 6 7 ...
Row 2: 3 6 15 8 13 12 7 ...
the number 7 occurs at the same position in both rows 1 and 2 (namely position 7; thereby not satisfying rule 1)
while in the following
Row 1: 1 2 3 4 5 6 7 ...
Row 2: 3 7 15 8 13 12 2 ...
the combination of 2+7 appears twice (in positions 2 and 7; thereby not satisfying rule 2).
It would perhaps be possible – but unnecessarily time-consuming – to do this by hand (at least up until a reasonable number), but there must be quite an elegant solution for this in MATLAB.
This kind of permutation is called a derangement (a permutation with no fixed points).
Use the function randperm to generate a random permutation of your data:
x = [1 2 3 4 5 6 7];
y = x(randperm(numel(x)));
Then check that the permutation is legal. If not, draw again and again.
A random permutation is a derangement with probability about 1/e ≈ 0.37, and it satisfies both of your conditions (no fixed points and no 2-cycles) with probability about e^(-3/2) ≈ 0.22, which means you need roughly 4-5 tries on average until you find one.
Therefore you will find the answer really quickly.
Alternatively, you can use this algorithm to create a random derangement.
Edit
If you want to allow only cycles of size > 2, this is a generalization of the problem. The probability of success in that case is smaller, but still large enough to find a valid permutation within a small expected number of attempts, so the same approach is still valid.
This is fairly straightforward. Create a random permutation of the nodes and interpret the list as a random walk around the nodes: if node 'b' appears right after node 'a', then node 'b' goes below node 'a' in the two rows.
So if your initial random permutation is
3 2 5 1 4
Then the walk in this case is 3 -> 2 -> 5 -> 1 -> 4 (and back to 3), and you create the rows as follows:
Row 1: 1 2 3 4 5
Row 2: 4 5 2 3 1
This random walk will satisfy both conditions.
But do you wish to allow more than one cycle in your network? I know you don't want two people to have each other's hat. But what about 7 people, where 3 of them have each other's hats and the other 4 have each other's hats? Is this acceptable and/or desirable?
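A small C++ sketch of this construction (my own code; the name singleCycleDerangement is made up). As the question just above points out, it only ever produces one big cycle, so it does not cover arrangements made of several smaller cycles:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

// Shuffle 0..n-1, read the result as a cyclic walk, and map every node to its
// successor in the walk. The result is a single n-cycle, hence no fixed points
// and (for n > 2) no 2-cycles.
std::vector<int> singleCycleDerangement(int n, std::mt19937& rng)
{
    std::vector<int> walk(n);
    std::iota(walk.begin(), walk.end(), 0);
    std::shuffle(walk.begin(), walk.end(), rng);
    std::vector<int> next(n);
    for (int i = 0; i < n; ++i)
        next[walk[i]] = walk[(i + 1) % n];   // successor in the cyclic walk
    return next;                             // next[i] is row 2's value at position i (0-based)
}

int main()
{
    std::mt19937 rng(std::random_device{}());
    for (int v : singleCycleDerangement(7, rng)) std::cout << v + 1 << ' ';
    std::cout << '\n';                       // e.g. one 7-cycle over the numbers 1..7
}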
Andrey has already pointed you to randperm and the rejection-sampling-like approach. After generating a permutation p, an easy way to check whether it has a fixed point is any(p==1:n). An easy way to check whether it contains cycles of length 2 is any(p(p)==1:n).
So this gets permutations p of 1:n fulfilling your requirements:
p=[];
while (isempty(p))
    p=randperm(n);
    if any(p==1:n), p=[];
    elseif any(p(p)==1:n), p=[];
    end
end
Wrapping this in a for loop and counting the iterations of the while loop for each run, it seems that one needs to generate on average 4.5 permutations for every "valid" one (and 6.2 if cycles of length three are not allowed either). Very interesting.

Arranging groups of people optimally

I have this homework assignment that I think I managed to solve, but I am not entirely sure, as I cannot prove my solution. I would like comments on what I did, on its correctness, and on whether or not there's a better solution.
The problem is as follows: we have N groups of people, where group i has g[i] people in it. We want to seat these people in two rows of S seats each, such that each group is either placed in a single row, in a contiguous sequence of seats, OR, if the group has an even number of members, split in half across the two rows, with the condition that the two halves must form a rectangle (they must occupy the same seat indices on both rows). What is the minimum number of seats S needed so that nobody is standing up?
Example: groups are 4 11. Minimum S is 11. We put all 4 in one row, and the 11 on the second row. Another: groups are 6 2. We split the 6 on two rows, and also the two. Minimum is therefore 4 seats.
This is what I'm thinking:
Calculate T = (sum of all groups + 1) / 2, using integer division. Store the group sizes in an array, but split every even value x into two values of x / 2 each, so 4 5 becomes 2 2 5. Now run subset sum on this vector and find the minimum value greater than or equal to T that can be formed. That value is the minimum number of seats per row needed.
Example: 4 11 => 2 2 11 => T = (15 + 1) / 2 = 8. Minimum we can form from 2 2 11 that's >= 8 is 11, so that's the answer.
This seems to work, at least I can't find any counter example. I don't have a proof though. Intuitively, it seems to always be possible to arrange the people under the required conditions with the number of seats supplied by this algorithm.
Any hints are appreciated.
I think your solution is correct. The minimum number of seats per row in an optimal distribution is at least your T (which is mathematically obvious).
Splitting even numbers is also correct, since they have two possible arrangements; by logically putting all the "rectangular" groups of people on one end of the seat rows you can also guarantee that they will always form a proper rectangle, so that this condition is met as well.
So the question boils down to finding a sum equal or as close as possible to T (e.g. partition problem).
Minor nit: I'm not sure the proposed solution above works in the edge case where every group has 0 members, because the numerator in T = (SUM ALL + 1) / 2 is always positive, so there will never be a subset sum that is greater than or equal to T.
To get around this, maybe a modulus operation might work. We know that we need at least n seats in a row if n is the maximal odd term, so maybe the equation should include a max(n * (n % 2)) term; it will come out to max(odd) or 0. Since the maximal odd term is always added to S, I think this is safe (stated boldly without proof...).
Then we want to know if we should split the even terms or not. Here's where the subset sum approach might work, but with T simply equal to SUM ALL / 2.
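For reference, a short sketch of the subset-sum check described in the question (my own code; minRowSeats is a made-up name), using the question's T = (sum + 1) / 2 and a simple reachability DP:

#include <iostream>
#include <numeric>
#include <vector>

int minRowSeats(std::vector<int> groups)
{
    // Split every even group in half; odd groups stay whole.
    std::vector<int> parts;
    for (int g : groups) {
        if (g <= 0) continue;
        if (g % 2 == 0) { parts.push_back(g / 2); parts.push_back(g / 2); }
        else            { parts.push_back(g); }
    }
    int total = std::accumulate(parts.begin(), parts.end(), 0);
    int T = (total + 1) / 2;

    std::vector<char> reachable(total + 1, 0);   // reachable[s]: some subset of parts sums to s
    reachable[0] = 1;
    for (int p : parts)
        for (int s = total; s >= p; --s)
            if (reachable[s - p]) reachable[s] = 1;

    for (int s = T; s <= total; ++s)             // smallest achievable sum that is >= T
        if (reachable[s]) return s;
    return 0;                                    // not reached: taking all parts gives total >= T
}

int main()
{
    std::cout << minRowSeats({4, 11}) << "\n";   // 11, as in the first example
    std::cout << minRowSeats({6, 2})  << "\n";   // 4, as in the second example
}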

Algorithm: efficient way to search an integer in a two dimensional integer array? [duplicate]

Possible Duplicates:
Given a 2d array sorted in increasing order from left to right and top to bottom, what is the best way to search for a target number?
Search a sorted 2D matrix
A time efficient program to find an element in a two dimensional matrix, the rows and columns of which are increasing monotonically. (Rows and columns are increasing from top to bottom and from left to right).
I can only think of binary search, if the 2D array was sorted.
I posed this problem as homework last semester, and two students, whom I had considered to be average, surprised me by coming up with a very elegant, straightforward, and (probably) optimal algorithm:
Find(k, tab, x, y)
    let m = tab[x][y]
    if k = m then return "Found"
    else if k > m then
        return Find(k, tab, x, y + 1)
    else
        return Find(k, tab, x - 1, y)
This algorithm eliminates either one line or one column at every call (note that it is tail recursive and could be transformed into a loop, thereby avoiding the recursive calls). Thus, if your matrix is n*m, the algorithm performs in O(n+m). This solution is better than the dichotomic-search spin-off (which is the solution I expected when handing out this problem).
EDIT: I fixed a typo (k had become x in the recursive calls), and also, as Chris pointed out, this should initially be called with the "upper right" corner, that is Find(k, tab, n, 1), where n is the number of lines.
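For completeness, a minimal iterative C++ version of the same staircase search (my own sketch, not the students' code); it starts at a corner where one move can only increase the value and the other can only decrease it, and it also handles the case where the value is absent, which the pseudocode above leaves open:

#include <vector>

bool find(const std::vector<std::vector<int>>& tab, int k)
{
    if (tab.empty() || tab[0].empty()) return false;
    int row = 0;
    int col = static_cast<int>(tab[0].size()) - 1;   // start at the top-right corner
    while (row < static_cast<int>(tab.size()) && col >= 0) {
        int m = tab[row][col];
        if (k == m) return true;
        if (k > m)  ++row;                           // everything left in this row is too small
        else        --col;                           // everything below in this column is too big
    }
    return false;                                    // walked off the matrix: k is not present
}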
Since the rows and columns are increasing monotonically, you can do a neat little search like this:
Start at the bottom left. If the element you are looking for is greater than the element at that location, go right. If it is less go up. Repeat until you find the element or you hit an edge. Example (in hex to make formatting easier):
1 2 5 6 7
3 4 6 7 8
5 7 8 9 A
7 A C D E
Let's search for 8. Start at position (0, 3): 7. 8 > 7 so we go right. We are now at (1, 3): A. 8 < A so we go up. At (1, 2): 7, 8 > 7 so we go right. (2, 2): 8 -> 8 == 8 so we are done.
You'll notice, however, that this has only found one of the elements whose value is 8.
Edit: in case it wasn't obvious, this runs in O(n + m) average and worst-case time.
Assuming I read that right, you are saying that the bottom of row n is always less than the top of row n+1. If that is the case, then I'd say the simplest way is to search the first row using a binary search for either the number or the next smallest number. Then you will have identified the column it is in. Then do a binary search of that column until you find it.
Start at (0,0)
while the value is too low, continue to the right (0,1), then (0,2) etc.
when reaching a value too high, go down one and left one (1,1)
Repeating those steps should bring you to the target.
