Puzzle: Two Adjacent Cells of Opposite Colors - algorithm

I have been stuck on this question from Coursera AlgorithmicBox. The quiz question is below. I have no idea how to approach the solution :)
You are given 20 black and white cells. The leftmost one is white, the
rightmost one is black, the colors of all other cells are hidden. You
can reveal the color of a cell by clicking on it. Your goal is to find
two adjacent cells of different colors by using at most 5 clicks.

Since the leftmost cell is white and the rightmost cell is black, we know there is a transition at some point. Click on the 10th cell: if it's white, discard the cells to its left; if it's black, discard the cells to its right. We are left with 11 (or 10) cells whose left end is white and whose right end is black. Repeat this using the middle cell (or one of the two middle cells when the count is even): after the second click at most 6 cells remain, after the third at most 4, and after the fourth at most 3. A fifth click on the middle of those 3 cells lets us see all of them. Since we know that the left and right cells are different, it's either the first two or the last two that are different and adjacent.
We used at most 5 clicks (a lucky sequence of answers can finish in 4, but the worst case needs all 5).
In general this is similar to binary search, with similar guarantees: we always check the floor(n/2)-th cell (or the ceiling, it doesn't matter).
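The halving strategy above can be sketched in Python. This is only an illustration: the name find_transition and the reveal callback (which stands in for clicking a cell and returns 'W' or 'B') are made up for the example, and cells are indexed from 0.

```python
def find_transition(reveal, n):
    """Find i such that cells i and i+1 have different colors.

    Assumes cell 0 is white and cell n-1 is black (they are never clicked).
    `reveal(k)` is a hypothetical callback returning 'W' or 'B' for cell k.
    Returns (i, number_of_clicks).
    """
    lo, hi = 0, n - 1          # invariant: cell lo is white, cell hi is black
    clicks = 0
    while hi - lo > 1:         # stop once lo and hi are adjacent
        mid = (lo + hi) // 2
        clicks += 1
        if reveal(mid) == 'W':
            lo = mid           # the transition lies to the right of mid
        else:
            hi = mid           # the transition lies to the left of mid
    return lo, clicks
```

With n = 20 the distance between lo and hi shrinks from 19 to at most 10, 5, 3, 2 and then 1, so the loop never runs more than 5 times.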

Related

How to solve Kakurasu Puzzle in an efficient time?

I'm trying to write code in Java that finds a solution to an instance of a Kakurasu puzzle efficiently. I was thinking about using dynamic programming but couldn't figure out how to do that.
I describe here an algorithm to find the solutions of a kakurasu. A solver implementation in the Go programming language can be found here.
A kakurasu is an n rows by m columns grid puzzle. The goal of the puzzle is to determine the black or white color of all cells by using the sums of the weights of the black cells in all rows and columns. The weight is a number from 1 to n.
The image below shows a solved kakurasu. The numbers on the top and left sides are the row and column weights. The numbers on the right and at the bottom are the sums of the weights of the black cells.
Each sum has a limited set of possible weight combinations. From this set of possible solutions we can deduce that some cells must be white and others must be black, because they are respectively white or black in all solutions. The image below illustrates the deduction we can make from the sum 9. A cell is shown in grey when its color is left unknown by the deduction.
Once we have deduced the color of a cell, we can prune solutions with an incompatible color from the row or column containing the cell. By repeating the deduction and pruning operations, we can deduce the color of the grid cells. This deduction process ends when the color of all cells has been determined, or when no new deductions can be made. In the latter case we are left with cells of unknown color. This means that there are multiple solutions, in which the cells of unknown color can be black or white.
To find the different solutions, we solve once by assigning the color white to a cell of unknown color, and again by assigning the color black to that cell. This can be repeated as needed until the color of all cells has been determined for all solutions.
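The per-line deduction described above (enumerate the weight combinations for a sum, then keep what is common to all of them) can be sketched as follows. This is not the Go solver mentioned earlier, just a small Python illustration; forced_cells is a made-up name, and it assumes the weights along the line are 1 to n.

```python
from itertools import combinations

def forced_cells(n, target):
    """Deduce forced colors for one line with weights 1..n and a given sum.

    Returns (black, white): the weights that are black in every combination
    and the weights that are white in every combination. Returns None when
    no combination of weights reaches the target.
    """
    weights = range(1, n + 1)
    solutions = [set(combo)
                 for size in range(n + 1)
                 for combo in combinations(weights, size)
                 if sum(combo) == target]
    if not solutions:
        return None
    black = set.intersection(*solutions)           # black in all solutions
    white = set(weights) - set.union(*solutions)   # black in no solution
    return black, white
```

For example, with weights 1 to 5 and a sum of 12 the only combinations are {3, 4, 5} and {1, 2, 4, 5}, so cells 4 and 5 must be black and nothing is forced white. Pruning would then amount to dropping the combinations that contradict a color deduced from the crossing line.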

minimum number of rectangular regions to fill a grid

Suppose we have a grid and we want to paint rectangular regions on it using the smallest number of colors possible, one for each region.
There are some cells that are already painted black and cannot be painted over:
Is there a polynomial algorithm to solve this problem?
After testing, I found out that the solution for this case is 9 (because we need 9 different colors to paint the minimum number of regions to fill the whole grid):
The greedy approach seems to work well: just search for the rectangle with biggest (white) area and paint it, repeating this until there's nothing else to be painted, but I didn't measure the complexity or the correctness.
Here are a few observations that can simplify this problem in specific cases. First of all, adjacent identical rows and columns can be reduced to one row or column without changing the required number of regions, to form a simplified grid:
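A minimal sketch of this first reduction, assuming the grid is given as a list of equal-length strings with '#' for a black cell and '.' for an uncoloured one (that representation and the function names are only chosen for the example):

```python
def collapse_adjacent(lines):
    """Drop every line that is identical to the line just before it."""
    kept = []
    for line in lines:
        if not kept or kept[-1] != line:
            kept.append(line)
    return kept

def simplify(grid):
    """Collapse identical adjacent rows, then identical adjacent columns."""
    rows = collapse_adjacent(grid)
    # Transpose so that columns can be collapsed with the same helper.
    columns = collapse_adjacent(["".join(col) for col in zip(*rows)])
    # Transpose back to get rows again.
    return ["".join(row) for row in zip(*columns)]
```

For instance, simplify(["..#.", "..#.", "...."]) gives [".#.", "..."]: the duplicate first row and the duplicate first column are each merged into their neighbour.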
A simplified grid where no row or column is divided into more than two uncoloured parts (i.e. has two or more separate black cells) has an optimal solution which can be found by using the rows or columns as regions (depending on whether the width or height of the grid is greater):
The number of regions is then minimum(width, height) + number of black cells.
If a border row or column in a simplified grid contains no black cells, then using it as a region is always the optimal solution; adding some parts of it to other regions would require at least one additional region to be made in the border row or column (depending on the number of black cells in the adjacent row or column):
This means that the grid can be further simplified by removing border rows and columns with no black cells, and adding the number of removed regions to the region count:
Similarly, if one or more border cells are isolated by a black cell in the adjacent row or column, all the connected uncoloured neighbouring cells can be regarded as one region:
At each point you can go back to previous rules; e.g. after the right- and left-most columns have been turned into regions in the example above, we are left with the grid below, which can be simplified with the first rule, because the bottom two rows are identical:
Collapsing identical adjacent rows or columns can also be applied locally to isolated parts of the grid. The example below has no identical adjacent rows, but the center part is isolated, so there rows 3 to 6 can be collapsed:
On the left, rows 3 and 4 can be collapsed locally, and on the right rows 5 and 6, so we end up with the situation in the third image above. These collapsed cells then act as one.
Once you can't find any further simplifications using the rules above, and you want to check every possible division of (part of) a grid, a first step could be to list the maximum rectangle sizes that can be made with the corresponding cell as their top left corner; for the simplified 6x7 grid in the first example above that would be:
       COL.1            COL.2       COL.3       COL.4       COL.5  COL.6
ROW 1  [6x1, 3x3, 1x7]  [5x1, 2x3]  [4x1, 1x7]  [3x1]       [2x5]  [1x7]
ROW 2  [3x2, 1x6]       [2x2]       [1x6]       []          [2x4]  [1x6]
ROW 3  [6x1, 1x5]       [5x1]       [4x3, 2x5]  [3x3, 1x5]  [2x3]  [1x5]
ROW 4  [1x4]            []          [4x2, 2x4]  [3x2, 1x4]  [2x2]  [1x4]
ROW 5  [6x1, 4x3]       [5x1, 3x3]  [4x1, 2x3]  [3x1, 1x3]  [2x1]  [1x3]
ROW 6  [4x2]            [3x2]       [2x2]       [1x2]       []     [1x2]
ROW 7  [6x1]            [5x1]       [4x1]       [3x1]       [2x1]  [1x1]
You can then use these maximum sizes to generate every option for each cell; e.g. for cell (1,1) they would be:
6x1, 5x1, 4x1, 3x3, 3x2, 3x1, 2x3, 2x2, 2x1, 1x7, 1x6, 1x5, 1x4, 1x3, 1x2, 1x1
(Some rectangle sizes in the list can be skipped; e.g. it never makes sense to use the 3x1-sized region without adding the fourth isolated cell to get 4x1.)
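Generating the options for a cell from its maximum rectangle sizes can be sketched like this; rectangle_options is a made-up name, sizes are (width, height) pairs, and no pruning of pointless sizes is attempted:

```python
def rectangle_options(maximal):
    """List every rectangle size that fits inside at least one maximal size.

    `maximal` is a list of (width, height) pairs for one top-left cell.
    The result is sorted widest first, then tallest first.
    """
    options = set()
    for max_w, max_h in maximal:
        for w in range(1, max_w + 1):
            for h in range(1, max_h + 1):
                options.add((w, h))
    return sorted(options, reverse=True)
```

Calling rectangle_options([(6, 1), (3, 3), (1, 7)]) reproduces the 16 sizes listed above for cell (1,1), in the same order.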
After choosing an option, you would skip the cells which are covered by the rectangle you've chosen and try each option for the next cell, and so on...
Running this on large grids will lead to huge numbers of options. However, at each point you can go back to checking whether the simplification rules can help.
To see that a greedy algorithm, which selects the largest rectangles first, cannot guarantee an optimal solution, consider the example below. Selecting the 2x2 square in the middle would lead to a solution with 5 regions, while several solutions with only 4 regions exist.

How to move an element in a matrix r to one of its adjacent cell (without knowing the element's location in the grid in advance)?

I created a 10 x 10 matrix which was originally filled up with empty spaces. Then I placed 25 characters 'a' in random cells on the grid. Now I need to move each one of them to one of the two to four adjacent cells (consider the 'a' next to the edge or in the corner). If the chosen destination was occupied by another 'a', it should stay where it is.
Here is my approach to the problem (the code itself looks lengthy now):
break the 10✕10 grid into 9 possible situations (the 8✕8 middle, the four corners and the four edges of the grid)
for each of the cases above, I check each one of the two to four adjacent cells. If it contains an empty space, I swap 'a' with it.
However, this results in a huge ugly piece of code. So, I wanted to ask: (1) is there a way that I can do the boundary check without breaking the grid into 9 different situations? (2) How can I choose the destination for each of the 25 'a' RANDOMLY?
Thank you so much in advance!!!

How can I find hole in a 2D matrix?

I know the title seems kind of ambiguous, so I've attached an image which should help to understand the problem clearly. I need to find holes inside the white region. A hole is defined as one or more cells with value '0' inside the white region; that is, it has to be fully enclosed by cells with value '1' (e.g. here we can see three holes marked as 1, 2 and 3). I've come up with a pretty naive solution:
1. Search the whole matrix for cells with value '0'
2. Run a DFS(Flood-Fill) when such a cell (black one) is encountered and check whether we can touch the boundary of the main rectangular region
3. If we can touch boundary during DFS then it's not a hole and if we can't reach boundary then it'll be considered as a hole
Now, this solution works but I was wondering if there's any other efficient/fast solution for this problem.
Please let me know your thoughts. Thanks.
With floodfill, which you already have: run along the BORDER of your matrix and floodfill it, i.e.,
change all zeroes (black) to 2 (filled black) and ones to 3 (filled white); ignore 2 and 3's that come from an earlier floodfill.
For example with your matrix, you start from the upper left, and floodfill black a zone with area 11. Then you move right, and find a black cell that you just filled. Move right again and find a white area, very large (actually all the white in your matrix). Floodfill it. Then you move right again, another fresh black area that runs along the whole upper and right borders. Moving around, you now find two white cells that you filled earlier and skip them. And finally you find the black area along the bottom border.
Counting the number of colours you found and set might already tell you whether there are holes in the matrix.
Otherwise, or to find where they are, scan the matrix: all areas you find that are still of color 0 are holes in the black. You might also have holes in the white.
Another method, sort of "arrested flood fill"
Run all around the border of the first matrix. Where you find "0", you set it to "2". Where you find "1", you set it to "3".
Now run around the new inner border (those cells that touch the border you have just scanned).
Zero cells touching 2's become 2, 1 cells touching 3 become 3.
You will have to scan twice, once clockwise, once counterclockwise, checking the cells "outwards" and "before" the current cell. That is because you might find something like this:
22222222222333333
2AB11111111C
31
Cell A is actually 1. You examine its neighbours and you find 1 (but it's useless to check that since you haven't processed it yet, so you can't know if it's a 1 or should be a 3 - which is the case, by the way), 2 and 2. A 2 can't change a 1, so cell A remains 1. The same goes with cell B which is again a 1, and so on. When you arrive at cell C, you discover that it is a 1, and has a 3 neighbour, so it toggles to 3... but all the cells from A to C should now toggle.
The simplest, albeit not most efficient, way to deal with this is to scan the cells clockwise, which gives you the wrong answer (C and D are 1's, by the way)
22222222222333333
211111111DC333333
33
and then scan them again counterclockwise. Now when you arrive at cell C, it has a 3-neighbour and toggles to 3. Next you inspect cell D, whose previous-neighbour is C, which is now 3, so D toggles to 3 as well. In the end you get the correct answer
22222222222333333
23333333333333333
33
and for each cell you examined two neighbours going clockwise, one going counterclockwise. Moreover, one of the neighbours is actually the cell you checked just before, so you can keep it in a ready variable and save one matrix access.
If you find that you scanned a whole border without even once toggling a single cell, you can halt the procedure. Checking this will cost you 2(W*H) operations, so it is only really worthwhile if there are lots of holes.
In at most W*H*2 steps, you should be done.
You might also want to check the Percolation Algorithm and try to adapt that one.
Make some sort of a "LinkedCells" class that will store cells that are linked with each other. Then check the cells one by one, from left to right and from top to bottom, making the following check for each cell: if its neighbouring cell is black, add this cell to that cell's group. Otherwise create a new group for this cell. You only need to check the top and left neighbours.
UPD: Sorry, I forgot about merging groups: if both neighbouring cells are black and are from different groups, you should merge the groups into one.
Your "LinkedCells" class should have a flag if it is connected to the edge. It is false by default and can be changed to true if you add edge cell to this group. In case of merging two groups you should set new flag as a || of previous flags.
In the end you will have a set of groups and each group having false connection flag will be "hole".
This algorithm will be O(x*y).
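Here is a rough sketch of that grouping idea in Python, using a union-find structure in place of the "LinkedCells" class. It assumes the grid is a list of lists of 0 (black) and 1 (white) and that cells are linked only through their sides; count_holes and the helper names are made up for the example.

```python
def count_holes(grid):
    """Count groups of 0-cells that never touch the edge of the grid."""
    h, w = len(grid), len(grid[0])
    parent = {}        # group representative for each black cell
    touches_edge = {}  # per-group flag, stored on the representative

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
            # Merged flag is the OR of the two previous flags.
            touches_edge[ra] = touches_edge[ra] or touches_edge[rb]

    for r in range(h):
        for c in range(w):
            if grid[r][c] != 0:
                continue
            cell = (r, c)
            parent[cell] = cell
            touches_edge[cell] = r in (0, h - 1) or c in (0, w - 1)
            # Only the top and left neighbours have been visited so far.
            if r > 0 and grid[r - 1][c] == 0:
                union((r - 1, c), cell)
            if c > 0 and grid[r][c - 1] == 0:
                union((r, c - 1), cell)

    roots = {find(cell) for cell in parent}
    return sum(1 for root in roots if not touches_edge[root])
```

With a union-find the running time is near-linear in the number of cells rather than strictly O(x*y), which is usually close enough in practice.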
You can represent the grid as a graph with individual cells as vertexes and edges occurring between adjacent vertexes. Then you can use Breadth First Search or Depth First Search to start at each of the cells, on the sides. As you will only find the components connected to the sides, the black cells which have not been visited are the holes. You can use the search algorithm again to divide the holes into distinct components.
EDIT: The worst-case complexity must be at least linear in the number of cells. Otherwise, give the algorithm some input, check which cells it hasn't looked into (as it is sublinear, there will be big unvisited spots) and put a hole in there. Now you've got an input for which the algorithm doesn't find one of the holes.
Your algorithm is basically OK. It's just a matter of optimizing it by merging the flood fill exploration with the cell scanning. This will just minimize tests.
The general idea is to perform the flood fill exploration line by line while scanning the table. So you'll have multiple parallel flood fills that you have to keep track of.
The table is then processed row by row from top to bottom, and each row processed from right to left. The order is arbitrary, could be reverse if you prefer.
Let segments identify a sequence of consecutive cells with value 0 in a row. You only need the index of the first and last cell with value 0 to define a segment.
As you may guess a segment is also a flood fill in progress. So we'll add an identification number to the segments to distinguish between the different flood fills.
The nice thing of this algorithm is that you only need to keep track of segments and their identification number in row i and i-1. So that when you process row i, you have the list of segments found in the row i-1 and their associated identification number.
You then have to process segment connection in row i and row i-1. I'll explain below how this can be made efficient.
For now you have to consider three cases:
1. Found a segment in row i not connected to a segment in row i-1. Assign it a new hole identification (an incremented integer). If it's connected to the border of the table, make this number negative.
2. Found a segment in row i-1 not connected to a segment in row i. You found the lowest segment of a hole. If it has a negative identification number it is connected to the border and you can ignore it. Otherwise, congratulations, you found a hole.
3. Found a segment in row i connected to one or more segments in row i-1. Set the identification number of all these connected segments to the smallest identification number. See the following possible use case.
row i-1: 2 333 444 111
row i : **** *** ***
The segments in row i should all get the value 1 identifying the same flood fill.
Matching segments in rows i and row i-1 can be done efficiently by keeping them in order from left to right and comparing segments indexes.
Process segments by lowest start index first. Then check if the segment is connected to the segment with the lowest start index of the other row. If not, process case 1 or 2. Otherwise continue identifying connected segments, keeping track of the smallest identification number. When no more connected segments are found, set the identification number of all connected segments found in row i to the smallest identification value.
The index comparison for the connectivity test can be optimized by storing (first-1,last) as the segment definition, since segments may be connected by their corners. You can then directly compare the bare index values and detect overlapping segments.
The rule to pick the smallest identification number ensures that you automatically get the negative number for connected segments and at least one connected to the border. It propagates to other segments and flood fills.
This is a nice exercise to program. You didn't specify the exact output you need. So this is also left as exercise.
The brute-force algorithm described here is as follows.
We now assume we can write in cells a value different from 0 or 1.
You need a flood fill function receiving the coordinates of a cell to start from and an integer value to write into all connected cells holding the value 0.
Since you only need to consider holes (cells with value 0 surrounded by cells with value 1), you have to use two passes.
The first pass visits only cells touching the border. For every cell containing the value 0, you do a flood fill with the value -1. This tells you that this cell has a value different from 1 and has a connection to the border. After this scan, all cells with a value 0 belong to one or more holes.
To distinguish between different holes, you need the second scan. You then scan the remaining cells in the rectangle (1,1)x(n-2,n-2) that you didn't scan yet. Whenever your scan hits a cell with value 0, you have discovered a new hole. You then flood fill this hole with the integer of your choice to distinguish it from the others. After that you proceed with the scan until all cells have been visited.
When done, you may replace the values -1 with 0 because there shouldn't be any 0 left.
This algorithm works, but is not as efficient as the other algorithm I propose. Its advantage is that it's simple and doesn't need an extra data storage to hold the segments, hole identification and eventual segment chaining reference.
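The two passes can be sketched in Python as follows. This is a sketch under a few assumptions that the description leaves open: the grid is a list of lists of 0 and 1, cells are connected through their sides only, holes are numbered from 2 upwards, and the input is copied rather than modified. The name label_holes is made up.

```python
def label_holes(grid):
    """Return (labelled_grid, hole_count); each hole gets its own integer >= 2."""
    h, w = len(grid), len(grid[0])
    g = [row[:] for row in grid]          # work on a copy

    def flood_fill(r, c, value):
        # Iterative fill of all 0-cells connected to (r, c).
        stack = [(r, c)]
        while stack:
            y, x = stack.pop()
            if 0 <= y < h and 0 <= x < w and g[y][x] == 0:
                g[y][x] = value
                stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])

    # Pass 1: zeros reachable from the border are not holes; mark them -1.
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and g[r][c] == 0:
                flood_fill(r, c, -1)

    # Pass 2: every remaining 0 starts a new hole.
    next_label = 2
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            if g[r][c] == 0:
                flood_fill(r, c, next_label)
                next_label += 1

    # Restore the border-connected zeros.
    for row in g:
        for x, value in enumerate(row):
            if value == -1:
                row[x] = 0
    return g, next_label - 2
```

An explicit stack is used instead of recursion so that large regions do not hit Python's recursion limit.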

Is there an algorithm to determine contiguous colored regions in a grid?

Given a basic grid (like a piece of graph paper), where each cell has been randomly filled in with one of n colors, is there a tried and true algorithm out there that can tell me what contiguous regions (groups of cells of the same color that are joined at the side) there are? Let's say n is something reasonable, like 5.
I have some ideas, but they all feel horribly inefficient.
The best possible algorithm is O(number of cells), and is not related to the number of colors.
This can be achieved by iterating through the cells, and every time you visit one that has not been marked as visited, do a graph traversal to find all the contiguous cells in that region, and then continue iterating.
Edit:
Here's a simple pseudo code example of a depth first search, which is an easy to implement graph traversal:
function visit(cell) {
    if cell.marked return
    cell.marked = true
    foreach neighbor in cell.neighbors {
        if cell.color == neighbor.color {
            visit(neighbor)
        }
    }
}
In addition to recursive's recursive answer, you can use a stack if recursion is too slow:
function visit(cell) {
    stack = new stack
    stack.push cell
    while not stack.empty {
        cell = stack.pop
        if cell.marked continue
        cell.marked = true
        foreach neighbor in cell.neighbors {
            if cell.color == neighbor.color {
                stack.push neighbor
            }
        }
    }
}
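For reference, a runnable Python version of the stack-based traversal, combined with the outer loop over all cells, might look like the sketch below. It assumes the grid is a list of rows (strings or lists) in which equal values mean equal colors; label_regions is a made-up name.

```python
def label_regions(grid):
    """Give every contiguous same-colored region its own number.

    Returns (labels, region_count), where labels[r][c] is the region
    number of cell (r, c). Cells are joined through their sides only.
    """
    h, w = len(grid), len(grid[0])
    labels = [[None] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if labels[r][c] is not None:
                continue                      # already part of a region
            color = grid[r][c]
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                if not (0 <= y < h and 0 <= x < w):
                    continue                  # off the grid
                if labels[y][x] is not None or grid[y][x] != color:
                    continue                  # visited, or another color
                labels[y][x] = count
                stack.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
            count += 1
    return labels, count
```

Every cell is labelled once and pushed at most a constant number of times, so the work stays proportional to the number of cells.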
You could try doing a flood fill on each square. As the flood spreads, record the grid squares in an array or something, and colour them in an unused colour, say -1.
The Wikipedia article on flood fill might be useful to you here: http://en.wikipedia.org/wiki/Flood_fill
Union-find would work here as well. Indeed, you can formulate your question as a problem about a graph: the vertices are the grid cells, and two vertices are adjacent if their grid cells are neighbours and have the same color. You're trying to find the connected components.
The way you would use a union-find data structure is as follows: first create a union-find data structure with as many elements as you have cells. Then iterate through the cells, and union two adjacent cells if they have the same color. In the end, run find on each cell and store the response. Cells with the same find are in the same contiguous colored region.
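A small Python sketch of that recipe, assuming the grid is a list of rows and numbering cell (r, c) as r * width + c; region_roots is a made-up name, and the union-find is a plain array with path halving:

```python
def region_roots(grid):
    """Return a grid of the same shape holding each cell's union-find root.

    Two cells have the same root exactly when they lie in the same
    contiguous same-colored region (joined through their sides).
    """
    h, w = len(grid), len(grid[0])
    parent = list(range(h * w))        # one element per cell

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    for r in range(h):
        for c in range(w):
            # Looking right and down covers every adjacent pair once.
            if c + 1 < w and grid[r][c] == grid[r][c + 1]:
                union(r * w + c, r * w + c + 1)
            if r + 1 < h and grid[r][c] == grid[r + 1][c]:
                union(r * w + c, (r + 1) * w + c)

    return [[find(r * w + c) for c in range(w)] for r in range(h)]
```

The number of distinct values in the result is the number of regions.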
If you want a little more fine grain control, you might think about using the A* algorithm and use the heuristic to include similarly colored tiles.
You iterate through the regions in a scanline, going left-right top-bottom. For each cell you make a list of cells shared as the same memory object between the cells. For each cell, you add the current cell to the list (either shared with it or created). Then if the cell to the right or below is the same color, you share that list with that cell. If that cell already has a list, you combine the lists and replace the reference to the list object in each cell listed in the lists with the new merged list.
Each cell then holds a reference to a list that contains every cell contiguous with it. This combines the work of the flood fill across all cells, rather than repeating it for each cell. Since you have the lists, replacing the data with the merged data is just a matter of iterating through a list. It will be O(n*c), where n is the number of cells and c is a measure of how contiguous the graph is. A completely disjointed grid will be n time. A completely contiguous 1-color graph will be n^2/2.
I heard this question in a video and also found it here and I came up with what is the best approach I have seen in my searching. Here are the basic steps of the algorithm:
Loop through the array (assuming the grid of colors is represented as a 2-dimensional array) from top-left to bottom-right.
When you go through the first row just check the color to the left to see if it is the same color. When you go through all subsequent rows, check the cell above and the cell to the left - this is more efficient than checking to the top, bottom, left and right every time. Don't forget to check that the left cell is not out of bounds.
Create a Dictionary of type <int,Dictionary<int,Hashset<cell>>> for storing colors and groups within those colors. The Hashset contains cell locations (cell object with 2 properties: int row, int column).
If the cell is not connected at the top or left to a cell of the same color then create a new Dictionary entry, a new color group within that entry, and add the current cell to that group (Hashset). Else it is connected to another cell of the same color; add the current cell to the color group containing the cell it's connected to.
If at some point you encounter a cell that has the same color at the top and left, if they both belong to the same color group then that's easy, just add the current cell to that color group. Else check the kitty-corner cell to the top-left. If it is a different color than the current cell and the cell to the top and cell to the left belong to different color groups --> merge the 2 color groups together; add the current cell to the group.
Finally, loop through all of the Hashsets to see which one has the highest count - this will be the return value.
Here is a link to a video I made with visual and full explanation:
https://d.tube/#!/v/israelgeeksout77/wm2ax1vpu3y
P.S. I found this post on GeeksForGeeks https://www.geeksforgeeks.org/largest-connected-component-on-a-grid/
They conveniently posted source code to this problem in several languages! But I tried their code vs. mine and mine ran in about 1/3 of the time.
