Related
This grade 11 problem has been bothering me since 2010 and I still can't figure out/find a solution even after university.
Problem Description
There is a very unusual street in your neighbourhood. This street
forms a perfect circle, and the circumference of the circle is
1,000,000. There are H (1 ≤ H ≤ 1000) houses on the street. The
address of each house is the clockwise arc-length from the
northern-most point of the circle. The address of the house at the
northern-most point of the circle is 0. You also have special firehoses
which follow the curve of the street. However, you wish to keep the
length of the longest hose you require to a minimum. Your task is to
place k (1 ≤ k ≤ 1000) fire hydrants on this street so that the maximum
length of hose required to connect a house to a fire hydrant is as
small as possible.
Input Specification
The first line of input will be an integer H, the number of houses. The
next H lines each contain one integer, which is the address of that
particular house, and each house address is at least 0 and less than
1,000,000. On the H + 2nd line is the number k, which is the number of
fire hydrants that can be placed around the circle. Note that a fire
hydrant can be placed at the same position as a house. You may assume
that no two houses are at the same address. Note: at least 40% of the
marks for this question have H ≤ 10.
Output Specification
On one line, output the length of hose required
so that every house can connect to its nearest fire hydrant with that
length of hose.
Sample Input
4
0
67000
68000
77000
2
Output for Sample Input
5000
Link to original question
I can't even come up with a brutal force algorithm since the placement might be float number. For example if the houses are located in 1 and 2, then the hydro should be placed at 1.5 and the distance would be 0.5
Here is quick outline of an answer.
First write a function that can figures out whether you can cover all of the houses with a given maximum length per hydrant. (The maximum hose will be half that length.) It just starts at a house, covers all of the houses it can, jumps to the next, and ditto, and sees whether you stretch. If you fail it tries starting at the next house instead until it has gone around the circle. This will be a O(n^2) function.
Second create a sorted list of the pairwise distances between houses. (You have to consider it going both ways around for a single hydrant, you can only worry about the shorter way if you have 2+ hydrants.) The length covered by a hydrant will be one of those. This takes O(n^2 log(n)).
Now do a binary search to find the shortest length that can cover all of the houses. This will require O(log(n)) calls to the O(n^2) function that you wrote in the first step.
The end result is a O(n^2 log(n)) algorithm.
And here is working code for all but the parsing logic.
#! /usr/bin/env python
def _find_hoses_needed (circle_length, hose_span, houses):
# We assume that houses is sorted.
answers = [] # We can always get away with one hydrant per house.
for start in range(len(houses)):
needed = 1
last_begin = start
current_house = start + 1 if start + 1 < len(houses) else 0
while current_house != start:
pos_begin = houses[last_begin]
pos_end = houses[current_house]
length = pos_end - pos_begin if pos_begin <= pos_end else circle_length + pos_begin - pos_end
if hose_span < length:
# We need a new hose.
needed = needed + 1
last_begin = current_house
current_house = current_house + 1
if len(houses) <= current_house:
# We looped around the circle.
current_house = 0
answers.append(needed)
return min(answers)
def find_min_hose_coverage (circle_length, hydrant_count, houses):
houses = sorted(houses)
# First we find all of the possible answers.
is_length = set()
for i in range(len(houses)):
for j in range(i, len(houses)):
is_length.add(houses[j] - houses[i])
is_length.add(houses[i] - houses[j] + circle_length)
possible_answers = sorted(is_length)
# Now we do a binary search.
lower = 0
upper = len(possible_answers) - 1
while lower < upper:
mid = (lower + upper) / 2 # Note, we lose the fraction here.
if hydrant_count < _find_hoses_needed(circle_length, possible_answers[mid], houses):
# We need a strictly longer coverage to make it.
lower = mid + 1
else:
# Longer is not needed
upper = mid
return possible_answers[lower]
print(find_min_hose_coverage(1000000, 2, [0, 67000, 68000, 77000])/2.0)
Design and explain a recursive divide-and-conquer algorithm. Anyone has ideas?
Given an isosceles right triangular grid for some k ≥ 2 as shown in Figure 1(b), this problem asks you to completely cover it using the tiles given in Figure 1(a). The bottom-left corner of the grid must not be covered. No two tiles can overlap and all tiles must remain completely inside the given triangular grid. You must use all four types of tiles shown in Figure 1(a), and no tile type can be used to cover more than 40% of the total grid area. You are allowed to rotate the tiles, as needed, before putting them on the grid.
This is the idea of induction indeed, and is similar to the famous example "L-Tile" covering
As you said, you have solved the problem for k = 2, it's a good and correct starting point to solving small example first, yet I think this problem there is a bit tricky for k = 2 case, mainly due to the each type cannot exceed 40% constrain.
Then for k>2, say k = 3 in your example, we try to make use of what you have solved, i.e. the case k = 2
With very simple observation, one may notice that for k = n, it can actually be made up of 4 k=n-1 cases (see image below)
Now the shaded part in the middle form a hole that can filled by 1 type B, so we can first filled the 4 small n-1 case and fill the hole with type B...
But then this construction face a problem: type B will exceed 40% of the area!
Consider k = 2, no matter how you fill the area, 2 type B must be used, I do not have a strong proof but by some brute force trail & error you should be convinced. Then for k = 3, we have 4 small triangles meaning we have 2*4 = 8 Type B, plus 1 more to fill the hole will gives us 9 Type B, each uses 1.5 sq units, which total uses up 13.5 sq units.
As k = 3, the total area is (2^3)^2 / 2 = 32 sq units
13.5/32 = 0.42.... which violate the constrain!
So what to do? Here is the reason why we have to use a trick to handle the k = 2 case (I assume you have go through this part as you said you know how to do k = 2 case)
First, we know that using our constructive method to build a large triangle from 4 smaller triangles, only Type B will violate this constrain (i.e. the 40% area), you can verify yourself. So we want to reduce the total number of Type B used, yet each smaller triangle must use at least 2 Type B, so the only place we may reduce is the empty hole in the middle of the large triangle, can we use other Type instead of Type B? At the same time, we want the other parts of the small triangle remain unchanged so that we can use same argument to do an induction (i.e. in general speaking, form 2^n triangle from 4 2^(n-1) triangles using same construction method)
The answer is YES if we special design the k = 2 case
See my construction below: (There maybe other construction works too, but I only need to know one)
The trick is I intentionally move 1 Type B next to the empty(gray) triangle
Let's stop right here for a bit, and do some verification:
To construct a k = 2 case, we use
2 Type A = 2 sq.units < 40%
2 Type B = 3 sq.units < 40%
1 Type C = 1.5 sq.units < 40%
1 Type D = 1 sq.unit < 40%
Total use 7.5 sq.units, good
Now imagine we use exactly the same method to construct those 4 triangles to make a large one, the middle one still be an empty hole with shape of Type B, but now instead of filling it with 1 Type B, we fill the hole TOGETHER WITH the 3 Type B just next to them (look back the k = 2 case), using Type A & D
(I use same color scheme as above for easy understanding), we do this for all 3 small triangles which made up the hole in the middle.
Here is the last part (I know it's long...)
We have reduce the number of Type B used when constructing a large triangle from smaller ones, but at the same time we increase the number of Type A & D used! So is this new construction method valid?
First notice that it does not change any parts of the small triangles except the Type B next to the gray triangle, i.e. If the 40% constrain is fulfilled, this method is inductive and recursive to fill a 2^n side triangle
Then let's count again the number of each Type we used.
For k = 3, total units is 32, we uses:
2*4+3 = 11 Type A = 11 sq.units < 40%
2*4-3 = 5 Type B = 7.5 sq.units < 40%
1*4 = 4 Type C = 6 sq.units < 40%
1*4+3 = 7 Type D = 7 sq.unit < 40%
Total we cover 31.5 units, good, now let's proof the 40% constrain is satisfied for k = n > 3
Let FA(n-1) be the total area of Type A used to fill 2^n-1 triangles using our new method, likewise, FB(n-1), FC(n-1), FD(n-1) with similar definitions
Assume F*(n-1) is true, i.e. not exceeding 40% of total area, we proof that F*(n) is true.
We got
FA(n) = FA(n-1)*4 + 3*1
FB(n) = FB(n-1)*4 - 3*1.5
FC(n) = FC(n-1)*4
FD(n) = FD(n-1)*4 + 3*1
We only show the proof for FD(n), other three should be proofed with similar method (M.I.)
Using method of substitution, FD(n) = 2*(4^(n-2)) - 1 for n>=3 (You should at least try to come up with this equation yourself)
We want to show FD(n)/(2^2(n)/2) < 0.4
i.e. 2FD(n)/4^n < 0.4
Consider LHS,
LHS = (4*(4^(n-2)) - 1)/4^n
< 4^(n-1)/4^n = 1/4 < 0.4 Q.E.D
That means using this method, all Type A-D will not exceed 40% of total area for any 2^k sided triangle, for k >= 3, finally we show that inductively, there is a method satisfy all constrains to construct such a triangle.
TL;DR
The hard part is to satisfy the 40% area constrain
Use a special construction on k = 2 case first, try to use it to build k = 3 case (then k = 4, k = 5...idea of induction!)
When using k=n-1 case to build k=n case, write down the formula of total area consumed by each type, and show that they would not exceed 40% of total areas
Combined point 2 & 3, it's an induction method to show that for any k >= 2, there is a method (which we described) to fill the 2^k sided triangle without breaking any constrains
This is a problem from Google Code Jam qualification round (which is over now). How to solve this problem?
Note: If you have a different method from the ones discussed in answers, please share it so we can expand our knowledge of the different ways to solve this problem.
Problem Statement:
Minesweeper is a computer game that became popular in the 1980s, and is still included in some versions of the Microsoft Windows operating system. This problem has a similar idea, but it does not assume you have played Minesweeper.
In this problem, you are playing a game on a grid of identical cells. The content of each cell is initially hidden. There are M mines hidden in M different cells of the grid. No other cells contain mines. You may click on any cell to reveal it. If the revealed cell contains a mine, then the game is over, and you lose. Otherwise, the revealed cell will contain a digit between 0 and 8, inclusive, which corresponds to the number of neighboring cells that contain mines. Two cells are neighbors if they share a corner or an edge. Additionally, if the revealed cell contains a 0, then all of the neighbors of the revealed cell are automatically revealed as well, recursively. When all the cells that don't contain mines have been revealed, the game ends, and you win.
For example, an initial configuration of the board may look like this ('*' denotes a mine, and 'c' is the first clicked cell):
*..*...**.
....*.....
..c..*....
........*.
..........
There are no mines adjacent to the clicked cell, so when it is revealed, it becomes a 0, and its 8 adjacent cells are revealed as well. This process continues, resulting in the following board:
*..*...**.
1112*.....
00012*....
00001111*.
00000001..
At this point, there are still un-revealed cells that do not contain mines (denoted by '.' characters), so the player has to click again in order to continue the game.
You want to win the game as quickly as possible. There is nothing quicker than winning in one click. Given the size of the board (R x C) and the number of hidden mines M, is it possible (however unlikely) to win in one click? You may choose where you click. If it is possible, then print any valid mine configuration and the coordinates of your click, following the specifications in the Output section. Otherwise, print "Impossible".
My Tried Solution:
So for the solution, you need to make sure that each non-mine node is in a 3x3 matrix with other non-mine nodes, or a 3x2 or 2x2 matrix if the node is on an edge of the grid; lets call this a 0Matrix. So any node in a 0Matrix have all non-mine neighbors.
Firstly, Check whether less mines are required, or less empty nodes
if(# mines required < 1/3 of total grid size)
// Initialize the grid to all clear nodes and populate the mines
foreach (Node A : the set of non-mine nodes)
foreach (Node AN : A.neighbors)
if AN forms a OMatrix with it's neighbors, continue
else break;
// If we got here means we can make A a mine since all of it's neighbors
// form 0Matricies with their other neighbors
// End this loop when we've added the number of mines required
else
// We initialize the grid to all mines and populate the clear nodes
// Here I handle grids > 3x3;
// For smaller grids, I hard coded the logic, eg: 1xn grids, you just populate in 1 dimension
// Now we know that the # clear nodes required will be 3n+2 or 3n+4
// eg: if a 4x4 grid need 8 clear nodes : 3(2) + 2
For (1 -> num 3's needed)
Add 3 nodes going horizontally
When horizontal axis is filled, add 3 nodes going vertically
When vertical axis is filled, go back to horizontal then vertical and so on.
for(1 -> num 2's needed)
Add 2 nodes going horizontally or continuing in the direction from above
When horizontal axis is filled, add 2 nodes going vertically
For example, say we have an 4x4 grid needing 8 clean nodes, here are the steps:
// Initial grid of all mines
* * * *
* * * *
* * * *
* * * *
// Populating 3's horizontally
. * * *
. * * *
. * * *
* * * *
. . * *
. . * *
. . * *
* * * *
// Populating 2's continuing in the same direction as 3's
. . . *
. . . *
. . * *
* * * *
Another Example: 4x4 grid with 11 clear nodes needed; output:
. . . .
. . . .
. . . *
* * * *
Another Example: 4x4 grid with 14 clear nodes needed; output:
// Insert the 4 3's horizontally, then switch to vertical to insert the 2's
. . . .
. . . .
. . . .
. . * *
Now here we have a grid that is fully populated and can be solved in one click if we click on (0, 0).
My solution works for most cases, but it didn't pass the submission (I did check an entire 225 cases output file), so I'm guessing it has some problems, and I'm pretty sure there are better solutions.
Algorithm
Let's first define N, the number of non-mine cells:
N = R * C - M
A simple solution is to fill an area of N non-mine cells line-by-line from top to bottom. Example for R=5, C=5, M=12:
c....
.....
...**
*****
*****
That is:
Always start in the top-left corner.
Fill N / C rows with non-mines from top to bottom.
Fill the next line with N % C non-mines from left to right.
Fill the rest with mines.
There are only a few special cases you have to care about.
Single non-mine
If N=1, any configuration is a correct solution.
Single row or single column
If R=1, simply fill in the N non-mines from left-to-right. If C=1, fill N rows with a (single) non-mine.
Too few non-mines
If N is even, it must be >= 4.
If N is odd, it must be >= 9. Also, R and C must be >= 3.
Otherwise there's no solution.
Can't fill first two rows
If N is even and you can't fill at least two rows with non-mines, then fill the first two rows with N / 2 non-mines.
If N is odd and you can't fill at least two rows with non-mines and a third row with 3 non-mines, then fill the first two rows with (N - 3) / 2 non-mines and the third row with 3 non-mines.
Single non-mine in the last row
If N % C = 1, move the final non-mine from the last full row to the next row.
Example for R=5, C=5, M=9:
c....
.....
....*
..***
*****
Summary
It is possible to write an algorithm that implements these rules and returns a description of the resulting mine field in O(1). Drawing the grid takes O(R*C), of course. I also wrote an implementation in Perl based on these ideas which was accepted by the Code Jam Judge.
There is a more general solution to this problem that passes both the small and large test cases. It avoids having to think of all the special cases, it doesn't care what the dimensions of the board are and doesn't require any back tracking.
ALGORITHM
The basic idea is start with a grid full of mines and remove them starting from cell {0, 0} until there are the correct number of mines on the board.
Obviously there needs to be some way to determine which mines to remove next and when it is impossible to remove the correct number of mines. To do this we can keep an int[][] that represents the board. Each cell with a mine contains -1 and those without mines contain an integer which is the number of mines adjacent to the cell; the same as in the actual game.
Also define the concept of a 'frontier' which is all non-mine cells that are non-zero i.e. those cells with mines adjacent.
For example the configuration:
c . *
. . *
. . *
* * *
Would be represented as:
0 2 -1
0 3 -1
2 5 -1
-1 -1 -1
And the frontier would contain the cells with values: 2, 3, 5, 2
When removing the mines the strategy is:
Find a cell in the frontier that has the same value as the remaining number of mines to remove. So in the example above if we had 5 more mines to remove, the cells with value 5 on the frontier would be chosen.
Failing that chose the smallest frontier cell. So either of the 2's in the example above.
If the value of the chosen cell is greater than the number of mines left to remove then it's impossible to build this board, so return false.
Else remove all mines surrounding the chosen frontier cell.
Repeat until the correct number of mines are present on the board - the constraints of the problem have been met.
In java this looks like:
// Tries to build a board based on the nMines in the test case
private static boolean buildBoard(TestCase t) throws Exception {
if (t.nMines >= t.Board.rows() * t.Board.cols()) {
return false;
}
// Have to remove the cell at 0,0 as the click will go there
t.Board.removeMine(0, 0);
while (!t.boardComplete()) {
Cell c = nextCell(t);
// If the value of this cell is greater than what we need to remove we can't build a board
if (t.Board.getCell(c.getRow(), c.getCol()) > t.removalsLeft()) {
return false;
}
t.Board.removeNeighbourMines(c.getRow(), c.getCol());
}
return true;
}
// Find frontier cell matching removals left, else pick the smallest valued cell to keep things balanced
private static Cell nextCell(TestCase t) {
Cell minCell = null;
int minVal = Integer.MAX_VALUE;
for (Cell c : t.Board.getFrontier()) {
int cellVal = t.Board.getCell(c.getRow(), c.getCol());
if (cellVal == t.removalsLeft()) {
return c;
}
if (cellVal < minVal) {
minVal = cellVal;
minCell = c;
}
}
if (minCell == null) {
throw new NullPointerException("The min cell wasn't set");
}
return minCell;
}
PROOF / INTUITION
Firstly a board is defined as valid if it can be solved by a single click, even if there is only one cell on the board where this click can occur. Therefore for a board to be valid all non-mine cells must be adjacent to a cell with value 0, if this is not the case the cell is defined as unreachable. This is because we know with certainty that all cells adjacent to a 0 cell are non mines, so they can be safely revealed and the game will do it automatically for the player.
A key point to prove for this algorithm is that it is always necessary to remove all the mines surrounding the smallest frontier cell in order to keep the board in a valid state. This is quite easy to prove just by drawing out a board (such as the one above) and picking the lowest value cell (in this case the top right 2). If only a single mine is removed then the board would not be valid, it would be in either of these two states:
0 1 1
0 2 -1
2 5 -1
-1 -1 -1
or
0 1 -1
0 2 2
2 5 -1
-1 -1 -1
which both have unreachable cells.
So it is now true that always choosing the smallest frontier cell will keep the board in a valid state and my first instinct was that continuing to choose these cells would go through all valid states, however this is not correct. This can be illustrated by a test case such as 4 4 7 (so there are 9 non-mine cells). Then consider the following board:
0 2 -1 -1
2 5 -1 -1
-1 -1 -1 -1
-1 -1 -1 -1
Continuing to chose the smallest frontier cell may result in the algorithm doing the following:
0 2 -1 -1
0 3 -1 -1
0 3 -1 -1
0 2 -1 -1
Which means it is now impossible to remove just a single mine to complete the board for this case. However choosing a frontier cell that matches the number of remaining mines (if one exists) ensures that the 5 would have been chosen resulting in a 3x3 square of non-mines and a correct solution to the test case.
At this point I decided to give the algorithm a try on all test cases in the following range:
0 < R < 20
0 < C < 20
0 ≤ M < R * C
and found that it managed to correctly identify all the impossible configurations and build what looked like sensible solutions to the possible configurations.
Further intuition behind why choosing the frontier cell with the same value as the remaining number of mines (should it exist) is correct is that it allows the algorithm to find configurations for solutions requiring an odd number of non-mines.
When originally implementing this algorithm I was intending on writing heuristics that built the non-mine area in a square arrangement. Considering the 4 4 7 test case again it would end-up doing this:
0 0 2 -1
0 1 4 -1
2 4 -1 -1
-1 -1 -1 -1
notice how we now have a 1 on the frontier which would ensure the final cell removed completed the square to give:
c . . *
. . . *
. . . *
* * * *
This would mean the heuristics change slightly to:
Pick the smallest frontier cell
In the case of a tie, pick the first cell added to the frontier list
This could be implemented by keeping a FIFO queue of frontier cells, but I quickly realised that it is trickier than it first seems. This is due to the fact that the values of the frontier cells are interdependent and so care needs to be taken to keep the collection of frontier cells in the correct state and also not to use the cells value in any kind of hash value etc. I'm sure this could be done, but upon realising that just adding the extra heuristic to pick any cells with values equal to the remaining removals worked, this seemed liked the easier approach.
This is my code. I solved taking different cases like if number of rows=1 or number of columns=1 or if number of mines=(r*c)-1 and few other cases.
The position on the layout to click is placed at a[r-1][c-1]('0' indexed) every time.
For this question I had given few wrong attempts and every time I kept finding a new case. I eliminated few cases in which solution is not possible using goto and let it jump to end where it prints out impossible. A very simple solution(Indeed can be said a brute force solution since I coded for different cases possible individually). This is an editorial for my code. And on github.
I used search with backtracking, but I could solve only the small input.
Basically the algorithm starts with the board full of mines and tries to remove the mines in a way that the first "click" would solve the board. The catch is that to allow a "click" to expand to another cell, the expansion will come from another cell that must have all other neighbor cells cleared too. Sometimes, to expand to another cell you would need to remove other mines and end up with less mines than required. The algorithm will backtrack if it reaches such a position.
Maybe it is simpler to do the opposite. Start with an empty board and add each mine in a way that would not prevent the "expansion" of the initial click.
The full Python code is below:
directions = [
[-1, -1], [-1, 0], [-1, 1],
[0, -1], [0, 1],
[1, -1], [1, 0], [1, 1],
]
def solve(R, C, M):
def neighbors(i, j):
for di, dj in directions:
if 0 <= (i + di) < R and 0 <= (j + dj) < C:
yield (i + di, j + dj)
def neighbors_to_clear(i, j, board):
return [(ni, nj) for ni, nj in neighbors(i, j) if board[ni][nj] == "*"]
def clear_board(order):
to_clear = R * C - M - 1
board = [["*" for _ in range(C)] for _ in range(R)]
for i, j in order:
board[i][j] = "."
for ni, nj in neighbors_to_clear(i, j, board):
to_clear -= 1
board[ni][nj] = "."
return board, to_clear
def search(ci, cj):
nodes = []
board = []
to_clear = 1
nodes.append((ci, cj, []))
while nodes and to_clear > 0:
i, j, order = nodes.pop()
board, to_clear = clear_board(order)
neworder = order + [(i, j)]
if to_clear == 0:
board[ci][cj] = "c"
return board
elif to_clear > 0:
for ni, nj in neighbors_to_clear(i, j, board):
board[ni][nj] = "."
nodes.append([ni, nj, neworder])
for i in range(R):
for j in range(C):
board = search(i, j)
if board:
for row in board:
print "".join(row)
return
print "Impossible"
return
T = int(raw_input())
for i in range(1, T + 1):
R, C, M = map(int, raw_input().split(" "))
print("Case #%d:" % i)
solve(R, C, M)
my strategy was very similar to yours and I passed both small and large.
Did you think about cases below?
R * C - M = 1
There is only one row
There are only two rows
I flipped R and C when R > C.
I separated this into two initial special cases, then had a general algorithm. The tl;dr version is to build a square of blank spaces from the top left. Similar to other answers, but with fewer special cases.
Special cases
Case 1
Only 1 blank space. Just click in the top left corner and finish.
Case 2
2 or 3 blank spaces, with a grid that is not either Rx1 or 1xC. This is impossible, so we fail early.
Algorithm
Always click in the top left corner. Start with a 2x2 blank square in the top left (we have at least 4 blanks). Now we need to add the remaining blanks. We then expand the square along one edge then the other, until we have no more blank spaces.
Example of blanking order:
C 2 6 12
1 3 7 13
4 5 8 14
9 10 11 15
Impossible case
Note that when beginning a new edge, we must place at least two blank spaces for this to be valid. So if we have only one blank to place, then this must be invalid (unless our edge has length one). My logic looked like this:
if maxEdgeLength > 1 and remainingBlanks == 1:
print('Impossible')
return
However, we could have left off the end of the last edge, which would give us two blanks now. Of course we can only leave off the last blank if the last edge was more than 2 blanks long!
My logic for this special case looked like this:
if remainingBlanks == 1 and lastEdgeSize > 2:
mineMatrix[lastBlank] = '*'
blanks += 1
Code z self explanatory with comments. O(r+c)
import java.util.Scanner;
public class Minesweeper {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
int n = sc.nextInt();
for(int j=0;j<n;j++) {
int r =sc.nextInt(),
c = sc.nextInt(),
m=sc.nextInt();
//handling for only one space.
if(r*c-m==1) {
System.out.println("Case #"+(int)(j+1)+":");
String a[][] = new String[r][c];
completeFill(a,r-1,c-1,"*");
printAll(a, r-1, c-1);
}
//handling for 2 rows or cols if num of mines - r*c < 2 not possible.
//missed here the handling of one mine.
else if(r<2||c<2) {
if(((r*c) - m) <2) {
System.out.println("Case #"+(int)(j+1)+":");
System.out.println("Impossible");
}
else {
System.out.println("Case #"+(int)(j+1)+":");
draw(r,c,m);
}
}
//for all remaining cases r*c - <4 as the click box needs to be zero to propagate
else if(((r*c) - m) <4) {
System.out.println("Case #"+(int)(j+1)+":");
System.out.println("Impossible");
}
//edge cases found during execution.
//row or col =2 and m=1 then not possible.
//row==3 and col==3 and m==2 not possible.
else {
System.out.println("Case #"+(int)(j+1)+":");
if(r==3&&m==2&&c==3|| r==2&&m==1 || c==2&&m==1) {
System.out.println("Impossible");
}
else {
draw(r,c,m);
}
}
}
}
/*ALGO : IF m < (r and c) then reduce r or c which ever z max
* by two first time then on reduce by factor 1.
* Then give the input to filling (squarefill) function which files max one line
* with given input. and returns the vals of remaining rows and cols.
* checking the r,c==2 and r,c==3 edge cases.
**/
public static void draw(int r,int c, int m) {
String a[][] = new String[r][c];
int norow=r-1,nocol=c-1;
completeFill(a,norow,nocol,".");
int startR=0,startC=0;
int red = 2;
norow = r;
nocol = c;
int row=r,col=c;
boolean first = true;
boolean print =true;
while(m>0&&norow>0&&nocol>0) {
if(m<norow&&m<nocol) {
if(norow>nocol) {
norow=norow-red;
//startR = startR + red;
}
else if(norow<nocol){
nocol=nocol-red;
//startC = startC + red;
}
else {
if(r>c) {
norow=norow-red;
}
else {
nocol=nocol-red;
}
}
red=1;
}
else {
int[] temp = squareFill(a, norow, nocol, startR, startC, m,row,col,first);
norow = temp[0];
nocol = temp[1];
startR =r- temp[0];
startC =c -temp[1];
row = temp[3];
col = temp[4];
m = temp[2];
red=2;
//System.out.println(norow + " "+ nocol+ " "+m);
if(norow==3&&nocol==3&&m==2 || norow==2&&m==1 || nocol==2&&m==1) {
print =false;
System.out.println("Impossible");
break;
}
}
first = false;
}
//rectFill(a, 1, r, 1, c);
if(print)
printAll(a, r-1, c-1);
}
public static void completeFill(String[][] a,int row,int col,String x) {
for(int i=0;i<=row;i++) {
for(int j=0;j<=col;j++) {
a[i][j] = x;
}
}
a[row][col] = "c";
}
public static void printAll(String[][] a,int row,int col) {
for(int i=0;i<=row;i++) {
for(int j=0;j<=col;j++) {
System.out.print(a[i][j]);
}
System.out.println();
}
}
public static int[] squareFill(String[][] a,int norow,int nocol,int startR,int startC,int m,int r, int c, boolean first) {
if(norow < nocol) {
int fil = 1;
m = m - norow;
for(int i=startR;i<startR+norow;i++) {
for(int j=startC;j<startC+fil;j++) {
a[i][j] = "*";
}
}
nocol= nocol-fil;
c = nocol;
norow = r;
}
else {
int fil = 1;
m = m-nocol;
for(int i=startR;i<startR+fil;i++) {
for(int j=startC;j<startC+nocol;j++) {
a[i][j] = "*";
}
}
norow = norow-fil;
r= norow;
nocol = c;
}
return new int[] {norow,nocol,m,r,c};
}
}
My approach to this problem was as follows:
For a 1x1 grid, M has to be zero otherwise it's impossible
For a Rx1 or 1xC grid, we need M <= R * C - 2 (place 'c' on the last cell with an empty cell next to it)
For a RxC grid, we need M <= R * C - 4 (place 'c' on a corner with 3 empty cells around it)
In summary, c will always have non-mine cells next to it no matter what, otherwise it's impossible. This solution makes sense to me, and I have checked the output against their sample and small inputs, however it was not accepted.
Here is my code:
import sys
fname = sys.argv[1]
handler = open(fname, "r")
lines = [line.strip() for line in handler]
testcases_count = int(lines.pop(0))
def generate_config(R, C, M):
mines = M
config = []
for row in range(1, R+1):
if mines >= C:
if row >= R - 1:
config.append(''.join(['*' * (C - 2), '.' * 2]))
mines = mines - C + 2
else:
config.append(''.join('*' * C))
mines = mines - C
elif mines > 0:
if row == R - 1 and mines >= C - 2:
partial_mines = min(mines, C - 2)
config.append(''.join(['*' * partial_mines, '.' * (C - partial_mines)]))
mines = mines - partial_mines
else:
config.append(''.join(['*' * mines, '.' * (C - mines)]))
mines = 0
else:
config.append(''.join('.' * C))
# click the last empty cell
config[-1] = ''.join([config[-1][:-1], 'c'])
return config
for case in range(testcases_count):
R, C, M = map(int, lines.pop(0).split(' '))
# for a 1x1 grid, M has to be zero
# for a Rx1 or 1xC grid, we must have M <= # of cells - 2
# for others, we need at least 4 empty cells
config_possible = (R == 1 and C == 1 and M==0) or ((R == 1 or C == 1) and M <= R * C - 2) or (R > 1 and C > 1 and M <= R * C - 4)
config = generate_config(R, C, M) if config_possible else None
print "Case #%d:" % (case+1)
if config:
for line in config: print line
else:
print "Impossible"
handler.close()
It worked pretty well against their sample on the website and against the small input they provided, but it looks like I'm missing something.
Here is the output to the sample:
Case #1:
Impossible
Case #2:
*
.
c
Case #3:
Impossible
Case #4:
***....
.......
.......
......c
Case #5:
**********
**********
**********
**********
**********
**********
**********
**********
**........
.........c
Update: Reading vinaykumar's editorial, I understand what's wrong with my solution. Basic rules of minesweeper that I should have covered, pretty much.
Pre-Checks
M = (R * C) - 1
Fill grid with all mines and put click anywhere.
R == 1 || C == 1
Fill left/right (or up/down) in order: click, non-mines, mines (ex. c...****).
M == (R * C) - 2 || M == (R * C) - 3
Impossible
Algorithm
I started with an "empty" grid (all .s) and placed the click in a corner (I will use the top-left corner for the click, and begin filling with mines from the bottom-right).
We will use R1 and C1 as our "current" rows and columns.
While we have enough mines to fill a row or column which, when removed, does not leave a single row or column left (while((M >= R1 && C1 > 2) || (M >= C1 && R1 > 2))), we "trim" the grid (fill with mines and reduce R1 or C1) using the shortest side and remove that many mines. Thus, a 4x5 with 6 mines left would become a 4x4 with 2 mines left.
If we end up with 2 x n grid we will either have 0 mines (we are done) or 1 mine left (Impossible to win).
If we end up with a 3 x 3 grid we will either have 0 mines (we are done), 1 mine (continue below), or 2 mines (Impossible to win).
Any other combination is winnable. We check if M == min(R1,C1)-1, if so we will need to put a single mine one row or column in from the shortest edge, then fill the shortest edge with the remaining mines.
Example
I will show the order I enter mines into the grid with numbers, just to help with visualization
R = 7, C = 6, M = 29
c...42
....42
...742
..6742
555542
333332
111111
It took me a few different tries to get my algorithm correct, but I wrote mine in PHP and got both the small and large correct.
I tried my luck in this question also but for some reason didn't pass checks.
I figured that it's solvable for (3x3 matrix) if there are less than (rows*cols-4) mines, as I needed only 4 cells for "c" and its boundaries as "."
My algorithms follows:
Solvable?:
Checks if there is enough room for mines (rows*cols - 4 == maximum mines)
Exceptions like rows == 1, cols == 1; then it's rows*cols-2
Conditional whether possible or impossible
Build solution
Build rows*cols matrix, with default value nil
Go to m[0][0] and assign 'c'
Define m[0][0] surroundings with '.'
Loop from bottom right of Matrix and assign '*' until mines are over, then assign '.'
The solution can be found here. Contents of page below.
There are many ways to generate a valid mine configuration. In this
analysis, we try to enumerate all possible cases and try to generate a
valid configuration for each case (if exists). Later, after having
some insight, we provide an easier to implement algorithm to generate
a valid mine configuration (if exists).
Enumerating all possible cases
We start by checking the trivial cases:
If there is only one empty cell, then we can just fill all cells with
mines except the cell where you click. If R = 1 or C = 1, the mines
can be placed from left to right or top to bottom respectively and
click on the right-most or the bottom-most cell respectively. If the
board is not in the two trivial cases above, it means the board has at
least 2 x 2 size. Then, we can manually check that:
If the number of empty cells is 2 or 3, it is Impossible to have a
valid configuration. If R = 2 or C = 2, valid configurations exists
only if M is even. For example, if R = 2, C = 7 and M = 5, it is
Impossible since M is odd. However, if M = 6, we can place the mines
on the left part of the board and click on the bottom right, like
this:
*....
*...c If the board is not in any of the above case, it means the board is at least 3 x 3 size. In this case, we can always
find a valid mine configuration if the number of empty cells is bigger
than 9. Here is one way to do it:
If the number of empty cells is equal or bigger than 3 * C, then the
mines can be placed row by row starting from top to bottom. If the
number of remaining mines can entirely fill the row or is less than C
- 2 then place the mines from left to right in that row. Otherwise, the number of remaining mines is exactly C - 1, place the last mine in
the next row. For example:
****** ******
*****. ****..
...... -> *.....
...... ......
.....c .....c If the number of empty cells is less than 3 * C but at least 9, we first fill all rows with mines except
the last 3 rows. For the last 3 rows, we fill the remaining mines
column by column from the left most column. If the remaining mines on
the last column is two, then last mine must be put in the next column.
For example:
****** ******
.... -> *...
**.... *.....
*....c *....c Now, we are left with at most 9 empty cells which are located in the 3 x 3 square cells at the bottom right
corner. In this case, we can check by hand that if the number of empty
cells is 5 or 7, it is Impossible to have a valid mine configuration.
Otherwise, we can hard-coded a valid configuration for each number of
empty cell in that 3 x 3 square cells.
Sigh... that was a lot of cases to cover! How do we convince ourselves
that when we code the solution, we do not miss any corner case?
Brute-force approach
For the small input, the board size is at most 5 x 5. We can check all
(25 choose M) possible mine configurations and find one that is valid
(i.e., clicking an empty cell in the configuration reveal all other
empty cells). To check whether a mine configuration is valid, we can
run a flood-fill algorithm (or a simple breath-first search) from the
clicked empty cell and verify that all other empty cells are reachable
(i.e., they are in one connected component). Note that we should also
check all possible click positions. This brute-force approach is fast
enough for the small input.
The brute-force approach can be used to check (for small values of R,
C, M) whether there is a false-negative in our enumeration strategy
above. A false-negative is found when there exist a valid mine
configuration, but the enumeration strategy above yields Impossible.
Once we are confident that our enumeration strategy does not produce
any false-negative, we can use it to solve the large input.
An easier to implement approach
After playing around with several valid mine configurations using the
enumeration strategy above, you may notice a pattern: in a valid mine
configuration, the number of mines in a particular row is always equal
or larger than the number of mines of the rows below it and all the
mines are left-aligned in a row. With this insight, we can implement a
simpler backtracking algorithm that places mines row by row from top
to bottom with non-increasing number of mines as we proceed to fill in
the next row and prune if the configuration for the current row is
invalid (it can be checked by clicking at the bottom right cell). This
backtracking with pruning can handle up to 50 x 50 sized board in
reasonable time and is simpler to implement (i.e., no need to
enumerate corner / tricky cases).
If the contest time were shorter, we may not have enough time to
enumerate all possible cases. In this case, betting on the
backtracking algorithm (or any other algorithm that is easier to
implement) may be a good idea. Finding such algorithms is an art :).
I'm implementing an m,n,k-game, a generalized version of tic-tac-toe, where m is the number of rows, n is the number of columns and k is the number of pieces that a player needs to put in a row to win. I have implemented a check for a win, but I haven't figured out a satisfactory way to check before the board is full of pieces, if no player can win the game. In other words, there might be empty slots on the board, but they cannot be filled in such a way that one player would win.
My question is, how to check this efficiently? The following algorithm is the best that I can think of. It checks for two conditions:
A. Go over all board positions in all 4 directions (top to bottom, right to left, and both diagonal directions). If say k = 5, and 4 (= k-1) consecutive empty slots are found, stop checking and report "no tie". This doesn't take into account for example the following situation:
OX----XO (Example 1)
where a) there are 4 empty consecutive slots (-) somewhere between two X's, b) next it is O's turn, c) there are less than four other empty positions on the board and no player can win by putting pieces to those, and d) it is not possible to win in any other direction than horizontally in the shown slots either. Now we know that it is a tie because O will eventually block the last winning possibility, but erroneously it is not reported yet because there are four consecutive empty slots. That would be ok (but not great). Checking this condition gives a good speed-up at the beginning when the checking algorithm usually finds such a case early, but it gets slower as more pieces are put on the board.
B. If this k-1-consecutive-empty-slots-condition isn't met, the algorithm would check the slots again consecutively in all 4 directions. Suppose we are currently checking from left to right. If at some point an X is encountered and it was preceded by an O or - (empty slot) or a board border, then start counting the number of consecutive X's and empty slots, counting in this first encountered X. If one can count to 5, then one knows it is possible for X to win, and "no tie" is reported. If an O preceded by an X is encountered before 5 consecutive X's, then X cannot win in those 5 slots from left to right starting from where we started counting. For example:
X-XXO (Example 2)
12345
Here we started checking at position 1, counted to 4, and encountered an O. In this case, one would continue from the encountered O in the same way, trying to find 5 consecutive O's or empty slots this time. In another case when counting X's or empty slots, an O preceded by one or more empty slots is encountered, before counting to 5. For example:
X-X-O (Example 3)
12345
In this case we would again continue from the O at position 5, but add to the new counter (of consecutive O's or empty slots) the number of consecutive empty slots that preceded O, here 1, so that we wouldn't miss for example this possible winning position:
X-X-O---X (Example 4)
In this way, in the worst case, one would have to go through all positions 4 times (4 directions, and of course diagonals whose length is less than k can be skipped), giving running time O(mn).
The best way I could think of was doing these two described checks, A and B, in one pass. If the checking algorithm gets through all positions in all directions without reporting "no tie", it reports a tie.
Knowing that you can check a win just by checking in the vicinity of the last piece that was added with running time O(k), I was wondering if there were quicker ways to do an early check for a tie. Doesn't have to be asymptotically quicker. I'm currently keeping the pieces in a two-dimensional array. Is there maybe a data structure that would allow an efficient check? One approach: what is the highest threshold of moves that one can wait the players to make before running any checks for a tie at all?
There are many related questions at Stack Overflow, for example this, but all discussions I could find either only pointed out the obvious tie condition, where the number of moves made is equal to the size of the board (or they checked if the board is full), or handled only the special case where the board is square: m = n. For example this answer claims to do the check for a tie in constant time, but only works when m = n = k. I'm interested in reporting the tie as early as possible and for general m,n and k. Also if the algorithm works for more than two players, that would be neat.
I would reduce the problem of determining a tie to the easier sub-problem:
Can player X still win?
If the answer is 'no' for all players, it is a tie.
To find out whether Player X can win:
fill all blank spaces with virtual 'X'-pieces
are there k 'X'-pieces in a row anywhere?
if there are not --> Player X cannot win. return false.
if there are, find the row of k stones with the least amount of virtual pieces. Count the number of virtual pieces in it.
count the number of moves player X has left, alternating with all other players, until the board is completely full.
if the number of moves is less than the amount of virtual pieces required to win, player X cannot win. return false.
otherwise, player X can still win. return true.
(This algorithm will report a possible win for player X even in cases where the only winning moves for X would have another player win first, but that is ok, since that would not be a tie either)
If, as you said, you can check a win just by checking in the vicinity of the last piece that was added with running time O(k), then I think you can run the above algorithm in O(k * Number_of_empty_spots): Add all virtual X-Piece, note any winning combinations in the vicinity of the added pieces.
The number of empty slots can be large, but as long as there is at least one empty row of size k and player X has still k moves left until the board is filled, you can be sure that player X can still win, so you do not need to run the full check.
This should work with any number of players.
Actually the constant time solution you referenced only works when k = m = n as well. If k is smaller then I don't see any way to adapt the solution to get constant time, basically because there are multiple locations on each row/column/diagonal where a winning consecutive k 0's or 1's may occur.
However, maintaining auxiliary information for each row/column/diagonal can give a speed up. For each row/column/diagonal, you can store the start and end locations for consecutive occurrences of 1's and blanks as possible winning positions for player 1, and similarly store start and end locations of consecutive occurrences of 0's and blanks as possible winning positions for player 0. Note that for a given row/column/diagonal, intervals for player 0 and 1 may overlap if they contain blanks. For each row/column/diagonal, store the intervals for player 1 in sorted order in a self-balancing binary tree (Note you can do this because the intervals are disjoint). Similarly store the intervals for player 0 sorted in a tree. When a player makes a move, find the row/column/diagonals that contain the move location and update the intervals containing the move in the appropriate row column and diagonal trees for the player that did not make the move. For the player that did not make a move, this will split an interval (if it exists) into smaller intervals that you can replace the old interval with and then rebalance the tree. If an interval ever gets to length less than k you can delete it. If a tree ever becomes empty then it is impossible for that player to win in that row/column/diagonal. You can maintain a counter of how many rows/columns/diagonals are impossible to win for each player, and if the counter ever reaches the total number of rows/columns/diagonals for both players then you know you have a tie. The total running time for this is O(log(n/k) + log(m/k)) to check for a tie per move, with O(mn/k) extra space.
You can similarly maintain trees that store consecutive intervals of 1's (without spaces) and update the trees in O(log n + log m) time when a move is made, basically searching for the positions before and after the move in your tree and updating the interval(s) found and merging two intervals if two intervals (before and after) are found. Then you report a win if an interval is ever created/updated and obtains length greater than or equal to k. Similarly for player 0. Total time to check for a win is O(log n + log m) which may be better than O(k) depending on how large k is. Extra space is O(mn).
Let's look at one row (or column or diagonal, it doesn't matter) and count the number of winning lines of length k ("k-line") it's possible to make, at each place in the row, for player X. This solution will keep track of that number over the course of the game, checking fulfillment of the winning condition on each move as well as detecting a tie.
1 2 3... k k k k... 3 2 1
There is one k-line including an X in the leftmost slot, two with the second slot from the left, and so on. If an opposing player, O or otherwise, plays in this row, we can reduce the k-line possibility counts for player X in O(k) time at the time of the move. (The logic for this step should be straightforward after doing an example, needing no other data structure, but any method involving checking each of the k rows of k from will do. Going left to right, only k operations on the counts is needed.) An enemy piece should set the possibility count to -1.
Then, a detectably tied game is one where no cell has a non-zero k-line possibility count for any player. It's easy to check this by keeping track of the index of the first non-zero cell. Maintaining the structure amounts to O(k*players) work on each move. The number of empty slots is less than those filled, for positions that might be tied, so the other answers are good for checking a position in isolation. However, at least for reasonably small numbers of players, this problem is intimately linked with checking the winning condition in the first place, which at minimum you must do, O(k), on every move. Depending on your game engine there may be a better structure that is rich enough to find good moves as well as detect ties. But the possibility counting structure has the nice property that you can check for a win whilst updating it.
If space isn't an issue, I had this idea:
For each player maintain a structure sized (2mn + (1 - k)(m + n) + 2(m - k + 1)(n - k + 1) + 2(sum 1 to (m - k))) where each value represents if one of another player's moves are in one distinct k-sized interval. For example for a 8-8-4 game, one element in the structure could represent row 1, cell 0 to 3; another row 1, cell 1 to 4; etc.
In addition, one variable per player will represent how many elements in their structure are still unset. Only one move is required to set an element, showing that that k-interval can no longer be used to win.
An update of between O(k) and O(4k) time per player seems needed per move. A tie is detected when the number of players exceeds the number of different elements unset.
Using bitsets, the number of bytes needed for each player's structure would be the structure size divided by 8. Notice that when k=m=n, the structure size is 4*k and update time O(4). Less than half a megabyte per player would be needed for a 1000,1000,5 game.
Below is a JavaScript example.
var m = 1000, n = 1000, k = 5, numberOfPlayers = 2
, numberOfHorizontalKIs = m * Math.max(n - k + 1,0)
, numberOfverticalKIs = n * Math.max(m - k + 1,0)
, horizontalVerticalKIArraySize = Math.ceil((numberOfHorizontalKIs + numberOfverticalKIs)/31)
, horizontalAndVerticalKIs = Array(horizontalVerticalKIArraySize)
, numberOfUnsetKIs = horizontalAndVerticalKIs
, upToM = Math.max(0,m - k) // southwest diagonals up to position m
, upToMSum = upToM * (upToM + 1) / 2
, numberOfSouthwestKIs = 2 * upToMSum //sum is multiplied by 2 to account for bottom-right-corner diagonals
+ Math.max(0,n - m + 1) * (m - k + 1)
, diagonalKIArraySize = Math.ceil(2 * numberOfSouthwestKIs/31)
, diagonalKIs = Array(diagonalKIArraySize)
, numberOfUnsetKIs = 2 * numberOfSouthwestKIs + numberOfHorizontalKIs + numberOfverticalKIs
function checkTie(move){
var row = move[0], column = move[1]
//horizontal and vertical
for (var rotate=0; rotate<2; rotate++){
var offset = Math.max(k - n + column, 0)
column -= offset
var index = rotate * numberOfHorizontalKIs + (n - k + 1) * row + column
, count = 0
while (column >= 0 && count < k - offset){
var KIArrayIndex = Math.floor(index / 31)
, bitToSet = 1 << index % 31
if (!(horizontalAndVerticalKIs[KIArrayIndex] & bitToSet)){
horizontalAndVerticalKIs[KIArrayIndex] |= bitToSet
numberOfUnsetKIs--
}
index--
column--
count++
}
//rotate board to log vertical KIs
var mTmp = m
m = n
n = mTmp
row = move[1]
column = move[0]
count = 0
}
//rotate board back
mTmp = m
m = n
n = mTmp
// diagonals
for (var rotate=0; rotate<2; rotate++){
var diagonalTopColumn = column + row
if (diagonalTopColumn < k - 1 || diagonalTopColumn >= n + m - k){
continue
} else {
var offset = Math.max(k - m + row, 0)
row -= offset
column += offset
var dBeforeM = Math.min (diagonalTopColumn - k + 1,m - k)
, dAfterM = n + m - k - diagonalTopColumn
, index = dBeforeM * (dBeforeM + 1) / 2
+ (m - k + 1) * Math.max (Math.min(diagonalTopColumn,n) - m + 1,0)
+ (diagonalTopColumn < n ? 0 : upToMSum - dAfterM * (dAfterM + 1) / 2)
+ (diagonalTopColumn < n ? row : n - 1 - column)
+ rotate * numberOfSouthwestKIs
, count = 0
while (row >= 0 && column < n && count < k - offset){
var KIArrayIndex = Math.floor(index / 31)
, bitToSet = 1 << index % 31
if (!(diagonalKIs[KIArrayIndex] & bitToSet)){
diagonalKIs[KIArrayIndex] |= bitToSet
numberOfUnsetKIs--
}
index--
row--
column++
count++
}
}
//mirror board
column = n - 1 - column
}
if (numberOfUnsetKIs < 1){
return "This player cannot win."
} else {
return "No tie."
}
}
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Improve this question
I'm trying to find a way to find similarities in two arrays of different points. I drew circles around points that have similar patterns and I would like to do some kind of auto comparison in intervals of let's say 100 points and tell what coefficient of similarity is for that interval. As you can see it might not be perfectly aligned also so point-to-point comparison would not be a good solution also (I suppose). Patterns that are slightly misaligned could also mean that they are matching the pattern (but obviously with a smaller coefficient)
What similarity could mean (1 coefficient is a perfect match, 0 or less - is not a match at all):
Points 640 to 660 - Very similar (coefficient is ~0.8)
Points 670 to 690 - Quite similar (coefficient is ~0.5-~0.6)
Points 720 to 780 - Let's say quite similar (coefficient is ~0.5-~0.6)
Points 790 to 810 - Perfectly similar (coefficient is 1)
Coefficient is just my thoughts of how a final calculated result of comparing function could look like with given data.
I read many posts on SO but it didn't seem to solve my problem. I would appreciate your help a lot. Thank you
P.S. Perfect answer would be the one that provides pseudo code for function which could accept two data arrays as arguments (intervals of data) and return coefficient of similarity.
Click here to see original size of image
I also think High Performance Mark has basically given you the answer (cross-correlation). In my opinion, most of the other answers are only giving you half of what you need (i.e., dot product plus compare against some threshold). However, this won't consider a signal to be similar to a shifted version of itself. You'll want to compute this dot product N + M - 1 times, where N, M are the sizes of the arrays. For each iteration, compute the dot product between array 1 and a shifted version of array 2. The amount you shift array 2 increases by one each iteration. You can think of array 2 as a window you are passing over array 1. You'll want to start the loop with the last element of array 2 only overlapping the first element in array 1.
This loop will generate numbers for different amounts of shift, and what you do with that number is up to you. Maybe you compare it (or the absolute value of it) against a threshold that you define to consider two signals "similar".
Lastly, in many contexts, a signal is considered similar to a scaled (in the amplitude sense, not time-scaling) version of itself, so there must be a normalization step prior to computing the cross-correlation. This is usually done by scaling the elements of the array so that the dot product with itself equals 1. Just be careful to ensure this makes sense for your application numerically, i.e., integers don't scale very well to values between 0 and 1 :-)
i think HighPerformanceMarks's suggestion is the standard way of doing the job.
a computationally lightweight alternative measure might be a dot product.
split both arrays into the same predefined index intervals.
consider the array elements in each intervals as vector coordinates in high-dimensional space.
compute the dot product of both vectors.
the dot product will not be negative. if the two vectors are perpendicular in their vector space, the dot product will be 0 (in fact that's how 'perpendicular' is usually defined in higher dimensions), and it will attain its maximum for identical vectors.
if you accept the geometric notion of perpendicularity as a (dis)similarity measure, here you go.
caveat:
this is an ad hoc heuristic chosen for computational efficiency. i cannot tell you about mathematical/statistical properties of the process and separation properties - if you need rigorous analysis, however, you'll probably fare better with correlation theory anyway and should perhaps forward your question to math.stackexchange.com.
My Attempt:
Total_sum=0
1. For each index i in the range (m,n)
2. sum=0
3. k=Array1[i]*Array2[i]; t1=magnitude(Array1[i]); t2=magnitude(Array2[i]);
4. k=k/(t1*t2)
5. sum=sum+k
6. Total_sum=Total_sum+sum
Coefficient=Total_sum/(m-n)
If all values are equal, then sum would return 1 in each case and total_sum would return (m-n)*(1). Hence, when the same is divided by (m-n) we get the value as 1. If the graphs are exact opposites, we get -1 and for other variations a value between -1 and 1 is returned.
This is not so efficient when the y range or the x range is huge. But, I just wanted to give you an idea.
Another option would be to perform an extensive xnor.
1. For each index i in the range (m,n)
2. sum=1
3. k=Array1[i] xnor Array2[i];
4. k=k/((pow(2,number_of_bits))-1) //This will scale k down to a value between 0 and 1
5. sum=(sum+k)/2
Coefficient=sum
Is this helpful ?
You can define a distance metric for two vectors A and B of length N containing numbers in the interval [-1, 1] e.g. as
sum = 0
for i in 0 to 99:
d = (A[i] - B[i])^2 // this is in range 0 .. 4
sum = (sum / 4) / N // now in range 0 .. 1
This now returns distance 1 for vectors that are completely opposite (one is all 1, another all -1), and 0 for identical vectors.
You can translate this into your coefficient by
coeff = 1 - sum
However, this is a crude approach because it does not take into account the fact that there could be horizontal distortion or shift between the signals you want to compare, so let's look at some approaches for coping with that.
You can sort both your arrays (e.g. in ascending order) and then calculate the distance / coefficient. This returns more similarity than the original metric, and is agnostic towards permutations / shifts of the signal.
You can also calculate the differentials and calculate distance / coefficient for those, and then you can do that sorted also. Using differentials has the benefit that it eliminates vertical shifts. Sorted differentials eliminate horizontal shift but still recognize different shapes better than sorted original data points.
You can then e.g. average the different coefficients. Here more complete code. The routine below calculates coefficient for arrays A and B of given size, and takes d many differentials (recursively) first. If sorted is true, the final (differentiated) array is sorted.
procedure calc(A, B, size, d, sorted):
if (d > 0):
A' = new array[size - 1]
B' = new array[size - 1]
for i in 0 to size - 2:
A'[i] = (A[i + 1] - A[i]) / 2 // keep in range -1..1 by dividing by 2
B'[i] = (B[i + 1] - B[i]) / 2
return calc(A', B', size - 1, d - 1, sorted)
else:
if (sorted):
A = sort(A)
B = sort(B)
sum = 0
for i in 0 to size - 1:
sum = sum + (A[i] - B[i]) * (A[i] - B[i])
sum = (sum / 4) / size
return 1 - sum // return the coefficient
procedure similarity(A, B, size):
sum a = 0
a = a + calc(A, B, size, 0, false)
a = a + calc(A, B, size, 0, true)
a = a + calc(A, B, size, 1, false)
a = a + calc(A, B, size, 1, true)
return a / 4 // take average
For something completely different, you could also run Fourier transform using FFT and then take a distance metric on the returning spectra.