I am currently working on a project involving a puzzle and various pathfinding algorithms. The puzzle is represented using a 2d array, and has a specific form factor. Each cell in the 2d array has a jump value, and that is the number of spaces you can move up, down, left, or right from the cell.
I am currently working on implementing A* search on this puzzle. I was thinking of using manhattan distance as an admissible heuristic for this problem, but I do not think that the conventional manhattan distance will work since movement is limited to a specific number of moves.
For example:
2 2 2 1 1
1 1 1 2 2
3 1 2 1 1
3 1 2 1 1
2 1 1 1 0
Is a possible grid. You start from the 2 in the top left cell and are trying to get to the 0 in the bottom right cell. From the start cell, you can move right two or down 2 to new spaces which have different jump values. This process repeats until the goal is reached if the puzzle is solvable.
How can I modify the manhattan distance abs(x1-x2) + abs(y1-y2) to incorporate moving a specific number of spaces?
As you have observed Manhattan distance isn't an admissible heuristic, e.g. starting from P = (3,2) you have:
Manhattan(P, Target) = abs(3 - 4) + abs(2 - 4) = 1 + 2 = 3
but the real distance is just 2.
An admissible heuristic is:
h(P) = (P_x != Target_x) + (P_y != Target_y)
where
| 1 if x == y
(x != y) = |
| 0 otherwise
This works since for every component of the current-position-vector that doesn't match the corresponding component in target vector you have to take at least 1 move.
lets say, we have two cubes of 3X3 with different heights in cell. Each cell value represents the height of that cell. For example in below block 1, cell (1,1) has height of 1, cell(1,2) has height of 2 and so on.
block-1,
1 2 3
1 3 2
3 1 2
block-2,
4 3 2
4 2 3
2 4 3
Giver two such blocks how to check efficiently whether two blocks can be connected in such a way that there would be no cell mismatched and both blocks together produce a cuboid.
For example, above block-1 + block-2 can be connected and resultant block will be a perfect cuboid height 5. Resultant cuboid will be,
5 5 5
5 5 5
5 5 5
Extension of the problem: Given a set (size >= 50K) of such 4X4 blocks how to connect pair of blocks and produce maximum height sum of resultant cuboid? You can take only matched blocks full height to maximise total height sum. Non matched blocks will be ignored. Each cell height can be up to 20 unit.
Further extension of the problem: Blocks can be given in such a way that might be rotated to make pair with other to maximise resultant cuboids height sum.
Any clue?
You could solve the problem in two steps (1) find all pairs of blocks that connect (build a cuboid) and (2) find the best pairing that maximizes the total height.
Find connecting pairs
For this I would (a) build a surface representation for each block, (b) hash the blocks by their surface representation and (c) search for each block all connecting blocks by looking for the connecting surface models.
(a) Building the surface model
The basic idea is to represent each block by its surface. For this you just subtract the minimum entry in the matrix from every entry in the matrix
surface representation of block-1 will be
1 2 3 -1 0 1 2
1 3 2 --> 0 2 1
3 1 2 2 0 2
and surface representation of block-2 will be
4 3 2 -2 2 1 0
4 2 3 --> 2 0 1
2 4 3 0 2 1
(b) hash the blocks
Now you hash the blocks by their surface representation
(c) Finding connecting pairs
For each block you then compute the connecting surface model, by taking the maximum value in the surface representation and subtracting the entries in the matrix from it,
for block-1 this will yield
0 1 2 2 1 0
2 - 0 2 1 = 2 0 1
2 0 2 0 2 0
the blocks with this surface representation can be found using the hash table (note that the surface representation of block-2 will match).
Note: when you allow for rotation then you will have to perform 4 queries on the hash table with all possible rotations.
Finding the best pairing
To find the best pairing (maximizing the sum of connected blocks) I would use the Hungarian Algorithm. In order to do this you will have to build a matrix where the entry (i, j) contains the height of the block when the two blocks i and j connect and 0 otherwise.
Edit
I think the second step (finding best pairing) can be done more efficiently, by connecting pairs of matching blocks greedily (connecting pairs resulting in highest blocks first).
The intuition for this is: When you have two blocks a and b and they both have the same surface model. Then they will either both connect to another block c or they both won't connect to c. With this in mind, after the "find connecting pairs" step you will end up with pairs of groups of blocks (Xi, Yi) where each block of Xi connects to each block of Yi. If the two groups Xi and Yi are of equal size, then we can connect in any way we want and will always get the same sum of height of resulting cuboids. If one of the groups (wlog Yi) contains less elements then we want to avoid connecting to the smallest blocks in Xi. Thus we can greedily connect starting with the largest blocks and in doing so avoid connecting to the smallest blocks.
So the algorithm could work as follows:
(Hash each block according to its surface representation. Sort
blocks with the same surface representation descending according to
their offset (height of block minus surface representation)
Process blocks in order of descending offset, for each block: Search
for connecting block cBlock with highest offset, connect the two
blocks, remove cBlock from the hash table and the processing
pipeline.
Overall this should be doable in O(n log n)
Question: You are given the following inputs:
3
0 0 1
3 1 1
6 0 9
The first line is the number of points on the graph.
The rest of the lines contain the points on the graph, and their reward. For example:
0 0 1 would mean at point (0,0) [which is the starting point] you are given a reward of 1.
3 1 1 would mean at point (3,1) you are given a reward of 1.
6 0 9 would mean at point (6, 0) you are given a reward of 9.
Going from point a, to point b costs 1.
Therefore if you go from (0,0) -> (3,1) -> (6,0) your reward is 11-2 (cost of traversing 2 nodes) * sqrt(10).
Goal: Determine the maximum amount of rewards you can make (the total amount of reward you collect - the cost) based on the provided inputs.
How would I go about solving this? It seems like dynamic programming is the way to go, but I am not sure where to start.
This is with reference to the 'graph existence' problem - http://acm.mipt.ru/judge/problems.pl?problem=110. Can someone explain why there is no tree in example 1 but there is a tree in example 2? In both examples, vertices 0, 1, 2 and 3 are connected to each other. Here is the problem statement and examples for your reference:
You are given a matrix of distances in a graph.
You should check whether this graph could be a tree or set of trees (forest).
Edge length is 0 or positive integer.
Input: The first line contains number of vertices N.
Next N lines contains matrix (only left bottom triangle of matrix).
Distance -1 corresponds to infinite distance.
Output: Output YES or NO. If YES, then next lines should contains list of edges
of the tree (any tree (forest) with given distance matrix).
Each edge is coded by two identifiers of it's ends.
Vertex identifiers are numbers 0, 1, ..., N-1.
Input#1
4
0
1 0
1 1 0
1 1 1 0
Output#1
NO
Input#2
5
0
1 0
2 1 0
3 2 1 0
-1 -1 -1 -1 0
Output#2
YES
0 1
1 2
2 3
The problem is not very well translated from its Russian original.
The given matrix is not the matrix of edges in the graph as one might conclude, but a distance matrix. Each edge probably has weight of 1, but I am not entirely sure has a nonnegative weight. One has to check if the matrix can be realized by a tree or a forest.
That is in the first example all vertices are connected, but the second example can be realized the graph looks like:
Example 2:
(0) - (1) - (2) - (3) (4)
The graph in example 1 is
Example 1:
(0) - (1) - (2) - (3)
|_____|_____| |
| |___________|
|_________________|
given a sorted array of distinct integers, what is the minimum number of steps required to make the integers contiguous? Here the condition is that: in a step , only one element can be changed and can be either increased or decreased by 1 . For example, if we have 2,4,5,6 then '2' can be made '3' thus making the elements contiguous(3,4,5,6) .Hence the minimum steps here is 1 . Similarly for the array: 2,4,5,8:
Step 1: '2' can be made '3'
Step 2: '8' can be made '7'
Step 3: '7' can be made '6'
Thus the sequence now is 3,4,5,6 and the number of steps is 3.
I tried as follows but am not sure if its correct?
//n is the number of elements in array a
int count=a[n-1]-a[0]-1;
for(i=1;i<=n-2;i++)
{
count--;
}
printf("%d\n",count);
Thanks.
The intuitive guess is that the "center" of the optimal sequence will be the arithmetic average, but this is not the case. Let's find the correct solution with some vector math:
Part 1: Assuming the first number is to be left alone (we'll deal with this assumption later), calculate the differences, so 1 12 3 14 5 16-1 2 3 4 5 6 would yield 0 -10 0 -10 0 -10.
sidenote: Notice that a "contiguous" array by your implied definition would be an increasing arithmetic sequence with difference 1. (Note that there are other reasonable interpretations of your question: some people may consider 5 4 3 2 1 to be contiguous, or 5 3 1 to be contiguous, or 1 2 3 2 3 to be contiguous. You also did not specify if negative numbers should be treated any differently.)
theorem: The contiguous numbers must lie between the minimum and maximum number. [proof left to reader]
Part 2: Now returning to our example, assuming we took the 30 steps (sum(abs(0 -10 0 -10 0 -10))=30) required to turn 1 12 3 14 5 16 into 1 2 3 4 5 6. This is one correct answer. But 0 -10 0 -10 0 -10+c is also an answer which yields an arithmetic sequence of difference 1, for any constant c. In order to minimize the number of "steps", we must pick an appropriate c. In this case, each time we increase or decrease c, we increase the number of steps by N=6 (the length of the vector). So for example if we wanted to turn our original sequence 1 12 3 14 5 16 into 3 4 5 6 7 8 (c=2), then the differences would have been 2 -8 2 -8 2 -8, and sum(abs(2 -8 2 -8 2 -8))=30.
Now this is very clear if you could picture it visually, but it's sort of hard to type out in text. First we took our difference vector. Imagine you drew it like so:
4|
3| *
2| * |
1| | | *
0+--+--+--+--+--*
-1| |
-2| *
We are free to "shift" this vector up and down by adding or subtracting 1 from everything. (This is equivalent to finding c.) We wish to find the shift which minimizes the number of | you see (the area between the curve and the x-axis). This is NOT the average (that would be minimizing the standard deviation or RMS error, not the absolute error). To find the minimizing c, let's think of this as a function and consider its derivative. If the differences are all far away from the x-axis (we're trying to make 101 112 103 114 105 116), it makes sense to just not add this extra stuff, so we shift the function down towards the x-axis. Each time we decrease c, we improve the solution by 6. Now suppose that one of the *s passes the x axis. Each time we decrease c, we improve the solution by 5-1=4 (we save 5 steps of work, but have to do 1 extra step of work for the * below the x-axis). Eventually when HALF the *s are past the x-axis, we can NO LONGER IMPROVE THE SOLUTION (derivative: 3-3=0). (In fact soon we begin to make the solution worse, and can never make it better again. Not only have we found the minimum of this function, but we can see it is a global minimum.)
Thus the solution is as follows: Pretend the first number is in place. Calculate the vector of differences. Minimize the sum of the absolute value of this vector; do this by finding the median OF THE DIFFERENCES and subtracting that off from the differences to obtain an improved differences-vector. The sum of the absolute value of the "improved" vector is your answer. This is O(N) The solutions of equal optimality will (as per the above) always be "adjacent". A unique solution exists only if there are an odd number of numbers; otherwise if there are an even number of numbers, AND the median-of-differences is not an integer, the equally-optimal solutions will have difference-vectors with corrective factors of any number between the two medians.
So I guess this wouldn't be complete without a final example.
input: 2 3 4 10 14 14 15 100
difference vector: 2 3 4 5 6 7 8 9-2 3 4 10 14 14 15 100 = 0 0 0 -5 -8 -7 -7 -91
note that the medians of the difference-vector are not in the middle anymore, we need to perform an O(N) median-finding algorithm to extract them...
medians of difference-vector are -5 and -7
let us take -5 to be our correction factor (any number between the medians, such as -6 or -7, would also be a valid choice)
thus our new goal is 2 3 4 5 6 7 8 9+5=7 8 9 10 11 12 13 14, and the new differences are 5 5 5 0 -3 -2 -2 -86*
this means we will need to do 5+5+5+0+3+2+2+86=108 steps
*(we obtain this by repeating step 2 with our new target, or by adding 5 to each number of the previous difference... but since you only care about the sum, we'd just add 8*5 (vector length times correct factor) to the previously calculated sum)
Alternatively, we could have also taken -6 or -7 to be our correction factor. Let's say we took -7...
then the new goal would have been 2 3 4 5 6 7 8 9+7=9 10 11 12 13 14 15 16, and the new differences would have been 7 7 7 2 1 0 0 -84
this would have meant we'd need to do 7+7+7+2+1+0+0+84=108 steps, the same as above
If you simulate this yourself, can see the number of steps becomes >108 as we take offsets further away from the range [-5,-7].
Pseudocode:
def minSteps(array A of size N):
A' = [0,1,...,N-1]
diffs = A'-A
medianOfDiffs = leftMedian(diffs)
return sum(abs(diffs-medianOfDiffs))
Python:
leftMedian = lambda x:sorted(x)[len(x)//2]
def minSteps(array):
target = range(len(array))
diffs = [t-a for t,a in zip(target,array)]
medianOfDiffs = leftMedian(diffs)
return sum(abs(d-medianOfDiffs) for d in diffs)
edit:
It turns out that for arrays of distinct integers, this is equivalent to a simpler solution: picking one of the (up to 2) medians, assuming it doesn't move, and moving other numbers accordingly. This simpler method often gives incorrect answers if you have any duplicates, but the OP didn't ask that, so that would be a simpler and more elegant solution. Additionally we can use the proof I've given in this solution to justify the "assume the median doesn't move" solution as follows: the corrective factor will always be in the center of the array (i.e. the median of the differences will be from the median of the numbers). Thus any restriction which also guarantees this can be used to create variations of this brainteaser.
Get one of the medians of all the numbers. As the numbers are already sorted, this shouldn't be a big deal. Assume that median does not move. Then compute the total cost of moving all the numbers accordingly. This should give the answer.
community edit:
def minSteps(a):
"""INPUT: list of sorted unique integers"""
oneMedian = a[floor(n/2)]
aTarget = [oneMedian + (i-floor(n/2)) for i in range(len(a))]
# aTargets looks roughly like [m-n/2?, ..., m-1, m, m+1, ..., m+n/2]
return sum(abs(aTarget[i]-a[i]) for i in range(len(a)))
This is probably not an ideal solution, but a first idea.
Given a sorted sequence [x1, x2, …, xn]:
Write a function that returns the differences of an element to the previous and to the next element, i.e. (xn – xn–1, xn+1 – xn).
If the difference to the previous element is > 1, you would have to increase all previous elements by xn – xn–1 – 1. That is, the number of necessary steps would increase by the number of previous elements × (xn – xn–1 – 1). Let's call this number a.
If the difference to the next element is >1, you would have to decrease all subsequent elements by xn+1 – xn – 1. That is, the number of necessary steps would increase by the number of subsequent elements × (xn+1 – xn – 1). Let's call this number b.
If a < b, then increase all previous elements until they are contiguous to the current element. If a > b, then decrease all subsequent elements until they are contiguous to the current element. If a = b, it doesn't matter which of these two actions is chosen.
Add up the number of steps taken in the previous step (by increasing the total number of necessary steps by either a or b), and repeat until all elements are contiguous.
First of all, imagine that we pick an arbitrary target of contiguous increasing values and then calculate the cost (number of steps required) for modifying the array the array to match.
Original: 3 5 7 8 10 16
Target: 4 5 6 7 8 9
Difference: +1 0 -1 -1 -2 -7 -> Cost = 12
Sign: + 0 - - - -
Because the input array is already ordered and distinct, it is strictly increasing. Because of this, it can be shown that the differences will always be non-increasing.
If we change the target by increasing it by 1, the cost will change. Each position in which the difference is currently positive or zero will incur an increase in cost by 1. Each position in which the difference is currently negative will yield a decrease in cost by 1:
Original: 3 5 7 8 10 16
New target: 5 6 7 8 9 10
New Difference: +2 +1 0 0 -1 -6 -> Cost = 10 (decrease by 2)
Conversely, if we decrease the target by 1, each position in which the difference is currently positive will yield a decrease in cost by 1, while each position in which the difference is zero or negative will incur an increase in cost by 1:
Original: 3 5 7 8 10 16
New target: 3 4 5 6 7 8
New Difference: 0 -1 -2 -2 -3 -8 -> Cost = 16 (increase by 4)
In order to find the optimal values for the target array, we must find a target such that any change (increment or decrement) will not decrease the cost. Note that an increment of the target can only decrease the cost when there are more positions with negative difference than there are with zero or positive difference. A decrement can only decrease the cost when there are more positions with a positive difference than with a zero or negative difference.
Here are some example distributions of difference signs. Remember that the differences array is non-increasing, so positives always have to be first and negatives last:
C C
+ + + - - - optimal
+ + 0 - - - optimal
0 0 0 - - - optimal
+ 0 - - - - can increment (negatives exceed positives & zeroes)
+ + + 0 0 0 optimal
+ + + + - - can decrement (positives exceed negatives & zeroes)
+ + 0 0 - - optimal
+ 0 0 0 0 0 optimal
C C
Observe that if one of the central elements (marked C) is zero, the target must be optimal. In such a circumstance, at best any increment or decrement will not change the cost, but it may increase it. This result is important, because it gives us a trivial solution. We pick a target such that a[n/2] remains unchanged. There may be other possible targets that yield the same cost, but there are definitely none that are better. Here's the original code modified to calculate this cost:
//n is the number of elements in array a
int targetValue;
int cost = 0;
int middle = n / 2;
int startValue = a[middle] - middle;
for (i = 0; i < n; i++)
{
targetValue = startValue + i;
cost += abs(targetValue - a[i]);
}
printf("%d\n",cost);
You can not do it by iterating once on the array, that's for sure.
You need first to check the difference between each two numbers, for example:
2,7,8,9 can be 2,3,4,5 with 18 steps or 6,7,8,9 with 4 steps.
Create a new array with the difference like so: for 2,7,8,9 it wiil be 4,1,1. Now you can decide whether to increase or decrease the first number.
Lets assume that the contiguous array looks something like this -
c c+1 c+2 c+3 .. and so on
Now lets take an example -
5 7 8 10
The contiguous array in this case will be -
c c+1 c+2 c+3
In order to get the minimum steps, the sum of the modulus of the difference of the integers(before and after) w.r.t the ith index should be the minimum. In which case,
(c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2 should be minimum
Let f(c) = (c-5)^2 + (c-6)^2 + (c-6)^2 + (c-7)^2
= 4c^2 - 48c + 146
Applying differential calculus to get the minima,
f'(c) = 8c - 48 = 0
=> c = 6
So our contiguous array is 6 7 8 9 and the minimum cost here is 2.
To sum it up, just generate f(c), get the first differential and find out c.
This should take O(n).
Brute force approach O(N*M)
If one draws a line through each point in the array a then y0 is a value where each line starts at index 0. Then the answer is the minimum among number of steps reqired to get from a to every line that starts at y0, in Python:
y0s = set((y - i) for i, y in enumerate(a))
nsteps = min(sum(abs(y-(y0+i)) for i, y in enumerate(a))
for y0 in xrange(min(y0s), max(y0s)+1)))
Input
2,4,5,6
2,4,5,8
Output
1
3