I'm practicing with dynamic programming and I'm trying to solve this exercise http://www.geeksforgeeks.org/collect-maximum-points-in-a-grid-using-two-traversals/
But I can't understand how to use dynamic programming.
My reasoning is to use a table T[n][m] to store the results and, in every cell, to pick the maximum value to move to.
Using the example shown in the link: how do I know at the first cell [0][0] to go to "3" instead of "5"? With my reasoning the choice would be "5", but that is a bad move.
Your table should not be used to look ahead as you are suggesting. That sounds more like a greedy approach, which indeed is not correct. Instead, use what you have calculated in previous iterations to do the calculation in the current iteration.
Simplified I would describe the algorithm as:
Initialize a table T[C][C] and set all values to 0, where C is the number of columns.
T[c1][c2] holds the best possible score at the previous row where traveler 1 would be in column c1 and traveler 2 would be in column c2
Then you can just iterate over the rows:
Check whether, for row r, the travelers can be at their respective positions c1 and c2.
Fill up a new table tmp, where tmp[c1][c2] is arr[r][c1] + arr[r][c2] + max(T[c1-1][c2-1], T[c1][c2-1], ..., T[c1+1][c2+1]).
Replace T with tmp
Result: T[0][C-1]
Remark: I did not take care of the case where both travelers are on the same spot and should not both get the points. I am assuming this is handled in the checking, because we should be able to prove that in the optimal solution c1 < c2 for all rows (except when C = 1).
Why this is correct:
At row R there are only 9 (3x3) combinations possible starting from the previous row. And for each possible position we always take the best. This means we are actually trying every possible combination for both travelers and we can't miss any better solution.
To wrap up: this algorithm gives you only the best score, not the path to it. You may be confused by this because you intuitively are looking for a path.
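As a sanity check, the table update described above can be sketched in Python (function and variable names are my own; a cell shared by both travelers is counted only once, which also covers the remark above):

```python
def max_collect(arr):
    """Two travelers move down one row per step, each shifting column by
    -1, 0, or +1. Traveler 1 goes (0,0) -> (R-1,0), traveler 2 goes
    (0,C-1) -> (R-1,C-1). A cell shared by both is counted once."""
    R, C = len(arr), len(arr[0])
    NEG = float("-inf")

    def gain(r, c1, c2):
        return arr[r][c1] + (arr[r][c2] if c1 != c2 else 0)

    # T[c1][c2]: best score so far with traveler 1 in column c1, traveler 2 in c2
    T = [[NEG] * C for _ in range(C)]
    T[0][C - 1] = gain(0, 0, C - 1)
    for r in range(1, R):
        tmp = [[NEG] * C for _ in range(C)]
        for c1 in range(C):
            for c2 in range(C):
                # best of the (at most 3x3) predecessor positions in the previous row
                best = max(
                    T[p1][p2]
                    for p1 in (c1 - 1, c1, c1 + 1) if 0 <= p1 < C
                    for p2 in (c2 - 1, c2, c2 + 1) if 0 <= p2 < C
                )
                if best > NEG:
                    tmp[c1][c2] = best + gain(r, c1, c2)
        T = tmp
    return T[0][C - 1]
```

On the example grid from the linked exercise this returns 73, matching the expected answer there.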
Extra
Is it possible to get the path with this algorithm too?
Yes, but you will need to do some extra work.
Let's say we keep all our calculated T arrays for all rows.
Then we can check for instance the following:
T[R][c1][c2] == arr[R][c1] + arr[R][c2] + T[R-1][c1-1][c2-1]
This checks whether an optimal path is possible when, in the previous row, both travelers moved to the right. In this way we can walk back to the starting positions.
Related
In the question
Calculating How Many Balls in Bins Over Several Values Using Dynamic Programming
the answer discusses a dynamic programming algorithm for placing balls into bins, and I was attempting to determine the running time, as it is not addressed in the answer.
A quick summary: Given M indistinguishable balls and N distinguishable bins, the entry S[i][j] in the dynamic programming table represents the number of unique ways i balls can be placed into j bins.
S[i][j] = sum(x = 0 -> i, S[i-x][j-1])
It is clear that the size of the dynamic programming 2D array is O(MN). However, I am trying to determine the impact the summation has on the running time.
I know that a summation over the values (1...x) means we must access the values from 1 to x. Would this then mean that, since computing each entry accesses at most M other values, the running time is in the realm of O((M^2)N)?
Would appreciate any clarification. Thanks!
You can avoid excessive time for the summation if you keep column sums in an additional table.
When you calculate S[i][j], also fill Sums[i][j] = Sums[i-1][j] + S[i][j], and later use this value for the cell to the right, S[i][j+1].
P.S. Note that you really only need to store two rows, or even one row, of the sum table.
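A Python sketch of this prefix-sum idea (names are mine): the running column sums turn each entry into an O(1) lookup, so the whole table fills in O(MN) instead of O((M^2)N):

```python
def count_ways(M, N):
    # S[i][j]: ways to place i indistinguishable balls into j distinguishable bins.
    # Naive recurrence: S[i][j] = sum(S[i-x][j-1] for x in 0..i)  -> O(M^2 * N).
    # Sums[i][j] = S[0][j] + ... + S[i][j] makes each entry O(1)   -> O(M * N).
    S = [[0] * (N + 1) for _ in range(M + 1)]
    Sums = [[0] * (N + 1) for _ in range(M + 1)]
    S[0][0] = 1
    Sums[0][0] = 1
    for i in range(1, M + 1):
        Sums[i][0] = Sums[i - 1][0]  # S[i][0] = 0 for i > 0
    for j in range(1, N + 1):
        for i in range(M + 1):
            S[i][j] = Sums[i][j - 1]  # equals sum(S[i-x][j-1] for x in 0..i)
            Sums[i][j] = S[i][j] + (Sums[i - 1][j] if i > 0 else 0)
    return S[M][N]
```

As a check, the result should equal the stars-and-bars count C(M+N-1, N-1).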
I have been trying to solve an optimization problem but have not been able to think my way through to any efficient solution.
Here's the problem
We are given data representing a sequence of bookings of a single car. Each booking consists of two points (start location, end location). Now, given two adjacent bookings b1, b2, we say a relocation is required between them if the end location of b1 is not equal to the start location of b2.
We have to design an algorithm that takes a sequence of bookings as
input and outputs a single permutation of the input that minimizes the
total number of relocations within the sequence.
Here's my approach
To me it looks like one of the greedy scheduling problems, but I'm not able to derive any good heuristic for it from the existing scheduling problems. In the end I thought of sorting the given sequence, using insertion sort, so that the end location of each booking matches the start location of the next.
So, for our given problem
[(23, 42),(77, 45),(42, 77)] will get sorted to
[(23, 42),(42, 77),(77, 45)], thus matching each end point with the following start point.
Let's take another example
[(3,1),(1,3),(3,1),(2,2),(3,1),(2,3),(1,3),(1,1),(3,3),(3,2),(3,3)]
now after sorting till index 7 using insertion sort, our array will look like
[(3,1),(1,3),(3,1),(2,2),(2,3),(3,3),(3,1),(1,3),(3,3),(3,2),(3,3)]
Now, to place the point (3,3) at index 8 of the unsorted array, we do the following.
The idea is to put each point in its correct location. For the point
(3,3) at index 8, I search the already-sorted part for the first
entry whose end point matches 3, i.e. the start point of this new point,
under the condition that inserting this point after that first found
entry does not violate the invariant that the start of the next entry should
match the end of this point. So we inserted (3,3) between (2,3)
and (3,1). It looks like this
[(3,1),(1,3),(3,1),(2,2),(2,3),(3,3),(3,1),(1,3),(3,3),(3,2),(1,1)]
However, I'm not sure how I would prove whether this is optimal. Any pointer is highly appreciated. Is there a better way (I'm sure there is) that will help us solve this?
You can convert this easily into a graph problem.
[a, b] -> vertices a and b with an edge between them. Use DFS to find all connected components in this undirected graph and do some post-processing.
It is linear in the input size.
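A minimal Python sketch of the components step (identifiers are my own; the post-processing that actually orders the bookings within each component is left out, as in the answer):

```python
from collections import defaultdict

def connected_components(bookings):
    # Build an undirected graph: each location is a vertex, each booking
    # (start, end) contributes an edge. Iterative DFS collects the
    # connected components.
    graph = defaultdict(set)
    for a, b in bookings:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for v in graph:
        if v in seen:
            continue
        stack, comp = [v], []
        seen.add(v)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in graph[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(comp))
    return components
```

For the example sequence [(23, 42),(77, 45),(42, 77)] this yields a single component, so all three bookings can potentially be chained without relocation.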
Consider the following puzzle:
A cell is either marked or unmarked. Numbers along the right and bottom side of the puzzle denote the total sum for a certain row or column. A marked cell contributes to the sums of its row and its column: a cell in position (i,j) contributes i to the column sum and j to the row sum. For example, in the first row in the picture above, the 1st, 2nd and 5th cells are marked. These contribute 1 + 2 + 5 to the row sum (thus totalling 8), and 1 each to their column sums.
I have a solver for this puzzle in ECLiPSe CLP and I am trying to write a custom heuristic for it.
The easiest cells to start with, I think, are those for which the column and row hints are as low as possible. In general, the lower N is, the fewer possibilities exist to write N as a sum of natural numbers between 1 and N. In the context of this puzzle it means the cell with the lowest column hint + row hint has the lowest odds of being wrong, so less backtracking.
In the implementation I have a NxN array that represents the board, and two lists of size N that represent the hints. (The numbers to the side and on the bottom.)
I see two options:
Write a custom selection predicate for search/6. However, if I understand correctly, I can only give it 2 parameters. There is then no way to calculate the row + column sum for a given variable, because I cannot pass the hint lists to the predicate: I would need 4 parameters.
Ignore search/6 and write my own labelling method. That's how I have it right now; see the code below.
It takes the board (the NxN array containing all decision variables), both lists of hints and returns a list containing all variables, now sorted according to their row + column sum.
However, this could hardly be any more cumbersome, as you can see. To be able to sort, I need to attach the sum to each variable, but in order to do that, I first need to convert it to a term that also contains the coordinates of said variable, so that I can convert back to the variable as soon as sorting is done...
lowest_hints_first(Board,RowArr,ColArr,Out) :-
    dim(Board,[N,N]),
    ( multifor([I,J],[1,1],[N,N]), foreach(Term,Terms), param(RowArr,ColArr) do
        RowHint is RowArr[I],
        ColHint is ColArr[J],
        TotalSum is RowHint + ColHint,
        Term = field(I,J,TotalSum)
    ),
    sort(3,<,Terms,SortedTerms),          % sort the fields on TotalSum
    terms_to_vars(SortedTerms,Board,Out), % convert fields back to vars...
    ( foreach(Var,Out) do
        indomain(Var,max)
    ).

terms_to_vars([],_,[]).
terms_to_vars([field(I,J,TotalSum)|RestTerms],Vars,[Out|RestOut]) :-
    terms_to_vars(RestTerms,Vars,RestOut),
    Out is Vars[I,J].
In the end this heuristic is barely faster than input_order. I suspect that's due to the awful way it's implemented. Any ideas on how to do it better? Or is my feeling wrong that this heuristic should be a huge improvement?
I see you are already happy with the improvement suggested by Joachim; however, as you ask for further improvements to your heuristic, consider that there is only one way to get 0 as a sum, just as there is only one way to get 15.
There is only one way to get 1 and 14, 2 and 13; two ways to get 3 and 12.
In general, if you have K ways to get sum N, you also have K ways to get 15-N.
So the difficult sums are not the large ones, they are the middle ones.
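Assuming a 5-column row, so the row sums range over 0..15 as in this answer, the counts can be checked by brute force in Python (names are mine); complementation of a subset of {1..5} gives the symmetry:

```python
from itertools import combinations

def ways(target, n=5):
    # Number of subsets of {1,...,n} whose elements sum to target.
    # For n = 5 the possible sums are 0..15, matching the answer's example.
    return sum(
        1
        for r in range(n + 1)
        for subset in combinations(range(1, n + 1), r)
        if sum(subset) == target
    )
```

The symmetry ways(k) == ways(15-k) holds because marking a set of cells summing to k is the same as leaving unmarked a set summing to 15-k; the counts peak in the middle.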
I'm trying to solve a sudoku with the viewpoint that every number has 9 positions. This is the representation for my sudoku:
From the table you can read that number 5 has following positions (Row,Col) in the sudoku: (2,8),(4,2),(6,5).
When I mention a row in my explanation, I mean a row like this:
For example, the above row is row 1.
What I have done is the following:
For every row check if all ROW-Values in that row are different using alldifferent from ic_global.
Do the same as above but then for the COLUMN-Values.
For every row, check if the square numbers are different (calculated using a row and col value each time), using alldifferent again.
The above things work fine and I get a solution for the sudoku, but not a correct one. This is because I have to check one more thing: every position must be different. With the current state of my solver I could get a solution that has multiple numbers in the same position, e.g. 2 and 3 could both be at position (5,7), because I don't check that all positions are different.
How would I tackle this problem?
I tried to get ALL the positions into one list in tuple form and then check that all tuples are different, but I have been struggling for hours and I'm getting really desperate. I hope I can find a solution here.
EDIT: Added code
As you already know, all_different/1 and related constraints work on integers. Also, in your case, you are actually interested in a special case of tuples, namely pairs consisting of rows and columns.
So, your question can actually be reduced to:
How can I injectively map pairs of integers to integers?
Suppose you have pairs of the form A-B, where both A and B are constrained to 1..9.
I can put such pairs in one-to-one correspondence with integers in several ways. A very easy function that does this is 9×A + B. Think about it!
Thus, I recommend you map such positions to integers in this way or a similar one, and then post all_different/1 on these integers.
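The mapping can be checked exhaustively in a few lines of Python (just for illustration; the actual constraint posting stays in ECLiPSe):

```python
def pair_to_int(a, b):
    # 9*A + B is injective for 1 <= A, B <= 9: changing A moves the code
    # by at least 9, which is more than varying B alone can compensate.
    return 9 * a + b

codes = {pair_to_int(a, b) for a in range(1, 10) for b in range(1, 10)}
# 81 distinct pairs map to 81 distinct codes, so the map is injective.
```

Posting all_different/1 on these 81 codes then enforces that no two numbers occupy the same position.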
Exercise: Think about other possible mappings and their properties. Then generalize them to work on tuples.
I have been sitting on this for almost a week now. Here is the question in a PDF format.
I could only think of one idea so far, but it failed. The idea was to recursively create all connected subgraphs, which works in O(num_of_connected_subgraphs) time, but that is way too slow.
I would really appreciate someone giving me a direction. I'm inclined to think that the only way is dynamic programming, but I can't seem to figure out how to do it.
OK, here is a conceptual description for the algorithm that I came up with:
Form an array of the (x,y) board map from -7 to 7 in both dimensions and place the opponent's pieces on it.
Starting with the first row (lowest Y value, -N):
enumerate all possible combinations of the 2nd player's pieces on the row, eliminating only those that conflict with the opponent's pieces.
for each combination on this row:
--group connected pieces into separate networks and number these
networks starting with 1, ascending
--encode the row as a vector using:
= 0 for any unoccupied or opponent position
= (1-8) for the network group that the piece/position is in.
--give each such grouping a COUNT of 1, and add it to a dictionary/hashset using the encoded vector as its key
Now, for each succeeding row, in ascending order {y=y+1}:
For every entry in the previous row's dictionary:
--If the entry has exactly 1 group, add its COUNT to TOTAL
--enumerate all possible combinations of the 2nd player's pieces
on the current row, eliminating only those that conflict with the
opponent's pieces. (Change:) you should skip the initial combination
(where all entries are zero) for this step, as the step above actually
covers it. For each such combination on the current row:
+ produce a grouping vector as described above
+ compare the current row's group-vector to the previous row's
group-vector from the dictionary:
++ if there are any group-*numbers* from the previous row's
vector that are not adjacent to any groups in the current
row's vector, *for any value of X*, then skip
to the next combination.
++ any groups for the current row that are adjacent to any
groups of the previous row, acquire the lowest such group
number
++ any groups for the current row that are not adjacent to
any groups of the previous row, are assigned an unused
group number
+ Re-Normalize the group-number assignments for the current-row's
combination (**) and encode the vector, giving it a COUNT equal
to the previous row-vector's COUNT
+ Add the current-row's vector to the dictionary for the current
row, using its encoded vector as the key. If it already exists,
then add its COUNT to the COUNT for the pre-existing entry
Finally, for every entry in the dictionary for the last row:
If the entry has exactly one group, then add its COUNT to TOTAL
**: Re-Normalizing simply means re-assigning the group numbers so as to eliminate any permutations in the grouping pattern. Specifically, new group numbers should be assigned in increasing order, from left to right, starting from one. So for example, if your grouping vector looked like this after matching it against the previous row:
2 0 5 5 0 3 0 5 0 7 ...
it should be re-mapped to this normal form:
1 0 2 2 0 3 0 2 0 4 ...
Note that as in this example, after the first row, the groupings can be discontiguous. This relationship must be preserved, so the two groups of "5"s are re-mapped to the same number ("2") in the re-normalization.
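The re-normalization step is easy to pin down in code; a Python sketch (names are mine) that reproduces the example above:

```python
def renormalize(vec):
    # Re-map group numbers to canonical form: 0 stays 0; nonzero numbers
    # are renumbered 1, 2, 3, ... in order of first appearance, left to
    # right, so discontiguous occurrences of one group keep one number.
    mapping, nxt, out = {}, 1, []
    for g in vec:
        if g == 0:
            out.append(0)
        else:
            if g not in mapping:
                mapping[g] = nxt
                nxt += 1
            out.append(mapping[g])
    return out
```

Canonicalizing like this is what lets equivalent grouping patterns collide in the dictionary and have their COUNTs merged.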
OK, a couple of notes:
A. I think that this approach is correct, but I am really not certain, so it will definitely need some vetting, etc.
B. Although it is long, it's still pretty sketchy. Each individual step is non-trivial in itself.
C. Although there are plenty of individual optimization opportunities, the overall algorithm is still pretty complicated. It is a lot better than brute-force, but even so, my back-of-the-napkin estimate is still around (2.5 to 10)*10^11 operations for N=7.
So it's probably tractable, but still a long way off from doing 74 cases in 3 seconds. I haven't read all of the detail for Peter de Revaz's answer, but his idea of rotating the "diamond" might be workable for my algorithm. Although it would increase the complexity of the inner loop, it may drop the size of the dictionaries (and thus, the number of grouping-vectors to compare against) by as much as a 100x, though it's really hard to tell without actually trying it.
Note also that there isn't any dynamic programming here. I couldn't come up with an easy way to leverage it, so that might still be an avenue for improvement.
OK, I enumerated all possible valid grouping-vectors to get a better estimate of (C) above, which lowered it to about 3.5*10^9 operations for N=7. That's much better, but still about an order of magnitude over what you probably need to finish 74 tests in 3 seconds. That does depend on the tests, though; if most of them are smaller than N=7, it might make it.
Here is a rough sketch of an approach for this problem.
First note that the lattice points need |x|+|y| < N, which results in a diamond shape going from coordinates 0,6 to 6,0 i.e. with 7 points on each side.
If you imagine rotating this diamond by 45 degrees, you will end up with a 7*7 square lattice which may be easier to think about. (Although note that there are also intermediate 6 high columns.)
For example, for N=3 the original lattice points are:
..A..
.BCD.
EFGHI
.JKL.
..M..
Which rotate to
A D I
C H
B G L
F K
E J M
On the (possibly rotated) lattice I would attempt to solve by dynamic programming the problem of counting the number of ways of placing armies in the first x columns such that the last column is a certain string (plus a boolean flag to say whether some points have been placed yet).
The string contains a digit for each lattice point.
0 represents an empty location
1 represents an isolated point
2 represents the first of a new connected group
3 represents an intermediate in a connected group
4 represents the last in a connected group
During the algorithm the strings can represent shapes containing multiple connected groups, but we reject any transformations that leave an orphaned connected group.
When you have placed all columns you need to only count strings which have at most one connected group.
For example, the string for the first 5 columns of the shape below is:
....+ = 2
..+++ = 3
..+.. = 0
..+.+ = 1
..+.. = 0
..+++ = 3
..+++ = 4
The middle + is currently unconnected, but may become connected by a later column, so it still needs to be tracked. (In this diagram I am also assuming an up/down/left/right 4-connectivity. The rotated lattice should really use a diagonal connectivity, but I find that a bit harder to visualise and I am not entirely sure the approach is still valid with that connectivity.)
I appreciate that this answer is not complete (and could do with lots more pictures/explanation), but perhaps it will prompt someone else to provide a more complete solution.