Investors and pools - backtracking - algorithm

I've decided to study backtracking in more depth, and I have the following task:
Given N investors, M cities, and an N by M matrix P of investor preferences (P[i, j] = 1 when the i-th investor would like a pool to be built in the j-th city, P[i, j] = 0 when he is neutral, and P[i, j] = -1 when he is sceptical), together with an acceptance level L: an investor is considered convinced if, for a given choice of cities, the sum of his preferences over the chosen cities is greater than or equal to L. Find the maximal number of investors that can be convinced and the cities in which pools should be built.
I have tried using backtracking, but I wonder if it is possible to optimize it further. For now, on each recursion level I keep track of how many people can possibly still be convinced. If this number is less than or equal to my current maximum, then I return (there will be no better answer).

I'm not sure if this is what you're looking for, but with a little trick, you can express the problem as an integer linear program (ILP). Then you can use an integer linear programming solver (for example, GLPK) to find an optimal solution.
Let s[i] be 0-1 integer variables (i ranging over investors), c[j] be 0-1 integer variables (j ranging over cities), and K be a large number (L + the number of cities will do, since an investor's preference sum is never below minus the number of cities).
Then, your problem is to minimize sum(s[i]) such that for each i, sum over j of (P[i, j]*c[j]) + s[i] * K >= L. The value of sum(s[i]) in the optimal solution is the number of dissatisfied investors, and c[j] indicates whether to build a pool in city j.
This formulation of the problem is in a standard form for ILPs, so you're good to go.
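As a rough illustration (not a replacement for a real ILP solver), the sketch below enumerates every 0/1 city vector for a small instance and checks the constraint from the formulation above. The function names `best_pools` and `ilp_feasible` are made up for this example, and the big constant is taken as K = L + number of cities, assuming L is positive.

```python
from itertools import product

def best_pools(P, L):
    """Try every 0/1 city vector c and count convinced investors.

    P is a list of rows (one per investor) with entries in {-1, 0, 1};
    investor i is convinced when sum_j P[i][j] * c[j] >= L.
    """
    n_cities = len(P[0])
    best_count, best_c = -1, None
    for c in product((0, 1), repeat=n_cities):
        convinced = sum(
            1 for row in P
            if sum(p * cj for p, cj in zip(row, c)) >= L
        )
        if convinced > best_count:
            best_count, best_c = convinced, c
    return best_count, best_c

def ilp_feasible(P, L, c, s):
    """Check the ILP constraints for a city vector c and slack vector s."""
    # Assumed big constant: a preference sum is never below -(number of cities).
    K = L + len(P[0])
    return all(
        sum(p * cj for p, cj in zip(row, c)) + s_i * K >= L
        for row, s_i in zip(P, s)
    )
```

The enumeration is exponential in the number of cities, so it is only useful for cross-checking a solver or a backtracking implementation on small inputs.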

Related

Game of choosing maximum amount after removing K coins optimally

I am given the following task to solve:
Two players play a game. There is a pile of coins, and each coin has a value. The players take turns, each choosing 1 coin per turn. The goal is to have the highest total value at the end. Each player is forced to play optimally (that means always choosing the highest value from the pile). I must find the sums of the 2 players, or the difference between their highest possible sums.
Constraints: all values are positive integers.
The task above is a classic greedy problem. From what I've tried, the coins can be sorted with quicksort and then picked in order, alternating between the 2 players. If you need a better running time, radix sort performed better in my tests. OK, so this task is pretty easy.
Now I have the same task as above BUT the first player must remove OPTIMALLY K coins such that the difference between their scores is maximal. Well this sounds like DP but my mind can't come up with the solution. I must find out again the maximal difference between their points(with both players playing optimally). Or the points of the 2 players in such a way that the difference between them is maximal.
Is there such an algorithm already implemented? Or can someone give me some tips on this issue?
Here is a DP approach solution. We consider n coins, sorted by descending order to simplify the notation (meaning coins[0] is the highest value coin, while coins[n-1] has the lowest value), and we want to remove k coins in order to win the game with a margin as big as possible.
We will consider a matrix M, of dimensions (n-k+1) by (k+1), so that i ranges over 0..n-k and j over 0..k.
M stores the following: M(i, j) is the best possible score difference after playing i turns, when j coins have been removed out of the i+j best coins. It may sound a bit counter-intuitive at first, but it actually is what we are looking for.
Indeed, we have already a value to initialize our matrix: M(0, 0) = 0.
We also can see that M(n-k, k) is actually the solution to the problem we want to solve.
We now need recurrence equations to fill up our matrix. We consider that we want to maximize the score difference for the first player. To maximize the score difference for the second player, the approach is the same, just modify some signs.
if i = 0 then:
    M(i, j) = 0 // score difference is always 0 after playing 0 turns
else if j = 0 and i % 2 = 1: // turn i is played by player 1
    M(i, j) = M(i-1, j) + coins[i+j-1]
else if j = 0 and i % 2 = 0: // turn i is played by player 2
    M(i, j) = M(i-1, j) - coins[i+j-1]
else if i % 2 = 1:
    M(i, j) = max(M(i, j-1), M(i-1, j) + coins[i+j-1])
else if i % 2 = 0:
    M(i, j) = max(M(i, j-1), M(i-1, j) - coins[i+j-1])
With 0-based coins, the last of the i+j best coins is coins[i+j-1], and player 1 plays the odd-numbered turns. This recurrence simply means that the best choice, at any point, is between removing that coin (the case where the best value is M(i, j-1)) and not removing it (the case where the best value is M(i-1, j) +/- coins[i+j-1]).
That will give you the final score difference, but not the set of coins to remove. To find it, you must keep the 'optimal path' that your program used to calculate the matrix values (did the best value come from M(i-1, j) or from M(i, j-1)?).
This path gives you the set you are looking for. By the way, you can see this makes sense: there are "n choose k" ways to remove k coins out of n coins, and there are likewise "n choose k" paths from top left to bottom right in an (n-k+1) by (k+1) matrix if you're allowed to go right or down only.
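The DP and the path reconstruction can be sketched as follows. This is only an illustration under the assumptions stated in the answer (coins sorted in descending order, 0-based, player 1 moves first); the function name `best_margin` is invented for this example.

```python
def best_margin(coins, k):
    """Return (margin, removed): player 1's best score difference after
    removing k coins, and the values of the coins that were removed."""
    coins = sorted(coins, reverse=True)
    turns = len(coins) - k
    NEG = float("-inf")
    # M[i][j]: best margin after i turns with j of the top i+j coins removed.
    M = [[NEG] * (k + 1) for _ in range(turns + 1)]
    for j in range(k + 1):
        M[0][j] = 0
    for i in range(1, turns + 1):
        sign = 1 if i % 2 == 1 else -1  # odd turns belong to player 1
        for j in range(k + 1):
            keep = M[i - 1][j] + sign * coins[i + j - 1]
            remove = M[i][j - 1] if j > 0 else NEG
            M[i][j] = max(keep, remove)
    # Walk back from M[turns][k] to recover which coins were removed.
    removed = []
    i, j = turns, k
    while i > 0 and j > 0:
        if M[i][j] == M[i][j - 1]:
            removed.append(coins[i + j - 1])
            j -= 1
        else:
            i -= 1
    removed.extend(coins[:j])  # no turns left: the remaining top j coins were removed
    return M[turns][k], removed
```

For example, with coins 10, 9, 1 and k = 1, removing the 9 leaves player 1 with 10 against 1, which is the margin the table ends up with.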
If this explanation is still unclear, do not hesitate to ask for clarification in the comments; I'll edit the answer to make it clearer.

Algorithm to find best combination or path through nodes

As I am not very proficient in various optimization/tree algorithms, I am seeking help.
Problem Description:
Assume, a large sequence of sorted nodes is given with each node representing an integer value L. L is always getting bigger with each node and no nodes have the same L.
The goal now is to find the best combination of nodes, where the difference between the L-values of subsequent nodes is closest to a given integer value M(L) that changes over L.
Example:
So, in the beginning I would have L = 50 and M = 100. The next nodes have L = 70,140,159,240,310.
First, the value of 159 seems to be closest to L+M = 150, so it is chosen as the right value.
However, in the next step, M=100 is still given and we notice that L+M = 259, which is far away from 240.
If we now go back and choose the node with L=140 instead, which is then followed by 240, the overall match between the M values and the L-differences is stronger. The algorithm should be able to find its way back to the optimal path, even if a mistake was made along the way.
Some additional information:
1) the start node is not necessarily part of the best combination/path, but if required, one could first develop an algorithm, which chooses the best starter candidate.
2) the optimal combination of nodes is following the sorted sequence and not "jumping back" -> so 1,3,5,7 is possible but not 1,3,5,2,7.
3) in the end, the differences between the L values of chosen nodes should in the mean squared sense be closest to the M values
Any help is much appreciated!
If I understand your question correctly, you could use Dijkstra's algorithm:
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
http://www.mathworks.com/matlabcentral/fileexchange/20025-dijkstra-s-minimum-cost-path-algorithm
For that you have to know the neighbours of every node and create an adjacency matrix. With the implementation of Dijkstra's algorithm which I posted above you can specify edge weights. You could specify your edge weight in a manner that it is L of the node accessed + M. So for every node combination you have your L of new node + M. In that way the algorithm should find the optimum path between your nodes.
To get all edge combinations you can use MATLAB's graph functions:
http://se.mathworks.com/help/matlab/ref/graph.html
If I understand your problem correctly you need an undirected graph.
You can access all edges with the command
G.Edges after you have created the graph.
I know it's not the perfect answer, but I hope it helps!
P.S. Just watch out: Dijkstra's algorithm can only handle non-negative edge weights.
Suppose we are given a number M and a list of n numbers, L[1], ..., L[n], and we want to find a subsequence of at least q of the latter numbers that minimises the sum of squared errors (SSE) with respect to M, where the SSE of a list of k positions x[1], ..., x[k] with respect to M is given by
SSE(M, x[1], ..., x[k]) = sum((L[x[i]]-L[x[i-1]]-M)^2) over all 2 <= i <= k,
with the SSE of a list of 0 or 1 positions defined to be 0.
(I'm introducing the parameter q and associated constraint on the subsequence length here because without it, there always exists a subsequence of length exactly 2 that achieves the minimum possible SSE -- and I'm guessing that such a short sequence isn't helpful to you.)
This problem can be solved in O(qn^2) time and O(qn) space using dynamic programming.
Define f(i, j) to be the minimum sum of squared errors achievable under the following constraints:
The number at position i is selected, and is the rightmost selected position. (Here, i = 0 implies that no positions are selected.)
We require that at least j (instead of q) of these first i numbers are selected.
Also define g(i, j) to be the minimum of f(k, j) over all 0 <= k <= i. Thus g(n, q) will be the minimum sum of squared errors achievable on the entire original problem. For efficient (O(1)) calculation of g(i, j), note that
g(i>0, j>0) = min(g(i-1, j), f(i, j))
g(0, 0) = 0
g(0, j>0) = infinity
To calculate f(i, j), note that if i > 0 then any solution must be formed by appending the ith position to some solution Y that selects at least j-1 positions and whose rightmost selected position is k, for some k < i. The total SSE of this solution to the (i, j) subproblem will be whatever the SSE of Y was, plus a fixed term of (L[i]-L[k]-M)^2 -- so for each k, it suffices to minimise the SSE of Y. But we can compute that minimum: it is f(k, j-1). (Note that g(k, j-1) would not do here, because the added term depends on the exact rightmost position k.)
Since this holds for any 1 <= k < i, it suffices to try all such values of k, and take the one that gives the lowest total SSE:
f(i>=j, j>=2) = min of (f(k, j-1) + (L[i]-L[k]-M)^2) over all 1 <= k < i
f(i>=j, j<2) = 0 # If we only need 0 or 1 position, SSE is 0
f(i, j>i) = infinity # Can't choose > i positions if the rightmost chosen position is i
With the above recurrences and base cases, we can compute g(n, q), the minimum possible sum of squared errors for the entire problem. By memoising values of f(i, j) and g(i, j), the time to compute all needed values of f(i, j) is O(qn^2), since there are at most (n+1)*(q+1) possible distinct combinations of input parameters (i, j), and computing a particular value of f(i, j) requires at most (n+1) iterations of the loop that chooses values of k, each iteration of which takes O(1) time outside of recursive subcalls. Storing solution values of f(i, j) requires at most (n+1)*(q+1), or O(qn), space, and likewise for g(i, j). As established above, g(i, j) can be computed in O(1) time when all needed values of f(x, y) have been computed, so g(n, q) can be computed in the same time complexity.
To actually reconstruct a solution corresponding to this minimum SSE, you can trace back through the computed values of f(i, j) in reverse order, each time looking for a value of k that achieves a minimum value in the recurrence (there may in general be many such values of k), setting i to this value of k, and continuing on until i=0. This is a standard dynamic programming technique.
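A minimal sketch of this DP with traceback is given below, under a few assumptions that are not in the original answer: positions are 0-based, M is a single constant, and 2 <= q <= n. It takes the minimum over f(k, j-1), that is, over predecessors that end exactly at k, because the squared-error term depends on the exact rightmost position. The name `min_sse_subsequence` is made up for illustration.

```python
def min_sse_subsequence(L, M, q):
    """Return (sse, picks): the minimal sum of (L[b] - L[a] - M)**2 over
    consecutive picked positions a < b, using q picks (0-based indices)."""
    n = len(L)
    if not 2 <= q <= n:
        raise ValueError("this sketch assumes 2 <= q <= len(L)")
    INF = float("inf")
    # f[i][j]: best SSE with j picks, the rightmost one at position i.
    f = [[INF] * (q + 1) for _ in range(n)]
    parent = [[None] * (q + 1) for _ in range(n)]
    for i in range(n):
        f[i][1] = 0  # a single pick has no differences, so SSE is 0
        for j in range(2, min(q, i + 1) + 1):
            for k in range(j - 2, i):  # k needs at least j-1 positions up to it
                cand = f[k][j - 1] + (L[i] - L[k] - M) ** 2
                if cand < f[i][j]:
                    f[i][j], parent[i][j] = cand, k
    # Best end position, then follow parent pointers back to the first pick.
    end = min(range(n), key=lambda i: f[i][q])
    picks, i, j = [], end, q
    while i is not None:
        picks.append(i)
        i, j = parent[i][j], j - 1
    return f[end][q], picks[::-1]
```

Using exactly q picks is enough for the "at least q" requirement, since dropping the leftmost pick of a longer solution never increases the SSE. On the question's example (L = 50, 70, 140, 159, 240, 310 with M = 100 and q = 3) the sketch selects 50, 140, 240.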
I am answering my own post with my current implementation, in order to structure my post and load images. Unfortunately, the code does not do what it should do. Imagine L, M and q given as in the images below. With the calcf and calcg functions I calculated the F and G matrices, where F(i+1,j+1) is the calculated and stored f(i,j) and G(i+1,j+1) is g(i,j). The SSE of the optimal combination should be G(N+1,q+1), but the result is wrong. If anyone can find the mistake, that would be much appreciated.
G and F Matrix of given problem in the workspace. G and F are created by calculating g(N,q) via calcg(L,N,q,M).
calcf and calcg functions

Partitioning a list of integers to minimize difference of their sums

Given a list of integers l, how can I partition it into 2 lists a and b such that d(a,b) = abs(sum(a) - sum(b)) is minimum. I know the problem is NP-complete, so I am looking for a pseudo-polynomial time algorithm i.e. O(c*n) where c = sum(l map abs). I looked at Wikipedia but the algorithm there is to partition it into exact halves which is a special case of what I am looking for...
EDIT:
To clarify, I am looking for the exact partitions a and b and not just the resulting minimum difference d(a, b)
To generalize, what is a pseudo-polynomial time algorithm to partition a list of n numbers into k groups g1, g2 ...gk such that (max(S) - min(S)).abs is as small as possible where S = [sum(g1), sum(g2), ... sum(gk)]
A naive, trivial and still pseudo-polynomial solution would be to use the existing solution to subset-sum, and repeat it for every target from sum(array)/2 down to 0 (returning the first subset found).
Complexity of this solution will be O(W^2*n) where W is the sum of the array.
pseudo code:
for cand from sum(array)/2 to 0 descending:
    subset <- subsetSumSolver(array,cand)
    if subset != null:
        return subset
The above will return the maximal subset that is lower/equals sum(array)/2, and the other part is the complement for the returned subset.
However, the dynamic programming for subset-sum should be enough.
Recall that the formula is:
f(0,i) = true
f(x,0) = false | x != 0
f(x,i) = f(x-arr[i],i-1) OR f(x,i-1)
Building the matrix for the target sum(array)/2 also fills in the entries for every smaller x, so a single run covers all candidate values.
After you generate the DP matrix, just find the maximal value of x <= sum(array)/2 such that f(x,n)=true; this is the best partition you can get.
Complexity in this case is O(Wn)
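Since the question asks for the actual partitions and not only the difference, here is a hedged sketch of the single-pass subset-sum DP with reconstruction. It assumes non-negative integers (the question's bound c = sum(l map abs) suggests negatives may occur, which this sketch does not handle) and uses a one-dimensional table instead of the full matrix. The name `min_diff_partition` is invented for this example.

```python
def min_diff_partition(nums):
    """Split non-negative integers into lists a, b minimising |sum(a) - sum(b)|."""
    total = sum(nums)
    half = total // 2
    # reach[x]: index of the item that first made sum x reachable, None if unreachable.
    reach = [None] * (half + 1)
    reach[0] = -1
    for idx, v in enumerate(nums):
        # Descending x so that each item is used at most once.
        for x in range(half, v - 1, -1):
            if reach[x] is None and reach[x - v] is not None:
                reach[x] = idx
    # Largest reachable sum not exceeding half of the total.
    best = max(x for x in range(half + 1) if reach[x] is not None)
    # Follow the recorded items back down to sum 0.
    chosen = set()
    x = best
    while x > 0:
        idx = reach[x]
        chosen.add(idx)
        x -= nums[idx]
    a = [v for i, v in enumerate(nums) if i in chosen]
    b = [v for i, v in enumerate(nums) if i not in chosen]
    return a, b
```

The table has about W/2 entries and every item scans it once, which matches the O(Wn) bound quoted above.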
You can phrase this as a 0/1 integer linear programming optimization problem. Let wi be the ith number, and let xi be a 0/1 variable which indicates whether wi is in the first set or not. Then you want to minimize sum(xi wi) - sum((1 - xi) wi) subject to
sum(xi wi) >= sum((1 - xi) wi)
and also subject to all xi being 0 or 1. There has been a lot of research into optimizing 0/1 linear programming solvers. For large total sum W this may be an improvement over the O(W n) pseudo-polynomial time algorithm presented because the W factor is scary.
My first thought is to:
Sort list of integers
Create two empty lists A and B
While iterating from biggest integer to smallest integer...add next integer to the list with the smallest current sum.
This is, of course, not guaranteed to give you the best result but you can bound the result it will give you by the size of the biggest integer in your list
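The greedy idea above could look like the following sketch (the function name `greedy_partition` is just for illustration). It is a heuristic, not an exact method.

```python
def greedy_partition(nums):
    """Assign each number, largest first, to the list with the smaller sum."""
    a, b = [], []
    sum_a = sum_b = 0
    for v in sorted(nums, reverse=True):
        if sum_a <= sum_b:
            a.append(v)
            sum_a += v
        else:
            b.append(v)
            sum_b += v
    return a, b
```

For the input 8, 7, 6, 5, 4 this produces sums 17 and 13, while the split 8 + 7 against 6 + 5 + 4 is perfectly balanced, so the heuristic can miss the optimum even on small inputs.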

Computing Combinations

I am facing difficulty in coming up with a solution for the problem given below:
We are given n boxes, each with a weight (meaning each ball in box B_i has weight C_i).
Each box contains some balls, specifically
{b1,b2,b3...,b_n} (b_i is the count of balls in box B_i).
We have to choose m balls such that the sum of the weights of the m chosen balls is less than a given number T.
How many ways are there to do it?
First, let's have a look on a similar problem:
The similar problem is: if you are looking to maximize the sum (such that it is still smaller than T), you are facing a variation of the subset-sum problem, which is NP-Hard. The variation with a constant number of items is discussed in this thread: Sum-subset with a fixed subset size.
An alternative way to look at the problem is with a 2-dimensional knapsack problem, where weight = cost, and an extra dimension for number of elements. This concept is discussed in this thread: What's the fastest way to solve knapsack prob with two properties
Now, look at your problem: Finding the number of possible ways to achieve a sum which is smaller/equal T is still NP-Hard.
Assume you had a polynomial algorithm to do it, let it be A.
Running A(T) and A(T-1) will give you two numbers, if A(T) > A(T-1), the answer to the subset sum problem would have been true - otherwise it is false, so given a polynomial solution to this problem, we could prove P=NP.
You can solve it by using dynamic programming techniques.
Let f[i][j][k] denote the number of ways to choose j balls from B_1 to B_i with sum of weights exactly k. The answer you want is the sum of f[n][m][k] over all k < T.
Initially, let f[0][0][0] = 1 and f[i][j][k] = 0 for all other i,j,k
for i = 1 to n
    for j = 0 to m
        for k = 0 to T
            for x = 0 to min(b_i,j) # choose x balls from B_i
                y = x * C_i
                if y <= k
                    f[i][j][k] = f[i][j][k] + f[i-1][j-x][k-y] * Comb(b_i,x)
Comb(n,k) is the number of ways to choose k elements from n elements.
The time complexity is O(n m T b) where b is the maximum number of balls in a box.
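Note that, because of the T in the big-O notation, this algorithm is only pseudo-polynomial (T can be exponential in the size of the input), which is consistent with the problem being NP-hard. However, in practice, when T is relatively small, this algorithm is still feasible.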
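A compact sketch of this counting DP is shown below. It makes some assumptions that the question leaves open: balls within a box are distinguishable (hence the binomial factor), weights are non-negative integers, and the bound is strict (total weight < T). It keeps only the table for the previous box, and the name `count_selections` is invented here.

```python
from math import comb

def count_selections(counts, weights, m, T):
    """Number of ways to choose m balls with total weight strictly below T.

    counts[i] is the number of balls in box i, weights[i] the weight of each.
    """
    if T <= 0:
        return 0
    # f[j][k]: ways to pick j balls with total weight exactly k from the boxes seen so far.
    f = [[0] * T for _ in range(m + 1)]
    f[0][0] = 1
    for b, c in zip(counts, weights):
        g = [[0] * T for _ in range(m + 1)]
        for j in range(m + 1):
            for k in range(T):
                for x in range(min(b, j) + 1):  # take x balls from this box
                    y = x * c
                    if y > k:
                        break
                    g[j][k] += f[j - x][k - y] * comb(b, x)
        f = g
    return sum(f[m])  # weights 0 .. T-1, i.e. strictly less than T
```

With two balls of weight 1 and one ball of weight 2, choosing m = 2 balls with T = 4 gives 3 ways, since every pair has weight 2 or 3.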

Maximum Coin Partition

Since standing at the point of sale in the supermarket yesterday, once more trying to heuristically find an optimal partition of my coins while trying to ignore the impatient and nervous queue behind me, I've been pondering about the underlying algorithmic problem:
Given a coin system with values v1,...,vn, a limited stock of coins a1,...,an and the sum s which we need to pay.
We're looking for an algorithm to calculate a partition x1,...,xn (with 0<=xi<=ai) with x1*v1+x2*v2+...+xn*vn >= s such that the sum x1+...+xn - R(r) is maximized, where r is the change, i.e. r = x1*v1+x2*v2+...+xn*vn - s and R(r) is the number of coins returned from the cashier. We assume that the cashier has an unlimited amount of all coins and always gives back the minimal number of coins (by for example using the greedy-algorithm explained in SCHOENING et al.). We also need to make sure that there's no money changing, so that the best solution is NOT to simply give all of the money (because the solution would always be optimal in that case).
Thanks for your creative input!
If I understand correctly, this is basically a variant of subset sum. If we assume you have 1 of each coin (a[i] = 1 for each i), then you would solve it like this:
sum[0] = true
for i = 1 to n do
    for j = maxSum downto v[i] do
        sum[j] |= sum[j - v[i]]
Then find the first k >= s such that sum[k] is true. You can get the actual coins used by keeping track of which coin contributed to each sum[j]. The closer you can get your sum to s using your coins, the less change there will be, which is what you're after.
Now you don't have 1 of each coin i, you have a[i] of each coin i. I suggest this:
sum[0] = true
for i = 1 to n do
    for j = maxSum downto v[i] do
        for k = 1 to a[i] do
            if j - k*v[i] >= 0 do
                sum[j] |= sum[j - k*v[i]] <- use coin i k times
It should be fairly easy to get your x vector from this. Let me know if you need any more details.
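One possible way to recover the x vector is sketched below. Instead of a boolean per sum, it stores which coin type (and how many of it) was added last to reach that sum. It follows this answer's simplification of minimising the change, not the full objective from the question, and it assumes s >= 0 and positive coin values. The name `closest_payment` is made up for the example.

```python
def closest_payment(values, stock, s):
    """Return (total, x) where total is the smallest payable amount >= s and
    x[i] is the number of coins of type i used, or None if s cannot be covered."""
    max_sum = sum(v * a for v, a in zip(values, stock))
    if max_sum < s:
        return None
    # used[t]: None if t is unreachable, else (coin index, count) added last to reach t.
    used = [None] * (max_sum + 1)
    used[0] = (-1, 0)
    for i, (v, a) in enumerate(zip(values, stock)):
        # Descending j: entries below j still describe sums without coin type i.
        for j in range(max_sum, v - 1, -1):
            if used[j] is not None:
                continue
            for k in range(1, a + 1):  # use coin i k times
                if j - k * v < 0:
                    break
                if used[j - k * v] is not None:
                    used[j] = (i, k)
                    break
    total = next(t for t in range(max(s, 0), max_sum + 1) if used[t] is not None)
    # Walk back through the recorded coin types to build x.
    x = [0] * len(values)
    t = total
    while t > 0:
        i, k = used[t]
        x[i] += k
        t -= k * values[i]
    return total, x
```

The table has maxSum + 1 entries, so this is only practical when the total value of the coins in the wallet is small, which is usually the case at a checkout.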

Resources