Dijkstra's Algorithm with Negative Weights Query - algorithm

In this scenario, the aim is to find a path with the smallest total weight. There are 5 sections, each containing different nodes. Nodes are only connected to nodes in adjacent sections, and a path must consist of exactly one node from each section.
For example, let:
section 1 has nodes [1, 2, 3].
section 2 has nodes [4, 5].
section 3 has nodes [6].
section 4 has nodes [7, 8, 9, 10, 11].
section 5 has nodes [12, 13, 14].
A valid path through the sections is [1, 4, 6, 7, 12], and so is [1, 5, 6, 11, 14], etc.
All nodes have negative weights, but negative cycles are impossible (due to the one-node-per-section policy). Does adding a constant to each node's weight resolve the issue of negative weights? If it does, are there any papers which show this? I know there are other algorithms that handle negative weights, but I'm interested in Dijkstra's algorithm. Thanks.

No, you can't do this. Let's look at a counterexample. Suppose we have a graph with nodes A, B, C and edges:
A - B -2 (negative)
A - C 6
B - C 7
We are looking for the shortest path from A to C. In the original graph we have
A - B - C => -2 + 7 = 5 (the shortest path, 5 < 6)
A - C => 6
The best choice is A - B - C. Now, let's get rid of the negative edge by adding 2 to every edge. We now have
A - B 0
A - C 8
B - C 9
A - B - C => 0 + 9 = 9
A - C => 8 (the shortest path, 8 < 9)
Please note that now the shortest path is A - C. Alas! By adding a constant value to each edge we have changed the problem itself: a path with k edges gains k times the constant, so paths with more edges are penalized more heavily. It no longer matters which algorithm we use.
Edit: a counterexample with all edges negative (directed arcs, to prevent negative cycles):
A -> B -6
B -> C -1
A -> C -5
Before adding 6 we have
A -> B -> C = -6 - 1 = -7 (the shortest path)
A -> C = -5
After adding 6 we get
A -> B 0
B -> C 5
A -> C 1
A -> B -> C = 0 + 5 = 5
A -> C = 1 (the shortest path)
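To see the effect concretely, here is a small self-contained Python sketch (my own illustration, not from the answer) that totals the two candidate paths in the second counterexample before and after the shift; note how the winner flips:

def path_costs(edges, paths):
    # Total weight of each path, with edges given as a {(u, v): weight} dict.
    return {p: sum(edges[(p[i], p[i + 1])] for i in range(len(p) - 1))
            for p in paths}

edges = {('A', 'B'): -6, ('B', 'C'): -1, ('A', 'C'): -5}
paths = [('A', 'B', 'C'), ('A', 'C')]

print(path_costs(edges, paths))
# {('A', 'B', 'C'): -7, ('A', 'C'): -5}  -> A -> B -> C wins

shifted = {e: w + 6 for e, w in edges.items()}
print(path_costs(shifted, paths))
# {('A', 'B', 'C'): 5, ('A', 'C'): 1}    -> A -> C wins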

Related

Search - find closest node to n different start points (uniform cost)

Suppose I have a graph where the cost of travelling between nodes is uniform. I'm trying to find the closest node that 2 or more start nodes can travel to, where "closest" is measured as the cumulative cost of reaching the common node from all start points.
If I wanted to find the closest common node to nodes A and B, that node would be E.
A -> E (2 cost)
B -> E (1 cost)
If I wanted to find the closest common node to nodes A, B, C, that node would be F.
A -> F (3 cost)
B -> F (2 cost)
C -> F (1 cost)
And if I wanted to find the closest common node between nodes G and E, no such node exists.
So there should be two outputs: either the closest common node, or an error stating that the start nodes cannot reach a common node.
I would appreciate being given an algorithm that can achieve this. A link to an article, pseudocode, or code in any language is fine; below is some Python code that represents the graph above as a defaultdict(list) object.
from enum import Enum
from collections import defaultdict

class Type(Enum):
    A = 1
    B = 2
    C = 3
    D = 4
    E = 5
    F = 6
    G = 7  # was 6, which would have made G an alias of F under Enum's aliasing rules

paths = defaultdict(list)
paths[Type.A].append(Type.D)
paths[Type.D].append(Type.G)
paths[Type.D].append(Type.E)
paths[Type.B].append(Type.E)
paths[Type.E].append(Type.F)
paths[Type.C].append(Type.F)
Thanks in advance.
Thanks to @VincentvanderWeele for the suggestion:
Example cost of all nodes from A and B:

   A  B  C  D  E  F  G
   ___________________
A  0  X  X  1  2  3  2
B  X  1  X  X  2  2  X

As an optimisation, when working out the 2nd+ node you can skip any nodes that the previous nodes cannot travel to, e.g.

   A  B  C  D  E  F  G
   ___________________
A  0  X  X  1  2  3  2
B  X  X  X  X  2  2  X
      ^

Possible closest nodes:
E = 2 + 2 = 4
F = 2 + 3 = 5
Result is E since it has the lowest cost.
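One way to implement this (my own sketch, not from the thread; closest_common_node is a made-up name): run a uniform-cost search (plain BFS, since all edges cost the same) from each start node over the `paths` graph defined in the question, then pick the node reachable from all starts with the smallest summed cost, or report failure.

from collections import deque

def bfs_costs(paths, start):
    # Uniform-cost (BFS) distance from start to every reachable node.
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in paths[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def closest_common_node(paths, starts):
    # Node minimizing the summed cost from all starts, or None if unreachable.
    all_dists = [bfs_costs(paths, s) for s in starts]
    common = set.intersection(*(set(d) for d in all_dists))
    if not common:
        return None
    return min(common, key=lambda n: sum(d[n] for d in all_dists))

print(closest_common_node(paths, [Type.A, Type.B]))          # Type.E
print(closest_common_node(paths, [Type.A, Type.B, Type.C]))  # Type.F
print(closest_common_node(paths, [Type.G, Type.E]))          # None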

Neighbors in the matrix - algorithm

I have a problem coming up with an algorithm for this "graph" :(
Maybe one of you would be so kind as to point me in the right direction <3
The task is as follows:
We have a board of at least 3x3 (it doesn't have to be square; it can be 4x5, for example). The user specifies a sequence of moves (as in an Android lock pattern). The task is to check how many of the consecutive points in the given sequence are adjacent to each other horizontally or vertically.
Here is an example:
Matrix:
1 2 3 4
5 6 7 8
9 10 11 12
The user entered the code: 10,6,7,3
The algorithm should return the number 3 because:
10 is a neighbor of 6
6 is a neighbor of 7
7 is a neighbor of 3
Eventually return 3
Second example:
Matrix:
1 2 3
4 5 6
7 8 9
The user entered the code: 7,8,6,3
The algorithm should return 2 because:
7 is a neighbor of 8
8 is not a neighbor of 6
6 is a neighbor of 3
Eventually return 2
Of course, the number of comparisons equals the length of the array minus 1.
(Sorry for the Polish words "ile" ("how many") and "tutaj" ("here") in the screenshots; I'm Polish.)
If all the codes are unique, use them as keys in a dictionary (with (row, col) pairs as values). Loop through the user input from the 2nd item to the end and check whether abs(cur.row - prev.row) + abs(cur.col - prev.col) == 1. This is not space-efficient, but it handles the user input in linear time.
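A minimal Python sketch of this dictionary approach (my own illustration; count_adjacent is a made-up name):

def count_adjacent(width, height, code):
    # Map each cell's number to its (row, col) coordinate.
    coords = {row * width + col + 1: (row, col)
              for row in range(height) for col in range(width)}
    count = 0
    for prev, cur in zip(code, code[1:]):
        (pr, pc), (cr, cc) = coords[prev], coords[cur]
        if abs(cr - pr) + abs(cc - pc) == 1:  # Manhattan distance 1 => neighbors
            count += 1
    return count

print(count_adjacent(4, 3, [10, 6, 7, 3]))  # 3
print(count_adjacent(3, 3, [7, 8, 6, 3]))   # 2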
The idea is that you have 4 conditions, one for each direction. Given any matrix of shape (n, m) filled row by row with a sequence of integers, and given any element:
The element to its left or right always differs by + or - 1 from the given element (and lies in the same row).
The element above or below always differs by + or - m from the given element.
So x and y are neighbors if abs(x-y) is m, or if abs(x-y) is 1 and both lie in the same row (without the same-row check, a row boundary such as 4 and 5 in a 4-wide matrix would be falsely counted).
I demonstrate this in Python.
import numpy as np

def get_neighbors(seq, matrix):
    m = matrix.shape[1]  # row width
    # Conditions: same row and differing by 1 (left/right), or differing by m (up/down)
    check = lambda x, y, m: (np.abs(x - y) == 1 and (x - 1) // m == (y - 1) // m) or np.abs(x - y) == m
    # Pairs of consecutive sequence elements, each appended with m
    params = zip(seq, seq[1:], [m] * (len(seq) - 1))
    neighbours = [check(*i) for i in params]
    count = sum(neighbours)
    return neighbours, count

seq = [10, 6, 7, 3]
matrix = np.arange(1, 13).reshape((3, 4))
neighbours, count = get_neighbors(seq, matrix)
print('Matrix:')
print(matrix)
print('')
print('Sequence:', seq)
print('')
print('Count of neighbors:', count)
Matrix:
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
Sequence: [10, 6, 7, 3]
Count of neighbors: 3
Another example -
seq = [7,8,6,3]
matrix = np.arange(1,10).reshape((3,3))
neighbours, count = get_neighbors(seq, matrix)
Matrix:
[[1 2 3]
[4 5 6]
[7 8 9]]
Sequence: [7, 8, 6, 3]
Count of neighbors: 2
So your input is the width of the table, the height of the table, and a list of numbers.
W = 4, H = 3, list = [10,6,7,3]
There are two steps:
1. Convert the list of numbers into a list of (row, column) coordinates (1 to [1,1], 5 to [2,1], 12 to [3,4]).
2. In the new list of coordinates, find consecutive pairs which have one coordinate identical and the other differing by exactly 1.
Both steps are quite simple ("for" loops). Do you have problems with step 1 or step 2?

Even distribution of edges in bipartite graph

I am given a bipartite and directed graph, initially without edges. One set of nodes is called subjects, the other set is called objects. Edges can only be constructed from a subject to an object.
The number of subjects (numSubj) and the number of objects (numObj) are given.
Moreover the number of available edges (numEdges) is given.
The goal is to distribute edges from subjects to objects evenly. This means all subjects should have a similar number of outgoing edges, analogously all objects should have a similar number of ingoing edges. Each subject and object has to have at least one connected edge.
Please suggest a solution (e.g. in pseudo code)
First of all, let's index the items in each set of nodes from 1 to numSubj and from 1 to numObj. Let's also assume that numSubj < numObj (if this is not true, simply flip the sets, solve, and flip them back).
Now calculate the total number of edges, which is the lcm of these two numbers. From it you can derive the number of ingoing edges per object by dividing the result by numObj (call it A), and the number of outgoing edges per subject by dividing by numSubj (call it B).
After this calculation, for each subject create edges to B objects, always picking objects whose current number of ingoing edges is still below A.
This process can be done like this (0-indexed):
subject i is connected to objects [i * B, i * B + 1, ..., i * B + B - 1] mod numObj
With 2 and 5:
LCM = 10
Ingoing = 10 / 5 = 2
Outgoing = 10 / 2 = 5
1 -> 1, 2, 3, 4, 5
2 -> 1, 2, 3, 4, 5
With 4 and 8:
LCM = 8
Ingoing = 8 / 8 = 1
Outgoing = 8 / 4 = 2
1 -> 1, 2
2 -> 3, 4
3 -> 5, 6
4 -> 7, 8
With 4 and 6:
LCM = 12
Ingoing = 12 / 6 = 2
Outgoing = 12 / 4 = 3
1 -> 1, 2, 3
2 -> 4, 5, 6
3 -> 1, 2, 3
4 -> 4, 5, 6
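A small Python sketch of this construction (my own illustration of the formula above, not code from the answer; as in the answer, the total number of edges is the lcm of the two sizes rather than the given numEdges):

from math import gcd

def distribute_edges(num_subj, num_obj):
    # Round-robin assignment: lcm(num_subj, num_obj) edges in total.
    total = num_subj * num_obj // gcd(num_subj, num_obj)  # lcm
    out_per_subj = total // num_subj
    edges = []
    for i in range(num_subj):  # 0-indexed subjects and objects
        for k in range(out_per_subj):
            edges.append((i, (i * out_per_subj + k) % num_obj))
    return edges

# 4 subjects, 6 objects: each subject gets 3 edges, each object gets 2
print(distribute_edges(4, 6))
# [(0, 0), (0, 1), (0, 2), (1, 3), (1, 4), (1, 5),
#  (2, 0), (2, 1), (2, 2), (3, 3), (3, 4), (3, 5)]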

algorithm to maximize sum of unique set with multiple contributors

I'm looking for an approach to maximize the value of a common set comprised of contributions from multiple sources, with a fixed number of contributions from each.
Example problem: 3 people each have a hand of cards. Each hand is a set of distinct cards, but the 3 sets may overlap. Each player can pick three cards to contribute to the middle. How can I maximize the sum of the 9 contributed cards where
each player contributes exactly 3 cards
all 9 cards are unique (when possible)
the solution can scale to the range of 200 possible "cards", 40 contributors and 6 contributions each.
Integer programming sounds like a viable approach. Without guaranteeing it, this problem also feels NP-hard, meaning there is no general algorithm beating brute force (without assumptions about the possible input; IP solvers actually do assume a lot / are tuned for real-world problems).
(Alternative off-the-shelf approaches: constraint programming and SAT solvers. CP: easy to formulate and fast at combinatorial search, but weaker at branch-and-bound style maximization; SAT: hard to formulate, as counters need to be built, very fast combinatorial search, and again no concept of maximization: it needs a decision-problem-like transform.)
Here is a complete Python-based example solving this problem (in the hard-constraint version: each player has to play exactly N_PLAY cards). As I'm using cvxpy, the code is quite math-style and should be easy to read even without knowing Python or the lib!
Before presenting the code, some remarks:
General remarks:
The IP-approach is heavily dependent on the underlying solver!
Commercial solvers (Gurobi and co.) are the best
Good open-source solvers: CBC, GLPK, lpsolve
The default solver in cvxpy is not ready for this (when the problem grows)!
In my experiments, with my data, commercial solvers scale very well!
A popular commercial solver needs a few seconds for:
N_PLAYERS = 40 , CARD_RANGE = (0, 400) , N_CARDS = 200 , N_PLAY = 6
Using cvxpy is not best practice, as it's created for very different use-cases; this induces some penalty in terms of model-creation time
I'm using it because I'm familiar with it and I love it
Improvements: Problem
We are solving the each-player-plays-exactly-N_PLAY-cards version here
Sometimes there is no solution
Your model description does not formally describe how to handle this
General idea to improve the code:
bigM-style penalty-based objective: e.g. Maximize(n_unique * bigM + classic_score)
(where bigM is a very big number)
Improvements: Performance
We build all those pairwise conflicts and use a classic not-both constraint
The number of conflicts, depending on the task, can grow a lot
Improvement idea (too lazy to add):
Calculate the set of maximal cliques and add these as constraints
Will be much more powerful, but:
For general conflict graphs this problem should be NP-hard too, so an approximation algorithm needs to be used
(as opposed to other applications like time intervals, where this set can be calculated in polynomial time, as the graphs will be chordal)
Code:
import numpy as np
import cvxpy as cvx  # note: this uses the pre-1.0 cvxpy API (Bool, sum_entries, mul_elemwise)

np.random.seed(1)

""" Random problem """
N_PLAYERS = 5
CARD_RANGE = (0, 20)
N_CARDS = 10
N_PLAY = 3

card_set = np.arange(*CARD_RANGE)
p = np.empty(shape=(N_PLAYERS, N_CARDS), dtype=int)
for player in range(N_PLAYERS):
    p[player] = np.random.choice(card_set, size=N_CARDS, replace=False)

print('Players and their cards')
print(p)

""" Preprocessing:
        Conflict-constraints
        -> if p[i, j] == p[x, y] => don't allow both
        Could be made more efficient
"""
conflicts = []
for p_a in range(N_PLAYERS):
    for c_a in range(N_CARDS):
        for p_b in range(p_a + 1, N_PLAYERS):  # sym-reduction
            for c_b in range(N_CARDS):
                if p[p_a, c_a] == p[p_b, c_b]:
                    conflicts.append( ((p_a, c_a), (p_b, c_b)) )
# print(conflicts)  # debug

""" Solve """
# Decision-vars
x = cvx.Bool(N_PLAYERS, N_CARDS)

# Constraints
constraints = []

# -> Conflicts
for (p_a, c_a), (p_b, c_b) in conflicts:
    # don't allow both -> linearized
    constraints.append(x[p_a, c_a] + x[p_b, c_b] <= 1)

# -> N to play
constraints.append(cvx.sum_entries(x, axis=1) == N_PLAY)

# Objective: 2d -> 1d flattening; cvx.vec stacks columns,
# so flatten p in Fortran order to match (C vs. Fortran storage)
objective = cvx.sum_entries(cvx.mul_elemwise(p.flatten(order='F'), cvx.vec(x)))
# print(objective)  # debug

# Problem
problem = cvx.Problem(cvx.Maximize(objective), constraints)
problem.solve(verbose=False)

print('MIP solution')
print(problem.status)
print(problem.value)
print(np.round(x.T.value))

sol = x.value
nnz = np.where(abs(sol - 1) <= 0.01)  # being careful with fp-math
sol_p = p[nnz]
assert sol_p.shape[0] == N_PLAYERS * N_PLAY

""" Output solution """
for player in range(N_PLAYERS):
    print('player: ', player, 'with cards: ', p[player, :])
    print('    plays: ', sol_p[player*N_PLAY:player*N_PLAY+N_PLAY])
Output:
Players and their cards
[[ 3 16 6 10 2 14 4 17 7 1]
[15 8 16 3 19 17 5 6 0 12]
[ 4 2 18 12 11 19 5 6 14 7]
[10 14 5 6 18 1 8 7 19 15]
[15 17 1 16 14 13 18 3 12 9]]
MIP solution
optimal
180.00000005500087
[[ 0. 0. 0. 0. 0.]
[ 0. 1. 0. 1. 0.]
[ 1. 0. 0. -0. -0.]
[ 1. -0. 1. 0. 1.]
[ 0. 1. 1. 1. 0.]
[ 0. 1. 0. -0. 1.]
[ 0. -0. 1. 0. 0.]
[ 0. 0. 0. 0. -0.]
[ 1. -0. 0. 0. 0.]
[ 0. 0. 0. 1. 1.]]
player: 0 with cards: [ 3 16 6 10 2 14 4 17 7 1]
plays: [ 6 10 7]
player: 1 with cards: [15 8 16 3 19 17 5 6 0 12]
plays: [ 8 19 17]
player: 2 with cards: [ 4 2 18 12 11 19 5 6 14 7]
plays: [12 11 5]
player: 3 with cards: [10 14 5 6 18 1 8 7 19 15]
plays: [14 18 15]
player: 4 with cards: [15 17 1 16 14 13 18 3 12 9]
plays: [16 13 9]
This looks like a packing problem, where you want to pack 3 disjoint subsets of your original sets, each of size 3, and maximize the sum. You can formulate it as an ILP. Without loss of generality, we can assume the cards represent natural numbers ranging from 1 to N.
Let a_i in {0,1} indicate whether player A plays the card with value i, where i is in {1,...,N}. Notice that if player A doesn't have card i in his hand, a_i is fixed to 0 from the start.
Similarly, define b_i and c_i variables for players B and C.
Also, let m_i in {0,1} indicate whether card i will appear in the middle, i.e., one of the players will play a card with value i.
Now you can say:
Maximize Sum(m_i * i), subject to:
For each i in {1,...,N}:
a_i, b_i, c_i, m_i are in {0, 1}
m_i = a_i + b_i + c_i
Sum(a_i) = 3, Sum(b_i) = 3, Sum(c_i) = 3
Discussion
Notice that constraints 1 and 2 force the uniqueness of each card in the middle.
I'm not sure how big a problem commercial or non-commercial solvers can handle with this program, but notice that this is really a binary linear program, which might be simpler to solve than a general ILP, so it might be worth trying for the size you are looking for.
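For illustration, here is a sketch of this binary program in modern cvxpy (my own translation of the formulation above, with made-up toy hands; it needs a mixed-integer-capable solver such as GLPK_MI or CBC to be installed):

import numpy as np
import cvxpy as cp

N = 20  # card values 1..N (toy size)
hands = {
    'A': [3, 7, 12, 15, 18],
    'B': [2, 7, 9, 15, 20],
    'C': [5, 9, 12, 16, 19],
}
values = np.arange(1, N + 1)

# One binary vector per player; m_i marks whether card i reaches the middle.
play = {name: cp.Variable(N, boolean=True) for name in hands}
m = cp.Variable(N, boolean=True)

# m_i = a_i + b_i + c_i with binary m_i forces at most one contributor per card.
constraints = [m == sum(play.values())]
for name, hand in hands.items():
    avail = np.isin(values, hand).astype(int)
    constraints += [play[name] <= avail,      # can't play a card you don't hold
                    cp.sum(play[name]) == 3]  # exactly 3 contributions each

problem = cp.Problem(cp.Maximize(values @ m), constraints)
problem.solve()  # requires a MIP solver
for name in hands:
    print(name, 'plays', values[np.round(play[name].value).astype(bool)])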
Sort each hand, dropping duplicate values. Delete anything past the 10th-highest card of any hand (3 hands * 3 cards/hand, plus 1): nobody can contribute a card that low.
For accounting purposes, make a directory by card value, showing which hands hold each value. For instance, given players A, B, C and these hands
A [1, 1, 1, 6, 4, 12, 7, 11, 13, 13, 9, 2, 2]
B [13, 2, 3, 1, 5, 5, 8, 9, 11, 10, 5, 5, 9]
C [13, 12, 11, 10, 6, 7, 2, 4, 4, 12, 3, 10, 8]
We would sort and de-dup the hands. 2 is the 10th-highest card of hand C, so we drop all values 2 and below. Then build the directory:
A [13, 12, 11, 9, 7, 6]
B [13, 11, 10, 9, 8, 5, 3]
C [13, 12, 11, 10, 8, 7, 6, 4, 3]
Directory:
13 A B C
12 A C
11 A B C
10 B C
9 A B
8 B C
7 A B
6 A C
5 B
4 C
3 B C
Now you need to implement a backtracking algorithm to choose cards in some order, take the sum of that choice, and compare it with the best so far. I suggest that you iterate through the directory, choosing a hand from which to take the highest remaining card, backtracking when you run out of contributors entirely, or when you get 9 cards.
I recommend that you maintain a few parameters to allow you to prune the investigation, especially when you get into the lower values.
Compute the maximum possible value: the sum of the top 9 values in the directory. If you hit this value, stop immediately, as you've found an optimum solution.
Make a high starting target: cycle through the hands in sequence, taking the highest usable card remaining in each hand (see the sketch after this list). In this case, cycling A-B-C, we would have
13, 11, 12, 9, 10, 8, 7, 5, 6 => 81
// Note: because of the values I picked
// this happens to provide an optimum solution.
// It will do so for a lot of the bridge-hand problem space.
Keep count of how many cards have been contributed by each hand; when one has given its 3 cards, disqualify it in some way: have a check in the choice code, or delete it from the local copy of the directory.
As you walk down the choice list, prune the search any time the remaining cards are insufficient to reach the best-so-far total. For instance, if you have a total of 71 after 7 cards, and the highest remaining card is 5, stop: you can't get to 81 with 5+4.
Does that get you moving?
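A small Python sketch of the "high starting target" step (my own illustration; greedy_start is a made-up name, and it assumes every hand still holds an unused card whenever its turn comes):

def greedy_start(hands, n_play=3):
    # Cycle through the hands; each takes its highest card not yet used.
    used = set()
    picks = []
    for _ in range(n_play):
        for name, hand in hands.items():
            for card in sorted(set(hand), reverse=True):
                if card not in used:
                    used.add(card)
                    picks.append(card)
                    break
    return picks, sum(picks)

hands = {
    'A': [1, 1, 1, 6, 4, 12, 7, 11, 13, 13, 9, 2, 2],
    'B': [13, 2, 3, 1, 5, 5, 8, 9, 11, 10, 5, 5, 9],
    'C': [13, 12, 11, 10, 6, 7, 2, 4, 4, 12, 3, 10, 8],
}
print(greedy_start(hands))  # ([13, 11, 12, 9, 10, 8, 7, 5, 6], 81)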

Bellman-Ford algorithm proof of correctness

I'm trying to learn the Bellman-Ford algorithm, but I'm stuck on the proof of correctness.
I have used Wikipedia, but I simply can't understand the proof. I did not find anything helpful on YouTube either.
I hope one of you can explain it briefly. The page "Bellman-ford correctness can we do better" does not answer my question.
Thank you.
Let's look at the problem from the perspective of dynamic programming, first on a graph with no negative cycle.
We can visualize the memoization table of the dynamic programming as follows:
The columns represent nodes and the rows represent update steps (node 0 is the source node), and the arrows directed from a box in one step to a box in the next step are the min-updates (step 0 is the initialization).
We choose one of the shortest paths and illustrate why the computation is correct. Let's choose 0 -> 3 -> 2 -> 4 -> 5, the shortest path from 0 to 5 (we could choose any other one). We can prove correctness by induction. The base case is the source 0: obviously, the distance between 0 and itself should be 0, the shortest possible. For the inductive step, assume 0 -> 3 -> 2 is the shortest path between 0 and 2; we are going to prove that 0 -> 3 -> 2 -> 4 is the shortest path between 0 and 4 after the third iteration.
First, we prove that after the third iteration node 4 must be fixed/tightened. If node 4 were not fixed, there would have to be some path to 4 other than 0 -> 3 -> 2 -> 4 that is shorter, which contradicts our assumption that 0 -> 3 -> 2 -> 4 -> 5 is the shortest path between 0 and 5. So after the third iteration, 2 and 4 are connected.
Second, we prove that that relaxed distance is the shortest. It can be neither greater nor smaller, because 0 -> 3 -> 2 -> 4 is itself a shortest path to 4.
Now let's look at a graph with a negative cycle.
Here is its memoization table:
Let's prove that at the |V|-th iteration (here |V|, the number of vertices, is 6), the updates do not stop.
Assume, for contradiction, that the updates stopped even though there is a negative cycle. Consider the cycle 3 -> 2 -> 4 -> 5 -> 3.
dist(2) <= dist(3) + w(3, 2)
dist(4) <= dist(2) + w(2, 4)
dist(5) <= dist(4) + w(4, 5)
dist(3) <= dist(5) + w(5, 3)
We can obtain the following inequality from the above four inequalities by summing up the left-hand sides and the right-hand sides:
dist(2) + dist(4) + dist(5) + dist(3) <= dist(3) + dist(2) + dist(4) + dist(5) + w(3, 2) + w(2, 4) + w(4, 5) + w(5, 3)
We subtract the distances from both sides and obtain that:
w(3, 2) + w(2, 4) + w(4, 5) + w(5, 3) >= 0, which contradicts our claim that 3 -> 2 -> 4 -> 5 -> 3 is a negative cycle.
So we are certain that at the |V|-th step, and at every step after it, the updates never stop.
My code is here on Gist.
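Since the Gist isn't reproduced here, a minimal Bellman-Ford sketch in Python (my own, not the author's Gist) showing the relax-until-fixed loop and the negative-cycle check argued above:

def bellman_ford(n, edges, source):
    # n nodes labelled 0..n-1, edges as (u, v, w) triples.
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):               # |V| - 1 rounds of relaxation
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:                  # everything fixed: can stop early
            break
    # One more pass: if any edge still relaxes, a negative cycle is reachable.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError('negative cycle reachable from source')
    return dist

# The all-negative example from the first question:
print(bellman_ford(3, [(0, 1, -6), (1, 2, -1), (0, 2, -5)], 0))  # [0, -6, -7]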
References:
dynamic programming - bellman-ford algorithm
Lecture 14: Bellman-Ford Algorithm
