Find best path by arbitrary weight function? - algorithm

How can I find the best path by an arbitrary weight function? That means a function that says how "good" a path is, for example number of edge colors. The function will never score a path "better" than all of its subpaths.
I used Dijkstra's algorithm, but with the score function instead of the length determining which path is expanded next. I'm not sure whether this is the best solution, or whether there are cases where the best path will never be found.

The particular problem you're describing - finding a path between two nodes that minimizes the number of colors used - is NP-hard. This means that, assuming P ≠ NP, there is no polynomial-time algorithm for solving this problem, and in particular Dijkstra's algorithm won't work here.
Here's a reduction that shows this, which is based on this excellent answer by Paul Hankin for a related problem. We're going to reduce the hitting set problem to the problem of finding the least-colorful s-t path in a graph. In the hitting set problem, you're given a collection of sets S1, ..., Sn containing a total of m elements and a number k, then asked whether it's possible to pick k elements such that each set Si contains at least one of them.
The reduction works as follows. We're going to build a graph and color the nodes one of m+1 colors. The first color (we'll call it black) is a neutral color with no meaning. The remaining m colors will then correspond to the different elements from the sets.
We'll construct a chain of "gadgets," one per set Si, which correspond to choosing some element from Si. Here's how each gadget works:
Each gadget has a start node si, colored black, and an end node fi, also colored black.
For each element x ∈ Si, we add a new node xi given the color associated with element x. We then add an edge from si to xi and from xi to fi.
Now, imagine walking from si to fi. You have to take two steps to do this, visiting two black nodes (si and fi) and one node of a different color. Walking through that colored node corresponds to selecting x as one of the elements of your hitting set.
To finish things up, wire up all the gadgets in series by linking f1 to s2, f2 to s3, etc. Now, look at any path from s1 to fn. If you can find a path through the graph that uses at most k+1 colors, one of those colors will be black, and the other k colors correspond to a collection of k elements that includes at least one element from each of the sets Si. You've found your hitting set - great! On the flip side, imagine you have a hitting set of size k. Then walk from s1 to fn, and at each point where you have to pick a colored node, pick one corresponding to an item from the hitting set. Then you'll use at most k+1 colors: black plus the hitting set colors.
This graph contains 2n nodes for the si and fi nodes, plus one node for each element of each set, with a linear number of edges. It's therefore polynomially-sized with respect to the instance of hitting set, so this is a polynomial-time reduction from hitting set to your problem.
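The construction above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the node names `("s", i)`, `("f", i)`, `("x", i, x)` and both function names are hypothetical, elements must not be named "black", and `fewest_colors` is an exponential brute-force search that is only meant for checking tiny instances.

```python
BLACK = "black"

def build_gadget_graph(sets):
    """Build the chain of gadgets, one per set. Returns (color, edges)."""
    color, edges = {}, []
    for i, elems in enumerate(sets, start=1):
        s, f = ("s", i), ("f", i)
        color[s] = color[f] = BLACK
        for x in elems:
            node = ("x", i, x)
            color[node] = x              # one color per element
            edges += [(s, node), (node, f)]
        if i > 1:
            edges.append((("f", i - 1), s))  # wire gadgets in series
    return color, edges

def fewest_colors(color, edges, start, goal):
    """Brute force over all start -> goal paths (the graph is a DAG)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    best = None
    stack = [(start, frozenset([color[start]]))]
    while stack:
        u, used = stack.pop()
        if u == goal:
            best = len(used) if best is None else min(best, len(used))
            continue
        for v in adj.get(u, []):
            stack.append((v, used | {color[v]}))
    return best

# Hitting set instance: {1,2}, {2,3}, {3,4}. The smallest hitting set has size 2.
sets = [[1, 2], [2, 3], [3, 4]]
color, edges = build_gadget_graph(sets)
print(fewest_colors(color, edges, ("s", 1), ("f", len(sets))))  # black + 2 element colors
```

The least-colorful path uses k+1 colors exactly when the smallest hitting set has k elements, which is the correspondence the reduction relies on.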
Sorry for the (probably) negative result!
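To see concretely how a Dijkstra-style search can miss the best path, here is a small hand-made counterexample with edge colors. The graph, the function names, and the "finalize each node the first time it is popped" rule are my assumptions about how such a search would typically be written, not the asker's actual code.

```python
import heapq
from itertools import count

def greedy_color_dijkstra(edges, s, t):
    """Dijkstra-style search scored by number of distinct edge colors.
    Each node is finalized the first time it is popped."""
    adj = {}
    for u, v, c in edges:
        adj.setdefault(u, []).append((v, c))
    tie = count()  # tie-breaker so the heap never compares color sets
    heap = [(0, next(tie), s, frozenset())]
    done = set()
    while heap:
        score, _, u, colors = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            return score
        for v, c in adj.get(u, []):
            if v not in done:
                new_colors = colors | {c}
                heapq.heappush(heap, (len(new_colors), next(tie), v, new_colors))
    return None

def brute_force_min_colors(edges, s, t):
    """Try every simple path; exponential, for tiny graphs only."""
    adj = {}
    for u, v, c in edges:
        adj.setdefault(u, []).append((v, c))
    best = None
    stack = [(s, frozenset([s]), frozenset())]
    while stack:
        u, seen, colors = stack.pop()
        if u == t:
            best = len(colors) if best is None else min(best, len(colors))
            continue
        for v, c in adj.get(u, []):
            if v not in seen:
                stack.append((v, seen | {v}, colors | {c}))
    return best

EDGES = [("s", "b", "red"), ("b", "a", "red"),      # 1 color to reach a
         ("s", "c", "blue"), ("c", "a", "green"),   # 2 colors to reach a
         ("a", "d", "blue"), ("d", "t", "green")]   # reuses blue and green

print(greedy_color_dijkstra(EDGES, "s", "t"))   # 3
print(brute_force_min_colors(EDGES, "s", "t"))  # 2
```

The search commits to the all-red route into `a` because it looks better locally, but the blue/green route is the one whose colors get reused later. The score is monotone (a path never scores better than its subpaths), yet the best path to `t` does not extend the best path to `a`, which is the property Dijkstra's algorithm needs.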

Related

Finding equally-sized mutually exclusive complete subgraphs within a graph whose union is the entire graph

INPUT
Undirected graph G with n vertices and an integer k such that k divides n.
The set of all vertices will be denoted by V.
OUTPUT
A set S of sets of vertices such that:
S has k elements
Each element of S is a complete subgraph in G (all vertices in each element share an edge with each other in G)
All elements of S are mutually exclusive (the elements have no vertices in common with each other)
The union of all the elements of S is equal to V
All elements of S have cardinality n / k
BACKGROUND
I run a small play reading group and we like to read larger plays sometimes. I want to cast a large play for a small group in such a way that a single person won't be playing a set of characters that share scenes with each other. I realized that this problem could be formulated in graph theory, and I'm curious as to what a good solution looks like.
This problem is basically equivalent to graph coloring. Graph coloring gives us a graph and asks us to give each node a color such that no edge has identically colored endpoints. Here I assume that the nodes would be roles, the edges would be roles that appear together in at least one scene, and colors would be people playing the roles, and you want specifically a coloring that uses exactly k colors (for k people).
Graph coloring is NP-hard, but unless the graph is huge, a constraint programming solver (e.g., CP-SAT) should have an easy time with it, and additionally handle an optimization objective like (e.g.) maximizing the minimum number of lines that each person has.
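For a small cast, even a plain backtracking search may be enough. The sketch below is not CP-SAT; it is a hand-rolled search under my own assumptions: `conflicts` lists pairs of roles that share a scene, every person must get exactly `len(roles) // k` roles, and the function name `cast_roles` is hypothetical.

```python
def cast_roles(roles, conflicts, k):
    """Split roles into k equal groups with no conflicting pair in a group.
    Returns a list of k lists, or None if no such casting exists."""
    n = len(roles)
    if n % k:
        return None
    size = n // k
    clash = {r: set() for r in roles}
    for a, b in conflicts:
        clash[a].add(b)
        clash[b].add(a)
    groups = [[] for _ in range(k)]

    def place(i):
        if i == n:
            return True
        role = roles[i]
        for g in groups:
            if len(g) < size and not clash[role] & set(g):
                g.append(role)
                if place(i + 1):
                    return True
                g.pop()
            if not g:
                break  # remaining groups are empty too, so trying them is redundant
        return False

    return groups if place(0) else None

roles = ["A", "B", "C", "D"]
conflicts = [("A", "B"), ("C", "D"), ("A", "C")]
print(cast_roles(roles, conflicts, 2))  # [['A', 'D'], ['B', 'C']]
```

The worst case is still exponential, so for large plays or for extra objectives (such as balancing line counts) a real solver is the better tool.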

Max flow: how to force f units to flow, with minimal changes to capacity?

Let's say I have a graph and run a max-flow on it. I get some flow, f. However, I want f1 units to flow, where f1 > f. Of course, I need to increase some of the edge capacities. I want to make as small a total increase as possible to the capacities. Is there a clever algorithm to achieve this?
If it helps, I care for my application about bi-partite graphs with source (s) to left vertices (L) having some finite, integer capacities (c_l), left vertices L to right vertices R having some connectivity with infinite capacities and all right vertices, R connected to a sink vertex with finite integer capacities (c_r). Here, c_l and c_r sum to the same number. Also, there are no connections among the left vertices or among the right ones.
An example is provided in the image below. The blue numbers are the flow capacities and the pink numbers are the actual flows in the max-flow. Currently, 5 units are flowing but I want 9 units to flow.
In general, turn the flow instance into a min-cost flow instance: give every existing arc cost zero and add, parallel to each one, a new infinite-capacity arc of cost one. A minimum-cost flow of f1 units then tells you which capacities to raise and by how much.
For these particular instances, the best you're going to do is to repeatedly find an unsaturated arc of finite capacity and push flow along any path that includes it. Once everything's saturated just use any path.
This seems a little too easy to be what you want, so I'll mention that it's possible to formulate more sophisticated objectives and solve them using linear programming techniques.
The graph is undirected, and all the "middle" vertices have infinite capacity. That means we can unify all vertices connected by infinite capacity in L and R, making a very simple graph indeed.
For example, in the above graph, an equivalent graph would be:
s -8-> Vertex 1+2+4 -4-> t
s -1-> Vertex 3+5 -5-> t
So we end up with just a bunch of unique paths with no branching. We can unify the nodes with a simple "floodfill" or DFS type search on infinite-capacity edges. When we unify nodes, we add up their "left" and "right" capacities.
To maximize flow in this graph we:
First, if the left and right paths are not equal, increase the lower one until they are equal. This lets us convert an increase of cost X, into an increase in flow of X.
Once the left and right paths are equal for all nodes, we pick any path. Then, we increase both halves of the path with cost 2X, increasing the flow by X.
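The two steps above can be sketched as follows. The instance is made up to match the merged capacities in the example (8/4 and 1/5), since the original image is not available; the vertex names and the function name are hypothetical. Like the answer, the sketch assumes the infinite-capacity links can be traversed in either direction, and it ignores components that lack a left or a right vertex.

```python
def min_capacity_increase(left_cap, right_cap, links, target):
    """Return (current max flow, minimum total capacity increase to reach target)."""
    # Union-find: merge vertices joined by infinite-capacity links.
    parent = {v: v for v in list(left_cap) + list(right_cap)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for l, r in links:
        parent[find(l)] = find(r)

    left_sum, right_sum = {}, {}
    for v, c in left_cap.items():
        left_sum[find(v)] = left_sum.get(find(v), 0) + c
    for v, c in right_cap.items():
        right_sum[find(v)] = right_sum.get(find(v), 0) + c

    roots = set(left_sum) & set(right_sum)
    flow = sum(min(left_sum[r], right_sum[r]) for r in roots)
    # Raising the smaller side of a component buys flow at cost 1 per unit.
    slack = sum(abs(left_sum[r] - right_sum[r]) for r in roots)
    need = max(0, target - flow)
    if need and not roots:
        raise ValueError("no source-to-sink path exists to enlarge")
    cheap = min(need, slack)
    # Once every component is balanced, each extra unit costs 2.
    return flow, cheap + 2 * (need - cheap)

left = {"a": 5, "b": 3, "c": 1}
right = {"x": 4, "y": 5}
links = [("a", "x"), ("b", "x"), ("c", "y")]
print(min_capacity_increase(left, right, links, 9))   # (5, 4)
print(min_capacity_increase(left, right, links, 15))  # (5, 12)
```

The function only returns the cost; recording which component each unit was bought in would give the actual capacities to raise.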

Special case of coloring a weighted graph

The problem is: I have a graph G, where each vertex is labelled by some non-negative number (a weight), and I have to find the subset S of non-adjacent vertices (an independent set of G) that maximizes the sum of their labels (let's call it W(S), the weight of the subset S).
Graph coloring comes to mind, but in this case the problem is coloring the graph using only two colors, white for chosen vertices and black otherwise, so that the white vertices are pairwise non-adjacent while their total weight is maximized (or minimized if we make all labels negative).
Does this specific problem have a name? The closest thing I have found is cocoloring, but it doesn't apply to weighted graphs.
Have a look at independent sets (https://en.wikipedia.org/wiki/Independent_set_(graph_theory)). Your particular problem is the maximum weight independent set problem.
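For small graphs, the problem can be solved exactly with a simple branching search: either skip a vertex, or take it and drop its neighbors. The sketch below is a minimal illustration with hypothetical names; it takes exponential time in the worst case, which is expected since the problem is NP-hard in general.

```python
def max_weight_independent_set(weights, edges):
    """Return (best total weight, frozenset of chosen vertices)."""
    adj = {v: set() for v in weights}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def solve(candidates):
        if not candidates:
            return 0, frozenset()
        v = next(iter(candidates))
        rest = candidates - {v}
        skip = solve(rest)                   # leave v out
        w, chosen = solve(rest - adj[v])     # take v, so its neighbors are excluded
        take = (w + weights[v], chosen | {v})
        return max(skip, take, key=lambda result: result[0])

    return solve(frozenset(weights))

# Path a - b - c: taking both endpoints (3 + 3) beats taking the middle (5).
weights = {"a": 3, "b": 5, "c": 3}
print(max_weight_independent_set(weights, [("a", "b"), ("b", "c")]))
```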

Variations of Dijkstra's Algorithm for graphs with two weight properties

I'm trying to find a heuristic for a problem that is mapped to a directed graph with, say, non-negative edge weights. However, each edge is associated with two weight properties as opposed to only one (e.g. one is distance, and another shows how good the road's 4G LTE coverage is!). Is there any specific variation of Dijkstra, Bellman-Ford, or any other algorithm that pursues this objective? Of course, a naive workaround is manually deriving a single weight property as a combination of all of them, but this does not look good.
Can it be generalized to cases with multiple properties?
Say you want to optimize simultaneously two criteria: distance and attractiveness (and say path attractiveness is defined as the attractiveness of the most attractive edge, although you can think of different definitions). The following variation of Dijkstra can be shown to work, but I think it is mainly useful where one of the criteria takes a small number of values - say attractiveness is 1, ..., k for some small fixed k (smaller i is better).
The standard pseudocode for Dijkstra's algorithm uses a single priority queue. Instead use k priority queues. Priority queue i will correspond in Dijkstra's algorithm to the shortest path to a node v ∈ V with attractiveness i.
Start by initializing that each node is in each of the queues with distance ∞ (because, initially, the shortest path to v with attractiveness i is infinite).
In the main Dijkstra loop, where it says
while Q is not empty
change it to
while there is an i for which Q[i] is not empty
Q = Q[i] for the lowest such i
and continue from there.
Note that when you update, you pop from queue Q[i], and insert to Q[j] for j ≥ i.
It's possible to modify the proof of Dijkstra's relaxation property to show that this works.
Note that you will obtain up to k |V| results: for each node and each attractiveness level, you get the shortest distance to that node at that level.
Example
Taking an example from the comments:
So basically if a path has a total no-coverage miles of >10, then we go for another path.
Here, e.g., assuming the miles are integers (or can be rounded to integers), we could create 11 queues: queue i corresponds to the shortest distance with i no-coverage miles, except for 10, which corresponds to 10-or-higher no-coverage-miles.
At some point of the algorithm, say all queues are empty below queue 3. We pop queue 3, and update the vertex's neighbors: this might update, e.g., some node in queue 4, if the distance from the popped node to the other node is 1.
As the algorithm runs, it outputs mappings of (node, no-coverage-distance) → shortest distance. Here, you could decide that you discard all mappings for which the second item in the pair is 10.
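Here is a rough sketch of the no-coverage example with one queue per level. It is an illustration under assumptions, not a reference implementation: the graph format `{u: [(v, miles, no_cov_miles), ...]}` and the function names are hypothetical, no-coverage miles are non-negative integers, and instead of pre-filling the queues with ∞ it pushes labels lazily and skips stale entries.

```python
import heapq

def multi_queue_dijkstra(graph, source, cap):
    """Return {(node, level): shortest distance}, where level is the number of
    no-coverage miles on the path, capped at `cap` (cap means cap-or-more)."""
    best = {(source, 0): 0}
    queues = [[] for _ in range(cap + 1)]
    heapq.heappush(queues[0], (0, source))
    # Labels are only ever pushed to the same or a higher level, so scanning
    # levels in increasing order always serves the lowest non-empty queue.
    for level in range(cap + 1):
        queue = queues[level]
        while queue:
            dist, u = heapq.heappop(queue)
            if dist > best.get((u, level), float("inf")):
                continue  # stale entry
            for v, miles, no_cov in graph.get(u, []):
                nxt = min(cap, level + no_cov)
                cand = dist + miles
                if cand < best.get((v, nxt), float("inf")):
                    best[(v, nxt)] = cand
                    heapq.heappush(queues[nxt], (cand, v))
    return best

def shortest_within_budget(best, node, cap):
    """Shortest distance to node over all levels strictly below cap, or None."""
    options = [d for (n, lvl), d in best.items() if n == node and lvl < cap]
    return min(options) if options else None

graph = {
    "A": [("B", 5, 12), ("C", 4, 0)],  # direct road: short but 12 no-coverage miles
    "C": [("B", 4, 3)],
}
best = multi_queue_dijkstra(graph, "A", cap=10)
print(best[("B", 10)])                        # 5
print(shortest_within_budget(best, "B", 10))  # 8
```

Discarding the level-10 labels at the end is the "go for another path" rule from the comment: the direct road is shortest overall, but the detour through C is the shortest route that stays under the budget.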

Number of closed regions created by a path

Given a path P described by a list of positions in the xy-plane that are each connected by edges, compute the least number of edges that have to be removed from P such that P does not close off any regions in the xy-plane (i.e., it should be possible to go from any point to any other point). Every position will have integer coordinates, and each position will be one unit left, right, up, or down from the previous one.
For example, if P = {[0,0], [0,1], [1,1], [1,0], [0,0]}, then the path is a square starting and ending at (0,0). Any 1 of the 4 edges of the square could be removed, so the answer is 1.
Note that the same edge can be drawn twice. That is, if P = {[0,0], [0,1], [1,1], [1,0], [0,0], [0,1], [1,1], [1,0], [0,0]}, the answer would be 2, because now each side of the square has 2 edges, so at least 2 edges would have to be removed to "free" the square.
I've tried a naive approach where if any position is visited twice, there could be an enclosed region (not always, but my program relies on this assumption), so I add 1 to the minimum number of edges removed. In general, if a vertex is visited N times I add N-1 to the number of edges removed. However, if, for example, P = {[0,0], [0,1], [0,0]}, there is no enclosed region whereas my program would think there is. Another case where it breaks down: if P = {[0,0], [0,1], [1,1], [1,0], [0,0], [1,0]}, my program would output 2 (since (0,0) and (1,0) are each visited twice), whereas the correct answer is 1, since we can just remove any of the other three sides of the square.
It seems that there are two primary subtasks to solve this problem: first, given the path, figure out which positions are enclosed (i.e., figure out the regions that the path splits the graph into); second, use knowledge of the regions to identify which edges must be removed to prevent enclosures.
Any hints, pseudocode, or code would be appreciated.
Source: Princeton's advanced undergraduate class on algorithms.
Here are a few ideas that might help. I'm going to assume that you have n points.
You could first insert all of the edges in a set S so that duplicate edges are removed:
for (int i = 0; i < n - 1; i++)
    S.insert({min(p[i], p[i+1]), max(p[i], p[i+1])});  // store each edge with its endpoints in sorted order
Now iterate over the edges again and build a graph. Then find the longest simple path in this graph.
The resulting graph is bipartite (if a cycle exists it must have even length). This piece of information might help as well.
You could use a flood-fill algorithm to find the contiguous regions of the plane created by the path. One of these regions is infinite but it's easy to compute the perimeter with a scanline sweep, and that will limit the total size to be no worse than quadratic in the length of the path. If the path length is less than 1,000 then quadratic is acceptable. (Edit: I later realized that since it is only necessary to identify the regions adjacent to edges of the line, you can do this computation by sorting the segments and then applying a scanline sweep, resulting in O(n log n) time complexity.)
Every edge in the path is between two regions (or is irrelevant because the squares on either side are the same region). For the relevant edges, you can count repetitions and then find the minimum cost boundary between any pair of adjacent regions. All that is linear once you've identified the region id of each square.
Now you have a weighted graph. Construct a minimum spanning tree. That should be precisely the minimum collection of edges which need to be removed.
There may well be a cleverer solution. The flood-fill strikes me as brute-force and naive, but it's the best I can do in ten minutes.
Good luck.
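As a cross-check for the spanning-tree idea, here is a sketch that avoids the flood fill by working on the path's own graph instead of the regions. This is a swapped-in technique, not the method described above: by planar duality, the removed edges form a minimum spanning tree of the region graph exactly when the kept edges form a maximum spanning tree of the path graph, so the answer should be the total edge count minus the weight of a maximum spanning tree, where an edge's weight is how many times it is drawn. The function name is hypothetical.

```python
from collections import Counter

def min_edges_to_remove(path):
    """Fewest drawn edges to remove so the path encloses no region."""
    pts = [tuple(p) for p in path]
    # Multiplicity of each undirected unit edge.
    mult = Counter(tuple(sorted((a, b))) for a, b in zip(pts, pts[1:]))

    parent = {p: p for p in pts}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Kruskal, heaviest edges first: keep a maximum spanning tree.
    kept = 0
    for (a, b), w in sorted(mult.items(), key=lambda item: -item[1]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            kept += w
    return sum(mult.values()) - kept

square = [[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]
print(min_edges_to_remove(square))                         # 1
print(min_edges_to_remove(square + square[1:]))            # 2
print(min_edges_to_remove([[0, 0], [0, 1], [0, 0]]))       # 0
print(min_edges_to_remove(square + [[1, 0]]))              # 1
```

These are the four cases from the question, and the results match the answers stated there.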
