Minimizing maximum cost in an one-to-one assignment problem - algorithm

I want to compute a one-to-one assignment, from A to B where A and B have n elements. Each element of A will be assigned to one element in B. There is an associated cost between each element of A and B. The goal is to minimize the maximum cost of the assignment.
How to compute such an assignment using the cost matrix? I need to select an element from each row/column such that minimax strategy is achieved.
I was able to achieve this by listing all possible assignments and calculating their maximum costs and then selecting the one with the least maximum cost, but I wonder if there's a better approach.

The easiest is probably as a MIP (Mixed-Integer Programming) model:
min z
z >= c[i,j]*x[i,j] ∀ i,j
sum(i, x[i,j]) = 1 ∀ j
sum(j, x[i,j]) = 1 ∀ i
x[i,j] ∈ {0,1}
This can be solved with any MIP solver.

A brute force polynomial cost algorithm based on repeated application of max matching in bipartite graph:
Order the pairs by assignment cost. Take n cheapest pairs. This determines a bipartite graph with A and B as vertices (pairs are edges). Try to find a matching in this graph. If found, it determines the optimal one-to-one assignment.
If not found, add the next pair. Repeat until a matching is found.
I suspect this can be optimized by not running the matching algorithm from the start every time you add an edge but rather keeping the maximum matching and checking if it can be augmented with the newly added edge.

Related

Shortest Path in a Directed Acyclic Graph with two types of costs

I am given a directed acyclic graph G = (V,E), which can be assumed to be topologically ordered (if needed). The edges in G have two types of costs - a nominal cost w(e) and a spiked cost p(e).
The goal is to find the shortest path from a node s to a node t which minimizes the following cost:
sum_e (w(e)) + max_e (p(e)), where the sum and maximum are taken over all edges in the path.
Standard dynamic programming methods show that this problem is solvable in O(E^2) time. Is there a more efficient way to solve it? Ideally, an O(E*polylog(E,V)) algorithm would be nice.
---- EDIT -----
This is the O(E^2) solution I found using dynamic programming.
First, order all costs p(e) in an ascending order. This takes O(Elog(E)) time.
Second, define the state space consisting of states (x,i) where x is a node in the graph and i is in 1,2,...,|E|. It represents "We are in node x, and the highest edge weight p(e) we have seen so far is the i-th largest".
Let V(x,i) be the length of the shortest path (in the classical sense) from s to x, where the highest p(e) encountered was the i-th largest. It's easy to compute V(x,i) given V(y,j) for any predecessor y of x and any j in 1,...,|E| (there are two cases to consider - the edge y->x is has the j-th largest weight, or it does not).
At every state (x,i), this computation finds the minimum of about deg(x) values. Thus the complexity is O(|E| * sum_(x\in V) deg(x)) = O(|E|^2), as each node is associated to |E| different states.
I don't see any way to get the complexity you want. Here's an algorithm that I think would be practical in real life.
First, reduce the graph to only vertices and edges between s and t, and do a topological sort so that you can easily find shortest paths in O(E) time.
Let W(m) be the minimum sum(w(e)) cost of paths max(p(e)) <= m, and let P(m) be the smallest max(p(e)) among those shortest paths. The problem solution corresponds to W(m)+P(m) for some cost m. Note that we can find W(m) and P(m) simultaneously in O(E) time by finding a shortest W-cost path, using P-cost to break ties.
The relevant values for m are the p(e) costs that actually occur, so make a sorted list of those. Then use a Kruskal's algorithm variant to find the smallest m that connects s to t, and calculate P(infinity) to find the largest relevant m.
Now we have an interval [l,h] of m-values that might be the best. The best possible result in the interval is W(h)+P(l). Make a priority queue of intervals ordered by best possible result, and repeatedly remove the interval with the best possible result, and:
stop if the best possible result = an actual result W(l)+P(l) or W(h)+P(h)
stop if there are no p(e) costs between l and P(h)
stop if the difference between the best possible result and an actual result is within some acceptable tolerance; or
stop if you have exceeded some computation budget
otherwise, pick a p(e) cost t between l and P(h), find a shortest path to get W(t) and P(t), split the interval into [l,t] and [t,h], and put them back in the priority queue and repeat.
The worst case complexity to get an exact result is still O(E2), but there are many economies and a lot of flexibility in how to stop.
This is only a 2-approximation, not an approximation scheme, but perhaps it inspires someone to come up with a better answer.
Using binary search, find the minimum spiked cost θ* such that, letting C(θ) be the minimum nominal cost of an s-t path using edges with spiked cost ≤ θ, we have C(θ*) = θ*. Every solution has either nominal or spiked cost at least as large as θ*, hence θ* leads to a 2-approximate solution.
Each test in the binary search involves running Dijkstra on the subset with spiked cost ≤ θ, hence this algorithm takes time O(|E| log2 |E|), well, if you want to be technical about it and use Fibonacci heaps, O((|E| + |V| log |V|) log |E|).

Single Source Shortest Path with constraint lower bound of minimum cost

Problem description:
Given a graph G in adjacencyMatrix and adjacencyList, inside which there is a source vertex s and a destination vertex d. Find the shortest path from s to d, with a constraint. The constraint is that the shortest path cost c has a lower bound, i.e., the cost c must be greater than an assigned lower bound N but is the smallest in all the costs of possible paths that are greater or equal N.
I understand with this constraint conventional SSSP algorithm like Bellman ford cannot work correctly. How shall I find a most efficient algorithm for this problem?
I suppose you wanted a walk, since path cannot have cycles.
Unfortunately, the problem can be easily modeled as change-making problem, which is NPC too.
Change-making problem: Given N types of coins of c_i value each, is it possible that number X can be changed with those N coins?
Modelling: Assume all c_i's are even (double all c_i, and also X, if not). Create N + 2 vertices, which the i-th vertex represents the i-th coin for 1 <= i <= N. Also, the (N+1) and (N+2)-th vertices have edge to all coins with cost (c_i / 2). The problem is then equivalent to finding a shortest path of cost at least X, which is NPC. The reduction should be obvious, but if further explanations are needed I can edit my answer.

Optimal substructure for least number of perfect squares

Question: I know how the recursion works but I can't seem to understand the 'optimal substructure' for this problem which necessitates the use of dynamic programming.
Problem: Find least number of perfect squares that sum upto a given number.
Let's say we want to find the shortest path from U to V. If we have a node X in between then shortest path from U to V will be shortest path from U to X plus shortest path from X to V.
I am having difficulty understanding how the least squares problem follows the optimal substructure property.
To my understanding, the recurrence relation for sum of perfect squares behaved similarly to the recurrence relation for shortest paths in the following way. Let
f(n) := minimum number of perfect squares which sum up to exactly n
then a suitable recurrence can be stated as
f(n) = min{ f(n-i) + f(i) : 0 < i < n }
which means that all partitions of the original argument into two summands have to be taken into account. Intuitively, the 'split point' for the shortest path problem is a node, whereras in the perfect squares problem it is the decision how to partition into summands (which are then examined further).
You did not state the property correctly, the third paragraph should be rephrased this way:
Let's say we want to find the shortest path from U to V. First verify if U and V are linked directly, if not, compute for each node X in between U and V the shortest path from U to X plus shortest path from X to V, the smallest answers will be the shortest paths from U to V, possibly duplicated.
This is applicable for your problem because you can determine the set of nodes X that are between U and V. For U=0 and V=n, this set is all the numbers from 1 to n-1, because you are adding positive numbers.
For the solution, use an array to cache the smallest path from 0 to i for i going from 0 to n, for each new value, a linear search will yield the best solutions, for an overall time complexity of O(n2).
You can optimize the linear search by enumerating only the perfect squares less or equal to n. This list is much shorter that the whole list of numbers. Its length is actually sqrt(n), so the complexity for the overall search drops to O(n3/2).
The cache can be just a pair of integers: the length of the path and the value of an intermediary X that is on one of the shortest paths. This gives a space complexity of O(n).
The problem in question: Find least number of perfect squares that sum up to a given number. has been extensively studied for more than 17 centuries: Lagrange's four-square theorem, also known as Bachet's conjecture was already known to Diophantus in the third century. It states that every natural number can be represented as the sum of four integer squares. Analytical solutions exist for determine whether any given integer is the sum of 1, 2, 3 or 4 perfect squares.

Directed graph (topological sort)

Say there exists a directed graph, G(V, E) (V represents vertices and E represents edges), where each edge (x, y) is associated with a weight (x, y) where the weight is an integer between 1 and 10.
Assume s and tare some vertices in V.
I would like to compute the shortest path from s to t in time O(m + n), where m is the number of vertices and n is the number of edges.
Would I be on the right track in implementing topological sort to accomplish this? Or is there another technique that I am overlooking?
The algorithm you need to use for finding the minimal path from a given vertex to another in a weighted graph is Dijkstra's algorithm. Unfortunately its complexity is O(n*log(n) + m) which may be more than you try to accomplish.
However in your case the edges are special - their weights have only 10 valid values. Thus you can implement a special data structure(kind of a heap, but takes advantage of the small dataset for the wights) to have all operations constant.
One possible way to do that is to have 10 lists - one for each weight. Adding an edge in the data structure is simply append to a list. Finding the minimum element is iteration over the 10 lists to find the first one that is non-empty. This still is constant as no more than 10 iterations will be performed. Removing the minimum element is also pretty straight-forward - simple removal from a list.
Using Dijkstra's algorithm with some data structure of the same asymptotic complexity will be what you need.

Find sub-array of objects with maximum distance between elements

Let be an array of objects [a, b, c, d, ...] and a function distance(x, y) that gives a numeric value showing how 'different' are objects x and y.
I'm looking for an algorithm to find the subset of the array of length n that maximizes the minimum difference between that subset element.
Of course, I can't simply sort the array by the minimum of the differences with other elements and take the n highest entries, since removing an element can very well change the distances. For instance, if a=b, then removing a means the minimum distance of b with another element will change dramatically.
So far, the only solution I could find was wether to iteratively remove the element with the lowest minimum distance and re-calculate the distance at each iteration, until only n elements are left, or, vice-versa, to iteratively pick new elements, recalculate the distances, add the new pick or replace an existing one based on the distance minimums.
Does anybody know how I could get the same results without those iterations?
PS: here is an example, the matrix shows the 'distance' between each element...
a b c d
a - 1 3 2
b 1 - 4 2
c 3 4 - 5
d 2 2 5 -
If we'd keep only 2 elements, that would be c and d; if we'd keep 3, that would be a or b, c and d.
This problem is NP-hard, so no-one knows an efficient (polynomial time) algorithm for solving it.
Here's a quick sketch that it is NP-hard, by reduction from CLIQUE.
Suppose we have an instance of CLIQUE in the form of a graph G and a number n and we want to know whether there is a clique of size n in G. Construct a distance matrix d such that d(i, j) = 1 if vertices i and j are connected in G, or 0 if they are not. Then find a subset of the vertices of G of size n that maximizes the minimum distance between elements (your problem). If the minimum distance between vertices in this subset is 1, then G has a clique of size n; otherwise it does not.
As Gareth said this is an NP-hard problem, however there has been a lot of research into solving these kind of problems and as such better methods than brute force have been found. Unfortunately this is such a large area that you could spend forever looking at the possible implementations of a solutions.
However if you are interested in a heuristic way of solving this I would suggest looking into Ant Colony Optimization (ACO) which has proven fairly effective at finding optimum paths within graphs.

Resources