Find the set S which maximizes the minimum distance between points in S ∪ A - algorithm

I would like to find a set S of given cardinality k maximizing the minimum distance between each of its points and the other points of S ∪ A, for a given set A. Is there a simple algorithm to solve this max-min problem?
Given a universe X ⊆ R^d and A ⊆ X, find
argmax_{S ⊆ X, |S| = k} min_{s ∈ S, s' ∈ S∪A, s' ≠ s} distance(s, s')
Thanks !

For a fixed k, with X and A as input, brute forcing is obviously in P (as C(|X|, k) is polynomial in |X|).
If k is also part of the input, then it might depend on 'distance':
If 'distance' is arbitrary, then your problem is equivalent to finding a fixed-size clique in a graph (which is NP-complete):
NP-hardness:
Take an instance of the clique problem, that is, a graph (G, E) and an integer k.
Add to this graph a vertex 'a' connected to every other vertex, and let (G', E') be the modified graph.
((G', E'), k+1) is then an instance of the clique problem equivalent to the original one.
Create a map Phi from G' to R^d (you can map G' onto N anyway...) and define 'distance' such that distance(Phi(c), Phi(d)) = 1 if (c, d) ∈ E', and 0 otherwise.
X = Phi(G'), A = Phi({a}), k+1 gives you an instance of your problem.
You can notice that by construction s ≠ Phi(a) ⇔ distance(s, Phi(a)) = 1 = max distance(·,·), i.e. min_{s ∈ S, s' ∈ S∪A, s' ≠ s} distance(s, s') = min_{s ∈ S, s' ∈ S, s' ≠ s} distance(s, s') whenever |S| ≥ 2.
Solve this instance (X, A, k+1) of your problem: this gives you a set S of cardinality k+1 such that min{distance(s, s') : s, s' ∈ S, s ≠ s'} is maximal.
Check whether there are s, s' ∈ S, s ≠ s', with distance(s, s') = 0 (this can be done in O(k^2) time as |S| = k+1):
if that is the case, then there is no set of cardinality k+1 in which distance(s, s') = 1 for every pair, i.e. there is no subgraph of G' which is a (k+1)-clique;
if it is not the case, then map S back into G': by definition of 'distance' there is an edge between any two vertices of Phi^{-1}(S), so it is a (k+1)-clique.
This solves, via a polynomial reduction, a problem equivalent to the clique problem (which is NP-hard).
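For concreteness, a small sketch of that construction in Python (a sketch only: vertices are assumed to be integers, so Phi is just the identity, and the label chosen for the added vertex 'a' is my own):

def clique_to_maxmin_instance(vertices, edges, k):
    # Build the (X, A, k+1, distance) instance described above from a clique
    # instance (vertices, edges, k). The added vertex 'a' is adjacent to all.
    a = max(vertices) + 1
    X = list(vertices) + [a]
    A = [a]
    edge_set = {frozenset(e) for e in edges}
    def distance(c, d):
        if c == d:
            return 0
        if a in (c, d):
            return 1  # 'a' is connected to every other vertex
        return 1 if frozenset((c, d)) in edge_set else 0
    return X, A, k + 1, distance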
NP-easiness:
Let X, A, k be an instance of your problem.
For any subset S of X, min_{s ∈ S, s' ∈ S∪A, s' ≠ s} distance(s, s') can only take values in {distance(x, y) : x, y ∈ X}, which has polynomial cardinality.
To find a set that maximizes this distance, we just test every possible distance, in decreasing order, until one admits a valid set.
To test a distance d, we first reduce X to X', containing only the points at distance ≥ d from every point of A (other than the point itself).
Then we create a graph (X', E)¤ where (s, s') ∈ E iff distance(s, s') ≥ d.
Test this graph for a k-clique (that problem is in NP); by construction there is one iff its vertex set S satisfies min_{s ∈ S, s' ∈ S∪A, s' ≠ s} distance(s, s') ≥ d.
If 'distance' is Euclidean or has some other special property, the problem might be in P; I don't know, but I wouldn't hope for too much if I were you.
¤ I assumed that X (and thus X') was finite here
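To make the structure of this argument concrete, here is a tiny brute-force sketch in Python; the k-clique search is replaced by naive enumeration over subsets, so it only illustrates the idea on very small inputs (all names are mine):

from itertools import combinations

def solve_small(X, A, k, distance):
    # Try every candidate value d (all pairwise distances, in decreasing
    # order) and look for a size-k subset of X whose points are pairwise at
    # distance >= d and at distance >= d from every point of A.
    candidates = sorted({distance(x, y) for x in X for y in X if x != y},
                        reverse=True)
    for d in candidates:
        X_prime = [x for x in X
                   if all(distance(x, a) >= d for a in A if a != x)]
        for S in combinations(X_prime, k):
            if all(distance(s, t) >= d for s, t in combinations(S, 2)):
                return list(S), d
    return None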

This is probably NP-hard. Greedily choosing the point furthest away from previous choices is a 2-approximation. There might be a complicated approximation scheme for low d based on the scheme for Euclidean TSP by Arora and Mitchell. For d = 10, forget it.
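A minimal sketch of that greedy in Python, assuming X and A are finite collections of points and distance is an arbitrary metric (the function name is mine):

def greedy_dispersion(X, A, k, distance):
    # Farthest-point greedy: repeatedly add the point of X whose minimum
    # distance to everything chosen so far (including A) is largest.
    # This is the 2-approximation mentioned above, not an exact solver.
    chosen = []
    for _ in range(k):
        def min_dist(x):
            others = [p for p in chosen + list(A) if p != x]
            return min((distance(x, p) for p in others), default=float("inf"))
        best = max((x for x in X if x not in chosen), key=min_dist)
        chosen.append(best)
    return chosen

Including A among the "previous choices" makes even the very first pick respect the distance to A.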

The top two results for the search "sphere packing np complete" turned up references to a 1981 paper proving that optimal packing and covering in 2D is NP-complete. I do not have access to a research library, so I can't read the paper. But I expect that your problem can be rephrased as that one, in which case you have a proof that you have an NP-complete problem.

Related

Choosing k vertices from a binary tree's vertex set such that the sum of edge costs within this k-vertex subset is minimum

Given a binary tree T with a weight function on its edge set w : E -> Z (note that the weights can be negative too) and a positive integer k. For a subset T' of V(T), cost(T') is defined as the sum of the weights of the edges (u, v) such that u, v ∈ T'. Give an algorithm to find a subset T' of exactly k vertices that minimizes cost(T').
How to solve this problem using dynamic programming?
It is usually easier to solve dynamic programming problems top down. They are often more efficient if solved bottom up. But I'll just get you going on the easier version.
Fill out this recursive function:
def min_cost(root, count_included, include_root):
    # root is the root of the subtree
    # count_included is how many nodes in the subtree to include in T'
    # include_root is whether root will be in T'.
    #
    # It will return the minimum cost, or math.inf if no solution exists
    ...
And with that, your answer will be:
min(min_cost(root_of_tree, k, True), min_cost(root_of_tree, k, False))
This is slow, so memoize for the top down.
I will leave the bottom up as a more difficult exercise.
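For reference, one possible way to fill out the memoised top-down version; the Node class below (children plus the weights of the edges leading to them, stored on the parent) is just my assumption about how the tree is represented:

import math
from functools import lru_cache

class Node:
    def __init__(self, left=None, right=None, left_weight=0, right_weight=0):
        self.left, self.right = left, right
        self.left_weight, self.right_weight = left_weight, right_weight

@lru_cache(maxsize=None)
def min_cost(root, count_included, include_root):
    # Minimum cost of choosing count_included vertices in the subtree rooted
    # at root, where include_root fixes whether root itself is chosen.
    if root is None:
        return 0 if count_included == 0 and not include_root else math.inf
    remaining = count_included - (1 if include_root else 0)
    if remaining < 0:
        return math.inf
    best = math.inf
    for left_count in range(remaining + 1):
        right_count = remaining - left_count
        for include_left in (False, True):
            for include_right in (False, True):
                cost = (min_cost(root.left, left_count, include_left)
                        + min_cost(root.right, right_count, include_right))
                # An edge contributes only when both of its endpoints are chosen.
                if include_root and include_left:
                    cost += root.left_weight
                if include_root and include_right:
                    cost += root.right_weight
                best = min(best, cost)
    return best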

Optimum path in a graph to maximize a value

I'm trying to come up with a reasonable algorithm for this problem:
Let's say we have a bunch of locations. We know the distances between each pair of locations. Each location also has a point value. The goal is to maximize the sum of the points while travelling from a starting location to a destination location without exceeding a given amount of distance.
Here is a simple example:
Starting location: C , Destination: B, Given amount of distance: 45
Solution: C-A-B route with 9 points
I'm just curious if there is some kind of dynamic programming algorithm for this type of problem. What would be the best, or rather the easiest, approach to this problem?
Any help is greatly appreciated.
Edit: You are not allowed to visit the same location many times.
EDIT: Under the newly added restriction that every node can be visited only once, the problem is most definitely NP-hard, by reduction from Hamiltonian path: for a general undirected, unweighted graph, set all edge weights to zero and every vertex's point value to 1. Then the maximum reachable score is n iff there is a Hamiltonian path in the original graph.
So it might be a good idea to look into integer linear programming solvers for instance families that are not constructed specifically to be hard.
The solution below assumes that a vertex can be visited more than once and makes use of the fact that node weights are bounded by a constant.
Let p(x) be the point value for vertex x and w(x,y) be the distance weight of the edge {x,y} or w(x,y) = ∞ if x and y are not adjacent.
If we are allowed to visit a vertex multiple times and if we can assume that p(x) <= C for some constant C, we might get away with the following recurrence: Let f(x,y,P) be the minimum distance we need to get from x to y while collecting P points. We have
f(x,y,P) = ∞ for all P < 0
f(x,x,p(x)) = 0 for all x
f(x,y,P) = MIN(z, w(x, z) + f(z, y, P - p(x)))
We can compute f using dynamic programming. Now we just need to find the largest P such that
f(start, end, P) <= distance upper bound
This P is the solution.
The complexity of this algorithm with a naive implementation is O(n^4 * C). If the graph is sparse, we can get O(n^2 * m * C) by using adjacency lists for the MIN aggregation.
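A minimal Python sketch of this idea, phrased as Dijkstra over states (vertex, points collected so far) rather than as the recurrence directly; revisits re-collect a vertex's points, as in the answer, and the cap on collected points (n · max p, matching the complexity analysis above) is my own assumption:

import heapq
import math

def best_score(n, points, adj, start, end, dist_limit):
    # adj[x] is a list of (neighbour, edge_weight) pairs; points[x] >= 0.
    # Returns the largest score reachable on a start-to-end walk whose
    # total distance stays within dist_limit, or -1 if none exists.
    cap = n * max(points)                       # assumed bound on useful scores
    dist = {(start, points[start]): 0}          # shortest distance to each state
    pq = [(0, start, points[start])]
    best = -1
    while pq:
        d, x, P = heapq.heappop(pq)
        if d > dist.get((x, P), math.inf):
            continue                            # stale heap entry
        if x == end and d <= dist_limit:
            best = max(best, P)
        for z, w in adj[x]:
            nP, nd = P + points[z], d + w
            if nP <= cap and nd <= dist_limit and nd < dist.get((z, nP), math.inf):
                dist[(z, nP)] = nd
                heapq.heappush(pq, (nd, z, nP))
    return best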

How to generate matrices which satisfy the triangle inequality?

Let's consider a square matrix E (n is the dimension of the matrix E and is fixed, for example n = 4 or n = 5). The matrix entries satisfy the following conditions: each entry is a nonnegative integer no greater than n, and the entries satisfy the triangle inequality.
The task is to generate all such matrices E. My question is how to do that. Is there any common approach or algorithm? Is that even possible? What should I start with?
Naive solution
A naive solution to consider is to generate every possible n-by-n matrix E where each component is a nonnegative integer no greater than n, then take from those only the matrices that satisfy the additional constraints. What would be the complexity of that?
Each component can take on n + 1 values, and there are n^2 components, so there are O((n+1)^(n^2)) candidate matrices. That has an insanely high growth rate.
Link: WolframAlpha analysis of (n+1)^(n^2)
I think it's safe to say that this is not a feasible approach.
Better solution
A better solution follows. It involves a lot of math.
Let S be the set of all matrices E that satisfy your requirements. Let N = {1, 2, ..., n}.
Definitions:
Let a metric on N have the usual definition, except with the requirement of symmetry omitted.
Let I and J partition the set N. Let D(I,J) be the n x n matrix that has D_ij = 1 when i is in I and j is in J, and D_ij = 0 otherwise.
Let A and B be in S. Then A is adjacent to B if and only if there exist I and J partitioning N such that A + D(I,J) = B.
We say A and B are adjacent if and only if A is adjacent to B or B is adjacent to A.
Two matrices A and B in S are path-connected if and only if there exists a sequence of adjacent elements of S between them.
Let the function M(E) denote the sum of the elements of matrix E.
Lemma 1:
E = D(I,J) is a metric on N.
Proof:
This is a trivial statement except for the case of an edge going from I to J. Let i be in I and j be in J. Then E_ij = 1 by definition of D(I,J). Let k be in N. If k is in I, then E_ik = 0 and E_kj = 1, so E_ik + E_kj >= E_ij. If k is in J, then E_ik = 1 and E_kj = 0, so E_ik + E_kj >= E_ij.
Lemma 2:
Let E be in S such that E != zeros(n,n). Then there exist I and J partitioning N such that E' = E - D(I,J) is in S with M(E') < M(E).
Proof:
Let (i,j) be such that E_ij > 0. Let I be the subset of N that can be reached from i by a directed path of cost 0. I cannot be empty, because i is in I. I cannot be N, because j is not in I. This is because E satisfies the triangle inequality and E_ij > 0.
Let J = N - I. Then I and J are both nonempty and partition N. By the definition of I, there does not exist any (x,y) such that E_xy = 0 and x is in I and y is in J. Therefore E_xy >= 1 for all x in I and y in J.
Thus E' = E - D(I,J) >= 0. That M(E') < M(E) is obvious, because all we have done is subtract from elements of E to get E'. Now, since E is a metric on N and D(I,J) is a metric on N (by Lemma 1) and E >= D(I,J), we have E' is a metric on N. Therefore E' is in S.
Theorem:
Let E be in S. Then E and zeros(n,n) are path-connected.
Proof (by induction):
If E = zeros(n,n), then the statement is trivial.
Suppose E != zeros(n,n). Let M(E) be the sum of the values in E. Then, by induction, we can assume that the statement is true for any matrix E' having M(E') < M(E).
Since E != zeros(n,n), by Lemma 2 we have some E' in S such that M(E') < M(E). Then by the inductive hypothesis E' is path-connected to zeros(n,n). Therefore E is path-connected to zeros(n,n).
Corollary:
The set S is path-connected.
Proof:
Let A and B be in S. By the Theorem, A and B are both path-connected to zeros(n,n). Therefore A is path-connected to B.
Algorithm
The Corollary tells us that everything in S is path-connected. So an effective way to discover all of the elements of S is to perform a breadth-first search over the graph defined by the following.
The elements of S are the nodes of the graph
Nodes of the graph are connected by an edge if and only if they are adjacent
Given a node E, you can find all of the (potentially) unvisited neighbors of E by simply enumerating all of the possible matrices D(I,J) (of which there are 2^n - 2) and generating E' = E + D(I,J) for each. Enumerating the D(I,J) should be relatively straightforward (there is one for every possible subset I of N, except for the empty set and N itself).
Note that, in the preceding paragraph, E and D(I,J) are both metrics on N. So when you generate E' = E + D(I,J), you don't have to check that it satisfies the triangle inequality - E' is the sum of two metrics, so it is a metric. To check that E' is in S, all you have to do is verify that the maximum element in E' does not exceed n.
You can start the breadth-first search from any element of S and be guaranteed that you won't miss any of S. So you can start the search with zeros(n,n).
Be aware that the cardinality of the set S grows extremely fast as n increases, so computing the entire set S will only be tractable for small n.
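A minimal Python sketch of this breadth-first search (matrices are stored as tuples of tuples so they can be put in a set; the names are mine):

from itertools import combinations
from collections import deque

def generate_all(n):
    idx = range(n)
    # All D(I, J) for proper, nonempty subsets I of {0, ..., n-1}.
    d_list = []
    for r in range(1, n):
        for I in combinations(idx, r):
            I_set = set(I)
            D = tuple(tuple(1 if i in I_set and j not in I_set else 0
                            for j in idx) for i in idx)
            d_list.append(D)
    zero = tuple(tuple(0 for _ in idx) for _ in idx)
    seen = {zero}
    queue = deque([zero])
    while queue:
        E = queue.popleft()
        for D in d_list:
            E2 = tuple(tuple(E[i][j] + D[i][j] for j in idx) for i in idx)
            # E2 is automatically a metric (it is the sum of two metrics),
            # so only the bound on the entries needs to be checked.
            if max(map(max, E2)) <= n and E2 not in seen:
                seen.add(E2)
                queue.append(E2)
    return seen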

Path finding algorithm on graph considering both nodes and edges

I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm for general graphs, your problem is NP-complete. Assume the path length threshold is n-1 and each vertex has value 1: if you can solve your problem, you can decide whether the given graph has a Hamiltonian path. In fact, if your maximum-value path has value n, then you have a Hamiltonian path. I think you can use something like the Held-Karp relaxation for finding a good solution.
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
    d[i][i][0] = value[i]
for e:(u, v) in edges:
    d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T:
    for k in nodes:
        for t' in 1 .. t-1:
            for i in nodes:
                for j in nodes:
                    d[i][j][t] = max(d[i][j][t],
                                     d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
Integer program (this may be a good idea or maybe not):
For each vertex v, let x_v be 1 if vertex v is visited and 0 otherwise. For each arc a, let y_a be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize ∑_v value(v) · x_v .
The constraints are
∑_a length(a) · y_a ≤ threshold, where length(a) is the distance weight of arc a
∀v: ∑_{a with head v} y_a - ∑_{a with tail v} y_a = -1 if v = s, 1 if v = t, 0 otherwise (conserve flow)
∀v ≠ s: x_v ≤ ∑_{a with head v} y_a (must enter a vertex to visit it)
∀v: x_v ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t} and every cut S that separates vertex v from {s, t}: x_v ≤ ∑_{a : tail(a) ∉ S ∧ head(a) ∈ S} y_a (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints are exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of a node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest path problem.

find the minimum size dominating set for a tree using greedy algorithm

Dominating Set (DS) := given an undirected graph G = (V, E), a set of vertices S ⊆ V is a dominating set if for every vertex v in V there is a vertex in S that is adjacent to v. The entire vertex set V is a trivial dominating set in any graph.
Find minimum size dominating set for a tree.
I'll attempt to prove this in a more formal way.
OUTLINE
To prove your greedy algorithm is correct, you need to prove two things:
First, that your greedy choice is valid and can always be used in the formation of an optimal solution, and
second, that your problem has an optimal substructure property, that is, you can form an optimal solution from optimal solutions to subproblems of your own problem.
Greedy Choice: In your tree T = (V, E), find a vertex v in the tree with the highest number of adjacent leaves. Add it to your dominant set.
Optimal Substructure
T' = (V', E') such that:
V' = V \ ({a : a ∈ V, a is adjacent to v, and a's degree ≤ 2} ∪ {v})
E' = E - any edge involving any of the removed vertices
In other words
Look for a vertex v with the highest number of adjacent leaves, remove any of its adjacent vertices with degree less than or equal to 2, then remove v itself and add it to your dominant set. Repeat this until no vertices are left.
PROOF
Greedy choice proof
For any leaf l, either l itself or its neighbour must be in the dominant set. In our case, the vertex v we would choose is exactly such a neighbour.
Let A = {v1, v2, ..., vk} be a minimum dominant set of T. If A already has v as a member, we are done. If it does not, we consider two situations:
v has some neighbouring leaf l. Then l must be part of the dominant set, otherwise our set does not dominate the entire tree. We can thus form A' = (A \ {l}) ∪ {v} and still have a dominant set. Since |A'| = |A|, A' is still optimal.
v does not have any neighbouring leaf l. Then, because v was chosen to have the highest number of adjacent leaves, no vertex in T has an adjacent leaf; but every tree with at least two vertices has a leaf, so T is not a tree. Contradiction.
Thus, we will always be able to form an optimal solution with our greedy choice.
Optimal Substructure proof
Suppose that A is a minimum dominant set for T = (V, E), but that A' = A \ {v} is not a minimum dominant set for T' as defined above.
Make a minimum dominant set for T', and call it B. By our supposition, |B| < |A'|. It can be shown that B' = B ∪ {v} is a dominating set for T. Then, since |A'| = |A| - 1 and |B'| = |B| + 1, we get |B'| < |A|. This is a contradiction, since we assumed that A is a minimum dominant set. Thus it must be that A' is also a minimum dominant set of T'.
Proving B' = B ∪ {v} is a dominating set for T:
v may have had adjacent vertices that are not in T'. We will show that any vertices that were not kept in T' will be dominated by vertices in B' (this means that we picked our set optimally): Let y be some vertex adjacent to v and not in T'. By definition of T', y can only have degree 1 or 2. Now, y is dominated by v. If y is a leaf, then we are done. However, if y is of degree 2, then y is connected to another node which is necessarily in the dominant set B. This is because, when we removed v to make T', the degree of y became 1, meaning that y or its parent was necessarily added to the dominant set. Hence, B' is a dominant set for T.
1- Always start from the leaves.
2- Add their parent to the DS and cut the children.
3- Mark the parents of selected parents as already dominated.
4- After completing the process, check whether the marked nodes have any children that are not dominated, and add those to the DS.
Good luck
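For completeness, a minimal Python sketch of this leaf-up greedy, assuming the tree is given as an edge list on vertices 0..n-1 and using the usual convention that a chosen vertex dominates itself (the names are mine):

from collections import defaultdict

def min_dominating_set(edges, root=0):
    # Process vertices so that children come before their parent; whenever a
    # vertex is still undominated, put its parent (or the vertex itself, for
    # the root) into the dominating set.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = {root: None}
    order, stack = [], [root]
    while stack:                       # iterative DFS giving a preorder
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)
    dominated, ds = set(), set()
    for u in reversed(order):          # children are handled before parents
        if u not in dominated:
            p = parent[u] if parent[u] is not None else u
            ds.add(p)
            dominated.add(p)
            dominated.update(adj[p])   # p dominates itself and its neighbours
    return ds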
