Connecting a Set of Vertices into an optimally weighted graph - algorithm

This is essentially the problem of connecting n destinations with the minimal amount of road possible.
The input is a set of vertices (a, b, ..., n).
The weight of an edge between two vertices is easily calculated (for example, the Euclidean distance between the two vertices).
I would like an algorithm that, given a set of vertices in Euclidean space, returns a set of edges that constitute a connected graph and whose total edge weight is as small as possible.
In graph language, this is the Minimum Spanning Tree of a Connected Graph.
With brute force I would:
1. Define all possible edges between all vertices - say you have n vertices, then the complete graph has n(n-1)/2 edges.
2. Note that each possible edge can be on or off (2 states).
3. Go through all possible edge on/off combinations: 2^(n(n-1)/2) of them!
4. Ignore all combinations that would not connect the graph.
5. From the remaining combinations, find the one whose sum of edge weights is the smallest of all.
I understand this is an NP-hard problem. However, realistically for my application, I will have a maximum of 11 vertices. I would like to be able to solve this on a typical modern smartphone, or at the very least on a small server.
As a second variation, I would like to obtain the same goal, with the restriction that each vertex is connected to at most two other vertices. Essentially I would obtain a single trace, starting from any point and finishing at any other point, as long as the graph is connected. There is no need to go back to where you started. In graph language, this is the Open Euclidean Traveling Salesman Problem.
Some pseudocode algorithms would be very helpful.

Ok, for the first problem you have to build a Minimum Spanning Tree. There are several algorithms for doing so, notably Prim's and Kruskal's. But also take a look at the treatment for complete graphs in the first link, since that is your case.
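For a complete graph this small, the O(n^2) "dense" variant of Prim's algorithm is the simplest choice. A minimal sketch in C++ (names are illustrative; the weight matrix w is assumed to be filled in from your vertex coordinates):

#include <vector>
#include <utility>
using namespace std;

// Dense Prim's algorithm: returns the MST edges of a complete graph
// whose pairwise weights are given in w (with w[i][j] == w[j][i]).
vector<pair<int,int>> primMST(const vector<vector<double>>& w)
{
    int n = w.size();
    vector<bool> inTree(n, false);
    vector<double> best(n, 1e18);  // cheapest edge from the tree to each vertex
    vector<int> parent(n, -1);
    vector<pair<int,int>> edges;

    best[0] = 0;
    for (int it = 0; it < n; ++it) {
        int v = -1;
        for (int u = 0; u < n; ++u)            // pick the cheapest fringe vertex
            if (!inTree[u] && (v == -1 || best[u] < best[v])) v = u;
        inTree[v] = true;
        if (parent[v] != -1) edges.push_back({parent[v], v});
        for (int u = 0; u < n; ++u)            // relax edges out of v
            if (!inTree[u] && w[v][u] < best[u]) { best[u] = w[v][u]; parent[u] = v; }
    }
    return edges;
}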
For the second problem, it becomes a little more complicated: it is an Open Traveling Salesman Problem (oTSP). When reading the previous link, focus on the Euclidean and asymmetric variants.
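With at most 11 vertices, the exponential Held-Karp dynamic program is perfectly feasible even on a phone (2^11 * 11 states). A hedged sketch for the open variant, where dp[S][v] is the cheapest path that visits exactly the vertex set S and ends at v, and every singleton is a valid starting point:

#include <vector>
#include <algorithm>
using namespace std;

// Held-Karp for the open TSP: cheapest path visiting every vertex once,
// starting and ending anywhere. w is the pairwise weight matrix.
double openTSP(const vector<vector<double>>& w)
{
    int n = w.size();
    const double INF = 1e18;
    vector<vector<double>> dp(1 << n, vector<double>(n, INF));
    for (int v = 0; v < n; ++v) dp[1 << v][v] = 0;   // the path may start anywhere

    for (int S = 1; S < (1 << n); ++S)
        for (int v = 0; v < n; ++v) {
            if (!(S & (1 << v)) || dp[S][v] == INF) continue;
            for (int u = 0; u < n; ++u)              // extend the path to u
                if (!(S & (1 << u)))
                    dp[S | (1 << u)][u] = min(dp[S | (1 << u)][u], dp[S][v] + w[v][u]);
        }

    double best = INF;
    for (int v = 0; v < n; ++v) best = min(best, dp[(1 << n) - 1][v]);
    return best;
}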

Maybe you could try a greedy algorithm:
1. Create a list sortedList that stores each pair of nodes i and j, sorted by the weight w(i,j).
2. Create a HashSet connectedNodes that is empty at the beginning.
3. while (connectedNodes.size() < n)
       element := first element of sortedList
       if (connectedNodes.isEmpty())
           connectedNodes.add(element.nodeI)
           connectedNodes.add(element.nodeJ)
           delete element from sortedList
       else
           for (element in sortedList)  // start again with the first
               if (connectedNodes.contains(element.nodeI) || connectedNodes.contains(element.nodeJ))
                   if (!(connectedNodes.contains(element.nodeI) && connectedNodes.contains(element.nodeJ)))
                       // so it does not already contain both nodes
                       connectedNodes.add(element.nodeI)
                       connectedNodes.add(element.nodeJ)
                       delete element from sortedList
                       break
               else
                   continue
Let me explain step 3 a little:
You keep adding nodes until all n nodes are connected to one another. The graph is guaranteed to be connected, because a node is only added if it has an edge to a node that is already in connectedNodes.
This algorithm is greedy: at each step it takes the cheapest edge that extends the connected set (sortedList is sorted by edge weight). In fact, always taking the cheapest edge with exactly one endpoint in the connected set is precisely Prim's algorithm, so the tree it builds is actually an optimal MST rather than just an approximation.
You don't get duplicates in connectedNodes, because it is a HashSet, which also keeps the runtime fast.
All in all, the runtime is O(n^2 log n) for sorting the O(n^2) edges at the beginning, and the loop below it is around O(n^3), because in the worst case each step runs through the whole list of size n^2, and there are n steps since one node is added per step.
More likely, though, you find a usable element much faster than O(n^2); in most cases it takes around O(n).

You can solve the traveling salesman problem and the Hamiltonian path problem with the OptiMap TSP solver from gebweb, or with a linear program solver. But the first question seems to ask for a minimum spanning tree; maybe the question tag is wrong?

For the first problem, there is an O(n^2 * 2^n) time algorithm. Basically, you can use dynamic programming to reduce the search space. Let's say the set of all vertices is V. The state space consists of all subsets of V, and the objective function f(S) is the minimum total weight of edges needed to connect the vertices in S. For each state S, you may enumerate over all edges (u, v) where u is in S and v is in V - S, and use f(S) + weight(u, v) to update f(S + {v}). After all states have been processed, the optimal answer is f(V).
Below is sample code illustrating the idea; note that it is implemented in a backward fashion (it removes a leaf j from the state instead of adding one):
#include <cstdio>
#include <algorithm>

const int INF = 1000000000;
const int n = 11;
int weight[n][n];   // pairwise edge weights, filled in beforehand
int f[1 << n];

int main()
{
    for (int state = 1; state < (1 << n); ++state)
    {
        if ((state & (state - 1)) == 0)
        {
            f[state] = 0;   // base case: a single vertex needs no edges
            continue;
        }
        int res = INF;
        // Treat j as a leaf of the tree: remove it and attach it to some i.
        for (int i = 0; i < n; ++i)
        {
            if ((state & (1 << i)) == 0) continue;
            for (int j = 0; j < n; ++j)
            {
                if (j == i || (state & (1 << j)) == 0) continue;
                res = std::min(res, f[state - (1 << j)] + weight[i][j]);
            }
        }
        f[state] = res;
    }
    printf("%d\n", f[(1 << n) - 1]);
    return 0;
}
For the second problem, sorry I don't quite understand it. Maybe you should provide some examples?


Count triangles (cycles of length 3) in a graph

In an undirected graph with V vertices and E edges, how would you count the number of triangles in O(|V||E|)? I see the algorithm here, but I'm not exactly sure how it would be implemented to achieve that complexity. Here's the code presented in that post:
for each edge (u, v):
    for each vertex w:
        if (v, w) is an edge and (w, u) is an edge:
            return true
return false
Would you use an adjacency list representation of the graph to traverse all edges in the outer loop and then an adjacency matrix to check for the existence of the 2 edges in the inner loop?
Also, I saw another solution presented as O(|V||E|) which involves performing a depth-first search on the graph and, when you encounter a backedge (u,v) from the vertex u you're visiting, checking whether the grandparent of the vertex u is vertex v. If it is, then you have found a triangle. Is this algorithm correct? If so, wouldn't this be O(|V|+|E|)? In the post I linked to there is a counterexample for the breadth-first search solution offered up, but based on the examples I came up with, it seems like the depth-first search method I outlined above works.
Firstly, note that the algorithm does not so much count the number of triangles as report whether one exists at all.
For the first algorithm, the analysis becomes simple if we assume that we can look up whether (a, b) is an edge in constant time. (Since we loop over all vertices for all edges and do only constant-time work for each, we get O(|V||E| * 1).) Constant-time membership testing can be done using, for example, a hash table/set. We could also, as you said, use the adjacency matrix, which we can build beforehand by looping over all the edges without changing our total complexity.
An adjacency-list representation could be used for looping over the edges, but traversing it is O(|V| + |E|), giving us the total complexity O(|V||V| + |V||E|), which may be more than we want. If that is the case, we should instead loop over the lists once beforehand and add all the edges to a plain collection (like a list).
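As a concrete version of the above, here is a sketch, assuming the edge list and a boolean adjacency matrix have already been built (names are illustrative):

#include <vector>
#include <utility>
using namespace std;

// O(|V||E|) triangle-existence test: for every edge (u, v), scan all
// vertices w and check the two remaining edges in O(1) via the matrix.
bool hasTriangle(const vector<pair<int,int>>& edges,
                 const vector<vector<bool>>& adj)
{
    int n = adj.size();
    for (auto [u, v] : edges)
        for (int w = 0; w < n; ++w)
            if (adj[v][w] && adj[w][u])
                return true;
    return false;
}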
For your proposed DFS algorithm, the problem is that we cannot be sure to encounter a certain edge as a backedge at the correct moment, as is illustrated by the following counterexample:
A -- B --- C -- D
      \   /     |
       \ /      |
        E ----- F
Here if we go A-B-C-E and then find the backedge E-B, we correctly find the triangle; but if we instead go A-B-C-D-F-E, the backedges E-B and E-C no longer satisfy our condition.
This is a naive approach to count the number of triangles.
It expects the input in the form of an adjacency matrix.
public int countTricycles(int[][] adj) {
    int n = adj.length;
    int count = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (adj[i][j] != 0) {
                for (int k = 0; k < n; k++) {
                    if (k != i && adj[j][k] != 0 && adj[i][k] != 0) {
                        count++;
                    }
                }
            }
        }
    }
    // Each triangle is counted once per ordered (i, j) pair, i.e. 6 times.
    return count / 6;
}
The complexity would be O(n^3).

Disconnect all vertices in a graph - Algorithm

I am looking for an algorithm that finds the minimal subset of vertices such that removing this subset (and the edges connecting these vertices) from the graph leaves all other vertices unconnected (i.e. the graph won't have any edges).
Is there such an algorithm?
If not: could you recommend some kind of heuristic to choose the vertices?
I have a basic knowledge of graph theory, so please excuse any incorrectness.
IIUC, this is the classic Minimum Vertex Cover problem, which is, unfortunately, NP-complete.
Fortunately, a very simple algorithm is essentially as good as it gets in this case.
Greedily taking both endpoints of the edges of a maximal matching gives a 2-approximation for vertex cover, which in theory, under the Unique Games Conjecture, is the best possible (a sketch follows the program below). In practice, solving a formulation of vertex cover as an integer program will most likely yield much better results. The program is
min sum_{v in V} x(v)
s.t.
forall {u, v} in E, x(u) + x(v) >= 1
forall v in V, x(v) in {0, 1}.
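The matching-based approximation mentioned above is only a few lines; a sketch, assuming the graph is given as an edge list:

#include <vector>
#include <utility>
#include <set>
using namespace std;

// Matching-based 2-approximation for vertex cover: greedily take any
// uncovered edge and put BOTH of its endpoints into the cover.
set<int> approxVertexCover(const vector<pair<int,int>>& edges)
{
    set<int> cover;
    for (auto [u, v] : edges)
        if (!cover.count(u) && !cover.count(v)) {
            cover.insert(u);
            cover.insert(v);
        }
    return cover;
}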
Try this way:
1. Define a variable to count the number of removed vertices, starting at 0.
2. Create a max-heap of vertices, sorted by the length of each vertex's adjacency list.
3. Remove all edges from the first vertex of the heap (the one with the largest number of edges), remove it from the heap, and add 1 to the count.
4. Re-order the heap now that the edge counts of the vertices have changed, repeating the previous step until the adjacency list of the first vertex is empty.
Heap Q
int count = 0
while(1) {
    Q = Create_Heap(G)
    Vertex first = Q.pop()
    if (first.adjacents.size() == 0) {
        break
    }
    for (Vertex v : first.adjacents) {
        RemoveEdge(first, v)
        RemoveEdge(v, first) /* depends on the implementation */
    }
    count = count + 1
}
return count

Finding cheapest path on a graph, cost determined by max-weight of used nodes

I have a graph G with a starting node S and an ending node E. What's special about this graph is that instead of the edges having costs, here it's the nodes that have a cost. I want to find the path (a set of nodes, W) between S and E such that max(W) is minimized. (In reality, I am not interested in W, just max(W).) Equivalently: if I remove all nodes with cost larger than k, what is the smallest k such that S and E are still connected?
I have one idea, but want to know if it is correct and optimal. Here's my current pseudocode:
L := priority queue of nodes (minimum on top)
L.add(S, S.weight)
while (!L.empty) {
    X = L.poll()
    return X.weight if (X == E)   // E, the ending node
    mark X visited
    foreach (unvisited neighbour N of X, N not in L) {
        N.weight = max(N.weight, X.weight)
        L.add(N, N.weight)
    }
}
I believe it is worst case O(n log n) where n is the number of nodes.
Here are some details for my specific problem (percolation), but I am also interested in algorithms for this problem in general. Node weights are uniformly randomly distributed between 0 and a given max value. My nodes are Poisson distributed on the R²-plane, and an edge between two nodes exists if the distance between them is less than a given constant. There are potentially very many nodes, so they are generated on the fly (hidden in the foreach in the pseudocode). My starting node is at (0,0) and the ending node is any node at a distance larger than R from (0,0).
EDIT: The weights on the nodes are floating point numbers.
Starting from an empty graph, you can insert vertices (and their edges to existing neighbours) one at a time in increasing weight order, using a fast union/find data structure to maintain the set of connected components. This is just like Kruskal's algorithm for building minimum spanning trees, but instead of adding edges one at a time, for each vertex v that you process, you combine the components of all of v's neighbours.
You also keep track of which two components contain the start and end vertices. (Initially comp(S) = S and comp(E) = E; before each union operation, the two input components X and Y can be checked to see whether either one is comp(S) or comp(E), and the latter updated accordingly in O(1) time.) As soon as these two components become a single component (i.e. comp(S) = comp(E)), you stop. The vertex just added is the maximum-weight vertex on the path between S and E that minimises the maximum weight of any vertex.
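A sketch of this idea in C++ (illustrative names; the weights weight[v] and adjacency lists adj[v] are assumed given, with S != E):

#include <vector>
#include <numeric>
#include <algorithm>
using namespace std;

struct DSU {                                   // union/find with path compression
    vector<int> p;
    DSU(int n) : p(n) { iota(p.begin(), p.end(), 0); }
    int find(int x) { return p[x] == x ? x : p[x] = find(p[x]); }
    void unite(int a, int b) { p[find(a)] = find(b); }
};

// Returns the smallest k such that S and E are connected using only
// vertices of weight <= k (the minimax vertex weight over S-E paths).
double minimaxWeight(const vector<double>& weight,
                     const vector<vector<int>>& adj, int S, int E)
{
    int n = weight.size();
    vector<int> order(n);
    iota(order.begin(), order.end(), 0);
    sort(order.begin(), order.end(),
         [&](int a, int b) { return weight[a] < weight[b]; });

    vector<bool> active(n, false);
    DSU dsu(n);
    for (int v : order) {                      // insert vertices lightest-first
        active[v] = true;
        for (int u : adj[v])
            if (active[u]) dsu.unite(v, u);
        if (dsu.find(S) == dsu.find(E))        // S and E just became connected
            return weight[v];
    }
    return -1;                                 // S and E are never connected
}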
[EDIT: Added time complexity info]
If the graph contains n vertices and m edges, it will take O(n log n) time to sort the vertices by weight. There will be at most m union operations (since every edge could be used to combine two components). If a simple disjoint set data structure is used, all of these union operations could be done in O(m + n log n) time, and this would become the overall time complexity; if path compression is also used, this drops to O(m A(n)), where A(n) is the incredibly slowly growing inverse Ackermann function, but the overall time complexity remains unchanged from before because the initial sorting dominates.
Assuming integer weights, Pham Trung's binary search approach will take O((n + m) log maxW) time, where maxW is the heaviest vertex in the graph. On sparse graphs (where m = O(n)), this becomes O(n log maxW), while mine becomes O(n log n), so here his algorithm will beat mine if log(maxW) << log(n) (i.e. if all weights are very small). If his algorithm is called on a graph with large weights but only a small number of distinct weights, then one possible optimisation would be to sort the weights in O(n log n) time and then replace them all with their ranks in the sorted order.
This problem can be solved by using binary search.
Assume that the solution is x. Starting from the start node, use BFS or DFS to explore the graph, visiting only those nodes whose weight is <= x. If, in the end, Start and End are connected, then x is a feasible solution. We can find the optimal value of x by applying binary search.
Pseudocode:
int min = min_weight_of_all_nodes;
int max = max_weight_of_all_nodes;
int result = max;
while (min <= max) {
    int mid = (min + max) >> 1;
    if (BFS(mid)) { // breadth-first search, visiting only nodes with weight <= mid
        result = min(mid, result);
        max = mid - 1;
    } else {
        min = mid + 1;
    }
}
print result;
Note: we only need to try weights that actually occur in the graph, so the binary search needs only O(log n) iterations, where n is the number of distinct weights.
If the weights are floats, just use the following approach:
List<Double> listWeight; // sorted list of distinct weights
int min = 0;
int max = listWeight.size() - 1;
int result = max;
while (min <= max) {
    int mid = (min + max) >> 1;
    if (BFS(listWeight.get(mid))) { // BFS visiting only nodes with weight <= this value
        result = min(mid, result);
        max = mid - 1;
    } else {
        min = mid + 1;
    }
}
print listWeight.get(result);

Floyd Warshall algorithm with maximum steps allowed

Is it possible to modify the Floyd-Warshall algorithm when solving the shortest path problem for a directed weighted graph with n nodes, such that each shortest path has no more than m steps? More precisely, for each pair of nodes i and j, you are to find the shortest directed path from i to j that contains no more than m nodes. Does the time complexity still remain O(n^3)?
Meanwhile, I found an O(n^3 log m) algorithm for the all-pairs shortest paths (APSP) problem on a graph with n nodes, such that no path contains more than m nodes.
Given two n x n matrices, say A = [a_ij] and B = [b_ij], their distance product is the n x n matrix C = [c_ij] = A x B defined by c_ij = min_{1 <= k <= n} {a_ik + b_kj}.
This is related to the APSP in the following way. Given the weighted matrix E of distances in the graph, E^n is the matrix of all-pairs shortest paths. If we add the constraint that no path contains more than m nodes, then the matrix E^m is the solution to the APSP. Since the m-th power can be computed with O(log m) distance products by repeated squaring, this gives an O(n^3 log m) algorithm.
There exist faster algorithms for computing the distance product of matrices in some special cases, but the trivial O(n^3) one is enough for me, since the overall time is almost as fast as the Floyd-Warshall approach.
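For reference, the distance product is just ordinary matrix multiplication with (min, +) in place of (+, *); a hedged sketch, with INF standing in for "no path" (entries are assumed small enough that INF never overflows):

#include <vector>
#include <algorithm>
using namespace std;

const long long INF = 1e18;
using Matrix = vector<vector<long long>>;

// Distance (min-plus) product: C[i][j] = min over k of A[i][k] + B[k][j].
// E^m is then obtained with O(log m) of these products via repeated squaring.
Matrix distanceProduct(const Matrix& A, const Matrix& B)
{
    int n = A.size();
    Matrix C(n, vector<long long>(n, INF));
    for (int i = 0; i < n; ++i)
        for (int k = 0; k < n; ++k) {
            if (A[i][k] == INF) continue;      // skip missing entries
            for (int j = 0; j < n; ++j)
                if (B[k][j] != INF)
                    C[i][j] = min(C[i][j], A[i][k] + B[k][j]);
        }
    return C;
}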
Yes and yes.
Each iteration of the algorithm extends the length of the paths you search for by one step, so if you limit the iterations to m, then you find paths of length at most m.
The complexity remains O(n^3) in the worst case of m -> n. However, a more precise estimate is O(m * n^2).
I believe that this can be done with a different data structure (one that allows you to keep track of the number of steps).
Since Floyd-Warshall is usually done with a distance matrix (where matrix[j][k] represents the distance between nodes j and k), instead of making each entry an integer we can make it a struct holding two integers: the distance between the two nodes and the number of steps between them.
I wrote up something in C++ to explain what I mean:
#include <iostream>
using namespace std;

#define INF 9999999 // renamed from INFINITY to avoid clashing with the <cmath> macro

struct floydEdge
{
    int weight;
    int steps;
};

floydEdge matrix[10][10];

int main()
{
    // We initialize the matrix
    for (int j = 0; j < 10; j++)
    {
        for (int k = 0; k < 10; k++)
        {
            matrix[j][k].weight = INF;
            matrix[j][k].steps = 0;
        }
    }
    // We input some weights
    for (int j = 0; j < 10; j++)
    {
        int a, b;
        cin >> a >> b;
        cin >> matrix[a][b].weight;
        matrix[b][a].weight = matrix[a][b].weight;
        matrix[a][b].steps = matrix[b][a].steps = 1;
    }
    // We do the Floyd-Warshall, checking the step count as well as the weights
    for (int j = 0; j < 10; j++)
    {
        for (int k = 0; k < 10; k++)
        {
            for (int i = 0; i < 10; i++)
            {
                // We check if there is a shorter path between nodes j and k using
                // node i, and that the path is no longer than 4 steps (the limit here).
                if ((matrix[j][k].weight > matrix[j][i].weight + matrix[i][k].weight) && (matrix[j][i].steps + matrix[i][k].steps <= 4))
                {
                    matrix[j][k].weight = matrix[k][j].weight = matrix[j][i].weight + matrix[i][k].weight;
                    matrix[j][k].steps = matrix[k][j].steps = matrix[j][i].steps + matrix[i][k].steps;
                }
                // If the path is not shorter, we also check whether it is equal
                // BUT requires fewer steps than the path we currently have.
                else if ((matrix[j][k].weight == matrix[j][i].weight + matrix[i][k].weight) && (matrix[j][i].steps + matrix[i][k].steps < matrix[j][k].steps))
                {
                    matrix[j][k].weight = matrix[k][j].weight = matrix[j][i].weight + matrix[i][k].weight;
                    matrix[j][k].steps = matrix[k][j].steps = matrix[j][i].steps + matrix[i][k].steps;
                }
            }
        }
    }
    return 0;
}
I believe (but I am not completely sure) that this works perfectly, always giving the shortest paths between all nodes. Give it a try and let me know!

Shortest path with a fixed number of edges

Find the shortest path through a graph in efficient time, with the additional constraint that the path must contain exactly n nodes.
We have a directed, weighted graph. It may or may not contain a loop. We can easily find the shortest path using Dijkstra's algorithm, but Dijkstra's makes no guarantee about the number of edges.
The best we could come up with was to keep a list of the best n paths to a node, but this uses a huge amount of memory over vanilla Dijkstra's.
It is a simple dynamic programming algorithm.
Let us assume that we want to go from vertex x to vertex y.
Make a table D[.,.], where D[v,k] is the cost of the shortest path of length k from the starting vertex x to the vertex v.
Initially D[x,1] = 0. Set D[v,1] = infinity for all v != x.
For k = 2 to n:
    D[v,k] = min_u { D[u,k-1] + wt(u,v) }, where we assume that wt(u,v) is infinite for missing edges.
    P[v,k] = the u that gave us the above minimum.
The length of the shortest path will then be stored in D[y,n].
If we have a graph with fewer edges (a sparse graph), we can do this efficiently by only searching over the u that v is connected to. This can be done optimally with an array of adjacency lists (see the sketch after the path recovery below).
To recover the shortest path:
Path = empty list
v = y
For k = n downto 2:
    Path.append(v)
    v = P[v,k]
Path.append(x)
Path.reverse()
The last node is y. The node before that is P[y,n]. We can keep following backwards, and we will eventually arrive at P[v,2] = x for some v.
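Putting the table-filling and the path recovery together, here is a sketch using adjacency lists of incoming edges, as suggested above (names are illustrative):

#include <vector>
#include <algorithm>
using namespace std;

const double INF = 1e18;
struct Edge { int to; double wt; };

// Shortest x->y path using exactly n vertices (n-1 edges), via the DP
// D[k][v] = cheapest path of k vertices from x ending at v.
// radj[v] lists the incoming edges (e.to -> v) with weight e.wt.
vector<int> fixedLengthPath(const vector<vector<Edge>>& radj,
                            int x, int y, int n)
{
    int V = radj.size();
    vector<vector<double>> D(n + 1, vector<double>(V, INF));
    vector<vector<int>> P(n + 1, vector<int>(V, -1));
    D[1][x] = 0;

    for (int k = 2; k <= n; ++k)
        for (int v = 0; v < V; ++v)
            for (const Edge& e : radj[v])       // only the u that v is connected to
                if (D[k - 1][e.to] + e.wt < D[k][v]) {
                    D[k][v] = D[k - 1][e.to] + e.wt;
                    P[k][v] = e.to;
                }

    if (D[n][y] >= INF) return {};              // no path with exactly n vertices
    vector<int> path;
    for (int v = y, k = n; k >= 1; v = P[k][v], --k)
        path.push_back(v);                      // walk the predecessors back to x
    reverse(path.begin(), path.end());
    return path;
}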
The alternative that comes to my mind is a depth first search (as opposed to Dijkstra's breadth first search), modified as follows:
stop "depth"-ing if the required vertex count is exceeded
record the shortest found (thus far) path having the correct number of nodes.
Run time may be abysmal, but it should come up with the correct result while using a very reasonable amount of memory.
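A rough sketch of such a modified DFS (illustrative; assumes an adjacency-list graph and a budget of exactly n nodes, and is indeed exponential in the worst case):

#include <vector>
#include <utility>
#include <algorithm>
using namespace std;

const double INF = 1e18;

// Cheapest simple path from v to target using exactly nodesLeft more nodes
// (counting v itself); prunes branches that exceed the budget or the best cost.
void dfs(const vector<vector<pair<int,double>>>& adj, vector<bool>& used,
         int v, int target, int nodesLeft, double cost, double& best)
{
    if (nodesLeft <= 0) return;                    // stop "depth"-ing: budget spent
    if (v == target) {
        if (nodesLeft == 1) best = min(best, cost); // exactly the required node count
        return;                                     // a simple path cannot revisit target
    }
    used[v] = true;
    for (auto [u, w] : adj[v])
        if (!used[u] && cost + w < best)            // record the shortest found so far
            dfs(adj, used, u, target, nodesLeft - 1, cost + w, best);
    used[v] = false;
}

// Usage: double best = INF; vector<bool> used(V, false);
//        dfs(adj, used, start, end, n, 0.0, best);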
Interesting problem. Did you discuss using a heuristic graph search (such as A*), adding a penalty for going over or under the node count? This may or may not be admissible, but if it did work, it may be more efficient than keeping a list of all the potential paths.
In fact, you may be able to use backtracking to limit the amount of memory being used for the Dijkstra variation you discussed.
A rough idea of an algorithm:
Let A be the start node, and let S be a set of nodes (each paired with a path). The invariant is that at the end of step n, S will contain all nodes that are exactly n steps from A, and the paths will be the shortest paths of that length. When n is 0, that set is {A (empty path)}. Given such a set at step n - 1, you get to step n by starting with an empty set S1 and:
for each (node X, path P) in S
    for each edge E from X to Y in the graph
        If Y is not in S1, add (Y, P + Y) to S1
        If (Y, P1) is already in S1, set the path to the shorter of P1 and P + Y
There are only n steps, and each step should take less than max(N, E) work, which makes the entire algorithm O(n^3) for a dense graph and O(n^2) for a sparse graph.
This algorithm was inspired by Dijkstra's, although it is a different algorithm.
Let's say we want the shortest distance from node x to y in exactly t steps. A simple DP solution would be
A[t][x][y] = min over k of { A[1][x][k] + A[t-1][k][y] }, with k varying from 0 to n-1.
A[1][i][j] = r[i][j]; p[1][i][j] = j;  // r = weight matrix, BG = a large "infinity" value
for (t = 2; t <= n; t++)
    for (i = 0; i < n; i++) for (j = 0; j < n; j++)
    {
        A[t][i][j] = BG; p[t][i][j] = -1;
        for (k = 0; k < n; k++)
            if (A[1][i][k] < BG && A[t-1][k][j] < BG)
                if (A[1][i][k] + A[t-1][k][j] < A[t][i][j])
                {
                    A[t][i][j] = A[1][i][k] + A[t-1][k][j];
                    p[t][i][j] = k;
                }
    }
To trace back the path:
void output(int a, int b, int t)
{
    while (t)
    {
        cout << a << " ";
        a = p[t][a][b];
        t--;
    }
    cout << b << endl;
}
