Overview
I have a directed graph with about 35'000 nodes and about 400'000 edge (a node is usually connected with multiple nodes).
some edges are unidirectional while others are bidirectional.
Some nodes can be Source and/or Sink but never both at the same time.
Objective
Develop an algorithm that minimizes number of routes to cover all edges of the graph also using different couples of source and sink; of course saving these routes.
Constraints
Nodes can't be visited more than once in a single route but they can appear in different routes.
Starting from a node connected with more than one nodes, algorithm can add to the route only one edge starting from it. Others edge cannot be used in this route
here there is a simply representation of the graph. Algorithm has to find some paths to cover all edges. It will be better find the minimum number of paths.
I don't want use a force brute search of all possible paths because it will be too slow.
It would be nice use a different source and sink for each route so that can be crossed in parallel (if these haven't nodes in common).
Maybe this problem can be resolved with the Graph Coloring Technique, but I can't realize in my mind an algorithm to adapt it.
Related
Let's say we have a directed cyclic graph where every node has an edge going to exactly two other nodes - except for a 'final node', which just has multiple things going into it and nothing coming out.
Given a source, I want to find the longest possible path to that final node that doesn't hit a node more than once. There are algorithms out there to do this, but the only problem is that my graph has many different cycles, some of which are inside each other, and most naive algorithms get stuck in infinite loops when evaluated.
I tried collapsing all strongly connected components (of which there is only one), however if the source I want is inside that component, the algorithm doesn't work. And it doesn't work in the general case either, because within that strongly connected component, to hit every node you may have to hit some nodes multiple times, which I don't want.
What's an efficient algorithm I can use for calculating the longest path back to the final vertex in an unweighted directed, cyclic graph such that no node is hit multiple times?
You can see this question, it's also a problem about finding the longest path in a directed cyclic graph, and the standard code is also available.
Question:http://poj.org/problem?id=3592
Answer: https://blog.csdn.net/u013514182/article/details/42364173
Although we can check a if a graph is bipartite using BFS and DFS (2 coloring ) on any given undirected graph, Same implementation may not work for the directed graph.
So for testing same on directed graph , Am building a new undirected graph G2 using my source graph G1, such that for every edge E[u -> v] am adding an edge [u,v] in G2.
So by applying a 2 coloring BFS I can now find if G2 is bipartite or not.
and same applies for the G1 since these two are structurally same. But this method is costly as am using extra space for graph. Though this will suffice my purpose as of now, I'd like know if there any better implementations for the same.
Thanks In advance.
You can execute the algorithm to find the 2-partition of an undirected graph on a directed graph as well, you just need a little twist. (BTW, in the algorithm below I assume that you will eventually find a 2-coloring. If not, then you will run into a node that is already colored and you find you need to color it to the other color. Then you just exit saying it's not bipartite.)
Start from any node and do the 2-coloring by traversing the edges. If you have traversed every edge and every node in the graph then you have your partition. If not, then you have a component that is 2-colored and there are no edges leaving the component. Pick any node not in the component and repeat. If you get into a situation when you have a few components that are all 2-colored, and there are no edges leaving any of them, and you encounter an edge that originates in a node in the component you are currently building and goes into a node in one of the previous components then you just merge the current component with the older one (and possibly need to flip the color of every node in one of the components -- flip it in the smaller component). After merging just continue. You can do the merge, because at the time of the merge you have scanned only one edge between the two components, so flipping the coloring of one of the components leaves you in a valid state.
The time complexity is still O(max(|N|,|E|)), and all you need is an extra field for every node indicating which component that node is in.
I have an undirected graph which initially has no edges. Now in every step an edge is added or deleted and one has to check whether the graph has at least one circle. Probably the easiest sufficient condition for that is
connected components + number of edges <= number of nodes.
As the "steps" I mentioned above are executed millions of times, this check has to be really fast. So I wonder what would be a quick way to check the condition depending on the fact that in each step only one edge changes.
Any suggestions?
If you are keen, you can try to implement a fully dynamic graph connectivity data structure like described in "Poly-logarithmic deterministic fully-dynamic graph algorithms I: connectivity and minimum spanning tree" by Jacob Holm, Kristian de Lichtenberg, Mikkel Thorup.
When adding an edge, you check whether the two endpoints are connected. If not, the number of connected components decreases by one. After deleting an edge, check if the two endpoints are stil connected. If not, the number of connected components increases by one. The amortized runtime of edge insertion and deletion would be O(log^2 n), but I can imagine the constant factor is quite high.
There are newer result with better bounds. There is also an experimental evaluation of some of the dynamic connectivity algorithms that considers implementation details as well. There is also a Javascript implementation. I have no idea how good it is.
I guess in practice you can have it much easier by maintaining a spanning forest. You get edge additions and non-tree edge deletions (almost) for free. For tree edge deletions you could just use "brute force" in the form of BFS or DFS to check whether the end points are still connected. Especially if the number of nodes is bounded, maybe that works well enough in practice, BFS and DFS are both O(n^2) for dense graphs and you can charge some of that work to the operations where you got lucky and didn't have a lot to do.
I suggest you label all the nodes. Use integers, that's easiest.
At any point, your graph will be divided into a number of disjoint subgraphs. Initially, each node is in its own subgraph.
Maintain the condition that each subgraph has a unique label, and all the nodes in the subgraph carry that label. Initially, just give each node a unique label. If your problem includes adding nodes, you might want to maintain a variable to hold the next available label.
If and only if a new edge would connect two nodes with identical labels, then the edge would create a cycle.
Whenever you add an edge, you will connect two previously disjoint subgraphs. You must relabel one of the subgraphs to match the other, which will require visiting all the nodes of one subgraph. This is the highest computatonal burden in this scheme.
If you don't mind allocating more space, you should also maintain a list of labels in use, associated with a count of the nodes carrying that label. This will allow you to choose the smaller subgraph when relabeling.
If you know which two nodes are being connected by the new edge, you could use some sort of path finding algorithm to detect an alternative path between the two nodes. In other words, if a path exists which connects the two nodes of your new edge before you add the new edge, adding the new edge will create a circle.
Your problem then reduces to finding the paths between two given nodes.
There is a directed graph having a single designated node called root from which all other nodes are reachable. Each terminal node (no outgoing edges) takes an string value.
Intermediate nodes have one or more outgoing edges but no value associated with them. Edges connecting a node to its neighbor have a string label. The labels for edges emanating from a single node are unique. There could be possible cycles in the graph!
What is the best graph algorithm for checking if two such directed (possibly having cycles) graphs (as described above) are equal?
The graph isomorphism problem is one of the more intriguing problems in TCS. There is an entire page dedicated to it on the wiki.
In your particular case you have two rooted directed graph with a source and a sink.
You could start two BFS in parallel, and check for isomorphism level by level; i.e. levelize the graph and check whether the subset of nodes at each level are isomorphic across the two graphs.
Note that since you have a Directed, Rooted graph you should still be able to levelize it for the purpose of finding isomorphism. Do not enque nodes already visited during the BFS; i.e. levelize using the shortest path to the node from the root when determining the level to group in.
Within a set the comparison should be relatively easy. You have many properties to distinguish nodes at the same level (degree, labels) and should be able to create suitable signatures to sort them. Since you are looking for perfect isomorphism, you should get an exact match.
I have an graph with the following attributes:
Undirected
Not weighted
Each vertex has a minimum of 2 and maximum of 6 edges connected to it.
Vertex count will be < 100
Graph is static and no vertices/edges can be added/removed or edited.
I'm looking for paths between a random subset of the vertices (at least 2). The paths should simple paths that only go through any vertex once.
My end goal is to have a set of routes so that you can start at one of the subset vertices and reach any of the other subset vertices. Its not necessary to pass through all the subset nodes when following a route.
All of the algorithms I've found (Dijkstra,Depth first search etc.) seem to be dealing with paths between two vertices and shortest paths.
Is there a known algorithm that will give me all the paths (I suppose these are subgraphs) that connect these subset of vertices?
edit:
I've created a (warning! programmer art) animated gif to illustrate what i'm trying to achieve: http://imgur.com/mGVlX.gif
There are two stages pre-process and runtime.
pre-process
I have a graph and a subset of the vertices (blue nodes)
I generate all the possible routes that connect all the blue nodes
runtime
I can start at any blue node select any of the generated routes and travel along it to reach my destination blue node.
So my task is more about creating all of the subgraphs (routes) that connect all blue nodes, rather than creating a path from A->B.
There are so many ways to approach this and in order not confuse things, here's a separate answer that's addressing the description of your core problem:
Finding ALL possible subgraphs that connect your blue vertices is probably overkill if you're only going to use one at a time anyway. I would rather use an algorithm that finds a single one, but randomly (so not any shortest path algorithm or such, since it will always be the same).
If you want to save one of these subgraphs, you simply have to save the seed you used for the random number generator and you'll be able to produce the same subgraph again.
Also, if you really want to find a bunch of subgraphs, a randomized algorithm is still a good choice since you can run it several times with different seeds.
The only real downside is that you will never know if you've found every single one of the possible subgraphs, but it doesn't really sound like that's a requirement for your application.
So, on to the algorithm: Depending on the properties of your graph(s), the optimal algorithm might vary, but you could always start of with a simple random walk, starting from one blue node, walking to another blue one (while making sure you're not walking in your own old footsteps). Then choose a random node on that path and start walking to the next blue from there, and so on.
For certain graphs, this has very bad worst-case complexity but might suffice for your case. There are of course more intelligent ways to find random paths, but I'd start out easy and see if it's good enough. As they say, premature optimization is evil ;)
A simple breadth-first search will give you the shortest paths from one source vertex to all other vertices. So you can perform a BFS starting from each vertex in the subset you're interested in, to get the distances to all other vertices.
Note that in some places, BFS will be described as giving the path between a pair of vertices, but this is not necessary: You can keep running it until it has visited all nodes in the graph.
This algorithm is similar to Johnson's algorithm, but greatly simplified thanks to the fact that your graph is unweighted.
Time complexity: Since there is a constant number of edges per vertex, each BFS will take O(n), and the total will take O(kn), where n is the number of vertices and k is the size of the subset. As a comparison, the Floyd-Warshall algorithm will take O(n^3).
What you're searching for is (if I understand it correctly) not really all paths, but rather all spanning trees. Read the wikipedia article about spanning trees here to determine if those are what you're looking for. If it is, there is a paper you would probably want to read:
Gabow, Harold N.; Myers, Eugene W. (1978). "Finding All Spanning Trees of Directed and Undirected Graphs". SIAM J. Comput. 7 (280).