Cost to transpose a directed graph? - performance

I am trying to construct the transpose of a directed graph by running DFS on the original graph and then generating an adjacency list of the mirror as new nodes are discovered.
What would the computational time of this be? I know that the DFS takes O(|V| + |E|), but what about constructing the adjacency list? How long does it take to construct the adjacency list of the transpose through DFS?

If you have O(1) insertions of items into your graph (supposing you are using a hashtable or hashmap for vertex lookup or an array if your vertices are represented by integers), then the asymptotic runtime should be no different than the DFS.
I don't think you actually need to do a DFS, to be honest. I think you could just iterate over each vertex's adjacency list and then add the edges that way. The runtime will still be O(V+E), so theoretically, it doesn't really matter.
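Also, if your graph is represented as an edge list, then I believe making the transpose graph would just be O(E): you swap the endpoints of each edge. That holds whether or not the graph is connected, though isolated vertices won't show up in an edge list at all, so you'd have to track them separately.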
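The "just iterate over each vertex's adjacency list" idea could look something like the sketch below. It assumes the graph is a dict mapping each vertex to a list of out-neighbors; the function name `transpose` and that representation are my own choices for illustration, not something from the question.

```python
def transpose(adj):
    """Return the transpose of a directed graph.

    adj: dict mapping each vertex to a list of its out-neighbors
    (an assumed representation). Runs in O(V + E) with O(1) dict inserts.
    """
    # Start with every known vertex so isolated vertices are preserved.
    rev = {u: [] for u in adj}
    for u, neighbors in adj.items():
        for v in neighbors:
            # Edge u -> v in the original becomes v -> u in the transpose.
            rev.setdefault(v, []).append(u)
    return rev


graph = {"a": ["b", "c"], "b": ["c"], "c": []}
mirrored = transpose(graph)
```

No DFS is involved: every vertex is touched once and every edge once, which is where the O(V + E) comes from.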
Sorry if there was too much extra information in there, and I hope I was able to help!

Related

Does every matrix correspond to a graph conceptually?

I understand there are 3 common ways to represent graphs:
Adjacency Matrix
Adjacency List
Edge list
That said, problems I’ve solved on LeetCode often use matrices and the solution requires DFS or BFS. For example, given the matrix below, find if a target string exists when you go left, right, up, and down (but not diagonal).
[
['a','p','p'],
['e','a','l'],
['r','t','e']
]
This required a DFS approach. Is this because this matrix represents a graph or does DFS and BFS apply to matrices too and not just trees and graphs?
Are DFS and BFS always/mostly used against matrices (2D arrays) in implementation or are there cases where it’s used against a Graph class?
Graph algorithms are often used to solve problems on data structures that do not explicitly represent a graph, like your matrix. The graph is not in the data. It's in your head when you solve the problem. For example, "If I think of this as a graph, then I can solve the problem with DFS or BFS". Then you write a BFS or DFS algorithm, mapping the traversal operations to whatever is equivalent in the data structure you do have.
This is called operating on the "implicit graph": https://en.wikipedia.org/wiki/Implicit_graph
If you actually made a graph data structure out of your data -- an explicit graph -- then you could write a BFS or DFS on that directly, but it's often unnecessary and in fact wasteful.
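As a rough sketch of what "mapping the traversal operations" means for the matrix in the question: each cell is treated as a vertex, and its up/down/left/right neighbors are its edges, computed on the fly. The function name `exists` is made up here, and I'm assuming a cell may not be reused within one match, which the question doesn't actually state.

```python
def exists(board, word):
    """DFS over the implicit grid graph: is `word` spelled by a path of adjacent cells?"""
    if not word:
        return True
    rows, cols = len(board), len(board[0])

    def dfs(r, c, k, visited):
        # (r, c) is the current "vertex"; k is the index into word.
        if board[r][c] != word[k]:
            return False
        if k == len(word) - 1:
            return True
        visited.add((r, c))
        # The "adjacency list" is computed, not stored: four grid neighbors.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                if dfs(nr, nc, k + 1, visited):
                    return True
        visited.discard((r, c))  # backtrack (assumes cells cannot be reused)
        return False

    return any(dfs(r, c, 0, set()) for r in range(rows) for c in range(cols))


board = [
    ["a", "p", "p"],
    ["e", "a", "l"],
    ["r", "t", "e"],
]
found = exists(board, "apple")
```

No Graph class is ever built; the bounds check plays the role of "does this edge exist".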

Dijkstra's storing the Graph in a text file

I was wondering, what is the most efficient way of storing the graph in a text file while you are implementing Dijkstra's algorithm? (Adjacency matrix, incidence matrix? etc)
In the general case, a good approach is to store a list of all edges.
It takes O(E) space: we store two endpoints per edge.
To store it on disk, that will suffice.
To work with such a list, it is usually stored in memory as V adjacency lists, one for every vertex.
This duplicates each edge (u->v and v->u) if the graph is undirected.
However, a common operation for graph algorithms is to traverse all edges from a given vertex.
By storing an adjacency list for each vertex, we get to do that in O(number of neighbors), which is the best possible.
Adjacency matrix takes O(V^2) space, which might be fine for dense graphs, but is worse than O(E) in the general case.
Incidence matrix takes O(VE) space, and is not efficient, unless your graph is somehow very special to make it so.
Typical binary-heap implementations of Dijkstra's algorithm take O((V + E) log V) time, so O(E) memory is usually fine.
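To make the edge-list idea concrete, here is a minimal sketch that reads an edge list and turns it into adjacency lists before running Dijkstra. The text format (one directed edge per line as `u v weight`, with `#` comment lines) and the function names are assumptions of mine, not a standard.

```python
import heapq


def parse_edge_list(lines):
    """Build adjacency lists from lines of the assumed form 'u v weight'."""
    adj = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        u, v, w = line.split()
        adj.setdefault(u, []).append((v, float(w)))
        adj.setdefault(v, [])  # make sure the endpoint exists as a vertex
    return adj


def dijkstra(adj, source):
    """Binary-heap Dijkstra; returns a dict of shortest distances from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


adj = parse_edge_list(["a b 1", "b c 2", "a c 5"])
dist = dijkstra(adj, "a")
```

Since `parse_edge_list` only needs an iterable of lines, an open file object can be passed in directly. For an undirected graph you would append the reverse edge as well, which is the duplication mentioned above.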

What is the time complexity for finding a universal sink given the adjacency list representation

There are many variants of this question asking the solution in O(|V|) time.
But what is the worst-case bound if I want to compute whether there is a universal sink in the graph and the graph is represented as adjacency lists? This is important because all other algorithms seem to be better for adjacency lists, so if finding a universal sink is not too frequent an operation, I will definitely go for lists rather than a matrix.
In my opinion, the time complexity would be the size of the graph, that is O(|V| + |E|). The algorithm for finding a universal sink is as follows. Assuming in-neighbor lists, start from index 1 of the graph. Check the length of the adjacency list at index 1; if it is |V| - 1, traverse the list to check whether there is a self-loop. If the list has no self-loop and all other vertices are in it, store the list index. Then go through the other lists to check whether this vertex appears in any of them. If it does, the stored vertex cannot be a universal sink, and the search continues from the next index. If the lists are out-neighbor lists instead, we look for the vertices whose list has length 0, then search all other lists to check whether that vertex appears in each of them.
As the explanation above shows, no matter which form of adjacency list is used, in the worst case finding the universal sink must traverse all the vertices and edges once, hence the complexity is the size of the graph, i.e. O(|V| + |E|).
But my friend, who has recently joined a university as an assistant professor, mentioned it has to be O(|V|*|V|). I am reviewing his notes before he starts teaching the course in the spring, but before correcting them I want to be one hundred percent sure.
You're quite correct. We can build the structures we need to track all of the intermediate results, but the basic complexity is still straightforward: we go through all of our edges once, marking and counting references. We can even build a full transition matrix in O(E) time.
Depending on the data structures, we may find an improvement by a second pass over all edges, but 2 * O(E) is still O(E).
Then we traverse each node once, looking for in/out counts and a self-loop.
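A minimal sketch of that "count references over the edges, then scan the nodes" approach, assuming out-neighbor lists indexed by vertices 0..|V|-1 (the function name and the 0-based indexing are my own choices):

```python
def find_universal_sink(adj):
    """Return a universal sink of the graph, or None if there is none.

    adj: list where adj[u] is the list of out-neighbors of vertex u (assumed
    representation). A universal sink has out-degree 0 and an incoming edge
    from every other vertex.
    """
    n = len(adj)
    in_deg = [0] * n
    # One pass over all edges: O(E). set() guards against duplicate edges.
    for neighbors in adj:
        for v in set(neighbors):
            in_deg[v] += 1
    # One pass over all vertices: O(V). Out-degree 0 also rules out a self-loop.
    for v in range(n):
        if not adj[v] and in_deg[v] == n - 1:
            return v
    return None


sink = find_universal_sink([[2], [2], []])
```

The total work is one pass over the edges plus one pass over the vertices, i.e. O(|V| + |E|).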

Finding reachable vertices for every vertex in a directed graph

I know that the brute-force approach is to perform a DFS from every vertex of the graph. The complexity of that algorithm would be O(V(V+E)). But is there a more efficient way to do this?
I get the impression from papers like http://research.microsoft.com/pubs/144985/todsfinal.pdf that there is no algorithm that does better than O(VE) or O(V^3) in the general case. For sparse graphs and other special graphs there are faster algorithms. It seems, however, that you can still make improvements by separating "index construction" from "query", if you have some idea of the number of queries that will be made on the data. If there are going to be a lot of queries, O(1) is possible for queries if all the data is pre-computed (DFS or Floyd-Warshall, etc.) and stored in O(n^2) space. On the other hand, if there are going to be relatively few queries, space and/or index construction time can be reduced at the expense of query time.
I really suspect that there isn't a known better algorithm for general graphs. All the papers I found on the subject [1] [2] describe algorithms that run in O(|V| * |E|) time. That isn't better than your naïve attempt in the worst case.
Even the Wikipedia page [3] says the fastest algorithms reduce the problem to matrix multiplication, and the fastest algorithms for that are only marginally better than your baseline.
[1] http://ion.uwinnipeg.ca/~ychen2/conferencePapers/tranRelationCopy.pdf
[2] http://www.vldb.org/conf/1988/P382.PDF
[3] http://en.wikipedia.org/wiki/Transitive_closure#Algorithms
[EDIT: As pointed out by kraskevich, the final query step can be worse than I had originally claimed: up to O(|V|^2) even for an output of size O(|V|), which is no better than ordinary DFS without any preprocessing.].
In the worst case, O(|V|^2) space would be needed to store all this information explicitly -- i.e., to store the complete list of reachable vertices for each vertex (think of a graph in which every vertex has an edge to every other vertex). But it's possible to represent it in such a way that only O(|V|+|E|) space is needed (O(|V|) for the tables described below, plus the condensed graph), and this representation can be built in O(|V|+|E|) time, and a query on it will only take time proportional to the size of the answer (number of reachable vertices), subject to the caveat in the edit above.
The basic idea is: Every vertex in a strongly connected component (SCC) can reach every other vertex in the same SCC (this is the definition of SCC), and can reach all vertices in SCCs that it can reach, and no other vertices.
Find all SCCs; this can be done in O(|V|+|E|) time. Build a table SCC, so that SCC(u) = i if the SCC of u is i (both vertices in G and SCCs can be represented as integers). Afterwards make another pass through this table to build a dual table, Verts, so that Verts(i) contains a list of all vertices in the ith SCC.
Build a new graph G' whose vertices are the SCCs of G. G' will necessarily be acyclic.
So, given a vertex u in G, look up its SCC, SCC(u). Call this i. Perform a DFS through G' starting at vertex i: For each vertex (of G') j encountered during this DFS, output every vertex (of G) in Verts(j).
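The steps above could be sketched roughly as follows. I'm using Kosaraju's two-pass algorithm to find the SCCs (the answer doesn't say which SCC algorithm to use, so that's my pick), and assuming vertices are integers 0..n-1 with out-neighbor lists. The names `build_index` and `reachable` are made up for illustration.

```python
def build_index(n, adj):
    """Return (scc, verts, cadj): SCC id per vertex, vertices per SCC, condensed graph G'."""
    # Pass 1: record vertices in order of DFS finish time (iterative DFS).
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(adj[s]))]
        while stack:
            u, it = stack[-1]
            for v in it:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(adj[v])))
                    break
            else:  # all neighbors of u explored
                order.append(u)
                stack.pop()
    # Pass 2: DFS on the transpose in reverse finish order; each tree is one SCC.
    radj = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)
    scc, verts = [-1] * n, []
    for s in reversed(order):
        if scc[s] != -1:
            continue
        i = len(verts)
        verts.append([])
        scc[s] = i
        stack = [s]
        while stack:
            u = stack.pop()
            verts[i].append(u)
            for v in radj[u]:
                if scc[v] == -1:
                    scc[v] = i
                    stack.append(v)
    # Condensed graph G': one vertex per SCC, edges between distinct SCCs.
    cadj = [set() for _ in verts]
    for u in range(n):
        for v in adj[u]:
            if scc[u] != scc[v]:
                cadj[scc[u]].add(scc[v])
    return scc, verts, cadj


def reachable(u, scc, verts, cadj):
    """DFS through G' from SCC(u), outputting every vertex of each SCC visited."""
    start = scc[u]
    seen, stack, out = {start}, [start], []
    while stack:
        i = stack.pop()
        out.extend(verts[i])
        for j in cadj[i]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return out


graph = [[1], [2], [0, 3], [4], []]
index = build_index(5, graph)
from_zero = sorted(reachable(0, *index))
```

Note that in this sketch the result always includes u itself (and the rest of its SCC), even when u sits in a single-vertex SCC with no self-loop; filter it out if your definition of "reachable" differs.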

Compute costs only of paths between all pairs in weighted graph

Is there an algorithm faster than O(n^2) for computing the costs only between every pair in a weighted non-cyclic graph, assuming I do not need the shortest paths but just the paths I would get if using simple BFS? I do not need the actual paths, only the costs of the paths. My current solution is just to do a BFS starting from each node also keeping track of the weights of the edges along the way but this is obviously O(n^2) and I am wondering if it is possible to do any better.
No, there is no algorithm better than O(n^2).
The algorithm will need to at least go over each pair. There are O(n^2) possible pairs in the graph: n(n-1)/2 unordered pairs. Therefore the lower bound for the algorithm is Ω(n^2).
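For reference, the BFS-from-every-node approach described in the question might look like the sketch below. I'm assuming "weighted non-cyclic graph" means an undirected tree given as `(u, v, weight)` triples over vertices 0..n-1, which makes each BFS O(n) and the whole thing O(n^2), matching the bound above. The function name is hypothetical.

```python
from collections import deque


def all_pairs_costs(n, edges):
    """Cost of the unique path between every pair of vertices in a weighted tree."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))  # assumes the tree is undirected
    cost = [[None] * n for _ in range(n)]
    for s in range(n):
        cost[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, w in adj[u]:
                if cost[s][v] is None:  # first (and only) way to reach v in a tree
                    cost[s][v] = cost[s][u] + w
                    queue.append(v)
    return cost


costs = all_pairs_costs(4, [(0, 1, 2), (1, 2, 3), (1, 3, 5)])
```

Just filling in the n-by-n result table already takes n^2 steps, which is another way to see that nothing asymptotically faster is possible when every pair is required.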

Resources