Can the adjacency matrix implementation of Prim use a min heap?

I found that there are two ways to implement Prim's algorithm, and that the time complexity with an adjacency matrix is O(V^2), while the time complexity with a heap and an adjacency list is O(E lg V).
I'm wondering whether I can use a heap when the graph is represented with an adjacency matrix. Does that make sense? If it does, is there any difference between adjacency matrix + heap and adjacency list + heap?

Generally, the matrix graph-representation is not so good for Prim's algorithm.
This is because of the main iteration of the algorithm, which pops out a node from the heap, and then scans its neighbors. How do you find its neighbors? Using the matrix graph representation, you basically need to loop over an entire matrix row (in the list graph-representation, you just need to loop over the node's list, which can be significantly shorter).
This means that, irrespective of the heap, just the total work of finding the neighbors of the popped nodes is already Ω(|V|^2), since each node's row is eventually scanned.
So, no - it doesn't make much sense. The heap does not reduce the overall complexity.
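To make the contrast concrete, here is a minimal sketch (in Python, with illustrative names, not taken from any particular implementation) of just the neighbor-scan step in each representation; the rest of Prim's algorithm is identical in both cases:

    def neighbors_matrix(adj_matrix, u):
        """Matrix representation: scan u's entire row, Theta(|V|) per
        popped node, hence Theta(|V|^2) over the whole run."""
        n = len(adj_matrix)
        return [(v, adj_matrix[u][v]) for v in range(n)
                if adj_matrix[u][v] is not None]   # None marks "no edge"

    def neighbors_list(adj_list, u):
        """List representation: walk only u's own list, O(deg(u)) per
        popped node, hence O(|E|) over the whole run."""
        return adj_list[u]     # stored as (v, weight) pairs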

Related

Which Graph Algorithms prefer adjacency matrix and why?

I heard that adjacency lists are used in most graph algorithms (but not all). I'm just wondering which algorithms prefer adjacency matrices, and why.
So far I’ve found that Floyd-Warshall uses adjacency matrices.
Adjacency lists are generally faster than adjacency matrices in algorithms in which the key operation performed per node is “iterate over all the nodes adjacent to this node.” That can be done in O(deg(v)) time with an adjacency list, where deg(v) is the degree of node v, while it takes Θ(n) time with an adjacency matrix. Similarly, adjacency lists make it fast to iterate over all of the edges in a graph: it takes O(m + n) time to do so, compared with Θ(n^2) time for adjacency matrices.
Some of the most-commonly-used graph algorithms (BFS, DFS, Dijkstra’s algorithm, A* search, Kruskal’s algorithm, Prim’s algorithm, Bellman-Ford, Karger’s algorithm, etc.) require fast iteration over all edges or the edges incident to particular nodes, so they work best with adjacency lists.
You mentioned that Floyd-Warshall uses adjacency matrices. While Floyd-Warshall does maintain an internal matrix tracking shortest paths seen so far, it doesn’t actually require the original graph to be an adjacency matrix. The overall cost of the dynamic programming work is Θ(n^3), which is bigger than the O(n^2) cost of converting an adjacency list into an adjacency matrix or vice-versa.
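To illustrate why the conversion cost doesn’t matter here, a sketch of the list-to-matrix conversion (assuming edges stored as (neighbor, weight) pairs; the names are mine):

    import math

    def list_to_matrix(adj_list):
        """Convert an adjacency list to the distance matrix Floyd-Warshall
        works on, in Theta(n^2) time, using math.inf for absent edges."""
        n = len(adj_list)
        dist = [[math.inf] * n for _ in range(n)]  # Theta(n^2) to initialize
        for u in range(n):
            dist[u][u] = 0.0
            for v, w in adj_list[u]:             # O(n + m) total edge work
                dist[u][v] = min(dist[u][v], w)  # keep cheapest parallel edge
        return dist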
There are only a few places where an adjacency matrix is faster than an adjacency list. Adjacency matrices take O(1) time to test whether a particular edge is present in the graph, which is faster than the O(deg(v)) cost of the corresponding operation on an adjacency list. Since the cost of converting an adjacency list to an adjacency matrix is Θ(n^2), the only cases where an adjacency matrix would outperform an adjacency list are situations where (1) random access to the edges is required and (2) the total runtime of the algorithm is o(n^2). I only know a few algorithms that do this. For example, there’s the celebrity-finding problem, where you’re given a graph and asked whether there’s a node with incoming edges from every other node and no outgoing edges. This can be solved in O(n) time using an adjacency matrix, faster than is possible with an adjacency list.
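A sketch of the classic elimination argument, assuming knows[i][j] encodes an edge i -> j:

    def find_celebrity(knows):
        """knows[i][j] is True iff there is an edge i -> j ("i knows j").
        Returns a node with in-edges from all others and no out-edges,
        or None. Uses O(n) matrix probes in total."""
        n = len(knows)
        cand = 0
        for i in range(1, n):      # elimination pass: n - 1 probes
            if knows[cand][i]:     # cand has an out-edge, so cand is out
                cand = i
        for i in range(n):         # verification pass: O(n) probes
            if i != cand and (knows[cand][i] or not knows[i][cand]):
                return None
        return cand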
(That being said, you could also use an adjacency list represented using cuckoo hash tables rather than regular lists and match the same runtime bounds as above, though with the cost of creating the adjacency list now only expected to be fast rather than actually worst-case efficient.)
The main reason I’ve found adjacency matrices to be useful is in thinking about graphs from a different perspective. For example, raising an adjacency matrix to the kth power produces a new matrix that counts the number of paths from one node to another using exactly k hops. This can be used to count and find triangles in graphs faster than the naive algorithm, for example. Similarly, the Four Russians algorithm for computing transitive closures of graphs works by representing the graph as a matrix and using some clever techniques (treating blocks of bits as integers that are then used in a lookup table) to outperform the naive search.
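For instance, since (A^k)[i][j] counts the k-hop walks from i to j, trace(A^3) counts closed 3-walks, and each triangle is counted exactly six times (3 starting vertices times 2 directions). A quick NumPy sketch of that triangle count:

    import numpy as np

    def count_triangles(adj_matrix):
        """Triangles in an undirected simple graph: trace(A^3) counts
        closed 3-walks, and each triangle contributes exactly six."""
        a = np.asarray(adj_matrix, dtype=np.int64)
        return int(np.trace(a @ a @ a)) // 6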
Hope this helps!

Time Complexity of Dijkstra's Algorithm when using Adjacency Matrix vs Adjacency Linked List

For a graph with n vertices and e edges, and a fringe stored in a binary min-heap, the worst-case runtime is O((n+e) lg n). However, this assumes we use an adjacency list to represent the graph. An adjacency matrix takes O(n^2) to traverse, while an adjacency list can be traversed in O(n+e).
Therefore, would using the matrix to represent the graph change the runtime of Dijkstra's to O(n^2lg(n))?
The O(log n) cost is paid for processing edges, not for walking the graph, so Dijkstra's algorithm on an adjacency matrix with a min-heap runs in O(n^2 + (n+e) log n) time rather than O(n^2 log n): the n^2 term comes from the row scans, and the (n+e) log n term from the heap operations.
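A sketch of that combination (using None for absent edges, and lazy deletion in place of decrease-key, since Python's heapq has no decrease-key):

    import heapq, math

    def dijkstra_matrix(adj_matrix, source):
        """Dijkstra over an adjacency matrix with a binary min-heap.
        None marks an absent edge. Each settled node costs a Theta(n) row
        scan (O(n^2) total); heap operations add O((n + e) log n)."""
        n = len(adj_matrix)
        dist = [math.inf] * n
        dist[source] = 0.0
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:        # stale entry (lazy decrease-key), skip
                continue
            for v in range(n):     # the unavoidable Theta(n) row scan
                w = adj_matrix[u][v]
                if w is not None and d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))
        return dist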

Multigraph and adjacency list

I have a problem that can be represented as a multigraph. To represent this graph internally, I’m thinking of a matrix. I like the idea of a matrix because I want to count the number of edges for a vertex. This would be O(n) time, because all I would have to do is loop through the correct column, so the time complexity would be linear in the number of vertices in the graph, right? HOWEVER, I’m also thinking of the space complexity. If this graph were to grow, there could be a lot of wasted space. This leads me to using an adjacency list. That may reduce my space complexity, but it sounds like my time complexity just increased. How would I represent the time complexity if I wanted to determine the number of edges for a particular vertex? I know the first operation would be to find the vertex, so that operation would be O(n), but then I would also have to scan the list of edges, which could also be O(n). So does this mean my time complexity for this operation is O(n^2)?
EDIT:
I guess if I were to use a hash table, the first operation would be O(1), so does that mean my operation to find the number of edges for a vertex is O(n)?
It will be O(|E|). |E| can be as large as O(|V|^2), but the reason you want an adjacency list is that your graph is sparse, i.e. |E| << |V|^2, so it's better to say O(|E|).
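A sketch of the hash-table version from your edit, with the multigraph stored as a dict of edge lists (parallel edges are just repeated entries):

    from collections import defaultdict

    # A multigraph as a hash table mapping each vertex to its edge list.
    adj = defaultdict(list)
    adj["a"] += ["b", "b", "c"]     # two parallel a-b edges, one a-c

    def edge_count(adj, vertex):
        """O(1) hash lookup, then O(deg(vertex)) to walk the edge list,
        which is O(|E|) in the worst case (not O(n^2))."""
        return sum(1 for _ in adj[vertex])

    print(edge_count(adj, "a"))     # prints 3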

Dijkstra’s algorithm with a Fibonacci heap

I am trying to implement Dijkstra's algorithm for finding the shortest paths between nodes using a Fibonacci heap, with an adjacency list representation for the graph. According to the algorithm I know, we have to find the minimum node in the heap and then iterate through all its neighbours and update their distances. But to get the current distances of the neighbours (which are stored in each node in the heap), I have to find each particular node in the heap. A 'find' operation takes O(N) time, where N is the number of nodes in the Fibonacci heap. So is my algorithm correct, or am I missing something? Any help would be greatly appreciated.
An implicit assumption in a Fibonacci heap is that when you enqueue an element into the heap, you can, in time O(1), pull up the resulting node in the Fibonacci heap. There are two common ways you'll see this done.
First, you could use an intrusive Fibonacci heap. In this approach, the actual node structures in the graph have extra fields stored in them corresponding to what the Fibonacci heap needs to keep things linked up in the right order. If you set things up this way, then when you iterate over a node's neighbors, those neighbors already have the necessary fields built into them, which makes it easy to query the Fibonacci heap on those nodes without needing to search for them.
Second, you can have the enqueue operation on the Fibonacci heap return a pointer to the newly-created Fibonacci heap node, then somehow associate each node in the graph with that heap node (maybe store a pointer to it, or keep a hash table, etc.). This is the approach I took when I coded up a version of Dijkstra's algorithm that uses a Fibonacci heap many years back.
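Here is a minimal runnable stand-in for that second pattern; the class below is just a binary heap with lazy deletion, but it exposes the handle-returning interface a real Fibonacci heap implementation would give you:

    import heapq

    class HandleHeap:
        """A stand-in for a Fibonacci heap whose enqueue returns a handle.
        Internally this is a binary heap with lazy deletion; a real
        Fibonacci heap exposes the same interface with better bounds."""

        def __init__(self):
            self._heap = []    # (key, value) entries, some possibly stale
            self._key = {}     # value -> current key

        def enqueue(self, value, key):
            self._key[value] = key
            heapq.heappush(self._heap, (key, value))
            return value       # the value itself serves as the handle here

        def decrease_key(self, handle, new_key):
            self._key[handle] = new_key
            heapq.heappush(self._heap, (new_key, handle))  # old entry stale

        def dequeue_min(self):
            while True:
                key, value = heapq.heappop(self._heap)
                if self._key.get(value) == key:    # skip stale entries
                    del self._key[value]
                    return value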

How can a heap be used to optimize Prim's minimum spanning tree algorithm?

I have to solve a question that is something like this:
I am given a number N which represents the number of points I have. Each point has two coordinates: X and Y.
I can find the distance between two points with the following formula:
abs(x2-x1)+abs(y2-y1),
(x1,y1) being the coordinates of the first point, (x2,y2) the coordinates of the second point and abs() being the absolute value.
I have to find the minimum spanning tree, meaning I must have all my points connected with the sum of the edges being minimal. Prim's algorithm is good, but it is too slow. I read that I can make it faster using a heap but I didn't find any article that explains how to do that.
Can anyone explain to me how Prim's algorithm works with a heap (some sample code would be good, but is not strictly necessary), please?
It is possible to solve this problem efficiently (in O(n log n) time), but it is not that easy. Just using Prim's algorithm with a heap does not help (it actually makes things even slower), because its time complexity is O(E log V), which is O(n^2 * log n) in this case.
However, you can use the Delaunay triangulation to reduce the number of edges in the graph. The Delaunay triangulation graph is planar, so it has a linear number of edges. That's why running Prim's algorithm with a heap on it gives O(n log n) time complexity (there are O(n) edges and n vertices). You can read more about it here (covering this algorithm in detail and proving its correctness would make my answer far too long): http://en.wikipedia.org/wiki/Euclidean_minimum_spanning_tree. Note that even though the article is about the Euclidean MST, the approach for your case is essentially the same (it is possible to build the Delaunay triangulation for Manhattan distance efficiently, too).
A description of Prim's algorithm with a heap itself is already present in two other answers to your question.
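For what it's worth, here is a rough sketch of the triangulation approach using scipy.spatial.Delaunay; note that SciPy builds the Euclidean triangulation, so for a provably correct Manhattan MST you would need the L1 Delaunay graph mentioned in the article (and at least three points in general position):

    import heapq
    import numpy as np
    from scipy.spatial import Delaunay

    def manhattan_mst_weight(points):
        """Build a (Euclidean) Delaunay triangulation to get O(n) candidate
        edges, then run Prim's algorithm with a binary heap over just those
        edges, weighting each edge by Manhattan distance."""
        pts = np.asarray(points, dtype=float)
        tri = Delaunay(pts)
        adj = {i: set() for i in range(len(pts))}
        for simplex in tri.simplices:      # record each triangle's edges
            for a in simplex:
                for b in simplex:
                    if a != b:
                        adj[a].add(b)
        dist = lambda u, v: abs(pts[u] - pts[v]).sum()   # Manhattan weight
        total, seen = 0.0, {0}
        heap = [(dist(0, v), v) for v in adj[0]]
        heapq.heapify(heap)
        while heap and len(seen) < len(pts):
            w, u = heapq.heappop(heap)
            if u in seen:                  # stale entry, u already in tree
                continue
            seen.add(u)
            total += w
            for v in adj[u]:
                if v not in seen:
                    heapq.heappush(heap, (dist(u, v), v))
        return total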
From the Wikipedia article on Prim's algorithm:
[S]toring vertices instead of edges can improve it still further. The heap should order the vertices by the smallest edge-weight that connects them to any vertex in the partially constructed minimum spanning tree (MST) (or infinity if no such edge exists). Every time a vertex v is chosen and added to the MST, a decrease-key operation is performed on all vertices w outside the partial MST such that v is connected to w, setting the key to the minimum of its previous value and the edge cost of (v,w).
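Python's heapq has no decrease-key operation, so a common substitute for the scheme above is the 'lazy' variant sketched below, which pushes duplicate entries and skips stale ones when popped; the bound is still O(E log V):

    import heapq

    def prim_mst_weight(adj, start=0):
        """Prim over an adjacency list {u: [(v, w), ...]}. heapq has no
        decrease-key, so stale entries are skipped when popped instead."""
        in_mst, total = set(), 0
        heap = [(0, start)]
        while heap and len(in_mst) < len(adj):
            w, u = heapq.heappop(heap)
            if u in in_mst:        # stale entry: u was already added
                continue
            in_mst.add(u)
            total += w
            for v, wv in adj[u]:
                if v not in in_mst:
                    heapq.heappush(heap, (wv, v))
        return total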
While it was pointed out that Prim's with a heap is O(E log V), which is O(n^2 log n) in the worst case, I can explain what makes the heap faster in cases other than that worst case, since that has still not been answered.
What makes Prim's so costly at O(V^2) is the updating necessary in each iteration of the algorithm. In general, Prim's works by keeping a table recording, for each vertex, the cheapest edge connecting it to the growing tree, and repeatedly picking the cheapest vertex to add until all are added. Every time you add a vertex, you must go back to your table and update any vertices that can now be reached with less weight. You then must walk all the way through the table again to decide which vertex is cheapest to add next. This setup - having to pick the next vertex (O(V)) V times - gives the O(V^2).
The heap is able to help this running time in all cases besides the worst case because it fixes this bottleneck. By working with a min-heap, you can access the minimum weight under consideration in O(1). Additionally, it costs O(log V) to restore the heap property after inserting or removing an element, which is done E times, for O(E log V) total heap maintenance in Prim's. This becomes the new bottleneck, which is what gives rise to the final running time of O(E log V).
So, depending on how much you know about your data, Prim's with a heap can certainly be more efficient than without!
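For contrast, here is a sketch of the table-based O(V^2) version described above, with the two O(V) passes per iteration (pick the cheapest vertex, then update the table) made explicit:

    import math

    def prim_mst_weight_matrix(adj_matrix):
        """Table-based Prim: O(V) to pick the cheapest fringe vertex plus
        O(V) to update the table, V times over, so O(V^2) in total.
        None marks an absent edge."""
        n = len(adj_matrix)
        best = [math.inf] * n      # cheapest known edge into the tree
        in_tree = [False] * n
        best[0] = 0
        total = 0
        for _ in range(n):
            u = min((v for v in range(n) if not in_tree[v]),
                    key=lambda v: best[v])     # O(V) scan for the minimum
            in_tree[u] = True
            total += best[u]
            for v in range(n):                 # O(V) update pass
                w = adj_matrix[u][v]
                if w is not None and not in_tree[v] and w < best[v]:
                    best[v] = w
        return total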
