Kruskals Algorithm explained - algorithm

So I'm currently having a little play with some algorithms and have come across the Kruskals algorithm.
Understand the concept, understand how to do the actual process. But do not understand the algorithm.
Here is the algorithm:
From what I can figure out, |V| is all the vertices?
What is E'?
I have no idea why this algorithm is confusing me so much, I've picked up other ones with absolute ease

Kruskal's algorithm adds edges to the MST in order of weight, unless they would introduce a cycle (this detection is typically done using union-find). The code starts by initialising some values:
n := |V| // the number of vertices
E' := ∅ // the edges in our MST; it starts as the empty set
Cands = E // the edges still under consideration for adding to the MST, starts as all edges
The loop condition is:
while |E'| < n - 1 and Cands != ∅ do
That is, we continue as long as we have selected fewer than n - 1 edges (because we know this is the number of edges contained in any spanning tree: if we have found them, we're done) and the set of edges we haven't considered yet is not empty.
Lines (1) and (2) find the minimum weight edge in Cands, removing it from the set. A suitable structure for Cands would be a min-heap, in which case this is just a pop-operation.
Lines (3) and (4) determine whether the edge we retrieved from Cands in (1) would introduce a cycle in E' if added. If it doesn't, we know this edge is in the MST, otherwise it's not.
The last line just checks to see whether we actually found a tree. It's possible the loop terminates without finding n - 1 edges, for instance when the graph is not one connected component.

Related

Good algorithm for finding shortest path for specific vertices

I'm solving the problem described below and can't think of a better algorithm than trying every permutation of every vertex of every group with every.
I'm given a graph of vertices, along with a list of groups of specific vertices, the goal is to find the shortest path from a specific starting vertex to a specific ending vertex, and the path must pass through at least one vertex from each specified group of vertices.
There are also vertices in the graph that are not part of any given group.
Re-visiting vertices and edges is possible.
The graph data is specified as follows:
Vertex list - each vertex is identified by a sequence number (0 to the number of vertices -1 )
Edge list - list of vertex pairs (by vertex number)
Vertex group list - list of lists of vector numbers
A specific starting and ending vertex.
I would be grateful for any ideas for a better solution, thank you.
Summary:
We can use bitmasks to efficiently check which groups we have visited so far, and combine this with a traditional BFS/ Dijkstra's shortest-path algorithm.
If we assume E edges, V vertices, and K vertex-groups that have to be included, the below algorithm has a time complexity of O((V + E) * 2^K) and a space complexity of O(V * 2^K). The exponential 2^K term means it will only work for a relatively small K, say up to 10 or 20.
Details:
First, are the edges weighted?
If yes then a "shortest path" algorithm will usually be a variation of Dijkstra's algorithm, in which we keep a (min) priority queue of the shortest paths. We only visit a node once it's at the top of the queue, meaning that this must be the shortest path to this node. Any other shorter path to this node would already have been added to the priority queue and would come before the current iteration. (Note: this doesn't work for negative paths).
If no, meaning all edges have the same weight, then there is no need to maintain a priority queue with the shortest edges. We can instead just run a regular Breadth-first search (BFS), in which we maintain a deque with all nodes at the current depth. At each step we iterate over all nodes at the current depth (popping them from the left of the deque), and for each node we add all it's not-yet-visited neighbors to the right side of the deque, forming the next level.
The below algorithm works for both BFS and Dijkstra's, but for simplicity's sake for the rest of the answer I'll pretend that the edges have positive weights and we will use Dijkstra's. What is important to take away though is that for either algorithm we will only "visit" or "explore" a node for a path that must be the shortest path to that node. This property is essential for the algorithm to be efficient, since we know that we will at most visit each of the V nodes and E edges only one time, giving us a time complexity of O(V + E). If we use Dijkstra's we have to multiply this with log(V) for the priority queue usage (this also applies to the time complexity mentioned in the summary).
Our Problem
In our case we have the additional complexity that we have K vertex-groups, for each of which our shortest path has to contain at least one the nodes in it. This is a big problem, since it destroys our ability to simple go along with the "shortest current path".
See for example this simple graph. Notation: -- means an edge, start is that start node, and end is the end node. A vertex with value 0 does not have a vertex-group, and a vertex with value >= 1 belongs to the vertex-group of that index.
end -- 0 -- 2 -- start -- 1 -- 2
It is clear that the optimal path will first move right to the node in group 1, and then move left until the end. But this is impossible to do for the BFS and Dijkstra's algorithm we introduced above! After we move from the start to the right to capture the node in group 1, we would never ever move back left to the start, since we have already been there with a shorter path.
The Trick
In the above example, if the right-hand side would have looked like start -- 0 -- 0, where 0 means the vertex does not not belonging to a group, then it would be of no use to go there and back to the start.
The decisive reason of why it makes sense to go there and come back, although the path will get longer, is that it includes a group that we have not seen before.
How can we keep track of whether or not at a current position a group is included or not? The most efficient solution is a bit mask. So if we for example have already visited a node of group 2 and 4, then the bitmask would have a bit set at the position 2 and 4, and it would have the value of 2 ^ 2 + 2 ^ 4 == 4 + 16 == 20
In the regular Dijkstra's we would just keep a one-dimensional array of size V to keep track of what the shortest path to each vertex is, initialized to a very high MAX value. array[start] begins with value 0.
We can modify this method to instead have a two-dimensional array of dimensions [2 ^ K][V], where K is the number of groups. Every value is initialized to MAX, only array[mask_value_of_start][start] begins with 0.
The value we store at array[mask][node] means Given the already visited groups with bit-mask value of mask, what is the length of the shortest path to reach this node?
Suddenly, Dijkstra's resurrected
Once we have this structure, we can suddenly use Dijkstra's again (it's the same for BFS). We simply change the rules a bit:
In regular Dijkstra's we never re-visit a node
--> in our modification we differentiate by mask and never re-visit a node if it's already been visited for that particular mask.
In regular Dijkstra's, when exploring a node, we look at all neighbors and only add them to the priority queue if we managed to decrease the shortest path to them.
--> in our modification we look at all neighbors, and update the mask we use to check for this neighbor like: neighbor_mask = mask | (1 << neighbor_group_id). We only add a {neighbor_mask, neighbor} pair to the priority queue, if for that particular array[neighbor_mask][neighbor] we managed to decrease the minimal path length.
In regular Dijkstra's we only visit unexplored nodes with the current shortest path to it, guaranteeing it to be the shortest path to this node
--> In our modification we only visit nodes that for their respective mask values are not explored yet. We also only visit the current shortest path among all masks, meaning that for any given mask it must be the shortest path.
In regular Dijkstra's we can return once we visit the end node, since we are sure we got the shortest path to it.
--> In our modification we can return once we visit the end node for the full mask, meaning the mask containing all groups, since it must be the shortest path for the full mask. This is the answer to our problem.
If this is too slow...
That's it! Because time and space complexity are exponentially dependent on the number of groups K, this will only work for very small K (of course depending on the number of nodes and edges).
If this is too slow for your requirements then there might be a more sophisticated algorithm for this that someone smarter can come up with, it will probably involve dynamic programming.
It is very possible that this is still too slow, in which case you will probably want to switch to some heuristic, that sacrifices accuracy in order to gain more speed.

Algorithm for the Smallest Set of Vertices that will "Infect" the Entire Graph

My question is about infecting an entire graph with the smallest set of vertices that will be deemed as infected. The question goes something like this. For a vertex A in a (not necessarily simple) directed graph, A will become infected if for all in-edges of the form (A, B) (it's a directed graph so A will be pointing towards B) B is also infected. If we were to take a specific example:
In this case, if the vertices E, A were infected:
Iteration 1:
vertices F, D are infected because of the fact that the only vertex that points to them is E and E is infected.
Iteration 2:
The vertex B is infected as both vertices A and D are infected.
Iteration 3:
Finally the vertex C is infected as a result of the infection of vertex B from Iteration 2.
In this case, the infected set {E, A} that I chose was able to infect the entire graph. Obviously, this isn't always possible as in the case with the infected set of {B} (the vertex A doesn't end up infected as B doesn't point to it and thus there is no way of reaching it) or the infected set of {A} (the vertex B is not infected as it has a perfectly healthy parent in D).
I really want to find an algorithm that finds the smallest set of infected vertices that will end up infecting the entire graph after an arbitrary number of iterations. Does something like this already exist?
Just for clarification, for vertices that are a self-loop, it would necessarily have to be in the infected set as that's the only way that it can get infected.
btilly gave a response about how the problem is NP-hard. Could someone suggest a good approximation algorithm then? It doesn't need to be too too efficient. After all I only need to run it once albeit on a large graph. It has around 750,000 nodes and around 10 edges for each of them.
This problem is NP-hard.
The vertices which are part of this minimal infection set fall into two buckets. The first bucket is all nodes with no incoming edges. The second bucket is a minimal set of nodes that makes the graph acyclic. (If you left a cycle, then things in that cycle don't get infected. Conversely if you leave no cycles, then it is easy to prove by induction that the whole graph winds up infected.)
Very importantly, if you were able to solve this problem, then it would be easy to take the solution, remove from it the nodes which had no incoming edges, and then you're left with the minimal feedback vertex set. This would give an algorithm to solve an NP-hard problem for arbitrary graphs.
(Yes, yes, I know that the Wikipedia article says NP-complete everywhere. It is wrong. The question, "Is there a feedback vertex set of size k" is NP-complete. The question, "What is the smallest feedback vertex set" is NP-hard because, given a claimed minimal set, you don't have a polynomial algorithm to verify that no minimal set is shorter. That is, the decision problem is NP-complete and the optimization problem is NP-hard.)
The difficulty is that all source nodes for each node must be infected to infect a node. So, the reality is that every edge must carry infection. So -
- LOOP over the roots ( nodes with only out edges )
Add root to solution
Mark all edges reachable from root
If every edge marked
STOP
- LOOP over every node that has unmarked out edges
Add node to solution
Mark all edges reachable from node
If every edge marked
STOP
- LOOP over disconnected nodes
Add node to solution
STOP
Note that this does not always give the optimal solution. It gives a successful solution that infects every node. Brute force optimization: place algorithm in loop over enumeration of node orders and keep the solution if it is improved.

Yen's improvement to Bellman-Ford

I stumbled with famous Yen's optimization of Bellman-Ford algorithm that I got initially from Wikipedia, then I found the same improvement in several textbooks in Exercise section (for example, this is a problem 24-1 in Cormen and Web exercise N5 in Sedgewick's "Algorithms").
Here's the improvement:
Yen's second improvement first assigns some arbitrary linear order on all vertices and then partitions the set of all edges into two subsets. The first subset, Ef, contains all edges (vi, vj) such that i < j; the second, Eb, contains edges (vi, vj) such that i > j. Each vertex is visited in the order v1, v2, ..., v|V|, relaxing each outgoing edge from that vertex in Ef. Each vertex is then visited in the order v|V|, v|V|−1, ..., v1, relaxing each outgoing edge from that vertex in Eb. Each iteration of the main loop of the algorithm, after the first one, adds at least two edges to the set of edges whose relaxed distances match the correct shortest path distances: one from Ef and one from Eb. This modification reduces the worst-case number of iterations of the main loop of the algorithm from |V| − 1 to |V|/2.
Unfortunately, I didn't manage to find the proof of this bound |V|/2, and moreover, it seems that I found a counterexample. I'm sure that I'm wrong about that, but I can't see where exactly.
The counterexample is just a path graph with vertices labeled from 1 to n and initial vertex 1. (So, E={(i, i+1)} for i in range from 1 to n-1). In that case the set of forward vertices is equal E (E_f = E) and the set of backward vertices is just the empty set. Each iteration of algorithm adds only one correctly computed distance, so algorithm finishes in n-1 time, which contradicts the proposed bound n/2.
What am I doing wrong?
UPD: So the mistake was very stupid. I didn't consider the iteration through vertices, thinking about iterations as of immediate updates of path costs. I'm not deleting this topic because someone upvoted it, in case this improvement can be interesting for someone.
This is actually the best case which finishes in 2 iterations regardless of the number of vertices.
Draw the iterations on paper or write the code. The first iteration will find all the correct shortest paths. The second iteration will then not change anything and the algorithm terminates because the set of vertices whose distance was changed last iteration is empty.
The "forward" run through the vertices that relaxes the Ef set of edges will do all the work, while the "backwards" run won't do anything because Eb is an empty set.

How to detect if the given graph has a cycle containing all of its nodes? Does the suggested algorithm have any flaws?

I have a connected, non-directed, graph with N nodes and 2N-3 edges. You can consider the graph as it is built onto an existing initial graph, which has 3 nodes and 3 edges. Every node added onto the graph and has 2 connections with the existing nodes in the graph. When all nodes are added to the graph (N-3 nodes added in total), the final graph is constructed.
Originally I'm asked, what is the maximum number of nodes in this graph that can be visited exactly once (except for the initial node), i.e., what is the maximum number of nodes contained in the largest Hamiltonian path of the given graph? (Okay, saying largest Hamiltonian path is not a valid phrase, but considering the question's nature, I need to find a max. number of nodes that are visited once and the trip ends at the initial node. I thought it can be considered as a sub-graph which is Hamiltonian, and consists max. number of nodes, thus largest possible Hamiltonian path).
Since i'm not asked to find a path, I should check if a hamiltonian path exists for given number of nodes first. I know that planar graphs and cycle graphs (Cn) are hamiltonian graphs (I also know Ore's theorem for Hamiltonian graphs, but the graph I will be working on will not be a dense graph with a great probability, thus making Ore's theorem pretty much useless in my case). Therefore I need to find an algorithm for checking if the graph is cycle graph, i.e. does there exist a cycle which contains all the nodes of the given graph.
Since DFS is used for detecting cycles, I thought some minor manipulation to the DFS can help me detect what I am looking for, as in keeping track of explored nodes, and finally checking if the last node visited has a connection to the initial node. Unfortunately
I could not succeed with that approach.
Another approach I tried was excluding a node, and then try to reach to its adjacent node starting from its other adjacent node. That algorithm may not give correct results according to the chosen adjacent nodes.
I'm pretty much stuck here. Can you help me think of another algorithm to tell me if the graph is a cycle graph?
Edit
I realized by the help of the comment (thank you for it n.m.):
A cycle graph consists of a single cycle and has N edges and N vertices. If there exist a cycle which contains all the nodes of the given graph, that's a Hamiltonian cycle. – n.m.
that I am actually searching for a Hamiltonian path, which I did not intend to do so:)
On a second thought, I think checking the Hamiltonian property of the graph while building it will be more efficient, which is I'm also looking for: time efficiency.
After some thinking, I thought whatever the number of nodes will be, the graph seems to be Hamiltonian due to node addition criteria. The problem is I can't be sure and I can't prove it. Does adding nodes in that fashion, i.e. adding new nodes with 2 edges which connect the added node to the existing nodes, alter the Hamiltonian property of the graph? If it doesn't alter the Hamiltonian property, how so? If it does alter, again, how so? Thanks.
EDIT #2
I, again, realized that building the graph the way I described might alter the Hamiltonian property. Consider an input given as follows:
1 3
2 3
1 5
1 3
these input says that 4th node is connected to node 1 and node 3, 5th to node 2 and node 3 . . .
4th and 7th node are connected to the same nodes, thus lowering the maximum number of nodes that can be visited exactly once, by 1. If i detect these collisions (NOT including an input such as 3 3, which is an example that you suggested since the problem states that the newly added edges are connected to 2 other nodes) and lower the maximum number of nodes, starting from N, I believe I can get the right result.
See, I do not choose the connections, they are given to me and I have to find the max. number of nodes.
I think counting the same connections while building the graph and subtracting the number of same connections from N will give the right result? Can you confirm this or is there a flaw with this algorithm?
What we have in this problem is a connected, non-directed graph with N nodes and 2N-3 edges. Consider the graph given below,
A
/ \
B _ C
( )
D
The Graph does not have a Hamiltonian Cycle. But the Graph is constructed conforming to your rules of adding nodes. So searching for a Hamiltonian Cycle may not give you the solution. More over even if it is possible Hamiltonian Cycle detection is an NP-Complete problem with O(2N) complexity. So the approach may not be ideal.
What I suggest is to use a modified version of Floyd's Cycle Finding algorithm (Also called the Tortoise and Hare Algorithm).
The modified algorithm is,
1. Initialize a List CYC_LIST to ∅.
2. Add the root node to the list CYC_LIST and set it as unvisited.
3. Call the function Floyd() twice with the unvisited node in the list CYC_LIST for each of the two edges. Mark the node as visited.
4. Add all the previously unvisited vertices traversed by the Tortoise pointer to the list CYC_LIST.
5. Repeat steps 3 and 4 until no more unvisited nodes remains in the list.
6. If the list CYC_LIST contains N nodes, then the Graph contains a Cycle involving all the nodes.
The algorithm calls Floyd's Cycle Finding Algorithm a maximum of 2N times. Floyd's Cycle Finding algorithm takes a linear time ( O(N) ). So the complexity of the modied algorithm is O(N2) which is much better than the exponential time taken by the Hamiltonian Cycle based approach.
One possible problem with this approach is that it will detect closed paths along with cycles unless stricter checking criteria are implemented.
Reply to Edit #2
Consider the Graph given below,
A------------\
/ \ \
B _ C \
|\ /| \
| D | F
\ / /
\ / /
E------------/
According to your algorithm this graph does not have a cycle containing all the nodes.
But there is a cycle in this graph containing all the nodes.
A-B-D-C-E-F-A
So I think there is some flaw with your approach. But suppose if your algorithm is correct, it is far better than my approach. Since mine takes O(n2) time, where as yours takes just O(n).
To add some clarification to this thread: finding a Hamiltonian Cycle is NP-complete, which implies that finding a longest cycle is also NP-complete because if we can find a longest cycle in any graph, we can find the Hamiltonian cycle of the subgraph induced by the vertices that lie on that cycle. (See also for example this paper regarding the longest cycle problem)
We can't use Dirac's criterion here: Dirac only tells us minimum degree >= n/2 -> Hamiltonian Cycle, that is the implication in the opposite direction of what we would need. The other way around is definitely wrong: take a cycle over n vertices, every vertex in it has exactly degree 2, no matter the size of the circle, but it has (is) an HC. What you can tell from Dirac is that no Hamiltonian Cycle -> minimum degree < n/2, which is of no use here since we don't know whether our graph has an HC or not, so we can't use the implication (nevertheless every graph we construct according to what OP described will have a vertex of degree 2, namely the last vertex added to the graph, so for arbitrary n, we have minimum degree 2).
The problem is that you can construct both graphs of arbitrary size that have an HC and graphs of arbitrary size that do not have an HC. For the first part: if the original triangle is A,B,C and the vertices added are numbered 1 to k, then connect the 1st added vertex to A and C and the k+1-th vertex to A and the k-th vertex for all k >= 1. The cycle is A,B,C,1,2,...,k,A. For the second part, connect both vertices 1 and 2 to A and B; that graph does not have an HC.
What is also important to note is that the property of having an HC can change from one vertex to the other during construction. You can both create and destroy the HC property when you add a vertex, so you would have to check for it every time you add a vertex. A simple example: take the graph after the 1st vertex was added, and add a second vertex along with edges to the same two vertices of the triangle that the 1st vertex was connected to. This constructs from a graph with an HC a graph without an HC. The other way around: add now a 3rd vertex and connect it to 1 and 2; this builds from a graph without an HC a graph with an HC.
Storing the last known HC during construction doesn't really help you because it may change completely. You could have an HC after the 20th vertex was added, then not have one for k in [21,2000], and have one again for the 2001st vertex added. Most likely the HC you had on 23 vertices will not help you a lot.
If you want to figure out how to solve this problem efficiently, you'll have to find criteria that work for all your graphs that can be checked for efficiently. Otherwise, your problem doesn't appear to me to be simpler than the Hamiltonian Cycle problem is in the general case, so you might be able to adjust one of the algorithms used for that problem to your variant of it.
Below I have added three extra nodes (3,4,5) in the original graph and it does seem like I can keep adding new nodes indefinitely while keeping the property of Hamiltonian cycle. For the below graph the cycle would be 0-1-3-5-4-2-0
1---3---5
/ \ / \ /
0---2---4
As there were no extra restrictions about how you can add a new node with two edges, I think by construction you can have a graph that holds the property of hamiltonian cycle.

Path from s to e in a weighted DAG graph with limitations

Consider a directed graph with n nodes and m edges. Each edge is weighted. There is a start node s and an end node e. We want to find the path from s to e that has the maximum number of nodes such that:
the total distance is less than some constant d
starting from s, each node in the path is closer than the previous one to the node e. (as in, when you traverse the path you are getting closer to your destination e. in terms of the edge weight of the remaining path.)
We can assume there are no cycles in the graph. There are no negative weights. Does an efficient algorithm already exist for this problem? Is there a name for this problem?
Whatever you end up doing, do a BFS/DFS starting from s first to see if e can even be reached; this only takes you O(n+m) so it won't add to the complexity of the problem (since you need to look at all vertices and edges anyway). Also, delete all edges with weight 0 before you do anything else since those never fulfill your second criterion.
EDIT: I figured out an algorithm; it's polynomial, depending on the size of your graphs it may still not be sufficiently efficient though. See the edit further down.
Now for some complexity. The first thing to think about here is an upper bound on how many paths we can actually have, so depending on the choice of d and the weights of the edges, we also have an upper bound on the complexity of any potential algorithm.
How many edges can there be in a DAG? The answer is n(n-1)/2, which is a tight bound: take n vertices, order them from 1 to n; for two vertices i and j, add an edge i->j to the graph iff i<j. This sums to a total of n(n-1)/2, since this way, for every pair of vertices, there is exactly one directed edge between them, meaning we have as many edges in the graph as we would have in a complete undirected graph over n vertices.
How many paths can there be from one vertex to another in the graph described above? The answer is 2n-2. Proof by induction:
Take the graph over 2 vertices as described above; there is 1 = 20 = 22-2 path from vertex 1 to vertex 2: (1->2).
Induction step: assuming there are 2n-2 paths from the vertex with number 1 of an n vertex graph as described above to the vertex with number n, increment the number of each vertex and add a new vertex 1 along with the required n edges. It has its own edge to the vertex now labeled n+1. Additionally, it has 2i-2 paths to that vertex for every i in [2;n] (it has all the paths the other vertices have to the vertex n+1 collectively, each "prefixed" with the edge 1->i). This gives us 1 + Σnk=2 (2k-2) = 1 + Σn-2k=0 (2k-2) = 1 + (2n-1 - 1) = 2n-1 = 2(n+1)-2.
So we see that there are DAGs that have 2n-2 distinct paths between some pairs of their vertices; this is a bit of a bleak outlook, since depending on weights and your choice of d, you may have to consider them all. This in itself doesn't mean we can't choose some form of optimum (which is what you're looking for) efficiently though.
EDIT: Ok so here goes what I would do:
Delete all edges with weight 0 (and smaller, but you ruled that out), since they can never fulfill your second criterion.
Do a topological sort of the graph; in the following, let's only consider the part of the topological sorting of the graph from s to e, let's call that the integer interval [s;e]. Delete everything from the graph that isn't strictly in that interval, meaning all vertices outside of it along with the incident edges. During the topSort, you'll also be able to see whether there is a
path from s to e, so you'll know whether there are any paths s-...->e. Complexity of this part is O(n+m).
Now the actual algorithm:
traverse the vertices of [s;e] in the order imposed by the topological
sorting
for every vertex v, store a two-dimensional array of information; let's call it
prev[][] since it's gonna store information about the predecessors
of a node on the paths leading towards it
in prev[i][j], store how long the total path of length (counted in
vertices) i is as a sum of the edge weights, if j is the predecessor of the
current vertex on that path. For example, pres+1[1][s] would have
the weight of the edge s->s+1 in it, while all other entries in pres+1
would be 0/undefined.
when calculating the array for a new vertex v, all we have to do is check
its incoming edges and iterate over the arrays for the start vertices of those
edges. For example, let's say vertex v has an incoming edge from vertex w,
having weight c. Consider what the entry prev[i][w] should be.
We have an edge w->v, so we need to set prev[i][w] in v to
min(prew[i-1][k] for all k, but ignore entries with 0) + c (notice the subscript of the array!); we effectively take the cost of a
path of length i - 1 that leads to w, and add the cost of the edge w->v.
Why the minimum? The vertex w can have many predecessors for paths of length
i - 1; however, we want to stay below a cost limit, which greedy minimization
at each vertex will do for us. We will need to do this for all i in [1;s-v].
While calculating the array for a vertex, do not set entries that would give you
a path with cost above d; since all edges have positive weights, we can only get
more costly paths with each edge, so just ignore those.
Once you reached e and finished calculating pree, you're done with this
part of the algorithm.
Iterate over pree, starting with pree[e-s]; since we have no cycles, all
paths are simple paths and therefore the longest path from s to e can have e-s edges. Find the largest
i such that pree[i] has a non-zero (meaning it is defined) entry; if non exists, there is no path fitting your criteria. You can reconstruct
any existing path using the arrays of the other vertices.
Now that gives you a space complexity of O(n^3) and a time complexity of O(n²m) - the arrays have O(n²) entries, we have to iterate over O(m) arrays, one array for each edge - but I think it's very obvious where the wasteful use of data structures here can be optimized using hashing structures and other things than arrays. Or you could just use a one-dimensional array and only store the current minimum instead of recomputing it every time (you'll have to encapsulate the sum of edge weights of the path together with the predecessor vertex though since you need to know the predecessor to reconstruct the path), which would change the size of the arrays from n² to n since you now only need one entry per number-of-nodes-on-path-to-vertex, bringing down the space complexity of the algorithm to O(n²) and the time complexity to O(nm). You can also try and do some form of topological sort that gets rid of the vertices from which you can't reach e, because those can be safely ignored as well.

Resources