Dijkstra's pathfinding algorithm

I'm learning about pathfinding from a book, but I have spotted a condition that looks strange to me.
For completeness, I will include most of the code, but feel free to skip to the second part (Search()).
template <class graph_type >
class Graph_SearchDijkstra
{
private:
    //create typedefs for the node and edge types used by the graph
    typedef typename graph_type::EdgeType Edge;
    typedef typename graph_type::NodeType Node;
private:
    const graph_type & m_Graph;
    //this vector contains the edges that comprise the shortest path tree -
    //a directed sub-tree of the graph that encapsulates the best paths from
    //every node on the SPT to the source node.
    std::vector<const Edge*> m_ShortestPathTree;
    //this is indexed into by node index and holds the total cost of the best
    //path found so far to the given node. For example, m_CostToThisNode[5]
    //will hold the total cost of all the edges that comprise the best path
    //to node 5 found so far in the search (if node 5 is present and has
    //been visited of course).
    std::vector<double> m_CostToThisNode;
    //this is an indexed (by node) vector of "parent" edges leading to nodes
    //connected to the SPT but that have not been added to the SPT yet.
    std::vector<const Edge*> m_SearchFrontier;
    int m_iSource;
    int m_iTarget;
    void Search();
public:
    Graph_SearchDijkstra(const graph_type& graph,
                         int source,
                         int target = -1):m_Graph(graph),
                                          m_ShortestPathTree(graph.NumNodes()),
                                          m_SearchFrontier(graph.NumNodes()),
                                          m_CostToThisNode(graph.NumNodes()),
                                          m_iSource(source),
                                          m_iTarget(target)
    {
        Search();
    }
    //returns the vector of edges defining the SPT. If a target is given
    //in the constructor, then this will be the SPT comprising all the nodes
    //examined before the target is found, else it will contain all the nodes
    //in the graph.
    std::vector<const Edge*> GetAllPaths()const;
    //returns a vector of node indexes comprising the shortest path
    //from the source to the target. It calculates the path by working
    //backward through the SPT from the target node.
    std::list<int> GetPathToTarget()const;
    //returns the total cost to the target
    double GetCostToTarget()const;
};
Search():
template <class graph_type>
void Graph_SearchDijkstra<graph_type>::Search()
{
    //create an indexed priority queue that sorts smallest to largest
    //(front to back). Note that the maximum number of elements the iPQ
    //may contain is NumNodes(). This is because no node can be represented
    //on the queue more than once.
    IndexedPriorityQLow<double> pq(m_CostToThisNode, m_Graph.NumNodes());
    //put the source node on the queue
    pq.insert(m_iSource);
    //while the queue is not empty
    while(!pq.empty())
    {
        //get the lowest cost node from the queue. Don't forget, the return value
        //is a *node index*, not the node itself. This node is the node not already
        //on the SPT that is the closest to the source node
        int NextClosestNode = pq.Pop();
        //move this edge from the search frontier to the shortest path tree
        m_ShortestPathTree[NextClosestNode] = m_SearchFrontier[NextClosestNode];
        //if the target has been found exit
        if (NextClosestNode == m_iTarget) return;
        //now to relax the edges. For each edge connected to the next closest node
        //(typename is required because ConstEdgeIterator is a dependent type)
        typename graph_type::ConstEdgeIterator ConstEdgeItr(m_Graph, NextClosestNode);
        for (const Edge* pE=ConstEdgeItr.begin();
             !ConstEdgeItr.end();
             pE=ConstEdgeItr.next())
        {
            //the total cost to the node this edge points to is the cost to the
            //current node plus the cost of the edge connecting them.
            double NewCost = m_CostToThisNode[NextClosestNode] + pE->Cost();
            //if this edge has never been on the frontier make a note of the cost
            //to reach the node it points to, then add the edge to the frontier
            //and the destination node to the PQ.
            if (m_SearchFrontier[pE->To()] == 0)
            {
                m_CostToThisNode[pE->To()] = NewCost;
                pq.insert(pE->To());
                m_SearchFrontier[pE->To()] = pE;
            }
            //else test to see if the cost to reach the destination node via the
            //current node is cheaper than the cheapest cost found so far. If
            //this path is cheaper we assign the new cost to the destination
            //node, update its entry in the PQ to reflect the change, and add the
            //edge to the frontier
            else if ( (NewCost < m_CostToThisNode[pE->To()]) &&
                      (m_ShortestPathTree[pE->To()] == 0) )
            {
                m_CostToThisNode[pE->To()] = NewCost;
                //because the cost is less than it was previously, the PQ must be
                //resorted to account for this.
                pq.ChangePriority(pE->To());
                m_SearchFrontier[pE->To()] = pE;
            }
        }
    }
}
What I don't get is this part:
    //else test to see if the cost to reach the destination node via the
    //current node is cheaper than the cheapest cost found so far. If
    //this path is cheaper we assign the new cost to the destination
    //node, update its entry in the PQ to reflect the change, and add the
    //edge to the frontier
    else if ( (NewCost < m_CostToThisNode[pE->To()]) &&
              (m_ShortestPathTree[pE->To()] == 0) )
If the new cost is lower than the cost already found, then why do we also test whether the node has not already been added to the SPT? This seems to defeat the purpose of the check.
FYI, in m_ShortestPathTree[pE->To()] == 0 the container is a vector that holds a pointer to an edge (or NULL) for each index (the index represents a node).

Imagine the following graph:
S --5-- A --2-- F
 \     /
 -3  -4
   \ /
    B
And you want to go from S to F. First, note that Dijkstra's algorithm assumes there are no negative-weight loops in the graph (the standard formulation goes further and requires every edge weight to be non-negative). In my example, such a loop is S -> B -> A -> S or, simpler yet, S -> B -> S.
If you have such a loop, you can go around it indefinitely, and your cost to F gets lower and lower. That is why Dijkstra's algorithm cannot accept it.
Now, how do you check for that? Exactly as the code you posted does. Every time you want to update the weight of a node, besides checking whether the new weight is smaller, you check that the node is not in the accepted list. If it is, then you must have a negative-weight loop. Otherwise, how could you go forward and reach an already accepted node with a smaller weight?
Let's follow the algorithm on the example graph (nodes in [] are accepted):
Without the if in question:
Starting Weights: S(0), A(inf), B(inf), F(inf)
- Accept S
New weights: [S(0)], A(5), B(-3), F(inf)
- Accept B
New weights: [S(-6)], A(-7), [B(-3)], F(inf)
- Accept A
New weights: [S(-6)], [A(-7)], [B(-11)], F(-5)
- Accept B again
New weights: [S(-14)], [A(-15)], [B(-11)], F(-5)
- Accept A again
... infinite loop
With the if in question:
Starting Weights: S(0), A(inf), B(inf), F(inf)
- Accept S
New weights: [S(0)], A(5), B(-3), F(inf)
- Accept B (doesn't change S)
New weights: [S(0)], A(-7), [B(-3)], F(inf)
- Accept A (doesn't change S or B)
New weights: [S(0)], [A(-7)], [B(-3)], F(-5)
- Accept F
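The "with the if" trace above can be reproduced with a minimal Python sketch. It is not the book's implementation: it uses a plain `heapq` with stale-entry skipping instead of an indexed priority queue, and the names `dijkstra`, `settled` and `example` are made up for illustration. The two-part test (cheaper and not yet accepted) is the same as in Search().

```python
import heapq

def dijkstra(graph, source):
    """Return the best cost found from source to each reachable node.

    graph maps each node to a list of (neighbor, edge_cost) pairs.
    """
    cost = {source: 0}
    settled = set()          # nodes already accepted onto the shortest path tree
    queue = [(0, source)]    # entries are (cost so far, node)
    while queue:
        node_cost, node = heapq.heappop(queue)
        if node in settled:
            continue         # stale entry left over from an earlier, worse cost
        settled.add(node)
        for neighbor, edge_cost in graph[node]:
            new_cost = node_cost + edge_cost
            # Same two-part test as in Search(): cheaper AND not yet accepted.
            if neighbor not in settled and new_cost < cost.get(neighbor, float("inf")):
                cost[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return cost

# The example graph from this answer (undirected, so each edge is listed twice).
example = {
    "S": [("A", 5), ("B", -3)],
    "A": [("S", 5), ("F", 2), ("B", -4)],
    "B": [("S", -3), ("A", -4)],
    "F": [("A", 2)],
}
result = dijkstra(example, "S")
print(result)
```

Because accepted nodes are never updated again, the loop terminates even on this graph with negative edges. The costs it reports match the trace, but with negative weights they are not guaranteed to be true shortest-path costs.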

Related

What is this algorithm called? (SSSP)

Observation: For each node, we can reuse its min path to the destination, so we don't have to recalculate it (dp). Also, the moment we discover a cycle, we check whether it's negative. If it's not, it will not affect our final answer, and we can treat it as not connected to the destination (whether it is or not).
Pseudo code:
Given source node u and dest node v
Initialize integer dp array that stores the min distance to the dest node, relative to the source node. dp[v] = 0, everything else infinite
Initialize boolean onPath array that stores whether the current node is on the path we are considering.
Initialize boolean visited array that tracks whether the current node has been done (initially all false)
Initialize int tentative array that stores the tentative value of a node. (tentative[u] = 0)
return function(u).
int function(int node){
    onPath[node] = true;
    for each connection u of node{
        if(onPath[u]){ //we've found a cycle
            if(cost to u + tentative[node] > tentative[u]) //report negative cycle
            continue; //safely ignore
        }
        if(visited[u]){
            dp[node] = min(dp[node], dp[u]); //dp already calculated
        }else{
            tentative[u] = tentative[node] + cost to u
            dp[node] = min(function(u), dp[node])
        }
    }
    visited[node] = true;
    onPath[node] = false;
    return dp[node];
}
I'm aware this algorithm won't cover the case where the destination is part of a negative cycle, but besides that, is there anything wrong with the algorithm?
If not, what is it called?
You can't "safely ignore" a positive sum cycle because it might be hiding a shorter path. For example, suppose we have a graph with arcs u->x (10), u->y (1), x->y (10), y->x (1), x->v (1), y->v (10). The shortest u-v path is u->y->x->v, of length 3.
In a bad execution, the first three calls look like
function(u)
function(x)
function(y)
The out-neighbors of y are v, yielding a y->v path of length 10; and x, but the cycle logic suppresses consideration of this arc, so y is marked as visited with distance 10 (not 2). As a result we miss the shortest path.
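To double-check the counterexample, here is a small Python sketch that runs an ordinary Dijkstra search on the six arcs listed above. The helper name `shortest_path` is hypothetical, and the sketch assumes non-negative arc lengths and a reachable target.

```python
import heapq

def shortest_path(arcs, source, target):
    """Return (length, path) of a shortest source-target path.

    arcs is a list of (tail, head, length) triples with non-negative lengths.
    Assumes the target is reachable from the source.
    """
    graph = {}
    for tail, head, length in arcs:
        graph.setdefault(tail, []).append((head, length))
        graph.setdefault(head, [])
    best = {source: 0}
    parent = {source: None}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best[node]:
            continue  # stale queue entry
        for head, length in graph[node]:
            if dist + length < best.get(head, float("inf")):
                best[head] = dist + length
                parent[head] = node
                heapq.heappush(queue, (dist + length, head))
    # Walk the parent links back from the target to recover the path.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]

arcs = [("u", "x", 10), ("u", "y", 1), ("x", "y", 10),
        ("y", "x", 1), ("x", "v", 1), ("y", "v", 10)]
length, path = shortest_path(arcs, "u", "v")
print(length, path)
```

The path it finds goes through the arc y->x, which is exactly the arc the cycle logic in the question would skip.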

Traversal directed graph with cycles

I wrote a script to construct a directed graph using networkx in Python, and I want to get all possible paths from start to end, including paths with cycles.
For example, there is a directed graph:
I want to get these paths:
A->B->D
A->B->C->D
A->B->C->B->D
A->B->C->B->C->B->D
...
As far as I know, there are many algorithms to find shortest paths or paths without cycles between two nodes, but I want to find paths with cycles.
Is there any algorithm to achieve this?
Thanks a lot.
As noted, there is an infinite number of such paths.
However, you can still generate all of them in a lazy way by maintaining all nodes v (and path you used to reach v) you can reach from the start node in k steps for k=1,2,...; if v is your target node, remember it.
When you have to return the next path, (i) pop the first target node off list, and (ii) generate the next candidates for all non-target nodes on the list. If there is no target node on the list, repeat (ii) until you find one.
The method works assuming the path always exists. If you don't find a path in n-1 steps, where n is the number of nodes, simply report that no path exists.
Here's the pseudo code for an algorithm that generates paths from shortest to longest assuming unit weights:
class Node {
    Vertex vertex   // the graph vertex this search node stands for
    int steps
    Node prev
    Node(Vertex vertex, int steps=0, Node prev=null) {
        this.vertex = vertex
        this.steps = steps
        this.prev = prev
    }
}
class PathGenerator {
    Queue<Node> nodes
    Vertex target
    PathGenerator(Vertex start, Vertex target) {
        this.target = target
        nodes = new Queue<>()
        nodes.add(new Node(start)) // steps=0 and prev=null
    }
    Node nextPath(int n) {
        current_length = -1;
        do {
            node = nodes.poll()
            current_length = node.steps
            // expand to all others you can reach from node
            for each u in node.vertex.neighbors()
                nodes.add(new Node(u, node.steps+1, node))
            // if node is the target, return the path
            if (node.vertex == target)
                return node
        } while (current_length < n);
        throw new Exception("no path of length <=n exists");
    }
}
Beware that the list nodes can grow exponentially in the worst case (think of what happens in case you run it on a complete graph).
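The same idea can be sketched in Python as a lazy generator. This is only an illustration: the function name `paths_with_cycles` is made up, and the edge list is an assumption reconstructed from the paths listed in the question (the original picture of the graph is not available).

```python
from collections import deque
from itertools import islice

def paths_with_cycles(graph, start, target):
    """Lazily yield every walk from start to target, shortest first.

    graph maps a node to a list of its successors. The generator is infinite
    whenever a cycle lies on some start-target walk, so callers should bound it.
    """
    queue = deque([(start,)])
    while queue:
        path = queue.popleft()
        # Extend the walk by one step in every possible direction.
        for neighbor in graph.get(path[-1], ()):
            queue.append(path + (neighbor,))
        if path[-1] == target:
            yield list(path)

# Assumed edges: A->B, B->D, B->C, C->D, C->B.
graph = {"A": ["B"], "B": ["D", "C"], "C": ["D", "B"]}
first_four = list(islice(paths_with_cycles(graph, "A", "D"), 4))
for p in first_four:
    print("->".join(p))
```

Using `islice` keeps the caller in control of how many paths are produced. If the target is unreachable but a cycle is reachable, the generator never yields and never stops, so a length limit like the `n` in the pseudo code is still needed in that case.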

Computing depth of each node in a "maximally packed" DAG

(Note: I thought about asking this on https://cstheory.stackexchange.com/, but decided my question is not theoretical enough -- it's about an algorithm. If there is a better Stack Exchange community for this post, I'm happy to listen!)
I'm using the terminology "starting node" to mean a node with no links into it, and "terminal node" to mean a node with no links out of it. So the following graph has starting nodes A and B and terminal nodes F and G:
I want to draw it with the following rules:
at least one starting node has a depth of 0.
links always point from top to bottom
nodes are packed vertically as closely as possible
Using those rules, the depth of each node is shown for the graph above. Can someone suggest an algorithm to compute the depth of each node that runs in less than O(n^2) time?
update:
I tweaked the graph to show that the DAG may contain starting and terminal nodes at different depths. (This was a case that I didn't consider in my original buggy answer.) I also switched terminology from "x coordinate" to "depth" in order to emphasize that this is about "graphing" and not "graphics".
The x coordinate (depth) of a node is the length of the longest path to it from any node without incoming edges. For a DAG it can be calculated in O(N + E) time, where E is the number of edges:
given DAG G:
calculate incoming_degree[v] for every v in G
initialize queue q={v with incoming_degree[v]==0}, x[v]=0 for every v in q
while(q not empty):
    v=q.pop() #retrieve and delete first element
    for(w in neighbors of v):
        incoming_degree[w]--
        if(incoming_degree[w]==0): #no further way to w exists, evaluate
            q.offer(w)
            x[w]=x[v]+1
x stores the desired information.
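A runnable Python sketch of this longest-path labelling follows. The function name and the sample DAG are hypothetical. It takes an explicit `max` over all predecessors instead of relying on queue order, and it places every starting node at depth 0, so it may not satisfy the tighter packing rule from the updated question.

```python
from collections import deque

def longest_path_depths(nodes, edges):
    """Return {node: length of the longest path from any starting node}.

    nodes is an iterable of node names, edges a list of (source, target)
    pairs forming a DAG.
    """
    successors = {v: [] for v in nodes}
    in_degree = {v: 0 for v in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        in_degree[dst] += 1
    # Starting nodes (no incoming links) sit at depth 0.
    depth = {v: 0 for v in nodes if in_degree[v] == 0}
    queue = deque(depth)
    while queue:
        v = queue.popleft()
        for w in successors[v]:
            depth[w] = max(depth.get(w, 0), depth[v] + 1)
            in_degree[w] -= 1
            if in_degree[w] == 0:  # every predecessor of w has been processed
                queue.append(w)
    return depth

# Hypothetical DAG with starting nodes A and B.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("A", "D"), ("D", "E"), ("B", "E")]
depths = longest_path_depths(nodes, edges)
print(depths)
```

Each node and each edge is handled once, which is where the O(N + E) bound comes from.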
Here's one solution which is essentially a two-pass depth-first tree walk. The first pass (traverseA) traces the DAG from the starting nodes (A and B in the O.P.'s example) until encountering terminal nodes (F and G in the example). It then marks them with the maximum depth as traced through the graph.
The second pass (traverseB) starts at the terminal nodes and traces back towards the starting nodes, marking each node along the way with either the node's current value or the previous node's value minus one, whichever is smaller (or with the latter if the node hasn't been visited yet):
function labelDAG() {
    nodes.forEach(function(node) { node.depth = -1; }); // initialize
    // find and mark terminal nodes
    startingNodes().forEach(function(node) { traverseA(node, 0); });
    // walk backwards from the terminal nodes
    terminalNodes().forEach(function(node) { traverseB(node); });
    dumpGraph();
};
function traverseA(node, depth) {
    var targets = targetsOf(node);
    if (targets.length === 0) {
        // we're at a leaf (terminal) node -- set depth
        node.depth = Math.max(node.depth, depth);
    } else {
        // traverse each subtree with depth = depth+1
        targets.forEach(function(target) {
            traverseA(target, depth+1);
        });
    };
};
// walk backwards from a terminal node, setting each source node's depth value
// along the way.
function traverseB(node) {
    sourcesOf(node).forEach(function(source) {
        if ((source.depth === -1) || (source.depth > node.depth - 1)) {
            // source has not yet been visited, or we found a longer path
            // between terminal node and source node.
            source.depth = node.depth - 1;
        }
        traverseB(source);
    });
};

Finding all the shortest paths between two nodes in unweighted undirected graph

I need help finding all the shortest paths between two nodes in an unweighted undirected graph.
I am able to find one of the shortest paths using BFS, but so far I am lost as to how I could find and print out all of them.
Any idea of the algorithm / pseudocode I could use?
As a caveat, remember that there can be exponentially many shortest paths between two nodes in a graph. Any algorithm for this will potentially take exponential time.
That said, there are a few relatively straightforward algorithms that can find all the paths. Here's two.
BFS + Reverse DFS
When running a breadth-first search over a graph, you can tag each node with its distance from the start node. The start node is at distance 0, and then, whenever a new node is discovered for the first time, its distance is one plus the distance of the node that discovered it. So begin by running a BFS over the graph, writing down the distances to each node.
Once you have this, you can find a shortest path from the source to the destination as follows. Start at the destination, which will be at some distance d from the start node. Now, look at all nodes with edges entering the destination node. A shortest path from the source to the destination must end by following an edge from a node at distance d-1 to the destination at distance d. So, starting at the destination node, walk backwards across some edge to any node you'd like at distance d-1. From there, walk to a node at distance d-2, a node at distance d-3, etc. until you're back at the start node at distance 0.
This procedure will give you one path back in reverse order, and you can flip it at the end to get the overall path.
You can then find all the paths from the source to the destination by running a depth-first search from the end node back to the start node, at each point trying all possible ways to walk backwards from the current node to a previous node whose distance is exactly one less than the current node's distance.
(I personally think this is the easiest and cleanest way to find all possible paths, but that's just my opinion.)
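Here is a minimal Python sketch of the BFS plus reverse DFS approach described above. The function name `all_shortest_paths` and the sample graph are illustrative assumptions. The graph is given as an adjacency dict in which each undirected edge appears in both directions.

```python
from collections import deque

def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    # Pass 1: BFS from the source, recording each node's distance.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    if target not in dist:
        return []

    # Pass 2: DFS backwards from the target, only ever stepping to a
    # neighbor whose distance is exactly one less than the current node's.
    paths = []

    def walk_back(node, suffix):
        if node == source:
            paths.append([source] + suffix)
            return
        for neighbor in graph[node]:
            if dist.get(neighbor) == dist[node] - 1:
                walk_back(neighbor, [node] + suffix)

    walk_back(target, [])
    return paths

# Hypothetical graph: a diamond 1-2-4 / 1-3-4 with a tail 4-5.
graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
found = all_shortest_paths(graph, 1, 5)
print(found)
```

Building each path from the back means no final reversal is needed.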
BFS With Multiple Parents
This next algorithm is a modification to BFS that you can use as a preprocessing step to speed up generation of all possible paths. Remember that as BFS runs, it proceeds outwards in "layers," getting a single shortest path to all nodes at distance 0, then distance 1, then distance 2, etc. The motivating idea behind BFS is that any node at distance k + 1 from the start node must be connected by an edge to some node at distance k from the start node. BFS discovers this node at distance k + 1 by finding some path of length k to a node at distance k, then extending it by some edge.
If your goal is to find all shortest paths, then you can modify BFS by extending every path to a node at distance k to all the nodes at distance k + 1 that they connect to, rather than picking a single edge. To do this, modify BFS in the following way: whenever you process an edge by adding its endpoint in the processing queue, don't immediately mark that node as being done. Instead, insert that node into the queue annotated with which edge you followed to get to it. This will potentially let you insert the same node into the queue multiple times if there are multiple nodes that link to it. When you remove a node from the queue, then you mark it as being done and never insert it into the queue again. Similarly, rather than storing a single parent pointer, you'll store multiple parent pointers, one for each node that linked into that node.
If you do this modified BFS, you will end up with a DAG where every node will either be the start node and have no outgoing edges, or will be at distance k + 1 from the start node and will have a pointer to each node of distance k that it is connected to. From there, you can reconstruct all shortest paths from some node to the start node by listing of all possible paths from your node of choice back to the start node within the DAG. This can be done recursively:
There is only one path from the start node to itself, namely the empty path.
For any other node, the paths can be found by following each outgoing edge, then recursively extending those paths to yield a path back to the start node.
This approach takes more time and space than the one listed above because many of the paths found this way will not be moving in the direction of the destination node. However, it only requires a modification to BFS, rather than a BFS followed by a reverse search.
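A possible Python sketch of the multiple-parents idea follows. It is a variant, not a literal transcription: instead of inserting a node into the queue several times, it stores each node's distance and adds an extra parent only when that parent is exactly one level closer to the start. The function names and the sample graph are assumptions for illustration.

```python
from collections import deque

def shortest_path_parents(graph, source):
    """BFS that records, for each reachable node, all parents one level closer."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                # First discovery: fix the distance and the first parent.
                dist[neighbor] = dist[node] + 1
                parents[neighbor] = [node]
                queue.append(neighbor)
            elif dist[neighbor] == dist[node] + 1:
                # Another equally short way in: keep this parent as well.
                parents[neighbor].append(node)
    return parents

def paths_to(parents, node):
    """List all shortest paths from the start node to node (must be reachable)."""
    if not parents[node]:
        return [[node]]  # the start node: only the trivial path
    return [path + [node]
            for parent in parents[node]
            for path in paths_to(parents, parent)]

# Hypothetical diamond graph: 1-2-4 and 1-3-4.
graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}
parents = shortest_path_parents(graph, 1)
diamond_paths = paths_to(parents, 4)
print(diamond_paths)
```

The distance comparison is what keeps same-level or deeper neighbors from being recorded as parents.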
Hope this helps!
@templatetypedef is correct, but he forgot to mention the distance check that must be done before any parent link is added to a node. This means that we keep the distance from the source in each node and increment it by one for children. We must skip this increment and the parent addition if the child was already visited and has a lower distance.
public void addParent(Node n) {
    // forbid the parent if its level is equal to ours
    if (n.level == level) {
        return;
    }
    parents.add(n);
    level = n.level + 1;
}
The full Java implementation can be found at the following link.
http://ideone.com/UluCBb
I encountered a similar problem while solving this: https://oj.leetcode.com/problems/word-ladder-ii/
The way I tried to deal with it is to first find the shortest distance using BFS; let's say the shortest distance is d. Then apply DFS, and in the recursive DFS calls don't go beyond recursion level d.
However, this might end up exploring all paths, as mentioned by @templatetypedef.
First, find the distance-to-start of all nodes using breadth-first search.
(if there are a lot of nodes, you can use A* and stop when top of the queue has distance-to-start > distance-to-start(end-node). This will give you all nodes that belong to some shortest path)
Then just backtrack from the end-node. Anytime a node is connected to two (or more) nodes with a lower distance-to-start, you branch off into two (or more) paths.
templatetypedef, your answer was very good, thank you a lot for that one(!!), but it missed one point:
If you have a graph like this:
A-B-C-E-F
  |     |
  D------
Now let's imagine I want this path:
A -> E.
It will expand like this:
A -> B -> D -> C -> F -> E.
The problem there is that you will have F as a parent of E, but
A->B->D->F->E is longer than
A->B->C->E. You will have to take care of tracking the distances of the parents you are so happily adding.
Step 1: Traverse the graph from the source by BFS and assign each node the minimal distance from the source
Step 2: The distance assigned to the target node is the shortest length
Step 3: From source, do a DFS search along all paths where the minimal distance is increased one by one until the target node is reached or the shortest length is reached. Print the path whenever the target node is reached.
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that:
Every adjacent pair of words differs by a single letter.
Every si for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList.
sk == endWord
Given two words, beginWord and endWord, and a dictionary wordList, return all the shortest transformation sequences from beginWord to endWord, or an empty list if no such sequence exists. Each sequence should be returned as a list of the words [beginWord, s1, s2, ..., sk].
Example 1:
Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]
Output: [["hit","hot","dot","dog","cog"],["hit","hot","lot","log","cog"]]
Explanation: There are 2 shortest transformation sequences:
"hit" -> "hot" -> "dot" -> "dog" -> "cog"
"hit" -> "hot" -> "lot" -> "log" -> "cog"
Example 2:
Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]
Output: []
Explanation: The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.
https://leetcode.com/problems/word-ladder-ii
class Solution {
    public List<List<String>> findLadders(String beginWord, String endWord, List<String> wordList) {
        List<List<String>> result = new ArrayList<>();
        if (wordList == null) {
            return result;
        }
        Set<String> dicts = new HashSet<>(wordList);
        if (!dicts.contains(endWord)) {
            return result;
        }
        Set<String> start = new HashSet<>();
        Set<String> end = new HashSet<>();
        Map<String, List<String>> map = new HashMap<>();
        start.add(beginWord);
        end.add(endWord);
        bfs(map, start, end, dicts, false);
        List<String> subList = new ArrayList<>();
        subList.add(beginWord);
        dfs(map, result, subList, beginWord, endWord);
        return result;
    }

    private void bfs(Map<String, List<String>> map, Set<String> start, Set<String> end, Set<String> dicts, boolean reverse) {
        // Processed all the word in start
        if (start.size() == 0) {
            return;
        }
        dicts.removeAll(start);
        Set<String> tmp = new HashSet<>();
        boolean finish = false;
        for (String str : start) {
            char[] chars = str.toCharArray();
            for (int i = 0; i < chars.length; i++) {
                char old = chars[i];
                for (char n = 'a'; n <= 'z'; n++) {
                    if (old == n) {
                        continue;
                    }
                    chars[i] = n;
                    String candidate = new String(chars);
                    if (!dicts.contains(candidate)) {
                        continue;
                    }
                    if (end.contains(candidate)) {
                        finish = true;
                    } else {
                        tmp.add(candidate);
                    }
                    String key = reverse ? candidate : str;
                    String value = reverse ? str : candidate;
                    if (!map.containsKey(key)) {
                        map.put(key, new ArrayList<>());
                    }
                    map.get(key).add(value);
                }
                // restore after processing
                chars[i] = old;
            }
        }
        if (!finish) {
            // Switch the start and end if size from start is bigger;
            if (tmp.size() > end.size()) {
                bfs(map, end, tmp, dicts, !reverse);
            } else {
                bfs(map, tmp, end, dicts, reverse);
            }
        }
    }

    private void dfs(Map<String, List<String>> map,
                     List<List<String>> result, List<String> subList,
                     String beginWord, String endWord) {
        if (beginWord.equals(endWord)) {
            result.add(new ArrayList<>(subList));
            return;
        }
        if (!map.containsKey(beginWord)) {
            return;
        }
        for (String word : map.get(beginWord)) {
            subList.add(word);
            dfs(map, result, subList, word, endWord);
            subList.remove(subList.size() - 1);
        }
    }
}

A fast way to find connected component in a 1-NN graph?

First of all, I have an N*N distance matrix. For each point I calculated its nearest neighbor, which gives an N*2 matrix that looks like this:
0 -> 1
1 -> 2
2 -> 3
3 -> 2
4 -> 2
5 -> 6
6 -> 7
7 -> 6
8 -> 6
9 -> 8
The second column is the nearest neighbor's index. So this is a special kind of directed graph in which every vertex has an out-degree of exactly one.
Of course, we could first transform the N*2 matrix into a standard graph representation and perform BFS/DFS to get the connected components.
But, given the characteristics of this special graph, is there a faster way to do the job?
I would really appreciate any help.
Update:
I've implemented a simple algorithm for this case here.
Note that I did not use a union-find algorithm, because the data structure may make things less simple, and I doubt whether it's the fastest way in my case (I mean in practice).
You could argue that the _merge process could be time-consuming, but if we swap the edges into contiguous positions while assigning the new label, the merging may cost little, although it needs another N spaces to trace the original indices.
The fastest algorithm for finding connected components given an edge list is the union-find algorithm: each node holds a pointer to another node in the same set, with all pointers eventually converging on a single root node. Whenever you find a path of length at least 2, reconnect the bottom node upwards.
This will definitely run in linear time:
- push all edges into a union-find structure: O(n)
- store each node in its set (the union-find root)
and update the set of non-empty sets: O(n)
- return the set of non-empty sets (graph components).
Since the list of edges already almost forms a union-find tree, it is possible to skip the first step:
for each node
- if the node is not marked as collected
-- walk along the edges until you find an order-1 or order-2 loop,
collecting nodes en-route
-- reconnect all nodes to the end of the path and consider it a root for the set.
-- store all nodes in the set for the root.
-- update the set of non-empty sets.
-- mark all nodes as collected.
return the set of non-empty sets
The second algorithm is linear as well, but only a benchmark will tell if it's actually faster. The strength of the union-find algorithm is its optimization. This delays the optimization to the second step but removes the first step completely.
You can probably squeeze out a little more performance if you join the union step with the nearest neighbor calculation, then collect the sets in the second pass.
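As a rough illustration of the first variant (push every edge into a union-find structure, then collect the sets), here is a Python sketch run on the ten-point example from the question. The function name `connected_components` is made up, and the union-by-size and path-halving details are one common choice rather than something prescribed above.

```python
def connected_components(nearest):
    """Group points into components, where nearest[i] is i's nearest neighbor."""
    parent = list(range(len(nearest)))
    size = [1] * len(nearest)

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 1: push every edge i -> nearest[i] into the structure.
    for node, neighbor in enumerate(nearest):
        a, b = find(node), find(neighbor)
        if a != b:
            if size[a] < size[b]:
                a, b = b, a      # attach the smaller tree under the larger one
            parent[b] = a
            size[a] += size[b]

    # Step 2: store each node in the set of its root.
    groups = {}
    for node in range(len(nearest)):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())

# Second column of the N*2 matrix from the question.
nearest = [1, 2, 3, 2, 2, 6, 7, 6, 6, 8]
components = connected_components(nearest)
print(components)
```

Whether this beats a plain pointer walk on real data is something only a benchmark can settle, as noted above.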
If you want to do it sequentially, you can use weighted quick-union with path compression. The complexity is O(N + M lg* N), where lg* is the iterated logarithm.
Here is the pseudocode, honoring @pycho's words. Note that this version compresses paths in root() but does not weight the union:
public class QuickUnion
{
    private int[] id;
    public QuickUnion(int N)
    {
        id = new int[N];
        for (int i = 0; i < N; i++) id[i] = i;
    }
    public int root(int i)
    {
        while (i != id[i])
        {
            id[i] = id[id[i]];
            i = id[i];
        }
        return i;
    }
    public boolean find(int p, int q)
    {
        return root(p) == root(q);
    }
    public void unite(int p, int q)
    {
        int i = root(p);
        int j = root(q);
        id[i] = j;
    }
}
Reference: https://www.cs.princeton.edu/~rs/AlgsDS07/01UnionFind.pdf
If you want to find connected components in parallel, the asymptotic complexity can be reduced to O(log(log(N))) time using pointer jumping and weighted quick union with path compression. Check this link:
https://vishwasshanbhog.wordpress.com/2016/05/04/efficient-parallel-algorithm-to-find-the-connected-components-of-the-graphs/
Since each node has only one outgoing edge, you can just traverse the graph one edge at a time until you get to a vertex you've already visited. An out-degree of 1 means any further traversal at this point will only take you where you've already been. The traversed vertices in that path are all in the same component.
In your example:
0->1->2->3->2, so [0,1,2,3] is a component
4->2, so update the component to [0,1,2,3,4]
5->6->7->6, so [5,6,7] is a component
8->6, so update the component to [5,6,7,8]
9->8, so update the component to [5,6,7,8,9]
You can visit each node exactly once, so time is O(n). Space is O(n) since all you need is a component id for each node, and a list of component ids.
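The walk described above can be sketched in Python as follows. The function name `components_by_walking` and the temporary marker `-1` are illustrative choices. The input is the second column of the N*2 matrix from the question.

```python
def components_by_walking(nearest):
    """Return a component id for each node, where nearest[i] is i's only out-edge."""
    label = [None] * len(nearest)
    next_label = 0
    for start in range(len(nearest)):
        if label[start] is not None:
            continue
        # Follow out-edges until we reach a node that has been seen before.
        path = []
        node = start
        while label[node] is None:
            label[node] = -1          # temporary mark: on the current walk
            path.append(node)
            node = nearest[node]
        if label[node] == -1:
            # The walk closed a loop on itself: this is a new component.
            component = next_label
            next_label += 1
        else:
            # The walk ran into an older component: join it.
            component = label[node]
        for visited in path:
            label[visited] = component
    return label

nearest = [1, 2, 3, 2, 2, 6, 7, 6, 6, 8]
labels = components_by_walking(nearest)
print(labels)
```

Every node is appended to a walk exactly once, so the running time is O(n), and the only extra storage is the label array plus the current walk.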
