Depth first search - isolated nodes

I am attempting a depth-first search on a dataset in which more than 90% of the nodes are isolated (see node 7 below).
Is there a way to remove these nodes up front to cut the DFS time, or is it necessary to plow through them by brute force?

Related

find total components in graph

I have a graph with N nodes and M edges. It is a single connected component.
I have to delete a single node from the graph; deleting that node might leave the graph in 1, 2 or more components. I need the number of resulting components for each possible deleted node.
Note that only a single node is deleted at any one time.
I need to do this for all the nodes of the graph in linear time.
Is this possible in linear time?
I am able to do this in O(n^2) by running a DFS for each node.
Read up on "articulation points" (cut vertices). They can be found with a single DFS in O(N + M) time, which should answer your question:
https://www.geeksforgeeks.org/articulation-points-or-cut-vertices-in-a-graph/
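As a rough illustration of how the articulation-point idea answers the counting question, here is a minimal sketch in Python. It assumes a simple connected undirected graph with nodes numbered 0..n-1; the function name components_after_removal and the edge-list input format are made up for this example. It uses the usual discovery-time / low-link DFS: for the DFS root the answer is its number of DFS-tree children, and for any other node it is one plus the number of children whose subtree cannot reach above that node.

```python
import sys

def components_after_removal(n, edges):
    """For a connected simple undirected graph on nodes 0..n-1, return a list
    whose entry v is the number of components left after deleting node v."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    disc = [-1] * n      # discovery time of each node, -1 if unvisited
    low = [0] * n        # lowest discovery time reachable from the subtree
    result = [1] * n
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0       # DFS-tree children of u
        cut_children = 0   # children whose subtree cannot reach above u
        for w in adj[u]:
            if disc[w] == -1:
                children += 1
                dfs(w, u)
                low[u] = min(low[u], low[w])
                if low[w] >= disc[u]:
                    cut_children += 1
            elif w != parent:
                # Back edge to an already visited node.
                low[u] = min(low[u], disc[w])
        if parent == -1:
            result[u] = children          # root: one component per child
        else:
            result[u] = cut_children + 1  # +1 for the part holding the parent

    # Recursive DFS for brevity; very deep graphs would need an explicit stack.
    sys.setrecursionlimit(max(10000, 2 * n + 100))
    dfs(0, -1)
    return result
```

The whole computation is one DFS, so it runs in O(N + M) under the assumptions above.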

What's the best pathfinding algorithm in complexity?

I need to implement a pathfinding algorithm in one of my programs. The goal is to know whether a path exists or not. As a consequence, knowing the path itself isn't important.
I have already done some research and I am not sure which algorithm to pick. This post suggests that a DFS or a BFS would be more suitable for this kind of program, but I'd rather have confirmation given my exact situation. I would also be interested in knowing the complexity of the algorithm, but I guess I can find this myself. It's fine if it's not shared.
Here's the graph I am using: let's say I have an x*y grid with zones the path can and cannot take.
I want to know whether there is a path that starts at the top of the grid and ends at the bottom. Here's an example with the path in red:
I believe DFS is the best in complexity, but I am not sure exactly how to implement it given the different start points the path can take. I am not sure whether it's better to launch the DFS from each of the possible start points, or to add an extra layer of zones the path can take so that a single search is enough.
Thank you for your help!
There are a number of different approaches that you can take here. Assuming that the grids you're working with are of roughly the size that you're showing above, and assuming you aren't, say, processing millions of grids at once, chances are that both breadth-first search and depth-first search would work equally well. The advantage of breadth-first search is that it will find the shortest path from anywhere in the top to anywhere in the bottom; the disadvantage is that it typically requires more memory than depth-first search. But again, if you're working with grids on the order of, say, hundreds or thousands of cells each, chances are that this memory overhead isn't going to be too much of a problem. I'd say to pick whichever algorithm you feel most comfortable working with and go with it.
As for how to implement a search from "anywhere in the top" to "anywhere in the bottom," you can achieve this in a few different ways.
If you're using a depth-first search, you can run one depth-first search from each of the cells in the top row and search for a path down to the bottom row. DFS requires you to maintain some information about which cells have and have not been visited. If you recycle this same information across all the calls to DFS, you'll ensure that no two calls do any duplicated work, and so the resulting solution should be very efficient, running in time O(mn) for an m × n grid.
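Here is a minimal Python sketch of that idea. The grid representation (a list of rows where a truthy value means the path may use the cell), the 4-neighbour adjacency, and the name path_exists_dfs are all assumptions made for the example, not something from the question.

```python
def path_exists_dfs(grid):
    """Return True if some open cell in the top row is connected to some open
    cell in the bottom row through horizontally/vertically adjacent open cells."""
    if not grid or not grid[0]:
        return False
    rows, cols = len(grid), len(grid[0])
    # One visited table shared by every DFS call, so no cell is expanded twice.
    visited = [[False] * cols for _ in range(rows)]

    def dfs(start_r, start_c):
        stack = [(start_r, start_c)]
        visited[start_r][start_c] = True
        while stack:
            r, c = stack.pop()
            if r == rows - 1:
                return True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] and not visited[nr][nc]):
                    visited[nr][nc] = True
                    stack.append((nr, nc))
        return False

    # Launch a DFS from every open, not yet visited cell of the top row.
    for c in range(cols):
        if grid[0][c] and not visited[0][c]:
            if dfs(0, c):
                return True
    return False
```

Because the visited table is never reset, the total work over all starts is O(mn). The explicit stack avoids hitting Python's recursion limit on large grids.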
If you're using a breadth-first search, the modification is pretty straightforward: instead of just enqueuing a single start point in the queue at the beginning of the search, enqueue every cell in the top row at the beginning of the search. The BFS will then naturally explore all possible paths starting anywhere in the top row.
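A sketch of the multi-source BFS variant, again in Python and under the same assumptions (truthy grid cells are passable, moves are to the four orthogonal neighbours, and path_exists_bfs is a name invented here):

```python
from collections import deque

def path_exists_bfs(grid):
    """Multi-source BFS: seed the queue with every open cell of the top row
    and report whether any cell of the bottom row is reached."""
    if not grid or not grid[0]:
        return False
    rows, cols = len(grid), len(grid[0])
    visited = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Enqueue all starting points at once instead of a single source.
    for c in range(cols):
        if grid[0][c]:
            visited[0][c] = True
            queue.append((0, c))
    while queue:
        r, c = queue.popleft()
        if r == rows - 1:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] and not visited[nr][nc]):
                visited[nr][nc] = True
                queue.append((nr, nc))
    return False
```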
Both of these ideas can be thought of in a different way. Imagine your grid is a graph where each cell is a node and edges correspond to pairs of adjacent cells. You can then add in a new node that sits above the top row of the grid and is connected to each of the nodes in the top row. You then add in a new node that sits just below the bottom row and is connected to each of the nodes in the bottom row. Now, if there's a path from the new top node to the new bottom node, it means that there's a path from some node in the top row to some node in the bottom row, so doing a single search in this graph will be sufficient to check if a path exists. (Fun fact: the two above modifications to DFS and BFS can each be thought of as implicitly doing a search in this new graph.)
There's another option you might want to consider that's fairly easy to implement and imperceptibly less efficient than DFS or BFS, and that's to use a disjoint-set forest data structure to determine what's connected. This data structure supports two kinds of queries:
Given two cells, mark that there's a way to get from the first cell to the second. ("Union")
Given two cells, determine whether there's a path between them, which can be a direct path or could be formed by chaining together multiple other paths. ("Find")
You could implement your connectivity query by building a disjoint-set forest, unioning together all pairs of adjacent cells, and then unioning together all nodes in the top row and unioning all nodes in the bottom row. Doing a "find" query to see if any one of the top nodes is connected to any of the bottom nodes will then solve your problem. This will take time O(mn α(mn)) for a function α(mn) that grows so slowly that it's essentially three or four, so it's effectively as efficient as BFS or DFS.
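For illustration, a disjoint-set version could look roughly like the Python sketch below. The grid format and the name path_exists_dsu are assumptions for the example; the forest uses union by size with path halving, which is one common way to get the near-constant amortized cost mentioned above.

```python
def path_exists_dsu(grid):
    """Union adjacent open cells, then the open top-row cells with each other
    and the open bottom-row cells with each other, and compare representatives."""
    if not grid or not grid[0]:
        return False
    rows, cols = len(grid), len(grid[0])
    parent = list(range(rows * cols))   # cell (r, c) has index r * cols + c
    size = [1] * (rows * cols)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra                    # attach the smaller tree
        size[ra] += size[rb]

    # Union every open cell with its open neighbour below and to the right.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            if r + 1 < rows and grid[r + 1][c]:
                union(r * cols + c, (r + 1) * cols + c)
            if c + 1 < cols and grid[r][c + 1]:
                union(r * cols + c, r * cols + c + 1)

    top = [c for c in range(cols) if grid[0][c]]
    bottom = [(rows - 1) * cols + c for c in range(cols) if grid[rows - 1][c]]
    if not top or not bottom:
        return False
    for cell in top[1:]:
        union(top[0], cell)
    for cell in bottom[1:]:
        union(bottom[0], cell)
    return find(top[0]) == find(bottom[0])
```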

Given a query containing two integers as nodes, find all the children of those two nodes in tree?

This is my interview question which has the following problem statement
You are given M queries (1 <= M <= 100000), where every query has 2 integers that are nodes of some tree. How would you output all the children (i.e. the subtree) of each of these 2 nodes?
My approach was naive: I ran a DFS from both integers (nodes) for every query, but the interviewer wanted a more optimized approach.
Put more simply, we have to print the subtrees of the nodes given in the queries. There could be many queries, so we can't run a DFS for every node in every query.
Any hints on how I can optimize this?
You could optimize an algorithm that performs DFS on both nodes if one of the nodes is a child of the other.
Suppose Node 2 is a child of Node 1. In this case, running the DFS on Node 1 already visits all of the children of Node 2, so running a DFS again on Node 2 is wasteful. You could avoid this by storing intermediate results (see dynamic programming, specifically the Fibonacci example, for how to avoid recalculating values in recursive calls).
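One way to sketch the "store intermediate results" idea in Python is below. The tree is assumed to be given as a dict mapping each node to a list of its children, the returned subtree includes the queried node itself, and make_subtree_lookup is a made-up name. Treat it as an illustration rather than a tuned solution: storing a full list per node can take quadratic memory on a deep tree, and the recursion would need an explicit stack for very deep trees.

```python
def make_subtree_lookup(children):
    """children maps node -> list of child nodes.
    Returns a function that yields the cached preorder list of a node's subtree."""
    cache = {}

    def subtree(node):
        if node not in cache:
            nodes = [node]
            for child in children.get(node, []):
                # Reuses the child's cached list if it was computed before,
                # and caches it for later queries otherwise.
                nodes.extend(subtree(child))
            cache[node] = nodes
        return cache[node]

    return subtree
```

After a query for a node, every node inside its subtree is cached too, so a later query for any descendant is a dictionary lookup.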
For a single query, DFS should be the optimal way. For a larger number of queries here are a few things in my mind that you could do:
Cache your results. When a number shows up frequently (say 100 times), save that printed subtree to memory and just return the result when the same number appears again.
When caching, also mark all the nodes contained in the cached subtree on your original tree. When a query contains such a node, refer to the cached subtree instead of the original tree since you have done DFS on these nodes as well.
As noted by @K. Dackow, if a query contains A and B and B is a child of A, you can directly reuse the DFS results for B when traversing the tree for A. If permitted, you can even look ahead at multiple queries (say 10) and see whether any of their nodes belong to the subtree you're currently traversing. You can set up a queue of queries and, when doing one DFS traversal, look at the top items in your queue to see whether you have met any of their nodes.
Hope this helps!

What will be the most optimized position of a node in a binary tree with given specifications?

Suppose I have a binary tree in which a node can have 0, 1 or 2 children. A cost value is associated with each node, and it can be one of {5, 10, 20, 40}. The optimal placement of a new node is under a node with the same or a lower cost value. For example, a new node with cost value 20 is best placed under a node with cost value 20, but can also be placed under nodes with cost values 5 and 10.
The primary requirement of this algorithm is to complete the left and right children of a node where required, i.e. if a node with cost value 10 has a left child with cost value 10, then a new node with cost value 10 will be made the right child of that node. The secondary requirement is to maximize the overall depth of the tree.
The tree cannot be rearranged at any point in time. If an incoming node has a lesser value, there is no penalty involved.
Given the above requirements, how can we decide the best position for an incoming new node in the tree? Can we write a general algorithm for it?
Initially, I thought to complete each level of the tree first, but I don't think it would be optimal.
"The secondary requirement is to maximize the overall depth of the tree."
That's a bit unusual.
The quickest way:
1. Sort your input values.
2. Place all the minimum-value nodes (the 5s) in accordance with the first requirement. (It is still unclear whether both the left and right children must be filled in before going down a level. If they must, the maximum depth will be log2(N5), where N5 is the number of 5-valued nodes. If "going deep on the left" is allowed without filling in the right, the maximum-depth tree degenerates into a list in which every right child is null.) Call this the master tree.
3. Make a tree from the next values (say, the 10-valued nodes) and attach this tree to the deepest branch of the master tree.
4. Repeat step 3 as necessary.
Note: this is the simplest concept; an implementation may take advantage of the fact that the master tree is sorted at all times and skip the initial sort.

Proper traversal of undirected graph using depth first search?

I've got an undirected graph that I need to traverse using depth first search.
The Excel chart below shows that each node has been marked after traversal in the marked column, and the edgeTo column shows which node brought us to that node. For example, we got to node 1 from node 5, we got to node 2 from node 7, etc.
My question is for node 6 and 8, since they are separated from the main graph, how do I properly traverse it? My guess is that I start at 6 and go to 8, but since 6 will already have been visited at that point, I do not go back to 6 from 8. Hence row 6 is left blank in the edgeTo column.
Am I correct? Is my chart correct?
A depth-first search from a given start node visits exactly the nodes reachable from that node, which is why it is often used to find a path between two nodes in a graph. The graph in your example is disconnected, i.e. it contains two nodes with no path between them.
Nodes 6 and 8 clearly belong to a different component, so a DFS started at node 0 never reaches them, and a path query from 0 to 8 would return IMPOSSIBLE or "No path found". Apart from that, your chart is correct.
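To make the marked / edgeTo bookkeeping concrete, here is a small Python sketch. The graph from the question isn't reproduced here, so the example uses a made-up 5-node graph (edges 0-1, 1-2 and 3-4), and dfs_edge_to is a hypothetical helper name. Passing a single start node mimics a search from node 0; passing every node as a candidate start restarts the DFS in each unvisited component, in which case the node a restart begins at keeps an empty edgeTo entry.

```python
def dfs_edge_to(adj, sources):
    """adj[v] lists the neighbours of v. Run a DFS from each node in `sources`
    that is still unmarked; return the (marked, edge_to) lists."""
    n = len(adj)
    marked = [False] * n
    edge_to = [None] * n   # edge_to[w] is the node from which w was first reached

    def visit(v):
        marked[v] = True
        for w in adj[v]:
            if not marked[w]:
                edge_to[w] = v
                visit(w)

    for s in sources:
        if not marked[s]:
            visit(s)
    return marked, edge_to

# Hypothetical graph: component {0, 1, 2} and a separate component {3, 4}.
adj = [[1], [0, 2], [1], [4], [3]]
from_zero = dfs_edge_to(adj, [0])            # nodes 3 and 4 stay unmarked
everything = dfs_edge_to(adj, range(len(adj)))  # restart covers 3 and 4
```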
