I have a tree as input to a breadth-first search, and I want to know which level the algorithm is at as it progresses.
# Breadth First Search Implementation
graph = {
'A':['B','C','D'],
'B':['A'],
'C':['A','E','F'],
'D':['A','G','H'],
'E':['C'],
'F':['C'],
'G':['D'],
'H':['D']
}
def breadth_first_search(graph, source):
    """
    This function is the implementation of the breadth_first_search program
    """
    # Mark each node as not visited
    mark = {}
    for item in graph.keys():
        mark[item] = 0
    queue, output = [], []
    # Initialize the queue with the source node and mark it as explored
    queue.append(source)
    mark[source] = 1
    output.append(source)
    # while queue is not empty
    while queue:
        # remove the first element of the queue and call it vertex
        vertex = queue[0]
        queue.pop(0)
        # for each edge from the vertex do the following
        for vrtx in graph[vertex]:
            # If the vertex is unexplored
            if mark[vrtx] == 0:
                queue.append(vrtx)   # append it to the queue
                mark[vrtx] = 1       # and mark it as explored
                output.append(vrtx)  # fill the output vector
    return output

print(breadth_first_search(graph, 'A'))
It takes a tree as the input graph. What I want is for it to print, at each iteration, the level that is currently being processed.
Actually, we don't need an extra queue to store the info on the current depth, nor do we need to add null to tell whether it's the end of current level. We just need to know how many nodes the current level contains, then we can deal with all the nodes in the same level, and increase the level by 1 after we are done processing all the nodes on the current level.
int level = 0;
Queue<Node> queue = new LinkedList<>();
queue.add(root);
while (!queue.isEmpty()) {
    int level_size = queue.size();
    while (level_size-- != 0) {
        Node temp = queue.poll();
        if (temp.right != null) queue.add(temp.right);
        if (temp.left != null) queue.add(temp.left);
    }
    level++;
}
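Since the question uses Python and a graph stored as a dict, here is a minimal sketch of the same level-size idea in that setting. The function name bfs_levels and the returned list-of-levels shape are my own choices for illustration, not part of the answer above.

```python
from collections import deque

def bfs_levels(graph, source):
    """Return a list of levels; levels[i] holds the nodes at depth i."""
    visited = {source}
    queue = deque([source])
    levels = []
    while queue:
        # Everything in the queue right now belongs to the same level
        level_size = len(queue)
        current = []
        for _ in range(level_size):
            vertex = queue.popleft()
            current.append(vertex)
            for neighbour in graph[vertex]:
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append(neighbour)
        levels.append(current)
    return levels

# The graph from the question
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A'],
    'C': ['A', 'E', 'F'],
    'D': ['A', 'G', 'H'],
    'E': ['C'],
    'F': ['C'],
    'G': ['D'],
    'H': ['D'],
}

for depth, nodes in enumerate(bfs_levels(graph, 'A')):
    print(depth, nodes)
```

The index into the returned list is the level, so no extra per-node bookkeeping is needed.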
You don't need to use an extra queue or do any complicated calculation to achieve what you want. The idea is very simple.
This does not use any extra space other than the queue used for BFS.
The idea I am going to use is to add null at the end of each level. So the number of nulls you have encountered, plus 1, is the depth you are at. (Of course, after termination it is just level.)
int level = 0;
Queue<Node> queue = new LinkedList<>();
queue.add(root);
queue.add(null);
while (!queue.isEmpty()) {
    Node temp = queue.poll();
    if (temp == null) {
        level++;
        queue.add(null);
        if (queue.peek() == null) break; // Two consecutive nulls mean you have visited all the nodes.
        else continue;
    }
    if (temp.right != null)
        queue.add(temp.right);
    if (temp.left != null)
        queue.add(temp.left);
}
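For readers following along in Python, here is a rough sketch of the same end-of-level marker trick on a dict-based graph. It assumes None is never a real node, and the name count_levels_with_sentinel is made up for this example.

```python
from collections import deque

def count_levels_with_sentinel(graph, source):
    """Count BFS levels using None as an end-of-level marker."""
    visited = {source}
    queue = deque([source, None])
    level = 0
    while queue:
        vertex = queue.popleft()
        if vertex is None:
            level += 1              # a whole level has been consumed
            if not queue:           # nothing behind the marker: all nodes visited
                break
            queue.append(None)      # close off the level that is now queued
            continue
        for neighbour in graph[vertex]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return level

graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A'],
    'C': ['A', 'E', 'F'],
    'D': ['A', 'G', 'H'],
    'E': ['C'],
    'F': ['C'],
    'G': ['D'],
    'H': ['D'],
}
print(count_levels_with_sentinel(graph, 'A'))  # 3
```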
Maintain a queue storing the depth of the corresponding node in BFS queue. Sample code for your information:
queue bfsQueue, depthQueue;
bfsQueue.push(firstNode);
depthQueue.push(0);
while (!bfsQueue.empty()) {
    f = bfsQueue.front();
    depth = depthQueue.front();
    bfsQueue.pop(), depthQueue.pop();
    for (every node adjacent to f) {
        bfsQueue.push(node), depthQueue.push(depth+1);
    }
}
This method is simple and naive; for O(1) extra space you may need the answer posted by @stolen_leaves.
Try having a look at this post. It keeps track of the depth using the variable currentDepth
https://stackoverflow.com/a/16923440/3114945
For your implementation, keep track of the left most node and a variable for the depth. Whenever the left most node is popped from the queue, you know you hit a new level and you increment the depth.
So, your root is the leftMostNode at level 0. Then the left most child is the leftMostNode. As soon as you hit it, it becomes level 1. The left most child of this node is the next leftMostNode and so on.
With this Python code you can maintain the depth of each node from the root by increasing the depth only after you encounter a node of new depth in the queue.
from collections import deque

queue = deque()
marked = set()
marked.add(root)
queue.append((root, 0))
depth = 0
while queue:
    r, d = queue.popleft()
    if d > depth:  # increase depth only when you encounter the first node in the next depth
        depth += 1
    for node in edges[r]:
        if node not in marked:
            marked.add(node)
            queue.append((node, depth + 1))
If your tree is perfectly balanced (i.e. every level is completely filled, so each internal node has the same number of children), there's actually a simple, elegant solution that finds the depth of each visited node in O(1) time with O(1) extra space. The main use case where I find this helpful is traversing a binary tree, though it's adaptable to other branching factors.
The key thing to realize here is that each level of such a binary tree contains exactly twice as many nodes as the previous level. This allows us to calculate the total number of nodes in the tree given the tree's depth. For instance, consider a binary tree with three full levels: a root, its two children, and their four children.
This tree has a depth of 3 and 7 total nodes. We don't need to count the nodes to figure this out, though. We can compute this in O(1) time with the formula 2^d - 1 = N, where d is the depth and N is the total number of nodes. (In a tree where each node has K children the same geometric sum gives (K^d - 1)/(K - 1) = N, so in a ternary tree this is (3^d - 1)/2 = N.) So in this case, 2^3 - 1 = 7.
To keep track of depth while conducting a breadth first search, we simply need to reverse this calculation. Whereas the above formula allows us to solve for N given d, we actually want to solve for d given N. For instance, say we're evaluating the 5th node. To figure out what depth the 5th node is on, we take the equation 2^d - 1 = 5 and simply solve for d, which is basic algebra: d = log2(5 + 1) = log2(6) ≈ 2.58.
If d turns out to be anything other than a whole number, just round up (the last node in a row always gives a whole number). With that all in mind, I propose the following algorithm to identify the depth of any given node in a binary tree during breadth first traversal:
Let the variable visited equal 0.
Each time a node is visited, increment visited by 1.
Each time visited is incremented, calculate the node's depth as depth = round_up(log2(visited + 1))
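As a quick sanity check of the three steps above, here is a small Python sketch. The function name depth_of_kth_visited is hypothetical, and it only applies to a binary tree whose levels are all full. It also shows that int.bit_length() gives the same depth with integer arithmetic only, which sidesteps any floating-point rounding in the logarithm.

```python
import math

def depth_of_kth_visited(k):
    """1-based depth of the k-th node (k >= 1) in BFS order of a full binary tree."""
    return math.ceil(math.log2(k + 1))

# The first 7 visited nodes fall on depths 1, 2, 2, 3, 3, 3, 3
print([depth_of_kth_visited(k) for k in range(1, 8)])

# k.bit_length() is an integer-only way to get the same value
for k in range(1, 16):
    assert depth_of_kth_visited(k) == k.bit_length()
```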
You can also use a hash table to map each node to its depth level, though this does increase the space complexity to O(n). Here's a PHP implementation of this algorithm:
<?php
$tree = [
    ['A', [1,2]],
    ['B', [3,4]],
    ['C', [5,6]],
    ['D', [7,8]],
    ['E', [9,10]],
    ['F', [11,12]],
    ['G', [13,14]],
    ['H', []],
    ['I', []],
    ['J', []],
    ['K', []],
    ['L', []],
    ['M', []],
    ['N', []],
    ['O', []],
];
function bfs($tree) {
    $queue = new SplQueue();
    $queue->enqueue($tree[0]);
    $visited = 0;
    $depth = 0;
    $result = [];
    while ($queue->count()) {
        $visited++;
        $node = $queue->dequeue();
        $depth = ceil(log($visited+1, 2));
        $result[$depth][] = $node[0];
        if (!empty($node[1])) {
            foreach ($node[1] as $child) {
                $queue->enqueue($tree[$child]);
            }
        }
    }
    print_r($result);
}
bfs($tree);
Which prints:
Array
(
    [1] => Array
        (
            [0] => A
        )
    [2] => Array
        (
            [0] => B
            [1] => C
        )
    [3] => Array
        (
            [0] => D
            [1] => E
            [2] => F
            [3] => G
        )
    [4] => Array
        (
            [0] => H
            [1] => I
            [2] => J
            [3] => K
            [4] => L
            [5] => M
            [6] => N
            [7] => O
        )
)
Set a variable cnt and initialize it to the size of the queue: cnt = queue.size(). Now decrement cnt each time you do a pop. When cnt gets to 0, increase the depth of your BFS and then set cnt = queue.size() again.
In Java it would be something like this.
The idea is to look at the parent to decide the depth.
// Maintain depth for every node based on its parent's depth
Map<Character, Integer> depthMap = new HashMap<>();
Queue<Character> queue = new LinkedList<>();
queue.add('A');
depthMap.put('A', 0); // this is where you start your search
while (!queue.isEmpty())
{
    Character parent = queue.remove();
    List<Character> children = adjList.get(parent);
    for (Character child : children)
    {
        // a node that already has a depth has been visited
        if (!depthMap.containsKey(child)) {
            depthMap.put(child, depthMap.get(parent) + 1); // parent's depth + 1
            queue.add(child); // explore the child's own neighbours later
        }
    }
}
Use a dictionary to keep track of the level (distance from start) of each node when exploring the graph.
Example in Python:
from collections import deque

def bfs(graph, start):
    queue = deque([start])
    levels = {start: 0}
    while queue:
        vertex = queue.popleft()
        for neighbour in graph[vertex]:
            if neighbour in levels:
                continue
            queue.append(neighbour)
            levels[neighbour] = levels[vertex] + 1
    return levels
I wrote some simple, easy-to-read code in Python.
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Solution:
    # level-order (breadth first) traversal that prints each level
    def bfs(self, root):
        assert root is not None
        queue = [root]
        level = 0
        while queue:
            print(level, [n.val for n in queue if n is not None])
            mark = len(queue)
            for i in range(mark):
                n = queue[i]
                if n.left is not None:
                    queue.append(n.left)
                if n.right is not None:
                    queue.append(n.right)
            # drop the level just processed; only the next level remains
            queue = queue[mark:]
            level += 1
Usage:
# [3,9,20,null,null,15,7]
n3 = TreeNode(3)
n9 = TreeNode(9)
n20 = TreeNode(20)
n15 = TreeNode(15)
n7 = TreeNode(7)
n3.left = n9
n3.right = n20
n20.left = n15
n20.right = n7
Solution().bfs(n3)
Result
0 [3]
1 [9, 20]
2 [15, 7]
I don't see this method posted so far, so here's a simple one:
You can "attach" the level to the node. For example, in the case of a tree, instead of the typical queue<TreeNode*>, use a queue<pair<TreeNode*,int>> and then push pairs of {node,level} into it. The root would be pushed in as q.push({root,0}), its children as q.push({root->left,1}), q.push({root->right,1}) and so on...
We don't need to modify the input, append nulls or even (asymptotically speaking) use any extra space just to track the levels.
Related
How do we determine the breadth of a binary tree?
A simple binary tree:
    O
   / \
  O   O
   \
    O
     \
      O
       \
        O
Breadth of the above tree is 4.
You could use a recursive function that returns two values for a given node: the extent of the subtree at that node towards the left (a negative number or zero), and the extent to the right (zero or positive). So for the example tree given in the question it would return -1 and 2.
Finding these extents is easy when you know the extents of the left child and of the right child. And that is where the recursion kicks in, which in fact represents a depth-first traversal.
Here is how that function would look in Python:
def extents(tree):
    if not tree:
        # If a tree with just one node has extents 0 and 0, then "nothing" should
        # have a negative extent to the right and a positive on the left,
        # representing a negative breadth
        return 1, -1
    leftleft, leftright = extents(tree.left)
    rightleft, rightright = extents(tree.right)
    return min(leftleft-1, rightleft+1), max(leftright-1, rightright+1)
The breadth is simply the difference between the two extents returned by the above function, plus 1 (to account for the root node):
def breadth(tree):
    leftextent, rightextent = extents(tree)
    return rightextent-leftextent+1
The complete Python code with the example tree, having 6 nodes, as input:
from collections import namedtuple

Node = namedtuple('Node', ['left', 'right'])

def extents(tree):
    if not tree:
        return 1, -1
    leftleft, leftright = extents(tree.left)
    rightleft, rightright = extents(tree.right)
    return min(leftleft-1, rightleft+1), max(leftright-1, rightright+1)

def breadth(tree):
    left, right = extents(tree)
    return right-left+1

# example tree as given in question
tree = Node(
    Node(
        None,
        Node(None, Node(None, Node(None, None)))
    ),
    Node(None, None)
)
print(breadth(tree))  # outputs 4
The input is:
An int[][], where each sub-array contains 2 ints as {parent, child}, meaning there is a path from parent -> child.
e.g.
{ { 1, 3 }, { 2, 3 }, { 3, 6 }, { 5, 6 }, { 5, 7 }, { 4, 5 }, { 4, 8 }, { 8, 9 } };
Or as a tree structure:
1   2   4
 \ /   / \
  3   5   8
   \ / \   \
    6   7   9
The task is:
Given 2 values (x, y), return a boolean value indicating whether they have any common parent(s).
Sample input and output:
[3, 8] => false
[5, 8] => true
[6, 8] => true
My idea:
Represent the input data as a DAG, stored in a Map such as Map<Integer, LinkedList<Integer>>, where the key is a vertex and the value is its adjacency list. The direction in the graph is reversed (compared to the input data) to child -> parent, so that it is easy to search for parents.
Use a function findAvailableParents(Integer vertex) to find all parents (direct and indirect) for a single vertex, and return Set<Integer>.
Thus, I only need to call findAvailableParents() once for each input vertex, then check whether the 2 returned Sets intersect. If they do, the vertices have a common parent; otherwise, they don't.
My questions are:
The time complexity of the solution above is between O(1) and O(E), right? (E is the number of edges in the graph.)
Is there a better solution?
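For reference, the idea described in the question can be sketched in a few lines of Python. This is only an illustration of the approach as stated, not the asker's actual code: the helper names build_parent_map, find_available_parents and has_common_parent are assumptions, and a vertex is not counted as its own parent.

```python
from collections import defaultdict

def build_parent_map(pairs):
    """Reverse the {parent, child} pairs into child -> [parents]."""
    parents = defaultdict(list)
    for parent, child in pairs:
        parents[child].append(parent)
    return parents

def find_available_parents(parents, vertex):
    """All direct and indirect parents of vertex (vertex itself excluded)."""
    seen = set()
    stack = list(parents.get(vertex, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen

def has_common_parent(pairs, x, y):
    parents = build_parent_map(pairs)
    ancestors_x = find_available_parents(parents, x)
    ancestors_y = find_available_parents(parents, y)
    return not ancestors_x.isdisjoint(ancestors_y)

pairs = [(1, 3), (2, 3), (3, 6), (5, 6), (5, 7), (4, 5), (4, 8), (8, 9)]
print(has_common_parent(pairs, 3, 8))  # False
print(has_common_parent(pairs, 5, 8))  # True
print(has_common_parent(pairs, 6, 8))  # True
```

Each ancestor search touches every reversed edge at most once, which matches the O(E) upper bound mentioned in the question.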
A modified BFS might help you to solve the problem
Algorithm: checkCommonParent
from collections import deque

def checkCommonParent(G, v1, v2):
    # G is assumed to map each vertex to the list of its children
    # (edges point parent -> child, as in the input)
    # Try every vertex as the candidate common ancestor
    for v in G:
        # Create a queue for level-order traversal
        q1 = deque([v])
        # Reset the visited set for every start vertex; sharing it across
        # starts would hide descendants already reached from another root
        visited = {v}
        # If both vertices are reachable from v, v is a common ancestor
        v1Visited = False
        v2Visited = False
        while len(q1) > 0:
            curVertex = q1.popleft()
            for adjV in G.get(curVertex, []):
                if adjV not in visited:
                    q1.append(adjV)
                    visited.add(adjV)
                    if adjV == v1:
                        v1Visited = True
                    elif adjV == v2:
                        v2Visited = True
        if v1Visited and v2Visited:
            return True
    return False
I guess the idea is clear on the modification of BFS. Hope it helps!
Suppose you have multiple queries. BFS would take around O(E) time to process each one.
All queries can be answered in O(log n) each if we do some precomputation, which should take about O(n log n) time.
Basically, you want to find the least common ancestor (LCA) of those nodes.
This thread on TopCoder discusses the logic for a tree, which can be extended to a DAG.
You can also refer to this question for some further ideas.
If an LCA exists between 2 nodes, then they have a common parent.
I have spent lots of time on this issue. However, I can only find solutions with non-recursive methods for a tree: Non recursive for tree, or a recursive method for the graph, Recursive for graph.
Lots of tutorials (I don't provide those links here) don't cover this approach either, or they are simply incorrect. Please help me.
Updated:
It's really hard to describe:
If I have an undirected graph:
    1
  / | \
 4  |  2
    3 /
1 -- 2 -- 3 -- 1 is a cycle.
At the step: 'push the neighbors of the popped vertex into the stack', what's the order in which the vertices should be pushed?
If the pushed order is 2, 4, 3, the vertices in the stack are:
| |
|3|
|4|
|2|
_
After popping the nodes, we get the result: 1 -> 3 -> 4 -> 2 instead of 1--> 3 --> 2 -->4.
It's incorrect. What condition should I add to stop this scenario?
A DFS without recursion is basically the same as BFS - but use a stack instead of a queue as the data structure.
The thread Iterative DFS vs Recursive DFS and different elements order deals with both approaches and the difference between them (and there is one: you will not traverse the nodes in the same order!).
The algorithm for the iterative approach is basically:
DFS(source):
    s <- new stack
    visited <- {} // empty set
    s.push(source)
    while (s is not empty):
        current <- s.pop()
        if (current is in visited):
            continue
        visited.add(current)
        // do something with current
        for each node v such that (current,v) is an edge:
            s.push(v)
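A direct Python translation of that pseudocode might look like the sketch below. The function name dfs_iterative is my own, and the adjacency dict encodes the graph from the question with 1's neighbours listed in the order 2, 4, 3.

```python
def dfs_iterative(graph, source):
    """Iterative DFS; returns the vertices in the order they are visited."""
    stack = [source]
    visited = set()
    order = []
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        order.append(current)          # "do something with current"
        for v in graph[current]:
            stack.append(v)
    return order

# Graph from the question: edges 1-2, 1-3, 1-4, 2-3
graph = {1: [2, 4, 3], 2: [1, 3], 3: [1, 2], 4: [1]}
print(dfs_iterative(graph, 1))  # [1, 3, 2, 4]
```

Because the stack is last-in first-out, the neighbour pushed last is visited first, which is why the order differs from a recursive DFS that takes neighbours in list order.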
This is not an answer, but an extended comment, showing the application of the algorithm in @amit's answer to the graph in the current version of the question, assuming 1 is the start node and its neighbors are pushed in the order 2, 4, 3:
    1
  / | \
 4  |  2
    3 /
Actions            Stack             Visited
=======            =====             =======
push 1             [1]               {}
pop and visit 1    []                {1}
push 2, 4, 3       [2, 4, 3]         {1}
pop and visit 3    [2, 4]            {1, 3}
push 1, 2          [2, 4, 1, 2]      {1, 3}
pop and visit 2    [2, 4, 1]         {1, 3, 2}
push 1, 3          [2, 4, 1, 1, 3]   {1, 3, 2}
pop 3 (visited)    [2, 4, 1, 1]      {1, 3, 2}
pop 1 (visited)    [2, 4, 1]         {1, 3, 2}
pop 1 (visited)    [2, 4]            {1, 3, 2}
pop and visit 4    [2]               {1, 3, 2, 4}
push 1             [2, 1]            {1, 3, 2, 4}
pop 1 (visited)    [2]               {1, 3, 2, 4}
pop 2 (visited)    []                {1, 3, 2, 4}
Thus applying the algorithm pushing 1's neighbors in the order 2, 4, 3 results in visit order 1, 3, 2, 4. Regardless of the push order for 1's neighbors, 2 and 3 will be adjacent in the visit order because whichever is visited first will push the other, which is not yet visited, as well as 1 which has been visited.
The DFS logic should be:
1) if the current node is not visited, visit the node and mark it as visited
2) for all its neighbors that haven't been visited, push them to the stack
For example, let's define a GraphNode class in Java:
class GraphNode {
    int index;
    ArrayList<GraphNode> neighbors;
}
and here is the DFS without recursion:
void dfs(GraphNode node) {
    // sanity check
    if (node == null) {
        return;
    }
    // use a hash set to mark visited nodes
    Set<GraphNode> set = new HashSet<GraphNode>();
    // use a stack to help depth-first traversal
    Stack<GraphNode> stack = new Stack<GraphNode>();
    stack.push(node);
    while (!stack.isEmpty()) {
        GraphNode curr = stack.pop();
        // current node has not been visited yet
        if (!set.contains(curr)) {
            // visit the node
            // ...
            // mark it as visited
            set.add(curr);
        }
        for (int i = 0; i < curr.neighbors.size(); i++) {
            GraphNode neighbor = curr.neighbors.get(i);
            // this neighbor has not been visited yet
            if (!set.contains(neighbor)) {
                stack.push(neighbor);
            }
        }
    }
}
We can use the same logic to do DFS recursively, clone graph etc.
Many people will say that non-recursive DFS is just BFS with a stack rather than a queue. That's not accurate, let me explain a bit more.
Recursive DFS
Recursive DFS uses the call stack to keep state, meaning you do not manage a separate stack yourself.
However, for a large graph, recursive DFS (or any recursive function, for that matter) may result in deep recursion, which can crash your program with a stack overflow (not this website, the real thing).
Non-recursive DFS
DFS is not the same as BFS: it uses space differently. If you implement it just like BFS, only with a stack rather than a queue, you will use more space than recursive DFS.
Why more space?
Consider this:
// From non-recursive "DFS"
for (auto& i : adjacent) {
    if (!visited(i)) {
        stack.push(i);
    }
}
And compare it with this:
// From recursive DFS
for (auto& i : adjacent) {
    if (!visited(i)) {
        dfs(i);
    }
}
In the first piece of code you are putting all the adjacent nodes in the stack before iterating to the next adjacent vertex and that has a space cost. If the graph is large it can make a significant difference.
What to do then?
If you decide to solve the space problem by iterating over the adjacency list again after popping the stack, that's going to add time complexity cost.
One solution is to add items to the stack one by one, as you visit them. To achieve this you can save an iterator in the stack to resume the iteration after popping.
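To make the "save an iterator in the stack" idea concrete, here is a minimal sketch in Python rather than C++. The name dfs_with_iterators is made up for this example. Each stack entry is an iterator over one vertex's neighbours, so the stack holds at most one entry per vertex on the current path, much like the call stack of a recursive DFS.

```python
def dfs_with_iterators(graph, source):
    """Iterative DFS that visits vertices in the same order as recursive DFS."""
    visited = {source}
    order = [source]
    stack = [iter(graph[source])]
    while stack:
        # Resume the neighbour iteration of the vertex on top of the stack
        for v in stack[-1]:
            if v not in visited:
                visited.add(v)
                order.append(v)
                stack.append(iter(graph[v]))  # descend into v
                break
        else:
            # Iterator exhausted: every neighbour handled, so backtrack
            stack.pop()
    return order

graph = {1: [2, 4, 3], 2: [1, 3], 3: [1, 2], 4: [1]}
print(dfs_with_iterators(graph, 1))  # [1, 2, 3, 4]
```

The for/else is what resumes the saved position: breaking out leaves the iterator where it stopped, and the else branch only runs once it is exhausted.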
Lazy way
In C/C++, a lazy approach is to compile your program with a larger stack size and increase stack size via ulimit, but that's really lousy. In Java you can set the stack size as a JVM parameter.
Recursion is a way to use the call stack to store the state of the graph traversal. You can use the stack explicitly, say by having a local variable of type std::stack, then you won't need the recursion to implement the DFS, but just a loop.
OK, if you are still looking for Java code:
void dfs(Vertex start) {
    Stack<Vertex> stack = new Stack<>(); // initialize a stack
    List<Vertex> visited = new ArrayList<>(); // maintains order of visited nodes
    stack.push(start); // push the start
    while (!stack.isEmpty()) { // loop until the stack is empty
        Vertex popped = stack.pop(); // pop the top of the stack
        if (!visited.contains(popped)) { // skip the vertex if it is already visited
            visited.add(popped); // mark it as visited as it is not yet visited
            for (Vertex adjacent : popped.getAdjacents()) { // get the adjacents of the vertex and add them to the stack
                stack.push(adjacent);
            }
        }
    }
    for (Vertex v1 : visited) {
        System.out.println(v1.getId());
    }
}
Python code. The time complexity is O(V+E) where V and E are the number of vertices and edges respectively. The space complexity is O(V) due to the worst-case where there is a path that contains every vertex without any backtracking (i.e. the search path is a linear chain).
The stack stores tuples of the form (vertex, vertex_edge_index) so that the DFS can be resumed from a particular vertex at the edge immediately following the last edge that was processed from that vertex (just like the function call stack of a recursive DFS).
The example code uses a complete digraph where every vertex is connected to every other vertex. Hence it is not necessary to store an explicit edge list for each node, as the graph is an edge list (the graph G contains every vertex).
# Assumption: Vertex is not defined in the original post; any hashable
# object works, so a minimal class is used here
class Vertex:
    def __init__(self, i):
        self.id = i

numv = 1000
print('vertices =', numv)
G = [Vertex(i) for i in range(numv)]

def dfs(source):
    s = []
    visited = set()
    s.append((source, None))
    time = 1
    space = 0
    while s:
        time += 1
        current, index = s.pop()
        if index is None:
            visited.add(current)
            index = 0
        # vertex has all edges possible: G is a complete graph
        while index < len(G) and G[index] in visited:
            index += 1
        if index < len(G):
            s.append((current, index + 1))
            s.append((G[index], None))
        space = max(space, len(s))
    print('time =', time, '\nspace =', space)

dfs(G[0])
Output:
time = 2000
space = 1000
Note that time here is measuring V operations and not E. The value is numv*2 because every vertex is considered twice, once on discovery and once on finishing.
Actually, a plain stack does not handle discovery time and finish time well. If we want to implement DFS with a stack and also record discovery and finish times, we need a second, recorder stack. My implementation is shown below; I have tested it on the case-1, case-2 and case-3 graphs.
from collections import defaultdict

class Graph(object):
    def __init__(self, V):
        self.V = V
        # one adjacency list per instance (a class-level attribute would be shared by all graphs)
        self.adj_list = defaultdict(list)

    def add_edge(self, u, v):
        self.adj_list[u].append(v)

    def DFS(self):
        visited = []
        instack = []
        disc = []
        fini = []
        for t in range(self.V):
            visited.append(0)
            disc.append(0)
            fini.append(0)
            instack.append(0)
        time = 0
        for u_ in range(self.V):
            if (visited[u_] != 1):
                stack = []
                stack_recorder = []
                stack.append(u_)
                while stack:
                    u = stack.pop()
                    visited[u] = 1
                    time += 1
                    disc[u] = time
                    print(u)
                    stack_recorder.append(u)
                    flag = 0
                    for v in self.adj_list[u]:
                        if (visited[v] != 1):
                            flag = 1
                            if instack[v] == 0:
                                stack.append(v)
                                instack[v] = 1
                    if flag == 0:
                        time += 1
                        temp = stack_recorder.pop()
                        fini[temp] = time
                while stack_recorder:
                    temp = stack_recorder.pop()
                    time += 1
                    fini[temp] = time
        print(disc)
        print(fini)
if __name__ == '__main__':
    V = 6
    G = Graph(V)
#==============================================================================
# #for case 1
#     G.add_edge(0,1)
#     G.add_edge(0,2)
#     G.add_edge(1,3)
#     G.add_edge(2,1)
#     G.add_edge(3,2)
#==============================================================================
#==============================================================================
# #for case 2
#     G.add_edge(0,1)
#     G.add_edge(0,2)
#     G.add_edge(1,3)
#     G.add_edge(3,2)
#==============================================================================
    #for case 3
    G.add_edge(0,3)
    G.add_edge(0,1)
    G.add_edge(1,4)
    G.add_edge(2,4)
    G.add_edge(2,5)
    G.add_edge(3,1)
    G.add_edge(4,3)
    G.add_edge(5,5)
    G.DFS()
I think you need to use a visited[n] boolean array to check if the current node is visited or not earlier.
A recursive algorithm works very well for DFS as we try to plunge as deeply as we can, i.e. as soon as we find an unexplored vertex, we're going to explore its FIRST unexplored neighbor right away. You need to BREAK out of the for loop as soon as you find the first unexplored neighbor.
for each neighbor w of v
    if w is not explored
        mark w as explored
        push w onto the stack
        BREAK out of the for loop
I think this is a space-optimized DFS; correct me if I am wrong.
s = stack
s.push(initial node)
add initial node to visited
while s is not empty:
    v = s.peek()
    if there is an edge E(v,u) with u unvisited:
        mark u as visited
        s.push(u)
    else:
        s.pop()
Using a stack and implementing it the way the call stack works in the recursive process:
The idea is to push a vertex onto the stack, then push one of its adjacent vertices (stored in an adjacency list at the index of the vertex), and continue this process until we cannot move further in the graph. When we cannot move ahead, we remove the vertex currently on top of the stack, as it cannot take us to any unvisited vertex.
Using the stack this way ensures that a vertex is only removed from the stack once all the vertices that can be explored from it have been visited, which is what the recursive process does automatically.
For example:
See the example graph here.
( 0 ( 1 ( 2 ( 4 4 ) 2 ) ( 3 3 ) 1 ) 0 ) ( 6 ( 5 5 ) ( 7 7 ) 6 )
The parentheses above show the order in which the vertices are added to and removed from the stack, so the parenthesis for a vertex is closed only when all the vertices that can be visited from it are done.
(Here I have used the adjacency list representation, implemented as a vector of lists (vector<list<int>> AdjList) using the C++ STL.)
void DFSUsingStack() {
    /// we keep a check of the vertices visited, the vector is set to false for all vertices initially.
    vector<bool> visited(AdjList.size(), false);
    stack<int> st;
    for(int i=0 ; i<AdjList.size() ; i++){
        if(visited[i] == true){
            continue;
        }
        st.push(i);
        cout << i << '\n';
        visited[i] = true;
        while(!st.empty()){
            int curr = st.top();
            for(list<int> :: iterator it = AdjList[curr].begin() ; it != AdjList[curr].end() ; it++){
                if(visited[*it] == false){
                    st.push(*it);
                    cout << (*it) << '\n';
                    visited[*it] = true;
                    break;
                }
            }
            /// We can move ahead from current only if a new vertex has been added on the top of the stack.
            if(st.top() != curr){
                continue;
            }
            st.pop();
        }
    }
}
The following Java Code will be handy:-
private void DFS(int v, boolean[] visited) {
    visited[v] = true;
    Stack<Integer> S = new Stack<Integer>();
    S.push(v);
    while (!S.isEmpty()) {
        int v1 = S.pop();
        System.out.println(adjLists.get(v1).name);
        for (Neighbor nbr = adjLists.get(v1).adjList; nbr != null; nbr = nbr.next) {
            if (!visited[nbr.VertexNum]) {
                visited[nbr.VertexNum] = true;
                S.push(nbr.VertexNum);
            }
        }
    }
}
public void dfs() {
    boolean[] visited = new boolean[adjLists.size()];
    for (int v = 0; v < visited.length; v++) {
        if (!visited[v]) /* This condition is for unconnected vertices */ {
            System.out.println("\nSTARTING AT " + adjLists.get(v).name);
            DFS(v, visited);
        }
    }
}
I'm stuck on a code challenge, and I want a hint.
PROBLEM: You are given a tree data structure (without cycles) and are asked to remove as many "edges" (connections) as possible, creating smaller trees with even numbers of nodes. This problem is always solvable as there is an even number of nodes.
Your task is to count the removed edges.
Input:
The first line of input contains two integers N and M. N is the number of vertices and M is the number of edges. 2 <= N <= 100.
Each of the next M lines contains two integers ui and vi, which specify an edge of the tree. (1-based index)
Output:
Print the number of edges removed.
Sample Input
10 9
2 1
3 1
4 3
5 2
6 1
7 2
8 6
9 8
10 8
Sample Output :
2
Explanation : On removing the edges (1, 3) and (1, 6), we can get the desired result.
I used BFS to travel through the nodes.
First, maintain an array separately to store the total number of child nodes + 1.
So, you can initially assign all the leaf nodes with value 1 in this array.
Now start from the last node and count the number of children for each node. This will work in bottom to top manner and the array that stores the number of child nodes will help in runtime to optimize the code.
Once you have the array holding the number of child nodes for every node, just counting the nodes whose subtree has an even number of nodes gives the answer. Note: I did not include the root node in the count in the final step.
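The steps above can be sketched roughly as follows in Python. This is my own illustration, not the answerer's code: the name count_removable_edges is hypothetical, vertex 1 is assumed to be the root, and a BFS order is used so that sizes can be accumulated bottom-up by walking that order in reverse.

```python
from collections import defaultdict, deque

def count_removable_edges(edges, root=1):
    """Count non-root nodes whose subtree has an even number of nodes."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # BFS from the root to learn each node's parent and a top-down order
    parent = {root: None}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    # Every node counts itself; walk bottom-up to add sizes into parents
    size = {node: 1 for node in order}
    for node in reversed(order):
        if parent[node] is not None:
            size[parent[node]] += size[node]

    # The edge above a node can be cut when that node's subtree is even
    return sum(1 for node in order if node != root and size[node] % 2 == 0)

sample = [(2, 1), (3, 1), (4, 3), (5, 2), (6, 1), (7, 2), (8, 6), (9, 8), (10, 8)]
print(count_removable_edges(sample))  # 2
```

On the sample input the even subtrees are rooted at 3 (2 nodes) and 6 (4 nodes), matching the edges (1, 3) and (1, 6) from the explanation.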
This is my solution. I didn't use a BFS tree; I just allocated another array to hold, for each node, the total number of nodes in its subtree (the node itself plus all of its children).
import java.util.Scanner;
import java.util.Arrays;
public class Solution {
    public static void main(String[] args) {
        int tree[];
        int count[];
        Scanner scan = new Scanner(System.in);
        int N = scan.nextInt(); //points
        int M = scan.nextInt();
        tree = new int[N];
        count = new int[N];
        Arrays.fill(count, 1);
        for (int i = 0; i < M; i++)
        {
            int u1 = scan.nextInt();
            int v1 = scan.nextInt();
            tree[u1-1] = v1;
            count[v1-1] += count[u1-1];
            int root = tree[v1-1];
            while (root != 0)
            {
                count[root-1] += count[u1-1];
                root = tree[root-1];
            }
        }
        System.out.println("");
        int counter = -1;
        for (int i = 0; i < count.length; i++)
        {
            if (count[i] % 2 == 0)
            {
                counter++;
            }
        }
        System.out.println(counter);
    }
}
If you observe the input, you can see that it is quite easy to count the number of nodes under each node. Consider (a b) as the edge input, in every case, a is the child and b is the immediate parent. The input always has edges represented bottom-up.
So it's essentially the number of nodes which have an even count (excluding the root node). I submitted the code below on Hackerrank and all the tests passed. I guess all the cases in the input satisfy the rule.
def find_edges(count):
    root = max(count)
    count_even = 0
    for cnt in count:
        if cnt % 2 == 0:
            count_even += 1
    if root % 2 == 0:
        count_even -= 1
    return count_even

def count_nodes(edge_list, n, m):
    count = [1 for i in range(0, n)]
    for i in range(m-1, -1, -1):
        count[edge_list[i][1]-1] += count[edge_list[i][0]-1]
    return find_edges(count)
I know that this has already been answered here lots and lots of time. I still want to know reviews on my solution here. I tried to construct the child count as the edges were coming through the input and it passed all the test cases.
namespace Hackerrank
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    class Program
    {
        static void Main(string[] args)
        {
            var tempArray = Console.ReadLine().Split(' ').Select(x => Convert.ToInt32(x)).ToList();
            int verticeNumber = tempArray[0];
            int edgeNumber = tempArray[1];
            Dictionary<int, int> childCount = new Dictionary<int, int>();
            Dictionary<int, int> parentDict = new Dictionary<int, int>();
            for (int count = 0; count < edgeNumber; count++)
            {
                var nodes = Console.ReadLine().Split(' ').Select(x => Convert.ToInt32(x)).ToList();
                var node1 = nodes[0];
                var node2 = nodes[1];
                if (childCount.ContainsKey(node2))
                    childCount[node2]++;
                else childCount.Add(node2, 1);
                var parent = node2;
                while (parentDict.ContainsKey(parent))
                {
                    var par = parentDict[parent];
                    childCount[par]++;
                    parent = par;
                }
                parentDict[node1] = node2;
            }
            Console.WriteLine(childCount.Count(x => x.Value % 2 == 1) - 1);
        }
    }
}
My first inclination is to work up from the leaf nodes because you cannot cut their edges as that would leave single-vertex subtrees.
Here's the approach that I used to successfully pass all the test cases.
1) Mark vertex 1 as the root.
2) Starting at the current root vertex, consider each child. If the total count of the child and all of its descendants is even, then you can cut that edge.
3) Descend to the next vertex (a child of the root vertex) and let that be the new root vertex. Repeat step 2 until you have traversed all of the nodes (depth first search).
Here's the general outline of an alternative approach:
Find all of the articulation points in the graph.
Check each articulation point to see if edges can be removed there.
Remove legal edges and look for more articulation points.
Solution - Traverse all the edges, and count the number of even edges
If we remove an edge from the tree and it results in two trees with an even number of vertices, let's call that edge an even edge.
If we remove an edge from the tree and it results in two trees with an odd number of vertices, let's call that edge an odd edge.
Here is my solution in Ruby
num_vertices, num_edges = gets.chomp.split(' ').map { |e| e.to_i }
graph = Graph.new
(1..num_vertices).to_a.each do |vertex|
  graph.add_node_by_val(vertex)
end
num_edges.times do |edge|
  first, second = gets.chomp.split(' ').map { |e| e.to_i }
  graph.add_edge_by_val(first, second, 0, false)
end
even_edges = 0
graph.edges.each do |edge|
  dup = graph.deep_dup
  first_tree = nil
  second_tree = nil
  subject_edge = nil
  dup.edges.each do |e|
    if e.first.value == edge.first.value && e.second.value == edge.second.value
      subject_edge = e
      first_tree = e.first
      second_tree = e.second
    end
  end
  dup.remove_edge(subject_edge)
  if first_tree.size.even? && second_tree.size.even?
    even_edges += 1
  end
end
puts even_edges
Note - Click Here to check out the code for Graph, Node and Edge classes
Is there any way to ensure that the fewest-number-of-turns heuristic is met by anything except a breadth-first search? Perhaps some more explanation would help.
I have a random graph, much like this:
0 1 1 1 2
3 4 5 6 7
9 a 5 b c
9 d e f f
9 9 g h i
Starting in the top left corner, I need to know the fewest number of steps it would take to get to the bottom right corner. Each set of connected colors is assumed to be a single node, so for instance in this random graph, the three 1's on the top row are all considered a single node, and every adjacent (not diagonal) connected node is a possible next state. So from the start, possible next states are the 1's in the top row or 3 in the second row.
Currently I use a bidirectional search, but the explosiveness of the tree size ramps up pretty quickly. For the life of me, I haven't been able to adjust the problem so that I can safely assign weights to the nodes and have them ensure the fewest number of state changes to reach the goal without it turning into a breadth first search. Thinking of this as a city map, the heuristic would be the fewest number of turns to reach the goal.
It is very important that the fewest number of turns is the result of this search as that value is part of the heuristic for a more complex problem.
You said yourself that each group of numbers represents one node, and each node is connected to adjacent nodes. Then this is a simple shortest-path problem, and you could use (for instance) Dijkstra's algorithm, with each edge having weight 1 (for 1 turn).
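As a rough illustration of that idea, here is a minimal Python sketch. It is a variation rather than a literal transcription: instead of first building the region graph, it runs Dijkstra directly on the grid cells, charging 0 for a step between cells with the same value and 1 for a step into a different value, which gives the same count as weight-1 edges between regions. The function name `fewest_changes` and the list-of-strings grid format are assumptions made for the example.

```python
import heapq

def fewest_changes(grid):
    """Fewest region changes from the top-left to the bottom-right cell.

    Hypothetical helper: `grid` is a list of equal-length strings (or lists).
    """
    rows, cols = len(grid), len(grid[0])
    start, goal = (0, 0), (rows - 1, cols - 1)
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > dist[(r, c)]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Cost 0 inside a region, cost 1 when the value changes.
                nd = d + (1 if grid[nr][nc] != grid[r][c] else 0)
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return None  # unreachable on a non-empty rectangular grid
```

On the grid from the question, `fewest_changes(["01112", "34567", "9a5bc", "9deff", "99ghi"])` gives 5.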
This sounds like Dijkstra's algorithm. The hardest part would lie in properly setting up the graph (keeping track of which node gets which children), but if you can devote some CPU cycles to that, you'd be fine afterwards.
Why don't you want a breadth-first search?
Here.. I was bored :-) This is in Ruby but may get you started. Mind you, it is not tested.
class Node
attr_accessor :parents, :children, :value
def initialize args={}
@parents = args[:parents] || []
@children = args[:children] || []
@value = args[:value]
end
def add_parents *args
args.flatten.each do |node|
@parents << node
node.add_children self unless node.children.include? self
end
end
def add_children *args
args.flatten.each do |node|
@children << node
node.add_parents self unless node.parents.include? self
end
end
end
class Graph
attr_accessor :graph, :root
def initialize args={}
@graph = args[:graph]
@root = Node.new
prepare_graph
@root = @graph[0][0]
end
private
def prepare_graph
# We will iterate through the graph, and only check the values above and to the
# left of the current cell.
@graph.each_with_index do |row, i|
row.each_with_index do |cell, j|
cell = Node.new :value => cell # wrap the raw value; stored back into @graph below
# Check above
unless i.zero?
above = @graph[i-1][j]
if above.value == cell.value
# Here it is safe to do this: the new node has no children, no parents.
cell = above
else
cell.add_parents above
above.add_children cell # Redundant given the code for both of those
# methods, but implementations may differ.
end
end
# Check to the left!
unless j.zero?
left = @graph[i][j-1]
if left.value == cell.value
# Well, potentially it's the same as the one above the current cell,
# so we can't just set one equal to the other: have to merge them.
left.add_parents cell.parents
left.add_children cell.children
cell = left
else
cell.add_parents left
left.add_children cell
end
end
# Reassigning the block variable does not change the array, so store the node explicitly.
@graph[i][j] = cell
end
end
end
end
#j = 0, 1, 2, 3, 4
graph = [
[3, 4, 4, 4, 2], # i = 0
[8, 3, 1, 0, 8], # i = 1
[9, 0, 1, 2, 4], # i = 2
[9, 8, 0, 3, 3], # i = 3
[9, 9, 7, 2, 5]] # i = 4
maze = Graph.new :graph => graph
# Now, going from maze.root on, we have a weighted graph, should it matter.
# If it doesn't matter, you can just count the number of steps.
# Dijkstra's algorithm is really simple to find in the wild.
This looks like the same problem as this Project Euler problem: http://projecteuler.net/index.php?section=problems&id=81
The complexity of the solution is O(n), where n is the number of nodes.
What you need is memoization.
At each step you can arrive from at most 2 directions, so pick whichever is cheaper.
It is something like this (just add the code that takes 0 if on the border):
for i in row:
    for j in column:
        matrix[i][j]=min([matrix[i-1][j],matrix[i][j-1]])+matrix[i][j]
And now you have the least expensive solution if you move only right or down.
The solution is in matrix[MAX_i][MAX_j]
If you can go left and up too, then the Big-O is much higher (I can figure out optimal solution)
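A runnable version of that recurrence might look like the sketch below. The name `min_path_sum` is made up for the example, and the border handling is an assumption: rather than substituting 0 on the border, it takes the single predecessor that exists, because a literal 0 would always win the `min` and give a wrong total. Note that this computes a minimum path sum with right/down moves only, as in the Project Euler problem.

```python
def min_path_sum(matrix):
    """Minimum sum of a top-left to bottom-right path using only right/down moves.

    Hypothetical helper: `matrix` is a non-empty rectangular list of lists of numbers.
    """
    rows, cols = len(matrix), len(matrix[0])
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                best = 0                      # starting cell has no predecessor
            elif i == 0:
                best = cost[i][j - 1]         # top row: can only come from the left
            elif j == 0:
                best = cost[i - 1][j]         # left column: can only come from above
            else:
                best = min(cost[i - 1][j], cost[i][j - 1])
            cost[i][j] = best + matrix[i][j]
    return cost[rows - 1][cols - 1]
```

The input matrix is left untouched; the running totals live in a separate `cost` table.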
In order for A* to always find the shortest path, your heuristic must never over-estimate the actual cost (such a heuristic is called "admissible"). Simple heuristics like using the Euclidean or Manhattan distance on a grid work well because they're fast to compute and are guaranteed to be less than or equal to the actual cost.
Unfortunately, in your case, unless you can make some simplifying assumptions about the size/shape of the nodes, I'm not sure there's much you can do. For example, consider going from A to B in this case:
B 1 2 3 A
C 4 5 6 D
C 7 8 9 C
C e f g C
C C C C C
The shortest path would be A -> D -> C -> B, but using spatial information would probably give 3 a lower heuristic cost than D.
Depending on your circumstances, you might be able to live with a solution that isn't actually the shortest path, as long as you can get the answer sooner. There's a nice blog post here by Christer Ericson (programmer for God of War 3 on PS3) on the topic: http://realtimecollisiondetection.net/blog/?p=56
Here's my idea for a non-admissible heuristic: from the point, move horizontally until you're even with the goal, then move vertically until you reach it, and count the number of state changes that you made. You can compute other test paths (e.g. vertically then horizontally) too, and pick the minimum value as your final heuristic. If your nodes are roughly equal in size and regularly shaped (unlike my example), this might do pretty well. The more test paths you try, the more accurate the estimate gets, but the slower it is to compute.
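To make the idea concrete, here is a small Python sketch of that estimate using just the two L-shaped test paths. The function names (`l_path`, `changes_along`, `estimate`) and the `(row, col)` coordinate convention are assumptions made for the example, not part of any library.

```python
def _span(a, b):
    """Inclusive list of integers from a to b, in either direction."""
    step = 1 if b >= a else -1
    return list(range(a, b + step, step))

def l_path(start, goal, horizontal_first):
    """Cells on an L-shaped path from start to goal, both given as (row, col)."""
    (r0, c0), (r1, c1) = start, goal
    if horizontal_first:
        return [(r0, c) for c in _span(c0, c1)] + [(r, c1) for r in _span(r0, r1)[1:]]
    return [(r, c0) for r in _span(r0, r1)] + [(r1, c) for c in _span(c0, c1)[1:]]

def changes_along(grid, cells):
    """Number of value changes between consecutive cells on the path."""
    return sum(1 for a, b in zip(cells, cells[1:])
               if grid[a[0]][a[1]] != grid[b[0]][b[1]])

def estimate(grid, start, goal):
    """Non-admissible estimate: the cheaper of the two L-shaped test paths."""
    return min(changes_along(grid, l_path(start, goal, True)),
               changes_along(grid, l_path(start, goal, False)))
```

On the A-to-B example grid above, `estimate` returns 4 for start `(0, 4)` and goal `(0, 0)`, while the real shortest path A -> D -> C -> B costs 3, which is exactly the kind of over-estimate that makes this heuristic non-admissible.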
Hope that's helpful, let me know if any of it doesn't make sense.
This untuned C implementation of breadth-first search can chew through a 100-by-100 grid in less than 1 msec. You can probably do better.
#include <string.h> // for memset
int shortest_path(int *grid, int w, int h) {
int mark[w * h]; // for each square in the grid:
// 0 if not visited
// 1 if not visited and slated to be visited "now"
// 2 if already visited
int todo1[4 * w * h]; // buffers for two queues, a "now" queue
int todo2[4 * w * h]; // and a "later" queue
int *readp; // read position in the "now" queue
int *writep[2] = {todo1 + 1, 0};
int x, y, same;
todo1[0] = 0;
memset(mark, 0, sizeof(mark));
for (int d = 0; ; d++) {
readp = (d & 1) ? todo2 : todo1; // start of "now" queue
writep[1] = writep[0]; // end of "now" queue
writep[0] = (d & 1) ? todo1 : todo2; // "later" queue (empty)
// Now consume the "now" queue, filling both the "now" queue
// and the "later" queue as we go. Points in the "now" queue
// have distance d from the starting square. Points in the
// "later" queue have distance d+1.
while (readp < writep[1]) {
int p = *readp++;
if (mark[p] < 2) {
mark[p] = 2;
x = p % w;
y = p / w;
if (x > 0 && !mark[p-1]) { // go left
mark[p-1] = same = (grid[p-1] == grid[p]);
*writep[same]++ = p-1;
}
if (x + 1 < w && !mark[p+1]) { // go right
mark[p+1] = same = (grid[p+1] == grid[p]);
if (y == h - 1 && x == w - 2)
return d + !same;
*writep[same]++ = p+1;
}
if (y > 0 && !mark[p-w]) { // go up
mark[p-w] = same = (grid[p-w] == grid[p]);
*writep[same]++ = p-w;
}
if (y + 1 < h && !mark[p+w]) { // go down
mark[p+w] = same = (grid[p+w] == grid[p]);
if (y == h - 2 && x == w - 1)
return d + !same;
*writep[same]++ = p+w;
}
}
}
}
}
This paper has a slightly faster version of Dijkstra's algorithm, which lowers the constant term. It is still O(n) though, since you really are going to have to look at every node.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.8746&rep=rep1&type=pdf
EDIT: THE PREVIOUS VERSION WAS WRONG AND WAS FIXED
Since Dijkstra is out, I'll recommend a simple DP, which has the benefit of running in the optimal time and not requiring you to construct a graph.
D[a][b] is the minimal distance to x=a and y=b using only nodes where the x<=a and y<=b.
And since you can't move diagonally you only have to look at D[a-1][b] and D[a][b-1] when calculating D[a][b]
This gives you the following recurrence relationship:
D[a][b] = min(if grid[a][b] == grid[a-1][b] then D[a-1][b] else D[a-1][b] + 1, if grid[a][b] == grid[a][b-1] then D[a][b-1] else D[a][b-1] + 1)
However doing only the above fails on this case:
0 1 2 3 4
5 6 7 8 9
A b d e g
A f r t s
A z A A A
A A A f d
Therefore you need to cache the minimum of each group of nodes you have found so far. And instead of looking at D[a][b] you look at the minimum of the group at grid[a][b].
Here's some Python code:
Note grid is the grid that you're given as input and it's assumed the grid is N by N
INF = N * N  # serves as 'infinity'; a right/down path changes group at most 2N-2 times
groupmin = {}
for x in range(0, N):
    for y in range(0, N):
        groupmin[grid[x][y]] = INF
# init first row and column
groupmin[grid[0][0]] = 0
for x in range(1, N):
    gm = groupmin[grid[x-1][0]]
    temp = gm if grid[x][0] == grid[x-1][0] else gm + 1
    groupmin[grid[x][0]] = min(groupmin[grid[x][0]], temp)
for y in range(1, N):
    gm = groupmin[grid[0][y-1]]
    temp = gm if grid[0][y] == grid[0][y-1] else gm + 1
    groupmin[grid[0][y]] = min(groupmin[grid[0][y]], temp)
# do the rest of the blocks
for x in range(1, N):
    for y in range(1, N):
        gma = groupmin[grid[x-1][y]]
        gmb = groupmin[grid[x][y-1]]
        a = gma if grid[x][y] == grid[x-1][y] else gma + 1
        b = gmb if grid[x][y] == grid[x][y-1] else gmb + 1  # was gma + 1, which ignored the left neighbour
        temp = min(a, b)
        groupmin[grid[x][y]] = min(groupmin[grid[x][y]], temp)
ans = groupmin[grid[N-1][N-1]]
This will run in O(N^2 * f(x)), where f(x) is the time the hash function takes, which is normally O(1). That is about the best running time you can hope for, and it has a much lower constant factor than Dijkstra's.
You should easily be able to handle N's of up to a few thousand in a second.
Is there any way to ensure that the fewest-number-of-turns heuristic is met by anything except a breadth-first search?
A faster way, or a simpler way? :)
You can breadth-first search from both ends, alternating, until the two regions meet in the middle. This will be much faster if the graph has a lot of fanout, like a city map, but the worst case is the same. It really depends on the graph.
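A minimal Python sketch of that alternating search is shown below, assuming the graph is undirected and given as a dict mapping each node to a list of neighbours (the name `bidirectional_bfs` and that input format are assumptions for the example). It expands one whole level at a time, always from the smaller frontier, and only reports a meeting after finishing the level so that the returned length is the shortest one.

```python
def bidirectional_bfs(adj, start, goal):
    """Length of the shortest path between start and goal, or None if unreachable.

    Hypothetical helper: `adj` maps every node to an iterable of its neighbours
    and is assumed to describe an undirected graph.
    """
    if start == goal:
        return 0
    dist_a, dist_b = {start: 0}, {goal: 0}
    frontier_a, frontier_b = [start], [goal]
    while frontier_a and frontier_b:
        # Always grow the smaller side; swapping is safe because the graph is undirected.
        if len(frontier_a) > len(frontier_b):
            frontier_a, frontier_b = frontier_b, frontier_a
            dist_a, dist_b = dist_b, dist_a
        best = None
        next_frontier = []
        for node in frontier_a:
            for nb in adj[node]:
                if nb in dist_b:
                    # The two searches meet across the edge (node, nb).
                    candidate = dist_a[node] + 1 + dist_b[nb]
                    if best is None or candidate < best:
                        best = candidate
                if nb not in dist_a:
                    dist_a[nb] = dist_a[node] + 1
                    next_frontier.append(nb)
        if best is not None:
            return best
        frontier_a = next_frontier
    return None
```

As the answer says, the worst case is no better than a one-sided BFS; the saving comes from the two smaller frontiers when the graph fans out quickly.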
This is my implementation using a simple BFS. Dijkstra would also work (substitute a std::priority_queue that sorts by descending costs for the std::queue) but would seriously be overkill.
The thing to notice here is that we are actually searching on a graph whose nodes do not exactly correspond to the cells in the given array. To get to that graph, I used a simple DFS-based flood fill (you could also use BFS, but DFS is slightly shorter for me). It finds all connected components of cells with the same character and assigns each component a single colour/node. Thus, after the flood fill we can find out which node each cell belongs to in the underlying graph by looking at the value of colour[row][col]. Then I just iterate over the cells and find all the cells whose adjacent cells do not have the same colour (i.e. are in different nodes). These are the edges of our graph. I maintain a std::set of edges as I iterate over the cells to eliminate duplicate edges. After that it is a simple matter of building an adjacency list from the list of edges, and we are ready for a BFS.
Code (in C++):
#include <queue>
#include <vector>
#include <iostream>
#include <string>
#include <set>
#include <cstring>
using namespace std;
#define SIZE 1001
vector<string> board;
int colour[SIZE][SIZE];
int dr[]={0,1,0,-1};
int dc[]={1,0,-1,0};
int min(int x,int y){ return (x<y)?x:y;}
int max(int x,int y){ return (x>y)?x:y;}
void dfs(int r, int c, int col, vector<string> &b){
if (colour[r][c]<0){
colour[r][c]=col;
for(int i=0;i<4;i++){
int nr=r+dr[i],nc=c+dc[i];
if (nr>=0 && nr<b.size() && nc>=0 && nc<b[0].size() && b[nr][nc]==b[r][c])
dfs(nr,nc,col,b);
}
}
}
int flood_fill(vector<string> &b){
memset(colour,-1,sizeof(colour));
int current_node=0;
for(int i=0;i<b.size();i++){
for(int j=0;j<b[0].size();j++){
if (colour[i][j]<0){
dfs(i,j,current_node,b);
current_node++;
}
}
}
return current_node;
}
vector<vector<int> > build_graph(vector<string> &b){
int total_nodes=flood_fill(b);
set<pair<int,int> > edge_list;
for(int r=0;r<b.size();r++){
for(int c=0;c<b[0].size();c++){
for(int i=0;i<4;i++){
int nr=r+dr[i],nc=c+dc[i];
if (nr>=0 && nr<b.size() && nc>=0 && nc<b[0].size() && colour[nr][nc]!=colour[r][c]){
int u=colour[r][c], v=colour[nr][nc];
if (u!=v) edge_list.insert(make_pair(min(u,v),max(u,v)));
}
}
}
}
vector<vector<int> > graph(total_nodes);
for(set<pair<int,int> >::iterator edge=edge_list.begin();edge!=edge_list.end();edge++){
int u=edge->first,v=edge->second;
graph[u].push_back(v);
graph[v].push_back(u);
}
return graph;
}
int bfs(vector<vector<int> > &G, int start, int end){
vector<int> cost(G.size(),-1);
queue<int> Q;
Q.push(start);
cost[start]=0;
while (!Q.empty()){
int node=Q.front();Q.pop();
vector<int> &adj=G[node];
for(int i=0;i<adj.size();i++){
if (cost[adj[i]]==-1){
cost[adj[i]]=cost[node]+1;
Q.push(adj[i]);
}
}
}
return cost[end];
}
int main(){
string line;
int rows,cols;
cin>>rows>>cols;
for(int r=0;r<rows;r++){
line="";
char ch;
for(int c=0;c<cols;c++){
cin>>ch;
line+=ch;
}
board.push_back(line);
}
vector<vector<int> > actual_graph=build_graph(board);
cout<<bfs(actual_graph,colour[0][0],colour[rows-1][cols-1])<<"\n";
}
This is just a quick hack, lots of improvements can be made. But I think it is pretty close to optimal in terms of runtime complexity, and should run fast enough for boards of size of several thousand (don't forget to change the #define of SIZE). Also, I only tested it with the one case you have provided. So, as Knuth said, "Beware of bugs in the above code; I have only proved it correct, not tried it." :).