Need the complexity of a pseudocode

I need to determine the complexity of the pseudocode I wrote:

while root ≠ null
    while hasChild(root)
        push(parentTree, root)          // remember the current node
        root ← pop(getChilds(root))     // descend into one child, removing it
    ...
    if isEmpty(parentTree)
        root ← null                     // whole tree processed
    else
        root ← pop(parentTree)          // backtrack to the nearest ancestor
How can I work out the number of executions of each line in the worst-case scenario?
I am not able to determine it, because I don't actually know the counts for the first two lines. After that it's easy, but the first two lines elude me.
It's a tree traversal using a stack, and root is the root node, as you can see.
By the way, it's the first time I've written pseudocode, so I'm not sure I wrote it well. If it's not correct, I can rewrite it.
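Here is a minimal runnable sketch of the pseudocode in Python, assuming a Node type with a list of children; since pop(getChilds(root)) destructively removes a child, the sketch emulates that with a pending map (Node, traverse and pending are illustrative names, not from the question):

class Node:
    def __init__(self, children=None):
        self.children = children or []

def traverse(root):
    parent_tree = []        # the explicit stack from the pseudocode
    pending = {}            # per node: children not yet descended into
    while root is not None:
        # while hasChild(root): push root and descend into one child
        while pending.setdefault(id(root), list(root.children)):
            parent_tree.append(root)          # push(parentTree, root)
            root = pending[id(root)].pop()    # root <- pop(getChilds(root))
        # ... per-node work from the elided body goes here
        root = parent_tree.pop() if parent_tree else None

Incrementing a counter before each statement of interest then gives exactly the per-line execution counts the question asks about.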

A prima facie analysis leads me to think the runtime is O(log n * log n).
Reasoning:
The outer while loop executes at most c*log n times (where c is a constant). This is because it is driven by the root variable, which in turn is driven by pop(parentTree).
parentTree only gets populated with nodes along the current descent, iteratively; at most it will hold all the nodes down one path in the tree. For a balanced tree, the length of a single root-to-leaf path is O(log n).
The inner while loop likewise executes at most d*log n times (d a constant). If the ... does not execute in O(1), say it costs X, then the inner loop costs d*log n + X, and the overall runtime would be O(log n * (log n + X)), likely simplifying to O(X log n).
Assuming the "is" is an "if", the if/else statements run in O(1).
Outer * Inner = O(c log n * d log n) = O(log^2 n).

Related

Why is the time complexity O(lg N) for an operation in the weighted union-find algorithm?

So everywhere I see the weighted union-find algorithm, they use this approach:
Maintain an array pointing to the parent node of a particular node.
Maintain an array denoting the size of the tree a node is in.
For union(p, q), merge the smaller tree into the larger one.
The time complexity here is O(lg N).
Now an optimization on this is to flatten the trees, i.e., whenever I am calculating the root of a particular node, set all nodes on that path to point to that root.
The time complexity of this is O(lg* N).
This I can understand, but what I don't get is why they don't start off with an array/hashset wherein nodes point directly to the root (instead of to the immediate parent node). That would bring the time complexity down to O(1).
I am going to assume that the time complexity you are asking for is the time to check whether 2 nodes belong to the same set.
The key is in how sets are joined: specifically, you take the root of one set (the smaller one) and have it point to the root of the other set. Let the two sets have p and q as roots respectively, and let |p| represent the size of the set if p is a root, and in general the number of items whose path to the root goes through p (which is 1 plus the counts for all of its children).
We can, without loss of generality, assume that |p| <= |q| (otherwise we just exchange their names). We then have |p ∪ q| = |p| + |q| >= 2|p|. This shows that each subtree in the data structure is at most half as big as the tree it is merged into, so given N items the structure can have depth at most 1 + lg N = O(lg N).
If the two chosen items are as far as possible from their roots, it takes O(lg N) operations to find the root of each of their sets, since you only need O(1) operations to move up one layer, and then O(1) operations to compare the two roots.
The same cost applies to each union operation itself, since you need to figure out which two roots to merge. There are several reasons we don't just have all nodes point directly to the root. First, we would need to update every node in one of the sets each time we perform a union. Second, we only have edges pointing from the nodes toward the root and not the other way, so we would have to look through all nodes to find the ones that need changing. Third, the good optimizations we have (such as path compression, the flattening you mention) still work with the parent-pointer representation. Finally, you could do such a flattening pass at the end if you really need to, but it would cost O(N lg N) time, which is comparable to how long the entire algorithm takes to run without the path-compression shortcut.
You are correct that the solution you suggest will bring down the time complexity of a Find operation to O(1). However, doing so will make a Union operation slower.
Imagine that you use an array/hashtable to remember the representative (or root, as you call it) of each node. When you perform a union operation between two nodes x and y, you would either need to update all the nodes with the same representative as x to have y's representative, or vice versa. This way, Union runs in O(min{|Sx|, |Sy|}), where Sx is the set of nodes with the same representative as x. These sets can be significantly larger than log n.
The weighted union algorithm, on the other hand, takes O(log n) for both Find and Union.
So it's a trade-off. If you expect to do many Find operations but few Union operations, you should use the solution you suggest. If you expect to do many of each, you can use the weighted union algorithm to avoid excessively slow operations.
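For concreteness, here is a minimal Python sketch of the weighted (union-by-size) structure both answers describe; the class and method names are illustrative:

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # parent[i] == i means i is a root
        self.size = [1] * n            # size of the tree rooted at i

    def find(self, x):
        # walk parent pointers up to the root: O(log n) with union by size
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, p, q):
        rp, rq = self.find(p), self.find(q)
        if rp == rq:
            return
        if self.size[rp] > self.size[rq]:
            rp, rq = rq, rp            # ensure rp roots the smaller tree
        self.parent[rp] = rq           # smaller root points to larger root
        self.size[rq] += self.size[rp]

    def connected(self, p, q):
        return self.find(p) == self.find(q)

With union by size alone, Find and Union are both O(log n); adding path compression inside find would give the O(lg* N) behavior mentioned in the question.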

Nearest node in a tree

Recently I encountered this problem on trees, for which I found an O(n*q) solution. I am wondering if there is a better way to deal with it, with lower complexity.
The problem is as follows:
Given an unweighted tree of n nodes (n >= 1, and n can go up to 10^5), each node can be special or non-special. Node 1 is always special and the rest are initially non-special. Now, there are two operations:
1. We can update any non-special node to a special node with an update operation "U Node_Number".
OR
2. At any time, the user can ask "Q Node_Number", which should return the special node in the tree closest to "Node_Number".
The number of operations can also go up to 10^5.
My solution:
I thought of creating an adjacency list. For operation 1, I can keep a record of special/non-special with a boolean flag. But for operation 2, my solution consists of doing a BFS rooted at "Node_Number" whenever "Q Node_Number" is asked.
But the complexity is quadratic. Is this the most optimal way of going about this problem?
Here's an O(n^1.5 + n^0.5 q)-time algorithm via sqrt decomposition. We need a constant-time distance oracle (this is basically lowest common ancestors). The idea is: every n^0.5 updates that make a node special, perform a breadth-first search from all special nodes, which yields, for each node in the tree, the closest node that was special at that point. On each query, take the closest of (i) the nodes that were special as of the last breadth-first search and (ii) the at most n^0.5 newly special nodes.
As I mentioned in the comments, I expect that there's a very complicated O((n + q) log n)-time algorithm via top trees.
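A hedged Python sketch of the sqrt-decomposition idea (all names are illustrative; for brevity the distance oracle walks parent pointers in O(depth) instead of the O(1) LCA oracle the stated bound requires, and queries return the distance rather than the node):

from collections import deque
import math

class NearestSpecial:
    def __init__(self, n, adj):
        self.n, self.adj = n, adj
        # root the tree at node 1 to obtain parents and depths
        self.parent = [0] * (n + 1)
        self.depth = [0] * (n + 1)
        seen = [False] * (n + 1)
        seen[1] = True
        q = deque([1])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    self.parent[v], self.depth[v] = u, self.depth[u] + 1
                    q.append(v)
        self.block = max(1, math.isqrt(n))  # rebuild every ~sqrt(n) updates
        self.special = [1]                  # node 1 starts special
        self.buffer = []                    # specials added since last rebuild
        self._rebuild()

    def _rebuild(self):
        # multi-source BFS: nearest[u] = distance to closest special node
        self.special += self.buffer
        self.buffer = []
        dist = [-1] * (self.n + 1)
        q = deque(self.special)
        for s in self.special:
            dist[s] = 0
        while q:
            u = q.popleft()
            for v in self.adj[u]:
                if dist[v] == -1:
                    dist[v] = dist[u] + 1
                    q.append(v)
        self.nearest = dist

    def _distance(self, u, v):
        # naive LCA walk; swap in a constant-time oracle for the bound
        d = 0
        while self.depth[u] > self.depth[v]:
            u, d = self.parent[u], d + 1
        while self.depth[v] > self.depth[u]:
            v, d = self.parent[v], d + 1
        while u != v:
            u, v, d = self.parent[u], self.parent[v], d + 2
        return d

    def update(self, u):                    # "U u": make u special
        self.buffer.append(u)
        if len(self.buffer) >= self.block:
            self._rebuild()

    def query(self, u):                     # "Q u": distance to nearest special
        return min([self.nearest[u]] + [self._distance(u, s) for s in self.buffer])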

Time complexity of recursive method involving a tree traversal

I'm trying to get my head around calculating the time complexity of my own method. Can anybody nudge me in the direction of analyzing a method that contains a for-each loop which makes a recursive call on itself?
I've written a method that does a tree traversal of an n-ary tree. Unfortunately I can't post the exact code, but it goes like this: given the root node to start with,
for each (child of node)
    do a quick check
    set a boolean
    make a recursive call on itself, until we reach the leaf nodes
Your loop visits every node of the tree exactly once.
Beginning with the root node, you visit all of its child nodes, call the same function on each of those child nodes, and the same repeats all the way down.
Since you visit every node exactly once, this loop has a runtime of O(n) for the n nodes of your tree, assuming that the quick check is constant-time and does not do anything that depends on n.
"Is the for-each part done n times?"
Yes and no: the for-each part runs numberOfChildsOfNode(node) times for a single node, but since you do that for each child by calling your function recursively, the loop body is executed n - 1 times in total (once per node except the root), i.e. O(n) times.
What you can test/try: declare a static variable executionCount or something like that, initialize it to 0, and increment it inside your loop. You should see that executionCount ends up equal to the number of nodes, minus one for the root.
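A small runnable sketch of that experiment in Python (Node and the counter name are illustrative):

class Node:
    def __init__(self, children=None):
        self.children = children or []

execution_count = 0

def traverse(node):
    global execution_count
    for child in node.children:
        execution_count += 1   # one increment per loop-body execution
        # quick O(1) check / boolean assignment would go here
        traverse(child)

# Example: a root with two children, one of which has a child of its own.
root = Node([Node([Node()]), Node()])
traverse(root)
print(execution_count)   # 3 for this 4-node tree: n - 1 loop-body runs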

Checking if A is a part of binary tree B

Let's say I have binary trees A and B, and I want to know whether A is a "part" of B. I am not only talking about subtrees: what I want to know is whether B has all the nodes and edges that A does.
My thought was that since a tree is essentially a graph, I could view this question as a subgraph isomorphism problem (i.e., checking to see if A is a subgraph of B). But according to Wikipedia this is an NP-complete problem.
http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem
I know that you can check whether A is a subtree of B with O(n) algorithms (e.g., using preorder and inorder traversals to flatten the trees to strings and checking for substrings). I was trying to modify this a little to also test for mere "parts", but to no avail. This is where I'm stuck.
Are there any other ways to view this problem, other than subgraph isomorphism? I'm thinking there must be faster methods, since binary trees are much more restricted and simpler than general graphs.
Thanks in advance!
EDIT: I realized that even a brute-force method for my question would only take O(m*n) in the worst case, which is polynomial. So I guess this isn't an NP-complete problem after all. Then my next question is: is there an algorithm faster than O(m*n)?
I would approach this problem in two steps:
Find the root of A in B (either BFS or DFS).
Verify that A is contained in B (given that starting node), using a recursive algorithm, as below. (I concocted some crazy pseudo-language because you didn't specify the language; I think this should be understandable no matter your background.) Note that a is a node from A (initially the root) and b is a node from B (initially the node found in step 1).
function checkTrees(node a, node b) returns boolean
    if a does not exist or b does not exist then
        // base of the recursion
        return false
    else if a is different from b then
        // compare the current nodes
        return false
    else
        // check the children of a
        boolean leftFound = true
        boolean rightFound = true
        if a.left exists then
            // try to match the left child of a with
            // every possible neighbor of b
            leftFound = checkTrees(a.left, b.left)
                or checkTrees(a.left, b.right)
                or checkTrees(a.left, b.parent)
        if a.right exists then
            // try to match the right child of a with
            // every possible neighbor of b
            rightFound = checkTrees(a.right, b.left)
                or checkTrees(a.right, b.right)
                or checkTrees(a.right, b.parent)
        return leftFound and rightFound
About the running time: let m be the number of nodes in A and n be the number of nodes in B. The search in the first step takes O(n) time. The running time of the second step depends on one crucial assumption I made, but that might be wrong: I assumed that every node of A is equal to at most one node of B. If that is the case, the running time of the second step is O(m) (because you can never search too far in the wrong direction). So the total running time would be O(m + n).
While writing down my assumption, I start to wonder whether that's not oversimplifying your case...
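A rough Python translation of the pseudocode above, assuming each node carries .value, .left, .right and .parent attributes (these names are illustrative):

def check_trees(a, b):
    if a is None or b is None:
        return False                 # base of the recursion
    if a.value != b.value:
        return False                 # current nodes differ
    left_found = right_found = True
    if a.left is not None:
        # try to match a's left child against every neighbor of b
        left_found = any(check_trees(a.left, nb)
                         for nb in (b.left, b.right, b.parent))
    if a.right is not None:
        right_found = any(check_trees(a.right, nb)
                          for nb in (b.left, b.right, b.parent))
    return left_found and right_found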
You could compare the trees bottom-up, as follows:
For each leaf in tree A, identify the corresponding node in tree B.
Start a parallel traversal towards the root in both trees from the nodes just matched.
Specifically: move to the parent of a node in A, then move towards the root in B until you either encounter the corresponding node in B (proceed), or a marked node in A (see below; if a match in B is found, proceed, else fail), or the root of B (fail).
Mark all nodes visited in A.
You succeed if you haven't failed ;-).
The main part of the algorithm runs in O(e_B): in the worst case, all edges in B are visited a constant number of times. The leaf-node matching runs in O(n_A * log n_B) if the B vertices are sorted, and in O(n_A * log n_A + n_B * log n_B + n) = O(n_B * log n_B) otherwise (sort each node set, then scan the results linearly).
EDIT:
Re-reading your question, the above-mentioned step 2 is even easier: for matching nodes in A and B, their parents must match too (otherwise there would be a mismatch between the edge sets). No effect on the worst-case run time, of course.

Shortest branch in a binary tree?

A binary tree can be encoded using two functions l and r such that for a node n, l(n) gives the left child of n and r(n) gives the right child of n.
A branch of a tree is a path from the root to a leaf; the length of a branch to a particular leaf is the number of arcs on the path from the root to that leaf.
Let MinBranch(l, r, x) be a simple recursive algorithm that takes a binary tree encoded by the l and r functions together with the root node x of the binary tree and returns the length of the shortest branch of the binary tree.
Give the pseudocode for this algorithm.
OK, so basically this is what I've come up with so far:
MinBranch(l, r, x)
{
if x is None return 0
left_one = MinBranch(l, r, l(x))
right_one = MinBranch(l, r, r(x))
return {min (left_one),(right_one)}
}
Obviously this isn't great or perfect. I'd be grateful if people could help me get this working correctly; any help will be appreciated.
I doubt anyone will solve homework for you straight-up. A clue: the return value must surely grow as the tree gets bigger, right? However, I don't see any numeric literals in your function except 0, and no addition operators either. How will you ever return larger numbers?
Another angle on the same issue: any time you write a recursive function, it helps to enumerate "what are all the conditions where I should stop calling myself? What do I return in each circumstance?"
You're on the right approach, but you're not quite there: your recursive algorithm will always return 0 (the logic is almost right, though...).
Note that the length of a branch is one more than the length of its sub-branches, so left_one and right_one should be 1 + MinBranch....
Stepping through the algorithm with some sample trees will help uncover off-by-one errors like this one.
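A minimal corrected sketch in Python, following the hint above (it additionally treats a missing child as infinitely far, so a one-child node never counts its absent branch; l and r are assumed to return None for a missing child):

import math

def min_branch(l, r, x):
    if x is None:
        return 0                           # empty tree, by convention
    if l(x) is None and r(x) is None:
        return 0                           # leaf: branch of length 0
    left = 1 + min_branch(l, r, l(x)) if l(x) is not None else math.inf
    right = 1 + min_branch(l, r, r(x)) if r(x) is not None else math.inf
    return min(left, right)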
It looks like you almost have it, but consider this example:

  4
 / \
3   5
When you trace through MinBranch, you'll see that in your MinBranch(l, r, 4) call:

left_one = MinBranch(l, r, l(x))
         = MinBranch(l, r, l(4))
         = MinBranch(l, r, 3)
         = 0

That makes sense; after all, 3 is a leaf node, so of course the distance to the closest leaf node is 0. The same happens for right_one. But you then wind up here:

return {min (left_one),(right_one)}
     = {min (0), (0) }
     = 0

but that's clearly wrong, because this node (4) is not a leaf node. Your code forgot to count the current node (oops!). I'm sure you can manage to fix that.
Now, actually, the way you're doing this isn't the fastest, but I'm not sure if that's relevant for this exercise. Consider this tree:

      4
     / \
    3   5
   /
  2
 /
1
Your algorithm will count up the left branch recursively, even though it could, hypothetically, bail out if it first counted the right branch and noted that 3 has a left child, so its branch is clearly longer than the one through 5 (which is a leaf). But, of course, counting the right branch first doesn't always work!
Instead, with more complicated code, and probably a trade-off of greater memory usage, you can check nodes left-to-right, top-to-bottom (just like English reading order) and stop at the first leaf you find.
What you've created can be thought of as a depth-first search. However, given what you're after (the shortest branch), this may not be the most efficient approach. Think about how your algorithm would perform on a tree that is very heavy on the left side (of the root node) but has only one node on the right side.
Hint: consider a breadth-first search approach.
What you have there looks like a depth-first search algorithm, which will have to search the entire tree before it comes up with a solution. What you need is a breadth-first search algorithm, which can return as soon as it finds the solution, without doing a complete search.
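A hedged sketch of that breadth-first approach in Python (l and r are assumed to return None for a missing child, matching the problem statement):

from collections import deque

def min_branch_bfs(l, r, x):
    if x is None:
        return 0
    queue = deque([(x, 0)])                # (node, depth) pairs
    while queue:
        node, depth = queue.popleft()
        if l(node) is None and r(node) is None:
            return depth                   # first leaf found is the shallowest
        if l(node) is not None:
            queue.append((l(node), depth + 1))
        if r(node) is not None:
            queue.append((r(node), depth + 1))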
