Note: This is problem 4.3 from Cracking the Coding Interview, 5th Edition.
Problem: Given a sorted (increasing order) array, write an algorithm to create a binary search tree with minimal height.
Here is my algorithm, written in Java, to solve this problem:
public static IntTreeNode createBST(int[] array) {
    return createBST(array, 0, array.length - 1);
}

private static IntTreeNode createBST(int[] array, int left, int right) {
    if (right >= left) {
        int middle = (left + right) / 2;                   // index of the midpoint
        IntTreeNode root = new IntTreeNode(array[middle]); // midpoint value becomes the root
        root.left = createBST(array, left, middle - 1);
        root.right = createBST(array, middle + 1, right);
        return root;
    } else {
        return null;
    }
}
I checked this code against the author's and it's nearly identical. However, I am having a hard time analyzing the time complexity of this algorithm. I know it wouldn't run in O(log n) like binary search, because you're not doing the same amount of work at each level of recursion: e.g., at the first level there is 1 unit of work, at the 2nd level 2 units, at the 3rd level 4 units, all the way to n units at level log2(n).
So based on that, the number of steps this algorithm takes would be upper bounded by this mathematical expression:

n + n/2 + n/4 + ... + 1 = n(1 + 1/2 + 1/4 + ...)

which, after watching a lecture on infinite geometric series, I evaluated to

n * (1 / (1 - 1/2))

or 2n, which would be in O(n).
Do you guys agree with my work here that this algorithm runs in O(n), or did I miss something and it actually runs in O(n log n) or some other function class?
Sometimes you can simplify calculations by calculating the amount of time per item in the result rather than solving recurrence relations. That trick applies here. Start by changing the code to this obviously equivalent form:
private static IntTreeNode createBST(int[] array, int left, int right) {
    int middle = (left + right) / 2;
    IntTreeNode root = new IntTreeNode(array[middle]);
    if (middle - 1 >= left) {
        root.left = createBST(array, left, middle - 1);
    }
    if (right >= middle + 1) {
        root.right = createBST(array, middle + 1, right);
    }
    return root;
}
Now every call to createBST directly creates exactly 1 node. Since there are n nodes in the final tree, there must be n total calls to createBST, and since each call directly performs a constant amount of work, the overall time complexity is O(n).
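If you prefer to solve the recurrence directly, the same bound falls out (my sketch, not part of the original answer): each call does constant work c and recurses on two halves, so

T(n) = 2T(n/2) + c
     = 4T(n/4) + 2c + c
     = ...
     = nT(1) + c(n/2 + n/4 + ... + 1)
     = O(n)

which is also what the master theorem gives for a = 2, b = 2, f(n) = O(1).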
If you get confused by recursion, try mentally substituting the recursive calls with a loop. For example, in the function above, you can imagine the recursive calls happening inside a while loop. Since that loop executes until all n nodes have been visited, the complexity is O(n).
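To make that loop picture concrete, here is a minimal iterative sketch (my illustration, not the OP's code) that replaces the recursion with an explicit stack of [left, right] ranges; it uses java.util.Deque and java.util.ArrayDeque:

private static void visitAll(int[] array, int left, int right) {
    Deque<int[]> stack = new ArrayDeque<>();
    stack.push(new int[]{left, right});
    while (!stack.isEmpty()) {
        int[] range = stack.pop();
        int l = range[0], r = range[1];
        if (l > r) continue;          // empty range, nothing to build
        int middle = (l + r) / 2;
        // ... create the node for array[middle] here ...
        stack.push(new int[]{l, middle - 1});
        stack.push(new int[]{middle + 1, r});
    }
}

The loop pops one non-empty range per node created (plus O(n) empty ranges), so it runs O(n) iterations in total.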
The code counts the number of possible pathways of actions that reach the goal. I do not want to optimize it, just to know the Big-O complexity it has. The code is the following:
private int countPaths(Node parent, List<Action> usableActions, Node goal)
{
    int counter = 0;
    foreach (Action act in usableActions)
    {
        Node node = generateNewNode(parent, act); // only generates the new node, O(1)
        if (node.isEquals(goal)) // check goal
        {
            counter++;
        }
        else
        {
            List<Action> subset = actionSubset(usableActions, act); // usableActions with act removed
            counter += countPaths(node, subset, goal); // recurse with one action fewer
        }
    }
    return counter;
}
The loop alone would give the algorithm a complexity of O(n), but with the recursive call I do not know whether it is O(n^2), O(n^n), or some other option.
As already stated in some of the comments, the time complexity is O(n!).
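To sketch why (my derivation, not from the original answer): a call that receives k actions loops k times, and in the worst case every iteration recurses with k - 1 actions, so the number of calls satisfies

C(k) = k * (1 + C(k-1)), C(0) = 0
     = k + k(k-1) + k(k-1)(k-2) + ... + k!
     <= e * k!

so the total number of calls, and hence the running time up to the per-call work, grows as O(n!).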
As for memory usage, actionSubset() creates a new list each time it is called (as opposed to having the algorithm operate on the original). But because all lists except the original fall out of scope at the end of each iteration, only the lists along the current call chain (of sizes n, n - 1, n - 2, ...) are live at any moment, so peak memory usage only grows as O(n^2) in the size of usableActions.
This is my solution to the problem where, given a binary tree, you're asked to find the maximum sum of non-directly-linked nodes. "Directly linked" refers to a parent-child relationship, just to be clear.
My solution
If the current node is visited, you're not allowed to visit the nodes at the next level. If the current node, however, is not visited, you may or may not visit the nodes at the next level.
It passes all tests. However, what is the runtime complexity of this recursive binary tree traversal? I think it's 2^n because, at every node, you have two choices (use it or don't use it), and each of those choices in turn gives two choices at the next level, and so on.
Space complexity: I'm not using any additional storage, but since this is a recursive implementation, stack space is used, and the maximum number of elements on the stack could be the height of the tree, which is n in the worst case. So O(n)?
public int rob(TreeNode root) {
    return rob(root, false);
}

public int rob(TreeNode root, boolean previousStateUsed) {
    if (root == null)
        return 0;
    if (root.left == null && root.right == null) {
        // leaf: can only be taken if its parent was not
        if (previousStateUsed)
            return 0;
        return root.val;
    }
    if (previousStateUsed) {
        // the parent was used, so this node must be skipped
        int leftSumIfCurrentNotUsed = rob(root.left, false);
        int rightSumIfCurrentNotUsed = rob(root.right, false);
        return leftSumIfCurrentNotUsed + rightSumIfCurrentNotUsed;
    } else {
        // this node may be used or skipped; try both
        int leftSumIfCurrentNotUsed = rob(root.left, false);
        int rightSumIfCurrentNotUsed = rob(root.right, false);
        int leftSumIfCurrentUsed = rob(root.left, true);
        int rightSumIfCurrentUsed = rob(root.right, true);
        return Math.max(leftSumIfCurrentNotUsed + rightSumIfCurrentNotUsed,
                        leftSumIfCurrentUsed + rightSumIfCurrentUsed + root.val);
    }
}
Your current recursive solution would be O(2^n). It's pretty clear to see if we take an example: picture a full binary tree with n nodes, and cross out alternating layers of nodes.
The remaining nodes number about n/2 (this will vary, but by removing alternating layers you always keep at least n/2 - 1 nodes in the worst case). None of these remaining nodes conflict with each other, so every combination of them is a valid selection, and the recursion ends up exploring each combination separately. Therefore we can be certain that this takes at least Omega(2^(n/2)) time in the worst case. You can probably get a tighter bound, but this should make you realize your solution will not scale well.
This problem is a pretty common adaptation of the Maximum Non-Adjacent Sum problem.
You should be able to use dynamic programming on this; I would highly recommend it. Imagine we are finding the solution for node i. Let's assume we already have the solutions for nodes i.left and i.right, and also for their children (i's grandchildren). We now have 2 options for i's max solution:
max-sum(i.left) + max-sum(i.right)
i.val + max-sum(i.left.left) + max-sum(i.left.right) + max-sum(i.right.left) + max-sum(i.right.right)
You take the max of these two and that's your solution for i. You can perform this DP bottom-up or add memoization to your current program. Either should work. The best part is, now your solution is O(n)!
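To make that concrete, here is a minimal Java sketch of the same idea in pair form (my illustration; the helper name robSub is mine, not from the answer above): each call returns the best sum both with and without the current node, which carries exactly the child/grandchild information described above.

public int rob(TreeNode root) {
    int[] best = robSub(root);
    return Math.max(best[0], best[1]);
}

// Returns {best sum if root is skipped, best sum if root is taken}.
private int[] robSub(TreeNode root) {
    if (root == null) return new int[2];
    int[] left = robSub(root.left);
    int[] right = robSub(root.right);
    // Skip root: each child may independently be taken or skipped.
    int skip = Math.max(left[0], left[1]) + Math.max(right[0], right[1]);
    // Take root: both children must be skipped (the grandchildren are
    // already folded into left[0] and right[0]).
    int take = root.val + left[0] + right[0];
    return new int[]{skip, take};
}

Each node is visited exactly once, so this runs in O(n) with O(height) stack space.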
I am working on a binary tree problem: find the rightmost node in the last level of a complete binary tree, in O(n) time. Doing it in O(n) is simple by traversing all the elements, but is there a way to do this in less than O(n)? I have browsed the internet a lot and couldn't find anything on this.
Thanks in advance.
Yes, you can do it in O(log(n)^2) by doing a variation of binary search.
This can be done by first going to the leftmost element(1), then to the 2nd leftmost element, then to the 4th, the 8th, ... until you find there is no such element.
Let's say the last element you found was the i-th, and the first one you didn't find was the 2i-th.
Now you can simply do a binary search over that range.
This is O(log(n/2)) = O(log n) total iterations, and since each iteration goes down the entire tree, it's a total of O(log(n)^2) time.
(1) Here and in what follows, the "x-th leftmost element" refers only to the nodes in the deepest level of the tree.
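Here is a minimal Java sketch of this approach (my own illustration; names like kthLeaf are hypothetical, and it assumes a standard TreeNode with val/left/right and a complete tree):

// Height measured down the left spine; in a complete tree this is the
// true height, and it costs O(log n).
static int height(TreeNode root) {
    int h = 0;
    for (TreeNode t = root; t != null; t = t.left) h++;
    return h;
}

// Returns the k-th leftmost node (1-indexed) of the deepest level, or null
// if that level has fewer than k nodes. The h-1 low bits of k-1, read from
// the most significant down, spell out the left/right path: O(log n).
static TreeNode kthLeaf(TreeNode root, int h, long k) {
    TreeNode t = root;
    for (int bit = h - 2; bit >= 0 && t != null; bit--)
        t = (((k - 1) >> bit) & 1) == 0 ? t.left : t.right;
    return t;
}

static int rightmostLastLevel(TreeNode root) {
    int h = height(root);
    if (h == 1) return root.val;
    long cap = 1L << (h - 1);   // max possible nodes on the last level
    // Doubling phase: grow i until the 2i-th leaf is absent (or cannot exist).
    long i = 1;
    while (2 * i <= cap && kthLeaf(root, h, 2 * i) != null) i *= 2;
    // Binary search in [i, min(2i - 1, cap)] for the last present leaf.
    long lo = i, hi = Math.min(2 * i - 1, cap);
    while (lo < hi) {
        long mid = (lo + hi + 1) / 2;
        if (kthLeaf(root, h, mid) != null) lo = mid;
        else hi = mid - 1;
    }
    return kthLeaf(root, h, lo).val;
}

height and kthLeaf each cost O(log n), and both the doubling phase and the binary search call kthLeaf O(log n) times, which gives the O(log(n)^2) total.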
I assume that you know the number of nodes. Let n be that number.
In a complete binary tree, level i has twice as many nodes as level i - 1.
So, you could iteratively divide n by 2. If there is a remainder, then n is a right child; otherwise, it is a left child. You store into a sequence, preferably a stack, whether there was a remainder or not.
Something like this:
std::stack<char> s;
while (n > 1)
{
    if (n % 2 == 0)
        s.push('L');
    else
        s.push('R');
    n = n / 2; // n is an int, so the division floors
}
When the while loop finishes, the stack contains the path to the rightmost node.
The number of times the while loop executes is log_2(n).
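A worked example (mine): for n = 12 (binary 1100), the loop pushes 'L' (12 is even), 'L' (6 is even), then 'R' (3 is odd); popping yields R, L, L, i.e. go right, then left, then left from the root, which lands exactly on node 12, the rightmost node of the last level.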
This is a recursive solution with O(lg n * lg n) time complexity and O(lg n) space complexity (counting stack storage).
The space complexity can be reduced to O(1) with an iterative version of the code below.
// helper function: length of the left spine below node
int getLeftHeight(TreeNode *node) {
    int c = 0;
    while (node) {
        c++;
        node = node->left;
    }
    return c;
}

int getRightMostElement(TreeNode *node) {
    int h = getLeftHeight(node);
    // base case: we have reached the rightmost element in the last level, our answer
    if (h == 1)
        return node->val;
    // answer lies in the right subtree
    else if (h - 1 == getLeftHeight(node->right))
        return getRightMostElement(node->right);
    // answer lies in the left subtree
    else
        return getRightMostElement(node->left);
}
Time complexity derivation:
At each recursion step we descend into either the left or the right subtree, i.e. n/2 elements, so there are at most lg n function calls, and each call computes a height in lg n time:

T(n) = T(n/2) + c lg n
     = T(n/4) + c lg n + c (lg n - 1)
     = ...
     = T(1) + c [lg n + (lg n - 1) + (lg n - 2) + ... + 1]
     = O(lg n * lg n)
Since it's a complete binary tree, going down the right children until you reach a leaf takes O(log N), not O(N). In a regular binary tree it would take O(N), because in the worst case all the nodes are lined up to the right, but since this is a complete binary tree, that can't happen.
I tried the classical problem of implementing an algorithm that prints all valid combinations of n pairs of parentheses.
I found this program (which works perfectly):
public static void addParen(ArrayList<String> list, int leftRem, int rightRem, char[] str, int count) {
    if (leftRem < 0 || rightRem < leftRem) return; // invalid state
    if (leftRem == 0 && rightRem == 0) { // all out of left and right parentheses
        String s = String.copyValueOf(str);
        list.add(s);
    } else {
        if (leftRem > 0) { // try a left paren, if there are some available
            str[count] = '(';
            addParen(list, leftRem - 1, rightRem, str, count + 1);
        }
        if (rightRem > leftRem) { // try a right paren, if there's a matching left
            str[count] = ')';
            addParen(list, leftRem, rightRem - 1, str, count + 1);
        }
    }
}

public static ArrayList<String> generateParens(int count) {
    char[] str = new char[count * 2];
    ArrayList<String> list = new ArrayList<String>();
    addParen(list, count, count, str, 0);
    return list;
}
As I understand it, the idea is that we add a left bracket whenever possible. A right bracket is added only if the number of remaining right brackets is greater than the number of remaining left ones. Once we have used up all the left and right parentheses, we add the newly constructed combination to the result. We can be sure there will not be any duplicate strings.
To me, this recursion is like doing a pre-order traversal of a tree: we go to a left node EACH time it is possible; if not, we go right, and then try to go left again right after that step. If we can't, we "come back", go right, and repeat the traversal. In my opinion, it's exactly the same idea here.
So, naively, I thought the time complexity would be something like O(log(n)), O(n log(n)), or something else involving a logarithm. But when I searched around, I found something called "Catalan numbers", which can be used to count the number of combinations of parentheses... (https://anonymouscoders.wordpress.com/2015/07/20/its-all-about-catalan/)
What is the time complexity in your opinion? Can we apply the master theorem here or not?
The complexity of this code is O(n * Cat(n)), where Cat(n) is the nth Catalan number. There are Cat(n) valid combinations of parentheses (see https://en.wikipedia.org/wiki/Catalan_number), and for each one a string of length 2n is created.
Since Cat(n) = choose(2n, n) / (n + 1), O(n * Cat(n)) = O(choose(2n, n)) = O(4^n / sqrt(n)) (see https://en.wikipedia.org/wiki/Central_binomial_coefficient).
There are two main flaws in your reasoning. The first is that the search tree is not balanced: the tree you search after closing a right brace is not the same size as the tree you search after adding another left brace, so the more common methods for computing complexity don't work. The second is that even if you assume the tree is balanced, the height of the search tree would be n and the number of leaves found O(2^n). This differs from the analysis of a binary search tree, where you usually have n things in the tree and the height is O(log n).
I don't think there's any standard way to compute the time complexity here -- ultimately you're going to be reproducing something like the math done when you count valid parenthetical strings -- and the Master theorem isn't going to power you through that.
But there is a useful insight here: if a program generates f(n) things, and the cost of generating each one is c(n), then the program's complexity can't be better than O(c(n)f(n)). Here, f(n) = Cat(n) and c(n) = 2n, so you can quickly get a lower bound for the complexity even when analyzing the code is difficult. This trick would have immediately led you to discard the idea that the complexity is O(log n) or O(n log n).
For a binary search of a sorted array of 2^n-1 elements in which the element we are looking for appears, what is the amortized worst-case time complexity?
Found this on my review sheet for my final exam. I can't even figure out why we would want amortized time complexity for binary search, because its worst case is O(log n). According to my notes, the amortized cost calculates the upper bound of an algorithm and then divides it by the number of items, so wouldn't that be as simple as the worst-case time complexity divided by n, meaning O(log n)/(2^n - 1)?
For reference, here is the binary search I've been using:
public static boolean binarySearch(int x, int[] sorted) {
    int s = 0;                 // start
    int e = sorted.length - 1; // end
    while (s <= e) {
        int mid = s + (e - s) / 2;
        if (sorted[mid] == x)
            return true;
        else if (sorted[mid] < x)
            s = mid + 1;
        else
            e = mid - 1;
    }
    return false;
}
I'm honestly not sure what this means - I don't see how amortization interacts with binary search.
Perhaps the question is asking what the average cost of a successful binary search would be. You could imagine binary searching for all n elements of the array and looking at the average cost of such an operation. In that case, there's one element for which the search makes one probe, two for which the search makes two probes, four for which it makes three probes, etc. This averages out to O(log n).
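Concretely (my arithmetic, not part of the original answer): for n = 2^h - 1 elements,

average probes = (1*1 + 2*2 + 4*3 + ... + 2^(h-1)*h) / n
               = ((h-1)*2^h + 1) / (2^h - 1)
               ≈ h - 1
               = O(log n)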
Hope this helps!
Amortized cost is the total cost over all possible queries divided by the number of possible queries. You will get slightly different results depending on how you count queries that fail to find the item. (Either don't count them at all, or count one for each gap where a missing item could be.)
So for a search of 2^n - 1 items (just as an example to keep the math simple), there is one item you would find on your first probe, 2 items would be found on the second probe, 4 on the third probe, ... 2^(n-1) on the nth probe. There are 2^n "gaps" for missing items (remembering to count both ends as gaps).
With your algorithm, finding an item on probe k costs 2k-1 comparisons. (That's 2 compares for each of the k-1 probes before the kth, plus one where the test for == returns true.) Searching for an item not in the table costs 2n comparisons.
I'll leave it to you to do the math, but I can't leave the topic without expressing how irked I am when I see binary search coded this way. Consider:
public static boolean binarySearch(int x, int[] sorted) {
    int s = 0;             // start
    int e = sorted.length; // end
    // Loop invariant: if x is at sorted[k] then s <= k < e
    int mid = (s + e) / 2;
    while (mid != s) {
        if (sorted[mid] > x)
            e = mid;
        else
            s = mid;
        mid = (s + e) / 2;
    }
    return (mid < e) && (sorted[mid] == x); // mid == e means the array was empty
}
You don't short-circuit the loop when you hit the item you're looking for, which seems like a defect, but on the other hand you do only one comparison on every item you look at, instead of two comparisons on each item that doesn't match. Since half of all items are found at leaves of the search tree, what seems like a defect turns out to be a major gain. Indeed, the number of elements where short-circuiting the loop is beneficial is only about the square root of the number of elements in the array.
Grind through the arithmetic, computing the amortized search cost (counting "cost" as the number of comparisons to sorted[mid]), and you'll see that this version is approximately twice as fast. It also has constant cost (within ±1 comparison), depending only on the number of items in the array and not on where, or even whether, the item is found. Not that that's important.