Big O(h) vs. Big O(logn) in trees

I have a question about the time complexity of tree operations.
It's said (Data Structures, Horowitz et al.) that the time complexity of insertion, deletion, search, finding the min and max, and finding successor and predecessor nodes is O(h) in BSTs, while in AVL trees it is O(logn).
I don't exactly understand what the difference is. With h=[logn]+1 in mind, why do we say O(h) in one place and O(logn) in another?

h is the height of the tree. It is always Omega(logn) [not asymptotically smaller than logn]. It can be very close to logn in a complete tree (then you really get h=[logn]+1), but in a tree that has decayed to a chain (each node has only one child) it is Theta(n).
For balanced trees, h=O(logn) (and in fact it is Theta(logn)), so any O(h) algorithm on those is actually O(logn).
The idea of self-balancing search trees (and AVL is one of them) is to prevent the cases where the tree decays to a chain (or somewhere close to it); the balancing rules guarantee O(logn) height.
To understand this issue better, consider the next two trees (and forgive me for being a terrible ASCII artist):
    tree 1          tree 2
      5                4
     /               /   \
    4               2     6
   /               / \   / \
  3               1   3 5   7
Both are valid binary search trees, and in both, searching for an element (say 1) will be O(h). But in the first, O(h) is actually O(n), while in the second it is O(logn).

O(h) means the complexity depends linearly on the tree height. If the tree is balanced, this bound becomes O(logn) (n is the number of elements). But that is not true for all trees. Imagine a very unbalanced binary tree where each node has only a left child: this tree is similar to a list, and the number of elements in it equals its height. The complexity of the described operations will then be O(n) instead of O(logn).
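To make the difference between h and n concrete, here is a minimal Python sketch. The names Node, height, search_steps and chain are made up for this illustration, and height is counted in levels (a single node has height 1). It builds a 7-node chain and the 7-node balanced tree, then counts how many nodes a search for 1 visits in each.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def search_steps(root, key):
    """Count the nodes a BST search visits before it finds key or falls off the tree."""
    steps = 0
    node = root
    while node is not None:
        steps += 1
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return steps

def chain(n):
    """Degenerate BST with keys n, n-1, ..., 1 where every node has only a left child."""
    root = None
    for key in range(1, n + 1):
        root = Node(key, left=root)
    return root

degenerate = chain(7)
balanced = Node(4,
                Node(2, Node(1), Node(3)),
                Node(6, Node(5), Node(7)))

print(height(degenerate), search_steps(degenerate, 1))  # height equals n here
print(height(balanced), search_steps(balanced, 1))      # height is about log2(n)
```

Both searches are O(h); the only thing that changes between the two trees is how h relates to n.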


Skewed binary tree vs Perfect binary tree - space complexity

Does a skewed binary tree take more space than, say, a perfect binary tree?
I was solving question #654, Maximum Binary Tree, on Leetcode: given an array, you have to build a binary tree whose root is the maximum number in the array, and whose left and right subtrees are built on the same principle from the subarrays to the left and right of that maximum. The solution concludes that in the average and best case (perfect binary tree) the space taken is O(log(n)), and in the worst case (skewed binary tree) it is O(n).
For example, given nums = [1,3,2,7,4,6,5],
the tree would be as such,
        7
      /   \
     3     6
    / \   / \
   1   2 4   5
and if given nums = [7,6,5,4,3,2,1],
the tree would be as such,
  7
   \
    6
     \
      5
       \
        4
         \
          3
           \
            2
             \
              1
According to my understanding they both should take O(n) space, since they both have n nodes. So I don't understand how they come to that conclusion.
Thanks in advance.
Under "Space complexity," it says:
Space complexity : O(n). The size of the set can grow up to n in the worst case. In the average case, the size will be log n for n elements in nums, giving an average case complexity of O(logn).
It's poorly worded, but it is correct. It's talking about the amount of memory required during construction of the tree, not the amount of memory that the tree itself occupies. As you correctly pointed out, the tree itself will occupy O(n) space, regardless if it's balanced or degenerate.
Consider the array [1,2,3,4,5,6,7]. You want the root to be the highest number, and the left to be everything that's to the left of the highest number in the array. Since the array is in ascending order, what happens is that you extract the 7 for the root, and then make a recursive call to construct the left subtree. Then you extract the 6 and make another recursive call to construct that node's left subtree. You continue making recursive calls until you place the 1. In all, you have six nested recursive calls: O(n).
Now look what happens if your initial array is [1,3,2,7,5,6,4]. You first place the 7, then make a recursive call with the subarray [1,3,2]. Then you place the 3 and make a recursive call to place the 1. Your tree is:
        7
       /
      3
     /
    1
At this point, your call depth is 2. You return and place the 2. Then return from the two recursive calls. The tree is now:
        7
       /
      3
     / \
    1   2
Constructing the right subtree also requires a call depth of 2. At no point is the call depth more than two. That's O(log n).
It turns out that the call stack depth is the same as the tree's height. The height of a perfect tree is O(log n), and the height of a degenerate tree is O(n).
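The call depths described above can be checked with a small sketch. This is not the Leetcode reference solution, only a straightforward recursive construction; build_max_tree and the tuple layout (value, left, right) are assumptions made for the example. The depth counter starts at 0 for the top-level call, so it counts nested recursive calls the same way the walkthrough does.

```python
def build_max_tree(nums):
    """Return (tree, deepest) where tree is a nested tuple (value, left, right)
    and deepest is the largest number of nested recursive calls that placed a node."""
    deepest = 0

    def build(lo, hi, depth):
        nonlocal deepest
        if lo >= hi:              # empty subarray: nothing to place
            return None
        deepest = max(deepest, depth)
        # index of the maximum value in nums[lo:hi]
        i = max(range(lo, hi), key=lambda j: nums[j])
        left = build(lo, i, depth + 1)
        right = build(i + 1, hi, depth + 1)
        return (nums[i], left, right)

    tree = build(0, len(nums), 0)
    return tree, deepest

skewed_tree, skewed_depth = build_max_tree([1, 2, 3, 4, 5, 6, 7])
perfect_tree, perfect_depth = build_max_tree([1, 3, 2, 7, 5, 6, 4])
print(skewed_depth, perfect_depth)
```

Both trees hold 7 nodes, so the trees themselves take the same space; only the recursion depth needed to build them differs.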

Time Efficiency of Binary Search Tree

For the time efficiency of inserting into a binary search tree,
I know that the best/average case of insertion is O(log n), whereas the worst case is O(n).
What I'm wondering is whether there is any way to ensure that we always get the best/average case when inserting, besides implementing an AVL tree (balanced BST)?
There is no guaranteed log n complexity without balancing a binary search tree. While searching/inserting/deleting, you have to navigate through the tree in order to position yourself at the right place and perform the operation. The key question is: how many steps are needed to get to the right position? If the BST is perfectly balanced, level i holds 2^(i-1) nodes. So if the tree has k levels (k is called the height of the tree), the number of nodes in the tree is 1 + 2 + 4 + .. + 2^(k-1) = 2^k - 1 = n, which gives k = log2(n + 1), roughly log n, and that is the number of steps needed to navigate from the root to a leaf.
Having said that, there are various implementations of balanced BST. You mentioned AVL, the other very popular is red-black tree, which is used e.g. in C++ for implementing std::map or in Java for implementing TreeMap.
The worst case, O(n), can happen when you don't balance the BST and your tree degenerates into a linked list. To reach the end of that list (which is the worst case), you have to iterate through the whole list, and this requires n steps.
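As a rough illustration of the two cases, the sketch below inserts the same 1000 keys into a plain, unbalanced BST twice: once in sorted order and once in a shuffled order. The function build_bst, the dict-based nodes, and the fixed random seed are all assumptions made for this example. A shuffled insertion order usually gives a shallow tree, but that is only an expectation, not a guarantee.

```python
import random

def build_bst(keys):
    """Insert keys one by one into an unbalanced BST.
    Returns (root, height_in_levels, total_key_comparisons)."""
    root = None
    height = 0
    comparisons = 0
    for key in keys:
        node = {"key": key, "left": None, "right": None}
        if root is None:
            root = node
            height = 1
            continue
        cur, depth = root, 1
        while True:
            comparisons += 1
            side = "left" if key < cur["key"] else "right"
            if cur[side] is None:
                cur[side] = node          # found the empty slot for the new key
                height = max(height, depth + 1)
                break
            cur, depth = cur[side], depth + 1
    return root, height, comparisons

n = 1000
_, sorted_height, sorted_cmp = build_bst(range(n))

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)        # fixed seed so the run is repeatable
_, shuffled_height, shuffled_cmp = build_bst(shuffled)

print(sorted_height, sorted_cmp)
print(shuffled_height, shuffled_cmp)
```

With sorted input every new key walks the whole chain, so the height is n and the total work is n(n-1)/2 comparisons.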

Can I achieve begin insertion on a binary tree in O(log(N))?

Consider a binary tree and some traversal criterion that defines an ordering of the tree's elements.
Is there some particular traversal criterion that would allow a begin_insert operation, i.e. adding a new element that ends up at position 1 in the ordering induced by the traversal criterion, with O(log(N)) cost?
I don't have any strict requirement, like the tree guaranteed to be balanced.
But I cannot accept lack of balance if that allows degeneration to O(N) in worst case scenarios.
Let's try to see if in-order traversal would work.
Consider the BT (not a binary search tree)
        6
      /   \
    13     5
   /  \   /
  2    8 9
In-order traversal gives 2-13-8-6-9-5
Perform begin_insert(7) in such a way that in-order traversal gives 7-2-13-8-6-9-5:
          6
        /   \
      13     5
     /  \   /
    2    8 9
   /
  7
Now, I think this is not a legitimate O(log(N)) strategy, because if I keep adding values in this way the cost degenerates into O(N) as the tree becomes increasingly unbalanced
            6
          /   \
        13     5
       /  \   /
      2    8 9
     /
    7
   /
 ...
This strategy would work if I rebalance the tree by preserving ordering:
        8
      /   \
     2     9
    / \   / \
   7  13 6   5
but this costs at least O(N).
According to this example my conclusion would be that in-order traversal does not solve the problem, but since I received feedback that it should work maybe I am missing something?
Inserting, deleting and finding in a binary tree all rely on the same search algorithm to find the right position to do the operation. The complexity of this is O(h), where h is the height of the tree. The reason is that to find the right location you start at the root node and compare keys to decide if you should go into the left subtree or the right subtree, and you do this until you find the right location. The worst case is when you have to travel down the longest chain, which is also the definition of the height of the tree.
If you don't have any constraints and allow any tree then this is going to be O(N) since you allow a tree with only left children (for example).
If you want better guarantees, you must use algorithms that promise an upper bound on the height of the tree. For example, an AVL tree keeps the heights of the two subtrees of every node within 1 of each other, so the height is always O(log N) and all the operations above run in O(log N). Red-black trees are less strictly balanced, but they guarantee that the longest root-to-leaf path is at most twice as long as the shortest one, so the height is still O(log N).
Depending on your usage patterns you might be able to find more specialized data structures that give even better complexity (see Fibonacci heap).
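One detail that may help with the question's worry about O(N) rebalancing: AVL and red-black trees restore balance with rotations, and a single rotation changes a constant number of links while leaving the in-order sequence untouched. The sketch below is only that one building block, not a full AVL implementation. Node, in_order, height and rotate_right are names made up for the example, and the second inserted value (1) is hypothetical, standing in for "another begin_insert after the 7".

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def in_order(node):
    """Keys of the subtree in in-order (left, node, right) sequence."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)

def height(node):
    """Height in levels; an empty tree has height 0."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def rotate_right(node):
    """Rotate right around node and return the new subtree root.
    Only two child links change, so this is O(1)."""
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    return pivot

# The example tree after begin_insert(7) and a hypothetical begin_insert(1):
# 6 -> 13 -> 2 -> 7 -> 1 hangs down the left side.
root = Node(6,
            Node(13, Node(2, Node(7, Node(1))), Node(8)),
            Node(5, Node(9)))

order_before, height_before = in_order(root), height(root)

# Rotate right around the node holding 2, so 7 moves up one level.
root.left.left = rotate_right(root.left.left)

order_after, height_after = in_order(root), height(root)
print(order_before == order_after, height_before, height_after)
```

A self-balancing tree decides where to rotate from per-node bookkeeping (heights or colours) along the insertion path, which is what keeps the whole insert at O(log N).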

Why is it important that a binary tree be balanced?

Why is it important that a binary tree be balanced
Imagine a tree that looks like this:
This is a valid binary tree, but now most operations are O(n) instead of O(lg n).
The balance of a binary tree is governed by the property called skewness. The more skewed a tree is, the higher the time complexity of accessing an element of the binary tree. Consider this tree:
/ \
2 3
\ \
7 4
The above is also a binary tree, but right skewed. It has 7 elements, so a perfectly balanced tree would need at most 3 lookups (7 nodes fit in 3 levels). Here you need to go one level deeper, so 4 lookups in the worst case. So the skewness here costs a constant 1 extra step. But consider a tree with thousands of nodes: the extra depth can be far larger in that case. So it is important to keep the binary tree balanced.
But again, how much skewness to tolerate is a topic of debate, as the probability analysis of random binary search trees shows that a tree built from n randomly ordered keys has an expected height of only about 4.3 ln n. So it is really a matter of the cost of balancing vs. the cost of the skewness.
One more interesting thing, computer scientists have even found an advantage in the skewness and proposed a skewed datastructure called skew heap
To ensure log(n) search time, each step down the tree needs to discard about half of the remaining nodes. For example, if you have a linear tree, never branching from the root to the leaf node, then the search time will be linear, as in a linked list.
An extremely unbalanced tree, for example a tree where all nodes are linked to the left, means you still search through every single node before finding the last one, which is not the point of a tree at all and has no benefit over a linked list. Balancing the tree makes for better search times O(log(n)) as opposed to O(n).
As we know, most operations on binary search trees take time proportional to the height of the tree, so it is desirable to keep the height small. That keeps the search time at O(log(n)).
Beyond that, most of the available tree balancing techniques aim to keep the tree
perfectly full or close to being perfectly balanced.
In the end, you want to keep your tree simple, so go for a well-known balanced binary tree such as a red-black tree or an AVL tree.
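What "balanced" means can be pinned down in a few lines. The sketch below uses the AVL-style definition (at every node, the heights of the two subtrees differ by at most 1), which is one common definition among several. The function check and the tuple layout (key, left, right) are assumptions made for the example.

```python
def check(node):
    """Return (height, balanced) for a tree given as nested tuples (key, left, right).
    balanced is True when every node's subtree heights differ by at most 1."""
    if node is None:
        return 0, True
    _, left, right = node
    left_height, left_ok = check(left)
    right_height, right_ok = check(right)
    ok = left_ok and right_ok and abs(left_height - right_height) <= 1
    return 1 + max(left_height, right_height), ok

# A right-leaning chain 1 -> 2 -> 3 and a balanced tree holding the same keys.
chain = (1, None, (2, None, (3, None, None)))
bushy = (2, (1, None, None), (3, None, None))

print(check(chain))
print(check(bushy))
```

The same three keys give height 3 in the chain and height 2 in the balanced tree; the gap grows from a constant to a factor of n / log n as trees get larger.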

In Big-O notation for tree structures: Why do some sources refer to O(logN) and some to O(h)?

In researching complexity for any algorithm that traverses a binary search tree, I see two different ways to express the same thing:
Version #1: The traversal algorithm at worst case compares once per height of the tree; therefore complexity is O(h).
Version #2: The traversal algorithm at worst case compares once per height of the tree; therefore complexity is O(logN).
It seems to me that the same logic is at work, yet different authors use either logN or h. Can someone explain to me why this is the case?
The correct value for the worst-case time to search a tree is O(h), where h is the height of the tree. If you are using a balanced search tree (one where the height of the tree is O(log n)), then the lookup time is worst-case O(log n). That said, not all trees are balanced. For example, here's a tree with height n - 1:
Here, h = O(n), so the lookup is O(n). It's correct to say that the lookup time is also O(h), but h ≠ O(log n) in this case and it would be erroneous to claim that the lookup time was O(log n).
In short, O(h) is the correct bound. O(log n) is the correct bound in a balanced search tree when the height is at most O(log n), but not all trees have lookup time O(log n) because they may be imbalanced.
Hope this helps!
If your binary tree is perfect, so that every internal node has exactly two child nodes and all leaves are on the same level, then the number of nodes in the tree will be exactly N = 2^h − 1 (with h counted in levels), so the height is the logarithm of the number of elements (and similarly for any perfect k-ary tree).
An arbitrary, unconstrained tree may look totally different, though; for instance, it could just have one node at every level, so N = h in that case. So the height is the more general measure, as it relates to actual comparisons, but under the additional assumption of balance you can express the height as the logarithm of the number of elements.
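The two extremes can be written down as tiny helper functions. This is a sketch with made-up names (perfect_size, min_height, max_height), counting height in levels so that a single node has height 1.

```python
def perfect_size(h):
    """Number of nodes in a perfect binary tree with h levels: 2^h - 1."""
    return 2 ** h - 1

def min_height(n):
    """Smallest possible height (in levels) of a binary tree with n nodes.
    Equals ceil(log2(n + 1)), computed exactly with integer arithmetic."""
    return n.bit_length()

def max_height(n):
    """Largest possible height: one node per level, i.e. a chain."""
    return n

for n in (7, 1000, 10 ** 6):
    print(n, min_height(n), max_height(n))
```

Any binary tree with n nodes has a height somewhere between these two values, which is why O(h) is the safe general statement and O(log n) needs the extra assumption of balance.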
O(h) would refer to a binary tree that is sorted but not balanced
O(logn) would refer to a tree that is sorted and balanced
It's sort of two ways of saying the same thing, because your average balanced binary tree of height 'h' will have around 2^h nodes.
Depending on the context, either height or #nodes may be more relevant, and so that's what you'll see referenced.
Because the (h)eight of a balanced tree varies as the log of the (N)umber of elements.
