which data structure to use for getting all numbers less than a given number - algorithm

Which data structure is most appropriate for queries like insertion, deletion and getting all the elements
whose value is less than a given number. BST or Priority Queue? or any other data structure.

BST seems like the one you need. Insertion, deletion with O(h) complexity. Getting all the elements less than a given number is also O(h), you just need a preorder traversal to find the node of whole left tree.
h is the height of the tree.
If you want to be more stable, maintain an AVL gives you O(logn) complexity over BST.

Related

inserting into 2-3-4 tree number of nodes

I am implementing 2-3-4 tree for some kind of memory managment.During init of my application I want to insert there some number of integers (get it as input - say n)
What is a complexity of such an insert?
O(nloglog(n))?
The complexity of the insertion on a 2-3-4 tree is O(log(n)).
We can see a quota from Wikipedia
2–3–4 trees are B-trees of order 4 (Knuth 1998)
The complexity of insertion on B-tree is O(log(n)), so is the 2-3-4 tree. By repeating inserting to initialize the 2-3-4 tree, we may say the time cost of init is O(n*log(n)). But we can expect a special way to construct [link]:
In applications, it is frequently useful to build a B-tree to represent a large existing collection of data and then update it incrementally using standard B-tree operations. In this case, the most efficient way to construct the initial B-tree is not to insert every element in the initial collection successively, but instead to construct the initial set of leaf nodes directly from the input, then build the internal nodes from these. This approach to B-tree construction is called bulkloading. Initially, every leaf but the last one has one extra element, which will be used to build the internal nodes.
The time cost may be (n + n/4 + n/16 + ... + n/(4^h)). Based on the sum of Geometric Progression. I calculate the time cost. It is less than (4/3)*n.
Please point out if I have any mistake during the calculation.

Complexity of inserting n numbers into a binary search tree

I have got a question, and it says "calculate the tight time complexity for the process of inserting n numbers into a binary search tree". It does not denote whether this is a balanced tree or not. So, what answer can be given to such a question? If this is a balanced tree, then height is logn, and inserting n numbers take O(nlogn) time. But this is unbalanced, it may take even O(n2) time in the worst case. What does it mean to find the tight time complexity of inserting n numbers to a bst? Am i missing something? Thanks
It could be O(n^2) even if the tree is balanced.
Suppose you're adding a sorted list of numbers, all larger than the largest number in the tree. In that case, all numbers will be added to the right child of the rightmost leaf in the tree, Hence O(n^2).
For example, suppose that you add the numbers [15..115] to the following tree:
The numbers will be added as a long chain, each node having a single right hand child. For the i-th element of the list, you'll have to traverse ~i nodes, which yields O(n^2).
In general, if you'd like to keep the insertion and retrieval at O(nlogn), you need to use Self Balancing trees.
What wiki is saying is correct.
Since the given tree is a BST, so one need not to search through entire tree, just comparing the element to be inserted with roots of tree/subtree will get the appropriate node for th element. This takes O(log2n).
Once we have such a node we can insert the key there bht after that it is required push all the elements in the right aub-tree to right, so that BST's searching property does not get violated. If the place to be inserted comes to be the very last last one, we need to worry for the second procedure. If note this procedure may take O(n), worst case!.
So the overall worst case complexity of inserting an element in a BST would be O(n).
Thanks!

Efficient data structure that returns smallest element greater or equal than given key and allows decrease key

Is there an efficient (log n) data structure that allows the following operations:
Return the smallest element that is greater or equal to a given key
Exchange this element with a smaller one and rearrange the structure accordingly
The number of elements is known and will not change during lifetime.
You could implement a balanced binary tree like a Red-Black Tree
A Red-Black tree has O(log(n)) time complexity for search, insertion and deletion.
You will have to make some modification to return the smallest element that is greater or equal to a given key. But the basic behavior is provided by this data structure I guess.

Find median in O(1) in binary tree

Suppose I have a balanced BST (binary search tree). Each tree node contains a special field count, which counts all descendants of that node + the node itself. They call this data structure order statistics binary tree.
This data structure supports two operations of O(logN):
rank(x) -- number of elements that are less than x
findByRank(k) -- find the node with rank == k
Now I would like to add a new operation median() to find the median. Can I assume this operation is O(1) if the tree is balanced?
Unless the tree is complete, the median might be a leaf node. So in general case the cost will be O(logN). I guess there is a data structure with requested properties and with a O(1) findMedian operation (Perhaps a skip list + a pointer to the median node; I'm not sure about findByRank and rank operations though) but a balanced BST is not one of them.
If the tree is complete (i.e. all levels completely filled), yes you can.
In a balanced order statistics tree, finding the median is O(log N). If it is important to find the median in O(1) time, you can augment the data structure by maintaining a pointer to the median. The catch, of course, is that you would need to update this pointer during each Insert or Delete operation. Updating the pointer would take O(log N) time, but since those operations already take O(log N) time, the extra work of updating the median pointer does not change their big-O cost.
As a practical matter, this only makes sense if you do a lot of "find median" operations compared to the number of insertions/deletions.
If desired, you could reduce the cost of updating the median pointer during Insert/Delete to O(1) by using a (doubly) threaded binary tree, but Insert/Delete would still be O(log N).

Tree Datastructures

I've tried to understand what sorted trees are and binary trees and avl and and and ...
I'm still not sure, what makes a sorted tree sorted? And what is the complexity (Big-Oh) between searching in a sorted and searching in an unsorted tree? Hope you can help me.
Binary Trees
There exists two main types of binary trees, balanced and unbalanced. A balanced tree aims to keep the height of the tree (height = the amount of nodes between the root and the furthest child) as even as possible. There are several types of algorithms for balanced trees, the two most famous being AVL- and RedBlack-trees. The complexity for insert/delete/search operations on both AVL and RedBlack trees is O(log n) or better - which is the important part. Other self balancing algorithms are AA-, Splay- and Scapegoat-tree.
Balanced trees gain their property (and name) of being balanced from the fact that after every delete or insert operation on the tree the algorithm introspects the tree to make sure it's still balanced, if it's not it will try to fix this (which is done differently with each algorithm) by rotating nodes around in the tree.
Normal (or unbalanced) binary trees do not modify their structure to keep themselves balanced and have the risk of, most often overtime, to become very inefficient (especially if the values are inserted in order). However if performance is of no issue and you mainly want a sorted data structure then they might do. The complexity for insert/delete/search operations on an unbalanced tree range from O(1) (best case - if you want the root) to O(n) (worst-case if you inserted all nodes in order and want the largest node)
There exists another variation which is called a randomized binary tree which uses some kind of randomization to make sure the tree doesn't become fully unbalanced (which is the same as a linked list)
A binary search tree is an "tree"-structure where every node has two children-nodes.
The left nodes all have the property of being less than its parent, and the right-nodes are all greater than its parent.
The intressting thing with an binary-tree is that we can search for an value in O(log n) when the tree is properly sorted. Doing the same search in an LinkedList for an example would give us the searchspeed of O(n).
The best way to go about learning datastructures would be to do a day of googling and reading wikipedia articles.
This might get you started
http://en.wikipedia.org/wiki/Binary_search_tree
Do a google search for the following:
site:stackoverflow.com binary trees
to get a list of SO questions which will answer your several questions.
There isn't really a lot of point in using a tree structure if it isn't sorted in some fashion - if you are planning on searching for a node in the tree and it is unsorted, you will have to traverse the entire tree (O(n)). If you have a tree which is sorted in some fashion, then it is only necessary to traverse down a single branch of the tree (typically O(log n)).
In binary tree the right leaf is always smaller then the head, and the left leaf is always bigger, so you can search in sorted tree in O(log(n)), you just need to go right if if the key is smaller than head and to the left if bgger

Resources