What is the difference between binary search and depth-first search?

Executing a binary search can apparently cause memory problems, although it is faster than linear search.
Of these two search methods, depth-first search and binary search, which is better suited to searching random numbers?

Depth-first search is the answer here. Because of its nature, binary search cannot search random numbers (in trees or elsewhere), only sorted ones. In a stereotypical binary search, the middle value (or root of the tree) is examined: if the target value is higher, the second half of the search domain is chosen; if it is lower, the first half. The search is then performed recursively on whichever half was chosen. For this reason, binary search will not work at all on an unsorted list of values. I will not go into the specifics of DFS since the question is already answered; I'm sure there is a good wiki on it.
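As a small sketch of the halving just described, here is an iterative binary search over a sorted int array (class and method names are just for illustration):

```java
public class BinarySearchDemo {
    // Iterative binary search over a sorted array.
    // Returns the index of target, or -1 if it is not present.
    // Note: this only works because the array is sorted.
    public static int binarySearch(int[] sorted, int target) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;            // avoids overflow of (lo + hi)
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) lo = mid + 1;  // discard the lower half
            else hi = mid - 1;                       // discard the upper half
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {2, 5, 8, 12, 23};
        System.out.println(binarySearch(a, 12)); // index 3
        System.out.println(binarySearch(a, 7));  // -1, not present
    }
}
```

Run the same code on a shuffled array and the halving logic discards the wrong half, which is exactly why binary search fails on unsorted input.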

Binary search would not be the best option for searching random numbers.
Binary search is a search algorithm that finds an element by comparing the target value against the middle element (or root node) of the data structure. It must begin with a sorted data structure. If the target is lower than the midpoint, the lower half of the structure becomes the new search range and a new midpoint is found within it; if the target is higher, the same process is performed on the upper half. This is repeated until the value is found or the range becomes empty.
Depth-first search is a search and traversal algorithm that visits nodes in a tree or graph data structure by going down a path as far as it can before backtracking. It uses a stack to keep track of nodes whose neighbors have not all been visited yet; after all the nodes on the current path have been visited, the algorithm backtracks using the stack.
DFS would be the best option for searching random numbers, because the prerequisite of binary search is that the input is sorted initially. If the numbers are not sorted, it defeats the purpose of the algorithm. DFS can find a value in a structure of random numbers in O(V + E) time.
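A minimal sketch of an iterative DFS over an arbitrary (unsorted) tree, assuming a hypothetical `Node` type with a list of children:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DfsSearchDemo {
    // Minimal node type for an arbitrary (not necessarily binary) tree.
    static class Node {
        int value;
        List<Node> children = new ArrayList<>();
        Node(int value) { this.value = value; }
    }

    // Iterative DFS using an explicit stack; no ordering of values is assumed.
    public static boolean contains(Node root, int target) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.value == target) return true;
            // Push the children so they are explored before the siblings
            // of n's ancestors (depth-first order).
            for (Node c : n.children) stack.push(c);
        }
        return false;
    }
}
```

Because every node may have to be visited, this is the exhaustive O(V + E) search mentioned above; no sortedness is required.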

Related

Find n largest nodes in an arbitrary tree

Given an arbitrary tree (not a binary tree), each node is labeled an integer.
How can I find n largest nodes in the tree?
e.g.
If a tree contains {43, 253, 48, 62, 91, 641}, and asked for 3 largest nodes, then the algorithm should return <641, 253, 91>.
All c++ (or any language) standard library functions/data structures are allowed.
It is also allowed to add fields to the nodes, as long as it is constant space usage. Like, I can add a field to each node to let it point to its largest child, but I cannot have an ArrayList to store all of its children in sorted order.
As a new programmer, I have spent days on this question. A simple graph search algorithm (BFS, DFS) would work and be easy to implement, but it is not fast enough because it does an exhaustive search of the entire tree.
Can you please help me find a correct and fast(er) solution to this problem?
Since your tree is not a binary search tree, examining a node yields no additional information about its child nodes. Therefore, it is not possible to produce the K highest values without an exhaustive search of the entire tree. In other words, you don't get better performance than you would with an unordered array of arbitrary values.
To get the K largest values in O(N * log K) time, maintain a priority queue of K elements as you traverse your arbitrary tree.
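A sketch of that approach, using a size-K min-heap so the smallest of the current top K is always the one evicted (the `Node` type is a hypothetical helper, since the question allows any standard library structures):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.PriorityQueue;

public class TopKDemo {
    static class Node {
        int value;
        List<Node> children = new ArrayList<>();
        Node(int value) { this.value = value; }
    }

    // Traverse the whole tree once, keeping the K largest values seen so far
    // in a min-heap of size K: O(N log K) total.
    public static List<Integer> topK(Node root, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            heap.offer(n.value);
            if (heap.size() > k) heap.poll(); // evict the smallest of the K+1
            for (Node c : n.children) stack.push(c);
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder()); // largest first
        return result;
    }
}
```

On the question's example values {43, 253, 48, 62, 91, 641} with k = 3 this yields 641, 253, 91.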
Since the given tree is arbitrary, with no special properties, finding even the single highest value requires searching the entire tree, which is O(n).
For the top K highest values:
You have an O(N * log K) solution: the priority-queue approach mentioned in dasblinkenlight's answer.
You also have an O(N) solution using median-of-medians selection.

Binary Search Tree Explanation

I am trying to brush up a bit on my understanding of binary trees, and in particular binary search trees. Looking through Wikipedia showed me the following (http://en.wikipedia.org/wiki/Binary_search_tree):
"Binary search trees keep their keys in sorted order, so that lookup and other operations can use the principle of binary search: when looking for a key in a tree (or a place to insert a new key), they traverse the tree from root to leaf, making comparisons to keys stored in the nodes of the tree and deciding, based on the comparison, to continue searching in the left or right subtrees. On average, this means that each comparison allows the operations to skip over half of the tree, so that each lookup/insertion/deletion takes time proportional to the logarithm of the number of items stored in the tree. This is much better than the linear time required to find items by key in an unsorted array, but slower than the corresponding operations on hash tables."
Can someone please elaborate / explain the following portions of that description:
1) "On average, this means that each comparison allows the operations to skip over half of the tree, so that each lookup/insertion/deletion takes time proportional to the logarithm of the number of items stored in the tree."
2) [from the last sentence] "...but slower than the corresponding operations on hash tables."
1) "On average" holds when the BST is balanced, i.e. the left and right subtrees of each node contain a roughly equal number of nodes. This makes searching an O(log n) operation, because each comparison lets you discard roughly half of the remaining items.
2) On hash tables, searching, insertion and deletion all take expected O(1) time.
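A small illustration of the trade-off, using Java's TreeMap (a red-black BST, O(log n) lookups) against HashMap (expected O(1) lookups, but no ordering):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class LookupDemo {
    public static void main(String[] args) {
        // TreeMap is a red-black BST: each lookup walks root-to-leaf,
        // discarding half the remaining keys per comparison.
        NavigableMap<Integer, String> bst = new TreeMap<>();
        // HashMap: expected O(1) lookup via hashing, but keys are unordered.
        Map<Integer, String> hash = new HashMap<>();
        for (int k : new int[]{43, 253, 48, 62, 91, 641}) {
            bst.put(k, "v" + k);
            hash.put(k, "v" + k);
        }
        // Both find the key; only the BST can also answer order queries.
        System.out.println(bst.get(91));       // v91
        System.out.println(hash.get(91));      // v91
        System.out.println(bst.higherKey(91)); // 253, needs sorted order
    }
}
```

The last line is the reason BSTs stay useful despite the slower lookup: a hash table cannot answer "what is the next key after 91?" without scanning everything.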

Removing multiple items from balancing binary tree at once

I'm using a red-black binary tree with linked leaves on a project (Java's TreeMap) to quickly find and iterate through the items. The problem is that I can easily have 35,000 items or so in the tree, and several times I have to remove "all items above X", which can be almost the entire tree (say 30,000 items at once, because all of them are greater than X). Removing them one by one and rebalancing the tree each time takes too long.
Is there any algorithm that can help me here (so I can make my own tree implementation)?
You're looking for the split operation on a red/black tree, which takes the red/black tree and some value k and splits it into two red/black trees, one with all keys greater than or equal to k and one with all keys less than k. This can be implemented in O(log n) time if you augment the structure to store some extra information. In your case, since you're using Java, you can just split the tree and discard the root of the tree you don't care about so that the garbage collector can handle it.
Details on how to implement this are given in this paper, starting on page 9. It's implemented in terms of a catenate (or join) operation which combines two trees, but I think the exposition is pretty clear.
Hope this helps!
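Short of implementing split yourself, Java's TreeMap already lets you bulk-remove everything above X through a live view. Note this is not the O(log n) split from the paper (clearing the view still removes entries one by one), but it is a one-liner:

```java
import java.util.TreeMap;

public class BulkRemoveDemo {
    public static void main(String[] args) {
        TreeMap<Integer, String> map = new TreeMap<>();
        for (int k = 1; k <= 10; k++) map.put(k, "v" + k);

        int x = 4;
        // tailMap(x, false) is a live view of all keys strictly greater
        // than x; clearing the view removes those entries from the map.
        map.tailMap(x, false).clear();

        System.out.println(map.keySet()); // [1, 2, 3, 4]
    }
}
```

Use `tailMap(x, true)` instead if "above X" should include X itself.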

Optimal binary search trees for successor lookup?

There are many algorithms for finding optimal binary search trees given a set of keys and the associated probabilities of those keys being chosen. The binary search tree produced this way will have the lowest expected times to look up those elements. However, this binary search tree might not be optimal with regards to other measures. For example, if you attempt to look up a key that is not contained in the tree, the lookup time might be very large, as the tree might be imbalanced in order to optimize lookups of certain elements.
I am currently interested in seeing how to build a binary search tree from a set of keys where the goal is to minimize the time required to find the successor of some particular value. That is, I would like the tree to be structured in a way where, given some random key k, I can find the successor of k as efficiently as possible. I happen to know in advance the probability that a given random key falls in between any two of the keys the tree is constructed from.
Does anyone know of an algorithm for this problem? Or am I mistaken that the standard algorithm for building optimal binary search trees will not produce efficient trees for this use case?
So now I feel silly, because there's an easy answer to this question. :-)
You use the standard, off-the-shelf algorithm for constructing optimal binary search trees to construct a binary search tree for the set of keys. You then annotate each node so that it stores the entire range between its key and the key before it. This means that you can find the successor efficiently by doing a standard search on the optimally-built tree. If at any point the key you're looking for is found to be contained in a range held in some node, then you're done. In other words, finding the successor is equivalent to just doing a search for the value in the BST.
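The operation being optimized here is the one Java's sorted collections expose as `ceiling`: the smallest stored key greater than or equal to the query. A TreeSet is balanced rather than probability-optimal, but it shows the semantics the annotated tree answers in a single root-to-node search:

```java
import java.util.Arrays;
import java.util.TreeSet;

public class SuccessorDemo {
    public static void main(String[] args) {
        TreeSet<Integer> keys = new TreeSet<>(Arrays.asList(10, 20, 40, 80));
        // ceiling(k): smallest stored key >= k, i.e. the successor lookup.
        System.out.println(keys.ceiling(25)); // 40
        System.out.println(keys.ceiling(20)); // 20, exact matches count
        System.out.println(keys.ceiling(99)); // null, no successor exists
    }
}
```

The point of the answer above is that once each node also stores the gap below its key, an ordinary BST search for a non-member value terminates at the node holding its successor.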

Listing values in a binary heap in sorted order using breadth-first search?

I'm currently reading this paper and on page five, it discusses properties of binary heaps that it considers to be common knowledge. However, one of the points they make is something that I haven't seen before and can't make sense of. The authors claim that if you are given a balanced binary heap, you can list the elements of that heap in sorted order in O(log n) time per element using a standard breadth-first search. Here's their original wording:
In a balanced heap, any new element can be
inserted in logarithmic time. We can list the elements of a heap in order by weight, taking logarithmic
time to generate each element, simply by using breadth first search.
I'm not sure what the authors mean by this. The first thing that comes to mind when they say "breadth-first search" would be a breadth-first search of the tree elements starting at the root, but that's not guaranteed to list the elements in sorted order, nor does it take logarithmic time per element. For example, running a BFS on this min-heap produces the elements out of order no matter how you break ties:
     1
    / \
  10   100
  /  \
11    12
This always lists 100 before either 11 or 12, which is clearly wrong.
Am I missing something? Is there a simple breadth-first search that you can perform on a heap to get the elements out in sorted order using logarithmic time each? Clearly you can do this by destructively modifying heap by removing the minimum element each time, but the authors' intent seems to be that this can be done non-destructively.
You can get the elements out in sorted order by traversing the heap with a priority queue (which requires another heap!). I guess this is what he refers to as a "breadth first search".
I think you should be able to figure it out (given your rep in algorithms) but basically the key of the priority queue is the weight of a node. You push the root of the heap onto the priority queue. Then:
while pq isn't empty:
    pop the minimum element off pq
    append it to the output list (the sorted elements)
    push its children (if any) onto pq
I'm not really sure (at all) if this is what he was referring to but it vaguely fitted the description and there hasn't been much activity so I thought I might as well put it out there.
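A sketch of that traversal, assuming the heap is a binary min-heap stored in the usual array layout (children of index i at 2i+1 and 2i+2); the auxiliary priority queue holds indices ordered by their heap values:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class HeapListingDemo {
    // Lists the elements of an array-encoded binary min-heap in sorted
    // order without modifying the heap itself.
    public static List<Integer> sortedListing(int[] heap) {
        List<Integer> out = new ArrayList<>();
        if (heap.length == 0) return out;
        // Auxiliary priority queue of indices, ordered by heap value.
        PriorityQueue<Integer> pq =
            new PriorityQueue<>(Comparator.comparingInt((Integer i) -> heap[i]));
        pq.offer(0); // start at the root, which holds the minimum
        while (!pq.isEmpty()) {
            int i = pq.poll();          // next-smallest unvisited element
            out.add(heap[i]);
            int l = 2 * i + 1, r = 2 * i + 2;
            if (l < heap.length) pq.offer(l); // children become candidates
            if (r < heap.length) pq.offer(r);
        }
        return out;
    }
}
```

This is correct because the heap property guarantees the next-smallest element is always a child of some already-output element, so it is sitting in the candidate queue.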
If you know that all elements lower than 100 are in the left subtree, you can go left; and even if you do descend to 100, you can see there are no smaller elements below it and back out. In the worst case you visit any node at most twice before realizing the element you are searching for is not down that branch. That means you traverse at most 2*log(N) nodes in this tree, which simplifies to O(log N) complexity.
The point is that even if you "screw up" and traverse to the "wrong" node, you visit that node at worst once.
EDIT
This is just how heapsort works. You can imagine that you have to restore the heap, at O(log n) cost, each time you take out the top element.
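For comparison, the destructive version mentioned here, sketched with a PriorityQueue standing in for the heap:

```java
import java.util.PriorityQueue;

public class HeapsortDemo {
    // Heapsort via a min-PriorityQueue: each poll() restores the heap
    // property in O(log n), giving O(n log n) overall.
    public static int[] heapsort(int[] a) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int x : a) heap.offer(x);               // build the heap
        int[] out = new int[a.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll();                    // extract min, re-heapify
        }
        return out;
    }
}
```

Unlike the breadth-first listing above, this consumes the heap, which is exactly what the paper's non-destructive claim is trying to avoid.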
