B-Tree, K-D-B-Tree, and R-Tree improve search

How can B-Tree, K-D-B-Tree, and R-Tree help improve the efficiency and accuracy of a search?

They cannot improve accuracy over a linear scan (though unless implemented incorrectly, they will not be worse either, since all of them are supposed to give exact answers). What they improve is efficiency: by organizing the data hierarchically, they let a query discard whole subtrees at once instead of examining every record.
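A tiny sketch of my own (using a plain unbalanced binary search tree rather than an actual B-Tree or R-Tree) shows the pruning idea they all share: a range query descends only into subtrees that can overlap the query interval, while a linear scan must touch every record:

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def insert(root, key):
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = insert(root.left, key)
        else:
            root.right = insert(root.right, key)
        return root

    def range_query(node, lo, hi, out):
        if node is None:
            return
        if lo < node.key:          # only then can the left subtree hold matches
            range_query(node.left, lo, hi, out)
        if lo <= node.key <= hi:
            out.append(node.key)
        if node.key < hi:          # only then can the right subtree hold matches
            range_query(node.right, lo, hi, out)

    root = None
    for k in [50, 30, 70, 20, 40, 60, 80]:
        root = insert(root, k)
    hits = []
    range_query(root, 35, 65, hits)
    print(hits)  # [40, 50, 60]; subtrees outside [35, 65] were never visited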

Related

Soft heap: where and why is it useful? [duplicate]

From the paper that I was reading by Bernard Chazelle: https://www.cs.princeton.edu/courses/archive/fall05/cos528/handouts/The%20Soft%20Heap.pdf
I failed to find the soft heap being used much in practical scenarios, so it would be helpful if someone could let me know why it is really useful.
I haven't read the article, only the abstract, which I quote:
The soft heap can be used to compute exact or approximate medians and percentiles optimally. It is also useful for approximate sorting and for computing minimum spanning trees of general graphs.
So it has some uses in graph algorithms and in computing medians.
In graph algorithms there are popular minimum-spanning-tree algorithms such as Prim's algorithm, but the soft heap's best-known use is as the key ingredient of Chazelle's own deterministic MST algorithm, which runs in O(m α(m, n)) time.
You might be familiar with the plain old heap, which is valued for its fast operations. The soft heap shares that property and pushes it further: by allowing a fraction ε of its keys to become "corrupted" (artificially raised), it supports every operation in constant amortized time except insert, which costs O(log(1/ε)).
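For a feel of the selection use case quoted from the abstract, here is an exact baseline of my own using Python's standard heapq (an ordinary heap, not a soft heap): selecting the k-th smallest this way takes O(n + k log n), whereas a soft heap lets Chazelle's selection routine run in O(n) by tolerating a bounded number of corrupted keys:

    import heapq, random

    def kth_smallest(items, k):
        # Exact selection: heapify in O(n), then pop k - 1 times.
        heap = list(items)
        heapq.heapify(heap)
        for _ in range(k - 1):
            heapq.heappop(heap)
        return heap[0]

    data = [random.randrange(1000) for _ in range(101)]
    print(kth_smallest(data, 51))   # exact median of 101 items
    print(sorted(data)[50])         # sanity check: same value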

What's the most optimized algorithm (approach) for traversing a tree, ever? [closed]

As stated on Wikipedia, there are multiple algorithms for traversing a tree data structure. Some of them are combinations of others, like bidirectional search, which is mostly useful for general graphs rather than trees. With a tree we usually have no idea where it ends, and we can only start from the root and work down through its children.
In that case we might be able to bring multiprocessing or multithreading into the search process, but I couldn't find any comprehensive approach that describes this.
So my question is: what is the most optimized way of traversing a tree when we don't have access to the whole data structure (so we can't index it in advance, as with a file directory)?
The most optimized algorithm is usually the one optimized for a specific use case and platform.
It does not matter much whether you do in-order, pre-order, or post-order traversal, or whether you do DFS or BFS.
What matters is:
How big is the tree? Does it fit into memory?
How deep is the tree? Can you use recursion, or do you have to use an explicit stack (see the sketch after this list)?
How do you find the children of a node? Do you have to access the hard drive or the network?
What do you want to do with a node after you find it in the traversal? If this operation is long enough, optimizing the traversal is not worth it.
How do you share data between threads?
How are the nodes in the tree distributed? Is the distribution roughly even, or are there some very long and some very short branches?
How big are the node keys? (This influences data locality and how much data you can fit into one L1/L2 cache line.)
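To illustrate the recursion-versus-explicit-stack question above, here is a short sketch of my own showing the same pre-order traversal written both ways; the explicit stack survives trees far deeper than CPython's default recursion limit:

    class Node:
        def __init__(self, value, children=None):
            self.value = value
            self.children = children or []

    def dfs_recursive(node, visit):
        # Simple, but bounded by the interpreter's recursion limit
        # (sys.getrecursionlimit(), typically 1000 in CPython).
        visit(node.value)
        for child in node.children:
            dfs_recursive(child, visit)

    def dfs_iterative(root, visit):
        # Same pre-order traversal with an explicit stack: the depth it
        # can handle is bounded only by memory, not by the call stack.
        stack = [root]
        while stack:
            node = stack.pop()
            visit(node.value)
            stack.extend(reversed(node.children))  # keep left-to-right order

    # A pathological chain 10,000 nodes deep: dfs_iterative handles it,
    # while dfs_recursive would raise RecursionError.
    root = cur = Node(0)
    for i in range(1, 10_000):
        cur.children.append(Node(i))
        cur = cur.children[0]
    visited = []
    dfs_iterative(root, visited.append)
    print(len(visited))  # 10000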
Try an in-order traversal of a binary search tree. Visiting every node by repeated search would cost O(n log n), while a full in-order traversal visits all n nodes in O(n), which is considered the best you can do.
Ex: http://www.geeksforgeeks.org/binary-search-tree-set-1-search-and-insertion/

How is algorithmic randomisation improving insertion in binary search trees?

I know that randomising the input to quicksort may help avoid the worst-case scenario, which is an already sorted list. But how does this help improve insertion into a binary search tree?
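For intuition, here is a small experiment of my own with a plain (unbalanced) textbook BST: inserting keys in sorted order degenerates the tree into a chain of depth n, exactly the quicksort worst case, while inserting the same keys in random order yields an expected depth of O(log n):

    import random

    class Node:
        __slots__ = ("key", "left", "right")
        def __init__(self, key):
            self.key = key
            self.left = self.right = None

    def insert(root, key):
        # Iterative insert, so a degenerate chain-shaped tree
        # doesn't overflow Python's call stack.
        if root is None:
            return Node(key)
        cur = root
        while True:
            if key < cur.key:
                if cur.left is None:
                    cur.left = Node(key)
                    return root
                cur = cur.left
            else:
                if cur.right is None:
                    cur.right = Node(key)
                    return root
                cur = cur.right

    def height(root):
        # Iterative depth computation, for the same reason.
        best, stack = 0, [(root, 1)] if root else []
        while stack:
            node, d = stack.pop()
            best = max(best, d)
            for child in (node.left, node.right):
                if child is not None:
                    stack.append((child, d + 1))
        return best

    def build(keys):
        root = None
        for k in keys:
            root = insert(root, k)
        return root

    n = 2000
    keys = list(range(n))
    print("sorted order:  height =", height(build(keys)))  # 2000: a chain
    random.shuffle(keys)
    print("random order:  height =", height(build(keys)))  # around 30: O(log n)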

Why are heuristics proposed? [closed]

I have a small confusion about the nature of heuristics.
We know that heuristics need not give correct outputs for all input instances.
But then why are heuristics proposed?
Heuristics are used to trade off performance (usually execution speed, but also memory consumption) against accuracy or generality. For example, your antivirus software uses heuristics to characterize what a virus might look like, and can take advantage of that information to decide which files it should spend more time analyzing. A good heuristic has the property that it can save substantial time at minimal cost.
In graph traversal theory, a heuristic for the A* search algorithm need not be perfect. It just needs a predicted cost function h(x) that is less than or equal to the true cost to the goal state in order to guarantee an optimal solution. The closer h(x) is to the true cost, the more quickly an optimal solution will be found.
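To make that concrete, here is a small self-contained sketch of my own: A* on a 4-connected grid with the Manhattan distance as h(x). Since each step costs 1, the Manhattan distance never overestimates the remaining cost, so it is admissible and the returned path length is optimal:

    import heapq

    def astar(grid, start, goal):
        # Grid cells: 0 = free, 1 = wall. Returns shortest path length or None.
        def h(p):
            return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

        rows, cols = len(grid), len(grid[0])
        frontier = [(h(start), 0, start)]        # (f = g + h, g, position)
        best_g = {start: 0}
        while frontier:
            f, g, pos = heapq.heappop(frontier)
            if pos == goal:
                return g                          # cost of a shortest path
            if g > best_g.get(pos, float("inf")):
                continue                          # stale queue entry
            r, c = pos
            for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                    ng = g + 1
                    if ng < best_g.get((nr, nc), float("inf")):
                        best_g[(nr, nc)] = ng
                        heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
        return None                               # no path exists

    grid = [[0, 0, 0, 0],
            [1, 1, 0, 1],
            [0, 0, 0, 0],
            [0, 1, 1, 0]]
    print(astar(grid, (0, 0), (3, 3)))  # 6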
Let me give you an example which might help you understand the importance of heuristics.
In Artificial Intelligence, search problems are mainly classified into blind search and directed search. Blind search is where you use algorithms such as BFS and DFS, and there is a reason they are called blind: they have no knowledge about which direction you should go, so you just have to explore and explore until you reach the goal node. Imagine the time and space complexity of those algorithms.
Now look at a directed search algorithm such as A*, where you have some kind of heuristic function, or in simple terms an assumption about which direction to take for the next step.
A heuristic does not guarantee the best result, but it will try to give you a good solution, and sometimes even the best one. There are many classes of problems (e.g. the games you play) where a good solution does the job, rather than spending so much time and space on finding the best solution.
I hope it helps.

What are the complicated data structures you should have heard of? [closed]

This is a derivative question, but I'm asking about the data structures you should at least be familiar with because of their usefulness, even though they are too hard to implement without some expertise.
I would say a good boundary between the two is a heap -- you should be able to code a heap, but it would take you a day. Not appropriate for this would be a BST, etc. Edit: I see the point that it depends on what you are doing. I think it would be awesome to have a list with a phrase summarizing why you use it!
Here's a list to start:
B+ trees: good general indexing structure on a single key
K-d tree: spatial data
Red-black tree: self-balancing BST; also AVL or splay tree
Skip list: good hybrid structure for either random or (pseudo)sequential access
Trie: linear time string search
Bloom filter: probabilistic set membership; false positives are possible, false negatives are not (see the sketch below)
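Since Bloom filters are easy to get wrong, here is a minimal sketch of my own (a bit array plus k salted hashes derived from blake2b; the sizes are arbitrary illustrative choices):

    import hashlib

    class BloomFilter:
        def __init__(self, m_bits=1024, k_hashes=4):
            self.m = m_bits
            self.k = k_hashes
            self.bits = bytearray(m_bits // 8)

        def _positions(self, item):
            # k independent-ish hash positions via differently salted blake2b.
            for i in range(self.k):
                h = hashlib.blake2b(item.encode(), salt=bytes([i]) * 8)
                yield int.from_bytes(h.digest()[:8], "big") % self.m

        def add(self, item):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def __contains__(self, item):
            return all(self.bits[p // 8] & (1 << (p % 8))
                       for p in self._positions(item))

    bf = BloomFilter()
    for word in ("trie", "heap", "treap"):
        bf.add(word)
    print("heap" in bf)   # True (definitely added)
    print("rope" in bf)   # False, barring a small false-positive chance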
What about:
Binomial Heaps
Fibonacci Heaps
Disjoint Set Data Structures
Splay Trees
Finger trees
That is a good start; there is a comprehensive list of data structures on Wikipedia, and some of them are worth examining. But which ones you need depends on the area you intend to... do whatever it is that you are doing.
Embedded systems guys will have very different ideas from web guys, who will strongly disagree with the business logic guys. Figure out what you want to do; language and platform will also affect the list you need.
To quote Martin Kay:
Suffix trees constitute a well understood, extremely elegant, but regrettably poorly appreciated data structure with potentially many applications (...)
See also: What are the lesser known but cool data structures?
van Emde Boas trees. I don't literally think that you "should" have heard of them, but I do believe they're an interesting example of what kind of complexity you can achieve with "bit tricks" --- namely O(log log n), exponentially better than binary trees!
R-Tree
Closely related to the B+ tree you mentioned: B*-tree. Along with a balancing approach known as the "dancing tree" approach, these form the basis of Reiser4.
Binary Decision Diagrams, specifically Reduced Order Binary Decision Diagrams (ROBDD). These get reinvented (poorly) a lot when someone decides to create their own filtering system.
Cuckoo hashing, a simple and elegant way of resolving hash-table collisions that gives worst-case constant lookup time.
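A minimal sketch of my own of the cuckoo idea (two tables, two hash functions; the second hash and the table size are arbitrary illustrative choices). A lookup probes at most two slots, which is where the worst-case constant lookup comes from:

    class CuckooTable:
        def __init__(self, size=11):
            self.size = size
            self.slots = [[None] * size, [None] * size]

        def _hash(self, table, key):
            # Two distinct hash functions, one per table.
            return (hash(key) if table == 0
                    else hash((key, 0x9E3779B9))) % self.size

        def lookup(self, key):
            return any(self.slots[t][self._hash(t, key)] == key for t in (0, 1))

        def insert(self, key, max_kicks=32):
            if self.lookup(key):
                return
            t = 0
            for _ in range(max_kicks):
                i = self._hash(t, key)
                if self.slots[t][i] is None:
                    self.slots[t][i] = key
                    return
                self.slots[t][i], key = key, self.slots[t][i]  # evict occupant
                t = 1 - t            # re-insert the evictee into the other table
            # A real implementation would grow the table and rehash here.
            raise RuntimeError("eviction cycle: grow the table and rehash")

    table = CuckooTable()
    for k in ("ant", "bee", "cat", "dog", "elk"):
        table.insert(k)
    print(table.lookup("cat"), table.lookup("fox"))  # True False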
Deterministic finite automata (DFAs), or finite state machines, useful for expressing many things, such as basic lexers, regular expressions, state transitions, etc. See also the related directed acyclic word graphs, which can be useful for storing dictionaries compactly.
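As a taste, here is a tiny table-driven DFA of my own that accepts binary strings with an even number of 1s; a basic lexer is the same loop with more states and an action per accepting state:

    DFA = {
        "transitions": {
            ("even", "0"): "even", ("even", "1"): "odd",
            ("odd",  "0"): "odd",  ("odd",  "1"): "even",
        },
        "start": "even",
        "accepting": {"even"},
    }

    def accepts(dfa, text):
        state = dfa["start"]
        for symbol in text:
            key = (state, symbol)
            if key not in dfa["transitions"]:
                return False            # symbol outside the alphabet
            state = dfa["transitions"][key]
        return state in dfa["accepting"]

    print(accepts(DFA, "1011"))  # False: three 1s
    print(accepts(DFA, "1001"))  # True: two 1s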
I would add Hash Tables to the list. They are pretty simple in concept, but can be complicated once you look at how to implement a good hashing function and efficient probing methods.
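To show what "efficient probing" means in its simplest form, here is a linear-probing sketch of my own; deletion is the classic subtlety and is deliberately omitted (it needs tombstones or backward-shifting to keep probe chains intact):

    class LinearProbingMap:
        def __init__(self, capacity=16):
            self.keys = [None] * capacity
            self.vals = [None] * capacity
            self.capacity = capacity

        def _probe(self, key):
            i = hash(key) % self.capacity
            # Scan forward until we hit the key or an empty slot.
            while self.keys[i] is not None and self.keys[i] != key:
                i = (i + 1) % self.capacity
            return i

        def put(self, key, value):
            i = self._probe(key)
            self.keys[i], self.vals[i] = key, value
            # A real table would track the load factor and resize here.

        def get(self, key):
            i = self._probe(key)
            return self.vals[i] if self.keys[i] == key else None

    m = LinearProbingMap()
    m.put("apple", 1)
    m.put("plum", 2)
    print(m.get("apple"), m.get("pear"))  # 1 None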
R-Tree and its variants, such as R*-Tree, X-Tree, Pyramid-Tree. Various M-Tree variants, such as the Slim-Tree.
As so often, querying the tree is easy. There may be an easy bulk-loading method, too (for R-Trees, STR often does a good job; see the sketch below). The tricky part is usually the maintenance of a good tree across updates.
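For the bulk-loading remark, here is a leaf-level STR (Sort-Tile-Recursive) sketch of my own; real bulk loaders repeat the same tiling over the leaf bounding boxes to build the upper levels:

    import math, random

    def str_pack_leaves(points, capacity=4):
        # Sort by x, cut into ~sqrt(P) vertical slices, sort each slice
        # by y, and pack runs of `capacity` points into leaves.
        n = len(points)
        num_leaves = math.ceil(n / capacity)
        num_slices = math.ceil(math.sqrt(num_leaves))
        per_slice = num_slices * capacity

        by_x = sorted(points)                  # lexicographic: x, then y
        leaves = []
        for s in range(0, n, per_slice):
            xslice = sorted(by_x[s:s + per_slice], key=lambda p: p[1])
            for i in range(0, len(xslice), capacity):
                leaf = xslice[i:i + capacity]
                xs = [p[0] for p in leaf]
                ys = [p[1] for p in leaf]
                # each leaf is (bounding box, points)
                leaves.append(((min(xs), min(ys), max(xs), max(ys)), leaf))
        return leaves

    pts = [(random.random(), random.random()) for _ in range(40)]
    for bbox, leaf in str_pack_leaves(pts):
        print(len(leaf), tuple(round(v, 2) for v in bbox))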
You can try:
y-fast trees
Approximate ordered sets
select heap
compact arrays
Monolithic lists
Succinct lists
