What makes B-tree well suited for discs - data-structures

What exactly makes B-tree well suited for discs? I thought that it is because discs can read sequential data really fast but it takes significantly more time for disc to seek another location, but I can't really find any explanation for people with little knowledge about discs and their operations.

If you put 1 million things in a binary search tree, then you have to follow about 20 pointers to find one when you do a search.
If the data structure is on disk, then following a pointer means doing a seek, and 20 seeks is pretty slow.
If you put the same 1 million things in a B-tree or B+tree, then you only have to follow 2 or 3 pointers to find one of them when you do a search, using realistic node sizes.
That's up to 10 times faster.

Related

T-Tree or B-Tree

T-tree algorithm is described in this paper
And T*-Tree is an improvement from T-tree for better use of query operations, including range queries and which contains all other good features of T-tree.
This algorithm is described in this paper "T*-tree: A Main Memory Database Index Structure for Real-Time Applications".
According to this research paper, T-Tree is faster than B-tree/B+tree when datasets fit in the memory.
I implemented T-Tree/T*Tree as they described in these papers and compared the performance with B-tree/B+tree, but B-tree/B+tree perform better than T-Tree/T*Tree in all test cases (insertion, deletion, searching).
I read that T-Tree is an efficient index structure for in-memory database, and it used by Oracle TimesTen. But my results did not show that.
If anyone may know the reason or have any comment about that, it will be great to hear from her (or him).
T-Trees are not a fundamental data structure in the same sense that AVL trees or B-trees are. They are just a hacked version of balanced binary trees and as such there may or may not be niche applications where they offer decent performance.
In this day and age they are bound to suffer horribly because of their poor locality, both in the sense of expected block/page transfer counts and in the sense of cache locality. The latter is evident since in all node accesses of a search except for the very last one, only the boundary values will be checked against the search key - all the rest is paged in or cached for nought.
Compare this to the excellent access locality of B-trees in general and B+trees in particular (not to mention cache-oblivious and cache-conscious versions that were designed explicitly with memory performance charactistics in mind).
Similar problems exist with the rebalancing. In the B-tree world many variations - starting with B+ and Blink - have been developed and perfected in order to achieve desired amortised performance characteristics, including aspects like concurrency (locking/latching) or the absence thereof. So most of the time you can simply go out and find a B-tree variation that fits your performance profile - or use the simple classic B+tree and be certain of decent results.
T-trees are more complicated than comparable B-trees and it seems that they have nothing to offer in the way of performance in general, given that the times of commodity hardware with a single-level memory 'hierarchy' have been gone for decades. Not only is the hard disk the new memory, the converse is also true and main memory is the new hard disk now. I.e. even without NUMA the cost of bringing data from main memory into the cache hierarchy is so high that it pays to minimise page transfers - which is precisely what B-trees and their variations do and the T-tree doesn't. Closer to the processor core it's the number of cache line accesses/transfers that matters but the picture remains the same.
In fact, if you take the idea of binary search - which is provably optimal - and think about ways of arranging the search keys in a manner that plays well with memory hierarchies (caches) then you invariably end up with something that looks uncannily like a B-tree...
If you program for performance then you'll find that winners are almost always located somewhere in the triangle between sorted arrays, B-trees and hashing. Even balanced binary trees are only competitive if their comparatively poor performance takes the back seat in the face of other considerations and key counts are fairly small, i.e. not more than a couple million.

Using B-Tree instead of Trie

This is an interview question and not a homework.
"You have N documents, where N is very large. Each document has a set of words lets say w1,w2..wm where m might differ for each document. Now you are given a list to K words lets say q1,q2…qk.
Write an algorithm to print the list of document which have the K words in them."
Now, I could figure out solutions using Hashing and trie. But the guy who posted the question had also written that the interviewer wanted a solution using B-tree.
I am not really able to figure out how to use a B-Tree for this and how efficient would that be. Can somebody please help?
B-Tree is preferred over Trie if our dataset is stored on media with slow random access, for example on conventional hard drives. The interviewer's note that N is very large might imply that it's simply large enough to not fit in memory and should be placed on disk.
As noted in comments: when the data is really huge and it is stored on a disk, the efficiency of a data structure depends more on the number of disk block accesses, not the total amount of all operations. B-Tree contains many records in one node (which can be considered a "data block"), thus requires significantly fewer block accesses than Trie does.
That is exactly the same reason why most DBs store their indexes in a B-Tree. They need fast search through index located on conventional hard drive.
Actually, your problem can be solved by putting your (word-documentId) pairs in DB table and creating an index on word column or the entire pair.
You can try a ternary trie. It doesn't take so much space. You can also look for a Kart-trie. It uses a key and 2 leafs:http://code.dogmap.org/kart/.

Is a linked list in a B-tree node superior to an array?

I want to implement a B-tree index for my database.
I have read many data structure and algorithm books to learn how to do it. All implementations use an array to save data and child indexes.
Now I want to know: is a linked list in B-tree node superior to an array?
There are some ideas I've thought about:
when splitting a node, the copy operation will be more quickly than with an array.
when inserting data, if the data is inserted into the middle or at the head of the array, the speed is lower than inserting to the linked list.
The linked list is not better, in fact a simple array is not better either (except its simplicity which is good argument for it and search speed if sorted).
You have to realize that the "array" implementation is more a "reference" implementation than a true full power implementation. For example, the implementation of the data/key pairs inside a B-Tree node in commercial implementations uses many strategies to solve two problems: storage efficiency and efficient search of keys in the node.
With regard with efficient search, an array of key/value with an internal balanced tree structure on the top of it can make insertion/deletion/search be done in O(log N), for large B tree nodes it makes sense.
With regard to memory efficiency, the nature of data in the key and value is very important. For example, lexicographical keys can be shorten by a common start (e.g. "good", "great" have "g" in common), the data might be compressed as well using any possible scheme relevant to the nature of the data. The compression of keys is more complex as you will want to keep this lexicographical property. Remember that the more data and keys you stuff in a node, the fastest are the disk accesses.
The time to split a node is only partially relevant, as it will be much less than the time to read or write a node on typical media by several order of magnitude. On SSD and extremely fast disks (by 10 to 20 years it is expected to have disks as fast as RAM), many researches are conducted to find a successor to B-Trees, stratified B-Trees are an example.
If the BTree is itself stored on the disk then a linked list will make it very complicated to maintain.
Keep the B-Tree structure compact. This will allow more nodes per page, locality of data and allowing caching of more nodes, and fewer disk reads/cache misses.
Use an array.
The perceived in-memory computational benefits are inconsequential.
So, in short, no, linked list is not superior.
B-tree is typically used in DBs where the data is stored on disks and you want to minimize the number of blocks you want to read. I do not think your proposal would be efficient in that case (although it might be beneficial if you can load all data into RAM).
If you want to perform those two operations effectively you should use a Skip List (http://en.wikipedia.org/wiki/Skip_list). Performance-wise it will be similar to what you have outlined.

Performance of RTree vs kd-trees

I have around 10 K points in 5 dimensional space. We can assume that the points are randomly distributed in space (0,0,0,0,0) and (100,100,100,100,100). Clearly, the whole data set can easily reside in memory.
I would like to know which algorithm for k nearest neighbour would run faster, kd-tree or RTree.
Although I have some very high level idea of these two algorithms, I am not sure which will run faster, and why. I am open to exploring other algorithms if any, which could run fast. Please, if possible, specify why an algorithm may run faster.
This depends on various parameters. Most importantly on your capability to implement these algorithms.
I've personally found bulk-loaded R*-trees to be faster for large data, probably because they have a better fan-out. Bulk-loaded R-trees is a more fair comparison, as kd-trees are commonly bulk-loaded (in fact, they don't support incremental operation very well at all).
For tiny data, kd-trees will likely be faster, plus they are much simpler to implement.
For other things, please refer to this earlier question / answer:
https://stackoverflow.com/a/11109467/1060350

Optimal physical orderings of nodes

First, read this:
TPT paper
I was wondering what other options might exist for arranging nodes to boost performance. Anything from post-parent order in a byte array, like TPT's, to something more like a k-order b-tree; I'm wondering what good options are known at the moment?
A bit more on the problem:
I have an extremely fast way of finding elements within a sparse set, given some concept of adjacency to a given pointer. I was wondering how I could best take advantage of this in storing a patricia trie.
You can make assumptions about whether the trie will be random-access, read only, write-seldom, or add-only. Please note them if you do, but I've actually used a TPT and the gains were pretty significant so I'm willing to consider certain constraints.
Update
I guess in some senses this was a little unclear. What I'm looking for here is ways of arranging things in memory that optimize one performance metric or another. The TPTs, through some tricks, use node order to optimize disk reads and space-per-node. I'm curious about:
Total deletion, where the structure is removed from memory entirely.
Inserts, particularly in densely populated structures.
Deletes, again, particularly in densely populated structures.
A DAWG or a minimal DFA (see this question or the paper "How to squeeze a lexicon") may be even better than a TPT because the totel size is smaller.

Resources