Array-backed heap structures for Java

Array-backed heap structures for Java - performance

I'm looking for heap structures for priority queues that can be implemented using Object[] arrays instead of Node objects.
Binary heaps of course work well, and so do n-ary heaps. Java's java.util.PriorityQueue is a binary heap, using an Object[] array as storage.
There are plenty of other heaps, such as Fibonacci heaps for example, but as far as I can tell, these need to be implemented using nodes. And from my benchmarks I have the impression that the overhead paid by managing all these node objects comes at a cost that may well eat up all benefits gained. I find it very hard to implement a heap that can comete with the simple array-backed binary heap.
So I'm currently looking for advanced heap / priority queue structures that also do not have the overhead of using Node objects. Because I want it to be fast in reality, not just better in complexity theory... and there is much more happening in reality: for example the CPU has L1, L2 and L3 caches etc. that do affect performance.
My question also focuses on Java for the very reason that I have little influence on memory management here, and there are no structs as in C. A lot of heaps that work well when implemented in C become costly in Java because of the memory management overhead and garbage collection costs.

Several uncommon heap structures can be implemented this way. For example, the Leonardo Heap used in Dijkstra's smoothsort algorithm is typically implemented as an array augmented with two machine words. Like a binary heap, the array representation is an array representation of a specific tree structure (or rather, a forest, since it's a collection of trees).
The poplar heap structure was introduced as a slightly less efficient theoretically but more efficient practically heap structure with the same basic idea - the data is represented as an array with some extra small state information, and is a compressed representation of a forest structure.
A more canonical array-based heap is a d-ary heap, which is a natural generalization of the binary heap to have d children instead of two.
As a very silly example, a sorted array can be used as a priority queue, though it's very inefficient.
Hope this helps!

Related

Memory Heap vs. Min/Max Heap Data Structure [duplicate]

This question already has answers here:
Why are two different concepts both called "heap"? [duplicate]
(9 answers)
Closed last year.
I have read about memory allocation for applications and I understand that, more or less, the heap, in memory, for a given application is dynamically allocated at the startup. However, there is also another concept called a min-heap, which is a data structure organized in the form of a tree where each parent node is smaller or equal to its children.
So, my question is: What are the relationships between the heap that is allocated at startup for a given application and the min-heap data structure that includes functions commonly referred to as 'heapify' and etc? Is there any relationship or is the min-heap data structure more of a higher level programming concept? If there is no relationship, is there any reason they've been given the same name?
This may seem like a silly question for some, but it has actually stirred up a debate at work.

heap is a data structure which is actually a complete binary tree with some extra properties.
There are 2 types of Heaps:
MIN Heap
MAX Heap
in min heap the root has the lowest value in the tree and when you pop out the root the next lowest element comes on the top. To convert a tree into heap we use heapify algorithm.
It is also know as priority queue in c++. and usually as a competitive programmer we use STL function for heap so that we dont have to get into the hustle of creating a heap from scratch.
Max heap is just the opposite with largest at the root.
Usually heap is used because it has a O(logN) time complexity for removing and inserting elements and hence can even work with tight constraints like 10^6.
Now i can understand you confusion between heap in memory and heap data structure but they are completely different things. Heap in data structure is just a way to store the data.

Are there real-world reasons to employ a Binary Search Tree over a Binary Search of semi-contiguous list?

I'm watching university lectures on algorithms and it seems so many of them rely almost entirely binary search trees of some particular sort for querying/database/search tasks.
I don't understand this obsession with Binary Search Trees. It seems like in the vast majority of scenarios, a BSP could be replaced with a sorted array in the case of a static data, or a sorted bucketed list if insertions occur dynamically, and then a Binary Search could be employed over them.
With this approach, you get the same algorithmic complexity (for querying at least) as a BST, way better cache coherency, way less memory fragmentation (and less gc allocs depending on what language you're in), and are likely much simpler to write.
The fundamental issue is that BSP are completely memory naïve -- their focus is entirely on O(n) complexity and they ignore the very real performance considerations of memory fragmentation and cache coherency... Am I missing something?

Binary search trees (BST) are not totally equivalent to the proposed data structure. Their asymptotic complexity is better when it comes to both insert and remove sorted values dynamically (assuming they are balanced correctly). For example, when you when to build an index of the top-k values dynamically:
while end_of_stream(stream):
value <- stream.pop_value()
tree.insert(value)
tree.remove_max()
Sorted arrays are not efficient in this case because of the linear-time insertion. The complexity of bucketed lists is not better than plain list asymptotically and also suffer from a linear-time search. One can note that a heap can be used in this case, and in fact it is probably better to use a heap here, although they are not always interchangeable.
That being said, your are right : BST are slow, cause a lot of cache miss and fragmentation, etc. Thus, they are often replaced by more compact variants like B-trees. B-tree uses a sorted array index to reduce the amount of node jumps and make the data-structure much more compact. They can be mixed with some 4-byte pointer optimizations to make them even more compact. B-trees are to BST what bucketed linked-lists are to plain linked-lists. B-trees are very good for building dynamic database index of huge datasets stored on a slow storage device (because of the size): they enable applications to fetch values associated to a key using very few storage-device lookups (which as very slow on HDD for example). Another example of real-world use-case is interval-trees.
Note that memory fragmentation can be reduced using compaction methods. For BSTs/B-trees, one can reorder the root nodes like in a heap. However, compaction is not always easy to apply, especially on native languages with pointers like in C/C++ although some very clever methods exists to do so.
Keep in mind that B-trees shine only on big datasets (especially the ones that do not fit in cache). On relatively small ones, using just plain arrays or even sorted array is often a very good solution.

Why would one use a heap over a self balancing binary search tree?

What are the uses of heaps? Whatever a heap can do can also be done by a self-balancing binary search tree like an AVL tree. The most common use of heap is to find the minimum (or maximum) element in O(1) time (which is always the root). This functionality can also be included while constructing the AVL tree by maintaining a pointer to the minimum(or maximum) element, and min/max queries can be answered in O(1) time.
The only benefit of heaps over AVL trees I can think of is that AVL trees use a bit more memory because of pointers. Is there any other advantage/functionality of using a heap over an AVL tree?

A heap may have better insert and merge times. It really depends on the type of heap, but typically they are far less strict than an AVL because they don't have to worry about auto balancing after each operation.
A heap merely guarantees that all nodes follow the same style of ordering across the heap. There are of course more strict heaps like a binary heap that make inserting and merging more difficult since ordering matters more, but this is not always the case.
For example, the insert and merge times for a Fibonacci heap would be O(1) vs O(log n) for the AVL.
It is also more difficult to build the full AVL when compared to a heap.
We typically use heaps when we just want quick access to the min and max items and don't care about perfect ordering of other elements. With fast insertion, we can deal with many elements quickly and always keep our attention on the most important (or least important) ones.

You are correct when you say that a self-balancing binary tree can do strictly more things than a heap, and that heaps use less space for pointers. Here are some additional considerations:
The binary heap takes much less code to implement than an AVL tree. This makes coding, debugging, and modification significantly easier.
An AVL tree uses one object container and two pointers per data item stored. The binary heap uses zero overhead per data item stored - it is all packed into one array.

The main reason is a binary heap is actually implemented as an array, not a tree (the tree is a metaphor, the actual implementation is an array, where the children of element with index i are elements with indices 2i+1 and 2i+2). An array is extremely more efficient (in constants) than a tree, both in space and time - due to locality of reference, making the data structure much more cache efficient, which usually results in much better constants.
In addition, initializing a binary heap with n elements takes O(n) time, while doing the same for a BST takes O(nlogn) time.

Can heaps really only use O(1) auxiliary storage?

One of the biggest advantages of using heaps for priority queues (as opposed to, say, red-black trees) seems to be space-efficiency: unlike balanced BSTs, heaps only require O(1) auxiliary storage.
i.e, the ordering of the elements alone is sufficient to satisfy the invariants and guarantees of a heap, and no child/parent pointers are necessary.
However, my question is: is the above really true?
It seems to me that, in order to use an array-based heap while satisfying the O(log n) running time guarantee, the array must be dynamically expandable.
The only way we can dynamically expand an array in O(1) time is to over-allocate the memory that backs the array, so that we can amortize the memory operations into the future.
But then, doesn't that overallocation imply a heap also requires auxiliary storage?
If the above is true, that seems to imply heaps have no complexity advantages over balanced BSTs whatsoever, so then, what makes heaps "interesting" from a theoretical point of view?

You appear to confuse binary heaps, heaps in general, and implementations of binary heaps that use an array instead of an explicit tree structure.
A binary heap is simply a binary tree with properties that make it theoretically interesting beyond memory use. They can be built in linear time, whereas building a BST necessarily takes n log n time. For example, this can be used to select the k smallest/largest values of a sequence in better-than-n log n time.
Implementing a binary heap as an array yields an implicit data structure. This is a topic formalized by theorists, but not actively pursued by most of them. But in any case, the classification is justified: To dynamically expand this structure one indeed needs to over-allocate, but not every data structure has to grow dynamically, so the case of a statically-sized pre-allocated data structure is interesting as well.
Furthermore, there is a difference between space needed for faster growth and space needed because each element is larger than it has to be. The first can be avoided by not growing, and also reduced to arbitrarily small constant factor of the total size, at the cost of a greater constant factor on running time. The latter kind of space overhead is usually unavoidable and can't be reduced much (the pointers in a tree are at least log n bits, period).
Finally, there are many heaps other than binary heaps (fibonacci, binominal, leftist, pairing, ...), and almost all except binary heaps offer better bounds for at least some operations. The most common ones are decrease-key (alter the value of a key already in the structure in a certain way) and merge (combined two heaps into one). The complexity of these operations are important for the analysis of several algorithms using priority queues, and hence the motivation for a lot of research into heaps.
In practice the memory use is important. But (with over-allocation and without), the difference is only a constant factor overall, so theorists are not terribly interested in binary heaps. They'd rather get better complexity for decrease-key and merge; most of them are happy if the data structure takes O(n) space. The extremely high memory density, ease of implementation and cache friendliness are far more interesting for practitioners, and it's them who sing the praise of binary heaps far and wide.

B-tree faster than AVL or RedBlack-Tree? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
I know that performance never is black and white, often one implementation is faster in case X and slower in case Y, etc. but in general - are B-trees faster then AVL or RedBlack-Trees? They are considerably more complex to implement then AVL trees (and maybe even RedBlack-trees?), but are they faster (does their complexity pay off) ?
Edit: I should also like to add that if they are faster then the equivalent AVL/RedBlack tree (in terms of nodes/content) - why are they faster?

Sean's post (the currently accepted one) contains several incorrect claims. Sorry Sean, I don't mean to be rude; I hope I can convince you that my statement is based in fact.
They're totally different in their use cases, so it's not possible to make a comparison.
They're both used for maintaining a set of totally ordered items with fast lookup, insertion and deletion. They have the same interface and the same intention.
RB trees are typically in-memory structures used to provide fast access (ideally O(logN)) to data. [...]
always O(log n)
B-trees are typically disk-based structures, and so are inherently slower than in-memory data.
Nonsense. When you store search trees on disk, you typically use B-trees. That much is true. When you store data on disk, it's slower to access than data in memory. But a red-black tree stored on disk is also slower than a red-black tree stored in memory.
You're comparing apples and oranges here. What is really interesting is a comparison of in-memory B-trees and in-memory red-black trees.
[As an aside: B-trees, as opposed to red-black trees, are theoretically efficient in the I/O-model. I have experimentally tested (and validated) the I/O-model for sorting; I'd expect it to work for B-trees as well.]
B-trees are rarely binary trees, the number of children a node can have is typically a large number.
To be clear, the size range of B-tree nodes is a parameter of the tree (in C++, you may want to use an integer value as a template parameter).
The management of the B-tree structure can be quite complicated when the data changes.
I remember them to be much simpler to understand (and implement) than red-black trees.
B-tree try to minimize the number of disk accesses so that data retrieval is reasonably deterministic.
That much is true.
It's not uncommon to see something like 4 B-tree access necessary to lookup a bit of data in a very database.
Got data?
In most cases I'd say that in-memory RB trees are faster.
Got data?
Because the lookup is binary it's very easy to find something. B-tree can have multiple children per node, so on each node you have to scan the node to look for the appropriate child. This is an O(N) operation.
The size of each node is a fixed parameter, so even if you do a linear scan, it's O(1). If we big-oh over the size of each node, note that you typically keep the array sorted so it's O(log n).
On a RB-tree it'd be O(logN) since you're doing one comparison and then branching.
You're comparing apples and oranges. The O(log n) is because the height of the tree is at most O(log n), just as it is for a B-tree.
Also, unless you play nasty allocation tricks with the red-black trees, it seems reasonable to conjecture that B-trees have better caching behavior (it accesses an array, not pointers strewn about all over the place, and has less allocation overhead increasing memory locality even more), which might help it in the speed race.
I can point to experimental evidence that B-trees (with size parameters 32 and 64, specifically) are very competitive with red-black trees for small sizes, and outperforms it hands down for even moderately large values of n. See http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.html
B-trees are faster. Why? I conjecture that it's due to memory locality, better caching behavior and less pointer chasing (which are, if not the same things, overlapping to some degree).

Actually Wikipedia has a great article that shows every RB-Tree can easily be expressed as a B-Tree. Take the following tree as sample:
now just convert it to a B-Tree (to make this more obvious, nodes are still colored R/B, what you usually don't have in a B-Tree):
Same Tree as B-Tree
(cannot add the image here for some weird reason)
Same is true for any other RB-Tree. It's taken from this article:
http://en.wikipedia.org/wiki/Red-black_tree
To quote from this article:
The red-black tree is then
structurally equivalent to a B-tree of
order 4, with a minimum fill factor of
33% of values per cluster with a
maximum capacity of 3 values.
I found no data that one of both is significantly better than the other one. I guess one of both had already died out if that was the case. They are different regarding how much data they must store in memory and how complicated it is to add/remove nodes from the tree.
Update:
My personal tests suggest that B-Trees are better when searching for data, as they have better data locality and thus the CPU cache can do compares somewhat faster. The higher the order of a B-Tree (the order is the number of children a note can have), the faster the lookup will get. On the other hand, they have worse performance for adding and removing new entries the higher their order is. This is caused by the fact that adding a value within a node has linear complexity. As each node is a sorted array, you must move lots of elements around within that array when adding an element into the middle: all elements to the left of the new element must be moved one position to the left or all elements to the right of the new element must be moved one position to the right. If a value moves one node upwards during an insert (which happens frequently in a B-Tree), it leaves a hole which must be also be filled either by moving all elements from the left one position to the right or by moving all elements to the right one position to the left. These operations (in C usually performed by memmove) are in fact O(n). So the higher the order of the B-Tree, the faster the lookup but the slower the modification. On the other hand if you choose the order too low (e.g. 3), a B-Tree shows little advantages or disadvantages over other tree structures in practice (in such a case you can as well use something else). Thus I'd always create B-Trees with high orders (at least 4, 8 and up is fine).
File systems, which often base on B-Trees, use much higher orders (order 200 and even a lot more) - this is because they usually choose the order high enough so that a node (when containing maximum number of allowed elements) equals either the size of a sector on harddrive or of a cluster of the filesystem. This gives optimal performance (since a HD can only write a full sector at a time, even when just one byte is changed, the full sector is rewritten anyway) and optimal space utilization (as each data entry on drive equals at least the size of one cluster or is a multiple of the cluster sizes, no matter how big the data really is). Caused by the fact that the hardware sees data as sectors and the file system groups sectors to clusters, B-Trees can yield much better performance and space utilization for file systems than any other tree structure can; that's why they are so popular for file systems.
When your app is constantly updating the tree, adding or removing values from it, a RB-Tree or an AVL-Tree may show better performance on average compared to a B-Tree with high order. Somewhat worse for the lookups and they might also need more memory, but therefor modifications are usually fast. Actually RB-Trees are even faster for modifications than AVL-Trees, therefor AVL-Trees are a little bit faster for lookups as they are usually less deep.
So as usual it depends a lot what your app is doing. My recommendations are:
Lots of lookups, little modifications: B-Tree (with high order)
Lots of lookups, lots of modifiations: AVL-Tree
Little lookups, lots of modifications: RB-Tree
An alternative to all these trees are AA-Trees. As this PDF paper suggests, AA-Trees (which are in fact a sub-group of RB-Trees) are almost equal in performance to normal RB-Trees, but they are much easier to implement than RB-Trees, AVL-Trees, or B-Trees. Here is a full implementation, look how tiny it is (the main-function is not part of the implementation and half of the implementation lines are actually comments).
As the PDF paper shows, a Treap is also an interesting alternative to classic tree implementation. A Treap is also a binary tree, but one that doesn't try to enforce balancing. To avoid worst case scenarios that you may get in unbalanced binary trees (causing lookups to become O(n) instead of O(log n)), a Treap adds some randomness to the tree. Randomness cannot guarantee that the tree is well balanced, but it also makes it highly unlikely that the tree is extremely unbalanced.

Nothing prevents a B-Tree implementation that works only in memory. In fact, if key comparisons are cheap, in-memory B-Tree can be faster because its packing of multiple keys in one node will cause less cache misses during searches. See this link for performance comparisons. A quote: "The speed test results are interesting and show the B+ tree to be significantly faster for trees containing more than 16,000 items." (B+Tree is just a variation on B-Tree).

The question is old but I think it is still relevant. Jonas Kölker and Mecki gave very good answers but I don't think the answers cover the whole story. I would even argue that the whole discussion is missing the point :-).
What was said about B-Trees is true when entries are relatively small (integers, small strings/words, floats, etc). When entries are large (over 100B) the differences become smaller/insignificant.
Let me sum up the main points about B-Trees:
They are faster than any Binary Search Tree (BSTs) due to memory locality (resulting in less cache and TLB misses).
B-Trees are usually more space efficient if entries are relatively
small or if entries are of variable size. Free space management is
easier (you allocate larger chunks of memory) and the extra metadata
overhead per entry is lower. B-Trees will waste some space as nodes
are not always full, however, they still end up being more compact
that Binary Search Trees.
The big O performance ( O(logN) ) is the same for both. Moreover, if you do binary search inside each B-Tree node, you will even end up with the same number of comparisons as in a BST (it is a nice math exercise to verify this).
If the B-Tree node size is sensible (1-4x cache line size), linear searching inside each node is still faster because of
the hardware prefetching. You can also use SIMD instructions for
comparing basic data types (e.g. integers).
B-Trees are better suited for compression: there is more data per node to compress. In certain cases this can be a huge benefit.
Just think of an auto-incrementing key in a relational database table that is used to build an index. The lead nodes of a B-Tree contain consecutive integers that compress very, very well.
B-Trees are clearly much, much faster when stored on secondary storage (where you need to do block IO).
On paper, B-Trees have a lot of advantages and close to no disadvantages. So should one just use B-Trees for best performance?
The answer is usually NO -- if the tree fits in memory. In cases where performance is crucial you want a thread-safe tree-like data-structure (simply put, several threads can do more work than a single one). It is more problematic to make a B-Tree support concurrent accesses than to make a BST. The most straight-forward way to make a tree support concurrent accesses is to lock nodes as you are traversing/modifying them. In a B-Tree you lock more entries per node, resulting in more serialization points and more contended locks.
All tree versions (AVL, Red/Black, B-Tree, an others) have countless variants that differ in how they support concurrency. The vanilla algorithms that are taught in a university course or read from some introductory books are almost never used in practice. So, it is hard to say which tree performs best as there is no official agreement on the exact algorithms are behind each tree. I would suggest to think of the trees mentioned more like data-structure classes that obey certain tree-like invariants rather than precise data-structures.
Take for example the B-Tree. The vanilla B-Tree is almost never used in practice -- you cannot make it to scale well! The most common B-Tree variant used is the B+-Tree (widely used in file-systems, databases). The main differences between the B+-Tree and the B-Tree: 1) you dont store entries in the inner nodes of the tree (thus you don't need write locks high in the tree when modifying an entry stored in an inner node); 2) you have links between nodes at the same level (thus you do not have to lock the parent of a node when doing range searches).
I hope this helps.

Guys from Google recently released their implementation of STL containers, which is based on B-trees. They claim their version is faster and consume less memory compared to standard STL containers, implemented via red-black trees.
More details here

For some applications, B-trees are significantly faster than BSTs.
The trees you may find here:
http://freshmeat.net/projects/bps
are quite fast. They also use less memory than regular BST implementations, since they do not require the BST infrastructure of 2 or 3 pointers per node, plus some extra fields to keep the balancing information.

THey are sed in different circumstances - B-trees are used when the tree nodes need to be kept together in storage - typically because storage is a disk page and so re-balancing could be vey expensive. RB trees are used when you don't have this constraint. So B-trees will probably be faster if you want to implement (say) a relational database index, while RB trees will probably be fasterv for (say) an in memory search.

They all have the same asymptotic behavior, so the performance depends more on the implementation than which type of tree you are using.
Some combination of tree structures might actually be the fastest approach, where each node of a B-tree fits exactly into a cache-line and some sort of binary tree is used to search within each node. Managing the memory for the nodes yourself might also enable you to achieve even greater cache locality, but at a very high price.
Personally, I just use whatever is in the standard library for the language I am using, since it's a lot of work for a very small performance gain (if any).
On a theoretical note... RB-trees are actually very similar to B-trees, since they simulate the behavior of 2-3-4 trees. AA-trees are a similar structure, which simulates 2-3 trees instead.

moreover ...the height of a red black tree is O(log[2] N) whereas that of B-tree is O(log[q] N) where ceiling[N]<= q <= N . So if we consider comparisons in each key array of B-tree (that is fixed as mentioned above) then time complexity of B-tree <= time complexity of Red-black tree. (equal case for single record equal in size of a block size)

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio