What is the precise definition of the Heap data structure? - data-structures

The definition of heap given in wikipedia (http://en.wikipedia.org/wiki/Heap_(data_structure)) is
In computer science, a heap is a specialized tree-based data structure
that satisfies the heap property: If A is a parent node of B then
key(A) is ordered with respect to key(B) with the same ordering
applying across the heap. Either the keys of parent nodes are always
greater than or equal to those of the children and the highest key is
in the root node (this kind of heap is called max heap) or the keys of
parent nodes are less than or equal to those of the children (min
heap)
The definition says nothing about the tree being complete. For example, according to this definition, the binary tree 5 => 4 => 3 => 2 => 1 where the root element is 5 and all the descendants are right children also satisfies the heap property. I want to know the precise definition of the heap data structure.

As others have said in comments: That is the definition of a heap, and your example tree is a heap, albeit a degenerate/unbalanced one. The tree being complete, or at least reasonably balanced, is useful for more efficient operations on the tree. But an inefficient heap is still a heap, just like an unbalanced binary search tree is still a binary search tree.
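To make that concrete, here is a minimal sketch that checks the heap property on the chain from the question. The `Node` class and the `is_max_heap` function are names made up for this illustration; they are not taken from any library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical tree node: a key plus any number of children.
    key: int
    children: List["Node"] = field(default_factory=list)


def is_max_heap(node: Node) -> bool:
    """Return True if every parent key is >= each of its children's keys."""
    return all(
        node.key >= child.key and is_max_heap(child)
        for child in node.children
    )


# The chain from the question: 5 => 4 => 3 => 2 => 1, one child per node.
chain = Node(5, [Node(4, [Node(3, [Node(2, [Node(1)])])])])
print(is_max_heap(chain))  # prints True
```

The check never looks at the shape of the tree, only at parent/child key pairs, which is why the degenerate chain passes.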
Note that "heap" does not refer to one specific data structure; it refers to any data structure fulfilling the heap property or (depending on context) supporting a certain set of operations. Among the data structures which are heaps, most efficient ones explicitly or implicitly guarantee the tree to be complete or somewhat balanced. For example, a binary heap is by definition a complete binary tree.
In any case, why do you care? If you care about specific lower or upper bounds on specific operations, state those instead of requiring a heap. If you discuss specific data structures which are heaps and complete trees, state that instead of just speaking about heaps (assuming, of course, that the completeness matters).

Since this question was asked, the Wikipedia definition has been updated:
In computer science, a heap is a specialized tree-based data structure which is essentially an almost complete tree that satisfies the heap property: in a max heap, for any given node C, if P is a parent node of C, then the key (the value) of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C. The node at the "top" of the heap (with no parents) is called the root node.
However, "heap data structure" really denotes a family of different data structures, which also includes:
Binomial heap
Fibonacci heap
Leftist heap
Skew heap
Pairing heap
2-3 heap
...and these are certainly not necessarily complete trees.
On the other hand, the d-ary heap data structures -- including the binary heap -- most often refer to complete trees, such that they can be implemented in an array in level-order, without gaps:
The 𝑑-ary heap consists of an array of 𝑛 items, each of which has a priority associated with it. These items may be viewed as the nodes in a complete 𝑑-ary tree, listed in breadth first traversal order.
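The level-order layout described in that quote can be sketched with plain index arithmetic. This is an illustration under the usual 0-based convention (root at index 0); the function names `parent`, `children` and `is_min_heap` are made up for the example.

```python
def parent(i: int, d: int) -> int:
    """Index of the parent of node i (i > 0) in a d-ary heap stored in level order."""
    return (i - 1) // d


def children(i: int, d: int, n: int) -> range:
    """Indices of the children of node i, clipped to the array length n."""
    first = d * i + 1
    return range(min(first, n), min(first + d, n))


def is_min_heap(items: list, d: int) -> bool:
    """Check the min-heap property for a d-ary heap stored in a flat list."""
    return all(items[parent(i, d)] <= items[i] for i in range(1, len(items)))


ternary = [1, 5, 2, 3, 9, 7, 6]    # a 3-ary min heap with 7 items
print(list(children(0, 3, len(ternary))))
print(is_min_heap(ternary, 3))
```

Because the tree is complete, no slot in the array is wasted and no pointers are needed; with d = 2 the same formulas give the familiar binary heap layout.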

Related

In data structure, what is the difference between an ordinary heap and a binary heap?

As far as I know, a binary heap is a heap data structure that takes the form of a binary tree, so a binary heap is a special kind of heap. But what is the definition of a heap (an ordinary heap)?
According to wikipedia:
In computer science, a heap is a specialized tree-based data structure
that satisfies the heap property: if P is a parent node of C, then the
key (the value) of node P is greater than the key of node C.
Binary heap is a specific data structure that is based on a complete binary tree.
Heap data structure is a general term. There are many different heap data structures, and a heap is not limited to binary trees or even to a single tree. Look at the binomial heap, for example.

What is the correct definition of a heap

I was reading about heaps in Java programming. In my textbook, I found this definition of a heap: a heap is a complete binary tree with the following properties: 1) the value in the root is the smallest item in the tree;
2) every subtree is a heap
But when I was watching videos about heaps, I found a totally different definition of heaps which says: In a heap the parent keys are bigger than the children.
Now I am confused because the two definitions do not fit with each other.
Which definition is the correct one?
Thanks!
Both definitions are correct.
There are two types of Heap.
Min Heap: each parent node is smaller than or equal to its children.
Max Heap: each parent node is larger than or equal to its children.
This ordering between a parent and its children is called the Heap Property. The Heap Property has to be satisfied by each node of the tree.
The complexity of constructing the Heap from a given array is O(n). This operation is called Heapify.
Given a Heap, adding an element to it or removing the top element from it has complexity O(log(n)).
The complexity of sorting an array using the Heap data structure (Heap Sort) is O(n.log(n)). Basically, you extract the top (root) element from the Min Heap, which costs O(log(n)). This operation is repeated n times, so the complexity is O(n.log(n)).
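As an illustration of the operations listed above, here is a short sketch using Python's standard `heapq` module, which implements a binary min heap on top of a list. The `heap_sort` helper and the sample data are made up for this example.

```python
import heapq

data = [9, 4, 7, 1, 8, 2]

heap = list(data)
heapq.heapify(heap)              # build a min heap in place, O(n)
heapq.heappush(heap, 3)          # insert an element, O(log n)
smallest = heapq.heappop(heap)   # remove the root (the minimum), O(log n)
print(smallest)


def heap_sort(items):
    """Sort by building a min heap and popping the root n times: O(n log n)."""
    h = list(items)
    heapq.heapify(h)
    return [heapq.heappop(h) for _ in range(len(h))]


print(heap_sort(data))
```

`heapq` only provides a min heap; a common workaround for a max heap over numbers is to store negated keys.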
Quoting wikipedia here
In computer science, a heap is a specialized tree-based data structure
that satisfies the heap property: If A is a parent node of B then the
key of node A is ordered with respect to the key of node B with the
same ordering applying across the heap. A heap can be classified
further as either a "max heap" or a "min heap". In a max heap, the
keys of parent nodes are always greater than or equal to those of the
children and the highest key is in the root node. In a min heap, the
keys of parent nodes are less than or equal to those of the children
and the lowest key is in the root node. Heaps are crucial in several
efficient graph algorithms such as Dijkstra's algorithm, and in the
sorting algorithm heapsort. A common implementation of a heap is the
binary heap, in which the tree is a complete binary tree (see figure).
There are 2 types of heaps:
Min Heap: Parent node is always smaller than or equal to the children.
Max Heap: Parent node is always larger than or equal to the children.

Heap vs binary search tree (when it is better than the other?)

In what situations is using a min-heap more efficient than using a binary search tree? Is it true that the time of finding the minimum in a binary search tree is equal to finding minimum value in min-heap - O(1)?
This is almost like comparing coffee cups and koala bears. Heaps and binary search trees are intended to perform very different functions. A heap is an implementation of the priority queue abstract data type. At the basic level, a priority queue (and thus a heap) is just a bag where you put things, and when you reach in to get an item out you always get the smallest (min-heap) or largest (max-heap) item in the bag.
You can get fancy and give your heap the ability to remove any arbitrary item, or to change the priority of an item in the heap, but those are more advanced functionality and don't fall within the bounds of the traditional definition of the heap data structure.
A binary search tree is a much different beast. It's a bag where you put things, and you can quickly reach in to grab any item by key or you can list all of the items in order (or reverse order).
You can use a binary search tree to implement a priority queue, meaning that you could in principle replace a heap with a binary search tree. The binary search tree wouldn't perform as well as the heap, but it would get the job done.
But the reverse isn't true. You can't use a heap to replace a binary search tree.
So the question of which is better is really a question of what do you want to do?
If you want an ordered set of items from which you can quickly locate any item, or that you can traverse in order, then you want a binary search tree.
If you want an implementation of the priority queue abstract data type: a bag that will quickly give you the smallest (or largest, depending on how you define it) item when you ask for it, then you want to use a heap.
The two have different uses and are not interchangeable.
A heap is a structure that guarantees that the value of a given node is less than or equal to (for a min heap; greater than or equal to for a max heap) the value of any node underneath it. This allows you to get the minimum (or maximum) value in O(1).
A binary search tree is a structure that keeps all nodes ordered. This allows you to retrieve any value in O(h), where h is the height of the tree; h is about log2(n) if the tree is balanced, with n the number of nodes.
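On the question of finding the minimum, the difference can be sketched as follows. In a min heap the minimum sits at the root; in a plain binary search tree you have to walk down the left spine, which takes O(h) steps. The `BSTNode` class and the helpers below are a bare, unbalanced BST written only for this comparison, not a production implementation.

```python
import heapq
from dataclasses import dataclass
from typing import Optional


@dataclass
class BSTNode:
    # Hypothetical minimal BST node, no balancing.
    key: int
    left: Optional["BSTNode"] = None
    right: Optional["BSTNode"] = None


def bst_insert(root: Optional[BSTNode], key: int) -> BSTNode:
    """Insert key into an unbalanced BST and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def bst_min(root: BSTNode) -> int:
    """Walk left until there is no left child: O(h) steps."""
    while root.left is not None:
        root = root.left
    return root.key


keys = [5, 3, 8, 1, 4]

heap = list(keys)
heapq.heapify(heap)
root = None
for k in keys:
    root = bst_insert(root, k)

print(heap[0])        # minimum of the heap: a single array access
print(bst_min(root))  # minimum of the BST: follow left links from the root
```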

What are the differences between heap and red-black tree?

We know that heaps and red-black tree both have these properties:
worst-case cost for searching is lgN;
worst-case cost for insertion is lgN.
So, since the implementation and operation of red-black trees is difficult, why don't we just use heaps instead of red-black trees? I am confused.
You can't find an arbitrary element in a heap in O(log n). It takes O(n) to do this. You can find the first element (the smallest, say) in a heap in O(1) and extract it in O(log n). Red-black trees and heaps have quite different uses, internal orderings, and implementations: see below for more details.
Typical use
Red-black tree: storing a dictionary where, in addition to lookup, you want the elements sorted by key, so that you can for example iterate through them in order. Insert and lookup are O(log n).
Heap: priority queue (and heap sort). Extraction of minimum and insertion are O(log n).
Consistency constraints imposed by structure
Red-black tree: total ordering: left child < parent < right child.
Heap: dominance: parent < children only.
(note that you can substitute a more general ordering than <)
Implementation / Memory overhead
Red-black tree: pointers used to represent structure of tree, so overhead per element. Typically uses a number of nodes allocated on free store (e.g. using new in C++), nodes point to other nodes. Kept balanced to ensure logarithmic lookup / insertion.
Heap: structure is implicit: root is at position 0, children of root at 1 and 2, etc, so no overhead per element. Typically just stored in a single array.
Red Black Tree:
Form of a binary search tree with a deterministic balancing strategy. This balancing guarantees good performance: the tree can always be searched in O(log n) time.
Heaps:
We need to search through every element in the heap in order to determine if an element is inside. Even with optimization, I believe search is still O(N). On the other hand, a heap is best for finding the min/max of a set, in O(1).
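One possible reading of "optimization" here is pruning: in a min heap, a subtree whose root is already larger than the target cannot contain it. The sketch below assumes a binary min heap stored in a list (root at index 0); `heap_contains` is a name made up for this example. The worst case is still O(n), for instance when the target is larger than every key.

```python
def heap_contains(heap, target, i=0):
    """Search a binary min heap stored in a list, skipping subtrees whose root exceeds target."""
    if i >= len(heap) or heap[i] > target:
        return False  # past the end, or every key below heap[i] is also > target
    if heap[i] == target:
        return True
    return (heap_contains(heap, target, 2 * i + 1)
            or heap_contains(heap, target, 2 * i + 2))


heap = [1, 3, 2, 7, 4, 5]   # a valid binary min heap in level order
print(heap_contains(heap, 4))
print(heap_contains(heap, 6))
```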

If you have a binomial heap of size 14, how can you tell which node is the root node?

Hi guys, I just had a question about this diagram.
How can I tell which node is the root node and how would I heapify something like this?
Thank you.
Edit: Sorry, when I said heapify I meant make a max heap.
Normally with a regular heap, I would go from left to right, starting at the first node that isn't a leaf node and sift downwards. I don't see how I can do that here though.
This is a binomial heap, it doesn't have one root but a set of roots (because a binomial heap is a set of binomial trees).
What do you mean by "make a max heap" ?
Max heaps and binomial heaps are as close to each other as Java and JavaScript are.
If you extract the minimum n times, you obtain the elements in ascending order. Written out in descending order, that sorted array is a valid max heap. The complexity is O(n*log(n)).
I think you're trying to treat the binomial heap as a binary heap, which doesn't work.
A Binary Heap can be stored in an array without explicit links - the links are implicit in the positions within the array. An unordered array can be "heapified", reordering to make a valid binary heap in O(n) time. That is a key advantage of binary heaps - there's a lightweight implementation that uses memory well.
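A minimal sketch of that O(n) construction for a max heap, assuming the usual array layout (children of index i at 2i+1 and 2i+2). It sifts down every non-leaf node, starting from the last one and working back to the root. The names `sift_down` and `build_max_heap` are chosen for this example.

```python
def sift_down(a, i, n):
    """Move a[i] down until neither child (within a[:n]) is larger than it."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Reorder the list a in place into a binary max heap."""
    n = len(a)
    # Indices n // 2 .. n - 1 are leaves, so start at the last non-leaf node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


values = [3, 9, 2, 1, 4, 5]
build_max_heap(values)
print(values[0])  # the maximum is now at the root
```

This relies entirely on the implicit array layout, which is why it does not carry over to a binomial heap.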
I've never implemented a Binomial Heap and though I've studied them, that was a while ago. I'm pretty confident, though, that a binomial heap isn't a binary heap and can't be implemented that way. Binomial heaps have their own advantages, but they don't keep all the advantages of binary heaps. If binomial heaps were universally superior, no-one would care about binary heaps.
IIRC, the normal implementation of binomial trees (on which binomial heaps are based) is that you have a linked list of children for each parent node and a linked list of roots. Those linked lists use explicit links. This is how you support k children per node, with no upper bound on k.
The important extra operation for binomial heaps is the merge. If a binomial heap were stored in an array with implicit links, a merge would obviously require lots of copying - copying items from one array into the other for a start. The efficient merge would therefore be impossible - the key advantage of the binomial heap would be lost.
With explicit links, however, combining two binomial trees into one is an O(1) pointer-fiddling operation (adding an item to the head of a linked list), so two binomial heaps can be merged very efficiently, using O(log n) binomial tree merges.
It's a bit like the difference between a sorted array and a binary search tree. Sure, the sorted array has advantages, but it also has limitations. Some operations are more efficient when all you have to do is modify a link or two without moving items around in an array. Sometimes you don't need those operations, and it's more efficient to avoid the need for links and just binary search a sorted array, which is equivalent to searching a perfectly balanced binary search tree with implicit links.
Conceptually, the root should be the only node that has no ancestors - 1 in the case of your diagram.
