What are the real-world applications of Fibonacci heaps and binary heaps? It'd be great if you could share some instances where you used one to solve a problem.
Edit: Added binary heaps also. Curious to know.
You would rarely use one in real life. I believe the purpose of the Fibonacci heap was to improve the asymptotic running time of Dijkstra's algorithm. It might give you an improvement for very, very large inputs, but most of the time, a simple binary heap is all you need.
From Wikipedia:
"Although the total running time of a sequence of operations starting with an empty structure is bounded by the bounds given above, some (very few) operations in the sequence can take very long to complete (in particular delete and delete minimum have linear running time in the worst case). For this reason Fibonacci heaps and other amortized data structures may not be appropriate for real-time systems."
The binary heap is a data structure that can be used to quickly find the maximum (or minimum) value in a set of values. It's used in Dijkstra's algorithm (shortest path), Prim's algorithm (minimum spanning tree) and Huffman encoding (data compression).
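For example, Huffman encoding can be driven directly by a binary min-heap. Here is a minimal Python sketch using the standard-library heapq module (the function name and sample input are just illustrative, and it assumes at least two distinct symbols):

import heapq
from collections import Counter

def huffman_codes(text):
    # each heap entry is [frequency, [symbol, code], [symbol, code], ...]
    heap = [[freq, [sym, ""]] for sym, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)           # the two least frequent subtrees
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]        # extend the codes of the merged subtrees
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return {sym: code for _, *pairs in heap for sym, code in pairs}

print(huffman_codes("real world applications"))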
I can't say anything about Fibonacci heaps, but binary heaps are used to implement priority queues, and priority queues are widely used in real systems.
One well-known example is process scheduling in the kernel: the highest-priority process is taken first.
I have used priority queues when partitioning sets: the set with the most members had to be taken first for partitioning.
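For illustration, a minimal Python sketch of that pattern (the input sets are made up; heapq is a min-heap, so set sizes are negated to get max-first order):

import heapq

sets_to_partition = [{1, 2, 3}, {4, 5}, {6, 7, 8, 9}]           # made-up input
heap = [(-len(s), i) for i, s in enumerate(sets_to_partition)]
heapq.heapify(heap)
while heap:
    neg_size, i = heapq.heappop(heap)
    largest = sets_to_partition[i]
    print("partition next:", largest)                            # largest remaining set first
    # ...partition `largest` here and push any new sub-sets back with their sizes...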
In most scenarios, you have to choose based on the complexity of:
- insertion
- finding elements
And the usual suspects are:
- BST: O(log n) insert and find
- linked list: O(1) insert and O(n) find
- heap:
  - O(1) insert:
    - average case for a binary heap (see: https://stackoverflow.com/a/29548834/895245)
    - amortized for a Fibonacci heap (this is stronger than average case, weaker than worst case)
  - O(1) find for the first element only, O(n) in general:
    - worst case for a binary heap
    - amortized for a Fibonacci heap
There are also the Brodal queue and other heaps that reach O(1) worst case, but they require even larger queues than Fibonacci heaps to be worth it.
So if your algorithm only needs to "find" the first element and do lots of insertions, heaps are a good choice.
As others mentioned, this is the case for Dijkstra.
Priority queues are usually implemented as heaps, for example: http://download.oracle.com/javase/6/docs/api/java/util/PriorityQueue.html
Computing the top N elements from a huge data-set can be done efficiently using binary heaps (e.g. top search queries in a large-scale website).
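One common way to do this is a bounded min-heap of size N: anything smaller than the heap's current minimum is discarded immediately, so memory stays O(N) even for a huge stream. A minimal Python sketch (top_n and the sample data are hypothetical):

import heapq

def top_n(stream, n):
    heap = []                               # min-heap of the n largest items seen so far
    for item in stream:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:                # beats the smallest of the current top n
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)

# e.g. the 3 most frequent search queries from hypothetical (count, query) pairs
print(top_n([(120, "heap"), (45, "bst"), (300, "sort"), (80, "graph")], 3))

The standard library's heapq.nlargest does essentially the same thing in one call.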
I want to know the basic difference between binary, binomial, and Fibonacci heaps and in which scenarios they are best to use.
I am mainly concerned with their application in Dijkstra's algorithm: how will its time complexity vary depending on the type of heap used?
According to Wikipedia, a binary heap is a heap data structure built on a binary tree. It can be seen as a binary tree with two additional constraints: the shape property (it is a complete binary tree) and the heap property (every node is either greater than or equal to, or less than or equal to, each of its children).
A binomial heap is more complex than a binary heap. However, it has excellent merge performance, bounded by O(log n) time. A binomial heap consists of a list of binomial trees.
Before jumping into Fibonacci heaps, it's probably good to explore why we even need them in the first place. There are plenty of other types of heaps (binary heaps and binomial heaps, for example), so why do we need another one?
The main reason comes up in Dijkstra's algorithm and Prim's algorithm. Both of these graph algorithms work by maintaining a priority queue holding nodes with associated priorities. Interestingly, these algorithms rely on a heap operation called decrease-key that takes an entry already in the priority queue and then decreases its key (i.e. increases its priority). In fact, a lot of the runtime of these algorithms is explained by the number of times you have to call decrease-key. If we could build a data structure that optimized decrease-key, we could optimize the performance of these algorithms. In the case of the binary heap and binomial heap, decrease-key takes time O(log n), where n is the number of nodes in the priority queue. If we could drop that to O(1), then the time complexities of Dijkstra's algorithm and Prim's algorithm would drop from O(m log n) to O(m + n log n), which is asymptotically faster than before. Therefore, it makes sense to try to build a data structure that supports decrease-key efficiently.
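To make the role of decrease-key concrete, here is a hedged Python sketch of Dijkstra's algorithm using a binary heap (the standard heapq module) that has no decrease-key at all: the usual workaround is to push a duplicate entry with the smaller key and skip stale entries when they are popped. The adjacency-list format is an assumption made for the example; a Fibonacci heap would replace the duplicate push with a true O(1) decrease-key.

import heapq

def dijkstra(graph, source):
    # graph: {node: [(neighbor, weight), ...]} -- hypothetical adjacency-list format
    dist = {source: 0}
    heap = [(0, source)]                              # (tentative distance, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                                  # stale entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))         # stands in for decrease-key
    return dist

print(dijkstra({"a": [("b", 2), ("c", 5)], "b": [("c", 1)]}, "a"))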
If you're interested in learning more about Fibonacci heaps, you may want to check out this two-part series of lecture slides. Part one introduces binomial heaps and shows how lazy binomial heaps work. Part two explores Fibonacci heaps. These slides go into more mathematical depth than what I've covered here.
Are Fibonacci heaps used in practice anywhere? I've looked around on SO and found answers to related questions (see below) but nothing that actually quite answers the question.
There are good implementations of Fibonacci heaps out there, including in well-known libraries such as Boost (C++). The fact that these libraries contain Fibonacci heaps suggests to me that they must be useful somewhere.
We know that certain conditions need to be met for a Fibonacci heap to be faster in practice: "to benefit from Fibonacci heaps in practice, you have to use them in an application where decrease_keys are incredibly frequent"; "For the Fibonacci Heap to really shine, you need either of the following cases: a) Expensive comparisons: Fib Heaps minimize the number of comparisons required to organize the data. b) The majority of operations is updateKey/insert/delete. As Fibonacci Heaps 'group' the updates together until the next extractMin, the larger the 'batch', the more efficient it gets."
There is a data structure called a "Brodal queue" (which I'm not sure I'd heard of before) that seems to have time-complexity behavior at least as good as a Fibonacci heap's. Here's a nice table with a comparison of time complexities for various operations for different varieties of heaps.
On a question about whether there are any applications of Fibonacci or binomial heaps, answerers only gave examples of binomial heaps.
To the best of my knowledge, there are no major applications that actually use Fibonacci heaps or Brodal queues.
Fibonacci heaps were initially designed to satisfy a theoretical rather than a practical need: to speed up Dijkstra's shortest paths algorithm asymptotically. The Brodal queue (and the related functional data structure) were similarly designed to meet theoretical guarantees, specifically, to answer a longstanding open question about whether it was possible to match the time bounds of a Fibonacci heap with worst-case guarantees rather than amortized guarantees. In that sense, the data structures were not developed to meet practical needs, but rather to push forward our theoretical understanding of the limits of algorithmic efficiency. To the best of my knowledge, there are no present algorithms in which it would actually be better to use a Brodal queue over a Fibonacci heap.
As other answers have noted, the constant factors hidden in a Fibonacci heap or Brodal queue are very high. They need a lot of pointers wired in lots of complicated linked lists and, accordingly, have absolutely terrible locality of reference, especially compared to a standard binary heap. This means that they're likely to perform worse in practice given caching effects unless you have algorithms that need a colossally large number of decrease-key operations. There are some cases where this comes up (the linked answers talk about a few of them, for example), but treat them as highly specialized circumstances rather than common use cases. If you're working on huge graphs, it's more common to use other techniques to improve efficiency, such as using approximation algorithms for the problem at hand, better heuristics, or algorithms that use specific properties of the underlying data.
Hope this helps!
I am presently studying sorting algorithms. I have learned that the quicksort algorithm's performance depends on the initial organization of the data: if the array is already sorted, quicksort becomes slower. Is there any other sort whose performance depends on the initial organization of the data?
Of course. Insertion sort will take O(n) comparisons on descending sorted input, assuming pop() removes elements from the front of the array:

def insertion_sort(arr):
    out = []
    while arr:
        x = arr.pop(0)      # pop from the front; pop_last() would be arr.pop()
        i = 0               # insert x into the already-sorted list `out`
        while i < len(out) and out[i] < x:
            i += 1
        out.insert(i, x)
    return out
because each insert call then performs only O(1) comparisons. If pop_last() (popping from the end) is used instead of pop(), it will instead be fastest on ascending sorted input (this assumes pop() and/or pop_last() are O(1) themselves).
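For example, with the Python version above:

print(insertion_sort([5, 4, 3, 2, 1]))   # fast case: each insert stops after one comparison
print(insertion_sort([1, 2, 3, 4, 5]))   # slow case: each insert scans all of `out`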
All fast sort algorithms minimize comparison and move operations. Minimizing move operations is dependent on the initial element ordering. I'm assuming you mean initial element ordering by initial organization.
Additionally, the fastest real-world algorithms exploit locality of reference, which also shows a dependence on the initial ordering.
Perhaps you are only interested in dependencies that slow down or speed up the sorting dramatically: for example, bubble sort (with an early-exit check) will complete in one pass on already-sorted data.
Finally, many sorting algorithms have average time complexity O(N log N) but worst-case complexity O(N^2). This means that there exist specific inputs (e.g. sorted or reverse-sorted) that provoke the bad run-time behaviour in these O(N^2) algorithms. Some quicksort versions are examples of such algorithms.
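To see this concretely, here is a small illustrative sketch (not from any paper or library) that counts comparisons for a quicksort that always picks the first element as pivot; it uses an explicit stack instead of recursion so the sorted-input case doesn't hit Python's recursion limit:

import random

def quicksort_comparisons(a):
    # count element comparisons made by quicksort with the first element as pivot
    a = list(a)
    comparisons = 0
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):          # Lomuto-style partition around a[lo]
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]
        stack.extend([(lo, i - 1), (i + 1, hi)])
    return comparisons

data = list(range(1000))
print(quicksort_comparisons(random.sample(data, len(data))))   # on the order of n log n
print(quicksort_comparisons(data))                             # ~n^2/2 on already-sorted input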
If what you're asking is "should I worry about which sorting algorithm to pick on a case-by-case basis?", then unless you're processing billions of operations, the short answer is "no". Most of the time quicksort will be just fine (a quicksort with a well-chosen pivot, like Java's).
In general cases, quicksort is good enough.
On the other hand, if your system always receives its source data in a consistent initial ordering, and sorting it costs significant CPU time and power each time, then you should definitely find the right algorithm for that corner case.
In (most of the) research papers on sorting, authors conclude that their algorithm takes n-1 comparisons to sort an n-sized array (where n is the size of the array),
...so and so
but when it comes to coding, the code uses more comparisons than the papers conclude.
More specifically, what assumptions do they make about the comparisons?
What kinds of comparisons do they not take into account?
For example, if you take a look at Freezing Sort or Enhanced Insertion Sort, the number of comparisons these algorithms take in actual code is more than they have specified in the graph (no. of comparisons vs. no. of elements).
The least possible number of comparisons done by a sorting algorithm is n-1. In that case, you wouldn't actually be sorting at all; you'd just be checking whether the data is already sorted, essentially comparing each element to the ones directly before and after it (this is the best case for insertion sort). It's fairly easy to see that it's impossible to do fewer comparisons than this, because then you'd have more than one disjoint set of compared elements, meaning you wouldn't know how the elements across these sets compare to each other.
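For illustration, that n-1-comparison "is it already sorted?" check is trivial to write:

def is_sorted(a):
    # exactly n-1 adjacent comparisons; fewer could not connect all n elements
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))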
If we're talking about average / worst case, it's actually been proven that the number of comparisons required is Ω(n log n).
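The bound comes from the standard counting argument: a comparison sort must distinguish all n! possible input orderings, and k yes/no comparisons can distinguish at most 2^k of them, so

$$2^k \ge n! \;\Longrightarrow\; k \ge \log_2 n! = \sum_{i=1}^{n}\log_2 i \ge \frac{n}{2}\log_2\frac{n}{2} = \Omega(n \log n).$$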
An algorithm being recursive or iterative doesn't (directly) affect the number of comparisons. The only statement I could think that we could make specifically about recursive sorting algorithms is perhaps the recursion depth. This greatly depends on the algorithm, but quick-sort, specifically, has a (worst-case) recursion depth around n-1.
More comparisons that are often ignored in papers, but are performed in real code, are the comparisons for branches (if (<stop clause>) return ...;), and similarly for loop iterators.
One reason they are mostly ignored is that they are done on indices, which are of constant size, while the compared elements (which we do count) might take more time to compare, depending on the actual type being compared (strings might take longer to compare than integers, for example).
Also note that an array cannot be sorted using n-1 comparisons (worst/average case), since sorting is an Ω(n log n) problem.
However, it is possible that what the author meant is that the sorting takes n-1 comparisons at each step of the algorithm, and there could be multiple (typically O(log n)) such steps.
I am interested in implementing a priority queue to enable an efficient A* implementation that is also relatively simple (I mean that the priority queue itself is simple).
It seems that because a skip list offers a simple O(1) extract-min operation and an O(log N) insert operation, it may be competitive with the more difficult-to-implement Fibonacci heap, which has O(log N) extract-min and O(1) insert. I suppose that the skip list would be better for a graph with sparsely connected nodes, whereas a Fibonacci heap would be better for a more densely connected graph.
This would probably make the Fibonacci Heap usually better, but am I correct in assuming that Big-Oh wise these would be similar?
The raison d'etre of the Fibonacci heap is the O(1) decrease-key operation, enabling Dijkstra's algorithm to run in time O(|V| log |V| + |E|). In practice, however, if I needed an efficient decrease-key operation, I'd use a pairing heap, since the Fibonacci heap has awful constants. If your keys are small integers, it may be even better just to use bins.
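The "bins" idea is essentially a bucket queue, as in Dial's variant of Dijkstra's algorithm. A minimal sketch, assuming keys are integers in a small known range [0, max_key]; the class and method names are made up, and pop_min assumes the queue is non-empty:

class BucketQueue:
    def __init__(self, max_key):
        self.buckets = [[] for _ in range(max_key + 1)]   # one bin per possible key
        self.cur = 0                                      # smallest possibly non-empty bin
    def push(self, key, item):
        self.buckets[key].append(item)
        self.cur = min(self.cur, key)
    def pop_min(self):
        while not self.buckets[self.cur]:                 # skip empty bins
            self.cur += 1
        return self.buckets[self.cur].pop()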
Fibonacci heaps are very very very slow except for very very very very large and dense graphs (on the order of hundreds of millions of edges). They are also notoriously difficult to implement correctly.
On the other hand, skip lists are very nice data structures and relatively simple to implement.
However, I wonder why you're not considering using a simple binary heap. I believe binary-heap-based priority queues are even faster than skip-list-based priority queues. Skip lists are mainly used to take advantage of concurrency.
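For completeness, here is a hedged sketch of how small the binary-heap version is for A* itself, reusing the same duplicate-push trick in place of decrease-key (all names, the grid, and the heuristic are made up for the example):

import heapq

def astar(start, goal, neighbors, h):
    # binary-heap open set: entries are (f, g, node); duplicate pushes stand in for decrease-key
    open_set = [(h(start), 0, start)]
    best_g = {start: 0}
    came_from = {}
    while open_set:
        f, g, node = heapq.heappop(open_set)
        if node == goal:
            path = [node]
            while node in came_from:
                node = came_from[node]
                path.append(node)
            return path[::-1]
        if g > best_g[node]:
            continue                                      # stale heap entry
        for nb, w in neighbors(node):
            ng = g + w
            if ng < best_g.get(nb, float("inf")):
                best_g[nb] = ng
                came_from[nb] = node
                heapq.heappush(open_set, (ng + h(nb), ng, nb))
    return None

# 4-connected 5x5 grid with a Manhattan-distance heuristic (hypothetical example)
def grid_neighbors(p):
    x, y = p
    return [((x + dx, y + dy), 1) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < 5 and 0 <= y + dy < 5]

print(astar((0, 0), (4, 4), grid_neighbors, lambda p: abs(p[0] - 4) + abs(p[1] - 4)))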