Soft heap: where and why is it useful? [duplicate]

This question already has answers here:
Soft heaps: what is corruption and why is it useful?
From the paper by Bernard Chazelle that I was reading: https://www.cs.princeton.edu/courses/archive/fall05/cos528/handouts/The%20Soft%20Heap.pdf
I failed to find the soft heap being used much in practical scenarios, so it would be helpful if someone could explain why it is really useful.

I haven't read the article, only the abstract, which I quote:
The soft heap can be used to compute exact or approximate medians and percentiles optimally. It is also useful for approximate sorting and for computing minimum spanning trees of general graphs.
So it has some uses in graph algorithms and in computing medians.
In graph algorithms there's a popular algorithm called Prim's algorithm, which finds minimum spanning trees of general graphs using an ordinary priority queue. Soft heaps themselves are the key ingredient in Chazelle's own deterministic minimum spanning tree algorithm, which runs in O(m α(m,n)) time, the fastest known deterministic bound.
You might be familiar with the plain old heap, which is prized for its fast operations. The soft heap shares that speed, but buys it by allowing "corruption": it may artificially raise the keys of up to a fixed fraction ε of its elements, and in exchange every operation runs in constant amortized time except insert, which costs O(log(1/ε)).

Related

What is the time complexity of A* search [closed]

I'm new to Stack Overflow, but I'm here because I've searched everywhere and can't seem to find much info on the time complexity of A*, besides what's on the wiki. I would also like to compare it to Dijkstra's algorithm and see how adding a heuristic in A* improves its performance.
I know it's a very advanced topic, but I just can't fully understand it from the info given on wiki (Even the analysis of Dijkstra's algorithm on wiki seems quite advanced).
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
https://en.wikipedia.org/wiki/A*_search_algorithm
I would greatly appreciate it if anyone could explain the time complexity in more detail, or suggest any reading / learning material on the topic. I do have a good understanding of the A* algorithm, but I've just started learning the analysis thereof now.
The answer is simply: it depends. A* by itself is not a complete algorithm. A* is Dijkstra's algorithm with a heuristic that fulfills certain properties (admissibility, and usually consistency, which is a triangle-inequality-like condition). You can select different heuristic functions that lead to different time complexities. The simplest heuristic is straight-line distance; there are also more advanced ones, such as the landmarks heuristic.
In the worst case you still need to explore the whole graph, so from a general point of analysis you won't get a better bound than Dijkstra's.
However in most practical applications you can achieve much better bounds.
This is only when you know some properties of your graph and of your heuristic function. You then can make some assumptions which lead to a better complexity, but only for those instances.
For example, if you know that the straight-line distance is always the correct distance in your graph and you use it as the heuristic, then your A* will expand only the nodes on a shortest path, which is the best possible behavior. However, this is much too strong an assumption for most applications, but you can see where this goes.
The bottom line is: it depends heavily on the structure of your graph and on your heuristic function.
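To make the comparison concrete, here is a minimal Python sketch (the function names and the grid example are invented for illustration, not taken from any library): with a heuristic that always returns 0 it behaves exactly like Dijkstra, while an admissible, consistent straight-line heuristic only narrows the search.

```python
import heapq
import math

# Minimal A* sketch (hypothetical helper names). With h == 0 this is
# exactly Dijkstra; an admissible, consistent h only prunes the search.
def a_star(start, goal, neighbors, cost, h):
    open_heap = [(h(start, goal), 0, start)]  # entries are (f = g + h, g, node)
    g_best = {start: 0}                       # cheapest known cost to each node
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g                          # first pop of the goal is optimal
        if g > g_best.get(node, math.inf):
            continue                          # stale heap entry, skip it
        for nxt in neighbors(node):
            g_new = g + cost(node, nxt)
            if g_new < g_best.get(nxt, math.inf):
                g_best[nxt] = g_new
                heapq.heappush(open_heap, (g_new + h(nxt, goal), g_new, nxt))
    return math.inf                           # goal unreachable

# Example on a 4-connected grid with unit step costs:
def grid_neighbors(p):
    x, y = p
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

euclid = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
print(a_star((0, 0), (3, 4), grid_neighbors, lambda a, b: 1, euclid))  # 7
```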
Here's a lecture on A star as you ask for learning material: Efficient Route Planning (A*, Landmarks, Set Dijkstra) - University of Freiburg
There is also a lot of material on the internet; the algorithm is quite popular, as it is very easy to implement and in most cases already fast enough (non-complex games, for example).

Are Fibonacci heaps or Brodal queues used in practice anywhere?

Are Fibonacci heaps used in practice anywhere? I've looked around on SO and found answers to related questions (see below) but nothing that actually quite answers the question.
There are good implementations of Fibonacci heaps out there, including in standard libraries such as Boost for C++. The fact that these libraries contain Fibonacci heaps suggests to me that they must be useful somewhere.
We know that certain conditions need to be met for a Fibonacci heap to be faster in practice: "to benefit from Fibonacci heaps in practice, you have to use them in an application where decrease_keys are incredibly frequent"; "For the Fibonacci Heap to really shine, you need either of the following cases: a) Expensive comparisons: Fib Heaps minimize the number of comparisons required to organize the data. b) The majority of operations is updateKey/insert/delete. As Fibonacci Heaps 'group' the updates together until the next extractMin, the larger the 'batch', the more efficient it gets."
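For context, here is the workaround most practical code uses instead of a real decrease-key, sketched in Python with the standard heapq module (the class and method names are my own invention): push a new entry and lazily skip stale ones on extraction. It is this "good enough" pattern that Fibonacci heaps have to beat.

```python
import heapq

# Lazy-deletion priority queue: a common substitute for decrease-key,
# which binary-heap libraries such as heapq do not provide directly.
class LazyPQ:
    def __init__(self):
        self.heap = []   # (priority, item) pairs, possibly stale
        self.best = {}   # item -> best priority seen so far

    def push_or_decrease(self, item, priority):
        # Only record an entry if it actually improves the priority.
        if priority < self.best.get(item, float("inf")):
            self.best[item] = priority
            heapq.heappush(self.heap, (priority, item))

    def pop_min(self):
        while self.heap:
            priority, item = heapq.heappop(self.heap)
            if self.best.get(item) == priority:  # skip superseded entries
                del self.best[item]
                return item, priority
        raise IndexError("pop from empty queue")
```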
There is a data structure called a "Brodal queue" (which I'm not sure I'd heard of before) that seems to have time complexity behavior at least as good as that of Fibonacci heaps. Here's a nice table comparing the time complexities of various operations across different varieties of heaps.
On a question about whether there are any applications of Fibonacci or binomial heaps, answerers only gave examples of binomial heaps.
To the best of my knowledge, there are no major applications that actually use Fibonacci heaps or Brodal queues.
Fibonacci heaps were initially designed to satisfy a theoretical rather than a practical need: to speed up Dijkstra's shortest paths algorithm asymptotically. The Brodal queue (and the related functional data structure) were similarly designed to meet theoretical guarantees, specifically, to answer a longstanding open question about whether it was possible to match the time bounds of a Fibonacci heap with worst-case guarantees rather than amortized guarantees. In that sense, the data structures were not developed to meet practical needs, but rather to push forward our theoretical understanding of the limits of algorithmic efficiency. To the best of my knowledge, there are no present algorithms in which it would actually be better to use a Brodal queue over a Fibonacci heap.
As other answers have noted, the constant factors hidden in a Fibonacci heap or Brodal queue are very high. They need a lot of pointers wired in lots of complicated linked lists and, accordingly, have absolutely terrible locality of reference, especially compared to a standard binary heap. This means that they're likely to perform worse in practice given caching effects unless you have algorithms that need a colossally large number of decrease-key operations. There are some cases where this comes up (the linked answers talk about a few of them, for example), but treat them as highly specialized circumstances rather than common use cases. If you're working on huge graphs, it's more common to use other techniques to improve efficiency, such as using approximation algorithms for the problem at hand, better heuristics, or algorithms that use specific properties of the underlying data.
Hope this helps!

Iterate through all trees of a given size

I am often faced with the problem of checking some property of trees (the graph ones) of a given size by brute force. Do you have any nice tricks for doing this? Ideally, I'd like to examine each isomorphism class only once (but after all, speed is all that matters).
Bit twiddling tricks are most welcome since n is usually less than 32 :)
I'm asking for slightly more refined algorithms than the likes of "loop through all (n-1)-edge subsets and check if they form a tree" for trees on n nodes.
This is covered in Knuth's The Art of Computer Programming, in the volume on Combinatorial Algorithms. If I remember correctly, it appears there as an exercise, and since he provides solutions to his exercises, I point you there.
Some googling turned up the following algorithm description: http://www.cs.auckland.ac.nz/compsci720s1c/lectures/mjd/treenotes.pdf. They adapt an algorithm for enumerating rooted trees to enumerating unrooted trees.
Apparently others have proved that this requires only amortised constant time per tree, and the PDF shows some performance measurements demonstrating this.
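As a simpler baseline than the algorithm in that PDF (this is not their method), all labelled trees on n nodes can be enumerated through Prüfer sequences, since by Cayley's formula there are exactly n^(n-2) of them. Isomorphic trees appear multiple times, but for small n that is often acceptable. A hedged Python sketch:

```python
from itertools import product

# Decode a Prüfer sequence into the edge list of the labelled tree it
# encodes (standard O(n^2) decoding, fine for small n).
def prufer_to_edges(seq, n):
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        leaf = min(u for u in range(n) if degree[u] == 1)  # smallest current leaf
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    u, w = [x for x in range(n) if degree[x] == 1]  # two nodes remain
    edges.append((u, w))
    return edges

def all_labelled_trees(n):
    if n == 1:
        yield []                      # single node, no edges
        return
    for seq in product(range(n), repeat=n - 2):
        yield prufer_to_edges(seq, n)

print(sum(1 for _ in all_labelled_trees(4)))  # 16 == 4**(4-2)
```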

Real world applications of Binary heaps and Fibonacci Heaps [closed]

What are the real world applications of Fibonacci heaps and binary heaps? It'd be great if you could share some instance when you used it to solve a problem.
Edit: Added binary heaps also. Curious to know.
You would rarely use one in real life. I believe the purpose of the Fibonacci heap was to improve the asymptotic running time of Dijkstra's algorithm. It might give you an improvement for very, very large inputs, but most of the time, a simple binary heap is all you need.
From Wiki:
Although the total running time of a sequence of operations starting with an empty structure is bounded by the bounds given above, some (very few) operations in the sequence can take very long to complete (in particular delete and delete minimum have linear running time in the worst case). For this reason Fibonacci heaps and other amortized data structures may not be appropriate for real-time systems.
The binary heap is a data structure that can be used to quickly find the maximum (or minimum) value in a set of values. It's used in Dijkstra's algorithm (shortest path), Prim's algorithm (minimum spanning tree) and Huffman encoding (data compression).
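As an illustration of the Huffman use just mentioned, here is a small hedged Python sketch (the function name and tie-breaking scheme are mine): a min-heap repeatedly merges the two least-frequent subtrees until one tree remains.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    freqs = Counter(text)
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                     # degenerate one-symbol input
        return {heap[0][2]: "0"}
    counter = len(heap)                    # tiebreaker for equal frequencies
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):        # internal node: recurse left/right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                              # leaf: record the symbol's code
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

print(huffman_codes("abracadabra"))
```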
Can't say about Fibonacci heaps, but binary heaps are used in priority queues, and priority queues are widely used in real systems.
One well-known example is process scheduling in the kernel: the highest-priority process is taken first.
I have used priority queues for partitioning sets: the set with the most members was taken first for partitioning.
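A hypothetical sketch of that pattern (the function name, split rule, and size limit are invented for illustration): since Python's heapq is a min-heap, sizes are negated so the largest set is always popped first.

```python
import heapq

# Repeatedly take the largest set and split it until none exceeds
# max_size (assumed >= 1). The index i breaks ties so lists are never compared.
def partition_largest_first(sets, max_size):
    heap = [(-len(s), i, s) for i, s in enumerate(sets)]
    heapq.heapify(heap)
    counter = len(sets)
    done = []
    while heap:
        neg_size, _, s = heapq.heappop(heap)
        if -neg_size <= max_size:
            done.append(s)                 # small enough, keep as-is
            continue
        mid = len(s) // 2
        for half in (s[:mid], s[mid:]):    # naive split in half
            heapq.heappush(heap, (-len(half), counter, half))
            counter += 1
    return done

print(partition_largest_first([list(range(10)), [1, 2]], max_size=3))
```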
In most scenarios, you have to choose based on the complexity of:
insertion
finding elements
And the usual suspects are:
BST: O(log n) insert and find
linked list: O(1) insert and O(n) find
heap:
O(1) insert: average for a binary heap (see https://stackoverflow.com/a/29548834/895245), amortized for Fibonacci (stronger than average, weaker than worst case)
O(1) find for the first element only (worst case for a binary heap, amortized for Fibonacci); O(n) find in general
There is also the Brodal queue and other heaps that reach O(1) worst case, but they require even larger queues than Fibonacci heaps to be worth it.
So if your algorithm only needs to "find" the first element and do lots of insertions, heaps are a good choice.
As others mentioned, this is the case for Dijkstra.
Priority queues are usually implemented as heaps, for example: http://download.oracle.com/javase/6/docs/api/java/util/PriorityQueue.html
Computing the top N elements from a huge data-set can be done efficiently using binary heaps (e.g. top search queries in a large-scale website).
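For instance (a hedged sketch with invented names; heapq.nlargest does the same job in the standard library), keep a bounded min-heap of the current best N so that the weakest kept candidate sits at the root:

```python
import heapq

# Top-N via a bounded min-heap: each new item usually needs only one
# comparison against the root (the weakest of the N best seen so far).
def top_n(stream, n):
    heap = []
    for score, item in stream:
        if len(heap) < n:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, item))  # evict current weakest
    return sorted(heap, reverse=True)

queries = [(120, "heap"), (903, "python"), (45, "soft heap"), (501, "a*")]
print(top_n(queries, 2))  # [(903, 'python'), (501, 'a*')]
```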

Understanding Ukkonen's algorithm for suffix trees [duplicate]

This question already has answers here:
Ukkonen's suffix tree algorithm in plain English
I'm doing some work with Ukkonen's algorithm for building suffix trees, but I'm not understanding some parts of the author's explanation of its linear-time complexity.
I have learned the algorithm and have coded it, but the paper I'm using as the main source of information (linked below) is somewhat confusing in places, so it's not really clear to me why the algorithm is linear.
Any help? Thanks.
Link to Ukkonen's paper: http://www.cs.helsinki.fi/u/ukkonen/SuffixT1withFigs.pdf
Find a copy of Gusfield's string algorithms textbook (Algorithms on Strings, Trees, and Sequences). It has the best exposition of the suffix tree construction I've seen. The linearity is a surprising consequence of a number of optimizations to the high-level algorithm.
