What is the worst-case runtime of the most efficient algorithm to build a Huffman tree?

I've searched everywhere online and can't seem to find an answer to this. Given a sorted list of frequencies, what is the most efficient algorithm for creating a Huffman tree, and what would its worst-case big O be?

If you have the frequencies already sorted, the basic greedy algorithm can be implemented to run in linear time (for example with two queues instead of a heap). You can't beat that asymptotically, since you have to process each node at least once.
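The standard way to get linear time from already-sorted frequencies is the two-queue trick: leaves are consumed from one queue, merged nodes are appended to a second queue in nondecreasing weight order, so the two smallest weights are always at the queue fronts. A minimal sketch (the tuple node representation is my own):

```python
from collections import deque

def huffman_tree(freqs):
    """Build a Huffman tree from frequencies sorted in ascending order.

    Nodes are tuples (weight, left, right); leaves have
    left == right == None. Both queues stay sorted by construction,
    so each merge pops the two smallest weights in O(1).
    """
    q1 = deque((f, None, None) for f in freqs)  # leaves, sorted
    q2 = deque()                                # merged internal nodes

    def pop_min():
        # Take from whichever queue has the smaller front weight.
        if not q2 or (q1 and q1[0][0] <= q2[0][0]):
            return q1.popleft()
        return q2.popleft()

    while len(q1) + len(q2) > 1:
        a = pop_min()
        b = pop_min()
        q2.append((a[0] + b[0], a, b))
    return pop_min()
```

Each of the n - 1 merges does O(1) work, hence O(n) overall.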

Related

Big O algorithms minimum time

I know that for some problems, no matter what algorithm you use to solve them, there will always be a certain minimum amount of time required. I know big O captures the worst case (maximum time needed), but how can you find the minimum time required as a function of n? Can we find the minimum time needed for sorting n integers, or perhaps for finding the minimum of n integers?
What you are looking for is called best-case complexity. It is a mostly useless kind of analysis: worst-case analysis is the most important, and average-case analysis is sometimes used in special scenarios.
The best-case complexity depends on the algorithm. For example, in a linear search the best case is when the searched number is at the beginning of the array; in a binary search it is when the target sits at the first dividing point. In both cases the complexity is O(1).
For a single problem, the best-case complexity may vary depending on the algorithm. For example, let's discuss some basic sorting algorithms.
In bubble sort the best case is when the array is already sorted, but even then you have to check every element to be sure, so the best case is O(n). The same goes for insertion sort.
For quicksort/mergesort/heapsort the best-case complexity is O(n log n).
For selection sort it is O(n^2).
So from the above you can see that the complexity (whether best, worst, or average) depends on the algorithm, not on the problem.
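The linear-search case above can be made concrete with a comparison counter (a toy illustration; the function and its return shape are mine):

```python
def linear_search(arr, target):
    """Return (index, comparisons) so the cost of each case is visible."""
    steps = 0
    for i, x in enumerate(arr):
        steps += 1
        if x == target:
            return i, steps
    return -1, steps

arr = list(range(1000))
assert linear_search(arr, 0) == (0, 1)         # best case: 1 comparison, O(1)
assert linear_search(arr, 999) == (999, 1000)  # worst case: n comparisons, O(n)
```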

Efficiently rebalancing a tree of 2^n-1 nodes?

I stumbled upon this question:
Given a binary search tree with 2^n - 1 nodes, give an efficient algorithm to convert it into a self-balancing tree (like an AVL or red-black tree), and analyze its worst-case running time as a function of n.
Well, I think the most efficient algorithm runs in O(n) time for n nodes, but the 2^n - 1 node count is the tricky part. Any idea what the running time will be then?
Any help will be greatly appreciated.
If you've already got a linear-time algorithm for solving this problem, great! Think of it this way. Let m = 2^n - 1. If you have an algorithm that balances the tree and runs in time linear in the number of nodes, then your algorithm runs in time O(m) in this case, which is great. Don't let the exponential scare you: if the runtime is O(2^n) on inputs of size 2^n - 1, then you're still running in time linear in the input size.
As for particular algorithms, you seem to already know one, but if you haven't heard of it already, check out the Day-Stout-Warren algorithm, which optimally rebuilds a tree and does so in linear time and constant space.
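If you don't need Day-Stout-Warren's O(1) extra space, the simplest linear-time rebalance is to flatten the tree in order and rebuild it by bisection. A minimal sketch (my own Node class; uses O(m) extra space, unlike DSW):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rebalance(root):
    """Rebalance any BST in time linear in its node count m.

    Step 1 flattens the tree to a sorted key list via in-order
    traversal; step 2 rebuilds a perfectly balanced tree by
    recursive bisection. Each step touches every node once: O(m).
    """
    keys = []
    def inorder(node):
        if node:
            inorder(node.left)
            keys.append(node.key)
            inorder(node.right)
    inorder(root)

    def build(lo, hi):
        if lo > hi:
            return None
        mid = (lo + hi) // 2
        return Node(keys[mid], build(lo, mid - 1), build(mid + 1, hi))
    return build(0, len(keys) - 1)
```

On a tree of m = 2^n - 1 nodes this runs in O(m) = O(2^n), matching the answer above.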

Splay tree: worst-case sequence

I want to try executing the worst-case sequence on a splay tree.
But what is the worst-case sequence for splay trees?
And is there any way to calculate this sequence easily, given the keys that are inserted into the tree?
Can anyone help me with this?
Unless someone corrects me, I'm going to go with "no one actually knows what the worst-case series of operations is on a splay tree or what the complexity is in that case."
While we do know many results about the efficiency of splay trees, we actually don't know all that much about how to bound the time complexity of a splay tree. There's a conjecture called the dynamic optimality conjecture that says that in the worst case, any sufficiently long series of operations on a splay tree will take no more than a constant amount of time more than the best possible self-adjusting binary search tree on that series of operations. One of the challenges we're having in trying to prove this is that no one actually knows how to determine the cost of the best possible BST on all inputs. Another is that finding upper bounds on the runtimes of various input combinations to splay trees is hard - as of now, no one knows whether it takes time O(n) to treat a splay tree as a deque!
Hope this helps!
I don't know whether an attempted answer after more than five years is of any use to you, but sorry, I finished my Master's in CS only recently :-) In the wake of that, I played around with exactly your question.
Consider the sequence S(3,2) (it should be obvious how S(m,n) works in general if you graph it): S(3,2)=[5,13,6,14,3,15,4,16,1,17,2,18,11,19,12,20,9,21,10,22,7,23,8,24]. Splay is so lousy on this sequence type that the competitive ratio r against the "Greedy Future" algorithm (see Demaine) reaches 2 in the limit S(infty,infty). I was never able to get over 2, even though Greedy Future is also not completely optimal and I could shave off a few operations.
(The original answer included a plot in the Demaine geometric formulation: black, grey, and blue points showing S(7,4), plus purple, orange, and red points that Splay must additionally access.)
But note that your question is somewhat ill-defined! If you ask for the absolutely worst sequence, then take e.g. the bit-reversal sequence: ANY tree algorithm needs O(n log n) for that. But if you ask for the competitive ratio r, as implied in templatetypedef's answer, then indeed nobody knows (though I would bet on r=2, see above).
Feel free to email me for details, I'm easily googled.

Need an efficient selection algorithm?

I am looking for an algorithm for selecting the element A[N/4] in an unsorted array A, where N is the number of elements of A. I want the algorithm to do the selection in sublinear time. I have knowledge of basic structures like a BST. Which algorithm would be best for me, keeping in mind that I want it to be as fast as possible and not too tough to implement? Here N can be up to 250000. Note that the array can have non-unique elements. Any help will be highly appreciated.
As @Jerry Coffin mentioned, you cannot hope to get a sublinear-time algorithm here unless you are willing to do some preprocessing up front. If you want a linear-time algorithm for this problem, you can use the quickselect algorithm, which runs in expected O(n) time with an O(n^2) worst case. The median-of-medians algorithm has worst-case O(n) behavior, but a high constant factor. One algorithm that you might find useful is introselect, which combines the two previous algorithms to get a worst-case O(n) algorithm with a low constant factor; it is typically what's used to implement std::nth_element in the C++ standard library.
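A minimal quickselect sketch for the A[N/4] query. Hedged: this version copies sublists for clarity, so it uses O(n) extra space; real std::nth_element-style implementations partition in place.

```python
import random

def quickselect(arr, k):
    """Return the k-th smallest element (0-indexed) of arr.

    Expected O(n) with a random pivot, O(n^2) worst case. The
    three-way partition (less/equal/greater) handles duplicate
    elements, as the question requires.
    """
    pivot = random.choice(arr)
    less = [x for x in arr if x < pivot]
    equal = [x for x in arr if x == pivot]
    if k < len(less):
        return quickselect(less, k)
    if k < len(less) + len(equal):
        return pivot
    greater = [x for x in arr if x > pivot]
    return quickselect(greater, k - len(less) - len(equal))

# Selecting A[N/4] from an unsorted array with duplicates:
data = [5, 1, 4, 1, 3, 9, 2, 6, 5, 3, 8, 7]
print(quickselect(data, len(data) // 4))
```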
If you are willing to do some preprocessing ahead of time, you can put all of the elements into an order statistic tree. From that point forward, you can look up the kth element for any k in time O(log n) worst-case. The preprocessing time required is O(n log n), though, so unless you are making repeated queries this is unlikely to be the best option.
Hope this helps!

How to test an algorithm for perfect optimization?

Is there any way to test an algorithm for perfect optimization?
There is no easy way to prove that any given algorithm is asymptotically optimal.
Proofs of optimality, when they exist at all, sometimes come years or decades after the algorithm was first written. A classic example is the Union-Find/disjoint-set data structure.
Disjoint-set forests are a data structure where each set is represented by a tree data structure, in which each node holds a reference to its parent node. They were first described by Bernard A. Galler and Michael J. Fischer in 1964, although their precise analysis took years.
[...] These two techniques complement each other; applied together, the amortized time per operation is only O(α(n)), where α(n) is the inverse of the function f(n) = A(n,n), and A is the extremely quickly-growing Ackermann function.
[...] In fact, this is asymptotically optimal: Fredman and Saks showed in 1989 that Ω(α(n)) words must be accessed by any disjoint-set data structure per operation on average.
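The two complementary techniques the quote refers to are union by rank and path compression. A standard sketch (my own class names, not from the quoted source):

```python
class DisjointSet:
    """Disjoint-set forest with union by rank and path compression.

    Together the two optimizations give O(alpha(n)) amortized time
    per operation, which Fredman and Saks showed is optimal.
    """
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression: point every node on the path at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # already in the same set
        # Union by rank: attach the shallower tree under the deeper.
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```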
For some algorithms optimality can be proven after very careful analysis, but generally speaking, there's no easy way to tell if an algorithm is optimal once it's written. In fact, it's not always easy to prove if the algorithm is even correct.
See also
Wikipedia/Matrix multiplication
The naive algorithm is O(N^3), Strassen's is roughly O(N^2.807), Coppersmith-Winograd is O(N^2.376), and we still don't know what is optimal.
Wikipedia/Asymptotically optimal
it is an open problem whether many of the most well-known algorithms today are asymptotically optimal or not. For example, there is an O(nα(n)) algorithm for finding minimum spanning trees. Whether this algorithm is asymptotically optimal is unknown, and would be likely to be hailed as a significant result if it were resolved either way.
Practical considerations
Note that sometimes asymptotically "worse" algorithms are better in practice due to many factors (e.g. ease of implementation, actually better performance for the given input parameter range, etc).
A typical example is quicksort with a simple pivot selection that may exhibit quadratic worst-case performance, but is still favored in many scenarios over a more complicated variant and/or other asymptotically optimal sorting algorithms.
For those among us mortals who merely want to know whether an algorithm:
reasonably works as expected;
is faster than the others;
there is an easy step called benchmarking.
Pick the best contenders in the area and compare them with your algorithm.
If your algorithm wins, then it better matches your needs (the ones defined by your benchmarks).
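A minimal benchmarking sketch in Python (the contenders, input sizes, and repeat count are arbitrary examples; taking the best of several runs reduces timing noise):

```python
import random
import timeit

def benchmark(fn, data, repeats=5):
    """Time fn on a fresh copy of data; keep the best of `repeats` runs."""
    return min(timeit.timeit(lambda: fn(list(data)), number=1)
               for _ in range(repeats))

def insertion_sort(a):
    # O(n^2) contender, included only as a deliberately weak baseline.
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

data = [random.random() for _ in range(1_000)]

# Compare contenders on identical input; lower time wins.
for name, fn in [("sorted (Timsort)", sorted), ("insertion_sort", insertion_sort)]:
    print(f"{name}: {benchmark(fn, data):.4f}s")
```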
