Big oh notation for heaps - algorithm

I am trying to understand big oh notations. Any help would be appreciated.
Say there is a program that creates a max heap and then pushes and removes items.
Say there are n items.
To create a heap, it takes O(n) to heapify if you have read the items into an array and then heapify them.
To push an item, it takes O(1), and to remove it, it takes O(1).
To heapify after that, it takes log n for each remove, and so n log n for n items.
So the big-oh notation is O(n + n log n).
OR, is it O(n log n) only because we choose the biggest one?

The complexity of heapifying a new element into the heap is O(log N), not O(1) (unless you use a Fibonacci heap, which does not seem to be the case here).
Also, O(N + N log N) is not written that way: since N log N grows faster than N, the whole expression is simply written as O(N log N).
EDIT: The big-oh notation only describes the asymptotic behavior of a function, that is - how fast it grows. As you get close to infinity, 2*f(x) and 11021392103*f(x) behave similarly, and that is why, when writing big-oh notation, we ignore any constant factor in front of the function.
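To make the counts concrete, here is a minimal sketch using Python's heapq module (a binary min-heap; for a max-heap you would push negated keys, but the asymptotics are identical):

    import heapq

    items = [5, 3, 8, 1, 9, 2]

    # Build a heap from an existing array: O(n) total via heapify.
    heap = list(items)
    heapq.heapify(heap)

    # Build a heap by pushing one element at a time: each push is
    # O(log n), so n pushes cost O(n log n) in total.
    heap2 = []
    for x in items:
        heapq.heappush(heap2, x)

    # Removing the top element is also O(log n), so draining the heap
    # (as heapsort does) costs O(n log n) overall.
    print([heapq.heappop(heap) for _ in range(len(items))])  # [1, 2, 3, 5, 8, 9]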

Formally speaking, O(N + N log N) is equivalent to O(N log N).
That said, it's assumed that there are coefficients buried in each of these, à la O(aN + bN log(cN)). If you have very large N values, these coefficients become unimportant and the algorithm is bounded only by its largest term, which in this case is N log N.
But it doesn't mean the coefficients are entirely unimportant. This is why in discussions of graph algorithms you'll often see authors say something like "the Floyd-Warshall algorithm runs in O(N^3) time, but has small coefficients".
If we could somehow write O(0.5N^3) in this case, we would. But it turns out that the coefficients vary depending on how you implement an algorithm and which computer you run it on. Thus, we settle for asymptotic comparisons, not necessarily because it is the best way, but because there isn't really a good alternative.
You'll also see things like "Worst-case: O(N^2), Average case: O(N)". This is an attempt to capture how the behavior of the algorithm varies with the input. Often times, presorted or random inputs can give you that average case, whereas an evil villain can construct inputs that produce the worst case.
Ultimately, what I am saying is this: O(N + N log N) = O(N log N). This is true, and it's the right answer for your homework. But we use this big-O notation to communicate, and in the fullness of time you may find situations where you feel that O(N + N log N) is more expressive, perhaps if your algorithm is generally used for small N. In this case, do not worry so much about the formalism - just be clear about what it is you are trying to convey with it.

Related

Are O(n log n) algorithms always better than all O(n^2) algorithms?

When trying to properly understand Big-O, I am wondering whether it's true that O(n log n) algorithms are always better than all O(n^2) algorithms.
Are there any particular situations where O(n^2) would be better?
I've read multiple times that in sorting, for example, an O(n^2) algorithm like bubble sort can be particularly quick when the data is almost sorted, so would it be quicker than an O(n log n) algorithm, such as merge sort, in this case?
No, O(n log n) algorithms are not always better than O(n^2) ones.
The Big-O notation describes an upper bound of the asymptotic behavior of an algorithm, i.e. for n that tends towards infinity.
In this definition you have to consider some aspects:
The Big-O notation is an upper bound of the algorithm's complexity, meaning that for some inputs (like the one you mentioned about sorting algorithms) an algorithm with a worse Big-O complexity may actually perform better (bubble sort runs in O(n) on an already-sorted array, while mergesort and quicksort always take at least O(n log n));
The Big-O notation only describes the class of complexity, hiding all the constant factors that in real-case scenarios may be relevant. For example, an algorithm with cost 1000000*n, which is in class O(n), performs worse than an algorithm with cost 0.5*n^2 (class O(n^2)) for inputs smaller than 2000000. Basically, the Big-O notation tells you that for big enough inputs n the O(n) algorithm will perform better than the O(n^2) one, but if you work with small inputs you may still prefer the latter.
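As a quick numeric check of that crossover, here is a tiny sketch (the two cost functions are the hypothetical ones from the example above, not real algorithms):

    # Hypothetical cost models: an O(n) algorithm with a huge constant
    # versus an O(n^2) algorithm with a small one.
    def cost_linear(n):
        return 1_000_000 * n        # class O(n)

    def cost_quadratic(n):
        return 0.5 * n * n          # class O(n^2)

    for n in (1_000, 100_000, 1_999_999, 2_000_001, 10_000_000):
        cheaper = "O(n)" if cost_linear(n) < cost_quadratic(n) else "O(n^2)"
        print(f"n = {n:>10}: cheaper is {cheaper}")
    # The O(n^2) algorithm wins below n = 2,000,000; beyond that, O(n) wins.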
O(n log n) is better than O(n^2) asymptotically.
Big-O, Big-Theta, Big-Omega all measure the asymptotic behavior of functions, i.e., how functions behave when their argument goes toward a certain limit.
O(n log n) functions grow more slowly than O(n^2) functions; that's what Big-O notation essentially says. However, this does not mean that O(n log n) is always faster. It merely means that at some point, the O(n log n) function will always be cheaper for an ever-rising value of n.
In the usual picture of f(n) = O(g(n)), note that there is a range where f(n) is actually more costly than g(n), even though it is bounded asymptotically by g(n). However, when talking limits, or asymptotics for that matter, f(n) outperforms g(n) "in the long run," so to say.
In addition to @cadaniluk's answer:
If you restrict the inputs to the algorithms to a very special type, this can also affect the running time. E.g., if you run sorting algorithms only on already-sorted lists, BubbleSort will run in linear time, but MergeSort will still need O(n log n).
There are also algorithms that have a bad worst-case complexity, but a good average case complexity. This means that there are bad input instances such that the algorithm is slow, but in total it is very unlikely that you have such a case.
Also, never forget that Big-O notation hides constants and additive functions of lower orders. So an algorithm with worst-case complexity O(n log n) could actually have a cost of 2^10000 * n * log n, and your O(n^2) algorithm could actually run in (1/2^1000) * n^2. So for n < 2^10000 you really want to use the "slower" algorithm.
Here is a practical example.
The GCC implementations of sorting functions have O(n log n) complexity. Still, they employ O(n^2) algorithms as soon as the size of the part being sorted is less than some small constant.
That's because for small sizes, they tend to be faster in practice.
See here for some of the internal implementation.
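This is not GCC's actual code, but a minimal Python sketch of the cutoff idea (the CUTOFF value is an illustrative guess; real libraries tune it empirically):

    CUTOFF = 16  # illustrative threshold, not GCC's actual value

    def insertion_sort(a, lo, hi):
        # O(n^2) worst case, but very fast on tiny ranges: low overhead,
        # good cache behavior, and O(n) on nearly-sorted input.
        for i in range(lo + 1, hi + 1):
            key, j = a[i], i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    def hybrid_quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if hi - lo + 1 <= CUTOFF:
            insertion_sort(a, lo, hi)   # switch to the O(n^2) algorithm
            return
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:                    # Hoare-style partition
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i, j = i + 1, j - 1
        hybrid_quicksort(a, lo, j)
        hybrid_quicksort(a, i, hi)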

How did CLRS book conclude BUILD_MAX_HEAP is not asymptotically tight with O(n log n)?

CLRS book, page 157, 3rd edition. My question is not about why the complexity of BUILD_MAX_HEAP is O(n), nor about the proof, but about the systematic approach they used to conclude that O(n log n) is not asymptotically tight. Logically it is right to say O(n log n).
The point I'm stuck on is how they got the intuition that there is a much tighter upper bound. If it weren't for the CLRS explanation, would we ever know?
The core of my question is about the intuition needed to become aware that there is a much tighter bound.
Are we supposed to hunt for tighter and tighter bounds for every algorithm we find? I mean, the logical conclusion for a layman is O(n log n). How can he go further from here? Have I missed some basics in between?
how did they get the intuition that there is a much tighter upper bound?
I don't know how they specifically got the intuition, but here is one way of seeing it (possibly with the benefit of hindsight).
Build-Heap's complexity, at least intuitively, seems similar to
T(n) = T(n / 2) + Θ(n)
whose solution is T(n) = Θ(n).
(Note that what follows is not a proof, but only intuition!)
Say you have a binary heap with n / 2 elements, and whose last row is full, and you double the number of elements. This will basically add another row.
How does doubling the number of elements (in this case, at least), change the time of Build-Heap?
Because of the penultimate row, there are Θ(n) more calls to Heapify, but each one obviously does O(1) work.
For the rows above the penultimate row, the Θ(n) calls to Heapify basically operate as before, but each one has another O(1) work at the end (the path is longer by 1).
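For reference, the standard summation (essentially CLRS's own argument) turns this intuition into the bound: a node at height h costs O(h) to heapify, and there are at most ceil(n / 2^(h+1)) nodes at height h, so, in LaTeX:

    \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
        = O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^h} \right) = O(2n) = O(n)

since the series \sum_{h \ge 0} h/2^h converges to 2.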
Are we supposed to hunt for tighter and tighter bounds for every algorithm we find?
To some extent, yes. Finding tighter bounds for algorithms always generates interest. Note that for Heapsort, there was even interest in proving that one variation used 2 n log(n) (1 - o(1)) operations, whereas another one used n log(n) (1 + o(1)) operations, even though both are Θ(n log(n)).
I mean, the logical conclusion for a layman is O(n log n).
Laymen probably mostly use algorithms which have been extensively studied by experts.
How can he go further from here?
There's probably no algorithm for finding bounds on algorithms. One possible approach would be to plot the running time for increasing values of n and see how it grows. If it looks linear, that's not a proof, but it is an indication of what to look for.
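Here is a minimal sketch of that plotting idea in Python, using heapq.heapify as the build-heap (exact numbers will vary by machine):

    import heapq, random, time

    for n in (10_000, 100_000, 1_000_000, 10_000_000):
        data = [random.random() for _ in range(n)]
        t0 = time.perf_counter()
        heapq.heapify(data)              # build-heap over n elements
        elapsed = time.perf_counter() - t0
        # If build-heap is Theta(n), elapsed/n should stay roughly flat;
        # if it were Theta(n log n), elapsed/n would keep growing.
        print(f"n = {n:>10}  time/n = {elapsed / n:.3e} s")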

Is n or nlog(n) better than constant or logarithmic time?

In the Princeton tutorial on Coursera the lecturer explains the common order-of-growth functions that are encountered. He says that linear and linearithmic running times are "what we strive" for and his reasoning was that as the input size increases so too does the running time. I think this is where he made a mistake because I have previously heard him refer to a linear order-of-growth as unsatisfactory for an efficient algorithm.
While he was speaking he also showed a chart that plotted the different running times - constant and logarithmic running times looked to be more efficient. So was this a mistake or is this true?
It would be a mistake to read that as saying O(n) and O(n log n) functions have better complexity than O(1) and O(log n) functions. Looking at typical cases of complexity in big-O notation:
O(1) < O(log n) < O(n) < O(n log n) < O(n^2)
Notice that this doesn't necessarily mean that they will always be better performance-wise - we could have an O(1) function that takes a long time to execute even though its complexity is unaffected by element count. Such a function would look better in big O notation than an O(log n) function, but could actually perform worse in practice.
Generally speaking: a function with lower complexity (in big O notation) will outperform a function with greater complexity (in big O notation) when n is sufficiently high.
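To make that concrete, here is a toy sketch; both functions are hypothetical:

    def constant_but_slow(n):
        # O(1): a large but fixed amount of work, independent of n.
        total = 0
        for _ in range(10_000_000):
            total += 1
        return total

    def logarithmic_and_fast(n):
        # O(log n): the work grows with n, but only by ~log2(n) steps.
        steps = 0
        while n > 1:
            n //= 2
            steps += 1
        return steps

    # Even for n = 10**18, the O(log n) function takes only ~60 steps,
    # so it beats the "better-looking" O(1) function at every realistic n.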
You're missing the broader context in which those statements must have been made. Different kinds of problems have different demands, and often even have theoretical lower bounds on how much work is absolutely necessary to solve them, no matter the means.
For operations like sorting or scanning every element of a simple collection, there is a hard lower bound of the number of elements in the collection, because the output depends on every element of the input. [1] Thus, O(n) or O(n*log(n)) are the best one can do.
For other kinds of operations, like accessing a single element of a hash table or linked list, or searching in a sorted set, the algorithm needn't examine all of the input. In those settings, an O(n) operation would be dreadfully slow.
[1] Others will note that sorting by comparisons also has an n*log(n) lower bound, from information-theoretic arguments. There are non-comparison based sorting algorithms that can beat this, for some types of input.
Generally speaking, what we strive for is the best we can manage to do. But depending on what we're doing, that might be O(1), O(log log N), O(log N), O(N), O(N log N), O(N^2), O(N^3), or (for certain algorithms) perhaps O(N!) or even O(2^N).
Just for example, when you're dealing with searching in a sorted collection, binary search borders on trivial and gives O(log N) complexity. If the distribution of items in the collection is reasonably predictable, we can typically do even better--around O(log log N). Knowing that, an algorithm that was O(N) or O(N^2) (for a couple of obvious examples) would probably be pretty disappointing.
On the other hand, sorting is generally quite a bit higher complexity--the "good" algorithms manage O(N log N), and the poorer ones are typically around O(N^2). Therefore, for sorting, an O(N) algorithm is actually very good (and in fact only possible for rather constrained types of inputs), and we can pretty much count on the fact that something like O(log log N) simply isn't possible.
Going even further, we'd be happy to manage a matrix multiplication in only O(N^2) instead of the usual O(N^3). We'd be ecstatic to get optimal, reproducible answers to the traveling salesman problem or the subset-sum problem in only O(N^3), given that optimal solutions to these normally require O(N!).
Algorithms with sublinear behavior like O(1) or O(Log(N)) are special in that they do not require looking at all the elements. In a way this is a fallacy, because if there really are N elements, it will take O(N) just to read or compute them.
Sublinear algorithms are often possible after some preprocessing has been performed. Think of binary search in a sorted table, taking O(Log(N)). If the data is initially unsorted, it will cost O(N Log(N)) to sort it first. The cost of sorting can be amortized if you perform many searches, say K, on the same data set. Indeed, without the sort, the cost of the searches will be O(K N), and with pre-sorting O(N Log(N) + K Log(N)). You win if K >> Log(N).
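A sketch of that trade-off with Python's bisect module (the sizes are illustrative):

    import bisect, random

    data = [random.randrange(10**9) for _ in range(1_000_000)]   # N items
    queries = [random.randrange(10**9) for _ in range(1_000)]    # K searches

    # No preprocessing: K linear scans, O(K*N) total.
    hits_linear = sum(q in data for q in queries)    # each 'in' is O(N)

    # With preprocessing: one O(N log N) sort, then K O(log N) searches,
    # O(N log N + K log N) total -- a win once K >> log N.
    data.sort()
    def contains(a, q):
        i = bisect.bisect_left(a, q)
        return i < len(a) and a[i] == q
    hits_binary = sum(contains(data, q) for q in queries)

    assert hits_linear == hits_binary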
This said, when no preprocessing is allowed, O(N) behavior is ideal, and O(N Log(N)) is quite comfortable as well (for a million elements, Lg(N) is only 20). You start screaming with O(N²) and worse.
He said those algorithms are what we strive for, which is generally true. Many algorithms cannot possibly be improved beyond logarithmic or linear time, and while constant time would be better in a perfect world, it's often unattainable.
Constant time is always better because the time (or space) complexity doesn't depend on the problem size... isn't it a great feature? :-)
Then we have O(N), and then O(N log N).
Did you know? Problems with constant time complexity exist!
e.g.
Let A[N] be an array of N integer values, with N > 3. Find an algorithm that tells whether the sum of the first three elements is positive or negative.
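That exercise takes only a few lines, and the running time clearly does not depend on N at all:

    def first_three_sum_sign(A):
        # O(1): inspects exactly three elements, however long A is.
        s = A[0] + A[1] + A[2]
        return "positive" if s > 0 else "negative" if s < 0 else "zero"

    print(first_three_sum_sign([4, -1, 2, 999, -500]))  # positive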
What we strive for is efficiency, in the sense of designing algorithms with a time (or space) complexity that does not exceed their theoretical lower bound.
For instance, using comparison-based algorithms, you can't find a value in a sorted array faster than Omega(Log(N)), and you cannot sort an array faster than Omega(N Log(N)) - in the worst case.
Thus, binary search O(Log(N)) and Heapsort O(N Log(N)) are efficient algorithms, while linear search O(N) and Bubblesort O(N²) are not.
The lower bound depends on the problem to be solved, not on the algorithm.
Yes, constant time, i.e. O(1), is better than linear time O(n) because the former does not depend on the input size of the problem. The order, from best to worst, is O(1), O(log n), O(n), O(n log n).
We strive for linear or linearithmic time because going for O(1) might not be realistic: every sorting algorithm needs at least some comparisons, which the professor shows with his decision-tree comparison analysis, where he sorts three elements a, b, c and derives a lower bound of n log n. Check his "Complexity of Sorting" segment in the Mergesort lecture.

What does 'log' represent in asymptotic notation?

I understand the principles of asymptotic notation, and I get what it means when something is O(1) or O(n^2), for example. But what does O(log n) mean? Or O(n log n), for example?
Log is short for "logarithm": http://en.wikipedia.org/wiki/Logarithm
Logarithms tell us for example how many digits are needed to represent a number, or how many levels a balanced tree has when you add N elements to it.
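Both facts are easy to check in Python:

    import math

    n = 1_000_000

    # Digits needed to write n in base 10: floor(log10(n)) + 1.
    print(math.floor(math.log10(n)) + 1)    # 7 digits

    # Levels in a balanced binary tree holding n elements: about log2(n).
    print(math.ceil(math.log2(n + 1)))      # 20 levels for a million items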
Check: en.wikipedia.org/wiki/Big_O_notation
Remember that log grows more slowly than an exponential function. So, if you have an algorithm that is n^2 and another that does the same thing with a logarithmic function, the latter will generally be more efficient (in general terms, not always!).
To evaluate the complexity of a function (or algorithm), you must mainly take into consideration its execution time and space. You can evaluate a function or algorithm with other parameters, but, initially, those two will do.
EDIT:
http://en.wikibooks.org/wiki/Data_Structures/Asymptotic_Notation
Also, check the sorting algorithms; they will give great insight into complexity.
log is a mathematical function. It is the inverse of exponentiation - log (base 2) of 2^n is n. In practice, it grows more slowly than n^c for any positive c (including fractional c such as 1/2, i.e. the square root). Check Wikipedia for more info.

Trying to understand Big-oh notation

Hi, I would really appreciate some help with Big-O notation. I have an exam on it tomorrow, and while I can define what it means for f(x) to be O(g(x)), I can't say I thoroughly understand it.
The following question ALWAYS comes up on the exam and I really need to figure it out. The first part seems easy (I think): do you just pick a value for n, compute them all on a calculator, and put them in order? That seems too easy, though, so I'm not sure. I'm finding it very hard to find examples online.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2^k?
That guy should help you.
From lowest to highest, what is the correct order of the complexities O(n^2), O(log2 n), O(1), O(2n), O(n!), O(n log2 n)?
The order is the same as if you compared their limits at infinity: take lim(a/b); if it is a finite nonzero constant, they are of the same order; infinity or 0 means one of them grows faster.
What is the worst-case computational complexity of the Binary Search algorithm on an ordered list of length n = 2^k?
Find binary search best/worst Big-O.
Find linked list access by index best/worst Big-O.
Make conclusions.
Hey there. Big-O notation is tough to figure out if you don't really understand what the "n" means. You've already seen people talking about how O(n) == O(2n), so I'll try to explain exactly why that is.
When we describe an algorithm as having "order-n space complexity", we mean that the size of the storage space used by the algorithm gets larger with a linear relationship to the size of the problem that it's working on (referred to as n.) If we have an algorithm that, say, sorted an array, and in order to do that sort operation the largest thing we did in memory was to create an exact copy of that array, we'd say that had "order-n space complexity" because as the size of the array (call it n elements) got larger, the algorithm would take up more space in order to match the input of the array. Hence, the algorithm uses "O(n)" space in memory.
Why does O(2n) = O(n)? Because when we talk in terms of O(n), we're only concerned with the behavior of the algorithm as n gets as large as it could possibly be. If n was to become infinite, the O(2n) algorithm would take up two times infinity spaces of memory, and the O(n) algorithm would take up one times infinity spaces of memory. Since two times infinity is just infinity, both algorithms are considered to take up a similar-enough amount of room to be both called O(n) algorithms.
You're probably thinking to yourself "An algorithm that takes up twice as much space as another algorithm is still relatively inefficient. Why are they referred to using the same notation when one is much more efficient?" Because the gain in efficiency for arbitrarily large n when going from O(2n) to O(n) is absolutely dwarfed by the gain in efficiency for arbitrarily large n when going from O(n^2) to O(500n). When n is 10, n^2 is 10 times 10 or 100, and 500n is 500 times 10, or 5000. But we're interested in n as n becomes as large as possible. They cross over and become equal for an n of 500, but once more, we're not even interested in an n as small as 500. When n is 1000, n^2 is one MILLION while 500n is a "mere" half million. When n is one million, n^2 is one thousand billion - 1,000,000,000,000 - while 500n looks on in awe with the simplicity of its five-hundred-million - 500,000,000 - points of complexity. And once more, we can keep making n larger, because when using O(n) logic, we're only concerned with the largest possible n.
(You may argue that when n reaches infinity, n^2 is infinity times infinity, while 500n is five hundred times infinity, and didn't I just say that anything times infinity is infinity? That reasoning doesn't actually settle it. The cleaner way to see it is the ratio: n^2 / 500n = n/500, which itself grows without bound, so n^2 eventually dominates any constant multiple of n.)
This gives us the weirdly counterintuitive result where O(Seventy-five hundred billion spillion kajillion n) is considered an improvement on O(n * log n). Due to the fact that we're working with arbitrarily large "n", all that matters is how many times and where n appears in the O(). The rules of thumb mentioned in Julia Hayward's post will help you out, but here's some additional information to give you a hand.
One, because n gets as big as possible, O(n^2 + 61n + 1682) = O(n^2): the n^2 contributes so much more than the 61n as n gets arbitrarily large that the 61n is simply ignored, and the 61n term in turn dominates the 1682 term. If you see addition inside an O(), only concern yourself with the term of the highest degree.
Two, O(log10 n) = O(log_b n) for any base b, because log10(x) = log_b(x) / log_b(10). Hence, O(log10 n) = O(log_b(n) * 1/log_b(10)), and that 1/log_b(10) factor is a constant, which we've already shown drops out of big-O notation.
Very loosely, you could imagine picking extremely large values of n, and calculating them. Might exceed your calculator's range for large factorials, though.
If the definition isn't clear, a more intuitive description is that "higher order" means "grows faster than, as n grows". Some rules of thumb:
O(n^a) is a higher order than O(n^b) if a > b.
log(n) grows more slowly than any positive power of n
exp(n) grows more quickly than any power of n
n! grows more quickly than exp(kn)
Oh, and as far as complexity goes, ignore the constant multipliers.
That's enough to deduce that the correct order is O(1), O(log n), O(2n) = O(n), O(n log n), O(n^2), O(n!)
For big-O complexities, the rule is that if two things vary only by constant factors, then they are the same. If one grows faster than another ignoring constant factors, then it is bigger.
So O(2n) and O(n) are the same -- they only vary by a constant factor (2). One way to think about it is to just drop the constants, since they don't impact the complexity.
The other problem with picking n and using a calculator is that it will give you the wrong answer for certain n. Big O is a measure of how fast something grows as n increases, but at any given n the complexities might not be in the right order. For instance, at n=2, n^2 is 4 and n! is 2, but n! grows quite a bit faster than n^2.
It's important to get that right, because for running times with multiple terms, you can drop the lesser terms -- i.e., if f(n) is 3n^2 + 2n + 5, you can drop the 5 (a constant), drop the 2n (3n^2 grows faster), then drop the 3 (a constant factor) to get O(n^2)... but if you don't know that n^2 is bigger, you won't get the right answer.
In practice, you can just know that n is linear; log(n) grows more slowly than linear; n^a grows faster than n^b if a > b; 2^n grows faster than any n^a; and n! is even faster than that. (Hint: try to avoid algorithms that have n in the exponent, and especially avoid ones that are n!.)
For the second part of your question, what happens with a binary search in the worst case? At each step, you cut the space in half until eventually you find your item (or run out of places to look). That is log2(2^k) = k steps. A search where you just walk through the list to find your item would take n steps. And we know from the first part that O(log(n)) < O(n), which is why binary search is faster than a simple linear search.
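Here is a quick sketch that counts the probes, so you can check the log2 claim yourself (the step counter is just for illustration):

    def binary_search(a, target):
        lo, hi, steps = 0, len(a) - 1, 0
        while lo <= hi:
            steps += 1
            mid = (lo + hi) // 2
            if a[mid] == target:
                return mid, steps
            if a[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1, steps

    a = list(range(2 ** 10))        # n = 2^k with k = 10
    print(binary_search(a, -1))     # (-1, 10): about k = log2(n) probes
                                    # in the worst case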
Good luck with the exam!
In easy to understand terms the Big-O notation defines how quickly a particular function grows. Although it has its roots in pure mathematics its most popular application is the analysis of algorithms which can be analyzed on the basis of input size to determine the approximate number of operations that must be performed.
The benefit of using the notation is that you can categorize function growth rates by their complexity. Many different functions (an infinite number really) could all be expressed with the same complexity using this notation. For example, n+5, 2*n, and 4*n + 1/n all have O(n) complexity because the function g(n)=n most simply represents how these functions grow.
I put an emphasis on most simply because the focus of the notation is on the dominating term of the function. For example, O(2*n + 5) = O(2*n) = O(n), because n is the dominating term in the growth. This is because the notation assumes that n goes to infinity, which causes the remaining terms to play less of a role in the growth rate. And, by convention, any constants or multiplicative factors are omitted.
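You can watch the dominating term take over numerically (a tiny illustrative check):

    f = lambda n: 2 * n + 5    # a function in O(n)

    for n in (10, 1_000, 100_000):
        # The ratio f(n)/n settles toward the constant 2, and the +5
        # becomes irrelevant -- exactly what O(n) captures.
        print(n, f(n) / n)     # 2.5, then 2.005, then 2.00005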
Read Big O notation and Time complexity for a more in-depth overview.
