What is big-O notation? How do you come up with figures like O(n)? [duplicate] - big-o

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Plain english explanation of Big O
I'd imagine this is probably something taught in classes, but as I am a self-taught programmer, I've only seen it rarely.
I've gathered it is something to do with the time, and O(1) is the best, while stuff like O(n^n) is very bad, but could someone point me to a basic explanation of what it actually represents, and where these numbers come from?

Big O refers to the worst case run-time order. It is used to show how well an algorithm scales based on the size of the data set (n->number of items).
Since we are only concerned with the order, constant multipliers are ignored, and any terms which increase less quickly than the dominant term are also removed. Some examples:
A single operation or set of operations is O(1), since it takes some constant time (does not vary based on data set size).
A loop is O(n). Each element in the data set is looped over.
A nested loop is O(n^2). A nested nested loop is O(n^3), and onward.
Things like binary tree searching are O(log n), which is more difficult to show, but at every level in the tree, the possible number of solutions is halved, so the number of levels is log(n) (provided the tree is balanced).
Something like finding the subset of a set of numbers whose sum is closest to a given value is O(2^n) by brute force, since the sum of each of the 2^n subsets needs to be calculated. This is very bad.
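The first few examples above can be sketched by counting steps directly. This is only an illustration: the function names are made up for this sketch, and "one step" is simply taken to mean one pass through the innermost statement.

```python
def constant_steps(data):
    # A single operation, no matter how large data is: O(1).
    return 1


def single_loop_steps(data):
    # One step per element: O(n).
    steps = 0
    for _ in data:
        steps += 1
    return steps


def nested_loop_steps(data):
    # One step per pair of elements: O(n^2).
    steps = 0
    for _ in data:
        for _ in data:
            steps += 1
    return steps


def halving_steps(n):
    # Halve the number of remaining candidates until one is left,
    # as a search in a balanced binary tree does: O(log n).
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


for n in (8, 16, 32):
    data = list(range(n))
    print(n, constant_steps(data), single_loop_steps(data),
          nested_loop_steps(data), halving_steps(n))
```

Doubling n leaves the first count unchanged, doubles the second, quadruples the third, and adds one to the last.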

It's a way of expressing time complexity.
O(n) means that for n elements in a list, it takes on the order of n computations to process the list, which isn't bad at all. Each increase in n increases the running time linearly.
O(n^n) is bad, because the amount of computation required to perform a sort (or whatever you are doing) grows faster than exponentially as you increase n.
O(1) is the best, as it means a constant number of computations to perform a function regardless of n. Think of hash tables: looking up a value in a hash table has O(1) time complexity on average.

Big O notation as applied to an algorithm refers to how the run time of the algorithm depends on the amount of input data. For example, a sorting algorithm will take longer to sort a large data set than a small data set. If for the sorting algorithm example you graph the run time (vertical-axis) vs the number of values to sort (horizontal-axis), for numbers of values from zero to a large number, the nature of the line or curve that results will depend on the sorting algorithm used. Big O notation is a shorthand method for describing the line or curve.
In big O notation, the expression in the brackets is the function that is graphed. If a variable (say n) is included in the expression, this variable refers to the size of the input data set. You say O(1) is the best. This is true because the graph f(n) = 1 does not vary with n. An O(1) algorithm takes the same amount of time to complete regardless of the size of the input data set. By contrast, the run time of an O(n^2) algorithm increases with the square of the size of the input data set, and the run time of an O(n^n) algorithm grows far faster still.
That is the basic idea; for a detailed explanation, consult the Wikipedia page titled 'Big O Notation'.


Do problem constraints change the time complexity of algorithms?

Let's say that the algorithm involves iterating through a string character by character.
If I know for sure that the length of the string is less than, say, 15 characters, will the time complexity be O(1) or will it remain as O(n)?
There are two aspects to this question - the core of the question is, can problem constraints change the asymptotic complexity of an algorithm? The answer to that is yes. But then you give an example of a constraint (strings limited to 15 characters) where the answer is: the question doesn't make sense. A lot of the other answers here are misleading because they address only the second aspect but try to reach a conclusion about the first one.
Formally, the asymptotic complexity of an algorithm is measured by considering a set of inputs where the input sizes (i.e. what we call n) are unbounded. The reason n must be unbounded is because the definition of asymptotic complexity is a statement like "there is some n0 such that for all n ≥ n0, ...", so if the set doesn't contain any inputs of size n ≥ n0 then this statement is vacuous.
Since algorithms can have different running times depending on which inputs of each size we consider, we often distinguish between "average", "worst case" and "best case" time complexity. Take for example insertion sort:
In the average case, insertion sort has to compare the current element with half of the elements in the sorted portion of the array, so the algorithm does about n^2/4 comparisons.
In the worst case, when the array is in descending order, insertion sort has to compare the current element with every element in the sorted portion (because it's less than all of them), so the algorithm does about n^2/2 comparisons.
In the best case, when the array is in ascending order, insertion sort only has to compare the current element with the largest element in the sorted portion, so the algorithm does about n comparisons.
However, now suppose we add the constraint that the input array is always in ascending order except for its smallest element:
Now the average case does about 3n/2 comparisons,
The worst case does about 2n comparisons,
And the best case does about n comparisons.
Note that it's the same algorithm, insertion sort, but because we're considering a different set of inputs where the algorithm has different performance characteristics, we end up with a different time complexity for the average case because we're taking an average over a different set, and similarly we get a different time complexity for the worst case because we're choosing the worst inputs from a different set. Hence, yes, adding a problem constraint can change the time complexity even if the algorithm itself is not changed.
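The comparison counts above can be checked with a small sketch. The function name `insertion_sort_comparisons` is invented for this illustration, and it counts only element-to-element comparisons, which is one reasonable reading of "comparisons" here.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values with insertion sort; return the comparison count."""
    a = list(values)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Walk left through the sorted portion until key's position is found.
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return comparisons


n = 100
descending = list(range(n, 0, -1))            # unconstrained worst case
ascending = list(range(n))                    # best case
smallest_last = list(range(1, n)) + [0]       # constrained worst case

print(insertion_sort_comparisons(descending))     # n(n-1)/2
print(insertion_sort_comparisons(ascending))      # n - 1
print(insertion_sort_comparisons(smallest_last))  # 2n - 3
```

The algorithm is identical in all three calls; only the set of inputs being considered changes, and with it the worst-case count drops from quadratic to roughly 2n.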
However, now let's consider your example of an algorithm which iterates over each character in a string, with the added constraint that the string's length is at most 15 characters. Here, it does not make sense to talk about the asymptotic complexity, because the input sizes n in your set are not unbounded. This particular set of inputs is not valid for doing such an analysis with.
In the mathematical sense, yes. Big-O notation describes the behavior of an algorithm in the limit, and if you have a fixed upper bound on the input size, that implies it has a maximum constant complexity.
That said, context is important. All computers have a realistic limit to the amount of input they can accept (a technical upper bound). Just because nothing in the world can store a yottabyte of data doesn't mean saying every algorithm is O(1) is useful! It's about applying the mathematics in a way that makes sense for the situation.
Here are two contexts for your example, one where it makes sense to call it O(1), and one where it does not.
"I decided I won't put strings of length more than 15 into my program, therefore it is O(1)". This is not a super useful interpretation of the runtime. The actual time is still strongly tied to the size of the string; a string of size 1 will run much faster than one of size 15 even if there is technically a constant bound. In other words, within the constraints of your problem there is still a strong correlation to n.
"My algorithm will process a list of n strings, each with maximum size 15". Here we have a different story; the runtime is dominated by having to run through the list! There's a point where n is so large that the time to process a single string doesn't change the correlation. Now it makes sense to consider the time to process a single string O(1), and therefore the time to process the whole list O(n).
That said, Big-O notation doesn't have to only use one variable! There are problems where upper bounds are intrinsic to the algorithm, but you wouldn't put a bound on the input arbitrarily. Instead, you can describe each dimension of your input as a different variable:
n = list length
s = maximum string length
=> O(n*s)
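As a rough sketch of the two-variable bound, assuming the work is "visit every character of every string" (the helper names below are hypothetical):

```python
def count_character_visits(strings):
    # Visit each character of each string once and count the visits.
    visits = 0
    for text in strings:
        for _ in text:
            visits += 1
    return visits


def n_times_s(strings):
    # n = list length, s = maximum string length.
    n = len(strings)
    s = max((len(text) for text in strings), default=0)
    return n * s


words = ["a", "abc", "ab"]
print(count_character_visits(words), n_times_s(words))
```

The actual number of visits never exceeds n*s, which is what the O(n*s) bound expresses. If s is capped at a constant such as 15, the bound collapses to O(n).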
It depends.
If your algorithm's requirements would grow if larger inputs were provided, then the algorithmic complexity can (and should) be evaluated independently of the inputs. So iterating over all the elements of a list, array, string, etc., is O(n) in relation to the length of the input.
If your algorithm is tied to the limited input size, then that fact becomes part of your algorithmic complexity. For example, maybe your algorithm only iterates over the first 15 characters of the input string, regardless of how long it is. Or maybe your business case simply indicates that a larger input would be an indication of a bug in the calling code, so you opt to immediately exit with an error whenever the input size is larger than a fixed number. In those cases, the algorithm will have constant requirements as the input length tends toward very large numbers.
From Wikipedia
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity.
...
In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows.
In practice, almost all inputs have limits: you cannot input a number larger than what's representable by the numeric type, or a string that's larger than the available memory space. So it would be silly to say that any limits change an algorithm's asymptotic complexity. You could, in theory, use 15 as your asymptote (or "particular value"), and therefore use Big-O notation to define how an algorithm grows as the input approaches that size. There are some algorithms with such terrible complexity (or some execution environments with limited-enough resources) that this would be meaningful.
But if your argument (string length) does not tend toward a large enough value for some aspect of your algorithm's complexity to define the growth of its resource requirements, it's arguably not appropriate to use asymptotic notation at all.
NO!
The time complexity of an algorithm is independent of program constraints. Here is (a simple) way of thinking about it:
Say your algorithm iterates over the string and appends all consonants to a list.
Now, the time complexity of the iteration is O(n). This means that the time taken will increase roughly in proportion to the increase in the length of the string. (The time itself, though, would vary depending on the time taken by the if statement and on branch prediction.)
The fact that you know that the string is between 1 and 15 characters long will not change how the program runs, it merely tells you what to expect.
For example, knowing that your values are going to be less than 65000 you could store them in a 16-bit integer and not worry about Integer overflow.
Do problem constraints change the time complexity of algorithms?
No.
"If I know for sure that the length of the string is less than, say, 15 characters ..."
We already know the length of the string is less than SIZE_MAX. Knowing a fixed upper bound for the string length does not make the time complexity O(1).
Time complexity remains O(n).
Big-O measures the complexity of algorithms, not of code. It means Big-O does not know the physical limitations of computers. A Big-O measure today will be the same in 1 million years when computers, and programmers alike, have evolved beyond recognition.
So restrictions imposed by today's computers are irrelevant for Big-O. Even though any loop is finite in code, that need not be the case in algorithmic terms. The loop may be finite or infinite. It is up to the programmer/Big-O analyst to decide. Only s/he knows which algorithm the code intends to implement. If the number of loop iterations is finite, the loop has a Big-O complexity of O(1) because there is no asymptotic growth with N. If, on the other hand, the number of loop iterations is infinite, the Big-O complexity is O(N) because there is an asymptotic growth with N.
The above is straight from the definition of Big-O complexity. There are no ifs or buts. The way the OP describes the loop makes it O(1).
A fundamental requirement of big-O notation is that parameters do not have an upper limit. Suppose performing an operation on N elements takes a time precisely equal to 3E24*N*N*N / (1E24+N*N*N) microseconds. For small values of N, the execution time would be proportional to N^3, but as N gets larger the N^3 term in the denominator would start to play an increasing role in the computation.
If N is 1, the time would be 3 microseconds.
If N is 1E3, the time would be about 3E33/1E24, i.e. 3.0E9.
If N is 1E6, the time would be about 3E42/1E24, i.e. 3.0E18
If N is 1E7, the time would be 3E45/1.001E24, i.e. ~2.997E21
If N is 1E8, the time would be about 3E48/2E24, i.e. 1.5E24
If N is 1E9, the time would be 3E51/1.001E27, i.e. ~2.997E24
If N is 1E10, the time would be about 3E54/1.000001E30, i.e. 2.999997E24
As N gets bigger, the time would continue to grow, but no matter how big N gets the time would always be less than 3.000E24 microseconds. Thus, the time required for this algorithm would be O(1) because one could specify a constant k such that the time necessary to perform the computation with size N would be less than k.
For any practical value of N, the time required would be proportional to N^3, but from a big-O standpoint the worst-case time requirement is constant. The fact that the time changes rapidly in response to small values of N is irrelevant to the "big picture" behaviour, which is what big-O notation measures.
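The table above can be reproduced with a few lines. This is a sketch using ordinary floating-point arithmetic, so the printed values are approximate; `time_us` is a name chosen for this example.

```python
def time_us(n):
    # Hypothetical running time in microseconds, from the formula above.
    cube = n ** 3
    return 3e24 * cube / (1e24 + cube)


BOUND = 3e24  # the constant k that the time never reaches

for n in (1, 10**3, 10**6, 10**7, 10**8, 10**9, 10**10):
    t = time_us(n)
    print(f"N = {n:.0e}: {t:.6e} microseconds, below bound: {t < BOUND}")
```

For small N the value tracks 3*N^3 closely, and for large N it flattens out just under the bound, which is why the function is O(1) despite looking cubic over any practical range.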
It will be O(1) i.e. constant.
This is because for calculating time complexity or worst-case time complexity (to be precise), we think of the input as a huge chunk of data and the length of this data is assumed to be n.
Let us say, we do some maximum work C on each part of this input data, which we will consider as a constant.
In order to get the worst-case time complexity, we need to loop through each part of the input data i.e. we need to loop n times.
So, the time complexity will be:
n x C.
Since you fixed n to be less than 15 characters, n can also be assumed as a constant number.
Hence in this case:
n = constant and,
(maximum constant work done) = C = constant
So time complexity is n x C = constant x constant = constant i.e. O(1)
Edit
The reason why I have said n = constant and C = constant for this case, is because the time difference for doing calculations for smaller n will become so insignificant (compared to n being a very large number) for modern computers that we can assume it to be constant.
Otherwise, every function ever built will take some time, and we can't say things like:
lookup time is constant for hashmaps

Why big-Oh is not always a worst case analysis of an algorithm?

I am trying to learn analysis of algorithms, and I am stuck on the relation between asymptotic notation (big O, ...) and cases (best, worst and average).
I learned that Big O notation defines an upper bound of an algorithm, i.e. it defines a function that the algorithm's growth cannot exceed.
At first it sounded to me as if it calculates the worst case.
I googled "why is worst case not big O?" and got plenty of answers, which were not so simple for a beginner to understand.
I concluded it as follows:
Big O is not always used to represent the worst-case analysis of an algorithm because, given an algorithm which takes O(n) execution steps for best, average and worst input, its best, average and worst cases can all be expressed as O(n).
Please tell me if I am correct or I am missing something as I don't have anyone to validate my understanding.
Please suggest a better example to understand why Big O is not always worst case.
Big-O?
First let us see what Big O formally means:
In computer science, big O notation is used to classify algorithms
according to how their running time or space requirements grow as the
input size grows.
This means that, Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation. Here, O means order of the function, and it only provides an upper bound on the growth rate of the function.
Now let us look at the rules of Big O:
If f(x) is a sum of several terms, if there is one with largest
growth rate, it can be kept, and all others omitted
If f(x) is a product of several factors, any constants (terms in the
product that do not depend on x) can be omitted.
Example:
f(x) = 6x^4 − 2x^3 + 5
Using the 1st rule we can write it as, f(x) = 6x^4
Using the 2nd rule it will give us, O(x^4)
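One way to see why both rules are justified for this example is to check the definition directly. The constants below (M = 13 and x_0 = 1) are just one convenient choice, not the only one:

```latex
\begin{align*}
|f(x)| = |6x^4 - 2x^3 + 5|
  &\le 6x^4 + 2x^3 + 5 \\
  &\le 6x^4 + 2x^4 + 5x^4 && \text{for all } x \ge 1 \\
  &= 13x^4 .
\end{align*}
% Hence |f(x)| \le M x^4 for all x \ge x_0 with M = 13 and x_0 = 1,
% which is exactly the statement f(x) = O(x^4).
```

The lower-order terms and the constant factor 6 are absorbed into M, which is why they can be dropped.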
What is Worst Case?
Worst case analysis gives the maximum number of basic operations that
have to be performed during execution of the algorithm. It assumes
that the input is in the worst possible state and maximum work has to
be done to put things right.
For example, for a sorting algorithm which aims to sort an array in ascending order, the worst case occurs when the input array is in descending order. In this case the maximum number of basic operations (comparisons and assignments) have to be done to set the array in ascending order.
It depends on a lot of things like:
CPU (time) usage
memory usage
disk usage
network usage
What's the difference?
Big-O is often used to make statements about functions that measure the worst case behavior of an algorithm, but big-O notation doesn’t imply anything of the sort.
The important point here is we're talking in terms of growth, not number of operations. However, with algorithms we do talk about the number of operations relative to the input size.
Big-O is used for making statements about functions. The functions can measure time or space or cache misses or rabbits on an island or anything or nothing. Big-O notation doesn’t care.
In fact, when used for algorithms, big-O is almost never about time. It is about primitive operations.
When someone says that the time complexity of MergeSort is O(nlogn), they usually mean that the number of comparisons that MergeSort makes is O(nlogn). That in itself doesn’t tell us what the time complexity of any particular MergeSort might be because that would depend how much time it takes to make a comparison. In other words, the O(nlogn) refers to comparisons as the primitive operation.
The important point here is that when big-O is applied to algorithms, there is always an underlying model of computation. The claim that the time complexity of MergeSort is O(nlogn) is implicitly referencing a model of computation where a comparison takes constant time and everything else is free.
Example -
If we are sorting strings that are k bytes long, we might take “read a byte” as a primitive operation that takes constant time with everything else being free.
In this model, MergeSort makes O(nlogn) string comparisons each of which makes O(k) byte comparisons, so the time complexity is O(k⋅nlogn). One common implementation of RadixSort will make k passes over the n strings with each pass reading one byte, and so has time complexity O(nk).
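To make "comparisons as the primitive operation" concrete, here is a minimal sketch of a top-down merge sort that counts only comparisons. The one-element list used as a counter and the function name are choices made for this example, not part of any standard library.

```python
import math
import random


def merge_sort(items, counter):
    """Return a sorted copy of items; counter[0] accumulates comparisons."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        counter[0] += 1  # the only operation the model charges for
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


random.seed(0)
n = 1000
data = [random.randrange(10**6) for _ in range(n)]
counter = [0]
result = merge_sort(data, counter)
print(counter[0], n * math.ceil(math.log2(n)))
```

The comparison count stays below n * ceil(log2 n). If each comparison were itself a string comparison costing up to k byte reads, the total under the byte-read model would be bounded by k times that figure, matching the O(k⋅nlogn) claim above.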
The two are not the same thing.  Worst-case analysis, as others have said, is identifying instances for which the algorithm takes the longest to complete (i.e., takes the greatest number of steps), then formulating a growth function using this.  One can analyze the worst-case time complexity using Big-Oh, or even other variants such as Big-Omega and Big-Theta (in fact, Big-Theta is usually what you want, though often Big-Oh is used for ease of comprehension by those not as much into theory).  One important detail and why worst-case analysis is useful is that the algorithm will run no slower than it does in the worst case.  Worst-case analysis is a method of analysis we use in analyzing algorithms.
Big-Oh itself is an asymptotic measure of a growth function; this can be totally independent as people can use Big-Oh to not even measure an algorithm's time complexity; its origins stem from Number Theory.  You are correct to say it is the asymptotic upper bound of a growth function; but the manner you prescribe and construct the growth function comes from your analysis.  The Big-Oh of a growth function itself means little to nothing without context as it only says something about the function you are analyzing.  Keep in mind there can be infinitely many algorithms that could be constructed that share the same time complexity (by the definition of Big-Oh, Big-Oh is a set of growth functions).
In short, worst-case analysis is how you build your growth function, Big-Oh notation is one method of analyzing said growth function.  Then, we can compare that result against other worst-case time complexities of competing algorithms for a given problem.  Worst-case analysis if done correctly yields the worst-case running time if done exactly (you can cut a lot of corners and still get the correct asymptotics if you use a barometer), and using this growth function yields the worst-case time complexity of the algorithm.  Big-Oh alone doesn't guarantee the worst-case time complexity as you had to make the growth function itself.  For instance, I could utilize Big-Oh notation for any other kind of analysis (e.g., best case, average case).  It really depends on what you're trying to capture.  For instance, Big-Omega is great for lower bounds.
Imagine a hypothetical algorithm that in the best case only needs to do 1 step, in the worst case needs to do n^2 steps, but in the average (expected) case only needs to do n steps, with n being the input size.
For each of these 3 cases you could calculate a function that describes the time complexity of this algorithm.
1 Best case has O(1) because the function f(x)=1 is really the highest we can go, but also the lowest we can go in this case, omega(1). Since Omega is equal to O (the upper bound and lower bound), we state that this function, in the best case, behaves like theta(1).
2 We could do the same analysis for the worst case and figure out that O(n^2) = omega(n^2) = theta(n^2).
3 The same goes for the average case, but with theta(n).
So in theory you could determine 3 cases of an algorithm and for those 3 cases calculate the lower/upper/tight bounds. I hope this clears things up a bit.
https://www.google.co.in/amp/s/amp.reddit.com/r/learnprogramming/comments/3qtgsh/how_is_big_o_not_the_same_as_worst_case_or_big/
Big O notation shows how an algorithm grows with respect to input size. It says nothing of which algorithm is faster because it doesn't account for constant set up time (which can dominate if you have small input sizes). So when you say
which takes O(n) execution steps
this almost doesn't mean anything. Big O doesn't say how many execution steps there are. There are C + O(n) steps (where C is a constant) and this algorithm grows at rate n depending on input size.
Big O can be used for best, worst, or average cases. Let's take sorting as an example. Bubble sort is a naive O(n^2) sorting algorithm, but when the list is already sorted it takes O(n). Quicksort is often used for sorting (the GNU standard C library uses it with some modifications). It performs at O(n log n); however, this is only true if the pivot chosen splits the array into two equal-sized pieces (on average). In the worst case we get an empty array on one side of the pivot and Quicksort performs at O(n^2).
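The quicksort behaviour described above can be illustrated with a deliberately naive variant that always picks the first element as the pivot. That pivot rule is an assumption made for this sketch (real library implementations choose pivots more carefully), and the function only counts comparisons rather than returning the sorted list.

```python
import random


def quicksort_comparisons(items):
    """Count pivot comparisons made by a first-element-pivot quicksort."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    # Each remaining element is charged one comparison against the pivot.
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return (len(rest)
            + quicksort_comparisons(smaller)
            + quicksort_comparisons(larger))


n = 200
already_sorted = list(range(n))
shuffled = list(range(n))
random.seed(0)
random.shuffle(shuffled)

print(quicksort_comparisons(already_sorted))  # n(n-1)/2: one side always empty
print(quicksort_comparisons(shuffled))        # much smaller on a typical shuffle
```

With this pivot rule, an already sorted input leaves one side of every partition empty, which produces the quadratic count; a shuffled input typically splits more evenly.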
As Big O shows how an algorithm grows with respect to size, you can look at any aspect of an algorithm. Its best case, average case, worst case in both time and/or memory usage. And it tells you how these grow when the input size grows - but it doesn't say which is faster.
If you deal with small sizes then Big O won't matter - but an analysis can tell you how things will go when your input sizes increase.
One example of where the worst case might not be the asymptotic limit: suppose you have an algorithm that works on the set difference between some set and the input. It might run in O(N) time, but get faster as the input gets larger and knocks more values out of the working set.
Or, to get more abstract, f(x) = 1/x for x > 0 is a decreasing O(1) function.
I'll focus on time as a fairly common item of interest, but Big-O can also be used to evaluate resource requirements such as memory. It's essential for you to realize that Big-O tells how the runtime or resource requirements of a problem scale (asymptotically) as the problem size increases. It does not give you a prediction of the actual time required. Predicting the actual runtimes would require us to know the constants and lower order terms in the prediction formula, which are dependent on the hardware, operating system, language, compiler, etc. Using Big-O allows us to discuss algorithm behaviors while sidestepping all of those dependencies.
Let's talk about how to interpret Big-O scalability using a few examples. If a problem is O(1), it takes the same amount of time regardless of the problem size. That may be a nanosecond or a thousand seconds, but in the limit doubling or tripling the size of the problem does not change the time. If a problem is O(n), then doubling or tripling the problem size will (asymptotically) double or triple the amounts of time required, respectively. If a problem is O(n^2), then doubling or tripling the problem size will (asymptotically) take 4 or 9 times as long, respectively. And so on...
Lots of algorithms have different performance for their best, average, or worst cases. Sorting provides some fairly straightforward examples of how best, average, and worst case analyses may differ.
I'll assume that you know how insertion sort works. In the worst case, the list could be reverse ordered, in which case each pass has to move the value currently being considered as far to the left as possible, for all items. That yields O(n^2) behavior. Doubling the list size will take four times as long. More likely, the list of inputs is in randomized order. In that case, on average each item has to move half the distance towards the front of the list. That's less than in the worst case, but only by a constant. It's still O(n^2), so sorting a randomized list that's twice as large as our first randomized list will quadruple the amount of time required, on average. It will be faster than the worst case (due to the constants involved), but it scales in the same way. The best case, however, is when the list is already sorted. In that case, you check each item to see if it needs to be slid towards the front, and immediately find the answer is "no," so after checking each of the n values you're done in O(n) time. Consequently, using insertion sort for an already ordered list that is twice the size only takes twice as long rather than four times as long.
You are right, in that you can certainly say that an algorithm runs in O(f(n)) time in the best or average case. We do that all the time for, say, quicksort, which is O(N log N) on average, but only O(N^2) in the worst case.
Unless otherwise specified, however, when you say that an algorithm runs in O(f(n)) time, you are saying the algorithm runs in O(f(n)) time in the worst case. At least that's the way it should be. Sometimes people get sloppy, and you will often hear that a hash table is O(1) when in the worst case it is actually worse.
The other way in which a big O definition can fail to characterize the worst case is that it's an upper bound only. Any function in O(N) is also in O(N^2) and O(2^N), so we would be entirely correct to say that quicksort takes O(2^N) time. We just don't say that because it isn't useful to do so.
Big Omega and Big Theta are there to specify lower bounds and tight bounds respectively.
There are two "different" and most important tools:
the best, worst, and average-case complexities each generate a numerical function over the size of possible problem instances (e.g. f(x) = 2x^2 + 8x - 4), but it is very difficult to work precisely with these functions
big O notation extracts the main point, "how efficient the algorithm is"; it ignores a lot of unimportant things like constants and lower-order terms, and gives you the big picture

What is the overall O(n) time complexity of O(sum(a)) if a is an array of integers and n is the length of the array?

I’m having a hard time using O(n) principles to generalize the time complexity of an algorithm whose more specific time complexity is O(sum(a)) where a is an array of integers.
My intuition is that this time complexity should generalize to O(n), as you can think of this as a “linear” equation of k_i values that occur n times, where k_i is the i-th integer value in the array, making it O(n) (k_i = 1 for a straight-up O(n) case).
But it doesn’t seem to be exactly the same as O(n) - the value of k could be much larger than n, and if all these k values are larger you have something that could be O(n^2) or O(n^3) depending on how large that value is.
Is this something to take into account for O(n) complexity where n is the length of the array? Should I actually be defining n as the sum of all elements in the array instead of the length of the array?
In general, what would be the best way to think about this?
Fundamentally, we want to describe the runtime of an algorithm based on the input. The "runtime" is a vague term, that is often swept under the rug. For example, the "runtime" of a sorting algorithm or a hashtable operation is measured in number of comparisons, but using "runtime" to mean the number of basic operations (which are also usually only vaguely defined) is also possible.
There are two choices (or simplifications) often made when calculating runtime. The first is to ignore the actual input, and to use the size of the input (measured somehow) instead. This size is usually denoted n. The second is to use big-O notation to describe the worst case (or best case, or average, or amortized...).
Neither of these choices is always necessary, and sometimes, they won't make sense. To repeat, since this is the crux of the answer: describing runtimes in big-O of n is not the only way to describe runtimes and sometimes it makes no sense to do so.
For example, in the case of an algorithm that runs in O(sum(a)) time:
func f(a) {
    t = 0
    for x in a {
        for i = 1..x {
            t += 1
        }
    }
}
It's not useful to describe the runtime of this using the length of the input array a. It's not useful because the length of a doesn't say anything about the worst-case runtime.
Saying that t is incremented sum(a) times is a useful statement about the runtime of the program. It doesn't use big-O complexity notation.
And if you do want to express that in big-O notation, you can say that the runtime of this code is O(sum(a)). This blurs exactly what you're measuring in the runtime, because you can be including the cost of performing the statements other than incrementing t.
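A direct Python rendering of the pseudocode makes the point easy to check. The name `count_increments` is made up for this sketch, and the function returns t so the count can be inspected:

```python
def count_increments(a):
    # Same structure as f(a) above, but returning the counter.
    t = 0
    for x in a:
        for _ in range(x):
            t += 1
    return t


short_but_heavy = [1000]          # length 1, sum 1000
long_but_light = [1] * 10         # length 10, sum 10

print(count_increments(short_but_heavy))
print(count_increments(long_but_light))
```

The one-element array does a hundred times more work than the ten-element one, so the array length alone is a poor predictor of the running time, while sum(a) predicts it exactly.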
And going back to the example, you could (and if you were studying complexity classes, you probably would) say n is the size (in bits) of the input array. Then you could say something about the runtime (measured in basic operations): it's O(2^n), since the worst case input is an array with one element which takes the value 2^n-1 (*note).
*note: this ignores some technical details about how to encode an array using bits.

How can the worst case for an algorithm have different bounds?

I've been trying to figure this out all day. Some other threads address this, but I really don't understand the answers. There are also many answers that contradict one another.
I understand that an algorithm will never take longer than the upper bound and never be faster than the lower bound. However, I didn't know an upper bound existed for best case time and a lower bound existed for worst case time. This question really threw me in a loop. I can't wrap my head around this... a given run time can have a different upper and lower bound?
For example, if someone asked: "Show that the worst-case running time of some algorithm on a heap of size n is Big Omega(lg(n))". How do you possibly get a lower bound, any bound for that matter, when given a run time?
So, in summation, an algorithm's worst case upper bound can be different than its worst case lower bound? How can this be? Once given the case, don't bounds become irrelevant? I'm trying to study algorithms independently and I really need to wrap my head around this first.
The meat of my accepted answer to that question is a function whose running time oscillates between n^2 and n^3 depending on whether n is odd. The point that I was trying to make is that sometimes bounds of the form O(n^k) and Omega(n^k) aren't sufficiently descriptive, even though the worst case running time is a perfectly well defined function (which, like all functions, is its own best lower and upper bound). This happens with more natural functions like n log n, which is Omega(n^k) but not O(n^k) for k ≤ 1, and O(n^k) but not Omega(n^k) for k > 1 (and hence not Theta(n^k) regardless of how we choose a constant k).
Suppose you write a program like this to find the smallest prime factor of an integer:
function lpf(n):
    for i = 2 to n
        if n%i == 0 then return i
If you run the function on the number 10^11 + 3, it will take 10^11 + 2 steps. If you run it on the number 10^11 + 4 it will take just one step. So the function's best-case time is O(1) steps and its worst-case time is O(n) steps.
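To make the best-case and worst-case gap concrete, here is a small Python sketch of the same function that also counts loop iterations. The name lpf_with_steps is invented for this example, and the inputs are scaled down to 101 (a prime) and 104 so that it runs instantly.

```python
def lpf_with_steps(n):
    """Return (smallest factor of n that is >= 2, loop iterations used)."""
    steps = 0
    for i in range(2, n + 1):
        steps += 1
        if n % i == 0:
            return i, steps
    return None, steps  # only reached when n < 2


print(lpf_with_steps(101))  # (101, 100): a prime forces the full loop
print(lpf_with_steps(104))  # (2, 1): an even number stops immediately
```

Two inputs of almost the same size sit at opposite ends of the running-time range, which is the reason best case and worst case are described separately.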
Big O notation describes efficiency in terms of runtime iterations, generally based on the size of an input data set.
The notation is written in its simplest form, ignoring constant multipliers and lower-order additive terms, but keeping the dominant term. If you have an operation of O(1) it is executed in constant time, no matter the input data.
However if you have something such as O(N) or O(log(N)), they will execute at different rates depending on input data.
The high and low bounds describe the largest and least iterations, respectively, that an algorithm can take.
Example: O(N), high bound is largest input data and low bound is smallest.
Extra sources:
Big O Cheat Sheet and MIT Lecture Notes
UPDATE:
Looking at the Stack Overflow question mentioned above, that algorithm is broken into three parts, where it has 3 possible types of runtime, depending on data. Now really, this is three different algorithms designed to handle different data values. An algorithm is generally classified with just one notation of efficiency: the tightest bound that holds for ALL possible values of N.
In the case of O(N^2), the running time grows quadratically, so larger data sets take much longer while small ones finish quickly. The algorithm determines how quickly a data set is processed, yet bounds are given depending on the range of data the algorithm is designed to handle.
I will try to explain it in the quicksort algorithm.
In quicksort you have an array and choose an element as pivot. The next step is to partition the input array into two arrays. The first one will contain elements < pivot and the second one elements > pivot.
Now assume you apply quicksort to an already sorted list and the pivot element is always the last element of the array. The partition then yields an array of size n-1 and an array of size 1 (the pivot element). This results in a runtime of O(n*n). Now assume instead that the pivot element always splits the array into two equal-sized arrays. In every step the array size is cut in half, which results in O(n log n). I hope this example makes this a bit clearer for you.
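The two pivot scenarios can be compared with a small Python sketch. It is a simplified, list-building quicksort rather than an in-place one; the helper names last and middle are invented here, and it counts one comparison per non-pivot element in each partition step.

```python
def quicksort(items, pick_pivot_index):
    """Sort items; return (sorted_list, number_of_partition_comparisons)."""
    if len(items) <= 1:
        return list(items), 0
    p = pick_pivot_index(items)
    pivot = items[p]
    rest = items[:p] + items[p + 1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    left, c_left = quicksort(smaller, pick_pivot_index)
    right, c_right = quicksort(larger, pick_pivot_index)
    # Each non-pivot element is compared against the pivot once per step.
    return left + [pivot] + right, len(rest) + c_left + c_right


def last(items):
    return len(items) - 1


def middle(items):
    return len(items) // 2


data = list(range(255))  # already sorted input
_, worst = quicksort(data, last)
_, balanced = quicksort(data, middle)
print(worst)     # 32385, i.e. 255 * 254 / 2
print(balanced)  # 1538
```

On sorted input the last-element pivot peels off one element per step, giving n(n-1)/2 comparisons, while the middle pivot halves the array each time and stays close to n log n.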
Another well-known sorting algorithm is mergesort. Mergesort always has a runtime of O(n log n). In mergesort you cut the array down until only one element is left, then climb back up the call stack, merging the arrays of size one, then the arrays of size two, and so on.
Let's say you implement a set using an array. To insert an element you simply put it in the next available bucket. If there is no available bucket you increase the capacity of the array by a value m.
For the insert algorithm, "there is not enough space" is the worst case.
insert(S, e)
    if size(S) >= capacity(S)
        reserve(S, size(S) + m)
    put(S, e)
Assume we never delete elements. By keeping track of the last available position, put, size and capacity are Θ(1) in time and space.
What about reserve? If it is implemented like realloc in C, in the best case you just allocate new memory at the end of the existing memory (best case for reserve), or you have to move all existing elements as well (worst case for reserve).
The worst case lower bound for insert is the best case of reserve(), which is linear in m if we don't nitpick. insert in worst case is Ω(m) in space and time.
The worst case upper bound for insert is the worst case of reserve(), which is linear in m+n, with n the number of elements already stored. insert in worst case is O(m+n) in space and time.
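The worst case of reserve can be sketched in Python with a toy container. The class name ArraySet and the copies counter are made up for this illustration, and reserve here always takes the expensive path of moving every element, so it models only the upper-bound scenario.

```python
class ArraySet:
    """Toy array-backed container that grows its capacity by a fixed step m."""

    def __init__(self, m):
        self.m = m
        self.slots = []   # backing storage; len(self.slots) is the capacity
        self.size = 0
        self.copies = 0   # elements moved by reserve() so far

    def reserve(self, new_capacity):
        # Worst case for reserve: allocate fresh storage and move every element.
        new_slots = [None] * new_capacity
        for i in range(self.size):
            new_slots[i] = self.slots[i]
            self.copies += 1
        self.slots = new_slots

    def insert(self, e):
        if self.size >= len(self.slots):
            self.reserve(self.size + self.m)
        self.slots[self.size] = e
        self.size += 1


s = ArraySet(m=4)
for value in range(9):
    s.insert(value)
print(s.copies)  # 12: 0 moved at size 0, 4 at size 4, 8 at size 8
```

Each insert that triggers reserve pays for m new slots plus one move per stored element, which matches the O(m+n) worst case; a reserve that could extend the memory in place would pay only for the m new slots.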

what does O(N) mean [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
What is Big O notation? Do you use it?
Hi all,
fairly basic scalability notation question.
I recently received a comment on a post about my Python ordered-list implementation:
"but beware that your 'ordered set' implementation is O(N) for insertions"
Which is great to know, but I'm not sure what this means.
I've seen notation such as n(o) o(N), N(o-1) or N(o*o)
what does the above notation refer to?
The comment was referring to the Big-O Notation.
Briefly:
O(1) means in constant time - independent of the number of items.
O(N) means in proportion to the number of items.
O(log N) means a time proportional to log(N)
Basically any 'O' notation means an operation will take time up to a maximum of k*f(N)
where:
k is a constant multiplier
f() is a function that depends on N
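Written out more formally, the statement above is usually given as follows. The symbols T(N) for the running time and N_0 for the threshold are not in the answer and are introduced here only for the illustration.

```latex
T(N) = O\bigl(f(N)\bigr)
\iff
\exists\, k > 0,\ \exists\, N_0 \ \text{such that}\ 
T(N) \le k \cdot f(N) \ \text{for all } N \ge N_0 .
```

For example, T(N) = 3N + 5 is O(N): taking k = 4 and N_0 = 5 gives 3N + 5 ≤ 4N for every N ≥ 5.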
O(n) is Big O Notation and refers to the complexity of a given algorithm. n refers to the size of the input, in your case it's the number of items in your list.
O(n) means that your algorithm will take on the order of n operations to insert an item. e.g. looping through the list once (or a constant number of times such as twice or only looping through half).
O(1) means it takes a constant time, that it is not dependent on how many items are in the list.
O(n^2) means that for every insert, it takes n*n operations. i.e. 1 operation for 1 item, 4 operations for 2 items, 9 operations for 3 items. As you can see, O(n^2) algorithms become inefficient for handling large number of items.
For lists O(n) is not bad for insertion, but not the quickest. Also note that O(n/2) is considered as being the same as O(n) because they both grow at the same rate with n.
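For a sense of where the O(n) in the comment might come from, here is a hypothetical Python sketch of inserting into a sorted list by scanning for the right position. It is not the asker's actual implementation, which is not shown, and the name ordered_insert is invented for this example.

```python
def ordered_insert(items, value):
    """Insert value into the sorted list items; return how many elements were examined."""
    examined = 0
    position = len(items)
    for i, existing in enumerate(items):
        examined += 1
        if existing >= value:
            position = i
            break
    items.insert(position, value)  # list.insert also shifts the later elements
    return examined


small = list(range(0, 200, 2))    # 100 sorted items
large = list(range(0, 2000, 2))   # 1000 sorted items
print(ordered_insert(small, 199))   # 100
print(ordered_insert(large, 1999))  # 1000
```

Ten times as many items means ten times as many elements examined in the worst case, which is the linear growth that O(n) describes.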
It's called Big O Notation: http://en.wikipedia.org/wiki/Big_O_notation
So saying that insertion is O(n) means that you have to walk through the whole list (or half of it -- big O notation ignores constant factors) to perform the insertion.
This looks like a nice introduction: http://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/
Specifically O(n) means that if there are 2x as many items in the list, it'll take no more than twice as long, and if there are 50 times as many it'll take no more than 50 times as long. See the Wikipedia article dreeves pointed out for more details.
Edit: It was pointed out that Big-O does represent the upper bound, so if there are twice as many elements in the list, insertion will take at most twice as long, and if there are 50 times as many elements, it would take at most 50 times as long.
If it was additionally Ω(n) (Big Omega of n) then it would take at least twice as long for a list that is twice as big. If your implementation is both O(n) and Ω(n), meaning that it'll take both at least and at most twice as long for a list twice as big, then it can be said to be Θ(n) (Big Theta of n), meaning that it'll take roughly twice as long if there are twice as many elements.
According to Wikipedia (and personal experience, being guilty of it myself) Big-O is often used where Big-Theta is what is meant. It would be technically correct to call your function O(n^n^n^n) because all Big-O says is that your function is no slower than that, but no one would actually say that other than to prove a point, because it is unhelpful and misleading despite being technically accurate.
It refers to how complex your program is, i.e., how many operations it takes to actually solve a problem. O(n) means that each operation takes the same number of steps as the items in your list, which for insertion, is very slow. Likewise, O(n^2) means that any operation takes "n" squared number of steps to accomplish, and so on... The "O" is for Order of Magnitude, and the expression in the parentheses is always related to the number of items being manipulated in the procedure.
Short answer: It means that the processing time is in linear relation to the size of the input. E.g., if the size of the input (length of the list) triples, the processing time (roughly) triples. And if it increases a thousandfold, the processing time increases by roughly the same factor.
Long answer: See the links provided by Ian P and dreeves
This may help:
http://en.wikipedia.org/wiki/Big_O_notation#Orders_of_common_functions
O(n): Finding an item in an unsorted list or a malformed tree (worst case); adding two n-digit numbers
Good luck!
Wikipedia explains it far better than I can, but it means that if your list size is N, it takes at most N loops/iterations to insert an item. (In effect, you have to iterate over the whole list.)
If you want a better understanding, there is a free book from Berkeley that goes more in-depth about the notation.

Resources