Big O Notation: Definition - algorithm

I've been watching MIT lectures for the algorithms course and the definition for the Big O notation says
f(n) = O(g(n)) such that for some constants c and n0
0 < f(n) < c.g(n) for all n>n0
Then the instructor proceeded to give an example,
2n2=O(n3)
Now I get that Big O gives the upper bound on the function but I am confused as to what exactly does the function f(n) correspond to here? What is its significance? As per my understanding goes, g(n) is the function representing the algorithm we are trying to analyse, but what is the purpose of f(n) or as in the example 2n2?
Need some clarification on this, I've been stuck here for hours.

In the formal definition of big-O notation, the functions f(n) and g(n) are placeholders for other functions, the same way that, say, in the quadratic formula, the letters a, b, and c are placeholders for the actual coefficients in the quadratic equation.
In your example, the instructor was talking about how 2n2 = O(n3). You have a formal definition that talks about what it means, in general, for f(n) = O(g(n)) to be true. So let's pattern-match that against the math above. It looks like f(n) is the thing on the left and g(n) is the thing on the right, so in this example f(n) = 2n2 and g(n) = n3.
The previous paragraph gives a superficial explanation of what f(n) and g(n) are by just looking at one example, but it's better to talk about what they really mean. Mathematically, f(n) and g(n) really can be any functions you'd like, but typically when you're using big-O notation in the context of the analysis of algorithms, you'll usually let f(n) be the true amount of work done by the algorithm in question (or its runtime, or its space usage, or really just about anything else) and will pick g(n) to be some "nice" function that's easier to reason about. For example, it might be the case that some function you're analyzing has a true runtime, as a function of n, as 16n3 - 2n2 - 9n + 137. That would be your function f(n). Since the whole point behind big-O notation is to be able to (mathematically rigorously and safely) discard constant factors and low-order terms, we'll try to pick a g(n) that grows at the same rate as f(n) but is easier to reason about - say, g(n) = n3. So now we can try to determine whether f(n) = O(g(n)) by seeing whether we can find the constants c and n0 talked about in the formal definition of big-O notation.
So to recap:
f(n) and g(n) in the definition given are just placeholders for other functions.
In practical usage, f(n) will be the true runtime of the algorithm in question, and g(n) will be something a lot simpler that grows at the same rate.

f(n) is the function that gives you the exact values of the thing you are trying to measure (be that time, number of processor instructions, number of iterations steps, amount of memory used, whatever).
g(n) is another function that approximates the growth of f(n).
In the usual case you don't really know f(n) or it's really hard to compute. For example for time it depends on the processor speed, memory access patterns, system load, compiler optimizations and other. g(n) is usually really simple and it's easier to understand if f(n) = O(N) that if you double n you will roughly double the runtime, in the worst case. Since it's an upper bound g(n) doesn't have to be the minimum, but usually people try to avoid inflating it if it's not necessary. In your example O(n^3) is an upper bound for 2n^2, but so is O(n^2) and O(n!).

Related

How do I prove that y = n^2 does not belong to O(1) using formal definition of Big-O?

I understand that classifying in Big-O is essentially having an "upper-bound" of sort so I understand it graphically, I don't understand is how to use the Big-O formal definition to solve this types of problems. Any help is appreciated.
Although there are platforms better suited for this question than SO, since this gets purely mathematical, understanding this is fundamental for using big-O also in a Computer Science context, so I like the question. And understanding your particular example will likely shed some light on what big-O is in general, as well as provide some practical intuition. So I will try to explain:
We have a function f(n) = n2. To show that it is not in O(1), we have to show that it grows faster than a function g(n) = c where c is some constant. In other words, that f(n) > g(n) for sufficiently large n.
What does sufficiently large mean? It means that for any constant c, we can find an N so that f(n) > g(n) for all n > N.
This is how we define asymptotic behaviour. Your constant function may be larger than f(n), but as n grows enough, you will eventually reach a point where f(n) remains larger forever. How to prove it?
Whatever constant c you choose - however large - we can construct our N. Let us say N = √c. Since c is a constant, so is √c.
Now, we have f(n) = n2 > c = g(n) whenever n > √c.
Therefore, f(n) is not in O(1). □
The key here is that we found an constant N (that is, which depends on our constant c but not on our variable n) such that this inequivalence holds. It can take some wrapping one's head around it, but doing a few other simple examples can help get the intuition.

When and Where to use which asymptotic notation

I have been through this Big-Oh explanation, understanding the complexity of two loops, difference between big theta and big-oh and also through this question.
I understand that we cannot say that Big-oh is used as the worst case, Omega as Best case and theta as average case. Big-oh has its own best, worst and average cases. But how we find out that any specific algorithm belongs from Big-oh, Big-theta or Big-Omega. or how we can check that if any algorithm belongs from all of these.
A function f(n) is Big-Oh of a function g(n), written f(n) = O(g(n)), if there exist a positive constant c and a natural number n0 such that for n > n0, f(n) <= c * g(n). A function f(n) is Big-Omega of g(n), written f(n) = Omega(g(n)), if and only if g(n) = O(f(n)). A function f(n) is Theta of a function g(n), written f(n) = Theta(g(n)), if and only if f(n) = O(g(n)) and f(n) = Omega(g(n)).
To prove any of the free, you do it by showing some function(s) are Big-Oh of some other functions. To show that one function is Big-Oh of another is a difficult problem in the general case. Any form of mathematical proof may be helpful. Induction proofs in conjunction with intuition for the base cases are not uncommon. Basically, guess at values for c and n0 and see if they work. Other options involve choosing one of the two and working out a reasonable value for another.
Note that a function may not be Big-Theta of any other function, if its tightest bounds from above and below are functions with different asymptotic rates of growth. However, I think it's usually a safe bet that most functions are going to be Big-Oh of something reasonably uncomplicated, and all functions typically looked at from this perspective are at least constant-time in the best case - Omega(1).

Issue while understanding Big Oh notations?

According to CourseEra course on Algorithms and Introduction to Algorithms
, a function G(n) where n is the input size is said to be a big oh notation of F(n) when there exists constants n0 and C such that this inequality holds true
F(n) <= C*G(N) ( For all N > N0 )
Now ,
This mathematical definition is very clear to me .
But as it was taught to me by my teacher today , I am confused!
He said that "Big - Oh Notations are upper bound on a function and it is like the LCM of two numbers i.e. Unique and greater than the function"
I don't think this statement was kind of correct, Is Big Oh notation really unique ?
Morover,
Thinking about Big Oh notations , I also confused myself why do we approximate the Big Oh notations to the highest degree term . ( We can easily prove the mathematical inequality though with nice choice of constants ) but what is the real use of it ??
I mean what does it signify?
We can even take F(n) as the Big Oh Notation of F(n) for the constant 1 !
I think it shows the dependence of the running time only on the highest degree term! Please Clear my doubts as I might have understood it wrongly from my book or my teacher made an error?
Is Big Oh notation really unique ?
Yes and no. By the pure formula, Big-O is of course not unique. However, to be of use for its purpose, one actually tries to find not just some upper bound, but the lowest upper bound. And this makes a meaningful "Big-O" unique.
We can even take F(n) as the Big Oh Notation of F(n) for the constant
1 !
Yes we probably can do that. However, the Big-O is used to relate classes of functions/algorithms to each other. Saying that F(n) relates to X(n) like F(n) relates to X(n) is what you get by using G(n) = F(n). Not much value in that.
That's why we try to find the unique lowest G to satisfy the equation. G(n) is usually a rather trivial function, like G(n) = n, G(n) = n², or G(n) = n*log(n), and this allows us to compare algorithms more easily because we can easily see that, e.g., G(n) = n is less than G(n) = n² for all n >= something.
Interestingly, most algorithms' complexity converges to one of the simple G(n) for large n. You could also say that, by looking at large n's, we try to separate out the "important" from the not-so-important parts of F(n); then we just omit the minor terms in F(n) and get a simplified function G(n).
In practical terms, we also want to abstract away from technical details. If I have, for instance, F(n) = 4*n and E(n) = 2*n I can use twice as much CPUs for the F algorithm and be just as good as the E one independent of the size of the input. Maybe one machine has a dedicated instruction for sqare root, so that SQRT(x) is a single step, while another machine needs much more instructions to get the result. We want to abstract away from that.
This implies one more point of view too: If I have a problem to solve, e.g. "calculate x(y)", I could present the solution as "result := x(y)", O(1). But that's not considered an algorithm. The specification of the algorithm must include a relevant level of detail to be a) meaningful and b) accessible to Big-O.

Meaning of Big O notation

Our teacher gave us the following definition of Big O notation:
O(f(n)): A function g(n) is in O(f(n)) (“big O of f(n)”) if there exist
constants c > 0 and N such that |g(n)| ≤ c |f(n)| for all n > N.
I'm trying to tease apart the various components of this definition. First of all, I'm confused by what it means for g(n) to be in O(f(n)). What does in mean?
Next, I'm confused by the overall second portion of the statement. Why does saying that the absolute value of g(n) less than or equal f(n) for all n > N mean anything about Big O Notation?
My general intuition for what Big O Notation means is that it is a way to describe the runtime of an algorithm. For example, if bubble sort runs in O(n^2) in the worst case, this means that it takes the time of n^2 operations (in this case comparisons) to complete the algorithm. I don't see how this intuition follows from the above definition.
First of all, I'm confused by what it means for g(n) to be in O(f(n)). What does in mean?
In this formulation, O(f(n)) is a set of functions. Thus O(N) is the set of all functions that are (in simple terms) proportional to N as N tends to infinity.
The word "in" means ... "is a member of the set".
Why does saying that the absolute value of g(n) less than or equal f(n) for all n > N mean anything about Big O Notation?
It is the definition. And besides you have neglected the c term in your synopsis, and that is an important part of the definition.
My general intuition for what Big O Notation means is that it is a way to describe the runtime of an algorithm. For example, if bubble sort runs in O(n^2) in the worst case, this means that it takes the time of n^2 operations (in this case comparisons) to complete the algorithm. I don't see how this intuition follows from the above definition.
Your intuition is incorrect in two respects.
Firstly, the real definition of O(N^2) is NOT that it takes N^2 operations. it is that it takes proportional to N^2 operations. That's where the c comes into it.
Secondly, it is only proportional to N^2 for large enough values of N. Big O notation is not about what happens for small N. It is about what happens when the problem size scales up.
Also, as a a comment notes "proportional" is not quite the right phraseology here. It might be more correct to say "tends towards proportional" ... but in reality there isn't a simple english description of what is going on here. The real definition is the mathematical one.
If you now reread the definition in the light of that, you should see that it fits just nicely.
(Note that the definitions of Big O, and related measures of complexity can also be expressed in calculus terminology; i.e. using "limits". However, generally speaking the things we are talking about are quantized; i.e. an integer number instructions, an integer number bytes of storage, etc. Calculus is really about functions involving real numbers. Hence, you could argue that the formulation above is preferable. OTOH, a real mathematician would probably see bus-sized holes in this argumentation.)
O(g(n)) looks like a function, but it is actually a set of functions. If a function f is in O(g(n)), it means that g is an asymptotic upper bound on f to within a constant factor. O(g(n)) contains all functions that are bounded from above by g(n).
More specifically, there exists a constant c and n0 such that f(n) < c * g(n) for all n > n0. This means that c * g(n) will always overtake f(n) beyond some value of n. g is asymptotically larger than f; it scales faster.
This is used in the analysis of algorithms as follows. The running time of an algorithm is impossible to specify practically. It would obviously depend on the machine on which it runs. We need a way of talking about efficiency that is unconcerned with matters of hardware. One might naively suggest counting the steps executed by the algorithm and using that as the measure of running time, but this would depend on the granularity with which the algorithm is specified and so is no good either. Instead, we concern ourselves only with how quickly the running time (this hypothetical thing T(n)) scales with the size of the input n.
Thus, we can report the running time by saying something like:
My algorithm (algo1) has a running time T(n) that is in the set O(n^2). I.e. it's bounded from above by some constant multiple of n^2.
Some alternative algorithm (algo2) might have a time complexity of O(n), which we call linear. This may or may not be better for some particular input size or on some hardware, but there is one thing we can say for certain: as n tends to infinity, algo2 will out-perform algo1.
Practically then, one should favour algorithms with better time complexities, as they will tend to run faster.
This asymptotic notation may be applied to memory usage also.

Algorithms Analysis Big O notation

I need help in this question. I really don't understand how to do it.
Show, either mathematically or by an example, that if f(n) is O(g(n)), a*f(n) is O(g(n)), for any constant a > 0.
I'll give you this. It should help you look in the right direction:
definition of O(n):
a function f(n) who satisfies f(n) <= C*n for an arbitrary constant number C and for every n above an arbitrary constant number N will be noted f(n) = O(n).
This is the formal definition for big-o notation, it should be simple to take this and turn it into a solution.

Resources