So let's say we have a function such as 6wn^2 - 6wn + 6w, would the big-o notation be O(n^2) or O(wn^2)? The problem set question asks me to write this function in Big-O form in terms of w and n, so I think it's the second one but I've never seen a big O notation like that involving 2 terms so I'm a little confused.
If a function f(n) is O(g(n)) for some function g(x), it means that f(n) is bounded above by g(n) asymptotically. Basically this means that for large n values, g(n) will be bigger than f(n).
(More formally, we could say that f(n) is O(g(n)) if and only if there exists some N such that g(n)>f(n) for all n>N)
Now in your case, let f(n) = 6wn^2 - 6wn + 6w.Then f(n) is both O(n^2) and O(wn^2). This is because both are asymptotic upper bounds for f(n). In fact, f(n) is also O(n^2^2^2^2^2^2^2).
However the best answer for you to give would likely be that f(n) is O(wn^2), since that includes w, which is what the question asked for.
Note that in practice we usually remove all coefficients and unimportant powers from g(n). The reason is that you get more information about a function if you present a low upper bound as opposed to a high one. For example, if I tell you that my speedy search algorithm is O(n^(1000!)), I'm not telling you very much at all. On the other hand, if I told you it was O(n^2), I'm giving you more information - but both could be correct.
You are correct, it is the second one.
Technically, both are correct, but if the question asks for it it in terms of both w and n, then it's the second one.
Since the question asks you to write in terms of both, the second is the answer pretended.
In general it is okay to have two variables in big-O notation, if the input size is defined by two variables and they do not depend in one another.
An example of this is Dijkstra's algorithm, that has a worst case performance of O( | E | + | V | log | V | ). The size of the input graph is defined by the number of vertices (V) and the number of edges (E). The adjacency matrix representation ends up making E = V^2, so if we used Dijkstra's algorithm with an adjacency matrix (don't do it, it's a bad idea) we would convey better information with O( V^2 + | V | log | V | ), which is the same as simply O(V^2).
Related
I have some confusion regarding the Asymptotic Analysis of Algorithms.
I have been trying to understand this upper bound case, seen a couple of youtube videos. In one of them, there was an example of this equation
where we have to find the upper bound of the equation 2n+3. So, by looking at this, one can say that it is going o be O(n).
My first question :
In algorithmic complexity, we have learned to drop the constants and find the dominant term, so is this Asymptotic Analysis to prove that theory? or does it have other significance? otherwise, what is the point of this analysis when it is always going to be the biggest n in the equation, example- if it were n+n^2+3, then the upper bound would always be n^2 for some c and n0.
My second question :
as per rule the upper bound formula in Asymptotic Analysis must satisfy this condition f(n) = O(g(n)) IFF f(n) < c.g(n) where n>n0,c>0, n0>=1
i) n is the no of inputs, right? or does n represent the number of steps we perform? and does f(n) represents the algorithm?
ii) In the following video to prove upper bound of the equation 2n+3 could be n^2 the presenter considered c =1, and that is why to satisfy the equation n had to be >= 3 whereas one could have chosen c= 5 and n=1 as well, right? So then why were, in most cases in the video, the presenter was changing the value of n and not c to satisfy the conditions? is there a rule, or is it random? Can I change either c or n(n0) to satisfy the condition?
My Third Question:
In the same video, the presenter mentioned n0 (n not) is the number of steps. Is that correct? I thought n0 is the limit after which the graph becomes the upper bound (after n0, it satisfies the condition for all values of n); hence n0 also represents the input.
Would you please help me understand because people come up with different ideas in different explanations, and I want to understand them correctly?
Edit
The accepted answer clarified all of the questions except the first one. I have gone through many articles on the web, and here I am documenting my conclusion if anyone else has the same question. This will help them.
My first question was
In algorithmic complexity, we have learned to drop the constants and
find the dominant term, so is this Asymptotic Analysis to prove that
theory?
No, Asymptotic Analysis describes the algorithmic complexity, which is all about understanding or visualizing the Asymptotic behavior or the tail behavior of a function or a group of functions by plotting mathematical expression.
In computer science, we use it to evaluate (note: evaluate is not measuring) the performance of an algorithm in terms of input size.
for example, these two functions belong to the same group
mySet = set()
def addToMySet(n):
for i in range(n):
mySet.add(i*i)
mySet2 = set()
def addToMySet2(n):
for i in range(n):
for j in range(500):
mySet2.add(i*j)
Even though the execution time of the addToMySet2(n) is always > the execution time of addToMySet(n), the tail behavior of both of these functions would be the same with respect to the largest n, if one plot them in a graph the tendency of that graph for both of the functions would be linear thus they belong to the same group. Using Asymptotic Analysis, we get to see the behavior and group them.
A mistake that I made assuming upper bound represents the worst case. In reality, The upper bound of any algorithm is associated with all of the best, average, and worst cases. so the correct way of putting that would be
upper/lower bound in the best/average/worst case of an
algorithm
.
We can't relate the upper bound of an algorithm with the worst-case time complexity and the lower bound with the best-case complexity. However, an upper bound can be higher than the worst-case because upper bounds are usually asymptotic formulae that have been proven to hold.
I have seen this kind of question like find the worst-case time complexity of such and such algorithm, and the answer is either O(n) or O(n^2) or O(log-n), etc.
For example, if we consider the function addToMySet2(n), one would say the algorithmic time complexity of that function is O(n), which is technically wrong because there are three factors bound, bound type, (inclusive upper bound and strict upper bound) and case are involved determining the algorithmic time complexity.
When one denote O(n) it is derived from this Asymptotic Analysis f(n) = O(g(n)) IFF for any c>0, there is a n0>0 from which f(n) < c.g(n) (for any n>n0) so we are considering upper bound of best/average/worst case. In the above statement the case is missing.
I think We can consider, when not indicated, the big O notation generally describes an asymptotic upper bound on the worst-case time complexity. Otherwise, one can also use it to express asymptotic upper bounds on the average or best case time complexities
The whole point of asymptotic analysis is to compare algorithms performance scaling. For example, if I write two version of the same algorithm, one with O(n^2) time complexity and the other with O(n*log(n)) time complexity, I know for sure that the O(n*log(n)) one will be faster when n is "big". How big? it depends. You actually can't know unless you benchmark it. What you know is at some point, the O(n*log(n)) will always be better.
Now with your questions:
the "lower" n in n+n^2+3 is "dropped" because it is negligible when n scales up compared to the "dominant" one. That means that n+n^2+3 and n^2 behave the same asymptotically. It is important to note that even though 2 algorithms have the same time complexity, it does not mean they are as fast. For example, one could be always 100 times faster than the other and yet have the exact same complexity.
(i) n can be anything. It may be the size of the input (eg. an algorithm that sorts a list) but it may also be the input itself (eg. an algorithm that give the n-th prime number) or a number of iteration, etc
(ii) he could have taken any c, he chose c=1 as an example as he could have chosen c=1.618. Actually the correct formulation would be:
f(n) = O(g(n)) IFF for any c>0, there is a n0>0 from which f(n) < c.g(n) (for any n>n0)
the n0 from the formula is a pure mathematical construct. For c>0, it is the n value from which the function f is bounded by g. Since n can represent anything (size of a list, input value, etc), it is the same for n0
I had seen in one of the videos (https://www.youtube.com/watch?v=A03oI0znAoc&t=470s) that, If suppose f(n)= 2n +3, then BigO is O(n).
Now my question is if I am a developer, and I was given O(n) as upperbound of f(n), then how I will understand, what exact value is the upper bound. Because in 2n +3, we remove 2 (as it is a constant) and 3 (because it is also a constant). So, if my function is f(n) where n = 1, I can't say g(n) is upperbound where n = 1.
1 cannot be upperbound for 1. I find hard understanding this.
I know it is a partial (and probably wrong answer)
From Wikipedia,
Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation.
In your example,
f(n) = 2n+3 has the same growth rate as f(n) = n
If you plot the functions, you will see that both functions have the same linear growth; and as n -> infinity, the difference between the 2 gets minimal.
In Big O notation, f(n) = 2n+3 when n=1 means nothing; you need to look at the trend, not discreet values.
As a developer, you will consider big-O as a first indication for deciding which algorithm to use. If you have an algorithm which is say, O(n^2), you will try to understand whether there is another one which is, say, O(n). If the problem is inherently O(n^2), then the big-O notation will not provide further help and you will need to use other criterion for your decision. However, if the problem is not inherently O(n^2), but O(n), you should discard any algorithm that happen to be O(n^2) and find an O(n) one.
So, the big-O notation will help you to better classify the problem and then try to solve it with an algorithm whose complexity has the same big-O. If you are lucky enough as to find 2 or more algorithms with this complexity, then you will need to ponder them using a different criterion.
This question already has answers here:
What does O(log n) mean exactly?
(32 answers)
Closed 7 years ago.
So I have been studying Big O notation ( Noob ) , and most things looks like alien language to me. Now I understand basic of log like log 16 of base2 is the power of 2 equals the number 16. Now for Binary search big O(logN) making no sense to me , what is the value of LogN exacly what is the base here? I have searched internet, problem is everyone explained this mathmetically which i cant catch I am not good with math. Can someone explain this to me in Basic English not Alien language like exponential. I know How Binary search works.
Second question: [I dont even know what f = Ω(g) this symbol means] Can someone explain to me in Plain English what is required here , I dont want the answer , just what this means.
Question :
In each of the following situations, indicate whether f = O(g), or f = Ω(g), or both. (in which case f = Θ(g)).
f(n) g(n)
(a) n-100 ...............n-200
(b) 100n + logn .......n + (log n)2
(c) log2n ............... log3n
Update: I just realized that I studied algorithms from MIT's videos. Here is the link to the first of those videos. Keep going to next lecture as far as you want.
Clearly, Log(n) has no value without fixing what n is and what base of log we are using. The purpose of mentioning log(n) so often is to help people understand the rate of growth of a particular algorithm or piece of code. It is only to help people see things in perspective. To build your perspective, see the comparison below:
1 < logn < n < nlogn < n2 < 2^n < n! < n^n
The line above says that after some value of n on the number line, the rate of growth of the above written functions is in the order mentioned there. This way, decision makers can decide which approach they want to take in solving their problem (and students can pass their Algorithm Design and Analysis exam).
Coming to your question, when books say 'binary search's run time is Log(n)', essentially they mean that the if you have n elements, the running time for binary search would be proportional to Log(n) and if you have 17n elements then you can expect the answer from your algorithm in a time duration that is proportional to Log(17n). In this case, the base of Log function is 2 because in binary search, we have exactly <= 2 paths to pick from at every node.
Since, the log function's base can be easily converted from any number to any other number by multiplying a constant, telling what the base is becomes irrelevant as in Big O notations, constants are ignored.
Coming to the answer to your second question, images will explain it the best.
Big O is only about the upper bound on a function. In the image below, f(n) = O(g(n)). In other words, there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k.
Importance of k is that after 'k' this Big O will stay true, no matter what value of n. If we can't fix a 'k', we cannot say that the growth rate will always stay below the function mentioned in O(...).
Importance of c is in saying that it is the function between O(...) that's really important.
Omega is simply the inversion of Big O. If f(n) = O(g(n)), then g(n) = Ω(f(n)). In other words, Ω() is about your function staying above what is mentioned in Ω(...) for a given value of another 'k' and another 'c'.
The pictorial visualization is
Finally, Big theta is about finding a mathematical function that grows at same rate as your given function. But how do you prove that this function runs same as your function. By using two constant values.
Since it runs same as your given function, you should be able to multiply two constants 'c1' and 'c2' that will be able to put c1 * g(n) above your function f(n) and put c2 * g(n) below your function f(n).
The thing behind Big theta is to provide a function with same rate of growth. Note that there may be no constant 'c' that will be able to get f(n) and g(n) to overlap. Nobody is concerned with that. The only concern is to be able to sandwich the f(n) between a g(n) using two constants so that we can confidently say that we found the rate of growth of f(n).
How to apply the above learned ideas to your question?
Let's take each of them one by one. You can use some online tool to plot these functions and see first hand, how these function behave when you go along the number line.
f(n) = n - 100 and g(n) = n - 200
Here, the rate of growth can be found out by differentiating both functions wrt n. d(f(n))/dn = d(g(n))/dn = 1. Therefore, even though the running times of f(n) and g(n) may be different, their rate of growth is same. Can you pick 'c1' and 'c2' such that c1 * g(n) < f(n) < c2 * g(n)?
f(n) = 100n + log(n) and g(n) = n + 2(log (n))
Differentiate and tell if you can relate the functions as Big O or Big Theta or Big Omega.
f(n) = log (2n) and g(n) = log (3n)
Same as above.
(The images are taken from different pages on this website: http://xlinux.nist.gov/dads/HTML/)
My experience: Try to compare the growth rate of a lot of different functions. Eventually you will get the hang of it for all of them and it will become very intuitive for you. Given concentrated effort for one week or two, this concept cannot remain esoteric for anyone.
First of all, let's go through the notations. I'm assuming from the questions that
O(f) is upper bound,
Ω(f) is lower bound, and
Θ(f) is both
For O(log(N)) in this case, generally the base isn't given because the general form of log(N) is known regardless of the base. E.g.,
(source: rapidtables.com)
So if you've worked through the binary search algorithm (I suggest you do this if you haven't), you should find that the worst case scenario (upper bound) is log_2(N). So given N terms, it will take "log_2(N) computations" in the worst case in order to find the term.
For your second question,
You are simply comparing computational run-times of f and g.
f = O(g)
is when f is an upper bound on g, i.e., f will definitely take longer to compute than g. Alternately,
f = Ω(g)
is when f is a lower bound on g, i.e., g will definitely take longer to compute than f. Lastly,
f = Θ(g)
is when the f is both an upper and lower bound on g, i.e., the run times are the same.
You need to compare the two functions for each question and determine which will take longer to compute. As Mitch mentioned you can check here where this question has already been answered.
Edit: accidentally linked e^x instead of log(x)
The reason the base of the log is never specified is because it is actually completely irrelevant. You can convince yourself of this in three steps:
First, recall that log_2(x) = log_10(x)/log_10(2). But also recall that log_10(2) is a constant, which we'll call k2, so really, log_2(x) * k2 = log_10(x)
Second, recall that this is not unique to logs of base 2. The constants of conversion vary, but all the log functions are related to each other through multiplicative constants.
(You can prove this to yourself if you understand the mathematics behind log functions, or you can just work it up very quickly on a spreadsheet-- have a column of log_2(x) and a column of log_3(x) and divide them.)
Finally, remember that in Big Oh notation, constants basically drop out as being irrelevant. Trying to draw a distinction between O(log_2(N)) and O(log_3(N)) is like trying to draw a distinction between O(N) and O(2N). It is a distinction that does not matter because log_2 and log_3 are related through a constant.
Honestly, the base of the log does not matter.
According to CourseEra course on Algorithms and Introduction to Algorithms
, a function G(n) where n is the input size is said to be a big oh notation of F(n) when there exists constants n0 and C such that this inequality holds true
F(n) <= C*G(N) ( For all N > N0 )
Now ,
This mathematical definition is very clear to me .
But as it was taught to me by my teacher today , I am confused!
He said that "Big - Oh Notations are upper bound on a function and it is like the LCM of two numbers i.e. Unique and greater than the function"
I don't think this statement was kind of correct, Is Big Oh notation really unique ?
Morover,
Thinking about Big Oh notations , I also confused myself why do we approximate the Big Oh notations to the highest degree term . ( We can easily prove the mathematical inequality though with nice choice of constants ) but what is the real use of it ??
I mean what does it signify?
We can even take F(n) as the Big Oh Notation of F(n) for the constant 1 !
I think it shows the dependence of the running time only on the highest degree term! Please Clear my doubts as I might have understood it wrongly from my book or my teacher made an error?
Is Big Oh notation really unique ?
Yes and no. By the pure formula, Big-O is of course not unique. However, to be of use for its purpose, one actually tries to find not just some upper bound, but the lowest upper bound. And this makes a meaningful "Big-O" unique.
We can even take F(n) as the Big Oh Notation of F(n) for the constant
1 !
Yes we probably can do that. However, the Big-O is used to relate classes of functions/algorithms to each other. Saying that F(n) relates to X(n) like F(n) relates to X(n) is what you get by using G(n) = F(n). Not much value in that.
That's why we try to find the unique lowest G to satisfy the equation. G(n) is usually a rather trivial function, like G(n) = n, G(n) = n², or G(n) = n*log(n), and this allows us to compare algorithms more easily because we can easily see that, e.g., G(n) = n is less than G(n) = n² for all n >= something.
Interestingly, most algorithms' complexity converges to one of the simple G(n) for large n. You could also say that, by looking at large n's, we try to separate out the "important" from the not-so-important parts of F(n); then we just omit the minor terms in F(n) and get a simplified function G(n).
In practical terms, we also want to abstract away from technical details. If I have, for instance, F(n) = 4*n and E(n) = 2*n I can use twice as much CPUs for the F algorithm and be just as good as the E one independent of the size of the input. Maybe one machine has a dedicated instruction for sqare root, so that SQRT(x) is a single step, while another machine needs much more instructions to get the result. We want to abstract away from that.
This implies one more point of view too: If I have a problem to solve, e.g. "calculate x(y)", I could present the solution as "result := x(y)", O(1). But that's not considered an algorithm. The specification of the algorithm must include a relevant level of detail to be a) meaningful and b) accessible to Big-O.
Question 1: Under what circumstances would O(f(n)) = O(k f(n)) be the most appropriate form of time-complexity analysis?
Question 2: Working from mathematical definition of O notation, how to show that O(f(n)) = O(k f(n)), for positive constant k?
For the first Question I think it is average case and worst case form of time-complexity. Am I right? And what else should I write in that?
For the second Question, I think we need to define the function mathematically. So is the answer something like because the multiplication by a constant just corresponds to a readjustment of value of the arbitrary constant k in definition of O?
My view: For the first one I think it
is average case and worst case form of
time-complexity. am i right? and what
else do i write in that?
No! Big O notation has NOTHING to do with average case or worst case. It is only about the order of growth of a function - particularly, how quickly a function grows relative to another one. A function f can be O(n) in the average case and O(n^2) in the worst case - this just means the function behaves differently depending on its inputs, and so the two cases must be accounted for separately.
Regarding question 2, it is obvious to me from the wording of the question that you need to start with the mathematical definition of Big O. For completeness's sake, it is:
Formal Definition: f(n) = O(g(n))
means there are positive constants c
and k, such that 0 ≤ f(n) ≤ cg(n) for
all n ≥ k. The values of c and k must
be fixed for the function f and must
not depend on n.
(source http://www.itl.nist.gov/div897/sqg/dads/HTML/bigOnotation.html)
So, you need to work from this definition and write a mathematical proof showing that f(n) = O(k(n)). Start by substituting O(g(n)) with O(k*f(n)) in the definition above; the rest should be quite easy.
Question 1 is a little vague, but your answer for question 2 is definitely lacking. The question says "working from the mathematical definition of O notation". This means that your instructor wants you to use the mathematical definition:
f(x) = O(g(x)) if and only if limit [x -> a+] |f(x)/g(x)| < infinity, for some a
And he wants you to plug in g(x) = k f(x) and prove that that inequality holds.
The general argument you posted might get you partial credit, but it is reasoning rather than mathematics, and the question is asking for mathematics.