proving witnesses and relationships - big-o

I've just been introduced to Big-O notation and I've been given some questions. However, I'm confused as to how to determine the value of n0. I have to show that
4^(n+5) is O(2^(3n))
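As a quick sanity check (not a proof), here is a minimal sketch that assumes the bound to show is 2^(3n) and tests the candidate witnesses c = 1 and n0 = 10, which work because 4^(n+5) = 2^(2n+10) <= 2^(3n) as soon as n >= 10:

# Sample-based check of the witnesses c = 1, n0 = 10 for 4^(n+5) <= c * 2^(3n).
c, n0 = 1, 10
for n in range(n0, n0 + 100):
    assert 4 ** (n + 5) <= c * 2 ** (3 * n), f"fails at n = {n}"
print("4^(n+5) <= 2^(3n) holds for all sampled n >= 10")

The actual proof is the exponent comparison: 2n + 10 <= 3n exactly when n >= 10.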

Related

Use of asymptotic notation

I have a doubt about this particular question: the answer says that a big-Oh(n^2) algorithm will not run faster than a big-Oh(n^3) algorithm, whereas if the notation in both cases were Theta instead, then the statement would have been true. Why is that so?
I would love it if someone could explain this in detail, because I couldn't find any source that clarifies my doubt.
Part 1 paraphrased (note that the phrasing in the question is ambiguous about where "number" is quantified -- it has to be picked after you choose the two algorithms, but I assume that's what's intended).
Given functions f and g with f=Theta(n^2) and g=Theta(n^3), then there exists a number N such that f(n) < g(n) for all n>N.
Part 2 paraphrased:
Given functions f and g with f=O(n^2) and g=O(n^3), then for all n, f(n) < g(n).
1 is true, and you can prove it by application of the definition of big-Theta.
2 is false (as a general statement), and you can disprove it by finding a single example of f and g for which it is false. For example, f(n) = 2 and g(n) = 1: both satisfy their O-bounds, since big O is only an upper bound, yet f(n) > g(n) for every n. The counterexample given in the question is f(n) = n, g(n) = log(n), but the same principle applies.
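As a tiny illustrative sketch (just sampling values, not a proof), the counterexample f(n) = n, g(n) = log(n) can be checked directly: both O-bounds hold trivially, yet f(n) < g(n) fails for every n >= 1:

import math

# f(n) = n is O(n^2), g(n) = log(n) is O(n^3), but f(n) < g(n) is never true here.
for n in (1, 10, 100, 10_000):
    print(n, n < math.log(n))   # prints False for each sampled n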
the answer says that big-Oh(n^2) algorithm will not run faster than big-Oh(n^3) algorithm
It is more subtle: an O(𝑛²) algorithm could run slower than an O(𝑛³) algorithm. The key word is not "will", but "could".
The answer gives one reason, but actually there are two:
Big O notation only gives an upper bound. Quoted from Wikipedia:
A description of a function in terms of big O notation usually only provides an upper bound on the growth rate of the function.
So anything that is O(𝑛) is also O(𝑛²) and O(𝑛³), but not necessarily vice versa. The answer says that an algorithm that is O(𝑛³) could maybe have a tighter bound that is O(log𝑛). True, this may sound silly, because why should one then say it is O(𝑛³) when it is also O(log𝑛)? Then it seems more reasonable to just talk about O(log𝑛). And this is what we commonly do. But there is also a second reason:
The second option does not have the constraint "n > number" in its claim. This is essential, because irrespective of time complexities, the running time of an algorithm for a given value of 𝑛 cannot be determined from its time complexity. An algorithm that is O(𝑛log𝑛) may take 10 seconds to do its job, while an algorithm that is O(𝑛²) may take 8 seconds to get the same result, even though its time complexity is worse. When comparing time complexities you only get information about asymptotic behaviour, i.e. when 𝑛 is larger than a large enough number.
Because this extra constraint is part of the first claim, it is indeed true.

Asymptotic analysis, Upper bound

I have some confusion regarding the Asymptotic Analysis of Algorithms.
I have been trying to understand this upper bound case and have seen a couple of YouTube videos. In one of them, there was an example
where we have to find the upper bound of the expression 2n + 3. So, by looking at this, one can say that it is going to be O(n).
My first question :
In algorithmic complexity, we have learned to drop the constants and find the dominant term. So is this asymptotic analysis meant to prove that rule, or does it have other significance? Otherwise, what is the point of this analysis when the answer is always going to be the biggest term in the expression? For example, if it were n + n^2 + 3, then the upper bound would always be n^2 for some c and n0.
My second question :
As per the rule, the upper bound in asymptotic analysis must satisfy this condition: f(n) = O(g(n)) iff f(n) <= c*g(n) for n > n0, where c > 0 and n0 >= 1.
i) n is the number of inputs, right? Or does n represent the number of steps we perform? And does f(n) represent the algorithm?
ii) In the video, to prove that an upper bound of 2n + 3 could be n^2, the presenter considered c = 1, and that is why, to satisfy the inequality, n had to be >= 3, whereas one could have chosen c = 5 and n0 = 1 as well, right? So then why, in most cases in the video, was the presenter changing the value of n0 and not c to satisfy the condition? Is there a rule, or is it arbitrary? Can I change either c or n0 to satisfy the condition?
My Third Question:
In the same video, the presenter mentioned that n0 (n-nought) is the number of steps. Is that correct? I thought n0 is the threshold after which the graph becomes the upper bound (after n0, the condition is satisfied for all values of n); hence n0 also represents the input.
Would you please help me understand? People come up with different ideas in different explanations, and I want to understand them correctly.
Edit
The accepted answer clarified all of the questions except the first one. I have gone through many articles on the web, and here I am documenting my conclusion in case anyone else has the same question; it may help them.
My first question was
In algorithmic complexity, we have learned to drop the constants and find the dominant term. So is this asymptotic analysis meant to prove that rule?
No. Asymptotic analysis describes algorithmic complexity, which is all about understanding or visualizing the asymptotic (tail) behavior of a function or a group of functions, for example by plotting their mathematical expressions.
In computer science, we use it to evaluate (note: evaluating is not the same as measuring) the performance of an algorithm in terms of its input size.
For example, these two functions belong to the same group:
mySet = set()

def addToMySet(n):
    # A single loop over n items: the work grows linearly with n.
    for i in range(n):
        mySet.add(i * i)

mySet2 = set()

def addToMySet2(n):
    # The inner loop runs a fixed 500 times, so the total work is 500 * n,
    # which still grows linearly with n.
    for i in range(n):
        for j in range(500):
            mySet2.add(i * j)
Even though the execution time of addToMySet2(n) is always greater than the execution time of addToMySet(n), the tail behavior of both of these functions is the same with respect to large n: if one plots them on a graph, the tendency of the graph for both functions is linear, so they belong to the same group. Using asymptotic analysis, we get to see this behavior and group them.
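A rough timing sketch (assuming the two functions above are defined; absolute numbers will vary by machine) shows the linear tendency: doubling n roughly doubles the running time of both, even though addToMySet2 is consistently slower by a constant factor:

import time

def measure(fn, n):
    # Return the wall-clock time of one call fn(n).
    start = time.perf_counter()
    fn(n)
    return time.perf_counter() - start

for n in (1000, 2000, 4000):
    print(n, round(measure(addToMySet, n), 4), round(measure(addToMySet2, n), 4))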
A mistake that I made was assuming that the upper bound represents the worst case. In reality, an upper bound can be stated for any of the best, average, and worst cases, so the correct way of putting it would be "the upper/lower bound in the best/average/worst case of an algorithm".
We can't simply equate the upper bound of an algorithm with its worst-case time complexity, or the lower bound with its best-case complexity. Moreover, an upper bound can be higher than the worst case, because upper bounds are usually asymptotic formulae that have merely been proven to hold.
I have seen questions like "find the worst-case time complexity of such and such algorithm", where the answer is O(n), O(n^2), O(log n), etc.
For example, if we consider the function addToMySet2(n), one would say the algorithmic time complexity of that function is O(n), which is technically incomplete, because three factors are involved in determining the algorithmic time complexity: the bound, the bound type (inclusive upper bound vs. strict upper bound), and the case.
When one denotes O(n), it is derived from the definition used in asymptotic analysis: f(n) = O(g(n)) iff there exist c > 0 and n0 > 0 such that f(n) <= c*g(n) for all n > n0. So we are considering an upper bound of the best/average/worst case, and in the bare statement O(n) the case is missing.
I think we can assume that, when the case is not indicated, big O notation generally describes an asymptotic upper bound on the worst-case time complexity. Otherwise, one can also use it to express asymptotic upper bounds on the average-case or best-case time complexities.
The whole point of asymptotic analysis is to compare how algorithms' performance scales. For example, if I write two versions of the same algorithm, one with O(n^2) time complexity and the other with O(n*log(n)) time complexity, I know for sure that the O(n*log(n)) one will be faster when n is "big". How big? It depends. You actually can't know unless you benchmark it. What you know is that, at some point, the O(n*log(n)) one will always be better.
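For instance, a rough benchmark sketch (hypothetical; the crossover point and absolute timings depend entirely on the machine, the language, and the constant factors) comparing a handwritten O(n^2) insertion sort against Python's built-in O(n log n) sort:

import random
import time

def insertion_sort(items):
    # Classic O(n^2) comparison sort.
    a = list(items)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

for n in (1000, 2000, 4000):
    data = [random.random() for _ in range(n)]
    t0 = time.perf_counter()
    insertion_sort(data)
    t1 = time.perf_counter()
    sorted(data)
    t2 = time.perf_counter()
    print(n, f"O(n^2): {t1 - t0:.4f}s", f"O(n log n): {t2 - t1:.4f}s")

For small n the quadratic version may well be competitive; only as n grows does the asymptotic difference dominate.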
Now with your questions:
the "lower" n in n+n^2+3 is "dropped" because it is negligible when n scales up compared to the "dominant" one. That means that n+n^2+3 and n^2 behave the same asymptotically. It is important to note that even though 2 algorithms have the same time complexity, it does not mean they are as fast. For example, one could be always 100 times faster than the other and yet have the exact same complexity.
(i) n can be anything. It may be the size of the input (eg. an algorithm that sorts a list) but it may also be the input itself (eg. an algorithm that give the n-th prime number) or a number of iteration, etc
(ii) He could have taken any c > 0 for this example (he chose c = 1, but c = 1.618 would have worked just as well), because n^2 grows strictly faster than 2n + 3, so for every choice of c there is some n0 that works. The general definition is:
f(n) = O(g(n)) iff there exist c > 0 and n0 > 0 such that f(n) <= c*g(n) for all n > n0
The n0 from the formula is a pure mathematical construct. For a given c > 0, it is the value of n from which the function f is bounded above by c*g. Since n can represent anything (size of a list, input value, etc.), the same goes for n0.
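To make (ii) concrete, here is a small sampling check (a sanity check only, not a proof) of the two witness pairs from the question, c = 1 with n0 = 3 and c = 5 with n0 = 1, for showing 2n + 3 <= c*n^2:

# Check each (c, n0) pair over a sample of n > n0.
for c, n0 in [(1, 3), (5, 1)]:
    holds = all(2 * n + 3 <= c * n * n for n in range(n0 + 1, n0 + 1000))
    print(f"c = {c}, n0 = {n0}: holds for all sampled n > n0? {holds}")

Either pair is a valid witness; the definition only asks for some c and n0, so you are free to adjust whichever of the two makes the inequality easiest to verify.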

BigO Notation, understanding

I had seen in one of the videos (https://www.youtube.com/watch?v=A03oI0znAoc&t=470s) that, If suppose f(n)= 2n +3, then BigO is O(n).
Now my question is: if I am a developer and I am given O(n) as the upper bound of f(n), how will I understand what exact value the upper bound is? Because in 2n + 3, we remove the 2 (as it is a constant) and the 3 (because it is also a constant). So, if my function is f(n) with n = 1, I can't say g(n) = n is an upper bound at n = 1.
At n = 1, g(1) = 1 cannot be an upper bound for f(1) = 5. I find this hard to understand.
I know it is a partial (and probably wrong) answer.
From Wikipedia,
Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation.
In your example,
f(n) = 2n + 3 has the same growth rate as g(n) = n.
If you plot the functions, you will see that both have the same linear growth; as n -> infinity, they differ only by a constant factor, which is exactly what big O ignores.
In Big O notation, evaluating f(n) = 2n + 3 at n = 1 means nothing; you need to look at the trend, not discrete values.
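A quick way to see that trend (a minimal sketch; only the tendency matters, not the individual values) is to print the ratio (2n + 3) / n for growing n and watch it settle near the constant 2:

for n in (1, 10, 100, 1000, 10_000, 100_000):
    print(n, (2 * n + 3) / n)   # approaches 2 as n grows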
As a developer, you will consider big-O as a first indication for deciding which algorithm to use. If you have an algorithm which is, say, O(n^2), you will try to find out whether there is another one which is, say, O(n). If the problem is inherently O(n^2), then the big-O notation will not provide further help and you will need to use other criteria for your decision. However, if the problem is not inherently O(n^2), but O(n), you should discard any algorithm that happens to be O(n^2) and find an O(n) one.
So the big-O notation will help you to better classify the problem and then try to solve it with an algorithm whose complexity has the same big-O. If you are lucky enough to find two or more algorithms with this complexity, then you will need to weigh them using different criteria.

Issue while understanding Big Oh notations?

According to the Coursera courses on Algorithms and Introduction to Algorithms, a function G(n), where n is the input size, is said to be a big-O bound of F(n) when there exist constants n0 and C such that this inequality holds:
F(n) <= C*G(n) (for all n > n0)
Now ,
This mathematical definition is very clear to me.
But as it was taught to me by my teacher today, I am confused!
He said that "Big-Oh notation is an upper bound on a function and it is like the LCM of two numbers, i.e. unique and greater than the function".
I don't think this statement is quite correct. Is Big-Oh notation really unique?
Moreover,
thinking about Big-Oh notation, I also confused myself about why we approximate the Big-Oh to the highest-degree term. (We can easily prove the mathematical inequality with a nice choice of constants.) But what is the real use of it?
I mean, what does it signify?
We can even take F(n) as a Big-Oh bound of F(n), with the constant 1!
I think it shows that the running time depends only on the highest-degree term! Please clear my doubts, as I might have understood it wrongly from my book, or my teacher may have made an error.
Is Big-Oh notation really unique?
Yes and no. By the pure formula, Big-O is of course not unique. However, to be of use for its purpose, one actually tries to find not just some upper bound, but the lowest upper bound. And this makes a meaningful "Big-O" unique.
We can even take F(n) as a Big-Oh bound of F(n), with the constant 1!
Yes we probably can do that. However, the Big-O is used to relate classes of functions/algorithms to each other. Saying that F(n) relates to X(n) like F(n) relates to X(n) is what you get by using G(n) = F(n). Not much value in that.
That's why we try to find the unique lowest G to satisfy the equation. G(n) is usually a rather trivial function, like G(n) = n, G(n) = n², or G(n) = n*log(n), and this allows us to compare algorithms more easily because we can easily see that, e.g., G(n) = n is less than G(n) = n² for all n >= something.
Interestingly, most algorithms' complexity converges to one of the simple G(n) for large n. You could also say that, by looking at large n's, we try to separate out the "important" from the not-so-important parts of F(n); then we just omit the minor terms in F(n) and get a simplified function G(n).
In practical terms, we also want to abstract away from technical details. If I have, for instance, F(n) = 4*n and E(n) = 2*n, I can use twice as many CPUs for the F algorithm and be just as good as the E one, independent of the size of the input. Maybe one machine has a dedicated instruction for square root, so that SQRT(x) is a single step, while another machine needs many more instructions to get the result. We want to abstract away from that.
This implies one more point of view too: If I have a problem to solve, e.g. "calculate x(y)", I could present the solution as "result := x(y)", O(1). But that's not considered an algorithm. The specification of the algorithm must include a relevant level of detail to be a) meaningful and b) accessible to Big-O.

Big-oh vs big-theta [duplicate]

Possible Duplicate:
What is the difference between Θ(n) and O(n)?
It seems to me like when people talk about algorithm complexity informally, they talk about big-oh. But in formal situations, I often see big-theta with the occasional big-oh thrown in.
I know mathematically what the difference is between the two, but in English, in what situation would using big-oh when you mean big-theta be incorrect, or vice versa (an example algorithm would be appreciated)?
Bonus: why do people seemingly always use big-oh when talking informally?
Big-O is an upper bound.
Big-Theta is a tight bound, i.e. upper and lower bound.
When people only worry about what's the worst that can happen, big-O is sufficient; i.e. it says that "it can't get much worse than this". The tighter the bound the better, of course, but a tight bound isn't always easy to compute.
See also
Wikipedia/Big O Notation
Related questions
What is the difference between Θ(n) and O(n)?
The following quote from Wikipedia also sheds some light:
Informally, especially in computer science, the Big O notation often is
permitted to be somewhat abused to describe an asymptotic tight bound
where using Big Theta notation might be more factually appropriate in a
given context.
For example, when considering a function T(n) = 73n^3 + 22n^2 + 58, all of the following are generally acceptable, but tightness of bound (i.e., bullets 2 and 3 below) are usually strongly preferred over laxness of bound (i.e., bullet 1 below).
T(n) = O(n^100), which is identical to T(n) ∈ O(n^100)
T(n) = O(n^3), which is identical to T(n) ∈ O(n^3)
T(n) = Θ(n^3), which is identical to T(n) ∈ Θ(n^3)
The equivalent English statements are respectively:
T(n) grows asymptotically no faster than n^100
T(n) grows asymptotically no faster than n^3
T(n) grows asymptotically as fast as n^3.
So while all three statements are true, progressively more information is contained in
each. In some fields, however, the Big O notation (bullets number 2 in the lists above)
would be used more commonly than the Big Theta notation (bullets number 3 in the
lists above) because functions that grow more slowly are more desirable.
I'm a mathematician and I have seen and needed big-O O(n), big-Theta Θ(n), and big-Omega Ω(n) notation time and again, and not just for complexity of algorithms. As people said, big-Theta is a two-sided bound. Strictly speaking, you should use it when you want to explain that that is how well an algorithm can do, and that either that algorithm can't do better or that no algorithm can do better. For instance, if you say "Sorting requires Θ(n(log n)) comparisons for worst-case input", then you're explaining that there is a sorting algorithm that uses O(n(log n)) comparisons for any input; and that for every sorting algorithm, there is an input that forces it to make Ω(n(log n)) comparisons.
Now, one narrow reason that people use O instead of Ω is to drop disclaimers about worst or average cases. If you say "sorting requires O(n(log n)) comparisons", then the statement still holds true for favorable input. Another narrow reason is that even if one algorithm to do X takes time Θ(f(n)), another algorithm might do better, so you can only say that the complexity of X itself is O(f(n)).
However, there is a broader reason that people informally use O. At a human level, it's a pain to always make two-sided statements when the converse side is "obvious" from context. Since I'm a mathematician, I would ideally always be careful to say "I will take an umbrella if and only if it rains" or "I can juggle 4 balls but not 5", instead of "I will take an umbrella if it rains" or "I can juggle 4 balls". But the other halves of such statements are often obviously intended or obviously not intended. It's just human nature to be sloppy about the obvious. It's confusing to split hairs.
Unfortunately, in a rigorous area such as math or theory of algorithms, it's also confusing not to split hairs. People will inevitably say O when they should have said Ω or Θ. Skipping details because they're "obvious" always leads to misunderstandings. There is no solution for that.
Because my keyboard has an O key.
It does not have a Θ or an Ω key.
I suspect most people are similarly lazy and use O when they mean Θ because it's easier to type.
One reason why big O gets used so much is kind of because it gets used so much. A lot of people see the notation and think they know what it means, then use it (wrongly) themselves. This happens a lot with programmers whose formal education only went so far - I was once guilty myself.
Another is because it's easier to type a big O on most non-Greek keyboards than a big theta.
But I think a lot of it is because of a kind of paranoia. I worked in defence-related programming for a bit (and knew very little about algorithm analysis at the time). In that scenario, the worst-case performance is always what people are interested in, because that worst case might just happen at the wrong time. It doesn't matter if the actual probability of that happening is, say, far less than the probability of all members of a ship's crew suffering a sudden fluke heart attack at the same moment - it could still happen.
Though of course a lot of algorithms have their worst case in very common circumstances - the classic example being inserting in-order into a binary tree to get what's effectively a singly-linked list. A "real" assessment of average performance needs to take into account the relative frequency of different kinds of input.
Bonus: why do people seemingly always use big-oh when talking informally?
Because in big-oh, this loop:
for i in range(1, n + 1):
    pass  # something in O(1) that doesn't change n or i and isn't a jump
is O(n), O(n^2), O(n^3), O(n^1423424). big-oh is just an upper bound, which makes it easier to calculate because you don't have to find a tight bound.
The above loop is only big-theta(n) however.
What's the complexity of the sieve of Eratosthenes? If you said O(n log n) you wouldn't be wrong, but it wouldn't be the best answer either. If you said big-theta(n log n), you would be wrong.
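For reference, a minimal sieve of Eratosthenes sketch (standard textbook version; its running time is Θ(n log log n), which is why O(n log n) is a valid upper bound but not a tight one):

def sieve(n):
    # Return all primes <= n using the sieve of Eratosthenes.
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]

print(sieve(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]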
Because there are algorithms whose best-case is quick, and thus it's technically a big O, not a big Theta.
Big O is an upper bound, big Theta is an equivalence relation.
There are a lot of good answers here, but I noticed something was missing. Most answers seem to imply that the reason people use Big O over Big Theta is a difficulty issue, and in some cases this may be true. Often a proof that leads to a Big Theta result is far more involved than one that results in Big O. This usually holds true, but I do not believe it is the main reason for using one analysis over the other.
When talking about complexity we can say many things. Big O time complexity just tells us what an algorithm is guaranteed to run within, an upper bound. Big Omega is discussed far less often and tells us the minimum time an algorithm is guaranteed to take, a lower bound. Big Theta tells us that both of these bounds are in fact the same for a given analysis. This tells us that the application has a very strict run time that can only deviate by a value asymptotically less than our complexity. Many algorithms simply do not have upper and lower bounds that happen to be asymptotically equivalent.
So, as to your question, using Big O in place of Big Theta would technically always be valid, while using Big Theta in place of Big O would only be valid when Big O and Big Omega happen to be equal. For instance, insertion sort has a time complexity of Big O at n^2, but its best-case scenario puts its Big Omega at n. In this case it would not be correct to say that its time complexity is Big Theta of n or of n^2, as they are two different bounds and should be treated as such.
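A small sketch (counting comparisons instead of measuring time) shows this spread for insertion sort: roughly n - 1 comparisons on already-sorted input versus roughly n^2 / 2 on reverse-sorted input:

def insertion_sort_comparisons(items):
    # Sort a copy of items and return the number of key comparisons made.
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return comparisons

n = 1000
print("sorted input:  ", insertion_sort_comparisons(range(n)))        # about n - 1
print("reversed input:", insertion_sort_comparisons(range(n, 0, -1))) # about n^2 / 2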
I have seen Big Theta, and I'm pretty sure I was taught the difference in school. I had to look it up though. This is what Wikipedia says:
Big O is the most commonly used asymptotic notation for comparing functions, although in many cases Big O may be replaced with Big Theta Θ for asymptotically tighter bounds.
Source: Big O Notation#Related asymptotic notation
I don't know why people use Big-O when talking formally. Maybe it's because most people are more familiar with Big-O than Big-Theta? I had forgotten that Big-Theta even existed until you reminded me. Although now that my memory is refreshed, I may end up using it in conversation. :)

Resources