Asymptotic analysis, Upper bound - algorithm

I have some confusion regarding the Asymptotic Analysis of Algorithms.
I have been trying to understand this upper bound case, seen a couple of youtube videos. In one of them, there was an example of this equation
where we have to find the upper bound of the equation 2n+3. So, by looking at this, one can say that it is going o be O(n).
My first question :
In algorithmic complexity, we have learned to drop the constants and find the dominant term, so is this Asymptotic Analysis to prove that theory? or does it have other significance? otherwise, what is the point of this analysis when it is always going to be the biggest n in the equation, example- if it were n+n^2+3, then the upper bound would always be n^2 for some c and n0.
My second question :
as per rule the upper bound formula in Asymptotic Analysis must satisfy this condition f(n) = O(g(n)) IFF f(n) < c.g(n) where n>n0,c>0, n0>=1
i) n is the no of inputs, right? or does n represent the number of steps we perform? and does f(n) represents the algorithm?
ii) In the following video to prove upper bound of the equation 2n+3 could be n^2 the presenter considered c =1, and that is why to satisfy the equation n had to be >= 3 whereas one could have chosen c= 5 and n=1 as well, right? So then why were, in most cases in the video, the presenter was changing the value of n and not c to satisfy the conditions? is there a rule, or is it random? Can I change either c or n(n0) to satisfy the condition?
My Third Question:
In the same video, the presenter mentioned n0 (n not) is the number of steps. Is that correct? I thought n0 is the limit after which the graph becomes the upper bound (after n0, it satisfies the condition for all values of n); hence n0 also represents the input.
Would you please help me understand because people come up with different ideas in different explanations, and I want to understand them correctly?
Edit
The accepted answer clarified all of the questions except the first one. I have gone through many articles on the web, and here I am documenting my conclusion if anyone else has the same question. This will help them.
My first question was
In algorithmic complexity, we have learned to drop the constants and
find the dominant term, so is this Asymptotic Analysis to prove that
theory?
No, Asymptotic Analysis describes the algorithmic complexity, which is all about understanding or visualizing the Asymptotic behavior or the tail behavior of a function or a group of functions by plotting mathematical expression.
In computer science, we use it to evaluate (note: evaluate is not measuring) the performance of an algorithm in terms of input size.
for example, these two functions belong to the same group
mySet = set()
def addToMySet(n):
for i in range(n):
mySet.add(i*i)
mySet2 = set()
def addToMySet2(n):
for i in range(n):
for j in range(500):
mySet2.add(i*j)
Even though the execution time of the addToMySet2(n) is always > the execution time of addToMySet(n), the tail behavior of both of these functions would be the same with respect to the largest n, if one plot them in a graph the tendency of that graph for both of the functions would be linear thus they belong to the same group. Using Asymptotic Analysis, we get to see the behavior and group them.
A mistake that I made assuming upper bound represents the worst case. In reality, The upper bound of any algorithm is associated with all of the best, average, and worst cases. so the correct way of putting that would be
upper/lower bound in the best/average/worst case of an
algorithm
.
We can't relate the upper bound of an algorithm with the worst-case time complexity and the lower bound with the best-case complexity. However, an upper bound can be higher than the worst-case because upper bounds are usually asymptotic formulae that have been proven to hold.
I have seen this kind of question like find the worst-case time complexity of such and such algorithm, and the answer is either O(n) or O(n^2) or O(log-n), etc.
For example, if we consider the function addToMySet2(n), one would say the algorithmic time complexity of that function is O(n), which is technically wrong because there are three factors bound, bound type, (inclusive upper bound and strict upper bound) and case are involved determining the algorithmic time complexity.
When one denote O(n) it is derived from this Asymptotic Analysis f(n) = O(g(n)) IFF for any c>0, there is a n0>0 from which f(n) < c.g(n) (for any n>n0) so we are considering upper bound of best/average/worst case. In the above statement the case is missing.
I think We can consider, when not indicated, the big O notation generally describes an asymptotic upper bound on the worst-case time complexity. Otherwise, one can also use it to express asymptotic upper bounds on the average or best case time complexities

The whole point of asymptotic analysis is to compare algorithms performance scaling. For example, if I write two version of the same algorithm, one with O(n^2) time complexity and the other with O(n*log(n)) time complexity, I know for sure that the O(n*log(n)) one will be faster when n is "big". How big? it depends. You actually can't know unless you benchmark it. What you know is at some point, the O(n*log(n)) will always be better.
Now with your questions:
the "lower" n in n+n^2+3 is "dropped" because it is negligible when n scales up compared to the "dominant" one. That means that n+n^2+3 and n^2 behave the same asymptotically. It is important to note that even though 2 algorithms have the same time complexity, it does not mean they are as fast. For example, one could be always 100 times faster than the other and yet have the exact same complexity.
(i) n can be anything. It may be the size of the input (eg. an algorithm that sorts a list) but it may also be the input itself (eg. an algorithm that give the n-th prime number) or a number of iteration, etc
(ii) he could have taken any c, he chose c=1 as an example as he could have chosen c=1.618. Actually the correct formulation would be:
f(n) = O(g(n)) IFF for any c>0, there is a n0>0 from which f(n) < c.g(n) (for any n>n0)
the n0 from the formula is a pure mathematical construct. For c>0, it is the n value from which the function f is bounded by g. Since n can represent anything (size of a list, input value, etc), it is the same for n0

Related

Time complexity of an algorithm taking highest term

In the analysis of the time complexity of algorithms, why do you take only the highest growing term?
My thoughts are that time complexity is generally not used as an accurate measurement of the performance of an algorithm but rather used to identify which group an algorithm belongs to.
Consider if you had separate loops but each have nested loops of two levels, both iterating through n items such that the total complexity of the algorithm is 2n^2
Why is it taken that this algorithm is complexity of O(n^2)?
rather than O(2n^2)
My other thoughts are, n^2 defines the shape of the computation vs input length graph or that we consider parallel computation when calculating the complexity such that O(2n^2) = O(n^2)
Its true that Ax^2 is just a scaling of x^2 the shape remains quadratic
My other question is consider the same two separate loops, now the first iterating through n items and the second v items such that the total complexity of the algorithm is n^2 + v^2 if using sequential computation. Will the complexity be O(n^2+v^2)?
Thank you
So first let me observe that two loops can be faster than one if the one does more than twice as much work in the loop body.
To fully answer your question, big-O describes a growth regime, but it doesn't describe what we're observing growth in. The usual assumption is something like processor cycles or machine instructions, you can also count floating-point operations only (common in scientific computing), memory reads (probe complexity), cache misses, you name it.
If you want to count machine operations, no one's stopping you. But if you don't use big-O notation, then you have to be specific about which machine, and which implementation. Don Knuth famously invented an assembly language for The Art of Computer Programming so that he could do exactly that.
That's a lot of work though, so algorithm researchers typically follow the example of Angluin--Valiant instead, who introduced the unit-cost RAM. Their hypothesis was something like this: for any pair of the computers of the day, you could write a program to simulate one on the other where each source instruction used a constant number of target instructions. Therefore, by using big-O to erase the leading constant, you could make a statement about a large class of machines.
It's still useful to distinguish between broad classes of algorithms. In an old but particularly memorable demonstration, Jon Bentley showed that a linear-time algorithm on a TRS-80 (low-end microcomputer) could beat out a cubic algorithm on a Cray 1 (the fastest supercomputer of its day).
To answer the other question: yes, O(n² + v²) is correct if we don't know whether n or v dominates.
You're using big-O notation to express time complexity which gives an asymptotic upper bound .
For a function f(n) we define O(g(n)) as the set of functions
O(g(n)) = { f(n): there exist positive constants c and n0
such that,
0 ≤ f(n) ≤ cg(n) for all n ≥ n0}
Now coming to question,
In the analysis of the time complexity of algorithms, why do you take only the highest growing term?
Consider an algorithm with, f(n) = an2 +bn +c
It's time complexity is given by O(n2) or f(n) = O(n2)
because there will always be constants c' and n0 such that c'g(n) ≥ f(n) i.e. c'n2 ≥ an2 +bn +c , for n ≥ n0
We consider only higher growing term an2 and neglect bn+c as in the long term an2 will be much bigger than bn +c (eg.for n=1020, n2 is 1020 times larger than n)
Why is it taken that this algorithm is complexity of O(n^2)? rather than O(2n^2)
When a constant is multiplied it scales the function by constant amount, so even though 2n2 > n2 we write time complexity as O(n2) as there exists some constants c', n0 such that c'n2 ≥ 2n2
Will the complexity be O(n^2+v^2)
Yes, the time complexity will be O(n2 + v2) but if one variable dominates it will be O(n2) or O(v2) depending upon which dominates.
My thoughts are that time complexity is generally not used as an accurate
measurement of the performance of an algorithm but rather used to identify
which group an algorithm belongs to.
Time complexity is used to estimate performance in terms of growth of a function. The exact performance of algorithm is hard to determine precisely due to which we rely on asymptotic bounds to get rough idea about performance for large inputs
You can take the case of Merge Sort (O(NlgN)) and Insertion Sort(O(N2)). Clearly merge sort is better than insertion sort in terms of performance as NlgN < N2 but when the input size is small (eg. N=10) Insertion sort outperforms merge sort mainly because the constant operations in Merge Sort increases it's execution time.

Asymptotic Notation for algorithms

I finally thought I understood what it means when a function f(n) is sandwiched between a lower and upper bound which are the same class and so can be described as theta(n).
As an example:
f(n) = 2n + 3
1n <= 2n + 3 <= 5n for all values of n >= 1
In the example above it made perfect sense, the order of n is on both sides, so f(n) is sandwiched between 1 * g(n) and 5 * g(n).
It was also a lot clearer when I tried not to use the notations to think about best or worst case and instead as an upper, lower or average bound.
So now thinking I finally understood this and the maths around it I went back to visit this page: https://www.bigocheatsheet.com/ to look at the run times of various search functions and was suddenly confused again about how many of the algorithms there, for example bubble sort, do not have the same order on both sides (upper and lower bound) yet theta is used to describe them.
Bubble sort has Ω(n) and O(n^2) but the theta value is given as Θ(n^2). How is that it can have Θ(n^2) if the upper bound of the function is in the order of N^2 but the lower bound of the function is in the order of n?
Actually, the page you referred to is highly misleading - even if not completely wrong. If you analyze the complexity of an algorithm, you first have to specify the scenario: i.e. whether you are talking about worst-case (the default case), average case or best-case. For each of the three scenarios, you can then give a lower bound (Ω), upper bound (O) or a tight bound (Θ).
Take insertion sort as an example. While the page is, strictly speaking, correct in that the best case is Ω(n), it could just as well (and more precisely) have said that the best case is Θ(n). Similarly, the worst case is indeed O(n²) as stated on that page (as well as Ω(n²) or Ω(n) or O(n³)), but more precisely it's Θ(n²).
Using Ω to always denote the best case and O to always denote the worst-case is, unfortunately, an often made mistake. Takeaway message: the scenario (worst, average, best) and the type of the bound (upper, lower, tight) are two independent dimensions.

BigO Notation, understanding

I had seen in one of the videos (https://www.youtube.com/watch?v=A03oI0znAoc&t=470s) that, If suppose f(n)= 2n +3, then BigO is O(n).
Now my question is if I am a developer, and I was given O(n) as upperbound of f(n), then how I will understand, what exact value is the upper bound. Because in 2n +3, we remove 2 (as it is a constant) and 3 (because it is also a constant). So, if my function is f(n) where n = 1, I can't say g(n) is upperbound where n = 1.
1 cannot be upperbound for 1. I find hard understanding this.
I know it is a partial (and probably wrong answer)
From Wikipedia,
Big O notation characterizes functions according to their growth rates: different functions with the same growth rate may be represented using the same O notation.
In your example,
f(n) = 2n+3 has the same growth rate as f(n) = n
If you plot the functions, you will see that both functions have the same linear growth; and as n -> infinity, the difference between the 2 gets minimal.
In Big O notation, f(n) = 2n+3 when n=1 means nothing; you need to look at the trend, not discreet values.
As a developer, you will consider big-O as a first indication for deciding which algorithm to use. If you have an algorithm which is say, O(n^2), you will try to understand whether there is another one which is, say, O(n). If the problem is inherently O(n^2), then the big-O notation will not provide further help and you will need to use other criterion for your decision. However, if the problem is not inherently O(n^2), but O(n), you should discard any algorithm that happen to be O(n^2) and find an O(n) one.
So, the big-O notation will help you to better classify the problem and then try to solve it with an algorithm whose complexity has the same big-O. If you are lucky enough as to find 2 or more algorithms with this complexity, then you will need to ponder them using a different criterion.

Is Big Omega of any linear algorithm n or can it also be 1?

If we have a linear algorithm (for example, find if a number exists in a given array of numbers), does this mean that Omega(n) = n? The number of steps would be n. And the tightest bound I can make is c*n where c = 1.
But as far as I know, Omega also describes the best case scenario which in this case would be 1 because the searched element can be on the first position of the array and that accounts for only one step. So, by this logic, Omega(n) = 1.
Which variant is the correct one and why? Thanks.
There is a large confusion about what is described using the asymptotic notation.
The running time of an algorithm is in general a function of the number of elements, but also of the particular values of the inputs. Hence T(x) where x is an input of n elements is not a function of n alone.
Now one can study the worst-case and best-case: to determine these, you choose a configuration of the input corresponding to the slowest or fastest execution time and these are functions of n only. And additional option is the expected (or average) running-time, which corresponds to a given statistical distribution of the input. This is also a function of n alone.
Now, Tworst(n), Tbest(n), Texpected(n) can have upper bounds, denoted by O(f(n)), and lower bounds, denoted by Ω(f(n)). When these bounds coincide, the notation Θ(f(n)) is used.
In the case of a linear search, the best case is Θ(1) and the worst and expected cases are Θ(n). Hence the running time for arbitrary input is Ω(1)
and O(n).
Addendum:
The Graal of algorithmics is the discovery of efficient algorithms, i.e. such that the effective running time is of the same order as the best behavior that can be achieved independently of any algorithm.
For instance, it is obvious that the worst-case of any search algorithm is Ω(n) because whatever the search order, you may have to perform n comparisons (for instance if the key is not there). As the linear search is worst-case O(n), it is worst-case efficient. It is also best-case efficient, but this is not so interesting.
If you have an linear time algorithm that means that the time complexity has a linear upper bound, namely O(n). This does not mean that it also has a linear lower bound. In your example, finding out if a element exits, the lower bound is Ω(1). Here is Ω(n) just wrong.
Doing a linear search on an array, to find the minimal element takes exactly n steps in all cases. So here is the lower bound Ω(n). But Ω(1) would also be right, since a constant number of steps is also a lower bound for n steps, but it is no tight lower bound.

Meaning of Big O notation

Our teacher gave us the following definition of Big O notation:
O(f(n)): A function g(n) is in O(f(n)) (“big O of f(n)”) if there exist
constants c > 0 and N such that |g(n)| ≤ c |f(n)| for all n > N.
I'm trying to tease apart the various components of this definition. First of all, I'm confused by what it means for g(n) to be in O(f(n)). What does in mean?
Next, I'm confused by the overall second portion of the statement. Why does saying that the absolute value of g(n) less than or equal f(n) for all n > N mean anything about Big O Notation?
My general intuition for what Big O Notation means is that it is a way to describe the runtime of an algorithm. For example, if bubble sort runs in O(n^2) in the worst case, this means that it takes the time of n^2 operations (in this case comparisons) to complete the algorithm. I don't see how this intuition follows from the above definition.
First of all, I'm confused by what it means for g(n) to be in O(f(n)). What does in mean?
In this formulation, O(f(n)) is a set of functions. Thus O(N) is the set of all functions that are (in simple terms) proportional to N as N tends to infinity.
The word "in" means ... "is a member of the set".
Why does saying that the absolute value of g(n) less than or equal f(n) for all n > N mean anything about Big O Notation?
It is the definition. And besides you have neglected the c term in your synopsis, and that is an important part of the definition.
My general intuition for what Big O Notation means is that it is a way to describe the runtime of an algorithm. For example, if bubble sort runs in O(n^2) in the worst case, this means that it takes the time of n^2 operations (in this case comparisons) to complete the algorithm. I don't see how this intuition follows from the above definition.
Your intuition is incorrect in two respects.
Firstly, the real definition of O(N^2) is NOT that it takes N^2 operations. it is that it takes proportional to N^2 operations. That's where the c comes into it.
Secondly, it is only proportional to N^2 for large enough values of N. Big O notation is not about what happens for small N. It is about what happens when the problem size scales up.
Also, as a a comment notes "proportional" is not quite the right phraseology here. It might be more correct to say "tends towards proportional" ... but in reality there isn't a simple english description of what is going on here. The real definition is the mathematical one.
If you now reread the definition in the light of that, you should see that it fits just nicely.
(Note that the definitions of Big O, and related measures of complexity can also be expressed in calculus terminology; i.e. using "limits". However, generally speaking the things we are talking about are quantized; i.e. an integer number instructions, an integer number bytes of storage, etc. Calculus is really about functions involving real numbers. Hence, you could argue that the formulation above is preferable. OTOH, a real mathematician would probably see bus-sized holes in this argumentation.)
O(g(n)) looks like a function, but it is actually a set of functions. If a function f is in O(g(n)), it means that g is an asymptotic upper bound on f to within a constant factor. O(g(n)) contains all functions that are bounded from above by g(n).
More specifically, there exists a constant c and n0 such that f(n) < c * g(n) for all n > n0. This means that c * g(n) will always overtake f(n) beyond some value of n. g is asymptotically larger than f; it scales faster.
This is used in the analysis of algorithms as follows. The running time of an algorithm is impossible to specify practically. It would obviously depend on the machine on which it runs. We need a way of talking about efficiency that is unconcerned with matters of hardware. One might naively suggest counting the steps executed by the algorithm and using that as the measure of running time, but this would depend on the granularity with which the algorithm is specified and so is no good either. Instead, we concern ourselves only with how quickly the running time (this hypothetical thing T(n)) scales with the size of the input n.
Thus, we can report the running time by saying something like:
My algorithm (algo1) has a running time T(n) that is in the set O(n^2). I.e. it's bounded from above by some constant multiple of n^2.
Some alternative algorithm (algo2) might have a time complexity of O(n), which we call linear. This may or may not be better for some particular input size or on some hardware, but there is one thing we can say for certain: as n tends to infinity, algo2 will out-perform algo1.
Practically then, one should favour algorithms with better time complexities, as they will tend to run faster.
This asymptotic notation may be applied to memory usage also.

Resources