When analyzing the time complexity of an algorithm why do we drop the constant of the term with the largest degree - algorithm

Suppose I have the following : T(n) = 5n^2 +2n
The asymtotic tight bound of this is theta n^2. I want to understand the reason behind dropping the 5. I understand why we ignore the lower order terms.

Consult the definition of big-O.
Keeping things simple[*], let's define that a function g is O(f) if there exist constants C and M such that for n > M, 0 <= g(n) < Cf(n).
The presence of a positive constant multiplier in f doesn't affect this, just choose C appropriately. Your example T is O(n^2) by choosing a value of C greater than 5, and a value of M big enough that the +2n is irrelevant. For for example, for n > 2 it's a fact that 5n^2 + 2n < 6n^2 (because n^2 > 2n), so with C= 6 and M = 2 we see that T(n) is O(n^2).
So it's true that T(n) is O(n^2), and also true that it's O(5n^2), and O(5n^2 + 2n). The most interesting of those facts is that it's O(n^2), since it's the simplest expression and the other two are logically equivalent. If we want to compare the complexities of different functions, then we want to use simple expressions.
For big-Theta just note that we can play the same trick when f and g are the other way around. The relation "g is Theta(f)" is an equivalence relation, so what are we going to choose as the representative member of the equivalence class of T? The simplest one.
[*] Keeping things less simple, we cope with negative numbers by using a limsup rather than a plain limit. My definition above is actually sufficient but not necessary.

wEverything comes back to the concept of "Order of Magnitute". Given something like
5n^2 +2n
You think that the 5 is significant, however when you break it down and think about the numbers, the constant really doesn't matter (graph it and you will see why). For example .. say n is 50.
5 * 50^2 + 2 * 50 --> 5 * 2500 + 2 * 50 --> 12,600
As you mentioned, the 2*n is insignificant when compared to n^2. This same concept applies when viewing the constant as well... you might think that 2500 vs 125,000 is a big difference; however consider if the algorithm was n^3... you are now looking at 12,600 vs 625,100
So the factor that will have the most significant difference on the cost of an algorithm is just n^2.

The constant is dropped because it does not affect which complexity class the function belongs to.
If you have two functions f(n) = c1 * n^2 and g(n) = c2 * n^3, where c1 and c2 are constants, it doesn't matter how large c1 is and how small c2 is, g(n) will ALWAYS overtake and outgrow f(n) at some value of n.

Related

Why doesn't the additive constant in Big-O notation matter?

In definition of Big-O notation we care only about C coefficient:
f(n) ≤ Cg(n) for all n ≥ k
Why don't we care about A as well:
f(n) ≤ Cg(n) + A for all n ≥ k
There are really two cases to consider here. For starters, imagine that your function g(n) has the property that g(n) ≥ 1 for all "sufficiently large" choices of n. In that case, if you know that
f(n) ≥ cg(n) + A,
then you also know that
f(n) ≥ cg(n) + Ag(n),
so
f(n) ≥ (c + A)g(n).
In other words, if your function g is always at least one, then bounding f(n) by something of the form cg(n) + A is equivalent to bounding it with something of the form c'g(n) for some new constant c'. In that sense, adding some extra flexibility into the definition of big-O notation, at least in this case, wouldn't make a difference.
In the context of the analysis of algorithms, pretty much every function g(n) you might bound something with will be at least one, and so we can "munch up" that extra additive term by choosing a larger multiple of g.
However, big-O notation is also used in many cases to bound functions that decrease as n increases. For example, we might say that the probability that some algorithm gives back the right answer is O(1 / n), where the function 1/n drops to 0 as a function of n. In this case, we use big-O notation to talk about how fast the function drops off. If the success probability is O(1 / n2), for example, that's a better guarantee than the earlier O(1 / n) success probability, assuming n gets sufficiently large. In that case, allowing for additive terms in the definition of big-O notation would actually break things. For example, intuitively, the function 1 / n2 drops to 0 faster than the function 1 / n, and using the formal definition of big-O notation you can see this because 1 / n2 ≤ 1 / n for all n ≥ 1. However, with your modified definition of big-O notation, we could also say that 1 / n = O(1 / n2), since
1 / n ≤ 1 / n2 + 1 for all n ≥ 1,
which is true only because the additive 1 term bounds the 1/n term, not the 1/n2 we might have been initially interested in.
So the long answer to your question is "the definition you proposed above is equivalent to the regular definition of big-O if we only restrict ourselves to the case where g(n) doesn't drop to zero as a function of n, and in the case where g(n) does drop the zero as a function of n your new definition isn't particularly useful."
Big-O notation is about what happens as the data gets larger. In other words, it is a limit as n --> infinity.
As n gets large, A remains the same. So it gets smaller and smaller in comparison. On the other hand, g(n) (presumably) gets bigger and bigger, so its contribution increases more and more.
A is constant in this case, thus it will not affect much the complexity when the size of the problem grows very much.
When you have a cost of 1 million, you do not care if you add up a constant factor of 100 for example. You care for how this 1 million grows (resulting from Cg(n)); whether it becomes 2 millions for example if the size of the problem grows a bit. However, your constant will still be 100, so it doesn't really affect the overall complexity.

Complexity - determining the order of growth

I understand how to calculate a function's complexity for the most part. The same goes for determining the order of growth for a mathematical function. [I probably don't understand it as much as I think I do, which is why I'm probably asking this.] For instance:
an^3 + bn^2 + cn + d could be written as O(n^3) in big-o notation, since for large enough nthe values of the term bn^2 + cn + d are insignificant in comparison to an^3 (the constant coefficients a, b, c and d are left out as well, as their contribution to the value become insignificant, too).
What I don't understand is, how does this work when the leading term is involved in some sort of division? For instance:
a/n^3 + bn^2 or n^3/a + bn^2
Let n=100, a=1000 and b=10 for the former formula, then we have
n^3/a = 100^3/1000 = 1000 and bn^2 = 10*100^2 = 100,000
or even more dramatic for the latter - in this case the leading term is not only growing slowly as above, but it's also shrinking, isn't it?:
a/n^3 = 1000/100^3 = 0.001 and bn^2 = 100,000 as above.
In both cases the second term contributes much more, so isn't it n^2that actually determines the order of growth?
It gets even more complicated (for me, at least) when the leading term is followed by a subtraction (a/n^3 - bn^2) or when the second term is also a division (n^3/a + n^2/b) or when both are divisions but in mixed order (a/n^3 + n^2/b), etc.
The list seems endless, so my general question is, how to understand and handle formulas that involve division (and subtraction) in order to determine the order of growth for a given function?
A division is just a multiplication by the multiplicative inverse, so n^3/a == n^3 * a^-1, and you can handle it the same way as any other coefficient.
With regards to substraction a*n^3 - b*n^2 <= a*n^3, so it is also in O(n^3). Also, a*n^3 - b*n^2 >= a/2 * n^3 for large enough values of n, and it is also in Omega(n^3). A more detailed explanation about substraction can be found in: Algorithm complexity when faced with substraction in value
big O notation is generally used for increasing (don't have to be monotonically) functions, and a decreasing function such as a/n is not a good fit for it, though O(1/n) seems to be still perfectly defined, AFAIK, and it is a subset of O(1) (unless you take into account only discrete functions). However, this has very little value for algorithm's analysis, as a complexity of an algorithm cannot really shrink..
There's a very simple rule for the type of questions you posted.
Suppose you're trying to find the order of growth of f(n), and you find some simple function g(n) such that
lim {n -> inf} f(n) / g(n) = k
where k is a positive finite constant. Then
f(n) = Theta(g(n))
(It's easy to see this from the calculus definitions.)
Now let's see how this applies to your examples:
lim {n -> inf} (a/n^3 + bn^2) / n^2 = b
so it's Theta(n^2).
lim {n -> inf} (a n^3 - bn^2) / n^3 = a
so it's Theta(n^2).
(of course, assuming a and b are positive.)

Are 2^n and n*2^n in the same time complexity?

Resources I've found on time complexity are unclear about when it is okay to ignore terms in a time complexity equation, specifically with non-polynomial examples.
It's clear to me that given something of the form n2 + n + 1, the last two terms are insignificant.
Specifically, given two categorizations, 2n, and n*(2n), is the second in the same order as the first? Does the additional n multiplication there matter? Usually resources just say xn is in an exponential and grows much faster... then move on.
I can understand why it wouldn't since 2n will greatly outpace n, but because they're not being added together, it would matter greatly when comparing the two equations, in fact the difference between them will always be a factor of n, which seems important to say the least.
You will have to go to the formal definition of the big O (O) in order to answer this question.
The definition is that f(x) belongs to O(g(x)) if and only if the limit limsupx → ∞ (f(x)/g(x)) exists i.e. is not infinity. In short this means that there exists a constant M, such that value of f(x)/g(x) is never greater than M.
In the case of your question let f(n) = n ⋅ 2n and let g(n) = 2n. Then f(n)/g(n) is n which will still grow infinitely. Therefore f(n) does not belong to O(g(n)).
A quick way to see that n⋅2ⁿ is bigger is to make a change of variable. Let m = 2ⁿ. Then n⋅2ⁿ = ( log₂m )⋅m (taking the base-2 logarithm on both sides of m = 2ⁿ gives n = log₂m ), and you can easily show that m log₂m grows faster than m.
I agree that n⋅2ⁿ is not in O(2ⁿ), but I thought it should be more explicit since the limit superior usage doesn't always hold.
By the formal definition of Big-O: f(n) is in O(g(n)) if there exist constants c > 0 and n₀ ≥ 0 such that for all n ≥ n₀ we have f(n) ≤ c⋅g(n). It can easily be shown that no such constants exist for f(n) = n⋅2ⁿ and g(n) = 2ⁿ. However, it can be shown that g(n) is in O(f(n)).
In other words, n⋅2ⁿ is lower bounded by 2ⁿ. This is intuitive. Although they are both exponential and thus are equally unlikely to be used in most practical circumstances, we cannot say they are of the same order because 2ⁿ necessarily grows slower than n⋅2ⁿ.
I do not argue with other answers that say that n⋅2ⁿ grows faster than 2ⁿ. But n⋅2ⁿ grows is still only exponential.
When we talk about algorithms, we often say that time complexity grows is exponential.
So, we consider to be 2ⁿ, 3ⁿ, eⁿ, 2.000001ⁿ, or our n⋅2ⁿ to be same group of complexity with exponential grows.
To give it a bit mathematical sense, we consider a function f(x) to grow (not faster than) exponentially if exists such constant c > 1, that f(x) = O(cx).
For n⋅2ⁿ the constant c can be any number greater than 2, let's take 3. Then:
n⋅2ⁿ / 3ⁿ = n ⋅ (2/3)ⁿ and this is less than 1 for any n.
So 2ⁿ grows slower than n⋅2ⁿ, the last in turn grows slower than 2.000001ⁿ. But all three of them grow exponentially.
You asked "is the second in the same order as the first? Does the additional n multiplication there matter?" These are two different questions with two different answers.
n 2^n grows asymptotically faster than 2^n. That's that question answered.
But you could ask "if algorithm A takes 2^n nanoseconds, and algorithm B takes n 2^n nanoseconds, what is the biggest n where I can find a solution in a second / minute / hour / day / month / year? And the answers are n = 29/35/41/46/51/54 vs. 25/30/36/40/45/49. Not much difference in practice.
The size of the biggest problem that can be solved in time T is O (ln T) in both cases.
Very Simple answer is 'NO'
see 2^n and n.2^n
as seen n.2^n > 2^n for any n>0
or you can even do it by applying log on both sides then you get
n.log(2) < n.log(2) + log(n)
hence by both type of analysis that is by
substituting a number
using log
we see that n.2^n is greater than 2^n as visibly seen
so if you get a equation like
O ( 2^n + n.2^n ) which can be replaced as O ( n.2^n)

How does the formal definition of O(n) tie to simple explanation in data structures class

The informal idea of Big-O is described as "it's the highest order of growth of a function" ie f(n) = 3n^2 + 5n + 50 is just O(n^2).
I do understand that Big-O is just a way of saying "guaranteed to not be worse than this period". Formally, it appears the definition is f(n) -> O(g(n)) iff f(n) <= c * g(n) where c is positive
First some mathy stuff.. if f(n) = 5n^2, g(n)=n I should be able to show 5n^2 isn't O(g(n)) by doing
5n^2 <= cn
5n <= c
If the idea is that is that c isn't a constant(I have no idea if that's a requirement), and that is proof f(n) isnt in O(g(n)), what about if g(n) were n^3 (of which it surely should be contained)?
5n^2 <= cn^3
5/n <= c
I have a misunderstanding of how the math works out for all of this I assume, so I ask:
How does all this fancy stuff work
How does it connect to the simple definition given in my data structures class?
Thanks for any help
n is a positive integer, which means that 1<=n and therefore 5/n<=5/1=5, so you can pick c=5.
A more complete definition also allows you to pick n0 and a, both constants, and only prove that f(n)<=a+c*g(n) for all n0<n
c is a constant (i.e. independent of n)
In your first example (it's proof by contradiction):
i.e. assume
5n^2 <= cn
5n <= c
But for any fixed constant c, we can find a value of n that makes it untrue.
For example pick c = 1000000, then a value of n = 200001 would be a contradiction.
In your second example, we know that f(n) is O(n^2), therefore it is also O(n^3) and above. If you are bounded by k(n^2), you are also bounded by j(n^3)
The informal idea of Big-O is described as "it's the highest order of growth of a function" ie f(n) = 3n^2 + 5n + 50 is just O(n^2).
I wouldn't say that this is the idea behind Big-O. Informally Big-O is some rough estimation of what a given function cannot exceed. And it's usage is mostly approximating how something will grow for big numbers.
For example, if we take a 6 digit number, we can definitely say that it's less than million without looking for its digits. There are a lot of cases when this is enough and we don't need to analyse all digits.
For analysing function growth two factors play their role:
we only interested in function behaviour for very large numbers
if f is bigger than g, but we can fix it with multiplying g by some big constant, that's means f's advantage is not because of growth
This leads us to two parts of the definition: (1) some constant and (2) for big enough n
And for polynomials indeed, the higher order component defines grow speed.

Are all algorithms with a constant base to the n (e.g.: k^n) of the same time complexity?

In this example: http://www.wolframalpha.com/input/?i=2%5E%281000000000%2F2%29+%3C+%283%2F2%29%5E1000000000
I noticed that those two equations are pretty similar no matter how high you go in n. Do all algorithms with a constant to the n fall in the same time complexity category? Such as 2^n, 3^n, 4^n, etc.
They are in the same category, This does not mean their complexity is the same. They are exponential running time algorithms. Obviously 2^n < 4^n
We can see 4^n/2^n = 2^2n/2^n = 2^n
This means 4^n algorithm exponential slower(2^n times) than 2^n
The same thing happens with 3^n which is 1.5^n.
But this does not mean 2^n is something far less than 4^n, It is still exponential and will not be feasible when n>50.
Note this is happening due to n is not in the base. If they were in the base like this:
4n^k vs n^k then this 2 algorithms are asymptotically the same(as long as n is relatively small than actually data size). They would be different by linear time, just like O(n) vs c * O(n)
The time complexities O(an) and O(bn) are not the same if 1 < a < b. As a quick proof, we can use the formal definition of big-O notation to show that bn ≠ O(an).
This works by contradiction. Suppose that bn = O(an) and that 1 < a < b. Then there must be some c and n0 such that for any n ≥ n0, we have that bn ≤ c · an. This means that bn / an ≤ c for any n ≥ n0. Since b > a, it should start to become clear that this is impossible - as n grows larger, bn / an = (b / a)n will get larger and larger. In particular, if we pick any n ≥ n0 such that n > logb / a c, then we will have that
(b / a)n > (b / a)log(b/a) c = c
So, if we pick n = max{n0, c + 1}, then it's not true that bn ≤ c · an, contradicting our assumption that bn = O(an).
This means, in particular, that O(2n) ≠ O(1.5n) and that O(3n) ≠ O(2n). This is why when using big-O notation, it's still necessary to specify the base of any exponents that end up getting used.
One more thing to notice - although it looks like 21000000000/2 is approximately 1.41000000000/2, notice that these are totally different numbers. The first is of the form 10108.1ish and the second of the form 10108.2ish. That might not seem like a big difference, but it's absolutely colossal. Take, for example, 10101 and 10102. This first number is 1010, which is 10 billion and takes ten digits to write out. The second is 10100, one googol, which takes 100 digits to write out. There's a huge difference between them - the first of them is close to the world population, while the second is about the total number of atoms in the universe!
Hope this helps!

Resources