Big-O and Function Domination - big-o

I am currently working on some problems from my textbook, about Big-O notation, and how functions can dominate each other.
These are the functions that I am looking at from my book.
n²
n² + 1000n
n (if n is odd)
n³ (if n is even)
n (if n ≤ 100)
n³ (if n > 100)
I am trying to figure out which functions that #1 dominates. I know that both #1 and #2 simplify to n², so it does not dominate #2. However, the split functions (#3 and #4) are giving me problems. #1 dominates the function only on a certain condition, and under the other condition, #1 is being dominated by the other function. So does this mean that, since it is not always dominating it, that it doesn't technically count as dominating it at all? Does function #1 not dominate any of these functions, or does it dominate #3, for all odd numbers, and #4 for all numbers ≤ 100? The way I see it is, #1 does not dominate #2, only dominates #3 for odd numbers, and only dominates #4 for numbers ≤ 100. Am I on the right track?
Thanks for any help anyone can provide. I'm having a real tough time trying to reason this out to myself.

I am not sure what "dominates" means in your case. Lets say "f(n) dominates g(n)" translates to f(n) ∈ O(g(n)), where O(g(n)) is the worst case complexity.
So we should calculate the worst case complexity first:
n² is in Θ(n²)
n² + 1000n is also in Θ(n²)
n (if n is odd) n³ (if n is even) is in Θ(n³) (just picking the worst case which appears in 50% of all cases for random choices of n)
n (if n ≤ 100) n³ (if n > 100) is also in Θ(n³), since Big-O depends on asymptotics (large values of n).
Now we can compare the worst case complexities and see #1 dominates only #2.
Maybe you want to change the worst case complexity to an average case. But only for #3 there could be a change.
After calculating (n³ + n) / 2 we notice, that even the average case of #3 is in Θ(n³).
If you look at the best case you get the first change, but also only for #3. Here the best case is Θ(n), so here is #3 dominated by #1.
Notice that the best case of #4 is not Θ(n), since the complexity holds only for n → ∞, so we ignore all cases of n < c₀ where c₀ is a constant.

Related

Why is O(n) better than O( nlog(n) )?

I just came around this weird discovery, in normal maths, n*logn would be lesser than n, because log n is usually less than 1.
So why is O(nlog(n)) greater than O(n)? (ie why is nlogn considered to take more time than n)
Does Big-O follow a different system?
It turned out, I misunderstood Logn to be lesser than 1.
As I asked few of my seniors i got to know this today itself, that if the value of n is large, (which it usually is, when we are considering Big O ie worst case), logn can be greater than 1.
So yeah,
O(1) < O(logn) < O(n) < O(nlogn) holds true.
(I thought this to be a dumb question, and was about to delete it as well, but then realised, no question is dumb question and there might be others who get this confusion so I left it here.)
...because log n is always less than 1.
This is a faulty premise. In fact, logb n > 1 for all n > b. For example, log2 32 = 5.
Colloquially, you can think of log n as the number of digits in n. If n is an 8-digit number then log n ≈ 8. Logarithms are usually bigger than 1 for most values of n, because most numbers have multiple digits.
Plot both the graph( on desmos (https://www.desmos.com/calculator) or any other web) and look yourself the result on large values of n ( y=f(n)). I am saying that you should look for large value because for small value of n the program will not have time issue. For convenience I had attached a graph below you can try for other base of log.
The red represent time = n and blue represent time = nlog(n).
Here is a graph of the popular time complexities
n*log(n) is clearly greater than n for n>2 (log base 2)
An easy way to remember might be, taking two examples
Imagine the binary search algorithm with is Log N time complexity: O(log(N))
If, for each step of binary search, you had to iterate the array of N elements
The time complexity of that task would be O(N*log(N))
Which is more work than iterating the array once: O(N)
In computers, it's log base 2 and not base 10. So log(2) is 1 and log(n), where n>2, is a positive number which is greater than 1.
Only in the case of log (1), we have the value less than 1, otherwise, it's greater than 1.
Log(n) can be greater than 1 if n is greater than b. But this doesn't answer your question that why is O(n*logn) is greater than O(n).
Usually the base is less than 4. So for higher values n, n*log(n) becomes greater than n. And that is why O(nlogn) > O(n).
This graph may help. log (n) rises faster than n and is greater than 1 for n greater than logarithm's base. https://stackoverflow.com/a/7830804/11617347
No matter how two functions behave on small value of n, they are compared against each other when n is large enough. Theoretically, there is an N such that for each given n > N, then nlogn >= n. If you choose N=10, nlogn is always greater than n.
The assertion is not always accurate. When n is small, (n^2) requires more time than (log n), but when n is large, (log n) is more effective. The growth rate of (n^2) is less than (n) and (log n) for small values, so we can say that (n^2) is more efficient because it takes less time than (log n), but as n increases, (n^2) increases dramatically, whereas (log n) has a growth rate that is less than (n^2) and (n), so (log n) is more efficient.
For higher values of log n it becomes greater than 1. as we consider all possible values of n we can say that for most of the time log n is greater than 1. Hence we can say O(nlogn) > O(n) (Assuming higher values)
Remember "big O" is not about values, it is about the shape of the function I can have an O(n2) function that runs faster, even for values of n over a million than a O(1) function...

Why is the worst case time complexity of this simple algorithm T(n/2) +1 as opposed to n^2+T(n-1)?

The following question was on a recent assignment in University. I would have thought the answer would be n^2+T(n-1) as I thought the n^2 would make it's asymptotic time complexity O(n^2). Where as with T(n/2)+1 its asymptotic time complexity would be O(log2(n)).
The answers were returned and it turns out the correct answer is T(n/2)+1 however I can't get my head around why this is the case.
Could someone possibly explain to me why that's the worst case time complexity of this algorithm? It's possible my understanding of time complexity is just wrong.
The asymptotic time complexity is taking n large. In the case of your example, since the question specifies that k is fixed, the only complexity relevant is the last one. See the Wikipedia formal definition, specifically:
As n grows to infinity, the recursion that dominates T(n) = T(n / 2) + 1. You can prove this as well using the formal definition, basically picking x_0 = 10 * k and showing that a finite M can be found using the first two cases. It should be clear that both log(n) and n^2 satisfy the definition, so the tighter bound is the asymptotic complexity.
What does O (f (n)) mean? It means the time is at most c * f (n), for some unknown and possibly large c.
kevmo claimed a complexity of O (log2 n). Well, you can check all the values n ≤ 10k, and let the largest value of T (n) be X. X might be quite large (about 167 k^3 in this case, I think, but it doesn't actually matter). For larger n, the time needed is at most X + log2 (n). Choose c = X, and this is always less than c * log2 (n).
Of course people usually assume that a O (log n) algorithm would be quick, and this one most certainly isn't if say k = 10,000. So you learned as well that O notation must be handled with care.

Counting the steps of an algorithm for an upper bound

I know that If I have a for loop, and a nested for loop, which both iterate 1 to n times, I can multiply the run times of both loops to get O(n^2). This is a clean and simple calculation. However, if you had iterations like so,
n = 2, k = 5
n = 3, k = 9
n = 4, k = 14
where k is the number of times the inner for loop iterates. At one point, it is larger than n^2, then it is exactly n^2, then it becomes less than n^2. Assuming you cannot determine k based on n, and maybe even having these points of n very far apart how do you calculate Big-O?
I tried graphing points. And at one point, I could say it was O(n^3) since some points exceed n^2, and further down, it would be O(n^2). Which one should I choose?
You state in your question that k is:
"... At one point, it is larger than n^2"
This is the uncertainty (or non-specificity) in your question that makes it hard to answer rigorously. Anyway, for the remainder of this answer, we shall assume that what you mean by the quote above is that:
For all values of n, the value of k(n) is bounded from above by
C·n^2, for some constant C>0.
From here on, let's refer to this statement as (+).
Now, since you're mentioning Big-O notation, we'll proceed to somewhat loosely define what this actually means:
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus
there exists some constant c such that f(n) is always ≤ c · g(n),
for sufficiently large n (i.e. , n ≥ n0 for some constant n0).
I.e., Big-O notation is a way to describe an upper bound here for the asymptotic (limiting) behaviour of our algorithm. You write in your question, however, that:
"And at one point, I could say it was O(n^3) since some points exceed n^2, and further down, it would be O(n^2)"
Now this is a very specific analysis of how the inner loop of your algorithm behaves for specific values of n, and really not something that is related to asymptotic analysis (or Big-O notation). We're not interested in specifics about how the algorithms behaves for specific values of n, but whether we can find some general upper bound for the algorithm given n is "sufficiently large" (n ≥ n0 for some constant n0).
Now, with these comments above, we can proceed to analysing the asymptotic behaviour of your algorithm.
We can approach this using Sigma notation, making use of statement (+) above, k(n) < C·n:
The last step (++) follows from the definition of Big-O-notation, that we loosely stated above.
Hence, given that we interpret your information regarding k as (+), your algorithm runs in O(n^3) (which is an upper bound, not necessarily a tight one).

Which algorithm is faster O(N) or O(2N)?

Talking about Big O notations, if one algorithm time complexity is O(N) and other's is O(2N), which one is faster?
The definition of big O is:
O(f(n)) = { g | there exist N and c > 0 such that g(n) < c * f(n) for all n > N }
In English, O(f(n)) is the set of all functions that have an eventual growth rate less than or equal to that of f.
So O(n) = O(2n). Neither is "faster" than the other in terms of asymptotic complexity. They represent the same growth rates - namely, the "linear" growth rate.
Proof:
O(n) is a subset of O(2n): Let g be a function in O(n). Then there are N and c > 0 such that g(n) < c * n for all n > N. So g(n) < (c / 2) * 2n for all n > N. Thus g is in O(2n).
O(2n) is a subset of O(n): Let g be a function in O(2n). Then there are N and c > 0 such that g(n) < c * 2n for all n > N. So g(n) < 2c * n for all n > N. Thus g is in O(n).
Typically, when people refer to an asymptotic complexity ("big O"), they refer to the canonical forms. For example:
logarithmic: O(log n)
linear: O(n)
linearithmic: O(n log n)
quadratic: O(n2)
exponential: O(cn) for some fixed c > 1
(Here's a fuller list: Table of common time complexities)
So usually you would write O(n), not O(2n); O(n log n), not O(3 n log n + 15 n + 5 log n).
Timothy Shield's answer is absolutely correct, that O(n) and O(2n) refer to the same set of functions, and so one is not "faster" than the other. It's important to note, though, that faster isn't a great term to apply here.
Wikipedia's article on "Big O notation" uses the term "slower-growing" where you might have used "faster", which is better practice. These algorithms are defined by how they grow as n increases.
One could easily imagine a O(n^2) function that is faster than O(n) in practice, particularly when n is small or if the O(n) function requires a complex transformation. The notation indicates that for twice as much input, one can expect the O(n^2) function to take roughly 4 times as long as it had before, where the O(n) function would take roughly twice as long as it had before.
It depends on the constants hidden by the asymptotic notation. For example, an algorithm that takes 3n + 5 steps is in the class O(n). So is an algorithm that takes 2 + n/1000 steps. But 2n is less than 3n + 5 and more than 2 + n/1000...
It's a bit like asking if 5 is less than some unspecified number between 1 and 10. It depends on the unspecified number. Just knowing that an algorithm runs in O(n) steps is not enough information to decide if an algorithm that takes 2n steps will complete faster or not.
Actually, it's even worse than that: you're asking if some unspecified number between 1 and 10 is larger than some other unspecified number between 1 and 10. The sets you pick from being the same doesn't mean the numbers you happen to pick will be equal! O(n) and O(2n) are sets of algorithms, and because the definition of Big-O cancels out multiplicative factors they are the same set. Individual members of the sets may be faster or slower than other members, but the sets are the same.
Theoretically O(N) and O(2N) are the same.
But practically, O(N) will definitely have a shorter running time, but not significant. When N is large enough, the running time of both will be identical.
O(N) and O(2N) will show significant difference in growth for small numbers of N, But as N value increases O(N) will dominate the growth and coefficient 2 becomes insignificant. So we can say algorithm complexity as O(N).
Example:
Let's take this function
T(n) = 3n^2 + 8n + 2089
For n= 1 or 2, the constant 2089 seems to be the dominant part of function but for larger values of n, we can ignore the constants and 8n and can just concentrate on 3n^2 as it will contribute more to the growth, If the n value still increases the coefficient 3 also seems insignificant and we can say complexity is O(n^2).
For detailed explanation refer here
O(n) is faster however you need to understand that when we talk about Big O, we are measuring the complexity of a function/algorithm, not its speed. And we measure this complexity asymptotically. In lay man terms, when we talk about asymptotic analysis, we take immensely huge values for n. So if you plot the graph for O(n) and O(2n), the values will stay in some particular range from each other for any value of n. They are much closer compared to the other canonical forms like O(nlogn) or O(1), so by convention we approximate the complexity to the canonical form O(n).

Big O when adding together different routines

Lets say I have a routine that scans an entire list of n items 3 times, does a sort based on the size, and then bsearches that sorted list n times. The scans are O(n) time, the sort I will call O(n log(n)), and the n times bsearch is O(n log(n)). If we add all 3 together, does it just give us the worst case of the 3 - the n log(n) value(s) or does the semantics allow added times?
Pretty sure, now that I type this out that the answer is n log(n), but I might as well confirm now that I have it typed out :)
The sum is indeed the worst of the three for Big-O.
The reason is that your function's time complexity is
(n) => 3n + nlogn + nlogn
which is
(n) => 3n + 2nlogn
This function is bounded above by 3nlogn, so it is in O(n log n).
You can choose any constant. I just happened to choose 3 because it was a good asymptotic upper bound.
You are correct. When n gets really big, the n log(n) dominates 3n.
Yes it will just be the worst case since O-notation is just about asymptotic performance.
This should of course not be taken to mean that adding these extra steps will have no effect on your programs performance. One of the O(n) steps could easily consume a huge portion of your execution time for the given range of n where your program operates.
As Ray said, the answer is indeed O(n log(n)). The interesting part of this question is that it doesn't matter which way you look at it: does adding mean "actual addition" or does it mean "the worst case". Let's prove that these two ways of looking at it produce the same result.
Let f(n) and g(n) be functions, and without loss of generality suppose f is O(g). (Informally, that g is "worse" than f.) Then by definition, there exists constants M and k such that f(n) < M*g(n) whenever n > k. If we look at in the "worst case" way, we expect that f(n)+g(n) is O(g(n)). Now looking at it in the "actual addition" way, and specializing to the case where n > k, we have f(n) + g(n) < M*g(n) + g(n) = (M+1)*g(n), and so by definition f(n)+g(n) is O(g(n)) as desired.

Resources