How do I prove Big O Notation with actual runtimes?

I am here because I don't exactly know how to prove it by comparing runtimes. For instance, proving linear time complexity is quite straightforward: as n increases, the time it takes increases linearly.
However, with O(n^3), I can't quite get a consistent increase. The code is:
for(int i=0; i<n; ++i)
for(int j=0; j<n*n; ++j)
sum++;
If my method is wrong or if the time complexity is wrong (if it's not O(n^3)), feel free to correct me. Thank you!
P.S. I use clock_t to check the time (the difference between two clock() readings, divided by CLOCKS_PER_SEC).
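One way to look for cubic growth, offered here as a sketch rather than a definitive benchmark, is the doubling test: if the work is proportional to n^3, doubling n should multiply it by about 8. Wall-clock timings are noisy, and an optimizing compiler may simplify a loop whose only effect is `sum++`, so the hypothetical `countOps` helper below counts iterations instead of measuring time. The class and method names are assumptions made for illustration.

```java
// Doubling test for the loop nest from the question (names are illustrative).
class CubicGrowth {

    // Counts how many times the innermost statement runs for a given n.
    static long countOps(int n) {
        long sum = 0;
        long inner = (long) n * n;          // long arithmetic so n * n cannot overflow int
        for (int i = 0; i < n; ++i) {
            for (long j = 0; j < inner; ++j) {
                sum++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long previous = countOps(25);
        for (int n = 50; n <= 400; n *= 2) {
            long current = countOps(n);
            // For work proportional to n^3 the ratio is 8 each time n doubles.
            System.out.println("n=" + n + " ops=" + current
                    + " ratio=" + ((double) current / previous));
            previous = current;
        }
    }
}
```

The same ratio check can be applied to measured times, ideally averaged over several runs and with n large enough that each run takes well over a few milliseconds.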

Related

If my Termination statement in a for-loop is i < n * n , is my running time then O(n^2)?

So I'm just a bit confused about how to correctly interpret the running time of this for-loop:
for (int i = 0; i < n * n; ++i) {}
I know the basics of O-notation; I'm just unsure how to correctly interpret the running time, and I couldn't find similar examples.
The problem is actually a triple nested for loop and I know you just multiply the running times of nested loops, but this one makes me unsure.
Yes.
n multiplied by itself is n^2, and you perform n^2 iterations.
There are no constant factors and no other considerations in this short example.
The complexity is simply O(n^2).
Note that this does not consider any hypothetical operations performed inside the loop. Also note that, if we take the loop exactly at face value, it doesn't actually do any meaningful work so we could say that it has no algorithmic complexity at all. You would need to present a real example to really say.
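As a quick check of that answer, the sketch below counts the iterations of a loop bounded by n * n. The class and method names are hypothetical. The bound is computed in `long` arithmetic as a precaution, since with `int n` the product `n * n` overflows once n exceeds 46340.

```java
// Counts iterations of a loop whose termination condition is i < n * n.
class SquareBoundLoop {

    static long countIterations(int n) {
        long bound = (long) n * n;   // computed once, in long arithmetic
        long iterations = 0;
        for (long i = 0; i < bound; ++i) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            // The count equals n^2, so it grows 100-fold when n grows 10-fold.
            System.out.println("n=" + n + " iterations=" + countIterations(n));
        }
    }
}
```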

Big O Notation - Growth Rate

I am trying to understand if my reasoning is correct:
If I am given the following snippet of code and asked to find its Big O:
for(int i = 3; i < 1000; i++)
sum++;
I want to say O(n) because we are dealing with one for loop and sum++ which is iterated say n times but then looking at this I realise we are not dealing with n at all as we are given the amount of times this for loop iterates... but in my mind it would be wrong to say that this has a Big O of O(1) because the growth is linear and not constant and depends on the size of this loop (although the loop is 'constant'). Would I be correct in saying that this is O(n)?
Also, another one that has me thinking around which has a similar setup:
for(int i = 0; i < n * n * n; i++)
for(int j = 0; j < i; j++)
sum++;
Now here again I know that when dealing with a nested loop containing an outer and an inner loop we would use the multiplication rule to derive our Big O. Let's assume that the inner loop was in fact j < n; then I would say that the Big O of this snippet of code is O(n^4). But as it isn't, and the second loop runs its iterations off i and not n, would it be correct to say this has a Big O of O(n^3)?
I think what is throwing me is where 'n' is not appearing and we're given a constant or another variable and all of a sudden I'm assuming n must not be considered for that section of code. However, having said that the other part of my reasoning is telling me that despite not seeing an 'n' I should still treat the code as though there were an n as the growth rate would be the same regardless of the variable?
It works best if you consider the code to always be within a function, where the function's arguments are used to calculate complexity. Thus:
// this is O(1), since it always takes the same time
void doSomething() {
for(int i = 3; i < 1000; i++)
sum++;
}
And
// this is O(n^6), since it only takes one argument
// and if you plot it, the curve matches t = k * n^6
void doSomethingElse(int n) {
for(int i = 0; i < n * n * n; i++)
for(int j = 0; j < i; j++)
sum++;
}
In the end, the whole point of big-O is to say what the run-times (or memory-footprints; but if you don't say anything, you are referring to run-times) look like as the problem size increases. It matters not what happens in the inside (although you can use that to estimate complexity) - what really matters is what you would measure outside.
Looking closer at your second snippet, it's O(n^6) because:
outer loop runs exactly n^3 times; inner loop runs, on average, n^3 / 2 times.
therefore, inner sum runs n^3 * k * n^3 times (with k a constant). In big-O notation, that's O(n^6).
The first is either O(1) or simply a wrong question, just like you understand it.
The second is O(n^6). Try to imagine the size of the inner loop. On the first iteration of the outer loop it runs 0 times, on the second, once. On the iteration with index i it runs i times, and on the last it runs n*n*n - 1 times. So on average it runs about n*n*n/2 times, which is O(n*n*n). That, times the outer O(n^3), is O(n^6) overall.
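The counting argument in these answers can be checked directly. In the sketch below (class and method names are hypothetical), the inner statement runs 0 + 1 + ... + (m - 1) = m(m - 1)/2 times, where m = n^3, which grows like n^6 / 2.

```java
// Counts inner-loop executions of doSomethingElse and compares with m(m-1)/2, m = n^3.
class TriangularCount {

    static long countInner(int n) {
        long m = (long) n * n * n;
        long sum = 0;
        for (long i = 0; i < m; i++) {
            for (long j = 0; j < i; j++) {
                sum++;
            }
        }
        return sum;
    }

    // Closed form of 0 + 1 + ... + (m - 1) with m = n^3.
    static long closedForm(int n) {
        long m = (long) n * n * n;
        return m * (m - 1) / 2;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            System.out.println("n=" + n + " counted=" + countInner(n)
                    + " formula=" + closedForm(n));
        }
    }
}
```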
Although the calculation of O() for your question, by others, may be correct, here is a little more insight that should help delineate the conceptual outlook for this whole asymptotic analysis story.
"I think what is throwing me is where 'n' is not appearing and we're given a constant or another variable and all of a sudden I'm assuming n must not be considered for that section of code."
The simplest way to understand this one is to identify if the execution of a line of code is affected by/related to the current value of n.
Had the inner loop been, let's say, j < 10 instead of j < i, the complexity would have been O(n^3).
Why is any constant considered O(1)?
This may admittedly sound a little counter-intuitive at first; however, here is a small conceptual summary to clear the air.
Let us say that your first loop runs 1000 times. Then you set it to 10^1000 times and object that, hey, it doesn't take the same time anymore.
Fair enough! Even though it now takes your computer far longer to run the same piece of code, the time complexity still remains O(1).
What this practically means is that you can actually calculate the time that it takes your computer to execute that piece of code, and it will remain constant forever (for the same configuration).
Big-Oh describes how the cost grows as a function of the input size; it is not a measurement of a particular value (time/space) itself.
I hope that the above explanation also helps clarify why we actually ignore the constants in the O() notation.
Why is this Big-Oh thing so generalized and why is it used at the first place?
I thought of including this extra info as I myself had this question in mind when learning this topic for the first time.
Asymptotic time-complexity is an a priori analysis of an algorithm that describes how its cost (time/space) grows as the input size increases; Big-Oh gives an upper bound on that growth.
E.g. your second snippet cannot grow faster than O(n^6).
It is generalized because from one computer to another, only the constant changes, not Big-Oh.
With more experience, you will realize that in practice you want your algorithm's time complexity to be as asymptotically small as possible. Polynomial growth is usually manageable. But for large inputs, today's computers start coughing if you try to run an algorithm with exponential time complexity of the order O(k^n) or worse, such as O(n^n), e.g. brute-force approaches to the Travelling Salesman problem and other NP-complete/NP-hard problems.
Hope this adds to the info. :)

Is it sometimes better to write solution in O(n^2) than in O(n)? [duplicate]

If we have solution to problem something like this:
public void solutionLinear(Problem problem) {
for (int i = 0; i < problem.getSize(); i++) {
// do something with problem and compute solution
}
}
...and if we have solution to problem which is like this
public void solutionQuadric(Problem problem) {
for (int i = 0; i < problem.getSize(); i++) {
for (int j = 0; j < problem.getSize(); j++) {
// do something with problem and compute solution
}
}
}
Is it sometimes better to write the second solution, and if so, when?
Big O complexity measurements omit constant coefficients, so "O(N) complexity" and "O(N^2) complexity" roughly correspond to "runs in A*N + B seconds" and "runs in C*N^2 + D*N + E seconds" respectively. The latter may be preferable if A & B are large and C & D & E are small.
Consider the code samples:
public void solutionLinear(Problem problem) {
for (int i = 0; i < problem.getSize(); i++) {
do_stuff_taking_one_hour();
}
}
public void solutionQuadric(Problem problem) {
for (int i = 0; i < problem.getSize(); i++) {
for (int j = 0; j < problem.getSize(); j++) {
do_stuff_taking_one_second();
}
}
}
Despite being O(N^2), the latter algorithm will run faster as long as problem.getSize() is less than 3600: it takes N^2 seconds, while the linear version takes 3600*N seconds.
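To make the comparison concrete, here is a small sketch that models the cost of the two samples above, assuming each call takes exactly one hour or one second as the method names suggest, and searches for the problem size at which the quadratic version stops being faster. The class and helper names are hypothetical.

```java
// Cost model: linear loop at one hour per step vs. nested loop at one second per step.
class Crossover {

    static final long SECONDS_PER_HOUR = 3600;

    // Total seconds for the linear solution: n calls of one hour each.
    static long linearSeconds(long n) {
        return n * SECONDS_PER_HOUR;
    }

    // Total seconds for the quadratic solution: n * n calls of one second each.
    static long quadraticSeconds(long n) {
        return n * n;
    }

    // Smallest n >= 1 at which the quadratic solution is no longer strictly faster.
    static long breakEven() {
        long n = 1;
        while (quadraticSeconds(n) < linearSeconds(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("break-even size: " + breakEven());
    }
}
```

Below the break-even size the O(N^2) version wins under this model; above it the O(N) version wins, and the gap keeps widening as N grows.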
The other answers do a good job of explaining the situation where a O(n^2) solution is actually faster than a O(n) solution, so I will be focusing on the other side of the question, namely, "Should you favor readability over performance?"
Short Answer: No
Long Answer: Generally no. There are times when the difference in performance is small enough that the gains you get from readability might be worth it. For example, people debate over the relative speed of switches and if/else statements, but the difference in performance is so small that you should really just use whichever is more maintainable for you and your team.
Outside of those cases, the potential for slowing down your program generally outweighs the gain you get from the code being readable. If it is well written code and the only problem is that the algorithm is more complex, you can solve that problem by leaving documentation for the next person to work on it.
I think a good example of this trade-off is bubble-sort vs quick-sort. Bubble-sort is a very easy algorithm to understand and is very readable. Quick-sort on the other hand is far less intuitive and definitely harder to read. However, it would not be appropriate to replace quick-sort with bubble-sort in production code because the performance difference is too extreme. The situation you asked about is even worse than that, because you are talking about O(n) vs O(n^2), whereas bubble-sort vs quick-sort is O(n^2) vs O(n log(n)) (in the average case, of course).
When It Comes to Speed
When it comes to runtime in particular, what generally everyone else is saying is right; there usually is a hidden constant whenever you refer to a function in Big O. If the function with the O(n^2) had a relatively small constant and didn't run particularly long, it could be faster than a function which ran O(n) with a large constant and ran longer.
Don't Forget About Memory
Runtime isn't the only thing to consider when writing or using an algorithm; you also need to worry about the space complexity. If you happened to need to conserve memory in your application and you had to choose between a function which ran in O(n) but uses a ton of memory and a function which ran in O(n^2) but uses much less memory, you might want to consider that slower algorithm.
A good example of this is quicksort vs. mergesort: mergesort guarantees O(n log n) time even in the worst case, whereas quicksort can degrade to O(n^2); however, quicksort sorts in place (needing only its recursion stack), while a typical mergesort allocates O(n) extra memory.
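The time-versus-memory trade-off can be illustrated with a different, deliberately simple task than sorting: checking an array for duplicates. This is a sketch with hypothetical names, not an example taken from the answer above. The first method runs in O(n) expected time but keeps up to n values in a `HashSet`; the second needs no extra storage but compares O(n^2) pairs.

```java
import java.util.HashSet;
import java.util.Set;

// Two ways to detect a duplicate: faster with extra memory, slower without.
class DuplicateCheck {

    // O(n) expected time, O(n) extra memory for the set.
    static boolean hasDuplicateWithSet(int[] values) {
        Set<Integer> seen = new HashSet<>();
        for (int v : values) {
            if (!seen.add(v)) {      // add returns false if v was already present
                return true;
            }
        }
        return false;
    }

    // O(n^2) time, O(1) extra memory.
    static boolean hasDuplicateInPlace(int[] values) {
        for (int i = 0; i < values.length; i++) {
            for (int j = i + 1; j < values.length; j++) {
                if (values[i] == values[j]) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5};
        System.out.println(hasDuplicateWithSet(data) + " " + hasDuplicateInPlace(data));
    }
}
```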
In Conclusion
Is it sometimes better to write solution in O(n^2) than in O(n)?
Yes, considering the specific circumstances of your application, the slower option may indeed be the better one. You should never rule out an algorithm simply because it's slower!

Big Oh Notation and Calculating the Running Time for a Triple-Nested For-Loop

In Computer Science, it is very important for Computer Scientists to know how to calculate the running times of algorithms in order to optimize code. For you Computer Scientists, I pose a question.
I understand that, in terms of n, a double-nested for-loop typically has a running time of n^2 and a triple-nested for-loop typically has a running time of n^3.
However, for a case where the code looks like this, would the running time be n^4?
x = 0;
for(a = 0; a < n; a++)
for(b = 0; b < 2*a; b++)
for (c=0; c < b*b; c++)
x++;
I simplified the running time for each line to be virtually (n+1) for the first loop, (2n+1) for the second loop, and (2n)^2+1 for the third loop. Assuming the terms are multiplied together, and we extract the highest term to find the Big Oh, would the running time be n^4, or would it still follow the usual running time of n^3?
I would appreciate any input. Thank you very much in advance.
You are correct, n*2n*4n^2 = O(n^4).
The triple nested loop only means there will be three numbers to multiply to determine the final Big O - each multiplicand itself is dependent on how much "processing" each loop does though.
In your case the first loop does O(n) operations, the second one O(2n) = O(n) and the inner loop does O(n^2) operations, so overall O(n*n*n^2) = O(n^4).
Formally, using Sigma Notation, you can obtain this:
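The formula that accompanied this answer is not reproduced here. As a sketch of what such a derivation might look like, assuming the loop bounds are read as a from 0 to n-1, b from 0 to 2a-1, and c from 0 to b^2-1, the number of times x++ executes can be written as:

```latex
\[
T(n) \;=\; \sum_{a=0}^{n-1} \; \sum_{b=0}^{2a-1} \; \sum_{c=0}^{b^{2}-1} 1
     \;=\; \sum_{a=0}^{n-1} \; \sum_{b=0}^{2a-1} b^{2}
     \;=\; \sum_{a=0}^{n-1} \frac{8a^{3} - 6a^{2} + a}{3}
     \;=\; \frac{2}{3}\,n^{4} + O(n^{3})
\]
```

The middle step uses the sum-of-squares formula, and the last step uses the fact that the sum of a^3 for a below n is n^4/4 to leading order, which gives T(n) = O(n^4).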
Could this be a question for Mathematics?
My gut feeling, like BrokenGlass's, is that it is O(n⁴).
EDIT: Sum of squares and sum of cubes give a pretty good understanding of what is involved. The answer is a resounding O(n^4): sum(a=0 to n) of (sum(b=0 to 2a) of (b^2)). The inner sum is proportional to a^3 to leading order. Therefore your outer sum is proportional to n^4.
Pity, I thought you might get away with some log instead of n^4. Never mind.
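The O(n^4) conclusion can also be checked empirically by counting the increments. The sketch below uses hypothetical names and reads the middle bound as 2 * a. If the count grows like n^4, doubling n should multiply it by roughly 16 once n is reasonably large.

```java
// Counts how often x++ runs in the triple-nested loop from the question.
class TripleLoopCount {

    static long countIncrements(int n) {
        long x = 0;
        for (int a = 0; a < n; a++) {
            for (int b = 0; b < 2 * a; b++) {
                for (int c = 0; c < b * b; c++) {
                    x++;
                }
            }
        }
        return x;
    }

    public static void main(String[] args) {
        long previous = countIncrements(10);
        for (int n = 20; n <= 80; n *= 2) {
            long current = countIncrements(n);
            // The ratio drifts toward 16 as n grows, consistent with n^4 growth.
            System.out.println("n=" + n + " count=" + current
                    + " ratio=" + ((double) current / previous));
            previous = current;
        }
    }
}
```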

Algorithm Analysis Question

NOTE: I'm ultra-newbie on algorithm analysis so don't take any of my affirmations as absolute truths, anything (or everything) that I state could be wrong.
Hi, I'm reading about algorithm analysis and "Big-O-Notation" and I feel puzzled about something.
Suppose that you are asked to print all permutations of a char array, for [a,b,c] they would be ab, ac, ba, bc, ca and cb.
Well one way to do it would be (In Java):
for(int i = 0; i < arr.length; i++)
for(int q = 0; q < arr.length; q++)
if(i != q)
System.out.println(arr[i] + " " + arr[q]);
This algorithm has a notation of O(n^2) if I'm correct.
I thought other way of doing it:
for(int i = 0; i < arr.length; i++)
for(int q = i+1; q < arr.length; q++)
{
System.out.println(arr[i] + " " + arr[q]);
System.out.println(arr[q] + " " + arr[i]);
}
Now this algorithm is twice as fast as the original, but unless I'm wrong, in big-O notation it's also O(n^2).
Is this correct? Probably it isn't so I'll rephrase: Where am I wrong??
You are correct. O-notation gives you an idea of how the algorithm scales, not the absolute speed. If you add more possibilities, both solutions will scale the same way, but one will always be twice as fast as the other.
O(n) operations may also be slower than O(n^2) operations, for sufficiently small 'n'. Imagine your O(n) computation involves taking 5 square roots, and your O(n^2) solution is a single comparison. The O(n^2) operation will be faster for small sets of data. But when n=1000, and you are doing 5000 square roots but 1000000 comparisons, then the O(n) might start looking better.
I think most people agree first one is O(n^2). Outer loop runs n times and inner loop runs n times every time outer loop runs. So the run time is O(n * n), O(n^2).
The second one is also O(n^2). The outer loop runs n times and the inner loop runs at most n-1 times. On average, the inner loop runs about n/2 times for every pass of the outer loop, so the run time of this algorithm is O(n * n/2) => O(1/2 * n^2) => O(n^2).
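The two iteration counts described above can be compared directly. In this sketch (hypothetical names, with the printing replaced by a counter), the first version performs n * n inner iterations and the second n(n - 1)/2, so their ratio settles near 2 while both grow quadratically.

```java
// Inner-loop iteration counts for the two pair-printing loops from the question.
class PairLoopCounts {

    // First version: q starts at 0 for every i.
    static long fullIterations(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int q = 0; q < n; q++) {
                count++;
            }
        }
        return count;
    }

    // Second version: q starts at i + 1.
    static long halfIterations(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int q = i + 1; q < n; q++) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            long full = fullIterations(n);
            long half = halfIterations(n);
            System.out.println("n=" + n + " full=" + full + " half=" + half
                    + " ratio=" + ((double) full / half));
        }
    }
}
```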
Big-O notation says nothing about the speed of the algorithm except for how fast it is relative to itself when the size of the input changes.
An algorithm could be O(1) yet take a million years. Another algorithm could be O(n^2) but be faster than an O(n) algorithm for small n.
Some of the answers to this question may help with this aspect of big-O notation. The answers to this question may also be helpful.
Ignoring the problem of calling your program output "permutation":
Big-O-Notation omits constant coefficients. And 2 is a constant coefficient.
So, there is nothing wrong for programs two times faster than the original to have the same O()
You are correct. Two algorithms are equivalent in Big O notation if one of them takes a constant amount of time more ("A takes 5 minutes more than B"), or a multiple ("A takes 5 times longer than B") or both ("A takes 2 times B plus an extra 30 milliseconds") for all sizes of input.
Here is an example that uses a FUNDAMENTALLY different algorithm to do a similar sort of problem. First, the slower version, which looks much like your original example:
boolean arraysHaveAMatch = false;
for (int i = 0; i < arr1.length; i++) {
    // Compare arr1[i] against every element of arr2. Starting j at i would skip
    // pairs with j < i, which is only safe when both indices walk the same array.
    for (int j = 0; j < arr2.length; j++) {
        if (arr1[i] == arr2[j]) {
            arraysHaveAMatch = true;
        }
    }
}
That has O(n^2) behavior, just like your original. Now here is a different approach:
boolean arraysHaveAMatch = false;
Set<Integer> set = new HashSet<>();
for (int i = 0; i < arr1.length; i++) {
    set.add(arr1[i]);
}
for (int j = 0; j < arr2.length; j++) {
    if (set.contains(arr2[j])) {
        arraysHaveAMatch = true;
    }
}
Now, if you try running these, you will probably find that the first version runs FASTER. At least if you try with arrays of length 10. Because the second version has to deal with creating the HashSet object and all of its internal data structures, and because it has to calculate a hash code for every integer. HOWEVER, if you try it with arrays of length 10,000,000 you will find a COMPLETELY different story. The first version has to examine about 100,000,000,000,000 pairs of numbers (about N*N); the second version has to perform hash function calculations on about 20,000,000 numbers (about 2*N). In THIS case, you certainly want the second version!!
The basic idea behind Big O calculations is (1) it's reasonably easy to calculate (you don't have to worry about details like how fast your CPU is or what kind of L2 cache it has), and (2) who cares about the small problems... they're fast enough anyway: it's the BIG problems that will kill you! These aren't always the case (sometimes it DOES matter what kind of cache you have, and sometimes it DOES matter how well things perform on small data sets) but they're close enough to true often enough for Big O to be useful.
You're right about them both being big-O n squared, and you actually proved that to be true in your question when you said "Now this algorithm is twice as fast as the original." Twice as fast means the running time is multiplied by 1/2, which is a constant, so by definition they're in the same big-O set.
One way of thinking about Big O is to consider how well the different algorithms would fare even in really unfair circumstances. For instance, if one was running on a really powerful supercomputer and the other was running on a wrist-watch. If it's possible to choose an N that is so large that even though the worse algorithm is running on a supercomputer, the wrist watch can still finish first, then they have different Big O complexities. If, on the other hand, you can see that the supercomputer will always win, regardless of which algorithm you chose or how big your N was, then both algorithms must, by definition, have the same complexity.
In your algorithms, the faster algorithm was only twice as fast as the first. This is not enough of an advantage for the wrist watch to beat the supercomputer: even if N were very high, 1 million, 1 trillion, or even Graham's number, the wrist watch could never beat the supercomputer with that algorithm. The same would be true if they swapped algorithms. Therefore both algorithms, by definition of Big O, have the same complexity.
Suppose I had an algorithm to do the same thing in O(n) time. Now also suppose I gave you an array of 10000 characters. Your algorithms would take n^2 and (1/2)n^2 time, which is 100,000,000 and 50,000,000. My algorithm would take 10,000. Clearly that factor of 1/2 isn't making a difference, since mine is so much faster. The n^2 term is said to dominate the lesser terms like n and 1/2, essentially rendering them negligible.
The big-oh notation expresses a family of functions, so saying "this thing is O(n²)" on its own means nothing.
This isn't pedantry; it is the only correct way to understand these things.
O(f) = { g | there exist x_0 and c such that, for all x > x_0, g(x) <= f(x) * c }
Now, suppose that you're counting the steps that your algorithm performs, in the worst case, in terms of the size of the input: call that function f.
If f \in O(n²), then you can say that your algorithm has a worst case of O(n²) (but also O(n³) or O(2^n)).
The meaninglessness of the constants follows from the definition (see that c?).
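The definition can be explored numerically. The sketch below uses hypothetical names and follows the set definition's naming, where g is the step count and f is the bounding function. It checks whether a proposed pair of witnesses c and x_0 satisfies g(x) <= c * f(x) for every integer x in a finite range. This is only a sanity check over that range, not a proof, because the definition quantifies over all x > x_0.

```java
import java.util.function.LongUnaryOperator;

// Finite-range check of the Big-O inequality g(x) <= c * f(x) for x0 < x <= limit.
class BigOWitness {

    static boolean boundHolds(LongUnaryOperator g, LongUnaryOperator f,
                              long c, long x0, long limit) {
        for (long x = x0 + 1; x <= limit; x++) {
            if (g.applyAsLong(x) > c * f.applyAsLong(x)) {
                return false;    // the candidate witnesses fail at this x
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // n(n-1)/2 stays below 1 * n^2, consistent with membership in O(n^2).
        System.out.println(boundHolds(x -> x * (x - 1) / 2, x -> x * x, 1, 0, 10_000));
        // n^3 eventually exceeds 100 * n^2, so c = 100 is not a valid witness.
        System.out.println(boundHolds(x -> x * x * x, x -> x * x, 100, 0, 10_000));
    }
}
```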
The best way to understand Big-O notation is to get a mathematical grasp of the idea behind the notation. Look up the dictionary meaning of the word "asymptote":
"A line which approaches nearer to some curve than assignable distance, but, though infinitely extended, would never meet it."
This defines the maximum execution time (imaginary, because the asymptote only meets the curve at infinity), so whatever you do will be under that time.
With this idea in mind, you might also want to read about Big-O, small-o and omega notation.
Always keep in mind that Big O notation is most often quoted for the "worst case" scenario. In your example, the first algorithm always runs the full outer loop * the full inner loop, so it is n^2 of course. Because the second case has one pass of the outer loop where the inner loop is almost full, it gets lumped into the same pile of n^2, since that is its worst case. From there it only gets better, and its total work is about half that of the first function. Regardless, as n grows, your function's time grows quadratically, and that is all Big O really tells you. The quadratic curves can vary widely in their constants, but at the end of the day, they are all of the same type.
