Why do we always consider large values of input in the analysis of algorithms, e.g. in Big-O notation?
The point of Big-O notation is precisely to work out how the running time (or space) varies as the size of input increases - in other words, how well it scales.
If you're only interested in small inputs, you shouldn't use Big-O analysis... aside from anything else, there are often approaches which scale really badly but work very well for small inputs.
Because the worst-case performance is usually more of a problem than the best-case performance. If your worst-case performance is acceptable, your algorithm will run fine.
Analysis of algorithms does not just mean running them on a computer to see which one is faster. Rather, it is being able to look at the algorithm and determine how it will perform. This is done by looking at the order of magnitude of the algorithm: as the number of items (N) changes, what effect does that have on the number of operations needed to execute (time)? This method of classification is referred to as Big-O notation.
Programmers use Big-O to get a rough estimate of "how many seconds" and "how much memory" various algorithms use for "large" inputs.
It's because of the definition of Big-O notation. Saying that O(f(n)) bounds g(n) means: there exist a value n0 and a constant G such that, for all n > n0, the running time (or space complexity) of g on an input of size n is less than G*f(n).
What that means is that once your input goes over a certain size, the function will not grow faster than that bounding function. So, if f(x) = x (i.e. O(n)) and n2 = 2 * n1, the function I'm computing will take no more than double the amount of time. Now, note that if O(n) holds, so does O(n^2): if my function never does worse than doubling, it certainly never does worse than squaring. In practice, the lowest-order function known is usually given.
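As a concrete illustration (my own sketch, with a made-up countOps helper, not part of the original answer): for a linear-time scan, doubling n roughly doubles the operation count, which is exactly the kind of guarantee O(n) gives for large enough n.

public class LinearGrowthDemo {
    // Count the "elementary operations" of a simple linear scan.
    static long countOps(int n) {
        long ops = 0;
        for (int i = 0; i < n; i++) {
            ops++;                         // one step per element
        }
        return ops;
    }

    public static void main(String[] args) {
        System.out.println(countOps(1_000));   // 1000
        System.out.println(countOps(2_000));   // 2000 -- roughly double the work for double the input
    }
}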
Big O says nothing about how well an algorithm will scale. "How well" is relative. It is a general way to quantify how an algorithm will scale, but the fitness or lack of fitness for any specific purpose is not part of the notation.
Suppose we want to check whether a number is prime or not, and Ram and Shyam came up with the following solutions.
Ram's solution
boolean isPrime(long n) {
    for (long i = 2; i <= n - 1; i++)   // trial-divide by every candidate from 2 to n-1
        if (n % i == 0)
            return false;
    return true;
}
Now we know that the above algorithm will run n - 2 times (when n is prime).
Shyam's solution
boolean isPrime(long n) {
    for (long i = 2; i <= Math.sqrt(n); i++)   // only test divisors up to sqrt(n)
        if (n % i == 0)
            return false;
    return true;
}
The above algorithm will run about sqrt(n) - 1 times (when n is prime).
Assuming that in both algorithms each iteration takes unit time (1 ms), then:
if n = 101
1st algorithm: time taken is 99 ms, which is less than the blink of an eye
2nd algorithm: around 9 ms, which again is not noticeable
if n = 10000000019
1st algorithm: time taken is about 115 days, which is roughly a third of a year
2nd algorithm: around 1.66 minutes, which is about the time it takes to sip a cup of coffee
I think nothing more needs to be said now :D
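If you'd like to see those counts for yourself without waiting 115 days, here is a small counting sketch (my own addition; the class and method names are made up):

public class TrialDivisionCount {
    // Ram's approach: count trial divisions up to n-1
    static long ramCount(long n) {
        long ops = 0;
        for (long i = 2; i <= n - 1; i++) {
            ops++;
            if (n % i == 0) return ops;
        }
        return ops;
    }

    // Shyam's approach: count trial divisions up to sqrt(n)
    static long shyamCount(long n) {
        long ops = 0;
        for (long i = 2; i <= Math.sqrt(n); i++) {
            ops++;
            if (n % i == 0) return ops;
        }
        return ops;
    }

    public static void main(String[] args) {
        long n = 101;                       // a small prime; 10000000019L works too, but Ram's loop takes a while
        System.out.println(ramCount(n));    // 99
        System.out.println(shyamCount(n));  // 9
    }
}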
Suppose that I have 2 nested for loops and 1 array of size N, as shown in my code below:
int result = 0;
for (int i = 0; i < N; i++)
{
    for (int j = i; j < N; j++)
    {
        result = array[i] + array[j]; // just some funny operation
    }
}
Here are 2 cases:
(1) If the constraint is that N >= 1,000,000 strictly, then we can definitely say that the time complexity is O(N^2). This is true for sure, as we all know.
(2) Now, if the constraint is that N < 25 strictly, then people could probably say that, because N is definitely always that small, the time complexity is estimated to be O(1), since it takes very little time to run and complete these 2 for loops WITH MODERN COMPUTERS? Does that sound right?
Please tell me whether the value of N plays a role in deciding the time complexity. If yes, how big does N need to be in order to play that role (1,000? 5,000? 20,000? 500,000?)? In other words, what is the general rule of thumb here?
INTERESTING THEORETICAL QUESTION: If, 15 years from now, computers are so fast that even with N = 25,000,000 these 2 for loops complete in 1 second, can we say at that time that the time complexity would be O(1) even for N = 25,000,000? I suppose the answer would be YES at that time. Do you agree?
tl;dr No. The value of N has no effect on time complexity. O(1) versus O(N) is a statement about "all N", i.e. about how the amount of computation increases when N increases.
Great question! It reminds me of when I was first trying to understand time complexity. I think many people have to go through a similar journey before it ever starts to make sense so I hope this discussion can help others.
First of all, your "funny operation" is actually funnier than you think since your entire nested for-loops can be replaced with:
result = array[N - 1] + array[N - 1]; // just some hilarious operation hahaha ha ha
Since result is overwritten each time, only the last iteration affects the outcome. We'll come back to this.
As far as what you're really asking here, the purpose of Big-O is to provide a meaningful way to compare algorithms in a way that is independent of input size and independent of the computer's processing speed. In other words, O(1) versus O(N) has nothing to do with the size of N and nothing to do with how "modern" your computer is. Those things affect the execution time of the algorithm on a particular machine with a particular input, but they do not affect the time complexity, i.e. O(1) versus O(N).
It is actually a statement about the algorithm itself, so a math discussion is unavoidable, as dxiv has so graciously alluded to in his comment. Disclaimer: I'm going to omit certain nuances in the math since the critical stuff is already a lot to explain and I'll defer to the mountains of complete explanations elsewhere on the web and textbooks.
Your code is a great example to understand what Big-O does tell us. The way you wrote it, its complexity is O(N^2). That means that no matter what machine or what era you run your code in, if you were to count the number of operations the computer has to do, for each N, and graph it as a function, say f(N), there exists some quadratic function, say g(N)=9999N^2+99999N+999 that is greater than f(N) for all N.
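To make "counting the number of operations" concrete, here is a rough sketch (the class and method names are mine, and the body counter stands in for your funny operation) that tallies f(N) for your loops and checks it against the g(N) above:

public class CountBodyRuns {
    // f(N): how many times the inner body of the nested loops runs
    static long f(int N) {
        long body = 0;
        for (int i = 0; i < N; i++) {
            for (int j = i; j < N; j++) {
                body++;                              // stands in for result = array[i] + array[j]
            }
        }
        return body;                                 // equals N*(N+1)/2
    }

    public static void main(String[] args) {
        for (int N : new int[]{10, 100, 1000}) {
            long fN = f(N);
            long gN = 9999L * N * N + 99999L * N + 999;   // the quadratic g(N) from above
            System.out.println(N + ": f=" + fN + "  g=" + gN + "  f<=g? " + (fN <= gN));
        }
    }
}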
But wait, if we just need to find big enough coefficients in order for g(N) to be an upper bound, can't we just claim that the algorithm is O(N) and find some g(N)=aN+b with gigantic enough coefficients that it's an upper bound of f(N)??? THE ANSWER TO THIS IS THE MOST IMPORTANT MATH OBSERVATION YOU NEED TO UNDERSTAND TO REALLY UNDERSTAND BIG-O NOTATION. Spoiler alert: the answer is no.
For visuals, try this graph on Desmos, where you can adjust the coefficients: https://www.desmos.com/calculator/3ppk6shwem
No matter what coefficients you choose, a function of the form aN^2+bN+c will ALWAYS eventually outgrow a function of the form aN+b (both having positive a). You can push a line as high as you want like g(N)=99999N+99999, but even the function f(N)=0.01N^2+0.01N+0.01 crosses that line and grows past it after N=9999900. There is no linear function that is an upper bound to a quadratic. Similarly, there is no constant function that is an upper bound to a linear function or quadratic function. Yet, we can find a quadratic upper bound to this f(N) such as h(N)=0.01N^2+0.01N+0.02, so f(N) is in O(N^2). This observation is what allows us to just say O(1) and O(N^2) without having to distinguish between O(1), O(3), O(999), O(4N+3), O(23N+2), O(34N^2+4N+9), etc. By using phrases like "there exists a function such that" we can brush all the constant coefficients under the rug.
So having a quadratic upper bound, aka being in O(N^2), means that the function f(N) is no bigger than quadratic and in this case happens to be exactly quadratic. It sounds like this just comes down to comparing the degree of polynomials, why not just say that the algorithm is a degree-2 algorithm? Why do we need this super abstract "there exists an upper bound function such that bla bla bla..."? This is the generalization necessary for Big-O to account for non-polynomial functions, some common ones being logN, NlogN, and e^N.
For example if the number of operations required by your algorithm is given by f(N)=floor(50+50*sin(N)), we would say that it's O(1) because there is a constant function, e.g. g(N)=101 that is an upper bound to f(N). In this example, you have some bizarre algorithm with oscillating execution times, but you can convey to someone else how much it doesn't slow down for large inputs by simply saying that it's O(1). Neat. Plus we have a way to meaningfully say that this algorithm with trigonometric execution time is more efficient than one with linear complexity O(N). Neat. Notice how it doesn't matter how fast the computer is because we're not measuring in seconds, we're measuring in operations. So you can evaluate the algorithm by hand on paper and it's still O(1) even if it takes you all day.
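If it helps, here is a toy version of such an algorithm (entirely my own invention, just to make the oscillation concrete):

public class OscillatingWork {
    // f(N) = floor(50 + 50*sin(N)) steps: wobbles with N but never exceeds 100
    static long work(int N) {
        int steps = (int) Math.floor(50 + 50 * Math.sin(N));
        long acc = 0;
        for (int k = 0; k < steps; k++) {
            acc += k;                              // bounded busywork
        }
        return acc;
    }

    public static void main(String[] args) {
        for (int N : new int[]{1, 10, 100, 1_000_000}) {
            // the number of loop steps never exceeds 100, no matter how large N gets
            System.out.println("N=" + N + " -> " + work(N));
        }
    }
}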
As for the example in your question, we know it's O(N^2) because there are aN^2+bN+c operations involved for some a, b, c. It can't be O(1) (or even O(N)) because no matter what aN+b you pick, I can find a large enough input size N such that your algorithm requires more than aN+b operations. On any computer, in any time zone, with any chance of rain outside. Nothing physical affects O(1) versus O(N) versus O(N^2). What changes it to O(1) is changing the algorithm itself to the one-liner that I provided above, where you just add two numbers and spit out the result no matter what N is. Let's say for N=10 it takes 4 operations to do both array lookups, the addition, and the variable assignment. If you run it again on the same machine with N=10000000 it's still doing the same 4 operations. The amount of operations required by the algorithm doesn't grow with N. That's why the algorithm is O(1).
It's why problems like finding a O(NlogN) algorithm to sort an array are math problems and not nano-technology problems. Big-O doesn't even assume you have a computer with electronics.
Hopefully this rant gives you a hint as to what you don't understand so you can do more effective studying for a complete understanding. There's no way to cover everything needed in one post here. It was some good soul-searching for me, so thanks.
I'm currently learning Big-O notation, but I'm kinda confused about time/iteration calculations for different complexities.
I made up this problem:
An algorithm that goes through all possible solutions takes 10^(-7) seconds with each test.
If the number of solutions is given by the following functions: log n, n, n log n, n^2, what's the maximum n I can calculate in, for example, less than 1 second?
What I thought about (for the log n case) was that 10^-7 times log n must take less than 1 second:
10^(-7) * log n < 1 <=> n < 10^(1/10^(-7)) = 10^(10^7)
Incredibly big (If it's not wrong, oh damn --'). But what about n^2 ?
10^(-7) * n^2 < 1 <=> n < square_root(1/10^(-7)) = square_root(10^7) ≈ 3162
But how can the number of solutions in the n^2 case be less than the number in the n case, if the complexity is bigger? This is confusing me...
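As a quick sanity check on the arithmetic above, here is a small sketch (the class name, and the assumption of base-2 logarithms for the n log n case, are mine):

public class BudgetDemo {
    public static void main(String[] args) {
        double budget = 1e7;   // 1 second / 10^-7 seconds per test = 10^7 tests

        System.out.println("n      : " + (long) budget);             // 10000000
        System.out.println("n^2    : " + (long) Math.sqrt(budget));  // 3162

        long n = 2;                                    // largest n with n*log2(n) <= 10^7
        while ((n + 1) * (Math.log(n + 1) / Math.log(2)) <= budget) n++;
        System.out.println("n log n: " + n);           // roughly 526,000

        // For log n alone, n can be about 10^(10^7): far too large to even write
        // down, which is why that first result looked "incredibly big".
    }
}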
"This is bad on so many levels."
First off, O(f(n)) is not the same as f(n).
Complexity is typically used to represent the time it takes to solve a problem, as a function of the size of the input.
Of course if you can solve more problems in the same amount of time using X vs Y, X will solve more problems in a fixed amount of time than Y. But since you're not using the term complexity in the right way, it is no surprise you get a (seemingly) paradoxical answer.
You just upper-bounded your algorithm to a real-world time of 10^(-7) seconds, which implies that your algorithm is guaranteed to finish, for all complexities, in 10^(-7) seconds.
Let's refrain from discussing whether this is possible in reality. But since you just defined your algorithm to go through all possible solutions in 10^(-7) seconds, it means that no matter what n is, it will finish in that time. So your n is positive infinity.
Besides, I don't think you should use big O to denote the number of solutions.
Good news first: there is nothing "wrong" with the result of your calculation.
Naturally, if the complexity of an algorithm is higher (e.g. O(n^2) is higher than O(log n)), the size of the problem you can still handle in "acceptable" time will be smaller, in total accordance with your calculation.
Nevertheless the example seems a bit twisted, and I have to admit I didn't completely understand what purpose you invented the 10^-7 factor for, and thus suspect you didn't get the concept of the O notation quite right.
Essentially, the main idea of the O notation is that you don't care about constant factors (like the 10^-7 factor you invented in your calculation) when comparing two algorithms, but only about how fast computation time grows with the size of the problem, as constant (and thus not growing) factors sooner or later become irrelevant compared to the growth in computation time due to the problem size.
Going by example:
With the O(n^2) algorithm A, taking t = 2*n^2 milliseconds, and the O(log n) algorithm B, taking t = 400*log10(n) milliseconds, on a specific machine for a given problem size n, things will look as follows:
For a very small problem, say n=10, Algorithm A might be faster than Algorithm B:
2*10^2 = 2*100 = 200 ms
400*log10 = 400*1 = 400 ms
But with growing problem size, say n=100, Algorithm B will sooner or later overtake Algorithm A in speed:
2*100^2 = 2*10,000 = 20,000 ms
400*log100 = 400*2 = 800 ms
While for even bigger problem sizes, say n=1,000,000, waiting for A to complete might take a lot of patience.
2*1,000,000^2 = 2*10^12 ms = 2*10^9 s = 33,333,333 min = 555,555 h = 23,148 days = 63 years
Algorithm B might still operate in acceptable time.
400*log1,000,000 = 400*6 = 2,400 ms = 2.4 s
As the constant factor plays an ever smaller role with growing problem size, for bigger and bigger problems it becomes more and more irrelevant and is therefore (together with terms of lower order, that follow the same rule) left out in the O notation.
Thus the "right" way to look at complexities given in O notation is not trying to look at fixed values for fixed n or even reinvent constant factors and additional terms of lower order already abstracted away, but to look how fast computation time grows with problem size.
So, again going by example, the "right" way to look at the complexities O(n^2) and O(log n) is to compare their growth.
If the problem size grows by a factor of 10 Algorithm A's computation time will rise by a factor of 100, thus taking 100 times as long as before, as:
(n*10)^2 = n^2 * 10^2 = n^2 * 100
While Algorithm B's computation time will just grow by a constant amount, as:
log(n*10) = log(n) + log(10) = log(n) + 1
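To see those numbers (and the growth factors) come out of the formulas, here is a small sketch; the 2*n^2 and 400*log10(n) millisecond models are the made-up ones from this answer, not measurements:

public class GrowthComparison {
    public static void main(String[] args) {
        long[] sizes = {10, 100, 1_000_000};
        for (long n : sizes) {
            double tA = 2.0 * n * n;               // algorithm A: 2*n^2 ms
            double tB = 400.0 * Math.log10(n);     // algorithm B: 400*log10(n) ms
            System.out.printf("n=%,d  A=%.0f ms  B=%.0f ms%n", n, tA, tB);
        }
        // Growing n by a factor of 10 multiplies A's time by 100 but only adds 400 ms to B's.
    }
}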
I just read in Cormen's algorithm book that big-O and big-Omega do not follow the trichotomy property. That means that for two functions, f(n) and g(n), it may be the case that neither f(n) = O(g(n)) nor f(n) = Omega(g(n)) holds. As an example they argue that if the function is n^(1+sin n) then this is possible.
While that is correct, is it possible for a real-world algorithm to have a running time of something like sin n, since it would sometimes decrease as the input size increases? Does anybody know of any such algorithm, or can you give a small code snippet that does this?
Thanks for the answers. So in that case, is it correct to assume that, given a problem P of size n, if it cannot be solved in O(f(n)) time by any known algorithm, then the lower bound of P is Omega(f(n))?
The Boyer-Moore string search algorithm gets faster when the string searched for gets longer. Of course, the limiting factor is most often rather the length of the string searched in.
The average running time of SAT for randomly-generated 3CNF clauses eventually goes down as the ratio of clauses to variables increases. The intuition is that when there are very many clauses relative to the number of variables, it is more likely that the formula is "obviously" unsatisfiable; that is, typical SAT-solving algorithms (a step or two better than exhaustive search, but simple enough to cover in an undergrad logic course) quickly reach contradictions and stop.
Of course, those are experimental observations for some notion of "random" 3CNF formulas. I'm not sure what people have proven about it.
Use any inverse function.
f(x) -> 1 / x
f(x) -> 1 / x²
f(x) -> 1 / log(x)
As the input x grows, the resulting value gets smaller. It's fairly straightforward to relate a smaller value to a lesser number of steps in the algorithm: just use a counter in a loop that moves towards that number.
Here's a simple algorithm.
function algorithm(x) {
    const step = 0.001;
    const y = 1 / x;                // the bigger x is, the fewer iterations we need
    for (let i = 0; i < y; i += step) { /* do something awesome here */ }
}
I have difficulty conceiving of a meaningful problem with decreasing complexity. A "meaningful" problem will need to read or at least touch all of its input. Unless the input is encoded in a very inefficient way, processing it should take an increasing amount of time.
It could be increasing toward a constant, though.
Consider two algorithms, A and B. These algorithms both solve the same problem, and have time complexities (in terms of the number of elementary operations they perform) given respectively by
a(n) = 9n + 6
b(n) = 2(n^2) + 1
(i) Which algorithm is the best asymptotically?
(ii) Which is the best for small input sizes n, and for what values of n is this the
case? (You may assume where necessary that n>0.)
I think it's A. Am I right?
And what's the answer for part (ii)? What exactly do they want?
Which algorithm is the best asymptotically?
To answer this question, you just need to look at the exponents of n in both functions: asymptotically, n^2 will grow faster than n. So A ∈ O(n) is asymptotically a better choice than B ∈ O(n^2).
Which is the best for small input sizes n, and for what values of n is this the case? (You may assume where necessary that n>0.)
To answer this question, you need to find the point of intersection where both functions have the same value. And for n=5 both functions evaluate to 51 (see 9n+6=2(n^2)+1 on Wolfram Alpha). And since A(4)=42 and B(4)=33, B is the better choice for n < 5.
I think plotting those functions would be very helpful to understand what's going on.
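If you prefer to let the machine find the crossover rather than solving the equation, a tiny brute-force sketch (my own addition, just for illustration) does it:

public class Crossover {
    public static void main(String[] args) {
        for (int n = 1; n <= 8; n++) {
            int a = 9 * n + 6;          // a(n) = 9n + 6
            int b = 2 * n * n + 1;      // b(n) = 2n^2 + 1
            String cheaper = a < b ? "A" : (a > b ? "B" : "tie");
            System.out.println("n=" + n + "  A=" + a + "  B=" + b + "  cheaper: " + cheaper);
        }
        // Output shows B cheaper for n < 5, a tie at n = 5, and A cheaper from n = 6 on.
    }
}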
You should probably start by familiarizing yourself with asymptotics, big-O notation, and the like. Asymptotically, a will be better. Why? Because it can be proven that there is a sufficiently large N such that a(n) < b(n) for all n > N.
Proof left as an exercise for the reader.
By looking at the constants, it's easy to see that a will be larger than b at the beginning. By looking at the occurrences of n (n in a, n^2 in b), you can see that b is larger asymptotically. So we only need to figure out from which point on b is larger than a. To do that, we just need to solve the equation a(n) = b(n) for n.
9n + 6 IS the best.
Take this example, if your n is 10, then
9n + 6 = 96
2(n^2) + 1 = 201
now, take n = 100
9n + 6 = 906
2(n^2) + 1 = 20001
and it goes on and on...
if n = 4 then
9n + 6 = 42
2(n^2) + 1 = 33
Conclusion: the second one is better if n <= 4, they are equal at n = 5, and it is worse from n = 6 onward.
BTW, when calculating the complexity of an algorithm, we usually end up dropping factors and constants because they do not affect the speed difference by much, so it can be simplified to a(n) = n and b(n) = n^2, which gives you a clear answer.
Simple graphing software will show that 9n+6 performs better quite quickly, as will simple algebra. For n greater than 5, 9n+6 will be faster.
Asymptotically, O(n) is better (cheaper) than O(n^2).
For small values of n, this is a simple algebra problem:
Find 'n' for which 9n+6 = 2(n^2)+1: cleaning it up, we get the quadratic equation 2(n^2)-9n-5 = 0. This yields n=5, which means that, for n=5, both processes have the same cost:
9n+6 => n:5 => 9*5+6 = 45+6 = 51
2(n^2)+1 => n:5 => 2(5*5)+1 = 2*25+1 = 50+1 = 51
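For completeness, here is the quadratic-formula step spelled out (just the algebra the answer glosses over):

$$2n^2 - 9n - 5 = 0 \;\Longrightarrow\; n = \frac{9 \pm \sqrt{81 + 40}}{4} = \frac{9 \pm 11}{4} \;\Longrightarrow\; n = 5 \text{ (discarding the negative root } n = -\tfrac{1}{2}\text{)}$$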
This means that B is better for n<5, they are equal for n=5, and A is better for n>5. If you expect n to be smaller than 5 in the vast majority of cases, then B may be a better choice, but it will only be relevant if the algorithm is used a lot. If you implement it as a function, the minor benefits of B pale against the call overhead, so they will hardly be noticeable.
In summary, unless you are very sure of what you're up to, go with A. In general, you always want the algorithm with the better (cheaper) asymptotic cost. Only when both have the same asymptotic order, or when you have reliable knowledge about the input data you will be getting, is deeper insight worth the effort, and even then the best approach is benchmarking both versions with realistic data rather than theoretical analysis.
Asymptotically, a(n) is better since it's O(n) as opposed to O(n^2). A plot of the running times as functions of n shows that a(n) is only slower for small values of n.
What I would do is run the algorithm several times for the cases you described (and on computers with different processors) and get the time when you start and the time you finish. Then look at the difference between their times for each of your cases.
It's obvious: if we have
* 9n+6 => n:5 => 9*5+6 = 45+6 = 51
* 2(n^2)+1 => n:5 => 2(5*5)+1 = 2*25+1 = 50+1 = 51
then the two cost the same at n=5, and since 9n+6 grows far more slowly beyond that point, 9n+6 is much better.
The answer to the first part is obviously (a) because O(n*n) > O(n).
The answer to the second part is that you cannot say "Which is the best for small input sizes" because you give no information about what the 'elementary operations' are in each case, nor how long each of those operations takes. The compiler or the CPU could apply some optimization which makes the so-called slower algorithm perform MUCH faster than the 'better' algorithm at small input sizes.
And by 'small' input sizes that could mean 10, 100 or 1 million+! The cross over between a clever O(n) algorithm and a dumb O(n*n) algorithm can be huge because compilers and CPUs are great at running dumb code really fast.
Many people make the mistake of 'optimizing' their code on the basis of O() without considering the size of the input, nor testing how well each performs with the data sets they will be using.
Of course that doesn't mean you should always write dumb code when there is a much better algorithm or data structure that can do it in O(n) time, but it does mean you should think carefully before investing work to create a much cleverer (but much harder to maintain) algorithm that you think is optimal because it has a better O().
NOTE: I'm ultra-newbie on algorithm analysis so don't take any of my affirmations as absolute truths, anything (or everything) that I state could be wrong.
Hi, I'm reading about algorithm analysis and Big-O notation, and I feel puzzled about something.
Suppose that you are asked to print all permutations of a char array, for [a,b,c] they would be ab, ac, ba, bc, ca and cb.
Well, one way to do it would be (in Java):
for (int i = 0; i < arr.length; i++)
    for (int q = 0; q < arr.length; q++)
        if (i != q)
            System.out.println(arr[i] + " " + arr[q]);
This algorithm has a complexity of O(n^2), if I'm correct.
I thought of another way of doing it:
for (int i = 0; i < arr.length; i++)
    for (int q = i + 1; q < arr.length; q++)
    {
        System.out.println(arr[i] + " " + arr[q]);
        System.out.println(arr[q] + " " + arr[i]);
    }
Now this algorithm is twice as fast as the original, but unless I'm wrong, in big-O notation it's also O(n^2).
Is this correct? Probably it isn't, so I'll rephrase: where am I wrong?
You are correct. O-notation gives you an idea of how the algorithm scales, not the absolute speed. If you add more possibilities, both solutions will scale the same way, but one will always be twice as fast as the other.
O(n) operations may also be slower than O(n^2) operations, for sufficiently small 'n'. Imagine your O(n) computation involves taking 5 square roots, and your O(n^2) solution is a single comparison. The O(n^2) operation will be faster for small sets of data. But when n=1000, and you are doing 5000 square roots but 1000000 comparisons, then the O(n) might start looking better.
I think most people agree the first one is O(n^2). The outer loop runs n times and the inner loop runs n times every time the outer loop runs, so the run time is O(n * n) = O(n^2).
The second one is also O(n^2), because the outer loop runs n times and the inner loop runs at most n-1 times, or about n/2 times on average for each outer iteration. So the run time of this algorithm is O(n * n/2) => O(1/2 * n^2) => O(n^2).
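If you want to check those tallies, here is a small counting sketch (my own, not from the question) that counts how many times each inner body runs for a given n:

public class PairLoopCounts {
    public static void main(String[] args) {
        int n = 1000;

        long first = 0;
        for (int i = 0; i < n; i++)
            for (int q = 0; q < n; q++)
                if (i != q)
                    first++;                 // version 1 body: n*(n-1) runs

        long second = 0;
        for (int i = 0; i < n; i++)
            for (int q = i + 1; q < n; q++)
                second++;                    // version 2 body: n*(n-1)/2 runs (but it prints two lines each)

        System.out.println(first);           // 999000
        System.out.println(second);          // 499500
    }
}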
Big-O notation says nothing about the speed of the algorithm except for how fast it is relative to itself when the size of the input changes.
An algorithm could be O(1) yet take a million years. Another algorithm could be O(n^2) but be faster than an O(n) algorithm for small n.
Ignoring the problem of calling your program output "permutation":
Big-O notation omits constant coefficients, and 2 is a constant coefficient.
So there is nothing wrong with a program that is twice as fast as the original having the same O().
You are correct. Two algorithms are equivalent in Big O notation if one of them takes a constant amount of time more ("A takes 5 minutes more than B"), or a multiple ("A takes 5 times longer than B") or both ("A takes 2 times B plus an extra 30 milliseconds") for all sizes of input.
Here is an example that uses a FUNDAMENTALLY different algorithm to do a similar sort of problem. First, the slower version, which looks much like your original example:
boolean arraysHaveAMatch = false;
for (int i = 0; i < arr1.length; i++) {
    for (int j = i; j < arr2.length; j++) {
        if (arr1[i] == arr2[j]) {
            arraysHaveAMatch = true;
        }
    }
}
That has O(n^2) behavior, just like your original (it even uses the same shortcut you discovered of starting the j index from the i index instead of from 0). Now here is a different approach:
boolean arraysHaveAMatch = false;
Set<Integer> set = new HashSet<Integer>();
for (int i = 0; i < arr1.length; i++) {
    set.add(arr1[i]);
}
for (int j = 0; j < arr2.length; j++) {
    if (set.contains(arr2[j])) {
        arraysHaveAMatch = true;
    }
}
Now, if you try running these, you will probably find that the first version runs FASTER. At least if you try with arrays of length 10. Because the second version has to deal with creating the HashSet object and all of its internal data structures, and because it has to calculate a hash code for every integer. HOWEVER, if you try it with arrays of length 10,000,000 you will find a COMPLETELY different story. The first version has to examine about 50,000,000,000,000 pairs of numbers (about (N*N)/2); the second version has to perform hash function calculations on about 20,000,000 numbers (about 2*N). In THIS case, you certainly want the second version!!
The basic idea behind Big O calculations is (1) it's reasonably easy to calculate (you don't have to worry about details like how fast your CPU is or what kind of L2 cache it has), and (2) who cares about the small problems... they're fast enough anyway: it's the BIG problems that will kill you! These aren't always the case (sometimes it DOES matter what kind of cache you have, and sometimes it DOES matter how well things perform on small data sets) but they're close enough to true often enough for Big O to be useful.
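If you want to see that crossover on your own machine, here is a rough timing harness (entirely my own sketch; the early returns are a simplification of the code above, the sizes are arbitrary, and JIT warm-up will add noise for tiny inputs):

import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class MatchBenchmark {
    static boolean nestedScan(int[] arr1, int[] arr2) {
        for (int i = 0; i < arr1.length; i++)
            for (int j = 0; j < arr2.length; j++)
                if (arr1[i] == arr2[j]) return true;
        return false;
    }

    static boolean hashScan(int[] arr1, int[] arr2) {
        Set<Integer> set = new HashSet<>();
        for (int x : arr1) set.add(x);
        for (int y : arr2) if (set.contains(y)) return true;
        return false;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        int n = 20_000;                       // try small sizes like 10 as well and compare
        int[] a = rnd.ints(n).toArray();
        int[] b = rnd.ints(n).toArray();

        long t0 = System.nanoTime();
        nestedScan(a, b);
        long t1 = System.nanoTime();
        hashScan(a, b);
        long t2 = System.nanoTime();

        System.out.println("nested:  " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("hashset: " + (t2 - t1) / 1_000_000 + " ms");
    }
}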
You're right about them both being big-O n squared, and you actually proved that to be true in your question when you said "Now this algorithm is twice as fast as the original." Twice as fast means multiplied by 1/2, which is a constant, so by definition they're in the same big-O set.
One way of thinking about Big O is to consider how well the different algorithms would fare even in really unfair circumstances. For instance, if one was running on a really powerful supercomputer and the other was running on a wrist-watch. If it's possible to choose an N that is so large that even though the worse algorithm is running on a supercomputer, the wrist watch can still finish first, then they have different Big O complexities. If, on the other hand, you can see that the supercomputer will always win, regardless of which algorithm you chose or how big your N was, then both algorithms must, by definition, have the same complexity.
In your algorithms, the faster algorithm was only twice as fast as the first. This is not enough of an advantage for the wrist watch to beat the supercomputer: even if N was very high, 1 million, 1 trillion, or even Graham's number, the wrist watch could never beat the supercomputer with that algorithm. The same would be true if they swapped algorithms. Therefore both algorithms, by definition of Big O, have the same complexity.
Suppose I had an algorithm to do the same thing in O(n) time. Now also suppose I gave you an array of 10000 characters. Your algorithms would take n^2 and (1/2)n^2 time, which is 100,000,000 and 50,000,000. My algorithm would take 10,000. Clearly that factor of 1/2 isn't making a difference, since mine is so much faster. The n^2 term is said to dominate the lesser terms like n and 1/2, essentially rendering them negligible.
Big-O notation expresses a family of functions, so saying "this thing is O(n²)" by itself means nothing.
This isn't pedantry; it is the only correct way to understand these things.
O(f) = { g | there exist x_0 and c such that, for all x > x_0, g(x) <= f(x) * c }
Now, suppose that you're counting the steps that your algorithm, in the worst case, performs in terms of the size of the input: call that function f.
If f ∈ O(n²), then you can say that your algorithm has a worst case of O(n²) (but also O(n³) or O(2^n)).
The meaninglessness of the constants follows from the definition (see that c?).
The best way to understand Big-O notation is to get a mathematical grasp of the idea behind it. Look up the dictionary meaning of the word "asymptote":
A line which approaches nearer to some curve than assignable distance, but, though
infinitely extended, would never meet it.
This defines the maximum execution time (imaginary, because the asymptote meets the curve only at infinity), so whatever you do will be under that time.
With this idea in mind, you might want to look up Big-O, little-o and Omega notation.
Always keep in mind that Big O notation represents the "worst case" scenario. In your example, the first algorithm has an average case of full outer loop * full inner loop, so it is n^2 of course. Because the second case has one instance where it is almost full outer loop * full inner loop, it has to be lumped into the same pile of n^2, since that is its worst case. From there it only gets better, and its average compared to the first function is much lower. Regardless, as n grows, your functions' running time grows quadratically, and that is all Big O really tells you. The quadratic curves can vary widely, but at the end of the day, they are all of the same type.