What is the net running time of the below expression?

For a recursive algorithm, I came up with the following expression for the running time: Σ_{k=0}^{n} 4^k (k+1). But I am not clear on how to simplify this and express it in Big-O notation.
If it were just 4^k, then I know it is simply a GP series and we can take the last term, 4^n, as the worst-case running time. Help me understand how to deal with the (k+1) factor here.

Just try to simplify the term a little bit
Σ_{k=0}^{n} 4^k (k+1) < Σ_{k=0}^{n} 4^k (n+1) = (n+1) Σ_{k=0}^{n} 4^k
So this is in O(n·4^n). And this bound is tight, since 4^n(n+1) is part of the sum.
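For completeness, here is the last step written out with the standard geometric-series closed form (the closed form is my addition, not spelled out above):

\sum_{k=0}^{n} 4^k (k+1) \;\le\; (n+1) \sum_{k=0}^{n} 4^k \;=\; (n+1) \cdot \frac{4^{n+1} - 1}{3} \;\le\; \frac{4}{3}\,(n+1)\, 4^n \;\in\; O(n \cdot 4^n)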
Notice: what you mean by "running time" is usually called "complexity".

Related

Time Complexity (Big O) - Can the value of N decide whether the time complexity is O(1) or O(N) when we have 2 nested FOR loops?

Suppose that I have 2 nested for loops, and 1 array of size N as shown in my code below:
int result = 0;
for (int i = 0; i < N; i++)
{
    for (int j = i; j < N; j++)
    {
        result = array[i] + array[j]; // just some funny operation
    }
}
Here are 2 cases:
(1) If the constraint is that N >= 1,000,000 strictly, then we can definitely say that the time complexity is O(N^2). This is true for sure, as we all know.
(2) Now, if the constraint is that N < 25 strictly, then people could probably say that, because N is always so small, the time complexity is effectively O(1), since it takes very little time to run and complete these 2 for loops WITH MODERN COMPUTERS? Does that sound right?
Please tell me whether the value of N plays a role in deciding the time complexity O(N). If yes, how big does the value of N need to be in order to play that role (1,000? 5,000? 20,000? 500,000?)? In other words, what is the general rule of thumb here?
INTERESTING THEORETICAL QUESTION: If, 15 years from now, computers are so fast that even with N = 25,000,000 these 2 for loops complete in 1 second, can we say at that time that the time complexity is O(1) even for N = 25,000,000? I suppose the answer would be YES. Do you agree?
tl;dr: No. The value of N has no effect on time complexity. O(1) versus O(N) is a statement about "all N", i.e. about how the amount of computation grows as N grows.
Great question! It reminds me of when I was first trying to understand time complexity. I think many people have to go through a similar journey before it ever starts to make sense so I hope this discussion can help others.
First of all, your "funny operation" is actually funnier than you think since your entire nested for-loops can be replaced with:
result = array[N - 1] + array[N - 1]; // just some hilarious operation hahaha ha ha
Since result is overwritten on each iteration, only the last iteration affects the outcome. We'll come back to this.
As far as what you're really asking here, the purpose of Big-O is to provide a meaningful way to compare algorithms in a way that is independent of input size and independent of the computer's processing speed. In other words, O(1) versus O(N) has nothing to do with the size of N and nothing to do with how "modern" your computer is. Those things affect the execution time of the algorithm on a particular machine with a particular input, but they do not affect the time complexity, i.e. O(1) versus O(N).
It is actually a statement about the algorithm itself, so a math discussion is unavoidable, as dxiv has so graciously alluded to in his comment. Disclaimer: I'm going to omit certain nuances in the math since the critical stuff is already a lot to explain and I'll defer to the mountains of complete explanations elsewhere on the web and textbooks.
Your code is a great example to understand what Big-O does tell us. The way you wrote it, its complexity is O(N^2). That means that no matter what machine or what era you run your code in, if you were to count the number of operations the computer has to do for each N and graph it as a function, say f(N), there exists some quadratic function, say g(N) = 9999N^2 + 99999N + 999, that is greater than f(N) for all sufficiently large N.
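To make the idea of counting operations and graphing f(N) concrete, here is a minimal sketch (my code, not the answerer's); it only counts inner-loop iterations and ignores constant per-loop overhead:

#include <cstdio>

// Count how many times the inner statement of the question's nested loops runs
// for a given N. The loop structure is copied from the question; the counting
// is the only addition.
long long countOperations(long long N) {
    long long ops = 0;
    for (long long i = 0; i < N; i++)
        for (long long j = i; j < N; j++)
            ops++;                     // one "funny operation" per inner iteration
    return ops;                        // works out to N * (N + 1) / 2
}

int main() {
    for (long long N : {10LL, 100LL, 1000LL, 10000LL}) {
        long long f = countOperations(N);
        std::printf("N = %-6lld  f(N) = %-10lld  N^2 = %lld\n", N, f, N * N);
    }
}

For this code f(N) = N(N+1)/2, which is at most N^2 for every N >= 1, so any quadratic g(N) with big enough coefficients serves as the upper bound described above.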
But wait, if we just need to find big enough coefficients in order for g(N) to be an upper bound, can't we just claim that the algorithm is O(N) and find some g(N) = aN + b with gigantic enough coefficients that it's an upper bound of f(N)? THE ANSWER TO THIS IS THE MOST IMPORTANT MATH OBSERVATION YOU NEED TO UNDERSTAND TO REALLY UNDERSTAND BIG-O NOTATION. Spoiler alert: the answer is no.
For visuals, try this graph on Desmos, where you can adjust the coefficients: https://www.desmos.com/calculator/3ppk6shwem
No matter what coefficients you choose, a function of the form aN^2+bN+c will ALWAYS eventually outgrow a function of the form aN+b (both having positive a). You can push a line as high as you want like g(N)=99999N+99999, but even the function f(N)=0.01N^2+0.01N+0.01 crosses that line and grows past it after N=9999900. There is no linear function that is an upper bound to a quadratic. Similarly, there is no constant function that is an upper bound to a linear function or quadratic function. Yet, we can find a quadratic upper bound to this f(N) such as h(N)=0.01N^2+0.01N+0.02, so f(N) is in O(N^2). This observation is what allows us to just say O(1) and O(N^2) without having to distinguish between O(1), O(3), O(999), O(4N+3), O(23N+2), O(34N^2+4+e^N), etc. By using phrases like "there exists a function such that" we can brush all the constant coefficients under the rug.
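As a quick sanity check on that crossover claim (my sketch, not part of the answer), you can search for the first N where the small quadratic overtakes the pushed-up line:

#include <cstdio>

int main() {
    // f(N) = 0.01N^2 + 0.01N + 0.01 versus the line g(N) = 99999N + 99999
    for (long long N = 1; ; N++) {
        double f = 0.01 * N * N + 0.01 * N + 0.01;
        double g = 99999.0 * N + 99999.0;
        if (f > g) {                   // first N where the quadratic wins
            std::printf("quadratic overtakes the line at N = %lld\n", N);   // ~9,999,900
            break;
        }
    }
}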
So having a quadratic upper bound, aka being in O(N^2), means that the function f(N) is no bigger than quadratic, and in this case it happens to be exactly quadratic. It sounds like this just comes down to comparing the degrees of polynomials, so why not just say that the algorithm is a degree-2 algorithm? Why do we need this super abstract "there exists an upper bound function such that bla bla bla..."? This is the generalization necessary for Big-O to account for non-polynomial functions, some common ones being log N, N log N, and e^N.
For example if the number of operations required by your algorithm is given by f(N)=floor(50+50*sin(N)), we would say that it's O(1) because there is a constant function, e.g. g(N)=101 that is an upper bound to f(N). In this example, you have some bizarre algorithm with oscillating execution times, but you can convey to someone else how much it doesn't slow down for large inputs by simply saying that it's O(1). Neat. Plus we have a way to meaningfully say that this algorithm with trigonometric execution time is more efficient than one with linear complexity O(N). Neat. Notice how it doesn't matter how fast the computer is because we're not measuring in seconds, we're measuring in operations. So you can evaluate the algorithm by hand on paper and it's still O(1) even if it takes you all day.
As for the example in your question, we know it's O(N^2) because there are aN^2 + bN + c operations involved for some a, b, c. It can't be O(N), let alone O(1), because no matter what aN + b you pick, I can find a large enough input size N such that your algorithm requires more than aN + b operations. On any computer, in any time zone, with any chance of rain outside. Nothing physical affects O(1) versus O(N) versus O(N^2). What changes it to O(1) is changing the algorithm itself to the one-liner that I provided above, where you just add two numbers and spit out the result no matter what N is. Let's say for N = 10 it takes 4 operations to do both array lookups, the addition, and the variable assignment. If you run it again on the same machine with N = 10,000,000 it's still doing the same 4 operations. The amount of operations required by the algorithm doesn't grow with N. That's why the algorithm is O(1).
This is why problems like finding an O(N log N) algorithm to sort an array are math problems and not nanotechnology problems. Big-O doesn't even assume you have a computer with electronics.
Hopefully this rant gives you a hint as to what you don't understand so you can do more effective studying for a complete understanding. There's no way to cover everything needed in one post here. It was some good soul-searching for me, so thanks.

When (not how or why) to calculate Big O of an algorithm

I was asked this question in an interview recently and was curious as to what others thought.
"When should you calculate Big O?"
Most sites/books talk about HOW to calc Big O but not actually when you should do it. I'm an entry level developer and I have minimal experience so I'm not sure if I'm thinking on the right track. My thinking is you would have a target Big O to work towards, develop the algorithm then calculate the Big O. Then try to refactor the algorithm for efficiency.
My question then becomes is this what actually happens in industry or am I far off?
"When should you calculate Big O?"
When you care about the Time Complexity of the algorithm.
When do I care?
When you need your algorithm to be able to scale, meaning that it's expected to take big datasets as input (e.g. the number of points and the number of dimensions in a nearest neighbor algorithm).
Most notably, when you want to compare algorithms!
You are asked to do a task to which several algorithms can be applied. Which one do you choose? You compare their space, time, and development/maintenance complexities, and choose the one that best fits your needs.
Big O or asymptotic notations allow us to analyze an algorithm's running time by identifying its behavior as the input size for the algorithm increases.
So whenever you need to analyse your algorithm's behavior with respect to growth of the input, you will calculate this. Let me give you an example -
Suppose you need to query over 1 billion data items, so you write a linear search algorithm. Is that okay? How would you know? You calculate Big-O. It's O(n) for linear search, so in the worst case it would execute 1 billion instructions per query. If your machine executes 10^7 instructions per second (let's assume), then it would take 100 seconds. So you see, you get a runtime analysis in terms of the growth of the input.
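A back-of-the-envelope sketch of that estimate (my code; the 10^7 instructions-per-second figure is the answer's assumption, not a measured number):

#include <cstdio>

int main() {
    const double instructionsPerSecond = 1e7;   // assumed machine speed from the answer
    for (double n : {1e6, 1e8, 1e9}) {          // number of records to scan
        // linear search, worst case: every record is examined once
        std::printf("n = %.0e  ->  ~%.0f seconds\n", n, n / instructionsPerSecond);
    }
}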
When we are solving an algorithmic problem we want to evaluate the algorithm irrespective of the hardware it runs on. So we have certain asymptotic notations with which we can define the time and space complexities of our algorithm.
Theta-Notation: bounds the function from above and below (a tight bound); it is often quoted for average-case complexity.
Omega-Notation: bounds the function from below; it is often quoted for best-case complexity.
Big-O Notation: bounds the function from above; this is the important one in practice, as it is typically quoted for worst-case complexity.
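For reference, the standard formal definitions behind these three notations (added here for context; they are not part of the original answer):

f(n) = O(g(n))      \iff \exists\, c > 0,\ n_0 : f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0
f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : f(n) \ge c \cdot g(n) \ \text{for all } n \ge n_0
f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \ \text{and} \ f(n) = \Omega(g(n))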
Now I think the answer to why Big-O is calculated is that it gives us a fair idea of how badly our algorithm can perform as the size of the input increases. And if we can optimize our algorithm for the worst case, then the average and best cases will take care of themselves.
I assume that you want to ask "when should I calculate time complexity?", just to avoid technicalities about Theta, Omega and Big-O.
The right attitude is to guess it almost always. Notable exceptions are pieces of code you will run just once and that you are sure will never receive a bigger input.
The emphasis on guess is because it does not matter that much whether the complexity is constant or logarithmic. There is also little difference between O(n^2) and O(n^2 log n), or between O(n^3) and O(n^4). But there is a big difference between constant and linear.
The main goal of the guess is to answer the question: "What happens if I get a 10 times larger input?" If the complexity is constant, nothing happens (in theory at least). If the complexity is linear, you get a 10 times longer running time. If the complexity is quadratic or worse, you start to have problems.
The secondary goal of the guess is to answer the question: "What is the biggest input I can handle?" Again, quadratic will get you up to about 10,000 at most, and O(2^n) ends around 25 (see the rough sketch below).
This might sound scary and time consuming, but in practice getting the time complexity of a piece of code is rather trivial, since most things are either constant, logarithmic, or linear.
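Here is the rough sketch referred to above (my code; the budget of 10^8 simple operations per run is an assumed rule-of-thumb figure, not something stated in the answer):

#include <cstdio>
#include <cmath>

int main() {
    const double budget = 1e8;   // assumed number of simple operations we can afford per run
    std::printf("O(n):    n up to ~%.0f\n", budget);
    std::printf("O(n^2):  n up to ~%.0f\n", std::sqrt(budget));   // ~10,000, as quoted above
    std::printf("O(n^3):  n up to ~%.0f\n", std::cbrt(budget));   // ~460
    std::printf("O(2^n):  n up to ~%.0f\n", std::log2(budget));   // ~27, close to the "around 25" above
}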
It represents the upper bound.
Big-oh is the most useful because it represents the worst-case behavior. So it guarantees that the program will terminate within a certain time period; it may stop earlier, but never later.
It gives the worst-case time complexity, or the maximum time required to execute the algorithm.

Is Big(O) machine dependent?

I am really confused by Big(O) notation. Is Big(O) machine dependent or machine independent? (Machine in the sense of the computer on which we run the algorithm.)
Will sorting 1,000 numbers using quicksort on an i3 processor and on an i7 processor be the same? Why don't we consider the machine and its processor speed when calculating the time complexity? I am a neophyte in this stuff.
Big-O is a measure of scalability, not of speed. It shows you what effect on time and memory it has when you e.g. double the amount of data - does the execution time double, or quadruple?
Whether you use i7 or i3, double is double. Whether a linear algorithm is fast or slow, double is double.
This also has another implication many people ignore. A complex algorithm such as O(n^3) can be faster than a simple algorithm such as O(n) for a given n that is below a certain limit. Example:
loop n times:
    loop n times:
        loop n times:
            sleep 1 second
is O(n^3), as it has 3 nested loops.
loop n times:
    sleep 10 seconds
is O(n), as it has only one loop. For n = 10 the first program executes for 1000 seconds, and the second one executes for only 100. "So O(n) is good!" one would be tempted to say. But if you have n = 2, the first, complex program executes in only 8 seconds, while the second, simpler one runs for 20! Even for n = 3, the first executes in 27 seconds, the second one in 30. So while n is low, a complex program might be able to outperform the simpler one. It's just that as n rises, the complex program gets slower much faster (if that makes sense) than a simple one. For n = 1000, the simple code has risen to only 10,000 seconds, but the complex one is now at 1,000,000,000 seconds!
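A tiny sketch (mine, not the answerer's) that reproduces those numbers by computing the total sleep time of each program:

#include <cstdio>

int main() {
    for (long long n : {2LL, 3LL, 4LL, 10LL, 1000LL}) {
        long long cubic  = n * n * n;   // O(n^3) program: sleeps 1 second, n^3 times
        long long linear = 10 * n;      // O(n)   program: sleeps 10 seconds, n times
        std::printf("n = %-5lld  cubic: %10lld s   linear: %7lld s\n", n, cubic, linear);
    }
}

The crossover is at n = 4 (64 seconds versus 40), after which the cubic program never catches up.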
Also, this clearly shows you that complexity is not processor-dependent. A second is a second.
EDIT: Also, you might want to read this question, where Big-O is explained in a number of very high-quality answers.
Big(O) Notation is the method of calculating the complexity of an algorithm, and hence the relative time it will take to run. The same algorithm, for the same data, will run faster on a faster processor, but will still take the same number of operations. It's used as a way of evaluating the relative efficiency of different algorithms to achieve the same result.
Big O notation is not architecture-dependent in any way; it is a mathematical construct. It is a very limited measure of algorithmic complexity: it only gives you a rough upper bound for how performance changes with data size.
Big(O) is algorithm-dependent. Its job is to help compare the relative costs of various algorithms, without the need to consider machine dependencies.
Linear search through an array will, on average, look at about half of the elements if the item is found. For all practical purposes that is O(N/2), which is the same as O(1/2 * N). For comparison, you toss away the coefficient; hence it is O(N) in use.
A binary tree can hold N elements for searching as well. On average it will look through about log base 2 of N of them to find something, hence you will see it described as costing O(log2(N)).
Pop in small values for N, and there isn't a whole lot of difference between the algorithms. Pop in a large value of N, and it will be clear that the binary tree lookup is much faster.
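A rough sketch (my code) of that comparison, using ~N/2 expected lookups for linear search (successful case) versus ~log2(N) for a balanced binary tree:

#include <cstdio>
#include <cmath>

int main() {
    for (double N : {10.0, 1000.0, 1e6, 1e9}) {
        std::printf("N = %.0e   linear ~%.0f lookups   binary tree ~%.1f lookups\n",
                    N, N / 2.0, std::log2(N));
    }
}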
Big(O) is not machine dependent. It is mathematical notation used to denote the complexity of an algorithm. Usually we use these notations in theory to compare the performance of algorithms.

Big O based comparison of an algorithm's running time

It was during a Data Structures and Algorithms lesson that I encountered this doubt. I have looked up several resources, both online and offline, but still have doubts. I was asked to figure out the running time of an algorithm and I did that correctly: O(n^3). But what puzzled me was this question: how much slower does the algorithm get if n goes from, say, 90 to 900,000? My doubts are:
The algorithm has a running time of O(n^3), so obviously it will take more time for a larger input. But how do I compare an algorithm's performance for different inputs based on just the worst-case time complexity?
Can we just plug in the values of n and divide the Big-O expressions to get a ratio?
Please correct me wherever I am mistaken! Thanks!
Obviously, your thinking is correct!
Since the algorithm's worst-case running time grows as n^3, increasing the size n increases the running time in proportion to n^3.
Your assumption is correct: you just divide the value for n = 900,000 by the value for n = 90 and obtain the result. This factor is the slowdown factor!
Here, slowdown factor = (900,000)^3 / (90)^3 = 10^12.
Hence, your program will slow down by a factor of 10^12. Such a drastic change! Or, in other words, its efficiency falls to 10^(-12) of what it was!
EDIT, based on suggested comments:
"But how do I compare an algorithm's performance for different inputs based on just the worst-case time complexity?"
As hinted by G.Bach, one of the commentators on this post, the basic idea underlying your question is itself problematic! You should really be talking about Big-Theta notation, rather than Big-O, to think of a general solution. Big-O is an upper bound, whereas Big-Theta is a tight bound. When people only worry about the worst that can happen, Big-O is sufficient; i.e. it says that "it can't get much worse than this". The tighter the bound the better, of course, but a tight bound isn't always easy to compute.
So, in a worst-case analysis your question would be answered the way both of us have answered it. One narrow reason that people use O instead of Ω is to drop disclaimers about worst or average cases. But for a tighter analysis you'll have to check both O and big-Omega as well to frame the question properly. The question can be worked out for various sizes of n, though; I leave that to you to figure out. If you have any doubts, PLEASE feel free to comment.
Also, your actual running time is not given directly by the worst-case analysis, but it is related to it, and the question can be reframed in those terms.
P.S. I have taken some ideas/statements from "Big-oh vs big-theta", so thanks to the authors too!
I hope this helps. Feel free to comment if I skimmed over anything!
Your understanding is correct.
If the only performance measure you have is worst-case time complexity, that's all you can compare. If you had more information you could make a better estimate.
Yes, just substitute the values of n and divide 900,000^3 by 90^3 to get the ratio.

Relation between worst case and average case running time of an algorithm

Let's say A(n) is the average running time of an algorithm and W(n) is the worst. Is it correct to say that
A(n) = O(W(n))
is always true?
The Big O notation is kind of tricky, since it only defines an upper bound on the execution time of a given algorithm.
What this means is: if f(x) = O(g(x)), then for every other function h(x) such that g(x) < h(x) you also have f(x) = O(h(x)). The problem is, are those overestimated execution times useful? And the clear answer is: not at all. What you usually want is the "smallest" upper bound you can get, but this is not strictly required by the definition, so you can play around with it.
You can get stricter bounds using the other notations, such as Big Theta, as you can read here.
So, the answer to your question is yes, A(n) = O(W(n)), but that doesn't give any useful information about the algorithm.
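For completeness, the one-line justification (my addition, using the standard definition of Big-O): for every input size n, the average running time over inputs of size n cannot exceed the maximum over those inputs, so

A(n) \le W(n) \ \text{for all } n \quad\Longrightarrow\quad A(n) \le 1 \cdot W(n) \ \text{for all } n \ge 1 \quad\Longrightarrow\quad A(n) = O(W(n))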
If A(n) and W(n) are functions, then yes, you can make such a statement in the general case; it follows from the formal definition of Big-O.
Note that in terms of Big-O there is little point in doing so, since it makes the real complexity harder to see. (In general, the three cases - worst, average, best - exist precisely to show the complexity more clearly.)
Yes, it is not a mistake to say so.
People use asymptotic notation to convey the growth of running time in specific cases in terms of the input size. Comparing the average-case complexity with the worst-case complexity doesn't provide much insight into the function's growth in either case.
While it is not wrong, it fails to provide more information than what we already know.
I'm unsure of exactly what you're trying to ask, but bear in mind the below.
The typical algorithm used to show the difference between average and worst case running time complexities is Quick Sort with poorly chosen pivots.
On average, with a random sample of unsorted data, the runtime complexity is n log(n). However, with an already-sorted set of data where pivots are taken from the front or end of the list, the runtime complexity is n^2.
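A minimal sketch (mine, not from the answer) of that degenerate behavior, always taking the first element of the range as the pivot:

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

static long long comparisons = 0;

// Quicksort with a poorly chosen pivot: the first element of the range.
// Sorts a[lo..hi] in place and counts element comparisons.
void quicksort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int pivot = a[lo];
    int i = lo;                                  // a[lo+1..i] holds elements < pivot
    for (int j = lo + 1; j <= hi; j++) {
        comparisons++;
        if (a[j] < pivot) std::swap(a[++i], a[j]);
    }
    std::swap(a[lo], a[i]);                      // put the pivot in its final place
    quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}

int main() {
    const int n = 2000;

    std::vector<int> data(n);
    for (int k = 0; k < n; k++) data[k] = k;     // already-sorted input: worst case here
    comparisons = 0;
    quicksort(data, 0, n - 1);
    std::printf("sorted input:   %lld comparisons (~n^2/2 = %d)\n", comparisons, n * n / 2);

    std::shuffle(data.begin(), data.end(), std::mt19937(42));   // random order: average case
    comparisons = 0;
    quicksort(data, 0, n - 1);
    std::printf("shuffled input: %lld comparisons (on the order of n*log2(n) = %.0f)\n",
                comparisons, n * std::log2(double(n)));
}

On the already-sorted input every partition peels off a single element, so the comparison count grows as ~n^2/2; on the shuffled input it stays proportional to n log n.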
