I was interested in knowing how to calculate the time and space complexity of recursive functions such as permutation generation or Fibonacci (described here).
In general, recursion appears in many more places than just permutations or Fibonacci, so I am looking for the approach generally followed to calculate time and space complexity.
Thank you
Take a look at http://www.cs.duke.edu/~ola/ap/recurrence.html
Time complexity and space complexity are the two things that characterize the performance of an algorithm. Nowadays, as space is relatively inexpensive, people care mostly about time complexity, and for recursive algorithms time complexity is usually expressed in terms of a recurrence relation.
Consider the binary search algorithm (searching for an element in a sorted array): each time, the middle element (mid) is selected and compared with the element (x) being searched for. If (mid > x), search the lower sub-array; otherwise, search the upper sub-array. If there are n elements in the array, and T(n) represents the time complexity of the algorithm, then
T(n) = T(n/2) + c, where c is a constant. With the given boundary condition, here T(1) = 1, we can solve for T(n); in this case, T(n) = O(log n).
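As a concrete illustration (a minimal sketch of my own, not from the original post), the recurrence above corresponds to a recursive binary search like this:

```python
def binary_search(arr, x, lo=0, hi=None):
    """Recursive binary search on a sorted list; returns index of x or -1."""
    if hi is None:
        hi = len(arr) - 1
    if lo > hi:                      # base case: empty range, constant work
        return -1
    mid = (lo + hi) // 2             # constant work c per call
    if arr[mid] == x:
        return mid
    elif arr[mid] > x:               # recurse on the lower half: T(n/2)
        return binary_search(arr, x, lo, mid - 1)
    else:                            # recurse on the upper half: T(n/2)
        return binary_search(arr, x, mid + 1, hi)
```

Each call does a constant amount of work and then recurses on half the remaining range, which is exactly the shape T(n) = T(n/2) + c.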
Related
In the analysis of the time complexity of algorithms, why do you take only the highest growing term?
My thoughts are that time complexity is generally not used as an accurate measurement of the performance of an algorithm but rather used to identify which group an algorithm belongs to.
Consider two separate loops, each nested two levels deep, both iterating through n items, so that the total complexity of the algorithm is 2n^2.
Why is it taken that this algorithm has complexity O(n^2) rather than O(2n^2)?
My other thought is that n^2 defines the shape of the computation-vs-input-size graph, or that we consider parallel computation when calculating the complexity, such that O(2n^2) = O(n^2).
It's true that Ax^2 is just a scaling of x^2; the shape remains quadratic.
My other question is: consider the same two separate loops, now with the first iterating through n items and the second through v items, so that the total complexity of the algorithm is n^2 + v^2 under sequential computation. Will the complexity be O(n^2 + v^2)?
Thank you
So first let me observe that two loops can be faster than one if the one does more than twice as much work in the loop body.
To fully answer your question, big-O describes a growth regime, but it doesn't describe what we're observing growth in. The usual assumption is something like processor cycles or machine instructions, but you can also count only floating-point operations (common in scientific computing), memory reads (probe complexity), cache misses, you name it.
If you want to count machine operations, no one's stopping you. But if you don't use big-O notation, then you have to be specific about which machine, and which implementation. Don Knuth famously invented an assembly language for The Art of Computer Programming so that he could do exactly that.
That's a lot of work though, so algorithm researchers typically follow the example of Angluin--Valiant instead, who introduced the unit-cost RAM. Their hypothesis was something like this: for any pair of the computers of the day, you could write a program to simulate one on the other where each source instruction used a constant number of target instructions. Therefore, by using big-O to erase the leading constant, you could make a statement about a large class of machines.
It's still useful to distinguish between broad classes of algorithms. In an old but particularly memorable demonstration, Jon Bentley showed that a linear-time algorithm on a TRS-80 (low-end microcomputer) could beat out a cubic algorithm on a Cray 1 (the fastest supercomputer of its day).
To answer the other question: yes, O(n² + v²) is correct if we don't know whether n or v dominates.
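To make the counting concrete, here is a small sketch (my own illustration, not from the question) that literally counts inner-loop iterations for the two shapes discussed: two sequential n×n loops give exactly 2n² iterations, and an n×n loop followed by a v×v loop gives n² + v²:

```python
def count_two_double_loops(n, v):
    """Count inner-body executions for two sequential two-level loops."""
    count = 0
    for _ in range(n):
        for _ in range(n):
            count += 1       # first n x n loop: n^2 iterations
    for _ in range(v):
        for _ in range(v):
            count += 1       # second v x v loop: v^2 iterations
    return count
```

With v = n the total is 2n², and big-O erases the leading 2; with independent n and v, neither term can be dropped unless we know which one dominates.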
You're using big-O notation to express time complexity, which gives an asymptotic upper bound.
For a function f(n) we define O(g(n)) as the set of functions
O(g(n)) = { f(n) : there exist positive constants c and n0
            such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0 }
Now, coming to the questions:
In the analysis of the time complexity of algorithms, why do you take only the highest growing term?
Consider an algorithm with f(n) = an² + bn + c.
Its time complexity is given by O(n²), or f(n) = O(n²),
because there will always be constants c' and n0 such that c'g(n) ≥ f(n), i.e. c'n² ≥ an² + bn + c, for n ≥ n0.
We consider only the highest-growing term an² and neglect bn + c, since in the long run an² will be much bigger than bn + c (e.g. for n = 10²⁰, n² is 10²⁰ times larger than n).
Why is it taken that this algorithm has complexity O(n^2) rather than O(2n^2)?
When a function is multiplied by a constant, it is only scaled by a constant amount, so even though 2n² > n², we write the time complexity as O(n²), since there exist constants c' and n0 (e.g. c' = 2) such that c'n² ≥ 2n².
Will the complexity be O(n^2+v^2)
Yes, the time complexity will be O(n² + v²), but if one variable is known to dominate the other, it simplifies to O(n²) or O(v²), depending on which dominates.
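A quick numerical sketch (with hypothetical constants a, b, c chosen purely for illustration) of why the lower-order terms become negligible:

```python
def f(n, a=3, b=100, c=500):
    """Hypothetical cost function f(n) = a*n^2 + b*n + c."""
    return a * n**2 + b * n + c

def leading_ratio(n):
    """f(n)/n^2 approaches the leading constant a as n grows,
    showing that bn + c is asymptotically negligible."""
    return f(n) / n**2
```

Even though b and c are much larger than a here, the ratio f(n)/n² settles toward a = 3 once n is large, which is why only the highest-growing term determines the complexity class.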
My thoughts are that time complexity is generally not used as an accurate measurement of the performance of an algorithm but rather used to identify which group an algorithm belongs to.
Time complexity is used to estimate performance in terms of the growth of a function. The exact performance of an algorithm is hard to determine precisely, which is why we rely on asymptotic bounds to get a rough idea of performance for large inputs.
You can take the case of Merge Sort (O(N lg N)) and Insertion Sort (O(N²)). Merge sort is asymptotically better than insertion sort, since N lg N < N² for large N, but when the input size is small (e.g. N = 10) insertion sort can outperform merge sort, mainly because the constant-factor overhead in merge sort increases its execution time.
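The two algorithms can be sketched as follows (a minimal illustration of my own; real constant factors depend on implementation and hardware, so any small-N crossover should be measured rather than assumed). Counting element comparisons gives a machine-independent proxy for the work done:

```python
def insertion_sort(a):
    """Insertion sort, O(n^2) worst case. Returns (sorted list, comparisons)."""
    a, comps = list(a), 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0:
            comps += 1
            if a[j] > key:
                a[j + 1] = a[j]      # shift larger element right
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comps

def merge_sort(a):
    """Merge sort, O(n log n). Returns (sorted list, comparisons)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, cl = merge_sort(a[:mid])
    right, cr = merge_sort(a[mid:])
    merged, i, j, comps = [], 0, 0, cl + cr
    while i < len(left) and j < len(right):
        comps += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])          # drain whichever side remains
    merged.extend(right[j:])
    return merged, comps
```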
I am reading about the Rabin-Karp algorithm on Wikipedia and the time complexity mentioned in there is O(n+m). Now, from my understanding, m is necessarily between 0 and n, so in the best case the complexity is O(n) and in the worst case it is also O(2n)=O(n), so why isn't it just O(n)?
Basically, Rabin-Karp expresses its asymptotic notation as O(m+n) as a means to express the fact that it takes linear time relative to m+n, and not just n. Essentially, the variables m and n have to mean something whenever you use asymptotic notation. For the Rabin-Karp algorithm, n represents the length of the text and m represents the length of the pattern. Note that O(2n) means the same thing as O(n), because O(2n) is still a linear function of just n. However, in the case of Rabin-Karp, m+n isn't really a function of just n. Rather, it's a function of both m and n, which are two independent variables. As such, O(m+n) doesn't reduce to O(n) in the way that O(2n) does.
I hope that makes sense. :-P
m and n measure different dimensions of the input data. Text of length n and patterns of length m is not the same as text of length 2n and patterns of length 0.
O(m+n) tells us that the complexity is proportional to both the length of the text and the length of the patterns.
There are some scenarios where stating the complexity in the form O(n+m) is more suitable than just saying O(max(m,n)).
Scenario:
Consider BFS (Breadth-First Search) or DFS (Depth-First Search) as the scenario.
It is more intuitive, and conveys more information, to say that the complexity is O(E+V) rather than O(max{E,V}). The former is in sync with the actual algorithmic description.
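A sketch of BFS over an adjacency list (my own illustration) makes the O(E+V) bound visible: every vertex is enqueued and dequeued at most once, and every adjacency entry is scanned once:

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first search over an adjacency list (dict: vertex -> neighbors).
    Each vertex is dequeued once (O(V)) and each edge entry is scanned a
    constant number of times (O(E) total), giving O(V + E) overall."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()          # each vertex processed exactly once
        order.append(u)
        for w in adj.get(u, []):     # each adjacency entry scanned once
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order
```

Neither the vertex work nor the edge work can be dropped in general, which is why the bound is written as the sum O(V + E).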
for i = 0 to size(arr)
for o = i + 1 to size(arr)
do stuff here
What's the worst-time complexity of this? It's not N^2, because the second one decreases by one every i loop. It's not N, it should be bigger. N-1 + N-2 + N-3 + ... + N-N+1.
It is N ^ 2, since it's the product of two linear complexities.
(There's a reason asymptotic complexity is called asymptotic and not identical...)
See Wikipedia's explanation on the simplifications made.
Think of it like you are working with a n x n matrix. You are approximately working on half of the elements in the matrix, but O(n^2/2) is the same as O(n^2).
When you want to determine the complexity class of an algorithm, all you need is to find the fastest growing term in the complexity function of the algorithm. For example, if you have complexity function f(n)=n^2-10000*n+400, to find O(f(n)), you just have to find the "strongest" term in the function. Why? Because for n big enough, only that term dictates the behavior of the entire function. Having said that, it is easy to see that both f1(n)=n^2-n-4 and f2(n)=n^2 are in O(n^2). However, they, for the same input size n, don't run for the same amount of time.
In your algorithm, if n=size(arr), the do stuff here code will run f(n)=n+(n-1)+(n-2)+...+2+1 times. It is easy to see that f(n) represents a sum of an arithmetic series, which means f(n)=n*(n+1)/2, i.e. f(n)=0.5*n^2+0.5*n. If we assume that do stuff here is O(1), then your algorithm has O(n^2) complexity.
for i = 0 to size(arr)
I assumed that the loop ends when i becomes greater than size(arr), not when it becomes equal to it. However, if the latter is the case, then f(n)=0.5*n^2-0.5*n, and it is still in O(n^2). Remember that O(1), O(n), O(n^2), ... are complexity classes, and that complexity functions of algorithms are functions that describe, for input size n, how many steps the algorithm takes.
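A quick sketch (my own) that verifies both closed forms by counting the inner-body executions directly, one count for each loop-bound convention discussed above:

```python
def count_exclusive(n):
    """for i in 0..n-1: for o in i+1..n-1: body  ->  n(n-1)/2 iterations."""
    return sum(1 for i in range(n) for o in range(i + 1, n))

def count_inclusive(n):
    """for i in 0..n: for o in i+1..n: body  ->  n(n+1)/2 iterations."""
    return sum(1 for i in range(n + 1) for o in range(i + 1, n + 1))
```

Either way the count is 0.5n² ± 0.5n, so both conventions land in O(n²).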
It's n*(n-1)/2, which is in O(n^2).
This is a basic question... but I'm thinking that O(M+N) is the same as O(max(M,N)), since the larger term should dominate as we go to infinity? Also, that would be different from O(min(M,N)), is that right? I keep seeing this notation, esp. when discussing graph algorithms. For example, you routinely see: O(|V| + |E|) (e.g., http://algs4.cs.princeton.edu/41undirected/).
Yes, O(M+N) means the same thing as O(max(M, N)). That is different from O(min(M, N)). As @Dr_Asik says, O(M+N) is technically linear, but when M and N have a meaning, it is nice to be able to say "linear in what?" Imagine the algorithm is linear in the number of rows and the number of columns. We can either define N = rows + cols and say O(N), or we can say O(M+N) where M is rows and N is columns.
Linear time is noted O(N). Since (M+N) is a linear function, it should simply be noted O(N) as well. Likewise there is no sense in comparing O(1) to O(2), O(10) etc., they're all constant time and should all be noted O(1).
I know this is an old thread, but as I am studying this now I figured I would add my two cents for those currently searching similar questions.
I would argue that O(n+m), in the context of a graph represented as an adjacency list, is exactly that and cannot be changed for the following reasons:
1) O(n+m) = O(n) + O(m), but O(m) is upper bounded by O(n^2) so that
O(n+m) = O(n) + O(n^2) = O(n^2). However, this is purely in terms of n only; that is, it only takes the vertices into account and gives a weak upper bound (weak because it tries to represent the edges in terms of vertices). This does show, though, that O(n) does not equal O(n+m), as there COULD be a quadratic number of edges compared to vertices.
2) Saying O(n+m) takes into account all the elements that have to be passed through when implementing an algorithm such as Breadth-First Search (BFS). As it touches every element of the graph exactly once, it can be considered linear, and it is a stricter analysis than upper-bounding the edges with n^2. One could, for the sake of notation, write something like n = |V| + |E|, so that BFS runs in O(n) and gives the reader a sense of linearity; but generally, as the OP has mentioned, it is written as O(n+m) where
n = |V| and m = |E|.
Thanks a lot, hope this helps someone.
Comparison-based sorting is Ω(n log n), so we know that mergesort can't be O(n). Nevertheless, I can't find the problem with the following proof:
Proposition P(n): For a list of length n, mergesort takes O(n) time.
P(0): merge sort on the empty list just returns the empty list.
Strong induction: Assume P(1), ..., P(n-1) and try to prove P(n). We know that at each step in a recursive mergesort, two approximately "half-lists" are mergesorted and then "zipped up". The mergesorting of each half list takes, by induction, O(n/2) time. The zipping up takes O(n) time. So the algorithm has a recurrence relation of M(n) = 2M(n/2) + O(n) which is 2O(n/2) + O(n) which is O(n).
Compare the "proof" that linear search is O(1).
Linear search on an empty array is O(1).
Linear search on a nonempty array compares the first element (O(1)) and then searches the rest of the array (O(1)). O(1) + O(1) = O(1).
The problem here is that, for the induction to work, there must be one big-O constant that works both for the hypothesis and the conclusion. That's impossible here and impossible for your proof.
The "proof" only covers a single pass; it doesn't cover the log n number of passes.
The recurrence only shows the cost of a pass as compared to the cost of the previous pass. To be correct, the recurrence relation should have the cumulative cost rather than the incremental cost.
You can see where the proof falls down by viewing the sample merge sort at http://en.wikipedia.org/wiki/Merge_sort
Here is the crux: all induction steps which refer to particular values of n must refer to a particular function T(n), not to O() notation!
O(M(n)) notation is a statement about the behavior of the whole function from problem size to performance guarantee (asymptotically, as n increases without limit). The goal of your induction is to determine a performance bound T(n), which can then be simplified (by dropping constant and lower-order factors) to O(M(n)).
In particular, one problem with your proof is that you can't get from your statement purely about O() back to a statement about T(n) for a given n. O() notation allows you to ignore a constant factor for an entire function; it doesn't allow you to ignore a constant factor over and over again while constructing the same function recursively...
You can still use O() notation to simplify your proof, by demonstrating:
T(n) = F(n) + O(something less significant than F(n))
and propagating this predicate in the usual inductive way. But you need to preserve the constant factor of F(): this constant factor has direct bearing on the solution of your divide-and-conquer recurrence!
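To see concretely why the constant must be preserved, here is a sketch of solving the mergesort recurrence with an explicit constant c (assuming T(1) = c and n a power of two):

```
T(n) = 2T(n/2) + cn
     = 4T(n/4) + cn + cn
     = 8T(n/8) + cn + cn + cn
     ...
     = 2^k T(n/2^k) + k·cn
     = n·T(1) + cn·log2(n)        (at depth k = log2 n)
     = cn + cn·log2(n)
     = Θ(n log n)
```

Every one of the log2 n levels contributes the same cn term. The flawed induction discards that constant at each level, which is precisely how it loses the log n factor.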