The time complexity of an algorithm is defined as the amount of time it takes to run, as a function of the length of the input.
If I have a simple for loop in a C function that runs for a given input n, then:
the length of the input is log n (the number of bits required to represent n).
Since the input length is log n and the loop runs n times, the code runs exponentially many times in its input length (2^(log n) = n).
C code:

int forfunction(unsigned int n) {
    unsigned int i = 0;
    for (; i < n; i++) {
        // do something ordinary
    }
    return 0;
}
The for loop above is an example.
But we never hear anyone say that such a for loop program is exponential in its input (the bits required to store n). Why is that? The only difference I see is that this is a program, while time complexity is defined for an algorithm. But even so, why is this not taken into account when we want to do a rough time complexity analysis of a program?
EDIT:
Further clarification: I find it reasonable to claim it is exponential in its input (I might be wrong =) ). If so, and a simple for loop is exponential, then what about other hard problems? Clearly this for loop is not a worry for anyone today. Why not? Why does this not have (much) real-world impact compared to other hard problems (EXP, NP-Hard, etc.)? Note: I am using "hard" the way it is used to define NP-Hard problems.
Elaborating on @Anonymous's answer: The question you should be asking is "exponential in what?" Ultimately, whether this is exponential time depends on how n is presented to you.
If n is given to you as an explicit binary integer using O(log n) bits, then this function will run in pseudopolynomial time (technically exponential in the number of input bits, but polynomial in the numeric value of the input). This is why simple primality testing algorithms like trial division (divide n by all numbers from 2 up to √n and see if any of them are factors) technically run in exponential time even though they do run in time O(√n).
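As a concrete sketch of the trial-division idea in Python (my own illustration; the function name and sample value are assumptions, not from the original answer):

import math

def is_prime_trial_division(n):
    # Try every candidate divisor from 2 up to sqrt(n).
    # The loop runs O(sqrt(n)) times in the *value* of n, but n is written
    # down with only about log2(n) bits, so this is exponential in the
    # input length: sqrt(n) = 2^(0.5 * log2(n)).
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

print(is_prime_trial_division(1000003))  # at most ~1000 trial divisions for a ~20-bit input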
On the other hand, if n is given to you implicitly using O(n) bits (perhaps as the number of nodes in a graph given an adjacency matrix, or perhaps as the number of characters in a string given a string), then the runtime is polynomial because the input has at least linear size and linear work is done. This is why algorithms like DFS or BFS, which have runtimes of the form O(m + n), run in polynomial time: the number of bits in the input is Ω(m + n).
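To make the second case concrete, here is a minimal BFS sketch (my own illustration, not part of the original answer), assuming the graph arrives as an adjacency list:

from collections import deque

def bfs(adj, start):
    # Each vertex is enqueued at most once and each adjacency entry is
    # scanned at most once, so the work is O(n + m). The adjacency list
    # itself already takes Omega(n + m) bits to write down, so this is
    # linear (hence polynomial) in the input size.
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order

print(bfs({0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}, 0))  # [0, 1, 2, 3]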
Hope this helps!
The amount of time a function takes is a function parameterised by something. Often it'll be the size of the input, but other times it's an explicit parameter, and it's up to you, when you're describing a function, to make it clear what you mean. Because it's often "obvious" from context what the parameterisation is, it's often omitted, which leads to a lot of confusion when the parameterisation is not obvious to everyone.
When you add the word "complexity" then all that means is that instead of describing a function, you're saying it belongs to a particular class of functions. It doesn't obviate the need to say what the function is and what its argument is.
Technically speaking, the for loop (and, for that matter, all linear programs) is exponential in the size of its input, but this is not how the runtime is usually described, for a simple reason: runtime is defined by how it varies as the input changes. In problems like this one, the number of input bits is considered constant. For example, you might define the algorithm only for integer input, so its input always has 32 bits. Even if the value of n changes, the number of bits does not, and constant terms cannot describe the growth of an algorithm, so they are omitted.
Related
I've seen coding problems that are similar to this:
int doSomething(String s)
Where it says in the problem description that s will contain at most one of every character, so s cannot be more than length 26. I think in this case, iterating over s would be constant time.
But I've also seen problems where inputs are constrained to a random large number, like 10^5, just to avoid stack overflows and other weird edge cases. If we are going to consider inputs that are constrained by constants to be constant complexity, shouldn't these inputs also be considered constant complexity?
But it doesn't make sense to me to consider s to be of O(n) complexity either, because there are many problems where people allocate char[26] arrays to hold every letter of the alphabet. How does it make sense to consider an input that we know will be less than or equal to 26 to be of greater complexity than an array of size 26?
The point of analyzing the complexity of an algorithm is to estimate how long it will take to run. If the problem you're trying to solve limits the maximum value of n to a constant, you can consider n to be a constant and you wouldn't be wrong. But would that be useful if you wanted to predict whether an algorithm that does 2^n operations will run in a few seconds for n = 26? On the other hand, if you had an algorithm that does n*m operations and m is at most 3, how useful would it be to include m in the complexity analysis?
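A rough back-of-the-envelope check of those two scenarios (the operations-per-second figure and the value of n in the second case are my own ballpark assumptions, not something from the answer):

# Assume roughly 10^8 simple operations per second (a common ballpark).
ops_per_second = 10**8

n = 26
print(2**n / ops_per_second)   # ~0.67 s: the 2^n term matters even with n <= 26

n, m = 10**6, 3
print(n * m / ops_per_second)  # ~0.03 s: m <= 3 is just a small constant factor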
Complexity analysis focuses on whichever variable is most critical to the running time. If the running time is dominated by the length of s, then that length is the main focus of the analysis and it should appear in the big-O notation. And in that case, of course, it's not a constant.
Suppose the input is constrained to a large number like 10^5, and the algorithm gets slower in proportion to that input. For example:

int sort(string s); // length of s is less than 10^5

In this case, depending on which sorting algorithm you use, the running time will be proportional to the length of s, like O(n^2) or O(n log n) where n is the length of s. Here you cannot say it's constant, because the running time varies greatly as the length of s changes.
But if the algorithm inside has nothing to do with the length of s, i.e. it runs in constant time, then you can say the 10^5 constraint is just a constant.
Let's say you had an algorithm with n^(-1/2) complexity, say a scientific algorithm where one sample doesn't give much information, so it takes ages to process, but having many samples to cross-reference makes it faster. Would you represent that as O(n^(-1/2))? Is that even possible theoretically? TL;DR: can you have an inverse, i.e. decreasing, time complexity?
You could define O(n^(-0.5)) using this set:
O(n^(-0.5)) := { g(n) : there exist positive constants c and N such that 0 <= g(n) <= c * n^(-0.5) for all n > N }.
The function n^(-1), for example, belongs to this set.
None of the elements of the set above, however, could be an upper bound on the running time of an algorithm.
Note that, for any constant c:
if n > c^2, then c * n^(-0.5) < 1.
This means that your algorithm would perform fewer than one simple operation for large enough input. Since it must execute a natural number of simple operations, it performs exactly 0 operations, i.e. nothing at all.
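A quick numeric sanity check of that inequality (my own sketch; the value of c is arbitrary):

c = 1000
for n in [c**2 // 2, c**2, 2 * c**2, 100 * c**2]:
    print(n, c * n**-0.5)
# The bound c * n^(-0.5) equals 1 at n = c^2 and drops below 1 for every n > c^2,
# so the "allowed" whole-number operation count past that point is 0.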
A decreasing running time doesn't make sense in practice (even less if it decreases to zero). If that existed, you would find ways to add dummy elements and increase N artificially.
But most algorithms have at least O(N) complexity (whenever every data element influences the final solution); even if not, the representation of N itself gets longer and longer, which will eventually increase the running time (like O(log N)).
I’m having a hard time using O(n) principles to generalize the time complexity of an algorithm whose more specific time complexity is O(sum(a)) where a is an array of integers.
My intuition is that this time complexity should generalize to O(n): you can think of it as a "linear" sum of k_i values that occur n times, where k_i is the integer value at position i in the array (with every k_i = 1 you get the straight-up O(n) case).
But it doesn't seem to be exactly the same as O(n): the value of each k_i could be much larger than n, and if all these values are large you have something that could be O(n^2) or O(n^3), depending on how large they are.
Is this something to take into account for O(n) complexity where n is the length of the array? Should I actually be defining n as the sum of all elements in the array instead of the length of the array?
In general, what would be the best way to think about this?
Fundamentally, we want to describe the runtime of an algorithm based on the input. The "runtime" is a vague term that is often swept under the rug. For example, the "runtime" of a sorting algorithm or a hashtable operation is measured in the number of comparisons, but using "runtime" to mean the number of basic operations (which are also usually only vaguely defined) is also possible.
There are two choices (or simplifications) often made when calculating runtime. The first is to ignore the actual input and to use the size of the input (measured somehow) instead. This size is usually denoted n. The second is to use big-O notation to describe the worst case (or best case, or average, or amortized...).
Neither of these choices is always necessary, and sometimes, they won't make sense. To repeat, since this is the crux of the answer: describing runtimes in big-O of n is not the only way to describe runtimes and sometimes it makes no sense to do so.
For example, in the case of an algorithm that runs in O(sum(a)) time:
def f(a):
    t = 0
    for x in a:
        for i in range(1, x + 1):
            t += 1
    return t
It's not useful to describe the runtime of this using the length of the input array a. It's not useful because the length of a doesn't say anything about the worst-case runtime.
Saying that t is incremented sum(a) times is a useful statement about the runtime of the program. It doesn't use big-O complexity notation.
And if you do want to express that in big-O notation, you can say that the runtime of this code is O(sum(a)). This blurs exactly what you're measuring as the runtime, because you may be including the cost of executing statements other than incrementing t.
And going back to the example, you could (and if you were studying complexity classes, you probably would) say n is the size (in bits) of the input array. Then you could say something about the runtime (measured in basic operations): it's O(2^n), since the worst case input is an array with one element which takes the value 2^n-1 (*note).
*note: this ignores some technical details about how to encode an array using bits.
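A small hypothetical illustration of that worst case: for roughly the same number of input bits, sum(a) is smallest when the bits are spread over many tiny elements and largest when they all encode one huge element.

bits = 32

many_small = [1] * bits        # ~bits elements of value 1: sum(a) == bits
one_huge   = [2**bits - 1]     # a single element near 2^bits: sum(a) ~= 2^bits

print(sum(many_small))  # 32
print(sum(one_huge))    # 4294967295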
According to this question Time complexity to convert a decimal to another base
One of the answers states that
Strictly speaking, the answer is O(1).
If int was an integer type that supported arbitrary precision, then
clearly the answer would be O(logN).
But it is not! An int can get no larger than Integer.MAX_INT, which is 2^31 - 1 ... or roughly 2 billion.
So, if we let N (the unbounded integer) tend to infinity, the value of
num wraps around so that it never exceeds Integer.MAX_INT. That means
that if (for example) base is 10, the while loop can execute at most
log10(2^31) times (i.e. 10 times) ... and convertToBase is O(1).
However, if you are prepared to abuse the terminology / notation, you
could say that it is O(logN) for small enough N.
This led me to think that every algorithm defined as public myAlgorithm(int i) is going to be bounded? Let's say I am required to print the numbers from 0 to n as strings.
The code will just be
public void myAlgorithm(int n) {
    for (int i = 0; i <= n; i++) System.out.println(i);
}
This is clearly O(n) right? But we can just use the "bounded" argument to call it O(1).
Can somebody give me a clearer insight on how I should approach this time complexity?
Because the algorithm is running on the JVM, the input is bounded. But this is really only a limitation of the implementation, not of the algorithm itself. You could theoretically take the algorithm and run it on a variation of Java that has 64-bit integers, or integers of any arbitrary size, and it would still be correct.
Because the algorithm doesn't rely on the fact that integers are bounded, then the time-complexity shouldn't either.
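One way to see this, sketched in Python (my own illustration) because Python's integers are arbitrary precision: the same loop stays correct with no Integer.MAX_INT cap in sight, and its running time is still O(n) in the value of n.

def my_algorithm(n):
    # Same loop as the Java snippet above, but Python ints are unbounded,
    # so there is no wraparound to hide behind: the body runs n + 1 times.
    for i in range(n + 1):
        print(i)

my_algorithm(10)       # prints 0..10
# my_algorithm(2**40)  # still well-defined, no overflow -- just very slow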
I would say it is the quoted answer that is abusing the terminology/notation. Followed to its logical extreme, EVERYTHING would be O(1), which completely eliminates any utility of determining complexity.
The whole point of algorithmic complexity is that it is an abstraction used to help understand the underlying structure of a problem/algorithm.
This is clearly O(n) right? But we can just use the "bounded" argument to call it O(1).
Acquiring the bounded result - well, a reference to it - takes O(1), but printing the result (a string of n characters) takes O(n) anyway. This logic works when n becomes big enough to measure/notice the difference. For example, printing 2^32 characters will take long enough to notice, thanks to all the scrolling etc.
Even more: if you take an algorithm that computes n! and must produce the result as a correct string representation of the real value of n! (not modulo 2^64), it will take ages just to print the characters of, say, 10050050!.
So when we do an iterative solution to find the nth number in a Fibonacci sequence, we run a for loop (n-2) times. This would mean that the time complexity would be O(n). Is this correct or would it actually be pseudo-polynomial depending on the number of bits of the input, much like the Knapsack problem?
Here, I assume Fib(n) is an iterative version of a program that computes Fibonacci numbers. Perhaps something like:
def Fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
"Fib(n) is pseudo-polynomial" means in this context that computing Fib is bounded by a polynomial of its argument, n, but isn't bounded by a polynomial function of the size of the argument, log(n). That's true in this case.
"Fib(n) is O(n)" is a statement about the running time of Fib with respect to the value of its argument. There's sometimes ambiguity what "n" is, but here there's none -- it's the input to Fib, otherwise "n" would refer to two different things in the original statement. That's true here (although see the technical side-note below).
"Fib is O(n)" is ambiguous. There are people who will tell you that n clearly refers to the argument, and there's others who will tell you that n always refers to the size of the argument. The truth is that it's ambiguous and if it's not clear in context you should say what you mean (or ask what it means if you hear it and are confused). One context where it's not ambiguous is when you're talking about classes of P/NP problems -- there it's assumed that complexities are always relative to the size of the input.
A technical side-note
The iterative version of Fib(n) performs O(n) arithmetic operations, but whether it's O(n) time depends on your computational model, and specifically whether it can perform arbitrary integer arithmetic operations in O(1) time. Personally, I'd be careful and say "Fib(n) performs O(n) arithmetic operations" rather than "Fib(n) is O(n)" -- and if you plot the running time of Fib(n), you'll find it's not linear time in practice, as real bignum implementations are certainly not O(1) for all basic operations.
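A small sketch (my own, reusing the iterative Fib above) of why the additions are not O(1): the numbers themselves grow by a constant number of bits per iteration, so additions near the end of the loop each touch Θ(n) bits.

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Fib(n) has about 0.694 * n bits (log2 of the golden ratio), so the
# additions near the end of the loop are Theta(n)-bit operations.
for n in [100, 1000, 10000]:
    print(n, fib(n).bit_length())  # bit length grows roughly as 0.69 * n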
Yes, it is in fact O(n). The time complexity of the Knapsack problem is a really weird one and is an exception.