How to determine the complexity of an algorithm function?

How do you know whether an algorithm takes linear, constant, or logarithmic time for a specific operation? Does it depend on CPU cycles?

There are (at least) three ways you can do it.
Look up the algorithm on the net and see what it says about its time complexity.
Examine the algorithm yourself: look at things like nested loops and recursion, and work out how often each loop runs or each recursive call is made as a function of the input size. An extension of this is a rigorous mathematical analysis.
Experiment. Vary the input size and measure how long the algorithm takes, then derive an equation that predicts the runtime from that size (solving simultaneous equations is one possibility here for O(n^c)-type functions); a minimal sketch of this appears below.
Of these, the first is probably the easiest for the layman, since the published figure will almost certainly have been produced by someone more knowledgeable doing the second :-)
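For the experimental route, here is a rough Python sketch of the idea; my_algorithm is just a placeholder for whatever you are actually measuring, and the input sizes are arbitrary:

    import time

    def my_algorithm(n):
        # placeholder for whatever algorithm you are measuring
        return sum(i * i for i in range(n))

    for n in (1_000, 10_000, 100_000, 1_000_000):
        start = time.perf_counter()
        my_algorithm(n)
        elapsed = time.perf_counter() - start
        # if the time grows roughly 10x when n does, growth looks linear;
        # roughly 100x suggests quadratic; nearly flat suggests constant/logarithmic
        print(f"n={n:>9}  time={elapsed:.6f}s")

In practice you would repeat each measurement a few times and average, since warm-up effects and background load add noise to any single run.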

In general a function may take any amount of time to execute an algorithm; the growth can be highly non-linear, and the algorithm may not even terminate.
In short, if you have an algorithm, its cost is analysed using the abstraction of a Turing machine: you count the number of operations required before the machine halts, as a function of the input size.
You can get more precise information here: WIKI::Computational complexity theory

About the dependency on the CPU:
The answer is no: time complexity is completely CPU-independent. Complexity describes how an algorithm's demand for resources grows as the size of its input grows. In other words, it is a function, and functions are the same everywhere, whether on different machines or on a different planet :)
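To make the machine-independence point concrete, here is a toy Python sketch (the nested loop is a made-up example, not any particular algorithm): counting operations instead of seconds yields a function of n alone, the same on every machine.

    def count_steps(n):
        # count inner-loop iterations of a pair of nested loops;
        # the count depends only on n, never on the hardware running it
        steps = 0
        for i in range(n):
            for j in range(n):
                steps += 1
        return steps

    for n in (10, 100, 1000):
        print(n, count_steps(n))   # prints n*n on any CPU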

Related

Why do we measure time complexity instead of step complexity?

When I first took a class on algorithms, I was confused as to what was actually being measured when talking about asymptotic time complexity, since it sure wasn't the time the computer took to run a program. Instead, my mental model was that we were measuring the asymptotic step complexity, that is the asymptotic number of steps the CPU would take to run the algorithm.
Any reason why we reason about time complexity as opposed to step complexity and talk about how much time an algorithm takes as opposed to how many steps (asymptotically) a CPU takes to execute the algorithm?
Indeed, the number of steps is the determining factor, with the condition that the duration of a step is not dependent on the input -- it should never take more time than some chosen constant time.
What exactly that constant time is will depend on the system you run it on. Some CPUs are simply faster than others, and some CPUs are more specialised in one kind of operation and less in another. Two different steps may therefore represent different times: on one CPU, step A may execute with a shorter delay than step B, while on another it may be the reverse. It might even be that on the same CPU step A sometimes executes faster than at other times (for example, because of some favourable condition in that CPU's pipeline).
All that makes it impossible to say something useful by just measuring the time to run a step. Instead, we consider that there is a maximum time (for a given CPU) for all the different kinds of "steps" we have identified in the algorithm, such that the individual execution of one step will never exceed that maximum time.
So when we talk about time complexity we do say something about the time an algorithm will take. If an algorithm has O(n²) time complexity, it means we can find a value minN and a constant time C (we may freely choose those), such that for every n >= minN, the total time T it takes to run the algorithm is bounded by T < Cn². Note especially that T and C are not a number of steps, but really measures of time (e.g. milliseconds). However the choice of C will depend on the CPU and the maximum we found for its step execution. So we don't actually know which value C will have in general, we just prove that such a C exists for every CPU (or whatever executes the algorithm).
In short, we make an equivalence between a step and a unit of time, such that the execution of a step is guaranteed to be bounded by that unit of time.
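As a rough illustration of that bound, the sketch below times a textbook O(n²) routine (insertion sort is used here only as a stand-in) at a few sizes; the ratio T/n² should settle near some machine-dependent constant C, while the step count itself would be the same on any machine.

    import random, time

    def insertion_sort(a):                 # a textbook O(n^2) sort
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    for n in (500, 1000, 2000):
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        insertion_sort(data)
        t = time.perf_counter() - start
        # t / n^2 should level off around some machine-dependent constant C
        print(f"n={n:>5}  T={t:.4f}s  T/n^2={t / n**2:.2e}")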
You are right, we measure the computational steps that an algorithm uses to run on a Turing machine. However, we do not count every single step. Instead, we are typically interested in the runtime differences of algorithms ignoring constant factors as we do when using the O-notation.
Also, I believe the term is quite intuitive to grasp. Everybody has a basic understanding of what you mean when you talk about how much time an algorithm takes (I can even explain that to my mother). However, if you talk about how many steps an algorithm needs, you may find yourself in a discussion about the computational model (what kind of CPU).
The term time complexity isn't wrong (in fact, I believe it is quite what we are looking for). The term step complexity would be misleading.

How can we modify almost any algorithm to have a good best-case running time?

This is a question from Introduction to Algorithms by Cormen et al, but this isn't a homework problem. Instead, it's self-study.
I have thought a lot and searched on Google. The answers I can think of are:
Use another algorithm.
Give it best-case inputs
Use a better computer to run the algorithm
But I don't think these are correct. Changing the algorithm isn't the same as making the algorithm itself perform better, and using a better computer may increase the speed, but the algorithm is no better. This is a question at the beginning of the book, so I think it is something simple that I am overlooking.
So how can we modify almost any algorithm to have a good best-case running time?
You can modify any algorithm to have a best-case time complexity of O(n) by adding a special case: if the input matches that special case, return a cached, hard-coded answer (or some other easily obtained answer).
For example, for any sort you can make the best case O(n) by first checking whether the array is already sorted, and if it is, returning it as is.
Note that this does not affect the average or worst cases (assuming they are not better than O(n)); you are simply improving the algorithm's best-case time complexity.
Note: if the size of the input is bounded, the same optimization makes the best case O(1), because reading the input is then O(1).
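A minimal Python sketch of the already-sorted check, using an ordinary selection sort as the stand-in for "any sort" (the function names here are made up for illustration):

    def is_sorted(a):
        # a single O(n) pass
        return all(a[i] <= a[i + 1] for i in range(len(a) - 1))

    def selection_sort(a):
        # an ordinary O(n^2) sort used as the fallback
        a = list(a)
        for i in range(len(a)):
            m = min(range(i, len(a)), key=a.__getitem__)
            a[i], a[m] = a[m], a[i]
        return a

    def sort_with_fast_best_case(a):
        # special case: already sorted -> return immediately (best case O(n))
        if is_sorted(a):
            return list(a)
        return selection_sort(a)       # worst/average case unchanged

The is_sorted check costs one linear pass, so the worst and average cases stay asymptotically the same while the best case becomes linear.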
If we could introduce an instruction for that very algorithm into the computation model of the system itself, we could just solve the problem in one instruction.
But as you might already have discovered, that is a highly unrealistic approach, so a generic method to modify any algorithm to have a good best-case running time is next to impossible. The most we can do is apply tweaks to the algorithm for common redundancies found in various problems.
Or you can go the naive route and feed it best-case inputs. But again, that isn't actually modifying the algorithm. In fact, building the algorithm into the computation model itself, besides being highly unrealistic, isn't a modification of the algorithm either.
The ways we can modify an algorithm to have a good best-case running time are:
If the input is already in the state the algorithm is meant to produce; for example, for an ascending sort, the input is already sorted in ascending order, and so on.
If we modify the algorithm so that it outputs the answer and exits as soon as its purpose is met, which can collapse multiple nested loops into a single pass.
We can sometimes use a randomized algorithm, one that makes random choices, to allow a probabilistic analysis and thus improve the expected running time (see the sketch after this list).
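As a standard (not question-specific) example of the randomized point, a quicksort that picks its pivot at random has expected O(n log n) running time regardless of how the input happens to be ordered:

    import random

    def quicksort(a):
        # a random pivot avoids the bad cases a fixed pivot choice would hit
        if len(a) <= 1:
            return a
        pivot = random.choice(a)
        left  = [x for x in a if x < pivot]
        mid   = [x for x in a if x == pivot]
        right = [x for x in a if x > pivot]
        return quicksort(left) + mid + quicksort(right)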
I think the only lever for this problem is the input to the algorithm, because the cases in time complexity analysis depend only on the input: how complex it is and how much work it forces the algorithm to do. Based on that analysis, we decide whether a case is best, average, or worst.
So the input decides the running time of an algorithm in every case.
Alternatively, we can change the algorithm to improve it for all cases (reducing the time complexity).
These are the ways we can achieve a good best-case running time.
We can modify an algorithm for some special-case conditions, so that if the input satisfies such a condition, we can output a pre-computed answer. Generally, the best-case running time is not a good measure for an algorithm; we need to know how the algorithm performs in the worst case.
I just reached this discussion while looking for an answer. The way I see it, the only way to make any algorithm best-case is to give it a fixed input instead of a varying one; with a fixed input, the cost and time complexity will always be O(1).

Analyzing algorithms - Why only time complexity?

I was learning about algorithms and time complexity, and this question just sprang into my mind.
Why do we only analyze an algorithm's time complexity?
My question is, shouldn't there be another metric for analyzing an algorithm? Say I have two algos A and B.
A takes 5s for 100 elements, B takes 1000s for 100 elements. But both have O(n) time.
So this means that the time for A and B both grow slower than cn grows for two separate constants c=c1 and c=c2. But in my very limited experience with algorithms, we've always ignored this constant term and just focused on the growth. But isn't it very important while choosing between my given example of A and B? Over here c1<<c2 so Algo A is much better than Algo B.
Or am I overthinking at an early stage and proper analysis will come later on? What is it called?
Or is my whole concept of time complexity wrong, and in my example both can't have O(n) time?
We worry about the order of growth because it provides a useful abstraction to the behaviour of the algorithm as the input size goes to infinity.
The constants "hidden" by the O notation are important, but they're also difficult to calculate because they depend on factors such as:
the particular programming language that is being used to implement the algorithm
the specific compiler that is being used
the underlying CPU architecture
We can try to estimate these, but in general it's a lost cause unless we make some simplifying assumptions and work on some well defined model of computation, like the RAM model.
But then, we're back into the world of abstractions, precisely where we started!
We measure lots of other types of complexity.
Space (Memory usage)
Circuit Depth / Size
Network Traffic / Amount of Interaction
IO / Cache Hits
But I guess you're talking more about a "don't the constants matter?" approach. Yes, they do. The reason it's useful to ignore the constants is that they keep changing. Different machines perform different operations at different speeds. You have to walk the line between useful in general and useful on your specific machine.
It's not always time; there's also space.
As for the asymptotic time cost/complexity that O() gives you: if you have a lot of data, then, for example, an algorithm whose running time is exactly n² is going to be worse than one whose running time is 100·n once n > 100. For smaller n you should prefer the n² one.
And, obviously, a running time of 100·n is always worse than 10·n, even though both are O(n).
The details of your problem should drive your choice between the possible solutions (algorithms) for it.
A takes 5s for 100 elements, B takes 1000s for 100 elements. But both have O(n) time.
Why is that?
O(N) is an asymptotic measure of the number of steps required to execute a program in relation to the program's input.
This means that for really large values of N, the algorithm's cost grows linearly with the input.
We don't compare X and Y seconds. We analyze how the algorithm behaves as the input grows larger and larger.
O(n) gives you an idea how much slower the same algorithm will be for a different n, not for comparing algorithms.
On the other hand there is also space complexity - how memory usage grows as a function of input n.
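To see why the hidden constants can still matter in practice, here is a toy Python comparison of two routines that are both O(n) but differ by a large constant factor (the factor of 100 and the input size are arbitrary):

    import time

    def algo_a(data):                  # O(n): one pass, little work per element
        return sum(data)

    def algo_b(data):                  # also O(n): one pass, ~100x more work per element
        total = 0
        for x in data:
            for _ in range(100):       # constant (size-independent) extra work
                total += x
        return total

    data = list(range(100_000))
    for f in (algo_a, algo_b):
        start = time.perf_counter()
        f(data)
        print(f.__name__, f"{time.perf_counter() - start:.4f}s")

Both scale linearly, so their big-O is identical, yet one is consistently about two orders of magnitude slower.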

What is the difference between time complexity and running time?

What is the difference between time complexity and running time? Are they the same?
Running time is how long it takes a program to run. Time complexity is a description of the asymptotic behavior of running time as input size tends to infinity.
You can say that the running time "is" O(n^2) or whatever, because that's the idiomatic way to describe complexity classes and big-O notation. In fact the running time is not a complexity class, it's either a duration, or a function which gives you the duration. "Being O(n^2)" is a mathematical property of that function, not a full characterisation of it. The exact running time might be 2036*n^2 + 17453*n + 18464 CPU cycles, or whatever. Not that you very often need to know it in that much detail, and anyway it might well depend on the actual input as well as the size of the input.
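If you do want a concrete polynomial like that for a particular machine, you can estimate one by timing and curve fitting. This sketch assumes NumPy is available, and quadratic_work is only a stand-in routine doing roughly n² units of work:

    import time
    import numpy as np

    def quadratic_work(n):
        # a stand-in routine that does about n^2 units of work
        s = 0
        for i in range(n):
            for j in range(n):
                s += i ^ j
        return s

    ns, times = [], []
    for n in (200, 400, 800, 1600):
        start = time.perf_counter()
        quadratic_work(n)
        times.append(time.perf_counter() - start)
        ns.append(n)

    # fit t ~= a*n^2 + b*n + c; the coefficients depend entirely on this machine
    a, b, c = np.polyfit(ns, times, 2)
    print(f"t(n) ~= {a:.3e}*n^2 + {b:.3e}*n + {c:.3e} seconds")

The fitted coefficients are properties of the machine, interpreter, and input pattern, not of the algorithm, which is exactly why complexity analysis throws them away.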
The time complexity and running time are two different things altogether.
Time complexity is a complete theoretical concept related to algorithms, while running time is the time a code would take to run, not at all theoretical.
Two algorithms may have the same time complexity, say O(n^2), but one may take twice as much running time as the other one.
From CLRS, Section 2.2, pg. 25:
The running time of an algorithm on a particular input is the number of primitive operations or “steps” executed. It is convenient to define the notion of step so that it is as machine-independent as possible.
Now from Wikipedia
... time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the string representing the input.
Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, where an elementary operation takes a fixed amount of time to perform.
Notice that both descriptions emphasize the relationship of the size of the input to the number of primitive/elementary operations.
I believe this makes it clear both refer to the same concept.
In practice though you'll find that enterprisey jargon rarely matches academic terminology, e.g., tons of people work doing code optimization but rarely solve optimization problems.
"Running time" refers to the algorithm under consideration:
Another algorithm might be able to solve the same problem asymptotically faster, that is, with less running time.
"Time complexity" on the other hand is inherent to the problem under consideration.
It is defined as the least running time of any algorithm solving said problem.
The same distinction applies to other measures of algorithmic cost, such as memory, number of processors, communication volume, etc.
(Blum's Speedup Theorem demonstrates that the "least" time may in general not be attainable...)
To analyze an algorithm is to determine the amount of resources (such as time and storage) necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).
Running time measures the number of operations it takes to complete a piece of code or a program. The key words here are "operations" and "complete": the time taken for any single operation to complete can be affected by the processor, memory, and so on.
With running time, if we have two different algorithms solving the same problem, the optimized algorithm might take longer to complete than the non-optimized one because of varying factors such as RAM, the current state of the PC (serving other programs), or even the function used to measure the runtime itself.
For this reason, it is not enough to judge the efficiency of an algorithm by timing a particular run; instead we describe how the amount of work grows with the input, so that all the external factors are eliminated, and that is exactly what time complexity does.
Time complexity is the measurement of an algorithm's time behavior as input size increases.
Time complexity can also be calculated from the logic behind the algorithm/code.
On the other hand, running time can only be measured once the code is complete and actually run.

How to compute exact complexity of an algorithm?

Without resorting to asymptotic notation, is tedious step counting the only way to get the time complexity of an algorithm? And without a step count for each line of code, can we arrive at a big-O representation of any program?
Details: trying to find out the complexity of several numerical analysis algorithms to decide which will be best suited for solving a particular problem.
E.g., choosing between the Regula Falsi and Newton-Raphson methods for solving equations: the intention is to evaluate the exact complexity of each method and then decide (by substituting the value of n, or whatever the arguments are) which method is less complex.
The only way --- not the "easy" or hard way but the only reasonable way --- to find the exact complexity of a complicated algorithm is to profile it. A modern implementation of an algorithm has a complex interaction with numerical libraries and with the CPU and its floating point unit. For instance in-cache memory access is much faster than out-of-cache memory access, and on top of that there may be more than one level of cache. Counting steps is really much more suitable to the asymptotic complexity that you say isn't enough for your purpose.
But, if you did want to count steps automatically, there are also ways to do that. You can add a counter increment command (like "bloof++;" in C) to every line of code, and then display the value at the end.
You should also know about the more refined time complexity expression, f(n)*(1+o(1)), that is also useful for analytical calculations. For instance n^2+2*n+7 simplifies to n^2*(1+o(1)). If the constant factor is what bothers you about usual asymptotic notation O(f(n)), this refinement is a way to keep track of it and still throw out negligible terms.
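A Python analogue of that counter idea might look like the sketch below: count one "step" per comparison in a bubble sort (used here only as an example) and compare the count against n². Here the count comes out to exactly n(n-1)/2, i.e. (n²/2)·(1 + o(1)):

    import random

    def bubble_sort_counting(a):
        a = list(a)
        steps = 0                      # the Python analogue of the "bloof++" counter
        n = len(a)
        for i in range(n):
            for j in range(n - 1 - i):
                steps += 1             # one comparison = one counted step
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
        return a, steps

    for n in (100, 200, 400):
        _, steps = bubble_sort_counting([random.random() for _ in range(n)])
        print(n, steps)                # steps is exactly n*(n-1)/2 for every input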
The 'easy way' is to simulate it. Try your algorithms with lots of values of n and lots of different data, plot the results then match the curve on the graph to an equation.
Your results may not be strictly correct and they're only as valid as your ability to generate good test data but for most cases this will work.
E.g., choosing between the Regula Falsi and Newton-Raphson methods for solving equations: the intention is to evaluate the exact complexity of each method and then decide (by substituting the value of n, or whatever the arguments are) which method is less complex.
I don't think it's possible to answer this question in general for nonlinear solvers. You could count the exact number of computations per iteration, but you're never going to know in general how many iterations each solver will take to converge. There are other complications, such as Newton's method needing the Jacobian, which can make computing the complexity even more difficult.
To sum up, the most efficient nonlinear solver is always dependent on the problem you're solving. If the variety of problems you're solving is very limited, doing a bunch of experiments with different solvers and measuring the number of iterations and CPU time will probably give you more useful information.
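If you do go the experimental route for the two solvers mentioned in the question, a rough sketch is to count iterations to a fixed tolerance on a sample equation. The equation x³ = 2, the starting points, and the tolerance below are arbitrary choices, and the per-iteration cost of the two methods differs, so this is only indicative:

    def newton(f, df, x0, tol=1e-12, max_iter=100):
        x, iters = x0, 0
        while abs(f(x)) > tol and iters < max_iter:
            x -= f(x) / df(x)          # needs the derivative at every iteration
            iters += 1
        return x, iters

    def regula_falsi(f, a, b, tol=1e-12, max_iter=1000):
        iters = 0
        c = a
        while iters < max_iter:
            c = b - f(b) * (b - a) / (f(b) - f(a))   # secant through (a,f(a)),(b,f(b))
            if abs(f(c)) <= tol:
                break
            if f(a) * f(c) < 0:
                b = c
            else:
                a = c
            iters += 1
        return c, iters

    f  = lambda x: x**3 - 2            # sample equation: x^3 = 2
    df = lambda x: 3 * x**2
    print("Newton:      ", newton(f, df, 1.5))
    print("Regula falsi:", regula_falsi(f, 1.0, 2.0))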
