What does improving a function's running time mean?

Suppose a function has a total of 10N + 10 steps. The function class would just be O(N) then. If I want to improve the function's running time, does that mean decrease the number of steps and reduce the function class so that it's less than linear?

Literally, you are reducing a function's running time if you make it run in less time. There are normally two directions to do this. Think of real-life running: you can finish in less time by strengthening your muscles (upgrading to a NASA supercomputer), or by shortening the distance you have to run (improving or changing the algorithm to reduce the number of steps). We only focus on the second direction here.
Still, there are plenty of factors to consider, such as: what is the realistic input to your function?
If N is small 99% of the time, then the constant factor matters even when two functions are in the same class O(N).
O(10^6 * N) and O(2 * N) are both O(N), but the constant dominates while N is smaller than 10^6.
If N is usually large, you can still say you have improved the function by reducing the constant factor, but the gain may be negligible (even though you are, strictly speaking, still reducing the running time). If you need an observable boost, you may have to change the algorithm or the data structures, in order to move the function into a better complexity class (from O(N) to O(lg N), for example).
So, in your own words: "decrease the number of steps" and "reduce the function class" both reduce the running time of the function; which one is observable, and thus useful, depends on its usage and other practical factors.
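To make the distinction concrete, here is a minimal Java sketch (the class, array, and method names are invented for illustration): the first function can only be sped up by shaving its constant factor, while the second changes the complexity class entirely, under the assumption that the input is already sorted.

public class SearchDemo {
    // O(N): only the constant factor can be improved here (cf. the 10N + 10 example above).
    static int linearSearch(int[] a, int key) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == key) return i;
        }
        return -1;
    }

    // O(log N): a different algorithm (requires sorted input) changes the complexity class.
    static int binarySearch(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] == key) return mid;
            if (a[mid] < key) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = new int[1_000_000];
        for (int i = 0; i < sorted.length; i++) sorted[i] = i;
        System.out.println(linearSearch(sorted, 999_999)); // ~N probes
        System.out.println(binarySearch(sorted, 999_999)); // ~log2(N), about 20 probes
    }
}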

Related

Why do we measure time complexity instead of step complexity?

When I first took a class on algorithms, I was confused as to what was actually being measured when talking about asymptotic time complexity, since it sure wasn't the time the computer took to run a program. Instead, my mental model was that we were measuring the asymptotic step complexity, that is the asymptotic number of steps the CPU would take to run the algorithm.
Is there a reason why we reason about time complexity, as opposed to step complexity, and talk about how much time an algorithm takes rather than how many steps (asymptotically) a CPU takes to execute it?
Indeed, the number of steps is the determining factor, with the condition that the duration of a step is not dependent on the input -- it should never take more time than some chosen constant time.
What exactly that constant time is will depend on the system you run it on. Some CPUs are simply faster than others, and some CPUs are more specialised in one kind of operation and less in another. Two different steps may therefore represent different times: on one CPU step A may execute with a shorter delay than step B, while on another it may be the inverse. It might even be that, on the same CPU, step A can sometimes execute faster than at other times (for example, because of some favourable condition in that CPU's pipeline).
All that makes it impossible to say something useful by just measuring the time to run a step. Instead, we consider that there is a maximum time (for a given CPU) for all the different kinds of "steps" we have identified in the algorithm, such that the individual execution of one step will never exceed that maximum time.
So when we talk about time complexity we do say something about the time an algorithm will take. If an algorithm has O(n²) time complexity, it means we can find a value minN and a constant time C (we may freely choose those), such that for every n >= minN, the total time T it takes to run the algorithm is bounded by T < Cn². Note especially that T and C are not a number of steps, but really measures of time (e.g. milliseconds). However the choice of C will depend on the CPU and the maximum we found for its step execution. So we don't actually know which value C will have in general, we just prove that such a C exists for every CPU (or whatever executes the algorithm).
In short, we make an equivalence between a step and a unit of time, such that the execution of a step is guaranteed to be bounded by that unit of time.
You are right, we measure the computational steps that an algorithm uses to run on a Turing machine. However, we do not count every single step. Instead, we are typically interested in the runtime differences of algorithms ignoring constant factors as we do when using the O-notation.
Also, I believe the term is quite intuitive to grasp. Everybody has a basic understanding of what you mean when you talk about how much time an algorithm takes (I can even explain that to my mother). However, if you talk about how many steps an algorithm needs, you may find yourself in a discussion about the computational model (what kind of CPU).
The term time complexity isn't wrong (in fact, I believe it is quite what we are looking for). The term step complexity would be misleading.

Is Big(O) machine dependent?

I am really confused by Big(O) notation. Is Big(O) machine dependent or machine independent? (By machine I mean the computer on which we run the algorithm.)
Will sorting 1000 numbers using quicksort on an i3 processor and on an i7 processor be the same? Why don't we consider the machine and its processor speed when calculating the time complexity? I am a neophyte in this stuff.
Big-O is a measure of scalability, not of speed. It shows you what happens to time and memory when you, for example, double the amount of data: does the execution time double, or quadruple?
Whether you use i7 or i3, double is double. Whether a linear algorithm is fast or slow, double is double.
This also has another implication many people ignore. A complex algorithm such as O(n^3) can be faster than a simple algorithm such as O(n) for a given n that is below a certain limit. Example:
loop n times:
    loop n times:
        loop n times:
            sleep 1 second
is O(n^3), as it has 3 nested loops.
loop n times:
    sleep 10 seconds
is O(n), as it only has one loop. For n = 10 the first program runs for 1000 seconds, and the second one for only 100. "So, O(n) is good!" one would be tempted to say. But if you have n = 2, the first, complex program executes in only 8 seconds, while the second, simpler one takes 20! Even for n = 3, the first executes in 27 seconds, the second in 30. So while n is low, a complex program might be able to outperform the simpler one. It's just that, as n rises, the complex program's running time grows much faster than the simple one's. For n = 1000, the simple code has risen to only 10000 seconds, but the complex one is now at 1000000000 seconds!
Also, this clearly shows you that complexity is not processor-dependent. A second is a second.
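To see the crossover without actually sleeping for years, here is a rough Java sketch (hypothetical class name) that simply computes the modelled running times of the two toy programs above:

public class CrossoverDemo {
    public static void main(String[] args) {
        // Modelled times of the two toy programs above:
        //   cubic:  n * n * n seconds (three nested loops, 1-second body)
        //   linear: n * 10 seconds    (one loop, 10-second body)
        for (long n : new long[] {2, 3, 10, 1000}) {
            long cubic = n * n * n;
            long linear = n * 10;
            System.out.printf("n=%d: O(n^3) -> %d s, O(n) -> %d s%n", n, cubic, linear);
        }
    }
}

The printed numbers match the ones above: the cubic version wins at n = 2 and n = 3, and loses badly from n = 10 onwards.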
Big(O) Notation is the method of calculating the complexity of an algorithm, and hence the relative time it will take to run. The same algorithm, for the same data, will run faster on a faster processor, but will still take the same number of operations. It's used as a way of evaluating the relative efficiency of different algorithms to achieve the same result.
Big O notation is not architecture-dependent in any way; it is a mathematical construct. It is a very limited measure of algorithmic complexity: it only gives you a rough upper bound for how performance changes with data size.
Big(O) is algorithm dependent. Its job is to help compare the relative costs of various algorithms, without the need to consider machine dependencies.
A linear search through an array will, on average, look at about half of the elements before finding a match. For all practical purposes that is O(N/2), which is the same as O(1/2 * N). For comparison purposes you toss away the coefficient, hence it is written as O(N).
A binary tree can hold N elements for searching as well. On average it will look through about log2(N) of them to find something, hence you will see its cost described as O(log2(N)).
Pop in small values for N, and there isn't a whole lot of difference between the algorithms. Pop in a large value of N, and it will be clear that the binary tree lookup is much faster.
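A quick, purely illustrative Java sketch of those average probe counts (real containers differ by constant factors, so treat the numbers as ballpark figures):

public class LookupCostDemo {
    public static void main(String[] args) {
        for (long n : new long[] {10, 1_000, 1_000_000, 1_000_000_000L}) {
            double linear = n / 2.0;                   // average probes for linear search
            double tree   = Math.log(n) / Math.log(2); // ~log2(N) probes for a balanced binary tree
            System.out.printf("N=%-12d linear ~%.0f   tree ~%.0f%n", n, linear, tree);
        }
    }
}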
Big(O) is not machine dependent. It is mathematical notation for denoting the complexity of an algorithm. We usually use these notations in theory to compare the performance of algorithms.

O(n log n) vs O(n) -- practical differences in time complexity

n log n > n -- but this is like a pseudo-linear relationship. If n = 1 billion, log n ~ 30;
So n log n will be about 30 billion, which is 30 * n, still on the order of n.
I am wondering if this time complexity difference between n log n and n is significant in real life.
E.g. finding the kth element in an unsorted array is O(n) using the quickselect algorithm.
If I sort the array and then find the kth element, it is O(n log n). To sort an array with 1 trillion elements, I will be roughly 40 times slower if I quicksort it and then index into it.
The main purpose of the Big-O notation is to let you do the estimates like the ones you did in your post, and decide for yourself if spending your effort coding a typically more advanced algorithm is worth the additional CPU cycles that you are going to buy with that code improvement. Depending on the circumstances, you may get a different answer, even when your data set is relatively small:
If you are running on a mobile device, and the algorithm represents a significant portion of the execution time, cutting down the use of CPU translates into extending the battery life
If you are running in an all-or-nothing competitive environment, such as a high-frequency trading system, a micro-optimization may differentiate between making money and losing money
When your profiling shows that the algorithm in question dominates the execution time in a server environment, switching to a faster algorithm may improve performance for all your clients.
Another thing the Big-O notation hides is the constant multiplication factor. For example, Quick Select has a very reasonable multiplier, making the time savings from employing it on extremely large data sets well worth the trouble of implementing it.
Another thing that you need to keep in mind is the space complexity. Very often, algorithms with O(N log N) time complexity have O(log N) space complexity. This may present a problem for extremely large data sets, for example when a recursive function runs on a system with limited stack capacity.
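For reference, here is a minimal quickselect sketch in Java (0-based k, random pivot, in-place partitioning, no further tuning), meant only to illustrate the average-case O(n) idea mentioned in the question rather than to be production code:

import java.util.concurrent.ThreadLocalRandom;

public class QuickSelect {
    // Returns the k-th smallest element (k is 0-based); average O(n), worst case O(n^2).
    // Note: the array is partitioned in place.
    static int select(int[] a, int k) {
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k) return a[p];
            if (p < k) lo = p + 1; else hi = p - 1;
        }
        return a[lo];
    }

    // Lomuto partition around a randomly chosen pivot; returns the pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int r = ThreadLocalRandom.current().nextInt(lo, hi + 1);
        swap(a, r, hi);
        int pivot = a[hi], i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, i++, j);
        }
        swap(a, i, hi);
        return i;
    }

    static void swap(int[] a, int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    public static void main(String[] args) {
        int[] data = {7, 2, 9, 4, 1, 8, 3};
        System.out.println(select(data, 3)); // 4th smallest -> 4
    }
}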
It depends.
I was working at Amazon; there was a method that did a linear search on a list. We could have used a hashtable and done the lookup in O(1) instead of O(n).
I suggested the change, and it wasn't approved, because the input was small and it wouldn't really make a huge difference.
However, if the input were large, it would make a difference.
In another company, where the data/input was huge, using a tree instead of a list made a huge difference. So it depends on the data and the architecture of the application.
It is always good to know your options and how you can optimize.
There are times when you will work with billions of elements (and more), where that difference will certainly be significant.
There are other times when you will be working with less than a thousand elements, in which case the difference probably won't be all that significant.
If you have a decent idea what your data will look like, you should have a decent idea which one to pick from the start, but the difference between O(n) and O(n log n) is small enough that it's probably best to start off with whichever one is simplest, benchmark it and only try to improve it if you see it's too slow.
However, note that an O(n) algorithm may actually be slower than an O(n log n) one for a given value of n (especially, but not necessarily, for small values of n) because of the constant factors involved, which big-O ignores (it only considers what happens as n tends to infinity). So if you are looking purely at the time complexity, what you think is an improvement may actually slow things down.
Darth Vader is correct. It always depends. It's also important to remember that complexities are asymptotic, (usually) worst-case, and that constants are dropped. Each of these is important to consider.
So you could have two algorithms, one of which is O(n) and one of which is O(n log n), and for every value up to the number of atoms in the universe and beyond (up to some finite value of n), the O(n log n) algorithm outperforms the O(n) algorithm. That could be because lower-order terms dominate, or because in the average case the O(n log n) algorithm is actually O(n), or because the actual numbers of steps are something like 5,000,000n vs 3n log n.
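As a rough sanity check on that last pair of constants (the 5,000,000 and 3 are the illustrative numbers above), the crossover point where 3n log2 n starts to exceed 5,000,000n is where log2 n passes 5,000,000 / 3:

public class CrossoverPoint {
    public static void main(String[] args) {
        // 3 * n * log2(n) exceeds 5_000_000 * n only once log2(n) > 5_000_000 / 3,
        // i.e. n > 2^1_666_667 -- astronomically larger than any real input
        // (the observable universe holds only about 2^266 atoms).
        double crossoverExponent = 5_000_000.0 / 3.0;
        System.out.printf("The O(n log n) version is cheaper until n reaches about 2^%.0f%n",
                crossoverExponent);
    }
}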
A PriorityQueue restores heap order on every add (O(log n) per insertion), while Collections.sort() sorts all the elements in a single O(n log n) pass. So if you want the biggest (or smallest) element as soon as possible, use a PriorityQueue; if you need to perform some computation that requires the whole collection to be sorted, an ArrayList with Collections.sort() is the better fit.
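A small Java sketch of that trade-off (the values are arbitrary; the comparator makes the PriorityQueue a max-heap so the biggest element comes out first):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class HeapVsSortDemo {
    public static void main(String[] args) {
        // PriorityQueue: O(log n) per add, and the biggest element is available immediately.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        Collections.addAll(heap, 5, 1, 4, 2, 3);
        System.out.println(heap.poll()); // 5 -- no full sort needed

        // Collections.sort: one O(n log n) pass, after which the whole list is in order.
        List<Integer> list = new ArrayList<>(Arrays.asList(5, 1, 4, 2, 3));
        Collections.sort(list);
        System.out.println(list);        // [1, 2, 3, 4, 5]
    }
}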

why O(1) != O(log(n)) ? for n=[integer, long, ...]

For example, say n = Integer.MAX_VALUE or 2^123; then log(n) is about 31 or 123, a small integer. Isn't that O(1)?
What is the difference? I think the reason is that O(1) is constant but O(log(n)) is not. Any other ideas?
If n is bounded above, then complexity classes involving n make no sense. There is no such thing as "in the limit as 2^123 approaches infinity", except in the old joke that "a pentagon approximates a circle, for sufficiently large values of 5".
Generally, when analysing the complexity of code, we pretend that the input size isn't bounded above by the resource limits of the machine, even though it is. This does lead to some slightly odd things going on around log n, since if n has to fit into a fixed-size int type, then log n has quite a small bound, so the bound is more likely to be useful/relevant.
So sometimes, we're analysing a slightly idealised version of the algorithm, because the actual code written cannot accept arbitrarily large input.
For example, your average quicksort formally uses Theta(log n) stack in the worst case, obviously so with the fairly common implementation that call-recurses on the "small" side of the partition and loop-recurses on the "big" side. But on a 32 bit machine you can arrange to in fact use a fixed-size array of about 240 bytes to store the "todo list", which might be less than some other function you've written based on an algorithm that formally has O(1) stack use. The morals are that implementation != algorithm, complexity doesn't tell you anything about small numbers, and any specific number is "small".
If you want to account for bounds, you could say that, for example, your code to sort an array is O(1) running time, because the array has to be below the size that fits in your PC's address space, and hence the time to sort it is bounded. However, you will fail your CS assignment if you do, and you won't be providing anyone with any useful information :-)
Obviously, if you know that the input will always have a fixed number of elements, the algorithm will always run in constant time. Big-O notation is used to denote worst-case running time, which describes the limit as the number of elements grows infinitely large.
The difference is that n isn't fixed. The idea behind Big-O notation is to get an idea of how the size of the input affects the running time (or memory usage). So if an algorithm always takes the same amount of time, whether n = 1 or n = Integer.MAX_VALUE, we say it is O(1). If the algorithm takes a unit of time longer each time the input size doubles, then we say it is O(log n).
Edit: to answer your specific question on the difference between O(1) and O(logn), I'll give you an example. Let's say we want an algorithm that will find the min element in an unsorted array. One approach is to go through each element and keep track of the current min. Another approach is to sort the array and then return the first element.
The first algorithm is O(n), and the second algorithm is O(n log n). So let's say we start with an array of 16 elements. The first algorithm will run in time 16, the second algorithm will run in time 16*4. If we increase it to 17, then it becomes 17 and 17*4. We might naively say that the second algorithm takes 4 times as long as the first algorithm (if we treat the log n component as constant).
But let's look at what happens when our array contains 2^32 elements. Now the first algorithm takes 2^32 time to complete, where our second algorithm takes 32*2^32 time to complete. It takes 32 times as long. Yes, it's a small difference, but it is still a difference. If the first algorithm takes 1 minute, the second algorithm will take over half an hour!
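The two approaches from that example, sketched in Java (illustrative names and data):

import java.util.Arrays;

public class MinDemo {
    // O(n): one pass, tracking the current minimum.
    static int minByScan(int[] a) {
        int min = a[0];
        for (int x : a) if (x < min) min = x;
        return min;
    }

    // O(n log n): sort a copy first, then take the first element.
    static int minBySort(int[] a) {
        int[] copy = Arrays.copyOf(a, a.length);
        Arrays.sort(copy);
        return copy[0];
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88};
        System.out.println(minByScan(data)); // 3
        System.out.println(minBySort(data)); // 3
    }
}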
I think you will get a better idea if it is called O(n^0).
It is a scaling function of the input variable N. It is a function, not a number; you should never assume any particular value for N.
It is just like saying that a function f(x) is 3 because f(100) = 3; that is wrong. It is a function, not any particular number. A constant function f(x) = 1 is still a function, and it will never be equal to another function g(x) = x, i.e. g(x) != f(x).
It's the growth rate that you want to look at. O(1) implies no growth at all, while O(log n) does have growth. Even though the growth is small, it is still growth.
You’re not thinking big enough. Any algorithm that runs on a computer will either run forever or terminate after some small number of steps — since the computer is only a finite state machine, you cannot write algorithms that run for an arbitrary amount of time and then terminate. By that argument, Big-O notation is only theoretical and has no purpose in a real-life computer program. Even O(2^n) hits an upper limit at O(2^INT_MAX), which is equivalent to O(1).
Realistically, though, Big-O can help you out if you know the constant factors. Even if an algorithm has an upper bound of O(log n), and n can have 32 bits, that could mean the difference between a request taking 1 second and 32 seconds.
Big-O shows how running time (or memory, etc) changes as the size of problem changes.
When size of the problem gets 10 times bigger, an O(n) solution takes 10 times as long, an O(log(n)) solution takes a bit longer, and an O(1) solution takes the same time: O(1) means 'changes as fast as constant 1', but constants don't change.
Familiarize yourself with the big-O notation in a bit more detail.
There is a reason why you leave the "O(n)" in but are tempted to drop the "O(log n)". They are both "constants" in this bounded world: n is less than 2^32, and log n is less than 32. But you nevertheless have a natural feeling that you can't call O(n) O(1).
However, log(n) can be as large as 32, which means an O(n log n) algorithm can run about thirty-two times slower than its O(n) version. Big enough to keep writing the log n?

How can one test time complexity "experimentally"?

Could it be done by keeping a counter to see how many iterations an algorithm goes through, or does the time duration need to be recorded?
The currently accepted answer won't give you any theoretical estimation unless you are somehow able to fit the experimentally measured times to a function that approximates them. This answer gives you a manual technique for doing that and fills that gap.
You start by guessing the theoretical complexity function of the algorithm. You also experimentally measure the actual complexity (number of operations, time, or whatever you find practical), for increasingly larger problems.
For example, say you guess an algorithm is quadratic. Measure (say) the time, and compute the ratio of that time to your guessed function (n^2):
for (int n = 5; n <= 10000; n++) {                        // n: problem size
    long start = System.nanoTime();
    executeAlgorithm(n);                                   // the algorithm under test
    long end = System.nanoTime();
    long totalTime = end - start;
    double ratio = (double) totalTime / ((double) n * n);  // guessed g(n) = n^2
    System.out.printf("n=%d ratio=%.6f%n", n, ratio);
}
As n moves towards infinity, this ratio...
Converges to zero? Then your guess is too low. Repeat with something bigger (e.g. n^3)
Diverges to infinity? Then your guess is too high. Repeat with something smaller (e.g. nlogn)
Converges to a positive constant? Bingo! Your guess is on the money (at least, it approximates the theoretical complexity for n values as large as those you tried)
Basically, this uses the definition of big-O notation: f(x) = O(g(x)) <=> f(x) <= c * g(x) for some constant c and all sufficiently large x. Here f(x) is the actual cost of your algorithm, g(x) is the guess you made, and c is a constant. So basically you try to experimentally find the limit of f(x)/g(x); if your guess matches the real complexity, this ratio will estimate the constant c.
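Putting the whole recipe together, here is a hedged, self-contained Java version that tests the quadratic guess against a deliberately quadratic routine (the routine and the sampled sizes are arbitrary choices, and the first measurements will be noisy because of JIT warm-up, so look at the trend over the larger sizes):

public class ComplexityGuess {
    // A deliberately quadratic piece of work: sum over all ordered pairs (i, j).
    static long quadraticWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                sum += (long) i * j;
        return sum;
    }

    public static void main(String[] args) {
        long checksum = 0; // keeps the JIT from discarding the work
        for (int n = 1_000; n <= 16_000; n *= 2) {
            long start = System.nanoTime();
            checksum += quadraticWork(n);
            long elapsed = System.nanoTime() - start;
            // Guess g(n) = n^2: if this ratio settles near a positive constant, the guess fits.
            double ratio = elapsed / ((double) n * n);
            System.out.printf("n=%-6d time=%dns ratio=%.4f%n", n, elapsed, ratio);
        }
        System.out.println("checksum=" + checksum);
    }
}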
Algorithm complexity is defined as (something like):
    the number of operations the algorithm does as a function of its input size.
So you need to try your algorithm with various input sizes (e.g. for a sort, try sorting 10 elements, 100 elements, etc.), and count each operation (e.g. assignment, increment, mathematical operation, etc.) the algorithm does.
This will give you a good "theoretical" estimation.
If you want real-life numbers on the other hand - use profiling.
As others have mentioned, the theoretical time complexity is a function of the number of CPU operations done by your algorithm. In general, processor time should be a good approximation of that, modulo a constant. But the real run time may vary for a number of reasons, such as:
Processor pipeline flushes
Cache misses
Garbage collection
Other processes on the machine
Unless your code is systematically causing some of these things to happen, then with a large enough number of statistical samples you should get a fairly good idea of the time complexity of your algorithm, based on the observed runtime.
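A hedged sketch of what "enough statistical samples" might look like in Java: repeat the measurement and take the median rather than trusting a single reading, with the early iterations doubling as JIT warm-up (the sample count, array size, and seed are arbitrary):

import java.util.Arrays;
import java.util.Random;

public class MedianTiming {
    public static void main(String[] args) {
        int samples = 21;
        long[] times = new long[samples];
        for (int i = 0; i < samples; i++) {
            int[] data = randomArray(100_000);
            long start = System.nanoTime();
            Arrays.sort(data);                    // the code under test
            times[i] = System.nanoTime() - start;
        }
        Arrays.sort(times);
        // The median is far less sensitive to GC pauses, cache effects and other processes
        // than a single measurement or the mean.
        System.out.println("median ns: " + times[samples / 2]);
    }

    static int[] randomArray(int n) {
        int[] a = new int[n];
        Random rnd = new Random(42);
        for (int i = 0; i < n; i++) a[i] = rnd.nextInt();
        return a;
    }
}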
The best way would be to actually count the number of "operations" performed by your algorithm. The definition of "operation" can vary: for an algorithm such as quicksort, it could be a comparison of two numbers.
You could measure the time taken by your program to get a rough estimate, but various factors could cause this value to differ from the actual mathematical complexity.
Yes.
You can track both: actual performance and the number of iterations.
Might I suggest using the ANTS profiler? It will provide this kind of detail while you run your app with "experimental" data.

Resources