What do the constant factors in a time complexity equation represent?

Hello all.
I'm taking a Coursera course on data structures and algorithms to prep myself for my upcoming semester, which has a data structures and algorithms course. In my study I'm currently on the topic of algorithm analysis. This led me down a rabbit hole of reading other resources. In one of those resources they have a time complexity equation for Insertion-Sort:
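For readers who don't have the figure in front of them, the equation in question looks roughly like the following (reconstructed from the standard presentation in CLRS Section 2.2; the resource's line numbering and constant names may differ slightly):

    T(n) = c_1*n + c_2*(n-1) + c_4*(n-1)
           + c_5 * sum_{j=2..n} t_j
           + c_6 * sum_{j=2..n} (t_j - 1)
           + c_7 * sum_{j=2..n} (t_j - 1)
           + c_8*(n-1)

where c_i is the cost of executing line i of the pseudocode once, and t_j is the number of times the while-loop test is executed for a given j.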
My question is: what exactly do the constant factors in the equation (C_1, C_2, C_3, etc.) represent? My understanding is that they represent the run time to execute a specific line (so C_2 represents the time it takes to execute key = A[j]). This is based on the understanding that the equation represents time complexity, and that time complexity is a measurement of how the runtime changes as the input size changes (Figures 2 and 3 of that resource).
Of course, you can see in the same sentence the authors of Introduction to Algorithms say C_i is a measurement of steps. Then in the book A Common-Sense Guide to Data Structures and Algorithms, Second Edition the author says this:
If you take away just one thing from this book, let it be this: when we measure
how “fast” an operation takes, we do not refer to how fast the operation takes
in terms of pure time, but instead in how many steps it takes.
We’ve actually seen this earlier in the context of printing the even numbers
from 2 to 100. The second version of that function was faster because it took
half as many steps as the first version did.
Why do we measure code’s speed in terms of steps?
We do this because we can never say definitively that any operation takes,
say, five seconds. While a piece of code may take five seconds on a particular
computer, that same piece of code may take longer on an older piece of
hardware. For that matter, that same code might run much faster on the
supercomputers of tomorrow. Measuring the speed of an operation in terms
of time is undependable, since the time will always change depending on the
hardware it is run on.
However, we can measure the speed of an operation in terms of how many
computational steps it takes. If Operation A takes 5 steps, and Operation B
takes 500 steps, we can assume that Operation A will always be faster than
Operation B on all pieces of hardware. Measuring the number of steps is,
therefore, the key to analyzing the speed of an operation.
Measuring the speed of an operation is also known as measuring its time
complexity. Throughout this book, I’ll use the terms speed, time complexity,
efficiency, performance, and runtime interchangeably. They all refer to the
number of steps a given operation takes.
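To make the quoted example concrete, here is a rough sketch of the two even-number versions the book alludes to (my own reconstruction, not the book's actual code):

    def evens_checking_every_number():
        # Looks at every number from 2 to 100: roughly 100 steps.
        return [n for n in range(2, 101) if n % 2 == 0]

    def evens_stepping_by_two():
        # Steps directly by 2: roughly 50 steps, i.e. half as many.
        return list(range(2, 101, 2))

Both produce the same output; the second simply takes about half as many steps, which is the sense of "faster" the book is using.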
The Wikipedia page for Time Complexity says:
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform.
So there seems to be a dual definition and understanding of time complexity and of what the equation represents. I see that, intrinsically, the more steps an algorithm takes, the longer it's going to take. I also understand that it's usually very difficult or impossible to measure the exact running time of an algorithm, and it may therefore be more practical to count the number of steps an algorithm takes.
That said, I'm having a difficult time fully making the connection between the two, and ultimately understanding what those constant factors in the equation represent. If they do represent running time, why are they treated as just constants, seemingly counted as 1 for each time the line is executed?
Hoping someone out there can help me understand what the equation represents, the connection between runtime and steps, etc.
I appreciate you taking the time to read this and for your help.

I think the constants represent the steps that must be taken to execute each line. The equation represents the total number of steps.
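As a rough illustration (a sketch of my own, not taken from the course or the book), you can instrument insertion sort to count how many times each line executes; the constants c_i then weight each of those counts by that line's per-execution cost:

    from collections import Counter

    def insertion_sort_with_counts(A):
        """Insertion sort that records how many times each line runs."""
        counts = Counter()
        for j in range(1, len(A)):
            counts["for j"] += 1              # one count per outer-loop iteration
            key = A[j]                        # the line weighted by C_2 in the question
            counts["key = A[j]"] += 1
            i = j - 1
            counts["i = j - 1"] += 1
            while i >= 0 and A[i] > key:
                counts["while test"] += 1     # each evaluation of the test is one step
                A[i + 1] = A[i]
                counts["shift"] += 1
                i -= 1
                counts["i -= 1"] += 1
            counts["while test"] += 1         # the final, failing test also executes
            A[i + 1] = key
            counts["A[i+1] = key"] += 1
        return counts

    print(insertion_sort_with_counts([5, 2, 4, 6, 1, 3]))

The total running time is then roughly the sum over all lines of (execution count x that line's cost c_i), which is exactly the shape of the equation in the question.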

Because they are just... constants! They represent the specific time each operation takes, on average, to execute (consider that in reality every operation is a set of machine-code instructions), so every operation has a different running time. But if the constants are all of similar size, the difference between them is not always that important (every programming language has its own features, so something really fast in one can be somewhat slow in another; see, for example, the PerformanceTips page for Python).
The focus of the analysis is on how many times the operations are executed. Big-O complexity considers the limiting behaviour, so the constants are not important, just the 'speed' of the functions involved (a*n^2 + b*n is always O(n^2) for all constants a and b).
Anyway, some optimizations can be useless because the different operations they use cost more; others can be really important if the two operations have constants with a big difference in execution time.
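A small sketch of how two operations with the same O(n) growth can have very different constants (the exact timings will of course vary by machine and Python version):

    import timeit

    def sum_with_python_loop(values):
        # One interpreted Python step per element: a relatively large constant factor.
        total = 0
        for v in values:
            total += v
        return total

    def sum_with_builtin(values):
        # Same O(n) growth, but the loop runs inside the C implementation of sum():
        # a much smaller constant factor.
        return sum(values)

    data = list(range(100_000))
    print("python loop:", timeit.timeit(lambda: sum_with_python_loop(data), number=100))
    print("builtin sum:", timeit.timeit(lambda: sum_with_builtin(data), number=100))

Both are linear in the input size; only the constant in front differs.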

Related

Why do we measure time complexity instead of step complexity?

When I first took a class on algorithms, I was confused as to what was actually being measured when talking about asymptotic time complexity, since it sure wasn't the time the computer took to run a program. Instead, my mental model was that we were measuring the asymptotic step complexity, that is the asymptotic number of steps the CPU would take to run the algorithm.
Is there any reason why we reason about time complexity as opposed to step complexity, and talk about how much time an algorithm takes as opposed to how many steps (asymptotically) a CPU takes to execute it?
Indeed, the number of steps is the determining factor, with the condition that the duration of a step is not dependent on the input -- it should never take more time than some chosen constant time.
What exactly that constant time is will depend on the system you run it on. Some CPUs are just faster than others, and some CPUs are more specialised in one kind of operation and less in another. Two different steps may therefore represent different times: on one CPU, step A may execute with a shorter delay than step B, while on another it may be the inverse. It might even be that on the same CPU, step A can sometimes execute faster than at other times (for example, because of some favourable condition in that CPU's pipeline).
All that makes it impossible to say something useful by just measuring the time to run a step. Instead, we consider that there is a maximum time (for a given CPU) for all the different kinds of "steps" we have identified in the algorithm, such that the individual execution of one step will never exceed that maximum time.
So when we talk about time complexity we do say something about the time an algorithm will take. If an algorithm has O(n²) time complexity, it means we can find a value minN and a constant time C (we may freely choose those), such that for every n >= minN, the total time T it takes to run the algorithm is bounded by T < Cn². Note especially that T and C are not a number of steps, but really measures of time (e.g. milliseconds). However the choice of C will depend on the CPU and the maximum we found for its step execution. So we don't actually know which value C will have in general, we just prove that such a C exists for every CPU (or whatever executes the algorithm).
In short, we make an equivalence between a step and a unit of time, such that the execution of a step is guaranteed to be bounded by that unit of time.
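As a concrete, entirely invented instance of that definition: suppose that on some machine the measured total time is bounded by T(n) <= 2n^2 + 3n milliseconds. Then for every n >= 3,

    T(n) <= 2n^2 + 3n <= 2n^2 + n^2 = 3n^2,

so choosing minN = 3 and C = 3 milliseconds witnesses the O(n^2) time bound; on a different machine the same algorithm would simply need a different C.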
You are right, we measure the computational steps that an algorithm uses to run on a Turing machine. However, we do not count every single step. Instead, we are typically interested in the runtime differences of algorithms ignoring constant factors as we do when using the O-notation.
Also, I believe the term is quite intuitive to grasp. Everybody has a basic understanding of what you mean when you talk about how much time an algorithm takes (I can even explain that to my mother). However, if you talk about how many steps an algorithm needs, you may find yourself in a discussion about the computational model (what kind of CPU).
The term time complexity isn't wrong (in fact, I believe it is quite what we are looking for). The term step complexity would be misleading.

Why is order of growth preferred as a benchmark for algorithm performance wrt runtime?

I learnt that growth rate is often used to gauge the runtime and efficiency of an algorithm. My question is: why use growth rate instead of the exact (or approximate) relation between the runtime and the input size?
Edit:
Thanks for the responses. I would like to clarify what I meant by "relation between the runtime and input size" as it is a little vague.
From what I understand, growth rate is the gradient of the runtime against the input size. So a growth rate of n^2 would give an equation of the form t = k(n^2) + constant. Given that the equation is more informative (as it includes the constants) and shows a direct relation to the time needed, I thought it would be preferred.
I do understand that as n increases, the constants soon become irrelevant, and that k will differ depending on the computing configuration. Perhaps that is why it is sufficient to just work with the growth rate.
The algorithm isn't the only factor affecting actual running time
Things like programming language, optimizations, branch prediction, I/O speed, paging, processing speed, etc. all come into play.
One language / machine / whatever may certainly have advantages over another, so every algorithm needs to be executed under the exact same conditions.
Beyond that, one algorithm may outperform another in C when considering input and output residing in RAM, but the other may outperform the first in Python when considering input and output residing on disk.
There will no doubt be little to no chance of agreement on the exact conditions that should be used to perform all the benchmarking, and, even if such agreement could be reached, it would certainly be irresponsible to use 5-year-old benchmarking results today in the computing world, so these results would need to be recreated for all algorithms on a regular basis - this would be a massive, very time-consuming task.
Algorithms have varying constant factors
In the extreme case, the constant factors of certain algorithms are so high that other, asymptotically slower algorithms outperform them on all reasonable inputs in the modern day. If we merely went by running time, the fact that these algorithms would outperform the others on larger inputs might be lost.
In the less extreme case, we'll get results that will be different at other input sizes because of the constant factors involved - we may see one algorithm as faster in all our tests, but as soon as we hit some input size, the other may become faster.
The running times of some algorithms depend greatly on the input
Basic quicksort on already sorted data, for example, takes O(n^2), while it takes O(n log n) on average.
One can certainly determine the best and worst cases and run the algorithm for those, but the average case is something that can only be determined through mathematical analysis - you can't run it for 'the average case'. You could run it a bunch of times on random input and take the average of that, but that's fairly imprecise.
So a rough estimate is sufficient
Because of the above reasons, it makes sense to just say an algorithm is, for example, O(n^2), which very roughly means that, if we're dealing with a large enough input size, it will take about 4 times longer if the input size doubles. If you've been paying attention, you'll know that the actual time taken could be quite different from 4 times longer, but it at least gives us some idea - we won't expect it to take twice as long, nor 10 times longer (although it might under extreme circumstances). We can also reasonably expect, for example, an O(n log n) algorithm to outperform an O(n^2) algorithm for large n, which is a useful comparison, and may make it easier to see what's going on than some more exact representation would.
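A quick way to see the "doubling the input roughly quadruples the time" behaviour for a quadratic algorithm (a sketch only; the absolute timings depend entirely on the machine):

    import time

    def count_zero_sum_pairs(values):
        # Deliberately quadratic: examines every ordered pair of elements.
        hits = 0
        for a in values:
            for b in values:
                if a + b == 0:
                    hits += 1
        return hits

    for n in (500, 1_000, 2_000):
        data = list(range(-n // 2, n // 2))
        start = time.perf_counter()
        count_zero_sum_pairs(data)
        print(n, "elements:", round(time.perf_counter() - start, 3), "s")
    # Each doubling of n should take roughly 4 times as long.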
You can use both types of measures. In practice, it can be useful to measure performance with specific inputs that you are likely to work with (benchmarking), but it is also quite useful to know the asymptotic behavior of algorithms, as that tells us the (space/time) cost in the general case of "very large inputs" (technically, n->infinity).
Remember that in many cases, the main term of the runtime often far outweighs the importance of lower-order terms, especially as n takes on large values. Therefore, we can summarize or abstract away information by giving a "growth rate" or bound on the algorithm's performance, instead of working with the "exact" runtime or space requirements. Exact in quotes because the constants for the various terms of your runtime may very much vary between runs, between machines - basically different conditions will produce different "constants". In summary, we are interested in asymptotic algorithm behavior because it is still very useful and machine-agnostic.
Growth rate is a relation between the run time of the algorithm and the size of its input. However, this measure is not expressed in units of time, because the technology quickly makes these units obsolete. Only 20 years ago, a microsecond wasn't a lot of time; if you work with embedded systems, it is still not all that much. On the other hand, on mainstream computers with clock speeds of over a gigahertz a microsecond is a lot of time.
An algorithm does not become faster if you run it on faster hardware. If you say, for example, that an algorithm takes eight milliseconds for an input of size 100, the information is meaningless until you say on what computer you ran your computations: it could be a slow algorithm running on fast hardware, a fast algorithm running on slow hardware, or anything in between.
If you also say that it takes, say, 32 milliseconds for an input of size 200, that is more meaningful, because the reader can derive the growth rate: doubling the input size quadruples the time, which is a nice thing to know. However, you might as well simply specify that your algorithm is O(n^2).
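Spelling out the arithmetic behind that inference (with the purely illustrative 8 ms and 32 ms figures above):

    T(200) / T(100) = 32 ms / 8 ms = 4 = (200/100)^2,

which is consistent with the running time growing like n^2, i.e. with the algorithm being O(n^2).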

Analysis with Parallel Algorithms

Everyone knows that bubble sort is O(n^2), but this is based on the number of comparisons needed to sort. My question is: if I didn't care about the number of comparisons, but only about the output time, how would I do the analysis? Is there a way to do analysis on output time instead of on comparisons?
For example, if you could run bubble sort with the comparisons happening in parallel for all pairs (even-indexed pairs, then odd-indexed pairs), then the throughput time would be something like 2n-1. The number of comparisons would be high, but I don't care, as the final throughput time is quick.
So in essence, is there a common analysis for overall parallel performance time? If so, just give me some key terms and I'll learn the rest from google.
Parallel programming is a bit of a red herring here. Making assumptions about run time based only on big-O notation can be misleading. To compare the run times of algorithms you need the full equation, not just the big-O notation.
The problem is that big-O notation tells you the dominating term as n goes to infinity, but run times are observed over finite ranges of n. This is easy to understand from continuous mathematics (my background).
Consider y=A*x and y=B*x^2. Big-O notation would tell you that y=B*x^2 grows faster and is therefore eventually slower. However, on the interval (0, A/B) it is less than y=A*x, so for x<A/B it could actually be faster to use the O(x^2) algorithm than the O(x) algorithm.
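A tiny sketch of that crossover, with A and B as arbitrary illustrative constants:

    A, B = 100.0, 1.0              # illustrative constants only

    def time_linear(x):
        return A * x               # O(x), but with a large constant

    def time_quadratic(x):
        return B * x ** 2          # O(x^2), but with a small constant

    # Below the crossover at x = A/B, the asymptotically worse curve is cheaper.
    for x in (10, 50, 100, 200):
        print(x, time_linear(x), time_quadratic(x))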
In fact I have heard of sorting algorithms which start off with an O(n log n) algorithm and then switch to an O(n^2) algorithm when n is sufficiently small.
The best example is matrix multiplication. The naïve algorithm is O(n^3), but there are algorithms that get that down to O(n^2.3727). However, every such algorithm I have seen has such a large constant that the naïve O(n^3) is still the fastest algorithm for all practical values of n, and that does not look likely to change any time soon.
So really what you need is the full equation to compare. Something like An^3 (let's ignore lower order terms) and Bn^2.3727. In this case B is so incredibly large that the O(n^3) method always wins.
Parallel programming usually just lowers the constant. For example, when I do matrix multiplication using four cores my time goes from A*n^3 to (A/4)*n^3. The same thing will happen with your parallel bubble sort: you will decrease the constant. So it's possible that for some range of values of n your parallel bubble sort will beat a non-parallel (or possibly even parallel) merge sort. Though, unlike matrix multiplication, I think that range will be pretty small.
Algorithm analysis is not meant to give actual run times. That's what benchmarks are for. Analysis tells you how much relative complexity is in a program, but the actual run time for that program depends on so many other factors that strict analysis can't guarantee real-world times. For example, what happens if your OS decides to suspend your program to install updates? Your run time will be shot. Even running the same program over and over yields different results due to the complexity of computer systems (memory page faults, virtual machine garbage collection, IO interrupts, etc). Analysis can't take these into account.
This is why parallel processing doesn't usually come under consideration during analysis. The mechanism for "parallelizing" a program's components is usually external to your code, and usually based on a probabilistic algorithm for scheduling. I don't know of a good way to do static analysis on that. Once again, you can run a bunch of benchmarks and that will give you an average run time.
The time efficiency we get from parallel steps can be measured by round complexity, where each round consists of parallel steps occurring at the same time. By doing so, we can see how effective the throughput time is, using an analysis similar to the one we are used to.
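For example, here is a sequential sketch of odd-even transposition sort (the "parallel bubble sort" from the question) that counts rounds rather than comparisons; within one round, every comparison touches a disjoint pair of positions and could therefore run in parallel (the function name and structure are my own):

    def odd_even_transposition_sort(values):
        """Sorts a list while counting rounds; all comparisons within a single
        round are independent and could be executed simultaneously."""
        a = list(values)
        n = len(a)
        rounds = 0
        for phase in range(n):                 # n rounds always suffice
            start = phase % 2                  # even phase: (0,1),(2,3)...  odd phase: (1,2),(3,4)...
            for i in range(start, n - 1, 2):
                if a[i] > a[i + 1]:
                    a[i], a[i + 1] = a[i + 1], a[i]
            rounds += 1
        return a, rounds

    print(odd_even_transposition_sort([5, 1, 4, 2, 3]))   # -> ([1, 2, 3, 4, 5], 5)

On n elements this always finishes within n rounds, which is the kind of bound a round-complexity analysis gives you.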

Analyzing algorithms - Why only time complexity?

I was learning about algorithms and time complexity, and this question just sprang to mind.
Why do we only analyze an algorithm's time complexity?
My question is, shouldn't there be another metric for analyzing an algorithm? Say I have two algos A and B.
A takes 5s for 100 elements, B takes 1000s for 100 elements. But both have O(n) time.
So this means that the time for A and B both grow slower than cn grows for two separate constants c=c1 and c=c2. But in my very limited experience with algorithms, we've always ignored this constant term and just focused on the growth. But isn't it very important while choosing between my given example of A and B? Over here c1<<c2 so Algo A is much better than Algo B.
Or am I overthinking at an early stage and proper analysis will come later on? What is it called?
OR is my whole concept of time complexity wrong and in my given example both can't have O(n) time?
We worry about the order of growth because it provides a useful abstraction to the behaviour of the algorithm as the input size goes to infinity.
The constants "hidden" by the O notation are important, but they're also difficult to calculate because they depend on factors such as:
the particular programming language that is being used to implement the algorithm
the specific compiler that is being used
the underlying CPU architecture
We can try to estimate these, but in general it's a lost cause unless we make some simplifying assumptions and work on some well defined model of computation, like the RAM model.
But then, we're back into the world of abstractions, precisely where we started!
We measure lots of other types of complexity.
Space (Memory usage)
Circuit Depth / Size
Network Traffic / Amount of Interaction
IO / Cache Hits
But I guess you're talking more about a "don't the constants matter?" approach. Yes, they do. The reason it's useful to ignore the constants is that they keep changing. Different machines perform different operations at different speeds. You have to walk the line between useful in general and useful on your specific machine.
It's not always time. There's also space.
As for the asymptotic time cost/complexity, which O() gives you: if you have a lot of data, then, for example, O(n^2)=n^2 is going to be worse than O(n)=100*n for n>100. For smaller n you should prefer the O(n^2) one.
And, obviously, O(n)=100*n is always worse than O(n)=10*n.
The details of your problem should contribute to your decision between several possible solutions (choices of algorithms) to it.
A takes 5s for 100 elements, B takes 1000s for 100 elements. But both
have O(n) time.
Why is that?
O(N) is an asymptotic measure of the number of steps required to execute a program in relation to the program's input.
This means that for really large values of N, the algorithm's running time grows linearly.
We don't compare X and Y seconds. We analyze how the algorithm behaves as the input grows larger and larger.
O(n) gives you an idea of how much slower the same algorithm will be for a larger n; it is not meant for comparing different algorithms.
On the other hand there is also space complexity - how memory usage grows as a function of input n.

What is the difference between time complexity and running time?

What is the difference between time complexity and running time? Are they the same?
Running time is how long it takes a program to run. Time complexity is a description of the asymptotic behavior of running time as input size tends to infinity.
You can say that the running time "is" O(n^2) or whatever, because that's the idiomatic way to describe complexity classes and big-O notation. In fact the running time is not a complexity class, it's either a duration, or a function which gives you the duration. "Being O(n^2)" is a mathematical property of that function, not a full characterisation of it. The exact running time might be 2036*n^2 + 17453*n + 18464 CPU cycles, or whatever. Not that you very often need to know it in that much detail, and anyway it might well depend on the actual input as well as the size of the input.
The time complexity and running time are two different things altogether.
Time complexity is a purely theoretical concept related to algorithms, while running time is the time a piece of code actually takes to run, which is not theoretical at all.
Two algorithms may have the same time complexity, say O(n^2), but one may take twice as much running time as the other one.
From CLRS 2.2 pg. 25
The running time of an algorithm on a particular input is the number
of primitive operations or “steps” executed. It is convenient to
define the notion of step so that it is as machine-independent as
possible.
Now from Wikipedia
... time complexity of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the string
representing the input.
Time complexity is commonly estimated by counting the number of
elementary operations performed by the algorithm, where an elementary
operation takes a fixed amount of time to perform.
Notice that both descriptions emphasize the relationship of the size of the input to the number of primitive/elementary operations.
I believe this makes it clear both refer to the same concept.
In practice though you'll find that enterprisey jargon rarely matches academic terminology, e.g., tons of people work doing code optimization but rarely solve optimization problems.
"Running time" refers to the algorithm under consideration:
Another algorithm might be able to solve the same problem asymptotically faster, that is, with less running time.
"Time complexity" on the other hand is inherent to the problem under consideration.
It is defined as the least running time of any algorithm solving said problem.
The same distinction applies to other measures of algorithmic cost, such as memory, #processors, communication volume, etc.
(Blum's Speedup Theorem demonstrates that the "least" time may in general not be attainable...)
To analyze an algorithm is to determine the amount of resources (such as time and storage) necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).
Running time measures the number of operations it takes to complete a piece of code or a program. The keywords here are "operations" and "complete": the time taken for every single operation to complete can be affected by the processor, memory, etc.
With running time, if we have 2 different algorithms solving the same problem, the optimized algorithm might take longer to complete than the non-optimized one because of varying factors like RAM, the current state of the PC (serving other programs), or even the function used to measure the runtime itself.
For this reason, it is not enough to measure the efficiency of an algorithm by how long it takes to complete; instead we measure how the number of operations grows with the input. That way all the external factors are eliminated, and that is exactly what time complexity does.
Time complexity is the measurement of an algorithm's time behavior as input size increases.
Time complexity can also be calculated from the logic behind the algorithm/code.
On the other hand, running time can only be measured by actually running the finished code.
