What counts as an operation in algorithms?

So I've just started learning algorithms and data structures, and I've read about Big O and how it describes the complexity of algorithms based on how the number of operations required scales with the input size.
But what actually counts as an operation? In this bubble sort, does each iteration of the for loop count as an operation, or only when an if statement is triggered, or all of them?
And since there are so many different algorithms of all kinds, how do you immediately identify what would count as an "operation" happening in the algorithm's code?
function bubbleSort(array) {
    for (let i = 0; i < array.length; i++) {
        for (let j = 0; j < array.length; j++) {
            if (array[j + 1] < array[j]) {
                let tmp = array[j]
                array[j] = array[j+1]
                array[j+1] = tmp
            }
        }
    }
    return array
}

You can count anything as an operation that will execute within a constant amount of time, independent of input. In other words, operations that have a constant time complexity.
If we assume your input consists of fixed-size integers (like 32-bit or 64-bit), then all of the following can be considered such elementary operations:
i++
j < array.length
array[j + 1] < array[j]
let tmp = array[j]
...
But that also means you can take several of such operations together and still consider them an elementary operation. So this is also an elementary operation:
if (array[j + 1] < array[j]) {
let tmp = array[j]
array[j] = array[j+1]
array[j+1] = tmp
}
So, don't concentrate on breaking down operations into smaller operations, and those again into even smaller operations, when you are already certain that the larger operation is O(1).
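To make this concrete, here is a minimal sketch that treats the whole if-block as one elementary operation and simply counts how often it is reached. The function name countInnerSteps and the returned object are assumptions made for illustration; the loops themselves are the ones from the question.

```javascript
// Hypothetical helper: runs the question's bubble sort on a copy of the input
// and counts how many times the inner-loop body (one O(1) unit) is executed.
function countInnerSteps(input) {
  const array = [...input]; // copy, so the caller's array is left untouched
  let steps = 0;
  for (let i = 0; i < array.length; i++) {
    for (let j = 0; j < array.length; j++) {
      steps++; // the comparison plus the possible swap count as ONE operation
      if (array[j + 1] < array[j]) {
        const tmp = array[j];
        array[j] = array[j + 1];
        array[j + 1] = tmp;
      }
    }
  }
  return { sorted: array, steps };
}

console.log(countInnerSteps([4, 3, 2, 1]).steps); // 16, i.e. n * n for n = 4
```

Whether you count the block as one operation or as four, the total is a constant multiple of n * n, which is why the finer breakdown does not change the Big O.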

Usually, every step that happens counts as a single operation. This is one of the reasons we don't actually count the exact number of them, but instead use asymptotic notations (big O and big Theta).
However, sometimes you are interested in one kind of operation only. A common example is algorithms that use IO. Since IO is significantly more time-consuming than anything happening on the CPU, you often just "count" the number of IO operations instead. In these cases, you often care about the exact number of times an IO occurs, and can't rely on asymptotic notation alone.

Related

How to better analyze runtime of complex nested loops?

When you have nested for-loops where the amount of loops for the nested one changes each time, what is the easiest approach to analyze the total runtime? It's hard for me to conceptualize how to factor in the changing max value since I've only ever analyzed nested loops where the max was out of N, which led to a pretty simple O(n^2) runtime. Should I make a summation and use that?
For Example:
int val = 1;
for (int i = 0; i < n; i++) {
    for (int j = 0; j < val; j++) {
        val++;
    }
}
My intuition tells me this is 2^n, but I have no practical way of really proving that.
In general, to find the time complexity of loops, you need to find how many times they execute, as a function of the input. Sometimes it is straightforward, sometimes it is not. You may end up with complex mathematical expressions, and in some cases you may not be able to decide at all.
As for your example, your outer loop would clearly run exactly n times. Your inner loop, however, checks its loop condition j < val, which is true the first time because j = 0 and val = 1. Each iteration then increments both val and j by 1, so val always stays exactly one ahead of j and j < val is always true. Therefore it is an infinite loop, and your program thus runs in O(∞).
(As a side note, in practice, depending on the language of implementation, eventually val may overflow and become smaller than j, which will cause the loop to finish. In this case, it only depends on the integer size you are using.)
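The invariant can be checked with a small sketch that runs the inner loop for a capped number of steps and reports the gap between val and j. The function name gapAfter and the cap are assumptions added so that the sketch terminates; JavaScript numbers also do not wrap around like a fixed-size int, so the overflow case from the side note is not reproduced here.

```javascript
// Hypothetical helper: simulates the inner loop of the question for at most
// `iterations` steps and returns val - j afterwards.
function gapAfter(iterations) {
  let val = 1;
  let j = 0;
  for (let step = 0; step < iterations && j < val; step++) {
    val++; // the loop body from the question
    j++;   // the loop's own increment
  }
  return val - j;
}

console.log(gapAfter(0));       // 1
console.log(gapAfter(1000000)); // 1, so j < val still holds
```

Because the gap never shrinks, the condition j < val can only fail through overflow, never through the arithmetic itself.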

Time Complexity of 3 nested for loops

I found pretty good replies to similar questions. However, I want the time complexity equation of the following piece of code.
sum = 0; c1=11
for (i=0; i<N; i++) c2=6
for (j=0; j<N; j++) c2=6
for (j=0; j<N; j++) c2=7
sum += arr[i][j] c3=2
While each statement has a cost associated with it, I require complete time complexity equation and its answer.
Regards
The comments section got quite long so I am going to write up an answer summarizing everything.
Measuring Time Complexity
In Computer Science, we measure time complexity by the number of steps/iterations your algorithm takes to evaluate.
So if you have a simple array of length n and you go through this array only once, say to print all the elements, we say that this algorithm is O(n) because the time it takes to run will grow proportionally to the size of the array you have, thus n.
You can think of Big-O O(..) as a higher order function that compares other functions. If we say f(x) = O(n), it means that your function grows at most as fast as y = n, thus linearly. This means that if you were to plot these functions on a graph, there would be a point x = c after which the graph of some constant multiple of n will always be on top of f(x) for any given x > c. Big-O signifies an upper bound of a function in terms of another function.
So let's look at your original question and what it means to be constant time. Say we have this function
def printFirst5(arr: Array[Int]): Unit = {
  // Always visits exactly 5 elements, regardless of arr.length
  for (i <- 0 until 5) {
    println(arr(i))
  }
}
This is what we call a constant time algorithm. Can you see why? Because no matter what array you pass into this (as long as it has at least 5 elements), it will only go through the first 5 elements. You can pass it an array of length 100000000000 or an array of length 10; it doesn't matter. In each case it will only look at the first 5 elements, meaning this function printFirst5 will never go above the line y=5 in terms of time complexity. These kinds of functions are denoted O(1) for constant time.
Now, finally, let's look at your edited version. (I am not sure what you are trying to do in your example because it is syntactically wrong, so I will write my own example.)
def groupAllBy3(array: Array[Int]): Unit = {
  for (i <- array.indices) {
    for (j <- array.indices) {
      for (k <- array.indices) {
        // Runs N * N * N times in total
        println(s"${array(i)}, ${array(j)}, ${array(k)}")
      }
    }
  }
}
This function's time complexity is O(N^3). Why? Let's take a look.
The innermost loop will go through N elements for every j.
How many values of j are there? Well, there will be N of them for every i.
How many values of i are there? N many.
So in total we get numberof-i * numberof-j * numberof-k = N * N * N, which is O(N^3).
Just to make sure you understand this correctly, let's take a look at another example. What would happen if these loops weren't nested? If we had:
def printAllx3(array: Array[Int]): Unit = {
  // Three separate passes over the array, one after another
  for (i <- array.indices) {
    println(array(i))
  }
  for (j <- array.indices) {
    println(array(j))
  }
  for (k <- array.indices) {
    println(array(k))
  }
}
What is the case here?
The first loop goes through N elements, the second loop goes through N elements, and the third loop goes through N elements. But they don't depend on each other in terms of iterations, so we get N + N + N = 3N = O(N).
Do you see the difference?
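The difference between multiplying and adding can also be seen by counting iterations directly. This is only a sketch: the function names nestedIterations and sequentialIterations are made up for illustration, and the loop bodies just increment a counter instead of printing.

```javascript
// Hypothetical helper: three NESTED loops over n items each.
function nestedIterations(n) {
  let count = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      for (let k = 0; k < n; k++) {
        count++; // reached n * n * n times
      }
    }
  }
  return count;
}

// Hypothetical helper: three loops over n items each, one AFTER another.
function sequentialIterations(n) {
  let count = 0;
  for (let i = 0; i < n; i++) count++;
  for (let j = 0; j < n; j++) count++;
  for (let k = 0; k < n; k++) count++;
  return count; // n + n + n
}

console.log(nestedIterations(10));     // 1000
console.log(sequentialIterations(10)); // 30
```

Multiplying n by 10 multiplies the first count by 1000 but the second only by 10, which is the practical meaning of O(N^3) versus O(N).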
With all due respect, I believe you are missing some of the fundamentals of what time complexity is & how we measure it. There is only so much I can explain here, I highly recommend you do some reading on the subject and ask any further questions you don't understand.
Hope this helps

Proposed analysis of algorithm

I have been practicing analyzing algorithms lately. I feel like I have a pretty good understanding of analyzing non-recursive algorithms, though I am unsure, and I have just begun working towards a full understanding of recursive algorithms as well. However, I have not had a formal check on my methods to confirm that what I have been doing is indeed correct.
Would it be too much to ask if someone could check a few algorithms that I have implemented and analyzed, and see if my understanding is along the right lines or if I am completely off?
Here:
1)
sum = 0;
for (i = 0; i < n; i++){
    for (j = 0; j < i*i; j++){
        if (j % i == 0) {
            for (k = 0; k < j; k++){
                sum++;
            }
        }
    }
}
My analysis of this one was O(n^5) due to:
Sum(i = 0 to n)[Sum(j = 0 to i^2)[Sum(k = 0 to j) of 1]]
which evaluated to:
(1/2)(n^5/5 + n^4/2 + n^3/3 - n/30) + (1/2)(n^3/3 + n^2/2 + n/6) + (1/2)(n^3/3 + n^2/2 + n/6) + n + 1.
Hence it is O(n^5)
Is this a correct evaluation of the loops as a triple summation? I have assumed that the if statement will always pass, for worst-case complexity. Is this a correct assumption for the worst case?
2)
int tonyblair (int n, int a) {
if (a < 12) {
for (int i = 0; i < n; i++){
System.out.println("*");
}
tonyblair(n-1, a);
} else {
for (int k = 0; k < 3000; k++){
for (int j = 0; j < nk; j++){
System.out.println("#");
}
}
}
}
My analysis of this algorithm is O(infinity) due to the infinite recursion in the if statement if it is assumed to be true, which would be the worst case. Although, for pure analysis, I analyzed if this were not true and the if statement would not run. I then got a complexity of O(nk) due to:
Sum(k = 0 to 3000)[Sum(j = 0 to nk) of 1]
which then evaluated to nk(3001) + 3001. Hence is O(nk), where k is not discarded due to it controlling the number of iterations of the loop.
Number 1
I can't tell how you've derived your formula. Usually adding terms happens when there are multiple steps in an algorithm, such as precomputing data and then looking up values from the data. Instead, nested for loops implies multiplication. Also, the worst case is the best case for this snippet of code, because given a value of n, sum will be the same at the end.
To find the complexity, we want to find the number of times that the inner loop is evaluated. Summations are often easy to solve if they go from 1 to n, so I'm going to drop the 0s from them later on. If i is 0, the middle loop won't run, and if j is 0, the inner loop won't run. We can rewrite the code equivalently as:
sum = 0;
for (i = 1; i < n; i++)
{
    for (j = 1; j < i*i; j++)
    {
        if (j % i == 0)
        {
            for (k = 0; k < j; k++)
            {
                sum++;
            }
        }
    }
}
I could make my life harder by forcing the outer loop to start at 2, but I'm not going to. The outer loop now runs from 1 to n-1. The middle loop runs based on the current value of i, so we need to do a summation:
Sum(i = 1 to n-1)[work done by the middle loop for this i]
The middle for loop always goes to (i^2 - 1), and j will only be divisible by i for a total of (i - 1) times (i, i*2, i*3, ..., i*(i-2), i*(i-1)). With this, we get:
Sum(i = 1 to n-1)[Sum(j = 1 to i-1)[work done by the inner loop]]
The inner loop then executes j times. The j in our summation is not the same as the j in the code though. The j in the summation counts the times the if statement passes. Each time it passes, the j in the code will be (i * (number of passes so far)) = i * (the j in the summation). Therefore, we have:
Sum(i = 1 to n-1)[Sum(j = 1 to i-1)[i * j]]
We can move the i to in-between the two summations, as it is a constant for the inner summation. Then, the formula for the sum of 1 to n is well known: n*(n+1)/2. Because the inner summation only goes to i - 1, we must subtract i out. This gives:
Sum(i = 1 to n-1)[i * (i*(i+1)/2 - i)] = (1/2) * Sum(i = 1 to n-1)[i^3 - i^2]
The summations for the sum of squares and the sum of cubes are also well known. Keeping in mind that we are only summing to n-1 in both cases, we must remember to subtract n^3 and n^2, respectively, and we get:
(1/2) * [(n^2*(n+1)^2/4 - n^3) - (n*(n+1)*(2n+1)/6 - n^2)]
This is obviously n^4. If we solve it all the way, we get:
n^4/8 - 5n^3/12 + 3n^2/8 - n/12
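As a sanity check on this analysis, the loops can simply be run and the result compared against a closed form. The closed form n(n-1)(3n-1)(n-2)/24 used below is my own simplification of (1/2) * Sum(i = 1 to n-1)[i^3 - i^2], so treat it as an assumption that the code verifies rather than something stated in the answer; the function names are illustrative.

```javascript
// Runs the loops from the question and returns the final value of sum.
function countInnermost(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < i * i; j++) {
      // For i = 0 the middle loop never runs, so j % 0 is never evaluated.
      if (j % i === 0) {
        for (let k = 0; k < j; k++) {
          sum++;
        }
      }
    }
  }
  return sum;
}

// Assumed closed form for the same count: a degree-4 polynomial in n.
function closedForm(n) {
  return (n * (n - 1) * (3 * n - 1) * (n - 2)) / 24;
}

console.log(countInnermost(4), closedForm(4)); // 11 11
console.log(countInnermost(5), closedForm(5)); // 35 35
```

The leading term of the closed form is n^4/8, which matches the n^4 growth argued above rather than the n^5 from the question.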
Number 2
For the last one, it is in fact O(infinity) if a < 12 because of the if statement. Well, technically everything is O(infinity), because Big-O only provides an upper bound on runtime. If a < 12, it is also omega(infinity) and theta(infinity). If only the else runs, then we have the summation from 1 to 2999 of i*n:
Sum(i = 1 to 2999)[i * n] = n * Sum(i = 1 to 2999)[i]
It's very important to notice that the summation from 1 to 2999 is a constant (it's 4498500). No matter how large a constant is, it's still a constant, and not dependent on n. We will end up throwing it out of the runtime calculations. Sometimes, when a theoretically fast algorithm has a large constant, it is practically slower than other algorithms that are theoretically slow. One example I can think of is Chazelle's linear time triangulation algorithm. No one has ever implemented it. In any case, we have 4498500 * n, which is theta(n).
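The constant can be confirmed by counting how many times the else branch would print. This sketch assumes, as the answer does, that nk in the question's code means n * k; the function name elseBranchPrints is made up, and a counter stands in for System.out.println.

```javascript
// Hypothetical helper: counts the "#" lines the else branch would print,
// assuming the inner loop bound nk means n * k.
function elseBranchPrints(n) {
  let prints = 0;
  for (let k = 0; k < 3000; k++) {
    for (let j = 0; j < n * k; j++) {
      prints++;
    }
  }
  return prints;
}

console.log(elseBranchPrints(1)); // 4498500
console.log(elseBranchPrints(2)); // 8997000, exactly twice as many
```

Doubling n doubles the count, which is what linear growth with a large constant factor looks like.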

What is the relative running time of this bubble sort derivative?

public class Sort {
    public static void sort(int[] arr) {
        for (int i = 0; i < arr.length - 1; i++) {
            if (arr[i] > arr[i + 1]) {
                /* swap */
                int temp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = temp;
                /* make i == -1 because at the end of the loop, it increments to go to 0 */
                i = -1;
            }
        }
    }
}
This is different from a traditional bubble sort. But does it have a worse running time?
I believe this is still O(N^2), so I do not see how it is any worse...
Funny beast.
Let's take the worst case, an anti-sorted input:
9 8 7 6 5 4 3 2 1
How many comparisons are needed to move the 9 from the first to the last position?
1+2+3+4+5+6+7+8 = n*(n-1)/2
What about sending the 8 to the second to last position?
1+2+3+4+5+6+7 = (n-2)*(n-3)/2
So, your algorithm is actually cubic in comparisons performed!
Still, there's a light at the end of the tunnel:
It is still only quadratic in the number of swaps performed, like the standard bubble-sort.
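The two counts can be measured separately with an instrumented copy of the algorithm. This is a JavaScript port of the question's Java code, written as a sketch; the function name sortWithCounts and the returned object are assumptions for illustration.

```javascript
// Hypothetical helper: the restart-on-swap sort from the question,
// instrumented to count comparisons and swaps separately.
function sortWithCounts(input) {
  const arr = [...input]; // work on a copy
  let comparisons = 0;
  let swaps = 0;
  for (let i = 0; i < arr.length - 1; i++) {
    comparisons++;
    if (arr[i] > arr[i + 1]) {
      const temp = arr[i];
      arr[i] = arr[i + 1];
      arr[i + 1] = temp;
      swaps++;
      i = -1; // the loop increment brings i back to 0
    }
  }
  return { sorted: arr, comparisons, swaps };
}

const small = sortWithCounts([3, 2, 1]);
console.log(small.comparisons, small.swaps); // 6 3

// Swaps on the anti-sorted input of length 9: one per inversion.
console.log(sortWithCounts([9, 8, 7, 6, 5, 4, 3, 2, 1]).swaps); // 36
```

Each swap fixes exactly one out-of-order pair, so the swap count on anti-sorted input is n*(n-1)/2, while the comparison count grows faster because every swap restarts the scan from index 0.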
This is definitely an implementation of bubblesort, although it is not the traditional algorithm. You're right that the number of swaps is in O(n^2).
Even when two algorithms are the same in terms of asymptotic complexity, the running time on a real machine may differ. Remember that constant factors are omitted when we are looking at O(whatever) complexity. So
O(3 n^2) = O(n^2)
Keep in mind that O(whatever) are sets of functions.
Consider traditional bubblesort: there you go through the entire array by constantly increasing the array index. In this algorithm you make some comparisons that are in fact not needed. Traditional bubblesort has 2 for loops, and when you increment the loop variable of the outer for loop, you know that everything on the left is already sorted, so you don't have to loop over it again.
In practice this will probably not make a difference because the algorithm will need so much time that you will be fed up with it before that effect occurs.

Time Complexity - Calculating Worst Case For Algorithms

I am reading some information on time complexity and I'm quite confused as to how the following time complexities are achieved and if there is a particular set of rules or methods for working this out?
1)
Input: int n
for(int i = 0; i < n; i++){
print("Hello World, ");
}
for(int j = n; j > 0; j--){
print("Hello World");
}
Tight: 6n + 5
Big O: O(n)
2)
Input: l = array of comparable items
Output: l = array of sorted items
Sort:
for(int i = 0; i < l.length; i++){
    for(int j = 0; j < l.length; j++){
        if(l{i} > l{j}){
            Swap(l{i},l{j});
        }
    }
}
return l;
Worst Case Time Complexity: 4n^2 + 3n + 2 = O(n^2)
For a given algorithm, time complexity or Big O is a way to provide a fair estimate of the "total elementary operations performed by the algorithm" in relation to the given input size n.
Type-1
Let's say you have an algorithm like this:
a=n+1;
b=a*n;
There are 2 elementary operations in the above code. No matter how big your n is, a computer will always perform 2 operations for this code, as the algorithm does not depend on the size of the input, so the Big-O of the above code is O(1).
Type-2
For this code:
for(int i = 0; i < n; i++){
a=a+i;
}
I hope you understand that the Big-O here is O(n), as the elementary operation count directly depends on the size of n.
Type-3
Now what about this code:
//Loop-1
for(int i = 0; i < n; i++){
    print("Hello World, ");
}
//Loop-2
for(int i = 0; i < n; i++){
    for(int j = 0; j < n; j++) {
        x=x+j;
    }
}
As you can see, loop-1 is O(n) and loop-2 is O(n^2). So it feels like the total complexity should be O(n)+O(n^2). But no, the time complexity of the above code is O(n^2). Why? Because we are trying to give a fair enough count of the elementary operations performed by the algorithm for a given input size n, one that is comparatively easy for another person to understand. With this logic, O(n)+O(n^2) becomes O(n^2), and O(n^2)+O(n^3)+O(n^4) becomes O(n^4)!
Again, you may ask: but how? How do all the lower powers become so insignificant when we add them to a higher power that we can completely omit them when describing the complexity of our algorithm to another human?
I will try to show the reason for this case: O(n)+O(n^2)=O(n^2).
Let's say n=1000. Then the exact count for the O(n) part is 1000 operations and the exact count for the O(n^2) part is 1000*1000=1000000, so the O(n^2) part is 1000 times bigger than the O(n) part. This means your program will spend most of its execution time in the O(n^2) part, and thus it is not worth mentioning that your algorithm also has an O(n) part.
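The same comparison can be made by counting loop-body executions directly. This is only a sketch: loopOneOps and loopTwoOps are made-up names, and each loop body is counted as one operation.

```javascript
// Hypothetical helper: body executions of Loop-1 (a single loop).
function loopOneOps(n) {
  let ops = 0;
  for (let i = 0; i < n; i++) {
    ops++;
  }
  return ops;
}

// Hypothetical helper: body executions of Loop-2 (two nested loops).
function loopTwoOps(n) {
  let ops = 0;
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      ops++;
    }
  }
  return ops;
}

console.log(loopOneOps(1000));                    // 1000
console.log(loopTwoOps(1000));                    // 1000000
console.log(loopTwoOps(1000) / loopOneOps(1000)); // 1000
```

The ratio between the two counts is n itself, so the larger n gets, the smaller the share of the O(n) part in the total.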
PS. Pardon my English :)
In the first example, you loop n times, and you do so twice. The first time you go from 0 up to n, and the second time you go from n down to 0. So, to simplify this, we can say that it took you 2n. When dealing with Big O notation, you should keep in mind that we care about the bounds:
As a result, O(2n)=O(n)
and O(an+b)=O(n)
Input: int n // operation 1
for(int i = 0; i < n; i++){ // operation 2
print("Hello World, "); // Operation 3
}
for(int j = n; j > 0; j--) // Operation 4
{
print("Hello World"); //Operation 5
}
As you can see, we have a total of 5 operations outside the loops.
Inside the first loop, we do three internal operations: checking if i is less than n, printing "Hello World", and incrementing i.
Inside the second loop, we also have three internal operations.
So, the total number of operations that we need is: 3n (for the first loop) + 3n (for the second loop) + 5 (operations outside the loops). As a result, the total number of steps required is 6n+5 (that is your tight bound).
As I mentioned before, O(an + b) = O(n) because once an algorithm is linear, a and b do not have a great impact when n is very large.
So, your time complexity will become: O(6n+5) = O(n).
You can use the same logic for the second example keeping in mind that two nested loops take n² instead of n.
I will slightly modify John's answer. Defining n is one constant operation. Defining integer i and assigning 0 to it is 2 constant operations. Defining integer j and assigning n to it is another 2 constant operations. Checking the loop conditions for i and j, incrementing, and the print statements depend on n, so the total will be 3n+3n+5, which is equal to 6n+5. Here we cannot skip any of the statements during execution, so the average-case running time will also be the worst-case running time, which is O(n).
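The 6n+5 figure depends entirely on the counting convention used in these answers, so here is a sketch that tallies operations under that same convention. The function name countOps is made up; the convention (1 for defining n, 2 for each loop variable's declaration and assignment, 3 per iteration for check, print and update) is taken from the answers above and ignores the final failing loop check.

```javascript
// Hypothetical helper: tallies operations for the two-loop example,
// following the counting convention described in the answers.
function countOps(n) {
  let ops = 0;
  ops += 1; // defining n
  ops += 2; // declaring i and assigning 0
  for (let i = 0; i < n; i++) {
    ops += 3; // condition check, print, increment
  }
  ops += 2; // declaring j and assigning n
  for (let j = n; j > 0; j--) {
    ops += 3; // condition check, print, decrement
  }
  return ops;
}

console.log(countOps(0));  // 5
console.log(countOps(10)); // 65, i.e. 6 * 10 + 5
```

A different convention (for example, also counting the last failed condition check) changes the constants but not the O(n) result.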
