What is the time complexity of a comparison or substitution operation? - algorithm

When estimating the time complexity of a certain algorithm, let's say the following in pseudo code:
for (int i=0; i<n; i++) ---> O(n)
//comparison? ---> ?
//substitution ---> ?
for (int i=0; i<n; i++) ---> O(n)
//some function which is not recursive
In this case the time complexity of each loop is O(n) because we iterate over an input of size n. But what about the comparison and substitution operations? Are they constant time, since they don't depend on n?
Thanks

Both of the other answers assume you are comparing some sort of fixed-size data type, such as 32-bit integers, doubles, or characters. If you are using operators like < in a language such as Java where they can only be used on fixed-size data types, and cannot be overloaded, then that is correct. But your question is not language-specific, and you also did not say you are comparing using such operators.
In general, the time complexity of a comparison operation depends on the data type you are comparing. It takes O(1) time to compare 64-bit integers, doubles, or characters, for example. But as a counter-example, comparing strings in lexicographic order takes O(min(k, k')) time in the worst case, where k and k' are the lengths of the strings.
For example, here is the Java source code for the String.compareTo method in OpenJDK 7, which clearly does not take constant time:
public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;
    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
Therefore when analysing the time complexity of comparison-based sorting algorithms, we often analyse their complexity in terms of the number of comparisons and substitutions, rather than the number of basic operations; for example, selection sort does O(n) substitutions and O(n²) comparisons, whereas merge sort does O(n log n) substitutions and O(n log n) comparisons.
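To make the distinction between counting comparisons and counting substitutions concrete, here is a minimal sketch of selection sort that tallies the two separately. The class name SelectionSortCounter and its fields are hypothetical, introduced only for illustration, and swaps are used as a stand-in for substitutions.

```java
// Hypothetical helper for illustration: counts comparisons and swaps separately.
final class SelectionSortCounter {
    long comparisons = 0;
    long swaps = 0;

    // Sorts the array in place using selection sort.
    void sort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;          // one element comparison per inner step
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            if (min != i) {             // at most one swap per outer iteration
                int tmp = a[i];
                a[i] = a[min];
                a[min] = tmp;
                swaps++;
            }
        }
    }
}
```

For an array of n elements the comparison counter always ends at n(n-1)/2, while the swap counter never exceeds n-1. If each comparison were itself expensive (long strings, say), the comparison count is the figure that would dominate the real cost.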

First, read this book; it has a good explanation of this topic.
Comparison. Suppose we have two variables a and b. When we evaluate a == b, we just read a and b from memory and compare them. Let's define "c" as the cost in memory and "t" as the cost in time. In this case we use 2c (because we use two memory cells) and 1t (because there is only one operation, with constant cost), so the time cost 1t is a constant. Thus the time complexity is constant.
Substitution. This is much the same as the previous operation: we use two variables and one operation. The operation is the same for any type, so the time cost of the substitution is constant, and therefore its complexity is constant too.

but how about the comparison and substitution operations are they constant time since they don't depend on n?
Yes. Comparison and substitution operations contribute only a constant factor because their execution time doesn't depend on the size of the input. They do take time to execute but, again, that time is independent of the input size.
However, the execution time of your for loop grows proportionally to the number of items n and so its time complexity is O(n).
UPDATE
As @kaya3 correctly pointed out, we assume that we are dealing with fixed-size data types. If they're not, check the answer from @kaya3.

Related

Search a Sorted Array for First Occurrence of K

I'm trying to solve question 11.1 in Elements of Programming Interviews (EPI) in Java: Search a Sorted Array for First Occurrence of K.
The problem description from the book:
Write a method that takes a sorted array and a key and returns the index of the first occurrence of that key in the array.
The solution they provide in the book is a modified binary search algorithm that runs in O(logn) time. I wrote my own algorithm also based on a modified binary search algorithm with a slight difference - it uses recursion. The problem is I don't know how to determine the time complexity of my algorithm - my best guess is that it will run in O(logn) time because each time the function is called it reduces the size of the candidate values by half. I've tested my algorithm against the 314 EPI test cases that are provided by the EPI Judge so I know it works, I just don't know the time complexity - here is the code:
public static int searchFirstOfKUtility(List<Integer> A, int k, int Lower, int Upper, Integer Index)
{
    while(Lower<=Upper){
        int M = Lower + (Upper-Lower)/2;
        if(A.get(M)<k)
            Lower = M+1;
        else if(A.get(M) == k){
            Index = M;
            if(Lower!=Upper)
                Index = searchFirstOfKUtility(A, k, Lower, M-1, Index);
            return Index;
        }
        else
            Upper=M-1;
    }
    return Index;
}
Here is the code that the tests cases call to exercise my function:
public static int searchFirstOfK(List<Integer> A, int k) {
    Integer foundKey = -1;
    return searchFirstOfKUtility(A, k, 0, A.size()-1, foundKey);
}
So, can anyone tell me what the time complexity of my algorithm would be?
Assuming that passing arguments is O(1) instead of O(n), performance is O(log(n)).
The usual theoretical approach for analyzing recursion is to apply the Master Theorem. It says that if the running time of a recursive algorithm satisfies a recurrence of the form:
T(n) = a T(n/b) + f(n)
then there are 3 cases. In plain English they correspond to:
1. Performance is dominated by all the calls at the bottom of the recursion, so is proportional to how many of those there are.
2. Performance is equal between each level of recursion, and so is proportional to how many levels of recursion there are, times the cost of any layer of recursion.
3. Performance is dominated by the work done in the very first call, and so is proportional to f(n).
You are in case 2. Each recursive call costs the same, and so performance is dominated by the fact that there are O(log(n)) levels of recursion times the cost of each level. Assuming that passing a fixed number of arguments is O(1), that will indeed be O(log(n)).
Note that this assumption is true for Java because you don't make a complete copy of the array before passing it. But it is important to be aware that it is not true in all languages. For example, I recently did a bunch of work in PL/pgSQL, and there arrays are passed by value, meaning that your algorithm would have been O(n log(n)).
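For comparison, here is a minimal iterative sketch of the same search. It is not the book's solution or the asker's code; the class name FirstOccurrence is hypothetical, and it assumes a list with O(1) get (such as ArrayList). Because the interval [lower, upper] at least halves on every pass, the loop runs O(log(n)) times without any recursion to analyze.

```java
import java.util.List;

// Hypothetical illustration: iterative search for the first occurrence of k.
final class FirstOccurrence {
    // Returns the index of the first occurrence of k in the sorted list a, or -1.
    static int searchFirstOfK(List<Integer> a, int k) {
        int lower = 0;
        int upper = a.size() - 1;
        int result = -1;
        while (lower <= upper) {
            int mid = lower + (upper - lower) / 2;
            int value = a.get(mid);
            if (value < k) {
                lower = mid + 1;        // k can only be to the right of mid
            } else {
                if (value == k) {
                    result = mid;       // remember the match, keep looking left
                }
                upper = mid - 1;
            }
        }
        return result;
    }
}
```

Each iteration does a constant amount of work, so the total cost is the number of iterations times a constant, which matches the case 2 reasoning above.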

Reverse an array-run time

The following code reverses an array. What is its runtime?
My heart says it is O(n/2), but my friend says O(n). Which is correct? Please answer with a reason. Thank you so much.
void reverse(int[] array) {
    for (int i = 0; i < array.length / 2; i++) {
        int other = array.length - i - 1;
        int temp = array[i];
        array[i] = array[other];
        array[other] = temp;
    }
}
Big-O complexity captures how the run-time scales with n as n gets arbitrarily large. It isn't a direct measure of performance. f(n) = 1000n and f(n) = n/128 + 10^100 are both O(n) because they both scale linearly with n even though the first scales much more quickly than the second, and the second is actually prohibitively slow for all n because of the large constant cost. Nonetheless, they have the same complexity class. For these sorts of reasons, if you want to differentiate actual performance between algorithms or define the performance of any particular algorithm (rather than how performance scales with n) asymptotic complexity is not the best tool. If you want to measure performance, you can count the exact number of operations performed by the algorithm, or better yet, provide a representative set of inputs and just measure the execution time on those inputs.
As for the particular problem: yes, the for loop runs n/2 times, but you also do some constant number of operations, c, in each iteration (subtractions, array accesses, variable assignments, the conditional check on i). Maybe c = 10; it's not important to count precisely to determine the complexity class, only to know that it's constant. The run-time is then f(n) = c*n/2, which is O(n): the fact that you only do n/2 iterations doesn't change the complexity class.
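One way to see the linear scaling is to instrument the loop. The sketch below is the same reversal with an iteration counter added; the class name ReverseCount and the return value are additions for illustration only.

```java
// Hypothetical illustration: the reversal loop with an iteration counter.
final class ReverseCount {
    // Reverses the array in place and returns how many loop iterations ran.
    static int reverse(int[] array) {
        int iterations = 0;
        for (int i = 0; i < array.length / 2; i++) {
            int other = array.length - i - 1;
            int temp = array[i];
            array[i] = array[other];
            array[other] = temp;
            iterations++;
        }
        return iterations;
    }
}
```

An array of 1000 elements takes 500 iterations and an array of 2000 takes 1000: doubling n doubles the work, which is exactly what O(n) expresses. The factor of 1/2 is a constant and is dropped.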

I'm confused about space complexity

I'm a little confused about the space complexity.
int fn_sum(int a[], int n){
    int result =0;
    for(int i=0; i<n ; i++){
        result += a[i];
    }
    return result;
}
In this case, is the space complexity O(n) or O(1)?
I think it uses only result,i variables so it is O(1). What's the answer?
(1) Space complexity: how much memory does your algorithm allocate as a function of the input size?
int fn_sum(int a[], int n){
    int result = 0; //here you have 1 variable allocated
    for(int i=0; i<n ; i++){
        result += a[i];
    }
    return result;
}
Since the variable you created (result) is a single value (not a list, an array, etc.), your space complexity is O(1): the space usage is constant, meaning it doesn't change with the size of the input.
(2) Time complexity: how does the number of operations your algorithm performs relate to the size of the input?
int fn_sum(int a[], int n){ //the input is an array of size n
    int result = 0; //1 variable definition operation = O(1)
    for(int i=0; i<n ; i++){ //loop that will run n times whatever it has inside
        result += a[i]; //1 sum operation = O(1) that runs n times = n * O(1) = O(n)
    }
    return result; //1 return operation = O(1)
}
All the operations together take O(1) + O(n) + O(1) = O(n + 2) = O(n) time, following the rule of dropping multiplicative and additive constants from the function.
I would answer a bit differently:
Since the memory consumed by int fn_sum(int a[], int n) doesn't grow with the number of input items, its space complexity is O(1).
However, its runtime complexity is O(N), since it iterates over N items.
And yes, there are algorithms that consume more memory in order to run faster. A classic example is caching the results of operations.
https://en.wikipedia.org/wiki/Space_complexity
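As a small sketch of the caching trade-off mentioned above, consider Fibonacci numbers. This example is not from the original answers; the class name FibCache and both method names are hypothetical. The cached version stores one entry per value of n, spending O(n) extra memory to cut the running time from exponential to O(n).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of trading memory for speed with a cache.
final class FibCache {
    private final Map<Integer, Long> cache = new HashMap<>();

    // Cached version: O(n) time, O(n) extra space for the cache.
    long fib(int n) {
        if (n < 2) {
            return n;
        }
        Long cached = cache.get(n);
        if (cached != null) {
            return cached;
        }
        long value = fib(n - 1) + fib(n - 2);
        cache.put(n, value);
        return value;
    }

    // Uncached version: no cache memory, but exponential time.
    static long fibNaive(int n) {
        return n < 2 ? n : fibNaive(n - 1) + fibNaive(n - 2);
    }
}
```

Both methods return the same values; the difference is only in how time and memory are spent. Note that long overflows beyond n = 92, so this sketch is meant for small n.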
If int means the 32-bit signed integer type, the space complexity is O(1) since you always allocate, use and return the same number of bits.
If this is just pseudocode and int means integers represented in their binary representations with no leading zeroes and maybe an extra sign bit (imagine doing this algorithm by hand), the analysis is more complicated.
If negatives are allowed, the best case is alternating positive and negative numbers so that the result never grows beyond a constant size - O(1) space.
If zero is allowed, an equally good case is to put zero in the whole array. This is also O(1).
If only positive numbers are allowed, the best case is more complicated. I expect the best case will see some number repeated n times. For the best case, we'll want the smallest representable number for the number of bits involved; so, I expect the number to be a power of 2. We can work out the sum in terms of n and the repeated number:
result = n * val
result size = log(result) = log(n * val) = log(n) + log(val)
input size = n*log(val) + log(n)
As val grows without bound, the log(val) term dominates in result size, and the n*log(val) term dominates in the input size; the best-case is thus like the multiplicative inverse of the input size, so also O(1).
The worst case should be had by choosing val to be as small as possible (we choose val = 1) and letting n grow without bound. In that case:
result = n
result size = log(n)
input size = 2 * log(n)
This time, the result size grows like half the input size as n grows. The worst-case space complexity is linear.
Another way to calculate space complexity is to analyze whether the memory required by your code scales/increases according to the input given.
Your input is int a[] with size being n. The only variable you have declared is result.
No matter what the size of n is, result is declared only once. It does not depend on the size of your input n.
Hence you can conclude your space complexity to be O(1).

Space complexity of an algorithm

Example1: Given an input of array A with n elements.
See the algo below:
Algo(A, I, n)
{
    int i, j = 100;
    for (i = 1 to j)
        A[i] = 0;
}
Space complexity = Extra space required by variable i + variable 'j'
In this case my space complexity is: O(1) => constant
Example2: Array of size n given as input
A(A,I,n)
{
    int i;
    create B[n]; //create a new array B with same number of elements
    for(i = 1 to n)
        B[i] = A[i]
}
Space complexity in this case: Extra space taken by i + new Array B
=> 1 + n => O(n)
Even if I used 5 variables here space complexity will still be O(n).
If as per computer science my space complexity is always constant for first and O(n) for second even if I was using 10 variables in the above algo, why is it always advised to make programs using less number of variables?
I do understand that in practical scenarios it makes the code more readable and easier to debug etc.
But looking for an answer in terms of space complexity only here.
Big O complexity is not the be-all end-all consideration in analysis of performance. It is all about the constants that you are dropping when you look at asymptotic (big O) complexity. Two algorithms can have the same big-O complexity and yet one can be thousands of times more expensive than the other.
E.g. if one approach to solving some problem always takes 10s flat, and another approach takes 3000s flat, regardless of input size, they both have O(1) time complexity. Of course, that doesn't really mean both are equally good; using the latter approach if there is no real benefit is simply a massive waste of time.
This is not to say performance is the only, or even the primary consideration when someone advises you to be economical with your use of local variables. Other considerations like readability, or avoidance of subtle bugs are also factors.
For this code snippet
Algo(A, I, n)
{
    int i, j = 100;
    for (i = 1 to j)
        A[i] = 0;
}
the space complexity is O(1): the array A is part of the input, and the two variables i and j take constant extra space.
It is always advised to use fewer variables because each variable occupies a constant amount of space: if you have k variables, they use k * constant space. For instance, if each variable is an int, and we assume an int occupies 2 bytes, then k variables take k * 2 bytes; with k = 10 that is 20 bytes.
That is the same as declaring int A[10], which also takes 20 bytes.
I hope you understand

What is the n in big-O notation?

The question is rather simple, but I just can't find a good enough answer. On the most upvoted SO question regarding the big-O notation, it says that:
For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering).
Now let's consider the simple bubble sort algorithm:
for (int i = arr.length - 1; i > 0; i--) {
    for (int j = 0; j < i; j++) {
        if (arr[j] > arr[j+1]) {
            switchPlaces(...)
        }
    }
}
I know that worst case is O(n²) and best case is O(n), but what is n exactly? If we attempt to sort an already sorted array (best case), we would end up doing nothing, so why is it still O(n)? We are looping through 2 for-loops still, so if anything it should be O(n²). n can't be the number of comparison operations, because we still compare all the elements, right?
When analyzing the Big-O performance of sorting algorithms, n typically represents the number of elements that you're sorting.
So, for example, if you're sorting n items with Bubble Sort, the runtime performance in the worst case will be on the order of O(n²) operations. This is why Bubble Sort is considered to be an extremely poor sorting algorithm, because it doesn't scale well with increasing numbers of elements to sort. As the number of elements to sort increases linearly, the worst case runtime increases quadratically.
Here is an example graph demonstrating how various algorithms scale in terms of worst-case runtime as the problem size N increases. The dark-blue line represents an algorithm that scales linearly, while the magenta/purple line represents a quadratic algorithm.
Notice that for sufficiently large N, the quadratic algorithm eventually takes longer than the linear algorithm to solve the problem.
Graph taken from http://science.slc.edu/~jmarshall/courses/2002/spring/cs50/BigO/.
See Also
The formal definition of Big-O.
I think two things are getting confused here, n and the function of n that is being bounded by the Big-O analysis.
By convention, for any algorithm complexity analysis, n is the size of the input if nothing different is specified. For any given algorithm, there are several interesting functions of the input size for which one might calculate asymptotic bounds such as Big-O.
The commonest such function for a sorting algorithm is the worst case number of comparisons. If someone says a sorting algorithm is O(n^2), without specifying anything else, I would assume they mean the worst case comparison count is O(n^2), where n is the input size.
Another interesting function is the amount of work space, that is, space in addition to the array being sorted. Bubble sort's work space is O(1), constant space, because it only uses a few variables regardless of the array size.
Bubble sort can be coded to do only n-1 array element comparisons in the best case, by finishing after any pass that does no exchanges. See this pseudo code implementation, which uses swapped to remember whether there were any exchanges. If the array is already sorted the first pass does no exchanges, so the sort finishes after one pass.
n is usually the size of the input. For an array, that would be the number of elements.
To see the different cases, you would need to change the algorithm:
for (int i = arr.length - 1; i > 0 ; i--) {
    boolean swapped = false;
    for (int j = 0; j<i; j++) {
        if (arr[j] > arr[j+1]) {
            switchPlaces(...);
            swapped = true;
        }
    }
    if(!swapped) {
        break;
    }
}
Your algorithm's best/worst cases are both O(n^2), but with the possibility of returning early, the best-case is now O(n).
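To check those two cases empirically, here is a runnable sketch of the early-exit version that counts element comparisons. The swap is written inline because the original switchPlaces(...) call is elided; the class name BubbleSortCounter and the returned counter are additions for illustration.

```java
// Hypothetical illustration: bubble sort with early exit and a comparison counter.
final class BubbleSortCounter {
    // Sorts arr in place and returns the number of element comparisons performed.
    static long sort(int[] arr) {
        long comparisons = 0;
        for (int i = arr.length - 1; i > 0; i--) {
            boolean swapped = false;
            for (int j = 0; j < i; j++) {
                comparisons++;
                if (arr[j] > arr[j + 1]) {
                    int tmp = arr[j];       // inline swap of neighbours
                    arr[j] = arr[j + 1];
                    arr[j + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) {
                break;                      // a pass with no swaps: already sorted
            }
        }
        return comparisons;
    }
}
```

On an already sorted array of n elements this performs n-1 comparisons (one pass), while on a reversed array it performs n(n-1)/2. Here n is the array length, and the function being bounded is the comparison count.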
n is the array length. You want to find the algorithm's complexity T(n).
Accessing memory is much more expensive than checking an if condition, so define T(n) to be the number of memory accesses.
In the given algorithm, both the best case (BC) and the worst case (WC) use O(n^2) memory accesses, because the if condition is checked O(n^2) times.
To improve the complexity, keep a flag: if no swaps happen during a pass of the main loop, the array is sorted and you can break out.
Now in the BC the array is already sorted and you access each element only once, so it is O(n).
The WC is still O(n^2).

Resources