Space complexity of an algorithm

Example 1: Given an array A with n elements as input.
See the algorithm below:
Algo(A, I, n)
{
    int i, j = 100;
    for (i = 1 to j)
        A[i] = 0;
}
Space complexity = extra space required by variable i + extra space required by variable j
In this case my space complexity is O(1) => constant.
Example 2: An array of size n is given as input.
Algo(A, I, n)
{
    int i;
    create B[n]; // create a new array B with the same number of elements
    for (i = 1 to n)
        B[i] = A[i]
}
Space complexity in this case: extra space taken by i + new array B
=> 1 + n => O(n)
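As a concrete rendering of Example 2 (a sketch in Java; the method name copyArray is just for illustration), the copy B is the only allocation that grows with n, so the auxiliary space is O(n) no matter how many scalar loop variables are used.
// Java sketch of Example 2: B is the only thing whose size grows with n.
static int[] copyArray(int[] A) {
    int[] B = new int[A.length];          // O(n) extra space
    for (int i = 0; i < A.length; i++) {  // i: O(1) extra space
        B[i] = A[i];
    }
    return B;
}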
Even if I used 5 variables here, the space complexity would still be O(n).
If, as far as computer science is concerned, my space complexity is always constant for the first example and O(n) for the second, even if I use 10 variables in the above algorithm, why is it always advised to write programs with fewer variables?
I do understand that in practical scenarios it makes the code more readable and easier to debug, etc.
But I am looking for an answer purely in terms of space complexity here.

Big-O complexity is not the be-all and end-all consideration in performance analysis. What matters here is the constants that you drop when you look at asymptotic (big-O) complexity: two algorithms can have the same big-O complexity and yet one can be thousands of times more expensive than the other.
For example, if one approach to solving some problem always takes 10 s flat, and another approach takes 3000 s flat, regardless of input size, they both have O(1) time complexity. Of course, that doesn't mean both are equally good; using the latter approach when there is no real benefit is simply a massive waste of time.
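A minimal Java sketch of that point (my own illustration, not from the original answer): both methods below are O(n) in the array length, but the second one does roughly a thousand times more work per element, so the hidden constant makes it far slower in practice even though the big-O class is identical.
public class ConstantsMatter {

    static long sumFast(int[] a) {
        long total = 0;
        for (int x : a) {
            total += x;                       // one addition per element
        }
        return total;
    }

    static long sumSlow(int[] a) {
        long total = 0;
        for (int x : a) {
            for (int k = 0; k < 1000; k++) {  // redundant inner work; still a constant per element
                total += x;
            }
        }
        return total / 1000;                  // same result, ~1000x the work
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5};
        System.out.println(sumFast(data));    // 15
        System.out.println(sumSlow(data));    // 15
    }
}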
This is not to say performance is the only, or even the primary, consideration when someone advises you to be economical with your use of local variables. Other considerations, like readability or the avoidance of subtle bugs, are also factors.

For this code snippet
Algo(A, I, n)
{
    int i, j = 100;
    for (i = 1 to j)
        A[i] = 0;
}
the space complexity is O(1): apart from the input array, only the two variables i and j take (constant) extra space.
It is always advised to use fewer variables because each variable occupies constant space. If you have k variables, they will use k * constant space. If, say, each variable is an int occupying 2 bytes, then k variables take k * 2 bytes; with k = 10 that is 20 bytes.
That is similar to using int A[10], which also takes 20 bytes of space.
I hope that makes it clear.

Related

Reverse an array - run time

The following code reverses an array. What is its runtime?
My heart says it is O(n/2), but my friend says O(n). Which is correct? Please answer with a reason. Thank you so much.
void reverse(int[] array) {
    for (int i = 0; i < array.length / 2; i++) {
        int other = array.length - i - 1;
        int temp = array[i];
        array[i] = array[other];
        array[other] = temp;
    }
}
Big-O complexity captures how the run-time scales with n as n gets arbitrarily large. It isn't a direct measure of performance. f(n) = 1000n and f(n) = n/128 + 10^100 are both O(n) because they both scale linearly with n, even though the first scales much more quickly than the second, and the second is actually prohibitively slow for all n because of its large constant cost. Nonetheless, they have the same complexity class.
For these sorts of reasons, if you want to differentiate actual performance between algorithms, or define the performance of any particular algorithm (rather than how its performance scales with n), asymptotic complexity is not the best tool. If you want to measure performance, you can count the exact number of operations performed by the algorithm, or better yet, provide a representative set of inputs and just measure the execution time on those inputs.
As for the particular problem: yes, the for loop runs n/2 times, but you also do some constant number of operations, c, in each of those iterations (subtractions, array accesses, variable assignments, the conditional check on i). Maybe c = 10; it's not really important to count precisely to determine the complexity class, just to know that it's constant. The run-time is then f(n) = c*n/2, which is O(n): the fact that you only do n/2 loop iterations doesn't change the complexity class.
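If you do want a rough measurement rather than a complexity class, a quick-and-dirty sketch like the one below (my own, not from the answer; a proper benchmark would need warm-up or a harness such as JMH) can show the roughly linear growth.
// Times reverse() on a few input sizes; expect the elapsed time to roughly
// double as n doubles. Timings are noisy, so treat the numbers as indicative only.
public class ReverseTiming {
    static void reverse(int[] array) {
        for (int i = 0; i < array.length / 2; i++) {
            int other = array.length - i - 1;
            int temp = array[i];
            array[i] = array[other];
            array[other] = temp;
        }
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000_000, 2_000_000, 4_000_000}) {
            int[] a = new int[n];
            long start = System.nanoTime();
            reverse(a);
            long elapsed = System.nanoTime() - start;
            System.out.println("n = " + n + "  time = " + elapsed + " ns");
        }
    }
}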

What is the time complexity of a comparison or substitution operation?

When estimating the time complexity of a certain algorithm, let's say the following in pseudo code:
for (int i=0; i<n; i++) ---> O(n)
//comparison? ---> ?
//substitution ---> ?
for (int i=0; i<n; i++) ---> O(n)
//some function which is not recursive
In this case the time complexity of these instructions is O(n) because we iterate over the input of size n, but what about the comparison and substitution operations? Are they constant time, since they don't depend on n?
Thanks
Both of the other answers assume you are comparing some sort of fixed-size data type, such as 32-bit integers, doubles, or characters. If you are using operators like < in a language such as Java where they can only be used on fixed-size data types, and cannot be overloaded, then that is correct. But your question is not language-specific, and you also did not say you are comparing using such operators.
In general, the time complexity of a comparison operation depends on the data type you are comparing. It takes O(1) time to compare 64-bit integers, doubles, or characters, for example. But, as a counter-example, comparing strings in lexicographic order takes O(min(k, k')) time in the worst case, where k and k' are the lengths of the strings.
For example, here is the Java source code for the String.compareTo method in OpenJDK 7, which clearly does not take constant time:
public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;
    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
Therefore when analysing the time complexity of comparison-based sorting algorithms, we often analyse their complexity in terms of the number of comparisons and substitutions, rather than the number of basic operations; for example, selection sort does O(n) substitutions and O(n²) comparisons, whereas merge sort does O(n log n) substitutions and O(n log n) comparisons.
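To make the "count comparisons rather than basic operations" view concrete, here is a small sketch (my own illustration, not part of the answer) that counts the comparisons selection sort performs; for n elements it is always n(n-1)/2, i.e. O(n²), independently of how expensive each individual comparison is for the element type.
import java.util.Arrays;

public class ComparisonCount {
    static long selectionSortComparisons(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;                            // one comparison per inner step
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;  // one substitution (swap) per outer step
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2};
        System.out.println("comparisons: " + selectionSortComparisons(a)); // 15 = 6*5/2
        System.out.println(Arrays.toString(a));
    }
}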
First, read this book. There is a good explanation of this topic there.
Comparison. For instance, we have two variables a and b. When we evaluate a == b, we just fetch a and b from memory and compare them. Let's define "c" as the cost of memory and "t" as the cost of time. In this case we use 2c (because we use two memory cells) and 1t (because there is only one operation, with constant cost); the 1t is a constant. Thus the time complexity is constant.
Substitution. It is much the same as the previous operation: we use two variables and one operation. The operation is the same for any type, so the time cost of the substitution is constant, and the complexity is constant too.
but what about the comparison and substitution operations? Are they constant time, since they don't depend on n?
Yes. Comparison and substitution operations take constant time because their execution time doesn't depend on the size of the input. They do take some time, but, again, that time is independent of the input size.
However, the execution time of your for loop grows proportionally to the number of items n, and so its time complexity is O(n).
UPDATE
As #kaya3 correctly pointed out, this assumes we are dealing with fixed-size data types. If they are not fixed-size, see #kaya3's answer.

Space complexity of nested loop

I am confused when it comes to the space complexity of an algorithm. In theory, it corresponds to the extra stack space that an algorithm uses, i.e. space other than the input. However, I have trouble pointing out what exactly is meant by that.
If, for instance, I have the following brute-force algorithm that checks whether there are no duplicates in the array, would that mean that it uses O(1) extra storage space, because it only uses int j and int k?
public static void distinctBruteForce(int[] myArray) {
    for (int j = 0; j < myArray.length; j++) {
        for (int k = j + 1; k < myArray.length; k++) {
            if (k != j && myArray[k] == myArray[j]) {
                return;
            }
        }
    }
}
Yes, according to your definition (which is correct), your algorithm uses constant, or O(1), auxiliary space: the loop indices, possibly some constant heap space needed to set up the function call itself, etc.
It is true that it could be argued that the loop indices are bit-logarithmic in the size of the input, but it is usually approximated as being constant.
According to the Wikipedia entry:
In computational complexity theory, DSPACE or SPACE is the computational resource describing the resource of memory space for a deterministic Turing machine. It represents the total amount of memory space that a "normal" physical computer would need to solve a given computational problem with a given algorithm
So, in a "normal" computer, the indices would be considered each to be 64 bits, or O(1).
would that mean that it uses O(1) extra storage spaces, because it uses int j and int k?
Yes.
Extra storage space means space used for something other than the input itself. And, just as with time complexity, if that extra space does not depend on the size of the input (i.e., it does not grow as the input size grows), then the space complexity is O(1).
Yes, your algorithm is indeed O(1) in storage space (1), since the auxiliary space you use has a strict upper bound that is independent of the input.
(1) Assuming the integers used for iteration are in a restricted range, usually up to 2^32 - 1.
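For contrast, here is a sketch (not from the question) of a duplicate check whose extra space does depend on the input size: the HashSet can grow to n entries, so its auxiliary space is O(n), whereas the nested-loop version above needs only the two indices, i.e. O(1).
import java.util.HashSet;
import java.util.Set;

public class DistinctWithSet {
    // Uses a HashSet that can grow to n entries, so the auxiliary space is O(n),
    // unlike the O(1) nested-loop version above (which only needs two loop indices).
    public static boolean allDistinct(int[] myArray) {
        Set<Integer> seen = new HashSet<>();
        for (int value : myArray) {
            if (!seen.add(value)) {   // add() returns false if the value was already present
                return false;         // found a duplicate
            }
        }
        return true;
    }
}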

Big O - is n always the size of the input?

I made up my own interview-style problem, and have a question about the big O of my solution. I will state the problem and my solution below, but first let me say that the obvious solution involves a nested loop and is O(n²). I believe I found an O(n) solution, but then I realized it depends not only on the size of the input, but on the largest value in the input. It seems like my running time of O(n) is only a technicality, and that it could easily run in O(n²) time or worse in real life.
The problem is:
For each item in a given array of positive integers, print all the other items in the array that are multiples of the current item.
Example Input:
[2 9 6 8 3]
Example Output:
2: 6 8
9:
6:
8:
3: 9 6
My solution (in C#):
private static void PrintAllDivisibleBy(int[] arr)
{
    Dictionary<int, bool> dic = new Dictionary<int, bool>();
    if (arr == null || arr.Length < 2)
        return;
    int max = arr[0];
    for (int i = 0; i < arr.Length; i++)
    {
        if (arr[i] > max)
            max = arr[i];
        dic[arr[i]] = true;
    }
    for (int i = 0; i < arr.Length; i++)
    {
        Console.Write("{0}: ", arr[i]);
        int multiplier = 2;
        while (true)
        {
            int product = multiplier * arr[i];
            if (dic.ContainsKey(product))
                Console.Write("{0} ", product);
            if (product >= max)
                break;
            multiplier++;
        }
        Console.WriteLine();
    }
}
So, if two of the array items are 1 and n, where n is the array length, the inner while loop will run n times, making this equivalent to O(n²). But, since the performance is dependent on the size of the input values, not the length of the list, that makes it O(n), right?
Would you consider this a true O(n) solution? Is it only O(n) due to technicalities, but slower in real life?
Good question! The answer is that, no, n is not always the size of the input: You can't really talk about O(n) without defining what the n means, but often people use imprecise language and imply that n is "the most obvious thing that scales here". Technically we should usually say things like "This sort algorithm performs a number of comparisons that is O(n) in the number of elements in the list": being specific about both what n is, and what quantity we are measuring (comparisons).
If you have an algorithm that depends on the product of two different things (here, the length of the list and the largest element in it), the proper way to express that is in the form O(m*n), and then define what m and n are for your context. So, we could say that your algorithm performs O(m*n) multiplications, where m is the length of the list and n is the largest item in the list.
An algorithm is O(n) when you have to iterate over n elements and perform some constant-time operation in each iteration. The inner while loop of your algorithm is not constant time, as it depends on the magnitude of the largest number in your array.
Your algorithm's best-case run-time is O(n). This is the case when all n numbers are the same.
Your algorithm's worst-case run-time is O(k*n), where k is the maximum possible int value on your machine, if you really insist on putting an upper bound on k. For a 32-bit int the maximum value is 2,147,483,647. You can argue that this k is a constant, but this constant is clearly
not fixed for every case of input array; and,
not negligible.
Would you consider this a true O(n) solution?
The runtime is actually O(n*m), where m is the maximum element of arr. If the elements in your array are bounded by a constant, you can consider the algorithm to be O(n).
Can you improve the runtime? Here's what else you can do. First, notice that you can ensure the elements are distinct (compress the array into a hashmap that stores how many times each element appears in the array). Then your runtime would be max/a[0] + max/a[1] + max/a[2] + ... <= max/1 + max/2 + ... + max/max = O(max log(max)) (assuming your array arr is sorted). If you combine this with the obvious O(n²) algorithm, you get an O(min(n², max*log(max))) algorithm.
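A rough Java sketch of that improved idea (my own translation; the original code is C#, and names like printAllDivisibleBy and the use of a TreeMap are just for illustration). It collapses duplicates into a count map and, for each distinct value, walks only its multiples up to the maximum, which is where the max*log(max)-style bound comes from. Note it prints one line per distinct value, in sorted order, rather than per array position.
import java.util.Map;
import java.util.TreeMap;

public class MultiplesSketch {
    // For each distinct value v, walk its multiples 2v, 3v, ... up to max and
    // print those present in the array. Summed over distinct values, this is the
    // harmonic-style bound O(max log max) discussed above.
    static void printAllDivisibleBy(int[] arr) {
        if (arr == null || arr.length < 2) return;
        Map<Integer, Integer> counts = new TreeMap<>();   // value -> occurrences
        int max = 0;
        for (int v : arr) {
            counts.merge(v, 1, Integer::sum);
            max = Math.max(max, v);
        }
        for (int v : counts.keySet()) {
            StringBuilder line = new StringBuilder(v + ": ");
            for (int product = 2 * v; product <= max; product += v) {
                if (counts.containsKey(product)) {
                    line.append(product).append(' ');
                }
            }
            System.out.println(line.toString().trim());
        }
    }

    public static void main(String[] args) {
        printAllDivisibleBy(new int[] {2, 9, 6, 8, 3});
        // prints "2: 6 8", "3: 6 9", "6:", "8:", "9:" (one line per distinct value)
    }
}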

Calculating the space complexity of a C-function?

Consider the following C-function:
double foo(int n) {
    int i;
    double sum;
    if (n == 0)
        return 1.0;
    else {
        sum = 0.0;
        for (i = 0; i < n; i++)
            sum += foo(i);
        return sum;
    }
}
The space complexity of the above function is:
(a) O(1) (b) O(n) (c) O(n!) (d) O(n^n)
What I've done is calculate the recurrence relation for the above code, but I'm still not able to solve it. I know this is not a homework-help site, but any help would be appreciated.
This is my recurrence.
T(n) = T(n-1) + T(n-2) + T(n-3) + T(n-4) + ... + T(1) + S
Where S is some constant.
That depends on whether you're talking about stack or heap space complexity.
For the heap, it's O(1) (or even O(0)), since you're using no heap memory aside from the basic system/program overhead.
For the stack, it's O(n). This is because the recursion goes up to n levels deep.
The deepest point is:
foo(n)
foo(n - 1)
foo(n - 2)
...
foo(0)
Space complexity describes how much space your program needs. Since foo does not declare arrays, each level requires O(1) space. Now all you need to do is to figure out how many nested levels can be active at the most at any given time.
Edit: ...so much for letting you figure out the solution for yourself :)
You don't explain how you derived your recurrence relation. I would do it like this:
If n == 0, then foo uses constant space (there is no recursion).
If n > 0, then foo recurses once for each i from 0 to n-1 (inclusive). Each of those recursive calls uses constant space (for the call itself) plus T(i) space. But these calls occur one after the other; the space used by each call is released before the next call is made. Therefore the space should not be summed; you simply take the maximum, which is T(n-1), since T is non-decreasing.
The space complexity would be O(n). As you have mentioned, it might seem like O(n*n), but one should remember that once the call for, say, i = 1 in the loop is done, the stack space it used is freed. So you have to consider the worst case, when i = n-1: that is when the maximum number of recursive function calls is on the stack simultaneously.
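As a sanity check on the O(n) stack answer, here is a small sketch (my own, translating foo to Java) that tracks the current and maximum recursion depth; the maximum number of frames alive at once is n + 1, even though the total number of calls grows exponentially.
// Translates foo to Java and records the deepest point of the recursion,
// illustrating that at most about n frames are on the stack simultaneously.
public class FooDepth {
    static int depth = 0;
    static int maxDepth = 0;

    static double foo(int n) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        double result;
        if (n == 0) {
            result = 1.0;
        } else {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                sum += foo(i);    // calls happen one after another, not simultaneously
            }
            result = sum;
        }
        depth--;
        return result;
    }

    public static void main(String[] args) {
        foo(10);
        System.out.println("max simultaneous frames: " + maxDepth);  // 11, i.e. foo(10) down to foo(0)
    }
}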
