Consider the following pseudocode:
linked_list_node = ...                      // We have some linked list
while linked_list_node is not NULL          // Iterate through it
    node_copy = CopyNode(linked_list_node)  // Allocate a new node and copy into it: O(1) space
    ...                                     // Do something
    DeleteAndFree(node_copy)                // Free the memory allocated at the top of the loop
    Next(linked_list_node)                  // Advance once
Let N be the size of our linked list.
On the one hand: at each iteration of the loop we use O(1) space, and the loop runs for N iterations, which means that in total we allocate O(N) space.
On the other hand: we never actually hold N nodes at the same time; each time we allocate exactly one node. So, in theory, we only ever use O(1) space. In other words, if our machine had only 1 byte available in memory, it could allocate and delete that same byte over and over again, never running into a memory limit.
I found this question on Stack Overflow: What is O(1) space complexity?
From the accepted answer:
However, if let's say for some reason the algorithm needs to allocate 'N' pointers when traversing a list of size N, ..., then the algorithm is considered to have a space complexity of O(N)
It seems like my algorithm doesn't satisfy this condition, as I never actually use N different pointers at the same time, so it should be O(1) space. However, it does require N allocation operations, which could be why it is really O(N) space.
So, what is the space complexity of that and why?
This would be O(1) space complexity. At each step your usage increases and then decreases by 1, a net gain of 0 for each element. That said, this is a somewhat odd solution, as you could presumably replace it with an implementation that allocates a single node and copies each element into it in turn. The two would be equivalent, and the latter is clearly O(1) space complexity.
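For concreteness, here is a minimal C sketch of that single-buffer version (the Node type and the printf standing in for "do something" are hypothetical, not taken from the question):

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical node type standing in for the pseudocode's list node. */
    typedef struct Node {
        int value;
        struct Node *next;
    } Node;

    /* Traverse the list using one reusable copy: O(1) auxiliary space. */
    void traverse(const Node *head) {
        Node *copy = malloc(sizeof *copy);   /* allocated once, reused every iteration */
        if (copy == NULL) return;
        for (const Node *cur = head; cur != NULL; cur = cur->next) {
            *copy = *cur;                    /* overwrite the same buffer instead of CopyNode() */
            printf("%d\n", copy->value);     /* "do something" with the copy */
        }
        free(copy);                          /* freed once at the end */
    }

Whether an allocate/free loop would actually reuse the same bytes is up to the allocator, but either way the peak number of live allocations is one, and that peak is what the O(1) bound measures.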
Assume the object is a dummy list node that is only used in the same recursion level where it is created.
The part I'm not sure about is whether the space of the object can be recycled when a recursion level finishes. If the space can be recycled, I would say the space complexity is O(1); otherwise I feel it's O(M), where M is the number of recursion levels.
Being able to reclaim some (fixed size) objects associated with an activation frame doesn't change the asymptotic complexity. It only improves the constants of the actual resource use.
The algorithm's storage use is O(M) simply based on the number of activation frames it allocates at once, not their precise size.
Of course, using 100 bytes per frame is inarguably better than 1000 or 10000, but constant factors like that don't contribute to complexity, which is why we make them disappear in the O notation.
Knocking that down to O(1) will require a reorganization of the control flow itself, such as a switch to tail calls or iteration.
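To illustrate (with a hypothetical list-length function, not code from the question): the recursive version below keeps O(M) activation frames alive at once, while the iterative rewrite of the same control flow needs only a single frame, i.e. O(1) space.

    #include <stddef.h>

    typedef struct Node { struct Node *next; } Node;

    /* Recursive: one activation frame per list node => O(M) stack space.
       Note this is not a tail call: the +1 happens after the call returns,
       so the compiler cannot simply reuse the frame. */
    size_t length_recursive(const Node *n) {
        if (n == NULL) return 0;
        return 1 + length_recursive(n->next);
    }

    /* Iterative: the same computation in a single frame => O(1) space. */
    size_t length_iterative(const Node *n) {
        size_t len = 0;
        for (; n != NULL; n = n->next)
            len++;
        return len;
    }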
I am curious to know which algorithm is better:
Algorithm with O(n log n) time and O(1) space complexity
Algorithm with O(n) time and O(n) space complexity
Most of the algorithms that can be solved in O(n log n) time and constant space can also be solved in O(n) time by paying a penalty in terms of space. Which algorithm is better?
How do I decide between these two parameters?
Example: Array Pair Sum
Can be solved in O(n log n) time by sorting
Can be solved using hash maps in O(n) time but with O(n) space
Without actually testing anything (a risky move!), I'm going to claim that the O(n log n)-time, O(1)-space algorithm is probably faster than the O(n)-time, O(n)-space algorithm, but is still probably not the optimal algorithm.
First, let's talk about this from a high-level perspective that ignores the particular details of the algorithms you're describing. One detail to keep in mind is that although O(n)-time algorithms are asymptotically faster than O(n log n)-time algorithms, they're only faster by a logarithmic factor. Keeping in mind that the number of atoms in the universe is about 10^80 (thanks, physics!), the base-2 log of the number of atoms in the universe is about 266. From a practical perspective, this means that you can think of that extra O(log n) factor as just a constant. Consequently, to determine whether an O(n log n) algorithm will be faster or slower than an O(n) algorithm on a particular input, you'd need to know more about what constants are hidden by the big-O notation. An algorithm that runs in time 600n will be slower than an algorithm that runs in time 2n log n for any n that fits in the universe, for example. Therefore, in terms of wall-clock performance, to evaluate which algorithm is faster, you'd probably need to do a bit of profiling on the algorithms to see which one is faster.
Then there's the effects of caching and locality of reference. Computer memory has a huge number of caches in it that are optimized for the case where reads and writes are located next to one another. The cost of a cache miss can be huge - hundreds or thousands of times slower than a hit - so you want to try to minimize this. If an algorithm uses O(n) memory, then as n gets larger, you need to start worrying about how closely-packed your memory accesses will be. If they're spread out, then the cost of the cache misses might start to add up pretty quickly, significantly driving up the coefficient hidden in the big-O notation of the time complexity. If they're more sequential, then you probably don't need to worry too much about this.
You also need to be careful about total memory available. If you have 8GB of RAM on your system and get an array with one billion 32-bit integers, then if you need O(n) auxiliary space with even a reasonable constant, you're not going to be able to fit your auxiliary memory into main memory and it will start getting paged out by the OS, really killing your runtime.
Finally, there's the issue of randomness. Algorithms based on hashing have expected fast runtimes, but if you get a bad hash function, there's a chance that the algorithm will slow down. Generating good random bits is hard, so most hash tables just go for "reasonably good" hash functions, risking worst-case inputs that will make the algorithm's performance degenerate.
So how do these concerns actually play out in practice? Well, let's look at the algorithms. The O(n)-time, O(n)-space algorithm works by building a hash table of all the elements in the array so that you can easily check whether a given element is present in the array, then scanning over the array and seeing whether there is a pair that sums up to the total. Let's think about how this algorithm works given the factors above.
The memory usage is O(n) and, due to how hashing works, the accesses to the hash table are not likely to be sequential (an ideal hash table would have pretty much random access patterns). This means that you're going to have a lot of cache misses.
The high memory usage means that for large inputs, you have to worry about memory getting paged in and out, exacerbating the above problem.
As a result of the above two factors, the constant term hidden in the O(n) runtime is likely much higher than it looks.
Hashing is not worst-case efficient, so there may be inputs that cause performance to significantly degrade.
Now, think about the O(n log n)-time, O(1) space algorithm, which works by doing an in-place array sort (say, heapsort), then walking inwards from the left and right and seeing if you can find a pair that sums to the target. The second step in this process has excellent locality of reference - virtually all array accesses are adjacent - and pretty much all of the cache misses you're going to get are going to be in the sorting step. This will increase the constant factor hidden in the big-O notation. However, the algorithm has no degenerate inputs and its low memory footprint probably means that the locality of reference will be better than the hash table approach. Therefore, if I had to guess, I'd put my money on this algorithm.
... Well, actually, I'd put my money on a third algorithm: an O(n log n)-time, O(log n)-space algorithm that's basically the above algorithm, but using introsort instead of heapsort. Introsort is an O(n log n)-time, O(log n)-space algorithm that uses randomized quicksort to mostly sort the array, switching to heapsort if the quicksort looks like it's about to degenerate, and doing a final insertion sort pass to clean everything up. Quicksort has amazing locality of reference - this is why it's so fast - and insertion sort is faster on small inputs, so this is an excellent compromise. Plus, O(log n) extra memory is basically nothing - remember, in practice, log n is at most around 266. This algorithm has about the best locality of reference that you can get, giving a very low constant factor hidden by the O(n log n) term, so it would probably outperform the other algorithms in practice.
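Introsort isn't callable from the C standard library, so here is a minimal, hedged sketch of the idea in C (the cutoff constant, pivot choice, and depth limit are simplified assumptions, not the heuristics a production library uses):

    #include <stddef.h>
    #include <math.h>

    static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

    /* Max-heap sift-down over a[start..end). */
    static void sift_down(int *a, size_t start, size_t end) {
        size_t root = start;
        while (2 * root + 1 < end) {
            size_t child = 2 * root + 1;
            if (child + 1 < end && a[child] < a[child + 1]) child++;
            if (a[root] >= a[child]) return;
            swap_int(&a[root], &a[child]);
            root = child;
        }
    }

    static void heapsort_range(int *a, size_t n) {
        for (size_t i = n / 2; i > 0; i--) sift_down(a, i - 1, n);
        for (size_t i = n; i > 1; i--) {
            swap_int(&a[0], &a[i - 1]);
            sift_down(a, 0, i - 1);
        }
    }

    static void insertion_sort(int *a, size_t n) {
        for (size_t i = 1; i < n; i++)
            for (size_t j = i; j > 0 && a[j - 1] > a[j]; j--)
                swap_int(&a[j - 1], &a[j]);
    }

    #define SMALL 16  /* leave runs of this size for the final insertion sort */

    static void introsort_loop(int *a, size_t n, int depth) {
        while (n > SMALL) {
            if (depth == 0) {              /* quicksort is degenerating...  */
                heapsort_range(a, n);      /* ...so fall back to heapsort   */
                return;
            }
            depth--;
            /* Simplistic pivot: move the middle element to the end. */
            swap_int(&a[n / 2], &a[n - 1]);
            int pivot = a[n - 1];
            size_t store = 0;
            for (size_t i = 0; i + 1 < n; i++)
                if (a[i] < pivot) swap_int(&a[i], &a[store++]);
            swap_int(&a[store], &a[n - 1]);
            introsort_loop(a, store, depth);  /* recurse left, loop on the right; */
            a += store + 1;                   /* the depth limit caps the stack   */
            n -= store + 1;                   /* at O(log n) frames               */
        }
    }

    void introsort(int *a, size_t n) {
        int depth = n > 1 ? 2 * (int)(log((double)n) / log(2.0)) : 0;
        introsort_loop(a, n, depth);
        insertion_sort(a, n);  /* one cheap pass cleans up the small runs */
    }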
Of course, I've got to qualify that answer as well. The analysis I did above assumes that we're talking about pretty large inputs to the algorithm. If you're only ever looking at small inputs, then this whole analysis goes out the window because the effects I was taking into account won't start to show up. In that case, the best option would just be to profile the approaches and see what works best. From there, you might be able to build a "hybrid" approach where you use one algorithm for inputs in one size range and a different algorithm for inputs in a different size range. Chances are that this would give an approach that beats any single one of the approaches.
That said, to paraphrase Don Knuth, "beware of the above analysis - I have merely proved it correct, not actually tried it." The best option would be to profile everything and see how it works. The reason I didn't do this was to go through the analysis of what factors to keep an eye out for and to highlight the weakness of a pure big-O analysis comparing the two algorithms. I hope that the practice bears this out! If not, I'd love to see where I got it wrong. :-)
From experience:
If you absolutely can't afford the space, take the O(1) space route.
When random access is unavoidable, take the O(n) space route. (It's usually simpler and has a smaller time constant.)
When random access is slow (e.g. seek times), take the O(1) space route. (You can usually figure out a way to be cache coherent.)
Otherwise, random access is fast -- take the O(n) space route. (It's usually simpler with a smaller time constant.)
Note that usually random access is "fast" if the problem fits within memory that's faster than the bottleneck storage. (e.g. if disks are the bottleneck, main memory is fast enough for random access --- if main memory is the bottleneck, CPU cache is fast enough for random access)
Using your specific algorithm example, Array Pair Sum, the hash version (O(n) time with O(n) space) will be faster. Here's a little JavaScript benchmark you can play with: http://jsfiddle.net/bbxb0bt4/1/
I used two different sorting algorithms, quicksort and radix sort, in the benchmark. Radix sort in this instance (an array of 32-bit integers) is the ideal sorting algorithm, and even it can barely compete with the single-pass hash version.
If you want a generalized opinion with regard to programming:
using the O(N) time with O(N) space algorithm is preferred, because the implementation will be simpler, which means it will be easier to maintain and debug.
function apsHash(arr, x) {
    var hash = new Set();
    for (var i = 0; i < arr.length; i++) {
        // If we have already seen the complement of arr[i], we found a pair.
        if (hash.has(x - arr[i])) {
            return [arr[i], x - arr[i]];
        }
        hash.add(arr[i]);
    }
    return [NaN, NaN];
}

function apsSortQS(arr, x) {
    arr = quickSortIP(arr); // in-place quicksort, defined in the fiddle
    var l = 0;
    var r = arr.length - 1;
    // Walk inwards from both ends of the sorted array.
    while (l < r) {
        if (arr[l] + arr[r] === x) {
            return [arr[l], arr[r]];
        } else if (arr[l] + arr[r] < x) {
            l++;
        } else {
            r--;
        }
    }
    return [NaN, NaN];
}
To compare two algorithms, it should first be quite clear what we are comparing them for.
If our priority is space, the algorithm with T(n) = O(n log n) & S(n) = O(1) is better.
In the general case, the second one, with T(n) = O(n) & S(n) = O(n), is better, as space can be compensated for but time cannot.
It's not true that you can always substitute an O(n lg n) time, O(1) space algorithm with an O(n) time, O(n) space one. It really depends on the problem, and there are many different algorithms with different complexities for time and space, not just linear or linearithmic (e.g. n log n).
Note that O(1) space sometimes means (like in your example) that you need to modify the input array. So this actually means that you do need O(n) space, but you can somehow use the input array as your space (vs the case of really using only constant space). Changing the input array is not always possible or allowed.
As for choosing between the different algorithms with different time and space characteristics, it depends on your priorities. Often, the time is most important, so if you have enough memory, you would choose the fastest algorithm (remember that this memory is only used temporarily while the algorithm is running). If you really don't have the required space, then you would choose a slower algorithm which requires less space.
So, the general rule of thumb is to choose the fastest algorithm (not just by asymptotic complexity, but by actual real-world execution time on your regular workload) whose space requirements you can accommodate.
One should keep three things in mind when selecting an algorithmic approach:
The time in which the application needs to run smoothly in the worst-case scenario.
The space available, based on the kind of environment the program will run in.
The re-usability of the functions created.
Given these three points, we can decide which approach suits our application.
If I had limited space and reasonable data supplied to it, then condition 2 would play the prime role. Here, we could check the smoothness with O(n log n), try to optimize the code, and give importance to condition 3.
(For example, the sorting algorithm used in Array Pair Sum can be reused at some other place in my code.)
If I had enough space, then improving on time would be the major concern. Here, instead of re-usability, one would focus on writing a time-efficient program.
Assuming that your assumption is true.
Given the fact that in real life unlimited resources do not exist, and that while implementing a solution you would do your best to implement the most reliable solution (one that does not break because you consumed all your allowed memory), I would be wise and go with:
Algorithm with O(n log n) time and O(1) space complexity
Even if you have a large amount of memory and are sure you would never exhaust it, using solutions that consume a lot of memory can cause many issues (I/O read/write speed, backing up data in case of failure), and I guess no one likes an application that uses 2 GB of memory at startup and keeps growing over time as if there were a memory leak.
I guess the best approach is to write a test; the actual algorithm, the amount of data (n), and the memory access pattern will all be important.
Here is a simple attempt to model it: random() function calls and mod operations stand in for the time complexity, and random memory accesses (reads and writes) for the space complexity.
#include <stdio.h>
#include <stdlib.h>   /* malloc, free, atol, random */
#include <time.h>
#include <math.h>

int test_count = 10;

int* test(long time_cost, long mem_cost) {
    /* Memory allocation cost is also included. */
    int* mem = malloc(sizeof(int) * mem_cost);
    long i;
    for (i = 0; i < time_cost; i++) {
        /* Random memory accesses: one read and one write per step. */
        *(mem + (random() % mem_cost)) = *(mem + (random() % mem_cost));
    }
    return mem;
}

int main(int argc, char** argv) {
    if (argc != 2) {
        fprintf(stderr, "wrong argument count %d\nusage: complexity n\n", argc);
        return -1;
    }
    long n = atol(argv[1]);
    int *mem1, *mem2;
    clock_t start, stop;
    long long sum1 = 0;
    long long sum2 = 0;
    int i;
    for (i = 0; i < test_count; i++) {
        /* O(n log n) time, O(1) space. */
        start = clock();
        mem1 = test(n * log(n), 1);
        stop = clock();
        free(mem1);
        sum1 += (stop - start);

        /* O(n) time, O(n) space. */
        start = clock();
        mem2 = test(n, n);
        stop = clock();
        free(mem2);
        sum2 += (stop - start);
    }
    fprintf(stdout, "%lld \t", sum1);
    fprintf(stdout, "%lld \n", sum2);
    return 0;
}
Disabling optimizations:
gcc -O0 -o complexity complexity.c -lm
Testing:
for ((i = 1000; i < 10000000; i *= 2)); do ./complexity $i; done | awk '{print $1 / $2}'
The results I got (the ratio of the O(n log n) timing to the O(n) timing):
7.96269
7.86233
8.54565
8.93554
9.63891
10.2098
10.596
10.9249
10.8096
10.9078
8.08227
6.63285
5.63355
5.45705
Up to some point O(n) does better on my machine; after that, O(n log n) starts doing better (I didn't use swap).
By time complexity we understand an algorithm's running time as a function of the size of the input (the number of bits needed to represent the instance in memory). Then how do we define space complexity with regard to this observation? It obviously can't be related to the size of the instance...
Space complexity can be defined in multiple ways, but the usual definition is the following. We assume that the input is stored in read-only memory somewhere, that there is a dedicated write-only memory for storing the result of the operation, and that there is some general "scratch space" memory for doing auxiliary computations. Typically, space complexity is the amount of space needed for the output and for all the scratch space. For example, binary search has space complexity O(1) because only O(1) storage is needed beyond the read-only input: the output index plus a constant amount of scratch space (assuming that array indices fit into machine words).
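A minimal C sketch of that accounting: the array is read-only input, the returned index is the output, and the cursors are the entire scratch space.

    #include <stddef.h>

    /* Binary search over a sorted array. The input is only read, and the
       scratch space is two cursors and a midpoint => O(1) space. */
    ptrdiff_t binary_search(const int *a, size_t n, int key) {
        size_t lo = 0, hi = n;                /* search the half-open range [lo, hi) */
        while (lo < hi) {
            size_t mid = lo + (hi - lo) / 2;  /* written to avoid overflow of lo + hi */
            if (a[mid] < key)
                lo = mid + 1;
            else
                hi = mid;
        }
        /* Output: the index of key, or -1 if it is absent. */
        return (lo < n && a[lo] == key) ? (ptrdiff_t)lo : -1;
    }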
Sometimes, the input and output space are combined into a single storage unit and the input can be modified. In this model, for example, heapsort has space complexity O(1), while mergesort has space complexity O(n) for the auxiliary storage space needed for the merging.
Hope this helps!
I have seen that in most cases the time complexity is related to the space complexity and vice versa. For example in an array traversal:
for i = 1 to length(v)
    print(v[i])
endfor
Here it is easy to see that the algorithm's complexity in terms of time is O(n), but it looks to me like the space complexity is also n (also represented as O(n)?).
My question: is it possible that an algorithm has different time complexity than space complexity?
The time and space complexities are not related to each other. They are used to describe how much space/time your algorithm takes based on the input.
For example when the algorithm has space complexity of:
O(1) - constant - the algorithm uses a fixed (small) amount of space which doesn't depend on the input. For every size of the input, the algorithm will take the same (constant) amount of space. This is the case in your example, as the input is not taken into account and what matters is the time/space of the print command.
O(n), O(n^2), O(log(n))... - these indicate that you create additional objects based on the length of your input. For example, creating a copy of each element of v, storing it in an array, and printing it afterwards takes O(n) space, as you create n additional objects.
In contrast the time complexity describes how much time your algorithm consumes based on the length of the input. Again:
O(1) - no matter how big the input is, it always takes a constant time - for example, a single instruction. Like:

function(list l) {
    print("i got a list");
}

O(n), O(n^2), O(log(n)) - again, it's based on the length of the input. For example:

function(list l) {
    for (node in l) {
        print(node);
    }
}

Note that both of the last examples take O(1) space, as you don't create anything. Compare them to:

function(list l) {
    list c;
    for (node in l) {
        c.add(node);
    }
}
which takes O(n) space because you create a new list whose size depends on the size of the input in linear way.
Your example shows that time and space complexity can be different. It takes v.length * print.time to print all the elements, but the space is always the same - O(1) - because you don't create additional objects. So yes, it is possible for an algorithm to have different time and space complexity, as they do not depend on each other.
Time and Space complexity are different aspects of calculating the efficiency of an algorithm.
Time complexity deals with finding out how the computational time of an algorithm changes with the change in size of the input. On the other hand, space complexity deals with finding out how much (extra) space would be required by the algorithm with change in the input size.
The best way to calculate the time complexity of an algorithm is to check whether the number of comparisons (or computational steps) increases as we increase the size of the input; the best way to calculate the space complexity is to see whether the algorithm's additional memory requirement also changes with a change in the size of the input.
A good example could be of Bubble sort.
Let's say you tried to sort an array of 5 elements.
In the first pass you will compare the 1st element with the next 4 elements. In the second pass you will compare the 2nd element with the next 3 elements, and you will continue this procedure until you fully exhaust the list.
Now what will happen if you try to sort 10 elements? In this case you will start by comparing the 1st element with the next 9 elements, then the 2nd with the next 8 elements, and so on. In other words, if you have an N-element array, you will start off by comparing the 1st element with N-1 elements, then the 2nd element with N-2 elements, and so on. This results in O(N^2) time complexity.
But what about space? When you sorted the 5-element or 10-element array, did you use any additional buffer or memory space? You might say: yes, I did use a temporary variable to make the swap. But did the number of variables change when you increased the size of the array from 5 to 10? No. Irrespective of the size of the input, you will always use a single variable to do the swap. This means that the size of the input has nothing to do with the additional space you will require, resulting in O(1), or constant, space complexity.
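A minimal C sketch of that analysis (a plain bubble sort, deliberately kept simple):

    #include <stddef.h>

    /* Bubble sort: O(N^2) comparisons in time, but the only extra storage
       is the single temporary used for the swap => O(1) space for any N. */
    void bubble_sort(int *a, size_t n) {
        for (size_t pass = 0; pass + 1 < n; pass++) {
            for (size_t i = 0; i + 1 < n - pass; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];      /* the one temporary variable */
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                }
            }
        }
    }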
Now, as an exercise, research the time and space complexity of merge sort.
First of all, the space complexity of this loop is O(1) (the input is customarily not included when calculating how much storage is required by an algorithm).
So the question that I have is: is it possible that an algorithm has a different time complexity from its space complexity?
Yes, it is. In general, the time and the space complexity of an algorithm are not related to each other.
Sometimes one can be increased at the expense of the other. This is called space-time tradeoff.
There is a well-known relation between time and space complexity.

First of all, time is an obvious bound on space consumption: in time t you cannot reach more than O(t) memory cells. This is usually expressed by the inclusion

    DTime(f) ⊆ DSpace(f)

where DTime(f) and DSpace(f) are the sets of languages recognizable by a deterministic Turing machine in time (respectively, space) O(f). That is to say that if a problem can be solved in time O(f), then it can also be solved in space O(f).

Less evident is the fact that space provides a bound on time. Suppose that, on an input of size n, you have at your disposal f(n) memory cells, comprising registers, caches and everything. After having written these cells in all possible ways you may eventually stop your computation, since otherwise you would reenter a configuration you already went through, starting to loop. Now, on a binary alphabet, f(n) cells can be written in 2^f(n) different ways, which gives our time upper bound: either the computation will stop within this bound, or you may force termination, since the computation will never stop.

This is usually expressed by the inclusion

    DSpace(f) ⊆ DTime(2^(cf))

for some constant c. The reason for the constant c is that if L is in DSpace(f) you only know that it will be recognized in space O(f), while in the previous reasoning f was an actual bound.

The above relations are subsumed by stronger versions involving nondeterministic models of computation, which is the way they are frequently stated in textbooks (see e.g. Theorem 7.4 in Computational Complexity by Papadimitriou).
Yes, this is definitely possible. For example, sorting n real numbers requires O(n) space but O(n log n) time (in the comparison model). It is true that space complexity is always a lower bound on time complexity, since the time to initialize the space is included in the running time.
Sometimes they are related, and sometimes they are not. In fact, we sometimes use more space to get faster algorithms, as in dynamic programming: https://www.codechef.com/wiki/tutorial-dynamic-programming
Dynamic programming uses memoization or a bottom-up approach. The first technique uses memory to remember repeated solutions, so the algorithm need not recompute them and can instead just fetch them from a list of solutions. The bottom-up approach starts with the small solutions and builds upon them to reach the final solution.
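As a small illustration of the memoization idea (Fibonacci is the stock example here, not something taken from the linked tutorial):

    #include <stdio.h>

    /* Memoized Fibonacci: O(n) extra space buys O(n) time, where the naive
       recursion would take exponential time. */
    #define MAXN 90                 /* fib(90) still fits in a long long */
    static long long memo[MAXN];    /* 0 means "not computed yet"        */

    long long fib(int n) {
        if (n <= 1) return n;
        if (memo[n] != 0) return memo[n];   /* reuse the remembered solution   */
        memo[n] = fib(n - 1) + fib(n - 2);  /* compute once, store, never redo */
        return memo[n];
    }

    int main(void) {
        printf("%lld\n", fib(50));  /* prints 12586269025 */
        return 0;
    }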
Here are two simple examples; one shows a relation between time and space, and the other shows no relation:
Suppose we want to find the summation of all the integers from 1 to a given integer n:
Code 1:

sum = 0
for i = 1 to n
    sum = sum + i
print sum

This code uses only 6 bytes of memory: 2 bytes each for i, n, and sum.
Therefore the time complexity is O(n), while the space complexity is O(1).
Code 2:

array a[n]
a[1] = 1
for i = 2 to n
    a[i] = a[i-1] + i
print a[n]

This code uses at least n*2 bytes of memory for the array.
Therefore the space complexity is O(n), and the time complexity is also O(n).
Space complexity is the way in which the amount of storage space required by an algorithm varies with the size of the problem it is solving. It is normally expressed as an order of magnitude; e.g. O(N^2) means that if the size of the problem (N) doubles, then four times as much working storage will be needed.
Space complexity is the total amount of memory space used by an algorithm/program, including the space of the input values at execution time, whereas time complexity is the number of operations an algorithm performs to complete its task. These are two different concepts: a single algorithm can have low time complexity but still take up a lot of memory; for example, hash maps take more memory than arrays but take less time.