Space efficiency of algorithms - algorithm

It seems that algorithm textbooks don't say much about space efficiency, so I don't really understand what to do when I encounter questions asking for an algorithm that requires only constant memory.
What would be a few examples of algorithms that use constant memory, and of algorithms that don't?

If an algorithm:
a) recurses a number of levels deep which depends on N, or
b) allocates an amount of memory which depends on N
then it is not constant memory. Otherwise it probably is: formally it is constant-memory if there is a constant upper bound on the amount of memory which the algorithm uses, no matter what the size/value of the input. The memory occupied by the input is not included, so sometimes to be clear you talk about constant "extra" memory.
So, here's a constant-memory algorithm to find the maximum of an array of integers in C:
#include <limits.h> /* INT_MIN */
int max(int *start, int *end) {
int result = INT_MIN;
while (start != end) {
if (*start > result) result = *start;
++start;
}
return result;
}
Here's a non-constant memory algorithm, because it uses stack space proportional to the number of elements in the input array. However, it could become constant-memory if the compiler is somehow capable of optimising it to a non-recursive equivalent (which C compilers don't usually bother with except sometimes with a tail-call optimisation, which wouldn't do the job here):
int max(int *start, int *end) {
if (start == end) return INT_MIN;
int tail = max(start+1, end);
return (*start > tail) ? *start : tail;
}
Here is a constant-space sort algorithm (in C++ this time), which is O(N!) time or thereabouts (maybe O(N*N!)):
void sort(int *start, int *end) {
while (std::next_permutation(start,end));
}
Here is an O(N) space sort algorithm, which is O(N^2) time:
void sort(int *start, int *end) {
std::vector<int> work;
for (int *current = start; current != end; ++current) {
work.insert(
std::upper_bound(work.begin(), work.end(), *current),
*current
);
}
std::copy(work.begin(), work.end(), start);
}

Very easy example: counting the number of characters in a string. It can be iterative:
int length( const char* str )
{
int count = 0;
while( *str != 0 ) {
str++;
count++;
}
return count;
}
or recursive:
int length( const char* str )
{
if( *str == 0 ) {
return 0;
}
return 1 + length( str + 1 );
}
The first variant uses only a couple of local variables regardless of the string length, so its space complexity is O(1). The second, if executed without recursion elimination, requires a separate stack frame (holding the return address and local variables) for each level of recursion, so its space complexity is O(n), where n is the string length.

Take sorting algorithms on an array, for example. You can either allocate a new array of the same length as the original and put the sorted elements into it (Θ(n) extra space), or you can sort the array in place and use just one additional temporary variable for swapping two elements (Θ(1) extra space).
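To make the contrast concrete, here is a minimal C sketch. The function names sort_in_place and sort_with_copy are made up for illustration. The first is a selection sort that needs only a few scalar temporaries, so it uses Θ(1) extra space; the second builds the result in a freshly allocated array of n elements, so it uses Θ(n) extra space.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Theta(1) extra space: selection sort swaps elements inside the array. */
void sort_in_place(int *a, size_t n) {
    for (size_t i = 0; i + 1 < n; i++) {
        size_t min = i;
        for (size_t j = i + 1; j < n; j++) {
            if (a[j] < a[min]) min = j;
        }
        int tmp = a[i];            /* the only extra storage: one temporary */
        a[i] = a[min];
        a[min] = tmp;
    }
}

/* Theta(n) extra space: build the sorted result in a new array, then copy it back.
   Returns 0 on success, -1 if the allocation fails. */
int sort_with_copy(int *a, size_t n) {
    if (n == 0) return 0;
    int *out = malloc(n * sizeof *out);
    if (out == NULL) return -1;
    for (size_t i = 0; i < n; i++) {
        /* insert a[i] into the sorted prefix out[0..i) */
        size_t pos = i;
        while (pos > 0 && out[pos - 1] > a[i]) {
            out[pos] = out[pos - 1];
            pos--;
        }
        out[pos] = a[i];
    }
    memcpy(a, out, n * sizeof *out);
    free(out);
    return 0;
}
```

Both variants take O(n^2) time here; only the amount of extra memory differs.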

Related

Why is recursive Merge Sort preferred over iterative Merge Sort even though the latter has auxiliary space complexity?

While studying the Merge Sort algorithm, I was curious whether it can be optimised further. I found out that there is an iterative version of Merge Sort with the same time complexity but an even better O(1) space complexity. And an iterative approach is always better than a recursive approach in terms of performance. Then why is it less common and rarely talked about in regular algorithms courses?
Here's the link to Iterative Merge Sort algorithm
If you think that it has O(1) space complexity, look again. They have the original array A of size n, and an auxiliary temp also of size n. (It actually only needs to be n/2 but they kept it simple.)
And the reason why they need that second array is that when you merge, you copy the bottom region out to temp, then merge back starting with where it was.
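The merge step described above can be sketched in C roughly as follows. This is an illustration under assumptions, not the linked article's code: merge_with_temp and its parameter names are hypothetical, and tmp only has to hold the left run, which is why n/2 elements are enough for the whole sort.

```c
#include <stddef.h>
#include <string.h>

/* Merge the sorted runs a[lo..mid) and a[mid..hi) so that a[lo..hi) is sorted.
   tmp must have room for at least (mid - lo) elements. */
void merge_with_temp(int *a, int *tmp, size_t lo, size_t mid, size_t hi) {
    size_t left_len = mid - lo;
    memcpy(tmp, a + lo, left_len * sizeof *a);   /* copy the bottom region out */

    size_t i = 0;      /* next unread element of the left run (now in tmp) */
    size_t j = mid;    /* next unread element of the right run (still in a) */
    size_t out = lo;   /* next slot to write in a; it never overtakes j */

    while (i < left_len && j < hi) {
        /* <= keeps equal elements in their original order (stable merge) */
        if (tmp[i] <= a[j]) a[out++] = tmp[i++];
        else                a[out++] = a[j++];
    }
    while (i < left_len) a[out++] = tmp[i++];
    /* any remaining right-run elements are already in their final place */
}
```

The extra buffer is what makes the merge simple and linear-time, and it is also why the space complexity stays at O(n).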
So the tradeoff is this. A recursive solution involves a lot less fiddly bits and makes the concepts clearer, but adds a O(log(n)) memory overhead on top of the O(n) memory overhead that both solutions share. When you're trying to communicate concepts, that's a straight win.
Furthermore in practice I believe that recursive is also a win.
In the iterative approach you keep making full passes through your entire array. Which, in the case of a large array, means that data comes into the cache for a pass, gets manipulated, and then falls out as you load the rest of the array. Only to have to be loaded again for the next pass.
In the recursive approach, by contrast, for the operations that are the equivalent of the first few passes you load them into cache, completely sort them, then move on. (How many passes you get this win for depends heavily on data type, memory layout, and the size of your CPU cache.) You are only loading/unloading from cache when you're merging too much data to fit into cache. Algorithms courses generally omit such low-level details, but they matter a lot to real-world performance.
"Found out that there exists Iterative version of Merge Sort algorithm with same time complexity but even better O(1) space complexity"
The iterative, bottom-up implementation of Merge Sort you linked to doesn't have O(1) space complexity. It maintains a copy of the array, which is O(n) space. Consequently, the additional O(log n) stack space of the recursive implementation is irrelevant to the total space complexity.
In the title of your question, and in some comments, you use the words "auxiliary space complexity". This is what we usually mean by space complexity, but you seem to suggest that the term means constant space complexity. It does not. "Auxiliary" refers to the space other than the space used by the input. The term tells us nothing about the actual complexity.
Recursive top-down merge sort is mostly educational. Most actual libraries use some variation of a hybrid of insertion sort and bottom-up merge sort: insertion sort creates small sorted runs, which are then merged in an even number of merge passes, so that merging back and forth between the original and temp arrays ends with the sorted data in the original array. The merge then needs no copy operation other than for a singleton run at the end of the array, and even that can be avoided by choosing an appropriate initial run size for insertion sort. (Note: this is not done in my example code, which only uses a run size of 32 or 64, while a more advanced method like Timsort does choose an appropriate run size.)
Bottom up is slightly faster because the array pointers and indexes will be kept in registers (assuming an optimizing compiler), while top down is pushing|popping array pointers and indexes to|from the stack.
Although I'm not sure that the OP actually meant O(1) space complexity for a merge sort, it is possible, but it is about 50% slower than a conventional merge sort with O(n) auxiliary space. It's mostly a research (educational) effort now, and the code is fairly complex. In the example code linked below, one of the options is no extra buffer at all. The benchmark table there is for a relatively small number of keys (the maximum is 32767). For a large number of keys, this example ends up about 50% slower than an optimized insertion + bottom-up merge sort (std::stable_sort is generalized, for example calling through a pointer to function for every compare, so it is not fully optimized).
https://github.com/Mrrl/GrailSort
Example hybrid insertion + bottom up merge sort C++ code (left out the prototypes):
void MergeSort(int a[], size_t n) // entry function
{
if(n < 2) // if size < 2 return
return;
int *b = new int[n];
MergeSort(a, b, n);
delete[] b;
}
void MergeSort(int a[], int b[], size_t n)
{
size_t s; // run size
s = ((GetPassCount(n) & 1) != 0) ? 32 : 64;
{ // insertion sort
size_t l, r;
size_t i, j;
int t;
for (l = 0; l < n; l = r) {
r = l + s;
if (r > n)r = n;
l--;
for (j = l + 2; j < r; j++) {
t = a[j];
i = j-1;
while(i != l && a[i] > t){
a[i+1] = a[i];
i--;
}
a[i+1] = t;
}
}
}
while(s < n){ // while not done
size_t ee = 0; // reset end index
size_t ll;
size_t rr;
while(ee < n){ // merge pairs of runs
ll = ee; // ll = start of left run
rr = ll+s; // rr = start of right run
if(rr >= n){ // if only left run
rr = n; // copy left run
while(ll < rr){
b[ll] = a[ll];
ll++;
}
break; // end of pass
}
ee = rr+s; // ee = end of right run
if(ee > n)
ee = n;
Merge(a, b, ll, rr, ee);
}
std::swap(a, b); // swap a and b
s <<= 1; // double the run size
}
}
void Merge(int a[], int b[], size_t ll, size_t rr, size_t ee)
{
size_t o = ll; // b[] index
size_t l = ll; // a[] left index
size_t r = rr; // a[] right index
while(1){ // merge data
if(a[l] <= a[r]){ // if a[l] <= a[r]
b[o++] = a[l++]; // copy a[l]
if(l < rr) // if not end of left run
continue; // continue (back to while)
while(r < ee) // else copy rest of right run
b[o++] = a[r++];
break; // and return
} else { // else a[l] > a[r]
b[o++] = a[r++]; // copy a[r]
if(r < ee) // if not end of right run
continue; // continue (back to while)
while(l < rr) // else copy rest of left run
b[o++] = a[l++];
break; // and return
}
}
}
size_t GetPassCount(size_t n) // return # passes
{
size_t i = 0;
for(size_t s = 1; s < n; s <<= 1)
i += 1;
return(i);
}

Algorithm to match sets with overlapping members

I'm looking for an efficient algorithm to match sets among a group of sets, ordered by the number of overlapping members. Two identical sets, for example, are the best match, while sets with no overlapping members are the worst.
So, the algorithm takes as input a list of sets and returns matching set pairs, ordered so that the sets with the most overlapping members come first.
I would be interested in ideas for doing this efficiently. The brute-force approach is to try all combinations and sort, which obviously does not perform well when the number of sets is very large.
Edit: Use case - Assume a large number of sets already exist. When a new set arrives, the algorithm is run and the output includes matching sets (with at least one element overlap) sorted by the most matching to least (doesn't matter how many items are in the new/incoming set). Hope that clarifies my question.
If you can afford an approximation algorithm with a chance of error, then you should probably consider MinHash.
This algorithm allows estimating the similarity between 2 sets in constant time. For each set, a fixed-size signature is computed, and then only the signatures are compared when estimating the similarity. The similarity measure being used is the Jaccard similarity (Jaccard index), which ranges from 0 (disjoint sets) to 1 (identical sets). It is defined as the ratio of the size of the intersection to the size of the union of the two sets.
With this approach, any new set has to be compared against all existing ones (in linear time), and then the results can be merged into the top list (you can use a bounded search tree/heap for this purpose).
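A minimal C sketch of the MinHash idea, under stated assumptions: set elements are 32-bit integers, the signature length SIG_LEN of 64 is an arbitrary choice, and the per-slot hash is a mixing function based on the splitmix64 finalizer. None of these choices come from the question, and the names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SIG_LEN 64   /* signature size: an arbitrary choice for this sketch */

/* 64-bit mixing function based on the splitmix64 finalizer, seeded per slot. */
static uint64_t mix(uint64_t x, uint64_t seed) {
    x += seed * 0x9E3779B97F4A7C15ULL;
    x ^= x >> 30; x *= 0xBF58476D1CE4E5B9ULL;
    x ^= x >> 27; x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

/* sig[i] is the minimum of the i-th hash function over all elements of the set. */
void minhash_signature(const uint32_t *set, size_t n, uint64_t sig[SIG_LEN]) {
    for (size_t i = 0; i < SIG_LEN; i++) {
        uint64_t best = UINT64_MAX;
        for (size_t k = 0; k < n; k++) {
            uint64_t h = mix(set[k], i + 1);
            if (h < best) best = h;
        }
        sig[i] = best;
    }
}

/* The fraction of matching slots estimates |intersection| / |union|.
   The cost depends only on SIG_LEN, not on the sizes of the sets. */
double minhash_similarity(const uint64_t a[SIG_LEN], const uint64_t b[SIG_LEN]) {
    size_t same = 0;
    for (size_t i = 0; i < SIG_LEN; i++) {
        if (a[i] == b[i]) same++;
    }
    return (double)same / SIG_LEN;
}
```

A longer signature lowers the estimation error at the cost of more memory per set. Two empty sets compare as identical in this sketch, which may or may not be what you want.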
Since the number of possible different values is not very large, you get a fairly efficient hashing if you simply set the nth bit in a "large integer" when the nth number is present in your set. You can then look for overlap between sets with a simple bitwise AND followed by a "count set bits" operation. On a 64-bit architecture, that means that you can compute the overlap between two sets (out of 1000 possible values) in about 16 cycles, regardless of the number of values in each cluster. As the clusters get more sparse, this becomes a less efficient algorithm.
Still - I implemented some of the basic functions you might need in some code that I attach here - not documented but reasonably understandable, I think. In this example I made the numbers small so I can check the result by hand - you might want to change some of the #defines to get larger ranges of values, and obviously you will want some dynamic lists etc to keep up with the growing catalog.
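#include <stdio.h>
#include <stdlib.h> // rand()
int countBits(unsigned int b); // defined below; declared here because maxOverlap uses it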
// biggest number you will come across: want this to be much bigger
#define MAXINT 25
// use the biggest type you have - not int
#define BITSPER (8*sizeof(int))
#define NWORDS (MAXINT/BITSPER + 1)
// max number in a cluster
#define CSIZE 5
typedef struct{
unsigned int num[NWORDS]; // want to use longest type but not for demo
int newmatch;
int rank;
} hmap;
// convert number to binary sequence:
void hashIt(int* t, int n, hmap* h) {
int ii;
for(ii=0;ii<n;ii++) {
int a, b;
a = t[ii]%BITSPER;
b = t[ii]/BITSPER;
h->num[b]|=1u<<a; // 1u keeps the shift unsigned when a is 31
}
}
// print binary number:
void printBinary(unsigned int n) {
unsigned int jj;
jj = 1u<<31; // unsigned constant: 1<<31 overflows a signed int
while(jj!=0) {
printf("%c",((n&jj)!=0)?'1':'0');
jj>>=1;
}
printf(" ");
}
// print the array of binary numbers:
void printHash(hmap* h) {
unsigned int ii;
for(ii=0; ii<NWORDS; ii++) {
printf("0x%08x: ", h->num[ii]);
printBinary(h->num[ii]);
}
//printf("\n");
}
// find the maximum overlap for set m of n
int maxOverlap(hmap* h, int m, int n) {
int ii, jj;
int overlap, maxOverlap = -1;
for(ii = 0; ii<n; ii++) {
if(ii == m) continue; // don't compare with yourself
else {
overlap = 0;
for(jj = 0; jj< NWORDS; jj++) {
// just to see what's going on: take these print statements out
printBinary(h->num[ii]);
printBinary(h->num[m]);
int bc = countBits(h->num[ii] & h->num[m]);
printBinary(h->num[ii] & h->num[m]);
printf("%d bits overlap\n", bc);
overlap += bc;
}
if(overlap > maxOverlap) maxOverlap = overlap;
}
}
return maxOverlap;
}
int countBits (unsigned int b) {
int count;
for (count = 0; b != 0; count++) {
b &= b - 1; // this clears the LSB-most set bit
}
return count;
}
int main(void) {
int cluster[20][CSIZE];
int temp[CSIZE];
int ii,jj;
static hmap H[20]; // make them all 0 initially
for(jj=0; jj<20; jj++){
for(ii=0; ii<CSIZE; ii++) {
temp[ii] = rand()%MAXINT;
}
hashIt(temp, CSIZE, &H[jj]);
}
for(ii=0;ii<20;ii++) {
printHash(&H[ii]);
printf("max overlap: %d\n", maxOverlap(H, ii, 20));
}
}
See if this helps at all...

Find the minimum number of steps to make a number from a pair of numbers

Let's assume that we have a pair of numbers (a, b). We can get a new pair (a + b, b) or (a, a + b) from the given pair in a single step.
Let the initial pair of numbers be (1,1). Our task is to find number k, that is, the least number of steps needed to transform (1,1) into the pair where at least one number equals n.
I solved it by generating all the possible pairs and then returning the minimum number of steps in which the given number is formed, but it takes quite a long time to compute. I guess this must be somehow related to finding the gcd. Can someone please help, or point me to a link that explains the concept?
Here is a program that solves the problem, but it is not clear to me...
#include <iostream>
using namespace std;
#define INF 1000000000
int n,r=INF;
int f(int a,int b){
if(b<=0)return INF;
if(a>1&&b==1)return a-1;
return f(b,a-a/b*b)+a/b;
}
int main(){
cin>>n;
for(int i=1;i<=n/2;i++){
r=min(r,f(n,i));
}
cout<<(n==1?0:r)<<endl;
}
My approach to such problems (one I got from projecteuler.net) is to calculate the first few terms of the sequence and then search OEIS for a sequence with the same terms. This can result in a solution that is orders of magnitude faster. In your case the sequence is probably http://oeis.org/A178031, but unfortunately it has no easy-to-use formula.
As the constraint for n is relatively small you can do a dp on the minimum number of steps required to get to the pair (a,b) from (1,1). You take a two dimensional array that stores the answer for a given pair and then you do a recursion with memoization:
const int INF = 1000000000; // marks pairs that cannot be reached from (1,1)
int mem[5001][5001];
int solve(int a, int b) {
if (a > b) {
swap(a,b); // keep a <= b so the smaller number is always the divisor
}
if (a == 0) {
return INF; // (0,b) cannot be reached: the original numbers were not coprime
}
if (mem[a][b] != -1) {
return mem[a][b];
}
if (a == 1) {
return mem[a][b] = b - 1; // (1,b) takes b-1 steps from (1,1)
}
// undo b/a steps at once, which leaves the pair (b%a, a)
int res = solve(b%a, a) + b/a;
return mem[a][b] = res;
}
int main() {
memset(mem, -1, sizeof(mem));
int n;
cin >> n;
int best = -1;
for (int i = 1; i <= n; ++i) {
int temp = solve(n, i);
if (best == -1 || temp < best) {
best = temp;
}
}
cout << best << endl;
}
In fact in this case there is not much difference between dp and BFS, but this is the general approach to such problems. Hope this helps.
EDIT: return a big enough value in the dp if a is zero
You can use breadth-first search to do this. At each step you generate all possible next pairs that you haven't seen before. If the set of next pairs contains the target number, you're done; if not, repeat. The number of times you repeat this is the minimum number of transformations.
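A level-by-level breadth-first search along these lines could look like the C sketch below. It is only meant for small inputs: MAXN is an assumed demo limit, and the function name bfs_min_steps and the static arrays are hypothetical choices, since memory grows with the square of the limit.

```c
#include <string.h>

#define MAXN 200   /* demo limit: visited[] and the queue grow as MAXN^2 */

static unsigned char visited[MAXN + 1][MAXN + 1];
static int queue_a[(MAXN + 1) * (MAXN + 1)];
static int queue_b[(MAXN + 1) * (MAXN + 1)];

/* Minimum number of steps from (1,1) to a pair containing n,
   or -1 if n is outside 1..MAXN. */
int bfs_min_steps(int n) {
    if (n < 1 || n > MAXN) return -1;
    if (n == 1) return 0;
    memset(visited, 0, sizeof visited);
    int head = 0, tail = 0, steps = 0;
    queue_a[tail] = 1; queue_b[tail] = 1; tail++;
    visited[1][1] = 1;
    while (head < tail) {
        int level_end = tail;   /* pairs in [head, level_end) are `steps` moves away */
        steps++;
        while (head < level_end) {
            int a = queue_a[head], b = queue_b[head];
            head++;
            int s = a + b;      /* both moves introduce the same new number */
            if (s == n) return steps;
            if (s > n) continue;   /* numbers only grow, so this pair cannot produce n */
            if (!visited[s][b]) { visited[s][b] = 1; queue_a[tail] = s; queue_b[tail] = b; tail++; }
            if (!visited[a][s]) { visited[a][s] = 1; queue_a[tail] = a; queue_b[tail] = s; tail++; }
        }
    }
    return -1;   /* not reached for 1 <= n <= MAXN */
}
```

For example, 5 is reached in three steps via (1,1), (1,2), (3,2), (3,5).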
First of all, the maximum number you can get after k-3 steps is the kth Fibonacci number. Let t be the golden ratio.
Now, for n start with (n, upper(n/t) ).
If x>y:
NumSteps(x,y) = NumSteps(x-y,y)+1
Else:
NumSteps(x,y) = NumSteps(x,y-x)+1
Iteratively calculate NumSteps(n, upper(n/t) )
PS: Using upper(n/t) might not always give the optimal solution. You can do some local search around this value for the optimal result. To ensure optimality you can try ALL the values from 0 to n-1, in which case the worst-case complexity is O(n^2). But if the optimal value results from a value close to upper(n/t), the solution is O(n log n).
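The exhaustive variant (trying every partner value for n) can be sketched in C as below. Note one substitution that is not in the answer above: instead of subtracting one step at a time, the sketch uses division to undo a/b steps at once, as in Euclid's algorithm, so trying all partners costs roughly O(n log n) overall. The names steps_to_pair and min_steps are hypothetical, and n >= 1 is assumed.

```c
#include <limits.h>

/* Steps needed to reach (a, b) from (1, 1), or INT_MAX if (a, b) is unreachable.
   Undoes the moves in bulk: a / b subtractions of b at once. */
static int steps_to_pair(int a, int b) {
    int steps = 0;
    while (b > 1) {
        steps += a / b;
        int r = a % b;
        a = b;
        b = r;
    }
    if (b == 0) return INT_MAX;   /* the numbers shared a factor: unreachable */
    return steps + (a - 1);       /* (a, 1) is a - 1 steps away from (1, 1) */
}

/* Minimum number of steps to get a pair containing n; assumes n >= 1. */
int min_steps(int n) {
    if (n <= 1) return 0;
    int best = INT_MAX;
    for (int y = 1; y < n; y++) {   /* try every possible partner of n */
        int s = steps_to_pair(n, y);
        if (s < best) best = s;
    }
    return best;
}
```

For n = 10 the best partners are 3 and 7, both giving 5 steps.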

How would I find the time and space complexity of this code?

I am having difficulty finding the space and time complexity of this code that I wrote to find the number of palindromes in a string.
/**
This program finds palindromes in a string.
*/
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int checkPalin(char *str, int len)
{
int result = 0, loop;
for ( loop = 0; loop < len/2; loop++)
{
if ( *(str+loop) == *(str+((len - 1) - loop)) )
result = 1;
else {
result = 0;
break;
}
}
return result;
}
int main()
{
char *string = "baaab4";
char *a, *palin;
int len = strlen(string), index = 0, fwd=0, count=0, LEN;
LEN = len;
while(fwd < (LEN-1))
{
a = string+fwd;
palin = (char*)malloc((len+1)*sizeof(char));
while(index<len)
{
sprintf(palin+index, "%c",*a);
index++;
a++;
if ( index > 1 ) {
*(palin+index) = '\0';
count+=checkPalin(palin, index);
}
}
free(palin);
index = 0;
fwd++;
len--;
}
printf("Palindromes: %d\n", count);
return 0;
}
I gave it a shot, and this is what I think:
In main we have two while loops. The outer one runs length-1 times. Now here is the confusion: the inner while loop runs over the entire length first, then n-1, then n-2, etc., for each iteration of the outer while loop. So does that mean our time complexity will be O(n(n-1)) = O(n^2-n) = O(n^2)?
As for the space complexity, initially I allocate space for string length+1, then (length+1)-1, (length+1)-2, etc. So how can we find the space complexity from this?
For the checkPalin function it's O(n/2).
I am preparing for interviews and would like to understand this concept.
Thank you
Don't forget that each call to checkPalin (which you make each time through the inner loop of main) executes a loop up to index / 2 times inside checkPalin. Your computation of the time complexity of the algorithm is correct except for this. Since index gets as large as n, this adds another factor of n to the time complexity, giving O(n^3).
As for space complexity, you allocate each time through the outer loop, but then free it. So the space complexity is O(n). (Note that O(n) == O(n/2). It's just the exponent and the form of the function that matter.)
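For comparison, here is a hedged sketch of a different technique, "expand around the center", which is not the asker's method. It should count the same thing the program above prints (palindromic substrings of length 2 or more), but in O(n^2) time and O(1) extra space, because it never copies a substring. The name count_palindromes is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Counts substrings of length >= 2 that are palindromes, without allocating. */
int count_palindromes(const char *s) {
    size_t n = strlen(s);
    int count = 0;
    for (size_t center = 0; center < n; center++) {
        /* even == 0: odd lengths, grown around s[center];
           even == 1: even lengths, grown around the gap after s[center] */
        for (int even = 0; even <= 1; even++) {
            size_t lo = center;
            size_t hi = center + even;
            while (hi < n && s[lo] == s[hi]) {
                if (hi > lo) count++;   /* skip single characters */
                if (lo == 0) break;     /* cannot move further left */
                lo--;
                hi++;
            }
        }
    }
    return count;
}
```

There are about 2n centers and each expansion takes at most n/2 comparisons, which is where the O(n^2) bound comes from.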
For time complexity, your analysis is correct. It's O(n^2) because of the n+(n-1)+(n-2)+...+1 steps. For space complexity, you generally only count space needed at any given time. In your case, the most additional memory you ever need is O(n) the first time through the loop, so the space complexity is linear.
That said, this isn't especially good code for checking a palindrome. You could do it in O(n) time and O(1) space and actually have cleaner and clearer code to boot.
Gah: didn't read closely enough. The correct answer is given elsewhere.

How to calculate Big-O Notation on the following code

I've read the topic:
Big O, how do you calculate/approximate it?
And am not sure what the Big-O notation for the following function would be:
static int build_nspaces_pattern(const char * const value, char *pattern,
size_t sz_pattern) {
static char val_buffer[1025];
char *ptr, *saveptr;
size_t szptrn;
ptrdiff_t offset;
val_buffer[0] = '\0';
strncat(val_buffer, value, sizeof(val_buffer) - 1);
val_buffer[sizeof(val_buffer) - 1] = '\0';
pattern[0] = '^'; pattern[1] = '('; pattern[2] = '\0';
for ( ptr=strtok_r(val_buffer, ",", &saveptr);
ptr!=NULL;
ptr=strtok_r(NULL, ",", &saveptr)
) {
szptrn = sz_pattern - strlen(pattern) - 1;
if ( sanitize(ptr) != 0 ) {
return -1;
}
strncat(pattern, ptr, szptrn);
szptrn -= strlen(ptr);
strncat(pattern, "|", szptrn);
}
offset = strlen(pattern);
pattern[offset-1] = ')'; pattern[offset] = '$'; pattern[offset+1] = '\0';
return 0;
}
Sanitize is O(n), but the for loop will run k times (k is the number of commas in the string).
So, is k * O(n) still O(n), or would it be O(n^2), O(k*n) or something else?
Thanks.
Looks O(n) to me, at a glance.
strtok_r() iterates through the original string = O(n)
sanitize() you say is O(n), but this is presumably with respect to the length of the token rather than the length of the original string, so multiply token length by number of tokens = O(n)
strncat() ends up copying all of the original string with no overlap = O(n)
you append a fixed number of characters to the output string (^, (, ), $ and a couple of NULs) = O(1)
you append a | to the string per token = O(n)
But wait!
you call strlen() over the output pattern for every iteration of the loop = O(n^2)
So there's your answer.
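One common way to remove the quadratic term is to remember the current length of the output instead of calling strlen() on it in every iteration. The C sketch below shows the idea; append_tracked is a hypothetical helper, not part of the question's code.

```c
#include <stddef.h>
#include <string.h>

/* Appends src to dst. *len is the current length of dst and cap is its capacity
   in bytes, including the terminator. Returns 0 on success, -1 if src does not fit. */
int append_tracked(char *dst, size_t cap, size_t *len, const char *src) {
    size_t n = strlen(src);               /* scans only the new piece */
    if (*len >= cap || n >= cap - *len) {
        return -1;                        /* no room for n chars plus '\0' */
    }
    memcpy(dst + *len, src, n + 1);       /* copy including the terminator */
    *len += n;
    return 0;
}
```

With this, each character of the output is touched a constant number of times, so building the whole pattern is O(n) rather than O(n^2).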
One way to approach it I like is to replace code with the running times, so for instance
val_buffer[0] = '\0';
strncat(val_buffer, value, sizeof(val_buffer) - 1);
val_buffer[sizeof(val_buffer) - 1] = '\0';
becomes
O(1)
O(n) (* Assume the size of value is the size of the input *)
O(1)
A loop
for each k in value {
strlen(value)
}
becomes
O(n) {
O(n)
}
or some such notation, which you can then turn into O(n) * O(n) = O(n^2). You can then sum up all the listed big-oh times to obtain your final time complexity.
A similar trick is to first lace all of your code with counts of how much work is done, then remove the code that does the real work, leaving only the counts. Then use simple math to simplify the counts. For example,
count = 0;
for (i = 0; i < k; i++) {
count++;
}
is easily seen to be replaceable by count = k.
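The same counting trick applied to a nested loop, as a small C sketch (the function name is made up for illustration). The inner loop runs 1, 2, ..., n times, so the count simplifies to n * (n + 1) / 2, which is O(n^2).

```c
/* Counts how many times the innermost statement of a triangular loop runs. */
unsigned long count_inner_iterations(unsigned long n) {
    unsigned long count = 0;
    for (unsigned long i = 0; i < n; i++) {
        for (unsigned long j = 0; j <= i; j++) {
            count++;             /* stands in for the real work */
        }
    }
    return count;                /* equals n * (n + 1) / 2 */
}
```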
Why is everyone assuming strlen = O(n)? I thought O(n) was only for loops.
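A C string does not store its length, so strlen() has to walk the string until it finds the terminating '\0'. In other words, it is a loop, just one hidden inside a library call. A minimal equivalent, with the hypothetical name my_strlen, looks like this; real library versions are more heavily optimized but still take time proportional to the length.

```c
#include <stddef.h>

/* A plain-C equivalent of strlen: walks the string until the terminating '\0'. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}
```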
