How to calculate the Big-O notation of the following code

I've read the topic:
Big O, how do you calculate/approximate it?
and am still not sure what the Big-O notation for the following function would be:
static int build_nspaces_pattern(const char * const value, char *pattern,
                                 size_t sz_pattern) {
    static char val_buffer[1025];
    char *ptr, *saveptr;
    size_t szptrn;
    ptrdiff_t offset;

    val_buffer[0] = '\0';
    strncat(val_buffer, value, sizeof(val_buffer) - 1);
    val_buffer[sizeof(val_buffer) - 1] = '\0';

    pattern[0] = '^'; pattern[1] = '('; pattern[2] = '\0';

    for ( ptr = strtok_r(val_buffer, ",", &saveptr);
          ptr != NULL;
          ptr = strtok_r(NULL, ",", &saveptr) ) {
        szptrn = sz_pattern - strlen(pattern) - 1;

        if ( sanitize(ptr) != 0 ) {
            return -1;
        }

        strncat(pattern, ptr, szptrn);
        szptrn -= strlen(ptr);
        strncat(pattern, "|", szptrn);
    }

    offset = strlen(pattern);
    pattern[offset-1] = ')'; pattern[offset] = '$'; pattern[offset+1] = '\0';

    return 0;
}
sanitize is O(n), but the for loop will run k times (k is the number of commas in the string).
So is k * O(n) still O(n), or would it be O(n^2), O(k*n), or something else?
Thanks.

Looks O(n) to me, at a glance.
strtok_r() iterates through the original string: O(n).
sanitize() you say is O(n), but this is presumably with respect to the length of the token rather than the length of the original string, so multiplying token length by the number of tokens gives O(n).
strncat() ends up copying all of the original string with no overlap: O(n).
You append a fixed number of characters to the output string ('^', '(', ')', '$' and a couple of NULs): O(1).
You append a '|' to the string per token: O(n).
But wait!
You call strlen() over the output pattern on every iteration of the loop: O(n^2).
So there's your answer.
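To get back to O(n), the repeated strlen(pattern) can be replaced by a running length variable. A minimal sketch of that idea, assuming POSIX strtok_r and the original sanitize() (the stand-in below is hypothetical); treat it as illustrative rather than a drop-in replacement:

#include <string.h>

/* Hypothetical stand-in for the original sanitize(); assume O(token length). */
static int sanitize(char *tok) { (void)tok; return 0; }

/* O(n) rewrite: track the pattern length in pat_len instead of
 * calling strlen(pattern) on every iteration of the loop. */
static int build_nspaces_pattern_linear(const char *value, char *pattern,
                                        size_t sz_pattern) {
    static char val_buffer[1025];
    char *ptr, *saveptr;
    size_t pat_len;

    val_buffer[0] = '\0';
    strncat(val_buffer, value, sizeof(val_buffer) - 1);

    pattern[0] = '^'; pattern[1] = '('; pattern[2] = '\0';
    pat_len = 2;                                 /* length of "^(" */

    for (ptr = strtok_r(val_buffer, ",", &saveptr);
         ptr != NULL;
         ptr = strtok_r(NULL, ",", &saveptr)) {
        size_t tok_len = strlen(ptr);            /* O(this token) only */
        if (sanitize(ptr) != 0)
            return -1;
        if (pat_len + tok_len + 3 > sz_pattern)  /* room for '|' plus ")$" and NUL */
            return -1;
        memcpy(pattern + pat_len, ptr, tok_len);
        pat_len += tok_len;
        pattern[pat_len++] = '|';
        pattern[pat_len] = '\0';
    }

    pattern[pat_len - 1] = ')';                  /* overwrite the last '|' */
    pattern[pat_len]     = '$';
    pattern[pat_len + 1] = '\0';
    return 0;
}

Each token is now touched a constant number of times, so the whole loop is O(n).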

One approach I like is to replace the code with its running times, so for instance
val_buffer[0] = '\0';
strncat(val_buffer, value, sizeof(val_buffer) - 1);
val_buffer[sizeof(val_buffer) - 1] = '\0';
becomes
O(1)
O(n) (* Assume the size of value is the size of the input *)
O(1)
A loop
for each k in value {
strlen(value)
}
becomes
O(n) {
O(n)
}
or some such notation, which you can then turn into O(n) * O(n) = O(n^2). You can then sum up all the listed big-O times to obtain your final time complexity.
A similar trick is to first lace all of your code with counts of how much work is done, then remove the code that does the real work, leaving only the counts, and finally use simple math to simplify the counts. E.g.,
count = 0;
for (i = 0; i < k; i++) {
    count++;
}
is easily seen to be replaceable by count = k.
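The same trick scales to nested loops. A small illustration of my own (not from the answer above): counting the work of a dependent nested loop and collapsing the count into a closed form.

int i, j, k = 10, count = 0;
for (i = 0; i < k; i++) {
    for (j = 0; j < i; j++) {
        count++;   /* the only "work" left is the count itself */
    }
}
/* The inner body runs 0 + 1 + ... + (k-1) times, so
 * count = k*(k-1)/2, which is O(k^2). */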

Why is everyone assuming strlen() is O(n)? I thought O(n) applied only to loops.
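For what it's worth, strlen() is a loop: it has to scan the string until it finds the terminating NUL, so it is O(n) in the string length. A naive sketch of how it might be implemented (real library versions are more optimized, but the asymptotics are the same):

#include <stddef.h>

/* Sketch of a naive strlen(): visits every character up to the NUL. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}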

Related

String permutation with duplicate characters

I have the string "0011" and want all of its combinations without duplicates.
That means I want the strings formed from a combination of two '0's and two '1's;
for example: [0011, 0101, 0110, 1001, 1010, 1100]
I tried this, and the result is exactly what I need:
private void permutation(String result, String str, HashSet<String> hashSet) {
    if (str.length() == 0 && !hashSet.contains(result)) {
        System.out.println(result);
        hashSet.add(result);
        return;
    }
    IntStream.range(0, str.length())
             .forEach(pos -> permutation(result + str.charAt(pos),
                                         str.substring(0, pos) + str.substring(pos + 1),
                                         hashSet));
}
If I remove the HashSet, this code produces 24 results instead of 6.
But the time complexity of this code is O(n!).
How can I avoid creating duplicate strings and reduce the time complexity?
Probably something like this can be faster than n!, even for small n.
The idea is to count how many bits should be set in each resulting item, then
iterate through all possible values and keep only those that have the same number of set bits. It takes a similar amount of time whether there is only one '1' or a 50/50 mix of '0's and '1's.
// SWAR popcount: counts the set bits of a 32-bit integer
function bitCount(n) {
  n = n - ((n >> 1) & 0x55555555);
  n = (n & 0x33333333) + ((n >> 2) & 0x33333333);
  return ((n + (n >> 4) & 0xF0F0F0F) * 0x1010101) >> 24;
}

function perm(inp) {
  const base = 2;
  const len = inp.length;
  const target = bitCount(parseInt(inp, base)); // how many set bits we need
  const min = Math.pow(base, target) - 1;       // smallest value with `target` bits set
  const max = min << (len - target);            // largest such value in `len` bits
  const result = [];
  for (let i = min; i <= max; i++) {
    if (bitCount(i) === target) {
      result.push(i.toString(base).padStart(len, '0'));
    }
  }
  return result;
}

const inp = '0011';
const res = perm(inp);
console.log('result', res);
P.S. My first idea was probably faster than the code above, but the above is easier to implement.
The first idea was to convert the string to an int
and use bitwise left shifts, but only for one digit each time. It still depends on n, and can be slower or faster than the solution above, but the bitwise shift itself is fast.
Example:
const input = '0011'
const len = input.length;
step 1: calculate the number of bits = 2, then generate the first element: 3 (dec), which is '0011' in binary
step 2: move the rightmost set bit one position left with the << operator: '0101'
step 3: move again: '1001'
step 4: we have reached len, so move the next bit instead: '1010'
step 5: repeat: '1100'
step 6: move the initial 3 << 1: '0110'
step 7: repeat the steps above: '1010'
step 8: '1100'
It will generate duplicates, so it could probably be improved.
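Enumerating values with the same number of set bits, in order and without duplicates, is exactly what the well-known Gosper's hack does; a sketch (my illustration, not part of the original answer):

#include <cstdio>

// Gosper's hack: smallest integer larger than v with the same popcount.
unsigned nextSamePopcount(unsigned v) {
    unsigned c = v & (~v + 1);         // lowest set bit of v
    unsigned r = v + c;                // ripple the carry into the next block
    return (((v ^ r) >> 2) / c) | r;   // re-append the shifted-out ones
}

int main() {
    // Enumerates all 4-bit values with two bits set:
    // 0011, 0101, 0110, 1001, 1010, 1100
    for (unsigned v = 0b0011; v <= 0b1100; v = nextSamePopcount(v))
        std::printf("%u\n", v);
    return 0;
}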
Hope it helps
The worst-case time complexity cannot be improved, because a string may contain no duplicate characters at all. In the case of a multiset, however, we can prune a lot of subtrees to prevent duplicates.
The key idea is to permute the string using the traditional backtracking algorithm, but skip a swap if the same character has already been swapped into that position, which prevents duplicates.
Here is a C++ code snippet that prevents duplicates and doesn't use any extra memory for lookup:
#include <iostream>
#include <string>
using namespace std;

// Returns true if str[index] does not occur in str[start..index),
// i.e. this character has not already been tried at position start.
bool shouldSwap(const string& str, size_t start, size_t index) {
    for (auto i = start; i < index; ++i) {
        if (str[i] == str[index])
            return false;
    }
    return true;
}

void permute(string& str, size_t index)
{
    if (index >= str.size()) {
        cout << str << endl;
        return;
    }
    for (size_t i = index; i < str.size(); ++i) {
        if (shouldSwap(str, index, i)) {
            swap(str[index], str[i]);
            permute(str, index + 1);
            swap(str[index], str[i]);
        }
    }
}
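A minimal driver for the snippet above (my addition, just to show the call):

int main() {
    std::string s = "0011";
    permute(s, 0);   // prints the 6 distinct arrangements of "0011"
    return 0;
}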
Also refer to the SO answer and the Distinct permutations write-up linked in the original for more references.
Also note that the time complexity of this solution is O(n^2 * n!):
O(n) for printing a string,
O(n) for iterating over the string to generate the swaps and recursion,
O(n!) possible states for the number of permutations.

Rank of string solution

I was going through a question which asks you to find the rank of a string among its permutations sorted lexicographically.
O(N^2) is pretty clear.
Some websites also have an O(n) solution. The part that is optimized is pre-populating a count array such that
count[i] contains the count of characters which are present in str and are smaller than i.
I understand that this reduces the complexity, but I can't get my head around how we are calculating this array. This is the function that does it (taken from the link):
// Construct a count array where the value at every index
// contains the count of smaller characters in the whole string
void populateAndIncreaseCount(int *count, char *str)
{
    int i;
    for (i = 0; str[i]; ++i)
        ++count[str[i]];
    for (i = 1; i < 256; ++i)
        count[i] += count[i - 1];
}
Can someone please provide an intuitive explanation of this function?
That solution is doing a bucket sort of the characters and then taking prefix sums over the buckets.
A bucket sort is O(items + number_of_possible_distinct_inputs), which for a fixed alphabet can be advertised as O(n).
However, in practice UTF makes for a pretty large alphabet. I would therefore suggest a quicksort instead, because a quicksort that divides into the three buckets <, = and > is efficient for a large character set but still takes advantage of a small one.
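For illustration, a three-way-partition quicksort over a string might look like this sketch (my example of the technique the answer names, not code from the answer):

#include <string>
#include <utility>

// Three-way ("Dutch national flag") quicksort: each pass partitions the
// range into < pivot, == pivot and > pivot, so runs of equal characters
// are finished in a single partition step.
void quicksort3(std::string& s, int lo, int hi) {
    if (lo >= hi) return;
    char pivot = s[lo];
    int lt = lo, gt = hi, i = lo;
    while (i <= gt) {
        if (s[i] < pivot)      std::swap(s[lt++], s[i++]);
        else if (s[i] > pivot) std::swap(s[i], s[gt--]);
        else                   ++i;  // equal to pivot: leave in the middle
    }
    quicksort3(s, lo, lt - 1);
    quicksort3(s, gt + 1, hi);
}

Called as quicksort3(s, 0, (int)s.size() - 1).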
Understood after going through it again; I got confused by the C-style syntax. It's actually doing a pretty simple thing. Here's the Java version:
void populateAndIncreaseCount(int[] count, String str) {
    // count is initialized to zero for all indices
    for (int i = 0; i < str.length(); ++i) {
        count[str.charAt(i)]++;
    }
    for (int i = 1; i < 256; ++i)
        count[i] += count[i - 1];
}
After the first loop, the indices whose characters are present in the string are non-zero. The prefix-sum loop then makes count[i] the sum of all counts up to and including index i; since the indices are the lexicographically sorted characters, count[c - 1] is the number of characters in the string smaller than c. And after processing each character, we update the count array as well:
// Removes a character ch from the count[] array
// constructed by populateAndIncreaseCount()
// (MAX_CHAR is 256 in the linked code)
void updatecount(int *count, char ch)
{
    int i;
    for (i = ch; i < MAX_CHAR; ++i)
        --count[i];
}
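A small worked example (my own, not from the answer) may make the prefix-sum step concrete:

#include <stdio.h>

int main(void) {
    int count[256] = {0};
    const char *str = "cab";

    /* First pass: count['a'] = 1, count['b'] = 1, count['c'] = 1 */
    for (int i = 0; str[i]; ++i)
        ++count[(unsigned char)str[i]];

    /* Prefix sums: count[i] = number of characters <= i */
    for (int i = 1; i < 256; ++i)
        count[i] += count[i - 1];

    /* Characters smaller than 'c' in "cab": count['c' - 1] = 2 ('a', 'b') */
    printf("%d\n", count['c' - 1]);
    return 0;
}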

Time Complexity of Permutations of a String

The following example was taken from Cracking the Coding Interview (6th edition). According to the book, the time complexity of the following code is O(n^2 * n!). (Refer to example 12, pages 32-33.)
public static void main(String[] args) {
    new PermutationsTest().permutations("ABCD");
}

void permutations(String string) {
    permutations(string, "");
}

static int methodCallCount = 0;

void permutations(String string, String prefix) {
    if (string.length() == 0) {
        System.out.println(prefix);
    } else {
        for (int i = 0; i < string.length(); i++) {
            String rem = string.substring(0, i) + string.substring(i + 1);
            permutations(rem, prefix + string.charAt(i));
        }
    }
    System.out.format("Method call count %s \n", methodCallCount);
    methodCallCount += 1;
}
I am finding it difficult to understand how this was calculated. Here are my thoughts:
There can be n! arrangements, so there should be at least n! calls. However, for each call, roughly n units of work happen (since it needs to loop through the passed string). So shouldn't the answer be O(n * n!)?
But what really happens is that for each call the loop runs over successively shorter strings: n, then n-1, and so on. So can we say it should rather be n! * n(n+1)/2?
Please explain.
There are n! possible strings, but each character that's added to a string requires:
String rem = string.substring(0, i) + string.substring(i + 1);
permutations(rem, prefix + string.charAt(i));
The substring calls and the string concatenation are O(n). For each character in a string that would be O(n^2), and for all strings it would be O(n^2 * n!).
EDIT:
I calculated the complexity of creating a string via concatenation as O(n^2), but multiplying that by the number of strings is inaccurate, as the strings share common prefixes, so there's a lot of double counting.
Since the number of calls for the final strings is much larger than for the rest of them, those calls dominate the complexity and are the only ones that need to be counted. So I'm thinking we could reduce the complexity to O(n * n!).
To get the asymptotic time complexity, you need to count how many times the permutations function is called and what each call costs asymptotically. The answer is the product of these.
string.length() = len always decreases by 1 from one level to the next, so there is 1 call with len = n, n calls with len = n-1, n*(n-1) calls with len = n-2, ..., and n! calls with len = 0. Hence, the total number of calls is:
n!/1! + n!/2! + n!/3! + ... + n!/n! = sum(k=1..n) n!/k!
In the asymptotic limit this can be calculated:
sum(k=1..n) n!/k! = n! * (-1 + sum(k=0..n) 1/k!) -> n! * (e - 1),
which is O((e-1)*n!) = O(n!). Here e is the Napier constant, 2.718281828... To calculate the sum I used the Taylor series e^x = sum(k=0..infinity) x^k/k! at x = 1.
Now, each call of the function performs the substring and concatenation operations:
String rem = string.substring(0, i) + string.substring(i + 1);
This requires on the order of string.length() operations, as under the hood the String class needs to copy each character to a new String (string.length() - 1 operations). Therefore, the total complexity is the product of the two: O(n * n!).
To check that the number of calls to perm behaves as I said, I wrote a small C++ program for permutations (without the string operations, so it should be O(n!)):
#include <iostream>
#include <string>
#include <iomanip>

unsigned long long int permutations = 0;
unsigned long long int calls = 0;

void perm(std::string& str, size_t index) {
    ++calls;
    if (index == str.size() - 1) {
        ++permutations;
        // std::cout << permutations << " " << str << std::endl;
    }
    else {
        for (size_t i = index; i < str.size(); ++i) {
            std::swap(str[index], str[i]);
            perm(str, index + 1);
            std::swap(str[index], str[i]);
        }
    }
}

int main() {
    std::string alpha = "abcdefghijklmnopqrstuvwxyz";
    std::cout << std::setprecision(10);
    for (size_t i = 1; i < alpha.size() + 1; ++i) {
        permutations = calls = 0;
        std::string str(alpha.begin(), alpha.begin() + i);
        perm(str, 0);
        std::cout << i << " " << permutations << " " << calls << " "
                  << static_cast<double>(calls) / permutations << std::endl;
    }
}
Output:
1 1 1 1
2 2 3 1.5
3 6 10 1.666666667
4 24 41 1.708333333
5 120 206 1.716666667
6 720 1237 1.718055556
7 5040 8660 1.718253968
8 40320 69281 1.71827877
9 362880 623530 1.718281526
10 3628800 6235301 1.718281801
11 39916800 68588312 1.718281826
12 479001600 823059745 1.718281828
13 6227020800 10699776686 1.718281828
14 took too long
The columns are: length of the string = n, n!, sum(k=1..n) n!/k!, and the ratio of the third and second columns, which should approach (e-1) = 1.71828182845905. It seems to converge rather fast to the asymptotic limit.
I'm afraid the book is mistaken. The time complexity is ϴ(n!*n), as was correctly conjectured in fgb's answer.
Here is why:
As always with recursive functions, we first write down the recurrence relation. In this case, we have two inputs, string and prefix. Let's denote their lengths by s and p, respectively:
T(0,p) = p                // println
T(s,p) = s *              // for (int i = 0; i < string.length(); i++)
         (O(s +           // String rem = string.substring(0, i) + string.substring(i + 1);
             p) +         // prefix + string.charAt(i)
          T(s-1,p+1))     // permutations(rem, prefix + string.charAt(i));
       = s*T(s-1,p+1) + O(s(s+p))
However, note that
s + p always stays constant: it equals k, the original length of the string string;
when s has counted down to 0, p has length k.
So for a particular k, we can rewrite the recurrence relation like this:
T_k(0) = k
T_k(s) = s*T_k(s-1) + O(k*s)
A good rule to memorize is that recurrence relations of the form
T(n) = n * T(n-1) + f(n)
have the general solution
T(n) = n! * (T(0) + Sum { f(i)/i!, for i=1..n })
(To see why, divide both sides by n!: T(n)/n! = T(n-1)/(n-1)! + f(n)/n!, which telescopes.)
Applying this rule here yields the exact solution
T_k(s) = s! * (k + Sum { k*i/i!, for i=1..s })
       = s!*k * (1 + Sum { 1/(i-1)!, for i=1..s })
Now recall that k is the original length of the string string, so we are actually only interested in the case k = s; hence we can write down the final exact solution for this case as
T(s) = s!*s * (1 + Sum { 1/(i-1)!, for i=1..s })
Since the series Sum { 1/(i-1)!, for i=1..infinity } converges, we finally have
T(n) = ϴ(n!*n), qed.

How would I find the time and space complexity of this code?

I am having difficulty finding the space and time complexity of this code, which I wrote to find the number of palindromes in a string.
/**
 * This program finds palindromes in a string.
 */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int checkPalin(char *str, int len)
{
    int result = 0, loop;
    for (loop = 0; loop < len/2; loop++)
    {
        if ( *(str+loop) == *(str+((len-1)-loop)) )
            result = 1;
        else {
            result = 0;
            break;
        }
    }
    return result;
}
int main()
{
    char *string = "baaab4";
    char *a, *palin;
    int len = strlen(string), index = 0, fwd = 0, count = 0, LEN;
    LEN = len;

    while (fwd < (LEN-1))
    {
        a = string + fwd;
        palin = (char*)malloc((len+1) * sizeof(char));
        while (index < len)
        {
            sprintf(palin+index, "%c", *a);
            index++;
            a++;
            if (index > 1) {
                *(palin+index) = '\0';
                count += checkPalin(palin, index);
            }
        }
        free(palin);
        index = 0;
        fwd++;
        len--;
    }
    printf("Palindromes: %d\n", count);
    return 0;
}
I gave it a shot, and this is what I think:
In main we have two while loops. The outer one runs over the entire length-1 of the string. Now here is the confusion: the inner while loop runs over the entire length first, then n-1, then n-2, etc., for each iteration of the outer while loop. So does that mean our time complexity will be O(n(n-1)) = O(n^2 - n) = O(n^2)?
And for the space complexity, initially I assign space for length+1, then (length+1)-1, (length+1)-2, etc., so how can we find the space complexity from this?
For the checkPalin function it's O(n/2).
I am preparing for interviews and would like to understand this concept.
Thank you.
Don't forget that each call to checkPalin (which you make each time through the inner loop of main) executes a loop index/2 times inside checkPalin. Your computation of the algorithm's time complexity is correct except for this. Since index gets as large as n, this adds another factor of n, giving O(n^3).
As for space complexity, you allocate each time through the outer loop, but then free it. So the space complexity is O(n). (Note that O(n) == O(n/2); it's just the exponent and the form of the function that matter.)
For time complexity, your analysis is correct: it's O(n^2) because of the n + (n-1) + (n-2) + ... + 1 steps. For space complexity, you generally only count the space needed at any given time. In your case, the most additional memory you ever need is O(n), the first time through the loop, so the space complexity is linear.
That said, this isn't especially good code for checking a palindrome. You could do it in O(n) time and O(1) space and actually have cleaner and clearer code to boot.
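For reference, a single palindrome check in O(n) time and O(1) space can be done with two pointers; a sketch (my illustration, not the answerer's code):

#include <stdbool.h>
#include <string.h>

/* Two-pointer palindrome check: O(n) time, O(1) extra space. */
bool isPalindrome(const char *str) {
    size_t lo = 0, hi = strlen(str);
    if (hi < 2) return true;
    hi--;                     /* index of the last character */
    while (lo < hi) {
        if (str[lo] != str[hi])
            return false;
        lo++;
        hi--;
    }
    return true;
}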
Gah: didn't read closely enough. The correct answer is given elsewhere.

Space efficiency of algorithms

It seems like none of the algorithm textbooks mention space efficiency much, so I don't really understand what is expected when I encounter a question asking for an algorithm that requires only constant memory.
What would be a few examples of algorithms that use constant memory, and of algorithms that don't?
If an algorithm:
a) recurses a number of levels deep which depends on N, or
b) allocates an amount of memory which depends on N
then it is not constant memory. Otherwise it probably is: formally it is constant-memory if there is a constant upper bound on the amount of memory which the algorithm uses, no matter what the size/value of the input. The memory occupied by the input is not included, so sometimes to be clear you talk about constant "extra" memory.
So, here's a constant-memory algorithm to find the maximum of an array of integers in C:
#include <limits.h>  /* INT_MIN */

int max(int *start, int *end) {
    int result = INT_MIN;
    while (start != end) {
        if (*start > result) result = *start;
        ++start;
    }
    return result;
}
Here's a non-constant memory algorithm, because it uses stack space proportional to the number of elements in the input array. However, it could become constant-memory if the compiler is somehow capable of optimising it to a non-recursive equivalent (which C compilers don't usually bother with except sometimes with a tail-call optimisation, which wouldn't do the job here):
int max(int *start, int *end) {
    if (start == end) return INT_MIN;
    int tail = max(start + 1, end);
    return (*start > tail) ? *start : tail;
}
Here is a constant-space sort algorithm (in C++ this time), which is O(N!) time or thereabouts (maybe O(N*N!)):
#include <algorithm>  // std::next_permutation

void sort(int *start, int *end) {
    while (std::next_permutation(start, end));
}
Here is an O(N) space sort algorithm, which is O(N^2) time:
#include <algorithm>  // std::upper_bound, std::copy
#include <vector>

void sort(int *start, int *end) {
    std::vector<int> work;
    for (int *current = start; current != end; ++current) {
        work.insert(
            std::upper_bound(work.begin(), work.end(), *current),
            *current
        );
    }
    std::copy(work.begin(), work.end(), start);
}
Very easy example: counting a number of characters in a string. It can be iterative:
int length( const char* str )
{
    int count = 0;
    while( *str != 0 ) {
        str++;
        count++;
    }
    return count;
}
or recursive:
int length( const char* str )
{
    if( *str == 0 ) {
        return 0;
    }
    return 1 + length( str + 1 );
}
The first variant uses only a couple of local variables regardless of the string length, so its space complexity is O(1). The second, if executed without recursion elimination, requires a separate stack frame per depth level to store the return address and local variables, so its space complexity is O(n), where n is the string length.
Take sorting algorithms on an array, for example. You can either use a new array of the same length as the original to hold the sorted elements (Θ(n) extra space), or sort the array in place, using just one additional temporary variable for swapping two elements (Θ(1)).
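For instance, an in-place selection sort stays within Θ(1) extra space (a sketch of mine, not from the answer):

// In-place selection sort: besides the loop indices, only minIdx and the
// swap temporary are extra storage, so the extra space is Θ(1).
void selectionSort(int *a, int n) {
    for (int i = 0; i < n - 1; ++i) {
        int minIdx = i;
        for (int j = i + 1; j < n; ++j) {
            if (a[j] < a[minIdx]) minIdx = j;
        }
        int tmp = a[i]; a[i] = a[minIdx]; a[minIdx] = tmp;
    }
}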
