What is the time complexity of a for loop with multiple conditions?

I was trying to develop a solution that decreases the time complexity of an O(n^2) or O(n*m) algorithm to O(n) or O(n+m). For example:
let arr = [[1, 2], [1, 2, 3], [1, 2, 3, 4, 5, 6, 7, 8]];
let x = 0;
let len = getArrayMaxLength(arr); // Get the maximum length of a 2d array, which in this example is 8.
for (let i = 0; i < len && x < arr.length; ++i) {
  print(arr[x][i % arr[x].length]);
  if ((i + 1) % arr[x].length == 0) {
    ++x;
    if (x != arr.length) i = -1;
  }
}
I'm having trouble determining the Big-O of this algorithm, as I have never dealt much with loops that have multiple conditions. I've read this and this and still don't quite get it. From what I understand, the time complexity will be O(n+m), where n is arr.length and m is len, the output of the function getArrayMaxLength as described above.
So to sum things up: what is the time complexity of this algorithm?
Thank you.

If the body of your loop contains lots of conditionals, but none of them add to the number of repetitions of the outer for loop and none of them do any computation whose time varies with some input, then you should consider the body constant time, so it does not influence the final big-O complexity.
Your assumption of O(m + n) is correct.

Note that every time the for loop reaches the end of an inner array, the counter (variable i) resets and you increase x, passing onto the next inner array. That means you go through every single element of your two-dimensional array, which the program output confirms.
Although n+m complexity may seem OK, it is actually a bad approximation: in practice the complexity is larger because the array lengths vary. Imagine that all the subarrays have the same length, so n = m. Since you visit m elements in each of the n inner arrays, the total complexity is then quadratic (n*n), not linear. When you work with big arrays this difference becomes very obvious.
In conclusion, the time complexity is O(n*m).
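A quick way to convince yourself of the bound (a sketch in Python, not the asker's JavaScript): the loop body runs once per element of the 2-D array, so the total work is the sum of the inner lengths, which is at most n*m.

```python
# Count how many times the loop body of the question's traversal runs:
# every element of every inner array is visited exactly once.
def traversal_steps(arr):
    steps = 0
    for sub in arr:        # n inner arrays
        steps += len(sub)  # each is fully traversed before x advances
    return steps

arr = [[1, 2], [1, 2, 3], [1, 2, 3, 4, 5, 6, 7, 8]]
n, m = len(arr), max(len(sub) for sub in arr)
print(traversal_steps(arr), n * m)  # 13 total steps vs. the n*m = 24 bound
```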

Related

How can sieve of Eratosthenes be implemented in O(n) time complexity?

There is an implementation of this algorithm for finding the prime numbers up to n in O(n*log(log(n))) time complexity. How can we achieve it in O(n) time complexity?
You can perform the Sieve of Eratosthenes to determine which numbers are prime in the range [2, n] in O(n) time as follows:
For each number x in the interval [2, n], we compute the minimum prime factor of x. For implementation purposes, this can easily be done by keeping an array --- say MPF[] --- in which MPF[x] represents the minimum prime factor of x. Initially, you should set MPF[x] equal to zero for every integer x. As the algorithm progresses, this table will get filled.
Now we use a for loop and iterate from i = 2 up to i = n (inclusive). If we encounter a number for which MPF[i] equals 0, we conclude immediately that i is prime, since no least prime factor has been recorded for it. At this point we mark i as prime by inserting it into a list, and we set MPF[i] equal to i. Conversely, if MPF[i] does not equal 0, then we know that i is composite with minimum prime factor MPF[i].
During each iteration, after we've checked MPF[i], we do the following: compute the number y_j = i * p_j for each prime number p_j less than or equal to MPF[i], and set MPF[y_j] equal to p_j.
This might seem counterintuitive --- why is the runtime O(n) if we have two nested loops? The key idea is that every entry of MPF[] is set exactly once, so the total work of the inner loop across all iterations is O(n). This website gives a C++ implementation, which I've provided below:
const int N = 10000000;
int lp[N+1];
vector<int> pr;

for (int i = 2; i <= N; ++i) {
    if (lp[i] == 0) {
        lp[i] = i;
        pr.push_back(i);
    }
    for (int j = 0; j < (int)pr.size() && pr[j] <= lp[i] && i*pr[j] <= N; ++j)
        lp[i * pr[j]] = pr[j];
}
The array lp[] in the implementation above is the same thing as MPF[] that I described in my explanation. Also, pr stores the list of prime numbers.
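For comparison, here is a Python transcription of the same linear sieve (my own rendering, keeping the lp/pr naming from the C++ version):

```python
def linear_sieve(n):
    """Least-prime-factor sieve: returns (lp, pr), where lp[x] is the
    minimum prime factor of x and pr lists the primes up to n."""
    lp = [0] * (n + 1)
    pr = []
    for i in range(2, n + 1):
        if lp[i] == 0:       # no least prime factor recorded: i is prime
            lp[i] = i
            pr.append(i)
        for p in pr:
            if p > lp[i] or i * p > n:
                break
            lp[i * p] = p    # each composite gets its lp set exactly once
    return lp, pr

lp, pr = linear_sieve(30)
print(pr)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```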
Well, if the algorithm is O(n*log(log(n))) you generally can’t do better without changing the algorithm.
The complexity is O(n*log(log(n))). But you can trade between time and resources: by making sure you have O(log(log(n))) computing nodes running in parallel, it would be possible to do it in O(n).
Hope I didn’t do your homework...

Divide and conquer algorithm

I had a job interview a few weeks ago and I was asked to design a divide and conquer algorithm. I could not solve the problem, but they just called me for a second interview! Here is the question:
we are given as input two n-element arrays A[0..n − 1] and B[0..n − 1] (which
are not necessarily sorted) of integers, and an integer value. Give an O(n log n) divide and conquer algorithm that determines if there exist distinct values i, j (that is, i != j) such that A[i] + B[j] = value. Your algorithm should return True if such i, j exist, and return False otherwise. You may assume that the elements in A are distinct, and the elements in B are distinct.
can anybody solve the problem? Thanks
My approach is:
Sort one of the arrays; here we sort array A, using merge sort, which is a divide and conquer algorithm.
Then, for each element of B, search array A for (value - element of B) with binary search. Again, this is a divide and conquer algorithm.
If you find (value - element of B) in array A, the two elements make a pair such that (element of A) + (element of B) = value.
As for time complexity: A has N elements, so merge sort takes O(N log N), and we do a binary search for each of the N elements of B, which also takes O(N log N). So the total time complexity is O(N log N).
Since you are required to check i != j when A[i] + B[j] = value, you can use a 2D array of size N x 2: pair each element with its original index in the second column, sort by the first column, and when you find a matching element, compare the original indexes and return the value accordingly.
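A sketch of this approach in Python (names are mine; bisect stands in for a hand-rolled binary search, and Python's built-in sort for merge sort):

```python
import bisect

def pair_with_sum(A, B, value):
    # Sort A while remembering original indexes (the "N x 2 array" idea).
    indexed = sorted((a, i) for i, a in enumerate(A))   # O(n log n)
    keys = [a for a, _ in indexed]
    for j, b in enumerate(B):
        need = value - b
        pos = bisect.bisect_left(keys, need)            # O(log n) per element
        if pos < len(keys) and keys[pos] == need and indexed[pos][1] != j:
            return True                                 # found a pair with i != j
    return False

print(pair_with_sum([7, 9, 5, 3, 47, 89, 1],
                    [5, 7, 3, 4, 21, 59, 0], 106))  # True (47 + 59)
```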
The following algorithm does not use Divide and Conquer but it is one of the solutions.
You need to sort both arrays, maintaining the indexes of the elements, e.g. by sorting an array of pairs (elem, index). This takes O(n log n) time.
Then you can apply the merge algorithm to check if there are two elements such that A[i] + B[j] = value. This takes O(n).
The overall time complexity will be O(n log n).
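The merge-style scan can be sketched like this (my code, with the index bookkeeping for the i != j check omitted for brevity):

```python
def has_pair_sorted(A, B, value):
    A, B = sorted(A), sorted(B)       # O(n log n) preprocessing
    i, j = 0, len(B) - 1
    while i < len(A) and j >= 0:      # O(n): each step advances one pointer
        s = A[i] + B[j]
        if s == value:
            return True
        if s < value:
            i += 1                    # sum too small: take a bigger A element
        else:
            j -= 1                    # sum too big: take a smaller B element
    return False

print(has_pair_sorted([7, 9, 5, 3, 47, 89, 1],
                      [5, 7, 3, 4, 21, 59, 0], 106))  # True
```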
I suggest using hashing. Even if it's not the way you are supposed to solve the problem, it's worth mentioning, since hashing has a better time complexity (O(n) vs. O(n*log(n))) and is thus more efficient.
Turn A into a hashset (or dictionary if we want the i index) - O(n)
Scan B and check if value - B[j] is in the hashset (dictionary) - O(n)
So you have an O(n) + O(n) = O(n) algorithm (better than the required O(n*log(n)), though the solution is NOT Divide and Conquer):
Sample C# implementation
int[] A = new int[] { 7, 9, 5, 3, 47, 89, 1 };
int[] B = new int[] { 5, 7, 3, 4, 21, 59, 0 };
int value = 106; // 47 + 59 = A[4] + B[5]

// Turn A into a dictionary: key = item's value; value = item's index
var dict = A
    .Select((val, index) => new { v = val, i = index })
    .ToDictionary(item => item.v, item => item.i);

int i = -1;
int j = -1;

// Scan B array
for (int k = 0; k < B.Length; ++k) {
    if (dict.TryGetValue(value - B[k], out i)) {
        // Solution found: {i, j}
        j = k;
        // break if you want any solution;
        // scan further (comment out "break") if you want all pairs
        break;
    }
}

Console.Write(j >= 0 ? $"{i} {j}" : "No solution");
Seems hard to achieve without sorting.
If you leave the arrays unsorted, checking for existence of A[i]+B[j] = Value takes time Ω(n) for fixed i, then checking for all i takes Θ(n²), unless you find a trick to put some order in B.
Balanced Divide & Conquer on the unsorted arrays doesn't seem any better: if you divide A and B in two halves, the solution can lie in one of Al/Bl, Al/Br, Ar/Bl, Ar/Br and this yields a recurrence T(n) = 4 T(n/2), which has a quadratic solution.
If sorting is allowed, the solution by Sanket Makani is a possibility but you do better in terms of time complexity for the search phase.
Indeed, assume A and B now sorted and consider the 2D function A[i]+B[j], which is monotonic in both directions i and j. Then the domain A[i]+B[j] ≤ Value is limited by a monotonic curve j = f(i), or equivalently i = g(j). But strict equality A[i]+B[j] = Value must be checked exhaustively for all points of the curve, and one cannot avoid evaluating f everywhere in the worst case.
Starting from i = 0, you obtain f(0) by dichotomic search. Then you can follow the border curve incrementally. You will perform at most n steps in the i direction and at most n steps in the j direction, so the complexity remains bounded by O(n), which is optimal.
[Figure omitted: an example showing the areas with a sum below and above the target value; there are two matches.]
This optimal solution has little to do with Divide & Conquer. It is maybe possible to design a variant based on the evaluation of the sum at a central point, which allows to discard a whole quadrant, but that would be pretty artificial.

Big O - is n always the size of the input?

I made up my own interview-style problem, and have a question about the big O of my solution. I will state the problem and my solution below, but first let me say that the obvious solution involves a nested loop and is O(n^2). I believe I found an O(n) solution, but then I realized it depends not only on the size of the input but also on the largest value in the input. It seems like my O(n) running time is only a technicality, and that it could easily run in O(n^2) time or worse in real life.
The problem is:
For each item in a given array of positive integers, print all the other items in the array that are multiples of the current item.
Example Input:
[2 9 6 8 3]
Example Output:
2: 6 8
9:
6:
8:
3: 9 6
My solution (in C#):
private static void PrintAllDivisibleBy(int[] arr)
{
    Dictionary<int, bool> dic = new Dictionary<int, bool>();
    if (arr == null || arr.Length < 2)
        return;

    int max = arr[0];
    for (int i = 0; i < arr.Length; i++)
    {
        if (arr[i] > max)
            max = arr[i];
        dic[arr[i]] = true;
    }

    for (int i = 0; i < arr.Length; i++)
    {
        Console.Write("{0}: ", arr[i]);
        int multiplier = 2;
        while (true)
        {
            int product = multiplier * arr[i];
            if (dic.ContainsKey(product))
                Console.Write("{0} ", product);
            if (product >= max)
                break;
            multiplier++;
        }
        Console.WriteLine();
    }
}
So, if two of the array items are 1 and n, where n is the array length, the inner while loop will run n times for the item 1, making this equivalent to O(n^2). But, since that performance depends on the size of the input values, not the length of the list, that makes it O(n), right?
Would you consider this a true O(n) solution? Is it only O(n) due to technicalities, but slower in real life?
Good question! The answer is that, no, n is not always the size of the input: you can't really talk about O(n) without defining what n means, but people often use imprecise language and imply that n is "the most obvious thing that scales here". Technically we should usually say things like "this sort algorithm performs a number of comparisons that is O(n) in the number of elements in the list": being specific about both what n is and what quantity we are measuring (comparisons).
If you have an algorithm that depends on the product of two different things (here, the length of the list and the largest element in it), the proper way to express that is in the form O(m*n), and then define what m and n are for your context. So, we could say that your algorithm performs O(m*n) multiplications, where m is the length of the list and n is the largest item in the list.
An algorithm is O(n) when you have to iterate over n elements and perform some constant-time operation in each iteration. The inner while loop of your algorithm is not constant time, as it depends on the magnitude of the biggest number in your array.
Your algorithm's best-case run time is O(n). This is the case when all n numbers are the same.
Your algorithm's worst-case run time is O(k*n), where k = the maximum possible int value on your machine, if you really insist on putting an upper bound on k (for a 32-bit int, that is 2,147,483,647). You can argue that this k is a constant, but this constant is clearly
not fixed for every case of input array; and,
not negligible.
Would you consider this a true O(n) solution?
The runtime actually is O(n*m), where m is the maximum element of arr. If the elements in your array are bounded by a constant, you can consider the algorithm to be O(n).
Can you improve the runtime? Here's what else you can do. First notice that you can ensure the elements are distinct (compress the array into a hashmap which stores how many times each element occurs in the array). Then, assuming the array is sorted, your runtime would be max/a[0] + max/a[1] + max/a[2] + ... <= max + max/2 + ... + max/max = O(max*log(max)). If you combine this with the obvious O(n^2) algorithm, you get an O(min(n^2, max*log(max))) algorithm.
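A sketch of that improvement (my own code, following the answer's outline): deduplicate, then for each distinct value walk its multiples up to the maximum, which is exactly the harmonic sum max/a[0] + max/a[1] + ... above.

```python
def print_multiples(arr):
    present = set(arr)                   # "compress" duplicates away
    mx = max(arr)
    out = {}
    for a in sorted(present):
        # mx // a iterations here: summed over distinct a, O(max log max)
        out[a] = [p for p in range(2 * a, mx + 1, a) if p in present]
    return out

print(print_multiples([2, 9, 6, 8, 3]))
# {2: [6, 8], 3: [6, 9], 6: [], 8: [], 9: []}
```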

Time complexity to get average of large data set using subsets

Say you're given a large set of numbers (size n) and asked to compute the average of the data. You only have enough space and memory for c numbers at a time. What is the run-time complexity of this computation?
To compute an average for the whole dataset, the complexity would be O(n). Consider the following algorithm:
set sum = 0;
for (i = 0; i < n; i++) { // Loop n times
    add value of element i to sum;
}
set average = sum / n;
Since we can disregard the two constant time operations, the main operation (adding value to sum) occurs n times.
In this particular example, you only have data for 'c' numbers at one time. For each individual group, you'll need a time complexity of O(c). However, this will not change your overall complexity, because ultimately you will be making n passes.
To provide a concrete example, consider the case n = 100 and c = 40, and your values are passed in an array. Your first loop would have 40 passes, the second another 40, and the third only twenty. Regardless, you have made 100 passes through the loop.
This assumes also that it is a constant time operation to get the second set of numbers.
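A sketch of that chunked pass in Python (chunks() is a stand-in for whatever I/O hands you the next c numbers):

```python
def chunks(data, c):
    # Stand-in for fetching the next block of at most c numbers.
    for start in range(0, len(data), c):
        yield data[start:start + c]

def average_in_chunks(data, c):
    total = count = 0
    for block in chunks(data, c):  # ceil(n/c) blocks...
        for x in block:            # ...but still exactly n additions overall
            total += x
            count += 1
    return total / count

print(average_in_chunks(list(range(1, 101)), 40))  # 50.5, after 100 passes
```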
It is O(n).
A basic (though not particularly stable) algorithm computes it iteratively as follows:
mean = 0
for n = 0, 1, 2, ..., length(arr)-1
    mean = (mean*n + arr[n])/(n+1)
A variant of this algorithm can be used to parse the data from the array in sets of c numbers, but it is still linear in n.
To spell it out, you can do this:
mean = 0
for m = 0, c, 2c, ..., length(arr)-1
    sub_arr = request_sub_arr_between(m, min(m+c-1, length(arr)-1))
    for i = 0, 1, ..., length(sub_arr)-1
        n = m + i
        mean = (mean*n + sub_arr[i])/(n+1)
This is still O(n), as we are only doing a bounded number of things for each n. In fact, the algorithm at the top of this answer is this variant with c = 1. If sub_arr is not kept in local memory, but sub_arr[i] is read at each step, then we are only storing 3 numbers at any step.
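A runnable Python version of that pseudocode (slicing simulates request_sub_arr_between):

```python
def chunked_running_mean(arr, c):
    mean = 0.0
    for m in range(0, len(arr), c):
        sub_arr = arr[m:m + c]                # request_sub_arr_between(m, ...)
        for i, x in enumerate(sub_arr):
            n = m + i
            mean = (mean * n + x) / (n + 1)   # incremental mean update
    return mean

print(chunked_running_mean([1, 2, 3, 4, 5, 6], 2))  # 3.5
```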

global max of local mins of k contiguous elements

If this question appears to be a duplicate, please point it out.
The problem states:
Given an array of n elements and an integer k <= n, find max over i in {0, ..., n-k} of min{a_{i+1}, ..., a_{i+k}}, i.e. find the maximum of the minimums of all blocks of k contiguous numbers.
For example, let the sequence a = {10, 21, 11, 13, 16, 15, 12, 9} and k = 3.
The max is 13 for block {13, 16, 15}.
Hopefully the problem is clear!
It is straightforward to write a simple "brute force" solution, making it O(nk). I am wondering if we can do it in O(n log n) with divide and conquer, or even in O(n), possibly with dynamic programming.
It appears to me that if I try divide and conquer, I have to deal with the set of blocks that straddle the middle border. Yet figuring out the max in that case seems to take O(k^2), making the whole recurrence O(nk) again. (Maybe I got the numbers wrong somewhere!)
Looking for directions from you guys! Words and pseudocode are welcome!
You can do it in O(n log k) time; I'll start you off with a tip.
Suppose you have elements a_{i+1}, ..., a_{i+k} in some data structure with logarithmic-time insert and constant-time min operations (like a min-heap). Can you use that to get the same data structure, but with elements a_{i+2}, ..., a_{i+k+1}, in O(log k) time?
Once you have that, then you can basically go through all of the consecutive groups of k and take the maximum of the minimums normally.
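One way to flesh out the hint (a sketch of mine, not from the answer): a min-heap of (value, index) pairs with lazy deletion of entries that have slid out of the window. Lazy deletion lets the heap grow beyond k entries, so each operation is O(log n) in the worst case rather than a strict O(log k), but it realizes the same idea.

```python
import heapq

def max_of_window_mins(xs, k):
    heap = [(xs[i], i) for i in range(k)]   # first window of k elements
    heapq.heapify(heap)
    best = heap[0][0]
    for i in range(k, len(xs)):
        heapq.heappush(heap, (xs[i], i))    # logarithmic insert
        while heap[0][1] <= i - k:          # lazily evict stale minima
            heapq.heappop(heap)
        best = max(best, heap[0][0])        # constant-time window minimum
    return best

print(max_of_window_mins([10, 21, 11, 13, 16, 15, 12, 9], 3))  # 13
```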
Here's one way to solve it using a dynamic-programming-like solution. We construct an array ys that stores the mins of ever-increasing ranges. It uses the idea that if ys currently stores mins of ranges of length p, then doing ys[i] = min(ys[i], ys[i+p]) for all i gives mins of ranges of length 2p. Similarly, ys[i] = min(ys[i], ys[i+p], xs[i+2p]) gives mins of ranges of length 2p+1. By using a technique very like exponentiation by squaring, we can end up with ys storing min-ranges of length k.
def mins(xs, k, ys):
    if k == 1: return ys
    mins(xs, k // 2, ys)
    for i in range(len(ys)):
        if i + k//2 < len(ys): ys[i] = min(ys[i], ys[i + k//2])
        if k % 2:
            if i + k - 1 < len(ys): ys[i] = min(ys[i], xs[i + k - 1])
    return ys

def maxmin(xs, k):
    # only the first len(xs)-k+1 entries hold mins of full-length windows
    return max(mins(xs, k, xs[:])[:len(xs) - k + 1])
We call mins recursively log_2(k) times, and each call iterates over the ys array once. Thus, the algorithm is O(n log k).
