What is the complexity of this nested loop?

I have always been under the assumption that nested loops are always O(N^2), but this code that I wrote recently is clearly not that. What is the complexity of this code?
emails = ["test@gmail.com", "test2@gmail.com", "test3@gmail.com"]
for i in range(0, len(emails)):
    for j in range(0, len(emails[i])):
        ...  # some O(1) work per character
Is this O(N^2) or am I incorrect?

Nested loops are not always O(N^2). See an old post of mine for an example: "Am I crazy for thinking this program is O(n) runtime? My TA says it's O(n^2)".
In your case, is the length of emails[i] dependent upon the size of your emails array (which you are calling n)?

Because the length of the strings themselves factors into the complexity, you can only give an upper-bound time complexity if you have an upper bound for the length of the strings.
let n = the number of strings in your array
let m = the maximum length of any string in the array
In that case it would be O(nm) complexity.
This assumes you're not doing more than O(1) work in the innermost loop you've shown.
If you can't guarantee an upper bound on the size of your strings, the time is unbounded, since a string could in theory be arbitrarily long.
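To make the bound concrete, here is a minimal Python sketch (the counter stands in for whatever O(1) work the inner loop does):

emails = ["test@gmail.com", "test2@gmail.com", "test3@gmail.com"]
n = len(emails)                      # number of strings
m = max(len(e) for e in emails)      # maximum string length
total_chars = 0
for i in range(n):                   # n outer iterations
    for j in range(len(emails[i])):  # at most m inner iterations each
        total_chars += 1             # O(1) work per character
print(total_chars, "<=", n * m)      # total work is bounded by n * m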

Related

How do I measure the O-notation of a for loop?

I have a question about O-notation (big O).
In my code, I am using a for loop to iterate through an array of users.
The for loop has an if-statement that makes it break out of the loop if the right user is found.
My question is: how do I measure the O-notation?
Is the O-notation O(N), since I loop through all the users in the array?
Or is it O(1), since the loop breaks and never runs again?
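A minimal Python sketch of the kind of loop being described (the names find_user, users, and target are assumed, since the original code isn't shown):

def find_user(users, target):
    # Best case: target is at index 0, so the loop runs once: O(1).
    # Worst case: target is last or absent, so the loop runs len(users) times: O(N).
    for user in users:
        if user == target:  # the "right user is found" check
            return user     # breaks out of the loop early
    return None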
O notation defines an "order of" relationship between an amount of work (however measured) and the number of items processed (usually 'n'). So "O(n)" means "in direct proportion to the number of items n", and "O(1)" simply means "constant".

If a loop processes every item once, then the amount of work is intuitively in direct proportion to n. But say your exit condition gets hit on average half way through: we might be tempted to call this O(n/2), but we still say it is O(n), because the relationship to n is still direct/linear. Similarly, if you were to assess the relationship to be O(7n^3 + 2n), you'd say it was simply O(n^3), because n^3 is the term that dominates as n grows large.
The answer to your specific question is therefore O(n) because the number of iterations is in direct proportion to n. All that this says is that if N user records take M milliseconds to process, 2N should take about 2M milliseconds.
It is probably worth noting that O notation is strictly concerned with the worst case, not the average cost of algorithms (although I have started to find that it is quite common for people to use it in the latter sense). It is always a good idea to specify which you mean, to avoid ambiguity.
Big O notation answers the following two questions:
If there are N data elements, how many steps will the algorithm take?
How will the performance of the algorithm change if the number of data elements increases?
The best-case scenario in your case is that the user you are searching for is found at the first index. The time complexity in this case would be O(1), because the number of steps taken by the algorithm is constant and does not change as the number of elements in the array grows.
The worst-case scenario is that your loop will have to iterate over all the users. That makes the time complexity O(N), because the number of steps taken by the algorithm is directly proportional to the number of elements in the array.
Big O notation generally refers to the worst-case scenario, so you can say that the time complexity in your case is O(N).
The best-case complexity of your for loop is O(1) and the worst-case complexity is O(N). For linear search, the best case is O(1) (the item is found immediately) and the worst case is O(N). It also depends on how the loop advances: for(int i = n; i > 1; i = i/2) has complexity O(log N), because i is halved on each iteration. The complexity of an if-else condition by itself is O(1).
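A Python sketch of that halving loop, to make the O(log N) growth visible (the step counter is only there for illustration):

n = 64
i = n
steps = 0
while i > 1:    # equivalent to for(int i = n; i > 1; i = i/2)
    i //= 2     # i is halved on every iteration
    steps += 1
print(steps)    # 6 for n = 64, i.e. about log2(n) iterations: O(log N)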

What is the complexity of this while loop?

Let m be the size of Array A and n be the size of Array B. What is the complexity of the following while loop?
while (i < n && j < m) {
    if (some condition)
        i++;
    else
        j++;
}
Example: for A = [1,2,3,4] and B = [1,2,3,4], the while loop executes at most 4 + 4 times: O(m + n).
Example: for A = [1,2,3,4,7,8,9,10] and B = [1,2,3,4], the while loop executes at most 4 times: O(n).
I am not able to figure out how to represent the complexity of the while loop.
One common approach is to describe the worst-case time complexity. In your example, the worst-case time complexity is O(m + n), because no matter what some condition is during a given loop iteration, the total number of loop iterations is at most m + n.
If it's important to emphasize that the time complexity has a lesser upper bound in some cases, then you'll need to figure out what those cases are, and find a way to express them. (For example, if a given algorithm takes an array of size n and has worst-case O(n^2) time, it might also be possible to describe it as "O(mn) time, where m is the number of distinct values in the array" — only if that's true, of course — where we've introduced an extra variable m to let us capture the impact on the performance of having more vs. fewer duplicate values.)
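To see why m + n is the right worst-case bound, here is a minimal Python sketch of such a loop (the comparison B[i] <= A[j] stands in for "some condition", which the question leaves unspecified):

A = [1, 2, 3, 4, 7, 8, 9, 10]
B = [1, 2, 3, 4]
m, n = len(A), len(B)
i = j = 0
iterations = 0
# Each pass increments exactly one of i and j, and the loop stops as soon
# as either index reaches its bound, so it runs at most m + n times.
while i < n and j < m:
    if B[i] <= A[j]:  # "some condition" -- assumed for illustration
        i += 1
    else:
        j += 1
    iterations += 1
print(iterations)     # never more than m + n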

Big O notation of a preprocessed static data structure

From what I understand, O(n) grows linearly with respect to the size of the input data set.
I'm getting confused because I have a querying structure that maps keys to lists of preprocessed values that will never change after the structure is initialised.
If I define n as the input, an array of keys:
def process(arrOfKeys):  # function name added; the original had none
    for key in arrOfKeys:  # O(n): iterating through the input.
        preprocessedList = getPreprocessedListDifferentForEachKey(key)  # O(1); this list could have any number of elements.
        for anotherPreprocessedList in preprocessedList:  # * <- O(n) or O(1)?
            for element in anotherPreprocessedList:       # * <- O(n) or O(1)?
                ...
I'm unsure if this is O(1), because it is preprocessed, or O(n), as the size of each list depends on what the input is.
Does this end up being O(n^3) in the worst case, or is it possible to argue O(n)?
It depends: if preprocessedList (and its sub-lists) is always going to be of constant length, your two inner loops will each be of time complexity O(1). If, however, they depend on the input argument arrOfKeys, they will each be O(n), and thus O(n) * O(n) = O(n^2).
Combined with the first loop, you then multiply by its time complexity, which is O(n).
So if the inner loops are each O(n), the total is O(n^3).
If the lengths of the preprocessed lists are variable but do not depend on the length of arrOfKeys, you can define an upper bound m and say each inner loop is of time complexity O(m). You can then say that the total time complexity is O(n*m^2).
It's usually possible to introduce another symbol to describe the time complexity, as long as you explain what it is and how it relates to the input data.
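A minimal Python sketch of the O(n*m^2) argument (the dictionary preprocessed, the function name query_all, and the bound m are assumptions for illustration):

def query_all(arrOfKeys, preprocessed):
    # n = len(arrOfKeys); m = an upper bound on the length of any
    # preprocessed list and of its sub-lists.
    for key in arrOfKeys:          # n iterations
        lists = preprocessed[key]  # O(1) dictionary lookup
        for sub in lists:          # at most m iterations
            for element in sub:    # at most m iterations
                pass               # O(1) work per element
    # Total: O(n) * O(m) * O(m) = O(n * m^2)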

Algorithmic complexity of an operation with f(n) cost

If I need to determine the algorithmic complexity of a process with a cost set by a given function, is it just a question of giving O(n^2 log n) - or whatever the big O happens to be?
Also, isn't big O just going to be the highest order of any term in the polynomial? If I'm asked to give a derivation I'm not sure what to provide because it seems a little trivial.
Last question: if I need to give the operation count for an algorithm and it's really straightforward, roughly like
array1, array2, array3 of size n
for i in range(n):
    array2[i] = sqrt(array1[i])
    array3[i] = array1[i] ** 2
For 'operation count' am I just counting up all my arithmetical operations and figuring out which ones (like sqrt) count as multiple operations, etc... Or can I just write that it's O(n)?
The algorithmic cost of a process is the sum of the costs of all the components of the process. Using your example, we can decompose the costs of everything:
array1, array2, array3 of size n
This takes n time for each array, so a total of 3n time, which is in O(n).
for i in range(n):
This means that everything in the loop is multiplied by n.
array2[i] = sqrt(array1[i])
This takes O(1) time. Why? Accessing an array element is constant time. Taking the sqrt is constant time. And setting the value of an array element is constant time. So the whole operation is constant time.
array3[i] = array1[i] ** 2
This takes O(1) time, for the same reasons as the previous operation.
So the whole running time is 3n + n * (1 + 1) (using rough math here, not exact), which is just in O(n) time. Does that help?
As for an actual derivation, there are specific techniques for this. Did you learn the precise mathematical definition of big-Oh notation?
This link describes the formal definition of big-Oh notation, and provides of an example of how to prove this stuff.
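If it helps, here is the example written out as runnable Python (the function name transform is assumed), with the rough counts from above as comments:

import math

def transform(array1):
    n = len(array1)
    array2 = [0.0] * n                    # O(n) to allocate and initialize
    array3 = [0.0] * n                    # O(n)
    for i in range(n):                    # n iterations
        array2[i] = math.sqrt(array1[i])  # O(1): index, sqrt, store
        array3[i] = array1[i] ** 2        # O(1): index, square, store
    return array2, array3                 # total: roughly 3n + n*(1 + 1), which is O(n)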

Time complexity in Big-O

What would be the Big-O time of this algorithm?
Input: Array A storing n >= 1 integers
Output: The sum of the elements at even cells in A
s = A[0]
for i in range(2, n, 2):  # i = 2, 4, ..., up to n-1
    s = s + A[i]
return s
I think the cost function for this algorithm is F(n) = n*ceiling(n/2), but how do you convert that to Big-O?
The time complexity for that algorithm would be O(n), since the amount of work it does grows linearly with the size of the input. The other way to look at it is that it loops over the input once (ignore the fact that it only looks at half of the values; that doesn't matter for Big-O complexity).
The number of operations is not proportional to n*ceiling(n/2), but rather to n/2, which is O(n). Because of the meaning of big-O (which allows an arbitrary constant coefficient), O(n) and O(n/2) are absolutely equivalent, so it is always written as O(n).
This is an O(n) algorithm since you look at ~n/2 elements.
Your algorithm will do N/2 iterations given that there are N elements in the array. Each iteration requires constant time to complete. This gives us O(N) complexity, and here's why.
Generally, the running time of an algorithm is a function f(x) of the size of the data. Saying that f(x) is O(g(x)) means that there exists some constant c such that, for all sufficiently large values of x, f(x) <= c*g(x). It is easy to see how this applies in our case: if we assume that each iteration takes a unit of time, then f(N) <= (1/2)N.
A formal way to obtain the exact number of operations and your algorithm's order of growth:
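For example, one way to count for the loop above: the loop variable takes the values i = 2, 4, ..., up to n-1, which is floor((n-1)/2) iterations, and each iteration performs one array access, one addition, and one assignment. Counting the initial assignment s = A[0] as well, f(n) = 3*floor((n-1)/2) + 1. Since f(n) <= (3/2)n + 1, which is at most c*n for c = 3 and all n >= 1, f(n) is O(n).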
