Big O notation of a preprocessed static data structure - algorithm

From what I understand, O(n) grows linearly with the size of the input data set.
I'm getting confused because I have a query structure that maps keys to lists of preprocessed values, and those lists never change after the structure is initialised.
If I define n as the input, an array of keys:
def query(arrOfKeys):
    for key in arrOfKeys:  # O(n): iterating through the input.
        preprocessedList = getPreprocessedListDifferentForEachKey(key)  # O(1) lookup; this list could have any number of elements.
        for anotherPreprocessedList in preprocessedList:  # * <- O(n) or O(1)?
            for element in anotherPreprocessedList:  # * <- O(n) or O(1)?
                ...
I'm unsure whether this is O(1) because the lists are preprocessed, or O(n) because the size of each list depends on the input.
Does this end up being O(n^3) in the worst case, or is it possible to argue O(n)?

It depends. If preprocessedList (and each of its sub-lists) is always of constant length, your two inner loops are each of time complexity O(1). If their lengths instead depend on the input argument arrOfKeys, they are each O(n), and thus O(n) * O(n) = O(n^2).
You then multiply by the time complexity of the outer loop, which is O(n).
So if the inner loops are each O(n), the total is O(n^3).
If the length of preprocessedList is variable but does not depend on the length of arrOfKeys, you can call it m and say each inner loop is of time complexity O(m). The total time complexity is then O(n*m^2).
It's usually fine to introduce another symbol to describe the time complexity, as long as you explain what it stands for and how it relates to the input.
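For example, here is a minimal sketch of the O(n*m^2) case; the map below and its sizes are hypothetical, chosen only to make n and m concrete:

# Hypothetical preprocessed map: each key points to m sub-lists of m elements.
preprocessed = {
    "a": [[1, 2], [3, 4]],
    "b": [[5, 6], [7, 8]],
}

def query(arrOfKeys):
    for key in arrOfKeys:                  # n iterations
        for sublist in preprocessed[key]:  # m iterations
            for element in sublist:        # m iterations
                pass                       # O(1) work, so O(n * m^2) in total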

Related

Time and space complexity of Ruby Anagram algorithm

I have written this algorithm here and I am trying to evaluate its time and space complexity in terms of Big-O notation. The algorithm determines if two given strings are anagrams.
def anagram(str1, str2)
  str1.each_char do |char|
    selected_index = str2.index(char)
    return false if !selected_index # to handle nil index
    str2.slice!(selected_index)
  end
  str2.empty?
end
The time complexity of this function is O(n^2), and the space complexity is O(1)? I believe I may be mistaken about the space complexity (it could be O(n)), because the selected_index variable is repeatedly reassigned, which takes up memory relative to how long the each_char loop runs.
If someone could please offer some guidance, that would be great :)
Gathering up all those comments into an answer, here is my analysis:
Time
The algorithm as presented does indeed have O(n^2) running time.
The body of the loop is executed n times and takes linear time for index, linear time for slice, and constant time for the rest, requiring a total of O(n^2) time.
Space
The algorithm as presented requires linear space, because it updates a copy of str2 at each iteration.
The rest of the algorithm only takes constant space, unless you include the storage for the inputs themselves, which is also linear.
Faster algorithm: sort str1 and str2
A faster algorithm would be to compare sort-by-character(str1) with sort-by-character(str2). That would take O(n log n) time and O(n) space for the sorts, and linear time and constant space for the comparison, for an overall O(n log n) time and O(n) space.
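As a rough sketch of that idea (in Python rather than Ruby, for brevity):

def anagram_sorted(str1, str2):
    # Sorting each string by character is O(n log n) time and O(n) space;
    # comparing the sorted results is linear time and constant extra space.
    return sorted(str1) == sorted(str2)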
Even faster algorithm: use a hash (proposed by OP in the comments)
Using hash tables to store character counts and then comparing those counts can reduce the running time to O(n), assuming standard O(1) insert and lookup hash operations. The space in this case is the space required for the hash tables, which is O(k) for a character alphabet of size k, and can be considered constant if k is fixed. Of course, the input parameters still consume their initial O(n) space as they are passed in or wherever they are originally stored; the O(k) reflects only the additional space required to run this algorithm.
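A minimal sketch of the counting approach (again in Python; Counter is just one way to build the character-count tables):

from collections import Counter

def anagram_counts(str1, str2):
    # Each Counter is built in O(n) time, assuming O(1) hash inserts;
    # each table holds at most k entries for an alphabet of size k.
    return Counter(str1) == Counter(str2)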

What is the complexity of this while loop?

Let m be the size of Array A and n be the size of Array B. What is the complexity of the following while loop?
while (i < n && j < m) {
    if (some condition)
        i++
    else
        j++
}
Example: for A = [1,2,3,4], B = [1,2,3,4], the while loop executes at most 4 + 4 times, which is O(m+n).
Example: for A = [1,2,3,4,7,8,9,10], B = [1,2,3,4], the while loop may execute as few as 4 times, which is O(n).
I am not able to figure out how to represent the complexity of the while loop.
One common approach is to describe the worst-case time complexity. In your example, the worst-case time complexity is O(m + n), because no matter what some condition is during a given loop iteration, the total number of loop iterations is at most m + n.
If it's important to emphasize that the time complexity has a lesser upper bound in some cases, then you'll need to figure out what those cases are, and find a way to express them. (For example, if a given algorithm takes an array of size n and has worst-case O(n^2) time, it might also be possible to describe it as "O(mn) time, where m is the number of distinct values in the array" — only if that's true, of course — where we've introduced an extra variable m to let us capture the impact on the performance of having more vs. fewer duplicate values.)
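As a concrete instance of the m + n bound, here is a sketch with an assumed condition (the original loop leaves "some condition" unspecified), merging two sorted arrays:

def merge(A, B):
    # A has size m, B has size n. Each iteration advances exactly one
    # pointer, so the loop body runs at most m + n times: O(m + n).
    i = j = 0
    out = []
    while i < len(B) and j < len(A):
        if B[i] <= A[j]:            # plays the role of "some condition"
            out.append(B[i])
            i += 1
        else:
            out.append(A[j])
            j += 1
    return out + B[i:] + A[j:]      # drain whichever array remains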

What is the complexity of this nested loop?

I have always been under the assumption that nested loops are always O(N^2). But this code that I wrote recently is clearly not that; what is the complexity of this code?
emails = ["test@gmail.com", "test2@gmail.com", "test3@gmail.com"]

for i in range(0, len(emails)):
    for j in range(0, len(emails[i])):
        ...
Is this O(N^2) or am I incorrect?
Nested loops are not always O(N^2). See an old post of mine for an example: Am I crazy for thinking this program is O(n) runtime? My TA says it's O(n^2)
In your case, is the length of emails[i] dependent upon the size of your emails array (which you are calling n)?
Because the length of the strings themselves factors into the complexity, you can only create an upper-bound time complexity if you have an upper-bound for the length of the strings.
let n = the number of strings in your array
let m = the maximum length of any string in the array
In that case it would be O(nm) complexity.
This assumes you're not doing more than O(1) work in the innermost loop you've shown.
If you can't guarantee an upper-bound size of your strings, the time is unbounded, since a string could in theory be infinitely long.
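To make the n*m bound concrete, here is a small sketch (the email strings are placeholders):

emails = ["a@x.com", "longer.address@example.com", "b@y.org"]

n = len(emails)                      # number of strings
m = max(len(e) for e in emails)      # maximum string length

total = 0
for i in range(n):
    for j in range(len(emails[i])):  # at most m iterations per string
        total += 1                   # the O(1) innermost work

print(total, "<=", n * m)            # total iterations never exceed n*m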

Space complexity of algorithm to copy a list into a HashSet

What is the space complexity of an algorithm that places each element from a list into a HashSet? Is it O(n), where n is the size of the list, or is it O(k), where k is the number of unique elements in the list? Since the HashSet only grows when we add unique elements to it, the latter seems correct to me.
The space complexity of an algorithm takes the size of the input into account: it is a measure of the maximum working memory needed during the execution of the algorithm. So for an input of size O(n), the space complexity has to be at least O(n).
Given that the algorithm uses O(n) just for the input, and any reasonable implementation uses only a constant amount of extra space as it iterates over the list, and we know that k <= n, the input size will always be the dominating factor. So overall the space complexity is O(n).
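Here is a minimal sketch of that distinction (in Python, with a set standing in for the HashSet):

def to_set(items):
    # The input list already occupies O(n) space. The set is the only
    # extra storage and holds at most k <= n unique elements, so the
    # auxiliary space is O(k); counting the input, total space is O(n).
    unique = set()
    for item in items:
        unique.add(item)
    return unique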

analyzing time complexity

I have 2 arrays:
A of length n
B of length m
Now I want to find all elements that are common to both arrays.
Algorithm:
Build a hash map containing all elements of A.
For each element x in B, check whether x is present in the hash map.
Analyzing the overall time complexity:
building the hash map is O(n)
the second step is O(m)
So the overall complexity is O(m+n). Am I correct?
And what does O(m+n) become when m is much larger than n, or vice versa?
O(m) + O(n) = O(m+n), and if you know that m > n, then O(m+n) = O(m+m) = O(m).
Note: hash tables don't theoretically guarantee O(1) lookup, but in practice you can count on it (it is the average complexity, the expected runtime for a random input).
Also note that your algorithm will repeatedly report duplicate elements of B that are also present in A. If that is a problem, you have to record in the hash map that you have already checked/printed that element.
The average-case time complexity is O(m + n), which is what you should consider for a practical implementation, since hash maps usually have few collisions. Note that O(m+n) = O(max(m, n)).
However, if this is a test question, "time complexity" usually means worst-case time complexity. The worst case is O(mn), since each lookup in the second step can take O(n) time in the worst case.
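A minimal sketch of the algorithm (in Python, with a set standing in for the hash map, and deduplicating reported matches per the note above):

def common_elements(A, B):
    present = set(A)           # build the hash set from A: O(n) expected
    reported = set()
    for x in B:                # O(m) expected lookups; O(m + n) overall
        if x in present and x not in reported:
            reported.add(x)    # report each common element only once
    return reported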
