Finding the complexity of given code - algorithm

I am struggling to find the complexity of the given code, both in identifying the correct complexity and in knowing how to actually analyze it. The code to be analyzed is as follows:
public void doThings(int[] arr, int start){
    boolean found = false;
    int i = start;
    while ((found != true) && (i<arr.length)){
        i++;
        if (arr[i]==17){
            found=true;
        }
    }
}
public void reorganize(int[] arr){
    for (int i=0; i<arr.length; i++){
        doThings(arr, i);
    }
}
The questions are:
1) What is the best case complexity of the reorganize method and for what inputs does it occur?
2) What is the worst case complexity of the reorganize method and for what inputs does it occur?
My answers are:
1) For the reorganize method there are two possible best cases. The first is when the array length is 1, meaning the loops in the reorganize and doThings methods will each run exactly once. The other is when the ith item of the array is 17, meaning the doThings loop will not run to completion on that ith iteration. Thus in both cases the best case is O(n).
2) The worst case would be when the number 17 is at the end of the array or when the number 17 is not in the array. This will mean that the array is traversed n×n times, so the worst case would be O(n^2).
Could anyone please help me answer the questions correctly, if mine is incorrect and if possible explain the problem?

"best case" the array is empty, and you search nothing.
The worst case is that you look at every single element because you never see 17. All other cases are in between.
if (arr[i]==17){ is the "hottest path" of the code, meaning it is run most often.
In the worst case it executes a total of n*(n-1)/2 times (I think I did that math right), which is O(n^2). Even when you set found = true, the reorganize method doesn't know about that, doesn't end, and keeps calling doThings even though you have already scanned the entire array.
Basically, flatten the code without methods. You then have this question:
What is the Big-O of a nested loop, where number of iterations in the inner loop is determined by the current iteration of the outer loop?
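To make the flattened version concrete, here is a minimal sketch that only counts how often the inner comparison would run when 17 never appears. It assumes the inner scan covers indices start+1 through n-1 (the in-bounds part of what doThings reads); the class and method names are made up for illustration.

```java
class NestedLoopCount {
    // Counts inner-loop iterations when the inner loop starts just after
    // the outer index, mirroring reorganize + doThings with no 17 present.
    static long countIterations(int n) {
        long count = 0;
        for (int start = 0; start < n; start++) {
            for (int i = start + 1; i < n; i++) {
                count++; // stands in for the arr[i] == 17 check
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n : new int[] {4, 10, 100}) {
            long formula = (long) n * (n - 1) / 2;
            System.out.println(n + " -> " + countIterations(n) + " (n*(n-1)/2 = " + formula + ")");
        }
    }
}
```

The count grows as n*(n-1)/2, so doubling n roughly quadruples the work, which is what O(n^2) describes.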

Related

What is time complexity of the following sort function?

I've written this code for bubble sort. Can someone explain its time complexity to me? It works similarly to two for loops, but I still want to confirm the time complexity.
public int[] sortArray(int[] inpArr)
{
    int i = 0;
    int j = 0;
    while(i != inpArr.length-1 && j != inpArr.length-1)
    {
        if(inpArr[i] > inpArr[i+1])
        {
            int temp = inpArr[i];
            inpArr[i] = inpArr[i+1];
            inpArr[i+1] = temp;
        }
        else
        {
            i++;
        }
        if(i==inpArr.length-1)
        {
            j++;
            i = 0;
        }
    }
    return inpArr;
}
This would have O(n^2) time complexity. Actually, this would probably be both O(n^2) and theta(n^2).
Look at the logic of your code. You are performing the following:
1. Loop through the input array
2. If the current item is bigger than the next, switch the two
3. If that is not the case, increase the index (and essentially check the next item, so recursively walk through steps 1-2)
4. Once your index is the length-1 of the input array, i.e. it has gone through the entire array, your index is reset (the i=0 line), and j is increased, and the process restarts.
This essentially means the array is traversed n-1 times, and each traversal takes at least n-1 steps, so the running time is O(n^2). Because j only advances once a full pass has finished and nothing lets the loop stop early, every input needs all n-1 passes, so the bound is tight: theta(n^2).
Some bubble sort variants have a better BEST CASE (omega) because they stop as soon as a pass makes no swaps, but this code has no such check, so that is not achievable here.
Your time complexity is O(n^2) in the worst case. A bubble sort that stops after a pass with no swaps is O(n) in the best case (already sorted input), but your version always runs every pass, so it stays at O(n^2). Your average scenario still performs O(n^2) comparisons but will have fewer swaps than the worst case. This is because you're essentially doing the same thing as having two for loops. If you're interested in algorithmic efficiency, I'd recommend checking out pre-existing libraries that sort. The computer scientists that work on these sorts of things really are intense. Java's Arrays.sort() uses a dual-pivot quicksort for primitive arrays such as int[], and for object arrays it uses TimSort, a merge-sort-based algorithm that originated in Python. The disadvantage of your (and every) bubble sort is that it's really inefficient for big, disordered arrays. Read more here.
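One way to check the best-case behaviour is to count loop iterations directly. The sketch below copies the control flow of sortArray and returns the iteration count instead of the array; the class name, method name, and sample inputs are made up for illustration.

```java
class BubbleSortCount {
    // Same control flow as sortArray above, but returns how many times the
    // while loop body ran (each run performs one element comparison).
    static long countIterations(int[] inpArr) {
        int i = 0;
        int j = 0;
        long iterations = 0;
        while (i != inpArr.length - 1 && j != inpArr.length - 1) {
            iterations++;
            if (inpArr[i] > inpArr[i + 1]) {
                int temp = inpArr[i];
                inpArr[i] = inpArr[i + 1];
                inpArr[i + 1] = temp;
            } else {
                i++;
            }
            if (i == inpArr.length - 1) {
                j++;   // one full pass finished
                i = 0;
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println("sorted:   " + countIterations(new int[] {1, 2, 3, 4, 5}));
        System.out.println("reversed: " + countIterations(new int[] {5, 4, 3, 2, 1}));
    }
}
```

Already sorted input of length n still costs (n-1)^2 iterations, and every swap adds one more, so the count is quadratic for every input.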

Array merging and sorting complexity calculation

I have one exercise from my algorithm text book and I am not really sure about the solution. I need to explain why this solution:
function array_merge_sorted(array $foo, array $bar)
{
    $baz = array_merge($foo, $bar);
    $baz = array_unique($baz);
    sort($baz);
    return $baz;
}
which merges two arrays and orders them, is not the most efficient, and I need to provide a solution that is optimal and prove that no better solution exists.
My idea was to use a mergesort algorithm, which is O(n log n), to merge and order the two arrays passed as parameters. But how can I prove that this is the best possible solution?
Algorithm
As you have said that both inputs are already sorted, you can use a simple zipper-like approach.
You have one pointer for each input array, pointing to its beginning. Then you compare both elements, add the smaller one to the result, and advance the pointer of the array that held the smaller element. You repeat this step until both pointers have reached the end and all elements were added to the result.
You can find a collection of such algorithms at Wikipedia#Merge algorithm; the approach presented here is listed as Merging two lists.
Here is some pseudocode:
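function Array<Element> mergeSorted(Array<Element> first, Array<Element> second) {
    Array<Element> result = new Array<Element>(first.length + second.length);
    int firstPointer = 0;
    int secondPointer = 0;
    // Compare the front elements until one input is exhausted
    while (firstPointer < first.length && secondPointer < second.length) {
        Element elementOfFirst = first.get(firstPointer);
        Element elementOfSecond = second.get(secondPointer);
        if (elementOfFirst < elementOfSecond) {
            result.add(elementOfFirst);
            firstPointer = firstPointer + 1;
        } else {
            result.add(elementOfSecond);
            secondPointer = secondPointer + 1;
        }
    }
    // Append whatever remains in the input that is not yet exhausted
    while (firstPointer < first.length) {
        result.add(first.get(firstPointer));
        firstPointer = firstPointer + 1;
    }
    while (secondPointer < second.length) {
        result.add(second.get(secondPointer));
        secondPointer = secondPointer + 1;
    }
    return result;
}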
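For comparison, the same two-pointer idea can be written as runnable Java. This is a minimal sketch that assumes both inputs are int arrays already sorted in ascending order; it keeps duplicates (unlike array_unique in the question), and the class and method names are made up for illustration.

```java
import java.util.Arrays;

class MergeSorted {
    // Merges two ascending int arrays into one ascending array in O(n + n') time.
    static int[] mergeSorted(int[] first, int[] second) {
        int[] result = new int[first.length + second.length];
        int i = 0, j = 0, k = 0;
        // Take the smaller front element until one input runs out
        while (i < first.length && j < second.length) {
            result[k++] = (first[i] <= second[j]) ? first[i++] : second[j++];
        }
        // Copy the remainder of whichever input still has elements
        while (i < first.length) result[k++] = first[i++];
        while (j < second.length) result[k++] = second[j++];
        return result;
    }

    public static void main(String[] args) {
        int[] merged = mergeSorted(new int[] {1, 3, 5}, new int[] {2, 4, 6, 7});
        System.out.println(Arrays.toString(merged));
    }
}
```

Each element is copied exactly once and each comparison advances one pointer, which is where the linear bound comes from.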
Proof
The algorithm obviously works in O(n) where n is the size of the resulting list. Or more precisely, it is O(n + n') with n being the size of the first list and n' the size of the second (O(max(n, n')) is the same set).
This is also obviously optimal, since at some point you need to traverse all elements at least once in order to build the result and know the final ordering. This yields a lower bound of Omega(n) for this problem, thus the algorithm is optimal.
A more formal proof assumes an arbitrary better algorithm A which solves the problem without taking a look at each element at least once (or more precisely, in less than linear time).
We call that element, which the algorithm does not look at, e. We can now construct an input I such that e has a value which fulfills the order in its own array but will be placed wrong by the algorithm in the resulting array.
We are able to do so for every algorithm A and since A always needs to work correctly on all possible inputs, we are able to find a counter-example I such that it fails.
Thus A can not exist and Omega(n) is a lower bound for that problem.
Why the given algorithm is worse
Your given algorithm first merges the two arrays, this works in O(n) which is good. But after that it sorts the array.
Sorting (more precisely: comparison-based sorting) has a lower bound of Omega(n log n). This means no such algorithm can be better than that.
Thus the given algorithm has a total time complexity of O(n log n) (because of the sorting part), which is worse than O(n), the complexity of the other algorithm and also of the optimal solution.
However, to be super-correct, we would also need to argue whether the sort method truly yields that complexity, since it does not get arbitrary inputs but always the result of the merge method. Thus it could be possible that a specific sorting method works especially well for such specific inputs, yielding O(n) in the end.
But I doubt that this is in the focus of your task.

Avg number of comparisons for failure in this sequential search algorithm?

The algorithm:
public boolean search(int[] A, int target)
{
    for(int i=0;i<A.length;i++)
    {
        if(target==A[i]) return true;
        if(target<A[i]) return false;
    }
    return false;
}
I'm having trouble understanding this problem - I know it has something to do with the series, but the introduction of two comparisons per iteration really has me stumped. Can anybody help me out and explain this to me?
Here is how I used to look at this:
Think of the best case. What is the fewest comparisons you can have? That is when:
target==A[0] // first element
Think of the worst case. What is the most comparisons you can have? That is when:
target==A[A.length-1] // last element, or not found
So what would the average case be?
Take into consideration that finding the first element is really fast, but the last element is slow (O(1) vs O(n)).
As you move away from the beginning the search takes longer, and as you move away from the end it gets faster, so the average case lies in the middle.
If you are looking for a specific number, the average number of comparisons might be 3 comparisons per iteration (the for-loop comparison, target == A[i], and target < A[i]) times n/2 iterations.
If you want to test it, you can add a counter and increase it by 1 every time your algorithm does a comparison.
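The counter idea could look like the sketch below. It assumes the array is sorted in ascending order (which the early return on target < A[i] implies), counts only the two element comparisons per iteration, not the loop condition, and treats each gap between elements as an equally likely place for a failed search. The class name, method name, and sample values are made up for illustration.

```java
class SearchComparisons {
    // Same control flow as search above, but returns how many element
    // comparisons (target == A[i] and target < A[i]) were evaluated.
    static int countComparisons(int[] A, int target) {
        int comparisons = 0;
        for (int i = 0; i < A.length; i++) {
            comparisons++;                  // target == A[i]
            if (target == A[i]) return comparisons;
            comparisons++;                  // target < A[i]
            if (target < A[i]) return comparisons;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] A = {10, 20, 30, 40};
        int[] missing = {5, 15, 25, 35, 45}; // one absent target per gap
        int total = 0;
        for (int t : missing) {
            total += countComparisons(A, t);
        }
        System.out.println("average on failure: " + (double) total / missing.length);
    }
}
```

A failure in the gap before A[k] costs 2(k+1) comparisons, while a target larger than every element costs 2n, so the average over the n+1 gaps is a series sum divided by n+1.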

Find First Unique Element

I had this question in an interview and couldn't answer it.
You have to find the first unique element (integer) in an array.
For example:
3,2,1,4,4,5,6,6,7,3,2,3
Then the unique elements are 1, 5, 7 and the first unique element is 1.
The Solution required:
O(n) Time Complexity.
O(1) Space Complexity.
I tried suggesting hash maps and bit vectors, but none of them has O(1) space complexity.
Can anyone tell me a solution with O(1) space?
Here's a non-rigorous proof that it isn't possible:
It is well known that duplicate detection cannot be better than O(n * log n) when you use O(1) space. Suppose that the current problem is solvable in O(n) time and O(1) memory. If we get the index 'k' of the first non-repeating number as anything other than 0, we know that the element at index k-1 is repeated, and hence with one more sweep through the array we can find its duplicate, making duplicate detection an O(n) exercise.
Again it is not rigorous and we can get into a worst case analysis where k is always 0. But it helps you think and convince the interviewer that it isn't likely to be possible.
http://en.wikipedia.org/wiki/Element_distinctness_problem says:
Elements that occur more than n/k times in a multiset of size n may be found in time O(n log k). Here k = n since we want elements that appear more than once.
I think that this is impossible. This isn't a proof, but evidence for a conjecture. My reasoning is as follows...
First, you said that there is no bound on the value of the elements (they can be negative, 0, or positive). Second, there is only O(1) space, so we can't store more than a fixed number of values. Hence, this implies that we would have to solve this using only comparisons. Moreover, we can't sort or otherwise swap values in the array because we would lose the original ordering of unique values (and we can't store the original ordering).
Consider an array where all the integers are unique:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
In order to return the correct output 1 on this array, without reordering the array, we would need to compare each element to all the other elements, to ensure that it is unique, and do this in reverse order, so we can check the first unique element last. This would require O(n^2) comparisons with O(1) space.
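The comparison-only approach can be sketched as follows. This version scans forward and stops at the first element with no equal partner, rather than checking in reverse order, but it has the same O(n^2) worst case and uses O(1) extra space. Returning an index (with -1 for "none") and the names used are choices made for illustration.

```java
class FirstUniqueBruteForce {
    // Returns the index of the first element that occurs exactly once,
    // or -1 if every element has a duplicate. Does not modify the array.
    static int firstUniqueIndex(int[] arr) {
        for (int i = 0; i < arr.length; i++) {
            boolean duplicated = false;
            for (int j = 0; j < arr.length; j++) {
                if (i != j && arr[i] == arr[j]) {
                    duplicated = true;
                    break;
                }
            }
            if (!duplicated) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] example = {3, 2, 1, 4, 4, 5, 6, 6, 7, 3, 2, 3};
        int index = firstUniqueIndex(example);
        System.out.println(index < 0 ? "none" : "value " + example[index] + " at index " + index);
    }
}
```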
I'll delete this answer if anyone finds a solution, and I welcome any pointers on making this into a more rigorous proof.
Note: This can't work in the general case. See the reasoning below.
Original idea
Perhaps there is a solution in O(n) time and O(1) extra space.
It is possible to build a heap in O(n) time. See Building a Heap.
So you build the heap backwards, starting at the last element in the array and making that last position the root. When building the heap, keep track of the most recent item that was not a duplicate.
This assumes that when inserting an item into the heap, you will encounter any identical item that already exists in the heap. I don't know if I can prove that . . .
Assuming the above is true, then when you're done building the heap, you know which item was the first non-duplicated item.
Why it won't work
The algorithm to build a heap in place starts at the midpoint of the array and assumes that all of the nodes beyond that point are leaf nodes. It then works backward (towards item 0), sifting items into the heap. The algorithm doesn't examine the last n/2 items in any particular order, and the order changes as items are sifted into the heap.
As a result, the best we could do (and even then I'm not sure we could do it reliably) is find the first non-duplicated item only if it occurs in the first half of the array.
The OP's original question doesn't mention a limit on the numbers (although a later edit adds that they can be negative, positive, or zero). Here I assume one more condition:
The numbers in the array are all non-negative and smaller than the array length.
Then an O(n) time, O(1) space solution is possible. This seems like an interview question, and the test case the OP gives in the question complies with the above assumption.
Solution:
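// Pass 1: move each value v into slot v; mark slot v with -1 on a repeat
for (int i = 0; i < nums.length; i++) {
    if (nums[i] != i) {
        if (nums[i] == -1) continue;
        if (nums[nums[i]] == nums[i]) {
            nums[nums[i]] = -1;
        } else {
            swap(nums, nums[i], i);
            i--; // re-examine the value that was swapped into position i
        }
    }
}
// Pass 2: report the first slot that still holds its own value
for (int i = 0; i < nums.length; i++) {
    if (nums[i] == i) {
        return i;
    }
}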
The algorithm treats the original array as the buckets of a bucket sort. It puts each number into its own bucket and, if a number occurs more than once, marks its bucket as -1. A second loop then finds the first number that has nums[i] == i.
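An alternative sketch under the same assumption (all values are non-negative and smaller than the array length) is shown below. Instead of swapping values into buckets, it stores an occurrence count for each value v in slot v as a multiple of n and then rescans in input order, so it reports the earliest unique element rather than the smallest one. This is a different technique from the bucket placement described above; it overwrites the input array and assumes 3 * n fits in an int. The class and method names are made up for illustration.

```java
class FirstUniqueInPlace {
    // Assumes 0 <= nums[i] < nums.length. Mutates nums.
    // Returns the first value that occurs exactly once, or -1 if there is none.
    static int firstUnique(int[] nums) {
        int n = nums.length;
        // Pass 1: slot v keeps its original value in (nums[v] % n) and the
        // number of times v was seen, capped at 2, in (nums[v] / n).
        for (int i = 0; i < n; i++) {
            int v = nums[i] % n;      // original value stored at index i
            if (nums[v] / n < 2) {    // cap the count so slots stay below 3 * n
                nums[v] += n;
            }
        }
        // Pass 2: walk in input order and return the first value seen once
        for (int i = 0; i < n; i++) {
            int v = nums[i] % n;
            if (nums[v] / n == 1) {
                return v;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] example = {3, 2, 1, 4, 4, 5, 6, 6, 7, 3, 2, 3};
        System.out.println(firstUnique(example));
    }
}
```

Both passes touch each index once, and the only extra storage is a few local variables, because the counts live in the input array itself.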

How is this solution an example of dynamic programming?

A lecturer gave this question in class:
[question]
A sequence of n integers is stored in
an array A[1..n]. An integer a in A is
called the majority if it appears more
than n/2 times in A.
An O(n) algorithm can be devised to
find the majority based on the
following observation: if two
different elements in the original
sequence are removed, then the
majority in the original sequence
remains the majority in the new
sequence. Using this observation, or
otherwise, write programming code to
find the majority, if one exists, in
O(n) time.
for which this solution was accepted
[solution]
int findCandidate(int[] a)
{
    int maj_index = 0;
    int count = 1;
    for (int i=1;i<a.length;i++)
    {
        if (a[maj_index] == a[i])
            count++;
        else
            count--;
        if (count == 0)
        {
            // the candidate has been cancelled out; restart from element i
            maj_index = i;
            count++;
        }
    }
    return a[maj_index];
}
int findMajority(int[] a)
{
    int c = findCandidate(a);
    int count = 0;
    // second pass: verify that the candidate really is a majority
    for (int i=0;i<a.length;i++)
        if (a[i] == c) count++;
    if (count > a.length/2) return c;
    return -1;//just a marker - no majority found
}
I can't see how the solution provided is a dynamic solution. And I can't see how based on the wording, he pulled that code out.
The origin of the term dynamic programming is trying to describe a really awesome way of optimizing certain kinds of solutions (dynamic was used since it sounded punchier). In other words, when you see "dynamic programming", you need to translate it into "awesome optimization".
'Dynamic programming' has nothing to do with dynamic allocation of memory or the like; it's just an old term. In fact, it also has little to do with the modern meaning of "programming".
It is a method for solving a specific class of problems: those where an optimal solution of a subproblem is guaranteed to be part of an optimal solution of the bigger problem. For instance, if you want to pay $567 with the smallest number of bills, the solution will consist of an optimal solution for one of the amounts $1..$566 plus one more bill.
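The bill example can be sketched as a small table-filling program. The denominations below are an assumption chosen for illustration, as are the class and method names; the point is only that the answer for each amount is built from the answers for smaller amounts.

```java
import java.util.Arrays;

class MinBills {
    // Returns the fewest bills that sum to amount, or -1 if it cannot be paid.
    static int minBills(int amount, int[] denominations) {
        int[] best = new int[amount + 1];   // best[t] = fewest bills for total t
        Arrays.fill(best, Integer.MAX_VALUE);
        best[0] = 0;
        for (int total = 1; total <= amount; total++) {
            for (int bill : denominations) {
                // Extend an optimal solution for (total - bill) by one more bill
                if (bill <= total && best[total - bill] != Integer.MAX_VALUE) {
                    best[total] = Math.min(best[total], best[total - bill] + 1);
                }
            }
        }
        return best[amount] == Integer.MAX_VALUE ? -1 : best[amount];
    }

    public static void main(String[] args) {
        int[] bills = {1, 5, 10, 20, 50, 100}; // assumed denominations
        System.out.println(minBills(567, bills));
    }
}
```

Each table entry is computed once from earlier entries, which is the "optimal subproblem" property the paragraph above describes.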
The code is just an application of the algorithm.
This is dynamic programming because the findCandidate function is breaking down the provided array into smaller, more manageable parts. In this case, he starts with the first element of the array as a candidate for the majority. By increasing the count when it is encountered and decreasing the count when it is not, he determines whether this is true. When the count reaches zero, we know that the first i elements do not have a majority. By continually calculating the local majority we don't need to iterate through the array more than once in the candidate identification phase. We then check whether that candidate is actually the majority by going through the array a second time, giving us O(n). It actually takes about 2n steps, since we iterate twice, but the constant doesn't matter.

Resources