Given an integer array nums, return the largest perimeter of a triangle with a non-zero area formed from three of these lengths. If it is impossible to form any triangle of a non-zero area, return 0.
I can think of the brute-force method; another method is to sort the given array, iterate backward, check the triangle condition for each consecutive triple, and return that sum. Is there any other method to solve this?
The best algorithm that I can think of is to put the numbers into a max-heap instead of sorting the array. Building the heap is O(n), and usually only a handful of pops are needed to find the triangle. The worst case is when no numbers satisfy the triangle condition, in which case popping everything takes O(n log(n)) comparisons. That worst case can only be hit if you allow big integers into the mix: if no consecutive triple forms a triangle, each popped value must be at least the sum of the next two, so the popped values decay like a reversed Fibonacci sequence, and (assuming positive side lengths) 64-bit integers allow only around 90 such pops. With 64-bit integers, say, the worst case is therefore O(n).
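A minimal sketch of that heap idea in C++ (the function name largestPerimeter and the use of std::make_heap/std::pop_heap are illustrative choices, not part of the answer above):

#include <algorithm>
#include <vector>

// Build a max-heap in O(n), then pop values largest-first and test each
// consecutive triple a >= b >= c; the first triple with b + c > a gives
// the largest possible perimeter.
int largestPerimeter(std::vector<int> nums) {
    if (nums.size() < 3) return 0;
    std::make_heap(nums.begin(), nums.end());   // O(n)
    auto pop = [&nums]() {                      // O(log n) per pop
        std::pop_heap(nums.begin(), nums.end());
        int v = nums.back();
        nums.pop_back();
        return v;
    };
    int a = pop(), b = pop();
    while (!nums.empty()) {
        int c = pop();
        if (b + c > a) return a + b + c;        // a >= b >= c, so this check suffices
        a = b;                                  // slide the window of the top three
        b = c;
    }
    return 0;
}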
Given that you have to look at every element, you can't do better than O(n) average time, and the usual tricks for fixed-size integers, like a radix sort, won't help much here.
Even with arbitrary-precision integers, this is still very good.
Suppose we had a comparison-based algorithm with a worst case of o(n log(n)) comparisons. Then, by tracking the comparisons made and applying Kahn's algorithm for a topological sort, you'd have a comparison-based sort that is o(n log(n)), which is well known to be impossible.
I'm having trouble finding a non-comparison based algorithm for integers of arbitrary size that might do better.
Related
I am prepping for interview LeetCode-type problems and I came across the k-closest problem, but given a sorted array. The problem requires finding the k elements closest in value to an input value, using the array. The answer was fairly straightforward, and I had no issues determining a linear-time algorithm to solve it.
However, working on this problem got me thinking: is it possible to solve it for an unsorted array in linear time? My first thought was to use a heap, which would give an O(n log k) solution, but I am trying to determine whether an O(n) solution is possible. I was thinking about something like quickselect, but the issue is that quickselect has an expected time of O(n), not a worst-case time of O(n).
Is this even possible?
The median-of-medians algorithm makes Quickselect take O(n) time in the worst case.
It is used to select a pivot:
Divide the array into groups of 5 (O(n))
Find the median of each group (O(n))
Use this same selection procedure recursively to find the median of the n/5 medians (T(n/5))
The resulting pivot is guaranteed to be greater than at least 30% of the elements and less than at least another 30%, which is what guarantees linear-time Quickselect.
After selecting the pivot, of course, you have to continue on with the rest of Quickselect, which includes a recursive call like the one we made to select the pivot.
The worst-case total time satisfies T(n) = O(n) + T(0.7n) + T(n/5); since 0.7 + 0.2 = 0.9 < 1, this recurrence solves to O(n). Compared to the expected time of normal Quickselect, though, the constants are large, which is why we don't often use it in practice.
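A rough sketch of the whole selection in C++ (momSelect is a made-up name; it copies sub-vectors for clarity instead of partitioning in place, which keeps the time linear but is not the most memory-frugal version):

#include <algorithm>
#include <cstddef>
#include <vector>

// Worst-case O(n) selection via median of medians.
// Returns the k-th smallest element of v (k is 0-based).
int momSelect(std::vector<int> v, std::size_t k) {
    while (true) {
        if (v.size() <= 5) {
            std::sort(v.begin(), v.end());
            return v[k];
        }
        // 1. Median of each group of 5.
        std::vector<int> medians;
        for (std::size_t i = 0; i < v.size(); i += 5) {
            std::size_t end = std::min(i + 5, v.size());
            std::sort(v.begin() + i, v.begin() + end);
            medians.push_back(v[i + (end - i) / 2]);
        }
        // 2. Pivot = median of the medians (the recursive T(n/5) call).
        int pivot = momSelect(medians, medians.size() / 2);
        // 3. Three-way partition around the pivot.
        std::vector<int> lo, eq, hi;
        for (int x : v) {
            if (x < pivot)      lo.push_back(x);
            else if (x > pivot) hi.push_back(x);
            else                eq.push_back(x);
        }
        // 4. Keep only the part that contains index k (the T(0.7n) part).
        if (k < lo.size()) {
            v = std::move(lo);
        } else if (k < lo.size() + eq.size()) {
            return pivot;
        } else {
            k -= lo.size() + eq.size();
            v = std::move(hi);
        }
    }
}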
Your heap solution would be very welcome at an interview, I'm sure.
If you really want to get rid of the log k factor, which in practical applications should seldom be a problem, then yes, using Quickselect would be another option. Something like this:
Partition your array into values smaller than and larger than x. <- O(n)
For the lower half, run Quickselect to find the kth largest number, then take the right-side partition, which holds your k largest numbers. <- O(n)
Repeat step 2 for the upper half, but for the k smallest numbers. <- O(n)
Merge your k smallest and k largest numbers and extract the k closest numbers. <- O(k)
This gives you a total time complexity of O(n), as you said.
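As a rough sketch of those four steps in C++ (kClosest is a made-up name, and std::nth_element stands in for Quickselect; swapping in a worst-case linear selection such as median of medians would make the whole routine worst-case O(n)):

#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Returns k values of nums closest to x (ties broken arbitrarily).
std::vector<int> kClosest(std::vector<int> nums, int x, std::size_t k) {
    // Step 1: partition into values <= x and values > x.
    auto mid = std::partition(nums.begin(), nums.end(),
                              [x](int v) { return v <= x; });
    std::vector<int> lower(nums.begin(), mid), upper(mid, nums.end());

    // Step 2: the k largest of the lower half (closest from below).
    if (lower.size() > k)
        std::nth_element(lower.begin(), lower.end() - static_cast<std::ptrdiff_t>(k), lower.end());
    std::size_t keepLo = std::min(k, lower.size());
    std::vector<int> cand(lower.end() - static_cast<std::ptrdiff_t>(keepLo), lower.end());

    // Step 3: the k smallest of the upper half (closest from above).
    std::size_t keepHi = std::min(k, upper.size());
    if (upper.size() > k)
        std::nth_element(upper.begin(), upper.begin() + static_cast<std::ptrdiff_t>(k), upper.end());
    cand.insert(cand.end(), upper.begin(), upper.begin() + static_cast<std::ptrdiff_t>(keepHi));

    // Step 4: of the <= 2k candidates, keep the k closest to x.
    std::size_t keep = std::min(k, cand.size());
    std::nth_element(cand.begin(), cand.begin() + static_cast<std::ptrdiff_t>(keep), cand.end(),
                     [x](int a, int b) { return std::abs(a - x) < std::abs(b - x); });
    cand.resize(keep);
    return cand;
}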
However, a few points about your worry about expected time vs. worst-case time. I understand that if an interview question explicitly insists on worst-case O(n), then this solution might not be accepted, but otherwise it can well be considered O(n) in practice.
The key here is that for randomized Quickselect on random or well-behaved input, the probability that the running time exceeds O(n) decreases exponentially as the input grows. Already at largish inputs, the probability is about as small as picking one specific atom in the known universe. The assumption of well-behaved input means it is somewhat random in nature and not adversarial. See this discussion on a similar (not identical) problem.
Is there any sorting algorithm with an average time complexity of log(n)?
Example: [8,2,7,5,0,1]
Sort the given array with time complexity log(n).
No; this is, in fact, impossible for an arbitrary list! We can prove this fairly simply: the absolute minimum thing we must do for a sort is look at each element in the list at least once. After all, an element may belong anywhere in the sorted list; if we don't even look at an element, it's impossible for us to sort the array. This means that any sorting algorithm has a lower bound of n, and since n > log(n), a log(n) sort is impossible.
Although n is the lower bound, most sorts (like merge sort, quick sort) are n*log(n) time. In fact, while we can sort purely numerical lists in n time in some cases with radix sort, we actually have no way to, say, sort arbitrary objects like strings in less than n*log(n).
That said, there may be times when the list is not arbitrary; e.g., we have a list that is entirely sorted except for one element, and we need to put that element into place. In that case, methods like a binary search tree can let you insert in log(n), but this is only possible because we are operating on a single element. Building up a tree (i.e., performing n inserts) is n*log(n) time.
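For illustration, the "one out-of-place element" case, using a binary search (std::lower_bound) to find the insertion point in O(log n) comparisons (the values are arbitrary; inserting into a vector still shifts elements, while a balanced tree such as std::multiset keeps the whole insertion at O(log n)):

#include <algorithm>
#include <vector>

int main() {
    std::vector<int> sorted{0, 1, 2, 5, 7, 8};
    int x = 6;
    // O(log n) comparisons to locate where x belongs.
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), x);
    sorted.insert(pos, x);  // sorted is now {0, 1, 2, 5, 6, 7, 8}
}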
As #dominicm00 also mentioned, the answer is no.
In general, when you see an algorithm with a time complexity of log N (base 2), it means the algorithm repeatedly divides the input into two halves and discards one of them. A sorting algorithm has to put every element in its proper place; discarding half of the list in each iteration does not fit that job.
The most efficient sorting algorithms run in O(n) time, but only under certain limitations. Three of the most famous O(n) algorithms are:
Counting sort, with time complexity O(n+k), where k is the maximum value in the given list. Assuming n>>k, you can consider its time complexity to be O(n).
Radix sort, with time complexity O(d*(n+k)), where d is the maximum number of digits in the input and k is the range of each digit (the base). Similar to counting sort, assuming n>>k and n>>d, the time complexity is effectively O(n).
Bucket sort, with time complexity O(n) on average, assuming reasonably uniformly distributed input.
But in general, due to the limitations of each of these algorithms, most implementations rely on O(n log n) algorithms such as merge sort, quick sort, and heap sort.
There are also sorting algorithms with O(n^2) time complexity, such as insertion sort, selection sort, and bubble sort, which are recommended only for small lists.
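For illustration, a minimal counting-sort sketch (assuming non-negative integers no larger than a known k; countingSort is a made-up name):

#include <cstddef>
#include <vector>

// O(n + k) time, O(k) extra space.
std::vector<int> countingSort(const std::vector<int>& a, int k) {
    std::vector<int> cnt(static_cast<std::size_t>(k) + 1, 0);
    for (int x : a) ++cnt[x];                 // count each value: O(n)
    std::vector<int> out;
    out.reserve(a.size());
    for (int v = 0; v <= k; ++v)              // emit values in order: O(n + k)
        out.insert(out.end(), cnt[v], v);
    return out;
}

For example, countingSort({8, 2, 7, 5, 0, 1}, 8) yields {0, 1, 2, 5, 7, 8}.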
Using a PLA it might be possible to implement counting sort for a few elements with a low range of values.
count each value in parallel and sum the counts using lg2(N) steps
find the offset of each element in lg2(N) steps
write the array in O(1)
Only massively parallel computation would be able to do this; general-purpose CPUs won't do here unless they implement it in silicon as part of their SIMD units.
This is my question I have got somewhere.
Given a list of numbers in random order write a linear time algorithm to find the 𝑘th smallest number in the list. Explain why your algorithm is linear.
I have searched almost half the web, and what I got to know is that a linear-time algorithm is one whose time complexity is O(n). (I may be wrong somewhere.)
We can solve the above question with different algorithms, e.g.:
Sort the array and select the element at index k-1 [O(n log n)]
Use a min-heap [O(n + k log n)]
etc.
Now the problem is that I couldn't find any algorithm that has O(n) time complexity, i.e., one that is actually linear.
What can be the solution for this problem?
This is std::nth_element
From cppreference:
Notes
The algorithm used is typically introselect although other selection algorithms with suitable average-case complexity are allowed.
Given a list of numbers
although it is not compatible with std::list, only std::vector, std::deque and std::array, as it requires RandomAccessIterator.
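A minimal usage sketch:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v{8, 2, 7, 5, 0, 1};
    int k = 2;                                     // 0-based: the 3rd smallest
    std::nth_element(v.begin(), v.begin() + k, v.end());
    std::cout << v[k] << '\n';                     // prints 2
}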
A linear search remembering the k smallest values is O(n*k), but if k is considered constant then it's O(n) time.
However, if k is not considered constant, then using a histogram leads to O(n+m.log(m)) time and O(m) space complexity, where m is the number of possible distinct values (the value range) in your input data. The algorithm is like this:
create histogram counters for each possible value and set them to zero O(m)
process all data and count the values O(n)
sort the histogram O(m.log(m))
pick the k-th element from the histogram O(1)
In case we are talking about unsigned integers from 0 to m-1, the histogram is computed like this:
int data[n] = { /* your data */ }, cnt[m], i;
for (i = 0; i < m; i++) cnt[i] = 0;        // clear the histogram
for (i = 0; i < n; i++) cnt[data[i]]++;    // count occurrences of each value
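To complete the sketch, one way to realize the final "pick the k-th element" step for this dense 0..m-1 case (the walk is O(m), which stays within the stated bound; k is 1-based here):

int k = /* wanted rank, 1-based */, kth = -1;
for (i = 0; i < m; i++) { if (k <= cnt[i]) { kth = i; break; } k -= cnt[i]; }
// kth now holds the k-th smallest value (or stays -1 if k > n)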
However, if your input data values do not comply with the above condition, you need to remap them into that range by interpolation or hashing. And if m is huge (or the values contain huge gaps), this is a no-go, as such a histogram would either need buckets (which is not usable for your problem) or a list of distinct values, which no longer gives linear complexity.
So, putting all this together, your problem is solvable with linear complexity when:
n >= m.log(m)
Consider this problem:
A comparison-based sorting algorithm sorts an array with n items. For what fraction of the n! permutations can the number of comparisons be cn, where c is a constant?
I know the best time complexity for sorting an array with arbitrary items is O(n log n), and that it doesn't depend on the input order, right? So there is no fraction that leads to cn comparisons. Please guide me if I am wrong.
This depends on the sorting algorithm you use.
Optimized Bubble Sort, for example, compares all neighboring elements of the array and swaps them when the left element is larger than the right one. This is repeated until no swaps were performed in a pass.
When you give Bubble Sort a sorted array it won't perform any swaps in the first iteration and thus sorts in O(n).
On the other hand, Heapsort will take O(n log n) independent of the order of the input.
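For reference, a sketch of that early-exit ("optimized") Bubble Sort, with the swapped flag that makes an already sorted input cost a single O(n) pass:

#include <cstddef>
#include <utility>
#include <vector>

void bubbleSort(std::vector<int>& a) {
    bool swapped = true;
    for (std::size_t end = a.size(); swapped && end > 1; --end) {
        swapped = false;
        for (std::size_t i = 1; i < end; ++i) {
            if (a[i - 1] > a[i]) {            // left element larger than right: swap
                std::swap(a[i - 1], a[i]);
                swapped = true;
            }
        }
    }
}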
Edit:
Answering your question for a given sorting algorithm can be non-trivial. Only one out of the n! permutations is sorted (assuming no duplicates, for simplicity). However, for the Bubble Sort example you could, starting from the sorted array, swap each disjoint pair of neighboring elements. That input takes Bubble Sort two passes, which is also O(n).
First, I know
lower bound is O(n log n)
and how to prove it
And I agree the lower bound should be O(n log n).
What I don't quite understand is:
For some special cases, the number of comparisons can actually be even lower than that lower bound. For example, using bubble sort to sort an already sorted array takes only O(n) comparisons.
So how to actually understand the idea of lower bound?
The classical definition on Wikipedia (http://en.wikipedia.org/wiki/Upper_and_lower_bounds) does not help much.
My current understanding of this is:
lower bound of the comparison-based sorting is actually the upper bound for the worst case.
namely, the best you could do in the worst case.
Is this correct? Thanks.
lower bound of the comparison-based sorting is actually the upper bound for the best case.
No.
The function that you are bounding is the worst-case running time of the best possible sorting algorithm.
Imagine the following game:
We choose some number n.
You pick your favorite sorting algorithm.
After looking at your algorithm, I pick some input sequence of length n.
We run your algorithm on my input, and you give me a dollar for every executed instruction.
The O(n log n) upper bound means you can limit your cost to at most O(n log n) dollars, no matter what input sequence I choose.
The Ω(n log n) lower bound means that I can force you to pay at least Ω(n log n) dollars, no matter what sorting algorithm you choose.
Also: "The lower bound is O(n log n)" doesn't make any sense. O(f(n)) means "at most a constant times f(n)". But "lower bound" means "at least ...". So saying "a lower bound of O(n log n)" is exactly like saying "You can save up to 50% or more!" — it's completely meaningless! The correct notation for lower bounds is Ω(...).
The problem of sorting can be viewed as following.
Input: A sequence of n numbers <a1, a2, ..., an>.
Output: A permutation (reordering) <a'1, a'2, ..., a'n> of the input sequence such that a'1 <= a'2 <= ... <= a'n.
A sorting algorithm is comparison-based if it uses comparison operators to find the order between two numbers. Comparison sorts can be viewed abstractly in terms of decision trees. A decision tree is a full binary tree that represents the comparisons between elements that are performed by a particular sorting algorithm operating on an input of a given size. The execution of the sorting algorithm corresponds to tracing a path from the root of the decision tree to a leaf. At each internal node, a comparison ai <= aj is made. The left subtree then dictates subsequent comparisons for the case ai <= aj, and the right subtree dictates subsequent comparisons for the case ai > aj. When we reach a leaf, the sorting algorithm has established the ordering. So we can say the following about the decision tree.
1) Each of the n! permutations on n elements must appear as one of the leaves of the decision tree for the sorting algorithm to sort properly.
2) Let x be the maximum number of comparisons made by a sorting algorithm. The maximum height of the decision tree would then be x, and a tree with height x has at most 2^x leaves.
After combining the above two facts, we get following relation.
n! <= 2^x
Taking log (base 2) on both sides:
log2(n!) <= x
Since log2(n!) = Θ(n log n), we get
x = Ω(n log n)
Therefore, any comparison-based sorting algorithm must make Ω(n log n) comparisons in the worst case to sort the input array, and heapsort and merge sort are asymptotically optimal comparison sorts.
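For completeness, a short justification of the log2(n!) = Θ(n log n) step, written out in LaTeX:

\log_2 n! = \sum_{i=1}^{n} \log_2 i \le n \log_2 n = O(n \log n)

\log_2 n! \ge \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i \ge \frac{n}{2} \log_2 \frac{n}{2} = \Omega(n \log n)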
When you do asymptotic analysis, you derive an O, Θ, or Ω bound over all inputs.
But you can also analyze how properties of the input affect the runtime.
For example, algorithms whose input is almost sorted can perform better than the general asymptotic formula suggests, due to the input characteristics and the structure of the algorithm. Examples are bubble sort and quicksort.
It is not that you can go below the lower bound; it is only the behavior of a particular implementation on specific inputs.
Imagine all the possible arrays of things that could be sorted. Let's say they are arrays of length n, and let's ignore edge cases like arrays with one element (which, of course, are always already sorted).
Imagine a long list of all possible value combinations for that array. Notice that we can simplify this a bit, since the values in the array always have some sort of ordering. So if we replace the smallest one with the number 1, the next one with 1 or 2 (depending on whether it's equal or greater), and so forth, we end up with the same sorting problem as if we allowed any values at all. (This means an array of length n will need, at most, the numbers 1 to n; maybe fewer if some are equal.)
Then put a number beside each one telling how much work it takes to sort that array with those values in it. You could put several numbers. For example, you could put the number of comparisons it takes. Or you could put the number of element moves or swaps it takes. Whatever number you put there indicates how many operations it takes. You could put the sum of them.
One thing you have to do is ignore any special information. For example, you can't know ahead of time that the values in the array happen to be sorted already. Your algorithm has to do the same steps with that array as with any other. (The first step could be to check whether it's sorted, but usually that doesn't help with sorting.)
So: the largest number, measured in comparisons, is the number of comparisons needed when the values are arranged in a pathologically bad way. The smallest number, similarly, is the number of comparisons needed when the values are arranged in a really good way.
For a bubble sort, the best case (shortest or fastest) is when the values are already in order, but only if you use a flag to record whether you swapped any values. In that best case, you look at each adjacent pair of elements once, find that they are already in order, and when you get to the end you find you haven't swapped anything, so you are done. That's n-1 comparisons total, and it is the lowest number of comparisons you could ever do.
It would take me a while to work out the worst case; I haven't looked at a bubble sort in decades. But I would guess it's the case where the values are in reverse order. You do the 1st comparison and find the 1st element needs to move, slide up toward the top comparing with each element, and finally swap it into the last position. So you did n-1 comparisons in that pass. The 2nd pass starts at the 2nd element and does n-2 comparisons, and so forth. So you do (n-1)+(n-2)+(n-3)+...+1 comparisons in this case, which is about (n**2)/2.
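The closed form of that sum, for reference:

\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} \approx \frac{n^2}{2}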
Maybe your variation on bubble sort is better than the one I described. No matter.
For bubble sort then, the lower bound is n-1 and the upper bound is (n**2)/2
Other sort algorithms have better performance.
You might want to remember that there are other operations that cost besides comparisons. We use comparisons because much sorting is done with strings and a string comparison is costly in compute time.
You could instead count element swaps (or the sum of comparisons and swaps), but swaps are typically cheaper than comparisons when the keys are strings; if you are sorting numbers, the costs are similar.
You could also measure more esoteric things, like branch prediction failures or memory cache misses.