Say I'm given a bunch of number pairs. Is there a good algorithm to find the pair with the largest sum of squares, besides computing every sum of squares and comparing each to the current maximum?
e.g.
input:
[3, 3]
[0, 3]
[4, 0]
[2, 4]
output:
[2, 4] (sum of squares: 20)
To use a binary search, you would need to sort the list. This raises two problems (mentioned or hinted above):
You don't have a well-ordered collection
The optimal sort is O(n log n)
Sum of two squares is simple: two data accesses for the numbers, two multiplications, and an addition. On a scalar CPU, this pipelines into three cheap (i.e. single-cycle) operations; on a modern AI processor, it is a single-cycle dot product.
In short, not only is the linear, brute-force approach only O(n), it's a fast O(n). Take the money and run.
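For what it's worth, here is a minimal sketch of that linear scan (my own code, in Python):

```python
def max_sum_of_squares(pairs):
    """Return the pair with the largest sum of squares, plus that sum, in one O(n) pass."""
    best_pair, best_sum = None, float("-inf")
    for a, b in pairs:
        s = a * a + b * b              # two multiplications and an addition
        if s > best_sum:
            best_pair, best_sum = (a, b), s
    return best_pair, best_sum

# The example from the question:
print(max_sum_of_squares([(3, 3), (0, 3), (4, 0), (2, 4)]))  # ((2, 4), 20)
```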
Related
In the Subset Sum problem, if we don't use the dynamic programming approach, we get exponential time complexity. But if we draw the recursion tree, it seems that all 2^n branches are unique. If we use dynamic programming, how can we be sure that all the unique branches are explored? If there really exist 2^n possible solutions, how does dynamic programming reduce it to polynomial time while also ensuring all 2^n solutions are explored?
How does dynamic programming reduce it to polynomial time while also ensuring all 2^n solutions are explored?
It is pseudo-polynomial time, not polynomial time, and that is a very important distinction. According to Wikipedia, "a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the numeric value of the input, but not necessarily in the length of the input, which is the case for polynomial time algorithms."
What does it matter?
Consider an example [1, 2, 3, 4], sum = 1 + 2 + 3 + 4 = 10.
There do in fact exist 2^4 = 16 subsequences; however, do we need to check them all? The answer is no, since we only care about the sums of the subsequences. To illustrate this, let's iterate from the 1st element to the 4th:
1st element:
We can choose to take or not take the 1st element, so the possible sums are [0, 1].
2nd element:
We can choose to take or not take the 2nd element. Same idea; the possible sums are [0, 1, 2, 3].
3rd element:
We have [0, 1, 2, 3] now, and we consider taking the 3rd element. But wait: if we take the 3rd element and add it to 0, we still get 3, which is already present in the list. Do we need to store this piece of information? No. In fact, we only need to know whether a sum is possible at any stage; if multiple subsequences sum to the same value, we keep just one entry. This is the key to the reduction in complexity, if you consider it a reduction (a short sketch follows after this answer).
That said, a truly polynomial-time algorithm for Subset Sum is not known, since the problem is NP-complete.
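To make the reachable-sums idea concrete, here is a minimal sketch (my own illustration, not from the answer above) that stores only which sums are achievable, so duplicate subsequences collapse into a single entry:

```python
def reachable_sums(nums):
    """Return the set of sums achievable by subsequences of nums.

    Each element either extends every previously reachable sum or is skipped.
    Duplicate sums collapse into one set entry, which is what keeps the work
    pseudo-polynomial, O(n * total), instead of exponential in n.
    """
    sums = {0}                      # the empty subsequence
    for x in nums:
        sums |= {s + x for s in sums}
    return sums

# Only 11 distinct sums (0..10) instead of 2^4 = 16 subsequences:
print(sorted(reachable_sums([1, 2, 3, 4])))
```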
I recently came across a problem that made me wonder.
What if I stored an N-element array inside an array of length N, across all N indexes?
As a tiny example:
[
[1, 2, 3],
[5, 6, 7],
[8, 9, 10],
]
An array of length 3, and at every index there is again an array of length 3.
What would be the space complexity? Is it still O(N), or has it changed?
It would still be O(n), because space-complexity analysis describes how the space grows as a function of n; it doesn't care that you store an array of 3 elements at every index. The space used will be 3 times higher, but the relationship is still linear.
Big-O notation describes an asymptotic upper bound. It represents the algorithm's scalability and performance.
Simply put, it gives the worst-case scenario of an algorithm's growth rate.
from here.
It would be different if at every index you stored an array of N elements (or one whose length grows with the index); in that case it would be O(n^2).
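As a rough sketch of that difference (my own example): with a fixed-size inner array the total storage grows linearly in n, while with an inner array of N elements per index it grows quadratically.

```python
def fixed_inner(n):
    """n indexes, each holding an array of 3 elements: 3n cells, O(n) space."""
    return [[0, 0, 0] for _ in range(n)]

def growing_inner(n):
    """n indexes, each holding an array of n elements: n*n cells, O(n^2) space."""
    return [[0] * n for _ in range(n)]

print(sum(len(row) for row in fixed_inner(1000)))    # 3000      (linear in n)
print(sum(len(row) for row in growing_inner(1000)))  # 1000000   (quadratic in n)
```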
How is it different if I select a randomized pivot versus just selecting the first element as the pivot in an unordered set/list?
If the set is unordered, isn't selecting the first value in the set random in itself? Essentially, I am trying to understand how/whether randomizing promises a better worst-case runtime.
I think you may be mixing up the concepts of arbitrary and random. It's arbitrary to pick the first element of the array - you could pick any element you'd like and it would work equally well - but it's not random. A random choice is one that can't be predicted in advance. An arbitrary choice is one that can be.
Let's imagine that you're using quicksort on the sorted sequence 1, 2, 3, 4, 5, 6, ..., n. If you choose the first element as a pivot, then you'll choose 1 as the pivot. All n - 1 other elements then go to the right and nothing goes to the left, and you'll recursively quicksort 2, 3, 4, 5, ..., n.
When you quicksort that range, you'll choose 2 as the pivot. Partitioning the elements then puts nothing on the left and the numbers 3, 4, 5, 6, ..., n on the right, so you'll recursively quicksort 3, 4, 5, 6, ..., n.
More generally, after k steps, you'll choose the number k as a pivot, put the numbers k+1, k+2, ..., n on the right, then recursively quicksort them.
The total work done here ends up being Θ(n^2), since on the first pass (to partition 2, 3, ..., n around 1) you have to look at n-1 elements, on the second pass (to partition 3, 4, 5, ..., n around 2) you have to look at n-2 elements, etc. This means that the work done is (n-1) + (n-2) + ... + 1 = Θ(n^2), quite inefficient!
Now, contrast this with randomized quicksort. In randomized quicksort, you truly choose a random element as your pivot at each step. This means that while you technically could choose the same pivots as in the deterministic case, it's very unlikely (the probability would be roughly 2^(2 - n), which is quite low) that this will happen and trigger the worst-case behavior. You're more likely to choose pivots closer to the center of the array, and when that happens the recursion branches more evenly and thus terminates a lot faster.
The advantage of randomized quicksort is that there's no single input that will always cause it to run in time Θ(n^2); the runtime is expected O(n log n) on every input. Deterministic quicksort algorithms usually have the drawback that either (1) they run in worst-case time O(n log n), but with a high constant factor, or (2) they run in worst-case time O(n^2) and the sort of input that triggers this case is deterministic.
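As a rough sketch of the two pivot strategies (my own code, using an out-of-place partition for brevity):

```python
import random

def quicksort(arr, randomized=True):
    """Quicksort whose pivot is either chosen at random or always the first element."""
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr) if randomized else arr[0]
    left = [x for x in arr if x < pivot]
    mid = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left, randomized) + mid + quicksort(right, randomized)

data = list(range(1000))                  # already sorted: the bad case for a first-element pivot
print(quicksort(data) == sorted(data))    # True, with expected O(n log n) work
# quicksort(data, randomized=False) would pick 0, 1, 2, ... as pivots in turn,
# recursing about n levels deep and doing the Theta(n^2) work described above.
```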
In plain quicksort (as usually presented), the pivot is always the element at the rightmost index of the selected subarray, whereas in randomized quicksort the pivot can be any element in the array.
We have 3 variants of Merge sort.
Top down
Bottom up
Natural
Are any of these adaptive algorithms? For instance, if an array is already sorted, will they take advantage of the sorted order?
As I understand it, whether or not an array is sorted, merge sort will still go through the comparisons and then merge, so none of these is adaptive.
Is my understanding correct?
Natural merge sort is adaptive. For example, on an already-sorted array it detects a single run in one pass, making about N comparisons.
Neither the top-down nor the bottom-up variant is adaptive; they always perform O(N log N) operations.
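As a small illustration of why the natural variant is adaptive (my own sketch): its run-detection pass finds a single run on sorted input after one linear scan, so there is nothing left to merge.

```python
def count_runs(arr):
    """Count the ascending runs a natural merge sort would detect in one linear pass."""
    if not arr:
        return 0
    runs = 1
    for i in range(1, len(arr)):
        if arr[i] < arr[i - 1]:   # a descending step starts a new run
            runs += 1
    return runs

print(count_runs([1, 2, 3, 4, 5]))   # 1 -> already sorted, no merging needed
print(count_runs([5, 1, 4, 2, 3]))   # 3 -> several runs still have to be merged
```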
Merge sort is a divide-and-conquer algorithm: you divide the whole array into subarrays. For example,
[1, 3, 5, 6, 2, 4, 1, 10] is divided into [1, 3, 5, 6] and [2, 4, 1, 10]. [1, 3, 5, 6] is divided into [1, 3] and [5, 6]. Now, as both [1, 3] and [5, 6] are already sorted, no swapping is needed when merging them.
So even when the array is sorted, there is still at least a little work to do.
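To illustrate that point, here is a small top-down merge sort sketch (my own code) with a comparison counter; even on fully sorted input it still performs every one of its merges:

```python
def merge_sort(arr, counter):
    """Top-down merge sort that counts comparisons: it always divides and merges."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid], counter)
    right = merge_sort(arr[mid:], counter)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        counter[0] += 1                      # one comparison per merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

comparisons = [0]
merge_sort(list(range(1024)), comparisons)
print(comparisons[0])   # 5120 comparisons even though the input was already sorted
```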
As I understand it, whether or not an array is sorted, merge sort will still go through the comparisons
I don't think it's possible to sort without any comparisons and still be memory-efficient, unless the input is already sorted or carries information about which sorting procedure to follow. The fastest comparison-based sorting algorithms have time complexity O(n log n). (There are techniques like bead sort, but I am not considering them here since they are not memory-efficient.)
P.S. Quicksort does involve comparisons (but is adaptive), and in its worst case it makes O(n^2) comparisons.
P.P.S. Radix sort, counting sort, and bucket sort are examples of non-comparison sorts.
I am teaching myself algorithms, and I am trying to solve the following problem:
We have an array A of n positive integers in arbitrary order, and a value k with 1 <= k <= n. The task is to output the k smallest odd integers. If the number of odd integers in A is less than k, we should report all odd integers. For example,
if A = [2, 17, 3, 10, 28, 5, 9, 4, 12, 13, 7] and k = 3, the output should be 3, 5, 9.
I want to solve this problem in O(n) time.
My current solution is to build another array containing only the odd numbers, and then apply the following algorithm: find the median, partition the list into L, the median M, and R, and compare k as follows:
If |L| < k <= |L| + |M|, return the median
else if k <= |L|, solve the problem recursively on L
else solve the problem recursively on (R, k - (|L| + |M|))
Any help is appreciated.
Assuming the output can be in any order:
Create a separate array with only odd numbers.
Use a selection algorithm to determine the k-th item. One such algorithm is quickselect (which runs in O(n) on average), which is related to quicksort - it partitions the array by some pivot, and then recursively goes to one of the partitioned sides, based on the sizes of each. See this question for more details.
Since quickselect partitions the input, you will be able to output the results directly after running this algorithm (as Karoly mentioned).
Both of the above steps take O(n), thus the overall running time is O(n).
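Here is a rough sketch of those two steps (my own code; smallest_k is a simple out-of-place quickselect-style routine returning the k smallest elements in arbitrary order):

```python
import random

def smallest_k(nums, k):
    """Return the k smallest elements of nums, in arbitrary order, in expected O(n)."""
    if k <= 0:
        return []
    if k >= len(nums):
        return list(nums)
    pivot = random.choice(nums)
    less = [x for x in nums if x < pivot]
    equal = [x for x in nums if x == pivot]
    greater = [x for x in nums if x > pivot]
    if k <= len(less):
        return smallest_k(less, k)
    if k <= len(less) + len(equal):
        return less + equal[:k - len(less)]
    return less + equal + smallest_k(greater, k - len(less) - len(equal))

def k_smallest_odds(a, k):
    odds = [x for x in a if x % 2 == 1]   # step 1: keep only the odd numbers, O(n)
    return smallest_k(odds, k)            # step 2: select the k smallest, expected O(n)

print(sorted(k_smallest_odds([8, 1, 9, 2, 7, 3, 6, 5], 3)))  # [1, 3, 5]
```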
If you need the output in ascending order:
If k = n and all the numbers are odd, then an O(n) solution to this would amount to an O(n) sorting algorithm, and no one knows of such an algorithm.
To anyone who's considering disagreeing and saying that some non-comparison-based sort is O(n): it's not; each of those algorithms has some other factor in its complexity, such as the size of the numbers.
The best you can do here, with unbounded numbers, is to use the approach suggested in Proger's answer (O(n + k log n)), or iterate through the input, maintaining a heap of the k smallest odd numbers (O(n log k)).
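And a sketch of the heap variant mentioned above (my own code): maintain a max-heap of the k smallest odd numbers seen so far, giving O(n log k) overall (plus an O(k log k) sort at the end if ascending output is required):

```python
import heapq

def k_smallest_odds_heap(a, k):
    """Keep the k smallest odd numbers in a max-heap (stored negated), O(n log k)."""
    heap = []                          # negated values, so -heap[0] is the largest kept
    for x in a:
        if x % 2 == 0:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif -heap[0] > x:             # x is smaller than the largest of the k kept so far
            heapq.heapreplace(heap, -x)
    return sorted(-v for v in heap)

print(k_smallest_odds_heap([8, 1, 9, 2, 7, 3, 6, 5], 3))  # [1, 3, 5]
```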