In an n×n 2-D array, I am trying to find a number that is larger than its neighbours (a local maximum).
I approached this using a divide-and-conquer algorithm.
Next, I attempted to prove the correctness of my algorithm by exhibiting a suitable invariant: I filled a 4×4 grid with random numbers, divided it, and found the global maximum of each selected column (I'm not sure this is the right way to prove the correctness of the algorithm).
The bit I am most confused about is how to analyse the running time of my algorithm, i.e. how many elements of the n×n array need to be visited in the worst case, and finally whether there is a way to show that the algorithm is asymptotically optimal.
You cannot have better complexity than O(n^2), as you need to visit every element of the array: if you did not visit some element, it may turn out that the array has no local maxima other than a global maximum sitting at the element you didn't visit. The brute-force algorithm (just checking every element to see whether it is a local maximum) is O(n^2), so there's not much room to do better or worse than that.
Re the divide-and-conquer algorithm: this seems like over-engineering for the given problem. You gain nothing by it, and you introduce additional complications (how do you handle the borders between the parts?).
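For reference, a minimal Python sketch of that brute-force scan; the function name and the convention that a missing neighbour counts as smaller (and that ties still count as a local maximum) are my own assumptions, not from the question.

```python
def find_local_max(grid):
    """Scan every cell and return the position of an element that is
    >= each of its up/down/left/right neighbours. O(n^2) for an n x n grid."""
    n = len(grid)
    for i in range(n):
        for j in range(n):
            val = grid[i][j]
            if i > 0 and grid[i - 1][j] > val:
                continue
            if i + 1 < n and grid[i + 1][j] > val:
                continue
            if j > 0 and grid[i][j - 1] > val:
                continue
            if j + 1 < n and grid[i][j + 1] > val:
                continue
            return i, j  # the global maximum guarantees at least one such cell
    return None
```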
Related
The algorithm described in this MIT lecture and written out in this SO question for finding a peak in a 1-D array makes sense.
So does its O(log n) analysis; we're dividing the array into halves.
How can I update it to find all peaks in the array? What would that complexity be?
For finding all peaks, you can't do any better than just going through the whole array and comparing every element to its neighbors. There's no way to tell whether an element you didn't look at is or isn't a peak, so you have to look at all of them.
Thus, the time complexity is O(n) for n elements.
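A minimal Python sketch of that linear scan; the function name and the convention that a missing neighbour at either end counts as smaller are my own assumptions.

```python
def all_peaks(a):
    """Return the indices of every element that is >= both of its neighbours.
    Every element is examined once, so this is O(n)."""
    n = len(a)
    peaks = []
    for i in range(n):
        left_ok = (i == 0) or (a[i] >= a[i - 1])
        right_ok = (i == n - 1) or (a[i] >= a[i + 1])
        if left_ok and right_ok:
            peaks.append(i)
    return peaks
```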
I'm tutoring a student, and one of her assignments is to describe an O(n log n) algorithm for the closest pair of points in the one-dimensional case. But the restriction is that she's not allowed to use a divide-and-conquer approach. I understand the two-dimensional case from a question a user posted some years ago. I'll link it in case someone wants to look at it: For 2-D case (plane) - "Closest pair of points" algorithm.
However, for the 1-D case, I can only think of a solution that involves checking every point on the line and comparing it to the closest point to its left and to its right. But this solution isn't O(n log n), since checking each point takes time proportional to n and the comparisons for each point take time proportional to 2n. I'm not sure where the log(n) would come from without a divide-and-conquer approach.
For some reason, I can't come up with a solution. Any help would be appreciated.
Hint: If the points were ordered from left to right, what would you do, and what would the complexity be? What is the complexity of ordering the points first?
It seems to me that one could:
Sort the locations into order - O(n log n)
Find the differences between the ordered locations - O(n)
Find the smallest difference - O(n)
The smallest difference defines the two closest points.
The overall result would be O(n log n).
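A rough Python sketch of those three steps, assuming at least two points; the function name and the return format are mine, not from the answer above.

```python
def closest_pair_1d(points):
    """Sort, then take the minimum gap between consecutive values.
    Sorting dominates, so the whole thing is O(n log n)."""
    xs = sorted(points)                      # O(n log n)
    best = None
    for a, b in zip(xs, xs[1:]):             # O(n) consecutive differences
        if best is None or b - a < best[0]:
            best = (b - a, a, b)
    return best  # (distance, left point, right point)
```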
Why do divide-and-conquer algorithms often run faster than brute force, for example when finding the closest pair of points? I know you can show me the mathematical proof. But intuitively, why does this happen? Magic?
Theoretically, is it true that "divide and conquer is always better than brute force"? If it is not, is there any counterexample?
For your first question, the intuition behind divide-and-conquer is that in many problems the amount of work that has to be done is based on some combinatorial property of the input that scales more than linearly.
For example, in the closest pair of points problem, the runtime of the brute-force answer is determined by the fact that you have to look at all O(n^2) possible pairs of points.
If you take something that grows quadratically and cut it into two pieces, each half the size of the original, it takes one quarter of the initial time to solve the problem in each half, so solving the problem in both halves takes roughly half the time required by the brute-force solution. Cutting it into four pieces would take one fourth of the time, cutting it into eight pieces one eighth of the time, and so on.
The recursive version ends up being faster in this case because at each step, we avoid doing a lot of work from dealing with pairs of elements by ensuring that there aren't too many pairs that we actually need to check. Most algorithms that have a divide and conquer solution end up being faster for a similar reason.
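To make that concrete for the closest-pair problem: after a one-time sort, the divide-and-conquer solution satisfies the recurrence T(n) = 2T(n/2) + O(n), where the O(n) term pays for the merge step across the dividing line, and that recurrence solves to O(n log n), compared with the O(n^2) pairs the brute-force solution inspects.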
For your second question, no, divide and conquer algorithms are not necessarily faster than brute-force algorithms. Consider the problem of finding the maximum value in an array. The brute-force algorithm takes O(n) time and uses O(1) space as it does a linear scan over the data. The divide-and-conquer algorithm is given here:
If the array has just one element, that's the maximum.
Otherwise:
Cut the array in half.
Find the maximum in each half.
Compute the maximum of these two values.
This takes time O(n) as well, but uses O(log n) memory for the stack space. It's actually worse than the simple linear algorithm.
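A side-by-side sketch of the two approaches in Python; the function names are mine, and both assume a non-empty array.

```python
def max_linear(a):
    """Brute force: one pass, O(n) time, O(1) extra space."""
    best = a[0]
    for x in a[1:]:
        if x > best:
            best = x
    return best

def max_divide_and_conquer(a, lo=0, hi=None):
    """Divide and conquer: still O(n) time, but O(log n) stack depth."""
    if hi is None:
        hi = len(a)
    if hi - lo == 1:          # base case: a single element is its own maximum
        return a[lo]
    mid = (lo + hi) // 2      # cut the range in half
    return max(max_divide_and_conquer(a, lo, mid),
               max_divide_and_conquer(a, mid, hi))
```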
As another example, the maximum single-sell profit problem has a divide-and-conquer solution, but the optimized dynamic programming solution is faster in both time and memory.
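For comparison, the linear-time version of that problem is a single pass that tracks the cheapest price seen so far (which can be read as dynamic programming over prefixes); this sketch assumes a non-empty prices list and that doing no transaction counts as profit 0.

```python
def max_single_sell_profit(prices):
    """Best profit from buying once and selling later: O(n) time, O(1) space."""
    best_profit = 0
    min_price = prices[0]
    for p in prices[1:]:
        best_profit = max(best_profit, p - min_price)  # sell today?
        min_price = min(min_price, p)                  # cheaper buy for later
    return best_profit
```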
Hope this helps!
I recommend you read through Chapter 5 of Algorithm Design; it explains divide and conquer very well.
Intuitively, for a problem, if you can divide it into two sub-problems with the same pattern as the original one, and the cost of merging the results of the two sub-problems into the final result is reasonably small, then it's faster than solving the original complete problem by brute force.
As said in Algorithm Design, you actually cannot gain too much from divide and conquer in terms of time. In general you can only reduce the time complexity from a higher polynomial to a lower one (e.g. from O(n^3) to O(n^2)), but hardly ever from exponential to polynomial (e.g. from O(2^n) to O(n^3)).
I think the most you can gain from divide and conquer is the mindset for problem solving. It's always worth trying to break the original big problem down into smaller and easier sub-problems. Even if you don't get a better running time, it still helps you think through the problem.
The original problem was discussed here: Algorithm to find special point k in O(n log n) time
Simply put, we have an algorithm that decides whether or not a set of points in the plane has a center of symmetry.
I wonder, is there a way to prove an Ω(n log n) lower bound for this problem? I guess we need to show that this algorithm can be used to solve a simpler problem with a known lower bound, such as sorting, element uniqueness, or set uniqueness; then we can conclude that if, e.g., element uniqueness can be solved using this algorithm, the algorithm must take at least Ω(n log n) time.
It seems like the solution has something to do with element uniqueness, but I couldn't figure out how to turn that into an instance of the center-of-symmetry problem.
Check this paper
The idea is that if we can reduce problem B to problem A, then A is at least as hard as B.
In particular, if problem B has an Ω(n log n) lower bound, then problem A is guaranteed the same lower bound.
In the paper, the author picked the following relatively approachable problem to be B: given two sets of n real numbers, we wish to decide whether or not they are identical.
This problem is known to have an Ω(n log n) lower bound, and the author reduces it to the problem at hand, which yields the desired Ω(n log n) lower bound for deciding whether a point set has a center of symmetry.
First observe that your magical point k must be the centroid of the point set.
Build a lookup data structure indexed by point position (O(n log n)).
Calculate the centroid of the set of points (O(n)).
For each point, calculate the position of its mirror image through the centroid and check for its existence in the lookup structure (n lookups at O(log n) each, so O(n log n)).
Appropriate lookup data structures can include basically anything that allows you to look something up efficiently by content, including balanced trees, oct-trees, hash tables, etc.
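A rough Python sketch of that approach using a hash set as the lookup structure; the assumption of distinct integer coordinates and the trick of scaling everything by n (so the centroid never needs a division) are my own additions, not from the answer above.

```python
def has_center_of_symmetry(points):
    """points: iterable of distinct (x, y) integer pairs.
    Returns True iff there is a point k such that reflecting the set
    through k maps the set onto itself. Building the set and doing n
    lookups is O(n) expected with hashing (O(n log n) with a balanced tree)."""
    points = list(points)
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    # Work with coordinates scaled by n: the centroid becomes (sx, sy),
    # and the mirror image of point p is 2*(sx, sy) - n*p.
    scaled = {(n * x, n * y) for x, y in points}
    return all((2 * sx - n * x, 2 * sy - n * y) in scaled
               for x, y in points)
```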
I've been given a task to write an algorithm that computes the maximum two-dimensional subset (i.e. the contiguous submatrix with the largest sum) of a matrix of integers. However, I'm not interested in help with such an algorithm; I'm more interested in knowing the best worst-case complexity with which this problem can possibly be solved.
Our current algorithm is roughly O(n^3).
I've been considering something like divide and conquer: splitting the matrix into a number of sub-matrices and simply adding up the elements within them, thereby limiting the number of sub-matrices one has to consider in order to find an approximate solution.
Worst case (exhaustive search) is definitely no worse than O(n^3). There are several descriptions of this on the web.
Best case can be far better: O(1). If all of the elements are non-negative, then the answer is the matrix itself. If the elements are non-positive, the answer is the element that has its value closest to zero.
Likewise if there are entire rows/columns on the edges of your matrix that are nothing but non-positive integers, you can chop these off in your search.
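For reference, one standard way to get the O(n^3) bound mentioned above is to fix a pair of rows, collapse that horizontal strip into per-column sums, and run Kadane's 1-D maximum-subarray scan over them. A rough Python sketch (names are mine, matrix assumed non-empty and square):

```python
def max_sum_submatrix(m):
    """Maximum-sum contiguous submatrix of an n x n integer matrix.
    O(n^2) row pairs, each handled in O(n), giving O(n^3) overall."""
    n = len(m)
    best = m[0][0]
    for top in range(n):
        col_sums = [0] * n
        for bottom in range(top, n):
            for j in range(n):
                col_sums[j] += m[bottom][j]   # extend the strip by one row
            # Kadane's scan over col_sums (non-empty subarray version)
            cur = col_sums[0]
            best = max(best, cur)
            for s in col_sums[1:]:
                cur = max(s, cur + s)
                best = max(best, cur)
    return best
```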
I've figured that there isn't a better way to do it, at least none known to man yet.
And I'm going to stick with the solution I've got, mainly because it's simple.