Find the unique mapping between elements of two same-size arrays
This is a fairly well-known interview question, and it is easy to find an algorithm (using the idea of quicksort) that runs in O(N log N) time on average and O(N^2) in the worst case. Using the same technique as for the sorting lower bound, we can also show that any algorithm must perform at least Ω(N log N) comparisons.
So the question I can't get answered: is there a worst-case O(N log N) algorithm for this problem? Maybe it should be similar to merge sort.
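For reference, a minimal sketch of one such quicksort-style approach (assuming, as in the nuts-and-bolts formulation, that an element of A may only be compared against elements of B, never against other elements of A):

```
import random

def match_arrays(A, B):
    """Pair each element of A with its equal in B, using only cross-array
    comparisons. Average case O(N log N), worst case O(N^2)."""
    if not A:
        return []
    pivot_a = random.choice(A)                      # random pivot from A
    pivot_b = next(b for b in B if b == pivot_a)    # its unique match in B
    b_less    = [b for b in B if b < pivot_a]       # partition B around pivot_a
    b_greater = [b for b in B if b > pivot_a]
    a_less    = [a for a in A if a < pivot_b]       # partition A around pivot_b
    a_greater = [a for a in A if a > pivot_b]
    return (match_arrays(a_less, b_less)
            + [(pivot_a, pivot_b)]
            + match_arrays(a_greater, b_greater))

print(match_arrays([5, 2, 9, 1, 7], [7, 1, 5, 9, 2]))  # pairs each element of A with its equal in B
```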
Yes, as of 1995 there are worst-case O(n log n)-time algorithms known for this problem, but they appear to be quite complicated. Here are two citations from Jeff Erickson's algorithm notes:
János Komlós, Yuan Ma, and Endre Szemerédi, Sorting nuts and bolts in O(n log n) time, SIAM J. Discrete Math 11(3):347–372, 1998.
Phillip G. Bradford, Matching nuts and bolts optimally, Technical Report MPI-I-95-1-025, Max-Planck-Institut für Informatik, September 1995.
As Jeff remarks, "Both the algorithms and their analysis are incredibly technical and the constant hidden in the O(·) notation is quite large." He notes also that Bradford’s algorithm, which appeared second, is slightly simpler.
Related
I am working on an existing algorithm to improve its complexity. The existing algorithm uses K-means to perform clustering, whereas I chose to use K-means++ to do the same.
K-means++ was chosen because it generally gives faster and more accurate clustering results than plain K-means.
Now, towards the end, where I have to compare the complexity of the new and existing algorithms, I find I can't make sense of the statement that K-means++ has a complexity that is "O(log k)-competitive".
I have tried looking everywhere on the web for an explanation, including Stack Overflow.
The only thing I have understood is that competitive has something to do with "on-line" and "off-line" algorithms. Could anyone please explain how it applies here?
The full sentence that you are reading says something like "The k-means++ clustering is O(log k)-competitive to the optimal k-means solution".
This is not a statement about its algorithmic complexity. It's a statement about its effectiveness. You can use O-notation for other things.
K-means attempts to minimize a "potential" that is calculated as the sum of the squared distances of points from their cluster centers.
For any specific clustering problem, the expected potential of a K-means++ solution is at most 8(ln k + 2) times the potential of the best possible solution. That 8(ln k + 2) is shortened to O(log k) for brevity.
The precise meaning of the statement that the k-means++ solution is O(log k)-competitive is that there is some constant C such that the expected ratio between the k-means++ potential and the best possible potential is less than C*(log k) for all sufficiently large k.
(The smallest such constant is about 8.)
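For concreteness, a minimal sketch of the k-means++ seeding step and of the potential the guarantee is stated for (NumPy; the function names are only for illustration):

```
import numpy as np

def kmeans_pp_seed(X, k, rng=None):
    """k-means++ seeding: each new center is drawn with probability
    proportional to the squared distance to the nearest center so far."""
    rng = rng or np.random.default_rng()
    centers = [X[rng.integers(len(X))]]              # first center: uniform at random
    for _ in range(k - 1):
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])   # D^2 weighting
    return np.array(centers)

def potential(X, centers):
    """The 'potential': sum of squared distances of points to their nearest
    center. For k-means++ seeding, E[potential] <= 8(ln k + 2) * optimum."""
    return np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0).sum()
```

The competitive guarantee compares the expected value of potential(X, kmeans_pp_seed(X, k)) against the minimum possible potential over all choices of k centers; it is not a running-time statement.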
I am working on a program that uses just one for-loop, running N times, to sort the N elements.
Just wished to ask: is it worth it? I know it's going to work, because it works pretty well on paper.
It also uses comparisons.
I also wished to know if there were any drawbacks in Radix Sort.
Cheers.
Your post mentions that you are using comparisons. Comparison-based sorting algorithms need Ω(n log n) comparisons, even on average inputs; this lower bound has been proven mathematically using information-theoretic (decision-tree) arguments. You can only achieve O(n) in the best-case scenario, where the input data is already sorted. There is a lot more detail on sorting algorithms on Wikipedia.
I would only implement your sorting algorithm as a challenging programming exercise. Most modern languages already provide fast sorting algorithms that have been thoroughly tested.
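On the radix sort side question: radix sort is not comparison-based, so the Ω(n log n) bound does not apply to it; its usual drawbacks are that it only works for keys that decompose into digits (e.g. fixed-width integers or strings) and that it needs extra memory for its buckets. A minimal LSD sketch for non-negative integers:

```
def radix_sort(nums, base=10):
    """LSD radix sort: O(d * (n + base)) for d-digit keys, no comparisons
    between elements, but O(n + base) extra space per pass."""
    if not nums:
        return []
    out, exp = list(nums), 1
    while max(nums) // exp > 0:
        buckets = [[] for _ in range(base)]
        for x in out:
            buckets[(x // exp) % base].append(x)     # stable bucket by current digit
        out = [x for bucket in buckets for x in bucket]
        exp *= base
    return out
```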
Hey just a quick question,
I've just started looking into algorithm analysis and I'm attempting to learn Big-Oh notation.
The algorithm I'm looking at contains a quicksort (of complexity O(nlog(n))) to sort a dataset, and then the algorithm that operates upon the set itself has a worst case run-time of n/10 and complexity O(n).
I believe that the overall complexity of the algorithm would just be O(n), because it's of the highest order, so it makes the complexity of the quicksort redundant. However, could someone confirm this or tell me if I'm doing something wrong?
Wrong.
Quicksort has worst-case complexity O(n^2). But even if you use an O(n log n) sorting algorithm, that is still more than O(n): when you add the two stages, the larger term dominates, so the sort is anything but redundant.
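Spelled out, writing T_sort(n) for the sorting stage and T_scan(n) for the linear pass (names introduced here just for the illustration):

T(n) = T_sort(n) + T_scan(n) = O(n log n) + O(n) = O(n log n)

and O(n^2) + O(n) = O(n^2) if you charge quicksort its worst case. Either way the sort, not the linear pass, determines the overall bound.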
I am attempting to prepare a presentation to explain the basics of algorithm analysis to my co-workers - some of them have never had a lecture on the subject before, but everyone has at least a few years programming behind them and good math backgrounds, so I think I can teach this. I can explain the concepts fine, but I need concrete examples of some code structures or patterns that result in factors so I can demonstrate them.
Polynomial factors (n, n^2, n^3, etc.) are easy: nested loops over the same bound. But I am getting lost on how to describe and demonstrate some of the less common ones.
I would like to incorporate exponential (2^n or c^n), logarithmic (n log(n) or just log(n)), and factorial (n!) factors in the presentation. What are some short, teachable ways to get these into code?
A divide-and-conquer algorithm that does a constant amount of work each time it halves the problem is O(log n); binary search, for example.
A divide-and-conquer algorithm that does a linear amount of work each time it halves the problem is O(n log n); merge sort, for example. Both are sketched below.
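Two textbook instances of those patterns, as minimal Python sketches:

```
def binary_search(a, target):
    """O(log n): constant work per halving of the search range."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def merge_sort(a):
    """O(n log n): a linear merge at each of the log n levels of halving."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):          # linear merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```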
Exponential and factorial are probably best illustrated by iterating respectively over all subsets of a set, or all permutations of a set.
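Simply enumerating those collections already shows the growth (2^n subsets, n! orderings):

```
from itertools import combinations, permutations

items = ['a', 'b', 'c']

# All 2^n subsets of the set.
subsets = [c for r in range(len(items) + 1) for c in combinations(items, r)]

# All n! orderings of the set.
orderings = list(permutations(items))

print(len(subsets), len(orderings))   # 8 and 6 for n = 3
```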
Exponential: a naive Fibonacci implementation (sketched after this list).
n log(n) or just log(n): sorting and binary search.
Factorial: Naive traveling salesman solutions. Many naive solutions to NP-complete problems.
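For the exponential item, the classic snippet is the doubly recursive Fibonacci; each call spawns two more, giving an exponential number of calls in n:

```
def fib(n):
    """Naive recursion: two recursive calls per level, exponential total work."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```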
n! examples are pretty simple to demonstrate: brute-force solutions to many NP-complete problems, such as the travelling salesman problem, take n! time.
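A minimal brute-force sketch (the function and parameter names are mine): it checks all (n-1)! tours over an n x n distance matrix.

```
from itertools import permutations

def brute_force_tsp(dist):
    """Examine every tour starting and ending at city 0: (n-1)! candidates."""
    n = len(dist)
    best_tour, best_cost = None, float('inf')
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost
```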
If in doubt, pick one of the sorting algorithms: everyone knows what they're supposed to do, so they're easy to explain in relation to the complexity material, and Wikipedia has quite a good overview.
It seems like the best complexity would be linear O(n).
The exact case doesn't really matter; I'm speaking of greedy algorithms in general.
Sometimes it pays off to be greedy?
The specific case I am interested in is computing change.
Say you need to give 35 cents in change and you have coins of 1, 5, 10, and 25 cents. The greedy algorithm, coded simply, solves this problem quickly and easily: first grab 25 cents, the highest value that fits into 35, and then 10 cents to complete the total. This is the best case. Of course there are bad cases, and cases where this greedy algorithm has issues; I'm talking about the best-case complexity for determining this type of problem.
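A minimal sketch of that greedy change-maker (for arbitrary coin systems it can return a suboptimal answer, as noted above, but the structure is the same):

```
def greedy_change(amount, denominations=(25, 10, 5, 1)):
    """Repeatedly take the largest coin that still fits.
    Work is proportional to the number of coins returned plus the
    scan over the (fixed) denomination list."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            amount -= coin
            coins.append(coin)
    return coins

print(greedy_change(35))   # [25, 10]
```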
Any algorithm that has an output of n items that must be taken individually has at best O(n) time complexity; greedy algorithms are no exception. A more natural greedy version of e.g. a knapsack problem converts something that is NP-complete into something that is O(n^2)--you try all items, pick the one that leaves the least free space remaining; then try all the remaining ones, pick the best again; and so on. Each step is O(n). But the complexity can be anything--it depends on how hard it is to be greedy. (For example, a greedy clustering algorithm like hierarchical agglomerative clustering has individual steps that are O(n^2) to evaluate (at least naively) and requires O(n) of these steps.)
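A sketch of that O(n^2) greedy scheme, assuming the items are plain weights (names are hypothetical): n rounds, each doing an O(n) scan for the item that leaves the least free space.

```
def greedy_knapsack(weights, capacity):
    """At each step, scan all remaining items and take the one that
    leaves the least free space: O(n) rounds x O(n) scan = O(n^2)."""
    remaining = list(weights)
    chosen, free = [], capacity
    while remaining:
        fits = [w for w in remaining if w <= free]   # O(n) scan per round
        if not fits:
            break
        best = max(fits)                             # leaves the least free space
        chosen.append(best)
        remaining.remove(best)
        free -= best
    return chosen, free
```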
When you're talking about greedy algorithms, typically you're talking about the correctness of the algorithm rather than the time complexity, especially for problems such as change making.
Greedy heuristics are used because they're simple. This means easy implementations for easy problems, and reasonable approximations for hard problems. In the latter case you'll find time complexities that are better than those of exact, guaranteed-correct algorithms. In the former case, you can't hope for better than optimal time complexity.
Greedy approach (knapsack-type problem with deadlines):
Sort the given elements using merge sort: O(n log n).
Find the maximum deadline: O(n).
Select the elements one by one using linear search: O(n^2).
Total: O(n log n) + O(n) + O(n^2) = O(n^2) in the worst case.
Now, can we apply binary search instead of the linear search?
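A minimal sketch of the scheme described above, assuming the job-sequencing-with-deadlines variant (each job has a profit and a deadline and occupies one unit slot); the per-job linear scan for a free slot is what contributes the O(n^2) term:

```
def greedy_schedule(jobs):
    """jobs: list of (profit, deadline) pairs.
    Sort by profit: O(n log n). Find max deadline: O(n).
    Linear scan for a free slot per job: O(n^2) overall."""
    jobs = sorted(jobs, key=lambda j: j[0], reverse=True)
    max_deadline = max(d for _, d in jobs)
    slots = [None] * max_deadline                # slot t covers time unit t+1
    total_profit = 0
    for profit, deadline in jobs:
        for t in range(deadline - 1, -1, -1):    # linear search for a free slot
            if slots[t] is None:
                slots[t] = (profit, deadline)
                total_profit += profit
                break
    return slots, total_profit
```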
Greedy or not has essentially nothing to do with computational complexity, other than the fact that greedy algorithms tend to be simpler than other algorithms to solve the same problem, and hence they tend to have lower complexity.