Priority Queue Contradiction - sorting

I need help just wording this problem. I have the right idea, but I want to be sure I understand the solution.
Let's say your friend claims he invented a super-fast comparison-based priority queue. He claims insertion and extraction are O(sqrt(log n)).
Why is he wrong?
If I were to prove this by contradiction: he's claiming that inserting and extracting one item takes O(sqrt(log n)).
Therefore n items would take O(n sqrt(log n)). If you used the queue to sort, he's claiming the sort would take that amount of time.
However, we know that the lower bound for comparison-based sorting is Omega(n log n), which is why your friend must be wrong.
When I try to explain this, I'm told: your friend isn't claiming that he's sorting, just that he's inserting and extracting in that small an amount of time.

Assuming these are worst-case bounds, your friend is wrong. You just demonstrated how the queue can be used to sort and derived a contradiction; the only thing you'd need to show is how the sort works and that it indeed takes O(n sqrt(log n)) time.
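To make that concrete, here is a minimal Java sketch of the sort you would build on top of the claimed structure. FastPQ is a hypothetical interface standing in for your friend's queue (nothing he actually provided); only its claimed costs matter.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical interface for the friend's claimed structure:
    // insert() and extractMin() are each claimed to cost O(sqrt(log n)).
    interface FastPQ<T extends Comparable<T>> {
        void insert(T item);
        T extractMin();
        boolean isEmpty();
    }

    class SortViaFastPQ {
        // n inserts plus n extractions would total O(n * sqrt(log n)) comparisons,
        // beating the Omega(n log n) lower bound for comparison-based sorting.
        static <T extends Comparable<T>> List<T> sort(List<T> input, FastPQ<T> pq) {
            for (T item : input) {
                pq.insert(item);             // n inserts, O(sqrt(log n)) each (claimed)
            }
            List<T> sorted = new ArrayList<>();
            while (!pq.isEmpty()) {
                sorted.add(pq.extractMin()); // n extractions, O(sqrt(log n)) each (claimed)
            }
            return sorted;                   // total: O(n sqrt(log n)) -- the contradiction
        }
    }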


How can this Prolog predicate become faster than exponential?

I have a predicate that checks whether a room is available within a given schedule (consisting of events).
Checking that the room is available and not taken by another event currently works in an exponential way. I'd like to optimise this.
What I'm currently doing is:
I take the first event, verify it doesn't overlap with any of the other events. Then I take the second event, I verify it doesn't overlap with any of the other remaining events. And so on until the list is empty.
I've been thinking about it, but the only way I see to make this more performant is by using asserts.
I'm wondering if there is any other way than using asserts in order to improve the efficiency?
Optimal scheduling is definitely an exponential problem. It's akin to optimal bin packing. This is an entire research area.
It sounds to me like what you're doing is O(n²): you're comparing every element in the list to every other element in the list, but each pair only once, because you compare each element only to the elements after it. So element 1 gets compared to N-1 other elements, but element N-1 only gets compared to 1 other element. This is not an absurd time complexity for your problem.
An interval tree approach is potentially a significant improvement, because you will not actually compare every element to every other element. This lowers your worst-case time complexity to O(N log N) which is considered a pretty big improvement assuming your set of events is large enough that the constant factor cost of using a balanced tree is mitigated.
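A sort-and-sweep gives the same O(N log N) bound with less machinery than a full interval tree. Here is a minimal sketch in Java rather than Prolog, using a hypothetical Event record with half-open [start, end) times.

    import java.util.Arrays;
    import java.util.Comparator;

    class OverlapCheck {
        // Hypothetical event representation: half-open interval [start, end).
        record Event(int start, int end) {}

        // Returns true if no two events overlap: O(N log N) for the sort,
        // then a single O(N) pass instead of comparing every pair.
        static boolean roomAvailable(Event[] events) {
            Event[] byStart = events.clone();
            Arrays.sort(byStart, Comparator.comparingInt(Event::start));
            int latestEnd = Integer.MIN_VALUE;
            for (Event e : byStart) {
                if (e.start() < latestEnd) {
                    return false;            // e begins before an earlier event has finished
                }
                latestEnd = Math.max(latestEnd, e.end());
            }
            return true;
        }
    }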
I suspect this isn't really where your performance problem lies though. You probably don't want the first schedule you can build, you probably want to see what schedule you can make that has the fewest conflicts, which will mean trying different permutations. This is where your algorithm is really running into trouble, and unfortunately it's where my knowledge runs dry; I don't know how one optimizes this process further. But I do know there is a lot written about process theory and scheduling theory that can assist you if you look for it. :)
I don't think your problem comes down to needing to use certain Prolog technologies better, such as the dynamic store. But, you can always profile your code and see where it is spending its time, and maybe there is some low-hanging fruit there that we could solve.
To go much further I think we're going to need to know more about your problem.

Sorting an array

I want to know which approach is better for sorting an array of elements.
For good performance, is it better to sort the array at the end, once I've finished filling it? Or is it better to sort it each time I add an element?
Ignoring the application-specific information for a moment, consider that sorted insertion requires, worst case, O(n) operations for each element. For n elements, this of course gives us O(n^2). If this sounds familiar, it's because what you're doing is (as another commenter pointed out) an insertion sort. In contrast, performing one quicksort on the entire list will take, worst case, O(n log n) time.
So, does this mean you should definitely wait until the end to sort? No. The important thing to remember is that O(n log n) is the best we can do for sorting in the general case. Your application-specific knowledge of the data can influence the best algorithm for the job. For example, if you know your data is already almost sorted, then insertion sort will give you linear time complexity.
Your mileage may vary, but I think the algorithmic viewpoint is useful when examining the general problem of "When should I sort?"
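To make the trade-off concrete, here is a small Java sketch of the two strategies discussed above (the names are illustrative, not from the question): keeping the list sorted on every insert costs O(n) per insert, i.e. O(n^2) to fill the list, while appending everything and sorting once at the end is O(n log n).

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    class SortStrategies {
        // Strategy 1: keep the list sorted after every insertion.
        // Finding the position is O(log n), but shifting elements is O(n),
        // so filling the whole list this way is O(n^2) -- an insertion sort.
        static void insertSorted(List<Integer> list, int value) {
            int pos = Collections.binarySearch(list, value);
            if (pos < 0) pos = -pos - 1;    // binarySearch encodes the insertion point
            list.add(pos, value);
        }

        // Strategy 2: append everything first, then sort once at the end: O(n log n).
        static List<Integer> sortAtEnd(List<Integer> values) {
            List<Integer> copy = new ArrayList<>(values);
            Collections.sort(copy);
            return copy;
        }
    }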
It depends on what is critical to you. Do you need to be able to insert very fast (a lot of entries but few queries), or do you need to be able to query very fast and insert slowly (a lot of queries but not many entries)?
This is basically your problem to solve. When you know this you can select an appropriate sorting algorithm and apply it.
Edit: This is assuming that either choice actually matters. This depends a lot on your activity (inserts vs queries) and the amount of data that you need to sort.

Is it possible to write a verifier that checks if a given program implements a given algorithm?

Given a program P, written in C++, can I write an algorithm that finds out whether P implements a particular algorithm? Is there any algorithm that solves this problem? Is this problem solvable?
For example, I ask a person to implement the quicksort algorithm, and now I want to make sure that the person actually implemented quicksort. The person could actually implement some other sorting algorithm, and it would produce correct output and pass all test cases (black-box testing). One way I can do this is to look into the source code. I want to avoid this manual effort and write a program that can do this job. The question is: is that possible?
From Rice's Theorem, you cannot even in general decide whether a piece of code is a sort function or not by examining the code. You can, of course, find out whether it has the effect of sorting for some finite set of inputs by running it with those inputs and examining the results.
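As a concrete illustration of that second point, here is a sketch of a black-box check in Java (candidateSort is a hypothetical function under test): it verifies the effect of sorting on a finite set of random inputs, but says nothing about which algorithm produced that effect.

    import java.util.Arrays;
    import java.util.Random;
    import java.util.function.UnaryOperator;

    class BlackBoxSortCheck {
        // Returns true if the candidate produced a correctly sorted result on
        // every trial. Passing proves nothing about *which* sorting algorithm
        // (if any particular one) the candidate implements.
        static boolean behavesLikeASort(UnaryOperator<int[]> candidateSort, int trials) {
            Random rng = new Random(42);
            for (int t = 0; t < trials; t++) {
                int[] input = rng.ints(100, 0, 1000).toArray();
                int[] expected = input.clone();
                Arrays.sort(expected);                       // reference answer
                int[] actual = candidateSort.apply(input.clone());
                if (!Arrays.equals(actual, expected)) {
                    return false;                            // wrong output on this input
                }
            }
            return true;
        }
    }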
You may be able to do something for the specific case of a given target sort algorithm, by examining the array that is being sorted during the sort, checking for invariants that are characteristic of the target algorithm. For example, each call in a recursive quick sort implementation will result in a subarray becoming sorted.
Following on from the comments, I suggest looking at Ahmad Taherkhani's home page. He has continued research in this area, including a 2012 paper on the topic.
I was thinking, and am still thinking, of stack/heap checks (given that you are testing against optimized solutions as well).
You can check the time complexity and overall memory complexity, which will narrow the results. Even for two sorts that are both O(n log n) in time, such as merge sort and quicksort, you can distinguish them by their memory allocation, since they use O(n) and O(log n) extra space respectively.
You can also check for things like whether the original array is disturbed, etc., but this is not of decisive weight.

Upper bound and lower bound of sorting algorithm

This is a very simple question but I'm struggling too much to understand the concept completely.
I'm trying to understand the difference between the following statements:
There exists an algorithm which sorts an array of n numbers in O(n) in the best case.
Every algorithm sorts an array of n numbers in O(n) in the best case.
There exists an algorithm which sorts an array of n numbers in Omega(n) in the best case.
Every algorithm sorts an array of n numbers in Omega(n) in the best case.
I will first explain what is driving me crazy. I'm not sure regarding 1 and 3 - but I know that for one of them the answer is correct just by specifying one case and for the other one the answer is correct by examining all the possible inputs. Therefore I know one of them must be true just by specifying that the array is already sorted but I can't tell which.
My teacher always told me to think about it like we are looking for the tallest guy in the class; again, for one of these options (1, 3) it's enough to point to one such guy, and there is no reason to examine the whole class.
I do know that if we were to examine the worst case then none of these statements could be true, because the best sorting algorithm without any assumptions or additional memory is Omega(n log n).
IMPORTANT NOTE: I'm not looking for a solution (an algorithm which is able to do the matching sort) - only trying to understand the concept a little better.
Thank you!
For 1 and 3, ask yourself: do you know an algorithm that sorts an array in Theta(n) in the best case? If you do, then both 1 and 3 are true, since Theta(n) = O(n) ∩ Omega(n), so an algorithm whose best case is Theta(n) satisfies both statements.
Hint: optimized bubble sort.
For 2: ask yourself - does EVERY algorithm sort an array of numbers in O(n) in the best case? Do you know an algorithm whose worst-case and best-case time complexities are identical? What happens to the mentioned bubble sort if you take all the optimizations off?
For 4: ask yourself - do you need to read all the elements in order to ensure the array is sorted? If you do, Omega(n) is a definite lower bound; you cannot do better than that.
Good Luck!
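Here is a sketch of the optimized bubble sort hinted at above: the early-exit flag means an already-sorted array is handled in a single O(n) pass, so the best case is Theta(n), which is exactly what statements 1 and 3 need.

    class OptimizedBubbleSort {
        // Best case Theta(n): on an already-sorted array the first pass makes no
        // swaps and the outer loop stops immediately. Worst case is still O(n^2).
        static void sort(int[] a) {
            boolean swapped = true;
            for (int pass = 0; swapped && pass < a.length - 1; pass++) {
                swapped = false;
                for (int i = 0; i < a.length - 1 - pass; i++) {
                    if (a[i] > a[i + 1]) {
                        int tmp = a[i];
                        a[i] = a[i + 1];
                        a[i + 1] = tmp;
                        swapped = true;
                    }
                }
            }
        }
    }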
The difference, obviously, is in the terms "O" and "Omega". One says "grows no faster than", the other says "grows no slower than".
Make sure that you understand the difference between those terms, and you'll see the difference in the sentences.
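For reference, the standard definitions (stated in the usual asymptotic form) are:

    f(n) \in O(g(n))      \iff \exists\, c > 0,\ n_0 : f(n) \le c \cdot g(n) \text{ for all } n \ge n_0
    f(n) \in \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : f(n) \ge c \cdot g(n) \text{ for all } n \ge n_0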
1 and 3 state completely different things, just as 2 and 4 do.
Look at those (those are NOT the same!):
1~ there exists an algorithm that for 10 items doesn't take more than 30 in the best case.
3~ there exists an algorithm that for 10 items doesn't take less than 30 in the best case.
2~ every algorithm, for 10 items, takes no more than 30 in the best case.
4~ every algorithm, for 10 items, takes no less than 30 in the best case.
Do you sense the difference now? With O/Omega the difference is similar, but the subject of investigation differs. The examples above talk about performance at some specific point/case, while O/Omega notation tells you about performance relative to the size of the data, but only once the data is "large enough", be it three items or millions, and it ignores constant factors:
the function 1000000*n is O(n)
the function 0.00001*n*n is O(n^2)
For small amounts of data, the second one is obviously much better than the first. But as the quantity of data rises, the first soon becomes much better!
Rewriting the above examples into "more proper" terms, that are more similar to your original sentences:
1~ there exists an algorithm that, for more than N items, doesn't take more than X*N in the best case.
3~ there exists an algorithm that, for more than N items, doesn't take less than X*N in the best case.
2~ every algorithm, for more than N items, takes no more than X*N in the best case.
4~ every algorithm, for more than N items, takes no less than X*N in the best case.
I hope that this helps you with "seeing"/"feeling" the difference!

What is the performance (Big-O) for removeAll() of a treeset?

I'm taking a Java data structures course at the moment. One of my assignments asks me to choose a data structure of my choice and write a spell checker program. I am in the process of checking the performance of the different data structures.
I went to the API for TreeSet and this is what it says...
"This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains)."
Would that include removeAll()?
How else would I be able to figure this out?
Thank you in advance.
It would not include removeAll(), but I have to disagree with polkageist's answer. It is possible that removeAll() could be executed in constant time depending on the implementation, although it seems most likely that the execution would happen in linear time.
I think N log N would only be the case if it was implemented in pretty much the worst way. If you are removing every element, there is no need to search for elements: any element you have needs to be removed, so there's no need to search.
Nope. For an argument collection of size k, the worst-case upper bound of removeAll() is, of course, O(k*log n), because each of the elements contained in the argument collection has to be removed from the tree set (which requires at least searching for it), and each of these searches costs log n.
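For intuition, this is roughly what that cost argument looks like in code; a conceptual sketch, not the actual JDK implementation of removeAll(): k removals, each paying the TreeSet's O(log n) cost.

    import java.util.Collection;
    import java.util.TreeSet;

    class RemoveAllCost {
        // Conceptually, removing a k-element collection from a TreeSet is one
        // remove() per element: k removals at O(log n) each, so O(k log n) total.
        // (The real JDK removeAll() may iterate differently, but the idea is the same.)
        static <T> void removeAllSketch(TreeSet<T> set, Collection<? extends T> toRemove) {
            for (T element : toRemove) {
                set.remove(element);   // O(log n) per call
            }
        }
    }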
