I was reading about sorting a presorted list in which a few numbers are out of order. Someone said that the Cook-Kim algorithm is best for such cases, but when I googled it I found no relevant links.
Please let me know if anyone knows about it.
Thank you
The authors are Curtis R. Cook and Do Jin Kim; the paper you want is called "Best sorting algorithm for nearly sorted lists" and can be found in Communications of the ACM, 23(11):620–624, 1980.
I can't find anywhere to download it from; the publisher keeps it locked down, and it's $15 from the ACM itself.
To answer your question, it's a combination of an insertion sort and a quicksort, optimised for reordering mostly ordered data, i.e. bringing a previously sorted list back into sorted form after some alterations.
There's one research paper of theirs... You can view it if you have an ACM account.
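I haven't read the paper either, but going by that description, a minimal sketch of the general strategy (pull the out-of-order elements into a small side list, sort it, and merge it back) might look like this. To be clear, this is my own illustration of the idea, not Cook and Kim's exact algorithm:

    def sort_nearly_sorted(a):
        # Scan once; whenever an element breaks the order, pull the
        # offending pair out into a side list, so `kept` stays sorted.
        kept, pulled = [], []
        for x in a:
            if kept and x < kept[-1]:
                pulled.append(kept.pop())
                pulled.append(x)
            else:
                kept.append(x)
        pulled.sort()  # small for nearly sorted input
        # Merge the two sorted lists back together.
        result, i, j = [], 0, 0
        while i < len(kept) and j < len(pulled):
            if kept[i] <= pulled[j]:
                result.append(kept[i]); i += 1
            else:
                result.append(pulled[j]); j += 1
        return result + kept[i:] + pulled[j:]

    print(sort_nearly_sorted([1, 2, 9, 3, 4, 5, 6]))  # [1, 2, 3, 4, 5, 6, 9]

Since the side list stays small for nearly sorted input, the merge dominates and the whole thing stays close to linear time.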
I was asked this just the day before. I could come up with:
1. Arrays
2. Linked Lists
3. Sets/Vectors
4. Min/Max heaps
5. Trees
6. Map
The interviewer said I hadn't included one obvious answer which is as good as, if not better than, trees (assuming they are balanced). Any ideas?
There are lots of options that could be added: skip lists, graphs, B-trees, log-structured merge-trees, Bloom filters...
I consider that a horrible interview question. The interviewer is asking for useless trivia, and believes that how much trivia you know in common with the interviewer is a measure of something useful. Speaking personally, I'd consider that a red flag that would limit my interest in the company.
This is an exam practice question I've been working on. I know of methods to do this, but as the question states, I don't know which would be the most efficient.
You are given a telephone book listing the surnames of people in alphabetical order. Describe the fastest method (clearly explain what you have to do) you can use to find a given surname. If there are n people listed in the telephone book, what is the Big O complexity of your fastest method (and explain why)?
In this case you know the phone book entries are already in order. This means that a binary search is probably your best bet. This search works by cutting the number of entries to search in half on each iteration; it only works if your data is already sorted, however. Check out this website for time complexity in Big O notation: http://bigocheatsheet.com
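For concreteness, a minimal binary search over a sorted list of surnames might look like this (the names are just placeholders):

    def find_surname(book, target):
        # Classic binary search over an alphabetically sorted list.
        # Each comparison discards half the remaining entries: O(log n).
        lo, hi = 0, len(book) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if book[mid] == target:
                return mid
            elif book[mid] < target:
                lo = mid + 1  # target sorts after the midpoint
            else:
                hi = mid - 1  # target sorts before the midpoint
        return -1  # not listed

    book = ["Adams", "Baker", "Cook", "Davis", "Evans"]
    print(find_surname(book, "Davis"))  # 3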
Just a curiosity question. Remember how, for in-class group work, the professor would divide people up into groups of a certain size (n)?
Some of my professors would take a list of n people one wants to work with and n people one doesn't want to work with from each student, and then magically turn out groups of n where students would be matched up with people they prefer and avoid working with people they don't prefer.
To me this algorithm sounds a lot like a Knapsack problem, but I thought I would ask around about what your approach to this sort of problem would be.
EDIT: Found an ACM article describing something exactly like my question. Read the second paragraph for deja vu.
To me it sounds more like some sort of clique problem.
The way I see the problem, I'd set up the following graph:
Vertices would be the students
Two students would be connected by an edge if both of the following hold:
At least one of the two students wants to work with the other.
Neither of the two students has said they don't want to work with the other.
It is then a matter of partitioning the graph into cliques of size n. (Assuming the number of students is divisible by n)
If this was not possible, I'd probably let the first constraint on the edges slip, and have edges between two people as long as neither of them explicitly says that they don't want to work with the other one.
As for an approach to solving this efficiently, I have no idea, but this should hopefully get you closer to some insight into the problem.
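For illustration, here's a rough sketch of building that graph in Python. The `wants`/`avoids` input format is made up for the example:

    from itertools import combinations

    # Hypothetical input: for each student, who they want / don't want.
    wants  = {"Ann": {"Bob"}, "Bob": set(), "Cat": {"Ann"}}
    avoids = {"Ann": set(), "Bob": {"Cat"}, "Cat": set()}

    edges = set()
    for a, b in combinations(wants, 2):
        wanted  = b in wants[a] or a in wants[b]    # at least one wants the other
        blocked = b in avoids[a] or a in avoids[b]  # someone objects
        if wanted and not blocked:
            edges.add((a, b))

    print(sorted(edges))  # [('Ann', 'Bob'), ('Ann', 'Cat')]

The hard part, partitioning that graph into cliques of size n, remains NP-hard in general, which is why I'd expect heuristics in practice.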
You could model this pretty easily as a clustering problem, and you wouldn't even really need to define a space; you could just define the distances:
Make two people very close if they both want to work together.
Close if one of them wants to work with the other.
Medium distance if there's just apathy.
Far away if either one doesn't want to work with the other.
Then you could just find clusters, yay. Then split up any clusters of overly large size, with confidence that the people in the clusters would all be fine working together.
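A sketch of such a distance function, with made-up distance values and the same hypothetical `wants`/`avoids` sets as in the earlier answer's sketch:

    def distance(a, b, wants, avoids):
        # Smaller distance = more compatible. The numbers are arbitrary;
        # only their ordering matters to most clustering algorithms.
        if b in avoids[a] or a in avoids[b]:
            return 10.0  # either one objects: keep them far apart
        if b in wants[a] and a in wants[b]:
            return 1.0   # mutual: very close
        if b in wants[a] or a in wants[b]:
            return 2.0   # one-sided: close
        return 5.0       # apathy: medium distance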
This problem can be brute-forced, hence my approach would be first to brute force it, then fix it when I get a better idea.
There are a couple of algorithms you could use. A great example is the so-called "stable marriage problem", which always has a stable solution. You can read more about it here:
http://en.wikipedia.org/wiki/Stable_marriage_problem
The stable marriage problem only works with two groups of people (men/women in the marriage case). If you want to form pairs, you can use a variation, the stable roommates problem. In this case you still create pairs, but everybody comes from a single pool.
But you asked for a team (which I translate into >2 people per team). In this case you could let everybody rank everyone else from best to worst match and then run a matching algorithm on those preference lists.
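For the two-group case, a compact version of the textbook Gale-Shapley algorithm looks like this:

    def gale_shapley(men_prefs, women_prefs):
        # Preferences: each person maps to a list of the other group,
        # best first. Returns a stable matching as {man: woman}.
        free = list(men_prefs)                  # men still unmatched
        next_pick = {m: 0 for m in men_prefs}   # next proposal index per man
        engaged = {}                            # woman -> current fiance
        rank = {w: {m: i for i, m in enumerate(p)}
                for w, p in women_prefs.items()}
        while free:
            m = free.pop()
            w = men_prefs[m][next_pick[m]]
            next_pick[m] += 1
            if w not in engaged:
                engaged[w] = m
            elif rank[w][m] < rank[w][engaged[w]]:  # w prefers m
                free.append(engaged[w])
                engaged[w] = m
            else:
                free.append(m)  # rejected; m proposes to his next choice
        return {m: w for w, m in engaged.items()}

    men   = {"A": ["X", "Y"], "B": ["X", "Y"]}
    women = {"X": ["B", "A"], "Y": ["A", "B"]}
    print(gale_shapley(men, women))  # {'B': 'X', 'A': 'Y'}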
Ok, so here's the problem:
I need to find any number of item groups, from a set of 50-100 items, that add up to 1000, 2000, ..., 10000.
Input: a list of integers.
Each integer can be in one group only.
Any ideas on algorithm?
Googling for "Knapsack problem" should get you quite a few hits (though they're not likely to be very encouraging -- this is quite a well known NP-complete problem).
Edit: if you want to get technical, what you're describing seems to really be the subset sum problem -- which is a special case of the knapsack problem. Of course, that's assuming I'm understanding your description correctly, which I'll admit may be open to some question.
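That said, if the integers are non-negative and the targets stay as small as 1000-10000, the textbook pseudo-polynomial dynamic program for subset sum is practical. A sketch:

    def subset_with_sum(items, target):
        # DP over reachable sums: O(len(items) * target) time.
        # reachable[s] = (item index, previous sum) used to first reach s.
        reachable = {0: None}
        for i, x in enumerate(items):
            for s in list(reachable):  # snapshot: each item used at most once
                if s + x <= target and s + x not in reachable:
                    reachable[s + x] = (i, s)
        if target not in reachable:
            return None
        subset, s = [], target
        while s:
            i, s_prev = reachable[s]
            subset.append(items[i])
            s = s_prev
        return subset

    print(subset_with_sum([3, 34, 4, 12, 5, 2], 9))  # [5, 4]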
You might find Algorithm 3.94 in The Handbook of Applied Cryptography helpful.
I'm not 100% on what you are asking, but I've used backtracking searches for something like this before. This is a brute force algorithm that is the slowest possible solution, but it will work. The wiki article on Backtracking Search may help you. Basically, you can use a recursive algorithm to examine every possible combination.
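A minimal backtracking sketch for this problem (assuming positive integers, so branches that overshoot the target can be pruned):

    def find_subset(items, target, chosen=(), start=0):
        # Try each remaining item in turn; recurse on the reduced target
        # and backtrack when a branch cannot reach it. Exponential in the
        # worst case, but simple and correct.
        if target == 0:
            return list(chosen)
        for i in range(start, len(items)):
            if items[i] <= target:  # prune: assumes positive integers
                found = find_subset(items, target - items[i],
                                    chosen + (items[i],), i + 1)
                if found is not None:
                    return found
        return None  # dead end: backtrack

    print(find_subset([3, 34, 4, 12, 5, 2], 9))  # [3, 4, 2]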
This is the knapsack problem. Are there any constraints on the integers you can choose from? Are they divisible? Are they all less than some given value? There may be ways to solve the problem in polynomial time given such constraints - Google will provide you with answers.
Simple online games of 20 questions powered by an eerily accurate AI.
How do they guess so well?
You can think of it as the Binary Search Algorithm.
In each iteration, we ask a question which should eliminate roughly half of the possible word choices. If there are a total of N words, then we can expect to get an answer after about log2(N) questions.
With 20 questions, we should optimally be able to find a word among 2^20 ≈ 1 million words.
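You can see the arithmetic directly with a toy version that pins down a number in [0, 2^20) using at most 20 "is it >= mid?" questions:

    def guess(secret, n=2**20):
        # Halve the candidate range with each yes/no question.
        lo, hi, questions = 0, n, 0
        while hi - lo > 1:
            mid = (lo + hi) // 2
            questions += 1
            if secret >= mid:  # "yes" keeps the upper half
                lo = mid
            else:              # "no" keeps the lower half
                hi = mid
        return lo, questions

    print(guess(123456))  # (123456, 20)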
One easy way to eliminate outliers (wrong answers) would be to use something like RANSAC. This would mean that instead of taking into account all the questions which have been answered, you randomly pick a smaller subset that is enough to give you a single answer. You then repeat that a few times with different random subsets of questions, until you see that most of the time you are getting the same result; then you know you have the right answer.
Of course this is just one way of many ways of solving this problem.
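To sketch that RANSAC flavour in this setting (the data layout here is invented for illustration, not how any real 20-questions engine stores things):

    import random

    def ransac_guess(candidates, answers, subset_size=5, rounds=200):
        # candidates: object -> {question: True/False} (its true attributes)
        # answers:    question -> the user's possibly-noisy answer
        # Score candidates against random subsets of the answers, so a few
        # wrong answers (outliers) can't veto the true object.
        votes = {obj: 0 for obj in candidates}
        questions = list(answers)
        k = min(subset_size, len(questions))
        for _ in range(rounds):
            sample = random.sample(questions, k)
            for obj, attrs in candidates.items():
                if all(attrs.get(q) == answers[q] for q in sample):
                    votes[obj] += 1  # consistent with this whole subset
        return max(votes, key=votes.get)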
I recommend reading about the game here: http://en.wikipedia.org/wiki/Twenty_Questions
In particular the Computers section:
The game suggests that the information (as measured by Shannon's entropy statistic) required to identify an arbitrary object is about 20 bits. The game is often used as an example when teaching people about information theory. Mathematically, if each question is structured to eliminate half the objects, 20 questions will allow the questioner to distinguish between 2^20 or 1,048,576 subjects. Accordingly, the most effective strategy for Twenty Questions is to ask questions that will split the field of remaining possibilities roughly in half each time. The process is analogous to a binary search algorithm in computer science.
A decision tree supports this kind of application directly. Decision trees are commonly used in artificial intelligence.
A decision tree is a binary tree that asks "the best" question at each branch to distinguish between the collections represented by its left and right children. The best question is determined by some learning algorithm that the creators of the 20 questions application use to build the tree. Then, as other posters point out, a tree 20 levels deep gives you a million things.
A simple way to define "the best" question at each point is to look for a property that most evenly divides the collection in half. That way, when you get a yes/no answer to that question, you get rid of about half of the collection at each step, approximating a binary search.
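As a toy illustration of picking "the best" question under that criterion (the object/attribute format is invented for the example):

    def best_question(objects, questions):
        # Choose the question whose yes/no split is closest to half/half.
        def imbalance(q):
            yes = sum(1 for o in objects if o[q])
            return abs(yes - (len(objects) - yes))
        return min(questions, key=imbalance)

    animals = [
        {"flies": True,  "barks": False},  # bird
        {"flies": False, "barks": True},   # dog
        {"flies": False, "barks": False},  # fish
        {"flies": True,  "barks": False},  # bat
    ]
    print(best_question(animals, ["flies", "barks"]))  # flies (2/2 split)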
Wikipedia gives a more complete example:
http://en.wikipedia.org/wiki/Decision_tree_learning
And some general background:
http://en.wikipedia.org/wiki/Decision_tree
It bills itself as "the neural net on the internet", and therein lies the key. It likely stores the question/answer probabilities in a sparse matrix. Using those probabilities, it's able to use a decision-tree algorithm to deduce which question to ask next to best narrow down the candidates. Once it narrows the number of possible answers to a few dozen, or once it's reached 20 questions, it starts reading off the most likely answers.
The really intriguing aspect of 20q.net is that unlike most decision tree and neural network algorithms I'm aware of, 20q supports a sparse matrix and incremental updates.
Edit: Turns out the answer's been on the net this whole time. Robin Burgener, the inventor, described his algorithm in detail in his 2005 patent filing.
It is using a learning algorithm.
k-NN is a good example of one of these.
Wikipedia: k-Nearest Neighbor Algorithm
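A minimal sketch of k-NN classification, just to show the mechanics:

    from collections import Counter

    def knn_classify(train, query, k=3):
        # Label the query point by majority vote among its k nearest
        # training points (squared Euclidean distance; ties not handled).
        def dist2(p, q):
            return sum((a - b) ** 2 for a, b in zip(p, q))
        nearest = sorted(train, key=lambda pl: dist2(pl[0], query))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    train = [((0, 0), "no"), ((0, 1), "no"),
             ((5, 5), "yes"), ((6, 5), "yes"), ((5, 6), "yes")]
    print(knn_classify(train, (4, 4)))  # yes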