I am learning quicksort. I do not quite understand how to choose the pivot.
Assume I have this list: [15, 5, 4, 18, 12, 19, 14, 10, 8, 20]
If I choose 19 as the pivot, will the two lists for the next call be [15,5,4,18,12,14,10,8], [20]?
Yes, if you choose 19 as your pivot then those will be the two sublists created. Usually, quicksort requires a way of choosing the pivot which is consistent throughout your sort. Some might say pick the first, middle, or last element of your list.
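As a quick check of the split described above, here is a minimal sketch. The helper name partition is made up for illustration, and it assumes all values are distinct, so the pivot itself simply drops out of both sublists.

```python
def partition(values, pivot):
    """Split values into those smaller and those larger than pivot.

    Illustrative helper only: assumes distinct values, so the pivot
    itself ends up in neither sublist.
    """
    smaller = [v for v in values if v < pivot]
    larger = [v for v in values if v > pivot]
    return smaller, larger


data = [15, 5, 4, 18, 12, 19, 14, 10, 8, 20]
left, right = partition(data, 19)
print(left, right)
```

A real quicksort usually partitions in place rather than building new lists, so the order of elements inside each sublist may differ from this sketch.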
I'm currently studying selection algorithms, namely, median of medians.
I came across the following two sentences:
In computer science, a selection algorithm is an algorithm for finding
the kth smallest number in a list or array;
In computer science, the median of medians is an approximate (median)
selection algorithm, frequently used to supply a good pivot for an
exact selection algorithm, mainly the quickselect, that selects the
kth largest element of an initially unsorted array.
What does kth smallest/largest element mean?
To make the question a bit more concrete, consider the following (unsorted) array:
[19, 1, 7, 20, 8, 10, 19, 24, 23, 6]
For example, what is the 5th smallest element? And what is the 5th largest element?
If you sort the array from smallest to largest, the kth smallest element is the kth element in the sorted array. The kth largest element is the kth from the end in the sorted array. Let's examine your example array in Python:
In [2]: sorted([19, 1, 7, 20, 8, 10, 19, 24, 23, 6])
Out[2]: [1, 6, 7, 8, 10, 19, 19, 20, 23, 24]
The smallest element is 1, second smallest is 6, and so on. So the kth smallest is the kth element from the left. Similarly, 24 is the largest, 23 the second largest, and so on, so the kth largest element is the kth element from the right. So if k = 5:
In [3]: sorted([19, 1, 7, 20, 8, 10, 19, 24, 23, 6])[4] # index 4 is 5th from the start
Out[3]: 10
In [4]: sorted([19, 1, 7, 20, 8, 10, 19, 24, 23, 6])[-5] # index -5 is 5th from the end
Out[4]: 19
Note that you don't have to sort the array in order to get the kth smallest/largest value. Sorting is just an easy way to see which value corresponds to k.
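To illustrate selecting without a full sort, here is a rough quickselect sketch. The function names quickselect and kth_largest are my own, k is 1-based, and the middle element is used as the pivot purely for simplicity; median of medians would be one way to pick a better pivot.

```python
def quickselect(values, k):
    """Return the k-th smallest element (k is 1-based) without fully sorting."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    pivot = values[len(values) // 2]  # simplistic pivot choice (assumption)
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    if k <= len(smaller):
        return quickselect(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return quickselect(larger, k - len(smaller) - len(equal))


def kth_largest(values, k):
    """The k-th largest is the (n - k + 1)-th smallest."""
    return quickselect(values, len(values) - k + 1)


data = [19, 1, 7, 20, 8, 10, 19, 24, 23, 6]
print(quickselect(data, 5))   # 5th smallest: 10
print(kth_largest(data, 5))   # 5th largest: 19
```

Only the part of the list that can contain the answer is examined again at each step, which is why this typically does less work than sorting everything.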
Your shop sells several different types of dolls. Each doll has a suggested price, and no
two types of doll have the same price. You would like to fix an actual selling price for
each doll so that dolls of different types are as different in price as possible. Due to
some government regulations, you can only modify the suggested price within a fixed
band of ±K—in other words, if the suggested price is p, you can pick any selling price
in the range {p− K, p− K + 1, . . . , p+ K −1, p+ K}. Of course, the selling price must
always be non-negative.
For instance, suppose there are four types of dolls with suggested prices 130, 210, 70
and 90 and you are allowed to modify prices within a band of 20. Then, you can adjust
the prices to 150, 210, 50 and 100, respectively, so that the minimum difference in price
between any two types of dolls is 50. (For the second doll, you could have picked any
price between 200 and 230.) You can check that this is the largest separation that you
can achieve given the constraint.
In each of the cases below, you are given a sequence of prices and the value of K. You
have to determine the maximum separation that you can achieve between all pairs in
the sequence if you are allowed to modify each price by up to ±K.
(a) K = 13. Sequence: 144, 152, 214, 72, 256, 3, 39, 117, 238, 280.
(b) K = 10. Sequence: 10, 48, 57, 32, 61, 74, 33, 45, 99, 81, 19, 24, 101.
(c) K = 20. Sequence: 10, 19, 154, 67, 83, 39, 54, 110, 124, 99, 139, 170
So basically, I just need to find the maximum separation without coding. I tried to devise an algorithm but failed, so I started brute-forcing it by increasing or decreasing each price by some value. Brute force is too laborious here because of the size of K (it would have been simple for any K < 6).
Can someone define a function or recurrence relation to calculate it? The solutions are up online, but they only give the answer as an integer and don't explain how to reach the solution. I am a beginner in programming, so try explaining using pseudocode/ little bit of C++, please. Thank you.
Source: http://www.iarcs.org.in/inoi/2013/zio2013/zio2013-qpaper.pdf
Solution: http://www.iarcs.org.in/inoi/2013/zio2013/zio2013-solutions.pdf
Here is an O(n log n) algorithm.
To illustrate I will use the second example: 10, 48, 57, 32, 61, 74, 33, 45, 99, 81, 19, 24, 101 with K=10
Sort the list (10, 19, 24, 32, 33, 45, 48, 57, 61, 74, 81, 99, 101)
Use bisection to find the largest separation x that is still achievable
For a trial value of x, assign the final values greedily placing them as small as possible while satisfying the conditions (non-negative, within K of original value, at least x greater than previous).
So let us start with x=10.
We will move as follows:
10->0 (can't go negative so this is smallest allowed)
19->10 (must be at least x=10 above the previous value)
24->20
32->30
33->40
45->50
48 becomes impossible. We can only assign values between 38 and 58, but none of these is at least 10 above the previous value, 50.
We conclude that x=10 is too high a separation and we need to move lower.
You might try x=7 and find it is possible, then x=9 and find it is impossible, then try x=8:
10->0
19->9 (can only move to values 9->29)
24->17
32->25
33->33
45->41
48->49
57->57
61->65
74->73
81->81
99->89
101->97
And so we have found that x=8 is possible, x=9 is impossible and therefore x=8 is the maximum possible separation.
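The bisection plus greedy check described above could be sketched as follows. This is only an illustration of that idea, not a reference solution: the names feasible and max_separation are invented here, and the code assumes at least two prices and integer selling prices.

```python
def feasible(sorted_prices, k, x):
    """Greedy check: can every price be moved by at most k so that
    consecutive selling prices differ by at least x?"""
    previous = None
    for p in sorted_prices:
        lowest = max(0, p - k)  # selling price must be non-negative
        candidate = lowest if previous is None else max(lowest, previous + x)
        if candidate > p + k:   # cannot stay within the band of +/- k
            return False
        previous = candidate
    return True


def max_separation(prices, k):
    """Largest x for which feasible() succeeds (assumes len(prices) >= 2)."""
    sorted_prices = sorted(prices)
    lo = 0                           # x = 0 is always achievable
    hi = sorted_prices[-1] + k + 1   # larger than any possible separation
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if feasible(sorted_prices, k, mid):
            lo = mid
        else:
            hi = mid
    return lo


print(max_separation([130, 210, 70, 90], 20))  # worked example from the question: 50
print(max_separation([10, 48, 57, 32, 61, 74, 33, 45, 99, 81, 19, 24, 101], 10))  # case (b): 8
```

Placing each price as low as allowed leaves the most room for the prices that follow, which is why a single left-to-right pass is enough for each trial value of x.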
Suppose I have the following lists:
[1, 2, 3, 20, 23, 24, 25, 32, 31, 30, 29]
[1, 2, 3, 20, 23, 28, 29]
[1, 2, 3, 20, 21, 22]
[1, 2, 3, 14, 15, 16]
[16, 17, 18]
[16, 17, 18, 19, 20]
Order matters here. These are the nodes resulting from a depth-first search in a weighted graph. What I want to do is break down the lists into unique paths (where a path has at least 2 elements). So, the above lists would return the following:
[1, 2, 3]
[20, 23]
[24, 25, 32, 31, 30, 29]
[28, 29]
[20, 21, 22]
[14, 15, 16]
[16, 17, 18]
[19, 20]
The general idea I have right now is this:
1. Look through all pairs of lists to create a set of lists of overlapping segments at the beginning of the lists. For example, in the above example, this would be the output:
[1, 2, 3, 20, 23]
[1, 2, 3, 20]
[1, 2, 3]
[16, 17, 18]
2. The next output would be this:
[1, 2, 3]
[16, 17, 18]
3. Once I have the lists from step 2, I look through each input list and chop off the front if it matches one of the lists from step 2. The new lists look like this:
[20, 23, 24, 25, 32, 31, 30, 29]
[20, 23, 28, 29]
[20, 21, 22]
[14, 15, 16]
[19, 20]
4. I then go back and apply step 1 to the truncated lists from step 3. When step 1 doesn't output any overlapping lists, I'm done.
Step 2 is the tricky part here. What's silly is it's actually equivalent to solving the original problem, although on smaller lists.
What's the most efficient way to solve this problem? Looking at all pairs obviously requires O(N^2) time, and step 2 seems wasteful since I need to run the same procedure to solve these smaller lists. I'm trying to figure out if there's a smarter way to do this, and I'm stuck.
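For reference, step 1 on its own (the pairwise, O(N^2) version) might look like the sketch below. The helper names common_prefix and pairwise_prefixes are hypothetical, and prefixes shorter than two elements are discarded because a path needs at least two elements.

```python
from itertools import combinations


def common_prefix(a, b):
    """Longest shared run of elements at the start of both lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


def pairwise_prefixes(lists, min_len=2):
    """Distinct common prefixes over all pairs of lists, longest first."""
    found = set()
    for a, b in combinations(lists, 2):
        prefix = common_prefix(a, b)
        if len(prefix) >= min_len:
            found.add(tuple(prefix))
    return sorted(found, key=lambda p: (-len(p), p))


lists = [
    [1, 2, 3, 20, 23, 24, 25, 32, 31, 30, 29],
    [1, 2, 3, 20, 23, 28, 29],
    [1, 2, 3, 20, 21, 22],
    [1, 2, 3, 14, 15, 16],
    [16, 17, 18],
    [16, 17, 18, 19, 20],
]
for prefix in pairwise_prefixes(lists):
    print(list(prefix))
```

This only covers the first step; the remaining steps reuse the same prefix search on progressively shorter lists.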
Seems like the solution is to modify a Trie to serve the purpose. Trie compression gives clues, but the kind of compression that is needed here won't yield any performance benefits.
The first list you add becomes its own node (rather than k nodes). If there is any overlap, nodes split, but a node never gets smaller than two elements of the array.
A simple example of the graph structure looks like this:
insert (1,2,3,4,5)
graph: (1,2,3,4,5)->None
insert (1,2,3)
graph: (1,2,3)->(4,5), (4,5)->None
insert (3,32)
graph: (1,2,3)->(4,5), (4,5)->None, (3,32)->None
segments
output: (1,2,3), (4,5), (3,32)
The child nodes should also be stored as an actual Trie, at least when there are enough of them, to avoid a linear search when adding to or removing from the data structure, which could increase the runtime by a factor of N. If that is implemented, the data structure has the same big-O performance as a Trie, with somewhat higher hidden constants: building it takes O(L*N), where L is the average length of a list and N is the number of lists. Obtaining the segments is linear in the number of segments.
The final data structure, basically a directed graph, for your example would look like the one below, with the start node at the bottom.
Note that this data structure can be built as you run the DFS rather than afterwards.
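A simplified way to experiment with this idea is sketched below. It uses a plain dictionary-based Trie with one element per node rather than the multi-element nodes described above, and it does not enforce the two-element minimum. The names END, build_trie and segments are invented for this sketch, and the third inserted list is assumed to be (3, 32), as the graph and output lines of the small example suggest.

```python
END = object()  # marker key: an inserted list ends at this node


def build_trie(paths):
    """Insert every path into a nested-dict trie, one element per node."""
    root = {}
    for path in paths:
        node = root
        for item in path:
            node = node.setdefault(item, {})
        node[END] = True
    return root


def segments(node, current=()):
    """Collect maximal chains; a chain stops where a list ends or the trie branches."""
    out = []
    for key, child in node.items():
        if key is END:
            continue
        chain = current + (key,)
        children = [k for k in child if k is not END]
        if END in child or len(children) != 1:
            out.append(chain)            # chain is complete here
            out.extend(segments(child))  # start fresh chains below it
        else:
            out.extend(segments(child, chain))
    return out


trie = build_trie([(1, 2, 3, 4, 5), (1, 2, 3), (3, 32)])
print(segments(trie))
```

Building the trie touches each element of each list once, which matches the O(L*N) estimate, and the walk that collects the segments visits each trie node once.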
I ended up solving this by thinking about the problem slightly differently. Instead of thinking about sequences of nodes (where an edge is implicit between each successive pair of nodes), I'm thinking about sequences of edges. I basically use the algorithm I posted originally. Step 2 is simply an iterative step where I repeatedly identify prefixes until there are no more prefixes left to identify. This is pretty quick, and dealing with edges instead of nodes really simplified everything.
Thanks for everyone's help!
I'm trying to understand quicksort and I get the general idea, but I'm having trouble with the question below. Is there an easy way to identify which pivot is being used based on the array after each iteration?
Consider the following array and its state after iterations of QuickSort on the array:
Initial Array: 32, 12, 17, 73, 40, 88, 16, 75
After Iter 1: 32, 12, 17, 40, 16, 73, 88, 75
After Iter 2: 12, 16, 17, 40, 32, 73, 88, 75
After Iter 3: 12, 16, 17, 40, 32, 73, 88, 75
After Iter 4: 12, 16, 17, 32, 40, 73, 88, 75
After Iter 5: 12, 16, 17, 32, 40, 73, 75, 88
Name the pivot selection strategy used in this QuickSort execution.
Hint: Examine what value is being selected as the pivot at each stage. Remember
that QuickSort first sorts the left sub-array and its left-sub-array recursively before
sorting the right sub-arrays.
Some element is chosen as the pivot. In the first iteration, all elements smaller than the pivot are moved to its left and all larger elements to its right, if they are not there already. The pivot itself may also have to move to a new position. Knowing this, comparing the array before and after each iteration should help identify the pivot.
For example, in your case I believe the middle element, 73, is being chosen as the pivot. After the first iteration, all elements smaller than it have been moved to its left and all larger elements to its right.
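One way to apply this mechanically, as a rough sketch: after an iteration, the pivot must sit at a position where everything to its left is no larger and everything to its right is no smaller. The function name pivot_candidates is made up here, and the result is only a list of candidates, since other elements can end up in such a position by coincidence.

```python
def pivot_candidates(state):
    """Values with nothing larger on their left and nothing smaller on their right."""
    n = len(state)
    # suffix_min[i] is the minimum of state[i:], with infinity past the end
    suffix_min = [float("inf")] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_min[i] = min(state[i], suffix_min[i + 1])

    candidates = []
    left_max = float("-inf")  # maximum of state[:i]
    for i, value in enumerate(state):
        if left_max <= value <= suffix_min[i + 1]:
            candidates.append(value)
        left_max = max(left_max, value)
    return candidates


initial = [32, 12, 17, 73, 40, 88, 16, 75]
after_iter_1 = [32, 12, 17, 40, 16, 73, 88, 75]

print(pivot_candidates(after_iter_1))  # [73]
print(initial.index(73))               # 3, the (lower) middle index of 0..7
```

Looking up where the candidate sat in the array before the iteration (first, middle or last position of the sub-array being partitioned) then points to the selection strategy.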