When quicksorting a dataset the list gets split down and is recursive, in that the solution calls itself on the smaller lists.
I was practising the quicksort algorithm, but a sublist of length 2 is a stone in my shoe; I can't solve it. The original list was:
2 0 1 7 4 3 5 6
Pivot being at 2, left at 0, right at 6, I start. Left moves along to 7, 7>=2. Right moves down to 1, 1<=2. Left and right have crossed. As I understand, now right becomes the split point and two new lists are formed.
2 0 1 7 4 3 5 6
As you can see, the first list, 2 and 0, is 2 items long. So 2 is the pivot, and 0 is both left and right. Left doesn't move along, right moves along to 2, 2<=2. Left and right have crossed so p replaces R and L onwards is a new list. But this leaves 2 and 0 unsorted.
Where am I going wrong?
The problem in your case comes from the fact that you don't move the pivot to its sorted place. After partitioning with pivot 2, your array should look like this:
0 1 2 7 4 3 5 6
    ^
Let's go through the partition procedure with the input array 13 19 9 5 12 8 7 4 21 2 6 11, choosing 11 as the pivot.
During the procedure you need to maintain two pointers: one (marked ^^) for the element just before the first element bigger than the pivot, and another (marked ||) for the element you are currently looking at. The ^^ pointer starts just before the array.
The code looks like this:
A is array left..right
pivot = A[right]
i = left - 1 // the one before the first bigger than the pivot
for j = left to right - 1
    if A[j] <= pivot
        i = i + 1
        swap A[i] with A[j]
swap A[i+1] with A[right] // put pivot at its place, i + 1 - is the index to split on
And the example:
   13 19 9 5 12 8 7 4 21 2 6 11
   13 19 9 5 12 8 7 4 21 2 6 11   13 > 11, skip
^^ ||
   13 19 9 5 12 8 7 4 21 2 6 11   19 > 11, skip
^^    ||
   9 19 13 5 12 8 7 4 21 2 6 11   9 < 11, swap
   ^^   ||
   9 5 13 19 12 8 7 4 21 2 6 11   5 < 11, swap
     ^^   ||
   9 5 13 19 12 8 7 4 21 2 6 11   12 > 11, skip
     ^^      ||
   9 5 8 19 12 13 7 4 21 2 6 11   8 < 11, swap
       ^^      ||
   9 5 8 7 12 13 19 4 21 2 6 11   7 < 11, swap
         ^^      ||
   9 5 8 7 4 13 19 12 21 2 6 11   4 < 11, swap
           ^^      ||
   9 5 8 7 4 13 19 12 21 2 6 11   21 > 11, skip
           ^^         ||
Can you continue from here yourself?
Quicksort's only base cases are the empty array and the array of size 1. In your case of [2 0], the algorithm chooses 2 as the pivot, partitions [2 0] into the array [0] and an empty array, and combines them with the pivot [2], giving the sorted array [0 2].
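For reference, the partition pseudocode above together with the recursion can be sketched in Python. This is a minimal sketch, assuming the last element of each range is the pivot (as in the pseudocode, not as in the question); the names `partition` and `quicksort` are illustrative.

```python
def partition(a, left, right):
    """Lomuto partition of a[left..right] around the pivot a[right]."""
    pivot = a[right]
    i = left - 1  # index of the last element known to be <= pivot
    for j in range(left, right):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    # Put the pivot in its final, sorted place.
    a[i + 1], a[right] = a[right], a[i + 1]
    return i + 1


def quicksort(a, left=0, right=None):
    """Sort a[left..right] in place and return the list."""
    if right is None:
        right = len(a) - 1
    if left >= right:  # base case: zero or one element
        return a
    split = partition(a, left, right)
    quicksort(a, left, split - 1)   # the pivot itself is excluded
    quicksort(a, split + 1, right)
    return a


print(quicksort([2, 0, 1, 7, 4, 3, 5, 6]))
print(quicksort([2, 0]))
```

Because the pivot is excluded from both recursive calls, a sublist of length 2 always shrinks to sublists of length 0 or 1, so it cannot be left unsorted.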
Okay, so given this tree I need to write out the pre-order, in-order, and post-order traversals for it.
         9
       /   \
      5     12
     / \   /  \
    2   7 11   15
   /   / /  \    \
  3   6 10   13   16
                    \
                     17
This is what I've come up with, my teacher didn't do a great job of going over this so I'm not sure if I'm anywhere near correct.
pre-order: 9 5 2 3 7 6 12 11 10 13 15 16 17
in-order: 3 2 5 7 6 9 12 11 10 13 15 16 17
post-order: 3 2 6 7 5 10 11 17 16 15 13 12 9
Any help would be greatly appreciated
Pre-order: Do a depth-first traversal and write out each node when you encounter it for the first time. So this is correct (9 5 2 3 7 6 12 11 10 13 15 16 17).
Post-order: Do a depth-first traversal and write out the node after you have processed all its children. So the correct sequence would be (3 2 6 7 5 10 13 11 17 16 15 12 9).
In-order: Do a depth-first traversal and write out the left subtree first, then the node itself, and afterwards the right subtree. So the correct sequence would be (3 2 5 6 7 9 10 11 13 12 15 16 17). Here it makes a difference whether a single child is the left or the right one; for the other two methods it doesn't matter.
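The three traversals can be checked with a short Python sketch. The `Node` class and the function names are illustrative, and the left/right placement of the single children (3 and 6 as left children, 16 and 17 as right children) is an assumption read off the in-order sequence given above.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """Node first, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """Left subtree, then node, then right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """Left subtree, then right subtree, then node."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


# The tree from the question (child sides are an assumption, see above).
tree = Node(9,
            Node(5, Node(2, Node(3)), Node(7, Node(6))),
            Node(12,
                 Node(11, Node(10), Node(13)),
                 Node(15, None, Node(16, None, Node(17)))))

print(pre_order(tree))
print(in_order(tree))
print(post_order(tree))
```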
I am struggling with following assignment:
Given sorted sequences of numbers and operations and , find an optimal sequence of those operations (the shortest one), which creates one sorted sequence.
I've devised following algorithm:
1. Sort sequences C1, C2, ..., Cn with respect to their first elements.
2. While the number of sequences is greater than one:
3.     Find in C1 the last position with the number that is less than the first number in C2.
4.     If found_position == |C1|:
5.         C1 = concat(C1, C2)
6.     Else:
7.         C1a, C1b = split(C1, found_position + 1).
8.         C1 = concat(C1a, C2).
9.         Insert C1b to the set of sequences maintaining the order (with respect to their first elements).
10.    Remove C2 from the set of sequences.
11.    Go to step 2; in step 3, start searching from found_position.
An example:
1 4 5 9
2 6 10 11
7 12 20
1 4 5 9 2 6 10 11 7 12 20
^
1 4 5 9 2 6 10 11 7 12 20 // split
^
1 2 6 10 11 4 5 9 7 12 20 // concat
^
1 2 6 10 11 4 5 9 7 12 20
1 2 6 10 11 4 5 9 7 12 20
^
1 2 4 5 9 6 10 11 7 12 20
^
1 2 4 5 9 6 10 11 7 12 20
^
1 2 4 5 9 6 10 11 7 12 20
^
1 2 4 5 6 10 11 7 12 20 9
^
. . .
. . .
1 2 4 5 6 7 9 10 11 12 20
To maintain the ordered working set of sequences, I could use a balanced binary tree (each insert in step 9 is then O(log n)).
Is it correct? How to prove its correctness?
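To experiment with the procedure, here is a minimal Python sketch of the steps listed above. It is not a proof of correctness or optimality. A few things are assumptions of the sketch rather than part of the question: a plain sorted list stands in for the balanced tree, the function name `merge_by_split_concat` is illustrative, and the search uses "less than or equal" instead of "less than" so that equal first elements cannot make the loop run forever.

```python
from bisect import bisect_right


def merge_by_split_concat(sequences):
    """Merge sorted sequences using only split and concat.

    Returns the merged list and the list of operations performed.
    """
    # Step 1: order the working set by first element (lists compare
    # lexicographically, so the first element dominates).
    work = sorted(list(s) for s in sequences if s)
    ops = []
    while len(work) > 1:                      # step 2
        c1, c2, rest = work[0], work[1], work[2:]
        # Step 3: how many leading elements of C1 may stay in front of C2.
        pos = bisect_right(c1, c2[0])
        if pos < len(c1):                     # steps 6-9
            c1, c1b = c1[:pos], c1[pos:]
            ops.append("split")
            rest.append(c1b)
            rest.sort()                       # keep the working set ordered
        ops.append("concat")                  # step 5 or step 8
        work = [c1 + c2] + rest               # step 10: C2 is consumed
    return (work[0] if work else []), ops


merged, ops = merge_by_split_concat([[1, 4, 5, 9], [2, 6, 10, 11], [7, 12, 20]])
print(merged)
print(len(ops), "operations")
```

Counting the entries in `ops` for small inputs and comparing against an exhaustive search is one way to look for counterexamples before attempting a proof.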
I have been trying to solve a puzzle in Interviewstreet. But I don't have a clue for the problem by now. It'll be great if someone can give me a hint.
The puzzle is:
You have N soldiers numbered from 1 to N. Each of your soldiers is either a liar or a truthful person. You have M sets of information about them. The information is of the following form:
Each line contains 3 integers - A, B and C. This means that in the set of soldiers numbered as {A, A+1, A+2, ..., B}, exactly C of them are liars.
There are M lines like the above.
Let L be the total number of your liar soldiers. Since you can't find the exact value of L, you want to find the minimum and maximum value of L.
Input:
The first line of the input contains two integers N and M.
Each of the next M lines contains three integers - A, B and C (1 <= Ai <= Bi <= n) and (0 <= Ci <= Bi-Ai), where Ai, Bi and Ci refer to the values of A, B and C in the ith line respectively.
N and M are not more than 101, and it is guaranteed that the given information is satisfiable. You can always find a situation that satisfies the given information.
Output:
Print two integers Lmin and Lmax to the output.
Sample Input
3 2
1 2 1
2 3 1
Sample Output
1 2
Sample Input
20 11
3 8 4
1 9 6
1 13 9
5 11 5
4 19 12
8 13 5
4 8 4
7 9 2
10 13 3
7 16 7
14 19 4
Sample Output
13 14
Explanation
In the first sample testcase the first line is "3 2", meaning that there are 3 soldiers and we have two sets of information. The first information is that in the set of soldiers {1, 2} one is a liar and the second piece of information is that in the set of soldiers {2,3} again there is one liar. Now there are two possibilities for this scenario: Soldiers number 1 and 3 are liars or soldier number 2 is liar.
So the minimum number of liars is 1 and maximum number of liars is 2. Hence the answer, 1 2.
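The explanation of the first sample can be reproduced by brute force. The sketch below simply tries every liar/truthful assignment, so it is only usable for very small N; the function name `liar_range_bruteforce` is illustrative.

```python
from itertools import product


def liar_range_bruteforce(n, facts):
    """Return (Lmin, Lmax) by enumerating all 2**n assignments.

    facts is a list of (A, B, C): soldiers A..B contain exactly C liars.
    """
    totals = [
        sum(bits)
        for bits in product((0, 1), repeat=n)      # 1 means "liar"
        if all(sum(bits[a - 1:b]) == c for a, b, c in facts)
    ]
    return min(totals), max(totals)


print(liar_range_bruteforce(3, [(1, 2, 1), (2, 3, 1)]))  # (1, 2)
```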
This is Yet Another Dynamic Programming Problem. No heuristics needed.
At each i from 0 to n, by how far along you are to satisfying all currently open conditions, you need to track the minimum and maximum number of liars. (An open condition is something of the form, "From here to j I need k more liars.")
If you have the solution for i, moving to i+1 goes as follows for each partial solution that you have:
Drop all conditions that you've reached and satisfied.
Add all new conditions for this number. If a new condition conflicts with your existing solution, throw this partial solution away. Here are the rules for conflict between a condition saying that by j you need k liars and by j' you need k' liars with j <= j':
If k > k' there is a conflict. (You can't have more liars by j and then fewer again by j'.)
If j' - j < k' - k there is a conflict. (You can't add k' - k liars in j' - j soldiers.)
Otherwise there is no conflict.
If no condition says that by soldier j you need to add j-i liars, you can add for the current step a partial solution with the current soldier not being a liar. (By "add" here I mean "make sure this state is tracked, and update max/min as needed if it was not tracked".)
If no condition says that 0 additional liars are needed by a future point, you can add for the current step a partial solution with the current soldier being a liar. (In this solution first alter the state to say that every condition needs one fewer liar, because you added one, then proceed as before.)
Your starting condition is that with i = 0, there is 1 solution with 0 liars and absolutely no conditions.
From the solution for 0 start generating partial solutions for 1, 2, ... , n. And when you reach n you have your answer.
(Note, with a modest modification you can figure out not only what the max and min are, but exactly how many solutions there are.)
You can get 90% of the way there using these principles:
If the number of liars in a set is equal to zero, decompose that set into sets of size 1, each with number of liars equal to zero.
So 1 3 0 becomes 1 1 0 and 2 2 0 and 3 3 0.
If the number of liars in a set is equal to the size of the set, decompose that set into sets of size 1, each with number of liars equal to one.
So 2 5 4 becomes 2 2 1 and 3 3 1 and 4 4 1 and 5 5 1.
For any two sets A and B that we have, if A is a subset of B, then remove all of A's elements from B, and subtract the number of liars in A from the number of liars in B.
We'll use these principles to solve the longer of your two sample problems.
Start by taking the input given, and turning them into sets of indices.
3 4 5 6 7 8 [4]
1 2 3 4 5 6 7 8 9 [6]
1 2 3 4 5 6 7 8 9 10 11 12 13 [9]
5 6 7 8 9 10 11 [5]
4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 [12]
8 9 10 11 12 13 [5]
4 5 6 7 8 [4]
7 8 9 [2]
10 11 12 13 [3]
7 8 9 10 11 12 13 14 15 16 [7]
14 15 16 17 18 19 [4]
4 5 6 7 8 is a subset of 3 4 5 6 7 8, so subtract one from the other. Both sets contain 4 liars, so soldier 3 is not a liar.
3 [0]
1 2 3 4 5 6 7 8 9 [6]
1 2 3 4 5 6 7 8 9 10 11 12 13 [9]
5 6 7 8 9 10 11 [5]
4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 [12]
8 9 10 11 12 13 [5]
4 5 6 7 8 [4]
7 8 9 [2]
10 11 12 13 [3]
7 8 9 10 11 12 13 14 15 16 [7]
14 15 16 17 18 19 [4]
7 8 9, 10 11 12 13, and 14 15 16 17 18 19 are all subsets of 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19, so subtract them.
3 [0]
1 2 3 4 5 6 7 8 9 [6]
1 2 3 4 5 6 7 8 9 10 11 12 13 [9]
5 6 7 8 9 10 11 [5]
4 5 6 [3]
8 9 10 11 12 13 [5]
4 5 6 7 8 [4]
7 8 9 [2]
10 11 12 13 [3]
7 8 9 10 11 12 13 14 15 16 [7]
14 15 16 17 18 19 [4]
4 5 6 has three liars, so decompose them into individual sets.
3 [0]
4 [1]
5 [1]
6 [1]
1 2 3 4 5 6 7 8 9 [6]
1 2 3 4 5 6 7 8 9 10 11 12 13 [9]
5 6 7 8 9 10 11 [5]
8 9 10 11 12 13 [5]
4 5 6 7 8 [4]
7 8 9 [2]
10 11 12 13 [3]
7 8 9 10 11 12 13 14 15 16 [7]
14 15 16 17 18 19 [4]
subtract 3, 4, 5 and 6 from all the sets that contain them.
3 [0]
4 [1]
5 [1]
6 [1]
1 2 7 8 9 [3]
1 2 7 8 9 10 11 12 13 [6]
7 8 9 10 11 [3]
8 9 10 11 12 13 [5]
7 8 [1]
7 8 9 [2]
10 11 12 13 [3]
7 8 9 10 11 12 13 14 15 16 [7]
14 15 16 17 18 19 [4]
subtract 7 8 from 7 8 9
3 [0]
4 [1]
5 [1]
6 [1]
9 [1]
1 2 7 8 9 [3]
1 2 7 8 9 10 11 12 13 [6]
7 8 9 10 11 [3]
8 9 10 11 12 13 [5]
7 8 [1]
10 11 12 13 [3]
7 8 9 10 11 12 13 14 15 16 [7]
14 15 16 17 18 19 [4]
subtract 9 from all the sets that contain it.
3 [0]
4 [1]
5 [1]
6 [1]
9 [1]
1 2 7 8 [2]
1 2 7 8 10 11 12 13 [5]
7 8 10 11 [2]
8 10 11 12 13 [4]
7 8 [1]
10 11 12 13 [3]
7 8 10 11 12 13 14 15 16 [6]
14 15 16 17 18 19 [4]
subtract 7 8 from any sets that contain both.
3 [0]
4 [1]
5 [1]
6 [1]
9 [1]
1 2 [1]
1 2 10 11 12 13 [4]
10 11 [1]
8 10 11 12 13 [4]
7 8 [1]
10 11 12 13 [3]
10 11 12 13 14 15 16 [5]
14 15 16 17 18 19 [4]
subtract 1 2 from 1 2 10 11 12 13. The result, 10 11 12 13 [3], duplicates a set we already have, so keep just one copy.
3 [0]
4 [1]
5 [1]
6 [1]
9 [1]
1 2 [1]
10 11 [1]
8 10 11 12 13 [4]
7 8 [1]
10 11 12 13 [3]
10 11 12 13 14 15 16 [5]
14 15 16 17 18 19 [4]
subtract 10 11 from any sets that contain both.
3 [0]
4 [1]
5 [1]
6 [1]
9 [1]
1 2 [1]
10 11 [1]
8 12 13 [3]
7 8 [1]
12 13 [2]
12 13 14 15 16 [4]
14 15 16 17 18 19 [4]
8 12 13 has three liars, so decompose them into individual sets, and subtract them from any other sets that contain them.
1 2 [1]
3 [0]
4 [1]
5 [1]
6 [1]
7 [0]
8 [1]
9 [1]
10 11 [1]
12 [1]
13 [1]
14 15 16 [2]
14 15 16 17 18 19 [4]
subtract 14 15 16 from 14 15 16 17 18 19.
1 2 [1]
3 [0]
4 [1]
5 [1]
6 [1]
7 [0]
8 [1]
9 [1]
10 11 [1]
12 [1]
13 [1]
14 15 16 [2]
17 18 19 [2]
Our resulting sets are all disjoint from one another. If we union them together, like so:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 [13]
we can see that the number of liars from 1 to 19 is 13.
This technique doesn't totally solve the problem in all cases. For example, in the shorter of your two sample inputs, this technique does literally nothing. However, for larger problems it does decompose your sets into more modular forms, which I expect would make brute forcing easier/faster. For instance, in the larger sample, we have decomposed the problem space into exactly two possibilities:
1. there are 13 liars among soldiers 1-19, and Soldier 20 is not a liar.
2. there are 13 liars among soldiers 1-19, and Soldier 20 is a liar.
We can easily evaluate these two cases to determine that the minimum liar count is 13, and the maximum is 14. This is much faster than trying all 2^20 combinations of liars and nonliars.
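The three reduction principles listed at the start of this answer can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `reduce_facts` is made up, the rules are applied in arbitrary order until nothing changes, and the input is assumed to be satisfiable. The sets that remain may differ in detail from a reduction done by hand.

```python
def reduce_facts(facts):
    """Apply the reduction rules until no rule applies any more.

    facts is a list of (A, B, C): soldiers A..B contain exactly C liars.
    Returns a set of (frozenset_of_soldiers, liar_count) pairs.
    """
    known = {(frozenset(range(a, b + 1)), c) for a, b, c in facts}
    changed = True
    while changed:
        changed = False
        # Rules 1 and 2: a set with no liars, or with only liars,
        # decomposes into single-soldier sets.
        for members, liars in list(known):
            if len(members) > 1 and liars in (0, len(members)):
                known.discard((members, liars))
                each = 0 if liars == 0 else 1
                known.update((frozenset([m]), each) for m in members)
                changed = True
        # Rule 3: if A is a proper subset of B, remove A's soldiers
        # and A's liars from B.
        for small in list(known):
            for big in list(known):
                if small in known and big in known and small[0] < big[0]:
                    known.discard(big)
                    known.add((big[0] - small[0], big[1] - small[1]))
                    changed = True
    return known


sample = [(3, 8, 4), (1, 9, 6), (1, 13, 9), (5, 11, 5), (4, 19, 12),
          (8, 13, 5), (4, 8, 4), (7, 9, 2), (10, 13, 3), (7, 16, 7),
          (14, 19, 4)]
for members, liars in sorted(reduce_facts(sample), key=lambda f: min(f[0])):
    print(sorted(members), liars)
```

Every change strictly shrinks some set, so the loop terminates; what it does not guarantee is that the remaining sets are disjoint, which is why brute force may still be needed afterwards.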
You can easily formulate this as an Integer Linear Program. Since the constraint matrix is totally unimodular, it can be quickly solved by any ILP solver.
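One way to exploit this structure without an ILP solver is to work with prefix sums. Let S[i] be the number of liars among soldiers 1..i, with S[0] = 0. Every piece of information becomes S[B] - S[A-1] = C, and each soldier contributes 0 <= S[i] - S[i-1] <= 1. These are all difference constraints, so the largest and smallest feasible S[N] are shortest-path distances in the constraint graph. The sketch below swaps in Bellman-Ford for the ILP solver; the function name `liar_range` is illustrative, and the input is assumed to be satisfiable, as the problem guarantees.

```python
def liar_range(n, facts):
    """Return (Lmin, Lmax) for facts given as (A, B, C) triples."""
    # An edge (u, v, w) encodes the constraint S[v] - S[u] <= w.
    edges = []
    for i in range(1, n + 1):
        edges.append((i - 1, i, 1))    # soldier i adds at most one liar
        edges.append((i, i - 1, 0))    # ... and at least zero
    for a, b, c in facts:
        edges.append((a - 1, b, c))    # S[b] - S[a-1] <= c
        edges.append((b, a - 1, -c))   # S[b] - S[a-1] >= c

    def shortest(src, dst):
        # Plain Bellman-Ford: n + 1 nodes, so n relaxation rounds suffice.
        dist = [float("inf")] * (n + 1)
        dist[src] = 0
        for _ in range(n):
            for u, v, w in edges:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
        return dist[dst]

    # max(S[n] - S[0]) is dist(0 -> n); min(S[n] - S[0]) is -dist(n -> 0).
    return -shortest(n, 0), shortest(0, n)


print(liar_range(3, [(1, 2, 1), (2, 3, 1)]))  # (1, 2)
```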
Ok, how about re-casting the problem as probability/distribution estimation
This is very similar to "inverse problems" (e.g. inferring a probability distribution from known averages), which methods like MAXENT (Maximum Entropy) solve very well (e.g. http://books.google.gr/books?id=Kk6SyQ0AmjsC&pg=PA35&lpg=PA35&dq=MAXENT+inference&source=bl&ots=W4kVjXRpe7&sig=IzjnOVT0FQJtIXSkeFssNxolLh4&hl=el&sa=X&ei=nxJkU-LUHMmkPciigZAH&ved=0CGcQ6AEwCDgK#v=onepage&q=MAXENT%20inference&f=false)
(plus it is nice to be able to connect seemingly strange fields to the underlying physical reality)
Ornithology?? :)
When does the quicksort algorithm take O(n^2) time?
Quicksort works by taking a pivot, then putting all the elements lower than that pivot on one side and all the higher elements on the other; it then recursively sorts the two sub groups in the same way (all the way down until everything is sorted.) Now if you pick the worst pivot each time (the highest or lowest element in the list) you'll only have one group to sort, with everything in that group other than the original pivot that you picked. This in essence gives you n groups that each need to be iterated through n times, hence the O(n^2) complexity.
The most common reason for this occurring is if the pivot is chosen to be the first or last element in the list in the quicksort implementation. For unsorted lists this is just as valid as any other, however for sorted or nearly sorted lists (which occur quite commonly in practice) this is very likely to give you the worst case scenario. This is why all half-decent implementations tend to take a pivot from the centre of the list.
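The effect is easy to measure by counting comparisons. The sketch below is a simplified, not in-place quicksort written only for counting; the helper name `comparisons` and the two pivot-choosing lambdas are illustrative.

```python
def comparisons(data, choose):
    """Count element-vs-pivot comparisons made by a simple quicksort.

    choose(length) returns the index of the pivot within the current list.
    """
    if len(data) <= 1:
        return 0
    k = choose(len(data))
    pivot = data[k]
    rest = data[:k] + data[k + 1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    # Every remaining element is compared with the pivot once.
    return len(rest) + comparisons(smaller, choose) + comparisons(larger, choose)


already_sorted = list(range(20))
first = comparisons(already_sorted, lambda n: 0)        # first element as pivot
centre = comparisons(already_sorted, lambda n: n // 2)  # centre element as pivot
print(first, centre)  # 190 54
```

With the first element as pivot, a sorted list of 20 items costs 19 + 18 + ... + 1 = 190 comparisons, which is n(n-1)/2; the centre pivot splits the same input evenly and needs far fewer.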
There are modifications to the standard quicksort algorithm to avoid this edge case - one example is the dual-pivot quicksort that was integrated into Java 7.
In short, Quicksort for sorting an array lowest element first works like this:
1. Choose a pivot element
2. Presort array, such that all elements smaller than the pivot are on the left side
3. Recursively do step 1. and 2. for the left side and the right side
Ideally, you would want a pivot element that partitions the sequence in two equally long subsequences but this is not so easy.
There are different schemes for choosing the pivot element. Early versions just took the leftmost element. In the worst case, the pivot element will always be the lowest element of the current range.
Leftmost element is pivot
In this case it is easy to see that the worst case is a monotonically increasing array:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Rightmost element is pivot
Similarly, when choosing the rightmost element the worst case will be a decreasing sequence.
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Center element is pivot
One possible remedy for the worst case on presorted arrays is to use the center element (or slightly left of center if the sequence is of even length). The worst case then becomes considerably more exotic. It can be constructed by modifying the Quicksort algorithm to set the array elements corresponding to the currently selected pivot element to a monotonically increasing value. I.e. we know the first pivot is the center, so the center must be the lowest value, e.g. 0. Next it gets swapped to the leftmost position, i.e. the leftmost value is now in the center and would be the next pivot element, so it must be 1. Now, we can already guess that the array would look like this:
1 ? ? 0 ? ? ?
Here is the C++ code for the modified Quicksort to generate a worst sequence:
// g++ -std=c++11 worstCaseQuicksort.cpp && ./a.out
#include <algorithm> // swap
#include <iostream>
#include <vector>
#include <numeric>   // iota

int main( void )
{
    std::vector<int> v(20); /**< will hold the worst case later */

    /* p basically saves the indices of what was the initial position of the
     * elements of v. As they get swapped around by Quicksort p becomes a
     * permutation */
    auto p = v;
    std::iota( p.begin(), p.end(), 0 );

    /* in the worst case we need to work on v.size() sequences, because
     * the initial sequence is always split after the first element */
    for ( auto i = 0u; i < v.size(); ++i )
    {
        /* i can be interpreted as:
         *   - subsequence starting index
         *   - current minimum value, if we start at 0 */
        /* note that in the last step iPivot == v.size()-1 */
        auto const iPivot = ( v.size()-1 + i )/2;
        v[ p[ iPivot ] ] = i;
        std::swap( p[ iPivot ], p[i] );
    }

    for ( auto x : v ) std::cout << " " << x;
}
The result:
0
0 1
1 0 2
2 0 1 3
1 3 0 2 4
4 2 0 1 3 5
1 5 3 0 2 4 6
4 2 6 0 1 3 5 7
1 5 3 7 0 2 4 6 8
8 2 6 4 0 1 3 5 7 9
1 9 3 7 5 0 2 4 6 8 10
6 2 10 4 8 0 1 3 5 7 9 11
1 7 3 11 5 9 0 2 4 6 8 10 12
10 2 8 4 12 6 0 1 3 5 7 9 11 13
1 11 3 9 5 13 7 0 2 4 6 8 10 12 14
8 2 12 4 10 6 14 0 1 3 5 7 9 11 13 15
1 9 3 13 5 11 7 15 0 2 4 6 8 10 12 14 16
16 2 10 4 14 6 12 8 0 1 3 5 7 9 11 13 15 17
1 17 3 11 5 15 7 13 9 0 2 4 6 8 10 12 14 16 18
10 2 18 4 12 6 16 8 14 0 1 3 5 7 9 11 13 15 17 19
1 11 3 19 5 13 7 17 9 15 0 2 4 6 8 10 12 14 16 18 20
16 2 12 4 20 6 14 8 18 10 0 1 3 5 7 9 11 13 15 17 19 21
1 17 3 13 5 21 7 15 9 19 11 0 2 4 6 8 10 12 14 16 18 20 22
12 2 18 4 14 6 22 8 16 10 20 0 1 3 5 7 9 11 13 15 17 19 21 23
1 13 3 19 5 15 7 23 9 17 11 21 0 2 4 6 8 10 12 14 16 18 20 22 24
There is order in this. The right side is just increments of two starting with zero. The left side also has an order. Let's format the left side for the 73 element long worst case sequence nicely using Ascii art:
  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
------------------------------------------------------------------------------------------------------------
  1     3     5     7     9    11    13    15    17    19    21    23    25    27    29    31    33    35
    37          39          41          43          45          47          49          51          53
          55                      57                      59                      61                      63
                      65                                              67
                                              69
                                                                                              71
The header is the element index. In the first row numbers starting from 1 and increasing by 2 are given to every 2nd element. In the second row the same is done to every 4th element, in the 3rd row numbers are assigned to every 8th element and so on. In this case the first value to be written in the i-th row is at index 2^i-1, but for certain lengths this looks a tad different.
The resulting structure is reminiscent of an inverted binary tree whose nodes are labeled bottom-up starting from the leaves.
Median of leftmost, center and rightmost elements is pivot
Another way is to use the median of the leftmost, the center and the rightmost element. In this case the worst that can happen is that one subsequence (w.l.o.g. the left one) has length 2, not just length 1 as in the examples above. We also assume that the rightmost value is always the highest of the three median candidates, which means it is the highest of all values. Making adjustments in the program above, we now have this:
auto p = v;
std::iota( p.begin(), p.end(), 0 );
auto i = 0u;
for ( ; i < v.size(); i+=2 )
{
    auto const iPivot0 = i;
    auto const iPivot1 = ( i + v.size()-1 )/2;
    v[ p[ iPivot1 ] ] = i+1;
    v[ p[ iPivot0 ] ] = i;
    /* for odd sizes the last step has i+1 == v.size(), so skip the swap
     * there instead of reading past the end of p */
    if ( i+1 < v.size() )
        std::swap( p[ iPivot1 ], p[i+1] );
}
if ( v.size() > 0 && i == v.size() )
    v[ v.size()-1 ] = i-1;
The generated sequences are:
0
0 1
0 1 2
0 1 2 3
0 2 1 3 4
0 2 1 3 4 5
0 4 2 1 3 5 6
0 4 2 1 3 5 6 7
0 4 2 6 1 3 5 7 8
0 4 2 6 1 3 5 7 8 9
0 8 2 6 4 1 3 5 7 9 10
0 8 2 6 4 1 3 5 7 9 10 11
0 6 2 10 4 8 1 3 5 7 9 11 12
0 6 2 10 4 8 1 3 5 7 9 11 12 13
0 10 2 8 4 12 6 1 3 5 7 9 11 13 14
0 10 2 8 4 12 6 1 3 5 7 9 11 13 14 15
0 8 2 12 4 10 6 14 1 3 5 7 9 11 13 15 16
0 8 2 12 4 10 6 14 1 3 5 7 9 11 13 15 16 17
0 16 2 10 4 14 6 12 8 1 3 5 7 9 11 13 15 17 18
0 16 2 10 4 14 6 12 8 1 3 5 7 9 11 13 15 17 18 19
0 10 2 18 4 12 6 16 8 14 1 3 5 7 9 11 13 15 17 19 20
0 10 2 18 4 12 6 16 8 14 1 3 5 7 9 11 13 15 17 19 20 21
0 16 2 12 4 20 6 14 8 18 10 1 3 5 7 9 11 13 15 17 19 21 22
0 16 2 12 4 20 6 14 8 18 10 1 3 5 7 9 11 13 15 17 19 21 22 23
0 12 2 18 4 14 6 22 8 16 10 20 1 3 5 7 9 11 13 15 17 19 21 23 24
Pseudorandom element with random seed 0 is pivot
The worst case sequences for center element and median-of-three look already pretty random, but in order to make Quicksort even more robust the pivot element can be chosen randomly. If the random sequence used is at least reproducible on every Quicksort run, then we can also construct a worst case sequence for that. We only have to adjust the iPivot = line in the first program, e.g. to:
srand(0); // you shouldn't use 0 as a seed
for ( auto i = 0u; i < v.size(); ++i )
{
    auto const iPivot = i + rand() % ( v.size() - i );
    [...]
The generated sequences are:
0
1 0
1 0 2
2 3 1 0
1 4 2 0 3
5 0 1 2 3 4
6 0 5 4 2 1 3
7 2 4 3 6 1 5 0
4 0 3 6 2 8 7 1 5
2 3 6 0 8 5 9 7 1 4
3 6 2 5 7 4 0 1 8 10 9
8 11 7 6 10 4 9 0 5 2 3 1
0 12 3 10 6 8 11 7 2 4 9 1 5
9 0 8 10 11 3 12 4 6 7 1 2 5 13
2 4 14 5 9 1 12 6 13 8 3 7 10 0 11
3 15 1 13 5 8 9 0 10 4 7 2 6 11 12 14
11 16 8 9 10 4 6 1 3 7 0 12 5 14 2 15 13
6 0 15 7 11 4 5 14 13 17 9 2 10 3 12 16 1 8
8 14 0 12 18 13 3 7 5 17 9 2 4 15 11 10 16 1 6
3 6 16 0 11 4 15 9 13 19 7 2 10 17 12 5 1 8 18 14
6 0 14 9 15 2 8 1 11 7 3 19 18 16 20 17 13 12 10 4 5
14 16 7 9 8 1 3 21 5 4 12 17 10 19 18 15 6 0 11 2 13 20
1 2 22 11 16 9 10 14 12 6 17 0 5 20 4 21 19 8 3 7 18 15 13
22 1 15 18 8 19 13 0 14 23 9 12 10 5 11 21 6 4 17 2 16 7 3 20
2 19 17 6 10 13 11 8 0 16 12 22 4 18 15 20 3 24 21 7 5 14 9 1 23
So how to check whether those sequences are correct?
Measure the time it takes to sort the sequences and plot it over the sequence length N. If the curve scales like O(N^2) instead of O(N log(N)), then these are indeed worst case sequences.
Adjust a correct Quicksort to give debug output about the subsequence lengths and/or the chosen pivot elements. One of the subsequences should always be of length 1 (or 2 for median-of-three). The chosen pivot elements printed should be increasing.
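A variant of the second check can be sketched in Python: port the center-pivot generator and count comparisons in a matching quicksort. The names `worst_case_center` and `quicksort_center` are illustrative, and the quicksort here is an assumption about the implementation being attacked: it swaps the center element to the front and then partitions the rest, which is the behaviour the generator models.

```python
def worst_case_center(n):
    """Python port of the center-pivot worst-case generator above."""
    v = [0] * n
    p = list(range(n))            # p[k] = original position of slot k
    for i in range(n):
        i_pivot = (n - 1 + i) // 2
        v[p[i_pivot]] = i         # the i-th pivot must be the i-th smallest
        p[i_pivot], p[i] = p[i], p[i_pivot]
    return v


def quicksort_center(data):
    """Sort a copy of data with a center pivot; return (sorted, comparisons)."""
    a = list(data)
    comparisons = 0
    stack = [(0, len(a) - 1)]     # explicit stack avoids deep recursion
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        mid = (lo + hi) // 2
        a[lo], a[mid] = a[mid], a[lo]      # move the pivot to the front
        pivot, i = a[lo], lo
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]          # pivot to its final place
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return a, comparisons


n = 21
result, count = quicksort_center(worst_case_center(n))
print(count, n * (n - 1) // 2)   # both 210 for this quicksort variant
```

A count equal to n(n-1)/2 means every partition peeled off exactly one element, which is the quadratic worst case.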
Getting a pivot equal to the lowest or highest number should also trigger the worst case scenario of O(n^2).
Different implementations of quicksort require different datasets to give a worst-case runtime. It depends on where the algorithm selects its pivot element.
And also, as Ghpst said, selecting the biggest or smallest number would give you a worst case.
If I remember correctly, quicksort normally uses a random element as the pivot to minimize the chance of hitting the worst case.
I think if the array is in reverse order, then that is the worst case when the pivot is the last element of the array.
The factors that contribute to the worst-case scenario of quicksort are as follows:
Worst case occurs when the subarrays are completely unbalanced
The worst case occurs when there are 0 elements in one subarray and n-1 elements in the other.
In other words, the worst-case running time of O(n^2) occurs, for example, when a quicksort that always picks the first or last element as the pivot is given an already sorted array (e.g. in decreasing order).
Consider the following example (values in vectors are target practice results and I'm trying to automagically sort by shooting score). We generate three vectors. We sort values in columns 1:20 in ascending order and rows in descending order based on out.tot column.
# Generate data
shooter1 <- round(runif(n = 20, min = 1, max = 10))
shooter2 <- round(runif(n = 20, min = 1, max = 10))
shooter3 <- round(runif(n = 20, min = 1, max = 10))
out <- data.frame(t(data.frame(shooter1, shooter2, shooter3)))
colnames(out) <- 1:ncol(out)
out.sort <- t(apply(out, 1, sort, na.last = FALSE))
out.tot <- apply(out , 1, sum)
colnames(out.sort) <- 1:ncol(out.sort)
out2 <- cbind(out.sort, out.tot)
out3 <- apply(out2, 2, sort, decreasing = TRUE, na.last = FALSE)
out2 has row names attached while out3 lost them. The only difference is that I used MARGIN = 2, which is probably the culprit (because it takes in column by column). I can match rows by hand, but is there a way I can keep row names in out3 from disappearing?
> out2
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 out.tot
shooter1 1 2 2 3 3 3 4 5 5 5 6 6 6 6 6 7 8 9 9 10 106
shooter2 1 3 3 3 3 4 4 4 5 5 5 5 5 6 7 8 8 9 9 10 107
shooter3 1 1 2 2 2 3 3 4 5 5 5 6 6 6 6 7 8 8 8 9 97
> out3
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 out.tot
[1,] 1 3 3 3 3 4 4 5 5 5 6 6 6 6 7 8 8 9 9 10 107
[2,] 1 2 2 3 3 3 4 4 5 5 5 6 6 6 6 7 8 9 9 10 106
[3,] 1 1 2 2 2 3 3 4 5 5 5 5 5 6 6 7 8 8 8 9 97
If I understand your example, going from out2 to out3 you are sorting each column independently, meaning that the values on row 1 may not all come from the data generated for shooter1. It makes sense, then, that the row names are dropped: row names are names of observations, and you are no longer keeping the data from one observation on one row.