How to sort an array in linear time and in place? - algorithm

question origin
Given an unsorted array of size n containing objects with ids of 0 … n-1, sort the array in place and in linear time. Assume that the objects contain large members such as binary data, so instantiating new copies of the objects is prohibitively expensive.
void linearSort(int* input, const int n) {
    for (int i = 0; i < n; i++) {
        while (input[i] != i) {
            // swap
            int swapPoint = input[i];
            input[i] = input[swapPoint];
            input[swapPoint] = swapPoint;
        }
    }
}
Is this linear? Does this sort work with any kind of array of ints? If so, why do we need quicksort anymore?

Despite the while loop nested inside the for loop, this sort is linear, O(n). Every pass through the while body moves one value to its final index (input[swapPoint] = swapPoint), so for any later i that already holds its own value the while loop does not run at all. Over the whole run the while body therefore executes at most n-1 times.
This implementation only works for arrays of ints with no duplicates whose values are exactly 0 to n-1. That is why Quicksort, at O(n log n) on average, is still relevant: it works with arbitrary, non-sequential values.
This can be easily tested by making the worst case:
input = new int[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};
and then using the following code:
int whileCount = 0;
for (int i = 0; i < n; i++)
{
    while (input[i] != i)
    {
        whileCount++;
        // swap
        int swapPoint = input[i];
        input[i] = input[swapPoint];
        input[swapPoint] = swapPoint;
    }
    Console.WriteLine("for: {0}, while: {1}", i, whileCount);
}
The output will be as follows:
for: 0, while: 9
for: 1, while: 9
for: 2, while: 9
for: 3, while: 9
for: 4, while: 9
for: 5, while: 9
for: 6, while: 9
for: 7, while: 9
for: 8, while: 9
for: 9, while: 9
So even in the worst case, where the while loop runs n-1 times in the first iteration of the for loop, you still only get n-1 iterations of the while loop for the entire process.
Further examples with random data:
{7, 1, 2, 4, 3, 5, 0, 6, 8, 9} => 2 on i=0, 1 on i=3 and nothing more. (total 3 while loop runs)
{7, 8, 2, 1, 0, 3, 4, 5, 6, 9} => 7 on i=0 and nothing more (total 7 while loop runs)
{9, 8, 7, 4, 3, 1, 0, 2, 5, 6} => 2 on i=0, 2 on i=1, 1 on i=2, 1 on i=3 (total 6 while loop runs)

Each swap puts input[i] at position swapPoint, which is exactly where it needs to go. In the following steps those elements are already in the right place, so the total number of swaps cannot exceed the size n.
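To check the swap counts quoted in this thread, here is a minimal Python sketch of the same idea. The function name place_by_id and the returned swap counter are my own additions for illustration, and the sketch assumes the input is a permutation of 0..n-1 (with duplicates or out-of-range values it would loop forever or raise an error).

```python
def place_by_id(ids):
    """Sort a permutation of 0..n-1 in place and return the number of swaps.

    Assumption: ids contains each value 0..n-1 exactly once.
    """
    swaps = 0
    for i in range(len(ids)):
        # Keep sending the value at slot i to its home slot until
        # slot i itself holds the value i.
        while ids[i] != i:
            target = ids[i]
            ids[i], ids[target] = ids[target], ids[i]
            swaps += 1
    return swaps


worst = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
print(place_by_id(worst), worst)  # 9 swaps, list is now 0..9
```

Each swap sends at least one value home for good, which is why the counter can never exceed n-1.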

Related

Partial Insertion Sort

Is it possible to sort only the first k elements from an array using insertion sort principles?
Because as the algorithm runs over the array, it will sort accordingly.
Since it is needed to check all the elements (to find out who is the smallest), it will eventually sort the whole thing.
Example:
Original array: {5, 3, 8, 1, 6, 2, 8, 3, 10}
Expected output for k = 3: {1, 2, 3, 5, 8, 6, 8, 3, 10} (Only the first k elements were sorted, the rest of the elements are not)
Such partial sorting is possible. The resulting method looks like a hybrid of selection sort (searching the tail of the array for the smallest element) and insertion sort (shifting elements, but without comparisons). The shifting preserves the order of the tail elements, although that was not asked for explicitly.
Ideone
void ksort(int a[], int n, int k)
{   int i, j, t;
    for (i = 0; i < k; i++)
    {   int min = i;
        for (j = i+1; j < n; j++)
            if (a[j] < a[min]) min = j;
        t = a[min];
        for (j = min; j > i; j--)
            a[j] = a[j-1];
        a[i] = t;
    }
}
Yes, it is possible. This will run in time O(k n) where n is the size of your array.
You are better off using heapsort. It will run in time O(n + k log(n)) instead. The heapify step is O(n), then each element extracted is O(log(n)).
A technical note: if you're clever, you'll build the heap backwards, rooted at the end of your array. Thinking of it as a tree, the root is the last element, and the children of the element i places from the end (the one at index n-i) are the elements at indices n-2i and n-2i-1. So take your array:
{5, 3, 8, 1, 6, 2, 8, 3, 10}
That is a tree like so:
10
    3
        2
            3
            5
        6
    8
        1
        8
When we heapify we get the tree:
1
    2
        3
            3
            5
        6
    8
        10
        8
Which is to say the array:
{5, 3, 8, 10, 6, 3, 8, 2, 1}
Each extraction now swaps the root (the last array element, which is the current minimum) into the next slot at the front of the array, then lets the element that was moved to the root "fall down the tree". Like this:
# swap
{1*, 3, 8, 10, 6, 3, 8, 2, 5*}
# the 5 compares with 8, 2 and swaps with the 2:
{1, 3, 8, 10, 6, 3, 8?, 5*, 2*}
# the 5 compares with 3, 6 and swaps with the 3:
{1, 3, 8, 10, 6?, 5*, 8, 3*, 2}
# The 5 compares with the 3 and swaps, note that 1 is now outside of the tree:
{1, 5*, 8, 10, 6, 3*, 8, 3, 2}
Which in an array-tree representation is:
{1}
2
    3
        3
            5
        6
    8
        10
        8
Repeat again and we get:
# Swap
{1, 2, 8, 10, 6, 3, 8, 3, 5}
# Fall
{1, 2, 8, 10, 6, 5, 8, 3, 3}
aka:
{1, 2}
3
    3
        5
        6
    8
        10
        8
And again:
# swap
{1, 2, 3, 10, 6, 5, 8, 3, 8}
# fall
{1, 2, 3, 10, 6, 8, 8, 5, 3}
or
{1, 2, 3}
3
    5
        8
        6
    8
        10
And so on.
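For anyone who wants to replay the walkthrough above, here is a rough Python sketch of the backwards heap. The names partial_heap_sort and sift_down are my own, and this is only one way to lay it out: node i (counting from 1) is assumed to live at index n-i, so the root is the last array element.

```python
def partial_heap_sort(a, k):
    """Put the k smallest elements of a, in order, at the front (in place)."""
    n = len(a)

    def sift_down(i, size):
        # Node i (1-based) is stored at a[n - i]; its children are nodes
        # 2i and 2i+1. Only nodes 1..size belong to the heap.
        while True:
            smallest = i
            for child in (2 * i, 2 * i + 1):
                if child <= size and a[n - child] < a[n - smallest]:
                    smallest = child
            if smallest == i:
                return
            a[n - i], a[n - smallest] = a[n - smallest], a[n - i]
            i = smallest

    # Heapify: O(n).
    for i in range(n // 2, 0, -1):
        sift_down(i, n)

    # Extract the minimum k times: O(log n) each.
    for out in range(min(k, n)):
        # The root a[n-1] is the minimum of a[out:]; move it to slot out.
        a[out], a[n - 1] = a[n - 1], a[out]
        sift_down(1, n - out - 1)
    return a


print(partial_heap_sort([5, 3, 8, 1, 6, 2, 8, 3, 10], 3))
# [1, 2, 3, 10, 6, 8, 8, 5, 3]
```

Note that, unlike the shifting approach, this does not keep the tail in its original order; the tail is whatever remains of the heap.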
Just in case anyone needs this in the future, I came up with a solution that is "pure" in the sense of not being a hybrid between the original Insertion sort and some other sorting algorithm.
void partialInsertionSort(int A[], int n, int k){
    int i, j, aux, start;
    for(i = 1; i < n; i++){
        aux = A[i];
        if (i > k-1){
            start = k - 1;
            // A[0..k] holds the k+1 smallest elements seen so far, in order.
            if(A[i] < A[k])
                A[i] = A[k];   // keep the displaced element in the array
            else if (i > k)
                continue;      // not among the k+1 smallest: leave it in place
        }
        else start = i - 1;
        for(j = start; j >= 0 && A[j] > aux; j--)
            A[j+1] = A[j];
        A[j+1] = aux;
    }
}
Basically, this algorithm first sorts the first k+1 elements (indices 0 to k) as ordinary insertion sort would. From then on, A[k] acts like a pivot: a remaining element is inserted into the sorted prefix, just like in the original algorithm, only when it is smaller than this pivot, and the old pivot takes that element's former slot. Elements that are not smaller than the pivot are left where they are, so no element is overwritten.
Best case scenario: array is already ordered
Considering that comparison is the basic operation, then the number of comparisons is n (for k < n) → Θ(n)
Worst case scenario: array is ordered in reverse
Considering that comparison is the basic operation, then the number of comparisons is (2kn - k² - 3k + 2n)/2 → Θ(kn)
(Both take into account the comparison made to maintain the array order)

Longest Increasing subsequence length in NlogN.[Understanding the Algo]

Problem statement: the aim is to find the length of the longest increasing subsequence (not necessarily contiguous) in O(n log n) time.
Algorithm: I understood the algorithm as explained here:
http://www.geeksforgeeks.org/longest-monotonically-increasing-subsequence-size-n-log-n/.
What I did not understand is what gets stored in tail in the following code.
int LongestIncreasingSubsequenceLength(std::vector<int> &v) {
    if (v.size() == 0)
        return 0;
    std::vector<int> tail(v.size(), 0);
    int length = 1; // always points to the first empty slot in tail
    tail[0] = v[0];
    for (size_t i = 1; i < v.size(); i++) {
        if (v[i] < tail[0])
            // new smallest value
            tail[0] = v[i];
        else if (v[i] > tail[length-1])
            // v[i] extends the longest subsequence
            tail[length++] = v[i];
        else
            // v[i] will become the end candidate of an existing subsequence, or
            // throw away larger elements in all LIS, to make room for upcoming greater elements than v[i]
            // (and also, v[i] would have already appeared in one of the LIS; identify the location and replace it)
            tail[CeilIndex(tail, -1, length-1, v[i])] = v[i];
    }
    return length;
}
For example, if the input is {2,5,3,7,11,8,10,13,6},
the code gives the correct length, 6.
But tail ends up storing 2,3,6,8,10,13.
So I want to understand what is stored in tail. This will help me understand the correctness of this algorithm.
tail[i] is the minimal end value of the increasing subsequence (IS) of length i+1.
That's why tail[0] is the 'smallest value' and why we can increase the value of LIS (length++) when the current value is bigger than end value of the current longest sequence.
Let's assume that your example is the starting values of the input:
input = 2, 5, 3, 7, 11, 8, 10, 13, 6, ...
After 9 steps of our algorithm tail looks like this:
tail = 2, 3, 6, 8, 10, 13, ...
What does tail[2] mean? It means that the best IS of length 3 ends with tail[2], and we could build an IS of length 4 by extending it with a number that is bigger than tail[2]. In the list below, the elements of one such IS are marked with brackets:
tail[0] = 2, IS length = 1: [2], 5, 3, 7, 11, 8, 10, 13, 6
tail[1] = 3, IS length = 2: [2], 5, [3], 7, 11, 8, 10, 13, 6
tail[2] = 6, IS length = 3: [2], 5, [3], 7, 11, 8, 10, 13, [6]
tail[3] = 8, IS length = 4: [2], 5, [3], [7], 11, [8], 10, 13, 6
tail[4] = 10, IS length = 5: [2], 5, [3], [7], 11, [8], [10], 13, 6
tail[5] = 13, IS length = 6: [2], 5, [3], [7], 11, [8], [10], [13], 6
This presentation allows you to use binary search (note that defined part of tail is always sorted) to update tail and to find the result at the end of the algorithm.
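As a quick way to watch tail evolve, here is a small Python sketch of the same update rule using the standard bisect module. The function name lis_tails is my own; bisect_left plays the role that CeilIndex plays in the question's code (that helper is not shown there, so this is an assumption about what it does).

```python
from bisect import bisect_left


def lis_tails(values):
    """Return the tail array: tails[i] is the smallest end value of any
    strictly increasing subsequence of length i+1 seen so far."""
    tails = []
    for x in values:
        # First position whose value is >= x (tails is always sorted).
        pos = bisect_left(tails, x)
        if pos == len(tails):
            tails.append(x)   # x extends the longest subsequence
        else:
            tails[pos] = x    # x is a smaller end value for length pos+1
    return tails


tails = lis_tails([2, 5, 3, 7, 11, 8, 10, 13, 6])
print(len(tails), tails)  # 6 [2, 3, 6, 8, 10, 13]
```

The length of the returned list is the LIS length; the list itself is not necessarily a subsequence of the input.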
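tail does not store the Longest Increasing Subsequence (LIS) itself: tail[i] stores the smallest value that can end an increasing subsequence of length i+1, so only its length is guaranteed to match the LIS. In the example, 6 is the last input value, so 2, 3, 6, 8, 10, 13 is not a subsequence of the input.
It is updated following the explanation given in the link you provided. Check the example.
You want the minimum value at the first element of tail, which explains the first if statement.
The second if statement is there to allow the LIS to grow, since we want to maximize its length.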

Dynamic programming (Solve combination of scores) [duplicate]

It was one of my interview questions, and I could not think of a good way to get the number N. (Also, I did not understand the American football scoring system.)
6 points for the touchdown
1 point for the extra point (kicked)
2 points for a safety or a conversion (extra try after a touchdown)
3 points for a field goal
What would be an efficient algorithm to get all combinations of point-accumulations necessary to get a certain score N?
Assuming here you are looking for a way to get number of possibilities and not the actual possibilities.
First let's find a recursive function:
f(n) = (f(n-6) >= 0? f(n-6) : 0) + (f(n-1) >= 0 ? f(n-1) : 0) + (f(n-2) >= 0 ? f(n-2) : 0) + (f(n-3) >= 0 ? f(n-3) : 0)
base: f(0) = 1 and f(n) = -infinity [n<0]
The idea behind it is: You can always get to 0, by a no scoring game. If you can get to f(n-6), you can also get to f(n), and so on for each possibility.
Using the above formula one can easily create a recursive solution.
Note that you can even use dynamic programming with it: initialize a table over indexes [-5,n], set f[0] = 1 and f[-1] = f[-2] = f[-3] = f[-4] = f[-5] = -infinity, and iterate over indexes [1,n] to compute the number of possibilities based on the recursive formula above.
EDIT:
I just realized that a simplified version of the above formula could be:
f(n) = f(n-6) + f(n-1) + f(n-2) + f(n-3)
and base will be: f(0) = 1, f(n) = 0 [n<0]
The two formulas will yield exactly the same result.
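A minimal sketch of the simplified recurrence in Python, memoized so that each f(n) is computed once. The names count_sequences and POINTS are mine; note that this counts ordered sequences of scoring plays, so 2 then 3 is counted separately from 3 then 2.

```python
from functools import lru_cache

POINTS = (1, 2, 3, 6)  # extra point, safety/conversion, field goal, touchdown


@lru_cache(maxsize=None)
def count_sequences(n):
    """f(n) = f(n-6) + f(n-1) + f(n-2) + f(n-3), with f(0) = 1, f(n<0) = 0."""
    if n < 0:
        return 0
    if n == 0:
        return 1
    return sum(count_sequences(n - p) for p in POINTS)


print([count_sequences(n) for n in range(7)])  # [1, 1, 2, 4, 7, 13, 25]
```

For very large n the recursion depth becomes a problem in Python, which is one reason to prefer a bottom-up table.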
This is identical to the coin change problem, apart from the specific numbers used. See this question for a variety of answers.
You could use a dynamic programming loop from 0 to n. Here is some pseudocode:
results[0] = 1   (every other entry starts at 0)
for i from 0 to n-1 :
    for each p in {1, 2, 3, 6} :
        if i+p <= n : results[i+p] += results[i]
(the answer is results[n])
This way the complexity is O(N), instead of the exponential complexity you get if you compute recursively by subtracting from the final score, like a naive computation of the Fibonacci series.
I hope my explanation is understandable enough..
I know this question is old, but all of the solutions I see help calculate the number of scoring permutations rather than the number of scoring combinations. (So I think either something like this should be an answer or the question title should be changed.)
Some code such as the following (which could then be converted into a dp) will calculate the number of possible combinations of different scores:
int getScoreCombinationCount(int score, int scoreVals[], int scoreValIndex) {
    if (scoreValIndex < 0)
        return 0;
    if (score == 0)
        return 1;
    if (score < 0)
        return 0;
    return getScoreCombinationCount(score - scoreVals[scoreValIndex], scoreVals, scoreValIndex) +
           getScoreCombinationCount(score, scoreVals, scoreValIndex - 1);
}
This solution, implemented based on a solution in the book Elements of Programming Interviews seems to be correct for counting the number of 'combinations' (no duplicate sets) for a set of score points.
For example, if points = {7, 3, 2}, there are 2 combinations for a total score of 7:
{7} and {3, 2, 2}.
public static int ScoreCombinationCount(int total, int[] points)
{
    int[] combinations = new int[total + 1];
    combinations[0] = 1;
    for (var i = 0; i < points.Length; i++)
    {
        int point = points[i];
        for (var j = point; j <= total; j++)
        {
            combinations[j] += combinations[j - point];
        }
    }
    return combinations[total];
}
I am not sure I understand the logic though. Can someone explain?
The answer to this question depends on whether or not you allow the total number of combinations to include duplicate unordered combinations.
For example, in American football, you can score 2, 3, or 7 points (yes, I know you can miss the extra point on a touchdown, but let's ignore 1 point).
If your target N is 5, you can reach it with {2, 3} or {3, 2}. If you count that as two combinations, then the dynamic programming solution by @amit will work. However, if you count those two as one combination, then the iterative solution by @Maximus will work.
Below is some Java code, where findWays() corresponds to counting all possible combinations, including duplicates, and findUniqueWays() corresponds to counting only unique combinations.
// Counts the number of non-unique ways to reach N.
// Note that this algorithm counts {1,2} separately from {2,1}
// Applies a recurrence relationship. For example, with values={1,2}:
// cache[i] = cache[i-1] + cache[i-2]
public static long findWays(int N, int[] values) {
    long cache[] = new long[N+1];
    cache[0] = 1;
    for (int i = 1; i <= N; i++) {
        cache[i] = 0;
        for (int value : values) {
            if (value <= i)
                cache[i] += cache[i-value];
        }
    }
    return cache[N];
}
// Counts the number of unique ways to reach N.
// Note that this counts truly unique combinations: {1,2} is the same as {2,1}
public static long findUniqueWays(int N, int[] values) {
    long [] cache = new long[N+1];
    cache[0] = 1;
    for (int i = 0; i < values.length; i++) {
        int value = values[i];
        for (int j = value; j <= N; j++) {
            cache[j] += cache[j-value];
        }
    }
    return cache[N];
}
Below is a test case where the possible points are {2,3,7}.
private static void testFindUniqueWaysFootball() {
    int[] points = new int[]{2, 3, 7}; // Ways of scoring points.
    int[] NValues = new int[]{5, 7, 10}; // Total score.
    long result = -1;
    for (int N : NValues) {
        System.out.printf("\nN = %d points\n", N);
        result = findWays(N, points);
        System.out.printf("findWays() result = %d\n", result);
        result = findUniqueWays(N, points);
        System.out.printf("findUniqueWays() result = %d\n", result);
    }
}
The output is:
N = 5 points
findWays() result = 2
findUniqueWays() result = 1
N = 7 points
findWays() result = 4
findUniqueWays() result = 2
N = 10 points
findWays() result = 9
findUniqueWays() result = 3
The results above show that to reach N=7 points, there are 4 non-unique ways to do so (those ways are {7}, {2,2,3}, {2,3,2}, {3,2,2}). However, there are only 2 unique ways (those ways are {7} and {2,2,3}).
Below is a Python program to count all combinations ignoring the combination order (e.g. 2,3,6 and 3,2,6 are considered one combination). This is a dynamic programming solution that runs in O(n) time for a fixed set of scores (O(n · m) for m scores). Scores are 2,3,6,7.
We traverse from row score 2 to row score 7 (4 rows). Row score 2 contains the count if we only consider score 2 in calculating the number of combinations. Row score 3 produces each column by taking the count in row score 2 for the same final score plus the previous 3 count in its own row (current position minus 3). Row score 6 uses row score 3, which contains counts for both 2,3 and adds in the previous 6 count (current position minus 6). Row score 7 uses row score 6, which contains counts for row scores 2,3,6 plus the previous 7 count (current position minus 7).
For example, numbers[1][12] = numbers[0][12] + numbers[1][9] (9 = 12-3) which results in 3 = 1 + 2; numbers[3][12] = numbers[2][12] + numbers[3][5] (5 = 12-7) which results in 7 = 6 + 1;
def cntMoney(num):
    mSz = len(scores)
    numbers = [[0]*(1+num) for _ in range(mSz)]
    for mI in range(mSz): numbers[mI][0] = 1
    for mI,m in enumerate(scores):
        for i in range(1,num+1):
            numbers[mI][i] = numbers[mI][i-m] if i >= m else 0
            if mI != 0: numbers[mI][i] += numbers[mI-1][i]
        print('m,numbers',m,numbers[mI])
    return numbers[mSz-1][num]
scores = [2,3,6,7]
num = 12
print('score,combinations',num,cntMoney(num))
output:
('m,numbers', 2, [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1])
('m,numbers', 3, [1, 0, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2, 3])
('m,numbers', 6, [1, 0, 1, 1, 1, 1, 3, 1, 3, 3, 3, 3, 6])
('m,numbers', 7, [1, 0, 1, 1, 1, 1, 3, 2, 3, 4, 4, 4, 7])
('score,combinations', 12, 7)
Below is a Python program to count all ordered combinations (e.g. 2,3,6 and 3,2,6 are considered two combinations). This is a dynamic programming solution that runs in O(n) time for a fixed set of scores. We build up from the start, adding the combinations calculated from previous score numbers, for each of the scores (2,3,6,7).
'vals[i] += vals[i-s]' means the current value equals the addition of the combinations from the previous values for the given scores. For example, for column vals[12] = the addition of scores 2,3,6,7: 26 = 12+9+3+2 (i-s = 10,9,6,5).
def allSeq(num):
    vals = [0]*(num+1)
    vals[0] = 1
    for i in range(num+1):
        for s in scores:
            if i-s >= 0: vals[i] += vals[i-s]
    print(vals)
    return vals[num]
scores = [2,3,6,7]
num = 12
print('num,seqsToNum',num,allSeq(num))
Output:
[1, 0, 1, 1, 1, 2, 3, 4, 6, 9, 12, 18, 26]
('num,seqsToNum', 12, 26)
Attached is a program that prints the sequences for each score up to the given final score.
def allSeq(num):
    seqs = [[] for _ in range(num+1)]
    vals = [0]*(num+1)
    vals[0] = 1
    for i in range(num+1):
        for sI,s in enumerate(scores):
            if i-s >= 0:
                vals[i] += vals[i-s]
                if i == s: seqs[i].append(str(s))
                else:
                    for x in seqs[i-s]:
                        seqs[i].append(x + '-' + str(s))
    print(vals)
    for sI,seq in enumerate(seqs):
        print('num,seqsSz,listOfSeqs',sI,len(seq),seq)
    return vals[num],seqs[num]
scores = [2,3,6,7]
num = 12
combos,seqs = allSeq(num)
Output:
[1, 0, 1, 1, 1, 2, 3, 4, 6, 9, 12, 18, 26]
('num,seqsSz,listOfSeqs', 0, 0, [])
('num,seqsSz,listOfSeqs', 1, 0, [])
('num,seqsSz,listOfSeqs', 2, 1, ['2'])
('num,seqsSz,listOfSeqs', 3, 1, ['3'])
('num,seqsSz,listOfSeqs', 4, 1, ['2-2'])
('num,seqsSz,listOfSeqs', 5, 2, ['3-2', '2-3'])
('num,seqsSz,listOfSeqs', 6, 3, ['2-2-2', '3-3', '6'])
('num,seqsSz,listOfSeqs', 7, 4, ['3-2-2', '2-3-2', '2-2-3', '7'])
('num,seqsSz,listOfSeqs', 8, 6, ['2-2-2-2', '3-3-2', '6-2', '3-2-3', '2-3-3', '2-6'])
('num,seqsSz,listOfSeqs', 9, 9, ['3-2-2-2', '2-3-2-2', '2-2-3-2', '7-2', '2-2-2-3', '3-3-3', '6-3', '3-6', '2-7'])
('num,seqsSz,listOfSeqs', 10, 12, ['2-2-2-2-2', '3-3-2-2', '6-2-2', '3-2-3-2', '2-3-3-2', '2-6-2', '3-2-2-3', '2-3-2-3', '2-2-3-3', '7-3', '2-2-6', '3-7'])
('num,seqsSz,listOfSeqs', 11, 18, ['3-2-2-2-2', '2-3-2-2-2', '2-2-3-2-2', '7-2-2', '2-2-2-3-2', '3-3-3-2', '6-3-2', '3-6-2', '2-7-2', '2-2-2-2-3', '3-3-2-3', '6-2-3', '3-2-3-3', '2-3-3-3', '2-6-3', '3-2-6', '2-3-6', '2-2-7'])
('num,seqsSz,listOfSeqs', 12, 26, ['2-2-2-2-2-2', '3-3-2-2-2', '6-2-2-2', '3-2-3-2-2', '2-3-3-2-2', '2-6-2-2', '3-2-2-3-2', '2-3-2-3-2', '2-2-3-3-2', '7-3-2', '2-2-6-2', '3-7-2', '3-2-2-2-3', '2-3-2-2-3', '2-2-3-2-3', '7-2-3', '2-2-2-3-3', '3-3-3-3', '6-3-3', '3-6-3', '2-7-3', '2-2-2-6', '3-3-6', '6-6', '3-2-7', '2-3-7'])

Algorithms for bucket sort

How can I bucket sort an array of integers that contains negative numbers?
And, what's the difference between bucket sort and counting sort?
Bucket sort for negative values
Using bucket sort with negative values simply requires mapping each element to a bucket proportional to its distance from the minimal value to be sorted.
For example, when using one bucket per value, the buckets for the following input would be:
input array: {4, 2, -2, 2, 4, -1, 0}
min = -2
bucket0: {-2}
bucket1: {-1}
bucket2: {0}
bucket3: {}
bucket4: {2, 2}
bucket5: {}
bucket6: {4, 4}
Suggested algorithm
#A: array to be sorted
#count: number of items in A
#max: maximal value in A
#min: minimal value in A
#bucketsCount: number of buckets to use (chosen by the implementation)
procedure BucketSort(A, count, max, min)
    #calculate the range of items in each bucket
    bucketRange = (max - min + 1) / bucketsCount
    #distribute the items to the buckets
    for each item in A:
        bucket[(item.value - min) / bucketRange].push(item)
    #sort each bucket and build the sorted array A
    index = 0
    for bucket in {0...bucketsCount}:
        sort(bucket)
        for each item in bucket:
            A[index] = item
            index++
C++ implementation
Notice the bucketRange which is proportional to the range between max and min
#include <iostream>
#include <stdio.h>
#include <vector>
#include <algorithm> // std::sort
#include <stdlib.h> // rand
#include <limits> // numeric_limits
using namespace std;
#define MAX_BUCKETS_COUNT (10) // choose this according to your space limitations
void BucketSort(int * arr, int count, int max, int min)
{
    if (count == 0 or max == min)
    {
        return;
    }
    // set the number of buckets to use
    int bucketsCount = std::min(count, MAX_BUCKETS_COUNT);
    vector<int> *buckets = new vector<int>[bucketsCount];
    // using this range we will distribute the items into the buckets
    double bucketRange = (((double)max - min + 1) / (bucketsCount));
    for (int i = 0; i < count; ++i)
    {
        int bucket = (int)((arr[i] - min) / bucketRange);
        buckets[bucket].push_back(arr[i]);
    }
    int index = 0;
    for (int i = 0; i < bucketsCount; ++i)
    {
        // here we sort each bucket in O(k log(k)), k being the number of items in the bucket
        sort(buckets[i].begin(), buckets[i].end());
        for (vector<int>::iterator iter = buckets[i].begin(); iter != buckets[i].end(); ++iter)
        {
            arr[index] = *iter;
            ++index;
        }
    }
    delete[] buckets;
}
Testing the code
int main ()
{
    const int items = 50; // const so that data[] is a fixed-size array
    int data[items];
    int shift = 15; // in order to get some negative values in the array
    int max = std::numeric_limits<int>::min();
    int min = std::numeric_limits<int>::max();
    printf("before sorting: ");
    for (int i = 0; i < items; ++i)
    {
        data[i] = rand() % items - shift;
        if (data[i] < min) min = data[i];
        if (data[i] > max) max = data[i];
        printf("%d ,", data[i]);
    }
    printf("\n");
    BucketSort(data, items, max, min);
    printf("after sorting: ");
    for (int i = 0; i < items; ++i)
    {
        printf("%d ,", data[i]);
    }
    printf("\n");
    return 0;
}
This is basically a link only answer but it gives you the information you need to formulate a good question.
Bucket Sort
Wikipedia's step 1, where you "Set up an array of initially empty buckets", will need to include buckets for negative numbers.
Counting Sort
"Compared to counting sort, bucket sort requires linked lists, dynamic arrays or a large amount of preallocated memory to hold the sets of items within each bucket, whereas counting sort instead stores a single number (the count of items) per bucket."
Bucket sort, or bin sort, is a sorting algorithm that works by distributing the elements of an array into a number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm, or by recursively applying the bucket sorting algorithm.
Steps:
Set up an array of initially empty "buckets".
Scatter: Go over the original array, putting each object in its bucket.
Sort each non-empty bucket.
Gather: Visit the buckets in order and put all elements back into the original array.
Bucket sort assumes that the input is drawn from a uniform distribution and has an average-case running time of O(n). The computational complexity estimates involve the number of buckets.
Worst case performance: O(n^2)
Best case performance: Omega(n+k)
Average case performance: Theta(n+k)
Worst case space complexity: O(n.k)
For implementation and pictographic understanding:
http://javaexplorer03.blogspot.in/2015/11/bucket-sort-or-bin-sort.html
This version uses one bucket per distinct value, which makes it closer to a counting sort: it needs an ordered dictionary with the unique values as the keys and their respective frequencies as the values. The first line of the function builds this dictionary and assigns it to k.
The second line returns a Python list, using a nested comprehension to output each key 'frequency' times in sorted key order; sum(..., []) flattens the resulting list of lists.
neglist = [-1, 4, 5, 6, 7, 3, 4, 3, 2, 5, 8, -2, 7, 8, 0, -3, 7, 3, 7, 3, 1, 15, 12, 4, 5, 6, 7, 3, 1, 15]
poslist = [4, 2, 7, 9, 12, 3, 7]
def bucket(k):
    k = dict((uni, k.count(uni)) for uni in list(set(k)))
    return sum(([key for i in range(k.get(key))] for key in sorted(k.keys())), [])
print("NegList: ", bucket(neglist))
print("PosList: ", bucket(poslist))
'''
NegList: [-3, -2, -1, 0, 1, 1, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 7, 7, 7, 8, 8, 12, 15, 15]
PosList: [2, 3, 4, 7, 7, 9, 12]
'''
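On the second part of the question (bucket sort versus counting sort), a counting sort keeps only a count per value instead of a container of items per bucket. Here is a minimal Python sketch that handles negative numbers with the same value - min offset used earlier in this thread; the function name counting_sort is my own, and it assumes plain integers with a reasonably small range between min and max.

```python
def counting_sort(values):
    """Return a sorted copy of a list of ints (negatives allowed)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    # One counter per possible value; index 0 corresponds to the minimum.
    counts = [0] * (hi - lo + 1)
    for v in values:
        counts[v - lo] += 1
    out = []
    for offset, c in enumerate(counts):
        out.extend([lo + offset] * c)
    return out


print(counting_sort([4, 2, -2, 2, 4, -1, 0]))  # [-2, -1, 0, 2, 2, 4, 4]
```

Because only counts are stored, this works for bare integers; if the items carry extra data, you need real buckets (or a stable, prefix-sum based counting sort) to keep the items themselves.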

Resources