Efficiently calculate next permutation of length k from n choices - algorithm

I need to efficiently calculate the next permutation of length k from n choices. Wikipedia lists a great algorithm for computing the next permutation of length n from n choices.
The best thing I can come up with is using that algorithm (or the Steinhaus–Johnson–Trotter algorithm), considering only the first k items of the list, and iterating again whenever all of the changes fall beyond that position.
Constraints:
The algorithm must calculate the next permutation given nothing more than
the current permutation. If it needs to generate a list of all permutations,
it will take up too much memory.
It must be able to compute a permutation of only length k of n (this is
where the other algorithm fails).
Non-constraints:
Don't care if it's in-place or not
I don't care if it's in lexicographical order, or any order for that matter
I don't care too much how efficiently it computes the next permutation,
within reason of course, it can't give me the next permutation by making a
list of all possible ones each time.

You can break this problem down into two parts:
1) Find all subsets of size k from a set of size n.
2) For each such subset, find all permutations of a subset of size k.
The referenced Wikipedia article provides an algorithm for part 2, so I won't repeat it here. The algorithm for part 1 is pretty similar. For simplicity, I'll describe it for "find all subsets of size k of the integers [0...n-1]".
1) Start with the subset [0...k-1]
2) To get the next subset, given a subset S:
2a) Find the smallest j such that j ∈ S ∧ j+1 ∉ S. If j == n-1, there is no next subset; we're done.
2b) The elements of S less than j form a sequence i...j-1 (since if any of those values were missing, j wouldn't be minimal). If i is not 0, replace these elements with 0...j-i-1 (that is, shift them down by i). Replace element j with element j+1.
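Steps 2a and 2b can be sketched in Python as follows. This is only an illustration of the subset step, not of the permutation step from the Wikipedia article. The function name next_subset and the convention of returning None after the last subset are my own assumptions, and it assumes k >= 1.

```python
def next_subset(subset, n):
    """Return the k-subset of range(n) that follows `subset`, or None if it is the last.

    Hypothetical helper: the name and the None convention are assumptions.
    """
    s = sorted(subset)
    members = set(s)
    # Step 2a: smallest j with j in S and j + 1 not in S.
    pos = next(p for p, j in enumerate(s) if j + 1 not in members)
    j = s[pos]
    if j == n - 1:
        return None  # S is {n-k, ..., n-1}; there is no next subset
    # Step 2b: the `pos` elements below j form a run; move it down to 0..pos-1,
    # then replace j with j + 1. Elements above j are left untouched.
    return list(range(pos)) + [j + 1] + s[pos + 1:]


# Usage: walk through all 2-subsets of {0, 1, 2, 3}, starting from [0, 1].
current = [0, 1]
while current is not None:
    print(current)
    current = next_subset(current, 4)
```

Each call needs only the current subset, so nothing beyond O(k) memory is kept between steps.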

Related

Find the size of max perfect subset of a given set of integers

Given a list of distinct integers >= 2, take any subset of it with size >= 2. A subset is called perfect if, after arranging its numbers in ascending order, it satisfies a[i]*a[i] = a[i+1] for every pair of adjacent elements. We have to return the size of the largest perfect subset.
My Thoughts:
One naive approach could be to choose an element one by one and see what's the size of the perfect subset it forms, and then we could simply return the max size. This will be computationally intensive.
Any ideas for an elegant solution?
This sounds like a variation on the Longest Increasing Subsequence problem.
You can sort the initial sequence (or set) of numbers in ascending order in an array Nums[1..N].
Let L[i] be the size of the largest perfect subset ending with the value at position i (i.e. Nums[i]).
Then L[i] = L[j] + 1 if we can find an index j such that Nums[j]^2 = Nums[i]. Otherwise, L[i] = 1. Because Nums is sorted, you can binary search the index j.
This gives an O(N log N) solution.
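A minimal sketch of that recurrence in Python, assuming the input really is a list of distinct integers >= 2. The function name is hypothetical, and returning 0 when no subset of size >= 2 exists is my own convention, since the question does not say what to return in that case.

```python
from bisect import bisect_left
from math import isqrt


def max_perfect_subset_size(nums):
    """Size of the largest perfect subset, or 0 if none has size >= 2 (assumed convention)."""
    nums = sorted(nums)
    # best[i] plays the role of L[i]: largest chain ending at nums[i].
    best = [1] * len(nums)
    for i, value in enumerate(nums):
        root = isqrt(value)
        if root * root != value:
            continue  # not a perfect square, so no predecessor can exist
        # Binary search for the square root among the smaller values.
        j = bisect_left(nums, root, 0, i)
        if j < i and nums[j] == root:
            best[i] = best[j] + 1
    longest = max(best, default=0)
    return longest if longest >= 2 else 0
```

For example, max_perfect_subset_size([2, 4, 16, 3, 9, 5]) finds the chain 2, 4, 16.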

count the number of subarrays in a given array with its average being k

Given an integer array a, and an integer k, we want to design an algorithm to count the number of subarrays with the average of that subarray being k. The most naive method is to traverse all possible subarrays and calculate the corresponding average. The time complexity of this naive method is O(n^2) where n is the length of a. I wonder whether it is possible to do better than O(n^2).
Usually for this kind of problem, one uses prefix sum together with a hashmap, but this technique does not seem to apply here.
Consider the prefix sum array of the input, and call it a (from here on, a denotes the prefix sums, not the original array).
You want to find all such pairs (i, j) that (a[j]-a[i])/(j-i) == k.
Now watch the hands:
(a[j]-a[i])/(j-i) == k
a[j]-a[i] == k*(j-i)
a[j]-a[i] == k*j-k*i
a[j]-k*j == a[i]-k*i
So if you subtract k*j from the jth element of the prefix sum array, you are left with the task of counting pairs of equal elements, which a hashmap does in linear time.
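Here is a short sketch of that idea in Python. The function name is made up for illustration. Instead of building the prefix sum array and then subtracting k*j, it adds x - k at each step, which gives the same transformed value.

```python
from collections import Counter


def count_subarrays_with_average(values, k):
    """Count non-empty contiguous subarrays of `values` whose average is exactly k."""
    seen = Counter({0: 1})  # transformed value of the empty prefix (j = 0)
    running = 0             # equals prefix_sum[j] - k * j
    count = 0
    for x in values:
        running += x - k
        # Every earlier prefix with the same transformed value closes a valid subarray.
        count += seen[running]
        seen[running] += 1
    return count
```

With integer inputs everything stays in integer arithmetic, so there is no floating-point comparison of averages.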

Finding the kth smallest element in a sequence where duplicates are compressed?

I've been asked to write a program to find the kth order statistic of a data set consisting of characters and their occurrences. For example, I have a data set consisting of
B,A,C,A,B,C,A,D
Here I have A with 3 occurrences, B with 2 occurrences, C with 2 occurrences and D with one occurrence. They can be grouped in pairs (character, number of occurrences), so, for example, we could represent the above sequence as
(A,3), (B,2), (C,2) and (D,1).
Assuming that k is the number of these pairs, I am asked to find the kth order statistic of the data set in O(n), where n is the number of pairs.
I thought I could sort the elements based on their number of occurrences and find the kth smallest element, but that won't work within the time bounds. Can I please have some help on the algorithm for this problem?
Assuming that you have access to a linear-time selection algorithm, here's a simple divide-and-conquer algorithm for solving the problem. I'm going to let k denote the total number of pairs and m be the index you're looking for.
If there's just one pair, return the key in that pair.
Otherwise:
Using a linear-time selection algorithm, find the median element. Let medFreq be its frequency.
Sum up the frequencies of the elements less than the median. Call this less. Note that the number of elements less than or equal to the median is less + medFreq.
If less < m ≤ less + medFreq, return the key in the median element.
Otherwise, if m ≤ less, recursively search for the mth element in the first half of the array.
Otherwise (m > less + medFreq), recursively search for the (m - less - medFreq)th element in the second half of the array.
The key insight here is that each iteration of this algorithm tosses out half of the pairs, so each recursive call is on an array half as large as the original array. This gives us the following recurrence relation:
T(k) = T(k / 2) + O(k)
Using the Master Theorem, this solves to O(k).
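The steps above can be sketched in Python. Note one substitution: this sketch picks a random pivot (quickselect style) instead of calling a true linear-time median selection, so it runs in expected rather than worst-case linear time. The function name is hypothetical, m is 1-indexed, and keys are assumed to be distinct and comparable.

```python
import random


def kth_in_compressed(pairs, m):
    """Return the m-th smallest element (1-indexed) of the multiset given as (key, count) pairs."""
    pairs = list(pairs)
    total = sum(count for _, count in pairs)
    if not 1 <= m <= total:
        raise IndexError("m is out of range")
    while True:
        # Random pivot stands in for the exact median (assumption, see above).
        pivot_key, pivot_freq = random.choice(pairs)
        smaller = [(key, count) for key, count in pairs if key < pivot_key]
        larger = [(key, count) for key, count in pairs if key > pivot_key]
        less = sum(count for _, count in smaller)
        if m <= less:
            pairs = smaller                # answer lies strictly below the pivot
        elif m <= less + pivot_freq:
            return pivot_key               # less < m <= less + pivot_freq
        else:
            m -= less + pivot_freq         # skip everything up to and including the pivot
            pairs = larger


# Usage with the data from the question: A,A,A,B,B,C,C,D
data = [("A", 3), ("B", 2), ("C", 2), ("D", 1)]
print(kth_in_compressed(data, 4))
```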

Given a set of n integers, list all possible subsets with sum>=k

Given an unsorted set of integers in the form of an array, find all possible subsets whose sum is greater than or equal to a constant integer k.
E.g., our set is {1,2,3} and k=2.
Possible subsets:
{2},
{3},
{1,2},
{1,3},
{2,3},
{1,2,3}
I can only think of a naive algorithm which lists all the subsets of the set and checks whether the sum of each subset is >= k or not, but it's an exponential algorithm, and listing all subsets requires O(2^N). Can I use dynamic programming to solve it in polynomial time?
Listing all the subsets is going to be still O(2^N) because in the worst case you may still have to list all subsets apart from the empty one.
Dynamic programming can help you count the number of sets that have sum >= K.
You go bottom-up, keeping track of how many subsets sum to each value in the range [1..K]. An approach like this will be O(N*K), which is only going to be feasible for small K.
The idea behind the dynamic programming solution is best illustrated with an example. Assume you know that, out of all the sets composed of the first i elements, t1 sum to 2 and t2 sum to 3. Let's say that the next element, i+1, is 4. Given all the existing sets, we can build all the new sets by either appending the element i+1 or leaving it out. If we leave it out, we get t1 subsets that sum to 2 and t2 subsets that sum to 3. If we append it, then we obtain t1 subsets that sum to 6 (2 + 4), t2 that sum to 7 (3 + 4), and one subset which contains just i+1 and sums to 4. That gives us the numbers of subsets that sum to (2,3,4,6,7) consisting of the first i+1 elements. We continue until N.
In pseudo-code this could look something like this:
int DP[N][K];   //DP[i][s] = number of non-empty subsets of set[0..i] with sum s, zero-initialized
int set[N];     //assumed to contain positive integers
//go through all elements in the set by index
for i in range[0..N-1]
    //count the one element subset consisting only of set[i] (only tracked if its sum is < K)
    if set[i] < K then DP[i][set[i]] = 1
    if (i == 0) continue;
    //case 1. build and count all subsets that don't contain element set[i]
    for k in range[1..K-1]
        DP[i][k] += DP[i-1][k]
    //case 2. build and count subsets that contain element set[i]
    for k in range[0..K-1]
        if k + set[i] >= K then break inner loop
        DP[i][k+set[i]] += DP[i-1][k]
//result is the number of all subsets - number of subsets with sum < K
//the -1 is for the empty subset
return 2^N - sum(DP[N-1][1..K-1]) - 1
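For anyone who wants something runnable, here is a rough Python version of the same counting idea. It is a variant, not a literal translation: it uses a single rolling array instead of the N-by-K table and keeps the empty subset in slot 0 instead of subtracting 1 at the end. The function name is made up, and it assumes positive integers and K >= 1.

```python
def count_subsets_with_sum_at_least(values, K):
    """Count non-empty subsets of positive integers `values` whose sum is >= K (K >= 1)."""
    # below[s] = number of subsets, including the empty one, with sum exactly s (s < K).
    below = [0] * K
    below[0] = 1
    for v in values:
        # Walk the sums downwards so each element is used at most once per subset.
        for s in range(K - 1, v - 1, -1):
            below[s] += below[s - v]
    # All subsets minus those with sum < K; the empty subset (sum 0) is among the latter.
    return 2 ** len(values) - sum(below)


# Usage with the example from the question: set {1, 2, 3}, k = 2.
print(count_subsets_with_sum_at_least([1, 2, 3], 2))
```

This only counts the subsets. Listing them is still exponential in the worst case.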
Can I use dynamic programming to solve it in polynomial time?
No. The problem is even harder than @amit (in the comments) mentions. Finding whether there exists a subset that sums to a specific k is the subset-sum problem, which is NP-hard. Instead, you are asking how many subsets sum to a specific k, which is in the much more difficult class #P. In addition, your exact problem is slightly more difficult, since you want not only to count but to enumerate all the possible subsets for k and for targets > k.
If k is 0, and every element of the set is positive, then you have no choice but to output every possible subset, so the lower bound for this problem is Ω(2^N) -- the time taken to produce the output.
Unless you know something more about the value k that you haven't told us, there's no faster general solution than to just check every subset.

Finding sub-array sum in an integer array

Given an array of n positive integers, there are n*(n+1)/2 sub-arrays, including single-element sub-arrays. Each sub-array has a sum S. Finding S for all sub-arrays is obviously O(n^2), as the number of sub-arrays is O(n^2). Many of the sums may be repeated. Is there any way to find the count of distinct sums (only the count, not the exact values) in O(n log n)?
I tried an approach but got stuck along the way. I iterated over the array from index 1 to n.
Say a is the given array. For each index i, a[i] is added to all the sums that end at a[i-1], and it also counts as a sum by itself. But a duplicate emerges if, among the sums that end at a[i-1], two sums differ by a[i]. That is, say sums Sp and Sq end at a[i-1] and their difference is a[i]. Then Sp + a[i] equals Sq, giving Sq as a duplicate.
Say C[i] is the count of distinct sums that end at a[i].
So C[i] = C[i-1] + 1 - number of pairs of sums ending at a[i-1] whose difference is a[i].
But the problem is computing that number of pairs in O(log n). Please give me a hint about this, or, if I am on the wrong track and a completely different approach is required, point that out.
When S (here, the sum of the whole array) is not too large, we can count the distinct sums with one (fast) polynomial multiplication. When S is larger, N is hopefully small enough to use a quadratic algorithm.
Let x_1, x_2, ..., x_n be the array elements. Let y_0 = 0 and y_i = x_1 + x_2 + ... + x_i. Let P(z) = z^{y_0} + z^{y_1} + ... + z^{y_n}. Compute the product of polynomials P(z) * P(z^{-1}); the coefficient of z^k with k > 0 is nonzero if and only if k is a sub-array sum, so we just have to read off the number of nonzero coefficients of positive powers. The powers of z, moreover, range from -S to S, so the multiplication takes time on the order of S log S.
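To make the coefficient argument concrete, here is a small Python sketch. It does not use an FFT. As a stand-in, it packs each polynomial into one big integer (Kronecker substitution) and lets Python's built-in integer multiplication do the polynomial product, so it illustrates the identity but makes no attempt to reach the S log S bound. The function name and the slot width are my own choices. To avoid negative exponents it multiplies P(z) by z^S * P(z^{-1}), which shifts every power up by S.

```python
def count_distinct_subarray_sums(xs):
    """Count distinct sums of non-empty contiguous sub-arrays of positive integers `xs`."""
    prefix = [0]                      # y_0, y_1, ..., y_n
    for x in xs:
        prefix.append(prefix[-1] + x)
    total = prefix[-1]                # S, the sum of the whole array
    # Every product coefficient is at most n + 1, so slots this wide never carry over.
    width = (len(xs) + 1).bit_length() + 1
    p = sum(1 << (width * y) for y in prefix)             # P(z)
    q = sum(1 << (width * (total - y)) for y in prefix)   # z^S * P(1/z)
    product = p * q
    mask = (1 << width) - 1
    # The slot for z^(S + k) counts pairs (i, j) with y_j - y_i = k.
    return sum(
        1
        for k in range(1, total + 1)
        if (product >> (width * (total + k))) & mask
    )
```

For [1, 2, 3] the sub-array sums are 1, 2, 3, 3, 5, 6, so the function reports five distinct values.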
You can look at the sub-arrays as a kind of tree. In the sense that subarray [0,3] can be divided to [0,1] and [2,3].
So build up a tree, where nodes are defined by the length of the subarray and its starting offset in the original array, and whenever you compute a subarray, store the result in this tree.
When computing a sub-array, you can check this tree for existing pre-computed values.
Also, when dividing, parts of the array can be computed on different CPU cores, if that matters.
This solution assumes that you don't need all values at once, but rather ad hoc.
If you do need them all at once, there could be some smarter solution.
Also, I assume that we're talking about element counts in the 10000s and more. Otherwise, such work is a nice exercise but has not much practical value.

Resources