Iterating through every combination of elements with repetitions without generating whole set - algorithm

I need to iterate over every possible combination of elements (with repetitions) up to n elements long.
I've found multiple solutions for this problem, but all of them recursively generate the collection of every possible combination and then iterate over it. While this works, for large element collections and combination sizes it results in heavy memory use, so I'm looking for a solution that lets me calculate the next combination from the previous one, knowing the number of elements and the maximum size of a combination.
Is this even possible, and is there any particular algorithm that would work here?

Generate the combinations so that each combination is sorted. (This assumes the elements themselves can easily be placed in order. The precise ordering relationship is not important as long as it is a total order.)
Start with the combination consisting of n repetitions of the smallest element. To produce the next combination from any given combination:
Scan backwards until you find an element which is not the largest element. If you can't find one, you are done.
Replace that element and all following elements with the next larger element of that element.
If you want combinations of all lengths up to n, run that algorithm for each length up to n. Or start with a vector which contains empty slots and use the above algorithm with the understanding that the "next larger element" after an empty slot is the smallest element.
Example: length 3 of 3 values:
1 1 1
1 1 2
1 1 3
1 2 2
1 2 3
1 3 3
2 2 2
2 2 3
2 3 3
3 3 3
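
A minimal Python sketch of this successor scheme, working on indices into the sorted value list so the "next larger element" is just index + 1 (the generator name and the index representation are my own):

def combinations_with_repetition(values, n):
    # Yield all sorted length-n combinations (with repetition) of values,
    # computing each combination in place from the previous one, so only
    # the current combination is ever kept in memory.
    values = sorted(values)
    last = len(values) - 1
    idx = [0] * n                        # n repetitions of the smallest element
    while True:
        yield tuple(values[i] for i in idx)
        j = n - 1                        # scan backwards for a non-maximal element
        while j >= 0 and idx[j] == last:
            j -= 1
        if j < 0:                        # every slot holds the largest value: done
            return
        # replace that element and all following ones with its next larger value
        idx[j:] = [idx[j] + 1] * (n - j)

list(combinations_with_repetition([1, 2, 3], 3)) reproduces the table above.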


Is the computational complexity of counting runs in cribbage O(N*log(N)) in the worst case?

In the card game cribbage, counting the runs for a hand during the show (one of the stages of a turn in the game) means reporting the longest increasing subsequence consisting only of values that increase by 1. If duplicate values are part of this subsequence, then a double run (or triple, quadruple, et cetera) is reported.
Some examples:
("A","2","3","4","5") => (1,5) Single run for 5
("A","2","3","4","4") => (2,4) Double run for 4
("A","2","3","3","3") => (3,3) Triple run for 3
("A","2","3","4","6") => (1,4) Single run for 4
("A","2","3","5","6") => (1,3) Single run for 3
("A","2","4","5","7") => (0,0) No runs
To address cases that arise with hands larger than the cribbage hand size of 5, a run is selected if it has the maximum product of the number of duplicates of a subsequence and that subsequence's length.
Some relevant examples:
("A","2","2","3","5","6","7","8","9","T","J") => (1,7) Single run for 7
("A","2","2","3","5","6","7","8") => (2,3) Double run for 3
My method for finding the maximum scoring run is as follows:
1. Create a list of ranks and sort it. O(N*log(N))
2. Create a list to store the maximum run's length and how many duplicates of it exist. Initialize it to [1 duplicate, 1 long].
3. Create an identical list to store the current run.
4. Create a flag that indicates whether the duplicate you've encountered is not the initial duplicate of this value. Initialize it to False.
5. Create a variable to store the increase in duplicate subsequences when additional duplicate values are found after the initial duplicate. Initialize it to 1.
6. Iterate over the differences between adjacent elements. O(N)
7. If the difference is greater than 1, the run has ended. Check whether the product of the elements of the max run is less than that of the current run and the current run has length 3 or greater. If so, the current run becomes the maximum run and the current run list is reset to [1, 1]. The flag is reset to False. The increment for duplicate subsequences is reset to 1. Iterate to the next value.
8. If the difference is 1, increment the length of the current run by 1 and set the flag to False. Iterate to the next value.
9. If the difference is 0 and the flag is False, set the increment for duplicate subsequences equal to the current number of duplicates for the run. Then double the number of duplicates for the run and set the flag to True. Iterate to the next value.
10. If the difference is 0 and the flag is True, increase the number of duplicates of the run by the increment for duplicate subsequences.
11. After the iteration, check the current run list against the max run as in step 7 and set the max run accordingly.
I believe this is O(N*(1+log(N))). I believe this is the best time complexity, but I am not sure how to prove it or what a better algorithm would look like. Is there a way to do this without sorting the list first that achieves a better time complexity? If not, how does one go about proving this is the best time complexity?
Analyzing the time complexity of an algorithm is a well-traveled path. The formalism varies slightly among mathematical communities; in practice, the complexity community usually works with modular pseudo-code and standard reductions. For instance, a for loop over the input length is O(N) (surprise); comparison-based sorting of a list is known to be O(N log N) at best (in the general case). For a good treatment, see Big O, how do you calculate/approximate it?.
Note: O(N * (1 + log N)) is slightly sloppy notation. Only the greatest complexity factor -- the one that dominates as N approaches infinity -- is kept. Drop the 1+: it's simply O(N log N).
As I suggested in a comment, you can simply count elements. Keep a list of counts, indexed by card value. For discussing the algorithm, don't use the "dirty" character representations "A23456789TJQK"; simply use the numeric values, either 0-12 or 1-13.
count = [0] * 13      # one slot per card value 0-12
for rank in hand:     # hand holds numeric ranks
    count[rank] += 1
This is a linear pass through the data, O(N).
Now, traverse your array of counts, finding the longest sequence of non-zero values. This is a fixed-length list of 13 elements, touching each element only once: O(1). If you accumulate a list of the multiples (card counts), then you'll also have your combinatoric factors at the end.
The resulting algorithm and code are, therefore, O(N).
For instance, let's shorten this to 7 card values, 0-6. Given the input integers
1 2 1 3 6 1 3 5 0
You make the first pass to count items:
[1 3 1 2 0 1 1]
A second pass gives you a max run length of 4, with counts [1 3 1 2].
You report a run of 4, a triple and a double, or the point count
4 * (1 * 3 * 1 * 2)
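
A Python sketch of this second pass under the same representation (a 13-slot count list; best_run is an illustrative name). It tracks the current stretch of non-zero counts and the product of those counts, keeping the best stretch by the question's length-times-duplicates criterion:

def best_run(counts):
    # Find the longest stretch of non-zero counts and its combinatoric
    # multiplier (the product of the counts); a run needs length >= 3.
    best_len, best_mult = 0, 0
    run_len, run_mult = 0, 1
    for c in counts + [0]:               # trailing 0 terminates the last run
        if c > 0:
            run_len += 1
            run_mult *= c                # duplicates multiply the number of runs
        else:
            if run_len >= 3 and run_len * run_mult > best_len * best_mult:
                best_len, best_mult = run_len, run_mult
            run_len, run_mult = 0, 1
    return best_mult, best_len

For the example counts [1, 3, 1, 2, 0, 1, 1], best_run returns (6, 4): a run of 4 with combined multiplier 6 (the triple times the double).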
You can also count the pair points from the same counts (each distinct pair scores 2):
2 * C(3,2) + 2 * C(2,2) = 6 + 2

What is the correct approach to solve this matrix problem

We are given the following matrix:
5 7 9
7 8 4
4 2 9
We need to find the row or column with the maximum sum, subtract 1 from each element of that row or column, and then repeat this operation 3 times.
I will try to explain.
The matrix is n*n and the process is repeated k times.
An O(n^2 + k*log(n)) algorithm is possible.
If the row/column with the biggest sum is a row, then after the operation:
The row's sum changes by n.
Every column's sum changes by 1.
The same two rules apply for columns.
For rule one, store all row sums and all column sums in two AVL trees
(or any other data structure that supports O(log(n)) insert and remove).
For rule two, store the number of operations applied to rows and to columns (just two integers).
Now take the max of both trees, where the two integers act as lazy offsets correcting the difference between the sums stored in the two data structures. Remove that maximum, update it, and insert it back.
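
A sketch of this bookkeeping in Python, following the question's subtract-1 operation. Two max-heaps replace the AVL trees, since only the maximum is ever needed; apply_operations and the heap encoding are my own:

import heapq

def apply_operations(matrix, k):
    # Track only the row and column sums. Each operation picks the row or
    # column with the largest true sum and subtracts 1 from each element.
    # Setup is O(n^2); each of the k operations costs O(log n).
    n = len(matrix)
    rows = [(-sum(row), i) for i, row in enumerate(matrix)]               # max-heap
    cols = [(-sum(matrix[i][j] for i in range(n)), j) for j in range(n)]  # max-heap
    heapq.heapify(rows)
    heapq.heapify(cols)
    row_ops = col_ops = 0   # rule two: a row op lowers every column sum by 1, and vice versa
    chosen = []
    for _ in range(k):
        best_row = -rows[0][0] - col_ops     # true sums, corrected lazily
        best_col = -cols[0][0] - row_ops
        if best_row >= best_col:
            s, i = heapq.heappop(rows)
            heapq.heappush(rows, (s + n, i))  # rule one: this row's sum drops by n
            row_ops += 1
            chosen.append(("row", i))
        else:
            s, j = heapq.heappop(cols)
            heapq.heappush(cols, (s + n, j))  # rule one: this column's sum drops by n
            col_ops += 1
            chosen.append(("col", j))
    return chosen

On the matrix above with k = 3, the first operation picks column 2 (sum 22), the second picks row 0, and so on.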

How to display all ways to give change

As far as I know, counting every way to give change for a given sum and a starting till configuration is a classic Dynamic Programming problem.
I was wondering if there is a way to also display (or store) the actual change structures that can amount to the given sum while preserving the DP complexity.
I have never seen this issue discussed, and I would like some pointers or a brief explanation of how this can be done or why it cannot be done.
The DP for the change problem has time complexity O(Sum * ValuesCount) and storage complexity O(Sum).
You can prepare the extra data for this problem in the same time as the DP for change, but you need more storage, O(Sum * ValuesCount), and a lot of time for the output of all variants, O(ChangeWaysCount).
To prepare data for way recovery, make a second array B of arrays (or lists). When you increment an element of the count array A from some previous element, add the used value to the corresponding element of B. At the end, unwind all the ways from the last element.
Example: values 1,2,3, sum 4
index  0    1    2      3        4
A      0    1    2      3        4
B      -    [1]  [1 2]  [1 2 3]  [1 2 3]
We start unwinding from B[4] elements:
1-1-1-1 (B[4]-B[3]-B[2]-B[1])
2-1-1 (B[4]-B[2]-B[1])
2-2 (B[4]-B[2])
3-1 (B[4]-B[1])
Note that I have used only ways with non-increasing values to avoid permutation variants (e.g. counting both 1-3 and 3-1).
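
A Python sketch of both phases under the same non-increasing convention (the function names are illustrative):

def prepare(values, total):
    # A[s] counts the ways to make s; B[s] lists the coin values that can
    # be the largest coin in a way to make s.
    A = [0] * (total + 1)
    A[0] = 1
    B = [[] for _ in range(total + 1)]
    for v in sorted(values):             # outer loop over coins avoids permutations
        for s in range(v, total + 1):
            if A[s - v]:
                A[s] += A[s - v]
                B[s].append(v)
    return A, B

def unwind(B, s, cap=None, prefix=()):
    # Walk back through B, keeping the coin values non-increasing.
    if s == 0:
        yield prefix
        return
    for v in B[s]:
        if cap is None or v <= cap:
            yield from unwind(B, s - v, v, prefix + (v,))

Here prepare([1, 2, 3], 4) builds the table above, and list(unwind(B, 4)) yields the four ways 1-1-1-1, 2-1-1, 2-2 and 3-1.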

Which algorithm should be used to solve this sorting exercise?

I am trying to solve the following problem, but I am stuck. The problem is as follows:
Suppose you are given an array of integers. You have to remove certain elements from the array so that the entire array is sorted in ascending order. Elements can't be repeated, so if any element is repeated in the array, it also should be removed. You have to find the minimum number of integers that need to be removed from the array in order to sort it in ascending order.
I am posting a few test cases so that the question becomes clear:
1. Input: 1 1 2
Output: 1 (since 1 is repeated and, if removed, the array is sorted)
2. Input: 1 8 9 3 4 5
Output: 2 (if 8 and 9 are removed, the array is sorted, and this is also the minimum number required)
3. Input: 1 7 8 9 3
Output: 1 (only 3 should be removed)
My approach was to move through the array and, if the previous number is bigger than the current number, remove the previous number. This approach solves cases 1 and 2, but it outputs 3 for case 3.
How should I solve this problem? Is there any specific algorithm that might be helpful?
Let's assume that we have thrown away some numbers so that the remaining array is sorted. What do the remaining numbers form in the initial array? An increasing subsequence. We want to throw away as few numbers as we can or, put another way, keep as many numbers as possible. Thus, we need to find the longest (strictly) increasing subsequence in the given array; the answer is the array's length minus that subsequence's length.
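
A standard O(N log N) realization of this in Python, using the patience-sorting tails array; bisect_left enforces strict increase, which also discards duplicates as the problem requires:

from bisect import bisect_left

def min_removals(arr):
    # tails[k] = smallest possible last element of a strictly increasing
    # subsequence of length k + 1.
    tails = []
    for x in arr:
        i = bisect_left(tails, x)        # first tail >= x gets replaced
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(arr) - len(tails)         # remove everything outside the LIS

This returns 1, 2 and 1 for the three test cases above.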

Sorting second column relative to first column

I have got the following sequence (representing a tree):
4 2
1 4
3 4
5 4
2 7
0 7
6 0
Now, I am trying to sort this sequence, such that when a value appears on the left (column 1), it has already appeared on the right (column 2). More concretely, the result of the sorting algorithm should be:
1 4
3 4
5 4
4 2
2 7
6 0
0 7
Obviously, this works in O(n^2) with an algorithm that iterates over each entry of column 1 and then looks for corresponding entries in column 2. But as n can be quite big (> 100000) in my scenario, I'm looking for an O(n log n) way to do it. Is this even possible?
Assumption:
I'm assuming this is also a valid sort sequence:
1 4
4 2
3 4
5 4
2 7
6 0
0 7
i.e. once a value has appeared on the right, it can appear on the left.
If this is not the case (i.e. all occurrences on the right have to come before any occurrence on the left), ignore the "remove all edges pointing to that element" part below and only remove the intermediate element when it has no incoming edges left.
Algorithm:
Construct a graph where each element A points to another element B if the right element of A is equal to the left element of B. This can be done using a hash multi-map:
Go through the elements, inserting each element A into the hash map as A.left -> A.
Go through the elements again, connecting each element B with all elements appearing under B.right.
Perform a topological sort of the graph, giving you your result. The sort should be modified such that, instead of removing a single edge pointing to an element, we remove all edges pointing to that element (i.e. once we have found an element containing some value on the right, we don't need to find another one for that value to appear on the left).
Currently this is O(n^2) running time, because there can be too many edges - if we have:
(1,2),(1,2),...,(1,2),(2,3),(2,3),...,(2,3)
there are O(n^2) edges.
This can be avoided by, instead of having elements point directly to each other, creating an intermediate element. In the above case, half the elements will point to that intermediate element and it will point to the other half. Then, when doing the topological sort, where we would've removed an edge to that element, we instead remove the element itself together with all edges pointing from / to it.
Now there is a maximum of O(n) edges and, since topological sort can be done in linear time with respect to the elements and edges, the overall running time is O(n).
Note that it's not always possible to get a result: (1,2), (2,1).
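
Here is a compact Kahn-style sketch in Python of the same idea under the relaxed assumption; the "unlock on first production" behaviour plays the role of the intermediate elements, so the edge work stays linear (order_pairs and the variable names are my own):

from collections import defaultdict, deque

def order_pairs(pairs):
    # Order (left, right) pairs so that each left value has already been
    # produced on the right by some earlier pair (if it is produced at all).
    produced = {r for _, r in pairs}     # values that appear on the right
    waiting = defaultdict(list)          # value -> pairs blocked on it
    ready = deque()
    for p in pairs:
        if p[0] in produced:
            waiting[p[0]].append(p)      # must wait until p[0] is emitted
        else:
            ready.append(p)              # left value is never produced: free
    result = []
    while ready:
        left, right = ready.popleft()
        result.append((left, right))
        if right in waiting:             # first production unlocks everything
            ready.extend(waiting.pop(right))
    if len(result) != len(pairs):
        raise ValueError("no valid ordering exists, e.g. (1,2),(2,1)")
    return result

order_pairs([(4,2), (1,4), (3,4), (5,4), (2,7), (0,7), (6,0)]) returns a valid ordering such as the one shown in the assumption above.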
