Stack permutations sorted in lexicographical order - algorithm

A stack permutation of the numbers 1 to N is any sequence that you can print by doing the following:
Keep two stacks say A and B.
Push numbers from 1 to N in reverse order in B. (so the top of B is 1 and the last element in B is N)
Do the following operations
Choose the top element from A or B and print it and delete it (pop it). This can be done on a non-empty stack only.
Move the top element from B to A (if B is non-empty)
If both stacks are empty then stop
All possible sequences obtained by doing these operations in some order are called stack permutations.
eg: N = 2
stack permutations are (1, 2) and (2, 1)
eg: N = 3
stack permutations are (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1) and (3, 2, 1)
The number of stack permutations for N numbers is C(N), where C(N) is the Nth Catalan Number.
Suppose we generate all stack permutations for a given N and then print them in lexicographical order (dictionary order), how can we determine the kth permutation, without actually generating all the permutations and then sorting them?
I want some algorithmic approaches that are programmable.

You didn't say whether k should be 0 based or 1 based. I chose 0. Switching back is easy.
The approach is to first write a function that counts how many stack permutations are reachable from a given decision point, using memoization to make it fast. Then walk down the decision tree, skipping every decision whose permutations are all lexicographically smaller than the kth one and subtracting their count from k. The decisions that remain are the ones that produce the permutation you want.
def count_stack_permutations(on_b, on_a=0, can_take_from_a=True, cache={}):
    # The mutable default argument is deliberate: it is a memo shared across calls.
    key = (on_b, on_a, can_take_from_a)
    if on_a < 0:
        return 0 # can't go negative.
    elif on_b == 0:
        if can_take_from_a:
            return 1 # Just drain a
        else:
            return 0 # Got nothing.
    elif key not in cache:
        # Drain b
        answer = count_stack_permutations(on_b-1, on_a, True)
        # Drain a?
        if can_take_from_a:
            answer = answer + count_stack_permutations(on_b, on_a-1, True)
        # Move from b to a.
        answer = answer + count_stack_permutations(on_b-1, on_a+1, False)
        cache[key] = answer
    return cache[key]
def find_kth_permutation(n, k):
    # The end of the array is the top
    a = []
    b = list(range(n, 0, -1))
    can_take_from_a = True # We obviously won't first. :-)
    answer = []
    while 0 < max(len(a), len(b)):
        action = None
        on_a = len(a)
        on_b = len(b)
        # If I can take from a, that is always smallest.
        if can_take_from_a:
            if count_stack_permutations(on_b, on_a - 1, True) <= k:
                k = k - count_stack_permutations(on_b, on_a - 1, True)
            else:
                action = 'a'
        # Taking from b is smaller than digging into b so I can take deeper.
        if action is None:
            if count_stack_permutations(on_b-1, on_a, True) <= k:
                k = k - count_stack_permutations(on_b-1, on_a, True)
            else:
                action = 'b'
        # Otherwise I will move. A move leaves on_b-1 on b and on_a+1 on a.
        if action is None:
            if count_stack_permutations(on_b-1, on_a+1, False) <= k:
                return None # Only happens when k is out of range
            else:
                action = 'm'
        if action == 'a':
            answer.append(a.pop())
            can_take_from_a = True
        elif action == 'b':
            answer.append(b.pop())
            can_take_from_a = True
        else:
            a.append(b.pop())
            can_take_from_a = False
    return answer
# And demonstrate it in action.
for k in range(0, 6):
    print((k, find_kth_permutation(3, k)))
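As a sanity check for small N, the two-stack procedure from the question can also be simulated by brute force and the results sorted. The following is only a sketch for cross-checking; the name stack_permutations is made up for this illustration, and the approach is exponential, so it is not a replacement for the counting method above.

```python
def stack_permutations(n):
    """Return every stack permutation of 1..n, sorted lexicographically."""
    results = set()

    def explore(next_b, a, printed):
        # next_b is the value on top of B; B is empty once next_b > n.
        if next_b > n and not a:
            results.add(tuple(printed))
            return
        if a:
            # Print and pop the top of A.
            explore(next_b, a[:-1], printed + [a[-1]])
        if next_b <= n:
            # Print and pop the top of B.
            explore(next_b + 1, a, printed + [next_b])
            # Move the top of B onto A.
            explore(next_b + 1, a + [next_b], printed)

    explore(1, [], [])
    return sorted(results)


if __name__ == "__main__":
    for index, perm in enumerate(stack_permutations(3)):
        print(index, perm)
```

The set removes duplicates that arise when different operation orders print the same sequence, and the list lengths for N = 1..5 should match the Catalan numbers 1, 2, 5, 14, 42.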

This is possible using the factoradic (factorial number system): https://en.wikipedia.org/wiki/Factorial_number_system
If you need a quick solution in Java, use JNumberTools:
JNumberTools.permutationsOf("A","B","C")
    .uniqueNth(4) //next 4th permutation
    .forEach(System.out::println);
This API generates the next nth permutation directly in lexicographic order, so you can even generate the next billionth permutation of 100 items.
To generate the next nth permutation of a given size, use:
JNumberTools.permutationsOf("A","B","C")
    .kNth(2,4) //next 4th permutation of size 2
    .forEach(System.out::println);
The Maven dependency for JNumberTools is:
<dependency>
    <groupId>io.github.deepeshpatel</groupId>
    <artifactId>jnumbertools</artifactId>
    <version>1.0.0</version>
</dependency>

Related

Counting the number of ways to make up a string

I have just started learning dynamic programming and was able to do some of the basic problems, such as Fibonacci, the knapsack and a few more. When I came across the problem below, I got stuck and do not know how to proceed. What confuses me is what the base case and the overlapping subproblems would be here. Not knowing this prevents me from developing a recurrence relation. They are not as apparent in this example as they were in the ones I have solved thus far.
Suppose we are given some string origString, a string toMatch and some number maxNum greater than or equal to 0. How can we count in how many ways it is possible to take maxNum number of nonempty and nonoverlapping substrings of the string origString to make up the string toMatch?
Example:
If origString = "ppkpke", and toMatch = "ppke"
maxNum = 1: countWays("ppkpke", "ppke", 1) will give 0 because toMatch is not a substring of origString.
maxNum = 2: countWays("ppkpke", "ppke", 2) will give 4 because 4 different combinations of 2 substrings of "ppkpke" can make up "ppke".
Those strings are "ppk" & "e", "pp" & "ke" , "p" & "pke" (excluding "p") and "p" & "pke" (excluding "k")
As an initial word of caution, I’d say that although my solution happens to match the expected output for the tiny test set, it is very likely wrong. It’s up to you to double-check it on other examples you may have etc.
The algorithm walks the longer string and tries to spread the shorter string over it. The incremental state of the algorithm consists of tuples of 3 elements:
long string coordinate i (origString[i] == toMatch[j])
short string coordinate j (origString[i] == toMatch[j])
number of ways we made it into that^^^ state
Then we just walk along the strings over and over again, using stored, previously discovered state, and sum up the total number(s) of ways each state was achieved — in the typical dynamic programming fashion.
For a state to count as a solution, j must be at the end of the short string and the number of iterations of the dynamic algorithm must be equivalent to the number of substrings we wanted at that point (because each iteration added one substring).
It is not entirely clear to me from the assignment whether maxNum actually means something like “exactNum”, i.e. exactly that many substrings, or whether we should sum across all lower or equal numbers of substrings. So the function returns a dictionary like { #substrings : #decompositions }, so that the output can be adjusted as needed.
#!/usr/bin/env python
def countWays(origString, toMatch, maxNum):
    origLen = len(origString)
    matchLen = len(toMatch)
    # state maps (i, j) to the number of ways to end a substring at
    # origString[i] having matched toMatch[0..j]
    state = {}
    for i in range(origLen):
        for j in range(matchLen):
            o = i + j
            # stop at the end of origString or at the first mismatch
            if o >= origLen or origString[o] != toMatch[j]:
                break
            state[(o, j)] = 1
    sums = {}
    for n in range(1, maxNum):
        if not state:
            break
        nextState = {}
        for istart, jstart in state:
            prev = state[(istart, jstart)]
            for i in range(istart + 1, origLen):
                for j in range(jstart + 1, matchLen):
                    o = i + j - jstart - 1
                    if o >= origLen or origString[o] != toMatch[j]:
                        break
                    nextState[(o, j)] = prev + nextState.get((o, j), 0)
        sums[n] = sum(state[(i, j)] for i, j in state if j == matchLen - 1)
        state = nextState
    sums[maxNum] = sum(state[(i, j)] for i, j in state if j == matchLen - 1)
    return sums
result = countWays(origString='ppkpke', toMatch='ppke', maxNum=5)
print('for an exact number of substrings:', result)
print(' for up to a number of substrings:', {
    n: s for n, s in ((m, sum(result[k] for k in range(1, m + 1)))
                      for m in range(1, 1 + max(result.keys())))})
This^^^ code is a quick and ugly hack and nothing more. There is plenty of room for improvement, including (but not limited to) the use of generator functions (yield), the use of @memoize etc. Here’s some output:
for an exact number of substrings: {1: 0, 2: 4, 3: 8, 4: 4, 5: 0}
for up to a number of substrings: {1: 0, 2: 4, 3: 12, 4: 16, 5: 16}
It would be an interesting (and nicely challenging) exercise to store a bit more of the dynamic state (e.g. to keep it for each n) and then reconstruct and pretty-print (efficiently) the exact string (de)compositions that were counted.
Here is a recursive solution.
It compares the first character of source and target, and if they're equal, chooses to either take it (advancing by 1 char in both strings) or not take it (advancing by 1 char in source but not in target). The value of k is decremented every time a new substring is created; there is an additional variable continued, which is True if we're in the middle of building a substring and False otherwise.
def countWays(source, target, k, continued=False):
    if len(target) == 0:
        return (k == 0)
    elif (k == 0 and not continued) or len(source) == 0:
        return 0
    elif source[0] == target[0]:
        if continued:
            return countWays(source[1:], target[1:], k, True) + countWays(source[1:], target[1:], k-1, True) + countWays(source[1:], target, k, False)
        else:
            return countWays(source[1:], target[1:], k-1, True) + countWays(source[1:], target, k, False)
    else:
        return countWays(source[1:], target, k, False)
print(countWays('ppkpke', 'ppke', 1))
# 0
print(countWays('ppkpke', 'ppke', 2))
# 4
print(countWays('ppkpke', 'ppke', 3))
# 8
print(countWays('ppkpke', 'ppke', 4))
# 4
print(countWays('ppkpke', 'ppke', 5))
# 0
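Since the question asks about the overlapping subproblems, here is one possible memoized variant of the recursion above. It is a sketch, not the answerer's code: the name count_ways_memo and the index-based state (i, j, left, continued) are assumptions made for illustration. Using indices instead of string slices keeps the state hashable and small, so functools.lru_cache can reuse results.

```python
from functools import lru_cache


def count_ways_memo(source, target, k):
    """Count ways to build target from exactly k non-overlapping substrings of source."""

    @lru_cache(maxsize=None)
    def go(i, j, left, continued):
        # i: position in source, j: position in target,
        # left: substrings still to be started,
        # continued: True if source[i - 1] was the last character taken.
        if j == len(target):
            return 1 if left == 0 else 0
        if i == len(source) or left < 0:
            return 0
        if left == 0 and not continued:
            return 0  # nothing may be started and nothing can be extended
        # Option 1: skip source[i]; any substring in progress ends here.
        total = go(i + 1, j, left, False)
        if source[i] == target[j]:
            # Option 2: start a new substring at source[i].
            total += go(i + 1, j + 1, left - 1, True)
            # Option 3: extend the substring that ended at source[i - 1].
            if continued:
                total += go(i + 1, j + 1, left, True)
        return total

    return go(0, 0, k, False)


if __name__ == "__main__":
    print([count_ways_memo("ppkpke", "ppke", k) for k in range(1, 6)])
```

The base case is an exhausted target (count it only if exactly k substrings were used), and the overlapping subproblems are the repeated (i, j, left, continued) states reached through different choices.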

Implementing iterative solution in a functionally recursive way with memoization

I am trying to solve the following problem on leetcode: Coin Change 2
Input: amount = 5, coins = [1, 2,5]
Output: 4 Explanation: there are four ways to make up the amount:
5=5
5=2+2+1
5=2+1+1+1
5=1+1+1+1+1
I am trying to implement an iterative solution which essentially simulates recursion using a stack. I have managed to implement it and the solution works, but it exceeds the time limit.
I have noticed that the recursive solutions make use of memoization to optimize. I would like to incorporate that in my iterative solution as well, but I am lost on how to proceed.
My solution so far:
# stack to simulate recursion
stack = []
# add starting indexes and sum to stack
# Tuple(x, y) where x is sum, y is index of the coins array input
for i in range(0, len(coins)):
    if coins[i] <= amount:
        stack.append((coins[i], i))
result = 0
while len(stack) != 0:
    c = stack.pop()
    currentsum = c[0]
    currentindex = c[1]
    # can't explore further
    if currentsum > amount:
        continue
    # condition met, increment result
    if currentsum == amount:
        result = result + 1
        continue
    # add coin at current index to sum if doesn't exceed amount (append call to stack)
    if (currentsum + coins[currentindex]) <= amount:
        stack.append((currentsum + coins[currentindex], currentindex))
    # skip coin at current index (append call to stack)
    if (currentindex + 1) <= len(coins) - 1:
        stack.append((currentsum, currentindex + 1))
return result
I have tried using dictionary to record appends to the stack as follows:
# if the call has not already happened, add to dictionary
if dictionary.get((currentsum, currentindex+1), None) is None:
    stack.append((currentsum, currentindex+1))
    dictionary[(currentsum, currentindex+1)] = 'visited'
Example, if call (2,1) of sum = 2 and coin-array-index = 1 is made, I append it to dictionary. If the same call is encountered again, I don't append it again. However, it does not work as different combinations can have same sum and index.
Is there any way I can incorporate memoization in my iterative solution above? I want to do it in a way such that it is functionally the same as the recursive solution.
I have managed to figure out the solution. Essentially, I used post order traversal and used a state variable to record the stage of recursion the current call is in. Using the stage, I have managed to go bottom up after going top down.
The solution I came up with is as follows:
def change(self, amount: int, coins: List[int]) -> int:
    if amount <= 0:
        return 1
    if len(coins) == 0:
        return 0
    d = dict()
    # currentsum, index, instruction
    coins.sort(reverse=True)
    stack = [(0, 0, 'ENTER')]
    calls = 0
    while len(stack) != 0:
        currentsum, index, instruction = stack.pop()
        if currentsum == amount:
            d[(currentsum, index)] = 1
            continue
        elif instruction == 'ENTER':
            # revisit this call after both of its children have been evaluated
            stack.append((currentsum, index, 'EXIT'))
            if (index+1) <= (len(coins)-1):
                if d.get((currentsum, index+1), None) is None:
                    stack.append((currentsum, index+1, 'ENTER'))
            newsum = currentsum + coins[index]
            if newsum <= amount:
                if d.get((newsum, index), None) is None:
                    stack.append((newsum, index, 'ENTER'))
        elif instruction == 'EXIT':
            # both children are memoized now (a missing key counts as 0)
            newsum = currentsum + coins[index]
            left = d.get((newsum, index), 0)
            right = d.get((currentsum, index+1), 0)
            d[(currentsum, index)] = left+right
        calls = calls+1
    print(calls)
    return d[(0,0)]
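For comparison, the memoized recursion that the ENTER/EXIT stack simulates could look roughly like the sketch below. The name change_recursive is made up here, and the sketch assumes every coin is a positive integer (otherwise the recursion would not terminate). Note that for large amounts the recursion depth can exceed Python's default limit, which is one practical reason to prefer the explicit stack.

```python
from functools import lru_cache
from typing import List


def change_recursive(amount: int, coins: List[int]) -> int:
    """Count coin combinations summing to amount (assumes positive coin values)."""
    coin_values = tuple(coins)

    @lru_cache(maxsize=None)
    def ways(current_sum: int, index: int) -> int:
        if current_sum == amount:
            return 1
        if current_sum > amount or index == len(coin_values):
            return 0
        # "left" child: use coin_values[index] once more;
        # "right" child: move on to the next coin.
        return (ways(current_sum + coin_values[index], index)
                + ways(current_sum, index + 1))

    return ways(0, 0)


if __name__ == "__main__":
    print(change_recursive(5, [1, 2, 5]))
```

Each (current_sum, index) pair is evaluated once, which corresponds to the dictionary d keyed by (currentsum, index) in the iterative version.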

What is the sublist array that can give us maximum 'flip-flop' sum?

My problem is that I'm given an array of length l.
Let's say this is my array: [1,5,4,2,9,3,6]; let's call it A.
This array has multiple sub-arrays whose elements are adjacent to each other, so we can have [1,5,4] or [2,9,3,6] and so on. The length of each sub-array does not matter.
But the trick is the sum part. We cannot just add all the numbers; the sum works like a flip-flop, alternating between adding and subtracting. So for the sublist [2,9,3,6] the sum would be [2,-9,3,-6], which is -10, and that is pretty small.
What would be the sublist (or sub-array if you like) of this array A that produces the maximum sum?
One possible answer (from intuition) would be the sublist [4,2,9], which gives a decent result: [4, -2, 9] = (add all the elements) = 11.
The question is, how to come up with a result like this?
what is the sub-array that gives us the maximum flip-flop sum?
and mainly, what is the algorithm that takes any array as an input and outputs a sub-array with all numbers being adjacent and with the maximum sum?
I haven't come up with anything but I'm pretty sure I should pick either dynamic programming or divide and conquer to solve this issue. again, I don't know, I may be totally wrong.
The problem can indeed be solved using dynamic programming, by keeping track of the maximum sum ending at each position.
However, since the current element can be either added to or subtracted from a sum (depending on the length of the subsequence), we will keep track of the maximum sums ending here, separately, for both even as well as odd subsequence lengths.
The code below (implemented in python) does that (please see comments in the code for additional details).
The time complexity is O(n).
a = [1, 5, 4, 2, 9, 3, 6]
# initialize the best sequences which end at element a[0]
# best sequence with odd length ending at the current position
best_ending_here_odd = a[0] # the sequence sum value
best_ending_here_odd_start_idx = 0
# best sequence with even length ending at the current position
best_ending_here_even = 0 # the sequence sum value
best_ending_here_even_start_idx = 1
# the single-element sequence [a[0]] is the first candidate for the best sum
best_sum = max(a[0], 0)
best_start_idx = 0
best_end_idx = 0
for i in range(1, len(a)):
    # add/subtract the current element to the best sequences that
    # ended in the previous element
    best_ending_here_even, best_ending_here_odd = \
        best_ending_here_odd - a[i], best_ending_here_even + a[i]
    # swap starting positions (since a sequence which had odd length when it
    # was ending at the previous element has even length now, and vice-versa)
    best_ending_here_even_start_idx, best_ending_here_odd_start_idx = \
        best_ending_here_odd_start_idx, best_ending_here_even_start_idx
    # we can always make a sequence of even length with sum 0 (empty sequence)
    if best_ending_here_even < 0:
        best_ending_here_even = 0
        best_ending_here_even_start_idx = i + 1
    # update the best known sub-sequence if it is the case
    if best_ending_here_even > best_sum:
        best_sum = best_ending_here_even
        best_start_idx = best_ending_here_even_start_idx
        best_end_idx = i
    if best_ending_here_odd > best_sum:
        best_sum = best_ending_here_odd
        best_start_idx = best_ending_here_odd_start_idx
        best_end_idx = i
print(best_sum, best_start_idx, best_end_idx)
For the example sequence in the question, the above code prints 14 2 6, which corresponds to the following flip-flop sub-sequence (indices 2 through 6):
4 - 2 + 9 - 3 + 6 = 14
As quertyman wrote, we can use dynamic programming. This is similar to Kadane's algorithm but with a few twists. We need a second temporary variable to keep track of trying each element both as an addition and as a subtraction. Note that a subtraction must be preceded by an addition but not vice versa. O(1) space, O(n) time.
JavaScript code:
function f(A){
  let prevAdd = [A[0], 1] // sum, length
  let prevSubt = [0, 0]
  // sum, idx, len, op; A[0] on its own is the first candidate
  let best = A[0] > 0 ? [A[0], 0, 1, ' + '] : [0, -1, 0, null]
  let add
  let subt
  for (let i=1; i<A.length; i++){
    // Try adding
    add = [A[i] + prevSubt[0], 1 + prevSubt[1]]
    if (add[0] > best[0])
      best = [add[0], i, add[1], ' + ']
    // Try subtracting
    if (prevAdd[0] - A[i] > 0)
      subt = [prevAdd[0] - A[i], 1 + prevAdd[1]]
    else
      subt = [0, 0]
    if (subt[0] > best[0])
      best = [subt[0], i, subt[1], ' - ']
    prevAdd = add
    prevSubt = subt
  }
  return best
}
function show(A, sol){
  let [sum, i, len, op] = sol
  let str = A[i] + ' = ' + sum
  for (let l=1; l<len; l++){
    str = A[i-l] + op + str
    op = op == ' + ' ? ' - ' : ' + '
  }
  return str
}
var A = [1, 5, 4, 2, 9, 3, 6]
console.log(JSON.stringify(A))
var sol = f(A)
console.log(JSON.stringify(sol))
console.log(show(A, sol))
Update
Per OP's request in the comments, here is some theoretical elaboration on the general recurrence (pseudocode): let f(i, subtract) represent the maximum sum up to and including the element indexed at i, where subtract indicates whether or not the element is subtracted or added. Then:
// Try subtracting
f(i, true) =
if f(i-1, false) - A[i] > 0
then f(i-1, false) - A[i]
otherwise 0
// Try adding
f(i, false) =
A[i] + f(i-1, true)
(Note that when f(i-1, true) evaluates
to zero, the best ending at
i as an addition is just A[i])
The recurrence only depends on the evaluation at the previous element, which means we can code it with O(1) space, just saving the very last evaluation after each iteration, and updating the best so far (including the sequence's ending index and length if we want).
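The recurrence above can be sketched in a few lines of Python. This is an illustrative version only: the name max_flip_flop_sum is made up, it returns just the maximum sum (not the indices), and it treats the empty sub-array as having sum 0, as both answers do.

```python
def max_flip_flop_sum(a):
    """Maximum alternating (+, -, +, ...) sum over all contiguous sub-arrays."""
    best = 0
    prev_add = float("-inf")  # f(i-1, false): no element has been added yet
    prev_sub = 0              # f(i-1, true): 0 stands for "start fresh"
    for x in a:
        add = x + prev_sub            # f(i, false) = A[i] + f(i-1, true)
        sub = max(0, prev_add - x)    # f(i, true), clamped at 0
        best = max(best, add, sub)
        prev_add, prev_sub = add, sub
    return best


if __name__ == "__main__":
    print(max_flip_flop_sum([1, 5, 4, 2, 9, 3, 6]))  # 14, from 4 - 2 + 9 - 3 + 6
```

Only the two values from the previous element are kept, which is what gives O(1) space and O(n) time.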

Need help in understanding Dynamic Programming approach for "balanced 0-1 matrix"?

Problem: I am struggling to understand/visualize the Dynamic Programming approach for "A type of balanced 0-1 matrix" in the Wikipedia article on Dynamic Programming.
Wikipedia Link: https://en.wikipedia.org/wiki/Dynamic_programming#A_type_of_balanced_0.E2.80.931_matrix
I couldn't understand how the memoization works when dealing with a multidimensional array. For example, when trying to solve the Fibonacci series with DP, using an array to store previous state results is easy, as the index value of the array store the solution for that state.
Can someone explain DP approach for the "0-1 balanced matrix" in simpler manner?
Wikipedia offered both a crappy explanation and a not ideal algorithm. But let's work with it as a starting place.
First let's take the backtracking algorithm. Rather than put the cells of the matrix "in some order", let's go everything in the first row, then everything in the second row, then everything in the third row, and so on. Clearly that will work.
Now let's modify the backtracking algorithm slightly. Instead of going cell by cell, we'll go row by row. So we make a list of the n choose n/2 possible rows which are half 0 and half 1. Then have a recursive function that looks something like this:
def count_0_1_matrices(n, filled_rows=None):
    if filled_rows is None:
        filled_rows = []
    if some_column_exceeds_threshold(n, filled_rows):
        # Cannot have more than n/2 0s or 1s in any column
        return 0
    elif len(filled_rows) == n:
        # Every row is filled and no column overran, so this matrix counts.
        return 1
    else:
        answer = 0
        for row in possible_rows(n):
            answer = answer + count_0_1_matrices(n, filled_rows + [row])
        return answer
This is a backtracking algorithm like what we had before. We are just doing whole rows at a time, not cells.
But notice, we're passing around more information than we need. There is no need to pass in the exact arrangement of rows. All that we need to know is how many 1s are needed in each remaining column. So we can make the algorithm look more like this:
def count_0_1_matrices(n, still_needed=None):
    if still_needed is None:
        still_needed = [int(n/2) for _ in range(n)]
    # Did we overrun any column?
    for i in still_needed:
        if i < 0:
            return 0
    # Did we reach the end of our matrix?
    if 0 == sum(still_needed):
        return 1
    # Calculate the answer by recursion.
    answer = 0
    for row in possible_rows(n):
        next_still_needed = [still_needed[i] - row[i] for i in range(n)]
        answer = answer + count_0_1_matrices(n, next_still_needed)
    return answer
This version is almost the recursive function in the Wikipedia version. The main difference is that our base case is that after every row is finished, we need nothing, while Wikipedia would have us code up the base case to check the last row after every other is done.
To get from this to a top-down DP, you only need to memoize the function. In Python you can do that by defining a memoize decorator and then applying it with @memoize. Like this:
from functools import wraps
def memoize(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
        # args must be hashable, so pass tuples rather than lists
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrap
But remember that I criticized the Wikipedia algorithm? Let's start improving it! The first big improvement is this. Do you notice that the order of the elements of still_needed can't matter, just their values? So just sorting the elements will stop you from doing the calculation separately for each permutation. (There can be a lot of permutations!)
@memoize
def count_0_1_matrices(n, still_needed=None):
    if still_needed is None:
        still_needed = [int(n/2) for _ in range(n)]
    # Did we overrun any column?
    for i in still_needed:
        if i < 0:
            return 0
    # Did we reach the end of our matrix?
    if 0 == sum(still_needed):
        return 1
    # Calculate the answer by recursion.
    answer = 0
    for row in possible_rows(n):
        next_still_needed = [still_needed[i] - row[i] for i in range(n)]
        # tuple(...) keeps the argument hashable for the memoize cache
        answer = answer + count_0_1_matrices(n, tuple(sorted(next_still_needed)))
    return answer
That little innocuous sorted doesn't look important, but it saves a lot of work! And now that we know that still_needed is always sorted, we can simplify our checks for whether we are done, and whether anything went negative. Plus we can add an easy check to filter out the case where we have too many 0s in a column.
@memoize
def count_0_1_matrices(n, still_needed=None):
    if still_needed is None:
        still_needed = [int(n/2) for _ in range(n)]
    # Did we overrun any column? (sorted ascending, so the minimum is first)
    if still_needed[0] < 0:
        return 0
    total = sum(still_needed)
    if 0 == total:
        # We reached the end of our matrix.
        return 1
    elif total*2/n < still_needed[-1]:
        # We have total*2/n rows left, but won't get enough 1s for a
        # column.
        return 0
    # Calculate the answer by recursion.
    answer = 0
    for row in possible_rows(n):
        next_still_needed = [still_needed[i] - row[i] for i in range(n)]
        # tuple(...) keeps the argument hashable for the memoize cache
        answer = answer + count_0_1_matrices(n, tuple(sorted(next_still_needed)))
    return answer
And, assuming you implement possible_rows, this should both work and be significantly more efficient than what Wikipedia offered.
=====
Here is a complete working implementation. On my machine it calculated the 6th term in under 4 seconds.
#! /usr/bin/env python
from sys import argv
from functools import wraps
def memoize(func):
    cache = {}
    @wraps(func)
    def wrap(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrap
@memoize
def count_0_1_matrices(n, still_needed=None):
    if 0 == n:
        return 1
    if still_needed is None:
        still_needed = [int(n/2) for _ in range(n)]
    # Did we overrun any column?
    if still_needed[0] < 0:
        return 0
    total = sum(still_needed)
    if 0 == total:
        # We reached the end of our matrix.
        return 1
    elif total*2/n < still_needed[-1]:
        # We have total*2/n rows left, but won't get enough 1s for a
        # column.
        return 0
    # Calculate the answer by recursion.
    answer = 0
    for row in possible_rows(n):
        next_still_needed = [still_needed[i] - row[i] for i in range(n)]
        answer = answer + count_0_1_matrices(n, tuple(sorted(next_still_needed)))
    return answer
@memoize
def possible_rows(n):
    return [row for row in _possible_rows(n, n // 2)]
def _possible_rows(n, k):
    # Yield every 0/1 tuple of length n that contains exactly k ones.
    if 0 == n:
        yield tuple()
    else:
        if k < n:
            for row in _possible_rows(n-1, k):
                yield tuple(row + (0,))
        if 0 < k:
            for row in _possible_rows(n-1, k-1):
                yield tuple(row + (1,))
n = 2
if 1 < len(argv):
    n = int(argv[1])
print(count_0_1_matrices(2*n))
You're memoizing states that are likely to be repeated. The state that needs to be remembered in this case is the vector (k is implicit). Let's look at one of the examples you linked to. Each pair in the vector argument (of length n) is representing "the number of zeros and ones that have yet to be placed in that column."
Take the example on the left, where the vector is ((1, 1) (1, 1) (1, 1) (1, 1)), when k = 2 and the assignments leading to it were 1 0 1 0, k = 3 and 0 1 0 1, k = 4. But we could get to the same state, ((1, 1) (1, 1) (1, 1) (1, 1)), k = 2 from a different set of assignments, for example: 0 1 0 1, k = 3 and 1 0 1 0, k = 4. If we would memoize the result for the state, ((1, 1) (1, 1) (1, 1) (1, 1)), we could avoid recalculating the recursion for that branch again.
Please let me know if there's anything I could better clarify.
Further elaboration in response to your comment:
The Wikipedia example seems to be pretty much a brute-force with memoization. The algorithm seems to attempt to enumerate all the matrixes but uses memoization to exit early from repeated states. How do we enumerate all possibilities? To take their example, n = 4, we start with the vector [(2,2),(2,2),(2,2),(2,2)] where zeros and ones are yet to be placed. (Since the sum of each tuple in the vector is k, we could have a simpler vector where k and the count of either ones or zeros is maintained.)
At every stage, k, in the recursion, we enumerate all possible configurations for the next vector. If the state exists in our hash, we simply return the value for that key. Otherwise, we assign the vector as a new key in the hash (in which case this recursion branch will continue).
For example:
Vector [(2,2),(2,2),(2,2),(2,2)]
Possible assignments of 1's: [1 1 0 0], [1 0 1 0], [1 0 0 1] ... etc.
First branch: [(2,1),(2,1),(1,2),(1,2)]
is this vector a key in the hash?
if yes, return value lookup
else, assign this vector as a key in the hash where the value is the sum
of the function calls with the next possible vectors as their arguments
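The hash lookup described above can be sketched in Python with a memoized function. This is a minimal sketch rather than the Wikipedia algorithm verbatim: it tracks only how many 1s each column still needs (the zeros are implied by the number of rows left) and sorts that vector, as in the earlier answer, so that equivalent states share one cache entry. The name count_balanced is made up for this illustration.

```python
from functools import lru_cache
from itertools import combinations


def count_balanced(n):
    """Count n x n 0-1 matrices with n/2 ones in every row and column (n even)."""
    half = n // 2
    # Each candidate row is described by the columns that receive a 1.
    rows = list(combinations(range(n), half))

    @lru_cache(maxsize=None)
    def go(rows_left, needed):
        # needed[c] = number of 1s column c still has to receive.
        if rows_left == 0:
            return 1 if not any(needed) else 0
        total = 0
        for ones in rows:
            nxt = list(needed)
            for c in ones:
                nxt[c] -= 1
            if min(nxt) >= 0:
                # Sorting merges states that differ only by column order.
                total += go(rows_left - 1, tuple(sorted(nxt)))
        return total

    return go(n, (half,) * n)


if __name__ == "__main__":
    print(count_balanced(4))  # 90
```

lru_cache plays the role of the hash: the first time a (rows_left, needed) state is seen its value is computed and stored, and every later visit is a lookup.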
Building on the excellent answer by https://stackoverflow.com/users/585411/btilly, I've updated their algorithm to exclude "0" cases in the still_needed tuple. The code is about 50% faster, largely because of more cache hits from using the collapsible tuple.
import time
from typing import Tuple
from sys import argv
from functools import cache
@cache
def possible_rows(n, k=None) -> Tuple[int]:
    if k is None:
        k = n // 2
    return [row for row in _possible_rows(n, k)]
def _possible_rows(n, k) -> Tuple[int]:
    if 0 == n:
        yield tuple()
    else:
        if k < n:
            for row in _possible_rows(n-1, k):
                yield tuple(row + (0,))
        if 0 < k:
            for row in _possible_rows(n-1, k-1):
                yield tuple(row + (1,))
def count(n: int, k: int) -> int:
    if n == 0:
        return 1
    still_needed = tuple([k] * n)
    return count_0_1_matrices(k, still_needed)
@cache
def count_0_1_matrices(k:int, still_needed: Tuple[int]):
    """
    Assume still_needed contains only positive ints, and is sorted ascending
    """
    # Calculate the answer by recursion.
    answer = 0
    for row in possible_rows(len(still_needed), k):
        # Decrement the still_needed value tuple by the row tuple and only
        # keep positive results. Sorting is important for cache hits.
        next_still_needed = tuple(sorted([sn - r for sn, r in zip(still_needed, row) if sn > r]))
        # Only continue if we still need values and there are enough rows left
        if not next_still_needed:
            answer += 1
        elif len(next_still_needed) >= k and sum(next_still_needed) >= next_still_needed[-1] * k:
            # sum / k -> how many rows left. We need enough rows left to continue down this path.
            answer += count_0_1_matrices(k, next_still_needed)
    return answer
if __name__ == "__main__":
    n = 7
    if 1 < len(argv):
        n = int(argv[1])
    start = time.time()
    result = count(2*n, n)
    print(f"{result} in {time.time() - start} seconds")

How do you find the largest gap in a vector in O(n) time?

You are given the locations of various cars in the same lane on a highway as doubles to a vector, in no particular order. How can you find the largest gap between neighboring cars in O(n) time?
It seems like a simple solution would be to sort then check, but of course this isn't linear.
Divide the vector in n+1 equally sized buckets. For each such buckets, store the maximum and the minimum value, all other values can be discarded. Because of the pigeonhole principle, at least one of those parts is empty, so the non-minimum/non-maximum values in either parts don't have an influence for the result.
Then, go over the buckets and calculate the distance to the next and the previous non-empty bucket, and take the maximum; this is the final result.
An example with n=5 and values 5,2,20,17,3. Minimum is 2, maximum is 20 => bucket size is (20-2)/5 = 3.6, rounded up to 4.
Bucket: 2 6 10 14 18 20
Min/Max: 2-5 - - 17,17 20,20
Differences: 2-5, 5-17, 17-20.
Maximum is 5-17.
My Python implementation of ipc's solution:
def maximum_gap(l):
    n = len(l)
    if n < 2:
        return 0
    (x_min, x_max) = (min(l), max(l))
    if x_min == x_max:
        return 0
    buckets = [None] * (n + 1)
    bucket_size = float(x_max - x_min) / n
    for x in l:
        k = int((x - x_min) / bucket_size)
        if buckets[k] is None:
            buckets[k] = (x, x)
        else:
            buckets[k] = (min(x, buckets[k][0]), max(x, buckets[k][1]))
    result = 0
    for i in range(n):
        if buckets[i + 1] is None:
            buckets[i + 1] = buckets[i]
        else:
            result = max(result, buckets[i + 1][0] - buckets[i][1])
    return result
assert maximum_gap([]) == 0
assert maximum_gap([42]) == 0
assert maximum_gap([1, 1, 1, 1]) == 0
assert maximum_gap([1, 2, 3, 4, 6, 8]) == 2
assert maximum_gap([5, 2, 20, 17, 3]) == 12
I use a tuple for bucket's elements, None if empty. In the last part, I eliminate preemptively any remaining empty bucket by assigning it to the previous one (this works, since the first one is guaranteed to be non-empty).
Note the special case when all elements are equal.
