I need an algorithm that produces a partition of the number n into k parts, with the added restriction that each element of the partition must be between a and b. Ideally, all possible partitions satisfying the restriction should be equally likely. Partitions are considered the same if they have the same elements in a different order.
For example, with n=10, k=3, a=2, b=4 one has only {4,4,2} and {4,3,3} as possible outcomes.
Is there a standard algorithm for such a problem? One can assume that at least one partition satisfying the restrictions always exists.
You can implement this as a recursive algorithm. Basically, the recurrence is like this:
if k == 1 and a <= n <= b, then the only partition is [n]; otherwise there is none
otherwise, combine each element x from a to b with every partition of n-x into k-1 parts
to prevent duplicates, also substitute the lower bound a with x in the recursive call
Here's some Python (aka executable pseudo-code):
def partitions(n, k, a, b):
if k == 1 and a <= n <= b:
yield [n]
elif n > 0 and k > 0:
for x in range(a, b+1):
for p in partitions(n-x, k-1, x, b):
yield [x] + p
print(list(partitions(10, 3, 2, 4)))
# [[2, 4, 4], [3, 3, 4]]
This can be improved further: the remaining k-1 elements must sum to at least (k-1)*a and at most (k-1)*b, so the range for x can be restricted accordingly:
min_x = max(a, n - (k-1) * b)
max_x = min(b, n - (k-1) * a)
for x in range(min_x, max_x+1):
For partitions(110, 12, 3, 12) with 3,157 solutions, this reduces the number of recursive calls from 638,679 down to 24,135.
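Putting the tightened range into the generator, the whole thing looks something like this (simply the two snippets above merged, nothing new beyond that):

def partitions(n, k, a, b):
    if k == 1 and a <= n <= b:
        yield [n]
    elif n > 0 and k > 0:
        # The remaining k-1 parts must sum to between (k-1)*a and (k-1)*b,
        # which confines x to this range:
        min_x = max(a, n - (k - 1) * b)
        max_x = min(b, n - (k - 1) * a)
        for x in range(min_x, max_x + 1):
            for p in partitions(n - x, k - 1, x, b):
                yield [x] + p

print(list(partitions(10, 3, 2, 4)))
# [[2, 4, 4], [3, 3, 4]]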
Here's a sampling algorithm that uses conditional probability: count the partitions with a memoized recurrence, draw a uniform random index, and decode that index into a partition one part at a time.
import collections
import random
countmemo = {}
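# count(n, k, a, b): number of partitions of n into k parts, each in [a, b],
# generated with parts in non-increasing order (the upper bound shrinks to the
# part just chosen), so each partition is counted exactly once.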
def count(n, k, a, b):
assert n >= 0
assert k >= 0
assert a >= 0
assert b >= 0
if k == 0:
return 1 if n == 0 else 0
key = (n, k, a, b)
if key not in countmemo:
countmemo[key] = sum(
count(n - c, k - 1, a, c) for c in range(a, min(n, b) + 1))
return countmemo[key]
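# sample(n, k, a, b) draws a uniform random index into the count() total and
# decodes it into a partition part by part (an unranking step), so every
# partition is returned with equal probability.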
def sample(n, k, a, b):
partition = []
x = random.randrange(count(n, k, a, b))
while k > 0:
for c in range(a, min(n, b) + 1):
y = count(n - c, k - 1, a, c)
if x < y:
partition.append(c)
n -= c
k -= 1
b = c
break
x -= y
else:
assert False
return partition
def test():
print(collections.Counter(
tuple(sample(20, 6, 2, 5)) for i in range(10000)))
if __name__ == '__main__':
test()
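A quick sanity check against the example from the question (my own addition, reusing sample() as defined above):

print(sample(10, 3, 2, 4))
# prints [4, 3, 3] or [4, 4, 2], each with probability 1/2,
# since count(10, 3, 2, 4) == 2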
If k and b - a are not too big you can try a randomized depth-first search:
import random
def restricted_partition_rec(n, k, min, max):
if k <= 0 or n < min:
return []
ps = list(range(min, max + 1))
random.shuffle(ps)
for p in ps:
if p > n:
continue
elif p < n:
subp = restricted_partition(n - p, k - 1, min, max)
if subp:
return [p] + subp
elif k == 1:
return [p]
return []
def restricted_partition(n, k, min, max):
return sorted(restricted_partition_rec(n, k, min, max), reverse=True)
print(restricted_partition(10, 3, 2, 4))
>>>
[4, 4, 2]
Although I'm not sure if all the partitions have exactly the same probability in this case.
Can someone explain how the logic of the composition of substitutions works with the following block of code?
plus2(0, X, X). % 0+X = X
plus2(s(X), Y, s(Z)) :-
plus2(Y, X, Z). % (X+1) + Y = Z+1 therefore Y+X=Z
Here is better naming:
% Reduced to zero
peano_add(0, Sum, Sum).
peano_add(s(N), M, s(Sum)) :-
% Decrement towards 0
% Swap N & M, because N + M is M + N
peano_add(M, N, Sum).
This is using Peano arithmetic, which represents natural numbers (i.e. integers starting from zero) in a relative way, as compound terms: each number is ultimately a chain of successors of 0. For example, s(s(0)) represents 2. This relative representation is convenient and elegant for Prolog, because it can be reasoned with even when a variable is uninstantiated (var).
In SWI-Prolog, this produces:
?- peano_add(N, M, Sum).
N = 0,
M = Sum ; % When N is zero, M is same as Sum - could be 0 or successor
N = Sum, Sum = s(_),
M = 0 ; % When M is zero, N is same as Sum
N = s(0),
M = s(_A),
Sum = s(s(_A)) ; % 1 + 1 = 2
N = s(s(_A)),
M = s(0),
Sum = s(s(s(_A))) ; % 2 + 1 = 3
N = s(s(0)),
M = s(s(_A)),
Sum = s(s(s(s(_A)))) ; % 2 + 2 = 4
N = s(s(s(_A))),
M = s(s(0)),
Sum = s(s(s(s(s(_A))))) % 3 + 2 = 5 etc.
... and if we ask it how we can add two natural numbers to sum to 2:
?- peano_add(N, M, s(s(0))).
N = 0,
M = s(s(0)) ; % 0 + 2
N = s(s(0)),
M = 0 ; % 2 + 0
N = M, M = s(0) ; % 1 + 1
false.
Whereas if we don't swap the arguments:
% Reduced to zero
peano_add(0, Sum, Sum).
peano_add(s(N), M, s(Sum)) :-
% Decrement towards 0
% Not swapping args, to demonstrate weakness
peano_add(N, M, Sum).
... we get:
?- peano_add(N, M, Sum).
N = 0,
M = Sum ;
N = s(0),
Sum = s(M) ;
N = s(s(0)),
Sum = s(s(M)) ;
N = s(s(s(0))),
Sum = s(s(s(M))) ;
N = s(s(s(s(0)))),
Sum = s(s(s(s(M)))) ;
... which is still correct, but doesn't "involve" M as much as it could.
Both methods are counting from 0 upwards to infinity.
Swapping the parameters brings the advantage of checking the 2nd argument as well, so the predicate can fail fast when appropriate:
?- peano_add(s(s(N)), z, Sum).
false. % Correct, because z is not valid
% Versus, when unswapped, this undesirable behaviour:
?- peano_add(s(s(N)), z, Sum).
N = 0,
Sum = s(s(z)) ; % Wrong - did not check whether z is valid
N = s(0),
Sum = s(s(s(z))) ; % Still wrong
N = s(s(0)),
Sum = s(s(s(s(z)))) ; % Will keep being wrong
Sadly, there is a common practice in Prolog example code of using meaningless variable names (such as A, B, X, Y), which adds confusion and should be generally avoided.
I've been scratching my head about this for two days now and I cannot come up with a solution. What I'm looking for is a function f(s, n) that returns the set of all subsets of s whose length is n.
Demo:
s={a, b, c, d}
f(s, 4)
{{a, b, c, d}}
f(s, 3)
{{a, b, c}, {a, b, d}, {a, c, d}, {b, c, d}}
f(s, 2)
{{a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}}
f(s, 1)
{{a}, {b}, {c}, {d}}
I have a feeling that recursion is the way to go here. I've been fiddling with something like
f(S, n):
for s in S:
t = f( S-{s}, n-1 )
...
But this does not seem to do the trick. I did notice that len(f(s,n)) seems to be the binomial coefficient bin(len(s), n). I guess this could be utilized somehow.
Can you help me please?
Let us call n the size of the array and k the number of elements to be put in a subset.
Let us consider the first element A[0] of the array A.
If this element is put in the subset, the problem reduces to a similar (n-1, k-1) problem.
If not, it becomes an (n-1, k) problem.
This can be simply implemented in a recursive function.
We just have to take care of the extreme cases k == 0 and k > n.
During the process, we also have to keep track of:
n: the number of remaining elements of A to consider
k: the number of elements that remain to be put in the current subset
index: the index of the next element of A to consider
current_subset: the array that stores the elements already selected.
Here is some simple C++ code to illustrate the algorithm.
Output
For 5 elements and subsets of size 3:
3 4 5
2 4 5
2 3 5
2 3 4
1 4 5
1 3 5
1 3 4
1 2 5
1 2 4
1 2 3
#include <iostream>
#include <vector>
void print (const std::vector<std::vector<int>>& subsets) {
for (auto &v: subsets) {
for (auto &x: v) {
std::cout << x << " ";
}
std::cout << "\n";
}
}
// n: number of remaining elements of A to consider
// k: number of elements that remain to be put in the current subset
// index: index of next element of A to consider
void Get_subset_rec (std::vector<std::vector<int>>& subsets, int n, int k, int index, std::vector<int>& A, std::vector<int>& current_subset) {
if (n < k) return;
if (k == 0) {
subsets.push_back (current_subset);
return;
}
Get_subset_rec (subsets, n-1, k, index+1, A, current_subset);
current_subset.push_back(A[index]);
Get_subset_rec (subsets, n-1, k-1, index+1, A, current_subset);
current_subset.pop_back(); // remove last element
return;
}
void Get_subset (std::vector<std::vector<int>>& subsets, int subset_length, std::vector<int>& A) {
std::vector<int> current_subset;
Get_subset_rec (subsets, A.size(), subset_length, 0, A, current_subset);
}
int main () {
int subset_length = 3; // subset size
std::vector A = {1, 2, 3, 4, 5};
int size = A.size();
std::vector<std::vector<int>> subsets;
Get_subset (subsets, subset_length, A);
std::cout << subsets.size() << "\n";
print (subsets);
}
Live demo
One way to solve this is by backtracking. Here's a possible algorithm in pseudo code:
def backtrack(input_set, idx, partial_res, res, n):
    if len(partial_res) == n:
        res.append(partial_res[:])
        return
    if idx == len(input_set):
        return
    partial_res.append(input_set[idx])
    backtrack(input_set, idx+1, partial_res, res, n) # path with input_set[idx]
    partial_res.pop()
    backtrack(input_set, idx+1, partial_res, res, n) # path without input_set[idx]
Time complexity of this approach is O(2^len(input_set)) since we make 2 branches at each element of input_set, regardless of whether the path leads to a valid result or not. The space complexity is O(len(input_set) choose n) since this is the number of valid subsets you get, as you correctly pointed out in your question.
Now, there is a way to optimize the above algorithm to reduce the time complexity to O(len(input_set) choose n) by pruning the recursive tree to paths that can lead to valid results only.
If n - len(partial_res) > len(input_set) - idx, we are sure that even if we took every remaining element in input_set[idx:] we would still fall short of n. So we can use this as a base case: return early and prune that path.
Also, if n - len(partial_res) == len(input_set) - idx, this means that we need each and every element in input_set[idx:] to reach the required length n. In that case we cannot skip any element, so the second branch of our recursive call becomes redundant.
backtrack(input_set, idx+1, partial_res, res, n) # path without input_set[idx]
We can skip this branch with a conditional check.
Implementing these base cases correctly reduces the time complexity of the algorithm to O(len(input_set) choose n), which is a hard limit because that's exactly the number of valid subsets there are.
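As a concrete illustration (my own sketch, not the original pseudocode), the two pruning checks can be added like this:

def backtrack(input_set, idx, partial_res, res, n):
    need = n - len(partial_res)
    remaining = len(input_set) - idx
    if need == 0:
        res.append(partial_res[:])
        return
    if need > remaining:          # pruning: not enough elements left
        return
    partial_res.append(input_set[idx])
    backtrack(input_set, idx + 1, partial_res, res, n)      # with input_set[idx]
    partial_res.pop()
    if need < remaining:          # skipping is only possible if we can afford to
        backtrack(input_set, idx + 1, partial_res, res, n)  # without input_set[idx]

def f(s, n):
    res = []
    backtrack(sorted(s), 0, [], res, n)
    return res

print(f({'a', 'b', 'c', 'd'}, 2))
# [['a', 'b'], ['a', 'c'], ['a', 'd'], ['b', 'c'], ['b', 'd'], ['c', 'd']]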
subseqs 0 _ = [[]]
subseqs k [] = []
subseqs k (x:xs) = map (x:) (subseqs (k-1) xs) ++ subseqs k xs
Live demo
The function looks for subsequences of (non-negative) length k in a given sequence. There are three cases:
If the length is 0: there is a single empty subsequence in any sequence.
Otherwise, if the sequence is empty: there are no subsequences of any (positive) length k.
Otherwise, there is a non-empty sequence that starts with x and continues with xs, and a positive length k. All our subsequences are of two kinds: those that contain x (they are subsequences of xs of length k-1, with x stuck at the front of each one), and those that do not contain x (they are just subsequences of xs of length k).
The algorithm is a more or less literal translation of these notes to Haskell. Notation cheat sheet:
[] an empty list
[w] a list with a single element w
x:xs a list with a head of x and a tail of xs
(x:) a function that sticks an x in front of any list
++ list concatenation
f a b c a function f applied to arguments a b and c
Here is a non-recursive Python function that takes a list superset and returns a generator producing all subsets of size k.
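# The generator keeps `indices`, a list of k strictly increasing positions into
# `superset`. Each pass yields the current combination, then advances the
# rightmost index that can still move and resets every index to its right.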
def subsets_k(superset, k):
if k > len(superset):
return
if k == 0:
yield []
return
indices = list(range(k))
while True:
yield [superset[i] for i in indices]
i = k - 1
while indices[i] == len(superset) - k + i:
i -= 1
if i == -1:
return
indices[i] += 1
for j in range(i + 1, k):
indices[j] = indices[i] + j - i
Testing it:
for s in subsets_k(['a', 'b', 'c', 'd', 'e'], 3):
print(s)
Output:
['a', 'b', 'c']
['a', 'b', 'd']
['a', 'b', 'e']
['a', 'c', 'd']
['a', 'c', 'e']
['a', 'd', 'e']
['b', 'c', 'd']
['b', 'c', 'e']
['b', 'd', 'e']
['c', 'd', 'e']
I'm unsure about the value of x in this Hoare triple: { a = 0 } while (x > a) do (x := x − 1) { x = 0 }.
I have 2 potential ideas for how to prove whether this Hoare triple is valid or not:
Assuming x is 0, the Hoare triple is valid, or
Assuming x is any arbitrary value, we break it down into cases and conclude that the Hoare triple is not valid for all values of x
Are either of the above approaches valid, or is there another approach I should take?
So you have
{a = 0}
while (x > a)
x := x - 1
{x = 0}
Let's try the loop invariant x ≥ a & a = 0 and let's abbreviate it with I. When we annotate the program, we get:
{a = 0}
{I} # Loop invariant should be true before the loop
while (x > a)
{I & x > a} # Loop invariant + condition holds
x := x - 1
{I} # Loop invariant should be true after each iteration
{I & x ≤ a} # Loop invariant + negation of loop condition
{x = 0}
Now we need to apply the weakest-precondition rule for the assignment x := x - 1, i.e. substitute x - 1 for x in I:
{a = 0}
{I}
while (x > a)
{I & x > a}
{x - 1 ≥ a & a = 0} # I[x-1/x]
x := x - 1
{I}
{I & x ≤ a}
{x = 0}
We end up with the following proof obligations:
(a = 0) ⇒ (x ≥ a & a = 0): holds if we assume x ∈ ℕ, i.e. x ≥ 0.
(x ≥ a & a = 0 & x > a) ⇒ (x - 1 ≥ a & a = 0): holds, since x > a = 0 implies x - 1 ≥ 0 = a.
(x ≥ a & a = 0 & x ≤ a) ⇒ (x = 0): holds, since x ≥ a and x ≤ a force x = a = 0.
So the original Hoare triple holds (for x ranging over the natural numbers).
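For reference, the reasoning above is an instance of the standard partial-correctness while rule (stated here in LaTeX; I is the invariant and B the guard used above):

\[
\frac{\{\, I \land B \,\}\; S \;\{\, I \,\}}{\{\, I \,\}\ \mathbf{while}\ B\ \mathbf{do}\ S\ \{\, I \land \lnot B \,\}}
\]

with I ≡ (x ≥ a ∧ a = 0), B ≡ (x > a) and S ≡ (x := x − 1); the rule of consequence then discharges the three implications listed above.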
I'm quite new to Prolog and I am trying to write a predicate that gives the nth prime number; it looks like nth_prime(N, Prime).
I have already written the predicate that checks whether a number is prime or not:
div(X, Y):- 0 is X mod Y.
div(X, Y):- X>Y+1, Y1 is Y+1, div(X, Y1).
prime(2):- true.
prime(X):- X<2, false.
prime(X):- not(div(X, 2)).
I don't understand what my next step is, and how I should count which prime belongs to N.
Your code is a bit unusual for Prolog, but (with the exception of prime(1), which incorrectly succeeds) it works.
Here is a solution for your predicate:
nextprime(N,N):-
prime(N),
!.
nextprime(P, Prime):-
PP is P+1,
nextprime(PP,Prime).
nthprime(1, 2).
nthprime(N, Prime):-
N>1,
NN is N-1,
nthprime(NN, PrevPrime),
PP is PrevPrime+1,
nextprime(PP, Prime).
?- nthprime(1,P).
P = 2 ;
false.
?- nthprime(2,P).
P = 3 ;
false.
?- nthprime(3,P).
P = 5 ;
false.
It works as follows: it is known that the first prime number is 2 (nthprime(1, 2).). For every other number N larger than 1, get the previous prime number (nthprime(NN, PrevPrime)) and add 1 until you hit a prime number. The add-1 part is done by a helper predicate nextprime/2: for a given number P it checks whether P is prime. If yes, it returns P; otherwise it calls itself for the next higher number (nextprime(PP, Prime)) and forwards the output. The ! is called a cut; it cuts away the other choice branches, so once you hit a prime you cannot go back and try the other path.
To test it you can ask ?- nthprime(N, P). for a given N. Or, to check multiple answers at once, let's introduce a helper predicate nthprimeList/2 which calls nthprime/2 for every item in the first list and puts the corresponding outputs into a second list:
nthprimeList([],[]).
nthprimeList([N|TN],[P|TP]):-
nthprime(N,P),
nthprimeList(TN,TP).
?- nthprimeList([1,2,3,4,5,6,7,8,9],[P1,P2,P3,P4,P5,P6,P7,P8,P9]).
P1 = 2,
P2 = 3,
P3 = 5,
P4 = 7,
P5 = 11,
P6 = 13,
P7 = 17,
P8 = 19,
P9 = 23;
false.
Using your definitions, we define the following to count up and test all numbers from 2 and up, one after another:
nth_prime(N, Prime):-
nth_prime(N, Prime, 1, 2). % 2 is the candidate for 1st prime
nth_prime(N, P, I, Q):- % Q is I-th prime candidate
prime(Q)
-> ( I = N, P = Q
; I1 is I+1, Q1 is Q+1, nth_prime(N, P, I1, Q1)
)
; Q1 is Q+1, nth_prime(N, P, I, Q1).
Testing:
30 ?- nth_prime(N,P).
N = 1,
P = 2 ;
N = 2,
P = 3 ;
N = 3,
P = 5 ;
N = 4,
P = 7 ;
N = 5,
P = 11 .
31 ?- nth_prime(N,P), N>24.
N = 25,
P = 97 ;
N = 26,
P = 101 ;
N = 27,
P = 103 .
32 ?- nth_prime(N,P), N>99.
N = 100,
P = 541 ;
N = 101,
P = 547 ;
N = 102,
P = 557 .
Let's say we have a list/an array of positive integers x1, x2, ..., xn.
We can do a join operation on this sequence: we may replace two adjacent elements with a single element equal to their sum. For example:
-> array/list: [1;2;3;4;5;6]
we can join 2 and 3, and replace them with 5;
we can join 5 and 6, and replace them with 11;
we cannot join 2 and 4;
we cannot join 1 and 3 etc.
The main problem is to find the minimum number of join operations for a given sequence, after which the sequence is sorted in increasing order.
Note: empty and one-element sequences are sorted in increasing order.
Basic examples:
for [4; 6; 5; 3; 9] the solution is 1 (we join 5 and 3)
for [1; 3; 6; 5] the solution is also 1 (we join 6 and 5)
What I am looking for is an algorithm that solves this problem. It could be in pseudocode, C, C++, PHP, OCaml or similar (I mean: I would understand the solution if you wrote it in one of these languages).
This is an ideal problem to solve with dynamic programming, and the recurrence described by @lijie is exactly the right approach, with a few minor tweaks to ensure all possibilities are considered. There are two key observations: (a) any sequence of join operations results in a set of non-overlapping summed subsequences of the original vector, and (b) for the optimal join-sequence, if we look to the right of any summed subsequence (m...n), that portion is an optimal solution to the sub-problem: "find an optimal join-sequence for the sub-vector (n+1)...N such that the resulting final sequence is sorted, and all elements are >= sum(m...n)".
Implementing the recurrence directly would of course give an exponential-time algorithm, but a simple tweak using dynamic programming makes it polynomial: each (m, n) pair is solved only once, so there are O(N^2) subproblems, each with a loop over the split point. An easy way to implement the recurrence with dynamic programming is to have a data structure indexed by (m, n) that stores the result of f(m, n) once it is computed, so that the next time we invoke f(m, n) we can look up the previously saved result. The following code does this using the R programming language. I am using the formulation where we want to find the minimum number of joins to get a non-decreasing sequence. For those new to R: to test this code, simply download R from any mirror (Google "R Project"), fire it up, paste the two function definitions (f and solve) into the console, and then solve any vector with solve(c(...)) as in the examples below.
f <- function(m,n) {
name <- paste(m,n)
nCalls <<- nCalls + 1
# use <<- for global assignment
if( !is.null( Saved[[ name ]] ) ) {
# the solution for (m,n) has been cached, look it up
nCached <<- nCached + 1
return( Saved[[ name ]] )
}
N <- length(vec) # vec is global to this function
sum.mn <- -Inf
if(m >= 1)
sum.mn <- sum( vec[m:n] )
if(n == N) { # boundary case: the (m,n) range includes the last number
result <- list( num = 0, joins = list(), seq = c())
} else
{
bestNum <- Inf
bestJoins <- list()
bestSeq <- c()
for( k in (n+1):N ) {
sum.nk <- sum( vec[ (n+1):k ] )
if( sum.nk < sum.mn ) next
joinRest <- f( n+1, k )
numJoins <- joinRest$num + k-n-1
if( numJoins < bestNum ) {
bestNum <- numJoins
if( k == n+1 )
bestJoins <- joinRest$joins else
bestJoins <- c( list(c(n+1,k)), joinRest$joins )
bestSeq <- c( sum.nk, joinRest$seq)
}
}
result <- list( num = bestNum, joins = bestJoins, seq = bestSeq )
}
Saved[[ name ]] <<- result
result
}
solve <- function(input) {
vec <<- input
nCalls <<- 0
nCached <<- 0
Saved <<- c()
result <- f(0,0)
cat( 'Num calls to f = ', nCalls, ', Cached = ', nCached, '\n')
cat( 'Min joins = ', result$num, '\n')
cat( 'Opt summed subsequences: ')
cat( do.call( paste,
lapply(result$joins,
function(pair) paste(pair[1], pair[2], sep=':' ))),
'\n')
cat( 'Final Sequence: ', result$seq, '\n' )
}
Here are some sample runs:
> solve(c(2,8,2,2,9,12))
Num calls to f = 22 , Cached = 4
Min joins = 2
Opt summed subsequences: 2:3 4:5
Final Sequence: 2 10 11 12
> solve(c(1,1,1,1,1))
Num calls to f = 19 , Cached = 3
Min joins = 0
Opt summed subsequences:
Final Sequence: 1 1 1 1 1
> solve(c(4,3,10,11))
Num calls to f = 10 , Cached = 0
Min joins = 1
Opt summed subsequences: 1:2
Final Sequence: 7 10 11
> solve(c (2, 8, 2, 2, 8, 3, 8, 9, 9, 2, 9, 8, 8, 7, 4, 2, 7, 5, 9, 4, 6, 7, 4, 7, 3, 4, 7, 9, 1, 2, 5, 1, 8, 7, 3, 3, 6, 3, 8, 5, 6, 5))
Num calls to f = 3982 , Cached = 3225
Min joins = 30
Opt summed subsequences: 2:3 4:5 6:7 8:9 10:12 13:16 17:19 20:23 24:27 28:33 34:42
Final Sequence: 2 10 10 11 18 19 21 21 21 21 26 46
Note that the minimum number of joins for the sequence considered by @kotlinski is 30, not 32 or 33.
Greedy algorithm!
import Data.List (inits)
joinSequence :: (Num a, Ord a) => [a] -> Int
joinSequence (x:xs) = joinWithMin 0 x xs
where joinWithMin k _ [] = k
joinWithMin k x xs =
case dropWhile ((< x) . snd) $ zip [0..] $ scanl1 (+) xs
of (l, y):_ -> joinWithMin (k + l) y $ drop (l+1) xs
_ -> k + length xs
joinSequence _ = 0
At each step, grab more elements until their sum is not less than the last. If you run out of elements, just join all the ones that remain to the prior group.
That was wrong.
Combinatorial explosion!
joinSequence :: (Num a, Ord a) => [a] -> Int
joinSequence = joinWithMin 0 0
where joinWithMin k _ [] = k
joinWithMin k m xs =
case dropWhile ((< m) . snd) $ zip [0..] $ scanl1 (+) xs
of [] -> k + length xs
ys -> minimum [ joinWithMin (k+l) y $ drop (l+1) xs
| (l, y) <- ys ]
Just try every possible joining and take the minimum. I couldn't think of a smart heuristic to limit backtracking, but this should be O(n²) with dynamic programming, and O(2^n) as written.
A dynamic programming approach:
Let the original array be a[i], 0 <= i < N.
Define f(m, n) to be the minimum number of joins needed to make a[n..N-1] sorted, such that all elements in the sorted sublist are > (or >=, if the other variant is desired) the sum of a[m..n-1] (take the sum of an empty list to be -inf).
The base case is f(m, N) = 0 (the sublist is empty).
The recursion is f(m, n) = min_{n < k <= N s.t. sum(a[n..k-1]) > sum(a[m..n-1])} f(n, k) + k-n-1. If no values of k are suitable, then let f(m, n) = inf (anything >= N will also work, because there are at most N-1 joins).
Calculate f(m,n) in decreasing order of m and n.
Then, the desired answer is f(0,0).
EDIT
Oops this is basically ephemient's second answer, I believe, although I am not familiar enough with Haskell to know exactly what it is doing.
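For concreteness, here is a short memoized Python sketch of the recurrence above (my own code, not from any of the answers; it uses the >= variant so the target is a non-decreasing sequence, and min_joins/seg_sum are names I made up):

from functools import lru_cache

def min_joins(a):
    # Prefix sums so that sum(a[i:j]) is prefix[j] - prefix[i].
    N = len(a)
    prefix = [0]
    for x in a:
        prefix.append(prefix[-1] + x)

    def seg_sum(i, j):  # sum of a[i:j]
        return prefix[j] - prefix[i]

    @lru_cache(maxsize=None)
    def f(m, n):
        # Min joins to sort a[n:] so every summed block is >= sum(a[m:n]).
        if n == N:
            return 0
        prev = seg_sum(m, n) if n > m else float('-inf')
        best = float('inf')
        for k in range(n + 1, N + 1):
            if seg_sum(n, k) >= prev:
                best = min(best, f(n, k) + (k - n - 1))
        return best

    return f(0, 0)

print(min_joins([4, 6, 5, 3, 9]))   # 1
print(min_joins([1, 3, 6, 5]))      # 1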
Some Haskell code:
sortJoin (a:b:c:xs)
| a <= b = a : sortJoin (b:c:xs)
| a+b <= c = a+b : sortJoin (c:xs)
| otherwise = sortJoin (a:b+c:xs)
sortJoin (a:b:[]) = if a <= b then [a,b] else [a+b]
sortJoin a = a
edits xs = length xs - length (sortJoin xs)
UPDATE: Made this work with test = [2, 8, 2, 2, 8, 3, 8, 9, 9, 2, 9, 8, 8, 7, 4, 2, 7, 5, 9, 4, 6, 7, 4, 7, 3, 4, 7, 9, 1, 2, 5, 1, 8, 7, 3, 3, 6, 3, 8, 5, 6, 5]
...now we get:
> sortJoin test
[2,8,12,20,20,23,27,28,31,55]
> edits test
32
Hopefully keeping it simple. Here's some pseudo-code that's exponential time.
Function "join" (list, max-join-count, join-count) ->
Fail if join-count is greater than max-join-count.
If the list looks sorted return join-count.
For Each number In List
Recur (list with current and next number joined, max-join-count, join-count + 1)
Function "best-join" (list) ->
max-join-count = 0
while not join (list, max-join-count++)
Here's an implementation in Clojure:
(defn join-ahead [f i v]
(concat (take i v)
[(f (nth v i) (nth v (inc i)))]
(drop (+ 2 i) v)))
(defn sort-by-joining
"Sort a list by joining neighboring elements with `+'"
([v max-join-count join-count]
(if (or (nil? max-join-count)
(<= join-count max-join-count))
(if (or (empty? v)
(= v (sort v)))
{:vector v :join-count join-count}
(loop [i 0]
(when (< (inc i) (count v))
(let [r (sort-by-joining (join-ahead + i v)
max-join-count
(inc join-count))]
(or r (recur (inc i)))))))))
([v max-join-count]
(sort-by-joining v max-join-count 0))
([v]
(sort-by-joining v nil 0)))
(defn fewest-joins [v]
(loop [i 0]
(if (sort-by-joining v i)
i
(recur (inc i)))))
(deftest test-fewest-joins
(is (= 0 (fewest-joins nil)))
(is (= 1 (fewest-joins [4 6 5 3 9])))
(is (= 6 (fewest-joins [1 9 22 90 1 1 1 32 78 13 1]))))
This is pchalasani's code ported to F# with some modifications. The memoization is similar; I added a sumRange function generator for O(1) range sums and moved the start position to f 1 0 to skip checking for n = 0 in minJoins.
let minJoins (input: int array) =
let length = input.Length
let sum = sumRange input
let rec f = memoize2 (fun m n ->
if n = length then
0
else
let sum_mn = sum m n
{n + 1 .. length}
|> Seq.filter (fun k -> sum (n + 1) k >= sum_mn)
|> Seq.map (fun k -> f (n + 1) k + k-n-1)
|> Seq.append {length .. length}
|> Seq.min
)
f 1 0
Full code.
open System.Collections.Generic
// standard memoization
let memoize2 f =
let cache = new Dictionary<_, _>()
(fun x1 x2 ->
match cache.TryGetValue((x1, x2)) with
| true, y -> y
| _ ->
let v = f x1 x2
cache.Add((x1, x2), v)
v)
// returns a function that takes two integers n,m and returns sum(array[n:m])
let sumRange (array : int array) =
let forward = Array.create (array.Length + 1) 0
let mutable total = 0
for i in 0 .. array.Length - 1 do
total <- total + array.[i]
forward.[i + 1] <- total
(fun i j -> forward.[j] - forward.[i - 1])
// min joins to sort an array ascending
let minJoins (input: int array) =
let length = input.Length
let sum = sumRange input
let rec f = memoize2 (fun m n ->
if n = length then
0
else
let sum_mn = sum m n
{n + 1 .. length}
|> Seq.filter (fun k -> sum (n + 1) k >= sum_mn)
|> Seq.map (fun k -> f (n + 1) k + k-n-1)
|> Seq.append {length .. length} // if nothing passed the filter return length as the min
|> Seq.min
)
f 1 0
let input = [|2;8;2;2;8;3;8;9;9;2;9;8;8;7;4;2;7;5;9;4;6;7;4;7;3;4;7;9;1;2;5;1;8;7;3;3;6;3;8;5;6;5|]
let output = minJoins input
printfn "%A" output
// outputs 30