How to check if this algorithm may not terminate? - algorithm

Let x denote a vector of p values (i.e. a data point in p dimensional space).
I have two sets: a set A of n elements, A = {x1, ..., xn}, and a set B of m elements, B = {y1, ..., ym}, where |A| > 1 and |B| > 1. Given an integer k > 0, let dist(x, k, A) be a function that returns the mean Euclidean distance from x to its k nearest points in A, and dist(x, k, B) the mean Euclidean distance from x to its k nearest points in B.
I have the following algorithm:
Repeat
{
A' = { x in A, such that dist(x, k, A) > dist(x, k, B) }
B' = { x in B, such that dist(x, k, A) < dist(x, k, B) }
A = { x in A such that x not in A' } U B'
B = { x in B such that x not in B' } U A'
}
Until CONDITION == True
Termination: CONDITION is True when no more elements move from A to B or from B to A (that is, A' and B' are both empty), or when |A| or |B| becomes less than or equal to 1.
1) Is it possible to prove that this algorithm terminates ?
2) And if so, is it also possible to have an upper bound for the number of iterations required to terminate ?
Note: the k nearest points to x in a set S, means: the k points (others than x) in S, having the smallest Euclidean distance to x.

It looks like this algorithm can loop forever, oscillating between two or more states. I determined this experimentally using the following Python program:
def mean(seq):
    if len(seq) == 0:
        raise IndexError("didn't expect empty sequence for mean")
    return sum(seq) / float(len(seq))

def dist(a,b):
    return abs(a-b)

def mean_dist(x, k, a):
    neighbors = {p for p in a if p != x}
    neighbors = sorted(neighbors, key=lambda p: dist(p,x))
    return mean([dist(x, p) for p in neighbors[:k]])

def frob(a,b,k, verbose = False):
    def show(msg):
        if verbose:
            print(msg)
    seen_pairs = set()
    iterations = 0
    while True:
        iterations += 1
        show("Iteration #{}".format(iterations))
        a_star = {x for x in a if mean_dist(x, k, a) > mean_dist(x,k,b)}
        b_star = {x for x in b if mean_dist(x, k, a) < mean_dist(x,k,b)}
        a_temp = {x for x in a if x not in a_star} | b_star
        b_temp = {x for x in b if x not in b_star} | a_star
        show("\tA`: {}".format(list(a_star)))
        show("\tB`: {}".format(list(b_star)))
        show("\tA becomes {}".format(list(a_temp)))
        show("\tB becomes {}".format(list(b_temp)))
        if a_temp == a and b_temp == b:
            return a, b
        key = (tuple(sorted(a_temp)), tuple(sorted(b_temp)))
        if key in seen_pairs:
            raise Exception("Infinite loop for values {} and {}".format(list(a_temp),list(b_temp)))
        seen_pairs.add(key)
        a = a_temp
        b = b_temp

import random

#creates a set of random integers, with the given number of elements.
def randSet(size):
    a = set()
    while len(a) < size:
        a.add(random.randint(0, 10))
    return a

size = 2
k = 1
#p equals one because I don't feel like doing vector math today
while True:
    a = randSet(size)
    b = randSet(size)
    try:
        frob(a,b, k)
    except IndexError as e:
        continue
    except Exception as e:
        print("infinite loop detected for initial inputs {} and {}".format(list(a), list(b)))
        #run the algorithm again, but showing our work this time
        try:
            frob(a,b,k, True)
        except:
            pass
        break
Result:
infinite loop detected for initial inputs [10, 4] and [1, 5]
Iteration #1
A`: [10, 4]
B`: [1, 5]
A becomes [1, 5]
B becomes [10, 4]
Iteration #2
A`: [1, 5]
B`: [10, 4]
A becomes [10, 4]
B becomes [1, 5]
Iteration #3
A`: [10, 4]
B`: [1, 5]
A becomes [1, 5]
B becomes [10, 4]
In this case, the loop never terminates because A and B continually switch entirely. While experimenting with larger set sizes, I found a case where only some elements switch:
infinite loop detected for initial inputs [8, 1, 0] and [9, 4, 5]
Iteration #1
A`: [8]
B`: [9]
A becomes [0, 1, 9]
B becomes [8, 4, 5]
Iteration #2
A`: [9]
B`: [8]
A becomes [0, 1, 8]
B becomes [9, 4, 5]
Iteration #3
A`: [8]
B`: [9]
A becomes [0, 1, 9]
B becomes [8, 4, 5]
Here, elements 8 and 9 move back and forth while the other elements stay in place.

Related

How can I get partition?

I am new to Prolog and I want to list the n-ary partitions of a number using backtracking. The result should look something like this:
?- nary(3,9,P).
P = [9] ? ;
P = [3,3,3] ? ;
P = [3,3,1,1,1] ? ;
P = [3,1,1,1,1,1,1] ? ;
P = [1,1,1,1,1,1,1,1,1] ? ;
no
Do you have any ideas of how to do it?
Lots of thanks.
Assuming that you mean n-ary partitions as described in Characterizing the Number of m-ary Partitions Modulo m:
... These are partitions of an integer n wherein each part is a power of a fixed integer m >= 2. ... As an example, note that there are five 3-ary partitions of n=9: 9, 3+3+3, 3+3+1+1+1, 3+1+1+1+1+1+1, 1+1+1+1+1+1+1+1+1. ...
Note, that in the linked Paper the m corresponds to the n in your question. Building on base_limit_powers/3 from this answer, I propose to keep using DCGs and CLP(FD). Let's start with a name for the predicate and the DCG, e.g. nary_partitions_of/3 (Note that I swapped the second and third arguments of your predicate nary/3 to more easily obtain a nice declarative name) and partitions_//4.
The second argument is a list (Partitions), consisting of powers of the first argument (N), that add up to the number that is the third argument (Int). The first number has to be greater than or equal to 2 per definition (see above), hence the sum has to be as well. Since you want the powers in the list in descending order, the predicate reverse/2 from library(lists) can be used to reorder the list of powers as described by base_limit_powers/3 and have partitions_//4 start with the largest number. Because of N and Int being larger than 1 the list of powers is not empty and its corresponding reversed list can be written in head and tail notation ([FRP|RPs], read First of Reversed Powers and rest of Reversed Powers), thereby providing easy access to the first (and largest) number (FRP) that is to be used as second argument of partitions_//4. At the time partitions_//4 is called (using phrase/2), the list Partitions is yet to be described, therefore the sum of its numbers is still 0. Putting all this together the relation nary_partitions_of/3 can be defined like so:
nary_partitions_of(N,Partitions,Int) :-
   N #> 1,
   Int #> 1,
   base_limit_powers(N,Int,Powers),
   reverse(Powers,[FRP|RPs]),
   phrase(partitions_([FRP|RPs],FRP,0,Int),Partitions).
Then the n-ary partitions can be described by a DCG like so:
partitions_(_Powers,_Last,Int,Int) -->   % If the sum and the integer are equal
   [].                                   % the list is finished
partitions_(Powers,Last,Sum0,Int) -->    % otherwise
   {Sum0 #< Int},                        % the sum is smaller than the integer and
   {P #=< Last},                         % the power to be added is less or equal to the last power in the partition and
   {member(P,Powers)},                   % the power is from the list of powers and
   {Sum1 #= Sum0 + P},                   % adds to the sum of powers so far and
   [P],                                  % the power is in the list of partitions and
   partitions_(Powers,P,Sum1,Int).       % the remainder of the partition-list is described recursively
With this definition your example query produces the desired answers:
?- nary_partitions_of(3,P,9).
P = [9] ? ;
P = [3,3,3] ? ;
P = [3,3,1,1,1] ? ;
P = [3,1,1,1,1,1,1] ? ;
P = [1,1,1,1,1,1,1,1,1] ? ;
no
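As a cross-check of the five answers above outside Prolog, here is a minimal Python sketch of the same definition (parts are powers of a fixed base, listed largest first). The name nary_partitions and its argument order are illustrative assumptions and are not part of the Prolog answer.

```python
def nary_partitions(m, n):
    """Yield the partitions of n into powers of m, largest parts first."""
    if m < 2 or n < 1:
        raise ValueError("expected m >= 2 and n >= 1")
    # Powers of m that do not exceed n, in descending order, e.g. [9, 3, 1].
    powers = []
    p = 1
    while p <= n:
        powers.append(p)
        p *= m
    powers.reverse()

    def extend(remaining, start, prefix):
        # prefix holds the parts chosen so far; parts never increase,
        # because only powers from index `start` onwards may be used.
        if remaining == 0:
            yield list(prefix)
            return
        for idx in range(start, len(powers)):
            part = powers[idx]
            if part <= remaining:
                prefix.append(part)
                yield from extend(remaining - part, idx, prefix)
                prefix.pop()

    yield from extend(n, 0, [])


for partition in nary_partitions(3, 9):
    print(partition)
```

The generator enumerates the partitions in the same order as the query above, starting with [9] and ending with nine 1s.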
Due to the use of CLP(FD) you can also use nary_partitions_of/3 to ask different types of questions, e.g. to check if a given list is a partition of some N and Int, however this query loops after producing the answers:
?- nary_partitions_of(N,[9],Int).
Int = N = 9 ? ;
Int = 9,
N = 3 ? ;
% loop here
But due to the use of CLP(FD) that can be remedied by constraining the range of N like so:
?- N in 2..5, nary_partitions_of(N,[9],Int).
Int = 9,
N = 3 ? ;
no
?- N in 2..5, nary_partitions_of(N,[3,3,3],Int).
Int = 9,
N = 3 ? ;
no
?- N in 2..5, nary_partitions_of(N,[3,3,1,1,1],Int).
Int = 9,
N = 3 ? ;
no
?- N in 2..5, nary_partitions_of(N,[3,3,1,1],Int).
Int = 8,
N = 3 ? ;
no
?- N in 2..5, nary_partitions_of(N,[3,3,4],Int).
no
You can also look for n-ary partitions of some number, e.g. 9 without specifying N:
?- N in 2..6, nary_partitions_of(N,P,9).
N = 4,
P = [4,4,1] ? ;
N = 6,
P = [6,1,1,1] ? ;
N = 5,
P = [5,1,1,1,1] ? ;
N = 4,
P = [4,1,1,1,1,1] ? ;
P = [1,1,1,1,1,1,1,1,1],
N in 3..6, % residual goal
N^2#=_A, % residual goal
_A in 10..36 ? ; % residual goal
.
.
.
However this leads to some residual goals in the answers (see CLP(FD) documentation for details). You can again remedy this by constraining N, this time also labeling it:
?- N in 2..6, nary_partitions_of(N,P,9), label([N]).
N = 4,
P = [4,4,1] ? ;
N = 6,
P = [6,1,1,1] ? ;
N = 5,
P = [5,1,1,1,1] ? ;
N = 4,
P = [4,1,1,1,1,1] ? ;
N = 4,
P = [1,1,1,1,1,1,1,1,1] ? ;
N = 5,
P = [1,1,1,1,1,1,1,1,1] ? ;
N = 6,
P = [1,1,1,1,1,1,1,1,1] ? ;
N = 3,
P = [9] ? ;
N = 3,
P = [3,3,3] ? ;
N = 3,
P = [3,3,1,1,1] ? ;
N = 3,
P = [3,1,1,1,1,1,1] ? ;
N = 3,
P = [1,1,1,1,1,1,1,1,1] ? ;
N = 2,
P = [8,1] ? ;
N = 2,
P = [4,4,1] ? ;
N = 2,
P = [4,2,2,1] ? ;
N = 2,
P = [4,2,1,1,1] ? ;
N = 2,
P = [4,1,1,1,1,1] ? ;
N = 2,
P = [2,2,2,2,1] ? ;
N = 2,
P = [2,2,2,1,1,1] ? ;
N = 2,
P = [2,2,1,1,1,1,1] ? ;
N = 2,
P = [2,1,1,1,1,1,1,1] ? ;
N = 2,
P = [1,1,1,1,1,1,1,1,1] ? ;
no
I came up with this, where the nary partitions are either the number in a list, or it expanded (spand). Then the spand process takes the first number evenly divisible by the partition size, splits the list on it, inserts the partition, and either completes there OR spands the newly constructed complete list on backtracking.
This was about the only way I could get your backtracking request without the backtracking undoing the previous expansion and turning 1,1,1 back into 3. I haven't been able to split the list on the first evenly divisible element without leaving any choicepoints in a nicer way, but there probably is a nicer way.
spand(In, Psize, Out) :-
    once((append(Left, [Elem|Right], In),   % first divisible element, e.g. 3
          0 is Elem mod Psize)),
    length(Parts, Psize),                   % make the partition list, e.g. [1,1,1]
    Pt is Elem / Psize,
    maplist(=(Pt), Parts),
    append(Left, Parts, Temp),
    (   append(Temp, Right, Out)            % choicepoint for backtracking
    ;   append(Temp, Right, Out_),
        spand(Out_, Psize, Out)
    ).

nary(Psize, Target, Parts) :-
    (   Parts = [Target]
    ;   spand([Target], Psize, Parts)
    ).
e.g.
?- nary(3, 9, Parts).
Parts = [9] ;
Parts = [3, 3, 3] ;
Parts = [1, 1, 1, 3, 3] ;
Parts = [1, 1, 1, 1, 1, 1, 3] ;
Parts = [1, 1, 1, 1, 1, 1, 1, 1, 1] ;
false
e.g.
?- nary(2, 24, Parts).
Parts = [24] ;
Parts = [12, 12] ;
Parts = [6, 6, 12] ;
Parts = [3, 3, 6, 12] ;
Parts = [3, 3, 3, 3, 12] ;
Parts = [3, 3, 3, 3, 6, 6] ;
Parts = [3, 3, 3, 3, 3, 3, 6] ;
Parts = [3, 3, 3, 3, 3, 3, 3, 3] ;
false
Your example is too short to see if [6, 6, 12] -> [3, 3, 6, 12] or [6, 6, 6, 6]. Depth first or Breadth first.

ruby enumerators: immediately skip multiple iterations (or start iterating from n)

I'm iterating over permutations of a list (18 items) like this:
List = [item0..item18] # (unpredictable)
Permutation_size = 7
Start_at = 200_000_000
for item, i in List.repeated_permutation(Permutation_size).each_with_index
next if i < Start_at
# do stuff
end
Start_at is used to resume from a previously saved state so it's always different but it takes almost 200s to reach 200 million so I'm wondering if there is a faster way to skip multiple iterations or start at iteration n (converting the enumerator to an array takes even longer). If not, a way to create a custom repeated_permutation(n).each_with_index (that yields results in the same order) would also be appreciated.
Feel free to redirect me to an existing answer (I haven't found any)
PS. (what I had come up with)
class Array
  def rep_per_with_index len, start_at = 0
    b = size
    raise 'btl' if b > 36
    # counter = (start_at.to_s b).split('').map {|i| '0123456789'.include?(i) ? i.to_i : (i.ord - 87)} #this is weird, your way is way faster
    # digits of start_at in base b, most significant first
    counter = start_at.to_s(b).chars.map {|i| i.to_i b}
    counter.unshift *[0]*(len - counter.length)
    counter.reverse!
    i = start_at
    Enumerator.new do |y|
      loop do
        y << [counter.reverse.map {|i| self[i]}, i]
        i += 1
        counter[0] += 1
        counter.each_with_index do |v, i|
          if v >= b
            if i == len - 1
              raise StopIteration
            else
              counter[i] = 0
              counter[i + 1] += 1
            end
          else
            break
          end
        end
      end
    end
  end
end
I first construct a helper method, change_base, with three arguments:
off, the base-10 offset into the sequence of repeated permutations of the given array arr,
m, a number system base; and
p, the permutation size.
The method performs three steps to construct an array off_m:
converts off to base m (radix m);
separates the digits of the base m value into an array; and
if necessary, pads the array with leading 0s to make it of size p.
By setting m = arr.size, each digit of off_m is an offset into arr, so off_m maps the base-10 offset to a unique permutation of size p.
def change_base(m, p, off)
  arr = off.to_s(m).chars.map { |c| c.to_i(m) }
  arr.unshift(*[0]*(p-arr.size))
end
Some examples:
change_base(16, 2, 32)
#=> [2, 0]
change_base(16, 3, 255)
#=> [0, 15, 15]
change_base(36, 4, 859243)
#=> [18, 14, 35, 31]
18*36**3 + 14*36**2 + 35*36**1 + 31
#=> 859243
This implementation of change_base requires that m <= 36. I assume that will be sufficient, but algorithms are available to convert base-10 numbers to numbers with arbitrarily-large bases.
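For bases above 36, one option is repeated division instead of string conversion. The following Python sketch illustrates that idea under stated assumptions: it mirrors the Ruby helpers change_base and offset_to_perm by name only, and the range check on the offset is an addition that the Ruby version does not perform.

```python
import itertools


def change_base(m, p, off):
    """Return the p base-m digits of off, most significant digit first."""
    if not 0 <= off < m ** p:
        raise ValueError("offset out of range for this base and size")
    digits = [0] * p
    # Fill from the least significant position using repeated division,
    # which works for any base m >= 1 (no 36-symbol alphabet needed).
    for pos in range(p - 1, -1, -1):
        off, digits[pos] = divmod(off, m)
    return digits


def offset_to_perm(arr, p, off):
    """Map an offset to the corresponding repeated permutation of arr."""
    return [arr[d] for d in change_base(len(arr), p, off)]


arr = list(range(4))
# The offsets agree with the order in which itertools.product enumerates.
for off, expected in enumerate(itertools.product(arr, repeat=2)):
    assert offset_to_perm(arr, 2, off) == list(expected)
print(offset_to_perm(arr, 2, 6))
```

Because each offset is converted independently, resuming at an arbitrary position costs only p divisions rather than a scan over all earlier permutations.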
We now construct a method which accepts the given array, arr, the size of each permutation, p and a given base-10 offset into the sequence of permutations. The method returns a permutation, namely, an array of size p whose elements are elements of arr.
def offset_to_perm(arr, p, off)
  arr.values_at(*change_base(arr.size, p, off))
end
We can now try this with an example.
arr = (0..3).to_a
p = 2
(arr.size**p).times do |off|
  print "perm for off = "
  print " " if off < 10
  print "#{off}: "
  p offset_to_perm(arr, p, off)
end
perm for off =  0: [0, 0]
perm for off =  1: [0, 1]
perm for off =  2: [0, 2]
perm for off =  3: [0, 3]
perm for off =  4: [1, 0]
perm for off =  5: [1, 1]
perm for off =  6: [1, 2]
perm for off =  7: [1, 3]
perm for off =  8: [2, 0]
perm for off =  9: [2, 1]
perm for off = 10: [2, 2]
perm for off = 11: [2, 3]
perm for off = 12: [3, 0]
perm for off = 13: [3, 1]
perm for off = 14: [3, 2]
perm for off = 15: [3, 3]
If we wish to begin at, say, offset 5, we can write:
i = 5
p offset_to_perm(arr, p, i)
[1, 1]
i = i.next #=> 6
p offset_to_perm(arr, p, i)
[1, 2]
...

no. of permutation of number from 1 to n in which i >i+1 and i-1

for a given N how many permutations of [1, 2, 3, ..., N] satisfy the following property.
Let P_1, P_2, ..., P_N denote the permutation. The property we want to satisfy is that there exists an i between 2 and N-1 (inclusive) such that
P_j > P_{j+1} ∀ i ≤ j ≤ N - 1.
P_j > P_{j-1} ∀ 2 ≤ j ≤ i.
like for N=3
Permutations [1, 3, 2] and [2, 3, 1] satisfy the property.
Is there any direct formula or algorithm to find these set in programming.
There are 2^(n-1) - 2 such permutations. If n is the largest element, then the permutation is uniquely determined by the nonempty, proper subset of {1, 2, ..., n-1} which lies to the left of n in the permutation. This answer is consistent with the excellent answer of @גלעדברקן in view of the well-known fact that the elements in each row of Pascal's triangle sum to a power of two (hence the part of the row between the two ones is two less than a power of two).
Here is a Python enumeration which generates all n! permutations and checks them for validity:
import itertools

def validPerm(p):
    n = max(p)
    i = p.index(n)
    if i == 0 or i == n-1:
        return False
    else:
        before = p[:i]
        after = p[i+1:]
        return before == sorted(before) and after == sorted(after, reverse = True)

def validPerms(n):
    nums = list(range(1,n+1))
    valids = []
    for p in itertools.permutations(nums):
        lp = list(p)
        if validPerm(lp): valids.append(lp)
    return valids
For example,
>>> validPerms(4)
[[1, 2, 4, 3], [1, 3, 4, 2], [1, 4, 3, 2], [2, 3, 4, 1], [2, 4, 3, 1], [3, 4, 2, 1]]
which gives the expected number of 6.
On further edit: The above code was to verify the formula for nondegenerate unimodal permutations (to coin a phrase since "unimodal permutations" is used in the literature for the 2^(n-1) permutations with exactly one peak, but the 2 which either begin or end with n are arguably in some sense degenerate). From an enumeration point of view you would want to do something more efficient. The following is a Python implementation of the idea behind the answer of @גלעדברקן:
def validPerms(n):
    valids = []
    nums = list(range(1,n)) #1,2,...,n-1
    snums = set(nums)
    for i in range(1,n-1):
        for first in itertools.combinations(nums,i):
            #first will be already sorted
            rest = sorted(snums - set(first),reverse = True)
            valids.append(list(first) + [n] + rest)
    return valids
It is functionally equivalent to the above code, but substantially more efficient.
Let's look at an example:
{1,2,3,4,5,6}
Clearly, any positioning of 6 at i will mean the right side of it will be sorted descending and the left side of it ascending. For example, i = 3
{1,2,6,5,4,3}
{1,3,6,5,4,2}
{1,4,6,5,3,2}
...
So for each positioning of N between 2 and n-1, we have (n - 1) choose (position - 1) arrangements. This leads to the answer:
sum [(n - 1) choose (i - 1)], for i = 2...(n - 1)
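A quick numeric check of the sum above against the closed form 2^(n-1) - 2 can be sketched as follows. This assumes Python 3.8 or later for math.comb; the function names are illustrative and do not come from the answers.

```python
from math import comb


def count_by_sum(n):
    """Sum C(n-1, i-1) over the allowed peak positions i = 2 .. n-1."""
    return sum(comb(n - 1, i - 1) for i in range(2, n))


def count_closed_form(n):
    """Full row sum 2^(n-1) of Pascal's triangle minus the two end terms."""
    return 2 ** (n - 1) - 2


for n in range(2, 12):
    assert count_by_sum(n) == count_closed_form(n)
print(count_by_sum(3), count_by_sum(4))
```

For n = 3 and n = 4 this gives 2 and 6, matching the two permutations listed in the question and the six returned by validPerms(4).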
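The number of such permutations, ans, is computed as follows:
ans = 2^(n-1)
ans -= 2
because the peak must lie at a position i with 2 <= i <= n-1, and the two excluded end positions each contribute exactly one permutation, since (n-1)C0 = (n-1)C(n-1) = 1.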

Number of possible equations of K numbers whose sum is N in ruby

I have to create a program in Ruby on Rails that solves the problem below quickly. I am getting an acceptable response time for k=4, but the response time is much longer for k>5.
Problem:
The problem is response time. When the value of k is more than 5 (k>5), the response takes too long for the equation given below.
Input: K, N (where 0 < N < ∞, 0 < K < ∞, and K <= N)
Output: Number of possible equations of K numbers whose sum is N.
Example Input:
N=10 K=3
Example Output:
Total unique equations = 8
1 + 1 + 8 = 10
1 + 2 + 7 = 10
1 + 3 + 6 = 10
1 + 4 + 5 = 10
2 + 2 + 6 = 10
2 + 3 + 5 = 10
2 + 4 + 4 = 10
3 + 3 + 4 = 10
For reference, N=100, K=3 should have a result of 833 unique sets
Here is my ruby code
module Combination
  module Pairs
    class Equation
      def initialize(params)
        @arr=[]
        @n = params[:n]
        @k = params[:k]
      end
      #To create possible equations
      def create_equations
        return "Please Enter value of n and k" if @k.blank? && @n.blank?
        begin
          Integer(@k)
        rescue
          return "Error: Please enter any +ve integer value of k"
        end
        begin
          Integer(@n)
        rescue
          return "Error: Please enter any +ve integer value of n"
        end
        return "Please enter k < n" if @n < @k
        create_equations_sum
      end
      def create_equations_sum
        aar = []
        @arr = []
        @list_elements=(1..@n).to_a
        (1..@k-1).each do |i|
          aar << [*0..@n-1]
        end
        traverse([], aar, 0)
        return @arr.uniq #return result
      end
      #To check sum
      def generate_sum(*args)
        new_elements = []
        total= 0
        args.flatten.each do |arg|
          total += @list_elements[arg]
          new_elements << @list_elements[arg]
        end
        if total < @n
          new_elements << @n - total
          @arr << new_elements.sort
        else
          return
        end
      end
      def innerloop(arrayOfCurrentValues)
        generate_sum(arrayOfCurrentValues)
      end
      #Recursive method to create dynamic nested loops.
      def traverse(accumulated,params, index)
        if (index==params.size)
          return innerloop(accumulated)
        end
        currentParam = params[index]
        currentParam.each do |currentElementOfCurrentParam|
          traverse(accumulated+[currentElementOfCurrentParam],params, index+1)
        end
      end
    end
  end
end
run the code using
params = {:n =>100, :k =>4}
c = Combination::Pairs::Equation.new(params)
c.create_equations
Here are two ways to compute your answer. The first is simple but not very efficient; the second, which relies on an optimization technique, is much faster, but requires considerably more code.
Compact but Inefficient
This is a compact way to do the calculation, making use of the method Array#repeated_combination:
Code
def combos(n,k)
  [*(1..n-k+1)].repeated_combination(k).select { |a| a.reduce(:+) == n }
end
Examples
combos(10,3)
#=> [[1, 1, 8], [1, 2, 7], [1, 3, 6], [1, 4, 5],
#    [2, 2, 6], [2, 3, 5], [2, 4, 4], [3, 3, 4]]
combos(100,3).size
#=> 833
combos(1000,3).size
#=> 83333
Comment
The first two calculations take well under one second, but the third took a couple of minutes.
More efficient, but increased complexity
Code
def combos(n,k)
  return nil if k.zero?
  return [n] if k==1
  return [1]*k if k==n
  h = (1..k-1).each_with_object({}) { |i,h| h[i]=[[1]*i] }
  (2..n-k+1).each do |i|
    g = (1..[n/i,k].min).each_with_object(Hash.new {|h,k| h[k]=[]}) do |m,f|
      im = [i]*m
      mxi = m*i
      if m==k
        f[mxi].concat(im) if mxi==n
      else
        f[mxi] << im if mxi + (k-m)*(i+1) <= n
        (1..[(i-1)*(k-m), n-mxi].min).each do |j|
          h[j].each do |a|
            f[mxi+j].concat([a+im]) if
              ((a.size==k-m && mxi+j==n) ||
              (a.size<k-m && (mxi+j+(k-m-a.size)*(i+1))<=n))
          end
        end
      end
    end
    g.update({ n=>[[i]*k] }) if i*k == n
    h.update(g) { |k,ov,nv| ov+nv }
  end
  h[n]
end
Examples
p combos(10,3)
#=> [[3, 3, 4], [2, 4, 4], [2, 3, 5], [1, 4, 5],
# [2, 2, 6], [1, 3, 6], [1, 2, 7], [1, 1, 8]]
p combos(10,4)
#=> [[2, 2, 3, 3], [1, 3, 3, 3], [2, 2, 2, 4], [1, 2, 3, 4], [1, 1, 4, 4],
# [1, 2, 2, 5], [1, 1, 3, 5], [1, 1, 2, 6], [1, 1, 1, 7]]
puts "size=#{combos(100 ,3).size}" #=> 833
puts "size=#{combos(100 ,5).size}" #=> 38224
puts "size=#{combos(1000,3).size}" #=> 83333
Comment
The calculation combos(1000,3).size took about five seconds, the others were all well under one second.
Explanation
This method employs dynamic programming to compute a solution. The state variable is the largest positive integer used to compute arrays with sizes no more than k whose elements sum to no more than n. Begin with the largest integer equal to one. The next step is compute all combinations of k or fewer elements that include the numbers 1 and 2, then 1, 2 and 3, and so on, until we have all combinations of k or fewer elements that include the numbers 1 through n. We then select all combinations of k elements that sum to n from the last calculation.
Suppose
k => 3
n => 7
then
h = (1..k-1).each_with_object({}) { |i,h| h[i]=[[1]*i] }
#=> (1..2).each_with_object({}) { |i,h| h[i]=[[1]*i] }
#=> { 1=>[[1]], 2=>[[1,1]] }
This reads: using only the number 1, [[1]] is the array of all arrays that sum to 1 and [[1,1]] is the array of all arrays that sum to 2.
Notice that this does not include the element 3=>[[1,1,1]]. That's because, already having k=3 elements, it cannot be combined with any other elements, and it sums to 3 < 7.
We next execute:
enum = (2..n-k+1).each #=> #<Enumerator: 2..5:each>
We can convert this enumerator to an array to see what values it will pass into its block:
enum.to_a #=> [2, 3, 4, 5]
As n => 7 you may be wondering why this array ends at 5. That's because there are no arrays containing three positive integers, of which at least one is a 6 or a 7, whose elements sum to 7.
The first value enum passes into the block, which is represented by the block variable i, is 2. We will now compute a hash g that includes all arrays that sum to n => 7 or less, have at most k => 3 elements, include one or more 2's and zero or more 1's. (That's a bit of a mouthful, but it's still not precise, as I will explain.)
enum2 = (1..[n/i,k].min).each_with_object(Hash.new {|h,k| h[k]=[]})
#=> (1..[7/2,3].min).each_with_object(Hash.new {|h,k| h[k]=[]})
#=> (1..3).each_with_object(Hash.new {|h,k| h[k]=[]})
Enumerable#each_with_object creates an initially-empty hash that is represented by the block variable f. The default value of this hash is such that:
f[k] << o
is equivalent to
(f[k] |= []) << o
meaning that if f does not have a key k,
f[k] = []
is executed before
f[k] << o
is performed.
enum2 will pass the following elements into its block:
enum2.to_a #=> [[1, {}], [2, {}], [3, {}]]
(though the hash may not be empty when elements after the first are passed into the block). The first element passed to the block is [1, {}], represented by the block variables:
m => 1
f => Hash.new {|h,k| h[k]=[]}
m => 1 means we will initially construct arrays that contain one (i=) 2.
im = [i]*m #=> [2]*1 => [2]
mxi = m*i #=> 1*2 => 2
As (m == k) #=> (1 == 3) => false, we next execute
f[mxi] << im if mxi + (k-m)*(i+1) <= n
#=> f[2] << [2] if 2 + (3-1)*(2+1) <= 7
#=> f[2] << [2] if 8 <= 7
This considers whether [2] should be added to f[2] without adding any integers j < i = 2. (We have yet to consider the combining of one 2 with integers less than 2 [i.e., 1].) As 8 <= 7 is false, we do not add [2] to f[2]. The reason is that, for this to be part of an array of length k=3, it would be of the form [2,x,y], where x > 2 and y > 2, so 2+x+y >= 2+3+3 = 8 > n = 7. Clear as mud?
Next,
enum3 = (1..[(i-1)*(k-m), n-mxi].min).each
#=> = (1..[2,5].min).each
#=> = (1..2).each
#=> #<Enumerator: 1..2:each>
which passes the values
enum3.to_a #=> [1, 2]
into its block, represented by the block variable j, which is the key of the hash h. What we will be doing here is combine one 2 (m=1) with arrays of elements containing integers up to 1 (i.e., just 1) that sum to j, so the elements of the resulting array will sum to m * i + j => 1 * 2 + j => 2 + j.
The reason enum3 does not pass values of j greater than 2 into its block is that h[l] is empty for l > 2 (but it's a little more complicated when i > 2).
For j => 1,
h[j] #=> [[1]]
enum4 = h[j].each #=> #<Enumerator: [[1]]:each>
enum4.to_a #=> [[1]]
a #=> [1]
so
f[mxi+j].concat([a+im]) if
((a.size==k-m && mxi+j==n) || (a.size<k-m && (mxi+j+(k-m-a.size)*(i+1))<=n))
#=> f[2+1].concat([[1]+[2]]) if ((1==2 && 2+1==7) || (1<3-1 && (2+1+(1)*(3))<=7))
#=> f[3].concat([[1,2]]) if ((false && false) || (1<2 && (6<=7)))
#=> f[3] = [] << [1,2] if (false || (true && true))
#=> f[3] = [[1,2]] if true
So the expression on the left is evaluated. Again, the conditional expressions are a little complex. Consider first:
a.size==k-m && mxi+j==n
which is equivalent to:
(a + [2]).size == k && (a + [2]).reduce(:+) == n
That is, include the array a + [2], for each a in h[j], if it has k elements that sum to n.
The second condition considers whether the arrays a + [2] with fewer than k elements can be "completed" with integers l > i = 2 and have a sum of n or less.
Now, f #=> {3=>[[1, 2]]}.
We now increment j to 2 and consider arrays [2] + h[2], whose elements will total 4.
For j => 2,
h[j] #=> [[1, 1]]
enum4 = h[j].each #=> #<Enumerator: [[1, 1]]:each>
enum4.to_a #=> [[1, 1]]
a #=> [1, 1]
f[mxi+j].concat([a+im]) if
((a.size==k-m && mxi+j==n) || (a.size<k-m && (mxi+j+(k-m-a.size)*(i+1))<=n))
#=> f[4].concat([[1, 1, 2]]) if ((2==(3-1) && 2+2==7) || (2<(3-1) && (2+2+(3-1-2)*(3))<=7))
#=> f[4].concat([[1, 1, 2]]) if ((true && false) || (false && true))
#=> f[4].concat([[1, 1, 2]]) if false
so this operation is not performed (since [1,1,2].size => 3 = k and [1,1,2].reduce(:+) => 4 < 7 = n).
We now increment m to 2, meaning that we will construct arrays having two (i=) 2's. After doing so, we see that:
f={3=>[[1, 2]], 4=>[[2, 2]]}
and no other arrays are added when m => 3, so we have:
g #=> {3=>[[1, 2]], 4=>[[2, 2]]}
The statement
g.update({ n=>[[i]*k] }) if i*k == n
#=> g.update({ 7=>[[2,2,2]] }) if 6 == 7
adds the element 7=>[[2,2,2]] to the hash g if the sum of its elements equals n, which it does not.
We now fold g into h, using Hash#update (aka Hash#merge!):
h.update(g) { |k,ov,nv| ov+nv }
#=> {1=>[[1]], 2=>[[1, 1]]}.update({3=>[[1, 2]], 4=>[[2, 2]]}) { |k,ov,nv| ov+nv }
#=> {1=>[[1]], 2=>[[1, 1]], 3=>[[1, 2]], 4=>[[2, 2]]}
Now h contains all the arrays (values) whose keys are the array totals, comprised of the integers 1 and 2, which have at most 3 elements and sum to at most 7, excluding those arrays with fewer than 3 elements which cannot sum to 7 when integers greater than two are added.
The operations performed are as follows:
i m j f
h #=> { 1=>[[1]], 2=>[[1,1]] }
2 1 1 {3=>[[1, 2]]}
2 1 2 {3=>[[1, 2]]}
2 2 1 {3=>[[1, 2]], 4=>[[2, 2]]}
{3=>[[1, 2]], 4=>[[2, 2]]}
3 1 1 {}
3 1 2 {}
3 1 3 {}
3 1 4 {7=>[[2, 2, 3]]}
3 2 1 {7=>[[2, 2, 3], [1, 3, 3]]}
g before g.update: {7=>[[2, 2, 3], [1, 3, 3]]}
g after g.update: {7=>[[2, 2, 3], [1, 3, 3]]}
h after h.update(g): {1=>[[1]],
2=>[[1, 1]],
3=>[[1, 2]],
4=>[[2, 2]],
7=>[[2, 2, 3], [1, 3, 3]]}
4 1 1 {}
4 1 2 {}
4 1 3 {7=>[[1, 2, 4]]}
g before g.update: {7=>[[1, 2, 4]]}
g after g.update: {7=>[[1, 2, 4]]}
h after h.update(g): {1=>[[1]],
2=>[[1, 1]],
3=>[[1, 2]],
4=>[[2, 2]],
7=>[[2, 2, 3], [1, 3, 3], [1, 2, 4]]}
5 1 1 {}
5 1 2 {7=>[[1, 1, 5]]}
g before g.update: {7=>[[1, 1, 5]]}
g after g.update: {7=>[[1, 1, 5]]}
h after h.update(g): {1=>[[1]],
2=>[[1, 1]],
3=>[[1, 2]],
4=>[[2, 2]],
7=>[[2, 2, 3], [1, 3, 3], [1, 2, 4], [1, 1, 5]]}
And lastly,
h[n].select { |a| a.size == k }
#=> h[7].select { |a| a.size == 3 }
#=> [[2, 2, 3], [1, 3, 3], [1, 2, 4], [1, 1, 5]]
@Cary's answer is very in-depth and impressive, but it appears to me that there is a much more naive solution, which proved to be much more efficient as well - good old recursion:
def combos(n,k)
  if k == 1
    return [n]
  end
  (1..n-1).flat_map do |i|
    combos(n-i,k-1).map { |r| [i, *r].sort }
  end.uniq
end
This solution reduces the problem at each level by decreasing the target sum by each number between 1 and the previous target sum, while reducing k by one. Then make sure you don't have duplicates (by sort and uniq), and you have your answer...
This is great for k < 5, and is much faster than Cary's solution, but as k gets larger, I found that it makes far too many iterations, and sort and uniq take a very big toll on the calculation.
So I made sure that won't be needed, by making sure I get only sorted answers - each recursion should check only numbers larger than those already used:
def combos(n,k,min = 1)
  if n < k || n < min
    return []
  end
  if k == 1
    return [n]
  end
  (min..n-1).flat_map do |i|
    combos(n-i,k-1, i).map { |r| [i, *r] }
  end
end
This solution is on par with Cary's on combos(100, 7):
user system total real
My Solution 2.570000 0.010000 2.580000 ( 2.695615)
Cary's 2.590000 0.000000 2.590000 ( 2.609374)
But we can do better: caching! This recursion does many calculations again and again, so caching stuff we already did will save us a lot of work when dealing with long sums:
def combos(n,k,min = 1, cache = {})
  if n < k || n < min
    return []
  end
  cache[[n,k,min]] ||= begin
    if k == 1
      return [n]
    end
    (min..n-1).flat_map do |i|
      combos(n-i,k-1, i, cache).map { |r| [i, *r] }
    end
  end
end
This solution is mighty fast and passes Cary's solution for large n by light-years:
Benchmark.bm do |bm|
  bm.report('Uri') { combos(1000, 3) }
  bm.report('Cary') { combos_cary(1000, 3) }
end
user system total real
Uri 0.200000 0.000000 0.200000 ( 0.214080)
Cary 7.210000 0.000000 7.210000 ( 7.220085)
And is on par with k as high as 9, and I believe it is still less complicated than his solution.
You want the number of integer partitions of n into exactly k summands. There is a (computationally) somewhat ugly recurrence for that number.
The idea is this: let P(n,k) be the number of ways to partition n into k nonzero summands; then P(n,k) = P(n-1,k-1) + P(n-k,k). Proof: every partition either contains a 1 or it doesn't contain a 1 as one of the summands. The first case P(n-1,k-1) calculates the number of cases where there is a 1 in the sum; take that 1 away from the sum and partition the remaining n-1 into the now available k-1 summands. The second case P(n-k,k) considers the case where every summand is strictly greater than 1; to do that, reduce all of the k summands by 1 and recurse from there. Obviously, P(n,1) = 1 for all n > 0.
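The recurrence above can be evaluated bottom-up with a small table, which gives the count without generating any of the partitions. The following Python sketch is one way to do that; the name count_partitions and the extra base case P(0,0) = 1 (which makes P(n,1) = 1 fall out of the recurrence) are assumptions of this sketch, not part of the answer.

```python
def count_partitions(n, k):
    """Count partitions of n into exactly k positive summands.

    Uses P(n, k) = P(n-1, k-1) + P(n-k, k) with P(0, 0) = 1 and
    P(n, k) = 0 whenever n < k. Assumes n >= 0 and k >= 0.
    """
    # table[parts][total] holds P(total, parts)
    table = [[0] * (n + 1) for _ in range(k + 1)]
    table[0][0] = 1
    for parts in range(1, k + 1):
        for total in range(parts, n + 1):
            table[parts][total] = (
                table[parts - 1][total - 1]    # some summand equals 1
                + table[parts][total - parts]  # every summand is at least 2
            )
    return table[k][n]


print(count_partitions(10, 3))
print(count_partitions(100, 3))
```

The table needs (k+1)*(n+1) entries, so the cost grows with n*k rather than with the number of partitions, which is what makes larger k tractable when only the count is required.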
Here's a link that mentions that probably, no closed form is known for general k.

When given groups of numbers, how can I find the minimal set of groups to represent all the numbers?

When given groups of numbers, how can I find the minimal set of groups to cover all the numbers? The constraint is that the chosen groups should not overlap.
For example, given three groups of numbers (1,2), (2,3), and (3,4), we can select (1,2) and (3,4) as (2,3) is redundant.
With (1,2),(2,3),(3,4),(1,4), we have two solutions (1,2),(3,4) or (1,4),(2,3).
With (1,2,3),(1,2,4), and (3,4), there is a redundancy, but there is no solution.
The algorithm that I came up with (for G = (1,2),(2,3),(3,4),(1,4) example) is
collect all the numbers from the groups x = (1,2,3,4)
for g in G:
x = remove g in x # x = (3,4)
find G' = (a set of (g' in (G - g))) that makes (G' + g = x) # G' = ((3,4))
if find (G' + g) return (G',g) # return ((1,2)(3,4))
I know my algorithm has a lot of holes in it in terms of performance, and I assume this might be a well known problem. Any hints to this problem?
I found a working code in python from this site : http://www.cs.mcgill.ca/~aassaf9/python/algorithm_x.html
X = {1, 2, 3, 4, 5, 6, 7}
Y = {
    'A': [1, 4, 7],
    'B': [1, 4],
    'C': [4, 5, 7],
    'D': [3, 5, 6],
    'E': [2, 3, 6, 7],
    'F': [2, 7]}

def solve(X, Y, solution=[]):
    if not X:
        yield list(solution)
    else:
        c = min(X, key=lambda c: len(X[c]))
        for r in list(X[c]):
            solution.append(r)
            cols = select(X, Y, r)
            for s in solve(X, Y, solution):
                yield s
            deselect(X, Y, r, cols)
            solution.pop()

def select(X, Y, r):
    cols = []
    for j in Y[r]:
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].remove(i)
        cols.append(X.pop(j))
    return cols

def deselect(X, Y, r, cols):
    for j in reversed(Y[r]):
        X[j] = cols.pop()
        for i in X[j]:
            for k in Y[i]:
                if k != j:
                    X[k].add(i)

X = {j: set(filter(lambda i: j in Y[i], Y)) for j in X}
a = solve(X, Y)
for i in a: print(i)
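To relate this back to the three examples in the question, here is a small brute-force sketch that searches for the smallest sets of non-overlapping groups covering every number. It is a plain backtracking search rather than Algorithm X, and the name minimal_exact_covers is a hypothetical helper introduced only for illustration.

```python
def minimal_exact_covers(groups):
    """Return every smallest set of pairwise-disjoint groups covering all numbers."""
    universe = set().union(*groups) if groups else set()
    best = []         # solutions of the smallest size found so far
    best_size = None  # that size, or None while no solution is known

    def search(start, covered, chosen):
        nonlocal best, best_size
        # Prune branches that already use more groups than the best solution.
        if best_size is not None and len(chosen) > best_size:
            return
        if covered == universe:
            if best_size is None or len(chosen) < best_size:
                best, best_size = [], len(chosen)
            best.append(list(chosen))
            return
        for idx in range(start, len(groups)):
            group = set(groups[idx])
            if covered.isdisjoint(group):  # chosen groups must not overlap
                chosen.append(groups[idx])
                search(idx + 1, covered | group, chosen)
                chosen.pop()

    search(0, set(), [])
    return best


print(minimal_exact_covers([(1, 2), (2, 3), (3, 4)]))
print(minimal_exact_covers([(1, 2), (2, 3), (3, 4), (1, 4)]))
print(minimal_exact_covers([(1, 2, 3), (1, 2, 4), (3, 4)]))
```

The three calls correspond to the question's cases: one solution, two solutions, and no solution. The search is exponential in the number of groups, so for larger inputs the Algorithm X code above is likely the better choice.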

Resources