I just started to learn programming and am trying to write a function that outputs all possible combinations. So far I've been able to find all possible combinations of size 2 but I'm not sure how to leave the code open ended to deal with combinations of larger sizes. Would some sort of recursion would be useful?
I know I could use the built in combination method but I'm just trying to figure out how to write it from scratch. Any advice would be much appreciated. Thanks!
def two_combos(input)
list = []
for index1 in (0...input.length)
for index2 in (0...input.length)
if input[index1] != input[index2]
if list.include?([input[index2],input[index1]])==false
list << [input[index1],input[index2]]
end
end
end
end
return list
end
two_combos(["A","B","C"])
#outputs
=> [["A", "B"], ["A", "C"], ["B", "C"]]
#Missing
["A","B","C"]
This implementation is like counting recursively in binary:
def combinations(items)
return [] unless items.any?
prefix = items[0]
suffixes = combinations(items[1..-1])
[[prefix]] + suffixes + suffixes.map {|item| [prefix] + item }
end
> combinations(%w(a b c))
=> [["a"], ["b"], ["c"], ["b", "c"], ["a", "b"], ["a", "c"], ["a", "b", "c"]]
At each stage, the combinations are a concatenation of:
the first element alone
the combinations of the following elements (elements 1..n-1)
the first element combined with the combinations of the following elements
Here is recursive algorithm
def combinations(array, size)
fail "size is too big" if size > array.size
combination([], [], array, size)
end
def combination(result, step, array, size)
steps = size - step.size
array[0..-steps].each_with_index do |a, i|
next_step = step + [a]
if next_step.size < size
combination(result, next_step, array[i+1..-1], size)
else
result << next_step
end
end
result
end
a = ("A".."E").to_a
p combinations(a, 1)
# [["A"], ["B"], ["C"], ["D"], ["E"]]
p combinations(a, 2)
# [["A", "B"], ["A", "C"], ["A", "D"], ["A", "E"], ["B", "C"], ["B", "D"], ["B", "E"], ["C", "D"], ["C", "E"], ["D", "E"]]
p combinations(a, 3)
# [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "E"], ["A", "C", "D"], ["A", "C", "E"], ["A", "D", "E"], ["B", "C", "D"], ["B", "C", "E"], ["B", "D", "E"], ["C", "D", "E"]]
p combinations(a, 4)
# [["A", "B", "C", "D"], ["A", "B", "C", "E"], ["A", "B", "D", "E"], ["A", "C", "D", "E"], ["B", "C", "D", "E"]]
I can think of a way to calculate combinations of a given size without using recursion, but it is not especially efficient. It is very efficient, however, if you want to obtain all combinations of all sizes (sometimes referred to as "power"). [Edit: evidently not. See the benchmark.] My understand is that the question concerns the latter, but I will give methods for each.
If index has n elements, each combination can be represented by an n-element array whose elements are each zero or one, 1 meaning the combination includes the element at that index, '0' (or a leading space) meaning it does not. We therefore can generate the set of all combinations of all sizes by simply generating all binary numbers of length n, converting each from its string representatation (with leading zeroes) to an array of "0"'s and "1"s, replacing the "1"'s with their index positions, removing the "0"'s and extracting the element of index at the given index positions.
Code
def all_combos(sz)
[*(0..2**sz-1)].map { |i| ("%0#{sz}b" % i).chars }
.map { |a| a.each_with_index
.select { |n,ndx| n=="1" }.map(&:last) }
end
def combos(input, n, all_combos)
all_combos.select { |c| c.size == n }.map { |c| input.values_at(*c) }
end
def power(input, all_combos)
all_combos.map { |c| input.values_at(*c) }
end
Example
input = %w{b e a r s}
#=> ["b", "e", "a", "r", "s"]
ac = all_combos(input.size)
#=> [[], [4], [3], [3, 4], [2], [2, 4], [2, 3], [2, 3, 4],
# [1], [1, 4], [1, 3], [1, 3, 4], [1, 2], [1, 2, 4], [1, 2, 3],
# [1, 2, 3, 4], [0], [0, 4], [0, 3], [0, 3, 4], [0, 2], [0, 2, 4],
# [0, 2, 3], [0, 2, 3, 4], [0, 1], [0, 1, 4], [0, 1, 3], [0, 1, 3, 4],
# [0, 1, 2], [0, 1, 2, 4], [0, 1, 2, 3], [0, 1, 2, 3, 4]]
(0..input.size).each { |i| puts "size #{i}"; p combos(input, i, ac) }
# size 0
# [[]]
# size 1
# [["s"], ["r"], ["a"], ["e"], ["b"]]
# size 2
# [["r", "s"], ["a", "s"], ["a", "r"], ["e", "s"], ["e", "r"],
# ["e", "a"], ["b", "s"], ["b", "r"], ["b", "a"], ["b", "e"]]
# size 3
# [["a", "r", "s"], ["e", "r", "s"], ["e", "a", "s"], ["e", "a", "r"],
# ["b", "r", "s"], ["b", "a", "s"], ["b", "a", "r"], ["b", "e", "s"],
# ["b", "e", "r"], ["b", "e", "a"]]
# size 4
# [["e", "a", "r", "s"], ["b", "a", "r", "s"], ["b", "e", "r", "s"],
# ["b", "e", "a", "s"], ["b", "e", "a", "r"]]
# size 5
# [["b", "e", "a", "r", "s"]]
power(input, ac)
#=> [[], ["s"], ["r"], ["r", "s"], ["a"], ["a", "s"], ["a", "r"],
# ["a", "r", "s"], ["e"], ["e", "s"], ["e", "r"], ["e", "r", "s"],
# ["e", "a"], ["e", "a", "s"], ["e", "a", "r"], ["e", "a", "r", "s"],
# ["b"], ["b", "s"], ["b", "r"], ["b", "r", "s"], ["b", "a"],
# ["b", "a", "s"], ["b", "a", "r"], ["b", "a", "r", "s"], ["b", "e"],
# ["b", "e", "s"], ["b", "e", "r"], ["b", "e", "r", "s"], ["b", "e", "a"],
# ["b", "e", "a", "s"], ["b", "e", "a", "r"], ["b", "e", "a", "r", "s"]]
Gentlemen, start your engines!
Methods compared
module Methods
def ruby(array)
(0..array.size).each_with_object([]) { |i,a|
a.concat(array.combination(i).to_a) }
end
def todd(input)
permutations(input) << []
end
private
def permutations(items)
return [] unless items.any?
prefix = items[0]
suffixes = permutations(items[1..-1])
[[prefix]] + suffixes + suffixes.map {|item| [prefix, item].flatten }
end
public
def fl00r(array)
(1..array.size).each_with_object([]) { |i,a|
a.concat(combinations(array, i)) } << []
end
private
def combinations(array, size)
fail "size is too big" if size > array.size
combination([], [], array, size)
end
def combination(result, step, array, size)
steps = size - step.size
array[0..-steps].each_with_index do |a, i|
next_step = step + [a]
if next_step.size < size
combination(result, next_step, array[i+1..-1], size)
else
result << next_step
end
end
result
end
public
def cary(input)
ac = all_combos(input.size)
ac.map { |c| input.values_at(*c) }
end
private
def all_combos(sz)
[*0..2**sz-1].map { |i| ("%0#{sz}b" % i).chars }
.map { |a| a.each_with_index.select { |n,ndx| n=="1" }.map(&:last) }
end
end
Test data
def test_array(n)
[*1..n]
end
Helpers
def compute(arr, meth)
send(meth, arr)
end
def compute_sort(arr, meth)
compute(arr, meth).map(&:sort).sort
end
Include module
include Methods
#methods = Methods.public_instance_methods(false)
#=> [:ruby, :todd, :fl00r, :cary]
Confirm methods return the same values
arr = test_array(8)
a = compute_sort(arr, #methods.first)
puts #methods[1..-1].all? { |m| a == compute_sort(arr, m) }
#=> true
Benchmark code
require 'benchmark'
#indent = methods.map { |m| m.to_s.size }.max
[10, 15, 20].each do |n|
puts "\nn = #{n}"
arr = test_array(n)
Benchmark.bm(#indent) do |bm|
#methods.each do |m|
bm.report m.to_s do
compute(arr, m).size
end
end
end
end
Tests (seconds)
n = 10
user system total real
ruby 0.000000 0.000000 0.000000 ( 0.000312)
todd 0.000000 0.000000 0.000000 ( 0.001611)
fl00r 0.000000 0.000000 0.000000 ( 0.002675)
cary 0.010000 0.000000 0.010000 ( 0.010026)
n = 15
user system total real
ruby 0.010000 0.000000 0.010000 ( 0.010742)
todd 0.070000 0.010000 0.080000 ( 0.081821)
fl00r 0.080000 0.000000 0.080000 ( 0.076030)
cary 0.430000 0.020000 0.450000 ( 0.450382)
n = 20
user system total real
ruby 0.310000 0.040000 0.350000 ( 0.350484)
todd 2.360000 0.130000 2.490000 ( 2.487493)
fl00r 2.320000 0.090000 2.410000 ( 2.405377)
cary 21.420000 0.620000 22.040000 ( 22.053303)
I draw only one definitive conclusion.
Related
I can't find a simple solution for this problem
For example we have an array:
["a", "a", "a", "a", "a", "b", "b", "c", "a", "a", "a"]
I need to count the identical elements in this way:
[["a", 5], ["b", 2], ["c", 1], ["a", 3]]
Uses the chunk method to group identical elements, then uses map to convert [letter, array] pairs to [letter, count].
arr = ["a", "a", "a", "a", "a", "b", "b", "c", "a", "a", "a"]
counted = arr.chunk { |x| x }.map { |a, b| [a, b.count] }
# => [["a", 5], ["b", 2], ["c", 1], ["a", 3]]
In Ruby 2.2 you could use Enumable#slice_when:
arr = ["a", "a", "a", "a", "a", "b", "b", "c", "a", "a", "a"]
arr.slice_when { |e,f| e!=f }.map { |a| [a.first, a.size] }
#=> [["a", 5], ["b", 2], ["c", 1], ["a", 3]]
Given a set of items [z,a,b,c] I want to find the "cartesian power" (cartesian product with itself n times) but only those results that have a z in them. For example:
normal_values = ["a","b","c"]
p limited_cartesian( normal_values, "z", 2 )
#=> [
#=> ["z", "z"]
#=> ["z", "a"]
#=> ["z", "b"]
#=> ["z", "c"]
#=> ["a", "z"]
#=> ["b", "z"]
#=> ["c", "z"]
#=> ]
I can do this by spinning through the full set and skipping entries that don't have the special value, but I'm wondering if there's a simpler way. Preferably one that allows me to only evaluate the desired entries lazily, without wasting time calculating unwanted entries.
def limited_cartesian( values, special, power )
[special, *values].repeated_permutation(power)
.select{ |prod| prod.include?( special ) }
end
Edit: With v3.0 I finally have something respectable. As is oft the case, the key was looking at the problem the right way. It occurred to me that I could repeatedly permute normal_values << special, power - 1 times, then for each of those permutations, there would be one more element to add. If the permutation contained at least one special, any element of normal_values << special could be added; otherwise, special must be added.
def limited_cartesian( values, special, power )
all_vals = values + [special]
all_vals.repeated_permutation(power-1).map do |p|
if p.include?(special)
*all_vals.each_with_object([]) { |v,a| a << (p + [v]) }
else
p + [special]
end
end
end
limited_cartesian( values, 'z', 1 )
# [["z"]]
limited_cartesian( values, 'z', 2 )
# => [["a", "z"], ["b", "z"], ["c", "z"],
# ["z", "a"], ["z", "b"], ["z", "c"],
# ["z", "z"]]
limited_cartesian( values, 'z', 3 )
# => [["a", "a", "z"], ["a", "b", "z"], ["a", "c", "z"],
# ["a", "z", "a"], ["a", "z", "b"], ["a", "z", "c"],
# ["a", "z", "z"], ["b", "a", "z"], ["b", "b", "z"],
# ["b", "c", "z"], ["b", "z", "a"], ["b", "z", "b"],
# ["b", "z", "c"], ["b", "z", "z"], ["c", "a", "z"],
# ["c", "b", "z"], ["c", "c", "z"], ["c", "z", "a"],
# ["c", "z", "b"], ["c", "z", "c"], ["c", "z", "z"],
# ["z", "a", "a"], ["z", "a", "b"], ["z", "a", "c"],
# ["z", "a", "z"], ["z", "b", "a"], ["z", "b", "b"],
# ["z", "b", "c"], ["z", "b", "z"], ["z", "c", "a"],
# ["z", "c", "b"], ["z", "c", "c"], ["z", "c", "z"],
# ["z", "z", "a"], ["z", "z", "b"], ["z", "z", "c"],
# ["z", "z", "z"]]
This is my v2.1, which works, but is not pretty. I'll leave it for the record.
def limited_cartesian( values, special, power )
ndx = Array(0...power)
ndx[1..-1].each_with_object( [[special]*power] ) do |i,a|
ndx.combination(i).to_a.product(values.repeated_permutation(power-i).to_a)
.each { |pos, val| a << stuff_special(special, pos, val.dup) }
end
end
def stuff_special( special, pos, vals )
pos.each_with_object(Array.new(pos.size + vals.size)) {|j,r|
r[j] = special }.map {|e| e.nil? ? vals.shift : e }
end
# e.g., stuff_special( 'z', [1,4], ["a","b","c"]) => ["a","z","b","c","z"]
I have the following array
a3 = [["a", "b"], ["a","c"], ["b","c"], ["b", "a"], ["c","b"]]
I want to get the following output [["a","b"], ["a","c"], ["b","c"]] and delete ["b","a"] and ["c","b"]
I have the following code
a3.each do |ary3|
x = ary3[0]
y = ary3[1]
x = ary3[1]
y = ary3[0]
if a3.include?([x,y])
a3 - [y,x]
end
end
print a3
I tried using the swap, but no luck!
Thanks for the help.
Two arrays are considered to be equal if they contain the same elements and these elements are in the same order:
["a", "b"] == ["b", "a"]
#=> false
["a", "b"] == ["a", "b"]
#=> true
So you need to sort the inner arrays first and then you can use Array#uniq to ensure that each element in the outer array will only appear once:
arr = [["a", "b"], ["a", "c"], ["b", "c"], ["b", "a"], ["c", "b"]]
arr.map(&:sort).uniq
#=> [["a", "b"], ["a", "c"], ["b", "c"]]
This will leave arr untouched, however:
arr
#=> [["a", "b"], ["a", "c"], ["b", "c"], ["b", "a"], ["c", "b"]]
You will need to use mutator methods (with a !) to edit the array in place:
arr = [["a", "b"], ["a", "c"], ["b", "c"], ["b", "a"], ["c", "b"]]
arr.map!(&:sort).uniq!
arr
#=> [["a", "b"], ["a", "c"], ["b", "c"]]
Edit
As a follow-up to #sawa's comment, who was concerned that it may not be desirable to change the ordering of the inner arrays, i looked a bit deeper into Array#uniq. Consider the following array:
arr = [["b", "a"], ["a", "c"], ["b", "c"], ["b", "a"], ["c", "b"]]
I figured out that Array#uniq actually takes a block that lets you specify how the elements should be be compared:
arr.uniq!{|x| x.sort }
arr
#=> [["b", "a"], ["a", "c"], ["b", "c"]]
Cool thing is, this also works with Symbol#to_proc (the &: notation) and actually looks even more elegant than my original answer:
arr.uniq!(&:sort)
arr
#=> [["b", "a"], ["a", "c"], ["b", "c"]]
You can still use Array#sort! if you want the inner arrays to be sorted afterwards:
arr.uniq!(&:sort!)
arr
#=> [["a", "b"], ["a", "c"], ["b", "c"]]
My last observation on this is though, that the order probably isn't important or else two arrays with different order would not be considered equal. This got me thinking (again) and i posed myself the question: Why not use a Set? It would work like this:
require 'set'
sets = [Set["a", "b"], Set["a", "c"], Set["b", "c"], Set["b", "a"], Set["c", "b"]]
sets.uniq!
sets
#=> [#<Set: {"a", "b"}>, #<Set: {"a", "c"}>, #<Set: {"b", "c"}>]
Just keep in mind that a Set will not allow you to add the same element multiple times, whereas an array does:
[%w[a b b b c], %w[a b b b c], %w[a b c]].uniq(&:sort)
#=> [["a", "b", "b", "b", "c"], ["a", "b", "c"]]
[Set.new(%w[a b b b c]), Set.new(%w[a b b b c]), Set.new(%w[a b c])].uniq
#=> [#<Set: {"a", "b", "c"}>]
check this one: #delete_if as below:
a3 = [["a", "b"], ["a","c"], ["b","c"], ["b", "a"], ["c","b"]]
p a3.delete_if{|x| [["b", "a"], ["c","b"]].include? x}
#=> [["a", "b"], ["a", "c"], ["b", "c"]]
As per your comment and description post :
a3 = [["a", "b"], ["a","c"], ["b","c"], ["b", "a"], ["c","b"],["b", "a"]]
p a3.each {|x| a3.delete(x.reverse) if a3.include? x.reverse}
#=> [["a", "b"], ["a", "c"], ["b", "c"]]
BenchMark:
require 'benchmark'
N = 10000
Benchmark.bm(20) do | x |
a3 = [["a", "b"], ["a","c"], ["b","c"], ["b", "a"], ["c","b"],["b", "a"]]
x.report('Mine') do
N.times { a3.each {|x| a3.delete(x.reverse) if a3.include? x.reverse} }
end
a3 = [["a", "b"], ["a","c"], ["b","c"], ["b", "a"], ["c","b"],["b", "a"]]
x.report('padde') do
N.times { a3.uniq!(&:sort!) }
end
end
Output:
user system total real
Mine 0.172000 0.000000 0.172000 ( 0.361021)
padde 0.203000 0.000000 0.203000 ( 0.460026)
If I have a string such as 'abcde' and I want to get a 2d array of all combinations of 1 or 2 letters.
[ ['a', 'b', 'c', 'd', 'e'], ['ab', 'c', 'de'], ['a', 'bc', 'd', 'e'] ...
How would I go abouts doing so?
I want to do this in ruby, and think I should be using a regex. I've tried using
strn = 'abcde'
strn.scan(/[a-z][a-z]/)
but this is only going to give me the distinct sets of 2 characters
['ab', 'cd']
I think this should do it (haven't tested yet):
def find_letter_combinations(str)
return [[]] if str.empty?
combinations = []
find_letter_combinations(str[1..-1]).each do |c|
combinations << c.unshift(str[0])
end
return combinations if str.length == 1
find_letter_combinations(str[2..-1]).each do |c|
combinations << c.unshift(str[0..1])
end
combinations
end
Regular expressions will not help for this sort of problem. I suggest using the handy Array#combination(n) function in Ruby 1.9:
def each_letter_and_pair(s)
letters = s.split('')
letters.combination(1).to_a + letters.combination(2).to_a
end
ss = each_letter_and_pair('abcde')
ss # => [["a"], ["b"], ["c"], ["d"], ["e"], ["a", "b"], ["a", "c"], ["a", "d"], ["a", "e"], ["b", "c"], ["b", "d"], ["b", "e"], ["c", "d"], ["c", "e"], ["d", "e"]]
No, regex is not suitable here. Sure you can match either one or two chars like this:
strn.scan(/[a-z][a-z]?/)
# matches: ['ab', 'cd', 'e']
but you can't use regex to generate a (2d) list of all combinations.
A functional recursive approach:
def get_combinations(xs, lengths)
return [[]] if xs.empty?
lengths.take(xs.size).flat_map do |n|
get_combinations(xs.drop(n), lengths).map { |ys| [xs.take(n).join] + ys }
end
end
get_combinations("abcde".chars.to_a, [1, 2])
#=> [["a", "b", "c", "d", "e"], ["a", "b", "c", "de"],
# ["a", "b", "cd", "e"], ["a", "bc", "d", "e"],
# ["a", "bc", "de"], ["ab", "c", "d", "e"],
# ["ab", "c", "de"], ["ab", "cd", "e"]]
Given an array of arrays
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
What is the simplest way to merge the array items that contain members that are shared by any two or more arrays items. For example the above should be
[["A", "B", "C", "D","E", "F"], ["G"]] since "B" and "C" are shared by the first and second array items.
Here are some more test cases.
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
=> [["A", "B", "C", "D", "E", "F", "G"]]
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
=> [["A", "B", "C", "D", "E", "F"], ["G", "H,"]]
Here is my quick version which can be optimized I am sure :)
# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
array.collect! do |e|
t = e
e.each do |f|
array.each do |a|
if a.index(f)
t = t | a
end
end
end
e = t.sort
end
p array.uniq
Edit: Martin DeMello code was fixed.
When running Martin DeMello code (the accepted answer) I get:
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]] =>
[["B", "C", "E", "F", "A", "D", "G"], ["A", "B", "C", "D"], ["F", "G"]]
and
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]] =>
[["B", "C", "E", "F", "A", "D"], ["A", "B", "C", "D"], ["G", "H"], ["G", "H"]]
which does not seem to meet your spec.
Here is my approach using a few of his ideas:
a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
b = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
def reduce(array)
h = Hash.new {|h,k| h[k] = []}
array.each_with_index do |x, i|
x.each do |j|
h[j] << i
if h[j].size > 1
# merge the two sub arrays
array[h[j][0]].replace((array[h[j][0]] | array[h[j][1]]).sort)
array.delete_at(h[j][1])
return reduce(array)
# recurse until nothing needs to be merged
end
end
end
array
end
puts reduce(a).to_s #[["A", "B", "C", "D", "E", "F", "G"]]
puts reduce(b).to_s #[["A", "B", "C", "D", "E", "F"], ["G", "H"]]
Different algorithm, with a merge-as-you-go approach rather than taking two passes over the array (vaguely influenced by the union-find algorithm). Thanks for a fun problem :)
A = [["A", "G"],["B", "C", "E", "F"], ["A", "B", "C", "D"], ["B"], ["H", "I"]]
H = {}
B = (0...(A.length)).to_a
def merge(i,j)
A[j].each do |e|
if H[e] and H[e] != j
merge(i, H[e])
else
H[e] = i
end
end
A[i] |= A[j]
B[j] = i
end
A.each_with_index do |x, i|
min = A.length
x.each do |j|
if H[j]
merge(H[j], i)
else
H[j] = i
end
end
end
out = B.sort.uniq.map {|i| A[i]}
p out
Not the simplest ,may be the longest :)
l = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
puts l.flatten.inject([[],[]]) {|r,e| if l.inject(0) {|c,a| if a.include?(e) then c+1 else c end} >= 2 then r[0] << e ; r[0].uniq! else r[1] << e end ; r}.inspect
#[["B", "C"], ["E", "F", "A", "D", "G"]]
l = [["B", "C", "E", "F"], ["A", "B","C", "D"], ["G"]]
p l.inject([]){|r,e|
r.select{|i|i&e!=[]}==[]&&(r+=[e])||(r=r.map{|i|(i&e)!=nil&&(i|e).sort||i})
}
im not sure about your cond.
The simplest way to do it would be to take the powerset of an array (a set containing every possible combination of elements of the array), throw out any of the resulting sets if they don't have a common element, flatten the remaining sets and discard subsets and duplicates.
Or at least it would be if Ruby had proper Set support. Actually doing this in Ruby is horribly inefficient and an awful kludge:
power_set = array.inject([[]]){|c,y|r=[];c.each{|i|r<<i;r<<i+[y]};r}.reject{|x| x.empty?}
collected_powerset = power_set.collect{|subset| subset.flatten.uniq.sort unless
subset.inject(subset.last){|acc,a| acc & a}.empty?}.uniq.compact
collected_powerset.reject{|x| collected_powerset.any?{|c| (c & x) == x && x.length < c.length}}
Power set operation comes from here.
Straightforward rather than clever. It's destructive of the original array. The basic idea is:
go down the list of arrays, noting which array each element appears in
for every entry in this index list that shows an element in more than one array, merge all those arrays into the lowest-indexed array
when merging two arrays, replace the lower-indexed array with the merged result, and the higher-indexed array with a pointer to the lower-indexed array.
It's "algorithmically cheaper" than intersecting every pair of arrays, though the actual running speed will depend on what ruby hands over to the C layer.
a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
h = Hash.new {|h,k| h[k] = []}
a.each_with_index {|x, i| x.each {|j| h[j] << i}}
b = (0...(a.length)).to_a
h.each_value do |x|
x = x.sort_by {|i| b[i]}
if x.length > 1
x[1..-1].each do |i|
b[i] = [b[i], b[x[0]]].min
a[b[i]] |= a[i]
end
end
end
a = b.sort.uniq.map {|i| a[i]}
def merge_intersecting(input, result=[])
head = input.first
tail = input[1..-1]
return result if tail.empty?
intersection = tail.select { |arr| !(head & arr).empty? }
unless intersection.empty?
merged = head | intersection.flatten
result << merged.sort
end
merge_intersecting(tail, result)
end
require 'minitest/spec'
require 'minitest/autorun'
describe "" do
it "merges input array" do
input = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
output = [["A", "B", "C", "D", "E", "F", "G"]]
merge_intersecting(input).must_equal output
end
it "merges input array" do
input = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
output = [["A", "B", "C", "D", "E", "F"], ["G", "H"]]
merge_intersecting(input).must_equal output
end
end