How do I flatten a nested hash, recursively, into an array of arrays with a specific format? - ruby

I have a nested hash that looks something like this:
{
'a' => {
'b' => ['c'],
'd' => {
'e' => ['f'],
'g' => ['h', 'i', 'j', 'k']
},
'l' => ['m', 'n', 'o', 'p']
},
'q' => {
'r' => ['s']
}
}
The hash can have even more nesting, but the values of the last level are always arrays.
I would like to "flatten" the hash into an array of arrays, where each inner array holds all the keys and values that make up one complete "path/branch" of the nested hash, all the way from the lowest-level value to the top of the hash. It is a bit like traversing up through the "tree" from the bottom while collecting keys and values on the way.
The output of that for the particular hash should be:
[
['a', 'b', 'c'],
['a', 'd', 'e', 'f'],
['a', 'd', 'g', 'h', 'i', 'j', 'k'],
['a', 'l', 'm', 'n', 'o', 'p'],
['q', 'r', 's']
]
I tried many different things, but nothing worked so far. Again keep in mind that more levels than these might occur, so the solution has to be generic.
Note: the order of the arrays and the order of the elements in them is not important.
I did the following, but it's not really working:
tree_def = {
'a' => {
'b' => ['c'],
'd' => {
'e' => ['f'],
'g' => ['h', 'i', 'j', 'k']
},
'l' => ['m', 'n', 'o', 'p']
},
'q' => {
'r' => ['s']
}
}
branches = [[]]
collect_branches = lambda do |tree, current_branch|
  tree.each do |key, hash_or_values|
    current_branch.push(key)
    if hash_or_values.kind_of?(Hash)
      collect_branches.call(hash_or_values, branches.last)
    else # Reached lowest level in dependency tree (which is always an array)
      # Add a new branch
      branches.push(current_branch.clone)
      current_branch.push(*hash_or_values)
      current_branch = branches.last
    end
  end
end
collect_branches.call(tree_def, branches[0])
branches #=> wrong result

As hinted at in the comments:
Looks pretty straightforward. Descend into hashes recursively, taking note of keys you visited in this branch. When you see an array, no need to recurse further. Append it to the list of keys and return
Tracking is easy, just pass the temp state down to recursive calls in arguments.
I meant something like this:
def tree_flatten(tree, path = [], &block)
  case tree
  when Array
    block.call(path + tree)
  else
    tree.each do |key, sub_tree|
      tree_flatten(sub_tree, path + [key], &block)
    end
  end
end
tree_flatten(tree_def) do |path|
  p path
end
This code simply prints each flattened path as it finds one, but you can store the paths in an array too, or even modify tree_flatten to return a ready-made array instead of yielding elements one by one.
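As a rough illustration of that last suggestion, here is a minimal sketch of a variant that returns the paths as an array rather than yielding them. The name tree_paths is made up for this example, and it assumes every leaf is an array, as in the question.

```ruby
# Hypothetical variant of tree_flatten that returns all paths.
# Assumption: every leaf of the tree is an Array.
def tree_paths(tree, path = [])
  case tree
  when Array
    [path + tree]                      # one finished branch
  else
    # flat_map concatenates the branch lists produced by each child
    tree.flat_map do |key, sub_tree|
      tree_paths(sub_tree, path + [key])
    end
  end
end

tree_def = {
  'a' => {
    'b' => ['c'],
    'd' => { 'e' => ['f'], 'g' => ['h', 'i', 'j', 'k'] },
    'l' => ['m', 'n', 'o', 'p']
  },
  'q' => { 'r' => ['s'] }
}

paths = tree_paths(tree_def)
p paths
```

Because path + [key] builds a new array at every level, the branches never share state with each other.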

You can do it like that:
def flat_hash(h)
  return [h] unless h.kind_of?(Hash)
  # Build new arrays with [k, *e]; e.unshift(k) would modify the arrays stored in the input hash.
  h.map { |k, v| flat_hash(v).map { |e| [k, *e] } }.flatten(1)
end
input = {
'a' => {
'b' => ['c'],
'd' => {
'e' => ['f'],
'g' => ['h', 'i', 'j', 'k']
},
'l' => ['m', 'n', 'o', 'p']
},
'q' => {
'r' => ['s']
}
}
p flat_hash(input)
The output will be:
[
["a", "b", "c"],
["a", "d", "e", "f"],
["a", "d", "g", "h", "i", "j", "k"],
["a", "l", "m", "n", "o", "p"],
["q", "r", "s"]
]

This of course calls for a recursive solution. The following method does not mutate the original hash.
Code
def recurse(h)
  h.each_with_object([]) do |(k,v),arr|
    v.is_a?(Hash) ? recurse(v).each { |a| arr << [k,*a] } : arr << [k,*v]
  end
end
Example
h = { 'a'=>{ 'b'=>['c'],
'd'=>{ 'e'=>['f'], 'g' => ['h', 'i', 'j', 'k'] },
'l' => ['m', 'n', 'o', 'p'] },
'q'=>{ 'r'=>['s'] } }
recurse h
#=> [["a", "b", "c"],
# ["a", "d", "e", "f"],
# ["a", "d", "g", "h", "i", "j", "k"],
# ["a", "l", "m", "n", "o", "p"],
# ["q", "r", "s"]]
Explanation
The operations performed by recursive methods are always difficult to explain. In my experience the best way is to salt the code with puts statements. However, that in itself is not enough because when viewing output it is difficult to keep track of the level of recursion at which particular results are obtained and either passed to itself or returned to a version of itself. The solution to that is to indent and un-indent results, which is what I've done below. Note the way I've structured the code and the few helper methods I use are fairly general-purpose, so this approach can be adapted to examine the operations performed by other recursive methods.
INDENT = 8
def indent; @col += INDENT; end
def undent; @col -= INDENT; end
def pu(s); print " "*@col; puts s; end
def puhline; pu('-'*(70-@col)); end
@col = -INDENT
def recurse(h)
begin
indent
puhline
pu "passed h = #{h}"
h.each_with_object([]) do |(k,v),arr|
pu " k = #{k}, v=#{v}, arr=#{arr}"
if v.is_a?(Hash)
pu " calling recurse(#{v})"
ar = recurse(v)
pu " return value=#{ar}"
pu " calculating recurse(v).each { |a| arr << [k,*a] }"
ar.each do |a|
pu " a=#{a}"
pu " [k, *a] = #{[k,*a]}"
arr << [k,*a]
end
else
pu " arr << #{[k,*v]}"
arr << [k,*v]
end
pu "arr = #{arr}"
end.tap { |a| pu "returning=#{a}" }
ensure
puhline
undent
end
end
recurse h
----------------------------------------------------------------------
passed h = {"a"=>{"b"=>["c"], "d"=>{"e"=>["f"], "g"=>["h", "i", "j", "k"]},
"l"=>["m", "n", "o", "p"]}, "q"=>{"r"=>["s"]}}
k = a, v={"b"=>["c"], "d"=>{"e"=>["f"], "g"=>["h", "i", "j", "k"]},
"l"=>["m", "n", "o", "p"]}, arr=[]
calling recurse({"b"=>["c"], "d"=>{"e"=>["f"], "g"=>["h", "i", "j", "k"]},
"l"=>["m", "n", "o", "p"]})
--------------------------------------------------------------
passed h = {"b"=>["c"], "d"=>{"e"=>["f"], "g"=>["h", "i", "j", "k"]},
"l"=>["m", "n", "o", "p"]}
k = b, v=["c"], arr=[]
arr << ["b", "c"]
arr = [["b", "c"]]
k = d, v={"e"=>["f"], "g"=>["h", "i", "j", "k"]}, arr=[["b", "c"]]
calling recurse({"e"=>["f"], "g"=>["h", "i", "j", "k"]})
------------------------------------------------------
passed h = {"e"=>["f"], "g"=>["h", "i", "j", "k"]}
k = e, v=["f"], arr=[]
arr << ["e", "f"]
arr = [["e", "f"]]
k = g, v=["h", "i", "j", "k"], arr=[["e", "f"]]
arr << ["g", "h", "i", "j", "k"]
arr = [["e", "f"], ["g", "h", "i", "j", "k"]]
returning=[["e", "f"], ["g", "h", "i", "j", "k"]]
------------------------------------------------------
return value=[["e", "f"], ["g", "h", "i", "j", "k"]]
calculating recurse(v).each { |a| arr << [k,*a] }
a=["e", "f"]
[k, *a] = ["d", "e", "f"]
a=["g", "h", "i", "j", "k"]
[k, *a] = ["d", "g", "h", "i", "j", "k"]
arr = [["b", "c"], ["d", "e", "f"], ["d", "g", "h", "i", "j", "k"]]
k = l, v=["m", "n", "o", "p"],
arr=[["b", "c"], ["d", "e", "f"], ["d", "g", "h", "i", "j", "k"]]
arr << ["l", "m", "n", "o", "p"]
arr = [["b", "c"], ["d", "e", "f"], ["d", "g", "h", "i", "j", "k"],
["l", "m", "n", "o", "p"]]
returning=[["b", "c"], ["d", "e", "f"], ["d", "g", "h", "i", "j", "k"],
["l", "m", "n", "o", "p"]]
--------------------------------------------------------------
return value=[["b", "c"], ["d", "e", "f"], ["d", "g", "h", "i", "j", "k"],
["l", "m", "n", "o", "p"]]
calculating recurse(v).each { |a| arr << [k,*a] }
a=["b", "c"]
[k, *a] = ["a", "b", "c"]
a=["d", "e", "f"]
[k, *a] = ["a", "d", "e", "f"]
a=["d", "g", "h", "i", "j", "k"]
[k, *a] = ["a", "d", "g", "h", "i", "j", "k"]
a=["l", "m", "n", "o", "p"]
[k, *a] = ["a", "l", "m", "n", "o", "p"]
arr = [["a", "b", "c"], ["a", "d", "e", "f"], ["a", "d", "g", "h", "i", "j", "k"],
["a", "l", "m", "n", "o", "p"]]
k = q, v={"r"=>["s"]}, arr=[["a", "b", "c"], ["a", "d", "e", "f"],
["a", "d", "g", "h", "i", "j", "k"], ["a", "l", "m", "n", "o", "p"]]
calling recurse({"r"=>["s"]})
--------------------------------------------------------------
passed h = {"r"=>["s"]}
k = r, v=["s"], arr=[]
arr << ["r", "s"]
arr = [["r", "s"]]
returning=[["r", "s"]]
--------------------------------------------------------------
return value=[["r", "s"]]
----------------------------------------------------------------------
calculating recurse(v).each { |a| arr << [k,*a] }
a=["r", "s"]
[k, *a] = ["q", "r", "s"]
arr = [["a", "b", "c"], ["a", "d", "e", "f"], ["a", "d", "g", "h", "i", "j", "k"],
["a", "l", "m", "n", "o", "p"], ["q", "r", "s"]]
returning=[["a", "b", "c"], ["a", "d", "e", "f"], ["a", "d", "g", "h", "i", "j", "k"],
["a", "l", "m", "n", "o", "p"], ["q", "r", "s"]]
----------------------------------------------------------------------
#=> [["a", "b", "c"], ["a", "d", "e", "f"], ["a", "d", "g", "h", "i", "j", "k"],
# ["a", "l", "m", "n", "o", "p"], ["q", "r", "s"]]

This will return an Array with all the paths.
def paths(element, path = [], accu = [])
  case element
  when Hash
    element.each do |key, value|
      paths(value, path + [key], accu)
    end
  when Array
    accu << (path + element)
  end
  accu
end
For nicer printing you can do
paths(tree_def).map { |path| path.join(".") }

The following method calls itself recursively until it reaches the array values.
The recursion follows multiple branches, and op must be a separate copy for each branch. I therefore used a string, because op+k always creates a new object; an array modified in place would be shared between the branches, as if it were passed by reference.
hash = {"a"=>{"b"=>["c"], "d"=>{"e"=>["f"], "g"=>["h", "i", "j", "k"]}, "l"=>["m", "n", "o", "p"]}, "q"=>{"r"=>["s"]}}
@output = []
def nested_array(h, op='')
  h.map do |k,v|
    if Hash === v
      nested_array(v, op+k)
    else
      @output << (op+k+v.join).chars
    end
  end
end
nested_array(hash)
@output will be your desired array.
[
["a", "b", "c"],
["a", "d", "e", "f"],
["a", "d", "g", "h", "i", "j", "k"],
["a", "l", "m", "n", "o", "p"],
["q", "r", "s"]
]
Update: keys and values can be longer than a single character, so the following version of nested_array may work better.
def nested_array(h, op=[])
  h.map do |k,v|
    if Hash === v
      nested_array(v, Array.new(op) << k)
    else
      @output << ((Array.new(op) << k) + v)
    end
  end
end
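The point about copies can be seen in isolation. The snippet below is only an illustrative sketch with made-up variable names: appending with << changes the one array that every name refers to, while + builds a fresh array.

```ruby
# Appending with << mutates the shared array, so every "branch" sees the change.
shared = ['a']
branch_one = shared            # no copy is made here
branch_one << 'b'
p shared                       # shared has changed too: both names refer to one object

# base + ['b'] builds a new array and leaves the original untouched.
base = ['a']
branch_two = base + ['b']
p base
p branch_two
```

This is why the versions above either copy the prefix (Array.new(op)) or build it with + before descending.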

All the solutions here are recursive; below is a non-recursive solution.
def flatten(input)
  sol = []
  while(input.length > 0)
    unprocessed_input = []
    input.each do |l, r|
      if r.is_a?(Array)
        sol << l + r
      else
        r.each { |k, v| unprocessed_input << [l + [k], v] }
      end
    end
    input = unprocessed_input
  end
  return sol
end
flatten([[[], h]])
Code explanation:
A hash in array form is [[k1, v1], [k2, v2]].
When the input hash is wrapped as [[], { a: {..} }], partial solutions of the form [[a], {..}] can be generated from it: index 0 holds the partial solution and index 1 holds the input that is yet to be processed.
Because this format makes it easy to pair each partial solution with its unprocessed input and to accumulate the unprocessed input, the input hash is converted to it, which gives [[[], input_hash]].
Result:
[["a", "b", "c"], ["a", "l", "m", "n", "o", "p"], ["q", "r", "s"], ["a", "d", "e", "f"], ["a", "d", "g", "h", "i", "j", "k"]]
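The same idea can be sketched with a single work queue of [path so far, node] pairs instead of rebuilding the input on every pass. This is an assumption-laden variant, not the code above: the name flatten_paths is hypothetical, and it assumes every leaf is an array.

```ruby
# Hypothetical queue-based variant; every leaf is assumed to be an Array.
def flatten_paths(hash)
  pending = [[[], hash]]          # pairs of [path so far, node still to process]
  result = []
  until pending.empty?
    path, node = pending.shift    # take the oldest pair (breadth-first order)
    if node.is_a?(Array)
      result << path + node       # leaf reached: the branch is complete
    else
      node.each { |key, value| pending << [path + [key], value] }
    end
  end
  result
end

tree = {
  'a' => {
    'b' => ['c'],
    'd' => { 'e' => ['f'], 'g' => ['h', 'i', 'j', 'k'] },
    'l' => ['m', 'n', 'o', 'p']
  },
  'q' => { 'r' => ['s'] }
}

queue_result = flatten_paths(tree)
p queue_result
```

Shallower branches come out first because the queue is processed in breadth-first order, which the question allows since ordering does not matter.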

Related

no implicit conversion of String into Integer, simple ruby function not working

When I run this code I get a TypeError, but when I do it by hand in IRB everything seems to work. I believe the problem lies somewhere in my if statement, but I don't know how to fix it.
numerals = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
def convertToNumbers(string)
arr = string.downcase.split('')
new_array = []
arr.each do |i|
if (arr[i] =~ [a-z])
numValue = numerals.index(arr[i]).to_s
new_array.push(numValue)
end
end
end
You probably meant
arr[i] =~ /[a-z]/
which matches the characters a through z. What you wrote
arr[i] =~ [a-z]
is constructing an array and trying to compare it using the regex comparison operator, which is a type error (assuming variables a and z are defined).
A few issues. As Tyler pointed out, inside the loop you are still indexing into arr when you only need to use i. The regex issue Max pointed out is valid as well. The function will also return arr rather than new_array, because arr is the return value of the each loop.
I made a few modifications.
def convertToNumbers(string)
  numerals = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]
  arr = string.downcase.split('')
  new_array = []
  arr.each do |i|
    if (i =~ /[a-z]/)
      numValue = numerals.index(i).to_s
      new_array.push(numValue)
    end
  end
  new_array.join
end
puts convertToNumbers('abcd')
which prints out '0123'
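For reference, the error message in the title can be reproduced on its own. In the original loop, i is already a character, so arr[i] indexes an array with a string. A small sketch, with variable names chosen only for this example:

```ruby
letters = "abc".chars     # ["a", "b", "c"]

begin
  letters["a"]            # Array#[] expects an Integer index (or a range), not a String
rescue TypeError => e
  message = e.message
end

p message
```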

pushing array to array in ruby have a wrong result

I have problem when pushing array to array,
the result is not as I expected.
I run this code below :
@arr = ["e", "s", "l", "e"]
def permutations(array, i=0)
  combined_array = Array.new
  (i..array.size-1).each do |j|
    array[i], array[j] = array[j], array[i]
    puts "ARRAY : #{array}"
    combined_array << array.to_a
  end
  return combined_array
end
permutations(@arr)
I got the output :
ARRAY : ["e", "s", "l", "e"]
ARRAY : ["s", "e", "l", "e"]
ARRAY : ["l", "e", "s", "e"]
ARRAY : ["e", "e", "s", "l"]
=> [["e", "e", "s", "l"], ["e", "e", "s", "l"], ["e", "e", "s", "l"], ["e", "e", "s", "l"]]
Result expected :
ARRAY : ["e", "s", "l", "e"]
ARRAY : ["s", "e", "l", "e"]
ARRAY : ["l", "e", "s", "e"]
ARRAY : ["e", "e", "s", "l"]
=> [["e", "s", "l", "e"], ["s", "e", "l", "e"], ["l", "e", "s", "e"], ["e", "e", "s", "l"]]
How can I solve this problem?
According to the documentation, #to_a called on an Array returns self (the array itself, not a copy).
You are adding the same array to combined_array multiple times.
Change the .to_a to .dup and it will work fine.
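A short sketch of the difference, using made-up variable names: to_a on an Array hands back the very same object, whereas dup makes a shallow copy that later in-place swaps do not affect.

```ruby
letters = ["e", "s", "l", "e"]

same = letters.to_a   # Array#to_a returns the receiver itself
copy = letters.dup    # dup returns a new, shallow copy

letters[0], letters[1] = letters[1], letters[0]   # swap in place

p same.equal?(letters)   # both names refer to one object, so same sees the swap
p same
p copy                   # the copy still has the original order
```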
I think @GolfWolf has solved your problem.
But you don't have to write such a function yourself; Ruby has a built-in permutation method that you can use:
p arr.permutation.to_a
If you only want the first 4 permutations, you can do this:
p arr.permutation.take(4)

Why can't I sort an array of strings by `count`?

With this code:
line = ("Ignore punctuation, please :)")
string = line.strip.downcase.split(//)
string.select! {|x| /[a-z]/.match(x) }
string.sort_by!{ |x| string.count(x)}
the result is:
["r", "g", "s", "l", "c", "o", "o", "p", "u", "i", "t", "u", "a", "t", "i", "a", "p", "n", "e", "e", "n", "n", "e"]
Does sorting by count not work in this case? Why? Is there a better way to isolate the words by frequency?
By your comment, I suppose that you want to sort characters by frequency and alphabetically. When the only sort_by! criteria is string.count(x), frequency groups with the same number of characters can appear mixed with each other. To sort each group alphabetically you have to add a second criteria in the sort_by! method:
line = ("Ignore punctuation, please :)")
string = line.strip.downcase.split(//)
string.select! {|x| /[a-z]/.match(x) }
string.sort_by!{ |x| [string.count(x), x]}
Then the output will be
["c", "g", "l", "r", "s", "a", "a", "i", "i", "o", "o", "p", "p", "t", "t", "u", "u", "e", "e", "e", "n", "n", "n"]
Let's look at your code line-by-line.
line = ("Ignore punctuation, please :)")
s = line.strip.downcase
#=> "ignore punctuation, please :)"
There's no particular reason to strip here, as you will be removing spaces and punctuation later anyway.
string = s.split(//)
#=> ["i", "g", "n", "o", "r", "e", " ", "p", "u", "n", "c", "t",
# "u", "a", "t", "i", "o", "n", ",", " ", "p", "l", "e", "a",
# "s", "e", " ", ":", ")"]
You've chosen to split the sentence into characters, which is fine, but as I'll mention at the end, you could just use String methods. In any case,
string = s.chars
does the same thing and is arguably more clear. What you have now is an array named string. Isn't that a bit confusing? Let's instead call it arr:
arr = s.chars
(One often sees s and str for names of strings, a and arr for names of arrays, h and hash for names of hashes, and so on.)
arr.select! {|x| /[a-z]/.match(x) }
#=> ["i", "g", "n", "o", "r", "e", "p", "u", "n", "c", "t", "u",
# "a", "t", "i", "o", "n", "p", "l", "e", "a", "s", "e"]
Now you've eliminated all but lowercase letters. You could also write that:
arr.select! {|x| x =~ /[a-z]/ }
or
arr.select! {|x| x[/[a-z]/] }
You are now ready to sort.
arr.sort_by!{ |x| arr.count(x) }
#=> ["l", "g", "s", "c", "r", "i", "p", "u", "a", "o", "t", "p",
# "a", "t", "i", "o", "u", "n", "n", "e", "e", "n", "e"]
This is OK, but it's not good practice to be sorting an array in place and counting the frequency of its elements at the same time. Better would be:
arr1 = arr.sort_by{ |x| arr.count(x) }
which gives the same ordering. Is the resulting sorted array correct? Let's count the number of times each letter appears in the string.
I will create a hash whose keys are the unique elements of arr and whose values are the number of times the associated key appears in arr. There are a few ways to do this. A simple but not very efficient way is as follows:
h = {}
a = arr.uniq
#=> ["l", "g", "s", "c", "r", "i", "p", "u", "a", "o", "t", "n", "e"]
a.each { |c| h[c] = arr.count(c) }
h #=> {"l"=>1, "g"=>1, "s"=>1, "c"=>1, "r"=>1, "i"=>2, "p"=>2,
# "u"=>2, "a"=>2, "o"=>2, "t"=>2, "n"=>3, "e"=>3}
This would normally be written:
h = arr.uniq.each_with_object({}) { |c,h| h[c] = arr.count(c) }
The elements of h are in increasing order of value, but that's just coincidence. To ensure they are in that order (to make it easier to see the order), we would need to construct an array, sort it, then convert it to a hash:
a = arr.uniq.map { |c| [c, arr.count(c)] }
#=> [["l", 1], ["g", 1], ["s", 1], ["c", 1], ["r", 1], ["a", 2], ["p", 2],
# ["u", 2], ["i", 2], ["o", 2], ["t", 2], ["n", 3], ["e", 3]]
a = a.sort_by { |_,count| count }
#=> [["l", 1], ["g", 1], ["s", 1], ["c", 1], ["r", 1], ["a", 2], ["t", 2],
# ["u", 2], ["i", 2], ["o", 2], ["p", 2], ["n", 3], ["e", 3]]
h = Hash[a]
#=> {"l"=>1, "g"=>1, "s"=>1, "c"=>1, "r"=>1, "i"=>2, "t"=>2,
# "u"=>2, "a"=>2, "o"=>2, "p"=>2, "n"=>3, "e"=>3}
One would normally see this written:
h = Hash[arr.uniq.map { |c| [c, arr.count(c)] }.sort_by(&:last)]
or, in Ruby v2.1+ (where Array#to_h was added):
h = arr.uniq.map { |c| [c, arr.count(c)] }.sort_by(&:last).to_h
Note that, prior to Ruby 1.9, there was no concept of key ordering in hashes.
The values of h's key-value pairs show that your sort is correct. It is not, however, very efficient. That's because in:
arr.sort_by { |x| arr.count(x) }
you repeatedly traverse arr, counting frequencies of elements. It's better to construct the hash above:
h = arr.uniq.each_with_object({}) { |c,h| h[c] = arr.count(c) }
before performing the sort, then:
arr.sort_by { |x| h[x] }
As an aside, let me mention a more efficient way to construct the hash h, one which requires only a single pass through arr:
h = Hash.new(0)
arr.each { |x| h[x] += 1 }
h #=> {"l"=>1, "g"=>1, "s"=>1, "c"=>1, "r"=>1, "a"=>2, "p"=>2,
# "u"=>2, "i"=>2, "o"=>2, "t"=>2, "n"=>3, "e"=>3}
or, more succinctly:
h = arr.each_with_object(Hash.new(0)) { |x,h| h[x] += 1 }
Here h is called a counting hash:
h = Hash.new(0)
creates an empty hash whose default value is zero. This means that if h does not have a key k, h[k] will return zero. The abbreviated assignment h[c] += 1 expands to:
h[c] = h[c] + 1
and if h does not have a key c, h[c] on the right side returns the default value:
h[c] = 0 + 1 #=> 1
but the next time c is encountered:
h[c] = h[c] + 1
#=> 1 + 1 => 2
Lastly, let's start over and do as much as we can with String methods:
line = ("Ignore punctuation, please :)")
s = line.strip.downcase.gsub(/./) { |c| (c =~ /[a-z]/) ? c : '' }
#=> "ignorepunctuationplease"
h = s.each_char.with_object(Hash.new(0)) { |c,h| h[c] += 1 }
#=> {"i"=>2, "g"=>1, "n"=>3, "o"=>2, "r"=>1, "e"=>3, "p"=>2,
# "u"=>2, "c"=>1, "t"=>2, "a"=>2, "l"=>1, "s"=>1}
s.each_char.sort_by { |c| h[c] }
#=> ["l", "g", "s", "c", "r", "i", "p", "u", "a", "o", "t", "p",
# "a", "t", "i", "o", "u", "n", "n", "e", "e", "n", "e"]
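On newer Rubies the counting hash does not have to be built by hand. The sketch below assumes Ruby 2.7 or later, where Enumerable#tally is available, and combines it with the [count, character] sort key from the earlier answer; the variable names are chosen only for this example.

```ruby
line = "Ignore punctuation, please :)"

chars  = line.downcase.scan(/[a-z]/)   # keep only the letters
counts = chars.tally                   # counting hash, e.g. "n" => 3 (Ruby 2.7+)

# Sort by frequency first, then alphabetically within each frequency group.
sorted = chars.sort_by { |c| [counts[c], c] }
p sorted
```

Each letter is counted once up front, so the sort block is a simple hash lookup rather than a fresh pass over the array.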

Returning a list of indices where a certain object appears in a nested array

I am trying to figure out how to form an array that collects every index of a particular object (in this case a single letter) where it appears in a nested set of arrays. For instance, using the array set below,
boggle_board = [["P", "P", "X", "A"],
["V", "F", "S", "Z"],
["O", "P", "W", "N"],
["D", "H", "L", "E"]]
I would expect something like boggle_board.include?("P") to return a nested array of indices [[0,0],[0,1],[2,1]]. Any ideas on how to do this?
Nothing super-elegant comes to mind for me right now. This seems to work:
def indices_of(board, letter)
  indices = []
  board.each_with_index do |ar, i|
    ar.each_with_index do |s, j|
      indices.push([i, j]) if s == letter
    end
  end
  indices
end
boggle_board = [["P", "P", "X", "A"],
["V", "F", "S", "Z"],
["O", "P", "W", "N"],
["D", "H", "L", "E"]]
indices_of(boggle_board, "P")
# => [[0, 0], [0, 1], [2, 1]]
I would use Matrix#each_with_index. The code below is more Ruby-like:
require "matrix"
m = Matrix[["P", "P", "X", "A"],
["V", "F", "S", "Z"],
["O", "P", "W", "N"],
["D", "H", "L", "E"]]
ar = []
m.each_with_index {|e, row, col| ar << [row,col] if e == "P"}
ar #=> [[0, 0], [0, 1], [2, 1]]
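If you would rather stay with plain nested arrays and avoid the explicit accumulator, one possible sketch uses flat_map. The method name positions_of is hypothetical and only used for this example.

```ruby
# Hypothetical helper: returns [row, column] pairs where letter occurs.
def positions_of(board, letter)
  board.each_with_index.flat_map do |row, i|
    # collect the matching column indices of this row, then pair them with the row index
    row.each_index.select { |j| row[j] == letter }.map { |j| [i, j] }
  end
end

board = [["P", "P", "X", "A"],
         ["V", "F", "S", "Z"],
         ["O", "P", "W", "N"],
         ["D", "H", "L", "E"]]

found = positions_of(board, "P")
p found
```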

merging array items in ruby

Given an array of arrays
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
What is the simplest way to merge the array items that share members with any other array item? For example, the above should become
[["A", "B", "C", "D","E", "F"], ["G"]] since "B" and "C" are shared by the first and second array items.
Here are some more test cases.
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
=> [["A", "B", "C", "D", "E", "F", "G"]]
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
=> [["A", "B", "C", "D", "E", "F"], ["G", "H"]]
Here is my quick version which can be optimized I am sure :)
# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
array.collect! do |e|
t = e
e.each do |f|
array.each do |a|
if a.index(f)
t = t | a
end
end
end
e = t.sort
end
p array.uniq
Edit: Martin DeMello code was fixed.
When running Martin DeMello code (the accepted answer) I get:
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]] =>
[["B", "C", "E", "F", "A", "D", "G"], ["A", "B", "C", "D"], ["F", "G"]]
and
[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]] =>
[["B", "C", "E", "F", "A", "D"], ["A", "B", "C", "D"], ["G", "H"], ["G", "H"]]
which does not seem to meet your spec.
Here is my approach using a few of his ideas:
a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
b = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
def reduce(array)
  h = Hash.new {|h,k| h[k] = []}
  array.each_with_index do |x, i|
    x.each do |j|
      h[j] << i
      if h[j].size > 1
        # merge the two sub arrays
        array[h[j][0]].replace((array[h[j][0]] | array[h[j][1]]).sort)
        array.delete_at(h[j][1])
        return reduce(array)
        # recurse until nothing needs to be merged
      end
    end
  end
  array
end
puts reduce(a).to_s #[["A", "B", "C", "D", "E", "F", "G"]]
puts reduce(b).to_s #[["A", "B", "C", "D", "E", "F"], ["G", "H"]]
Different algorithm, with a merge-as-you-go approach rather than taking two passes over the array (vaguely influenced by the union-find algorithm). Thanks for a fun problem :)
A = [["A", "G"],["B", "C", "E", "F"], ["A", "B", "C", "D"], ["B"], ["H", "I"]]
H = {}
B = (0...(A.length)).to_a
def merge(i,j)
A[j].each do |e|
if H[e] and H[e] != j
merge(i, H[e])
else
H[e] = i
end
end
A[i] |= A[j]
B[j] = i
end
A.each_with_index do |x, i|
min = A.length
x.each do |j|
if H[j]
merge(H[j], i)
else
H[j] = i
end
end
end
out = B.sort.uniq.map {|i| A[i]}
p out
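Since union-find is mentioned, here is a minimal sketch of that idea without global constants. It is not the code above: merge_overlapping and its local names are hypothetical, and path compression is left out to keep it short.

```ruby
# Hypothetical union-find style merge: groups sharing any element end up together.
def merge_overlapping(groups)
  parent = (0...groups.length).to_a            # each group starts as its own root

  find = lambda do |i|
    i = parent[i] while parent[i] != i         # walk up to the root
    i
  end

  first_seen = {}                              # element => index of the first group containing it
  groups.each_with_index do |group, i|
    group.each do |element|
      if first_seen.key?(element)
        a = find.call(first_seen[element])
        b = find.call(i)
        parent[[a, b].max] = [a, b].min        # link the two roots, keeping the lower index
      else
        first_seen[element] = i
      end
    end
  end

  # Collect every group under its root and take the union of the members.
  merged = Hash.new { |hash, key| hash[key] = [] }
  groups.each_with_index { |group, i| merged[find.call(i)] |= group }
  merged.values.map(&:sort)
end

p merge_overlapping([["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]])
p merge_overlapping([["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]])
p merge_overlapping([["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]])
```

The input array is left unmodified, and the three calls correspond to the test cases in the question.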
Not the simplest, maybe the longest :)
l = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]
puts l.flatten.inject([[],[]]) {|r,e| if l.inject(0) {|c,a| if a.include?(e) then c+1 else c end} >= 2 then r[0] << e ; r[0].uniq! else r[1] << e end ; r}.inspect
#[["B", "C"], ["E", "F", "A", "D", "G"]]
l = [["B", "C", "E", "F"], ["A", "B","C", "D"], ["G"]]
p l.inject([]){|r,e|
r.select{|i|i&e!=[]}==[]&&(r+=[e])||(r=r.map{|i|(i&e)!=nil&&(i|e).sort||i})
}
I'm not sure about your condition.
The simplest way to do it would be to take the powerset of an array (a set containing every possible combination of elements of the array), throw out any of the resulting sets if they don't have a common element, flatten the remaining sets and discard subsets and duplicates.
Or at least it would be if Ruby had proper Set support. Actually doing this in Ruby is horribly inefficient and an awful kludge:
power_set = array.inject([[]]){|c,y|r=[];c.each{|i|r<<i;r<<i+[y]};r}.reject{|x| x.empty?}
collected_powerset = power_set.collect{|subset| subset.flatten.uniq.sort unless
subset.inject(subset.last){|acc,a| acc & a}.empty?}.uniq.compact
collected_powerset.reject{|x| collected_powerset.any?{|c| (c & x) == x && x.length < c.length}}
Power set operation comes from here.
Straightforward rather than clever. It's destructive of the original array. The basic idea is:
go down the list of arrays, noting which array each element appears in
for every entry in this index list that shows an element in more than one array, merge all those arrays into the lowest-indexed array
when merging two arrays, replace the lower-indexed array with the merged result, and the higher-indexed array with a pointer to the lower-indexed array.
It's "algorithmically cheaper" than intersecting every pair of arrays, though the actual running speed will depend on what ruby hands over to the C layer.
a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
h = Hash.new {|h,k| h[k] = []}
a.each_with_index {|x, i| x.each {|j| h[j] << i}}
b = (0...(a.length)).to_a
h.each_value do |x|
x = x.sort_by {|i| b[i]}
if x.length > 1
x[1..-1].each do |i|
b[i] = [b[i], b[x[0]]].min
a[b[i]] |= a[i]
end
end
end
a = b.sort.uniq.map {|i| a[i]}
def merge_intersecting(input, result=[])
  head = input.first
  tail = input[1..-1]
  return result if tail.empty?
  intersection = tail.select { |arr| !(head & arr).empty? }
  unless intersection.empty?
    merged = head | intersection.flatten
    result << merged.sort
  end
  merge_intersecting(tail, result)
end
require 'minitest/spec'
require 'minitest/autorun'
describe "" do
it "merges input array" do
input = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]
output = [["A", "B", "C", "D", "E", "F", "G"]]
merge_intersecting(input).must_equal output
end
it "merges input array" do
input = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]
output = [["A", "B", "C", "D", "E", "F"], ["G", "H"]]
merge_intersecting(input).must_equal output
end
end

Resources