Group hash of arrays by array element - ruby

I have this data structure, which is the result of a grouped query:
{
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
I want to manipulate it so I end up with a structure grouped like this
{
0 => {
'AR' => 2,
'AQ' => 6,
nil => 1
},
1 => {
'AQ' => 3,
nil => 4
},
2 => {
'BG' => 1,
nil => 1
}
}

input = {
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
result = {}
input.each do |k, v|
if result[k[0]]
result[k[0]].merge!({ k[1] => v })
else
result[k[0]] = { k[1] => v }
end
end
puts result
#{0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}
I don't think this is the most succinct way; I'd appreciate some advice!

hash = {
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
new_hash = {}
hash.each{|k, v| new_hash[k[0]] ||= {}; new_hash[k[0]].merge!({k[1] => v})}
puts new_hash # {0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}

Here is one more, very similar to the previous answers but using #each_with_object:
hash = {
[0, "AR"]=>2,
[0, nil]=>1,
[0, "AQ"]=>6,
[1, nil]=>4,
[1, "AQ"]=>3,
[2, "BG"]=>1,
[2, nil]=>1,
}
result_hash = Hash.new { |h,k| h[k] = {} }
hash.each_with_object(result_hash) do |((parent_key, key), value), res|
res[parent_key].merge!(key => value)
end
=> {0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}

I came up with an answer that doesn't require additional variable assignments in its enclosing scope (it has "referential transparency": https://en.wikipedia.org/wiki/Referential_transparency)
input
.group_by { |(arr, num)| arr.first }
.each_with_object(Hash.new) do |(key, vals), hsh|
vals.each do |((key, innerkey), innerval)|
hsh[key] ||= {}
hsh[key][innerkey] = innerval
end
hsh
end
# {0=>{"AR"=>2, nil=>1, "AQ"=>6}, 1=>{nil=>4, "AQ"=>3}, 2=>{"BG"=>1, nil=>1}}
Two high-level steps:
1. I noticed the output object is grouped by the first array element (here, 0/1/2), so I use #group_by to create a hash with that structure.
# output of `#group_by` on first array element:
key: 0, vals: [ [[0, "AR"], 2], [[0, nil], 1], [[0, "AQ"], 6] ]
key: 1, vals: [ [[1, nil], 4], [[1, "AQ"], 3] ]
key: 2, vals: [ [[2, "BG"], 1], [[2, nil], 1] ]
2. I use #each_with_object to construct the nested hashes. For each vals array above, I extract the inner key and the value by destructuring the arrays in the block parameter (((key, innerkey), innerval)), after which the hash assignment is straightforward.
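The same regrouping can also be done in a single pass, using nested block-parameter destructuring and no intermediate #group_by. This is only a minimal sketch, not taken from any of the answers; the method name nest_by_first_key is made up for illustration.

```ruby
# Hypothetical helper: turns { [outer, inner] => value } into
# { outer => { inner => value } } in one pass.
def nest_by_first_key(flat)
  flat.each_with_object({}) do |((outer, inner), value), nested|
    # Create the inner hash the first time an outer key is seen.
    (nested[outer] ||= {})[inner] = value
  end
end

input = {
  [0, "AR"] => 2, [0, nil] => 1, [0, "AQ"] => 6,
  [1, nil] => 4, [1, "AQ"] => 3,
  [2, "BG"] => 1, [2, nil] => 1
}

p nest_by_first_key(input)
```

Because the accumulator is created inside the method, nothing has to be assigned in the calling scope.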

Related

Convert Ruby array of elements to Hash of counts with indices

Given a two dimensional array in Ruby:
[ [1, 1, 1],
[1, 1],
[1, 1, 1, 1],
[1, 1]
]
I'd like to create a Hash, where the keys are the counts of each internal array, and the values are arrays of indices of the original array whose internal array sizes have the particular count. The resulting Hash would be:
{ 2 => [1, 3], 3 => [0], 4 => [2] }
How do I concisely express this functionally in Ruby? I am attempting something akin to Hash.new([]).tap { |h| array.each_with_index { |a, i| h[a.length] << i } }, but the resulting Hash is empty.
There are two problems with your code. The first is that when h is empty and you write, say, h[2] << 1, since h does not have a key 2, h[2] returns the default, so this expression becomes [] << 1 #=> [1], but [1] is not attached to the hash, so no key and value are added.
You need to write h[2] = h[2] << 1 (see footnote 1). If you do that, your code returns h #=> {3=>[0, 1, 2, 3], 2=>[0, 1, 2, 3], 4=>[0, 1, 2, 3]}. Unfortunately, that's still incorrect, which takes us to the second problem with your code: you did not define the newly-created hash's default value correctly.
First note that
h[3].object_id
#=> 70113420279440
h[2].object_id
#=> 70113420279440
h[4].object_id
#=> 70113420279440
Aha, all three values are the same object! new's argument [] is returned by h[k] when h does not have a key k. The problem is that the same array is returned for every key k added to the hash, so you would be appending an index to that one array for the first new key, then appending a second index to the same array for the next new key, and so on. See below for how the hash needs to be defined.
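The difference between the two ways of giving a hash a default can be seen in isolation. The sketch below is illustrative only; the two method names are made up, and the behaviour shown is that of Hash.new with an argument versus Hash.new with a block.

```ruby
# Builds a hash whose default is ONE array shared by every missing key.
def shared_default_hash
  Hash.new([])
end

# Builds a hash whose block stores a fresh array for each missing key.
def fresh_default_hash
  Hash.new { |hash, key| hash[key] = [] }
end

shared = shared_default_hash
shared[2] << 1   # mutates the shared default; key 2 is never stored
shared[3] << 0   # mutates that same array again
p shared.keys    # no keys were added
p shared[4]      # the default array now carries both pushes

fresh = fresh_default_hash
fresh[2] << 1    # the block stores a new array under key 2 first
fresh[3] << 0
p fresh.keys
```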
With these two changes your code works fine, but I would suggest writing it as follows.
arr = [ [1, 1, 1], [1, 1], [1, 1, 1, 1], [1, 1] ]
arr.each_with_index.with_object(Hash.new {|h,k| h[k]=[]}) { |(a,i),h|
h[a.size] << i }
#=> {3=>[0], 2=>[1, 3], 4=>[2]}
which uses the form of Hash::new that takes a block to calculate the hash's default value (i.e., the value returned by h[k] when a hash h does not have a key k),
or
arr.each_with_index.with_object({}) { |(a,i),h| (h[a.size] ||= []) << i }
#=> {3=>[0], 2=>[1, 3], 4=>[2]}
both of which are effectively the following:
h = {}
arr.each_with_index do |a,i|
sz = a.size
h[sz] = [] unless h.key?(sz)
h[a.size] << i
end
h #=> {3=>[0], 2=>[1, 3], 4=>[2]}
Another way is to use Enumerable#group_by, grouping on array size, after picking up the index for each inner array.
h = arr.each_with_index.group_by { |a,i| a.size }
#=> {3=>[[[1, 1, 1], 0]],
# 2=>[[[1, 1], 1], [[1, 1], 3]],
# 4=>[[[1, 1, 1, 1], 2]]}
h.each_key { |k| h[k] = h[k].map(&:last) }
#=> {3=>[0], 2=>[1, 3], 4=>[2]}
1 The expression h[2] = h[2] << 1 uses the methods Hash#[]= and Hash#[], which is why h[2] on the left of = does not return the default value. This expression can alternatively be written h[2] <<= 1.
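The #group_by variant can be condensed into one chain if Hash#transform_values is available (Ruby 2.4 or later, which is an assumption about your environment). A minimal sketch, with a made-up method name:

```ruby
# Hypothetical helper: maps each inner-array size to the indices that have it.
def indices_by_size(arr)
  arr.each_with_index
     .group_by { |inner, _index| inner.size }        # size => [[inner, index], ...]
     .transform_values { |pairs| pairs.map(&:last) } # keep only the indices
end

p indices_by_size([[1, 1, 1], [1, 1], [1, 1, 1, 1], [1, 1]])
```

Unlike the each_key version, this builds a new hash rather than reassigning the values of an existing one.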
arry = [ [1, 1, 1],
[1, 1],
[1, 1, 1, 1],
[1, 1]
]
h = {}
arry.each_with_index do |el,i|
c = el.count
h.has_key?(c) ? h[c] << i : h[c] = [i]
end
p h
This will give you
{3=>[0], 2=>[1, 3], 4=>[2]}

Merging an array of hashes

I have the following array:
x = [ { a: [1,2] }, { a: [3,4] }, { a: [5,6] } ]
and I need to get
{ a: [[1,2], [3,4], [5,6]] }
I have tried to use (among other options) merge:
x.each_with_object({}) do |a, b|
b.merge!(a) {|k, o, n| o.zip(n) }
end
But unfortunately, I get an extra array around the result.
Any suggestions?
THANKS
x.flat_map(&:to_a).group_by(&:first).map{ |k, v| [k, v.map(&:last)] }.to_h
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
This may not be the most efficient way to get the expected result, but you can do:
h = Hash.new([])
x.each { |hash|
hash.each { |key, values|
h[key] = h[key] + [values]
}
}
That way, at the end h is {:a=>[[1, 2], [3, 4], [5, 6]]}
key = x.first.first.first
#=> :a
{ key=>x.map { |h| h[key] } }
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
Note
a = x.first
#=> {:a=>[1, 2]}
b = a.first
#=> [:a, [1, 2]]
b.first
#=> :a
Another way:
a = x.map { |h| h.merge(h) { |_,v,_| [v] } }
#=> [{:a=>[[1, 2]]}, {:a=>[[3, 4]]}, {:a=>[[5, 6]]}]
a.reduce { |t,h| t.merge(h) { |_,o,n| o+n } }
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
Both steps use the form of Hash#merge that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for details.
These two steps could be combined into one as follows:
x.reduce({}) { |t,h| t.merge(h.merge(h) { |_,v,_| [v] }) { |_,o,n| o+n } }
#=> {:a=>[[1, 2], [3, 4], [5, 6]]}
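If the hashes may contain more than one key, the same idea can be written without merge at all. The following is a sketch under that assumption; collect_values is a made-up name and is not part of any answer above.

```ruby
# Hypothetical helper: collects the values of each key across all hashes.
def collect_values(hashes)
  hashes.each_with_object(Hash.new { |h, k| h[k] = [] }) do |hash, acc|
    # Each value is appended as-is, so arrays stay nested.
    hash.each { |key, value| acc[key] << value }
  end
end

x = [{ a: [1, 2] }, { a: [3, 4] }, { a: [5, 6] }]
p collect_values(x)
```

Note that the returned hash keeps its default block, so reading a missing key from it will add that key with an empty array.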

Each loop not working as expected

I have an array called @results that is made up only of arrays. I want to iterate through @results and permanently delete any of the inner arrays that are smaller than a given size:
My code:
def check_results limit
@results.each_with_index do |result, index|
@results.delete_at(index) if result.size < limit
end
end
Unfortunately, this only deletes the first item where the array length is less than limit. For example, if limit = 4 and @results = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]], then check_results returns [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1]]
I can't figure out why this is happening. Am I using the wrong loop?
You should do this instead, because delete_at modifies the array, and deleting elements while iterating over it gives unexpected behavior:
@results.reject { |i| i.size < limit }
The code above returns a new array without the elements whose size is smaller than limit. Assign the result back to @results (or use reject!) to make the change permanent.
It's not a good idea to modify the @results array in place as that will conflict with the outer iteration.
What you should do instead is use select to build a new array.
def check_results(limit)
@results.select { |result| result.size >= limit }
end
As per the documentation, #delete_at returns the element at that index.
a = ["ant", "bat", "cat", "dog"]
a.delete_at(2) #=> "cat"
a #=> ["ant", "bat", "dog"]
a.delete_at(99) #=> nil
I added some debug statements to show you what is happening at each step, assuming limit is 4:
@results = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]]
@results.each_with_index do |r, i|
puts "RESULT: #{r.to_s}"
puts "INDEX: #{i}"
@results.delete_at(i) if r.size < 4
puts "ARRAY: #{@results.to_s}"
end
RESULT: [1, 1, 1, 1]
INDEX: 0
ARRAY: [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]]
RESULT: [1, 1, 1, 1]
INDEX: 1
ARRAY: [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]]
RESULT: [1, 1, 1]
INDEX: 2
ARRAY: [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1]]
# @results == [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1]]
As you can see, the element originally at index 2 has been removed. Because you are modifying @results while you are iterating through it, the remaining short array has shifted down to index 2, which has already been visited, and index 3 no longer exists, so the loop ends without ever looking at it. This is why you should not modify an object while iterating through it.
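The skipped element can be made visible by recording what the loop actually visits. This is a small illustrative sketch; the method name visited_while_deleting is invented here, and it takes the array as a parameter instead of using @results.

```ruby
# Records which elements each_with_index visits while the array
# is being shrunk underneath it.
def visited_while_deleting(arrays, limit)
  visited = []
  arrays.each_with_index do |inner, index|
    visited << inner
    arrays.delete_at(index) if inner.size < limit
  end
  visited
end

data = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]]
p visited_while_deleting(data, 4) # [1, 1] is never visited
p data                            # so it is still in the array
```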
Ideally, you want to use #delete_if. Like methods ending in !, #delete_if modifies the array itself (it does not return a copy), removing every element for which the block returns true. The method would be implemented as follows:
def check_results(limit)
@results.delete_if { |arr| arr.length < limit }
end
@results = [ ['foo', 'bar'], ['bizz', 'bazz'], ['kaboom'] ]
check_results(2)
# => @results == [ ['foo', 'bar'], ['bizz', 'bazz'] ]
If you do not want to modify @results, then I suggest a similar method, #reject. In that case @results is not modified; a filtered copy is returned instead.
def check_results(limit)
@results.reject { |arr| arr.length < limit }
end
@results = [ ['foo', 'bar'], ['bizz', 'bazz'], ['kaboom'] ]
check_results(2)
# => [ ['foo', 'bar'], ['bizz', 'bazz'] ]
# => @results == [ ['foo', 'bar'], ['bizz', 'bazz'], ['kaboom'] ]
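For completeness, here are both variants inside a class, so that @results is a real instance variable. The class name ResultSet and its method names are hypothetical; only the @results name comes from the question.

```ruby
# Hypothetical wrapper class around the @results array.
class ResultSet
  attr_reader :results

  def initialize(results)
    @results = results
  end

  # Destructive: removes short arrays from @results itself.
  # reject! returns nil when nothing was removed, so self is returned instead.
  def prune!(limit)
    @results.reject! { |arr| arr.size < limit }
    self
  end

  # Non-destructive: returns a filtered copy and leaves @results alone.
  def pruned(limit)
    @results.reject { |arr| arr.size < limit }
  end
end

set = ResultSet.new([[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1], [1, 1]])
p set.pruned(4)         # filtered copy
p set.prune!(4).results # @results after in-place removal
```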

Finding duplicates in nested arrays

I have a hash, which contains a hash, which contains a number of arrays, like this:
{ "bob" =>
{
"foo" => [1, 3, 5],
"bar" => [2, 4, 6]
},
"fred" =>
{
"foo" => [1, 7, 9],
"bar" => [8, 10, 12]
}
}
I would like to compare the arrays against the other arrays, and then alert me if they are duplicates. It is possible for hash["bob"]["foo"] and hash["fred"]["foo"] to have duplicates, but not for hash["bob"]["foo"] and hash["bob"]["bar"]. Same with hash["fred"].
I can't even figure out where to begin with this one. I suspect inject will be involved somewhere, but I could be wrong.
This snippet will return an array of duplicates for each key. Duplicates can only be generated for equal keys.
duplicates = (keys = h.values.map(&:keys).flatten.uniq).map do |key|
{key => h.values.map { |h| h[key] }.inject(&:&)}
end
This will return [{"foo"=>[1]}, {"bar"=>[]}] which indicates that the key foo was the only one containing a duplicate of 1.
The snippet above assumes h is the variable name of your hash.
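The same intersection idea can be wrapped in a method that returns a single hash instead of an array of one-entry hashes. This is a sketch only; shared_values_by_key is a made-up name, and like the snippet above it reports only values that appear under a key for every person, not values shared by just some of them.

```ruby
# Hypothetical helper: for every inner key, the values shared by ALL people.
def shared_values_by_key(people)
  inner_keys = people.values.flat_map(&:keys).uniq
  inner_keys.map do |key|
    # A person lacking the key contributes an empty array.
    arrays = people.values.map { |inner| inner.fetch(key, []) }
    [key, arrays.reduce(:&)]
  end.to_h
end

h = {
  "bob"  => { "foo" => [1, 3, 5], "bar" => [2, 4, 6] },
  "fred" => { "foo" => [1, 7, 9], "bar" => [8, 10, 12] }
}
p shared_values_by_key(h)
```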
h = {
"bob" =>
{
"foo" => [1, 3, 5],
"bar" => [2, 4, 6]
},
"fred" =>
{
"foo" => [1, 7, 9],
"bar" => [1, 10, 12]
}
}
h.each do |k, v|
numbers = v.values.flatten
puts k if numbers.length > numbers.uniq.length
end
There are many ways to do it.
Here's one that should be easy to read.
It works in Ruby 1.9. It uses + to combine the two arrays and then calls uniq!, which returns nil when it removes nothing, to find out whether there is a duplicate number.
h = { "bob" =>
{
"foo" => [1, 3, 5],
"bar" => [2, 4, 6]
},
"fred" =>
{
"foo" => [1, 7, 12],
"bar" => [8, 10, 12]
}
}
h.each do |person|
if (person[1]["foo"] + person[1]["bar"]).uniq! != nil
puts "Duplicate in #{person[1]}"
end
end
I'm not sure what exactly you are looking for, but take a look at this possible solution; perhaps you can reuse something.
outer_hash.each do |person, inner_hash|
seen_arrays = Hash.new
inner_hash.each do |inner_key, array|
other = seen_arrays[array]
if other
raise "array #{person}/#{inner_key} is a duplicate of #{other}"
end
seen_arrays[array] = "#{person}/#{inner_key}"
end
end

How to quickly print Ruby hashes in a table format?

Is there a way to quickly print a ruby hash in a table format into a file?
Such as:
keyA keyB keyC ...
123 234 345
125 347
4456
...
where the values of the hash are arrays of different sizes. Or is using a double loop the only way?
Thanks
Try this gem I wrote (prints hashes, ruby objects, ActiveRecord objects in tables): http://github.com/arches/table_print
Here's a version of steenslag's that works when the arrays aren't the same size:
size = h.values.max_by { |a| a.length }.length
m = h.values.map { |a| a += [nil] * (size - a.length) }.transpose.insert(0, h.keys)
nil seems like a reasonable placeholder for missing values but you can, of course, use whatever makes sense.
For example:
>> h = {:a => [1, 2, 3], :b => [4, 5, 6, 7, 8], :c => [9]}
>> size = h.values.max_by { |a| a.length }.length
>> m = h.values.map { |a| a += [nil] * (size - a.length) }.transpose.insert(0, h.keys)
=> [[:a, :b, :c], [1, 4, 9], [2, 5, nil], [3, 6, nil], [nil, 7, nil], [nil, 8, nil]]
>> m.each { |r| puts r.map { |x| x.nil?? '' : x }.inspect }
[:a, :b, :c]
[ 1, 4, 9]
[ 2, 5, ""]
[ 3, 6, ""]
["", 7, ""]
["", 8, ""]
h = {:a => [1, 2, 3], :b => [4, 5, 6], :c => [7, 8, 9]}
p h.values.transpose.insert(0, h.keys)
# [[:a, :b, :c], [1, 4, 7], [2, 5, 8], [3, 6, 9]]
No, there's no built-in function. Here's some code that would format it the way you want:
data = { :keyA => [123, 125, 4456], :keyB => [234000], :keyC => [345, 347] }
length = data.values.max_by{ |v| v.length }.length
widths = {}
data.keys.each do |key|
widths[key] = 5 # minimum column width
# longest string len of values
val_len = data[key].max_by{ |v| v.to_s.length }.to_s.length
widths[key] = (val_len > widths[key]) ? val_len : widths[key]
# length of key
widths[key] = (key.to_s.length > widths[key]) ? key.to_s.length : widths[key]
end
result = ""
data.keys.each {|key| result += key.to_s.ljust(widths[key]) + " " }
result += "\n"
for i in 0.upto(length - 1)
data.keys.each { |key| result += data[key][i].to_s.ljust(widths[key]) + " " }
result += "\n"
end
# TODO write result to file...
Any comments and edits to refine the answer are very welcome.
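A more compact take on the same approach, including the file-writing step left as a TODO, might look like the sketch below. The method name format_table and the file name table.txt are placeholders, and the column width here is simply the longest cell in each column with no minimum.

```ruby
# Hypothetical helper: renders { key => [values] } as left-aligned columns.
def format_table(data)
  keys = data.keys
  row_count = data.values.map(&:length).max || 0
  # Each column is as wide as its longest cell, header included.
  widths = keys.map do |key|
    [key.to_s.length, *data[key].map { |v| v.to_s.length }].max
  end

  lines = [keys.map(&:to_s)]
  # Missing cells become empty strings (nil.to_s).
  row_count.times { |i| lines << keys.map { |key| data[key][i].to_s } }

  lines.map do |cells|
    cells.each_with_index.map { |cell, col| cell.ljust(widths[col]) }.join(" ").rstrip
  end.join("\n") + "\n"
end

data = { keyA: [123, 125, 4456], keyB: [234], keyC: [345, 347] }
puts format_table(data)
# To write the table to a file instead (the path is a placeholder):
# File.write("table.txt", format_table(data))
```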

Resources