How to find union between array of first value in ruby - ruby

I have a set of array like this:
[["1","2"],["1","3"],["2","3"],["2","5"]]
I want to find the union of first values like
["1","2"],["1","3"] matches so i need to create new array like ["1","2,3"]
so the resulting array will be like
[["1","2,3"],["2","3,5"]]

Like most problems in Ruby, the Enumerable module does the job:
input = [["1","2"],["1","3"],["2","3"],["2","5"]]
result = input.group_by do |item|
# Group by first element
item[0]
end.collect do |key, items|
# Compose into new format
[
key,
items.collect do |item|
item[1]
end.join(',')
]
end
puts result.inspect
# => [["1", "2,3"], ["2", "3,5"]]
The group_by method comes in very handy when aggregating things like this, and collect is great for rewriting how the elements appear.

What you are asking for is not a true union for a true union of each 2 it would be:
data = [["1","2"],["1","3"],["2","3"],["2","5"]]
data.each_slice(2).map{|a,b| a | b.to_a }
#=> [["1","2","3"],["2","3","5"]]
Here is a very simple solution that modifies this concept to fit your needs:
data = [["1","2"],["1","3"],["2","3"],["2","5"]]
data.each_slice(2).map do |a,b|
unified = (a | b.to_a)
[unified.shift,unified.join(',')]
end
#=> [["1", "2,3"], ["2", "3,5"]]
Added to_a to piped variable b in the event that there are an uneven number of arrays. eg.
data = [["1","2"],["1","3"],["2","3"],["2","5"],["4","7"]]
data.each_slice(2).map do |a,b|
unified = (a | b.to_a)
[unified.shift,unified.join(',')]
end
#=> [["1", "2,3"], ["2", "3,5"], ["4","7"]]
If you meant that you want this to happen regardless of order then this will work but will destroy the data object
data.group_by(&:shift).map{|k,v| [k,v.flatten.join(',')]}
#=> [["1", "2,3"], ["2", "3,5"], ["4","7"]]
Non destructively you could call
data.map(&:dup).group_by(&:shift).map{|k,v| [k,v.flatten.join(',')]}
#=> [["1", "2,3"], ["2", "3,5"], ["4","7"]]

Here's another way.
Code
def doit(arr)
arr.each_with_object({}) { |(i,*rest),h| (h[i] ||= []).concat(rest) }
.map { |i,rest| [i, rest.join(',')] }
end
Examples
arr1 = [["1","2"],["1","3"],["2","3"],["2","5"]]
doit(arr1)
#=> [["1", "2,3"], ["2", "3,5"]]
arr2 = [["1","2","6"],["2","7"],["1","3"],["2","3","9","4","cat"]]
doit(arr2)
# => [["1", "2,6,3"], ["2", "7,3,9,4,cat"]]
Explanation
For arr1 above, we obtain:
enum = arr1.each_with_object({})
#=> #<Enumerator: [["1", "2"], ["1", "3"], ["2", "3"],
# ["2", "5"]]:each_with_object({})>
We can convert enum to an array see its elements:
enum.to_a
#=> [[["1", "2"], {}], [["1", "3"], {}],
# [["2", "3"], {}], [["2", "5"], {}]]
These elements will be passed into the block, and assigned to the block variables, by Enumerator#each, which will invoke Array#each. The first of these elements ([["1", "2"], {}]) can be obtained by invoking Enumerator#next on enum:
(i,*rest),h = enum.next
#=> [["1", "2"], {}]
i #=> "1"
rest #=> ["2"]
h #=> {}
We then execute:
(h[i] ||= []).concat(rest)
#=> (h["1"] ||= []).concat(["2"])
#=> (nil ||= []).concat(["2"])
#=> [].concat(["2"])
#=> ["2"]
each then passes the next element of enum to the block:
(i,*rest),h = enum.next
#=> [["1", "3"], {"1"=>["2"]}]
i #=> "1"
rest #=> ["3"]
h #=> {"1"=>["2"]}
(h[i] ||= []).concat(rest)
#=> (h["1"] ||= []).concat(["3"])
#=> (["2"] ||= []).concat(["3"])
#=> ["2"].concat(["3"])
#=> ["2", "3"]
After passing the last two elements of enum into the block, we obtain:
h=> {"1"=>["2", "3"], "2"=>["3", "5"]}
map creates an enumerator:
enum_h = h.each
#=> > #<Enumerator: {"1"=>["2", "3"]}:each>
and calls Enumerator#each (which calls Hash#each) to pass each element of enum_h into the block:
i, rest = enum_h.next
#=> ["1", ["2", "3"]]
then computes:
[i, rest.join(',')]
#=> ["1", ["2", "3"].join(',')]
#=> ["1", "2,3"]
The other element of enum_h is processed similarly.

Related

Ruby hash with multiple comma separated values to array of hashes with same keys

What is the most efficient and pretty way to map this:
{name:"cheese,test", uid:"1,2"}
to this:
[ {name:"cheese", uid:"1"}, {name:"test", uid:"2"} ]
should work dinamically for example with: { name:"cheese,test,third", uid:"1,2,3" } or {name:"cheese,test,third,fourth", uid:"1,2,3,4", age:"9,8,7,6" }
Finally I made this:
hash = {name:"cheese,test", uid:"1,2"}
results = []
length = hash.values.first.split(',').length
length.times do |i|
results << hash.map {|k,v| [k, v.split(',')[i]]}
end
results.map{|e| e.to_h}
It is working, but i am not pleased with it, has to be a cleaner and more 'rubyst' way to do this
def splithash(h)
# Transform each element in the Hash...
h.map do |k, v|
# ...by splitting the values on commas...
v.split(',').map do |vv|
# ...and turning these into individual { k => v } entries.
{ k => vv }
end
end.inject do |a,b|
# Then combine these by "zip" combining each list A to each list B...
a.zip(b)
# ...which will require a subsequent .flatten to eliminate nesting
# [ [ 1, 2 ], 3 ] -> [ 1, 2, 3 ]
end.map(&:flatten).map do |s|
# Then combine all of these { k => v } hashes into one containing
# all the keys with associated values.
s.inject(&:merge)
end
end
Which can be used like this:
splithash(name:"cheese,test", uid:"1,2", example:"a,b")
# => [{:name=>"cheese", :uid=>"1", :example=>"a"}, {:name=>"test", :uid=>"2", :example=>"b"}]
It looks a lot more convoluted at first glance, but this handles any number of keys.
I would likely use transpose and zip like so:
hash = {name:"cheese,test,third,fourth", uid:"1,2,3,4", age:"9,8,7,6" }
hash.values.map{|x| x.split(",")}.transpose.map{|v| hash.keys.zip(v).to_h}
#=> [{:name=>"cheese", :uid=>"1", :age=>"9"}, {:name=>"test", :uid=>"2", :age=>"8"}, {:name=>"third", :uid=>"3", :age=>"7"}, {:name=>"fourth", :uid=>"4", :age=>"6"}]
To break it down a bit (code slightly modified for operational clarity):
hash.values
#=> ["cheese,test,third,fourth", "1,2,3,4", "9,8,7,6"]
.map{|x| x.split(",")}
#=> [["cheese", "test", "third", "fourth"], ["1", "2", "3", "4"], ["9", "8", "7", "6"]]
.transpose
#=> [["cheese", "1", "9"], ["test", "2", "8"], ["third", "3", "7"], ["fourth", "4", "6"]]
.map do |v|
hash.keys #=> [[:name, :uid, :age], [:name, :uid, :age], [:name, :uid, :age], [:name, :uid, :age]]
.zip(v) #=> [[[:name, "cheese"], [:uid, "1"], [:age, "9"]], [[:name, "test"], [:uid, "2"], [:age, "8"]], [[:name, "third"], [:uid, "3"], [:age, "7"]], [[:name, "fourth"], [:uid, "4"], [:age, "6"]]]
.to_h #=> [{:name=>"cheese", :uid=>"1", :age=>"9"}, {:name=>"test", :uid=>"2", :age=>"8"}, {:name=>"third", :uid=>"3", :age=>"7"}, {:name=>"fourth", :uid=>"4", :age=>"6"}]
end
Input
hash={name:"cheese,test,third,fourth", uid:"1,2,3,4", age:"9,8,7,6" }
Code
p hash
.transform_values { |v| v.split(',') }
.map { |k, v_arr| v_arr.map { |v| [k, v] }
}
.transpose
.map { |array| array.to_h }
Output
[{:name=>"cheese", :uid=>"1", :age=>"9"}, {:name=>"test", :uid=>"2", :age=>"8"}, {:name=>"third", :uid=>"3", :age=>"7"}, {:name=>"fourth", :uid=>"4", :age=>"6"}]
We are given
h = { name: "cheese,test", uid: "1,2" }
Here are two ways to create the desired array. Neither construct arrays that are then converted to hashes.
#1
First compute
g = h.transform_values { |s| s.split(',') }
#=> {:name=>["cheese", "test"], :uid=>["1", "2"]}
then compute
g.first.last.size.times.map { |i| g.transform_values { |v| v[i] } }
#=> [{:name=>"cheese", :uid=>"1"}, {:name=>"test", :uid=>"2"}]
Note
a = g.first
#=> [:name, ["cheese", "test"]]
b = a.last
#=> ["cheese", "test"]
b.size
#=> 2
#2
This approach does not convert the values of the hash to arrays.
(h.first.last.count(',')+1).times.map do |i|
h.transform_values { |s| s[/(?:\w+,){#{i}}\K\w+/] }
end
#=> [{:name=>"cheese", :uid=>"1"}, {:name=>"test", :uid=>"2"}]
We have
a = h.first
#=> [:name, "cheese,test"]
s = a.last
#=> "cheese,test"
s.count(',')+1
#=> 2
We can express the regular expression in free-spacing mode to make it self-documenting.
/
(?: # begin a non-capture group
\w+, # match one or more word characters followed by a comma
) # end the non-capture group
{#{i}} # execute the preceding non-capture group i times
\K # discard all matches so far and reset the start of the match
\w+ # match one or more word characters
/x # invoke free-spacing regex definition mode

Looking to convert information from a file into a hash Ruby

Hello I have been doing some research for sometime on this particular project I have been working on and I am at a loss. What I am looking to do is use information from a file and convert that to a hash using some of those components for my key. Within the file I have:1,Foo,20,Smith,40,John,55
An example of what I am looking for I am looking for an output like so {1 =>[Foo,20], 2 =>[Smith,40] 3 => [John,55]}
Here is what I got.
h = {}
people_file = File.open("people.txt") # I am only looking to read here.
until people_file.eof?
i = products_file.gets.chomp.split(",")
end
people_file.close
FName = 'test'
str = "1,Foo,20,Smith, 40,John,55"
File.write(FName, str)
#=> 26
base, *arr = File.read(FName).
split(/\s*,\s*/)
enum = (base.to_i).step
arr.each_slice(2).
with_object({}) {|pair,h| h[enum.next]=pair}
#=> {1=>["Foo", "20"], 2=>["Smith", "40"],
# 3=>["John", "55"]}
The steps are as follows.
s = File.read(FName)
#=> "1,Foo,20,Smith, 40,John,55"
base, *arr = s.split(/\s*,\s*/)
#=> ["1", "Foo", "20", "Smith", "40", "John", "55"]
base
#=> "1"
arr
#=> ["Foo", "20", "Smith", "40", "John", "55"]
a = base.to_i
#=> 1
I assume the keys are to be sequential integers beginning with a #=> 1.
enum = a.step
#=> (1.step)
enum.next
#=> 1
enum.next
#=> 2
enum.next
#=> 3
Continuing,
enum = a.step
b = arr.each_slice(2)
#=> #<Enumerator: ["Foo", "20", "Smith", "40", "John", "55"]:each_slice(2)>
Note I needed to redefine enum (or execute enum.rewind) to reinitialize it. We can see the elements that will be generated by this enumerator by converting it to an array.
b.to_a
#=> [["Foo", "20"], ["Smith", "40"], ["John", "55"]]
Continuing,
c = b.with_object({})
#=> #<Enumerator: #<Enumerator: ["Foo", "20", "Smith", "40", "John", "55"]
# :each_slice(2)>:with_object({})>
c.to_a
#=> [[["Foo", "20"], {}], [["Smith", "40"], {}], [["John", "55"], {}]]
The now-empty hashes will be constructed as calculations progress.
c.each {|pair,h| h[enum.next]=pair}
#=> {1=>["Foo", "20"], 2=>["Smith", "40"], 3=>["John", "55"]}
To see how the last step is performed, each initially directs the enumerator c to generate the first value, which it passes to the block. The block variables are assigned to that value, and the block calculation is performed.
enum = a.step
b = arr.each_slice(2)
c = b.with_object({})
pair, h = c.next
#=> [["Foo", "20"], {}]
pair
#=> ["Foo", "20"]
h #=> {}
h[enum.next]=pair
#=> ["Foo", "20"]
Now,
h#=> {1=>["Foo", "20"]}
The calculations are similar for the remaining two elements generated by the enumerator c.
See IO::write, IO::read, Numeric#step, Enumerable#each_slice, Enumerator#with_object, Enumerator#next and Enumerator#rewind. write and read respond to File because File is a subclass of IO (File.superclass #=> IO). split's argument, the regular expression, /\s*,\s*/, causes the string to be split on commas together with any spaces that surround the commas. Converting [["Foo", "20"], {}] to pair and h is a product of Array Decompostion.

Multiple sub-hashes out of one hash

I have a hash:
hash = {"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3", "a_2_a" => "3",
"a_2_b" => "4", "a_2_c" => "4"}
What's the best way to get the following sub-hashes:
[{"a_1_a" => "1", "a_1_b" => "2", "a_1_c" => "3"},
{"a_2_a" => "3", "a_2_b" => "4", "a_2_c" => "4"}]
I want them grouped by the key, based on the regexp /^a_(\d+)/. I'll have 50+ key/value pairs in the original hash, so something dynamic would work best, if anyone has any suggestions.
If you're only concerned about the middle component you can use group_by to get you most of the way there:
hash.group_by do |k,v|
k.split('_')[1]
end.values.map do |list|
Hash[list]
end
# => [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"}, {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
The final step is extracting the grouped lists and combining those back into the required hashes.
Code
def partition_hash(hash)
hash.each_with_object({}) do |(k,v), h|
key = k[/(?<=_).+(?=_)/]
h[key] = (h[key] || {}).merge(k=>v)
end.values
end
Example
hash = {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}
partition_hash(hash)
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]
Explanation
The steps are as follows.
enum = hash.each_with_object({})
#=> #<Enumerator: {"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3", "a_2_a"=>"3",
# "a_2_b"=>"4", "a_2_c"=>"4"}:each_with_object({})>
The first element of this enumerator is generated and passed to the block, and the block variables are computed using parallel assignment.
(k,v), h = enum.next
#=> [["a_1_a", "1"], {}]
k #=> "a_1_a"
v #=> "1"
h #=> {}
and the block calculation is performed.
key = k[/(?<=_).+(?=_)/]
#=> "1"
h[key] = (h[key] || {}).merge(k=>v)
#=> h["1"] = (h["1"] || {}).merge("a_1_a"=>"1")
#=> h["1"] = (nil || {}).merge("a_1_a"=>"1")
#=> h["1"] = {}.merge("a_1_a"=>"1")
#=> h["1"] = {"a_1_a"=>"1"}
so now
h #=> {"1"=>{"a_1_a"=>"1"}}
The next value of enum is now generated and passed to the block, and the following calculations are performed.
(k,v), h = enum.next
#=> [["a_1_b", "2"], {"1"=>{"a_1_a"=>"1"}}]
k #=> "a_1_b"
v #=> "2"
h #=> {"1"=>{"a_1_a"=>"1"}}
key = k[/(?<=_).+(?=_)/]
#=> "1"
h[key] = (h[key] || {}).merge(k=>v)
#=> h["1"] = (h["1"] || {}).merge("a_1_b"=>"2")
#=> h["1"] = ({"a_1_a"=>"1"}} || {}).merge("a_1_b"=>"2")
#=> h["1"] = {"a_1_a"=>"1"}}.merge("a_1_b"=>"2")
#=> h["1"] = {"a_1_a"=>"1", "a_1_b"=>"2"}
After the remaining four elements of enum have been passed to the block the following has is returned.
h #=> {"1"=>{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# "2"=>{"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}}
The final step is simply to extract the values.
h.values
#=> [{"a_1_a"=>"1", "a_1_b"=>"2", "a_1_c"=>"3"},
# {"a_2_a"=>"3", "a_2_b"=>"4", "a_2_c"=>"4"}]

Ruby symbols to#proc with puts

I'm learning ruby for 2 weeks but i have a problem you probably now that
words = ["hello", "world", "test"]
words.map do |word|
word.upcase
end
is equal to
words = ["hello", "world", "test"]
words.map(&:upcase)
but why that
m = method(:puts)
words.map(&m)
work ??
Object#method gets a Method object from the given symbol. method(:puts) gets the Method object corresponding to Kernel#puts.
The unary & operator calls the to_proc under the hood:
method(:puts).respond_to?(:to_proc) # => true
to_proc in Method returns a Proc equivalent to { |x| method_name(x) }:
[1, 2, 3].map &method(:String) # => ["1", "2", "3"]
[1, 2, 3].map { |x| String(x) } # => ["1", "2", "3"]
You are using the Object#method to pull the method off Kernel and store it in a variable "m".
The & operator calls Method#to_proc. Stating all this explicitly:
irb(main):009:0> m = Kernel.method(:puts); ["hello", "world", "test"].map{|word| m.to_proc.call(word)}
hello
world
test
=> [nil, nil, nil]

Join common keys in hash one-liner

I have this array of pairs:
[{"a"=>"1"}, {"b"=>"2"}, {"a"=>"3"}, {"b"=>"4"}, {"a"=>"5"}]
I would like a method to merge the keys in common with multiple values to:
[{"a"=>["1","3","5"]}, {"b"=>["2","4"]}]
Improved following Marc-Andre's suggestion.
array = [{"a"=>"1"}, {"b"=>"2"}, {"a"=>"3"}, {"b"=>"4"}, {"a"=>"5"}]
array.group_by(&:keys).map{|k, v| {k.first => v.flat_map(&:values)}}
Or
array.group_by{|h| h.keys.first}.each_value{|a| a.map!{|h| h.values.first}}
Haven't tried it yet, but something like that should also works
a.each_with_object( Hash.new{ |h,k| h[k] = [] } ) do |x, hash|
hash[x.keys.first] << x.values.first
end
edit: For a one liner, and the same output :
[a.each_with_object( Hash.new{ |h,k| h[k] = [] } ) { |x, hash| hash[x.keys.first] << x.values.first }]
A solution to your problem:
array.map(&:first).group_by(&:first).map{|k, v| {k => v.map(&:last)}}
I'm curious as to why you start and end with hashes containing only one key-pair. Arrays would be better suited. E.g.:
other = [["a", "1"], ["b", "2"], ["a", "3"], ["b", "4"], ["a", "5"]]
r = other.group_by(&:first).map{|k, v| [k => v.map(&:last)]}
r # => [["a", ["1", "3", "5"]], ["b", ["2", "4"]]]
Hash[r] # => {"a"=>["1", "3", "5"], "b"=>["2", "4"]}
array = [{"a"=>"1"}, {"b"=>"2"}, {"a"=>"3"}, {"b"=>"4"}, {"a"=>"5"}]
{}.tap{ |r| array.each{ |h| h.each{ |k,v| (r[k]||=[]) << v } } }
Not sure if you're giving out awards for brevity, but I like this way. The merge function with a block is ideally suited for this:
new = {}
array.each {|p| new.merge!(p) {|k,l,r| [l,r].flatten }}

Resources