Multiclass partition function for ruby array/list - ruby

I have just gotten deeper into the functional aspects of Ruby and am fiddling with map/reduce and some filtering.
I have now gotten to a point where I have a list of items of the following structure:
{:price=>100.0, :size=>'small', :description=>'some description'}
The value for :size may be one of ['small', 'medium', 'large'].
Is there a way to partition the whole list into sublists with only those elements which are of size small, medium and large without setting up a filter function for each of these values?
Basically I am asking whether there is some multiclass Array#partition.
Thanks for any help!

I believe you are looking for Enumerable#group_by:
list = [
{:price=>100.0, :size=>'small', :description=>'some description'},
{:price=>123.0, :size=>'small', :description=>'some description 2'},
{:price=>456.0, :size=>'medium', :description=>'some description 3'}
]
list.group_by {|item| item[:size]}
# => {
# "small" => [
# {:price=>100.0, :size=>"small", :description=>"some description"},
# {:price=>123.0, :size=>"small", :description=>"some description 2"}
# ],
# "medium" => [
# {:price=>456.0, :size=>"medium", :description=>"some description 3"}
# ]
# }
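Note that group_by only creates keys for values it actually sees, so 'large' is missing from the result above. If every size should get a sublist, even an empty one, the gaps can be filled in afterwards. A minimal sketch, assuming the fixed list of sizes from the question; the names SIZES and by_size are illustrative only:

```ruby
# Assumed: the three allowed sizes from the question.
SIZES = %w[small medium large].freeze

list = [
  { price: 100.0, size: 'small',  description: 'some description' },
  { price: 123.0, size: 'small',  description: 'some description 2' },
  { price: 456.0, size: 'medium', description: 'some description 3' }
]

grouped = list.group_by { |item| item[:size] }

# Guarantee one (possibly empty) sublist per known size.
by_size = SIZES.map { |size| [size, grouped.fetch(size, [])] }.to_h

small, medium, large = by_size.values_at(*SIZES)
```

Here small holds two items, medium one, and large is an empty array.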

input = [
{:price=>100.0, :size=>'small', :description=>'some description 1'},
{:price=>100.0, :size=>'large', :description=>'some description 2'},
{:price=>100.0, :size=>'small', :description=>'some description 3'},
{:price=>100.0, :size=>'large', :description=>'some description 4'},
{:price=>100.0, :size=>'small', :description=>'some description 5'},
{:price=>100.0, :size=>'small', :description=>'some description 6'}
]
input.group_by { |e| e[:size] }
If you think the result should not contain the size in the hashes, use Hash#delete to mutate the elements:
input.group_by { |e| e.delete :size }
#⇒ {
# "large" => [
# [0] {
# :description => "some description 2",
# :price => 100.0
# },
# [1] {
# :description => "some description 4",
# :price => 100.0
# }
# ],
# "small" => [
# [0] {
# :description => "some description 1",
# :price => 100.0
# },
# [1] {
# :description => "some description 3",
# :price => 100.0
# },
# [2] {
# :description => "some description 5",
# :price => 100.0
# },
# [3] {
# :description => "some description 6",
# :price => 100.0
# }
# ]
# }
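Keep in mind that e.delete changes the hashes inside input itself. If the originals should stay intact, a non-mutating variant is possible. This is a sketch rather than part of the answer above, and it assumes Ruby 2.4 or later for Hash#transform_values:

```ruby
input = [
  { price: 100.0, size: 'small', description: 'some description 1' },
  { price: 100.0, size: 'large', description: 'some description 2' },
  { price: 100.0, size: 'small', description: 'some description 3' }
]

# Group first, then build copies of each hash without the :size key.
grouped = input
  .group_by { |e| e[:size] }
  .transform_values { |items| items.map { |e| e.reject { |key, _| key == :size } } }

grouped['small'].first   # contains only :price and :description
input.first.key?(:size)  # still true, the source hashes are untouched
```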

Related

group_by multiple times in ruby

I have an array of hashes called events:
events = [
{:name => "Event 1", :date => "2019-02-21 08:00:00", :area => "South", :micro_area => "A"},
{:name => "Event 2", :date => "2019-02-21 08:00:00", :area => "South", :micro_area => "A"},
{:name => "Event 3", :date => "2019-02-21 08:00:00", :area => "South", :micro_area => "B"},
{:name => "Event 4", :date => "2019-02-21 08:00:00", :area => "South", :micro_area => "B"},
{:name => "Event 5", :date => "2019-02-21 08:00:00", :area => "North", :micro_area => "A"},
{:name => "Event 6", :date => "2019-02-21 08:00:00", :area => "North", :micro_area => "A"},
{:name => "Event 7", :date => "2019-02-21 08:00:00", :area => "North", :micro_area => "B"},
{:name => "Event 8", :date => "2019-02-21 08:00:00", :area => "North", :micro_area => "B"}
]
I want to know how to group_by first date, then area then micro_area to end up with a single array of hashes for example:
[
{
"2019-02-21 08:00:00": {
"South": {
"A": [
{:name=>"Event 1", :date=>"2019-02-21 08:00:00", :area=>"South", :micro_area=>"A" },
{:name=>"Event 2", :date=>"2019-02-21 08:00:00", :area=>"South", :micro_area=>"A" }
],
"B": [
{:name=>"Event 3", :date=>"2019-02-21 08:00:00", :area=>"South", :micro_area=>"B" },
{:name=>"Event 4", :date=>"2019-02-21 08:00:00", :area=>"South", :micro_area=>"B" }
]
},
"North": {
"A": [
{:name=>"Event 5", :date=>"2019-02-21 08:00:00", :area=>"North", :micro_area=>"A" },
{:name=>"Event 6", :date=>"2019-02-21 08:00:00", :area=>"North", :micro_area=>"A" }
],
"B": [
{:name=>"Event 7", :date=>"2019-02-21 08:00:00", :area=>"North", :micro_area=>"B" },
{:name=>"Event 8", :date=>"2019-02-21 08:00:00", :area=>"North", :micro_area=>"B" }
]
}
}
}
]
Trying events.group_by { |r| [r[:date], r[:area], r[:micro_area]] } doesn't seem to work the way I want it to.
I think the following will work for you:
events = [
{ name: 'Event 1', date: '2019-02-21 08:00:00', area: 'South', micro_area: 'A' }
]
events.group_by { |x| x[:date] }.transform_values do |v1|
  v1.group_by { |y| y[:area] }.transform_values do |v2|
    v2.group_by { |z| z[:micro_area] }
  end
end
# {
# "2019-02-21 08:00:00"=>{
# "South"=>{
# "A"=>[
# {:name=>"Event 1", :date=>"2019-02-21 08:00:00", :area=>"South", :micro_area=>"A"}
# ]
# }
# }
# }
Another option is to build the nested structure as you traverse the array:
events.each_with_object({}) do |event, result|
  d, a, m = event.values_at(:date, :area, :micro_area)
  result[d] ||= {}
  result[d][a] ||= {}
  result[d][a][m] ||= []
  result[d][a][m] << event
end
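The same idea extends to any number of grouping keys. The helper below is a hypothetical generalization (nest_by is not from the answer): it walks all keys but the last with reduce, creating intermediate hashes on demand, and collects the records in an array under the last key.

```ruby
# Hypothetical helper: nest records under the values of the given keys.
def nest_by(records, keys)
  *path, leaf = keys
  records.each_with_object({}) do |record, result|
    # Descend through the intermediate levels, creating hashes on demand.
    node = path.reduce(result) { |hash, key| hash[record[key]] ||= {} }
    (node[record[leaf]] ||= []) << record
  end
end

events = [
  { name: 'Event 1', date: '2019-02-21 08:00:00', area: 'South', micro_area: 'A' },
  { name: 'Event 3', date: '2019-02-21 08:00:00', area: 'South', micro_area: 'B' },
  { name: 'Event 5', date: '2019-02-21 08:00:00', area: 'North', micro_area: 'A' }
]

nested = nest_by(events, [:date, :area, :micro_area])
nested['2019-02-21 08:00:00']['South']['B'].map { |e| e[:name] } # => ["Event 3"]
```

Unlike a version that strips the grouping keys, this one keeps each event hash unchanged, which matches the output requested in the question.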
Another option is grouping them like you did in the question. Then build the nested structure from the array used as key.
# build an endless nested structure
nested = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }
# group by the different criteria and place them in the above nested structure
events.group_by { |event| event.values_at(:date, :area, :micro_area) }
.each { |(*path, last), events| nested.dig(*path)[last] = events }
# optional - reset all default procs
reset_default_proc = ->(hash) { hash.each_value(&reset_default_proc).default = nil if hash.is_a?(Hash) }
reset_default_proc.call(nested)
The above leaves the answer in the nested variable.
References:
Hash::new to create the nested hash.
Hash#default_proc to get the default proc of a hash.
Hash#default= to reset the hash default back to nil.
Hash#dig to traverse the nested structure until the last node.
Hash#[]= to set the last node equal to the grouped events.
Array decomposition and array to argument conversion to capture all but the last node into path and call #dig with the contents of path as arguments.
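To see why the nested hash works, it may help to try the self-referential default proc on its own. A small sketch with made-up keys:

```ruby
# Each missing key creates a child hash that shares the same default proc,
# so lookups can go arbitrarily deep.
nested = Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }

nested[:a][:b][:c] = 1
nested.dig(:x, :y)[:z] = 2 # dig goes through the default proc as well

# Assigning default = nil also clears the default proc of that hash,
# so missing keys no longer create entries at the top level.
nested.default = nil
nested[:missing]      # => nil
nested.key?(:missing) # => false
```

Only the hash that receives default = nil is reset, which is why the answer applies the reset recursively to every level.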
Here is a recursive solution that will handle arbitrary levels of nesting and arbitrary grouping objects.
def hashify(events, grouping_keys)
  return events if grouping_keys.empty?
  first_key, *remaining_keys = grouping_keys
  events.group_by { |h| h[first_key] }.
    transform_values { |a|
      hashify(a.map { |h|
        h.reject { |k,_| k == first_key } },
        remaining_keys) }
end
Before executing this with the sample data from the question, let's add a hash with a different date to events.
events <<
{ :name=>"Event 8", :date=>"2018-12-31 08:00:00",
:area=>"North", :micro_area=>"B" }
grouping_keys = [:date, :area, :micro_area]
hashify(events, grouping_keys)
#=> {"2019-02-21 08:00:00"=>{
# "South"=>{
# "A"=>[{:name=>"Event 1"}, {:name=>"Event 2"}],
# "B"=>[{:name=>"Event 3"}, {:name=>"Event 4"}]
# },
# "North"=>{
# "A"=>[{:name=>"Event 5"}, {:name=>"Event 6"}],
# "B"=>[{:name=>"Event 7"}, {:name=>"Event 8"}]
# }
# },
# "2018-12-31 08:00:00"=>{
# "North"=>{
# "B"=>[{:name=>"Event 8"}]
# }
# }
# }
hashify(events, [:date, :area])
#=> {"2019-02-21 08:00:00"=>{
# "South"=>[
# {:name=>"Event 1", :micro_area=>"A"},
# {:name=>"Event 2", :micro_area=>"A"},
# {:name=>"Event 3", :micro_area=>"B"},
# {:name=>"Event 4", :micro_area=>"B"}
# ],
# "North"=>[
# {:name=>"Event 5", :micro_area=>"A"},
# {:name=>"Event 6", :micro_area=>"A"},
# {:name=>"Event 7", :micro_area=>"B"},
# {:name=>"Event 8", :micro_area=>"B"}
# ]
# },
# "2018-12-31 08:00:00"=>{
# "North"=>[
# {:name=>"Event 8", :micro_area=>"B"}
# ]
# }
# }
See Enumerable#group_by, Hash#transform_values and Hash#reject.

Search through Nested Hash having groups using Hashie Gem

I have a nested hash of a PDF that is in this format:
[ { :page => 1,
:lines => [
{ :y => 774.0,
:text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
},
# ...
]
},
{ :page => 2,
:lines => [
{ :y => 774.0,
:text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
},
# ...
],
# ...
}
]
I want to get the :x and :y for a given :text from all 4 pages.
I tried this:
require 'hashie'
coordinates.extend(Hashie::Extensions::DeepLocate)
@hash_array = Hash.new
@hash_array = coordinates.deep_locate -> (key, value, object) { key == :text && value == "XXXX" }
This is giving me:
[ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" },
{ :x => 18.0, :width => 421.59599999999995, :text => "XXXX" },
{ :x => 18.0, :width => 421.59599999999995, :text => "XXXX" },
{ :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
But I need :x and :y to be displayed like this:
x = " " and y = " "
I will use these values for my further validation.
I don't know if you'll accept a solution that doesn't use Hashie, but here's how I'd do it:
data = [
{ :page => 1,
:lines => [
{ :y => 774.0,
:text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
},
# ...
]
},
{ :page => 2,
:lines => [
{ :y => 774.0,
:text_groups => [ { :x => 18.0, :width => 421.59599999999995, :text => "XXXX" } ]
},
# ...
],
# ...
}
]
SEARCH_TEXT = "XXXX"
coords = data.each_with_object([]) do |page, res|
  page[:lines].each do |line|
    line[:text_groups].each do |group|
      next unless group[:text] == SEARCH_TEXT
      res << { x: group[:x], y: line[:y] }
    end
  end
end
p coords
# => [ { :x => 18.0, :y => 774.0 },
# { :x => 18.0, :y => 774.0 } ]
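Since the question asks for the values in the form x = ... and y = ..., the collected pairs can be formatted afterwards. The following is a sketch with a trimmed-down, made-up data set; the helper name find_coords and the extra :page key in the result are additions for illustration:

```ruby
data = [
  { page: 1, lines: [{ y: 774.0, text_groups: [{ x: 18.0, width: 421.6, text: 'XXXX' }] }] },
  { page: 2, lines: [{ y: 700.0, text_groups: [{ x: 20.0, width: 100.0, text: 'YYYY' },
                                                { x: 36.0, width: 50.0,  text: 'XXXX' }] }] }
]

# Hypothetical helper: collect page, x and y for every matching text group.
def find_coords(pages, text)
  pages.flat_map do |page|
    page[:lines].flat_map do |line|
      line[:text_groups]
        .select { |group| group[:text] == text }
        .map { |group| { page: page[:page], x: group[:x], y: line[:y] } }
    end
  end
end

coords = find_coords(data, 'XXXX')
coords.each { |c| puts "page #{c[:page]}: x = #{c[:x]} and y = #{c[:y]}" }
# prints:
# page 1: x = 18.0 and y = 774.0
# page 2: x = 36.0 and y = 700.0
```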

Nested hash iteration: How to iterate a merge over an ( (array of hashes) within a hash )

I'm trying to do as the title says. Here is my code:
school.each { |x| school[:students][x].merge!(semester:"Summer") }
I think I pinpointed the problem to the "[x]" above. If I substitute an array position such as "[2]" it works fine. How can I make the iteration work?
If the info above is not enough or you'd like to offer a better solution, please see the details below. Thanks!
The error message I get:
file.rb:31:in `[]': no implicit conversion of Array into Integer (TypeError)
from file.rb:31:in `block in <main>'
from file.rb:31:in `each'
from file.rb:31:in `<main>'
The nested hash below before alteration:
school = {
:name => "Happy Funtime School",
:location => "NYC",
:instructors => [
{:name=>"Blake", :subject=>"being awesome" },
{:name=>"Ashley", :subject=>"being better than blake"},
{:name=>"Jeff", :subject=>"karaoke"}
],
:students => [
{:name => "Marissa", :grade => "B"},
{:name=>"Billy", :grade => "F"},
{:name => "Frank", :grade => "A"},
{:name => "Sophie", :grade => "C"}
]
}
I'm trying to append :semester=>"Summer" to each of the last four hashes. Here is what I'm trying to go for:
# ...preceding code is the same. Changed code below...
:students => [
{:name => "Marissa", :grade => "B", :semester => "Summer"},
{:name=>"Billy", :grade => "F", :semester => "Summer"},
{:name => "Frank", :grade => "A", :semester => "Summer"},
{:name => "Sophie", :grade => "C", :semester => "Summer"}
]
}
Just iterate over the students:
school[:students].each { |student| student[:semester] = "Summer" }
Or, using merge:
school[:students].each { |student| student.merge!(semester: "Summer") }
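Both versions modify the student hashes in place. If the original school hash must stay unchanged, a copy can be built instead. A sketch with a shortened school hash, assuming a shallow copy of the outer hash is sufficient:

```ruby
school = {
  name: 'Happy Funtime School',
  students: [
    { name: 'Marissa', grade: 'B' },
    { name: 'Billy',   grade: 'F' }
  ]
}

# Hash#merge returns a new hash; map + merge builds new student hashes.
updated = school.merge(
  students: school[:students].map { |student| student.merge(semester: 'Summer') }
)

updated[:students].first[:semester]     # => "Summer"
school[:students].first.key?(:semester) # => false
```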
The issue is that school is a Hash, and Hash#each yields each [key, value] pair to the block, so x is never an index.
For example, in the first iteration of the loop,
x = [:name, "Happy Funtime School"]
So what you are really doing is trying to reference:
school[:students][[:name, "Happy Funtime School"]]
which cannot work, because an Array index must be an Integer (hence the TypeError).
What you could do instead is create a for loop to track the index.
for i in 0 ... school[:students].count
school[:students][i].merge!(semester:"Summer")
end
Edit: Stefan's solution is much better than mine, but I will leave this up to show where you went wrong.
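The TypeError can be reproduced in isolation, which shows what the block variable actually holds. A sketch with a shortened school hash:

```ruby
school = {
  name: 'Happy Funtime School',
  students: [{ name: 'Marissa', grade: 'B' }, { name: 'Billy', grade: 'F' }]
}

# Hash#each yields [key, value] pairs, not array indexes.
first_pair = school.each { |x| break x }
first_pair # => [:name, "Happy Funtime School"]

# Using such a pair as an Array index raises the TypeError from the question.
error = begin
  school[:students][first_pair]
rescue TypeError => e
  e
end

# each_index supplies the positions if an index is really needed.
school[:students].each_index do |i|
  school[:students][i][:semester] = 'Summer'
end
```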
I would do it as below, using Hash#store:
require 'awesome_print'
school = {
:name => "Happy Funtime School",
:location => "NYC",
:instructors => [
{
:name => "Blake",
:subject => "being awesome"
},
{
:name => "Ashley",
:subject => "being better than blake"
},
{
:name => "Jeff",
:subject => "karaoke"
}
],
:students => [
{
:name => "Marissa",
:grade => "B"
},
{
:name => "Billy",
:grade => "F"
},
{
:name => "Frank",
:grade => "A"
},
{
:name => "Sophie",
:grade => "C"
}
]
}
school[:students].each{|h| h.store(:semester ,"Summer")}
ap school,:index => false,:indent => 10
Output:
{
:name => "Happy Funtime School",
:location => "NYC",
:instructors => [
{
:name => "Blake",
:subject => "being awesome"
},
{
:name => "Ashley",
:subject => "being better than blake"
},
{
:name => "Jeff",
:subject => "karaoke"
}
],
:students => [
{
:name => "Marissa",
:grade => "B",
:semester => "Summer"
},
{
:name => "Billy",
:grade => "F",
:semester => "Summer"
},
{
:name => "Frank",
:grade => "A",
:semester => "Summer"
},
{
:name => "Sophie",
:grade => "C",
:semester => "Summer"
}
]
}

Getting the hash key whose value array contains a given string

I have the following code:
country_code = infer_country # will grab a user's two character country code
region = 'us' # united states by default
region_map = {
"au" => ["au"], # australia
"al" => ["al", "ba", "bg", "hr", "md", "me", "mk", "ro", "si"], # bulgaria and the balkans
"cn" => ["cn"], # china
"ee" => ["ee", "lt", "lv"], # estonia and the baltics
"fi" => ["fi"], # finland
"at" => ["at", "ch", "de"], # germany, austria, switzerland
"cy" => ["cy", "gr", "mt"], # greece, cyprus, malta
"hk" => ["hk"], # hong_kong
"id" => ["id"], # indonesia
"it" => ["it"], # italy
"jp" => ["jp"], # japan
"kp" => ["kp", "kr"], # korea
"ar" => ["ar", "bl", "bo", "br", "bz", "cl", "co", "cr", "cu", "do", "ec", "gf", "gp", "gt", "hn",
"ht", "mf", "mq", "mx", "ni", "pa", "pe", "pr", "py", "sv", "uy", "ve"], # latin america including brazil
"my" => ["my"], # malaysia
"af" => ["af", "eg", "iq", "ir", "sa", "ye", "sy", "il", "jo", "ps", "lb", "om", "kw", "qa", "bh"], # middle east
"nl" => ["nl"], # netherlands
"no" => ["no"], # norway
"pl" => ["pl"], # poland
"pt" => ["pt"], # portugal
"ph" => ["ph"], # philippines
"ru" => ["ru"], # russia
"rs" => ["rs"], # serbia
"sg" => ["sg"], # singapore
"za" => ["za"], # south africa
"bn" => ["bn", "bu", "kh", "la", "tl", "vn"], # south east asia
"es" => ["es"], # spain
"tw" => ["tw"], # taiwan
"th" => ["th"], # thailand
"tr" => ["tr"], # turkey
"gb" => ["gb" ] # united kingdom
}.invert
# version 0.0
region_map.each do |key, value|
if key.include? country_code
region = value
break
end
end
puts region
If country_code is "gb", then "gb" should be printed out. If country_code is in south east asia, say it's "vn", then "bn" should be printed out.
How can I elegantly solve this problem? I can restructure my hash if necessary.
def find_region(country_code)
pair = @region_map.find{|k, v| v.include?(country_code)}
pair && pair.first
end
find_region('gb') # => "gb"
find_region('bz') # => "ar"
find_region('lv') # => "ee"
find_region('ls') # => nil
def find_region(country_code)
@region_map.each {|k,v| return k if v.include? country_code}
nil
end
region_map = Hash.new("us").merge(
"au" => "au",
"al" => "al",
"ba" => "al",
"bg" => "al",
"hr" => "al",
"md" => "al",
"me" => "al",
"mk" => "al",
"ro" => "al",
"si" => "al",
"al" => "al",
"cn" => "cn",
...
)
region_map["non existing code"] # => "us"
region_map[nil] # => "us"
region_map["au"] # => "au"
region_map["ba"] # => "al"
region_map["cn"] # => "cn"
or
def region_map code
case code
when "au"
"au"
when "al", "ba", "bg", "hr", "md", "me", "mk", "ro", "si"
"al"
when "cn"
"cn"
when "ee", "lt", "lv"
"ee"
...
else
"us"
end
end
region_map("non existing code") # => "us"
region_map(nil) # => "us"
region_map("au") # => "au"
region_map("ba") # => "al"
region_map("cn") # => "cn"
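Rather than typing the flattened table by hand, it can be derived from a grouped hash like the one in the question (without the .invert). A sketch with only three regions filled in; the constant names are illustrative:

```ruby
# Region code => country codes belonging to it (shortened for the example).
REGION_COUNTRIES = {
  'ee' => %w[ee lt lv],
  'at' => %w[at ch de],
  'gb' => %w[gb]
}.freeze

# Country code => region code, falling back to "us" for unknown codes.
COUNTRY_REGION = REGION_COUNTRIES.each_with_object(Hash.new('us')) do |(region, countries), lookup|
  countries.each { |country| lookup[country] = region }
end

COUNTRY_REGION['lv'] # => "ee"
COUNTRY_REGION['de'] # => "at"
COUNTRY_REGION['xx'] # => "us"
```

The lookup is built once and each query is then a single hash access, while the grouped form stays the easier one to maintain.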

Ruby: Transform a flat array in a tree representation

I am trying to write a function to convert a flat array with path information into a tree representation of that array.
The goal would be to turn an array like the following:
[
{ :name => "a", :path => [ 'a' ] },
{ :name => "b", :path => [ 'a', 'b' ] },
{ :name => "c", :path => [ 'a', 'b', 'c' ] },
{ :name => "d", :path => [ 'a', 'd' ] },
{ :name => "e", :path => [ 'e' ] }
]
into one like this:
[{:node=>{:name=>"a", :path=>["a"]},
:children=>
[{:node=>{:name=>"b", :path=>["a", "b"]},
:children=>
[{:node=>{:name=>"c", :path=>["a", "b", "c"]}, :children=>[]}]},
{:node=>{:name=>"d", :path=>["a", "d"]}, :children=>[]}]},
{:node=>{:name=>"e", :path=>["e"]}, :children=>[]}]
The closest I got was with the following code:
class Tree
  def initialize
    @root = { :node => nil, :children => [ ] }
  end
  def from_array( array )
    array.inject(self) { |tree, node| tree.add(node) }
    @root[:children]
  end
  def add(node)
    recursive_add(@root, node[:path].dup, node)
    self
  end
  private
  def recursive_add(parent, path, node)
    if(path.empty?)
      parent[:node] = node
      return
    end
    current_path = path.shift
    children_nodes = parent[:children].find { |child| child[:node][:path].last == current_path }
    unless children_nodes
      children_nodes = { :node => nil, :children => [ ] }
      parent[:children].push children_nodes
    end
    recursive_add(children_nodes, path, node)
  end
end
flat = [
{ :name => "a", :path => [ 'a' ] },
{ :name => "b", :path => [ 'a', 'b' ] },
{ :name => "c", :path => [ 'a', 'b', 'c' ] },
{ :name => "d", :path => [ 'a', 'd' ] },
{ :name => "e", :path => [ 'e' ] }
]
require 'pp'
pp Tree.new.from_array( flat )
But it is quite verbose and I have the feeling that it might not be very efficient for very large sets.
What would be the cleanest and most efficient way to achieve that in Ruby?
This is my try.
array = [
{ :name => "a", :path => [ 'a' ] },
{ :name => "b", :path => [ 'a', 'b' ] },
{ :name => "c", :path => [ 'a', 'b', 'c' ] },
{ :name => "d", :path => [ 'a', 'd' ] },
{ :name => "e", :path => [ 'e' ] }
]
array
  .sort_by{|h| -h[:path].length}
  .map{|h| {node: h, children: []}}
  .tap{|array|
    while array.first[:node][:path].length > 1
      child = array.shift
      array
        .find{|h| h[:node][:name] == child[:node][:path][-2]}[:children]
        .push(child)
    end
  }
# => [
#   {:node=>{:name=>"e", :path=>["e"]}, :children=>[]},
#   {:node=>{:name=>"a", :path=>["a"]}, :children=>[
#     {:node=>{:name=>"d", :path=>["a", "d"]}, :children=>[]},
#     {:node=>{:name=>"b", :path=>["a", "b"]}, :children=>[
#       {:node=>{:name=>"c", :path=>["a", "b", "c"]}, :children=>[]}
#     ]}
#   ]}
# ]
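For large inputs, the repeated find calls can be avoided by indexing nodes by their full path, so each parent lookup is a single hash access. This is a sketch of that alternative, not one of the approaches above; build_tree is a hypothetical name, and it assumes that paths are unique and that every non-root item's parent path also appears in the input (items without a known parent end up as roots):

```ruby
# Hypothetical helper: build the tree with one hash lookup per item.
def build_tree(flat)
  index = {} # path (Array) => node, for constant-time parent lookup
  roots = []
  # Process parents before children; the index keeps input order among equals.
  ordered = flat.each_with_index.sort_by { |item, i| [item[:path].length, i] }
  ordered.each do |item, _i|
    node = { node: item, children: [] }
    index[item[:path]] = node
    parent = index[item[:path][0...-1]]
    (parent ? parent[:children] : roots) << node
  end
  roots
end

flat = [
  { name: 'a', path: ['a'] },
  { name: 'b', path: ['a', 'b'] },
  { name: 'c', path: ['a', 'b', 'c'] },
  { name: 'd', path: ['a', 'd'] },
  { name: 'e', path: ['e'] }
]

tree = build_tree(flat)
tree.map { |n| n[:node][:name] } # => ["a", "e"]
```

Keying the index on the whole path rather than on the name also keeps two nodes with the same name in different branches apart.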
