Transforming hash values based on condition after using group_by - ruby

I have an array of pipes which have the following attributes: pipe_id, grade and grade_confidence.
I am looking to find objects within the array that share the same grade as the other objects with the same ID. I have been using group_by and transform_values to find the IDs that have only one grade, and that works fine (thanks to the answers in Using group_by for only certain attributes). However, I would still like to keep the grade_confidence in the final result if possible.
class Pipes
attr_accessor :pipe_id, :grade, :grade_confidence
def initialize(pipe_id, grade, grade_confidence)
@pipe_id = pipe_id
@grade = grade
@grade_confidence = grade_confidence
end
end
pipe1 = Pipes.new("10001", "60", "A")
pipe2 = Pipes.new("10001", "60", "A")
pipe3 = Pipes.new("10001", "3", "F")
pipe4 = Pipes.new("1005", "15", "A")
pipe5 = Pipes.new("1004", "40", "A")
pipe6 = Pipes.new("1004", "40", "B")
pipes = []
pipes.push(pipe1, pipe2, pipe3, pipe4, pipe5, pipe6)
# We now have our array of pipe objects.
non_dups = pipes.group_by(&:pipe_id).transform_values { |a| a.map(&:grade).uniq }.select { |k,v| v.size == 1 }
puts non_dups
# => {"1005"=>["15"], "1004"=>["40"]}
Desired
The above does what I want - as "10001" has two different grades, it is ignored, and "1004" and "1005" have the same grades per ID. But what I would like is to keep the grade_confidence too, or include grade_confidence based on a condition also.
E.g. If grade_confidence is == "B" the final result would be # => {"1004"=>["40", "B"]}
or
If grade_confidence is == "A" the final result would be # => {"1005"=>["15", "A"], "1004"=>["40", "A"]}
Is it possible to tweak the transform_values to allow this or would I need to go another route?
Thanks

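You can carry the confidences along with the grades, filter on the grades, and then pick which confidence to keep (here the alphabetically first one):
non_dups = pipes
.group_by(&:pipe_id)
.transform_values { |a| [a.map(&:grade).uniq, a.map(&:grade_confidence)] }
.select { |_, (grades, _confidences)| grades.size == 1 }
.transform_values { |grades, confidences| [grades.first, confidences.sort.first] }
# => {"1005"=>["15", "A"], "1004"=>["40", "A"]}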
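The question also asks for the result to depend on a particular grade_confidence. A minimal sketch of one way to do that follows. It assumes that "include based on a condition" means "keep an ID only if at least one of its pipes has the requested confidence"; the Pipe struct and the helper name single_grade_with_confidence are stand-ins invented for this example, not part of the original code.

```ruby
# Pipe is a stand-in for the question's Pipes class (assumption for the sketch).
Pipe = Struct.new(:pipe_id, :grade, :grade_confidence)

pipes = [
  Pipe.new("10001", "60", "A"),
  Pipe.new("10001", "60", "A"),
  Pipe.new("10001", "3",  "F"),
  Pipe.new("1005",  "15", "A"),
  Pipe.new("1004",  "40", "A"),
  Pipe.new("1004",  "40", "B")
]

# Hypothetical helper: IDs with exactly one distinct grade, restricted to
# groups in which at least one pipe has the requested confidence.
def single_grade_with_confidence(pipes, confidence)
  pipes
    .group_by(&:pipe_id)
    .select { |_, group| group.map(&:grade).uniq.size == 1 }
    .select { |_, group| group.any? { |pipe| pipe.grade_confidence == confidence } }
    .transform_values { |group| [group.first.grade, confidence] }
end

with_b = single_grade_with_confidence(pipes, "B")
# => {"1004"=>["40", "B"]}
with_a = single_grade_with_confidence(pipes, "A")
# => {"1005"=>["15", "A"], "1004"=>["40", "A"]}
```

Keeping the whole group until the last step means the grade check and the confidence check stay independent, so either condition can be changed without touching the other.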

Related

How to solve the "retrieve_values" problem

I'm working on this problem:
Write a method retrieve_values that takes in two hashes and a key. The method should return an array containing the values from the two hashes that correspond with the given key.
def retrieve_values(hash1, hash2, key)
end
dog1 = {"name"=>"Fido", "color"=>"brown"}
dog2 = {"name"=>"Spot", "color"=> "white"}
print retrieve_values(dog1, dog2, "name") #=> ["Fido", "Spot"]
puts
print retrieve_values(dog1, dog2, "color") #=> ["brown", "white"]
puts
I came up with a working solution:
def retrieve_values(hash1, hash2, key)
arr = []
hash1.each { |key| } && hash2.each { |key| }
if key == "name"
arr << hash1["name"] && arr << hash2["name"]
elsif key == "color"
arr << hash1["color"] && arr << hash2["color"]
end
return arr
end
I then looked at the 'official' solution:
def retrieve_values(hash1, hash2, key)
val1 = hash1[key]
val2 = hash2[key]
return [val1, val2]
end
What is wrong with my code? Or is it an acceptable "different" approach?
The line hash1.each { |key| } && hash2.each { |key| } does nothing and is not needed, even in your solution.
The part arr << hash1["name"] && arr << hash2["name"] is a bit difficult to read. It mutates the array twice in one line, and this kind of style can lead to bugs.
Also, your code sticks only to two keys name and color:
dog1 = {"name"=>"Fido", "color"=>"brown", "age" => 1}
dog2 = {"name"=>"Spot", "color"=> "white", "age" => 2}
> retrieve_values(dog1, dog2, "age")
=> []
The official solution will return [1, 2].
You don't need to use the return keyword explicitly here: a method returns the value of its last evaluated expression. But that is a matter of style.
It is possible to simplify even the official solution:
def retrieve_values(hash1, hash2, key)
[hash1[key], hash2[key]]
end
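Because the lookup never depends on which key is asked for, the same idea extends to any number of hashes. The following is only a sketch; the method name retrieve_values_from and its argument order (key first, then the hashes) are assumptions made for this example and are not part of the exercise.

```ruby
# Hypothetical generalisation: the key comes first so that any number of
# hashes can follow it.
def retrieve_values_from(key, *hashes)
  hashes.map { |hash| hash[key] }
end

dog1 = { "name" => "Fido", "color" => "brown", "age" => 1 }
dog2 = { "name" => "Spot", "color" => "white", "age" => 2 }
dog3 = { "name" => "Rex" }

names = retrieve_values_from("name", dog1, dog2)       # => ["Fido", "Spot"]
ages  = retrieve_values_from("age", dog1, dog2, dog3)  # => [1, 2, nil]
```

A missing key yields nil, exactly as hash[key] does in the official solution.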

Selecting data from structured data, but the elegant functional way

Apparently, my ability to think functionally has withered over time. I have trouble selecting a sub-dataset from a dataset. I can solve the problem in a hacky imperative style, but I believe there is a sweet functional solution, which I am unfortunately unable to find.
Consider this data structure (tried to not simplify it beyond usability):
class C
attr_reader :attrC
def initialize(base)
@attrC = { "c1" => base+10 , "c2" => base+20, "c3" => base+30}
end
end
class B
attr_reader :attrB
@@counter = 0
def initialize
@attrB = Hash.new
@attrB["b#{@@counter}"] = C.new(@@counter)
@@counter += 1
end
end
class A
attr_reader :attrA
def initialize
@attrA = { "a1" => B.new, "a2" => B.new, "a3" => B.new}
end
end
which is created as a = A.new. The complete data set then would be
#<A: @attrA={"a1"=>#<B: @attrB={"b0"=>#<C: @attrC={"c1"=>10, "c2"=>20, "c3"=>30}>}>,
"a2"=>#<B: @attrB={"b1"=>#<C: @attrC={"c1"=>11, "c2"=>21, "c3"=>31}>}>,
"a3"=>#<B: @attrB={"b2"=>#<C: @attrC={"c1"=>12, "c2"=>22, "c3"=>32}>}>}>
which is subject to a selection. I want to retrieve only those instances of B where attrB's key is "b2".
My hacky way is:
result = Array.new
A.new.attrA.each do |_,va|
result << va.attrB.select { |kb,_| kb == "b2" }
end
p result.reject { |a| a.empty?} [0]
which results in exactly what I intended:
{"b2"=>#<C: @attrC={"c1"=>12, "c2"=>22, "c3"=>32}>}
but I believe there would be a one-liner using map, fold, zip and reduce.
If you want a one-liner:
a.attrA.values.select { |b| b.attrB.keys == %w(b2) }
This returns instances of B. In your question, you're getting attrB values rather than instances of B. If that's what you want, there's this ugly reduce:
a.attrA.values.reduce([]) { |memo, b| memo << b.attrB if b.attrB.keys == %w(b2) ; memo }
I'm not sure what you're trying to do here, though?
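If the goal is the attrB hash itself, as in the question's expected output, a sketch without reduce is also possible. It assumes that at most one B holds the key "b2" and that the first match is enough; the class definitions are repeated only so that the snippet runs on its own.

```ruby
class C
  attr_reader :attrC
  def initialize(base)
    @attrC = { "c1" => base + 10, "c2" => base + 20, "c3" => base + 30 }
  end
end

class B
  attr_reader :attrB
  @@counter = 0
  def initialize
    @attrB = { "b#{@@counter}" => C.new(@@counter) }
    @@counter += 1
  end
end

class A
  attr_reader :attrA
  def initialize
    @attrA = { "a1" => B.new, "a2" => B.new, "a3" => B.new }
  end
end

a = A.new

# Collect every attrB hash, then take the first one that has the key "b2".
result = a.attrA.each_value.map(&:attrB).find { |attr_b| attr_b.key?("b2") }

result.keys         # => ["b2"]
result["b2"].attrC  # => {"c1"=>12, "c2"=>22, "c3"=>32}
```

Using key? instead of comparing keys with %w(b2) also matches hashes that contain other keys besides "b2", which may or may not be what is wanted.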

How to merge the hash this way

I have written a large piece of code, something like the following:
headers, *data_rows = @testCaseSheet
local = headers.zip(*data_rows)
local = local[1..-1].map {|dataRow| local[0].zip(dataRow).to_h}
testCaseHash = {}
local.each do |value|
testCaseHash[value["Locator"]] = value.tap {|hs| hs.delete("Locator")}
end
@testCaseSheet = []
p testCaseHash
[h["Test Name"], testCaseHash],
which gives me the output below. Now I need to merge this action with each test, and I don't know how to do this.
hash= {"Action"=>{"css=#entityType"=>"Type", "id=idNumber"=>"TypeAndWait", "id=shortName"=>"TypeAndTab", "id=FirstName"=>"TypeTabAndWait", nil=>nil},
"Test1"=>{"css=#entityType"=>"Individual", "id=idNumber"=>"2323", "id=shortName"=>"M", "id=FirstName"=>"Abc", "id=lastName"=>"Gg"},
"Test2"=>{"css=#entityType"=>"Legal", "id=idNumber"=>"2323", "id=shortName"=>"Z", "id=FirstName"=>"Xyz", "id=lastName"=>"Gg"}}
Now I want to merge this action with the following tests, for example:
hash= { "Test1"=>{"css=#entityType"=>["Individual","Type"], "id=idNumber"=>["2323","TypeAndWait"], "id=shortName"=>["M","TypeAndTab"], "id=FirstName"=>["Abc","TypeTabAndWait"]},
"Test2"=>{"css=#entityType"=>["Legal","Type"], "id=idNumber"=>["2323","TypeAndWait"], "id=shortName"=>["Z","TypeAndTab"], "id=FirstName"=>["Xyz","TypeTabAndWait"]}}
I don't know how to merge this way. Can anyone help me?
If I understand correctly, you want something like this:
hash_1 = {a: "a1", b: "b1", c: "c1"}
hash_2 = {a: "a2", b: "b2", d: "d1"}
p hash_1.merge(hash_2) { |k, v1, v2| v1 = [v1, v2] }
# => {:a=>["a1", "a2"], :b=>["b1", "b2"], :c=>"c1", :d=>"d1"}
Which in your case can be:
test_1_value = my_hash['Test1'].merge(my_hash['Action']) { |k, v1, v2| v1 = [v1, v2] }
# => {"css=#entityType"=>["Individual", "Type"], "id=idNumber"=>["2323", "TypeAndWait"], "id=shortName"=>["M", "TypeAndTab"], "id=FirstName"=>["Abc", "TypeTabAndWait"], "id=lastName"=>"Gg", nil=>nil}
This is a general solution; you can manipulate the result further, removing the unwanted keys, and adapt it to fit your needs.
Edit - picking up comments
Remove unwanted keys and simplified merge block:
keys_to_remove = ["id=lastName", "whatever", nil]
test_1_value = my_hash['Test1'].merge(my_hash['Action']) { |k, *vs| vs }.delete_if{ |k, _| keys_to_remove.include? k }
# => {"css=#entityType"=>["Individual", "Type"], "id=idNumber"=>["2323", "TypeAndWait"], "id=shortName"=>["M", "TypeAndTab"], "id=FirstName"=>["Abc", "TypeTabAndWait"]}
I want to expand on iGian's answer. Although the answer describes how the issue should be solved, it doesn't use any iteration. You can iterate over the tests in the following way:
hash = {
"Action"=>{"css=#entityType"=>"Type", "id=idNumber"=>"TypeAndWait", "id=shortName"=>"TypeAndTab", "id=FirstName"=>"TypeTabAndWait", nil=>nil},
"Test1"=>{"css=#entityType"=>"Individual", "id=idNumber"=>"2323", "id=shortName"=>"M", "id=FirstName"=>"Abc", "id=lastName"=>"Gg"},
"Test2"=>{"css=#entityType"=>"Legal", "id=idNumber"=>"2323", "id=shortName"=>"Z", "id=FirstName"=>"Xyz", "id=lastName"=>"Gg"},
}
action = hash.delete 'Action'
tests = hash
tests.each_value do |test|
action_with_test_keys = action.select { |key, _value| test.key? key }
test.merge!(action_with_test_keys) { |_key, *values| values } # values = [old, new]
end
This assumes that 'Action' is the only non-test key in the hash and all other values should be merged with the 'Action' value. Keep in mind that this approach mutates the hash variable: 'Action' is deleted and each test hash is updated in place. If you don't want this, note that #dup only makes a shallow copy, so the nested test hashes would still be mutated; make a deep copy beforehand or look for a non-mutating approach.
Optimizations:
If you use Ruby 2.5.0 or higher you can use #slice instead of #select.
action.select { |key, _value| test.key? key }
# is replaced with
action.slice(*test.keys)
If you are 100% sure that each test in tests contains the same keys and there is always at least one test present, you could move the action_with_test_keys assignment out of the #each_value block to save resources.
tests = hash # anchor point in the above solution
action_with_test_keys = action.slice(*tests.values.first.keys) # added
References:
Hash#delete to remove the 'Action' key from the hash variable.
Hash#each_value to iterate over each value of tests.
Hash#select to select only the action keys that are present on test.
Hash#key? to check if the given key is present.
Hash#merge! to merge action_with_test_keys and update the test variable.
Hash#slice replacement for Hash#select if you use Ruby 2.5.0 or higher.
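A non-mutating variant of the loop above could look like the following sketch. The data is a trimmed copy of the question's hash, and the variable names merged and pairs are made up for this example. It builds new inner hashes instead of calling merge!, so hash is left untouched.

```ruby
# Trimmed copy of the question's data, for illustration only.
hash = {
  "Action" => { "css=#entityType" => "Type", "id=idNumber" => "TypeAndWait", nil => nil },
  "Test1"  => { "css=#entityType" => "Individual", "id=idNumber" => "2323", "id=lastName" => "Gg" },
  "Test2"  => { "css=#entityType" => "Legal", "id=idNumber" => "2323", "id=lastName" => "Gg" }
}

action = hash["Action"]

# Build a fresh hash per test, keeping only locators that also exist in action.
merged = hash
  .reject { |name, _| name == "Action" }
  .transform_values do |test|
    test.each_with_object({}) do |(locator, value), pairs|
      pairs[locator] = [value, action[locator]] if action.key?(locator)
    end
  end

merged["Test1"]
# => {"css=#entityType"=>["Individual", "Type"], "id=idNumber"=>["2323", "TypeAndWait"]}
```

Locators that exist only in a test (such as "id=lastName") or only in the action (such as nil) are dropped, which matches the desired output in the question.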
Generally speaking, it might be a good idea to build up the desired data structure while dealing with the underlying data objects. However, if you need to transform your hash afterwards, here is one way to do that:
hash = {
"Action"=>{"css=#entityType"=>"Type", "id=idNumber"=>"TypeAndWait", "id=shortName"=>"TypeAndTab", "id=FirstName"=>"TypeTabAndWait", nil=>nil},
"Test1"=>{"css=#entityType"=>"Individual", "id=idNumber"=>"2323", "id=shortName"=>"M", "id=FirstName"=>"Abc", "id=lastName"=>"Gg"},
"Test2"=>{"css=#entityType"=>"Legal", "id=idNumber"=>"2323", "id=shortName"=>"Z", "id=FirstName"=>"Xyz", "id=lastName"=>"Gg"}
}
action = hash['Action']
tests = hash.reject { |k, v| k == 'Action' }
mapping = tests.map do |name, test|
groups = (action.to_a + test.to_a).group_by(&:first)
no_keys = groups.map { |k, v| [k, v.each(&:shift).flatten] }
no_keys.reject! { |k, v| v.length == 1 }
[name, Hash[no_keys]]
end
Hash[mapping]
# => {"Test1"=>{"css=#entityType"=>["Type", "Individual"], "id=idNumber"=>["TypeAndWait", "2323"], "id=shortName"=>["TypeAndTab", "M"], "id=FirstName"=>["TypeTabAndWait", "Abc"]},
# "Test2"=>{"css=#entityType"=>["Type", "Legal"], "id=idNumber"=>["TypeAndWait", "2323"], "id=shortName"=>["TypeAndTab", "Z"], "id=FirstName"=>["TypeTabAndWait", "Xyz"]}}
I hope you find that useful.

Merge array of hashes by some keys and sum values of other keys

I have a array like
array = [
{"point"=>6, "score"=>4, "team"=>"Challenger"},
{"point"=>4, "score"=>2, "team"=>"INB"},
{"point"=>2, "score"=>2, "team"=>"Super-11"},
{"point"=>3, "score"=>7, "team"=>"INB"}
]
I want to merge hashes by "team" and sum the values of "point" and "score". Additionally, I want to insert a key "qualified" in each hash, set according to whether point is greater than 5. So the final result will be:
result= [
{"point"=>6, "score"=>4, "qualified"=> "yes", "team"=>"Challenger"},
{"point"=>7, "score"=>9, "qualified"=> "yes", "team"=>"INB"},
{"point"=>2, "score"=>2, "qualified"=> "no", "team"=>"Super-11"}
]
Any help would be appreciated. Thanks!
One more possible solution :)
array.group_by { |item| item['team'] }.map do |_, items|
result = items.inject({}) { |hash, item| hash.merge(item) { |_, old, new| Integer(old) + new rescue old } }
result.merge("qualified" => result['point'] > 5 ? "yes" : "no")
end
Combination of group_by and map should help
result =
array.group_by {|item| item['team'] }
.map do |team, items|
total_points = items.map{|item| item['point']}.reduce(0, :+)
total_score = items.map{|item| item['score']}.reduce(0, :+)
qualified = total_points > 5 ? 'yes' : 'no'
{
'point' => total_points,
'score' => total_score,
'qualified' => qualified ,
'team' => team
}
end
result = array.group_by{|i| i['team']}
.map do |k,v|
points = v.map{|i| i['point']}.inject(0, :+)
score = v.map{|i| i['score']}.inject(0, :+)
{
'point' => points,
'score' => score,
'qualified' => points > 5 ? 'yes' : 'no',
'team' => k
}
end
This is an alternative version. group_by is mandatory, I guess.
I used a temporary hash with keys as symbol to store data during iterations.
result = array.group_by { |hash| hash['team'] }.map do |team|
tmp_hash = {point: 0, score: 0, team: team[0], qualified: 'no'}
team[1].each { |h| tmp_hash[:point] += h['point'] ; tmp_hash[:score] += h['score'] }
tmp_hash[:qualified] = 'yes' if tmp_hash[:point] > 5
tmp_hash
end
this gives as result:
# => [
# {:point=>6, :score=>4, :team=>"Challenger", :qualified=>"yes"},
# {:point=>7, :score=>9, :team=>"INB", :qualified=>"yes"},
# {:point=>2, :score=>2, :team=>"Super-11", :qualified=>"no"}
# ]
After doing group_by, a simple map operation which takes the first element as the mapped value, sums up point and score within it and then merges the qualified condition into it is easy enough:
array
.group_by { |h| h["team"] }
.map do |_, a|
["point", "score"].each { |k| a.first[k] = a.sum { |h| h[k] } }
a.first.merge("qualified" => (a.first["point"] > 5 ? 'yes' : 'no'))
end
array.each_with_object({}) do |g,h|
h.update(g["team"]=>g.merge("qualified"=>g["point"] > 5 ? "yes" : "no")) do |_,o,n|
{ "point" =>o["point"]+n["point"],
"score" =>o["score"]+n["score"],
"team" =>o["team"],
"qualified"=>(o["point"]+n["point"]) > 5 ? "yes" : "no" }
end
end.values
#=> [{"point"=>6, "score"=>4, "team"=>"Challenger", "qualified"=>"yes"},
# {"point"=>7, "score"=>9, "team"=>"INB", "qualified"=>"yes"},
# {"point"=>2, "score"=>2, "team"=>"Super-11", "qualified"=>"no"}]
This uses the form of Hash#update (aka merge!) that employs a block to determine the values of keys (here the team names) that are present in both hashes being merged. See the doc for the description of the three block variables (here _, o and n).
Note that the receiver of values (at the end) is
{"Challenger"=>{"point"=>6, "score"=>4, "team"=>"Challenger", "qualified"=>"yes"},
"INB"=>{"point"=>7, "score"=>9, "team"=>"INB", "qualified"=>"yes"},
"Super-11"=>{"point"=>2, "score"=>2, "team"=>"Super-11", "qualified"=>"no"}}
One could alternatively make a separate pass at the end to add the key "qualified":
array.each_with_object({}) do |g,h|
h.update(g["team"]=>g) do |_,o,n|
{ "point" =>o["point"]+n["point"],
"score" =>o["score"]+n["score"],
"team" =>o["team"] }
end
end.values.
map { |h| h.merge("qualified"=>(h["point"] > 5) ? "yes" : "no") }
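The block form of Hash#update used in this answer can be seen in isolation in a small sketch. The totals hash and its values are made up for illustration; the point is only when the block is called and what it receives.

```ruby
totals = { "INB" => 4 }

# No collision: the block is not called and the new pair is simply added.
totals.update("Challenger" => 6) { |_key, old, new| old + new }

# Collision on "INB": the block receives the key, the old value and the
# new value, and its return value is stored under that key.
totals.update("INB" => 3) { |_key, old, new| old + new }

totals  # => {"INB"=>7, "Challenger"=>6}
```

In the answer's code the values are whole hashes rather than integers, so the block has to build the combined hash itself instead of just adding two numbers.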

Parsing XML to hash with Nori and Nokogiri with undesired result

I am attempting to convert an XML document to a Ruby hash using Nori. But instead of receiving a collection of the root element, a new node containing the collection is returned. This is what I am doing:
@xml = content_for(:layout)
@hash = Nori.new(:parser => :nokogiri, :advanced_typecasting => false).parse(@xml)
or
@hash = Hash.from_xml(@xml)
Where the content of @xml is:
<bundles>
<bundle>
<id>6073</id>
<name>Bundle-1</name>
<status>1</status>
<bundle_type>
<id>6713</id>
<name>BundleType-1</name>
</bundle_type>
<begin_at nil=\"true\"></begin_at>
<end_at nil=\"true\"></end_at>
<updated_at>2013-03-21T23:02:32Z</updated_at>
<created_at>2013-03-21T23:02:32Z</created_at>
</bundle>
<bundle>
<id>6074</id>
<name>Bundle-2</name>
<status>1</status>
<bundle_type>
<id>6714</id>
<name>BundleType-2</name>
</bundle_type>
<begin_at nil=\"true\"></begin_at>
<end_at nil=\"true\"></end_at>
<updated_at>2013-03-21T23:02:32Z</updated_at>
<created_at>2013-03-21T23:02:32Z</created_at>
</bundle>
</bundles>
The parser returns @hash in this format:
{"bundles"=>{"bundle"=>[{"id"=>"6073", "name"=>"Bundle-1", "status"=>"1", "bundle_type"=>{"id"=>"6713", "name"=>"BundleType-1"}, "begin_at"=>nil, "end_at"=>nil, "updated_at"=>"2013-03-21T23:02:32Z", "created_at"=>"2013-03-21T23:02:32Z"}, {"id"=>"6074", "name"=>"Bundle-2", "status"=>"1", "bundle_type"=>{"id"=>"6714", "name"=>"BundleType-2"}, "begin_at"=>nil, "end_at"=>nil, "updated_at"=>"2013-03-21T23:02:32Z", "created_at"=>"2013-03-21T23:02:32Z"}]}}
Instead I would like to get:
{"bundles"=>[{"id"=>"6073", "name"=>"Bundle-1", "status"=>"1", "bundle_type"=>{"id"=>"6713", "name"=>"BundleType-1"}, "begin_at"=>nil, "end_at"=>nil, "updated_at"=>"2013-03-21T23:02:32Z", "created_at"=>"2013-03-21T23:02:32Z"}, {"id"=>"6074", "name"=>"Bundle-2", "status"=>"1", "bundle_type"=>{"id"=>"6714", "name"=>"BundleType-2"}, "begin_at"=>nil, "end_at"=>nil, "updated_at"=>"2013-03-21T23:02:32Z", "created_at"=>"2013-03-21T23:02:32Z"}]}
The point is that I control the XML, and it is formed similarly to the way described above.
My question is also related to Does RABL's JSON output not conform to standard? Can it?
Imagine an XML document that consists only of a list of the same tags, e.g.
<shoppinglist>
<item>apple</item>
<item>banana</item>
<item>cherry</item>
<item>pear</item>
</shoppinglist>
When you convert this into a hash, it is quite straightforward to access the items with e.g. hash['shoppinglist']['item'][0]. But what would you expect in this case? Just an array? According to your logic, the items should now be accessible with hash['shoppinglist'][0], but what if you have different elements inside the container, e.g.
<shoppinglist>
<date>2013-01-01</date>
<item>apple</item>
<item>banana</item>
<item>cherry</item>
<item>pear</item>
</shoppinglist>
How would you now access the items? And how the date? The problem is that the conversion to a hash has to work in the general case.
Although I do not know Nori, I am pretty sure that what you ask of it is not built in, simply because it makes no sense when you consider the general case. As an alternative, you can still move the bundle array up one level yourself:
@hash['bundles'] = @hash['bundles']['bundle']
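The same post-processing step can be written as a small helper that works on the already parsed hash, so it does not depend on Nori at all. This is only a sketch: the name hoist_single_collections is made up, and the rule it applies (hoist only when the wrapper hash has exactly one entry and that entry is an array) is an assumption chosen so that mixed containers like the shopping list with a date are left alone.

```ruby
# Hypothetical helper, not part of Nori: for every top-level key whose value
# is a hash with exactly one entry, and that entry is an array, replace the
# wrapper hash with the array itself. Everything else is returned unchanged.
def hoist_single_collections(parsed)
  parsed.transform_values do |value|
    if value.is_a?(Hash) && value.size == 1 && value.values.first.is_a?(Array)
      value.values.first
    else
      value
    end
  end
end

parsed = { "bundles" => { "bundle" => [{ "id" => "6073" }, { "id" => "6074" }] } }
hoisted = hoist_single_collections(parsed)
# => {"bundles"=>[{"id"=>"6073"}, {"id"=>"6074"}]}

mixed = { "shoppinglist" => { "date" => "2013-01-01", "item" => ["apple", "banana"] } }
untouched = hoist_single_collections(mixed)
# => equal to mixed, because the container holds more than one kind of element
```

One caveat to check against real parser output: a container with a single child element may be parsed as a hash rather than a one-element array, in which case this helper would not hoist it.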
The general solution to your problem is not very pretty.
I created a special object that I named ArrayHash. It has the special property that if it has only one key and the value for that key is an array, it adds integer keys for the elements of that array.
So if a normal Ruby Hash would look like
{"bundle"=>["0", "1", "A", "B"]}
then an ArrayHash would look like this
{"bundle"=>["0", "1", "A", "B"], 0=>"0", 1=>"1", 2=>"A", 3=>"B"}
Since the extra keys are of type Integer (Fixnum on older Rubies), this Hash looks just like the Array
[ "0", "1", "A", "B" ]
except that it also has a "bundle" entry so its size is 5
Below is the code to force Nori to use this special Dictionary.
require 'nori'
class Nori
class ArrayHash < Hash
def [](a)
if a.is_a?(Integer) and self.size == 1 # Fixnum was folded into Integer in Ruby 2.4
key = self.keys[0]
self[key][a]
else
super
end
end
def inspect
if self.size == 1 and self.to_a[0][1].class == Array
p = Hash[self.to_a]
self.values[0].each.with_index do |v, i|
p[i] = v
end
p.inspect
else
super
end
end
end
end
class Nori
class XMLUtilityNode
alias :old_to_hash :to_hash
def to_hash
ret = old_to_hash
raise if ret.size != 1
raise unless ret.class == Hash
a = ret.to_a[0]
k, v = a.first, a.last
if v.class == Hash
v = ArrayHash[ v.to_a ]
end
ret = ArrayHash[ k, v ]
ret
end
end
end
h = Nori.new(:parser => :nokogiri, :advanced_typecasting => false).parse(<<EOF)
<top>
<aundles>
<bundle>0</bundle>
<bundle>1</bundle>
<bundle>A</bundle>
<bundle>B</bundle>
</aundles>
<bundles>
<nundle>A</nundle>
<bundle>A</bundle>
<bundle>B</bundle>
</bundles>
</top>
EOF
puts "#{h['top']['aundles'][0]} == #{ h['top']['aundles']['bundle'][0]}"
puts "#{h['top']['aundles'][1]} == #{ h['top']['aundles']['bundle'][1]}"
puts "#{h['top']['aundles'][2]} == #{ h['top']['aundles']['bundle'][2]}"
puts "#{h['top']['aundles'][3]} == #{ h['top']['aundles']['bundle'][3]}"
puts h.inspect
The output is then
0 == 0
1 == 1
A == A
B == B
{"top"=>{"aundles"=>{"bundle"=>["0", "1", "A", "B"], 0=>"0", 1=>"1", 2=>"A", 3=>"B"}, "bundles"=>{"nundle"=>"A", "bundle"=>["A", "B"]}}}
