Ruby hash direct access vs merge - ruby

Is there any difference between:
#attr[:field] = new_value
and
#attr.merge(:field => new_value)

If you're using merge! instead of merge, there is no difference.
The only difference is that you can use multiple fields (meaning: another hash) in the merge parameters.
Example:
h1 = { "a" => 100, "b" => 200 }
h2 = { "b" => 254, "c" => 300 }
h3 = h1.merge(h2)
puts h1 # => {"a" => 100, "b" => 200}
puts h3 # => {"a"=>100, "b"=>254, "c"=>300}
h1.merge!(h2)
puts h1 # => {"a"=>100, "b"=>254, "c"=>300}
When assigning single values, I would prefer h[:field] = new_val over merge for readability reasons and I guess it is faster than merging.
You can also take a look at the Hash-rdoc: http://ruby-doc.org/core/classes/Hash.html#M000759

They do the same thing, however:
#attr[:field] = new_value
is more efficient, since no hash traversing is necessary.

Merge returns a new hash at a different location merging other_hashes into self
but Merge! operated like "update" returning self hash, copying at self-location.
h1 = { "a": 1 }
h2 = { "b": 2 }
def merge_hash (a,b)
puts h1 # {:a=> 1}
puts h1.object_id # 340720
h1 = h1.merge(h2)
puts h1 # {:a=>1, :b=>2}
puts h1.object_id # 340760
end
merge_hash(h1, h2)
puts h1 # {:a=> 1}
h1.object_id # 340720
def merge_hash (a,b)
puts h1 # {:a=> 1}
puts h1.object_id # 340720
h1 = h1.merge!(h2)
puts h1 # {:a=>1, :b=>2}
puts h1.object_id # 340720
end
merge_hash(h1, h2)
puts h1 # {:a=>1, :b=>2}
h1.object_id # 340720

You can use non-bang merge to use hashes in a functional programming style.
Ruby Functional Programming has under Don't update variables
Don't update hashes
No:
hash = {:a => 1, :b => 2}
hash[:c] = 3
hash
Yes:
hash = {:a => 1, :b => 2}
new_hash = hash.merge(:c => 3)

Related

How to make a custom proc in a class so I can do: params_hash.downcase_keys instead of downcase_keys(params_hash)?

I have a hash of key values and I want to downcase all of the Keys.
However I don't want to have to create an local variable, I would rather functionally do it.
NOT:
x = downcase_keys(params_hash)
BUT THIS:
params_hash.downcase_keys
How would do this in ruby?
I do not understand why you tagged this question as functional-programming it seems you are looking for a method to call on a Hash object.
Be aware that you may encounter problems doing so because duplicated keys are going to be overwritten.
h = {"a" => 1, "B" => 2}
# Singleton method on the object
def h.downcase_keys
temp = map {|k,v| [k.downcase, v]}.to_h
clear
merge! temp
end
h.downcase_keys()
p h # {"a"=>1, "b"=>2}
# Method available on all Hash objects
class Hash
def downcase_keys
temp = map {|k,v| [k.downcase, v]}.to_h
clear
merge! temp
end
end
h = {"a" => 1, "B" => 2}
h.downcase_keys()
p h # {"a"=>1, "b"=>2}
def downcase_keys(hash)
hash.downcase_keys
end
h = {"C" => 1, "B" => 2, "D" => 3}
downcase_keys(h)
p h # {"c"=>1, "b"=>2, "d"=>3}

Subtract two hashes in Ruby

Can the hash class be modified so that given two hashes, a new hash containing only keys that are present in one hash but not the other can be created?
E.g.:
h1 = {"Cat" => 100, "Dog" => 5, "Bird" => 2, "Snake" => 10}
h2 = {"Cat" => 100, "Dog" => 5, "Bison" => 30}
h1.difference(h2) = {"Bird" => 2, "Snake" => 10}
Optionally, the difference method could include any key/value pairs such that the key is present in both hashes but the value differs between them.
h1 = {"Cat" => 100, "Dog" => 5, "Bird" => 2, "Snake" => 10}
h2 = {"Cat" => 999, "Dog" => 5, "Bison" => 30}
Case 1: keep all key/value pairs k=>v in h1 for which there is no key k in h2
This is one way:
h1.dup.delete_if { |k,_| h2.key?(k) }
#=> {"Bird"=>2, "Snake"=>10}
This is another:
class Array
alias :spaceship :<=>
def <=>(o)
first <=> o.first
end
end
(h1.to_a - h2.to_a).to_h
#=> {"Bird"=>2, "Snake"=>10}
class Array
alias :<=> :spaceship
remove_method(:spaceship)
end
Case 2: keep all key/value pairs in h1 that are not in h2
All you need for this is:
(h1.to_a - h2.to_a).to_h
#=> {"Cat"=>100, "Bird"=>2, "Snake"=>10}
Array#to_h was introduced in Ruby 2.0. For earlier versions, use Hash[].
Use the reject method:
class Hash
def difference(other)
reject do |k,v|
other.has_key? k
end
end
end
To only reject key/value pairs if the values are identical (as per mallanaga's suggestion via a comment on my original answer, which I have deleted):
class Hash
def difference(other)
reject do |k,v|
other.has_key?(k) && other[k] == v
end
end
end
You can do this:
h2.each_with_object(h1.dup){|(k, v), h| h.delete(k)}
try using hashdiff gem.
diff=HashDiff.diff(h1,h2)
For deep nesting you can add a bit of recursion, something like (untested)
class Hash
def -(h2)
raise ArgumentError unless h2.is_a?(Hash)
h1 = dup
h1.delete_if do |k, v|
if v.is_a?(Hash) && h2[k].is_a?(Hash)
h1[k] = v - h2[k]
h1[k].blank?
else
h2[k] == v
end
end
end
end
end

How to merge multiple hashes in Ruby?

h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
Hash#merge works for 2 hashes: h.merge(h2)
How to merge 3 hashes?
h.merge(h2).merge(h3) works but is there a better way?
You could do it like this:
h, h2, h3 = { a: 1 }, { b: 2 }, { c: 3 }
a = [h, h2, h3]
p Hash[*a.map(&:to_a).flatten] #= > {:a=>1, :b=>2, :c=>3}
Edit: This is probably the correct way to do it if you have many hashes:
a.inject{|tot, new| tot.merge(new)}
# or just
a.inject(&:merge)
Since Ruby 2.0 on that can be accomplished more graciously:
h.merge **h1, **h2
And in case of overlapping keys - the latter ones, of course, take precedence:
h = {}
h1 = { a: 1, b: 2 }
h2 = { a: 0, c: 3 }
h.merge **h1, **h2
# => {:a=>0, :b=>2, :c=>3}
h.merge **h2, **h1
# => {:a=>1, :c=>3, :b=>2}
You can just do
[*h,*h2,*h3].to_h
# => {:a=>1, :b=>2, :c=>3}
This works whether or not the keys are Symbols.
Ruby 2.6 allows merge to take multiple arguments:
h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
h4 = { 'c' => 4 }
h5 = {}
h.merge(h2, h3, h4, h5) # => {:a=>1, :b=>2, :c=>3, "c"=>4}
This works with Hash.merge! and Hash.update too. Docs for this here.
Also takes empty hashes and keys as symbols or strings.
Much simpler :)
Answer using reduce (same as inject)
hash_arr = [{foo: "bar"}, {foo2: "bar2"}, {foo2: "bar2b", foo3: "bar3"}]
hash_arr.reduce { |acc, h| (acc || {}).merge h }
# => {:foo2=>"bar2", :foo3=>"bar3", :foo=>"bar"}
Explanation
For those beginning with Ruby or functional programming, I hope this brief explanation might help understand what's happening here.
The reduce method when called on an Array object (hash_arr) will iterate through each element of the array with the returned value of the block being stored in an accumulator (acc). Effectively, the h parameter of my block will take on the value of each hash in the array, and the acc parameter will take on the value that is returned by the block through each iteration.
We use (acc || {}) to handle the initial condition where acc is nil. Note that the merge method gives priority to keys/values in the original hash. This is why the value of "bar2b" doesn't appear in my final hash.
Hope that helps!
To build upon #Oleg Afanasyev's answer, you can also do this neat trick:
h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
z = { **h, **h2, **h3 } # => {:a=>1, :b=>2, :c=>3}
Cheers!
class Hash
def multi_merge(*args)
args.unshift(self)
args.inject { |accum, ele| accum.merge(ele) }
end
end
That should do it. You could easily monkeypatch that into Hash as I have shown.
newHash = [h, h2, h3].each_with_object({}) { |oh, nh| nh.merge!(oh)}
# => {:a=>1, :b=>2, :c=>3}
Here are the 2 monkeypatched ::Hash instance methods we use in our app. Backed by Minitest specs. They use merge! instead of merge internally, for performance reasons.
class ::Hash
# Merges multiple Hashes together. Similar to JS Object.assign.
# Returns merged hash without modifying the receiver.
#
# #param *other_hashes [Hash]
#
# #return [Hash]
def merge_multiple(*other_hashes)
other_hashes.each_with_object(self.dup) do |other_hash, new_hash|
new_hash.merge!(other_hash)
end
end
# Merges multiple Hashes together. Similar to JS Object.assign.
# Modifies the receiving hash.
# Returns self.
#
# #param *other_hashes [Hash]
#
# #return [Hash]
def merge_multiple!(*other_hashes)
other_hashes.each(&method(:merge!))
self
end
end
Tests:
describe "#merge_multiple and #merge_multiple!" do
let(:hash1) {{
:a => "a",
:b => "b"
}}
let(:hash2) {{
:b => "y",
:c => "c"
}}
let(:hash3) {{
:d => "d"
}}
let(:merged) {{
:a => "a",
:b => "y",
:c => "c",
:d => "d"
}}
describe "#merge_multiple" do
subject { hash1.merge_multiple(hash2, hash3) }
it "should merge three hashes properly" do
assert_equal(merged, subject)
end
it "shouldn't modify the receiver" do
refute_changes(->{ hash1 }) do
subject
end
end
end
describe "#merge_multiple!" do
subject { hash1.merge_multiple!(hash2, hash3) }
it "should merge three hashes properly" do
assert_equal(merged, subject)
end
it "shouldn't modify the receiver" do
assert_changes(->{ hash1 }, :to => merged) do
subject
end
end
end
end
Just for fun, you can do it also this way:
a = { a: 1 }, { b: 2 }, { c: 3 }
{}.tap { |h| a.each &h.method( :update ) }
#=> {:a=>1, :b=>2, :c=>3}
With modern Ruby, you wont even have to use merge unless you need to change the variable in place using the ! variant, you can just double splat (**) your way through.
h = { a: 1 }
h2 = { b: 2 }
h3 = { c: 3 }
merged_hash = { **h, **h2, **h3 }
=> { a: 1, b: 2, c:3 }

Ruby when to use methods ! vs assigning the value back

While working in Ruby I often find myself in a conflict between using methods with a ! or using a normal method as assign the value back. I am not sure when to use what.
For example,
I have 2 hashes (h1 and h2) and I want to merge them and store the value back in hash h1, now should I be using
h1.merge!(h2)
or
h1 = h1.merge(h2)?
Is there any difference between the two?
Most of the time there is very little difference between h1.merge!(h2) and h1 = h1.merge(h2).
However, note that:
Since merge! modifies the old hash, you might be unintentionally affecting some other object in your program that holds a reference to that same hash. It is bad practice to modify a hash that you received as a method parameter because the caller usually does not expect it.
Using merge! is not functional programming, if you are a fan of that.
Using merge! is probably more efficient since it does not create a new hash, especially for large hashes.
I would use merge most of the time and only use merge! if I determine that it is safe and better.
Is there any difference between the two?
Yes,of-course there is.
You should use !(bang) version of Hash#merge,when you want to change the receiver itself.
Example
h1 = {a: 1}
h2 = {b: 2}
h3 = h1.merge(h2) # => {:a=>1, :b=>2}
h1 # => {:a=>1}
Now see :
h1 = {a: 1}
h2 = {b: 2}
h1.merge!(h2) # => {:a=>1, :b=>2}
h1 # => {:a=>1, :b=>2}
h1 = h1.merge(h2) gives the same answer
Humm,that is because,you are assinging the new hash to h1 after applying the Hash#merge method:
h1 = {a: 1}
h2 = {b: 2}
h1.object_id # => 69435570
h1 = h1.merge(h2) # => {:a=>1, :b=>2}
h1 # => {:a=>1, :b=>2}
h1.object_id # => 69434820
As you said I want to merge them and store the value back in hash h1,Then I would recommend to use Hash#merge!.Because h1 = h1.merge(h2) is same as h1.merge!(h2)(will save the new hash creation,and assign back it to the h1).
While working in Ruby I often find myself in a conflict between using
methods with a ! or using a normal method.
You should be thinking about more important things, so just adopt the rule: I will never use bang methods. Now be free and soar...
require 'benchmark'
n = 1_000_000
h1 = {a: 1, b: 2}
h2 = {b: 3, c: 4}
Benchmark.bm(20) do |b|
b.report("no-bang-hash-merge") do
n.times { h1 = h1.merge h2 }
end
b.report("bang-hash-merge") do
n.times { h1.merge! h2 }
end
end
--output:--
user system total real
no-bang-hash-merge 2.750000 0.050000 2.800000 ( 2.817345)
bang-hash-merge 0.400000 0.000000 0.400000 ( 0.406870)
.
require 'benchmark'
hash_size = 10_000
#Keys overlap:
key1 = 'a'
key2 = nil
h1 = {}
hash_size.times do |i|
h1[key1] = i
key2 = key1.dup if i == hash_size/2
key1.succ!
end
h2 = {}
hash_size.times do |i|
h2[key2] = i
key2.succ!
end
=begin
#No overlap:
key = 'a'
h1 = {}
hash_size.times do |i|
h1[key] = i
key.succ!
end
h2 = {}
hash_size.times do |i|
h2[key] = i
key.succ!
end
=end
n = 100_000
puts "50% of keys overlap, hash size #{hash_size}:"
Benchmark.bm(20) do |b|
b.report("no-bang-hash-merge") do
n.times { h1 = h1.merge h2 }
end
b.report("bang-hash-merge") do
n.times { h1.merge! h2 }
end
end
--some test runs:---
50% of keys overlap, hash size 10000:
user system total real
no-bang-hash-merge 1500.570000 74.520000 1575.090000 (1695.523240)
bang-hash-merge 255.910000 0.940000 256.850000 (269.957178)
No keys overlap, hash size 10000:
user system total real
no-bang-hash-merge 1906.070000 109.340000 2015.410000 (2151.865636)
bang-hash-merge 162.680000 0.190000 162.870000 (163.369607)
So if you need the speed, bang away. Otherwise, don't risk it.

Turning a Hash of Arrays into an Array of Hashes in Ruby

We have the following datastructures:
{:a => ["val1", "val2"], :b => ["valb1", "valb2"], ...}
And I want to turn that into
[{:a => "val1", :b => "valb1"}, {:a => "val2", :b => "valb2"}, ...]
And then back into the first form. Anybody with a nice looking implementation?
This solution works with arbitrary numbers of values (val1, val2...valN):
{:a => ["val1", "val2"], :b => ["valb1", "valb2"]}.inject([]){|a, (k,vs)|
vs.each_with_index{|v,i| (a[i] ||= {})[k] = v}
a
}
# => [{:a=>"val1", :b=>"valb1"}, {:a=>"val2", :b=>"valb2"}]
[{:a=>"val1", :b=>"valb1"}, {:a=>"val2", :b=>"valb2"}].inject({}){|a, h|
h.each_pair{|k,v| (a[k] ||= []) << v}
a
}
# => {:a=>["val1", "val2"], :b=>["valb1", "valb2"]}
Using a functional approach (see Enumerable):
hs = h.values.transpose.map { |vs| h.keys.zip(vs).to_h }
#=> [{:a=>"val1", :b=>"valb1"}, {:a=>"val2", :b=>"valb2"}]
And back:
h_again = hs.first.keys.zip(hs.map(&:values).transpose).to_h
#=> {:a=>["val1", "val2"], :b=>["valb1", "valb2"]}
Let's look closely what the data structure we are trying to convert between:
#Format A
[
["val1", "val2"], :a
["valb1", "valb2"], :b
["valc1", "valc2"] :c
]
#Format B
[ :a :b :c
["val1", "valb1", "valc1"],
["val2", "valb2", "valc3"]
]
It is not diffculty to find Format B is the transpose of Format A in essential , then we can come up with this solution:
h={:a => ["vala1", "vala2"], :b => ["valb1", "valb2"], :c => ["valc1", "valc2"]}
sorted_keys = h.keys.sort_by {|a,b| a.to_s <=> b.to_s}
puts sorted_keys.inject([]) {|s,e| s << h[e]}.transpose.inject([]) {|r, a| r << Hash[*sorted_keys.zip(a).flatten]}.inspect
#[{:b=>"valb1", :c=>"valc1", :a=>"vala1"}, {:b=>"valb2", :c=>"valc2", :a=>"vala2"}]
m = {}
a,b = Array(h).transpose
b.transpose.map { |y| [a, y].transpose.inject(m) { |m,x| m.merge Hash[*x] }}
My attempt, perhaps slightly more compact.
h = { :a => ["val1", "val2"], :b => ["valb1", "valb2"] }
h.values.transpose.map { |s| Hash[h.keys.zip(s)] }
Should work in Ruby 1.9.3 or later.
Explanation:
First, 'combine' the corresponding values into 'rows'
h.values.transpose
# => [["val1", "valb1"], ["val2", "valb2"]]
Each iteration in the map block will produce one of these:
h.keys.zip(s)
# => [[:a, "val1"], [:b, "valb1"]]
and Hash[] will turn them into hashes:
Hash[h.keys.zip(s)]
# => {:a=>"val1", :b=>"valb1"} (for each iteration)
This will work assuming all the arrays in the original hash are the same size:
hash_array = hash.first[1].map { {} }
hash.each do |key,arr|
hash_array.zip(arr).each {|inner_hash, val| inner_hash[key] = val}
end
You could use inject to build an array of hashes.
hash = { :a => ["val1", "val2"], :b => ["valb1", "valb2"] }
array = hash.inject([]) do |pairs, pair|
pairs << { pair[0] => pair[1] }
pairs
end
array.inspect # => "[{:a=>["val1", "val2"]}, {:b=>["valb1", "valb2"]}]"
Ruby documentation has a few more examples of working with inject.

Resources