Efficient way to verify a large hash in Ruby tests - ruby

What is an efficient way to test that a hash contains specific keys and values?
By efficient I mean the following items:
easy to read output when failing
easy to read source of test
shortest test to still be functional
Sometimes in Ruby I must create a large hash. I would like to learn an efficient way to test these hashes:
expected_hash = {
:one => 'one',
:two => 'two',
:sub_hash1 => {
:one => 'one',
:two => 'two'
},
:sub_hash2 => {
:one => 'one',
:two => 'two'
}
}
I can test this has several ways. The two ways I use the most are the whole hash at once, or a single item:
assert_equal expected_hash, my_hash
assert_equal 'one', my_hash[:one]
These work for small hashes like our example hash, but for a very large hash these methods break down. The whole hash test will display too much information on a failure. And the single item test would make my test code too large.
I was thinking an efficient way would be to break up the tests into many smaller tests that validated only part of the hash. Each of these smaller tests could use the whole hash style test. Unfortunately I don't know how to get Ruby to do this for items not in a sub-hash. The sub-hashes can be done like the following:
assert_equal expected_hash[:sub_hash1], my_hash[:sub_hash1]
assert_equal expected_hash[:sub_hash2], my_hash[:sub_hash2]
How do I test the remaining parts of the hash?

When testing, you need to use manageable chunks. As you found, testing against huge hashes makes it difficult to track what is happening.
Consider converting your hashes to arrays using to_a. Then you can easily use the set operators to compare arrays, and find out what is missing/changed.
For instance:
[1,2] - [2,3]
=> [1]
[2,3] - [1,2]
=> [3]

You can use hash method on hashes to compare 2 of them or their parts:
h = {:test => 'test'}
=> {:test=>"test"}
h1 = {:test => 'test2'}
=> {:test=>"test2"}
h.hash
=> -1058076452551767024
h1.hash
=> 1300393442551759555
h.hash == h1.hash
=> false

Related

Understanding map in Ruby

I'm pretty new to Ruby and am trying to understand an example of the map method that I came across:
{:a => "foo", :b => "bar"}.map{|a, b| "#{a}=#{b}"}.join('&')
which returns:
=> "a=foo&b=bar"
I don't understand how the
b=bar
is returned. The string interpolation is what is confusing me as it seems it would return something like:
=> "a=foo&bbar"
> {:a => "foo", :b => "bar"}.map{|key, value| "#{key}=#{value}"}
#=> ["a=foo", "b=bar"]
map method will fetch each element of hash as key and value pair
"#{key}=#{value}" is a String Interpolation which adds = between your key and value
Using this syntax everything between the opening #{ and closing } bits
is evaluated as Ruby code, and the result of this evaluation will be
embedded into the string surrounding it.
Array#join will returns a string created by converting each element of the array to a string, separated by the given separator.
so here in your case:
> ["a=foo", "b=bar"].join('&')
#=> "a=foo&b=bar"
In Rails you can convert hash to query params using Hash#to_query method, which will return the same result.
> {:a => "foo", :b => "bar"}.to_query
#=> "a=foo&b=bar"
The symbol key :a and the local variable a have nothing in common. The names are only coincidentally the same. Consider this code instead:
{
var1: "value1",
var2: "value2"
}.map do |key, value|
"#{key}=#{value}"
end.join('&')
# => "var1=value1&var2=value2"
Here the variables are different. What map does, like each, is iterate over each key-value pair in the Hash. That means you can do things like this, too, to simplify:
{
var1: "value1",
var2: "value2"
}.map do |pair|
pair.join('=')
end.join('&')
# => "var1=value1&var2=value2"
Normally when iterating over a Hash you should use names like k,v or key,value to be clear on what you're operating on.
If you're ever confused what's going on internally in an iteration loop, you can debug like this:
{
var1: "value1",
var2: "value2"
}.map do |pair|
puts pair.inspect
pair.join('=')
end.join('&')
That gives you this output:
[:var1, "value1"]
[:var2, "value2"]
That technique helps a lot. There's even the short-hand notation for this:
p pair
There are 2 method calls occurring here, the map and the join. One way to make it clearer and easier to understand is to separate the two methods and alter the keywords used in the map method. So instead of
{:a => "foo", :b => "bar"}.map{|a, b| "#{a}=#{b}"}.join('&')
Lets have
{:a => "foo", :b => "bar"}.map{|key, value| "#{key}=#{value}"}
This returns an array. #=> ["a=foo", "b=bar"]
Now:
["a=foo", "b=bar"].join('&')
produces a sting
#=> "a=foo&b=bar"
Map is iterating over the two key/value pairs and creating a string with the '=' between them and returns it in an array. It would iterate over all the key/value pairs in the harsh. Our example just has 2.
Join attaches the two elements of the array together with the '&' symbol between them and returns it as string. It would attach all elements of the array no matter its size.
What helped me to learn map and join is to open up the irb or pry and create a few hashes and arrays and play around with them. I highly recommend using unique names for your values that explain what is going on.
I hope this helps you.

Getting an array of hash values given specific keys

Given certain keys, I want to get an array of values from a hash (in the order I gave the keys). I had done this:
class Hash
def values_for_keys(*keys_requested)
result = []
keys_requested.each do |key|
result << self[key]
end
return result
end
end
I modified the Hash class because I do plan to use it almost everywhere in my code.
But I don't really like the idea of modifying a core class. Is there a builtin solution instead? (couldn't find any, so I had to write this).
You should be able to use values_at:
values_at(key, ...) → array
Return an array containing the values associated with the given keys. Also see Hash.select.
h = { "cat" => "feline", "dog" => "canine", "cow" => "bovine" }
h.values_at("cow", "cat") #=> ["bovine", "feline"]
The documentation doesn't specifically say anything about the order of the returned array but:
The example implies that the array will match the key order.
The standard implementation does things in the right order.
There's no other sensible way for the method to behave.
For example:
>> h = { :a => 'a', :b => 'b', :c => 'c' }
=> {:a=>"a", :b=>"b", :c=>"c"}
>> h.values_at(:c, :a)
=> ["c", "a"]
i will suggest you do this:
your_hash.select{|key,value| given_keys.include?(key)}.values

Ruby koans's assertion on test_hash_is_unordered

In Ruby 1.9 a Hash is sorted on the basis of order of insertion.
Why the Ruby koans's assertion on test_hash_is_unordered method returns true?
To me, the method's title is quite misleading... maybe it refers to the fact that Ruby will recognize 2 equal hashes that were created with different keys order insetions.
But, theorically, this kind of assertion:
hash1 = { :one => "uno", :two => "dos" }
hash2 = { :two => "dos", :one => "uno" }
assert_equal ___, hash1 == hash2
Should return false. Or not?
From the fine manual:
hsh == other_hash → true or false
Equality—Two hashes are equal if they each contain the same number of keys and if each key-value pair is equal to (according to Object#==) the corresponding elements in the other hash.
So two Hashes are considered equal if they have the same key/value pairs regardless of order.
The examples in the documentation even contain this:
h2 = { 7 => 35, "c" => 2, "a" => 1 }
h3 = { "a" => 1, "c" => 2, 7 => 35 }
h2 == h3 #=> true
Yes, the test_hash_is_unordered title is somewhat misleading as order isn't specifically being testing, only order with respect to equality is being demonstrated.
I think it's just a question of what 'unordered' means in such a context.
As a human, I would find it very difficult to compare two sets if they were not in order. The problem is that I can not easily match up identical elements and see if the sets are equivalent. Unless the sets happened to be listed in the same order, I would see them as unequal. This seems to be the conclusion that you came to also.
However, the thing is that the order of items in a mathematical concept of a set is simply unimportant. There is no way to 'order' the items, so two sets are identical if they contain the same elements. The set items are unordered, but they are are not 'out of order'; the concept of order does not apply.
I suppose that this is encapsulated entirely in the expression 'hash_is_unordered' but that was not immediately obvious to me, at least!

Convert array-of-hashes to a hash-of-hashes, indexed by an attribute of the hashes

I've got an array of hashes representing objects as a response to an API call. I need to pull data from some of the hashes, and one particular key serves as an id for the hash object. I would like to convert the array into a hash with the keys as the ids, and the values as the original hash with that id.
Here's what I'm talking about:
api_response = [
{ :id => 1, :foo => 'bar' },
{ :id => 2, :foo => 'another bar' },
# ..
]
ideal_response = {
1 => { :id => 1, :foo => 'bar' },
2 => { :id => 2, :foo => 'another bar' },
# ..
}
There are two ways I could think of doing this.
Map the data to the ideal_response (below)
Use api_response.find { |x| x[:id] == i } for each record I need to access.
A method I'm unaware of, possibly involving a way of using map to build a hash, natively.
My method of mapping:
keys = data.map { |x| x[:id] }
mapped = Hash[*keys.zip(data).flatten]
I can't help but feel like there is a more performant, tidier way of doing this. Option 2 is very performant when there are a very minimal number of records that need to be accessed. Mapping excels here, but it starts to break down when there are a lot of records in the response. Thankfully, I don't expect there to be more than 50-100 records, so mapping is sufficient.
Is there a smarter, tidier, or more performant way of doing this in Ruby?
Ruby <= 2.0
> Hash[api_response.map { |r| [r[:id], r] }]
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
However, Hash::[] is pretty ugly and breaks the usual left-to-right OOP flow. That's why Facets proposed Enumerable#mash:
> require 'facets'
> api_response.mash { |r| [r[:id], r] }
#=> {1=>{:id=>1, :foo=>"bar"}, 2=>{:id=>2, :foo=>"another bar"}}
This basic abstraction (convert enumerables to hashes) was asked to be included in Ruby long ago, alas, without luck.
Note that your use case is covered by Active Support: Enumerable#index_by
Ruby >= 2.1
[UPDATE] Still no love for Enumerable#mash, but now we have Array#to_h. It creates an intermediate array, but it's better than nothing:
> object = api_response.map { |r| [r[:id], r] }.to_h
Something like:
ideal_response = api_response.group_by{|i| i[:id]}
#=> {1=>[{:id=>1, :foo=>"bar"}], 2=>[{:id=>2, :foo=>"another bar"}]}
It uses Enumerable's group_by, which works on collections, returning matches for whatever key value you want. Because it expects to find multiple occurrences of matching key-value hits it appends them to arrays, so you end up with a hash of arrays of hashes. You could peel back the internal arrays if you wanted but could run a risk of overwriting content if two of your hash IDs collided. group_by avoids that with the inner array.
Accessing a particular element is easy:
ideal_response[1][0] #=> {:id=>1, :foo=>"bar"}
ideal_response[1][0][:foo] #=> "bar"
The way you show at the end of the question is another valid way of doing it. Both are reasonably fast and elegant.
For this I'd probably just go:
ideal_response = api_response.each_with_object(Hash.new) { |o, h| h[o[:id]] = o }
Not super pretty with the multiple brackets in the block but it does the trick with just a single iteration of the api_response.

Is saving a hash in another hash common practice?

I'd like to save some hash objects to a collection (in the Java world think of it as a List). I search online to see if there is a similar data structure in Ruby and have found none. For the moment being I've been trying to save hash a[] into hash b[], but have been having issues trying to get data out of hash b[].
Are there any built-in collection data structures on Ruby? If not, is saving a hash in another hash common practice?
If it's accessing the hash in the hash that is the problem then try:
>> p = {:name => "Jonas", :pos => {:x=>100.23, :y=>40.04}}
=> {:pos=>{:y=>40.04, :x=>100.23}, :name=>"Jonas"}
>> p[:pos][:x]
=> 100.23
There shouldn't be any problem with that.
a = {:color => 'red', :thickness => 'not very'}
b = {:data => a, :reason => 'NA'}
Perhaps you could explain what problems you're encountering.
The question is not completely clear, but I think you want to have a list (array) of hashes, right?
In that case, you can just put them in one array, which is like a list in Java:
a = {:a => 1, :b => 2}
b = {:c => 3, :d => 4}
list = [a, b]
You can retrieve those hashes like list[0] and list[1]
Lists in Ruby are arrays. You can use Hash.to_a.
If you are trying to combine hash a with hash b, you can use Hash.merge
EDIT: If you are trying to insert hash a into hash b, you can do
b["Hash a"] = a;
All the answers here so far are about Hash in Hash, not Hash plus Hash, so for reasons of completeness, I'll chime in with this:
# Define two independent Hash objects
hash_a = { :a => 'apple', :b => 'bear', :c => 'camel' }
hash_b = { :c => 'car', :d => 'dolphin' }
# Combine two hashes with the Hash#merge method
hash_c = hash_a.merge(hash_b)
# The combined hash has all the keys from both sets
puts hash_c[:a] # => 'apple'
puts hash_c[:c] # => 'car', not 'camel' since B overwrites A
Note that when you merge B into A, any keys that A had that are in B are overwritten.

Resources