Extract value from optional nested object - ruby

How can I extract a value with a static key (:value) when an object contains one of several optional nested objects?
message_obj = {
'id': 123456,
'message': {
'value': 'some value',
}
}
callback_obj = {
'id': 234567,
'callback': {
'value': 'some value',
}
}
In this situation, I use the following expression:
some_obj[:message] ? some_obj[:message][:value] : some_obj[:callback][:value]
How can I extract the value from the nested object when we know the list of acceptable nested object names (e.g. [:message, :callback, :picture, ...])? The parent object contains only one nested object.

I would use Hash#values_at and then pick the value from the one hash that was returned:
message
.values_at(*[:message, :callback, :picture, ...])
.compact
.first[:value]
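A minimal runnable sketch of this approach, assuming symbol keys. The helper name nested_value and the constant CANDIDATE_KEYS are hypothetical and only illustrate the idea:

```ruby
# Hypothetical list of nested-object names; adjust to your data.
CANDIDATE_KEYS = %i[message callback picture].freeze

# Returns the :value of whichever nested hash is present, or nil if none is.
def nested_value(obj, keys = CANDIDATE_KEYS)
  nested = obj.values_at(*keys).compact.first
  nested && nested[:value]
end

message_obj  = { id: 123456, message:  { value: 'some message value' } }
callback_obj = { id: 234567, callback: { value: 'some callback value' } }

puts nested_value(message_obj)
puts nested_value(callback_obj)
p nested_value({ id: 1 })
```

The nil guard on the last line of the helper avoids a NoMethodError when none of the candidate keys is present.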

You could use dig
For example:
message_obj = {
'id': 123456,
'message': {
'value': 'some message value',
}
}
callback_obj = {
'id': 234567,
'callback': {
'value': 'some callback value',
}
}
objects = [message_obj, callback_obj]
objects.each do |obj|
message_value = obj.dig(:message, :value)
callback_value = obj.dig(:callback, :value)
puts "found a message value #{message_value}" if message_value
puts "found a callback value #{callback_value}" if callback_value
end
This would print:
found a message value some message value
found a callback value some callback value
The nice thing about dig is the paths can be any length, for example the following would also work.
objects = [message_obj, callback_obj]
paths = [
[:message, :value],
[:callback, :value],
[:foo, :bar],
[:some, :really, :long, :path, :to, :a, :value]
]
objects.each do |obj|
paths.each do |path|
value = obj.dig(*path)
puts value if value
end
end
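If only the first match is needed, as in the question, the loop above could be wrapped in a small helper. This is a sketch; first_dig and VALUE_PATHS are hypothetical names, not part of Ruby:

```ruby
# Hypothetical list of paths to try, in priority order.
VALUE_PATHS = [
  [:message, :value],
  [:callback, :value],
  [:picture, :value]
].freeze

# Try each path in order and return the first non-nil result, or nil.
def first_dig(obj, paths = VALUE_PATHS)
  paths.each do |path|
    value = obj.dig(*path)
    return value unless value.nil?
  end
  nil
end

puts first_dig({ id: 234567, callback: { value: 'some callback value' } })
p first_dig({ id: 1 })
```

Note that dig raises a TypeError when an intermediate value does not itself respond to dig (for example an Integer stored under :message), so this assumes the nested entries are hashes or arrays.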

Use Ruby's Pattern-Matching Feature with Hashes
This is a great opportunity to use the pattern matching features of Ruby 3. Some of these features were introduced as experimental and changed often in the Ruby 2.7 series, but most have now stabilized and are considered part of the core language, although I personally expect that they will continue to grow and improve, especially as they are more heavily adopted.
While still evolving, Ruby's pattern matching allows you to do things like:
objects = [message_obj, callback_obj, {}, nil]
objects.map do
case _1
in message: v
in callback: v
else v = nil
end
v.values.first if v
end.compact
#=> ["some message value", "some callback value"]
You simply define a case for each Hash key you want to match (very easy with top-level keys; a little harder for deeply-nested keys) and then bind them to a variable like v. You can then use any methods you like to operate on the bound variable, either inside or outside the pattern-matching case statement. In this case, since all patterns are bound to v, it makes more sense to invoke our methods on whatever instance of v was found. In your example, each :value key has a single value, so we can just use #first or #pop on v.values to get the results we want.
I threw in an else clause to set v to avoid NoMatchingPatternError, and a nil guard in the event that v == nil, but this is otherwise very straightforward, compact, and extremely extensible. Since I expect pattern matching, especially for Hash-based patterns, to continue to evolve in the Ruby 3 series, this is a good way to both explore the feature and to take a fairly readable and extensible approach to what might otherwise require a lot more looping and validation, or the use of a third-party gem method like Hashie#deep_find. Your mileage may vary.
Caveats
As of Ruby 3.1.1, the ability to use the find pattern on deeply-nested keys is somewhat limited, and the use of variable binding when using the alternates syntax currently throws an exception. As this is a fairly new feature in core, keep an eye on the changelog for Ruby's master branch (and yes, future readers, the branch is still labeled "master" at the time of this writing) or on the release notes for the upcoming Ruby 3.2.0 preview and beyond.
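When the nesting depth is known, a nested hash pattern can bind :value directly. The sketch below assumes Ruby 3.0 or later; value_from is a hypothetical helper name. Each key gets its own in clause because, as noted above, alternative patterns cannot bind variables:

```ruby
# Returns the nested :value for the first matching key, or nil.
def value_from(obj)
  case obj
  in { message: { value: } } then value
  in { callback: { value: } } then value
  else nil
  end
end

puts value_from({ id: 123456, message: { value: 'some message value' } })
puts value_from({ id: 234567, callback: { value: 'some callback value' } })
p value_from({})
p value_from(nil)
```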

Related

Ruby method operating on hash without side effects

I want to create a function that adds a new element to a hash as below:
numbers_hash = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers["four"] = "cuatro"
end
add_new_value(numbers_hash)
I have read that immutability is important, and methods with side effects are not a good idea. Clearly this method is modifying the original input, how should I handle this?
Ruby is an OOP language with some functional patterns
Ruby is an object oriented language. Side-effects are important in OO. When you call a method on an object and that method modifies the object, that's a side-effect, and that's fine:
a = [1, 2, 3]
a.delete_at(1) # side effect in delete_at
# a is now [1, 3]
Ruby also allows a functional style, where data is transformed without side-effects. You've probably seen or used the map-reduce pattern:
a = ["1", "2", "3"]
a.map(&:to_i).reduce(&:+) # => 6
# a is unchanged
Command Query Separation
What may have confused you is a rule invented by Bertrand Meyer, the Command Query Separation Rule. This rule says that a method must either
Have a side effect, but no return value, or
Have no side effect, but return something
But not both. Note that although it's called a rule, in Ruby I would treat it as a strong guideline. There are times when violating this rule makes for better code, but in my experience this rule can be adhered to most of the time.
We have to clarify what we mean by "has a return value" in Ruby, since every Ruby method has a return value--the value of the last statement it executed (or nil if it was empty). What we mean is that the method has an intentional return value, one that is part of this method's contract and that the caller can be expected to use.
Here's an example of a method that has a side-effect and a return value, violating this rule:
# Open the valve if possible. Returns whether or not the valve is open.
def open_valve
@valve_open = true if @power_available
@valve_open
end
and how you'd separate that into two methods to adhere to this rule:
attr_reader :valve_open
def open_valve
@valve_open = true if @power_available
end
If you choose to adhere to this rule, you may find it useful to name side-effect methods with verb phrases, and returning-something methods with noun phrases. This makes it obvious from the start what kind of method you are dealing with, and makes naming methods easier.
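To make the separation concrete, here is a small self-contained sketch. The Valve class, its constructor argument and the explicit nil return are assumptions added for illustration only:

```ruby
class Valve
  # Query: reports state and has no side effect.
  attr_reader :valve_open

  def initialize(power_available)
    @power_available = power_available
    @valve_open = false
  end

  # Command: changes state. Returning nil explicitly signals that
  # callers should not rely on a return value.
  def open_valve
    @valve_open = true if @power_available
    nil
  end
end

valve = Valve.new(true)
valve.open_valve
puts valve.valve_open
```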
What is a side-effect?
A side effect is something that changes the state of an object or an external entity like a file. This method that changes the state of its object has a side effect:
def register_error
@error_count += 1
end
This method that changes the state of its argument has a side effect:
def delete_ones(ary)
ary.delete(1)
end
This method that writes to a file has a side effect:
def log(line)
File.open(log_path, "a") { |f| f.puts(line) }
end
I would not necessarily agree that you should always avoid mutating an argument. Especially in the context of your example, the mutation seems to be the only reason the method exists. Therefore it is not a side-effect IMO.
I would call it an unwanted side-effect when a method changes its input parameters while doing something unrelated, and it is not obvious from the method's name that it also mutates its arguments.
You might prefer to return a new hash and keep the old hash unchanged:
numbers_hash_1 = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers.merge(four: "cuatro")
end
numbers_hash_2 = add_new_value(numbers_hash_1)
#=> {:one=>"uno", :two=>"dos", :three=>"tres", :four=>"cuatro"}
numbers_hash_1
#=> {:one=>"uno", :two=>"dos", :three=>"tres"}
Quote from the docs of Hash#merge:
merge(*other_hashes) → new_hash
Returns the new Hash formed by merging each of other_hashes into a copy of self.
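If you want accidental mutation to fail loudly, the input can also be frozen. A sketch, assuming Ruby 2.5 or later for FrozenError; add_value is a hypothetical helper name:

```ruby
# Returns a new hash with the extra pair; the argument is left untouched.
def add_value(numbers, key, value)
  numbers.merge(key => value)
end

numbers  = { one: "uno", two: "dos" }.freeze
extended = add_value(numbers, :three, "tres")

p extended
p numbers

begin
  numbers[:four] = "cuatro"   # in-place change on a frozen hash
rescue FrozenError => e
  puts "refused: #{e.class}"
end
```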

Why is a string key for a hash frozen?

According to the specification, strings that are used as a key to a hash are duplicated and frozen. Other mutable objects do not seem to have such special consideration. For example, with an array key, the following is possible.
a = [0]
h = {a => :a}
h.keys.first[0] = 1
h # => {[1] => :a}
h[[1]] # => nil
h.rehash
h[[1]] # => :a
On the other hand, a similar thing cannot be done with a string key.
s = "a"
h = {s => :s}
h.keys.first.upcase! # => RuntimeError: can't modify frozen String
Why is string designed to be different from other mutable objects when it comes to a hash key? Is there any use case where this specification becomes useful? What other consequences does this specification have?
I actually have a use case where the absence of such a special rule for strings would be useful. I read, with the yaml gem, a manually written YAML file that describes a hash. The keys may be strings, and I would like to allow case insensitivity in the original YAML file. When I read a file, I might get a hash like this:
h = {"foo" => :foo, "Bar" => :bar, "BAZ" => :baz}
And I want to normalize the keys to lower case to get this:
h = {"foo" => :foo, "bar" => :bar, "baz" => :baz}
by doing something like this:
h.keys.each(&:downcase!)
but that returns an error for the reason explained above.
In short it's just Ruby trying to be nice.
When a key is entered in a Hash, a special number is calculated, using the hash method of the key. The Hash object uses this number to retrieve the key. For instance, if you ask what the value of h['a'] is, the Hash calls the hash method of string 'a' and checks if it has a value stored for that number. The problem arises when someone (you) mutates the string object, so the string 'a' is now something else, let's say 'aa'. The Hash would not find a hash number for 'aa'.
The most common types of keys for hashes are strings, symbols and integers. Symbols and integers are immutable, but strings are not. Ruby tries to protect you from the confusing behaviour described above by dupping and freezing string keys. I guess it's not done for other types because there could be nasty performance side effects (think of large arrays).
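The dup-and-freeze behaviour is easy to observe from Ruby itself. A short sketch (the variable names are arbitrary):

```ruby
s = String.new("a")   # an explicitly mutable string
h = {}
h[s] = :value

key = h.keys.first
puts key.frozen?      # true: the stored key is frozen
puts s.frozen?        # false: the original string is untouched
puts key.equal?(s)    # false: the key is a separate copy

s << "a"              # mutating the original does not affect the hash
p h["a"]              # :value
```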
Immutable keys make sense in general because their hash codes will be stable.
This is why strings are specially-converted, in this part of MRI code:
if (RHASH(hash)->ntbl->type == &identhash || rb_obj_class(key) != rb_cString) {
st_insert(RHASH(hash)->ntbl, key, val);
}
else {
st_insert2(RHASH(hash)->ntbl, key, val, copy_str_key);
}
In a nutshell, in the string-key case, st_insert2 is passed a pointer to a function that will trigger the dup and freeze.
So if we theoretically wanted to support immutable lists and immutable hashes as hash keys, then we could modify that code to something like this:
VALUE key_klass;
key_klass = rb_obj_class(key);
if (key_klass == rb_cArray || key_klass == rb_cHash) {
st_insert2(RHASH(hash)->ntbl, key, val, freeze_obj);
}
else if (key_klass == rb_cString) {
st_insert2(RHASH(hash)->ntbl, key, val, copy_str_key);
}
else {
st_insert(RHASH(hash)->ntbl, key, val);
}
Where freeze_obj would be defined as:
static st_data_t
freeze_obj(st_data_t obj)
{
return (st_data_t)rb_obj_freeze((VALUE) obj);
}
So that would solve the specific inconsistency that you observed, where the array-key was mutable. However to be really consistent, more types of objects would need to be made immutable as well.
Not all types, however. For example, there'd be no point to freezing immediate objects like Fixnum because there is effectively only one instance of Fixnum corresponding to each integer value. This is why only String needs to be special-cased this way, not Fixnum and Symbol.
Strings are a special exception simply as a matter of convenience for Ruby programmers, because strings are very often used as hash keys.
Conversely, the reason that other object types are not frozen like this, which admittedly leads to inconsistent behavior, is mostly a matter of convenience for Matz & Company to not support edge cases. In practice, comparatively few people will use a container object like an array or a hash as a hash key. So if you do so, it's up to you to freeze before insertion.
Note that this is not strictly about performance, because the act of freezing a non-immediate object simply involves flipping the FL_FREEZE bit on the basic.flags bitfield that's present on every object. That's of course a cheap operation.
Also speaking of performance, note that if you are going to use string keys, and you are in a performance-critical section of code, you might want to freeze your strings before doing the insertion. If you don't, then a dup is triggered, which is a more-expensive operation.
Update: @sawa pointed out that leaving your array-key simply frozen means the original array might be unexpectedly immutable outside of the key-use context, which could also be an unpleasant surprise (although otoh it would serve you right for using an array as a hash-key, really). If you therefore surmise that dup + freeze is the way out of that, then you would in fact incur possible noticeable performance cost. On the third hand, leave it unfrozen altogether, and you get the OP's original weirdness. Weirdness all around. Another reason for Matz et al to defer these edge cases to the programmer.
See this thread on the ruby-core mailing list for an explanation (freakily, it happened to be the first mail I stumbled across when I opened up the mailing list in my mail app!).
I've no idea about the first part of your question, but here is a practical answer for the 2nd part:
new_hash = {}
h.each_pair do |k,v|
new_hash.merge!({k.downcase => v})
end
h.replace new_hash
There's lots of permutations of this kind of code,
Hash[ h.map{|k,v| [k.downcase, v] } ]
being another (and you're probably aware of these, but sometimes it's best to take the practical route:)
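On Ruby 2.5 and later, Hash#transform_keys expresses the same idea more directly. A sketch; downcase_keys is a hypothetical wrapper name:

```ruby
# Returns a new hash whose string keys are lower-cased; the input is unchanged.
def downcase_keys(hash)
  hash.transform_keys(&:downcase)
end

h = { "foo" => :foo, "Bar" => :bar, "BAZ" => :baz }
p downcase_keys(h)
```

If two keys differ only by case, they collapse into one entry and the later value wins.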
You are asking two different questions: theoretical and practical. Lain was the first to answer, but I would like to provide what I consider a proper, lazier solution to your practical question:
Hash.new { |hsh, key| # this block gets called only if a key is absent
downcased = key.to_s.downcase
unless downcased == key # if downcasing makes a difference
hsh[key] = hsh[downcased] if hsh.has_key? downcased # define a new hash pair
end # (otherwise just return nil)
}
The block used with Hash.new constructor is only invoked for those missing keys, that are actually requested. The above solution also accepts symbols.
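A usage sketch of this default-block idea, assuming the stored keys are already lower-case strings (the block only helps with lookups, not with keys stored in mixed case). The helper name case_insensitive_hash is hypothetical:

```ruby
# Builds a hash whose lookups fall back to the lower-cased key.
def case_insensitive_hash(data)
  lookup = Hash.new do |hsh, key|
    downcased = key.to_s.downcase
    # Cache the aliased key only when downcasing changes something
    # and the lower-case key actually exists.
    hsh[key] = hsh[downcased] if downcased != key && hsh.key?(downcased)
  end
  lookup.merge!(data)
end

settings = case_insensitive_hash("foo" => :foo, "bar" => :bar)
p settings["FOO"]
p settings[:Bar]
p settings["missing"]
```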
A very old question - but if anyone else is trying to answer the "how can I get around the hash keys are freezing strings" part of the question...
A simple trick you could do to solve the String special case is:
class MutableString < String
end
s = MutableString.new("a")
h = {s => :s}
h.keys.first.upcase! # => "A" (no error: a String subclass key is not duped and frozen)
puts h.inspect
This doesn't work unless you are the one creating the keys, and you must be careful that it doesn't cause problems with anything that strictly requires the class to be exactly String.

Class (Type) checking

Is there a good library (preferably gem) for doing class checking of an object? The difficult part is that I not only want to check the type of a simple object but want to go inside an array or a hash, if any, and check the classes of its components. For example, if I have an object:
object = [
"some string",
4732841,
[
"another string",
{:some_symbol => [1, 2, 3]}
],
]
I want to be able to check with various levels of detail, and if there is class mismatch, then I want it to return the position in some reasonable way. I don't yet have a clear idea of how the error (class mismatch) format should be, but something like this:
object.class_check(Array) # => nil (`nil` will mean the class matches)
object.class_check([String, Fixnum, Array]) # => nil
object.class_check([String, Integer, Array]) # => nil
object.class_check([String, String, Array]) # => [1] (This indicates the position of class mismatch)
object.class_check([String, Fixnum, [Symbol, Hash]]) # => [2,0] (meaning type mismatch at object[2][0])
If there is no such library, can someone (show me the direction in which I should) implement this? Probably, I should use kind_of? and recursive definition.
is_a? or kind_of? do what you are asking for... though you seem to know that already(?).
Here is something you can start with
class Object
def class_check(tclass)
return self.kind_of? tclass unless tclass.kind_of? Array
return false unless self.kind_of? Array
return false unless length == tclass.length
zip(tclass).each { | a, b | return false unless a.class_check(b) }
true
end
end
It will return true if the classes match and false otherwise.
Calculation of the indices is missing.
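One way the index calculation could be added is to thread the current path through the recursion. This is only a sketch, written as a standalone method (class_mismatch is a hypothetical name) rather than a patch to Object, and it uses Integer because Fixnum no longer exists in current Ruby versions:

```ruby
# Returns nil when obj matches spec, otherwise the index path of the
# first mismatch ([] means the mismatch is at the top level).
def class_mismatch(obj, spec, path = [])
  return(obj.is_a?(spec) ? nil : path) unless spec.is_a?(Array)
  return path unless obj.is_a?(Array) && obj.length == spec.length

  obj.each_with_index do |item, index|
    mismatch = class_mismatch(item, spec[index], path + [index])
    return mismatch if mismatch
  end
  nil
end

object = ["some string", 4732841, ["another string", { some_symbol: [1, 2, 3] }]]

p class_mismatch(object, Array)
p class_mismatch(object, [String, String, Array])
p class_mismatch(object, [String, Integer, [Symbol, Hash]])
```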

Testing for collection index-types (ie arguments to [] method)

What's an efficient, rubyesque way of testing if a collection supports string indexing?
Long version:
I'd like for a class of mine to normalize certain values. To achieve this, an instance takes a collection of values as its 'values' attribute. I'd like values= to accept both lists (integer indexed collections, including the built-in Array) and associative arrays (object indexed collections, such as Hash). If passed a list, it turns the list into an associative array by inverting keys & values. This means the method needs to distinguish lists from associative arrays. It should also work on any indexed collection, not just descendants of Array and Hash, so any sort of type sniffing the collection type is considered ugly and wrong. Type sniffing the index type, however...
Currently, I'm using exceptions to tell the difference, but I prefer to use exceptions for, well, exceptional circumstances rather than a general control structure. It's just a personal preference, one I'm not too attached to. If exceptions are the ruby way to solve this problem, please let me know.
def values=(values)
begin
values['']
@values = values.dup
rescue TypeError
@values = Hash[ values.zip((0..values.length-1).to_a) ]
end
@values.each_value { |v| @values[v] = v}
end
Note: a complete solution would take the transitive closure of values, but for now I can assume the keys & values of values are from different domains.
The point of all this is to enable code like:
toggle.values = [:off, :on]
toggle.normalise(:off) == 0
toggle.normalise(1) == 1
bool.values = {:off => 0, :false => 0, :no => 0,
:on => 1, :true => 1, :yes => 1}
bool.normalise(:yes) == 1
bool.normalise(0) == 0
PS. This is for a personal project, so elegance and the Ruby way are paramount. I'm looking for interesting approaches, especially if they illustrate an interesting concept (such as "exceptions can be used as behavioral tests").
Duck typing to the rescue!
hash = if collection.respond_to? :to_hash
collection.to_hash
elsif collection.respond_to? :to_ary
collection.to_ary.inject({}) { |_hash,(key,value)| _hash.merge!(key => value) }
elsif collection.respond_to? :inject
collection.inject({}) { |_hash,(key,value)| _hash.merge!(key => value) }
else
raise ArgumentError, "not a collection type I understand"
end
if want_dupe and collection.object_id == hash.object_id
hash = hash.dup
end
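For the list case in the question, where each element should map to its index, the same duck-typing idea can be sketched as follows. to_value_map is a hypothetical helper name, and each_with_index.to_h is a swapped-in technique, not what the code above does:

```ruby
# Hash-like collections are copied; list-like collections are inverted
# into element => index pairs.
def to_value_map(collection)
  if collection.respond_to?(:to_hash)
    collection.to_hash.dup
  elsif collection.respond_to?(:each_with_index)
    collection.each_with_index.to_h
  else
    raise ArgumentError, "not a collection type I understand"
  end
end

p to_value_map([:off, :on])
p to_value_map({ off: 0, false: 0, on: 1, true: 1 })
```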

How can I use C# style enumerations in Ruby?

I just want to know the best way to emulate a C# style enumeration in Ruby.
Specifically, I would like to be able to perform logical tests against the set of values given some variable. Example would be the state of a window: "minimized, maximized, closed, open"
If you need the enumerations to map to values (eg, you need minimized to equal 0, maximised to equal 100, etc) I'd use a hash of symbols to values, like this:
WINDOW_STATES = { :minimized => 0, :maximized => 100 }.freeze
The freeze (like nate says) stops you from breaking things in future by accident.
You can check if something is valid by doing this
WINDOW_STATES.keys.include?(window_state)
Alternatively, if you don't need any values, and just need to check 'membership' then an array is fine
WINDOW_STATES = [:minimized, :maximized].freeze
Use it like this
WINDOW_STATES.include?(window_state)
If your keys are going to be strings (like for example a 'state' field in a RoR app), then you can use an array of strings. I do this ALL THE TIME in many of our rails apps.
WINDOW_STATES = %w(minimized maximized open closed).freeze
This is pretty much what rails validates_inclusion_of validator is purpose built for :-)
Personal Note:
I don't like typing include? all the time, so I have this (it's only complicated because of the .in?(1, 2, 3) case):
class Object
# Lets us write array.include?(x) the other way round
# Also accepts multiple args, so we can do 2.in?( 1,2,3 ) without bothering with arrays
def in?( *args )
# if we have 1 arg, and it is a collection, act as if it were passed as a single value, UNLESS we are an array ourselves.
# The mismatch between checking for respond_to on the args vs checking for self.kind_of?Array is deliberate, otherwise
# arrays of strings break and ranges don't work right
args.length == 1 && args.first.respond_to?(:include?) && !self.kind_of?(Array) ?
args.first.include?( self ) :
args.include?( self )
end
end
This lets you type
window_state.in? WINDOW_STATES
It's not quite the same, but I'll often build a hash for this kind of thing:
STATES = {:open => 1, :closed => 2, :max => 3, :min => 4}.freeze()
Freezing the hash keeps me from accidentally modifying its contents.
Moreover, if you want to raise an error when accessing something that doesn't exist, you can use a default Proc to do this:
STATES = Hash.new { |hash, key| raise NameError, "#{key} is not allowed" }
STATES.merge!({:open => 1, :closed => 2, :max => 3, :min => 4}).freeze()
STATES[:other] # raises NameError
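A plain frozen hash read through Hash#fetch gives similar strictness without a default block, since fetch raises KeyError for unknown keys. A sketch; WINDOW_STATE_CODES and state_code are hypothetical names:

```ruby
WINDOW_STATE_CODES = { open: 1, closed: 2, max: 3, min: 4 }.freeze

# Raises KeyError for anything that is not a known state.
def state_code(state)
  WINDOW_STATE_CODES.fetch(state)
end

puts state_code(:open)

begin
  state_code(:other)
rescue KeyError => e
  puts "rejected: #{e.message}"
end
```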
I don't think Ruby supports true enums -- though, there are still solutions available.
Enumerations and Ruby
The easiest way to define an enum in Ruby is to use a class with constants.
class WindowState
Open = 1
Closed = 2
Max = 3
Min = 4
end
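With constants, the reverse lookup (value back to name) and a validity check can be built on Module#constants and const_get. A sketch; the module name WindowStateSketch and its helper methods are hypothetical:

```ruby
module WindowStateSketch
  Open   = 1
  Closed = 2
  Max    = 3
  Min    = 4

  # Name (as a symbol) of the constant holding the given value, or nil.
  def self.name_for(value)
    constants.find { |name| const_get(name) == value }
  end

  # True when the value belongs to one of the constants.
  def self.valid?(value)
    !name_for(value).nil?
  end
end

p WindowStateSketch.name_for(3)
p WindowStateSketch.valid?(99)
```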
Making a class or hash as others have said will work. However, the Ruby thing to do is to use symbols. Symbols in Ruby start with a colon and look like this:
greetingtype = :hello
They are kind of like objects that consist only of a name.

Resources