Ruby method operating on hash without side effects - ruby

I want to create a function that adds a new element to a hash as below:
numbers_hash = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers["four"] = "cuatro"
end
add_new_value(numbers_hash)
I have read that immutability is important, and methods with side effects are not a good idea. Clearly this method is modifying the original input, how should I handle this?

Ruby is an OOP language with some functional patterns
Ruby is an object oriented language. Side-effects are important in OO. When you call a method on an object and that method modifies the object, that's a side-effect, and that's fine:
a = [1, 2, 3]
a.delete_at(1) # side effect in delete_at
# a is now [1, 3]
Ruby also allows a functional style, where data is transformed without side-effects. You've probably seen or used the map-reduce pattern:
a = ["1", "2", "3"]
a.map(&:to_i).reduce(&:+) # => 6
# a is unchanged
Command Query Separation
What may have confused you is a rule invented by Bertrand Meyers, the Command Query Separation Rule. This rule says that a method must either
Have a side effect, but no return value, or
Have no side effect, but return something
But not both. Note that although it's called a rule, in Ruby I would treat it as a strong guideline. There are times when violating this rule makes for better code, but in my experience this rule can be adhered to most of the time.
We have to clarify what we mean by "has a return value" in Ruby, since every Ruby method has a return value--the value of the last statement it executed (or nil if it was empty). What we mean is that the method has an intentional return value, one that is part of this method's contract and that the caller can be expected to use.
Here's an example of a method that has a side-effect and a return value, violating this rule:
# Open the valve if possible. Returns whether or not the valve is open.
def open_valve
#valve_open = true if #power_available
#valve_open
end
and how you'd separate that into two methods to adhere to this rule:
attr_reader :valve_open
def open_valve
#valve_open = true if #power_available
end
If you choose to adhere to this rule, you may find it useful to name side-effect methods with verb phrases, and returning-something methods with noun phrases. This makes it obvious from the start what kind of method you are dealing with, and makes naming methods easier.
What is a side-effect?
A side effect is something that changes the state of an object or or external entity like a file. This method that changes the state of its object has a side effect:
def register_error
#error_count += 1
end
This method that changes the state of its argument has a side effect:
def delete_ones(ary)
ary.delete(1)
end
This method that writes to a file has a side effect:
def log(line)
File.open(log_path, "a") { |f| f.puts(line) }
end

I would not necessarily agree that you should always avoid mutation an argument. Especially in the context of your example it seems like the mutation is the only purpose the method exists. Therefore it is not a side-effect IMO.
I would call it an unwanted side-effect when a method changes input parameters while doing something unrelated and that it is not obvious by the methods name that is also mutates input arguments.
You might prefer to return a new hash and keep the old hash unchanged:
numbers_hash_1 = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers.merge(four: "cuatro")
end
numbers_hash_2 = add_new_value(numbers_hash_1)
#=> {:one=>"uno", :two=>"dos", :three=>"tres", :four=>"cuatro"}
numbers_hash_1
#=> {:one=>"uno", :two=>"dos", :three=>"tres"}
Quote from the docs of Hash#merge:
merge(*other_hashes) → new_hash
Returns the new Hash formed by merging each of other_hashes into a copy of self.

Related

Overriding the << method for instance variables

Let's suppose I have this class:
class Example
attr_accessor :numbers
def initialize(numbers = [])
#numbers = numbers
end
private
def validate!(number)
number >= 0 || raise(ArgumentError)
end
end
I would like to run the #validate! on any new number before pushing it into the numbers:
example = Example.new([1, 2, 3])
example.numbers # [1, 2, 3]
example.numbers << 4
example.numbers # [1, 2, 3, 4]
example.numbers << -1 # raise ArgumentError
Below is the best I can do but I'm really not sure about it.
Plus it works only on <<, not on push. I could add it but there is risk of infinite loop...).
Is there a more "regular" way to do it? I couldn't find any official process for that.
class Example
attr_accessor :numbers
def initialize(numbers = [])
#numbers = numbers
bind = self # so the instance is usable inside the singleton block
#numbers.singleton_class.send(:define_method, :<<) do |value|
# here, self refers to the #numbers array, so use bind to refer to the instance
bind.send(:validate!, value)
push(value)
end
end
private
def validate!(number)
number >= 0 || raise(ArgumentError)
end
end
Programming is a lot like real life: it is not a good idea to just run around and let strangers touch your private parts.
You are solving the wrong problem. You are trying to regulate what strangers can do when they play with your private parts, but instead you simply shouldn't let them touch your privates in the first place.
class Example
def initialize(numbers = [])
#numbers = numbers.clone
end
def numbers
#numbers.clone.freeze
end
def <<(number)
validate(number)
#numbers << number
self
end
private
def validate(number)
raise ArgumentError, "number must be non-negative, but is #{number}" unless number >= 0
end
end
example = Example.new([1, 2, 3])
example.numbers # [1, 2, 3]
example << 4
example.numbers # [1, 2, 3, 4]
example << -1 # raise ArgumentError
Let's look at all the changes I made one-by-one.
cloneing the initializer argument
You are taking a mutable object (an array) from an untrusted source (the caller). You should make sure that the caller cannot do anything "sneaky". In your first code, I can do this:
ary = [1, 2, 3]
example = Example.new(ary)
ary << -1
Since you simply took my array I handed you, I can still do to the array anything I want!
And even in the hardened version, I can do this:
ary = [1, 2, 3]
example = Example.new(ary)
class << ary
remove_method :<<
end
ary << -1
Or, I can freeze the array before I hand it to you, which makes it impossible to add a singleton method to it.
Even without the safety aspects, you should still do this, because you violate another real-life rule: Don't play with other people's toys! I am handing you my array, and then you mutate it. In the real world, that would be considered rude. In programming, it is surprising, and surprises breed bugs.
cloneing in the getter
This goes to the heart of the matter: the #numbers array is my private internal state. I should never hand that to strangers. If you don't hand the #numbers array out, then none of the problems you are protecting against can even occur.
You are trying to protect against strangers mutating your internal state, and the solution to that is simple: don't give strangers your internal state!
The freeze is technically not necessary, but I like it to make clear to the caller that this is just a view into the state of the example object, and they are only allowed to view what I want them to.
And again, even without the safety aspects, this would still be a bad idea: by exposing your internal implementation to clients, you can no longer change the internal implementation without breaking clients. If you change the array to a linked list, your clients are going to break, because they are used to getting an array that you can randomly index, but you can't randomly index a linked list, you always have to traverse it from the front.
The example is unfortunately too small and simple to judge that, but I would even question why you are handing out arrays in the first place. What do the clients want to do with those numbers? Maybe it is enough for them to just iterate over them, in which case you don't need to give them a whole array, just an iterator:
class Example
def each(...)
return enum_for(__callee__) unless block_given?
#numbers.each(...)
self
end
end
If the caller wants an array, they can still easily get one by calling to_a on the Enumerator.
Note that I return self. This has two reasons:
It is simply the contract of each. Every other object in Ruby that implements each returns self. If this were Java, this would be part of the Iterable interface.
I would actually accidentally leak the internal state that I work so hard to protect! As I just wrote: every implementation of each returns self, so what does #numbers.each return? It returns #numbers, which means my whole Example#each method returns #numbers which is exactly the thing I am trying to hide!
Implement << myself
Instead of handing out my internal state and have the caller append to it, I control what happens with my internal state. I implement my own version of << in which I can check for whatever I want and make sure no invariants of my object are violated.
Note that I return self. This has two reasons:
It is simply the contract of <<. Every other object in Ruby that implements << returns self. If this were Java, this would be part of the Appendable interface.
I would actually accidentally leak the internal state that I work so hard to protect! As I just wrote: every implementation of << returns self, so what does #numbers << number return? It returns #numbers, which means my whole Example#<< method returns #numbers which is exactly the thing I am trying to hide!
Drop the bang
In Ruby, method names that end with a bang mean "This method is more surprising than its non-bang counterpart". In your case, there is no non-bang counterpart, so the method shouldn't have a bang.
Don't abuse boolean operators for control flow
… or at least if you do, use the keyword versions (and / or) instead of the symbolic ones (&& / ||).
But really, you should void it altogether. do or die is idiomatic in Perl, but not in Ruby.
Technically, I have changed the return value of your method: it used to return true for a valid value, now it returns nil. But you ignore its return value anyway, so it doesn't matter.
validate is probably not a good name for the method, though. I would expect a method named validate to return a boolean result, not raise an exception.
An exceptional message
You should add messages to your exceptions that tell the programmer what went wrong. Another possibility is to create more specific exceptions, e.g.
class NegativeNumberError < ArgumentError; end
But that would be overkill in this case. In general, if you expect code to "read" your exception, create a new class, if you expect humans to read your exception, then a message is enough.
Encapsulation, Data Abstraction, Information Hiding
Those are three subtly different but related concepts, and they are among the most important concepts in programming. We always want hide our internal state and encapsulate it behind methods that we control.
Encapsulation to the max
Some people (including myself) don't particularly like even the object itself playing with its internal state. Personally, I even encapsulate private instance variables that are never exposed behind getters and setters. The reason is that this makes the class easier to subclass: you can override and specialize methods, but not instance variables. So, if I use the instance variable directly, a subclass cannot "hook" into those accesses.
Whereas if I use getter and setter methods, the subclass can override those (or only one of those).
Note: the example is too small and simple, so I had some real trouble coming up with a good name (there is not enough in the example to understand how the variable is used and what it means), so eventually, I just gave up, but you will see what I mean about using getters and setters:
class Example
class NegativeNumberError < ArgumentError; end
def initialize(numbers = [])
self.numbers_backing = numbers.clone
end
def each(...)
return enum_for(__callee__) unless block_given?
numbers_backing.each(...)
self
end
def <<(number)
validate(number)
numbers_backing << number
self
end
private
attr_accessor :numbers_backing
def validate(number)
raise NegativeNumberError unless number >= 0
end
end
example = Example.new([1, 2, 3])
example.each.to_a # [1, 2, 3]
example << 4
example.each.to_a # [1, 2, 3, 4]
example << -1 # raise NegativeNumberError

Naming Ruby methods

Given a class:
class Shell
attr_reader :spiral
def initialize spiral
#spiral = spiral
end
def ?????
# do stuff...
end
end
some_shell = Shell.new([[1,2],[4,3])
some_shell.spiral #=> [[1,2],
# [4,3]]
some_shell.????? #=> [1,2,3,4]
Does it make more sense to name ?????:
unwrap_spiral
or
unwrapped_spiral
It seems like unwrap_spiral is saying to some_shell, "I want you to unwrap that spiral" and unwrapped_spiral is saying to some_shell, "I want you to give me an unwrapped spiral".
I read in POODR:
The distinction between a message that asks for what the sender wants and a message that tells the receiver how to behave may seem subtle but the consequences are significant.
Which seems like it would be better to choose unwrapped_spiral.
Does that make sense?
Naming things is one of the classical two (out of three) hard things to do.
Jokes aside, I am guided as follows:
Am I inquiring about a property of the object? Then my method name is a noun that represents that property.
Am I converting the object to another form? Then my method name is "to_target". In your case it could be to_unwrapped.
Am I asking the object to carry on some processing internally? Then my method name is a verb. For example, "climb" to climb the spiral.
For completeness; Am I building an object from another? Then my factory method is usually called from "from_source". Hypothetically in your case, "from_unwrapped."
Ruby seems to place unusual emphasis on the intent of names and the implications of them. For example, unwrap_spiral implies that it might do the operation in-place unless there was a compantion method like unwrap_spiral! that made it clear it didn't.
unwrapped_spiral may be a shade too verbose. It's not clear why spiral factors into this so much when unwrapped might suffice.
Another thing to consider is organizing methods that operate on spiral under the same alphabetic ordering: spiral_unwrap or spiral_unwrapped.
I actually think there is already a method called flatten that seems to result in the same result.
http://apidock.com/ruby/Array/flatten
If you have a more advanced custom method, I would go with unspiral
My inclination would be that rather than Shell having an Array spiral, to have a Spiral class (which may just be a subclass of Array) which has an unwrap method (which in this case would just alias to Array#flatten).
class Spiral < Array
def unwrap
flatten
end
end
class Shell
attr_reader :spiral
def initialize(spiral)
# Or if your convention would allow, accept `spiral` as an Array
# and assign #spiral = Spiral.new(spiral)
#spiral = spiral
end
end
> shell = Shell.new Spiral.new([[1,2],[4,3]])
=> #<Shell:0x000000018286f0 #spiral=[[1, 2], [4, 3]]>
> shell.spiral
=> [[1, 2], [4, 3]]
> shell.spiral.unwrap
=> [1, 2, 4, 3]
The reason this makes sense because the concept you want to operate on in this case is the Spiral, which belongs to the Shell, rather than the shell itself. This opens you up to having additional methods which operate on the spiral itself. As the implementation of Spiral gets more complex, Shell wouldn't have to necessarily get any more complex - it could simply expose and operate on the Spiral's public interface.
Shell < Spiral
In your example, Shell only lives for #spiral, so you might :
define Spiral class
let Shell inherit from Spiral
define Spiral#unwrap (Array#flatten isn't called flattened)
use Shell(Spiral)#unwrap
Shell#spiral
If Shell is more than just its #spiral, you could :
define Spiral class anyway
define Spiral#unwrap
use Shell#spiral#unwrap : shell.spiral.unwrap

undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)

I'm trying to return a list of values based on user defined arguments, from hashes defined in the local environment.
def my_method *args
#initialize accumulator
accumulator = Hash.new(0)
#define hashes in local environment
foo=Hash["key1"=>["var1","var2"],"key2"=>["var3","var4","var5"]]
bar=Hash["key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"]]
baz=Hash["key6"=>["var13","var14","var15","var16"]]
#iterate over args and build accumulator
args.each do |x|
if foo.has_key?(x)
accumulator=foo.assoc(x)
elsif bar.has_key?(x)
accumulator=bar.assoc(x)
elsif baz.has_key?(x)
accumulator=baz.assoc(x)
else
puts "invalid input"
end
end
#convert accumulator to list, and return value
return accumulator = accumulator.to_a {|k,v| [k].product(v).flatten}
end
The user is to call the method with arguments that are keywords, and the function to return a list of values associated with each keyword received.
For instance
> my_method(key5,key6,key1)
=> ["var10","var11","var12","var13","var14","var15","var16","var1","var2"]
The output can be in any order. I received the following error when I tried to run the code:
undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)
Please would you point me how to troubleshoot this? In Terminal assoc performs exactly how I expect it to:
> foo.assoc("key1")
=> ["var1","var2"]
I'm guessing you're coming to Ruby from some other language, as there is a lot of unnecessary cruft in this method. Furthermore, it won't return what you expect for a variety of reasons.
`accumulator = Hash.new(0)`
This is unnecessary, as (1), you're expecting an array to be returned, and (2), you don't need to pre-initialize variables in ruby.
The Hash[...] syntax is unconventional in this context, and is typically used to convert some other enumerable (usually an array) into a hash, as in Hash[1,2,3,4] #=> { 1 => 2, 3 => 4}. When you're defining a hash, you can just use the curly brackets { ... }.
For every iteration of args, you're assigning accumulator to the result of the hash lookup instead of accumulating values (which, based on your example output, is what you need to do). Instead, you should be looking at various array concatenation methods like push, +=, <<, etc.
As it looks like you don't need the keys in the result, assoc is probably overkill. You would be better served with fetch or simple bracket lookup (hash[key]).
Finally, while you can call any method in Ruby with a block, as you've done with to_a, unless the method specifically yields a value to the block, Ruby will ignore it, so [k].product(v).flatten isn't actually doing anything.
I don't mean to be too critical - Ruby's syntax is extremely flexible but also relatively compact compared to other languages, which means it's easy to take it too far and end up with hard to understand and hard to maintain methods.
There is another side effect of how your method is constructed wherein the accumulator will only collect the values from the first hash that has a particular key, even if more than one hash has that key. Since I don't know if that's intentional or not, I'll preserve this functionality.
Here is a version of your method that returns what you expect:
def my_method(*args)
foo = { "key1"=>["var1","var2"],"key2"=>["var3","var4","var5"] }
bar = { "key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"] }
baz = { "key6"=>["var13","var14","var15","var16"] }
merged = [foo, bar, baz].reverse.inject({}, :merge)
args.inject([]) do |array, key|
array += Array(merged[key])
end
end
In general, I wouldn't define a method with built-in data, but I'm going to leave it in to be closer to your original method. Hash#merge combines two hashes and overwrites any duplicate keys in the original hash with those in the argument hash. The Array() call coerces an array even when the key is not present, so you don't need to explicitly handle that error.
I would encourage you to look up the inject method - it's quite versatile and is useful in many situations. inject uses its own accumulator variable (optionally defined as an argument) which is yielded to the block as the first block parameter.

Safely call methods

Is there a nice way how to write:
a = one.two.three.four
where "one" - assigned, "two" - nil. This statement makes an exception.
I want to have "a" nil if any of "two", "three", "four" are nil, otherwise to have result of "four" in "a".
Is it possible to do this without writing condition statements?
First of all, you need to find out if this code violates the Law of Demeter. If that is the case, then the correct solution to this problem is to not write code this way.
How would you find out if its breaking it? Here is one article that tries to explain how that applies to Ruby language.
In your case, you would break it down into multiple calls, with guard clauses around it. In the call one.two.three.four, we can assume that four is a property of three (rather, the object returned by three). And three would be a property of two. So you would add a method in two:
# Note: This is an over-simplified example
def three_four
return unless three
three.four
end
And in one you would have:
def two_three_four
return unless two
two.three_four
end
A more relevant example:
invoice.customer.primary_address.zipcode
So you would have Customer#primary_address_zipcode and Invoice#customer_primary_address_zip_code (Or a better abbreviated name, that would make more sense)
a = one.try(:two).try(:three).try(:four)

Where and when to use Lambda?

I am trying to understand why do we really need lambda or proc in ruby (or any other language for that matter)?
#method
def add a,b
c = a+b
end
#using proc
def add_proc a,b
f = Proc.new {|x,y| x + y }
f.call a,b
end
#using lambda function
def add_lambda a,b
f = lambda {|x,y| x + y}
f.call a,b
end
puts add 1,1
puts add_proc 1,2
puts add_lambda 1,3
I can do a simple addition using: 1. normal function def, 2. using proc and 3. using lambda.
But why and where use lambda in the real world? Any examples where functions cannot be used and lambda should be used.
It's true, you don't need anonymous functions (or lambdas, or whatever you want to call them). But there are a lot of things you don't need. You don't need classes—just pass all the instance variables around to ordinary functions. Then
class Foo
attr_accessor :bar, :baz
def frob(x)
bar = baz*x
end
end
would become
def new_Foo(bar,baz)
[bar,baz]
end
def bar(foo)
foo[0]
end
# Other attribute accessors stripped for brevity's sake
def frob(foo,x)
foo[0] = foo[1]*x
end
Similarly, you don't need any loops except for loop...end with if and break. I could go on and on.1 But you want to program with classes in Ruby. You want to be able to use while loops, or maybe even array.each { |x| ... }, and you want to be able to use unless instead of if not.
Just like these features, anonymous functions are there to help you express things elegantly, concisely, and sensibly. Being able to write some_function(lambda { |x,y| x + f(y) }) is much nicer than having to write
def temp(x,y)
x + f(y)
end
some_function temp
It's much bulkier to have to break off the flow of code to write out a deffed function, which then has to be given a useless name, when it's just as clear to write the operation in-line. It's true that there's nowhere you must use a lambda, but there are lots of places I'd much rather use a lambda.
Ruby solves a lot of the lambda-using cases with blocks: all the functions like each, map, and open which can take a block as an argument are basically taking a special-cased anonymous function. array.map { |x| f(x) + g(x) } is the same as array.map(&lambda { |x| f(x) + g(x) }) (where the & just makes the lambda "special" in the same way that the bare block is). Again, you could write out a separate deffed function every time—but why would you want to?
Languages other than Ruby which support that style of programming don't have blocks, but often support a lighter-weight lambda syntax, such as Haskell's \x -> f x + g x, or C#'s x => f(x) + g(x);2. Any time I have a function which needs to take some abstract behavior, such as map, or each, or on_clicked, I'm going to be thankful for the ability to pass in a lambda instead of a named function, because it's just that much easier. Eventually, you stop thinking of them as somehow special—they're about as exciting as literal syntax for arrays instead of empty().append(1).append(2).append(3). Just another useful part of the language.
1: In the degenerate case, you really only need eight instructions: +-<>[].,. <> move an imaginary "pointer" along an array; +- increment and decrement the integer in the current cell; [] perform a loop-while-non-zero; and ., do input and output. In fact, you really only need just one instruction, such as subleq a b c (subtract a from b and jump to c if the result is less than or equal to zero).
2: I've never actually used C#, so if that syntax is wrong, feel free to correct it.
Blocks are more-or-less the same thing
Well, in Ruby, one doesn't usually use lambda or proc, because blocks are about the same thing and much more convenient.
The uses are infinite, but we can list some typical cases. One normally thinks of functions as lower-level blocks performing a piece of the processing, perhaps written generally and made into a library.
But quite often one wants to automate the wrapper and provide a custom library. Imagine a function that makes an HTTP or HTTPS connection, or a straight TCP one, feeds the I/O to its client, and then closes the connection. Or perhaps just does the same thing with a plain old file.
So in Ruby we would put the function in a library and have it take a block for the user .. the client .. the "caller" to write his application logic.
In another language this would have to be done with a class that implements an interface, or a function pointer. Ruby has blocks, but they are all examples of a lambda-style design pattern.
1) It is just a convenience. You don't need to name certain blocks
special_sort(array, :compare_proc => lambda { |left, right| left.special_param <=> right.special_param }
(imagine if you had to name all these blocks)
2) #lambda is usually used to create clojures:
def generate_multiple_proc(cofactor)
lambda { |element| element * cofactor }
end
[1, 2, 3, 4].map(&generate_multiple_proc(2)) # => [2, 3, 5, 8]
[1, 2, 3, 4].map(&generate_multiple_proc(3)) # => [3, 6, 9, 12]
It comes down to style. Lambdas are a a declarative style, methods are an imperative style. Consider this:
Lambda, blocks, procs, are all different types of closure. Now the question is, when and why to use an anonymous closure. I can answer that - at least in ruby!
Closures contain the lexical context of where they were called from. If you call a method from within a method, you do not get the context of where the method was called. This is due to the way the object chain is stored in the AST.
A Closure (lambda) on the other hand, can be passed WITH lexical context through a method, allowing for lazy evaluation.
Also lambdas naturally lend themselves to recursion and enumeration.
In case of OOP, you should create a function in a class only if there should be such an operation on the class according to your domain modeling.
If you need a quick function which can be written inline such as for comparison etc, use a lambda
Also check these SO posts -
When to use lambda, when to use Proc.new?
C# Lambda expressions: Why should I use them?
When to use a lambda in Ruby on Rails?
They're used as "higher-order" functions. Basically, for cases where you pass one function to another, so that the receiving function can call the passed-in one according to its own logic.
This is common in Ruby for iteration, e.g. some_list.each { |item| ... } to do something to each item of some_list. Although notice here that we don't use the keyword lambda; as noted, a block is basically the same thing.
In Python (since we have a language-agnostic tag on this question) you can't write anything quite like a Ruby block, so the lambda keyword comes up more often. However, you can get a similar "shortcut" effect from list comprehensions and generator expressions.
I found this helpful in understanding the differences:
http://www.robertsosinski.com/2008/12/21/understanding-ruby-blocks-procs-and-lambdas/
But in general the point is sometimes your writing a method but you don't know what you're going to want to do at a certain point in that method, so you let the caller decide.
E.g.:
def iterate_over_two_arrays(arr1, arr2, the_proc)
arr1.each do |x|
arr2.each do |y|
# ok I'm iterating over two arrays, but I could do lots of useful things now
# so I'll leave it up to the caller to decide by passing in a proc
the_proc.call(x,y)
end
end
end
Then instead of writing a iterate_over_two_arrays_and_print_sum method and a iterate_over_two_arrays_and_print_product method you just call:
iterate_over_two_arrays([1,2,3], [4,5,6], Proc.new {|x,y| puts x + y }
or
iterate_over_two_arrays([1,2,3], [4,5,6], Proc.new {|x,y| puts x * y }
so it's more flexible.

Resources