Overriding the << method for instance variables - ruby

Let's suppose I have this class:
class Example
attr_accessor :numbers
def initialize(numbers = [])
#numbers = numbers
end
private
def validate!(number)
number >= 0 || raise(ArgumentError)
end
end
I would like to run the #validate! on any new number before pushing it into the numbers:
example = Example.new([1, 2, 3])
example.numbers # [1, 2, 3]
example.numbers << 4
example.numbers # [1, 2, 3, 4]
example.numbers << -1 # raise ArgumentError
Below is the best I can do but I'm really not sure about it.
Plus it works only on <<, not on push. I could add it but there is risk of infinite loop...).
Is there a more "regular" way to do it? I couldn't find any official process for that.
class Example
attr_accessor :numbers
def initialize(numbers = [])
#numbers = numbers
bind = self # so the instance is usable inside the singleton block
#numbers.singleton_class.send(:define_method, :<<) do |value|
# here, self refers to the #numbers array, so use bind to refer to the instance
bind.send(:validate!, value)
push(value)
end
end
private
def validate!(number)
number >= 0 || raise(ArgumentError)
end
end

Programming is a lot like real life: it is not a good idea to just run around and let strangers touch your private parts.
You are solving the wrong problem. You are trying to regulate what strangers can do when they play with your private parts, but instead you simply shouldn't let them touch your privates in the first place.
class Example
def initialize(numbers = [])
#numbers = numbers.clone
end
def numbers
#numbers.clone.freeze
end
def <<(number)
validate(number)
#numbers << number
self
end
private
def validate(number)
raise ArgumentError, "number must be non-negative, but is #{number}" unless number >= 0
end
end
example = Example.new([1, 2, 3])
example.numbers # [1, 2, 3]
example << 4
example.numbers # [1, 2, 3, 4]
example << -1 # raise ArgumentError
Let's look at all the changes I made one-by-one.
cloneing the initializer argument
You are taking a mutable object (an array) from an untrusted source (the caller). You should make sure that the caller cannot do anything "sneaky". In your first code, I can do this:
ary = [1, 2, 3]
example = Example.new(ary)
ary << -1
Since you simply took my array I handed you, I can still do to the array anything I want!
And even in the hardened version, I can do this:
ary = [1, 2, 3]
example = Example.new(ary)
class << ary
remove_method :<<
end
ary << -1
Or, I can freeze the array before I hand it to you, which makes it impossible to add a singleton method to it.
Even without the safety aspects, you should still do this, because you violate another real-life rule: Don't play with other people's toys! I am handing you my array, and then you mutate it. In the real world, that would be considered rude. In programming, it is surprising, and surprises breed bugs.
cloneing in the getter
This goes to the heart of the matter: the #numbers array is my private internal state. I should never hand that to strangers. If you don't hand the #numbers array out, then none of the problems you are protecting against can even occur.
You are trying to protect against strangers mutating your internal state, and the solution to that is simple: don't give strangers your internal state!
The freeze is technically not necessary, but I like it to make clear to the caller that this is just a view into the state of the example object, and they are only allowed to view what I want them to.
And again, even without the safety aspects, this would still be a bad idea: by exposing your internal implementation to clients, you can no longer change the internal implementation without breaking clients. If you change the array to a linked list, your clients are going to break, because they are used to getting an array that you can randomly index, but you can't randomly index a linked list, you always have to traverse it from the front.
The example is unfortunately too small and simple to judge that, but I would even question why you are handing out arrays in the first place. What do the clients want to do with those numbers? Maybe it is enough for them to just iterate over them, in which case you don't need to give them a whole array, just an iterator:
class Example
def each(...)
return enum_for(__callee__) unless block_given?
#numbers.each(...)
self
end
end
If the caller wants an array, they can still easily get one by calling to_a on the Enumerator.
Note that I return self. This has two reasons:
It is simply the contract of each. Every other object in Ruby that implements each returns self. If this were Java, this would be part of the Iterable interface.
I would actually accidentally leak the internal state that I work so hard to protect! As I just wrote: every implementation of each returns self, so what does #numbers.each return? It returns #numbers, which means my whole Example#each method returns #numbers which is exactly the thing I am trying to hide!
Implement << myself
Instead of handing out my internal state and have the caller append to it, I control what happens with my internal state. I implement my own version of << in which I can check for whatever I want and make sure no invariants of my object are violated.
Note that I return self. This has two reasons:
It is simply the contract of <<. Every other object in Ruby that implements << returns self. If this were Java, this would be part of the Appendable interface.
I would actually accidentally leak the internal state that I work so hard to protect! As I just wrote: every implementation of << returns self, so what does #numbers << number return? It returns #numbers, which means my whole Example#<< method returns #numbers which is exactly the thing I am trying to hide!
Drop the bang
In Ruby, method names that end with a bang mean "This method is more surprising than its non-bang counterpart". In your case, there is no non-bang counterpart, so the method shouldn't have a bang.
Don't abuse boolean operators for control flow
… or at least if you do, use the keyword versions (and / or) instead of the symbolic ones (&& / ||).
But really, you should void it altogether. do or die is idiomatic in Perl, but not in Ruby.
Technically, I have changed the return value of your method: it used to return true for a valid value, now it returns nil. But you ignore its return value anyway, so it doesn't matter.
validate is probably not a good name for the method, though. I would expect a method named validate to return a boolean result, not raise an exception.
An exceptional message
You should add messages to your exceptions that tell the programmer what went wrong. Another possibility is to create more specific exceptions, e.g.
class NegativeNumberError < ArgumentError; end
But that would be overkill in this case. In general, if you expect code to "read" your exception, create a new class, if you expect humans to read your exception, then a message is enough.
Encapsulation, Data Abstraction, Information Hiding
Those are three subtly different but related concepts, and they are among the most important concepts in programming. We always want hide our internal state and encapsulate it behind methods that we control.
Encapsulation to the max
Some people (including myself) don't particularly like even the object itself playing with its internal state. Personally, I even encapsulate private instance variables that are never exposed behind getters and setters. The reason is that this makes the class easier to subclass: you can override and specialize methods, but not instance variables. So, if I use the instance variable directly, a subclass cannot "hook" into those accesses.
Whereas if I use getter and setter methods, the subclass can override those (or only one of those).
Note: the example is too small and simple, so I had some real trouble coming up with a good name (there is not enough in the example to understand how the variable is used and what it means), so eventually, I just gave up, but you will see what I mean about using getters and setters:
class Example
class NegativeNumberError < ArgumentError; end
def initialize(numbers = [])
self.numbers_backing = numbers.clone
end
def each(...)
return enum_for(__callee__) unless block_given?
numbers_backing.each(...)
self
end
def <<(number)
validate(number)
numbers_backing << number
self
end
private
attr_accessor :numbers_backing
def validate(number)
raise NegativeNumberError unless number >= 0
end
end
example = Example.new([1, 2, 3])
example.each.to_a # [1, 2, 3]
example << 4
example.each.to_a # [1, 2, 3, 4]
example << -1 # raise NegativeNumberError

Related

Ruby method operating on hash without side effects

I want to create a function that adds a new element to a hash as below:
numbers_hash = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers["four"] = "cuatro"
end
add_new_value(numbers_hash)
I have read that immutability is important, and methods with side effects are not a good idea. Clearly this method is modifying the original input, how should I handle this?
Ruby is an OOP language with some functional patterns
Ruby is an object oriented language. Side-effects are important in OO. When you call a method on an object and that method modifies the object, that's a side-effect, and that's fine:
a = [1, 2, 3]
a.delete_at(1) # side effect in delete_at
# a is now [1, 3]
Ruby also allows a functional style, where data is transformed without side-effects. You've probably seen or used the map-reduce pattern:
a = ["1", "2", "3"]
a.map(&:to_i).reduce(&:+) # => 6
# a is unchanged
Command Query Separation
What may have confused you is a rule invented by Bertrand Meyers, the Command Query Separation Rule. This rule says that a method must either
Have a side effect, but no return value, or
Have no side effect, but return something
But not both. Note that although it's called a rule, in Ruby I would treat it as a strong guideline. There are times when violating this rule makes for better code, but in my experience this rule can be adhered to most of the time.
We have to clarify what we mean by "has a return value" in Ruby, since every Ruby method has a return value--the value of the last statement it executed (or nil if it was empty). What we mean is that the method has an intentional return value, one that is part of this method's contract and that the caller can be expected to use.
Here's an example of a method that has a side-effect and a return value, violating this rule:
# Open the valve if possible. Returns whether or not the valve is open.
def open_valve
#valve_open = true if #power_available
#valve_open
end
and how you'd separate that into two methods to adhere to this rule:
attr_reader :valve_open
def open_valve
#valve_open = true if #power_available
end
If you choose to adhere to this rule, you may find it useful to name side-effect methods with verb phrases, and returning-something methods with noun phrases. This makes it obvious from the start what kind of method you are dealing with, and makes naming methods easier.
What is a side-effect?
A side effect is something that changes the state of an object or or external entity like a file. This method that changes the state of its object has a side effect:
def register_error
#error_count += 1
end
This method that changes the state of its argument has a side effect:
def delete_ones(ary)
ary.delete(1)
end
This method that writes to a file has a side effect:
def log(line)
File.open(log_path, "a") { |f| f.puts(line) }
end
I would not necessarily agree that you should always avoid mutation an argument. Especially in the context of your example it seems like the mutation is the only purpose the method exists. Therefore it is not a side-effect IMO.
I would call it an unwanted side-effect when a method changes input parameters while doing something unrelated and that it is not obvious by the methods name that is also mutates input arguments.
You might prefer to return a new hash and keep the old hash unchanged:
numbers_hash_1 = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers.merge(four: "cuatro")
end
numbers_hash_2 = add_new_value(numbers_hash_1)
#=> {:one=>"uno", :two=>"dos", :three=>"tres", :four=>"cuatro"}
numbers_hash_1
#=> {:one=>"uno", :two=>"dos", :three=>"tres"}
Quote from the docs of Hash#merge:
merge(*other_hashes) → new_hash
Returns the new Hash formed by merging each of other_hashes into a copy of self.

Loop method until it returns falsey

I was trying to make my bubble sort shorter and I came up with this
class Array
def bubble_sort!(&block)
block = Proc.new { |a, b| a <=> b } unless block_given?
sorted = each_index.each_cons(2).none? do |i, next_i|
if block.call(self[i], self[next_i]) == 1
self[i], self[next_i] = self[next_i], self[i]
end
end until sorted
self
end
def bubble_sort(&prc)
self.dup.bubble_sort!(&prc)
end
end
I don't particularly like the thing with sorted = --sort code-- until sorted.
I just want to run the each_index.each_cons(s).none? code until it returns true. It's a weird situation that I use until, but the condition is a code I want to run. Any way, my try seems awkward, and ruby usually has a nice concise way of putting things. Is there a better way to do this?
This is just my opinion
have you ever read the ruby source code of each and map to understand what they do?
No, because they have a clear task expressed from the method name and if you test them, they will take an object, some parameters and then return a value to you.
For example if I want to test the String method split()
s = "a new string"
s.split("new")
=> ["a ", " string"]
Do you know if .split() takes a block?
It is one of the core ruby methods, but to call it I don't pass a block 90% of the times, I can understand what it does from the name .split() and from the return value
Focus on the objects you are using, the task the methods should accomplish and their return values.
I read your code and I can not refactor it, I hardly can understand what the code does.
I decided to write down some points, with possibility to follow up:
1) do not use the proc for now, first get the Object Oriented code clean.
2) split bubble_sort! into several methods, each one with a clear task
def ordered_inverted! (bubble_sort!), def invert_values, maybe perform a invert_values until sorted, check if existing methods already perform this sorting functionality
3) write specs for those methods, tdd will push you to keep methods simple and easy to test
4) If those methods do not belong to the Array class, include them in the appropriate class, sometimes overly complicated methods are just performing simple String operations.
5) Reading books about refactoring may actually help more then trying to force the usage of proc and functional programming when not necessary.
After looking into it further I'm fairly sure the best solution is
loop do
break if condition
end
Either that or the way I have it in the question, but I think the loop do version is clearer.
Edit:
Ha, a couple weeks later after I settled for the loop do solution, I stumbled into a better one. You can just use a while or until loop with an empty block like this:
while condition; end
until condition; end
So the bubble sort example in the question can be written like this
class Array
def bubble_sort!(&block)
block = Proc.new { |a, b| a <=> b } unless block_given?
until (each_index.each_cons(2).none? do |i, next_i|
if block.call(self[i], self[next_i]) == 1
self[i], self[next_i] = self[next_i], self[i]
end
end); end
self
end
def bubble_sort(&prc)
self.dup.bubble_sort!(&prc)
end
end

What is prefered way to loop in Ruby?

Why is each loop preferred over for loop in Ruby? Is there a difference in time complexity or are they just syntactically different?
Yes, these are two different ways of iterating over, But hope this calculation helps.
require 'benchmark'
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
a.each { |x| sum += x }
}
This takes 5.866932 sec
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
for x in a
sum += x
end
}
This takes 6.146521 sec.
Though its not a right way to do the benchmarking, there are some other constraints too. But on a single machine, each seems to be a bit faster than for.
The variable referencing an item in iteration is temporary and does not have significance outside of the iteration. It is better if it is hidden from outside of the iteration. With external iterators, such variable is located outside of the iteration block. In the following, e is useful only within do ... end, but is separated from the block, and written outside of it; it does not look easy to a programmer:
for e in [:foo, :bar] do
...
end
With internal iterators, the block variable is defined right inside the block, where it is used. It is easier to read:
[:foo, :bar].each do |e|
...
end
This visibility issue is not just for a programmer. With respect to visibility in the sense of scope, the variable for an external iterator is accessible outside of the iteration:
for e in [:foo] do; end
e # => :foo
whereas in internal iterator, a block variable is invisible from outside:
[:foo].each do |e|; end
e # => undefined local variable or method `e'
The latter is better from the point of view of encapsulation.
When you want to nest the loops, the order of variables would be somewhat backwards with external iterators:
for a in [[:foo, :bar]] do
for e in a do
...
end
end
but with internal iterators, the order is more straightforward:
[[:foo, :bar]].each do |a|
a.each do |e|
...
end
end
With external iterators, you can only use hard-coded Ruby syntax, and you also have to remember the matching between the keyword and the method that is internally called (for calls each), but for internal iterators, you can define your own, which gives flexibility.
each is the Ruby Way. Implements the Iterator Pattern that has decoupling benefits.
Check also this: "for" vs "each" in Ruby
An interesting question. There are several ways of looping in Ruby. I have noted that there is a design principle in Ruby, that when there are multiple ways of doing the same, there are usually subtle differences between them, and each case has its own unique use, its own problem that it solves. So in the end you end up needing to be able to write (and not just to read) all of them.
As for the question about for loop, this is similar to my earlier question whethe for loop is a trap.
Basically there are 2 main explicit ways of looping, one is by iterators (or, more generally, blocks), such as
[1, 2, 3].each { |e| puts e * 10 }
[1, 2, 3].map { |e| e * 10 )
# etc., see Array and Enumerable documentation for more iterator methods.
Connected to this way of iterating is the class Enumerator, which you should strive to understand.
The other way is Pascal-ish looping by while, until and for loops.
for y in [1, 2, 3]
puts y
end
x = 0
while x < 3
puts x; x += 1
end
# same for until loop
Like if and unless, while and until have their tail form, such as
a = 'alligator'
a.chop! until a.chars.last == 'g'
#=> 'allig'
The third very important way of looping is implicit looping, or looping by recursion. Ruby is extremely malleable, all classes are modifiable, hooks can be set up for various events, and this can be exploited to produce most unusual ways of looping. The possibilities are so endless that I don't even know where to start talking about them. Perhaps a good place is the blog by Yusuke Endoh, a well known artist working with Ruby code as his artistic material of choice.
To demonstrate what I mean, consider this loop
class Object
def method_missing sym
s = sym.to_s
if s.chars.last == 'g' then s else eval s.chop end
end
end
alligator
#=> "allig"
Aside of readability issues, the for loop iterates in the Ruby land whereas each does it from native code, so in principle each should be more efficient when iterating all elements in an array.
Loop with each:
arr.each {|x| puts x}
Loop with for:
for i in 0..arr.length
puts arr[i]
end
In the each case we are just passing a code block to a method implemented in the machine's native code (fast code), whereas in the for case, all code must be interpreted and run taking into account all the complexity of the Ruby language.
However for is more flexible and lets you iterate in more complex ways than each does, for example, iterating with a given step.
EDIT
I didn't come across that you can step over a range by using the step() method before calling each(), so the flexibility I claimed for the for loop is actually unjustified.

Defining method in Ruby with equals

Being new to Ruby, I'm having trouble explaining to myself the behavior around method definitions within Ruby.
The example is noted below...
class Foo
def do_something(action)
action.inspect
end
def do_something_else=action
action.inspect
end
end
?> f.do_something("drive")
=> "\"drive\""
?> f.do_something_else=("drive")
=> "drive"
The first example is self explanatory. What Im trying to understand is the behavior of the second example. Other than what looks to be one producing a string literal and the other is not, what is actually happening? Why would I use one over the other?
Generally, do_something is a getter, and do_something= is a setter.
class Foo
attr_accessor :bar
end
is equivalent to
class Foo
def bar
#bar
end
def bar=(value)
#bar = value
end
end
To answer your question about the difference in behavior, methods that end in = always return the right hand side of the expression. In this case returning action, not action.inspect.
class Foo
def do_something=(action)
"stop"
end
end
?> f = Foo.new
?> f.do_something=("drive")
=> "drive"
Both of your methods are actually being defined and called as methods. Quite a lot of things in Ruby can be defined as methods, even the operators such as +, -, * and /. Ruby allows methods to have three special notational suffixes. I made that phrase up all by myself. What I mean by notational suffixes is that the thing on the end of the method will indicate how that method is supposed to work.
Bang!
The first notational suffix is !. This indicates that the method is supposed to be destructive, meaning that it modifies the object that it's called on. Compare the output of these two scripts:
a = [1, 2, 3]
a.map { |x| x * x }
a
And:
a = [1, 2, 3]
a.map! { |x| x * x }
a
There's a one character difference between the two scripts, but they operate differently! The first one will still go through each element in the array and perform the operation inside the block, but the object in a will still be the same [1,2,3] that you started with.
In the second example, however, the a at the end will instead be [1, 4, 9] because map! modified the object in place!
Query
The second notational suffix is ?, and that indicates that a method is used to query an object about something, and means that the method is supposed to return true, false or in some extreme circumstances, nil.
Now, note that the method doesn't have to return true or false... it's just that it'd be very nice if it did that!
Proof:
def a?
true
end
def b?
"moo"
end
Calling a? will return true, and calling b? will return "moo". So there, that's query methods. The methods that should return true or false but sometimes can return other things because some developers don't like other developers.
Setters!
NOW we get to the meat of your (paraphrased) question: what does = mean on the end of a method?
That usually indicates that a method is going to set a particular value, as Erik already outlined before I finished typing this essay of an answer.
However, it may not set one, just like the query methods may not return true or false. It's just convention.
You can call that setter method like this also:
foo.something_else="value"
Or (my favourite):
foo.something_else = "value"
In theory, you can actually ignore the passed in value, just like you can completely ignore any arguments passed into any method:
def foo?(*args)
"moo"
end
>> foo?(:please, :oh, :please, :why, :"won't", :you, :use, :these, :arguments, :i, :got, :just, :for, :you, :question_mark?)
=> "moo"
Ruby supports all three syntaxes for setter methods, although it's very rare to see the one you used!
Well, I hope this answer's been roughly educational and that you understand more things about Ruby now. Enjoy!
You cannot define a return value for assignment methods. The return value is always the same as the value passed in, so that assignment chains (x = y = z = 3) will always work.
Typically, you would omit the brackets when you invoke the method, so that it behaves like a property:
my_value = f.do_something= "drive"
def do_something_else=action
action.inspect
end
This defines a setter method, so do_something_else appears as though we are initializing a attribute. So the value initialized is directly passed,

Data structures for a Blackjack game

I'm trying to learn how to use Ruby and as my first application I'd like to build a console based Blackjack game.
I'm not too familiar with the Ruby constructs and the only way I'll learn and feel at home is build things and learn from my mistakes.
I'm thinking of creating a Card class, and has a Stack class has a collection of Cards.
However I don't know exactly what built in type I need to use to hold these Card objects.
Here is my card class:
class Card
attr_accessor :number, :suit
def initialize(number, suit)
#number = number
#suit = suit
end
def to_s
"#{#number} of #{#suit}"
end
end
Since I'm coming from C#, I thought of using something like List Cards {get;set;} - does something like this exists in Ruby or maybe there is a better more rubyesque way to do this.
Ruby is a dynamic language, so it usually doesn't rely on strict compile-time type checking, thus making constructions, such as templatized List<Card> fairly useless. Ruby generally has one universal data type of ordered collections - Array, which closely mimics ArrayList concept of Java/C#.
So, unless you want Stack to be something special, you can just use Array as is:
#stack = []
#stack << Card.new(6, :spades)
#stack << Card.new(10, :hearts)
or extend Array in some way:
class Stack < Array
# some ultra-cool new methods
end
#stack = Stack.new
#stack << Card.new(...)
Ruby's builtin Array type has methods for acting like a stack:
a = [1, 2, 3] # arrays can be initialized with []
a.push(4) # a is now [1, 2, 3, 4]
a.pop # returns 4, a is now [1, 2, 3]
Two Ruby Quiz episodes were dedicated to blackjack. The solutions to the second one might give you some ideas: http://www.rubyquiz.com/quiz151.html

Resources