Ruby initialization idiom using symbols - ruby

I'm doing some work in an existing Ruby code base, and I'm quite new to Ruby. I see this initialization idiom pretty often:
def initialize(input)
  @number = input[:number]
  @color = input[:color]
end
I guess I have two questions. One is whether this is a common idiom and if so, what's good about it? The second is, what is actually happening here? Is the implication that input is an array? A hash? The underlying mechanics aren't clear to me.

Yes, it's common. Generally when you see code like this, it means that input is a Hash. Occasionally someone might pass in an object that acts like a hash, though, and we can still expect this code to work. We can tell that input is definitely not an array, because arrays require Integers to be used as their index, but :number and :color are Symbols.
Whenever you see something starting with a colon like :number, that is a Ruby symbol.
Whenever you see something starting with an @ like @number, that is the name of an instance variable. Instance variables are used to store data inside objects for later use.
Suppose we have a class Foo defined like this:
class Foo
  def initialize(input)
    @number = input[:number]
    @color = input[:color]
  end
end
In this case, we can create a new object like this:
Foo.new(number: 4, color: 'red')
The code above is equivalent to:
input_hash = { :number => 4, :color => 'red' }
Foo.new(input_hash)
One nice thing about this pattern is that you can tell exactly what each input variable is being used for because it will be written next to a descriptive symbol, and also it doesn't matter what order you put the input variables in.
Improvements
If you want to improve this code, you might consider using keyword arguments, which Ruby has supported since version 2.0. Alternatively, you might consider Hash#fetch so you have more control over what happens when one of the keys is missing, instead of silently storing a nil value in your object. I would also recommend that you check the input hash for any unexpected keys and raise an exception if any are found.
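As a rough sketch of those suggestions, the two variants might look like the following. The class names KeywordFoo and FetchFoo, the ALLOWED_KEYS constant, and the default color are made up for illustration; they are not from the code base in the question.

```ruby
# Variant 1: keyword arguments. Required keywords (no default) need Ruby 2.1+.
class KeywordFoo
  attr_reader :number, :color

  # number: is required; color: falls back to a (hypothetical) default.
  def initialize(number:, color: 'red')
    @number = number
    @color = color
  end
end

# Variant 2: keep the single hash parameter, but validate it.
class FetchFoo
  ALLOWED_KEYS = [:number, :color].freeze # hypothetical whitelist

  attr_reader :number, :color

  def initialize(input)
    unexpected = input.keys - ALLOWED_KEYS
    raise ArgumentError, "unexpected keys: #{unexpected.inspect}" unless unexpected.empty?

    @number = input.fetch(:number)       # raises KeyError if :number is missing
    @color  = input.fetch(:color, 'red') # uses the default if :color is missing
  end
end

KeywordFoo.new(number: 4)              # color defaults to 'red'
FetchFoo.new(number: 4, color: 'blue')
```

With keyword arguments Ruby itself raises ArgumentError for missing or unknown keywords, so the manual key check is only needed in the hash-based variant.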

Is this a common idiom and if so, what's good about it?
It's not common, but it's acceptable. When there are more than n parameters being passed in, where "n" is often > 3, we should use a hash.
What's good about it? Look at the code. Is it simple and readable? Does it make sense that an input parameter of :number is being assigned to an instance variable of the same name?
what is actually happening here? Is the implication that input is an array? A hash? The underlying mechanics aren't clear to me.
It's a hash where :number and :color are keys.

Related

Specify Ruby method namespace for readability

This is a bit of a weird question, but I'm not quite sure how to look it up. In our project, we already have an existing concept of a "shift". There's a section of code that reads:
foo.shift
In this scenario, it's easy to read this as trying to access the shift variable of object foo. But it could also be Array#shift. Is there a way to specify which class we expect the method to belong to? I've tried variations such as:
foo.send(Array.shift)
Array.shift(foo)
to make it more obvious which method was being called, but I can't get it to work. Is there a way to be more explicit about which class the method you're trying to call belongs to to help in code readability?
On a fundamental level you shouldn't be concerned about this sort of thing, and you absolutely can't tell the Array shift method to operate on anything but an Array object. Many of the core Ruby classes are implemented in C and have optimizations that often depend on specific internals being present. There are safety measures in place to prevent you from trying to do something too crazy, like rebinding and applying methods of that sort arbitrarily.
Here's an example of two "shifty" objects to help illustrate a real-world situation and how that applies:
class CharacterArray < Array
  def initialize(*args)
    super(args.flat_map(&:chars))
  end
  def inspect
    join('').inspect
  end
end
class CharacterList < String
  def shift
    slice!(0, 1)
  end
end
You can smash Array#shift on to the first and it will work by pure chance because you're dealing with an Array. It won't work with the second one because that's not an Array, it's missing significant methods that the shift method likely depends on.
In practice it doesn't matter what you're using, they're both the same:
list_a = CharacterArray.new("test")
list_a.shift
# => "t"
list_a.shift
# => "e"
list_a << "y"
# => "sty"
list_b = CharacterList.new("test")
list_b.shift
# => "t"
list_b.shift
# => "e"
list_b << "y"
# => "sty"
These both implement the same interfaces, they both produce the same results, and as far as you're concerned, as the caller, that's good enough. This is the foundation of Duck Typing which is the philosophy Ruby has deeply embraced.
If you try the rebind trick on the CharacterList you're going to end up in trouble, it won't work, yet that class delivers on all your expectations as far as interface goes.
Edit: As Sergio points out, you can't use the rebind technique, Ruby abruptly explodes:
Array.instance_method(:shift).bind(list_b).call
# => Error: bind argument must be an instance of Array (TypeError)
If readability is the goal then that has 35 more characters than list_b.shift which is usually going dramatically in the wrong direction.
After some discussion in the comments, one solution is:
Array.instance_method(:shift).bind(foo).call
Super ugly, but gets across the idea that I wanted which was to completely specify which instance method was actually being called. Alternatives would be to rename the variable to something like foo_array or to call it as foo.to_a.shift.
The reason this is difficult is that Ruby is dynamically typed (duck typed), and this question is all about trying to pin a call to one specific class. That's why the solution is gross! Thanks to everybody for their input!
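To make the trade-off concrete, here is a small sketch of the bind approach wrapped in a helper. The name explicit_shift is invented for this example. It only works when the object really is an Array (or a subclass); anything else raises TypeError, as the earlier answer notes.

```ruby
# Calls Array#shift explicitly, whatever `obj.shift` would normally resolve to.
# Returns a message string instead of raising when obj is not an Array.
def explicit_shift(obj)
  Array.instance_method(:shift).bind(obj).call
rescue TypeError => e
  "TypeError: #{e.message}"
end

explicit_shift([1, 2, 3])          # returns 1
explicit_shift(String.new("test")) # returns a "TypeError: ..." message
```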

Why is there no `.split!` in Ruby?

It just seems pretty logical to have it when there's even a downcase!. Has anyone else run into this use case in Ruby?
For the curious, I'm trying to do this:
def some_method(foo)
foo.downcase!.split!(" ")
## do some stuff with foo later. ##
end
some_method("A String like any other")
Instead of this:
def some_method(foo)
foo = foo.downcase.split(" ")
## do some stuff with foo later. ##
end
some_method("A String like any other")
Which isn't a really big deal...but ! just seems cooler.
Why is there no .split! in Ruby?
It just seems pretty logical to have it when there's even a downcase!.
It may be logical, but it is impossible: objects cannot change their class or their identity in Ruby. You may be thinking of Smalltalk's become: which doesn't and cannot exist in Ruby. become: changes the identity of an object and thus can also change its class.
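A quick way to see the difference is to compare object identity before and after each call. This is only an illustrative sketch; the helper name compare_identity is made up.

```ruby
def compare_identity(text)
  s = String.new(text)   # unfrozen copy, so downcase! is allowed
  id_before = s.object_id
  s.downcase!            # mutates s in place
  words = s.split(" ")   # builds a brand-new Array
  {
    same_object_after_bang: s.object_id == id_before,
    receiver_class: s.class,
    split_result_class: words.class
  }
end

compare_identity("A String like any other")
# same_object_after_bang is true, receiver_class is String,
# split_result_class is Array
```

Note also that downcase! returns nil when the string was already lowercase, so chaining another call off its return value is fragile even where a bang variant exists.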
I don't see this "use case" as very important.
The only thing a "bang method" is doing is saving you the trouble of assigning a variable.
The reason "bang methods" are the exception instead of the rule is they can produce confusing results if you don't understand them.
i.e. if you write
a = "string"
def my_upcase(string)
string.upcase!
end
b = my_upcase(a)
then both a and b will hold the transformed value, even if you didn't intend to change a. Removing the exclamation point fixes this example, but if you're using mutable objects such as hashes and arrays you'll have to look out for this in other situations as well.
a = [1,2,3]
def get_last_element(array)
array.pop
end
b = get_last_element(a)
Since Array#pop has side effects, a is now [1, 2]. It has had its last element removed, which might not have been what you intended. You could replace .pop here with [-1] or .last to get rid of the side effect.
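The two versions side by side, as a minimal sketch (the method names are invented for the comparison):

```ruby
def last_with_pop(array)
  array.pop  # removes the element from the caller's array
end

def last_without_side_effect(array)
  array.last # only reads the element
end

a = [1, 2, 3]
last_without_side_effect(a) # returns 3; a is still [1, 2, 3]
last_with_pop(a)            # returns 3; a is now [1, 2]
```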
The exclamation point in a method name is essentially a conventional warning that there are side effects, although, as Array#pop shows, not every method with side effects carries one. This matters in functional programming, which favors side-effect-free code. Ruby borrows a good deal from functional programming (although it's very object oriented as well).
If your "use case" boils down to avoiding assigning a variable, that seems like a really minor discomfort.
For a more technical reason, though, see Jorg Mittag's answer. It's impossible to write a method which changes the class of self
this
def some_method(foo)
foo = foo.downcase.split(" ")
end
some_method("A String like any other")
is the same as this
def some_method(foo)
foo.downcase.split
end
some_method("A String like any other")
Actually, both of your methods return the same result. We can look at a few examples of methods that modify the caller.
array.map! returns the original array, modified in place
string.upcase! returns the original string, modified in place
However,
split returns an object of a different class from its receiver: it turns a string into an array.
Notice how the above examples only modify the content of the object, instead of changing its class.
This is most likely why there isn't a split! method. You could define a method with that name yourself, but it could only return a new array; it could not turn the receiver into one.
#split creates an array out of a string; you can't permanently mutate(!) the string into being an array. Because the method creates a new object from the source string, the only thing you need to do to keep the result is bind it to a variable.

Why can't I overwrite self in the Integer class?

I want to be able to write number.incr, like so:
num = 1; num.incr; num
#=> 2
The error I'm seeing states:
Can't change the value of self
If that's true, how do bang! methods work?
You cannot change the value of self
An object is a class pointer and a set of instance variables (note that the linked source is for an old version of Ruby, because it's dramatically simpler, and thus better for explanatory purposes).
"Pointing" at an object means you have a variable which stores the object's location in memory. Then to do anything with the object, you first go to the location in memory (we might say "follow the pointer") to get the object, and then do the thing (e.g. invoke a method, set an ivar).
All Ruby code everywhere is executing in the context of some object. This is where your instance variables get saved, it's where Ruby looks for methods that don't have a receiver (e.g. $stdout is the receiver in $stdout.puts "hi", and the current object is the receiver in puts "hi"). Sometimes you need to do something with the current object. The way to work with objects is through variables, but what variable points at the current object? There isn't one. To fill this need, the keyword self is provided.
self acts like a variable in that it points at the location of the current object. But it is not like a variable, because you can't assign it new value. If you could, the code after that point would suddenly be operating on a different object, which is confusing and has no benefits over just using a variable.
Also remember that the object is tracked by variables which store memory addresses. What is self = 2 supposed to mean? Does it only mean that the current code operates as if it were invoked on 2? Or does it mean that all variables pointing at the old object now have their values updated to point at the new one? It isn't really clear, but the former unnecessarily introduces an identity crisis, and the latter is prohibitively expensive and introduces situations where it's unclear what is correct (I'll go into that a bit more below).
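The restriction is enforced when the code is parsed, not when it runs. A small sketch that shows this by parsing the assignment at run time through eval (the helper name try_assigning_self is invented here):

```ruby
def try_assigning_self
  eval("self = 2") # parsed at run time, so the parse error can be rescued
  :assigned
rescue SyntaxError
  :syntax_error
end

try_assigning_self # returns :syntax_error
```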
You cannot mutate Fixnums
Some objects are special at the C level in Ruby (false, true, nil, fixnums, and symbols).
Variables pointing at them don't actually store a memory location. Instead, the address itself stores the type and identity of the object. Wherever it matters, Ruby checks to see if it's a special object (e.g. when looking up an instance variable), and then extracts the value from it.
So there isn't a spot in memory where the object 123 is stored. Which means self contains the idea of Fixnum 123 rather than a memory address like usual. As with variables, it will get checked for and handled specially when necessary.
Because of this, you cannot mutate the object itself (though it appears they keep a special global variable to allow you to set instance variables on things like Symbols).
Why are they doing all of this? To improve performance, I assume. A number stored in a register is just a series of bits (typically 32 or 64), which means there are hardware instructions for things like addition and multiplication. That is to say, the ALU is wired to perform these operations directly, rather than the algorithms being written in software, which would take far longer. By storing integers like this, they avoid the cost of storing and looking up the object in memory, and they gain the advantage that they can operate on the two values directly in hardware. Note, however, that there are still some additional costs in Ruby that you don't have in C (e.g. checking for overflow and converting the result to Bignum).
Bang methods
You can put a bang at the end of any method. It doesn't require the object to change, it's just that people usually try to warn you when you're doing something that could have unexpected side-effects.
class C
  def initialize(val)
    @val = val # => 12
  end # => :initialize
  def bang_method!
    "My val is: #{@val}" # => "My val is: 12"
  end # => :bang_method!
end # => :bang_method!
c = C.new 12 # => #<C:0x007fdac48a7428 @val=12>
c.bang_method! # => "My val is: 12"
c # => #<C:0x007fdac48a7428 @val=12>
Also, there are no bang methods on integers. It wouldn't fit with the paradigm.
# Fixnum was merged into Integer in Ruby 2.4, so Integer is used here
Integer.instance_methods.grep(/!$/) # => [:!]
# Okay, there's one, but it's actually a boolean negation
1.! # => false
# And it's not an Integer method, it's an inherited boolean operator
1.method(:!).owner # => BasicObject
# In reality you write it this way, and the interpreter translates it
!1 # => false
Alternatives
Make a wrapper object: I'm not going to advocate this one, but it's the closest to what you're trying to do. Basically, create your own class, which is mutable, and then make it look like an integer. There's a great blog post walking through this at http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html; it will get you 95% of the way there.
Don't depend directly on the value of a Fixnum: I can't give better advice than this without knowing what you're trying to do / why you feel this is a need.
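For the wrapper idea, a deliberately minimal sketch might look like this. The Counter class is hypothetical and makes no attempt to behave like a full numeric type, which is what the linked blog post covers.

```ruby
# A mutable box around an Integer. The Integer inside is never changed;
# incr just points @value at a different Integer.
class Counter
  attr_reader :value

  def initialize(value = 0)
    @value = value
  end

  def incr
    @value += 1
    self # return the same Counter so calls can be chained
  end

  def to_i
    @value
  end
end

num = Counter.new(1)
num.incr
num.value # 2
```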
Also, you should show your code when you ask questions like this. I misunderstood how you were approaching it for a long time.
It's simply impossible to change self to another object. self is the receiver of the message send. There can be only one.
If that's true, how do bang! methods work?
The bang (!) is simply part of the method name. It has absolutely no special meaning whatsoever. It is a convention among Ruby programmers to name surprising variants of less surprising methods with a bang, but that's just that: a convention.

Why is splat argument in ruby not used all the time?

I know splat arguments are used when we do not know the number of arguments that would be passed. I wanted to know whether I should use splat all the time. Are there any risks in using the splat argument whenever I pass on arguments?
The splat is great when the method you are writing has a genuine need to have an arbitrary number of arguments, for a method such as Hash#values_at.
In general though, if a method actually requires a fixed number of arguments it's a lot clearer to have named arguments than to pass arrays around and having to remember which position serves which purpose. For example:
def File.rename(old_name, new_name)
...
end
is clearer than:
def File.rename(*names)
...
end
You'd have to read the documentation to know whether the old name was first or second. Inside the method, File.rename would need to implement error handling around whether you had passed the correct number of arguments. So unless you need the splat, "normal" arguments are usually clearer.
Keyword arguments (new in ruby 2.0) can be even clearer at point of usage, although their use in the standard library is not yet widespread.
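A side-by-side sketch of the three styles. The rename_* methods are stand-ins invented for this comparison; they only build a string and do not touch the file system or reflect how File.rename is implemented.

```ruby
# Fixed positional parameters: Ruby checks the argument count for you.
def rename_positional(old_name, new_name)
  "#{old_name} -> #{new_name}"
end

# Splat: the method has to validate the count itself.
def rename_splat(*names)
  raise ArgumentError, "expected 2 names, got #{names.size}" unless names.size == 2

  "#{names[0]} -> #{names[1]}"
end

# Keyword arguments: the call site documents which name is which.
def rename_keyword(from:, to:)
  "#{from} -> #{to}"
end

rename_positional("a.txt", "b.txt")
rename_splat("a.txt", "b.txt")
rename_keyword(from: "a.txt", to: "b.txt")
```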
For a method that takes an arbitrary number of optional parameters, an options hash is the de facto solution:
def foo(options = {})
  # One way to do default values
  defaults = { bar: 'baz' }
  options = defaults.merge(options)
  # Another way
  options[:bar] ||= 'baz'
  bar = options[:bar]
  do_stuff_with(bar)
end
A good use of splat is when you're working with an array and want to use just the first element of the array and do something else with the rest of it. It's also much quicker than other methods. Jesse Farmer uses it this way in https://gist.github.com/jfarmer/d0f37717f6e7f6cebf72, and https://gist.github.com/TalkativeTree/6724065 shows some other ways I tried solving the spiraling array problem, with benchmarks to go with them.
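The first-and-rest pattern can be sketched as follows. The helper name head_and_rest is invented here, and this is not the code from either gist.

```ruby
# Splat on the left-hand side of an assignment: `first` gets the first
# element, `rest` collects everything else into a new array.
def head_and_rest(array)
  first, *rest = array
  [first, rest]
end

head_and_rest([1, 2, 3]) # returns [1, [2, 3]]
head_and_rest([])        # returns [nil, []]
```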
The problem with it is that it's not easily digestible. If you've seen and used it before, great, but it could slow down other people's understanding of what the code is doing. Even your own if you haven't looked at it in a while hah.
A splat collects the arguments into an array, so you need an extra step to get a single argument back out. Without a splat, you can use the argument directly:
def foo x
  @x = x
end
but if you collect it into an array with a splat, you need an extra step to take it out of the array:
def foo *x
  @x = x.first # or x.pop, x.shift, etc.
end
There is no reason to introduce an extra step unless necessary.

What is the semantics of this "do ... end"

I am new to Ruby and am learning by reading existing code.
I encounter this code:
label = TkLabel.new(@root) do
  text 'Current Score: '
  background 'lightblue'
end
What is the semantics of the syntax "do" above?
I played around with it and it seems like creating a TkLabel object then set its class variable text and background to be what specified in quote. However when I tried to do the same thing to a class I created, that didn't work.
Oh yeah, also about passing hash into function, such as
object.function('argument1'=>123, 'argument2'=>321)
How do I make a function that accepts that kind of argument?
Thanks in advance
What you're looking at is commonly referred to as a DSL, or Domain Specific Language.
At first glance it may not be clear why the code you see works, as text and background are seemingly undefined, but the trick here is that the code is actually evaluated in a scope in which they are defined. At its simplest, the code driving it might look something like this:
class TkLabel
  def initialize(root, &block)
    @root = root
    if block
      # the code inside the block in your app is actually
      # evaluated in the scope of the new instance of TkLabel
      instance_eval(&block)
    end
  end
  def text(value)
    # set the text
  end
  def background(value)
    # set the background
  end
end
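To try the same technique on a class of your own, a self-contained sketch could look like this. The Label class and its to_h method are hypothetical and have nothing to do with how Tk is actually implemented.

```ruby
class Label
  def initialize(&block)
    @attributes = {}
    # Run the block with self set to this Label, so bare calls such as
    # `text '...'` inside the block resolve to the methods below.
    instance_eval(&block) if block
  end

  def text(value)
    @attributes[:text] = value
  end

  def background(value)
    @attributes[:background] = value
  end

  def to_h
    @attributes.dup
  end
end

label = Label.new do
  text 'Current Score: '
  background 'lightblue'
end
label.to_h # a hash with :text and :background set
```

One consequence of instance_eval is that instance variables written inside the block belong to the Label, not to the object that wrote the block.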
Second question first: that's just a hash. Create a function that accepts a single argument, and treat it like a hash.
The "semantics" are that initialize accepts a block (the do...end bit), and some methods accepting string parameters to set specific attributes.
Without knowing how you tried to do it, it's difficult to go much beyond that. Here are a few, possible, references that might help you over some initial hurdles.
Ruby is pretty decent at making miniature, internal DSLs because of its ability to accept blocks and its forgiving (if arcane at times) syntax.
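For the second question, about passing a hash into a function, a minimal sketch is a method with one ordinary parameter that is then read like a hash. The method name describe is invented for the example.

```ruby
# The braces around the hash can be omitted at the call site when the
# hash is the last argument.
def describe(options)
  "argument1=#{options['argument1']}, argument2=#{options['argument2']}"
end

describe('argument1' => 123, 'argument2' => 321)
# returns "argument1=123, argument2=321"
```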

Resources