Why is there no `.split!` in Ruby? - ruby

It just seems pretty logical to have it when there's even a downcase!. Has anyone else run into this use case in Ruby?
For the curious, I'm trying to do this:
def some_method(foo)
foo.downcase!.split!(" ")
## do some stuff with foo later. ##
end
some_method("A String like any other")
Instead of this:
def some_method(foo)
foo = foo.downcase.split(" ")
## do some stuff with foo later. ##
end
some_method("A String like any other")
Which isn't a really big deal...but ! just seems cooler.

Why is there no .split! in Ruby?
It just seems pretty logical to have it when there's even a downcase!.
It may be logical, but it is impossible: objects cannot change their class or their identity in Ruby. You may be thinking of Smalltalk's become: which doesn't and cannot exist in Ruby. become: changes the identity of an object and thus can also change its class.

I don't see this "use case" as very important.
The only thing a "bang method" is doing is saving you the trouble of assigning a variable.
The reason "bang methods" are the exception instead of the rule is they can produce confusing results if you don't understand them.
i.e. if you write
a = "string"
def my_upcase(string)
string.upcase!
end
b = my_upcase(a)
then both a and b will have transformed value even if you didn't intend to change a. Removing the exclamation point fixes this example, but if you're using mutable objects such as hashes and arrays you'll have to look out for this in other situations as well.
a = [1,2,3]
def get_last_element(array)
array.pop
end
b = get_last_element(a)
Since Array#pop has side effects, a is now 1,2. It has the last element removed, which might not have been what you intended. You could replace .pop here with [-1] or .last to get rid of the side effect
The exclamation point in a method name is essentially warning you that there are side effects. This is important in the concept of functional programming, which prescribes side effect free code. Ruby is very much a functional programming language by design (although it's very object oriented as well).
If your "use case" boils down to avoiding assigning a variable, that seems like a really minor discomfort.
For a more technical reason, though, see Jorg Mittag's answer. It's impossible to write a method which changes the class of self

this
def some_method(foo)
foo = foo.downcase.split(" ")
end
some_method("A String like any other")
is the same as this
def some_method(foo)
foo.downcase.split
end
some_method("A String like any other")

Actually, both of your methods return the same result. We can look at a few examples of methods that modify the caller.
array.map! return a modified original array
string.upcase! return a modified original string
However,
split modifies the class of the caller, changing a string to an array.
Notice how the above examples only modify the content of the object, instead of changing its class.
This is most likely why there isn't a split! method, although it's pretty easy to define one yourself.

#split creates an array out of a string, you can't permanently mutate(!) the string into being an array. Because the method is creating a new form from the source information(string), the only thing you need to do to make it permanent, is to bind it to a variable.

Related

Specify Ruby method namespace for readability

This is a bit of a weird question, but I'm not quite sure how to look it up. In our project, we already have an existing concept of a "shift". There's a section of code that reads:
foo.shift
In this scenario, it's easy to read this as trying to access the shift variable of object foo. But it could also be Array#shift. Is there a way to specify which class we expect the method to belong to? I've tried variations such as:
foo.send(Array.shift)
Array.shift(foo)
to make it more obvious which method was being called, but I can't get it to work. Is there a way to be more explicit about which class the method you're trying to call belongs to to help in code readability?
On a fundamental level you shouldn't be concerned about this sort of thing and you absolutely can't tell the Array shift method to operate on anything but an Array object. Many of the core Ruby classes are implemented in C and have optimizations that often depend on specific internals being present. There's safety measures in place to prevent you from trying to do something too crazy, like rebinding and applying methods of that sort arbitrarily.
Here's an example of two "shifty" objects to help illustrate a real-world situation and how that applies:
class CharacterArray < Array
def initialize(*args)
super(args.flat_map(&:chars))
end
def inspect
join('').inspect
end
end
class CharacterList < String
def shift
slice!(0, 1)
end
end
You can smash Array#shift on to the first and it will work by pure chance because you're dealing with an Array. It won't work with the second one because that's not an Array, it's missing significant methods that the shift method likely depends on.
In practice it doesn't matter what you're using, they're both the same:
list_a = CharacterArray.new("test")
list_a.shift
# => "t"
list_a.shift
# => "e"
list_a << "y"
# => "sty"
list_b = CharacterList.new("test")
list_b.shift
# => "t"
list_b.shift
# => "e"
list_b << "y"
# => "sty"
These both implement the same interfaces, they both produce the same results, and as far as you're concerned, as the caller, that's good enough. This is the foundation of Duck Typing which is the philosophy Ruby has deeply embraced.
If you try the rebind trick on the CharacterList you're going to end up in trouble, it won't work, yet that class delivers on all your expectations as far as interface goes.
Edit: As Sergio points out, you can't use the rebind technique, Ruby abruptly explodes:
Array.instance_method(:shift).bind(list_b).call
# => Error: bind argument must be an instance of Array (TypeError)
If readability is the goal then that has 35 more characters than list_b.shift which is usually going dramatically in the wrong direction.
After some discussion in the comments, one solution is:
Array.instance_method(:shift).bind(foo).call
Super ugly, but gets across the idea that I wanted which was to completely specify which instance method was actually being called. Alternatives would be to rename the variable to something like foo_array or to call it as foo.to_a.shift.
The reason this is difficult is that Ruby is not strongly-typed, and this question is all about trying to bring stronger typing to it. That's why the solution is gross! Thanks to everybody for their input!

How to reset value of local variable within loop?

I'd like to point out I tried quite extensively to find a solution for this and the closest I got was this. However I couldn't see how I could use map to solve my issue here. I'm brand new to Ruby so please bear that in mind.
Here's some code I'm playing with (simplified):
def base_word input
input_char_array = input.split('') # split string to array of chars
#file.split("\n").each do |dict_word|
input_text = input_char_array
dict_word.split('').each do |char|
if input_text.include? char.downcase
input_text.slice!(input_text.index(char))
end
end
end
end
I need to reset the value of input_text back to the original value of input_char_array after each cycle, but from what I gather since Ruby is reference-based, the modifications I make with the line input_text.slice!(input_text.index(char)) are reflected back in the original reference, and I end up assigning input_text to an empty array fairly quickly as a result.
How do I mitigate that? As mentioned I've tried to use .map but maybe I haven't fully wrapped my head around how I ought to go about it.
You can get an independent reference by cloning the array. This, obviously, has some RAM usage implications.
input_text = input_char_array.dup
The Short and Quite Frankly Not Very Good Answer
Using slice! overwrites the variable in place, equivalent to
input_text = input_text.slice # etc.
If you use plain old slice instead, it won't overwrite input_text.
The Longer and Quite Frankly Much Better Answer
In Ruby, code nested four levels deep is often a smell. Let's refactor, and avoid the need to reset a loop at all.
Instead of splitting the file by newline, we'll use Ruby's built-in file handling module to read through the lines. Memoizing it (the ||= operator) may prevent it from reloading the file each time it's referenced, if we're running this more than once.
def dictionary
#dict ||= File.open('/path/to/dictionary')
end
We could also immediately make all the words lowercase when we open the file, since every character is downcased individually in the original example.
def downcased_dictionary
#dict ||= File.open('/path/to/dictionary').each(&:downcase)
end
Next, we'll use Ruby's built-in file and string functions, including #each_char, to do the comparisons and output the results. We don't need to convert any inputs into Arrays (at all!), because #include? works on strings, and #each_char iterates over the characters of a string.
We'll decompose the string-splitting into its own method, so the loop logic and string logic can be understood more clearly.
Lastly, by using #slice instead of #slice!, we don't overwrite input_text and entirely avoid the need to reset the variable later.
def base_word(input)
input_text = input.to_s # Coerce in case it's not a string
# Read through each line in the dictionary
dictionary.each do |word|
word.each_char {|char| slice_base_word(input_text, char) }
end
end
def slice_base_word(input, char)
input.slice(input.index(char)) if input.include?(char)
end

Why is splat argument in ruby not used all the time?

I know splat arguments are used when we do not know the number of arguments that would be passed. I wanted to know whether I should use splat all the time. Are there any risks in using the splat argument whenever I pass on arguments?
The splat is great when the method you are writing has a genuine need to have an arbitrary number of arguments, for a method such as Hash#values_at.
In general though, if a method actually requires a fixed number of arguments it's a lot clearer to have named arguments than to pass arrays around and having to remember which position serves which purpose. For example:
def File.rename(old_name, new_name)
...
end
is clearer than:
def File.rename(*names)
...
end
You'd have to read the documentation to know whether the old name was first or second. Inside the method, File.rename would need to implement error handling around whether you had passed the correct number of arguments. So unless you need the splat, "normal" arguments are usually clearer.
Keyword arguments (new in ruby 2.0) can be even clearer at point of usage, although their use in the standard library is not yet widespread.
For a method that would take an arbitrary amount of parameters, options hash is a de facto solution:
def foo(options = {})
# One way to do default values
defaults = { bar: 'baz' }
options = defaults.merge(options)
# Another way
options[:bar] ||= 'baz'
bar = options[bar]
do_stuff_with(bar)
end
A good use of splat is when you're working with an array and want to use just the first argument of the array and do something else with the rest of the array. It's much quicker as well than other methods. Here's a smart guy Jesse Farmer's use of it https://gist.github.com/jfarmer/d0f37717f6e7f6cebf72 and here is an example of some other ways I tried solving the spiraling array problem and some benchmarks to go with it. https://gist.github.com/TalkativeTree/6724065
The problem with it is that it's not easily digestible. If you've seen and used it before, great, but it could slow down other people's understanding of what the code is doing. Even your own if you haven't looked at it in a while hah.
Splat lets the argument be interpreted as an array, and you would need an extra step to take it out. Without splat, you do not need special things to do to access the argument:
def foo x
#x = x
end
but if you put it in an array using splat, you need extra step to take it out of the array:
def foo *x
#x = x.first # or x.pop, x.shift, etc.
end
There is no reason to introduce an extra step unless necessary.

What is the semantics of this "do ... end"

I am new to Ruby and am learning from reading an already written code.
I encounter this code:
label = TkLabel.new(#root) do
text 'Current Score: '
background 'lightblue'
end
What is the semantics of the syntax "do" above?
I played around with it and it seems like creating a TkLabel object then set its class variable text and background to be what specified in quote. However when I tried to do the same thing to a class I created, that didn't work.
Oh yeah, also about passing hash into function, such as
object.function('argument1'=>123, 'argument2'=>321)
How do I make a function that accepts that kind of argument?
Thanks in advance
What you're looking at is commonly referred to as a DSL, or Domain Specific Language.
At first glance it may not be clear why the code you see works, as text and background are seemingly undefined, but the trick here is that that code is actually evaluated in a scope in which they are. At it's simplest, the code driving it might look something like this:
class TkLabel
def initialize(root, &block)
#root = root
if block
# the code inside the block in your app is actually
# evaluated in the scope of the new instance of TkLabel
instance_eval(&block)
end
end
def text(value)
# set the text
end
def background(value)
# set the background
end
end
Second question first: that's just a hash. Create a function that accepts a single argument, and treat it like a hash.
The "semantics" are that initialize accepts a block (the do...end bit), and some methods accepting string parameters to set specific attributes.
Without knowing how you tried to do it, it's difficult to go much beyond that. Here are a few, possible, references that might help you over some initial hurdles.
Ruby is pretty decent at making miniature, internal DSLs because of its ability to accepts blocks and its forgiving (if arcane at times) syntax.

Why are exclamation marks used in Ruby methods?

In Ruby some methods have a question mark (?) that ask a question like include? that ask if the object in question is included, this then returns a true/false.
But why do some methods have exclamation marks (!) where others don't?
What does it mean?
In general, methods that end in ! indicate that the method will modify the object it's called on. Ruby calls these as "dangerous methods" because they change state that someone else might have a reference to. Here's a simple example for strings:
foo = "A STRING" # a string called foo
foo.downcase! # modifies foo itself
puts foo # prints modified foo
This will output:
a string
In the standard libraries, there are a lot of places you'll see pairs of similarly named methods, one with the ! and one without. The ones without are called "safe methods", and they return a copy of the original with changes applied to the copy, with the callee unchanged. Here's the same example without the !:
foo = "A STRING" # a string called foo
bar = foo.downcase # doesn't modify foo; returns a modified string
puts foo # prints unchanged foo
puts bar # prints newly created bar
This outputs:
A STRING
a string
Keep in mind this is just a convention, but a lot of Ruby classes follow it. It also helps you keep track of what's getting modified in your code.
The exclamation point means many things, and sometimes you can't tell a lot from it other than "this is dangerous, be careful".
As others have said, in standard methods it's often used to indicate a method that causes an object to mutate itself, but not always. Note that many standard methods change their receiver and don't have an exclamation point (pop, shift, clear), and some methods with exclamation points don't change their receiver (exit!). See this article for example.
Other libraries may use it differently. In Rails an exclamation point often means that the method will throw an exception on failure rather than failing silently.
It's a naming convention but many people use it in subtly different ways. In your own code a good rule of thumbs is to use it whenever a method is doing something "dangerous", especially when two methods with the same name exist and one of them is more "dangerous" than the other. "Dangerous" can mean nearly anything though.
This naming convention is lifted from Scheme.
1.3.5 Naming conventions
By convention, the names of procedures
that always return a boolean value
usually end in ``?''. Such procedures
are called predicates.
By convention, the names of procedures
that store values into previously
allocated locations (see section 3.4)
usually end in ``!''. Such procedures
are called mutation procedures. By
convention, the value returned by a
mutation procedure is unspecified.
! typically means that the method acts upon the object instead of returning a result. From the book Programming Ruby:
Methods that are "dangerous," or modify the receiver, might be named with a trailing "!".
It is most accurate to say that methods with a Bang! are the more dangerous or surprising version. There are many methods that mutate without a Bang such as .destroy and in general methods only have bangs where a safer alternative exists in the core lib.
For instance, on Array we have .compact and .compact!, both methods mutate the array, but .compact! returns nil instead of self if there are no nil's in the array, which is more surprising than just returning self.
The only non-mutating method I've found with a bang is Kernel's .exit! which is more surprising than .exit because you cannot catch SystemExit while the process is closing.
Rails and ActiveRecord continues this trend in that it uses bang for more 'surprising' effects like .create! which raises errors on failure.
From themomorohoax.com:
A bang can used in the below ways, in order of my personal preference.
An active record method raises an error if the method does not do
what it says it will.
An active record method saves the record or a method saves an
object (e.g. strip!)
A method does something “extra”, like posts to someplace, or does
some action.
The point is: only use a bang when you’ve really thought about whether
it’s necessary, to save other developers the annoyance of having to
check why you are using a bang.
The bang provides two cues to other developers.
that it’s not necessary to save the object after calling the
method.
when you call the method, the db is going to be changed.
Simple explanation:
foo = "BEST DAY EVER" #assign a string to variable foo.
=> foo.downcase #call method downcase, this is without any exclamation.
"best day ever" #returns the result in downcase, but no change in value of foo.
=> foo #call the variable foo now.
"BEST DAY EVER" #variable is unchanged.
=> foo.downcase! #call destructive version.
=> foo #call the variable foo now.
"best day ever" #variable has been mutated in place.
But if you ever called a method downcase! in the explanation above, foo would change to downcase permanently. downcase! would not return a new string object but replace the string in place, totally changing the foo to downcase.
I suggest you don't use downcase! unless it is totally necessary.
!
I like to think of this as an explosive change that destroys all that has gone before it. Bang or exclamation mark means that you are making a permanent saved change in your code.
If you use for example Ruby's method for global substitutiongsub!the substitution you make is permanent.
Another way you can imagine it, is opening a text file and doing find and replace, followed by saving. ! does the same in your code.
Another useful reminder if you come from the bash world is sed -i has this similar effect of making permanent saved change.
Bottom line: ! methods just change the value of the object they are called upon, whereas a method without ! returns a manipulated value without writing over the object the method was called upon.
Only use ! if you do not plan on needing the original value stored at the variable you called the method on.
I prefer to do something like:
foo = "word"
bar = foo.capitalize
puts bar
OR
foo = "word"
puts foo.capitalize
Instead of
foo = "word"
foo.capitalize!
puts foo
Just in case I would like to access the original value again.
Called "Destructive Methods" They tend to change the original copy of the object you are referring to.
numbers=[1,0,10,5,8]
numbers.collect{|n| puts n*2} # would multiply each number by two
numbers #returns the same original copy
numbers.collect!{|n| puts n*2} # would multiply each number by two and destructs the original copy from the array
numbers # returns [nil,nil,nil,nil,nil]
My answer explains the significance of Ruby methods with exclamation marks/shebangs in the context of Ruby on Rails (RoR) model validations.
Essentially, whenever developers define Model validations (explained here), their ultimate goal is to decline a database record change & raise/throw the relevant exception(s) in case invalid data has been submitted to update the record in question.
RoR ActiveRecord gem defines various model manipulation methods (Ruby on Rails guides.). Among the methods, the valid? method is the only one that triggers validation without database action/modification. The rest of the methods attempt to change the database.
These methods trigger callbacks whenever they run. Some of the methods in the list feature a sister method with a shebang. What is the difference between the two? It has to do with the form of callback returned whenever a record validation fails.
Methods without the exclamation/shebang merely return a boolean false in the event of record validation failure while the methods with a shebang raise/throw an exception which can then be handled appropriately in code.
Just as a heads-up, since I experienced this myself.
In Ruby, ! mutates the object and returns it. Otherwise it will return nil.
So, if you are doing some kind of operations on an array for example, and call the method .compact! and there is nothig to compact, it will return nil.
Example:
arr = [1, 2, 3, nil]
arr.compact!
=> [1, 2, 3]
Run again arr.compact!
=> nil
It is better to explicitly return again the array arr if you need to use it down the line, otherwise you will get the nil value.
Example:
arr = [1, 2, 3]
arr.compact! => nil
arr # to get the value of the array

Resources