What is the purpose of `Array#include?` as compared to `Array#index`? - ruby

Array#include? provides strictly less information than Array#index: when Array#index returns nil, the corresponding call to Array#include? returns false, and when Array#index returns an integer, Array#include? returns true. Furthermore, comparing the two shows no significant difference in speed; if anything, Array#index often performs better than Array#include?:
a = %w[boo zoo foo bar]
t = Time.now
10000.times do
a.include?("foo")
end
puts Time.now - t # => 0.005626235
t = Time.now
10000.times do
a.index("foo")
end
puts Time.now - t # => 0.003683945
Then, what is the purpose of Array#include?? Can't all code using it be rewritten using Array#index?

I know this isn't an official reason, but I can think of a few things:
Clarity: as a name, include? makes more sense at first sight, and also allows easy visual confirmation of code correctness by identifying itself as a boolean predicate. This follows the concept of making wrong code look wrong (see http://www.joelonsoftware.com/articles/Wrong.html)
Good typing: If all you want is a boolean value for a boolean check, making that a number could lead to bugs
Cleanliness: Isn't it nicer to see a printed output of "true" rather than going back to C and having no boolean to speak of?
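A minimal sketch of the typing point, reusing the array from the question. The variable names are only for illustration; the idea is that an index is an Integer (or nil), so any code that compares it against a real boolean misfires, while include? hands back true or false directly:

```ruby
a = %w[boo zoo foo bar]

# index returns a position or nil; include? returns a strict boolean.
index_hit    = a.index("boo")      # 0
index_miss   = a.index("nope")     # nil
include_hit  = a.include?("boo")   # true
include_miss = a.include?("nope")  # false

# An Integer never equals true, so a comparison against a boolean
# goes wrong with index even though the element is present.
wrong_check = (a.index("boo") == true)     # false
right_check = (a.include?("boo") == true)  # true

puts "#{index_hit.inspect} #{index_miss.inspect} #{include_hit} #{include_miss}"
puts "#{wrong_check} #{right_check}"
```

In a plain `if`, both work, because 0 is truthy in Ruby; the difference shows up once the value is stored, compared, or serialized.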

Ruby method operating on hash without side effects

I want to create a function that adds a new element to a hash as below:
numbers_hash = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers["four"] = "cuatro"
end
add_new_value(numbers_hash)
I have read that immutability is important, and methods with side effects are not a good idea. Clearly this method is modifying the original input, how should I handle this?
Ruby is an OOP language with some functional patterns
Ruby is an object oriented language. Side-effects are important in OO. When you call a method on an object and that method modifies the object, that's a side-effect, and that's fine:
a = [1, 2, 3]
a.delete_at(1) # side effect in delete_at
# a is now [1, 3]
Ruby also allows a functional style, where data is transformed without side-effects. You've probably seen or used the map-reduce pattern:
a = ["1", "2", "3"]
a.map(&:to_i).reduce(&:+) # => 6
# a is unchanged
Command Query Separation
What may have confused you is a rule invented by Bertrand Meyer, the Command Query Separation Rule. This rule says that a method must either
Have a side effect, but no return value, or
Have no side effect, but return something
But not both. Note that although it's called a rule, in Ruby I would treat it as a strong guideline. There are times when violating this rule makes for better code, but in my experience this rule can be adhered to most of the time.
We have to clarify what we mean by "has a return value" in Ruby, since every Ruby method has a return value--the value of the last statement it executed (or nil if it was empty). What we mean is that the method has an intentional return value, one that is part of this method's contract and that the caller can be expected to use.
Here's an example of a method that has a side-effect and a return value, violating this rule:
# Open the valve if possible. Returns whether or not the valve is open.
def open_valve
  @valve_open = true if @power_available
  @valve_open
end
and how you'd separate that into two methods to adhere to this rule:
attr_reader :valve_open
def open_valve
  @valve_open = true if @power_available
end
If you choose to adhere to this rule, you may find it useful to name side-effect methods with verb phrases, and returning-something methods with noun phrases. This makes it obvious from the start what kind of method you are dealing with, and makes naming methods easier.
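As a rough sketch of how the separated version might look inside a complete class: the Valve class, its constructor, and the explicit `nil` at the end of the command are assumptions added for illustration, not part of the original answer.

```ruby
# Hypothetical class wrapping the two methods from the answer.
class Valve
  # Query: reports state and changes nothing.
  attr_reader :valve_open

  def initialize(power_available)
    @power_available = power_available
    @valve_open = false
  end

  # Command: changes state. It returns nil on purpose so that callers
  # are not tempted to depend on its return value.
  def open_valve
    @valve_open = true if @power_available
    nil
  end
end

powered = Valve.new(true)
powered.open_valve
puts powered.valve_open    # prints true

unpowered = Valve.new(false)
unpowered.open_valve
puts unpowered.valve_open  # prints false
```

The caller issues the command first and then asks the query, so each call does exactly one job.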
What is a side-effect?
A side effect is something that changes the state of an object or of an external entity such as a file. This method, which changes the state of its object, has a side effect:
def register_error
  @error_count += 1
end
This method that changes the state of its argument has a side effect:
def delete_ones(ary)
ary.delete(1)
end
This method that writes to a file has a side effect:
def log(line)
File.open(log_path, "a") { |f| f.puts(line) }
end
I would not necessarily agree that you should always avoid mutating an argument. Especially in the context of your example, the mutation seems to be the only reason the method exists. Therefore it is not a side-effect IMO.
I would call it an unwanted side-effect when a method changes its input parameters while doing something unrelated, and when it is not obvious from the method's name that it also mutates its arguments.
You might prefer to return a new hash and keep the old hash unchanged:
numbers_hash_1 = {"one": "uno", "two": "dos", "three": "tres", }
def add_new_value(numbers)
numbers.merge(four: "cuatro")
end
numbers_hash_2 = add_new_value(numbers_hash_1)
#=> {:one=>"uno", :two=>"dos", :three=>"tres", :four=>"cuatro"}
numbers_hash_1
#=> {:one=>"uno", :two=>"dos", :three=>"tres"}
Quote from the docs of Hash#merge:
merge(*other_hashes) → new_hash
Returns the new Hash formed by merging each of other_hashes into a copy of self.
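One way to check that the merge-based version really leaves its input alone is to freeze the input first. This is only a sketch: the `original` and `extended` names are made up here, and rescuing FrozenError assumes Ruby 2.5 or later.

```ruby
def add_new_value(numbers)
  # merge builds a new hash; the receiver is not modified.
  numbers.merge(four: "cuatro")
end

original = { one: "uno", two: "dos", three: "tres" }.freeze
extended = add_new_value(original)

puts original.size  # prints 3
puts extended.size  # prints 4

# The mutating style from the question fails loudly on a frozen hash.
begin
  original[:four] = "cuatro"
rescue FrozenError => e
  puts "mutation rejected: #{e.class}"
end
```

Freezing is optional; it just turns an accidental in-place change into an immediate error instead of a silent one.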

Does 'any?' break from the loop when a match is found? [duplicate]

Does any? break from the loop when a match is found?
The following is the any? source code, but I don't understand it.
static VALUE
enum_any(VALUE obj)
{
VALUE result = Qfalse;
rb_block_call(obj, id_each, 0, 0, ENUMFUNC(any), (VALUE)&result);
return result;
}
Yes, it does break the loop. One does not need to dig into c code to check that:
[1,2,3].any? { |e| puts "Checking #{e}"; e == 2 }
# Checking 1
# Checking 2
#⇒ true
The term is "short-circuiting" and yes, any? does that. After it finds a match, it doesn't look any further.
Does any? break from the loop when a match is found?
The documentation is unclear about that:
The method returns true if the block ever returns a value other than false or nil.
Note: it does not say "when the block ever returns a value other than false or nil" or "as soon as the block ever returns a value other than false or nil".
This can be interpreted either way, or it can be interpreted as making no guarantees at all. If you go by this documentation, then you can neither guarantee that it will short-circuit, nor can you guarantee that it won't short-circuit.
Generally speaking, this is typical for API specifications: make the minimum amount of guarantees, giving the API implementor maximum freedom in how to implement the API.
There is somewhere else we can look: the ISO Ruby Programming Language Specification (bold emphasis mine):
15.3.2.2.2 Enumerable#any?
any?(&block)
Visibility: public
Behavior:
a) Invoke the method each on the receiver
b) For each element X which each yields
If block is given, call block with X as the argument.
If this call results in a trueish object, return true
As you can see, again it only says "if", but not "when" or "as soon as". This sentence can be interpreted in two ways: "Return true as the result of the method" (no indication of how often the block gets called, only that the method will return true at the end) or "return true when you encounter an invocation of the block that evaluates to a trueish value".
Try #3: The Ruby Spec:
it "stops iterating once the return value is determined" do
So, yes, we can indeed rely on the fact that the block is only evaluated until the first truthy value is encountered.
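A quick way to observe this without reading any implementation is to run any? over an endless range and count the block calls. This is a sketch rather than a proof; the `calls` counter is just for illustration. If any? did not stop at the first truthy result, the call would never return:

```ruby
calls = 0

# The range has no upper bound, so this only terminates because
# any? stops as soon as the block returns a truthy value.
result = (1..Float::INFINITY).any? do |n|
  calls += 1
  n > 5
end

puts result  # prints true
puts calls   # prints 6
```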
The following is the any? source code, but I don't understand it.
Note: by looking at the source code, you can not determine how something behaves in Ruby. You can only determine how something behaves in that specific version of that specific implementation of Ruby. Different implementations may behave differently (for example, in YARV, Ruby threads cannot run at the same time, in JRuby, they can). Even different versions of the same implementation can behave differently.
It is usually not a good idea to make assumptions about the behavior of a programming language by just looking at a single version of a single implementation.
However, if you really want to look at some implementation, and are fully aware of the limitations of this approach, then I would suggest looking at Rubinius, Topaz, Opal, IronRuby, or JRuby. They are (in my opinion) better organized and easier to read than YARV.
For example, this is the code for Enumerable#any? in Rubinius:
def any?
if block_given?
each { |*element| return true if yield(*element) }
else
each { return true if Rubinius.single_block_arg }
end
false
end
This looks rather clear and readable, doesn't it?
This is the definition in Topaz:
def any?(&block)
if block
self.each { |*e| return true if yield(*e) }
else
self.each_entry { |e| return true if e }
end
false
end
This also looks fairly readable.
The source in Opal is a little bit more complex, but only marginally so:
def any?(pattern = undefined, &block)
if `pattern !== undefined`
each do |*value|
comparable = `comparableForPattern(value)`
return true if pattern.public_send(:===, *comparable)
end
elsif block_given?
each do |*value|
if yield(*value)
return true
end
end
else
each do |*value|
if Opal.destructure(value)
return true
end
end
end
false
end
[Note the interesting use of overriding the ` method for injecting literal ECMAScript into the compiled code.]
Most of the added complexity compared to the Rubinius and Topaz versions stems from the fact that Opal already supports the third overload of any? taking a pattern which was introduced in Ruby 2.5, whereas Rubinius and Topaz only support the two overloads with a block and without any arguments at all.
IronRuby's implementation implements the short-circuiting like this:
if (predicate.Yield(item, out blockResult)) {
result = blockResult;
return selfBlock.PropagateFlow(predicate, blockResult);
}
JRuby's implementation is a little bit more involved still, but you can see that as soon as it encounters a truthy block value, it breaks out of the loop by throwing a SPECIAL_JUMP exception and catching it to return true.
Yes and it's easy to prove:
irb(main):009:0> %w{ant bear cat}.any? {|word| puts "hello"; word.length >= 4}
hello
hello
=> true
It has printed only twice. If it did not break it would print 3 times.

Is there a { |x| x } shorthand in ruby?

I often use .group_by{ |x| x } and .find{ |x| x }
The latter is to find the first item in an array which is true.
Currently I'm just using .compact.first but I feel like there must be an elegant way to use find here, like find(&:to_bool) or .find(true) that I'm missing.
Using .find(&:nil?) works but is the opposite of what I want, and I couldn't find a method that was the opposite of #find or #detect, or a method like #true?
So is there a more elegant way to write .find{ |x| x }? If not, I'll stick with .compact.first
(I know compact won't remove false but that's not a problem for me, also please avoid rails methods for this)
Edit: For my exact case it is used on arrays of only strings and nils e.g.
[nil, "x", nil, nil, nil, nil, "y", nil, nil, nil, nil] => "x"
If you do not care about what is returned you can sometimes use the hash method.
The feature you are asking for is not available in Ruby yet; however, it is present in the Ruby road-map:
https://bugs.ruby-lang.org/issues/6373
Expected to be implemented before 2035-12-25, can you wait?
That being said, how much typing is group_by{|x|x} ?
Edit:
As Stefan pointed out, my answer is no longer valid for Ruby 2.2 and above since the introduction of Object#itself.
There’s not.
If tap worked without a block you could do:
array.detect(&:tap)
But it doesn’t. Either way, I think what you have is extremely concise, idiomatic, and happens to be the same number of characters as the non-working above alternative, and thus you should stick with that:
array.compact.first
You could monkey-patch your way to getting a shorter version, but then it becomes unclear to anyone otherwise familiar with Ruby, which probably isn’t worth the minor “savings”.
As a curiosity, if you happened to want array.detect { |x| !x } (the opposite) you could do:
array.detect(&:!)
This works because !x is actually shorthand for x.!. Of course this would only ever give you nil or false, which is probably not very useful.
No, there is not. I personally have a utility library I include in all my projects which has something like
IDENTITY = -> x { x }
Then you would have
.group_by(&IDENTITY)
There is also Object#itself that simply returns self:
.group_by(&:itself)
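Applied to the two cases from the question, that might look like the following sketch (it assumes Ruby 2.2 or later, where itself is available; the sample arrays are made up):

```ruby
array = [nil, "x", nil, nil, nil, nil, "y", nil]

# find returns the first element for which the block is truthy;
# itself returns the element, so nil entries are skipped.
first_truthy = array.find(&:itself)

# group_by with itself groups equal elements together.
groups = %w[a b a c a].group_by(&:itself)

puts first_truthy.inspect  # prints "x"
puts groups.inspect
```

Unlike compact.first, find(&:itself) also skips false and stops at the first hit without building an intermediate array.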
Although the tag is for ruby - with Rails (more specifically ActiveSupport) you are given a method presence which will work for anything that responds positively to present? (that would exclude blank strings, arrays, hashes, etc):
array.find(&:presence)
It's not quite equivalent to the preferred result, but it will work for most cases I've come across.
I frequently use group_by, map, select, sort_by, and other various hash methods. I discovered this useful little extension yesterday by fiddling around with another answer on a similar question:
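class Hash
  # Treat an unknown method name as a key lookup. Anything else goes to
  # super, which raises the usual NoMethodError with a proper message.
  def method_missing(name, *args)
    return self[name] if args.empty? && key?(name)
    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    key?(name) || super
  end
end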
For any hash created by ruby, or any data that has been jsonified by as_json, this addition allows me to write code which is a little shorter. Example:
# make yellow cells
yellow = red = false
tube_steps_status.group_by(&:step_ordinal).each do |type|
group = type.last.select(&:completed).sort_by(&:completed)
red = true if group.last.step_status == 'red' if group.any?
yellow = true if group.map(&:step_status).include?('red')
end
tube_summary_status = 'yellow' if yellow unless red

In environments that take Boolean arguments, is it a good idea to wrap all functions instead of allowing them to be implicitly coerced?

Take the String#=~ function for instance. It returns the index of the first match if a match is found, which, as a Fixnum, always acts as true in boolean contexts. If no match is found, it returns nil, which acts as false.
Now suppose I have a class:
class A
  attr_accessor :myprop
  # prints "I am awesome" if @myprop matches /awesome/
  # and "I am not awesome" otherwise
  def report_on_awesomeness!
    puts "I am #{myprop =~ /awesome/ ? 'awesome' : 'not awesome'}."
  end
end
This code will pretty much work just as expected, but the first operand of the ternary conditional operator is the subject of my question.
Is it a good idea not to wrap myprop =~ /awesome/? I'm not talking about abstracting it into another method like def is_awesome?; myprop =~ /awesome/; end but rather whether my current convention, which forces Ruby to implicitly cast Fixnums to true and nils to false, is preferable to wrapping the condition in something I cast myself. I could easily do this:
class A
  attr_accessor :myprop
  # prints "I am awesome" if @myprop matches /awesome/
  # and "I am not awesome" otherwise
  def report_on_awesomeness!
    puts "I am #{(myprop =~ /awesome/).nil? ? 'not awesome' : 'awesome'}."
  end
end
Pros I see for the first style:
Most maintainers (including future me) are used to the implicit type
It's shorter
Pros I see for the second style:
It's more obvious exactly what the relationship is between the result of the =~ method and its boolean interpretation
It gives you more freedom to use more creative explicit casting
I suspect that there might be some middle ground, where you leave implicit type conversions in cases where it's idiomatic (e.g., regular expression matching using =~) and do it explicitly when it's not (e.g., your own properties, especially if they have multiple return types).
I would appreciate any insights or experiences the community can share on this issue.
IMHO that's a personal choice. You can use either style, whichever feels better to work with.
Once I defined true? on Object to get its boolean value (another name could be to_bool):
class Object
def true?
!!self
end
end
But the double bang (!!) is a simpler way to convert anything to a Boolean, and I prefer to use it, but not everywhere. I use it only when I explicitly need a boolean value (I wouldn't use it in the case of this question).
BTW, false.nil? == false; it could lead to confusion.
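For comparison, here is a small sketch of the options side by side. The sample strings are made up, and String#match? assumes Ruby 2.4 or later; it is mentioned only as one more way to get a strict boolean from a regular expression test.

```ruby
myprop = "I am awesome"

index  = (myprop =~ /awesome/)      # position of the match, or nil
strict = !!(myprop =~ /awesome/)    # double bang coerces to true/false
direct = myprop.match?(/awesome/)   # strict boolean, no coercion needed
miss   = "plain".match?(/awesome/)  # false when nothing matches

puts index   # prints 5
puts strict  # prints true
puts direct  # prints true
puts miss    # prints false

# The nil? pitfall mentioned above: false is not nil, so a nil? test
# would treat false as if it were a positive result.
puts false.nil?  # prints false
```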

Ruby case statement with multiple variables using an Array

I'd like to compare multiple variables for a case statement, and am currently thinking overriding the case equals operator (===) for Array is the best way to do it. Is this the best way?
Here is an example use case:
def deposit_apr deposit,apr
# deposit: can be nil or 2 length Array of [nil or Float, String]
# apr: can be nil or Float
case [deposit,apr]
when [[Float,String],Float]
puts "#{deposit[0]} #{deposit[1]}, #{apr*100.0}% APR"
when [[nil,String],Float]
puts "#{apr*100.0}% APR on deposits greater than 100 #{deposit[1]}"
when [[Float,String],nil]
puts "#{deposit[0]} #{deposit[1]}"
else
puts 'N/A'
end
end
The only problem is the Array case equals operator doesn't apply the case equal to the elements of the Array.
ruby-1.9.2-p0 > deposit_apr([656.00,'rupees'],0.065)
N/A
It will if I override, but am not sure what I'd be breaking if I did:
class Array
def ===(other)
result = true
self.zip(other) {|bp,ap| result &&= bp === ap}
result
end
end
Now, it all works:
ruby-1.9.2-p0 > deposit_apr([656.00,'rupees'],0.065)
656.0 rupees, 6.5% APR
Am I missing something?
I found this question because I was looking to run a case statement on multiple variables, but, going through the following, came to the conclusion that needing to compare multiple variables might suggest that a different approach is needed. (I went back to my own code with this conclusion, and found that even a Hash is helping me write code that is easier to understand.)
Gems today use "no monkey patching" as a selling point. Overriding an operator is probably not the right approach. Monkey patching is great for experimentation, but it's too easy for things to go awry.
Also, there's a lot of type-checking. In a language that is designed for Duck Typing, this clearly indicates the need for a different approach. For example, what happens if I pass in integer values instead of floats? We'd get an 'N/A', even though that's not likely what we're looking for.
You'll notice that the example given in the question is difficult to read. We should be able to find a way to represent this logic more clearly to the reader (and to the writer, when they revisit the code again in a few months and have to puzzle out what's going on).
And finally, since there are multiple numbers with associated logic, it seems like there's at least one value object-type class (Deposit) that wants to be written.
For cleanliness, I'm going to assume that a nil APR can be considered a 0.0% APR.
class Deposit
  def initialize(amount, unit='USD', options={})
    @amount = amount.to_f # `nil` => 0.0
    @unit = unit.to_s # Example assumes unit is always present
    @apr = options.fetch(:apr, 0.0).to_f # `apr: nil` => 0.0
  end
end
Once we have our Deposit object, we can implement the print logic without needing case statements at all.
class Deposit
  # ... lines omitted
  def to_s
    string = "#{@amount} #{@unit}"
    string << ", #{@apr * 100.0}% APR" if @apr > 0.0
    string
  end
end
d = Deposit.new(656.00, 'rupees', apr: 0.065)
d.to_s
# => "656.0 rupees, 6.5% APR"
e = Deposit.new(100, 'USD', apr: nil)
e.to_s
# => "100.0 USD"
f = Deposit.new(100, 'USD')
f.to_s
# => "100.0 USD"
Conclusion: If you're comparing multiple variables in a case statement, use that as a smell to suggest a deeper design issue. Multiple-variable cases might indicate that there's an object that wants to be created.
If you are worried about breaking something by changing Array behavior, and certainly that's a reasonable worry, then just put your revised operator in a subclass of Array.
It's definitely not the best way. Even more: you should not redefine methods of standard classes, as core functionality may depend on them. Have fun debugging then.
Defensive style (with lots of type checks and whatnot) is nice, but it usually hurts performance and readability.
If you know that you will not pass anything other than a bunch of floats and strings to that method, why do you need all those checks?
IMO use exception catching and fix the source of the problem; don't try to fix the problem somewhere in the middle.
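If the nested type matching from the question is still what you want, newer Rubies can express it without touching Array#=== at all. The following is a sketch using case/in pattern matching, which none of the answers above rely on; it assumes Ruby 2.7 or later (where it was still marked experimental) and returns the string instead of printing it so the result is easy to check:

```ruby
def deposit_apr(deposit, apr)
  case [deposit, apr]
  in [[Float => amount, String => unit], Float]
    "#{amount} #{unit}, #{apr * 100.0}% APR"
  in [[nil, String => unit], Float]
    "#{apr * 100.0}% APR on deposits greater than 100 #{unit}"
  in [[Float => amount, String => unit], nil]
    "#{amount} #{unit}"
  else
    "N/A"
  end
end

puts deposit_apr([656.00, 'rupees'], 0.065)  # prints 656.0 rupees, 6.5% APR
puts deposit_apr([656.00, 'rupees'], nil)    # prints 656.0 rupees
puts deposit_apr(nil, nil)                   # prints N/A
```

Array patterns apply === to each element, which is the behavior the overridden operator was trying to add. The duck-typing concern raised above still applies: an Integer amount would fall through to "N/A".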
