Ruby evaluates the default value in fetch even when key is found - ruby

h = {a: "foo"}
h.fetch(:a, h.fetch(:b))
yields key not found: :b
It seems strange that Ruby evaluates the default value even when the key is found? Is there a way around this?
Edit: Apparently this behavior falls under the paradigm of lazy vs eager evaluation. Nearly all imperative languages use eager evaluation, while many functional languages use lazy evaluation. However some languages, such as Python (which was before last week the only language I knew), have lazy evaluation capabilities for some operations.

It seems strange that Ruby evaluates the default value even when the key is found? Is there a way around this?
The overwhelming majority of mainstream programming languages is strict, i.e. arguments are fully evaluated before being passed. The only exception is Haskell, and calling it mainstream is a bit of a stretch.
So, no, it is not really "strange", it is how (almost) every language works, and it is also how every single other method in Ruby works. Every method in Ruby always fully evaluates all its arguments. That is, why, for example defined? and alias cannot be methods but must be builtin language constructs.
However, there is a way to delay evaluation, so to speak, using blocks: the content of a block is only evaluated each time it is called. Thankfully, Hash#fetch does take a block, so you can just use
h.fetch(:a) { h.fetch(:b) }

If you want the computation to only run if the key is not found, you can use the alternate form of fetch which accepts a block:
h.fetch(a) { h.fetch b }
I didn't actually know this was the case but I had a hunch and tried it and it worked. The reason I thought to try this was that something similar can be done in other methods such as gsub, i.e.
"123".gsub(/[0-9]/) { |i| i.to_i + 1 } == "234"

You could use the || operator instead, since it's lazy evaluated by design.
For example the following code where Nope is not defined does not throw an error.
{ a: "foo" }[:a] || Nope
However the fetch version will throw an error.
{ a: "foo" }.fetch(:a, Nope)
Personally I prefer how fetch is designed because it wont hide a bug in the default value.
However, in cases where I would rather not evaluate a default statement, then I would definitely reach for the || operator before doing something in a block or a lambda/proc.

Related

Provide alias for Ruby's built-in keyword

For example, I want to make Object#rescue another name so I can use in my code like:
def dangerous
something_dangerous!
dont_worry # instead of rescue here
false
end
I tried
class ::Object
alias :dont_worry :rescue
end
But cannot find the rescue method on Object:
`<class:Object>': undefined method `rescue' for class `Object' (NameError)
Another example is I would like to have when in the language to replace:
if cond
# eval when cond is truthy
end
to
when cond
# eval when cond is truthy
end
Is it possible to give a Ruby keyword alias done in Ruby?
Or I need to hack on Ruby C source code?
Thanks!
This is not possible without some deep changes to the Ruby language itself. The things you describe are not methods but keywords of the language, i.e. the actual core of what is Ruby. As such, these things are not user-changeable at all.
If you still want to change the names of the keywords, you would at least have to adapt the language parser. If you don't change semantics at all, this might do it as is. But if you want to change what these keywords represent, things get messy really quick.
Also note that Ruby in itself is sometimes quite ambiguous (e.g. with regards to parenthesis, dots, spacing) and goes to great length to resolve this in a mostly consistent way. If you change keywords, you would have to ensure that things won't get any more ambiguous. This could e.g. happen with your change of if to when. when is used as a keywords is case statements already and would thus could be a source of ambiguity when used as an if.

Why does Enumerable#detect need a Proc/lambda?

Enumerable#detect returns the first value of an array where the block evaluates to true. It has an optional argument that needs to respond to call and is invoked in this case, returning its value. So,
(1..10).detect(lambda{ "none" }){|i| i == 11} #=> "none"
Why do we need the lambda? Why don't we just pass the default value itself, since (in my tests) the lambda can't have any parameters anyway? Like this:
(1..10).detect("none"){|i| i == 11} #=> "none"
As with all things in Ruby, the "principle of least surprise" applies. Which is not to say "least surprise for you" of course. Matz is quite candid about what it actually means:
Everyone has an individual background. Someone may come from Python, someone else may come from Perl, and they may be surprised by different aspects of the language. Then they come up to me and say, 'I was surprised by this feature of the language, so Ruby violates the principle of least surprise.' Wait. Wait. The principle of least surprise is not for you only. The principle of least surprise means principle of least my surprise. And it means the principle of least surprise after you learn Ruby very well. For example, I was a C++ programmer before I started designing Ruby. I programmed in C++ exclusively for two or three years. And after two years of C++ programming, it still surprises me.
So, the rational here is really anyone's guess.
One possibility is that it allows for or is consistent with use-cases where you want to conditionally run something expensive:
arr.detect(lambda { do_something_expensive }) { |i| is_i_ok? i }
Or as hinted by #majioa, perhaps to pass a method:
arr.detect(method(:some_method)) { |i| is_i_ok? i }
Accepting a callable object allows allows "lazy" and generic solutions, for example in cases where you'd want to do something expensive, raise an exception, etc...
I can't see a reason why detect couldn't accept non callable arguments, though, especially now in Ruby 2.1 where it's easy to create cheap frozen litterals. I've opened a feature request to that effect.
It is probably so you can generate an appropiate result from the input. You can then do something like
arr = (1..10).to_a
arr.detect(lambda{ arr.length }){|i| i == 11} #=> 10
As you said, returning a constant value is very easy with a lambda anyway.
Really it is interestion question. I can understand why authors have added the feature with method call, you can just pass method variable, containing a Method object or similar, as an argument. I think it is simply voluntaristic solution for the :detect method, because it could be easy to add switch on type of passed argument to select weither it is the Method or not.
I've reverified the examples, and got:
(1..10).detect(proc {'wqw'}) { |i| i % 5 == 0 and i % 7 == 0 } #=> nil
# => "wqw"
(1..10).detect('wqw') { |i| i % 5 == 0 and i % 7 == 0 } #=> nil
# NoMethodError: undefined method `call' for "wqw":String
That is amazing. =)
The best use case for having a lambda is to raise a custom exception
arr = (1..10).to_a
arr.detect(lambda{ raise "not found" }){|i| i == 11} #=> RuntimeError: not found
So, as the K is trivial (just surround with ->{ }) there is not much sense in checking for fallback behavior.
The similar case of passing a &-ed symbol instead of a block is in fact not similar at all, as in that case it indicates something that will be called on the items of the enumerable.

Why Ruby 1.9 allows overriding ! != !~?

There were two good reasons why Ruby 1.8 didn't support certain kinds of overloading like ||/or, &&/and, !/not, ?::
Short circuit semantics cannot be implemented with methods in Ruby without very extensive changes to the language.
Ruby is hard coded to treat only nil and false as false in boolean contexts.
The first reason doesn't apply to !/not but second still does. It's not like you can introduce your own kinds of boolean-like objects using just ! while &&/|| are still hard-coded. For other uses there's already complementarity operator ~ with &/|.
I can imagine there's a lot of code expecting !obj to be synonymous with obj ? false : true, and !!obj with obj ? true : false - I'm not even sure how is code supposed to deal with objects that evaluate to true in boolean context, but ! to something non-false.
It doesn't look like Ruby plans to introduce support for other false values. Nothing in Ruby stdlib seems to override ! so I haven't found any examples.
Does it have some really good use I'm missing?
Self-replying. I found one somewhat reasonable use so far. I can hack this to work in 1.9.2 in just a few lines:
describe "Mathematics" do
it "2 + 2 != 5" do
(2+2).should != 5
end
end
Before 1.9.2 Ruby this translates to:
describe "Mathematics" do
it "2 + 2 != 5" do
((2+2).should == 5) ? false : true
end
end
But as return value gets thrown away we have no ways of distinguishing == 5 from != 5 short of asking Ruby for block's parse tree. PositiveOperatorMatcher#==(5) would just raise ExpectationNotMetError exception and that would be it. It seems should !~ /pattern/, should !be_something etc. can also be made to work.
This is some improvement over (2+2).should_not == 5, but not really a big one. And there's no way to hack it further to get things like (2+2).should be_even or be_odd.
Of course the right thing to do here would be asking Ruby for parse tree and macro-expanding that. Or using debug hooks in Ruby interpreter. Are there use cases better than this one?
By the way, it wouldn't be enough to allow overrides for ! while translating !=/!~ the old way. For !((2+2).should == 5) to work #== cannot raise an exception before ! gets called. But if #== fails and ! doesn't get run execution will simply continue. We will be able to report assertion as failed after block exits, but at cost of executing test after the first failure. (unless we turn on ruby debug hooks just after failed assertion to see if the very next method call is !self, or something like that)

Tips on understanding Ruby syntax, when to use ?, and unless

Is the keyword unless the same as if?
When do you use ??
I've seen:
if someobject?
I know it checks against nil correct?
Is the keyword 'unless' the same as 'if' ?
No, it's the opposite.
unless foo is the same as if !foo
if someobject?
I know it checks against nil correct?
No it calls a method named someobject?. I.e. the ? is just part of the method name.
? can be used in methodnames, but only as the last character. Conventionally it is used to name methods which return a boolean value (i.e. either true or false).
? can also be used as part of the conditional operator condition ? then_part : else_part, but that's not how it is used in your example.
unless is actually the opposite of if. unless condition is equivalent to if !condition.
Which one you use depends on what feels more natural to the intention you're expressing in code.
e.g.
unless file_exists?
# create file
end
vs.
if !file_exists?
# create file
end
Regarding ?, there is a convention for boolean methods in Ruby to end with a ?.
This statement:
unless conditional expression
Is the equivalent to:
if not (conditional expression)
In Ruby you can end your method names with a question mark which is normally used to show that it is a boolean method.
With Rails a check against nil would look like this:
someobject.nil?
This calls the nil?() method of the object, which returns true for NilObject and false for anything else.
I think the convention for ?-suffix is to use it when naming a method that returns a boolean value. It is not a special character, but is used to make the name of the method easier to understand, or at least I think that's what the intention was. It's to make it clear that the method is like asking a question: it shouldn't change anything, only return some kind of status...
There's also !-suffix that I think by convention means that the method may have side-effects or may modify the object it is called on (rather than return a modified copy). Either way, the ! is to make you think carefully about calling such a method and to make sure you understand what that method does.
I don't think anything enforces these conventions (I've never tried to break them) so of course you could abuse them horribly, but your fellow developers would not be happy working with your code.
for unless see here: http://railstips.org/blog/archives/2008/12/01/unless-the-abused-ruby-conditional/
if someobject?
The appending of a '?' here only means that it returns a boolean.

Why use short-circuit code?

Related Questions: Benefits of using short-circuit evaluation, Why would a language NOT use Short-circuit evaluation?, Can someone explain this line of code please? (Logic & Assignment operators)
There are questions about the benefits of a language using short-circuit code, but I'm wondering what are the benefits for a programmer? Is it just that it can make code a little more concise? Or are there performance reasons?
I'm not asking about situations where two entities need to be evaluated anyway, for example:
if($user->auth() AND $model->valid()){
$model->save();
}
To me the reasoning there is clear - since both need to be true, you can skip the more costly model validation if the user can't save the data.
This also has a (to me) obvious purpose:
if(is_string($userid) AND strlen($userid) > 10){
//do something
};
Because it wouldn't be wise to call strlen() with a non-string value.
What I'm wondering about is the use of short-circuit code when it doesn't effect any other statements. For example, from the Zend Application default index page:
defined('APPLICATION_PATH')
|| define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
This could have been:
if(!defined('APPLICATION_PATH')){
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
}
Or even as a single statement:
if(!defined('APPLICATION_PATH'))
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
So why use the short-circuit code? Just for the 'coolness' factor of using logic operators in place of control structures? To consolidate nested if statements? Because it's faster?
For programmers, the benefit of a less verbose syntax over another more verbose syntax can be:
less to type, therefore higher coding efficiency
less to read, therefore better maintainability.
Now I'm only talking about when the less verbose syntax is not tricky or clever in any way, just the same recognized way of doing, but in fewer characters.
It's often when you see specific constructs in one language that you wish the language you use could have, but didn't even necessarily realize it before. Some examples off the top of my head:
anonymous inner classes in Java instead of passing a pointer to a function (way more lines of code).
in Ruby, the ||= operator, to evaluate an expression and assign to it if it evaluates to false or is null. Sure, you can achieve the same thing by 3 lines of code, but why?
and many more...
Use it to confuse people!
I don't know PHP and I've never seen short-circuiting used outside an if or while condition in the C family of languages, but in Perl it's very idiomatic to say:
open my $filehandle, '<', 'filename' or die "Couldn't open file: $!";
One advantage of having it all in one statement is the variable declaration. Otherwise you'd have to say:
my $filehandle;
unless (open $filehandle, '<', 'filename') {
die "Couldn't open file: $!";
}
Hard to claim the second one is cleaner in that case. And it'd be wordier still in a language that doesn't have unless
I think your example is for the coolness factor. There's no reason to write code like that.
EDIT: I have no problem with doing it for idiomatic reasons. If everyone else who uses a language uses short-circuit evaluation to make statement-like entities that everyone understands, then you should too. However, my experience is that code of that sort is rarely written in C-family languages; proper form is just to use the "if" statement as normal, which separates the conditional (which presumably has no side effects) from the function call that the conditional controls (which presumably has many side effects).
Short circuit operators can be useful in two important circumstances which haven't yet been mentioned:
Case 1. Suppose you had a pointer which may or may not be NULL and you wanted to check that it wasn't NULL, and that the thing it pointed to wasn't 0. However, you must not dereference the pointer if it's NULL. Without short-circuit operators, you would have to do this:
if (a != NULL) {
if (*a != 0) {
⋮
}
}
However, short-circuit operators allow you to write this more compactly:
if (a != NULL && *a != 0) {
⋮
}
in the certain knowledge that *a will not be evaluated if a is NULL.
Case 2. If you want to set a variable to a non-false value returned from one of a series of functions, you can simply do:
my $file = $user_filename ||
find_file_in_user_path() ||
find_file_in_system_path() ||
$default_filename;
This sets the value of $file to $user_filename if it's present, or the result of find_file_in_user_path(), if it's true, or … so on. This is seen perhaps more often in Perl than C, but I have seen it in C.
There are other uses, including the rather contrived examples which you cite above. But they are a useful tool, and one which I have missed when programming in less complex languages.
Related to what Dan said, I'd think it all depends on the conventions of each programming language. I can't see any difference, so do whatever is idiomatic in each programming language. One thing that could make a difference that comes to mind is if you had to do a series of checks, in that case the short-circuiting style would be much clearer than the alternative if style.
What if you had a expensive to call (performance wise) function that returned a boolean on the right hand side that you only wanted called if another condition was true (or false)? In this case Short circuiting saves you many CPU cycles. It does make the code more concise because of fewer nested if statements. So, for all the reasons you listed at the end of your question.
The truth is actually performance. Short circuiting is used in compilers to eliminate dead code saving on file size and execution speed. At run-time short-circuiting does not execute the remaining clause in the logical expression if their outcome does not affect the answer, speeding up the evaluation of the formula. I am struggling to remember an example. e.g
a AND b AND c
There are two terms in this formula evaluated left to right.
if a AND b evaluates to FALSE then the next expression AND c can either be FALSE AND TRUE or FALSE AND FALSE. Both evaluate to FALSE no matter what the value of c is. Therefore the compiler does not include AND c in the compiled format hence short-circuiting the code.
To answer the question there are special cases when the compiler cannot determine whether the logical expression has a constant output and hence would not short-circuit the code.
Think of it this way, if you have a statement like
if( A AND B )
chances are if A returns FALSE you'll only ever want to evaluate B in rare special cases. For this reason NOT using short ciruit evaluation is confusing.
Short circuit evaluation also makes your code more readable by preventing another bracketed indentation and brackets have a tendency to add up.

Resources