Pythonesque blocks and postfix expressions - syntax

In JavaScript,
f = function(x) {
    return x + 1;
}
(5)
seems at a glance as though it should assign f the successor function, but actually assigns the value 6, because the lambda expression followed by parentheses is interpreted by the parser as a postfix expression, specifically a function call. Fortunately this is easy to fix:
f = function(x) {
    return x + 1;
};
(5)
behaves as expected.
If Python allowed a block in a lambda expression, there would be a similar problem:
f = lambda(x):
    return x + 1
(5)
but this time we can't solve it the same way, because there are no semicolons. In practice Python avoids the problem by not allowing multiline lambda expressions, but I'm working on a language with indentation-based syntax where I do want multiline lambda and other expressions, so I'm trying to figure out how to avoid having a block parse as the start of a postfix expression. Thus far I'm thinking each level of the recursive descent parser could take a parameter along the lines of 'we have already eaten a block in this statement, so don't parse postfix'.
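To make that concrete, here is a minimal sketch of the idea in Python (toy token lists and invented names; not a real parser):

# Toy recursive descent fragment illustrating the flag. Each parse
# function returns (node, ate_block); postfix forms such as calls are
# suppressed once a block has been consumed in the current statement.

def parse_block(tokens):
    # Stand-in for an indentation-delimited block: consume up to DEDENT.
    body = []
    while tokens and tokens[0] != "DEDENT":
        body.append(tokens.pop(0))
    if tokens:
        tokens.pop(0)  # drop the DEDENT marker
    return ("block", body)

def parse_primary(tokens):
    tok = tokens.pop(0)
    if tok == "lambda":
        return ("lambda", parse_block(tokens)), True   # we ate a block
    return ("atom", tok), False

def parse_postfix(tokens):
    expr, ate_block = parse_primary(tokens)
    # A following '(' is only treated as a call when no block was consumed.
    while not ate_block and tokens and tokens[0] == "(":
        tokens.pop(0)
        arg, _ = parse_postfix(tokens)
        assert tokens.pop(0) == ")"
        expr = ("call", expr, arg)
    return expr, ate_block

print(parse_postfix(["lambda", "x", "+", "1", "DEDENT", "(", "5", ")"]))
# lambda node returned; '(' '5' ')' is left over for the next statement
print(parse_postfix(["f", "(", "5", ")"]))
# no block eaten, so this one does parse as a call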
Are there any existing languages that encounter this problem, and how do they solve it if so?

Python has semicolons. This is perfectly valid (though ugly and not recommended) Python code: f = lambda x: x + 1; (5).
There are many other problems with multi-line lambdas in otherwise standard Python syntax, though. They are fundamentally incompatible with how Python handles whitespace inside expressions - it ignores it, which is the complete opposite of what you want. You should read the numerous python-ideas threads about multi-line lambdas. It's somewhere between very hard and impossible.
If you want arbitrarily complex compound statements inside lambdas, you can't use the existing rules for multi-line expressions even if you made all statements expressions. You'd have to change the indentation handling (see the language reference for how it works right now) so that expressions can also contain blocks. This is hard to do without breaking perfectly fine Python code, and would certainly result in a language many Python programmers consider worse in several regards: harder to understand, more complex to implement, permitting some stupid errors, etc.
Most languages don't solve this exact problem at all. Most candidates (Scala, Ruby, Lisps, and variants of these three) have explicit end-of-block tokens. I know of two languages that have the same problem, one of which (Haskell) has been mentioned by another answer. CoffeeScript also uses indentation without end-of-block tokens, and it parses the transliteration of your example correctly. However, I could not find any specification of how or why it does this (and I won't dig through the parser source code). Both differ significantly from Python in syntax as well as design philosophy, so their solutions are of little (if any) use for Python.

In Haskell, there is an implicit semicolon whenever you start a line with the same indentation as a previous one, assuming the parser is in a layout-sensitive mode.
More specifically, after a token is encountered that signals the start of a (layout-sensitive) block, the indentation level of the first token of the first block item is remembered. Each line that is indented more continues the current block item; each line that is indented the same starts a new block item, and the first line that is indented less implies the closure of the block.
How your last example would be treated depends on whether the f = is a block item in some block or not. If it is, then there will be an implicit semicolon between the lambda expression and the (5), since the latter is indented the same as the former. If it is not, then the (5) will be treated as continuing whatever block item the f = is a part of, making it an argument to the lambda function.
The details are a bit messier than this; look at the Haskell 2010 report.
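For illustration, here is a rough Python sketch of that layout rule (a hypothetical helper; many details of the report are omitted):

# Rough sketch of the layout rule: remember the indentation of the
# first item; equal indentation starts a new item (implicit ';'),
# deeper indentation continues the current item, shallower closes
# the block (implicit '}').

def layout(lines):
    if not lines:
        return [], []
    ref = lines[0][0]                  # indentation of first block item
    items = [lines[0][1]]
    for i, (indent, text) in enumerate(lines[1:], start=1):
        if indent > ref:
            items[-1] += " " + text    # continuation of current item
        elif indent == ref:
            items.append(text)         # implicit ';': new block item
        else:
            return items, lines[i:]    # implicit '}': block is closed
    return items, []

print(layout([(0, "f ="), (2, "\\x -> x + 1"), (0, "(5)")]))
# -> (['f = \\x -> x + 1', '(5)'], [])   (5) is a separate item
print(layout([(0, "f ="), (2, "\\x -> x + 1"), (2, "(5)")]))
# -> (['f = \\x -> x + 1 (5)'], [])      (5) continues the item, i.e.
#                                        it becomes an argument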

Related

Is `X = someFunction() + 2` a statement or an expression?

I read that an expression is anything that gives some value, like 2 + X, while a statement is any instruction to the computer to execute something, like print("hi").
What about the following line of code?
X = someFunction() + 2
someFunction() returns some numerical value (I think a lot of languages wouldn't compile this code if it didn't), and thus someFunction() + 2 is 'something that yields some value' - aka an expression.
But, someFunction() is code to be executed, thus a statement.
My question:
There are often lines of code that equal some value, but are also an instruction to be executed. What are these lines of code considered?
In certain computer languages called "functional languages", everything--including the code that prints "hi"--is an expression. At the other extreme, you can write code in machine language (so you, not a compiler, are deciding exactly what sequence of bytes should compose the executable program), and at that level practically everything (even adding 2 to something) is an "instruction to the computer to execute something".
I've used a lot of different computer languages, and as far as I can recall, in each case there was documentation somewhere defining what makes a statement in that particular language (if indeed the language even has a concept of "statement"). The definition is based on syntax, not so much on what the code does.
For example, in C or C++, if you write
{ x + 2; }
then technically the "x + 2;" is a statement. It is a useless statement that doesn't do anything, but syntactically, it is a statement nevertheless. In fact, one way to write a statement in C is to just append a semicolon to an expression (http://msdn.microsoft.com/en-us/library/1t054cy7.aspx). You don't even need the expression; a semicolon by itself can be a statement (http://msdn.microsoft.com/en-us/library/h7zyw61x.aspx).
By the way, in C++, the '+' in an expression such as (x + 2) may actually be a function call. So if you say anything that calls a function is a statement, then (x + 2) would be, or at least could be, a statement in C++. But I don't know any authority who defines it that way.
It varies by language, but ultimately: a statement is anything you can't embed inside another (simple, i.e. not a block) statement.
In C, your example is an expression, because you can do this:
while (X = someFunction() + 2) {
    // ...
}
But in Python, the same thing is a syntax error, because assignment with = is only allowed as a statement:
# nope!
while X = someFunction() + 2:
    pass
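(Worth noting: since Python 3.8 there is a separate assignment expression operator, :=, so an expression form is now possible, even though plain = remains statement-only. A small runnable sketch, with some_function as a stand-in for someFunction:)

import random

def some_function():
    return random.choice([-2, 0, 3])   # stub standing in for someFunction()

# Python 3.8+: ':=' is an assignment *expression*, unlike '='.
while (x := some_function() + 2):
    print("got", x)                    # loops until some_function() returns -2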
In most languages, any expression can also be used as a statement by itself, though this may or may not be useful.
Calling a statement an "instruction to execute something" is a poor way to think about it, though. All code is an instruction to execute something.
A statement is more like a single complete thought. It's really just part of the syntax; depending on the language/compiler/runtime, a statement or expression may become very many machine instructions, or several statements might be reduced to just one instruction.
tldr; It is a statement when it is parsed as a statement, and an expression when it is parsed as an expression. The rules for this depend on the particular language in question.
Expressions and statements should not be confused with "what actually happens" underneath, but merely as describing the syntax constructs of a language's grammar.
Because the grammar and parsing rules [generally] depend on context within the program as a whole, the fact that part of an expression could, where such is allowed, be used as a statement elsewhere does not make it a statement, much less when it appears in an expression context.
As for the particular example given, it depends on programming language and where the construct appears. Some languages support assignments as expressions, while others do not.
For instance, consider this JavaScript (see Appendix A of ES5 for the grammar rules).
{ x = y = f() + 2 }
In this case, the block is a statement (BlockStatement) and x = .. is also considered a "statement" (although it is really an Expression via Statement -> ExpressionStatement) while y = .. is an expression. Likewise, f() is an expression (technically, f is also an expression in JavaScript) and 2 is an expression and f() + 2 is an expression.
However, the following is invalid Pascal because Pascal's syntax does not support := (assignment) in an expression and an assignment is always a statement.
X := Y := F() + 2
Some languages also forbid general expressions as statements, which further throws off the notion that, in y = EXPR, it is correct to consider EXPR a valid statement. The following is invalid C#, but is dubiously valid in JavaScript and many other languages.
{ f() + 2; }
I would say that the "someFunction() + 2" is an expression being evaluated. Then I would say that "x = someFunction() + 2" is a statement, because most languages would generally evaluate the function's return plus two, and then assign that value to x.

Ruby case/when vs if/elsif

The case/when statements remind me of try/catch statements in Python, which are fairly expensive operations. Is this similar with the Ruby case/when statements? What advantages do they have, other than perhaps being more concise, to if/elsif Ruby statements? When would I use one over the other?
The case expression is not at all like a try/catch block. The Ruby equivalents to try and catch are begin and rescue.
In general, the case expression is used when you want to test one value for several conditions. For example:
case x
when String
  "You passed a string but X is supposed to be a number. What were you thinking?"
when 0
  "X is zero"
when 1..5
  "X is between 1 and 5"
else
  "X isn't a number we're interested in"
end
The case expression is analogous to the switch statement that exists in many other languages (e.g. C, Java, JavaScript), though Python doesn't include any such thing. The main difference with case is that it is an expression rather than a statement (so it yields a value) and it uses the === operator for equality, which allows us to express interesting things like "Is this value a String? Is it 0? Is it in the range 1..5?"
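(For what it's worth, Python has since gained a match statement in 3.10. It looks superficially similar, but matches structurally rather than through an === protocol. A rough Python analogue of the example above:)

# Python 3.10+ rough analogue; semantics differ from Ruby's ===.
def describe(x):
    match x:
        case str():
            return "You passed a string but X is supposed to be a number."
        case 0:
            return "X is zero"
        case int() if 1 <= x <= 5:
            return "X is between 1 and 5"
        case _:
            return "X isn't a number we're interested in"

print(describe(3))   # X is between 1 and 5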
Ruby's begin/rescue/end is more similar to Python's try/except (which works much like try/catch in JavaScript, Java, etc.). In both of the above, the code runs, catches errors and continues.
case/when is like C's switch and, ignoring the === operator that bjhaid mentions, operates very much like if/elsif/end. Which you use is up to you, but there are some advantages to using case when the number of conditionals gets long. No one likes if/elsif/elsif/elsif/elsif/elsif/end :-)
Ruby has some other magical things involving that === operator that can make case nice, but I'll leave that to the documentation which explains it better than I can.

Unwanted evaluation in assignments in Mathematica: why it happens and how to debug it during the package-loading?

I am developing a (large) package which does not load properly anymore.
This happened after I changed a single line of code.
When I attempt to load the package (with Needs), the package starts loading and then one of the SetDelayed definitions “comes alive” (i.e. is somehow evaluated), gets trapped in an error-trapping routine loaded a few lines before, and the package loading aborts.
The error-trapping routine with Abort[] is doing its job, except that it should not have been called in the first place, during the package-loading phase.
The error message reveals that the wrong argument is in fact a pattern expression which I use on the lhs of a SetDelayed definition a few lines later.
Something like this:
……Some code lines
Changed line of code
g[x_?NotGoodQ]:=(Message[g::nogood, x];Abort[])
……..some other code lines
g/: cccQ[g[x0_]]:=True
When I attempt to load the package, I get:
g::nogood: Argument x0_ is not good
As you see the passed argument is a pattern and it can only come from the code line above.
I tried to find the reason for this behavior, but I have been unsuccessful so far.
So I decided to use the powerful Workbench debugging tools.
I would like to see step by step (or with breakpoints) what happens when I load the package.
I am not yet too familiar with WB, but it seems that, using Debug As..., the package is first loaded and then eventually debugged with breakpoints, etc.
My problem is that the package does not even load completely! And any breakpoint set before loading the package does not seem to be effective.
So…2 questions:
can anybody please explain why these code lines "come alive" during package loading? (there are no obvious syntax errors or code fragments left in the package as far as I can see)
can anybody please explain how (if at all) it is possible to examine/debug package code while it is being loaded in WB?
Thank you for any help.
Edit
In light of Leonid's answer and using his EvenQ example:
We can avoid using HoldPattern simply by defining UpValues for g BEFORE DownValues for g:
notGoodQ[x_] := EvenQ[x];
Clear[g];
g /: cccQ[g[x0_]] := True
g[x_?notGoodQ] := (Message[g::nogood, x]; Abort[])
Now
?g
Global`g
cccQ[g[x0_]]^:=True
g[x_?notGoodQ]:=(Message[g::nogood,x];Abort[])
In[6]:= cccQ[g[1]]
Out[6]= True
while
In[7]:= cccQ[g[2]]
During evaluation of In[7]:= g::nogood: -- Message text not found -- (2)
Out[7]= $Aborted
So...general rule:
When writing a function g, first define UpValues for g, then define DownValues for g; otherwise use HoldPattern.
Can you subscribe to this rule?
Leonid says that using HoldPattern might indicate improvable design. Besides the solution indicated above, how could one improve the design of the little code above or, better, in general when dealing with UpValues?
Thank you for your help
Leaving aside the WB (which is not really needed to answer your question) - the problem seems to have a straightforward answer based only on how expressions are evaluated during assignments. Here is an example:
In[1505]:=
notGoodQ[x_]:=True;
Clear[g];
g[x_?notGoodQ]:=(Message[g::nogood,x];Abort[])
In[1509]:= g/:cccQ[g[x0_]]:=True
During evaluation of In[1509]:= g::nogood: -- Message text not found -- (x0_)
Out[1509]= $Aborted
To make it work, I deliberately made a definition for notGoodQ to always return True. Now, why was g[x0_] evaluated during the assignment through TagSetDelayed? The answer is that, while TagSetDelayed (as well as SetDelayed) in an assignment h/:f[h[elem1,...,elemn]]:=... does not apply any rules that f may have, it will evaluate h[elem1,...,elemn], as well as f. Here is an example:
In[1513]:=
ClearAll[h,f];
h[___]:=Print["Evaluated"];
In[1515]:= h/:f[h[1,2]]:=3
During evaluation of In[1515]:= Evaluated
During evaluation of In[1515]:= TagSetDelayed::tagnf: Tag h not found in f[Null]. >>
Out[1515]= $Failed
The fact that TagSetDelayed is HoldAll does not mean that it does not evaluate its arguments - it only means that the arguments arrive to it unevaluated, and whether or not they will be evaluated depends on the semantics of TagSetDelayed (which I briefly described above). The same holds for SetDelayed, so the commonly used statement that it "does not evaluate its arguments" is not literally correct. A more correct statement is that it receives the arguments unevaluated and evaluates them in a special way: it does not evaluate the r.h.s., while for the l.h.s. it evaluates the head and the elements but does not apply rules for the head. To avoid that, you may wrap things in HoldPattern, like this:
Clear[g,notGoodQ];
notGoodQ[x_]:=EvenQ[x];
g[x_?notGoodQ]:=(Message[g::nogood,x];Abort[])
g/:cccQ[HoldPattern[g[x0_]]]:=True;
This goes through. Here is some usage:
In[1527]:= cccQ[g[1]]
Out[1527]= True
In[1528]:= cccQ[g[2]]
During evaluation of In[1528]:= g::nogood: -- Message text not found -- (2)
Out[1528]= $Aborted
Note however that the need for HoldPattern inside your left-hand side when making a definition is often a sign that the expression inside your head may also evaluate during the function call, which may break your code. Here is an example of what I mean:
In[1532]:=
ClearAll[f,h];
f[x_]:=x^2;
f/:h[HoldPattern[f[y_]]]:=y^4;
This code attempts to catch cases like h[f[something]], but it will obviously fail since f[something] will evaluate before the evaluation comes to h:
In[1535]:= h[f[5]]
Out[1535]= h[25]
For me, the need for HoldPattern on the l.h.s. is a sign that I need to reconsider my design.
EDIT
Regarding debugging during loading in WB, one thing you can do (IIRC, I can not check right now) is to use good old print statements, the output of which will appear in the WB's console. Personally, I rarely feel a need for a debugger for this purpose (debugging a package while it loads).
EDIT 2
In response to the edit in the question:
Regarding the order of definitions: yes, you can do this, and it solves this particular problem. But, generally, this isn't robust, and I would not consider it a good general method. It is hard to give definite advice for the case at hand, since it is a bit out of its context, but it seems to me that the use of UpValues here is unjustified. If this is done for error handling, there are other ways to do it without using UpValues.
Generally, UpValues are used most commonly to overload some function in a safe way, without adding any rule to the function being overloaded. One piece of advice is to avoid associating UpValues with heads which also have DownValues and may evaluate - by doing this you start playing a game with the evaluator, and will eventually lose. The safest route is to attach UpValues to inert symbols (heads, containers), which often represent a "type" of objects on which you want to overload a given function.
Regarding my comment on the presence of HoldPattern indicating a bad design. There certainly are legitimate uses for HoldPattern, such as this (somewhat artificial) one:
In[25]:=
Clear[ff,a,b,c];
ff[HoldPattern[Plus[x__]]]:={x};
ff[a+b+c]
Out[27]= {a,b,c}
Here it is justified because in many cases Plus remains unevaluated, and is useful in its unevaluated form - since one can deduce that it represents a sum. We need HoldPattern here because of the way Plus is defined on a single argument, and because a pattern happens to be a single argument (even though it generally describes multiple arguments) during the definition. So, we use HoldPattern here to prevent the pattern from being treated as a normal argument, but this is mostly different from the intended use cases for Plus. Whenever this is the case (we are sure that the definition will work all right for intended use cases), HoldPattern is fine. Note, b.t.w., that this example is also fragile:
In[28]:= ff[Plus[a]]
Out[28]= ff[a]
The reason why it is still mostly OK is that normally we don't use Plus on a single argument.
But there is a second group of cases, where the structure of the typically supplied arguments is the same as the structure of the patterns used for the definition. In this case, pattern evaluation during the assignment indicates that the same evaluation will happen with the actual arguments during function calls. Your usage falls into this category. My comment about a design flaw was aimed at such cases - you can prevent the pattern from evaluating, but you will have to prevent the arguments from evaluating as well to make this work, and pattern-matching against incompletely evaluated expressions is fragile. Also, the function should never assume some extra conditions (beyond what it can type-check) for the arguments.

Any reason NOT to always use keyword arguments?

Before jumping into Python, I had started with some Objective-C / Cocoa books. As I recall, most functions required keyword arguments to be explicitly stated. Until recently I forgot all about this, and just used positional arguments in Python. But lately, I've run into a few bugs which resulted from improper positions - sneaky little things they were.
Got me thinking - generally speaking, unless there is a circumstance that specifically requires non-keyword arguments - is there any good reason NOT to use keyword arguments? Is it considered bad style to always use them, even for simple functions?
I feel like as most of my 50-line programs have been scaling to 500 or more lines regularly, if I just get accustomed to always using keyword arguments, the code will be more easily readable and maintainable as it grows. Any reason this might not be so?
UPDATE:
The general impression I am getting is that it's a style preference, with many good arguments that they should generally not be used for very simple arguments, but are otherwise consistent with good style. Before accepting I just want to clarify though - are there any specific non-style problems that arise from this method - for instance, significant performance hits?
There isn't any reason not to use keyword arguments apart from the clarity and readability of the code. The choice of whether to use keywords should be based on whether the keyword adds additional useful information when reading the code or not.
I follow the following general rule:
If it is hard to infer the purpose of the argument from the function name – pass it by keyword (e.g. I wouldn't want to have text.splitlines(True) in my code).
If it is hard to infer the order of the arguments, for example if you have too many arguments, or when you have independent optional arguments – pass them by keyword (e.g. funkyplot(x, y, None, None, None, None, None, None, 'red') doesn't look particularly nice).
Never pass the first few arguments by keyword if the purpose of the argument is obvious. You see, sin(2*pi) is better than sin(value=2*pi); the same is true for plot(x, y, z).
In most cases, stable mandatory arguments would be positional, and optional arguments would be keyword.
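A sketch of a signature following that rule, using Python 3's keyword-only marker (the function name and parameters are invented for illustration):

def plot(x, y, *, color="blue", linewidth=1, label=None):
    # Data is positional and mandatory; styling options are
    # keyword-only (everything after the bare * must be named).
    return (x, y, color, linewidth, label)

print(plot([0, 1], [1, 2], color="red"))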
There's also a possible difference in performance, because in every implementation keyword arguments are slightly slower. But considering that worrying about this would generally be premature optimisation and the difference wouldn't be significant, I don't think it's crucial for the decision.
UPDATE: Non-stylistical concerns
Keyword arguments can do everything that positional arguments can, and if you're defining a new API there are no technical disadvantages apart from possible performance issues. However, you might run into small issues if you're combining your code with existing elements.
Consider the following:
If you make your function take keyword arguments, that becomes part of your interface.
You can't replace your function with another that has a similar signature but a different keyword for the same argument.
You might want to use a decorator or another utility on your function that assumes that your function takes a positional argument. Unbound methods are an example of such utility because they always pass the first argument as positional after reading it as positional, so cls.method(self=cls_instance) doesn't work even if there is an argument self in the definition.
None of these would be a real issue if you design your API well and document the use of keyword arguments, especially if you're not designing something that should be interchangeable with something that already exists.
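A tiny illustration of the interface point above: once callers use the keyword, renaming the parameter is a breaking change (hypothetical function):

def area(width, height):             # version 1
    return width * height

print(area(width=3, height=4))       # callers rely on the names

def area(w, h):                      # version 2: same arity, new names
    return w * h

try:
    area(width=3, height=4)
except TypeError as e:
    print(e)                         # unexpected keyword argument 'width'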
If your consideration is to improve readability of function calls, why not simply declare functions as normal, e.g.
def test(x, y):
    print "x:", x
    print "y:", y
And simply call functions by declaring the names explicitly, like so:
test(y=4, x=1)
Which obviously gives you the output:
x: 1
y: 4
or this exercise would be pointless.
This avoids having arguments be optional and needing default values (unless you want them to be, in which case just go ahead with the keyword arguments! :) and gives you all the versatility and improved readability of named arguments that are not limited by order.
Well, there are a few reasons why I would not do that.
If all your arguments are keyword arguments, it increases noise in the code and it might remove clarity about which arguments are required and which ones are optional.
Also, if I have to use your code, I might want to kill you!! (Just kidding.) But having to type the name of all the parameters every time... not so fun.
Just to offer a different argument, I think there are some cases in which named parameters might improve readability. For example, imagine a function that creates a user in your system:
create_user("George", "Martin", "g.m#example.com", "payments#example.com", "1", "Radius Circle")
From that call, it is not at all clear what these values might mean, even though they are all required; however with named parameters it is always obvious:
create_user(
    first_name="George",
    last_name="Martin",
    contact_email="g.m@example.com",
    billing_email="payments@example.com",
    street_number="1",
    street_name="Radius Circle")
I remember reading a very good explanation of "options" in UNIX programs: "Options are meant to be optional, a program should be able to run without any options at all".
The same principle could be applied to keyword arguments in Python.
These kinds of arguments should allow a user to "customize" the function call, but a function should be able to be called without any keyword-value argument pairs at all.
Sometimes, things should be simple because they are simple.
If you force yourself to use keyword arguments on every function call, your code will soon become unreadable.
When Python's built-in compile() and __import__() functions gained keyword argument support, the same argument was made in favor of clarity. There appears to be no significant performance hit, if any.
Now, if you make your functions only accept keyword arguments (as opposed to passing the positional parameters using keywords when calling them, which is allowed), then yes, it'd be annoying.
I don't see the purpose of using keyword arguments when the meaning of the arguments is obvious.
Keyword args are good when you have long parameter lists with no well-defined order (when you can't easily come up with a clear scheme to remember); however there are many situations where using them is overkill or makes the program less clear.
First, sometimes it is much easier to remember the order of the arguments than the names of the keyword arguments, and specifying the names could make things less clear. Take randint from scipy.random with the following docstring:
randint(low, high=None, size=None)
Return random integers x such that low <= x < high.
If high is None, then 0 <= x < low.
When wanting to generate a random int from [0,10) it's clearer to write randint(10) than randint(low=10), in my view. If you need to generate an array with 100 numbers in [0,10) you can probably remember the argument order and write randint(0, 10, 100). However, you may not remember the variable names (e.g., is the first parameter low, lower, start, min, minimum?) and once you have to look up the parameter names, you might as well not use them (as you just looked up the proper order).
Also consider variadic functions (ones with a variable number of parameters, which are anonymous themselves). E.g., you may want to write something like:
def square_sum(*params):
    sq_sum = 0
    for p in params:
        sq_sum += p*p
    return sq_sum
that can be applied to a bunch of bare parameters (square_sum(1,2,3,4,5) # gives 55). Sure, you could have written the function to take a named iterable, def square_sum(params):, and called it like square_sum([1,2,3,4,5]), but that may be less intuitive, especially when there's no potential confusion about the argument name or its contents.
A mistake I often make is forgetting that positional arguments have to be specified before any keyword arguments when calling a function. If testing is a function, then:
testing(arg = 20, 56)
gives a SyntaxError message; something like:
SyntaxError: non-keyword arg after keyword arg
It is easy to fix of course, it's just annoying. So in the case of few-line programs like the ones you mention, I would probably just go with positional arguments after giving nice, descriptive names to the parameters of the function. I don't know if what I mention is that big of a problem though.
One downside I could see is that you'd have to think of a sensible default value for everything, and in many cases there might not be any sensible default value (including None). Then you would feel obliged to write a whole lot of error handling code for the cases where a kwarg that logically should be a positional arg was left unspecified.
Imagine writing stuff like this every time..
def logarithm(x=None):
    if x is None:
        raise TypeError("You can't do log(None), sorry!")

Why use short-circuit code?

Related Questions: Benefits of using short-circuit evaluation, Why would a language NOT use Short-circuit evaluation?, Can someone explain this line of code please? (Logic & Assignment operators)
There are questions about the benefits of a language using short-circuit code, but I'm wondering what are the benefits for a programmer? Is it just that it can make code a little more concise? Or are there performance reasons?
I'm not asking about situations where two entities need to be evaluated anyway, for example:
if($user->auth() AND $model->valid()){
    $model->save();
}
To me the reasoning there is clear - since both need to be true, you can skip the more costly model validation if the user can't save the data.
This also has a (to me) obvious purpose:
if(is_string($userid) AND strlen($userid) > 10){
    //do something
};
Because it wouldn't be wise to call strlen() with a non-string value.
What I'm wondering about is the use of short-circuit code when it doesn't affect any other statements. For example, from the Zend Application default index page:
defined('APPLICATION_PATH')
    || define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
This could have been:
if(!defined('APPLICATION_PATH')){
    define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
}
Or even as a single statement:
if(!defined('APPLICATION_PATH'))
    define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
So why use the short-circuit code? Just for the 'coolness' factor of using logic operators in place of control structures? To consolidate nested if statements? Because it's faster?
For programmers, the benefit of a less verbose syntax over another more verbose syntax can be:
less to type, therefore higher coding efficiency
less to read, therefore better maintainability.
Now I'm only talking about when the less verbose syntax is not tricky or clever in any way, just the same recognized way of doing, but in fewer characters.
Often it's when you see specific constructs in one language that you realize you wish your language had them, even if you hadn't necessarily thought about it before. Some examples off the top of my head:
anonymous inner classes in Java instead of passing a pointer to a function (way more lines of code).
in Ruby, the ||= operator, which assigns a value to a variable only if that variable currently evaluates to false or is nil. Sure, you can achieve the same thing with 3 lines of code, but why?
and many more...
Use it to confuse people!
I don't know PHP and I've never seen short-circuiting used outside an if or while condition in the C family of languages, but in Perl it's very idiomatic to say:
open my $filehandle, '<', 'filename' or die "Couldn't open file: $!";
One advantage of having it all in one statement is the variable declaration. Otherwise you'd have to say:
my $filehandle;
unless (open $filehandle, '<', 'filename') {
    die "Couldn't open file: $!";
}
Hard to claim the second one is cleaner in that case. And it'd be wordier still in a language that doesn't have unless.
I think your example is for the coolness factor. There's no reason to write code like that.
EDIT: I have no problem with doing it for idiomatic reasons. If everyone else who uses a language uses short-circuit evaluation to make statement-like entities that everyone understands, then you should too. However, my experience is that code of that sort is rarely written in C-family languages; proper form is just to use the "if" statement as normal, which separates the conditional (which presumably has no side effects) from the function call that the conditional controls (which presumably has many side effects).
Short circuit operators can be useful in two important circumstances which haven't yet been mentioned:
Case 1. Suppose you had a pointer which may or may not be NULL and you wanted to check that it wasn't NULL, and that the thing it pointed to wasn't 0. However, you must not dereference the pointer if it's NULL. Without short-circuit operators, you would have to do this:
if (a != NULL) {
    if (*a != 0) {
        ⋮
    }
}
However, short-circuit operators allow you to write this more compactly:
if (a != NULL && *a != 0) {
    ⋮
}
in the certain knowledge that *a will not be evaluated if a is NULL.
Case 2. If you want to set a variable to a non-false value returned from one of a series of functions, you can simply do:
my $file = $user_filename ||
           find_file_in_user_path() ||
           find_file_in_system_path() ||
           $default_filename;
This sets the value of $file to $user_filename if it's present, or to the result of find_file_in_user_path() if that's true, and so on. This is seen perhaps more often in Perl than C, but I have seen it in C.
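The same chaining works in Python, where or returns its first truthy operand (a sketch with stub lookup functions and an invented MYAPP_FILE variable):

import os

def find_file_in_user_path():
    return None                      # stub: nothing found

def find_file_in_system_path():
    return None                      # stub: nothing found

# 'or' returns the first truthy operand, so defaults chain naturally:
filename = (os.environ.get("MYAPP_FILE")
            or find_file_in_user_path()
            or find_file_in_system_path()
            or "default.conf")
print(filename)                      # -> default.conf with the stubs above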
There are other uses, including the rather contrived examples which you cite above. But they are a useful tool, and one which I have missed when programming in less complex languages.
Related to what Dan said, I'd think it all depends on the conventions of each programming language. I can't see any difference, so do whatever is idiomatic in each programming language. One thing that could make a difference that comes to mind is if you had to do a series of checks, in that case the short-circuiting style would be much clearer than the alternative if style.
What if you had an expensive-to-call (performance-wise) function that returned a boolean on the right-hand side, which you only wanted called if another condition was true (or false)? In this case short-circuiting saves you many CPU cycles. It does make the code more concise because of fewer nested if statements. So: for all the reasons you listed at the end of your question.
The truth is actually performance. Short-circuiting is used by compilers to eliminate dead code, saving on file size and execution speed. At run time, short-circuiting skips the remaining clauses in a logical expression once their outcome can no longer affect the answer, speeding up the evaluation of the formula. For example:
a AND b AND c
The terms in this formula are evaluated left to right.
If a AND b evaluates to FALSE, then the full expression is either FALSE AND TRUE or FALSE AND FALSE. Both evaluate to FALSE no matter what the value of c is. Therefore the evaluation of AND c can be skipped entirely, which is the short circuit.
To answer the question: there are special cases when the compiler cannot determine whether the logical expression has a constant output, and hence it would not short-circuit the code.
Think of it this way, if you have a statement like
if( A AND B )
chances are that if A returns FALSE you'll only want to evaluate B in rare special cases. For this reason NOT using short-circuit evaluation is confusing.
Short-circuit evaluation also makes your code more readable by preventing another level of bracketed indentation, and brackets have a tendency to add up.
