new to TCL and running into a short circuit issue it seems. Coming from vbscript, I'm able to perform this properly, but trying to convert to a TCL script I'm having issues with the short circuit side effect and have been trying to find the proper way of doing this.
In the following snippet, I want to execute "do something" only if BOTH sides are true, but because of short circuiting, it will only evaluate the second argument if the first fails to determine the value of the expression.
if {$basehour != 23 && $hours != 0} {
do something
}
Maybe I'm not searching for the right things, but so far I've been unable to find the solution. Any tips would be appreciated.
The && operator always does short-circuiting in Tcl (as it does in C and Java and a number of other languages too). If you want the other version and can guarantee that both sub-expressions yield booleans (e.g., they come from equality tests such as you're doing) then you can use the & operator instead, which does bit-wise AND and will do what you want when working on bools. If you're doing this, it's wise to put parentheses around the sub-expressions for clarity; while everyone remember the precedence of == with respect to &&, the order w.r.t. & is often forgotten. (The parentheses are free in terms of execution cost.)
if {($basehour != 23) & ($hours != 0)} {
do something
}
However, it's usually not necessary to do this. If you're wanting an AND that you're feeding into a boolean test (e.g., the if command's expression) then there's no reason to not short-circuit, as in your original code; if the first clause gives false, the second one won't change what value the overall expression produces.
Related
I've never understood why we use syntax like this:
if (a == b || a == c)
when it could be simplified to something like this:
if (a == b || c)
Is this an issue with compilers or something? Can we really not account for a string of code like this and make it work?
There is no technical limitation that would make it impossible (or even too difficult) to implement a language that treats a == b || c as a shortcut for a == b || a == c. The problem is that it would be almost (?) impossible to come up with rules to do so only in the cases where that's what's expected.
For instance consider the expression result == null || fileIsClosed where fileIsClosed is a boolean. Surely the programmer would not expect this to be treated as result == null || result == fileIsClosed. You could come up with additional rules like "the replacement is only applied if the right operand of || is not a boolean", but then the replacement also does not work if you do booleanResult == possibleResult1 || possibleResult2. In fact the only thing about this example, that tells us whether the programmer intended for the replacement to happen, are the names of the variables. Obviously the compiler can't infer meaning from variable names, so it'd be impossible to do what the user wants in every case, making simple rules without exceptions (like "expr1 || expr2 is true iff at least one of expr1 and expr2 are true") preferable.
So in summary: We don't want the replacement to take place in all cases and inferring in which cases the replacements would make sense with complete accuracy would be impossible. Since it should be easy to reason about code, implementing a system that may or may not apply the replacement based on rules that 90% of programmers won't know or understand will lead to confusing behavior in certain cases and is therefore not a good idea.
Some part of the answer is: if (a == b || c) would not be interpretable without context or knowing the meaning behind this expression.
Consider the following situation:
What if c would be a boolean expression or a boolean value by itself? It could then be interpreted as either:
if (a has the value of b OR c is true)
if (a has the value of b OR c)
Now what was the intention when a human being coded these lines? ... *think. We don't exactly know nor does a compiler when it is asked to produce object code to be executed on a machine (as intended by the developer).
Or more drastically: A compiler can (and should) not guess, like humans sometimes do.
You can, you are just using the wrong programming language :)
In Icon for example, you can write a = (b | c). The way that works is that | really concatenates sequences, and = (equality) filters them (where false is no results, and true is some results). It implements this through back-tracking.
You can do even crazier things, like write ((0 to 4) > 1) * 3 will print out 6 9 12.
As you already know the string you wrote would have different meaning. However, some languages provide extension to change the behavior of the operator. You can use it if you want to define your own custom operations.
In JavaScript,
f = function(x) {
return x + 1;
}
(5)
seems at a glance as though it should assign f the successor function, but actually assigns the value 6, because the lambda expression followed by parentheses is interpreted by the parser as a postfix expression, specifically a function call. Fortunately this is easy to fix:
f = function(x) {
return x + 1;
};
(5)
behaves as expected.
If Python allowed a block in a lambda expression, there would be a similar problem:
f = lambda(x):
return x + 1
(5)
but this time we can't solve it the same way because there are no semicolons. In practice Python avoids the problem by not allowing multiline lambda expressions, but I'm working on a language with indentation-based syntax where I do want multiline lambda and other expressions, so I'm trying to figure out how to avoid having a block parse as the start of a postfix expression. Thus far I'm thinking maybe each level of the recursive descent parser should have a parameter along the lines of 'we have already eaten a block in this statement so don't do postfix'.
Are there any existing languages that encounter this problem, and how do they solve it if so?
Python has semicolons. This is perfectly valid (though ugly and not recommended) Python code: f = lambda(x): x + 1; (5).
There are many other problems with multi-line lambdas in otherwise standard Python syntax though. It is completely incompatible with how Python handles indentation (whitespace in general, actually) inside expressions - it doesn't, and that's the complete opposite of what you want. You should read the numerous python-ideas thread about multi-line lambdas. It's somewhere between very hard to impossible.
If you want arbitrarily complex compound statements inside lambdas you can't use the existing rules for multi-line expressions even if you made all statements expressions. You'd have to change the indentation handling (see the language reference for how it works right now) so that expressions can also contain blocks. This is hard to do without breaking perfectly fine Python code, and will certainly result in a language many Python programmers will consider worse in several regards: Harder to understand, more complex to implement, permits some stupid errors, etc.
Most languages don't solve this exact problem at all. Most candidates (Scala, Ruby, Lisps, and variants of these three) have explicit end-of-block tokens. I know of two languages that have the same problem, one of which (Haskell) has been mentioned by another answer. Coffeescript also uses indentation without end-of-block tokens. It parses the transliteration of your example correctly. However, I could not find any specification of how or why it does this (and I won't dig through the parser source code). Both differ significantly from Python in syntax as well as design philosophy, so their solution is of little (if any) use for Python.
In Haskell, there is an implicit semicolon whenever you start a line with the same indentation as a previous one, assuming the parser is in a layout-sensitive mode.
More specifically, after a token is encountered that signals the start of a (layout-sensitive) block, the indentation level of the first token of the first block item is remembered. Each line that is indented more continues the current block item; each line that is indented the same starts a new block item, and the first line that is indented less implies the closure of the block.
How your last example would be treated depends on whether the f = is a block item in some block or not. If it is, then there will be an implicit semicolon between the lambda expression and the (5), since the latter is indented the same as the former. If it is not, then the (5) will be treated as continuing whatever block item the f = is a part of, making it an argument to the lamda function.
The details are a bit messier than this; look at the Haskell 2010 report.
I'm trying to curb some of the bad habits of a self-proclaimed "senior programmer." He insists on writing If blocks like this:
if (expression) {}
else {
statements
}
Or as he usually writes it in classic ASP VBScript:
If expression Then
Else
statements
End If
The expression could be something as easily negated as:
if (x == 0) {}
else {
statements
}
Other than clarity of coding style, what other reasons can I provide for my opinion that the following is preferred?
if (x != 0) {
statements
}
Or even the more general case (again in VBScript):
If Not expression Then
statements
End If
Reasons that come to my mind for supporting your opinion (which I agree with BTW) are:
Easier to read (which implies easier to understand)
Easier to maintain (because of point #1)
Consistent with 'established' coding styles in most major programming languages
I have NEVER come across the coding-style/form that your co-worker insists on using.
I've tried it both ways. McConnell in Code Complete says one should always include both the then and the else to demonstrate that one has thought about both conditions, even if the operation is nothing (NOP). It looks like your friend is doing this.
I've found this practice to add no value in the field because unit testing handles this or it is unnecessary. YMMV, of course.
If you really want to burn his bacon, calculate how much time he's spending writing the empty statements, multiply by 1.5 (for testing) and then multiply that number by his hourly rate. Send him a bill for the amount.
As an aside, I'd move the close curly bracket to the else line:
if (expression) {
} else {
statements
}
The reason being that it is tempting to (or easy to accidentally) add some statement outside the block.
For this reason, I abhor single-line (bare) statements, of the form
if (expression)
statement
Because it can get fugly (and buggy) really fast
if (expression)
statement1
statement2
statement2 will always run, even though it might look like it should be subject to expression. Getting in the habit of always using brackets will kill this stumbling point dead.
Related Questions: Benefits of using short-circuit evaluation, Why would a language NOT use Short-circuit evaluation?, Can someone explain this line of code please? (Logic & Assignment operators)
There are questions about the benefits of a language using short-circuit code, but I'm wondering what are the benefits for a programmer? Is it just that it can make code a little more concise? Or are there performance reasons?
I'm not asking about situations where two entities need to be evaluated anyway, for example:
if($user->auth() AND $model->valid()){
$model->save();
}
To me the reasoning there is clear - since both need to be true, you can skip the more costly model validation if the user can't save the data.
This also has a (to me) obvious purpose:
if(is_string($userid) AND strlen($userid) > 10){
//do something
};
Because it wouldn't be wise to call strlen() with a non-string value.
What I'm wondering about is the use of short-circuit code when it doesn't effect any other statements. For example, from the Zend Application default index page:
defined('APPLICATION_PATH')
|| define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
This could have been:
if(!defined('APPLICATION_PATH')){
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
}
Or even as a single statement:
if(!defined('APPLICATION_PATH'))
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
So why use the short-circuit code? Just for the 'coolness' factor of using logic operators in place of control structures? To consolidate nested if statements? Because it's faster?
For programmers, the benefit of a less verbose syntax over another more verbose syntax can be:
less to type, therefore higher coding efficiency
less to read, therefore better maintainability.
Now I'm only talking about when the less verbose syntax is not tricky or clever in any way, just the same recognized way of doing, but in fewer characters.
It's often when you see specific constructs in one language that you wish the language you use could have, but didn't even necessarily realize it before. Some examples off the top of my head:
anonymous inner classes in Java instead of passing a pointer to a function (way more lines of code).
in Ruby, the ||= operator, to evaluate an expression and assign to it if it evaluates to false or is null. Sure, you can achieve the same thing by 3 lines of code, but why?
and many more...
Use it to confuse people!
I don't know PHP and I've never seen short-circuiting used outside an if or while condition in the C family of languages, but in Perl it's very idiomatic to say:
open my $filehandle, '<', 'filename' or die "Couldn't open file: $!";
One advantage of having it all in one statement is the variable declaration. Otherwise you'd have to say:
my $filehandle;
unless (open $filehandle, '<', 'filename') {
die "Couldn't open file: $!";
}
Hard to claim the second one is cleaner in that case. And it'd be wordier still in a language that doesn't have unless
I think your example is for the coolness factor. There's no reason to write code like that.
EDIT: I have no problem with doing it for idiomatic reasons. If everyone else who uses a language uses short-circuit evaluation to make statement-like entities that everyone understands, then you should too. However, my experience is that code of that sort is rarely written in C-family languages; proper form is just to use the "if" statement as normal, which separates the conditional (which presumably has no side effects) from the function call that the conditional controls (which presumably has many side effects).
Short circuit operators can be useful in two important circumstances which haven't yet been mentioned:
Case 1. Suppose you had a pointer which may or may not be NULL and you wanted to check that it wasn't NULL, and that the thing it pointed to wasn't 0. However, you must not dereference the pointer if it's NULL. Without short-circuit operators, you would have to do this:
if (a != NULL) {
if (*a != 0) {
⋮
}
}
However, short-circuit operators allow you to write this more compactly:
if (a != NULL && *a != 0) {
⋮
}
in the certain knowledge that *a will not be evaluated if a is NULL.
Case 2. If you want to set a variable to a non-false value returned from one of a series of functions, you can simply do:
my $file = $user_filename ||
find_file_in_user_path() ||
find_file_in_system_path() ||
$default_filename;
This sets the value of $file to $user_filename if it's present, or the result of find_file_in_user_path(), if it's true, or … so on. This is seen perhaps more often in Perl than C, but I have seen it in C.
There are other uses, including the rather contrived examples which you cite above. But they are a useful tool, and one which I have missed when programming in less complex languages.
Related to what Dan said, I'd think it all depends on the conventions of each programming language. I can't see any difference, so do whatever is idiomatic in each programming language. One thing that could make a difference that comes to mind is if you had to do a series of checks, in that case the short-circuiting style would be much clearer than the alternative if style.
What if you had a expensive to call (performance wise) function that returned a boolean on the right hand side that you only wanted called if another condition was true (or false)? In this case Short circuiting saves you many CPU cycles. It does make the code more concise because of fewer nested if statements. So, for all the reasons you listed at the end of your question.
The truth is actually performance. Short circuiting is used in compilers to eliminate dead code saving on file size and execution speed. At run-time short-circuiting does not execute the remaining clause in the logical expression if their outcome does not affect the answer, speeding up the evaluation of the formula. I am struggling to remember an example. e.g
a AND b AND c
There are two terms in this formula evaluated left to right.
if a AND b evaluates to FALSE then the next expression AND c can either be FALSE AND TRUE or FALSE AND FALSE. Both evaluate to FALSE no matter what the value of c is. Therefore the compiler does not include AND c in the compiled format hence short-circuiting the code.
To answer the question there are special cases when the compiler cannot determine whether the logical expression has a constant output and hence would not short-circuit the code.
Think of it this way, if you have a statement like
if( A AND B )
chances are if A returns FALSE you'll only ever want to evaluate B in rare special cases. For this reason NOT using short ciruit evaluation is confusing.
Short circuit evaluation also makes your code more readable by preventing another bracketed indentation and brackets have a tendency to add up.
I wonder why ruby give and, or less precedence than &&, || , and assign operator? Is there any reason?
My guess is that's a direct carry-over from Perl. The operators or and and were added later in Perl 5 for specific situations were lower precedence was desired.
For example, in Perl, here we wish that || had lower precedence, so that we could write:
try to perform big long hairy complicated action || die ;
and be sure that the || was not going to gobble up part of the action. Perl 5 introduced or, a new version of || that has low precedence, for exactly this purpose.
An example in Ruby where you could use or but not ||:
value = possibly_false or raise "foo"
If you used ||, it would be a syntax error.
The difference is precedence. ||, && have higher precedence than =, but and, or have lower. So while you can do:
a = nil || 0
You would have to do:
a = (nil or 0)
to get same effect. If you do:
a = nil or 0
The result of expression would still be 0, but a value would be nil.
They have very low precedence so that the operands don't have to be wrapped in parentheses, as is sometimes the case with && and ||.
Being able to control the precedence of your operators is sometimes useful, especially if you are concerned with readability -- extra parenthesis in conditional statements can sometimes obscure the actual logic.
To be frank, though, I think the reason Ruby has the boolean operator precedence levels it does stems mostly from the fact that Matz was a Perl programmer before he ever wrote Ruby, and borrowed much of the core syntax and operators from that language.
I believe the idea is specifically to get them below the assignment operators, so you can write logic tests with assignments but without parens.