Why don't programming languages use simplified boolean expressions?

I've never understood why we use syntax like this:
if (a == b || a == c)
when it could be simplified to something like this:
if (a == b || c)
Is this an issue with compilers or something? Can we really not account for a string of code like this and make it work?

There is no technical limitation that would make it impossible (or even particularly difficult) to implement a language that treats a == b || c as a shortcut for a == b || a == c. The problem is that it would be almost (?) impossible to come up with rules that do so only in the cases where that's what's expected.
For instance, consider the expression result == null || fileIsClosed where fileIsClosed is a boolean. Surely the programmer would not expect this to be treated as result == null || result == fileIsClosed. You could come up with additional rules like "the replacement is only applied if the right operand of || is not a boolean", but then the replacement no longer works for booleanResult == possibleResult1 || possibleResult2, where the right operand is a boolean but the replacement is wanted. In fact, the only thing about this example that tells us whether the programmer intended the replacement to happen is the names of the variables. Obviously the compiler can't infer meaning from variable names, so it'd be impossible to do what the user wants in every case, making simple rules without exceptions (like "expr1 || expr2 is true iff at least one of expr1 and expr2 is true") preferable.
So in summary: we don't want the replacement to take place in all cases, and inferring in which cases it would make sense with complete accuracy is impossible. Since it should be easy to reason about code, implementing a system that may or may not apply the replacement based on rules that 90% of programmers won't know or understand would lead to confusing behavior in certain cases, and is therefore not a good idea.
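To make the ambiguity concrete, here is a small C++ sketch of the fileIsClosed example (the variable names and values are illustrative, not from the original post):

#include <iostream>

int main() {
    int result = 0;           // e.g. a status code that may be compared to several values
    bool fileIsClosed = true;

    // Today this parses unambiguously as (result == 0) || fileIsClosed.
    // Under the proposed shortcut rule it could silently be read as
    // (result == 0) || (result == fileIsClosed), which is not what the
    // author of the condition meant.
    if (result == 0 || fileIsClosed) {
        std::cout << "ok to proceed\n";
    }
    return 0;
}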

Part of the answer is that if (a == b || c) would not be interpretable without context, or without knowing the meaning behind the expression.
Consider the following situation:
What if c were a boolean expression or a boolean value by itself? The condition could then be interpreted as either:
if (a has the value of b, OR c is true)
if (a has the value of b, OR a has the value of c)
Now what was the intention when a human being coded these lines? ... *think*. We don't know exactly, and neither does a compiler when it is asked to produce object code to be executed on a machine (as intended by the developer).
Or, more drastically: a compiler cannot (and should not) guess, like humans sometimes do.

You can, you are just using the wrong programming language :)
In Icon, for example, you can write a = (b | c). The way that works is that | really generates a sequence of alternatives, and = (equality) filters them (where failure yields no results, and success yields some results). Icon implements this through backtracking.
You can do even crazier things, like ((0 to 4) > 1) * 3, which will produce 6 9 12.

As you already know, the expression you wrote would have a different meaning. However, some languages provide extensions for changing the behavior of an operator, which you can use to define your own custom operations.
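For instance, in C++ you can get the look of the shortcut through ordinary operator overloading. This is only an illustrative sketch; the any_of wrapper below is a made-up helper for this answer, not a standard facility:

#include <initializer_list>

// Made-up wrapper type: holds the candidate values from the right-hand side.
template <typename T>
struct any_of {
    std::initializer_list<T> values;
    any_of(std::initializer_list<T> v) : values(v) {}
};

// Overloaded == so that "x == any_of<T>{b, c}" means "x == b || x == c".
template <typename T>
bool operator==(const T& x, const any_of<T>& candidates) {
    for (const T& v : candidates.values)
        if (x == v) return true;
    return false;
}

int main() {
    int a = 2, b = 2, c = 3;
    // Reads close to the "a == b || c" from the question, but the expansion
    // is explicit and opt-in rather than a global language rule.
    return (a == any_of<int>{b, c}) ? 0 : 1;
}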

Related

TCL how to require both operands to determine result in IF statement

New to Tcl, and it seems I'm running into a short-circuit issue. Coming from VBScript, I'm able to perform this properly, but in converting to a Tcl script I'm having issues with the short-circuit behavior and have been trying to find the proper way of doing this.
In the following snippet, I want to execute "do something" only if BOTH sides are true, but because of short circuiting, it will only evaluate the second argument if the first fails to determine the value of the expression.
if {$basehour != 23 && $hours != 0} {
do something
}
Maybe I'm not searching for the right things, but so far I've been unable to find the solution. Any tips would be appreciated.
The && operator always short-circuits in Tcl (as it does in C, Java, and a number of other languages too). If you want the non-short-circuiting version and can guarantee that both sub-expressions yield booleans (e.g., they come from equality tests such as you're doing), then you can use the & operator instead, which does bit-wise AND and will do what you want when working on booleans. If you do this, it's wise to put parentheses around the sub-expressions for clarity; while everyone remembers the precedence of == with respect to &&, its precedence with respect to & is often forgotten. (The parentheses are free in terms of execution cost.)
if {($basehour != 23) & ($hours != 0)} {
do something
}
However, it's usually not necessary to do this. If you're feeding the AND into a boolean test (e.g., the if command's expression), there's no reason not to short-circuit, as in your original code: if the first clause yields false, the second one can't change the value the overall expression produces.
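For comparison, the same & versus && distinction exists in C and C++. This small C++ sketch (mine, not part of the original answer) uses a printing side effect to make the difference visible:

#include <iostream>

// Prints its name so we can see whether it was evaluated.
bool check(const char* name, bool value) {
    std::cout << "evaluated " << name << '\n';
    return value;
}

int main() {
    // && short-circuits: "rhs" is never printed, the left side decides.
    if (check("lhs", false) && check("rhs", true)) {}

    // & on booleans evaluates both operands, like the Tcl & above.
    if (check("lhs", false) & check("rhs", true)) {}
    return 0;
}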

When comparing a variable to a literal, should one place the literal on the left or right of the equals '==' operator?

When learning to code, I was taught the following style when checking the value of a variable:
int x;
Object *object;
...
if(x == 7) { ... }
if(object == NULL) { ... }
However, now that I am in the field, I have encountered more than one co-worker who swears by the approach of switching the lhs and rhs in the if statements:
if(7 == x) { ... }
if(NULL == object) { ... }
The reasoning being that if you accidentally type = instead of ==, then the code will fail to compile. Being unaccustomed to this style, reading 7 == x is difficult for me, slowing my comprehension of their code.
It seems that if I adopt this style, I will likely someday save myself from debugging an x = 7 bug, but in the meantime, every time somebody reads my code I may be wasting their time, because I fear the syntax is unorthodox.
Is the 7 == x style generally accepted and readable in the industry, or is this just a personal preference of my coworkers?
The reasoning being that if you accidentally type = instead of ==, then the code will fail to compile.
True. On the other hand, I believe modern C and C++ compilers (I'm assuming you're using one of those languages? You haven't said) will warn you if you do this.
Have you tried it with the compiler you're using? If it doesn't do it by default, look to see if there are flags you can use to provoke it - ideally to make it an error rather than just a warning.
For example, using the Microsoft C compiler, I get:
cl /Wall Test.c
test.c(3) : warning C4706: assignment within conditional expression
That's pretty clear, IMO. (The default warning settings don't spot it, admittedly.)
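The contents of Test.c aren't shown above; a minimal reconstruction that provokes this class of warning would be something like:

int main(void)
{
    int x = 0;
    if (x = 7) {  /* typo: '=' where '==' was intended; cl reports C4706 here */
        return 1;
    }
    return 0;
}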
Being unaccustomed to this style, reading 7 == x is difficult for me, slowing my comprehension of their code.
Indeed. Your approach is the more natural style, and should (IMO) be used unless you're really dealing with a compiler which doesn't spot this as a potential problem (and you have no alternative to using that compiler).
EDIT: Note that this isn't a problem in all languages - not even all C-like languages.
For example, although both Java and C# have a similar if construct, the condition expression in both needs to be implicitly convertible to a Boolean value. While the assignment part would compile, the type of the expression in your first example would be int, which isn't implicitly convertible to the relevant Boolean type in either language, leading to a compile-time error. The rare situation where you'd still have a problem would be:
if (foo == true)
which, if typo'd to:
if (foo = true)
would compile and do the wrong thing. The MS C# compiler even warns you about that, although it's generally better to just use
if (foo)
or
if (!foo)
where possible. That just leaves things like:
if (x == MethodReturningBool())
vs
if (MethodReturningBool() == x)
which is still pretty rare, and there's still a warning for it in the MS C# compiler (and probably in some Java compilers).
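To spell out the mechanics both sides are arguing about, here is a small C++ sketch (mine, not from the answer above). The Yoda order turns the typo into a hard error instead of a warning, because a literal cannot be assigned to:

int main() {
    int x = 0;

    // if (x = 7) {}  // typo compiles (most compilers only warn)
    // if (7 = x) {}  // typo is a hard error: 7 is not an lvalue

    if (7 == x) {     // the Yoda-style comparison itself is fine
        return 1;
    }
    return 0;
}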

Times and NonCommutativeMultiply, handling the difference automatically

I've got some symbols which are non-commutative, but I don't want to have to remember which expressions have this behaviour whilst constructing equations.
I've had the thought to use MakeExpression to act on the raw boxes, and automatically uplift multiply to non-commutative multiply when appropriate (for instance when some of the symbols are non-commutative objects).
I was wondering whether anyone had any experience with this kind of configuration.
Here's what I've got so far:
(* Detect whether a set of row boxes represents a multiplication *)
Clear[isRowBoxMultiply];
isRowBoxMultiply[x_RowBox] := (Print["rowbox: ", x];
Head[ToExpression[x]] === Times)
isRowBoxMultiply[x___] := (Print["non-rowbox: ", x]; False)
(* Hook into the expression maker, so that we can capture any \
expression of the form F[x___], to see how it is composed of boxes, \
and return true or false on that basis *)
MakeExpression[
RowBox[List["F", "[", x___, "]"]], _] := (HoldComplete[
isRowBoxMultiply[x]])
(* Test a number of expressions to see whether they are automatically \
detected as multiplies or not. *)
F[a]
F[a b]
F[a*b]
F[a - b]
F[3 x]
F[x^2]
F[e f*g ** h*i j]
Clear[MakeExpression]
This appears to correctly identify expressions that are multiplication statements:
During evaluation of In[561]:= non-rowbox: a
Out[565]= False
During evaluation of In[561]:= rowbox: RowBox[{a,b}]
Out[566]= True
During evaluation of In[561]:= rowbox: RowBox[{a,*,b}]
Out[567]= True
During evaluation of In[561]:= rowbox: RowBox[{a,-,b}]
Out[568]= False
During evaluation of In[561]:= rowbox: RowBox[{3,x}]
Out[569]= True
During evaluation of In[561]:= non-rowbox: SuperscriptBox[x,2]
Out[570]= False
During evaluation of In[561]:= rowbox: RowBox[{e,f,*,RowBox[{g,**,h}],*,i,j}]
Out[571]= True
So, it looks like it's not out of the question that I might be able to conditionally rewrite the boxes of the underlying expression; but how can this be done reliably?
Take the expression RowBox[{"e","f","*",RowBox[{"g","**","h"}],"*","i","j"}]; this would need to be rewritten as RowBox[{"e","**","f","**",RowBox[{"g","**","h"}],"**","i","**","j"}], which seems like a non-trivial operation to do with the pattern matcher and a rule set.
I'd be grateful for any suggestions from those more experienced than me.
I'm trying to find a way of doing this without altering the default behaviour and ordering of multiply.
Thanks! :)
Joe
This is not the most direct answer to your question, but for many purposes, working at as low a level as the boxes themselves might be overkill. Here is an alternative: let the Mathematica parser parse your code, and then make a change. Here is one possibility:
ClearAll[withNoncommutativeMultiply];
SetAttributes[withNoncommutativeMultiply, HoldAll];
withNoncommutativeMultiply[code_] :=
Internal`InheritedBlock[{Times},
Unprotect[Times];
Times = NonCommutativeMultiply;
Protect[Times];
code];
This replaces Times dynamically with NonCommutativeMultiply, and avoids the intricacies you mentioned. By using Internal`InheritedBlock, I make modifications to Times local to the code executed inside withNoncommutativeMultiply.
You now can automate the application of this function with $Pre:
$Pre = withNoncommutativeMultiply;
Now, for example:
In[36]:=
F[a]
F[a b]
F[a*b]
F[a-b]
F[3 x]
F[x^2]
F[e f*g**h*i j]
Out[36]= F[a]
Out[37]= F[a**b]
Out[38]= F[a**b]
Out[39]= F[a+(-1)**b]
Out[40]= F[3**x]
Out[41]= F[x^2]
Out[42]= F[e**f**g**h**i**j]
Surely, using $Pre in such a manner is hardly appropriate, since in all your code multiplication will be replaced with noncommutative multiplication - I used this only as an illustration. You could make a more complicated redefinition of Times, so that this would only work for certain symbols.
Here is a safer alternative based on lexical, rather than dynamic, scoping:
ClearAll[withNoncommutativeMultiplyLex];
SetAttributes[withNoncommutativeMultiplyLex, HoldAll];
withNoncommutativeMultiplyLex[code_] :=
With @@ Append[
Hold[{Times = NonCommutativeMultiply}],
Unevaluated[code]]
You can use this in the same way, but only those instances of Times which are explicitly present in the code will be replaced. Again, this is just an illustration of the principle; one can extend or specialize this as needed. Instead of With, which is rather limited in its ability to specialize / add special cases, one can use replacement rules, which have similar semantics.
If I understand correctly, you want to input
a b and a*b
and have MMA automatically understand that Times is really a non-commutative operator (which has its own, separate, commutation rules).
Well, my suggestion is that you use the Notation package.
It is very powerful and (relatively) easy to use (especially for a sophisticated user like you seem to be).
It can be used programmatically and it can reinterpret predefined symbols like Times.
Basically it can intercept Times and change it to MyTimes. You then write code for MyTimes deciding for example which symbols are non commuting and then the output can be pretty formatted again as times or whatever else you wish.
The input and output processing are 2 lines of code. That’s it!
You have to read the documentation carefully and do some experimentation if what you want is not more or less "standard hacking" of the input-output jobs.
Your case seems to me pretty much standard (again: if I understood well what you want to achieve), and you should find it useful to read the "advanced" pages of the Notation package.
To give you an idea of how powerful and flexible the package is, I am using it to write the input-output formatting of a sizable package of Category Theory where noncommutative operations abound. But wait! I am not just defining ONE noncommutative operation, I am defining an unlimited number of noncommutative operations.
Another thing I did was to reinterpret Power when the arguments are categories, without overloading Power. This allows me to treat functorial categories using standard mathematics notation.
Now my “infinite” operations and "super Power" have the same look and feel of standard MMA symbols, including copy-paste functionality.
So, this doesn't directly answer the question, but it does provide the sort of implementation that I was thinking about.
So, after a bit of investigation and taking on board some of @LeonidShifrin's suggestions, I've managed to implement most of what I was thinking of. The idea is that it's possible to define patterns that should be considered to be non-commuting quantities, using commutingQ[form] ^:= False. Then any multiplicative expression (actually any expression) can be wrapped with withCommutativeSensitivity[expr], and the expression will be manipulated to separate the quantities into Times[] and NonCommutativeMultiply[] sub-expressions as appropriate.
In[1]:= commutingQ[b] ^:= False;
In[2]:= withCommutativeSensitivity[ a (a + b + 4) b (3 + a) b ]
Out[1]:= a (3 + a) (a + b + 4) ** b ** b
Of course it's possible to use $Pre = withCommutativeSensitivity to make this behaviour the default (come on Wolfram! Make it default already ;) ). It would, however, be nice to have it as a more fundamental behaviour. I'd really like to make a module and Needs[NonCommutativeQuantities] at the beginning of any notebook that needs it, and not have all the facilities that use $Pre break on me (doesn't tracing use it?).
Intuitively I feel that there must be a natural way to hook this functionality into Mathematica at the level of box parsing and wire it up using MakeExpression[]. Am I overextending here? I'd appreciate any thoughts as to whether I'm heading up a blind alley. (I've had a few experiments in this direction, but always get caught in a recursive definition that I can't work out how to break.)
Any thoughts would be gladly received,
Joe.
Code
Unprotect[NonCommutativeMultiply];
ClearAll[NonCommutativeMultiply]
NonCommutativeMultiply[a_] := a
Protect[NonCommutativeMultiply];
ClearAll[commutingQ]
commutingQ::usage = "commutingQ[\!\(\*
StyleBox[\"expr\", \"InlineFormula\",\nFontSlant->\"Italic\"]\)] \
returns True if expr doesn't contain any constituent parts that fail \
the commutingQ test. By default all objects return True to \
commutingQ.";
commutingQ[x_] := If[Length[x] == 0, True, And @@ (commutingQ /@ List @@ x)]
ClearAll[times2, withCommutativeSensitivity]
SetAttributes[times2, {Flat, OneIdentity, HoldAll}]
SetAttributes[withCommutativeSensitivity, HoldAll];
gatherByCriteria[list_List, crit_] :=
With[{gathered =
Gather[{#, crit[#1]} & /@ list, #1[[2]] == #2[[2]] &]},
(Identity @@ Union[#[[2]]] -> #[[1]] &)[Transpose[#]] & /@ gathered]
times2[x__] := Module[{a, b, y = List[x]},
Times @@ (gatherByCriteria[y, commutingQ] //.
{True -> Times, False -> NonCommutativeMultiply,
HoldPattern[a_ -> b_] :> a @@ b})]
withCommutativeSensitivity[code_] := With @@ Append[
Hold[{Times = times2, NonCommutativeMultiply = times2}],
Unevaluated[code]]
This answer does not address your question but rather the problem that leads you to ask it. Mathematica is pretty useless when dealing with non-commuting objects, but since such objects abound in, e.g., particle physics, there are some useful packages around to deal with the situation.
Look at the grassmanOps package. It has a method to define symbols as either commuting or anti-commuting, and it overloads the standard NonCommutativeMultiply to handle (i.e. pass through) commuting symbols. It also defines several other operators, such as Derivative, to handle anti-commuting symbols. It is probably easily adapted to cover arbitrary commutation rules, and it should at the very least give you an insight into what needs to be changed if you want to roll your own.

Why use short-circuit code?

Related Questions: Benefits of using short-circuit evaluation, Why would a language NOT use Short-circuit evaluation?, Can someone explain this line of code please? (Logic & Assignment operators)
There are questions about the benefits of a language using short-circuit code, but I'm wondering what are the benefits for a programmer? Is it just that it can make code a little more concise? Or are there performance reasons?
I'm not asking about situations where two entities need to be evaluated anyway, for example:
if($user->auth() AND $model->valid()){
$model->save();
}
To me the reasoning there is clear - since both need to be true, you can skip the more costly model validation if the user can't save the data.
This also has a (to me) obvious purpose:
if(is_string($userid) AND strlen($userid) > 10){
//do something
};
Because it wouldn't be wise to call strlen() with a non-string value.
What I'm wondering about is the use of short-circuit code when it doesn't affect any other statements. For example, from the Zend Application default index page:
defined('APPLICATION_PATH')
|| define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
This could have been:
if(!defined('APPLICATION_PATH')){
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
}
Or even as a single statement:
if(!defined('APPLICATION_PATH'))
define('APPLICATION_PATH', realpath(dirname(__FILE__) . '/../application'));
So why use the short-circuit code? Just for the 'coolness' factor of using logic operators in place of control structures? To consolidate nested if statements? Because it's faster?
For programmers, the benefit of a less verbose syntax over another more verbose syntax can be:
less to type, therefore higher coding efficiency
less to read, therefore better maintainability.
Now I'm only talking about cases where the less verbose syntax is not tricky or clever in any way, just the same recognized way of doing things, but in fewer characters.
It's often only when you see a specific construct in another language that you realize you'd want it in the language you use, without necessarily having been aware of that before. Some examples off the top of my head:
having to use anonymous inner classes in Java instead of passing a pointer to a function (way more lines of code)
in Ruby, the ||= operator, which assigns to a variable only if it currently evaluates to false or nil. Sure, you can achieve the same thing with 3 lines of code, but why?
and many more...
Use it to confuse people!
I don't know PHP and I've never seen short-circuiting used outside an if or while condition in the C family of languages, but in Perl it's very idiomatic to say:
open my $filehandle, '<', 'filename' or die "Couldn't open file: $!";
One advantage of having it all in one statement is the variable declaration. Otherwise you'd have to say:
my $filehandle;
unless (open $filehandle, '<', 'filename') {
die "Couldn't open file: $!";
}
Hard to claim the second one is cleaner in that case. And it'd be wordier still in a language that doesn't have unless.
I think your example is for the coolness factor. There's no reason to write code like that.
EDIT: I have no problem with doing it for idiomatic reasons. If everyone else who uses a language uses short-circuit evaluation to make statement-like entities that everyone understands, then you should too. However, my experience is that code of that sort is rarely written in C-family languages; proper form is just to use the "if" statement as normal, which separates the conditional (which presumably has no side effects) from the function call that the conditional controls (which presumably has many side effects).
Short circuit operators can be useful in two important circumstances which haven't yet been mentioned:
Case 1. Suppose you had a pointer which may or may not be NULL and you wanted to check that it wasn't NULL, and that the thing it pointed to wasn't 0. However, you must not dereference the pointer if it's NULL. Without short-circuit operators, you would have to do this:
if (a != NULL) {
if (*a != 0) {
⋮
}
}
However, short-circuit operators allow you to write this more compactly:
if (a != NULL && *a != 0) {
⋮
}
in the certain knowledge that *a will not be evaluated if a is NULL.
Case 2. If you want to set a variable to a non-false value returned from one of a series of functions, you can simply do:
my $file = $user_filename ||
find_file_in_user_path() ||
find_file_in_system_path() ||
$default_filename;
This sets the value of $file to $user_filename if it's present, or to the result of find_file_in_user_path() if that's true, and so on. This is seen perhaps more often in Perl than C, but I have seen it in C.
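In C and C++ this particular idiom doesn't carry over directly, because || yields a bool rather than its operand. A sketch of the closest C++ equivalent (all names here are hypothetical):

#include <string>

// Hypothetical stand-ins for the Perl functions above.
std::string find_file_in_user_path()   { return ""; }
std::string find_file_in_system_path() { return ""; }

std::string pick_file(const std::string& user_filename,
                      const std::string& default_filename) {
    // The first non-empty ("true") value wins, mirroring the Perl || chain.
    if (!user_filename.empty()) return user_filename;
    std::string f = find_file_in_user_path();
    if (!f.empty()) return f;
    f = find_file_in_system_path();
    if (!f.empty()) return f;
    return default_filename;
}

int main() {
    return pick_file("", "default.txt") == "default.txt" ? 0 : 1;
}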
There are other uses, including the rather contrived examples which you cite above. But they are a useful tool, and one which I have missed when programming in less complex languages.
Related to what Dan said, I'd think it all depends on the conventions of each programming language. I can't see any difference myself, so do whatever is idiomatic in the language at hand. One thing that comes to mind that could make a difference is if you had to do a series of checks; in that case the short-circuiting style would be much clearer than the alternative if style.
What if you had an expensive-to-call (performance-wise) function on the right-hand side that returned a boolean, and you only wanted it called if another condition was true (or false)? In this case short-circuiting saves you many CPU cycles. It also makes the code more concise because of fewer nested if statements. So: for all the reasons you listed at the end of your question.
The truth is actually performance. Short-circuiting is used in compilers to eliminate dead code, saving on file size and execution speed; at run time, short-circuiting does not evaluate the remaining clauses of a logical expression if their outcome cannot affect the answer, which speeds up evaluation of the formula. For example:
a AND b AND c
The terms are evaluated left to right. If a AND b evaluates to FALSE, then the whole expression is FALSE no matter what the value of c is (FALSE AND TRUE and FALSE AND FALSE are both FALSE), so c never needs to be evaluated. When the compiler can prove this at compile time, it does not even include AND c in the compiled output, hence short-circuiting the code.
To answer the question: there are cases where the compiler cannot determine whether the logical expression has a constant outcome, and in those cases it cannot short-circuit the code away at compile time.
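As a sketch of the compile-time half of that claim (illustrative C++, not from the original answer): when the left operand is a known constant, the compiler can fold the whole condition and drop the unreachable clause from the generated code.

constexpr bool kFeatureEnabled = false;  // known to be false at compile time

int expensive_check() { return 1; }      // hypothetical costly test

int main() {
    // The condition folds to false, so an optimizing compiler can remove
    // the call to expensive_check() from the emitted code entirely.
    if (kFeatureEnabled && expensive_check() > 0) {
        return 1;
    }
    return 0;
}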
Think of it this way, if you have a statement like
if( A AND B )
chances are that if A returns FALSE you'll only ever want to evaluate B in rare special cases. For this reason, NOT using short-circuit evaluation is confusing.
Short-circuit evaluation also makes your code more readable by preventing another level of bracketed indentation, and brackets have a tendency to add up.

Is there any wisdom behind "and", "or" operators in Ruby?

I wonder why Ruby gives and and or lower precedence than &&, ||, and the assignment operator. Is there any reason?
My guess is that it's a direct carry-over from Perl. The operators or and and were added later in Perl 5 for specific situations where lower precedence was desired.
For example, in Perl, here is a case where we wish that || had lower precedence, so that we could write:
try to perform big long hairy complicated action || die ;
and be sure that the || was not going to gobble up part of the action. Perl 5 introduced or, a new version of || that has low precedence, for exactly this purpose.
An example in Ruby where you could use or but not ||:
value = possibly_false or raise "foo"
If you used ||, it would be a syntax error.
The difference is precedence. ||, && have higher precedence than =, but and, or have lower. So while you can do:
a = nil || 0
You would have to do:
a = (nil or 0)
to get the same effect. If you do:
a = nil or 0
The result of the expression would still be 0, but the value of a would be nil.
They have very low precedence so that the operands don't have to be wrapped in parentheses, as is sometimes the case with && and ||.
Being able to control the precedence of your operators is sometimes useful, especially if you are concerned with readability; extra parentheses in conditional statements can sometimes obscure the actual logic.
To be frank, though, I think the reason Ruby has the boolean operator precedence levels it does stems mostly from the fact that Matz was a Perl programmer before he ever wrote Ruby, and borrowed much of the core syntax and operators from that language.
I believe the idea is specifically to get them below the assignment operators, so you can write logic tests with assignments but without parens.
