I've got some symbols which are non-commutative, but I don't want to have to remember which expressions have this behaviour while constructing equations.
I've had the thought to use MakeExpression to act on the raw boxes, and automatically uplift multiply to non-commutative multiply when appropriate (for instance when some of the symbols are non-commutative objects).
I was wondering whether anyone had any experience with this kind of configuration.
Here's what I've got so far:
(* Detect whether a set of row boxes represents a multiplication *)
Clear[isRowBoxMultiply];
isRowBoxMultiply[x_RowBox] := (Print["rowbox: ", x];
Head[ToExpression[x]] === Times)
isRowBoxMultiply[x___] := (Print["non-rowbox: ", x]; False)
(* Hook into the expression maker, so that we can capture any \
expression of the form F[x___], to see how it is composed of boxes, \
and return true or false on that basis *)
MakeExpression[
RowBox[List["F", "[", x___, "]"]], _] := (HoldComplete[
isRowBoxMultiply[x]])
(* Test a number of expressions to see whether they are automatically \
detected as multiplies or not. *)
F[a]
F[a b]
F[a*b]
F[a - b]
F[3 x]
F[x^2]
F[e f*g ** h*i j]
Clear[MakeExpression]
This appears to correctly identify expressions that are multiplication statements:
During evaluation of In[561]:= non-rowbox: a
Out[565]= False
During evaluation of In[561]:= rowbox: RowBox[{a,b}]
Out[566]= True
During evaluation of In[561]:= rowbox: RowBox[{a,*,b}]
Out[567]= True
During evaluation of In[561]:= rowbox: RowBox[{a,-,b}]
Out[568]= False
During evaluation of In[561]:= rowbox: RowBox[{3,x}]
Out[569]= True
During evaluation of In[561]:= non-rowbox: SuperscriptBox[x,2]
Out[570]= False
During evaluation of In[561]:= rowbox: RowBox[{e,f,*,RowBox[{g,**,h}],*,i,j}]
Out[571]= True
So, it looks like it's not out of the question that I might be able to conditionally rewrite the boxes of the underlying expression; but how can I do this reliably?
Take the expression RowBox[{"e","f","*",RowBox[{"g","**","h"}],"*","i","j"}]: this would need to be rewritten as RowBox[{"e","**","f","**",RowBox[{"g","**","h"}],"**","i","**","j"}], which seems like a non-trivial operation to do with the pattern matcher and a rule set.
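For what it's worth, here is a rough sketch of the kind of rewrite rule I have in mind (toNCMBoxes is just a made-up name; it only handles a single flat RowBox and blindly assumes the row really is a multiplication, so it's nowhere near reliable yet):
(* Strip explicit "*" separators and re-join every factor with "**" *)
toNCMBoxes[RowBox[items_List]] :=
 RowBox[Riffle[DeleteCases[items, "*"], "**"]]

toNCMBoxes[RowBox[{"e", "f", "*", RowBox[{"g", "**", "h"}], "*", "i", "j"}]]
(* RowBox[{"e","**","f","**",RowBox[{"g","**","h"}],"**","i","**","j"}] *)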
I'd be grateful for any suggestions from those more experienced than me.
I'm trying to find a way of doing this without altering the default behaviour and ordering of multiply.
Thanks! :)
Joe
This is not the most direct answer to your question, but for many purposes working as low-level as directly with the boxes might be overkill. Here is an alternative: let the Mathematica parser parse your code, and then make a change. Here is a possibility:
ClearAll[withNoncommutativeMultiply];
SetAttributes[withNoncommutativeMultiply, HoldAll];
withNoncommutativeMultiply[code_] :=
Internal`InheritedBlock[{Times},
Unprotect[Times];
Times = NonCommutativeMultiply;
Protect[Times];
code];
This replaces Times dynamically with NonCommutativeMultiply, and avoids the intricacies you mentioned. By using Internal`InheritedBlock, I make modifications to Times local to the code executed inside withNoncommutativeMultiply.
You now can automate the application of this function with $Pre:
$Pre = withNoncommutativeMultiply;
Now, for example:
In[36]:=
F[a]
F[a b]
F[a*b]
F[a-b]
F[3 x]
F[x^2]
F[e f*g**h*i j]
Out[36]= F[a]
Out[37]= F[a**b]
Out[38]= F[a**b]
Out[39]= F[a+(-1)**b]
Out[40]= F[3**x]
Out[41]= F[x^2]
Out[42]= F[e**f**g**h**i**j]
Surely, using $Pre in such a manner is hardly appropriate, since in all your code multiplication will be replaced with noncommutative multiplication - I used this as an illustration. You could make a more complicated redefinition of Times, so that this would only work for certain symbols.
Here is a safer alternative based on lexical, rather than dynamic, scoping:
ClearAll[withNoncommutativeMultiplyLex];
SetAttributes[withNoncommutativeMultiplyLex, HoldAll];
withNoncommutativeMultiplyLex[code_] :=
With @@ Append[
Hold[{Times = NonCommutativeMultiply}],
Unevaluated[code]]
You can use this in the same way, but only those instances of Times which are explicitly present in the code will be replaced. Again, this is just an illustration of the principle; one can extend or specialize this as needed. Instead of With, which is rather limited in its ability to specialize / add special cases, one can use replacement rules, which have similar semantics.
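For instance, here is a rough sketch of one such specialization of the lexical version (nonCommutingQ, myTimes and withSelectiveNCM are made-up names; it does not try to pull commuting factors out of a mixed product, it simply switches the whole product to ** as soon as one flagged symbol is present):
ClearAll[nonCommutingQ, myTimes, withSelectiveNCM];
nonCommutingQ[x_] := MatchQ[x, a | b];  (* example: flag a and b as non-commuting *)

myTimes[args___] :=
 If[Or @@ (nonCommutingQ /@ {args}),
  NonCommutativeMultiply[args],  (* keep the written order *)
  Times[args]]                   (* ordinary commutative product *)

SetAttributes[withSelectiveNCM, HoldAll];
withSelectiveNCM[code_] :=
 With @@ Append[Hold[{Times = myTimes}], Unevaluated[code]]

withSelectiveNCM[F[a b, x y]]
(* F[a ** b, x y] *)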
If I understand correctly, you want to input
a b and a*b
and have MMA understand automatically that Times is really a non-commutative operator (which has its own, separate, commutation rules).
Well, my suggestion is that you use the Notation package.
It is very powerful and (relatively) easy to use (especially for a sophisticated user like you seem to be).
It can be used programmatically and it can reinterpret predefined symbols like Times.
Basically it can intercept Times and change it to MyTimes. You then write code for MyTimes deciding for example which symbols are non commuting and then the output can be pretty formatted again as times or whatever else you wish.
The input and output processing are 2 lines of code. That’s it!
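To give a rough flavour of the MyTimes side of this idea in plain Mathematica (this is not the Notation package itself, and MyTimes is a made-up name; intercepting Times on input is what the package's Notation statements would take care of), the output formatting could be as simple as:
ClearAll[MyTimes];
(* an inert head whose output is dressed up to look like an ordinary product *)
MyTimes /: MakeBoxes[MyTimes[args__], StandardForm] :=
 RowBox[Riffle[MakeBoxes[#, StandardForm] & /@ {args}, "\[Times]"]]

MyTimes[a, b, c]  (* keeps the head MyTimes, but displays as a \[Times] b \[Times] c *)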
You have to read the documentation carefully and do some experimentation, if what you want is not more or less “standard hacking” of the input-output jobs.
Your case seems to me pretty much standard (again: If I understood well what you want to achieve) and you should find useful to read the “advanced” pages of the Notation package.
To give you an idea of how powerful and flexible the package is, I am using it to write the input-output formatting of a sizable package of Category Theory where noncommutative operations abound. But wait! I am not just defining ONE noncommutative operation, I am defining an unlimited number of noncommutative operations.
Another thing I did was to reinterpret Power when the arguments are categories, without overloading Power. This allows me to treat functorial categories using standard mathematics notation.
Now my “infinite” operations and "super Power" have the same look and feel of standard MMA symbols, including copy-paste functionality.
So, this doesn't directly answer the question, but it does provide the sort of implementation that I was thinking about.
So, after a bit of investigation and taking on board some of @LeonidShifrin's suggestions, I've managed to implement most of what I was thinking of. The idea is that it's possible to define patterns that should be considered to be non-commuting quantities, using commutingQ[form] := False. Then any multiplicative expression (actually any expression) can be wrapped with withCommutativeSensitivity[expr] and the expression will be manipulated to separate the quantities into Times[] and NonCommutativeMultiply[] sub-expressions as appropriate:
In[1]:= commutingQ[b] ^:= False;
In[2]:= withCommutativeSensitivity[ a (a + b + 4) b (3 + a) b ]
Out[2]= a (3 + a) (a + b + 4) ** b ** b
Of course it's possible to use $Pre = withCommutativeSensitivity to make this behaviour the default (come on Wolfram! Make it default already ;) ). It would, however, be nice to have it as a more fundamental behaviour. I'd really like to make a module and call Needs[NonCommutativeQuantities] at the beginning of any notebook that needs it, and not have all the facilities that use $Pre break on me (doesn't tracing use it?).
Intuitively I feel that there must be a natural way to hook this functionality into Mathematica at the level of box parsing and wire it up using MakeExpression[]. Am I overextending here? I'd appreciate any thoughts as to whether I'm heading down a blind alley. (I've had a few experiments in this direction, but I always get caught in a recursive definition that I can't work out how to break.)
Any thoughts would be gladly received,
Joe.
Code
Unprotect[NonCommutativeMultiply];
ClearAll[NonCommutativeMultiply]
NonCommutativeMultiply[a_] := a
Protect[NonCommutativeMultiply];
ClearAll[commutingQ]
commutingQ::usage = "commutingQ[expr] returns True if expr doesn't contain any constituent parts that fail the commutingQ test. By default all objects return True to commutingQ.";
commutingQ[x_] := If[Length[x] == 0, True, And @@ (commutingQ /@ List @@ x)]
ClearAll[times2, withCommutativeSensitivity]
SetAttributes[times2, {Flat, OneIdentity, HoldAll}]
SetAttributes[withCommutativeSensitivity, HoldAll];
gatherByCriteria[list_List, crit_] :=
 With[{gathered =
    Gather[{#, crit[#1]} & /@ list, #1[[2]] == #2[[2]] &]},
  (Identity @@ Union[#[[2]]] -> #[[1]] &)[Transpose[#]] & /@ gathered]
times2[x__] := Module[{a, b, y = List[x]},
  Times @@ (gatherByCriteria[y, commutingQ] //.
     {True -> Times, False -> NonCommutativeMultiply,
      HoldPattern[a_ -> b_] :> a @@ b})]
withCommutativeSensitivity[code_] := With @@ Append[
   Hold[{Times = times2, NonCommutativeMultiply = times2}],
   Unevaluated[code]]
This answer does not address your question but rather the problem that leads you to ask that question. Mathematica is pretty useless when dealing with non-commuting objects, but since such objects abound in, e.g., particle physics, there are some useful packages around to deal with the situation.
Look at the grassmanOps package. They have a method to define symbols as either commuting or anti-commuting and overload the standard NonCommutativeMultiply to handle, i.e. pass through, commuting symbols. They also define several other operators, such as Derivative, to handle anti-commuting symbols. It is probably easily adapted to cover arbitrary commutation rules and it should at the very least give you an insight into what things need to be changed if you want to roll your own.
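To make the pass-through idea concrete, here is a minimal sketch (this is not the actual grassmanOps code; commutingSymbolQ is a made-up predicate): commuting factors get pulled out of NonCommutativeMultiply into an ordinary Times prefactor.
ClearAll[commutingSymbolQ];
commutingSymbolQ[_] := False;
commutingSymbolQ[c] = True;  (* hypothetical: declare c to be commuting *)

Unprotect[NonCommutativeMultiply];
NonCommutativeMultiply[args__] /; MemberQ[{args}, _?commutingSymbolQ] :=
 With[{comm = Select[{args}, commutingSymbolQ],
       noncomm = Select[{args}, ! commutingSymbolQ[#] &]},
  (Times @@ comm)*
   If[Length[noncomm] > 1, NonCommutativeMultiply @@ noncomm, Times @@ noncomm]]
Protect[NonCommutativeMultiply];

a ** c ** b  (* c (a ** b): c commutes out, a and b keep their order *)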
I'm in love with Ruby. In this language all core functions are actually methods. That's why I prefer postfix notation, where the data I want to process is placed to the left of the body of the anonymous processing function, for example array.map{...}. I believe it has advantages in how easy the code is to read.
But Mathematica, being functional (yeah, it can be procedural if you want), dictates a style where the function name is placed to the left of the data. As we can see in its manuals, // is used only for some simple function without arguments, like list // MatrixForm. When a function needs a lot of arguments, the people who wrote the manuals use the syntax F[data].
It would be okay, but my problem is the case F[f, data], for example Do[function, {x, a, b}]. Most Mathematica functions (if not all) have their arguments in exactly this order: [function, data], not [data, function]. As I prefer to use pure functions to keep the namespace clean instead of creating a lot of named functions in my notebook, the function argument can be big: so big that the data argument ends up on the 5th-20th line of code after the line with the function call.
This is why sometimes, when my evil Ruby nature takes control, I rewrite such functions in a postfix way:
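For example, a rewrite along these lines (someLongPureFunction is just a placeholder for a big anonymous body):
(* manual style: function first, data last *)
Do[
 someLongPureFunction[x],
 {x, a, b}]

(* my Ruby-flavoured postfix rewrite: data first, function after *)
{x, a, b} // Do[
   someLongPureFunction[x],
   #] &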
Because it's important for me that the pure function (potentially a big chunk of code) is placed to the right of the data being processed. Yeah, I do it and I'm happy. But there are two things:
this causes a problem for Mathematica's syntax-highlighting parser: the x in postfix notation is highlighted in blue, not turquoise;
every time I look into the Mathematica manuals, I see examples like this one: Do[x[[i]] = (v[[i]] - U[[i, i + 1 ;; n]].x[[i + 1 ;; n]])/ U[[i, i]], {i, n, 1, -1}];, which means... hell, they think it's easy to read/support/etc.?!
So these two things made me ask this question here: am I such a bad boy for using my Ruby style, and should I write code the way these guys do, or is it OK, and I don't have to worry, and can write as I like?
The style you propose is frequently possible, but is inadvisable in the case of Do. The problem is that Do has the attribute HoldAll. This is important because the loop variable (x in the example) must remain unevaluated and be treated as a local variable. To see this, try evaluating these expressions:
x = 123;
Do[Print[x], {x, 1, 2}]
(* prints 1 and 2 *)
{x, 1, 2} // Do[Print[x], #]&
(* error: Do::itraw: Raw object 123 cannot be used as an iterator.
Do[Print[x], {123, 1, 2}]
*)
The error occurs because the pure function Do[Print[x], #]& lacks the HoldAll attribute, causing {x, 1, 2} to be evaluated. You could solve the problem by explicitly defining a pure function with the HoldAll attribute, thus:
{x, 1, 2} // Function[Null, Do[Print[x], #], HoldAll]
... but I suspect that the cure is worse than the disease :)
Thus, when one is using "binding" expressions like Do, Table, Module and so on, it is safest to conform with the herd.
I think you need to learn to use the styles that Mathematica most naturally supports. Certainly there is more than one way, and my code does not look like everyone else's. Nevertheless, if you continue to try to beat Mathematica syntax into your own preconceived style, based on a different language, I foresee nothing but continued frustration for you.
Whitespace is not evil, and you can easily add line breaks to separate long arguments:
Do[
x[[i]] = (v[[i]] - U[[i, i + 1 ;; n]].x[[i + 1 ;; n]]) / U[[i, i]]
, {i, n, 1, -1}
];
This said, I like to write using more prefix (f @ x) and infix (x ~ f ~ y) notation than I usually see, and I find this valuable because it is easy to determine that such functions are receiving one and two arguments respectively. This is somewhat nonstandard, but I do not think it is kicking over the traces of Mathematica syntax. Rather, I see it as using the syntax to advantage. Sometimes this causes syntax highlighting to fail, but I can live with that:
f[x] ~Do~ {x, 2, 5}
When using anything besides the standard form of f[x, y, z] (with line breaks as needed), you must be more careful of evaluation order, and IMHO, readability can suffer. Consider this contrived example:
{x, y} // # + 1 & @@ # &
I do not find this intuitive. Yes, for someone intimate with Mathematica's order of operations, it is readable, but I believe it does not improve clarity. I tend to reserve // postfix for named functions where reading is natural:
Do[f[x], {x, 10000}] //Timing //First
I'd say it is one of the biggest mistakes to try to program in a language B in ways idiomatic for a language A, only because you happen to know the latter well and like it. There is nothing wrong in borrowing idioms, but you have to make sure to understand the second language well enough so that you know why other people use it the way they do.
In the particular case of your example, and generally, I want to draw attention to a few things others did not mention. First, Do is a scoping construct which uses dynamic scoping to localize its iterator symbols. Therefore, you have:
In[4]:=
x=1;
{x,1,5}//Do[f[x],#]&
During evaluation of In[4]:= Do::itraw: Raw object
1 cannot be used as an iterator. >>
Out[5]= Do[f[x],{1,1,5}]
What a surprise, isn't it? This won't happen when you use Do in a standard fashion.
Second, note that, while this fact is largely ignored, f[#]&[arg] is NOT always the same as f[arg]. Example:
ClearAll[f];
SetAttributes[f, HoldAll];
f[x_] := Print[Unevaluated[x]]
f[5^2]
5^2
f[#] &[5^2]
25
This does not affect your example, but your usage is close enough to those cases affected by this, since you manipulate the scopes.
Mathematica supports 4 ways of applying a function to its arguments:
standard function form: f[x]
prefix: f@x or g@@{x,y}
postfix: x // f, and
infix: x~g~y which is equivalent to g[x,y].
What form you choose to use is up to you, and is often an aesthetic choice, more than anything else. Internally, f@x is interpreted as f[x]. Personally, I primarily use postfix, like you, because I view each function in the chain as a transformation, and it is easier to string multiple transformations together like that. That said, my code will be littered with both the standard form and prefix form mostly depending on whim, but I tend to use standard form more as it evokes a feeling of containment with regard to the function's parameters.
I took a little liberty with the prefix form, as I included the shorthand form of Apply (@@) alongside Prefix (@). Of the built-in commands, only the standard form, infix form, and Apply allow you to easily pass more than one variable to your function without additional work. Apply (e.g. g @@ {x,y}) works by replacing the Head of the expression ({x,y}) with the function, in effect evaluating the function with multiple variables (g @@ {x,y} == g[x,y]).
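To make the distinction concrete (f and g are still just placeholders):
(* prefix, postfix and infix are pure input syntax; they parse to the bracketed forms *)
f @ x        (* identical to f[x] *)
x // f       (* identical to f[x] *)
x ~g~ y      (* identical to g[x, y] *)
(* Apply, by contrast, is an ordinary function that swaps the head of an
   existing expression at evaluation time *)
g @@ {x, y}  (* evaluates to g[x, y] *)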
The method I use to pass multiple variables to my functions using the postfix form is via lists. This necessitates a little more work as I have to write
{x,y} // f[ #[[1]], #[[2]] ]&
to specify which element of the List corresponds to the appropriate parameter. I tend to do this, but you could combine this with Apply like
{x,y} // f @@ #&
which involves less typing, but could be more difficult to interpret when you read it later.
Edit: I should point out that f and g above are just placeholders; they can be, and often are, replaced with pure functions, e.g. #+1& @ x is mostly equivalent to #+1&[x], see Leonid's answer.
To clarify, per Leonid's answer, the equivalence between f@expr and f[expr] is true if f does not possess an attribute that would prevent the expression, expr, from being evaluated before being passed to f. For instance, one of the Attributes of Do is HoldAll, which lets it act as a scoping construct whose parameters are evaluated internally without undue outside influence. The point is that expr will be evaluated prior to being passed to f, so if you need it to remain unevaluated, extra care must be taken, like creating a pure function with a Hold-style attribute.
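A quick illustration of that last point, reusing f from Leonid's answer above (it has HoldAll):
ClearAll[f];
SetAttributes[f, HoldAll];
f[x_] := Print[Unevaluated[x]]

5^2 // f[#] &                          (* prints 25: 5^2 is evaluated before f sees it *)
5^2 // Function[Null, f[#], HoldAll]   (* prints 5^2: the argument stays unevaluated *)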
You can certainly do it, as you evidently know. Personally, I would not worry about how the manuals write code, and just write it the way I find natural and memorable.
However, I have noticed that I usually fall into definite patterns. For instance, if I produce a list after some computation and incidentally plot it to make sure it's what I expected, I usually do
prodListAfterLongComputation[
  args
 ] // ListPlot[#, PlotRange -> Full] &
If I have a list, say lst, and I am now focusing on producing a complicated plot, I'll do
ListPlot[
lst,
Option1->Setting1,
Option2->Setting2
]
So basically, anything that is incidental and perhaps not important to be readable (I don't need to be able to instantaneously parse the first ListPlot as it's not the point of that bit of code) ends up being postfix, to avoid disrupting the already-written complicated code it is applied to. Conversely, complicated code I tend to write in the way I find easiest to parse later, which, in my case, is something like
f[
g[
a,
b,
c
]
]
even though it takes more typing and, if one does not use the Workbench/Eclipse plugin, makes it more work to reorganize code.
So I suppose I'd answer your question with "do whatever is most convenient after taking into account the possible need for readability and the possible loss of convenience such as code highlighting, extra work to refactor code etc".
Of course all this applies if you're the only one working with some piece of code; if there are others, it is a different question altogether.
But this is just an opinion. I doubt it's possible for anybody to offer more than this.
For one-argument functions, f@(arg), (arg)//f and f[arg] are completely equivalent, even in the sense of applying the attributes of f. In the case of multi-argument functions one may write f@Sequence[args] or Sequence[args]//f with the same effect:
In[1]:= SetAttributes[f,HoldAll];
In[2]:= arg1:=Print[];
In[3]:= f@arg1
Out[3]= f[arg1]
In[4]:= f@Sequence[arg1,arg1]
Out[4]= f[arg1,arg1]
So it seems that the solution for anyone who likes postfix notation is to use Sequence:
x=123;
Sequence[Print[x],{x,1,2}]//Do
(* prints 1 and 2 *)
Some difficulties can potentially appear with functions having attribute SequenceHold or HoldAllComplete:
In[18]:= Select[{#, ToExpression[#, InputForm, Attributes]} & /@
Names["System`*"],
MemberQ[#[[2]], SequenceHold | HoldAllComplete] &][[All, 1]]
Out[18]= {"AbsoluteTiming", "DebugTag", "EvaluationObject", \
"HoldComplete", "InterpretationBox", "MakeBoxes", "ParallelEvaluate", \
"ParallelSubmit", "Parenthesize", "PreemptProtect", "Rule", \
"RuleDelayed", "Set", "SetDelayed", "SystemException", "TagSet", \
"TagSetDelayed", "Timing", "Unevaluated", "UpSet", "UpSetDelayed"}
Old habits die hard, and I realise that I have been using opts___Rule pattern-matching and constructs like thisoption /. {opts} /. Options[myfunction] in the very large package that I'm currently developing. Sal Mangano's "Mathematica Cookbook" reminds me that the post-version-6 way of doing this is opts:OptionsPattern[] and OptionValue[thisoption]. The package requires version 8 anyway, but I had just never changed the way I wrote this kind of code over the years.
Is it worth refactoring all that from my pre-version-6 way of doing things? Are there performance or other benefits?
Regards
Verbeia
EDIT: Summary
A lot of good points were made in response to this question, so thank you (and plus one, of course) all. To summarise, yes, I should refactor to use OptionsPattern and OptionValue. (NB: OptionsPattern not OptionPattern as I had it before!) There are a number of reasons why:
It's a touch faster (@Sasha)
It better handles functions where the arguments must be in HoldForm (@Leonid)
OptionsPattern automatically checks that you are passing a valid option to that function (FilterRules would still be needed if you are passing to a different function) (@Leonid)
It handles RuleDelayed (:>) much better (@rcollyer)
It handles nested lists of rules without using Flatten (@Andrew)
It is a bit easier to assign multiple local variables using OptionValue /@ list instead of having multiple calls to someoptions /. {opts} /. Options[thisfunction] (came up in comments between @rcollyer and me)
EDIT: 25 July I initially thought that the one time using the /. syntax might still make sense is if you are deliberately extracting a default option from another function, not the one actually being called. It turns out that this is handled by using the form of OptionsPattern[] with a list of heads inside it, for example: OptionsPattern[{myLineGraph, DateListPlot, myDateTicks, GraphNotesGrid}] (see the "More information" section in the documentation). I only worked this out recently.
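To illustrate the last bullet and the edit above, here is a small sketch (myLineGraph and the string option names are made up): several option values are read at once by giving OptionValue a list, and the PlotRange default is inherited from ListLinePlot through the list-of-heads form of OptionsPattern:
ClearAll[myLineGraph];
Options[myLineGraph] = {"LineColor" -> Black, "Label" -> None};
myLineGraph[data_, opts : OptionsPattern[{myLineGraph, ListLinePlot}]] :=
 Module[{color, label},
  {color, label} = OptionValue[{"LineColor", "Label"}];  (* several values at once *)
  ListLinePlot[data,
   PlotStyle -> color,
   PlotLabel -> label,
   PlotRange -> OptionValue[PlotRange],  (* default taken from ListLinePlot *)
   FilterRules[{opts}, Options[ListLinePlot]]]]

myLineGraph[RandomReal[1, 10], "LineColor" -> Red, PlotRange -> All]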
It seems like relying on the pattern matcher yields faster execution than using PatternTest, as the latter entails invocation of the evaluator. Anyway, my timings indicate that some speed-ups can be achieved, but I do not think they are so critical as to prompt re-factoring.
In[7]:= f[x__, opts : OptionsPattern[NIntegrate]] := {x,
OptionValue[WorkingPrecision]}
In[8]:= f2[x__, opts___?OptionQ] := {x,
WorkingPrecision /. {opts} /. Options[NIntegrate]}
In[9]:= AbsoluteTiming[Do[f[1, 2, PrecisionGoal -> 17], {10^6}];]
Out[9]= {5.0885088, Null}
In[10]:= AbsoluteTiming[Do[f2[1, 2, PrecisionGoal -> 17], {10^6}];]
Out[10]= {8.0908090, Null}
In[11]:= f[1, 2, PrecisionGoal -> 17]
Out[11]= {1, 2, MachinePrecision}
In[12]:= f2[1, 2, PrecisionGoal -> 17]
Out[12]= {1, 2, MachinePrecision}
While several answers have stressed different aspects of the old vs. new way of using options, I'd like to make a few additional observations. The newer constructs OptionValue - OptionsPattern provide more safety than OptionQ, since OptionValue inspects a list of global Options to make sure that the passed option is known to the function. The older OptionQ, however, seems easier to understand, since it is based only on standard pattern matching and isn't directly related to any global properties. Whether or not you want the extra safety provided by either of these constructs is up to you, but my guess is that most people find it useful, especially for larger projects.
One reason why these type checks are really useful is that often options are passed as parameters by functions in a chain-like manner, filtered, etc., so without such checks some of the pattern-matching errors would be very hard to catch since they would be causing harm "far away" from the place of their origin.
In terms of the core language, the OptionValue - OptionsPattern constructs are an addition to the pattern-matcher, and perhaps the most "magical" of all its features. It was not necessary semantically, as long as one is willing to consider options as a special case of rules. Moreover, OptionValue connects the pattern-matching to Options[symbol] - a global property. So, if one insists on language purity, rules as in opts___?OptionQ seem easier to understand - one does not need anything except the standard rule-substitution semantics to understand this:
f[a_, b_, opts___?OptionQ] := Print[someOption/.Flatten[{opts}]/.Options[f]]
(I note that the OptionQ predicate was designed specifically to recognize options in older versions of Mathematica), while this:
f[a_, b_, opts:OptionsPattern[]] := Print[OptionValue[someOption]]
looks quite magical. It becomes a bit clearer when you use Trace and see that the short form of OptionValue evaluates to a longer form, but the fact that it automatically determines the enclosing function name is still remarkable.
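For the curious, here is a small sketch of that longer, explicit form (g, gExplicit and someOption are made-up names); the short form inside the body behaves like the three-argument OptionValue[f, opts, name]:
ClearAll[g, gExplicit];
Options[g] = {someOption -> 42};
g[a_, opts : OptionsPattern[]] := OptionValue[someOption]      (* short, "magical" form *)
gExplicit[a_, opts___] := OptionValue[g, {opts}, someOption]   (* explicit form *)

{g[1], g[1, someOption -> 7]}                  (* {42, 7} *)
{gExplicit[1], gExplicit[1, someOption -> 7]}  (* {42, 7} *)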
There are a few more consequences of OptionsPattern being a part of the pattern language. One is the speed improvements discussed by @Sasha. However, speed issues are often over-emphasized (this is not to detract from his observations), and I expect this to be especially true for functions with options, since these tend to be the higher-level functions, which will likely have a non-trivial body, where most of the computation time will be spent.
Another rather interesting difference is when one needs to pass options to a function which holds its arguments. Consider the following example:
ClearAll[f, ff, fff, a, b, c, d];
Options[f] = Options[ff] = {a -> 0, c -> 0};
SetAttributes[{f, ff}, HoldAll];
f[x_, y_, opts___?OptionQ] :=
{{"Parameters:", {HoldForm[x], HoldForm[y]}}, {" options: ", {opts}}};
ff[x_, y_, opts : OptionsPattern[]] :=
{{"Parameters:", {HoldForm[x], HoldForm[y]}}, {" options: ", {opts}}};
This is ok:
In[199]:= f[Print["*"],Print["**"],a->b,c->d]
Out[199]= {{Parameters:,{Print[*],Print[**]}},{ options: ,{a->b,c->d}}}
But here our OptionQ-based function leaks evaluation as a part of pattern-matching process:
In[200]:= f[Print["*"],Print["**"],Print["***"],a->b,c->d]
During evaluation of In[200]:= ***
Out[200]= f[Print[*],Print[**],Print[***],a->b,c->d]
This is not completely trivial. What happens is that the pattern-matcher, to establish a fact of match or non-match, must evaluate the third Print, as a part of evaluation of OptionQ, since OptionQ does not hold arguments. To avoid the evaluation leak, one needs to use Function[opt,OptionQ[Unevaluated[opt]],HoldAll] in place of OptionQ. With OptionsPattern we don't have this problem, since the fact of the match can be established purely syntactically:
In[201]:= ff[Print["*"],Print["**"],a->b,c->d]
Out[201]= {{Parameters:,{Print[*],Print[**]}},{ options: ,{a->b,c->d}}}
In[202]:= ff[Print["*"],Print["**"],Print["***"],a->b,c->d]
Out[202]= ff[Print[*],Print[**],Print[***],a->b,c->d]
So, to summarize: I think choosing one method over another is largely a matter of taste - each one can be used productively, and also each one can be abused. I am more inclined to use the newer way, since it provides more safety, but I do not exclude that there exist some corner cases where it will surprise you - while the older method is semantically easier to understand. This is somewhat similar to the C vs. C++ comparison (if that is an appropriate one): automation and (possibly) safety vs. simplicity and purity. My two cents.
A little known (but frequently useful) fact is that options are allowed to appear in nested lists:
In[1]:= MatchQ[{{a -> b}, c -> d}, OptionsPattern[]]
Out[1]= True
The options handling functions such as FilterRules know about this:
In[2]:= FilterRules[{{PlotRange -> 3}, PlotStyle -> Blue,
MaxIterations -> 5}, Options[Plot]]
Out[2]= {PlotRange -> 3, PlotStyle -> RGBColor[0, 0, 1]}
OptionValue takes it into account:
In[3]:= OptionValue[{{a -> b}, c -> d}, a]
Out[3]= b
But ReplaceAll (/.) doesn't take this into account of course:
In[4]:= a /. {{a -> b}, c -> d}
During evaluation of In[4]:= ReplaceAll::rmix: Elements of {{a->b},c->d} are a mixture of lists and nonlists. >>
Out[4]= a /. {{a -> b}, c -> d}
So, if you use OptionsPattern, you should probably also use OptionValue to ensure that you can consume the set of options the user passes in.
On the other hand, if you use ReplaceAll (/.), you should stick to opts___Rule for the same reason.
Note that opts___Rule is also a little bit too forgiving in certain (admittedly obscure) cases:
Not a valid option:
In[5]:= MatchQ[Unevaluated[Rule[a]], OptionsPattern[]]
Out[5]= False
But ___Rule lets it through:
In[6]:= MatchQ[Unevaluated[Rule[a]], ___Rule]
Out[6]= True
Update: As rcollyer pointed out, another more serious problem with ___Rule is that it misses options specified with RuleDelayed (:>). You can work around it (see rcollyer's answer), but it's another good reason to use OptionValue.
Your code itself has a subtle, but fixable flaw. The pattern opts___Rule will not match options of the form a :> b, so if you ever need to use it, you'll have to update your code. The immediate fix is to replace opts___Rule with opts:(___Rule | ___RuleDelayed) which requires more typing than OptionsPattern[]. But, for the lazy among us, OptionValue[...] requires more typing than the short form of ReplaceAll. However, I think it makes for cleaner reading code.
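A quick illustration of the gap, and of how both the extended pattern and OptionsPattern[] cover it:
MatchQ[a :> b, _Rule]                   (* False: the head is RuleDelayed *)
MatchQ[a :> b, _Rule | _RuleDelayed]    (* True *)
MatchQ[a :> b, OptionsPattern[]]        (* True *)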
I find the use of OptionsPattern[] and OptionValue to be easier to read and instantly comprehend what is being done. The older form of opts___ ... and ReplaceAll was much more difficult to comprehend on a first pass read through. Add to that, the clear timing advantages, and I'd go with updating your code.
As I learned recently there are some types of expressions in Mathematica which are automatically parsed by the FrontEnd.
For example if we evaluate HoldComplete[Rotate[Style[expr, Red], 0.5]] we see that the FrontEnd does not display the original expression:
Is it possible to control such behavior of the FrontEnd?
And is it possible to get complete list of expressions those are parsed by the FrontEnd automatically?
EDIT
We can see calls to MakeBoxes when using Print:
On[MakeBoxes]; Print[HoldComplete@Rotate["text", Pi/2]]
But copy-pasting the printed output gives a changed expression: HoldComplete[Rotate["text", 1.5707963267948966]]. This shows that Print does not respect HoldComplete.
When creating an output Cell there should be calls to MakeBoxes too. Is there a way to see them?
I have found a post by John Fultz with a pretty clear explanation of how the graphics functionality works:
In version 6, the kernel has absolutely no involvement whatsoever in generating the rendered image. The steps taken in displaying a graphic in version 6 are very much like those used in displaying non-graphical output. It works as follows:
1) The expression is evaluated, and ultimately produces something with head Graphics[] or Graphics3D[].
2) The resulting expression is passed through MakeBoxes. MakeBoxes has a set of rules which turns the graphics expression into the box language which the front end uses to represent graphics. E.g.,
In[9]:= MakeBoxes[Graphics[{Point[{0, 0}]}], StandardForm]
Out[9]= GraphicsBox[{PointBox[{0, 0}]}]
Internally, we call this the "typeset" expression. It may be a little weird thinking of graphics as being "typeset", but it's fundamentally the same operation which happens for typesetting (which has worked this way for 11 years), so I'll use the term.
3) The resulting typeset expression is sent via MathLink to the front end.
4) The front end parses the typeset expression and creates internal objects which generally have a one-to-one correspondence to the typeset expression.
5) The front end renders the internal objects.
This means that the conversion is performed in the Kernel by a call to MakeBoxes.
This call can be intercepted through high-level code:
list = {};
MakeBoxes[expr_, form_] /; (AppendTo[list, HoldComplete[expr]];
True) := Null;
HoldComplete[Rotate[Style[expr, Red], 0.5]]
ClearAll[MakeBoxes];
list
Here is what we get as output:
One can see that MakeBoxes does not respect the HoldAllComplete attribute.
One can get the list of symbols which are auto-converted before being sent to the FrontEnd from FormatValues:
In[1]:= list =
Select[Names["*"],
ToExpression[#, InputForm,
Function[symbol, Length[FormatValues@symbol] > 0, HoldAll]] &];
list // Length
During evaluation of In[1]:= General::readp: Symbol I is read-protected. >>
Out[2]= 162
There are two aspects to what you witness: first, the transcription of the expression you entered into boxes, and second, the rendering of those boxes by the Front End. By default the output is typeset using StandardForm, which has a typesetting rule to render graphics and geometric transforms. If you use InputForm, there is no such rule. You can control which form is used via Preferences->Evaluation.
You can convince yourself that HoldComplete correctly did its job by using InputForm or FullForm on the input, or using InputForm display on the output cell.
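For instance:
HoldComplete[Rotate[Style[expr, Red], 0.5]] // InputForm
(* HoldComplete[Rotate[Style[expr, Red], 0.5]] -- the expression is still intact *)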
EDIT Using the OutputForm:
In[13]:= OutputForm[%]
Out[13]//OutputForm= HoldComplete[Rotate[expr, 0.5]]
In regard to your question about the complete list of symbols: it includes Graphics, geometric operations, and possibly others, but I do not know the complete list.
Not quite an answer, but in Preferences > Evaluation there are options to "Only use textual boxes when converting (input|output) to typeset forms."
If you check these, then using Cell > Convert To... > StandardForm etc... will show the Rotate[..] instead of the visually rotated result.
John Fultz has recently answered my question on converting TableForm to "typeset" expressions, and it is worth citing here since it amplifies (while partially contradicting) the general explanation cited in my previous answer:
ToBoxes is returning precisely what the kernel sends to the front end without variation (except, in the general case, for the evaluation semantics and side effects possibly being different, but that's not an issue in your example).
The issue is that the front end has two different specifications for specifying GridBox options... one of which dates back to version 3, and the other, more expansive set dates to version 6. The front end understands both sets of options, but canonicalizes anything it receives to the version 6 options.
GridBox is the only box which has had such a wholesale change of options, and it was necessary to support new functionality we added in v6. But the front end will continue to understand the old options for a seriously long time (probably forever), as the old options show up not only in certain kernel typesetting constructs, but in legacy notebook files.
ToBoxes[] of TableForm is creating the legacy options, as there's been no need to update the typesetting of TableForm in a while (ToBoxes[] of Grid, on the other hand, uses modern options). The conversion is done by the front end. You could rely on the front end to do the conversion for you, or you could figure out how the options map yourself.
So in this case the final stage of the conversion of the expression is done by the FrontEnd.
I would like to define a new operator of the form x /==> y, where the operator /==> is treated like e.g. the /@ operator of Map, and is translated to MyFunction[x, y]. There is one important aspect: I want the resulting operator to behave in the frontend like any two-character operator does, that is, the two characters (a Divide and a DoubleLongRightArrow) should be connected together, no syntax coloration should appear, and they are to be selected together when clicked, so precedence must be set. Also, I'd rather avoid using the Notation` package. As a result, I'd like to see something like this:
In[11]:= FullForm[x/\[DoubleLongRightArrow]y]
Out[11]//FullForm= MyFunction[x,y]
Does anyone have an idea how to achieve this?
The Notation Package is perhaps the closest to doing this kind of thing, but according to the response to my own question of a similar nature, what you want is unfortunately not practical.
Don't let this stop you from trying, however, as you will probably learn new things in the process. The Notation Package and the functions that underpin it are far from useless.
You may also find the replies to this question informative.
There are a number of functions that are useful for manual implementation of syntax changes. Rather than try to write my own help file for these, I will direct you to the official pages on these functions. After reading them, please ask any focused questions you have, or for help with implementing specific ideas. I or others here should be able to either answer your question, show you how to do something, or explain why it is not readily possible.
The index page on Textual Input and Output.
MakeBoxes, and MakeExpression, and an example of their use.
$PreRead
More drastically, one might use CellEvaluationFunction which can be used to do unusual things.
There are more, and I will try to extend this list later. (others are welcome to edit this post)
Thanks to Mr.Wizard's links, I've found the only example in the documentation on how to parse new operators (the gplus example in Low-Level Input). Following this example, here is my version of the new operator PerArrow. Please comment on / criticize the code below:
In[1]:= PerArrow /: MakeBoxes[PerArrow[x_, y_], StandardForm] :=
RowBox[{MakeBoxes[x, StandardForm],
RowBox[{AdjustmentBox["/", BoxMargins -> -.2],
AdjustmentBox["\[DoubleLongRightArrow]", BoxMargins -> -.1]}],
MakeBoxes[y, StandardForm]}];
MakeExpression[
RowBox[{x_, "/", RowBox[{"\[DoubleLongRightArrow]", y_}]}],
StandardForm] :=
MakeExpression[RowBox[{"PerArrow", "[", x, ",", y, "]"}],
StandardForm];
In[3]:= PerArrow[x, y]
Out[3]= x /\[DoubleLongRightArrow] y
In[4]:= x /\[DoubleLongRightArrow]y
Out[4]= x /\[DoubleLongRightArrow] y
In[5]:= FullForm[x /\[DoubleLongRightArrow]y]
Out[5]//FullForm= \!\(\*
TagBox[
StyleBox[
RowBox[{"PerArrow", "[",
RowBox[{"x", ",", "y"}], "]"}],
ShowSpecialCharacters->False,
ShowStringCharacters->True,
NumberMarks->True],
FullForm]\)
For sake of clarity, here is a screenshot as well:
Since the operator is not fully integrated, further concerns are:
the operator is selected weirdly when clicked (the DoubleLongRightArrow is grouped with y instead of with /).
accordingly, the parsing part requires the DoubleLongRightArrow to be RowBox-ed with y, otherwise it yields a syntax error
syntax coloration (at In[4] and In[5])
it prints weirdly if input directly (notice the large gaps at In[4] and In[5])
Now, I can live with these, though it would be nice to have some means of ironing out all the minor issues. I guess these all boil down to some even lower-level syntax handler which does not know how to group the new operator. Any idea on how to tackle these? I understand that Cell has a multitude of options which might come in handy (like CellEvaluationFunction, ShowAutoStyles and InputAutoReplacements), though I'm again clueless here.