Mathematica execution-time bug: symbol names

Mathematica execution-time bug: symbol names - wolfram-mathematica

There is a strange bug that has been in Mathematica for years, at least since version 5.1, and persisting through version 7.
Module[{f, L}, L = f[];
Do[L = f[L, i], {i, 10^4}]] // Timing
{0.015, Null}
Module[{weirdness, L}, L = weirdness[];
Do[L = weirdness[L, i], {i, 10^4}]] // Timing
{2.266, Null}
What causes this? Is it a hashing problem?
Is it fixed in Version 8?
Is there a way to know what symbol names cause a slowdown, other than testing?

What causes this? Is it a hashing problem?
Yes, more or less.
Is it fixed in Version 8?
Yes (also more or less). That is to say, it is not possible to fix in any "complete" sense. But the most common cases are much better handled.
Is there a way to know what symbol names cause a slowdown, other than testing?
No way of which I am aware.
In version 7 there is an earlier fix of a similar nature to the one in version 8. It was off by default (we'd not had adequate time to test it when we shipped, and it did not get turned on for version 7.0.1). It can be accessed as follows.
SetSystemOptions["NeedNotReevaluateOptions"->{"UseSymbolLists"->True}];
This brings your example back to the realm of the reasonable.
Module[{weirdness, L}, L = weirdness[];
Do[L = weirdness[L, i], {i, 10^4}]] // Timing
Out[8]= {0.020997, Null}
---edit---
I can explain the optimization involved here in slightly more detail. First recall that Mathematica emulates "infinite evaluation", that is, expressions keep evaluating until they no longer change. This can be costly and hence requires careful short circuit optimizations to forestall it when possible.
A mechanism we use is a variant of hashing, that serves to indicate that symbols on which an expression might depend are unchanged and hence that expression is unchanged. It is here that collisions might occur, thus necessitating more work.
In a bad case, the Mathematica kernel might need to walk the entire expression in order to determine that it is unchanged. This walk can be as costly as reevaluation. An optimization, new to version 7 (noted above), is to record explicitly, for some types of expression, those symbols upon which it depends. Then the reevaluation check can be shortened by simply checking that none of these symbols has been changed since the last time the expression was evaluated.
The implementation details are a bit involved (and also a bit proprietary, though perhaps not so hard to guess). But that, in brief, is what is going on under the hood. Earlier versions sometimes did significant expression traversal just to discover that the expression needed no reevaluation. This can still happen, but it is a much more rare event now.
---end edit---
Daniel Lichtblau
Wolfram Research

As to version 8: I tried 100,000 random strings of various lengths and didn't find anything out of the ordinary.
chars = StringCases[CharacterRange["A", "z"], WordCharacter] //Flatten;
res = Table[
ToExpression[
StringReplace[
"First[AbsoluteTiming[Module[{weirdness,L},L=weirdness[];\
\[IndentingNewLine]Do[L=weirdness[L,i],{i,10^4}]]]]",
"weirdness" ->
StringJoin[
RandomChoice[chars, RandomInteger[{1, 20}]]]]], {100000}];

Related

Is it worth it to rewrite an if statement to avoid branching?

Recently I realized I have been doing too much branching without caring the negative impact on performance it had, therefore I have made up my mind to attempt to learn all about not branching. And here is a more extreme case, in attempt to make the code to have as little branch as possible.
Hence for the code
if(expression)
A = C; //A and C have to be the same type here obviously
expression can be A == B, or Q<=B, it could be anything that resolve to true or false, or i would like to think of it in term of the result being 1 or 0 here
I have come up with this non branching version
A += (expression)*(C-A); //Edited with thanks
So my question would be, is this a good solution that maximize efficiency?
If yes why and if not why?

Depends on the compiler, instruction set, optimizer, etc. When you use a boolean expression as an int value, e.g., (A == B) * C, the compiler has to do the compare, and the set some register to 0 or 1 based on the result. Some instruction sets might not have any way to do that other than branching. Generally speaking, it's better to write simple, straightforward code and let the optimizer figure it out, or find a different algorithm that branches less.

Jeez, no, don't do that!
Anyone who "penalize[s] [you] a lot for branching" would hopefully send you packing for using something that awful.
How is it awful, let me count the ways:
There's no guarantee you can multiply a quantity (e.g., C) by a boolean value (e.g., (A==B) yields true or false). Some languages will, some won't.
Anyone casually reading it is going observe a calculation, not an assignment statement.
You're replacing a comparison, and a conditional branch with two comparisons, two multiplications, a subtraction, and an addition. Seriously non-optimal.
It only works for integral numeric quantities. Try this with a wide variety of floating point numbers, or with an object, and if you're really lucky it will be rejected by the compiler/interpreter/whatever.

You should only ever consider doing this if you had analyzed the runtime properties of the program and determined that there is a frequent branch misprediction here, and that this is causing an actual performance problem. It makes the code much less clear, and its not obvious that it would be any faster in general (this is something you would also have to measure, under the circumstances you are interested in).

After doing research, I came to the conclusion that when there are bottleneck, it would be good to include timed profiler, as these kind of codes are usually not portable and are mainly used for optimization.
An exact example I had after reading the following question below
Why is it faster to process a sorted array than an unsorted array?
I tested my code on C++ using that, that my implementation was actually slower due to the extra arithmetics.
HOWEVER!
For this case below
if(expression) //branched version
A += C;
//OR
A += (expression)*(C); //non-branching version
The timing was as of such.
Branched Sorted list was approximately 2seconds.
Branched unsorted list was aproximately 10 seconds.
My implementation (whether sorted or unsorted) are both 3seconds.
This goes to show that in an unsorted area of bottleneck, when we have a trivial branching that can be simply replaced by a single multiplication.
It is probably more worthwhile to consider the implementation that I have suggested.
** Once again it is mainly for the areas that is deemed as the bottleneck **

What is the recommended way to check that a list is a list of numbers in argument of a function?

I've been looking at the ways to check arguments of functions. I noticed that
MatrixQ takes 2 arguments, the second is a test to apply to each element.
But ListQ only takes one argument. (also for some reason, ?ListQ does not have a help page, like ?MatrixQ does).
So, for example, to check that an argument to a function is a matrix of numbers, I write
ClearAll[foo]
foo[a_?(MatrixQ[#, NumberQ] &)] := Module[{}, a + 1]
What would be a good way to do the same for a List? This below only checks that the input is a List
ClearAll[foo]
foo[a_?(ListQ[#] &)] := Module[{}, a + 1]
I could do something like this:
ClearAll[foo]
foo[a_?(ListQ[#] && (And ## Map[NumberQ[#] &, # ]) &)] := Module[{}, a + 1]
so that foo[{1, 2, 3}] will work, but foo[{1, 2, x}] will not (assuming x is a symbol). But it seems to me to be someone complicated way to do this.
Question: Do you know a better way to check that an argument is a list and also check the list content to be Numbers (or of any other Head known to Mathematica?)
And a related question: Any major run-time performance issues with adding such checks to each argument? If so, do you recommend these checks be removed after testing and development is completed so that final program runs faster? (for example, have a version of the code with all the checks in, for the development/testing, and a version without for production).

You might use VectorQ in a way completely analogous to MatrixQ. For example,
f[vector_ /; VectorQ[vector, NumericQ]] := ...
Also note two differences between VectorQ and ListQ:
A plain VectorQ (with no second argument) only gives true if no elements of the list are lists themselves (i.e. only for 1D structures)
VectorQ will handle SparseArrays while ListQ will not
I am not sure about the performance impact of using these in practice, I am very curious about that myself.
Here's a naive benchmark. I am comparing two functions: one that only checks the arguments, but does nothing, and one that adds two vectors (this is a very fast built-in operation, i.e. anything faster than this could be considered negligible). I am using NumericQ which is a more complex (therefore potentially slower) check than NumberQ.
In[2]:= add[a_ /; VectorQ[a, NumericQ], b_ /; VectorQ[b, NumericQ]] :=
a + b
In[3]:= nothing[a_ /; VectorQ[a, NumericQ],
b_ /; VectorQ[b, NumericQ]] := Null
Packed array. It can be verified that the check is constant time (not shown here).
In[4]:= rr = RandomReal[1, 10000000];
In[5]:= Do[add[rr, rr], {10}]; // Timing
Out[5]= {1.906, Null}
In[6]:= Do[nothing[rr, rr], {10}]; // Timing
Out[6]= {0., Null}
Homogeneous non-packed array. The check is linear time, but very fast.
In[7]:= rr2 = Developer`FromPackedArray#RandomInteger[10000, 1000000];
In[8]:= Do[add[rr2, rr2], {10}]; // Timing
Out[8]= {1.75, Null}
In[9]:= Do[nothing[rr2, rr2], {10}]; // Timing
Out[9]= {0.204, Null}
Non-homogeneous non-packed array. The check takes the same time as in the previous example.
In[10]:= rr3 = Join[rr2, {Pi, 1.0}];
In[11]:= Do[add[rr3, rr3], {10}]; // Timing
Out[11]= {5.625, Null}
In[12]:= Do[nothing[rr3, rr3], {10}]; // Timing
Out[12]= {0.282, Null}
Conclusion based on this very simple example:
VectorQ is highly optimized, at least when using common second arguments. It's much faster than e.g. adding two vectors, which itself is a well optimized operation.
For packed arrays VectorQ is constant time.
#Leonid's answer is very relevant too, please see it.

Regarding the performance hit (since your first question has been answered already) - by all means, do the checks, but in your top-level functions (which receive data directly from the user of your functionality. The user can also be another independent module, written by you or someone else). Don't put these checks in all your intermediate functions, since such checks will be duplicate and indeed unjustified.
EDIT
To address the problem of errors in intermediate functions, raised by #Nasser in the comments: there is a very simple technique which allows one to switch pattern-checks on and off in "one click". You can store your patterns in variables inside your package, defined prior to your function definitions.
Here is an example, where f is a top-level function, while g and h are "inner functions". We define two patterns: for the main function and for the inner ones, like so:
Clear[nlPatt,innerNLPatt ];
nlPatt= _?(!VectorQ[#,NumericQ]&);
innerNLPatt = nlPatt;
Now, we define our functions:
ClearAll[f,g,h];
f[vector:nlPatt]:=g[vector]+h[vector];
g[nv:innerNLPatt ]:=nv^2;
h[nv:innerNLPatt ]:=nv^3;
Note that the patterns are substituted inside definitions at definition time, not run-time, so this is exactly equivalent to coding those patterns by hand. Once you test, you just have to change one line: from
innerNLPatt = nlPatt
to
innerNLPatt = _
and reload your package.
A final question is - how do you quickly find errors? I answered that here, in sections "Instead of returning $Failed, one can throw an exception, using Throw.", and "Meta-programming and automation".
END EDIT
I included a brief discussion of this issue in my book here. In that example, the performance hit was on the level of 10% increase of running time, which IMO is borderline acceptable. In the case at hand, the check is simpler and the performance penalty is much less. Generally, for a function which is any computationally-intensive, correctly-written type checks cost only a small fraction of the total run-time.
A few tricks which are good to know:
Pattern-matcher can be very fast, when used syntactically (no Condition or PatternTest present in the pattern).
For example:
randomString[]:=FromCharacterCode#RandomInteger[{97,122},5];
rstest = Table[randomString[],{1000000}];
In[102]:= MatchQ[rstest,{__String}]//Timing
Out[102]= {0.047,True}
In[103]:= MatchQ[rstest,{__?StringQ}]//Timing
Out[103]= {0.234,True}
Just because in the latter case the PatternTest was used, the check is much slower, because evaluator is invoked by the pattern-matcher for every element, while in the first case, everything is purely syntactic and all is done inside the pattern-matcher.
The same is true for unpacked numerical lists (the timing difference is similar). However, for packed numerical lists, MatchQ and other pattern-testing functions don't unpack for certain special patterns, moreover, for some of them the check is instantaneous.
Here is an example:
In[113]:=
test = RandomInteger[100000,1000000];
In[114]:= MatchQ[test,{__?IntegerQ}]//Timing
Out[114]= {0.203,True}
In[115]:= MatchQ[test,{__Integer}]//Timing
Out[115]= {0.,True}
In[116]:= Do[MatchQ[test,{__Integer}],{1000}]//Timing
Out[116]= {0.,Null}
The same, apparently, seems to be true for functions like VectorQ, MatrixQ and ArrayQ with certain predicates (NumericQ) - these tests are extremely efficient.
A lot depends on how you write your test, i.e. to what degree you reuse the efficient Mathematica structures.
For example, we want to test that we have a real numeric matrix:
In[143]:= rm = RandomInteger[10000,{1500,1500}];
Here is the most straight-forward and slow way:
In[144]:= MatrixQ[rm,NumericQ[#]&&Im[#]==0&]//Timing
Out[144]= {4.125,True}
This is better, since we reuse the pattern-matcher better:
In[145]:= MatrixQ[rm,NumericQ]&&FreeQ[rm,Complex]//Timing
Out[145]= {0.204,True}
We did not utilize the packed nature of the matrix however. This is still better:
In[146]:= MatrixQ[rm,NumericQ]&&Total[Abs[Flatten[Im[rm]]]]==0//Timing
Out[146]= {0.047,True}
However, this is not the end. The following one is near instantaneous:
In[147]:= MatrixQ[rm,NumericQ]&&Re[rm]==rm//Timing
Out[147]= {0.,True}

Since ListQ just checks that the head is List, the following is a simple solution:
foo[a:{___?NumberQ}] := Module[{}, a + 1]

Is Put - Get cycle in Mathematica always deterministic?

In Mathematica as in other systems of computer math the numbers are internally stored in binary form. However when exporting them with such functions as Put and PutAppend they are converted into approximate decimals. When you import them back with such functions as Get they are restored from this approximate decimal representation to binary form.
The question is whether the recovered number is always identical to the original binary number and, if not always, in which cases it is not and how large can be the difference? I am particularly interested in the Put - Get cycle (on the same computer system).
The following two simple experiments show that probably the Put - Get cycle in Mathematica always restores original numbers exactly even for arbitrary precision numbers:
In[1]:= list=RandomReal[{-10^6,10^6},10000];
Put[list,"test.txt"];
list2=Get["test.txt"];
Order[list,list2]===0
Order[Total#Abs[list-list2],0.]===0
Out[4]= True
Out[5]= True
In[6]:= list=SetPrecision[RandomReal[{-10^6,10^6},10000],50];
Put[list,"test.txt"];
list2=Get["test.txt"];
Order[list,list2]===0
Total#Abs[list-list2]//InputForm
Out[9]= True
Out[10]//InputForm=
0``39.999515496936205
But maybe I am missing something?
UPDATE
With more correct test code I have found that in reality these tests show only that restored numbers have identical binary RealDigits but their Precisions may differ even in Equal sense. Here are more correct tests:
test := (Put[list, "test.txt"];
list2 = Get["test.txt"];
{Order[list, list2] === 0,
Order[Total#Abs[list - list2], 0.] === 0,
Total[Order ### RealDigits[Transpose[{list, list2}], 2]],
Total[Order ### Map[Precision, Transpose[{list, list2}], {-1}]],
Total[1 - Boole[Equal ### Map[Precision, Transpose[{list, list2}], {-1}]]]})
In[8]:= list=RandomReal[NormalDistribution[],10000]^1001;
test
Out[9]= {False,True,0,1,3}
In[6]:= list=RandomReal[NormalDistribution[],10000,WorkingPrecision->50]^1001;
test
Out[7]= {False,False,0,-2174,1}

I'm afraid I can't give a definitive answer. If you look into the text file you see it's stored as something like the InputForm of the values, including the precision indication for non-machine precision numbers.
Assuming that Get uses the same conversion routines as ImportString and ExportString your test can be sped up a tiny bit.
Monitor[
Do[
i = RandomReal[{$MinMachineNumber, 10 $MinMachineNumber}, 100000];
If[i =!=
ToExpression[ImportString[ExportString[i, "Text"], "List"]],
Print[i]], {n, 100}
],
n]
I have tested this for several hundreds of millions of numbers in various ranges between $MinMachineNumber and $MaxMachineNumber and I always get back the original numbers. It's no proof, of course, but it seems unlikely that you're going to see numbers for which this is not true if there are any (and in that case the difference would be so tiny as to be negligible).

One important thing to know is that Put[] / Get[] doesn't keep packed arrays packed. You should check out DumpSave[]. It's much faster as it's a binary format and keeps arrays packed.

Mathematica Notation and syntax mods

I am experimenting with syntax mods in Mathematica, using the Notation package.
I am not interested in mathematical notation for a specific field, but general purpose syntax modifications and extensions, especially notations that reduce the verbosity of Mathematica's VeryLongFunctionNames, clean up unwieldy constructs, or extend the language in a pleasing way.
An example modification is defining Fold[f, x] to evaluate as Fold[f, First#x, Rest#x]
This works well, and is quite convenient.
Another would be defining *{1,2} to evaluate as Sequence ## {1,2} as inspired by Python; this may or may not work in Mathematica.
Please provide information or links addressing:
Limits of notation and syntax modification
Tips and tricks for implementation
Existing packages, examples or experiments
Why this is a good or bad idea

Not a really constructive answer, just a couple of thoughts. First, a disclaimer - I don't suggest any of the methods described below as good practices (perhaps generally they are not), they are just some possibilities which seem to address your specific question. Regarding the stated goal - I support the idea very much, being able to reduce verbosity is great (for personal needs of a solo developer, at least). As for the tools: I have very little experience with Notation package, but, whether or not one uses it or writes some custom box-manipulation preprocessor, my feeling is that the whole fact that the input expression must be parsed into boxes by Mathematica parser severely limits a number of things that can be done. Additionally, there will likely be difficulties with using it in packages, as was mentioned in the other reply already.
It would be easiest if there would be some hook like $PreRead, which would allow the user to intercept the input string and process it into another string before it is fed to the parser. That would allow one to write a custom preprocessor which operates on the string level - or you can call it a compiler if you wish - which will take a string of whatever syntax you design and generate Mathematica code from it. I am not aware of such hook (it may be my ignorance of course). Lacking that, one can use for example the program style cells and perhaps program some buttons which read the string from those cells and call such preprocessor to generate the Mathematica code and paste it into the cell next to the one where the original code is.
Such preprocessor approach would work best if the language you want is some simple language (in terms of its syntax and grammar, at least), so that it is easy to lexically analyze and parse. If you want the Mathematica language (with its full syntax modulo just a few elements that you want to change), in this approach you are out of luck in the sense that, regardless of how few and "lightweight" your changes are, you'd need to re-implement pretty much completely the Mathematica parser, just to make those changes, if you want them to work reliably. In other words, what I am saying is that IMO it is much easier to write a preprocessor that would generate Mathematica code from some Lisp-like language with little or no syntax, than try to implement a few syntactic modifications to otherwise the standard mma.
Technically, one way to write such a preprocessor is to use standard tools like Lex(Flex) and Yacc(Bison) to define your grammar and generate the parser (say in C). Such parser can be plugged back to Mathematica either through MathLink or LibraryLink (in the case of C). Its end result would be a string, which, when parsed, would become a valid Mathematica expression. This expression would represent the abstract syntax tree of your parsed code. For example, code like this (new syntax for Fold is introduced here)
"((1|+|{2,3,4,5}))"
could be parsed into something like
"functionCall[fold,{plus,1,{2,3,4,5}}]"
The second component for such a preprocessor would be written in Mathematica, perhaps in a rule-based style, to generate Mathematica code from the AST. The resulting code must be somehow held unevaluated. For the above code, the result might look like
Hold[Fold[Plus,1,{2,3,4,5}]]
It would be best if analogs of tools like Lex(Flex)/Yacc(Bison) were available within Mathematica ( I mean bindings, which would require one to only write code in Mathematica, and generate say C parser from that automatically, plugging it back to the kernel either through MathLink or LibraryLink). I may only hope that they will become available in some future versions. Lacking that, the approach I described would require a lot of low-level work (C, or Java if your prefer). I think it is still doable however. If you can write C (or Java), you may try to do some fairly simple (in terms of the syntax / grammar) language - this may be an interesting project and will give an idea of what it will be like for a more complex one. I'd start with a very basic calculator example, and perhaps change the standard arithmetic operators there to some more weird ones that Mathematica can not parse properly itself, to make it more interesting. To avoid MathLink / LibraryLink complexity at first and just test, you can call the resulting executable from Mathematica with Run, passing the code as one of the command line arguments, and write the result to a temporary file, that you will then import into Mathematica. For the calculator example, the entire thing can be done in a few hours.
Of course, if you only want to abbreviate certain long function names, there is a much simpler alternative - you can use With to do that. Here is a practical example of that - my port of Peter Norvig's spelling corrector, where I cheated in this way to reduce the line count:
Clear[makeCorrector];
makeCorrector[corrector_Symbol, trainingText_String] :=
Module[{model, listOr, keys, words, edits1, train, max, known, knownEdits2},
(* Proxies for some commands - just to play with syntax a bit*)
With[{fn = Function, join = StringJoin, lower = ToLowerCase,
rev = Reverse, smatches = StringCases, seq = Sequence, chars = Characters,
inter = Intersection, dv = DownValues, len = Length, ins = Insert,
flat = Flatten, clr = Clear, rep = ReplacePart, hp = HoldPattern},
(* body *)
listOr = fn[Null, Scan[If[# =!= {}, Return[#]] &, Hold[##]], HoldAll];
keys[hash_] := keys[hash] = Union[Most[dv[hash][[All, 1, 1, 1]]]];
words[text_] := lower[smatches[text, LetterCharacter ..]];
With[{m = model},
train[feats_] := (clr[m]; m[_] = 1; m[#]++ & /# feats; m)];
With[{nwords = train[words[trainingText]],
alphabet = CharacterRange["a", "z"]},
edits1[word_] := With[{c = chars[word]}, join ### Join[
Table[
rep[c, c, #, rev[#]] &#{{i}, {i + 1}}, {i, len[c] - 1}],
Table[Delete[c, i], {i, len[c]}],
flat[Outer[#1[c, ##2] &, {ins[#1, #2, #3 + 1] &, rep},
alphabet, Range[len[c]], 1], 2]]];
max[set_] := Sort[Map[{nwords[#], #} &, set]][[-1, -1]];
known[words_] := inter[words, keys[nwords]]];
knownEdits2[word_] := known[flat[Nest[Map[edits1, #, {-1}] &, word, 2]]];
corrector[word_] := max[listOr[known[{word}], known[edits1[word]],
knownEdits2[word], {word}]];]];
You need some training text with a large number of words as a string to pass as a second argument, and the first argument is the function name for a corrector. Here is the one that Norvig used:
text = Import["http://norvig.com/big.txt", "Text"];
You call it once, say
In[7]:= makeCorrector[correct, text]
And then use it any number of times on some words
In[8]:= correct["coputer"] // Timing
Out[8]= {0.125, "computer"}
You can make your custom With-like control structure, where you hard-code the short names for some long mma names that annoy you the most, and then wrap that around your piece of code ( you'll lose the code highlighting however). Note, that I don't generally advocate this method - I did it just for fun and to reduce the line count a bit. But at least, this is universal in the sense that it will work both interactively and in packages. Can not do infix operators, can not change precedences, etc, etc, but almost zero work.

(my first reply/post.... be gentle)
From my experience, the functionality appears to be a bit of a programming cul-de-sac. The ability to define custom notations seems heavily dependent on using the 'notation palette' to define and clear each custom notation. ('everything is an expression'... well, except for some obscure cases, like Notations, where you have to use a palette.) Bummer.
The Notation package documentation mentions this explicitly, so I can't complain too much.
If you just want to define custom notations in a particular notebook, Notations might be useful to you. On the other hand, if your goal is to implement custom notations in YourOwnPackage.m and distribute them to others, you are likely to encounter issues. (unless you're extremely fluent in Box structures?)
If someone can correct my ignorance on this, you'd make my month!! :)
(I was hoping to use Notations to force MMA to treat subscripted variables as symbols.)

Not a full answer, but just to show a trick I learned here (more related to symbol redefinition than to Notation, I reckon):
Unprotect[Fold];
Fold[f_, x_] :=
Block[{$inMsg = True, result},
result = Fold[f, First#x, Rest#x];
result] /; ! TrueQ[$inMsg];
Protect[Fold];
Fold[f, {a, b, c, d}]
(*
--> f[f[f[a, b], c], d]
*)
Edit
Thanks to #rcollyer for the following (see comments below).
You can switch the definition on or off as you please by using the $inMsg variable:
$inMsg = False;
Fold[f, {a, b, c, d}]
(*
->f[f[f[a,b],c],d]
*)
$inMsg = True;
Fold[f, {a, b, c, d}]
(*
->Fold::argrx: (Fold called with 2 arguments; 3 arguments are expected.
*)
Fold[f, {a, b, c, d}]
That's invaluable while testing

Where is DropWhile in Mathematica?

Mathematica 6 added TakeWhile, which has the syntax:
TakeWhile[list, crit]
gives elements ei from the beginning of list, continuing so long as crit[ei] is True.
There is however no corresponding "DropWhile" function. One can construct DropWhile using LengthWhile and Drop, but it almost seems as though one is discouraged from using DropWhile. Why is this?
To clarify, I am not asking for a way to implement this function. Rather: why is it not already present? It seems to me that there must be a reason for its absence other than an oversight, or it would have been corrected by now. Is there something inefficient, undesirable, or superfluous about DropWhile?
There appears to be some ambiguity about the function of DropWhile, so here is an example:
DropWhile = Drop[#, LengthWhile[#, #2]] &;
DropWhile[{1,2,3,4,5}, # <= 3 &]
Out= {4, 5}

Just a blind guess.
There are a lot list operations that could take a while criteria. For example:
Total..While
Accumulate..While
Mean..While
Map..While
Etc..While
They are not difficult to construct, anyway.
I think those are not included just because the number of "primitive" functions is already growing too long, and the criteria of "is it frequently needed and difficult to implement with good performance by the user?" is prevailing in those cases.

The ubiquitous Lists in Mathematica are fixed length vectors, and when they are of a machine numbers it is a packed array.
Thus the natural functions for a recursively defined linked list (e.g. in Lisp or Haskell) are not the primary tools in Mathematica.
So I am inclined to think this explains why Wolfram did not fill out its repertoire of manipulation functions.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Mathematica execution-time bug: symbol names - wolfram-mathematica

Related

Is it worth it to rewrite an if statement to avoid branching?

What is the recommended way to check that a list is a list of numbers in argument of a function?

Is Put - Get cycle in Mathematica always deterministic?

Mathematica Notation and syntax mods

Where is DropWhile in Mathematica?

Categories

Resources