Aggregating Tally counters - wolfram-mathematica

Many times I find myself counting occurrences with Tally[ ] and then, once I discarded the original list, having to add (and join) to that counters list the results from another list.
This typically happens when I am counting configurations, occurrences, doing some discrete statistics, etc.
So I defined a very simple but handy function for Tally aggregation:
aggTally[listUnTallied__List:{},
listUnTallied1_List,
listTallied_List] :=
Join[Tally#Join[listUnTallied, listUnTallied1], listTallied] //.
{a___, {x_, p_}, b___, {x_, q_}, c___} -> {a, {x, p + q}, b, c};
Such that
l = {x, y, z}; lt = Tally#l;
n = {x};
m = {x, y, t};
aggTally[n, {}]
{{x, 1}}
aggTally[m, n, {}]
{{x, 2}, {y, 1}, {t, 1}}
aggTally[m, n, lt]
{{x, 3}, {y, 2}, {t, 1}, {z, 1}}
This function has two problems:
1) Performance
Timing[Fold[aggTally[Range##2, #1] &, {}, Range[100]];]
{23.656, Null}
(* functional equivalent to *)
Timing[s = {}; j = 1; While[j < 100, s = aggTally[Range#j, s]; j++]]
{23.047, Null}
2) It does not validate that the last argument is a real Tallied list or null (less important for me, though)
Is there a simple, elegant, faster and more effective solution? (I understand that these are too many requirements, but wishing is free)

Perhaps, this will suit your needs?
aggTallyAlt[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
{#[[1, 1]], Total##[[All, 2]]} & /#
GatherBy[Join[Tally#Join[listUnTallied, listUnTallied1], listTallied], First]
The timings are much better, and there is a pattern-based check on the last arg.
EDIT:
Here is a faster version:
aggTallyAlt1[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
Transpose[{#[[All, 1, 1]], Total[#[[All, All, 2]], {2}]}] &#
GatherBy[Join[Tally#Join[listUnTallied, listUnTallied1], listTallied], First]
The timings for it:
In[39]:= Timing[Fold[aggTallyAlt1[Range##2, #1] &, {}, Range[100]];]
Timing[s = {}; j = 1; While[j < 100, s = aggTallyAlt1[Range#j, s]; j++]]
Out[39]= {0.015, Null}
Out[40]= {0.016, Null}

The following solution is just a small modification of your original function. It applies Sort before using ReplaceRepeated and can thus use a less general replacement pattern which makes it much faster:
aggTally[listUnTallied__List : {}, listUnTallied1_List,
listTallied : {{_, _Integer} ...}] :=
Sort[Join[Tally#Join[listUnTallied, listUnTallied1],
listTallied]] //. {a___, {x_, p_}, {x_, q_}, c___} -> {a, {x, p + q}, c};

Here's the fastest thing I've come up with yet, (ab)using the tagging available with Sow and Reap:
aggTally5[untallied___List, tallied_List: {}] :=
Last[Reap[
Scan[((Sow[#2, #] &) ### Tally[#]) &, {untallied}];
Sow[#2, #] & ### tallied;
, _, {#, Total[#2]} &]]
Not going to win any beauty contests, but it's all about speed, right? =)

If you stay purely symbolic, you may try something along the lines of
(Plus ## Times ### Join[#1, #2] /. Plus -> List /. Times -> List) &
for joining tally lists. This is stupid fast but returns something that isn't a tally list, so it needs some work (after which it may not be so fast anymore ;) ).
EDIT: So I've got a working version:
aggT = Replace[(Plus ## Times ### Join[#1, #2]
/. Plus -> List
/. Times[a_, b_] :> List[b, a]),
k_Symbol -> List[k, 1], {1}] &;
Using a couple of random symbolic tables I get
a := Tally#b;
b := Table[f[RandomInteger#99 + 1], {i, 100}];
Timing[Fold[aggT[#1, #2] &, a, Table[a, {i, 100}]];]
--> {0.104954, Null}
This version only adds tally lists, doesn't check anything, still returns some integers, and comparing to Leonid's function:
Timing[Fold[aggTallyAlt1[#2, #1] &, a, Table[b, {i, 100}]];]
--> {0.087039, Null}
it's already a couple of seconds slower :-(.
Oh well, nice try.

Related

Is it possible to access the partial list content while being Sow'ed or must one wait for it to be Reap'ed?

I've been learning Sow/Reap. They are cool constructs. But I need help to see if I can use them to do what I will explain below.
What I'd like to do is: Plot the solution of NDSolve as it runs. I was thinking I can use Sow[] to collect the solution (x,y[x]) as NDSolve runs using EvaluationMonitor. But I do not want to wait to the end, Reap it and then plot the solution, but wanted to do it as it is running.
I'll show the basic setup example
max = 30;
sol1 = y /.
First#NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1},
y, {x, 0, max}];
Plot[sol1[x], {x, 0, max}, PlotRange -> All, AxesLabel -> {"x", "y[x]"}]
Using Reap/Sow, one can collect the data points, and plot the solution at the end like this
sol = Reap[
First#NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1},
y, {x, 0, max}, EvaluationMonitor :> Sow[{x, y[x]}]]][[2, 1]];
ListPlot[sol, AxesLabel -> {"x", "y[x]"}]
Ok, so far so good. But what I want is to access the partially being build list, as it accumulates by Sow and plot the solution. The only setup I know how do this is to have Dynamic ListPlot that refreshes when its data changes. But I do not know how to use Sow to move the partially build solution to this data so that ListPlot update.
I'll show how I do it without Sow, but you see, I am using AppenedTo[] in the following:
ClearAll[x, y, lst];
max = 30;
lst = {{0, 0}};
Dynamic[ListPlot[lst, Joined -> False, PlotRange -> {{0, max}, All},
AxesLabel -> {"x", "y[x]"}]]
NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1}, y, {x, 0, max},
EvaluationMonitor :> {AppendTo[lst, {x, y[x]}]; Pause[0.01]}]
I was thinking of a way to access the partially build list by Sow and just use that to refresh the plot, on the assumption that might be more efficient than AppendTo[]
I can't just do this:
ClearAll[x, y, lst];
max = 30;
lst = {{0, 0}};
Dynamic[ListPlot[lst, Joined -> False, PlotRange -> All]]
NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1}, y, {x, 0, max},
EvaluationMonitor :> {lst = Reap[Sow[{x, y[x]}] ][[2, 1]]; Pause[0.01]}]
Since it now Sow one point, and Reap it, so I am just plotting one point at a time. The same as if I just did:
NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1}, y, {x, 0, max},
EvaluationMonitor :> {lst = Sow[{x, y[x]}]; Pause[0.01]}]
my question is, how to use Sow/Reap in the above, to avoid me having manage the lst by the use of AppendTo in this case. (or by pre-allocation using Table, but then I would not know the size to allocate) Since I assume that may be Sow/Reap would be more efficient?
ps. What would be nice, if Reap had an option to tell it to Reap what has been accumulated by Sow, but do not remove it from what has been Sow'ed so far. Like a passive Reap sort of. Well, just a thought.
thanks
Update: 8:30 am
Thanks for the answers and comments. I just wanted to say, that the main goal of asking this was just to see if there is a way to access part of the data while being Sowed. I need to look more at Bag, I have not used that before.
Btw, The example shown above, was just to give a context to where such a need might appear. If I wanted to simulate the solution in this specific case, I do not even have to do it as I did, I could obtain the solution data first, then, afterwords, animate it.
Hence no need to even worry about allocation of a buffer myself, or use AppenedTo. But there could many other cases where it will be easier to access the data as it is being accumulated by Sow. This example is just what I had at the moment.
To do this specific example more directly, one can simply used Animate[], afterwords, like this:
Remove["Global`*"];
max = 30;
sol = Reap[
First#NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1},
y, {x, 0, max}, EvaluationMonitor :> Sow[{x, y[x]}]]][[2, 1]];
Animate[ListPlot[sol[[1 ;; idx]], Joined -> False,
PlotRange -> {{0, max}, All}, AxesLabel -> {"x", "y[x]"}], {idx, 1,
Length[sol], 1}]
Or, even make a home grown animate, like this
Remove["Global`*"];
max = 30;
sol = Reap[
First#NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1},
y, {x, 0, max}, EvaluationMonitor :> Sow[{x, y[x]}]]][[2, 1]];
idx = 1;
Dynamic[idx];
Dynamic[ListPlot[sol[[1 ;; idx]], Joined -> False,
PlotRange -> {{0, max}, All}, AxesLabel -> {"x", "y[x]"}]]
Do[++idx; Pause[0.01], {i, 1, Length[sol] - 1}]
Small follow up question: Can one depend on using Internal``Bag now? Since it is in Internal context, will there be a chance it might be removed/changed/etc... in the future, breaking some code? I seems to remember reading somewhere that this is not likely, but I do not feel comfortable using something in Internal Context. If it is Ok for us to use it, why is it in Internal context then?
(so many things to lean in Mathematica, so little time)
Thanks,
Experimentation shows that both Internal`Bag and linked lists are slower than using AppendTo. After considering this I recalled what Sasha told me, which is that list (array) creation is what takes time.
Therefore, neither method above, nor a Sow/Reap in which the result is collected as a list at each step is going to be more efficient (in fact, less) than AppendTo.
I believe that only array pre-allocation can be faster among the native Mathematica constructs.
Old answer below for reference:
I believe this is the place for Internal`Bag, Internal`StuffBag, and Internal`BagPart.
I had to resort to a clumsy double variable method because the Bag does not seem to update inside Dynamic the way I expected.
ClearAll[x, y, lst];
max = 30;
bag = Internal`Bag[];
lst = {{}};
Dynamic#ListPlot[lst, Joined -> False, PlotRange -> All]
NDSolve[{y'[x] == y[x] Cos[x + y[x]], y[0] == 1}, y, {x, 0, max},
EvaluationMonitor :> {Internal`StuffBag[bag, {x, y[x]}];
lst = Internal`BagPart[bag, All];
Pause[0.01]}
]

Unevaluated form of a[[i]]

Consider following simple, illustrating example
cf = Block[{a, x, degree = 3},
With[{expr = Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
This is one of the possible ways to transfer code in the body of a Compile statement. It produces the Part::partd error, since a[[i]] is at the moment of evaluation not a list.
The easy solution is to just ignore this message or turn it off. There are of course other ways around it. For instance one could circumvent the evaluation of a[[i]] by replacing it inside the Compile-body before it is compiled
cf = ReleaseHold[Block[{a, x, degree = 3},
With[{expr = Product[x - a[i], {i, degree}]},
Hold[Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]] /.
a[i_] :> a[[i]]]
]
]
If the compiled function a large bit of code, the Hold, Release and the replacement at the end goes a bit against my idea of beautiful code. Is there a short and nice solution I have not considered yet?
Answer to the post of Szabolcs
Could you tell me though why you are using With here?
Yes, and it has to do with the reason why I cannot use := here. I use With, to have something like a #define in C, which means a code-replacement at the place I need it. Using := in With delays the evaluation and what the body of Compile sees is not the final piece of code which it is supposed to compile. Therefore,
<< CompiledFunctionTools`
cf = Block[{a, x, degree = 3},
With[{expr := Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]]];
CompilePrint[cf]
shows you, that there is a call to the Mathematica-kernel in the compiled function
I4 = MainEvaluate[ Function[{x, a}, degree][ R0, T(R1)0]]
This is bad because Compile should use only the local variables to calculate the result.
Update
Szabolcs solution works in this case but it leaves the whole expression unevaluated. Let me explain, why it is important that the expression is expanded before it is compiled. I have to admit, my toy-example was not the best. So lets try a better one using With and SetDelayed like in the solution of Szabolcs
Block[{a, x}, With[
{expr := D[Product[x - a[[i]], {i, 3}], x]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
Say I have a polynomial of degree 3 and I need the derivative of it inside the Compile. In the above code I want Mathematica to calculate the derivative for unassigned roots a[[i]] so I can use the formula very often in the compiled code. Looking at the compiled code above
one sees, that the D[..] cannot be compiled as nicely as the Product and stays unevaluated
11 R1 = MainEvaluate[ Hold[D][ R5, R0]]
Therefore, my updated question is: Is it possible to evaluate a piece of code without evaluating the Part[]-accesses in it better/nicer than using
Block[{a, x}, With[
{expr = D[Quiet#Product[x - a[[i]], {i, 3}], x]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
Edit: I put the Quiet to the place it belongs. I had it in front of code block to make it visible to everyone that I used Quiet here to suppress the warning. As Ruebenko pointed already out, it should in real code always be as close as possible to where it belongs. With this approach you probably don't miss other important warnings/errors.
Update 2
Since we're moving away from the original question, we should move this discussion maybe to a new thread. I don't know to whom I should give the best answer-award to my question since we discussed Mathematica and Scoping more than how to suppress the a[[i]] issue.
Update 3
To give the final solution: I simply suppress (unfortunately like I did all the time) the a[[i]] warning with Quiet. In a real example below, I have to use Quiet outside the complete Block to suppress the warning.
To inject the required code into the body of Compile I use a pure function and give the code to inline as argument. This is the same approach Michael Trott is using in, e.g. his Numerics book. This is a bit like the where clause in Haskell, where you define stuff you used afterwards.
newtonC = Function[{degree, f, df, colors},
Compile[{{x0, _Complex, 0}, {a, _Complex, 1}},
Block[{x = x0, xn = 0.0 + 0.0 I, i = 0, maxiter = 256,
eps = 10^(-6.), zeroId = 1, j = 1},
For[i = 0, i < maxiter, ++i,
xn = x - f/(df + eps);
If[Abs[xn - x] < eps,
Break[]
];
x = xn;
];
For[j = 1, j <= degree, ++j,
If[Abs[xn - a[[j]]] < eps*10^2,
zeroId = j + 1;
Break[];
];
];
colors[[zeroId]]*(1 - (i/maxiter)^0.3)*1.5
],
CompilationTarget -> "C", RuntimeAttributes -> {Listable},
RuntimeOptions -> "Speed", Parallelization -> True]]##
(Quiet#Block[{degree = 3, polynomial, a, x},
polynomial = HornerForm[Product[x - a[[i]], {i, degree}]];
{degree, polynomial, HornerForm[D[polynomial, x]],
List ### (ColorData[52, #] & /# Range[degree + 1])}])
And this function is now fast enough to calculate the Newton-fractal of a polynomial where the position of the roots is not fixed. Therefore, we can adjust the roots dynamically.
Feel free to adjust n. Here it runs up to n=756 fluently
(* ImageSize n*n, Complex plange from -b-I*b to b+I*b *)
With[{n = 256, b = 2.0},
DynamicModule[{
roots = RandomReal[{-b, b}, {3, 2}],
raster = Table[x + I y, {y, -b, b, 2 b/n}, {x, -b, b, 2 b/n}]},
LocatorPane[Dynamic[roots],
Dynamic[
Graphics[{Inset[
Image[Reverse#newtonC[raster, Complex ### roots], "Real"],
{-b, -b}, {1, 1}, 2 {b, b}]}, PlotRange -> {{-b, b}, {-
b, b}}, ImageSize -> {n, n}]], {{-b, -b}, {b, b}},
Appearance -> Style["\[Times]", Red, 20]
]
]
]
Teaser:
Ok, here is the very oversimplified version of the code generation framework I am using for various purposes:
ClearAll[symbolToHideQ]
SetAttributes[symbolToHideQ, HoldFirst];
symbolToHideQ[s_Symbol, expandedSymbs_] :=! MemberQ[expandedSymbs, Unevaluated[s]];
ClearAll[globalProperties]
globalProperties[] := {DownValues, SubValues, UpValues (*,OwnValues*)};
ClearAll[getSymbolsToHide];
Options[getSymbolsToHide] = {
Exceptions -> {List, Hold, HoldComplete,
HoldForm, HoldPattern, Blank, BlankSequence, BlankNullSequence,
Optional, Repeated, Verbatim, Pattern, RuleDelayed, Rule, True,
False, Integer, Real, Complex, Alternatives, String,
PatternTest,(*Note- this one is dangerous since it opens a hole
to evaluation leaks. But too good to be ingored *)
Condition, PatternSequence, Except
}
};
getSymbolsToHide[code_Hold, headsToExpand : {___Symbol}, opts : OptionsPattern[]] :=
Join ## Complement[
Cases[{
Flatten[Outer[Compose, globalProperties[], headsToExpand]], code},
s_Symbol /; symbolToHideQ[s, headsToExpand] :> Hold[s],
Infinity,
Heads -> True
],
Hold /# OptionValue[Exceptions]];
ClearAll[makeHidingSymbol]
SetAttributes[makeHidingSymbol, HoldAll];
makeHidingSymbol[s_Symbol] :=
Unique[hidingSymbol(*Unevaluated[s]*) (*,Attributes[s]*)];
ClearAll[makeHidingRules]
makeHidingRules[symbs : Hold[__Symbol]] :=
Thread[List ## Map[HoldPattern, symbs] -> List ## Map[makeHidingSymbol, symbs]];
ClearAll[reverseHidingRules];
reverseHidingRules[rules : {(_Rule | _RuleDelayed) ..}] :=
rules /. (Rule | RuleDelayed)[Verbatim[HoldPattern][lhs_], rhs_] :> (rhs :> lhs);
FrozenCodeEval[code_Hold, headsToEvaluate : {___Symbol}] :=
Module[{symbolsToHide, hidingRules, revHidingRules, result},
symbolsToHide = getSymbolsToHide[code, headsToEvaluate];
hidingRules = makeHidingRules[symbolsToHide];
revHidingRules = reverseHidingRules[hidingRules];
result =
Hold[Evaluate[ReleaseHold[code /. hidingRules]]] /. revHidingRules;
Apply[Remove, revHidingRules[[All, 1]]];
result];
The code works by temporarily hiding most symbols with some dummy ones, and allow certain symbols evaluate. Here is how this would work here:
In[80]:=
FrozenCodeEval[
Hold[Compile[{{x,_Real,0},{a,_Real,1}},D[Product[x-a[[i]],{i,3}],x]]],
{D,Product,Derivative,Plus}
]
Out[80]=
Hold[Compile[{{x,_Real,0},{a,_Real,1}},
(x-a[[1]]) (x-a[[2]])+(x-a[[1]]) (x-a[[3]])+(x-a[[2]]) (x-a[[3]])]]
So, to use it, you have to wrap your code in Hold and indicate which heads you want to evaluate. What remains here is just to apply ReleseHold to it. Note that the above code just illustrates the ideas, but is still quite limited. The full version of my method involves other steps which make it much more powerful but also more complex.
Edit
While the above code is still too limited to accomodate many really interesting cases, here is one additional neat example of what would be rather hard to achieve with the traditional tools of evaluation control:
In[102]:=
FrozenCodeEval[
Hold[f[x_, y_, z_] :=
With[Thread[{a, b, c} = Map[Sqrt, {x, y, z}]],
a + b + c]],
{Thread, Map}]
Out[102]=
Hold[
f[x_, y_, z_] :=
With[{a = Sqrt[x], b = Sqrt[y], c = Sqrt[z]}, a + b + c]]
EDIT -- Big warning!! Injecting code using With or Function into Compile that uses some of Compile's local variables is not reliable! Consider the following:
In[63]:= With[{y=x},Compile[x,y]]
Out[63]= CompiledFunction[{x$},x,-CompiledCode-]
In[64]:= With[{y=x},Compile[{{x,_Real}},y]]
Out[64]= CompiledFunction[{x},x,-CompiledCode-]
Note the renaming of x to x$ in the first case. I recommend you read about localization here and here. (Yes, this is confusing!) We can guess about why this only happens in the first case and not the second, but my point is that this behaviour might not be intended (call it a bug, dark corner or undefined behaviour), so relying on it is fragile ...
Replace-based solutions, like my withRules function do work though (this was not my intended use for that function, but it fits well here ...)
In[65]:= withRules[{y->x},Compile[x,y]]
Out[65]= CompiledFunction[{x},x,-CompiledCode-]
Original answers
You can use := in With, like so:
cf = Block[{a, x, degree = 3},
With[{expr := Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
It will avoid evaluating expr and the error from Part.
Generally, = and := work as expected in all of With, Module and Block.
Could you tell me though why you are using With here? (I'm sure you have a good reason, I just can't see it from this simplified example.)
Additional answer
Addressing #halirutan's concern about degree not being inlined during compilation
I see this as exactly the same situation as if we had a global variable defined that we use in Compile. Take for example:
In[18]:= global=1
Out[18]= 1
In[19]:= cf2=Compile[{},1+global]
Out[19]= CompiledFunction[{},1+global,-CompiledCode-]
In[20]:= CompilePrint[cf2]
Out[20]=
No argument
3 Integer registers
Underflow checking off
Overflow checking off
Integer overflow checking on
RuntimeAttributes -> {}
I0 = 1
Result = I2
1 I1 = MainEvaluate[ Function[{}, global][ ]]
2 I2 = I0 + I1
3 Return
This is a common issue. The solution is to tell Compile to inline globals, like so:
cf = Block[{a, x, degree = 3},
With[{expr := Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr,
CompilationOptions -> {"InlineExternalDefinitions" -> True}]]];
CompilePrint[cf]
You can check that now there's no callback to the main evaluator.
Alternatively you can inject the value of degree using an extra layer of With instead of Block. This will make you wish for something like this very much.
Macro expansion in Mathematica
This is somewhat unrelated, but you mention in your post that you use With for macro expansion. Here's my first (possibly buggy) go at implementing macro expansion in Mathematica. This is not well tested, feel free to try to break it and post a comment.
Clear[defineMacro, macros, expandMacros]
macros = Hold[];
SetAttributes[defineMacro, HoldAllComplete]
defineMacro[name_Symbol, value_] := (AppendTo[macros, name]; name := value)
SetAttributes[expandMacros, HoldAllComplete]
expandMacros[expr_] := Unevaluated[expr] //. Join ## (OwnValues /# macros)
Description:
macros is a (held) list of all symbols to be expanded.
defineMacro will make a new macro.
expandMacros will expand macro definitions in an expression.
Beware: I didn't implement macro-redefinition, this will not work while expansion is on using $Pre. Also beware of recursive macro definitions and infinite loops.
Usage:
Do macro expansion on all input by defining $Pre:
$Pre = expandMacros;
Define a to have the value 1:
defineMacro[a, 1]
Set a delayed definition for b:
b := a + 1
Note that the definition of b is not fully evaluated, but a is expanded.
?b
Global`b
b:=1+1
Turn off macro expansion ($Pre can be dangerous if there's a bug in my code):
$Pre =.
One way:
cf = Block[{a, x, degree = 3},
With[{expr = Quiet[Product[x - a[[i]], {i, degree}]]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]]]
be careful though, it you really want this.
Original code:
newtonC = Function[{degree, f, df, colors},
Compile[{{x0, _Complex, 0}, {a, _Complex, 1}},
Block[{x = x0, xn = 0.0 + 0.0 I, i = 0, maxiter = 256,
...
RuntimeOptions -> "Speed", Parallelization -> True]]##
(Quiet#Block[{degree = 3, polynomial, a, x},
polynomial = HornerForm[Product[x - a[[i]], {i, degree}]];
...
Modified code:
newtonC = Function[{degree, f, df, colors},
Compile[{{x0, _Complex, 0}, {a, _Complex, 1}},
Block[{x = x0, xn = 0.0 + 0.0 I, i = 0, maxiter = 256,
...
RuntimeOptions -> "Speed", Parallelization -> True],HoldAllComplete]##
( (( (HoldComplete###)/.a[i_]:>a[[i]] )&)#Block[{degree = 3, polynomial, a, x},
polynomial = HornerForm[Product[x - a[i], {i, degree}]];
...
Add HoldAllComplete attribute to the function.
Write a[i] in place of a[[i]].
Replace Quiet with (( (HoldComplete###)/.a[i_]:>a[[i]] )&)
Produces the identical code, no Quiet, and all of the Hold stuff is in one place.

A "MapList" function

In Mathematica there are a number of functions that return not only the final result or a single match, but all results. Such functions are named *List. Exhibit:
FoldList
NestList
ReplaceList
ComposeList
Something that I am missing is a MapList function.
For example, I want:
MapList[f, {1, 2, 3, 4}]
{{f[1], 2, 3, 4}, {1, f[2], 3, 4}, {1, 2, f[3], 4}, {1, 2, 3, f[4]}}
I want a list element for each application of the function:
MapList[
f,
{h[1, 2], {4, Sin[x]}},
{2}
] // Column
{h[f[1], 2], {4, Sin[x]}}
{h[1, f[2]], {4, Sin[x]}}
{h[1, 2], {f[4], Sin[x]}}
{h[1, 2], {4, f[Sin[x]]}}
One may implement this as:
MapList[f_, expr_, level_: 1] :=
MapAt[f, expr, #] & /#
Position[expr, _, level, Heads -> False]
However, it is quite inefficient. Consider this simple case, and compare these timings:
a = Range#1000;
#^2 & /# a // timeAvg
MapList[#^2 &, a] // timeAvg
ConstantArray[#^2 & /# a, 1000] // timeAvg
0.00005088
0.01436
0.0003744
This illustrates that on average MapList is about 38X slower than the combined total of mapping the function to every element in the list and creating a 1000x1000 array.
Therefore, how may MapList be most efficiently implemented?
I suspect that MapList is nearing the performance limit for any transformation that performs structural modification. The existing target benchmarks are not really fair comparisons. The Map example is creating a simple vector of integers. The ConstantArray example is creating a simple vector of shared references to the same list. MapList shows poorly against these examples because it is creating a vector where each element is a freshly generated, unshared, data structure.
I have added two more benchmarks below. In both cases each element of the result is a packed array. The Array case generates new elements by performing Listable addition on a. The Module case generates new elements by replacing a single value in a copy of a. These results are as follows:
In[8]:= a = Range#1000;
#^2 & /# a // timeAvg
MapList[#^2 &, a] // timeAvg
ConstantArray[#^2 & /# a, 1000] // timeAvg
Array[a+# &, 1000] // timeAvg
Module[{c}, Table[c = a; c[[i]] = c[[i]]^2; c, {i, 1000}]] // timeAvg
Out[9]= 0.0005504
Out[10]= 0.0966
Out[11]= 0.003624
Out[12]= 0.0156
Out[13]= 0.02308
Note how the new benchmarks perform much more like MapList and less like the Map or ConstantArray examples. This seems to show that there is not much scope for dramatically improving the performance of MapList without some deep kernel magic. We can shave a bit of time from MapList thus:
MapListWR4[f_, expr_, level_: {1}] :=
Module[{positions, replacements}
, positions = Position[expr, _, level, Heads -> False]
; replacements = # -> f[Extract[expr, #]] & /# positions
; ReplacePart[expr, #] & /# replacements
]
Which yields these timings:
In[15]:= a = Range#1000;
#^2 & /# a // timeAvg
MapListWR4[#^2 &, a] // timeAvg
ConstantArray[#^2 & /# a, 1000] // timeAvg
Array[a+# &, 1000] // timeAvg
Module[{c}, Table[c = a; c[[i]] = c[[i]]^2; c, {i, 1000}]] // timeAvg
Out[16]= 0.0005488
Out[17]= 0.04056
Out[18]= 0.003
Out[19]= 0.015
Out[20]= 0.02372
This comes within factor 2 of the Module case and I expect that further micro-optimizations can close the gap yet more. But it is with eager anticipation that I join you awaiting an answer that shows a further tenfold improvement.
(Updated my function)
I think I can offer another 2x boost on top of WReach's attempt.
Remove[MapListTelefunken];
MapListTelefunken[f_, dims_] :=
With[{a = Range[dims], fun = f[[1]]},
With[{replace = ({#, #} -> fun) & /# a},
ReplacePart[ConstantArray[a, {dims}], replace]
]
]
Here are the timings on my machine (Sony Z laptop; i7, 8GB ram, 256 SSD in Raid 0):
a = Range#1000;
#^2 & /# a; // timeAvg
MapList[#^2 &, a]; // timeAvg
MapListWR4[#^2 &, a]; // timeAvg
MapListTelefunken[#^2 &, 1000]; // timeAvg
0.0000296 (* just Mapping the function over a Range[1000] for baseline *)
0.0297 (* the original MapList proposed by Mr.Wizard *)
0.00936 (* MapListWR4 from WReach *)
0.00468 (* my attempt *)
I think you'd still need to create the 1000x1000 array, and I don't think there's any cleverer way than a constant array. More to the point, your examples are better served with the following definition, although I admit that it lacks the finesse of levels.
MapList[f_, list_] := (Table[MapAt[f, #, i], {i, Length##}] &)#list;
The culprit in your own definition is the Position[] call, which creates a complex auxiliary structure.
Provide a more complex use case, that will better cover your intentions.

Using PatternSequence with Cases in Mathematica to find peaks

Given pairs of coordinates
data = {{1, 0}, {2, 0}, {3, 1}, {4, 2}, {5, 1},
{6, 2}, {7, 3}, {8, 4}, {9, 3}, {10, 2}}
I'd like to extract peaks and valleys, thus:
{{4, 2}, {5, 1}, {8, 4}}
My current solution is this clumsiness:
Cases[
Partition[data, 3, 1],
{{ta_, a_}, {tb_, b_}, {tc_, c_}} /; Or[a < b > c, a > b < c] :> {tb, b}
]
which you can see starts out by tripling the size of the data set using Partition. I think it's possible to use Cases and PatternSequence to extract this information, but this attempt doesn't work:
Cases[
data,
({___, PatternSequence[{_, a_}, {t_, b_}, {_, c_}], ___}
/; Or[a < b > c, a > b < c]) :> {t, b}
]
That yields {}.
I don't think anything is wrong with the pattern because it works with ReplaceAll:
data /. ({___, PatternSequence[{_, a_}, {t_, b_}, {_, c_}], ___}
/; Or[a < b > c, a > b < c]) :> {t, b}
That gives the correct first peak, {4, 2}. What's going on here?
One of the reasons why your failed attempt doesn't work is that Cases by default looks for matches on level 1 of your expression. Since your looking for matches on level 0 you would need to do something like
Cases[
data,
{___, {_, a_}, {t_, b_}, {_, c_}, ___} /; Or[a < b > c, a > b < c] :> {t, b},
{0}
]
However, this only returns {4,2} as a solution so it's still not what you're looking for.
To find all matches without partitioning you could do something like
ReplaceList[data, ({___, {_, a_}, {t_, b_}, {_, c_}, ___} /;
Or[a < b > c, a > b < c]) :> {t, b}]
which returns
{{4, 2}, {5, 1}, {8, 4}}
Your "clumsy" solution is fairly fast, because it heavily restricts what gets looked at.
Here is an example.
m = 10^4;
n = 10^6;
ll = Transpose[{Range[n], RandomInteger[m, n]}];
In[266]:=
Timing[extrema =
Cases[Partition[ll, 3,
1], {{ta_, a_}, {tb_, b_}, {tc_, c_}} /;
Or[a < b > c, a > b < c] :> {tb, b}];][[1]]
Out[266]= 3.88
In[267]:= Length[extrema]
Out[267]= 666463
This seems to be faster than using replacement rules.
Faster still is to create a sign table of products of differences. Then pick entries not on the ends of the list that correspond to sign products of 1.
In[268]:= Timing[ordinates = ll[[All, 2]];
signs =
Table[Sign[(ordinates[[j + 1]] -
ordinates[[j]])*(ordinates[[j - 1]] - ordinates[[j]])], {j, 2,
Length[ll] - 1}];
extrema2 = Pick[ll[[2 ;; -2]], signs, 1];][[1]]
Out[268]= 0.23
In[269]:= extrema2 === extrema
Out[269]= True
Handling of consecutive equal ordinates is not considered in these methods. Doing that would take more work since one must consider neighborhoods larger than three consecutive elements. (My spell checker wants me to add a 'u' to the middle syllable of "neighborhoods". My spell checker must think we are in Canada.)
Daniel Lichtblau
Another alternative:
Part[#,Flatten[Position[Differences[Sign[Differences[#[[All, 2]]]]], 2|-2] + 1]] &#data
(* ==> {{4, 2}, {5, 1}, {8, 4}} *)
Extract[#, Position[Differences[Sign[Differences[#]]], {_, 2} | {_, -2}] + 1] &#data
(* ==> {{4, 2}, {5, 1}, {8, 4}} *)
This may be not exactly the implementation you ask, but along those lines:
ClearAll[localMaxPositions];
localMaxPositions[lst : {___?NumericQ}] :=
Part[#, All, 2] &#
ReplaceList[
MapIndexed[List, lst],
{___, {x_, _}, y : {t_, _} .., {z_, _}, ___} /; x < t && z < t :> y];
Example:
In[2]:= test = RandomInteger[{1,20},30]
Out[2]= {13,9,5,9,3,20,2,5,18,13,2,20,13,12,4,7,16,14,8,16,19,20,5,18,3,15,8,8,12,9}
In[3]:= localMaxPositions[test]
Out[3]= {{4},{6},{9},{12},{17},{22},{24},{26},{29}}
Once you have positions, you may extract the elements:
In[4]:= Extract[test,%]
Out[4]= {9,20,18,20,16,20,18,15,12}
Note that this will also work for plateau-s where you have more than one same maximal element in a row. To get minima, one needs to trivially change the code. I actually think that ReplaceList is a better choice than Cases here.
To use it with your data:
In[7]:= Extract[data,localMaxPositions[data[[All,2]]]]
Out[7]= {{4,2},{8,4}}
and the same for the minima. If you want to combine, the change in the above rule is also trivial.
Since one of your primary concerns about your "clumsy" method is the data expansion that takes place with Partition, you may care to know about the Developer` function PartitionMap, which does not partition all the data at once. I use Sequence[] to delete the elements that I don't want.
Developer`PartitionMap[
# /. {{{_, a_}, x : {_, b_}, {_, c_}} /; a < b > c || a > b < c :> x,
_ :> Sequence[]} &,
data, 3, 1
]

Efficient way to remove empty lists from lists without evaluating held expressions?

In previous thread an efficient way to remove empty lists ({}) from lists was suggested:
Replace[expr, x_List :> DeleteCases[x, {}], {0, Infinity}]
Using the Trott-Strzebonski in-place evaluation technique this method can be generalized for working also with held expressions:
f1[expr_] :=
Replace[expr,
x_List :> With[{eval = DeleteCases[x, {}]}, eval /; True], {0, Infinity}]
This solution is more efficient than the one based on ReplaceRepeated:
f2[expr_] := expr //. {left___, {}, right___} :> {left, right}
But it has one disadvantage: it evaluates held expressions if they are wrapped by List:
In[20]:= f1[Hold[{{}, 1 + 1}]]
Out[20]= Hold[{2}]
So my question is: what is the most efficient way to remove all empty lists ({}) from lists without evaluating held expressions? The empty List[] object should be removed only if it is an element of another List itself.
Here are some timings:
In[76]:= expr = Tuples[Tuples[{{}, {}}, 3], 4];
First#Timing[#[expr]] & /# {f1, f2, f3}
pl = Plot3D[Sin[x y], {x, 0, Pi}, {y, 0, Pi}];
First#Timing[#[pl]] & /# {f1, f2, f3}
Out[77]= {0.581, 0.901, 5.027}
Out[78]= {0.12, 0.21, 0.18}
Definitions:
Clear[f1, f2, f3];
f3[expr_] :=
FixedPoint[
Function[e, Replace[e, {a___, {}, b___} :> {a, b}, {0, Infinity}]], expr];
f1[expr_] :=
Replace[expr,
x_List :> With[{eval = DeleteCases[x, {}]}, eval /; True], {0, Infinity}];
f2[expr_] := expr //. {left___, {}, right___} :> {left, right};
How about:
Clear[f3];
f3[expr_] :=
FixedPoint[
Function[e,
Replace[e, {a___, {}, b___} :> {a, b}, {0, Infinity}]],
expr]
It seems to live up to the specs:
In[275]:= f3[{a, {}, {b, {}}, c[d, {}]}]
Out[275]= {a, {b}, c[d, {}]}
In[276]:= f3[Hold[{{}, 1 + 1, {}}]]
Out[276]= Hold[{1 + 1}]
You can combine the solutions you mentioned with a minimal performance hit and maintain the code unevaluated by using a technique from this post, with a modification that the custom holding wrapper will be made private by using Module:
ClearAll[removeEmptyListsHeld];
removeEmptyListsHeld[expr_Hold] :=
Module[{myHold},
SetAttributes[myHold, HoldAllComplete];
Replace[MapAll[myHold, expr, Heads -> True],
x : myHold[List][___] :>
With[{eval = DeleteCases[x, myHold[myHold[List][]]]},
eval /; True],
{0, Infinity}]//. myHold[x_] :> x];
The above function assumes that the input expression is wrapped in Hold. Examples:
In[53]:= expr = Tuples[Tuples[{{}, {}}, 3], 4];
First#Timing[#[expr]] & /# {f1, f2, f3, removeEmptyListsHeld[Hold[#]] &}
Out[54]= {0.235, 0.218, 1.75, 0.328}
In[56]:= removeEmptyListsHeld[Hold[{{},1+1,{}}]]
Out[56]= Hold[{1+1}]
I'm just a bit late with this one. ;-)
Though rather complicated this tests about an order of magnitude faster than your f1:
fx[expr_] :=
Module[{s},
expr //
Quiet[{s} /. {x_} :> ({} /. {x___} -> (# /. {} -> x //. {x ..} -> x) &)]
]
It does not evaluate:
Hold[{{}, 1 + 1}] // fx
Hold[{1 + 1}]
Timings
expr = Tuples[Tuples[{{}, {}}, 3], 4];
First # Timing # Do[# # expr, {100}] & /# {f1, fx}
pl = Plot3D[Sin[x y], {x, 0, Pi}, {y, 0, Pi}];
First # Timing # Do[# # pl, {100}] & /# {f1, fx}
{10.577, 0.982} (* 10.8x faster *)
{1.778, 0.266} (* 6.7x faster *)
Check
f1#expr === fx#expr
f1#pl === fx#pl
True
True
Explanation
The basic version of this function would look like this:
{} /. {x___} -> (# //. {} | {x ..} -> x) &
The idea is to first reduce the expression with //. {} | {x ..} -> x and then use the injector pattern with an empty expression to remove all instances of x, as though they were replaced with Sequence[] but without evaluation.
The first change is to optimize this somewhat by splitting the replacement into /. {} -> x //. {x ..} -> x. The second change is to somehow localize x in the patterns so that it does not fail if x appears in the expression itself. Because of the way Mathematica handles nested scoping constructs I cannot simply use Module[{x}, . . . ] but instead have to use the injector pattern again to get a unique symbol into x___ etc., and Quiet to keep it from complaining about the nonstandard use.

Resources