Related
I would like to make a function that takes in a list of integers and returns a partition of the indices of the list, based on the elements of the list.
For instance:
{1,3,3,7} --> { {1}, {2,3}, {4}}
{3,1,3} --> {{1,3}, {2}}
I can think of messy ways to do this, but is there a natural way to do this in Mathematica?
I would use:
indicies[x_] := Reap[MapIndexed[Sow[#2, #] &, x]][[2, All, All, 1]]
This will be much faster than repeatedly using Position, on a long lists with many unique elements. Example:
list = RandomInteger[9999, 10000];
Timing[
result1 =
Function[x, (Flatten#Position[x, #] &) /# DeleteDuplicates[x]]#list;
]
{3.463, Null}
Timing[
result2 = indicies # list;
]
{0.031, Null}
result1 === result2
True
TomD suggested using the newer GatherBy in place of Sow and Reap. This is even faster on long lists with little repetition.
indicies2[x_] := GatherBy[List ~MapIndexed~ x, First][[All, All, 2, 1]]
list2 = RandomInteger[99999, 100000];
Do[indicies # list2, {10}] // Timing
Do[indicies2 # list2, {10}] // Timing
{5.523, Null}
{2.823, Null}
Speed is more similar on lists with greater repetition:
list3 = RandomInteger[99, 100000];
Do[indicies # list3, {10}] // Timing
Do[indicies2 # list3, {10}] // Timing
{1.716, Null}
{1.607, Null}
If going for pure speed one must recognize that MapIndexed is not optimized for packed arrays, therefore Range and Transpose will be considerably faster in that case:
indicies3[x_] := GatherBy[{x, Range#Length#x}\[Transpose], First][[All, All, 2]]
Do[indicies3 # list2, {10}] // Timing
{1.981, Null}
Do[indicies3#list3, {10}] // Timing (* big difference here *)
{0.125, Null}
One possibility:
Function[x, (Flatten#Position[x, #] &) /# DeleteDuplicates[x]]#{1, 3,
3, 7}
giving
(* {{1}, {2, 3}, {4} *)
Another example:
lst = {1, 3, 3, 3, 7, 4, 3};
Function[x, (Flatten#Position[x, #] &) /# DeleteDuplicates[x]]#lst
giving:
(*{{1}, {2, 3, 4, 7}, {5}, {6}}*)
Yet another approach:
f[x_] := Flatten[x~Position~First##] & /# Gather#x
Consider following simple, illustrating example
cf = Block[{a, x, degree = 3},
With[{expr = Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
This is one of the possible ways to transfer code in the body of a Compile statement. It produces the Part::partd error, since a[[i]] is at the moment of evaluation not a list.
The easy solution is to just ignore this message or turn it off. There are of course other ways around it. For instance one could circumvent the evaluation of a[[i]] by replacing it inside the Compile-body before it is compiled
cf = ReleaseHold[Block[{a, x, degree = 3},
With[{expr = Product[x - a[i], {i, degree}]},
Hold[Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]] /.
a[i_] :> a[[i]]]
]
]
If the compiled function a large bit of code, the Hold, Release and the replacement at the end goes a bit against my idea of beautiful code. Is there a short and nice solution I have not considered yet?
Answer to the post of Szabolcs
Could you tell me though why you are using With here?
Yes, and it has to do with the reason why I cannot use := here. I use With, to have something like a #define in C, which means a code-replacement at the place I need it. Using := in With delays the evaluation and what the body of Compile sees is not the final piece of code which it is supposed to compile. Therefore,
<< CompiledFunctionTools`
cf = Block[{a, x, degree = 3},
With[{expr := Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]]];
CompilePrint[cf]
shows you, that there is a call to the Mathematica-kernel in the compiled function
I4 = MainEvaluate[ Function[{x, a}, degree][ R0, T(R1)0]]
This is bad because Compile should use only the local variables to calculate the result.
Update
Szabolcs solution works in this case but it leaves the whole expression unevaluated. Let me explain, why it is important that the expression is expanded before it is compiled. I have to admit, my toy-example was not the best. So lets try a better one using With and SetDelayed like in the solution of Szabolcs
Block[{a, x}, With[
{expr := D[Product[x - a[[i]], {i, 3}], x]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
Say I have a polynomial of degree 3 and I need the derivative of it inside the Compile. In the above code I want Mathematica to calculate the derivative for unassigned roots a[[i]] so I can use the formula very often in the compiled code. Looking at the compiled code above
one sees, that the D[..] cannot be compiled as nicely as the Product and stays unevaluated
11 R1 = MainEvaluate[ Hold[D][ R5, R0]]
Therefore, my updated question is: Is it possible to evaluate a piece of code without evaluating the Part[]-accesses in it better/nicer than using
Block[{a, x}, With[
{expr = D[Quiet#Product[x - a[[i]], {i, 3}], x]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
Edit: I put the Quiet to the place it belongs. I had it in front of code block to make it visible to everyone that I used Quiet here to suppress the warning. As Ruebenko pointed already out, it should in real code always be as close as possible to where it belongs. With this approach you probably don't miss other important warnings/errors.
Update 2
Since we're moving away from the original question, we should move this discussion maybe to a new thread. I don't know to whom I should give the best answer-award to my question since we discussed Mathematica and Scoping more than how to suppress the a[[i]] issue.
Update 3
To give the final solution: I simply suppress (unfortunately like I did all the time) the a[[i]] warning with Quiet. In a real example below, I have to use Quiet outside the complete Block to suppress the warning.
To inject the required code into the body of Compile I use a pure function and give the code to inline as argument. This is the same approach Michael Trott is using in, e.g. his Numerics book. This is a bit like the where clause in Haskell, where you define stuff you used afterwards.
newtonC = Function[{degree, f, df, colors},
Compile[{{x0, _Complex, 0}, {a, _Complex, 1}},
Block[{x = x0, xn = 0.0 + 0.0 I, i = 0, maxiter = 256,
eps = 10^(-6.), zeroId = 1, j = 1},
For[i = 0, i < maxiter, ++i,
xn = x - f/(df + eps);
If[Abs[xn - x] < eps,
Break[]
];
x = xn;
];
For[j = 1, j <= degree, ++j,
If[Abs[xn - a[[j]]] < eps*10^2,
zeroId = j + 1;
Break[];
];
];
colors[[zeroId]]*(1 - (i/maxiter)^0.3)*1.5
],
CompilationTarget -> "C", RuntimeAttributes -> {Listable},
RuntimeOptions -> "Speed", Parallelization -> True]]##
(Quiet#Block[{degree = 3, polynomial, a, x},
polynomial = HornerForm[Product[x - a[[i]], {i, degree}]];
{degree, polynomial, HornerForm[D[polynomial, x]],
List ### (ColorData[52, #] & /# Range[degree + 1])}])
And this function is now fast enough to calculate the Newton-fractal of a polynomial where the position of the roots is not fixed. Therefore, we can adjust the roots dynamically.
Feel free to adjust n. Here it runs up to n=756 fluently
(* ImageSize n*n, Complex plange from -b-I*b to b+I*b *)
With[{n = 256, b = 2.0},
DynamicModule[{
roots = RandomReal[{-b, b}, {3, 2}],
raster = Table[x + I y, {y, -b, b, 2 b/n}, {x, -b, b, 2 b/n}]},
LocatorPane[Dynamic[roots],
Dynamic[
Graphics[{Inset[
Image[Reverse#newtonC[raster, Complex ### roots], "Real"],
{-b, -b}, {1, 1}, 2 {b, b}]}, PlotRange -> {{-b, b}, {-
b, b}}, ImageSize -> {n, n}]], {{-b, -b}, {b, b}},
Appearance -> Style["\[Times]", Red, 20]
]
]
]
Teaser:
Ok, here is the very oversimplified version of the code generation framework I am using for various purposes:
ClearAll[symbolToHideQ]
SetAttributes[symbolToHideQ, HoldFirst];
symbolToHideQ[s_Symbol, expandedSymbs_] :=! MemberQ[expandedSymbs, Unevaluated[s]];
ClearAll[globalProperties]
globalProperties[] := {DownValues, SubValues, UpValues (*,OwnValues*)};
ClearAll[getSymbolsToHide];
Options[getSymbolsToHide] = {
Exceptions -> {List, Hold, HoldComplete,
HoldForm, HoldPattern, Blank, BlankSequence, BlankNullSequence,
Optional, Repeated, Verbatim, Pattern, RuleDelayed, Rule, True,
False, Integer, Real, Complex, Alternatives, String,
PatternTest,(*Note- this one is dangerous since it opens a hole
to evaluation leaks. But too good to be ingored *)
Condition, PatternSequence, Except
}
};
getSymbolsToHide[code_Hold, headsToExpand : {___Symbol}, opts : OptionsPattern[]] :=
Join ## Complement[
Cases[{
Flatten[Outer[Compose, globalProperties[], headsToExpand]], code},
s_Symbol /; symbolToHideQ[s, headsToExpand] :> Hold[s],
Infinity,
Heads -> True
],
Hold /# OptionValue[Exceptions]];
ClearAll[makeHidingSymbol]
SetAttributes[makeHidingSymbol, HoldAll];
makeHidingSymbol[s_Symbol] :=
Unique[hidingSymbol(*Unevaluated[s]*) (*,Attributes[s]*)];
ClearAll[makeHidingRules]
makeHidingRules[symbs : Hold[__Symbol]] :=
Thread[List ## Map[HoldPattern, symbs] -> List ## Map[makeHidingSymbol, symbs]];
ClearAll[reverseHidingRules];
reverseHidingRules[rules : {(_Rule | _RuleDelayed) ..}] :=
rules /. (Rule | RuleDelayed)[Verbatim[HoldPattern][lhs_], rhs_] :> (rhs :> lhs);
FrozenCodeEval[code_Hold, headsToEvaluate : {___Symbol}] :=
Module[{symbolsToHide, hidingRules, revHidingRules, result},
symbolsToHide = getSymbolsToHide[code, headsToEvaluate];
hidingRules = makeHidingRules[symbolsToHide];
revHidingRules = reverseHidingRules[hidingRules];
result =
Hold[Evaluate[ReleaseHold[code /. hidingRules]]] /. revHidingRules;
Apply[Remove, revHidingRules[[All, 1]]];
result];
The code works by temporarily hiding most symbols with some dummy ones, and allow certain symbols evaluate. Here is how this would work here:
In[80]:=
FrozenCodeEval[
Hold[Compile[{{x,_Real,0},{a,_Real,1}},D[Product[x-a[[i]],{i,3}],x]]],
{D,Product,Derivative,Plus}
]
Out[80]=
Hold[Compile[{{x,_Real,0},{a,_Real,1}},
(x-a[[1]]) (x-a[[2]])+(x-a[[1]]) (x-a[[3]])+(x-a[[2]]) (x-a[[3]])]]
So, to use it, you have to wrap your code in Hold and indicate which heads you want to evaluate. What remains here is just to apply ReleseHold to it. Note that the above code just illustrates the ideas, but is still quite limited. The full version of my method involves other steps which make it much more powerful but also more complex.
Edit
While the above code is still too limited to accomodate many really interesting cases, here is one additional neat example of what would be rather hard to achieve with the traditional tools of evaluation control:
In[102]:=
FrozenCodeEval[
Hold[f[x_, y_, z_] :=
With[Thread[{a, b, c} = Map[Sqrt, {x, y, z}]],
a + b + c]],
{Thread, Map}]
Out[102]=
Hold[
f[x_, y_, z_] :=
With[{a = Sqrt[x], b = Sqrt[y], c = Sqrt[z]}, a + b + c]]
EDIT -- Big warning!! Injecting code using With or Function into Compile that uses some of Compile's local variables is not reliable! Consider the following:
In[63]:= With[{y=x},Compile[x,y]]
Out[63]= CompiledFunction[{x$},x,-CompiledCode-]
In[64]:= With[{y=x},Compile[{{x,_Real}},y]]
Out[64]= CompiledFunction[{x},x,-CompiledCode-]
Note the renaming of x to x$ in the first case. I recommend you read about localization here and here. (Yes, this is confusing!) We can guess about why this only happens in the first case and not the second, but my point is that this behaviour might not be intended (call it a bug, dark corner or undefined behaviour), so relying on it is fragile ...
Replace-based solutions, like my withRules function do work though (this was not my intended use for that function, but it fits well here ...)
In[65]:= withRules[{y->x},Compile[x,y]]
Out[65]= CompiledFunction[{x},x,-CompiledCode-]
Original answers
You can use := in With, like so:
cf = Block[{a, x, degree = 3},
With[{expr := Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]
]
]
It will avoid evaluating expr and the error from Part.
Generally, = and := work as expected in all of With, Module and Block.
Could you tell me though why you are using With here? (I'm sure you have a good reason, I just can't see it from this simplified example.)
Additional answer
Addressing #halirutan's concern about degree not being inlined during compilation
I see this as exactly the same situation as if we had a global variable defined that we use in Compile. Take for example:
In[18]:= global=1
Out[18]= 1
In[19]:= cf2=Compile[{},1+global]
Out[19]= CompiledFunction[{},1+global,-CompiledCode-]
In[20]:= CompilePrint[cf2]
Out[20]=
No argument
3 Integer registers
Underflow checking off
Overflow checking off
Integer overflow checking on
RuntimeAttributes -> {}
I0 = 1
Result = I2
1 I1 = MainEvaluate[ Function[{}, global][ ]]
2 I2 = I0 + I1
3 Return
This is a common issue. The solution is to tell Compile to inline globals, like so:
cf = Block[{a, x, degree = 3},
With[{expr := Product[x - a[[i]], {i, degree}]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr,
CompilationOptions -> {"InlineExternalDefinitions" -> True}]]];
CompilePrint[cf]
You can check that now there's no callback to the main evaluator.
Alternatively you can inject the value of degree using an extra layer of With instead of Block. This will make you wish for something like this very much.
Macro expansion in Mathematica
This is somewhat unrelated, but you mention in your post that you use With for macro expansion. Here's my first (possibly buggy) go at implementing macro expansion in Mathematica. This is not well tested, feel free to try to break it and post a comment.
Clear[defineMacro, macros, expandMacros]
macros = Hold[];
SetAttributes[defineMacro, HoldAllComplete]
defineMacro[name_Symbol, value_] := (AppendTo[macros, name]; name := value)
SetAttributes[expandMacros, HoldAllComplete]
expandMacros[expr_] := Unevaluated[expr] //. Join ## (OwnValues /# macros)
Description:
macros is a (held) list of all symbols to be expanded.
defineMacro will make a new macro.
expandMacros will expand macro definitions in an expression.
Beware: I didn't implement macro-redefinition, this will not work while expansion is on using $Pre. Also beware of recursive macro definitions and infinite loops.
Usage:
Do macro expansion on all input by defining $Pre:
$Pre = expandMacros;
Define a to have the value 1:
defineMacro[a, 1]
Set a delayed definition for b:
b := a + 1
Note that the definition of b is not fully evaluated, but a is expanded.
?b
Global`b
b:=1+1
Turn off macro expansion ($Pre can be dangerous if there's a bug in my code):
$Pre =.
One way:
cf = Block[{a, x, degree = 3},
With[{expr = Quiet[Product[x - a[[i]], {i, degree}]]},
Compile[{{x, _Real, 0}, {a, _Real, 1}}, expr]]]
be careful though, it you really want this.
Original code:
newtonC = Function[{degree, f, df, colors},
Compile[{{x0, _Complex, 0}, {a, _Complex, 1}},
Block[{x = x0, xn = 0.0 + 0.0 I, i = 0, maxiter = 256,
...
RuntimeOptions -> "Speed", Parallelization -> True]]##
(Quiet#Block[{degree = 3, polynomial, a, x},
polynomial = HornerForm[Product[x - a[[i]], {i, degree}]];
...
Modified code:
newtonC = Function[{degree, f, df, colors},
Compile[{{x0, _Complex, 0}, {a, _Complex, 1}},
Block[{x = x0, xn = 0.0 + 0.0 I, i = 0, maxiter = 256,
...
RuntimeOptions -> "Speed", Parallelization -> True],HoldAllComplete]##
( (( (HoldComplete###)/.a[i_]:>a[[i]] )&)#Block[{degree = 3, polynomial, a, x},
polynomial = HornerForm[Product[x - a[i], {i, degree}]];
...
Add HoldAllComplete attribute to the function.
Write a[i] in place of a[[i]].
Replace Quiet with (( (HoldComplete###)/.a[i_]:>a[[i]] )&)
Produces the identical code, no Quiet, and all of the Hold stuff is in one place.
In Mathematica, Max[] is the most efficient function to get the maximum number in a list of numbers, but how do I find the list with the maximum last element in a list of lists? e.g. the 2-d coordinate with the biggest x part in a series of coordinates.
My best try is SortBy, but obviously I don't need the program to sort my list, only the maximum one I need.
Perhaps:
list = {{4, 3}, {5, 10}, {-2, 1}, {3, 7}}
Reverse /# Take[#, Ordering[#, -1]] &#(Reverse /# #) &# list
(*
-> {{5, 10}}
*)
Exploiting the fact that Ordering[ ] orders lists by their first element
Edit
Or much better (I think):
Take[#, Ordering[Last /# #, -1]] &# list
Edit
Also:
#[[Ordering[#, -1, Last##2 > Last##1 &]]] &#list
Edit
Perhaps faster:
#[[First#Position[#, Max##] &#(Last /# #)]] &#list
Here is my approach using Pick
maxBy[list_, n_] := With[{s = list[[All, n]]}, Pick[list, s, Max[s]]]
maxBy[{{4, 3}, {5, 10}, {-2, 1}, {3, 7}}, 2]
(* output:
{{5, 10}}
*)
This version works on any number of elements per sublist provided n is less than or equal to the length of the shortest sublist.
Timings for this version on my machine
list2 = RandomInteger[{-10^7, 10^7}, {10^6, 2}];
list3 = RandomInteger[{-10^7, 10^7}, {10^6, 3}];
list9 = RandomInteger[{-10^7, 10^7}, {10^6, 9}];
maxBy[list2, 2]; // Timing
maxBy[list3, 2]; // Timing
maxBy[list9, 2]; // Timing
(* output:
{0.030341, Null}
{0.030912, Null}
{0.033313, Null}
*)
Compared to David's code
maxBy[list2, 2]; // Timing
maxBy[list3, 2]; // Timing
maxBy[list9, 2]; // Timing
(* ouput:
{0.186175, Null}
{0.184989, Null}
{0.262018, Null}
*)
Yoda's code
maxBy[list2, 2]; // Timing
maxBy[list3, 2]; // Timing
maxBy[list9, 2]; // Timing
(* ouput:
{0.944016, Null}
{0.83094, Null}
{0.874126, Null}
*)
And belisarius' code
Reverse /# Take[#, Ordering[#, -1]] &#(Reverse /# #) &#list2; // Timing
Take[#, Ordering[Last /# #, -1]] &#list2; // Timing
#[[Ordering[#, -1, Last##2 > Last##1 &]]] &#list2; // Timing
#[[First#Position[#, Max##] &#(Last /# #)]] &#list2; // Timing
(* output:
{0.211016, Null}
{0.099253, Null}
{2.03415, Null}
{0.266934, Null}
*)
Not the most efficient but simpler?
max = Max#list[[All, -1]];
Cases[list, {_, max}]
or
max = Max#list3[[All, -1]];
Cases[list3, {_,_, max}]
Usage
list = {{40, 3}, {5, 10}, {-2, 1}, {3, 10}}
max = Max#list[[All, -1]];
Cases[list, {_, max}]
Output:
{{5, 10}, {3, 10}}
How about this function (defined here only for 2D lists):
maxBy = Module[{pattern = Reverse#Insert[{Max##1[[All, #2]]}, _, #2]},
Cases[#1, pattern]] &
Example:
list = {{4, 3}, {5, 10}, {20, 1}, {3, 7}};
maxBy[list, 1]
Out[1]= {{20, 1}}
maxBy[list, 2]
Out[2]= {{5, 10}}
Here's an approach that relies on Transpose:
maxBy = #1[[Position[t = Transpose[#1][[#2]], Max[t]][[All, 1]]]] &;
For example:
list = {{4, 3}, {5, 10}, {20, 1}, {3, 7}};
maxBy[list, 1]
(* {{20, 1}} *)
maxBy[list, 2]
(* {{5, 10}} *)
It can handle more than two elements per sublist, provided that the sublists are all the same length.
r:=RandomInteger[{-10^5,10^5}];
list3=Table[{r,r,r},{j,10^2}]; (* 3 numbers in each sublist *)
list9=Table[{r,r,r,r,r,r,r,r,r},{j,10^2}]; (* 9 numbers *)
maxBy[list3, 2] (* Find max in position 2 of list3 *)
(* {{-93332, 99582, 4324}} *)
maxBy[list9, 5] (* Find max in position 5 of list9 *)
(* {{7680, 85508, 51915, -58282, 94679, 50968, -12664, 75246, -82903}} *)
Of course, the results will vary according to the random numbers you have generated.
Edit
Here's some timing data for large lists. SortBy is clearly slower. but doesn't seem as influenced by the number of elements in each sublist. First, my maxBy code followed by SortBy:
Using the same list2, here's some timing data for Yoda's code. Although his routine is also called maxBy, it is his that produced the output that follows:
Again, with the same list2, some data for Belisarius' code:
His second suggestion is the fastest of all tested.
After reading some documentations and doing a few experiments, I managed to get a clearer view of this problem.
I actually was wondering why Max[] seemed intentionally avoid providing directives that make it return not only the max element itself, but also its position. After all, providing position doesn't change the O(n) complexity of the algorithm. For example, imagine:
In[1]:= Max[{991, 993, 992}, ReturnPosition -> True]
Out[1]= {2}
If that could be done, you can simply use the code below to solve my problem:
list[[Max[list[[All, -1]], ReturnPosition -> True]]]
But now I realize that the system function Max[] is not designed for finding the max element in lists. You can tell that the Wolfram team obviously made Max[] more like a traditional max function in mathematics ― it does simple symbolic simplifications, it automatically flatten out all lists, it can be in a plotable function, and most importantly, it's Orderless:
In[2]:= Attributes[Max]
Out[2]= {Flat, NumericFunction, OneIdentity, Orderless, Protected}
Which makes positions meaningless. In a word, it treats all the lists inside as mathematical sets.
So philosophically it's not trivial for Mathematica to compute this. All I need to do is to "DIY" a function with the O(n) complexity and can do the job. I think TomD is heading the right direction, although I prefer:
maxLast[l_] := Cases[l, {___, Max[Last/#l]}]
And Heike (黑客?) adopted Pick which may have better techniques especially designed for selecting elements, but there must be no virtual difference in the complexity of the algorithm. And I may rewrite it this way: (fewer names and heads, faster the speed)
maxLast[l_] := Pick[l, #, Max[#]] &[Last /# l]
They're both good answers.
Many times I find myself counting occurrences with Tally[ ] and then, once I discarded the original list, having to add (and join) to that counters list the results from another list.
This typically happens when I am counting configurations, occurrences, doing some discrete statistics, etc.
So I defined a very simple but handy function for Tally aggregation:
aggTally[listUnTallied__List:{},
listUnTallied1_List,
listTallied_List] :=
Join[Tally#Join[listUnTallied, listUnTallied1], listTallied] //.
{a___, {x_, p_}, b___, {x_, q_}, c___} -> {a, {x, p + q}, b, c};
Such that
l = {x, y, z}; lt = Tally#l;
n = {x};
m = {x, y, t};
aggTally[n, {}]
{{x, 1}}
aggTally[m, n, {}]
{{x, 2}, {y, 1}, {t, 1}}
aggTally[m, n, lt]
{{x, 3}, {y, 2}, {t, 1}, {z, 1}}
This function has two problems:
1) Performance
Timing[Fold[aggTally[Range##2, #1] &, {}, Range[100]];]
{23.656, Null}
(* functional equivalent to *)
Timing[s = {}; j = 1; While[j < 100, s = aggTally[Range#j, s]; j++]]
{23.047, Null}
2) It does not validate that the last argument is a real Tallied list or null (less important for me, though)
Is there a simple, elegant, faster and more effective solution? (I understand that these are too many requirements, but wishing is free)
Perhaps, this will suit your needs?
aggTallyAlt[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
{#[[1, 1]], Total##[[All, 2]]} & /#
GatherBy[Join[Tally#Join[listUnTallied, listUnTallied1], listTallied], First]
The timings are much better, and there is a pattern-based check on the last arg.
EDIT:
Here is a faster version:
aggTallyAlt1[listUnTallied__List : {}, listUnTallied1_List, listTallied : {{_, _Integer} ...}] :=
Transpose[{#[[All, 1, 1]], Total[#[[All, All, 2]], {2}]}] &#
GatherBy[Join[Tally#Join[listUnTallied, listUnTallied1], listTallied], First]
The timings for it:
In[39]:= Timing[Fold[aggTallyAlt1[Range##2, #1] &, {}, Range[100]];]
Timing[s = {}; j = 1; While[j < 100, s = aggTallyAlt1[Range#j, s]; j++]]
Out[39]= {0.015, Null}
Out[40]= {0.016, Null}
The following solution is just a small modification of your original function. It applies Sort before using ReplaceRepeated and can thus use a less general replacement pattern which makes it much faster:
aggTally[listUnTallied__List : {}, listUnTallied1_List,
listTallied : {{_, _Integer} ...}] :=
Sort[Join[Tally#Join[listUnTallied, listUnTallied1],
listTallied]] //. {a___, {x_, p_}, {x_, q_}, c___} -> {a, {x, p + q}, c};
Here's the fastest thing I've come up with yet, (ab)using the tagging available with Sow and Reap:
aggTally5[untallied___List, tallied_List: {}] :=
Last[Reap[
Scan[((Sow[#2, #] &) ### Tally[#]) &, {untallied}];
Sow[#2, #] & ### tallied;
, _, {#, Total[#2]} &]]
Not going to win any beauty contests, but it's all about speed, right? =)
If you stay purely symbolic, you may try something along the lines of
(Plus ## Times ### Join[#1, #2] /. Plus -> List /. Times -> List) &
for joining tally lists. This is stupid fast but returns something that isn't a tally list, so it needs some work (after which it may not be so fast anymore ;) ).
EDIT: So I've got a working version:
aggT = Replace[(Plus ## Times ### Join[#1, #2]
/. Plus -> List
/. Times[a_, b_] :> List[b, a]),
k_Symbol -> List[k, 1], {1}] &;
Using a couple of random symbolic tables I get
a := Tally#b;
b := Table[f[RandomInteger#99 + 1], {i, 100}];
Timing[Fold[aggT[#1, #2] &, a, Table[a, {i, 100}]];]
--> {0.104954, Null}
This version only adds tally lists, doesn't check anything, still returns some integers, and comparing to Leonid's function:
Timing[Fold[aggTallyAlt1[#2, #1] &, a, Table[b, {i, 100}]];]
--> {0.087039, Null}
it's already a couple of seconds slower :-(.
Oh well, nice try.
If I want to find all possible sums from two lists list1 and list2, I use the Outer[] function with the specification of Plus as the combining operator:
In[1]= list1 = {a, b}; list2 = {c, d}; Outer[Plus, list1, list2]
Out[1]= {{a + c, a + d}, {b + c, b + d}}
If I want to be able to handle an arbitrary number of lists, say a list of lists,
In[2]= listOfLists={list1, list2};
then the only way I know how to find all possible sums is to use the Apply[] function (which has the short hand ##) along with Join:
In[3]= argumentsToPass=Join[{Plus},listOfLists]
Out[3]= {Plus, {a, b}, {c, d}}
In[4]= Outer ## argumentsToPass
Out[4]= {{a + c, a + d}, {b + c, b + d}}
or simply
In[5]= Outer ## Join[{Plus},listOfLists]
Out[5]= {{a + c, a + d}, {b + c, b + d}}
The problem comes when I try to compile:
In[6]= Compile[ ..... Outer ## Join[{Plus},listOfLists] .... ]
Compile::cpapot: "Compilation of Outer##Join[{Plus},listOfLists]] is not supported for the function argument Outer. The only function arguments supported are Times, Plus, or List. Evaluation will use the uncompiled function. "
The thing is, I am using a supported function, namely Plus. The problem seems to be solely with the Apply[] function. Because if I give it a fixed number of lists to outer-plus together, it works fine
In[7]= Compile[{{bob, _Integer, 1}, {joe, _Integer, 1}}, Outer[Plus, bob, joe]]
Out[7]= CompiledFunction[{bob, joe}, Outer[Plus, bob, joe],-CompiledCode-]
but as soon as I use Apply, it breaks
In[8]= Compile[{{bob, _Integer, 1}, {joe, _Integer, 1}}, Outer ## Join[{Plus}, {bob, joe}]]
Out[8]= Compile::cpapot: "Compilation of Outer##Join[{Plus},{bob,joe}] is not supported for the function argument Outer. The only function arguments supported are Times, Plus, or List. Evaluation will use the uncompiled function."
So my questions is: Is there a way to circumvent this error or, alternatively, a way to compute all possible sums of elements pulled from an arbitrary number of lists in a compiled function?
(Also, I'm not sure if "compilation" is an appropriate tag. Please advise.)
Thanks so much.
One way it to use With, to create a compiled function programmatically:
Clear[makeCompiled];
makeCompiled[lnum_Integer] :=
With[{listNames = Table[Unique["list"], {lnum}]},
With[{compileArgs = {#, _Integer, 1} & /# listNames},
Compile ## Join[Hold[compileArgs],
Replace[Hold[Outer[Plus, listNames]],
Hold[Outer[Plus, {x__}]] :> Hold[Outer[Plus, x]], {0}]]]];
It can probably be done prettier, but it works. For example:
In[22]:= p2 = makeCompiled[2]
Out[22]= CompiledFunction[{list13,list14},Outer[Plus,list13,list14],-CompiledCode-]
In[23]:= p2[{1,2,3},{4,5}]
Out[23]= {{5,6},{6,7},{7,8}}
In[24]:= p3 = makeCompiled[3]
Out[24]= CompiledFunction[{list15,list16,list17},Outer[Plus,list15,list16,list17],-CompiledCode-]
In[25]:= p3[{1,2},{3,4},{5,6}]
Out[25]= {{{9,10},{10,11}},{{10,11},{11,12}}}
HTH
Edit:
You can hide the compiled function behind another one, so that it is created at run-time and you don't actually see it:
In[33]:=
Clear[computeSums]
computeSums[lists : {__?NumberQ} ..] := makeCompiled[Length[{lists}]][lists];
In[35]:= computeSums[{1, 2, 3}, {4, 5}]
Out[35]= {{5, 6}, {6, 7}, {7, 8}}
You face an overhead of compiling in this case, since you create then a compiled function afresh every time. You can fight this overhead rather elegantly with memoization, using Module variables for persistence, to localize your memoized definitions:
In[44]:=
Clear[computeSumsMemoized];
Module[{compiled},
compiled[n_] := compiled[n] = makeCompiled[n];
computeSumsMemoized[lists : {__?NumberQ} ..] := compiled[Length[{lists}]][lists]];
In[46]:= computeSumsMemoized[{1, 2, 3}, {4, 5}]
Out[46]= {{5, 6}, {6, 7}, {7, 8}}
This is my first post. I hope I get this right.
If your inputs are lists of integers, I am skeptical of the value of compiling this function, at least in Mathematica 7.
For example:
f = Compile[{{a, _Integer, 1}, {b, _Integer, 1}, {c, _Integer, 1}, {d, _Integer, 1}, {e, _Integer, 1}},
Outer[Plus, a, b, c, d, e]
];
a = RandomInteger[{1, 99}, #] & /# {12, 32, 19, 17, 43};
Do[f ## a, {50}] // Timing
Do[Outer[Plus, ##] & ## a, {50}] // Timing
The two Timings are not significantly different for me, but of course this is only one sample. The point is merely that Outer is already fairly fast compared to the compiled version.
If you have reasons other than speed for compilation, you may find some use in Tuples instead of Outer, but you still have the constraint of compiled functions requiring tensor input.
f2 = Compile[{{array, _Integer, 2}},
Plus ### Tuples#array
];
f2[{{1, 3, 7}, {13, 25, 41}}]
If your inputs are large, then a different approach may be in order. Given a list of lists of integers, this function will return the possible sums and the number of ways to get each sum:
f3 = CoefficientRules#Product[Sum[x^i, {i, p}], {p, #}] &;
f3[{{1, 3, 7}, {13, 25, 41}}]
This should prove to be far more memory efficient in many cases.
a2 = RandomInteger[{1, 999}, #] & /# {50, 74, 55, 55, 90, 57, 47, 79, 87, 36};
f3[a2]; // Timing
MaxMemoryUsed[]
This took 3 seconds and minimal memory, but attempting the application of Outer to a2 terminated the kernel with "No more memory available."