How to speed up Mathematica replacement of matrix elements - wolfram-mathematica

I have several 100x15 matrices; one of them is a distance. When elements of that matrix exceed a bound, I want to reset those elements to zero and also reset the corresponding elements of three other matrices to zero. Here's my silly way (but it works):
Do[ If[ xnow[[i, j]] > L, xnow[[i, j]] = 0.;
cellactvA[[i, j ]] = 0.;
cellactvB[[i, j ]] = 0.;
cellactvC[[i, j ]] = 0.; ], (* endIF *)
{ i, 1, nstrips}, {j, 1, ncells} ]; (* endDO *)
I tried ReplacePart:
xnow = ReplacePart[ xnow, Position[ xnow, x_?(# > L &) ] ]
(something like this, I don't have it handy; it was done correctly enough to execute), but it was as slow as the loop and did not produce the correct replacement structure in matrix xnow. Please advise on how to do this in a way that is reasonably quick, as this calc is inside another loop (over time) that executes many many times. The overall calculation is of course, now, very slow. Thanks in advance.
Here is how I did this in R; very simple and quick:
# -- find indices of cells outside window
indxoutRW <- which( xnow > L, arr.ind=T )
# -- reset cells outside window
cellrateA[indxoutRW] <- 0
cellrateB[indxoutRW] <- 0
cellrateC[indxoutRW] <- 0
# -- move reset cells back to left side
xnow[indxoutRW] <- xnow[indxoutRW] - L

How about this:
Timing[
matrixMask2 = UnitStep[limit - $xnow];
xnow = $xnow*matrixMask2;
cellactvA2 = $a*matrixMask2;
cellactvB2 = $b*matrixMask2;
cellactvC2 = $c*matrixMask2;
]
If you want to write fast code one thing to make sure is to check that On["Packing"] does not gives messages; or at least that you understand them and know that they are not an issue.
Edit for OP comment:
mask = UnitStep[limit - xnow];
{xnow*mask, cellactvA2*mask, cellactvB2*mask, cellactvC2*mask}
Hope this helps, you still need to set limit.

The following will be based on SparseArrays, avoid extraneous stuff and very fast:
extractPositionFromSparseArray[
HoldPattern[SparseArray[u___]]] := {u}[[4, 2, 2]];
positionExtr[x_List, n_] :=
extractPositionFromSparseArray[
SparseArray[Unitize[x - n], Automatic, 1]]
replaceWithZero[mat_, flatZeroPositions_List, type : (Integer | Real) : Real] :=
Module[{copy = Flatten#mat},
copy[[flatZeroPositions]] = If[type === Integer, 0, 0.];
Partition[copy, Last[Dimensions[mat]]]];
getFlatZeroDistancePositions[distanceMat_, lim_] :=
With[{flat = Flatten[distanceMat]},
With[{originalZPos = Flatten# positionExtr[flat , 0]},
If[originalZPos === {}, #, Complement[#, originalZPos ]] &#
Flatten#positionExtr[Clip[flat , {0, lim}, {0, 0}], 0]]];
Now, we generate our matrices, making sure that they are packed:
{xnow, cellactvA, cellactvB, cellactvC} =
Developer`ToPackedArray /# RandomReal[10, {4, 100, 15}];
Here is the benchmark for doing this 1000 times:
In[78]:=
Do[
With[{L = 5},
With[{flatzpos = getFlatZeroDistancePositions[xnow,L]},
Map[replaceWithZero[#,flatzpos ]&,{xnow,cellactvA,cellactvB,cellactvC}]]
],
{1000}]//Timing
Out[78]= {0.203,Null}
Note that there was no unpacking in the process, but you have to ensure that you have your matrices packed from the start, and that you pick the correct type (Integer or Real) for the replaceWithZero function.

Yet another method which seems to be fast
xnow = $xnow; a = $a; b = $b; c = $c;
umask = Unitize#Map[If[# > limit, 0, #] &, xnow, {2}];
xnow = xnow*umask; a = a*umask; b = b*umask; c = c*umask;
Based on limited testing in Nasser's setup it seems it is as fast as the SparseArray-based mask.
Edit: Can combine with SparseArray to get a slight speed-up
umask2=SparseArray[Unitize#Map[If[# > limit, 0, #] &, xnow, {2}]];
xnow = xnow*umask2; a = a*umask2; b = b*umask2; c = c*umask2;
Edit 2: Inspired by ruebenko's solution, another built-in function (not nearly as fast as UnitStep but much faster than others):
umask3 = Clip[xnow, {limit, limit}, {1, 0}];
xnow = xnow*umask3; a = a*umask3; b = b*umask3; c = c*umask3;

Does this approach work for you?
matrixMask =
SparseArray[Thread[Position[xnow, _?(# > 0.75 &)] -> 0.],
Dimensions[xnow], 1.];
xnow = xnow * matrixMask;
cellactvA = cellactvA * matrixMask;
cellactvB = cellactvB * matrixMask;
cellactvC = cellactvC * matrixMask;
The basic idea is to create a matrix that is zero where your threshold is crossed, and one everywhere else. Then we use element-wise multiplication to zero out the appropriate elements in the various matrices.

ReplacePart is notoriously slow.
MapThread should do what you want - note the third argument.
{xnow, cellactvA, cellactvB, cellactvC} =
RandomReal[{0, 1}, {4, 10, 5}]
L = 0.6;
MapThread[If[#1 > L, 0, #2] &, {xnow, xnow}, 2]
And for all four matrices
{xnow, cellactvA, cellactvB, cellactvC} =
MapThread[Function[{x, y}, If[x > L, 0, y]], {xnow, #},
2] & /# {xnow, cellactvA, cellactvB, cellactvC}

may be
(*data*)
nRow = 5; nCol = 5;
With[{$nRow = nRow, $nCol = nCol},
xnow = Table[RandomReal[{1, 3}], {$nRow}, {$nCol}];
cellactvA = cellactvB = cellactvC = Table[Random[], {$nRow}, {$nCol}]
];
limit = 2.0;
now do the replacement
pos = Position[xnow, x_ /; x > limit];
{cellactvA, cellactvB, cellactvC} =
Map[ReplacePart[#, pos -> 0.] &, {cellactvA, cellactvB, cellactvC}];
edit(1)
Here is a quick speed comparing the 4 methods above, the LOOP, and then Brett, me, and Verbeia. May be someone can double check them. I used the same data for all. created random data once, then used it for each test. Same limit (called L) I used matrix size of 2,000 by 2,000.
So speed Timing numbers below does not include data allocation.
I run the tests once.
This is what I see:
For 2,000 by 2,000 matrices:
Bill (loop): 16 seconds
me (ReplacPart): 21 seconds
Brett (SparseArray): 7.27 seconds
Verbeia (MapThread): 32 seconds
For 3,000 by 3,000 matrices:
Bill (loop): 37 seconds
me (ReplacPart): 48 seconds
Brett (SparseArray): 16 seconds
Verbeia (MapThread): 79 seconds
So, it seems to be that SparseArray is the fastest. (but please check to make sure I did not break something)
code below:
data generation
(*data*)
nRow = 2000;
nCol = 2000;
With[{$nRow = nRow, $nCol = nCol},
$xnow = Table[RandomReal[{1, 3}], {$nRow}, {$nCol}];
$a = $b = $c = Table[Random[], {$nRow}, {$nCol}]
];
limit = 2.0;
ReplacePart test
xnow = $xnow;
a = $a;
b = $b;
c = $c;
Timing[
pos = Position[xnow, x_ /; x > limit];
{xnow, a, b, c} = Map[ReplacePart[#, pos -> 0.] &, {xnow, a, b, c}]][[1]]
SparseArray test
xnow = $xnow;
a = $a;
b = $b;
c = $c;
Timing[
matrixMask =
SparseArray[Thread[Position[xnow, _?(# > limit &)] -> 0.],
Dimensions[xnow], 1.]; xnow = xnow*matrixMask;
a = a*matrixMask;
b = b*matrixMask;
c = c*matrixMask
][[1]]
MapThread test
xnow = $xnow;
a = $a;
b = $b;
c = $c;
Timing[
{xnow, a, b, c} =
MapThread[Function[{x, y}, If[x > limit, 0, y]], {xnow, #},
2] & /# {xnow, a, b, c}
][[1]]
loop test
xnow = $xnow;
a = $a;
b = $b;
c = $c;
Timing[
Do[If[xnow[[i, j]] > limit,
xnow[[i, j]] = 0.;
a[[i, j]] = 0.;
b[[i, j]] = 0.;
c[[i, j]] = 0.
],
{i, 1, nRow}, {j, 1, nCol}
]
][[1]]
edit(2)
There is something really bothering me with all of this. I do not understand how a loop can be faster that the specialized commands for this purpose?
I wrote a simple loop test in Matlab, like Bill had using R, and I getting much lower timings there also. I hope an expert can come up with a much faster method, because now I am not too happy with this.
For 3,000 by 3,000 matrix, I am getting
Elapsed time is 0.607026 seconds.
This is more than 20 times faster than the SparseArray method, and it is just a loop!
%test, on same machine, 4GB ram, timing uses cpu timing using tic/toc
%allocate data
nRow = 3000;
nCol = 3000;
%generate a random matrix of real values
%between 1 and 3
xnow = 1 + (3-1).*rand(nRow,nRow);
%allocate the other 3 matrices
a=zeros(nRow,nCol);
b=a;
c=b;
%set limit
limit=2;
%engine
tstart=tic;
for i=1:nRow
for j=1:nCol
if xnow(i,j) > limit
xnow(i,j) = 0;
a(i,j) = 0;
b(i,j) = 0;
c(i,j) = 0;
end
end
end
toc(tstart)
fyi: using cputime() gives similar values.as tic/toc.

Related

Tricks to improve the performance of a cunstom function in Julia

I am replicating using Julia a sequence of steps originally made in Matlab. In Octave, this procedure takes 1.4582 seconds and in Julia (using Jupyter) it takes approximately 10 seconds. I'll try to be brief in the scripts. My goal is to achieve or improve Octave's performance. First of all, I will describe my variables and some function:
zgrid (double 1x7 size)
kgrid (double 500x1 size)
V0 (double 500x7 size)
P (double 7x7 size) a transition matrix
delta and beta are fixed parameters.
F(z,k) and u(c) are particular functions and are specified in the Julia script.
% Octave script
% V0 is given
[K, Z, K2] = meshgrid(kgrid, zgrid, kgrid);
K = permute(K, [2, 1, 3]);
Z = permute(Z, [2, 1, 3]);
K2 = permute(K2, [2, 1, 3]);
C = max(f(Z,K) + (1-delta)*K - K2,0);
U = u(C);
EV = V0*P';% EV is a 500x7 matrix size
EV = permute(repmat(EV, 1, 1, 500), [3, 2, 1]);
H = U + beta*EV;
[TV, index] = max(H, [], 3);
In Julia, I created a function that replicates this procedure. I used loops, but it has a performance 9 times longer.
% Julia script
% V0 is the input of my T operator function
V0 = repeat(sqrt.(kgrid), outer = [1,7]);
F = (z,k) -> exp(z)*(k^α);
u = (c) -> (c^(1-μ) - 1)/(1-μ)
% parameters
α = 1/3
β = 0.987
δ = 0.012;
μ = 2
Kss = 48.1905148382166
kgrid = range(0.75*Kss, stop=1.25*Kss, length=500);
zgrid = [-0.06725382459813659, -0.044835883065424395, -0.0224179415327122, 0 , 0.022417941532712187, 0.04483588306542438, 0.06725382459813657]
function T(V)
E=V*P'
T1 = zeros(Float64, 500, 7 )
aux = zeros(Float64, 500)
for i = 1:7
for j = 1:500
for l = 1:500
c= maximum( (F(zrid[i],kgrid[j]) +(1-δ)*kgrid[j] - kgrid[l],0))
aux[l] = u(c) + β*E[l,i]
end
T1[j,i] = maximum(aux)
end
end
return T1
end
I would very much like to improve my performance in Julia. I believe there is a way to improve, but I am new in Julia programming.
This code runs for me in 5ms. Note that I have made F and u into proper (not anonymous) functions, F_ and u_, but you could get a similar effect by making the anonymous functions const.
Your main problem is that you have a lot of non-const global variables, and also that your main function is doing unnecessary work multiple times, and creating an unnecessary array, aux.
The performance tips section in the manual is essential reading: https://docs.julialang.org/en/v1/manual/performance-tips/
F_(z,k) = exp(z) * (k^(1/3)); # you can still use α, but it must be const
u_(c) = (c^(1-2) - 1)/(1-2)
function T_(V, P, kgrid, zgrid, β, δ)
E = V * P'
T1 = similar(V)
for i in axes(T1, 2)
for j in axes(T1, 1)
temp = F_(zgrid[i], kgrid[j]) + (1-δ)*kgrid[j]
aux = -Inf
for l in eachindex(kgrid)
c = max(0.0, temp - kgrid[l])
aux = max(aux, u_(c) + β * E[l, i])
end
T1[j,i] = aux
end
end
return T1
end
Benchmark:
V0 = repeat(sqrt.(kgrid), outer = [1,7]);
zgrid = sort!(rand(1, 7); dims=2)
kgrid = sort!(rand(500, 1); dims=1)
P = rand(length(zgrid), length(zgrid))
#btime T_($V0, $P, $kgrid, $zgrid, $β, $δ);
# output: 5.126 ms (4 allocations: 54.91 KiB)
The following should perform much better. The most noticeable differences are that it calculates F 500x less, and doesn't rely on global variables.
function T(V,kgrid,zgrid,β,δ)
E=V*P'
T1 = zeros(Float64, 500, 7)
for j = 1:500
for i = 1:7
x = F(zrid[i],kgrid[j]) +(1-δ)*kgrid[j]
T1[j,i] = maximum(u(max(x - kgrid[l], 0)) + β*E[l,i] for l in 1:500)
end
end
return T1
end

Iteration method

I am working on finding the initial points of convergence using newton's iteration method in mathematica. newton function works now I would like to show which initial points from a grid produce Newton iterations that converge to -1, same for points that converge to (1 + (3)^1/2)/2i, given that:
f(x) = x^3+1
newton[x0_] := (
x = x0;
a1 = {};
b1 = {};
c1 = {};
counter = 0;
error = Abs[f[x]];
While[counter < 20 && error > 0.0001,
If[f'[x] != 0, x = x - N[f[x]/f'[x]]];
counter = counter + 1;
error = Abs[f[x]]];
x)
I created a grid to show which initial points of a+bi converge to the roots.
grid = Table[a + b I, {a, -2, 2, 0.01}, {b, -2, 2, 0.01}];
Then I created a fractal, but whenever I plot it gives me a blank graph on the axis.
There's got to be a way for me to be able to identify the converge points from the grid but so far I have not been successful. I tried using the Which[] method but when comparing the value its returns false.
Any help will appreciate it
Your code is not optimal, to put it mildly, but to give you a head start, why don't you start with something like this:
f[x_] := x^3 + 1;
newton[x0_] := (x = x0;
a1 = {};
b1 = {};
c1 = {};
counter = 0;
error = Abs[f[x]];
While[counter < 20 && error > 0.0001,
If[f'[x] != 0, x = x - N[f[x]/f'[x]]];
counter = counter + 1;
error = Abs[f[x]]];
{x, counter})
Table[Re#newton[a + b I], {a, -2, 2, 0.01}, {b, -2, 2, 0.01}] // Image

Null values in matrix, why?

I'm learning about dynamic programming via the 0-1 knapsack problem.
I'm getting some weird Nulls out from the function part1. Like 3Null, 5Null etc. Why is this?
The code is an implementation of:
http://www.youtube.com/watch?v=EH6h7WA7sDw
I use a matrix to store all the values and keeps, dont know how efficient this is since it is a list of lists(indexing O(1)?).
This is my code:
(* 0-1 Knapsack problem
item = {value, weight}
Constraint is maxweight. Objective is to max value.
Input on the form:
Matrix[{value,weight},
{value,weight},
...
]
*)
lookup[x_, y_, m_] := m[[x, y]];
part1[items_, maxweight_] := {
nbrofitems = Dimensions[items][[1]];
keep = values = Table[0, {j, 0, nbrofitems}, {i, 1, maxweight}];
For[j = 2, j <= nbrofitems + 1, j++,
itemweight = items[[j - 1, 2]];
itemvalue = items[[j - 1, 1]];
For[i = 1, i <= maxweight, i++,
{
x = lookup[j - 1, i, values];
diff = i - itemweight;
If[diff > 0, y = lookup[j - 1, diff, values], y = 0];
If[itemweight <= i ,
{If[x < itemvalue + y,
{values[[j, i]] = itemvalue + y; keep[[j, i]] = 1;},
{values[[j, i]] = x; keep[[j, i]] = 0;}]
},
y(*y eller x?*)]
}
]
]
{values, keep}
}
solvek[keep_, items_, maxweight_] :=
{
(*w=remaining weight in knapsack*)
(*i=current item*)
w = maxweight;
knapsack = {};
nbrofitems = Dimensions[items][[1]];
For[i = nbrofitems, i > 0, i--,
If[keep[[i, w]] == 1, {Append[knapsack, i]; w -= items[[i, 2]];
i -= 1;}, i - 1]];
knapsack
}
Clear[keep, v, a, b, c]
maxweight = 5;
nbrofitems = 3;
a = {5, 3};
b = {3, 2};
c = {4, 1};
items = {a, b, c};
MatrixForm[items]
Print["Results:"]
results = part1[items, 5];
keep = results[[1]];
Print["keep:"];
Print[keep];
Print["------"];
results2 = solvek[keep, items, 5];
MatrixForm[results2]
(*MatrixForm[results[[1]]]
MatrixForm[results[[2]]]*)
{{{0,0,0,0,0},{0,0,5 Null,5 Null,5 Null},{0,3 Null,5 Null,5 Null,8 Null},{4 Null,4 Null,7 Null,9 Null,9 Null}},{{0,0,0,0,0},{0,0,Null,Null,Null},{0,Null,0,0,Null},{Null,Null,Null,Null,Null}}}
While your code gives errors here, the Null problem occurs because For[] returns Null. So add a ; at the end of the outermost For statement in part1 (ie, just before {values,keep}.
As I said though, the code snippet gives errors when I run it.
In case my answer isn't clear, here is how the problem occurs:
(
Do[i, {i, 1, 10}]
3
)
(*3 Null*)
while
(
Do[i, {i, 1, 10}];
3
)
(*3*)
The Null error has been reported by acl. There are more errors though.
Your keep matrix actually contains two matrices. You need to call solvek with the second one: solvek[keep[[2]], items, 5]
Various errors in solvek:
i -= 1 and i - 1 are more than superfluous (the latter one is a coding error anyway). The i-- in the beginning of the For is sufficient. As it is now you're decreasing i twice per iteration.
Append must be AppendTo
keep[[i, w]] == 1 must be keep[[i + 1, w]] == 1 as the keep matrix has one more row than there are items.
Not wrong but superfluous: nbrofitems = Dimensions[items][[1]]; nbrofitems is already globally defined
The code of your second part could look like:
solvek[keep_, items_, maxweight_] :=
Module[{w = maxweight, knapsack = {}, nbrofitems = Dimensions[items][[1]]},
For[i = nbrofitems, i > 0, i--,
If[keep[[i + 1, w]] == 1, AppendTo[knapsack, i]; w -= items[[i, 2]]]
];
knapsack
]

Mathematica: Thread::tdlen: Objects of unequal length in {Null} {} cannot be combined. >>

I have aproblem:
Thread::tdlen: Objects of unequal length in {Null} {} cannot be combined. >>
It seems to occur in the while test which makes no sense at all since I am onlu comparing numbers...?
The program is a program to solve the 0-1 knapsack dynamic programming problem though I use loops, not recursion.
I have put some printouts and i can only think that the problem is in the while loop and it doesnt make sense.
(* 0-1 Knapsack problem
item = {value, weight}
Constraint is maxweight. Objective is to max value.
Input on the form:
Matrix[{value,weight},
{value,weight},
...
]
*)
lookup[x_, y_, m_] := m[[x, y]];
generateTable[items_, maxweight_] := {
nbrofitems = Dimensions[items][[1]];
keep = values = Table[0, {j, 0, nbrofitems}, {i, 1, maxweight}];
For[j = 2, j <= nbrofitems + 1, j++,
itemweight = items[[j - 1, 2]];
itemvalue = items[[j - 1, 1]];
For[i = 1, i <= maxweight, i++,
{
x = lookup[j - 1, i, values];
diff = i - itemweight;
If[diff > 0, y = lookup[j - 1, diff, values], y = 0];
If[itemweight <= i ,
{If[x < itemvalue + y,
{values[[j, i]] = itemvalue + y; keep[[j, i]] = 1;},
{values[[j, i]] = x; keep[[j, i]] = 0;}]
},
y(*y eller x?*)]
}
]
];
{values, keep}
}
pickItems[keep_, items_, maxweight_] :=
{
(*w=remaining weight in knapsack*)
(*i=current item*)
w = maxweight;
knapsack = {};
nbrofitems = Dimensions[items][[1]];
i = nbrofitems + 1;
x = 0;
While[i > 0 && x < 10,
{
Print["lopp round starting"];
x++;
Print["i"];
Print[i];
Print["w"];
Print[w];
Print["keep[i,w]"];
Print[keep[[i, w]]];
If[keep[[i, w]] == 1,
{Append[knapsack, i];
Print["tjolahej"];
w -= items[[i - 1, 2]];
i -= 1;
Print["tjolahopp"];
},
i -= 1;
];
Print[i];
Print["loop round done"];
}
knapsack;
]
}
Clear[keep, v, a, b, c]
maxweight = 5;
nbrofitems = 3;
a = {5, 3};
b = {3, 2};
c = {4, 1};
items = {a, b, c};
MatrixForm[items]
results = generateTable[items, 5];
keep = results[[1]][[2]];
Print["keep:"];
MatrixForm[keep]
Print["------"];
results2 = pickItems[keep, items, 5];
MatrixForm[results2]
This is not really an answer to the specific question being asked, but some hints on general situations when this error occurs. The short answer is that this is a sign of passing lists of unequal lengths to some Listable function, user-defined or built-in.
Many of Mathematica's built-in functions are Listable(have Listable attribute). This basically means that, given lists in place of some or all arguments, Mathematica automatically threads the function over them. What really happens is that Thread is called internally (or, at least, so it appears). This can be illustrated by
In[15]:=
ClearAll[f];
SetAttributes[f,Listable];
f[{1,2},{3,4,5}]
During evaluation of In[15]:= Thread::tdlen: Objects of unequal length in
f[{1,2},{3,4,5}] cannot be combined. >>
Out[17]= f[{1,2},{3,4,5}]
You can get the same behavior by using Thread explicitly:
In[19]:=
ClearAll[ff];
Thread[ff[{1,2},{3,4,5}]]
During evaluation of In[19]:= Thread::tdlen: Objects of unequal length in
ff[{1,2},{3,4,5}] cannot be combined. >>
Out[20]= ff[{1,2},{3,4,5}]
In case of Listable functions, this is a bit more hidden though. Some typical examples would include things like {1, 2} + {3, 4, 5} or {1, 2}^{3, 4, 5} etc. I discussed this issue in a bit more detail here.
Try this version:
pickItems[keep_, items_, maxweight_] := Module[{},
{(*w=remaining weight in knapsack*)(*i=current item*)w = maxweight;
knapsack = {};
nbrofitems = Dimensions[items][[1]];
i = nbrofitems + 1;
x = 0;
While[i > 0 && x < 10,
{
Print["lopp round starting"];
x++;
Print["i"];
Print[i];
Print["w"];
Print[w];
Print["keep[i,w]"];
Print[keep[[i, w]]];
If[keep[[i, w]] == 1,
{
Append[knapsack, i];
Print["tjolahej"];
w -= items[[i - 1, 2]];
i -= 1;
Print["tjolahopp"];
},
i -= 1;
];
Print[i];
Print["loop round done"]
};
knapsack
]
}
]
no errors now, but I do not know what it does really :)

Yield in Mathematica

Can you do something like Python's yield statement in Mathematica, in order to create generators? See e.g. here for the concept.
Update
Here's an example of what I mean, to iterate over all permutations, using only O(n) space: (algorithm as in Sedgewick's Algorithms book):
gen[f_, n_] := Module[{id = -1, val = Table[Null, {n}], visit},
visit[k_] := Module[{t},
id++; If[k != 0, val[[k]] = id];
If[id == n, f[val]];
Do[If[val[[t]] == Null, visit[t]], {t, 1, n}];
id--; val[[k]] = Null;];
visit[0];
]
Then call it it like:
gen[Print,3], printing all 6 permutations of length 3.
As I have previously stated, using Compile will given faster code. Using an algorithm from fxtbook, the following code generates a next partition in lexicographic ordering:
PermutationIterator[f_, n_Integer?Positive, nextFunc_] :=
Module[{this = Range[n]},
While[this =!= {-1}, f[this]; this = nextFunc[n, this]];]
The following code assumes we run version 8:
ClearAll[cfNextPartition];
cfNextPartition[target : "MVM" | "C"] :=
cfNextPartition[target] =
Compile[{{n, _Integer}, {this, _Integer, 1}},
Module[{i = n, j = n, ni, next = this, r, s},
While[Part[next, --i] > Part[next, i + 1],
If[i == 1, i = 0; Break[]]];
If[i == 0, {-1}, ni = Part[next, i];
While[ni > Part[next, j], --j];
next[[i]] = Part[next, j]; next[[j]] = ni;
r = n; s = i + 1;
While[r > s, ni = Part[next, r]; next[[r]] = Part[next, s];
next[[s]] = ni; --r; ++s];
next
]], RuntimeOptions -> "Speed", CompilationTarget -> target
];
Then
In[75]:= Reap[PermutationIterator[Sow, 4, cfNextPartition["C"]]][[2,
1]] === Permutations[Range[4]]
Out[75]= True
This is clearly better in performance than the original gen function.
In[83]:= gen[dummy, 9] // Timing
Out[83]= {26.067, Null}
In[84]:= PermutationIterator[dummy, 9, cfNextPartition["C"]] // Timing
Out[84]= {1.03, Null}
Using Mathematica's virtual machine is not much slower:
In[85]:= PermutationIterator[dummy, 9,
cfNextPartition["MVM"]] // Timing
Out[85]= {1.154, Null}
Of course this is nowhere near C code implementation, yet provides a substantial speed-up over pure top-level code.
You probably mean the question to be more general but the example of iterating over permutations as given on the page you link to happens to be built in to Mathematica:
Scan[Print, Permutations[{1, 2, 3}]]
The Print there can be replaced with any function.

Resources