Identify important minima and maxima in time-series with Mathematica

I need a way to identify local minima and maxima in time series data with Mathematica. This seems like it should be an easy thing to do, but it gets tricky. I posted this on the MathForum, but thought I might get some additional eyes on it here.
You can find a paper that discusses the problem at: http://www.cs.cmu.edu/~eugene/research/full/compress-series.pdf
I've tried this so far…
Get and format some data:
data = FinancialData["SPY", {"May 1, 2006", "Jan. 21, 2011"}][[All, 2]];
data = data/First@data;
data = Transpose[{Range[Length@data], data}];
Define 2 functions:
First method:
findMinimaMaxima[data_, window_] := With[{k = window},
data[[k + Flatten@Position[Partition[data[[All, 2]], 2 k + 1, 1], x_List /; x[[k + 1]] < Min[Delete[x, k + 1]] || x[[k + 1]] > Max[Delete[x, k + 1]]]]]]
Now another approach, although not as flexible:
findMinimaMaxima2[data_] := data[[Accumulate@(Length[#] & /@ Split[Prepend[Sign[Rest@data[[All, 2]] - Most@data[[All, 2]]], 0]])]]
Look at what each of the functions does. First, findMinimaMaxima2[]:
minmax = findMinimaMaxima2[data];
{Length@data, Length@minmax}
ListLinePlot@minmax
This selects all minima and maxima and results (in this instance) in about 49% data compression, but it doesn't have the flexibility of expanding the window.
This other method does. A window of 2 yields fewer and arguably more important extrema:
minmax2 = findMinimaMaxima[data, 2];
{Length@data, Length@minmax2}
ListLinePlot@minmax2
But look at what happens when we expand the window to 60:
minmax2 = findMinimaMaxima[data, 60];
ListLinePlot[{data, minmax2}]
Some of the minima and maxima no longer alternate.
Applying findMinimaMaxima2[] to the output of findMinimaMaxima[] gives a workaround...
minmax3 = findMinimaMaxima2[minmax2];
ListLinePlot[{data, minmax2, minmax3}]
, but this seems like a clumsy way to address the problem.
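Still, if the post-processing step does the job, it can at least be packaged as a single call (a trivial wrapper; the name findAlternatingExtrema is mine):
findAlternatingExtrema[data_, window_] := findMinimaMaxima2[findMinimaMaxima[data, window]]
With this, minmax3 === findAlternatingExtrema[data, 60] should give True.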
So, the idea of using a fixed window to look left and right doesn't quite do everything one would like. I began thinking about an alternative that could use a range value R (e.g. a percent move up or down) that the function would need to meet or exceed to set the next minimum or maximum. Here's my first try:
findMinimaMaxima3[data_, R_] := Module[{d, n, positions},
d = data[[All, 2]];
n = Transpose[{data[[All, 1]], Rest@FoldList[If[(#2 <= #1 + #1*R && #2 >= #1) || (#2 >= #1 - #1*R && #2 <= #1), #1, #2] &, d[[1]], d]}];
n = Sign[Rest#n[[All, 2]] - Most#n[[All, 2]]];
positions = Flatten@Rest[Most[Position[n, Except[0]]]];
data[[positions]]
]
minmax4 = findMinimaMaxima3[data, 0.1];
ListLinePlot[{data, minmax4}]
This too benefits from post-processing with findMinimaMaxima2[]:
ListLinePlot[{data, findMinimaMaxima2[minmax4]}]
But if you look closely, you see that it misses extremes wherever the move beyond the R value is spread over several positions - including the chart's absolute minimum and maximum, as well as along the big moves up and down. Changing the R value shows how it misses the tops and bottoms even more:
minmax4 = findMinimaMaxima3[data, 0.15];
ListLinePlot[{data, minmax4}]
So, I need to reconsider. Anyone can look at a plot of the data and easily identify the important minima and maxima; it seems much harder to get an algorithm to do it. A window and/or an R value seem important to the solution, but neither on its own seems enough (at least not in the approaches above).
Can anyone extend any of the approaches shown or suggest an alternative to identifying the important minima and maxima?
Happy to forward a notebook with all of this code and discussion in it. Let me know if anyone needs it.
Thank you,
Jagra

I suggest using an iterative approach. The following functions are taken from this post; while they could be written more concisely without Compile, they'll do the job:
localMinPositionsC =
Compile[{{pts, _Real, 1}},
Module[{result = Table[0, {Length[pts]}], i = 1, ctr = 0},
For[i = 2, i < Length[pts], i++,
If[pts[[i - 1]] > pts[[i]] && pts[[i + 1]] > pts[[i]],
result[[++ctr]] = i]];
Take[result, ctr]]];
localMaxPositionsC =
Compile[{{pts, _Real, 1}},
Module[{result = Table[0, {Length[pts]}], i = 1, ctr = 0},
For[i = 2, i < Length[pts], i++,
If[pts[[i - 1]] < pts[[i]] && pts[[i + 1]] < pts[[i]],
result[[++ctr]] = i]];
Take[result, ctr]]];
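As a quick sanity check of both functions on a small hand-made list (they return positions, not values):
localMinPositionsC[{3., 1., 2., 0.5, 4.}]
(* {2, 4} *)
localMaxPositionsC[{3., 1., 2., 0.5, 4.}]
(* {3} *)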
Here is your data plot:
dplot = ListLinePlot[data]
Here we plot the mins, which are obtained after 3 iterations:
mins = ListPlot[Nest[#[[localMinPositionsC[#[[All, 2]]]]] &, data, 3],
PlotStyle -> Directive[PointSize[0.015], Red]]
The same for maxima:
maxs = ListPlot[Nest[#[[localMaxPositionsC[#[[All, 2]]]]] &, data, 3],
PlotStyle -> Directive[PointSize[0.015], Green]]
And the resulting plot:
Show[{dplot, mins, maxs}]
You may vary the number of iterations, to get more coarse-grained or finer minima/maxima.
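For instance, to see how quickly the extrema thin out with each pass (a small sketch using the definitions above):
Table[Length[Nest[#[[localMinPositionsC[#[[All, 2]]]]] &, data, n]], {n, 0, 4}]
Each entry is the number of surviving minima after n iterations; the count should drop sharply, with each pass targeting a coarser scale.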
Edit:
Actually, I just noticed that a couple of points were still missed by this method, both for the minima and the maxima. So I suggest it as a starting point, not as a complete solution. Perhaps you could analyze the minima/maxima coming from different iterations, and sometimes include those from a "previous", more fine-grained one. Also, the only "physical reason" this works at all is that the financial data appears to be fractal-like, with several distinctly different scales. Each iteration in the above Nest-s targets a particular scale. This would not work so well for an arbitrary signal.

Related

Performance of Wolfram Mathematica InverseFunction

given the constants
mu = 20.82;
ex = 1.25;
kg1 = 1202.76;
kp = 76.58;
kvb = 126.92;
I need to invert the function
f[Vpx_,Vgx_] := Vpx Log[1 + Exp[kp (1/mu + Vgx/(Vpx s[Vpx]))]];
where
s[x_] := 1 + kvb/(2 x^2);
so that I get a function of two variables, the second one being Vgx.
I tried with
t = InverseFunction[Function[{Vpx, Vgx}, f[Vpx, Vgx]], 1, 2];
I tested it with t[451, -4].
It takes so much time that every time I try it I stop the evaluation.
On the other side, working with only one variable, everything works:
Vgx = -4;
t = InverseFunction[Function[{Vpx}, f[Vpx,Vgx]]];
t[451]
Is it my fault? Is the method inappropriate? Or is it a limitation of Wolfram Mathematica?
Thanks
Teodoro Marinucci
P.S. For everyone interested it's a problem related to the Norman Koren model of triodes.
As I said in my comment, my guess is that InverseFunction first tries to solve symbolically for the inverse, e.g. Solve[Function[{Vpx, Vgx}, f[Vpx, Vgx]][X, #2] == #1, X], which takes a very long time (I didn't let it finish). However, I came across a system option that seems to turn this off and produce a function:
With[{opts = SystemOptions["ExtendedInverseFunction"]},
Internal`WithLocalSettings[
SetSystemOptions["ExtendedInverseFunction" -> False],
t = InverseFunction[Function[{Vpx, Vgx}, f[Vpx, Vgx]], 1, 2],
SetSystemOptions[opts]
]];
t[451, -4]
(* 199.762 *)
A couple of notes:
According to the documentation, InverseFunction with exact input should produce an exact answer. Here some of the parameters are approximate (floating-point) real numbers, so the answer above is a numerical approximation.
The actual definition of t depends on f. If f changes, then a side effect will be that t changes. If that is not something you explicitly want, then it is probably better to define t this way:
t = InverseFunction[Function[{Vpx, Vgx}, Evaluate@f[Vpx, Vgx]], 1, 2]
As my late Theoretical Physics professor said, "a simple and beautiful solution is likely to be true".
Here is the piece of code that works:
mu = 20.82; ex = 1.25; kg1 = 1202.76; kp = 76.58; kvb = 126.92;
Ip[Vpx_, Vgx_] = Power[Vpx/kp Log[1 + Exp[kp (1/mu + Vgx/Sqrt[kvb + Vpx^2])]], ex] 2/kg1;
Vp[y_, z_] := x /. FindRoot[Ip[x, z] == y, {x, 80}]
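As a quick consistency check (my addition; it assumes FindRoot converges to the same branch), composing the forward model with its inverse should approximately recover the plate voltage:
Vp[Ip[451., -4.], -4.]
(* should return approximately 451. *)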
The "real" amplification factor of a tube is the partial derivative of Ip[Vpx, Vgx] with respect to Vgx, at a given Vpx. I would be happier if I could use Derivative, but I'm getting errors.
I'll try to understand why, but for the moment the definition
\[CapitalDelta]x = 10^-6;
\[Micro][Ipx_, Vgx_] := Abs[Vp[Ipx, Vgx + \[CapitalDelta]x] - Vp[Ipx, Vgx]]/\[CapitalDelta]x
works well for me.
Thanks, the problem really was the starting point of the FindRoot.
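If the errors with Derivative persist, one alternative worth sketching (my suggestion, not part of the original answer) is to differentiate Ip symbolically and apply the implicit function theorem: at constant Ip, dVp/dVg = -(dIp/dVg)/(dIp/dVp).
dIpdVp[vp_, vg_] = D[Ip[vp, vg], vp];
dIpdVg[vp_, vg_] = D[Ip[vp, vg], vg];
(* amplification factor at the operating point fixed by Ipx and Vgx *)
muImplicit[Ipx_?NumericQ, Vgx_?NumericQ] :=
 With[{vp0 = Vp[Ipx, Vgx]},
  Abs[dIpdVg[vp0, Vgx]/dIpdVp[vp0, Vgx]]]
This avoids the cancellation error a fixed 10^-6 finite-difference step can suffer from.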

Poisson Solver using Mathematica

I am looking for some help with a Poisson Solver I am writing in Mathematica. The code is quite long with Arrays plugged in, but the full details can be found at http://pastebin.com/uSrSDcW6
I am calculating voltages given charge densities using the central-difference method derived from Poisson's equation. After calculating the voltage, I test the data set for convergence. I am setting convergence thresholds on the order of 10^-1000+. I have the loop set up to kick out after 10000 iterations in case something goes awry, as a fail-safe. I have a loop counter in place for sanity. The program seems to run fine as long as the convergence threshold is set to 10^-100.
My question is this: no matter what I update the threshold to, e.g. 10^-100 or 10^-150, the computation stops after 633 iterations and kicks out of the loop. I would appreciate any help with this; I am completely stuck. I've added comments to the program that should be explanatory for anyone on this forum. Again, I know this description is limited, so please see the attached url http://pastebin.com/uSrSDcW6 for the full program.
Update 10/9/12: I've isolated my issue down to the 16-digit machine precision. I need to open that up to my machine's maximum of 10^309. The Mathematica help is sparse on how to do this, e.g. "N[MachinePrecision, 50]". Where would I set this in my program to apply it to all computation? I'll paste the loop here if that helps.
Vnew / Vold / RHO are 10x10x34 Matrices
Epsilon is a constant
(*Initialize ConvergenceLoop to 0 - this will serve as a fail-safe to kick out of the loop if necessary*)
ConvergenceLoop = 0;
(*Initialize Convergence to zero*)
Convergence = 0;
While[Convergence == 0 && ConvergenceLoop < 10000,
(*Run through all i, j, k elements, calculating new voltage values*)
Do[Vnew[[i]][[j]][[k]] = (1/(2/deltaX^2 + 2/deltaY^2 +
2/deltaZ^2)) *(((Vold[[i + 1]][[j]][[k]] +
Vold[[i - 1]][[j]][[k]])/(deltaX^2)) + ((Vold[[i]][[j + 1]][[k]] +
Vold[[i]][[j - 1]][[k]])/(deltaY^2)) + ((Vold[[i]][[j]][[k + 1]] +
Vold[[i]][[j]][[k - 1]])/(deltaZ^2)) + ((Rho[[i]][[j]][[k]]/Epsilon))), {i, 2, 9}, {j, 2,9}, {k, 2, 33}];
(*Assume converged, so the loop is triggered when the test hits the first value exceeding the defined convergence threshold*)
Convergence = 1;
(*This is the convergence test, with a user-defined convergence threshold*)
Do[If[Vold[[i]][[j]][[k]] == 0, Null,
If[(Vnew[[i]][[j]][[k]] - Vold[[i]][[j]][[k]])/Vold[[i]][[j]][[k]] > .0000001, Convergence = 0;
(*This is purely diagnostic. I added a Tracker point to follow the convergence along.
user defined at any element*)
If[i == 5 && j == 5 && k == 10,
Print[ "Tracker Point" (Vnew[[i]][[j]][[k]] -
Vold[[i]][[j]][[k]])/Vold[[i]][[j]][[k]]], Null],Null]], {i, 2, 9}, {j, 2, 9}, {k, 2, 33}];
(*Ignore the first iteration until Vnew and Vold are nonzero*)
If[ConvergenceLoop < 2, Convergence = 0, Null];
(*Forces Vold to evolve with Vnew*)
Vold = Vnew;
ConvergenceLoop ++;]
(*Added SessionTime for future planning purposes*)
If[ConvergenceLoop == 10000,
Print["Convergence Loop Limit Reached. " (SessionTime[]/3600) ],
Print["Convergence Loop Limit Not Reached."]];
(*We broke out of the While loop, meaning our data converged, so print the converged values*)
If[Convergence == 1,
Print[ ConvergenceLoop "Congratulations Converged!" MatrixForm [Vnew]], Print["Did Not Converge!"]];
Since, based on the comments above, you have narrowed this down to a precision problem as I suspected, please read these:
Funny behaviour when plotting a polynomial of high degree and large coefficients
Global precision setting
Confused by (apparent) inconsistent precision
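As a minimal sketch of the usual fix (assuming the arrays and constants named in the loop above): convert everything the iteration touches to arbitrary-precision numbers with SetPrecision before the While loop, and all subsequent arithmetic carries that precision instead of truncating at 16 machine digits.
prec = 50; (* working precision in digits; raise as needed *)
{Vold, Vnew, Rho} = SetPrecision[{Vold, Vnew, Rho}, prec];
{Epsilon, deltaX, deltaY, deltaZ} = SetPrecision[{Epsilon, deltaX, deltaY, deltaZ}, prec];
Note that 10^309 is roughly the largest representable machine number, not a precision; precision is counted in digits, and arbitrary-precision arithmetic is what lifts the 16-digit limit.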

Mathematica fast 2D binning algorithm

I am having some trouble developing a suitably fast binning algorithm in Mathematica. I have a large (~100k elements) data set of the form
T={{x1,y1,z1},{x2,y2,z2},....}
and I want to bin it into a 2D array of around 100x100 bins, with the bin value being given by the sum of the Z values that fall into each bin.
Currently I am iterating through each element of the table, using Select to pick out which bin it is supposed to be in based on lists of bin boundaries, and adding the z value to a list of values occupying that bin. At the end I map Total onto the list of bins, summing their contents (I do this because I sometimes want to do other things, like maximize).
I have tried using Gather and other such functions to do this, but the above method was ridiculously faster, though perhaps I am using Gather poorly. Anyway, it still takes a few minutes to do the sorting with my method, and I feel like Mathematica can do better. Does anyone have a nice, efficient algorithm handy?
Here is a method based on Szabolcs's post that is about an order of magnitude faster.
data = RandomReal[5, {500000, 3}];
(*500k values*)
zvalues = data[[All, 3]];
epsilon = 1*^-10;(*prevent 101 index*)
(*rescale and round (x,y) coordinates to index pairs in the 1..100 range*)
indexes = 1 + Floor[(1 - epsilon) 100 Rescale[data[[All, {1, 2}]]]];
res2 = Module[{gb = GatherBy[Transpose[{indexes, zvalues}], First]},
SparseArray[
gb[[All, 1, 1]] ->
Total[gb[[All, All, 2]], {2}]]]; // AbsoluteTiming
Gives about {2.012217, Null}
AbsoluteTiming[
System`SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> 1}];
res3 = SparseArray[indexes -> zvalues];
System`SetSystemOptions[
"SparseArrayOptions" -> {"TreatRepeatedEntries" -> 0}];
]
Gives about {0.195228, Null}
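Incidentally, the option flip can be scoped with the same Internal`WithLocalSettings pattern shown in the InverseFunction answer earlier on this page, so the global setting is restored even if evaluation aborts:
With[{opts = SystemOptions["SparseArrayOptions"]},
 Internal`WithLocalSettings[
  SetSystemOptions["SparseArrayOptions" -> {"TreatRepeatedEntries" -> 1}],
  res3 = SparseArray[indexes -> zvalues],
  SetSystemOptions[opts]]]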
res3 == res2
True
"TreatRepeatedEntries" -> 1 makes SparseArray add up the values supplied for duplicate positions.
I intend to do a rewrite of the code below because of Szabolcs' readability concerns. Until then, know that if your bins are regular, and you can use Round, Floor, or Ceiling (with a second argument) in place of Nearest, the code below will be much faster. On my system, it tests faster than the GatherBy solution also posted.
Assuming I understand your requirements, I propose:
data = RandomReal[100, {75, 3}];
bins = {0, 20, 40, 60, 80, 100};
Reap[
Sow[{#3, #2}, bins ~Nearest~ #] & @@@ data,
bins,
Reap[Sow[#, bins ~Nearest~ #2] & @@@ #2, bins, Tr@#2 &][[2]] &
][[2]] ~Flatten~ 1 ~Total~ {3} // MatrixForm
Refactored:
f[bins_] := Reap[Sow[{##2}, bins ~Nearest~ #] & @@@ #, bins, #2][[2]] &
bin2D[data_, X_, Y_] := f[X][data, f[Y][#2, #2~Total~2 &] &] ~Flatten~ 1 ~Total~ {3}
Use:
bin2D[data, xbins, ybins]
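With the sample data and bins defined at the top of this answer, that would be, for example:
bin2D[data, bins, bins] // MatrixForm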
Here's my approach:
data = RandomReal[5, {500000, 3}]; (* 500k values *)
zvalues = data[[All, 3]];
epsilon = 1*^-10; (* prevent 101 index *)
(* rescale and round (x,y) coordinates to index pairs in the 1..100 range *)
indexes = 1 + Floor[(1 - epsilon) 100 Rescale[data[[All, {1, 2}]]]];
(* approach 1: create bin-matrix first, then fill up elements by adding zvalues *)
res1 = Module[
{result = ConstantArray[0, {100, 100}]},
Do[
AddTo[result[[##]], zvalues[[i]]] & @@ indexes[[i]],
{i, Length[indexes]}
];
result
]; // Timing
(* approach 2: gather zvalues by indexes, add them up, convert them to a matrix *)
res2 = Module[{gb = GatherBy[Transpose[{indexes, zvalues}], First]},
SparseArray[gb[[All, 1, 1]] -> (Total /@ gb[[All, All, 2]])]
]; // Timing
res1 == res2
These two approaches (res1 & res2) can handle 100k and 200k elements per second, respectively, on this machine. Is this sufficiently fast, or do you need to run this whole program in a loop?
Here's my approach using the function SelectEquivalents defined in What is in your Mathematica tool bag? which is perfect for a problem like this one.
data = RandomReal[100, {75, 3}];
bins = Range[0, 100, 20];
binMiddles = (Most@bins + Rest@bins)/2;
nearest = Nearest[binMiddles];
SelectEquivalents[
data
,
TagElement -> ({First@nearest[#[[1]]], First@nearest[#[[2]]]} &)
,
TransformElement -> (#[[3]] &)
,
TransformResults -> (Total[#2] &)
,
TagPattern -> Flatten[Outer[List, binMiddles, binMiddles], 1]
,
FinalFunction -> (Partition[Flatten[# /. {} -> 0], Length[binMiddles]] &)
]
If you want to group according to more than two dimensions, you can use the following function in FinalFunction to give the resulting list the desired dimensions (I don't remember where I found it).
InverseFlatten[l_,dimensions_]:= Fold[Partition[#, #2] &, l, Most[Reverse[dimensions]]];
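For example, regrouping a flat list of 12 values into a 2x3x2 array:
InverseFlatten[Range[12], {2, 3, 2}]
(* {{{1, 2}, {3, 4}, {5, 6}}, {{7, 8}, {9, 10}, {11, 12}}} *)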

Is there a faster way to find both Min and Max values?

Often I have written: {Min@#, Max@#} &
Yet this seems inefficient, as the expression must be scanned twice, once to find the minimum value, and once to find the maximum value. Is there a faster way to do this? The expression is often a tensor or array.
This beats it by a bit.
minMax = Compile[{{list, _Integer, 1}},
Module[{currentMin, currentMax},
currentMin = currentMax = First[list];
Do[
Which[
x < currentMin, currentMin = x,
x > currentMax, currentMax = x],
{x, list}];
{currentMin, currentMax}],
CompilationTarget -> "C",
RuntimeOptions -> "Speed"];
v = RandomInteger[{0, 1000000000}, {10000000}];
minMax[v] // Timing
I think it's a little faster than Leonid's version because Do is a bit faster than For, even in compiled code.
Ultimately, though, this is an example of the kind of performance hit you take when using a high level, functional programming language.
Addition in response to Leonid:
I don't think that the algorithm can account for all the time difference. Much more often than not, I think, both tests will be applied anyway. The difference between Do and For is measurable, however.
cf1 = Compile[{{list, _Real, 1}},
Module[{sum},
sum = 0.0;
Do[sum = sum + list[[i]]^2,
{i, Length[list]}];
sum]];
cf2 = Compile[{{list, _Real, 1}},
Module[{sum, i},
sum = 0.0;
For[i = 1, i <= Length[list],
i = i + 1, sum = sum + list[[i]]^2];
sum]];
v = RandomReal[{0, 1}, {10000000}];
First /@ {Timing[cf1[v]], Timing[cf2[v]]}
{0.685562, 0.898232}
I think this is as fast as it gets, within Mathematica programming practices. The only way I see to attempt to make it faster within mma is to use Compile with the C compilation target, as follows:
getMinMax =
  Compile[{{lst, _Real, 1}},
   Module[{i = 1, min = First[lst], max = First[lst]},
    (* initialize from the first element so lists that are all positive or all negative are handled correctly *)
    For[i = 2, i <= Length[lst], i++,
     If[min > lst[[i]], min = lst[[i]]];
     If[max < lst[[i]], max = lst[[i]]];];
    {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]
However, even this seems to be somewhat slower than your code:
In[61]:= tst = RandomReal[{-10^7,10^7},10^6];
In[62]:= Do[getMinMax[tst],{100}]//Timing
Out[62]= {0.547,Null}
In[63]:= Do[{Min@#, Max@#} &[tst], {100}] // Timing
Out[63]= {0.484,Null}
You could probably write the function entirely in C, and then compile and load it as a DLL - you may eliminate some overhead this way, but I doubt that you will win more than a few percent - not worth the effort IMO.
EDIT
It is interesting that you may significantly increase the speed of the compiled solution with manual common subexpression elimination (lst[[i]] here):
getMinMax =
  Compile[{{lst, _Real, 1}},
   Module[{i = 2, min = First[lst], max = First[lst], temp},
    (* same first-element initialization as above; lst[[i]] is read only once per pass *)
    While[i <= Length[lst],
     temp = lst[[i++]];
     If[min > temp, min = temp];
     If[max < temp, max = temp];];
    {min, max}], CompilationTarget -> "C", RuntimeOptions -> "Speed"]
is a little faster than {Min@#, Max@#} &.
For an array, you could do the simplest functional thing and use
Fold[{Min[##], Max[##]} &, First@#, Rest@#] &@data
Unfortunately, it is no speed demon. Even for short lists (5 elements), all of Leonid's answers and Mark's answer are at least 7 times faster, uncompiled, in v7. For long lists (25000 elements) this gets worse, with Mark's being 19.6 times faster, yet even at this length the Fold solution took only about 0.05 s to run.
However, I'd not count out {Min[#], Max[#]} & as an option. Uncompiled, it was 1.7 times faster than Mark's for short lists and nearly 15 times faster for long lists (8 times and nearly 300 times faster, respectively, than the Fold solution).
Unfortunately, I could not get good numbers for the compiled versions of {Min[#], Max[#]} &, Leonid's, or Mark's answers; instead I got numerous incomprehensible error messages. In fact, {Min[#], Max[#]} & increased in execution time when compiled. The Fold solution improved dramatically, though, and took only twice as long as Leonid's answers' uncompiled times.
Edit: for the curious, here are some timing measurements of the uncompiled functions.
Each function was used on 100 lists of the length specified on the horizontal axis, and the average time, in seconds, is on the vertical axis. In ascending order of time, the curves are {Min[#], Max[#]} &, Mark's answer, Leonid's second answer, Leonid's first answer, and the Fold method from above.
For all of you who are doing timings, I'd like to warn you that order of execution is extremely important. For instance, look at the following two subtly different timing tests:
(1)
res =
Table[
a = RandomReal[{0, 100}, 10^8];
{
Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First,
Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First
}
, {100}
]
The odd man out here is the last Min.
(2)
res =
Table[
a = RandomReal[{0, 100}, 10^8];
{
Max[a] // AbsoluteTiming // First, Min[a] // AbsoluteTiming // First,
Min[a] // AbsoluteTiming // First, Max[a] // AbsoluteTiming // First
}
, {100}
]
Here, the highest timing is found for the first Max, the second-highest for the second Max, and the two Mins are about the same and lowest. Actually, I'd expect Max and Min to take about the same time, but they don't. The former seems to take 50% more time than the latter. Having the pole position also seems to come with a 50% handicap.
Now a comparison with the algorithms given by Mark and Leonid:
res =
Table[
a = RandomReal[{0, 100}, 10^8];
{
{Max[a], Min[a]} // AbsoluteTiming // First,
{Min@#, Max@#} &@a // AbsoluteTiming // First,
getMinMax[a] // AbsoluteTiming // First,
minMax[a] // AbsoluteTiming // First,
{Min[a], Max[a]} // AbsoluteTiming // First
}
, {100}
]
Here we find about 0.3 s for {Max[a], Min[a]} (which includes the pole-position handicap); the 0.1 s level is for Mark's method; the others are all about the same.

Efficient way to pick/delete a list of rows/columns in a matrix in Mathematica

This question is in a way a continuation of the question I asked here: Simple way to delete a matrix column in Mathematica, to which @belisarius and @Daniel provided very helpful answers.
What I am generally trying to do is to extract from a matrix A specific rows and columns, OR what remains after those specified are removed. Formally, this can be written as: find TakeOperator and DropOperator such that:
TakeOperator[A, {i1, ..., ip}, {j1, ..., jq}] = (A[[ik]][[jl]]) (1 <= k <= p, 1 <= l <= q) = Table[A[[ik]][[jl]], {k, p}, {l, q}]
Writing Ic = {i'1, ..., i'p'} = Complement[{1, ..., Length[A]}, {i1, ..., ip}] and Jc = {j'1, ..., j'q'} = Complement[{1, ..., Length[A]}, {j1, ..., jq}], then
DropOperator[A, {i1, ..., ip}, {j1, ..., jq}] = (A[[i'k']][[j'l']]) (1 <= k' <= p', 1 <= l' <= q') = Table[A[[i'k']][[j'l']], {k', p'}, {l', q'}]
While Table as described above does the trick, it is highly inefficient to use Table in that manner.
Just to give an idea, I took @belisarius' example:
In: First@Timing[a = RandomInteger[1000, {5000, 5000}];]
Out:0.218
In:Clear[b,c]
In: First@Timing[
b = Table[
If[i < 100, If[j < 100, a[[i]][[j]], a[[i]][[j + 1]]],
If[j < 100, a[[i + 1]][[j]], a[[i + 1]][[j + 1]]]], {i,
4999}, {j, 4999}]]
Out:140.807
In: First@Timing[c = Drop[a, {100}, {100}]]
Out:0.093
In:c===b
Out:True
Note: with respect to the use of Drop in the earlier post, I thought about using it as well, but when I checked the documentation, there was no suggestion that it could be used the way @belisarius and @Daniel suggested. If the documentation could be updated in that direction in future releases, it would be helpful.
Part directly supports lists of indices when slicing arrays. The following definitions exploit that:
takeOperator[a_?MatrixQ, rows_List, cols_List] :=
a[[rows, cols]]
dropOperator[a_?MatrixQ, rows_List, cols_List] :=
a[[##]] & @@ complementaryIndices[a, rows, cols]
complementaryIndices[a_?MatrixQ, rows_List, cols_List] :=
Complement @@@ Transpose@{Range /@ Dimensions@a, {rows, cols}}
Example use:
a = RandomInteger[1000, {5000, 5000}];
First@Timing@takeOperator[a, Range[1, 5000, 2], Range[1, 5000, 2]]
(* 0.016 *)
First@Timing@dropOperator[a, Range[1, 5000, 2], Range[1, 5000, 2]]
(* 0.015 *)
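A quick consistency check of dropOperator against the built-in Drop for the single row/column case:
dropOperator[a, {100}, {100}] === Drop[a, {100}, {100}]
(* True *)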
You can also use explicit ranges in a way that is fairly efficient. They may provide some more flexibility. Here is your example.
a = RandomInteger[1000, {5000, 5000}];
Timing[b = Drop[a, {101}, {101}];]
Out[66]= {0.041993, Null}
Timing[
c = a[[Join[Range[100], Range[102, 5000]],
Join[Range[100], Range[102, 5000]]]];]
Out[67]= {0.061991, Null}
c == b
Out[62]= True
I would also suggest use of Span except offhand I do not see how to get it to work in this setting.
Daniel Lichtblau
Wolfram Research
