A fast implementation in Mathematica for Position2D - wolfram-mathematica

I'm looking for a fast implementation for the following, I'll call it Position2D for lack of a better term:
Position2D[ matrix, sub_matrix ]
which finds the locations of sub_matrix inside matrix and returns the upper left and lower right row/column of a match.
For example, this:
Position2D[{
{0, 1, 2, 3},
{1, 2, 3, 4},
{2, 3, 4, 5},
{3, 4, 5, 6}
}, {
{2, 3},
{3, 4}
}]
should return this:
{
{{1, 3}, {2, 4}},
{{2, 2}, {3, 3}},
{{3, 1}, {4, 2}}
}
It should be fast enough to work quickly on 3000x2000 matrices with 100x100 sub-matrices. For simplicity, it is enough to only consider integer matrices.

Algorithm
The following code is based on an efficient custom position function to find positions of (possibly overlapping) integer sequences in a large integer list. The main idea is that we can first try to eficiently find the positions where the first row of the sub-matrix is in the Flatten-ed large matrix, and then filter those, extracting full sub-matrices and comparing to the sub-matrix of interest. This will be efficient for most cases except very pathological ones (those, for which this procedure would generate a huge number of potential position candidates, while the true number of entries of the sub-matrix would be much smaller. But such cases seem rather unlikely generally, and then, further improvements to this simple scheme can be made).
For large matrices, the proposed solution will be about 15-25 times faster than the solution of #Szabolcs when a compiled version of sequence positions function is used, and 3-5 times faster for the top-level implementation of sequence positions - finding function. The actual speedup depends on matrix sizes, it is more for larger matrices. The code and benchmarks are below.
Code
A generally efficient function for finding positions of a sub-list (sequence)
These helper functions are due to Norbert Pozar and taken from this Mathgroup thread. They are used to efficiently find starting positions of an integer sequence in a larger list (see the mentioned post for details).
Clear[seqPos];
fdz[v_] := Rest#DeleteDuplicates#Prepend[v, 0];
seqPos[list_List, seq_List] :=
Fold[
fdz[#1 (1 - Unitize[list[[#1]] - #2])] + 1 &,
fdz[Range[Length[list] - Length[seq] + 1] *
(1 - Unitize[list[[;; -Length[seq]]] - seq[[1]]])] + 1,
Rest#seq
] - Length[seq];
Example of use:
In[71]:= seqPos[{1,2,3,2,3,2,3,4},{2,3,2}]
Out[71]= {2,4}
A faster position-finding function for integers
However fast seqPos might be, it is still the major bottleneck in my solution. Here is a compiled-to-C version of this, which gives another 5x performance boost to my code:
seqposC =
Compile[{{list, _Integer, 1}, {seq, _Integer, 1}},
Module[{i = 1, j = 1, res = Table[0, {Length[list]}], ctr = 0},
For[i = 1, i <= Length[list], i++,
If[list[[i]] == seq[[1]],
While[j < Length[seq] && i + j <= Length[list] &&
list[[i + j]] == seq[[j + 1]],
j++
];
If[j == Length[seq], res[[++ctr]] = i];
j = 1;
]
];
Take[res, ctr]
], CompilationTarget -> "C", RuntimeOptions -> "Speed"]
Example of use:
In[72]:= seqposC[{1, 2, 3, 2, 3, 2, 3, 4}, {2, 3, 2}]
Out[72]= {2, 4}
The benchmarks below have been redone with this function (also the code for main function is slightly modified )
Main function
This is the main function. It finds positions of the first row in a matrix, and then filters them, extracting the sub-matrices at these positions and testing against the full sub-matrix of interest:
Clear[Position2D];
Position2D[m_, what_,seqposF_:Automatic] :=
Module[{posFlat, pos2D,sp = If[seqposF === Automatic,seqposC,seqposF]},
With[{dm = Dimensions[m], dwr = Reverse#Dimensions[what]},
posFlat = sp[Flatten#m, First#what];
pos2D =
Pick[Transpose[#], Total[Clip[Reverse#dm - # - dwr + 2, {0, 1}]],2] &#
{Mod[posFlat, #, 1], IntegerPart[posFlat/#] + 1} &#Last[dm];
Transpose[{#, Transpose[Transpose[#] + dwr - 1]}] &#
Select[pos2D,
m[[Last## ;; Last## + Last#dwr - 1,
First## ;; First## + First#dwr - 1]] == what &
]
]
];
For integer lists, the faster compiled subsequence position-finding function seqposC can be used (this is a default). For generic lists, one can supply e.g. seqPos, as a third argument.
How it works
We will use a simple example to dissect the code and explain its inner workings. This defines our test matrix and sub-matrix:
m = {{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}};
what = {{2, 3}, {3, 4}};
This computes the dimensions of the above (it is more convenient to work with reversed dimensions for a sub-matrix):
In[78]:=
dm=Dimensions[m]
dwr=Reverse#Dimensions[what]
Out[78]= {3,4}
Out[79]= {2,2}
This finds a list of starting positions of the first row ({2,3} here) in the Flattened main matrix. These positions are at the same time "flat" candidate positions of the top left corner of the sub-matrix:
In[77]:= posFlat = seqPos[Flatten#m, First#what]
Out[77]= {3, 6, 9}
This will reconstruct the 2D "candidate" positions of the top left corner of a sub-matrix in a full matrix, using the dimensions of the main matrix:
In[83]:= posInterm = Transpose#{Mod[posFlat,#,1],IntegerPart[posFlat/#]+1}&#Last[dm]
Out[83]= {{3,1},{2,2},{1,3}}
We can then try using Select to filter them out, extracting the full sub-matrix and comparing to what, but we'll run into a problem here:
In[84]:=
Select[posInterm,
m[[Last##;;Last##+Last#dwr-1,First##;;First##+First#dwr-1]]==what&]
During evaluation of In[84]:= Part::take: Cannot take positions 3 through 4
in {{0,1,2,3},{1,2,3,4},{2,3,4,5}}. >>
Out[84]= {{3,1},{2,2}}
Apart from the error message, the result is correct. The error message itself is due to the fact that for the last position ({1,3}) in the list, the bottom right corner of the sub-matrix will be outside the main matrix. We could of course use Quiet to simply ignore the error messages, but that's a bad style. So, we will first filter those cases out, and this is what the line Pick[Transpose[#], Total[Clip[Reverse#dm - # - dwr + 2, {0, 1}]], 2] &# is for. Specifically, consider
In[90]:=
Reverse#dm - # - dwr + 2 &#{Mod[posFlat, #, 1],IntegerPart[posFlat/#] + 1} &#Last[dm]
Out[90]= {{1,2,3},{2,1,0}}
The coordinates of the top left corners should stay within a difference of dimensions of matrix and a sub-matrix. The above sub-lists were made of x and y coordiantes of top - left corners. I added 2 to make all valid results strictly positive. We have to pick only coordiantes at those positions in Transpose#{Mod[posFlat, #, 1], IntegerPart[posFlat/#] + 1} &#Last[dm] ( which is posInterm), at which both sub-lists above have strictly positive numbers. I used Total[Clip[...,{0,1}]] to recast it into picking only at those positions at which this second list has 2 (Clip converts all positive integers to 1, and Total sums numbers in 2 sublists. The only way to get 2 is when numbers in both sublists are positive).
So, we have:
In[92]:=
pos2D=Pick[Transpose[#],Total[Clip[Reverse#dm-#-dwr+2,{0,1}]],2]&#
{Mod[posFlat,#,1],IntegerPart[posFlat/#]+1}&#Last[dm]
Out[92]= {{3,1},{2,2}}
After the list of 2D positions has been filtered, so that no structurally invalid positions are present, we can use Select to extract the full sub-matrices and test against the sub-matrix of interest:
In[93]:=
finalPos =
Select[pos2D,m[[Last##;;Last##+Last#dwr-1,First##;;First##+First#dwr-1]]==what&]
Out[93]= {{3,1},{2,2}}
In this case, both positions are genuine. The final thing to do is to reconstruct the positions of the bottom - right corners of the submatrix and add them to the top-left corner positions. This is done by this line:
In[94]:= Transpose[{#,Transpose[Transpose[#]+dwr-1]}]&#finalPos
Out[94]= {{{3,1},{4,2}},{{2,2},{3,3}}}
I could have used Map, but for a large list of positions, the above code would be more efficient.
Example and benchmarks
The original example:
In[216]:= Position2D[{{0,1,2,3},{1,2,3,4},{2,3,4,5},{3,4,5,6}},{{2,3},{3,4}}]
Out[216]= {{{3,1},{4,2}},{{2,2},{3,3}},{{1,3},{2,4}}}
Note that my index conventions are reversed w.r.t. #Szabolcs' solution.
Benchmarks for large matrices and sub-matrices
Here is a power test:
nmat = 1000;
(* generate a large random matrix and a sub-matrix *)
largeTestMat = RandomInteger[100, {2000, 3000}];
what = RandomInteger[10, {100, 100}];
(* generate upper left random positions where to insert the submatrix *)
rposx = RandomInteger[{1,Last#Dimensions[largeTestMat] - Last#Dimensions[what] + 1}, nmat];
rposy = RandomInteger[{1,First#Dimensions[largeTestMat] - First#Dimensions[what] + 1},nmat];
(* insert the submatrix nmat times *)
With[{dwr = Reverse#Dimensions[what]},
Do[largeTestMat[[Last#p ;; Last#p + Last#dwr - 1,
First#p ;; First#p + First#dwr - 1]] = what,
{p,Transpose[{rposx, rposy}]}]]
Now, we test:
In[358]:= (ps1 = position2D[largeTestMat,what])//Short//Timing
Out[358]= {1.39,{{{1,2461},{100,2560}},<<151>>,{{1900,42},{1999,141}}}}
In[359]:= (ps2 = Position2D[largeTestMat,what])//Short//Timing
Out[359]= {0.062,{{{2461,1},{2560,100}},<<151>>,{{42,1900},{141,1999}}}}
(the actual number of sub-matrices is smaller than the number we try to generate, since many of them overlap and "destroy" the previously inserted ones - this is so because the sub-matrix size is a sizable fraction of the matrix size in our benchmark).
To compare, we should reverse the x-y indices in one of the solutions (level 3), and sort both lists, since positions may have been obtained in different order:
In[360]:= Sort#ps1===Sort[Reverse[ps2,{3}]]
Out[360]= True
I do not exclude a possibility that further optimizations are possible.

This is my implementation:
position2D[m_, k_] :=
Module[{di, dj, extractSubmatrix, pos},
{di, dj} = Dimensions[k] - 1;
extractSubmatrix[{i_, j_}] := m[[i ;; i + di, j ;; j + dj]];
pos = Position[ListCorrelate[k, m], ListCorrelate[k, k][[1, 1]]];
pos = Select[pos, extractSubmatrix[#] == k &];
{#, # + {di, dj}} & /# pos
]
It uses ListCorrelate to get a list of potential positions, then filters those that actually match. It's probably faster on packed real matrices.

As per Leonid's suggestion here's my solution. I know it isn't very efficient (it's about 600 times slower than Leonid's when I timed it) but it's very short, rememberable, and a nice illustration of a rarely used function, PartitionMap. It's from the Developer package, so it needs a Needs["Developer`"] call first.
Given that, Position2D can be defined as:
Position2D[m_, k_] := Position[PartitionMap[k == # &, m, Dimensions[k], {1, 1}], True]
This only gives the upper-left coordinates. I feel the lower-right coordinates are really redundant, since the dimensions of the sub-matrix are known, but if the need arises one can add those to the output by prepending {#, Dimensions[k] + # - {1, 1}} & /# to the above definition.

How about something like
Position2D[bigMat_?MatrixQ, smallMat_?MatrixQ] :=
Module[{pos, sdim = Dimensions[smallMat] - 1},
pos = Position[bigMat, smallMat[[1, 1]]];
Quiet[Select[pos, (MatchQ[
bigMat[[Sequence##Thread[Span[#, # + sdim]]]], smallMat] &)],
Part::take]]
which will return the top left-hand positions of the submatrices.
Example:
Position2D[{{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 5, 5, 6}},
{{2, 3}, {3, _}}]
(* Returns: {{1, 3}, {2, 2}, {3, 1}} *)
And to search a 1000x1000 matrix, it takes about 2 seconds on my old machine
SeedRandom[1]
big = RandomInteger[{0, 10}, {1000, 1000}];
Position2D[big, {{1, 1, _}, {1, 1, 1}}] // Timing
(* {1.88012, {{155, 91}, {295, 709}, {685, 661},
{818, 568}, {924, 45}, {981, 613}}} *)

Related

Given a pair of integers, minimum # of operations performed to reach target N

Given a pair of numbers (A, B).
You can perform an operation (A + B, B) or (A, A + B).
(A, B) is initialized to (1, 1).
For any N > 0, find the minimum number of operations you need to perform on (A, B) until A = N or B = N
Came across this question in an interview summary on glassdoor. Thought through a couple approaches, searched online but couldn't find any articles/answers solving this question. I have a brute force method shown below, however it must traverse O(2^N) paths, wondering if there is an elegant solution I am not seeing.
def pairsum(N):
A = 1
B = 1
return helper(N, A, B, 0)
def helper(N, A, B, ops):
# Solution found
if A == N or B == N:
return ops
# We've gone over, invalid path taken
if A > N or B > N:
return float("inf")
return min(helper(N, A + B, B, ops + 1), helper(N, A, A + B, ops + 1))
Given a target number N, it's possible to compute the minimum number of operations in approximately O(N log(N)) basic arithmetic operations (though I suspect there are faster ways). Here's how:
For this problem, I think it's easier to work backwards than forwards. Suppose that we're trying to reach a target pair (a, b) of positive integers. We start with (a, b) and work backwards towards (1, 1), counting steps as we go. The reason that this is easy is that there's only ever a single path from a pair (a, b) back to (1, 1): if a > b, then the pair (a, b) can't be the result of the second operation, so the only way we can possibly reach this pair is by applying the first operation to (a - b, b). Similarly, if a < b, we can only have reached the pair via the second operation applied to (a, b - a). What about the case a = b? Well, if a = b = 1, there's nothing to do. If a = b and a > 1, then there's no way we can reach the pair at all: note that both operations take coprime pairs of integers to coprime pairs of integers, so if we start with (1, 1), we can never reach a pair of integers that has a greatest common divisor bigger than 1.
This leads to the following code to count the number of steps to get from (1, 1) to (a, b), for any pair of positive integers a and b:
def steps_to_reach(a, b):
"""
Given a pair of positive integers, return the number of steps required
to reach that pair from (1, 1), or None if no path exists.
"""
steps = 0
while True:
if a > b:
a -= b
elif b > a:
b -= a
elif a == 1: # must also have b == 1 here
break
else:
return None # no path, gcd(a, b) > 1
steps += 1
return steps
Looking at the code above, it bears a strong resemblance to the Euclidean algorithm for computing greatest common divisors, except that we're doing things very inefficiently, by using repeated subtractions instead of going directly to the remainder with a Euclidean division step. So it's possible to replace the above with the following equivalent, simpler, faster version:
def steps_to_reach_fast(a, b):
"""
Given a pair of positive integers, return the number of steps required
to reach that pair from (1, 1), or None if no path exists.
Faster version of steps_to_reach.
"""
steps = -1
while b:
a, (q, b) = b, divmod(a, b)
steps += q
return None if a > 1 else steps
I leave it to you to check that the two pieces of code are equivalent: it's not hard to prove, but if you don't feel like getting out pen and paper then a quick check at the prompt should be convincing:
>>> all(steps_to_reach(a, b) == steps_to_reach_fast(a, b) for a in range(1, 1001) for b in range(1, 1001))
True
The call steps_to_reach_fast(a, b) needs O(log(max(a, b))) arithmetic operations. (This follows from standard analysis of the Euclidean algorithm.)
Now it's straightfoward to find the minimum number of operations for a given n:
def min_steps_to_reach(n):
"""
Find the minimum number of steps to reach a pair (*, n) or (n, *).
"""
# Count steps in all paths to (n, a). By symmetry, no need to
# check (a, n) too.
all_steps = (steps_to_reach_fast(n, a) for a in range(1, n+1))
return min(steps for steps in all_steps if steps is not None)
This function runs reasonably quickly up to n = 1000000 or so. Let's print out the first few values:
>>> min_steps_to_reach(10**6) # takes ~1 second on my laptop
30
>>> [min_steps_to_reach(n) for n in range(1, 50)]
[0, 1, 2, 3, 3, 5, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 6, 7, 7, 7, 7, 7, 7, 8, 7, 7, 7, 8, 8, 7, 8, 8, 8, 9, 8, 8, 8, 9, 8, 8, 8, 8, 8, 9, 8]
A search at the Online Encyclopedia of Integer Sequences quickly yields the sequence A178047, which matches our sequence perfectly. The sequence is described as follows:
Consider the Farey tree A006842/A006843; a(n) = row at which the
denominator n first appears (assumes first row is labeled row 0).
And indeed, if you look at the tree generated by your two operations, starting at (1, 1), and you regard each pair as a fraction, you get something that's very similar to the Stern-Brocot tree (another name for the Farey tree): the contents of each row are the same, but the ordering within each row is different. As it turns out, it's the Stern-Brocot tree in disguise!
This observation gives us an easily computable lower-bound on min_steps_to_reach: it's easy to show that the largest integer appearing as either a numerator or denominator in the ith row of the Stern-Brocot tree is the i+2nd Fibonacci number. So if n > Fib(i+2), then min_steps_to_reach(n) > i (and if n == Fib(i+2), then min_steps_to_reach(n) is exactly i). Getting an upper bound (or an exact value without an exhaustive search) seems to be a bit harder. Here are the worst cases: for each integer s >= 0, the smallest n requiring s steps (so for example, 506 is the first number requiring 15 steps):
[1, 2, 3, 4, 7, 6, 14, 20, 28, 38, 54, 90, 150, 216, 350, 506, 876, 1230, 2034, 3160, 4470, 7764]
If there's a pattern here, I'm not spotting it (but it's essentially sequence A135510 on OEIS).
[I wrote this before I realized #mark-dickinson had answered; his answer is much better than mine, but I'm providing mine for reference anyway]
The problem is fairly easy to solve if you work backwards. As an example, suppose N=65:
That means our current pair is either {65, x} or {y, 65} for some unknown values of x and y.
If {A,B} was the previous pair, this means either {A, A+B} or {A+B, B} is equal to either {65, x} or {y, 65}, which gives us 4 possible cases:
{A,A+B} = {65,x}, which would mean A=65. However, if A=65, we would've already hit A=N at an earlier step, and we're assuming this is the first step at which A=N or B=N, so we discard this possibility.
{A,A+B} = {y,65} which means A+B=65
{A+B,B} = {65,x} which means A+B=65
{A+B,B} = {y,65} which means B=65. However, if B=65, we already had a solution at a previous step, we also discard this possibility.
Therefore, A+B=65. There are 65 ways in which this can happen (actually, I think you can ignore the cases where A=0 or B=0, and also choose B>A by symmetry, but the solution is easy even withouth these assumptions).
We now examine all 65 cases. As an example, let's use A=25 and B=40.
If {C,D} was the pair that generated {25,40}, there are two possible cases:
{C+D,D} = {25,40} so D=40 and C=-15, which is impossible, since, starting at {1,1}, we will never get negative numbers.
{C,C+D} = {25,40} so C=25, and D=15.
Therefore, the "predecessor" of {25,40} is necessarily {25,15}.
By similar analysis, the predecessor of {25,15}, let's call it {E,F}, must have the property that either:
{E,E+F} = {25,15}, impossible since this would mean F=-10
{E+F,F} = {25,15} meaning E=10 and F=15.
Similarly the predecessor of {10,15} is {10,5}, whose predecessor is {5,5}.
The predecessor of {5,5} is either {0,5} or {5,0}. These two pairs are their own predecessors, but have no other predecessors.
Since we never hit {1,1} in this sequence, we know that {1,1} will never generate {25, 40}, so we continue computing for other pairs {A,B} such that A+B=65.
If we did hit {1,1}, we'd count the number of steps it took to get there, store the value, compute it for all other values of {A,B} such that A+B=65, and take the minimum.
Note that once we've chosen a value of A (and thus a value of B), we are effectively doing the subtraction version of Euclid's Algorithm, so the number of steps required is O(log(N)). Since you are doing these steps N times, the algorithm is O(N*log(N)), much smaller than your O(2^N).
Of course, you may be able to find shortcuts to make the method even faster.
Interesting Notes
If you start with {1,1}, here are the pairs you can generate in k steps (we use k=0 for {1,1} itself), after removing duplicates:
k=0: {1,1}
k=1: {2, 1}, {1, 2}
k=2: {3, 1}, {2, 3}, {3, 2}, {1, 3}
k=3: {4, 1}, {3, 4}, {5, 3}, {2, 5}, {5, 2}, {3, 5}, {4, 3}, {1, 4}
k=4: {5, 1}, {4, 5}, {7, 4}, {3, 7}, {8, 3}, {5, 8}, {7, 5}, {2, 7}, {7, 2}, {5, 7}, {8, 5}, {3, 8}, {7, 3}, {4, 7}, {5, 4}, {1, 5}
k=5: {6, 1}, {5, 6}, {9, 5}, {4, 9}, {11, 4}, {7, 11}, {10, 7}, {3, 10}, {11, 3}, {8, 11}, {13, 8}, {5, 13}, {12, 5}, {7, 12}, {9, 7}, {2, 9}, {9, 2}, {7, 9}, {12, 7}, {5, 12}, {13, 5}, {8, 13}, {11, 8}, {3, 11}, {10, 3}, {7, 10}, {11, 7}, {4, 11}, {9, 4}, {5, 9}, {6, 5}, {1, 6}
Things to note:
You can generate N=7 and N=8 in 4 steps, but not N=6, which requires 5 steps.
The number of pairs generated is 2^k
The smallest number of steps (k) required to reach a given N is:
N=1: k=0
N=2: k=1
N=3: k=2
N=4: k=3
N=5: k=3
N=6: k=5
N=7: k=4
N=8: k=4
N=9: k=5
N=10: k=5
N=11: k=5
The resulting sequence, {0,1,2,3,3,5,4,4,5,5,5,...} is https://oeis.org/A178047
The highest number generated in k steps is the (k+2)nd Fibonacci number, http://oeis.org/A000045
The number of distinct integers you can reach in k steps is now the (k+1)st element of http://oeis.org/A293160
As an example for k=20:
There are 2^20 or 1048576 pairs when k=20
The highest number in any of the 1048576 pairs above is 17711, the 22nd (20+2) Fibonacci number
However, you can't reach all of the first 17711 integers with these pairs. You can only reach 11552 of them, the 21st (20+1) element of A293160
For details on how I worked this problem out, see https://github.com/barrycarter/bcapps/blob/master/STACK/bc-add-sets.m

How to find the location of the highest element in a tensor in mathematica

I have a very large array of numbers in the form of a third order tensor.I want to find the highest of all the values in that tensor. How can I do it in mathematica? The context is that a reaction is carried out by varying temperature pressure and vessel volume. I want to find the optimum combination of the three to maximize the product. Each element of the tensor represents a value of the product produced corresponding to a specific combination of temperature pressure and volume.
Given some matrix, tensor, or basically any list-of-lists of real numbers, you can simply use the Max function to determine the maximum value and then Position to say where it is. Assuming your data isn't enormous (requiring some conservative/careful approach to save time/memory), this should be fine.
For example, here is a random list of lists of of lists of reals:
data = Table[RandomReal[],
{i, 1, RandomInteger[{4, 8}]},
{j, 1, RandomInteger[{4, 8}]},
{k, 1, RandomInteger[{4, 8}]}
];
You can just do:
m = Max[data]
Position[data, m]
This will tell you the position of the maximum value. If you did random integers instead, you could have ties, in which case you might have repeats:
data = Table[RandomInteger[{1, 10}],
{i, 1, RandomInteger[{4, 8}]},
{j, 1, RandomInteger[{4, 8}]},
{k, 1, RandomInteger[{4, 8}]}
];
m = Max[data]
Position[data, m]
Table[RandomInteger[100, 3], 3]
Prepend[Ordering[%[[First[Ordering[Reverse#*Sort /# %, -1]]]], -1],
First[Ordering[Reverse#*Sort /# %, -1]]]
% stands for the tensor to sort, in this case it's a random tensor generated from Table[RandomInteger[100, 3], 3]
This gives the position and value in one shot.
(m = RandomReal[{-1, 1}, {4, 3, 2}]) // MatrixForm
First#MaximalBy[
Flatten[MapIndexed[ {##} &, #, {-1}], ArrayDepth[#] - 1],
First] &#m
{0.903213, {3, 2, 2}}
Here is an alternate that will work with ragged lists:
Module[{h},
First#MaximalBy[List ### Flatten[MapIndexed[h### &, #, {-1}]],
First]] &#{{1, 2, 3}, {4, 5, {2, 3}}}
{5, {2, 2}}

Mathematica, maximize element extraction from list

I think this is a simple question for mathematica experts.
How can I maximize the extracted value from a list given a index that has to respect some constrains?
For example:
S = {4,2,3,5}
Maximize[{Extract[S,x], x<= 3, x>=1},{x}]
I would like 4 is returned instead of this error:
Extract::psl: "Position specification x in Extract[{4,2,3,5},x] is not an integer or a list of integers."
Does someone know like solve this?
Thanks a lot.
Thanks a lot!! The last approach shown is what I was looking for but applied to my real problem does not work.
I have the following problem:
I have to maximize the satisfaction of an employee with respect to a certain shift in an certain day of a month.
I have the matrix satisfaction (Employees,shifts) and is something like this:
S= {{4,3,5,2},{3,4,5,1}}
Each element represents the satisfaction of an employee with respect to a certain shift so employee 1 has satisfaction 4 with respect shift 1.
My model has to choose the right shift for all month days in order to maximize the employee satisfaction by respecting certain constraints.
My greatest problem is relate satisfaction matrix with chosen shift.
I am not able to use in method NMaximize a function that takes the chosen shifts and employee and returns the satisfaction and so doing a summation over all month days.
I need to maximize something like this:
Summation(from j=1 to j=31) getSatisfaction[1,chosenShift for that day)
Do you know how can I write this in mathematica?
I am struggling to this problem for several days but I am not able to solve this problem.
I need the input to relate chosen shift with satisfaction matrix.
Thanks a lot!!
If you don't need to find the value of x then I suggest you merely extract the acceptable range of the list and then find the Max of that:
s = {4,2,3,5};
s[[1 ;; 3]] // Max
4
If you have particularly hairy constraints then you may need something like Pick:
list = {5, 7, 1, 9, 3, 6, 2, 8, 4};
Pick[list, Range#Length#list, x_ /; x <= 7 && x >= 3 && Mod[7, x] == 1]
{1, 6}
You can then use Max on the returned list.
For completeness, if you need the value of x or other details from the process, here is an approach:
list = {6, 5, 7, 3, 4, 2, 1, 8, 9};
pos = Cases[Range#Length#list, x_ /; x <= 7 && x >= 3 && Mod[7, x] == 1]
values = Part[list, pos]
maxpos = Part[pos, Ordering[values, -1]]
{3, 6}
{7, 2}
{3}
Answering your updated question:
If you have:
shifts = {{4, 3, 5, 2}, {3, 4, 5, 1}, {4, 3, 5, 2}}
Then
(Tally /# Transpose#shifts)[[All, 1, 1]]
gives you:
{4, 3, 5, 2}
Which i a list with the preferred shift for each employee.

Efficient alternative to Outer on sparse arrays in Mathematica?

Suppose I have two very large lists {a1, a2, …} and {b1, b2, …} where all ai and bj are large sparse arrays. For the sake of memory efficiency I store each list as one comprehensive sparse array.
Now I would like to compute some function f on all possible pairs of ai and bj where each result f[ai, bj] is a sparse array again. All these sparse arrays have the same dimensions, by the way.
While
Flatten[Outer[f, {a1, a2, ...}, {b1, b2, ...}, 1], 1]
returns the desired result (in principle) it appears to consume excessive amounts of memory. Not the least because the return value is a list of sparse arrays whereas one comprehensive sparse array turns out much more efficient in my cases of interest.
Is there an efficient alternative to the above use of Outer?
More specific example:
{SparseArray[{{1, 1, 1, 1} -> 1, {2, 2, 2, 2} -> 1}],
SparseArray[{{1, 1, 1, 2} -> 1, {2, 2, 2, 1} -> 1}],
SparseArray[{{1, 1, 2, 1} -> 1, {2, 2, 1, 2} -> 1}],
SparseArray[{{1, 1, 2, 2} -> -1, {2, 2, 1, 1} -> 1}],
SparseArray[{{1, 2, 1, 1} -> 1, {2, 1, 2, 2} -> 1}],
SparseArray[{{1, 2, 1, 2} -> 1, {2, 1, 2, 1} -> 1}],
SparseArray[{{1, 2, 2, 1} -> -1, {2, 1, 1, 2} -> 1}],
SparseArray[{{1, 2, 2, 2} -> 1, {2, 1, 1, 1} -> 1}]};
ByteCount[%]
list = SparseArray[%%]
ByteCount[%]
Flatten[Outer[Dot, list, list, 1], 1];
ByteCount[%]
list1x2 = SparseArray[%%]
ByteCount[%]
Flatten[Outer[Dot, list1x2, list, 1], 1];
ByteCount[%]
list1x3 = SparseArray[%%]
ByteCount[%]
etc. Not only are the raw intermediate results of Outer (lists of sparse arrays) extremely inefficient, Outer seems to consume way too much memory during the computation itself, too.
I will propose a solution which is rather complex but allows one to only use about twice as much memory during the computation as is needed to store the final result as a SparseArray. The price to pay for this will be a much slower execution.
The code
Sparse array construction / deconstruction API
Here is the code. First, a slightly modified (to address higher-dimensional sparse arrays) sparse array construction - deconstruction API, taken from this answer:
ClearAll[spart, getIC, getJR, getSparseData, getDefaultElement,
makeSparseArray];
HoldPattern[spart[SparseArray[s___], p_]] := {s}[[p]];
getIC[s_SparseArray] := spart[s, 4][[2, 1]];
getJR[s_SparseArray] := spart[s, 4][[2, 2]];
getSparseData[s_SparseArray] := spart[s, 4][[3]];
getDefaultElement[s_SparseArray] := spart[s, 3];
makeSparseArray[dims_List, jc_List, ir_List, data_List, defElem_: 0] :=
SparseArray ## {Automatic, dims, defElem, {1, {jc, ir}, data}};
Iterators
The following functions produce iterators. Iterators are a good way to encapsulate the iteration process.
ClearAll[makeTwoListIterator];
makeTwoListIterator[fname_Symbol, a_List, b_List] :=
With[{indices = Flatten[Outer[List, a, b, 1], 1]},
With[{len = Length[indices]},
Module[{i = 0},
ClearAll[fname];
fname[] := With[{ind = ++i}, indices[[ind]] /; ind <= len];
fname[] := Null;
fname[n_] :=
With[{ind = i + 1}, i += n;
indices[[ind ;; Min[len, ind + n - 1]]] /; ind <= len];
fname[n_] := Null;
]]];
Note that I could have implemented the above function more memory - efficiently and not use Outer in it, but for our purposes this won't be the major concern.
Here is a more specialized version, which produces interators for pairs of 2-dimensional indices.
ClearAll[make2DIndexInterator];
make2DIndexInterator[fname_Symbol, i : {iStart_, iEnd_}, j : {jStart_, jEnd_}] :=
makeTwoListIterator[fname, Range ## i, Range ## j];
make2DIndexInterator[fname_Symbol, ilen_Integer, jlen_Integer] :=
make2DIndexInterator[fname, {1, ilen}, {1, jlen}];
Here is how this works:
In[14]:=
makeTwoListIterator[next,{a,b,c},{d,e}];
next[]
next[]
next[]
Out[15]= {a,d}
Out[16]= {a,e}
Out[17]= {b,d}
We can also use this to get batch results:
In[18]:=
makeTwoListIterator[next,{a,b,c},{d,e}];
next[2]
next[2]
Out[19]= {{a,d},{a,e}}
Out[20]= {{b,d},{b,e}}
, and we will be using this second form.
SparseArray - building function
This function will build a SparseArray object iteratively, by getting chunks of data (also in SparseArray form) and gluing them together. It is basically code used in this answer, packaged into a function. It accepts the code piece used to produce the next chunk of data, wrapped in Hold (I could alternatively make it HoldAll)
Clear[accumulateSparseArray];
accumulateSparseArray[Hold[getDataChunkCode_]] :=
Module[{start, ic, jr, sparseData, dims, dataChunk},
start = getDataChunkCode;
ic = getIC[start];
jr = getJR[start];
sparseData = getSparseData[start];
dims = Dimensions[start];
While[True, dataChunk = getDataChunkCode;
If[dataChunk === {}, Break[]];
ic = Join[ic, Rest#getIC[dataChunk] + Last#ic];
jr = Join[jr, getJR[dataChunk]];
sparseData = Join[sparseData, getSparseData[dataChunk]];
dims[[1]] += First[Dimensions[dataChunk]];
];
makeSparseArray[dims, ic, jr, sparseData]];
Putting it all together
This function is the main one, putting it all together:
ClearAll[sparseArrayOuter];
sparseArrayOuter[f_, a_SparseArray, b_SparseArray, chunkSize_: 100] :=
Module[{next, wrapperF, getDataChunkCode},
make2DIndexInterator[next, Length#a, Length#b];
wrapperF[x_List, y_List] := SparseArray[f ### Transpose[{x, y}]];
getDataChunkCode :=
With[{inds = next[chunkSize]},
If[inds === Null, Return[{}]];
wrapperF[a[[#]] & /# inds[[All, 1]], b[[#]] & /# inds[[All, -1]]]
];
accumulateSparseArray[Hold[getDataChunkCode]]
];
Here, we first produce the iterator which will give us on demand portions of index pair list, used to extract the elements (also SparseArrays). Note that we will generally extract more than one pair of elements from two large input SparseArray-s at a time, to speed up the code. How many pairs we process at once is governed by the optional chunkSize parameter, which defaults to 100. We then construct the code to process these elements and put the result back into SparseArray, where we use an auxiliary function wrapperF. The use of iterators wasn't absolutely necessary (could use Reap-Sow instead, as with other answers), but allowed me to decouple the logic of iteration from the logic of generic accumulation of sparse arrays.
Benchmarks
First we prepare large sparse arrays and test our functionality:
In[49]:=
arr = {SparseArray[{{1,1,1,1}->1,{2,2,2,2}->1}],SparseArray[{{1,1,1,2}->1,{2,2,2,1}->1}],
SparseArray[{{1,1,2,1}->1,{2,2,1,2}->1}],SparseArray[{{1,1,2,2}->-1,{2,2,1,1}->1}],
SparseArray[{{1,2,1,1}->1,{2,1,2,2}->1}],SparseArray[{{1,2,1,2}->1,{2,1,2,1}->1}]};
In[50]:= list=SparseArray[arr]
Out[50]= SparseArray[<12>,{6,2,2,2,2}]
In[51]:= larger = sparseArrayOuter[Dot,list,list]
Out[51]= SparseArray[<72>,{36,2,2,2,2,2,2}]
In[52]:= (large= sparseArrayOuter[Dot,larger,larger])//Timing
Out[52]= {0.047,SparseArray[<2592>,{1296,2,2,2,2,2,2,2,2,2,2}]}
In[53]:= SparseArray[Flatten[Outer[Dot,larger,larger,1],1]]==large
Out[53]= True
In[54]:= MaxMemoryUsed[]
Out[54]= 21347336
Now we do the power tests
In[55]:= (huge= sparseArrayOuter[Dot,large,large,2000])//Timing
Out[55]= {114.344,SparseArray[<3359232>,{1679616,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2}]}
In[56]:= MaxMemoryUsed[]
Out[56]= 536941120
In[57]:= ByteCount[huge]
Out[57]= 262021120
In[58]:= (huge1 = Flatten[Outer[Dot,large,large,1],1]);//Timing
Out[58]= {8.687,Null}
In[59]:= MaxMemoryUsed[]
Out[59]= 2527281392
For this particular example, the suggested method is 5 times more memory-efficient than the direct use of Outer, but about 15 times slower. I had to tweak the chunksize parameter (default is 100, but for the above I used 2000, to get the optimal speed / memory use combination). My method only used as a peak value twice as much memory as needed to store the final result. The degree of memory-savings as compared to Outer- based method will depend on the sparse arrays in question.
If lst1 and lst2 are your lists,
Reap[
Do[Sow[f[#1[[i]], #2[[j]]]],
{i, 1, Length##1},
{j, 1, Length##2}
] &[lst1, lst2];
] // Last // Last
does the job and may be more memory-efficient. On the other hand, maybe not. Nasser is right, an explicit example would be useful.
EDIT: Using Nasser's randomly-generated arrays, and for len=200, MaxMemoryUsed[] indicates that this form needs 170MB while the Outer form in the question takes 435MB.
Using your example list data, I believe that you will find the ability to Append to a SparseArray quite helpful.
acc = SparseArray[{}, {1, 2, 2, 2, 2, 2, 2}]
Do[AppendTo[acc, i.j], {i, list}, {j, list}]
Rest[acc]
I need Rest to drop the first zero-filled tensor in the result. The second argument of the seed SparseArray must be the dimensions of each of your elements with a prefixed 1. You may need to explicitly specify a background for the seed SparseArray to optimize performance.

How to reshape matrices in Mathematica

When manipulating matrices it is often convenient to change their shape. For instance, to turn an N x M sized matrix into a vector of length N X M. In MATLAB a reshape function exists:
RESHAPE(X,M,N) returns the M-by-N matrix whose elements are taken columnwise from X. An error results if X does not have M*N elements.
In the case of converting between a matrix and vector I can use the Mathematica function Flatten which takes advantage of Mathematica's nested list representation for matrices. As a quick example, suppose I have a matrix X:
With Flatten[X] I can get the vector {1,2,3,...,16}. But what would be far more useful is something akin to applying Matlab's reshape(X,2,8) which would result in the following Matrix:
This would allow creation of arbitrary matrices as long as the dimensions equal N*M. As far as I can tell, there isn't anything built in which makes me wonder if someone hasn't coded up a Reshape function of their own.
Reshape[mtx_, _, n_] := Partition[Flatten[mtx], n]
ArrayReshape does exactly that.
Reshape[list_, dimensions_] :=
First[Fold[Partition[#1, #2] &, Flatten[list], Reverse[dimensions]]]
Example Usage:
In: Reshape[{1,2,3,4,5,6},{2,3}]
Out: {{1,2,3},{4,5,6}}
This works with arrays of arbitrary depth.
I know this is an old thread but for the sake of the archives and google searches I've got a more general way that allows a length m*n*... list to be turned into an m*n*... array:
Reshape[list_, shape__] := Module[{i = 1},
NestWhile[Partition[#, shape[[i]]] &, list, ++i <= Length[shape] &]
]
Eg:
In:= Reshape[Range[8], {2, 2, 2}]
Out:= {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}}
There is now also a new function ArrayReshape[].
Example:
{{1, 2, 3}, {4, 5, 6}} // MatrixForm
ArrayReshape[{{1, 2, 3}, {4, 5, 6}}, {3, 2}] // MatrixForm

Resources