In Mathematica, how do I bin an array to create a new array whose entries are the sums over fixed-size blocks of the old array?
Example:
thanks.
This is slightly simpler than @ChrisDegnen's solution. Given the same definition of array, the expression
Map[Total, Map[Flatten, Partition[array, {2, 2}], {2}], {2}]
produces
{{4, 10}, {8, 10}}
If you prefer, this expression
Apply[Plus, Map[Flatten, Partition[array, {2, 2}], {2}], {2}]
uses Apply and Plus rather than Map and Total but is entirely equivalent.
This works for the example but a generalised version would need more work.
array =
{{1, 1, 1, 2},
{1, 1, 3, 4},
{2, 2, 2, 3},
{2, 2, 2, 3}};
Map[Total,
Map[Flatten,
Map[Transpose,
Map[Partition[#, 2] &, Partition[array, 2], 2],
2], {2}], {2}]
% // MatrixForm
4 10
8 10
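A generalised version might look like the following sketch. The name blockSums is made up for illustration, and it assumes a rectangular numeric matrix. Note that Partition silently drops any trailing rows or columns that do not fill a complete block.

```mathematica
(* blockSums is a hypothetical helper name, not a built-in. *)
(* Split m into r x c blocks and total every element of each block. *)
blockSums[m_?MatrixQ, {r_Integer, c_Integer}] :=
  Map[Total[#, 2] &, Partition[m, {r, c}], {2}]

array = {{1, 1, 1, 2}, {1, 1, 3, 4}, {2, 2, 2, 3}, {2, 2, 2, 3}};
blockSums[array, {2, 2}]   (* {{4, 10}, {8, 10}} *)
blockSums[array, {4, 2}]   (* {{12, 20}} *)
```

If the incomplete edge blocks matter, the matrix would need to be padded first (for example with ArrayPad) before calling the helper.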
I have a 20000 x 185 x 5 tensor, which looks like
{{{a1_1,a2_1,a3_1,a4_1,a5_1},{b1_1,b2_1,b3_1,b4_1,b5_1}...
(continue for 185 times)}
{{a1_2,a2_2,a3_2,a4_2,a5_2},{b1_2,b2_2,b3_2,b4_2,b5_2}...
...
...
...
{{a1_20000,a2_20000,a3_20000,a4_20000,a5_20000},
{b1_20000,b2_20000,b3_20000,b4_20000,b5_20000}... }}
The 20000 represents iteration number, the 185 represents individuals, and each individual has 5 attributes. I need to construct a 185 x 5 matrix that stores the mean value for each individual's 5 attributes, averaged across the 20000 iterations.
Not sure what the best way to do this is. I know Mean[ ] works on matrices, but with a Tensor, the derived values might not be what I need. Also, Mathematica ran out of memory if I tried to do Mean[tensor]. Please provide some help or advice. Thank you.
When in doubt, drop the size of the dimensions. (You can still keep them distinct to easily see where things end up.)
(* In[1]:= *) data = Array[a, {4, 3, 2}]
(* Out[1]= *) {{{a[1, 1, 1], a[1, 1, 2]}, {a[1, 2, 1],
a[1, 2, 2]}, {a[1, 3, 1], a[1, 3, 2]}}, {{a[2, 1, 1],
a[2, 1, 2]}, {a[2, 2, 1], a[2, 2, 2]}, {a[2, 3, 1],
a[2, 3, 2]}}, {{a[3, 1, 1], a[3, 1, 2]}, {a[3, 2, 1],
a[3, 2, 2]}, {a[3, 3, 1], a[3, 3, 2]}}, {{a[4, 1, 1],
a[4, 1, 2]}, {a[4, 2, 1], a[4, 2, 2]}, {a[4, 3, 1], a[4, 3, 2]}}}
(* In[2]:= *) Dimensions[data]
(* Out[2]= *) {4, 3, 2}
(* In[3]:= *) means = Mean[data]
(* Out[3]= *) {
{1/4 (a[1, 1, 1] + a[2, 1, 1] + a[3, 1, 1] + a[4, 1, 1]),
1/4 (a[1, 1, 2] + a[2, 1, 2] + a[3, 1, 2] + a[4, 1, 2])},
{1/4 (a[1, 2, 1] + a[2, 2, 1] + a[3, 2, 1] + a[4, 2, 1]),
1/4 (a[1, 2, 2] + a[2, 2, 2] + a[3, 2, 2] + a[4, 2, 2])},
{1/4 (a[1, 3, 1] + a[2, 3, 1] + a[3, 3, 1] + a[4, 3, 1]),
1/4 (a[1, 3, 2] + a[2, 3, 2] + a[3, 3, 2] + a[4, 3, 2])}
}
(* In[4]:= *) Dimensions[means]
(* Out[4]= *) {3, 2}
Mathematica ran out of memory if I tried to do Mean[tensor]
This is probably because intermediate results are larger than the final result. This is likely if the elements are not type Real or Integer. Example:
a = Tuples[{x, Sqrt[y], z^x, q/2, Mod[r, 1], Sin[s]}, {2, 4}];
{MemoryInUse[], MaxMemoryUsed[]}
b = Mean[a];
{MemoryInUse[], MaxMemoryUsed[]}
{109125576, 124244808}
{269465456, 376960648}
If they are, and are in packed array form, perhaps the elements are such that the array is unpacked during processing.
Here is an example where the tensor is a packed array of small numbers, and unpacking does not occur.
a = RandomReal[99, {20000, 185, 5}];
Developer`PackedArrayQ[a]
{MemoryInUse[], MaxMemoryUsed[]}
b = Mean[a];
{MemoryInUse[], MaxMemoryUsed[]}
True
{163012808, 163016952}
{163018944, 163026688}
Here is the same size of tensor with very large numbers.
a = RandomReal[$MaxMachineNumber, {20000, 185, 5}];
Developer`PackedArrayQ[a]
{MemoryInUse[], MaxMemoryUsed[]}
b = Mean[a];
{MemoryInUse[], MaxMemoryUsed[]}
True
{163010680, 458982088}
{163122608, 786958080}
To elaborate a little on the other answers, there is no reason to expect Mathematica functions to operate materially differently on tensors than on matrices, because Mathematica considers both to be nested Lists that simply differ in nesting depth. How functions behave with lists depends on whether they're Listable, which you can check using Attributes[f], where f is the function you are interested in.
Your data list's dimensionality isn't actually that big in the scheme of things. Without seeing your actual data it is hard to be sure, but I suspect the reason you are running out of memory is that some of your data is non-numerical.
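One way to test that suspicion before calling Mean is sketched below. The variable tensor is a small stand-in for the real data, which is not shown in the question, so treat this as an illustration rather than a diagnosis.

```mathematica
(* tensor is a small stand-in for the real 20000 x 185 x 5 data. *)
tensor = RandomReal[1, {200, 185, 5}];

(* True only if tensor is a full rank-3 array whose entries are all numbers. *)
ArrayQ[tensor, 3, NumberQ]

(* True if the data is stored as a packed array. *)
Developer`PackedArrayQ[tensor]

(* Positions of any non-numeric entries; {} means none were found. *)
Position[tensor, _?(! NumberQ[#] &), {3}, Heads -> False]

(* Convert to machine reals and pack before averaging, if needed. *)
means = Mean[Developer`ToPackedArray[N[tensor]]];
Dimensions[means]   (* {185, 5} *)
```

If Position returns a non-empty list, those entries (symbols, strings, Missing values and so on) are the likely cause of the memory blow-up.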
I don't know what you're doing incorrectly (your code will help). But Mean[] already works as you want it to.
a = RandomReal[1, {20000, 185, 5}];
b = Mean@a;
Dimensions@b
Out[1]= {185, 5}
You can even check that this is correct:
{Max@b, Min@b}
Out[2]={0.506445, 0.494061}
which is the expected value of the mean given that RandomReal uses a uniform distribution by default.
Assume you have the following data :
a = Table[RandomInteger[100], {i, 20000}, {j, 185}, {k, 5}];
In a straightforward manner, you can build a table which stores the means of a[[1,j,k]], a[[2,j,k]], ..., a[[20000,j,k]]:
c = Table[Sum[a[[i, j, k]], {i, Length[a]}], {j, 185}, {k, 5}]/
Length[a] // N; // Timing
{37.487, Null}
or simply :
d = Total[a]/Length[a] // N; // Timing
{0.702, Null}
The second way is about 50 times faster.
c == d
True
To extend on Brett's answer a bit, when you call Mean on an n-dimensional tensor, it averages over the first index and returns an (n-1)-dimensional tensor:
a = RandomReal[1, {a1, a2, a3, ... an}];
Dimensions[a] (* This would have n entries in it *)
b = Mean[a];
Dimensions[b] (* Has n-1 entries, where averaging was done over the first index *)
In the more general case where you wish to average over the i-th index, you have to transpose the data first. For example, say you want to average over the 3rd of 5 dimensions. You need the 3rd index first, followed by the 1st, 2nd, 4th and 5th.
a = RandomReal[1, {5, 10, 2, 40, 10}];
b = Transpose[a, {2, 3, 1, 4, 5}]; (* level 3 becomes level 1; levels 1 and 2 move to 2 and 3 *)
c = Mean[b]; (* Now of dimensions {5, 10, 40, 10} *)
In other words, you would make a call to Transpose where you placed the i-th index as the first tensor index and moved everything before it ahead one. Anything that comes after the i-th index stays the same.
This tends to come in handy when your data comes in odd formats where the first index may not always represent different realizations of a data sample. I've had this come up, for example, when I had to do time averaging of large wind data sets where the time series came third (!) in terms of the tensor representation that was available.
You could imagine that a generalizedTensorMean would then look something like this:
Clear[generalizedTensorMean];
generalizedTensorMean[A_, i_] :=
 Module[{n = Length@Dimensions@A, ordering},
  ordering =
   Join[Table[x, {x, 2, i}], {1}, Table[x, {x, i + 1, n}]];
  Mean@Transpose[A, ordering]]
This reduces to the plain-old-mean when i == 1. Try it out:
A = RandomReal[1, {2, 4, 6, 8, 10, 12, 14}];
Dimensions@A (* {2, 4, 6, 8, 10, 12, 14} *)
Dimensions@generalizedTensorMean[A, 1] (* {4, 6, 8, 10, 12, 14} *)
Dimensions@generalizedTensorMean[A, 7] (* {2, 4, 6, 8, 10, 12} *)
On a side note, I'm surprised that Mathematica doesn't support this by default. You don't always want to average over the first level of a list.
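One possible shortcut, offered as a sketch rather than a built-in feature: Total accepts a level specification, so dividing a level-i total by the size of that dimension averages over the i-th index without an explicit Transpose. The name levelMean is hypothetical.

```mathematica
(* levelMean is a hypothetical name. Total[A, {i}] sums over level i only. *)
levelMean[A_?ArrayQ, i_Integer] := Total[A, {i}]/Dimensions[A][[i]]

(* Integer test data keeps the arithmetic exact, so the results can be compared directly. *)
A = RandomInteger[9, {3, 4, 5}];
Dimensions[levelMean[A, 2]]                          (* {3, 5} *)
levelMean[A, 1] === Mean[A]                          (* True *)
levelMean[A, 2] === Mean[Transpose[A, {2, 1, 3}]]    (* True *)
```

The two comparisons check the shortcut against Mean, once on the first index and once on the second index after a Transpose.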
I think Mathematica is biased towards rows not columns.
Given a matrix, to insert a row seems to be easy, just use Insert[]
(a = {{1, 2, 3}, {4, 0, 8}, {7 , 8, 0}}) // MatrixForm
1 2 3
4 0 8
7 8 0
row = {97, 98, 99};
(newa = Insert[a, row, 2]) // MatrixForm
1 2 3
97 98 99
4 0 8
7 8 0
But inserting a column took some struggle. I found two ways, shown below, and would like to ask the experts here whether there is a shorter, more direct way. Mathematica has so many commands that I could have overlooked one that does this directly, and the methods I have now still seem too complex for such a basic operation.
First method
Have to do double transpose:
a = {{1, 2, 3}, {4, 0, 8}, {7 , 8, 0}}
column = {97, 98, 99}
newa = Transpose[Insert[Transpose[a], column, 2]]
1 97 2 3
4 98 0 8
7 99 8 0
Second method
Use SparseArray, but need to watch out for index locations. Kinda awkward for doing this:
(SparseArray[{{i_, j_} :> column[[i]] /; j == 2, {i_, j_} :> a[[i, j]] /; j == 1,
{i_, j_} :> a[[i, j - 1]] /; j > 1}, {3, 4}]) // Normal
1 97 2 3
4 98 0 8
7 99 8 0
The question is: is there a more functional way that is a little shorter than the above? I could of course use one of the above and wrap the whole thing in a function, say insertColumn[...], to make it easy to use. But I wanted to see if there is an easier way to do this than what I have.
For reference, this is how I do this in Matlab:
EDU>> A=[1 2 3;4 0 8;7 8 0]
A =
1 2 3
4 0 8
7 8 0
EDU>> column=[97 98 99]';
EDU>> B=[A(:,1) column A(:,2:end)]
B =
1 97 2 3
4 98 0 8
7 99 8 0
Your double Transpose method seems fine. For very large matrices, this will be 2-3 times faster:
MapThread[Insert, {a, column, Table[2, {Length[column]}]}]
If you want to mimic your Matlab way, the closest is probably this:
ArrayFlatten[{{a[[All, ;; 1]], Transpose[{column}], a[[All, 2 ;;]]}}]
Keep in mind that insertions require making an entire copy of the matrix. So, if you plan to build a matrix this way, it is more efficient to preallocate the matrix (if you know its size) and do in-place modifications through Part instead.
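As a sketch of the wrapper function mentioned in the question, building on the MapThread approach above: the name insertColumn and its argument order are assumptions made for illustration.

```mathematica
(* insertColumn is a hypothetical wrapper name. *)
(* Insert the k-th element of col at position pos of the k-th row of m. *)
insertColumn[m_?MatrixQ, col_List, pos_Integer] /; Length[m] == Length[col] :=
  MapThread[Insert[#1, #2, pos] &, {m, col}]

a = {{1, 2, 3}, {4, 0, 8}, {7, 8, 0}};
column = {97, 98, 99};
insertColumn[a, column, 2]
(* {{1, 97, 2, 3}, {4, 98, 0, 8}, {7, 99, 8, 0}} *)
```

Like the other insertion methods, this builds a new matrix rather than modifying a in place.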
You can use Join with a level specification of 2, along with Partition into subsets of size 1 (note that this appends the column on the right rather than inserting it at position 2):
a = {{1, 2, 3}, {4, 0, 8}, {7 , 8, 0}}
column = {97, 98, 99}
newa = Join[a,Partition[column,1],2]
I think I'd do it the same way, but here are some other ways of doing it:
-With MapIndexed
newa = MapIndexed[Insert[#1, column[[#2[[1]]]], 2] &, a]
-With Sequence:
newa = a;
newa[[All, 1]] = Transpose[{newa[[All, 1]], column}];
newa = Replace[newa, List -> Sequence, {3}, Heads -> True]
Interestingly, this would seem to be a method that works 'in place', i.e. it wouldn't really require a matrix copy as stated in Leonid's answer, and if you print the resulting matrix it apparently works like a charm.
However, there's a big catch. See the problems with Sequence in the mathgroup discussion "part assigned sequence behavior puzzling".
I usually just do it like this:
In: m0 = ConstantArray[0, {3, 4}];
m0[[All, {1, 3, 4}]] = {{1, 2, 3}, {4, 0, 8}, {7, 8, 0}};
m0[[All, 2]] = {97, 98, 99}; m0
Out:
{{1, 97, 2, 3}, {4, 98, 0, 8}, {7, 99, 8, 0}}
I don't know how it compares in terms of efficiency.
I originally posted this as a comment (now deleted)
Based on a method given by user656058 in this question (Mathematica 'Append To' Function Problem) and the reply of Mr Wizard, the following alternative method of adding a column to a matrix, using Table and Insert, may be gleaned:
(a = {{1, 2, 3}, {4, 0, 8}, {7, 8, 0}});
column = {97, 98, 99};
Table[Insert[a[[i]], column[[i]], 2], {i, 3}] // MatrixForm
giving
1 97 2 3
4 98 0 8
7 99 8 0
Similarly, to add a column of zeros (say):
Table[Insert[#[[i]], 0, 2], {i, Dimensions[#][[1]]}] & @ a
As noted in the comments above, Janus has drawn attention to the 'trick' of adding a column of zeros by the ArrayFlatten method (see here)
ArrayFlatten[{{Take[#, All, 1], 0, Take[#, All, -2]}}] & @
  a // MatrixForm
Edit
Perhaps simpler, at least for smaller matrices
(Insert[a[[#]], column[[#]], 2] & /@ Range[3]) // MatrixForm
or, to insert a column of zeros
Insert[a[[#]], 0, 2] & /@ Range[3]
Or, a little more generally:
Flatten@Insert[a[[#]], {0, 0}, 2] & /@ Range[3] // MatrixForm
May also easily be adapted to work with Append and Prepend, of course.
This is another simple 'matrix' question in Mathematica. I want to show how I did this, and ask if there is a better answer.
I want to select all 'rows' from matrix based on value in the first column (or any column, I used first column here just as an example).
Say, find all rows where the entry in the first position is <=4 in this example:
list = {{1, 2, 3},
{4, 5, 8},
{7 , 8, 9}}
So, the result should be
{{1,2,3},
{4,5,8}}
Well, the problem is I need to use Position, since the result returned by Position can be used directly by Extract. (but can't be used by Part or [[ ]], so that is why I am just looking at Position[] ).
But I do not know how to tell Position to please restrict the 'search' pattern to only the 'first' column so I can do this in one line.
When I type
pos = Position[list, _?(# <= 4 &)]
it returns position of ALL entries which are <=4.
{{1, 1}, {1, 2}, {1, 3}, {2, 1}}
If I first get the first column and then apply Position to it, it works, of course:
list = {{1, 2, 3},
{4, 5, 8},
{7 , 8, 9}};
pos = Position[list[[All, 1]], _?(# <= 4 &)]
Extract[list, pos]
--> {{1, 2, 3}, {4, 5, 8}}
Also I tried this:
pos = Position[list, _?(# <= 4 &)];
pos = Select[pos, #[[2]] == 1 &] (* only look at the ones in the 'first' column *)
---> {{1, 1}, {2, 1}}
and this gives me the correct positions in the first column. To use that to find all rows, I did
pos = pos[[All, 1]] (* to get list of row positions*)
---> {1, 2}
list[[ pos[[1]] ;; pos[[-1]], All]]
{{1, 2, 3},
{4, 5, 8}}
So, to summarize, putting it all together, this is what I did:
method 1
list = {{1, 2, 3},
{4, 5, 8},
{7 , 8, 9}};
pos = Position[list[[All, 1]], _?(# <= 4 &)]
Extract[list, pos]
--> {{1, 2, 3}, {4, 5, 8}}
method 2
list = {{1, 2, 3},
{4, 5, 8},
{7 , 8, 9}}
pos = Position[list, _?(# <= 4 &)];
pos = Select[pos, #[[2]] == 1 &];
pos = pos[[All, 1]];
list[[ pos[[1]] ;; pos[[-1]], All]]
{{1, 2, 3},
{4, 5, 8}}
The above clearly is not too good.
Is method 1 above the 'correct' functional way to do this?
For reference, this is how I do the above in Matlab:
EDU>> A=[1 2 3;4 5 8;7 8 9]
A =
1 2 3
4 5 8
7 8 9
EDU>> A( A(:,1)<=4 , :)
1 2 3
4 5 8
I am trying to improve my 'functional' handling of matrices in Mathematica. Working with lists is an area I feel I am not good at; I find working with matrices easier.
The question is: is there a shorter or more functional way to do this in Mathematica?
thanks
You could use Pick[] as follows:
Pick[list, list[[All, 1]], _?(# <= 4 &)]
How about the following?
In[1]:= list = {{1, 2, 3}, {4, 5, 8}, {7, 8, 9}};
In[2]:= Select[list, First[#] <= 4 &]
Out[2]= {{1, 2, 3}, {4, 5, 8}}
Here's a loose translation of your matlab code:
list[[Flatten[Position[Thread[list[[All, 1]] <= 4], True]]]]
(of course, the Flatten would not be needed if I used Extract instead of Part).
There is a faster method than those already presented, using SparseArray. It is:
list ~Extract~
SparseArray[UnitStep[4 - list[[All, 1]]]]["NonzeroPositions"]
Here are speed comparisons with the other methods. I had to modify WReach's method to handle other position specifications.
f1[list_, x_] := Cases[list, {Sequence @@ Table[_, {x - 1}], n_, ___} /; n <= 4]
f2[list_, x_] := Select[list, #[[x]] <= 4 &]
f3[list_, x_] := Pick[list, (#[[x]] <= 4 &) /@ list]
f4[list_, x_] := Pick[list, UnitStep[4 - list[[All, x]]], 1]
f5[list_, x_] := Pick[list, Thread[list[[All, x]] <= 4]]
f6[list_, x_] := list ~Extract~
SparseArray[UnitStep[4 - list[[All, x]]]]["NonzeroPositions"]
For a table with few rows and many columns (comparing position 7):
a = RandomInteger[99, {250, 150000}];
timeAvg[#[a, 7]] & /@ {f1, f2, f3, f4, f5, f6} // Column
0.02248
0.0262
0.312
0.312
0.2808
0.0009728
For a table with few columns and many rows (comparing position 7):
a = RandomInteger[99, {150000, 12}];
timeAvg[#[a, 7]] & /@ {f1, f2, f3, f4, f5, f6} // Column
0.0968
0.1434
0.184
0.0474
0.103
0.002872
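The helper timeAvg is not defined in this answer. A minimal stand-in, assuming a Mathematica version that provides RepeatedTiming (version 10 or later), could be the following. It is not necessarily the definition that produced the timings above.

```mathematica
(* timeAvg here is an assumed stand-in, not the original definition. *)
ClearAll[timeAvg];
SetAttributes[timeAvg, HoldFirst];   (* keep the expression unevaluated until it is timed *)
timeAvg[expr_] := First[RepeatedTiming[expr]]

(* Average time in seconds; the value depends on the machine. *)
timeAvg[Total[Range[10^6]]]
```

With this definition, timeAvg[#[a, 7]] & /@ {f1, f2, f3, f4, f5, f6} times each candidate function in turn.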
If you want the rows that meet the criteria, use Cases:
Cases[list, {n_, __} /; n <= 4]
(* {{1, 2, 3}, {4, 5, 8}} *)
If you want the positions within the list rather than the rows themselves, use Position instead of Cases (restricted to the first level only):
Position[list, {n_, __} /; n <= 4, {1}]
(* {{1}, {2}} *)
If you want to be very clever:
Pick[list, UnitStep[4 - list[[All, 1]]], 1]
This also avoids unpacking, which means it'll be faster and use less memory.
I reduced a debugging problem in Mathematica 8 to something similar to the following code:
f = Function[x,
list = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5};
Count[list, x]
];
f[4]
Maximize[f[x], x, Integers]
Output:
4
{0, {x->0}}
So, while the maximum of function f is obtained when x equals 4 (as confirmed in the first output line), why does Maximize return x->0 (output line 2)?
The reason for this behavior can be easily found using Trace. What happens is that your function is evaluated inside Maximize with still symbolic x, and since your list does not contain symbol x, results in zero. Effectively, you call Maximize[0,x,Integers], hence the result. One thing you can do is to protect the function from immediate evaluation by using pattern-defined function with a restrictive pattern, like this for example:
Clear[ff];
ff[x_?IntegerQ] :=
With[{list = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5}}, Count[list, x]]
It appears that Maximize cannot easily deal with it, however, but NMaximize can:
In[73]:= NMaximize[{ff[x], Element[x, Integers]}, x]
Out[73]= {4., {x -> 4}}
But, generally, either of the Maximize family functions seem not quite appropriate for the job. You may be better off by explicitly computing the maximum, for example like this:
In[78]:= list = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5};
Extract[#, Position[#, Max[#], 1, 1] &[#[[All, 2]]]] &[Tally[list]]
Out[79]= {{4, 4}}
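The symbolic-evaluation point can be checked directly with a short sketch. It assumes x has no value assigned; Commonest is shown as one more built-in way to get the most frequent element.

```mathematica
ClearAll[x];   (* make sure x is a plain symbol with no value *)
list = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5};

Count[list, x]     (* 0: the symbol x matches no element of list *)
Count[list, 4]     (* 4 *)
Commonest[list]    (* {4} *)
```

The first result is the constant 0 that Maximize ends up optimising, which is why any x is reported as a maximiser.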
HTH
Try this:
list = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5};
First@Sort[Tally[list], #1[[2]] > #2[[2]] &]
Output:
{4, 4}