I have two tensors. The first tensor is 1D (e.g. a tensor of 3 values). The second tensor is 2D, with its first column holding IDs into the first tensor in a one-to-many relationship (e.g. a tensor of shape (6, 2)).
# e.g. simple example of dot product
import torch
a = torch.tensor([2, 4, 3])
b = torch.tensor([[0, 2], [0, 3], [0, 1], [1, 4], [2, 3], [2, 1]]) # 1st column is the index to tensor a, 2nd column is the value
output = [(2*2)+(2*3)+(2*1),(4*4),(3*3)+(3*1)]
output = [12, 16, 12]
What I currently do is find the size of each ID in b (e.g. [3, 1, 2]), use torch.split to group the rows into a list of tensors, and run a for loop over the groups. That is fine for small tensors, but when the tensors have millions of elements, with tens of thousands of arbitrary-sized groups, it becomes very slow.
Any better solutions?
You can use numpy.bincount or torch.bincount to sum the elements of b by key:
import numpy as np
a = np.array([2,4,3])
b = np.array([[0,2], [0,3], [0,1], [1,4], [2,3], [2,1]])
print( np.bincount(b[:,0], b[:,1]) )
# [6. 4. 4.]
print( a * np.bincount(b[:,0], b[:,1]) )
# [12. 16. 12.]
import torch
a = torch.tensor([2,4,3])
b = torch.tensor([[0,2], [0,3], [0,1], [1,4], [2,3], [2,1]])
torch.bincount(b[:,0], b[:,1])
# tensor([6., 4., 4.], dtype=torch.float64)
a * torch.bincount(b[:,0], b[:,1])
# tensor([12., 16., 12.], dtype=torch.float64)
References:
numpy.bincount official documentation;
torch.bincount official documentation;
How can I reduce a numpy array based on a key rather than an axis?
Another alternative in PyTorch, if gradients are needed:
import torch
a = torch.tensor([2,4,3])
b = torch.tensor([[0,2], [0,3], [0,1], [1,4], [2,3], [2,1]])
output = torch.zeros(a.shape[0], dtype=torch.long).index_add_(0, b[:, 0], b[:, 1]) * a
Alternatively, torch.Tensor.scatter_add_ also works.
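For example, a minimal sketch of the scatter_add_ variant on the same toy tensors (equivalent to the index_add_ line above for this 1D case):

import torch
a = torch.tensor([2, 4, 3])
b = torch.tensor([[0, 2], [0, 3], [0, 1], [1, 4], [2, 3], [2, 1]])
# Sum the values in b[:, 1] into one bucket per ID, then scale by a.
sums = torch.zeros(a.shape[0], dtype=b.dtype).scatter_add_(0, b[:, 0], b[:, 1])
output = a * sums
# tensor([12, 16, 12])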
Let's say I have a constant matrix A and I want to compute pow(A, n). As described in this question I can calculate its eigenvalue decomposition (or more generally, its invariant subspaces and the generalized modal matrix) to speed up the process.
If A is a square matrix of size k, then raising the diagonal factor to the n-th power has complexity O(k log n) via exponentiation by squaring, with a preparation cost (to compute the modal matrix) of O(k^3).
The problem I am thinking about is loss of precision. Calculating eigenvalues et al takes us out of the domain of integers into floating point numbers. Even though in the end, we know that pow(A, n) has to have all integer entries, the algorithm outlined above only computes floating point numbers.
Another way is to rely on exponentiation by squaring alone, but that gives only an O(k^3 log n) algorithm.
Is there a way to accurately - without converting to floating point numbers - compute pow(A, n) fast?
Eigenvalue decomposition is also possible for a matrix over a finite field, but only if the field is just right. So it not only takes preprocessing to do the eigenvalue decomposition, but also to find (some) finite field(s) over which that is even possible.
Finding multiple such fields is useful to avoid having to work with gigantic finite fields: compute pow(A, n) in several small fields and use the CRT to work out what the solution would have been in ℤ. But this requires somehow having a sufficient number of fields of sufficient size to work with, and you wouldn't really know in advance what will be sufficient (there is always some n above which it stops working), so maybe this all won't work in practice.
As a small example, take:
A = [[1, 1],
[1, 0]]
The characteristic polynomial is x² - x - 1; let's guess that modulo 1009 will work (it does), so there are roots 383 and 627, and:
A = QDP mod 1009
Q = [[ 1, 1],
[382, 626]]
D = [[383, 0],
[ 0, 627]]
P = [[ 77, 153],
[933, 856]]
So for example
pow(A, 15) = Q [[928,   0],
                [  0, 436]] P = [[987, 610],
                                 [610, 377]]
Fibonacci numbers as expected, so it all worked out. But with just 1009 as the modulus, going above 15 for the exponent makes the result no longer match what it would be in ℤ, and then we would need more/bigger fields.
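To illustrate the CRT recombination mentioned earlier, here is a small Python sketch (the helper names and the second prime 1013 are mine; for brevity the per-field power is computed with plain modular exponentiation by squaring rather than the per-field eigendecomposition):

import numpy as np

def matpow_mod(A, n, p):
    # Exponentiation by squaring of an integer matrix, entries reduced mod p.
    result = np.eye(len(A), dtype=np.int64)
    base = np.array(A, dtype=np.int64) % p
    while n:
        if n & 1:
            result = result @ base % p
        base = base @ base % p
        n >>= 1
    return result

def crt_pair(r1, m1, r2, m2):
    # Combine x ≡ r1 (mod m1) and x ≡ r2 (mod m2) into x mod m1*m2.
    inv = pow(m1, -1, m2)
    return (r1 + (r2 - r1) * inv % m2 * m1) % (m1 * m2)

A = [[1, 1], [1, 0]]
n, p, q = 15, 1009, 1013
Ap, Aq = matpow_mod(A, n, p), matpow_mod(A, n, q)
combine = np.vectorize(lambda x, y: crt_pair(int(x), p, int(y), q))
print(combine(Ap, Aq))   # [[987 610] [610 377]], valid while the true entries stay below p*q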
Using the Cayley-Hamilton theorem we can be faster. The theorem implies that for a k×k matrix, every power of A can be written as a linear combination of the first k powers of A (counting A^0 = I).
Knowing that, we can use exponentiation by squaring, but instead of working on matrices we work on polynomials in A with coefficients in ℤ. After each step we reduce the polynomial by the characteristic polynomial.
As a small example:
A = [[1, 1],
[1, 0]]
A^2 = A + I, written as polynomial coefficients: {1, 1}
pow(A, 15) = {1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
= {1, 0} * ({1, 0} * ({1, 0} * {1, 0}^2)^2)^2
= {1, 0} * ({1, 0} * ({1, 0} * {1, 0, 0})^2)^2
= {1, 0} * ({1, 0} * ({1, 0} * {1, 1})^2)^2
= {1, 0} * ({1, 0} * ({1, 1, 0})^2)^2
= {1, 0} * ({1, 0} * {2, 1}^2)^2
= {1, 0} * ({1, 0} * {4, 4, 1})^2
= {1, 0} * ({1, 0} * {8, 5})^2
= {1, 0} * ({8, 5, 0})^2
= {1, 0} * {13, 8}^2
= {1, 0} * {169, 208, 64}
= {1, 0} * {377, 233}
= {377, 233, 0}
= {610, 377}
= [[987, 610],
[610, 377]]
So, what is the runtime cost? Trivially O(k² log n), because at each squaring step we need to square a polynomial of degree below k and reduce it by the characteristic polynomial. Using a trick similar to harold's in the other answer, this improves to O(k log k log n) by doing the polynomial multiplications with the discrete Fourier transform, since we can find primitive roots.
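As a small illustration, here is a Python sketch of that squaring-and-reducing loop (the helper names are mine; the characteristic polynomial is supplied by hand as the coefficients of x^k expressed in the lower powers, matching the {1, 1} notation above):

def polymul(p, q):
    # Multiply two coefficient lists (highest degree first).
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def polyreduce(p, char):
    # Reduce p using x^k = char[0]*x^(k-1) + ... + char[k-1].
    p, k = list(p), len(char)
    while len(p) > k:
        lead = p.pop(0)                 # eliminate the leading term
        for i in range(k):
            p[i] += lead * char[i]
    return p

def matpow_coeffs(char, n):
    # Coefficients of A^n as a combination of A^(k-1), ..., A, I.
    k = len(char)
    result = [0] * (k - 1) + [1]        # the constant polynomial 1
    base = polyreduce([1, 0], char)     # the polynomial x, i.e. A itself
    while n:
        if n & 1:
            result = polyreduce(polymul(result, base), char)
        base = polyreduce(polymul(base, base), char)
        n >>= 1
    return result

# Fibonacci example from above: x^2 = x + 1, so char = {1, 1}.
print(matpow_coeffs([1, 1], 15))        # [610, 377]  ->  A^15 = 610*A + 377*I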
I have a 20000 x 185 x 5 tensor, which looks like
{{{a1_1,a2_1,a3_1,a4_1,a5_1},{b1_1,b2_1,b3_1,b4_1,b5_1}...
(continue for 185 times)}
{{a1_2,a2_2,a3_2,a4_2,a5_2},{b1_2,b2_2,b3_2,b4_2,b5_2}...
...
...
...
{{a1_20000,a2_20000,a3_20000,a4_20000,a5_20000},
{b1_20000,b2_20000,b3_20000,b4_20000,b5_20000}... }}
The 20000 represents iteration number, the 185 represents individuals, and each individual has 5 attributes. I need to construct a 185 x 5 matrix that stores the mean value for each individual's 5 attributes, averaged across the 20000 iterations.
Not sure what the best way to do this is. I know Mean[ ] works on matrices, but with a Tensor, the derived values might not be what I need. Also, Mathematica ran out of memory if I tried to do Mean[tensor]. Please provide some help or advice. Thank you.
When in doubt, drop the size of the dimensions. (You can still keep them distinct to easily see where things end up.)
(* In[1]:= *) data = Array[a, {4, 3, 2}]
(* Out[1]= *) {{{a[1, 1, 1], a[1, 1, 2]}, {a[1, 2, 1],
a[1, 2, 2]}, {a[1, 3, 1], a[1, 3, 2]}}, {{a[2, 1, 1],
a[2, 1, 2]}, {a[2, 2, 1], a[2, 2, 2]}, {a[2, 3, 1],
a[2, 3, 2]}}, {{a[3, 1, 1], a[3, 1, 2]}, {a[3, 2, 1],
a[3, 2, 2]}, {a[3, 3, 1], a[3, 3, 2]}}, {{a[4, 1, 1],
a[4, 1, 2]}, {a[4, 2, 1], a[4, 2, 2]}, {a[4, 3, 1], a[4, 3, 2]}}}
(* In[2]:= *) Dimensions[data]
(* Out[2]= *) {4, 3, 2}
(* In[3]:= *) means = Mean[data]
(* Out[3]= *) {
{1/4 (a[1, 1, 1] + a[2, 1, 1] + a[3, 1, 1] + a[4, 1, 1]),
1/4 (a[1, 1, 2] + a[2, 1, 2] + a[3, 1, 2] + a[4, 1, 2])},
{1/4 (a[1, 2, 1] + a[2, 2, 1] + a[3, 2, 1] + a[4, 2, 1]),
1/4 (a[1, 2, 2] + a[2, 2, 2] + a[3, 2, 2] + a[4, 2, 2])},
{1/4 (a[1, 3, 1] + a[2, 3, 1] + a[3, 3, 1] + a[4, 3, 1]),
1/4 (a[1, 3, 2] + a[2, 3, 2] + a[3, 3, 2] + a[4, 3, 2])}
}
(* In[4]:= *) Dimensions[means]
(* Out[4]= *) {3, 2}
Mathematica ran out of memory if I tried to do Mean[tensor]
This is probably because intermediate results are larger than the final result. This is likely if the elements are not type Real or Integer. Example:
a = Tuples[{x, Sqrt[y], z^x, q/2, Mod[r, 1], Sin[s]}, {2, 4}];
{MemoryInUse[], MaxMemoryUsed[]}
b = Mean[a];
{MemoryInUse[], MaxMemoryUsed[]}
{109125576, 124244808}
{269465456, 376960648}
If they are, and are in packed-array form, perhaps the elements are such that the array is unpacked during processing.
Here is an example where the tensor is a packed array of small numbers, and unpacking does not occur.
a = RandomReal[99, {20000, 185, 5}];
Developer`PackedArrayQ[a]
{MemoryInUse[], MaxMemoryUsed[]}
b = Mean[a];
{MemoryInUse[], MaxMemoryUsed[]}
True
{163012808, 163016952}
{163018944, 163026688}
Here is the same size of tensor with very large numbers.
a = RandomReal[$MaxMachineNumber, {20000, 185, 5}];
Developer`PackedArrayQ[a]
{MemoryInUse[], MaxMemoryUsed[]}
b = Mean[a];
{MemoryInUse[], MaxMemoryUsed[]}
True
{163010680, 458982088}
{163122608, 786958080}
To elaborate a little on the other answers, there is no reason to expect Mathematica functions to operate materially differently on tensors than on matrices, because Mathematica considers them both to be nested Lists that just have different nesting depths. How functions behave with lists depends on whether they're Listable, which you can check using Attributes[f], where f is the function you are interested in.
Your data list's dimensionality isn't actually that big in the scheme of things. Without seeing your actual data it is hard to be sure, but I suspect the reason you are running out of memory is that some of your data is non-numerical.
I don't know what you're doing incorrectly (posting your code would help), but Mean[] already works the way you want it to.
a = RandomReal[1, {20000, 185, 5}];
b = Mean@a;
Dimensions@b
Out[1]= {185, 5}
You can even check that this is correct:
{Max@b, Min@b}
Out[2]={0.506445, 0.494061}
which is the expected value of the mean given that RandomReal uses a uniform distribution by default.
Assume you have the following data :
a = Table[RandomInteger[100], {i, 20000}, {j, 185}, {k, 5}];
In a straightforward manner you can find a table which stores the means of a[[1, j, k]], a[[2, j, k]], ..., a[[20000, j, k]]:
c = Table[Sum[a[[i, j, k]], {i, Length[a]}], {j, 185}, {k, 5}]/
Length[a] // N; // Timing
{37.487, Null}
or simply :
d = Total[a]/Length[a] // N; // Timing
{0.702, Null}
The second way is about 50 times faster.
c == d
True
To extend on Brett's answer a bit, when you call Mean on an n-dimensional tensor it averages over the first index and returns an (n-1)-dimensional tensor:
a = RandomReal[1, {a1, a2, a3, ... an}];
Dimensions[a] (* This would have n entries in it *)
b = Mean[a];
Dimensions[b] (* Has n-1 entries, where averaging was done over the first index *)
In the more general case where you may wish to average over the i-th index, you have to transpose the data around first. For example, say you want to average over the 3rd of 5 dimensions. You would need the 3rd level first, followed by the 1st, 2nd, 4th, and 5th.
a = RandomReal[1, {5, 10, 2, 40, 10}];
b = Transpose[a, {2, 3, 1, 4, 5}];
c = Mean[b]; (* Now of dimensions {5, 10, 40, 10} *)
In other words, you would make a call to Transpose where you placed the i-th index as the first tensor index and moved everything before it ahead one. Anything that comes after the i-th index stays the same.
This tends to come in handy when your data comes in odd formats where the first index may not always represent different realizations of a data sample. I've had this come up, for example, when I had to do time averaging of large wind data sets where the time series came third (!) in terms of the tensor representation that was available.
You could imagine that generalizedTensorMean would look something like this:
Clear[generalizedTensorMean];
generalizedTensorMean[A_, i_] :=
Module[{n = Length@Dimensions@A, ordering},
ordering =
Join[Table[x, {x, 2, i}], {1}, Table[x, {x, i + 1, n}]];
Mean@Transpose[A, ordering]]
This reduces to the plain-old-mean when i == 1. Try it out:
A = RandomReal[1, {2, 4, 6, 8, 10, 12, 14}];
Dimensions@A (* {2, 4, 6, 8, 10, 12, 14} *)
Dimensions@generalizedTensorMean[A, 1] (* {4, 6, 8, 10, 12, 14} *)
Dimensions@generalizedTensorMean[A, 7] (* {2, 4, 6, 8, 10, 12} *)
On a side note, I'm surprised that Mathematica doesn't support this by default. You don't always want to average over the first level of a list.
Does anyone know of any standard algorithms to determine an affine transformation matrix based upon a set of known points in two co-ordinate systems?
Affine transformations of the plane are given by 2x3 matrices. We perform an affine transformation M by taking our 2D input (x y), bumping it up to a 3D vector (x y 1), and then multiplying (on the left) by M.
So if we have three points (x1 y1) (x2 y2) (x3 y3) mapping to (u1 v1) (u2 v2) (u3 v3) then we have
    [x1 x2 x3]   [u1 u2 u3]
M * [y1 y2 y3] = [v1 v2 v3].
    [ 1  1  1]
You can get M simply by multiplying on the right by the inverse of
[x1 x2 x3]
[y1 y2 y3]
[ 1 1 1].
A 2x3 matrix multiplied on the right by a 3x3 matrix gives us the 2x3 we want. (You don't actually need the full inverse, but if matrix inverse is available it's easy to use.)
Easily adapted to other dimensions. If you have more than 3 points you may want a least-squares best fit; that's a little harder (a sketch follows below).
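For that over-determined case, here is a minimal NumPy sketch of the least-squares fit (the row/column conventions and sample points are mine, not part of the answer):

import numpy as np

# Point correspondences: each row of src maps to the matching row of dst.
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = np.array([[1.0, 2.0], [2.0, 2.0], [1.0, 3.0], [2.0, 3.0]])  # src shifted by (1, 2)

# Stack [x y 1] rows and solve X @ M.T ~= dst in the least-squares sense.
X = np.hstack([src, np.ones((len(src), 1))])
M = np.linalg.lstsq(X, dst, rcond=None)[0].T   # 2x3: [linear part | translation]

print(M)
print(np.allclose(X @ M.T, dst))               # True when the points are exactly affine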
I'm not sure how standard it is, but there is a nice formula especially for your case presented in "Beginner's guide to mapping simplexes affinely" and "Workbook on mapping simplexes affinely".
Putting it into code should look something like this (sorry for bad code style -- I'm a mathematician, not a programmer):
import numpy as np
# input data
ins = [[1, 1, 2], [2, 3, 0], [3, 2, -2], [-2, 2, 3]] # <- points
out = [[0, 2, 1], [1, 2, 2], [-2, -1, 6], [4, 1, -3]] # <- mapped to
# calculations
l = len(ins)
B = np.vstack([np.transpose(ins), np.ones(l)])
D = 1.0 / np.linalg.det(B)
entry = lambda r,d: np.linalg.det(np.delete(np.vstack([r, B]), (d+1), axis=0))
M = [[(-1)**i * D * entry(R, i) for i in range(l)] for R in np.transpose(out)]
A, t = np.hsplit(np.array(M), [l-1])
t = np.transpose(t)[0]
# output
print("Affine transformation matrix:\n", A)
print("Affine transformation translation vector:\n", t)
# unittests
print("TESTING:")
for p, P in zip(np.array(ins), np.array(out)):
    image_p = np.dot(A, p) + t
    result = "[OK]" if np.allclose(image_p, P) else "[ERROR]"
    print(p, " mapped to: ", image_p, " ; expected: ", P, result)
This code recovers the affine transformation from the given points ("ins" mapped to "out") and tests that it works.