How can we efficiently calculate pairwise cosine distances in a matrix using TensorFlow? Given an MxN matrix, the result should be an MxM matrix, where the element at position [i][j] is the cosine distance between i-th and j-th rows/vectors in the input matrix.
This can be done with Scikit-Learn fairly easily as follows:
from sklearn.metrics.pairwise import pairwise_distances
pairwise_distances(input_matrix, metric='cosine')
Is there an equivalent method in TensorFlow?
There is an answer for getting a single cosine distance here: https://stackoverflow.com/a/46057597/288875 . This is based on tf.losses.cosine_distance .
Here is a solution which does this for matrices:
import tensorflow as tf
import numpy as np
with tf.Session() as sess:
    M = 3
    # input
    input = tf.placeholder(tf.float32, shape = (M, M))
    # normalize each row
    normalized = tf.nn.l2_normalize(input, dim = 1)
    # multiply row i with row j using transpose
    # (a matrix product of the normalized rows, not an element-wise product)
    prod = tf.matmul(normalized, normalized,
                     adjoint_b = True # transpose second matrix
                     )
    dist = 1 - prod
    input_matrix = np.array(
        [[ 1, 1, 1 ],
         [ 0, 1, 1 ],
         [ 0, 0, 1 ],
        ],
        dtype = 'float32')
    print("input_matrix:")
    print(input_matrix)
    from sklearn.metrics.pairwise import pairwise_distances
    print("sklearn:")
    print(pairwise_distances(input_matrix, metric='cosine'))
    print("tensorflow:")
    print(sess.run(dist, feed_dict = { input : input_matrix }))
which gives me:
input_matrix:
[[ 1. 1. 1.]
[ 0. 1. 1.]
[ 0. 0. 1.]]
sklearn:
[[ 0. 0.18350345 0.42264974]
[ 0.18350345 0. 0.29289323]
[ 0.42264974 0.29289323 0. ]]
tensorflow:
[[ 5.96046448e-08 1.83503449e-01 4.22649741e-01]
[ 1.83503449e-01 5.96046448e-08 2.92893231e-01]
[ 4.22649741e-01 2.92893231e-01 0.00000000e+00]]
Note that this solution may not be optimal, as it calculates all entries of the (symmetric) result matrix, i.e. it does almost twice as many calculations as necessary. This is likely not a problem for small matrices; for large matrices, a combination of loops that computes only one triangle may be faster.
Note also that this does not have a minibatch dimension so works for a single matrix only.
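For reference, the same normalize-then-multiply idea extends to a leading minibatch dimension. The following is only a sketch in NumPy rather than TensorFlow; the function name batched_cosine_distances and the eps guard against zero-length rows are assumptions made for illustration, not part of the answer above.

```python
import numpy as np

def batched_cosine_distances(x, eps=1e-12):
    """Pairwise cosine distances between the rows of every matrix in a batch.

    x has shape (batch, M, N); the result has shape (batch, M, M).
    """
    # Row-wise L2 norms, kept as a trailing axis so they broadcast over N.
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    # eps avoids division by zero for all-zero rows (an assumption of this sketch).
    normalized = x / np.maximum(norms, eps)
    # Batched matrix product of each matrix with its own transpose.
    similarity = normalized @ np.swapaxes(normalized, -1, -2)
    return 1.0 - similarity

single = np.array([[1, 1, 1],
                   [0, 1, 1],
                   [0, 0, 1]], dtype=np.float64)
# Two matrices in the batch; scaling the rows must not change cosine distances.
batch = np.stack([single, 2.0 * single])
distances = batched_cosine_distances(batch)
```

Each slice distances[b] should agree with what pairwise_distances(batch[b], metric='cosine') returns, up to floating-point error.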
Elegant solution (output is the same as from scikit-learn pairwise_distances function):
def compute_cosine_distances(a, b):
    # a shape is n_a * dim
    # b shape is n_b * dim
    # result shape is n_a * n_b
    normalize_a = tf.nn.l2_normalize(a,1)
    normalize_b = tf.nn.l2_normalize(b,1)
    distance = 1 - tf.matmul(normalize_a, normalize_b, transpose_b=True)
    return distance
test
input_matrix = np.array([[1, 1, 1],
[0, 1, 1],
[0, 0, 1]], dtype = 'float32')
compute_cosine_distances(input_matrix, input_matrix)
output:
<tf.Tensor: id=442, shape=(3, 3), dtype=float32, numpy=
array([[5.9604645e-08, 1.8350345e-01, 4.2264974e-01],
[1.8350345e-01, 5.9604645e-08, 2.9289323e-01],
[4.2264974e-01, 2.9289323e-01, 0.0000000e+00]], dtype=float32)>
When using tf.boolean_mask(), a ValueError is raised. It reads "Number of mask dimensions must be specified, even if some dimensions are None. E.g. shape=[None] is ok, but shape=None is not."
I suspect that something is going wrong when I create my boolean mask s, because when I just create a boolean mask by hand, all works fine. However, I've checked the shape and the dtype of s so far, and couldn't notice anything suspicious. Both seemed to be identical to the shape and type of the boolean mask I created by hand.
Please see a screenshot of the problem.
The following should allow you to reproduce the error on your machine. You need tensorflow, numpy and scipy.
with tf.Session() as sess:
    # receive five embedded vectors
    v0 = tf.constant([[3.0,1.0,2.,4.,2.]])
    v1 = tf.constant([[4.0,0,1.0,4,1.]])
    v2 = tf.constant([[1.0,1.0,0.0,4.,8.]])
    v3 = tf.constant([[1.,4,2.,5.,2.]])
    v4 = tf.constant([[3.,2.,3.,2.,5.]])
    # concatenate the five embedded vectors into a matrix
    VT = tf.concat([v0,v1,v2,v3,v4],axis=0)
    # perform SVD on the concatenated matrix
    s, u1, u2 = tf.svd(VT)
    e = tf.square(s) # list of eigenvalues
    v = u1 # eigenvectors as column vectors
    # sample a set
    s = tf.py_func(sample_dpp_bin,[e,v],tf.bool)
    X = tf.boolean_mask(VT,s)
    print(X.eval())
This is the code to generate s. s is a sample from a determinantal point process (for the mathematically interested).
Note that I'm using tf.py_func to wrap this python function:
import tensorflow as tf
import numpy as np
from scipy.linalg import orth
def sample_dpp_bin(e_val,e_vec):
    # e_val = np.array of eigenvalues
    # e_vec = array of eigenvectors (= column vectors)
    eps = 0.01
    # sample a set of eigenvectors
    ind = (np.random.rand(len(e_val)) <= (e_val)/(1+e_val))
    k = sum(ind)
    if k == e_val.size:
        return np.ones(e_val.size,dtype=bool) # check for full set
    if k == 0:
        return np.zeros(e_val.size,dtype=bool)
    V = e_vec[:,np.array(ind)]
    # sample a set of k items
    sample = np.zeros(e_val.size,dtype=bool)
    for l in range(k-1,-1,-1):
        p = np.sum(V**2,axis=1)
        p = np.cumsum(p / np.sum(p)) # item cumulative probabilities
        i = int((np.random.rand() <= p).argmax()) # choose random item
        sample[i] = True
        j = (np.abs(V[i,:])>eps).argmax() # pick an eigenvector not orthogonal to e_i
        Vj = V[:,j]
        V = orth(V - (np.outer(Vj,(V[i,:]/Vj[i]))))
    return sample
The output if I print s and tf.shape(s) is
[False True True True True]
[5]
The output if I print VT and tf.shape(VT) is
[[ 3. 1. 2. 4. 2.]
[ 4. 0. 1. 4. 1.]
[ 1. 1. 0. 4. 8.]
[ 1. 4. 2. 5. 2.]
[ 3. 2. 3. 2. 5.]]
[5 5]
Any help much appreciated.
Following example works for me.
import tensorflow as tf
import numpy as np
tensor = [[1, 2], [3, 4], [5, 6]]
mask = np.array([True, False, True])
t_m = tf.boolean_mask(tensor, mask)
sess = tf.Session()
print(sess.run(t_m))
Output:
[[1 2]
[5 6]]
Provide your runnable code snippet to reproduce the error. I think you might be doing something wrong in s.
Update:
s = tf.py_func(sample_dpp_bin,[e,v],tf.bool)
s_v = (s.eval())
X = tf.boolean_mask(VT,s_v)
print(X.eval())
The mask should be a NumPy array here, not a TF tensor of unknown shape. You don't have to use tf.py_func.
The error message states that the shape of the mask is not defined. What do you get if you print tf.shape(s)? I'd bet the problem with your code is that the shape of s is completely unknown, and you could fix that with a simple call like s.set_shape((None,)) (to simply specify that s is a 1-dimensional tensor). Consider this code snippet:
X = np.random.randint(0, 2, (100, 100, 3))
with tf.Session() as sess:
    X_tf = tf.placeholder(tf.int8)
    # X_tf.set_shape((None, None, None))
    y = tf.greater(tf.reduce_max(X_tf, axis=(0, 1)), 0)
    print(tf.shape(y))
    z = tf.boolean_mask(X_tf, y, axis=2)
    print(sess.run(z, feed_dict={X_tf: X}))
This prints a shape of Tensor("Shape_3:0", shape=(?,), dtype=int32) (i.e., even the number of dimensions of y is unknown) and returns the same error as you have. However, if you uncomment the set_shape line, then X_tf is known to be 3-dimensional and so y is known to be 1-dimensional. The code then works. So, I think all you need to do is add an s.set_shape((None,)) call after the py_func call. Note the trailing comma: (None) without it is just None, which leaves the shape unknown.
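One Python detail is worth checking when writing the shape argument: parentheses alone do not create a tuple, so (None) is simply None, whereas (None,) or [None] is a one-element sequence describing a 1-dimensional shape. A small sketch in plain Python; the helper rank_of is hypothetical and exists only to illustrate the difference.

```python
def rank_of(shape_spec):
    """Number of dimensions a shape spec describes, or None if the spec is None.

    Hypothetical helper for illustration: a spec of None means that
    nothing at all is known about the shape.
    """
    return None if shape_spec is None else len(shape_spec)

not_a_tuple = (None)    # parentheses only group; this is just None
one_dim_tuple = (None,) # the trailing comma makes a 1-element tuple
one_dim_list = [None]   # a list works as a shape spec as well

ranks = [rank_of(spec) for spec in (not_a_tuple, one_dim_tuple, one_dim_list)]
```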
I'm having trouble creating a multivariate normal density with sympy 0.7.6.1.
Here is my code.
from sympy import *
from sympy.stats import *
mu = Matrix([5, 13])
Sigma = Matrix([[2, 0], [0, 2]])
X = Normal('X', mu, Sigma)
y = MatrixSymbol('y', 2, 1)
density(X)(y)
The last line gives me this error:
Power of non-square matrix Matrix([
[ -5],
[-13]]) + y
The problem is simple: the formula used to calculate the density does not support matrices. Have a look:
https://github.com/sympy/sympy/blob/sympy-0.7.6.1/sympy/stats/crv_types.py#L1641
In this expression, (x-self.mean) gets squared (i.e. raised to the power of 2), but the square of a non-square matrix is not defined.
In short, it looks like multivariate normal distributions are not supported, but you could try a workaround by defining a new distribution:
from sympy.stats.crv_types import rv, SingleContinuousDistribution, _value_check
class MultivariateNormalDistribution(SingleContinuousDistribution):
    _argnames = ('mean', 'std')
    @staticmethod
    def check(mean, std):
        _value_check(std > 0, "Standard deviation must be positive")
    def pdf(self, x):
        # std holds the covariance matrix; the normalization uses the square root of its determinant
        return exp(-S.Half * (x - self.mean).T * (self.std.inv()) * (x - self.mean)) / sqrt((2*pi)**(self.std.shape[0])*self.std.det())
    def sample(self):
        pass
        # define sampling function here
def MultivariateNormal(name, mean, std):
    return rv(name, MultivariateNormalDistribution, (mean, std))
Unfortunately, your example still doesn't work because of missing features in the matrix module (that is, exponentiation of expressions containing a MatrixSymbol is not supported yet), but you can get the density at a point:
In [12]: X = MultivariateNormal('X', mu, Sigma)
In [13]: density(X)(Matrix([0, 0]))
Out[13]:
[ -97/2]
[e ]
[------]
[ 4*pi ]
Or with symbols in the matrix:
In [14]: x1, x2 = symbols('x1, x2')
In [15]: density(X)(Matrix([x1, x2]))
Out[15]:
[ 2 2 ]
[ x1 5*x1 x2 13*x2 97]
[ - --- + ---- - --- + ----- - --]
[ 4 2 4 2 2 ]
[e ]
[--------------------------------]
[ 4*pi ]
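As a numerical cross-check of the point density, here is a minimal sketch in NumPy using the textbook multivariate normal density, which divides by sqrt((2π)^k det Σ). The function name mvn_pdf is an assumption for illustration and is unrelated to SymPy's API. For Σ = 2I and the point (0, 0) it evaluates to exp(-97/2)/(4π).

```python
import math
import numpy as np

def mvn_pdf(x, mean, cov):
    """Textbook multivariate normal density evaluated at a single point."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    k = mean.shape[0]
    diff = x - mean
    # Quadratic form diff^T cov^{-1} diff, solving a linear system
    # instead of forming the inverse explicitly.
    quad = diff @ np.linalg.solve(cov, diff)
    norm_const = math.sqrt((2 * math.pi) ** k * np.linalg.det(cov))
    return math.exp(-0.5 * quad) / norm_const

mu = [5, 13]
Sigma = [[2, 0], [0, 2]]
value = mvn_pdf([0, 0], mu, Sigma)
```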
I'm trying to create a symbolic matrix (S) of general size (let's say LxL), and I want to set each element of the matrix as a function of the indices, i.e.:
S[m,n] = (u+i/2*(n-m))/(u-i/2*(n-m)) * (u+i/2*(n+m))/(u-i/2*(n+m))
I tried running this in sympy, and I got
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-11-a456d47e99e7> in <module>()
2 S_l = MatrixSymbol('S_l',2*l+1,2*l+1)
3 S_k = MatrixSymbol('S_k',2*k+1,2*k+1)
----> 4 S_l[m,n] = (u+i/2*(n-m))/(u-i/2*(n-m)) * (u+i/2*(n+m))/(u-i/2*(n+m))
TypeError: 'MatrixSymbol' object does not support item assignment
Searching through Stack Exchange I found this question from last year:
Sympy - Dense Display of Matrices of Arbitrary Size
Which is unanswered and not exactly the same. Is it the same issue, or am I just trying to do an impossible thing in sympy (or computers in general)?
I know this is ancient, but I came across the same issue and figured I'd share a solution that works for me. You'll need to use a FunctionMatrix object instead of a MatrixSymbol. For background, I'm using SymPy 1.6.1 on Python 3.5.2.
Here's an example. Using the code below, I've set up some iteration symbols and the function f(i,j) I'd like to use for the elements of my matrix U.
# Import SymPy for symbolic computations
import sympy as sym
# Index variables
i,j = sym.symbols('i j', integer=True);
N = sym.Symbol('N', real=True, integer=True, zero=False, positive=True);
# The function we'll use for our matrix
def f(i,j):
    # Some arbitrary function...
    return i + j;
# Define a function matrix where elements of the matrix
# are a function of the indices
U = sym.FunctionMatrix(N, N, sym.Lambda((i,j), f(i,j)));
Now, let's try using the elements in the matrix by summing them all up...
U_sum = sym.Sum(U[i,j], (i, 0, N), (j, 0, N));
U_sum
>>>
N N
___ ___
╲ ╲
╲ ╲
╱ ╱ (i + j)
╱ ╱
‾‾‾ ‾‾‾
j = 0 i = 0
Then, let's tell SymPy to calculate the summation
U_sum.doit().simplify()
>>> N * ( N**2 + 2*N + 1 )
This certainly can be done. The docs offer some examples. Here's one
>>> Matrix(3, 4, lambda i,j: 1 - (i+j) % 2)
Matrix([
[1, 0, 1, 0],
[0, 1, 0, 1],
[1, 0, 1, 0]])
I was looking for example code showing how to compute a singular value decomposition of a 2x2 matrix that can contain complex values.
For example, this would be useful for "repairing" user-entered matrices to be unitary. You just take u, s, v = svd(m) then omit the s part from the product: repaired = u * v.
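The repair step can be sketched with NumPy's built-in SVD. Note that np.linalg.svd returns the third factor already conjugate-transposed (often called vh), so the repaired matrix is simply u @ vh. The function name repair_to_unitary and the sample matrix are assumptions made for illustration.

```python
import numpy as np

def repair_to_unitary(m):
    """Return the unitary matrix u @ vh from the SVD of m (singular values dropped)."""
    u, _singular_values, vh = np.linalg.svd(m)
    return u @ vh

# A hypothetical user-entered matrix that is close to, but not exactly, unitary.
almost_unitary = np.array([[1.0, 0.02j],
                           [0.01, 1.0 + 0.03j]])
repaired = repair_to_unitary(almost_unitary)
```

Multiplying repaired by its conjugate transpose should give the identity up to rounding error.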
Here's some python code that does the trick. It basically just extracts the complex parts then delegates to the solution from this answer for real 2x2 matrices.
I've written the code in python, using numpy. This is a bit ironic, because if you have numpy you should just use np.linalg.svd. Clearly this is intended as example code suitable for learning or translating into other languages in a pinch.
I'm also not an expert on numerical stability, so... buyer beware.
import numpy as np
import math
# Note: in practice in python just use np.linalg.svd instead
def singular_value_decomposition_complex_2x2(m):
    """
    Returns a singular value decomposition of the given 2x2 complex numpy
    matrix.
    :param m: A 2x2 numpy matrix with complex values.
    :returns: A tuple (U, S, V) where U*S*V ~= m, where U and V are complex
    2x2 unitary matrices, and where S is a 2x2 diagonal matrix with
    non-negative real values.
    """
    # Make top row non-imaginary and non-negative by column phasing.
    # m2 = m p = | >     >    |
    #            | ?+?i  ?+?i |
    p = phase_cancel_matrix(m[0, 0], m[0, 1])
    m2 = m * p
    # Cancel top-right value by rotation.
    # m3 = m p r = | ?+?i  0    |
    #              | ?+?i  ?+?i |
    r = rotation_matrix(math.atan2(m2[0, 1].real, m2[0, 0].real))
    m3 = m2 * r
    # Make bottom row non-imaginary and non-negative by column phasing.
    # m4 = m p r q = | ?+?i  0 |
    #                | >     > |
    q = phase_cancel_matrix(m3[1, 0], m3[1, 1])
    m4 = m3 * q
    # Cancel imaginary part of top left value by row phasing.
    # m5 = t m p r q = | >  0 |
    #                  | >  > |
    t = phase_cancel_matrix(m4[0, 0], 1)
    m5 = t * m4
    # All values are now real (also the top-right is zero), so delegate to a
    # singular value decomposition that works for real matrices.
    # t m p r q = u s v
    u, s, v = singular_value_decomposition_real_2x2(np.real(m5))
    # m = (t* u) s (v q* r* p*)
    return adjoint(t) * u, s, v * adjoint(q) * adjoint(r) * adjoint(p)
def singular_value_decomposition_real_2x2(m):
    """
    Returns a singular value decomposition of the given 2x2 real numpy matrix.
    :param m: A 2x2 numpy matrix with real values.
    :returns: A tuple (U, S, V) where U*S*V ~= m, where U and V are 2x2
    rotation matrices, and where S is a 2x2 diagonal matrix with
    non-negative real values.
    """
    a = m[0, 0]
    b = m[0, 1]
    c = m[1, 0]
    d = m[1, 1]
    t = a + d
    x = b + c
    y = b - c
    z = a - d
    theta_0 = math.atan2(x, t) / 2.0
    theta_d = math.atan2(y, z) / 2.0
    s_0 = math.sqrt(t**2 + x**2) / 2.0
    s_d = math.sqrt(z**2 + y**2) / 2.0
    return \
        rotation_matrix(theta_0 - theta_d), \
        np.mat([[s_0 + s_d, 0], [0, s_0 - s_d]]), \
        rotation_matrix(theta_0 + theta_d)
def adjoint(m):
    """
    Returns the adjoint, i.e. the conjugate transpose, of the given matrix.
    When the matrix is unitary, the adjoint is also its inverse.
    :param m: A numpy matrix to transpose and conjugate.
    :return: A numpy matrix.
    """
    return m.conjugate().transpose()
def rotation_matrix(theta):
    """
    Returns a 2x2 unitary matrix corresponding to a 2d rotation by the given angle.
    :param theta: The angle, in radians, that the matrix should rotate by.
    :return: A 2x2 orthogonal matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    return np.mat([[c, -s],
                   [s, c]])
def phase_cancel_complex(c):
    """
    Returns a unit complex number p that cancels the phase of the given complex
    number c. That is, c * p will be real and non-negative (approximately).
    :param c: A complex number.
    :return: A complex number on the complex unit circle.
    """
    m = abs(c)
    # For small values, where the division is in danger of exploding small
    # errors, use trig functions instead.
    if m < 0.0001:
        theta = math.atan2(c.imag, c.real)
        return math.cos(theta) - math.sin(theta) * 1j
    return (c / float(m)).conjugate()
def phase_cancel_matrix(p, q):
    """
    Returns a 2x2 unitary matrix M such that M cancels out the phases in the
    column {{p}, {q}} so that the result of M * {{p}, {q}} should be a vector
    with non-negative real values.
    :param p: A complex number.
    :param q: A complex number.
    :return: A 2x2 diagonal unitary matrix.
    """
    return np.mat([[phase_cancel_complex(p), 0],
                   [0, phase_cancel_complex(q)]])
I tested the above code by fuzzing it with matrices filled with random values in [-10, 10] + [-10, 10]i, and checking that the decomposed factors had the right properties (i.e. unitary, diagonal, real, as appropriate) and that their product was (approximately) equal to the input.
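A fuzz check along these lines might look like the sketch below. It is not the original test harness: the names fuzz_svd and numpy_svd, the tolerance, and the trial count are assumptions, and np.linalg.svd stands in as the decomposition under test so that the block is self-contained.

```python
import numpy as np

def numpy_svd(m):
    # Stand-in decomposition returning (U, S, V) with S as a full diagonal matrix.
    u, singular_values, vh = np.linalg.svd(m)
    return u, np.diag(singular_values), vh

def fuzz_svd(decompose, trials=200, tol=1e-9, seed=0):
    """Return True if every random trial yields factors with the expected properties."""
    rng = np.random.default_rng(seed)
    identity = np.eye(2)
    for _ in range(trials):
        # Random entries in [-10, 10] + [-10, 10]i, as described in the text.
        m = rng.uniform(-10, 10, (2, 2)) + 1j * rng.uniform(-10, 10, (2, 2))
        u, s, v = decompose(m)
        ok = (
            np.allclose(u @ u.conj().T, identity, atol=tol)      # U is unitary
            and np.allclose(v @ v.conj().T, identity, atol=tol)  # V is unitary
            and np.allclose(s, np.diag(np.diag(s)), atol=tol)    # S is diagonal
            and np.allclose(np.imag(s), 0, atol=tol)             # S is real
            and np.all(np.real(np.diag(s)) >= -tol)              # S is non-negative
            and np.allclose(u @ s @ v, m, atol=tol)              # product matches m
        )
        if not ok:
            return False
    return True

all_passed = fuzz_svd(numpy_svd)
```

To exercise a function written for np.mat inputs, the random matrix would presumably need to be wrapped with np.mat before being passed in.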
But here's a simple smoke test:
m = np.mat([[5, 10], [1j, -1]])
u, s, v = singular_value_decomposition_complex_2x2(m)
np.set_printoptions(precision=5, suppress=True)
print("M:\n", m)
print("U*S*V:\n", u*s*v)
print("U:\n", u)
print("S:\n", s)
print("V:\n", v)
print("M ~= U*S*V:", np.all(np.abs(m - u*s*v) < 0.1**14))
Which outputs the following. You can confirm that the factored S matches the SVD from Wolfram Alpha, although of course the U and V can be (and are) different.
M:
[[ 5.+0.j 10.+0.j]
[ 0.+1.j -1.+0.j]]
U*S*V:
[[ 5.+0.j 10.+0.j]
[ 0.+1.j -1.-0.j]]
U:
[[-0.89081-0.44541j 0.08031+0.04016j]
[ 0.08979+0.j 0.99596+0.j ]]
S:
[[ 11.22533 0. ]
[ 0. 0.99599]]
V:
[[-0.39679+0.20639j -0.80157+0.39679j]
[ 0.40319+0.79837j -0.19359-0.40319j]]
M ~= U*S*V: True
If I have constructed a sparse matrix using the sparse(i, j, k) constructor, how can I then normalize the columns of the matrix (so that each column sums to 1)? I cannot efficiently normalize the entries before I create the matrix, so any help is appreciated. Thanks!
The easiest way would be a broadcasting division by the sum of the columns:
julia> A = sprand(4,5,.5)
julia> A./sum(A,1)
4x5 Array{Float64,2}:
0.0 0.0989976 0.0 0.0 0.0795486
0.420754 0.458653 0.0986313 0.0 0.0
0.0785525 0.442349 0.0 0.856136 0.920451
0.500693 0.0 0.901369 0.143864 0.0
… but it looks like that hasn't been optimized for sparse matrices yet, and falls back to a full matrix. So a simple loop to iterate over the columns does the trick:
julia> for (col,s) in enumerate(sum(A,1))
           s == 0 && continue # What does a "normalized" column with a sum of zero look like?
           A[:,col] = A[:,col]/s
       end
julia> A
4x5 sparse matrix with 12 Float64 entries:
[2, 1] = 0.420754
[3, 1] = 0.0785525
[4, 1] = 0.500693
[1, 2] = 0.0989976
[2, 2] = 0.458653
[3, 2] = 0.442349
[2, 3] = 0.0986313
[4, 3] = 0.901369
[3, 4] = 0.856136
[4, 4] = 0.143864
[1, 5] = 0.0795486
[3, 5] = 0.920451
julia> sum(A,1)
1x5 Array{Float64,2}:
1.0 1.0 1.0 1.0 1.0
This works entirely within sparse matrices and is done in-place (although it is still allocating new sparse matrices for each column slice).
Given a matrix A (it does not matter whether it is sparse or not), you can normalize along either dimension with
A ./ sum(A,1) or A ./ sum(A,2)
To show that it works:
A = sprand(10,10,0.3)
println(sum(A,1))
println(A ./ sum(A,1))
The only caveat:
A[:,1] = 0
println(A ./ sum(A,1))
As you can see, column 1 now contains only NaNs because we divide by zero. Also, we end up with a dense Matrix and not a sparse matrix.
On the other hand one can quickly come up with an efficient specialized solution for your problem.
function normalize_columns(A :: SparseMatrixCSC)
    sums = sum(A,1)
    I,J,V = findnz(A)
    for idx in 1:length(V)
        V[idx] /= sums[J[idx]]
    end
    # pass the dimensions explicitly so trailing empty rows/columns are kept
    sparse(I,J,V,size(A,1),size(A,2))
end
@Matt B came up with a very similar answer while I was typing this up :)
Remember that sparse matrices in Julia are in compressed column form. So you can access the data directly:
for col = 1 : size(A, 2)
    i = A.colptr[col]
    k = A.colptr[col+1] - 1
    n = i <= k ? norm(A.nzval[i:k]) : 0.0 # or whatever you like
    n > 0.0 && (A.nzval[i:k] ./= n)
end
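The compressed-column layout is not specific to Julia, so the same idea can be illustrated with plain NumPy arrays. This is only a sketch: colptr and nzval below are hand-built, 0-based stand-ins for the fields of a CSC matrix, the function name is hypothetical, and the scaling uses the column sum (as the question asks) rather than the Euclidean norm.

```python
import numpy as np

def normalize_csc_columns(colptr, nzval):
    """Scale the stored values in place so that each non-empty column sums to 1."""
    for col in range(len(colptr) - 1):
        # Stored entries of this column live in nzval[start:stop].
        start, stop = colptr[col], colptr[col + 1]
        total = nzval[start:stop].sum()
        if total != 0:  # skip empty or zero-sum columns
            nzval[start:stop] /= total
    return nzval

# Three columns: column 0 stores (1, 3), column 1 is empty, column 2 stores (2, 2, 4).
colptr = np.array([0, 2, 2, 5])
nzval = np.array([1.0, 3.0, 2.0, 2.0, 4.0])
normalize_csc_columns(colptr, nzval)
```

Because only the stored values are touched, the sparsity pattern is left unchanged and no dense intermediate is created.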
# get the column sums of A
S = vec(sum(A,1))
# get the nonzero entries in A. ei is row index, ej is col index, ev is the value in A
ei,ej,ev = findnz(A)
# get the number of rows and columns in A
m,n = size(A)
# create a new normalized matrix. For each nonzero index (ei,ej), its new value will be
# the old value divided by the sum of that column, which can be obtained by S[ej]
A_normalized = sparse(ei,ej,ev./S[ej],m,n)
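The triplet-based approach can be sketched in NumPy as well. The arrays below play the role of the column indices and values returned by findnz (0-based here), and the function name is hypothetical. The sketch assumes every column that has stored entries also has a non-zero sum.

```python
import numpy as np

def normalize_triplet_columns(col_idx, vals, n_cols):
    """Divide each stored value by the sum of the column it belongs to."""
    # Column sums accumulated directly from the stored entries.
    col_sums = np.bincount(col_idx, weights=vals, minlength=n_cols)
    # Look up the sum of each entry's own column and divide.
    return vals / col_sums[col_idx]

# Stored entries of a matrix with 3 columns (column 1 has no entries).
col_idx = np.array([0, 0, 2, 2, 2])
vals = np.array([1.0, 3.0, 2.0, 2.0, 4.0])
normalized_vals = normalize_triplet_columns(col_idx, vals, n_cols=3)
new_col_sums = np.bincount(col_idx, weights=normalized_vals, minlength=3)
```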
The following gives what you want:
A = sprand(4,5,0.5)
B = A./sparse(sum(A,1))
The problem is that sum(A,1) gives a 1x5 dense array so combining with the sparse matrix A through the ./ operator gives a dense array. So you need to force it to be of sparse type. Or you can type
sparse(A ./ sum(A,1)).