I have a positive definite covariance matrix C of size 3n x 3n constructed from n^2 blocks of size 3x3.
Running MvNormal (with e.g a zero mean vector) on this matrix to draw Gaussian random vectors, I am getting the error
PosDefException: matrix is not positive definite; Cholesky factorization failed.
and indeed isposdef(C) returns false once n becomes too large. However, my matrix should be positive definite for any n, so this looks like a numerical issue (perhaps the determinant becomes too small or too large for machine precision).
The reproducible code I am using to generate C is below:
using LinearAlgebra  # provides I, issymmetric, det and isposdef
#######################################################
# inputs
grid_size=10
l_sq = 1
xmax = 2
#######################################################
# kernel function used to construct covariance matrix C
function corr(x, y, l_sq)
    v = x - y
    d_sq = sum(v.^2)
    n = d_sq / (2l_sq)
    return exp(-n) * Matrix{Float64}(I, length(x), length(x))
end
nb_grid_points = grid_size^3
gaussian_vector_dim = 3*nb_grid_points
oneD_grid = LinRange(-xmax, xmax, grid_size)
# get input set X which indexes grid points
threeD_grid = collect.(Iterators.product(oneD_grid, oneD_grid, oneD_grid))
grid_points = vec(reshape(threeD_grid,:,1))
########################################
# build C by blocks
C = Array{Float64}(undef, gaussian_vector_dim, gaussian_vector_dim)
for i in 1:nb_grid_points
    for j in 1:nb_grid_points
        # block covariance matrix C consists of DxD (here 3x3) correlation-function matrices K_i,j for i,j=1,...,nb_grid_points
        C[3*(i-1)+1:(3*i), (3*(j-1)+1):(3*j)] = corr(grid_points[i], grid_points[j], l_sq)
    end
end
#########################################
# plot covariance matrix
plt.imshow(C,cmap="Blues", interpolation="none")
plt.colorbar()
plt.title("Covariance matrix")
#########################################
print("C is symmetric:",issymmetric(C))
print("\ndet C=",det(C))
print("\nC is positive definite=",isposdef(C))
Keeping l_sq = 1 and xmax = 2, the code above gives isposdef(C) = false when grid_size = 10, but isposdef(C) = true when grid_size is 9 or less.
Why is this failure occurring and how can I fix it? Perhaps I can help Julia by indicating that the covariance matrix is sparse?
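A common workaround for this kind of failure, not specific to Julia, is to add a small "jitter" term to the diagonal before factorizing. The sketch below uses Python/NumPy with the same grid and kernel as in the question. The helper name gaussian_kernel_matrix and the jitter size (1e-8 times the largest eigenvalue) are illustrative assumptions, not values from the post. It relies on C being the Kronecker product of the scalar kernel matrix K with a 3x3 identity, so C has the same eigenvalues as K, each repeated three times.

```python
import numpy as np

def gaussian_kernel_matrix(points, l_sq):
    """Scalar kernel matrix K[i, j] = exp(-|x_i - x_j|^2 / (2 * l_sq))."""
    diff = points[:, None, :] - points[None, :, :]
    d_sq = np.sum(diff ** 2, axis=-1)
    return np.exp(-d_sq / (2.0 * l_sq))

grid_size, l_sq, xmax = 10, 1.0, 2.0
coords = np.linspace(-xmax, xmax, grid_size)
points = np.array([(x, y, z) for x in coords for y in coords for z in coords])

K = gaussian_kernel_matrix(points, l_sq)  # 1000 x 1000 scalar kernel
C = np.kron(K, np.eye(3))                 # 3000 x 3000, block (i, j) = K[i, j] * I_3

# The spectrum of K decays so fast that the smallest eigenvalues are
# indistinguishable from zero (or slightly negative) in double precision.
eigs = np.linalg.eigvalsh(K)
print("smallest / largest eigenvalue of K:", eigs.min(), eigs.max())

# Assumed jitter size: large compared to rounding error, small compared to the signal.
jitter = 1e-8 * eigs.max()
L = np.linalg.cholesky(C + jitter * np.eye(C.shape[0]))

# One zero-mean Gaussian sample with covariance C + jitter * I.
sample = L @ np.random.default_rng(0).standard_normal(C.shape[0])
```

The jitter changes the covariance slightly, so it is a trade-off between exactness and numerical stability rather than an exact fix.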
I'm trying to create a fractal tree in bash, provided that the user enters N where N is the number of branches.
I need to write the following sequence that gets N as an input:
N = 1; sequence = 50
N = 2; sequence = (50-16),(50+16)
N = 3; sequence = (50-16-8),(50-16+8),(50+16-8),(50+16+8)
N = 4; sequence = (50-16-8-4),(50-16-8+4),(50-16+8-4),(50-16+8+4),(50+16-8-4),(50+16-8+4),(50+16+8-4),(50+16+8+4)
N = 5; sequence = (50-16-8-4-2),(50-16-8-4+2),(50-16-8+4-2),(50-16-8+4+2),(50-16-8+4-2),(50-16-8+4+2),(50-16+8-4-2),(50-16+8-4+2),(50-16+8+4-2),(50-16+8+4+2),(50+16-8-4-2),(50+16-8-4+2),(50+16-8+4-2),(50+16-8+4+2),(50+16+8-4-2),(50+16+8-4+2),(50+16+8+4-2),(50+16+8+4+2)
I'm trying to use for loops and basic mathematics to get this sequence as an array but I'm still failing to get the accurate output, here is my code so far:
#!/bin/bash
N=$1
declare -a sequence=()
temp1=50
temp2=50
for i in $(eval echo "{1..$N}"); do
    for j in $(eval echo "{1..$N}"); do
        temp1=$((temp1+2**(5-j)))
        temp2=$((temp2-2**(5-j)))
    done
    sequence+=($temp1)
    sequence+=($temp2)
    temp1=50
    temp2=50
done
echo ${sequence[@]}
I don't know how to alternate between summation and subtraction, how can I approach this?
Ok so I am not really sure what it is that you are doing haha, but I wrote a script that generates the output you described..
N=${1}
sequence=()
math_sequence=()
if [ "$N" -eq 1 ]
then
    math_sequence+=(50)
    sequence+=(50)
else
    # count from 0 to 2^(N-1)-1; each number encodes one sign pattern
    for i in $(seq 0 $(bc <<< "(2^(${N}-1)) - 1"))
    do
        X=50
        Y=32
        # binary digits of i, zero-padded to N-1 places, with 0 -> '-' and 1 -> '+'
        SIGNS=$(echo "obase=2;${i}" | bc | xargs printf "%0$((${N}-1))d\n" | sed 's/0/-/g; s/1/+/g')
        MATH="$X"
        VAL=$Y
        # append one halved term per sign: 16, 8, 4, ...
        for (( k=0; k<${#SIGNS}; k++ )); do
            MATH+="${SIGNS:$k:1}"
            VAL=$(bc <<< "$VAL / 2")
            MATH+="${VAL}"
        done
        math_sequence+=( "(${MATH}), " )
        sequence+=( $(bc <<< "${MATH}") )
    done
fi
echo ${math_sequence[@]}
echo "----------------"
echo ${sequence[@]}
Some tricks I used here..
I saw that the +/- pattern kinda looked like binary counting: ----,---+,--+-,--++...+++-,++++ So I just made a binary counter and used the 0's and 1's as - and +.
I used bc <<< "${EQUATION}" instead of $(( ${EQUATION} )). At least I like it better: bc is arbitrary-precision, so it works for larger numbers, and it uses ^ instead of ** for exponents. My fav
I generate two arrays for ya... math_sequence which contains the list of equations, and sequence which contains the actual values. I was not sure which one you actually wanted so I gave you both.
The script is pretty configurable. Just change X and Y in the for loop and you can tweak this thing to make all sorts of numbers.
bash thisScript.sh <N> Will generate the output you described:
N = 1; sequence = 50
N = 2; sequence = (50-16),(50+16)
N = 3; sequence = (50-16-8),(50-16+8),(50+16-8),(50+16+8)
N = 4; sequence = (50-16-8-4),(50-16-8+4),(50-16+8-4),(50-16+8+4),(50+16-8-4),(50+16-8+4),(50+16+8-4),(50+16+8+4)
N = 5; sequence = (50-16-8-4-2),(50-16-8-4+2),(50-16-8+4-2),(50-16-8+4+2),(50-16-8+4-2),(50-16-8+4+2),(50-16+8-4-2),(50-16+8-4+2),(50-16+8+4-2),(50-16+8+4+2),(50+16-8-4-2),(50+16-8-4+2),(50+16-8+4-2),(50+16-8+4+2),(50+16+8-4-2),(50+16+8-4+2),(50+16+8+4-2),(50+16+8+4+2)
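The binary-counter trick described in the answer can also be sketched without string manipulation. Here is a minimal Python version: the function name branch_positions and its parameters center and first_step are made up for illustration, and bit 0 means "-" while bit 1 means "+", as in the answer.

```python
def branch_positions(n, center=50, first_step=16):
    """Return the 2**(n-1) values center +/- 16 +/- 8 ... in binary-counting order."""
    positions = []
    for pattern in range(2 ** (n - 1)):
        value = center
        step = first_step
        # Read the bits of `pattern` from most to least significant:
        # a 0 bit subtracts the current step, a 1 bit adds it.
        for bit in range(n - 2, -1, -1):
            sign = 1 if (pattern >> bit) & 1 else -1
            value += sign * step
            step //= 2  # 16, 8, 4, 2, ...
        positions.append(value)
    return positions

print(branch_positions(3))  # [26, 42, 58, 74]
```

For N = 1 the inner loop does not run at all, so the result is just [50], which matches the special case in the bash script.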
How can we efficiently calculate pairwise cosine distances in a matrix using TensorFlow? Given an MxN matrix, the result should be an MxM matrix, where the element at position [i][j] is the cosine distance between i-th and j-th rows/vectors in the input matrix.
This can be done with Scikit-Learn fairly easily as follows:
from sklearn.metrics.pairwise import pairwise_distances
pairwise_distances(input_matrix, metric='cosine')
Is there an equivalent method in TensorFlow?
There is an answer for getting a single cosine distance here: https://stackoverflow.com/a/46057597/288875 . This is based on tf.losses.cosine_distance .
Here is a solution which does this for matrices:
import tensorflow as tf
import numpy as np
from sklearn.metrics.pairwise import pairwise_distances

with tf.Session() as sess:
    M = 3
    # input
    input = tf.placeholder(tf.float32, shape = (M, M))
    # normalize each row
    normalized = tf.nn.l2_normalize(input, dim = 1)
    # multiply row i with row j using transpose:
    # entry [i][j] of the matrix product is the dot product of rows i and j
    prod = tf.matmul(normalized, normalized,
                     adjoint_b = True # transpose second matrix
                     )
    dist = 1 - prod
    input_matrix = np.array(
        [[ 1, 1, 1 ],
         [ 0, 1, 1 ],
         [ 0, 0, 1 ],
        ],
        dtype = 'float32')
    print("input_matrix:")
    print(input_matrix)
    print("sklearn:")
    print(pairwise_distances(input_matrix, metric='cosine'))
    print("tensorflow:")
    print(sess.run(dist, feed_dict = { input : input_matrix }))
which gives me:
input_matrix:
[[ 1. 1. 1.]
[ 0. 1. 1.]
[ 0. 0. 1.]]
sklearn:
[[ 0. 0.18350345 0.42264974]
[ 0.18350345 0. 0.29289323]
[ 0.42264974 0.29289323 0. ]]
tensorflow:
[[ 5.96046448e-08 1.83503449e-01 4.22649741e-01]
[ 1.83503449e-01 5.96046448e-08 2.92893231e-01]
[ 4.22649741e-01 2.92893231e-01 0.00000000e+00]]
Note that this solution may not be optimal, as it calculates all entries of the (symmetric) result matrix, i.e. it does almost twice as many calculations as necessary. This is likely not a problem for small matrices; for large matrices a combination of loops may be faster.
Note also that this does not have a minibatch dimension, so it works for a single matrix only.
Elegant solution (output is the same as from scikit-learn pairwise_distances function):
def compute_cosine_distances(a, b):
    # a shape is n_a * dim
    # b shape is n_b * dim
    # result shape is n_a * n_b
    normalize_a = tf.nn.l2_normalize(a,1)
    normalize_b = tf.nn.l2_normalize(b,1)
    distance = 1 - tf.matmul(normalize_a, normalize_b, transpose_b=True)
    return distance
Test:
input_matrix = np.array([[1, 1, 1],
[0, 1, 1],
[0, 0, 1]], dtype = 'float32')
compute_cosine_distances(input_matrix, input_matrix)
output:
<tf.Tensor: id=442, shape=(3, 3), dtype=float32, numpy=
array([[5.9604645e-08, 1.8350345e-01, 4.2264974e-01],
[1.8350345e-01, 5.9604645e-08, 2.9289323e-01],
[4.2264974e-01, 2.9289323e-01, 0.0000000e+00]], dtype=float32)>
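For comparison, the same normalize-then-multiply idea can be written in plain NumPy. This is only a sketch: the function name pairwise_cosine_distances and the eps guard against zero-length rows are assumptions added for illustration. Working in float64 and clipping the result also avoids the tiny non-zero diagonal entries (5.96e-08 is 2^-24, i.e. float32 rounding) seen in the TensorFlow output above.

```python
import numpy as np

def pairwise_cosine_distances(x, eps=1e-12):
    """Return the M x M matrix of cosine distances between the rows of x."""
    x = np.asarray(x, dtype=np.float64)
    # Scale every row to unit length (eps avoids division by zero).
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.maximum(norms, eps)
    # Entry [i][j] of unit @ unit.T is the cosine similarity of rows i and j.
    dist = 1.0 - unit @ unit.T
    # Rounding can push values slightly outside [0, 2]; clip them back.
    return np.clip(dist, 0.0, 2.0)

input_matrix = np.array([[1, 1, 1],
                         [0, 1, 1],
                         [0, 0, 1]], dtype="float32")
print(pairwise_cosine_distances(input_matrix))
```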
I was looking for example code showing how to compute a singular value decomposition of a 2x2 matrix that can contain complex values.
For example, this would be useful for "repairing" user-entered matrices to be unitary. You just take u, s, v = svd(m) then omit the s part from the product: repaired = u * v.
Here's some python code that does the trick. It basically just extracts the complex parts then delegates to the solution from this answer for real 2x2 matrices.
I've written the code in python, using numpy. This is a bit ironic, because if you have numpy you should just use np.linalg.svd. Clearly this is intended as example code suitable for learning or translating into other languages in a pinch.
I'm also not an expert on numerical stability, so... buyer beware.
import numpy as np
import math
# Note: in practice in python just use np.linalg.svd instead
def singular_value_decomposition_complex_2x2(m):
    """
    Returns a singular value decomposition of the given 2x2 complex numpy
    matrix.
    :param m: A 2x2 numpy matrix with complex values.
    :returns: A tuple (U, S, V) where U*S*V ~= m, where U and V are complex
        2x2 unitary matrices, and where S is a 2x2 diagonal matrix with
        non-negative real values.
    """
    # Make top row non-imaginary and non-negative by column phasing.
    # m2 = m p = | >     >    |
    #            | ?+?i  ?+?i |
    p = phase_cancel_matrix(m[0, 0], m[0, 1])
    m2 = m * p
    # Cancel top-right value by rotation.
    # m3 = m p r = | ?+?i  0    |
    #              | ?+?i  ?+?i |
    r = rotation_matrix(math.atan2(m2[0, 1].real, m2[0, 0].real))
    m3 = m2 * r
    # Make bottom row non-imaginary and non-negative by column phasing.
    # m4 = m p r q = | ?+?i  0 |
    #                | >     > |
    q = phase_cancel_matrix(m3[1, 0], m3[1, 1])
    m4 = m3 * q
    # Cancel imaginary part of top left value by row phasing.
    # m5 = t m p r q = | >  0 |
    #                  | >  > |
    t = phase_cancel_matrix(m4[0, 0], 1)
    m5 = t * m4
    # All values are now real (also the top-right is zero), so delegate to a
    # singular value decomposition that works for real matrices.
    # t m p r q = u s v
    u, s, v = singular_value_decomposition_real_2x2(np.real(m5))
    # m = (t* u) s (v q* r* p*)
    return adjoint(t) * u, s, v * adjoint(q) * adjoint(r) * adjoint(p)
def singular_value_decomposition_real_2x2(m):
    """
    Returns a singular value decomposition of the given 2x2 real numpy matrix.
    :param m: A 2x2 numpy matrix with real values.
    :returns: A tuple (U, S, V) where U*S*V ~= m, where U and V are 2x2
        rotation matrices, and where S is a 2x2 diagonal matrix with
        non-negative real values.
    """
    a = m[0, 0]
    b = m[0, 1]
    c = m[1, 0]
    d = m[1, 1]
    t = a + d
    x = b + c
    y = b - c
    z = a - d
    theta_0 = math.atan2(x, t) / 2.0
    theta_d = math.atan2(y, z) / 2.0
    s_0 = math.sqrt(t**2 + x**2) / 2.0
    s_d = math.sqrt(z**2 + y**2) / 2.0
    return \
        rotation_matrix(theta_0 - theta_d), \
        np.matrix([[s_0 + s_d, 0], [0, s_0 - s_d]]), \
        rotation_matrix(theta_0 + theta_d)
def adjoint(m):
    """
    Returns the adjoint, i.e. the conjugate transpose, of the given matrix.
    When the matrix is unitary, the adjoint is also its inverse.
    :param m: A numpy matrix to transpose and conjugate.
    :return: A numpy matrix.
    """
    return m.conjugate().transpose()
def rotation_matrix(theta):
    """
    Returns a 2x2 unitary matrix corresponding to a 2d rotation by the given angle.
    :param theta: The angle, in radians, that the matrix should rotate by.
    :return: A 2x2 orthogonal matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    return np.matrix([[c, -s],
                      [s, c]])
def phase_cancel_complex(c):
    """
    Returns a unit complex number p that cancels the phase of the given complex
    number c. That is, c * p will be real and non-negative (approximately).
    :param c: A complex number.
    :return: A complex number on the complex unit circle.
    """
    m = abs(c)
    # For small values, where the division is in danger of exploding small
    # errors, use trig functions instead.
    if m < 0.0001:
        theta = math.atan2(c.imag, c.real)
        return math.cos(theta) - math.sin(theta) * 1j
    return (c / float(m)).conjugate()
def phase_cancel_matrix(p, q):
    """
    Returns a 2x2 unitary matrix M such that M cancels out the phases in the
    column {{p}, {q}} so that the result of M * {{p}, {q}} should be a vector
    with non-negative real values.
    :param p: A complex number.
    :param q: A complex number.
    :return: A 2x2 diagonal unitary matrix.
    """
    return np.matrix([[phase_cancel_complex(p), 0],
                      [0, phase_cancel_complex(q)]])
I tested the above code by fuzzing it with matrices filled with random values in [-10, 10] + [-10, 10]i, and checking that the decomposed factors had the right properties (i.e. unitary, diagonal, real, as appropriate) and that their product was (approximately) equal to the input.
But here's a simple smoke test:
m = np.matrix([[5, 10], [1j, -1]])
u, s, v = singular_value_decomposition_complex_2x2(m)
np.set_printoptions(precision=5, suppress=True)
print("M:", m, sep="\n")
print("U*S*V:", u*s*v, sep="\n")
print("U:", u, sep="\n")
print("S:", s, sep="\n")
print("V:", v, sep="\n")
print("M ~= U*S*V:", np.all(np.abs(m - u*s*v) < 0.1**14))
Which outputs the following. You can confirm that the factored S matches the svd from wolfram alpha, although of course the U and V can be (and are) different.
M:
[[ 5.+0.j 10.+0.j]
[ 0.+1.j -1.+0.j]]
U*S*V:
[[ 5.+0.j 10.+0.j]
[ 0.+1.j -1.-0.j]]
U:
[[-0.89081-0.44541j 0.08031+0.04016j]
[ 0.08979+0.j 0.99596+0.j ]]
S:
[[ 11.22533 0. ]
[ 0. 0.99599]]
V:
[[-0.39679+0.20639j -0.80157+0.39679j]
[ 0.40319+0.79837j -0.19359-0.40319j]]
M ~= U*S*V: True
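The "repair to unitary" use case mentioned at the start can be sketched with np.linalg.svd directly. The function name repair_to_unitary and the sample matrix are made up for illustration. Note that np.linalg.svd returns the third factor already conjugate-transposed, so m equals u @ diag(s) @ vh, matching the U*S*V convention used above.

```python
import numpy as np

def repair_to_unitary(m):
    """Return the unitary factor u @ vh of m, i.e. the SVD with s dropped."""
    u, _s, vh = np.linalg.svd(m)
    return u @ vh

# A slightly "off" matrix, e.g. typed in by hand with rounding errors.
m = np.array([[1.0, 0.1j],
              [0.05, 0.9 + 0.1j]])
repaired = repair_to_unitary(m)

# For a unitary matrix, the adjoint times the matrix is the identity.
print(np.round(repaired.conj().T @ repaired, 10))
```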
Given two sets of d-dimensional points, how can I most efficiently compute the pairwise squared Euclidean distance matrix in Matlab?
Notation:
Set one is given by a (numA,d)-matrix A and set two is given by a (numB,d)-matrix B. The resulting distance matrix shall be of the format (numA,numB).
Example points:
d = 4; % dimension
numA = 100; % number of set 1 points
numB = 200; % number of set 2 points
A = rand(numA,d); % set 1 given as matrix A
B = rand(numB,d); % set 2 given as matrix B
The usually given answer here is based on bsxfun (cf. e.g. [1]). My proposed approach is based on matrix multiplication and turns out to be much faster than any comparable algorithm I could find:
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
    helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
    helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 , B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';
Please note:
For constant d one can replace the for-loop by hardcoded implementations, e.g.
helpA = [ones(numA,1), -2*A(:,1), A(:,1).^2, ... % d == 2
         ones(numA,1), -2*A(:,2), A(:,2).^2 ]; % etc.
Evaluation:
%% create some points
d = 2; % dimension
numA = 20000;
numB = 20000;
A = rand(numA,d);
B = rand(numB,d);
%% pairwise distance matrix
% proposed method:
tic;
helpA = zeros(numA,3*d);
helpB = zeros(numB,3*d);
for idx = 1:d
    helpA(:,3*idx-2:3*idx) = [ones(numA,1), -2*A(:,idx), A(:,idx).^2 ];
    helpB(:,3*idx-2:3*idx) = [B(:,idx).^2 , B(:,idx), ones(numB,1)];
end
distMat = helpA * helpB';
toc;
% compare to pdist2:
tic;
pdist2(A,B).^2;
toc;
% compare to [1]:
tic;
bsxfun(@plus,dot(A,A,2),dot(B,B,2)')-2*(A*B');
toc;
% Another method: added 07/2014
% compare to ndgrid method (cf. Dan's comment)
tic;
[idxA,idxB] = ndgrid(1:numA,1:numB);
distMat = zeros(numA,numB);
distMat(:) = sum((A(idxA,:) - B(idxB,:)).^2,2);
toc;
Result:
Elapsed time is 1.796201 seconds.
Elapsed time is 5.653246 seconds.
Elapsed time is 3.551636 seconds.
Elapsed time is 22.461185 seconds.
For a more detailed evaluation w.r.t. dimension and number of data points, see the discussion in the comments below. It turns out that different algorithms should be preferred in different settings. In non-time-critical situations, just use the pdist2 version.
Further development:
One can think of replacing the squared euclidean by any other metric based on the same principle:
help = zeros(numA,numB,d);
for idx = 1:d
    help(:,:,idx) = [ones(numA,1), A(:,idx) ] * ...
                    [B(:,idx)' ; -ones(1,numB)];
end
distMat = sum(ANYFUNCTION(help),3);
Nevertheless, this is quite time consuming. For smaller d it could be useful to replace the 3-dimensional matrix help by d 2-dimensional matrices. Especially for d = 1, this gives a way to compute the pairwise differences by a simple matrix multiplication:
pairDiffs = [ones(numA,1), A ] * [B'; -ones(1,numB)];
Do you have any further ideas?
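To make the d = 1 case concrete, here is a small NumPy sketch of the same product (the sample values are made up). Multiplying the rows [1, a_i] by the columns [b_j; -1] gives 1*b_j + a_i*(-1), so entry (i, j) is b_j - a_i.

```python
import numpy as np

a = np.array([1.0, 4.0, 6.0])   # numA = 3 points
b = np.array([0.5, 2.0])        # numB = 2 points

left = np.column_stack([np.ones_like(a), a])   # shape (numA, 2): rows [1, a_i]
right = np.vstack([b, -np.ones_like(b)])       # shape (2, numB): columns [b_j; -1]

pair_diffs = left @ right                      # entry (i, j) is b_j - a_i
print(pair_diffs)
```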
For squared Euclidean distance one can also use the following formula:
||a-b||^2 = ||a||^2 + ||b||^2 - 2<a,b>
where <a,b> is the dot product between a and b:
nA = sum( A.^2, 2 ); %// squared norms of A's rows
nB = sum( B.^2, 2 ); %// squared norms of B's rows
distMat = bsxfun( @plus, nA, nB' ) - 2 * A * B' ;
Recently, I've been told that as of R2016b this method for computing squared Euclidean distance is faster than the accepted method.
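The same expansion translates directly to NumPy broadcasting. A minimal sketch, with a made-up function name pairwise_sq_dists; the final np.maximum is an added safeguard, since rounding in the subtraction can produce tiny negative values for nearly identical points.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """Squared Euclidean distances between rows of A (numA, d) and B (numB, d)."""
    nA = np.sum(A ** 2, axis=1)[:, None]   # squared norms of A's rows, shape (numA, 1)
    nB = np.sum(B ** 2, axis=1)[None, :]   # squared norms of B's rows, shape (1, numB)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 <a, b>
    d = nA + nB - 2.0 * (A @ B.T)
    return np.maximum(d, 0.0)

rng = np.random.default_rng(0)
A = rng.random((100, 4))
B = rng.random((200, 4))
dist_mat = pairwise_sq_dists(A, B)
print(dist_mat.shape)  # (100, 200)
```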