Singular value decomposition of complex 2x2 matrix - algorithm

I was looking for example code showing how to compute a singular value decomposition of a 2x2 matrix that can contain complex values.
For example, this would be useful for "repairing" user-entered matrices to be unitary. You just take u, s, v = svd(m) then omit the s part from the product: repaired = u * v.
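For instance, a minimal sketch of that repair step with numpy (the function name repair_to_unitary is mine, purely for illustration):

import numpy as np

def repair_to_unitary(m):
    # Dropping the singular values from u*s*v leaves u*v, which is the
    # unitary matrix closest to m (in the Frobenius-norm sense).
    u, s, v = np.linalg.svd(m)
    return u.dot(v)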

Here's some python code that does the trick. It basically just cancels out the complex phases to reduce the problem to a real matrix, then delegates to the solution from this answer for real 2x2 matrices.
I've written the code in python, using numpy. This is a bit ironic, because if you have numpy you should just use np.linalg.svd. Clearly this is intended as example code suitable for learning or translating into other languages in a pinch.
I'm also not an expert on numerical stability, so... buyer beware.
import numpy as np
import math

# Note: in practice in python just use np.linalg.svd instead

def singular_value_decomposition_complex_2x2(m):
    """
    Returns a singular value decomposition of the given 2x2 complex numpy
    matrix.

    :param m: A 2x2 numpy matrix with complex values.
    :returns: A tuple (U, S, V) where U*S*V ~= m, where U and V are complex
        2x2 unitary matrices, and where S is a 2x2 diagonal matrix with
        non-negative real values.
    """
    # Make top row non-imaginary and non-negative by column phasing.
    # m2 = m p = |  >     >   |
    #            | ?+?i  ?+?i |
    p = phase_cancel_matrix(m[0, 0], m[0, 1])
    m2 = m * p

    # Cancel top-right value by rotation.
    # m3 = m p r = | ?+?i   0   |
    #              | ?+?i  ?+?i |
    r = rotation_matrix(math.atan2(m2[0, 1].real, m2[0, 0].real))
    m3 = m2 * r

    # Make bottom row non-imaginary and non-negative by column phasing.
    # m4 = m p r q = | ?+?i  0 |
    #                |  >    > |
    q = phase_cancel_matrix(m3[1, 0], m3[1, 1])
    m4 = m3 * q

    # Cancel imaginary part of top left value by row phasing.
    # m5 = t m p r q = | >  0 |
    #                  | >  > |
    t = phase_cancel_matrix(m4[0, 0], 1)
    m5 = t * m4

    # All values are now real (also the top-right is zero), so delegate to a
    # singular value decomposition that works for real matrices.
    # t m p r q = u s v
    u, s, v = singular_value_decomposition_real_2x2(np.real(m5))

    # m = (t* u) s (v q* r* p*)
    return adjoint(t) * u, s, v * adjoint(q) * adjoint(r) * adjoint(p)
def singular_value_decomposition_real_2x2(m):
    """
    Returns a singular value decomposition of the given 2x2 real numpy matrix.

    :param m: A 2x2 numpy matrix with real values.
    :returns: A tuple (U, S, V) where U*S*V ~= m, where U and V are 2x2
        rotation matrices, and where S is a 2x2 diagonal matrix with
        non-negative real values.
    """
    a = m[0, 0]
    b = m[0, 1]
    c = m[1, 0]
    d = m[1, 1]

    t = a + d
    x = b + c
    y = b - c
    z = a - d

    theta_0 = math.atan2(x, t) / 2.0
    theta_d = math.atan2(y, z) / 2.0

    s_0 = math.sqrt(t**2 + x**2) / 2.0
    s_d = math.sqrt(z**2 + y**2) / 2.0

    return \
        rotation_matrix(theta_0 - theta_d), \
        np.mat([[s_0 + s_d, 0], [0, s_0 - s_d]]), \
        rotation_matrix(theta_0 + theta_d)
def adjoint(m):
    """
    Returns the adjoint, i.e. the conjugate transpose, of the given matrix.
    When the matrix is unitary, the adjoint is also its inverse.

    :param m: A numpy matrix to transpose and conjugate.
    :return: A numpy matrix.
    """
    return m.conjugate().transpose()

def rotation_matrix(theta):
    """
    Returns a 2x2 unitary matrix corresponding to a 2d rotation by the given angle.

    :param theta: The angle, in radians, that the matrix should rotate by.
    :return: A 2x2 orthogonal matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    return np.mat([[c, -s],
                   [s, c]])

def phase_cancel_complex(c):
    """
    Returns a unit complex number p that cancels the phase of the given complex
    number c. That is, c * p will be real and non-negative (approximately).

    :param c: A complex number.
    :return: A complex number on the complex unit circle.
    """
    m = abs(c)
    # For small values, where the division is in danger of exploding small
    # errors, use trig functions instead.
    if m < 0.0001:
        theta = math.atan2(c.imag, c.real)
        return math.cos(theta) - math.sin(theta) * 1j
    return (c / float(m)).conjugate()

def phase_cancel_matrix(p, q):
    """
    Returns a 2x2 unitary matrix M such that M cancels out the phases in the
    column {{p}, {q}} so that the result of M * {{p}, {q}} should be a vector
    with non-negative real values.

    :param p: A complex number.
    :param q: A complex number.
    :return: A 2x2 diagonal unitary matrix.
    """
    return np.mat([[phase_cancel_complex(p), 0],
                   [0, phase_cancel_complex(q)]])
I tested the above code by fuzzing it with matrices filled with random values in [-10, 10] + [-10, 10]i, and checking that the decomposed factors had the right properties (i.e. unitary, diagonal, real, as appropriate) and that their product was (approximately) equal to the input.
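A minimal sketch of such a fuzz test, reusing the functions above (the helper names and the tolerance here are illustrative, not the exact harness):

def random_complex_2x2():
    re = np.random.uniform(-10, 10, (2, 2))
    im = np.random.uniform(-10, 10, (2, 2))
    return np.mat(re + 1j * im)

def check_decomposition(m, tol=1e-9):
    u, s, v = singular_value_decomposition_complex_2x2(m)
    assert np.allclose(u * adjoint(u), np.eye(2), atol=tol)  # U is unitary
    assert np.allclose(v * adjoint(v), np.eye(2), atol=tol)  # V is unitary
    assert abs(s[0, 1]) <= tol and abs(s[1, 0]) <= tol       # S is diagonal
    assert s[0, 0] >= -tol and s[1, 1] >= -tol               # S is non-negative
    assert np.allclose(u * s * v, m, atol=tol)               # U*S*V recovers m

for _ in range(1000):
    check_decomposition(random_complex_2x2())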
But here's a simple smoke test:
m = np.mat([[5, 10], [1j, -1]])
u, s, v = singular_value_decomposition_complex_2x2(m)
np.set_printoptions(precision=5, suppress=True)
print "M:\n", m
print "U*S*V:\n", u*s*v
print "U:\n", u
print "S:\n", s
print "V:\n", v
print "M ~= U*S*V:", np.all(np.abs(m - u*s*v) < 0.1**14)
Which outputs the following. You can confirm that the factored S matches the svd from wolfram alpha, although of course the U and V can be (and are) different.
M:
[[ 5.+0.j 10.+0.j]
[ 0.+1.j -1.+0.j]]
U*S*V:
[[ 5.+0.j 10.+0.j]
[ 0.+1.j -1.-0.j]]
U:
[[-0.89081-0.44541j 0.08031+0.04016j]
[ 0.08979+0.j 0.99596+0.j ]]
S:
[[ 11.22533 0. ]
[ 0. 0.99599]]
V:
[[-0.39679+0.20639j -0.80157+0.39679j]
[ 0.40319+0.79837j -0.19359-0.40319j]]
M ~= U*S*V: True

Related

QR factorization of two vertically stacked upper triangular matrices using Givens Rotation

In this problem, I am trying to compute the QR factorization of two vertically stacked upper triangular matrices using Givens rotations. So, I'm trying to zero out all the non-zero entries of the r_2 matrix column by column. Here is the code:
import numpy as np
import scipy.linalg as la

np.random.seed(10)
B = np.random.rand(4,2)
q_1, r_1 = la.qr(B[:2, :2])
q_2, r_2 = la.qr(B[2:4, :2])
R_12 = np.vstack((r_1, r_2))

def givens(a, b):
    c = a / np.sqrt(a**2 + b**2)
    s = b / np.sqrt(a**2 + b**2)
    return c, s

m, n = R_12.shape
Q5 = np.eye(m)
R_tmp = R_12.copy()
for j in range(n):  # columns
    for i in range(j+1):  # rows
        c, s = givens(R_tmp[j,j], R_tmp[n+i,j])
        Q = np.eye(m)
        Q[j,j] = c
        Q[j, n+i] = s
        Q[n+i, j] = -s
        Q[n+i, n+i] = c
        Q5 = Q5 @ Q
        R_tmp = Q @ R_tmp
print(R_tmp)
The final upper triangular output R_tmp looks like this
[[ 1.13321832e+00 6.64638268e-01]
[-9.01587953e-19 8.65063215e-01]
[-3.70107044e-18 6.42047272e-19]
[ 9.38337336e-19 8.16113029e-17]]
But checking with the correct answer from the qr function in scipy
q5, r5 =la.qr(R_12)
print(r5, "\n")
it gives
[[ 1.13321832 0.66463827]
[ 0. -0.86506322]
[ 0. 0. ]
[ 0. 0. ]]
So, the first row of my R_tmp matrix looks fine, but there is an extra negative sign in the second row of my R_tmp matrix compared to the correct answer given by scipy. I'm stuck trying to figure out where exactly the problem is that causes the extra negative sign in the second row. Any help would be appreciated.
You can flip the sign of any row of R as long as you flip the sign of the corresponding column of Q. The new factorisation is still a valid QR factorisation of your original matrix. This can be seen by taking a diagonal matrix S with diagonal elements in {-1,+1}. Premultiplying a matrix by S changes the signs of the rows corresponding to -1 elements of S, while postmultiplying changes the sign of the corresponding columns. Then:
A = QR = Q I R = Q S S R = (Q S) (S R) = Q2 R2.
The QR factorisation of a full-rank square matrix (as well as the thin QR factorisation of a full-rank rectangular matrix) is unique only if we follow the convention of flipping the signs to make the diagonal entries of R positive.
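As a small illustration of that convention (a sketch, not part of either answer; fix_qr_signs is just an illustrative name):

import numpy as np

def fix_qr_signs(q, r):
    # Flip the sign of every row of R whose diagonal entry is negative,
    # together with the matching column of Q; Q @ R is unchanged because
    # the two flips cancel (S @ S = I for a diagonal sign matrix S).
    q, r = q.copy(), r.copy()
    k = min(r.shape)
    signs = np.where(np.diagonal(r) < 0, -1.0, 1.0)
    q[:, :k] *= signs
    r[:k, :] *= signs[:, None]
    return q, r

Applied to scipy's q5, r5 from the question, this flips the second row of R so its diagonal entry becomes +0.86506322, matching the Givens-based R_tmp up to rounding.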
Your updating of the factorization is wrong in one point. If Q, R are the intermediate factors, with R not yet fully triangular, and G is the Givens rotation for the current step, then you want
A = Q @ R = (Q @ G.T) @ (G @ R)
The factors that you insert inside the product have to cancel to the identity matrix, using the fact that the inverse of any orthogonal matrix is also its transpose.
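In the question's loop, that corresponds to accumulating the transpose of each rotation, e.g. (a sketch using the question's variable names):

        Q5 = Q5 @ Q.T      # accumulate the inverse (transpose) of the rotation
        R_tmp = Q @ R_tmp  # so that Q5 @ R_tmp equals R_12 after every step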
The standard for QR decompositions is the use of Householder reflectors. This variant does not guarantee the signs on the R diagonal. In the extreme case, the QR decomposition of the identity matrix can come out as Q = R = -I.

Perceptron with weights of bounded condition number

Let N be a (linear) single-layer perceptron with weight matrix w of dimension nxn.
I want to train N under the Boolean constraint that the condition number k(w) of the weights w remain below a given threshold k_0 at each step of the optimisation.
Is there a standard way to implement this constraint (in pytorch, say)?
After each optimizer step, go through the list of parameters and recondition all matrices:
(code looked at for a few seconds, but not tested)
import math
import torch

def recondition_(x, max_cond):  # would need to be fixed for non-square x
    u, s, vh = torch.linalg.svd(x)
    curr_cond = s[0] / s[-1]
    if curr_cond > max_cond:
        ratio = curr_cond / max_cond
        mult = torch.linspace(0, math.log(ratio), len(s)).exp()
        s = mult * s
        x[:] = torch.mm(u, torch.mm(torch.diag(s), vh))
Training loop:
...
optimizer.step()
with torch.no_grad():
    for p in model.parameters():
        if p.dim() == 2:
            recondition_(p, max_cond)
...
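To see the effect, the 2-norm condition numbers can be inspected right after the update; a small sketch (torch.linalg.cond returns sigma_max / sigma_min by default):

with torch.no_grad():
    for p in model.parameters():
        if p.dim() == 2:
            print(torch.linalg.cond(p))  # condition number after reconditioning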

Boolean expression for modified Queens problem

I saw the boolean expressions for the N Queens problem from here.
My modified N queens rules are simpler:
For a p*p chessboard I want to place N queens in such a way that
queens are placed adjacently, with rows filled first;
the p*p chessboard size is adjusted until it can hold all N queens.
For example, say N = 17, then we need a 5*5 chessboard and the placement will be:
Q_Q_Q_Q_Q
Q_Q_Q_Q_Q
Q_Q_Q_Q_Q
Q_Q_*_*_*
*_*_*_*_*
The question is: I am trying to come up with a Boolean expression for this problem.
This problem can be solved using the Python packages humanize and omega.
"""Solve variable size square fitting."""
import humanize
from omega.symbolic.fol import Context
def pick_chessboard(q):
ctx = Context()
# compute size of chessboard
#
# picking a domain for `p`
# requires partially solving the
# problem of computing `p`
ctx.declare(p=(0, q))
s = f'''
(p * p >= {q}) # chessboard fits the queens, and
/\ ((p - 1) * (p - 1) < {q}) # is the smallest such board
'''
u = ctx.add_expr(s)
d, = list(ctx.pick_iter(u)) # assert unique solution
p = d['p']
print(f'chessboard size: {p}')
# compute number of full rows
ctx.declare(x=(0, p))
s = f'x = {q} / {p}' # integer division
u = ctx.add_expr(s)
d, = list(ctx.pick_iter(u))
r = d['x']
print(f'{r} rows are full')
# compute number of queens on the last row
s = f'x = {q} % {p}' # modulo
u = ctx.add_expr(s)
d, = list(ctx.pick_iter(u))
n = d['x']
k = r + 1
kword = humanize.ordinal(k)
print(f'{n} queens on the {kword} row')
if __name__ == '__main__':
q = 10 # number of queens
pick_chessboard(q)
Representing multiplication (and integer division and modulo) with binary decision diagrams has complexity exponential in the number of variables, as proved in: https://doi.org/10.1109/12.73590
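For comparison, the same three quantities follow from plain integer arithmetic (a sketch without BDDs; the function name is mine, and math.isqrt requires Python 3.8+):

import math

def pick_chessboard_plain(q):
    p = math.isqrt(q - 1) + 1  # smallest p with p * p >= q (for q >= 1)
    full_rows = q // p         # rows completely filled with queens
    last_row = q % p           # queens on the next, partially filled row
    return p, full_rows, last_row

print(pick_chessboard_plain(17))  # (5, 3, 2), matching the 5*5 example above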

How to perform matrix by vector multiplication with sympy?

I have:
a vector of type <class 'sympy.vector.vector.VectorMul'>; and
a matrix of type <class 'sympy.matrices.dense.MutableDenseMatrix'>
I would like to multiply the matrix by the vector in order to produce a vector.
Can I perform this operation conveniently or do I need to do some extra manipulation first?
For reference I am attempting to get the symbolic result of a rotation matrix applied to a vector.
Also below, is some of my code that deals with the above matrix and vector.
from sympy import symbols, sin, cos, Matrix
from sympy.vector import CoordSys3D

σ, θ, γ, λ, a, b, c, a_v, b_v, c_v = symbols('σ, θ, γ, λ, a, b, c, a_v, b_v, c_v')
σ = sin(θ)
γ = cos(θ)
λ = 1 - γ

N = CoordSys3D('N')
u = a*N.i + b*N.j + c*N.k  # Axis of rotation

R = Matrix([
    [a*a*λ + γ, a*b*λ - c*σ, a*c*λ + b*σ],
    [b*a*λ + c*σ, b*b*λ + γ, b*c*λ - a*σ],
    [c*a*λ - b*σ, c*b*λ + a*σ, c*c*λ + γ],
])

# Input vector prior to rotation
v = a_v*N.i + b_v*N.j + c_v*N.k

# How to calculate the post rotation output vector w = Rv?
In summary, is there a built-in mechanism in sympy for matrix-by-vector multiplication?
Although I didn't find a function to do what I wanted, this code achieved the same result. I'm posting it here in case it is useful for others.
w = R * Matrix([v.coeff(N.i), v.coeff(N.j), v.coeff(N.k)])
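A possibly more direct route is the conversion helpers in sympy.vector (to_matrix and matrix_to_vector; as far as I know these exist, but double-check against your SymPy version):

from sympy.vector import matrix_to_vector

w_col = R * v.to_matrix(N)      # 3x1 column Matrix of the components of v in N
w = matrix_to_vector(w_col, N)  # convert the result back to a Vector in N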
In the current version of SymPy (1.11), you can calculate the matrix-vector product by using the matmul operator (@).
The following code works for me:
from sympy import symbols, sin, cos, Matrix
x, y, z, kx = symbols('x, y, z, kx')
v = Matrix([x, y, z])
Kx = Matrix([[1, 0,        0      ],
             [0, cos(kx), -sin(kx)],
             [0, sin(kx),  cos(kx)]])
product = Kx @ v
# Don't:
# product = v @ Kx

Pairwise Cosine Similarity using TensorFlow

How can we efficiently calculate pairwise cosine distances in a matrix using TensorFlow? Given an MxN matrix, the result should be an MxM matrix, where the element at position [i][j] is the cosine distance between i-th and j-th rows/vectors in the input matrix.
This can be done with Scikit-Learn fairly easily as follows:
from sklearn.metrics.pairwise import pairwise_distances
pairwise_distances(input_matrix, metric='cosine')
Is there an equivalent method in TensorFlow?
There is an answer for getting a single cosine distance here: https://stackoverflow.com/a/46057597/288875. This is based on tf.losses.cosine_distance.
Here is a solution which does this for matrices:
import tensorflow as tf
import numpy as np
with tf.Session() as sess:
    M = 3

    # input
    input = tf.placeholder(tf.float32, shape = (M, M))

    # normalize each row
    normalized = tf.nn.l2_normalize(input, dim = 1)

    # multiply row i with row j using transpose
    # element wise product
    prod = tf.matmul(normalized, normalized,
                     adjoint_b = True  # transpose second matrix
                     )

    dist = 1 - prod

    input_matrix = np.array(
        [[ 1, 1, 1 ],
         [ 0, 1, 1 ],
         [ 0, 0, 1 ]],
        dtype = 'float32')

    print "input_matrix:"
    print input_matrix

    from sklearn.metrics.pairwise import pairwise_distances
    print "sklearn:"
    print pairwise_distances(input_matrix, metric='cosine')

    print "tensorflow:"
    print sess.run(dist, feed_dict = { input : input_matrix })
which gives me:
input_matrix:
[[ 1. 1. 1.]
[ 0. 1. 1.]
[ 0. 0. 1.]]
sklearn:
[[ 0. 0.18350345 0.42264974]
[ 0.18350345 0. 0.29289323]
[ 0.42264974 0.29289323 0. ]]
tensorflow:
[[ 5.96046448e-08 1.83503449e-01 4.22649741e-01]
[ 1.83503449e-01 5.96046448e-08 2.92893231e-01]
[ 4.22649741e-01 2.92893231e-01 0.00000000e+00]]
Note that this solution may not be the optimal one, as it calculates all entries of the (symmetric) result matrix, i.e. it does almost twice the necessary work. This is likely not a problem for small matrices; for large matrices a combination of loops may be faster.
Note also that this does not have a minibatch dimension, so it works for a single matrix only.
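If a minibatch dimension is needed, the same idea should carry over by normalizing along the last axis and letting tf.matmul batch over the leading dimension; a rough sketch (TF2-style eager code, function name is mine):

import tensorflow as tf

def batched_cosine_distances(x):
    # x has shape (batch, M, N); the result has shape (batch, M, M), where
    # entry [b, i, j] is the cosine distance between rows i and j of x[b].
    normalized = tf.nn.l2_normalize(x, axis=-1)
    return 1 - tf.matmul(normalized, normalized, transpose_b=True)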
Elegant solution (output is the same as from scikit-learn pairwise_distances function):
def compute_cosine_distances(a, b):
    # a shape is n_a * dim
    # b shape is n_b * dim
    # result shape is n_a * n_b
    normalize_a = tf.nn.l2_normalize(a, 1)
    normalize_b = tf.nn.l2_normalize(b, 1)
    distance = 1 - tf.matmul(normalize_a, normalize_b, transpose_b=True)
    return distance
test
input_matrix = np.array([[1, 1, 1],
                         [0, 1, 1],
                         [0, 0, 1]], dtype = 'float32')
compute_cosine_distances(input_matrix, input_matrix)
output:
<tf.Tensor: id=442, shape=(3, 3), dtype=float32, numpy=
array([[5.9604645e-08, 1.8350345e-01, 4.2264974e-01],
[1.8350345e-01, 5.9604645e-08, 2.9289323e-01],
[4.2264974e-01, 2.9289323e-01, 0.0000000e+00]], dtype=float32)>
