Numpy Hermitian Matrix class - performance

Are you aware of something like a Hermitian matrix class in NumPy? I'd like to optimize matrix calculations like
B = U * A * U.H
where A (and thus B) is Hermitian. Without that information, all matrix elements of B are calculated, although it should be possible to save a factor of about 2 here. Am I missing something?
The method I need should take the upper/lower triangle of A and the full matrix U, and return the upper/lower triangle of B.

I don't think there exists a method for your specific problem, but with a little thought you might be able to build an algorithm from the low-level BLAS routines that are wrapped in SciPy. For example, dgemm, dsymm, and dtrmm do general, symmetric, and triangular matrix products respectively. Here's an example of using them:
import numpy as np
from scipy.linalg.blas import dgemm, dsymm, dtrmm
A = np.random.rand(10, 10)
B = np.random.rand(10, 10)
S = np.dot(A, A.T) # symmetric matrix
T = np.triu(S) # upper triangular matrix
# normal matrix-matrix product
assert np.allclose(dgemm(1, A, B), np.dot(A, B))
# symmetric mat-mat product using only upper-triangle
assert np.allclose(dsymm(1, T, B), np.dot(S, B))
# upper-triangular mat-mat product
assert np.allclose(dtrmm(1, T, B), np.dot(T, B))
There are many other low-level BLAS routines available; I find the NETLIB page to be a good resource to learn what they do. You may be able to cleverly use some combination of the available routines to efficiently solve the problem you have in mind.
Edit: it looks like there are LAPACK routines that quickly compute exactly what you want: dsytrd or zhetrd, but unfortunately these don't appear to be wrapped directly in scipy.linalg.lapack, though scipy does provide cython wrappers for them. Best of luck!
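As one possible combination of such routines for the original problem, here is a minimal NumPy sketch, not a tuned implementation. It assumes A is split as A = T + T^H, with T the upper triangle of A and its diagonal halved, so that U A U^H = W U^H + U W^H with W = U T. The first product is triangular and the sum is a Hermitian rank-2k update, which is the kind of operation the BLAS routines trmm and her2k cover. Plain NumPy products are used below only to check the algebra, and the function name congruence_from_upper is made up for this illustration.

```python
import numpy as np

def congruence_from_upper(U, A_upper):
    """Return U @ A @ U^H for Hermitian A, reading only A's upper triangle."""
    # Split A = T + T^H: T is upper triangular with a halved diagonal.
    T = np.triu(A_upper).astype(complex)
    T[np.diag_indices_from(T)] *= 0.5
    W = U @ T                  # triangular product (the trmm-shaped step)
    M = W @ U.conj().T         # her2k would form one triangle of M + M^H directly
    return M + M.conj().T

rng = np.random.default_rng(0)
n = 5
U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = X + X.conj().T             # Hermitian test matrix
B = congruence_from_upper(U, np.triu(A))
assert np.allclose(B, U @ A @ U.conj().T)
```

Whether a BLAS-based version of this is actually faster than two general products would need to be measured for the matrix sizes at hand.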

I needed tridiagonal reduction of a symmetric/Hermitian matrix A,
T = Q^H * A * Q
– presumably OP's underlying problem – and I've just submitted a pull request to SciPy for properly interfacing LAPACK's {s,d}sytrd (for real symmetric matrices) and {c,z}hetrd (for Hermitian matrices). All routines use either only the upper or the lower triangular part of the matrix.
Once this has been merged, it can be used like
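import numpy as np
from scipy.linalg import get_lapack_funcs
dtype = np.float64  # real symmetric case
n = 3
A = np.zeros((n, n), dtype=dtype)
A[np.triu_indices_from(A)] = np.arange(1, n*(n+1)//2 + 1, dtype=dtype)
# look up the routines matching A's dtype (dsytrd and dsytrd_lwork here)
sytrd, sytrd_lwork = get_lapack_funcs(('sytrd', 'sytrd_lwork'), (A,))
# query lwork -- optional
lwork, info = sytrd_lwork(n)
assert info == 0
data, d, e, tau, info = sytrd(A, lwork=lwork)
assert info == 0
The vector d now contains the main diagonal of T, and e contains the off-diagonal, which is the same above and below the main diagonal.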
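To make the roles of d and e concrete, the tridiagonal matrix T can be assembled from them as sketched below. The numeric values are placeholders chosen for illustration, not output of sytrd.

```python
import numpy as np

# Placeholder values; in practice d and e come from sytrd/hetrd.
d = np.array([1.0, 2.0, 3.0, 4.0])   # main diagonal, length n
e = np.array([0.5, -1.0, 0.25])      # off-diagonal, length n - 1

# Put e both directly above and directly below the main diagonal.
T = np.diag(d) + np.diag(e, k=1) + np.diag(e, k=-1)

# T is similar to A, so its eigenvalues are those of the original matrix.
eigenvalues = np.linalg.eigvalsh(T)
```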


Computing a single element of the adjugate or inverse of a symbolic binary matrix

I'm trying to get a single element of an adjugate A_adj of a matrix A, both of which need to be symbolic expressions, where the symbols x_i are binary and the matrix A is symmetric and sparse. Python's sympy works great for small problems:
from sympy import zeros, symbols
size = 4
A = zeros(size,size)
x_i = [x for x in symbols(f'x0:{size}')]
for i in range(size-1):
    A[i,i] += 0.5*x_i[i]
    A[i+1,i+1] += 0.5*x_i[i]
    A[i,i+1] = A[i+1,i] = -0.3*(i+1)*x_i[i]
A_adj_0 = A[1:,1:].det()
This calculates the first element A_adj_0 of the cofactor matrix (which is the corresponding minor) and correctly gives me 0.125x_0x_1x_2 - 0.28x_0x_2^2 - 0.055x_1^2x_2 - 0.28x_1x_2^2, which is the expression I need, but there are two issues:
1. This is completely infeasible for larger matrices (I need this for sizes of ~100).
2. The x_i are binary variables (i.e. either 0 or 1) and there seems to be no way for sympy to simplify expressions of binary variables, i.e. to simplify polynomials using x_i^n = x_i.
The first issue can be partly addressed by instead solving a linear equation system Ay = b, where b is set to the first basis vector [1, 0, 0, 0], such that y is the first column of the inverse of A. The first entry of y is the first element of the inverse of A:
b = zeros(size,1)
b[0] = 1
y = A.LUsolve(b)
s = {x_i[i]: 1 for i in range(size)}
print(y[0].subs(s) * A.subs(s).det())
The problem here is that the expression for the first element of y is extremely complicated, even after using simplify() and so on. It would be a very simple expression if binary expressions were simplified as mentioned in point 2 above. This method is faster, but still infeasible for larger matrices.
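The relation behind this approach is adj(A) = det(A) * A^(-1), so the (0, 0) minor equals y[0] * det(A). Below is a small symbolic check of that relation, using exact rationals and a 3x3 instance of the same matrix pattern. The size and the variable names are chosen for illustration only.

```python
from sympy import Rational, cancel, symbols, zeros

size = 3
x = symbols(f'x0:{size}')
A = zeros(size, size)
for i in range(size - 1):
    A[i, i] += Rational(1, 2) * x[i]
    A[i + 1, i + 1] += Rational(1, 2) * x[i]
    A[i, i + 1] = A[i + 1, i] = -Rational(3, 10) * (i + 1) * x[i]

minor = A[1:, 1:].det()        # (0, 0) entry of the adjugate
b = zeros(size, 1)
b[0] = 1
y = A.LUsolve(b)               # first column of the inverse of A

# y[0] * det(A) and the minor should agree as rational functions.
difference = cancel(y[0] * A.det() - minor)
assert difference == 0
```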
This boils down to my actual question:
Is there an efficient way to compute a single element of the adjugate of a sparse and symmetric symbolic matrix, where the symbols are binary values?
I'm open to using other software as well.
Addendum 1:
It seems simplifying binary expressions in sympy is possible with a simple custom substitution which I wasn't aware of:
A_subs = A_adj_0
for i in range(size):
    A_subs = A_subs.subs(x_i[i]*x_i[i], x_i[i])
You should make sure to use Rational rather than floats in sympy, so S(1)/2 or Rational(1, 2) rather than 0.5.
There is a new (undocumented and for the moment internal) implementation of matrices in sympy called DomainMatrix. It always produces polynomial results in a fully expanded form, and I expect it to be much faster for this kind of problem. It still seems to be fairly slow here, though, because it is not sparse internally (yet; that will probably change in the next release) and it does not take advantage of the simplification from the symbols being binary-valued. It can be made to work over GF(2), but not with symbols that are assumed to be in GF(2), which is something different.
In case it is helpful though this is how you would use it in sympy 1.7.1:
from sympy import zeros, symbols, Rational
from sympy.polys.domainmatrix import DomainMatrix
size = 10
A = zeros(size,size)
x_i = [x for x in symbols(f'x0:{size}')]
for i in range(size-1):
    A[i,i] += Rational(1, 2)*x_i[i]
    A[i+1,i+1] += Rational(1, 2)*x_i[i]
    A[i,i+1] = A[i+1,i] = -Rational(3, 10)*(i+1)*x_i[i]
# Convert to DomainMatrix:
dM = DomainMatrix.from_list_sympy(size-1, size-1, A[1:, 1:].tolist())
# Compute determinant and convert back to normal sympy expression:
# Could also use dM.det().as_expr() although it might be slower
A_adj_0 = dM.charpoly()[-1].as_expr()
# Reduce powers:
A_adj_0 = A_adj_0.replace(lambda e: e.is_Pow, lambda e: e.args[0])

numerical diagonalization of a unitary matrix

To numerically diagonalize a unitary matrix I use the LAPACK routine zgeev.
The problem is: In case of degeneracies the degenerate subspace is not orthonormalized, since the routine is for general matrices.
However, since in my case the matrices are unitary, the basis can always be orthonormalized. Is there a better solution than applying the QR algorithm to the degenerate subspace afterwards?
Short answer: Schur decomposition!
If a square matrix A is complex, then its Schur factorization is A=ZTZ*, where Z is unitary and T is upper triangular.
If A happens to be unitary, T must also be unitary. Since T is both unitary and triangular, it is diagonal (a triangular matrix with orthonormal columns cannot have non-zero off-diagonal entries).
Let's consider the vectors Z.e_i, where e_i are the vectors of the canonical basis. These vectors obviously form an orthonormal basis. Moreover, these vectors are eigenvectors of the matrix A.
Hence, the columns of the unitary matrix Z are eigenvectors of the unitary matrix A and form an orthonormal basis.
As a consequence, computing a Schur decomposition of a unitary matrix is equivalent to finding an orthonormal basis of its eigenvectors.
LAPACK's ZGEESX computes the eigenvalues, the Schur form, and, optionally, the matrix of Schur vectors for general (GE) matrices.
The resulting T can also be tested to check that A is unitary.
Here is a piece of Python code testing it; note that scipy.linalg.schur uses LAPACK's zgees for the Schur decomposition. I used hpaulj's code to generate a random unitary matrix, as shown in "How to create random orthonormal matrix in python numpy".
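import numpy as np
import scipy.linalg
# from hpaulj
def rvs(dim=3):
    random_state = np.random
    H = np.eye(dim)
    D = np.ones((dim,))
    for n in range(1, dim):
        x = random_state.normal(size=(dim-n+1,))
        D[n-1] = np.sign(x[0])
        x[0] -= D[n-1]*np.sqrt((x*x).sum())
        # Householder transformation
        Hx = (np.eye(dim-n+1) - 2.*np.outer(x, x)/(x*x).sum())
        mat = np.eye(dim)
        mat[n-1:, n-1:] = Hx
        H = np.dot(H, mat)
    # Fix the last sign such that the determinant is 1
    D[-1] = (-1)**(1-(dim % 2))*D.prod()
    # Equivalent to np.dot(np.diag(D), H) but faster, apparently
    H = (D*H.T).T
    return H
n = 4  # matrix size; the value is not given in the original and is assumed here
A = rvs(n)
A = A.astype(complex)
# complex Schur decomposition A = Z T Z^H
T, Z = scipy.linalg.schur(A, output='complex')
#print(T)
normT = np.linalg.norm(T, ord=None)  # Frobenius norm
# store the diagonal (the eigenvalues), then zero it to isolate the off-diagonal part of T
eigenvalues = []
for i in range(n):
    eigenvalues.append(T[i, i])
    T[i, i] = 0
normTu = np.linalg.norm(T, ord=None)
print('must be very low if A is unitary: ', normTu/normT)
#print(Z)
for i in range(n):
    v = Z[:, i]
    w = np.dot(A, v) - eigenvalues[i]*v
    print(i, 'must be very low if column i of Z is eigenvector of A: ', np.linalg.norm(w, ord=None)/np.linalg.norm(v, ord=None))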

Latent factor recovery with probabilistic matrix factorization using Edward

I implemented a probabilistic matrix factorization model (R = U'V) following the example in Edward's repo:
# data
U_true = np.random.randn(D, N)
V_true = np.random.randn(D, M)
R_true = np.dot(np.transpose(U_true), V_true) + np.random.normal(0, 0.1, size=(N, M))
# model
I = tf.placeholder(tf.float32, [N, M])
U = Normal(loc=tf.zeros([D, N]), scale=tf.ones([D, N]))
V = Normal(loc=tf.zeros([D, M]), scale=tf.ones([D, M]))
R = Normal(loc=tf.matmul(tf.transpose(U), V), scale=tf.ones([N, M]))
I get good performance when predicting the data in matrix R. However, when I evaluate the inferred traits in U and V, the error varies a lot and can get very high.
I tried a latent space of small dimension (e.g. 2) and checked whether the latent traits were simply permuted. They sometimes are, but even after realigning them the error is still significant.
To give some numbers: for a synthetic R matrix generated from U and V, both normally distributed (mean 0 and variance 1), I can achieve a mean absolute error of 0.003 on R, but on U and V it's usually around 0.5.
I know this model is symmetric, but I am not sure about the implications. I would like to ask:
Is it actually possible to guarantee the recovery of the original latent traits in some way?
If so, how could it be achieved, preferably using Edward?
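On the symmetry mentioned above, here is a quick numerical check, offered as a sketch under the assumption that the relevant symmetry is a rotation of the latent space. It shows that U and V can both be transformed by the same orthogonal matrix Q without changing U'V. The sizes, seed and rotation angle are arbitrary illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N, M = 2, 5, 4
U = rng.standard_normal((D, N))
V = rng.standard_normal((D, M))

# A 2x2 rotation matrix; any orthogonal Q behaves the same way.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
U_rot, V_rot = Q @ U, Q @ V

# The product is unchanged although the factors differ.
assert np.allclose(U.T @ V, U_rot.T @ V_rot)
assert not np.allclose(U, U_rot)
```

Because the standard normal priors on U and V are also unchanged by such a rotation, the data cannot distinguish (U, V) from (QU, QV), so a comparison with U_true and V_true is only meaningful up to a transformation of this kind.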

c++ eigen A.inverse()*B not equal to A.ldlt().solve(B)

I would like to compute the trace of the product of two given matrices, say A and B, Trace(AInv * B) where * is the regular matrix product, AInv is the inverse of A (being symmetric and positive definite) and B is symmetric.
Solution 1: computing the inverse explicitly
Noting that Trace(AInv * B) is equivalent to taking the sum of the componentwise product of AInv and B:
double sol1 = (A.inverse().cwiseProduct(B)).sum();
Solution 2: using ldlt decomposition from the Eigen library
double sol2 = (A.selfadjointView<Lower>().ldlt().solve(B)).trace();
Theoretically, these solutions should be the same, but in my test they are not. It sounds like I am missing something. Since .ldlt().solve() is not made to compute a matrix inverse but rather to solve a linear system, my question is: does .ldlt() perform any sort of normalization? If not, what am I doing wrong?
Many thanks!
The statement computing sol1 is wrong: you need to either transpose one of the operands or use a matrix-matrix product. Correct versions:
double sol1 = (A.inverse().cwiseProduct(B.transpose())).sum();
double sol1 = (A.inverse().lazyProduct(B)).diagonal().sum();
double sol1 = (A.inverse().lazyProduct(B)).trace();
double sol1 = (A.inverse() * B).diagonal().sum();
double sol1 = (A.inverse() * B).trace();
Note that, in Eigen, when you write (A*B).diagonal(), only the diagonal elements of A*B are computed, not the off-diagonal ones.
In general, it is not recommended to explicitly compute the inverse of a matrix. Using either A.llt().solve(B) or A.ldlt().solve(B) will give you more accurate results and will be faster too because, unless A is very small (2, 3, 4), A.inverse() is equivalent to solving against the identity matrix with a partial-pivoting LU decomposition. In the future, Eigen will very likely rewrite expressions like:
A.inverse() * B
for you anyway.
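The identity behind the corrected versions, trace(X * Y) = sum of the elementwise product of X and the transpose of Y, can be checked numerically. The sketch below uses NumPy rather than Eigen; the test matrices are random illustration data, and np.linalg.solve (an LU-based solver) stands in for the LDLT solve only to show that no explicit inverse is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))

# General identity: trace(X @ Y) is the sum of X times Y transposed, elementwise.
assert np.isclose(np.trace(X @ Y), np.sum(X * Y.T))

# Symmetric positive definite A and symmetric B, as in the question.
A = X @ X.T + n * np.eye(n)
B = Y + Y.T
trace_inverse = np.trace(np.linalg.inv(A) @ B)   # explicit inverse
trace_solve = np.trace(np.linalg.solve(A, B))    # linear solve, no inverse
assert np.isclose(trace_inverse, trace_solve)
```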

How is `(d*a)mod(b)=1` written in Ruby?

How should I write this:
(d*a)mod(b)=1
in order to make it work properly in Ruby? I tried it on Wolfram, but their solution:
(da(b, d))/(dd) = -a/d
doesn't help me. I know a and b. I need to solve (d*a)mod(b)=1 for d in the form d=....
It's not clear what you're asking, and, depending on what you mean, a solution may be impossible.
First off, (da(b, d))/(dd) = -a/d is not a solution to that equation; rather, it's a misreading of the notation used for partial derivatives. What Wolfram Alpha actually gave you was a partial derivative with respect to d, which is entirely unrelated.
Secondly, if you're trying to solve (d*a)mod(b)=1 for d, you may be out of luck. If a and b have a common prime factor, no value of d satisfies the equation. If a and b are coprime, there are infinitely many solutions, all congruent modulo b, and you can use the formula given in LutzL's answer.
Additionally, if you're looking to perform symbolic manipulation of equations, Ruby is likely not the proper tool. Consider using a CAS, like Python's SymPy or Wolfram Mathematica.
Finally, if you're just trying to compute (d*a)mod(b), the modulo operator in Ruby is %, so you'd write (d*a)%(b).
You are looking for the modular inverse of a modulo b.
For any two numbers a,b the extended euclidean algorithm
g,u,v = xgcd(a, b)
gives coefficients u,v such that
u*a+v*b = g
and g is the greatest common divisor. You need a,b co-prime, preferably by ensuring that b is a prime number, to get g=1 and then you can set d=u.
xgcd(a, b)
    if b = 0
        return (a,1,0)
    q,r = a divmod b
    // a = q*b + r
    g,u,v = xgcd(b, r)
    // g = u*b + v*r = u*b + v*(a-q*b) = v*a+(u-q*v)*b
    return g,v,u - q*v
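The recursion above can be written out as a runnable sketch. It is given in Python here; a Ruby version would follow the same structure using Integer#divmod. The helper name mod_inverse and the example values 7 and 40 are chosen for illustration and are not part of the original answer.

```python
def xgcd(a, b):
    """Return (g, u, v) with u*a + v*b == g, where g is gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    q, r = divmod(a, b)          # a = q*b + r
    g, u, v = xgcd(b, r)         # g = u*b + v*r
    return g, v, u - q * v       # g = v*a + (u - q*v)*b

def mod_inverse(a, b):
    """Return d in the range 0..b-1 with (d*a) % b == 1 (a, b coprime)."""
    g, u, _ = xgcd(a, b)
    if g != 1:
        raise ValueError("a and b are not coprime, so no inverse exists")
    return u % b

d = mod_inverse(7, 40)
assert (d * 7) % 40 == 1
```

Any d + k*b with integer k satisfies the equation as well; reducing u modulo b just picks the representative between 0 and b-1.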
