Getting element-wise equations of matrix multiplication in sympy - matrix

I've got 2 matrices, first of which is sparse with integer coefficients.
import sympy
A = sympy.eye(2)
A.row_op(1, lambda v, j: v + 2*A[0, j])
The 2nd is symbolic, and I perform an operation between them:
M = sympy.MatrixSymbol('M', 2, 1)
X = A * M + A.col(1)
Now, what I'd like is to get the element-wise equations:
X_{0,0} = M_{0,0}
X_{1,0} = 2*M_{0,0} + M_{1,0} + 1
One way to do this is specifying a matrix in sympy with each element being an individual symbol:
# name and shape are assumed to be defined, e.g. name = 'M', shape = (2, 1)
rows = []
for i in range(shape[0]):
    row = []
    for j in range(shape[1]):
        row.append(sympy.Symbol('%s_{%d,%d}' % (name, i, j)))
    rows.append(row)
M = sympy.Matrix(rows)
Is there a way to do it with the MatrixSymbol above, and then get the resulting element-wise equations?

Turns out, this question has a very obvious answer:
MatrixSymbols in sympy can be indexed like a matrix, i.e.:
X[i,j]
gives the element-wise equations.
If one wants to subset more than one element, the MatrixSymbol must first be converted to a sympy.Matrix class:
X = sympy.Matrix(X)
X # lists all indices as `X[i, j]`
X[3:4,2] # arbitrary subsets are supported
Note that this does not allow all operations of a numpy array/matrix (such as indexing with a boolean mask), so you might be better off creating a numpy array with sympy symbols:
ijstr = lambda i,j: sympy.Symbol(name+"_{"+str(int(i))+","+str(int(j))+"}")
matrix = np.matrix(np.fromfunction(np.vectorize(ijstr), shape))
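As a minimal sketch of the indexing approach described in this answer, the following rebuilds the matrices from the question and collects one right-hand side per element. It assumes a reasonably recent SymPy in which as_explicit() is available on matrix expressions; the names explicit and equations are only illustrative.

```python
import sympy

# Rebuild the matrices from the question.
A = sympy.eye(2)
A.row_op(1, lambda v, j: v + 2 * A[0, j])  # A becomes [[1, 0], [2, 1]]
M = sympy.MatrixSymbol('M', 2, 1)
X = A * M + A.col(1)

# Expand the matrix expression into an explicit 2x1 matrix of scalar expressions.
explicit = X.as_explicit()

# One right-hand side per element, keyed by (row, column).
equations = {(i, j): explicit[i, j]
             for i in range(explicit.rows)
             for j in range(explicit.cols)}

for (i, j), rhs in equations.items():
    print("X[%d,%d] =" % (i, j), rhs)
```

The entries of the result are expressed in terms of M[0, 0] and M[1, 0], so no hand-built matrix of individual symbols is needed.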

Related

Checking that a matrix is positive semidefinite with a given rank (in Julia)

I am writing a function that checks if a matrix X is positive semidefinite with a given rank k. To do this, I compute the eigenvalues of X, and I check that exactly k of them are positive and the rest are 0. Here's what I have so far:
using LinearAlgebra
function ispossemdef(X::AbstractMatrix, k::Int, ϵ::Real = 1e-10)
    n = size(X, 1) # dim of X
    !issymmetric(X) && return false # short-circuit if X is asymmetric
    k > n && error("k > n") # throw error if k > n
    eigs = eigvals(X) # eigenvalues of X in ascending order
    z = eigs[1:(n - k)] # the values that should be zero
    p = eigs[(n - k + 1):end] # the values that should be positive
    n_minus_k_zero_eigenvalues = norm(z) < ϵ
    k_positive_eigenvalues = all(p .> ϵ)
    return n_minus_k_zero_eigenvalues & k_positive_eigenvalues
end
Is there a better algorithm for doing this? Better might mean faster (avoids computing the eigenvalues), or more numerically stable (lets me get away with a stricter error tolerance).
For example, the isposdef function (which is the k = n special case of what I'm doing) works by attempting to compute the Cholesky factor of X, and reporting back with whether or not it could. Can I generalize this procedure to semidefinite matrices? If so, is it better than checking the eigenvalues?
It will not work on all matrices, but have you looked at the isposdef() function?
using LinearAlgebra # for julia 1+
help> isposdef
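One possible way to generalize the Cholesky idea from the question to semidefinite matrices is a diagonally pivoted Cholesky elimination: eliminate the largest remaining diagonal entry until no positive pivot is left, count the pivots, and require that whatever remains is numerically zero. The sketch below is written in Python rather than Julia and is not taken from the thread; the function names and the absolute tolerance are assumptions, and it has not been compared against the eigenvalue approach for speed or stability.

```python
def psd_rank(X, tol=1e-10):
    """Return the numerical rank of X if it looks positive semidefinite, else None.

    X is a square list of lists. tol is an absolute tolerance (an assumption;
    scaling it by the largest diagonal entry may suit real data better).
    """
    n = len(X)
    S = [[float(v) for v in row] for row in X]  # working copy, updated in place
    # A positive semidefinite matrix must be symmetric.
    for i in range(n):
        for j in range(i):
            if abs(S[i][j] - S[j][i]) > tol:
                return None
    active = list(range(n))  # indices that have not been eliminated yet
    rank = 0
    while active:
        p = max(active, key=lambda i: S[i][i])  # pivot on the largest diagonal entry
        pivot = S[p][p]
        if pivot <= tol:
            break  # no positive pivot left
        active.remove(p)
        # Schur complement update of the remaining block.
        for i in active:
            for j in active:
                S[i][j] -= S[i][p] * S[p][j] / pivot
        rank += 1
    # For a semidefinite matrix the remaining block must vanish entirely.
    for i in active:
        for j in active:
            if abs(S[i][j]) > tol:
                return None
    return rank


def is_psd_with_rank(X, k, tol=1e-10):
    """True if X looks positive semidefinite with numerical rank exactly k."""
    return psd_rank(X, tol) == k
```

For example, is_psd_with_rank([[1, 1], [1, 1]], 1) holds, while an indefinite matrix such as [[0, 1], [1, 0]] is rejected because the block left after the last positive pivot is not zero.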

Best iterative way to calculate the fundamental matrix of an absorbing Markov Chain?

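I have a very large absorbing Markov chain. I want to obtain the fundamental matrix of this chain to calculate the expected number of steps before absorption. From this question I know that this can be calculated by solving the following equation, where Q holds the transition probabilities between transient states and 1 is the vector of ones:
(I - Q) t = 1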
which can be obtained by using the following python code:
import numpy
def expected_steps_fast(Q):
    I = numpy.identity(Q.shape[0])
    o = numpy.ones(Q.shape[0])
    return numpy.linalg.solve(I-Q, o)
However, I would like to calculate it using some kind of iterative method similar to the power iteration method used to calculate PageRank. Such a method would let me approximate the expected number of steps before absorption in a mapreduce-like system.
Does something similar exist?
If you have a sparse matrix, check if scipy.sparse.linalg.spsolve works. No guarantees about numerical robustness, but at least for trivial examples it's significantly faster than solving with dense matrices.
import networkx as nx
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
def example(n):
    """Generate a very simple transition matrix from a directed graph
    """
    g = nx.DiGraph()
    for i in range(n-1):
        g.add_edge(i+1, i)
        g.add_edge(i, i+1)
    g.add_edge(n-1, n)
    g.add_edge(n, n)
    # to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array replaces it
    m = nx.to_numpy_array(g)
    # normalize rows to ensure m is a valid right stochastic matrix
    m = m / m.sum(axis=1, keepdims=True)
    return m
A = sp.csr_matrix(example(2000)[:-1,:-1])
Ad = np.array(A.todense())
def sparse_solve(Q):
    I = sp.identity(Q.shape[0], format='csr')
    o = np.ones(Q.shape[0])
    return spla.spsolve(I-Q, o)
def dense_solve(Q):
    I = np.identity(Q.shape[0])
    o = np.ones(Q.shape[0])
    return np.linalg.solve(I-Q, o)
Timings for sparse solution:
%timeit sparse_solve(A)
1000 loops, best of 3: 1.08 ms per loop
Timings for dense solution:
%timeit dense_solve(Ad)
1 loops, best of 3: 216 ms per loop
Like Tobias mentions in the comments, I would have expected other solvers to outperform the generic one, and they may for very large systems. For this toy example, the generic solve seems to work well enough.
I arrived at this answer thanks to @tobias-ribizel's suggestion of using the Neumann series. We start from the following equation:
t = (I - Q)^{-1} 1
Using the Neumann series:
(I - Q)^{-1} = sum_{k=0}^{inf} Q^k
If we multiply each term of the series by the vector 1, we can operate separately on each row of the matrix Q and approximate successively with:
r_0 = 1, r_k = Q r_{k-1}, t ≈ sum_{k=0}^{n} r_k
This is the python code I use to calculate this:
def expected_steps_iterative(Q, n=10):
    N = Q.shape[0]
    acc = np.ones(N)
    r_k_1 = np.ones(N)
    for k in range(1, n):
        r_k = np.zeros(N)
        for i in range(N):
            for j in range(N):
                r_k[i] += r_k_1[j] * Q[i, j]
        if np.allclose(acc, acc+r_k, rtol=1e-8):
            acc += r_k
            break
        acc += r_k
        r_k_1 = r_k
    return acc
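The same Neumann-series iteration can be sketched without NumPy by storing each row of Q as a dictionary of its non-zero entries, which is close to the per-row layout a mapreduce-like system would use. This is only an illustrative sketch: the function name, the stopping tolerance, and the iteration cap are assumptions, and convergence relies on Q being the transient part of an absorbing chain.

```python
def expected_steps_neumann(rows, tol=1e-12, max_iter=10000):
    """Approximate t = (I - Q)^{-1} 1 by summing the terms r_k = Q r_{k-1}.

    rows[i] is a dict {j: Q[i][j]} holding the non-zero entries of row i.
    """
    n = len(rows)
    term = [1.0] * n   # r_0 = 1
    total = [1.0] * n  # running sum of r_0 + r_1 + ...
    for _ in range(max_iter):
        # Each new entry depends only on one row of Q and the previous term.
        term = [sum(w * term[j] for j, w in row.items()) for row in rows]
        total = [s + r for s, r in zip(total, term)]
        if max(term) < tol:  # terms are non-negative and shrink towards zero
            break
    return total


# Two transient states that each move to the other with probability 0.5
# and are absorbed otherwise.
Q_rows = [{1: 0.5}, {0: 0.5}]
steps = expected_steps_neumann(Q_rows)
```

Because each entry of the new term reads a single row of Q, the inner list comprehension is the part that maps naturally onto a distributed map step.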
And this is the code using Spark. This code expects Q to be an RDD where each row is a tuple (row_id, dict of weights for that row of the matrix).
def expected_steps_spark(sc, Q, n=10):
    def dict2np(d, sz):
        vec = np.zeros(sz)
        for k, v in d.items():
            vec[k] = v
        return vec
    sz = Q.count()
    acc = np.ones(sz)
    x = {i:1.0 for i in range(sz)}
    for k in range(1, n):
        bc_x = sc.broadcast(x)
        x_old = x
        # each row is (row_id, {col: weight}); take the dot product of the row with x
        x = Q.map(lambda row: (row[0], sum(bc_x.value[j]*w for j, w in row[1].items())))
        x = x.collectAsMap()
        v_old = dict2np(x_old, sz)
        v = dict2np(x, sz)
        acc += v
        if np.allclose(v, v_old, rtol=1e-8):
            break
    return acc

Algorithm to evaluate best weights for weighted average

I have a data set of the form:
[9.1 5.6 7.4] => 8.5, [4.1 4.4 5.2] => 4.9, ... , x => y(x)
So x is a real vector of three elements and y is a scalar function.
I'm assuming a weighted average model of this data:
y(x) = (a * x[0] + b * x[1] + c * x[2]) / (a+b+c) + E(x)
where E is an unknown random error term.
I need an algorithm to find a,b,c, that minimizes total sum square error:
error = sum over all x of { E(x)^2 }
for a given data set.
Assume that the weights are normalized to sum to 1 (which happily is without loss of generality), then we can re-cast the problem with c = 1 - a - b, so we are actually solving for a and b.
With this we can write
error(a,b) = sum over all x { a x[0] + b x[1] + (1 - a - b) x[2] - y(x) }^2
Now it's just a question of taking the partial derivatives d_error/da and d_error/db and setting them to zero to find the minimum.
With some fiddling, you get a system of two equations in a and b.
C(X[0],X[0],X[2]) a + C(X[0],X[1],X[2]) b = C(X[0],Y,X[2])
C(X[1],X[0],X[2]) a + C(X[1],X[1],X[2]) b = C(X[1],Y,X[2])
The meaning of X[i] is the vector of all i'th components from the dataset x values.
The meaning of Y is the vector of all y(x) values.
The coefficient function C has the following meaning:
C(p, q, r) = sum over i { p[i] ( q[i] - r[i] ) }
I'll omit how to solve the 2x2 system unless this is a problem.
If we plug in the two-element data set you gave, we should get precise coefficients because you can always approximate two points perfectly with a line. So for example the first equation coefficients are:
C(X[0],X[0],X[2]) = 9.1(9.1 - 7.4) + 4.1(4.1 - 5.2) = 10.96
C(X[0],X[1],X[2]) = -19.66
C(X[0],Y,X[2]) = 8.78
Similarly, the coefficients of the second equation are 4.68, -13.6, and 4.84.
Solving the 2x2 system produces: a = 0.42515, b = -0.20958. Therefore c = 0.78443.
Note that in this problem a negative coefficient results. Nothing guarantees that the coefficients will be positive, though "real" data sets may well produce positive ones.
Indeed if you compute weighted averages with these coefficients, they are 8.5 and 4.9.
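A minimal sketch of the fit in plain Python follows. It substitutes c = 1 - a - b and solves the 2x2 normal equations of the resulting least-squares problem, in which y - x[2] is regressed on x[0] - x[2] and x[1] - x[2]. Differentiating error(a,b) weights each residual by these differences, so the sums below are taken over the differences. For data that fit exactly, such as the two-point example, this reproduces the a, b, c above; for noisy data the figures may differ slightly from ones obtained with the C(p, q, r) sums. The function name fit_weights is an assumption.

```python
def fit_weights(xs, ys):
    """Least-squares weights (a, b, c) with a + b + c = 1 for 3-element inputs."""
    s_uu = s_uv = s_vv = s_uw = s_vw = 0.0
    for (x0, x1, x2), y in zip(xs, ys):
        u, v, w = x0 - x2, x1 - x2, y - x2  # model: w ~ a*u + b*v
        s_uu += u * u
        s_uv += u * v
        s_vv += v * v
        s_uw += u * w
        s_vw += v * w
    det = s_uu * s_vv - s_uv * s_uv
    if det == 0:
        raise ValueError("data do not determine the weights uniquely")
    # Cramer's rule for the 2x2 system.
    a = (s_uw * s_vv - s_uv * s_vw) / det
    b = (s_uu * s_vw - s_uv * s_uw) / det
    return a, b, 1.0 - a - b


# The two data points quoted in the question.
xs = [(9.1, 5.6, 7.4), (4.1, 4.4, 5.2)]
ys = [8.5, 4.9]
a, b, c = fit_weights(xs, ys)
predictions = [a * x0 + b * x1 + c * x2 for x0, x1, x2 in xs]
```

With longer input vectors the same idea applies, but the 2x2 solve would be replaced by a general linear solver.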
For fun I also tried this data set:
X[0] X[1] X[2] Y
0.018056028 9.70442075 9.368093544 6.360312244
8.138752835 5.181373099 3.824747424 5.423581239
6.296398214 4.74405298 9.837741509 7.714662742
5.177385358 1.241610571 5.028388255 4.491743107
4.251033792 8.261317658 7.415111851 6.430957844
4.720645386 1.0721718 2.187147908 2.815078796
1.941872069 1.108191586 6.24591771 3.994268819
4.220448549 9.931055481 4.435085917 5.233711923
9.398867623 2.799376317 7.982096264 7.612485261
4.971020963 1.578519218 0.462459906 2.248086465
I generated the Y values with 1/3 x[0] + 1/6 x[1] + 1/2 x[2] + E where E is a random number in [-0.1..+0.1]. If the algorithm is working correctly we'd expect to get roughly a = 1/3 and b = 1/6 from this result. Indeed we get a = .3472 and b = .1845.
OP has now said that his actual data are larger than 3-vectors. This method generalizes without much trouble. If the vectors are of length n, then you get an (n-1) x (n-1) system to solve.

Algorithm to express elements of a matrix as a vector

Statement of Problem:
I have an array M with m rows and n columns. The array M is filled with non-zero elements.
I also have a vector t with n elements, and a vector omega with m elements.
The elements of t correspond to the columns of matrix M.
The elements of omega correspond to the rows of matrix M.
Goal of Algorithm:
Define chi as the multiplication of vector t and omega. I need to obtain a 1D vector a, where each element of a is a function of chi.
Each element of chi is unique (i.e. every element is different).
Using mathematics notation, this can be expressed as a(chi)
Each element of vector a corresponds to an element or elements of M.
Matlab code:
Here is a code snippet showing how the vectors t and omega are generated. The matrix M is pre-existing.
[m,n] = size(M);
t = linspace(0,5,n);
omega = linspace(0,628,m);
Conceptual Diagram:
This appears to be a type of integration (if this is the right word for it) along constant chi.
Reference:
Link to reference
The algorithm is not explicitly stated in the reference. I only wish that this algorithm was described in a manner reminiscent of computer science textbooks!
Looking at Figure 11.5, the matrix M is Figure 11.5(a). The goal is to find an algorithm to convert Figure 11.5(a) into 11.5(b).
It appears that the algorithm is a type of integration (averaging, perhaps?) along constant chi.
It appears to me that reshape is the matlab function you need to use. As noted in the link:
B = reshape(A,siz) returns an n-dimensional array with the same elements as A, but reshaped to siz, a vector representing the dimensions of the reshaped array.
That is, create a vector siz with the number m*n in it, and say A = reshape(P,siz), where P is the product of vectors t and ω; or perhaps say something like A = reshape(t*ω,[m*n]). (I don't have matlab here, or would run a test to see if I have the product the right way around.) Note, the link does not show an example with one number (instead of several) after the matrix parameter to reshape, but I would expect from the description that A = reshape(t*ω,m*n) might also work.
You should add pseudocode or a link to the algorithm you want to implement. From what I could understand, I have developed the following code anyway:
M = [1 2 3 4; 5 6 7 8; 9 10 11 12]' % easy test M matrix
a = reshape(M, prod(size(M)), 1) % convert M to vector 'a' with reshape command
[m,n] = size(M); % Your sample code
t = linspace(0,5,n); % Your sample code
omega = linspace(0,628,m); % Your sample code
for i=1:length(t)
    for j=1:length(omega) % Access a(chi) in the desired order
        chi = length(omega)*(i-1)+j;
        t(i) % related t value
        omega(j) % related omega value
        a(chi) % related a(chi) value
    end
end
As you can see, I also think that the reshape() function is the solution to your problem. I hope this code helps.
The basic idea is to use two separate loops. The outer loop is over the chi variable values, whereas the inner loop is over the i variable values. Referring to the above diagram in the original question, the i variable corresponds to the x-axis (time), and the j variable corresponds to the y-axis (frequency). Assuming that the chi, i, and j variables can take on any real number, bilinear interpolation is then used to find an amplitude corresponding to an element in matrix M. The integration is just an averaging over elements of M.
The following code snippet provides an overview of the basic algorithm to express elements of a matrix as a vector using the spectral collapsing from 2D to 1D. I can't find any reference for this, but it is a solution that works for me.
% Amp = amplitude vector corresponding to Figure 11.5(b) in book reference
% M = matrix corresponding to the absolute value of the complex Gabor transform
% matrix in Figure 11.5(a) in book reference
% Nchi = number of chi in chi vector
% prod = product of timestep and frequency step
% dt = time step
% domega = frequency step
% omega_max = maximum angular frequency
% i = time array element along x-axis
% j = frequency array element along y-axis
% current_i = current time array element in loop
% current_j = current frequency array element in loop
% Nivar = number of i variables
% ivar = i variable vector
% calculate for chi = 0, which only occurs when
% t = 0 and omega = 0, at i = 1
av0 = mean( M(1,:) );
av1 = mean( M(2:end,1) );
av2 = mean( [av0 av1] );
Amp(1) = av2;
% av_val_sum holds the running sum of the values being averaged
av_val_sum = 0;
% loop for rest of chi
for ccnt = 2:Nchi % 2:Nchi
    av_val_sum = 0; % reset av_val_sum
    current_chi = chi( ccnt ); % current value of chi
    % loop over i vector
    for icnt = 1:Nivar % 1:Nivar
        current_i = ivar( icnt );
        current_j = (current_chi / (prod * (current_i - 1))) + 1;
        current_t = dt * (current_i - 1);
        current_omega = domega * (current_j - 1);
        % values out of range
        if(current_omega > omega_max)
            continue;
        end
        % use bilinear interpolation to find an amplitude
        % at current_t and current_omega from matrix M
        % f_x_y is the bilinear interpolated amplitude
        % Insert bilinear interpolation code here
        % add to running sum
        av_val_sum = av_val_sum + f_x_y;
    end % icnt loop
    % compute the average over all i
    av = av_val_sum / Nivar;
    % assign the average to Amp
    Amp(ccnt) = av;
end % ccnt loop
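The bilinear interpolation step that the snippet leaves as a placeholder could look roughly like the following Python sketch. It is not the author's code: the helper name bilinear is hypothetical, indices are 0-based, rows of M are assumed to correspond to frequency and columns to time (as in the question), and the grid is assumed to be uniform with spacings dt and domega and at least 2x2 in size.

```python
def bilinear(M, t, omega, dt, domega):
    """Interpolate M at time t and angular frequency omega.

    M[r][c] is assumed to hold the amplitude at omega = r * domega, t = c * dt.
    """
    n_rows, n_cols = len(M), len(M[0])
    x = t / dt          # fractional column position
    y = omega / domega  # fractional row position
    if not (0 <= x <= n_cols - 1 and 0 <= y <= n_rows - 1):
        raise ValueError("point lies outside the grid")
    # Lower-left corner of the enclosing cell, clamped so c0 + 1 and r0 + 1 exist.
    c0 = min(int(x), n_cols - 2)
    r0 = min(int(y), n_rows - 2)
    fx, fy = x - c0, y - r0
    # Interpolate along time on the two bracketing rows, then along frequency.
    low = (1 - fx) * M[r0][c0] + fx * M[r0][c0 + 1]
    high = (1 - fx) * M[r0 + 1][c0] + fx * M[r0 + 1][c0 + 1]
    return (1 - fy) * low + fy * high


grid = [[0.0, 1.0],
        [2.0, 3.0]]
center_value = bilinear(grid, 0.5, 0.5, 1.0, 1.0)
```

Raising an error for out-of-range points mirrors the range check in the loop above; a caller that prefers to skip such points can test the range before calling.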

Summation without a for loop - MATLAB

I have 2 matrices: V which is square MxM, and K which is MxN. Calling the dimension across rows x and the dimension across columns t, I need to evaluate the integral (i.e. sum) over both dimensions of K times a t-shifted version of V, the answer being a function of the shift (almost like a convolution, see below). The sum is defined by the following expression, where _{} denotes the summation indices, and a zero-padding of out-of-limits elements is assumed:
S(t) = sum_{x,tau}[V(x,t+tau) * K(x,tau)]
I manage to do it with a single loop, over the t dimension (vectorizing the x dimension):
% some toy matrices
V = rand(50,50);
K = rand(50,10);
[M N] = size(K);
S = zeros(1, M);
for t = 1 : N
    S(1,1:end-t+1) = S(1,1:end-t+1) + sum(bsxfun(@times, V(:,t:end), K(:,t)),1);
end
I have similar expressions which I managed to evaluate without a for loop, using a combination of conv2 and/or mirroring (flipping) of a single dimension. However I can't see how to avoid a for loop in this case (despite the apparent similarity to convolution).
Steps to vectorization
1] Perform sum(bsxfun(@times, V(:,t:end), K(:,t)),1) for all columns in V against all columns in K with matrix-multiplication -
sum_mults = V.'*K
This would give us a 2D array with each column representing the sum(bsxfun(@times,.. operation at each iteration.
2] Step 1 gave us all possible summations, but the values to be summed are not aligned in the same row across iterations, so we need to do a bit more work before summing along rows. The rest of the work is about getting a shifted-up version. For this, you can use boolean indexing with an upper and a lower triangular boolean mask. Finally, we sum along each row for the final output. So, this part of the code would look like so -
valid_mask = tril(true(size(sum_mults)));
sum_mults_shifted = zeros(size(sum_mults));
sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask);
out = sum(sum_mults_shifted,2);
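To see why the matrix product contains everything the loop needs, here is a small pure-Python sketch (0-based indices, function names chosen for illustration) that computes S once directly from its definition and once by summing shifted diagonals of V.'*K. It is meant as a check of the idea under these assumptions, not as a fast implementation.

```python
def shifted_sum_direct(V, K):
    """S[s] = sum over x, tau of V[x][s + tau] * K[x][tau], zero outside V."""
    M, N = len(K), len(K[0])  # V is M x M, K is M x N
    S = [0.0] * M
    for s in range(M):
        for tau in range(N):
            if s + tau < M:  # zero padding beyond the last column of V
                for x in range(M):
                    S[s] += V[x][s + tau] * K[x][tau]
    return S


def shifted_sum_via_product(V, K):
    """Same result, obtained from P = V^T K by summing shifted diagonals."""
    M, N = len(K), len(K[0])
    # P[i][tau] = sum over x of V[x][i] * K[x][tau]
    P = [[sum(V[x][i] * K[x][tau] for x in range(M)) for tau in range(N)]
         for i in range(M)]
    # Row s of the shifted array collects P[s + tau][tau] over all valid tau.
    return [float(sum(P[s + tau][tau] for tau in range(N) if s + tau < M))
            for s in range(M)]


V_small = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
K_small = [[1, 0], [2, 1], [0, 3]]
direct = shifted_sum_direct(V_small, K_small)
via_product = shifted_sum_via_product(V_small, K_small)
```

The entries P[s + tau][tau] are the ones selected by the triangular mask, and moving them up by tau rows is what the flipud-based assignment does in a single indexing operation.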
Runtime tests -
%// Inputs
V = rand(1000,1000);
K = rand(1000,200);
disp('--------------------- With original loopy approach')
tic
[M N] = size(K);
S = zeros(1, M);
for t = 1 : N
    S(1,1:end-t+1) = S(1,1:end-t+1) + sum(bsxfun(@times, V(:,t:end), K(:,t)),1);
end
toc
disp('--------------------- With proposed vectorized approach')
tic
sum_mults = V.'*K; %//'
valid_mask = tril(true(size(sum_mults)));
sum_mults_shifted = zeros(size(sum_mults));
sum_mults_shifted(flipud(valid_mask)) = sum_mults(valid_mask);
out = sum(sum_mults_shifted,2);
toc
Output -
--------------------- With original loopy approach
Elapsed time is 2.696773 seconds.
--------------------- With proposed vectorized approach
Elapsed time is 0.044144 seconds.
This might be cheating (using arrayfun instead of a for loop) but I believe this expression gives you what you want:
S = arrayfun(@(t) sum(sum( V(:,(t+1):(t+N)) .* K )), 1:(M-N), 'UniformOutput', true)
