Fast way of evaluating the second expression only - algorithm

Given an n×n matrix X, three n×n diagonal matrices D1, D2, D3, and three n×1 vectors v1, v2, v3, my goal is to design an efficient way to evaluate each of the following expressions as fast as possible:
(Exp1) = X·v1+X·X·v2+X·X·X·v3
(Exp2) = D1·X·v1+D2·X·X·v2+D3·X·X·X·v3
For efficiently evaluating (Exp1), I have the following idea:
I can rewrite (Exp1) as follows:
(Exp1) = X·v1+X·X·v2+X·X·X·v3 = X·(X·(X·v3+v2)+v1)
Therefore, I can evaluate (Exp1) by using only three matrix-vector multiplications as follows:
y1=X·v3+v2
y2=X·y1+v1
(Exp1)=X·y2
However, for efficiently evaluating (Exp2) only, I have no idea. Any suggestions or hints are very welcome.

I don't see a better way than to compute the vectors X.v1, X.X.v2 and X.X.X.v3 separately (6 matrix/vector multiplies) and then combine them into Exp1 and Exp2.
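As a rough illustration of the two approaches discussed here, the following pure-Python sketch evaluates (Exp1) with the three-product nested form and (Exp2) from the three separately computed vectors. The helper names (`matvec`, `add`, `scale`, `exp1_horner`, `exp1_and_exp2`) and the small test data are made up for this example, and the diagonal matrices are assumed to be stored as plain lists of their diagonal entries, so multiplying by them is an elementwise product.

```python
def matvec(X, v):
    """Return the matrix-vector product X·v for a list-of-rows matrix X."""
    return [sum(x_ij * v_j for x_ij, v_j in zip(row, v)) for row in X]

def add(u, v):
    """Elementwise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

def scale(d, v):
    """Multiply v by a diagonal matrix given as the list d of its diagonal entries."""
    return [d_i * v_i for d_i, v_i in zip(d, v)]

def exp1_horner(X, v1, v2, v3):
    """(Exp1) with three matrix-vector products: X·(X·(X·v3 + v2) + v1)."""
    y1 = add(matvec(X, v3), v2)
    y2 = add(matvec(X, y1), v1)
    return matvec(X, y2)

def exp1_and_exp2(X, d1, d2, d3, v1, v2, v3):
    """Both expressions from X·v1, X·X·v2, X·X·X·v3 (six matrix-vector products)."""
    t1 = matvec(X, v1)
    t2 = matvec(X, matvec(X, v2))
    t3 = matvec(X, matvec(X, matvec(X, v3)))
    exp1 = add(add(t1, t2), t3)
    exp2 = add(add(scale(d1, t1), scale(d2, t2)), scale(d3, t3))
    return exp1, exp2

# Small made-up example (n = 2).
X = [[1, 2], [3, 4]]
v1, v2, v3 = [1, 0], [0, 1], [1, 1]
d1, d2, d3 = [1, 2], [3, 4], [5, 6]
exp1, exp2 = exp1_and_exp2(X, d1, d2, d3, v1, v2, v3)
```

The diagonal factors cost only O(n) each, so the six O(n²) matrix-vector products dominate the cost of (Exp2).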

Related

Combine boolean and integer logic in linear arithmetic using the Z3 Solver?

I would like to solve problems combining boolean and integer logic in linear arithmetic with a SAT/SMT solver. At first glance, Z3 seems promising.
First of all, is it at all possible to solve the following problem? This answer makes it seem like it works.
int x,y,z
boolean A,B,C
( (3x + y - 2z >= 10) OR (A AND (NOT B OR C)) OR ((A == C) AND (x + y >= 5)) )
If so, how does Z3 solve this kind of problem in theory and is there any documentation about it?
I could think of two ways to solve this problem. One would be to convert the Boolean operations into a linear integer expression. Another solution I read about is to use the Nelson-Oppen Combination Method described in [Kro 08].
I found corresponding documentation in chapter 3.2.2, Solving Arithmetical Fragments, where Table 1 lists the algorithms implemented for each logic.
Yes, SMT solvers are quite good at solving problems of this sort. Your problem can be expressed using z3's Python interface like this:
from z3 import *

x, y, z = Ints('x y z')
A, B, C = Bools('A B C')
solve(Or(3*x + y - 2*z >= 10,
         And(A, Or(Not(B), C)),
         And(A == C, x + y >= 5)))
This prints:
[A = True, z = 3, y = 0, B = True, C = True, x = 5]
giving you a (not necessarily "the") model that satisfies your constraints.
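To see why this is "a" model rather than "the" model, one can check the formula by hand. The sketch below uses plain Python with no solver involved: it evaluates the constraint at the assignment printed above, then brute-forces a second satisfying assignment over a tiny, arbitrarily chosen range. The function name `formula` and the search range are assumptions made for this illustration only.

```python
from itertools import product

def formula(x, y, z, A, B, C):
    """The constraint from the question, written with Python's own operators."""
    return ((3 * x + y - 2 * z >= 10)
            or (A and ((not B) or C))
            or ((A == C) and (x + y >= 5)))

# The model reported by z3 above.
model = dict(A=True, z=3, y=0, B=True, C=True, x=5)
satisfied = formula(**model)

# A different model, found by exhaustive search with all Booleans set to False.
other = next((x, y, z)
             for x, y, z in product(range(4), repeat=3)
             if formula(x, y, z, False, False, False))
```

In the reported model the arithmetic disjunct is false (3·5 + 0 − 2·3 = 9), so the formula is satisfied through the purely Boolean disjunct instead.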
SMT solvers can deal with integers, machine words (i.e., bit-vectors), reals, along with many other data types, and there are efficient procedures for combinations of linear-integer-arithmetic, booleans, uninterpreted-functions, bit-vectors amongst many others.
See http://smtlib.cs.uiowa.edu for many resources on SMT solving, including references to other work. Any given solver (i.e., z3, yices, cvc etc.) will be a collection of various algorithms, heuristics and tactics. It's hard to compare them directly as each shines in its own way for certain sublogics, but for the base set of linear-integer arithmetic, booleans, and bit-vectors, they should all perform fairly well. Looks like you already found some good references, so you can do further reading as necessary; though for most end users it's neither necessary nor that important to know how an SMT solver works internally.

representing large binary vector as problog fact/rule

In ProbLog, how do I represent the following as p-fact/rule :
A binary vector of size N, where P bits are 1 ? i.e. a bit is ON with probability P/N, where N > 1000
I came up with this, but it seems iffy:
0.02::one(X) :- between(1,1000,X).
I want to use it later to calculate what happens when I apply two or more bit-vector operations such as AND, OR, XOR, count, overlap, and Hamming distance, but by modeling rather than simulation.
For example, if I OR 10 random vectors, what is the probable overlap count between that union vector and a new random vector,
... or what is the probability that they will overlap by X bits?
.... questions like that.
PS> I suspect cplint is the same.
Another try, but I don't have an idea how to query for a 'single' result:
1/10::one(X,Y) :- vec(X), between(1,10,Y). %vec: N=10, P=?
vec(X) :- between(1,2,X). %num of vecs
%P=2 ??
two(A,B,C,D) :- one(1,A), one(2,B), A =\= B, one(1,C), one(2,D), C =\= D.
based on #damianodamiono , so far :
P/N::vec(VID,P,N,_Bit).
prob_on([],[],_,_).
prob_on([H1|T1],[H2|T2],P,N):-
vec(1,P,N,H1), vec(2,P,N,H2),
prob_on(T1,T2,P,N).
query(prob_on([1],[1],2,10)).
query(prob_on([1,2,3,5],[1,6,9,2],2,10)).
I'm super happy to see that someone uses Probabilistic Logic Programming! Anyway, usually you do not need to create a list with 1000 elements and then attach 1000 probabilities. For example, if you want to state that each element of the list has probability P/N (suppose 0.8) of being true, you can use (cplint and ProbLog have almost the same syntax, so you can run the programs on both of them):
0.8::on(_).
in the recursion.
For example:
8/10::on(_).
prob_on([]).
prob_on([H|T]):-
on(H),
prob_on(T).
and then ask (in cplint)
?- prob(prob_on([1,2,3]),Prob).
Prob = 0.512
In ProbLog, you need to add query(prob_on([1,2,3])) to the program. Note the use of the anonymous variable in the probabilistic fact on/1 (it is needed; the motivation is somewhat involved, so I omit it). If you want a probability that depends on the length of the list and other variables, you can use flexible probabilities:
P/N::on(P,N).
and then call it in your predicate with
...
on(P,N),
...
where both P and N are ground when on/2 is called. In general, you can add also a body in the probabilistic fact (turning it into a clause), and perform whatever operation you want.
With two lists:
8/10::on_1(_).
7/10::on_2(_).
prob_on([],[]).
prob_on([H1|T1],[H2|T2]):-
on_1(H1),
on_2(H2),
prob_on(T1,T2).
?- prob(prob_on([1,2,3,5],[1,6,9,2]),Prob).
Prob = 0.09834496
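The two probabilities reported above can be cross-checked by hand. The sketch below assumes that each distinct list element corresponds to one independent probabilistic fact, so a repeated element inside a single list would count only once. The helper name `prob_all_on` is invented for this check and is not part of ProbLog or cplint.

```python
from fractions import Fraction

def prob_all_on(p, items):
    """Probability that the fact is true for every distinct element of items,
    assuming one independent probabilistic fact per distinct element."""
    return p ** len(set(items))

# One list with 8/10::on(_).
single = prob_all_on(Fraction(8, 10), [1, 2, 3])

# Two lists with 8/10::on_1(_) and 7/10::on_2(_); the two predicates are
# separate facts, so the two products are simply multiplied.
double = (prob_all_on(Fraction(8, 10), [1, 2, 3, 5])
          * prob_all_on(Fraction(7, 10), [1, 6, 9, 2]))

print(float(single))  # 0.512
print(float(double))  # 0.09834496
```

These are just 0.8³ and 0.8⁴ · 0.7⁴, which match the values returned by the queries.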
Hope this helps, let me know if something is still not clear.

Problem implementing Attentive Pooling Network for Question Answering

I'm following this paper to implement an Attentive Pooling Network to build a Question Answering system. In chapter 2.1, it speaks about the CNN layer:
where q_emb is a question where each token (word) has been embedded using word2vec. q_emb has shape (d, M). d is the dimension of the word embedding and M the length of the question. In a similar way, a_emb is the embedding of the answer with shape (d, L).
My question is: how is the convolution done and how is it possible that W_1 and b_1 are the same for both the operations? In my opinion at least b_1 should have a different dimension in each case (and it should be a matrix, not a vector....).
At the moment I've implemented this operation in PyTorch:
### Input is a tensor of shape (batch_size, 1, M or L, d*k)
conv2 = nn.Conv2d(1, c, (d*k, 1))
I find that the authors of the paper are trusting the readers to assume/figure out a lot of things here. From what I read, here is what I could gather:
W1 should be a 1 X dk matrix because that is the only shape that would make sense in order to get Q as c X M matrix.
Assuming this, b1 need not be a matrix. From the above, you could get a c X 1 X M matrix which could be reshaped to a c X M matrix easily, and b1 could be a c X 1 vector which could be broadcast and added to the rest of the matrix.
Since, c, d and k are hyper parameters, you could easily have the same W1 and b1 for both Q and A.
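The shape argument can be made concrete with a small pure-Python sketch. It assumes the layer has the form Q = W1·Z_q + b1, where Z_q has one column of length d·k per token, the c filters are stacked as the rows of W1 (each row being one 1 X dk filter), and b1 holds one bias per filter. That form, the function name `conv_layer`, and the numbers below are assumptions for illustration, not taken from the paper.

```python
def conv_layer(W1, b1, Z):
    """Compute W1·Z + b1, broadcasting b1 (one bias per filter) across columns.

    W1: c rows of length dk, b1: c biases, Z: dk rows of length M (or L).
    The result has c rows of length M (or L).
    """
    length = len(Z[0])
    return [
        [sum(w * Z[r][col] for r, w in enumerate(w_row)) + bias
         for col in range(length)]
        for w_row, bias in zip(W1, b1)
    ]

# Made-up sizes: c = 2 filters, d*k = 3, question length M = 4, answer length L = 2.
W1 = [[1, 0, -1],
      [2, 1, 0]]
b1 = [10, 20]
Z_q = [[1, 2, 3, 4],
       [0, 1, 0, 1],
       [1, 1, 1, 1]]
Z_a = [[1, 0],
       [0, 1],
       [2, 2]]

Q = conv_layer(W1, b1, Z_q)  # c x M
A = conv_layer(W1, b1, Z_a)  # c x L
```

Neither W1 nor b1 depends on M or L, which is why the same parameters can be shared between the question and the answer.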
This is what I think so far; I will reread and edit in case anything's amiss.

How to solve complex matrix equations in Go using mat64

I'm doing matrix math in Go using mat64. I have a matrix equation I want to solve, something like: (a * b + c) / (d - e) where a, b, c, d, and e are all matrices with real numbers as elements.
mat64 implements matrix math functions as methods. So, if you wanted to multiply matrix a by b, you'd do something like:
// Multiply a by b:
new := mat64.NewDense(x, y, nil)
new.Mul(a, b)
However, this method becomes unwieldy when you're looking at more complex equations with a whole bunch of steps such as my example above.
So, is there any way to invoke these routines (or methods in Go in general) without using receivers, forcing me to create a boatload of temporary matrices in order to solve a more complex equation, or am I stuck doing this the ugly way?

Best way to do an iteration scheme

I hope this hasn't been asked before, if so I apologize.
EDIT: For clarity, the following notation will be used: boldface uppercase for matrices, boldface lowercase for vectors, and italics for scalars.
Suppose x0 is a vector, A and B are matrix functions, and f is a vector function.
I'm looking for the best way to do the following iteration scheme in Mathematica:
A0 = A(x0), B0=B(x0), f0 = f(x0)
x1 = Inverse(A0)(B0.x0 + f0)
A1 = A(x1), B1=B(x1), f1 = f(x1)
x2 = Inverse(A1)(B1.x1 + f1)
...
I know that a for-loop can do the trick, but I'm not very familiar with Mathematica, and I'm concerned that it is not the most efficient way to do it. This is a justified concern, as I would like to define a function u(N) := xN and use it in further calculations.
I guess my questions are:
What's the most efficient way to program the scheme?
Is RecurrenceTable a way to go?
EDIT
It was a bit more complicated than I thought. I'm providing more details in order to obtain a more thorough response.
Before doing the recurrence, I'm having problems understanding how to program the functions A, B and f.
Matrices A and B are functions of the time step dt = 1/T and the space step dx = 1/M, where T and M are the number of points in the {0 < x < 1, 0 < t} region. This is also true for the vector function f.
The dependence of A, B and f on x is rather tricky:
A and B are upper and lower triangular matrices (like a tridiagonal matrix; I suppose we can call them multidiagonal), with defined constant values on their diagonals.
Given a point 0 < xs < 1, I need to determine its representative xn in the mesh (the closest), and then substitute the nth row of A and B with the function v( x) (transposed, of course), and the nth row of f with the function w( x).
Summarizing, A = A(dt, dx, xs, x). The same is true for B and f.
Then I need to do the loop mentioned above, to define u( x) = step[T].
Hope I've explained myself.
I'm not sure if it's the best method, but I'd just use plain old memoization. You can represent an individual step as
xstep[x_] := Inverse[A[x]].(B[x].x + f[x])
and then
u[0] = x0
u[n_] := u[n] = xstep[u[n-1]]
If you know how many values you need in advance, and it's advantageous to precompute them all for some reason (e.g. you want to open a file, use its contents to calculate xN, and then free the memory), you could use NestList. Instead of the previous two lines, you'd do
xlist = NestList[xstep, x0, 10];
u[n_] := xlist[[n + 1]]
This will break if n > 10, of course (obviously, change 10 to suit your actual requirements).
Of course, it may be worth looking at your specific functions to see if you can make some algebraic simplifications.
I would probably write a function that accepts A0, B0, x0, and f0, and then returns A1, B1, x1, and f1 - say
step[A0_?MatrixQ, B0_?MatrixQ, x0_?VectorQ, f0_?VectorQ] := Module[...]
I would then Nest that function. It's hard to be more precise without more precise information.
Also, if your procedure is numerical, then you certainly don't want to compute Inverse[A0], as this is not a numerically stable operation. Rather, you should write
A0.x1 == B0.x0+f0
and then use a numerically stable solver to find x1. Of course, Mathematica's LinearSolve provides such an algorithm.
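For readers who want to see the scheme outside Mathematica, here is a minimal Python sketch of the same idea: each step solves A(x_n)·x_{n+1} = B(x_n)·x_n + f(x_n) instead of forming an inverse, and all iterates are kept in a list in the spirit of NestList. The functions `solve`, `matvec`, and `iterate`, and the constant A, B, f used in the example, are placeholders invented here. The hand-written Gaussian elimination only stands in for a proper library solver.

```python
def solve(A, b):
    """Solve A·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b] with float entries.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def matvec(X, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]

def iterate(A, B, f, x0, steps):
    """Return [x0, x1, ..., x_steps] with A(x_n)·x_{n+1} = B(x_n)·x_n + f(x_n)."""
    xs = [list(x0)]
    for _ in range(steps):
        x = xs[-1]
        rhs = [bx + fx for bx, fx in zip(matvec(B(x), x), f(x))]
        xs.append(solve(A(x), rhs))
    return xs

# Placeholder problem: A = 2·I, B = I, f = (1, 1), starting from x0 = (0, 0).
xs = iterate(lambda x: [[2, 0], [0, 2]],
             lambda x: [[1, 0], [0, 1]],
             lambda x: [1, 1],
             [0, 0], 3)
```

With this layout `xs[n]` is the nth iterate, so a function u(N) is just an index into the list.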
