My Situation
I'm working on a project which needs to:
Prove the correctness of 3D matrix transformation formulas involving matrix operations
Find a model with the values of the unknown matrix entries.
My Question
What's the best way to express formulas using matrix operations so that they can be solved by Z3? (The way used in Z3Py's Sudoku Example isn't very elegant and doesn't seem suitable for more complex matrix operations.)
Thanks! If anything's unclear, please leave a comment.
Z3 has no built-in support for matrices like this, so the best way to encode them is to encode the formulas they represent. This is roughly how the Sudoku example encodes things. Here is a simple example using a 2x2 real matrix (Z3Py link: http://rise4fun.com/Z3Py/MYnB ):
from z3 import *

# nonlinear version: the matrix entries a_ij and offsets b_i are variables too
# x_1, x_2, a_11, a_12, a_21, a_22, b_1, b_2 = Reals('x_1 x_2 a_11 a_12 a_21 a_22 b_1 b_2')

# linear version (all matrix entries are concrete values)
x_1, x_2 = Reals('x_1 x_2')

# A matrix
a_11 = 1
a_12 = 2
a_21 = 3
a_22 = 5

# b-vector
b_1 = 7
b_2 = 11

# the affine map A*x + b, written out entry by entry
newx_1 = a_11 * x_1 + a_12 * x_2 + b_1
newx_2 = a_21 * x_1 + a_22 * x_2 + b_2
print(newx_1)
print(newx_2)

# solve using model generation
s = Solver()
s.add(newx_1 == 0)  # constraints on the transformed vector
s.add(newx_2 == 5)
print(s.check())
print(s.model())

# solve using "solve"
solve(And(newx_1 == 0, newx_2 == 5))
To get Z3 to solve for the unknown matrix entries, uncomment the second line (the one declaring symbolic names for a_11, a_12, etc.), and comment out both the linear-version declaration of x_1, x_2 and the concrete assignments a_11 = 1, etc. Z3 will then solve for the unknowns by finding satisfying assignments to these variables. Note that you may need to enable model completion for your purposes (e.g., if you need assignments to all of the unknown matrix parameters or x_i variables); see, e.g.: Z3 4.0: get complete model
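For concreteness, here is a minimal sketch of the nonlinear variant, using nothing beyond the commented-out declaration from the example above (the matrix entries are left symbolic, so Z3 solves for them too):

from z3 import *

# everything symbolic: Z3 now solves for the matrix entries as well
x_1, x_2, a_11, a_12, a_21, a_22, b_1, b_2 = Reals(
    'x_1 x_2 a_11 a_12 a_21 a_22 b_1 b_2')

newx_1 = a_11 * x_1 + a_12 * x_2 + b_1
newx_2 = a_21 * x_1 + a_22 * x_2 + b_2

s = Solver()
s.add(newx_1 == 0, newx_2 == 5)
print(s.check())   # sat
print(s.model())   # one satisfying assignment to the unknowns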
However, based on the link you shared, you are interested in performing operations involving sinusoids (the rotations), which are in general transcendental, and Z3 at this point has no support for transcendental functions (general exponentials, etc.). This will be the challenging part for you: proving that the operation works for any choice of rotation angle, or even just encoding the rotations. The scalings and translations should not be too hard to encode.
Also, see the following answer for how to encode linear differential equations, which are equations of the form x' = Ax, where A is an n x n matrix and x is an n-dimensional vector: Encoding of first order differential equation as First order formula
Related
Suppose I have a list of numbers = [3, 10, 20, 1, ...]
How can I assign a number (x1, x2, x3, x4, ...) to each of the elements in the list, so that 3/x1 ~= 10/x2 ~= 20/x3 ~= 1/x4 ~= ...?
Edit: there are some restrictions on the numbers (x1, x2, x3, ...): they have to be picked from a list of available numbers (which can be floating point as well).
The problem is that the two lists do not have the same number of elements; there are more available x values, and an x value can be assigned multiple times.
The goal is to minimize the differences between 3/x1, 10/x2, 20/x3, 1/x4, ...
It often helps to develop a mathematical model. For example, let
a(i) >= 0, i = 1,..,m
b(j) > 0, j = 1,..,n, with n > m
be the data, and introduce variables (to be determined by the model):
c = common value that all expressions should be close to
x(i,j) = 1 if a(i) is assigned to b(j), 0 otherwise
Then we can write:
min sum((i,j), (x(i,j)*(a(i)/b(j) - c))^2 )
subject to
sum(j, x(i,j)) = 1 for all i (each a(i) is assigned to exactly one b(j))
x(i,j) in {0,1}
c free
This is a non-linear model. MINLP (Mixed Integer Non-linear Programming) solvers are readily available. You can also choose an objective that can be linearized:
min sum((i,j), abs( x(i,j)*a(i)/b(j) - y(i,j) ) )
subject to
y(i,j) = x(i,j)*c
sum(j, x(i,j)) = 1 for all i
x(i,j) in {0,1}
c free
This can be reformulated as a MIP (Mixed Integer Programming) model. There are many MIP solvers available.
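For illustration only (none of this is from the original answer): a sketch of the linearized model in Python with the PuLP MIP library, using made-up data and a standard big-M linearization of y(i,j) = x(i,j)*c, assuming c >= 0:

import pulp

a = [3, 10, 20, 1]                   # the given numbers (made-up data)
b = [2.0, 5.0, 7.5, 1.0, 0.5]        # the available numbers (made-up data)
r = {(i, j): a[i] / b[j] for i in range(len(a)) for j in range(len(b))}
M = max(r.values())                  # big-M bound on c

prob = pulp.LpProblem("ratio_assignment", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", r.keys(), cat="Binary")
y = pulp.LpVariable.dicts("y", r.keys(), lowBound=0)   # y(i,j) = x(i,j)*c
d = pulp.LpVariable.dicts("d", r.keys(), lowBound=0)   # |x*a/b - y|
c = pulp.LpVariable("c", lowBound=0, upBound=M)

prob += pulp.lpSum(d.values())                         # objective
for i in range(len(a)):                                # one b(j) per a(i)
    prob += pulp.lpSum(x[i, j] for j in range(len(b))) == 1
for k in r:
    prob += y[k] <= M * x[k]                           # y = 0 when x = 0
    prob += y[k] <= c                                  # y = c when x = 1
    prob += y[k] >= c - M * (1 - x[k])
    prob += d[k] >= r[k] * x[k] - y[k]                 # d >= |r*x - y|
    prob += d[k] >= y[k] - r[k] * x[k]

prob.solve()
print(pulp.value(c), [j for (i, j) in r if x[i, j].value() > 0.5])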
The solution can be displayed as a matrix of the values a(i)/b(j): each row corresponds to an a(i) and has exactly one matching b(j).
More details are here.
Let's divide the problem into 2 parts; the second one is optional.
Part 1
I have 3 linear equations with N variables, where N is usually bigger than 3.
x1*a + x2*b + x3*c + x4*d + [...] + xN*p = B1
y1*a + y2*b + y3*c + y4*d + [...] + yN*p = B2
z1*a + z2*b + z3*c + z4*d + [...] + zN*p = B3
I am looking for (a, b, c, d, [...], p); the other quantities are constants.
The standard Gaussian approach won't work because the matrix is wider than it is tall. Of course, I can use it to eliminate 2 variables. Do you know an algorithm to find a solution? (I only need one.) More zeros among the solution coefficients are better, but not required.
Part 2
The coefficients in the solution must be non-negative.
Requirements:
The algorithm must be fast enough to run in real time (1800 runs per second on an average PC), so a trial-and-error method is a no-go.
The algorithm will be implemented in C#, but feel free to use pseudocode if you want to write code.
Set the extra variables to zero. Now we have the matrix equation
A.x = b, where
A = | x1 x2 x3 |
    | y1 y2 y3 |
    | z1 z2 z3 |
b = (B1, B2, B3), as a column vector.
Now invert A. The solution is:
x = A^-1 . b
In Excel, end matrix formulas with Ctrl+Shift+Enter.
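As a small numpy illustration of this (the coefficient values below are made up): zero the extra variables and solve the remaining 3x3 system. For Part 2, scipy offers a non-negative least-squares routine.

import numpy as np

# rows = the three equations; columns = the three unknowns kept (a, b, c)
A = np.array([[2.0, 1.0, 3.0],
              [1.0, 4.0, 1.0],
              [5.0, 2.0, 2.0]])
b = np.array([10.0, 12.0, 20.0])

x = np.linalg.solve(A, b)   # solves A.x = b without forming A^-1 explicitly
print(x)

# Part 2: scipy.optimize.nnls(A, b) returns a least-squares x with x >= 0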
In your experience, what is the best crossover operator for a weights assignment problem?
In particular, I am facing a constraint that forces the sum of all the weights to be 1. Currently, I am using the uniform crossover operator and then dividing all the parameters by their sum to get 1. The crossover works, but I am not sure that this way I can preserve the good parts of my solutions and converge to a better solution.
Do you have any suggestions? It's no problem if I need to build a custom operator.
If your initial population is made up of feasible individuals you could try a differential evolution-like approach.
The recombination operator needs three (random) vectors and adds the weighted difference between two population vectors to a third vector:
offspring = A + f * (B - C)
You could try a fixed weighting factor f in the [0.6, 2.0] range, or experiment with selecting f randomly for each generation or for each difference vector (a technique called dither, which should improve convergence behaviour significantly, especially for noisy objective functions).
This should work quite well since the offspring will automatically be feasible.
Special care should be taken to avoid premature convergence (e.g. some niching algorithm).
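A tiny Python sketch of the recombination (the function name and the population values are made up for illustration):

import random

def de_offspring(population, f=0.8):
    a, b, c = random.sample(population, 3)   # three distinct random parents
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]

pop = [[0.2, 0.3, 0.5], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]]
child = de_offspring(pop)
print(child, sum(child))   # the sum stays 1 (up to floating-point error)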
EDIT
With uniform crossover you are exploring the entire n-dimensional space, while the above recombination limits individuals to a subspace H (the hyperplane Σ_i w_i = 1, where the w_i are the weights) of the original search space.
Reading the question I assumed that the sum of the weights was the only constraint. Since there are other constraints, it's not true that the offspring is automatically feasible.
Anyway, any feasible solution must lie on H:
If A = (a_1, a_2, ..., a_n), B = (b_1, ..., b_n), C = (c_1, ..., c_n) are feasible:

Σ_i a_i = 1
Σ_i b_i = 1
Σ_i c_i = 1

so

Σ_i (a_i + f (b_i - c_i)) = Σ_i a_i + f (Σ_i b_i - Σ_i c_i) = 1 + f (1 - 1) = 1

The offspring is on the H hyperplane.
Now depending on the number / type of additional constraints you could modify the proposed recombination operator or try something based on a penalty function.
EDIT2
You could determine analytically the "valid" range of f, but probably something like this is enough:
f = random(0.6, 2.0);
double trial[] = { f, f/2, f/4, -f, -f/2, -f/4, 0 };

i = 0;
do
{
    offspring = A + trial[i] * (B - C);
    i = i + 1;
} while (unfeasible(offspring) && i < 7);   // stop after the last fallback (0)

return offspring;
This is just an idea; I'm not sure how well it works.
I have a set of points like:
(x , y , z , t)
(1 , 3 , 6 , 0.5)
(1.5 , 4 , 6.5 , 1)
(3.5 , 7 , 8 , 1.5)
(4 , 7.25 , 9 , 2)
I am looking to find the best linear fit to these points, let's say a function like:
f(t) = a * x +b * y +c * z
This is a linear regression problem. The "best fit" depends on the metric you define for what counts as better.
One simple example is the least-squares metric, which aims to minimize the sum of squares (f(x_i, y_i, z_i) - w_i)^2, where w_i is the measured value for sample i.
So, in least squares you are trying to minimize SUM{ (a*x_i + b*y_i + c*z_i - w_i)^2 | for each i }. This function has a single global minimum at:
(a,b,c) = (X^T * X)^-1 * X^T * w
Where:
X is an m x 3 matrix whose rows are your samples (m is the number of samples you have)
X^T is the transpose of this matrix
w is the vector of measured results: (w_1, w_2, ..., w_m)
The * operator represents matrix multiplication
There are other, more complex methods that use different distance metrics; one example is the well-known SVR with a linear kernel.
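As a small illustration, the closed form above in numpy, using the sample points from the question (here w holds the t column):

import numpy as np

# one sample per row: columns are x, y, z
X = np.array([[1.0, 3.0,  6.0],
              [1.5, 4.0,  6.5],
              [3.5, 7.0,  8.0],
              [4.0, 7.25, 9.0]])
w = np.array([0.5, 1.0, 1.5, 2.0])       # the measured values

abc = np.linalg.inv(X.T @ X) @ X.T @ w   # (a, b, c) at the global minimum
print(abc)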
It seems that you are looking for the major axis of a point cloud.
You can work this out by finding the eigenvector associated with the largest eigenvalue of the covariance matrix. This could be an opportunity to use the power method (starting the iterations with the point farthest from the centroid, for example).
It can also be addressed by singular value decomposition, preferably using methods that compute only the largest singular values.
If your data set contains outliers, then RANSAC could be a better choice: take two points at random and compute the sum of distances to the line they define. Repeat a number of times and keep the best fit.
Using the squared distances will answer your request for least-squares, but non-squared distances will be more robust.
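A brief numpy sketch of the covariance/eigenvector route described above, again on the question's points (the t column is dropped, assuming the cloud is the (x, y, z) points):

import numpy as np

pts = np.array([[1.0, 3.0,  6.0],
                [1.5, 4.0,  6.5],
                [3.5, 7.0,  8.0],
                [4.0, 7.25, 9.0]])
centered = pts - pts.mean(axis=0)
cov = np.cov(centered, rowvar=False)       # 3x3 covariance matrix
vals, vecs = np.linalg.eigh(cov)           # eigh: cov is symmetric
major_axis = vecs[:, np.argmax(vals)]      # eigenvector of largest eigenvalue
print(major_axis)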
You have a linear problem.
For example, with the equation Y = a*x1 + b*x2 + c*x3:
In MATLAB:
B = [x1(:) x2(:) x3(:)] \ Y;       % least-squares coefficients via backslash
Y_fit = [x1(:) x2(:) x3(:)] * B;   % fitted values
In Python:
import numpy as np

X = np.column_stack((x1, x2, x3))               # one sample per row
B, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)  # least-squares coefficients
Y_fit = X @ B
I have to do optimization in supervised learning to get my weights.
I have to learn the values (w1, w2, w3, w4) such that whenever the label of my vector A = [a1 a2 a3 a4] is 1, the sum w1*a1 + w2*a2 + w3*a3 + w4*a4 is greater than 0.5, and when the label is -1, it is less than 0.5.
Can somebody tell me how I can approach this problem in MATLAB? One way I know of is to use evolutionary algorithms: take a random value vector and then mutate it, keeping the best candidates.
Is there any other way this can be approached?
You can do it using linprog.
Let A be an n-by-4 matrix consisting of all n training 4-vectors you have. You should also have a vector y with n elements (each either plus or minus 1), representing the label of each training 4-vector.
Using A and y we can write a linear program (look at the doc for the names of the parameters I'm using). Now, you do not have an objective function, so you can simply set f to f = zeros(4,1);.
The only thing you have is the inequality constraint ( <a_i, w> - .5 ) * y_i >= 0 (where <.,.> is the dot product between the 4-vector a_i and the weight vector w).
If my calculations are correct, this constraint can be written as
cmat = bsxfun( @times, A, y );
Overall you get
w = linprog( zeros(4,1), -cmat, -.5*y );
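A rough scipy counterpart of this call (not in the original answer; the training data here is made up, and note that scipy's linprog defaults to x >= 0, so the bounds must be opened explicitly):

import numpy as np
from scipy.optimize import linprog

A = np.array([[0.9, 0.1, 0.4, 0.2],    # made-up training 4-vectors
              [0.1, 0.2, 0.1, 0.3]])
y = np.array([1.0, -1.0])              # their labels

cmat = A * y[:, None]                  # row i of A scaled by y_i
res = linprog(c=np.zeros(4),
              A_ub=-cmat, b_ub=-0.5 * y,
              bounds=[(None, None)] * 4)
print(res.x)                           # a feasible weight vector w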