Why is Python's Laplacian method different from Research Paper? - matrix

I found a Python library for Laplacian Score Feature Selection. But the implementation is seemingly different from the research paper.
I implemented the selection method according to the algorithm from the paper (https://papers.nips.cc/paper/2909-laplacian-score-for-feature-selection.pdf), which is as follows:
However, I found a Python library that implements the Laplacian method (https://github.com/jundongl/scikit-feature/blob/master/skfeature/function/similarity_based/lap_score.py).
So to check that my implementation was correct, I ran both versions on my dataset, and got different answers. In the process of debugging, I saw that the library used different formulas when calculating the affinity matrix (The S matrix from the paper).
The paper uses this formula [formula image not preserved], while the library uses
W_ij = exp(-norm(x_i - x_j)/2t^2)
Further investigation revealed that the library calculates the affinity matrix as follows:
t = kwargs['t']
# compute pairwise euclidean distances
D = pairwise_distances(X)
D **= 2
# sort the distance matrix D in ascending order
dump = np.sort(D, axis=1)
idx = np.argsort(D, axis=1)
idx_new = idx[:, 0:k+1]
dump_new = dump[:, 0:k+1]
# compute the pairwise heat kernel distances
dump_heat_kernel = np.exp(-dump_new/(2*t*t))
G = np.zeros((n_samples*(k+1), 3))
G[:, 0] = np.tile(np.arange(n_samples), (k+1, 1)).reshape(-1)
G[:, 1] = np.ravel(idx_new, order='F')
G[:, 2] = np.ravel(dump_heat_kernel, order='F')
# build the sparse affinity matrix W
W = csc_matrix((G[:, 2], (G[:, 0], G[:, 1])), shape=(n_samples, n_samples))
bigger = np.transpose(W) > W
W = W - W.multiply(bigger) + np.transpose(W).multiply(bigger)
return W
I'm not sure why the library squares each value in the distance matrix. I see that it also does some reordering, and that it uses a different heat kernel formula.
So I'd just like to know whether either source (the paper or the library) is wrong, whether they're somehow equivalent, or whether anyone knows why they differ.
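One way to see what the squaring does is to write both kernels side by side. The sketch below assumes the paper's kernel has the form exp(-||x_i - x_j||^2 / t); that is an assumption about the paper, not something stated in this thread. It also ignores the k-nearest-neighbour sparsification. The function names are illustrative and are not part of scikit-feature.

```python
import numpy as np

def affinity_library_style(X, t):
    """Dense heat kernel following the library excerpt: exp(-d^2 / (2 t^2))."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.linalg.norm(diff, axis=-1)   # plain Euclidean distances
    D **= 2                             # squared, as the excerpt does with D **= 2
    return np.exp(-D / (2 * t * t))

def affinity_paper_style(X, t):
    """Assumed paper form: exp(-||x_i - x_j||^2 / t)."""
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / t)

rng = np.random.default_rng(0)
X = rng.random((6, 3))                  # toy data, 6 samples with 3 features
t_lib = 0.8
W_lib = affinity_library_style(X, t_lib)
# Passing 2 * t_lib**2 as the paper-style bandwidth reproduces the library kernel
W_paper = affinity_paper_style(X, 2 * t_lib ** 2)
print(np.allclose(W_lib, W_paper))
```

Under that assumption the squaring just turns the Euclidean distance into the squared norm that the heat kernel needs, and the two kernels differ only in how the bandwidth is parameterised. Passing the same numeric t to both would therefore give different weights.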

Related

The point that minimizes the sum of euclidean distances to a set of n points

I have a set of points W={(x1, y1), (x2, y2),..., (xn, yn)} on the 2D plane. Can you find an algorithm that takes these points as the input and returns a point (x, y) on the 2D plane which has the minimum sum of distances from the points in W? In other words, if
di = Euclidean_distance((x, y), (xi, yi))
I want to minimize:
d1 + d2 + ... + dn
The Problem
You're looking for the geometric median.
An Easy Solution
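There is no closed-form solution to this problem, so iterative or probabilistic methods are used. The easiest way to find this is probably with Weiszfeld's algorithm, which repeatedly replaces the current estimate y with an average of the data points x_i, each weighted by the inverse of its distance to y:
y_new = sum(i, x_i / ||x_i - y||) / sum(i, 1 / ||x_i - y||)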
We can implement this in Python as follows:
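import numpy as np
from numpy.linalg import norm as npnorm

# Generate random test data (same setup as the SOCP example further down)
POINT_NUM = 100
pts = np.random.rand(POINT_NUM, 2)

c_pt_old = np.random.rand(2)
c_pt_new = np.zeros(2)
# Iterate until the estimate moves by less than the tolerance
while npnorm(c_pt_old - c_pt_new) > 1e-6:
    num = 0
    denom = 0
    for i in range(POINT_NUM):
        # Assumes the estimate never lands exactly on a data point (dist == 0)
        dist = npnorm(c_pt_new - pts[i, :])
        num += pts[i, :] / dist
        denom += 1 / dist
    c_pt_old = c_pt_new
    c_pt_new = num / denom
print(c_pt_new)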
There's a chance that Weiszfeld's algorithm won't converge, so it might be best to run it several times from different starting points.
A General Solution
You can also find this using second-order cone programming (SOCP). In addition to solving your specific problem, this general formulation then allows you to easily add constraints and weightings, such as variable uncertainty in the location of each data point.
To do so, you create a number of auxiliary variables, each bounding from above the distance between the proposed center point and one data point.
You then minimize the sum of these variables. The code follows:
import cvxpy as cp
import numpy as np
import matplotlib.pyplot as plt
#Generate random test data
POINT_NUM = 100
pts = np.random.rand(POINT_NUM,2)
c_pt = cp.Variable(2) #The center point we wish to locate
distances = cp.Variable(POINT_NUM) #Distance from the center point to each data point
#Generate constraints. These are used to hold distances.
constraints = []
for i in range(POINT_NUM):
    constraints.append( cp.norm(c_pt-pts[i,:])<=distances[i] )
objective = cp.Minimize(cp.sum(distances))
problem = cp.Problem(objective,constraints)
optimal_value = problem.solve()
print("Optimal value = {0}".format(optimal_value))
print("Optimal location = {0}".format(c_pt.value))
plt.scatter(x=pts[:,0], y=pts[:,1], s=1)
plt.scatter(c_pt.value[0], c_pt.value[1], s=10)
plt.show()
SOCPs are available in a number of solvers including CPLEX, Elemental, ECOS, ECOS_BB, GUROBI, MOSEK, CVXOPT, and SCS.
I've tested and the two approaches give the same answers to within tolerance.
Weiszfeld, E. (1937). "Sur le point pour lequel la somme des distances de n points donnes est minimum". Tohoku Mathematical Journal. 43: 355–386.
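If that point does not need to be from your sample, then the mean minimises the sum of squared Euclidean distances. Note that this is a different objective from the sum of (unsquared) distances asked about here, which is minimised by the geometric median.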
A third method would be to use a compact nonlinear programming formulation. An unconstrained NLP model would be:
min sum(i, ||x-p(i)|| )
This has just 2 variables (the coordinates of x).
There is a very good initial point available. Let p(i,c) be the coordinates of the data points. Then the mean is
m(c) = sum(i, p(i,c)) / n
where n is the number of data points. This point is often very close to the optimal value of x. So we can use m as an excellent initial point for x.
Some limited experiments indicate that this approach is considerably faster than a cone programming formulation for large n.
For details, see the blog post "Finding the Central Point in a Point Cloud" on Yet Another Math Programming Consultant.
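The unconstrained formulation above can be sketched with SciPy's general-purpose minimizer. This is a minimal illustration, not the blog post's model: the choice of scipy.optimize.minimize with the derivative-free Nelder-Mead method, the random test data, and the name sum_of_distances are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize

def sum_of_distances(x, pts):
    """Objective: sum over i of ||x - p(i)||."""
    return np.linalg.norm(pts - x, axis=1).sum()

rng = np.random.default_rng(0)
pts = rng.random((100, 2))              # random test data

# The mean m(c) serves as the initial point for x
x0 = pts.mean(axis=0)

# Nelder-Mead avoids needing gradients, which do not exist at the data points
result = minimize(sum_of_distances, x0, args=(pts,), method="Nelder-Mead")
center = result.x
print(center, sum_of_distances(center, pts))
```

Because the search starts at the mean, the returned point can be compared directly against it to see how much the objective improves.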

Creating graphics for euclidean instances of TSP

I'm currently working on research centring around the Travelling Salesman Problem. More precisely I'm working with sample data using the EUC_2D edge weight type. Like the following:
1 11003.611100 42102.500000
2 11108.611100 42373.888900
3 11133.333300 42885.833300
I am able to produce a tour order. For example, 2-3-1.
I'd like to be able to create some simple graphics which represent a point set for a given problem, and then a point set with a tour over the top. Could anyone recommend a simple method of achieving this? What software should I be looking at?
Thanks
Just to give you a quick demo of how the usual scientific plotting tools would work (assuming I understood your task correctly):
Plot-only code using python & matplotlib:
import numpy as np
import matplotlib.pyplot as plt

# positions, x_sol and N (the number of nodes) are described below
fig, ax = plt.subplots(2, sharex=True, sharey=True)         # Prepare 2 plots
ax[0].set_title('Raw nodes')
ax[1].set_title('Optimized tour')
ax[0].scatter(positions[:, 0], positions[:, 1])             # plot A
ax[1].scatter(positions[:, 0], positions[:, 1])             # plot B
start_node = 0
distance = 0.
for i in range(N):
    start_pos = positions[start_node]
    next_node = np.argmax(x_sol[start_node])  # needed because of MIP-approach used for TSP
    end_pos = positions[next_node]
    # annotate() draws the arrow from xytext to xy, so the head sits at end_pos
    ax[1].annotate("",
                   xy=end_pos, xycoords='data',
                   xytext=start_pos, textcoords='data',
                   arrowprops=dict(arrowstyle="->",
                                   connectionstyle="arc3"))
    distance += np.linalg.norm(end_pos - start_pos)
    start_node = next_node
textstr = "N nodes: %d\nTotal length: %.3f" % (N, distance)
props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
ax[1].text(0.05, 0.95, textstr, transform=ax[1].transAxes, fontsize=14,  # Textbox
           verticalalignment='top', bbox=props)
plt.tight_layout()
plt.show()
Output:
This code is based on data of the following form:
A 2d-array positions of shape (n_points, n_dimension) like:
[[ 4.17022005e-01 7.20324493e-01]
[ 1.14374817e-04 3.02332573e-01]
[ 1.46755891e-01 9.23385948e-02]
[ 1.86260211e-01 3.45560727e-01]
[ 3.96767474e-01 5.38816734e-01]]
A 2d-array x_sol which is our MIP-solution marking ~1 when node x is followed by y in our solution-tour, like:
[[ 0.00000000e+00 1.00000000e+00 -3.01195977e-11 2.00760084e-11
2.41495095e-11]
[ -2.32741108e-11 1.00000000e+00 1.00000000e+00 5.31351363e-12
-6.12644932e-12]
[ 1.18655962e-11 6.52816609e-12 0.00000000e+00 1.00000000e+00
1.42473796e-11]
[ -4.19937042e-12 3.40039727e-11 2.47921345e-12 0.00000000e+00
1.00000000e+00]
[ 1.00000000e+00 -2.65096995e-11 3.55630808e-12 7.24755899e-12
1.00000000e+00]]
Bigger example, solved with MIP-gap = 5%; meaning: the solution is guaranteed to be at most 5% worse than the optimum (one could see the sub-optimal part in the right where some crossing is happening):
Complete code including fake TSP-data and solving available here.
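If the tour is available as a plain node order (such as 2-3-1 in the question) rather than as a MIP solution matrix, a smaller sketch is enough. The version below uses the three EUC_2D sample nodes from the question; the helper names closed_path, tour_length and plot_tour are made up for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Node id -> (x, y), copied from the EUC_2D sample in the question
coords = {
    1: (11003.611100, 42102.500000),
    2: (11108.611100, 42373.888900),
    3: (11133.333300, 42885.833300),
}
tour = [2, 3, 1]  # tour order from the question (2-3-1)

def closed_path(coords, tour):
    """Return the tour points as an array, with the start repeated at the end."""
    return np.array([coords[node] for node in tour + tour[:1]])

def tour_length(path):
    """Sum of Euclidean segment lengths along the closed path."""
    return float(np.linalg.norm(np.diff(path, axis=0), axis=1).sum())

def plot_tour(coords, tour):
    """Draw the raw point set on the left and the closed tour on the right."""
    path = closed_path(coords, tour)
    fig, ax = plt.subplots(1, 2, sharex=True, sharey=True)
    ax[0].set_title("Point set")
    ax[0].scatter(path[:-1, 0], path[:-1, 1])
    ax[1].set_title("Tour (length %.1f)" % tour_length(path))
    ax[1].scatter(path[:-1, 0], path[:-1, 1])
    ax[1].plot(path[:, 0], path[:, 1])
    return fig

if __name__ == "__main__":
    plot_tour(coords, tour)
    plt.show()
```

Repeating the first node at the end of the path lets a single plot call draw the closing edge of the tour.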
I'm going to recommend Baby X. (It's my own windowing system.)
It's a windowing system that runs on either Linux or MS Windows, and is designed for exactly this type of problem: quickly prototyping a program with a few simple graphics.
Baby X exposes rgba surfaces. You then draw into the buffer, either using your own routines, the Baby X basic routines (lines and polygons), or the Baby X graphics context (fully fledged Bezier-based 2D graphics). It's very quick to set up. You'll obviously have to scale your graph to pixel space, plot symbols for the cities, then draw lines between them to represent the tour.
https://github.com/MalcolmMcLean/babyx
However, there are several graphics systems out there. You just have to choose one that runs on your hardware platform.

Efficient Parallel Sparse Matrix dot product in Scipy Python

I have a really big (1.5M x 16M) sparse CSR SciPy matrix A. What I need to compute is the similarity of each pair of rows. I have defined the similarity like this:
Assume a and b are two rows of matrix A
a = (0, 1, 0, 4)
b = (1, 0, 2, 3)
Similarity (a, b) = 0*1 + 1*0 + 0*2 + 4*3 = 12
To compute all pairwise row similarities I use this (or Cosine similarity):
AT = np.transpose(A)
pairs = A.dot(AT)
Now pairs[i, j] is the similarity of row i and row j for all such i and j.
This is quite similar to the pairwise cosine similarity of rows, so if there is an efficient parallel algorithm that computes pairwise cosine similarity, it would work for me as well.
The problem: This dot product is very slow because it uses just one cpu (I have access to 64 of those cpus on my server).
I can also export A and AT to a file and run any other external program that does the multiplication in parallel and get the results back to the Python program.
Is there any more efficient way of doing this dot product? or computing the pairwise similarity in Parallel?
I finally used the 'cosine' distance metric of scikit-learn and its pairwise_distances function, which supports sparse matrices and is highly parallelised.
sklearn.metrics.pairwise.pairwise_distances(X, Y=None, metric='euclidean', n_jobs=1, **kwds)
I could also divide A into n horizontal parts (blocks of rows), use the parallel python package to run the multiplications concurrently, and stack the results afterwards (vertically, since each part yields a block of rows of the result).
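The row-block idea can be sketched as follows. This is only an illustration under assumptions: it uses a thread pool from the standard library instead of the parallel python package, and threads only give a speed-up if the underlying SciPy routine releases the GIL, which has not been verified here. A process pool would be the alternative. The function name blockwise_gram and the small random matrix are made up for the example.

```python
import numpy as np
from scipy import sparse
from concurrent.futures import ThreadPoolExecutor

def blockwise_gram(A, n_blocks=4, max_workers=4):
    """Compute A @ A.T by multiplying row blocks of A separately."""
    A = A.tocsr()
    AT = A.T.tocsr()
    # Row boundaries of the n_blocks horizontal parts
    bounds = np.linspace(0, A.shape[0], n_blocks + 1).astype(int)

    def one_block(i):
        # Rows bounds[i]:bounds[i+1] of the full similarity matrix
        return A[bounds[i]:bounds[i + 1]] @ AT

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        blocks = list(pool.map(one_block, range(n_blocks)))
    # Each block holds complete rows of the result, so stack vertically
    return sparse.vstack(blocks).tocsr()

A = sparse.random(40, 25, density=0.2, format="csr", random_state=0)
pairs = blockwise_gram(A)
print(pairs.shape)
```

Each block is independent of the others, so the blocks could also be written to disk one at a time if the full result does not fit in memory.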
I wrote my own implementation using sklearn. It is not parallel, but it is quite fast for large matrices.
from scipy.sparse import spdiags
from sklearn.preprocessing import normalize

def get_similarity_by_x_dot_x_greedy_for_memory(sp_matrix):
    sp_matrix = sp_matrix.tocsr()
    matrix = sp_matrix.dot(sp_matrix.T)
    # zero diagonal
    diag = spdiags(-matrix.diagonal(), [0], *matrix.shape, format='csr')
    matrix = matrix + diag
    return matrix

def get_similarity_by_cosine(sp_matrix):
    sp_matrix = normalize(sp_matrix.tocsr())
    return get_similarity_by_x_dot_x_greedy_for_memory(sp_matrix)

calculating Euclidean distance between two image in matlab

I want to calculate the Euclidean distance between two images in Matlab. I found some examples and tried them, but they do not seem to be correct.
The result of this Euclidean distance should be between 0 and 1, but two different approaches gave me different results.
The first algorithm gives me a 4-digit number such as 2000, while the other gives numbers such as 0.007.
What is wrong with it?
This is one of those algorithms I mentioned:
Im1 = imread('1.jpeg');
Im2 = imread('2.jpeg');
Im1 = rgb2gray(Im1);
Im2 = rgb2gray(Im2);
hn1 = imhist(Im1)./numel(Im1);
hn2 = imhist(Im2)./numel(Im2);
% Calculate the Euclidean distance
f = sum((hn1 - hn2).^2)
The final line of code needs a sqrt command, applied to the sum of squared differences:
f = sqrt(sum((hn1-hn2).^2));
check this link
You can also use the norm command
f = norm(hn1-hn2);
These post1 and post2 can be useful.
Oh, I'm not sure where to begin but here are some things that you should think about:
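1: Make sure your histograms are normalised correctly. You want them to have unit L1-norm, i.e. the bins should sum to 1 (for a grayscale image, dividing by the pixel count numel(Im1) as in your code achieves the same thing):
hn1 = imhist(Im1);
hn2 = imhist(Im2);
hn1 = hn1/sum(hn1);
hn2 = hn2/sum(hn2);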
2: Taking the L2-distance between histograms doesn't really make sense (what is a Euclidean distance between two distributions, really?). You should rather take a look at something like an L1 or Chi-2 distance, or use an intersection kernel. L1 would be
f=norm(hn1-hn2,1);
3: If you really do want it to be the L2 Euclidean distance, the last line should be
f=norm(hn1-hn2);
but then you should rather L2-normalize the histogram:
hn1 = imhist(Im1);
hn2 = imhist(Im2);
hn1 = hn1/norm(hn1);
hn2 = hn2/norm(hn2);
4: Please try to be clearer in the formulation of your questions - it was a bit hard to decode :). If you had mentioned the application, I could have given some additional pointers. :)
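The distances discussed in points 1 to 3 can be compared side by side. The sketch below is in Python with NumPy rather than Matlab, and the bin counts are hypothetical stand-ins for the output of imhist. The chi-square form used here, 0.5 * sum((p - q)^2 / (p + q)), is one common definition among several.

```python
import numpy as np

def l1_normalise(hist):
    """Scale a histogram so that its bins sum to 1."""
    hist = np.asarray(hist, dtype=float)
    return hist / hist.sum()

def histogram_distances(h1, h2, eps=1e-12):
    """Return (L2, L1, chi-square) distances between two L1-normalised histograms."""
    p, q = l1_normalise(h1), l1_normalise(h2)
    l2 = np.sqrt(np.sum((p - q) ** 2))
    l1 = np.sum(np.abs(p - q))
    # eps guards against division by zero in bins that are empty in both histograms
    chi2 = 0.5 * np.sum((p - q) ** 2 / (p + q + eps))
    return l2, l1, chi2

# Hypothetical 4-bin histograms with no overlap at all
h1 = [2, 2, 0, 0]
h2 = [0, 0, 3, 1]
l2, l1, chi2 = histogram_distances(h1, h2)
print(l2, l1, chi2)
```

With unit L1-norm histograms the L1 distance lies between 0 and 2 and the L2 distance between 0 and sqrt(2), so a value in the thousands suggests that the histograms were not normalised before the distance was taken.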

Finding translation and scale on two sets of points to get least square error in their distance?

I have two sets of 3D points (original and reconstructed) and correspondence information about pairs, i.e. which point from one set corresponds to which point in the other. I need to find the 3D translation and scaling factor which transform the reconstructed set so that the sum of squared distances is minimal (rotation would be nice too, but the points are rotated similarly, so this is not the main priority and might be omitted for the sake of simplicity and speed). So my question is: is this solved and available somewhere on the Internet? Personally, I would use the least squares method, but I don't have much time (and although I'm somewhat good at math, I don't use it often, so it would be better for me to avoid it), so I would like to use someone else's solution if it exists. I prefer a solution in C++, for example using OpenCV, but the algorithm alone is good enough.
If there is no such solution, I will work it out myself; I don't want to bother you too much.
SOLUTION: (from your answers)
For me it's the Kabsch algorithm:
Base info: http://en.wikipedia.org/wiki/Kabsch_algorithm
General solution: http://nghiaho.com/?page_id=671
STILL NOT SOLVED:
I also need scale. I can't make sense of the scale values from the SVD: I need a scale of about 1-4 for all axes (my estimate), but the SVD scale is about [2000, 200, 20], which is not helping at all.
Since you are already using Kabsch algorithm, just have a look at Umeyama's paper which extends it to get scale. All you need to do is to get the standard deviation of your points and calculate scale as:
(1/sigma^2)*trace(D*S)
where D is the diagonal matrix of singular values from the SVD used in the rotation estimation, and S is either the identity matrix or the diagonal matrix diag(1, 1, -1), depending on the sign of the determinant of UV (which Kabsch uses to correct reflections into proper rotations). So if you have [2000, 200, 20], multiply the last element by +-1 (depending on the sign of the determinant of UV), sum them, and divide by the variance sigma^2 of your points (the squared standard deviation, not the standard deviation itself) to get the scale.
You can recycle the following code, which is using the Eigen library:
#include <Eigen/Dense>
#include <utility>
#include <vector>

typedef Eigen::Matrix<double, 3, 1, Eigen::DontAlign> Vector3d_U; // microsoft's 32-bit compiler can't put Eigen::Vector3d inside a std::vector. for other compilers or for 64-bit, feel free to replace this by Eigen::Vector3d

/**
 *  @brief rigidly aligns two sets of poses
 *
 *  This calculates such a relative pose <tt>R, t</tt>, such that:
 *
 *  @code
 *  _TyVector v_pose = R * r_vertices[i] + t;
 *  double f_error = (r_tar_vertices[i] - v_pose).squaredNorm();
 *  @endcode
 *
 *  The sum of squared errors in <tt>f_error</tt> for each <tt>i</tt> is minimized.
 *
 *  @param[in] r_vertices is a set of vertices to be aligned
 *  @param[in] r_tar_vertices is a set of vertices to align to
 *
 *  @return Returns a relative pose that rigidly aligns the two given sets of poses.
 *
 *  @note This requires the two sets of poses to have the corresponding vertices stored under the same index.
 */
static std::pair<Eigen::Matrix3d, Eigen::Vector3d> t_Align_Points(
    const std::vector<Vector3d_U> &r_vertices, const std::vector<Vector3d_U> &r_tar_vertices)
{
    _ASSERTE(r_tar_vertices.size() == r_vertices.size());
    const size_t n = r_vertices.size();
    Eigen::Vector3d v_center_tar3 = Eigen::Vector3d::Zero(), v_center3 = Eigen::Vector3d::Zero();
    for(size_t i = 0; i < n; ++ i) {
        v_center_tar3 += r_tar_vertices[i];
        v_center3 += r_vertices[i];
    }
    v_center_tar3 /= double(n);
    v_center3 /= double(n);
    // calculate centers of positions, potentially extend to 3D
    double f_sd2_tar = 0, f_sd2 = 0; // only one of those is really needed
    Eigen::Matrix3d t_cov = Eigen::Matrix3d::Zero();
    for(size_t i = 0; i < n; ++ i) {
        Eigen::Vector3d v_vert_i_tar = r_tar_vertices[i] - v_center_tar3;
        Eigen::Vector3d v_vert_i = r_vertices[i] - v_center3;
        // get both vertices
        f_sd2 += v_vert_i.squaredNorm();
        f_sd2_tar += v_vert_i_tar.squaredNorm();
        // accumulate squared standard deviation (only one of those is really needed)
        t_cov.noalias() += v_vert_i * v_vert_i_tar.transpose();
        // accumulate covariance
    }
    // calculate the covariance matrix
    Eigen::JacobiSVD<Eigen::Matrix3d> svd(t_cov, Eigen::ComputeFullU | Eigen::ComputeFullV);
    // calculate the SVD
    Eigen::Matrix3d R = svd.matrixV() * svd.matrixU().transpose();
    // compute the rotation
    double f_det = R.determinant();
    Eigen::Vector3d e(1, 1, (f_det < 0)? -1 : 1);
    // calculate determinant of V*U^T to disambiguate rotation sign
    if(f_det < 0)
        R.noalias() = svd.matrixV() * e.asDiagonal() * svd.matrixU().transpose();
    // recompute the rotation part if the determinant was negative
    R = Eigen::Quaterniond(R).normalized().toRotationMatrix();
    // renormalize the rotation (not needed but gives slightly more orthogonal transformations)
    double f_scale = svd.singularValues().dot(e) / f_sd2_tar;
    double f_inv_scale = svd.singularValues().dot(e) / f_sd2; // only one of those is needed
    // calculate the scale
    R *= f_inv_scale;
    // apply scale
    Eigen::Vector3d t = v_center_tar3 - (R * v_center3); // R needs to contain scale here, otherwise the translation is wrong
    // want to align center with ground truth
    return std::make_pair(R, t); // or put it in a single 4x4 matrix if you like
}
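For readers who want to experiment before porting to C++, the same estimate (scale from trace(D*S) divided by the variance of the source points) can be sketched in Python with NumPy. This is a minimal illustration of the Umeyama-style estimate described above, not a translation of the C++ function: the name similarity_transform and the synthetic test data are made up, and the scale is returned separately instead of being folded into R.

```python
import numpy as np

def similarity_transform(src, dst):
    """Estimate scale c, rotation R, translation t such that dst ~ c * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n, dim = src.shape
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    # Cross-covariance between the centred target and source points
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)
    # S flips the last axis when needed so that R is a proper rotation
    S = np.eye(dim)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[-1, -1] = -1.0
    R = U @ S @ Vt
    # Variance of the source points, matching the 1/n used for cov
    var_src = (src_c ** 2).sum() / n
    scale = (D * np.diag(S)).sum() / var_src
    t = mu_dst - scale * R @ mu_src
    return scale, R, t

# Synthetic check: rotate about z, scale by 2.5, then translate
rng = np.random.default_rng(0)
src = rng.random((20, 3))
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([1.0, -2.0, 0.5])
dst = 2.5 * src @ R_true.T + t_true
scale, R, t = similarity_transform(src, dst)
print(scale)
```

Note that both the covariance and the variance use the same 1/n normalisation. Mixing a summed covariance with a per-point variance (or the reverse) is one way to end up with scale values that look wrong by a large factor.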
For 3D points the problem is known as the Absolute Orientation problem. A C++ implementation is available from Eigen http://eigen.tuxfamily.org/dox/group__Geometry__Module.html#gab3f5a82a24490b936f8694cf8fef8e60 and the paper is at http://web.stanford.edu/class/cs273/refs/umeyama.pdf
You can use it via OpenCV by converting the matrices to Eigen with cv::cv2eigen() calls.
Start with translation of both sets of points. So that their centroid coincides with the origin of the coordinate system. Translation vector is just the difference between these centroids.
Now we have two sets of coordinates represented as matrices P and Q. One set of points may be obtained from other one by applying some linear operator (which performs both scaling and rotation). This operator is represented by 3x3 matrix X:
P * X = Q
To find proper scale/rotation we just need to solve this matrix equation, find X, then decompose it into several matrices, each representing some scaling or rotation.
A simple (but probably not numerically stable) way to solve it is to multiply both sides of the equation by the transposed matrix P^T (to get rid of non-square matrices), then multiply both sides of the equation by the inverse of P^T * P:
P^T * P * X = P^T * Q
X = (P^T * P)^-1 * P^T * Q
Applying Singular value decomposition to matrix X gives two rotation matrices and a matrix with scale factors:
X = U * S * V
Here S is a diagonal matrix with scale factors (one scale for each coordinate), U and V are rotation matrices, one properly rotates the points so that they may be scaled along the coordinate axes, other one rotates them once more to align their orientation to second set of points.
Example (2D points are used for simplicity):
P =  1   2      Q =   7.5391    4.3455
     2   3           12.9796    5.8897
    -2   1           -4.5847    5.3159
    -1  -6          -15.9340  -15.5511
After solving the equation:
X = 3.3417  -1.2573
    2.0987   2.8014
After SVD decomposition:
U = -0.7317  -0.6816
    -0.6816   0.7317
S = 4  0
    0  3
V = -0.9689  -0.2474
    -0.2474   0.9689
Here SVD has properly reconstructed all manipulations I performed on matrix P to get matrix Q: rotate by the angle 0.75, scale X axis by 4, scale Y axis by 3, rotate by the angle -0.25.
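The example can be reproduced with NumPy. The sketch below uses np.linalg.lstsq to solve P * X = Q in the least-squares sense instead of forming (P^T * P)^-1 explicitly, which is a substitution made here for numerical stability. The signs of the factors returned by NumPy's SVD may differ from the ones printed above.

```python
import numpy as np

# P and Q copied from the example above (centroids already at the origin)
P = np.array([[1, 2], [2, 3], [-2, 1], [-1, -6]], dtype=float)
Q = np.array([[7.5391, 4.3455],
              [12.9796, 5.8897],
              [-4.5847, 5.3159],
              [-15.9340, -15.5511]])

# Least-squares solution of P @ X = Q
X, *_ = np.linalg.lstsq(P, Q, rcond=None)

# Singular values of X are the per-axis scale factors
U, s, Vt = np.linalg.svd(X)
print(X)
print(s)
```

The singular values come back sorted in descending order, so they identify the scale factors but not which original axis each one belonged to.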
If sets of points are scaled uniformly (scale factor is equal by each axis), this procedure may be significantly simplified.
Just use the Kabsch algorithm to get the translation/rotation values. Then apply this translation and rotation (the centroids should coincide with the origin of the coordinate system). Then, over all pairs of corresponding points (and all coordinates), fit a linear regression. The regression coefficient is exactly the scale factor.
A good explanation: Finding optimal rotation and translation between corresponding 3D points
The code is in Matlab, but it's trivial to convert to OpenCV using the cv::SVD function.
You might want to try ICP (Iterative closest point).
Given two sets of 3d points, it will tell you the transformation (rotation + translation) to go from the first set to the second one.
If you're interested in a c++ lightweight implementation, try libicp.
Good luck!
The general transformation, as well as the scale, can be retrieved via Procrustes analysis. It works by superimposing the objects on top of each other and estimating the transformation from that setting. It has been used in the context of ICP many times. In fact, your preference, the Kabsch algorithm, is a special case of this.
Moreover, Horn's alignment algorithm (based on quaternions) also finds a very good solution, while being quite efficient. A Matlab implementation is also available.
Scale can be inferred without SVD, if your points are uniformly scaled in all directions (I could not make sense of SVD-s scale matrix either). Here is how I solved the same problem:
Measure distances of each point to other points in the point cloud to get a 2d table of distances, where entry at (i,j) is norm(point_i-point_j). Do the same thing for the other point cloud, so you get two tables -- one for original and the other for reconstructed points.
Divide all values in one table by the corresponding values in the other table. Because the points correspond to each other, the distances do too. Ideally, the resulting table has all values being equal to each other, and this is the scale.
The median value of the divisions should be pretty close to the scale you are looking for. The mean value is also close, but I chose median just to exclude outliers.
Now you can use the scale value to scale all the reconstructed points and then proceed to estimating the rotation.
Tip: If there are too many points in the point clouds to find distances between all of them, then a smaller subset of distances will work, too, as long as it is the same subset for both point clouds. Ideally, just one distance pair would work if there is no measurement noise, e.g when one point cloud is directly derived from the other by just rotating it.
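The distance-ratio steps above can be sketched in a few lines of NumPy. This assumes uniform scaling, known correspondences (same index in both arrays), and no coincident points in the reconstructed set. The function names and the synthetic data are made up for the example.

```python
import numpy as np

def pairwise_distances(points):
    """Table of distances where entry (i, j) is norm(point_i - point_j)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def estimate_scale(original, reconstructed):
    """Median ratio of corresponding pairwise distances (original / reconstructed)."""
    d_orig = pairwise_distances(original)
    d_rec = pairwise_distances(reconstructed)
    # Use each unordered pair once and skip the zero diagonal
    i, j = np.triu_indices(len(original), k=1)
    ratios = d_orig[i, j] / d_rec[i, j]
    return float(np.median(ratios))

# Synthetic check: the original is the reconstruction rotated, scaled by 2.5 and shifted
rng = np.random.default_rng(0)
reconstructed = rng.random((30, 3))
angle = 0.4
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle),  np.cos(angle), 0.0],
              [0.0,            0.0,           1.0]])
original = 2.5 * reconstructed @ R.T + np.array([3.0, -1.0, 2.0])
scale = estimate_scale(original, reconstructed)
print(scale)
```

Because pairwise distances are unaffected by rotation and translation, the scale can be estimated first and the rotation afterwards, as described above.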
You can also use ScaleRatio ICP, proposed by Baowei Lin.
The code can be found on GitHub.
