Dimension reduction: t-SNE and PCA do not preserve the Euclidean distance matrix

My data lives in 128 dimensions. I'm trying to reduce it to 3 dimensions so that I can visualize it while preserving the Euclidean distances, since the distance represents the similarity between two data points.
Original data X: 5 * 128 (5 data points)
[[ -4.46e-02 1.57e-01 2.17e-01 1.24e-01 6.01e-02 7.61e-02
6.38e-02 -1.05e-01 -2.55e-02 5.99e-02 -8.38e-02 5.93e-02
-1.58e-01 -1.05e-01 1.31e-01 -5.33e-02 -4.18e-02 9.32e-02
-1.62e-02 -9.19e-02 -1.30e-01 8.56e-02 -6.13e-02 3.78e-02
7.84e-02 -9.74e-02 -9.42e-02 7.47e-02 -4.65e-02 7.36e-03
-9.19e-04 1.37e-01 -8.52e-02 9.27e-02 6.50e-02 -2.61e-02
7.21e-02 -1.83e-01 -2.49e-02 -9.85e-03 1.57e-01 -7.98e-02
1.50e-01 -1.40e-01 -2.39e-02 4.19e-02 6.98e-02 -1.27e-02
-7.56e-02 4.44e-02 1.86e-01 -2.22e-03 -1.79e-02 -3.90e-02
7.72e-02 4.47e-02 -8.15e-02 -4.31e-02 -6.52e-03 7.73e-02
-1.37e-02 5.78e-02 -1.25e-01 -1.58e-01 1.37e-01 9.34e-02
-6.07e-03 -1.69e-01 -2.12e-01 2.14e-01 -4.05e-02 1.29e-01
4.42e-02 1.71e-01 -2.13e-02 8.00e-03 7.17e-02 4.57e-03
-6.55e-03 -1.66e-01 3.73e-02 1.01e-01 -1.26e-03 1.96e-02
5.44e-02 -1.04e-01 -5.32e-02 -1.57e-02 -6.31e-02 1.89e-01
2.43e-02 1.59e-02 9.13e-03 -4.41e-02 -5.96e-03 1.03e-01
4.33e-02 -3.94e-02 7.85e-02 3.61e-02 -2.32e-02 3.69e-03
-9.57e-03 -1.47e-02 2.61e-02 -4.15e-04 1.41e-02 -4.22e-02
-7.42e-02 1.07e-01 9.08e-03 3.45e-02 6.41e-02 -5.37e-02
1.57e-02 -1.91e-01 8.21e-02 3.31e-02 3.57e-02 1.37e-02
1.56e-01 6.25e-02 4.54e-02 -1.07e-02 1.08e-01 2.69e-02
9.57e-02 -1.24e-01]
...
]
Original distance matrix dist:
from scipy.spatial.distance import pdist, squareform
dist = squareform(pdist(X, 'euclidean'))
[[ 0. , 0.67, 0.62, 0.7 , 0.67],
[ 0.67, 0. , 0.48, 0.76, 0.46],
[ 0.62, 0.48, 0. , 0.7 , 0.48],
[ 0.7 , 0.76, 0.7 , 0. , 0.6 ],
[ 0.67, 0.46, 0.48, 0.6 , 0. ]]
t-SNE:
from sklearn.manifold import TSNE
model = TSNE(n_components=3, random_state=0)
x_tsne = model.fit_transform(X)
x_tsne:
[[ 1.78e-04 4.02e-05 1.01e-04]
[ 2.25e-04 1.90e-04 -1.00e-04]
[ 9.43e-05 -1.72e-05 -1.21e-05]
[ 4.02e-05 1.36e-05 1.49e-04]
[ 7.44e-05 1.08e-05 4.45e-05]]
dist_tsne:
[[ 0.00e+00, 2.55e-04, 1.52e-04, 1.49e-04, 1.22e-04],
[ 2.55e-04, 0.00e+00, 2.60e-04, 3.57e-04, 2.75e-04],
[ 1.52e-04, 2.60e-04, 0.00e+00, 1.72e-04, 6.62e-05],
[ 1.49e-04, 3.57e-04, 1.72e-04, 0.00e+00, 1.10e-04],
[ 1.22e-04, 2.75e-04, 6.62e-05, 1.10e-04, 0.00e+00]]
Comparing dist and dist_tsne, I noticed that the values are not the same, and they are not even proportional. How can I preserve the Euclidean distances while reducing the dimensionality?

That's theoretically not possible in general.
Your original data lives in many more dimensions, and you can't throw some of them away while retaining all pairwise distances.
An example:
Imagine the 3 points of an equilateral triangle (in 2D space).
Every pair of points has the same distance.
Now try to map this to a 1-dimensional sequence (the number line): it's not possible to keep the pairwise distances.
The task of t-SNE and similar methods is to map these points to some lower-dimensional space while preserving the distances approximately, in a visual sense, so that we humans can grasp some of the information hidden in the many dimensions.
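To make the triangle example concrete, here is a minimal sketch (my own illustration, using only NumPy, SciPy and scikit-learn, which the question already uses):

import numpy as np
from scipy.spatial.distance import pdist
from sklearn.decomposition import PCA

# equilateral triangle in 2D: every pair of points is at distance 1
triangle = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.5, np.sqrt(3) / 2]])
print(pdist(triangle))  # [1. 1. 1.]

# even the best linear 1D embedding distorts at least one pair
line = PCA(n_components=1).fit_transform(triangle)
print(pdist(line))      # three distances that are no longer all equal

No embedding into 1D, linear or otherwise, can do better here: there is no arrangement of three points on a line whose three pairwise distances are all equal.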

Related

Pairwise Cosine Similarity using TensorFlow

How can we efficiently calculate pairwise cosine distances in a matrix using TensorFlow? Given an MxN matrix, the result should be an MxM matrix, where the element at position [i][j] is the cosine distance between i-th and j-th rows/vectors in the input matrix.
This can be done with Scikit-Learn fairly easily as follows:
from sklearn.metrics.pairwise import pairwise_distances
pairwise_distances(input_matrix, metric='cosine')
Is there an equivalent method in TensorFlow?
There is an answer for getting a single cosine distance here: https://stackoverflow.com/a/46057597/288875. It is based on tf.losses.cosine_distance.
Here is a solution which does this for matrices:
import tensorflow as tf
import numpy as np

with tf.Session() as sess:
    M = 3

    # input
    input = tf.placeholder(tf.float32, shape=(M, M))

    # normalize each row
    normalized = tf.nn.l2_normalize(input, dim=1)

    # multiply row i with row j using the transpose:
    # one matmul gives the dot products of all row pairs
    prod = tf.matmul(normalized, normalized,
                     adjoint_b=True)  # transpose the second matrix

    dist = 1 - prod

    input_matrix = np.array(
        [[1, 1, 1],
         [0, 1, 1],
         [0, 0, 1]],
        dtype='float32')

    print("input_matrix:")
    print(input_matrix)

    from sklearn.metrics.pairwise import pairwise_distances
    print("sklearn:")
    print(pairwise_distances(input_matrix, metric='cosine'))
    print("tensorflow:")
    print(sess.run(dist, feed_dict={input: input_matrix}))
which gives me:
input_matrix:
[[ 1. 1. 1.]
[ 0. 1. 1.]
[ 0. 0. 1.]]
sklearn:
[[ 0. 0.18350345 0.42264974]
[ 0.18350345 0. 0.29289323]
[ 0.42264974 0.29289323 0. ]]
tensorflow:
[[ 5.96046448e-08 1.83503449e-01 4.22649741e-01]
[ 1.83503449e-01 5.96046448e-08 2.92893231e-01]
[ 4.22649741e-01 2.92893231e-01 0.00000000e+00]]
Note that this solution may not be optimal, as it calculates all entries of the (symmetric) result matrix, i.e. it does almost twice the necessary work. This is unlikely to matter for small matrices; for large matrices a combination of loops may be faster.
Note also that there is no minibatch dimension, so this works for a single matrix only.
Elegant solution (the output is the same as from the scikit-learn pairwise_distances function):
def compute_cosine_distances(a, b):
    # a shape is n_a * dim
    # b shape is n_b * dim
    # result shape is n_a * n_b
    normalize_a = tf.nn.l2_normalize(a, 1)
    normalize_b = tf.nn.l2_normalize(b, 1)
    distance = 1 - tf.matmul(normalize_a, normalize_b, transpose_b=True)
    return distance
Test:
input_matrix = np.array([[1, 1, 1],
                         [0, 1, 1],
                         [0, 0, 1]], dtype='float32')
compute_cosine_distances(input_matrix, input_matrix)
output:
<tf.Tensor: id=442, shape=(3, 3), dtype=float32, numpy=
array([[5.9604645e-08, 1.8350345e-01, 4.2264974e-01],
[1.8350345e-01, 5.9604645e-08, 2.9289323e-01],
[4.2264974e-01, 2.9289323e-01, 0.0000000e+00]], dtype=float32)>
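Picking up the minibatch note from the first answer: a possible batched variant is sketched below. This is my own extension (the name compute_cosine_distances_batched is hypothetical, not from either answer), assuming TF2-style eager execution. tf.matmul broadcasts over leading batch dimensions, so normalizing along the last axis is all that needs to change:

def compute_cosine_distances_batched(a, b):
    # a shape is [..., n_a, dim], b shape is [..., n_b, dim]
    # result shape is [..., n_a, n_b]
    normalize_a = tf.nn.l2_normalize(a, axis=-1)
    normalize_b = tf.nn.l2_normalize(b, axis=-1)
    return 1 - tf.matmul(normalize_a, normalize_b, transpose_b=True)

With inputs of shape [batch, n, dim] this returns one [n, n] cosine-distance matrix per batch element.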

How to take the power of each element in a matrix in NetLogo?

I would like to raise all elements of a matrix to the power of a specific number.
I have a matrix using the matrix extension, set up like this:
let A matrix:make-constant 4 4 5
which gives a 4x4 matrix filled with 5s.
Now I want to raise all elements in the matrix to the same power. Say I raise them to the power 2; then I want to end up with a 4x4 matrix filled with 25s.
How can I do this?
You can do this a couple of ways. The simplest is probably with matrix:times-element-wise. Unfortunately, this will only work for integer powers greater than or equal to 1:
to-report matrix-power [ mat n ]
  let result mat
  repeat n - 1 [
    set result matrix:times-element-wise result mat
  ]
  report result
end
You can also convert the matrix to a list of lists, and then use map on that to raise each element to a power. This has the advantage of working with 0, fractional, and negative powers:
to-report matrix-power [ mat n ]
  report matrix:from-row-list map [ map [ ? ^ n ] ? ] matrix:to-row-list mat
end
map [ ? ^ n ] some-list raises each element of a list to the power of n, and matrix:to-row-list converts the matrix to a list of lists. So we apply map [ ? ^ n ] to each list in the result of matrix:to-row-list, then convert the result back into a matrix with matrix:from-row-list.
You can generalize this to do any element-wise operation:
to-report matrix-map [ function mat ]
  report matrix:from-row-list map [ map function ? ] matrix:to-row-list mat
end
Then, we could define the power function as:
to-report matrix-power [ mat n ]
  report matrix-map task [ ? ^ n ] mat
end

Find vectors with n-1 equal components

I have an unsorted set of n-dimensional vectors and, for each of the n dimensions in turn, I am looking for the subsets of vectors that differ in only this dimension's component. How can I do this efficiently?
Example:
[ (1,2,3), (1,3,3), (2,3,3), (1,2,5), (2,2,5), (2,3,4) ]
dim 3 variable: [ (1,2,3), (1,2,5) ] & [ (2,3,3), (2,3,4) ]
dim 2 variable: [ (1,2,3), (1,3,3) ]
dim 1 variable: [ (1,3,3), (2,3,3) ] & [ (1,2,5), (2,2,5) ]
Thanks very much for your help!
EDIT
As requested in a comment I am now posting my buggy code:
recursive subroutine get_peaks_on_same_axis(indices, result, current_dim, look_at, last_dim, mode, upper, &
                                            num_groups, num_dim)
    ! Group the indices that denote the location of peaks within PEAK_INDICES which have n-1 dimensions in common.
    ! Eventually, RESULT will hold the groups of these peaks.
    ! e.g.: result(1,:) == (3,7,9) <= peak_indices(3), peak_indices(7), and peak_indices(9) belong together
    integer, intent(in)    :: indices(:), current_dim, look_at, last_dim, mode, num_dim
    integer, intent(inout) :: upper(:), num_groups, result(:,:) ! in RESULT: each line holds a group of peaks
    integer :: i, pos_on_axis, next_dim, aux(0:num_dim-1), stat
    integer, allocatable :: num_peaks(:), groups(:,:)
    integer, save :: slot

    if (mode.eq.0) slot = 1

    ! we're only writing to RESULT once group determination has been completed
    if (current_dim.eq.last_dim) then
        ! saving each column of 'groups' of the instance of the subroutine called one level further up
        ! => those are the peaks which have n-1 dimensions in common
        upper(slot) = ubound(indices,1)
        result(slot,1:upper(slot)) = indices
        num_groups = slot ! after the final call it will contain the actual number of peak groups
        slot = slot + 1
        return
    end if

    aux(0:num_dim-2) = (/ (i, i = 2, num_dim) /)
    aux(num_dim-1) = 1

    associate(peak_indices => public_spectra%intensity(look_at)%peak_indices, &
              ndp => public_spectra%axes(look_at)%ax_set(current_dim)%num_data_points)
        ! potentially as many peaks as there are points in this dimension
        allocate(num_peaks(ndp), groups(ndp,ubound(indices,1)), stat=stat)
        if (stat.ne.0) call aloerr('spectrum_paraphernalia.f90', 763)
        num_peaks(:) = 0
        ! POS_ON_AXIS: ppm value of the peak in dimension DIM, converted to an index on the axis
        ! GROUPS: peaks that have the same axis index in dimension DIM; line: index on axis
        do i = 1, ubound(indices,1)
            pos_on_axis = peak_indices(current_dim, indices(i))
            num_peaks(pos_on_axis) = num_peaks(pos_on_axis) + 1 ! num. of peaks that have this coordinate
            groups(pos_on_axis, num_peaks(pos_on_axis)) = indices(i)
        end do
        next_dim = aux(mod(current_dim+(num_dim-1), num_dim))
        do pos_on_axis = 1, ubound(num_peaks,1)
            if (num_peaks(pos_on_axis).gt.0) then
                call get_peaks_on_same_axis(groups(pos_on_axis,1:num_peaks(pos_on_axis)), result, next_dim, look_at, &
                                            last_dim, 1, upper, num_groups, num_dim)
            end if
        end do
    end associate
end subroutine
What about the naive way?
Let's assume you have m vectors of length n.
Then you have to compare all vectors with each other, which results in 1/2 * (m^2 - m) = O(m^2) comparisons.
In each comparison you check the vectors element-wise. Once you find one difference, you still have to make sure that there is no second one. In the best case, two vectors already differ in their first 2 elements, which takes only 2 element comparisons; the worst case is one difference or none, which requires checking all n elements of the pair.
If there is exactly one difference, you can store its dimension; otherwise store a sentinel value like 0 or -1.
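A minimal sketch of that naive scheme in Python (the function name naive_one_diff and the grouping-by-dimension output format are my own choices, not from the question), run on the question's example:

from collections import defaultdict

def naive_one_diff(vectors):
    # compare every pair of vectors element-wise; keep the pairs that
    # differ in exactly one dimension, grouped by that dimension
    groups = defaultdict(list)
    m = len(vectors)
    for i in range(m):
        for j in range(i + 1, m):
            diff_dims = [d for d in range(len(vectors[i]))
                         if vectors[i][d] != vectors[j][d]]
            if len(diff_dims) == 1:  # exactly one differing component
                groups[diff_dims[0]].append((vectors[i], vectors[j]))
    return dict(groups)

vecs = [(1, 2, 3), (1, 3, 3), (2, 3, 3), (1, 2, 5), (2, 2, 5), (2, 3, 4)]
print(naive_one_diff(vecs))
# {1: [((1, 2, 3), (1, 3, 3))],
#  2: [((1, 2, 3), (1, 2, 5)), ((2, 3, 3), (2, 3, 4))],
#  0: [((1, 3, 3), (2, 3, 3)), ((1, 2, 5), (2, 2, 5))]}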

EM Problem involving 3 coins

I'm working on an estimation problem using the EM algorithm. The problem is as follows:
You have 3 coins with probabilities of being heads P1, P2, and P3 respectively. You flip coin 1. If coin 1=H, then you flip coin 2; if coin 1=T, then you flip coin 3. You only record whether coin 2 or 3 is heads or tails, not which coin was flipped. So the observations are strings of heads and tails, but nothing else. The problem is to estimate P1, P2, and P3.
My R code to do this is below. It's not working, and I can't figure out why. Any thoughts would be appreciated as I think this is a pretty crafty problem.
Ben
###############
#simulate data
p1<-.8
p2<-.8
p3<-.3
tosses<-1000
rbinom(tosses,size=1,prob=p1)->coin.1
pv<-rep(p3,tosses)
pv[coin.1==1]<-p2
#pv now contains the probability of a head for each toss
rbinom(tosses,size=1,prob=pv)->face
rm(list=(ls()[ls()!="face"]))
#face is all you get to see!
################
#e-step
e.step<-function(x,theta.old) {
  fun<-function(p,theta.old,x) {
    theta.old[1]->p1
    theta.old[2]->p2
    theta.old[3]->p3
    log(p1*p2^x*(1-p2)^(1-x))*(x*p1*p2+(1-x)*p1*(1-p2))->tmp1 #this is the first part of the expectation
    log((1-p1)*p3^x*(1-p3)^(1-x))*(x*(1-p1)*p3+(1-x)*(1-p1)*(1-p3))->tmp2 #this is the second
    mean(tmp1+tmp2)
  }
  return(fun)
}
#m-step
m.step<-function(fun,theta.old,face) {
  nlminb(start=runif(3),objective=fun,theta.old=theta.old,x=face,lower=rep(.01,3),upper=rep(.99,3))$par
}
#initial estimates
length(face)->N
iter<-200
theta<-matrix(NA,iter,3)
c(.5,.5,.5)->theta[1,]
for (i in 2:iter) {
  e.step(face,theta[i-1,])->tmp
  m.step(tmp,theta[i-1,],face)->theta[i,]
  print(c(i,theta[i,]))
  if (max(abs(theta[i,]-theta[i-1,]))<.005) break
}
#note that this thing isn't going anywhere!
You can't estimate P1, P2 and P3 separately. The only useful information is the proportion of recorded heads and the total number of sets of flips (each set of flips is independent, so the order does not matter). This is like trying to solve one equation for three unknowns, and it cannot be done.
The probability of recording a head is q = P1*P2 + (1-P1)*P3, which in your example is 0.8*0.8 + 0.2*0.3 = 0.7,
and that of a tail is one minus that, i.e. P1*(1-P2) + (1-P1)*(1-P3), in your example 0.3. The likelihood of the observed sequence depends on the three parameters only through q, so any combination of P1, P2 and P3 with the same q fits the data equally well.
Here is a simple simulator:
#simulate data
sim <- function(tosses, p1, p2, p3) {
  coin.1 <- rbinom(tosses, size=1, prob=p1)
  coin.2 <- rbinom(tosses, size=1, prob=p2)
  coin.3 <- rbinom(tosses, size=1, prob=p3)
  ifelse(coin.1 == 1, coin.2, coin.3) # returned
}
The following illustrations all produce about 0.7 (with some random fluctuation):
> mean(sim(100000, 0.8, 0.8, 0.3))
[1] 0.70172
> mean(sim(100000, 0.2, 0.3, 0.8))
[1] 0.69864
> mean(sim(100000, 0.5, 1.0, 0.4))
[1] 0.69795
> mean(sim(100000, 0.3, 0.7, 0.7))
[1] 0.69892
> mean(sim(100000, 0.5, 0.5, 0.9))
[1] 0.70054
> mean(sim(100000, 0.6, 0.9, 0.4))
[1] 0.70201
Nothing you can do subsequently will distinguish these.

2D transformation matrices for translation, shearing, scaling and rotation?

I've been looking around the net for ages trying to find out how to derive the 2D transformation matrices for the above operations. I couldn't find it in my college notes, and it was a past exam question, so I'm wondering if anybody could help for revision purposes. Cheers.
A transformation matrix is simply a short-hand for applying a function to the x and y values of a point, independently. In the case of translation, x' = 1*x + 0*y + dx*1 and y' = 0*x + 1*y + dy * 1. The matrix representation of these two equations is as follows:
[[ 1 0 dx ]   [[ x ]   [[ x' ]
 [ 0 1 dy ] *  [ y ] =  [ y' ]
 [ 0 0 1  ]]   [ 1 ]]   [ 1  ]]
The other matrices can be derived the same way: simply determine what x' and y' should be, in terms of x, y and 1. For instance:
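Working through that recipe for scaling by factors sx and sy, a counter-clockwise rotation by an angle t, and a shear of x by a factor k of y gives the following standard matrices (the symbols sx, sy, t and k are my own notation, not from the question):

[[ sx 0  0 ]    [[ cos t  -sin t  0 ]    [[ 1 k 0 ]
 [ 0  sy 0 ]     [ sin t   cos t  0 ]     [ 0 1 0 ]
 [ 0  0  1 ]]    [ 0       0      1 ]]    [ 0 0 1 ]]
   scaling             rotation             shear

For example, the shear matrix comes from writing x' = x + k*y and y' = y, then reading off the coefficients of x, y and 1 row by row.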
See Wikipedia for more detail.
