I want to do scattered interpolation in Matlab, but scatteredInterpolant does not do quite what I want.
scatteredInterpolant allows me to provide a set of input sampling positions and the corresponding sample values. Then I can query the interpolated values by supplying a set of positions:
F = scatteredInterpolant(xpos, ypos, samplevals)
interpvals = F(xgrid, ygrid)
This is sort of the opposite of what I want. I already have a fixed set of sample positions, xpos/ypos, and output grid, xgrid/ygrid, and then I want to vary the sample values. The use case is that I have many quantities sampled at the same sampling positions, that should all be interpolated to the same output grid.
I have an idea how to do this for nearest neighbor and linear interpolation, but not for more general cases, in particular for natural neighbor interpolation.
This is what I want, in mock code:
G = myScatteredInterpolant(xpos, ypos, xgrid, ygrid, interp_method)
interpvals = G(samplevals)
In terms of what this means, I suppose G holds a (presumably sparse) matrix of weights, W, and then G(samplevals) basically does W * samplevals, where the weights in the matrix W depend on the input and output grids as well as on the interpolation method (nearest neighbor, linear, natural neighbor). Calculating the matrix W is probably much more expensive than evaluating the product W * samplevals, which is why I want it to be reused.
Is there any code in Matlab, or in a similar language that I could adapt, that does this? Can it somehow be extracted from scatteredInterpolant in reasonable processing time?
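To make the "precompute W once, then apply it to many value sets" idea concrete, here is a minimal Python sketch. It only covers the nearest-neighbor case, where each row of W holds a single weight of 1. The names build_nearest_weights and apply_weights are hypothetical, and each row of W is stored as a list of (sample index, weight) pairs as an assumed stand-in for a sparse matrix. A linear method would presumably store three barycentric weights per row instead. Natural neighbor weights are not covered here.

```python
def build_nearest_weights(sample_pts, query_pts):
    """Build the rows of W for nearest-neighbor interpolation.

    Each row is a list of (sample_index, weight) pairs. For nearest
    neighbor there is exactly one pair per row, with weight 1.0.
    This brute-force search is for illustration only.
    """
    rows = []
    for qx, qy in query_pts:
        best = min(
            range(len(sample_pts)),
            key=lambda i: (sample_pts[i][0] - qx) ** 2 + (sample_pts[i][1] - qy) ** 2,
        )
        rows.append([(best, 1.0)])
    return rows


def apply_weights(rows, samplevals):
    """Evaluate W * samplevals using the precomputed sparse rows."""
    return [sum(w * samplevals[i] for i, w in row) for row in rows]


# Fixed sample positions and output positions (made-up data).
sample_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
query_pts = [(0.1, 0.1), (0.9, 0.2)]

# The expensive step happens once ...
W_rows = build_nearest_weights(sample_pts, query_pts)

# ... and is reused for every quantity sampled at the same positions.
temperature = apply_weights(W_rows, [10.0, 20.0, 30.0])
pressure = apply_weights(W_rows, [1.0, 2.0, 3.0])
```

Because apply_weights only depends on the (index, weight) row format, the same function would work unchanged for any method whose interpolated value is a weighted sum of the sample values.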
I have a 2D matrix with boolean values, which is updated highly frequently. I want to choose a 2D index {x, y} within the matrix, and find the nearest element that is "true" in the table, without going through all the elements (the matrix is massive).
For example, if I have the matrix:
0000100
0100000
0000100
0100001
and I choose a coordinate {x1, y1} such as {4, 3}, I want to get back the location of the closest "true" value, which in this case is {5, 3}. The distance between the elements is measured using the standard Pythagorean equation:
distance = sqrt(distX * distX + distY * distY) where distX = x1 - x and distY = y1 - y.
I can go through all the elements in the matrix and keep a list of "true" values and select the one with the shortest distance result, but it's extremely inefficient. What algorithm can I use to reduce search time?
Details: The matrix size is 1920x1080, and around 25 queries will be made every frame. The entire matrix is updated every frame. I am trying to maintain a reasonable framerate, more than 7fps is enough.
If the matrix is constantly being updated, there is no point in building an auxiliary structure such as a distance transform or a Voronoi diagram.
You can just run a search like BFS (breadth-first search) propagating outward from the query point. The only difference from the usual BFS is the Euclidean metric. So you can generate (u, v) pairs ordered by (u^2+v^2) and check the symmetric points shifted by the (+-u,+-v),(+-v,+-u) combinations (four points when u or v is zero or when u = v, eight points otherwise).
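The search described above could be sketched in Python as follows. This is an illustration under assumptions, not a tuned implementation: the function names are hypothetical, coordinates are 0-based with grid[y][x] indexing, and the (u, v) table is built once up to the largest possible offset so that the first hit is guaranteed to be the nearest one. For a 1920x1080 matrix that table would be large, so in practice one would probably cap the radius or generate the pairs lazily.

```python
def ordered_pairs(max_offset):
    """All (u, v) with 0 <= v <= u <= max_offset, sorted by u^2 + v^2."""
    pairs = [(u, v) for u in range(max_offset + 1) for v in range(u + 1)]
    pairs.sort(key=lambda p: p[0] * p[0] + p[1] * p[1])
    return pairs


def symmetric_points(u, v):
    """Offsets (+-u, +-v) and (+-v, +-u); the set removes duplicates."""
    return {
        (sa * a, sb * b)
        for a, b in ((u, v), (v, u))
        for sa in (1, -1)
        for sb in (1, -1)
    }


def nearest_true(grid, x, y, pairs):
    """Return the (x, y) of the nearest true cell, or None if there is none."""
    height, width = len(grid), len(grid[0])
    for u, v in pairs:
        for dx, dy in symmetric_points(u, v):
            cx, cy = x + dx, y + dy
            if 0 <= cx < width and 0 <= cy < height and grid[cy][cx]:
                return (cx, cy)
    return None


# The matrix from the question.
grid = [[int(c) for c in row] for row in ("0000100", "0100000", "0000100", "0100001")]

# The offset table depends only on the matrix size, so it can be built once
# and shared by all queries and all frames.
pairs = ordered_pairs(max(len(grid), len(grid[0])) - 1)

# Query {4, 3} in the question's 1-based notation is (3, 2) in 0-based form.
result = nearest_true(grid, 3, 2, pairs)  # (4, 2), i.e. {5, 3} in 1-based form
```

When several true cells are at exactly the same distance, which one is returned is arbitrary in this sketch.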
You could use a tree data structure like a quad-tree (see https://en.wikipedia.org/wiki/Quadtree) to store all locations with value "true". In this way it should be possible to quickly iterate over all "true" values in the neighborhood of a given location. Furthermore, the tree can be updated in logarithmic time, if the value of a location changes.
I am doing a clustering task and I have a distance matrix. I wish to visualize this distance matrix as a 2D graph. Please let me know if there is any way to do it online or in programming languages like R or python.
My distance matrix is as follows,
I used the classical Multidimensional scaling functionality (in R) and obtained a 2D plot that looks like:
But what I am looking for is a graph with nodes and weighted edges running between them.
Possibility 1
I assume that you want a 2-dimensional graph in which the distances between node positions are the same as those provided by your table.
In Python, you can use networkx for such applications. There are many methods of doing so, but remember that all of them are just approximations (in general it is not possible to create a 2-dimensional representation of points given their pairwise distances). They are some kind of stress-minimization (or energy-minimization) approximation, trying to find a "reasonable" representation with distances similar to those provided.
As an example you can consider a four point example (with correct, discrete metric applied):
p1 p2 p3 p4
---------------
p1 0 1 1 1
p2 1 0 1 1
p3 1 1 0 1
p4 1 1 1 0
In general, drawing an actual "graph" is redundant, since yours is fully connected (each pair of nodes is connected), so it should be sufficient to draw just the points.
Python example
import networkx as nx
import numpy as np
import string
dt = [('len', float)]
A = np.array([(0, 0.3, 0.4, 0.7),
(0.3, 0, 0.9, 0.2),
(0.4, 0.9, 0, 0.1),
(0.7, 0.2, 0.1, 0)
])*10
A = A.view(dt)
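# from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it
G = nx.from_numpy_array(A)
G = nx.relabel_nodes(G, dict(zip(range(len(G.nodes())),string.ascii_uppercase)))
# to_agraph now lives in nx.nx_agraph and requires pygraphviz
G = nx.nx_agraph.to_agraph(G)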
G.node_attr.update(color="red", style="filled")
G.edge_attr.update(color="blue", width="2.0")
G.draw('distances.png', format='png', prog='neato')
In R you can try multidimensional scaling
# Classical MDS
# N rows (objects) x p columns (variables)
# each row identified by a unique row name
d <- dist(mydata) # euclidean distances between the rows
fit <- cmdscale(d,eig=TRUE, k=2) # k is the number of dim
fit # view results
# plot solution
x <- fit$points[,1]
y <- fit$points[,2]
plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2",
main="Metric MDS", type="n")
text(x, y, labels = row.names(mydata), cex=.7)
Possibility 2
You just want to draw a graph with labeled edges
Again, networkx can help:
import networkx as nx
# Create a graph
G = nx.Graph()
# distances
D = [ [0, 1], [1, 0] ]
labels = {}
for n in range(len(D)):
    for m in range(len(D)-(n+1)):
        G.add_edge(n,n+m+1)
        labels[ (n,n+m+1) ] = str(D[n][n+m+1])
pos=nx.spring_layout(G)
nx.draw(G, pos)
nx.draw_networkx_edge_labels(G,pos,edge_labels=labels,font_size=30)
import pylab as plt
plt.show()
Multidimensional scaling (MDS) is exactly what you want. See here and here for more.
You did not mention whether you want a 2-dimensional graph or not. I suppose that you want to build the graph in 2 dimensions because you need it for visualization. In that case, be aware that for most graphs this is simply not possible.
What can probably be done is to approximate the values from the distance matrix, so that small values give relatively short edges and large values give relatively long ones.
With all the previous considerations in mind, one option would be graphviz. See its neato layout engine.
In general, what you are interested in is force-directed drawing. See Wikipedia for further reference.
You can use d3js Force Directed Graph and configure distance between nodes. d3js force layout has some clustering capability to separate nodes with similar distances. Here's an example with values as distance between nodes:
http://vida.io/documents/SyT7DREdQmGSpsBkK
Another way to visualize is to use same distance between nodes but different line thickness. In that case, you'd want to calculate stroke-width based on values:
.style("stroke-width", function(d) { return Math.sqrt(d.value / 50); });
I have a loosely connected graph. For every edge in this graph, I know the approximate distance d(v,w) between nodes v and w at positions p(v) and p(w) as a vector in R3, not only as a Euclidean distance. The error should be small (let's say < 3%) and the first node is at <0,0,0>.
If there were no errors at all, I could calculate the node positions this way:
set p(first_node) = <0,0,0>
calculate_position(first_node)
calculate_position(v):
    for (v,w) in Edges:
        if p(w) is not set:
            set p(w) = p(v) + d(v,w)
            calculate_position(w)
    for (u,v) in Edges:
        if p(u) is not set:
            set p(u) = p(v) - d(u,v)
            calculate_position(u)
The errors of the distance are not equal. But to keep things simple, assume the relative error (d(v,w)-d'(v,w))/E(v,w) is N(0,1)-normal-distributed. I want to minimize the sum of the squared error
sum( ((p(v)-p(w)) - d(v,w) )^2/E(v,w)^2 ) for all edges
The graph may have a moderate number of nodes (> 100) but only a few connections between them, and it has been "prefiltered" (split into subgraphs wherever there is only one connection between those subgraphs).
I have tried a simplistic "physical model" based on Hooke's law, but it is slow and unstable. Is there a better algorithm or heuristic for this kind of problem?
This looks like linear regression. Take error terms of the following form, i.e. without squares and split into separate coordinates:
(px(v) - px(w) - dx(v,w))/E(v,w)
(py(v) - py(w) - dy(v,w))/E(v,w)
(pz(v) - pz(w) - dz(v,w))/E(v,w)
If I understood you correctly, you are looking for values px(v), py(v) and pz(v) for all nodes v such that the sum of squares of the above terms is minimized.
You can do this by creating a matrix A and a vector b in the following way: every row corresponds to one equation of the above form, and every column of A corresponds to one variable, i.e. a single coordinate. For n vertices and m edges, the matrix A will have 3m rows (since you separate coordinates) and 3n−3 columns (since you also fix the first node px(0)=py(0)=pz(0)=0).
The row for (px(v) - px(w) - dx(v,w))/E(v,w) would have an entry 1/E(v,w) in the column for px(v) and an entry -1/E(v,w) in the column for px(w). All other columns would be zero. The corresponding entry in the vector b would be dx(v,w)/E(v,w).
Now solve the linear system (A^T·A)x = A^T·b, where A^T denotes the transpose of A. The solution vector x will contain the coordinates for your vertices. You can break this into three independent problems, one for each coordinate direction, to keep the size of the linear equation system down.
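As a rough illustration of the recipe above, the following Python sketch accumulates the normal equations directly, one coordinate at a time, and solves them with a small Gaussian elimination. The names fit_positions and solve_linear and the (v, w, d, err) edge format are assumptions made for this example. It follows the sign convention of the error terms above, i.e. d(v,w) approximates p(v) - p(w), keeps node 0 fixed at the origin, and assumes the graph is connected (otherwise the system is singular). For larger graphs a sparse solver from a numerical library would be the more sensible choice.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def fit_positions(n_nodes, edges):
    """Weighted least-squares node positions.

    edges: list of (v, w, d, err), where d is a 3-tuple approximating
    p(v) - p(w) and err is E(v, w). Node 0 is fixed at <0,0,0>, so the
    unknowns are nodes 1 .. n_nodes-1.
    """
    m = n_nodes - 1
    per_axis = []
    for k in range(3):  # one independent problem per coordinate
        N = [[0.0] * m for _ in range(m)]  # accumulates A^T A
        r = [0.0] * m                      # accumulates A^T b
        for v, w, d, err in edges:
            wt = 1.0 / (err * err)
            for node, sign in ((v, 1.0), (w, -1.0)):
                if node == 0:
                    continue  # fixed node has no column in A
                r[node - 1] += sign * wt * d[k]
                for other, osign in ((v, 1.0), (w, -1.0)):
                    if other != 0:
                        N[node - 1][other - 1] += sign * osign * wt
        per_axis.append(solve_linear(N, r))
    positions = [(0.0, 0.0, 0.0)]
    positions += [tuple(per_axis[k][i] for k in range(3)) for i in range(m)]
    return positions


# Made-up, error-free triangle: p0 = (0,0,0), p1 = (1,0,0), p2 = (1,2,0).
edges = [
    (1, 0, (1.0, 0.0, 0.0), 0.1),
    (2, 1, (0.0, 2.0, 0.0), 0.2),
    (2, 0, (1.0, 2.0, 0.0), 0.1),
]
positions = fit_positions(3, edges)
```

With consistent input like this triangle, the fitted positions reproduce the true ones up to rounding; with noisy input, edges with a smaller E(v,w) pull harder on the result.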
I'm working with Matlab's image toolbox. In particular, after binarizing and labeling an image, I run
props = regionprops(labeledImage, 'Centroid');
to get the centroids of all the connected objects. Now, I would like to find the one closest to a given pair of coordinates (namely the center of the image). Of course I know I could use a for loop checking each props(i).Centroid pair of coordinates, but that's slow and there must be a matlaby way of doing it...
which is...?
Thanks in advance
The output from REGIONPROPS will be an N-by-1 structure array with one field 'Centroid' that contains a 1-by-2 array. You can first concatenate all these arrays into an N-by-2 array using the function VERTCAT. Then you can replicate your image center coordinates (assumed to be in a 1-by-2 array) using the function REPMAT so that it becomes an N-by-2 array. Now you can compute the distances using vectorized operations and find the index of the value with the minimum distance using the function MIN:
props = regionprops(labeledImage, 'Centroid');
centers = vertcat(props.Centroid); %# Vertically concatenate the centroids
imageCenter = [x y]; %# Your image center coordinates
origin = repmat(imageCenter,numel(props),1); %# Replicate the coordinates
squaredDistance = sum(abs(centers-origin).^2,2); %# Compute the squared distance
[~,minIndex] = min(squaredDistance); %# Find index of the minimum
Note that since you just want the minimum distance, you can just use the squared distances and avoid a needless call to SQRT. Also note that the function BSXFUN could be used as an alternative to replicating the image center coordinates to subtract them from the object centroids.