Separating Axis Theorem - Containment and the minimum translation vector

Separating Axis Theorem - Containment and the minimum translation vector - algorithm

My code to calculate the minimum translation vector using the Separating Axis Theorem works perfectly well, except when one of the polygons is completely contained by another polygon. I have scoured the internet for the solution to this problem and everyone just seems to ignore it ( http://www.codezealot.org/archives/55#sat-contain talks about this, but doesn't give a full solution...)
The pictures below is a screenshot from my program illustrating the problem. The translucent blue triangle is the position of the rectangle before the MTV is applied, and the other triangle is with the MTV applied.

It seems to me that the link you shared does give a solution to this. In your MTV calculation, you have to test for complete containment in a projection and change the calculations accordingly. (The pseudocode is in reference to figure 9 on that page.) Perhaps if you post your code, we can comment on why it isn't working.

Related

How to increase the coordinate resolution of a d3-geo chart

I have a GeoJSON file with small details and features that I want to render using D3. Unfortunately, important details are lost because D3
removes polygon coordinate pairs that are closely spaced.
I've set up a small example to show this. Both links use the exact same GeoJSON data, rendered with both D3-geo and mapbox through github.
Specifically, notice the two areas marked by the red circles.
https://bl.ocks.org/alvra/eebb06be793bc06ff3ae01e6945298b6
https://gist.github.com/alvra/eebb06be793bc06ff3ae01e6945298b6
The top one one marks a part of polygon that is rounded using many closely spaced coordinate pairs, but D3 removes most points and just draws a rough square end.
The lower red circle marks a tiny triangle that is removed altogether. The adjacent polygons should touch exactly, but are also affected by D3's loss of precision.
I haven't found any documentation about D3's coordinate precision or a (configurable) feature size limit.
I've tried decreasing D3-geo's EPSILON and related EPSILON2 values and that removes this problem (for me), although I'm sure even smaller features will still be affected.
Assuming this is related to the fact that D3 uses proper geodesics for polygon segments, while the other mapping libraries just draw straight lines (in the output coordinate space),
I was hoping that this process can only introduce new points.
I haven't been able to find other users experiencing similar problems with small features, although I'm surprised this has never come up before.
Does anyone have an idea about the proper way to deal with this?
Through epsilon, I've narrowed the problem down to this use of pointEqual(). This indicates the problem is with clipCircle considering closely spaced coordinates equal and removes them.
Indeed, if I disable circular clipping projection.clipAngle(null), the problem disappears.

PMVS definition of "n-adjacent"

I am currently reading over Yasutaka Furukawa et al.'s Paper "Accurate, Dense, and Robust Multi-View Stereopsis" (PDF available here), where they describe an MVS-algorithm for reconstructing a 3D point-cloud from images.
I do understand the concepts and the main steps, but there is one detail that I am struggling with. This may be because I am not an English native speaker, so maybe a small hint would be enough.
On page 4 of the linked source, in chapter 3.2 "Expansion", there is the definition of "n-adjacent" patches:
|(c(p)−c(p'))·n(p)|+|(c(p)−c(p'))·n(p')| < 2ρ_2
My question is about ρ_2, that is described as in the following:
[...] ρ_2 is determined automatically as the distance at the depth of the
midpoint of c(p) and c(p') corresponding to an image displacement of β1 pixels
in R(p).
I do not understand what "distance" in this context should be, and I do not understand the stated correspondence to the image displacement.
I know that this is a very specific question, but since this paper is somewhat popular I hoped, that there is somebody, that can help me.

Alright, I think I do get it now.
It just means, that ρ_2 is the distance you have to move in a plane, located as far away from the camera (depth) as the midpoint of c(p) and c(p'), so that you get a displacement of β1 pixels in the image showing the scene.

Algorithm to best fit a rectangle

I'm writing an application which measures boxes from pictures. A sample picture after manipulation is shown below:
My application has identified pixels that are part of the box and changed the color to red. You can see that the image is pretty noisy and therefore creates pretty rough looking edges on the rectangle.
I've been reading about edge/corner detection algorithms, but before I pursue one of them I wanted to step back and see if such a complicated algorithm is really necessary. It seems like there probably is a simpler way to go about this, considering I have a few conditions that simplify things:
The image only contains a rectangle, not any other shape.
Each image only has 1 rectangle.
I do not need to be exact, though I'd like to achieve as best fit as I can.
My first go at a simple algorithm involved finding the top most, bottom most, left most and right most points. Those are the 4 corners. That works OK, but isn't super accurate for noisy edges like this. It is easy to eye ball a much better point as the corner.
Can anyone point me towards an algorithm for this?

You have already identified the region of the image that you are interested in(red region).
Using this same logic you should be able to binarize the image. Say the red region then results in white pixels and the rest is black.
Then trace the external contour of the white region using a contour tracing algorithm.
Now you have a point set that represents the external contour of the region.
Find the minimum-area-rectangle that bounds this point set.
You can easily do this using the OpenCV library. Take a look at threshold, findContours, and minAreaRect if you are planning to use OpenCV. Hope this information helps.

Finding cross on the image

I have set of binary images, on which i need to find the cross (examples attached). I use findcontours to extract borders from the binary image. But i can't understand how can i determine is this shape (border) cross or not? Maybe opencv has some built-in methods, which could help to solve this problem. I thought to solve this problem using Machine learning, but i think there is a simpler way to do this. Thanks!

Viola-Jones object detection could be a good start. Though the main usage of the algorithm (AFAIK) is face detection, it was actually designed for any object detection, such as your cross.
The algorithm is Machine-Learning based algorithm (so, you will need a set of classified "crosses" and a set of classified "not crosses"), and you will need to identify the significant "features" (patterns) that will help the algorithm recognize crosses.
The algorithm is implemented in OpenCV as cvHaarDetectObjects()

From the original image, lets say you've extracted sets of polygons that could potentially be your cross. Assuming that all of the cross is visible, to the extent that all edges can be distinguished as having a length, you could try the following.
Reject all polygons that did not have exactly 12 vertices required to
form your polygon.
Re-order the vertices such that the shortest edge length is first.
Create a best fit perspective transformation that maps your vertices onto a cross of uniform size
Examine the residuals generated by using this transformation to project your cross back onto the uniform cross, where the residual for any given point is the distance between the projected point and the corresponding uniform point.
If all the residuals are within your defined tolerance, you've found a cross.
Note that this works primarily due to the simplicity of the geometric shape you're searching for. Your contours will also need to have noise removed for this to work, e.g. each line within the cross needs to be converted to a single simple line.

Depending on your requirements, you could try some local feature detector like SIFT or SURF. Check OpenSURF which is an interesting implementation of the latter.

after some days of struggle, i came to a conclusion that the only robust way here is to use SVM + HOG. That's all.

You could erode each blob and analyze their number of pixels is going down. No mater the rotation scaling of the crosses they should always go down with the same ratio, excepted when you're closing down on the remaining center. Again, when the blob is small enough you should expect it to be in the center of the original blob. You won't need any machine learning algorithm or training data to resolve this.

Recognizing distortions in a regular grid

To give you some background as to what I'm doing: I'm trying to quantitatively record variations in flow of a compressible fluid via image analysis. One way to do this is to exploit the fact that the index of refraction of the fluid is directly related to its density. If you set up some kind of image behind the flow, the distortion in the image due to refractive index changes throughout the fluid field leads you to a density gradient, which helps to characterize the flow pattern.
I have a set of routines that do this successfully with a regular 2D pattern of dots. The dot pattern is slightly distorted, and by comparing the position of the dots in the distorted image with that in the non-distorted image, I get a displacement field, which is exactly what I need. The problem with this method is resolution. The resolution is limited to the number of dots in the field, and I'm exploring methods that give me more data.
One idea I've had is to use a regular grid of horizontal and vertical lines. This image will distort the same way, but instead of getting only the displacement of a dot, I'll have the continuous distortion of a grid. It seems like there must be some standard algorithm or procedure to compare one geometric grid to another and infer some kind of displacement field. Nonetheless, I haven't found anything like this in my research.
Does anyone have some ideas that might point me in the right direction? FYI, I am not a computer scientist -- I'm an engineer. I say that only because there may be some obvious approach I'm neglecting due to coming from a different field. But I can program. I'm using MATLAB, but I can read Python, C/C++, etc.
Here are examples of the type of images I'm working with:
Regular: Distorted:
--------

I think you are looking for the Digital Image Correlation algorithm.
Here you can see a demo.
Here is a Matlab Implementation.
From Wikipedia:
Digital Image Correlation and Tracking (DIC/DDIT) is an optical method that employs tracking & image registration techniques for accurate 2D and 3D measurements of changes in images. This is often used to measure deformation (engineering), displacement, and strain, but it is widely applied in many areas of science and engineering.
Edit
Here I applied the DIC algorithm to your distorted image using Mathematica, showing the relative displacements.
Edit
You may also easily identify the maximum displacement zone:
Edit
After some work (quite a bit, frankly) you can come up to something like this, representing the "displacement field", showing clearly that you are dealing with a vortex:
(Darker and bigger arrows means more displacement (velocity))
Post me a comment if you are interested in the Mathematica code for this one. I think my code is not going to help anybody else, so I omit posting it.

I would also suggest a line tracking algorithm would work well.
Simply start at the first pixel line of the image and start following each of the vertical lines downwards (You just need to start this at the first line to get the starting points. This can be done by a simple pattern that moves orthogonally to the gradient of that line, ergo follows a line. When you reach a crossing of a horizontal line you can measure that point (in x,y coordinates) and compare it to the corresponding crossing point in your distorted image.
Since your grid is regular you know that the n'th measured crossing point on the m'th vertical black line are corresponding in both images. Then you simply compare both points by computing their distance. Do this for each line on your grid and you will get, by how far each crossing point of the grid is distorted.
This following a line algorithm is also used in basic Edge linking algorithms or the Canny Edge detector.
(All this are just theoretic ideas and I cannot provide you with an algorithm to it. But I guess it should work easily on distorted images like you have there... but maybe it is helpful for you)

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio