I was trying to make a application that compares the difference between 2 images in java with opencv. After trying various approaches I came across the algorithm called Demons algorithm.
To me it seems to give the difference of images by some transformation on each place. But I couldn't understand it since the references I found were too complex for me.
Even the demons algorithm does not do what I need I'm interested in learning it.
Can any one explain simply what happens in the demons algorithm and how to write a simple code to use that algorithm on 2 images.
I can give you an overview of general algorithms for deformable image registration, demons is one of them
There are 3 components of the algorithm, a similarity metric, a transformation model and an optimization algorithm.
A similarity metric is used to compute pixel based / patch based similarity between pixels/patches. Common similarity measures are SSD, normalized cross correlation for mono-modal images while information theoretic measures like mutual information are used in the case of multi-modal image registration.
In the case of deformable registration, they generally have a regular grid super-imposed over the image and the grid is deformed by solving an optimization problem which is formulated such that the similarity metric and the smoothness penalty imposed over the transformation is minimized. In deformable registration, once there are deformations over the grid, the final transformation at the pixel level is computed using a B-Spine interpolation of the grid at the pixel level so that the transformation is smooth and continuous.
There are 2 general approaches towards solving the optimization problem, some people use discrete optimization and solve it as a MRF optimization problem while some people use gradient descent, I think demons uses gradient descent.
In case of MRF based approaches, the unary cost is the cost for deforming each node in grid and it is the similarity computed between patches, the pairwise cost which imposes the smoothness of the grid, is generally a potts/truncated quadratic potential which ensures that neighboring nodes in the grid have almost the same displacement. Once you have the unary and pairwise cost, you feed it to a MRF optimization algorithm and get the displacements at the grid level, then you use a B-Spline interpolation to compute pixel level displacement. This process is repeated in a coarse to fine fashion over several scales and also the algorithm is run many times at each scale (reducing the displacement at each node every time).
In case of gradient descent based methods, they formulate the problem with the similarity metric and the grid transformation computed over the image and then compute the gradient of the energy function which they have formulated. The energy function is minimized using iterative gradient descent, however these approaches can get stuck in a local minima and are quite slow.
Some popular methods are DROP, Elastix, itk provides some tools
If you want to know more about algorithms related to deformable image registration, I will recommend you to take a look to FAIR( guide book), FAIR is a toolbox for Matlab so you will have examples to understand the theory.
http://www.cas.mcmaster.ca/~modersit/FAIR/
Then if you want to specifically see some demon example,, here you have this other toolbox:
http://www.mathworks.es/matlabcentral/fileexchange/21451-multimodality-non-rigid-demon-algorithm-image-registration
Related
In machine learning, a lot of techniques require defining a metric between different data points. I want to know what are some popular metrics when the data are images.
An obvious way of measuring distance between images is to sum up the squares of pixel errors. But this is sensitive to simple transformations like translation. For example, even shifting the whole image by one pixel could result in a large distance.
What are some other distance measuring techniques that is more compatible with translation, rotations, etc.?
Wasserstein distance(earth mover's distance) and kullback leibler divergence are the two that I have come across while studying literature about Generative Adversarial Networks(GANs).
I have a challenging problem to solve. The Figure shows green lines, that are derived from an image and the red lines are the edges derived from another image. Both the images are taken from the same camera, so the intrinsic parameters are same. Only, the exterior parameters are different, i.e. there is a slight rotation and translation while taking the 2nd image. As it can be seen in the figure, the two sets of lines are pretty close. My task is to find correspondence between the edges derived from the 1st image and the edges derived from the second image.
I have gone through a few sources, that mention taking corresponding the nearest line segment, by calculating Euclidean distances between the endpoints of an edge of image 1 to the edges of image 2. However, this method is not acceptable for my case, as there are edges in image 1, near to other edges in image 2 that are not corresponding, and this will lead to a huge number of mismatches.
After a bit of more research, few more sources referred to Hausdorff distance. I believe that this could really be a solution to my problem and the paper
"Rucklidge, William J. "Efficiently locating objects using the
Hausdorff distance." International Journal of Computer Vision 24.3
(1997): 251-270."
seemed to be really interesting.
If, I got it correct the paper formulated a function for calculating translation of model edges to image edges. However, while implementation in MATLAB, I'm completely lost, where to begin. I will be much obliged if I can be directed to a pseudocode of the same algorithm or MATLAB implementation of the same.
Additionally, I am aware of
"Apply Hausdorff distance to tile image classification" link
and
"Hausdorff regression"
However, still, I'm unsure how to minimise Hausdorff distance.
Note1: Computational cost is not of concern now, but faster algorithm is preferred
Note2: I am open to other algorithms and methods to solve this as long as there is a pseudocode available or an open implementation.
Have you considered MATLAB's image registration tools?
With imregister(https://www.mathworks.com/help/images/ref/imregister.html), you can just insert both images, 1 as reference, one as "moving" and it will register them together using an affine transform. The function call is just
[optimizer, metric] = imregconfig('monomodal');
output_registered = imregister(moving,fixed,'affine',optimizer,metric);
For better visualization, use the RegistrationEstimator command to open up a gui in which you can import the 2 images and play around with it to register your images. From there you can export code for future images.
Furthermore if you wish to account for non-rigid transforms there is imregdemons(https://www.mathworks.com/help/images/ref/imregdemons.html) which works much the same way.
You can compute the Hausdorff distance using Matlab's bwdist function. You would compute the distance transform of one image, evaluate it at the edge points of the other, and take the maximum value. (You can also take the sum instead, in which case it is called the chamfer distance.) For this problem you'll probably want the symmetric Hausdorff distance, so you would do the computation in both directions.
Both Hausdorff and chamfer distance measure the match quality of a particular alignment. To find the best registration you'll need to try multiple alignment transformations and evaluate them all looking for the best one. As suggested in another answer, you may find it easier to use registration existing tools than to write your own.
I have a general question in image processing. I have a noisy image. I would like to classify the noisy image into some regions. Two famous approaches which can use
MRF/Gibbs MRF: model the spatial dependence between neighborhood pixels
Total variation: key idea maybe based on smallest variation of image.
My question is: Could you tell me what are different between two approaches for noise removal? Which one is better? Thanks
The MRF gives you a framework to do discrete optimization of problems, which respect the Markovian property, that is a pixel is conditioned only on the neighboring ones (roughly stated). Typical applications include binary or multi-class labeling problems. Total variation on the other hand, is generally used as a regularization by adding the integral of the absolute gradient of the signal/image to the energy functional. This helps to neglect irrelevant detail and focus on important ones.
We cannot say one is better than the other, as they are not exactly contradictory things. It depends on the application and the energy function you use in the MRFs.
I play Dwarf Fortress game. And the main challenge for me is to design layout of the fortress efficiently. Meaning, that each industry flow should be as dense as possible, to minimize the travel distances.
An example could be food industry . Each grey ellipse represents a single building. Each white rectangle represents product from the building.
My goal is to find algorithm which would distribute the buildings on 2D grid in such manner that distance between those building is minimal in the sense how they are connected. Meaning that fishery and loom can be far apart, but loom and farmer's should be as close as possible.
At the moment I have considered using some ready software, to simulate the layout, but some tips for algorithm would be fine.
Currently I'm considering some force-directed algorithm, but I'm not sure about the discrete grid requirement.
Formalization of question: Is there a Force Draw Graph algorithm which works in discrete coordinates?
UPDATE: I have found implementation of the Force draw algorithm in AS3 (the web contains JS version too). I will try to convert it to discrete version. But I have some doubts it will work...
UPDATE2: Some further restrictions were requested in comments. Here they are:
Each building occupy single cell on virtual grid. Buildings can be on adjacent cells. Buildings cannot stack/overlap.
(PS: In game, each building has deifned size, usually 3x3, but I want to keep the problem more general, to allow for more approaches).
You are pretty much trying to solve an instance of a floor-planning problem where you are trying to minimize the total "connection" length. Most of these problems are instances of NP-hard problems, some of them have pseudo-polynomial run-time algorithms.
There is a special case you might be interested that is actually fully solvable in polynomial time: if the relative positions of the "boxes" or buildings you want to place are known ahead of time.
For full details on how to solve this particular case, please refer to this tutorial on geometric programming from Stanford, chapter 6, section 6.1 the first example entitled "Floor planning." Another website also includes matlab code that implements and solves the problem (under chapter 8 Geometric Programming.)
So I've managed to do some code which aproximates solution of this problem. It's not a top class product but it's working. I plan to do some updates over time, but I don't have any time frame set.
The source code is here: https://github.com/sutr90/DF-Layout
My code uses Simulated Annealing approach. Where cost function is based on total area, total length of edges and overlap. To measure distance, I use taxi-cab metric, but that is a subject to change.
I have a set of rectangles and arbitrary shape in 2D space. The shape is not necessary a polygon (it may be a circle), and rectangles have different widths and heights. The task is to approximate the shape with rectangles as close as possible. I can't change rectangles dimensions, but rotation is permitted.
It sounds very similar to packing problem and covering problem but covering area is not rectangular...
I guess it's NP problem, and I'm pretty sure there should be some papers that show good heuristics to solve it, but I don't know what to google? Where should I start?
Update: One idea just came into my mind but I'm not sure if it's worth investigating. What if we consider bounding shape as a physical mold filled with water. Each rectangle is considered as a positively charged particle with size. Now drop the smallest rectangle to it. Then drop the next by size at random point. If rectangles too close they repel each other. Keep adding rectangles until all are used. Could this method work?
I think you could look for packing and automatic layout generation algorithms. Automatic VLSI layout generation algorithms might need similar things, just like textile layout questions...
This paper Hegedüs: Algorithms for covering polygons by rectangles seems to address a similar problem. And since this paper is from 1982, it might be interesting to look at the papers which cite this one. Additionally, this meeting seems to be discussing research problems related to this, so might be a starting point for keywords or names who do research in this idea.
I don't know if the computational geometry research has algorithms for your specific problem, or if these algorithms are easy/practical enough to implement. Here is how I would approach it if I had to do it without being able to look up previous work. This is just a direction, by far not a solution...
Formulate it as an optimization problem. You have discrete variables of which rectangles you choose (yes or no) and continuous variables (location and orientation of the triangles). Now you can set up two independent optimizations: a discrete optimization which picks the rectangles; and a continuous that optimizes for the location and orientation once rectangles are given. Interleave these two optimizations. Of course the difficulty lies in the formulation of optimizations, and designing your error energy such that it does not get stuck in some strange configurations (local minima). I'd try to get the continuous as a least squares problem such that I can use standard optimizations libraries.
I think this problem is suitable for solving with genetic algorithm and/or evolutionary strategy algorithm. I've done similar box packing problem with the help of evolutionary strategy algorithm of some kind. Check this out in my blog.
So if you will use such approach - encode into chromosomes box:
x coordinate
y coordinate
angle
Then try to minimize such fitness function-
y = w1 * box_intersection_area +
w2 * box_area_out_of_shape +
w3 * average_circle_radius_in_free_space
Choose weights w1,w2,w3 such as to affect importance of factors. When genetic algorithm will find partial solution - remove boxes which still overlaps together or are out of shape - and you will have at least legal (but not necessary optimal) solution.
good luck in this interesting problem !
It is NP hard indeed and since it has hi-tech application, reasonably efficients approximate strategies are not even in patents, let alone published papers.
The best you can do with a limited budget is to start by limiting the problem. Assume that all rectangles are exactly the same, Assume that all rectangles which are binary sub-divisions of your standard rectangle are also allowed since you can efficiently pre-pack them to fit your core division. For extra points you can also form several fixed schemas for gluing core rectangles to cover a few larger shapes with substantially different proportions. Assume that you can change dimensions of your standard rectangle/cell as long as the rest (pre-packing and gluing schema) remains the same - this gives you parameters to decide approximate size of the core rectangle based on rectangles you are given.
Now you can play with aspect ratios to approximate the error such limited system could guarantee. For the first iterations assume that it can have 50% error with a simple sub-division schema and then change schema to reduce the error but without increasing asymptotic complexity of pre-packing. At the end of the day you are always just assigning given rectangles to your pre-calculated and now fixed grid and binary sub-divisions - meaning you are not trying to do a layout or backtrack at all - you are always happy with the first approximate fit into the grid.
Work on defining classes of rectangles that pack well with your schema - that's again to keep the whole process inverted - you are never trying to actually fit what you are given - you are defining what you have to be given in order to fir it well - then you punt the rest as error since it is approximation.
Then you can try to do a bit more, but not much more - any slip into backtracking or nailing arbitrary small error and it's exponential.
If you are at a research facility and can get some supercomputer time - run a set of exhaustive searches with pathological mixes there just to see how optimal packing may look like and to see if you can derive a few more sub-division schemas and/or classes of rectangle sets.
That should be enough for the first 2 yrs or research :-)