Matlab - How to measure the dispersion of black in a binary image?

I am comparing RGB images of small colored granules spilled randomly on a white backdrop. My current method involves importing the image into Matlab and converting it to a binary image by setting a threshold and forcing all pixels above it to white. Next, I calculate the percentage of the pixels that are black. For comparing the images to one another, the % black pixels measurement works well; however, it does not take into account how well the granules are dispersed. Although the % black from two different images may be identical, the images may be far from alike. For example, assume I have two images to compare, and both show 15% black pixels. In one picture, the black pixels are randomly distributed throughout the image. In the other, a clump of black pixels sits in one corner and the rest of the image is very sparse.
What can I use in Matlab to numerically quantify how "spread out" the black pixels are for the purpose of comparing the two images?
I haven't been able to wrap my brain around this one yet, and need some help. Your thoughts/answers are most appreciated.

Found an answer to a very similar problem -> https://stats.stackexchange.com/a/13274
Basically, you would use the average distance from a central point to every black pixel as a measure of dispersion.
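A minimal MATLAB sketch of that idea, assuming bw is a logical image in which true marks the black granule pixels (the variable names are mine):

    [rows, cols] = find(bw);                               % coordinates of every black pixel
    centerR = mean(rows);                                  % centroid of the black pixels
    centerC = mean(cols);
    d = hypot(rows - centerR, cols - centerC);             % distance of each black pixel to the centroid
    dispersion = mean(d) / hypot(size(bw,1), size(bw,2));  % mean distance, normalized by the image diagonal

Dividing by the image diagonal keeps the score comparable between images of different sizes.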

My idea is based upon the mean free path (used in ideal gas theory / thermodynamics).
First, you must separate your foreground objects, using something like bwconncomp.
The mean free path is calculated as the mean distance between the centers of your regions. So for n regions, you take all n*(n-1)/2 pairs, calculate all the distances, and average them. If the mean distance is big, your particles are well spread out. If it is small, your objects are close together.
You may want to multiply the resulting mean by n and divide it by the edge length to get a dimensionless number. (Independent of your image size and independent of the number of particles)
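A sketch of that calculation in MATLAB, assuming bw is the binary image with the granules as the foreground (pdist needs the Statistics and Machine Learning Toolbox; a double loop over the centers would do the same job):

    cc = bwconncomp(bw);                      % label the foreground objects
    s  = regionprops(cc, 'Centroid');
    centers = vertcat(s.Centroid);            % n-by-2 matrix of region centers
    n = size(centers, 1);
    meanDist = mean(pdist(centers));          % average over all n*(n-1)/2 pairwise distances
    score = meanDist * n / max(size(bw));     % dimensionless, as suggested above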

Related

Dividing the plane into regions of equal mass based on a density function

Given a "density" scalar field in the plane, how can I divide the plane into nice (low moment of inertia) regions so that each region contains a similar amount of "mass"?
That's not the best description of what my actual problem is, but it's the most concise phrasing I could think of.
I have a large map of a fictional world for use in a game. I have a pretty good idea of approximately how far one could walk in a day from any given point on this map, and this varies greatly based on the terrain etc. I would like to represent this information by dividing the map into regions, so that one day of walking could take you from any region to any of its neighboring regions. It doesn't have to be perfect, but it should be significantly better than simply dividing the map into a hexagonal grid (which is what many games do).
I had the idea that I could create a gray-scale image with the same dimensions as the map, where each pixel's color value represents how quickly one can travel through the corresponding place on the map. Well-maintained roads would be encoded as white pixels, and insurmountable cliffs would be encoded as black, or something like that.
My question is this: does anyone have an idea of how to use such a gray-scale image (the "density" scalar field) to generate my "grid" from the previous paragraph (regions of similar "mass")?
I've thought about using the gray-scale image as a discrete probability distribution, from which I can generate a bunch of coordinates, and then use some sort of clustering algorithm to create the regions, but a) the clustering algorithms would have to create clusters of a similar size, I think, for that idea to work, which I don't think they usually do, and b) I barely have any idea if any of that even makes sense, as I'm way out of my comfort zone here.
Sorry if this doesn't belong here; my idea has always been to solve it programmatically somehow, so this seemed the most sensible place to ask.
UPDATE: Just thought I'd share the results I've gotten so far, trying out the second approach suggested by @samgak - recursively subdividing regions into boxes of similar mass, finding the center of mass of each region, and creating a Voronoi diagram from those.
I'll keep tweaking, and maybe try to find a way to make it less grid-like (like in the upper right corner), but this worked way better than I expected!
Building upon @samgak's solution, if you don't want the grid-like structure, you can just add a small random perturbation to your centers. You can see below, for example, the difference I obtain:
without perturbation
adding some random perturbation
A couple of rough ideas:
You might be able to repurpose a color-quantization algorithm, which partitions color-space into regions with roughly the same number of pixels in them. You would have to do some kind of funny mapping where the darker the pixel in your map, the greater the number of pixels of a color corresponding to that pixel's location you create in a temporary image. Then you quantize that image into x number of colors and use their color values as co-ordinates for the centers of the regions in your map, and you could then create a voronoi diagram from these points to define your region boundaries.
Another approach (which is similar to how some color quantization algorithms work under the hood anyway) could be to recursively subdivide regions of your map into axis-aligned boxes: take each rectangular region and choose the optimal splitting line (x or y) and position to create 2 smaller rectangles of similar "mass". You would end up with a power-of-2 count of rectangular regions, and you could get rid of the blockiness by taking the center of mass of each rectangle (not simply the center of the bounding box) and creating a Voronoi diagram from all the center points. This isn't guaranteed to create regions of exactly equal mass, but they should be roughly equal. The algorithm could be improved by allowing recursive splitting along lines of arbitrary orientation (or a finite number of possible orientations, e.g. 8, 16, 32), but of course that makes it more complicated.
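A minimal MATLAB sketch of that recursive splitting, under my own assumptions: mass = 1 - normalized brightness (so dark, slow terrain carries more mass), and equalMassCenters / depth are names I made up, not part of the answer:

    function centers = equalMassCenters(mass, depth)
        % Recursively split a mass map into 2^depth boxes of roughly equal mass
        % and return the center of mass [x y] of each leaf box.
        if depth == 0
            [rows, cols] = ndgrid(1:size(mass,1), 1:size(mass,2));
            total = sum(mass(:));
            if total == 0
                centers = [mean(cols(:)), mean(rows(:))];        % empty box: geometric center
            else
                centers = [sum(cols(:).*mass(:)), sum(rows(:).*mass(:))] / total;
            end
            return;
        end
        if size(mass,2) >= size(mass,1)                          % split across the longer edge
            cum = cumsum(sum(mass,1));
            k = find(cum >= cum(end)/2, 1);
            if isempty(k), k = floor(size(mass,2)/2); end
            k = min(max(k,1), size(mass,2)-1);                   % keep both halves non-empty
            a = equalMassCenters(mass(:,1:k),     depth-1);
            b = equalMassCenters(mass(:,k+1:end), depth-1);
            b(:,1) = b(:,1) + k;                                 % shift x back to the parent frame
        else
            cum = cumsum(sum(mass,2));
            k = find(cum >= cum(end)/2, 1);
            if isempty(k), k = floor(size(mass,1)/2); end
            k = min(max(k,1), size(mass,1)-1);
            a = equalMassCenters(mass(1:k,:),     depth-1);
            b = equalMassCenters(mass(k+1:end,:), depth-1);
            b(:,2) = b(:,2) + k;                                 % shift y back to the parent frame
        end
        centers = [a; b];
    end

Called, for example, as centers = equalMassCenters(1 - im2double(rgb2gray(img)), 6), assuming img holds the RGB map, followed by voronoi(centers(:,1), centers(:,2)). Adding a small random jitter to centers before the voronoi call reproduces the perturbation trick mentioned above.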

Can't understand the SIFT keypoint extraction (not description) algorithm

I have read two references about the SIFT algorithm here and here and I don't truly understand how just some keypoints are detected, considering that the algorithm works on the difference of Gaussians calculated at several resolutions (they call them octaves). Here are the steps of the technique as I've understood them from the paper.
Given the input image, blur it with Gaussian filters using different sigmas, resulting in Gaussian-filtered images. In the paper they use 5 Gaussian filters per octave (two adjacent Gaussian-filtered images are produced with sigma and k * sigma as the filter parameters), and they consider 4 octaves in the algorithm. So there are a total of 20 Gaussian-filtered images (5 per octave), but we act on the 5 Gaussian-filtered images of each octave individually.
For each octave, we calculate 4 difference-of-Gaussian (DoG) images from the 5 Gaussian-filtered images by simply subtracting adjacent Gaussian-filtered images. So now we have a total of 16 DoG images, but we consider the 4 DoG images of each octave individually.
Find local-extremum pixels (maxima or minima) by comparing each pixel in a DoG image with its 26 neighbors. Among these, 8 pixels are at the same scale as the pixel (in a 3x3 window), 9 are in a 3x3 window at the scale above (the adjacent DoG image from the same octave), and 9 more in a 3x3 window at the scale below. (A code sketch of steps 1-3 follows after step 5.)
Once these local extrema have been found in the different octaves, we must refine them, eliminating low-contrast points and weak edge points. Bad candidates are filtered out using a threshold on a Taylor expansion of the DoG function and an eigenvalue-ratio threshold computed from the Hessian matrix.
(This part I don't understand perfectly.) For each interest point that survived (in each octave, I believe), they consider a neighborhood around it and calculate the gradient magnitude and orientation of each pixel in that region. They build a gradient orientation histogram covering 360 degrees and select the highest peak, as well as any peaks that are higher than 80% of the highest peak. The orientation of the keypoint is then refined by fitting a parabola to the 3 histogram values closest to each peak to interpolate the peak position (I really don't understand this part).
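To make steps 1-3 concrete, here is a minimal MATLAB sketch for a single octave; sigma0 = 1.6, k = sqrt(2) and the dilate/erode trick for the 26-neighbor test are my own illustrative choices (Image Processing Toolbox required):

    img = im2double(rgb2gray(imread('peppers.png')));    % any test image
    sigma0 = 1.6;  k = sqrt(2);  nScales = 5;
    G = cell(1, nScales);
    for i = 1:nScales
        G{i} = imgaussfilt(img, sigma0 * k^(i-1));       % 5 Gaussian-blurred images
    end
    D = zeros([size(img), nScales-1]);
    for i = 1:nScales-1
        D(:,:,i) = G{i+1} - G{i};                        % 4 difference-of-Gaussian images
    end
    keypoints = [];                                      % rows will be [x y scaleIndex]
    for s = 2:size(D,3)-1                                % only the middle DoG images can be tested
        cube   = D(:,:,s-1:s+1);
        dilMid = imdilate(cube, true(3,3,3));            % 27-neighborhood maximum around each pixel
        eroMid = imerode(cube, true(3,3,3));             % 27-neighborhood minimum around each pixel
        isExt  = D(:,:,s) == dilMid(:,:,2) | D(:,:,s) == eroMid(:,:,2);
        [r, c] = find(isExt);
        keypoints = [keypoints; c, r, repmat(s, numel(r), 1)];
    end

Note that, exactly as raised in question 3 below, the extrema test only makes sense on the two middle DoG images of each octave, because each candidate needs a full scale above and below it.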
What I am not understanding
1- The tutorial and even the original paper are not clear on how to obtain a single set of keypoints when we are dealing with multiple octaves (image resolutions). For example, suppose I have detected 1000 keypoints in the first octave, 500 in the second, 250 in the third and 125 in the fourth octave. The SIFT algorithm will return the following data about each keypoint: 1- (x,y) coordinates, 2- scale (what is that?), 3- orientation and 4- the feature vector (which I easily understood how it is built). There are also Python functions from OpenCV that can draw these keypoints on the original image (thus, the first octave), but how can they do that if the keypoints are detected in different octaves and the algorithm therefore considers DoG images with different resolutions?
2- I don't understand part 5 of the algorithm very well. It is used for defining the orientation of the keypoint, right? Can somebody explain it to me in other words so that maybe I can understand?
3- To find the local extrema per octave (step 3), they don't explain how to do that in the first and last DoG images. As we are considering 4 DoG images, it is only possible to do that in the second and third DoG images.
4- There is another thing the author wrote that completely messed up my understanding of the approach:
Figure 1: For each octave of scale space, the initial image is repeatedly convolved with Gaussians to produce the set of scale space images shown on the left. Adjacent Gaussian images are subtracted to produce the difference-of-Gaussian images on the right. After each octave, the Gaussian image is down-sampled by a factor of 2, and the process repeated.
What? Does he downsample only one Gaussian image? How can the process be repeated by doing that? I mean, the difference of Gaussians is originally obtained by filtering the INPUT IMAGE with different sigmas. So I believe the INPUT IMAGE, and not the Gaussian image, must be resampled. Or did the author forget to write that THE GAUSSIAN IMAGES from a given octave are downsampled and the process is repeated for the next octave?

2D raster image line of sight algorithm

I'm working on a simple mapping application for fun, and one of the things I need to do is to find (and color) all of the points that are visible from the current location. In this case, points are pixels. My map is a raster image where transparent pixels are open space, and any other pixels are opaque. (There are no semi-transparent pixels; alpha is either 0 or 100%.) In this sense, it's sort of like a regular flood fill, with the constraint that each filled pixel has to have a clear line-of-sight to the origin point. The following image shows a couple of such areas colored in (the tiny crosshairs are the origin points, and white = transparent):
(http://tinyurl.com/nf3nqa4)
In addition, what I am ultimately interested in are the points that "border" other colors, i.e., I want the list of points that make up the edge of the visible region.
My current and very inefficient solution is the modified flood-fill I described above. This approach returns correct results, but due to the need to iterate every pixel on a line to the origin for every pixel in the flood fill, it's very slow. My images are downsized and quantized, but I still need about 1MP for acceptable accuracy, and typical LoS areas are at least 100,000 pixels each.
I may well be using the wrong search terms, but I haven't been able to find any discussion of algorithms that would solve this (rasterized) LoS case.
I suspect that this could be done more efficiently if your "walls" were represented as equations rather than simply pixels in a raster image. For example, polygons/triangles, circles, ellipses.
It would then be like raytracing (search for this term) in 2D. In other words, you could consider the ray/line from each pixel in the image to the point of interest and color the pixel only if it does not intersect with any object.
This method does require you to test the intersection for each pixel in the image with each object; however, if you look up raytracing you will find a number of efficient methods for testing these intersections. They will mostly be for the 3D case but it should be straightforward to convert them to 2D.
There are 3D raytracers that are very fast on MUCH larger images so this should be very doable.
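For illustration, here is a small MATLAB sketch of the 2D intersection test such a raytracer would repeat for every pixel; the wall representation (rows of [x1 y1 x2 y2]) and the helper names isVisible / segIntersect are my own, not something from the answer:

    function visible = isVisible(origin, target, walls)
        % True if the segment origin->target crosses none of the wall segments.
        % walls is an m-by-4 matrix; each row is one wall segment [x1 y1 x2 y2].
        visible = true;
        for w = 1:size(walls, 1)
            if segIntersect(origin, target, walls(w,1:2), walls(w,3:4))
                visible = false;                         % the sight line hits a wall
                return;
            end
        end
    end

    function hit = segIntersect(p1, p2, p3, p4)
        % Proper segment-segment intersection via signed areas (collinear touches ignored).
        d = @(a, b, c) (b(1)-a(1))*(c(2)-a(2)) - (b(2)-a(2))*(c(1)-a(1));
        hit = d(p3,p4,p1)*d(p3,p4,p2) < 0 && d(p1,p2,p3)*d(p1,p2,p4) < 0;
    end

The efficient raytracing methods mentioned above mostly come down to avoiding the inner loop over all walls, e.g. with spatial acceleration structures.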
You can try a Delaunay triangulation on each color. I mean, you can try to find the shape of each color with DT.

How to arrange pixels in pairs based on their similarity

What I want to achieve is a transition between two image files. The pixels from image A move and rearrange themselves to form image B. Imagine a cloud of particles (made from image A's pixels) that forms into picture B.
So far I have thought of going through all the pixels in image A and comparing them to pixels in image B; pixels that are the most similar are taken out of the arrays (with their x,y coordinates, too) and put into another array. So, in the end, I have pairs of pixels from both images that are similar. Then I only have to create the animation / possible color balancing (obviously all the pairs won't consist of identical pixels), which is fairly easy.
The problem is the algorithm that finds the pixel pairs. For a small 100px x 100px image (10,000 pixels) it would take 10,000 * 10,001 / 2 = 50,005,000 comparisons; for larger images it would be infeasible.
Dividing the pictures into clusters? Any ideas will be appreciated.
I'd say that you're likely to achieve the best result matching up pixels by hue first, then saturation, finally luminance. If I'm right, then your best bet for optimization would be to convert to HSV first. Once there, you can just sort your pixels and binary search the results to find your pairs.
You may additionally want to search a fixed window around the result you find, to match up pixels that are the least distance away from each other. That may make the resulting transition more coherent.
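A rough MATLAB sketch of the HSV-sort matching, under the assumption that a plain lexicographic sort (hue, then saturation, then value) followed by pairing equal ranks is an acceptable stand-in for the binary search described above; imgA.png / imgB.png are hypothetical file names:

    A = im2double(imread('imgA.png'));            % both images assumed to have the same size
    B = im2double(imread('imgB.png'));
    ha = reshape(rgb2hsv(A), [], 3);              % n-by-3 [H S V] rows for image A
    hb = reshape(rgb2hsv(B), [], 3);
    [~, ia] = sortrows(ha, [1 2 3]);              % sort both pixel lists by hue, sat, value
    [~, ib] = sortrows(hb, [1 2 3]);
    pairs = [ia, ib];                             % k-th sorted pixel of A pairs with k-th of B
    [ya, xa] = ind2sub([size(A,1), size(A,2)], pairs(:,1));   % recover (x,y) in A for the animation
    [yb, xb] = ind2sub([size(B,1), size(B,2)], pairs(:,2));   % ...and in B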
You may want to take a look at the Hungarian algorithm, which turns this into an assignment problem over the n = 10,000 pixels of a 100x100 image and finds the optimal matches in O(n^3) time. Basically, give each pixel combination a "cost" based on similarity, then send the (inverted) cost matrix through the algorithm to get the optimal assignment of pixels from A to pixels from B.
But it still might be too much computation for too little gain, depending on whether you need real time. That is, this kind of work doesn't necessarily need an optimal match, just one that is good enough - still, it may serve as a starting point for finding less computationally intensive methods.
See the bottom of the linked article for implementations in various languages - it's not entirely trivial to implement.
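If you stay in MATLAB, a hedged sketch: recent releases ship matchpairs (R2019a+), a linear assignment solver you could try on a heavily downscaled version of the problem (pdist2 needs the Statistics and Machine Learning Toolbox; the file names and the 50x50 size are only illustrative):

    A = im2double(imresize(imread('imgA.png'), [50 50]));   % downscale to keep the cost matrix small
    B = im2double(imresize(imread('imgB.png'), [50 50]));
    pa = reshape(A, [], 3);                                 % 2500-by-3 RGB rows
    pb = reshape(B, [], 3);
    cost = pdist2(pa, pb);                                  % color distance of every pixel pair
    M = matchpairs(cost, 1e3);                              % optimal assignment; rows are [idxA idxB]
    [ya, xa] = ind2sub([50 50], M(:,1));                    % coordinates of each pixel in A
    [yb, xb] = ind2sub([50 50], M(:,2));                    % ...and of its assigned partner in B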

Algorithm to Calculate Symmetry of Points

Given a set of 2D points, I want to calculate a measure of how horizontally symmetrical and vertically symmetrical those points are.
Alternatively, for each set of points I will also have a rasterised image of the lines between those points, so is there any way to calculate a measure of symmetry for images?
BTW, this is for use in a feature vector that will be presented to a neural network.
Clarification
The image on the left is 'horizontally' symmetrical. If we imagine a vertical line running down the middle of it, the left and right parts are symmetrical. Likewise, the image on the right is 'vertically' symmetrical, if you imagine a horizontal line running across its center.
What I want is a measure of just how horizontally symmetrical they are, and another of just how vertically symmetrical they are.
This is just a guideline/idea; you'll need to work out the details:
To detect symmetry with respect to horizontal reflection:
reflect the image horizontally
pad the original (unreflected) image horizontally on both sides
compute the correlation of the padded and the reflected images
The position of the maximum in the result of the correlation will give you the location of the axis of symmetry. The value of the maximum will give you a measure of the symmetry, provided you do a suitable normalization first.
This will only work if your images are "symmetric enough", and it works for images only, not sets of points. But you can create an image from a set of points too.
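A minimal MATLAB sketch of the reflect-pad-correlate recipe above; normxcorr2 (Image Processing Toolbox) takes care of the normalization, and cameraman.tif is just a convenient built-in test image:

    img = im2double(imread('cameraman.tif'));              % any grayscale test image
    ref = fliplr(img);                                     % horizontally reflected copy
    pad = padarray(img, [0, size(img,2)], mean(img(:)));   % pad left/right so every shift is covered
    c   = normxcorr2(ref, pad);                            % normalized cross-correlation map
    [hSym, idx]  = max(c(:));                              % peak value: horizontal-symmetry score
    [~, colPeak] = ind2sub(size(c), idx);                  % the peak column encodes where the axis sits
    % Vertical symmetry: same idea with flipud and row padding.
    vSym = max(max(normxcorr2(flipud(img), padarray(img, [size(img,1), 0], mean(img(:))))));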
Leonidas J. Guibas from Stanford University talked about it in ETVC'08.
Detection of Symmetries and Repeated Patterns in 3D Point Cloud Data.
