Find sub-pixel maximum on a 2D array - image

Suppose I have an image and I want to find the 3x3 subarray whose sum is maximal over all 3x3 subarrays.
How do I do that efficiently in Python (running as fast as possible)? If you can provide sample code, that would be great.
My specific problem:
I want to extract the location of the center of the blob in this heatmap
I don't want to just take the maximum point, because that would make the coordinate imprecise. The true center of the blob could actually lie between 2 pixels. Thus, it's better to do a weighted average between many points to obtain subpixel precision. For example, if there are 2 points (x1, y1) and (x2, y2) with values 200 and 100, then the weighted average coordinates will be x = (200*x1 + 100*x2)/300 and y = (200*y1 + 100*y2)/300.
One of my solutions is to do a convolution. But I think it's not efficient enough because it requires multiplication by the kernel (which contains only ones). I'm looking for a fast implementation, so I cannot write the loop myself, because I'm not sure it would be fast.
I want to apply this algorithm to 50 images every few milliseconds (the images come in as a batch). Concretely, think of these images as the output of a machine learning model that produces heatmaps. To obtain coordinates from these heatmaps, I need to do some kind of weighted average between the coordinates with high intensity. My idea is to do a weighted average over a 3x3 area of the image. I am also open to other approaches that are faster or more elegant.

Looking for the "subarray of shape 3x3 with the maximum sum" is the same as looking for the maximum of the image after it has been filtered with an un-normalized 3x3 box filter. So it boils down to efficiently finding the maximum of an image, which you assume is a (perhaps "noisy") discrete sample of an underlying continuous and smooth signal - hence your desire to find a sub-pixel location.
You really need to split the problem into 2 parts:
Find the pixel location m = (xm, ym) of the maximum value of the image. This requires no more than a visit to every pixel in the image, and one comparison per pixel, so it's O(N) and hence optimal as long as you are operating at the native image resolution. In OpenCV it is done using the minMaxLoc function.
Apply whatever model of the image you are using to find its (subpixel-interpolated) maximum in a neighborhood of m.
To clarify point (2): you write
I don't want to just get the maximum point because that would cause the coordinate to not be very precise. The true center of the blob could actually be between 2 pixels
While intuitively plausible, this assertion needs to be made more precise in order to be computable. That is, you need to express mathematically what assumptions you make about the image that lead you to search for a "true" maximum between pixel-sampled locations.
A simple example of such an assumption is quadratic smoothness. In this scenario you assume that, in a small (say, 3x3 or 5x5) neighborhood of the "true" maximum location, the image signal z is well approximated by a quadratic:
z = A00 dx^2 + A01 dx dy + A11 dy^2 + A02 dx + A12 dy + A22
where:
dx = x - xm; dy = y - ym
This assumption makes sense if the underlying signal is expected to be at least 3rd order continuous and differentiable, because of the Taylor series theorem. Geometrically, it means that you assume (hope?) that the signal looks like a quadric (a paraboloid, or an ellipsoid) near its maximum.
You can then evaluate the above equation for each of the pixels in a neighborhood of m, replacing the actual image values for z, and thus obtain a linear system in the unknown Aij, with as many equations as there are neighbor pixels (so even a 3x3 neighborhood will yield an over-constrained system). Solving the system in the least-squares sense gives you the "optimal" coefficients Aij. The theoretical maximum as predicted by this model is where the first partial derivatives vanish:
del z / del dx = 2 A00 dx + A01 dy + A02 = 0
del z / del dy = A01 dx + 2 A11 dy + A12 = 0
This is a linear system in the two unknowns (dx, dy), and solving it yields the estimated location of the maximum and, through the above equation for z, the predicted image value at the maximum.
In terms of computational cost, all such model estimations are extremely fast, compared with traversing an image of even moderate size.
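For concreteness, here is a minimal numpy sketch of the quadratic-fit refinement described above, assuming a greyscale image as in the question; the function name refine_subpixel and the 3x3 neighborhood choice are my own illustration, not a standard API:

import numpy as np
import cv2

def refine_subpixel(img, xm, ym):
    # Fit z = A00 dx^2 + A01 dx dy + A11 dy^2 + A02 dx + A12 dy + A22
    # over the 3x3 neighborhood of (xm, ym), one equation per pixel.
    # Assumes the maximum is not on the image border.
    rows, rhs = [], []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            rows.append([dx * dx, dx * dy, dy * dy, dx, dy, 1.0])
            rhs.append(float(img[ym + dy, xm + dx]))
    A00, A01, A11, A02, A12, A22 = np.linalg.lstsq(
        np.array(rows), np.array(rhs), rcond=None)[0]
    # Vanishing gradient: 2 A00 dx + A01 dy = -A02, A01 dx + 2 A11 dy = -A12.
    # Assumes the fitted quadratic has a genuine extremum (H invertible).
    H = np.array([[2 * A00, A01], [A01, 2 * A11]])
    dx, dy = np.linalg.solve(H, [-A02, -A12])
    return xm + dx, ym + dy

img = cv2.imread('blob.png', 0).astype(np.float32)
_, _, _, (xm, ym) = cv2.minMaxLoc(img)  # step 1: pixel-accurate maximum
print(refine_subpixel(img, xm, ym))     # step 2: sub-pixel refinement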

I am sorry, I did not exactly understand the meaning of your last paragraph, so I have stopped at the point where I get all the coordinates holding the maximum value. I used cv2.filter2D for convolution on a thresholded image, and then np.amax and np.where to find the coordinates holding that maximum.
import cv2
import numpy as np
from timeit import default_timer as timer
img = cv2.imread('blob.png', 0)  # read as greyscale
start = timer()
# Keep only the near-maximal pixels (value > 240), marked with 1.
_, thresh = cv2.threshold(img, 240, 1, cv2.THRESH_BINARY)
# Un-normalized 3x3 box filter: each output pixel holds its 3x3 neighborhood sum.
mask = np.ones((3, 3), np.uint8)
res = cv2.filter2D(thresh, -1, mask)
# Coordinates of all pixels whose 3x3 sum is maximal.
result = np.where(res == np.amax(res))
end = timer()
print(end - start)
I don't know whether this is as efficient as you want or not, but the measured time was 0.0013461999999435648 s.
P.S. The image you have provided had a white border which I had to crop out for this method.

One way is to sub-sample the image to find the neighborhood of the desired point. You can do this with a loop that visits not every pixel but, e.g., every 5th pixel (row = row + 5 and col = col + 5 in the loop). After finding the approximate location, consider a specific neighborhood around that location and loop over all pixels of that crop to find the exact location. A sketch of this idea follows.
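A rough numpy sketch of that coarse-to-fine idea, under the assumption that the image is a numpy array; the step and window sizes are arbitrary, and a peak narrower than the step can slip between coarse samples:

import numpy as np

def coarse_to_fine_max(img, step=5, win=5):
    # Coarse pass: look only at every step-th pixel.
    coarse = img[::step, ::step]
    cy, cx = np.unravel_index(np.argmax(coarse), coarse.shape)
    cy, cx = cy * step, cx * step
    # Fine pass: check every pixel in a small window around the coarse hit.
    y0, x0 = max(cy - win, 0), max(cx - win, 0)
    patch = img[y0:cy + win + 1, x0:cx + win + 1]
    py, px = np.unravel_index(np.argmax(patch), patch.shape)
    return x0 + px, y0 + py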

Based on my knowledge of image processing, to get a reliable result that works for any one blob, follow these steps:
Make the image greyscale if it isn’t already (pixel values 0-255)
Normalise the image so that pixel intensities cover the full range of 0-255
Convert image to binary (a pixel is either 0 or 1) - this can be achieved by thresholding, such as applying the rule that any pixel less than or equal to 127 in intensity is given an intensity of 0 and anything else is given an intensity of 1
Find the weighted average of all the pixels that hold the value of “1”
or
Apply an erosion to the image until you are left with either 2 pixels or 1 pixel.
Case 1
If you have two pixels, then you need to find the u and v coordinates of both pixels. The centre of the blob will be the halfway point between the u and v coordinates of the two pixels.
Case 2
If you have one pixel left then that pixel’s co-ordinates is the centre point.
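A minimal sketch of the first variant (steps 1-4), assuming OpenCV and a blob image like the one in the question; the threshold of 127 follows the rule above, and cv2.moments is an equivalent way to get the same weighted average:

import cv2
import numpy as np

img = cv2.imread('blob.png', cv2.IMREAD_GRAYSCALE)          # step 1: greyscale
norm = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)    # step 2: stretch to 0-255
_, binary = cv2.threshold(norm, 127, 1, cv2.THRESH_BINARY)  # step 3: binarise
ys, xs = np.nonzero(binary)                                 # step 4: average the 1-pixels
cx, cy = xs.mean(), ys.mean()                               # assumes at least one 1-pixel
# Equivalently, via image moments:
M = cv2.moments(binary, binaryImage=True)
cx, cy = M['m10'] / M['m00'], M['m01'] / M['m00']
print(cx, cy)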
—————
You mentioned achieving this quickly in Python:
Python by design is an interpreted language, so it executes line by line, making it less suitable for highly iterative tasks like image processing. However, you can make use of libraries like OpenCV (https://docs.opencv.org/2.4/index.html), which is written in C/C++, to mitigate this, apart from making the task at hand a lot easier for you.
OpenCV also provides solutions for all the steps I listed above, therefore you should be able to achieve a reliable solution fairly quickly, though I can't say for sure whether it will hit your target of 50 images every few milliseconds. Another factor to take into account is the size of the image you are processing: the processing load grows with the number of pixels.
UPDATE
I just found a good article that practically echoes my step-process:
https://www.learnopencv.com/find-center-of-blob-centroid-using-opencv-cpp-python/
More importantly it also denotes the formula for finding the centroid mathematically as:
c_x = (1/n) * sum_{i=1..n} x_i (and similarly c_y for the y coordinates)
though it is typeset better in the article than I can manage here.

Related

Drawing pixel circle of given area

I have an area of X by Y pixels and I need to fill it up pixel by pixel. The problem is that at any given moment the drawn shape should be as round as possible.
I think this algorithm is a subset of ordered dithering, used when converting grayscale images to one-bit, but I could not find any references, nor could I figure it out myself.
I am aware of Bresenham's circle algorithm, but it is used to draw a circle of a given radius, not a given area.
I created an animation of all filling percentages for a 10 by 10 pixel grid. As the full area is 10x10 = 100 px, each frame is exactly a 1% increment.
A filled disk has the equation
(X - Xc)² + (Y - Yc)² ≤ C.
When you increase C, the number of points that satisfies the equation increases, but because of symmetry it increases in bursts.
To obtain the desired filling effect, you can compute (X - Xc)² + (Y - Yc)² for every pixel, sort on this value, and let the pixels appear one by one (or in a single go if you know the desired number of pixels).
You can break ties in different ways:
keep the original order as when you computed the pixels, by using a stable sort;
shuffle the runs of equal values;
slightly alter the center coordinates so that there are no ties.
[Animation: filling with the de-centering trick; companion figures showing the computed values and the resulting fill order omitted.]
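A short numpy sketch of the sort-by-distance approach, with the center slightly de-centered to break ties; the offsets 1e-3 and 2e-3 are arbitrary small values of my own choosing:

import numpy as np

def fill_order(w, h, xc, yc):
    # Squared distance from every pixel to a slightly de-centered point.
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - xc - 1e-3) ** 2 + (ys - yc - 2e-3) ** 2
    # Stable sort keeps the original order for any remaining ties.
    order = np.argsort(d2, axis=None, kind='stable')
    return np.column_stack(np.unravel_index(order, (h, w)))  # (row, col) pairs

# First 25 entries for a 10x10 grid: the pixels of a 25%-filled disk.
pixels = fill_order(10, 10, 4.5, 4.5)[:25]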

Algorithm for plotting 2d xy graph

I'm trying to plot an XY graph in real time using Java. Functions that rely only on x are easy: just iterate over x0...xn, get the value, and draw lines between the points. There are a lot of guides on it and it's intuitive.
But there is literally no guide on plotting graphs where both x and y are variables.
Consider this equation: sin(x^3 * y^2) = cos(x^2 * y^3)
Using an online graph plotter I get this:
While my best result plotting the same function is this:
I just iterate over every pixel on screen and pass the pixel positions as parameters to the function. If the function's output is close to 0, I color the pixel. As you can see, the result is bad. It also takes a huge amount of processing power: it only redraws once every couple of seconds, and if I try to increase the precision, all lines just become thicker, especially around intersections.
My question is how can I make my program faster and make it produce better looking graphs. Maybe there are some algorithms for that purpose?
The challenge is to choose the correct threshold. Pixels where abs(f(x,y)) is below the threshold should be colored. Pixels above the threshold should be white.
The problem is that if the threshold is too low, gaps appear in places where no pixel is exactly on the line. On the other hand, if the threshold is too high, the lines widen in places where the function is near zero, and the function is changing slowly.
So what's the correct threshold? The answer is the magnitude of the gradient, multiplied by the radius of a pixel. In other words, the pixel should be colored when
abs(f(x,y)) < |g(x,y)| * pixelRadius
The reason is that the magnitude of the gradient is equal to the maximum slope of the surface (at a given point). So a zero crossing occurs within a pixel if the slope is large enough to reduce the function to zero, inside the pixel.
That, of course, is only a rough approximation. It assumes that the gradient doesn't change significantly within the area bounded by the pixel. The function in the question conforms to that assumption reasonably well, except in the upper right corner. Notice that in the graph below, there are Moiré patterns in the upper right. I believe those are due to a failure in my antialiasing calculation: I don't compensate for a rapidly changing gradient.
In the graph below, pixels are white if
abs(f(x,y)) > |g(x,y)| * pixelRadius
Otherwise the pixel intensity is a number from 0 to 1, with 0 being black and 1 being white:
intensity = abs(f(x,y)) / (|g(x,y)| * pixelRadius)
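A numpy sketch of this rule, with a finite-difference gradient standing in for the analytic one; the grid size and plot range are arbitrary choices of mine:

import numpy as np

def implicit_plot(f, xlim, ylim, n=800):
    xs = np.linspace(xlim[0], xlim[1], n)
    ys = np.linspace(ylim[0], ylim[1], n)
    X, Y = np.meshgrid(xs, ys)
    Z = f(X, Y)
    # Finite-difference gradient magnitude, using the grid spacing.
    gy, gx = np.gradient(Z, ys, xs)
    gmag = np.hypot(gx, gy)
    pixel_radius = 0.5 * (xs[1] - xs[0])
    # 0 = on the curve (black), 1 = background (white), clipped at 1.
    return np.clip(np.abs(Z) / (gmag * pixel_radius + 1e-12), 0, 1)

img = implicit_plot(lambda x, y: np.sin(x**3 * y**2) - np.cos(x**2 * y**3),
                    (-3, 3), (-3, 3))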
I don't know exactly how the online plotter does it, but here are some suggestions.
Simplify your equation. For this specific one, you can reduce it to x^2 * y^2 * (x ± y) = (2n + 1/2) * pi, where n is any integer. That is much clearer than the original form.
Draw lines rather than points. Each n here stands for 4 curves; you can now loop over x, solve for y, and draw a line between adjacent points.
Hope it helps!

Distance between set of lines

Say that my images are simple shapes: a set of lines, dots, curves, and simple objects.
How do I calculate the distance between images, such that length matters but overall scale does not, while the location of a line/curve matters, angles matter, etc.?
For example, in the attached image:
My comparison object is the cube on the top left; the scores are fictitious, just for this example.
The distance to the cylinder is 80 (it has 2 matching lines, but the top geometry is different).
The bottom left cube scores 100, since its lines exactly match, just at a different scale.
The bottom right rectangle scores 90, since its top lines match exactly but its side lines have a different scale.
I am looking for an algorithm name or a general approach that will help me start thinking towards a solution.
Thank you for your help.
Here is something to get you started. When jumping into new problems, I don't see much value in trying a lot of complex steps just because they are available somewhere to use. So my focus is on using relatively simple things that will fail in more varied situations, but hopefully you will see their value and get some sense of the problem.

The approach is fully based on corner detection; two typical methods for this are the Harris detector or the one by Shi and Tomasi described in the paper "Good Features to Track", 1994. I will use the second one, just because there is a ready implementation in OpenCV, newer Matlab, and possibly many other places. Its implementation in these packages also allows for easier parameter adjustment, regarding corner quality and minimum distance between corners.

So, supposing you can detect all corner points correctly, how do you measure how close one shape is to another based on these points? The images have arbitrary size, so my idea is to normalize the point coordinates to the range [0, 1]. This solves the scaling issue, which is desired according to the original description. Now we have to compare point sets in the range [0, 1]. Here we go for the simplest thing: consider one point p from shape a; what is the closest point in shape b? We assume it is the one with the minimum absolute difference from p. If we sum all these values, we get a score between shapes. The lower the score, the more similar the shapes (according to this approach).
Here are some shapes I drew:
Here are the detected corners:
As you can clearly see in this last set of images, the method will easily confuse a rectangle/square with a cylinder. To handle that, you will need to combine the approach with other descriptors. A simple one that you might consider initially is the ratio between the shape's area and its bounding box area (which would give 1 for a rectangle, and lower for a cylinder).
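A possible sketch of that descriptor, assuming a binary input image and OpenCV 4's two-value findContours return:

import cv2

def extent_ratio(binary_img):
    # Shape area over bounding-box area: ~1 for a rectangle, lower for a cylinder.
    contours, _ = cv2.findContours(binary_img, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)  # largest shape in the image
    x, y, w, h = cv2.boundingRect(c)
    return cv2.contourArea(c) / float(w * h)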
With the method described above, here are the measurements between the first and second shapes, first and third shapes, ..., respectively: 0.02358485, 0.41350339, 0.30128458, 0.4980852, 0.18031262. The second cube is a resized version of the first one, and as you can see, they are very similar by this metric. The last shape is a resized version of the first cube but without keeping the aspect ratio, and the metric gives a much higher difference.
If you want to play with the code that performs this, here it is (in Python, depends on OpenCV, numpy):
import sys
import cv2 as cv
import numpy
inp = []
for fname in sys.argv[1:]:
    img_color = cv.imread(fname)
    img = cv.cvtColor(img_color, cv.COLOR_RGB2GRAY)
    inp.append((img_color, img))

ptsets = []
# Corner detection parameters.
params = (
    200,   # max number of corners
    0.01,  # minimum quality level of corners
    10,    # minimum distance between corners
)
# Params for visual circle markers.
circle_radii = 3
circle_color = (255, 0, 0)
for i, (img_color, img) in enumerate(inp):
    height, width = img.shape
    cornerMap = cv.goodFeaturesToTrack(img, *params)
    corner = numpy.array([c[0] for c in cornerMap])
    for c in corner:
        cv.circle(img_color, tuple(int(x) for x in c), circle_radii, circle_color, -1)
    # Just to visually check for correct corners.
    cv.imwrite('temp_%d.png' % i, img_color)
    # Convert corner coordinates to [0, 1].
    cornerUnity = (corner - corner.min()) / (corner.max() - corner.min())
    # You might want to use other descriptors here. XXX
    ptsets.append(cornerUnity)

def compare_ptsets(p):
    res = numpy.zeros(len(p))
    base = p[0]
    for i in range(1, len(p)):
        sum_min_diff = sum(numpy.abs(p[i] - value).min() for value in base)
        res[i] = sum_min_diff
    return res

res = compare_ptsets(ptsets)
print(res)
The process to follow depends on the depth of features you are going to consider and the accuracy required.
If you want something more accurate, search some technical papers like this, which can give a concrete and well-proven approach or algorithm.
EDIT:
The idea behind the Waltz algorithm (a method from AI) can be tweaked. This is just my thought: interpret the original image and generate some constraints from it. For each candidate, count the number of constraints it satisfies. The one that satisfies the most constraints will be the most similar to the original image.
Try to calculate the center of mass for each figure, treating each point of the figure as a particle with mass equal to 1.
Then calculate the distance between two figures as sqrt((x1-x2)^2 + (y1-y2)^2), where (xi, yi) is the center-of-mass coordinate of figure i.
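A small numpy sketch of this suggestion, assuming binary images and treating each foreground pixel as a unit mass:

import numpy as np

def mass_center(binary_img):
    ys, xs = np.nonzero(binary_img)  # each foreground pixel has mass 1
    return xs.mean(), ys.mean()

def center_distance(img_a, img_b):
    (x1, y1), (x2, y2) = mass_center(img_a), mass_center(img_b)
    return np.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)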

Drawing a circle on an array for CCD integration purposes

I am writing a function to draw an approximate circle on a square array (in Matlab, but the problem is mainly algorithmic).
The goal is to produce a mask for integrating light that falls on a portion of a CCD sensor from a diffraction-limited point source (whose diameter corresponds to a few pixels on the CCD array). In summary, the CCD sensor sees a pattern with revolution-symmetry, that has of course no obligation to be centered on one particular pixel of the CCD (see example image below).
Here is the algorithm that I currently use to produce my discretized circular mask, and which works partially (Matlab/Octave code):
xt = linspace(-xmax, xmax, npixels_cam); % in physical coordinates (meters)
[X Y] = meshgrid(xt-center(1), xt-center(2)); % shifted coordinate matrices
[Theta R] = cart2pol(X,Y);
R = R'; % cart2pol uses a different convention for lines/columns
mask = (R<=radius);
As you can see, my algorithm selects (sets to 1) all the pixels whose physical distance (in meters) from the center is smaller than or equal to a radius, which doesn't need to be an integer.
I feel like my algorithm may not be the best solution to this problem. In particular, I would like it to include the pixel in which the center is present, even when the radius is very small.
Any ideas?
(See http://i.stack.imgur.com/3mZ5X.png for an example image of a diffraction-limited spot on a CCD camera).
If you would like to select pixels if and only if they contain any part of the circle C:
In each pixel, place a small circle A with radius = half the pixel size, and another circle B around it with R = sqrt(2) * half the pixel size (a circumscribed circle).
To test whether two circles touch each other, you just calculate the center-to-center distance and subtract the sum of the two radii.
If the test circle C touches A, then you select the pixel. If it touches B but not A, you need to test all four pixel sides for overlap, as in Circle line-segment collision detection algorithm?
A brute force approximate method is to make a much finer grid within each pixel and test each center point in that grid.
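A sketch of that brute-force test, assuming unit-sized pixels with (px, py) at the pixel's corner; the pixel contains part of the circle exactly when its sub-samples fall on both sides of the curve:

import numpy as np

def pixel_touches_circle(px, py, xc, yc, r, n=8):
    # n*n sample points spread evenly across the unit pixel.
    offs = (np.arange(n) + 0.5) / n
    X, Y = np.meshgrid(px + offs, py + offs)
    inside = (X - xc) ** 2 + (Y - yc) ** 2 <= r ** 2
    # The curve crosses the pixel iff some samples are inside and some outside.
    return bool(inside.any() and not inside.all())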
This is a well-studied problem. Several levels of optimization are possible:
You can brute-force check if every pixel is inside the circle. (r^2 >= (x-x0)^2 + (y-y0)^2)
You can brute-force check if every pixel in a square bounding the circle is inside the circle. (r^2 >= (x-x0)^2 + (y-y0)^2 where |x-x0| < r and |y-y0| < r)
You can go line-by-line (where |y-y0| < r), calculate the starting and ending x, and fill the whole span in between; a sketch follows below. (Although square roots aren't cheap.)
There are endless possibilities for more sophisticated algorithms. Here's a common one: http://en.wikipedia.org/wiki/Midpoint_circle_algorithm (filling in the circle is left as an exercise)
It really depends on how sophisticated you want to be, given how imperative good performance is.
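Here is a sketch of the line-by-line variant (option 3), with one square root per scanline; set_pixel is a caller-supplied callback of my own devising:

import math

def fill_circle(set_pixel, xc, yc, r):
    for y in range(math.ceil(yc - r), math.floor(yc + r) + 1):
        half = math.sqrt(r * r - (y - yc) ** 2)  # half-width of this scanline
        for x in range(math.ceil(xc - half), math.floor(xc + half) + 1):
            set_pixel(x, y)

# Example: rasterize into a numpy mask.
import numpy as np
mask = np.zeros((24, 24), np.uint8)
fill_circle(lambda x, y: mask.__setitem__((y, x), 1), 11.3, 11.7, 8.2)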

Is there any algorithm for determining 3d position in such case? (images below)

So first of all, I have this image (and of course I have all the point coordinates in 2D, so I can regenerate the lines and check where they cross each other)
(source: narod.ru)
But then I have another image of the same lines (I know they are the same) and the new coordinates of my points, as in this image
(source: narod.ru)
So... now, having the point coordinates in the first image, how can I determine the plane rotation and Z depth in the second image (assuming the first one's center was at point (0,0,0) with no rotation)?
What you're trying to find is called a projection matrix. Determining precise inverse projection usually requires that you have firmly established coordinates in both source and destination vectors, which the images above aren't going to give you. You can approximate using pixel positions, however.
This thread will give you a basic walkthrough of the techniques you need to use.
Let me say this up front: this problem is hard. There is a reason Dan Story's linked question has not been answered. Let me provide an explanation for people who want to take a stab at it. I hope I'm wrong about how hard it is, though.
I will assume that the 2D screen coordinates and the projection/perspective matrix are known to you. You need to know at least this much (if you don't know the projection matrix, essentially you are using a different camera to look at the world). Let's call each pair of 2D screen coordinates (a_i, b_i), and I will assume the projection matrix is of the form
P = [ px 0 0 0 ]
[ 0 py 0 0 ]
[ 0 0 pz pw]
[ 0 0 s 0 ], s = +/-1
Almost any reasonable projection has this form. Working through the rendering pipeline, you find that
a_i = px x_i / (s z_i)
b_i = py y_i / (s z_i)
where (x_i, y_i, z_i) are the original 3D coordinates of the point.
Now, let's assume you know your shape in a set of canonical coordinates (whatever you want), so that the vertices are (x0_i, y0_i, z0_i). We can arrange these as columns of a matrix C. The actual coordinates of the shape are a rigid transformation of these coordinates. Let's similarly organize the actual coordinates as columns of a matrix V. Then these are related by
V = R C + v 1^T (*)
where 1^T is a row vector of ones with the right length, R is an orthogonal rotation matrix of the rigid transformation, and v is the offset vector of the transformation.
Now, you have an expression for each column of V from above: the first column is { s a_1 z_1 / px, s b_1 z_1 / py, z_1 } and so on.
You must solve the set of equations (*) for the set of scalars z_i and the rigid transformation defined by R and v.
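To make the relationship concrete, here is a small numpy sketch of the forward model only (equation (*) followed by the projection); the shape, rotation, offset, and projection values are purely illustrative. A solver would adjust R, v, and the depths z_i until the predicted (a_i, b_i) match the observed 2D points:

import numpy as np

# Canonical shape: unit-square vertices as columns of C (in the z = 0 plane).
C = np.array([[0., 1., 1., 0.],
              [0., 0., 1., 1.],
              [0., 0., 0., 0.]])

px, py, s = 1.0, 1.0, 1.0        # assumed projection parameters
R = np.eye(3)                    # candidate rotation (illustrative)
v = np.array([0.0, 0.0, 5.0])    # candidate offset: shape in front of the camera

V = R @ C + v[:, None]           # equation (*): actual 3D coordinates
a = px * V[0] / (s * V[2])       # screen coordinates from the pipeline above
b = py * V[1] / (s * V[2])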
Difficulties
The equation is nonlinear in the unknowns, involving quotients of R and z_i
We have assumed up to now that you know which 2D coordinates correspond to which vertices of the original shape (if your shape is a square, this is slightly less of a problem).
We have assumed there is a solution at all; if there are errors in the 2D data, then it's hard to say how well equation (*) will be satisfied; the transformation would be nonrigid or nonlinear.
It's called (digital) photogrammetry. Start Googling.
If you are really interested in this kind of problems (which are common in computer vision, tracking objects with cameras etc.), the following book contains a detailed treatment:
Ma, Soatto, Kosecka, Sastry, An Invitation to 3-D Vision, Springer 2004.
Beware: this is an advanced engineering text, and uses many techniques which are mathematical in nature. Skim through the sample chapters featured on the book's web page to get an idea.
