Sobel Edge detection – matlab - image

Hello. As part of my homework, I need to calculate and display the edge magnitude image and the
edge direction image of the image balls1.tif, using Sobel edge detection.
Do not use matlab's edge function. You may use conv2.
Display a binary edge image (1 edge pixel, 0 no edge) of strong edge pixels (above a threshold).
Determine a threshold that eliminates the ball shadows.
here is my main.m
addpath(fullfile(pwd,'TOOLBOX'));
addpath(fullfile(pwd,'images'));
%Sobel Edge Detection
Image = readImage('balls1.tif');
showImage(Image);
message = sprintf('Sobel Edge Detection');
SobelEdgeDetection(Image);
uiwait(msgbox(message,'Done', 'help'));
close all
Here is my SobelEdgeDetection.m:
function [ output_args ] = SobelEdgeDetection( Image )
maskX = [-1 0 1 ; -2 0 2; -1 0 1];
maskY = [-1 -2 -1 ; 0 0 0 ; 1 2 1] ;
resX = conv2(Image, maskX);
resY = conv2(Image, maskY);
magnitude = sqrt(resX.^2 + resY.^2);
direction = atan2(resY, resX); % element-wise four-quadrant angle; resY/resX would be matrix division
thresh = magnitude < 101;
magnitude(thresh) = 0;
showImage(magnitude);
end
My questions are:
1. What is the direction used for, and how can I display it?
2. Is there a better way to get a threshold that eliminates the ball shadows? I used trial and error...
These are my results as far as showing the magnitude:

According to the second part of your homework you have solved it, i.e., you eliminated the shadows.
For the first question: the direction can be used in many different ways. Here is the simplest way: make pretty pictures with it. A more useful reason to consider it is when you are doing non-maximum suppression, but since you are not doing that manually, there isn't much immediate use for it. To visualize the gradient direction, it is simply a matter of assigning a color to each direction you consider. To further simplify the visualization, suppose you also reduce the directions to increments of 30 degrees up to 180, starting from 0. That way, if you have a direction of 35 degrees, for example, you treat it as 30 degrees (since that is the nearest one in your reduced list). Next we see an image and a visualization of its gradient direction using Sobel and the discretization to steps of 30 degrees (black indicates the 0 degree direction).
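The 30-degree discretization described above can be sketched as follows. This is a minimal sketch in Python with NumPy rather than MATLAB; the function name `quantize_direction` and the sample gradient values are made up for illustration, and it assumes that 180 degrees is treated as the same direction as 0 degrees.

```python
import numpy as np

def quantize_direction(gx, gy, step_deg=30):
    """Snap each gradient direction to the nearest multiple of step_deg.

    Directions are folded into [0, 180), so 180 degrees counts as 0 degrees
    (an assumption of this sketch).
    """
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    n_bins = 180 // step_deg
    bins = np.round(angle / step_deg).astype(int) % n_bins
    return bins * step_deg

# Hypothetical gradient samples (gx plays the role of resX, gy of resY).
gx = np.array([1.0, 1.0, 0.0, -2.0])
gy = np.array([0.0, 0.7, 1.0, 1.0])
quantized = quantize_direction(gx, gy)
```

To display the result, each of the six quantized values can be used as an index into a small color table, with black assigned to 0 degrees as in the picture described above.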
Automatically determining good thresholds is usually not an easy task. For example, you could start with the one provided by the Otsu method, and decrease or increase its value based on some other histogram analysis according to the problem you are trying to solve.
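As a starting point for the Otsu suggestion, here is a minimal NumPy sketch of the Otsu criterion (pick the threshold that maximizes the between-class variance of the histogram). The function name `otsu_threshold` and the tiny `magnitude` array are assumptions for illustration; in practice you would pass the Sobel magnitude image and then adjust the returned value as the answer describes.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the histogram bin center that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                 # pixel count at or below each bin
    w1 = w0[-1] - w0                     # pixel count above each bin
    s0 = np.cumsum(hist * centers)       # running sum of intensities
    mu0 = s0 / np.maximum(w0, 1)         # mean of the lower class
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between)]

# Hypothetical edge magnitudes: three weak responses, three strong ones.
magnitude = np.array([[5.0, 8.0, 120.0],
                      [6.0, 110.0, 130.0]])
binary_edges = (magnitude > otsu_threshold(magnitude)).astype(np.uint8)
```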

Here's the answer to your first question:
In the Sobel edge detection algorithm, the direction obtained is the direction of the gradient.
The gradient in image processing is defined as the direction in which the change in intensity is greatest. The change can be an increase or a decrease in intensity. It is calculated for each pixel, which means that the maximum change in intensity is measured at every pixel. resX (in the SobelEdgeDetection.m from your question) captures changes in the X direction, and resY captures changes in the Y direction.
To see this in practice, just run this command in the MATLAB command window:
imshow(resX);
Also try imshow(resY);

Related

Find sub-pixel maximum on a 2D array

Suppose I have an image and I want to find a subarray with shape 3x3 that contains the maximum sum compared to other subarrays.
How do I do that in python efficiently (run as fast as possible)? If you can provide a sample code that would be great.
My specific problem:
I want to extract the location of the center of the blob in this heatmap
I don't want to just get the maximum point because that would cause the coordinate to not be very precise. The true center of the blob could actually be between 2 pixels. Thus, it's better to do weighted average between many points to obtain subpixel precision. For example, if there are 2 points (x1,y1) and (x2,y2) with values 200 and 100. Then the average coordinate will be x=(200*x1+100*x2)/300 y=(200*y1+100*y2)/300
One of my solutions is to do a convolution operation. But I think it's not efficient enough, because it requires multiplication by the kernel (which contains only ones). I'm looking for a fast implementation, so I would rather not write the loops myself because I'm not sure they will be fast.
I want to run this algorithm on 50 images every few milliseconds. (Images come in as a batch.) Concretely, think of these images as the output of a machine learning model that outputs heatmaps. In order to obtain the coordinate from these heatmaps, I need to do some kind of weighted average over the coordinates with high intensity. My idea is to do a weighted average over a 3x3 area of the image. I am also open to other approaches that can be faster or more elegant.
Looking for the "subarray of shape 3x3 with the maximum sum" is the same as looking for the maximum of an image after it has been filtered with an un-normalized 3x3 box filter. So it boils down to finding efficiently the maximum of an image, which you assume is a (perhaps "noisy") discrete sample of an underlying continuous and smooth signal - hence your desire to find a sub-pixel location.
You really need to split the problem in 2 parts:
1. Find the pixel location m=(xm, ym) of the maximum value of the image. This requires no more than a visit of every pixel in the image, and one comparison per pixel, so it's O(N) and hence optimal as long as you are operating at the native image resolution. In OpenCV it is done using the minMaxLoc function.
2. Apply whatever model of the image you are using to find its (subpixel-interpolated) maximum in a neighborhood of m.
To clarify point (2): you write
I don't want to just get the maximum point because that would cause the coordinate to not be very precise. The true center of the blob could actually be between 2 pixels
While intuitively plausible, this assertion needs to be made more precise in order to be computable. That is, you need to express mathematically what assumptions you make about the image that lead you to search for a "true" maximum between pixel-sampled locations.
A simple example of such an assumption is quadratic smoothness. In this scenario you assume that, in a small (say, 3x3 or 5x5) neighborhood of the "true" maximum location, the image signal z is well approximated by a quadratic:
z = A00 dx^2 + A01 dx dy + A11 dy^2 + A02 dx + A12 dy + A22
where:
dx = x - xm; dy = y - ym
This assumption makes sense if the underlying signal is expected to be at least 3rd order continuous and differentiable, because of the Taylor series theorem. Geometrically, it means that you assume (hope?) that the signal looks like a quadric (a paraboloid, or an ellipsoid) near its maximum.
You can then evaluate the above equation for each of the pixels in a neighborhood of m, replacing the actual image values for z, and thus obtain a linear system in the unknown Aij, with as many equations as there are neighbor pixels (so even a 3x3 neighborhood will yield an over-constrained system). Solving the system in the least-squares sense gives you the "optimal" coefficients Aij. The theoretical maximum as predicted by this model is where the first partial derivatives vanish:
del z / del dx = 2 A00 dx + A01 dy + A02 = 0
del z / del dy = A01 dx + 2 A11 dy + A12 = 0
This is a linear system in the two unknown (dx, dy), and solving it yields the estimated location of the maximum and, through the above equation for z, the predicted image value at the maximum.
In terms of computational cost, all such model estimations are extremely fast, compared with traversing an image of even moderate size.
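The two-step procedure above can be sketched in Python with NumPy as follows. This is a sketch under stated assumptions, not a definitive implementation: it uses a 3x3 neighborhood, assumes the maximum pixel is not on the image border, and the function name `subpixel_peak` is made up for illustration. The coefficient names follow the quadratic model given earlier.

```python
import numpy as np

def subpixel_peak(img):
    """Estimate the sub-pixel (x, y) location of the image maximum.

    Step 1: find the integer-pixel maximum m = (xm, ym).
    Step 2: least-squares fit of
        z = A00 dx^2 + A01 dx dy + A11 dy^2 + A02 dx + A12 dy + A22
    on the 3x3 neighborhood of m, then solve for the point where both
    first partial derivatives vanish.
    Assumes the maximum is not on the image border.
    """
    img = np.asarray(img, dtype=float)
    ym, xm = np.unravel_index(np.argmax(img), img.shape)

    # Offsets of the 3x3 neighborhood; rows vary in dy, columns in dx.
    dy, dx = np.mgrid[-1:2, -1:2]
    dx, dy = dx.ravel(), dy.ravel()
    z = img[ym - 1:ym + 2, xm - 1:xm + 2].ravel()

    # One equation per neighbor pixel: 9 equations, 6 unknowns.
    design = np.column_stack([dx**2, dx * dy, dy**2, dx, dy, np.ones(9)])
    a00, a01, a11, a02, a12, _ = np.linalg.lstsq(design, z, rcond=None)[0]

    # Stationary point: [2 A00, A01; A01, 2 A11] [dx, dy]^T = -[A02, A12]^T
    hessian = np.array([[2 * a00, a01], [a01, 2 * a11]])
    off_x, off_y = np.linalg.solve(hessian, [-a02, -a12])
    return xm + off_x, ym + off_y

# Synthetic heatmap whose true peak lies between pixels, at (2.3, 1.8).
ys, xs = np.mgrid[0:5, 0:5]
heatmap = 10.0 - (xs - 2.3) ** 2 - 2.0 * (ys - 1.8) ** 2
peak_x, peak_y = subpixel_peak(heatmap)
```

Because the synthetic heatmap is itself an exact quadratic, the fit recovers the peak location up to floating-point error; on real, noisy heatmaps the result is only as good as the quadratic assumption.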
I am sorry I did not exactly understand the meaning of your last paragraph so I have just stopped at a point where I got all the coordinates having the maximum value. I have used cv2.filter2D for convolution on a thresholded image and then using np.amax and np.where have found the coordinates having the maximum value.
import cv2
import numpy as np
from timeit import default_timer as timer
img = cv2.imread('blob.png', 0)
start = timer()
_, thresh = cv2.threshold(img, 240, 1, cv2.THRESH_BINARY)
mask = np.ones((3, 3), np.uint8)
res = cv2.filter2D(thresh, -1, mask)
result = np.where(res == np.amax(res))
end = timer()
print(end - start)
I don't know whether it is as efficient as you want, but the output was 0.0013461999999435648 s
P.S. The image you have provided had a white border which I had to crop out for this method.
One way is to sub-sample the image and find the neighborhood of the desired point. You can do this by looping not over all the pixels but over, e.g., every 5th pixel (row = row + 5 and col = col + 5 in the loop). After finding the approximate location, consider a specific neighborhood around that location and loop over all the pixels of that crop to find the exact location.
Based on my knowledge of image processing, to get a reliable result that works for any one blob, follow these steps:
Make the image greyscale if it isn’t already (pixel values 0-255)
Normalise the image so that pixel intensities cover the full range of 0-255
Convert image to binary (a pixel is either 0 or 1) - this can be achieved by thresholding, such as applying the rule that any pixel less than or equal to 127 in intensity is given an intensity of 0 and anything else is given an intensity of 1
Find the weighted average of all the pixels that hold the value of “1”
or
Apply an erosion to the image until you are left with either 2 pixels or 1 pixel.
Case 1
If you have two pixels then you need to find the u and v co-ordinates of both pixels. The centre of the blob will be the halfway point between the u and v co-ordinates of the pixels.
Case 2
If you have one pixel left then that pixel's co-ordinates are the centre point.
—————
You mentioned about achieving this quickly in Python:
Python by design is an interpreted language, so it is executed line by line, making it less suitable for highly iterative tasks like image processing. However, you can make use of libraries like OpenCV (https://docs.opencv.org/2.4/index.html), which is written in C++, to mitigate this, as well as to make the task at hand a lot easier for you.
OpenCV also provides solutions for all the steps I listed above, so you should be able to achieve a reliable solution fairly quickly, though I can't say for sure whether it will hit your target of 50 images every few milliseconds. Another factor to take into account is the size of the image you are processing: the processing load grows with the number of pixels.
UPDATE
I just found a good article that practically echoes my step-process:
https://www.learnopencv.com/find-center-of-blob-centroid-using-opencv-cpp-python/
More importantly it also denotes the formula for finding the centroid mathematically as:
c = (1/n) * sum_{i=1..n} x_i
but this is better written in the article than I can do so here.
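For reference, an intensity-weighted version of that centroid, matching the weighted-average formula in the question, might look like the sketch below. It uses NumPy only; the function name `weighted_centroid` and the `threshold` parameter are assumptions for illustration, not part of the linked article.

```python
import numpy as np

def weighted_centroid(img, threshold=0):
    """Intensity-weighted mean (x, y) of all pixels above threshold.

    Returns None if no pixel exceeds the threshold.
    """
    img = np.asarray(img, dtype=float)
    weights = np.where(img > threshold, img, 0.0)
    total = weights.sum()
    if total == 0:
        return None
    ys, xs = np.indices(img.shape)
    return (xs * weights).sum() / total, (ys * weights).sum() / total

# The two-point example from the question: values 200 and 100 side by side.
heat = np.zeros((3, 4))
heat[1, 1] = 200
heat[1, 2] = 100
cx, cy = weighted_centroid(heat)
```

Here cx works out to (200*1 + 100*2)/300, the same expression as in the question. The whole computation is vectorized, so it avoids Python-level loops over pixels.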

Algorithm for plotting 2d xy graph

I'm trying to plot XY graph in real time using Java. Functions that only rely on X are easy. Just iterate over x0...xn, get value and draw lines between the points. There are a lot of guides on it and it's intuitive.
But there is literally no guide on plotting graphs with x AND y being a variable.
Consider this equation: sin(x^3 * y^2) = cos(x^2 * y^3)
Using online Graph plotter I get this:
While my best result plotting the same function is this:
I just iterate over every pixel on screen and pass pixel positions as parameters to the function. If function's output is close to 0, I color the pixel. As you can see it's bad. It also takes huge amount of processing power. It only redraws once every couple of seconds. And if I try to increase precision, all lines just become thicker. Especially around intersections.
My question is how can I make my program faster and make it produce better looking graphs. Maybe there are some algorithms for that purpose?
The challenge is to choose the correct threshold. Pixels where abs(f(x,y)) is below the threshold should be colored. Pixels above the threshold should be white.
The problem is that if the threshold is too low, gaps appear in places where no pixel is exactly on the line. On the other hand, if the threshold is too high, the lines widen in places where the function is near zero, and the function is changing slowly.
So what's the correct threshold? The answer is the magnitude of the gradient, multiplied by the radius of a pixel. In other words, the pixel should be colored when
abs(f(x,y)) < |g(x,y)| * pixelRadius
The reason is that the magnitude of the gradient is equal to the maximum slope of the surface (at a given point). So a zero crossing occurs within a pixel if the slope is large enough to reduce the function to zero, inside the pixel.
That of course is only rough approximation. It assumes that the gradient doesn't change significantly within the area bounded by the pixel. The function in the question conforms to that assumption reasonably well, except in the upper right corner. Notice that in the graph below, there are Moiré patterns in the upper right. I believe that those are due to the failure in my antialiasing calculation: I don't compensate for a rapidly changing gradient.
In the graph below, pixels are white if
abs(f(x,y)) > |g(x,y)| * pixelRadius
Otherwise the pixel intensity is a number from 0 to 1, with 0 being black and 1 being white:
intensity = abs(f(x,y)) / (|g(x,y)| * pixelRadius)
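The intensity rule above can be sketched in Python with NumPy (the question uses Java, so treat this as an illustration of the formula only). The sketch assumes a uniform grid with square pixels and estimates the gradient numerically with `np.gradient`, which may differ from how the graph above was produced; the function name `implicit_intensity` is made up.

```python
import numpy as np

def implicit_intensity(f, xs, ys):
    """Per-pixel intensity in [0, 1] for the curve f(x, y) = 0.

    0 is black (on the curve), 1 is white (no zero crossing expected).
    Assumes xs and ys are uniformly spaced with the same step.
    """
    X, Y = np.meshgrid(xs, ys)
    F = f(X, Y)
    # Numerical gradient: axis 0 follows ys, axis 1 follows xs.
    dF_dy, dF_dx = np.gradient(F, ys, xs)
    grad_mag = np.hypot(dF_dx, dF_dy)
    pixel_radius = 0.5 * (xs[1] - xs[0])
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.abs(F) / (grad_mag * pixel_radius)
    # A zero gradient gives inf or nan; treat those pixels as white.
    ratio = np.nan_to_num(ratio, nan=1.0, posinf=1.0)
    return np.clip(ratio, 0.0, 1.0)

# The equation from the question, rewritten as f(x, y) = 0.
def f(x, y):
    return np.sin(x**3 * y**2) - np.cos(x**2 * y**3)

xs = np.linspace(-3.0, 3.0, 200)
ys = np.linspace(-3.0, 3.0, 200)
img = implicit_intensity(f, xs, ys)
```

Evaluating the whole grid with array operations instead of a per-pixel loop is also the main lever for the speed problem described in the question.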
I don't know exactly how the online plotter did it, but here are some suggestions.
Simplify your equation. For this specific one, you can easily derive x^2 * y^2 * (x ± y) = (2 * n + 1 / 2) * pi, where n is any integer. It's much clearer than the original one.
Draw lines rather than points. Every n here stands for 4 curves; you can now loop over x, solve for y, and draw a line between adjacent points.
Hope it helps!

What is the maximum x-axis range of acquired depth data in Google Project Tango?

I need to divide the points based on their x-position, so that there is, for example, three divisions of points (a middle, left, and right). The middle one should have a range of one meter. Thus, I was wondering what is the min/max ranges of the x-axis? is it large enough to add more divisions than three with same range (1 meter) ?
Thanks
I'm not sure if your question is very precise.
The x and y positions of the depth data will depend on the actual depth of the image. In particular, they will depend on the depth and the angle of the camera. If the wall in front of the camera is very close, there will be less x-axis range.
As an example. For a depth data with an average z-range of 1.5, I get a x-range around [-0.8,0.8]. For another frame, the average z-range is 3.0, the range goes to [-1.6, 1.6]. Of course these numbers depend on the scene itself, it was just to give you a little idea.
Is it clearer now?
You can compute the horizontal field of view with this equation:
Horizontal FOV = 2 * atan(0.5 * width / Fx)
https://developers.google.com/tango/overview/intrinsics-extrinsics
On the Tango Yellowstone it is about 63 degrees. So it means that you have about 31 degrees to the left and 31 degrees to the right.
Now, if you have pointcloudData based on xyz you can know that if z = 1 meter then you could apply trigonometry
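One way to carry out that trigonometry is sketched below in Python. It assumes a pinhole camera looking straight along the z axis and uses the roughly 63 degree horizontal field of view quoted above; the function name `x_range_at_depth` is made up for illustration.

```python
import math

def x_range_at_depth(z, hfov_deg=63.0):
    """Return (x_min, x_max) visible at depth z for a given horizontal FOV.

    Assumes a pinhole camera looking along +z, so |x| <= z * tan(FOV / 2).
    """
    x_max = z * math.tan(math.radians(hfov_deg / 2.0))
    return -x_max, x_max

# Visible x-range at 1 m and at 3 m depth.
x_min_1m, x_max_1m = x_range_at_depth(1.0)
x_min_3m, x_max_3m = x_range_at_depth(3.0)

# How many whole 1 m wide divisions fit across the view at 3 m depth.
divisions_3m = int((x_max_3m - x_min_3m) // 1.0)
```

Under these assumptions the usable x-range scales linearly with depth, so the number of 1 m divisions that fit depends on how far away the scene is.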

Distance between set of lines

Say that my images are simple shapes - set of lines, dots, curves, and simple objects,
How do I calculate the distance between images, so that length is important but total scale is not important, the location of a line or curve is important, angles are important, etc.?
Attached image, for example:
My comparison object is the cube on the top left; the scores are fictitious, just for this example.
The distance to the cylinder is 80 (it has 2 lines but the top geometry is different).
The bottom left cube's score is 100 since its lines match exactly at a different scale.
The bottom right rectangle's score is 90 since it has exactly matching lines on the top but differently scaled lines on the side.
I am looking for algorithm name or general approach that will help me to start to think towards a solution....
Thank you for your help.
Here is something to get you started. When jumping into new problems, I don't see much value in trying a lot of complex steps just because they are available somewhere to use. So my focus is on using relatively simple things, that will fail in more varied situations, but hopefully you will see its value and get some sense of the problem.
The approach is fully based on corner detection; two typical methods for this detection are the Harris detector and the one by Shi and Tomasi described in the paper "Good Features to Track", 1994. I will use the second one, just because there is a ready implementation in OpenCV, newer Matlab, and possibly many other places. Its implementation in these packages also allows for easier parameter adjustment regarding corner quality and minimum distance between corners. So, suppose you can detect all corner points correctly: how do you measure how close one shape is to another based on these points? The images have arbitrary size, so my idea is to normalize the point coordinates to the range [0, 1]. This solves the scaling issue, which is desired according to the original description. Now we have to compare point sets in the range [0, 1]. Here we go for the simplest thing: consider one point p from shape a; what is the closest point in shape b? We assume it is the one with the minimum absolute difference between this point p and any point in b. If we sum all these values, we get a score between shapes. The lower the score, the more similar the shapes (according to this approach).
Here are some shapes I drew:
Here are the detected corners:
As you can clearly see in this last set of images, the method will easily confuse a rectangle/square with a cylinder. To handle that you will need to combine the approach with other descriptors. Initially, a simple one that you might consider is the ratio between the shape's area and its bounding box area (which would give 1 for rectangle, and lower for cylinder).
With the method described above, here are the measurements between the first and second shapes, first and third shapes, ..., respectively: 0.02358485, 0.41350339, 0.30128458, 0.4980852, 0.18031262. The second cube is a resized version of the first one, and as you see, they are very similar by this metric. The last shape is a resized version of the first cube but without keeping the aspect ratio, and the metric gives a much higher difference.
If you want to play with the code that performs this, here it is (in Python, depends on OpenCV, numpy):
import sys
import cv2 as cv
import numpy

# Load every image named on the command line (color and grayscale copies).
inp = []
for fname in sys.argv[1:]:
    img_color = cv.imread(fname)
    # cv.imread returns channels in BGR order.
    img = cv.cvtColor(img_color, cv.COLOR_BGR2GRAY)
    inp.append((img_color, img))

ptsets = []
# Corner detection parameters.
params = (
    200,   # max number of corners
    0.01,  # minimum quality level of corners
    10,    # minimum distance between corners
)
# Params for visual circle markers.
circle_radii = 3
circle_color = (255, 0, 0)

for i, (img_color, img) in enumerate(inp):
    cornerMap = cv.goodFeaturesToTrack(img, *params)
    corner = numpy.array([c[0] for c in cornerMap])
    for c in corner:
        # cv.circle needs integer pixel coordinates.
        center = (int(round(c[0])), int(round(c[1])))
        cv.circle(img_color, center, circle_radii, circle_color, -1)
    # Just to visually check for correct corners.
    cv.imwrite('temp_%d.png' % i, img_color)
    # Convert corner coordinates to [0, 1]
    cornerUnity = (corner - corner.min()) / (corner.max() - corner.min())
    # You might want to use other descriptors here. XXX
    ptsets.append(cornerUnity)

def compare_ptsets(p):
    # Score every point set against the first one; lower means more similar.
    res = numpy.zeros(len(p))
    base = p[0]
    for i in range(1, len(p)):
        sum_min_diff = sum(numpy.abs(p[i] - value).min() for value in base)
        res[i] = sum_min_diff
    return res

res = compare_ptsets(ptsets)
print(res)
The process to be followed depends on what depth of features you are going to consider and accuracy required.
If you want something more accurate, search some technical papers like this which can give a concrete and well-proven approach or algorithm.
EDIT:
The idea from Waltz algorithm (one method in AI) can be tweaked. This is just my thought. Interpret the original image, generate some constraints out of it. For each candidate, find out the number of constraints it satisfies. The one which satisfies more constraints will be the most similar to the original image.
Try to calculate the mass center of each figure. Treat each point of the figure as a particle with mass equal to 1.
Then calculate each distance as sqrt((x1-x2)^2 + (y1-y2)^2), where (xi, yi) is mass center coordinate for figure i.

Generate Visually Different Colors With An Unknown Color Collection Size

I am trying to generate colors on the fly for a chart control. I want the colors to be visually distinctive. I don't just want the colors to be distinctive from the adjacent colors, but all colors generated so far.
I also don't want to have to have a known color collection size. Some algorithms I have seen for this require the number of things to color to be known. I want to implement a GetNextColor() for my color generator so I will not know at the time of choosing how many colors I will ultimately have and choosing a number up front feels wrong.
I am not just trying to graph a bunch of stuff in different colors, I am interested in this problem and want some feedback.
Here's where I'm at:
Using the HSV color space.
The hue is a value from [0-360] where 0 and 360 are the same (reddish).
Hue starts at 0, I add 27 (so that when it cycles around it doesn't land on the same color it started on), take MOD 360.
For S and V (both between 0 and 1) I start out at a low number like .25
Run through about 20 hues
Then take a high number like .85
Run through 20 hues
Then start bisecting to get the most distant values that haven't been used yet.
This isn't a very effective method, it works OK, but it could be much more scientific. It started out with a lot of thought and then morphed into this mess.
Any ideas on how to do this elegantly?
(It shouldn't matter, but I am using C# and I will post code when I get back to my computer I have all this stuff on.)
I believe that your question should be split into two questions:
How to map colors into a n-dimensional Cartesian space, and define an Euclidean distance function between colors, such that the distance reflects the difference for a human observer.
Given a n-dimensional cuboid, generate a sequence of dots such that minimal Euclidean distance between any two dots generated so far would be maximized.
And now the answers:
Color difference is calculated using the CIEDE2000 Color-Difference Formula. The CIEDE2000 formula is based on the LCH color space (Luminosity, Chroma, and Hue). LCH color space is represented as a cylinder (see image here).
However, the difference formula is highly nonlinear. Therefore it would be impossible to map the colors into a square grid such that Euclidean distance would give the CIEDE2000 color-difference.
Settling on a less accurate model, we can use the CIE76 Color-Difference formula, which is based on the Lab color space ( L*a*b*). We can use Euclidean distance directly on this color space to measure the difference. There are no simple formulas for conversion between RGB or CMYK values and L*a*b*, because the RGB and CMYK color models are device dependent. The RGB or CMYK values first need to be transformed to a specific absolute color space, such as sRGB or Adobe RGB. This adjustment will be device dependent, but the resulting data from the transform will be device independent, allowing data to be transformed to the CIE 1931 color space and then transformed into L*a*b*. This article explains the procedure and the formulas.
For the L*a*b* color space and the CIE76 Color-Difference formula - we'll need to solve the problem for a 3D cube.
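To make the CIE76 part concrete, here is a small Python sketch. The CIE76 difference is just the Euclidean distance between two L*a*b* triples. The `pick_next` helper is a simple greedy farthest-point choice added for illustration only; it is not the cube-subdivision strategy described in this answer, and both function names and the sample values are made up. Colors are assumed to be already converted to L*a*b*.

```python
import math

def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in L*a*b* space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))

def pick_next(candidates, used):
    """Greedy choice: the candidate whose nearest used color is farthest away.

    Illustration only, not the cube subdivision from the answer.
    """
    if not used:
        return candidates[0]
    return max(candidates,
               key=lambda c: min(delta_e_cie76(c, u) for u in used))

# Hypothetical L*a*b* triples.
used = [(50.0, 0.0, 0.0)]
candidates = [(50.0, 3.0, 4.0), (50.0, 30.0, 40.0), (55.0, 0.0, 0.0)]
difference = delta_e_cie76(used[0], candidates[0])
next_color = pick_next(candidates, used)
```

A greedy choice like this needs no color count up front, which fits a GetNextColor() style interface, but it rescans all used colors on every call.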
I believe that your best strategy would be to divide the cube into 8 cubes, which will generate 27 points. Use these points. Now divide each of the 8 cubes into another 8 cubes. For each of these cubes, 12 out of the 27 points have already been used, so you're left with 15*8 new points. In each additional step n, you can generate 15*8^n additional points.
The points-set in each step should be sorted such that the minimal distance between two consecutive points would be maximized. I don't know how to do it - I've just posted a question.
Edit:
I've crossposted on https://cstheory.stackexchange.com/ and got a good answer. See https://cstheory.stackexchange.com/questions/8609/sorting-points-such-that-the-minimal-euclidean-distance-between-consecutive-poin.
If you map the whole color space linearly then your next color would map into it using the powers of 2. Your first choice would be the center, your 2nd choice would be between start and center. Your 3rd choice would be between center and end.
Some JavaScript to illustrate.
// initialize start and end of our linear transform
var START = 0;
var END = 100;
// next function
var _level = 1;
var _index = 1;
function next() {
    // number of equal segments at this level: 2, 4, 8, ...
    var pow2 = 2 << (_level - 1);
    // offset by START so the mapping also works when START is not 0
    var result = START + ((END - START) / pow2) * _index;
    // only odd indices are new at each level
    _index = (_index + 2) % pow2;
    if (_index == 1) {
        _level++;
    }
    return result;
}
// testing
for (var i = 0; i < 32; i++)
    console.log(next());

Resources