What does "filter" mean in Realistic Ray Tracing? - raytracing

I'm reading the book Realistic Ray Tracing and I couldn't understand the box filter code:
void boxFilter(Vector2* samples, int num_samples)
{
    for (int i = 0; i < num_samples; i++)
    {
        samples[i].x = samples[i].x - 0.5f;
        samples[i].y = samples[i].y - 0.5f;
    }
}
In my opinion, a "filter" is an array of weights: sampling generates positions to produce rays, and the filter combines the results (so the filter method should return a float[], but the function above modifies a Vector2[] in place). What does this code mean?

The basic idea of box filtering is that no matter where on an image plane "pixel" the sample lands, a box filter causes the renderer to act as if it landed in the exact center of that pixel.
I haven't read that particular book, but I'm guessing that in that code fragment, sample[].x and .y are the int (or previously floor()ed) locations where the returned ray hits the image plane, in pixel coordinates. Therefore, subtracting .5 from each puts the sample in the geometric center of each pixel, hence, it's a box filter.
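To make the connection to the "array of weights" view explicit, here is a minimal standalone C++ sketch (my own names, not the book's code): if the incoming samples are canonical points in [0,1)^2, subtracting 0.5 re-centers them around the pixel center, and the box "weights" then show up at reconstruction time as a plain unweighted average of the per-sample colors.
#include <vector>
struct Vector2 { float x, y; };
struct Color   { float r, g, b; };
// Warp canonical [0,1)^2 samples into offsets about the pixel center, mirroring the
// book's boxFilter; other filters would warp the same inputs differently.
void boxFilterSamples(std::vector<Vector2>& samples)
{
    for (Vector2& s : samples) { s.x -= 0.5f; s.y -= 0.5f; }
}
// A box filter weights every sample inside the pixel footprint equally, so the
// reconstructed pixel value is just the unweighted average of the sample colors.
Color boxReconstruct(const std::vector<Color>& sampleColors)
{
    Color sum{0.0f, 0.0f, 0.0f};
    for (const Color& c : sampleColors) { sum.r += c.r; sum.g += c.g; sum.b += c.b; }
    float w = 1.0f / static_cast<float>(sampleColors.size()); // uniform box weight
    return {sum.r * w, sum.g * w, sum.b * w};
}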
For an in-depth discussion of box filtering (and other filters), see Physically-Based Rendering, Chapter 7, "Sampling and Reconstruction".

Related

depth peeling invariance in webgl (and threejs)

I'm looking at what I think is the first paper for depth peeling (the simplest algorithm?) and I want to implement it with WebGL, using three.js.
I think I understand the concept and was able to make several peels, with some logic that looks like this:
render(scene, camera) {
const oldAutoClear = this._renderer.autoClear
this._renderer.autoClear = false
setDepthPeelActive(true) //sets a global injected uniform in a singleton elsewhere, every material in the scene has onBeforeRender injected with additional logic and uniforms
let ping
let pong
for (let i = 0; i < this._numPasses; i++) {
const pingPong = i % 2 === 0
ping = pingPong ? 1 : 0
pong = pingPong ? 0 : 1
const writeRGBA = this._screenRGBA[i]
const writeDepth = this._screenDepth[ping]
setDepthPeelPassNumber(i) //was going to try increasing the polygonOffsetUnits here globally,
if (i > 0) {
//all but first pass write to depth
const readDepth = this._screenDepth[pong]
setDepthPeelFirstPass(false)
setDepthPeelPrevDepthTexture(readDepth)
this._depthMaterial.uniforms.uFirstPass.value = 0
this._depthMaterial.uniforms.uPrevDepthTex.value = readDepth
} else {
//first pass just renders to depth
setDepthPeelFirstPass(true)
setDepthPeelPrevDepthTexture(null)
this._depthMaterial.uniforms.uFirstPass.value = 1
this._depthMaterial.uniforms.uPrevDepthTex.value = null
}
scene.overrideMaterial = this._depthMaterial
this._renderer.render(scene, camera, writeDepth, true)
scene.overrideMaterial = null
this._renderer.render(scene, camera, writeRGBA, true)
}
this._quad.material = this._blitMaterial
// this._blitMaterial.uniforms.uTexture.value = this._screenDepth[ping]
this._blitMaterial.uniforms.uTexture.value = this._screenRGBA[
this._currentBlitTex
]
console.log(this._currentBlitTex)
this._renderer.render(this._scene, this._camera)
this._renderer.autoClear = oldAutoClear
}
I'm using gl_FragCoord.z to do the test, and packing the depth into an 8-bit RGBA texture, with a shader that looks like this:
float depth = gl_FragCoord.z;
vec4 pp = packDepthToRGBA( depth );
if( uFirstPass == 0 ){
float prevDepth = unpackRGBAToDepth( texture2D( uPrevDepthTex , vSS));
if( depth <= prevDepth + 0.0001) {
discard;
}
}
gl_FragColor = pp;
Varying vSS is computed in the vertex shader, after the projection:
vSS.xy = gl_Position.xy * .5 + .5;
The basic idea seems to work and I get peels, but only if I use the fudge factor. It looks like it fails, though, as the angle gets more obtuse (which is why polygonOffset needs both the factor and the units, to account for the slope?).
I didn't understand at all how the invariance is solved. I don't understand how the mentioned extension is being used other than it seems to be overriding the fragment depth, but with what?
I must admit that I'm not sure even which interpolation is being referred to here, since every pixel is aligned; I'm just using nearest filtering.
I did see some hints about depth buffer precision, but, not really understanding the issue, I wanted to try packing the depth into only three channels and see what happens.
Having such a small fudge factor make it sort of work tells me that all these sampled and computed depths likely exist in the same space. But this seems to be the same issue as using gl.EQUAL for depth testing? Just to experiment, I tried to override the depth with the unpacked depth immediately after packing it, but it didn't seem to do anything.
edit
Increasing the polygon offset with each peel seems to have done the trick. I got some z-fighting with the lines, but I think that's because I was already using an offset to draw them and I need to include that in the peel offset. I'd still love to understand more about the problem.
The depth buffer stores depths :) Depending on the 'near' and 'far' planes, the perspective projection tends to "stack" the depths of the points into just a short part of the buffer's range; it's not linear in z. You can see this yourself by setting a different color depending on depth and rendering a triangle that spans most of the near-far distance.
A shadow map stores depths (distances to the light)... calculated after projection. Later, in the second or following pass, you compare those depths, which are "stacked", and some comparisons fail because the values are very similar: hazardous variances.
You can use a finer-grained depth buffer, 24 bits instead of 16 or 8 bits. This may solve part of the problem.
There's another issue: the perspective division, z/w, needed to get normalized device coordinates (NDC). It occurs after the vertex shader, so gl_FragDepth = gl_FragCoord.z is affected.
Another approach is to store depths calculated in some space that suffers neither "stacking" nor perspective division. Camera space is one. In other words, you can calculate the depth by undoing the projection in the vertex shader.
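To make the "stacking" concrete, here is a small standalone C++ sketch (assuming a standard OpenGL-style perspective matrix; the near/far values are made up) that prints the window-space depth for a few view-space distances:
#include <cstdio>
#include <initializer_list>
// Post-projection depth for a standard OpenGL perspective matrix:
// ndcZ = (f+n)/(f-n) + 2fn/((f-n)*zView), with zView negative in front of the
// camera; window-space depth is then ndcZ * 0.5 + 0.5.
double windowDepth(double zView, double n, double f)
{
    double ndcZ = ((f + n) / (f - n)) + (2.0 * f * n) / ((f - n) * zView);
    return ndcZ * 0.5 + 0.5;
}
int main()
{
    const double n = 0.1, f = 1000.0;
    for (double d : {1.0, 10.0, 100.0, 500.0, 1000.0})
        std::printf("view distance %7.1f -> depth %.6f\n", d, windowDepth(-d, n, f));
    // Most of the [0,1] depth range is spent on the first few units in front of the
    // camera; distant surfaces get nearly identical depth values, which is why a
    // peeling comparison against a packed 8-bit-per-channel depth needs a bias.
}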
The article you link to is for the old fixed pipeline, without shaders. It shows an NVIDIA extension to deal with these variances.

What algorithms or approaches apart from Haar cascades could be used for custom objects detection?

I need to do computer vision tasks in order to detect water bottles or soda cans. I will obtain 'frontal' images of bottles, soda cans or any other random objects (one by one), and my algorithm should determine whether it's a bottle, a can, or neither of them.
Some details about object detecting scenario:
As mentioned, I will test one single object per image/video frame.
Not all water bottles are the same. There could be variation in plastic color, lid or label. Some might not have a label or lid.
The same kind of variation applies to soda cans. No wrinkled soda cans are going to be tested, though.
There could be small size variation between objects.
I could have a green (or any custom color) background.
I will do any needed filters on image.
This will be run on a Raspberry Pi.
Just in case, an example of each:
I've tested OpenCV's face detection algorithms a couple of times and I know they work pretty well, but with this approach I'd need to obtain a special Haar cascade features XML file for each custom object to detect.
So, the distinct alternatives I have in mind are:
Creating a custom Haar Classifier.
Considering shapes.
Considering outlines.
I'd like to get a simple algorithm, and I think creating a custom Haar classifier might not even be needed. What would you suggest?
Update
I strongly considered the shape/aspect ratio approach.
However, I'm facing some issues because bottles come in different sizes and even shapes. But this made me set the following considerations:
I'm applying a threshold with THRESH_BINARY method. (Thanks to the answers).
I will use a white background on detection.
Soda cans are all same size.
So, a bounding box for soda cans with high accuracy might distinguish a can.
What I've achieved:
Thresholding really helped me; I noticed that in white-background tests I would obtain this for cans:
And this is what is obtained for bottles:
So, the dominance of darker areas (on the left) is noticeable. There are some cases with cans where this might turn into false negatives. And for bottles, light and angle may lead to inconsistent results, but I really, really think this could be a shorter approach.
So, I'm quite confused now about how I should evaluate that darkness dominance. I've read that findContours could help with it, but I'm quite lost on how to use that function. For example, in the case of soda cans, it may find several contours, so I get lost on what to evaluate.
Note: I'm open to test any other algorithms or libraries distinct to Open CV.
I see a few basic ideas here:
Check the object's (to be precise, its bounding rect's) width/height ratio. For a can it's approximately 2-2.5; for a bottle I think it will be >3. It's a very simple idea, so it should be easy to test quickly, and I think it should have quite good accuracy. For in-between values, like 2.75 (assuming the values I gave are correct, which most likely isn't true), you can fall back to some different algorithm. There is a small sketch of this idea after this list.
Check whether your object contains glass/transparent regions - if yes, then it's definitely a bottle. Here you can read more about it.
Use the GrabCut algorithm to get the object mask/a more precise shape, and check whether the shape's width at the top is similar to its width at the bottom - if yes, it's a can; if not, a bottle (bottles have a screw cap at the top).
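Here is the small sketch mentioned in idea 1 above: an OpenCV (C++) guess that assumes a plain background, that the object is the largest contour, and that the ratio cutoffs roughly match the 2-2.5 / >3 values (all of which would need tuning on real images).
#include <algorithm>
#include <vector>
#include <opencv2/opencv.hpp>
// Threshold on a plain background, take the largest contour, and classify by the
// bounding-box height/width ratio ("tallness"). Cutoffs are guesses.
int classifyByAspectRatio(const cv::Mat& gray)
{
    cv::Mat bin;
    cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bin, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return -1;                       // nothing found
    // assume the object of interest is the largest contour in the frame
    auto largest = std::max_element(contours.begin(), contours.end(),
        [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b)
        { return cv::contourArea(a) < cv::contourArea(b); });
    cv::Rect box = cv::boundingRect(*largest);
    double ratio = static_cast<double>(box.height) / box.width;
    if (ratio > 3.0) return 1;     // probably a bottle
    if (ratio < 2.5) return 0;     // probably a can
    return -1;                     // ambiguous band: fall back to another cue
}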
Since you want to recognize can vs bottle rather than pepsi vs coke, shape matching is probably the way to go when compared to Haar and the features2d matchers like SIFT/SURF/ORB
A unique background color will make things easier.
First create a histogram from an image of just the background
int channels[] = {0,1,2}; // use all the channels
int rgb_bins = 32; // quantize to 32 colors per channel
int histSize[] = {rgb_bins, rgb_bins, rgb_bins};
float _range[] = {0,255};
float* ranges[] = {_range, _range, _range};
cv::SparseMat bghist;
cv::calcHist(&bg_image, 1, channels, cv::noArray(),bghist, 3, histSize, ranges );
Then use calcBackProject to create a mask of bg and not bg
cv::MatND temp_ND;
cv::calcBackProject( &bottle_image, 1, channels, bghist, temp_ND, ranges );
cv::Mat bottle_mask, bottle_backproj;
if( feeling_lazy ){
cv::normalize(temp_ND, bottle_backproj, 0, 255, cv::NORM_MINMAX, CV_8U);
//a small blur here could work nicely
threshold( bottle_backproj, bottle_mask, 0, 255, THRESH_OTSU );
bottle_mask = cv::Scalar(255) - bottle_mask; //invert the mask
} else {
//finding just the right value here might be better than the above method
int magic_threshold = 64;
temp_ND.convertTo( bottle_backproj, CV_8U, 255.);
//I expect temp_ND to be CV_32F ranging from 0-1, but I might be wrong.
threshold( bottle_backproj, bottle_mask, magic_threshold, 255, THRESH_BINARY_INV );
}
Then either:
Compare bottle_mask or bottle_backproj to a few sample bottle masks/backprojections using matchTemplate with a threshold on confidence to decide if it's a match.
matchTemplate(bottle_mask, bottle_template, result, CV_TM_CCORR_NORMED);
double confidence; minMaxLoc( result, NULL, &confidence);
Or use matchShapes, though I've never gotten this to work properly.
double confidence = matchShapes(bottle_mask, bottle_template, CV_CONTOURS_MATCH_I3);
Or use linemod which is difficult to set up but works great for images like this where the shape isn't very complex. Aside from the linked file, I haven't found any working samples of this method so here's what I did.
First create/train the detector with some sample images
//some magic numbers
std::vector<int> T_at_level;
T_at_level.push_back(4);
T_at_level.push_back(8);
//add some padding so linemod doesn't scream at you
const int T = 32;
int width = bottle_mask.cols;
if( width % T != 0)
width += T - width % T;
int height = bottle_mask.rows;
if( height % T != 0)
height += T - height % T;
//in this case template_backproj is created specifically from a sample bottle_backproj
cv::Rect padded_roi( (width - template_backproj.cols)/2, (height - template_backproj.rows)/2, template_backproj.cols, template_backproj.rows);
cv::Mat padded_backproj = cv::Mat::zeros( height, width, template_backproj.type());
template_backproj.copyTo( padded_backproj( padded_roi ) ); //copy into the padded canvas (plain assignment would only repoint a temporary header)
cv::Mat padded_mask = cv::Mat::zeros( height, width, template_mask.type());
template_mask.copyTo( padded_mask( padded_roi ) );
//you might need to erode padded_mask by a few pixels.
//initialize detector
std::vector< cv::Ptr<cv::linemod::Modality> > modalities;
modalities.push_back( cv::makePtr<cv::linemod::ColorGradient>() ); //for those that don't have a kinect
cv::Ptr<cv::linemod::Detector> new_detector = cv::makePtr<cv::linemod::Detector>(modalities, T_at_level);
//add sample images to the detector
std::vector<cv::Mat> template_images;
template_images.push_back( padded_backproj );
cv::Rect ignore_me;
const std::string class_id = "bottle";
int template_id = new_detector->addTemplate(template_images, class_id, padded_mask, &ignore_me);
Then do some matching
std::vector<cv::Mat> sources_vec;
sources_vec.push_back( padded_backproj );
//padded_backproj doesn't need to be the same size as the trained template images, but it does need to be padded the same way.
float matching_threshold = 0.8; //a higher number makes the algorithm faster
std::vector<cv::linemod::Match> matches;
std::vector<cv::String> class_ids;
new_detector->match(sources_vec, matching_threshold, matches,class_ids);
float confidence = matches.size() > 0? matches[0].similarity : 0;
As cyriel suggests, the aspect ratio (width/height) might be one useful measure. Here is some OpenCV Python code that finds contours (hopefully including the outline of the bottle or can) and gives you aspect ratio and some other measurements:
# src image should have already had some contrast enhancement (such as
# cv2.threshold) and edge finding (such as cv2.Canny)
contours, hierarchy = cv2.findContours(src, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
num_points = len(contour)
if num_points < 5:
# The contour has too few points to fit an ellipse. Skip it.
continue
# We could use area to help determine the type of object.
# Small contours are probably false detections (not really a whole object).
area = cv2.contourArea(contour)
bounding_ellipse = cv2.fitEllipse(contour)
center, radii, angle_degrees = bounding_ellipse
# Let's define an ellipse's normal orientation to be landscape (width > height).
# We must ensure that the ellipse's measurements match this orientation.
if radii[0] < radii[1]:
radii = (radii[1], radii[0])
angle_degrees -= 90.0
# We could use the angle to help determine the type of object.
# A bottle or can's angle is probably approximately a multiple of 90 degrees,
# assuming that it is at rest and not falling.
# Calculate the aspect ratio (width / height).
# For example, 0.5 means the object's height is 2 times its width.
# A bottle is probably taller than a can.
aspect_ratio = radii[0] / radii[1]
For checking transparency, you can compare the picture to a known background using histogram analysis or background subtraction.
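For the background-subtraction variant, a small OpenCV (C++) sketch could look like this (objectMask is assumed to be an 8-bit mask of the detected object, and the 25-level difference threshold is a guess): where the bottle is see-through, pixels stay close to the known background, so a large "still looks like background" fraction inside the object mask hints at transparency.
#include <opencv2/opencv.hpp>
// Fraction of the object's pixels that still look like the known background.
double transparentFraction(const cv::Mat& frame, const cv::Mat& background,
                           const cv::Mat& objectMask)
{
    cv::Mat diff, gray, similar, both;
    cv::absdiff(frame, background, diff);
    cv::cvtColor(diff, gray, cv::COLOR_BGR2GRAY);
    cv::threshold(gray, similar, 25, 255, cv::THRESH_BINARY_INV); // 255 where frame ~ background
    cv::bitwise_and(similar, objectMask, both);
    int objectPixels = cv::countNonZero(objectMask);
    if (objectPixels == 0) return 0.0;
    return static_cast<double>(cv::countNonZero(both)) / objectPixels;
}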
The contour's moments can be used to determine its centroid (center of gravity):
moments = cv2.moments(contour)
m00 = moments['m00']
m01 = moments['m01']
m10 = moments['m10']
centroid = (m10 / m00, m01 / m00)
You could compare this to the center. If the object is bigger ("heavier") on one end, the centroid will be closer to that end than the center is.
So, my main approach for detection was:
Bottles are transparent and cans are opaque
Generally algorithm consisted in:
Take a grayscale picture.
Apply a binary threshold.
Select a convenient ROI from it.
Obtain its color mean and possibly the standard deviation.
Distinguish.
Implementation was basically reduced to this function (where CAN and BOTTLE were previously defined):
int detector(int x, int y, int width, int height, int thresholdValue, CvCapture* capture) {
Mat img;
Rect r;
vector<Mat> channels;
r = Rect(x,y,width,height);
if ( !capture ) {
fprintf( stderr, "ERROR: capture is NULL \n" );
getchar();
return -1;
}
img = Mat(cvQueryFrame( capture ));
cvtColor(img,img,CV_RGB2GRAY);
threshold(img, img, 127, 255, THRESH_BINARY);
// ROI
Mat roiImage = img(r);
split(roiImage, channels);
Scalar m = mean(channels[0]);
float media = m[0];
printf("Media: %f\n", media);
if (media < thresholdValue) {
return CAN;
}
else {
return BOTTLE;
}
}
As can be seen, a THRESH_BINARY threshold was applied, and a plain white background was used. However, the main and critical issue I faced with this whole approach was luminosity changes in the environment, even minor ones.
Sometimes I noticed THRESH_BINARY_INV might help more, but I wonder whether I could use certain threshold parameters, or whether applying other filters may get rid of environment lighting as an issue.
I really appreciate the aspect-ratio calculation approach from the bounding box or from finding contours, but I found this one straightforward and simple once conditions were adjusted.
I'd use deep learning, based on Transfer learning.
The idea is this: given a highly complex, well-trained neural network that was trained on a similar classification task (typically over a large public dataset, like ImageNet), you can freeze the majority of its weights and only train the last layers. There are lots of tutorials out there. You don't need to have a background in deep learning.
There is a tutorial which works almost out of the box with TensorFlow here, and here there is another based on Keras.

Need to automatically eliminate noise in image and outer boundary of a object

I am a mechanical engineering student working on a project to automatically detect the weld seam (the seam is an edge that is to be welded) present in a workshop. This image gives the basic terminology involved in welding (http://i.imgur.com/Hfwjq0w.jpg).
To separate the weldment from the other objects, I have taken the background image and subtracted the foreground image containing the weldment to obtain only the weldment (http://i.imgur.com/v7yBWs1.jpg). After image subtraction, the shadow, glare and remnant noise of the subtracted background are still present.
As I want to automatically identify only the weld seam without the outer boundary of the weldment, I have tried to detect the edges in the weldment image using the Canny algorithm and to eliminate the isolated noise using the function bwareaopen. I have somehow obtained the approximate boundary of the weldment and the weld seam. The thresholds I have used are purely trial and error, as I don't know a way to automatically set a threshold to detect them.
The problem I am now facing is that I can't specify a definite threshold, as this algorithm should be able to identify the seam of any material regardless of its surface texture, glare and shadow. I need some assistance to remove the glare, shadow and isolated points from the background-subtracted image.
Also, I need help to get rid of the outer boundary and obtain only a smooth weld seam from starting point to end point.
I have tried to use the following code:
a=imread('imageofworkpiece.jpg'); %http://i.imgur.com/3ngu235.jpg
b=imread('background.jpg'); %http://i.imgur.com/DrF6wC2.jpg
Ip = imsubtract(b,a);
imshow(Ip) % weldment separated %http://i.imgur.com/v7yBWs1.jpg
BW = rgb2gray(Ip);
c=edge(BW,'canny',0.05); % by trial and error
figure;imshow(c) % %http://i.imgur.com/1UQ8E3D.jpg
bw = bwareaopen(c, 100); % by trial and error
figure;imshow(bw) %http://i.imgur.com/Gnjy2aS.jpg
Can anybody please suggest an adaptive way to set a threshold and remove the outer boundary to detect only the seam? Thank you.
Well, this doesn't solve your problem of finding an automatic thresholding algorithm, but I can help with isolating the seam. The seam is along the y axis (will this always be the case?), so I used the Hough transform to isolate only near-vertical lines. Normally it finds all lines, but I restricted the theta search parameter. The code I'm using now happens to highlight the longest line segment (I got it directly from the MATLAB website), and it is coincidentally the weld seam; this was purely coincidental. But using your bwareaopen'd image as input, the Hough line detector is able to find the seam. Of course it required a bit of playing around to work, so you are stuck with your original problem of finding optimal settings somehow.
Maybe this can be a springboard for someone else
a=imread('weldment.jpg'); %http://i.imgur.com/3ngu235.jpg
b=imread('weld_bg.jpg'); %http://i.imgur.com/DrF6wC2.jpg
Ip = imsubtract(b,a);
imshow(Ip) % weldment separated %http://i.imgur.com/v7yBWs1.jpg
BW = rgb2gray(Ip);
c=edge(BW,'canny',0.05); % by trial and error
bw = bwareaopen(c, 100); % by trial and error
figure(1);imshow(c) ;title('canny') % %http://i.imgur.com/1UQ8E3D.jpg
figure(2);imshow(bw);title('bw area open') %http://i.imgur.com/Gnjy2aS.jpg
[H,T,R] = hough(bw,'RhoResolution',1,'Theta',-15:5:15);
figure(3)
imshow(H,[],'XData',T,'YData',R,...
'InitialMagnification','fit');
xlabel('\theta'), ylabel('\rho');
axis on, axis normal, hold on;
P = houghpeaks(H,5,'threshold',ceil(0.5*max(H(:))));
x = T(P(:,2)); y = R(P(:,1));
plot(x,y,'s','color','white');
% Find lines and plot them
lines = houghlines(BW,T,R,P,'FillGap',2,'MinLength',30);
figure(4), imshow(BW), hold on
max_len = 0;
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',2,'Color','green');
% Plot beginnings and ends of lines
plot(xy(1,1),xy(1,2),'x','LineWidth',2,'Color','yellow');
plot(xy(2,1),xy(2,2),'x','LineWidth',2,'Color','red');
% Determine the endpoints of the longest line segment
len = norm(lines(k).point1 - lines(k).point2);
if ( len > max_len)
max_len = len;
xy_long = xy;
end
end
% highlight the longest line segment
plot(xy_long(:,1),xy_long(:,2),'LineWidth',2,'Color','blue');
From your image it looks like the weld seam will usually be very dark with a sharp intensity edge, so why don't you use that?
Do not use the background.
Create a derivative image:
dx[y][x]=pixel[y][x]-pixel[y][x-1]
Do this for the whole image (if done in place, then x must decrease in the loop!!!).
Filter out all derivatives lower than the threshold:
if (|dx[y][x]|<threshold) dx[y][x]=0; else pixel[y][x]=255; // or whatever values you use
How to obtain the threshold value?
Compute the min and max intensity and set the threshold to (max-min)*scale, where scale is a value lower than 1.0 (start with 0.02 or 0.1, for example).
Do this also for the y axis,
so compute dy[][]... and combine dx[][] and dy[][] together, either with OR or with AND logical functions.
Filter out artifacts.
You can use morphological filters or a smooth threshold for this. After all this you will have a mask of the weld-seam pixels.
If you need a bounding box, then just loop through all pixels and remember the min/max x,y coords ...
[Notes]
If your images have good lighting, then you can ignore the derivative and threshold the intensity directly with something like:
threshold = 0.5*(average_intensity+lowest_intensity)
If you want to really fully automate this, then you have to use adaptive thresholds. So try several thresholds in a loop and remember the result closest to the desired output based on geometry size, position, etc ...
[edit1] I finally have some time/mood for this, so:
Intensity image threshold
You provided just a single image, which is far from enough to make a reliable algorithm. This is the result:
As you can see, without further processing this is not a good approach.
Derivative image threshold
Threshold of the derivative by x (10%)
Threshold of the derivative by y (5%)
AND combination of both: 10% di/dx and 1.5% di/dy
The code in C++ looks like this (sorry, I do not use MATLAB):
int x,y,i,i0,i1,tr2,tr3;
pic1=pic0; // copy input image pic0 to pic1
pic2=pic0; // copy input image pic0 to pic2 (just to resize to desired size for derivation)
pic3=pic0; // copy input image pic0 to pic3 (just to resize to desired size for derivation)
pic1.rgb2i(); // RGB -> grayscale
// abs derivate by x
for (y=pic1.ys-1;y>0;y--)
for (x=pic1.xs-1;x>0;x--)
{
i0=pic1.p[y][x ].dd;
i1=pic1.p[y][x-1].dd;
i=i0-i1; if (i<0) i=-i;
pic2.p[y][x].dd=i;
}
// compute min,max derivation
i0=pic2.p[1][1].dd; i1=i0;
for (y=1;y<pic1.ys;y++)
for (x=1;x<pic1.xs;x++)
{
i=pic2.p[y][x].dd;
if (i0>i) i0=i;
if (i1<i) i1=i;
}
tr2=i0+((i1-i0)*100/1000);
// abs derivate by y
for (y=pic1.ys-1;y>0;y--)
for (x=pic1.xs-1;x>0;x--)
{
i0=pic1.p[y ][x].dd;
i1=pic1.p[y-1][x].dd;
i=i0-i1; if (i<0) i=-i;
pic3.p[y][x].dd=i;
}
// compute min,max derivation
i0=pic3.p[1][1].dd; i1=i0;
for (y=1;y<pic1.ys;y++)
for (x=1;x<pic1.xs;x++)
{
i=pic3.p[y][x].dd;
if (i0>i) i0=i;
if (i1<i) i1=i;
}
tr3=i0+((i1-i0)*15/1000);
// threshold the derivation images and combine them
for (y=1;y<pic1.ys;y++)
for (x=1;x<pic1.xs;x++)
{
// copy original (pic0) pixel for non thresholded areas the rest fill with green color
if ((pic2.p[y][x].dd>=tr2)&&(pic3.p[y][x].dd>=tr3)) i=0x00FF00;
else i=pic0.p[y][x].dd;
pic1.p[y][x].dd=i;
}
pic0 is the input image
pic1 is the output image
pic2, pic3 are just temporary storage for the derivatives
pic?.xs, pic?.ys is the size of pic?
pic?.p[y][x].dd is pixel access (dd means access the pixel as a DWORD ...)
As you can see there is a lot of stuff around (not visible in the first image you provided), so you need to process this further:
segment and separate ...,
use the Hough transform ...
filter out small artifacts ...
identify the object by expected geometry properties (aspect ratio, position, size)
Adaptive thresholds:
For this you need to know the desired output image properties (which is not possible to reliably deduce from a single input image), then create a function that does the above processing with variable tr2, tr3. Try more options of tr2, tr3 in a loop (loop through all values, or iterate toward better results) and remember the best output (so you also need some function that measures the quality of the output), for example:
quality=0.0; param=0.0;
for (a=0.2;a<=0.8;a+=0.1)
{
pic1=process_image(pic0,a);
q=detect_quality(pic1);
if (q>quality) { quality=q; param=a; pico=pic1; }
}
After this, pic1 should hold the relatively best thresholded image ... You should process all thresholds like this separately inside process_image; the targeted threshold must be scaled by a, for example tr2=i0+((i1-i0)*a);

Location of highest density on a sphere

I have a lot of points on the surface of the sphere.
How can I calculate the area/spot of the sphere that has the largest point density?
I need this to be done very fast. If this were a square, for example, I guess I could create a grid and then let the points vote for which part of the grid is best.
I have tried transforming the points to spherical coordinates and then making a grid, but this did not work well, since points around the north pole are close on the sphere but distant after the transform.
Thanks
There is in fact no real reason to partition the sphere into a regular non-overlapping mesh; try this:
Partition your sphere into semi-overlapping circles.
See here for generating uniformly distributed points (your circle centers):
Dispersing n points uniformly on a sphere
You can identify the points in each circle very quickly with a simple dot product. It really doesn't matter if some points are double counted; the circle with the most points still represents the highest density.
Mathematica implementation:
This takes 12 seconds to analyze 5000 points (and took about 10 minutes to write).
testcircles = { RandomReal[ {0, 1}, {3}] // Normalize};
Do[While[ (test = RandomReal[ {-1, 1}, {3}] // Normalize ;
Select[testcircles , #.test > .9 & , 1] ) == {} ];
AppendTo[testcircles, test];, {2000}];
vmax = testcircles[[First@
Ordering[-Table[
Count[ (testcircles[[i]].#) & /@ points , x_ /; x > .98 ] ,
{i, Length[testcircles]}], 1]]];
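For those not using Mathematica, roughly the same idea as a standalone C++ sketch (not a translation of the code above: cap centres are simply scattered at random here, and 0.98 mirrors the cap size used above):
#include <cmath>
#include <cstdlib>
#include <vector>
struct Vec3 { double x, y, z; };
static double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
// Crude uniform random direction via rejection sampling inside the unit ball.
static Vec3 randomUnit()
{
    while (true) {
        Vec3 v{ 2.0*rand()/RAND_MAX - 1.0, 2.0*rand()/RAND_MAX - 1.0, 2.0*rand()/RAND_MAX - 1.0 };
        double l2 = dot(v, v);
        if (l2 > 1e-12 && l2 <= 1.0) { double l = std::sqrt(l2); return { v.x/l, v.y/l, v.z/l }; }
    }
}
// Scatter nCaps random cap centres, count the points falling in each cap with one
// dot product per point (dot > cosRadius), and keep the fullest cap. Double
// counting between overlapping caps is fine, as noted above.
Vec3 densestCap(const std::vector<Vec3>& points, int nCaps = 2000, double cosRadius = 0.98)
{
    Vec3 best{0.0, 0.0, 1.0};
    int bestCount = -1;
    for (int i = 0; i < nCaps; ++i) {
        Vec3 c = randomUnit();
        int count = 0;
        for (const Vec3& p : points)
            if (dot(c, p) > cosRadius) ++count;
        if (count > bestCount) { bestCount = count; best = c; }
    }
    return best;
}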
To add some other, alternative schemes to the mix: it's possible to define a number of (almost) regular grids on sphere-like geometries by refining an inscribed polyhedron.
The first option is called an icosahedral grid, which is a triangulation of the spherical surface. By joining the centres of the triangles about each vertex, you can also create a dual hexagonal grid based on the underlying triangulation:
Another option, if you dislike triangles (and/or hexagons) is the cubed-sphere grid, formed by subdividing the faces of an inscribed cube and projecting the result onto the spherical surface:
In either case, the important point is that the resulting grids are almost regular -- so to evaluate the region of highest density on the sphere you can simply perform a histogram-style analysis, counting the number of samples per grid cell.
As a number of commenters have pointed out, to account for the slight irregularity in the grid it's possible to normalise the histogram counts by dividing through by the area of each grid cell. The resulting density is then given as a "per unit area" measure. To calculate the area of each grid cell there are two options: (i) you could calculate the "flat" area of each cell, by assuming that the edges are straight lines -- such an approximation is probably pretty good when the grid is sufficiently dense, or (ii) you can calculate the "true" surface areas by evaluating the necessary surface integrals.
If you are interested in performing the requisite "point-in-cell" queries efficiently, one approach is to construct the grid as a quadtree -- starting with a coarse inscribed polyhedron and refining its faces into a tree of sub-faces. To locate the enclosing cell you can simply traverse the tree from the root, which is typically an O(log(n)) operation.
You can get some additional information regarding these grid types here.
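As a small illustration of the cubed-sphere option, here is a sketch (my own, not from the linked material) that assigns a unit vector to one of the six faces and an n x n cell via the gnomonic (central) projection; counting samples per (face, cell) then gives the histogram:
#include <cmath>
struct CubeCell { int face; int u; int v; };
// Pick the dominant axis as the face, project onto that face, and quantize the
// face coordinates (each in [-1,1]) into an n x n grid.
CubeCell cubedSphereCell(double x, double y, double z, int n)
{
    double ax = std::fabs(x), ay = std::fabs(y), az = std::fabs(z);
    int face; double s, t;
    if (ax >= ay && ax >= az) { face = (x > 0) ? 0 : 1; s = y / ax; t = z / ax; }
    else if (ay >= az)        { face = (y > 0) ? 2 : 3; s = x / ay; t = z / ay; }
    else                      { face = (z > 0) ? 4 : 5; s = x / az; t = y / az; }
    int u = (int)((s * 0.5 + 0.5) * n); if (u == n) u = n - 1;
    int v = (int)((t * 0.5 + 0.5) * n); if (v == n) v = n - 1;
    return { face, u, v };
}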
Treating points on a sphere as 3D points might not be so bad.
Try either:
Select k, do an approximate k-NN search in 3D for each point in the data or each selected point of interest, then weight the results by their distance to the query point. Complexity may vary for different approximate k-NN algorithms.
Build a space-partitioning data structure like a k-d tree, then do an approximate (or exact) range-counting query with a ball range centered at each point in the data or each selected point of interest. The complexity is O(log(n) + epsilon^(-3)) or O(epsilon^(-3)*log(n)) per approximate range query with state-of-the-art algorithms, where epsilon is the range error threshold w.r.t. the size of the query ball. For exact range queries, the complexity is O(n^(2/3)) per query.
Partition the sphere into equal-area regions (bounded by parallels and meridians) as described in my answer there and count the points in each region.
The aspect ratio of the regions will not be uniform (the equatorial regions will be more "squarish" when N~M, while the polar regions will be more elongated).
This is not a problem because the diameters of the regions go to 0 as N and M increase.
The computational simplicity of this method trumps the better uniformity of domains in the other excellent answers which contain beautiful pictures.
One simple modification would be to add two "polar cap" regions to the N*M regions described in the linked answer to improve the numeric stability (when the point is very close to a pole, its longitude is not well defined). This way the aspect ratio of the regions is bounded.
You can use the Peters projection, which preserves the areas.
This will allow you to efficiently count the points in a grid, but also in a sliding window (box Parzen window) by using the integral image trick.
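A sketch of that idea (my own names and grid sizes): bin the points by longitude and sin(latitude), which is the equal-area cylindrical mapping underlying the Peters projection, then build a summed-area table so any rectangular window count costs four lookups. Windows that wrap around in longitude need two queries, and a rectangle in this space is not a spherical cap, but its area is consistent.
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>
// Equal-area binning plus a summed-area table for O(1) window counts.
struct DensityGrid {
    int W, H;
    std::vector<long long> sat;   // (W+1) x (H+1) summed-area table
    DensityGrid(const std::vector<std::pair<double,double>>& lonLatRad, int w, int h)
        : W(w), H(h), sat((w + 1) * (h + 1), 0)
    {
        std::vector<long long> count(w * h, 0);
        for (const auto& p : lonLatRad) {
            int i = std::min(w - 1, (int)((p.first + M_PI) / (2.0 * M_PI) * w));   // longitude bin
            int j = std::min(h - 1, (int)((std::sin(p.second) + 1.0) * 0.5 * h));  // sin(latitude) bin -> equal area
            ++count[j * w + i];
        }
        for (int j = 0; j < h; ++j)
            for (int i = 0; i < w; ++i)
                sat[(j + 1) * (W + 1) + (i + 1)] = count[j * w + i]
                    + sat[j * (W + 1) + (i + 1)] + sat[(j + 1) * (W + 1) + i] - sat[j * (W + 1) + i];
    }
    // number of points in the cell rectangle [i0,i1) x [j0,j1)
    long long windowCount(int i0, int j0, int i1, int j1) const {
        return sat[j1 * (W + 1) + i1] - sat[j0 * (W + 1) + i1]
             - sat[j1 * (W + 1) + i0] + sat[j0 * (W + 1) + i0];
    }
};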
If I understand correctly, you are trying to find the densest point on the sphere.
If the points are denser around some location:
Consider Cartesian coordinates and find the mean X,Y,Z of the points.
Find the closest point on the sphere to that mean X,Y,Z (you may consider using spherical coordinates; just extend the radius to the original radius).
Constraints
If the distance between the mean X,Y,Z and the center is less than r/2, then this algorithm may not work as desired.
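A minimal sketch of that procedure in C++, with the r/2 caveat built in as a boolean return:
#include <cmath>
#include <vector>
struct Vec3 { double x, y, z; };
// Average the Cartesian coordinates and push the mean back out to the sphere.
// A short mean vector means the points are spread out (or form several clusters),
// so the result is only reported when the mean is at least r/2 from the centre.
bool densestDirection(const std::vector<Vec3>& points, double radius, Vec3& out)
{
    if (points.empty()) return false;
    Vec3 m{0.0, 0.0, 0.0};
    for (const Vec3& p : points) { m.x += p.x; m.y += p.y; m.z += p.z; }
    m.x /= points.size(); m.y /= points.size(); m.z /= points.size();
    double len = std::sqrt(m.x*m.x + m.y*m.y + m.z*m.z);
    if (len < 0.5 * radius) return false;                 // the r/2 caveat above
    out = { m.x / len * radius, m.y / len * radius, m.z / len * radius };
    return true;
}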
I am not a master of mathematics, but maybe it can be solved analytically, something like:
1. Sort the coordinates.
2. R = (Σ_{n=0}^{n_max} Σ_{m=0}^{n} (1/A^{diff_in_consecutive}) · angle) / Σ angle
where A may be any constant.
This is really just an inverse of this answer of mine:
Just invert the equations that map equidistant sphere-surface vertices to a surface cell index. Don't even try to visualize the cells as anything other than circles or you'll go mad. But if someone actually does it, then please post the result here (and let me know).
Now just create a 2D cell map and do the density computation in O(N) (the way histograms are computed), similar to what Darren Engwirda proposes in his answer.
This is how the code looks in C++:
//---------------------------------------------------------------------------
const int na=16; // sphere slices
int nb[na]; // cells per slice
const int na2=na<<1;
int map[na][na2]; // surface cells
const double da=M_PI/double(na-1); // latitude angle step
double db[na]; // longitude angle step per slice
// spherical -> orthonormal
void abr2xyz(double &x,double &y,double &z,double a,double b,double R)
{
double r;
r=R*cos(a);
z=R*sin(a);
y=r*sin(b);
x=r*cos(b);
}
// spherical -> surface cell
void ab2ij(int &i,int &j,double a,double b)
{
i=double(((a+(0.5*M_PI))/da)+0.5);
if (i>=na) i=na-1;
if (i< 0) i=0;
j=double(( b /db[i])+0.5);
while (j< 0) j+=nb[i];
while (j>=nb[i]) j-=nb[i];
}
// spherical <- surface cell
void ij2ab(double &a,double &b,int i,int j)
{
if (i>=na) i=na-1;
if (i< 0) i=0;
a=-(0.5*M_PI)+(double(i)*da);
b= double(j)*db[i];
}
// init variables and clear map
void ij_init()
{
int i,j;
double a;
for (a=-0.5*M_PI,i=0;i<na;i++,a+=da)
{
nb[i]=ceil(2.0*M_PI*cos(a)/da); // compute actual circle cell count
if (nb[i]<=0) nb[i]=1;
db[i]=2.0*M_PI/double(nb[i]); // longitude angle step
if ((i==0)||(i==na-1)) { nb[i]=1; db[i]=1.0; }
for (j=0;j<na2;j++) map[i][j]=0; // clear cell map
}
}
//---------------------------------------------------------------------------
// this just draws circle from point x0,y0,z0 with normal nx,ny,nz and radius r
// needs some vector stuff of mine, so I did not copy the body here (it is not important)
void glCircle3D(double x0,double y0,double z0,double nx,double ny,double nz,double r,bool _fill);
//---------------------------------------------------------------------------
void analyse()
{
// n is number of points and r is just visual radius of sphere for rendering
int i,j,ii,jj,n=1000;
double x,y,z,a,b,c,cm=1.0/10.0,r=1.0;
// init
ij_init(); // init variables and map[][]
RandSeed=10; // just to have the same random points generated every frame (do not need to store them)
// generate draw and process some random surface points
for (i=0;i<n;i++)
{
a=M_PI*(Random()-0.5);
b=M_PI* Random()*2.0 ;
ab2ij(ii,jj,a,b); // cell coords
abr2xyz(x,y,z,a,b,r); // 3D orthonormal coords
map[ii][jj]++; // update cell density
// this just draw the point (x,y,z) as line in OpenGL so you can ignore this
double w=1.1; // w-1.0 is rendered line size factor
glBegin(GL_LINES);
glColor3f(1.0,1.0,1.0); glVertex3d(x,y,z);
glColor3f(0.0,0.0,0.0); glVertex3d(w*x,w*y,w*z);
glEnd();
}
// draw cell grid (color is function of density)
for (i=0;i<na;i++)
for (j=0;j<nb[i];j++)
{
ij2ab(a,b,i,j); abr2xyz(x,y,z,a,b,r);
c=map[i][j]; c=0.1+(c*cm); if (c>1.0) c=1.0;
glColor3f(0.2,0.2,0.2); glCircle3D(x,y,z,x,y,z,0.45*da,0); // outline
glColor3f(0.1,0.1,c ); glCircle3D(x,y,z,x,y,z,0.45*da,1); // filled by bluish color the more dense the cell the more bright it is
}
}
//---------------------------------------------------------------------------
The result looks like this:
So now just look at what is in the map[][] array: you can find the global/local min/max of density or whatever you need... Just do not forget that the valid size is map[na][nb[i]], where i is the first index into the array. The grid size is controlled by the na constant, and cm is just a density-to-color scale ...
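For example, finding the globally densest cell is just a scan over the valid part of the array, using the same na, nb[], map[][] and ij2ab() as above:
// find the densest cell in map[][] and convert it back to spherical coordinates
int bi=0,bj=0;
for (int i=0;i<na;i++)
 for (int j=0;j<nb[i];j++)
  if (map[i][j]>map[bi][bj]) { bi=i; bj=j; }
double a,b;
ij2ab(a,b,bi,bj); // a,b = latitude,longitude of the densest cell's centre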
[edit1] I got the quad grid working, which is a far more accurate representation of the mapping used.
This is with na=16; the worst rounding errors are at the poles. If you want to be precise, you can weight the density by the cell surface area. All non-pole cells are simple quads; the poles are triangle fans (regular polygons).
This is the grid draw code:
// draw cell quad grid (color is function of density)
int i,j,ii,jj;
double x,y,z,a,b,c,cm=1.0/10.0,mm=0.49,r=1.0;
double dx=mm*da,dy;
for (i=1;i<na-1;i++) // ignore poles
for (j=0;j<nb[i];j++)
{
dy=mm*db[i];
ij2ab(a,b,i,j);
c=map[i][j]; c=0.1+(c*cm); if (c>1.0) c=1.0;
glColor3f(0.2,0.2,0.2);
glBegin(GL_LINE_LOOP);
abr2xyz(x,y,z,a-dx,b-dy,r); glVertex3d(x,y,z);
abr2xyz(x,y,z,a-dx,b+dy,r); glVertex3d(x,y,z);
abr2xyz(x,y,z,a+dx,b+dy,r); glVertex3d(x,y,z);
abr2xyz(x,y,z,a+dx,b-dy,r); glVertex3d(x,y,z);
glEnd();
glColor3f(0.1,0.1,c );
glBegin(GL_QUADS);
abr2xyz(x,y,z,a-dx,b-dy,r); glVertex3d(x,y,z);
abr2xyz(x,y,z,a-dx,b+dy,r); glVertex3d(x,y,z);
abr2xyz(x,y,z,a+dx,b+dy,r); glVertex3d(x,y,z);
abr2xyz(x,y,z,a+dx,b-dy,r); glVertex3d(x,y,z);
glEnd();
}
i=0; j=0; ii=i+1; dy=mm*db[ii];
ij2ab(a,b,i,j); c=map[i][j]; c=0.1+(c*cm); if (c>1.0) c=1.0;
glColor3f(0.2,0.2,0.2);
glBegin(GL_LINE_LOOP);
for (j=0;j<nb[ii];j++) { ij2ab(a,b,ii,j); abr2xyz(x,y,z,a-dx,b-dy,r); glVertex3d(x,y,z); }
glEnd();
glColor3f(0.1,0.1,c );
glBegin(GL_TRIANGLE_FAN); abr2xyz(x,y,z,a ,b ,r); glVertex3d(x,y,z);
for (j=0;j<nb[ii];j++) { ij2ab(a,b,ii,j); abr2xyz(x,y,z,a-dx,b-dy,r); glVertex3d(x,y,z); }
glEnd();
i=na-1; j=0; ii=i-1; dy=mm*db[ii];
ij2ab(a,b,i,j); c=map[i][j]; c=0.1+(c*cm); if (c>1.0) c=1.0;
glColor3f(0.2,0.2,0.2);
glBegin(GL_LINE_LOOP);
for (j=0;j<nb[ii];j++) { ij2ab(a,b,ii,j); abr2xyz(x,y,z,a-dx,b+dy,r); glVertex3d(x,y,z); }
glEnd();
glColor3f(0.1,0.1,c );
glBegin(GL_TRIANGLE_FAN); abr2xyz(x,y,z,a ,b ,r); glVertex3d(x,y,z);
for (j=0;j<nb[ii];j++) { ij2ab(a,b,ii,j); abr2xyz(x,y,z,a-dx,b+dy,r); glVertex3d(x,y,z); }
glEnd();
The mm is the grid cell size: mm=0.5 is the full cell size, and smaller values create a space between the cells.
If you want a radial region of the greatest density, this is the robust disk covering problem with k = 1 and dist(a, b) = great circle distance (a, b) (see https://en.wikipedia.org/wiki/Great-circle_distance)
https://www4.comp.polyu.edu.hk/~csbxiao/paper/2003%20and%20before/PDCS2003.pdf
Consider using a geographic method to solve this. GIS tools, geography data types in SQL, etc. all handle curvature of a spheroid. You might have to find a coordinate system that uses a pure sphere instead of an earthlike spheroid if you are not actually modelling something on Earth.
For speed, if you have large numbers of points and want their densest location, a raster heatmap type solution might work well. You could create low-resolution rasters, then zoom into areas of high density and create higher-resolution rasters only for the cells you care about.

How to obtain the bounding ellipse of a target connect component

Suppose we have a connected component in the image, as the following image illustrates: http://dl.dropbox.com/u/92688392/ellipse.jpg.
My question is how I can calculate the bounding ellipse of the connected component (the red ellipse in the image). I have checked the MATLAB function regionprops and understand how MATLAB does it. I also notice that OpenCV has a similar function for this, CBlob::GetEllipse(). However, although I understand how they obtain the result from reading the code, the fundamental theory behind it is still unclear to me. I am therefore wondering whether there are standard algorithms to do the job. Thanks!
EDIT:
Based on the comments, I reorganized my question: in the Wikipedia article on image moments, the formula for the angle of the longest axis is θ = ½·arctan(2μ′₁₁ / (μ′₂₀ − μ′₀₂)).
However, in the MATLAB function regionprops, the code is as follows:
% Calculate orientation.
if (uyy > uxx)
num = uyy - uxx + sqrt((uyy - uxx)^2 + 4*uxy^2);
den = 2*uxy;
else
num = 2*uxy;
den = uxx - uyy + sqrt((uxx - uyy)^2 + 4*uxy^2);
end
This implementation is inconsistent with the formula in Wikipedia. I was wondering which one is correct.
If you're looking for an OpenCV implementation, then I can give you one. The algorithm is the following:
Convert the image to 1-bit (b&w)
Find all contours
Create a single contour that contains all points from the found contours
Calculate the convex hull of this contour
Find the rotated ellipse (rectangle) of minimal area that contains the contour calculated in the previous step
Here's code:
Mat src = imread("ellipse.jpg"), tmp;
vector<Vec4i> hierarchy;
vector<vector<Point> > contours;
vector<Point> bigContour, hull;
RotatedRect ell;
//step 1
cvtColor(src, tmp, CV_BGR2GRAY);
threshold(tmp, tmp, 100, 255, THRESH_BINARY);
//step 2
findContours(tmp, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
//step 3
for (size_t i=0; i<contours.size(); i++)
{
for (size_t j=0; j<contours[i].size(); j++)
{
bigContour.push_back(contours[i][j]);
}
}
//step 4
convexHull(bigContour, hull);
//step 5
ell = fitEllipse(hull);
//drawing result
ellipse(src, ell, Scalar(0,0,255), 2);
imshow("result", src);
waitKey();
This is the input:
And here's a result:
I was trying to find out what the algorithm behind it is as well, so I could write my own implementation. I found it in a MathWorks blog post. In one of the comments, the author says:
regionprops calculates the 2nd-order moments of the object in question and then returns measurements of the ellipse with the same 2nd-order moments.
and later:
The equations used are from Haralick and Shapiro, Computer and Robot Vision vol. 1, Appendix A, Addison-Wesley 1992. I did a sanity check by constructing an image containing an ellipse with major axis length = 100 and minor axis length = 50, and regionprops returned the correct measurements.
I don't have that book, but it seems I'll need to get a copy of it.
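For what it's worth, the two expressions should agree: both describe the major-axis direction of the second-central-moment (covariance) matrix, and the regionprops branches are just the eigenvector of its largest eigenvalue. A sketch of the algebra (mine, not from the book):
C = \begin{pmatrix} u_{xx} & u_{xy} \\ u_{xy} & u_{yy} \end{pmatrix}, \qquad
\lambda_1 = \frac{u_{xx}+u_{yy}}{2} + \sqrt{\left(\frac{u_{xx}-u_{yy}}{2}\right)^2 + u_{xy}^2}
\tan\theta = \frac{\lambda_1 - u_{xx}}{u_{xy}}
           = \frac{u_{yy}-u_{xx}+\sqrt{(u_{yy}-u_{xx})^2+4u_{xy}^2}}{2\,u_{xy}}
           = \frac{2\,u_{xy}}{u_{xx}-u_{yy}+\sqrt{(u_{xx}-u_{yy})^2+4\,u_{xy}^2}}
The last two forms are exactly the num/den branches in the regionprops snippet (each branch keeps its denominator well away from zero), and applying the double-angle identity \tan 2\theta = 2\tan\theta/(1-\tan^2\theta) collapses them to the Wikipedia form \theta = \tfrac{1}{2}\arctan\bigl(2\mu'_{11}/(\mu'_{20}-\mu'_{02})\bigr), so the two only differ in how the branch/quadrant is resolved numerically.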
I'm not sure how MATLAB or OpenCV calculates the ellipse. But if you are interested in the math behind it, there is a very nice optimization approach called the Löwner-John ellipsoid. You can find more information about this method in the Stanford Convex Optimization course. I hope it helps...
