Path Tracing algorithm - Need help understanding key point

So the Wikipedia page for path tracing (http://en.wikipedia.org/wiki/Path_tracing) contains a naive implementation of the algorithm with the following explanation underneath:
"All these samples must then be averaged to obtain the output color. Note this method of always sampling a random ray in the normal's hemisphere only works well for perfectly diffuse surfaces. For other materials, one generally has to use importance-sampling, i.e. probabilistically select a new ray according to the BRDF's distribution. For instance, a perfectly specular (mirror) material would not work with the method above, as the probability of the new ray being the correct reflected ray - which is the only ray through which any radiance will be reflected - is zero. In these situations, one must divide the reflectance by the probability density function of the sampling scheme, as per Monte-Carlo integration (in the naive case above, there is no particular sampling scheme, so the PDF turns out to be 1)."
The part I'm having trouble understanding is the last bit (dividing the reflectance by the probability density function of the sampling scheme). I am familiar with PDFs, but I am not quite sure how they fit in here. If we stick to the mirror example, what would be the PDF value we divide by? Why? How would I go about finding the PDF value to divide by if I was using an arbitrary BRDF such as the Phong reflection model or the Cook-Torrance reflection model, etc.? Lastly, why do we divide by the PDF instead of multiplying? If we divide, don't we give more weight to a direction with a lower probability?

Let's assume that we have only materials without color (greyscale). Then, their BRDF at each point can be expressed as a single-valued function
float BRDF(phi_in, theta_in, phi_out, theta_out, pointWhereObjWasHit);
Here, phi and theta are the azimuth and zenith angles of the two rays under consideration. For pure Lambertian reflection, this function would look like this:
float lambertBRDF(phi_in, theta_in, phi_out, theta_out, pointWhereObjWasHit)
{
    return albedo * 1/pi * cos(theta_out);
}
albedo ranges from 0 to 1 - it measures how much of the incoming light is re-emitted. The factor 1/pi ensures that the integral of this BRDF over all outgoing directions does not exceed 1 (it equals the albedo). With the naive approach of the Wikipedia article (http://en.wikipedia.org/wiki/Path_tracing), one can use this BRDF as follows:
Color TracePath(Ray r, depth) {
    /* .... */
    Ray newRay;
    newRay.origin = r.pointWhereObjWasHit;
    newRay.direction = RandomUnitVectorInHemisphereOf(normal(r.pointWhereObjWasHit));
    Color reflected = TracePath(newRay, depth + 1);
    return emittance + reflected*lambertBRDF(r.phi, r.theta, newRay.phi, newRay.theta, r.pointWhereObjWasHit);
}
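To make the divide-by-the-PDF rule concrete before going further, here is a small self-contained numerical check (a sketch in Python/NumPy; the albedo value and variable names are made up, not from the article). It integrates the Lambert term albedo/pi * cos(theta) over the hemisphere using uniform direction sampling, whose PDF with respect to solid angle is the constant 1/(2*pi), and the average of sample/PDF comes out as the albedo:

import numpy as np

rng = np.random.default_rng(0)
albedo = 0.8          # made-up value
N = 200_000

# Uniform random directions in the hemisphere around the normal:
# for these, cos(theta) is uniform in [0, 1] and the PDF with respect
# to solid angle is the constant 1 / (2*pi).
cos_theta = rng.random(N)
f = albedo / np.pi * cos_theta        # value of the Lambert term for each sample
pdf = 1.0 / (2.0 * np.pi)

print(np.mean(f / pdf))               # ~0.8: the hemisphere integral equals the albedo

The estimator is always "average of sample value divided by the PDF of drawing that sample"; the importance-sampled version discussed next only changes which PDF is used.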
As mentioned in the article and by Ross, this random sampling is unfortunate because it traces incoming directions (newRays) from which little light is reflected with the same probability as directions from which there is lots of light. Instead, directions from which much light is reflected toward the observer should be selected preferentially, so that each direction's sample rate is proportional to its contribution to the final color. For that, one needs a way to generate random rays from a probability distribution. Let's say there exists a function that can do that; it takes as input the desired PDF (which, ideally, should be equal to the BRDF) and the incoming ray:
vector RandomVectorWithPDF(function PDF(p_i, t_i, p_o, t_o, point x), Ray incoming)
{
    // This function is responsible for creating random rays emanating from x
    // with the probability distribution PDF. Depending on the complexity of PDF,
    // this might be somewhat involved. It is possible, however, to do it for Lambertian
    // reflection (how exactly is math, not programming):
    vector randomVector;
    if(PDF == lambertBRDF)
    {
        float phi = uniformRandomNumber(0, 2*pi);
        float rho = acos(sqrt(uniformRandomNumber(0, 1))); // zenith angle, measured from the normal
        randomVector = getVectorFromAzimuthZenithAndNormal(phi, rho, normal(incoming.pointWhereObjWasHit));
    }
    else
    {
        // deal with other PDFs
    }
    return randomVector;
}
The code in the TracePath routine would then simply look like this:
newRay.direction = RandomVectorWithPDF(lambertBRDF, r);
Color reflected = TracePath(newRay, depth + 1);
return emittance + reflected;
Because the bright directions are preferred in the choice of samples, you do not have to weight them again by applying the BRDF as a scaling factor to reflected. However, if PDF and BRDF are different for some reason, you would have to scale down the output whenever PDF > BRDF (if you picked too many samples from the respective direction) and enhance it when you picked too few.
In code:
newRay.direction = RandomVectorWithPDF(PDF, r);
Color reflected = TracePath(newRay, depth + 1);
return emittance + reflected*BRDF(...)/PDF(...);
The output is best (least noisy), however, if BRDF/PDF is equal to 1.
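As a quick numerical illustration of why BRDF/PDF close to 1 is desirable, here is a Python sketch with a made-up albedo, reducing the hemisphere integral to its cos(theta) dependence. Both estimators return the same value on average, but the cosine-weighted one, which reuses the acos(sqrt(u)) formula from RandomVectorWithPDF, has essentially no variance because BRDF/PDF is constant:

import numpy as np

rng = np.random.default_rng(1)
albedo = 0.8
N = 50_000

def estimate_uniform():
    cos_t = rng.random(N)                       # uniform hemisphere: cos(theta) ~ U(0, 1)
    f = albedo / np.pi * cos_t
    pdf = 1.0 / (2.0 * np.pi)
    return f / pdf                              # per-sample estimates

def estimate_cosine_weighted():
    theta = np.arccos(np.sqrt(rng.random(N)))   # same formula as in RandomVectorWithPDF
    cos_t = np.cos(theta)
    f = albedo / np.pi * cos_t
    pdf = cos_t / np.pi                         # cosine-weighted PDF over the hemisphere
    return f / pdf                              # BRDF/PDF is constant: every sample equals albedo

u = estimate_uniform()
c = estimate_cosine_weighted()
print(u.mean(), u.std())   # ~0.8, noticeable spread
print(c.mean(), c.std())   # 0.8, essentially zero spread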
The question remains: why can't one always choose the perfect PDF, one that is exactly equal to the BRDF? First, some random distributions are harder to generate samples from than others. For example, if there were a slight variation in the albedo parameter, the algorithm would still do much better with the non-naive sampling than with uniform sampling, but the correction term BRDF/PDF would be needed to account for the slight variations. Sometimes it might even be impossible to do at all. Imagine a colored object with different reflective behavior for red, green and blue - you could either render in three passes, one for each color, or use an average PDF that fits all color components approximately, but none perfectly.
How would one go about implementing something like Phong shading? For simplicity, I still assume that there is only one color component, and that the ratio of diffuse to specular reflection is 60% / 40% (the notion of ambient light makes no sense in path tracing). Then my code would look like this:
if(uniformRandomNumber(0,1) < 0.6) // diffuse reflection
{
    newRay.direction = RandomVectorWithPDF(lambertBRDF, r);
    reflected = TracePath(newRay, depth+1)/0.6;
}
else // specular reflection
{
    newRay.direction = RandomVectorWithPDF(specularPDF, r);
    reflected = TracePath(newRay, depth+1)*specularBRDF/specularPDF/0.4;
}
return emittance + reflected;
Here specularPDF is a distribution with a narrow peak around the reflected ray (theta_in = theta_out, phi_in = phi_out + pi) for which a way to create random vectors is available, and specularBRDF returns the specular intensity from Phong's model (http://en.wikipedia.org/wiki/Phong_reflection_model).
Note how the PDFs are effectively scaled by the branch probabilities 0.6 and 0.4, which is why each branch's contribution is divided by the corresponding factor.
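To see why dividing by the branch probabilities 0.6 and 0.4 keeps the estimate unbiased, here is a one-dimensional toy version of the same branching scheme (a Python sketch; f_d and f_s are made-up stand-ins for the diffuse and specular lobes, not the real Phong terms):

import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# Toy 1-D stand-ins for the two lobes:
# "diffuse" part f_d(x) = 2x and "specular" part f_s(x) = 4x^3 on [0, 1].
# The exact integral of (f_d + f_s) over [0, 1] is 1 + 1 = 2.
p_diffuse = 0.6

estimates = np.empty(N)
for i in range(N):
    if rng.random() < p_diffuse:
        # sample x from pdf_d(x) = 2x (matches f_d exactly)
        x = np.sqrt(rng.random())
        f, pdf = 2 * x, 2 * x
        estimates[i] = f / pdf / p_diffuse          # note the division by 0.6
    else:
        # sample x from pdf_s(x) = 3x^2 (only roughly matches f_s)
        x = rng.random() ** (1 / 3)
        f, pdf = 4 * x**3, 3 * x**2
        estimates[i] = f / pdf / (1 - p_diffuse)    # note the division by 0.4

print(estimates.mean())   # ~2.0: unbiased despite the random branch choice

The same bookkeeping carries over to the Phong case above: each branch divides by its own PDF and by the probability of having chosen that branch.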

I'm by no means an expert in ray tracing, but this seems to be classic Monte Carlo: you have lots of possible rays, you choose one uniformly at random, and then you average over lots of trials. Because the distribution used to choose the rays was uniform (they were all equally likely), you don't have to do any clever re-normalising.
However, perhaps there are lots of possible rays to choose from, but only a few of them lead to useful results. We therefore bias towards picking those 'useful' possibilities with higher probability, and then re-normalise (we are no longer choosing the rays uniformly, so we can't just take the average). This is importance sampling.
The mirror example seems to be the following: only one possible ray gives a useful result. If we choose a ray at random, the probability that we hit that useful ray is zero. This is a property of probability on continuous spaces (it's not actually continuous, it's implicitly discretised by your computer, so it's not quite true...): the probability of hitting one specific element out of infinitely many must be zero. Thus we would be re-normalising by something with probability zero - standard conditional probability definitions break down for events with probability zero, and that is where the problem comes from.

Related

How do I synchronize scale and position of map and point layers in d3.js?

I've seen many example maps in d3 where points added to a map automatically align as expected, but in code I've adapted from http://bl.ocks.org/bycoffe/3230965 the points I've added do not line up with the map below.
Example here: https://naltmann.github.io/d3-geo-collision/
(the points should match up with some major US cities)
I'm pretty sure the difference is due to the code around scale/range, but I don't know how to unify them between the map and points.
Aligning geographic features geographically with your example will be challenging - first you are projecting points and then scaling x,y:
node.cx = xScale(projection(node.coordinates)[0]);
node.cy = yScale(projection(node.coordinates)[1]);
The ranges for the scales are interesting in that both limits of both ranges are negative; this might be an attempt to rectify the positioning of the points, given the cumulative nature of the forces applied to them:
.on('tick', function(e) {
  k = 10 * e.alpha;
  for (i = 0; i < nodes.length; i++) {
    nodes[i].x += k * nodes[i].cx;
    nodes[i].y += k * nodes[i].cy;
  }
})
This is problematic: if we remove the scales, the points move farther and farther right and down. The cumulative nature of the updates means that with each tick the points drift further from recognizable geographic coordinates. That is fine when all of the geographic data undergoes the same transformation, but it is hard to manage when the background map doesn't undergo the same transformation.
I'll note that if you want a map width of 1800 and a height of 900, you should set the mercator projection's translate to [1800/2,900/2] and the scale to something like 1800/Math.PI/2
The disconnection between geographic coordinates and force coordinates appears to be very difficult to rectify. Any solution for this particular layout and dimensions is likely to fail on different layouts and dimensions.
Instead I'd suggest attempting to use only a projection to place coordinates and not cumulatively adding force changes to each point. This is the short answer to your question.
For a longer answer, my first thought was to get rid of the collision function and use an anchor point linked to a floating point for each city, only drawing the floating point (using link distance to keep them close). This is likely a cleaner solution, but one that is unfortunately completely different than what you've attempted.
However, my second thoughts were more towards keeping your example, but removing the scales (and the cumulative forces) and reducing the forces to zero so that the collision function can work without interference. Based on those thoughts, here's a demonstration of a possible solution.

MATLAB image processing technique

I have a 3D array in MATLAB (V: vertical, H: horizontal, t: time frame). The figures below are images obtained with the imagesc function after slicing the array along the t axis. The area in black represents the damaged area; the other area is intact. Each frame looks similar but has a different amplitude. I am trying to visualize only the defect area and get rid of the intact area. I tried a 'threshold' method to remove the intact area, as below:
NewSet = zeros(450, 450, 200);
for kk = 1:200
    frame = uwpi(:,:,kk);
    STD = std(frame(:));
    Mean = mean(frame(:));
    for ii = 1:450
        for jj = 1:450
            if frame(ii, jj) > 2*STD + Mean
                NewSet(ii, jj, kk) = frame(ii, jj);
            else
                NewSet(ii, jj, kk) = NaN;
            end
        end
    end
end
However, since each frame has a different amplitude, the result becomes:
Is there any image processing method to get rid of the intact area in this case?
Thanks in advance
You're thresholding based on the mean and standard deviation, basically assuming your data is normally distributed and looking for outliers. But your model should try to distinguish values around zero (noise) from higher values. Your data is not normally distributed, so the mean and standard deviation are not meaningful here.
Look up Otsu thresholding (the MATLAB Image Processing Toolbox has it). Its model does not perfectly match your data, but it might give reasonable results. Like most threshold estimation algorithms, it uses the image's histogram to determine the optimal threshold given some model.
Ideally you'd model the background peak in the histogram. You can find the mode, fit a Gaussian around it, then cut off at 2 sigma. Or you can use the "triangle method", which finds the point along the histogram that is furthest from the line between the upper end of the histogram and the top of the background peak. A little more complex to explain, but trivial to implement. We have this implemented in DIPimage (http://www.diplib.org); the M-file code is visible, so you can see how it works (look for the function threshold).
Additionally, I'd suggest getting rid of the loops over x and y. You can write frame(frame < threshold) = NaN, and then copy the whole frame back into NewSet in one operation.
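For illustration, the per-frame Otsu threshold plus the vectorized masking could look roughly like this in Python/NumPy with scikit-image (a sketch only: uwpi stands in for the array from the question and is filled with random data here; in MATLAB, graythresh computes the same Otsu threshold):

import numpy as np
from skimage.filters import threshold_otsu

# stand-in for the 450x450x200 array from the question
uwpi = np.random.rand(450, 450, 200)

new_set = np.full_like(uwpi, np.nan)
for kk in range(uwpi.shape[2]):
    frame = uwpi[:, :, kk]
    t = threshold_otsu(frame)          # per-frame threshold estimated from the histogram
    mask = frame > t                   # same sense as the original comparison;
                                       # flip it if the defect is the low-amplitude side
    new_set[:, :, kk][mask] = frame[mask]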
Do I understand the question correctly: the ROI is the dark border and everything it surrounds? If so, I'd recommend processing in 3D using some kind of region-growing technique, like watershed or active contours (snakes), with markers from imregionalmin. These methods should provide a segmentation result even if the border has small holes. Then just copy the segmented object into a new 3D array via logical indexing.

How to use raw gyroscope data (°/s) for calculating 3D rotation?

My question may seem trivial, but the more I read about it - the more confused I get... I have started a little project where I want to roughly track the movements of a rotating object. (A basketball to be precise)
I have a 3-axis accelerometer (low-pass-filtered) and a 3-axis gyroscope measuring °/s.
I know about the issues of a gyro, but as the measurements will only last several seconds and the angles tend to be huge, I don't care about drift and gimbal lock right now.
My gyro gives me the rotation speed around all 3 axes. As I want to integrate the acceleration twice to get the position at each timestep, I wanted to convert the sensor's coordinate system into an earthbound system.
For the first try, I want to keep things simple, so I decided to go with the big standard rotation matrix.
But as my results are horrible, I wonder if this is the right way to do it. If I understood correctly, the matrix is simply 3 matrices multiplied in a certain order. As the rotation of a basketball doesn't have any "natural" order, this may not be a good idea. My sensor measures 3 angular velocities at once. If I feed them into my system "step by step", it will not be correct, since my second matrix calculates the rotation around the "new y-axis", but my sensor actually measured an angular velocity around the "old y-axis". Is that correct so far?
So how can I correctly calculate the 3D rotation?
Do I need to go for quaternions? But how do I get one from 3 different rotations? And don't I have the same issue here again?
I start with an identity matrix ((1, 0, 0), (0, 1, 0), (0, 0, 1)) multiplied with the acceleration vector to give me the first movement.
Then I want to use the rotation matrix to find out where the next acceleration is really heading, so I can simply add the accelerations together.
But right now I am just too confused to find a proper way.
Any suggestions?
btw. sorry for my poor english, I am tired and (obviously) not a native speaker ;)
Thanks,
Alex
Short answer
Yes, go for quaternions and use a first order linearization of the rotation to calculate how orientation changes. This reduces to the following pseudocode:
float pose_initial[4]; // quaternion describing original orientation
float g_x, g_y, g_z; // gyro rates
float dt; // time step. The smaller the better.
// quaternion with "pose increment", calculated from the first-order
// linearization of continuous rotation formula
delta_quat = {1, 0.5*dt*g_x, 0.5*dt*g_y, 0.5*dt*g_z};
// final orientation at start time + dt
pose_final = quaternion_hamilton_product(pose_initial, delta_quat);
This solution is used in PixHawk's EKF navigation filter (it is open source, check out formulation here). It is simple, cheap, stable and accurate enough.
Unit matrix (describing a "null" rotation) is equivalent to quaternion [1 0 0 0]. You can get the quaternion describing other poses using a suitable conversion formula (for example, if you have Euler angles you can go for this one).
Notes:
Quaternions follow the [w, i, j, k] notation.
These equations assume angular speeds in SI units, that is, radians per second.
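For reference, here is a minimal Python sketch of that update (quaternions in [w, x, y, z] order, gyro rates in rad/s; the function names and the constant-rate test data are mine, not from PixHawk):

import numpy as np

def quat_product(a, b):
    # Hamilton product of two quaternions in [w, x, y, z] order
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def integrate_gyro(pose, gyro, dt):
    # one first-order update: pose is [w, x, y, z], gyro is (g_x, g_y, g_z) in rad/s
    g_x, g_y, g_z = gyro
    delta = np.array([1.0, 0.5*dt*g_x, 0.5*dt*g_y, 0.5*dt*g_z])
    pose = quat_product(pose, delta)
    return pose / np.linalg.norm(pose)   # re-normalize to stay a unit quaternion

# usage: start at the identity orientation and feed gyro samples at, say, 1 kHz
pose = np.array([1.0, 0.0, 0.0, 0.0])
dt = 0.001
for gyro_sample in [(0.1, 0.0, 0.0)] * 1000:   # 1 s of 0.1 rad/s roll (made-up data)
    pose = integrate_gyro(pose, gyro_sample, dt)
print(pose)   # ~ rotation of 0.1 rad about X: [cos(0.05), sin(0.05), 0, 0]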
Long answer
A gyroscope describes the rotational speed of an object as a decomposition into three rotational speeds around the orthogonal local axes XYZ. However, you could equivalently describe the rotational speed as a single rate around a certain axis, either in a reference system that is local to the rotated body or in a global one.
The three rotational speeds affect the body simultaneously, continuously changing the rotation axis.
Here we have the problem of switching from the continuous-time real world to a simpler discrete-time formulation that can easily be solved with a computer. When discretizing, we are always going to introduce errors. Some approaches will lead to bigger errors, while others will be notably more accurate.
Your approach of concatenating three simultaneous rotations around orthogonal axes works reasonably well with small integration steps (let's say smaller than 1/1000 s, although it depends on the application), so that you are simulating the continuous change of the rotation axis. However, this is computationally expensive, and the error grows as you make the time steps bigger.
As an alternative to the first-order linearization, you can calculate the pose increment from the quaternion's rate of change under the measured angular speed (also using the quaternion representation):
quat_gyro = {0, g_x, g_y, g_z};
q_grad = 0.5 * quaternion_product(pose_initial, quat_gyro);
// Important to normalize result to get unit quaternion!
pose_final = quaternion_normalize(pose_initial + q_grad*dt);
This technique is used in the Madgwick orientation filter (here is an implementation), and it works pretty well for me.

Which way is my yarn oriented?

I have an image processing problem. I have pictures of yarn:
The individual strands are partly (but not completely) aligned. I would like to find the predominant direction in which they are aligned. In the center of the example image, this direction is around 30-34 degrees from horizontal. The result could be the average/median direction for the whole image, or just the average in each local neighborhood (producing a vector map of local directions).
What I've tried: I rotated the image in small steps (1 degree) and calculated statistics in the vertical vs horizontal direction of the rotated image (for example: standard deviation of summed rows or summed columns). I reasoned that when the strands are oriented exactly vertically or exactly horizontally the difference in statistics would be greatest, and so that angle of rotation is the correct direction in the original image. However, for at least several kinds of statistical properties I tried, this did not work.
I further thought that perhaps this wasn't working because there were too many different directions at the same time in the whole image, so I tried it in a small neighborhood. In this case, there is always a very clear preferred direction (different for each neighborhood), but it is not the direction that the fibers really go... I can post my sample code but it is basically useless.
I keep thinking there has to be some kind of simple linear algebra/statistical property of the whole image, or some value derived from the 2D FFT that would give the correct direction in one step... but how?
What probably won't work: detecting individual fibers. They are not necessarily the same color, and the image can shade from light to dark so edge detectors don't work well, and the image may not even be in focus sometimes. Because of that, it is not always even possible to see individual fibers for a human (see top-right in the example), they kinda have to be detected as preferred direction in a statistical sense.
You might try doing this in the frequency domain. The output of a Fourier Transform is orientation dependent so, if you have some kind of oriented pattern, you can apply a 2D FFT and you will see a clustering around a specific orientation.
For example, making a greyscale out of your image and performing FFT (with ImageJ) gives this:
You can see a distinct cluster that is oriented orthogonally with respect to the orientation of your yarn. With some pre-processing on your source image, to remove noise and maybe enhance the oriented features, you can probably achieve a much stronger signal in the FFT. Once you have a cluster, you can use something like PCA to determine the vector for the major axis.
For info, this is a technique that is often used to enhance oriented features, such as fingerprints, by applying a selective filter in the FFT and then taking the inverse to obtain a clearer image.
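As a rough sketch of that FFT-plus-PCA idea (Python/NumPy; the function and variable names are mine, and the sign of the returned angle depends on whether you count image rows upward or downward, so check it against a test image with a known stripe direction):

import numpy as np

def dominant_orientation(gray):
    # Estimate the dominant stripe orientation of a greyscale image, in degrees.
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gray)))**2   # centered power spectrum
    h, w = spec.shape
    spec[h//2, w//2] = 0                                   # suppress the DC component
    # (in practice you may also want to threshold spec so only the bright cluster contributes)
    y, x = np.mgrid[0:h, 0:w]
    x = x - w/2
    y = y - h/2
    total = spec.sum()
    # second-order central moments of the spectrum, treated as a weighted point cloud
    mxx = (spec * x * x).sum() / total
    myy = (spec * y * y).sum() / total
    mxy = (spec * x * y).sum() / total
    # major axis of the spectral cluster (standard image-moment / PCA formula)
    angle_spectrum = 0.5 * np.arctan2(2 * mxy, mxx - myy)
    # the yarn direction is orthogonal to the spectral cluster's major axis
    return (np.degrees(angle_spectrum) + 90.0) % 180.0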
An alternative approach is to try a series of Gabor filters (see here), pre-built with a selection of orientations and frequencies, and use the resulting features as a metric for identifying the most likely orientation. There is a scikit-image article that gives some examples here.
UPDATE
Just playing with ImageJ to give an idea of some possible approaches to this - I started with the FFT shown above, then - in the following image, I performed these operations (clockwise from top left) - Threshold => Close => Holefill => Erode x 3:
Finally, rather than using PCA, I calculated the spatial moments of the lower left blob using this ImageJ Plugin which handily calculates the orientation of the longest axis based on the 2nd order moment. The result gives an orientation of approximately -38 degrees (with respect to the X axis):
Depending on your frame of reference you can calculate the approximate average orientation of your yarn from this rather than from PCA.
I tried to use Gabor filters to enhance the orientations of your yarns. The parameters I used are:
phi = x*pi/16; % x = 1, 3, 5, 7
theta = 3;
sigma = 0.65*theta;
filterSize = 3;
The imaginary parts of the convolved images are shown below:
As you mentioned, most orientations lie between 30 and 34 degrees, so the filter with phi = 5*pi/16 (bottom left) yields the best contrast among the four.
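If you want to sweep such a Gabor bank programmatically rather than picking the winner by eye, a minimal scikit-image sketch could look like this (the frequency value is a guess and must be tuned to the fiber spacing, and whether the winning theta runs along or across the fibers depends on the library's orientation convention, so verify it on a synthetic stripe image):

import numpy as np
from skimage.filters import gabor

def best_gabor_orientation(gray, frequency=0.15, n_orientations=16):
    # Return the filter orientation (in degrees) giving the strongest mean response.
    angles = np.linspace(0, np.pi, n_orientations, endpoint=False)
    responses = []
    for theta in angles:
        real, imag = gabor(gray, frequency=frequency, theta=theta)
        responses.append(np.hypot(real, imag).mean())   # magnitude of the complex response
    return float(np.degrees(angles[int(np.argmax(responses))]))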
I would consider using a Hough transform for this type of problem; there is a nice write-up here.

How can I choose an image with higher contrast in PHP?

For a thumbnail-engine I would like to develop an algorithm that takes x random thumbnails (crop, no resize) from an image, analyzes them for contrast and chooses the one with the highest contrast. I'm working with PHP and Imagick but I would be glad for some general tips about how to compute contrast of imagery.
It seems that many things are easier than computing contrast, for example counting colors, computing luminosity, etc.
What are your experiences with the analysis of picture material?
I'd do it this way (pseudocode):
L[256] = {0, 0, 0, ...}
loop over each pixel:
    luminance = avg(R, G, B)
    increment L[luminance] by 1
for i = 0 to 255:
    if L[i] < C: L[i] = 0   // C = noise threshold of your choice
find the index of the first and last non-zero value of L[]
contrast = last - first
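For what it's worth, here is that pseudocode made concrete (a Python sketch with Pillow and NumPy rather than PHP/Imagick, just to pin down the idea; noise_count plays the role of the threshold C and would need tuning):

import numpy as np
from PIL import Image

def histogram_spread_contrast(path, noise_count=8):
    # Contrast = distance between the first and last "populated" luminance bins.
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64)
    luminance = rgb.mean(axis=2).astype(np.uint8)        # avg(R, G, B), as in the pseudocode
    hist = np.bincount(luminance.ravel(), minlength=256)
    hist[hist < noise_count] = 0                         # drop sparsely populated bins ("C")
    populated = np.nonzero(hist)[0]
    return int(populated[-1] - populated[0]) if populated.size else 0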
In looking for the image "with the highest contrast," you will need to be very careful in how you define contrast for the image. In the simplest way, contrast is the difference between the lowest intensity and the highest intensity in the image. That is not going to be very useful in your case.
I suggest you use a histogram approach to describe the contrast of a given image and then compare the properties of the histograms to determine the image with the highest contrast as you define it. You could use a variety of well known containers to represent the histogram in code, or construct a class to meet your specific needs. (I am not implying that you need to create a histogram in the form of a chart – just a statistical representation of the intensity values.) You could use the variance of each histogram directly as a measure of contrast, or use the standard deviation if that is easier to work with.
The key really lies in how you define the contrast of the image. In general, I would define a high-contrast image as one in which all, or nearly all, of the possible intensity values are present. And I would further add that in this definition of a high-contrast image, the intensity values will tend to be distributed across the range of possible values in a uniform way.
Using this approach, a low contrast image would tend to have relatively few discrete intensity values and they would tend to be closely grouped together rather than uniformly distributed. (As a general rule, they will also tend to be grouped toward the center of the range.)
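Following that definition, a compact way to score candidate crops is to use the standard deviation of their luminance values and keep the crop with the largest score, e.g. (Python/NumPy sketch, assuming the crops are already greyscale arrays):

import numpy as np

def pick_highest_contrast(crops):
    # crops: list of 2-D greyscale NumPy arrays; returns the index of the highest-contrast one
    return int(np.argmax([np.std(c.astype(np.float64)) for c in crops]))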
