Coherent Spherical Noise? - random

Does anyone know how I might be able to generate the following kind of noise?
Three inputs, three outputs
The outputs must always result in a vector of the same magnitude
If it receives the same input as some other time, it must return the same output
It must be continuous (best if it appears smooth, like perlin noise)
It must appear to be fairly random
EDIT: It would also be nice if it were isotropic, but that's not entirely necessary.

I've found a way; it might not be very fast, but it does the job (this is C-like pseudocode designed to make porting to other languages easy).
vec3 sphereNoise(vec3 input, float radius)
{
    vec3 result;
    result.x = simplex(input.x, input.y); //could use perlin instead of simplex
    result.y = simplex(input.y, input.z); //but I prefer simplex for its speed
    result.z = simplex(input.z, input.x); //and its lack of directional artifacts
    //uncomment the following line to make it a spherical-shell noise
    //result.normalize();
    result *= radius;
    return result;
}
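For anyone who wants to try the idea without a simplex library, here is a minimal Python sketch of the normalized ("spherical shell") variant. It is not the code above: it swaps in a simple hash-based value noise for simplex, and the names `value_noise` and `sphere_noise`, the seeds and the hash constants are all made up for illustration. It also guards the one case the pseudocode ignores, a zero-length vector before normalizing.

```python
import math

def _lattice(ix, iy, iz, seed):
    """Deterministic pseudo-random value in [-1, 1] for an integer lattice point."""
    h = (ix * 374761393 + iy * 668265263 + iz * 2147483647 + seed * 1274126177) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return h / 0xFFFFFFFF * 2.0 - 1.0

def _smooth(t):
    """Smoothstep fade so the noise has no visible creases at lattice cells."""
    return t * t * (3.0 - 2.0 * t)

def value_noise(x, y, z, seed):
    """Continuous 3D value noise (a stand-in for simplex/Perlin in this sketch)."""
    ix, iy, iz = math.floor(x), math.floor(y), math.floor(z)
    tx, ty, tz = _smooth(x - ix), _smooth(y - iy), _smooth(z - iz)

    def corner(dx, dy, dz):
        return _lattice(ix + dx, iy + dy, iz + dz, seed)

    def lerp(a, b, t):
        return a + (b - a) * t

    # Interpolate along x, then y, then z.
    x00 = lerp(corner(0, 0, 0), corner(1, 0, 0), tx)
    x10 = lerp(corner(0, 1, 0), corner(1, 1, 0), tx)
    x01 = lerp(corner(0, 0, 1), corner(1, 0, 1), tx)
    x11 = lerp(corner(0, 1, 1), corner(1, 1, 1), tx)
    return lerp(lerp(x00, x10, ty), lerp(x01, x11, ty), tz)

def sphere_noise(p, radius=1.0):
    """Map a 3D input to a vector of fixed magnitude `radius`."""
    # One independent noise channel per output component (seeds are arbitrary).
    v = [value_noise(p[0], p[1], p[2], seed) for seed in (11, 23, 47)]
    length = math.sqrt(sum(c * c for c in v))
    if length < 1e-12:
        # Degenerate case: all channels are ~0, so there is no direction to keep.
        # Returning a fixed direction breaks continuity at that point.
        return (radius, 0.0, 0.0)
    return tuple(radius * c / length for c in v)
```

The same input always gives the same output because the lattice values come from a hash rather than a random generator. Normalizing does make the result less uniform over the sphere than a purpose-built method would, so treat this as a starting point only.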

How inefficient is my ray-box-intersection algorithm?

I am experimenting a little bit with shaders and the calculation of a collision between ray-box which is done following way:
inline bool hitsCube(in Ray ray, in Cube cube,
out float tMin, out float tMax,
out float3 signMin, out float3 signMax)
{
float3 biggerThan0 = ray.odir > 0; // ray.odir = (1.0/ray.dir)
float3 lessThan0 = 1.0f - biggerThan0;
float3 tMinXYZ = cube.center + biggerThan0 * cube.minSize + lessThan0 * cube.maxSize;
float3 tMaxXYZ = cube.center + biggerThan0 * cube.maxSize + lessThan0 * cube.minSize;
float3 rMinXYZ = (tMinXYZ - ray.origin) * ray.odir;
float3 rMaxXYZ = (tMaxXYZ - ray.origin) * ray.odir;
float minV = max(rMinXYZ.x, max(rMinXYZ.y, rMinXYZ.z));
float maxV = min(rMaxXYZ.x, min(rMaxXYZ.y, rMaxXYZ.z));
tMin = minV;
tMax = maxV;
signMin = (rMinXYZ == minV) * lessThan0; // important calculation for another algorithm, but no context provided here
signMax = (rMaxXYZ == maxV) * lessThan0;
return maxV > minV * (minV + maxV >= 0); // last multiplication makes sure the origin of the ray is outside the cube
}
Considering that this function could be called inside an HLSL shader many, many times (for some pixels, let's say at least 200-300 times): is my implementation of the collision logic inefficient?
Not really an easily answerable "question", and hard to say without knowing everything else that's going on around it, but here are a few random thoughts:
a) If you're really interested in knowing what this code would look like on the GPU, I'd suggest "porting" it to a CUDA kernel, then using CUDA to generate PTX and SASS for a modern GPU (say, sm75 for Turing or sm86 for Ampere), and then comparing two or three variants of it in the SASS output.
b) The "converting logic to multiplications" trick might gain you less than you think: if the logic isn't too complicated, there's a good chance you'll end up with a few predicates and not much warp divergence at all, so it might not be too bad. The only way to tell is to look at the PTX and/or SASS output, see 'a'.
c) Your formulation of tMinXYZ/tMaxXYZ is (IMHO) unnecessarily complicated: just express it with min/max operations, which are really cheap on GPUs. Also see the chapter on ray/box intersection in the Ray Tracing Gems 2 book (which is free to download). That formulation is also more numerically stable, btw.
d) Re "lags... is my logic inefficient": actual assembly "efficiency" will rarely have such gigantic effects; usually the culprit for noticeable "lags" is either memory stalls (hard to guess what's going on) or something going horribly wrong for other reasons (see next bullet).
e) Just a hunch: I would check rays where some of the direction components are 0. In this case you're dividing by 0 (never a good idea), and in particular if the result gets multiplied by 0.f (which in your case can happen) you'll get NaNs, and since "comparison with NaN is always false" you may end up with cases where your traversal logic always goes down instead of skipping. That's not the same as the "efficiency" of your logic, but it's something to look out for. A good fix is to replace each ray.dir component that's 0.f with 1e-6f or so.
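To make points (c) and (e) concrete, here is a rough sketch of the usual min/max "slab" test, written in Python rather than HLSL so it can be run anywhere. The function names `safe_inverse` and `ray_box` and the `eps` value are assumptions for illustration, and the box is given by its min/max corners instead of center plus offsets. Unlike the function in the question, this one also reports a hit when the ray origin is inside the box.

```python
import math

def safe_inverse(d, eps=1e-6):
    """Reciprocal of one direction component, nudging (near-)zeros to +/-eps first."""
    if abs(d) < eps:
        d = math.copysign(eps, d)
    return 1.0 / d

def ray_box(origin, direction, box_min, box_max):
    """Slab test: return (hit, t_near, t_far) for a ray against an axis-aligned box."""
    t_near, t_far = -math.inf, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        inv = safe_inverse(d)
        t0 = (lo - o) * inv
        t1 = (hi - o) * inv
        # min/max pick the entry and exit distance of this slab,
        # whatever the sign of the direction component.
        t_near = max(t_near, min(t0, t1))
        t_far = min(t_far, max(t0, t1))
    hit = t_near <= t_far and t_far >= 0.0
    return hit, t_near, t_far

# Ray along +x with zero y/z direction components, aimed at a unit cube.
print(ray_box((-5.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)))
```

In a shader the per-axis loop becomes three component-wise `min`/`max` calls on `float3` values, so there is no branching on the sign of the direction.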

What algorithms or approaches apart from Haar cascades could be used for custom objects detection?

I need to do computer vision tasks in order to detect water bottles or soda cans. I will obtain 'frontal' images of bottles, soda cans or any other random objects (one by one), and my algorithm should determine whether it's a bottle, a can or neither of them.
Some details about object detecting scenario:
As mentioned, I will test one single object per image/video frame.
Not all water bottles are the same. There could be variation in the color of the plastic, the lid or the label. Some might have no label or lid.
The same variation applies to soda cans. No wrinkled soda cans are going to be tested, though.
There could be small size variation between objects.
I could have a green (or any custom color) background.
I will do any needed filters on image.
This will be run on a Raspberry Pi.
Just in case, an example of each:
I've tested OpenCV's face detection algorithms a couple of times and I know they work pretty well, but with this approach I'd need to obtain a special Haar cascade features XML file for detecting each custom object.
So, the distinct alternatives I have in mind are:
Creating a custom Haar Classifier.
Considering shapes.
Considering outlines.
I'd like to get a simple algorithm, and I think creating a custom Haar classifier might not even be needed. What would you suggest?
Update
I strongly considered the shape/aspect ratio approach.
However, I guess I'm facing some issues, as bottles come in different sizes or even shapes. But this led me to the following considerations:
I'm applying a threshold with THRESH_BINARY method. (Thanks to the answers).
I will use a white background on detection.
Soda cans are all same size.
So, a bounding box for soda cans with high accuracy might distinguish a can.
What I've achieved:
Threshold really helped me, I could notice that on white background tests I would obtain for cans:
And this is what it's obtained for bottles:
So, the dominance of darker areas is noticeable. There are some cases with cans where this might turn into false negatives. And for bottles, light and angle may lead to inconsistent results, but I really think this could be a shorter approach.
So, I'm quite confused now about how I should evaluate that dominance of dark areas. I've read that findContours leads to it, but I'm quite lost on how to use that function. For example, in the case of soda cans it may find several contours, so I get lost on what to evaluate.
Note: I'm open to testing any other algorithms or libraries besides OpenCV.
I see a few basic ideas here:
Check the object's (to be precise, the object's bounding rect's) height/width ratio. For a can it's approximately 2-2.5; for a bottle I think it will be >3. It's a very simple idea, so it should be easy to test quickly, and I think it should have quite good accuracy. For in-between values like 2.75 (assuming the values I gave are correct, which most likely isn't true) you can use a different algorithm.
Check whether your object contains glass/transparent regions: if yes, then it's definitely a bottle. Here you can read more about it.
Use the grabcut algorithm to get the object mask/a more precise shape and check whether the width of this shape at the top is similar to its width at the bottom: if yes, then it's a can; if not, a bottle (bottles have a screw cap at the top).
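As a rough illustration of the first and third ideas, here is a small pure-Python sketch that works on an already segmented binary mask (a list of rows of 0/1). The function names, the `taper` factor and the cut-off values are assumptions; the 2.5 and 3.0 defaults are only the guesses from the text above, not measured values.

```python
def bounding_box(mask):
    """Return (x, y, width, height) of the non-zero pixels, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return None
    return min(cols), min(rows), max(cols) - min(cols) + 1, max(rows) - min(rows) + 1

def row_width(row):
    """Distance between the first and last non-zero pixel of one mask row."""
    cols = [c for c, v in enumerate(row) if v]
    return (cols[-1] - cols[0] + 1) if cols else 0

def classify(mask, can_max=2.5, bottle_min=3.0, taper=0.7):
    """Guess "can" or "bottle" from the height/width ratio, with a taper check as tie-breaker."""
    box = bounding_box(mask)
    if box is None:
        return "empty"
    x, y, w, h = box
    ratio = h / w
    if ratio <= can_max:
        return "can"
    if ratio >= bottle_min:
        return "bottle"
    # Ambiguous ratio: a bottle narrows toward its cap, a can does not.
    top = row_width(mask[y])
    bottom = row_width(mask[y + h - 1])
    return "bottle" if top < taper * bottom else "can"

can = [[1, 1, 1, 1] for _ in range(8)]                        # ratio 2.0
bottle = [[0, 1, 1, 0]] + [[1, 1, 1, 1] for _ in range(10)]   # ratio 2.75, narrow top
print(classify(can), classify(bottle))
```

Note that anything squat (ratio below `can_max`) is called a can here, so a real version would need a "neither" class for random objects.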
Since you want to recognize can vs bottle rather than pepsi vs coke, shape matching is probably the way to go when compared to Haar and the features2d matchers like SIFT/SURF/ORB
A unique background color will make things easier.
First create a histogram from an image of just the background
int channels[] = {0,1,2}; // use all the channels
int rgb_bins = 32; // quantize to 32 colors per channel
int histSize[] = {rgb_bins, rgb_bins, rgb_bins};
float _range[] = {0,256}; // the upper bound is exclusive
const float* ranges[] = {_range, _range, _range};
cv::SparseMat bghist;
cv::calcHist(&bg_image, 1, channels, cv::noArray(),bghist, 3, histSize, ranges );
Then use calcBackProject to create a mask of bg and not bg
cv::MatND temp_ND;
cv::calcBackProject( &bottle_image, 1, channels, bghist, temp_ND, ranges );
cv::Mat bottle_mask, bottle_backproj;
if( feeling_lazy ){
cv::normalize(temp_ND, bottle_backproj, 0, 255, cv::NORM_MINMAX, CV_8U);
//a small blur here could work nicely
threshold( bottle_backproj, bottle_mask, 0, 255, THRESH_OTSU );
bottle_mask = cv::Scalar(255) - bottle_mask; //invert the mask
} else {
//finding just the right value here might be better than the above method
int magic_threshold = 64;
temp_ND.convertTo( bottle_backproj, CV_8U, 255.);
//I expect temp_ND to be CV_32F ranging from 0-1, but I might be wrong.
threshold( bottle_backproj, bottle_mask, magic_threshold, 255, THRESH_BINARY_INV );
}
Then either:
Compare bottle_mask or bottle_backproj to a few sample bottle masks/backprojections using matchTemplate with a threshold on confidence to decide if it's a match.
matchTemplate(bottle_mask, bottle_template, result, CV_TM_CCORR_NORMED);
double confidence; minMaxLoc( result, NULL, &confidence);
Or use matchShapes, though I've never gotten this to work properly. Note that it returns a dissimilarity score, so lower values mean a closer match.
double dissimilarity = matchShapes(bottle_mask, bottle_template, CV_CONTOURS_MATCH_I3, 0);
Or use linemod which is difficult to set up but works great for images like this where the shape isn't very complex. Aside from the linked file, I haven't found any working samples of this method so here's what I did.
First create/train the detector with some sample images
//some magic numbers
std::vector<int> T_at_level;
T_at_level.push_back(4);
T_at_level.push_back(8);
//add some padding so linemod doesn't scream at you
const int T = 32;
int width = bottle_mask.cols;
if( width % T != 0)
width += T - width % T;
int height = bottle_mask.rows;
if( height % T != 0)
height += T - height % T;
//in this case template_backproj is created specifically from a sample bottle_backproj
cv::Rect padded_roi( (width - template_backproj.cols)/2, (height - template_backproj.rows)/2, template_backproj.cols, template_backproj.rows);
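cv::Mat padded_backproj = cv::Mat::zeros( height, width, template_backproj.type()); //zeros takes (rows, cols, type)
template_backproj.copyTo( padded_backproj( padded_roi ) ); //plain assignment to the ROI header would not copy the pixels
cv::Mat padded_mask = cv::Mat::zeros( height, width, template_mask.type());
template_mask.copyTo( padded_mask( padded_roi ) );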
//you might need to erode padded_mask by a few pixels.
//initialize detector
std::vector< cv::Ptr<cv::linemod::Modality> > modalities;
modalities.push_back( cv::makePtr<cv::linemod::ColorGradient>() ); //for those that don't have a kinect
cv::Ptr<cv::linemod::Detector> new_detector = cv::makePtr<cv::linemod::Detector>(modalities, T_at_level);
//add sample images to the detector
std::vector<cv::Mat> template_images;
template_images.push_back( padded_backproj);
cv::Rect ignore_me;
const std::string class_id = "bottle";
int template_id = new_detector->addTemplate(template_images, class_id, padded_mask, &ignore_me);
Then do some matching
std::vector<cv::Mat> sources_vec;
sources_vec.push_back( padded_backproj );
//padded_backproj doesn't need to be the same size as the trained template images, but it does need to be padded the same way.
float matching_threshold = 80.0f; //similarity threshold as a percentage (0-100); a higher number makes the algorithm faster
std::vector<cv::linemod::Match> matches;
std::vector<cv::String> class_ids;
new_detector->match(sources_vec, matching_threshold, matches,class_ids);
float confidence = matches.size() > 0? matches[0].similarity : 0;
As cyriel suggests, the aspect ratio might be one useful measure. Here is some OpenCV Python code that finds contours (hopefully including the outline of the bottle or can) and gives you the aspect ratio and some other measurements:
import cv2

# src image should have already had some contrast enhancement (such as
# cv2.threshold) and edge finding (such as cv2.Canny)
# Note: OpenCV 3.x returns three values here (image, contours, hierarchy).
contours, hierarchy = cv2.findContours(src, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    num_points = len(contour)
    if num_points < 5:
        # The contour has too few points to fit an ellipse. Skip it.
        continue
    # We could use area to help determine the type of object.
    # Small contours are probably false detections (not really a whole object).
    area = cv2.contourArea(contour)
    bounding_ellipse = cv2.fitEllipse(contour)
    center, radii, angle_degrees = bounding_ellipse
    # Let's define an ellipse's normal orientation to be landscape (width > height).
    # We must ensure that the ellipse's measurements match this orientation.
    if radii[0] < radii[1]:
        radii = (radii[1], radii[0])
        angle_degrees -= 90.0
    # We could use the angle to help determine the type of object.
    # A bottle or can's angle is probably approximately a multiple of 90 degrees,
    # assuming that it is at rest and not falling.
    # Calculate the aspect ratio (width / height) of the landscape-oriented ellipse.
    # For example, 2.0 means the object's long axis is 2 times its short axis.
    # A bottle is probably more elongated than a can.
    aspect_ratio = radii[0] / radii[1]
For checking transparency, you can compare the picture to a known background using histogram analysis or background subtraction.
The contour's moments can be used to determine its centroid (center of gravity):
moments = cv2.moments(contour)
m00 = moments['m00']
m01 = moments['m01']
m10 = moments['m10']
centroid = (m10 / m00, m01 / m00)
You could compare this to the center. If the object is bigger ("heavier") on one end, the centroid will be closer to that end than the center is.
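To show what that comparison looks like without needing OpenCV, here is a small sketch that computes the same centroid (m10/m00, m01/m00) directly from a binary mask and subtracts the bounding-box center. The function name `centroid_offset` and the toy mask are made up for illustration; with a real contour you would use the `cv2.moments` values from above instead.

```python
def centroid_offset(mask):
    """Return (dx, dy): centroid minus bounding-box center, in pixels.

    mask is a list of rows of 0/1. Returns None for an empty mask,
    which is the case where m00 would be zero.
    """
    points = [(c, r) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        return None
    m00 = len(points)
    centroid_x = sum(c for c, _ in points) / m00   # m10 / m00
    centroid_y = sum(r for _, r in points) / m00   # m01 / m00
    xs = [c for c, _ in points]
    ys = [r for _, r in points]
    center_x = (min(xs) + max(xs)) / 2.0
    center_y = (min(ys) + max(ys)) / 2.0
    return centroid_x - center_x, centroid_y - center_y

# Bottle-like silhouette: narrow neck on top, wide body below.
bottle = [[0, 1, 1, 0]] * 2 + [[1, 1, 1, 1]] * 4
print(centroid_offset(bottle))
```

Image rows grow downward, so a positive `dy` means the shape is "heavier" toward the bottom, as you would expect for an upright bottle; a can-like rectangle gives an offset of roughly zero.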
So, my main approach for detection was:
Bottles are transparent and cans are opaque
In general, the algorithm consisted of:
Take a grayscale picture.
Apply a binary threshold.
Select a convenient ROI from it.
Obtain its color mean and even the standard deviation.
Distinguish.
Implementation was basically reduced to this function (where CAN and BOTTLE were previously defined):
int detector(int x, int y, int width, int height, int thresholdValue, CvCapture* capture) {
Mat img;
Rect r;
vector<Mat> channels;
r = Rect(x,y,width,height);
if ( !capture ) {
fprintf( stderr, "ERROR: capture is NULL \n" );
getchar();
return -1;
}
img = Mat(cvQueryFrame( capture ));
cvtColor(img,img,CV_BGR2GRAY); // frames from the capture are in BGR order
threshold(img, img, 127, 255, THRESH_BINARY);
// ROI
Mat roiImage = img(r);
split(roiImage, channels);
Scalar m = mean(channels[0]);
float media = m[0];
printf("Media: %f\n", media);
if (media < thresholdValue) {
return CAN;
}
else {
return BOTTLE;
}
}
As can be seen, a THRESH_BINARY threshold was applied, and a plain white background was used. However, the main and critical issue I faced with this whole approach was changes in the luminosity of the environment, even minor ones.
Sometimes I noticed that THRESH_BINARY_INV might help more, but I wonder whether certain threshold parameters or other filters could eliminate environment lighting as an issue.
I really appreciate the approach of calculating the aspect ratio from a bounding box or from contours, but I found this one straightforward and simple once the conditions were adjusted.
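One thing that might be worth trying for the lighting problem is picking the threshold per image instead of hard-coding 127, which is what the THRESH_OTSU flag used in the earlier answer does. The sketch below is a plain-Python version of Otsu's method on a flat list of 8-bit gray values, only to show the idea; the function name and the toy pixel lists are made up, and it is not guaranteed to fix uneven lighting within a single image.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0-255) that best separates pixels <= t from pixels > t."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0, sum0 = 0, 0          # pixel count and gray-value sum of the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue         # nothing in the dark class yet
        if w1 == 0:
            break            # everything is in the dark class, no split left
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor).
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t

dim_scene = [10] * 50 + [200] * 50
bright_scene = [40] * 50 + [230] * 50
print(otsu_threshold(dim_scene), otsu_threshold(bright_scene))
```

The threshold moves with the overall brightness of the frame, which is the property a fixed value lacks.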
I'd use deep learning, based on Transfer learning.
The idea is this: given a highly complex, well-trained neural network that was trained on a similar classification task (typically on a large public dataset, like ImageNet), you can freeze the majority of its weights and train only the last layers. There are lots of tutorials out there. You don't need to have a background in deep learning.
There is a tutorial which is almost out of the box with TensorFlow here, and here there is another based on Keras.

Unprojecting Screen coords to world in OpenGL es 2.0

Long time listener, first time caller.
So I have been playing around with the Android NDK and I'm at a point where I want to Unproject a tap to world coordinates but I can't make it work.
The problem is that the x and y values for both the near and far points are the same, which doesn't seem right for a perspective projection. Everything in the scene draws OK, so I'm a bit confused about why it wouldn't unproject properly. Anyway, here is my code; please help, thanks.
//x and y are the normalized screen coords
ndk_helper::Vec4 nearPoint = ndk_helper::Vec4(x, y, 1.f, 1.f);
ndk_helper::Vec4 farPoint = ndk_helper::Vec4(x, y, 1000.f, 1.f);
ndk_helper::Mat4 inverseProjView = this->matProjection * this->matView;
inverseProjView = inverseProjView.Inverse();
nearPoint = inverseProjView * nearPoint;
farPoint = inverseProjView * farPoint;
nearPoint = nearPoint *(1 / nearPoint.w_);
farPoint = farPoint *(1 / farPoint.w_);
Well, after looking at the vector/matrix math code in ndk_helper, this isn't a surprise. In short: Don't use it. After scanning through it for a couple of minutes, it has some obvious mistakes that look like simple typos. And particularly the Vec4 class is mostly useless for the kind of vector operations you need for graphics. Most of the operations assume that a Vec4 is a vector in 4D space, not a vector containing homogeneous coordinates in 3D space.
If you want, you can check it out here, but be prepared for a few face palms:
https://android.googlesource.com/platform/development/+/master/ndk/sources/android/ndk_helper/vecmath.h
For example, this is the implementation of the multiplication used in the last two lines of your code:
Vec4 operator*( const float& rhs ) const
{
Vec4 ret;
ret.x_ = x_ * rhs;
ret.y_ = y_ * rhs;
ret.z_ = z_ * rhs;
ret.w_ = w_ * rhs;
return ret;
}
This multiplies a vector in 4D space by a scalar, but is completely wrong if you're operating with homogeneous coordinates. Which explains the results you are seeing.
I would suggest that you either write your own vector/matrix library that is suitable for graphics type operations, or use one of the freely available libraries that are tested, and used by others.
BTW, the specific values you are using for your test look somewhat odd. You definitely should not be getting the same results for the two vectors, but it's probably not what you had in mind anyway. For the z coordinate in your input vectors, you are using the distances of the near and far planes in eye coordinates. But then you apply the inverse view-projection matrix to those vectors, which transforms them back from clip/NDC space into world space. So your input vectors for this calculation should be in clip/NDC space, which means the z-coordinate values corresponding to the near/far plane should be at -1 and 1.
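To illustrate the point about using NDC z values of -1 and 1, here is a small self-contained Python sketch. It does not use ndk_helper: the matrix helpers (`perspective`, `invert`, `mat_vec`, `unproject`) are hypothetical stand-ins, the projection follows the usual OpenGL convention, and the view matrix is assumed to be the identity to keep things short.

```python
import math

def perspective(fovy_deg, aspect, near, far):
    """OpenGL-style perspective matrix, stored as rows, used with column vectors."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def invert(m):
    """4x4 inverse by Gauss-Jordan elimination with partial pivoting."""
    n = 4
    a = [list(row) + [1.0 if r == c else 0.0 for c in range(n)] for r, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def unproject(ndc_x, ndc_y, ndc_z, inv_proj_view):
    """NDC point (each coordinate in [-1, 1]) back to world space."""
    x, y, z, w = mat_vec(inv_proj_view, [ndc_x, ndc_y, ndc_z, 1.0])
    return [x / w, y / w, z / w]   # perspective divide

inv = invert(perspective(60.0, 1.0, 1.0, 1000.0))   # view = identity here
near_point = unproject(0.5, 0.5, -1.0, inv)         # z = -1 is the near plane
far_point = unproject(0.5, 0.5, 1.0, inv)           # z = +1 is the far plane
print(near_point, far_point)
```

With these inputs the near point lands at eye-space z = -1 and the far point at z = -1000, and their x and y values differ by the same factor of 1000, which is the behavior the question expected from a perspective projection.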

OpenCL for-loop doing strange things

I'm currently implementing terrain generation in OpenCL using layered octaves of noise and I've stumbled upon this problem:
float multinoise2d(float2 position, float scale, int octaves, float persistence)
{
    float result = 0.0f;
    float sample = 0.0f;
    float coefficient = 1.0f;
    for(int i = 0; i < octaves; i++){
        // get a sample of a simple signed perlin noise
        sample = sgnoise2d(position/scale);
        if(i > 0){
            // Here is the problem:
            // Implementation A, this works correctly.
            coefficient = pown(persistence, i);
            // Implementation B, using this only the first
            // noise octave is visible in the terrain.
            coefficient = persistence;
            persistence = persistence*persistence;
        }
        result += coefficient * sample;
        scale /= 2.0f;
    }
    return result;
}
Does OpenCL parallelize for-loops, leading to synchronization issues here or am I missing something else?
Any help is appreciated!
The problem with your code is in the lines
coefficient = persistence;
persistence = persistence*persistence;
They should be changed to
coefficient = coefficient *persistence;
Otherwise the two implementations produce different sequences. The first implementation multiplies the coefficient by just persistence on every iteration:
pow(persistence, 1) ; pow(persistence, 2); pow(persistence, 3) ....
However, the second implementation goes
pow(persistence, 1); pow(persistence, 2); pow(persistence, 4); pow(persistence, 8) ......
Soon "persistence" will run past the limits of float (underflowing toward zero if it is below 1, overflowing if it is above 1) and you will get zeros (or undefined behavior) in your answer.
EDIT
Two more things
Accumulation (implementation 2) is not a good idea, especially with real numbers and with algorithms that require accuracy. You might be losing a small fraction of your information every time you accumulate on "persistence" (e.g. due to rounding). Prefer direct calculation (1st implementation) over accumulation whenever you can. (Plus, if this were serial code, the 1st implementation would be readily parallelizable, because its iterations don't depend on each other.)
If you are working with AMD OpenCL pay attention to the pow() functions. I have had problems with those on multiple machines on multiple occasions. The functions seem to hang sometimes for no reason. Just FYI.
I'm assuming this is some kind of utility method that is called in your CL kernel. Vivek is correct in his comment above: OpenCL does not parallelize your code for you. You have to leverage OpenCL's facilities for dividing your problem into data-parallel chunks.
Also, I don't see a potential synchronization issue in the above code. All of your variables are in work-item private memory space.
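If it helps to see the numbers, here is a small Python sketch that reproduces the per-octave coefficients of the three variants discussed in this thread. The function names are made up; only the loop bodies mirror the OpenCL code.

```python
def coefficients_direct(persistence, octaves):
    """Implementation A: coefficient = pown(persistence, i)."""
    return [persistence ** i for i in range(octaves)]

def coefficients_running(persistence, octaves):
    """Suggested fix: coefficient = coefficient * persistence."""
    coeffs, coefficient = [], 1.0
    for i in range(octaves):
        if i > 0:
            coefficient = coefficient * persistence
        coeffs.append(coefficient)
    return coeffs

def coefficients_squaring(persistence, octaves):
    """Implementation B from the question: persistence is squared every iteration."""
    coeffs, coefficient = [], 1.0
    for i in range(octaves):
        if i > 0:
            coefficient = persistence
            persistence = persistence * persistence
        coeffs.append(coefficient)
    return coeffs

print(coefficients_direct(0.5, 5))    # [1.0, 0.5, 0.25, 0.125, 0.0625]
print(coefficients_running(0.5, 5))   # [1.0, 0.5, 0.25, 0.125, 0.0625]
print(coefficients_squaring(0.5, 5))  # [1.0, 0.5, 0.25, 0.0625, 0.00390625]
```

The exponents in the last list are 0, 1, 2, 4, 8, so the later octaves fade out far faster than intended, which matches "only the first noise octave is visible".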

Implementing Bezier Curves

I am trying to implement Bezier curves for an assignment. I am trying to move a ball (using Bezier curves) by giving my function an array of key frames. The function should give me all the frames in between the key frames ... or control points ... but although I'm using the formula found on Wikipedia... it is not really working :s
Here's my code:
private void interpolate(){
float x,y,b, t = 0;
frames = new Frame[keyFrames.length];
for(int i =0;i<keyFrames.length;++i){
t+=0.001;
b = Bint(i,keyFrames.length,t);
x = b*keyFrames[i].x;
y = b*keyFrames[i].y;
frames[i] = new Frame(x,y);
}
}
private float Bint(int i, int n, float t){
float Cni = fact(n)/(fact(i) * fact(n-i));
return Cni * pow(1-t,n-i) * pow(t,i);
}
Also I've noticed that the frames[] array should be much bigger but I can't find any other text which is more programmer friendly
Thanks in advance.
There are lots of things that don't look quite right here.
1. Doing it this way, your interpolation will pass exactly through the first and last control points, but not through the others. Is that what you want?
2. If you have lots of key frames, you're using a very-high-degree polynomial for your interpolation. Polynomials of high degree are notoriously badly-behaved, you may get your position oscillating wildly in between the key frame positions. (This is one reason why the answer to question 1 should probably be no.)
3. Assuming for the sake of argument that you really do want to do this, your value of t should go from 0 at the start to 1 at the end. Do you happen to have exactly 1001 of these key frames? If not, you'll be doing the wrong thing.
4. Evaluating these polynomials with lots of calls to fact and pow is likely to be inefficient, especially if n is large.
I'm reluctant to go into much detail about what you should do without knowing more about the scope of your assignment -- it will do no one any good for Stack Overflow to do your homework for you! What have you already been told about Bezier curves? What exactly does your assignment ask you to do?
EDITED to add:
The simplest way to do interpolation using Bezier curves is probably this. Have one (cubic) Bezier curve between each pair of key-points. The endpoints (first and last control points) of each Bezier curve are those keypoints. You need two more control points. For motion to be smooth as you move through a given keypoint, you need (keypoint minus previous control point) = (next control point minus keypoint). So you're choosing a single vector at each keypoint, which will determine where the previous and subsequent control points go. As you move through each keypoint, you'll be moving in the direction of that vector, and the longer the vector is the faster you'll be moving. (If the vector is zero then your cubic Bezier degenerates into a simple straight-line path.)
Choosing that vector so that everything looks nice is highly nontrivial, but you probably aren't really being asked to do that at this stage. So something pretty simple will probably be good enough. You might, e.g., take the vector to be proportional to (next keypoint minus previous keypoint). You'll need to do something a bit different at the start and end of your path if you do that.
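Here is one way the scheme from the last two paragraphs could be sketched in Python. It is only an illustration under assumptions: keypoints are plain (x, y) tuples, the tangent vector at each keypoint is taken proportional to (next keypoint minus previous keypoint) with an arbitrary `scale` of 1/6, and the ends simply reuse the end keypoint as its own neighbor. The function names are made up.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate one cubic Bezier segment at t in [0, 1]."""
    u = 1.0 - t
    return tuple(
        u * u * u * a + 3 * u * u * t * b + 3 * u * t * t * c + t * t * t * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def tangents(keys, scale=1.0 / 6.0):
    """One vector per keypoint, proportional to (next - previous); one-sided at the ends."""
    n = len(keys)
    result = []
    for i in range(n):
        prev = keys[max(i - 1, 0)]
        nxt = keys[min(i + 1, n - 1)]
        result.append(tuple(scale * (b - a) for a, b in zip(prev, nxt)))
    return result

def interpolate(keys, steps_per_segment):
    """Chain one cubic Bezier per pair of keypoints and sample it evenly in t."""
    v = tangents(keys)
    frames = []
    for i in range(len(keys) - 1):
        p0, p3 = keys[i], keys[i + 1]
        p1 = tuple(a + b for a, b in zip(p0, v[i]))        # control point after keypoint i
        p2 = tuple(a - b for a, b in zip(p3, v[i + 1]))    # control point before keypoint i+1
        for s in range(steps_per_segment):
            frames.append(cubic_bezier(p0, p1, p2, p3, s / steps_per_segment))
    frames.append(tuple(float(c) for c in keys[-1]))
    return frames

path = interpolate([(0, 0), (1, 2), (3, 2), (4, 0)], 10)
print(len(path), path[0], path[10], path[-1])
```

Because the control points on either side of a keypoint are the keypoint plus and minus the same vector, the path passes through every keypoint without a sudden change of direction. A longer vector (larger `scale`) means faster motion through the keypoint.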
Finally got what I needed! Here's what I did:
private void interpolate() {
float t = 0;
float x,y,b;
for(int f =0;f<frames.length;f++) {
x=0;
y=0;
for(int i = 0; i<keyFrames.length; i++) {
b = Bint(i,keyFrames.length-1,map(t,0,time,0,1));
x += b*keyFrames[i].x;
y += b*keyFrames[i].y;
}
frames[f] = new Frame(x,y);
t+=partialTime;
}
}
private void createInterpolationData() {
time = keyFrames[keyFrames.length-1].time -
keyFrames[0].time;
noOfFrames = 60*time;
partialTime = time/noOfFrames;
frames = new Frame[ceil(noOfFrames)];
}
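For comparison, the same single high-degree Bezier evaluation can be sketched in a few lines of Python. This is not a translation of the code above: the names `bernstein`, `bezier_point` and `sample` are made up, control points are (x, y) tuples, and `math.comb` (Python 3.8+) replaces the factorial calls. It shows the two things that had to change from the original attempt: the degree is the number of control points minus one, and every frame sums over all control points with t running from 0 to 1.

```python
from math import comb

def bernstein(i, n, t):
    """Bernstein basis polynomial B(i, n) evaluated at t."""
    return comb(n, i) * (1.0 - t) ** (n - i) * t ** i

def bezier_point(points, t):
    """Point on the Bezier curve defined by all control points, t in [0, 1]."""
    n = len(points) - 1   # degree = number of control points - 1
    x = sum(bernstein(i, n, t) * p[0] for i, p in enumerate(points))
    y = sum(bernstein(i, n, t) * p[1] for i, p in enumerate(points))
    return x, y

def sample(points, count):
    """Return `count` evenly spaced frames (count >= 2), including both endpoints."""
    return [bezier_point(points, f / (count - 1)) for f in range(count)]

control = [(0, 0), (1, 2), (2, 0)]
print(sample(control, 3))   # [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
```

As noted in the earlier answer, the curve passes through the first and last control points only; the middle point just pulls the curve toward it.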
