I am trying to port "Ray Tracing in One Weekend" to a Metal compute shader, and I am seeing these stripe artifacts in my project:
Is it because my random number generator does not work well?
Does anyone have a clue?
// Implementation based on https://www.pcg-random.org/
typedef struct { uint64_t state; uint64_t inc; } pcg32_random_t;
uint32_t pcg32_random_r(thread pcg32_random_t* rng)
{
uint64_t oldstate = rng->state;
rng->state = oldstate * 6364136223846793005ULL + rng->inc;
uint32_t xorshifted = ((oldstate >> 18u) ^ oldstate) >> 27u;
uint32_t rot = oldstate >> 59u;
return (xorshifted >> rot) | (xorshifted << ((-rot) & 31));
}
void pcg32_srandom_r(thread pcg32_random_t* rng, uint64_t initstate, uint64_t initseq)
{
rng->state = 0U;
rng->inc = (initseq << 1u) | 1u;
pcg32_random_r(rng);
rng->state += initstate;
pcg32_random_r(rng);
}
// Generates a float in the range 0 to 1
float randomF(thread pcg32_random_t* rng)
{
//return pcg32_random_r(rng)/float(UINT_MAX);
return ldexp(float(pcg32_random_r(rng)), -32);
}
// Generates a float between x_min and x_max
float randomRange(thread pcg32_random_t* rng, float x_min, float x_max){
return randomF(rng) * (x_max - x_min) + x_min;
}
I found this link. It says that the primary ray's hit point ends up slightly above or below the sphere's surface because of floating-point precision error. It is a z-fighting problem:
Seeing these "circles around the image center" artifacts is almost always a dead giveaway for z-fighting (rays along any such "circle" all have the same distance to any flat object you're looking at, so they either all round up or all round down, giving you this artifact).
This z-fighting then translates into the artifact because sometimes all the rays on such a circle end up inside the sphere (meaning their shadow rays get self-occluded by the sphere), and sometimes all end up outside (where they do what they should). What you want to do is offset the origin of the shadow rays a tiny bit (say, 1e-3f) along the normal direction at the hit point.
You may also want to read Carsten Waechter's article in Ray Tracing Gems 1. It is written for triangles, but it explains the problem and a potential solution very well.
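The offset described above can be sketched in plain C++ (rather than Metal) as follows. This is only a minimal illustration: `Vec3` and `offsetRayOrigin` are hypothetical names that do not come from the question's code, and the normal is assumed to be unit length and to point away from the surface.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Hypothetical helper: moves the origin of a secondary (shadow/scatter) ray
// slightly off the surface so it cannot immediately re-hit that surface.
// `normal` is assumed to be unit length and to face away from the surface.
inline Vec3 offsetRayOrigin(Vec3 hitPoint, Vec3 normal, float epsilon = 1e-3f) {
    assert(std::fabs(dot(normal, normal) - 1.0f) < 1e-3f);
    return hitPoint + normal * epsilon;
}
```

If I remember the book correctly, it gets a similar effect by starting the valid hit interval at a small t_min (0.001) instead of 0, which may be the simpler change in a direct port.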
I have been using glm to help build a software rasterizer for self education. In my camera class I am using glm::lookat() to create my view matrix and glm::perspective() to create my perspective matrix.
I seem to be getting what I expect for my left, right, top and bottom clipping planes. However, I seem to be either doing something wrong for my near/far planes or there is an error in my understanding. I have reached a point at which my "google-fu" has failed me.
Operating under the assumption that I am correctly extracting clip planes from my glm::perspective matrix, and using the general plane equation:
aX+bY+cZ+d = 0
I am getting strange d or "offset" values for my zNear and zFar planes.
It is my understanding that the d value is the amount by which I would shift/translate the point P0 of a plane along the normal vector.
They are 0.200200200 and -0.200200200 respectively. However, my normals are correctly oriented at +1.0f and -1.0f along the z-axis, as expected for a plane perpendicular to my z basis vector.
So when testing a point such as (0, 0, -5) in world space against these planes, it is transformed by my view matrix to:
(0, 0, 5.81181192)
so when testing it against these planes in a clip chain, said example vertex would be culled.
Here is the start of a camera class establishing the relevant matrices:
static constexpr glm::vec3 UPvec(0.f, 1.f, 0.f);
static constexpr auto zFar = 100.f;
static constexpr auto zNear = 0.1f;
Camera::Camera(glm::vec3 eye, glm::vec3 center, float fovY, float w, float h) :
viewMatrix{ glm::lookAt(eye, center, UPvec) },
perspectiveMatrix{ glm::perspective(glm::radians<float>(fovY), w/h, zNear, zFar) },
frustumLeftPlane {setPlane(0, 1)},
frustumRighPlane {setPlane(0, 0)},
frustumBottomPlane {setPlane(1, 1)},
frustumTopPlane {setPlane(1, 0)},
frstumNearPlane {setPlane(2, 0)},
frustumFarPlane {setPlane(2, 1)},
The frustum objects are based off the following struct:
struct Plane
{
glm::vec4 normal;
float offset;
};
I have extracted the 6 clipping planes from the perspective matrix as below:
Plane Camera::setPlane(const int& row, const bool& sign)
{
float temp[4]{};
Plane plane{};
if (sign == 0)
{
for (int i = 0; i < 4; ++i)
{
temp[i] = perspectiveMatrix[i][3] + perspectiveMatrix[i][row];
}
}
else
{
for (int i = 0; i < 4; ++i)
{
temp[i] = perspectiveMatrix[i][3] - perspectiveMatrix[i][row];
}
}
plane.normal.x = temp[0];
plane.normal.y = temp[1];
plane.normal.z = temp[2];
plane.normal.w = 0.f;
plane.offset = temp[3];
plane.normal = glm::normalize(plane.normal);
return plane;
}
Any help would be appreciated, as now I am at a loss.
Many thanks.
The d parameter of a plane equation describes how much the plane is offset from the origin along the plane normal. This also takes into account the length of the normal.
One can't just normalize the normal without also adjusting the d parameter, since normalizing changes the length of the normal. If you want to normalize a plane equation, you also have to divide the d coefficient by that same length:
float normalLength = sqrt(temp[0] * temp[0] + temp[1] * temp[1] + temp[2] * temp[2]);
plane.normal.x = temp[0] / normalLength;
plane.normal.y = temp[1] / normalLength;
plane.normal.z = temp[2] / normalLength;
plane.normal.w = 0.f;
plane.offset = temp[3] / normalLength;
Side note 1: Usually, one would store the offset of a plane equation in the w-coordinate of a vec4 instead of a separate variable. The reason is that the typical operation you perform with it is a point-to-plane distance check like dist = n * x - d (for a given point x, normal n, offset d; * is the dot product), which can then be written as dist = [n, d] * [x, -1]. (With the plane written as aX+bY+cZ+d = 0, as in the question, the sign of d flips: dist = n * x + d = [n, d] * [x, 1].)
Side note 2: Most software and hardware rasterizers perform clipping after the projection step, since it's cheaper and easier to implement.
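To make the normalization and the distance check concrete, here is a small self-contained C++ sketch that deliberately avoids glm. `Plane4`, `normalizePlane` and `signedDistance` are made-up names for illustration, and the plane is stored in the question's convention a*x + b*y + c*z + d = 0.

```cpp
#include <cassert>
#include <cmath>

// Plane a*x + b*y + c*z + d = 0, stored as four coefficients (hypothetical type).
struct Plane4 { float a, b, c, d; };

// Divides all four coefficients by the length of the normal (a, b, c),
// so that the normal becomes unit length and d becomes a true distance.
inline Plane4 normalizePlane(Plane4 p) {
    const float len = std::sqrt(p.a * p.a + p.b * p.b + p.c * p.c);
    assert(len > 0.0f);
    return {p.a / len, p.b / len, p.c / len, p.d / len};
}

// Signed distance from the point (x, y, z) to a normalized plane: [n, d] . [x, 1].
inline float signedDistance(Plane4 p, float x, float y, float z) {
    return p.a * x + p.b * y + p.c * z + p.d;
}
```

For the numbers in the question: if the near-plane normal has a length of about 2.002 before normalization (which is what I would expect for zNear = 0.1 and zFar = 100 under glm's default right-handed, [-1, 1] depth convention, but that is an assumption on my part), then 0.2002002 / 2.002002 gives 0.1, i.e. zNear.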
I'm looking for a way to estimate the distance to the boundary of the Mandelbrot set from a point inside of it for use in a GLSL shader.
This page links to various resources online touching on the subject of interior distance estimation such as the underlying mathematical formula, a Haskell implementation, some other blogs, forum posts and a C99 implementation, but I got the impression that they are all either very complex to implement or very computationally heavy to run.
After many hours of trying, I managed to make this code that runs in Shadertoy:
void mainImage( out vec4 fragColor, in vec2 fragCoord ) {
float zoom = 1.;
vec2 c = vec2(-0.75, 0.0) + zoom * (2.*fragCoord-iResolution.xy)/iResolution.y;
vec2 z = c;
float ar = 0.; // average of reciprocals
float i;
for (i = 0.; i < 1000.; i++) {
ar += 1./length(z);
z = vec2(z.x * z.x - z.y * z.y, 2.0 * z.x * z.y) + c;
}
ar = ar / i;
fragColor = vec4(vec3(2. / ar), 1.0);
}
It does produce a gradient in every bulb, but it is clear that it's not usable as a distance estimator by itself because values in smaller bulbs have inconsistent magnitude (brightness) compared to bigger bulbs. So it's clear that a parameter is missing but I don't know what it is.
I don't require a perfect solution nor one that converges into a perfect solution like in this image.
Something that at least guarantees a lower bound is plenty.
My bet is that 1./length(z) is hitting the precision limit of float. Try using double and dvec2 instead of float and vec2 to see whether it makes any difference. If it does, I would ignore values of length(z) that are too small.
Alternatively, you can render just the boundary into a texture in one pass and then scan the neighbors in all directions until the boundary is found, returning the ray length (this may require some morphology operators before it is safe to use).
This can be sped up with another pass where you "flood" fill the texture with incrementing distance until it is filled (better done on the CPU side, as you need R/W access to the same texture). It is similar to A* filling; however, your precision will be limited by the texture resolution.
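A rough CPU-side sketch of that flood fill, written in C++ rather than GLSL, could look like this. It is a plain breadth-first fill over a boolean boundary mask; the function name and the grid representation are assumptions for illustration, and the result is a step count in pixels (4-neighbour steps), not a Euclidean distance.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: boundary[y][x] is true on boundary pixels.
// Returns, per pixel, the number of 4-neighbour steps to the nearest
// boundary pixel, or -1 if the mask contains no boundary pixel at all.
std::vector<std::vector<int>> floodDistance(const std::vector<std::vector<bool>>& boundary) {
    const int h = static_cast<int>(boundary.size());
    const int w = h > 0 ? static_cast<int>(boundary[0].size()) : 0;
    std::vector<std::vector<int>> dist(h, std::vector<int>(w, -1));
    std::queue<std::pair<int, int>> open;  // (x, y) pixels whose distance is known

    // Seed the fill with every boundary pixel at distance 0.
    for (int y = 0; y < h; ++y) {
        assert(static_cast<int>(boundary[y].size()) == w);  // rectangular grid
        for (int x = 0; x < w; ++x) {
            if (boundary[y][x]) { dist[y][x] = 0; open.push({x, y}); }
        }
    }

    // Grow outwards one step at a time; the first visit is the shortest path.
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!open.empty()) {
        const std::pair<int, int> p = open.front();
        open.pop();
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k];
            const int ny = p.second + dy[k];
            if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
            if (dist[ny][nx] != -1) continue;  // already filled
            dist[ny][nx] = dist[p.second][p.first] + 1;
            open.push({nx, ny});
        }
    }
    return dist;
}
```

The resulting grid could then be uploaded as a texture and sampled in the shader; multiplying by the pixel size in the complex plane would turn the step count into an approximate distance.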
When I ported my Mandelbrot shader from the link above to your computation, converted it to doubles, and added the threshold:
// Fragment
#version 450 core
uniform dvec2 p0=vec2(0.0,0.0); // mouse position <-1,+1>
uniform double zoom=1.000; // zoom [-]
uniform int n=100; // iterations [-]
in smooth vec2 p32;
out vec4 col;
vec3 spectral_color(float l) // RGB <0,1> <- lambda l <400,700> [nm]
{
float t; vec3 c=vec3(0.0,0.0,0.0);
if ((l>=400.0)&&(l<410.0)) { t=(l-400.0)/(410.0-400.0); c.r= +(0.33*t)-(0.20*t*t); }
else if ((l>=410.0)&&(l<475.0)) { t=(l-410.0)/(475.0-410.0); c.r=0.14 -(0.13*t*t); }
else if ((l>=545.0)&&(l<595.0)) { t=(l-545.0)/(595.0-545.0); c.r= +(1.98*t)-( t*t); }
else if ((l>=595.0)&&(l<650.0)) { t=(l-595.0)/(650.0-595.0); c.r=0.98+(0.06*t)-(0.40*t*t); }
else if ((l>=650.0)&&(l<700.0)) { t=(l-650.0)/(700.0-650.0); c.r=0.65-(0.84*t)+(0.20*t*t); }
if ((l>=415.0)&&(l<475.0)) { t=(l-415.0)/(475.0-415.0); c.g= +(0.80*t*t); }
else if ((l>=475.0)&&(l<590.0)) { t=(l-475.0)/(590.0-475.0); c.g=0.8 +(0.76*t)-(0.80*t*t); }
else if ((l>=585.0)&&(l<639.0)) { t=(l-585.0)/(639.0-585.0); c.g=0.84-(0.84*t) ; }
if ((l>=400.0)&&(l<475.0)) { t=(l-400.0)/(475.0-400.0); c.b= +(2.20*t)-(1.50*t*t); }
else if ((l>=475.0)&&(l<560.0)) { t=(l-475.0)/(560.0-475.0); c.b=0.7 -( t)+(0.30*t*t); }
return c;
}
void main()
{
int i,j;
dvec2 pp,p;
double x,y,q,xx,yy,mu,cx,cy;
p=dvec2(p32);
pp=(p/zoom)-p0; // y (-1.0, 1.0)
pp.x-=0.5; // x (-1.5, 0.5)
cx=pp.x; // normal
cy=pp.y;
/*
// single pass mandelbrot integer escape
for (x=0.0,y=0.0,xx=0.0,yy=0.0,i=0;(i<n)&&(xx+yy<4.0);i++)
{
q=xx-yy+cx;
y=(2.0*x*y)+cy;
x=q;
xx=x*x;
yy=y*y;
}
float f=float(i)/float(n);
f=pow(f,0.2);
col=vec4(spectral_color(400.0+(300.0*f)),1.0);
*/
// distance to boundary
double ar=0.0,aa,nn=0.0; // *** this is what I added
for (x=0.0,y=0.0,xx=0.0,yy=0.0,i=0;(i<n)&&(xx+yy<4.0);i++)
{
aa=length(dvec2(x,y)); // *** this is what I added
if (aa>1e-3){ ar+=1.0/aa; nn++; } // *** this is what I added
q=xx-yy+cx;
y=(2.0*x*y)+cy;
x=q;
xx=x*x;
yy=y*y;
}
ar=ar/nn; // *** this is what I added
col=vec4(vec3(1.0-(2.0/ar)),1.0); // *** this is what I added
}
I got these outputs:
Just look for the // *** this is what I added comments in the code; they mark what was added to the standard Mandelbrot rendering to render distance instead. PS: my (x,y) is your z and (cx,cy) is your c.
Anyway, the distance is still highly nonlinear and depends on the position.
[Edit1] non-isotropic scale
The black dot is the size of the threshold; you can lower it to 1e-20 ... Now I added level lines to show the distribution of the distance scale (as I did not know how non-isotropic and nonlinear it is...). Here is the output:
And the coloring part of the fragment shader (after the for loop):
ar=1.0-(2.0*nn/ar);
aa=10.0*ar; // 10 level lines per unit
aa-=floor(aa);
if (abs(aa)<0.05) col=vec4(0.0,1.0,0.0,1.0); // width and color of level line
else col=vec4(ar,ar,ar,1.0);
As you can see, it's not very parallel to the border but still locally "constant" (the level lines are equidistant to each other within a local feature of the fractal), so if the gradient (derivative) is used, the result will be just a very rough estimate (but it should work). If that is enough, what you should do is:
compute the nonlinear distance for the queried position and for a few points at distance d from it in "all" directions.
pick the neighbor that has the biggest change in distance relative to the original point.
rescale the estimated distances so that their difference gives you d, then use the first rescaled distance as the output.
When put into fragment code (using 8 neighbors):
// Fragment
#version 450 core
uniform dvec2 p0=vec2(0.0,0.0); // mouse position <-1,+1>
uniform double zoom=1.000; // zoom [-]
uniform int n=100; // iterations [-]
in smooth vec2 p32;
out vec4 col;
double mandelbrot_distance(double cx,double cy)
{
// distance to boundary
int i,j;
double x,y,q,xx,yy,ar=0.0,aa,nn=0.0;
for (x=0.0,y=0.0,xx=0.0,yy=0.0,i=0;(i<n)&&(xx+yy<4.0);i++)
{
aa=length(dvec2(x,y));
if (aa>1e-20){ ar+=1.0/aa; nn++; }
q=xx-yy+cx;
y=(2.0*x*y)+cy;
x=q;
xx=x*x;
yy=y*y;
}
return 1.0-(2.0*nn/ar);
}
void main()
{
dvec2 pp,p;
double cx,cy,d,dd,d0,d1,e;
p=dvec2(p32);
pp=(p/zoom)-p0; // y (-1.0, 1.0)
pp.x-=0.5; // x (-1.5, 0.5)
cx=pp.x; // normal
cy=pp.y;
d =0.01/zoom; // normalization distance
e =sqrt(0.5)*d;
dd=mandelbrot_distance(cx,cy);
if (dd>0.0)
{
d0=mandelbrot_distance(cx-d,cy ); if (d0>0.0) d0=abs(d0-dd);
d1=mandelbrot_distance(cx+d,cy ); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
d1=mandelbrot_distance(cx ,cy-d); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
d1=mandelbrot_distance(cx ,cy+d); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
d1=mandelbrot_distance(cx-e,cy-e); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
d1=mandelbrot_distance(cx+e,cy-e); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
d1=mandelbrot_distance(cx-e,cy+e); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
d1=mandelbrot_distance(cx+e,cy+e); if (d1>0.0){ d1=abs(d1-dd); if (d0<d1) d0=d1; }
dd*=d/d0;
}
dd*=zoom; // just for visualization of small details real distance should not be scaled by this
col=vec4(dd,dd,dd,1.0);
}
Here is the result:
As you can see, it's now much more correct (but very close to the border it is inaccurate due to the non-isotropy mentioned above). The 8 neighbors produce the pattern of 8 diagonal-like lines in the circular blobs. If you want to get rid of them, you should scan the whole circle around the position instead of just 8 points.
Also, there are still some white dots (they are not accuracy related). I think they are cases where the selected neighbor at distance d is across the Mandelbrot edge, in a different blob than the original. That could be filtered out ... (the distance change at d/2 in the same direction should be half; if not, you are in a different blob).
However, even 8 neighbors are pretty slow. So for more accuracy I would recommend going for the 2-pass "ray casting" method instead.
I'm writing an iOS vertex shader that "flattens" an MC world in the x direction if a change in the z direction is detected, and vice versa (the xz plane is perpendicular to height). I have several shaders that warp the world just fine, but writing this pseudo movement detection hasn't worked. I know conditionals are costly. I was comparing the position to a number and it worked:
if (worldPos.x != 4.) {
worldPos.z = 0.;
}
But comparing position to a static call of the position doesn't. So far I've tried assigning constant floats to the x and z components, uniform floats, and a POS4 uniform, but no success. I have a feeling the conditionals fail because of a data type problem? It would be easier to debug if PE version displayed coord like PC. Thanks for any/all help! Current code:
uniform POS4 CHUNK_ORIGIN_AND_SCALE;
attribute POS4 POSITION;
void main()
{
POS4 worldPos;
worldPos.xyz = (POSITION.xyz * CHUNK_ORIGIN_AND_SCALE.w) + CHUNK_ORIGIN_AND_SCALE.xyz;
worldPos.w = 1.;
const float staticPosx = worldPos.x;
const float staticPosz = worldPos.z;
if (worldPos.x != staticPosx) {
worldPos.z = 0.;
staticPosx = worldPos.x;
}
if (worldPos.z != staticPosz) {
worldPos.x = 0.;
staticPosz = worldPos.z;
}
etc.
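I didn't try it in GLSL, but as far as I know GLSL ES 1.00 requires the initializer of a const variable to be a constant expression, and a const can never be assigned to afterwards (as staticPosx = worldPos.x; tries to do). It is the same reason why, at file scope in C, you can't write: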
int a=5;
const int b=a;
Hello I am new to XNA and trying to develop a game prototype where the character moves from one location to another using mouse clicks.
I have a Rectangle representing the current position. I get the target location as a Vector2 using player mouse input. I extract the direction vector from the source to the target by Vector2 subtraction.
//the cursor's coordinates should be the center of the target position
float x = mouseState.X - this.position.Width / 2;
float y = mouseState.Y - this.position.Height / 2;
Vector2 targetVector = new Vector2(x, y);
Vector2 dir = (targetVector - this.Center); //vector from source center to target
//center
I represent the world using a tile map, every cell is 32x32 pixels.
int tileMap[,];
What I want to do is check whether the direction vector above passes through any blue tiles on the map. A blue tile is equal to 1 on the map.
I am not sure how to do this. I thought about using linear line equation and trigonometric formulas but I'm finding it hard to implement. I've tried normalizing the vector and multiplying by 32 to get 32 pixel length intervals along the path of the vector but it doesn't seem to work. Can anyone tell me if there's anything wrong in it, or another way to solve this problem? Thanks
//collision with blue wall. Returns point of impact
private bool CheckCollisionWithBlue(Vector2 dir)
{
int num = Worldmap.size; //32
int i = 0;
int intervals = (int)(dir.Length() / num + 1); //the number of 32-pixel length
//inervals on the vector, with an edge
Vector2 unit = Vector2.Normalize(dir) * num; //a vector of length 32 in the same
//direction as dir.
Vector2 v = unit;
while (i <= intervals & false)
{
int x = (int)(v.X / num);
int y = (int)(v.Y / num);
int type = Worldmap.getType(y, x);
if (type == 1) //blue tile
{
return true;
}
else
{
i++;
v = unit * i;
}
}
return false;
}
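You need the initial position too, not only the direction.
Maybe you need more resolution (a step smaller than one tile).
Remove the "& false" from the loop condition; with it the loop body never runs.
The calculation of the next position is more complicated than it needs to be.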
private bool CheckCollisionWithBlue(Vector2 source, Vector2 dir)
{
int num = 8; // pixel blocks of 8
int i = 0;
int intervals = (int)(dir.Length() / num);
Vector2 step = Vector2.Normalize(dir)*num;
while (i <= intervals)
{
int x = (int)(source.X / Worldmap.size); // pixel -> tile index (Worldmap.size is 32 in the question)
int y = (int)(source.Y / Worldmap.size);
int type = Worldmap.getType(y, x);
if (type == 1) //blue tile
{
return true;
}
else
{
i++;
source+=step;
}
}
return false;
}
This should improve your code somewhat, but it may be inaccurate... it depends on what you are trying to do.
You may find Bresenham's line algorithm interesting: http://en.wikipedia.org/wiki/Bresenham's_line_algorithm
You should realize that you are doing a line collision, not a volume collision; if the ship, character, or whatever sits at the source position has some extent, you may have to add more calculations.
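If fixed-size sampling turns out to skip tiles, one alternative is to step from tile border to tile border so that every tile the segment touches gets checked. The sketch below does that in C++ (not C#/XNA). To be clear, this is a grid-stepping (DDA-style) traversal, which is related to but not the same as Bresenham's algorithm. `TileMap`, `isBlue` and `segmentHitsBlue` are hypothetical names, and tiles outside the map are assumed to be empty.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical tile map: map[row][col], where 1 marks a blue tile.
using TileMap = std::vector<std::vector<int>>;

// Tiles outside the map are treated as empty (an assumption of this sketch).
inline bool isBlue(const TileMap& map, int tx, int ty) {
    if (ty < 0 || ty >= static_cast<int>(map.size())) return false;
    if (tx < 0 || tx >= static_cast<int>(map[ty].size())) return false;
    return map[ty][tx] == 1;
}

// Returns true if the segment (x0, y0) -> (x1, y1), given in pixels,
// passes through any blue tile, including the start and end tiles.
inline bool segmentHitsBlue(const TileMap& map, float tileSize,
                            float x0, float y0, float x1, float y1) {
    assert(tileSize > 0.0f);
    int tx = static_cast<int>(std::floor(x0 / tileSize));
    int ty = static_cast<int>(std::floor(y0 / tileSize));
    const int endX = static_cast<int>(std::floor(x1 / tileSize));
    const int endY = static_cast<int>(std::floor(y1 / tileSize));

    const float dx = x1 - x0;
    const float dy = y1 - y0;
    const int stepX = dx > 0.0f ? 1 : -1;
    const int stepY = dy > 0.0f ? 1 : -1;
    const float inf = std::numeric_limits<float>::infinity();

    // t runs from 0 at the start point to 1 at the end point.
    // tMax*:   t at which the segment reaches the next tile border on that axis.
    // tDelta*: how much t grows while crossing one whole tile on that axis.
    const float borderX = (tx + (stepX > 0 ? 1 : 0)) * tileSize;
    const float borderY = (ty + (stepY > 0 ? 1 : 0)) * tileSize;
    float tMaxX = dx != 0.0f ? (borderX - x0) / dx : inf;
    float tMaxY = dy != 0.0f ? (borderY - y0) / dy : inf;
    const float tDeltaX = dx != 0.0f ? tileSize / std::fabs(dx) : inf;
    const float tDeltaY = dy != 0.0f ? tileSize / std::fabs(dy) : inf;

    // Number of tile borders between the start tile and the end tile.
    int remaining = std::abs(endX - tx) + std::abs(endY - ty);
    for (;;) {
        if (isBlue(map, tx, ty)) return true;
        if (remaining-- == 0) return false;
        // Cross whichever border comes first along the segment.
        if (tMaxX < tMaxY) { tx += stepX; tMaxX += tDeltaX; }
        else               { ty += stepY; tMaxY += tDeltaY; }
    }
}
```

With 32-pixel tiles this would be called with the character's center as (x0, y0) and the click position as (x1, y1). Segments that pass exactly through a tile corner pick one of the two adjacent tiles, so a character with some width would still need extra checks.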
I am trying some experiments in fractal rendering with DirectX11 Compute Shaders.
The provided example runs on a FeatureLevel_10 device.
My RWStructuredBuffer output buffer has a data format of R32G32B32A32_FLOAT.
The problem is that when writing to the buffer, it seems that only the alpha (w) value gets written and nothing else.
Here is the shader code:
struct BufType
{
float4 value;
};
cbuffer ScreenConstants : register(b0)
{
float2 ScreenDimensions;
float2 Padding;
};
RWStructuredBuffer<BufType> BufferOut : register(u0);
[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
uint index = DTid.y * ScreenDimensions.x + DTid.x;
float minRe = -2.0f;
float maxRe = 1.0f;
float minIm = -1.2;
float maxIm = minIm + ( maxRe - minRe ) * ScreenDimensions.y / ScreenDimensions.x;
float reFactor = (maxRe - minRe ) / (ScreenDimensions.x - 1.0f);
float imFactor = (maxIm - minIm ) / (ScreenDimensions.y - 1.0f);
float cim = maxIm - DTid.y * imFactor;
uint maxIterations = 30;
float cre = minRe + DTid.x * reFactor;
float zre = cre;
float zim = cim;
bool isInside = true;
uint iterationsRun = 0;
for( uint n = 0; n < maxIterations; ++n )
{
float zre2 = zre * zre;
float zim2 = zim * zim;
if ( zre2 + zim2 > 4.0f )
{
isInside = false;
iterationsRun = n;
}
zim = 2 * zre * zim + cim;
zre = zre2 - zim2 + cre;
}
if ( isInside )
{
BufferOut[index].value = float4(1.0f,0.0f,0.0f,1.0f);
}
}
The code actually produces in a sense the correct result ( 2D Mandelbrot set ) but it seems somehow only the alpha value is touched and nothing else is written, although the pixels inside the set should be colored red... ( the image is black & white )
Anybody has a clue what's going on here ?
After some fiddling around I found the problem.
I have not found any documentation from MS mentioning this, so it could also be an Nvidia-specific driver issue.
Apparently you are only allowed to write ONCE per compute shader invocation to the same element in a RWStructuredBuffer. And you also HAVE to write ONCE.
I changed the code to accumulate the correct color in a local variable and now write it only once, at the end of the shader.
Everything works perfectly that way.
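The "accumulate locally, write once" structure looks roughly like this when sketched as CPU-side C++ (not HLSL). The names `Float4`, `mandelbrotColor` and `renderPixel` are made up for illustration, and black for escaped points is an assumption, since the original shader did not write anything for them.

```cpp
#include <cassert>
#include <vector>

struct Float4 { float r, g, b, a; };

// Computes the color of one pixel in a local variable; the iteration
// mirrors the Mandelbrot loop in the question, but stops at the first escape.
inline Float4 mandelbrotColor(float cre, float cim, unsigned maxIterations) {
    assert(maxIterations > 0);
    float zre = cre;
    float zim = cim;
    Float4 color = {1.0f, 0.0f, 0.0f, 1.0f};        // start as "inside": red
    for (unsigned n = 0; n < maxIterations; ++n) {
        const float zre2 = zre * zre;
        const float zim2 = zim * zim;
        if (zre2 + zim2 > 4.0f) {
            color = {0.0f, 0.0f, 0.0f, 1.0f};       // escaped: black (assumed)
            break;
        }
        zim = 2.0f * zre * zim + cim;
        zre = zre2 - zim2 + cre;
    }
    return color;
}

// The output element is written exactly once, on every code path.
inline void renderPixel(std::vector<Float4>& out, unsigned index, float cre, float cim) {
    out[index] = mandelbrotColor(cre, cim, 30);
}
```

In the shader, the same shape means a single BufferOut[index].value = color; as the last statement, with no write hidden inside a branch.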
I'm not sure but, shouldn't it be for BufferOut decl:
RWStructuredBuffer<BufType> BufferOut : register(u0);
instead of :
RWStructuredBuffer BufferOut : register(u0);
If you are only using a float4 write target, why not use just:
RWBuffer<float4> BufferOut : register (u0);
Maybe this could help.
After playing around again today, I ran into the same problem once more.
The following code produced all white output:
[numthreads(1, 1, 1)]
void Main( uint3 dispatchId : SV_DispatchThreadID )
{
float4 color = float4(1.0f,0.0f,0.0f,1.0f);
WriteResult(dispatchId,color);
}
The WriteResult method is a utility method from my hlsl standard library.
Long story short: after I upgraded from driver version 192 to 195 (beta), the problem went away.
It seems the drivers still have some definite problems in compute shader support, so beware.
From what I've seen, compute shaders are only useful if you need a more general computational model than the traditional pixel shader, or if you can load data and then share it between threads in fast shared memory. I'm fairly sure you would get better performance with a pixel shader for the Mandelbrot shader.
On my setup (Win7, Feb 10 DX SDK, GTX480) my compute shaders have a punishing setup time of over 0.2-0.3ms (binding an SRV and a UAV and then calling dispatch()).
If you do a PS implementation, please post your experiences.
I have no direct experience with DX compute shaders but...
Why are you setting alpha = 1.0?
IIRC, that makes the pixel 100% transparent, so your inside pixels are transparent red, and show up as whatever color was drawn behind them.
When alpha = 1.0, the RGB components are never used.