DirectX 11 Compute Shader - not writing all values - directx-11

I am trying some experiments in fractal rendering with DirectX11 Compute Shaders.
The provided example runs on a FeatureLevel_10 device.
My RWStructuredBuffer output buffer has a data format of R32G32B32A32_FLOAT.
The problem is that when writing to the buffer, it seems that only the alpha (w) value gets written and nothing else.
Here is the shader code:
struct BufType
{
float4 value;
};
cbuffer ScreenConstants : register(b0)
{
float2 ScreenDimensions;
float2 Padding;
};
RWStructuredBuffer<BufType> BufferOut : register(u0);
[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
uint index = DTid.y * ScreenDimensions.x + DTid.x;
float minRe = -2.0f;
float maxRe = 1.0f;
float minIm = -1.2;
float maxIm = minIm + ( maxRe - minRe ) * ScreenDimensions.y / ScreenDimensions.x;
float reFactor = (maxRe - minRe ) / (ScreenDimensions.x - 1.0f);
float imFactor = (maxIm - minIm ) / (ScreenDimensions.y - 1.0f);
float cim = maxIm - DTid.y * imFactor;
uint maxIterations = 30;
float cre = minRe + DTid.x * reFactor;
float zre = cre;
float zim = cim;
bool isInside = true;
uint iterationsRun = 0;
for( uint n = 0; n < maxIterations; ++n )
{
float zre2 = zre * zre;
float zim2 = zim * zim;
if ( zre2 + zim2 > 4.0f )
{
isInside = false;
iterationsRun = n;
}
zim = 2 * zre * zim + cim;
zre = zre2 - zim2 + cre;
}
if ( isInside )
{
BufferOut[index].value = float4(1.0f,0.0f,0.0f,1.0f);
}
}
The code does, in a sense, produce the correct result (a 2D Mandelbrot set), but it seems that only the alpha value is touched and nothing else is written, although the pixels inside the set should be colored red (the image is black and white).
Does anybody have a clue what's going on here?

After some fiddling around I found the problem.
I have not found any documentation from MS mentioning this, so it could also be an Nvidia-specific driver issue.
Apparently you are only allowed to write ONCE per compute shader invocation to the same element in a RWStructuredBuffer, and you also HAVE to write once.
I changed the code to accumulate the correct color in a local variable and now write it only once, at the end of the shader.
Everything works perfectly that way.
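The restructuring described above can be sketched on the CPU in C++. This is only an illustration of the "decide in a local variable, store exactly once" pattern, not the asker's actual shader: the names Float4, mandelbrotColor and renderMandelbrot are hypothetical, and the early break and the opaque black for outside pixels are assumptions added here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Float4 { float x, y, z, w; };

// CPU stand-in for one shader invocation: the color is decided in local
// variables and returned, so the caller stores it exactly once.
Float4 mandelbrotColor(uint32_t px, uint32_t py, uint32_t width, uint32_t height,
                       uint32_t maxIterations = 30)
{
    assert(width > 1 && height > 1);
    const float minRe = -2.0f, maxRe = 1.0f, minIm = -1.2f;
    const float maxIm = minIm + (maxRe - minRe) * float(height) / float(width);
    const float cre = minRe + float(px) * (maxRe - minRe) / float(width - 1);
    const float cim = maxIm - float(py) * (maxIm - minIm) / float(height - 1);

    float zre = cre, zim = cim;
    bool isInside = true;
    for (uint32_t n = 0; n < maxIterations; ++n) {
        const float zre2 = zre * zre, zim2 = zim * zim;
        if (zre2 + zim2 > 4.0f) { isInside = false; break; } // escaped
        zim = 2.0f * zre * zim + cim;
        zre = zre2 - zim2 + cre;
    }
    // Every path yields a value: red inside the set, opaque black outside.
    return isInside ? Float4{1.0f, 0.0f, 0.0f, 1.0f}
                    : Float4{0.0f, 0.0f, 0.0f, 1.0f};
}

std::vector<Float4> renderMandelbrot(uint32_t width, uint32_t height)
{
    std::vector<Float4> out(std::size_t(width) * height);
    for (uint32_t y = 0; y < height; ++y)
        for (uint32_t x = 0; x < width; ++x)
            out[std::size_t(y) * width + x] = mandelbrotColor(x, y, width, height); // the single write
    return out;
}
```

In the shader, the equivalent change is to compute a float4 in a local variable and assign BufferOut[index].value once, on every code path.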

I'm not sure but, shouldn't it be for BufferOut decl:
RWStructuredBuffer<BufType> BufferOut : register(u0);
instead of :
RWStructuredBuffer BufferOut : register(u0);
If you are only using a float4 write target, why not use just:
RWBuffer<float4> BufferOut : register (u0);
Maybe this could help.

After playing around again today, I ran into the same problem once more.
The following code produced all white output:
[numthreads(1, 1, 1)]
void Main( uint3 dispatchId : SV_DispatchThreadID )
{
float4 color = float4(1.0f,0.0f,0.0f,1.0f);
WriteResult(dispatchId,color);
}
The WriteResult method is a utility method from my HLSL standard library.
Long story short: after I upgraded from driver version 192 to 195 (beta), the problem went away.
It seems the drivers still have some definite problems in compute shader support, so beware.

From what I've seen, compute shaders are only useful if you need a more general computational model than the traditional pixel shader, or if you can load data and then share it between threads in fast shared memory. I'm fairly sure you would get better performance with a pixel shader for the Mandelbrot shader.
On my setup (Win7, Feb 10 DX SDK, GTX 480) my compute shaders have a punishing setup time of over 0.2-0.3 ms (binding an SRV and a UAV and then calling Dispatch()).
If you do a PS implementation, please post your experiences.

I have no direct experience with DX compute shaders but...
Why are you setting alpha = 1.0?
IIRC, that makes the pixel 100% transparent, so your inside pixels are transparent red, and show up as whatever color was drawn behind them.
When alpha = 1.0, the RGB components are never used.

Related

DX9 style intrinsics are disabled when not in dx9 compatibility mode?

I am currently writing an HLSL shader for a basic Gaussian blur. The shader code is straightforward, but I keep getting an error:
DX9 style intrinsics are disabled when not in dx9 compatibility mode. (LN#: 19)
This tells me that line 19 in my code is the issue, and I believe it is either due to tex2D or Sampler in that particular line.
#include "Common.hlsl"
Texture2D Texture0 : register(t0);
SamplerState Sampler : register(s0);
float4 PSMain(PixelShaderInput pixel) : SV_Target {
float2 uv = pixel.TextureUV; // This is TEXCOORD0.
float4 result = 0.0f;
float offsets[21] = { ... };
float weights[21] = { ... };
// Blur horizontally.
for (int x = 0; x < 21; x++)
result += tex2D(Sampler, float2(uv.x + offsets[x], uv.y)) * weights[x];
return result;
}
See below for notes about the code, and my questions.
Notes
I have to hand type my code into StackOverflow due to my code being on a computer without a connection. Therefore:
Any spelling or case errors present here do not exist in code.
The absence of values inside of offsets and weights is intentional.
This is because there are 21 values in each and I didn't feel like typing them all.
offsets is every integer from -10 to 10.
weights ranges from 0.01 to 0.25 and back to 0.01.
The line count here is smaller due to the absence mentioned prior.
The line number of the error here is 15.
The column range is 13 - 59 which encapsulates tex2D and Sampler.
My Questions
Am I using the wrong data type for Sampler in the tex2D call?
Is tex2D deprecated in DirectX 11?
What should I be using instead?
What am I doing wrong here as that is my only error.
After some extensive searching, I've found out that tex2D is no longer supported from ps_4_0 and up. Well, 4_0 will work in legacy mode, but it doesn't work in 5_0, which is what I am using.
Shader Model : Supported
Shader Model 4 : yes (pixel shader only), but you must use the legacy compile option when compiling.
Shader Model 3 (DirectX HLSL) : yes (pixel shader only)
Shader Model 2 (DirectX HLSL) : yes (pixel shader only)
Shader Model 1 (DirectX HLSL) : yes (pixel shader only)
This has been replaced by Texture2D; the documentation is available for it. Below is an example from the mentioned documentation:
// Object Declarations
Texture2D g_MeshTexture;
SamplerState MeshTextureSampler
{
Filter = MIN_MAG_MIP_LINEAR;
AddressU = Wrap;
AddressV = Wrap;
};
struct VS_OUTPUT
{
float4 Position : SV_POSITION;
float4 Diffuse : COLOR0;
float2 TextureUV : TEXCOORD0;
};
VS_OUTPUT In;
// Shader body calling the intrinsic function
Output.RGBColor = g_MeshTexture.Sample(MeshTextureSampler, In.TextureUV) * In.Diffuse;
To replace the tex2D call in my code:
result += Texture0.Sample(Sampler, float2(uv.x + offsets[x], uv.y)) * weights[x];
Also, note that the code in this post is for the horizontal pass of a Gaussian blur.
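For reference, here is a CPU-side C++ sketch of what a horizontal Gaussian pass computes. It is not taken from the post: gaussianWeights and blurRow are hypothetical helpers, the sigma value is up to the caller, and clamping at the row edges is an assumption. In a shader, the integer tap offsets would normally also be scaled by the texel size (1 / texture width) before being added to the UV coordinate.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Normalized 1D Gaussian weights for taps at offsets -radius..radius.
std::vector<float> gaussianWeights(int radius, float sigma)
{
    assert(radius >= 0 && sigma > 0.0f);
    std::vector<float> w(2 * radius + 1);
    float sum = 0.0f;
    for (int i = -radius; i <= radius; ++i) {
        w[i + radius] = std::exp(-float(i * i) / (2.0f * sigma * sigma));
        sum += w[i + radius];
    }
    for (float& v : w) v /= sum; // weights now sum to 1
    return w;
}

// Horizontal pass over one row; reads outside the row clamp to the edge texel.
std::vector<float> blurRow(const std::vector<float>& row,
                           const std::vector<float>& weights)
{
    assert(!row.empty() && weights.size() % 2 == 1);
    const int radius = int(weights.size() / 2);
    const int n = int(row.size());
    std::vector<float> out(row.size(), 0.0f);
    for (int x = 0; x < n; ++x)
        for (int k = -radius; k <= radius; ++k) {
            const int sx = std::min(std::max(x + k, 0), n - 1);
            out[x] += row[sx] * weights[k + radius];
        }
    return out;
}
```

Calling gaussianWeights(10, sigma) yields 21 weights, matching the 21 taps in the question. A vertical pass is the same loop applied along columns.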

Metal normal doesn't interpolate

I've been learning how Metal works using Swift and targeting macOS. Things have been going okay, but now, close to getting the stuff done, I've hit a problem that I cannot understand. I hope you can help me :)
I'm loading and displaying an OBJ teapot, which I'm lighting using ambient + diffuse + specular light. The lighting itself works well, but the problem is that the normal vector is not interpolated on its way to the fragment shader, which results in flat lighting on a supposedly curved surface.
I really don't understand why the normal is not interpolated while the other values (position and eye) are. Here is my shader and an image showing the result:
Thanks in advance :)
struct Vertex
{
float4 position;
float4 normal;
};
struct ProjectedVertex
{
float4 position [[position]];
float3 eye;
float3 normal;
};
vertex ProjectedVertex vertex_project(device Vertex *vertices [[buffer(0)]],
constant Uniforms &uniforms [[buffer(1)]],
uint vid [[vertex_id]])
{
ProjectedVertex outVert;
outVert.position = uniforms.modelViewProjectionMatrix * vertices[vid].position;
outVert.eye = -(uniforms.modelViewProjectionMatrix * vertices[vid].position).xyz;
outVert.normal = (uniforms.modelViewProjectionMatrix * float4(vertices[vid].normal)).xyz;
return outVert;
}
fragment float4 fragment_light(ProjectedVertex vert [[stage_in]],
constant Uniforms &uniforms [[buffer(0)]])
{
float3 ambientTerm = light.ambientColor * material.ambientColor;
float3 normal = normalize(vert.normal);
float diffuseIntensity = saturate(dot(normal, light.direction));
float3 diffuseTerm = light.diffuseColor * material.diffuseColor * diffuseIntensity;
float3 specularTerm(0);
if (diffuseIntensity > 0)
{
float3 eyeDirection = normalize(vert.eye);
float3 halfway = normalize(light.direction + eyeDirection);
float specularFactor = pow(saturate(dot(normal, halfway)), material.specularPower);
specularTerm = light.specularColor * material.specularColor * specularFactor;
}
return float4(ambientTerm + diffuseTerm + specularTerm, 1);
}
screenshot
So the problem was this: in the Obj-C version, when I indexed the vertices from the OBJ file, I generated only one vertex for positions shared between faces, so I kept only one normal per shared vertex.
When translating it to Swift, the hash value I used to check whether a vertex is at the same place as one I already have was wrong and could not detect shared vertices. As a result every face kept its own normals, so each face is shaded flat.
I don't know if I'm clear enough, but that's what happened. For future reference, this question was about making a Swift version of the code from the "metalbyexample" book, which is Obj-C only.
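The idea of sharing one vertex per position can be sketched in C++ as follows. This is an illustration under assumptions, not the asker's loader: weldAndSmooth and IndexedMesh are hypothetical names, positions are matched by exact float equality, and the shared normal is computed as an area-weighted average of face normals, whereas the original code kept a normal read from the OBJ file. An ordered map keyed on the position is used here so that no custom hash function is needed.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct Vec3 { float x, y, z; };

struct IndexedMesh {
    std::vector<Vec3> positions;   // one entry per unique position
    std::vector<Vec3> normals;     // averaged over the faces sharing that position
    std::vector<uint32_t> indices; // three per triangle
};

// Builds an indexed mesh from a flat triangle list (three corners per triangle).
IndexedMesh weldAndSmooth(const std::vector<Vec3>& corners)
{
    assert(corners.size() % 3 == 0);
    IndexedMesh mesh;
    std::map<std::array<float, 3>, uint32_t> lookup;

    for (std::size_t t = 0; t < corners.size(); t += 3) {
        const Vec3& a = corners[t];
        const Vec3& b = corners[t + 1];
        const Vec3& c = corners[t + 2];
        const Vec3 e1{b.x - a.x, b.y - a.y, b.z - a.z};
        const Vec3 e2{c.x - a.x, c.y - a.y, c.z - a.z};
        // Unnormalized face normal e1 x e2; its length weights the average by area.
        const Vec3 faceNormal{e1.y * e2.z - e1.z * e2.y,
                              e1.z * e2.x - e1.x * e2.z,
                              e1.x * e2.y - e1.y * e2.x};
        for (std::size_t k = 0; k < 3; ++k) {
            const Vec3& p = corners[t + k];
            const auto [it, inserted] = lookup.try_emplace(
                std::array<float, 3>{p.x, p.y, p.z},
                static_cast<uint32_t>(mesh.positions.size()));
            if (inserted) {
                mesh.positions.push_back(p);
                mesh.normals.push_back(Vec3{0.0f, 0.0f, 0.0f});
            }
            Vec3& n = mesh.normals[it->second];
            n.x += faceNormal.x; n.y += faceNormal.y; n.z += faceNormal.z;
            mesh.indices.push_back(it->second);
        }
    }
    for (Vec3& n : mesh.normals) {
        const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
        if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
    }
    return mesh;
}
```

If the lookup never finds an existing entry, which is what the broken hash caused, every corner gets its own vertex and its own face normal, and the surface renders faceted.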

Strange behavior of tessFactors inside tessellation stage

I've noticed some very strange behavior on my Nvidia 860M. I'm programming a 3D engine and I'm using tessellation for terrain rendering.
I use a simple quad tessellation algorithm.
struct PatchTess
{
float EdgeTess[4] : SV_TessFactor;
float InsideTess[2] : SV_InsideTessFactor;
};
PatchTess ConstantHS(InputPatch<VS_OUT, 4> patch)
{
PatchTess pt;
float3 l = (patch[0].PosW + patch[2].PosW) * 0.5f;
float3 t = (patch[0].PosW + patch[1].PosW) * 0.5f;
float3 r = (patch[1].PosW + patch[3].PosW) * 0.5f;
float3 b = (patch[2].PosW + patch[3].PosW) * 0.5f;
float3 c = (patch[0].PosW + patch[1].PosW + patch[2].PosW + patch[3].PosW) * 0.25f;
pt.EdgeTess[0] = GetTessFactor(l);
pt.EdgeTess[1] = GetTessFactor(t);
pt.EdgeTess[2] = GetTessFactor(r);
pt.EdgeTess[3] = GetTessFactor(b);
pt.InsideTess[0] = GetTessFactor(c);
pt.InsideTess[1] = pt.InsideTess[0];
return pt;
}
[domain("quad")]
[partitioning("fractional_even")]
[outputtopology("triangle_cw")]
[outputcontrolpoints(4)]
[patchconstantfunc("ConstantHS")]
[maxtessfactor(64.0f)]
VS_OUT HS(InputPatch<VS_OUT, 4> p, uint i : SV_OutputControlPointID)
{
VS_OUT vout;
vout.PosW = p[i].PosW;
return vout;
}
[domain("quad")]
DS_OUT DS(PatchTess patchTess, float2 uv : SV_DomainLocation, const OutputPatch<VS_OUT, 4> quad)
{
DS_OUT dout;
float3 p = lerp(lerp(quad[0].PosW, quad[1].PosW, uv.x), lerp(quad[2].PosW, quad[3].PosW, uv.x), uv.y);
p.y = GetHeight(p);
dout.PosH = mul(float4(p, 1.0f), gViewProj);
dout.PosW = p;
return dout;
}
The code above isn't the problem; I just want to give you some context.
The problem occurs in this function:
inline float GetTessFactor(float3 posW)
{
const float factor = saturate((length(gEyePos - posW) - minDistance) / (maxDistance - minDistance));
return pow(2, lerp(6.0f, 0.0f, factor));
}
When I use debug mode in Visual Studio, everything works fine and tessellation works as it should. But in release mode, the terrain patches flicker.
And now the really strange thing: when I change the function and switch from pow to a linear function or something else, everything works as expected.
So this works fine:
inline float GetTessFactor(float3 posW)
{
const float factor = saturate((length(gEyePos - posW) - minDistance) / (maxDistance - minDistance));
return lerp(64.0f, 0.0f, factor);
}
EDIT:
changing the line:
pt.InsideTess[0] = GetTessFactor(c);
to
pt.InsideTess[0] = max(max(pt.EdgeTess[0], pt.EdgeTess[1]), max(pt.EdgeTess[2], pt.EdgeTess[3]));
does the job.
It seems that the pow function sometimes produces values (within the valid range of 64.0f) that are not valid in combination with the edge tess factors.
Also keep in mind that this problem only appears when running in release mode and not in debug mode (VS 2013).
Does anyone know of restrictions on the combination of the tess factor values? I didn't find any information on MSDN or similar pages.
Thanks
The newest driver update solved the problem.
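When chasing this kind of driver-dependent flicker, a CPU-side reference for the expected factors can help to compare against what the GPU produces. The C++ sketch below mirrors the distance-based formula from the question. The function names are hypothetical, and the final clamp to [1, 64] is a defensive assumption added here, not something from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// CPU reference for the distance-based factor: 64 at minDistance, 1 at maxDistance.
float tessFactor(float distance, float minDistance, float maxDistance)
{
    assert(maxDistance > minDistance);
    const float t = std::clamp((distance - minDistance) / (maxDistance - minDistance),
                               0.0f, 1.0f);           // saturate()
    const float exponent = 6.0f * (1.0f - t);         // lerp(6, 0, t)
    return std::clamp(std::exp2(exponent), 1.0f, 64.0f); // pow(2, exponent), kept in range
}

// Inside factor derived from the edge factors, as in the asker's edit.
float insideFactor(const float (&edge)[4])
{
    return std::max(std::max(edge[0], edge[1]), std::max(edge[2], edge[3]));
}
```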

Path Tracing Shadowing Error

I really don't know what else to do to fix this problem. I have written a path tracer using explicit light sampling in C++ and I keep getting these weird, really black shadows, which I know are wrong. I have done everything I can think of to fix it but I still keep getting them, even with more samples. What am I doing wrong? Below is an image of the scene.
And the main radiance code:
RGB Radiance(Ray PixRay,std::vector<Primitive*> sceneObjects,int depth,std::vector<AreaLight> AreaLights,unsigned short *XI,int E)
{
int MaxDepth = 10;
if(depth > MaxDepth) return RGB();
double nearest_t = INFINITY;
Primitive* nearestObject = NULL;
for(int i=0;i<sceneObjects.size();i++)
{
double root = sceneObjects[i]->intersect(PixRay);
if(root > 0)
{
if(root < nearest_t)
{
nearest_t = root;
nearestObject = sceneObjects[i];
}
}
}
RGB EstimatedRadiance;
if(nearestObject)
{
EstimatedRadiance = nearestObject->getEmission() * E;
Point intersectPoint = nearestObject->intersectPoint(PixRay,nearest_t);
Vector intersectNormal = nearestObject->surfacePointNormal(intersectPoint).Normalize();
if(nearestObject->getBRDF().Type == 1)
{
for(int x=0;x<AreaLights.size();x++)
{
Point pointOnTriangle = RandomPointOnTriangle(AreaLights[x].shape,XI);
Vector pointOnTriangleNormal = AreaLights[x].shape.surfacePointNormal(pointOnTriangle).Normalize();
Vector LightDistance = (pointOnTriangle - intersectPoint).Normalize();
//Geometric Term
RGB Geometric_Term = GeometricTerm(intersectPoint,pointOnTriangle,sceneObjects);
//Lambertian BRDF
RGB LambertianBRDF = nearestObject->getColor() * (1. / M_PI);
//Emitted Light Power
RGB Emission = AreaLights[x].emission;
double MagnitudeOfXandY = (pointOnTriangle - intersectPoint).Magnitude() * (pointOnTriangle - intersectPoint).Magnitude();
RGB DirectLight = Emission * LambertianBRDF * Dot(intersectNormal,-LightDistance) *
Dot(pointOnTriangleNormal,LightDistance) * (1./MagnitudeOfXandY) * AreaLights[x].shape.Area() * Geometric_Term;
EstimatedRadiance = EstimatedRadiance + DirectLight;
}
//
Vector diffDir = CosWeightedRandHemiDirection(intersectNormal,XI);
Ray diffRay = Ray(intersectPoint,diffDir);
EstimatedRadiance = EstimatedRadiance + ( Radiance(diffRay,sceneObjects,depth+1,AreaLights,XI,0) * nearestObject->getColor() * (1. / M_PI) * M_PI );
}
//Mirror
else if(nearestObject->getBRDF().Type == 2)
{
Vector reflDir = PixRay.d-intersectNormal*2*Dot(intersectNormal,PixRay.d);
Ray reflRay = Ray(intersectPoint,reflDir);
return nearestObject->getColor() *Radiance(reflRay,sceneObjects,depth+1,AreaLights,XI,0);
}
}
return EstimatedRadiance;
}
I haven't debugged your code, so there may be any number of bugs of course, but I can give you some tips: First, go look at SmallPT, and see what it does that you don't. It's tiny but still quite easy to read.
From the look of it, there seem to be issues with the sampling, the gamma correction, or both. The easiest one is gamma: when converting RGB intensity in the range 0..1 to RGB in the range 0..255, remember to always gamma correct. Use a gamma of 2.2:
R = r^(1.0/gamma)
G = g^(1.0/gamma)
B = b^(1.0/gamma)
Having the wrong gamma will make any path traced image look bad.
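In C++ the conversion above could look like the following minimal sketch. The function name toDisplayByte is hypothetical, and clamping the input to [0, 1] before encoding is an assumption about how out-of-range radiance should be handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps a linear intensity in [0, 1] to an 8-bit display value using
// out = in^(1 / gamma), then scales to 0..255.
uint8_t toDisplayByte(double linear, double gamma = 2.2)
{
    assert(gamma > 0.0);
    const double clamped = std::clamp(linear, 0.0, 1.0);
    const double encoded = std::pow(clamped, 1.0 / gamma);
    return static_cast<uint8_t>(encoded * 255.0 + 0.5); // round to nearest
}
```

It would be applied once per channel when writing the final image, after all samples for a pixel have been averaged.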
Second: sampling. It's not obvious from the code how the sampling is weighted. I'm only familiar with path tracing using Russian roulette (RR) sampling. With RR, the radiance function basically works like so:
if (depth > MaxDepth)
return RGB();
RGB color = mat.Emission;
// Russian roulette:
float survival = 1.0f;
float pContinue = material.Albedo();
survival = 1.0f / pContinue;
if (Rand.Next() > pContinue)
return color;
color += DirectIllumination(sceneIntersection);
color += Radiance(sceneIntersection, depth+1) * survival;
RR is basically a way of terminating rays at random while still maintaining an unbiased estimate of the true radiance. Since it adds a weight to the indirect term, and the shadow and the bottom of the spheres are only indirectly lit, I'd suspect that has something to do with it (if it isn't just the gamma).
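To see why the 1 / pContinue weight keeps the estimate unbiased, here is a toy scalar version in C++. It is a sketch under strong assumptions, not a path tracer: the "scene" is reduced to a constant emission and a constant albedo per bounce, so the exact answer is emission / (1 - albedo). The names radianceRR and estimate are hypothetical.

```cpp
#include <cassert>
#include <random>

// Toy scalar "scene": every bounce emits `emission` and reflects a fraction
// `albedo`, so the exact radiance is emission / (1 - albedo).
double radianceRR(double emission, double albedo, double pContinue, std::mt19937& rng)
{
    assert(pContinue > 0.0 && pContinue < 1.0);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double color = emission;
    if (uniform(rng) >= pContinue)
        return color; // path terminated by the roulette
    // Survivors are boosted by 1 / pContinue so the expected value is unchanged.
    color += albedo * radianceRR(emission, albedo, pContinue, rng) / pContinue;
    return color;
}

// Averages many independent paths, like accumulating samples for one pixel.
double estimate(int samples, double emission, double albedo, double pContinue,
                unsigned seed)
{
    assert(samples > 0);
    std::mt19937 rng(seed);
    double sum = 0.0;
    for (int i = 0; i < samples; ++i)
        sum += radianceRR(emission, albedo, pContinue, rng);
    return sum / samples;
}
```

With emission 1 and albedo 0.5 the average converges towards 2 for any valid pContinue. Dropping the division by pContinue makes the result systematically too dark, which is one way indirectly lit regions can end up nearly black.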

How to blur the outcome of a fragment shader?

I'm working on a shader that generates little clouds based on some mask images. Right now it works well, but I feel the result is missing something, and I thought a blur would be nice. I remember a basic blur algorithm where you convolve the image with a matrix of norm 1 (the bigger the matrix, the stronger the blur). The thing is, I don't know how to treat the current output of the shader as an image. So basically I want to keep the shader as is, but make the result blurry. Any ideas on how I can integrate the convolution algorithm into the shader? Or does anyone know of another algorithm?
Cg code:
float Luminance( float4 Color ){
return 0.6 * Color.r + 0.3 * Color.g + 0.1 * Color.b;
}
struct v2f {
float4 pos : SV_POSITION;
float2 uv_MainTex : TEXCOORD0;
};
float4 _MainTex_ST;
v2f vert(appdata_base v) {
v2f o;
o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
o.uv_MainTex = TRANSFORM_TEX(v.texcoord, _MainTex);
return o;
}
sampler2D _MainTex;
sampler2D _Gradient;
sampler2D _NoiseO;
sampler2D _NoiseT;
float4 frag(v2f IN) : COLOR {
half4 nO = tex2D (_NoiseO, IN.uv_MainTex);
half4 nT = tex2D (_NoiseT, IN.uv_MainTex);
float4 turbulence = nO + nT;
float lum = Luminance(turbulence);
half4 c = tex2D (_MainTex, IN.uv_MainTex);
if (lum >= 1.0f){
float pos = lum - 1.0f;
if( pos > 0.98f ) pos = 0.98f;
if( pos < 0.02f ) pos = 0.02f;
float2 texCord = (pos, pos);
half4 turb = tex2D (_Gradient, texCord);
//turb.a = 0.0f;
return turb;
}
else return c;
}
It appears to me that this shader is emulating alpha testing between a backbuffer-like texture (passed via the sampler2D _MainTex) and a generated cloud luminance (represented by float lum) mapped onto a gradient. This makes things trickier because you can't just fake a blur and let alpha blending take care of the rest. You'll also need to change your alpha testing routine to emulate an alpha blend instead or restructure your rendering pipeline accordingly. We'll deal with blurring the clouds first.
The first question you need to ask yourself is if you need a screen-space blur. Seeing the mechanics of this fragment shader, I would think not -- you want to blur the clouds on the actual model. Given this, it should be sufficient to blur the underlying textures and result in a blurred result -- except you're emulating alpha clipping, so you'll get rough edges. The question is what to do about those rough edges. That's where alpha blending comes in.
You can emulate alpha blending by using a lerp (linear interpolation) between the turb color and c color with lerp() function (depending on which shader language you're using). You'll probably want something that looks like return lerp(c, turb, 1 - pos); instead of return turb; ... I'd expect you'll want to tweak this continually until you understand and start getting the results you want. (For example, you may prefer lerp(c, turb, 1 - pow(pos,4)))
In fact, you can try this last step (just adding the lerp) before modifying your textures to get an idea of what the alpha blending will do for you.
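The blend suggested above is just a per-channel linear interpolation. A minimal C++ sketch of the arithmetic, with hypothetical names (Color, lerpf, blend) and an assumed precondition that the blend factor is already in [0, 1]:

```cpp
#include <cassert>

struct Color { float r, g, b, a; };

// Same formula as the shader lerp(): a + t * (b - a).
float lerpf(float a, float b, float t) { return a + t * (b - a); }

// t = 0 returns the background, t = 1 returns the cloud color.
Color blend(const Color& background, const Color& cloud, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    return Color{lerpf(background.r, cloud.r, t),
                 lerpf(background.g, cloud.g, t),
                 lerpf(background.b, cloud.b, t),
                 lerpf(background.a, cloud.a, t)};
}
```

Passing 1 - pos as t corresponds to the lerp(c, turb, 1 - pos) suggestion.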
Edit: I hadn't considered the case where the _NoiseO and _NoiseT samplers were changing continually, so simply telling you to blur them was minimally useful advice. You can emulate blurring by using a multi-tap filter. The simplest way is to take uniformly spaced samples, weight them, and sum them together to get your final color. (Typically you'll want the weights themselves to sum to 1.)
This being said, you may or may not want to do this on the _NoiseO and _NoiseT textures themselves. You may want to create a screen-space blur instead, which may look more interesting to a viewer. In this case the same concept applies, but you need to calculate the offset coordinates for each tap and then perform a weighted summation.
For example, if we were going with the first case and wanted to sample from the _NoiseO sampler and blur it slightly, we could use this box filter (where all the weights are the same and sum to 1, thus performing an average):
// Untested code.
half4 nO = 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2( 0, 0))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2( 0, g_offset.y))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, 0))
+ 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, g_offset.y));
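The same four-tap average can be checked on the CPU with a small C++ sketch. It assumes a single-channel image, a one-texel offset, and clamp-to-edge addressing. The Image struct and boxTap are hypothetical stand-ins for the sampler and the tap sum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Image {
    int width, height;
    std::vector<float> texels; // row-major, one channel

    // Clamp-to-edge fetch, standing in for a sampler with clamp addressing.
    float at(int x, int y) const {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return texels[std::size_t(y) * width + x];
    }
};

// Four equally weighted taps: the texel itself plus its right, lower and
// lower-right neighbors. The weights sum to 1, so the result is an average.
float boxTap(const Image& img, int x, int y)
{
    assert(img.width > 0 && img.height > 0);
    return 0.25f * img.at(x,     y)
         + 0.25f * img.at(x,     y + 1)
         + 0.25f * img.at(x + 1, y)
         + 0.25f * img.at(x + 1, y + 1);
}
```

Because the taps only extend towards +x and +y, this filter shifts the image by half a texel. Symmetric offsets avoid that at the cost of more taps.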
Alternatively, if we wanted the entire cloud output to appear blurry we'd wrap the cloud generation portion in a function and call it instead of tex2D() for the taps.
// More untested code.
half4 genCloud(float2 tc) {
half4 nO = tex2D (_NoiseO, tc);
half4 nT = tex2D (_NoiseT, tc);
float4 turbulence = nO + nT;
float lum = Luminance(turbulence);
float pos = lum - 1.0;
if( pos > 0.98f ) pos = 0.98f;
if( pos < 0.02f ) pos = 0.02f;
float2 texCord = float2(pos, pos);
half4 turb = tex2D (_Gradient, texCord);
// Figure out how you'd generate your alpha blending constant here for your lerp
turb.a = ACTUAL_ALPHA;
return turb;
}
And the multi-tap filtering would look like:
// And even more untested code.
half4 cloudcolor = 0.25 * genCloud(IN.uv_MainTex + float2( 0, 0))
+ 0.25 * genCloud(IN.uv_MainTex + float2( 0, g_offset.y))
+ 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, 0))
+ 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, g_offset.y));
return lerp(c, cloudcolor, cloudcolor.a);
However, doing this is going to be relatively slow if you make the cloud function too complex. If you're bound by raster operations and texture reads (transferring texture/buffer data to and from memory), chances are this won't matter much unless you use a much more advanced blurring technique (such as successive downsampling through ping-ponged buffers, useful for blurs/filters that are expensive because they have lots of taps). But performance is a separate consideration from just getting the look you want.
