How to blur the outcome of a fragment shader?

I'm working on a shader that generates little clouds based on some mask images. Right now it works well, but I feel the result is missing something, and I thought a blur would be nice. I remember a basic blur algorithm where you convolve the image with a kernel whose weights sum to 1 (the bigger the kernel, the stronger the blur). The thing is, I don't know how to treat the current output of the shader as an image. So basically I want to keep the shader as is, but get a blurred version of its result. Any ideas? How can I integrate the convolution algorithm into the shader? Or does anyone know of another algorithm?
Cg code:
float Luminance( float4 Color ){
    return 0.6 * Color.r + 0.3 * Color.g + 0.1 * Color.b;
}

struct v2f {
    float4 pos : SV_POSITION;
    float2 uv_MainTex : TEXCOORD0;
};

float4 _MainTex_ST;

v2f vert(appdata_base v) {
    v2f o;
    o.pos = mul(UNITY_MATRIX_MVP, v.vertex);
    o.uv_MainTex = TRANSFORM_TEX(v.texcoord, _MainTex);
    return o;
}

sampler2D _MainTex;
sampler2D _Gradient;
sampler2D _NoiseO;
sampler2D _NoiseT;

float4 frag(v2f IN) : COLOR {
    half4 nO = tex2D (_NoiseO, IN.uv_MainTex);
    half4 nT = tex2D (_NoiseT, IN.uv_MainTex);
    float4 turbulence = nO + nT;
    float lum = Luminance(turbulence);
    half4 c = tex2D (_MainTex, IN.uv_MainTex);
    if (lum >= 1.0f){
        float pos = lum - 1.0f;
        if( pos > 0.98f ) pos = 0.98f;
        if( pos < 0.02f ) pos = 0.02f;
        float2 texCord = (pos, pos);
        half4 turb = tex2D (_Gradient, texCord);
        //turb.a = 0.0f;
        return turb;
    }
    else return c;
}

It appears to me that this shader is emulating alpha testing between a backbuffer-like texture (passed via the sampler2D _MainTex) and a generated cloud luminance (represented by float lum) mapped onto a gradient. This makes things trickier because you can't just fake a blur and let alpha blending take care of the rest. You'll also need to change your alpha testing routine to emulate an alpha blend instead or restructure your rendering pipeline accordingly. We'll deal with blurring the clouds first.
The first question you need to ask yourself is whether you need a screen-space blur. Seeing the mechanics of this fragment shader, I would think not -- you want to blur the clouds on the actual model. Given this, it should be sufficient to blur the underlying textures to get a blurred result -- except you're emulating alpha clipping, so you'll get rough edges. The question is what to do about those rough edges. That's where alpha blending comes in.
You can emulate alpha blending by using a lerp (linear interpolation) between the turb color and the c color with the lerp() function (or its equivalent, depending on which shader language you're using). You'll probably want something that looks like return lerp(c, turb, 1 - pos); instead of return turb; ... I'd expect you'll want to tweak this until you understand it and start getting the results you want. (For example, you may prefer lerp(c, turb, 1 - pow(pos, 4)).)
In fact, you can try this last step (just adding the lerp) before modifying your textures to get an idea of what the alpha blending will do for you.
Edit: I hadn't considered the case where the _NoiseO and _NoiseT samplers were changing continually, so simply telling you to blur them was minimally useful advice. You can emulate blurring by using a multi-tap filter. The simplest way is to take uniformly spaced samples, weight them, and sum them to produce your final color. (Typically you'll want the weights themselves to sum to 1.)
That being said, you may or may not want to do this on the _NoiseO and _NoiseT textures themselves -- you may want to create a screen-space blur instead, which may look more interesting to a viewer. In this case, the same concept applies, but you need to compute the offset coordinates for each tap and then perform a weighted summation.
For example if we were going with the first case and we wanted to sample from the _Noise0 sampler and blur it slightly, we could use this box filter (where all the weights are the same and sum to 1, thus performing an average):
// Untested code.
half4 nO = 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(         0,          0))
         + 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(         0, g_offset.y))
         + 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x,          0))
         + 0.25 * tex2D(_NoiseO, IN.uv_MainTex + float2(g_offset.x, g_offset.y));
Alternatively, if we wanted the entire cloud output to appear blurry we'd wrap the cloud generation portion in a function and call it instead of tex2D() for the taps.
// More untested code.
half4 genCloud(float2 tc) {
    half4 nO = tex2D (_NoiseO, tc);
    half4 nT = tex2D (_NoiseT, tc);
    float4 turbulence = nO + nT;
    float lum = Luminance(turbulence);
    float pos = lum - 1.0;
    if( pos > 0.98f ) pos = 0.98f;
    if( pos < 0.02f ) pos = 0.02f;
    float2 texCord = float2(pos, pos);
    half4 turb = tex2D (_Gradient, texCord);
    // Figure out how you'd generate your alpha blending constant here for your lerp
    turb.a = ACTUAL_ALPHA;
    return turb;
}
And the multi-tap filtering would look like:
// And even more untested code.
half4 cloudcolor = 0.25 * genCloud(IN.uv_MainTex + float2(         0,          0))
                 + 0.25 * genCloud(IN.uv_MainTex + float2(         0, g_offset.y))
                 + 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x,          0))
                 + 0.25 * genCloud(IN.uv_MainTex + float2(g_offset.x, g_offset.y));
return lerp(c, cloudcolor, cloudcolor.a);
However, this is going to be relatively slow if you make the cloud function too complex. If you're bound by raster operations and texture reads (transferring texture/buffer data to and from memory), chances are this won't matter much unless you use a much more advanced blurring technique (such as successive downsampling through ping-ponged buffers, useful for blurs/filters that are expensive because they have lots of taps). But performance is a separate consideration from getting the look you want.
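If you do end up wanting a screen-space, ping-ponged blur, a minimal sketch of one horizontal pass might look like the following (shown in GLSL for brevity; blurSource, texelSize, and the weights are placeholder names/values, not from the original shader). A second pass with the offsets applied to y instead of x, rendered into another buffer, completes the blur.
#version 330 core
// Illustrative sketch only: one horizontal pass of a 5-tap separable blur.
uniform sampler2D blurSource; // hypothetical: the texture to blur
uniform vec2 texelSize;       // hypothetical: 1.0 / texture resolution

in vec2 uv;
out vec4 fragColor;

void main()
{
    // Binomial weights (1 4 6 4 1) / 16, which sum to 1.
    float w[3] = float[3](0.375, 0.25, 0.0625);

    vec4 sum = texture(blurSource, uv) * w[0];
    for (int i = 1; i <= 2; ++i)
    {
        vec2 off = vec2(texelSize.x * float(i), 0.0);
        sum += texture(blurSource, uv + off) * w[i];
        sum += texture(blurSource, uv - off) * w[i];
    }
    fragColor = sum;
}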


How to make a test grid pattern with antialiased lines in a fragment shader?
I remember I found this challenging, so I'll post the answer here for my future self and for anyone who wants the same effect.
This shader is meant to be rendered "above" the already textured plane in a separate render call. The reason I'm doing that is that in my program I am generating the texture of the surface through several render calls, slowly building it up layer by layer, and then I wanted to draw a simple black grid over it, so I make this last render call to do just that.
That's why the base color here is (0,0,0,0), basically nothing. Then I can use GL blending modes to overlay the result of this shader over whatever my texture is.
Note that you needn't do that separately. You can just as easily modify this code to display a certain color (like a smooth grey) or even a texture of your choice. Simply pass the texture to the shader and modify the last line accordingly.
Also note that I use constants that I set up during shader compilation. Basically, I just load the shader string, but before passing it to the shader compiler I search and replace the __CONSTANT_SOMETHING placeholders with the actual values I want. Don't forget that it's all text, so you need to replace it with text, for example:
//java code
shaderCode = shaderCode.replaceFirst("__CONSTANT_SQUARE_SIZE", String.valueOf(GlobalSettings.PLANE_SQUARE_SIZE));
I'll share the code I use for anti-aliased grids; it might help with the complexity. All I've done is use the texture coordinates to paint a grid on a plane. I used GLSL's genType fract(genType x) to repeat texture space. Then I used the absolute value function to essentially calculate each pixel's distance to the grid line. The rest of the operations interpret that as a color.
You can play with this code directly on Shadertoy.com by pasting it into a new shader.
If you want to use it in your code, the only lines you need are the part starting at the gridSize variable and ending with the grid variable.
iResolution.y is the screen height, uv is the texture coordinate of your plane.
gridSize and width should probably be supplied with a uniform variable.
void mainImage(out vec4 fragColor, in vec2 fragCoord) {
    // aspect correct pixel coordinates (for shadertoy only)
    vec2 uv = fragCoord / iResolution.xy * vec2(iResolution.x / iResolution.y, 1.0);
    // get some diagonal lines going (for shadertoy only)
    uv.yx += uv.xy * 0.1;
    // for every unit of texture space, I want 10 grid lines
    float gridSize = 10.0;
    // width of a line on the screen plus a little bit for AA
    float width = (gridSize * 1.2) / iResolution.y;
    // chop up into grid
    uv = fract(uv * gridSize);
    // abs version
    float grid = max(
        1.0 - abs((uv.y - 0.5) / width),
        1.0 - abs((uv.x - 0.5) / width)
    );
    // Output to screen (for shadertoy only)
    fragColor = vec4(grid, grid, grid, 1.0);
}
Happy shading!
Here're my shaders:
Vertex:
#version 300 es
precision highp float;
precision highp int;

layout (location=0) in vec3 position;

uniform mat4 projectionMatrix;
uniform mat4 modelViewMatrix;
uniform vec2 coordShift;
uniform mat4 modelMatrix;

out highp vec3 vertexPosition;

const float PLANE_SCALE = __CONSTANT_PLANE_SCALE; // assigned during shader compilation

void main()
{
    // generate position data for the fragment shader
    // does not take view matrix or projection matrix into account
    // TODO: +3.0 part is contingent on the actual mesh. It is supposed to be its lowest possible coordinate.
    // TODO: the mesh here is 6x6 with -3..3 coords. I normalize it to 0..6 for correct fragment shader calculations
    vertexPosition = vec3((position.x+3.0)*PLANE_SCALE+coordShift.x, position.y, (position.z+3.0)*PLANE_SCALE+coordShift.y);

    // position data for the OpenGL vertex drawing
    gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
}
Note that I calculate vertexPosition here and pass it to the fragment shader. This is so that my grid "moves" when the object moves. The thing is, in my app I have the ground basically stuck to the main entity. The entity (call it a character or whatever) doesn't move across the plane or change its position relative to the plane. But to create the illusion of movement, I calculate the coordinate shift (relative to the square size) and use that to calculate the vertex position.
It's a bit complicated, but I thought I would include it. Basically, if the square size is set to 5.0 (i.e. we have a 5x5 meter square grid), then a coordShift of (0,0) means the character stands in the lower left corner of the square; a coordShift of (2.5,2.5) would be the middle, and (5,5) the top right. After going past 5, the shift loops back to 0; go below 0 and it loops back to 5.
So the grid only ever "moves" within one square, but because it is uniform, the illusion is that you're walking on an infinite grid surface.
Also note that you can make the same thing work with multi-layered grids, for example where every 10th line is thicker. All you really need to do is make sure your coordShift represents the largest distance your grid pattern shifts.
Just in case someone wonders why I made it loop: it's for precision's sake. Sure, you could just pass the character's raw coordinate to the shader, and it'll work fine around (0,0), but as you get 10,000 units away you will notice some serious precision glitches, like your lines getting distorted or even "fuzzy", as if they were drawn with a brush. A minimal sketch of that wrapping is shown below.
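A minimal sketch of the wrapping, assuming the character's world position is available (the function name and parameters are made up for illustration; in practice this can just as well be computed on the CPU before uploading the coordShift uniform):
// Keep the shift inside a single grid square so the value passed to the
// shader never grows large. GLSL's mod() returns a value in [0, squareSize)
// even for negative inputs, which matches the looping described above.
vec2 wrapCoordShift(vec2 characterWorldPos, float squareSize)
{
    return mod(characterWorldPos, squareSize);
}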
Here's the fragment shader:
#version 300 es
precision highp float;

in highp vec3 vertexPosition;
out mediump vec4 fragColor;

const float squareSize = __CONSTANT_SQUARE_SIZE;
const vec3 color_l1 = __CONSTANT_COLOR_L1;

void main()
{
    // calculate derivatives
    // (must be done at the start before conditionals)
    float dXy = abs(dFdx(vertexPosition.z)) / 2.0;
    float dYy = abs(dFdy(vertexPosition.z)) / 2.0;
    float dXx = abs(dFdx(vertexPosition.x)) / 2.0;
    float dYx = abs(dFdy(vertexPosition.x)) / 2.0;

    // find and fill horizontal lines
    int roundPos = int(vertexPosition.z / squareSize);
    float remainder = vertexPosition.z - float(roundPos)*squareSize;
    float width = max(dYy, dXy) * 2.0;

    if (remainder <= width)
    {
        float diff = (width - remainder) / width;
        fragColor = vec4(color_l1, diff);
        return;
    }

    if (remainder >= (squareSize - width))
    {
        float diff = (remainder - squareSize + width) / width;
        fragColor = vec4(color_l1, diff);
        return;
    }

    // find and fill vertical lines
    roundPos = int(vertexPosition.x / squareSize);
    remainder = vertexPosition.x - float(roundPos)*squareSize;
    width = max(dYx, dXx) * 2.0;

    if (remainder <= width)
    {
        float diff = (width - remainder) / width;
        fragColor = vec4(color_l1, diff);
        return;
    }

    if (remainder >= (squareSize - width))
    {
        float diff = (remainder - squareSize + width) / width;
        fragColor = vec4(color_l1, diff);
        return;
    }

    // fill base color
    fragColor = vec4(0, 0, 0, 0);
    return;
}
It is currently built for 1-pixel-thick lines only, but you can control the thickness by controlling the "width".
Here, the first important part is the dFdx / dFdy functions. These are GLSL functions, and I'll simply say that they let you determine how much space in WORLD coordinates your fragment covers on the screen, based on the Z-distance of that spot on your plane.
Well, that was a mouthful. I'm sure you can figure it out if you read the docs for them though.
Then I take the maximum of those outputs as the width. Basically, depending on the way your camera is looking, you want to "stretch" the width of your line a bit.
remainder is basically how far this fragment is, in world coordinates, from the line that we want to draw. If it's too far, we don't need to fill it.
If you simply take the max here, you will get a non-antialiased line 1 pixel wide. It'll basically look like a perfect 1-pixel line shape from MS Paint.
But by increasing the width, you make those straight segments stretch further and overlap.
You can see that I compare remainder with the line width here. The greater the width, the bigger the remainder can be and still "hit" it. I have to compare this from both sides, because otherwise I'm only looking at pixels that are close to the line from the negative coordinate side, and discounting the positive side, which could still be hitting it.
Now, for the simple antialiasing effect, we need to make those overlapping segments "fade out" as they near their ends. For this purpose, I calculate the fraction to see how deeply the remainder is inside the line. When the fraction equals 1, the line we want to draw goes straight through the middle of the fragment we're currently drawing. As the fraction approaches 0, the fragment is farther and farther from the line, and should thus be made more and more transparent.
Finally, we do this for horizontal and vertical lines separately. We have to do them separately because the dFdx / dFdy values differ for vertical and horizontal lines, so we can't do them in one formula.
And at last, if we didn't hit any of the lines close enough, we fill the fragment with a transparent color.
I'm not sure if that's THE best code for the task, but it works. If you have suggestions, let me know!
P.S. The shaders are written for OpenGL ES, but they should work for desktop OpenGL too.

GLSL Shader: Mapping Bars in Polar-Coordinates

I'd like to create a polar representation of this shader: https://www.shadertoy.com/view/4sfSDN
So that it looks like in this screenshot:
http://postimg.org/image/uwc34jxxz/
I know the basics of the polar coordinate system: how to calculate r and φ, but I can only use those values with a texture2D() lookup on an image.
When I only have an amplitude value like in the shader above, I can't get it working.
r should somehow be based on the amplitude, but then I don't know how to draw the circle without the texture2D() function... I can draw a circle with r alone, but then there are no different amplitudes. Or do I even need to fill a matrix with the generated bars in a loop and load the circle from there?
I'm quite sure it is possible, given the insane shaders on Shadertoy, but I don't quite get it...
Can anyone point me to a solution?
From the shader you posted I think it should be enough to simply transform the uv to polar coordinates.
So what you are looking for are angle and radius from the center. First let us transform the uv so it gives the vector pointing from the center:
uv = fragCoord - (iResolution*.5);
Next, try to normalize it. Since the view is not square, the normalization should only be by one coordinate, such that:
if (iResolution.x > iResolution.y)
{
    uv = uv / iResolution.y;
}
else
{
    uv = uv / iResolution.x;
}
This will produce a sort of fit effect, but you may just hard-code one or the other if you need to. min can be used if available (uv = uv/min(iResolution.x, iResolution.y)) to remove the condition.
So at this point the uv vector points from the center toward the pixel position in a coordinate system that is normalized in one dimension.
Now to get the angle you may simply use atan(uv.y, uv.x). To get the radius you then need length(uv).
The radius in your case will be in the range [0, .5] for the shorter dimension, so you may multiply it by 2.0, but this is a factor you may later tweak to get the desired effect, so that the maximum value does not hit the border but maybe 80% or so (just play around with it).
The angle is in the range [-Pi, Pi], and the docs say the result is undefined when both inputs are zero, which you will need to handle yourself. Now the angle must be transformed to the range [.0, 1.0] to use it as a texture coordinate:
angle = angle/(Pi*2.0) + .5
So now construct the new uv
uv = vec2(angle, radius)
And use the same shader you did before.
You will also need to keep in mind that the radius may be larger than 1.0 in the corners, which would produce an out-of-range texture access. In such cases it would be best to discard the fragment (a minimal sketch of this follows the full shader below).
From the shader toy:
#define M_PI 3.1415926535897932384626433832795

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    vec2 uv = fragCoord.xy - (iResolution.xy*.5);
    uv = uv/min(iResolution.x, iResolution.y);

    float angle = atan(uv.y, uv.x);
    angle = angle/(M_PI*2.0) + .5;
    float radius = length(uv);
    uv = vec2(angle, radius*2.0);

    float bars = 24.;
    float fft = texture2D( iChannel0, vec2(floor(uv.x*bars)/bars, 0.25) ).x;
    float amp = (fft - uv.y) * 100.;
    fragColor = vec4(amp, 0., 0., 1.0);
}
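The shader above does not include the out-of-range handling mentioned before the code; a minimal sketch of it, placed right after uv is rebuilt and before the texture fetch, could be:
// Drop corner fragments whose remapped radius exceeds the [0, 1] texture
// range instead of sampling a wrapped or clamped row.
if (uv.y > 1.0) {
    discard;
}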

Volume ray casting doesn't work fine (Webgl + GLSL + Three.js)

I have tried to improve the quality of my volume ray casting algorithm. I have set a smaller raycast step (the quality is better), but it causes a problem: there are black areas where they shouldn't be (see the pictures below).
I am using an RGB cube to get the direction of the ray in the volume.
I think I have the same algorithm as here: volume rendering (using glsl) with ray casting algorithm
Does anybody have an idea where the problem could be? I need to resolve this, because the deadline of my diploma thesis is too close :( I really don't know why it doesn't work :(
EDIT:
I can't show all my code here (it could be a problem if I publish it before handing it in at school). But here is the key code for stepping through the volume:
// All variables needed for the rays
vec3 rayDirection = texture2D(backFaceCube, texCoo).xyz - varcolor.xyz;
float lenRay = length(rayDirection);
vec3 normDir = normalize(rayDirection);
float d = qualitySteps; // qualitySteps is the step size defined by the user -> example: 0.01, 0.001, 0.0001 etc.
vec3 step = normDir * d;
float lenStep = length(step);
float accumulatedLength = 0.0;
and then in the loop:
posInCube.xyz += step;
accumulatedLength += lenStep;
...
...
...
if(accumulatedLength >= lenRay || accumulatedColor.a > 1.0 ) {
break;
}
EDIT2: (sorry, but as a comment it was too long)
Yes, the texture is noisy... I have tried to delete the condition with alpha (if(accumulatedColor.a > 1.0)), but the result is the same.
I think there is some direct correlation between the length of the ray and the size of the step. I tried many combinations and I have found these things:
If the step is big, I am able to go through the whole volume, but if it is small, then I am really not able to go through the volume (maybe). If the step is extremely big, then I can see a mirrored object (it can be caused by the repeating texture if I go outside of the texture on the GPU). If the step is too small, then I am able to map only a small part of the texture -> it seems that the ray is too short, but in reality it isn't. The questions are: why is the mapping of 3D coordinates to the 2D texture wrong, and why does it depend on the size of the step?
Can you please supply the code for your fragment shader?
Are you traversing the whole vector from the front to the end position? Here's an example shader (the code might contain some errors since I just wrote it off the top of my head; I unfortunately can't test it on my computer at the moment):
in vec2 texCoord;
out vec4 outColor;

uniform float stepSize;
uniform int numSteps;

uniform sampler2D frontTexture;
uniform sampler2D backTexture;
uniform sampler3D volumeTexture;
uniform sampler1D transferTexture; // Density to RGBA

void main()
{
    vec4 color = vec4(0.0);

    vec3 startPosition = texture(frontTexture, texCoord).xyz;
    vec3 endPosition = texture(backTexture, texCoord).xyz;

    // March from the front face toward the back face
    vec3 direction = normalize(endPosition - startPosition) * stepSize;
    vec3 position = startPosition;

    for (int i = 0; i < numSteps; ++i)
    {
        float density = texture(volumeTexture, position).r;
        vec4 voxelColor = texture(transferTexture, density);

        // Sampling distance correction
        voxelColor.a = 1.0 - pow((1.0 - voxelColor.a), stepSize * 500.0);

        // Front to back blending (no shading done)
        color.rgb = color.rgb + (1.0 - color.a) * voxelColor.a * voxelColor.rgb;
        color.a   = color.a   + (1.0 - color.a) * voxelColor.a;

        if (color.a >= 1.0)
        {
            break;
        }

        // Advance
        position += direction;

        // Stop once the ray leaves the unit cube
        if (position.x > 1.0 || position.y > 1.0 || position.z > 1.0 ||
            position.x < 0.0 || position.y < 0.0 || position.z < 0.0)
        {
            break;
        }
    }

    outColor = color;
}

Numeric Stability with Summed Area Tables in Shadow Mapping

I'm having an issue with loss of precision in my SAVSM (summed-area variance shadow maps) setup.
When you see the light moving around, the effect is very striking; there is a lot of noise, with fragments going black and white all the time. This can be somewhat lessened by using the min-variance (thus ignoring anything below a certain threshold), but then we get even worse artifacts with incorrect falloff (see my other post).
I'm using GLSL 1.2 because I'm on a Mac, so I don't have access to the modf function needed to split the precision across two channels as described in GPU Gems 3, Chapter 8.
I'm using GL_RGBA32F_ARB textures with a framebuffer object, ping-ponging two textures to generate a summed-area table which I use with the VSM algorithm.
Moments / depth shader used to create the basis for the tables:
varying vec4 v_position;
varying float tDepth;

float g_DistributeFactor = 1024.0;

void main()
{
    // Is this linear depth? I would say yes but one can't be utterly sure.
    // Could try a divide by the far plane?
    float depth = v_position.z / v_position.w;
    depth = depth * 0.5 + 0.5; // Don't forget to move away from the unit cube ([-1,1]) to the [0,1] coordinate system

    vec2 moments = vec2(depth, depth * depth);

    // Adjusting moments (this is sort of a per-pixel bias) using derivatives
    float dx = dFdx(depth);
    float dy = dFdy(depth);
    moments.y += 0.25 * (dx*dx + dy*dy);

    // Subtract 0.5 off now so we can get this into our summed area table calc
    //moments -= 0.5;

    // Split the moments into rg and ba for EVEN MORE PRECISION
    // float FactorInv = 1.0 / g_DistributeFactor;
    // gl_FragColor = vec4(floor(moments.x) * FactorInv, fract(moments.x) * g_DistributeFactor,
    //                     floor(moments.y) * FactorInv, fract(moments.y) * g_DistributeFactor);

    gl_FragColor = vec4(moments, 0.0, 0.0);
}
The shadowmap shader
varying vec4 v_position;
varying float tDepth;

float g_DistributeFactor = 1024.0;

void main()
{
    // Is this linear depth? I would say yes but one can't be utterly sure.
    // Could try a divide by the far plane?
    float depth = v_position.z / v_position.w;
    depth = depth * 0.5 + 0.5; // Don't forget to move away from the unit cube ([-1,1]) to the [0,1] coordinate system

    vec2 moments = vec2(depth, depth * depth);

    // Adjusting moments (this is sort of a per-pixel bias) using derivatives
    float dx = dFdx(depth);
    float dy = dFdy(depth);
    moments.y += 0.25 * (dx*dx + dy*dy);

    // Subtract 0.5 off now so we can get this into our summed area table calc
    //moments -= 0.5;

    // Split the moments into rg and ba for EVEN MORE PRECISION
    // float FactorInv = 1.0 / g_DistributeFactor;
    // gl_FragColor = vec4(floor(moments.x) * FactorInv, fract(moments.x) * g_DistributeFactor,
    //                     floor(moments.y) * FactorInv, fract(moments.y) * g_DistributeFactor);

    gl_FragColor = vec4(moments, 0.0, 0.0);
}
The summed tables do seem to be working. I know this because I have a function that converts back from the summed table to the original depth map, and the two images look pretty much the same. I'm also using the -0.5 / +0.5 trick to get some more precision, but it doesn't seem to be helping.
My question is this: given that I'm on a Mac which has GLSL 1.2 only, how can I split the precision over two channels? If I could use those extra channels for space in the summed table, then maybe that would work? I've seen some code that uses modf, but that isn't available to me (a floor()/fract() sketch is included below).
Also, people have suggested 32-bit integer buffers, but I don't think I have support for these on my MacBook Pro.
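For reference, the floor()/fract() split that the commented-out code is reaching for can be expressed without modf() in GLSL 1.2. This is only an illustrative sketch (function names made up), not code from the original post:
// Pack a value into a high/low channel pair without modf(); factor plays the
// role of g_DistributeFactor from the shaders above.
vec2 packValue(float v, float factor)
{
    return vec2(floor(v * factor) / factor, fract(v * factor));
}

// Recombine when reading the summed-area table back:
// v == halves.x + halves.y / factor
float unpackValue(vec2 halves, float factor)
{
    return halves.x + halves.y / factor;
}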

DirectX 11 Compute Shader - not writing all values

I am trying some experiments in fractal rendering with DirectX11 Compute Shaders.
The provided example runs on a FeatureLevel_10 device.
My RWStructured output buffer has a data format of R32G32B32A32_FLOAT.
The problem is that when writing to the buffer, it seems that only the alpha (w) value gets written, nothing else...
Here is the shader code:
struct BufType
{
    float4 value;
};

cbuffer ScreenConstants : register(b0)
{
    float2 ScreenDimensions;
    float2 Padding;
};

RWStructuredBuffer<BufType> BufferOut : register(u0);

[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
    uint index = DTid.y * ScreenDimensions.x + DTid.x;

    float minRe = -2.0f;
    float maxRe = 1.0f;
    float minIm = -1.2;
    float maxIm = minIm + ( maxRe - minRe ) * ScreenDimensions.y / ScreenDimensions.x;

    float reFactor = (maxRe - minRe) / (ScreenDimensions.x - 1.0f);
    float imFactor = (maxIm - minIm) / (ScreenDimensions.y - 1.0f);

    float cim = maxIm - DTid.y * imFactor;

    uint maxIterations = 30;

    float cre = minRe + DTid.x * reFactor;
    float zre = cre;
    float zim = cim;

    bool isInside = true;
    uint iterationsRun = 0;

    for( uint n = 0; n < maxIterations; ++n )
    {
        float zre2 = zre * zre;
        float zim2 = zim * zim;
        if ( zre2 + zim2 > 4.0f )
        {
            isInside = false;
            iterationsRun = n;
        }
        zim = 2 * zre * zim + cim;
        zre = zre2 - zim2 + cre;
    }

    if ( isInside )
    {
        BufferOut[index].value = float4(1.0f, 0.0f, 0.0f, 1.0f);
    }
}
The code actually produces, in a sense, the correct result (the 2D Mandelbrot set), but it seems somehow only the alpha value is touched and nothing else is written, although the pixels inside the set should be colored red... (the image is black & white).
Does anybody have a clue what's going on here?
After some fiddling around I found the problem.
I have not found any documentation from MS mentioning this, so it could also be an Nvidia-specific driver issue.
Apparently you are only allowed to write ONCE per compute shader invocation to the same element in a RWStructuredBuffer. And you also HAVE to write ONCE.
I changed the code to accumulate the correct color in a local variable, and now write it only once at the end of the shader.
Everything works perfectly now that way.
I'm not sure, but shouldn't the BufferOut declaration be:
RWStructuredBuffer<BufType> BufferOut : register(u0);
instead of :
RWStructuredBuffer BufferOut : register(u0);
If you are only using a float4 write target, why not use just:
RWBuffer<float4> BufferOut : register (u0);
Maybe this could help.
After playing around again today, I ran into the same problem once more.
The following code produced all white output:
[numthreads(1, 1, 1)]
void Main( uint3 dispatchId : SV_DispatchThreadID )
{
    float4 color = float4(1.0f, 0.0f, 0.0f, 1.0f);
    WriteResult(dispatchId, color);
}
The WriteResult method is a utility method from my HLSL standard library.
Long story short: after I upgraded from driver version 192 to 195 (beta), the problem went away.
It seems the drivers still have some definite problems with compute shader support, so beware.
From what I've seen, compute shaders are only useful if you need a more general computational model than the traditional pixel shader, or if you can load data and then share it between threads in fast shared memory. I'm fairly sure you would get better performance with a pixel shader for the Mandelbrot shader.
On my setup (Win7, Feb 2010 DX SDK, GTX 480), my compute shaders have a punishing setup time of over 0.2-0.3 ms (binding an SRV and a UAV and then calling Dispatch()).
If you do a PS implementation, please post your experiences.
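For comparison, the same iteration as a pixel shader is tiny. Below is a minimal, untested GLSL sketch (the screenDimensions uniform is a made-up name, and the constants mirror the compute shader above), just to illustrate how little the inner loop changes:
#version 330 core
uniform vec2 screenDimensions; // hypothetical: viewport size in pixels
out vec4 fragColor;

void main()
{
    // Map the pixel to the complex plane, same ranges as the compute shader.
    float minRe = -2.0, maxRe = 1.0;
    float minIm = -1.2;
    float maxIm = minIm + (maxRe - minRe) * screenDimensions.y / screenDimensions.x;

    vec2 c = vec2(minRe + gl_FragCoord.x * (maxRe - minRe) / (screenDimensions.x - 1.0),
                  maxIm - gl_FragCoord.y * (maxIm - minIm) / (screenDimensions.y - 1.0));

    vec2 z = c;
    bool inside = true;
    for (int i = 0; i < 30; ++i)
    {
        // z = z^2 + c
        z = vec2(z.x * z.x - z.y * z.y, 2.0 * z.x * z.y) + c;
        if (dot(z, z) > 4.0)
        {
            inside = false;
            break;
        }
    }
    fragColor = inside ? vec4(1.0, 0.0, 0.0, 1.0) : vec4(0.0, 0.0, 0.0, 1.0);
}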
I have no direct experience with DX compute shaders but...
Why are you setting alpha = 1.0?
IIRC, that makes the pixel 100% transparent, so your inside pixels are transparent red, and show up as whatever color was drawn behind them.
When alpha = 1.0, the RGB components are never used.
