MCPE GLSL Conditionals - opengl-es

I'm writing an iOS vertex shader that "flattens" an MC world in the x direction if a change in the z direction is detected, and vice versa (the xz plane is perpendicular to height). I have several shaders that warp the world just fine, but this pseudo movement detection hasn't worked. I know conditionals are costly. Comparing the position to a literal number worked:
if (worldPos.x != 4.) {
worldPos.z = 0.;
}
But comparing the position to a stored copy of the position doesn't. So far I've tried assigning constant floats to the x and z components, uniform floats, and a POS4 uniform, but with no success. I have a feeling the conditionals fail because of a data type problem? It would be easier to debug if the PE version displayed coordinates like the PC version. Thanks for any/all help! Current code:
uniform POS4 CHUNK_ORIGIN_AND_SCALE;
attribute POS4 POSITION;
void main()
{
POS4 worldPos;
worldPos.xyz = (POSITION.xyz * CHUNK_ORIGIN_AND_SCALE.w) + CHUNK_ORIGIN_AND_SCALE.xyz;
worldPos.w = 1.;
const float staticPosx = worldPos.x;
const float staticPosz = worldPos.z;
if (worldPos.x != staticPosx) {
worldPos.z = 0.;
staticPosx = worldPos.x;
}
if (worldPos.z != staticPosz) {
worldPos.x = 0.;
staticPosz = worldPos.z;
}
etc.

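I didn't try it in GLSL, but const looks like the culprit: in GLSL ES a const must be initialized from a constant expression and can never be assigned to afterwards, so both the initialization from worldPos.x and the later staticPosx = worldPos.x are errors. C rejects the same pattern at file scope: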
int a=5;
const int b=a;
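Apart from the const issue, here is a minimal sketch in C of why the comparison can never fire, assuming the intent was to detect movement between frames. Vec3, flatten_local_copy and flatten_with_reference are hypothetical names, not MCPE or GLSL API. A vertex shader keeps no state between invocations, so a copy taken inside main() always equals the value it was copied from; the reference position would have to come from the application (for example as a uniform).

```c
#include <stdbool.h>

typedef struct { float x, y, z; } Vec3;

/* Mirrors the shader logic: the "static" copy is taken from the same
   value it is compared with, so neither branch can ever run. */
Vec3 flatten_local_copy(Vec3 world_pos) {
    float static_x = world_pos.x;
    float static_z = world_pos.z;
    if (world_pos.x != static_x) world_pos.z = 0.0f;
    if (world_pos.z != static_z) world_pos.x = 0.0f;
    return world_pos;
}

/* Hypothetical alternative: the reference position is supplied from
   outside (in a shader this would be a uniform set by the app). */
Vec3 flatten_with_reference(Vec3 world_pos, Vec3 reference) {
    bool x_changed = world_pos.x != reference.x;
    bool z_changed = world_pos.z != reference.z;
    if (x_changed) world_pos.z = 0.0f;
    if (z_changed) world_pos.x = 0.0f;
    return world_pos;
}
```

With the local copy the input comes back unchanged; with an external reference the flattening actually happens.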

Related

Has anyone encountered this striping artifact in "Ray Tracing in One Weekend"?

I am trying to port "Ray Tracing in One Weekend" to a Metal compute shader, and I get these stripe artifacts in my project:
Is it because my random generator does not work well?
Does anyone have a clue?
// Implementation based on https://www.pcg-random.org/
typedef struct { uint64_t state; uint64_t inc; } pcg32_random_t;
uint32_t pcg32_random_r(thread pcg32_random_t* rng)
{
uint64_t oldstate = rng->state;
rng->state = oldstate * 6364136223846793005ULL + rng->inc;
uint32_t xorshifted = ((oldstate >> 18u) ^ oldstate) >> 27u;
uint32_t rot = oldstate >> 59u;
return (xorshifted >> rot) | (xorshifted << ((-rot) & 31));
}
void pcg32_srandom_r(thread pcg32_random_t* rng, uint64_t initstate, uint64_t initseq)
{
rng->state = 0U;
rng->inc = (initseq << 1u) | 1u;
pcg32_random_r(rng);
rng->state += initstate;
pcg32_random_r(rng);
}
// Generate a float between 0 and 1
float randomF(thread pcg32_random_t* rng)
{
//return pcg32_random_r(rng)/float(UINT_MAX);
return ldexp(float(pcg32_random_r(rng)), -32);
}
// Generate a float between x_min and x_max
float randomRange(thread pcg32_random_t* rng, float x_min, float x_max){
return randomF(rng) * (x_max - x_min) + x_min;
}
I found this link. It says that, due to floating-point precision error, the primary ray's hit point ends up slightly above or below the sphere's surface. It is a z-fighting problem:
Seeing these "circles around the image center" artifacts is almost always a dead giveaway for z-fighting (rays along any such "circle" always have the same distance to any flat object you're looking at, so they either all round up or all round down, giving you this artifact).
This z-fighting then translates to this artifact because sometimes all the rays on such a circle are inside the sphere (meaning their shadow rays get self-occluded by the sphere), or all outside (they do what they should). What you want to do is offset the ray origin of the shadow rays a tiny bit (say, 1e-3f) along the normal direction at the hitpoint.
You may also want to read up on Carsten Waechter's article in Ray Tracing Gems 1 - it's for triangles, but explains the problem and potential solution very well.
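The offset described above could look roughly like this. It is a sketch in C rather than Metal; Vec3 and offset_shadow_origin are hypothetical names, the normal is assumed to be unit length and to point out of the surface, and 1e-3f is just the value suggested in the answer, which may need tuning to the scene scale.

```c
typedef struct { float x, y, z; } Vec3;

/* Push the shadow-ray origin off the surface along the unit normal,
   so the ray cannot immediately re-hit the surface it starts on. */
Vec3 offset_shadow_origin(Vec3 hit, Vec3 unit_normal, float epsilon) {
    Vec3 origin = {
        hit.x + unit_normal.x * epsilon,
        hit.y + unit_normal.y * epsilon,
        hit.z + unit_normal.z * epsilon
    };
    return origin;
}
```

The shadow ray is then traced from the returned origin instead of from the raw hit point.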

Math differences between GLSL and Metal

I played around with GLSL and got this effect. I tried to convert it to Metal, but I got some funky results along the y-axis where y is smaller than 0:
There are these funny curvy crop-offs for most of the cubes above the horizon (y < 0). This is my Metal code:
static float mod(float x, float y)
{
return x - y * floor(x/y);
}
static float vmax(float3 v) {
return max(max(v.x, v.y), v.z);
}
float fBoxCheap(float3 p, float3 b) { //cheap box
return vmax(abs(p) - b);
}
static float map( float3 p )
{
p.x = mod(p.x + 5,10)-5;
p.y = mod(p.y + 5 ,10)-5;
p.z = mod(p.z + 5 ,10)-5;
float box = fBoxCheap(p-float3(0.0,3.0,0.0),float3(4.0,3.0,1.0));
return box;
}
It is almost the same code in GLSL:
float vmax(vec3 v) {
return max(max(v.x, v.y), v.z);
}
float box(vec3 p, vec3 b) { //cheap box
return vmax(abs(p) - b);
}
float map( vec3 p )
{
p.x=mod(p.x+3.0,6.0)-3.0;
p.y=mod(p.y+3.0,6.0)-3.0;
p.z=mod(p.z+3.0,6.0)-3.0;
return box( p, vec3(1.,1.,1.) );
}
How can I resolve this?
I am fairly new to both GLSL and Metal, but I find Metal trickier because of these math issues.
I don't think there's a difference here. You can create similar artifacts in the GL version by applying all of the same modifications you do in the Metal version. The problem is that offsetting the point after you fold space with mod violates the requirement that the SDF be Lipschitz continuous (i.e., the gradient must be <= 1 everywhere). If you want to translate the box, translate p before applying mod.

How to implement this function without an if-else branch? (GLSL)

I'm working on a game using OpenGL ES 2.0.
I would like to eliminate branches in the fragment shader, if possible. But there is a function, which I cannot improve:
float HS(in float p, in float c) {
float ap = abs(p);
if( ap > (c*1.5) ) {
return ap - c ;
} else {
return mod(ap+0.5*c,c)-0.5*c;
}
}
The c is a constant in most of the cases, if it helps in this situation. I use this function like this:
vec3 op = sign(p1)*vec3(
HS(p1.x, cc),
HS(p1.y, cc),
HS(p1.z, cc)
);
Here's a trick that "eliminates" the branch. But the more important thing it does is vectorize your code. After all, the compiler probably eliminated the branch for you; it's far less likely that it realized it could do this:
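vec3 HSvec(in vec3 p, const in float c)
{
vec3 ap = abs(p);
vec3 side1 = ap - c; // result where ap > 1.5 * c
float val = 0.5 * c; // not const: c is not a constant expression in GLSL ES 1.00
vec3 side2 = mod(ap + val, vec3(c)) - val; // result everywhere else
bvec3 tests = greaterThan(ap, vec3(c*1.5));
return mix(side2, side1, vec3(tests)); // per-component select
}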
This eliminates lots of redundant computations, as well as doing lots of computations simultaneously.
The key here is the mix function. mix performs linear interpolation between the two arguments based on the third. But since a bool converted to a float will be exactly 1.0 or 0.0, it's really just selecting either side1 or side2. And this selection is defined by the results of the component-wise greaterThan operation.

Random / noise functions for GLSL

As the GPU driver vendors don't usually bother to implement noiseX in GLSL, I'm looking for a "graphics randomization Swiss army knife" utility function set, preferably optimised for use within GPU shaders. I prefer GLSL, but code in any language will do for me; I'm OK with translating it to GLSL on my own.
Specifically, I'd expect:
a) Pseudo-random functions - N-dimensional, uniform distribution over [-1,1] or over [0,1], calculated from M-dimensional seed (ideally being any value, but I'm OK with having the seed restrained to, say, 0..1 for uniform result distribution). Something like:
float random (T seed);
vec2 random2 (T seed);
vec3 random3 (T seed);
vec4 random4 (T seed);
// T being either float, vec2, vec3, vec4 - ideally.
b) Continuous noise like Perlin noise - again, N-dimensional, more or less uniform distribution, with a constrained set of values and, well, looking good (some options to configure the appearance, like Perlin levels, could be useful too). I'd expect signatures like:
float noise (T coord, TT seed);
vec2 noise2 (T coord, TT seed);
// ...
I'm not very much into random number generation theory, so I'd most eagerly go for a pre-made solution, but I'd also appreciate answers like "here's a very good, efficient 1D rand(), and let me explain you how to make a good N-dimensional rand() on top of it..." .
For very simple pseudorandom-looking stuff, I use this oneliner that I found on the internet somewhere:
float rand(vec2 co){
return fract(sin(dot(co, vec2(12.9898, 78.233))) * 43758.5453);
}
You can also generate a noise texture using whatever PRNG you like, then upload this in the normal fashion and sample the values in your shader; I can dig up a code sample later if you'd like.
Also, check out this file for GLSL implementations of Perlin and Simplex noise, by Stefan Gustavson.
It occurs to me that you could use a simple integer hash function and insert the result into a float's mantissa. IIRC the GLSL spec guarantees 32-bit unsigned integers and IEEE binary32 float representation so it should be perfectly portable.
I gave this a try just now. The results are very good: it looks exactly like static with every input I tried, no visible patterns at all. In contrast the popular sin/fract snippet has fairly pronounced diagonal lines on my GPU given the same inputs.
One disadvantage is that it requires GLSL v3.30. And although it seems fast enough, I haven't empirically quantified its performance. AMD's Shader Analyzer claims 13.33 pixels per clock for the vec2 version on a HD5870. Contrast with 16 pixels per clock for the sin/fract snippet. So it is certainly a little slower.
Here's my implementation. I left it in various permutations of the idea to make it easier to derive your own functions from.
/*
static.frag
by Spatial
05 July 2013
*/
#version 330 core
uniform float time;
out vec4 fragment;
// A single iteration of Bob Jenkins' One-At-A-Time hashing algorithm.
uint hash( uint x ) {
x += ( x << 10u );
x ^= ( x >> 6u );
x += ( x << 3u );
x ^= ( x >> 11u );
x += ( x << 15u );
return x;
}
// Compound versions of the hashing algorithm I whipped together.
uint hash( uvec2 v ) { return hash( v.x ^ hash(v.y) ); }
uint hash( uvec3 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ); }
uint hash( uvec4 v ) { return hash( v.x ^ hash(v.y) ^ hash(v.z) ^ hash(v.w) ); }
// Construct a float with half-open range [0:1] using low 23 bits.
// All zeroes yields 0.0, all ones yields the next smallest representable value below 1.0.
float floatConstruct( uint m ) {
const uint ieeeMantissa = 0x007FFFFFu; // binary32 mantissa bitmask
const uint ieeeOne = 0x3F800000u; // 1.0 in IEEE binary32
m &= ieeeMantissa; // Keep only mantissa bits (fractional part)
m |= ieeeOne; // Add fractional part to 1.0
float f = uintBitsToFloat( m ); // Range [1:2]
return f - 1.0; // Range [0:1]
}
// Pseudo-random value in half-open range [0:1].
float random( float x ) { return floatConstruct(hash(floatBitsToUint(x))); }
float random( vec2 v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec3 v ) { return floatConstruct(hash(floatBitsToUint(v))); }
float random( vec4 v ) { return floatConstruct(hash(floatBitsToUint(v))); }
void main()
{
vec3 inputs = vec3( gl_FragCoord.xy, time ); // Spatial and temporal inputs
float rand = random( inputs ); // Random per-pixel value
vec3 luma = vec3( rand ); // Expand to RGB
fragment = vec4( luma, 1.0 );
}
Screenshot:
I inspected the screenshot in an image editing program. There are 256 colours and the average value is 127, meaning the distribution is uniform and covers the expected range.
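For anyone who wants to test the mantissa trick on the CPU, here is a rough C equivalent. It assumes float is IEEE binary32 and uses memcpy as the bit-cast in place of uintBitsToFloat; hash_u32, float_construct and random_from_bits are illustrative names only.

```c
#include <stdint.h>
#include <string.h>

/* Same integer mixing steps as hash(uint) in the shader. */
uint32_t hash_u32(uint32_t x) {
    x += x << 10;
    x ^= x >> 6;
    x += x << 3;
    x ^= x >> 11;
    x += x << 15;
    return x;
}

/* Keep the low 23 bits as the mantissa of a float in [1, 2),
   then subtract 1.0 to land in [0, 1). */
float float_construct(uint32_t m) {
    float f;
    m &= 0x007FFFFFu;            /* mantissa bits only          */
    m |= 0x3F800000u;            /* exponent bits of 1.0        */
    memcpy(&f, &m, sizeof f);    /* bit-cast, assumes binary32  */
    return f - 1.0f;
}

float random_from_bits(uint32_t seed) {
    return float_construct(hash_u32(seed));
}
```

All-zero mantissa bits give 0.0, only the top mantissa bit gives 0.5, and all ones gives the largest value below 1.0.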
"Gustavson's implementation uses a 1D texture"
No it doesn't, not since 2005. It's just that people insist on downloading the old version. The version that is on the link you supplied uses only 8-bit 2D textures.
The new version by Ian McEwan of Ashima and myself does not use a texture, but runs at around half the speed on typical desktop platforms with lots of texture bandwidth. On mobile platforms, the textureless version might be faster because texturing is often a significant bottleneck.
Our actively maintained source repository is:
https://github.com/ashima/webgl-noise
A collection of both the textureless and texture-using versions of noise is here (using only 2D textures):
http://www.itn.liu.se/~stegu/simplexnoise/GLSL-noise-vs-noise.zip
If you have any specific questions, feel free to e-mail me directly (my email address can be found in the classicnoise*.glsl sources.)
Gold Noise
// Gold Noise ©2015 dcerisano@standard3d.com
// - based on the Golden Ratio
// - uniform normalized distribution
// - fastest static noise generator function (also runs at low precision)
// - use with indicated fractional seeding method.
float PHI = 1.61803398874989484820459; // Φ = Golden Ratio
float gold_noise(in vec2 xy, in float seed){
return fract(tan(distance(xy*PHI, xy)*seed)*xy.x);
}
See Gold Noise in your browser right now!
This function has improved random distribution over the current function in @appas' answer as of Sept 9, 2017:
The @appas function is also incomplete, given there is no seed supplied (uv is not a seed - same for every frame), and does not work with low precision chipsets. Gold Noise runs at low precision by default (much faster).
There is also a nice implementation described here by McEwan and @StefanGustavson that looks like Perlin noise, but "does not require any setup, i.e. not textures nor uniform arrays. Just add it to your shader source code and call it wherever you want".
That's very handy, especially given that Gustavson's earlier implementation, which @dep linked to, uses a 1D texture, which is not supported in GLSL ES (the shader language of WebGL).
After the initial posting of this question in 2010, a lot has changed in the realm of good random functions and hardware support for them.
Looking at the accepted answer from today's perspective, this algorithm is very bad in the uniformity of the random numbers drawn from it. The uniformity suffers a lot depending on the magnitude of the input values, and visible artifacts/patterns become apparent when sampling from it for, e.g., ray/path tracing applications.
Many different functions (most of them integer hashes) have been devised for this task, for different input and output dimensionality, and most of them are evaluated in the 2020 JCGT paper Hash Functions for GPU Rendering. Depending on your needs, you could select a function from the list proposed in that paper, or simply from the accompanying Shadertoy.
One that isn't covered in this paper, but that has served me very well without any noticeable patterns at any input magnitude, is also one that I want to highlight.
Other classes of algorithms use low-discrepancy sequences to draw pseudo-random numbers from, such as the Sobol sequence with Owen-Nayar scrambling. Eric Heitz has done some amazing research in this area as well, with his paper A Low-Discrepancy Sampler that Distributes Monte Carlo Errors as a Blue Noise in Screen Space.
Another example of this is the (so far latest) JCGT paper Practical Hash-based Owen Scrambling, which applies Owen scrambling to a different hash function (namely Laine-Karras).
Yet other classes use algorithms that produce noise patterns with desirable frequency spectrums, such as blue noise, that is particularly "pleasing" to the eyes.
(I realize that good StackOverflow answers should provide the algorithms as source code and not as links because those can break, but there are way too many different algorithms nowadays and I intend for this answer to be a summary of known-good algorithms today)
Do use this:
highp float rand(vec2 co)
{
highp float a = 12.9898;
highp float b = 78.233;
highp float c = 43758.5453;
highp float dt= dot(co.xy ,vec2(a,b));
highp float sn= mod(dt,3.14);
return fract(sin(sn) * c);
}
Don't use this:
float rand(vec2 co){
return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453);
}
You can find the explanation in Improvements to the canonical one-liner GLSL rand() for OpenGL ES 2.0
hash:
Nowadays WebGL 2.0 is available, so integers are available in (WebGL) GLSL.
-> for a quality portable hash (at a cost similar to the ugly float hashes) we can now use "serious" hashing techniques.
IQ implemented some in https://www.shadertoy.com/view/XlXcW4 (and more)
E.g.:
const uint k = 1103515245U; // GLIB C
//const uint k = 134775813U; // Delphi and Turbo Pascal
//const uint k = 20170906U; // Today's date (use three days ago's date if you want a prime)
//const uint k = 1664525U; // Numerical Recipes
vec3 hash( uvec3 x )
{
x = ((x>>8U)^x.yzx)*k;
x = ((x>>8U)^x.yzx)*k;
x = ((x>>8U)^x.yzx)*k;
return vec3(x)*(1.0/float(0xffffffffU));
}
Just found this version of 3D noise for the GPU; allegedly it is the fastest one available:
#ifndef __noise_hlsl_
#define __noise_hlsl_
// hash based 3d value noise
// function taken from https://www.shadertoy.com/view/XslGRr
// Created by inigo quilez - iq/2013
// License Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
// ported from GLSL to HLSL
float hash( float n )
{
return frac(sin(n)*43758.5453);
}
float noise( float3 x )
{
// hash() returns frac(...), so this noise function returns a value in the range [0.0f, 1.0f)
float3 p = floor(x);
float3 f = frac(x);
f = f*f*(3.0-2.0*f);
float n = p.x + p.y*57.0 + 113.0*p.z;
return lerp(lerp(lerp( hash(n+0.0), hash(n+1.0),f.x),
lerp( hash(n+57.0), hash(n+58.0),f.x),f.y),
lerp(lerp( hash(n+113.0), hash(n+114.0),f.x),
lerp( hash(n+170.0), hash(n+171.0),f.x),f.y),f.z);
}
#endif
A straight, jagged version of 1d Perlin, essentially a random lfo zigzag.
half rn(float xx){
half x0=floor(xx);
half x1=x0+1;
half v0 = frac(sin (x0*.014686)*31718.927+x0);
half v1 = frac(sin (x1*.014686)*31718.927+x1);
return (v0*(1-frac(xx))+v1*(frac(xx)))*2-1*sin(xx);
}
I have also found 1D to 4D Perlin noise, Voronoi and so forth on the tutorial website of Inigo Quilez (the owner of Shadertoy); he has full, fast implementations and code for them.
I have translated one of Ken Perlin's Java implementations into GLSL and used it in a couple projects on ShaderToy.
Below is the GLSL interpretation I did:
int b(int N, int B) { return N>>B & 1; }
int T[] = int[](0x15,0x38,0x32,0x2c,0x0d,0x13,0x07,0x2a);
int A[] = int[](0,0,0);
int b(int i, int j, int k, int B) { return T[b(i,B)<<2 | b(j,B)<<1 | b(k,B)]; }
int shuffle(int i, int j, int k) {
return b(i,j,k,0) + b(j,k,i,1) + b(k,i,j,2) + b(i,j,k,3) +
b(j,k,i,4) + b(k,i,j,5) + b(i,j,k,6) + b(j,k,i,7) ;
}
float K(int a, vec3 uvw, vec3 ijk)
{
float s = float(A[0]+A[1]+A[2])/6.0;
float x = uvw.x - float(A[0]) + s,
y = uvw.y - float(A[1]) + s,
z = uvw.z - float(A[2]) + s,
t = 0.6 - x * x - y * y - z * z;
int h = shuffle(int(ijk.x) + A[0], int(ijk.y) + A[1], int(ijk.z) + A[2]);
A[a]++;
if (t < 0.0)
return 0.0;
int b5 = h>>5 & 1, b4 = h>>4 & 1, b3 = h>>3 & 1, b2= h>>2 & 1, b = h & 3;
float p = b==1?x:b==2?y:z, q = b==1?y:b==2?z:x, r = b==1?z:b==2?x:y;
p = (b5==b3 ? -p : p); q = (b5==b4 ? -q : q); r = (b5!=(b4^b3) ? -r : r);
t *= t;
return 8.0 * t * t * (p + (b==0 ? q+r : b2==0 ? q : r));
}
float noise(float x, float y, float z)
{
float s = (x + y + z) / 3.0;
vec3 ijk = vec3(int(floor(x+s)), int(floor(y+s)), int(floor(z+s)));
s = float(ijk.x + ijk.y + ijk.z) / 6.0;
vec3 uvw = vec3(x - float(ijk.x) + s, y - float(ijk.y) + s, z - float(ijk.z) + s);
A[0] = A[1] = A[2] = 0;
int hi = uvw.x >= uvw.z ? uvw.x >= uvw.y ? 0 : 1 : uvw.y >= uvw.z ? 1 : 2;
int lo = uvw.x < uvw.z ? uvw.x < uvw.y ? 0 : 1 : uvw.y < uvw.z ? 1 : 2;
return K(hi, uvw, ijk) + K(3 - hi - lo, uvw, ijk) + K(lo, uvw, ijk) + K(0, uvw, ijk);
}
I translated it from Appendix B from Chapter 2 of Ken Perlin's Noise Hardware at this source:
https://www.csee.umbc.edu/~olano/s2002c36/ch02.pdf
Here is a public shader I made on Shadertoy that uses the posted noise function:
https://www.shadertoy.com/view/3slXzM
Some other good sources I found on the subject of noise during my research include:
https://thebookofshaders.com/11/
https://mzucker.github.io/html/perlin-noise-math-faq.html
https://rmarcus.info/blog/2018/03/04/perlin-noise.html
http://flafla2.github.io/2014/08/09/perlinnoise.html
https://mrl.nyu.edu/~perlin/noise/
https://rmarcus.info/blog/assets/perlin/perlin_paper.pdf
https://developer.nvidia.com/gpugems/GPUGems/gpugems_ch05.html
I highly recommend the book of shaders as it not only provides a great interactive explanation of noise, but other shader concepts as well.
EDIT:
Might be able to optimize the translated code by using some of the hardware-accelerated functions available in GLSL. Will update this post if I end up doing this.
lygia, a multi-language shader library
If you don't want to copy/paste the functions into your shader, you can also use lygia, a multi-language shader library. It contains a few generative functions like cnoise, fbm, noised, pnoise, random and snoise in both GLSL and HLSL, and many other awesome functions as well. For this to work, it:
Relies on #include "file", which is defined by the Khronos GLSL standard and supported by most engines and environments (like glslViewer, the glsl-canvas VS Code plugin, Unity, etc.).
Example: cnoise
Using cnoise.glsl with #include:
#ifdef GL_ES
precision mediump float;
#endif
uniform vec2 u_resolution;
uniform float u_time;
#include "lygia/generative/cnoise.glsl"
void main (void) {
vec2 st = gl_FragCoord.xy / u_resolution.xy;
vec3 color = vec3(cnoise(vec3(st * 5.0, u_time)));
gl_FragColor = vec4(color, 1.0);
}
To run this example I used glslViewer.
Please see below an example how to add white noise to the rendered texture.
The solution is to use two textures: original and pure white noise, like this one: wiki white noise
private static final String VERTEX_SHADER =
"uniform mat4 uMVPMatrix;\n" +
"uniform mat4 uMVMatrix;\n" +
"uniform mat4 uSTMatrix;\n" +
"attribute vec4 aPosition;\n" +
"attribute vec4 aTextureCoord;\n" +
"varying vec2 vTextureCoord;\n" +
"varying vec4 vInCamPosition;\n" +
"void main() {\n" +
" vTextureCoord = (uSTMatrix * aTextureCoord).xy;\n" +
" gl_Position = uMVPMatrix * aPosition;\n" +
"}\n";
private static final String FRAGMENT_SHADER =
"precision mediump float;\n" +
"uniform sampler2D sTextureUnit;\n" +
"uniform sampler2D sNoiseTextureUnit;\n" +
"uniform float uNoseFactor;\n" +
"varying vec2 vTextureCoord;\n" +
"varying vec4 vInCamPosition;\n" +
"void main() {\n" +
" gl_FragColor = texture2D(sTextureUnit, vTextureCoord);\n" +
" vec4 vRandChosenColor = texture2D(sNoiseTextureUnit, fract(vTextureCoord + uNoseFactor));\n" +
" gl_FragColor.r += (0.05 * vRandChosenColor.r);\n" +
" gl_FragColor.g += (0.05 * vRandChosenColor.g);\n" +
" gl_FragColor.b += (0.05 * vRandChosenColor.b);\n" +
"}\n";
The fragment shader contains the parameter uNoseFactor, which the main application updates on every render:
float noiseValue = (float)(mRand.nextInt() % 1000)/1000;
int noiseFactorUniformHandle = GLES20.glGetUniformLocation( mProgram, "uNoseFactor");
GLES20.glUniform1f(noiseFactorUniformHandle, noiseValue);
FWIW I had the same questions and I needed it to be implemented in WebGL 1.0, so I couldn't use a few of the examples given in previous answers. I tried the Gold Noise mentioned before, but the use of PHI doesn't really click for me: distance(xy * PHI, xy) * seed just equals length(xy) * (PHI - 1.0) * seed, so I don't see how the magic of PHI should be put to work when it gets directly multiplied by seed.
Anyway, I did something similar just without PHI and instead added some variation at another place, basically I take the tan of the distance between xy and some random point lying outside of the frame to the top right and then multiply with the distance between xy and another such random point lying in the bottom left (so there is no accidental match between these points). Looks pretty decent as far as I can see. Click to generate new frames.
(function main() {
const dim = [512, 512];
twgl.setDefaults({ attribPrefix: "a_" });
const gl = twgl.getContext(document.querySelector("canvas"));
gl.canvas.width = dim[0];
gl.canvas.height = dim[1];
const bfi = twgl.primitives.createXYQuadBufferInfo(gl);
const pgi = twgl.createProgramInfo(gl, ["vs", "fs"]);
gl.canvas.onclick = (() => {
twgl.bindFramebufferInfo(gl, null);
gl.useProgram(pgi.program);
twgl.setUniforms(pgi, {
u_resolution: dim,
u_seed: Array(4).fill().map(Math.random)
});
twgl.setBuffersAndAttributes(gl, pgi, bfi);
twgl.drawBufferInfo(gl, bfi);
});
})();
<script src="https://twgljs.org/dist/4.x/twgl-full.min.js"></script>
<script id="vs" type="x-shader/x-vertex">
attribute vec4 a_position;
attribute vec2 a_texcoord;
void main() {
gl_Position = a_position;
}
</script>
<script id="fs" type="x-shader/x-fragment">
precision highp float;
uniform vec2 u_resolution;
uniform vec2 u_seed[2];
void main() {
float uni = fract(
tan(distance(
gl_FragCoord.xy,
u_resolution * (u_seed[0] + 1.0)
)) * distance(
gl_FragCoord.xy,
u_resolution * (u_seed[1] - 2.0)
)
);
gl_FragColor = vec4(uni, uni, uni, 1.0);
}
</script>
<canvas></canvas>

DirectX 11 Compute Shader - not writing all values

I am trying some experiments in fractal rendering with DirectX 11 compute shaders.
The provided example runs on a FeatureLevel_10 device.
My RWStructuredBuffer output buffer has a data format of R32G32B32A32_FLOAT.
The problem is that when writing to the buffer, only the alpha (w) value seems to get written and nothing else.
Here is the shader code:
struct BufType
{
float4 value;
};
cbuffer ScreenConstants : register(b0)
{
float2 ScreenDimensions;
float2 Padding;
};
RWStructuredBuffer<BufType> BufferOut : register(u0);
[numthreads(1, 1, 1)]
void Main( uint3 DTid : SV_DispatchThreadID )
{
uint index = DTid.y * ScreenDimensions.x + DTid.x;
float minRe = -2.0f;
float maxRe = 1.0f;
float minIm = -1.2;
float maxIm = minIm + ( maxRe - minRe ) * ScreenDimensions.y / ScreenDimensions.x;
float reFactor = (maxRe - minRe ) / (ScreenDimensions.x - 1.0f);
float imFactor = (maxIm - minIm ) / (ScreenDimensions.y - 1.0f);
float cim = maxIm - DTid.y * imFactor;
uint maxIterations = 30;
float cre = minRe + DTid.x * reFactor;
float zre = cre;
float zim = cim;
bool isInside = true;
uint iterationsRun = 0;
for( uint n = 0; n < maxIterations; ++n )
{
float zre2 = zre * zre;
float zim2 = zim * zim;
if ( zre2 + zim2 > 4.0f )
{
isInside = false;
iterationsRun = n;
}
zim = 2 * zre * zim + cim;
zre = zre2 - zim2 + cre;
}
if ( isInside )
{
BufferOut[index].value = float4(1.0f,0.0f,0.0f,1.0f);
}
}
The code actually produces in a sense the correct result ( 2D Mandelbrot set ) but it seems somehow only the alpha value is touched and nothing else is written, although the pixels inside the set should be colored red... ( the image is black & white )
Does anybody have a clue what's going on here?
After some fiddling around I found the problem.
I have not found any documentation from MS mentioning this, so it could also be an Nvidia-specific driver issue.
Apparently you are only allowed to write ONCE per compute shader invocation to the same element in a RWStructuredBuffer. And you also HAVE to write ONCE.
I changed the code to accumulate the correct color in a local variable and write it only once at the end of the shader.
Everything works perfectly that way.
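The write-once structure can be sketched on the CPU like this. It is C, not HLSL, and Color, mandelbrot_color and render are hypothetical names; the constants follow the shader in the question. The early break is my addition and is not in the original loop, and the black color for outside pixels is an assumption, since the original shader writes nothing for them.

```c
#include <stddef.h>

typedef struct { float r, g, b, a; } Color;

/* Decide the color in a local variable; nothing is stored here. */
Color mandelbrot_color(float cre, float cim, unsigned max_iterations) {
    Color color = {0.0f, 0.0f, 0.0f, 1.0f};   /* assumed default: outside */
    float zre = cre, zim = cim;
    int is_inside = 1;
    for (unsigned n = 0; n < max_iterations; ++n) {
        float zre2 = zre * zre, zim2 = zim * zim;
        if (zre2 + zim2 > 4.0f) { is_inside = 0; break; }  /* break added */
        zim = 2.0f * zre * zim + cim;
        zre = zre2 - zim2 + cre;
    }
    if (is_inside) color.r = 1.0f;            /* inside the set: red */
    return color;
}

/* Every output element is written exactly once, on every path.
   Assumes width > 1 and height > 1. */
void render(Color *out, unsigned width, unsigned height) {
    const float min_re = -2.0f, max_re = 1.0f, min_im = -1.2f;
    const float max_im = min_im + (max_re - min_re) * (float)height / (float)width;
    const float re_factor = (max_re - min_re) / (float)(width - 1);
    const float im_factor = (max_im - min_im) / (float)(height - 1);
    for (unsigned y = 0; y < height; ++y)
        for (unsigned x = 0; x < width; ++x)
            out[(size_t)y * width + x] = mandelbrot_color(
                min_re + (float)x * re_factor,
                max_im - (float)y * im_factor, 30u);
}
```

In the shader, the equivalent would be a single BufferOut[index].value = color; as the last statement of Main.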
I'm not sure but, shouldn't it be for BufferOut decl:
RWStructuredBuffer<BufType> BufferOut : register(u0);
instead of :
RWStructuredBuffer BufferOut : register(u0);
If you are only using a float4 write target, why not use just:
RWBuffer<float4> BufferOut : register (u0);
Maybe this could help.
After playing around again today, I ran into the same problem once more.
The following code produced all white output:
[numthreads(1, 1, 1)]
void Main( uint3 dispatchId : SV_DispatchThreadID )
{
float4 color = float4(1.0f,0.0f,0.0f,1.0f);
WriteResult(dispatchId,color);
}
The WriteResult method is a utility method from my hlsl standard library.
Long story short: after I upgraded from driver version 192 to 195 (beta), the problem went away.
It seems the drivers still have some definite problems in compute shader support, so beware.
From what I've seen, compute shaders are only useful if you need a more general computational model than the traditional pixel shader, or if you can load data and then share it between threads in fast shared memory. I'm fairly sure you would get better performance with a pixel shader for the Mandelbrot shader.
On my setup (Win7, Feb 10 DX SDK, GTX480) my compute shaders have a punishing setup time of over 0.2-0.3 ms (binding an SRV and a UAV and then calling Dispatch()).
If you do a PS implementation, please post your experiences.
I have no direct experience with DX compute shaders but...
Why are you setting alpha = 1.0?
IIRC, that makes the pixel 100% transparent, so your inside pixels are transparent red, and show up as whatever color was drawn behind them.
When alpha = 1.0, the RGB components are never used.

Resources