In WebGL or OpenGL is it bad to use an output fragment variable as temp storage?

Ignoring coding patterns and code clarity/quality:
I just don't know whether this is inherently bad or inconsequential. I can't find enough detail on the inner workings of assigning to, say, gl_FragColor in WebGL 1.0 or to an out variable in WebGL 2 (layout(location = 0) out vec4 color).
Is there some inherent additional performance cost for doing something like:
void main() {
    gl_FragColor = vec4(0., 0., 0., 1.);
    vec4 val = gl_FragColor * something;
    ...
    gl_FragColor = val;
}
Or is it better to work entirely with separately declared temporary variables and then assign to the out value a single time?
void main() {
    vec4 thing = vec4(0., 0., 0., 1.);
    vec4 val = thing * something;
    ...
    gl_FragColor = val;
}

I'm only guessing the answer is "no, there is no consequence". The driver can do whatever it wants or needs to get the correct answer. If gl_FragColor is special, the driver can make its own temp. GPU vendors compete on performance. Shaders are translated to the assembly language of the GPU itself, so it's unlikely gl_FragColor is special except as a way to tell the compiler which value to actually output when it's all done computing.
I do it all the time, and so does three.js, as an example.

Related

GLSL vertex shader performance with early return and branching

I have a vertex shader like this:
void main() {
    vec4 wPos = modelMatrix * vec4(position, 1.);
    vWorldPosition = wPos.xyz;
    float mask = step(
        0.,
        dot(
            cameraDir,
            normalize(normalMatrix * aNormal)
        )
    );
    gl_PointSize = mask * uPointSize;
    gl_Position = projectionMatrix * viewMatrix * wPos;
}
I'm not entirely sure how to test the performance of the shader and exclude other factors like overdraw. I imagine points of size 1, arranged in a grid in screen space without any overlap, would work?
Otherwise I'm curious about these tweaks (removes step, removes a multiplication, introduces if/else):
void main() {
    if (dot(
        cameraDir,
        normalize(normalMatrix * aNormal) // remove step
    ) < 0.) {
        gl_Position = vec4(0., .0, -2., .1);
        gl_PointSize = 0.;
    } else {
        gl_PointSize = uPointSize; // remove a multiplication
        vec4 wPos = modelMatrix * vec4(position, 1.);
        vWorldPosition = wPos.xyz;
        gl_Position = projectionMatrix * viewMatrix * wPos;
    }
}
vs something like this:
void main() {
    if (dot(
        cameraDir,
        normalize(normalMatrix * aNormal)
    ) < 0.) {
        gl_Position = vec4(0., .0, -2., .1);
        return;
    }
    gl_PointSize = uPointSize;
    vec4 wPos = modelMatrix * vec4(position, 1.);
    vWorldPosition = wPos.xyz;
    gl_Position = projectionMatrix * viewMatrix * wPos;
}
Will these shaders behave differently and why/how?
I'm interested in whether there is something that quantifies the difference in performance.
Is there some value, like the number of MADs or something else, that the different code would obviously yield?
Would different generations of GPUs treat these differences differently?
If the step version is guaranteed to be fastest, is there a known list of patterns for how branching can be avoided, and which operations to prefer? (For example, using floor instead of step could also be possible?)
float condition = clamp(floor(myDot + 1.), 0., 1.); // is it slower?
There are just way too many variables, so the answer is "it depends". Some GPUs can handle branches. Some can't, and the compiler expands the code so that there are no branches, just math that is multiplied by 0 and other math that is not. Then there are things like tiling GPUs that attempt to aggressively avoid overdraw. I'm sure there are other factors.
Theoretically you can run a million or a few million iterations of your shader and time it with something like:
const pixel = new Uint8Array(4);
gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, pixel); // wait for any prior work
const start = performance.now();
// ...draw a bunch...
gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, pixel); // force the draws to finish
const end = performance.now();
const elapsedTime = end - start;
gl.readPixels is a synchronous operation, so it stalls the GPU pipeline.
The elapsedTime itself is not the actual time, since it includes starting up the GPU and stopping it, among other things, but it seems like you could compare the elapsedTime from one shader with another to see which is faster.
In other words, if elapsedTime is 10 seconds it does not mean your shader took ten seconds. It means it took 10 seconds to start the GPU, run your shader, and stop the GPU. How many of those seconds were start, how many were stop, and how many were your shader isn't available. But if elapsedTime for one shader is 10 seconds and 11 seconds for another, then it's probably safe to say one shader is faster than the other. Note you probably want to make your test long enough that you get seconds of difference and not microseconds of difference. You'd also need to test on multiple GPUs to see if the speed differences always hold true.
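A hedged sketch of how that timing pattern could be wrapped into a small comparison harness; timeShader, drawNTimes, programWithStep and programWithBranch are made-up names for illustration, not part of any API:
// Times one shader by bracketing a batch of draws between two synchronous readbacks.
function timeShader(gl, drawMany) {
    const pixel = new Uint8Array(4);
    gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, pixel); // flush pending work
    const start = performance.now();
    drawMany();                                                  // issue many draw calls
    gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, pixel); // wait for them to finish
    return performance.now() - start;
}

// Compare two candidate shaders; bigger batches give more stable numbers.
const timeA = timeShader(gl, () => drawNTimes(programWithStep, 1000000));
const timeB = timeShader(gl, () => drawNTimes(programWithBranch, 1000000));
console.log({ timeA, timeB });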
Note that calling return in the vertex shader does not prevent the vertex from being generated. In fact, what gl_Position is in that case is undefined.
Conditional branches are expensive on GPUs, generally significantly more expensive than multiplies, so your revised shaders are probably slower.

OpenGL ES 2.0 Shader Float Data Precision

Here is my whole fragment shader code; it's quite simple:
precision highp float;
void main(void)
{
    float a = 66061311.0;
    if (a == 66061312.0)
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
    else
        gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0);
}
Why is the screen cleared to red?
When I set a to 66061315.0, the screen is cleared to green.
That confuses me. To my understanding, 66061311.0 is within the range of the float type.
How can I fix or work around this?
Even if the value is within the range of the type, that does not mean the precision at such large values is fine enough to see a difference between the two.
In your case, for a standard 32-bit float the results are:
66061311.0 = 6.60613e+07
66061312.0 = 6.60613e+07
The values compare as equal. This is not related or bound to OpenGL or shaders; it is how a float is defined. A 64-bit float would detect a difference, though.
To add a bit more info: if you check the definition of a floating-point value, you will see that the fraction has only 23 bits (24 significant bits counting the implicit leading 1), which means integers are only exactly representable up to about 16.8M, but you have over 66M. At that magnitude the representable values are 4 apart, so 66061311.0 rounds to 66061312.0.
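You can see the same rounding outside GLSL; a quick sketch in JavaScript, using Math.fround to round numbers to 32-bit float precision:
// Math.fround returns the nearest 32-bit float value.
console.log(Math.fround(66061311.0));                // 66061312, rounds up to the nearest representable float
console.log(Math.fround(66061312.0));                // 66061312, exactly representable (a multiple of 4)
console.log(Math.fround(66061311.0) === 66061312.0); // true, hence the red branch in the shader
console.log(Math.fround(66061315.0));                // 66061316, a different float, hence green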

Is this the work of messed up Normals or what could it be?

So I am taking a graphics course and I am programming shaders. In the course we are given access to a service running WebGL, as well as a bunch of C++ code to compile to get an environment with a few models going. However, whether I compile this code on Windows or Linux does not matter: I don't get the result I am supposed to.
This is the result I get, using the exact same GLSL code that produces the correct result (the screenshots are not included here).
I have not programmed enough C++ to debug the program yet, but I suspect there is a bug in the program itself and not the shader, so I am wondering if anybody can tell from experience what kind of issue this is; then I will try to locate the code that would handle it.
My guess is that it has to do with the normals, but I am not exactly sure, as I just started with graphics and it was two years ago that I last did linear algebra properly. And as I don't know the C++ source (I'm analysing it right now so I can understand the flow), I don't know where to debug.
Vertex Shader:
....
attribute vec3 VertexPosition;
attribute vec2 VertexST;
attribute vec3 VertexNormal;
....
void main(void) {
    Position = ProjectionMatrix * ViewMatrix * WorldMatrix * vec4(VertexPosition, 1);
    Normal = normalize((ViewMatrix * WorldMatrix * vec4(VertexNormal, 0)).xyz);
    EyeSpaceLightPosition = ViewMatrix * LightPosition;
    EyeSpaceVertexPosition = ViewMatrix * WorldMatrix * vec4(VertexPosition, 1);
    EyeSpaceObjectPosition = ViewMatrix * WorldMatrix * vec4(0, 0, 0, 1);
    STCoords = VertexST;
    gl_Position = Position;
}
Pixel shader:
void main(void) {
    fragColor = vec4(Normal, 1.0);
    gl_FragColor = fragColor;
}
This is the code that is actually running. There are declarations and such before it. There is more code in the actual files, but it is commented out, so those lines don't run.
So the problem was solved, and the reason was that in the vertex shader there are three declarations that tell the shader how the information is ordered in the vertex array. It said:
attribute vec3 VertexPosition;
attribute vec2 VertexST;
attribute vec3 VertexNormal;
This is wrong, since the information that is given to the GPU from the CPU is actually ordered Position, Normal, ST.
That means they should be switched. What happened was that the shader took the wrong piece of information and sent it onwards to the pixel shader, which was then trying to do lighting without any normals (as I believe in this case we had no ST information, which I think is texture coordinates).
This made the model not render properly, as in the first picture in the question, but after switching these lines:
attribute vec3 VertexPosition;
attribute vec3 VertexNormal; // Switched these two
attribute vec2 VertexST; //
Now the shader interprets the information given to it properly and the result is what is expected.
EDIT: This ordering is defined by the host program and can be set up in different ways; in the case of the program I was given for my assignment, it happened to be this order. As the commenter said, it depends. But the problem was still that the order of declarations in the program differed from the order expected by the shader.
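For reference, in a WebGL setting the mapping between the buffer layout and the attribute names is established explicitly by the host code rather than by declaration order; a hedged sketch, where the program and buffer variables and the interleaved position/normal/ST layout are assumptions for illustration:
// Assumed interleaved layout per vertex: position (3 floats), normal (3 floats), ST (2 floats).
const stride = 8 * 4; // bytes per vertex (8 floats, 4 bytes each)

gl.bindBuffer(gl.ARRAY_BUFFER, vertexBuffer);

const posLoc = gl.getAttribLocation(program, 'VertexPosition');
const nrmLoc = gl.getAttribLocation(program, 'VertexNormal');
const stLoc  = gl.getAttribLocation(program, 'VertexST');

gl.enableVertexAttribArray(posLoc);
gl.vertexAttribPointer(posLoc, 3, gl.FLOAT, false, stride, 0);     // bytes 0..11

gl.enableVertexAttribArray(nrmLoc);
gl.vertexAttribPointer(nrmLoc, 3, gl.FLOAT, false, stride, 3 * 4); // bytes 12..23

gl.enableVertexAttribArray(stLoc);
gl.vertexAttribPointer(stLoc, 2, gl.FLOAT, false, stride, 6 * 4);  // bytes 24..31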

How to implement a ShaderToy shader in three.js?

Looking for info on how to recreate the ShaderToy parameters iGlobalTime, iChannel, etc. within three.js. I know that iGlobalTime is the time elapsed since the shader started, and I think the iChannel stuff is for pulling RGB out of textures, but I would appreciate info on how to set these.
Edit: I have been going through all the shaders that come with the three.js examples and think that the answers are all in there somewhere; I just have to find the equivalent of, e.g., iChannel1 = a texture input, etc.
I am not sure if you have already answered your question, but it might be good for others to know the steps for integrating Shadertoy shaders into three.js.
First, you need to know that a Shadertoy is a fragment shader. That being said, you have to set up a "general purpose" vertex shader that should work with all Shadertoy fragment shaders.
Step 1
Create a "general purpose" vertex shader
varying vec2 vUv;
void main()
{
    vUv = uv;
    vec4 mvPosition = modelViewMatrix * vec4(position, 1.0);
    gl_Position = projectionMatrix * mvPosition;
}
This vertex shader is pretty basic. Notice that we defined a varying variable vUv to tell the fragment shader where the texture mapping is. This is important because we are not going to use the screen resolution (iResolution) for our base rendering. We will use the texture coordinates instead. We do this in order to be able to integrate multiple Shadertoys on different objects in the same three.js scene.
Step 2
Pick the Shadertoy that you want and create the fragment shader. (I have chosen a simple toy that performs well: Simple tunnel 2D by niklashuss.)
Here is the given code for this toy:
void main(void)
{
    vec2 p = gl_FragCoord.xy / iResolution.xy;
    vec2 q = p - vec2(0.5, 0.5);
    q.x += sin(iGlobalTime * 0.6) * 0.2;
    q.y += cos(iGlobalTime * 0.4) * 0.3;
    float len = length(q);
    float a = atan(q.y, q.x) + iGlobalTime * 0.3;
    float b = atan(q.y, q.x) + iGlobalTime * 0.3;
    float r1 = 0.3 / len + iGlobalTime * 0.5;
    float r2 = 0.2 / len + iGlobalTime * 0.5;
    float m = (1.0 + sin(iGlobalTime * 0.5)) / 2.0;
    vec4 tex1 = texture2D(iChannel0, vec2(a + 0.1 / len, r1));
    vec4 tex2 = texture2D(iChannel1, vec2(b + 0.1 / len, r2));
    vec3 col = vec3(mix(tex1, tex2, m));
    gl_FragColor = vec4(col * len * 1.5, 1.0);
}
Step 3
Customize the raw Shadertoy code into a complete GLSL fragment shader.
The first things missing from the code are the uniform and varying declarations. Add them at the top of your fragment shader file (just copy and paste the following):
uniform float iGlobalTime;
uniform sampler2D iChannel0;
uniform sampler2D iChannel1;
varying vec2 vUv;
Note that only the Shadertoy variables used by this sample are declared, plus the varying vUv previously declared in our vertex shader.
The last thing we have to tweak is the UV mapping, now that we have decided not to use the screen resolution. To do so, just replace the line that uses the iResolution uniform, i.e.:
vec2 p = gl_FragCoord.xy / iResolution.xy;
with:
vec2 p = -1.0 + 2.0 * vUv;
That's it, your shaders are now ready for use in your three.js scenes.
Step 4
Your three.js code:
Set up the uniforms:
var tuniform = {
    iGlobalTime: { type: 'f', value: 0.1 },
    iChannel0: { type: 't', value: THREE.ImageUtils.loadTexture('textures/tex07.jpg') },
    iChannel1: { type: 't', value: THREE.ImageUtils.loadTexture('textures/infi.jpg') },
};
Make sure the textures are wrapping:
tuniform.iChannel0.value.wrapS = tuniform.iChannel0.value.wrapT = THREE.RepeatWrapping;
tuniform.iChannel1.value.wrapS = tuniform.iChannel1.value.wrapT = THREE.RepeatWrapping;
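Note that THREE.ImageUtils.loadTexture has since been removed from three.js; on recent releases the equivalent setup would look roughly like this (a sketch, keeping the same texture paths, with THREE.TextureLoader as the replacement loader):
var loader = new THREE.TextureLoader();
var tex0 = loader.load('textures/tex07.jpg');
var tex1 = loader.load('textures/infi.jpg');
tex0.wrapS = tex0.wrapT = THREE.RepeatWrapping;
tex1.wrapS = tex1.wrapT = THREE.RepeatWrapping;

var tuniform = {
    iGlobalTime: { value: 0.1 }, // the `type` field is no longer needed
    iChannel0: { value: tex0 },
    iChannel1: { value: tex1 },
};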
Create the material with your shaders and apply it to a plane geometry. The PlaneGeometry(700, 394) will simulate the Shadertoy 700x394 screen resolution; in other words, it will best convey the work the artist intended to share.
var mat = new THREE.ShaderMaterial({
    uniforms: tuniform,
    vertexShader: vshader,
    fragmentShader: fshader,
    side: THREE.DoubleSide
});
var tobject = new THREE.Mesh(new THREE.PlaneGeometry(700, 394, 1, 1), mat);
Finally, in your update function, add the delta of a THREE.Clock() to the iGlobalTime value, not the total elapsed time.
tuniform.iGlobalTime.value += clock.getDelta();
That is it, you are now able to run most Shadertoys with this setup.
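For completeness, a minimal sketch of the render loop this implies; it assumes the scene, camera and renderer have already been created in the usual way and that tobject has been added to the scene:
var clock = new THREE.Clock();

function animate() {
    requestAnimationFrame(animate);
    tuniform.iGlobalTime.value += clock.getDelta(); // accumulate elapsed seconds
    renderer.render(scene, camera);
}
animate();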
2022 edit: The version of ShaderFrog described below is no longer being actively developed. There are bugs in the compiler it uses, making it unable to parse all shaders correctly for import, and it doesn't support many of Shadertoy's features, like multiple image buffers. I'm working on a new tool if you want to follow along; otherwise you can try the following method, but it likely won't work most of the time.
Original answer follows:
This is an old thread, but there's now an automated way to do this. Simply go to http://shaderfrog.com/app/editor/new and on the top right click "Import > ShaderToy" and paste in the URL. If it's not public you can paste in the raw source code. Then you can save the shader (requires sign up, no email confirm), and click "Export > Three.js".
You might need to tweak the parameters a little after import, but I hope to have this improved over time. For example, ShaderFrog doesn't support audio or video inputs yet, but you can preview them with images instead.
Proof of concept:
ShaderToy https://www.shadertoy.com/view/MslGWN
ShaderFrog http://shaderfrog.com/app/view/247
Full disclosure: I am the author of this tool which I launched last week. I think this is a useful feature.
This is based on various sources, including the answer from INF1 above.
Basically you insert the missing uniform variables from Shadertoy (iGlobalTime etc., see this list: https://www.shadertoy.com/howto) into the fragment shader, then you rename mainImage(out vec4 z, in vec2 w) to main(), and then you change z in the source code to gl_FragColor. In most Shadertoys 'z' is 'fragColor'.
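As an alternative to renaming things by hand, a small helper can wrap the raw Shadertoy source instead; this is only a sketch under the same assumptions (it prepends just the common uniforms used here and leaves mainImage untouched):
// Wrap raw Shadertoy code (which defines mainImage) so it compiles as a
// WebGL1-style fragment shader. Only the most common uniforms are declared.
function wrapShadertoy(shadertoySource) {
    var header = [
        'uniform float iGlobalTime;',
        'uniform vec3 iResolution;',
        'uniform sampler2D iChannel0;',
        'uniform sampler2D iChannel1;',
        ''
    ].join('\n');
    var footer = [
        '',
        'void main() {',
        '    mainImage(gl_FragColor, gl_FragCoord.xy);',
        '}'
    ].join('\n');
    return header + shadertoySource + footer;
}
The wrapped string can then be passed as the fragmentShader of a THREE.ShaderMaterial, as in the earlier answer.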
I did this for two cool shaders from this guy (https://www.shadertoy.com/user/guil) but unfortunately I didn't get the marble example to work (https://www.shadertoy.com/view/MtX3Ws).
A working jsFiddle is here: https://jsfiddle.net/dirkk0/zt9dhvqx/
Change the shader from frag1 to frag2 in line 56 to see both examples.
And don't 'Tidy' in jsFiddle - it breaks the shaders.
EDIT:
https://medium.com/#dirkk/converting-shaders-from-shadertoy-to-threejs-fe17480ed5c6

How to emulate GL_DEPTH_CLAMP_NV?

I have a platform where this extension is not available (non-NVIDIA).
How could I emulate this functionality?
I need it to solve the far plane clipping problem when rendering stencil shadow volumes with the z-fail algorithm.
Since you say you're using OpenGL ES but also mention trying to clamp gl_FragDepth, I'm assuming you're using OpenGL ES 2.0, so here's a shader trick:
You can emulate ARB_depth_clamp by using a separate varying for the z-component.
Vertex Shader:
varying float z;
void main()
{
    gl_Position = ftransform();
    // transform z to window coordinates
    z = gl_Position.z / gl_Position.w;
    z = (gl_DepthRange.diff * z + gl_DepthRange.near + gl_DepthRange.far) * 0.5;
    // prevent z-clipping
    gl_Position.z = 0.0;
}
Fragment shader:
varying float z;
void main()
{
    gl_FragColor = vec4(vec3(z), 1.0);
    gl_FragDepth = clamp(z, 0.0, 1.0);
}
"Fall back" to ARB_depth_clamp?
Check if NV_depth_clamp exists anyway? For example my ATI card supports five "NVidia-only" GL extensions.
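If the target happens to be a WebGL context rather than a native ES one, you can at least probe for a hardware depth-clamp path at runtime before falling back to the shader emulation above; availability of EXT_depth_clamp varies widely, so treat this as a sketch:
// List anything that looks like a depth-clamp extension on this context.
console.log(gl.getSupportedExtensions().filter(name => /depth_clamp/i.test(name)));

var ext = gl.getExtension('EXT_depth_clamp'); // null on platforms without it
if (ext) {
    gl.enable(ext.DEPTH_CLAMP_EXT);           // hardware depth clamping
} else {
    // fall back to the varying-based emulation shown above
}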
