Here is my whole fragment shader code; it's quite simple:
precision highp float;
void main(void)
{
    float a = 66061311.0;
    if (a == 66061312.0)
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
    else
        gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0);
}
Why is the screen cleared to red?
When I set a to 66061315.0, the screen is cleared to green.
That confuses me: as I understand it, 66061311.0 is within the range of the float type.
How can I fix or work around this?
Even if the value is within the range of the type, that does not mean the type has enough precision at such large magnitudes to tell the two values apart.
In your case, for a standard 32-bit float, the results are:
66061311.0 = 6.60613e+07
66061312.0 = 6.60613e+07
And the values are the same when compared. This is not specific to OpenGL or shaders; it is how a float is defined. A 64-bit float (double) would detect the difference, though.
To add a bit more info: if you check the definition of a floating-point value, you will see that the fraction has only 23 bits (24 counting the implicit leading bit), so integers are represented exactly only up to 2^24, roughly 16.8M, but you have over 66M. At that magnitude, consecutive representable floats are 4 apart: 66061311.0 rounds to 66061312.0, which is why you see red, while 66061315.0 rounds to 66061316.0, which is why you see green.
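If you really need to tell such close values apart, one possible workaround (a sketch of mine, not from the original post) is to carry the quantity as a common base plus a small offset, so everything you actually compare stays well inside the exactly representable integer range:

precision highp float;
// Hypothetical split: 66061311.0 = 66061000.0 + 311.0. The base is exactly
// representable, and the offsets are small enough to behave like true integers.
const float base = 66061000.0; // common part, only needed to reconstruct the full value
void main(void)
{
    float a = 311.0;  // stands for base + 311.0 = 66061311.0
    if (a == 312.0)   // i.e. compares against base + 312.0 = 66061312.0
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); // red
    else
        gl_FragColor = vec4(0.0, 1.0, 0.0, 1.0); // green: taken, as expected
}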
I'm wondering why I can't write to an array using an integer index variable. In Shadertoy it seems to work, but it doesn't when I use this fragment shader via three.js:
void main(void) {
    vec2 p[1];
    p[0] = vec2(0.0, 0.0); // works
    int i = 0;
    p[i] = vec2(0.0, 0.0); // doesn't work; the shader fails to compile
    gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}
Any ideas?
The issue is that GLSL ES 1.0 only supports array indexing with constant integer expressions, or with indices of loops based on constant integer expressions.
See the spec
void main(void) {
    vec2 p[1];
    p[0] = vec2(0.0, 0.0); // works
    int i = 0;
    p[i] = vec2(0.0, 0.0); // doesn't work: i is not constant
    const int j = 0;
    p[j] = vec2(0.0, 0.0); // works
    vec2 q[2];
    for (int k = 0; k < 2; ++k) { // 2 is a constant int, so this works
        q[k] = vec2(0.0); // works (note q here, not p, which has only one element)
    }
    gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
}
Note that the rules are complex. For example, your code is okay in a vertex shader but not in a fragment shader. The exception is arrays of samplers: even in vertex shaders, their indices must follow the same restricted rules.
WebGL2 supports GLSL ES 3.00 which allows non-constant integer array access in more places.
Shadertoy optionally uses WebGL2, though it tries to do this automagically: you don't have to tell it your shader is using GLSL ES 3.00, it just guesses somehow. Maybe it compiles the shader both ways and uses whichever one succeeds; I don't know the details, only that it supports both.
THREE.js has a WebGL2 version as well.
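If you do need a dynamic index in a GLSL ES 1.00 fragment shader, a common workaround (a sketch, with u_index as a made-up uniform) is to scan the array with a constant-bound loop, since loop indices count as constant-index-expressions, and keep the element whose index matches:

precision mediump float;
uniform int u_index; // hypothetical uniform holding the dynamic index
void main(void) {
    vec2 p[4];
    p[0] = vec2(0.1); p[1] = vec2(0.2); p[2] = vec2(0.3); p[3] = vec2(0.4);
    vec2 value = vec2(0.0);
    for (int k = 0; k < 4; ++k) { // constant bound, so p[k] is legal
        if (k == u_index)
            value = p[k];
    }
    gl_FragColor = vec4(value, 0.0, 1.0);
}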
I have this line:
gl_FragColor = vec4(worldPos.x / maxX, worldPos.z / maxZ, 1.0, 1.0);
Where worldPos.x and worldPos.z go from 0 to 19900, and maxX and maxZ are float uniforms. It works as expected when maxX and maxZ are set to 5000.0 (a gradient to white, and all white above 5000), but when maxX and maxZ are set to 19900.0 everything turns blue. Why is that, and how do I get around it? Hardcoding the values doesn't make a difference, i.e.:
gl_FragColor = vec4(worldPos.x / 5000.0, worldPos.z / 5000.0, 1.0, 1.0);
works as expected while:
gl_FragColor = vec4(worldPos.x / 19900.0, worldPos.z / 19900.0, 1.0, 1.0);
makes it all blue. This only happens on some devices and not on others.
Update:
Adding the highp qualifier (as suggested by Michael below) solved it for one device, but when testing on another it made no difference. Then I tried doing the division on the CPU (also suggested by Michael), like this:
In Java, before passing them in as uniforms:
float maxX = 1.0f / 19900.0f;
float maxZ = 1.0f / 19900.0f;
program.setUniformf(maxXUniform, maxX);
program.setUniformf(maxZUniform, maxZ);
In the shader:
uniform float maxX;
uniform float maxZ;
...
gl_FragColor = vec4(worldPos.x * maxX, worldPos.z * maxZ, 1.0, 1.0);
...
Final solution:
This still didn't cut it. Now the values were too small: 1.0f / 19900.0f is about 5.0e-5, which is below the smallest magnitude (2^-14, about 6.1e-5) that a mediump float is guaranteed to represent, so they turned into 0 when passed to the shader. Then I tried multiplying by 100 before passing the value in, and multiplying by 0.01 inside the shader.
In Java:
float maxX = 100.0f / 19900.0f;
float maxZ = 100.0f / 19900.0f;
program.setUniformf(maxXUniform, maxX);
program.setUniformf(maxZUniform, maxZ);
In the shader:
uniform float maxX;
uniform float maxZ;
...
gl_FragColor = vec4(worldPos.x * 0.01 * maxX, worldPos.z * 0.01 * maxZ, 1.0, 1.0);
...
And that solved the problem. Now the highp qualifier isn't needed. Maybe it isn't the prettiest solution, but it's efficient and robust.
I guess you're running OpenGL ES? Well, floating-point precision is poor on many, usually quite old, devices. I had similar issues on several occasions when implementing cascaded shadow mapping in shaders for mobile hardware. Note that GLSL ES only guarantees mediump floats a magnitude range of 2^-14 to 2^14 (about 6.1e-5 to 16384), so 19900.0 overflows it while 5000.0 does not; that would explain why it fails only on devices that implement mediump as 16-bit floats.
Make sure you use the highp qualifier for those variables. (Note: that might not solve the issue, but it's worth a try.)
Another possible solution: don't perform the division in the shader. That's quite a heavy operation for many old and weak implementations anyway. Try to avoid division, sqrt(), and pow(); run a shader profiler and you will be surprised how heavy those ops are (the iOS simulator on Mac has a nice shader profiler). Try to pass the results in directly as uniforms. I am not sure that is the problem in your case, as I can't see any of these variables bound to per-fragment execution.
And if that still doesn't help, there is usually nothing you can do about it; it's an old hardware/GLSL implementation issue. But I am fairly sure that if you calculate the values on the CPU and upload the results as uniforms, it will solve the issue.
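One more thing you can do is ask the device at compile time whether the fragment stage supports highp at all: GLSL ES defines the GL_FRAGMENT_PRECISION_HIGH macro for exactly this check. A minimal sketch:

#ifdef GL_FRAGMENT_PRECISION_HIGH
precision highp float;    // use full precision where the GPU offers it
#else
precision mediump float;  // fall back; pre-scale the uniforms on the CPU as above
#endif
uniform float maxX;       // picks up the default precision declared above
uniform float maxZ;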
I need to remove all odd lines from a texture - this is part of a simple deinterlacer.
In the following code sample, instead of fetching RGB from the texture, I chose to output white for odd lines and red for even lines, so I can visually check whether the result is what I expect.
_texcoord is passed in from the vertex shader and has a range of [0, 1] in both x and y.
uniform sampler2D sampler0; /* not used here because we directly output white or red */
varying highp vec2 _texcoord;
void main() {
    highp float height = 480.0; /* assume the texture has a height of 480 */
    highp float y = height * _texcoord.y;
    if (mod(y, 2.0) >= 1.0) {
        gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0); /* odd line: white */
    } else {
        gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); /* even line: red */
    }
}
When rendered to the screen, the output isn't what I expect: vertically it's RRWWWRRWWW, but I'm really expecting RWRWRWRW (i.e. alternating red and white).
My code runs on iOS and targets GLES 2.0, so it should behave no differently on Android with GLES 2.0.
Question: Where did I go wrong?
EDIT
Yes, the texture height is correct.
I guess my question is: given a _texcoord.y, how can I tell whether it refers to an odd or an even line of the texture?
precision highp float;
uniform sampler2D tex; /* the source texture */
varying vec2 uv;       /* texture coordinates from the vertex shader */
void main(void)
{
    /* gl_FragCoord is in window pixels, so floor() yields the integer pixel row */
    vec2 p = vec2(floor(gl_FragCoord.x), floor(gl_FragCoord.y));
    if (mod(p.y, 2.0) == 0.0)
        gl_FragColor = vec4(texture2D(tex, uv).xyz, 1.0); /* even row: keep the texel */
    else
        gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);          /* odd row: black */
}
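Note that gl_FragCoord counts window pixels, so the snippet above alternates per screen row; if the output is scaled, that is not the same as alternating per texture row, which is what a deinterlacer needs. To answer the edit directly, here is a sketch that classifies by texel row from _texcoord instead, reusing the 480.0 height from the question:

uniform sampler2D tex;
varying highp vec2 _texcoord;
void main(void)
{
    highp float height = 480.0;                     // texture height in texels
    highp float line = floor(_texcoord.y * height); // 0-based texel row
    if (mod(line, 2.0) < 0.5)
        gl_FragColor = vec4(texture2D(tex, _texcoord).xyz, 1.0); // even line: keep
    else
        gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);                 // odd line: black
}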
Is there any way to optimize the following algorithm to be faster, even if it's just a small speed increase?
const mat3 factor = mat3(1.0, 1.0, 1.0, 2.112, 1.4, 0.0, 0.0, 2.18, -2.21);
vec3 calculate(in vec2 coord)
{
    vec3 sample = vec3(texture2D(texture_a, coord).r,
                       texture2D(texture_b, coord).ra - 0.5);
    return factor * sample;
}
The only significant optimization I can think of is to pack texture_a and texture_b into a single three-channel texture, if you can. That saves you one of the two texture lookups, which are most likely to be the bottleneck here.
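A sketch of that idea, assuming you can pack the channels offline; texture_packed is a made-up name for a texture holding texture_a's r channel in .x and texture_b's r and a channels in .y and .z:

uniform sampler2D texture_packed; // hypothetical pre-packed texture
vec3 calculate(in vec2 coord)
{
    // one lookup instead of two; re-apply the -0.5 bias to the last two channels
    vec3 sample = texture2D(texture_packed, coord).rgb - vec3(0.0, 0.5, 0.5);
    return factor * sample;
}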
@Thomas's answer is the most helpful, since texture lookups are the most expensive part, if his solution is possible in your application. If you already sample those textures somewhere else, better pass the values along as parameters to avoid duplicate lookups.
Otherwise I don't know whether it can be optimized much, but some straightforward things come to mind.
Compiler optimizations:
Add the const qualifier to the coord parameter. (It isn't possible for sample, though: const variables must be initialized with constant expressions, which a texture fetch is not.)
Add the f suffix to each float literal (valid in desktop GLSL 1.20 and later, but not in GLSL ES 1.00).
Maybe write out the matrix product manually.
I don't know whether that last one is faster, since it depends on how the matrix multiplication is implemented, but because the constant factor matrix contains several ones and zeros, the product can be expanded by hand, skipping the zero terms:
vec3 calculate(const in vec2 coord)
{
    vec3 sample = vec3(texture2D(texture_a, coord).r,
                       texture2D(texture_b, coord).ra - 0.5f);
    // factor * sample expanded by hand, skipping the zero entries
    vec3 result = vec3(sample.x);
    result.x += 2.112f * sample.y;
    result.y += 1.4f * sample.y + 2.18f * sample.z;
    result.z -= 2.21f * sample.z;
    return result;
}
I have a platform where this extension is not available (non-NVIDIA).
How could I emulate this functionality?
I need it to solve the far-plane clipping problem when rendering stencil shadow volumes with the z-fail algorithm.
Since you mentioned trying to clamp gl_FragDepth, I'm assuming you have programmable shaders available, so here's a shader trick:
You can emulate ARB_depth_clamp by passing the window-space depth through a separate varying and writing it to gl_FragDepth yourself.
Vertex Shader:
varying float z;
void main()
{
    gl_Position = ftransform();

    // transform z to window coordinates
    z = gl_Position.z / gl_Position.w;
    z = (gl_DepthRange.diff * z + gl_DepthRange.near + gl_DepthRange.far) * 0.5;

    // prevent z-clipping: with z fixed at 0, -w <= z <= w always holds
    gl_Position.z = 0.0;
}
Fragment shader:
varying float z;
void main()
{
    gl_FragColor = vec4(vec3(z), 1.0); // visualizes the depth; replace with your real shading
    gl_FragDepth = clamp(z, 0.0, 1.0); // clamp instead of clip, like depth clamping would
}
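Note that ftransform() and gl_FragDepth are desktop GLSL; neither exists in OpenGL ES 2.0. If you are on ES 2.0, here is a sketch of the same trick, assuming the EXT_frag_depth extension is available; u_mvp and a_position are made-up names for your own inputs.
Vertex shader:
attribute vec4 a_position;
uniform mat4 u_mvp;  // assumed model-view-projection matrix
varying float v_z;
void main()
{
    gl_Position = u_mvp * a_position;
    // same window-space depth computation as above
    v_z = gl_Position.z / gl_Position.w;
    v_z = (gl_DepthRange.diff * v_z + gl_DepthRange.near + gl_DepthRange.far) * 0.5;
    gl_Position.z = 0.0; // prevent z-clipping
}
Fragment shader:
#extension GL_EXT_frag_depth : require
precision highp float;
varying float v_z;
void main()
{
    gl_FragColor = vec4(0.0);               // color is irrelevant in a depth/stencil-only pass
    gl_FragDepthEXT = clamp(v_z, 0.0, 1.0); // EXT_frag_depth's depth output
}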
"Fall back" to ARB_depth_clamp?
Check whether NV_depth_clamp exists anyway? For example, my ATI card supports five "NVIDIA-only" GL extensions.