I am calculating normals from an RGB-encoded height map:
vec3 unpackFactors = vec3(256.0 * 255.0, 255.0, 255.0 / 256.0);
float unpackOffset = -32768.0;
To do this, I edited the built-in dHdxy_fwd() function of the Phong shader:
vec2 dHdxy_fwd() {
    float texelSize = 1.0 / 256.0;
    vec2 dSTdx = vec2(texelSize, 0.0);
    vec2 dSTdy = vec2(0.0, texelSize);
    float Hll = bumpScale * dot(texture2D(displacementMap, vUv).rgb, unpackFactors) + unpackOffset;
    float dBx = bumpScale * dot(texture2D(displacementMap, vUv + dSTdx).rgb, unpackFactors) + unpackOffset - Hll;
    float dBy = bumpScale * dot(texture2D(displacementMap, vUv + dSTdy).rgb, unpackFactors) + unpackOffset - Hll;
    return vec2(dBx, dBy);
}
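For reference, the per-texel decode above can be factored into a helper (getHeight is a name introduced here, not from the original shader). The dot product reconstructs R*256 + G + B/256 in byte units – a 16-bit integer part plus 8 fractional bits – which the offset then shifts into the signed range:

float getHeight(vec2 uv) {
    // fixed-point height: bumpScale * (R*256 + G + B/256) - 32768
    return bumpScale * dot(texture2D(displacementMap, uv).rgb, unpackFactors) + unpackOffset;
}

dHdxy_fwd() is then three getHeight() calls and two subtractions.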
The decoding of the height, unfortunately, causes artifacts around the green channel of the texture – happening on iPhone 7, iPhone XS and the Radeon 455 dedicated GPU of my Mac:
The artifacts look like this (screenshots: full view and zoomed in):
On the Intel HD Graphics 530 (integrated GPU), however, there are no such artifacts – both the full view and the zoomed-in screenshots look just as they should (ignore the tile seams for now).
Why are artifacts appearing on some (in fact, most) of the tested GPUs, and how can I get rid of them? It seems like some numerical instability, but I've fumbled around with texture precision, compressing the total height range, etc., with no luck yet.
I have a program that works great when my POT DataTextures are 1:1 (width:height) in their texel dimensions; however, when they are 2:1 or 1:2 in texel dimensions, it appears that the texels are being incorrectly read and applied. I'm using consecutive indexes (1, 2, 3, 4, 5, ...) to access the texels using the two functions below.
I'm wondering if there is something wrong with how I am accessing the texel data, or perhaps if my use of a Float32Array for the integer indexes needs to be switched to a Uint8Array or something else? Thanks in advance!
This function finds the uv for textures that have one texel per particle cloud in my visualization:
float texelSizeX = 1.0 / uPerCloudBufferWidth;
float texelSizeY = 1.0 / uPerCloudBufferHeight;
vec2 perMotifUV = vec2(
    mod(cellIndex, uPerCloudBufferWidth) * texelSizeX,
    floor(cellIndex / uPerCloudBufferHeight) * texelSizeY );
perMotifUV += vec2(0.5 * texelSizeX, 0.5 * texelSizeY);
This function finds the uv for textures that contain one texel for each particle contained in all of the clouds:
float pTexelSizeX = 1.0 / uPerParticleBufferWidth;
float pTexelSizeY = 1.0 / uPerParticleBufferHeight;
vec2 perParticleUV = vec2(
    mod(aParticleIndex, uPerParticleBufferWidth) * pTexelSizeX,
    floor(aParticleIndex / uPerParticleBufferHeight) * pTexelSizeY );
perParticleUV += vec2(0.5 * pTexelSizeX, 0.5 * pTexelSizeY);
Shouldn't this
vec2 perMotifUV = vec2(
    mod(cellIndex, uPerCloudBufferWidth) * texelSizeX,
    floor(cellIndex / uPerCloudBufferHeight) * texelSizeY );
be this?
vec2 perMotifUV = vec2(
    mod(cellIndex, uPerCloudBufferWidth) * texelSizeX,
    floor(cellIndex / uPerCloudBufferWidth) * texelSizeY ); // <=- use width
And the same for the other: divide by width, not height.
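For clarity, here is how both lookups read with that fix applied (a sketch using the same names as above):

float texelSizeX = 1.0 / uPerCloudBufferWidth;
float texelSizeY = 1.0 / uPerCloudBufferHeight;
vec2 perMotifUV = vec2(
    mod(cellIndex, uPerCloudBufferWidth) * texelSizeX,
    floor(cellIndex / uPerCloudBufferWidth) * texelSizeY ); // width in both places
perMotifUV += vec2(0.5 * texelSizeX, 0.5 * texelSizeY);

float pTexelSizeX = 1.0 / uPerParticleBufferWidth;
float pTexelSizeY = 1.0 / uPerParticleBufferHeight;
vec2 perParticleUV = vec2(
    mod(aParticleIndex, uPerParticleBufferWidth) * pTexelSizeX,
    floor(aParticleIndex / uPerParticleBufferWidth) * pTexelSizeY ); // width here too
perParticleUV += vec2(0.5 * pTexelSizeX, 0.5 * pTexelSizeY);

With a square texture the two divisors happen to be equal, which is why the bug only shows up at 2:1 or 1:2 aspect ratios.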
I wrote some WebGL code that is based on floating point textures. But while testing it on a few more devices I found that support for the OES_texture_float extension isn't as widespread as I had thought. So I'm looking for a fallback.
I currently have a luminance floating-point texture with values between -1.0 and 1.0. I'd like to encode this data in a texture format that is available in WebGL without any extensions – probably a simple RGBA unsigned-byte texture.
I'm a bit worried about the potential performance overhead because the cases where this fallback is needed are older smartphones or tablets which already have much weaker GPUs than a modern desktop computer.
How can I emulate floating point textures on a device that doesn't support them in WebGL?
If you know your range is -1 to +1, the simplest way is just to convert that to some integer range and then convert back. Using the code from this answer, which packs a value that goes from 0 to 1 into a 32-bit color:
const vec4 bitSh = vec4(256. * 256. * 256., 256. * 256., 256., 1.);
const vec4 bitMsk = vec4(0., vec3(1. / 256.0));
const vec4 bitShifts = vec4(1.) / bitSh;

vec4 pack(float value) {
    vec4 comp = fract(value * bitSh);
    comp -= comp.xxyz * bitMsk;
    return comp;
}

float unpack(vec4 color) {
    return dot(color, bitShifts);
}
Then
const float rangeMin = -1.;
const float rangeMax = 1.;

vec4 convertFromRangeToColor(float value) {
    float zeroToOne = (value - rangeMin) / (rangeMax - rangeMin);
    return pack(zeroToOne);
}

float convertFromColorToRange(vec4 color) {
    float zeroToOne = unpack(color);
    return rangeMin + zeroToOne * (rangeMax - rangeMin);
}
This should be a good starting point: http://aras-p.info/blog/2009/07/30/encoding-floats-to-rgba-the-final/
It's intended for encoding the 0.0 to 1.0 range, but it should be straightforward to remap to your required range.
I'm trying to rotate a texture in a fragment shader, instead of using the vertex shader and matrix transformations.
The rotation has the pivot at the center.
The algorithm works fine when rendering in a quad with a square shape, but when the quad has a rectangular shape the render result gets messed up.
Can anyone spot the problem?
Thank you
varying vec2 v_texcoord;
uniform sampler2D u_texture;
uniform float u_angle;

void main()
{
    vec2 coord = v_texcoord;
    float sin_factor = sin(u_angle);
    float cos_factor = cos(u_angle);
    coord = (coord - 0.5) * mat2(cos_factor, sin_factor, -sin_factor, cos_factor);
    coord += 0.5;
    gl_FragColor = texture2D(u_texture, coord);
}
The following line of code which was provided in the question:
coord = vec2(coord.x - (0.5 * Resolution.x / Resolution.y), coord.y - 0.5) * mat2(cos_factor, sin_factor, -sin_factor, cos_factor);
is not quite right.
There are some bracketing errors.
The correct version would be:
coord = vec2((coord.x - 0.5) * (Resolution.x / Resolution.y), coord.y - 0.5) * mat2(cos_factor, sin_factor, -sin_factor, cos_factor);
I haven't tried it out myself, but my guess is that since you are using the texture coordinates in a rectangular space, it will cause distortion upon rotation without some factor to correct it.
You'll need to pass it a uniform that declares the width and height of your texture. With this, you can apply the aspect ratio to correct the distortion.
coord = (coord - 0.5) * mat2(cos_factor, sin_factor, -sin_factor, cos_factor);
may become something like:
coord = vec2(coord.x - (0.5 * Resolution.x / Resolution.y), coord.y - 0.5) * mat2(cos_factor, sin_factor, -sin_factor, cos_factor);
Like I said though, I haven't tried it out, but I have had to do this in the past for similar shaders. Might need to reverse Resolution.x / Resolution.y.
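Putting both answers together, an aspect-corrected version of the shader might look like this (a sketch; u_resolution is an added uniform carrying the quad's width and height). The idea is to scale x into a square space before rotating and scale it back afterwards, so the rotation itself never sees the rectangular distortion:

varying vec2 v_texcoord;
uniform sampler2D u_texture;
uniform float u_angle;
uniform vec2 u_resolution; // added: width and height of the quad

void main()
{
    float aspect = u_resolution.x / u_resolution.y;
    float sin_factor = sin(u_angle);
    float cos_factor = cos(u_angle);
    vec2 coord = v_texcoord - 0.5;
    coord.x *= aspect; // into square space
    coord = coord * mat2(cos_factor, sin_factor, -sin_factor, cos_factor);
    coord.x /= aspect; // back into texture space
    coord += 0.5;
    gl_FragColor = texture2D(u_texture, coord);
}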
I have a painting application which is written using OpenGL ES 1.0 and some Quartz.
I'm trying to rewrite it using OpenGL ES 2.0 for better performance and new features.
I have written two shaders: one renders the user's input to a texture, and the second mixes this texture with some other textures according to some rules.
Then I realized that the second shader takes too long on a 1st-generation iPad – I get only 10-15 fps. An iPad 2 works perfectly with 60+ fps. I was slightly shocked, because the original (OpenGL ES 1.0) app works fine on both devices, and it renders only two polygons (though almost fullscreen).
I've tried some optimizations like changing precision, commenting out some math operations, and hardcoding some texture calls – it helped a little, but I'm still far away from 60 fps. Only when I fully comment out the call of this shader do I get 60 fps.
Am I missing something? I don't have much experience with OpenGL, but I do believe this shader should work fine on both generations of devices, just like the original application does. My vertex and fragment shaders are:
===============Vertex Shader===================
uniform mat4 modelViewProjectionMatrix;
attribute vec3 position;
attribute vec2 texCoords;
varying vec2 fTexCoords;

void main()
{
    fTexCoords = texCoords;
    vec4 postmp = vec4(position.xyz, 1.0);
    gl_Position = modelViewProjectionMatrix * postmp;
}
===============Fragment Shader===================
precision highp float;

varying lowp vec4 colorVarying;
varying highp vec2 fTexCoords;

uniform sampler2D texture;        // black & white image the user should paint
uniform sampler2D drawingTexture; // texture with user drawings I rendered earlier
uniform sampler2D paperTexture;   // texture of a sheet of paper
uniform float currentArea;        // which area we should not shadow
uniform float isShadowingOn;      // bool - should we shadow some areas of the picture

void main()
{
    // I pass a 1024*1024 texture here but only need 560*800,
    // so I do some calculations to find the real texture coordinates
    vec2 convertedTexCoords = vec2(fTexCoords.x * 560.0 / 1024.0, fTexCoords.y * 800.0 / 1024.0);
    vec4 bgImageColor = texture2D(texture, convertedTexCoords);
    float area = bgImageColor.a;
    bgImageColor.a = 1.0;
    vec4 paperColor = texture2D(paperTexture, convertedTexCoords);
    vec4 drawingColor = texture2D(drawingTexture, convertedTexCoords);

    // if special area
    if (abs(area - 1.0) < 0.0001) {
        // if shadowing is ON
        if (isShadowingOn == 1.0) {
            // if color of original image is black
            if ((bgImageColor.r < 0.1) && (bgImageColor.g < 0.1) && (bgImageColor.b < 0.1)) {
                gl_FragColor = vec4(bgImageColor.rgb, 1.0) * vec4(0.5, 0.5, 0.5, 1.0);
            }
            // if color of original image is grey
            else if (abs(bgImageColor.r - bgImageColor.g) < 0.15 && abs(bgImageColor.r - bgImageColor.b) < 0.15
                     && abs(bgImageColor.g - bgImageColor.b) < 0.15
                     && bgImageColor.r < 0.8 && bgImageColor.g < 0.8 && bgImageColor.b < 0.8) {
                gl_FragColor = vec4(paperColor.rgb * bgImageColor.rgb * 0.4 - drawingColor.rgb * 0.4, 1.0);
            }
            else {
                gl_FragColor = vec4(bgImageColor.rgb, 1.0) * vec4(0.5, 0.5, 0.5, 1.0);
            }
        }
        // if shadowing is OFF
        else {
            // if color of original image is black
            if ((bgImageColor.r < 0.1) && (bgImageColor.g < 0.1) && (bgImageColor.b < 0.1)) {
                gl_FragColor = vec4(bgImageColor.rgb, 1.0);
            }
            // if color of original image is gray
            else if (abs(bgImageColor.r - bgImageColor.g) < 0.15 && abs(bgImageColor.r - bgImageColor.b) < 0.15
                     && abs(bgImageColor.g - bgImageColor.b) < 0.15
                     && bgImageColor.r < 0.8 && bgImageColor.g < 0.8 && bgImageColor.b < 0.8) {
                gl_FragColor = vec4(paperColor.rgb * bgImageColor.rgb * 0.4 - drawingColor.rgb * 0.4, 1.0);
            }
            // rest
            else {
                gl_FragColor = vec4(bgImageColor.rgb, 1.0);
            }
        }
    }
    // if area of fragment is equal to current area
    else if (abs(area - currentArea / 255.0) < 0.0001) {
        gl_FragColor = vec4(paperColor.rgb * bgImageColor.rgb - drawingColor.rgb, 1.0);
    }
    // if area of fragment is NOT equal to current area
    else {
        if (isShadowingOn == 1.0) {
            gl_FragColor = vec4(paperColor.rgb * bgImageColor.rgb - drawingColor.rgb, 1.0) * vec4(0.5, 0.5, 0.5, 1.0);
        } else {
            gl_FragColor = vec4(paperColor.rgb * bgImageColor.rgb - drawingColor.rgb, 1.0);
        }
    }
}
Branching is really expensive in a shader, as it removes possibilities for the GPU to run the shader in parallel, and you have a lot of branches in your fragment shader (the one shader that should be as fast as possible anyway). Even worse, you are branching on values computed on the GPU itself, which drains your performance further.
You really should try to remove as many branches as possible. It can be cheaper to let the GPU do some "extra work", e.g. by not trying to optimize around the texture atlas and simply rendering everything (if that is possible) – this will still be faster than your current version. If that doesn't work, try splitting your shader into multiple smaller shaders that each do only a specific part of the larger one, and branch on the CPU rather than on the GPU (you only need to do this once per draw call, not for every pixel).
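A sketch of that split-shader idea (hedged – SHADOWING_ON and shadowFactor are stand-ins, not the poster's actual code): compile the same source twice with a different #define, then choose the program on the CPU per draw call instead of reading isShadowingOn per fragment:

precision mediump float;

// compiled twice: once with "#define SHADOWING_ON" prepended, once without
#ifdef SHADOWING_ON
const float shadowFactor = 0.5; // shadowing on: darken the result
#else
const float shadowFactor = 1.0; // shadowing off: leave it untouched
#endif

uniform sampler2D texture;
varying vec2 fTexCoords;

void main()
{
    vec4 color = texture2D(texture, fTexCoords);
    gl_FragColor = vec4(color.rgb * shadowFactor, 1.0);
}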
Beyond JustSid's valid point about branching in the shader, I see a few other things wrong here. First, if I just run this fragment shader through Imagination Technologies' PVRUniSco Editor (which you really should get; it's part of their free SDK), I see a best-case performance of 42 cycles and a worst case of 52 for this shader. From a similar case of fragment shader tuning I asked about, I found that an 11-16 cycle fragment shader took 35-68 ms to render on an iPad 1 (15-29 FPS). You're going to need to make this a lot tighter to get reasonable render times for it.
To eliminate some of the branches, you might be able to use a step function or play tricks with your alpha channel. I've done this and seen a massive reduction in shader rendering times. I would not pass in the isShadowingOn uniform; instead, I would split this into two shaders to use in the on and off cases.
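As a sketch of the step() trick (hedged – normalResult and blackResult stand in for whatever the two branches of the original shader compute), the "is the color nearly black?" test becomes a 0.0/1.0 mask driving a mix() instead of an if:

vec4 shadeByDarkness(vec4 bgImageColor, vec4 normalResult, vec4 blackResult)
{
    // step(edge, x) is 0.0 when x < edge, so each factor below is 1.0
    // exactly when that channel is at or below 0.1
    float isBlack = step(bgImageColor.r, 0.1)
                  * step(bgImageColor.g, 0.1)
                  * step(bgImageColor.b, 0.1);
    // blend between the two candidate results instead of branching
    return mix(normalResult, blackResult, isBlack);
}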
Beyond branching, I can see that you're performing a dependent texture read for bgImageColor, paperColor, and drawingColor as a result of calculating the texture coordinates to fetch within your fragment shader. This is horribly expensive on the tile-based deferred renderer within iOS devices, because it prevents certain optimizations for texture fetching from being used. Instead of calculating this per-fragment, I recommend moving this calculation to the vertex shader and passing in the result as a varying to your fragment shader. Use that varying as the coordinate to fetch your textures and you'll see a massive boost in performance.
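A sketch of that change, based on the vertex shader above (vConvertedTexCoords is a name introduced here, not from the original code):

uniform mat4 modelViewProjectionMatrix;
attribute vec3 position;
attribute vec2 texCoords;
varying vec2 vConvertedTexCoords;

void main()
{
    // do the 1024x1024 -> 560x800 coordinate scaling once per vertex
    vConvertedTexCoords = texCoords * vec2(560.0 / 1024.0, 800.0 / 1024.0);
    gl_Position = modelViewProjectionMatrix * vec4(position, 1.0);
}

The fragment shader then samples texture, drawingTexture and paperTexture with vConvertedTexCoords directly, so the fetches are no longer dependent reads.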
There are also smaller things you can do to tweak this. For example,
gl_FragColor = vec4((paperColor.rgb * bgImageColor.rgb - drawingColor.rgb) * 0.4, 1.0);
should be slightly faster than
gl_FragColor = vec4(paperColor.rgb * bgImageColor.rgb * 0.4 - drawingColor.rgb * 0.4, 1.0);
The editor will live-compile your shader, so you can try out these manipulations in code and see the results in terms of estimated GPU cycles.
I've looked through the OpenGL ES shader specs but do not see such a function...
For example, I created a simple "pinch to zoom" / "rotate to turn around" / "move to move the center" HYDRA Pixel Bender filter that can be executed in Flash. It is based on the default Pixel Bender twirl example and this:
<languageVersion: 1.0;>
kernel zoomandrotate
<   namespace : "Pixel Bender Samples";
    vendor : "Kabumbus";
    version : 3;
    description : "rotate and zoom an image around"; >
{
    // define PI for the degrees to radians calculation
    const float PI = 3.14159265;

    // An input parameter to specify the center of the twirl effect.
    // As above, we're using metadata to indicate the minimum,
    // maximum, and default values, so that the tools can set the
    // values correctly in the UI for the filter.
    parameter float2 center
    <
        minValue:float2(0.0, 0.0);
        maxValue:float2(2048.0, 2048.0);
        defaultValue:float2(256.0, 256.0);
    >;

    // An input parameter to specify the angle that we would like to twirl.
    // For this parameter, we're using metadata to indicate the minimum,
    // maximum, and default values, so that the tools can set the
    // values correctly in the UI for the filter.
    parameter float twirlAngle
    <
        minValue:float(0.0);
        maxValue:float(360.0);
        defaultValue:float(90.0);
    >;

    parameter float zoomAmount
    <
        minValue:float(0.01);
        maxValue:float(10.0);
        defaultValue:float(1.0);
    >;

    // An input parameter that indicates how we want to vary the twirling
    // within the radius. We've added support to modulate by one of two
    // functions, a gaussian or a sinc function. Since Flash does not support
    // bool parameters, we instead are using this as an int with two possible
    // values. Setting this parameter to 1 will cause the gaussian function
    // to be used; setting it to 0 will cause the sinc function to be used.
    parameter int gaussOrSinc
    <
        minValue:int(0);
        maxValue:int(1);
        defaultValue:int(0);
    >;

    input image4 oImage;
    output float4 outputColor;

    // evaluatePixel(): The function of the filter that actually does the
    //                  processing of the image. This function is called once
    //                  for each pixel of the output image.
    void
    evaluatePixel()
    {
        // convert the angle to radians
        float twirlAngleRadians = radians(twirlAngle);

        // calculate where we are relative to the center of the twirl
        float2 relativePos = outCoord() - center;

        // calculate the absolute distance from the center normalized
        // by the twirl radius.
        float distFromCenter = length( relativePos );
        distFromCenter = 1.0;

        // modulate the angle based on either a gaussian or a sinc.
        float adjustedRadians;

        // precalculate either the gaussian or the sinc weight
        float sincWeight = sin( distFromCenter ) * twirlAngleRadians / ( distFromCenter );
        float gaussWeight = exp( -1.0 * distFromCenter * distFromCenter ) * twirlAngleRadians;

        // protect the algorithm from a 1 / 0 error
        adjustedRadians = (distFromCenter == 0.0) ? twirlAngleRadians : sincWeight;

        // switch between a gaussian falloff or a sinc falloff
        adjustedRadians = (gaussOrSinc == 1) ? adjustedRadians : gaussWeight;

        // rotate the pixel sample location.
        float cosAngle = cos( adjustedRadians );
        float sinAngle = sin( adjustedRadians );
        float2x2 rotationMat = float2x2(
            cosAngle, sinAngle,
            -sinAngle, cosAngle
        );
        relativePos = rotationMat * relativePos;

        float scale = zoomAmount;

        // sample and set as the output color. since relativePos
        // is relative to the center location, we need to add it back in.
        // We use linear sampling to smooth out some of the pixelation.
        outputColor = sampleLinear( oImage, relativePos/scale + center );
    }
}
So now I want to port it to an OpenGL ES shader. The math and the parameters are convertible to the OpenGL ES shading language, but what do I do with sampleLinear? What is its analog in the OpenGL ES shading language?
Update:
So I have created something similar to my HYDRA filter, compatible with WebGL and OpenGL ES shaders:
#ifdef GL_ES
precision highp float;
#endif

uniform vec2 resolution;
uniform float time;
uniform sampler2D tex0;

void main(void)
{
    vec2 p = -1.0 + 2.0 * gl_FragCoord.xy / resolution.xy;

    // a rotozoom
    vec2 cst = vec2( cos(.5*time), sin(.5*time) );
    mat2 rot = 0.5*cst.x*mat2(cst.x,-cst.y,cst.y,cst.x);
    vec3 col = texture2D(tex0,0.5*rot*p+sin(0.1*time)).xyz;

    gl_FragColor = vec4(col,1.0);
}
To see how it works, get a modern browser, navigate to Shadertoy, provide it with one texture (http://www.iquilezles.org/apps/shadertoy/presets/tex4.jpg for example), paste my code into the editable text area and hit ... Have fun. So now I have another problem: I want to have one image with black around it, not copies of that same image. Does anyone know how to do that?
Per Adobe's Pixel Bender Reference, sampleLinear "Handles coordinates not at pixel centers by performing bilinear interpolation on the adjacent pixel values."
The correct way to achieve that in OpenGL is to use texture2D, as you already are, but to set the texture environment for linear filtering via glTexParameter.
You can use the step function and multiply by its result to get black for out-of-bounds pixels, or give your texture a single pixel black border and switch to clamping rather than repeat, also via glTexParameter.
If you want to do it in code, try:
#ifdef GL_ES
precision highp float;
#endif

uniform vec2 resolution;
uniform float time;
uniform sampler2D tex0;

void main(void)
{
    vec2 p = -1.0 + 2.0 * gl_FragCoord.xy / resolution.xy;

    // a rotozoom
    vec2 cst = vec2( cos(.5*time), sin(.5*time) );
    mat2 rot = 0.5*cst.x*mat2(cst.x,-cst.y,cst.y,cst.x);
    vec2 samplePos = 0.5*rot*p+sin(0.1*time);

    // 1.0 inside the (0,0)-(1,1) box, 0.0 outside it
    float mask = step(0.0, samplePos.x) * step(0.0, samplePos.y)
               * step(samplePos.x, 1.0) * step(samplePos.y, 1.0);

    vec3 col = texture2D(tex0, samplePos).xyz;
    gl_FragColor = vec4(col*mask, 1.0);
}
That'd restrict colours to coming from the box from (0, 0) to (1, 1), but it looks like the shader heads off to some significantly askew places, so I'm not sure exactly what you want.