I need to generate a pseudo-random number in a specified range using GLSL. Something like this:
float rand(vec2 co)
{
    return fract(sin(dot(co.xy, vec2(12.9898, 78.233))) * 43758.5453);
}
or this:
highp float rand(vec2 co)
{
    highp float a = 12.9898;
    highp float b = 78.233;
    highp float c = 43758.5453;
    highp float dt = dot(co.xy, vec2(a, b));
    highp float sn = mod(dt, 3.14);
    return fract(sin(sn) * c);
}
Both of these (from here and here, respectively) would probably work, but I need to be able to specify a range (e.g. 1-10) for each pseudo-random number.
In addition, I am using a GLSL compute shader, not a vertex shader, so I do not have access to the typical vertex-shader variables such as st.
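What might work, as a minimal sketch rather than a tested solution, is to remap the [0, 1) output of rand() into the desired range and to build the seed from gl_GlobalInvocationID, which a compute shader does have; the helper name randInRange and the seeding choice below are my own illustration:

// Sketch only: remap rand()'s [0, 1) output to [minVal, maxVal).
// A compute shader has no built-in st, so the seed is built from gl_GlobalInvocationID here.
float randInRange(vec2 seed, float minVal, float maxVal)
{
    float r = fract(sin(dot(seed, vec2(12.9898, 78.233))) * 43758.5453);
    return minVal + r * (maxVal - minVal);
}

// inside main() of the compute shader:
// vec2 seed = vec2(gl_GlobalInvocationID.xy);
// float value = randInRange(seed, 1.0, 10.0);   // pseudo-random value in [1, 10)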
I am clipping in the fragment shader (setting the transparency to 0/1) based on the cut-off vertex (v_cutPos) and the current vertex (v_currPos) that I get from the vertex shader. These two vertices are passed as world coordinates.
Now, the cut-off logic works fine, but the cut itself is not smooth (it has to follow a certain shape). When I pass the same vertices after converting them to clip space, the cut is much smoother (or finer).
Is there any explanation for this?
//fragment shader
precision highp float;

varying mediump vec4 v_color;
varying vec4 v_currPos;
varying vec4 v_cutPos;

/* returns 0 if pt is inside box, otherwise 1 */
float insideCutArea(vec2 pt, vec2 cutPos)
{
    return float(pt.y > cutPos.y);
}

void main(void)
{
    float transparency = insideCutArea(v_currPos.xy, v_cutPos.xy);
    gl_FragColor = vec4(v_color.xyz, v_color.w * transparency);
}
//vertex shader
// (declarations of validVertex, color and myPMVMatrix are assumed below; they were not shown in the original snippet)
attribute vec3 validVertex;
attribute vec4 color;
uniform mat4 myPMVMatrix;
uniform vec3 cutPos;

varying mediump vec4 v_color;
varying vec4 v_currPos;
varying vec4 v_cutPos;

void main(void)
{
    /* -------------------
       other transformations
       ------------------- */
    v_cutPos = myPMVMatrix * vec4(cutPos, 1.0);         //cut is not fine when not multiplying with the matrix
    gl_Position = myPMVMatrix * vec4(validVertex, 1.0);
    v_currPos = myPMVMatrix * vec4(validVertex, 1.0);   //cut is not fine when not multiplying with the matrix
    v_color = color;
}
PS: This question was previously closed due to lack of clarity. I have created it again with code explaining what I have done.
I have a simple fragment shader that draws a test grid pattern.
I don't really have a problem, but I've noticed a weird behavior that I can't explain. Don't mind the weird constants; they get filled in during shader assembly, before compilation. Also, vertexPosition is the actual calculated position in world space, so I can move the shader texture when the mesh itself moves.
Here's the code of my shader:
#version 300 es
precision highp float;

in highp vec3 vertexPosition;
out mediump vec4 fragColor;

const float squareSize = __CONSTANT_SQUARE_SIZE;
const vec3 color_base = __CONSTANT_COLOR_BASE;
const vec3 color_l1 = __CONSTANT_COLOR_L1;

float minWidthX;
float minWidthY;

vec3 color_green = vec3(0.0, 1.0, 0.0);

void main()
{
    // calculate l1 border positions
    float dimention = squareSize;
    int roundX = int(vertexPosition.x / dimention);
    int roundY = int(vertexPosition.z / dimention);
    float remainderX = vertexPosition.x - float(roundX) * dimention;
    float remainderY = vertexPosition.z - float(roundY) * dimention;

    vec3 dyX = dFdy(vec3(vertexPosition.x, vertexPosition.y, 0));
    vec3 dxX = dFdx(vec3(vertexPosition.x, vertexPosition.y, 0));
    minWidthX = max(length(dxX), length(dyX));

    vec3 dyY = dFdy(vec3(0, vertexPosition.y, vertexPosition.z));
    vec3 dxY = dFdx(vec3(0, vertexPosition.y, vertexPosition.z));
    minWidthY = max(length(dxY), length(dyY));

    // fill l1 squares
    if (remainderX <= minWidthX)
    {
        fragColor = vec4(color_l1, 1.0);
        return;
    }
    if (remainderY <= minWidthY)
    {
        fragColor = vec4(color_l1, 1.0);
        return;
    }

    // fill base color
    fragColor = vec4(color_base, 1.0);
    return;
}
So, with this code everything works well.
I then wanted to optimize it a little by moving the calculations that only concern horizontal lines to after the vertical-line check, because those calculations are useless if the vertical-line check is true. Like this:
#version 300 es
precision highp float;

in highp vec3 vertexPosition;
out mediump vec4 fragColor;

const float squareSize = __CONSTANT_SQUARE_SIZE;
const vec3 color_base = __CONSTANT_COLOR_BASE;
const vec3 color_l1 = __CONSTANT_COLOR_L1;

float minWidthX;
float minWidthY;

vec3 color_green = vec3(0.0, 1.0, 0.0);

void main()
{
    // calculate l1 border positions
    float dimention = squareSize;
    int roundX = int(vertexPosition.x / dimention);
    int roundY = int(vertexPosition.z / dimention);
    float remainderX = vertexPosition.x - float(roundX) * dimention;
    float remainderY = vertexPosition.z - float(roundY) * dimention;

    vec3 dyX = dFdy(vec3(vertexPosition.x, vertexPosition.y, 0));
    vec3 dxX = dFdx(vec3(vertexPosition.x, vertexPosition.y, 0));
    minWidthX = max(length(dxX), length(dyX));

    // fill l1 squares
    if (remainderX <= minWidthX)
    {
        fragColor = vec4(color_l1, 1.0);
        return;
    }

    vec3 dyY = dFdy(vec3(0, vertexPosition.y, vertexPosition.z));
    vec3 dxY = dFdx(vec3(0, vertexPosition.y, vertexPosition.z));
    minWidthY = max(length(dxY), length(dyY));

    if (remainderY <= minWidthY)
    {
        fragColor = vec4(color_l1, 1.0);
        return;
    }

    // fill base color
    fragColor = vec4(color_base, 1.0);
    return;
}
But even though this seemingly should not affect the result at all, it does, by quite a bit.
Below are two screenshots. The first one is from the original code; the second is from the "optimized" one, which works badly.
Original version:
Optimized version (looks much worse):
Notice how the lines became "fuzzy", even though seemingly no numbers should have changed at all.
Note: this isn't because minWidthX/Y are global. I initially optimized by making them local.
I also initially moved the roundY and remainderY calculations below the X check as well, and the result was the same.
Note 2: I tried adding the highp keyword to each of those calculations specifically, but that doesn't change anything (not that I expected it to, but I tried nevertheless).
Could anyone please explain why this happens? I would like to know for my future shaders, and I would actually like to optimize this one as well. I would like to understand the principle behind the precision loss here, because it doesn't make any sense to me.
For the answer I'll refer to the OpenGL ES Shading Language 3.20 Specification, which matches the OpenGL ES Shading Language 3.00 Specification on this point.
8.14.1. Derivative Functions
[...] Derivatives are undefined within non-uniform control flow.
and further
3.9.2. Uniform and Non-Uniform Control Flow
When executing statements in a fragment shader, control flow starts as uniform control flow; all fragments enter the same control path into main(). Control flow becomes non-uniform when different fragments take different paths through control-flow statements (selection, iteration, and jumps).[...]
That means that the result of the derivative functions in the first case (of your question) is well defined.
But in the second case it is not:
if (remainderX <= minWidthX)
{
    fragColor = vec4(color_l1, 1.0);
    return;
}

vec3 dyY = dFdy(vec3(0, vertexPosition.y, vertexPosition.z));
vec3 dxY = dFdx(vec3(0, vertexPosition.y, vertexPosition.z));
because the return statement acts like a selection, and all the code after the block containing the return statement is in non-uniform control flow.
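In practical terms (my own sketch, not part of the quoted specification), the derivative calls themselves have to stay in uniform control flow, so an optimized version would at least need to take all the derivatives before the first return and only branch afterwards:

// All dFdx/dFdy calls happen before any fragment can return,
// i.e. while control flow is still uniform.
vec3 dyY = dFdy(vec3(0.0, vertexPosition.y, vertexPosition.z));
vec3 dxY = dFdx(vec3(0.0, vertexPosition.y, vertexPosition.z));
minWidthY = max(length(dxY), length(dyY));

// Branching (and early returns) are fine from here on:
// only non-derivative work gets skipped.
if (remainderX <= minWidthX)
{
    fragColor = vec4(color_l1, 1.0);
    return;
}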
To implement this idea, I wrote the following two versions of my vertex and fragment shaders:
// Vertex:
precision highp int;
precision highp float;
uniform vec4 r_info;
attribute vec2 s_coords;
attribute vec2 r_coords;
varying vec2 t_coords;
void main (void) {
    int w = int(r_info.w);
    int x = int(r_coords.x) + int(r_coords.y) * int(r_info.y);
    int y = x / w;
    x = x - y * w;
    y = y + int(r_info.x);
    t_coords = vec2(x, y) * r_info.z;
    gl_Position = vec4(s_coords, 0.0, 1.0);
}

// Fragment:
precision highp float;
uniform sampler2D sampler;
uniform vec4 color;
varying vec2 t_coords;
void main (void) {
    gl_FragColor = vec4(color.rgb, color.a * texture2D(sampler, t_coords).a);
}
vs.
// Vertex:
precision highp float;
attribute vec2 s_coords;
attribute vec2 r_coords;
varying vec2 t_coords;
void main (void) {
    t_coords = r_coords;
    gl_Position = vec4(s_coords, 0.0, 1.0);
}

// Fragment:
precision highp float;
precision highp int;
uniform vec4 r_info;
uniform sampler2D sampler;
uniform vec4 color;
varying vec2 t_coords;
void main (void) {
    int w = int(r_info.w);
    int x = int(t_coords.x) + int(t_coords.y) * int(r_info.y);
    int y = x / w;
    x = x - y * w;
    y = y + int(r_info.x);
    gl_FragColor = vec4(color.rgb, color.a * texture2D(sampler, vec2(x, y) * r_info.z).a);
}
The only difference between them (I hope) is the location where the texture coordinates are transformed. In the first version, the math happens in the vertex shader; in the second one, it happens in the fragment shader.
Now, the official OpenGL ES SL 1.0 specification states that "[t]he vertex language must provide an integer precision of at least 16 bits, plus a sign bit" and "[t]he fragment language must provide an integer precision of at least 10 bits, plus a sign bit" (chapter 4.5.1). If I understand correctly, this means that, given just a minimal implementation, the precision I should be able to get in the vertex shader should be better than that in the fragment shader, correct? For some reason, though, the second version of the code works correctly while the first version leads to a bunch of rounding errors. Am I missing something?
Turns out I fundamentally misunderstood how things work... Maybe I still do, but let me answer my question based on my current understanding:
I thought that for every pixel that is rendered, first the vertex shader and then the fragment shader is executed. But, if I now understand correctly, the vertex shader is only called once for each vertex of the triangle primitives (which kind of makes sense given its name, too...).
So, the first version of my code above only calculates the correct texture coordinate at the actual corner points (vertices) of the triangles that I'm drawing. For all other pixels in the triangle, the texture coordinate is simply a linear interpolation between those corner coordinates. Of course, since my formula isn't linear (it includes rounding and modulo operations), this leads to the wrong texture coordinates for each individual pixel.
The second version, though, applies the non-linear transformation to the texture coordinates at each pixel location, giving the correct texture coordinates everywhere.
So, the generalized learning (and the reason I didn't just delete the question):
All non-linear texture-coordinate transformations must be done in the fragment shader.
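A tiny numeric illustration of that point (my own example, not from the original post): interpolating the result of a non-linear function is not the same as applying the function to the interpolated input.

// With f(x) = floor(x), endpoints a = 0.0 and b = 2.0, and interpolation factor t = 0.25:
//   floor(mix(0.0, 2.0, 0.25)) = floor(0.5) = 0.0    // per-pixel evaluation in the fragment shader
//   mix(floor(0.0), floor(2.0), 0.25) = 0.5           // what interpolating a varying gives you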
I have a 3x3 homography matrix that works correctly with OpenCV's warpPerspective, but I need to do the warping on the GPU for performance reasons. What is the best approach? I tried multiplying in the vertex shader to get the texture coordinates and then rendering a quad, but I get strange distortions. I'm not sure if it's the interpolation not working as I expect. Attaching output for comparison (it involves two different, but close enough, shots).
Absolute difference of warp and other image from GPU:
Composite of warp and other image in OpenCV:
EDIT:
Following are my shaders: the task is image rectification (making epilines become scanlines) + absolute difference.
// Vertex Shader
static const char* warpVS = STRINGIFY
(
    uniform highp mat3 homography1;
    uniform highp mat3 homography2;
    uniform highp int width;
    uniform highp int height;
    attribute highp vec2 position;
    varying highp vec2 refTexCoords;
    varying highp vec2 curTexCoords;

    highp vec2 convertToTexture(highp vec3 pixelCoords) {
        pixelCoords /= pixelCoords.z;                           // need to project
        pixelCoords /= vec3(float(width), float(height), 1.0);
        pixelCoords.y = 1.0 - pixelCoords.y;                    // origin is in the bottom left corner for textures
        return pixelCoords.xy;
    }

    void main(void)
    {
        gl_Position = vec4(position / vec2(float(width) / 2.0, float(height) / 2.0) - vec2(1.0), 0.0, 1.0);
        gl_Position.y = -gl_Position.y;
        highp vec3 initialCoords = vec3(position, 1.0);
        refTexCoords = convertToTexture(homography1 * initialCoords);
        curTexCoords = convertToTexture(homography2 * initialCoords);
    }
);
// Fragment Shader
static const char* warpFS = STRINGIFY
(
    varying highp vec2 refTexCoords;
    varying highp vec2 curTexCoords;
    uniform mediump sampler2D refTex;
    uniform mediump sampler2D curTex;
    uniform mediump sampler2D maskTex;

    void main(void)
    {
        if (texture2D(maskTex, refTexCoords).r == 0.0) {
            discard;
        }
        if (any(bvec4(curTexCoords[0] < 0.0, curTexCoords[1] < 0.0, curTexCoords[0] > 1.0, curTexCoords[1] > 1.0))) {
            discard;
        }
        mediump vec4 referenceColor = texture2D(refTex, refTexCoords);
        mediump vec4 currentColor = texture2D(curTex, curTexCoords);
        gl_FragColor = vec4(abs(referenceColor.r - currentColor.r), 1.0, 0.0, 1.0);
    }
);
I think you just need to do the projection per pixel. Make refTexCoords and curTexCoords at least vec3, then do the divide-by-z in the fragment shader before the texture lookup. Even better, use the textureProj GLSL instruction (texture2DProj in GLSL ES 1.00).
You want to do everything that is linear in the vertex shader, but things like projection need to be done in the fragment shader, per pixel.
This link might help with some background: http://www.reedbeta.com/blog/2012/05/26/quadrilateral-interpolation-part-1/
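A minimal sketch of that change, adapted from the shaders above (my adaptation, untested): keep the coordinate homogeneous in the vertex shader, since the scale to [0, 1] and the y-flip are linear, and postpone the division by z to the fragment shader (or let texture2DProj do it):

// Vertex shader (sketch): only the linear part, no division yet
varying highp vec3 refTexCoords;   // vec3 now, still homogeneous

highp vec3 toHomogeneousTexCoords(highp vec3 p) {
    p.xy /= vec2(float(width), float(height));   // scale to [0, 1] (linear)
    p.y = p.z - p.y;                             // flip y while still homogeneous (matches the 1.0 - y step after the division)
    return p;
}
// ...
refTexCoords = toHomogeneousTexCoords(homography1 * initialCoords);

// Fragment shader (sketch): divide per pixel, or let the hardware do it
mediump vec4 referenceColor = texture2DProj(refTex, refTexCoords);
// equivalent to: texture2D(refTex, refTexCoords.xy / refTexCoords.z)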
I'm trying to implement a 2D outline shader in OpenGL ES 2.0 for iOS. It is insanely slow, as in 5 fps slow. I've tracked it down to the texture2D() calls. However, without those, any convolution shader is undoable. I've tried using lowp instead of mediump, but with that everything is just black; it does give another 5 fps, but it's still unusable.
Here is my fragment shader.
varying mediump vec4 colorVarying;
varying mediump vec2 texCoord;

uniform bool enableTexture;
uniform sampler2D texture;
uniform mediump float k;

void main() {
    const mediump float step_w = 3.0/128.0;
    const mediump float step_h = 3.0/128.0;
    const mediump vec4 b = vec4(0.0, 0.0, 0.0, 1.0);
    const mediump vec4 one = vec4(1.0, 1.0, 1.0, 1.0);

    mediump vec2 offset[9];
    mediump float kernel[9];

    offset[0] = vec2(-step_w, step_h);
    offset[1] = vec2(-step_w, 0.0);
    offset[2] = vec2(-step_w, -step_h);
    offset[3] = vec2(0.0, step_h);
    offset[4] = vec2(0.0, 0.0);
    offset[5] = vec2(0.0, -step_h);
    offset[6] = vec2(step_w, step_h);
    offset[7] = vec2(step_w, 0.0);
    offset[8] = vec2(step_w, -step_h);

    kernel[0] = kernel[2] = kernel[6] = kernel[8] = 1.0/k;
    kernel[1] = kernel[3] = kernel[5] = kernel[7] = 2.0/k;
    kernel[4] = -16.0/k;

    if (enableTexture) {
        mediump vec4 sum = vec4(0.0);
        for (int i = 0; i < 9; i++) {
            mediump vec4 tmp = texture2D(texture, texCoord + offset[i]);
            sum += tmp * kernel[i];
        }
        gl_FragColor = (sum * b) + ((one - sum) * texture2D(texture, texCoord));
    } else {
        gl_FragColor = colorVarying;
    }
}
This is unoptimized and not finalized, but I need to bring up performance before continuing. I've tried replacing the texture2D() call in the loop with just a solid vec4, and it runs with no problem, despite everything else going on.
How can I optimize this? I know it's possible because I've seen far more involved effects in 3D running with no problem. I can't see why this is causing any trouble at all.
I've done this exact thing myself, and I see several things that could be optimized here.
First off, I'd remove the enableTexture conditional and instead split your shader into two programs, one for the true state of this and one for false. Conditionals are very expensive in iOS fragment shaders, particularly ones that have texture reads within them.
Second, you have nine dependent texture reads here. These are texture reads where the texture coordinates are calculated within the fragment shader. Dependent texture reads are very expensive on the PowerVR GPUs within iOS devices, because they prevent that hardware from optimizing texture reads using caching, etc. Because you are sampling from a fixed offset for the 8 surrounding pixels and one central one, these calculations should be moved up into the vertex shader. This also means that these calculations won't have to be performed for each pixel, just once for each vertex and then hardware interpolation will handle the rest.
Third, for() loops haven't been handled all that well by the iOS shader compiler to date, so I tend to avoid those where I can.
As I mentioned, I've done convolution shaders like this in my open source iOS GPUImage framework. For a generic convolution filter, I use the following vertex shader:
attribute vec4 position;
attribute vec4 inputTextureCoordinate;

uniform highp float texelWidth;
uniform highp float texelHeight;

varying vec2 textureCoordinate;
varying vec2 leftTextureCoordinate;
varying vec2 rightTextureCoordinate;
varying vec2 topTextureCoordinate;
varying vec2 topLeftTextureCoordinate;
varying vec2 topRightTextureCoordinate;
varying vec2 bottomTextureCoordinate;
varying vec2 bottomLeftTextureCoordinate;
varying vec2 bottomRightTextureCoordinate;

void main()
{
    gl_Position = position;

    vec2 widthStep = vec2(texelWidth, 0.0);
    vec2 heightStep = vec2(0.0, texelHeight);
    vec2 widthHeightStep = vec2(texelWidth, texelHeight);
    vec2 widthNegativeHeightStep = vec2(texelWidth, -texelHeight);

    textureCoordinate = inputTextureCoordinate.xy;
    leftTextureCoordinate = inputTextureCoordinate.xy - widthStep;
    rightTextureCoordinate = inputTextureCoordinate.xy + widthStep;
    topTextureCoordinate = inputTextureCoordinate.xy - heightStep;
    topLeftTextureCoordinate = inputTextureCoordinate.xy - widthHeightStep;
    topRightTextureCoordinate = inputTextureCoordinate.xy + widthNegativeHeightStep;
    bottomTextureCoordinate = inputTextureCoordinate.xy + heightStep;
    bottomLeftTextureCoordinate = inputTextureCoordinate.xy - widthNegativeHeightStep;
    bottomRightTextureCoordinate = inputTextureCoordinate.xy + widthHeightStep;
}
and the following fragment shader:
precision highp float;

uniform sampler2D inputImageTexture;
uniform mediump mat3 convolutionMatrix;

varying vec2 textureCoordinate;
varying vec2 leftTextureCoordinate;
varying vec2 rightTextureCoordinate;
varying vec2 topTextureCoordinate;
varying vec2 topLeftTextureCoordinate;
varying vec2 topRightTextureCoordinate;
varying vec2 bottomTextureCoordinate;
varying vec2 bottomLeftTextureCoordinate;
varying vec2 bottomRightTextureCoordinate;

void main()
{
    mediump vec4 bottomColor = texture2D(inputImageTexture, bottomTextureCoordinate);
    mediump vec4 bottomLeftColor = texture2D(inputImageTexture, bottomLeftTextureCoordinate);
    mediump vec4 bottomRightColor = texture2D(inputImageTexture, bottomRightTextureCoordinate);
    mediump vec4 centerColor = texture2D(inputImageTexture, textureCoordinate);
    mediump vec4 leftColor = texture2D(inputImageTexture, leftTextureCoordinate);
    mediump vec4 rightColor = texture2D(inputImageTexture, rightTextureCoordinate);
    mediump vec4 topColor = texture2D(inputImageTexture, topTextureCoordinate);
    mediump vec4 topRightColor = texture2D(inputImageTexture, topRightTextureCoordinate);
    mediump vec4 topLeftColor = texture2D(inputImageTexture, topLeftTextureCoordinate);

    mediump vec4 resultColor = topLeftColor * convolutionMatrix[0][0] + topColor * convolutionMatrix[0][1] + topRightColor * convolutionMatrix[0][2];
    resultColor += leftColor * convolutionMatrix[1][0] + centerColor * convolutionMatrix[1][1] + rightColor * convolutionMatrix[1][2];
    resultColor += bottomLeftColor * convolutionMatrix[2][0] + bottomColor * convolutionMatrix[2][1] + bottomRightColor * convolutionMatrix[2][2];

    gl_FragColor = resultColor;
}
The texelWidth and texelHeight uniforms are the inverse of the width and height of the input image, and the convolutionMatrix uniform specifies the weights for the various samples in your convolution.
On an iPhone 4, this runs in 4-8 ms for a 640x480 frame of camera video, which is good enough for 60 FPS rendering at that image size. If you just need to do something like edge detection, you can simplify the above, convert the image to luminance in a pre-pass, then only sample from one color channel. That's even faster, at about 2 ms per frame on the same device.
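As a rough sketch of what such a luminance pre-pass could look like (my own illustration using the Rec. 601 luma weights, not the actual GPUImage shader), reusing the textureCoordinate and inputImageTexture naming from above:

precision mediump float;

varying vec2 textureCoordinate;
uniform sampler2D inputImageTexture;

const vec3 W = vec3(0.299, 0.587, 0.114);   // Rec. 601 luma weights

void main()
{
    float luminance = dot(texture2D(inputImageTexture, textureCoordinate).rgb, W);
    gl_FragColor = vec4(vec3(luminance), 1.0);
}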
The only way I know of to reduce the time taken in this shader is to reduce the number of texture fetches. Since your shader samples the texture at equally spaced points around the center pixel and linearly combines them, you can reduce the number of fetches by making use of the GL_LINEAR mode available for texture sampling.
Basically, instead of sampling at every texel, sample in between a pair of texels to directly get a linearly weighted sum.
Let us call the samples at offsets (-step_w, -step_h) and (-step_w, 0) x0 and x1, respectively. Then your sum is
sum = x0*k0 + x1*k1
Now if instead you sample in between these two texels, at a distance of k1/(k0+k1) from x0 (and therefore k0/(k0+k1) from x1), the GPU will perform the linear weighting during the fetch and give you
y = x0*k0/(k0+k1) + x1*k1/(k0+k1)
Thus the sum can be calculated as
sum = y*(k0+k1)
from just one fetch!
If you repeat this for the other adjacent pixels, you end up doing 4 texture fetches that cover all 8 adjacent offsets, plus one extra texture fetch for the center pixel.
The link explains this much better
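A small GLSL sketch of that pairing (my own illustration, reusing the texture and texCoord names from the question; offsetA/offsetB and kA/kB are hypothetical): it assumes the two taps are adjacent texels, so the bilinear filter blends exactly those two, and that both weights have the same sign, as they do for the 1.0/k and 2.0/k taps here.

// Combine two neighbouring taps into one GL_LINEAR fetch.
mediump float kSum = kA + kB;
mediump vec2 sharedOffset = mix(offsetA, offsetB, kB / kSum);   // point between the two texel centers
mediump vec4 pairSample = texture2D(texture, texCoord + sharedOffset);
sum += pairSample * kSum;   // equals texelA*kA + texelB*kB, from a single fetch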