Why is this OpenGL 3 vertex shader very slow? - performance

I have the following vertex shader:
#version 150
in vec4 position;
in vec2 texture;
in int layer;
out vec2 pass_texture;
out float pass_layer;
uniform mat4 _modelToClipMatrix;
uniform float layerDepth[255];
void main (void)
{
gl_Position = _modelToClipMatrix*vec4(position.xy,layerDepth[layer]/255,position.w);
// gl_Position = _modelToClipMatrix*position;
pass_layer = float(layer);
pass_texture = texture;
}
When I use it the way it is here, my frame rate is about 7 FPS. If I use the second line (which is commented out) instead of the first, my frame rate jumps to about 50 FPS. It seems that the array lookup is the big problem here. Why is it so terribly slow? And how can I improve performance while keeping functionality?
My hardware is a ATI Radeon HD 4670 256 MB (iMac 2010 model).
My vertex structure looks like:
typedef struct
{
floatVector2 position; //2*4=8
uByteVector2 textureCoordinate; //2*1=2
GLubyte layer; //1
} PCBVertex;
and I set op the buffer in the following way:
glVertexAttribPointer((GLuint)positionAttribute, 2, GL_FLOAT, GL_FALSE, sizeof(PCBVertex), (const GLvoid *)offsetof(PCBVertex, position));
glVertexAttribPointer((GLuint)textureAttribute, 2, GL_UNSIGNED_BYTE, GL_FALSE, sizeof(PCBVertex), (const GLvoid *)offsetof(PCBVertex, textureCoordinate));
glVertexAttribIPointer(layerAttribute, 1, GL_UNSIGNED_BYTE, sizeof(PCBVertex), (const GLvoid *)offsetof(PCBVertex, layer));
Some background information:
I'm working on a drawing package. The user can draw on multiple layers. One layer is active at a time and it's drawn front-most. He can also "flip" the layers, as if looking from the other side. I figured it would be inefficient to update all vertices when the layer order changes, so I give each vertex a layer number and lookup its current position in the uniform (I only send x and y as position data). Also, as a side note: the fragment shader uses the same layer number to determine the color, using a uniform array as well.

If you remove the line
gl_Position = _modelToClipMatrix*vec4(position.xy,layerDepth[layer]/255,position.w);
from your shader the uniform float layerDepth[255]; will become unused. That means the compiler will optimize it away.
Further more, the layerAttribute location will become -1, preventing any data transfers for this attrib pointer.

Related

Perfomance depending on index type

I was playing around with "drawing" millions of triangles and found something interesting: switching type of indices from VK_INDEX_TYPE_UINT32 to VK_INDEX_TYPE_UINT16 increased amount of triangles being drawn per second by 1.5 times! I want to know, how is the difference in speed so large?
I use indirect indexed instanced (so much i) drawing: 25 vertices, 138 indices(46 triangles), 2^21~=2M instances(I am too lazy to seek where to disable vSync), 1 draw per frame. 96'468'992 triangles per frame total. To get the clearest results I look away from the triangles (discarding rasterisation has pretty much same performance)
I have very simple vertex shader:
layout(set = 0, binding = 0) uniform A
{
mat4 cam;
};
layout(location = 0)in vec3 inPosition;//
layout(location = 1)in vec4 inColor; //Color and position are de-interleaved
layout(location = 2)in vec3 inGlob; //
layout(location = 3)in vec4 inQuat; //data per instance, interleaved
layout(location = 0)out vec4 fragColor;
vec3 vecXquat(const vec3 v, const vec4 q)
{// function rotating vector by quaternion
return v + 2.0f *
cross(q.xyz,
cross(q.xyz, v)
+ q.w * v);
}
void main(){
gl_Position = vec4(vecXquat(inPosition, inQuat)+inGlob, 1.0f)*cam;
fragColor = inColor;
}
and pass-through fragment shader.
Primitives - VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST
The results:
~1950MTris/s with 32bit indices
~2850MTris/s with 16bit indices
GPU - GTX1050Ti
Since your shaders are so simple, your rendering performance will likely be dominated by factors that would otherwise be more trivial, like vertex data transfer rate.
138 indices have to be read by the GPU for each instance. With 2 million instances, that's 1.02GB of just index data that has to be read by the GPU with 32-bit indices. Of course, for 16-bit indices, the transfer rate is halved. And with half as much data, there's a better chance that the index data all manages to fit entirely in the vertex pulling cache.

OpenGL - trouble passing ALL data into shader at once

I'm trying to display textures on quads (2 triangles) using opengl 3.3
Drawing a texture on a quad works great; however when I have ONE textures (sprite atlas) but using 2 quads(objects) to display different parts of the atlas. When in draw loop, they end up switching back and fourth(one disappears than appears again, etc) at their individual translated locations.
The way I'm drawing this is not the standard DrawElements for each quad(or object) but I package all quads, uv, translations, etc send them up to the shader as one big chunk (as "in" variables): Vertex shader:
#version 330 core
// Input vertex data, different for all executions of this shader.
in vec3 vertexPosition_modelspace;
in vec3 vertexColor;
in vec2 vertexUV;
in vec3 translation;
in vec4 rotation;
in vec3 scale;
// Output data ; will be interpolated for each fragment.
out vec2 UV;
// Output data ; will be interpolated for each fragment.
out vec3 fragmentColor;
// Values that stay constant for the whole mesh.
uniform mat4 MVP;
...
void main(){
mat4 Model = mat4(1.0);
mat4 t = translationMatrix(translation);
mat4 s = scaleMatrix(scale);
mat4 r = rotationMatrix(vec3(rotation), rotation[3]);
Model *= t * r * s;
gl_Position = MVP * Model * vec4 (vertexPosition_modelspace,1); //* MVP;
// The color of each vertex will be interpolated
// to produce the color of each fragment
fragmentColor = vertexColor;
// UV of the vertex. No special space for this one.
UV = vertexUV;
}
Is the vertex shader working as I think it would with a large chunk of data - that it draws each segment passed up as uniform individually because it does not seem like it? Is my train of thought correct on this?
For completeness this is my fragment shader:
#version 330 core
// Interpolated values from the vertex shaders
in vec3 fragmentColor;
// Interpolated values from the vertex shaders
in vec2 UV;
// Ouput data
out vec4 color;
// Values that stay constant for the whole mesh.
uniform sampler2D myTextureSampler;
void main()
{
// Output color = color of the texture at the specified UV
color = texture2D( myTextureSampler, UV ).rgba;
}
A request for more information was made so I will put how i bind this data up to the vertex shader. The following code is just one I use for my translations. I have more for color, rotation, scale, uv, etc:
gl.BindBuffer(gl.ARRAY_BUFFER, tvbo)
gl.BufferData(gl.ARRAY_BUFFER, len(data.Translations)*4, gl.Ptr(data.Translations), gl.DYNAMIC_DRAW)
tAttrib := uint32(gl.GetAttribLocation(program, gl.Str("translation\x00")))
gl.EnableVertexAttribArray(tAttrib)
gl.VertexAttribPointer(tAttrib, 3, gl.FLOAT, false, 0, nil)
...
gl.DrawElements(gl.TRIANGLES, int32(len(elements)), gl.UNSIGNED_INT, nil)
You have just single sampler2D
which means you have just single texture at your disposal
regardless on how many of them you bind.
If you really need to pass the data as single block
then you should add sampler per each texture you got
not sure how many objects/textures you have
but you are limited by gfx hw limit on texture units with this way of data passing
also you need to add another value to your data telling which primitive use which texture unit
and inside fragment then select the right texture sampler ...
You should add stuff like this:
// vertex
in int usedtexture;
out int txr;
void main()
{
txr=usedtexture;
}
// fragment
uniform sampler2D myTextureSampler0;
uniform sampler2D myTextureSampler1;
uniform sampler2D myTextureSampler2;
uniform sampler2D myTextureSampler3;
in vec2 UV;
in int txr;
out vec4 color;
void main
{
if (txr==0) color = texture2D( myTextureSampler0, UV ).rgba;
else if (txr==1) color = texture2D( myTextureSampler1, UV ).rgba;
else if (txr==2) color = texture2D( myTextureSampler2, UV ).rgba;
else if (txr==3) color = texture2D( myTextureSampler3, UV ).rgba;
else color=vec4(0.0,0.0,0.0,0.0);
}
This way of passing is not good for these reasons:
number of used textures is limited to HW texture units limit
if your rendering would need additional textures like normal/shininess/light maps
then you need more then 1 texture per object type and your limit is suddenly divided by 2,3,4...
You need if/switch statements inside fragment which can slow things down considerably
Yes you can do it brunch less but then you would need to access all textures all the time increasing heat stress on gfx without reason...
This kind of passing is suitable for
all textures inside single image (as you mentioned texture atlas)
which can be faster this way and reasonable for scenes with small number of object types (or materials) but large object count...
Since I needed more input on this matter, I linked this page to reddit and someone was able to help me with one response! Anyways the reddit link is here:
https://www.reddit.com/r/opengl/comments/3gyvlt/opengl_passing_all_scene_data_into_shader_each/
The issue of seeing two individual textures/quads after passing all vertices as one data structure over to vertex shader was because my element indices were off. I needed to determine the correct index of each set of vertices for my 2 triangle(quad) objects. Simply had to do something like this:
vertexInfo.Elements = append(vertexInfo.Elements, uint32(idx*4), uint32(idx*4+1), uint32(idx*4+2), uint32(idx*4), uint32(idx*4+2), uint32(idx*4+3))

Retrieve Vertices Data in THREE.js

I'm creating a mesh with a custom shader. Within the vertex shader I'm modifying the original position of the geometry vertices. Then I need to access to this new vertices position from outside the shader, how can I accomplish this?
In lieu of transform feedback (which WebGL 1.0 does not support), you will have to use a passthrough fragment shader and floating-point texture (this requires loading the extension OES_texture_float). That is the only approach to generate a vertex buffer on the GPU in WebGL. WebGL does not support pixel buffer objects either, so reading the output data back is going to be very inefficient.
Nevertheless, here is how you can accomplish this:
This will be a rough overview focusing on OpenGL rather than anything Three.js specific.
First, encode your vertex array this way (add a 4th component for index):
Vec4 pos_idx : xyz = Vertex Position, w = Vertex Index (0.0 through NumVerts-1.0)
Storing the vertex index as the w component is necessary because OpenGL ES 2.0 (WebGL 1.0) does not support gl_VertexID.
Next, you need a 2D floating-point texture:
MaxTexSize = Query GL_MAX_TEXTURE_SIZE
Width = MaxTexSize;
Height = min (NumVerts / MaxTexSize, 1);
Create an RGBA floating-point texture with those dimensions and use it as FBO color attachment 0.
Vertex Shader:
#version 100
attribute vec4 pos_idx;
uniform int width; // Width of floating-point texture
uniform int height; // Height of floating-point texture
varying vec4 vtx_out;
void main (void)
{
float idx = pos_idx.w;
// Position this vertex so that it occupies a unique pixel
vec2 xy_idx = vec2 (float ((int (idx) % width)) / float (width),
floor (idx / float (width)) / float (height)) * vec2 (2.0) - vec2 (1.0);
gl_Position = vec4 (xy_idx, 0.0f, 1.0f);
//
// Do all of your per-vertex calculations here, and output to vtx_out.xyz
//
// Store the index in the W component
vtx_out.w = idx;
}
Passthrough Fragment Shader:
#version 100
varying vec4 vtx_out;
void main (void)
{
gl_FragData [0] = vtx_out;
}
Draw and Read Back:
// Draw your entire vertex array for processing (as `GL_POINTS`)
glDrawArrays (GL_POINTS, 0, NumVerts);
// Bind the FBO's color attachment 0 to `GL_TEXTURE_2D`
// Read the texture back and store its results in an array `verts`
glGetTexImage (GL_TEXTURE_2D, 0, GL_RGBA, GL_FLOAT, verts);

How can I get Alpha blending transparency working in OpenGL ES 2.0?

I'm in the midst of porting some code from OpenGL ES 1.x to OpenGL ES 2.0, and I'm struggling to get transparency working as it did before; all my triangles are being rendered fully opaque.
My OpenGL setup has these lines:
// Draw objects back to front
glDisable(GL_DEPTH_TEST);
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
glDepthMask(false);
And my shaders look like this:
attribute vec4 Position;
uniform highp mat4 mat;
attribute vec4 SourceColor;
varying vec4 DestinationColor;
void main(void) {
DestinationColor = SourceColor;
gl_Position = Position * mat;
}
and this:
varying lowp vec4 DestinationColor;
void main(void) {
gl_FragColor = DestinationColor;
}
What could be going wrong?
EDIT: If I set the alpha in the fragment shader manually to 0.5 in the fragment shader (or indeed in the vertex shader) as suggested by keaukraine below, then I get transparent everything. Furthermore, if I change the color values I'm passing in to OpenGL to be floats instead of unsigned bytes, then the code works correctly.
So it looks as though something is wrong with the code that was passing the color information into OpenGL, and I'd still like to know what the problem was.
My vertices were defined like this (unchanged from the OpenGL ES 1.x code):
typedef struct
{
GLfloat x, y, z, rhw;
GLubyte r, g, b, a;
} Vertex;
And I was using the following code to pass them into OpenGL (similar to the OpenGL ES 1.x code):
glBindBuffer(GL_ARRAY_BUFFER, glTriangleVertexBuffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(Vertex) * nTriangleVertices, triangleVertices, GL_STATIC_DRAW);
glUniformMatrix4fv(matLocation, 1, GL_FALSE, m);
glVertexAttribPointer(positionSlot, 4, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*)offsetof(Vertex, x));
glVertexAttribPointer(colorSlot, 4, GL_UNSIGNED_BYTE, GL_FALSE, sizeof(Vertex), (void*)offsetof(Vertex, r));
glDrawArrays(GL_TRIANGLES, 0, nTriangleVertices);
glBindBuffer(GL_ARRAY_BUFFER, 0);
What is wrong with the above?
Your Colour vertex attribute values are not being normalized. This means that the vertex shader sees values for that attribute in the range 0-255.
Change the fourth argument of glVertexAttribPointer to GL_TRUE and the values will be normalized (scaled to the range 0.0-1.0) as you originally expected.
see http://www.khronos.org/opengles/sdk/docs/man/xhtml/glVertexAttribPointer.xml
I suspect the DestinationColor varying to your fragment shader always contains 0xFF for the alpha channel? If so, that is your problem. Try changing that so that the alpha actually varies.
Update: We found 2 good solutions:
Use floats instead of unsigned bytes for the varyings that are supplied to the DestinationColor in the fragment shader.
Or, as GuyRT pointed out, you can change the fourth argument of glVertexAttribPointer to GL_TRUE to tell OpenGL ES to normalize the values when they are converted from integers to floats.
To debug this situation, you can try setting constant alpha and see if it makes a difference:
varying lowp vec4 DestinationColor;
void main(void) {
gl_FragColor = DestinationColor;
gl_FragColor.a = 0.5; /* try other values from 0 to 1 to test blending */
}
Also you should ensure that you're picking EGL config with alpha channel.
And don't forget to specify precision for floats in fragment shaders! Read specs for OpenGL GL|ES 2.0 (http://www.khronos.org/registry/gles/specs/2.0/GLSL_ES_Specification_1.0.17.pdf section 4.5.3), and please see this answer: https://stackoverflow.com/a/6336285/405681

Perspective correct texturing of trapezoid in OpenGL ES 2.0

I have drawn a textured trapezoid, however the result does not appear as I had intended.
Instead of appearing as a single unbroken quadrilateral, a discontinuity occurs at the diagonal line where its two comprising triangles meet.
This illustration demonstrates the issue:
(Note: the last image is not intended to be a 100% faithful representation, but it should get the point across.)
The trapezoid is being drawn using GL_TRIANGLE_STRIP in OpenGL ES 2.0 (on an iPhone). It's being drawn completely facing the screen, and is not being tilted (i.e. that's not a 3D sketch you're seeing!)
I have come to understand that I need to perform "perspective correction," presumably in my vertex and/or fragment shaders, but I am unclear how to do this.
My code includes some simple Model/View/Projection matrix math, but none of it currently influences my texture coordinate values. Update: The previous statement is incorrect, according to comment by user infact.
Furthermore, I have found this tidbit in the ES 2.0 spec, but do not understand what it means:
The PERSPECTIVE CORRECTION HINT is not supported because OpenGL
ES 2.0 requires that all attributes be perspectively interpolated.
How can I make the texture draw correctly?
Edit: Added code below:
// Vertex shader
attribute vec4 position;
attribute vec2 textureCoordinate;
varying vec2 texCoord;
uniform mat4 modelViewProjectionMatrix;
void main()
{
gl_Position = modelViewProjectionMatrix * position;
texCoord = textureCoordinate;
}
// Fragment shader
uniform sampler2D texture;
varying mediump vec2 texCoord;
void main()
{
gl_FragColor = texture2D(texture, texCoord);
}
// Update and Drawing code (uses GLKit helpers from iOS)
- (void)update
{
float fov = GLKMathDegreesToRadians(65.0f);
float aspect = fabsf(self.view.bounds.size.width / self.view.bounds.size.height);
projectionMatrix = GLKMatrix4MakePerspective(fov, aspect, 0.1f, 50.0f);
viewMatrix = GLKMatrix4MakeTranslation(0.0f, 0.0f, -4.0f); // zoom out
}
- (void)glkView:(GLKView *)view drawInRect:(CGRect)rect
{
glClearColor(0.0f, 0.0f, 0.0f, 1.0f);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
glUseProgram(shaders[SHADER_DEFAULT]);
GLKMatrix4 modelMatrix = GLKMatrix4MakeScale(0.795, 0.795, 0.795); // arbitrary scale
GLKMatrix4 modelViewMatrix = GLKMatrix4Multiply(viewMatrix, modelMatrix);
GLKMatrix4 modelViewProjectionMatrix = GLKMatrix4Multiply(projectionMatrix, modelViewMatrix);
glUniformMatrix4fv(uniforms[UNIFORM_MODELVIEWPROJECTION_MATRIX], 1, GL_FALSE, modelViewProjectionMatrix.m);
glBindTexture(GL_TEXTURE_2D, textures[TEXTURE_WALLS]);
glUniform1i(uniforms[UNIFORM_TEXTURE], 0);
glVertexAttribPointer(ATTRIB_VERTEX, 3, GL_FLOAT, GL_FALSE, 0, wall.vertexArray);
glVertexAttribPointer(ATTRIB_TEXTURE_COORDINATE, 2, GL_FLOAT, GL_FALSE, 0, wall.texCoords);
glDrawArrays(GL_TRIANGLE_STRIP, 0, wall.vertexCount);
}
(I'm taking a bit of a punt here, because your picture does not show exactly what I would expect from texturing a trapezoid, so perhaps something else is happening in your case - but the general problem is well known)
Textures will not (by default) interpolate correctly across a trapezoid. When the shape is triangulated for drawing, one of the diagonals will be chosen as an edge, and while that edge is straight through the middle of the texture, it is not through the middle of the trapezoid (picture the shape divided along a diagonal - the two triangles are very much not equal).
You need to provide more than a 2D texture coordinate to make this work - you need to provide a 3D (or rather, projective) texture coordinate, and perform the perspective divide in the fragment shader, post-interpolation (or else use a texture lookup function which will do the same).
The following shows how to provide texture coordinates for a trapezoid using old-school GL functions (which are a little easier to read for demonstration purposes). The commented-out lines are the 2d texture coordinates, which I have replaced with projective coordinates to get the correct interpolation.
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(0,640,0,480,1,1000);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
const float trap_wide = 600;
const float trap_narrow = 300;
const float mid = 320;
glBegin(GL_TRIANGLE_STRIP);
glColor3f(1,1,1);
// glTexCoord4f(0,0,0,1);
glTexCoord4f(0,0,0,trap_wide);
glVertex3f(mid - trap_wide/2,10,-10);
// glTexCoord4f(1,0,0,1);
glTexCoord4f(trap_narrow,0,0,trap_narrow);
glVertex3f(mid - trap_narrow/2,470,-10);
// glTexCoord4f(0,1,0,1);
glTexCoord4f(0,trap_wide,0,trap_wide);
glVertex3f(mid + trap_wide/2,10,-10);
// glTexCoord4f(1,1,0,1);
glTexCoord4f(trap_narrow,trap_narrow,0,trap_narrow);
glVertex3f(mid + trap_narrow/2,470,-10);
glEnd();
The third coordinate is unused here as we're just using a 2D texture. The fourth coordinate will divide the other two after interpolation, providing the projection. Obviously if you divide it through at the vertices, you'll see you get the original texture coordinates.
Here's what the two renderings look like:
If your trapezoid is actually the result of transforming a quad, it might be easier/better to just draw that quad using GL, rather than transforming it in software and feeding 2D shapes to GL...
What you are trying here is Skewed texture. A sample fragment shader is as follows :
precision mediump float;
varying vec4 vtexCoords;
uniform sampler2D sampler;
void main()
{
gl_FragColor = texture2DProj(sampler,vtexCoords);
}
2 things which should look different are :
1) We are using varying vec4 vtexCoords; . Texture co-ordinates are 4 dimensional.
2) texture2DProj() is used instead of texture2D()
Based on length of small and large side of your trapezium you will assign texture co-ordinates. Following URL might help :
http://www.xyzw.us/~cass/qcoord/
The accepted answer gives the correct solution and explanation but for those looking for a bit more help on the OpenGL (ES) 2.0 pipeline...
const GLfloat L = 2.0;
const GLfloat Z = -2.0;
const GLfloat W0 = 0.01;
const GLfloat W1 = 0.10;
/** Trapezoid shape as two triangles. */
static const GLKVector3 VERTEX_DATA[] = {
{{-W0, 0, Z}},
{{+W0, 0, Z}},
{{-W1, L, Z}},
{{+W0, 0, Z}},
{{+W1, L, Z}},
{{-W1, L, Z}},
};
/** Add a 3rd coord to your texture data. This is the perspective divisor needed in frag shader */
static const GLKVector3 TEXTURE_DATA[] = {
{{0, 0, 0}},
{{W0, 0, W0}},
{{0, W1, W1}},
{{W0, 0, W0}},
{{W1, W1, W1}},
{{0, W1, W1}},
};
////////////////////////////////////////////////////////////////////////////////////
// frag.glsl
varying vec3 v_texPos;
uniform sampler2D u_texture;
void main(void)
{
// Divide the 2D texture coords by the third projection divisor
gl_FragColor = texture2D(u_texture, v_texPos.st / v_texPos.p);
}
Alternatively, in the shader, as per #maverick9888's answer, You can use texture2Dproj though for iOS / OpenGLES2 it still only supports a vec3 input...
void main(void)
{
gl_FragColor = texture2DProj(u_texture, v_texPos);
}
I haven't really benchmarked it properly but for my very simple case (a 1d texture really) the division version seems a bit snappier.

Resources