I'm working on Opengl ES 2.0 using OMAP3530 development board on Windows CE 7.
My Task is to Load a 24-Bit Image File & rotate it about an angle in z-Axis & export the image file(Buffer).
For this task I've created a FBO for off-screen rendering & loaded this image file as a Texture by using glTexImage2D() & I've applied this Texture to a Quad & rotate that QUAD by using PVRTMat4::RotationZ() API & Read-Back by using ReadPixels() API. Since it is a single frame process i just made only 1 loop.
Here are the problems I'm facing now.
1) All API's are taking distinct processing time on every run.ie Sometimes when i run my application i get different processing time for all API's.
2) glDrawArrays() is taking too much time (~50 ms - 80 ms)
3) glReadPixels() is also taking too much time ~95 ms for Image(800x600)
4) Loading 32-Bit image is much faster than 24-Bit image so conversion is needed.
I'd like to ask you all if anybody facing/Solved similar problem kindly suggest me any
Here is the Code snippet of my Application.
[code]
[i]
void BindTexture(){
glGenTextures(1, &m_uiTexture);
glBindTexture(GL_TEXTURE_2D, m_uiTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, ImageWidth, ImageHeight, 0, GL_RGBA, GL_UNSIGNED_BYTE, pTexData);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,GL_LINEAR );
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
}
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, TCHAR *lpCmdLine, int nCmdShow)
{
// Fragment and vertex shaders code
char* pszFragShader = "Same as in RenderToTexture sample;
char* pszVertShader = "Same as in RenderToTexture sample;
CreateWindow(Imagewidth, ImageHeight);//For this i've referred OGLES2HelloTriangle_Windows.cpp example
LoadImageBuffers();
BindTexture();
Generate& BindFrame,Render Buffer();
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, m_auiFbo, 0);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, ImageWidth, ImageHeight);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, m_auiDepthBuffer);
BindTexture();
GLfloat Angle = 0.02f;
GLfloat afVertices[] = {Vertices to Draw a QUAD};
glGenBuffers(1, &ui32Vbo);
LoadVBO's();//Aps's to load VBO's refer
// Draws a triangle for 1 frames
while(g_bDemoDone==false)
{
glBindFramebuffer(GL_FRAMEBUFFER, m_auiFbo);
glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);
PVRTMat4 mRot,mTrans, mMVP;
mTrans = PVRTMat4::Translation(0,0,0);
mRot = PVRTMat4::RotationZ(Angle);
glBindBuffer(GL_ARRAY_BUFFER, ui32Vbo);
glDisable(GL_CULL_FACE);
int i32Location = glGetUniformLocation(uiProgramObject, "myPMVMatrix");
mMVP = mTrans * mRot ;
glUniformMatrix4fv(i32Location, 1, GL_FALSE, mMVP.ptr());
// Pass the vertex data
glEnableVertexAttribArray(VERTEX_ARRAY);
glVertexAttribPointer(VERTEX_ARRAY, 3, GL_FLOAT, GL_FALSE, m_ui32VertexStride, 0);
// Pass the texture coordinates data
glEnableVertexAttribArray(TEXCOORD_ARRAY);
glVertexAttribPointer(TEXCOORD_ARRAY, 2, GL_FLOAT, GL_FALSE, m_ui32VertexStride, (void*) (3 * sizeof(GLfloat)));
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);//
glReadPixels(0,0,ImageWidth ,ImageHeight,GL_RGBA,GL_UNSIGNED_BYTE,pOutTexData) ;
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindFramebuffer(GL_FRAMEBUFFER, 0);
eglSwapBuffers(eglDisplay, eglSurface);
}
DeInitAll();[/i][/code]
The PowerVR architecture can not render a single frame and allow the ARM to read it back quickly. It is just not designed to work that way fast - it is a deferred rendering tile-based architecture. The execution times you are seeing are too be expected and using an FBO is not going to make it faster either. Also, beware that the OpenGL ES drivers on OMAP for Windows CE are really poor quality. Consider yourself lucky if they work at all.
A better design would be to display the OpenGL ES rendering directly to the DSS and avoid using glReadPixels() and the FBO completely.
I've got improved performance for rotating a Image Buffer my using multiple FBO's & PBO's.
Here is the pseudo code snippet of my application.
InitGL()
GenerateShaders();
Generate3Textures();//Generate 3 Null Textures
Generate3FBO();//Generate 3 FBO & Attach each Texture to 1 FBO.
Generate3PBO();//Generate 3 PBO & to readback from FBO.
DrawGL()
{
BindFBO1;
BindTexture1;
UploadtoTexture1;
Do Some Processing & Draw it in FBO1;
BindFBO2;
BindTexture2;
UploadtoTexture2;
Do Some Processing & Draw it in FBO2;
BindFBO3;
BindTexture3;
UploadtoTexture3;
Do Some Processing & Draw it in FBO3;
BindFBO1;
ReadPixelfromFBO1;
UnpackToPBO1;
BindFBO2;
ReadPixelfromFBO2;
UnpackToPBO2;
BindFBO3;
ReadPixelfromFBO3;
UnpackToPBO3;
}
DeinitGL();
DeallocateALL();
By this way I've achieved 50% increased performance for overall processing.
Related
I have developed more than 20 mobile apps using OpenGL ES 2.0. However, I am trying to make a renderer to use my apps in OSX so now I am using OpenGL v3.3 with GLSL v130. Yesterday, I ran into a problem that I can't use a texture(RTT) that I drew particles on Off-Screen FBO with GL_LINES 1.0 size (it is the max value in OpenGL 3.3 why??)
When I drew geometry on the Off Screen FBO and used it as a texture on On-screen, I was able to see that
and also if I draw small particles on On-screen I can clearly see those but if I try to draw that particle lines and try to use it as a texture on Main screen I can see only a black texture.
I have already checked GL ERRORs and back FBOs' status and GL blending options but I am still struggling to solve it .
Anyone has a idea to solve it ?
Even though I think my code is okay I attached a little code bellow
// AFTER generate and bind FBO, generate RTT
StarTexture fboTex;
fboTex.texture_width = texture_width;
fboTex.texture_height = texture_height;
glGenTextures(1, &fboTex.texture_id);
glBindTexture(GL_TEXTURE_2D,fboTex.texture_id);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texture_width, texture_height, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, fboTex.texture_id, 0);
and this is drawing particles on BACK FBO
glUniformMatrix4fv( h_Uniforms[UNIFORMS_PROJECTION], 1, GL_FALSE, g_proxtrans.s);
glBindBuffer(GL_ARRAY_BUFFER, h_VBO[VBO_PARTICLE]);
glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(Vec3)*ParticleNumTotal*2, &p_particle_lc_xy[0]);
glVertexAttribPointer(h_Attributes[ATTRIBUTES_POSITION], 3, GL_FLOAT, 0, 0,0);
glEnableVertexAttribArray(h_Attributes[ATTRIBUTES_POSITION]);
glBindBuffer(GL_ARRAY_BUFFER, h_VBO[VBO_COLOR]);
glVertexAttribPointer(h_Attributes[ATTRIBUTES_COLOR], 4, GL_FLOAT, 0, 0,0);
glEnableVertexAttribArray(h_Attributes[ATTRIBUTES_COLOR]);
glLineWidth(Thickness); // 1.0 because it is maxium
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, h_VBO[VBO_INDEX_OFF1]);
glDrawElements(GL_LINES, 400, GL_UNSIGNED_INT, 0); // 200 lines
and when I draw that on the main screen
glClearColor(0.0, 0.0, 0.0, 1.0);
glClear( GL_COLOR_BUFFER_BIT);
starfbo->bindingVAO1();
glViewport(0, 0, ogl_Width, ogl_Height);
glUseProgram(h_Shader_Program[Shader_Program_FINAL]);
glBindBuffer(GL_ARRAY_BUFFER, h_VBO[VBO_TEXCOORD2]);
glVertexAttribPointer(h_Attributes[ATTRIBUTES_UV2], 2, GL_FLOAT, 0, 0,0);
glEnableVertexAttribArray(h_Attributes[ATTRIBUTES_UV2]);
glBindBuffer(GL_ARRAY_BUFFER, h_VBO[VBO_SQCOORD2]);
glVertexAttribPointer(h_Attributes[ATTRIBUTES_POSITION3], 2, GL_FLOAT, 0, 0,0 );
glEnableVertexAttribArray(h_Attributes[ATTRIBUTES_POSITION3]);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, h_VBO[VBO_INDEX_ON]);
glDrawElements(GL_TRIANGLES,sizeof(squareIndices)/sizeof(squareIndices[0]), GL_UNSIGNED_INT ,(void*)0);
glUniformMatrix4fv( h_Uniforms[UNIFORMS_PROJECTION], 1, GL_FALSE, g_proxtrans.s);
glBindBuffer(GL_ARRAY_BUFFER, h_VBO[VBO_PARTICLE]);
glBufferSubData(GL_ARRAY_BUFFER, 0, sizeof(Vec3)*ParticleNumTotal*2, &p_particle_lc_xy[0]);
glVertexAttribPointer(h_Attributes[ATTRIBUTES_POSITION], 3, GL_FLOAT, 0, 0,0);
glEnableVertexAttribArray(h_Attributes[ATTRIBUTES_POSITION]);
glBindBuffer(GL_ARRAY_BUFFER, h_VBO[VBO_COLOR]);
glVertexAttribPointer(h_Attributes[ATTRIBUTES_COLOR], 4, GL_FLOAT, 0, 0,0);
glEnableVertexAttribArray(h_Attributes[ATTRIBUTES_COLOR]);
glLineWidth(Thickness);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, h_VBO[VBO_INDEX_OFF1]);
glDrawElements(GL_LINES, 400, GL_UNSIGNED_INT, 0);
If the resolution of the rendered image is much larger than the size (in pixels) it ends up being rendered at, it's certainly possible that small features disappear entirely.
Picture an extreme case. Say you render a few thin lines into a 1000x1000 texture, lighting up a very small fraction of the total 1,000,000 pixels. Now you map this texture onto a quad that has a size of 10x10 pixels when displayed. The fragment shader is invoked once for each pixel (assuming no MSAA), which makes for 100 shader invocations. Each of these 100 invocations samples the texture. With linear sampling and no mipmapping, it will read 4 texels for each sample operation. In total, 100 * 4 = 400 texels are read while rendering the polygon. It's quite likely that reading these 400 texels out of the total 1,000,000 will completely miss all of the lines you rendered into the texture.
One way to reduce this problem is to use mipmapping. This will generally prevent the features from disappearing completely. But small features will still fade because more and more texels are averaged in higher mipmap levels, where most of the texels are black.
A better but slightly more complex approach is that instead of using automatically generated mipmaps, you create the mipmaps manually, by rendering the same content into each mipmap level.
It might be good enough to simply be careful that you're not making the texture too large. Or to create your own wide lines by drawing them as polygons instead of using line primitives.
glDrawElements(GL_LINES, 400, GL_UNSIGNED_INT, 0);
GL_UNSIGNED_INT can not be used in OpenGL ES Versus OpenGL. Oddly, it works for IOS but not Android.
The parameter must be GL_UNSIGNED_BYTE or GL_UNSIGNED_SHORT in OpenGL ES.
I have four VBO's (BufferA, BufferB, BufferC and BufferD) and two programs (program1 and program2).
Main steps of logic are:
glUseProgram(progran1);
glBindBuffer(GL_ARRAY_BUFFER, BufferA);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, BufferB);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, Vertex1Count);
glEndTransformFeedback();
swap(BufferA, BufferB);
glUseProgram(progran2);
glBindBuffer(GL_ARRAY_BUFFER, BufferC);
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, BufferD);
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, Vertex2Count);
glEndTransformFeedback();
swap(BufferC, BufferD);
Questions: What do I need to do to gain access to BufferB from program2?
Can I bind BufferB as texture somehow and read it with texelfetch?
I am using iOS 7 and OpenGL es 3.0
Yes, you can. You may use buffer as PBO and then create a texture from your buffer.
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, BufferB);
GLuint someTex;
glActiveTexture(GL_TEXTURE0);
glGenTexutre(1, &someTex);
glBindTexture(GL_TEXTURE_2D, someTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, 1, sizeOfYourBuffer, 0, GL_RGBA, GL_FLOAT, nullptr);
// nullptr is interpret as an offset in your buffer
In case of using PBO, TexImage* works fast since CPU is not involved in texture initialization.
Disadvantage of the approach is that the texture is not allowed to be changed. But if you are implementing iterating method you may use the "pin pong strategy" (Have different buffers for previous and new state; Swap it after visualization).
I'm trying to get some shadowing effects to work in OpenGL ES 2.0 on iOS by porting some code from standard GL. Part of the sample involves copying the depth buffer to a texture:
glBindTexture(GL_TEXTURE_2D, g_uiDepthBuffer);
glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, 0, 0, 800, 600, 0);
However, it appears the glCopyTexImage2D is not supported on ES. Reading a related thread, it seems I can use the frame buffer and fragment shaders to extract the depth data. So I'm trying to write the depth component to the color buffer, then copying it:
// clear everything
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT);
// turn on depth rendering
glUseProgram(m_BaseShader.uiId);
// this is a switch to cause the fragment shader to just dump out the depth component
glUniform1i(uiBaseShaderRenderDepth, true);
// and for this, the color buffer needs to be on
glColorMask(GL_TRUE,GL_TRUE,GL_TRUE,GL_TRUE);
// and clear it to 1.0, like how the depth buffer starts
glClearColor(1.0f, 1.0f, 1.0f, 1.0f);
glClear(GL_COLOR_BUFFER_BIT);
// draw the scene
DrawScene();
// bind our texture
glBindTexture(GL_TEXTURE_2D, g_uiDepthBuffer);
glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 0, 0, width, height, 0);
Here is the fragment shader:
uniform sampler2D sTexture;
uniform bool bRenderDepth;
varying lowp float LightIntensity;
varying mediump vec2 TexCoord;
void main()
{
if(bRenderDepth) {
gl_FragColor = vec4(vec3(gl_FragCoord.z), 1.0);
} else {
gl_FragColor = vec4(texture2D(sTexture, TexCoord).rgb * LightIntensity, 1.0);
}
}
I have experimented with not having the 'bRenderDepth' branch, and it doesn't speed it up significantly.
Right now pretty much just doing this step its at 14fps, which obviously is not acceptable. If I pull out the copy its way above 30fps. I'm getting two suggestions from the Xcode OpenGLES analyzer on the copy command:
file://localhost/Users/xxxx/Documents/Development/xxxx.mm: error:
Validation Error: glCopyTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 0, 0,
960, 640, 0) : Height<640> is not a power of two
file://localhost/Users/xxxx/Documents/Development/xxxx.mm: warning:
GPU Wait on Texture: Your app updated a texture that is currently
used for rendering. This caused the CPU to wait for the GPU to
finish rendering.
I'll work to resolve the two above issues (perhaps they are the crux if of it). In the meantime can anyone suggest a more efficient way to pull that depth data into a texture?
Thanks in advance!
iOS devices generally support OES_depth_texture, so on devices where the extension is present, you can set up a framebuffer object with a depth texture as its only attachment:
GLuint g_uiDepthBuffer;
glGenTextures(1, &g_uiDepthBuffer);
glBindTexture(GL_TEXTURE_2D, g_uiDepthBuffer);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, width, height, 0, GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, NULL);
// glTexParameteri calls omitted for brevity
GLuint g_uiDepthFramebuffer;
glGenFramebuffers(1, &g_uiDepthFramebuffer);
glBindFramebuffer(GL_FRAMEBUFFER, g_uiDepthFramebuffer);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, g_uiDepthBuffer, 0);
Your texture then receives all the values being written to the depth buffer when you draw your scene (you can use a trivial fragment shader for this), and you can texture from it directly without needing to call glCopyTexImage2D.
I've been searching through the net for a few day looking for the fastest possible way to take a OpenCV webcam capture and display it on an OpenGL context. So far this seems to work OK until I need to zoom.
void Camera::DrawIplImage1(IplImage *image, int x, int y, GLfloat xZoom, GLfloat yZoom)
{
GLenum format;
switch(image->nChannels) {
case 1:
format = GL_LUMINANCE;
break;
case 2:
format = GL_LUMINANCE_ALPHA;
break;
case 3:
format = GL_BGR;
break;
default:
return;
}
yZoom =- yZoom;
glRasterPos2i(x, y);
glPixelZoom(xZoom, yZoom); //Slow when not (1.0f, 1.0f);
glDrawPixels(image->width, image->height, format, GL_UNSIGNED_BYTE, image->imageData);
}
I've heard that maybe taking the FBO approach would be even faster. Any ideas out there on the fastest possible way to get an OpenCV webcam capture to an OpenGL context. I will test everything I see and post results.
Are you sure your openGL implementation needs ^2 textures? Even very poor PC implementations (yes Intel) can manage arbitrary sizes now.
Then the quickest is probably to use a openGL Pixel buffer
Sorry the code is from Qt, so the function names are slightly different but the sequence is the same
Allocate the opengl Texture
glEnable(GL_TEXTURE_2D);
glGenTextures(1,&texture);
glBindTexture(GL_TEXTURE_2D,texture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, glFormat, width, height, 0, glFormatExt, glType, NULL );
glDisable(GL_TEXTURE_2D);
Now get a pointer to the texture to use the memeory
glbuffer.bind();
unsigned char *dest = (unsigned char*)glbuffer.map(QGLBuffer::ReadWrite);
// creates an openCV image but the pixel data is stored in an opengl buffer
cv::Mat opencvImage(rows,cols,CV_TYPE,dest);
.... do stuff ....
glbuffer.unmap(); // pointer is no longer valid - so neither is openCV image
Then to draw it - this should be essentially instant because the data was copied to the GPU in the mapped calls above
glBindTexture(GL_TEXTURE_2D,texture);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0,0, width, height, glFormatExt, glType, 0);
glbuffer.release();
By using different types for glFormat and glFormatExt you can have the graphics card automatically convert between opencVs BGR and typical RGBA display formats for you in hardware.
I was wondering how to speed up my iPad application using OpenGLES 2.0. At the moment we have every drawable object draw itself with a call to glDrawArrays(). Blend mode is on, we really need it. Without disabling blendmode, how would we improve performance for this app?
For instances, if we now draw 3 textures (1024x1024, 256x512, 256x512) across the whole screen, the app only gets 15FPS, which is really slow I think? Are we doing something terribly wrong? Our drawing code (for each drawable), is as follows:
- (void) draw {
GLuint textureAvailable = 0;
if(texture != nil){
textureAvailable = 1;
}
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, texture.name);
glVertexAttribPointer(ATTRIB_VERTEX, 2, GL_FLOAT, 0, 0, vertices);
glEnableVertexAttribArray(ATTRIB_VERTEX);
glVertexAttribPointer(ATTRIB_COLOR, 4, GL_FLOAT, 1, 0, colorsWithMultipliedAlpha);
glEnableVertexAttribArray(ATTRIB_COLOR);
glVertexAttribPointer(ATTRIB_TEXTUREMAP, 2, GL_FLOAT, 1, 0, textureMapping);
glEnableVertexAttribArray(ATTRIB_TEXTUREMAP);
//Note that we are NOT using position.z here because that is only used to determine drawing order
int *jnUniforms = JNOpenGLConstants::getInstance().uniforms;
glUniform4f(jnUniforms[UNIFORM_TRANSLATE], position.x, position.y, 0.0, 0.0);
glUniform4f(jnUniforms[UNIFORM_SCALE], scale.x, scale.y, 1.0, 1.0);
glUniform1f(jnUniforms[UNIFORM_ROTATION], rotation);
glUniform1i(jnUniforms[UNIFORM_TEXTURE_SAMPLE], 0);
glUniform2f(jnUniforms[UNIFORM_TEXTURE_REPEAT], textureRepeat.x, textureRepeat.y);
glUniform1i(jnUniforms[UNIFORM_TEXTURE_AVAILABLE], textureAvailable);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
Possible optimizations I think won't work:
Drawing geometry in batches
I'm only drawing 3 items and the FPS is 15, I don't think batching the geometry would work here because it's such a small number of calls for drawing that it doesn't matter if we kill 2/3 of those calls.
Texture Atlas
Again, only drawing 3 textures. What I do wonder if it would matter (a lot) if we were to convert these to PVR? I haven't looked into it, but I must admit we're loading big PNGs at the moment. Is there any way to see if this is indeed the case, or is it easier just to check it out?
But please tell me if I'm wrong, I'm happy to hear any ideas.
Proposed solutions
Mipmapped textures
Loading mipmapped textures, doing it like this:
- (id) initWithUIImage: (UIImage * const) image {
glGenTextures(1, &name);
//JNLogString(#"Recieved name(%d), binding texture", name);
glBindTexture(GL_TEXTURE_2D, name);
//Set the needed parameters for the texture
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
//Load the image data into the texture
glGenerateMipmap(GL_TEXTURE_2D);
return self;
}
This doesn't seem to do anything for our FPS, I think this is because our textures are already roughly at the size they are rendered to on the screen, in most cases even 1:1.
Other solutions are welcome! I will try them out and post the results here
If you are using very large textures, try to create mipmap textures. The cost is basically 1/3 of the original texture memory. I think they can be created with this call when setting up the textures.
glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE);
Some calculations: If you have 3 textures 2048x2048 (max size) at 15 Hz you will have a texel throughput (if they are fully shown, ie downscaled to screen resolution) of 2048x2048x3x15 = 188,743,680 / sec which is around the value we see at glbenchmark.com for single fill rate (173 Mtexel/sec). But if you are using mipmap textures the texel throughput should be closer to the screen size resolution (1024x768) which should be something like 1/4 of the previous throughput.
I had a branch in my fragment shader. I though that didn't put a lot of strain on it, but it did! Anyhow, that was the whole problem, I removed the branch and now my FPS has almost doubled.