Fastest possible OpenCV to OpenGL context - performance

I've been searching the net for a few days looking for the fastest possible way to take an OpenCV webcam capture and display it in an OpenGL context. So far this seems to work OK until I need to zoom.
void Camera::DrawIplImage1(IplImage *image, int x, int y, GLfloat xZoom, GLfloat yZoom)
{
GLenum format;
switch(image->nChannels) {
case 1:
format = GL_LUMINANCE;
break;
case 2:
format = GL_LUMINANCE_ALPHA;
break;
case 3:
format = GL_BGR;
break;
default:
return;
}
yZoom = -yZoom;
glRasterPos2i(x, y);
glPixelZoom(xZoom, yZoom); //Slow when not (1.0f, 1.0f);
glDrawPixels(image->width, image->height, format, GL_UNSIGNED_BYTE, image->imageData);
}
I've heard that maybe taking the FBO approach would be even faster. Any ideas on the fastest possible way to get an OpenCV webcam capture into an OpenGL context? I will test everything I see and post results.

Are you sure your OpenGL implementation needs power-of-two textures? Even very poor PC implementations (yes, Intel) can manage arbitrary sizes now.
Then the quickest is probably to use an OpenGL pixel buffer.
Sorry, the code is from Qt, so the function names are slightly different, but the sequence is the same.
Allocate the OpenGL texture:
glEnable(GL_TEXTURE_2D);
glGenTextures(1,&texture);
glBindTexture(GL_TEXTURE_2D,texture);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, glFormat, width, height, 0, glFormatExt, glType, NULL );
glDisable(GL_TEXTURE_2D);
Now get a pointer to the buffer memory so you can use it:
glbuffer.bind();
unsigned char *dest = (unsigned char*)glbuffer.map(QGLBuffer::ReadWrite);
// Create an OpenCV image whose pixel data is stored in the OpenGL buffer
cv::Mat opencvImage(rows,cols,CV_TYPE,dest);
.... do stuff ....
glbuffer.unmap(); // pointer is no longer valid - so neither is openCV image
Then to draw it (this should be essentially instant because the data was copied to the GPU in the mapping calls above):
glBindTexture(GL_TEXTURE_2D,texture);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0,0, width, height, glFormatExt, glType, 0);
glbuffer.release();
By using different values for glFormat and glFormatExt you can have the graphics card automatically convert between OpenCV's BGR and the typical RGBA display format for you in hardware.
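For reference, here is a minimal sketch of the same idea using raw OpenGL calls (GL_PIXEL_UNPACK_BUFFER) instead of Qt's QGLBuffer wrapper. All names, sizes and the GLEW loader are illustrative assumptions, and it presumes a context with pixel-buffer-object support is already current.
#include <cstring>
#include <opencv2/opencv.hpp>
#include <GL/glew.h>   // any loader that exposes the PBO entry points will do

GLuint tex = 0, pbo = 0;
const int texWidth = 640, texHeight = 480;            // placeholder capture size
const size_t frameBytes = texWidth * texHeight * 3;   // 8-bit BGR

void initGLResources()
{
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    // Allocate storage once; the driver converts BGR -> RGBA on upload.
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, texWidth, texHeight, 0, GL_BGR, GL_UNSIGNED_BYTE, NULL);

    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, frameBytes, NULL, GL_STREAM_DRAW);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}

void uploadFrame(const cv::Mat &frame)   // expects a texWidth x texHeight CV_8UC3 (BGR) frame
{
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    void *dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    if (dst) {
        // Alternatively wrap 'dst' in a cv::Mat and let OpenCV write into it directly.
        std::memcpy(dst, frame.data, frameBytes);
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    }
    // Copy from the PBO (the data argument is an offset, not a pointer) into the texture.
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, texWidth, texHeight, GL_BGR, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}
The textured quad is then drawn as usual; glPixelZoom never enters the picture because the scaling is done by the texture mapping.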

Related

OpenGLES2: How to load and access a big float array

I have a large WxH float array:
float floatArray[W][H];
I want to access it in a fragment shader and I need to load/access it through a texture due to its size:
vec4 v4 = texture2D(tex, v_texCoord);
//Getting v4.x as floatArray[v_texCoord.x * W][v_texCoord.y * H]
I load the texture like this:
GLuint texturenames[1];
glGenTextures(1, texturenames);
glActiveTexture(GL_TEXTURE0 + texturenames[0]);
glBindTexture(GL_TEXTURE_2D, texturenames[0]);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE, w, h, 0, GL_LUMINANCE, GL_FLOAT, floatArray);
glUniform1i(glGetUniformLocation(program_, "tex"), texturenames[0]);
I don't get the right values. Note that the third (internalformat) and seventh (format) parameters of glTexImage2D are GL_LUMINANCE.
void glTexImage2D(GLenum target,
GLint level,
GLint internalformat,
GLsizei width,
GLsizei height,
GLint border,
GLenum format,
GLenum type,
const GLvoid * data);
How can I load and access a big float array in OpenGLES2?
Short answer - you can't. OpenGL ES 2.0 doesn't support floating point texturing.
Given you only want a single channel, perhaps you could encode it in an RGBA unorm texture and recover the value algorithmically in the shader, but it sounds horribly expensive on a mobile GPU.
OpenGL ES 3.0 does support float texturing, so that might provide more luck.
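If you do go down the encode-in-RGBA8 route, a sketch of one possible packing scheme (base-255 fixed point for values in [0, 1); all names are illustrative and not from any library) could look like this:
#include <cmath>
#include <vector>

// Pack each float in [0, 1) into 4 bytes of an RGBA8 texture (base-255 fixed point).
// Upload the result with glTexImage2D(..., GL_RGBA, W, H, 0, GL_RGBA, GL_UNSIGNED_BYTE, data).
std::vector<unsigned char> packFloats(const float *src, int count)
{
    std::vector<unsigned char> out(count * 4);
    for (int i = 0; i < count; ++i) {
        float e = src[i];                 // must already be scaled into [0, 1)
        for (int c = 0; c < 4; ++c) {
            e *= 255.0f;
            float digit = std::floor(e);  // next base-255 "digit"
            out[i * 4 + c] = (unsigned char)digit;
            e -= digit;
        }
    }
    return out;
}

// Matching decode in the GLSL ES 2.0 fragment shader:
const char *kUnpackGLSL =
    "float unpackFloat(vec4 rgba) {\n"
    "    return dot(rgba, vec4(1.0, 1.0/255.0, 1.0/65025.0, 1.0/16581375.0));\n"
    "}\n";
Values outside [0, 1) need a scale and offset on both sides, and as noted above the extra arithmetic is not free on a mobile GPU.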

Image rotation using OpenGL ES

I'm working with OpenGL ES 2.0 on an OMAP3530 development board running Windows CE 7.
My task is to load a 24-bit image file, rotate it by an angle about the z-axis, and export the resulting image (buffer).
For this task I've created an FBO for off-screen rendering, loaded the image file as a texture with glTexImage2D(), applied the texture to a quad, rotated the quad using the PVRTMat4::RotationZ() API, and read the result back with glReadPixels(). Since it is a single-frame process, the loop runs only once.
Here are the problems I'm facing now.
1) The API calls take a different amount of time on every run, i.e. each time I run the application I get different processing times for the same calls.
2) glDrawArrays() is taking too much time (~50 ms - 80 ms).
3) glReadPixels() is also taking too much time, ~95 ms for an 800x600 image.
4) Loading a 32-bit image is much faster than a 24-bit image, so a conversion step is needed.
If anybody has faced or solved a similar problem, please suggest a fix.
Here is the Code snippet of my Application.
void BindTexture(){
glGenTextures(1, &m_uiTexture);
glBindTexture(GL_TEXTURE_2D, m_uiTexture);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, ImageWidth, ImageHeight, 0, GL_RGBA, GL_UNSIGNED_BYTE, pTexData);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,GL_LINEAR );
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
}
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, TCHAR *lpCmdLine, int nCmdShow)
{
// Fragment and vertex shaders code
char* pszFragShader = "Same as in RenderToTexture sample";
char* pszVertShader = "Same as in RenderToTexture sample";
CreateWindow(ImageWidth, ImageHeight); // for this I've referred to the OGLES2HelloTriangle_Windows.cpp example
LoadImageBuffers();
BindTexture();
// Generate & bind the frame buffer and render buffer
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, m_auiFbo, 0);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, ImageWidth, ImageHeight);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, m_auiDepthBuffer);
BindTexture();
GLfloat Angle = 0.02f;
GLfloat afVertices[] = {Vertices to Draw a QUAD};
glGenBuffers(1, &ui32Vbo);
LoadVBOs(); // APIs to load the VBOs
// Draws the quad for 1 frame
while(g_bDemoDone==false)
{
glBindFramebuffer(GL_FRAMEBUFFER, m_auiFbo);
glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);
PVRTMat4 mRot,mTrans, mMVP;
mTrans = PVRTMat4::Translation(0,0,0);
mRot = PVRTMat4::RotationZ(Angle);
glBindBuffer(GL_ARRAY_BUFFER, ui32Vbo);
glDisable(GL_CULL_FACE);
int i32Location = glGetUniformLocation(uiProgramObject, "myPMVMatrix");
mMVP = mTrans * mRot ;
glUniformMatrix4fv(i32Location, 1, GL_FALSE, mMVP.ptr());
// Pass the vertex data
glEnableVertexAttribArray(VERTEX_ARRAY);
glVertexAttribPointer(VERTEX_ARRAY, 3, GL_FLOAT, GL_FALSE, m_ui32VertexStride, 0);
// Pass the texture coordinates data
glEnableVertexAttribArray(TEXCOORD_ARRAY);
glVertexAttribPointer(TEXCOORD_ARRAY, 2, GL_FLOAT, GL_FALSE, m_ui32VertexStride, (void*) (3 * sizeof(GLfloat)));
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);//
glReadPixels(0,0,ImageWidth ,ImageHeight,GL_RGBA,GL_UNSIGNED_BYTE,pOutTexData) ;
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindFramebuffer(GL_FRAMEBUFFER, 0);
eglSwapBuffers(eglDisplay, eglSurface);
}
DeInitAll();
The PowerVR architecture cannot render a single frame and let the ARM read it back quickly. It is just not designed to work that way - it is a deferred-rendering, tile-based architecture. The execution times you are seeing are to be expected, and using an FBO is not going to make it faster either. Also, beware that the OpenGL ES drivers on OMAP for Windows CE are of really poor quality. Consider yourself lucky if they work at all.
A better design would be to display the OpenGL ES rendering directly to the DSS and avoid using glReadPixels() and the FBO completely.
I got improved performance for rotating an image buffer by using multiple FBOs & PBOs.
Here is a pseudocode snippet of my application.
InitGL()
GenerateShaders();
Generate3Textures();//Generate 3 Null Textures
Generate3FBO();//Generate 3 FBO & Attach each Texture to 1 FBO.
Generate3PBO();//Generate 3 PBO & to readback from FBO.
DrawGL()
{
BindFBO1;
BindTexture1;
UploadtoTexture1;
Do Some Processing & Draw it in FBO1;
BindFBO2;
BindTexture2;
UploadtoTexture2;
Do Some Processing & Draw it in FBO2;
BindFBO3;
BindTexture3;
UploadtoTexture3;
Do Some Processing & Draw it in FBO3;
BindFBO1;
ReadPixelfromFBO1;
UnpackToPBO1;
BindFBO2;
ReadPixelfromFBO2;
UnpackToPBO2;
BindFBO3;
ReadPixelfromFBO3;
UnpackToPBO3;
}
DeinitGL();
DeallocateALL();
This way I achieved a 50% increase in overall processing performance.
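For anyone trying the same approach, here is a rough sketch of the round-robin readback in plain OpenGL (names, sizes and the ring depth are made up); note that GL_PIXEL_PACK_BUFFER and glMapBufferRange are not in core OpenGL ES 2.0, so this assumes ES 3.0 or a vendor extension that provides them.
#include <string.h>

// N-buffered asynchronous readback: frame i is read into PBO (i % N), but the
// buffer is mapped (and copied to the CPU) only after the GPU has had N-1
// frames to finish the transfer, so glReadPixels does not stall.
#define NUM_BUFFERS 3
GLuint fbo[NUM_BUFFERS], pbo[NUM_BUFFERS];
const int W = 800, H = 600;

void initReadback(void)
{
    glGenBuffers(NUM_BUFFERS, pbo);
    for (int i = 0; i < NUM_BUFFERS; ++i) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
        glBufferData(GL_PIXEL_PACK_BUFFER, W * H * 4, NULL, GL_DYNAMIC_READ);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

void processFrame(int frame, unsigned char *cpuCopy)
{
    int cur    = frame % NUM_BUFFERS;
    int oldest = (frame + 1) % NUM_BUFFERS;   // filled NUM_BUFFERS-1 frames ago

    glBindFramebuffer(GL_FRAMEBUFFER, fbo[cur]);
    // ... draw the rotated quad into fbo[cur] ...

    // Kick off the readback into the bound pack buffer; this returns immediately.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[cur]);
    glReadPixels(0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, 0);

    // Map the oldest buffer in the ring; its transfer should have completed by now.
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[oldest]);
    void *src = glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, W * H * 4, GL_MAP_READ_BIT);
    if (src) {
        memcpy(cpuCopy, src, W * H * 4);
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}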

Grayscale texture rendered with color in OpenGL

I'm writing a Cocoa screensaver with a simple OpenGL scene, nothing special. I have a bunch of RGB GIFs with patterns, and all of them work great except one.
What I see on the screensaver preview (rendering single quad with a texture on it):
Texture itself (scaled accordingly):
Some code:
Tex loading:
NSBitmapImageRep *bitmap = [NSBitmapImageRep imageRepWithData:[texImg TIFFRepresentation]];
if(bitmap) {
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, (GLsizei)[texImg size].width,
(GLsizei)[texImg size].height, 0, GL_RGB, GL_UNSIGNED_BYTE,
[bitmap bitmapData]) ;
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
}
GL init:
glEnable(GL_TEXTURE_2D);
glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_FASTEST);
Seems like it could be a pixel unpack alignment issue.
Can you try setting this line before creating the texture?
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
You can read about what this does here:
http://www.khronos.org/opengles/sdk/docs/man/xhtml/glPixelStorei.xml
and
http://www.opengl.org/archives/resources/features/KilgardTechniques/oglpitfall/ (read section 8)
But basically, by default OpenGL expects each row of pixels to start on a 4-byte boundary, which isn't always the case when using 1/2/3 bytes per pixel.
I'd also recommend setting it back to 4 (the default) after the texture is created.
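Putting both suggestions together, a minimal sketch (with width, height and pixels standing in for the NSBitmapImageRep data from the question) would be:
// Rows of a 3-bytes-per-pixel RGB image are usually tightly packed,
// so drop the default 4-byte row alignment before the upload.
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0,
             GL_RGB, GL_UNSIGNED_BYTE, pixels);
glPixelStorei(GL_UNPACK_ALIGNMENT, 4);  // restore the default afterwards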

White textures with OpenGL and DevIL on some Windows systems

I'm having some problems trying to load images with DevIL and create textures in OpenGL. When a friend of mine tested my program on his Windows machine, all the rects that were supposed to contain textures were white, without any image. The problem seems to occur on Windows XP, Windows Vista and Windows 7, but not on every Windows PC: my own Windows XP machine runs the program without problems.
Maybe there are some missing DLLs or files (improbable), or something that prevents the image from being loaded or used as a texture. On the other hand, the program runs fine on *UNIX systems.
This is the code I'm using to load an image and generate a texture:
void Image::load(const char* filename)
{
ILuint ilimg;
ilGenImages(1, &ilimg);
ilBindImage(ilimg);
if (!ilLoadImage(filename))
throw ImageLoadError;
glGenTextures(1, &image);
glBindTexture(GL_TEXTURE_2D, image);
bpp = ilGetInteger(IL_IMAGE_BPP);
width = ilGetInteger(IL_IMAGE_WIDTH);
height = ilGetInteger(IL_IMAGE_HEIGHT);
format = ilGetInteger(IL_IMAGE_FORMAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_2D, 0, bpp, width, height, 0, format,
GL_UNSIGNED_BYTE, ilGetData());
ilDeleteImages(1, &ilimg);
}
This is the code that draws a rect with the texture applied:
void Rect::show()
{
glPushAttrib(GL_CURRENT_BIT);
glEnable(GL_TEXTURE_2D);
glColor4f(1.0, 1.0, 1.0, opacity);
glBindTexture(GL_TEXTURE_2D, texture->get_image());
glBegin(GL_POLYGON);
glTexCoord2i(0, 0); glVertex2f(x, y);
glTexCoord2i(1, 0); glVertex2f(x+width, y);
glTexCoord2i(1, 1); glVertex2f(x+width, y+height);
glTexCoord2i(0, 1); glVertex2f(x, y+height);
glEnd();
glDisable(GL_TEXTURE_2D);
glPopAttrib();
}
If you need some other code that I haven't mentioned, ask me and I'll post it.
glTexImage2D(target, level, internal format, width, height, border, format, type, data);
The internal format parameter of glTexImage is not bits (or bytes) per pixel. In fact its numerical value is not related to the format at all. OpenGL defines only specific values to be valid, among them four with a numeric relation, but only as a mnemonic: 1 (shorthand for GL_LUMINANCE), 2 (GL_LUMINANCE_ALPHA), 3 (GL_RGB), 4 (GL_RGBA). There are also a number of other format tokens, but their numeric values are chosen arbitrarily.
The bpp of the image data should instead be used to derive the internal format, format and type parameters. You'll need to implement a LUT or a switch-case structure for that, as sketched below.
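For example, a small switch like this (a sketch, not tied to any particular DevIL version; bpp is the bytes-per-pixel value from ilGetInteger(IL_IMAGE_BPP)) derives sensible parameters instead of passing bpp directly:
// Map the DevIL image description to proper glTexImage2D parameters.
GLint  internalFormat;
GLenum pixelFormat;
switch (bpp) {
    case 1: internalFormat = GL_LUMINANCE;       pixelFormat = GL_LUMINANCE;       break;
    case 2: internalFormat = GL_LUMINANCE_ALPHA; pixelFormat = GL_LUMINANCE_ALPHA; break;
    case 3: internalFormat = GL_RGB;             pixelFormat = GL_RGB;             break;
    case 4: internalFormat = GL_RGBA;            pixelFormat = GL_RGBA;            break;
    default: throw ImageLoadError;               // unexpected pixel layout
}
// DevIL may also hand you BGR(A) data; either convert it first with
// ilConvertImage(IL_RGBA, IL_UNSIGNED_BYTE) or use GL_BGR/GL_BGRA as pixelFormat.
glTexImage2D(GL_TEXTURE_2D, 0, internalFormat, width, height, 0,
             pixelFormat, GL_UNSIGNED_BYTE, ilGetData());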
The issue you are experiencing sounds like an implementation version issue. Probably a "correct" but less "nice" implementation isn't letting you get away with something subtly incorrect. Or on the other hand the implementation may have a subtle bug and isn't accepting your valid code.
I might try swapping
glColor4f(1.0, 1.0, 1.0, opacity);
glBindTexture(GL_TEXTURE_2D, texture->get_image());
to
glBindTexture(GL_TEXTURE_2D, texture->get_image());
glColor4f(1.0, 1.0, 1.0, opacity);
and declaring the vertex before the texcoord, like this:
glVertex2f(x, y);
glTexCoord2i(0, 0);
Short of that, if you do not already have "The Red Book", consult http://glprogramming.com/red/chapter09.html

glDrawArrays() slow on iPad?

I was wondering how to speed up my iPad application using OpenGL ES 2.0. At the moment we have every drawable object draw itself with a call to glDrawArrays(). Blending is on, and we really need it. Without disabling blending, how would we improve performance for this app?
For instance, if we draw just 3 textures (1024x1024, 256x512, 256x512) across the whole screen, the app only gets 15 FPS, which I think is really slow. Are we doing something terribly wrong? Our drawing code (for each drawable) is as follows:
- (void) draw {
GLuint textureAvailable = 0;
if(texture != nil){
textureAvailable = 1;
}
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, texture.name);
glVertexAttribPointer(ATTRIB_VERTEX, 2, GL_FLOAT, 0, 0, vertices);
glEnableVertexAttribArray(ATTRIB_VERTEX);
glVertexAttribPointer(ATTRIB_COLOR, 4, GL_FLOAT, 1, 0, colorsWithMultipliedAlpha);
glEnableVertexAttribArray(ATTRIB_COLOR);
glVertexAttribPointer(ATTRIB_TEXTUREMAP, 2, GL_FLOAT, 1, 0, textureMapping);
glEnableVertexAttribArray(ATTRIB_TEXTUREMAP);
//Note that we are NOT using position.z here because that is only used to determine drawing order
int *jnUniforms = JNOpenGLConstants::getInstance().uniforms;
glUniform4f(jnUniforms[UNIFORM_TRANSLATE], position.x, position.y, 0.0, 0.0);
glUniform4f(jnUniforms[UNIFORM_SCALE], scale.x, scale.y, 1.0, 1.0);
glUniform1f(jnUniforms[UNIFORM_ROTATION], rotation);
glUniform1i(jnUniforms[UNIFORM_TEXTURE_SAMPLE], 0);
glUniform2f(jnUniforms[UNIFORM_TEXTURE_REPEAT], textureRepeat.x, textureRepeat.y);
glUniform1i(jnUniforms[UNIFORM_TEXTURE_AVAILABLE], textureAvailable);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
Possible optimizations I think won't work:
Drawing geometry in batches
I'm only drawing 3 items and the FPS is 15; I don't think batching the geometry would help here, because it's such a small number of draw calls that it doesn't matter if we kill 2/3 of them.
Texture Atlas
Again, we're only drawing 3 textures. What I do wonder is whether it would matter (a lot) if we converted these to PVR. I haven't looked into it, but I must admit we're loading big PNGs at the moment. Is there any way to see whether this is indeed the problem, or is it easier to just try it out?
But please tell me if I'm wrong, I'm happy to hear any ideas.
Proposed solutions
Mipmapped textures
Loading mipmapped textures, doing it like this:
- (id) initWithUIImage: (UIImage * const) image {
glGenTextures(1, &name);
//JNLogString(@"Received name(%d), binding texture", name);
glBindTexture(GL_TEXTURE_2D, name);
//Set the needed parameters for the texture
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
//Load the image data into the texture
glGenerateMipmap(GL_TEXTURE_2D);
return self;
}
This doesn't seem to do anything for our FPS; I think that's because our textures are already roughly at the size they are rendered at on screen, in most cases even 1:1.
Other solutions are welcome! I will try them out and post the results here.
If you are using very large textures, try creating mipmapped textures. The cost is basically an extra 1/3 of the original texture memory. I think they can be created with this call when setting up the textures:
glTexParameteri(GL_TEXTURE_2D, GL_GENERATE_MIPMAP, GL_TRUE);
Some calculations: if you have 3 textures of 2048x2048 (the maximum size) at 15 Hz, you get a texel throughput (if they are fully shown, i.e. downscaled to screen resolution) of 2048x2048x3x15 = 188,743,680 texels/sec, which is around the value glbenchmark.com reports for the single-texture fill rate (173 Mtexel/sec). But if you are using mipmapped textures, the texel throughput should be closer to the screen resolution (1024x768), which should be something like 1/4 of the previous throughput.
I had a branch in my fragment shader. I thought that didn't put a lot of strain on it, but it did! Anyhow, that was the whole problem: I removed the branch and now my FPS has almost doubled.
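For completeness, since the original fragment shader isn't shown: one common way to remove such a branch is to blend between the two paths with mix() instead of an if. A rough sketch (uniform and varying names are made up to roughly match the attribute/uniform list above):
// Branchless variant of a "texture optional" fragment shader (GLSL ES 2.0),
// stored as a C string; the flag is assumed to arrive via glUniform1i as 0 or 1.
static const char *kFragmentShader =
    "precision mediump float;\n"
    "varying vec2 v_texCoord;\n"
    "varying vec4 v_color;\n"
    "uniform sampler2D u_texture;\n"
    "uniform int u_textureAvailable; // 0 = untextured, 1 = textured\n"
    "void main() {\n"
    "    vec4 texel = texture2D(u_texture, v_texCoord);\n"
    "    // mix() instead of 'if (u_textureAvailable == 1)': no divergent branch.\n"
    "    gl_FragColor = v_color * mix(vec4(1.0), texel, float(u_textureAvailable));\n"
    "}\n";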
