How to speed up object movement in OpenGL without losing movement smoothness - performance

Here's the function which is registered as display function in glutDisplayFunc()
int i = 0;  // x offset of the point, advanced once per displayed frame

void RenderFunction(void)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glPointSize(5);

    glBegin(GL_POINTS);
    glVertex2i(0 + i, 0);
    glEnd();

    glutSwapBuffers();
    i += 1;               // move one pixel per frame
    glutPostRedisplay();  // ask GLUT to redraw immediately
}
This way the point moves across the screen but its speed is really slow.
I can speed it up by incrementing i by a number greater than 1, but then the motion doesn't look smooth. How do I achieve higher speed?
I used to work with SFML, which is built on top of OpenGL, and there an object moved really fast with the move() method. So there has to be a way in OpenGL too.

In this case, there's probably not a lot you can do other than moving your point further each time you redraw. In particular, most performance improvements won't have any significant effect on the perceived speed here.
The problem is fairly simple: you're changing the location by one pixel at a time. Chances are pretty good that you have screen updating "locked" so it happens in conjunction with the monitor's refresh.
Assuming that's the case, with a typical monitor that refreshes at 60 Hz, you're going to get a fixed rate of your point moving 60 pixels per second. If you improve the code's efficiency, the movement speed won't change--it'll just reduce the amount of CPU time you're using to move the dot at the same speed.
Essentially your only choice for moving it faster is to move it more than one pixel per screen refresh. One pixel per screen refresh means 60 pixels per second, so (for example) moving across a typical HD screen (1920 dots horizontally) will take 1920 pixels / 60 pixels per second = 32 seconds.
With really slow code, you might use 1% of the CPU to do that. With faster code, that might drop to some unmeasurably small amount, but either way it's going to travel at the same speed, so it'll take 32 seconds to get across the screen.
If you wanted to, you could unlock your screen updates from the screen refresh. Since you're only drawing one point, you can probably update the screen at a few thousand frames per second, so the dot would appear to move across the screen a lot faster.
In reality, however, the monitor is still only refreshing the screen at 60 Hz. When your code updates faster than that, it just means you'll produce a number of intermediate updates that never show up on the screen. As far as the pictures actually showing up on the screen go, you're just moving the point more than one pixel per screen refresh. The fact that you updated data in memory for all the intermediate points doesn't really change anything. The visible result is essentially the same as if you redrew once per screen refresh, and moved the point a lot further each time.
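If the goal is a speed that's independent of the refresh rate, the usual approach is to scale the per-frame step by the elapsed time instead of moving a fixed pixel count. Here's a minimal sketch using GLUT's millisecond timer; the 300 pixels-per-second speed is just an illustrative value:
int   lastMs = 0;
float x = 0.0f;
const float speed = 300.0f;                       /* pixels per second (illustrative) */

void RenderFunction(void)
{
    int nowMs = glutGet(GLUT_ELAPSED_TIME);       /* milliseconds since glutInit */
    float dt = (nowMs - lastMs) / 1000.0f;        /* seconds since the last frame */
    lastMs = nowMs;

    x += speed * dt;                              /* larger step when frames take longer */

    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glPointSize(5);
    glBegin(GL_POINTS);
    glVertex2i((int)x, 0);
    glEnd();

    glutSwapBuffers();
    glutPostRedisplay();
}
With vsync on this still redraws once per refresh, but the point now covers the same distance per second whether the display runs at 60 Hz or 144 Hz.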

Define the modelview matrix with i as the translation component, then apply that matrix to the vertex you are drawing. But yes, as others are saying, try to move to modern OpenGL.
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();                       // reset first so the translation doesn't accumulate each frame
glTranslatef((GLfloat)i, 0.0f, 0.0f);   // shift the point by i along x

Related

OpenGL ES: Is it more efficient to use glClear or use glDrawArrays with primitives that render over used frames?

For example, if I have several figures rendered over a black background, is it more efficient to call glClear(GL_COLOR_BUFFER_BIT) each frame, or to render black triangles over the artifacts from the past frame? I would think that rendering black triangles over the artifacts would be faster, since fewer pixels need to be changed than when clearing the entire screen. However, I have also read online that some drivers and graphics hardware perform optimizations when using glClear(GL_COLOR_BUFFER_BIT) that cannot occur when rendering black triangles instead.
A couple of things you've neglected that will probably change how you look at this:
Painting over the top with triangles doesn't play well with depth testing. So you have to disable testing first.
Painting over the top with triangles has to be strictly ordered with respect to the previous result, while glClear breaks data dependencies, allowing out-of-order execution to start drawing the next frame as long as there's enough memory for both framebuffers to exist independently.
Double-buffering, to avoid tearing effects, uses multiple framebuffers anyway. So to use the result of the prior frame as a starting point for the new one, not only do you have to wait for the prior frame to finish rendering, you also have to copy its framebuffer into the buffer for the new frame. Which requires touching every pixel.
In a single buffering scenario, drawing over the top of the previous frame requires not only finishing rendering, but finishing streaming it to the display device. Which leaves you only the blanking interval to render the entire new frame. That almost certainly will prevent achieving maximum frame rate.
Generally, clearing touches every pixel just as copying does, but it doesn't require flushing the pipeline, so it's much better to start each frame with glClear.
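To make the pattern concrete, here's a minimal per-frame sketch of the clear-first approach; drawScene() is a hypothetical placeholder for the frame's draw calls, and the EGL swap call stands in for whatever presentation layer you use:
#include <EGL/egl.h>
#include <GLES2/gl2.h>

void drawScene(void);   /* hypothetical: issues this frame's draw calls */

void renderFrame(EGLDisplay display, EGLSurface surface)
{
    /* Clearing has no data dependency on the previous frame's contents,
       so the GPU can start on this frame as soon as a back buffer is free. */
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

    drawScene();

    eglSwapBuffers(display, surface);   /* double-buffered present */
}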

Kinect 2 - Significant delay on hand motion

I'm using the Kinect 2 to rotate and zoom a virtual camera looking at a 3D object by moving my hand in all three directions. The problem I'm currently tackling is that these operations are executed with a noticeable delay. When my hand is steady again, the camera still continues to move for a short time. It feels as if I'm pushing the camera rather than controlling it in real time. Perhaps the frame rate is a problem. As far as I know the Kinect runs at 30 FPS while my application runs at 60 FPS (VSync enabled).
What could be the cause for this issue? How can I control my camera without any significant delay?
The Kinect is a very graphics- and processing-intensive piece of hardware. For your application I'd suggest a minimum specification of a GTX 960 and a 4th-generation i7 processor. Your hardware will be a prime factor in how fast you can process Kinect data.
You're also going to want to avoid loops as much as possible and rely on multithreading instead, and if you do loop, make sure there are no foreach loops, as they take longer to execute. It is very important that the code reading data from the Kinect and the code issuing the camera commands run asynchronously.
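A minimal sketch of that decoupling, assuming a hypothetical readLatestHandPosition() wrapper around the Kinect SDK: one thread polls the sensor at its own ~30 Hz pace and only ever publishes the newest sample, while the 60 FPS render loop reads whatever is current and never blocks on the sensor.
#include <atomic>
#include <mutex>
#include <thread>

struct Hand { float x = 0, y = 0, z = 0; };

Hand              g_latestHand;              // most recent sample from the sensor
std::mutex        g_handMutex;
std::atomic<bool> g_running{ true };

Hand readLatestHandPosition();               // hypothetical blocking wrapper around the Kinect SDK

void kinectReaderThread()
{
    while (g_running) {
        Hand h = readLatestHandPosition();   // ~30 Hz, blocks on the sensor
        std::lock_guard<std::mutex> lock(g_handMutex);
        g_latestHand = h;                    // overwrite; old samples are simply dropped
    }
}

Hand currentHand()                           // called from the 60 FPS render loop
{
    std::lock_guard<std::mutex> lock(g_handMutex);
    return g_latestHand;
}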
The Kinect will never be real time responsive. There is just too much data that it is handling, the best you can do is optimize your code and increase your hardware power to shrink the response time.

SFML Game initially slow, yet speeds up permanently when very little is drawn

So I have established the base for an SFML game using a tilemap. For the most part, I've optimized it so it can run at a good 60 fps. However, I can only get this 60 fps if at some point the map is halfway off the screen, so less of it is being rendered. That would make sense, since less being drawn means it runs faster, but once the fps increases it stays there permanently, even if I then fill the entire screen with the map again. I can't understand this irregularity: I either have to start with the map slightly offset, or move the map offscreen for a moment, to get a solid fps. Clearly there isn't a problem with my computer's ability to render at this fps, as it stays there once it starts, but I can't understand why the map has to be offscreen momentarily for it to achieve this speed.

Effect like auto-cast spell icon in war3 for cocos2d

I would like to implement the effect of a short line segment revolving around a square (I don't know the exact effect name; it's like the one in war3 that indicates an auto-cast spell, or in fishingjoy that indicates the equipped weapon). Any advice or hints are welcome. Thanks!
You have several variants. The first and easiest is to create a frame animation of the desired effect and run it on an empty CCSprite instance placed over your weapon icon. I think 5 or 6 frames of animation will be enough. The big plus is that you can create any desired effect in these frames in Photoshop, and it is easy to add existing frames as an animation to your project. The minus is that it takes additional space in your texture cache and sprite-frame cache, and it increases the size of your app. This is a good solution if your square is quite small, because if your square has a large contentSize it will waste a lot of memory. For example, 6 frames of such an animation at full screen size (640x960 pixels on a retina screen) will take roughly an additional 16 MB of memory.
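Here's a minimal sketch of that first variant using the cocos2d-x 2.x action API; the frame filenames and the weaponIcon node are illustrative:
// Run a short looping frame animation on an empty sprite placed over the icon.
CCSprite* overlay = CCSprite::create("autocast_0.png");      // first frame (illustrative name)
overlay->setPosition(weaponIcon->getPosition());              // sit on top of the weapon icon
this->addChild(overlay, weaponIcon->getZOrder() + 1);

CCAnimation* anim = CCAnimation::create();
for (int f = 0; f < 6; ++f) {
    char name[32];
    snprintf(name, sizeof(name), "autocast_%d.png", f);       // autocast_0.png ... autocast_5.png
    anim->addSpriteFrameWithFileName(name);
}
anim->setDelayPerUnit(0.1f);                                   // 10 animation frames per second

overlay->runAction(CCRepeatForever::create(CCAnimate::create(anim)));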
The second variant, IMHO, much more interesting)) And it can help to save memory) This variant is to implement this animation with OpenGL) But it seems to be much more complicated)

graphics: best performance with floating point accumulation images

I need to speed up some particle system eye candy I'm working on. The eye candy involves additive blending, accumulation, and trails and glow on the particles. At the moment I'm rendering by hand into a floating point image buffer, converting to unsigned chars at the last minute then uploading to an OpenGL texture. To simulate glow I'm rendering the same texture multiple times at different resolutions and different offsets. This is proving to be too slow, so I'm looking at changing something. The problem is, my dev hardware is an Intel GMA950, but the target machine has an Nvidia GeForce 8800, so it is difficult to profile OpenGL stuff at this stage.
I did some very unscientific profiling and found that most of the slow down is coming from dealing with the float image: scaling all the pixels by a constant to fade them out, and converting the float image to unsigned chars and uploading to the graphics hardware. So, I'm looking at the following options for optimization:
Replace floats with uint32's in a fixed point 16.16 configuration
Optimize float operations using SSE2 assembly (image buffer is a 1024*768*3 array of floats)
Use OpenGL Accumulation Buffer instead of float array
Use OpenGL floating-point FBO's instead of float array
Use OpenGL pixel/vertex shaders
Have you any experience with any of these possibilities? Any thoughts, advice? Something else I haven't thought of?
The problem is simply the sheer amount of data you have to process.
Your float buffer is 9 megabytes in size (1024 * 768 * 3 floats * 4 bytes ≈ 9.4 MB), and you touch the data more than once. Most likely your rendering loop looks somewhat like this:
Clear the buffer
Render something on it (uses reads and writes)
Convert to unsigned bytes
Upload to OpenGL
That's a lot of data to move around, and the cache can't help you much because the image is much larger than the cache. Let's assume you touch every pixel five times. If so, you move 45 MB of data in and out of the slow main memory. 45 MB doesn't sound like much data, but consider that almost every memory access will be a cache miss. The CPU will spend most of its time waiting for the data to arrive.
If you want to stay on the CPU to do the rendering there's not much you can do. Some ideas:
Using SSE non-temporal loads and stores may help, but they will complicate your task quite a bit (you have to align your reads and writes).
Try breaking up your rendering into tiles, e.g. do everything on smaller rectangles (256*256 or so); see the sketch after this list. The idea behind this is that you actually get a benefit from the cache. After you've cleared a rectangle, for example, the entire tile will be in the cache, so rendering it and converting it to bytes will be a lot faster because there is no need to fetch the data from the relatively slow main memory anymore.
Last resort: Reduce the resolution of your particle effect. This will give you a good bang for the buck at the cost of visual quality.
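A minimal sketch of the tiling idea, assuming a row-major 1024*768 RGB float buffer and using the fade plus byte conversion as the example operations done per tile:
#include <cstddef>

const int W = 1024, H = 768, TILE = 256;

/* Process the image tile by tile: each 256x256 block is faded and converted to
   bytes while it is still cache-resident, instead of making two full passes
   over the whole 9 MB buffer. */
void fadeAndConvertTiled(float fade, float* img /* W*H*3 floats */, unsigned char* out /* W*H*3 bytes */)
{
    for (int ty = 0; ty < H; ty += TILE)
        for (int tx = 0; tx < W; tx += TILE) {
            const int yEnd = (ty + TILE < H) ? ty + TILE : H;
            const int xEnd = (tx + TILE < W) ? tx + TILE : W;
            for (int y = ty; y < yEnd; ++y) {
                const std::size_t base = (std::size_t(y) * W + tx) * 3;
                const std::size_t end  = base + std::size_t(xEnd - tx) * 3;
                for (std::size_t p = base; p < end; ++p) {
                    img[p] *= fade;                                       /* fade in place   */
                    float v = img[p] * 255.0f;
                    out[p] = (unsigned char)(v > 255.0f ? 255.0f : v);    /* clamp + convert */
                }
            }
        }
}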
The best solution is to move the rendering onto the graphics card. Render-to-texture functionality is standard these days. It's a bit tricky to get working with OpenGL because you have to decide which extension to use, but once you have it working, performance is no longer an issue.
Btw - do you really need floating point render targets? If you can get away with 3 bytes per pixel you will see a nice performance improvement.
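For the render-to-texture route, here's a minimal setup sketch using the core/ARB framebuffer object API with a half-float color attachment; the dimensions and the fallback comment are illustrative:
GLuint tex = 0, fbo = 0;
const int width = 1024, height = 768;

/* Color texture that will receive the particle rendering. */
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA16F, width, height, 0, GL_RGBA, GL_FLOAT, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

/* Framebuffer object that redirects rendering into that texture. */
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, tex, 0);
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
    /* fall back to an 8-bit format or a different extension */
}
glBindFramebuffer(GL_FRAMEBUFFER, 0);   /* back to the default framebuffer */

/* Per frame: bind the FBO, draw the particles, unbind, then draw 'tex' as a
   screen-sized quad (several times at offsets/scales for the glow). */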
It's best to move the rendering calculation for massive particle systems like this over to the GPU, which has hardware optimized to do exactly this job as fast as possible.
Aaron is right: represent each individual particle with a sprite. You can calculate the movement of the sprites in space (e.g., accumulate their positions per frame) on the CPU using SSE2, but do all the additive blending and accumulation on the GPU via OpenGL. (Drawing sprites additively is easy enough.) You can handle your trails and blur either by doing it in shaders (the "pro" way), by rendering to an accumulation buffer and back, or by simply generating a bunch of additional sprites on the CPU representing the trail and throwing them at the rasterizer.
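A minimal sketch of the additive-blending state for drawing those sprites with fixed-function OpenGL; particleTexture is an illustrative texture id and the per-particle quad drawing is left out:
/* Additive blending: each sprite adds its contribution, which is what gives
   overlapping particles their glow-like build-up. */
glEnable(GL_TEXTURE_2D);
glBindTexture(GL_TEXTURE_2D, particleTexture);   /* illustrative texture id */

glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE);               /* source * alpha is added onto the destination */
glDepthMask(GL_FALSE);                           /* don't write depth for transparent sprites */

/* ... draw one textured quad per particle here ... */

glDepthMask(GL_TRUE);
glDisable(GL_BLEND);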
Try to replace the manual code with sprites: an OpenGL texture with an alpha of, say, 10%. Then draw lots of them on the screen (ten of them in the same place to get the full glow).
If by "manual" you mean that you are using the CPU to poke pixels, then pretty much anything you can do that draws textured polygons using OpenGL instead will represent a huge speedup.
