Is there a performance benefit when the frame is not redrawn?

I'm working on a 2D OpenGL chess game, and I was wondering: is there a performance gain if I redraw only some of the squares, as opposed to the whole frame?

In theory, yes, but you shouldn't worry about it. In a 2D game that isn't graphics-heavy, like this one, the performance improvement will barely be noticeable (if it is noticeable at all). Performance may decrease if you have a lot of overdraw, i.e. drawing many transparent layers over each other.
You should first try redrawing the whole frame, and if it really does become slow, then think again about optimizing.
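If you later do want to restrict redraws, the scissor test is the usual mechanism; here is a minimal WebGL sketch (desktop GL has the same calls), where squareX/squareY/squareSize and drawSquare are hypothetical stand-ins for your own board code:

// Limit clearing and drawing to one board square instead of the whole frame.
gl.enable(gl.SCISSOR_TEST);
gl.scissor(squareX, squareY, squareSize, squareSize); // square bounds in pixels (assumed)
gl.clear(gl.COLOR_BUFFER_BIT);
drawSquare(squareX, squareY); // hypothetical: redraws a single square
gl.disable(gl.SCISSOR_TEST);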

Related

Three.js: What's the upper limit for holding 60 FPS on an average desktop?

I'm currently working on a game using Three.js. I've been studying software engineering for four years and have been working professionally on backends for two, but I've barely touched on graphics aside from some simple Unity experimenting.
I currently have ~22,000 vertices and ~8,000 faces according to renderstats.js, and my desktop (above average) can't run it above 20 FPS. I'm using Lambert material as well as a single ambient light, so I feel like this isn't too much to ask.
With these figures in mind, is this the expected behavior for three.js rendering?
I'm pretty sure that is not the end of the line, and you are probably missing some opportunities for massive performance improvements.
But just to give you some numbers first:
if you leave out everything fancy (including three.js) and just render an ultra-simple point cloud with one fragment per point, you can easily get to rendering 10-20 million (yes, million) points/vertices on an average GPU.
just with simple shapes and materials, I have already gotten three.js to render something in the range of 500k triangles (at 1080p resolution) at 60 FPS without problems. You can probably multiply those numbers by 10 for the latest high-end GPUs.
However, these kinds of numbers are not really helpful.
Some hints:
if you want to debug your rendering-performance, you should first add some metrics. Renderstats is good, but I'd recommend integrating http://spite.github.io/rstats/ for this (see the example).
generally the choice of material shouldn't matter too much; the GPU is far more capable than most people think, and the problem is more likely somewhere else in the pipeline. EDIT from comment: in some cases, like high-resolution displays driven by slow GPUs (think mobile devices), this is less true and complicated shader code can slow down your site, but it might be worth looking at the other points first. Since the rendering itself happens off-thread (so you can't measure its duration using regular tools like the devtools profiler), you can use the EXT_disjoint_timer_query extension to get some information about what is going on on the GPU.
the number of draw calls shouldn't be too high: three.js needs one draw call for every Mesh and Points object rendered in the scene, and too many objects are generally a far bigger problem than objects with lots of vertices. You can reduce the number of draw calls by merging multiple geometries into one and making use of multi-materials, vertex colors and things like that (see the merge sketch after this list).
if you are doing postprocessing, the GPU needs to render every pixel on screen several times, which can also massively limit your performance. This can be optimized by merging multiple postprocessing passes into one (admittedly, that'd be a lot of hard work..)
another problem could be on the JS side: use the profiler or timeline view in the Chrome devtools to see whether it's the JavaScript that is taking too much time per frame (it shouldn't be more than 8-12 ms per frame). I've been told there are ways to optimize JavaScript performance as well :)
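To illustrate the draw-call point above, here is a minimal three.js sketch of merging many static geometries into one mesh so they render in a single draw call. It assumes the BufferGeometryUtils helper from the three.js examples (the function is named mergeBufferGeometries in older releases) and an existing scene:

import * as THREE from 'three';
import { mergeGeometries } from 'three/addons/utils/BufferGeometryUtils.js';

// Bake each box's transform into its vertices so one merged mesh suffices.
const parts = [];
for (let i = 0; i < 1000; i++) {
  const g = new THREE.BoxGeometry(1, 1, 1);
  g.translate(Math.random() * 100, 0, Math.random() * 100);
  parts.push(g);
}

// One geometry + one material + one mesh = one draw call,
// instead of 1000 draw calls for 1000 separate meshes.
const merged = mergeGeometries(parts);
scene.add(new THREE.Mesh(merged, new THREE.MeshLambertMaterial()));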

OpenGL ES, Z-Buffer, 2D sprites, discard, performance

I have a retro-looking 2D game with a lot of sprites (reminiscent of Sega's Super Scaler arcades) which do not use semi-transparency. I have thought about using the Z-buffer instead of sorting to simplify things. OK, but by default writes go to the Z-buffer even when alpha is zero, giving the effect illustrated here:
http://i.stack.imgur.com/ubLlp.png
Now, since I'm on OpenGL ES 2, I don't have alpha testing, so from what I understand my only option is to discard the fragment in the fragment shader if alpha is 0, so that it doesn't get written to the Z-buffer. But in terms of performance this is SO wrong: not only is the if slow, but the discard basically defeats the purpose, since it disables early depth testing, and the result is far worse than sorting in software.
if (val.a < 0.5) {
    discard;
}
Is there any other solution I could use that would not kill performance? Do all 2D games sort sprites themselves rather than use the depth buffer?
It's a tradeoff, really. If you let the z-buffer do the sorting and use discard in your shaders, then it's more expensive on the GPU because of branching and late depth testing, as you say.
If you do the depth sorting yourself, you'll find it's harder to issue your draw calls in an optimal order (e.g. you'll keep having to change textures). Draw calls on GLES2 have a very significant CPU cost on lower-end devices, and the count will probably go up.
If performance is a big concern, then the second option is probably better, provided you combine it with a serious texture-atlasing effort to minimize your draw call count. This can be particularly effective if your sprites are low-resolution retro sprites, because you'll be able to fit a lot of sprites per texture atlas. It isn't a clear winner by any stretch, and I can imagine that different games take different approaches.
Also, you should take into account that the vast majority of target hardware will perform just fine whichever path you choose; maybe you should just pick the one that is faster to implement and makes your code simpler (which is probably letting the z-buffer do the sorting).
If you fancy a technical challenge, I've often thought the best approach might be to divide your sprites into fully opaque sections and sections with transparency, and render the two parts as separate meshes (they won't be quads any more). You'd have to do a lot of preprocessing and draw a lot more triangles, but by rendering the fully opaque parts first you can take advantage of the hidden-surface-removal hardware in all iOS devices and lots of Android devices. Doing this should reduce your fill rate burden, at the cost of increased draw calls, and it might add an unnecessarily high amount of complexity to your code and your tools.
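For the render-state side of that two-pass idea, here is a minimal WebGL sketch (WebGL mirrors OpenGL ES 2); drawOpaqueParts and drawTransparentParts are hypothetical stand-ins for your own draw code:

// Pass 1: fully opaque sprite parts, ideally front to back, with depth
// writes on. No discard in the shader, so early-z stays effective.
gl.enable(gl.DEPTH_TEST);
gl.depthMask(true);
gl.disable(gl.BLEND);
drawOpaqueParts(); // hypothetical: the opaque sub-meshes

// Pass 2: the transparent fringes, back to front, testing against the
// depth buffer but not writing to it.
gl.depthMask(false);
gl.enable(gl.BLEND);
gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
drawTransparentParts(); // hypothetical: the alpha'd sub-meshes

gl.depthMask(true); // restore depth writes for the next frame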

j2me: what is the effect of 'g.clipRect()' on speed?

I'm developing in J2ME and using a Canvas to draw some images.
Now, my question is: what is the difference in drawing speed between the code samples below?
Drawing after clipping the area rectangle:
g.clipRect(x, y, myImage.getWidth(), myImage.getHeight());
g.drawImage(myImage, x , y, Graphics.TOP | Graphics.LEFT);
g.setClip(0, 0, screenWidth, screenHeight);
Drawing without clipping:
g.drawImage(myImage, x, y, Graphics.TOP | Graphics.LEFT);
Is the first one faster? I'm drawing to the screen a lot.
Well, the direct answer to your question would be Mu, I'm afraid, because you appear to be approaching the issue from the wrong direction.
The thing is, the clipping API is not intended for performance optimizations. You can find full coverage of its purpose in the API documentation (available online); it does not say anything about performance impact:
Clipping
The clip is the set of pixels in the destination of the Graphics object that may be modified by graphics rendering operations.
There is a single clip per Graphics object. The only pixels modified by graphics operations are those that lie within the clip. Pixels outside the clip are not modified by any graphics operations.
Operations are provided for intersecting the current clip with a given rectangle and for setting the current clip outright...
Attempting to use the clipping API for imaginary performance gains will make your code a nightmare to understand for future maintainers. Note that this future maintainer may be you yourself, just a few weeks / months / years later. I, for one, have had my nose broken on my own code written some time ago without a clearly understandable intent; trust me, it hurts just as much as messing with poor code written by anyone else.
Don't get me wrong: there is a chance that clipping has a substantial performance impact in some particular case on some specific device; why not, everything is possible given the variety of MIDP implementations. You know what? There is even a chance of it having the opposite impact on some other device.
If (if) that happens, and if (if) you somehow get a clear, solid, tested and proven justification of a specific performance impact, then (then) go ahead: implement whatever tricks are necessary to reach the required performance, no matter how perverse they may be (BTDTGTTS). Until then, though, drop any baseless assumptions that just may come to your mind.
Until then... Just. Drop. It.
Developers love to optimize code and with good reason. It is so satisfying and fun. But knowing when to optimize is far more important. Unfortunately, developers generally have horrible intuition about where the performance problems in an application will actually be... Most performance tuning reminds me of the old joke about the guy who's looking for his keys in the kitchen even though he lost them in the street, because the light's better in the kitchen... (Brian Goetz)
This will almost certainly vary between platforms, and will depend on how much you're actually drawing.
I suggest you measure performance yourself by logging the number of paints per second, or the average duration of a paint method, and painting this on screen.
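The measurement itself is only a few lines in any environment; sketched here in JavaScript for consistency with the rest of this page (in J2ME you would use System.currentTimeMillis() inside paint() the same way):

// Logs paints per second and average paint duration once a second.
function makeFrameTimer() {
  let frames = 0;
  let windowStart = performance.now();
  return function tick() {
    frames++;
    const now = performance.now();
    if (now - windowStart >= 1000) {
      const avg = (now - windowStart) / frames;
      console.log(frames + ' paints/s, avg ' + avg.toFixed(1) + ' ms per paint');
      frames = 0;
      windowStart = now;
    }
  };
}

// Usage: const tick = makeFrameTimer(); call tick() at the end of each paint.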
Drawing without clipping should be faster on any platform, for the simple reason that you are not calling two extra clip methods. But I might ask: why are you using clipping to begin with?
You usually use clipping when you have an animation sprite sheet or several icon variations in the same file. In that case you can instead create a file for each frame/icon. That will increase your JAR file size and use more heap space to hold the images in memory, but they will be drawn faster.

OpenGL - Will using multiple VBO's slow down rendering?

I am rendering some meshes (sometimes upwards of 500), and I wanted to know the best way to approach this. Would it be pointless to create 500 VBOs and then render each one that passes the frustum and visibility tests? Is there a more efficient way to do this? I am looking to maximize performance.
To answer your question: yes, many VBOs will slow things down. More polys will usually slow down the render, but more draw calls have a much greater impact. You want to minimize state changes and draws, as well as the number of buffers you have (and memory use).
I would suggest first looking at the buffers and figuring out how many you actually need: see whether you can batch/instance geometry, merge static geometry into a single buffer, reuse buffers more efficiently, and so on.
Once you've cut the buffers down to the minimum possible, you'll want to use several kinds of culling. Visibility culling, both by frustum (perhaps in an octree) and by occlusion, can provide a significant performance boost. The main idea is to disqualify geometry as quickly and as cheaply as possible, so you start with rough tests (octree), then somewhat more detailed ones (perhaps an AABB and/or a simplified hull), then occlusion, and only then actually draw; a minimal version of the cheap test is sketched below.
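As an illustration, here is a sphere-vs-frustum check in JavaScript; frustumPlanes is assumed to be the six planes extracted from your view-projection matrix, each stored as an inward-pointing unit normal (nx, ny, nz) plus distance d:

// Rejects as soon as the bounding sphere lies fully outside one plane.
function sphereInFrustum(frustumPlanes, center, radius) {
  for (const p of frustumPlanes) {
    const dist = p.nx * center.x + p.ny * center.y + p.nz * center.z + p.d;
    if (dist < -radius) return false; // completely outside this plane
  }
  return true; // inside or intersecting the frustum
}

// Hypothetical usage: cull before issuing each draw call.
for (const mesh of meshes) {
  if (sphereInFrustum(frustumPlanes, mesh.boundsCenter, mesh.boundsRadius)) {
    drawMesh(mesh); // hypothetical draw routine
  }
}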
Here's a good article on frustum culling, which touches a bit on quadtrees (and by extension, octrees), with diagrams, explanations and some sample code.
OpenGL occlusion culling articles seem a bit less common, although this one from GPU Gems might be a good starting place.

Tackling alpha blend in OpenGL for better performance

Since blending is hurting our game's performance, we have tried several strategies for creating the "illusion" of blending. One of them is drawing a sprite only every odd frame, so the sprite is visible half of the time. The effect is quite good (you need a decent frame rate, by the way, otherwise the sprite will flicker noticeably).
Despite that, I would like to know if there are any good insights out there on avoiding blending in order to improve overall performance without compromising (too much of) the visual experience.
Is it the actual blending that's killing your performance? (i.e. video memory bandwidth)
What games commonly do these days to handle lots of alpha-blended stuff (think large explosions that cover the whole screen) is render it into a smaller texture (e.g. 2x2 or 4x4 smaller than the screen) and composite it back onto the main screen.
Doing that may require rendering the depth buffer of opaque surfaces into that smaller texture as well, to properly handle intersections with opaque geometry. On some platforms (consoles), multisampling or depth buffer trickery can make that a very cheap operation; no such luck on a regular PC, though.
See the GPU Gems 3 article High-Speed, Off-Screen Particles for an example. Christer Ericson's blog post Optimizing the rendering of a particle system surveys a lot of optimization approaches as well.
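A minimal three.js sketch of that half-resolution approach (particleScene, width and height are assumptions, and a real version would also need the depth handling described above):

// Half-resolution target for the blended particles.
const rt = new THREE.WebGLRenderTarget(width / 2, height / 2);

// Full-screen quad that composites the particle texture back over
// the main frame with ordinary alpha blending.
const compositeScene = new THREE.Scene();
compositeScene.add(new THREE.Mesh(
  new THREE.PlaneGeometry(2, 2),
  new THREE.MeshBasicMaterial({ map: rt.texture, transparent: true, depthTest: false })
));
const quadCam = new THREE.OrthographicCamera(-1, 1, 1, -1, 0, 1);

function render() {
  renderer.render(scene, camera); // opaque world at full resolution

  renderer.setRenderTarget(rt); // particles at half resolution
  renderer.setClearColor(0x000000, 0); // clear to transparent black
  renderer.clear();
  renderer.render(particleScene, camera);

  renderer.setRenderTarget(null); // upscale and composite
  renderer.autoClear = false;
  renderer.render(compositeScene, quadCam);
  renderer.autoClear = true;
}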
There's an excellent article about rendering particle systems quickly. It covers the smaller off-screen buffer technique and suggests quite a few other approaches. You can read it here.
It is not quite clear from your question what kind of blending is hitting your game's performance. Generally, blending itself is blazingly fast. If your problems are particle-system related, then what is most likely killing your framerate is the number and size of the particles drawn. Lots of close-up (and therefore large) particles in particular require high memory bandwidth and fill rate from the graphics card. I have implemented a particle system myself, and while I can render tons of particles in the distance, I feel the negative impact of e.g. flying through smoke (which fills the entire screen because the viewer is in the middle of it) very strongly on weaker hardware.
