This is a difficult question to search in Google since it has other meaning in finance.
Of course, what I mean here is "Drawing" as in .. computer graphics.. not money..
I am interested in preventing overdrawing for both 3D Drawing and 2D Drawing.
(should I make them into two different questions?)
I realize that this might be a very broad question since I didn't specify which technology to use. If it is too broad, maybe some hints on some resources I can read up will be okay.
EDIT:
What I mean by overdrawing is:
when you draw too many objects, rendering single frame will be very slow
when you draw more area than what you need, rendering a single frame will be very slow
It's quite complex topic.
First thing to consider is frustum culling. It will filter out objects that are not in camera’s field of view so you can just pass them on render stage.
The second thing is Z-sorting of objects that are in camera. It is better to render them from front to back so that near objects will write “near-value” to the depth buffer and far objects’ pixels will not be drawn since they will not pass depth test. This will save your GPU’s fill rate and pixel-shader work. Note however, if you have semitransparent objects in scene, they should be drawn first in back-to-front order to make alpha-blending possible.
Both things achievable if you use some kind of space partition such as Octree or Quadtree. Which is better depends on your game. Quadtree is better for big open spaces and Octree is better for in-door spaces with many levels.
And don't forget about simple back-face culling that can be enabled with single line in DirectX and OpenGL to prevent drawing of faces that are look at camera with theirs back-side.
Question is really too broad :o) Check out these "pointers" and ask more specifically.
Typical overdraw inhibitors are:
Z-buffer
Occlusion based techniques (various buffer techniques, HW occlusions, ...)
Stencil test
on little bit higher logic level:
culling (usually by view frustum)
scene organization techniques (usually trees or tiling)
rough drawing front to back (this is obviously supporting technique :o)
EDIT: added stencil test, has indeed interesting overdraw prevention uses especially in combination of 2d/3d.
Reduce the number of objects you consider for drawing based on distance, and on position (ie. reject those outside of the viewing frustrum).
Also consider using some sort of object-based occlusion system to allow large objects to obscure small ones. However this may not be worth it unless you have a lot of large objects with fairly regular shapes. You can pre-process potentially visible sets for static objects in some cases.
Your API will typically reject polygons that are not facing the viewpoint also, since you typically don't want to draw the rear-face.
When it comes to actual rendering time, it's often helpful to render opaque objects from front-to-back, so that the depth-buffer tests end up rejecting entire polygons. This works for 2D too, if you have depth-buffering turned on.
Remember that this is a performance optimisation problem. Most applications will not have a significant problem with overdraw. Use tools like Pix or NVIDIA PerfHUD to measure your problem before you spend resources on fixing it.
Related
I am trying to understand the whole 2D accelerated rendering process using SDL 2.0.
So my question is which would be the most efficient way to draw circles in the screen and why?
Some ways would be:
First to create a software surface and then draw the necessary pixels on that surface then create a texture out of that surface and lastly copy that texture to the rendering target.
Also another implementation would be to draw a circle using multiple times SDL_RenderDrawLine.And I think this is the way it is being implemented in SDL 2.0 gfx
Or there is a more efficient way to do all of this?
Take this question more generally in means of if I would wanted to draw other shapes manually, which probably, couldn't be rendered easily with the 2D rendering API that SDL provides(using draw line or rectangle).
With the example of circles this is a fairly complicated question, it is more based on the visual quality you wish to achieve which will drive performance. Drawing lots of short lines will vary vastly based on how close to a circle you wish to get, if you are happy to use say, 60 lines, which will work on small shapes nearly seamlessly but if scaled up will begin to appear not to be a circle, the performance will likely be better (depending on the user's hardware). Note also SDL_RenderDrawLines will be much much faster for many lines as it avoids lots of context switches for rendering calls.
However if you need a very accurate circle with thousands of lines to get a good approximation it will be faster to simply use a bitmap and scale and blit it. This will also give you a 'smoother' feel to the circle.
In my personal opinion I do not think the hardware accelerated render API has much use outside of some special uses such as graph rendering and perhaps very simple GUI drawing. For anything more complex I would usually use bitmap based drawing.
With regards to the second part, it again depends on the accuracy of any arcs you need to draw. If you can easily approximate the shape into a few tens of lines it will be fast, otherwise the pixel method is better.
I want to implement the "depth peeling" in webgl but the problem is that there is no occlusion query so I don't know how to check when the "peeling" of the scene is over.
Do you see an other way to do that?
The usual approach is to limit the peeling to a certain amount of steps. This is sometimes even better than using occlusion queries because to many layers of transparent structure become close to impossible to discern from each other. It often helps to know what you are exactly rendering to get a good estimate of the amount of layers you need to peel.
I recently implemented depth peeling in webgl. There are a few limiting factors that make it kinda hard to do as many peels as layers. Mainly a very limited amount of texture units and the fact you can only render to one target at a time, so you have to render color and depth seperately. With 7 textures used I can do 4 peels. That already takes 11 render passes per frame. To do more peels you would need to do a bit more sophisticated merging of intermediate results. I doubt you gain much from more peels.
I'm implementing a simple lightning effect for my 3D game, something like this:
http://www.krazydad.com/bestiary/bestiary_lightning.html
I'm using opengl ES 2.0. I'm pondering what the best looking and most performance efficient way to render this in a 3D environment is though, as the lines making up the electric bolt needs to be looking "solid" when viewed from any angle.
I was thinking to generate two planes for each line segment, in an X cross to create an effect of line thickness. Rendering by disabling depth buffer writes, using some kind off additive blending mode. Texturing each line segment using an electric looking texture with an alpha channel.
I'm a bit worried about the performance hit from generating the necessary triangle lists using this method though, as my game will potentially have a lot of lightning bolts generated at the same time. But as the length and thickness of the lightning bolts will vary a lot, I doubt it would look good to simply use an animated 3D object of an lightning bolt, stretched and pointing to the right location, which was my initial idea.
I was thinking of an alternative approach where I render the lightning bolts using 2D lines between projected end points in a post processing pass. That should work well since the perspective effect in my case is negligible, except then it would be tricky to have the lines appear behind occluding objects.
Any good ideas on the best approach here?
Edit: I found this white paper from nVidia:
http://developer.download.nvidia.com/SDK/10/direct3d/Source/Lightning/doc/lightning_doc.pdf
Which uses an approach with having billboards for each line segment, then apply some filtering to smooth the resulting gaps and overlaps from each billboard.
Seems to yield pretty good visual results, however I am not too happy about the additional filtering pass as the game is for mobile phones where such a step is quite costly. And, as it turns out, billboarding is quite CPU expensive too, due to the additional matrix calculation overhead, which is slow on mobile devices.
I ended up doing something like the nVidia paper suggested, but to prevent the need for a postprocessing step I used different kind of textures for different kind of branching angles, to avoid gaps and overlaps of the segment corners, which turned out quite well. And to avoid the expensive billboard matrix calculation I instead drew the line segments using a more 2D approach, but calculating the depth value manually for each vertex in the segments. This yields both acceptable performance and visuals.
An animated texture, possibly powered by a shader, is likely the fastest way to handle this.
Any geometry generation and rendering will limit the quality of the effect, and may take significantly more CPU time, memory bandwidth and draw calls.
Using a single animated texture on a quad, or a shader creating procedural lightning, will give constant speed and make the effect much simpler to implement. For that, this question may be of interest.
In my opengl app, I am drawing the same polygon approximately 50k times but at different points on the screen. In my current approach, I do the following:
Draw the polygon once into a display list
for each instance of the polygon, push the matrix, translate to that point, scale and rotate appropriate (the scaling of each point will be the same, the translation and rotation will not).
However, with 50k polygons, this is 50k push and pops and computations of the correct matrix translations to move to the correct point.
A coworker of mine also suggested drawing the entire scene into a buffer and then just drawing the whole buffer with a single translation. The tradeoff here is that we need to keep all of the polygon vertices in memory rather than just the display list, but we wouldn't need to do a push/translate/scale/rotate/pop for each vertex.
The first approach is the one we currently have implemented, and I would prefer to see if we can improve that since it would require major changes to do it the second way (however, if the second way is much faster, we can always do the rewrite).
Are all of these push/pops necessary? Is there a faster way to do this? And should I be concerned that this many push/pops will degrade performance?
It depends on your ultimate goal. More recent OpenGL specs enable features for "geometry instancing". You can load all the matrices into a buffer and then draw all 50k with a single "draw instances" call (OpenGL 3+). If you are looking for a temporary fix, at the very least, load the polygon into a Vertex Buffer Object. Display Lists are very old and deprecated.
Are these 50k polygons going to move independently? You'll have to put up with some form of "pushing/popping" (even though modern scene graphs do not necessarily use an explicit matrix stack). If the 50k polygons are static, you could pre-compile the entire scene into one VBO. That would make it render very fast.
If you can assume a recent version of OpenGL (>=3.1, IIRC) you might want to look at glDrawArraysInstanced and/or glDrawElementsInstanced. For older versions, you can probably use glDrawArraysInstancedEXT/`glDrawElementsInstancedEXT, but they're extensions, so you'll have to access them as such.
Either way, the general idea is fairly simple: you have one mesh, and multiple transforms specifying where to draw the mesh, then you step through and draw the mesh with the different transforms. Note, however, that this doesn't necessarily give a major improvement -- it depends on the implementation (even more than most things do).
I am rendering some meshes (sometimes upwards of 500) and I wanted to know the best way to approach this. Would it be pointless to create 500 VBOs and then if they pass the frustum and visibility tests, render them. Is there a more efficient way to do this? I am looking to maximize performance.
To answer your question, yes, many VBOs will slow things down. More polys will usually slow down the render, but more draw calls has a much greater hit. You want to minimize state changes and draws, as well as the number of buffers you have (and memory use).
I would suggest first looking at the buffers and figuring out how many you need. If you can batch/instance geometry, merge static geometry into a single buffer, reuse buffer more efficiently, etc.
Once you've cut the buffers down to the minimum possible, you'll want to use culling of multiple sorts. Visibility, both by frustrum (perhaps in an octree) and occlusion, can provide a significant performance boost. The main idea is to disqualify the geometry as fast and simply as possible, so you start with rough tests (octree), then somewhat more detailed (perhaps an AABB and/or simplified hull), then occlusion, then actually draw.
Here's a good article on frustrum culling, which touches a bit on quadtrees (and by extension, octrees). Diagrams, explanations and some sample code.
OpenGL occlusion culling articles seem a bit less common, although this one from GPU Gems might be a good starting place.