I'm currently rendering geological data, and have done so successfully with good results. To clarify, I input a matrix of elevations and output a single static mesh. I do this by creating a single plane for each elevation point and then, after all of these individual planes are created, merging them into a single mesh.
I've been running at 60 FPS even on a MacBook Air, but I want to push the limits. Is using a single PlaneGeometry as a heightmap, as described in other terrain examples, more efficient, or is it ultimately the same product at the end of the process?
Sorry for a general question without code examples, but I think this is a specific enough topic to warrant a question.
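(For reference, a minimal sketch of the plane-per-point-and-merge approach described above. The elevations matrix and spacing value are assumed names; the merge helper lives in BufferGeometryUtils and is called mergeGeometries in recent three.js releases, mergeBufferGeometries in older ones.)

import * as THREE from "three";
import { mergeGeometries } from "three/examples/jsm/utils/BufferGeometryUtils.js";

// One small plane per elevation sample, merged afterwards into one static mesh.
function buildMergedTerrain(elevations: number[][], spacing: number): THREE.Mesh {
  const planes: THREE.BufferGeometry[] = [];
  for (let row = 0; row < elevations.length; row++) {
    for (let col = 0; col < elevations[row].length; col++) {
      const plane = new THREE.PlaneGeometry(spacing, spacing);
      plane.rotateX(-Math.PI / 2); // lay the plane flat
      plane.translate(col * spacing, elevations[row][col], row * spacing);
      planes.push(plane);
    }
  }
  const merged = mergeGeometries(planes); // single geometry, single draw call
  return new THREE.Mesh(merged, new THREE.MeshStandardMaterial());
}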
From my own tests, which I spent a couple of days running, creating individual custom geometries and merging them as they are created is wildly inefficient, not just in the render loop but also during the loading process.
The process I am using now is creating a single BufferGeometry with enough width and height segments to contain the elevation data, as described in this article.
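(A minimal sketch of that single-geometry version, under the same assumptions about the elevation matrix: one PlaneGeometry sized to the grid, with each vertex displaced by its elevation sample.)

import * as THREE from "three";

// Single plane whose vertices are displaced by the elevation matrix.
// Assumed names: `elevations` is rows x cols, `spacing` is the grid cell size.
function buildHeightmapTerrain(elevations: number[][], spacing: number): THREE.Mesh {
  const rows = elevations.length;
  const cols = elevations[0].length;
  // widthSegments/heightSegments = cells, so the vertices line up with the samples
  const geometry = new THREE.PlaneGeometry(
    (cols - 1) * spacing,
    (rows - 1) * spacing,
    cols - 1,
    rows - 1
  );
  geometry.rotateX(-Math.PI / 2); // lay the plane flat in XZ, so Y is "up"
  const position = geometry.attributes.position;
  for (let i = 0; i < position.count; i++) {
    const row = Math.floor(i / cols);
    const col = i % cols;
    position.setY(i, elevations[row][col]); // displace each vertex by its sample
  }
  geometry.computeVertexNormals();
  return new THREE.Mesh(geometry, new THREE.MeshStandardMaterial());
}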
I'd like to ask for some insights into rendering a large number of meshes with the best possible performance.
I'm working on generative, mineable planets built from marching-cubes chunked terrain. Currently I'm trying to add vegetation/rocks to spruce up the planet surfaces (get it?). I'm using the chunk loading to also load smaller rocks and some grass alongside the terrain. That runs pretty well. I'm having issues with trees and boulders (visible across the entire planet surface but LODed, obviously).
Testing different methods has led me down the road of:
Custom shaders with material clipping based on camera distance: works okay for about half a million trees made from two perpendicular planes (merged into one single BufferGeometry), but those 'models' are not good enough.
THREE.LOD: sucks up FPS like crazy, too slow for large numbers of meshes.
THREE.InstancedMesh: works pretty well, but I'd have to disable frustum culling, since the origin point of the vegetation is not always on screen, which makes it inefficient.
THREE.InstancedGeometry combined with the custom clipping shaders: I had high hopes for this; it gives the best performance while using actual models, but it still eats up half the frame rate. The vertex shader still has to process all the vertices to determine whether each one is within clipping range, and the same frustum-culling issue applies.
Material.clippingPlanes combined with InstancedMeshes: this is what I'm trying now; no luck with it yet, still trying to figure out exactly how it works.
Does anyone have experience with rendering large numbers of meshes, or have some advice for me? Is there a technique I do not yet know about?
Would it help to split the trees up into multiple InstancedMeshes? Would the clippingPlanes give me better performance?
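(To make that last question concrete, here is a minimal sketch of the split-into-multiple-InstancedMeshes idea: one InstancedMesh per terrain chunk, so each batch gets bounds that actually enclose its trees and frustum culling can stay enabled. treeGeometry, treeMaterial and treeMatrices are assumed names; computeBoundingSphere on InstancedMesh only exists in recent three.js versions, older ones need the bounds set manually.)

import * as THREE from "three";

// One instanced batch per chunk; the whole batch frustum-culls as a unit.
function buildChunkTrees(
  treeGeometry: THREE.BufferGeometry,
  treeMaterial: THREE.Material,
  treeMatrices: THREE.Matrix4[]
): THREE.InstancedMesh {
  const mesh = new THREE.InstancedMesh(treeGeometry, treeMaterial, treeMatrices.length);
  treeMatrices.forEach((matrix, i) => mesh.setMatrixAt(i, matrix));
  mesh.instanceMatrix.needsUpdate = true;
  mesh.computeBoundingSphere(); // recent three.js: bounds cover all instances
  mesh.frustumCulled = true;    // safe to leave culling on now
  return mesh;
}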
I have a scenario where I have a wall, and whenever the player hits the wall it falls into pieces. Each piece is a child GameObject of that wall, and each piece uses the same material. So I tried Dynamic Batching on that, but it didn't reduce the draw calls.
Then I tried CombineChildren.cs. It worked and combined all the meshes, reducing the draw calls, but whenever my player hit the wall it didn't play the animation.
I can't use SkinnedMeshCombiner.cs from this wiki with the answer from this link (check this), because my game objects have Mesh Renderers rather than Skinned Mesh Renderers.
Is there any other solution I can try?
So I tried Dynamic Batching on that, but it didn't reduce the draw calls.
Dynamic batching is almost the only way to go if you want to reduce draw calls for moving objects (unless you try to implement something similar on your own).
However it has some limitations as explained in the doc.
Particularly:
limits on vertex numbers for objects to be batched
all must share the same material (not different instances of the same material - be sure to check this)
no lightmapped objects
no multipass shaders
Be sure all limitations listed in the doc are met, then dynamic batching should work.
Then I tried CombineChildren.cs. It worked and combined all the meshes, reducing the draw calls, but whenever my player hit the wall it didn't play the animation.
That's because once combined, all the GameObjects are merged into a single one. You can't move them independently any more.
I recently started learning WebGL and decided to use the Three.js library.
Currently, in addition to rendering over 100K cubes, I'm also trying to render lines between those cubes (over 100K of them).
The problem with rendering occurs when I try to draw lines, NOT cubes. Rendering 100K cubes was relatively fast. Even rendering those 100K+ lines is relatively fast, but when I try to zoom/pan using the TrackballControls, the FPS drops to almost 0.
I've searched Stack Overflow and various other sites to improve the performance of my application. I've used the merging-geometries technique, delayed the rendering of lines (basically x number of cubes/lines at a time using a timeout in JS), and adjusted the appearance of the lines to require the most minimal rendering time.
Are there any other methods in constructing the lines so that the rendering/fps aren't affected? I'm building a group of lines and THEN adding it to the scene. But is there a way that merging is possible with lines? Should I be constructing my lines using different objects?
I just find it strange that I can easily render/pan/zoom/maintain a high fps with over 100k cubes but with lines (which is the most basic form of geometry besides a point), everything crashes.
You have found out (the hard way) where graphics chip vendors put their focus in their device drivers. That is to be expected in 3D graphics: lines (yes, the most basic geometry) are not used by many games, so they do not get as much attention as polygons do. Take a look at the example webgl_buffergeometry_lines.html, which is probably the fastest way to draw lines.
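(The same merging idea that example demonstrates, sketched for line data: pack every segment into one BufferGeometry and draw it as a single LineSegments object, so the whole set costs one draw call. The pairs array of start/end vectors is an assumed name.)

import * as THREE from "three";

// All 100K+ connections in one geometry, one material, one draw call.
function buildMergedLines(pairs: [THREE.Vector3, THREE.Vector3][]): THREE.LineSegments {
  const positions = new Float32Array(pairs.length * 6); // 2 endpoints * xyz per segment
  pairs.forEach(([a, b], i) => {
    positions.set([a.x, a.y, a.z, b.x, b.y, b.z], i * 6);
  });
  const geometry = new THREE.BufferGeometry();
  geometry.setAttribute("position", new THREE.BufferAttribute(positions, 3));
  return new THREE.LineSegments(geometry, new THREE.LineBasicMaterial({ color: 0x888888 }));
}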
In my OpenGL app, I am drawing the same polygon approximately 50k times, but at different points on the screen. In my current approach, I do the following:
Draw the polygon once into a display list
For each instance of the polygon, push the matrix, translate to that point, and scale and rotate appropriately (the scaling of each point will be the same; the translation and rotation will not).
However, with 50k polygons, this is 50k pushes and pops and computations of the correct matrix translations to move to the correct point.
A coworker of mine also suggested drawing the entire scene into a buffer and then just drawing the whole buffer with a single translation. The tradeoff here is that we need to keep all of the polygon vertices in memory rather than just the display list, but we wouldn't need to do a push/translate/scale/rotate/pop for each vertex.
The first approach is the one we currently have implemented, and I would prefer to see if we can improve that since it would require major changes to do it the second way (however, if the second way is much faster, we can always do the rewrite).
Are all of these push/pops necessary? Is there a faster way to do this? And should I be concerned that this many push/pops will degrade performance?
It depends on your ultimate goal. More recent OpenGL specs enable features for "geometry instancing". You can load all the matrices into a buffer and then draw all 50k with a single "draw instances" call (OpenGL 3+). If you are looking for a temporary fix, at the very least, load the polygon into a Vertex Buffer Object. Display Lists are very old and deprecated.
Are these 50k polygons going to move independently? If so, you'll have to put up with some form of pushing/popping (even though modern scene graphs do not necessarily use an explicit matrix stack). If the 50k polygons are static, you could pre-compile the entire scene into one VBO, which would make it render very fast.
If you can assume a recent version of OpenGL (>= 3.1, IIRC) you might want to look at glDrawArraysInstanced and/or glDrawElementsInstanced. For older versions, you can probably use glDrawArraysInstancedEXT/glDrawElementsInstancedEXT, but they're extensions, so you'll have to access them as such.
Either way, the general idea is fairly simple: you have one mesh, and multiple transforms specifying where to draw the mesh, then you step through and draw the mesh with the different transforms. Note, however, that this doesn't necessarily give a major improvement -- it depends on the implementation (even more than most things do).
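(A minimal sketch of that instancing pattern. It is written against the WebGL2 equivalents of the calls named above, drawArraysInstanced and vertexAttribDivisor, since the shape of the API is the same; a compiled program with a per-vertex position attribute and a per-instance offset attribute is assumed.)

// Draw many copies of one triangle with a single call; each instance reads its own offset.
function drawInstanced(gl: WebGL2RenderingContext, program: WebGLProgram, offsets: Float32Array): void {
  const positionLoc = gl.getAttribLocation(program, "position"); // per-vertex attribute
  const offsetLoc = gl.getAttribLocation(program, "offset");     // per-instance attribute

  // Per-vertex data: one small triangle shared by every instance.
  gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
  gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([0, 0.1, -0.1, -0.1, 0.1, -0.1]), gl.STATIC_DRAW);
  gl.enableVertexAttribArray(positionLoc);
  gl.vertexAttribPointer(positionLoc, 2, gl.FLOAT, false, 0, 0);

  // Per-instance data: one 2D offset per polygon.
  gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
  gl.bufferData(gl.ARRAY_BUFFER, offsets, gl.STATIC_DRAW);
  gl.enableVertexAttribArray(offsetLoc);
  gl.vertexAttribPointer(offsetLoc, 2, gl.FLOAT, false, 0, 0);
  gl.vertexAttribDivisor(offsetLoc, 1); // advance this attribute once per instance, not per vertex

  gl.useProgram(program);
  gl.drawArraysInstanced(gl.TRIANGLES, 0, 3, offsets.length / 2); // one call, all instances
}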
I'm trying to create a platformer game, and I am taking various sprite blocks, and piecing them together in order to draw the level. This requires drawing a large number of sprites on the screen every single frame. A good computer has no problem handling drawing all the sprites, but it starts to impact performance on older computers. Since this is NOT a big game, I want it to be able to run on almost any computer. Right now, I am using the following DirectX function to draw my sprites:
D3DXVECTOR3 center(0.0f, 0.0f, 0.0f);
D3DXVECTOR3 position(static_cast<float>(x), static_cast<float>(y), z);
(my LPD3DXSPRITE object)->Draw((sprite texture pointer), NULL, &center, &position, D3DCOLOR_ARGB(a, r, g, b));
Is there a more efficient way to draw these pictures on the screen? Is there a way that I can use less complex picture files (I'm using regular png's right now) to speed things up?
To sum it up: what is the most performance-friendly way to draw sprites in DirectX? Thanks!
The ID3DXSPRITE interface you are using is already pretty efficient. Make sure all your sprite draw calls happen in one batch if possible between the sprite begin and end calls. This allows the sprite interface to arrange the draws in the most efficient way.
For extra performance you can load multiple smaller textures into one larger texture and use texture coordinates to get at them. This means textures don't have to be swapped as frequently. See:
http://nexe.gamedev.net/directknowledge/default.asp?p=ID3DXSprite
The file type you are using for the textures does not matter as long as they are preloaded into textures. Make sure you load them all into textures once when the game/level is loading. Once you have loaded them into textures, it does not matter what format they were originally in.
If you still are not getting the performance you want, try using PIX to profile your application and find where the bottlenecks really are.
Edit:
This is too long to fit in a comment, so I will edit this post.
When I say swapping textures I mean binding them to a texture stage with SetTexture. Each time SetTexture is called there is a small performance hit as it changes the state of the texture stage. Normally this delay is fairly small, but can be bad if DirectX has to pull the texture from system memory to video memory.
ID3DXSprite will reorder the draws that are between Begin and End calls for you. This means SetTexture will typically only be called once for each texture, regardless of the order you draw them in.
It is often worth loading small textures into a large one. For example, if it were possible to fit all the small textures into one large one, the texture stage could just stay bound to that texture for all draws. Normally this gives a noticeable improvement, but testing is the only way to know for sure how much it will help. It would look terrible, but you could just throw in any large texture and pretend it is the combined one to test what performance difference there would be.
I agree with dschaeffer, but would like to add that if you are using a large number of different textures, it may be better to smush them together into a single (or a few) larger textures and adjust the texture coordinates for the different sprites accordingly. Texture state changes cost a lot, and this may speed things up on older systems.