I'm working on a 3D engine that should work on mobile platforms. Currently I just want to make a prototype that will work on iOS and use forward rendering. In the engine a scene can have a variable number of lights of different types (directional, spot etc). When rendering, for each object (mesh) an array of lights that affect this object is constructed. The array will always have 1 or more elements. I can pack the light source information into a 1D texture and pass it to the shader. The number of lights can be put into this texture or passed as a separate uniform (I have not tried it yet, but these are my thoughts after googling).
The problem is that not all GLSL ES implementations support for loops with variable limits. So I can't write a shader that loops through the light sources and expect it to work on a wide range of platforms. Are there any techniques to support a variable number of lights in a shader if for loops with variable limits are not supported?
The idea I have:
Implement some preprocessing of shader source to unroll loops manually for different number of lights.
So, if I render all objects with one type of shader and the number of lights ranges from 1 to 3, I will end up with 3 different shaders (generated automatically) for 1, 2 and 3 lights.
Is it a good idea?
Since the source code for a shader consists of strings that you pass in at runtime, there's nothing stopping you from building the source code dynamically, depending on the number of lights, or any other parameters that control what kind of shader you need.
If you're using a setup where the shader code is in separate text files, and you want to keep it that way, you can take advantage of the fact that you can use preprocessor directives in shader code. Say you use LIGHT_COUNT for the number of lights in your shader code. Then when compiling the shader code, you prepend it with a definition for the count you need, for example:
#define LIGHT_COUNT 4
Since glShaderSource() takes an array of strings, you don't even need any string operations to connect this to the shader code you read from the file. You simply pass it in as an additional string to glShaderSource().
Shader compilation is fairly expensive, so you'll probably want to cache the shader program for each light count.
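A minimal sketch of the define-plus-cache idea above, in plain C. It only builds the strings and does the cache bookkeeping; the actual GL work is left to a caller-supplied function so the snippet compiles without GL headers. MAX_LIGHTS, make_light_define, program_for_lights and the compile_fn callback are names made up for this illustration, not part of any API.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_LIGHTS 8 /* hypothetical upper bound, pick your own */

/* Caller-supplied (hypothetical): compiles and links the given source
   strings and returns a non-zero program handle. A real one would pass
   `sources` straight to glShaderSource(shader, count, sources, NULL). */
typedef unsigned int (*compile_fn)(const char *const *sources, int count);

/* One cached program per light count; 0 means "not built yet". */
static unsigned int g_program_cache[MAX_LIGHTS + 1];

/* Writes "#define LIGHT_COUNT <n>\n" into out. Returns 0 on success. */
int make_light_define(int light_count, char *out, size_t out_size)
{
    int n;
    if (light_count < 1 || light_count > MAX_LIGHTS)
        return -1;
    n = snprintf(out, out_size, "#define LIGHT_COUNT %d\n", light_count);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Returns the program for light_count, compiling it only on first use. */
unsigned int program_for_lights(int light_count, const char *shader_body,
                                compile_fn compile)
{
    char define[64];
    if (make_light_define(light_count, define, sizeof define) != 0)
        return 0;
    if (g_program_cache[light_count] == 0) {
        /* The define goes first, the unmodified file contents second. */
        const char *sources[2];
        sources[0] = define;
        sources[1] = shader_body;
        g_program_cache[light_count] = compile(sources, 2);
    }
    return g_program_cache[light_count];
}
```

One thing to watch: if the shader file begins with a #version line, that line has to stay first, so the define would need to go between it and the rest of the source.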
Another option is what Andon suggested in a comment. You can write the shader for the upper limit of the light count you need, and then pass in uniforms that serve as multipliers for each light source. For the lights you don't need, you set the multiplier to 0. That's not very efficient since you're doing extra calculations for light sources you don't need, but it's simple, and might be fine if it meets your performance requirements.
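The multiplier approach could look roughly like this on the CPU side. This is only a sketch: MAX_LIGHTS, fill_light_multipliers and shade are hypothetical names, and shade is just a CPU mirror of what the shader loop would do, to show that the loop bound stays constant.

```c
#define MAX_LIGHTS 4 /* hypothetical fixed loop bound compiled into the shader */

/* Fills one multiplier per light slot: 1.0 for used slots, 0.0 for the rest.
   The array would be uploaded with glUniform1fv(location, MAX_LIGHTS, out). */
void fill_light_multipliers(int active_lights, float out[MAX_LIGHTS])
{
    int i;
    for (i = 0; i < MAX_LIGHTS; ++i)
        out[i] = (i < active_lights) ? 1.0f : 0.0f;
}

/* CPU mirror of the shader loop: the bound is a constant, and unused
   lights still cost a multiply-add but contribute nothing. */
float shade(const float contribution[MAX_LIGHTS],
            const float multiplier[MAX_LIGHTS])
{
    float sum = 0.0f;
    int i;
    for (i = 0; i < MAX_LIGHTS; ++i)
        sum += multiplier[i] * contribution[i];
    return sum;
}
```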
I'm currently rewriting a shader written in GLES30 for the GLES20 shader language.
I've hit a snag: the shader I need to convert calls the function textureLod, which samples the currently bound texture at a specific level of detail. The call is made in the fragment shader, but in GLES20 explicit-LOD sampling can only be done in the vertex shader.
I'm wondering: if I replace this with a call to texture2D, am I likely to compromise the function of the shader, or just reduce its performance? Every textureLod call in the original shader uses a level of detail of zero.
If you switch calls from textureLod to texture2D, you will lose control over which mip-level is being sampled.
If the texture being sampled only has a single mip-level, then the two calls are equivalent, regardless of the lod parameter passed to textureLod, because there is only one level that could be sampled.
If the original shader always samples the top mip level (=0), it is unlikely that the change will hurt performance, as sampling lower mip-levels is more likely to give better texture cache performance. If possible, you could have your sampled texture include only a top level to guarantee equivalence (unless the mip levels are required somewhere else). If this isn't possible, then the execution will be different. If the sample is used for 'direct' texturing, it is likely that the results will be fairly similar, assuming a nicely generated mip-chain. If it is used for other purposes (e.g. logic within the shader), then the divergence might be larger. It's difficult to predict without seeing the actual shader.
Also note that, if the texture sample is used within a loop or conditional, and the shader has been ported to/from a DirectX HLSL shader at any point in its lifetime, the call to textureLod may be an artifact of HLSL not allowing gradient instructions within dynamic loops (the HLSL equivalent of texture2D is a gradient instruction, but the equivalent of textureLod is not). This is required in HLSL, even if the texture only has a single mip-level.
I'm making a WebGL game and eventually came up with a pretty convenient concept of object templates: game objects of the same kind (say, characters of the same race) use the same template (which means buffers, attributes and shader program) and are instanced from that template by specifying a set of uniforms (which are, in fact, the most common difference between same-kind objects: model matrix, textures, bone positions, etc). To make independent objects with their own deep copy of the buffers, I just deep-copy and re-initialize the original template and start instantiating new objects from it.
But after that I started having doubts. Say, if I start using morphing on objects, by explicit editing of the vertices, this approach will require me to make a separate template for every object of such kind (otherwise, they would start morphing in exactly the same phase). Which is probably fine for this very case, 'cause I'll most likely need to recalculate normals and even texture coordinates, which means – most of the buffers.
But what if I'm missing some very common use of attributes, say, blood decals, which would require me to update only a small piece of the buffer? In that case, it would be much more reasonable to have two buffers for each object: a common one that is shared by all of them, and one for blood decals that is unique to each object. And, as blood is usually spilled on everything, this sounds pretty reasonable: we would save a lot of space by storing vertices, normals and such without unnecessary duplication.
I haven't tried implementing decals yet, so honestly not even sure if implementing them using vertex painting (textured or not) is the right choice. But I'm also pretty sure there are some commonly used attributes aside from vertices, normals and texture coordinates.
Here are some that I managed to come up with myself:
decals (probably better to be modelled as separate objects?)
bullet holes and such (same as decals maybe?)
Any thoughts?
UPD: as all this might sound confusing, I want to clarify: I do understand that using as few buffers as possible is a good thing, and this is exactly why I'm trying to use this template concept. My question is: in which cases is using a single buffer and a single element buffer (both shared between similar objects) for a template going to stab me in the back?
Keeping a giant chunk of data that won't change on the card is incredibly useful for saving bandwidth. Additionally, you probably won't be directly changing the vertex positions once they are on the card. Instead you will probably morph them in the vertex shader with passed-in uniforms, through skeletal animation. Read about it here: Skeletal Animation
Do keep in mind, though, that with keyframe animation of meshes you would keep a bunch of buffers on the card, each holding a different keyframe pose of the animation. You would then bind whichever two keyframes you want to interpolate between as attributes and blend between them (you can have more than two). Keyframe Animation
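As a rough sketch of the blend itself, here is the per-component math written in C. The function name and parameters are made up for illustration, and in a real vertex shader the same thing would be a single mix() call on two position attributes.

```c
#include <stddef.h>

/* Blends two keyframe poses; t = 0 gives pose_a, t = 1 gives pose_b.
   In a vertex shader the same thing is mix(a_posA, a_posB, u_t), where
   a_posA, a_posB and u_t are hypothetical attribute/uniform names. */
void blend_keyframes(const float *pose_a, const float *pose_b,
                     float t, float *out, size_t float_count)
{
    size_t i;
    for (i = 0; i < float_count; ++i)
        out[i] = pose_a[i] + (pose_b[i] - pose_a[i]) * t;
}
```

The point is that both poses stay untouched in their buffers; only the blend factor changes from frame to frame.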
Additionally, with the introduction of Transform Feedback (no, you don't get to use it in WebGL: it became core in OpenGL 3.0, and WebGL is based on OpenGL ES 2.0, which is based on OpenGL 2.0), you can start keeping calculated data GPU-side. In other words, you can do a giant particle system simulation in the vertex or geometry shader, store the calculated data in another buffer, and then use that buffer in the next frame without a round trip from the GPU to the CPU. Read about it here: Transform Feedback and here: Transform Feedback how to
In general, you don't want to touch buffers once they are on the card, especially every frame. Instead load several and use pointers to that data in shaders as attributes.
I'm adding an OpenGL renderer to my 2D game engine and I want to know whether there is a way to apply an mvp matrix only to part of the vertices in a single draw call?
I'm planning to group draw calls by texture, so I'll pass a buffer of many vertices and texcoords. Now I want to apply different rotation angles to different quads. Is there a way to accomplish this in the shader, or should I give up on the MVP matrix in the shader and do the same thing on the CPU?
EDIT: What about adding 3 float attributes (rotation and rot_center.xy) per vertex?
Which gives better performance:
(1) doing CPU rotation?
(2) providing 3 more floats per vertex
(3) separating draw calls?
Is there any other option?
Here is a possibility:
1. Do the rotation in the vertex shader. Pass in the information (angle?) needed to create the rotation matrix as a vertex attribute.
2. Pass in a vertex attribute (ubyte) that is effectively a per-vertex boolean flag. The rotation in #1 will be executed only if the flag is set.
Not sure if the above will work for you from a performance/storage perspective.
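To make the two steps above concrete, here is a sketch of the per-vertex math in C, mirroring what the vertex shader would compute. The struct layout and names are hypothetical. As an assumption of this sketch, the rotation is passed as a precomputed cosine and sine rather than an angle; passing the angle and calling cos/sin in the shader would work the same way.

```c
typedef struct { float x, y; } vec2;

/* Per-vertex data as the vertex shader would see it (hypothetical layout). */
typedef struct {
    vec2 position;
    vec2 rot_center;
    float cos_a, sin_a;    /* rotation, precomputed on the CPU (assumption) */
    unsigned char rotate;  /* per-vertex flag: 0 = leave the vertex untouched */
} Vertex;

/* Rotates the vertex around rot_center if its flag is set. */
vec2 transform_vertex(const Vertex *v)
{
    vec2 d, r;
    if (!v->rotate)
        return v->position;
    d.x = v->position.x - v->rot_center.x;
    d.y = v->position.y - v->rot_center.y;
    r.x = v->rot_center.x + d.x * v->cos_a - d.y * v->sin_a;
    r.y = v->rot_center.y + d.x * v->sin_a + d.y * v->cos_a;
    return r;
}
```

The result would then be multiplied by the shared MVP matrix as usual, so all quads can stay in one draw call.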
I think that, while grouping draw calls is a good thing for many performance reasons, changing your code to satisfy a requirement as basic as rotation is not a good idea.
Draw batching is a good thing, but if it forces you to keep an additional attribute (because you certainly cannot do it with uniforms; you wouldn't have the per-entity information), it is not worth it.
An additional attribute means much more memory bandwidth usage, which is usually the main performance killer on today's systems.
Draw batching, on the other hand, is important but not always critical. It depends on many factors, such as:
the GPU OpenGL driver optimization
The GPU tiles configuration
The number of shapes/draw calls we are talking about (if you have 20 quads on the screen, why bother with batching? :) )
In other words, it is often much more convenient to drop extreme batching in favor of ease and maintainability, and to avoid fancy solutions for simple requirements such as rotation.
I hope this helps in some way.
Use two different objects, that is all!
There is no other workaround for rotating part of an object.
Example:
A game with a tank, where you want to rotate the turret and the rest of the body separately. As in your case, these two are treated as separate objects.
Or any counterpart?
How can I generate a cheap random number?
GLSL ES doesn't come with noise functions, and the desktop GLSL noise functions are almost never implemented.
However, there are some freeware noise functions available. They're supposed to be pretty decent and fast. I've never used them myself, but they should work. It's MIT-licensed code, if you're worried about that.
Define "cheap".
The way random numbers work in computers is, they're not really random. You start with a number (the seed), and for each random number you want you do some fancy looking calculations on that number to get another number which looks random, and you use that number as your random number and the seed for the next random number. See here for the gory details.
Problem is, that procedure is inherently sequential, which is no good for shaders.
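For illustration, one step of a linear congruential generator looks like this in C. The multiplier and increment are a widely published pair, used here only as an example; lcg_next is a made-up name.

```c
#include <stdint.h>

/* Linear congruential step: the result is both the random number and the
   seed for the next call, so number N cannot be had without number N-1. */
uint32_t lcg_next(uint32_t seed)
{
    return seed * 1664525u + 1013904223u; /* wraps modulo 2^32 */
}
```

That dependency on the previous value is exactly what makes it awkward in a shader, where every fragment runs independently.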
You could theoretically write a function in a fragment shader that makes some hash out of, say, the fragment position and potentially some uniform int that is incremented every frame, but that is an awful lot of work for a fragment shader, just to produce something that looks like noise.
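A sketch of such a stateless hash, written in C to show the idea: the value depends only on the pixel position and a frame counter, so no state is carried from one fragment to the next. The function names and constants are just one possible choice. Note that, as far as I know, GLSL ES 1.00 has no integer bit operations, so an actual ES 2.0 shader would need a float-based variant of this.

```c
#include <stdint.h>

/* Bit-mixing step; invertible, so distinct inputs give distinct outputs. */
static uint32_t mix_bits(uint32_t x)
{
    x ^= x >> 16; x *= 0x7feb352du;
    x ^= x >> 15; x *= 0x846ca68bu;
    x ^= x >> 16;
    return x;
}

/* Stateless "random" value in [0, 1) from pixel position and frame index. */
float fragment_noise(uint32_t x, uint32_t y, uint32_t frame)
{
    uint32_t h = mix_bits(frame);
    h = mix_bits(y + 0x9e3779b9u * h);
    h = mix_bits(x + 0x9e3779b9u * h);
    /* Keep the top 24 bits so the value fits a float mantissa exactly. */
    return (float)(h >> 8) * (1.0f / 16777216.0f);
}
```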
The conventional technique for producing noise effects in OpenGL is to create a noisy texture and have the shader(s) use it in various ways. You could simply apply the texture as a standard texture to your surface, or you could stretch it or clamp its color values. For time-varying effects you might want to use a 3D texture, or have a larger 2D texture and pass a random texture coordinate offset to the fragment shader stage each frame via a uniform.
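A minimal sketch of the texture approach in C, covering just the CPU side: filling the noise texture once and picking a new coordinate offset each frame. The function names are hypothetical, and the generator is a simple linear congruential one chosen for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Fills a width*height single-channel texture with pseudo-random bytes.
   The result could be uploaded once with glTexImage2D as GL_LUMINANCE. */
void fill_noise_texture(uint8_t *pixels, size_t width, size_t height,
                        uint32_t seed)
{
    size_t i;
    for (i = 0; i < width * height; ++i) {
        seed = seed * 1664525u + 1013904223u; /* LCG step */
        pixels[i] = (uint8_t)(seed >> 24);    /* use the high bits */
    }
}

/* Picks a per-frame texture coordinate offset in [0, 1) x [0, 1),
   to be passed to the fragment shader as a vec2 uniform. */
void next_noise_offset(uint32_t *seed, float offset[2])
{
    int i;
    for (i = 0; i < 2; ++i) {
        *seed = *seed * 1664525u + 1013904223u;
        offset[i] = (float)(*seed >> 8) * (1.0f / 16777216.0f);
    }
}
```

For the offset to wrap around cleanly the texture needs GL_REPEAT wrapping, which on OpenGL ES 2.0 means a power-of-two texture size.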
Also have a look at Perlin noise, which essentially uses a variation of the effect described above.
I'm using a ParticleSystem with PointSprites (inspired by the Cocos2D source), but I wonder how to rebuild this functionality for OpenGL ES 2.0:
glEnable(GL_POINT_SPRITE_OES);
glEnableClientState(GL_POINT_SIZE_ARRAY_OES);
glPointSizePointerOES(GL_FLOAT,sizeof(PointSprite),(GLvoid*) (sizeof(GLfloat)*2));
glDisableClientState(GL_POINT_SIZE_ARRAY_OES);
glDisable(GL_POINT_SPRITE_OES);
These calls generate BAD_ACCESS when using an OpenGL ES 2.0 context.
Should I simply go with 2 triangles per point sprite? But that's probably not very efficient (overhead for the extra vertices).
EDIT:
So, my new problem with the suggested solution from:
https://gamedev.stackexchange.com/questions/11095/opengl-es-2-0-point-sprites-size/15528#15528
is how to pass many different sizes in a single batch call. I thought of using an attribute instead of a uniform, but then I would always need to pass a point size to my shaders, even if I'm not drawing GL_POINTS. So, maybe a second shader (a shader only for GL_POINTS)? I'm not aware of the overhead of switching shaders every frame in the draw routine (because if the particle system is used, I naturally also want to render regular GL_TRIANGLES without a point size)... Any ideas on this?
As I already commented, what is described here is what you need: https://gamedev.stackexchange.com/questions/11095/opengl-es-2-0-point-sprites-size/15528#15528
As for which approach to take: you can either use different shaders for the different types of drawables in your application, or add a boolean uniform to your shader and use it to enable or disable setting gl_PointSize in the shader code. It's up to you. What you need to keep in mind is that changing the shader program is one of the most costly operations, so it is better to draw objects of the same type together in a batch. I'm not really sure whether an if statement in your shader code would have a large performance impact.
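For the separate-shader variant, a sketch of what the point sprite side might look like. Everything here is an assumption made for illustration: the PointSprite layout is inferred from the 2-float offset in the ES 1.1 code in the question, and the attribute and uniform names are made up.

```c
#include <stddef.h>

/* Assumed layout, inferred from the 2-float offset in the ES 1.1 code. */
typedef struct {
    float x, y;  /* position */
    float size;  /* per-particle point size */
} PointSprite;

/* Vertex shader used only for GL_POINTS batches (names are hypothetical).
   The per-vertex attribute replaces the ES 1.1 point size array. */
static const char *kPointSpriteVS =
    "attribute vec2 a_position;\n"
    "attribute float a_pointSize;\n"
    "uniform mat4 u_mvp;\n"
    "void main() {\n"
    "    gl_Position = u_mvp * vec4(a_position, 0.0, 1.0);\n"
    "    gl_PointSize = a_pointSize;\n"
    "}\n";

/* Byte offset to hand to glVertexAttribPointer for the size attribute:
   glVertexAttribPointer(sizeLoc, 1, GL_FLOAT, GL_FALSE,
                         sizeof(PointSprite),
                         (const void *)point_size_offset()); */
size_t point_size_offset(void)
{
    return offsetof(PointSprite, size);
}
```

With this split, the triangle shader never needs a point size attribute, and the cost is one program switch per batch of particles rather than per object.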