In a DirectX 11 demo application, I use a CDXUTSDKMesh for my static geometry. It is loaded and already displayed.
I'm doing some experiments related to Precomputed Radiance Transfer in this application. The ID3DXPRTEngine would be a pretty handy tool for the job. Unfortunately, D3DX and DXUT don't seem to be very compatible there.
ID3DXPRTEngine requires a single ID3DXMesh (multiple meshes can be concatenated with D3DXConcatenateMeshes no problem). Is there an easy way to 'convert' CDXUTSDKMesh to one or multiple ID3DXMesh instances?
Short Answer: No.
Long answer: It can be done, but it requires that you be able to read the raw vertex data, convert it to the Direct3D 9 FVF system (the old fixed shader pipeline), and write the new data back.
Basically what you're trying to do is get two different versions of the Direct3D API, version 9.0 and version 11.0, to interoperate. In order to do this, you need to do the following:
Confirm your vertices contain no custom semantics (e.g. D3D11 allows a semantic like RANDOM_EYE_VEC, whereas D3D9 does not).
Confirm you have the ability to read your vertex and face information into a byte buffer (i.e. a char pointer).
Create a Direct3D 9 Device and a Direct3D 11 device.
Open your mesh, pull the data into a byte buffer, and create an ID3DXMesh object from that data (see the sketch after this list). If you have more than 2^16 faces, you will need to split the mesh into two or more meshes.
Create the PRT Engine from the mesh(es), and run your PRT calculations.
Read the new information from the mesh(es) using ID3DXBaseMesh::GetVertexBuffer() (not sure if this step is entirely correct as I've never actually used the PRT engine).
Write the new data back into the CDXUTSDKMesh.
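To make step 4 a little more concrete, here is a minimal sketch of building an ID3DXMesh from raw data. It assumes the SDKMesh vertices have already been read into a plain position/normal/texcoord layout; srcVerts, srcIndices, numVertices, numFaces and pd3d9Device are all hypothetical names for the converted data and the Direct3D 9 device created in the earlier steps:

```cpp
// Assumed vertex layout after conversion from the SDKMesh data.
struct Vertex9
{
    D3DXVECTOR3 pos;
    D3DXVECTOR3 normal;
    D3DXVECTOR2 uv;
};
const DWORD kFVF = D3DFVF_XYZ | D3DFVF_NORMAL | D3DFVF_TEX1;

ID3DXMesh* pMesh = nullptr;
HRESULT hr = D3DXCreateMeshFVF(numFaces, numVertices,
                               D3DXMESH_SYSTEMMEM | D3DXMESH_32BIT,
                               kFVF, pd3d9Device, &pMesh);
if (SUCCEEDED(hr))
{
    // Copy the converted vertex data into the D3DX mesh.
    void* pVB = nullptr;
    pMesh->LockVertexBuffer(0, &pVB);
    memcpy(pVB, srcVerts, numVertices * sizeof(Vertex9));
    pMesh->UnlockVertexBuffer();

    // D3DXMESH_32BIT gives a 32-bit index buffer; with 16-bit indices you
    // would split the mesh as described above.
    void* pIB = nullptr;
    pMesh->LockIndexBuffer(0, &pIB);
    memcpy(pIB, srcIndices, numFaces * 3 * sizeof(DWORD));
    pMesh->UnlockIndexBuffer();
}
```

The reverse direction (step 7) is the same idea: lock the D3DX vertex buffer, read the updated data out, and write it back into the D3D11 buffer owned by the CDXUTSDKMesh.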
As you may be able to tell, this process is long, prone to error, and very slow. If you're doing this for an offline tool, it may be OK, but if this is to be used in real time, you're probably better off writing your own PRT engine.
Additionally, if your mesh has any custom semantics (i.e. you use a shader that expects some data that wasn't a part of the D3D9 world) then this algorithm will fail. The PRT engine uses the fixed function pipeline, so any new meshes that take advantage of D3D11 features that didn't exist back then are not going to work here.
We have a webgl/three.js application that makes extensive use of texture buffers for passing data between passes and for storing arrays of data. None of these has any use for mipmaps. We are easily able to prevent mipmap generation: at the three.js level we set min and mag filters to NearestFilter, and set generateMipmaps false.
However, the shaders do not know at compile time that there is no mipmapping. When compiled using ANGLE we get a lot of warning messages:
warning X4121: gradient-based operations must be moved out of flow control to prevent divergence. Performance may improve by using a non-gradient operation
I have recoded so that the flow around such lookups is (optionally) avoided.
On my Windows/NVidia machine using the conditional flows improves performance and does not cause any visual issues (but does cause the messages).
I don't want the texture lookups to be gradient-based operations. What I would like to do is to write the shaders in such a way that they know at compile time that there is no decision to be made; which should (marginally) improve performance and also make the messages go away. However, I cannot see any way to do this in GLSL for GLES 2 (as used by webgl). It can be done in later versions with textureLodOffset() and various other ways. The only control in level 2 I can see is the bias option on texture2D(), but that is a bias not an absolute value and so does not resolve the issue. So, finally ...
Question: Do you know any way to prevent lod calculation in WEBGL level GLSL shaders?
You might try ensuring that:
you are using gl_FragCoord instead of a user varying
NEAREST is set before texImage2D, instead of after
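If it helps, here is a minimal sketch of that setup order written against the raw GL ES 2.0 C API (the WebGL/three.js calls map onto these almost one-for-one); width, height and pixels are assumed to come from the application:

```cpp
#include <GLES2/gl2.h>

// Creates a data texture with filtering fixed to NEAREST before the upload,
// so the texture never passes through a mipmapped state.
GLuint CreateDataTexture(GLsizei width, GLsizei height, const void* pixels)
{
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);

    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels);  // no glGenerateMipmap
    return tex;
}
```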
My question is about design and possible suggestions for the following scenario:
I am writing a 3d visualizer. For my renderable objects I would like to store the minimum data possible (so quaternions are naturally nice for rotation).
At some point I must extract a Matrix for rendering which requires computation and temporary storage on every frame update (even for objects that do not change spatially).
Given that many objects remain static and don't need to be rotated locally would it make sense to store the matrix instead and thereby avoid the computation for each object each frame? Is there any best practice approach to this perhaps from a game engine design point of view?
I am currently a bit torn between storing the two extremes of either position + quaternion or a 4x3/4x4 matrix. Looking at openFrameworks (not necessarily trying to achieve the same goal as me), they seem to do a hybrid where they store a quaternion AND a matrix (the matrix always reflects the quaternion), so it's always ready when needed but needs to be updated along with every change to the quaternion.
More compact storage requires only 3 scalars, so Euler angles or exponential maps (Rodrigues) can be used. Quaternions are a good compromise between compactness and the speed of conversion to a matrix.
From a design point of view, there is a good rule: "make all design decisions as LATE as possible". In your case, just encapsulate (isolate) the rotation (transformation) representation, so that in the future you are able to change the physical storage of the data in its different states (file, memory, rendering and more). It also enables different platform optimizations, such as keeping the data on the GPU or the CPU.
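One way to follow that advice is to hide the representation behind a small transform class that stores the compact quaternion + position and caches the matrix lazily, so static objects pay the conversion cost only once. This is only an illustrative sketch using DirectXMath types; any math library would do:

```cpp
#include <DirectXMath.h>
using namespace DirectX;

class Transform
{
public:
    void SetRotation(FXMVECTOR quat) { XMStoreFloat4(&m_rotation, quat); m_dirty = true; }
    void SetPosition(FXMVECTOR pos)  { XMStoreFloat3(&m_position, pos);  m_dirty = true; }

    // Rebuilds the cached matrix only when the rotation or position changed,
    // so objects that stay put do no per-frame conversion work.
    XMMATRIX GetMatrix() const
    {
        if (m_dirty)
        {
            XMMATRIX m = XMMatrixRotationQuaternion(XMLoadFloat4(&m_rotation)) *
                         XMMatrixTranslation(m_position.x, m_position.y, m_position.z);
            XMStoreFloat4x4(&m_cached, m);
            m_dirty = false;
        }
        return XMLoadFloat4x4(&m_cached);
    }

private:
    XMFLOAT4           m_rotation { 0.0f, 0.0f, 0.0f, 1.0f };  // identity quaternion
    XMFLOAT3           m_position { 0.0f, 0.0f, 0.0f };
    mutable XMFLOAT4X4 m_cached;
    mutable bool       m_dirty = true;
};
```

Because callers only ever see GetMatrix(), you are free to later switch the storage to matrix-only, quaternion-only, or something GPU-resident without touching the rest of the code.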
Been there.
First: keep in mind the omnipresent struggle of time against space (in computer science processing time against memory requirements)
You said that you want to keep the minimum information possible at first (space), and then talked about some temporary matrix reflecting the quaternions, which is more of a time worry.
If you accept a tip, I would go for the matrices. They are generally the performance-wise standard for 3D graphics, and their size easily becomes irrelevant next to the object data itself.
Just to give an idea: on most GPUs, transforming a vector by the identity matrix (no change) is actually faster than checking whether it needs transformation and then doing nothing.
As for engines, I can't think of one that does not apply the transformations for every vertex every frame. Even if the objects stay in place, their positions still have to go through the projection and view matrices.
(Does this answer your question? Maybe I misunderstood you.)
Which is faster, a single call to glUseProgram, or sending e.g. 6 or so floats via glUniform (batched or separately), and by approximately how much?
Can you describe in more detail the scenario where you think this affects the performance of the rendering pipeline? They offer completely different functionalities and I don't see why you would care about the performance of glUseProgram vs glUniform.
Now let's analyze what happens when you use these functions, to get an idea of their cost.
When you call glUseProgram it changes several OpenGL rendering states, because we are going to use new shaders attached to the program object. The specification says that vertex and fragment programs are installed in the processors when you invoke this function. That alone seems costly enough to overshadow the cost of glUniform. Also, when you install new vertex and fragment programs, additional states of the rendering pipeline are changed to accommodate the number of texture units and the data layout used by the programs.
glUniform copies data from one location of memory to another to specify the value of a uniform variable. The worst case would be copying matrices, which still seems less complex than glUseProgram.
But in the end, it all depends on the amount of data you are transferring with glUniform, on the underlying implementation of glUseProgram (it could be super optimized by the driver and have a very small cost), and on whether your engine is smart enough to group the geometry that uses the same program and draw it without changing states.
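That last point is the usual practical answer: sort or group draw calls by program so glUseProgram runs once per group, while the cheap per-object glUniform calls happen inside the loop. A rough sketch (DrawItem, mvpLocation and the draw parameters are hypothetical names, not from any particular engine):

```cpp
// Assumes an OpenGL 3+ header/loader (e.g. glad) is already included.
#include <algorithm>
#include <vector>

struct DrawItem
{
    GLuint  program;
    GLint   mvpLocation;
    GLfloat mvp[16];
    GLuint  vao;
    GLsizei vertexCount;
};

void DrawAll(std::vector<DrawItem>& items)
{
    // Group by program so the expensive state change happens once per group.
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.program < b.program; });

    GLuint current = 0;
    for (const DrawItem& item : items)
    {
        if (item.program != current)
        {
            glUseProgram(item.program);      // heavy: installs new shader programs
            current = item.program;
        }
        glUniformMatrix4fv(item.mvpLocation, 1, GL_FALSE, item.mvp);  // light: copies 16 floats
        glBindVertexArray(item.vao);
        glDrawArrays(GL_TRIANGLES, 0, item.vertexCount);
    }
}
```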
I'm writing a directx app and want to overlay a grid on the front of the scene. The grid will possibly update every frame but will be something like 20 horizontal lines and 20 vertical lines (LineList).
I'm trying to understand if this situation (small amount of vertices updated frequently) means a dynamic buffer is more appropriate than a static buffer?
Is anyone able to advise on this? I've not been able to find a low-level explanation of the difference between the two - it sounds like dynamic is 'more accessible' to the CPU and requires some locking semantics whereas static is less accessible.
Cheers
You will likely want to use a dynamic vertex buffer. If you want to update the vertices on a per frame basis then dynamic is the way to go.
See this MSDN article for a more low-level description
MSDN Article
If you are changing the buffer every frame, use Dynamic buffers.
Using static buffers will cause the GPU to stall every time you change the buffer, which will kill performance.
I'm not sure about dynamic buffers in Direct3D 10; the name seems to come from Direct3D 9. Direct3D 10 has a somewhat more elaborate scheme for creating 'dynamic' buffers, but you should not be using static buffers in any case.
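For reference, the dynamic-buffer pattern in Direct3D 11 looks roughly like the sketch below (Direct3D 10 is nearly identical). GridVertex, kMaxVerts, device, context, lineVerts and numLineVerts are assumed to exist in your application:

```cpp
// Creation: a CPU-writable vertex buffer.
D3D11_BUFFER_DESC desc = {};
desc.ByteWidth      = sizeof(GridVertex) * kMaxVerts;
desc.Usage          = D3D11_USAGE_DYNAMIC;
desc.BindFlags      = D3D11_BIND_VERTEX_BUFFER;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;

ID3D11Buffer* vb = nullptr;
device->CreateBuffer(&desc, nullptr, &vb);

// Per frame: WRITE_DISCARD hands you a fresh region, so the GPU never has
// to stall waiting for the previous frame's copy to finish.
D3D11_MAPPED_SUBRESOURCE mapped = {};
context->Map(vb, 0, D3D11_MAP_WRITE_DISCARD, 0, &mapped);
memcpy(mapped.pData, lineVerts, sizeof(GridVertex) * numLineVerts);
context->Unmap(vb, 0);
```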
Is it possible, using the ZBar API, to check whether an image contains a barcode or not?
This is a backup measure: if the application is unable to get the barcode value, it can at least check whether the image might contain a barcode, so the user can manually verify it later.
I have explored quite a bit but with no major success. If not ZBar, is there any other open source library that can do it well?
Thanks
What you need is a detector, i.e. the ability to locate the barcode (if any), and thus just return yes or no according to the detection result.
IMHO Zbar does not provide a versatile enough API to do so since it exposes a high-level scanner interface (zbar_scan_image) that combines detection & decoding on one hand, and a pure decoder interface on the other hand.
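To illustrate that limitation, the closest you can get with ZBar's C API alone is "did anything decode?", which cannot separate "no barcode present" from "barcode present but unreadable". A minimal sketch (grayscale Y800 input assumed; the function name is just for illustration):

```cpp
#include <zbar.h>

// Returns true only if ZBar managed to DECODE at least one symbol; a false
// result does not tell you whether an undecodable barcode is present.
bool ContainsDecodableBarcode(const unsigned char* gray, unsigned width, unsigned height)
{
    zbar_image_scanner_t* scanner = zbar_image_scanner_create();
    zbar_image_scanner_set_config(scanner, ZBAR_NONE, ZBAR_CFG_ENABLE, 1);

    zbar_image_t* image = zbar_image_create();
    zbar_image_set_format(image, zbar_fourcc('Y', '8', '0', '0'));  // 8-bit grayscale
    zbar_image_set_size(image, width, height);
    zbar_image_set_data(image, gray, width * height, nullptr);      // caller keeps ownership

    int symbols = zbar_scan_image(scanner, image);                  // detection + decoding combined

    zbar_image_destroy(image);
    zbar_image_scanner_destroy(scanner);
    return symbols > 0;
}
```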
You should definitely refer to this paper: Robust 1D Barcode Recognition on Mobile Devices. It contains an entire section related to the detection step including pseudo-algorithms [1] - see 4. Locating the barcode. But there is no ready-to-use open source library: you would have to implement your own detector based on the described techniques.
Lastly, more pragmatic/simple techniques may be used depending on the kind of input images you plan to work with (is there any rotation? blur? is it about processing still images or a video stream in real time?).
[1] In addition I would say that it's a good idea to use a different kind of algorithm within this fallback step than the one used within the first step.