I've been getting into DirectX programming lately. All is going well, but there is one big issue. Every time I run the program, even when there haven't been any changes to the code, it has to compile the shaders. Is there a way to set it up so that they only get compiled when they are edited?
It is rather annoying trying to fine-tune values the way I want them when I have to wait two minutes for every run.
And yes, it is compiling them at run time.
Use fxc to precompile the shaders into .fxo files. These can be loaded by D3DXCreateEffectFromFile just like your .fx files. This should significantly decrease loading time.
See the CompiledEffect Sample from the SDK for details.
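Roughly what that looks like, assuming a D3D9 effect; the file names and the fxc profile here are just examples:

```
// Precompile once, e.g. as a build step (example file names):
//   fxc /T fx_2_0 /Fo water.fxo water.fx
//
// At run time, load the already-compiled .fxo instead of the .fx source.
#include <d3dx9.h>

ID3DXEffect* LoadPrecompiledEffect(IDirect3DDevice9* device)
{
    ID3DXEffect* effect = NULL;
    ID3DXBuffer* errors = NULL;

    // D3DXCreateEffectFromFile accepts the binary .fxo just like a .fx file,
    // but skips the expensive HLSL compilation step.
    HRESULT hr = D3DXCreateEffectFromFile(device, TEXT("water.fxo"),
                                          NULL,  // no #defines
                                          NULL,  // no include handler
                                          0,     // no flags
                                          NULL,  // no effect pool
                                          &effect, &errors);
    if (FAILED(hr))
    {
        if (errors) errors->Release();
        return NULL;
    }
    return effect;
}
```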
As a side note, are you sure it is the shader compilation that is causing the long delay? A large number of shaders can certainly cause a slowdown like that, but if you are "getting into DirectX programming" as you say... how many lines of shader code are we talking about?
OK, I have a problem, and I do not know the correct terms to find what I am looking for on Google, so I hope someone here can help me out.
When developing real-time programs on embedded devices you might have to iterate a few hundred or thousand times until you get the desired result. When using e.g. ARM devices you wear out the internal flash quite quickly. So typically you develop your programs to reside in the RAM of the device and all is fine. This is done using GCC's ability to split the code into various sections.
Unfortunately, the RAM of most devices is much smaller than the flash. So at some point your program gets too big to fit in RAM with all its variables etc. (You choose the device on the assumption that the whole program will fit in flash later.)
Classic shared objects do not work, as there is nothing like a dynamic linker in my environment. There is no OS or anything of the sort.
My idea was the following: the controller has no problem executing code from both RAM and flash. With the correct attributes on the functions, it is also no big problem for the compiler to put part of the program in RAM and part in flash, as sketched below.
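A minimal sketch of what I mean; the section names are made up and would have to match regions defined in the linker script, and how the RAM section actually gets loaded (directly via the debugger, or copied over at startup) is up to the linker script and startup code:

```
/* Hypothetical section names -- they must match what the linker script
 * maps to the flash and RAM regions of the device. */
#define FLASH_FUNC __attribute__((section(".stable_code")))
#define RAM_FUNC   __attribute__((section(".ram_code")))

/* Stable, "library-like" code: placed in flash and reflashed only rarely. */
FLASH_FUNC int scale_sample(int raw)
{
    return (raw * 3) / 2;
}

/* Code under active development: placed in RAM so iterating on it
 * does not wear out the flash. */
RAM_FUNC int control_step(int raw)
{
    return scale_sample(raw) + 7;
}
```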
Once I have some functionality running successfully, I create a library from it and put that in flash. The main development happens in the 'volatile' part in RAM, so the flash gets preserved.
The problem here is: I need to make sure that the library always gets linked to exactly the same location for as long as I do not reflash. So a given function must end up at the same flash address on every compile cycle. When something from the flash part is missing, it must be placed in RAM, or a linking error must be thrown.
I thought about putting together a real library and linking against that. Here I am a bit lost: I need to tell GCC/LD to link against a prelinked file (and to create such a prelinked file in the first place).
It should be possible to put all the library objects together, link them at their flash addresses, extract the resulting addresses, and have the main program (which runs from RAM) link against those. But how do I do these steps?
On the internet there is the term 'prelink', as well as a matching program for Linux; it is intended to speed up loading times. I do not know whether that program might help me here as a side effect. I doubt it, but I do not understand the internals of how it works.
Do you have a good idea how to reach the goal?
You are solving a non-problem. Embedded flash usually has a minimum endurance of 10,000 write cycles. So even if you flash it 20 times a day, it will last a year and a half. An ST Nucleo board is $13, so that's less than 3 cents a day :-). The typical endurance is even higher, at about 100,000 cycles. It will be a long time before you wear them out.
Now, if you are using the flash for dynamic storage, that might be a concern, depending on the usage patterns.
But to answer your question: you can build your code into a library .a file easily enough. However, GCC does not guarantee that it links the object code in any particular order, as that depends on the optimization level. Furthermore, only functions that are actually referenced get pulled in from a library, so if your function calls change, it may pull in more or fewer library functions.
tl;dr I want a rapid edit-compile-run workflow, but prefixing every single function call with "somenamespace_" is annoying.
One (debatable) advantage of C is that you can have separate compilation units. Large amounts of code can be compiled into objects and libraries, which are much faster to link together than parsing the same amount of C source. The result can be slower to run, since inlining optimizations can't be done across compilation units, but it is very fast to link, especially with the gold linker (ld.gold).
Problem being, it's C. Lack of namespaces, pretty much.
But even C++ doesn't (in practice) use separate compilation units. Sure you can, but for the most part it's about including megabytes of templates in header files. The whole philosophy behind "private:" is to pretend you have separated interfaces without actually having them at all. So the standard practice in a language matters too: even if I make my own isolated binary interfaces, if each implementation has to #include the same tons of code from third parties, the time saved by isolating them doesn't add up. C++ is kind of... featureful for me, anyway. Really, I just want namespaces. (And modules... sigh)
Languages like Python, Racket and Java use partial compilation, which seems fast enough, but you still get startup slowdowns for large projects, as they have to translate all that bytecode into machine code every time. There's no option (outside of writing a C interface) to isolate code in a way that's fast to combine with the code you are working on.
I'd just like to know which languages let large amounts of code be concealed behind small, quick-to-load interfaces, so that compiling them initially might be slow, but then I get a rapid edit-compile-run cycle while I work on parts of it. Instead of this Python hack, where I change something in the progress displayer, and then have to sit there staring at it as it loads the standard library, then the database code, then the web server code, then the image processing code, and then sits there for another 20 seconds figuring out the gobject-introspection thing for some GUI code.
It's not always obvious. I'm staring at D trying to figure out when it parses the code from dependencies as it recompiles stuff, without a clue. Go seems to just slap all the code together (including dependencies!) into a single compilation unit, but maybe I'm wrong there? And I don't think Nim regenerates all the generated C on every compile, but maybe it does? Rust uses separate compilation units (I think), but it's still slow as heck to compile! And Python really does compile fast, so it's only once my projects start getting big and successful that I start getting tripped up by it.
I haven't learned the other languages you mentioned, such as Rust, D, or Python. When you compile a Nim program, it creates a folder called nimcache containing all the generated .c files and the compiled .o files. If you make a change to the Nim program and recompile, it tries to reuse the files in that nimcache folder.
Does compile time matter? How does the code affect the compile time? Lastly, do comments in code affect the compile time?
Long compiles change how you think about working. If it costs you half an hour every time you change something, it makes you avoid changing things. This can actually be good: you end up thinking a lot more carefully since you can't experiment. Mostly though, it's a problem since it discourages trying new things.
Last first: comments don't affect compile time perceptibly; they're stripped off by the preprocessor before any of the actual work of compilation is commenced.
I wouldn't consider compile time an indicator of the quality of programming; the biggest thing that affects compile time is the actual size of the code (exclusive of comments, as previously noted). Where compile time is an issue is that, first, it contributes to the weight gain and other health issues of programmers ("Compiling now, time for a soda."), and second, if extreme, it may contribute very slightly to the cost of a project, though I'd never expect it to be a huge issue unless recompiling for tiny changes in a huge project.
And this is one reason why large projects are universally handled as multiple, separately compiled modules; make a small change, and only compilation of the affected module and relinking are required, rather than a full build of the entire project.
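For instance (the file names here are made up), code split like this means a change to the implementation recompiles only one file, followed by a relink:

```
// image_filter.h -- small interface; rarely changes
#ifndef IMAGE_FILTER_H
#define IMAGE_FILTER_H

int brighten(int pixel, int amount);

#endif

// image_filter.cpp -- implementation; editing this recompiles only this file
#include "image_filter.h"

int brighten(int pixel, int amount)
{
    int value = pixel + amount;
    return value > 255 ? 255 : value;
}

// main.cpp -- depends only on the small header, so it is not recompiled
// when image_filter.cpp changes; only the final link is redone
#include "image_filter.h"

int main()
{
    return brighten(200, 100) == 255 ? 0 : 1;
}
```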
I recently started doing some investigation into asm.js and I've played with a few of the demos online. I must say that the Unreal demo was quite impressive... I've been developing an app using Three.js for many months now. It works beautifully on fast machines, but on lower-end ones it tends to struggle. When I ran the Unreal demo on my lower-end machines, it worked like a dream. My question is, what place might asm.js have with Three.js - could it vastly speed up the engine? Is it worthwhile investigating or developing a solution that uses both and switches between them based on the browser? Also, are there any plans for Three.js to take advantage of it in the future?
I came from a C++ background, and would be quite interested in the prospect of developing something. But at the same time it would mean having to re-learn the language and even more problematic might be the large amount of time it would take to get it to a usable point.
What are your thoughts?
This is my opinion:
First and foremost, asm.js isn't really meant to be written by hand. Although, having said that, it certainly is possible to write it by hand, since it has a validator. The Unreal demo is C++ code that has been compiled to asm.js with Emscripten, and it doesn't need to interact with any code outside of what gets compiled. The result is highly optimized: the Unreal demo is already highly optimized C++, it gets optimized again by the compiler, and then gets another round of optimization through asm.js.
Secondly, asm.js is currently only really supported by Firefox. All other browsers can execute it, but on most of them it still incurs a performance penalty compared to normal JavaScript code doing the same thing. Just search jsperf.com for examples of this.
Okay, those are some general guidelines about asm.js. Now let's talk about Three.js.
Firstly, because Three.js has to interact with user code, it isn't easy to turn it into an asm.js library, given asm.js's many restrictions (no objects, for instance).
Secondly, Three.js will not gain a whole lot of performance in the kinds of calculations asm.js is strong at, but it will gain more from future browser updates. (For instance, faster creation of typed arrays in Chrome, which is currently a pain point for Three.js, is coming soon; see the V8 issue.)
Thirdly, asm.js code needs to manage its own memory, which would mean Three.js has to figure out a way to make large apps work within a limited heap, or make every application very memory-hungry.
Fourth, comparing the Unreal demo with Three.js is a bit unfair: Three.js tries to let everyone write 3D apps, while the Unreal Engine is a highly optimized engine for 3D games.
As you've noticed, I'm mostly against asm.js in Three.js. But that's because it's too early to tell what the best way to go is. There is a high probability that asm.js will eventually get a place in Three.js, but in a more limited role, as a renderer-only backend for instance. For now, there are still too many unsolved questions around asm.js.
But if you want to use asm.js and C++, then I recommend Emscripten, which was used to build the Unreal demo.
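As a rough sketch of what that looks like (the function and file names are made up, and the exact emcc flags vary between Emscripten versions):

```
// hot_loop.cpp -- compile with something along the lines of:
//   emcc -O2 hot_loop.cpp -o hot_loop.js
// (older Emscripten versions used flags like -s ASM_JS=1)
#include <emscripten/emscripten.h>

// extern "C" prevents C++ name mangling so the function is callable from
// JavaScript; EMSCRIPTEN_KEEPALIVE keeps it from being stripped as unused.
extern "C" EMSCRIPTEN_KEEPALIVE
float sum_of_squares(const float* values, int count)
{
    float total = 0.0f;
    for (int i = 0; i < count; ++i)
        total += values[i] * values[i];
    return total;
}
```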
This is of course my opinion. But I think it somewhat represents what #Mr.doob and #WestLangley had in mind. And sorry about the long post.
The best way to find out is to write a small demo in C (by hand) then compile to asm.js and run it, then write the same small demo in JS with Three.js (by hand) then run that, and compare the differences in both developer experience as well as performance.
I'm working on a music visualization plugin for libvisual. It's an AVS clone -- AVS being from Winamp. Right now I have a superscope plugin. This element has 4 scripts, and "point" is run at every pixel. You can imagine that it has to be rather fast. The original libvisual avs clone had a JIT compiler that was really fast, but it had some bugs and wasn't fully implemented, so I decided to try v8. Well, v8 is too slow running the compiled script at every pixel. Is there any other script engine that would be pretty fast for this purpose?
If you are running your updates at a per-pixel level, I would suggest having an off-screen, in-memory representation of the frame and updating the screen as a whole, not each individual pixel. This is a common issue for bitmap updates in general, not V8 per se. I don't know enough about the specific environment you are working in to be much help, other than to say that updating individual pixels against a UI canvas one at a time is a common performance problem. If you can keep an offline/off-screen representation of your canvas/UI surface and then update it all at once, your performance will be much better.
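Something like this, where present_frame stands in for whatever call your environment actually uses to push a finished frame to the screen (the names here are made up):

```
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the real "push a whole frame to the screen" call in your
// environment; the point is that it runs once per frame, not once per pixel.
void present_frame(const std::uint32_t* pixels, int width, int height);

void render_frame(int width, int height, int frame)
{
    // Off-screen, in-memory representation of the frame.
    static std::vector<std::uint32_t> framebuffer;
    framebuffer.assign(static_cast<std::size_t>(width) * height, 0);

    // Run the per-pixel logic against plain memory...
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            framebuffer[static_cast<std::size_t>(y) * width + x] =
                0xFF000000u | static_cast<std::uint32_t>((x ^ (y + frame)) & 0xFF);

    // ...then hand the whole frame over in a single call.
    present_frame(framebuffer.data(), width, height);
}
```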
Also, some of this will depend on how your event model is worked out. If that doesn't work well, you may need to move this logic into a compiled COM object or something, but with a per-pixel update scheme you will run into the same issues either way. I'm not saying that's what you're doing, just noting again that this is the most common issue with this type of problem.
Sounds like you need to use native code, or maybe a Java applet (not that I recommend a Java applet; use one only if you are in full control of the client environment).