Measuring WebGL efficiency - three.js

I’m working on a three.js project that requires some heavy-duty work done within a fragment shader, so I am looking for a way to use lower quality if the device can’t handle the work.
By pure accident I recently included a ‘uint’ uniform in my shader code and found it just would not run on older devices. So the availability of WebGL2 became an obvious and good switch.
The problem is that WebGL2 support is a browser choice, and some older devices with later software will still run it, even if very badly.
Is there a quick test to determine WebGL efficiency so I can fall back to lower quality if needed?
Measuring FPS is not an option since even on a modern device it can take a few seconds for it to stabilize for a new page.
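For reference, the WebGL2 switch described above can be sketched as a small feature check. This is only a minimal sketch: `supportsWebGL2` is a hypothetical helper name, and it reports only whether the browser hands out a `webgl2` context, which says nothing about how well the device runs it.

```javascript
// Hypothetical helper: reports whether a canvas can provide a WebGL2 context.
// It takes a canvas-like object so it can also be exercised outside a browser.
function supportsWebGL2(canvas) {
  try {
    // getContext returns null when the requested context type is unavailable.
    return canvas.getContext('webgl2') != null;
  } catch (e) {
    // Treat any failure to create the context as "not supported".
    return false;
  }
}

// In a browser this would be called as:
//   supportsWebGL2(document.createElement('canvas'))
```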

This is not a general solution.
But in my particular situation I am using a very expensive SDF that is needed in both the Hi and Lo versions of the graphics. It is generated once and stored in an FBO, then used again multiple times as a texture.
Even on a desktop using an RTX 3060 Ti it takes more than 20 ms to generate the texture; on the old S4 it takes 320+ ms.
These are not ideal metrics, but with a bit of tuning they should provide a way of guesstimating the GPU's capability and give a good indication of when to fall back to simpler graphics.
There will always be a cut off of what we support but being able to get the best from older devices is not a bad thing.
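A rough sketch of how the generation time could be turned into a quality switch is shown below. The helper names, the callback shape and the 100 ms cutoff are all assumptions made for illustration (the cutoff is simply a value between the 20 ms and 320 ms figures above) and would need tuning on real devices.

```javascript
// Hypothetical helper: times a one-off GPU job using the CPU clock.
//   renderOnce - issues the draw (e.g. renders the SDF into the FBO)
//   forceSync  - blocks until the GPU has finished (e.g. a 1x1 pixel readback)
//   now        - clock in milliseconds; defaults to performance.now()
function timeGpuJob(renderOnce, forceSync, now = () => performance.now()) {
  const start = now();
  renderOnce();
  forceSync(); // without a sync point only the command submission is timed
  return now() - start;
}

// Assumed cutoff, chosen between the 20 ms and 320 ms measurements above.
const LO_QUALITY_CUTOFF_MS = 100;

// Maps a measured generation time to a quality tier.
function pickQuality(elapsedMs, cutoffMs = LO_QUALITY_CUTOFF_MS) {
  return elapsedMs > cutoffMs ? 'lo' : 'hi';
}
```

In three.js the sync point could plausibly be a 1x1 `renderer.readRenderTargetPixels(...)` call on the render target, although that is a suggestion rather than what was measured here. The first run will usually also include shader compilation time, which is one reason the numbers need tuning.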

Related

Leap Motion point cloud

How can we access the point cloud in the Leap Motion API? One feature that led me to purchase it was the point cloud demo from their promo video, but I can't seem to locate documentation regarding it and user replies on the forums seem mixed. Am I just missing something?
I'm looking to use the Leap Motion as a sort of cheap 3D scanner.
That demo was clearly a mockup which simulated a 3-D model of the human hand, not actual point cloud data. You can tell by the fact that points were displayed which could not have possibly been read by the sensor, due to obstruction.
orion78fr points to one forum post on this, but the transcript of an interview by the founders provides more information direct from the source:
Can you please allow access to cloud points in SDK?
David: So I think sometimes people have a misperception as to really
how things work in our hardware. It’s very different from other things
like the Kinect, and in normal device operation we have very different
priorities than most other technologies. Our priority is precision,
small movements, very low latency, very low CPU usage - so in order to
do that we will often be making sacrifices that make what the device
is doing completely not applicable to what I think you’re getting at,
which is 3D scanning.
What we’re working on are sort of alternative device modes that will
let you use it for those sorts of purposes, but that’s not what it was
originally built for. You know, it’s our goal to let it be able to do
those things and with the hardware can do many things. But our
priority right now is of course human computer interaction, which we
think is really the missing component in technology, and that’s our
core passion.
Michael: We really believe in trying to squeeze every ounce of
optimization and performance out of the devices for the purpose they
were built. So in this case the Leap today is intended to be a great
human computer interface. And we have made thousands of little
optimizations along the way to make it better, that might sacrifice
things in the process that might be useful for things like 3D scanning
objects. But those are intentional decisions, but they don’t mean that
we think 3D scanning isn’t exciting and isn’t a good use case. There
will be other things we build as a company in the future, and other
devices that might be able to do both or maybe there will be two
different devices. One that is fully optimized for 3D scanning, and
one that continues to be optimized and as great as it can be at
tracking fingers and hands.
If we haven’t done a good job communicating that the device isn’t
about 3D scanning or isn’t going to be able to 3D scan, that’s
unfortunate and it’s a mistake on our part - but that’s something that
we’ve had to sacrifice. The good news is that those sacrifices have
made the main device really exceptional at tracking hands and fingers.
I have developed with the Leap Motion Controller as well as several other 3-D scanning systems, and from what I've seen I'd seriously doubt if we're ever going to get point cloud data out of the currently shipping hardware. If we do, the fidelity will be far below what we see for gross finger and hand tracking from that device.
There are some low-cost alternatives for 3-D scanning that have started to emerge. SoftKinetic has their DepthSense 325 camera for $250 (which is effectively the same as the Creative Gesture Camera that is only $150 right now). The DS 325 is a time-of-flight IR camera that gives you a 320x240 point cloud map of the 3-D space in front of it. In my tests, it worked well with opaque materials, but anything with a little gloss or shininess gave it trouble.
The PrimeSense Carmine 1.09 ($200) uses structured light to get point cloud data in front of it, as an advancement of the technology they supplied for the original Kinect. It has a lower effective spatial resolution than the SoftKinetic cameras, but it seems to provide less depth noise and to work on a wider variety of materials.
The DUO was also a promising project, but unfortunately its Kickstarter campaign failed. It was using stereoscopic imaging from an IR source to return a point cloud from a couple of PS3 Eye cameras. They may restart that project at some point in the future.
While the Leap may not do what you want, it looks like more and more devices are coming out in the consumer price range to enable 3-D scanning.
See this link
It says that yes, Leap Motion can theoretically handle point clouds, and this was temporarily part of the visualiser in beta, but no, you can't access it using the Leap Motion APIs right now.
It may appear in the future, but it's not a priority for the Leap Motion team.
As of Leap Motion SDK 2.x, you can at least access the stereo camera images. In my own experience this is a convenient solution for many tasks where point cloud data would otherwise be wanted, which is why I mention it here, even though it does not expose the point cloud data that the driver generates internally to extract the pointer metadata. It does mean you can now generate a point cloud yourself, so I think it is closely related to the question.
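To illustrate what generating a point cloud from stereo images involves, here is a minimal sketch of the standard stereo triangulation step. It assumes the two images have already been undistorted and rectified, and that a matching step has produced a disparity for each pixel. None of this is Leap Motion API code, and the function names and parameters are purely illustrative.

```javascript
// Standard pinhole stereo relation: depth = focalLength * baseline / disparity.
//   focalPx     - focal length in pixels (assumed known from calibration)
//   baselineM   - distance between the two cameras in metres
//   disparityPx - horizontal shift of the same feature between the two images
function depthFromDisparity(focalPx, baselineM, disparityPx) {
  if (disparityPx <= 0) return Infinity; // no measurable shift: treat as infinitely far
  return (focalPx * baselineM) / disparityPx;
}

// Back-projects one pixel (u, v) into a 3-D point in the left camera's frame.
// (cx, cy) is the principal point in pixels.
function pixelToPoint(u, v, disparityPx, focalPx, baselineM, cx, cy) {
  const z = depthFromDisparity(focalPx, baselineM, disparityPx);
  return {
    x: ((u - cx) * z) / focalPx,
    y: ((v - cy) * z) / focalPx,
    z,
  };
}
```

Running `pixelToPoint` over every pixel with a valid disparity yields the point cloud. The hard part in practice is the matching step that produces the disparities, which this sketch leaves out.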
Currently there is no access to the point cloud in the public API. But I think this video is no mock-up, so there should be a possibility:
http://www.youtube.com/watch?v=MYgsAMKLu7s#t=40s
Road to VR recently reviewed the Nimble Sense Kickstarter, which uses point clouds.
It’s the same technology that the Kinect 2 uses, and it’s supposed to have some advantages over the Leap Motion.
Because it’s a depth sensing camera, you can point the camera top-down like the Touch+, although their product will not ship till next year.

Is it possible to use GPU for raytracing without CUDA/OpenCL etc?

I'm working on Windows Phone 7, which does not support features like CUDA or OpenCL. I'm new to the GPU side of things. Is there anything on the GPU that I can use to help speed up raytracing, like triangle intersection tests or selecting the correct colour from a texture?
CUDA and the like are really just higher-level languages for programming shaders, so any platform that supports programmable shaders gives you some capability to run general-purpose calculations on the GPU.
Unfortunately, it looks like Windows Phone 7 does not support custom programmable shaders, so GPU acceleration for a ray tracer is not really possible at this time. Even if it were, it is very difficult to use a GPU effectively for raytracing because of several very anti-GPU characteristics:
- Poor memory coherency (each ray can easily interact with completely different geometry)
- High branching factor (shaders work best with code that consistently follows a single path)
- Large working set (a lot of geometry has to be accessible in memory at any one time to compute the outcome of even a single ray)
If your goal is to write a raytracer, it would probably be far easier to do completely on the CPU, and only then consider optimizations that are more esoteric.
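As a starting point for the CPU route, the triangle intersection test asked about in the question might look like the sketch below. It uses the well-known Möller–Trumbore algorithm, written in JavaScript purely for illustration (a Windows Phone 7 app would use its own language), and the function and helper names are my own. The early returns are exactly the kind of per-ray branching described in the list above.

```javascript
// Small vector helpers operating on [x, y, z] arrays.
const sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];

// Möller–Trumbore ray/triangle test.
// Returns the distance t along the ray to the hit point, or null on a miss.
function intersectTriangle(origin, dir, v0, v1, v2, eps = 1e-9) {
  const e1 = sub(v1, v0);
  const e2 = sub(v2, v0);
  const p = cross(dir, e2);
  const det = dot(e1, p);
  if (Math.abs(det) < eps) return null; // ray is parallel to the triangle plane
  const inv = 1 / det;
  const s = sub(origin, v0);
  const u = dot(s, p) * inv; // first barycentric coordinate
  if (u < 0 || u > 1) return null;
  const q = cross(s, e1);
  const v = dot(dir, q) * inv; // second barycentric coordinate
  if (v < 0 || u + v > 1) return null;
  const t = dot(e2, q) * inv;
  return t > eps ? t : null; // ignore hits behind the ray origin
}
```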
Raytracing is still a bit slow, even on a modern average desktop PC. You can speed it up by shooting just primary rays, but then rasterisation methods will actually be better and faster.
Are you certain you want to do ray tracing on a phone, which has even less compute power than a PC? Phones are not designed to do that kind of work.

Detect whether a Quartz Composition in a QCView will be rendered through software or hardware

I have a feeling there are combinations of Cocoa Quartz Compositions and GPUs which can't be handled by the GPU and which fall back on the software renderer, even if Core Image is "accelerated" normally. How would I detect such a situation?
Or more generally, how do I detect that a machine is too underpowered to handle a certain composition of a certain size, without actually playing the composition and measuring the FPS?
(Measuring the FPS through playing the composition in a hidden window is unlikely to work, since the QCView might detect that situation and optimise away the whole operation, or parts thereof. And even if it didn't do that today it might start doing that with the next update from Apple - it'd be an unreliable solution.)
Update: to be thorough I did write some code to test render the composition at full resolution in an ordered out but properly sized window, trying to force the render to happen with [self startRendering];[self snapshotImage];[self stopRendering];. This took an amount of time which looked reasonable at first, until it turned out the slow machine was faster at running this test than the fast one. ;) In reality the slow machine renders the composition at a measly 2.24 FPS vs 27 FPS on the fast machine.
I'm guessing you're asking so that you can make a simpler fallback animation for weaker systems?
One option may be to check the user's hardware string as is mentioned here:
GPU Chipset Detection.
glGetString can return GL_VENDOR, GL_RENDERER, GL_VERSION, or GL_EXTENSIONS. You could theoretically use GL_VENDOR to identify Intel GMAs as too slow, or compare GL_RENDERER to a list of known poor-performing GPUs. If you're writing code for 10.6+ only, you only have to compare to GPUs used in Intel Macs, so the list shouldn't be too long.
This might not be quite the elegant solution you're looking for, but it should do the trick. I would also provide the user with an override to choose the higher or lower quality graphics if they wish.
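The matching logic itself is simple. Here is a minimal sketch, written in JavaScript purely for illustration (a Cocoa app would do the same in Objective-C). The deny-list patterns are assumptions rather than a tested list, and the function name is hypothetical.

```javascript
// Assumed, illustrative deny-list; a real list would come from testing
// the composition on actual hardware.
const SLOW_RENDERER_PATTERNS = [/Intel.*GMA/i, /software/i];

// rendererString - the value returned by glGetString(GL_RENDERER)
// userOverride   - 'hi', 'lo', or undefined to decide automatically
function chooseQuality(rendererString, userOverride, patterns = SLOW_RENDERER_PATTERNS) {
  // An explicit user choice always wins over the automatic guess.
  if (userOverride === 'hi' || userOverride === 'lo') return userOverride;
  const slow = patterns.some((re) => re.test(rendererString));
  return slow ? 'lo' : 'hi';
}
```

Including a pattern for software renderers in the same list is one possible way to catch the software fallback case from the question, assuming the renderer string identifies it.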

Why not use GDI to repeatedly fill a window with RGB data from an array?

This is a follow-up to this question. I'm currently writing a simple game and am looking for the fastest way to (repeatedly) display an array of RGB data in a Win32 window, without flickering or other artifacts.
Several different approaches were recommended in the answers to the previous question, but there was no consensus on which would be the fastest. So, I threw together a test program. The code simply displays a framebuffer on the screen repeatedly, as fast as possible.
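The measuring loop of such a test can be sketched as follows. This is not the actual test program, which makes Win32, DirectDraw and Direct3D calls; it is a hypothetical harness written in JavaScript for brevity, with `presentFrame` standing in for whichever display path is being timed.

```javascript
// Hypothetical harness: calls presentFrame() `frames` times back to back
// and reports the average rate. `now` returns milliseconds and can be
// replaced with a fake clock for testing.
function measureFps(presentFrame, frames, now = () => performance.now()) {
  const start = now();
  for (let i = 0; i < frames; i++) {
    presentFrame();
  }
  const elapsedMs = now() - start;
  return {
    fps: (frames * 1000) / elapsedMs,
    msPerFrame: elapsedMs / frames,
  };
}
```

Looking at milliseconds per frame as well as frames per second helps when comparing results: 2000 fps is 0.5 ms per frame, while 500 fps is 2 ms per frame.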
These are the results I obtained, for 32-bit data running in a 32-bit video mode - they may surprise some people:
- Direct3D (1): 500 fps
- Direct3D (2): 650 fps
- DirectDraw (3): 1100 fps
- DirectDraw (4): 800 fps
- GDI (SetDIBitsToDevice): 2000 fps
Given these figures:
Why are many people adamant that GDI is simply too slow for this operation?
Is there any reason to prefer DirectDraw or Direct3D over SetDIBitsToDevice?
Here is a brief summary of the calls made by each of the Direct* codepaths. If anyone knows a more efficient way to use DirectDraw/Direct3D, please comment.
1. CreateTexture(D3DUSAGE_DYNAMIC, D3DPOOL_DEFAULT);
LockRect(); memcpy(); UnlockRect(); DrawPrimitive()
2. CreateTexture(0, D3DPOOL_SYSTEMMEM); CreateTexture(0, D3DPOOL_DEFAULT);
LockRect(); memcpy(); UnlockRect(); UpdateTexture(); DrawPrimitive()
3. CreateSurface(); SetSurfaceDesc(lpSurface = &frameBuffer[0]);
memcpy(); primarySurface->Blt();
4. CreateSurface();
Lock(); memcpy(); Unlock(); primarySurface->Blt();
There are a couple of things to keep in mind here. First of all, a lot of "common knowledge" is based on some facts that no longer really apply.
In the days of AGP, when the CPU talked directly to the GPU, it always used the base PCI protocol, which happened at the "1x" rate (always and inevitably). AGP 2x/4x/8x only applied when the GPU was talking to the memory controller directly. In other words, depending on when you looked, it was up to 8 times as fast to have the GPU load a texture from memory as it was for the CPU to send the same data directly to the GPU. Of course, the CPU also had a great deal more bandwidth to memory than the PCI bus supported.
When things switched to PCI-E, however, that changed completely. While there can be differences in bandwidth depending on path, there's no general rule that memory->GPU will be faster than CPU->GPU. The one generalization that's (mostly) safe is that if you have a dedicated graphics card, then the GPU will almost always have more bandwidth to the memory on the graphics card than it does to main memory on the motherboard.
In your case, that doesn't matter much though -- you're talking about moving data from CPU space to GPU space regardless. The main speed difference with using DirectX (or OpenGL) happens when you keep all (or most) of the computation on the GPU, and avoid using the CPU (or main memory) at all. They don't (now that AGP is history) provide any substantial improvement in memory->display bandwidth.
Jerry Coffin makes some good points. The thing to bear in mind is what the DI stands for in SetDIBitsToDevice. It stands for Device Independent, which means you were ALWAYS at the mercy of drivers. Some drivers used to be complete rubbish and it affected the performance massively. DirectDraw suffered from similar issues as well ... but you also had access to the hardware blitters so it was generally more useful. IHVs also tended to put more time into writing proper drivers for DirectDraw because of its gaming association. Who wants to be the bottom of the performance pile when the hardware is quite capable of doing better?
These days many graphics cards can accept the bit data directly so no conversion happens. If it does need to be swizzled this is also INCREDIBLY quick in this day and age.
The reason your Direct3D performance is so terrible, by comparison, is that Direct3D, by nature of the fact it is meant to be used totally internally to the GPU, uses odd and complex formats to improve cache performance and so forth.
Couple that with the fact that you aren't testing like for like (with DDraw and D3D) by creating a texture/surface, locking it, copying, unlocking and then drawing over the back buffer (via various methods). To get best performance you'd be best off directly locking the backbuffer using a DISCARD lock then memcpy'ing directly into the returned buffer before unlocking. This will bring your performance much closer to the SetDIBitsToDevice. I still would expect D3D to be slower than DDraw, however, for the reasons outlined above.
The reason you will hear people trounce on GDI is that it used to just be old Windows API calls. The newer versions of it (that were called GDI+ when I last looked at them) are actually just an API placed on top of DirectX calls. So using GDI may seem fairly simple programming-wise at times, but adding a layer between things always slows things down. As mentioned in the response from Jerry Coffin, your examples are about moving the data, and that is the slow part. I am a bit surprised that DirectX is that much slower, though, but I cannot be much more help without digging through the DirectX documentation (which has been pretty awesome for quite some time, really). You might want to check out www.codesampler.com. I have always found good starting places from him and actually, while I may be insane for saying this, I would swear the improvements to the DirectX SDK docs and examples were based on this guy's work!
As for the DirectDraw vs Direct3D (and not the GDI calls) discussion, I would say go to Direct3D. I believe DirectDraw has been deprecated since 8.0 or so, and 9.0 has been around for quite a long while. At the end of the day all of DirectX is 3D; it just varies in the level of helpful 2D APIs that are around, and you may find you can do some very interesting things in a 2D environment when you are actually using 3D space. (I had a pretty neat randomly generated lightning weapon for a space invaders clone at one time :))
Anywho, hope this helped!
PS: It should be noted that DirectX is not always the fastest. For keyboard input (unless this has changed in 10 or 11) it has pretty much always been recommended to use the Windows events, as DirectInput was actually just a wrapper for that system! XInput, however, is -awesome-!!

Porting DirectX to OpenGL ES (iPhone)

I have been asked to investigate porting 10-year-old DirectX (v7-9) games to OpenGL ES, initially for the iPhone.
I have never undertaken a game port like this before (and will be hiring someone to do it) but I'd like to understand the process.
Are there any resources/books/blogs that will help me in understanding the process?
Are there any projects like Mono that can accomplish this?
TBH, a porting job like this is involved but fairly easy.
First you start by replacing all the DirectX calls with "stubs" (i.e. empty functions). You do this until you can get the software to compile. Once it has compiled, you start implementing all the stub functions. There will be a number of gotchas along the way, but it's worth doing.
If you need to port to and support phones before iPhone 3GS you have a more complex task as the hardware only supports GLES 1 which is fixed-function only. You will have to "emulate" these shaders somehow. On mobile platforms I have written, in the past, assembler code that performs "vertex shading" directly on the vertex data. Pixel shading is often more complicated but you can usually provide enough information through the "vertex shading" to get this going. Some graphical features you may just have to drop.
Later versions of the iPhone use GLES 2 so you have access to GLSL ... ATI have written, and Aras P of Unity3D fame has extended, software that will port HLSL code to GLSL.
Once you have done all this you get on to the optimisation stage. You will probably find that your first pass isn't very efficient. This is perfectly normal. At this point you can look at the code from a higher level and see how you can move code around and do things differently to get best performance.
In summary: Your first step will be to get the code to compile without DirectX. Your next step will be the actual porting of DirectX calls to OpenGL ES calls. Finally you will want to refactor the remaining code for best performance.
(P.S.: I'd be happy to do the porting work for you. Contact me through the LinkedIn page in my profile ;)).
Not a complete answer, but in the hope of helping a little...
I'm not aware of anything targeting OpenGL ES specifically, but Cedega, Cider and VirtualBox, amongst others, provide translation of DirectX calls to OpenGL calls, and OpenGL ES is, broadly speaking, OpenGL with a lot of very rarely used bits and some slower and redundant parts removed. So it would probably be worth at least investigating those products; at least VirtualBox is open source.
The SGX part in the iPhone 3GS onwards has a fully programmable pipeline, making it equivalent to a DirectX 10 part, so the hardware is there. The older MBX is fixed pipeline with the dot3 extension but no cube maps and only two texture units. It also has the matrix palette extension, so you can do good animation and pretty good lighting if multiple passes are acceptable.

Resources