GLES performance bottleneck in iOS 11 (seen on 6S+)

EDIT: This is not a CPU-side code bottleneck. Profiling shows identical CPU usage regardless of the fluctuating GPU time. Also, I incorrectly referred to glBufferData when I first posted; the call in question is glBufferSubData.
As part of an optimisation drive for the Android build of our app, I reorganised the render pipeline of our engine in a way that exposed a very significant performance bottleneck on certain iOS devices under iOS 11.
Our engine manages a mixed scene of 2D geometry (UI) and 3D static models. The 2D geometry is dynamic, and generated each frame.
Until recently, every time a 2D draw call had to be issued (a change in texture, blend mode, shader, etc.), the sprite data generated so far would be uploaded via glBufferSubData and then rendered. This was fine on iOS, less so on Android.
To improve performance on Android I made the engine put all the 2D geometry into a single data buffer, upload it to a single VBO with one glBufferSubData call at the start of the next frame, and then execute draw calls using sub-sections of that buffer. This led to a significant boost on Android devices, and made no difference to some of our iOS test devices (e.g. an iPod Touch 6 running iOS 10).
However, on a 6S+ running iOS 11, the new pipeline increased the GPU time from a rock-stable 8ms to a wildly-fluctuating 10-14ms, often pushing the game down to 30fps instead of 60. This is not the first time something previously innocuous has become performance-killing with an iOS update.
I've now made the buffering-up a compile option and regained the lost performance on iOS. But if you are generating significant amounts of dynamic geometry and struggling with performance on iOS 11, you may see an improvement if you stop trying to batch glBufferSubData uploads and instead perform smaller ones interspersed with render calls.
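For readers who want to try the batched variant, the per-state-change bookkeeping can be sketched as follows (the Quad layout, class, and method names are illustrative, not from the engine in question; the actual GL upload and draw calls are left as comments):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative layouts; the real engine's vertex format will differ.
struct Vertex { float x, y, u, v; };
struct Quad   { Vertex v[4]; };

// One draw call covering a contiguous sub-section of the batched VBO.
struct DrawRange {
    std::size_t firstQuad;  // index of the first quad in this range
    std::size_t quadCount;  // quads drawn with the same texture/blend/shader
};

class QuadBatcher {
public:
    // Append a quad generated this frame.
    void addQuad(const Quad& q) { quads_.push_back(q); open_ = true; }

    // A render-state change (texture, blend mode, shader) ends the
    // current draw range.
    void closeRange() {
        if (!open_) return;
        std::size_t first = ranges_.empty()
            ? 0 : ranges_.back().firstQuad + ranges_.back().quadCount;
        ranges_.push_back({first, quads_.size() - first});
        open_ = false;
    }

    // At the start of the next frame: one upload, then one indexed draw
    // per recorded range.
    void flush() {
        closeRange();
        // glBindBuffer(GL_ARRAY_BUFFER, vbo);
        // glBufferSubData(GL_ARRAY_BUFFER, 0,
        //                 quads_.size() * sizeof(Quad), quads_.data());
        // for (const DrawRange& r : ranges_)
        //     issue an indexed draw of r.quadCount quads at r.firstQuad
    }

    const std::vector<DrawRange>& ranges() const { return ranges_; }

private:
    std::vector<Quad>      quads_;
    std::vector<DrawRange> ranges_;
    bool                   open_ = false;
};
```

Whether this wins depends on the driver, as the iOS 11 regression above shows, which is why it is worth keeping behind a compile-time switch.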

Related

Kinect 2 - Significant delay on hand motion

I'm using the Kinect 2 to rotate and zoom a virtual camera around a 3D object by moving my hand in all three directions. The problem I'm currently tackling is that these operations are executed with noticeable delay. When my hand is steady again, the camera still continues to move for a short time. It feels as if I'm pushing the camera rather than controlling it in real time. Perhaps the frame rate is the problem: as far as I know the Kinect runs at 30 FPS while my application runs at 60 FPS (VSync enabled).
What could be the cause for this issue? How can I control my camera without any significant delay?
The Kinect is a very graphics- and processing-intensive piece of hardware. For your application I'd suggest a minimum specification of a GTX 960 and a 4th-gen i7 processor. Your hardware will be a prime factor in how fast you can compute Kinect data.
You're also going to want to avoid tight loops as much as possible and rely on multithreading instead; if you must loop, be certain there are no foreach loops, as they can take longer to execute. It is very important that the code reading data from the Kinect and the code issuing the positioning commands run asynchronously.
The Kinect will never be real time responsive. There is just too much data that it is handling, the best you can do is optimize your code and increase your hardware power to shrink the response time.
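One hedged sketch of the asynchronous decoupling this answer recommends: a single-slot "latest value" mailbox, where the Kinect thread always overwrites the newest sample and the 60 FPS render thread consumes whatever is freshest instead of working through a backlog of stale frames (a queue backlog is one common source of exactly this "pushed camera" lag). All names here are illustrative, not Kinect SDK types:

```cpp
#include <cassert>
#include <mutex>
#include <optional>
#include <utility>

// Illustrative hand-position sample; a real app would use Kinect SDK types.
struct HandSample { float x, y, z; };

// Single-slot "mailbox": the producer always overwrites, so the consumer
// never reads a backlog of stale samples.
class LatestSample {
public:
    // Kinect reader thread (~30 FPS): overwrite whatever is in the slot.
    void publish(const HandSample& s) {
        std::lock_guard<std::mutex> lock(m_);
        slot_ = s;
    }

    // Render thread (~60 FPS): take the freshest sample, if any.
    std::optional<HandSample> take() {
        std::lock_guard<std::mutex> lock(m_);
        return std::exchange(slot_, std::nullopt);
    }

private:
    std::mutex m_;
    std::optional<HandSample> slot_;
};
```

In a real app, publish() runs on the Kinect reader thread and take() in the render loop; any smoothing or filtering applied to the samples is a separate concern and another frequent source of perceived lag.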

FillGeometry much slower on UWP compared to desktop?

The UWP version of our app runs with a much slower framerate (6 fps vs 24 fps) compared to the desktop equivalent. Note that both versions were tested on the same hardware.
Both versions are built using SharpDX, the only difference is how the RenderTargets are set up. The Windows app uses an HwndRenderTarget, and the UWP app uses a SurfaceImageSource brush that paints into a Rectangle.
We've narrowed the main culprit (on the CPU side at least) to FillGeometry, which consumes a lot of the time on UWP.
Is there a reason why FillGeometry would take much longer in the above UWP configuration compared to desktop?
Note: The rendering code is identical on both, so avoid suggestions which impact both implementations equally, such as using GeometryRealization instead of Geometry. We're looking for the reason for the difference between the rendering performance on UWP and desktop.
If there are factors other than Geometry that might be affecting performance, it would be useful to know those as well, since our profiling tools might not be altogether precise.
One of the factors seems to be that internal Direct2D clipping works differently in these cases.
Our scene has hundreds of geometries, and our initial code did not clip to the viewport, and relied instead on Direct2D doing the clipping. This resulted in the differences in frame rates mentioned in the original post.
When explicit clipping was added, the frame rate for the UWP version increased to around 16fps (still less than that of the desktop app) while the frame rate of the desktop version was not affected much.
So at this point it is a hypothesis that different clipping routines are at work in these two cases.
This isn't fully solved, as we still have a significant difference in frame rates. But it's a start.
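For illustration, the explicit clipping described above amounts to a CPU-side cull before anything is handed to FillGeometry; a minimal sketch, with Rect standing in for whatever bounding boxes the app tracks:

```cpp
#include <cassert>
#include <vector>

// Axis-aligned bounds; stands in for a geometry's bounding box.
struct Rect { float left, top, right, bottom; };

// True when two rectangles overlap at all.
bool intersects(const Rect& a, const Rect& b) {
    return a.left < b.right && b.left < a.right &&
           a.top < b.bottom && b.top < a.bottom;
}

// Return only the geometries whose bounds touch the viewport; everything
// else is skipped up front instead of being handed to FillGeometry and
// left to Direct2D's internal clipping.
std::vector<Rect> cullToViewport(const std::vector<Rect>& bounds,
                                 const Rect& viewport) {
    std::vector<Rect> visible;
    for (const Rect& r : bounds)
        if (intersects(r, viewport)) visible.push_back(r);
    return visible;
}
```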

glBufferData very slow with big textures (sprites sheets) in Cocos2d-x 3.3

I'm working with Cocos2d-x to port my PC game to Android.
For the sprites part, I wanted to optimize the rendering process so I decided to dynamically create sprites sheets that contain the frames for all the sprites.
Unfortunately, this makes the rendering process about 10-15 times slower than using small textures containing only the frames for the current sprite (on mobile device, on Windows everything runs smoothly).
I initially thought it could be related to the switching between the sheets (big textures like 4096*4096) when the rendering process would display one sprite from one sheet, then another from another sheet and so on... making a lot of switches between huge textures.
So I sorted the sprites before "putting" their frames in the sprites sheets, and I can confirm that the switches are now non-existent.
After a long investigation, profiling, tests, etc., I finally found that one OpenGL function takes all the time:
glBufferData(GL_ARRAY_BUFFER, sizeof(_quadVerts[0]) * _numberQuads * 4, _quadVerts, GL_DYNAMIC_DRAW);
Calling this function takes a long time (the profiler says more than 20 ms per call) if I use the big textures, and is quite fast if I use small ones (about 2 ms).
I don't really know OpenGL; I'm using it because Cocos2d-x does, and I'm not comfortable trying to debug/optimize the engine myself because its developers are far better at that than I am :)
I might be misunderstanding something. I've been stuck on this for several days and have no idea what to do now.
Any clues?
Note: I'm talking about glBufferData but I have the same issue with glBindFramebuffer, very slow with big textures. I assume this is all the same topic.
Thanks
glBufferData is normally a costly call, since it involves a CPU-to-GPU transfer.
The logic behind Renderer::drawBatchedQuads is to flush the quads that have been buffered into a temporary array: the more quads you have to render, the more data has to be transferred.
Since the quad properties (positions, texture coordinates, colors) are likely to change each frame, a CPU-to-GPU transfer is required every frame, as hinted by the flag GL_DYNAMIC_DRAW.
According to specs:
GL_DYNAMIC_DRAW: The data store contents will be modified repeatedly and used many times as the source for GL drawing commands.
There are possible alternatives to glBufferData such as glMapBuffer or glBufferSubData that could be used for comparison.
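To make the transfer cost concrete: with a vertex layout like Cocos2d-x's V3F_C4B_T2F (position, color, texcoords), the number of bytes glBufferData moves per flush grows linearly with the quad count. A sketch under that assumption, with the alternative calls from the answer above left as comments:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors a V3F_C4B_T2F-style vertex: 3 position floats, 4 color bytes,
// 2 texture-coordinate floats (24 bytes, no padding on common ABIs).
struct V3F_C4B_T2F {
    float        vertices[3];
    std::uint8_t colors[4];
    float        texCoords[2];
};
struct Quad { V3F_C4B_T2F tl, bl, tr, br; };

// Bytes transferred per flush: this is what
//   glBufferData(GL_ARRAY_BUFFER, uploadBytes(n), quads, GL_DYNAMIC_DRAW)
// has to move, and why more buffered quads mean a slower call.
std::size_t uploadBytes(std::size_t numberQuads) {
    return sizeof(Quad) * numberQuads;
}

// Possible alternatives for comparison (allocate the store once at a
// maximum size, then refill each frame):
//   glBufferSubData(GL_ARRAY_BUFFER, 0, uploadBytes(n), quads);
// or map the store directly with glMapBuffer (glMapBufferOES on GLES 2.0).
```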

Windows Phone | Frame Based Animations And Memory Footprints

I am developing a small game for Windows Phone which is based on Silverlight animation.
Some animations use the Silverlight animation framework, like the Transforms API, and some animations are frame-based. What I am doing is running a Storyboard with a very short duration and, when its Completed event fires, changing the image frame there, so images get replaced every time the Completed event fires. But I think this is causing a memory leak in my game, and the memory footprint is increasing over time.
I want to ask: is this the right way to do frame-based animations, or is there a better way to do this in Silverlight?
What can I do to reduce memory consumption so that it does not increase over time?
As a general rule, beware animating anything which can't be GPU accelerated or bitmap cached. You haven't given enough information to tell if this is your issue but start by monitoring the frame rate counters, redraw regions and cache visualisation.
You can detect memory leaks with the built in profiling tools.
See DEBUG > Start Windows Phone Application Analysis

Suitability of using Core Animation on iOS vs using Cocos2D and OpenGL ES?

I finished a breakout game tutorial in a book, but the ball, which is a 20x20 pixel image, was skipping frames and not moving very smoothly. That is the case on the Simulator as well as on an iPhone 4S (the real thing). The code wasn't using NSTimer (which may be slower), but was using CADisplayLink and UIImageView setFrame to do the animation.
Is Core Animation on iOS not very suitable for development animation type of games? Say if it is a game of
1. Invaders (Space Invaders)
2. Breakout (as a game in a tutorial)
3. Arkanoid
4. Angry Birds / Cut the Rope / Fruit Ninja
For these types of games, is Core Animation really suitable for writing (2) above? And is it true that for (1), (3), and (4), either Cocos2D or OpenGL ES is better suited to the job, and that the performance of Cocos2D and OpenGL ES is very close?
Cocos2D is often chosen because of its ease of programming common game logic, like collision detection, frame-by-frame sprite animations, scaling, and other processes that are quite common in game development, where you string together multiple animations, combine them, sequence them, do callbacks, and more. That is one of the big benefits of the engine.
However, performance is another. Cocos offers batch nodes, which combine all graphic elements into a single OpenGL call, rather than "drawing" each to the screen separately in each frame; this can dramatically improve performance, especially for large graphics. If you had skipping frames, I wonder if batch sprites in Cocos would have been the missing link.
I'm very impressed by Core Animation and want to hope that it can hold its own with performance issues in games. My understanding is that CA is, like Cocos, also built on top of OpenGL ES, so I'd expect it possible to achieve good results in either. It could be that doing so in Cocos is easier simply because it has been designed and optimized internally for game development.
If you are having performance problems with a 2D app, the likely cause is a lack of understanding of how to get the most efficient results from CoreGraphics, as opposed to something that switching to OpenGL will fix. A 2D game will work just fine with CoreGraphics; you just need to start with the right approach.

First off, you should not re-render the entire view on each CADisplayLink callback. Instead, set up a UIView that contains multiple CALayer objects, and set each layer's contents like so: CALayer.contents = (id) cgImage. Then let the system take care of rendering when the x, y, or animation elements change. You just need to position your elements and define the animations that move them around.

With this approach, the system will cache the animating image on the graphics card behind the scenes and redraw using GPU operations.
