I have my app running on my iPad, but it is performing very badly -- I am getting below 15fps. Can anyone help me optimise it?
It is basically a wheel (derived from UIView) containing 12 buttons (derived from UIControl).
As the user spins it, the buttons dynamically expand and contract (e.g. the one at the 12 o'clock position should always be the biggest).
So my wheel contains a display link callback:
- (void) displayLinkIsCallingBack: (CADisplayLink *) dispLink
{
    // ...

    // using CATransaction like this goes from 14fps to 19fps
    [CATransaction begin];
    [CATransaction setDisableActions: YES];

    // NEG, as the coordinate system is flipped
    self.transform = CGAffineTransformMakeRotation(-thetaWheel);

    [CATransaction commit];

    if (BLA)
        [self rotateNotch: direction];
}
… which calculates from recent touch input the new rotation for the wheel. There is already one performance issue here which I am pursuing on a separate thread: iOS Core-Animation: Performance issues with CATransaction / Interpolating transform matrices
This routine also checks whether the wheel has completed another 1/12 rotation, and if so instructs all 12 buttons to resize:
// Wheel.m
- (void) rotateNotch: (int) direction
{
    for (int i = 0; i < [self buttonCount]; i++)
    {
        CustomButton * b = (CustomButton *) [self.buttons objectAtIndex: i];

        // Note that b.btnSize is a dynamic property which will calculate
        // the desired button size based on the button index and the wheel's rotation.
        [b resize: b.btnSize];
    }
}
Now for the actual resizing code, in Button.m:
// Button.m
- (void) scaleFrom: (float) s_old
                to: (float) s_new
              time: (float) t
{
    CABasicAnimation * scaleAnimation = [CABasicAnimation animationWithKeyPath: @"transform.scale"];
    [scaleAnimation setDuration: t];
    [scaleAnimation setFromValue: [NSNumber numberWithDouble: s_old]];
    [scaleAnimation setToValue: [NSNumber numberWithDouble: s_new]];
    [scaleAnimation setTimingFunction: [CAMediaTimingFunction functionWithName: kCAMediaTimingFunctionEaseOut]];
    [scaleAnimation setFillMode: kCAFillModeForwards];
    scaleAnimation.removedOnCompletion = NO;

    [self.contentsLayer addAnimation: scaleAnimation
                              forKey: @"transform.scale"];

    if (self.displayShadow && self.shadowLayer)
        [self.shadowLayer addAnimation: scaleAnimation
                                forKey: @"transform.scale"];

    size = s_new;
}
// - - -
- (void) resize: (float) newSize
{
    [self scaleFrom: size
                 to: newSize
               time: 1.0];
}
I wonder if the problem is related to the overhead of multiple transform.scale animations queueing up -- each button resize takes a full second to complete, and if I am spinning the wheel fast I might spin a couple of revolutions per second; that means each button gets resized 24 times per second.
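If queued-up animations are the suspect, one cheap experiment (a sketch, not a confirmed fix) is to replace the in-flight animation instead of stacking a new one each notch, and to shorten the duration; the 0.25 s value below is an arbitrary guess:

// Sketch: drop the in-flight scale animation before starting a new one.
// Adding with the same key should already replace it, but an explicit
// remove makes the experiment unambiguous.
- (void) resize: (float) newSize
{
    [self.contentsLayer removeAnimationForKey: @"transform.scale"];
    [self.shadowLayer removeAnimationForKey: @"transform.scale"];
    [self scaleFrom: size
                 to: newSize
               time: 0.25]; // shorter duration: fewer overlapping animations
}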
** creating the button's layer **
The final piece of the puzzle I guess is to have a look at the button's contentsLayer. But I have tried
contentsLayer.shouldRasterize = YES;
which should effectively store it as a bitmap. So with that setting, the code is in effect dynamically resizing 12 bitmaps.
I can't believe this is taxing the device beyond its limits. However, the Core Animation instrument tells me otherwise; while I am rotating the wheel (by dragging my finger in circles), it is reporting ~15fps.
This is no good: I eventually need to put a text layer inside each button, and that is going to drag performance down further (...unless I am using the shouldRasterize setting above, in which case it should be the same).
There must be something I'm doing wrong! But what?
EDIT: here is the code responsible for generating the button content layer (i.e. the shape with the shadow):
- (CALayer *) makeContentsLayer
{
    CAShapeLayer * shapeOutline = [CAShapeLayer layer];
    shapeOutline.path = self.pOutline;

    CALayer * contents = [CALayer layer];

    // get the smallest rectangle centred on (0,0) that completely contains the button
    CGRect R = CGRectIntegral(CGPathGetPathBoundingBox(self.pOutline));
    float xMax = MAX(fabsf(R.origin.x), fabsf(R.origin.x + R.size.width));
    float yMax = MAX(fabsf(R.origin.y), fabsf(R.origin.y + R.size.height));
    CGRect S = CGRectMake(-xMax, -yMax, 2*xMax, 2*yMax);
    contents.bounds = S;

    contents.shouldRasterize = YES; // try NO also

    switch (technique)
    {
        case kMethodMask:
            // clip contents layer by outline (outline centered on (0, 0))
            contents.backgroundColor = self.clr;
            contents.mask = shapeOutline;
            break;

        case kMethodComposite:
            shapeOutline.fillColor = self.clr;
            [contents addSublayer: shapeOutline];
            self.shapeLayer = shapeOutline;
            break;

        default:
            break;
    }

    if (NO) // debug toggle: mark the layer origin
        [self markPosition: CGPointZero
                   onLayer: contents ];

    //[self refreshTextLayer];
    //[contents addSublayer: self.shapeLayerForText];

    return contents;
}
As you can see, I'm trying every possible approach: I am trying two methods for making the shape, and separately I am toggling .shouldRasterize.
** compromising the UI design to get tolerable frame rate **
EDIT: Now I have tried disabling the dynamic resizing behaviour until the wheel settles into a new position, and setting wheel.layer.shouldRasterize = YES. So it is effectively spinning a single prerendered UIView (which is admittedly taking up most of the screen) underneath the finger (which it happily does at ~60fps), until the wheel comes to rest, at which point it performs this laggy resizing animation (at <20fps).
While this gives a tolerable result, it seems nuts that I am having to sacrifice my UI design in such a way. I feel sure I must be doing something wrong.
EDIT: As an experiment, I have just tried resizing the buttons manually; i.e. putting a display link callback in each button, dynamically calculating the expected size for the given frame, explicitly disabling animations with CATransaction the same as I did with the wheel, and setting the new transformation matrix (a scale transform generated from the expected size). On top of this I have set each button's contents layer shouldRasterize = YES. So it should simply be scaling 12 bitmaps each frame onto a single UIView which is itself rotating. Amazingly, this is dead slow -- it even brings the simulator to a halt. It is definitely 10 times slower than doing it automatically using Core Animation's animation feature.
I have no experience developing iPad applications, but I do have some optimizing video games. So I cannot give an exact answer, but I want to give some general optimization tips.
Do not guess. Profile it.
It seems you are trying to make changes without profiling the code. Changing some suspicious code and crossing your fingers does not really work. You should profile your code by examining how long each task takes and how often it needs to run. Try to break down your tasks and add profiling code to measure time and frequency. It's even better if you can measure how much memory is used for each task, or how many other system resources. Find your bottleneck based on evidence, not your feelings.
For your specific problem, you think the program gets really slow when the resizing work kicks in. But are you sure? It could be something else. We don't know for sure until we have actual profiling data.
Minimize the problematic area and measure real performance before making changes.
After profiling, you have a candidate for your bottleneck. If you can still split the cause into several smaller tasks, do so, and profile them until you cannot split them any further. Then try to measure their precise performance by running them repeatedly, say a few thousand times. This is important because you need to know the performance (speed & space) before making any changes, so that you can compare it against future changes.
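For example, a minimal timing harness along these lines (a sketch; button and someSize are placeholders for whatever call you are measuring) gives you a number to compare before and after each change:

#import <QuartzCore/QuartzCore.h> // for CACurrentMediaTime()

const int kRuns = 1000;
CFTimeInterval start = CACurrentMediaTime();
for (int i = 0; i < kRuns; i++)
{
    [button resize: someSize]; // the code path under test
}
CFTimeInterval elapsed = CACurrentMediaTime() - start;
NSLog(@"%d runs: %.3f ms per call", kRuns, 1000.0 * elapsed / kRuns);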
For your specific problem, if resizing is really the issue, try to examine how it performs. How long does it take to perform one resize call? How often do you need to resize to complete the job?
Improve it. How? It depends.
Now you have the problematic code block and its current performance, and you have to improve it. How? Well, it really depends on what the problem is. You could search for better algorithms, you could fetch lazily if you can delay calculations until you really need to perform them, or you could evaluate over-eagerly by caching some data if you are doing the same operation too frequently. The better you know the cause, the easier you can improve it.
For your specific problem, it might just be a limit of the OS UI functions. Usually, resizing a button is not just resizing the button itself: it also invalidates a whole area or its parent widget so that every UI element can be rendered properly. Resizing a button could be expensive, and if that's the case, you could resolve the issue by simply using an image-based approach instead of the OS UI system. You could try OpenGL if image operations from the OS API are not enough. But again, we don't know until we actually try these alternatives out and profile them. Good luck! :)
Try it without your shadows and see if that improves performance. I imagine it will improve it greatly. Then I'd look into using CALayer's shadowPath for rendering shadows. That will greatly improve shadow rendering performance.
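For instance, reusing the question's pOutline path (a sketch; the shadow values are illustrative), an explicit shadowPath means Core Animation no longer has to derive the shadow shape from the layer's alpha channel on every frame:

contents.shadowColor = [UIColor blackColor].CGColor; // illustrative values
contents.shadowOpacity = 0.5f;
contents.shadowOffset = CGSizeMake(0, 2);
contents.shadowPath = self.pOutline; // the precomputed button outline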
Apple's Core Animation videos from last year's WWDC have a lot of great info on increasing performance in core animation.
By the way, I'm animating something way more complex than this right now and it works beautifully even on an older iPhone 3G. The hardware/software is quite capable.
I know this question is old, but it is still up to date. Core Animation (CA) is just a wrapper around OpenGL -- or, these days, maybe around Metal. Layers are in fact textures drawn on rectangles, and the animations are expressed using 3D transformations. As all of this is handled by the GPU, it should be ultra fast... but it isn't. The whole CA sub-system seems pretty complex, and translating between AppKit/UIKit and the 3D world is harder than it seems (if you ever tried to write such a wrapper yourself, you know how hard it can be). To the programmer, CA offers a super simple interface, but this simplicity comes with a price. All my attempts to optimize very slow CA have been futile so far; you can speed it up a bit, but at some point you have to reconsider your approach. Either CA is fast enough to do the job for you, or you need to stop using CA and either implement all animation yourself using classic view drawing (if the CPU can cope with that), or implement the animations yourself using a 3D API (then the GPU will do it), in which case you can decide how the 3D world interacts with the rest of your app. The price is much more code to write, or a much more complex API to use, but the results will speak for themselves in the end.
Still, I'd like to give some generic tips about speeding up CA:
Every time you "draw" to a layer or load content into a layer (e.g. a new image), the data of the texture backing this layer needs to be updated. Every 3D programmer knows: updating textures is very expensive. Avoid it at all costs.
Don't use huge layers: if layers are too big to be handled directly by the GPU as a single texture, they are split into multiple textures, and this alone makes performance worse.
Don't use too many layers, as the amount of memory GPUs can spend on textures is often limited. If you need more memory than that limit, textures are swapped out (removed from GPU memory to make room for other textures, then added back later when they need to be drawn). See the first tip above; this kind of swapping is very expensive.
Don't redraw things that don't need redrawing; cache into images instead. E.g. drawing shadows and drawing gradients are both ultra expensive and usually rarely ever change. So instead of making CA draw them each time, draw them once to a layer, or draw them to an image, load that image into a CALayer, and position the layer where you need it. Monitor when you need to update them (e.g. when the size of an object has changed), then re-draw them once and cache the result again. CA itself also tries to cache results, but you get better results if you control that caching yourself, as sketched below.
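A minimal UIKit sketch of that idea, assuming an expensive layer you want to snapshot once and composite cheaply from then on:

// Render the expensive layer once into a bitmap...
UIGraphicsBeginImageContextWithOptions(layer.bounds.size, NO, 0.0);
[layer renderInContext: UIGraphicsGetCurrentContext()];
UIImage * cached = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();

// ...then composite the cached bitmap instead of redrawing every frame.
CALayer * cachedLayer = [CALayer layer];
cachedLayer.bounds = layer.bounds;
cachedLayer.contents = (id) cached.CGImage;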
Be careful with transparency. Drawing an opaque layer is always faster than drawing one that isn't, so avoid transparency where it is not needed, as the system will not scan all your content for transparency (or the lack of it). If an NSView contains no area where its parent shines through, make a custom subclass and override isOpaque to return YES. The same holds true for UIViews and layers where neither the parent nor their siblings will ever shine through, but here it is enough to just set the opaque property to YES. In code, the two variants look like the sketch below.
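// AppKit: a custom NSView subclass that promises full coverage
- (BOOL) isOpaque
{
    return YES;
}

// UIKit / Core Animation: just set the property
someView.opaque = YES;
someLayer.opaque = YES;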
If none of that really helps, you are pushing CA to its limits and you probably need to replace it with something else.
You should probably just redo this in OpenGL.
Are you using shadows on your layers? For me they were a cause of performance issues. You have two options, AFAIK:
- setting shadowPath -- that way, CA does not have to compute it every time
- removing the shadows and using images to replace them
Related
So I'm not sure if this is possible, but I am trying to adapt the physics body to match my image. Many people have told me that the hitbox for the character in my game on the iOS App Store (StreakDash - go download it!) does not work well, since it makes the game a lot harder (the hitbox hits an obstacle even though the character doesn't appear to even be touching it). This is because the hitboxes are rectangular, whereas my character has a strange shape. Are there any ways to get around this? I thought of some approaches that didn't completely work (e.g. trying to change the actual frame/canvas shape from rectangular to something else, trying to change the hitbox shape/size, trying to alter the image, etc.). It would be great to have advice on whether I should even change the hitbox in the first place (is it the type of rage-inducing game that makes people want to keep playing, or stop?). But finding a way to solve the problem would be best!
Here is a picture of my character with its hitbox:
Here is just some basic code with the SKSpriteNode and physics body:
sprite = [SKSpriteNode spriteNodeWithImageNamed: @"stick"];
sprite.size = CGSizeMake(self.frame.size.width/6.31, self.frame.size.height/3.2);
sprite.physicsBody = [SKPhysicsBody bodyWithRectangleOfSize: CGSizeMake(sprite.size.width, sprite.size.height)];
sprite.position = CGPointMake(self.frame.size.width/5.7, self.frame.size.height/2.9);
sprite.physicsBody.categoryBitMask = personCategory;
sprite.physicsBody.contactTestBitMask = lineCategory;
sprite.physicsBody.dynamic = NO;
sprite.physicsBody.collisionBitMask = 0;
sprite.physicsBody.usesPreciseCollisionDetection = YES;
I suggest representing the physics body with a polygon shape. You have two ways to improve on bodyWithRectangleOfSize:
The lazy way is to use bodyWithTexture:size:, which will create a physics body from the contents of a texture. But as Apple notes, the more complex your physics body is, the more work it is to simulate properly. You may want to make a tradeoff between precision and performance.
The more proper way is to represent the bounds of your sprite with a convex polygon shape. See bodyWithPolygonFromPath:. There are online tools to generate the path code in a user interface, for example: SKPhysicsBody Path Generator (be careful with the offset and anchor point). If you know how to generate the CGMutablePathRef code yourself, it will be easier to fit your situation.
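A sketch of the polygon approach for the stick figure (the coordinates are made up -- trace your own artwork, keep the polygon convex, and wind it counterclockwise relative to the sprite's center/anchor point):

CGMutablePathRef path = CGPathCreateMutable();
CGPathMoveToPoint(path, NULL, 0, sprite.size.height/2);                         // head
CGPathAddLineToPoint(path, NULL, -sprite.size.width/2, 0);                      // left arm
CGPathAddLineToPoint(path, NULL, -sprite.size.width/4, -sprite.size.height/2);  // left foot
CGPathAddLineToPoint(path, NULL,  sprite.size.width/4, -sprite.size.height/2);  // right foot
CGPathAddLineToPoint(path, NULL,  sprite.size.width/2, 0);                      // right arm
CGPathCloseSubpath(path);

sprite.physicsBody = [SKPhysicsBody bodyWithPolygonFromPath: path];
CGPathRelease(path);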
As an exercise, I decided to write a SimCity (original) clone in Swift for OSX. I started the project using SpriteKit, originally having each tile as an instance of SKSpriteNode and swapping the texture of each node when that tile changed. This caused terrible performance, so I switched the drawing over to regular Cocoa windows, implementing drawRect to draw NSImages at the correct tile position. This solution worked well until I needed to implement animated tiles which refresh very quickly.
From here, I went back to the first approach, this time using a texture atlas to reduce the number of draws needed; however, swapping textures of nodes that need to be animated was still very slow and had a hugely detrimental effect on frame rate.
I'm attempting to display a 44x44 tile map where each tile is 16x16 pixels. I know there must be an efficient (or perhaps more correct) way to do this. This leads to my question:
Is there an efficient way to support 1500+ nodes in SpriteKit and which are animated through changing their textures? More importantly, am I taking the wrong approach by using SpriteKit and SKSpriteNode for each tile in the map (even if I only redraw the dirty ones)? Would another approach (perhaps, OpenGL?) be better?
Any help would be greatly appreciated. I'd be happy to provide code samples, but I'm not sure how relevant/helpful they would be for this question.
Edit
Here are some links to relevant drawing code and images to demonstrate the issue:
Screenshot:
When the player clicks on the small map, the center position of the large map changes. An event is fired from the small map to the central engine powering the game, which is then forwarded to listeners. The code that gets executed on the large map to change all of the textures can be found here:
https://github.com/chrisbenincasa/Swiftopolis/blob/drawing-performance/Swiftopolis/GameScene.swift#L489
That code uses tileImages, which is a wrapper around a texture atlas that is generated at runtime.
https://github.com/chrisbenincasa/Swiftopolis/blob/drawing-performance/Swiftopolis/TileImages.swift
Please excuse the messiness of the code -- I made an alternate branch for this investigation and haven't cleaned up a lot of residual code that has been hanging around from previous iterations.
I don't know if this will "answer" your question, but it may help.
SpriteKit will likely be able to handle what you need, but you should look at different optimizations, both in SpriteKit and even more so in your game logic.
SpriteKit. Creating a .atlas is by far one of the best things you can do and will help keep your draw calls down. Also, as I learned the hard way, keep a pointer to your SKTextures for as long as you need them, and only generate the ones you need. For instance, don't create textureWithImageNamed:@"myImage" every time you need a texture for myImage; instead keep reusing one texture and store it in a dictionary (see the sketch below). Also, skView.ignoresSiblingOrder = YES; helps a bunch, but you then have to manage your own zPosition on all the sprites.
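A sketch of such a cache (names are placeholders), so each image becomes an SKTexture exactly once:

// Lazily-filled texture cache; self.textureCache is an NSMutableDictionary.
- (SKTexture *) textureNamed: (NSString *) name
{
    SKTexture * tex = self.textureCache[name];
    if (tex == nil)
    {
        tex = [SKTexture textureWithImageNamed: name];
        self.textureCache[name] = tex;
    }
    return tex;
}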
Game logic. Updating every tile every loop is going to be very expensive. You will want to look at a better way to do that, e.g. keeping smaller arrays, or maybe doing logic (model) updates on a background thread.
I currently have a project you can look into if you want, called Old Frank. I have a map that is 75 x 75, with 32px by 32px tiles that may be stacked 2 tall. I have both Mac and iOS targets, so you could in theory blow up the scene size and see how the performance holds up. Not saying there isn't optimization work to be done (it is a work in progress), but I feel it might at least help get you pointed in the right direction.
Hope that helps.
EDIT: I've opted for the second approach as I got 150+ fps even when all 3 tile layers fill the entire screen.
EDIT 2: I read a lot about vertex buffer objects and how they are great for static geometry, and although I still have no idea how to turn my 2D tiles into a VBO and store it in GPU memory, it definitely seems like the way to go if anyone else is looking for a fast way to render static geometry/quads.
I'm making a game like Super Meat Boy and was wondering if it would be better/faster to store level tiles in an array list and do a camera-bounds overlap test to see if each one should be rendered:
foreach(Tile tile in world.tiles) {
    if(Overlap(camera.bounds, tile))
        render(tile);
}
Or would a 2D array storing every grid square, reading off only the tiles between the camera bounds, be better?
int left   = (int)(camera.position.x - camera.width/2);
int right  = (int)(camera.position.x + camera.width/2) + 1;
int top    = (int)(camera.position.y - camera.height/2); // WHY XNA DO YOU UPSIDE DOWN!!!
int bottom = (int)(camera.position.y + camera.height/2) + 1; // note: height, not width
for(int x = left; x < right; x++) {
    for(int y = top; y < bottom; y++) {
        render(world.tiles[x][y]);
    }
}
The camera can fit 64*36 tiles on screen, which is 2300-odd tiles to read off using the latter approach -- but is doing an overlap test against every tile in the level any better? I read an answer about joining matching adjacent tiles into a larger quad and just repeating the texture (although I'm using a texture atlas, so I'm not sure how to repeat a region of a texture).
Cheers guys.
I can share some details from my past experience. In a 2D game, the map is normally 0 - N long, where N is far longer than the screen. At first I tried loading everything at once, but that was a huge overhead and I ended up with 0 FPS: I wanted many different kinds of objects, so even repeating the same object to save memory did not work. Then I tried bounding things with reference to the screen: off-screen objects still exist, but they are not rendered, i.e. they are removed from the draw pipeline entirely. That brought the game back to life.
For further performance, with C# 4.0 you can use the TPL and async/await with your draw code. It is like a better version of threading: you can throw work at it and let it render at will.
Here is the deal with XNA, or any kind of graphics library: there is a complete graphics rendering pipeline, and that makes things a whole lot slower, especially if the PC is old and only has a 64MB graphics card. Your game will be deployed to all kinds of machines, right?
So, in XNA terms: Update is simple code and runs as fast as it can; there is nothing to stop it. But Draw has a complete pipeline ahead of it, and that is the sole reason for having Begin and End -- after End it can start pushing things into the pipeline. See this article for reference: http://classes.soe.ucsc.edu/cmps020/Winter11/readings/hlsl.pdf
So here is the deal: the rendering pipeline is needed, but there is no reason it should be slow and blocking. Just make it multi-threaded and things will be quite a bit faster for you. If you want to push further, you will have to use C# to its fullest, including linked lists and the like -- but that should be a last resort.
I hope I have given enough details to provide you an answer. Please let me know if any further details are needed.
I would like to share my experience with the self.layer.shouldRasterize = YES; flag on UIViews.
I have a UIView class hierarchy that has self.layer.shouldRasterize turned ON in order to improve scrolling performance (all of them have STATIC subviews that are larger than the screen of the device).
Today, in one of the subclasses, I used CAEmitterLayer to produce a nice particle effect.
The performance was really poor, although the number of particles was really low (50 particles).
What is the cause of this problem?
I'll just quote the Apple docs and explain:
@property BOOL shouldRasterize
When the value of this property is YES, the layer is rendered as a bitmap in its local coordinate space and then composited to the destination with any other content. Shadow effects and any filters in the filters property are rasterized and included in the bitmap. However, the current opacity of the layer is not rasterized. If the rasterized bitmap requires scaling during compositing, the filters in the minificationFilter and magnificationFilter properties are applied as needed.
So basically, when shouldRasterize is set to YES, every pixel that composes the layer is calculated and the whole layer is cached as a bitmap.
When will you benefit from it?
When you only need to draw it once. That means when you need just pure "simple" animation (e.g. moving, transforming, scaling), because Core Animation will then use that cached layer without redrawing it every frame. It's a very powerful feature for caching complex layers (with shadows and corner radius) combined with Core Animation.
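In code, that typically amounts to two lines (a sketch; setting rasterizationScale avoids blurry output on Retina screens):

complexLayer.shouldRasterize = YES;
complexLayer.rasterizationScale = [UIScreen mainScreen].scale;
// ...then move/scale/rotate the layer; the cached bitmap is reused.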
When will it kill your framerate?
When your layer is redisplayed many times, because on top of the drawing that is already taking place, shouldRasterize will process all the pixels again to re-cache the bitmap data.
So the real question you should ask yourself is: "Which layer am I applying shouldRasterize = YES to? And how often is this layer redrawn?"
Hope this was clear enough.
Turning OFF self.layer.shouldRasterize restores performance to normal levels.
Why is that?
According to a video on Apple's developer site (I cannot remember which video -- help, please?), the rule for self.layer.shouldRasterize is simple: if all of your subviews are static (their position, contents etc. are not changing or animating), then it is beneficial to turn self.layer.shouldRasterize ON. On the other hand, if any of the subviews are changing, then the framework needs to re-cache the view hierarchy, and this is a huge bottleneck. Under the hood, the bottleneck is the memory copying between CPU and GPU.
I'm writing an audio waveform editor in Cocoa with a wide range of zoom options. At its widest, it shows a waveform for an entire song (~10 million samples in view). At its narrowest, it shows a pixel accurate representation of the sound wave (~1 thousand samples in a view). I want to be able to smoothly transition between these zoom levels. Some commercial editors like Ableton Live seem to do this in a very inexpensive fashion.
My current implementation satisfies my desired zoom range, but is inefficient and choppy. The design is largely inspired by this excellent article on drawing waveforms with quartz:
http://supermegaultragroovy.com/blog/2009/10/06/drawing-waveforms/
I create multiple CGMutablePathRefs for the audio file at various levels of reduction. When I'm zoomed all the way out, I use the path that's been reduced to one point per x-thousand samples. When I'm zoomed all the way in, I use the path that contains a point for every sample. I scale a path horizontally when I'm in between reduction levels. This gets it functional, but it is still pretty expensive, and artifacts appear when transitioning between reduction levels.
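For readers, this is roughly what building one such reduction level looks like (a sketch; samples and sampleCount stand in for the real audio buffer): each bucket of samples collapses to a min/max pair, drawn as a vertical stroke.

const int bucketSize = 4096; // samples per on-screen point at this zoom level
CGMutablePathRef path = CGPathCreateMutable();
for (NSUInteger start = 0; start + bucketSize <= sampleCount; start += bucketSize)
{
    float lo = samples[start], hi = samples[start];
    for (int i = 1; i < bucketSize; i++)
    {
        float s = samples[start + i];
        if (s < lo) lo = s;
        if (s > hi) hi = s;
    }
    CGFloat x = (CGFloat)(start / bucketSize);
    CGPathMoveToPoint(path, NULL, x, lo); // one vertical min..max stroke per bucket
    CGPathAddLineToPoint(path, NULL, x, hi);
}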
One thought on how I might make this less expensive is to take out anti-aliasing. The waveform in my editor is anti-aliased while the one in Ableton is not (see comparison below).
I don't see a way to turn off anti-aliasing for CGMutablePathRef's. Is there a non-anti-aliased alternative to CGMutablePathRef in the world of Cocoa? If not, does anyone know of some OpenGL classes or sample code that might set me on course to drawing my huge line more efficiently?
Update 1-21-2014: There's now a great library that does exactly what I was looking for: https://github.com/syedhali/EZAudio
I use CGContextMoveToPoint + CGContextAddLineToPoint + CGContextStrokePath in my app, one point per onscreen point, drawing from a pre-calculated backing buffer for the overview. The buffer contains the exact points to draw, and uses an interpolated representation of the signal (based on the zoom/scale). Although it could be faster and look better if I rendered to an image buffer, I've never had a complaint. You can calculate and render all of this from a secondary thread, if you set it up correctly.
Anti-aliasing pertains to the graphics context, not the path.
CGFloat (the native input for CGPaths) is overkill as an intermediate representation and for calculating the waveform overview; 16 bits should be adequate. Of course, you'll have to convert to CGFloat when passing to CG calls.
You need to profile to find out where your time is spent -- focus on the parts that take the most time. Also, make sure you only draw what you must, when you must, and avoid overlays/animations where possible. If you need overlays, it's better to render to an image/buffer and update that as needed. Sometimes it helps to break the display up into multiple drawing surfaces when the surface is large.
Semi-OT: Ableton is using sample-and-hold (s+h) values; this can be slightly faster, but... I much prefer it as an option. If your implementation uses linear interpolation (which it may, based on its appearance), consider a more intuitive approach. Linear interpolation is a bit of a cheat, and really not what the user would expect if you're developing a pro app.
In relation to the particular question of anti-aliasing: in Quartz, anti-aliasing is applied to the context at the moment of drawing. The CGPathRef is agnostic to the drawing context; thus, the same CGPathRef can be rendered into an anti-aliased context or a non-anti-aliased context. For example, to disable anti-aliasing during animations:
CGContextRef context = UIGraphicsGetCurrentContext();
CGMutablePathRef fill_path = CGPathCreateMutable();

// Fill the path with the wave
...

CGContextAddPath(context, fill_path);

if ([self animating])
    CGContextSetAllowsAntialiasing(context, NO);
else
    CGContextSetAllowsAntialiasing(context, YES);

// Do the drawing
CGContextDrawPath(context, kCGPathStroke);
CGPathRelease(fill_path); // balance CGPathCreateMutable