I would like to share my experience of using the self.layer.shouldRasterize = YES flag on UIViews.
I have a UIView class hierarchy that has self.layer.shouldRasterize turned ON in order to improve scrolling performance (all of them have STATIC subviews that are larger than the screen of the device).
Today in one of the subclasses I used CAEmitterLayer to produce nice particle effects.
The performance was really poor, even though the number of particles was low (50 particles).
What is the cause of this problem?
I'll just quote Apple Doc and explain:
@property BOOL shouldRasterize
When the value of this property is YES, the layer is
rendered as a bitmap in its local coordinate space and then composited
to the destination with any other content. Shadow effects and any
filters in the filters property are rasterized and included in the
bitmap. However, the current opacity of the layer is not rasterized.
If the rasterized bitmap requires scaling during compositing, the
filters in the minificationFilter and magnificationFilter properties
are applied as needed.
So basically when shouldRasterize is set to YES, every pixel that will compose the layer is calculated and the whole layer is cached as a bitmap.
When will you benefit from it?
When you only need to draw it once. That means when you need just pure "simple" animation (e.g. moving, transforming, scaling...), because Core Animation will actually reuse that layer without redrawing it every frame. It's a very powerful feature for caching complex layers (with shadows and corner radius) combined with Core Animation.
When will it kill your framerate?
When your layer is redisplayed many times, because on top of the drawing that is already taking place, shouldRasterize will process every pixel again to cache the bitmap data.
So the real question you should ask yourself is this: "On which layer am I setting shouldRasterize to YES? And how often is this layer redrawn?"
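For example, here is a minimal sketch of the caching case just described (the view name is made up): a static, shadowed card that is afterwards only moved or scaled.
cardView.layer.cornerRadius = 8.0;
cardView.layer.shadowOpacity = 0.4;
cardView.layer.shouldRasterize = YES;
// Match the screen scale, otherwise the cached bitmap looks blurry on Retina displays.
cardView.layer.rasterizationScale = [UIScreen mainScreen].scale;
// Subsequent position/transform animations reuse the cached bitmap instead of redrawing the layer.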
Hope this was clear enough.
Turning OFF self.layer.shouldRasterize restores performance to normal levels.
Why is that?
According to a video on Apple's developer site (I cannot remember which video, help please?) the rule for self.layer.shouldRasterize is that simple: if all of your subviews are static (their position, contents etc. are not changing or animating), then it is beneficial to turn self.layer.shouldRasterize ON. On the other hand, if any of the subviews are changing, the framework needs to re-cache the view hierarchy, and this is a huge bottleneck. Under the hood the bottleneck is the memory copying between CPU and GPU.
Since most devices today have a CPU and a GPU, the usual advice for programmers wishing to do animated vector graphics (like making a circle grow or move around) is to define the graphical item once and then use linear transformations to animate it. This way, (on most platforms and frameworks) the GPU can do the animation work, because rasterization with linear transformations can be done very fast on a GPU. If the programmer chooses to draw each frame on the CPU, it would most likely be much slower and consume more energy.
I understand that the Watch is not a device you want to overload with complex animations, but at least the Home Screen certainly seems to use exactly this kind of animated linear transformation.
Also, most Watch Faces are animated in some way, e.g. the moving second and minute hands.
However, the WatchKit controls do not have a .transform property, and I could not find much in the documentation - the words "animation" and "graphics" are not even mentioned there.
So, the only way I currently see is to draw the vector graphics into a CGContext and then put the result as a UIImage into an image control, as described here. But this does not really seem energy-efficient. It is exactly the kind of "CPU pixel drawing" that we usually want to avoid if possible. I think it is not energy-efficient because if I draw into a 100x100 pixel image buffer, the image has to be scaled to the actual Watch screen size, so we have two actual drawing processes per frame.
Is there an officially recommended, energy-efficient way to do animations on the Apple Watch?
Or, in other words, can we animate things like they are animated on the Home Screen or Watch Faces?
It seems SpriteKit is the answer. You can create SKScene and node objects and then display them in a WKInterfaceSKScene.
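A rough sketch of that route, assuming an interface controller with a WKInterfaceSKScene outlet named sceneInterface (the outlet name is just illustrative):
SKScene *scene = [SKScene sceneWithSize:CGSizeMake(100, 100)];
SKShapeNode *dot = [SKShapeNode shapeNodeWithCircleOfRadius:20];
dot.position = CGPointMake(50, 50);
[scene addChild:dot];
// Let SpriteKit drive the animation instead of pushing a new UIImage every frame.
SKAction *pulse = [SKAction sequence:@[[SKAction scaleTo:1.5 duration:0.5],
                                       [SKAction scaleTo:1.0 duration:0.5]]];
[dot runAction:[SKAction repeatActionForever:pulse]];
[self.sceneInterface presentScene:scene];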
I have a complex structure of CALayers forming a motion graphics system that can be manipulated by the user. This is being displayed in the main window as a part of the UI. I am looking for a good way to display multiple small sections of the CALayer stack on a second display as "viewports", which will likely be at a higher resolution than the main view. I am aware that I could render them out and redraw them, but I want to maintain the resolution independence of the CALayers.
My thought process was something to the effect of adding the main CALayer to multiple superlayers and then using a combination of masks and transforms to get the viewport to display the portion needed. Unfortunately, a CALayer can only have one superlayer.
Is there any good way to achieve this? Thanks in advance.
Unfortunately I think you'll need to maintain multiple CALayer stacks, one for each view. Since all the sets of layers should just be reflecting the state of a single model it should be relatively straightforward to keep them in sync.
You could optimise the zoomed view to only manage layers that are actually visible, which would cut down on resource usage.
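A sketch of what keeping the stacks in sync might look like; the Model, ModelItem and viewport API here are all hypothetical, the point is only that every viewport's layer tree is driven from the same model:
- (void)modelDidChange
{
    for (ViewportController *viewport in self.viewportControllers) {
        [CATransaction begin];
        [CATransaction setDisableActions:YES];
        // Only touch the layers that actually fall inside this viewport.
        for (ModelItem *item in [self.model itemsIntersectingRect:viewport.visibleRect]) {
            CALayer *layer = [viewport layerForItem:item];   // created lazily per viewport
            layer.position = item.position;
            layer.transform = item.transform;
        }
        [CATransaction commit];
    }
}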
I'm writing an audio waveform editor in Cocoa with a wide range of zoom options. At its widest, it shows a waveform for an entire song (~10 million samples in view). At its narrowest, it shows a pixel accurate representation of the sound wave (~1 thousand samples in a view). I want to be able to smoothly transition between these zoom levels. Some commercial editors like Ableton Live seem to do this in a very inexpensive fashion.
My current implementation satisfies my desired zoom range, but is inefficient and choppy. The design is largely inspired by this excellent article on drawing waveforms with quartz:
http://supermegaultragroovy.com/blog/2009/10/06/drawing-waveforms/
I create multiple CGMutablePathRefs for the audio file at various levels of reduction. When I'm zoomed all the way out, I use the path that's been reduced to one point per x-thousand samples. When I'm zoomed all the way in, I use the path that contains a point for every sample. I scale a path horizontally when I'm in between reduction levels. This gets it functional, but it is still pretty expensive and artifacts appear when transitioning between reduction levels.
One thought on how I might make this less expensive is to take out anti-aliasing. The waveform in my editor is anti-aliased while the one in Ableton is not (see comparison below).
I don't see a way to turn off anti-aliasing for CGMutablePathRefs. Is there a non-anti-aliased alternative to CGMutablePathRef in the world of Cocoa? If not, does anyone know of some OpenGL classes or sample code that might set me on course to drawing my huge line more efficiently?
Update 1-21-2014: There's now a great library that does exactly what I was looking for: https://github.com/syedhali/EZAudio
I use CGContextMoveToPoint + CGContextAddLineToPoint + CGContextStrokePath in my app, one point per onscreen point, drawing from a pre-calculated backing buffer for the overview. The buffer contains the exact points to draw and uses an interpolated representation of the signal (based on the zoom/scale). Although it could be faster and look better if I rendered to an image buffer, I've never had a complaint. You can calculate and render all of this from a secondary thread, if you set it up correctly.
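As a hedged sketch of that approach (the overview buffer and its length are assumed to be precomputed elsewhere, one normalised value per horizontal pixel):
static void DrawOverview(CGContextRef ctx, const float *overview, size_t length, CGRect bounds)
{
    CGFloat midY  = CGRectGetMidY(bounds);
    CGFloat halfH = bounds.size.height / 2.0;
    CGContextSetLineWidth(ctx, 1.0);
    CGContextMoveToPoint(ctx, CGRectGetMinX(bounds), midY + overview[0] * halfH);
    for (size_t x = 1; x < length; x++) {
        // One vertex per onscreen point, taken straight from the backing buffer.
        CGContextAddLineToPoint(ctx, CGRectGetMinX(bounds) + x, midY + overview[x] * halfH);
    }
    CGContextStrokePath(ctx);
}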
Anti-aliasing pertains to the graphics context, not to the path.
CGFloat (the native input for CGPaths) is overkill for an overview, as an intermediate representation, and for calculating the waveform overview; 16 bits should be adequate. Of course, you'll have to convert to CGFloat when passing to CG calls.
You need to profile to find out where your time is spent -- focus on the parts that take the most time. Also, make sure you only draw what you must, when you must, and avoid overlays/animations where possible. If you need overlays, it's better to render to an image/buffer and update that as needed. Sometimes it helps to break up the display into multiple drawing surfaces when the surface is large.
Semi-OT: Ableton is using sample-and-hold (s+h) values; this can be slightly faster, but... I much prefer it as an option. If your implementation uses linear interpolation (which it may, based on its appearance), consider a more intuitive approach. Linear interpolation is a bit of a cheat, and really not what the user would expect if you're developing a pro app.
In relation to the particular question of anti-aliasing: in Quartz, anti-aliasing is applied to the context at the moment of drawing. The CGPathRef is agnostic to the drawing context; thus, the same CGPathRef can be rendered into an anti-aliased context or into a non-anti-aliased context. For example, to disable anti-aliasing during animations:
CGContextRef context = UIGraphicsGetCurrentContext();
CGMutablePathRef fill_path = CGPathCreateMutable();
// Fill the path with the wave
...
CGContextAddPath(context, fill_path);
if ([self animating])
CGContextSetAllowsAntialiasing(context, NO);
else
CGContextSetAllowsAntialiasing(context, YES);
// Do the drawing
CGContextDrawPath(context, kCGPathStroke);
CGPathRelease(fill_path);  // don't leak the path
I have my app running on my iPad, but it is performing very badly -- I am getting below 15fps. Can anyone help me optimise it?
It is basically a wheel (derived from UIView) containing 12 buttons (derived from UIControl).
As the user spins it, the buttons dynamically expand and contract (e.g. the one at the 12 o'clock position should always be the biggest)
So my wheel contains a:
- (void) displayLinkIsCallingBack: (CADisplayLink *) dispLink
{
:
// using CATransaction like this goes from 14fps to 19fps
[CATransaction begin];
[CATransaction setDisableActions: YES];
// NEG, as coord system is flipped/fucked
self.transform = CGAffineTransformMakeRotation(-thetaWheel);
[CATransaction commit];
if (BLA)
[self rotateNotch: direction];
}
… which calculates from recent touch input the new rotation for the wheel. There is already one performance issue here which I am pursuing on a separate thread: iOS Core-Animation: Performance issues with CATransaction / Interpolating transform matrices
This routine also checks whether the wheel has completed another 1/12 rotation, and if so instructs all 12 buttons to resize:
// Wheel.m
- (void) rotateNotch: (int) direction
{
for (int i=0; i < [self buttonCount] ; i++)
{
CustomButton * b = (CustomButton *) [self.buttons objectAtIndex: i];
// Note that b.btnSize is a dynamic property which will calculate
// the desired button size based on the button index and the wheel's rotation.
[b resize: b.btnSize];
}
}
Now for the actual resizing code, in button.m:
// Button.m
- (void) scaleFrom: (float) s_old
to: (float) s_new
time: (float) t
{
CABasicAnimation * scaleAnimation = [CABasicAnimation animationWithKeyPath: @"transform.scale"];
[scaleAnimation setDuration: t ];
[scaleAnimation setFromValue: (id) [NSNumber numberWithDouble: s_old] ];
[scaleAnimation setToValue: (id) [NSNumber numberWithDouble: s_new] ];
[scaleAnimation setTimingFunction: [CAMediaTimingFunction functionWithName: kCAMediaTimingFunctionEaseOut] ];
[scaleAnimation setFillMode: kCAFillModeForwards];
scaleAnimation.removedOnCompletion = NO;
[self.contentsLayer addAnimation: scaleAnimation
forKey: @"transform.scale"];
if (self.displayShadow && self.shadowLayer)
[self.shadowLayer addAnimation: scaleAnimation
forKey: @"transform.scale"];
size = s_new;
}
// - - -
- (void) resize: (float) newSize
{
[self scaleFrom: size
to: newSize
time: 1.];
}
I wonder if the problem is related to the overhead of multiple transform.scale operations queueing up -- each button resize takes a full second to complete, and if I am spinning the wheel fast I might spin a couple of revolutions per second; that means that each button is getting resized 24 times per second.
** creating the button's layer **
The final piece of the puzzle, I guess, is to have a look at the button's contentsLayer. But I have tried
contentsLayer.shouldRasterize = YES;
which should effectively store it as a bitmap. So with this setting the code is in effect dynamically resizing 12 bitmaps.
I can't believe this is taxing the device beyond its limits. However, the Core Animation instrument tells me otherwise; while I am rotating the wheel (by dragging my finger in circles), it reports ~15fps.
This is no good: I eventually need to put a text layer inside each button, and this is going to drag performance down further (...unless I am using the .shouldRasterize setting above, in which case it should be the same).
There must be something I'm doing wrong! But what?
EDIT: Here is the code responsible for generating the button content layer (i.e. the shape with the shadow):
- (CALayer *) makeContentsLayer
{
CAShapeLayer * shapeOutline = [CAShapeLayer layer];
shapeOutline.path = self.pOutline;
CALayer * contents = [CALayer layer];
// get the smallest rectangle centred on (0,0) that completely contains the button
CGRect R = CGRectIntegral(CGPathGetPathBoundingBox(self.pOutline));
float xMax = MAX(fabsf(R.origin.x), fabsf(R.origin.x + R.size.width));   // fabsf, not abs: abs() would truncate to int
float yMax = MAX(fabsf(R.origin.y), fabsf(R.origin.y + R.size.height));
CGRect S = CGRectMake(-xMax, -yMax, 2*xMax, 2*yMax);
contents.bounds = S;
contents.shouldRasterize = YES; // try NO also
switch (technique)
{
case kMethodMask:
// clip contents layer by outline (outline centered on (0, 0))
contents.backgroundColor = self.clr;
contents.mask = shapeOutline;
break;
case kMethodComposite:
shapeOutline.fillColor = self.clr;
[contents addSublayer: shapeOutline];
self.shapeLayer = shapeOutline;
break;
default:
break;
}
if (NO)
[self markPosition: CGPointZero
onLayer: contents ];
//[self refreshTextLayer];
//[contents addSublayer: self.shapeLayerForText];
return contents;
}
As you can see, I'm trying every possible approach: I am trying two methods for making the shape, and separately I am toggling .shouldRasterize.
** compromising the UI design to get a tolerable frame rate **
EDIT: Now I have tried disabling the dynamic resizing behaviour until the wheel settles into a new position, and setting wheel.layer.shouldRasterize = YES. So it is effectively spinning a single prerendered UIView (which is admittedly taking up most of the screen) underneath the finger (which it happily does at ~60fps), until the wheel comes to rest, at which point it performs this laggy resizing animation (at <20fps).
While this gives a tolerable result, it seems nuts that I am having to sacrifice my UI design in such a way. I feel sure I must be doing something wrong.
EDIT: As an experiment, I have just tried resizing the buttons manually; i.e. putting a display-link callback in each button, dynamically calculating the expected size for the given frame, explicitly disabling animations with CATransaction the same as I did with the wheel, and setting the new transformation matrix (a scale transform generated from the expected size). On top of this I have set each button's content layer to shouldRasterize = YES, so it should simply be scaling 12 bitmaps each frame onto a single UIView which is itself rotating. Amazingly, this is dead slow; it is even bringing the simulator to a halt. It is definitely 10 times slower than doing it automatically using Core Animation's animation feature.
I have no experience in developing iPad applications, but I do have some in optimizing video games. So, I cannot give an exact answer, but I want to give some tips on optimization.
Do not guess. Profile it.
It seems you are trying to make changes without profiling the code. Changing some suspicious code and crossing your fingers does not really work. You should profile your code by examining how long each task takes and how often it needs to run. Try to break down your tasks and put in profiling code to measure time and frequency. It's even better if you can measure how much memory is used for each task, or how many other system resources. Find your bottleneck based on evidence, not on your feeling.
For your specific problem, you think the program gets really slow when your resizing work kicks in. But, are you sure? It could be something else. We don't know for sure until we have actual profiling data.
Minimize the problematic area and measure real performance before making changes.
After profiling, you have a candidate for your bottleneck. If you can still split the cause into several small tasks, do it, and keep profiling them until you cannot split them anymore. Then, try to measure their precise performance by running them repeatedly, say a few thousand times. This is important because you need to know the performance (speed & space) before making any changes, so that you can compare it to future changes.
For your specific problem, if resizing is really the issue, try to examine how it performs. How long does it take to perform one resize call? How often do you need to do the resize work to complete your job?
Improve it. How? It depends.
Now, you have the problematic code block and its current performance. You now have to improve it. How? Well, it really depends on what the problem is. You could search for better algorithms, you could do lazy evaluation if you can delay calculations until you really need them, or you could do over-eager evaluation by caching some data if you are doing the same operation too frequently. The better you know the cause, the easier you can improve it.
For your specific problem, it might just be a limit of the OS UI functions. Usually, resizing a button is not just resizing the button itself; it also invalidates a whole area or its parent widget so that all the UI can be rendered properly. Resizing a button can be expensive, and if that's the case, you could resolve the issue by simply using an image-based approach instead of the OS UI system. You could try OpenGL if image operations from the OS API are not enough. But, again, we don't know until we actually try these alternatives out and profile them. Good luck! :)
Try it without your shadows and see if that improves performance. I imagine it will improve it greatly. Then I'd look into using CALayer's shadowPath for rendering shadows. That will greatly improve shadow rendering performance.
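For example, in the question's makeContentsLayer something like this might help (a sketch only; the shadow set-up itself isn't shown in the question, so the opacity value here is arbitrary):
contents.shadowOpacity = 0.5f;
// With an explicit shadowPath, CA no longer has to derive the shadow from the layer contents on every frame.
contents.shadowPath = self.pOutline;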
Apple's Core Animation videos from last year's WWDC have a lot of great info on increasing performance in Core Animation.
By the way, I'm animating something way more complex than this right now and it works beautifully even on an older iPhone 3G. The hardware/software is quite capable.
I know this question is old, but still up to date. Core Animation (CA) is just a wrapper around OpenGL -- or meanwhile maybe around Metal. Layers are in fact textures drawn on rectangles, and the animations are expressed using 3D transformations. As all of this is handled by the GPU, it should be ultra fast... but it isn't. The whole CA sub-system seems pretty complex, and translating between AppKit/UIKit and the 3D world is harder than it seems (if you ever tried to write such a wrapper yourself, you know how hard it can be). To the programmer, CA offers a super simple interface, but this simplicity comes with a price. All my attempts to optimize very slow CA have been futile so far; you can speed it up a bit, but at some point you have to reconsider your approach: either CA is fast enough to do the job for you, or you need to stop using CA and either implement all animation yourself using classic view drawing (if the CPU can cope with that) or implement the animations yourself using a 3D API (then the GPU will do it), in which case you can decide how the 3D world interacts with the rest of your app. The price is much more code to write or a much more complex API to use, but the results will speak for themselves in the end.
Still, I'd like to give some generic tips about speeding up CA:
Every time you "draw" to a layer or load content into a layer (a new image), the data of the texture backing this layer needs to be updated. Every 3D programmer knows: Updating textures is very expensive. Avoid that at all costs.
Don't use huge layers: if a layer is too big to be handled directly by the GPU as a single texture, it is split into multiple textures, and this alone makes performance worse.
Don't use too many layers, as the amount of memory GPUs can spend on textures is often limited. If you need more memory than that limit, textures are swapped out (removed from GPU memory to make room for other textures, later added back when they need to be drawn). See the first tip above; this kind of swapping is very expensive.
Don't redraw things that don't need redrawing; cache them into images instead. E.g. drawing shadows and drawing gradients are both ultra expensive and usually rarely change. So instead of making CA draw them each time, draw them once to a layer or draw them to an image and load that image into a CALayer, then position the layer where you need it. Monitor when you need to update them (e.g. if the size of an object has changed), then re-draw them once and cache the result again. CA itself also tries to cache results, but you get better results if you control that caching yourself.
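As an illustration, a sketch of caching an expensive gradient into a bitmap once and handing it to a layer (UIKit variant; bounds and gradientLayer are placeholders for your own geometry and layer):
UIGraphicsBeginImageContextWithOptions(bounds.size, YES, 0);
CGContextRef ctx = UIGraphicsGetCurrentContext();
CGColorSpaceRef space = CGColorSpaceCreateDeviceRGB();
CGFloat locations[] = { 0.0, 1.0 };
NSArray *colors = @[(__bridge id)[UIColor whiteColor].CGColor,
                    (__bridge id)[UIColor lightGrayColor].CGColor];
CGGradientRef gradient = CGGradientCreateWithColors(space, (__bridge CFArrayRef)colors, locations);
CGContextDrawLinearGradient(ctx, gradient, CGPointZero,
                            CGPointMake(0, bounds.size.height), 0);
UIImage *cached = UIGraphicsGetImageFromCurrentImageContext();
UIGraphicsEndImageContext();
CGGradientRelease(gradient);
CGColorSpaceRelease(space);
// Reuse the bitmap until the geometry changes; CA never redraws the gradient itself.
gradientLayer.contents = (__bridge id)cached.CGImage;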
Be careful with transparency. Drawing an opaque layer is always faster than drawing one that isn't, so avoid using transparency where it is not needed, as the system will not scan all your content for transparency (or the lack of it). If an NSView contains no area where its parent shines through, make a custom subclass and override isOpaque to return YES. The same holds true for UIViews and layers through which neither the parent nor their siblings will ever shine, but there it is enough to just set the opaque property to YES.
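A minimal sketch of the opacity point (the class name is made up):
@interface SolidBackgroundView : NSView
@end

@implementation SolidBackgroundView
- (BOOL)isOpaque { return YES; }   // nothing behind this view ever shows through
@end
// For UIViews and CALayers it is enough to set view.opaque = YES / layer.opaque = YES.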
If none of that really helps, you are pushing CA to its limits and you probably need to replace it with something else.
You should probably just redo this in OpenGL.
Are you using shadows on your layers? For me they were a cause of performance issues. If so, you have 2 options AFAIK:
- setting shadowPath, so that CA does not have to compute the shadow every time
- removing shadows and using images to replace them
I have an application that draws images from a CGImage.
The CGImage itself is loaded using CGImageSourceCreateImageAtIndex to create an image from a PNG file.
This forms part of a sprite engine - there are multiple sprite images on a single PNG file, so each sprite has a CGRect defining where it is found on the CGImage.
The problem is, CGContextDrawImage only takes a destination rect - and stretches the source CGImage to fill it.
So, to draw each sprite image we need to create multiple CGImages from the original source, using CGImageCreateWithImageInRect().
I thought at first that this would be a 'cheap' operation - it doesn't seem necessary for each CGImage to contain its own copy of the image's bits - however, profiling has revealed that using CGImageCreateWithImageInRect() is a rather expensive operation.
Is there a more optimal method to draw a sub-section of a CGImage onto a CGContext, so I don't need to call CGImageCreateWithImageInRect() so often?
Given the lack of a source rectangle, and the ease of making a CGImage from a rect on a CGImage, I began to suspect that perhaps CGImage implemented copy-on-write semantics, where a CGImage made from a CGImage would refer to a sub-rect of the same physical bits as the parent.
Profiling seems to prove this wrong :/
I was in the same boat as you. CGImageCreateWithImageInRect() worked better for my needs but previously I had attempted to convert to an NSImage, and prior to that I was clipping the context I was drawing in, and translating so that CGContextDrawImage() would draw the right data into the clipped region.
Of all of the solutions I tried:
Clipping and translating took a prohibitive toll on the CPU. It was too slow. It seemed that increasing the amount of bitmap data only slightly had a significant performance impact, suggesting that this approach does not scale well.
Conversion to NSImage was relatively efficient, at least for the data we were using. There didn't seem to be any duplication of bitmap data that I could see, which was mostly what I was afraid of going from one image object to another.
At one point I converted to a CIImage, as this class also allows drawing subregions of the image. This seemed to be slower than converting to NSImage, but did offer me the chance to fiddle around with the bitmap by passing through some of the Core Image filters.
Using CGImageCreateWithImageInRect() was the fastest of the lot; maybe this has been optimised since you had last used it. The documentation for this function says the resulting image retains a reference to the original image, this seems to agree with what you had assumed regarding copy-on-write semantics. In my benchmarks, there appears to be no duplication of data but maybe I'm reading the results wrong. We went with this method because it was not only the fastest but it seemed like a more “clean” approach, keeping the whole process in one framework.
Create an NSImage with the CGImage. An NSImage object makes it easy to draw only some section of it to a destination rectangle.
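A hedged sketch of what that might look like (AppKit; spriteRect is the sprite's rect on the sheet and destRect is where you want it drawn, both names are illustrative):
NSImage *sheet = [[NSImage alloc] initWithCGImage:cgSpriteSheet size:NSZeroSize];   // NSZeroSize: keep the CGImage's own size
[sheet drawInRect:destRect
         fromRect:spriteRect
        operation:NSCompositingOperationSourceOver
         fraction:1.0];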
I believe the recommendation is to use a clipping region.
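Something along these lines (a sketch only; it assumes no scaling and that both rects use Quartz's bottom-left-origin coordinates):
static void DrawImageSubRect(CGContextRef ctx, CGImageRef image, CGRect src, CGRect dest)
{
    CGContextSaveGState(ctx);
    CGContextClipToRect(ctx, dest);
    // Place the whole image so that `src` lines up with `dest`; everything else is clipped away.
    CGRect full = CGRectMake(dest.origin.x - src.origin.x,
                             dest.origin.y - src.origin.y,
                             CGImageGetWidth(image),
                             CGImageGetHeight(image));
    CGContextDrawImage(ctx, full, image);
    CGContextRestoreGState(ctx);
}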
I had a similar problem when writing a simple 2D tile-based game.
The only way I got decent performance was to:
1) Pre-render the tilesheet CGImage into a CGBitmapContext using CGContextDrawImage()
2) Create another CGBitmapContext as an offscreen rendering buffer, with the same size as the UIView I was drawing in, and same pixel format as the context from (1).
3) Write my own fast blit routine that would copy a region (CGRect) of pixels from the bitmap context created in (1) to the bitmap context created in (2). This is pretty easy: just simple memory copying (and some extra per-pixel operations to do alpha blending if needed), keeping in mind that the rasters are in reverse order in the buffer (the last row of pixels in the image is at the beginning of the buffer).
4) Once a frame had been drawn, draw the offscreen buffer in the view using CGContextDrawImage().
As far as I could tell, every time you call CGImageCreateWithImageInRect(), it decodes the entire PNG file into a raw bitmap, then copies the desired region of the bitmap to the destination context.
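For what it's worth, here is a rough sketch of the blit in step 3 (it assumes both bitmap contexts share the same 32-bit pixel format, skips alpha blending and bounds checking, and treats the rects as buffer-row coordinates; remember the rows are stored bottom-up as noted above):
void BlitRect(CGContextRef srcCtx, CGContextRef dstCtx, CGRect srcRect, CGPoint dstPoint)
{
    const size_t bytesPerPixel = 4;
    uint8_t *src = CGBitmapContextGetData(srcCtx);
    uint8_t *dst = CGBitmapContextGetData(dstCtx);
    size_t srcStride = CGBitmapContextGetBytesPerRow(srcCtx);
    size_t dstStride = CGBitmapContextGetBytesPerRow(dstCtx);
    size_t rowBytes = (size_t)srcRect.size.width * bytesPerPixel;

    for (size_t row = 0; row < (size_t)srcRect.size.height; row++) {
        // Copy one scanline of the source region into the destination buffer.
        memcpy(dst + ((size_t)dstPoint.y + row) * dstStride + (size_t)dstPoint.x * bytesPerPixel,
               src + ((size_t)srcRect.origin.y + row) * srcStride + (size_t)srcRect.origin.x * bytesPerPixel,
               rowBytes);
    }
}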