Which CSS selectors or rules can significantly affect front-end layout / rendering performance in the real world? - performance

Is it worth worrying about CSS rendering performance? Or should we just not worry about efficiency at all with CSS and just focus on writing elegant or maintainable CSS instead?
This question is intended to be a useful resource for front-end developers on which parts of CSS can actually have a significant impact on device performance, and which devices / browsers or engines may be affected. This is not a question about how to write elegant or maintainable CSS, it's purely about performance (although hopefully what's written here can inform more general articles on best-practice).
Existing evidence
Google and Mozilla have written guidelines on writing efficient CSS and CSSLint's set of rules includes:
Avoid selectors that look like regular expressions
..
don't use the complex equality operators to avoid performance penalties
but none of them provide any evidence (that I could find) of the impact these have.
A css-tricks.com article on efficient CSS argues (after outlining a load of efficiency best practices) that we should not .. sacrifice semantics or maintainability for efficient CSS these days.
A perfection kills blog post suggested that border-radius and box-shadow rendered orders of magnitude slower than simpler CSS rules. This was hugely significant in Opera's engine, but insignificant in Webkit. Further, a smashing magazine CSS benchmark found that rendering time for CSS3 display rules was insignificant and significantly faster than rendering the equivalent effect using images.
Know your mobile tested various mobile browsers and found that they all rendered CSS3 equally insignificantly fast (in 12ms) but it looks like they did the tests on a PC, so we can't infer anything about how hand-held devices perform with CSS3 in general.
There are many articles on the internet on how to write efficient CSS. However, I have yet to find any comprehensive evidence that badly considered CSS actually has a significant impact on the rendering time or snappiness of a site.
Background
I offered bounty for this question to try to use the community power of SO to create a useful well-researched resource.

The first thing that comes to mind here is: how clever is the rendering engine you're using?
That, generic as it sounds, matters a lot when questioning the efficiency of CSS rendering/selection. For instance, suppose the first rule in your CSS file is:
.class1 {
/*make elements with "class1" look fancy*/
}
So when a very basic engine sees that (and since this is the first rule), it goes and looks at every element in your DOM, and checks for the existence of class1 in each. Better engines probably map classnames to a list of DOM elements, and use something like a hashtable for efficient lookup.
.class1.class2 {
/*make elements with both "class1" and "class2" look extra fancy*/
}
Our example "basic engine" would go and revisit each element in DOM looking for both classes. A cleverer engine will compare n('class1') and n('class2') where n(str) is number of elements in DOM with the class str, and takes whichever is minimum; suppose that's class1, then passes on all elements with class1 looking for elements that have class2 as well.
In any case, modern engines are clever (way more clever than the discussed example above), and shiny new processors can do millions (tens of millions) of operations a second. It's quite unlikely that you have millions of elements in your DOM, so the worst-case performance for any selection (O(n)) won't be too bad anyhow.
Update: To get some actual practical illustrative proof, I've decided to do some tests. First of all, to get an idea about how many DOM elements on average we can see in real-world applications, let's take a look at how many elements some popular sites' webpages have:
Facebook: ~1900 elements (tested on my personal main page).
Google: ~340 elements (tested on the main page, no search results).
Google: ~950 elements (tested on a search result page).
Yahoo!: ~1400 elements (tested on the main page).
Stackoverflow: ~680 elements (tested on a question page).
AOL: ~1060 elements (tested on the main page).
Wikipedia: ~6000 elements, 2420 of which aren't spans or anchors (Tested on the Wikipedia article about Glee).
Twitter: ~270 elements (tested on the main page).
Summing those up, we get an average of ~1500 elements. Now it's time to do some testing. For each test, I generated 1500 divs (nested within some other divs for some tests), each with appropriate attributes depending on the test.
The tests
The styles and elements are all generated using PHP. I've uploaded the PHPs I used, and created an index, so that others can test locally: little link.
Results:
Each test is performed 5 times on three browsers (the average time is reported): Firefox 15.0 (A), Chrome 19.0.1084.1 (B), Internet Explorer 8 (C):
A B C
1500 class selectors (.classname) 35ms 100ms 35ms
1500 class selectors, more specific (div.classname) 36ms 110ms 37ms
1500 class selectors, even more specific (div div.classname) 40ms 115ms 40ms
1500 id selectors (#id) 35ms 99ms 35ms
1500 id selectors, more specific (div#id) 35ms 105ms 38ms
1500 id selectors, even more specific (div div#id) 40ms 110ms 39ms
1500 class selectors, with attribute (.class[title="ttl"]) 45ms 400ms 2000ms
1500 class selectors, more complex attribute (.class[title~="ttl"]) 45ms 1050ms 2200ms
Similar experiments:
Apparently other people have carried out similar experiments; this one has some useful statistics as well: little link.
The bottom line: Unless you care about saving a few milliseconds when rendering (1ms = 0.001s), don't bother give this too much thought. On the other hand, it's good practice to avoid using complex selectors to select large subsets of elements, as that can make some noticeable difference (as we can see from the test results above). All common CSS selectors are reasonably fast in modern browsers.
Suppose you're building a chat page, and you want to style all the messages. You know that each message is in a div which has a title and is nested within a div with a class .chatpage. It is correct to use .chatpage div[title] to select the messages, but it's also bad practice efficiency-wise. It's simpler, more maintainable, and more efficient to give all the messages a class and select them using that class.
The fancy one-liner conclusion:
Anything within the limits of "yeah, this CSS makes sense" is okay.

Most answers here focus on selector performance as if it were the only thing that matters. I'll try to cover some spriting trivia (spoiler alert: they're not always a good idea), css used value performance and rendering of certain properties.
Before I get to the answer, let me get an IMO out of the way: personally, I strongly disagree with the stated need for "evidence-based data". It simply makes a performance claim appear credible, while in reality the field of rendering engines is heterogenous enough to make any such statistical conclusion inaccurate to measure and impractical to adopt or monitor.
As original findings quickly become outdated, I'd rather see front-end devs have an understanding of foundation principles and their relative value against maintainability/readability brownie points - after all, premature optimization is the root of all evil ;)
Let's start with selector performance:
Shallow, preferably one-level, specific selectors are processed faster. Explicit performance metrics are missing from the original answer but the key point remains: at runtime an HTML document is parsed into a DOM tree containing N elements with an average depth D and than has a total of S CSS rules applied. To lower computational complexity O(N*D*S), you should
Have the right-most keys match as few elements as possible - selectors are matched right-to-left^ for individual rule eligibility so if the right-most key does not match a particular element, there is no need to further process the selector and it is discarded.
It is commonly accepted that * selector should be avoided, but this point should be taken further. A "normal" CSS reset does, in fact, match most elements - when this SO page is profiled, the reset is responsible for about 1/3 of all selector matching time so you may prefer normalize.css (still, that only adds up to 3.5ms - the point against premature optimisation stands strong)
Avoid descendant selectors as they require up to ~D elements to be iterated over. This mainly impacts mismatch confirmations - for instance a positive .container .content match may only require one step for elements in a parent-child relationship, but the DOM tree will need to be traversed all the way up to html before a negative match can be confirmed.
Minimize the number of DOM elements as their styles are applied individually (worth noting, this gets offset by browser logic such as reference caching and recycling styles from identical elements - for instance, when styling identical siblings)
Remove unused rules since the browser ends up having to evaluate their applicability for every element rendered. Enough said - the fastest rule is the one that isn't there :)
These will result in quantifiable (but, depending on the page, not necessarily perceivable) improvements from a rendering engine performance standpoint, however there are always additional factors such as traffic overhead and DOM parsing etc.
Next, CSS3 properties performance:
CSS3 brought us (among other things) rounded corners, background gradients and drop-shadow variations - and with them, a truckload of issues. Think about it, by definition a pre-rendered image performs better than a set of CSS3 rules that has to be rendered first. From webkit wiki:
Gradients, shadows, and other decorations in CSS should be used only
when necessary (e.g. when the shape is dynamic based on the content) -
otherwise, static images are always faster.
If that's not bad enough, gradients etc. may have to be recalculated on every repaint/reflow event (more details below). Keep this in mind until the majority of users user can browse a css3-heavy page like this without noticeable lag.
Next, spriting performance:
Avoid tall and wide sprites, even if their traffic footprint is relatively small. It is commonly forgotten that a rendering engine cannot work with gif/jpg/png and at runtime all graphical assets are operated with as uncompressed bitmaps. At least it's easy to calculate: this sprite's width times height times four bytes per pixel (RGBA) is 238*1073*4≅1MB. Use it on a few elements across different simultaneously open tabs, and it quickly adds up to a significant value.
A rather extreme case of it has been picked up on mozilla webdev, but this is not at all unexpected when questionable practices like diagonal sprites are used.
An alternative to consider is individual base64-encoded images embedded directly into CSS.
Next, reflows and repaints:
It is a misconception that a reflow can only be triggered with JS DOM manipulation - in fact, any application of layout-affecting style would trigger it affecting the target element, its children and elements following it etc. The only way to prevent unnecessary iterations of it is to try and avoid rendering dependencies. A straightforward example of this would be rendering tables:
Tables often require multiple passes before the layout is completely established because they are one of the rare cases where elements can
affect the display of other elements that came before them on the DOM.
Imagine a cell at the end of the table with very wide content that
causes the column to be completely resized. This is why tables are not
rendered progressively in all browsers.
I'll make edits if I recall something important that has been missed. Some links to finish with:
http://perfectionkills.com/profiling-css-for-fun-and-profit-optimization-notes/
http://jacwright.com/476/runtime-performance-with-css3-vs-images/
https://developers.google.com/speed/docs/best-practices/payload
https://trac.webkit.org/wiki/QtWebKitGraphics
https://blog.mozilla.org/webdev/2009/06/22/use-sprites-wisely/
http://dev.opera.com/articles/view/efficient-javascript/

While it's true that
computers were way slower 10 years ago.
You also have a much wider variety of device that are capable of accessing your website these days. And while desktops/laptops have come on in leaps and bounds, the devices in the mid and low end smartphone market, in many cases aren't much more powerful than what we had in desktops ten years ago.
But having said that CSS Selection speed is probably near the bottom of the list of things you need to worry about in terms of providing a good experience to as broad a device range as possible.
Expanding upon this I was unable to find specific information relating to more modern browsers or mobile devices struggling with inefficient CSS selectors but I was able to find the following:
http://www.stevesouders.com/blog/2009/03/10/performance-impact-of-css-selectors/
Quite dated (IE8, Chrome 2) now but has a decent attempt of establishing efficiency of various selectors in some browsers and also tries to quantify how the # of CSS rules impacts page rendering time.
http://www.thebrightlines.com/2010/07/28/css-performance-who-cares/
Again quite dated (IE8, Chrome 6) but goes to extremes in inefficient CSS selectors * * * * * * * * * { background: #ff1; } to establish performance degradation.

For such a large bounty I am willing to risk the Null answer: there are no official CSS selectors that cause any appreciable slow-downs in the rendering, and (in this day of fast computers and rapid browser iteration) any that are found are quickly solved by browser makers. Even in mobile browsers there is no problem, unless the unwary developer is willing to use non-standard jQuery selectors. These are marked as risky by the jQuery developers, and can indeed be problematic.
In this case the lack of evidence is evidence of the lack of problems. So, use semantic markup (especially OOCSS), and report any slow-downs that you find when using standard CSS selectors in obscure browsers.
People from the future: CSS performance problems in 2012 were already a thing of the past.

isn't css a irrelevant way to make it faster, it must be the last thing you look at when you look at performance. Make your css in what ever way that suites you, compile it. and then put it in the head. This might be rough but their are loads of other things to look for when your looking in to browser performance. If you work at a digital bureau you wont get paid to do that extra 1ms in load time.
As i commented use pagespeed for chrome its a google tool that analyze the website in 27 parameters css is 1 of them.
My post just concern exactly, wouldn't rather have around 99% of web users be able to open the website and see it right, even the people with IE7 and such. Than closing out around 10% by using css3, (If it turns out that you can get an extra 1-10ms on performance).
Most people have atleast 1mbit/512kbit or higher, and if you load a heavy site it takes around 3 secounds to load, but you can save 10ms maybe on css??
And when it comes to mobile devices you should make sites just for mobiles so when you have a device with screen size less than "Width"px, you have a separate site
Please comment below this is my perspective and my personal experience with web development

While not directly code-related, using <link> over #import to include your stylesheets provides much faster performance.
'Don’t use #import' via stevesouders.com
The article contains numerous speed test examples between each type as well as including one type with another (ex: A CSS file called via <link> also contains #import to another css file).

Related

jQuery Chosen is slow on IE, show results after x chars?

I'm using the chosen plugin (http://harvesthq.github.io/chosen/) with 10K items in the select box
On IE9 & IE10 it's very slow.
Is there a way to speed up to plugin?
Was thinking that results only would show up after x chars searched for, but can't find any documentation on that.
That number of entries will likely be slow, no matter which plugin you use (assuming the plugin keeps the elements in memory the whole time).
If you REALLY need to have all of those options available, it may be faster to have a search performed server-side, and return the resulting elements, rebuilding the select box after wards. I'm not sure if 'Chosen' has this capability, but I'm sure there is a jQuery plugin around somewhere that would provide this functionality.
10k is a lot of elements to go through and prune - and it's fair enough to say, IE has always been a tad on the slow side for JS.
With regards to why it's not speeding up after providing a certain number of characters, I imagine it's searching the whole data set every time, instead of (if a character is added) a sub set (the previously returned results).
This could be improved, perhaps by using a result set history of some kind, but would required substantial development.
Edit: Something like this possibly? http://ivaynberg.github.io/select2/

performance of layered canvases vs manual drawImage()

I've written a small graphics engine for my game that has multiple canvases in a tree(these basically represent layers.) Whenever something in a layer changes, the engine marks the affected layers as "soiled" and in the render code the lowest affected layer is copied to its parent via drawImage(), which is then copied to its parent and so on up to the root layer(the onscreen canvas.) This can result in multiple drawImage() calls per frame but also prevents rerendering anything below the affected layer. However, in frames where nothing changes no rendering or drawImage() calls take place, and in frames where only foreground objects move, rendering and drawImage() calls are minimal.
I'd like to compare this to using multiple onscreen canvases as layers, as described in this article:
http://www.ibm.com/developerworks/library/wa-canvashtml5layering/
In the onscreen canvas approach, we handle rendering on a per-layer basis and let the browser handle displaying the layers on screen properly. From the research I've done and everything I've read, this seems to be generally accepted as likely more efficient than handling it manually with drawImage(). So my question is, can the browser determine what needs to be re-rendered more efficiently than I can, despite my insider knowledge of exactly what has changed each frame?
I already know the answer to this question is "Do it both ways and benchmark." But in order to get accurate data I need real-world application, and that is months away. By then if I have an acceptable approach I will have bigger fish to fry. So I'm hoping someone has been down this road and can provide some insight into this.
The browser cannot determine anything when it comes to the canvas element and the rendering as it is a passive element - everything in it is user rendered by the means of JavaScript. The only thing the browser does is to pipe what's on the canvas to the display (and more annoyingly clear it from time to time when its bitmap needs to be re-allocated).
There is unfortunately no golden rule/answer to what is the best optimization as this will vary from case to case - there are many techniques that could be mentioned but they are merely tools you can use but you will still have to figure out what would be the right tool or the right combination of tools for your specific case. Perhaps layered is good in one case and perhaps it doesn't bring anything to another case.
Optimization in general is very much an in-depth analysis and break-down of patterns specific to the scenario, that are then isolated and optimized. The process if often experiment, benchmark, re-adjust, experiment, benchmark, re-adjust, experiment, benchmark, re-adjust... of course experience reduce this process to a minimum but even with experience the specifics comes in a variety of combinations that still require some fine-tuning from case to case (given they are not identical).
Even if you find a good recipe for your current project it is not given that it will work optimal with your next project. This is one reason no one can give an exact answer to this question.
However, when it comes canvas what you want to achieve is a minimum of clear operations and minimum areas to redraw (drawImage or shapes). The point with layers is to groups elements together to enable this goal.

Ideal Hostnames Number to Parallelize Downloads

We are setting up a CDN to serve CSS, JS and the images. We are wondering what will be the ideal number of hostnames to distribute the ressources across. This technique is use by many websites to increase parallelize downloads and page loading. However, DNS lookups slow down the page loading so the rule is not the more hostname you have, the more performance you will get.
I've read somewhere that the ideal number is between 2 and 4. I wonder if there is a rule of thumbs that apply to all webpages or if there is a rule of thumbs according to the number of ressources being served and the size of them.
Specific case : Our websites are composed of two kinds of pages. One kind serve a list of thumbnails (15-20 or so images, varying) and the other serve a flash or shockwave application (mostly games) with a lot less images. Obviously, we have regular JS, images and CSS on all pages. When I mean regular, that correctly optimized elements, 1 CSS, a few images of the UI, 2/3 JS...
I will love to have answers for our specific case but I will be also very interested to have general answers too!
We (Yahoo!) did a study back in 2007 that showed that 2 is a good number, and anything more than that doesn't improve page performance, in some cases having more than 2 domains degraded performance.
What I would recommend is - if you have a A/B testing infrastructure then go ahead and try it out on your site, use different number of domains and measure the page load time from your end users.
If you don't have a A/B testing framework then just try a different value for few days, measure it, try a new one, measure that ... do this till you find that point where performance starts to degrade.
There is no silver bullet for this recommendation. This is something that depends a lot on how many assets are there on each page and what browser (number of parallel downloads) your end users use. Hope that helps.

Horrible performance when removing HTML element from UIWebView

I'm currently writing an iOS app that uses a UIWebView for surfing around pages. Sometimes I need to dynamically remove elements in the UIWebView using stringByEvaluatingJavaScriptFromString:, but this locks up the main UI for sometimes up to 2 seconds on my first gen iPod touch, and maybe half a second on an iPhone 3GS. The JavaScript I'm using to remove it by is simply:
element.parentNode.removeChild(element);
Nothing more complicated than that. At the same time I'm doing some very basic 2D rendering in OpenGL ES, and if rerendering the UIWebView wouldn't lock up I would use just simple CoreAnimation on the main thread. Could it be that it has to recalculate the DOM tree, all element positions etc? Should this really lock up the main UI thread? Is it that I'm calling stringByEvaluatingJavaScriptFromString: that locks up everything? Is this normal and to be expected on this kind of hardware? The odd thing is it is able to render some semi-complex MooTools animations in the webview, with opacity and height changes, but removing one single element takes several seconds.
Does anyone have any ideas in improvements? Maybe just hiding elements using visibility: hidden is better, or setting opacity: 0? Any thoughs or wise words of experience?
DOM tree manipulations are horribly slow in all browsers, especially iOS Safari. The key factor is size of the DOM tree. Because of this, the best advice is to delete as you go (don't use visibility:hidden).
If you can possibly avoid using direct dom manipulation and instead use setInnerHTML(), this will yeild much better performance. I've seen it work in the order of 100 times faster. It is counter-intuitive, but doing a whole lot of string manipulation and then throwing a string at the browser is much faster. This is because browsers are optimised to render DOM trees from strings.
In your case setInnerHTML might only be helpful if you are trying to delete a number of nodes at the same time. If you are only deleting one node, you're stuck - just try to keep the DOM small.
Hope this helps.

Any name for this concept?

Say, we have a program that gets user input or any other unpredictable events at arbitrary moments of time.
For each kind of event the program should perform some computation or access a resource, which is reasonably time-consuming to be considered. The program should output a result as fast as possible. If next events arrive, it might be acceptable to drop previous computations and take up new ones.
To complicate it further, some computations/resource access might be interdependent, i.e. produce data that can be used in other computations.
What's important we know the pattern in which these events usually occur. For example: their relative frequency with respect to each other, or a common order and time intervals in which they happen.
The task is to make an algorithm which deals with the problem in the most statistically efficient way. Approaches yielding sub-optimal solutions can be more than sufficient.
Is there a concept which embraces designing such algorithms?
Example:
A tabbed internet browser.
When told to load different web pages in several tabs, should decide whether to load the page in an active tab with higher priority, to render just the visible part of the page or pre-render the full page, if so what to do first - pre-render the whole page for the active tab or render other tabs instead, etc.
(I know nothing about how browsers actually work, but assuming it is this way won't hurt)
I think scheduling algorithms deal with these kind of scenarios.
What you're describing is a prioritizing application scheduler. You would need to be more specific to determine which algorithm would be best, but here's a list that you might find useful.
I am tossing keywords: Scheduling with pre-emption? Also, prefetching, double-buffering
I don't know a lot about it but this sounds like something that the reactor patern may be used for.

Resources