I generated a report using Lighthouse for nuxt-prismic-showcase
Here is the result:
[Lighthouse report screenshot]
I set up lazy loading for images, served them at the right size, used next-gen formats, and fixed the "Ensure text remains visible during webfont load" audit by adding font-display: swap.
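The image lazy loading follows the usual IntersectionObserver pattern, roughly like this (a simplified sketch, not my exact code; the data-src convention is just illustrative):

```
// Swap in the real image source only when the placeholder scrolls into view.
const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    const img = entry.target;
    img.src = img.dataset.src; // real URL kept in data-src until needed
    obs.unobserve(img);        // each image only needs to load once
  }
});

document.querySelectorAll('img[data-src]').forEach((img) => observer.observe(img));
```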
I read this article and set up vue-lazy-hydration to try to get better performance:
https://markus.oberlehner.net/blog/how-to-drastically-reduce-estimated-input-latency-and-time-to-interactive-of-ssr-vue-applications/
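For context, the vue-lazy-hydration setup follows the pattern from the article, roughly like this (HeavyComponent stands in for my real components):

```
<template>
  <div>
    <!-- Skip hydration of this subtree until the browser is idle -->
    <LazyHydrate when-idle>
      <HeavyComponent />
    </LazyHydrate>
  </div>
</template>

<script>
import LazyHydrate from 'vue-lazy-hydration';

export default {
  components: {
    LazyHydrate,
    HeavyComponent: () => import('./HeavyComponent.vue'),
  },
};
</script>
```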
But this did not bring a significant improvement.
What can be done to improve Time to Interactive, First CPU Idle, and Max Potential First Input Delay?
Thank you
Related
I submitted my app/game/PWA to PageSpeed Insights and it keeps giving me TTI values > 7000 ms and TBT values > 2000 ms, as can be seen in the screenshot below (the overall score for a mobile experience is around 63):
I read what those values mean over and over, but I just cannot make them lower!
What is most annoying is that when accessing the page in a real-life browser, I don't need to wait 7 seconds for the page to become interactive, even with a cleared cache!!!
The game can be accessed here and its source is here.
What comforts me is that Google's own game, Doodle Cricket also scores terribly. In fact, PageSpeed Insights gives it an overall score of "?".
Summing up: is there a way to tell PageSpeed Insights the page is actually a game with only a simple canvas in it and that it is indeed interactive as soon as the first frame is rendered on the canvas (not 7 seconds later)?
UPDATE: Partial Solution
Thanks to @Graham Ritchie's answer, I was able to detect the two slowest points, simulating a mid-tier mobile phone:
- Loading/compiling the WebAssembly files: there was not much I could do about this, and it alone consumes almost 1.5 seconds...
- Loading the main script file, script.min.js: I split the file in two, since almost two thirds of it is just string constants. The main script is now loaded with the async attribute and the string constants are loaded asynchronously afterwards, which saved more than 1.2 seconds of load time.
The improvements also saved some time on better mobile devices and on desktop.
The commit diff is here.
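Conceptually, the split looks like this (a simplified sketch, not the actual commit; strings.min.js and setStrings are placeholder names):

```
// script.min.js is now included with <script async src="script.min.js">,
// so it no longer blocks parsing. Once it runs, the string constants are
// pulled in lazily instead of being bundled into the main file.
import('./strings.min.js')
  .then((mod) => setStrings(mod.default)) // hand the constants to the game
  .catch((err) => console.error('Failed to load string constants', err));
```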
UPDATE 2: Improving the Tooling
For anyone who gets here from Google, two extra tips that I forgot to mention before...
Use the Lighthouse CLI rather than the website (both for localhost and for public websites): npm install -g lighthouse, then run lighthouse --view http... (plus any other arguments as necessary).
If running on a notebook, make sure it is not running on battery but is actually connected to a power source 😅
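Lighthouse can also be driven from a Node script if you want to automate runs; a minimal sketch using the lighthouse and chrome-launcher packages (the URL is a placeholder):

```
const lighthouse = require('lighthouse');
const chromeLauncher = require('chrome-launcher');

(async () => {
  // Launch a headless Chrome instance for Lighthouse to drive.
  const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
  const result = await lighthouse('https://example.com', {
    port: chrome.port,
    onlyCategories: ['performance'],
  });
  console.log('Performance score:', result.lhr.categories.performance.score);
  await chrome.kill();
})();
```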
> Summing up: is there a way to tell PageSpeed Insights the page is actually a game with only a simple canvas in it and that it is indeed interactive as soon as the first frame is rendered on the canvas (not 7 seconds later)?
No, and unfortunately I think you have missed one key piece of the puzzle as to why those numbers are so high.
PageSpeed Insights throttles both the network AND the CPU to simulate a mid-tier mobile phone on a 4G connection.
The CPU throttling is your issue.
If I run your game within the Performance tab of Chrome DevTools with "4x slowdown" on the CPU, I get a few long tasks, one of which takes 5.19 s to run!
Your issue isn't page weight, as the site is lightweight; it is JavaScript execution time.
You would have to look through your code and see why you have a task that takes so long to run; look for nested loops, as they are normally the issue!
There are several other tasks that take 1-2 seconds in total between them, but that 5-second task is the main culprit!
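Once you find the offending task, the usual fix is to break that work into chunks that yield back to the event loop. A rough sketch, assuming the culprit is one big synchronous loop (processInChunks and handleItem are illustrative names):

```
// Process a large array without blocking the main thread for seconds at a time.
async function processInChunks(items, handleItem, chunkSize = 500) {
  for (let i = 0; i < items.length; i += chunkSize) {
    items.slice(i, i + chunkSize).forEach(handleItem);
    // Yield so input events can be handled between chunks.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
}
```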
Hopefully that clears things up a bit, any questions just ask.
My website uses the following optimizations to free up the main thread and to optimize the content load process:
- Web workers for loading async data as well as images.
- Deferring images until all other content on the page has loaded.
- Typekit's webfontloader for optimized font loading.
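The worker-based image loading is essentially the following pattern (a simplified sketch, not my exact code; file names are illustrative):

```
// worker.js: fetch the image bytes off the main thread.
self.onmessage = async ({ data: url }) => {
  const blob = await (await fetch(url)).blob();
  self.postMessage({ url, blob });
};

// main.js: hand URLs to the worker and attach results to <img> tags.
const worker = new Worker('worker.js');
worker.onmessage = ({ data: { url, blob } }) => {
  const img = document.querySelector(`img[data-src="${url}"]`);
  img.src = URL.createObjectURL(blob); // browser decodes from the blob URL
};
worker.postMessage('/images/hero.jpg');
```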
Now, since I completely switched over to web workers for all network (async) tasks, I have noticed a ~50% increase in the occurrence of the following errors:
But my score seems to be unaffected.
My question is, how accurate is this score?
P.S.: My initial data set is huge, so styling and rendering take ~1300 ms and ~1100 ms respectively (a required constraint).
After doing a few experiments and glancing through the source code of Lighthouse (the engine that powers PSI), I think the problem stems from the fact that once everything has loaded (the page load event), Lighthouse only runs for a few seconds before terminating.
However, your JS keeps running for quite some time afterwards; on one of my runs the service workers were performing tasks nearly 11 seconds after page load (probably storing some images that take a long time to download).
I am guessing you are getting intermittent errors because sometimes the CPU goes quiet long enough to calculate the JS execution time and sometimes it does not (depending on the gaps between the tasks it is performing).
To see what I mean, open developer tools in Chrome -> Performance tab -> set CPU slowdown to "4x slowdown" (which is what Lighthouse emulates) and press Record (top left). Now reload the page and, once it has fully loaded, stop recording.
You will get a performance profile with a section 'Main' that you can expand to see the main-thread load (the main thread is still used despite the worker, as it needs to decode the base64-encoded images; I am not sure whether that can be shifted onto a different thread).
You will see that tasks continue to use the CPU for 3-4 seconds after page load.
It is a bug in Lighthouse, but at the same time it is something to address on your side, as it is symptomatic of a problem (why base64-encoded images? That is where you are taking the performance hit on what would otherwise be a well-optimised site).
The initial data set received from an Apollo Client GraphQL query in an application I am trying to tune is currently very large. By "large" I mean that the data normalizes to about 7,000 entries under the "data" key in the cache. The payload is about 1.6 MB, and if I save the cache's data entry the normalized form is about 3 MB.

I'm not a fan of how the initial query works, and I am redesigning the application to use cursors and filtering on the graph rather than having the client fetch such a large amount of data and filter it itself. The current implementation cannot scale, because larger data sets will be returned when this software is installed in other locations. But I am looking for a short-term solution to make this cache build faster while I undertake that very large redesign task.
**UPDATE July 25, 2018:** The cursor approach doesn't work, as cache write performance degrades as more entries are added while each page/cursor of data is fetched.
The real issue is that IE 11, which I have to support due to the industry's (healthcare) reliance on this browser, is extremely slow. It's very difficult to measure, but it's about 8-10x slower than Chrome in the area of the Apollo cache and the React integration code. Chrome can take 1-2 seconds to build the cache on these slower virtual desktops, while IE will take 10-20 seconds.
So, my question is: are there any performance tweaks to help the cache build faster? I've attached a screenshot to show where the bottleneck lies. It's the same in Chrome as in IE, just about an order of magnitude slower in IE. I'm not sure if it's an IE shortcoming or some awful polyfill issue.

The screenshot shows the hot spots in the performance results. Yes, it is of the development version of React, but we aren't seeing any real noticeable performance increase in production. The screenshot is really just a call to the graph plus the simplest HTML table being rendered, with about 260 rows; the render phase is negligible. It seems there is an awful lot of queued-up work during this phase. Perhaps there is a way to suspend it? Chrome's profiler shows the same hot spot, it's just not as slow.
Anyway, any advice is greatly appreciated.
The screenshot columns are: function | invocation count | time (seconds)
Our team is facing similar issues. Our current approach is to "denormalize" part of the server schema into a String type that holds a JSON string; on the client side, we parse the JSON string returned by the Apollo client.
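Roughly, the pattern looks like this on the client (the query and field names are made up for illustration):

```
// The server exposes the large collection as a single String field, so the
// cache stores one flat string instead of normalizing thousands of entries.
const { data } = await client.query({ query: GET_REPORT });
const rows = JSON.parse(data.report.rowsJson); // re-hydrate on the client
```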
Apollo 3.0 will have an option to disable cache normalization for a type:
https://www.apollographql.com/docs/react/v3.0-beta/caching/cache-configuration/#disabling-normalization
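Based on those docs, disabling normalization looks roughly like this (the type name is illustrative):

```
import { InMemoryCache } from '@apollo/client';

const cache = new InMemoryCache({
  typePolicies: {
    // Objects of this type are embedded in their parent query result
    // instead of being normalized into their own cache entries.
    ReportRow: {
      keyFields: false,
    },
  },
});
```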
I ran into a similar problem (on Apollo Client 3.3.6). Eventually, it became clear that a collection this large was not suitable for the Apollo Client cache. Rather than saddle yourself with seconds-long processing (probably right when you load your app), I'd strongly recommend managing your larger datasets yourself outside the cache - native JS map/filter/etc. is much faster and probably a better fit, depending on why you need the data. Just pass the option fetchPolicy: 'no-cache' and watch your app speed up significantly. Any large fetch (thousands of typed results) is probably better off with this treatment.
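For example, with a hook-based query (GET_LARGE_DATASET is a placeholder):

```
import { useQuery } from '@apollo/client';

// The result bypasses the normalized cache entirely, so no time is spent
// writing thousands of entries into it.
const { data, loading } = useQuery(GET_LARGE_DATASET, {
  fetchPolicy: 'no-cache',
});
```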
I'm writing a Chrome extension and I want to measure how it affects performance, specifically currently I'm interested in how it affects page load times.
I picked a certain page I want to test, recorded it with Fiddler, and I use that recording as the AutoResponder in Fiddler. This allows me to measure load times without network traffic delays.
Using this technique I found out that my extension adds ~1200ms to the load time. Now I'm trying to figure out what causes the delay and I'm having trouble understanding the DevTools Performance results.
First of all, it seems there's a discrepancy in the reported load time:
On one hand, the summary shows a range of ~13s, but on the other hand, the load event arrived after ~10s (which I also corroborated using performance.timing.loadEventEnd - performance.timing.navigationStart):
The second thing I don't quite understand is how the numbers add up (or rather don't add up). For example, here's a grouping of different categories during load:
Neither of these columns sums to 10 s nor to 13 s.
When I group by domain, I get separate rows for the extension and for everything else:
But it seems that the extension only adds ~250 ms, which is much lower than the observed difference in load times.
I assume that these numbers represent just CPU time, and do not include any wait time. Is this correct? If so, it's OK that the numbers don't add up and it's possible that the extension doesn't spend all its time doing CPU bound work.
Then there's also the mysterious [Chrome extensions overhead], which doesn't explain the difference in load times either. Judging by the fact that it's a separate line from my extension, I assumed they were mutually exclusive, but when I dive deeper into the specifics, I find my extension's functions under the [Chrome extensions overhead] subdomain:
So to summarize, this is what I want to be able to do:
- Calculate the total CPU time my extension uses - it seems it's not enough to look under the extension's name, since its functions might also appear in other groups.
- Understand whether the delay in load time is caused by CPU processing or by synchronous waiting. If it's the latter, find where my extension is doing a synchronous wait, because I'm pretty sure I didn't call any blocking APIs.
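For the first point, one partial workaround is to bracket the extension's entry points with User Timing marks so its work shows up as named measures in the profiler's Timings track; a rough sketch (handleMessage is a placeholder for a real entry point):

```
// Wrap an entry point so each invocation is recorded as a named measure.
function instrumented(name, fn) {
  return (...args) => {
    performance.mark(`${name}-start`);
    try {
      return fn(...args);
    } finally {
      performance.measure(name, `${name}-start`);
    }
  };
}

const onMessage = instrumented('ext-onMessage', handleMessage);
```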
Update
Eventually I found out that the reason for the slowdown was that we also activated Chrome accessibility whenever our extension was running, and that is what caused the drastic slowdown. Without accessibility, the extension had a very minor effect. I still wonder, though, how I could have seen in the profiler that accessibility was the problem. It could have saved me a ton of time... I will try to look at it again later.
The built-in Sitecore rendering stats page http://<sitename>/sitecore/admin/stats.aspx is really helpful for identifying inefficient and slow-loading XSLT renderings. Recently I've started switching to .ascx sublayouts to take advantage of the Sitecore C# API, which can help improve performance when used correctly.
However, I've noticed that sublayouts (as opposed to XSLT renderings) are not reported correctly on the stats page. See the screenshot below.
I know for a fact that this sublayout takes about 1.8 seconds to generate (I measured this in the code-behind). Caching is turned off, and I've refreshed the page 20 times to make sure I get an average. You will see that "Avg. items" is always 0 - I can live with that - but "Avg. time (ms)" is less than 1 ms, which is clearly wrong.
Does anyone have any insights into this? Has anyone found a way to get it to work correctly?
Judging whether a statistic is right or wrong relies on understanding exactly what it is measuring.
Digging around in Sitecore.Diagnostics.Statistics using Reflector I note the following:
- Sitecore.Web.UI.WebControl contains a field m_timer
- This is 'started' in the BeforeRender() method and 'stopped' in the AfterRender() method
- Data from that timer is sent to Statistics.AddRenderingData() and is logged against the control
This means it is measuring the time taken to render the control. For an XSLT rendering that includes the processing time for preparing all its data, but since much of the work of a normal ASCX happens before the Render stage, the statistic is much less useful there. Incorporating the Load stage in the timing would inadvertently include the processing time of all child components, since the Load sequence is chained and called recursively, so that probably would not help much either.
I suspect there is no good way of measuring the processing time of a specific ASCX control (excluding its children) without first acquiring cumulative data and then post-processing the call chain to split the time apart. This is the sort of thing Red Gate ANTS does really well, but it might not be a good idea on a live production system, given the overheads.