Are Sitecore's sublayout rendering stats incorrect?

The built-in Sitecore rendering stats page, http://<sitename>/sitecore/admin/stats.aspx, is really helpful for identifying inefficient and slow-loading XSLT renderings. Recently I've started switching to .ascx sublayouts to take advantage of the Sitecore C# API, which can help improve performance when used correctly.
However, I've noticed that sublayouts (as opposed to XSLT renderings) are not reported correctly on the stats page. See the screenshot below.
I know for a fact that this sublayout takes about 1.8 seconds to generate (I measured this in the code-behind). Caching is turned off. I've refreshed the page 20 times to ensure I get an average. You will see that the "Avg. items" is always 0 - I can live with this - but the "Avg. time (ms)" is less than 1ms, which is clearly wrong.
Does anyone have any insights into this? Has anyone found a way to get it to work correctly?

Judging whether a statistic is right/wrong is going to rely on understanding exactly what it is measuring.
Digging around in Sitecore.Diagnostics.Statistics using Reflector I note the following:
Sitecore.Web.UI.WebControl contains a field m_timer
This is 'started' in the BeforeRender() method and 'stopped' in the AfterRender() method
Data from that timer is sent to Statistics.AddRenderingData() and is logged against the control
This means it is measuring the time taken to render the control. For an XSLT rendering that includes the processing time for preparing all of its data, but since much of the work of a normal ASCX is done prior to the Render stage, the statistic is much less useful there. Incorporating the Load stage in the timing would inadvertently include the processing time for all child components, since the Load sequence is chained and called recursively, so that probably doesn't help much either.
I suspect there is no good way of measuring the processing time for a specific ASCX control (excluding its children) without first acquiring cumulative data and then post-processing the call chain to split the time apart. This is the sort of thing Red Gate ANTS does really well, but it might not be such a good idea on a live production system, given the overhead.

Lost Duration while Debugging Apex CPU time limit exceeded

I'm open to posting the code in this section to work through the optimization, but it's a bit lengthy and complex, so instead I'm hoping that somebody can assist me with a few debugging questions. My goal is to find out what is causing my Apex CPU Time Limit Exceeded issue.
When using the Debug Log in its basic or normal layout I receive the message
Maximum CPU Time: 15062 out of 10,000 ** Close to Limit
I've optimized and rewritten various loops and queries several times now, and in each case this number ends up around the same value, which leads me to believe it is lying to me and that my actual usage far exceeds that number. So I switched the log panels of the Developer Console to Analysis in hopes of isolating exactly which loop, method, or area of the code is giving me a headache.
This leads me to my main question and problem.
Execution Tree, Performance Tree & Executed Units
All show me that my durations are UNDER the 10,000ms allowance. My largest consumption is 3,556.19ms, used by the constructor of a custom wrapper class I created; the constructor contains a fair amount of logic and builds a fairly complicated wrapper spanning 5-7 custom objects. Still, even with those 3,000ms, the remainder of the process shows negligible times, bringing my total to around 4,000ms. Again, my question is: why am I unable to see or find what is consuming all my time?
Incorrect Iteration Data
In addition to this, on the Performance tree there is a column of data that shows the number of iterations for each method. I know that my production org has 81 objects that would essentially call the constructor of my custom wrapper object, i.e. my constructor SHOULD be called 81 times, but instead it is called 32 times. So my other question is: can I rely on the iteration data in that column? Or does it stop counting at a certain point because it was iterating so many times? It's possible that one of my objects is corrupted or causing an infinite loop somehow, but I don't want to dig through all the data in search of that conclusion if it's a known issue that the iteration data is not accurate anyway.
System.Debug in the Production org
The last question is why my System.debug() lines are not displaying in the Developer Console on the production org. I've added several breadcrumbs throughout the code that would help me isolate which objects are making it through and which are not; however, I cannot see the System.debug messages in any layout outside of my sandbox.
Sorry for the wealth of questions, but I did want to give an honest effort to better understand the debugging process in Salesforce. If this is a lost cause I'm happy to start sharing some code as well, but hopefully some debugging tips can get me to the solution.
It's likely your debug log got truncated; see "Each debug log must be 20 MB or smaller. If it exceeds this amount, you won't see everything you need." in https://trailhead.salesforce.com/en/content/learn/modules/apex_basics_dotnet/debugging_diagnostics
Download the log and search for text similar to "skipped 123456 bytes of detailed log" to confirm; if it is truncated, some System.debug statements will simply not show up.
You might have to fine-tune the log levels (don't log validation rules and workflows? don't log every single variable assignment at the "FINE" level, etc.). You might have to set all flags to NONE and then track only the one class/trigger that you suspect (see https://help.salesforce.com/articleView?id=code_debug_log_classes.htm&type=5 and https://salesforce.stackexchange.com/questions/214380/how-are-we-supposed-to-use-debug-logs-for-a-specific-apex-class-only).
If it's truncated, it's possible the analysis tools give up. (I've had mixed luck with the console, to be honest; sometimes https://apextimeline.herokuapp.com/ is great for an overview, but it will also fail to parse a 20 MB log.)
When all else fails you can load the log into Notepad++ (or any editor of your choice), find the lines related to method entry/exit (you might need a regular expression search), pull those filtered lines into Excel, play with "text to columns", and just look at the timings manually to see if there's a record that causes the spike. It could be record #10 that's the problem; the fact that it exhausts limits on #32 of 81 doesn't mean much. A search like [METHOD_ENTRY|METHOD_EXIT]MyTriggerHandler.onBeforeUpdate could be a good start (a rough sketch of this approach follows below). But the first thing is to make sure the log is not truncated.
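For what it's worth, here is a rough sketch of that manual approach as a small Node/TypeScript script instead of Excel. The assumed log line shape (elapsed nanoseconds in parentheses, then |METHOD_ENTRY| or |METHOD_EXIT|, with the Class.method name as the last pipe-separated field) matches typical Apex debug logs, but treat the regex as something to adjust against your own log rather than a guaranteed format:

// parseApexLog.ts - sketch: summarise METHOD_ENTRY/METHOD_EXIT timings from a
// downloaded Apex debug log. The field layout is an assumption; adjust the regex
// if your log's format differs.
import * as fs from "fs";

const logPath = process.argv[2] ?? "debug.log";
const text = fs.readFileSync(logPath, "utf8");

// Warn if the log was truncated - timings will be incomplete in that case.
if (/skipped \d+ bytes of detailed log/i.test(text)) {
  console.warn("Log appears truncated; the numbers below are incomplete.");
}

// e.g. "12:31:08.123 (123456789)|METHOD_ENTRY|[42]|01p...|MyTriggerHandler.onBeforeUpdate()"
const methodLine = /\((\d+)\)\|(METHOD_ENTRY|METHOD_EXIT)\|.*\|([^|]+)$/;

const stack: { method: string; startNs: number }[] = [];
const totals = new Map<string, { ns: number; calls: number }>();

for (const raw of text.split(/\r?\n/)) {
  const m = raw.match(methodLine);
  if (!m) continue;
  const [, nanos, kind, method] = m;
  if (kind === "METHOD_ENTRY") {
    stack.push({ method: method.trim(), startNs: Number(nanos) });
  } else {
    const open = stack.pop(); // assumes entries and exits are well nested
    if (!open) continue;
    const t = totals.get(open.method) ?? { ns: 0, calls: 0 };
    t.ns += Number(nanos) - open.startNs; // inclusive time (children included)
    t.calls += 1;
    totals.set(open.method, t);
  }
}

// Slowest methods first, cumulative time in milliseconds.
[...totals.entries()]
  .sort((a, b) => b[1].ns - a[1].ns)
  .slice(0, 20)
  .forEach(([method, t]) =>
    console.log(`${(t.ns / 1e6).toFixed(1)} ms  ${t.calls} calls  ${method}`));

Run it with something like ts-node parseApexLog.ts yourlog.log; sorting the cumulative times this way makes it easier to spot whether one particular method (or one particular record) is responsible for the spike.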

Chrome extension effect on page load time

I'm writing a Chrome extension and I want to measure how it affects performance; specifically, at the moment I'm interested in how it affects page load times.
I picked a certain page I want to test, recorded it with Fiddler, and I use this recording as the AutoResponder in Fiddler. This allows me to measure load times without network traffic delays.
Using this technique I found out that my extension adds ~1200ms to the load time. Now I'm trying to figure out what causes the delay and I'm having trouble understanding the DevTools Performance results.
First of all, it seems there's a discrepancy in the reported load time:
On one hand, the summary shows a range of ~13s, but on the other hand, the load event arrived after ~10s (which I also corroborated using performance.timing.loadEventEnd - performance.timing.navigationStart).
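For reference, a minimal way to read that load time from within the page itself; the performance.timing properties used above are deprecated but still work, and the PerformanceNavigationTiming entry reports the same milestone relative to the navigation start:

// Cross-check of the page load time, runnable from the DevTools console
// or an injected content script.
window.addEventListener("load", () => {
  // loadEventEnd is only populated after the load handlers finish, so read it a tick later.
  setTimeout(() => {
    const legacyMs =
      performance.timing.loadEventEnd - performance.timing.navigationStart;
    const [nav] = performance.getEntriesByType(
      "navigation"
    ) as PerformanceNavigationTiming[];
    console.log("load (legacy timing API):", legacyMs, "ms");
    console.log("load (navigation timing):", Math.round(nav?.loadEventEnd ?? 0), "ms");
  }, 0);
});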
The second thing I don't quite understand is how the numbers add up (or rather don't add up). For example, here's a grouping of different categories during load:
Neither of these columns sums to 10s or to 13s.
When I group by domain I can get different rows for the extension and for the rest of the stuff:
But it seems that the extension only adds 250ms, which is much lower than the observed difference in load times.
I assume that these numbers represent just CPU time, and do not include any wait time. Is this correct? If so, it's OK that the numbers don't add up and it's possible that the extension doesn't spend all its time doing CPU bound work.
Then there's also the mysterious [Chrome extensions overhead], which doesn't explain the difference in load times either. Judging by the fact that it's a separate line from my extension, I thought the two were mutually exclusive, but if I dive deeper into the specifics, I find my extension's functions under the [Chrome extensions overhead] subdomain:
So to summarize, this is what I want to be able to do:
Calculate the total CPU time my extension uses - it seems it's not enough to look under the extension's name, and its functions might also appear in other groups.
Understand whether the delay in load time is caused by CPU processing or by synchronous waiting. If it's the latter, find where my extension is doing a synchronous wait, because I'm pretty sure that I didn't call any blocking APIs.
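One rough way to probe the CPU-versus-waiting question from the page itself (rather than from the profiler UI) is the Long Tasks API: it reports main-thread tasks longer than 50 ms, so if the extra ~1200 ms shows up as long tasks it is CPU-bound work, and if it barely registers the time is more likely spent waiting. Note that this only sees main-thread task time, not browser-internal overhead, so treat it as a sanity check rather than a full answer:

// Sum main-thread tasks longer than 50 ms that occur before the load event.
// A large total suggests the slowdown is CPU-bound; a small one points at waiting.
let longTaskMs = 0;

const longTaskObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    longTaskMs += entry.duration;
  }
});
longTaskObserver.observe({ type: "longtask", buffered: true });

window.addEventListener("load", () => {
  // Give queued observer callbacks a chance to run before reporting.
  setTimeout(() => {
    console.log("long-task time before load:", Math.round(longTaskMs), "ms");
    longTaskObserver.disconnect();
  }, 0);
});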
Update
Eventually I found out that the reason for the slowdown was that we also activated Chrome accessibility whenever our extension was running, and that's what caused the drastic slowdown. Without accessibility the extension had a very minor effect. I still wonder, though, how I could have seen in the profiler that my problem was accessibility. It could have saved me a ton of time... I will try to look at it again later.

Using Layout (MasterPage) in MVC4 with Umbraco 7.2 has significant performance hit. Why?

In our project we use MVC4 + Umbraco 7.2.8. On one of the pages we make around 25 calls to the server simultaneously; each call is a full-fledged HTTP GET to a controller action which returns a cshtml view. The results are then parsed and displayed on a single page. (We know this is the wrong design and we are going to replace it with Web API calls returning lightweight JSON.)
So the problem is that the above-mentioned process is very time-consuming (no wonder, eh?), taking around 19-20 seconds to retrieve and display the results of all 25 calls. Looking for a way to improve performance without changing the existing design, I noticed that all of the returned views had a Layout page (master page). After I moved all of the code from the Layout page into the view and set Layout = null (without changing any functionality), I got a dramatic performance boost: now it takes around 2-3 seconds to complete all 25 calls.
I added traces and I can see (when running with Layouts) that there is often an ~800ms gap between the last trace in the view and the first trace in its Layout page, so by removing Layouts we get rid of those slowdowns. But why is that? I have read lots of info and forums, and everybody says that using Layouts should have only a very light performance impact, since it has no more overhead than calling a function. But my observations say just the opposite: simply getting rid of master pages gives a gigantic performance boost. I have tried this many times and the results are always the same.
Can anybody explain why having/removing Layouts can influence performance so much, and what the MVC engine (or Umbraco?) could be doing for so long while stitching a view together with its Layout?

"Time to Interact" metric in web performance measurements

Apparently "Time to Interact" is the new metric to use when measuring the perceived speed of a webpage. I'm interested in understanding a bit more about what this actually is.
The term was apparently coined by Radware, and is being pushed as the most meaningful performance measurement (compared to things such as Time to First/Last Byte, Time to Render etc.).
It is described as:
the point at which a page displays its primary interactive (think clickable) content, rather than full page load.
This seems pretty subjective to me; what is the "primary interactive content" of a webpage for example?
There have been reports citing results for the measurement, so somehow this is being measured; further, it must be automated, as the result sets are pretty big (~500 sites were tested).
Other than the above quote, I cannot find any more information on how to measure this.
As Google are placing more emphasis on above the fold content (or visible content), I am wondering whether this metric is actually more like "Time to First Meaningful Render", i.e. it is contextual to the current page goal. So for example, on an eCommerce site's product page, this could be the main image, and an add to basket link.
I am keen to understand this metric, as to me it does seem like the most useful one. My question is therefore whether anyone is measuring this, and if so how are they doing so?
You kind of answered your own question: it is subjective, and contextual to your current project.
What if I'm testing a site with only HTML and no complex resources? There is no point measuring TTI there. On the other hand, let's look at this demo site.
The blue line marks the "DOMContentLoaded" event (main document loaded and markup parsed); the red line indicates the load event, where all page resources are loaded. The TTI line would go in between the two; it is defined differently for each project, based on some "essential-to-interact resources loaded" event.
For example, let's say that the pictures on the demo site are not essential to the core features of the site. While the main site loaded in 0.8 seconds, the 3 big pictures took 36 extra seconds to load. So in this case, using the overall response time as a KPI would yield a ~36-second response time, while if you define TTI excluding those big, non-essential resources, you end up with a < 1s response time.
I am keen to understand this metric, as to me it does seem like the most useful one.
Definitely useful, but as you said in your question, it's specific to the project. You wouldn't measure TTI on a simple, relatively static web app; you would probably measure overall response time. I always define KPIs "tailored" to the current project, instead of trying to use common metrics and "force" them onto a project.
My question is therefore whether anyone is measuring this, and if so how are they doing so?
I have definitely used it before: you should identify the essential resources for your site, and when the last of those resources has loaded, that is your TTI. This could be a JavaScript file, a CSS file, etc.
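As a sketch of that approach, here is one way to mark a custom TTI with the Resource Timing API; the list of "essential" URLs is a placeholder you would replace with whatever actually gates interactivity on your page:

// Mark a custom "TTI" once every essential resource has finished loading.
// The essential list is a placeholder - substitute your own core JS/CSS/etc.
const essential = ["/js/app.js", "/css/main.css"];
const pending = new Set(essential);

const resourceObserver = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as PerformanceResourceTiming[]) {
    for (const url of [...pending]) {
      if (entry.name.endsWith(url)) pending.delete(url);
    }
  }
  if (pending.size === 0) {
    performance.mark("tti"); // the mark's startTime is relative to the navigation start
    const [mark] = performance.getEntriesByName("tti");
    console.log("Time to Interact ~", Math.round(mark.startTime), "ms");
    resourceObserver.disconnect();
  }
});
resourceObserver.observe({ type: "resource", buffered: true });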
Websites are getting more complex. While they might not always contain more content, they still have more resources to load, as the user interaction/user experience is more complex from a technical point of view. Ajax helps us load different parts separately, so rather than one page load we have the loading of several small things, and for each of these parts we can measure the loading performance. But some parts of the site may be more important than others. The "primary interactive content" is the part of your view that enables the user to do what they intend to do, for example buy a train ticket. If some advertisement or a special animation on the left side of the screen hasn't loaded, this does not prevent the user from starting to buy a ticket. But of course "primary interactive content" as a term is quite vague, and you have to define it for your specific application. It is the point at which an average user can and will start to interact with the website while some parts are still loading.
This is how I understand the concept, and I see the difference from "Time to First Meaningful Render" here: you might have a basket rendered on your eCommerce page but the GUI is not yet responsive. So you see something meaningful, but the interactivity is not yet there. Therefore TTI >= TtFMR.
Measuring TTI requires you to define which elements are required for interactivity, which depends not only on what the site does but also on HOW it does it. So it highly depends on your implementation/technology.
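To make that distinction concrete, here is a small sketch of the "GUI is actually responsive" reading: instead of keying on resources, mark the moment your own initialisation code finishes attaching the handlers that make the primary content usable. The selector and handler here are made-up placeholders:

// Placeholder example: the add-to-basket button is the "primary interactive content",
// so the page only counts as interactive once its click handler is wired up.
function onAddToBasket(): void {
  // ... real application logic goes here ...
}

document.addEventListener("DOMContentLoaded", () => {
  const button = document.querySelector<HTMLButtonElement>("#add-to-basket");
  button?.addEventListener("click", onAddToBasket);

  // The basket may have rendered earlier (Time to First Meaningful Render),
  // but only now is it usable - hence TTI >= TtFMR.
  performance.mark("tti-interactive");
  const [mark] = performance.getEntriesByName("tti-interactive");
  console.log("interactive after ~", Math.round(mark.startTime), "ms");
});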

Why is codeigniter $this->load->view rendering so slow?

I have been using CodeIgniter for quite some time and I really love it. But for some reason, rendering a view in one of my applications takes 18-23 whole seconds, and I really wonder why rendering the output should take that long.
I am sure about this because I measured the time from the request reaching the index page through the data collection/preparation; all of that took less than a second. Only the rendering of the view goes wrong, randomly taking 19-23 seconds.
I really want to understand why this is so, and I would like a proper, thorough solution rather than a workaround.
For now I am caching the output and playing around with that, but I know it is not a foolproof solution. There must be a way to analyze where and why the rendering takes so much time on the server. If anyone knows anything about this, please do share.
Have you tried using XDebug? There are a lot of online resources for how to use it in profiling your PHP apps, such as this one by Eric Hogue.
But if you included some sample code of what is happening in your controller and view then perhaps we could offer some specific guidance in this instance.
Removing a call to $this->carabiner->empty_cache(); speeds up our views quite a bit.
