Simple bandwidth / latency test to estimate a user's experience - performance

I write web-based applications. Performance is obviously a key factor. Whilst database load and page rendering time are things I have control over, the user's internet connection is not.
What I'm looking for is a way to indicate what sort of connection a user has. Something along the lines of a traffic light in the corner of the website that shows the user what sort of connection they have to the site, and therefore indicates what sort of perceived performance they should expect. For example, maybe the app just seems slow because everyone else in your company is browsing Facebook on their lunch hour.
My initial thoughts are that this could be achieved by some javascript that runs on each page load.
Ideally the code is very "droppable" and does not require major code or infrastructure changes to implement.

This seems like something you could use. You could also time an AJAX request to see how long the round trip takes. You'll probably have to establish some benchmarks of your own.
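As a rough illustration, here is a minimal sketch of that AJAX-timing idea in plain JavaScript. The /ping endpoint, the sample count, and the traffic-light thresholds are all assumptions you would tune against your own benchmarks.

```javascript
// Minimal sketch: time a few small requests and average the round trip.
// "/ping" is a hypothetical endpoint that should return a tiny response.
async function estimateRoundTrip(samples = 3) {
  let total = 0;
  for (let i = 0; i < samples; i++) {
    const start = performance.now();
    await fetch('/ping?ts=' + Date.now(), { cache: 'no-store' });
    total += performance.now() - start;
  }
  return total / samples; // average round trip in milliseconds
}

// Map the result onto a "traffic light"; the thresholds are arbitrary examples.
estimateRoundTrip().then(ms => {
  const colour = ms < 100 ? 'green' : ms < 400 ? 'amber' : 'red';
  console.log(`~${Math.round(ms)} ms round trip -> ${colour}`);
});
```

A small script like this could run on each page load and update the traffic-light indicator without touching the rest of the application.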


"Time to Interact" metric in web performance measurements

Apparently "Time to Interact" is the new metric to use when measuring the perceived speed of a webpage. I'm interested in understanding a bit more about what this actually is.
The term was apparently coined by Radware, and is being pushed as the most meaningful performance measurement (compared to things such as Time to First/Last Byte, Time to Render etc.).
It is described as:
the point at which a page displays its primary interactive (think clickable) content, rather than full page load.
This seems pretty subjective to me; what is the "primary interactive content" of a webpage for example?
There have been reports citing results for the measurement, so somehow this is being measured; further, it must be automated, as the result sets are pretty big (~500 sites were tested).
Other than the above quote, I cannot find any more information on how to measure this.
As Google are placing more emphasis on above the fold content (or visible content), I am wondering whether this metric is actually more like "Time to First Meaningful Render", i.e. it is contextual to the current page goal. So for example, on an eCommerce site's product page, this could be the main image, and an add to basket link.
I am keen to understand this metric, as to me it does seem like the most useful one. My question is therefore whether anyone is measuring this, and if so how are they doing so?
You kind of answered your own question: it is subjective, and contextual to your current project.
What if I'm testing a site with only HTML without any complex resources? There is no point measuring TTI there. On the other hand, let's see this demo site.
(Screenshot: waterfall chart from loading the demo site.)
The blue line marks the DOMContentLoaded event (the main document is loaded and the markup parsed); the red line indicates the load event, where all page resources are loaded. The TTI line would go in between the two, and it is defined differently for each project, based on the point where the resources essential to interaction have loaded.
For example, let's say that the pictures on the demo site are not essential to the core features of the site. While the main site loaded in 0.8 seconds, the 3 big pictures took an extra 36 seconds to load. In this case, using the overall response time as a KPI would yield a ~36-second response time, while if you define TTI excluding those big, non-essential resources, you end up with a < 1 s response time.
I am keen to understand this metric, as to me it does seem like the most useful one.
Definitely useful, but as you said in your question, it's specific to the project. You wouldn't measure TTI on a simple, relatively static web app; you would probably measure overall response time instead. I always define KPIs tailored to the current project, instead of trying to force common metrics onto it.
My question is therefore whether anyone is measuring this, and if so how are they doing so?
I have definitely used it before. You should identify the essential resources for your site, and when the last of those resources is loaded, that is your TTI. This could be a JavaScript file, a CSS file, etc.
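As a rough sketch of how that could be automated in the browser, the Resource Timing API reports when each resource finished loading; you then take the latest responseEnd among the resources you have declared essential. The file paths below are purely illustrative.

```javascript
// Hedged sketch: approximate TTI as the moment the last "essential" resource
// finished loading. Which resources count as essential is a per-project decision.
const ESSENTIAL = ['/css/app.css', '/js/app.js']; // illustrative paths

window.addEventListener('load', () => {
  const entries = performance.getEntriesByType('resource')
    .filter(entry => ESSENTIAL.some(path => entry.name.includes(path)));
  if (entries.length > 0) {
    const tti = Math.max(...entries.map(entry => entry.responseEnd));
    console.log('Approximate Time to Interact:', Math.round(tti), 'ms');
  }
});
```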
Websites are getting more complex. While they might not always contain more content, they still have more resources to load, because the user interaction/user experience is more complex from a technical point of view. Ajax helps us load different parts separately, so rather than one page load we have the loading of several small things, and for each of these parts we can measure the loading performance. But some parts of the site may be more important than others. The "primary interactive content" is the part of your view that enables the user to do what they intend to do, for example buy a train ticket. If an advertisement or a special animation on the left side of the screen hasn't loaded, this does not prevent the user from starting to buy a ticket. But of course "primary interactive content" as a term is quite vague, and you have to define it for your specific application. It is the point at which an average user can and will start to interact with the website while some parts are still loading.
This is how I understand the concept and I see the difference to "Time to First Meaningful Render" here: you might have a basket rendered on your eCommerce page but the GUI is not yet responsive. So you see something meaningful but the interactivity is not yet there. Therefore TTI >= TtFMR.
Measuring TTI requires you to define what elements are required for interactivity which not only depends on what the site does but also HOW it does it. So it highly depends on your implementation/technology.

Web site loading speed is slow

My website http://theminimall.com is taking longer to load than before.
Initially my server was in the US, and at that time my page load time was around 5 seconds.
Now I have moved the server to Singapore, and the load time has increased to about 10 seconds.
Most of the waiting time is spent getting results from a stored procedure (SQL Server database),
but when I execute the stored procedure directly in SQL Server it returns results very quickly.
So I assume the time taken is not due to query execution delay but to the data transfer time from the SQL server to the web server. How can I eliminate or reduce this time? Any help or advice will be appreciated.
Thanks in advance.
I took a look at your site on websitetest.com. You can see the test here: http://www.websitetest.com/ui/tests/50c62366bdf73026db00029e.
I can see what you mean about the performance. In Singapore it's definitely fastest, but even there it's pretty slow. Elsewhere around the world it's even worse. There are a few things I would look at.
First pick any sample, such as http://www.websitetest.com/ui/tests/50c62366bdf73026db00029e/samples/50c6253a0fdd7f07060012b6. Now you can get some of this info in the Chrome DevTools, or FireBug, but the advantage here is seeing the measurements from different locations around the world.
Scroll down to the waterfall. All the way on the right side of the Timeline column heading is a drop down. Choose to sort descending. Here we can see the real bottlenecks. The first thing in the view is GetSellerRoller.json. It looks like hardly any time is spent downloading the file. Almost all the time is spent waiting for the server to generate the file. I see the site is using IIS and ASP.net. I would definitely look at taking advantage of some server-side caching to speed this up.
The same is true for the main HTML, though a bit more time is spent downloading that file. It looks like it's taking so long to download because it's a huge file (for HTML). I would take the inline CSS and JS out of there.
Go back to the natural order for the timeline, then you can try changing the type of file to show. Looks like you have 10 CSS files you are loading, so take a look at concatenating those CSS files and compressing them.
I see your site has to make 220+ connections to download everything. That's a huge number. Try to eliminate some of those.
Next down the list I see some big JPG files. Most of these again are waiting on the server, but some are taking a while to download. I looked at one (a photo of a laptop) and was able to convert it to a highly compressed PNG, saving 30% on the size while getting a file that looked the same. Then I noticed that there are well over 100 images, many of which are really small. One of the big drags on your site is that there are so many connections that need to be managed by the browser. Take a look at implementing CSS sprites for those small images. You can probably take 30-50 of them down to a single image download.
Final thing I noticed is that you have a lot of JavaScript loading right up near the top of the page. Try moving some of that (where possible) to later in the page and also look into asynchronously loading the js where you can.
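As a sketch of the asynchronous-loading suggestion, a non-critical script can be injected after the document has parsed instead of being referenced in a blocking script tag near the top of the page; /js/analytics.js is just an illustrative file name.

```javascript
// Hedged sketch: load a non-critical script without blocking HTML parsing.
function loadScriptAsync(src) {
  const script = document.createElement('script');
  script.src = src;
  script.async = true; // execute as soon as it arrives, without blocking parsing
  document.head.appendChild(script);
}

// Defer the load until the critical content is in place.
document.addEventListener('DOMContentLoaded', () => loadScriptAsync('/js/analytics.js'));
```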
I think that's a lot of suggestions for you to try. After you solve those issues, take a look at leveraging a CDN and other caching services to help speed things up for most visitors.
You can find a lot of these recommendations in a bit more detail in Steve Souders' book, High Performance Web Sites. The book is 5 years old and still as relevant today as ever.
I've just taken a look at websitetest.com and that website doesn't seem right at all; my site is among the 97% fastest, yet using that website it says it's at 26% across the 13 locations tested. Their servers must be overloaded, and I recommend you use a more reputable testing site such as http://www.webpagetest.org, which is backed by many big companies.
Looking at your contact details, it looks like your target audience is in India. If that is correct, you should use hosting wherever your main audience is, or in the closest neighbouring region.

Techniques to reduce data harvesting from AJAX/JSON services

I was wondering if anyone had come across any techniques to reduce the chances of data exposed through JSON type services on the server (intended to supply AJAX functions) from being harvested by external agents.
It seems to me that the problem is not so difficult if you had say a Flash client consuming the data. Then you could send encrypted data to the client, which would know how to decrypt it. The same method seems impossible with AJAX though, due to the open nature of the Javascript source.
Has anybody implemented a clever technique here?
Whatever the method, it should still allow a genuine AJAX function to consume the data.
Note that I'm not really talking about protecting 'sensitive' information here, the odd record leaking out is not a problem. Rather I am thinking about stopping a situation where the whole DB is hoovered up by bots (either in one go, or gradually over time).
Thanks.
First, I would like to be clear on this:
It seems to me that the problem is not so difficult if you had say a Flash client consuming the data. Then you could send encrypted data to the client, which would know how to decrypt it. The same method seems impossible with AJAX though, due to the open nature of the Javascript source.
It will be pretty obvious that the information is being sent encrypted to the Flash client, and it won't be that hard for an attacker to work out from your compiled Flash program what is being used for this, replicate it, and get all that data.
If the data does happen to have the value you think it has, you can count on the above.
If this is public information, embrace that and don't combat it - instead find ways to capitalize on it.
If this is information that you are only exposing to a set of users, make sure you have the corresponding authentication / secure communication. Track usage as others have said, and have measures that act on it.
The first thing to prevent bots from stealing your data is not technological, it's legal. First, make sure you have the right language in your site's Terms of Use that what you're trying to prevent is actually disallowed and defensible from a legal standpoint. Second, make sure you design your technical strategy with legal issues in mind. For example, in the US, if you put data behind an authentication barrier and an attacker steals it, it's likely a violation of the DMCA law. Third, find a lawyer who can advise you on IP and DMCA issues... nice folks on StackOverflow aren't enough. :-)
Now, about the technology:
A reasonable solution is to require that users be authenticated before they can get access to your sensitive Ajax calls. This allows you to simply monitor per-user usage of your Ajax calls and (manually or automatically) cancel the account of any user who makes too many requests in a particular time period. (or too many total requests, if you're trying to defend against a trickle approach).
This approach of course is vulnerable to sophisticated bots who automatically sign up new "users", but with a reasonably good CAPTCHA implementation, it's quite hard to build this kind of bot. (see "circumvention" section at http://en.wikipedia.org/wiki/CAPTCHA)
If you are trying to protect public data (no authentication) then your options are much more limited. As other answers noted, you can try IP-address-based limits (and run afoul of large corporate proxy users) but sophisticated attackers can get around this by distributing the load. There's also likely sophisticated software which watches things like request timing, request patterns, etc. and tries to spot bots. Poker sites, for example, spend a lot of time on this. But don't expect these kinds of systems to be cheap. One easy thing you can do is to mine your web logs (e.g. using Splunk) and find the top N IP addresses hitting your site, and then do a reverse-IP lookup on them. Some will be legitimate corporate or ISP proxies. But if you recognize a competitor's domain name among the list, you can block their domain or follow up with your lawyers.
In addition to pre-theft defense, you might also want to think about inserting a "honey pot": deliberately fake information that you can track later. This is how, for example, map manufacturers catch plagiarism: they insert a fake street in their maps and see which other maps show the same fake street. While this doesn't prevent determined folks from sucking out all your data, it does let you find out later who's re-using your data. This can be done by embedding unique text strings in your text output, and then searching for those strings on Google later (assuming your data is re-usable on another public website). If your data is HTML or images, you can include an image which points back to your site, and you can track who is downloading it, and look for patterns you can use to bust the freeloaders.
Note that the javascript encryption approach noted in one of the other answers won't work for non-authenticated sessions-- an attacker can simply download the javascript and run it just like a regular browser would. Moral of the story: public data is essentially indefensible. If you want to keep data protected, put it behind an authentication barrier.
This is obvious, but if your data is publicly searchable by search engines, you'll both need a non-AJAX solution for them (Google won't read your ajax data!) and you'll want to mark those pages NOARCHIVE so your data doesn't show up in Google's cache. You'll also probably want a white list of search engine crawler IP addresses which you allow into your search-engine-crawlable pages (you can work with Google, Bing, Yahoo, etc. to get these), otherwise malicious bots could simply impersonate Google and get your data.
In conclusion, I want to echo #kdgregory above: make sure that the threat is real enough that it's worth the effort required. Many companies overestimate the interest that other people (both legitimate customers and nefarious actors) have in their business. It might be that yours is an oddball case where you have particularly important data, it's particularly valuable to obtain, it must be publicly accessible without authentication, and your legal recourses will be limited if someone steals your data. But all those together is admittedly an unusual case.
P.S. - another way to think about this problem, which may or may not apply in your case: sometimes it's easier to change how your data works in a way that obviates securing it. For example, can you tie your data in some way to a service on your site so that the data isn't very useful unless it's being used in conjunction with your code? Or can you embed advertising in it, so that wherever it's shown you get paid? And so on. I don't know if any of these mitigations apply to your case, but many businesses have found ways to give stuff away for free on the Internet (and encourage rather than prevent wide re-distribution) and still make money, so a hybrid free/pay strategy may (or may not) be possible in your case.
If you have an internal Memcached box, you could consider using a technique where you create an entry for each IP that hits your server with an hour expiration. Then increment that value each time the IP hits your AJAX endpoint. If the value gets over a particular threshold, fry the connection. If the value expires in Memcached, you know it isn't getting "hoovered away".
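As a rough sketch of that counter logic in Node/Express terms (the answer proposes Memcached; a plain in-memory Map stands in here and only works for a single process, so treat the threshold and window as illustrative):

```javascript
// Hedged sketch: count hits per IP over a one-hour window and block heavy users.
const express = require('express');
const app = express();

const WINDOW_MS = 60 * 60 * 1000; // one-hour expiry, as in the answer
const THRESHOLD = 1000;           // illustrative request limit per window
const hits = new Map();           // ip -> { count, expires }

app.use('/ajax', (req, res, next) => {
  const now = Date.now();
  const entry = hits.get(req.ip);
  if (!entry || entry.expires < now) {
    hits.set(req.ip, { count: 1, expires: now + WINDOW_MS });
    return next();
  }
  entry.count += 1;
  if (entry.count > THRESHOLD) {
    return res.status(429).send('Too many requests');
  }
  next();
});
```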
This isn't a concrete answer with a proof of concept, but maybe a starting point for you. You could create a javascript function that provides encryption/decryption functions. The javascript would need to be built dynamically, and you would include an encryption key that is unique to the session. On the server side, you'd have an encryption service that uses the key from the session to encrypt your JSON before delivering it.
This would at least prevent someone who is listening to your web traffic from pulling information out of your database.
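As a hedged, modern sketch of that idea on the client side (the Web Crypto API is not mentioned in the answer; the URL, the base64 session key, and the IV-prefix layout are all assumptions about how the server would deliver the data):

```javascript
// Hedged sketch: decrypt AES-GCM-encrypted JSON using a per-session key that the
// server embedded in dynamically generated JavaScript.
async function fetchEncryptedJson(url, sessionKeyBase64) {
  const keyBytes = Uint8Array.from(atob(sessionKeyBase64), c => c.charCodeAt(0));
  const key = await crypto.subtle.importKey('raw', keyBytes, 'AES-GCM', false, ['decrypt']);

  const payload = new Uint8Array(await (await fetch(url)).arrayBuffer());
  const iv = payload.slice(0, 12);  // assume a 12-byte IV prefixed to the ciphertext
  const ciphertext = payload.slice(12);

  const plaintext = await crypto.subtle.decrypt({ name: 'AES-GCM', iv }, key, ciphertext);
  return JSON.parse(new TextDecoder().decode(plaintext));
}
```

As noted elsewhere in this thread, anyone who can run the page's JavaScript can also run this decryption, so it only raises the bar rather than protecting the data outright.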
I'm with kdgregory though; it sounds like your data is too open.
Some techniques are listed in Further thoughts on hindering screen scraping.
If you use PHP, Bad behavior is a nice tool to help. If you don't use PHP, it can give some ideas on how to filter (see How it works page).
Incredibill's blog gives nice tips, lists of user agents/IP ranges to block, etc.
Here are a variety of suggestions:
Issue tokens required for redemption along with each AJAX request, and expire the tokens (see the sketch after this list).
Track how many queries are coming from each client, and throttle excessive usage based on expected normal usage of your site.
Look for patterns in usage such as sequential queries, spikes in requests, or queries that occur faster than a human could conduct.
Check user agents. Many bots don't completely replicate the user agent info of a browser, and you can cut down on programmatic scraping of your data this way.
Change the front-end component of your website to redirect to a CAPTCHA (or some other human-verifying mechanism) once a request threshold is exceeded.
Modify your logic so the response data is returned in a few different ways, to complicate the code required to parse it.
Obfuscate your client-side JavaScript.
Block IPs of offending clients.
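A minimal sketch of the token idea from the first bullet, in Node/Express terms (the endpoint names, the header name, and the five-minute lifetime are illustrative assumptions):

```javascript
// Hedged sketch: issue a short-lived token with the parent page and require it on
// every AJAX call.
const express = require('express');
const crypto = require('crypto');
const app = express();

const TOKEN_TTL_MS = 5 * 60 * 1000;
const tokens = new Map(); // token -> expiry timestamp

// Issued when the parent page is rendered and embedded into the HTML.
app.get('/page', (req, res) => {
  const token = crypto.randomBytes(16).toString('hex');
  tokens.set(token, Date.now() + TOKEN_TTL_MS);
  res.send(`<script>window.AJAX_TOKEN = ${JSON.stringify(token)};</script>`);
});

// Each AJAX endpoint rejects requests without a valid, unexpired token.
app.get('/ajax/data', (req, res) => {
  const expiry = tokens.get(req.get('X-Ajax-Token'));
  if (!expiry || expiry < Date.now()) {
    return res.status(403).json({ error: 'invalid or expired token' });
  }
  res.json({ ok: true /* real payload here */ });
});
```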
Bots usually don't parse JavaScript, so your AJAX code won't be executed automatically. And even if they do, bots usually don't maintain sessions/cookies either. Knowing that, you could reject the request if it is invoked without a valid session/cookie (which is obviously set on the server side beforehand by the request for the parent page).
This does not protect you from humans harvesting by hand, though. The safest way is to restrict access to users with a login/password. If that is not your intent, well, then you have to live with the fact that it's a public application. You could of course scan logs and maintain blacklists of IP addresses and user agents, but that is going to extremes.

Web Development: What page load times do you aim for?

Website Page load times on the dev machine are only a rough indicator of performance of course, and there will be many other factors when moving to production, but they're still useful as a yard-stick.
So, I was just wondering what page load times you aim for when you're developing?
I mean page load times on Dev Machine/Server
And, on a page that includes a realistic quantity of DB calls
Please also state the platform/technology you're using.
I know that there could be a big range of performance regarding the actual machines out there, I'm just looking for rough figures.
Thanks
Less than 5 sec.
If it's just on my dev machine I expect it to be basically instant. I'm talking 10s of milliseconds here. Of course, that's just to generate and deliver the HTML.
Do you mean that, or do you mean complete page load/render time (HTML download/parse/render, images downloading/display, CSS downloading/parsing/rendering, JavaScript download/execution, Flash download/plugin startup/execution, etc.)? The latter is really hard to quantify because a good bit of that time will be burnt up on the client machine, in the web browser.
If you're just trying to ballpark decent download + render times with an untaxed server on the local network then I'd shoot for a few seconds... no more than 5-ish (assuming your client machine is decent).
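If you want to put numbers on that distinction, the browser's Navigation Timing API separates "HTML generated and delivered" from "everything loaded"; a rough sketch:

```javascript
// Hedged sketch: compare HTML delivery time with full page load time.
window.addEventListener('load', () => {
  // Defer one tick so loadEventEnd has been recorded.
  setTimeout(() => {
    const [nav] = performance.getEntriesByType('navigation');
    if (!nav) return;
    console.log('HTML delivered (responseEnd):', Math.round(nav.responseEnd), 'ms');
    console.log('Full page load (loadEventEnd):', Math.round(nav.loadEventEnd), 'ms');
  }, 0);
});
```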
Tricky question.
For a regular web app, you don't want your page load time to exceed 5 seconds.
But let's not forget that:
the 20%-80% rule applies here; if it takes 1 sec to load the HTML code, total rendering/loading time is probably 5-ish seconds (like fiXedd stated).
on a dev server, you're often not dealing with the real deal (traffic, DB load and size - number of entries can make a huge difference)
you want to take into account the way users want your app to behave. 5 seconds load time may be good enough to display preferences, but your basic or killer features should take less.
So in my opinion, here's a simple method to get a rough figures for a simple web app (using for example, Spring/Tapestry):
Sort the pages/actions given your app profile (which pages should be lightning fast?) and give them a rough figure for the production environment
Then take into account the browser loading/rendering stuff. Dividing by 5 is a good start, although you can use best practices to reduce that time.
Think about your production environment (DB load, number of entries, traffic...) and take an additional margin.
You've got your target load time on your production server; now it's up to you and your dev server to think about your target load time on your dev platform :-D
One of the most useful benchmarks we use for identifying server-side issues is the "internal" time taken from request-received to response-flushed by the web server itself. This means ignoring network traffic / latency and page render times.
We have some custom components (.net) that measure this time and inject it into the HTTP response header (we set a header called X-Server-Response); we can extract this data using our automated test tools, which means that we can then measure it over time (and between environments).
By measuring this time you get a pretty reliable view into the raw application performance - and if you have slow pages that take a long time to render, but the HTTP response header says it finished its work in 50ms, then you know you have network / browser issues.
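The answer describes custom .NET components; as a rough analogue, the same idea in a plain Node.js handler would look something like the sketch below (only the X-Server-Response header name comes from the answer).

```javascript
// Hedged sketch: stamp each response with the server-side processing time.
const http = require('http');

http.createServer((req, res) => {
  const start = process.hrtime.bigint();

  // ... real request handling happens here ...
  const body = 'Hello';

  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  res.setHeader('X-Server-Response', elapsedMs.toFixed(2) + 'ms');
  res.end(body);
}).listen(8080);
```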
Once you push your application into production, you (should) have things like caching, static file sub-domains, JS/CSS minification, etc. - all of which can offer huge performance gains (esp. caching), but can also mask underlying application issues (like pages that make hundreds of DB calls).
All of which is to say: the value we aim for with this time is sub-1 second.
In terms of what we offer to clients around performance, we usually use 2-3s for read-only pages, and up to 5s for transactional pages (registration, checkout, upload etc.)

Is there a generally acceptable definition of (soft) realtime delays?

I'm trying to find a benchmark for how long users are willing to wait for a response from a remote service. In my case the response is for very useful but not business critical validation of data entry. I guess that there must have been some work done in the HCI space on this.
If you know of a generally accepted definition for soft realtime responses then great but I'd also appreciate your well reasoned thoughts.
Chris
US DoD MIL-STD-1472F Human Engineering Standard has the most widely accepted requirements for maximum allowed response time (from Table XXII, page 196, times in seconds):
Key Response (Key depression until positive response, e.g., "click"): 0.1
Key Print (Key depression until appearance of character): 0.2
Page Turn (End of request until first few lines are visible): 1.0
Page Scan (End of request until text begins to scroll): 0.5
XY Entry (From selection of field until visual verification): 0.2
Function (From selection of command until response): 2.0
Pointing (From input of point to display point): 0.2
Sketching (From input of point to display of line): 0.2
Local Update (Change to image using local data base, e.g., new menu list): 0.5
Host Update (from display buffer): 2.0
File Update (Change where data is at host in readily accessible form): 10.0
Inquiry - Simple (e.g., a scale change of existing image): 2.0
Inquiry - Complex (Image update requires an access to a host file): 10.0
Error Feedback (From command until display of a commonly used message): 2.0
As you can see, acceptable response time depends on what response the user is waiting for. For something like a pulldown menu appearing, it's 0.5 seconds max. For a full page load in a browser, you want something to appear in 1.0 s to 2.0 s and the full page loaded in 10.0 s. In all the above, shorter response times are better. Only in bizarre circumstances will users object to a 0.001s response time.
In any case, if the response time will be greater than 0.5 s, then you need to provide feedback such as a throbber or hourglass sprite. If response time is a minimum of 5-15 s (depending on what standard you use), provide a progress bar. With a progress bar, very long response times (on the order of minutes or even hours) may be acceptable as long as you set it up for the user as a “batch” process rather than being an interactive program. It's much better for the user to make all input and wait an hour than to make input on four occasions, waiting 15 minutes after each.
The above list has the accepted standards. How long your users are willing to wait (e.g., before giving up) essentially boils down to the user making a cost-benefit analysis. Is what I’m going to get worth the wait? What are my sunk costs? Is there an alternative (e.g., another web site) that can do it better? Can I do other things while I wait to make the most of my time? However, whatever users are willing to put up with, you can bet they’ll resent delays greater than the standards above.
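As a small illustration of the 0.5 s feedback rule above, a loading indicator can be shown only when a response is slower than that threshold; the element id "spinner" and the endpoint are hypothetical.

```javascript
// Hedged sketch: show a spinner only if the request takes longer than ~500 ms.
async function fetchWithFeedback(url) {
  const spinner = document.getElementById('spinner');
  const timer = setTimeout(() => { spinner.hidden = false; }, 500);
  try {
    const response = await fetch(url);
    return await response.json();
  } finally {
    clearTimeout(timer);
    spinner.hidden = true;
  }
}
```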
Human reaction time seems to be around 200 ms - anything around there will be perceived as instantaneous. That sort of number is hard to achieve, especially in an application that gets information from remote services.
If you take a look at Google's search suggestion box, the lag there is minimal - less than a second. It's astoundingly fast, and really remarkable for a web application. This is really nice for Google's users, but it's bad news for you. These days, users expect most applications to react with the same sort of speed and efficiency; anything slower is considered rather laggy. However, it's worth noting that people's patience usually varies with the complexity of the task at hand. A simple form submit should never take much time, but something like uploading photos is expected to take a while.
My feeling is this: go with your gut. If your application is fairly simple then you should try to get the wait/load time down to less than a second. If you can't, then your best bet is to add an indicator so the user knows that some computations are being done in the background. This can be in the form of a small animation or a progress bar.
Unfortunately, the answer to this question is not typically a well-defined number. Users' expectations vary widely and can change depending on what it is you're talking about.
As computers continue to become more ubiquitous and we (the consumers) continue to have growing expectations of speed, remote services, websites, and even applications will need to continue to respond more quickly. Generally speaking, you want everything to be as fast as possible.
With this said, I would look at what your remote service is for. Since you said the response is "very useful", to me that means it will probably get used frequently. People tend to use what is useful. If that's the case, I would look for ways to make that remote service respond quickly.
Of course, there is also the caveat that you don't want to start optimizing before the service is written. What is the current response time? What is the context in which this will be used? Those factors will do a lot to determine the longest users are willing to wait for the service.
You might want to search for "SLA" or "Service Level Agreement". Those are the documents in a web business that make guarantees as to how long data will take to get back to the user, whether it's an HTML document or a web service call.
