Lazily loading an 11 MB file in the browser - ajax

I'm building a quiz backed by a lot of Census data, and I would really like to power it with flat files rather than worry about a database and servers. (More on the case for flat files here.) The uncompressed JSON file I need for the final part of the quiz is currently 11 MB.
Right now, I'm making an AJAX request for the file right away, but not using it until the user has completed the quiz, which will take about 1 minute.
I realize this is a bit of a subjective question, but I'm wondering how much this is pushing the envelope when it comes to supporting a wide variety of modern phones and computers in 2015. I'm worried less about the bandwidth than about the memory and processing power in the device. The code parses through all the data, matches entries to user responses, and computes a result.
TL;DR
Is it crazy to AJAX an 11 MB JSON file if I don't need it for about 60 seconds? (I'll check to make sure I have it when the time comes, of course.)
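If it helps to see the shape of it, here's a minimal sketch of what I mean -- the path and the computeResult/showResult functions are placeholders for my app's own code:

    // Kick off the download immediately and hold on to the promise.
    var dataPromise = new Promise(function (resolve, reject) {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/quiz-data.json');   // hypothetical path
      xhr.responseType = 'json';            // let the browser parse the body
      xhr.onload = function () { resolve(xhr.response); };
      xhr.onerror = function () { reject(new Error('download failed')); };
      xhr.send();
    });

    // Roughly 60 seconds later, when the quiz is finished:
    function onQuizComplete(answers) {
      dataPromise.then(function (data) {
        showResult(computeResult(data, answers));  // app-specific functions
      });
    }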

Related

What is the relationship between request content size and request duration

At the company I work for, all our APIs send and expect requests/responses that follow the JSON:API standard, making the structure of the request/response content very regular.
Because of this regularity and the fact that we can have hundreds or thousands of records in one request, I think it would be fairly doable and worthwhile to start supporting compressed requests (every record would be something like < 50% of the size of its JSON:API counterpart).
To make a well-informed judgement about whether this would actually be worthwhile, I would have to know more about the relationship between request size and duration, but I cannot find any good resources on this. Anybody care to share their expertise/resources?
Bonus 1: If you were to have request performance issues, would you look at compression as a solution first, second, last?
Bonus 2: How does transmission overhead scale with size? (If I cut the size by 50%, by what percentage will the transmission overhead be cut?)
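To be concrete, by "compressed requests" I mean something like the following sketch (Node.js; the host and path are placeholders, and the server would have to be configured to accept a gzip Content-Encoding on request bodies):

    // Gzip a JSON:API payload before POSTing it.
    var zlib = require('zlib');
    var http = require('http');

    var payload = JSON.stringify({ data: [ /* hundreds of records */ ] });
    zlib.gzip(payload, function (err, compressed) {
      if (err) throw err;
      var req = http.request({
        hostname: 'api.example.com',           // hypothetical host
        path: '/records',                      // hypothetical path
        method: 'POST',
        headers: {
          'Content-Type': 'application/vnd.api+json',
          'Content-Encoding': 'gzip',          // server must opt in to this
          'Content-Length': compressed.length
        }
      });
      req.end(compressed);
    });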
Request and response compression adds a time and CPU penalty on both the sender's side and the receiver's side. The savings is in transmission time.
How the tradeoff weighs out depends a lot on the customers of the API: when they make requests, how much they request, what is requested, where they are located, the type of device/OS and its capabilities, and so on.
If the data is static -- e.g. a REST query like apihost/resource/idxx returning a static resource -- there are standard web approaches, like caching of static resources, that clients/proxies can assist with.
If the data is dynamic -- there are architectural patterns that could be used.
If the data is huge -- e.g. big scientific data sets or video -- you will almost always find it served statically, with a metadata service providing the dynamic layer. For example, MPEG-DASH or HLS is just a collection of files.
I would choose compression as a last option relative to the other architectural options.
There are also implementation optimizations that should precede compressing requests/responses. For example:
Are your services using all the resources at their disposal (cores, memory, I/O)?
Does the architecture allow scale-up and scale-out, and can the problem be handled effectively that way (remember the penalties on the client side due to compression)?
Can you use queueing, caching or other mechanisms to make things appear faster?
If you have explored all these and the answer is your system is optimal and you are looking at the most granular unit of service where data volume is an issue, by all means go after compression. Keep in mind that you need to budget compute resources for compression on the server side as well (for a fixed workload).
Your question #2 on transmission overhead vs. size is a question about bandwidth and latency. Bandwidth determines how much you can push through the pipe. Latency governs the perceived response times. Whether the payload is 10 bytes or 10 MB, latency for a client across the world encountering multiple hops will be larger relative to a client encountering only one or two hops, and it is bound by the round-trip time. So a solution may be to distribute the servers and place them closer to your clients around the world, rather than compressing data. That is another reason why compression isn't the first thing to look at.
Baseline your performance and benchmark your experiments for a representative user base.
I think what you are weighing here is the speed of your processor/CPU vs. the speed of your network connection.
Network connection can be impacted by things like distance, signal strength, DNS provider, etc; whereas, your computer hardware is only limited by how much power you've put in it.
I'd wager that compressing your data before sending would result in shorter response times, yes, but it's probably going to be a very small amount. If you are sending JSON, text usually isn't all that large to begin with, so you would probably only see a change in performance at the millisecond level.
If that's what you are looking for, I'd go ahead and implement it, set some timing before and after, and check your results.
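Something like this would do for the timing -- a rough sketch, with the endpoint as a placeholder; run it once with the plain body and once with a gzipped body plus the matching Content-Encoding header:

    // Wrap a request in timers so compressed and uncompressed runs can be compared.
    function timeRequest(label, body, headers) {
      var start = Date.now();
      var xhr = new XMLHttpRequest();
      xhr.open('POST', '/api/records');     // hypothetical endpoint
      Object.keys(headers).forEach(function (k) {
        xhr.setRequestHeader(k, headers[k]);
      });
      xhr.onload = function () {
        console.log(label + ': ' + (Date.now() - start) + ' ms');
      };
      xhr.send(body);
    }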

Writing multiple files vs. writing one big file [on a solid state drive]

(I was not able to find a clear answer to my question, maybe I used the wrong search term)
I want to record many images from a camera, with no compression or lossless compression, on a not-so-powerful device with a single solid state drive.
After investigating, I have decided that the compression, if any, will simply be PNG, image by image (this is not part of the discussion).
Given these constraints, I want to be able to record at the maximum possible frequency from the camera. The bottleneck is the write speed of the single drive. I want to use the RAM for queuing, and the few available cores for compressing the images in parallel, so that there's less data to write.
Once the data is compressed, do I gain any writing speed by streaming all the bytes into one single file, or, considering that I am working with a solid state drive, can I just write one file per image (say, about 1 or 2 MB each) and still work at the maximum disk bandwidth (or very close to it, like >90%)?
I don't know if it matters, but this will be done using C++ and its libraries.
My question is "simply" whether, by writing my output to a single file instead of many 2 MB files, I can expect a significant benefit when working with a solid state drive.
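In case it clarifies the design, here is a rough sketch of the queuing part -- written in Node.js only for brevity (the real implementation will be C++), with the file name as a placeholder: a single writer drains an in-memory queue so the drive sees one continuous stream.

    var fs = require('fs');

    var queue = [];   // compressed PNG buffers, pushed by compressor workers
    var out = fs.createWriteStream('capture.bin');  // hypothetical single output file
    var busy = false;

    function drain() {
      if (busy || queue.length === 0) return;
      busy = true;
      out.write(queue.shift(), function () {
        busy = false;
        drain();      // keep the drive continuously busy
      });
    }

    // Compressor workers call enqueue(buffer) when a frame is ready.
    function enqueue(buf) { queue.push(buf); drain(); }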
There's a benefit, but not a significant one. A file system driver for a solid state drive already knows how to distribute the data of a file across many non-adjacent clusters, so doing it yourself doesn't help; that ability is necessary anyway to fit a large file on a drive that already contains files. By breaking the data up, you force extra writes to add the directory entries for those segments.
The type of solid state drive matters, but scattering data across the drive is in general already done by the drive itself to implement wear-leveling -- in other words, it intentionally spreads writes across the drive. This avoids wearing out flash memory cells; they can only be written a limited number of times before they physically wear out and fail. Traditionally that was only guaranteed for 10,000 writes, though they've gotten better. Your workload will certainly exercise this. Notable as well is that flash drives are fast to read but slow to write, which matters in your case.
There's one notable advantage to breaking the image data up into separate files: it is easier to recover from a drive error, whether a disastrous failure or the drive simply filling up to capacity before you stop in time. You don't lose the entire shot. But it is inconvenient for whatever program reads the images off the drive, which has to glue them back together. That is an important design goal as well: if you make it too impractical with a non-standard uncompressed file format, too slow to transfer, or just too inconvenient in general, then it will just not get used very often.

HTTP request cost vs. page size cost?

I know it's a good practice to minimize the number of requests each page needs. For example, combining javascript files and using css sprites will greatly reduce the number of requests needed to render your page.
Another approach I've seen is to keep javascript embedded in the page itself, especially for javascript specific to that page and not really shared across other pages.
But my question is this:
At what point does my javascript grow too large that it becomes more efficient to pull the script into a separate file and allow the additional request for the separate js file?
In other words, how do I measure how many bytes equate to the cost of one request?
Since successive requests are cached, the only cost of calling that same JS file again is the cost of the request. Whereas keeping the JS in the page will always incur the cost of additional page size, but will not incur the cost of an additional request.
Of course, I know several factors go into this: speed of the client, bandwidth speed, latency. But there has to be a turning point to where it makes more sense to do one over the other.
Or, is bandwidth so cheap (in speed, not money) these days that it takes many more bytes than it used to in order to exceed the cost of a request? It seems to be the trend that page size is becoming less of a factor, while the cost of a request has plateaued.
Thoughts?
If you just look at the numbers and assume an average round-trip time for a request of 100 ms and an average connection speed of 5 Mbps, you can arrive at a number which says that up to 62.5 KB can be added to a page before breaking it out to a separate file becomes worthwhile. Assuming that gzip compression is enabled on your server, then the real amount of JavaScript that you can add is even larger still.
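For the record, that 62.5 KB figure is just the data you could transfer in the time one extra round trip costs:

    5 Mbps × 0.1 s = 0.5 Mbit = 500,000 bits ÷ 8 = 62,500 bytes ≈ 62.5 KB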
But, this number ignores a number of important considerations. For instance, if you move your JavaScript to a separate file, the user's browser can cache it more effectively such that a user that hits your page 100 times might only download the JavaScript file once. If you don't do this, and assuming that your webpage has any dynamic content whatsoever, then the same user would have to download the entire script every single time.
Another issue to consider is the maintainability of the page. As a general rule, the more JavaScript you add, the more difficult it becomes to maintain your page and make changes and updates without introducing bugs and other problems. So even if you don't have quite 62.5 KB of JavaScript and even if you don't care about the caching side of things, you have to ask yourself whether or not having a separate JavaScript file would improve maintainability and if so, whether it's worth sacrificing that maintainability for a slightly faster page load.
So there really isn't an exact answer here, but as a general rule I think that if the JavaScript is stuff that is truly intrinsic to the page (onclick handlers, effects/animations, other things that interface directly with elements on the page) then it belongs with the page. But if you have a bunch of other code that your handlers, effects, and other things use more like a library/helper utility, then that code can be moved to a separate file. Favor maintainability of your code over both page size and load times. That's my recommendation, anyways.
This is a huge topic - you are indirectly asking about many different aspects of web performance, so there are a few tricks, some of which wevals mentions.
From my own experience, I think it comes down partially to modularization and making tradeoffs. So for instance, it makes sense to pack together javascript that's common across your entire site. If you serve the assets from a CDN and set correct HTTP headers (Cache-Control, Etag, Expires), you can get a big performance boost.
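To illustrate the kind of headers I mean, here is a rough Node.js sketch -- the values are illustrative, not recommendations:

    var http = require('http');

    http.createServer(function (req, res) {
      res.setHeader('Cache-Control', 'public, max-age=31536000');  // cache for a year
      res.setHeader('ETag', '"abc123"');                           // illustrative validator
      res.setHeader('Expires', new Date(Date.now() + 31536e6).toUTCString());
      res.end('/* asset bytes */');
    }).listen(8080);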
It's true that you will incur the cost of the browser making a request and receiving a 304 Not Modified from the server, but that response at least will be fast to go across the wire. However, you will (typically) still incur the cost of the server processing your request and deciding that the asset is unchanged. This is where web proxies like Squid, Varnish and CDNs in general shine.
On the topic of CDNs, especially with respect to JavaScript, it makes sense to pull libraries like jQuery from one of the public CDNs. For example, Google makes a lot of the most popular libraries available via its CDN, which is almost always going to be faster than you serving them from your own server.
I also agree with wevals that page size is still very important, particularly for international sites. There are many countries where you get charged by how much data you download and so if your site is enormous there's a real benefit to your visitors when you serve them small pages.
But, to really boil it down, I wouldn't worry too much about "byte cost of request" vs "total download size in bytes" - you'd have to be running a really high-traffic website to worry about that stuff. And it's usually not an issue anyway since, once you get to a certain level, you really can't sustain any amount of traffic without a CDN or other caching layer in front of you.
It's funny, but I've noticed that with a lot of performance issues, if you design your code in a sensible and modular way, you will tend to find the natural separations more easily. So, bundle together things that make sense and keep one-offs by themselves as you write.
Hope this helps.
With the correct headers set (far-future Expires headers; see [1] below), pulling the JS into a separate file is almost always the best bet, since all subsequent requests for the page will not make any request or connection at all for the JS file.
The only exception to this rule is for static websites, where it's safe to use a far-future header on the actual HTML page itself, so that it can be cached indefinitely.
As for what byte size equates to the cost of an HTTP connection, this is hard to determine because of the variables you mentioned, as well as many others. HTTP resource requests can be cached at nodes along the way to a user, they can be parallelized in a lot of situations, and a single connection can be reused for multiple requests (see [2] below).
Page size is still extremely important on the web. Mobile browsers are becoming much more popular, and with them come flaky connections through mobile providers. Try to keep file sizes small.
[1] http://developer.yahoo.com/performance/rules.html
[2] http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Persistent_connections
Addition: It's worth noting that major page size reductions can be achieved through minification and gzip, which are super simple to enable through good build tools and web servers, respectively.

Accelerated downloads with HTTP byte range headers

Has anybody got any experience of using HTTP byte ranges across multiple parallel requests to speed up downloads?
I have an app that needs to download fairly large images from a web service (1 MB+) and then send out the modified files (resized and cropped) to the browser. There are many of these images, so it is likely that caching will be ineffective - i.e. the cache may well be empty. In this case we are hit by some fairly large latency whilst waiting for the image to download, 500 ms+, which is over 60% of our app's total response time.
I am wondering if I could speed up the download of these images by using a group of parallel HTTP Range requests, e.g. each thread downloads 100kb of data and the responses are concatenated back into a full file.
Does anybody out there have any experience of this sort of thing? Would the overhead of the extra downloads negate the speed increase, or might this technique actually work? The app is written in Ruby, but experiences/examples from any language would help.
A few specifics about the setup:
There are no bandwidth or connection restrictions on the service (it's owned by my company)
It is difficult to pre-generate all the cropped and resized images, there are millions with lots of potential permutations
It is difficult to host the app on the same hardware as the image disk boxes (political!)
Thanks
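Edit: to make the idea concrete, here is a rough sketch of what I have in mind -- in Node.js rather than Ruby, just for illustration; host, path, and sizes are placeholders, and it assumes the service honors Range headers with 206 Partial Content responses:

    var http = require('http');

    function fetchRange(host, path, start, end, cb) {
      var options = { host: host, path: path,
                      headers: { Range: 'bytes=' + start + '-' + end } };
      http.get(options, function (res) {    // expect 206 Partial Content
        var chunks = [];
        res.on('data', function (c) { chunks.push(c); });
        res.on('end', function () { cb(null, Buffer.concat(chunks)); });
      }).on('error', cb);
    }

    // Download `total` bytes as parallel slices, then reassemble in order.
    function parallelGet(host, path, total, sliceSize, done) {
      var slices = Math.ceil(total / sliceSize);
      var parts = new Array(slices);
      var pending = slices;
      for (var i = 0; i < slices; i++) {
        (function (i) {
          var start = i * sliceSize;
          var end = Math.min(start + sliceSize, total) - 1;
          fetchRange(host, path, start, end, function (err, buf) {
            if (err) return done(err);
            parts[i] = buf;
            if (--pending === 0) done(null, Buffer.concat(parts));
          });
        })(i);
      }
    }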
I found your post by Googling to see if someone had already written a parallel analogue of wget that does this. It's definitely possible and would be helpful for very large files over a relatively high-latency link: I've gotten >10x improvements in speed with multiple parallel TCP connections.
That said, since your organization runs both the app and the web service, I'm guessing your link is high-bandwidth and low-latency, so I suspect this approach will not help you.
Since you're transferring large numbers of small files (by modern standards), I suspect you are actually getting burned by the connection setup more than by the transfer speeds. You can test this by loading a similar page full of tiny images. In your situation you may want to go serial rather than parallel: see if your HTTP client library has an option to use persistent HTTP connections, so that the three-way handshake is done only once per page or less instead of once per image.
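In Node.js terms, for example, that option is a keep-alive agent -- a rough sketch with a hypothetical host and paths:

    var http = require('http');

    // Reuse one TCP connection across many small image requests instead of
    // paying a TCP handshake per image.
    var agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

    ['/img/a.jpg', '/img/b.jpg', '/img/c.jpg'].forEach(function (path) {
      http.get({ host: 'images.example.com', path: path, agent: agent },
               function (res) { res.resume(); /* drain the body */ });
    });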
If you end up getting really fanatical about TCP latency, it's also possible to cheat, as certain major web services like to.
(My own problem involves the other end of the TCP performance spectrum, where a long round-trip time is really starting to drag on my bandwidth for multi-TB file transfers, so if you do turn up a parallel HTTP library, I'd love to hear about it. The only tool I found, called "puf", parallelizes by files rather than byteranges. If the above doesn't help you and you really need a parallel transfer tool, likewise get in touch: I may have given up and written it by then.)
I've written the backend and services for the sort of place you're pulling images from. Every site is different so details based on what I did might not apply to what you're trying to do.
Here's my thoughts:
If you have a service agreement with the company you're pulling images from (which you should because you have a fairly high bandwidth need), then preprocess their image catalog and store the thumbnails locally, either as database blobs or as files on disk with a database containing the paths to the files.
Doesn't that service already have the images available as thumbnails? They're not going to send a full-sized image to someone's browser either... unless they're crazy or sadistic and their users are crazy and masochistic. We preprocessed our images into three or four different thumbnail sizes so it would have been trivial to supply what you're trying to do.
If your request is something they expect then they should have an API or at least some resources (programmers) who can help you access the images in the fastest way possible. They should actually have a dedicated host for that purpose.
As a photographer I also need to mention that there could be copyright and/or terms-of-service issues with what you're doing, so make sure you're above board by consulting a lawyer AND the site you're accessing. Don't assume everything is ok, KNOW it is. Copyright laws don't fit the general public's conception of what copyrights are, so involving a lawyer up front can be really educational, plus give you a good feeling you're on solid ground. If you've already talked with one then you know what I'm saying.
I would guess that using any p2p network would be useless, as there are more permutations than frequently used files.
Downloading a few parts of a file in parallel can give an improvement only on slow networks (slower than 4-10 Mbps).
To get any improvement from parallel downloads, you need to ensure there is enough server power. From your current problem (waiting over 500 ms for a connection) I assume you already have a problem with your servers:
you should add/improve load-balancing,
you should think about changing your server software for something with more performance
And again, if 500 ms is 60% of the total response time, then your servers are overloaded; if you think they are not, you should search for the bottleneck in connection or server performance.

What is an appropriate page processing time for a web application?

I'm working on a web application, and it's getting to the point where I've got most of the necessary features and I'm starting to worry about execution speed. So I did some hunting around for information and I found a lot about reducing page load times by minifying CSS/JS, setting cache control headers, using separate domains for static files, compressing the output, and so on (as well as basic server-side techniques like memcached). But let's say I've already optimized the heck out of all that and I'm concerned with how long it actually takes my web app to generate a page, i.e. the pure server-side processing time with no cache hits. Obviously the tricks for bringing that time down will depend on the language and underlying libraries I'm using, but what's a reasonable number to aim for? For comparison, I'd be interested in real-world examples of processing times for apps built with existing frameworks, doing typical things like accessing a database and rendering templates.
I stuck in a little bit of code to measure the processing time (or at least the part of it that happens within the code I wrote) and I'm generally seeing values in the range 50-150ms, which seems pretty high. I'm interested to know how much I should focus on bringing that down, or whether my whole approach to this app is too slow and I should just give it up and try something simpler. (Based on the Net tab of Firebug, the parts of processing that I'm not measuring typically add less than 5ms, given that I'm testing with both client and server on the same computer.)
FYI I'm working in Python, using Werkzeug and SQLAlchemy/Elixir. I know those aren't the most efficient technologies out there but I'm really only concerned with being fast enough, not as fast as possible.
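The measuring code itself is trivial -- here's the idea as a rough sketch (in Node.js for illustration, though my stack is Python/Werkzeug):

    var http = require('http');

    http.createServer(function (req, res) {
      var start = process.hrtime();
      // ... generate the page: database access, template rendering ...
      res.end('page body');
      var diff = process.hrtime(start);
      console.log('processed in ' + (diff[0] * 1e3 + diff[1] / 1e6).toFixed(1) + ' ms');
    }).listen(8080);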
EDIT: Just to clarify, the 50-150ms I quoted above is pure server-side processing time, just for the HTML page itself. The actual time it takes for the page to load, as seen by the user, is at least 200ms higher (so, 250-350ms total) because of the access times for CSS/JS/images (although I know that can be improved with proper use of caching and Expires headers, sprites, etc. which is something I will do in the near future). Network latency will add even more time on top of that, so we're probably talking about 500ms for the total client load time.
Better yet, here's a screenshot from the Net tab of Firebug for a typical example:
It's the 74ms at the top that I'm asking about.
IMHO, 50-150 ms on the server side is fine in most circumstances. When I measure the speed of some very well-known websites, I rarely see anything as fast. Most of the time, it is about 250 ms, often higher.
Now, I want to underline three points.
Everything depends on the context. A home page, or a page which will be accessed very frequently, will suck a lot if it takes seconds to load. On the other hand, some rarely used parts of the website can take up to one second if optimizations are too expensive.
The major concern of the users is to accomplish what they want quickly. It's not about the time taken to access a single page, but rather the time to access information or to accomplish a goal. That means that it's better to have one page taking 250 ms than requiring the user to visit three pages one after another to do the same thing, each one taking 150 ms to load.
Be aware of perceived load time. For example, there is an interesting trick used on the Stack Overflow website. When doing something AJAX-based, like up/down-voting, first you see the effect, then the request is made to the server. For example, try to up-vote your own message. It will show the message as up-voted (the arrow becomes orange); then, 200 ms later, the arrow becomes gray and an error box is displayed. So in the case of an up-vote, the perceived load time (the arrow becoming orange) is about 1 ms, while the real time spent doing the request is closer to 200 ms.
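A rough sketch of that trick, with a hypothetical endpoint and class name:

    // Optimistic UI: show the effect first, then talk to the server,
    // and roll back if the request fails.
    function upvote(arrow) {
      arrow.className = 'voted';            // perceived load time: ~1 ms
      var xhr = new XMLHttpRequest();
      xhr.open('POST', '/vote?dir=up');     // hypothetical endpoint
      xhr.onload = function () {
        if (xhr.status !== 200) {
          arrow.className = '';             // roll back and surface the error
          alert(xhr.responseText);
        }
      };
      xhr.send();
    }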
EDIT: 200 ms is fine too. 500 ms will probably hurt a little if the page is accessed frequently or if the user expects the page to be fast (for example, AJAX requests are expected to be fast). By the way, I see on the screenshot that you are using several CSS files and ten PNG images. By combining CSS into one file and using CSS sprites, you can probably reduce the perceived load time, especially when dealing with network latency.
Jakob Nielsen, a well-known speaker on usability, posted an article [1] on this a few days back. He suggests that under 1 second is good and under 100 ms is perfect; anything slower starts to interrupt the user's flow.
As other users have pointed out it depends on the context of that page. If someone is uploading a file they expect a delay. If they're logging in and it takes ten seconds they can start to get frustrated.
[1] http://www.useit.com/alertbox/response-times.html
I looked at some old JMeter results from when I wrote and ran a suite of performance tests against a web service. I'll attach some of them below, it's not apples-to-apples of course but at least another data point.
Times are in milliseconds. Location Req and Map Req had inherent delays of 15000 and 3000 milliseconds, respectively. Invite included a quick call to a mobile carrier's LDAP server. The others were pretty standard, mainly database reads/writes.
sampler_label     count  average    min    max
Data Blurp         2750      185     30   2528
UserAuth           2750      255     41   2025
Get User Acc        820      148     29   2627
Update User Acc       4      243     41   2312
List Invitations   9630      345     47   3966
Invite             2750      591    102   4095
ListBuddies        5500      344     52   3901
Block Buddy         403      419     79   1835
Accept invite      2065      517     94   3043
Remove Buddy        296      411     83   1942
Location Req       2749    16963  15369  20517
Map Req            2747     3397   3116   5926
This software ran on a dedicated, decent virtual machine, tuned the same way the production VMs were. The max results were slow; my goal was to find the number of concurrent users we could support, so I was pushing it.
I think your numbers are absolutely ok. With regards to all the other stuff that makes websites seem slow, if you haven't, take a look at YSlow. It integrates nicely with Firebug and provides great information about how to make pages load faster.
50-150 ms for page load time is fine - you do not need to optimize further at this point.
The fact is, so long as your pages are loading within a second, you are OK.
See this article, which discusses the effect of load times on conversions (a 100 ms increase cost Amazon 1% of sales).
