HTTP/2 and responsive images

I'm currently experimenting with HTTP/2 and server push rules. It's quite easy to implement push rules for JS and CSS files, but there seems to be no way to use the push feature effectively with responsive images, i.e. the picture tag and/or the srcset attribute.
Of course, I can push every version of an image to the client, but that would be a traffic disaster, especially on mobile devices with limited traffic.
As far as I know, the browser gets a promise for each file push. The promise is used to interrupt that push when the file is already cached. I hope that this statement is correct.
Is there a way to tell the browser that an image is only meant for a specific screen size or pixel ratio?

Of course, I can push every version of an image to the client, but
that would be a traffic disaster, especially on mobile devices with
limited traffic.
Yes, that would defeat the point of using different versions (which are primarily there to save bandwidth).
As far as I know, the browser gets a promise for each file push. The
promise is used to interrupt that push when the file is already
cached. I hope that this statement is correct.
Yes it is. However, if you are thinking you can make the browser cancel the push, you need to realise that 1) browsers will typically only do this for resources they already have in the cache, and 2) cancelling takes time, by which point some (or perhaps even all) of the pushed resource may already have been needlessly downloaded.
Is there a way to tell the browser that an image is only meant for a
specific screen size or pixel ratio?
You don't push images to the screen, but to the browser cache, so a pushed resource will only be used if the page actually calls for it (e.g. it matches the chosen srcset value). However, as mentioned above, you don't want resources pushed needlessly or you are wasting bandwidth.
The key to successfully using server push is to be reasonably certain that the resources are needed - otherwise you will actually create a performance bottleneck. I would honestly suggest NOT pushing everything, but only pushing the critical, render-blocking resources that will almost certainly be needed (CSS, JavaScript). Images are not typically render-blocking, so there is not usually a great need to push them.
This could be handled with cookies. If no cookie is set, then it's likely a fresh session, so push the critical CSS file and set a "cssLoaded" cookie. If a page is requested and that cookie is set, then don't push the CSS file. I've blogged about a simple implementation of this in Apache here: https://www.tunetheweb.com/performance/http2/http2-push/. This could still lead to over-pushing - if the client doesn't allow cookies, for example - but for most users it would be fine.
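The linked post implements this in Apache; purely as an illustration of the same cookie check, a minimal Node.js sketch might look like this (the cookie name and the /css/critical.css path are placeholders):

    const http = require('http');

    http.createServer((req, res) => {
      const cookies = req.headers.cookie || '';
      const cssLoaded = cookies.includes('cssLoaded=1');

      if (!cssLoaded) {
        // Likely a fresh session: ask the front-end server/proxy to push the
        // critical CSS (many servers turn rel=preload Link headers into
        // HTTP/2 pushes), and remember that we have done so.
        res.setHeader('Link', '</css/critical.css>; rel=preload; as=style');
        res.setHeader('Set-Cookie', 'cssLoaded=1; Path=/; Max-Age=86400');
      }

      res.setHeader('Content-Type', 'text/html');
      res.end('<html><!-- page markup --></html>');
    }).listen(8080);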
You could extend this further by having JavaScript set a cookie with the screen size; on subsequent page loads the server can read that cookie, know the screen size, and push appropriately sized images. This won't help the initial page load, but it would help later page loads if your visitor views several pages on your site in the same session. Honestly, though, it sounds like overkill and I would just not push the images.
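The client-side half of that idea is only a couple of lines (the cookie names are illustrative); the server would then parse these cookies on later requests:

    // Run once per visit; subsequent requests carry these cookies so the
    // server can decide which image variant, if any, to push.
    document.cookie = 'screenWidth=' + screen.width + '; Path=/; Max-Age=86400';
    document.cookie = 'dpr=' + window.devicePixelRatio + '; Path=/; Max-Age=86400';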

Related

Why do some websites that have SSL not work but still load if using the HTTPS version? How can I avoid it if I make a website?

Sometimes, if I go to a website such as this one through an HTTP link, it looks fine and works as apparently intended.
However, if I change the address to HTTPS, the page loads without any browser warnings but looks really weird and seems broken: spacing is messed up, the colors are wrong, fonts don't load, etc.
The same thing happens in both Firefox and Chrome on my computer.
What causes this to happen? How can I avoid this if I make an HTTPS-secured website?
For me, the browser tells you what is wrong in a warning message: parts of the page are not secure (such as images).
What does this mean? The developer of the site has linked some content such as CSS, JS, or images using HTTPS links and some using HTTP links.
Why is this a problem? Since some content is being retrieved over an insecure connection (HTTP), it would be possible for malicious content to be injected into your browser, which could then grab information that was transmitted over HTTPS. Browsers have shown this warning for a very long time, but in the interest of security they now err on the more secure side and block such mixed content by default.
What will fix this? There is nothing we can do as consumers of the website. The owner of the site should fix the problem. If you are really interested in viewing the site and not concerned about security, you can temporarily disable this protection from the URL bar warning message in Firefox.
As #micker explained, the page looks weird because not all of the resources are loading: their connections could not be made securely, and the browser refuses to load them because they are not referenced over a secure connection.
To elaborate further, in case it's still not quite clear: for styling a web page, Cascading Style Sheets (CSS) is the language used to describe the presentation of the document, telling the browser how elements should be rendered on the screen. If you think of these stylesheets as building blocks that can be combined to define different areas of a web page and build one masterpiece, then it seems perfectly normal for a site to reference several of them.
To save even more time, rather than figure out the code for each and every stylesheet or "building block" I want to include, I can borrow someone else's stylesheet that has the properties I want and link to it as a resource, instead of making or hosting the resource myself. If we pretend there is a stylesheet for every font size change, font color variance, or font placement, then we're going to need a building block to define each of those.
Now, if I am on a secure connection, the browser ensures that connection stays secure by only connecting to other sites, or resources, that are also secure. If any of the sites hosting the CSS building blocks I want to use are not secure, i.e. not using SSL (indicated by the missing "s" in the "http://" of their address), then the browser will prevent those connections from happening, and thus prevents the resources from loading, because it considers them a risk to your current secure connection.
In your example's particular case, things looked fine when you entered only http:// because the site you were visiting doesn't force visitors onto SSL and lets you connect over the less secure HTTP protocol. Your browser is therefore not connecting securely in the first place and won't take the extra step of blocking insecure subresources: your connection is already exposed, so the browser can freely connect wherever it needs to and load any resource, whether or not it can be transferred securely.
So when you go to the "https://" version of the site, there are no browser warnings because you're connecting to that site over a secure connection. Unfortunately, that also means that if the designer of the page linked resources from somewhere that simply doesn't offer SSL, or never updated the links to https://, those resources are considered insecure. Since you're on a secure connection, the browser blocks those connections, which blocks those resources from loading and leaves the page incomplete, missing some of its building blocks. The building blocks that tell your screen to move all the text on the right into a panel, use a blue font color, and switch to a different font face never arrive, so those sections fall back to whatever stylesheet is present, which usually doesn't match what was intended.

Send an entire web app as 1 HTTP response (html, js, css, images, ...)

Traditionally a browser will parse HTML and then send further requests to the server for all related resources. This seems inefficient to me, since it might require a large number of requests, even though my server already knows that a browser that wants to use this web application will need all of its resources.
I know that JS and CSS could be inlined, but that complicates server-side code, and image data as base64 bloats the size of the data... I'm also aware that rendering can start before all assets are downloaded, which would potentially no longer work (depending on the implementation). I still feel that streaming an entire application in one go should be faster on slow connections than making tens of separate requests.
Ideally I would like the server to stream an entire directory into one HTTP response.
Does any model for this exist?
Does the reasoning make sense?
ps: If browser support for this is completely lacking, I'm wondering about a 2-step approach: download a small JavaScript file which then downloads a compressed web app file, extracts it and plugs the resources into the page. Is anyone already doing something like this?
Update
I found one: http://blog.another-d-mention.ro/programming/read-load-files-from-zip-in-javascript/
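For reference, a rough sketch of that two-step idea, assuming the JSZip library is available and using made-up file names (app-bundle.zip, styles.css, logo.png):

    // Step 1: fetch one compressed bundle instead of many separate requests.
    async function loadBundle() {
      const response = await fetch('/app-bundle.zip');
      const zip = await JSZip.loadAsync(await response.arrayBuffer());

      // Step 2: unpack the resources and plug them into the page.
      const css = await zip.file('styles.css').async('text');
      const style = document.createElement('style');
      style.textContent = css;
      document.head.appendChild(style);

      const imgBlob = await zip.file('logo.png').async('blob');
      document.querySelector('img#logo').src = URL.createObjectURL(imgBlob);
    }

    loadBundle();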
I started to research related issues in order to find the best results possible without changing web standards, and I wondered about caching. If I could send the last-modified date of every subresource of a page along with the initial HTML page, a browser could avoid sending If-Modified-Since requests once it has loaded every resource at least once. This would in effect be better than sending all resources with the initial request, since that would only help on the first load and would be detrimental on subsequent loads, when it is better for the browser to use its cache (as Barmar pointed out).
Now it turns out that even with a web extension you cannot get hold of the If-Modified-Since header, so you certainly can't tell the browser to use the cached version instead of contacting the server.
I then found this post from Facebook on how they tried to reduce traffic by hashing their static files and giving them a 1-year expiry date. This means the URL guarantees the content of the file. They still saw plenty of unnecessary If-Modified-Since requests, and they managed to convince Firefox and Chrome to change the behaviour of their reload buttons to no longer reload static resources. For Firefox this requires the new Cache-Control: immutable header; for Chrome it doesn't.
I then remembered that I had seen something like this before, and it turns out there is a solution to this problem which is more convenient than hashing the contents of resources and serving them from a database for at least ten years: just put a new version number in the filename. The even more convenient solution would be to just add a version query string, but it turns out that that doesn't always work.
Admittedly, changing your filenames all the time is a nuisance, because files referencing them also need to change. However, the files don't actually need to change. If you control the server, it might be as simple as writing a redirect rule to make sure that logo.vXXXX.png is served as logo.png (where XXXX is the last-modified timestamp in seconds since the epoch)[1]. Then let your template system generate the timestamp automatically, like WordPress's wp_enqueue_script does. WordPress actually satisfies itself with the query-string technique. Now you can set the expiration date far into the future and use the immutable cache header. If browsers respect the cache control, you can safely ignore ETags and If-Modified-Since headers, since they are now completely redundant.
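Purely as an illustration of that rewrite idea, a bare-bones Node.js static handler (the paths and the .vXXXX naming pattern are assumptions) might look roughly like this:

    const http = require('http');
    const fs = require('fs');
    const path = require('path');

    const STATIC_ROOT = path.join(__dirname, 'static');   // assumed layout

    http.createServer((req, res) => {
      // Strip the version segment: /img/logo.v1700000000.png -> /img/logo.png
      const cleanPath = req.url.split('?')[0];
      const unversioned = cleanPath.replace(/\.v\d+(\.\w+)$/, '$1');

      fs.readFile(path.join(STATIC_ROOT, unversioned), (err, data) => {
        if (err) { res.writeHead(404); return res.end('Not found'); }
        res.writeHead(200, {
          // Far-future expiry plus immutable: the browser never revalidates.
          'Cache-Control': 'public, max-age=31536000, immutable',
        });
        res.end(data);
      });
    }).listen(8080);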
This solution guarantees the browser shall never ask for cache validation and yet you shall never see a stale resource, without having to decide on the expiry date in advance.
It doesn't answer the original question about how to avoid multiple requests for the resources of a page on a clean cache, but ever after (as long as the browser cache doesn't get cleared), you're good! I suppose that's good enough for me.
[1] You can even avoid the server overhead of checking the timestamp on every resource every time a page references it by using your application's version number instead. In debug mode, for development, you can use the timestamp to avoid having to bump the version on every modification of the file.
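A tiny template helper along those lines (the APP_VERSION constant and directory layout are assumptions) could use the file's mtime while developing and a fixed release version in production, so nothing is stat'ed on every request:

    const fs = require('fs');
    const path = require('path');

    const APP_VERSION = '20240115';                 // bumped on each release
    const DEV = process.env.NODE_ENV !== 'production';

    function versionedUrl(publicPath, staticRoot) {
      const ext = path.extname(publicPath);         // e.g. ".png"
      const version = DEV
        ? Math.floor(fs.statSync(path.join(staticRoot, publicPath)).mtimeMs / 1000)
        : APP_VERSION;
      return publicPath.slice(0, -ext.length) + '.v' + version + ext;
    }

    // e.g. versionedUrl('/img/logo.png', '/var/www/static') -> '/img/logo.v20240115.png'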

Employing a CDN for a dynamic website

I have a website forum where users exchange photos and text with one another on the home page. The home page shows the 20 latest objects, be they photos or text. The 21st object is pushed out of view. A new photo is uploaded every 5 seconds. A new text string is posted every second. In around 20 seconds, a photo that appeared at the top has disappeared off the bottom.
My question is: would I get a performance improvement if I introduced a CDN in the mix?
Since the content is changing, it seems I shouldn't be doing it. However, when I think about it logically, it does seem I'll get a performance improvement by introducing a CDN for my photos. Here's how. Imagine a photo is posted, appearing on the page at t=1 and remaining there until t=20. The first person to access the page (closer to t=1) will cause the photo to be pulled to an edge server. Thereafter, anyone accessing the photo will receive it from the CDN; this will last until t=20, after which the photo disappears. This is a veritable performance boost.
Can anyone comment on the flaws in my reasoning, and/or what I am failing to consider? It would also be good to know what alternative performance optimizations I could make for a website like mine. Thanks in advance.
You've got it right. As long as someone accesses the photo within the 20 seconds that the image is within view it will be pulled to an edge server. Then upon subsequent requests, other visitors will receive a cached response from the nearest edge server.
As long as you're using the CDN for delivering just your static assets, there should be no issues with your setup.
Additionally, you may want to look into a CDN which supports HTTP/2. This will provide you with improved performance. Check out cdncomparison.com for a comparison between popular CDN providers.
You need to consider all requests hitting your server, which includes the primary dynamically generated HTML document, but also all static assets like CSS files, Javascript files and, yes, image files (both static and user uploaded content). An HTML document will reference several other assets, each of which needs to be downloaded separately and thus incurs a server hit. Assuming for the sake of argument that each visitor has an empty local cache, a single page load may incur, say, ~50 resource hits for your server.
Probably the only request which actually needs to be handled by your server is the dynamically generated HTML document, if it's specific to the user (because they're logged in). All other 49 resource requests are identical for all visitors and can easily be shunted off to a CDN. Those will just hit your server once [per region], and then be cached by the CDN and rarely bother your server again. You can even have the CDN cache public HTML documents, e.g. for non-logged in users, you can let the CDN cache HTML documents for ~5 seconds, depending on how up-to-date you want your site to appear; so the CDN can handle an entire browsing session without hitting your server at all.
If you have roughly one new upload per second, there are likely an order of magnitude more passive visitors per second. If you can let a CDN handle ~99% of requests, that's a dramatic reduction in actual hits to your server. If you are clever about what you cache and for how long, and depending on your particular mix of anonymous and authenticated users, you can easily reduce server load by an order of magnitude or two. On the other side, you're speeding up page load times accordingly for your visitors.
For every single HTML document and other asset, really think about whether it can be cached and for how long (a rough header sketch follows this list):
For HTML documents, is the user logged in? If no, and there's no other specific cookie tracking or similar things going on, then the asset is static and public for all intents and purposes and can be cached. Decide on a maximum age for the document and let the CDN cache it. Even caching it for just a second makes a giant difference when you get 1000 hits per second.
If the user is logged in, set the cache pragma to private, but still let the visitor's browser cache it for a few seconds. These headers must be decided upon by your forum software while it's generating the document.
For all other assets which aren't access restricted: let the CDN cache it for a long time and you can practically forget about ever having to serve those particular files ever again. These headers can be statically configured for entire directories in the web server.
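As a framework-agnostic sketch of the header logic above (the parameter names are made up; res is a Node.js http.ServerResponse), the decision boils down to something like this:

    function setCacheHeaders(res, isStaticAsset, userIsLoggedIn) {
      if (isStaticAsset) {
        // Images, CSS, JS: let the CDN and browsers keep these for a long time.
        res.setHeader('Cache-Control', 'public, max-age=31536000, immutable');
      } else if (userIsLoggedIn) {
        // Personalised HTML: only this visitor's browser may cache it, briefly.
        res.setHeader('Cache-Control', 'private, max-age=5');
      } else {
        // Public HTML: even one second at the CDN absorbs most of the traffic.
        res.setHeader('Cache-Control', 'public, max-age=1, s-maxage=5');
      }
    }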

Fixing slow response time for resources

I have a Magento website and I have been noticing an increase in warnings from Catchpoint that various images, CSS files, and javascript files are taking longer than usual to load. We use Edgecast for our CDN and have all images, CSS, and JS files hosted there. I have been in contact with them and they determined that the delays happen when the cache for the resource has expired and it must contact the origin for an updated file. The problem is that I can't figure out why it would take longer than a second to return a small image file. If I load the offending image off our server (not from the CDN) in my browser it always returns quickly. I assume that if you call up an image file directly using the full URL to the image file (say a product image, for example), that would bypass any Magento logic or database access and simply return the image to you. This should happen quickly, and it normally does, but sometimes it doesn't.
We have a number of things in play that may have an effect. There are API calls to the server for various integrations, though they are directed at a secondary server and not the web frontend. We may also have a large number of stale images since Magento doesn't delete any images even if you replace them or delete the product.
I realize this is a fairly open ended question, and I'm sorry if it breaks SO protocol, but I'm grasping at straws here. If anyone has any ideas on where to look or what could cause small resource files, like images, to take upwards of 8 seconds to load, I'm all ears. As an eCommerce site, it's getting close to peak season, and I can feel the hot breath of management on my neck. Any help would be greatly appreciated.
Thanks!
Turns out we had stumbled upon some problems with the CDN that they were somewhat aware of and not quick to admit. They made some changes to our account to work around the issues and things are much better now.

2 instances of 1 image on a page

I am not really sure how to google this, so I thought I could ask here.
I have the same image posted on a page twice. Will that slow down the page load time, or will it remain the same since I am using the same resource?
The browser should be able to cache the image the first time it is requested from the source server. Most popular browsers have this implemented. It should not have to load the image twice: just once on the initial request to the server, then it uses the cached version the second time.
This also assumes the end user has browser caching enabled. If that is turned off (even if the browser supports it), then the browser will make that extra request for the image, since there is no cache to pull from.
