'Soft' move to HTTPS - performance

I have a mobile site that I generally want to be all HTTPS. It uses the HTML5 application cache to store all the data locally, so it's a fast user experience EXCEPT during the first page view.
My homepage loads on an iPhone over AT&T 4G in 1 second without HTTPS and in 4.5 seconds with HTTPS (round-trip latency is the killer here). So I am trying to find a way to softly move the user to HTTPS so that they get a great first impression, followed by a secure experience.
Here is what I am doing:
Externally, always reference HTTP (e.g., in press releases, etc.)
canonical = HTTP (so Google sees HTTP)
On the site pages, all links are HTTPS
Once the user comes to the HTTP page, all of the HTTPS pages will load via an application cache manifest in an iframe
Upon hitting the server for the first time (over HTTP), the server will set a cookie; on the next visit it will force a redirect to HTTPS (which is already cached) - sketched below
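A rough sketch of that cookie-and-redirect step, assuming an Express server (the cookie name, route, and redirect status are illustrative, not from the original post):

    const express = require('express');
    const cookieParser = require('cookie-parser');

    const app = express();
    app.use(cookieParser());

    // First plain-HTTP visit: set a marker cookie and serve the page over HTTP
    // so the application cache can populate. Later plain-HTTP visits: redirect
    // to HTTPS, which the application cache already holds.
    app.use((req, res, next) => {
      const isHttps = req.secure || req.headers['x-forwarded-proto'] === 'https';
      if (!isHttps) {
        if (req.cookies.seenOnce) {
          return res.redirect(302, 'https://' + req.headers.host + req.originalUrl);
        }
        res.cookie('seenOnce', '1', { maxAge: 30 * 24 * 3600 * 1000 });
      }
      next();
    });

    app.get('/', (req, res) => res.send('home page'));
    app.listen(80);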
Is this a good implementation to get the speed of HTTP with the security of HTTPS?

Related

Read HTTP IndexedDB data from HTTPS web app

I created data in my app and stored it in IndexedDB.
After upgrading to HTTPS, the data disappeared, since the address is different. Now I need to access it again.
I tried removing the certificate on the server, but this didn't help. The browser (Brave on iPad) still forces HTTPS, even when I deactivate the HTTPS Brave Shields option.
My main question is: how can I retrieve the "unsecured" data, given that I have access to the domain's DNS settings, the code, and the browser?
Browser storage is origin-scoped. http://example.com and https://example.com are different origins. They can't access each other's data - they have a different localStorage, a different set of IndexedDB databases, etc.
Origins can cooperate to share data. In the past, you could have a page from the https origin contain an iframe from the http origin, and they could use postMessage() to proxy the data - i.e. the parent frame messages the child frame saying "give me your data", and the child frame validates that the request is from the expected origin, pulls the data out of the database, and sends it back to the parent.
That will still work in Chrome, but browsers are generally moving towards partitioning data in third-party iframes (so the storage seen by a top-level B.com window is different from the storage seen by a B.com iframe inside an A.com window). I believe a separate window (e.g. opened via window.open()) would work here, although it would be more disruptive to the user.
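A minimal sketch of that proxy pattern, with both pages shown together (the database name, object store, and message types are hypothetical):

    // --- Parent page served from https://example.com ---
    const frame = document.createElement('iframe');
    frame.src = 'http://example.com/export.html';   // the old http origin
    frame.style.display = 'none';
    document.body.appendChild(frame);

    window.addEventListener('message', (event) => {
      if (event.origin !== 'http://example.com') return;   // trust only the expected origin
      console.log('data received from the http origin:', event.data.records);
    });

    frame.onload = () => {
      frame.contentWindow.postMessage({ type: 'give-me-your-data' }, 'http://example.com');
    };

    // --- Child page at http://example.com/export.html ---
    // Validates the caller, reads its IndexedDB records, and posts them back.
    window.addEventListener('message', (event) => {
      if (event.origin !== 'https://example.com') return;
      if (event.data.type !== 'give-me-your-data') return;
      const open = indexedDB.open('app-db');                // hypothetical database name
      open.onsuccess = () => {
        const tx = open.result.transaction('records', 'readonly');
        const getAll = tx.objectStore('records').getAll();
        getAll.onsuccess = () =>
          event.source.postMessage(
            { type: 'here-is-my-data', records: getAll.result },
            event.origin
          );
      };
    });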

JMeter with a caching server

My application uses a CDN, which is a caching server. When I use JMeter to record the functional flow, the browser doesn't load any CSS, JS, or images that are cached at the CDN server. Removing the CDN is also not a good option, because I need to judge performance with the CDN in place. Please guide me.
JMeter records only the HTTP requests sent by the browser, so if you have already visited the page, your browser may already have these resources in its own cache and therefore doesn't send actual requests. If you want these requests to be recorded, you should clear your browsing history and, in particular, delete the cache. The procedure differs from browser to browser, so check your browser's documentation for details, or see the "How do I clear my web browser's cache, cookies, and history?" article.
In general you should not be recording these calls, as real browsers download images, scripts, and styles using a concurrent thread pool, i.e. one main request followed by parallel requests for the resources. The same behavior can be set up in JMeter on the "Advanced" tab of the HTTP Request sampler (or, even better, of the HTTP Request Defaults, so the setting applies to all HTTP Request samplers in scope).
I accepted the security certificate for the CDN server through the browser, and the problem was solved.

Google shows some of my pages in https

I had SSL installed on my site for a day and then uninstalled it. Now in the SERPs, Google shows some of the pages in the HTTPS version. I don't think this is a duplicate content issue, because only one version of each web page shows up in the SERPs. I'm not sure how to fix this because a mix of HTTP and HTTPS shows up.
Why does Google randomly show the HTTPS version of some of my pages?
How can I tell Google to use the HTTP version of those pages?
Thanks in advance!
I think more information is needed to fully answer this question; however, most likely Google is showing the HTTPS version of your page because your site was using HTTPS at the time Google visited and indexed it, so the HTTPS version was cached.
You can partially manage your search presence and request that Google re-index your site from Google Webmaster Tools: https://www.google.com/webmasters/tools/. However, the change isn't guaranteed and won't take place immediately.
Furthermore, you can control whether a visitor uses HTTPS or HTTP when visiting your site via standard redirects. In other words, update your webserver or web application so if someone visits http://mysite they will be redirected (preferably with a 301 status code for SEO reasons) to https://mysite.
Finally, as a general practice, you should really use HTTPS, as browsers may highlight warnings for non-secured traffic in the future. For sites I manage, "http" requests are automatically redirected to the "https" version of the same URL to ensure all traffic is always over a secure connection.
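As a minimal sketch of that kind of redirect, assuming an Express app behind a proxy that sets X-Forwarded-Proto:

    const express = require('express');
    const app = express();

    // Redirect every plain-HTTP request to the HTTPS version of the same URL,
    // using a 301 so search engines transfer their index to the HTTPS URLs.
    app.use((req, res, next) => {
      const isHttps = req.secure || req.headers['x-forwarded-proto'] === 'https';
      if (!isHttps) {
        return res.redirect(301, 'https://' + req.headers.host + req.originalUrl);
      }
      next();
    });

    app.listen(80);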

Why do we need HTTPS when we send results to the user?

The reasons we need HTTPS (secured/encrypted data over the network), as I understand them:
We need to receive user-side data securely (whether via a form or a URL, however the user sends data to the server over the network), which is done by HTTP + SSL encryption. In that case, only the form or the URL the user posts/sends data to has to be a secure URL, not the page that I send to the browser. For example, when I serve a customer registration form, I have to send the page itself over an HTTPS URL; if I don't, the browser gives a warning such as a mixed content error. Instead, couldn't browsers have had some sort of parameter to indicate that just my form must point to a secure URL?
In some cases my server-side content can't be read by anyone other than those I allow; for that I can use HTTPS to deliver the content, with extra security measures on the server side.
Other than these two scenarios, I don't see any reason to have HTTPS-encrypted content on the network. Let's assume a site with 10+ CSS files, 10+ JS files, and 50+ images, with 200 KB of page content and a total weight of maybe ~2-3 MB. If this whole payload is encrypted, I have no doubt this is going to be a minimum of 100-280 connections created between browser and server.
Please explain why we need to deliver everything this way. Most of us only do it because browsers, search engines like Google, and W3C standards ask us to use it on every page.
why we need to deliver everything this way
Because otherwise it's not secure. The browsers that warn about this are not wrong.
Let's assume a site with 10+ CSS files, 10+ JS files
Just one .js file served over non-HTTPS and a man-in-the-middle attacker could inject arbitrary code into your HTTPS page, and from that origin they can completely control the user's interaction with your site. That's why browsers don't allow it and give you the mixed content warning.
(And .css can have the same impact in many cases.)
Plus it's just plain bad security usability to switch between HTTP and HTTPS for different pages. The user is likely to fail to notice the switch and may be tricked into entering data into (or accepting data from) a non-HTTPS page. All the attacker would have to do is change one of the links on an HTTP page so it pointed to HTTP instead of HTTPS, and the usual process would be subverted.
I have no doubt this is going to be a minimum of 100-280 connections created between browser and server
HTTP[S] reuses connections. You don't pay the SSL handshake latency for every resource linked.
HTTPS is really not expensive enough today for its performance cost to be worth worrying about for a typical small web app.

Amazon Cloudfront: private content but maximise local browser caching

For JPEG image delivery in my web app, I am considering using Amazon S3 (or Amazon Cloudfront if it turns out to be the better option) but have two, possibly opposing, requirements:
The images are private content; I want to use signed URLs with short expiration times.
The images are large; I want them cached long-term by the users' browser.
The approach I'm thinking is:
User requests www.myserver.com/the_image
Logic on my server determines the user is allowed to view the image. If they are allowed...
Redirect the browser (is HTTP 307 best?) to a signed Cloudfront URL
The signed Cloudfront URL expires in 60 seconds, but its response includes "Cache-Control: max-age=31536000, private"
The problem I foresee is that the next time the page loads, the browser will be looking for www.myserver.com/the_image, but its cache entry will be for the signed Cloudfront URL. My server will return a different signed Cloudfront URL the second time, due to the very short expiration times, so the browser won't know it can use its cache.
Is there a way around this without having my web server proxy the image from Cloudfront (which obviously negates all the benefits of using Cloudfront)?
I'm wondering if there may be something I could do with ETag and HTTP 304, but I can't quite join the dots...
To summarize, you have private images you'd like to serve through Amazon Cloudfront via signed urls with a very short expiration. However, while access by a particular url may be time limited, it is desirable that the client serve the image from cache on subsequent requests even after the url expiration.
Regardless of how the client arrives at the cloudfront url (directly or via some server redirect), the client cache of the image will only be associated with the particular url that was used to request the image (and not any other url).
For example, suppose your signed url is the following (expiry timestamp shortened for example purposes):
http://[domain].cloudfront.net/image.jpg?Expires=1000&Signature=[Signature]
If you'd like the client to benefit from caching, you have to send it to the same url. You cannot, for example, direct the client to the following url and expect the client to use a cached response from the first url:
http://[domain].cloudfront.net/image.jpg?Expires=5000&Signature=[Signature]
There are currently no cache control mechanisms to get around this, including ETag, Vary, etc. The nature of client caching on the web is that a resource in cache is associated with a url, and the purpose of the other mechanisms is to help the client determine when its cached version of a resource identified by a particular url is still fresh.
You're therefore stuck in a situation where, to benefit from a cached response, you have to send the client to the same url as the first request. There are potential ways to accomplish this (cookies, local storage, server scripting, etc.), and let's suppose that you have implemented one.
You next have to consider that caching is only a suggestion, not a guarantee. If you expect the client to have the image cached and serve it the original url to benefit from that caching, you run the risk of a cache miss. In the case of a cache miss after the url expiry time, the original url is no longer valid. The client is then left unable to display the image (from the cache or from the provided url).
The behavior you're looking for simply cannot be provided by conventional caching when the expiry time is in the url.
Since the desired behavior cannot be achieved, you might consider your next best options, each of which will require giving up on one aspect of your requirement. In the order I would consider them:
If you give up short expiry times, you could use longer expiry times and rotate urls. For example, you might set the url expiry to midnight and then serve that same url for all requests that day. Your client will benefit from caching for the day, which is likely better than none at all. Obvious disadvantage is that your urls are valid longer.
If you give up content delivery, you could serve the images from a server which checks for access with each request. Clients will be able to cache the resource for as long as you want, which may be better than content delivery depending on the frequency of cache hits. A variation of this is to trade Amazon CloudFront for another provider, since there may be other content delivery networks which support this behavior (although I don't know of any). The loss of the content delivery network may be a disadvantage or may not matter much depending on your specific visitors.
If you give up the simplicity of a single static HTTP request, you could use client side scripting to determine the request(s) that should be made. For example, in javascript you could attempt to retrieve the resource using the original url (to benefit from caching), and if it fails (due to a cache miss and lapsed expiry) request a new url to use for the resource. A variation of this is to use some caching mechanism other than the browser cache, such as local storage. The disadvantage here is increased complexity and compromised ability for the browser to prefetch.
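A rough sketch of that client-side fallback, assuming a hypothetical /signed-url endpoint on your own server and CORS headers on the CloudFront distribution:

    // Try the signed URL the page was originally given, hoping for a cache hit;
    // if that fails (cache miss and the signature has expired), ask the server
    // for a fresh signed URL and use it instead.
    async function loadImage(imgElement, originalSignedUrl) {
      try {
        const response = await fetch(originalSignedUrl, { cache: 'force-cache' });
        if (!response.ok) throw new Error('signed URL rejected');
        imgElement.src = URL.createObjectURL(await response.blob());
      } catch (err) {
        // Hypothetical endpoint that re-checks access and returns a new signed URL.
        const fresh = await (await fetch('/signed-url?image=the_image')).json();
        imgElement.src = fresh.url;
      }
    }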
Save a mapping of user + image + expiration time -> CloudFront link. If a user has a non-expired CloudFront link, use it for the image and don't generate a new one.
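A minimal in-memory sketch of that idea (the signing callback and expiry margin are illustrative):

    // Cache signed CloudFront URLs per user+image and reuse them until shortly
    // before expiry, so a returning browser is redirected to the same URL it
    // already has cached.
    const urlCache = new Map();   // key: `${userId}:${imageId}` -> { url, expiresAt }

    function signedUrlFor(userId, imageId, signFn, ttlSeconds = 3600) {
      const key = `${userId}:${imageId}`;
      const cached = urlCache.get(key);
      if (cached && cached.expiresAt - Date.now() > 60 * 1000) {
        return cached.url;                      // still valid: reuse the same URL
      }
      const expiresAt = Date.now() + ttlSeconds * 1000;
      const url = signFn(imageId, expiresAt);   // signFn is your CloudFront URL signer
      urlCache.set(key, { url, expiresAt });
      return url;
    }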
It seems you have already solved the issue. You said that your server issues an HTTP 307 redirect to the CloudFront (signed) URL, so the browser caches only the CloudFront URL, not your URL (www.myserver.com/the_image). So the scenario is as follows:
Client 1 requests www.myserver.com/the_image -> is redirected to the CloudFront URL -> the content is cached
The CloudFront URL now expires.
Client 1 requests www.myserver.com/the_image again -> is redirected to the same CloudFront URL -> retrieves the content from cache without fetching the CloudFront content again.
Client 2 requests www.myserver.com/the_image -> is redirected to the CloudFront URL, which denies access because the signature has expired.
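For reference, a sketch of that redirect endpoint, assuming Express and the @aws-sdk/cloudfront-signer package (the distribution domain, key pair ID, key path, and access check are placeholders):

    const express = require('express');
    const fs = require('fs');
    const { getSignedUrl } = require('@aws-sdk/cloudfront-signer');

    const app = express();
    const privateKey = fs.readFileSync('/path/to/cloudfront-private-key.pem', 'utf8');

    app.get('/the_image', (req, res) => {
      // Placeholder access check: replace with your real authorization logic.
      if (!req.headers.authorization) return res.sendStatus(403);

      const signedUrl = getSignedUrl({
        url: 'https://dxxxxxxxxxxxx.cloudfront.net/image.jpg',
        keyPairId: 'KXXXXXXXXXXXXX',
        privateKey,
        dateLessThan: new Date(Date.now() + 60 * 1000).toISOString(),  // expires in 60 s
      });

      // 307 preserves the method; the browser caches the CloudFront response
      // under the signed URL it was redirected to.
      res.redirect(307, signedUrl);
    });

    app.listen(3000);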
