HTTPS: is the URL sent in plain text at first connection?

Let's say I have never connected to the site example.com.
If this site uses HTTPS and I type https://example.com/supersecretpage,
will the URL be sent in clear text, since it's the first time I connect to the site and the crypto keys have not yet been exchanged? If not, when does the key exchange take place?
Could anyone explain the steps that occur when I type that URL?

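Only the host name portion of the URL is exposed; the path and the query parameters are encrypted, the first time and every time. The host name is visible on every connection because it is needed to reach the server: it appears in the DNS lookup and, typically, in the Server Name Indication field of the TLS handshake, and the server's IP address is needed for routing.
A negotiation (the TLS handshake) occurs before any HTTP data is transferred, so the request for /supersecretpage is only sent once the keys are in place.
There is a lot of good documentation on the web, from high-level overviews such as this question down to low-level details; check it out.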
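The order of events for a first visit can be sketched roughly as follows. This is only an illustration, assuming classic unencrypted DNS and a plain (unencrypted) SNI field; the function name connection_steps and the "visible"/"encrypted" labels are made up for this example and are not part of any library.

```python
from urllib.parse import urlsplit


def connection_steps(url):
    """Return (step, data, visibility) tuples for a first HTTPS request.

    Assumption: plain DNS and unencrypted SNI, which is the common case.
    """
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 443
    # The request target is the path plus the query string, if any.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return [
        ("DNS lookup", host, "visible"),
        ("TCP connect", f"<IP of {host}>:{port}", "visible"),
        ("TLS handshake (SNI)", host, "visible"),
        # Sent only after the handshake has completed.
        ("HTTP request", f"GET {target} HTTP/1.1", "encrypted"),
    ]


for step, data, visibility in connection_steps("https://example.com/supersecretpage"):
    print(f"{step:22} {visibility:10} {data}")
```

Nothing that an on-path observer can read in this sketch contains the path; it only shows up in the last step, inside the encrypted channel.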

Related

How to validate that a certain domain is reachable from browser?

Our single page app embeds videos from Youtube for end users' consumption. Everything works great if the user has access to the Youtube domain and to the content of that domain's pages.
However, we frequently run into users whose access to Youtube is blocked by a web filter box on their network, such as https://us.smoothwall.com/web-filtering/ . The challenge here is that the filter doesn't actually kill the request; it simply returns another page instead, with an HTTP status 200. The page usually says something along the lines of "hey, sorry, this content is blocked".
One option is to try to fetch https://www.youtube.com/favicon.ico to prove that the domain is reachable. The issue is that these filters usually involve a custom SSL certificate to allow them to inspect the HTTP content (see: https://us.smoothwall.com/ssl-filtering-white-paper/), so I can't rely on TLS catching the swapped content for me through an incorrect certificate, and I will instead receive a perfectly valid favicon.ico file, except from a different site. There's also the whole CORS issue of issuing an XHR from our domain against youtube.com's domain, which means that if I want to get that favicon.ico I have to do it JSONP-style. However, even by using a plain old <img> I can't test the contents of the image because of CORS (see Get image data in JavaScript?), so I'm stuck with that approach.
Are there any proven and reliable ways of dealing with this situation and testing browser-level reachability towards a specific domain?
Cheers.
In general, web proxies that want to play nicely typically annotate the HTTP conversation with additional response headers that can be detected.
So one approach to building a man-in-the-middle detector may be to inspect those response headers and compare the results from when behind the MITM, and when not.
Many public websites will display the headers for an arbitrary request; redbot is one.
So perhaps you could ask the party whose content is being modified to visit a URL like: youtube favicon via redbot.
Once you gather enough samples, you could heuristically build a detector.
Also, some CDNs (eg, Akamai) will allow customers to visit a URL from remote proxy locations in their network. That might give better coverage, although they are unlikely to be behind a blocking firewall.
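As a rough sketch of the header-comparison idea above: collect the response headers once from a clean network and once from the suspect one, then look at what was added. The function names and the list of "known proxy headers" are assumptions for illustration; Via is a standard header that proxies may add, while X-Cache is only a common convention, and a real detector would need samples from the actual filter.

```python
def added_headers(baseline, observed):
    """Header names present in `observed` but not in `baseline` (case-insensitive)."""
    base = {name.lower() for name in baseline}
    return sorted({name.lower() for name in observed} - base)


def looks_intercepted(baseline, observed, known_proxy_headers=("via", "x-cache")):
    """Heuristic only: True if any added header is on the (assumed) proxy list."""
    return any(name in known_proxy_headers for name in added_headers(baseline, observed))


# Hypothetical samples: headers seen without and with a filtering proxy.
clean = {"Content-Type": "image/x-icon", "Content-Length": "1150"}
filtered = {"Content-Type": "text/html", "Content-Length": "512", "Via": "1.1 filterbox"}

print(added_headers(clean, filtered))
print(looks_intercepted(clean, filtered))
```

A proxy that does not annotate its responses would pass this check unnoticed, so it can only ever be one signal among several.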

Why do we need HTTPS when we send results to the user?

The reasons we need HTTPS (secured/encrypted data over the network):
We need to receive user-side data securely (whether via a form or a URL, whichever way the user sends data to the server over the network), which is done by HTTP + SSL encryption. In that case only the form, or whichever URL the user posts/sends data to, has to be a secure URL, not the page that I send to the browser. [E.g. when I need a customer registration form, I have to serve the form itself from the server over an https URL; if I don't, the browser gives a warning such as a mixed content error.] Couldn't browsers instead have had some sort of parameter to indicate that the form's target has to be a secure URL?
In some cases my server-side content must not be readable by anyone other than those I allow; for that I can use https to deliver the content, with extra security measures on the server side.
Other than these two scenarios I don't see any reason for having HTTPS-encrypted content over the network. Let's assume a site with 10+ css, 10+ js, 50+ images with 200k of content weight, and a total weight of maybe ~2 - 3MB. This whole content is encrypted, and I have no doubt this is going to be a min. of 100 - 280 connection creations between browser and server.
Please explain why we need to follow the way we deliver [most of us do it because browsers, search engines like Google, and W3C standards ask us to use it on every page].
why we need to follow the way we deliver
Because otherwise it's not secure. The browsers which warn about this are not wrong.
Let's assume a site with 10+ css, 10+ js
Just 1 .js served over non-HTTPS and a man-in-the-middle attacker could inject arbitrary code into your HTTPS page, from which origin they can completely control the user's interaction with your site. That's why browsers don't allow it, and give you the mixed content warning.
(And .css can have the same impact in many cases.)
Plus it's just plain bad security-usability to switch between HTTP and HTTPS for different pages. The user is likely to fail to notice the switch, and may be tricked into entering data into (or accepting data from) a non-HTTPS page. All the attacker would have to do is change one of the HTTPS links so that it points to HTTP instead, and the usual process would be subverted.
have no doubt this is going to be min. of 100 - 280 connection creation between browser and server.
HTTP[S] reuses connections. You don't pay the SSL handshake latency for every resource linked.
HTTPS is not expensive enough today for performance to be worth worrying about in a typical small web app.
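The connection reuse mentioned above can be observed locally with a small sketch. It uses plain HTTP on the loopback interface to stay self-contained (no certificate needed); with HTTPS the TLS handshake belongs to the connection, so a reused connection pays for it once. The names CountingHandler and fetch_many are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables persistent (keep-alive) connections
    connections = 0
    requests = 0

    def setup(self):
        # setup() runs once per accepted TCP connection.
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        CountingHandler.requests += 1
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_many(paths):
    """Fetch every path over ONE client connection; return (connections, requests)."""
    CountingHandler.connections = 0
    CountingHandler.requests = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        for path in paths:
            conn.request("GET", path)
            conn.getresponse().read()  # drain the body so the socket can be reused
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return CountingHandler.connections, CountingHandler.requests


print(fetch_many(["/a.css", "/b.js", "/c.png"]))
```

The server should report a single connection for the three requests. Browsers typically open a handful of connections per host rather than exactly one, but nowhere near one per resource.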

Why does Google Analytics use a GET request for a GIF rather than a POST request?

Google Analytics uses a GET request for a .gif image to send data to its server:
http://www.google-analytics.com/__utm.gif?utmwv=4&utmn=769876874&utmhn=example.com&utmcs=ISO-8859-1&utmsr=1280x1024&utmsc=32-bit&utmul=en-us&utmje=...
We can observe that all parameters are sent in this GET request, and the requested image itself is not useful (it's just a 1px by 1px image).
Known information: if the query string is large, Google switches to a POST request.
Now the question is: why not always use a POST request, irrespective of whether the query string is large or not?
Sending data via a GET request leads to a security issue, since the parameters will be stored in the browser history or in web server logs.
Could someone give any supporting reasons why Google Analytics depends on both methods?
Because GET requests are what you use for retrieving information without altering anything.
Please note that the use of POST has quite a few downsides: the browser usually warns against reloading a resource requested via POST (to prevent double data entry), and POST requests are not cached (which is why some analytics misuse it), proxied, etc.
If you want to retrieve a LOT of data using a URL (advice: rethink whether there might be a better option), then it's necessary to use POST. From Wikipedia:
There are times when HTTP GET is less suitable even for data retrieval. An example of this is when a great deal of data would need to be specified in the URL. Browsers and web servers can have limits on the length of the URL that they will handle without truncation or error. Percent-encoding of reserved characters in URLs and query strings can significantly increase their length, and while Apache HTTP Server can handle up to 4,000 characters in a URL, Microsoft Internet Explorer is limited to 2048 characters in any URL. Equally, HTTP GET should not be used where sensitive information, such as user names and passwords have to be submitted along with other data for the request to complete. In these cases, even if HTTPS is used to encrypt the message body, data in the URL will be passed in clear text and many servers, proxies, and browsers will log the full URL in a way where it might be visible to third parties. In these cases, HTTP POST should be used.
A POST request would require an ajax call, and it wouldn't work because of the same-origin policy (http://en.wikipedia.org/wiki/Same-origin_policy). But images can easily be cross-site, so they just need to add an img tag to the DOM with the required URL and the browser will load it, sending the needed information to their servers for tracking.
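A minimal sketch of the GET-with-fallback behaviour described in this thread: encode the parameters into the image URL, and switch to POST only when the URL gets too long. The endpoint, the function name plan_beacon, and the 2048-character threshold (taken from the Internet Explorer figure quoted above) are assumptions for illustration, not Google's actual logic.

```python
from urllib.parse import urlencode

MAX_URL_LENGTH = 2048  # assumed limit, based on the IE figure quoted above


def plan_beacon(endpoint, params, max_url_length=MAX_URL_LENGTH):
    """Return (method, url, body) for a tracking request.

    Short payloads go into the query string of a GET (usable as an <img> src);
    long ones fall back to POST with the same data as a form-encoded body.
    """
    query = urlencode(params)  # percent-encodes reserved characters
    url = f"{endpoint}?{query}"
    if len(url) <= max_url_length:
        return ("GET", url, None)
    return ("POST", endpoint, query)


# Hypothetical collector endpoint; parameter names borrowed from the question.
print(plan_beacon("https://collector.example/pixel.gif",
                  {"utmhn": "example.com", "utmsr": "1280x1024"}))
```

Because percent-encoding can expand a value considerably, the length check is done on the encoded URL rather than on the raw parameters.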

XMLHttpRequest over SSL from unsecure page

How secure would this setup be?
The unsecure page 'http://www.site.com' makes an XMLHttpRequest with POST
to the URL 'https://www.site.com/dosomething.asp'.
The page dosomething.asp has the header 'Access-Control-Allow-Origin: http://www.site.com' set
and returns some user-related data that needs to be secure.
No errors, all goes well.
How secure is the actual POST request?
How secure is the responseText from this request?
The most significant issue I can see is that your unsecure page is not secure (ok, obvious). If someone were to attempt a man-in-the-middle attack on that unsecure page, they could edit the functionality of the page (using JavaScript injection, etc.) to intercept the content being sent to and received from the secure URL. You are best off serving both pages in secure mode (SSL/TLS).
As soon as you introduce a non-SSL component to your application, you have lost all the benefits of SSL. You are only as secure as the weakest part. This is why browsers report mixed SSL/non-SSL content as a security alert to the user.
Wireshark is a program that monitors network packets traveling across a network. It's free and popular. The definitive way to answer this question would be to get Wireshark, take a day to learn it, and apply it.
The filter to see traffic from the source site would be:
(ip.src == [ip address of source]) && (ip.dst == [ip address of target])
Swap ip.src and ip.dst to see what's coming back. You could also combine both directions in one filter expression by joining the two with ||.
This would work provided that you're on the network through which the packets are traveling.
One final item: Here's a description of PKI (https/SSL/TLS): http://www.mitre.org/news/the_edge/february_01/steve.html
I Wiresharked a sort of similar situation, and verified I was sending and receiving TLS (https) traffic. But it wasn't this situation exactly so I don't want to speculate.

HTTPS, URL path, and query string

This is a follow up post of my previous question about BASIC auth over HTTPS
Are the path to the resource and query string passed securely to the server if I use HTTPS?
i.e.
URI: http://server/path/to/a/resource?with=a&query=string
Server: server
path: /path/to/a/resource
query string: with=a&query=string
This is a really good explanation of this: http://answers.google.com/answers/threadview/id/758002.html#answer
Summary: only the host and port would be visible unencrypted.
In short, yes. But you shouldn't store sensitive data in URLs, since it may be visible in the browser's history and in server log files. And anyone who looks over your shoulder sees it too.
Yes it is - the entire session is secured and encrypted, so anything you send, including the query string, is unreadable.
You can prove this to yourself, if you wish, by using something like Fiddler to view the http/https traffic you generate when you visit a secure URL. Anything you send over HTTPS will not show the query string, as shown here:
The actual URL I was visiting looked like this:
https://www.halifax-online.co.uk/_mem_bin/formslogin.asp?source=halifaxcouk&simigvis=
As per other answers, you shouldn't pass any sensitive information in the query string, as this may be stored in your web server's log files. If you were passing a username/password combination, anyone who could access your logs would be able to capture that information. This could allow someone to log into your site/application as if they were someone else, even if you were making efforts such as storing passwords in your database as salted hashes rather than plaintext.
HTTPS is simply HTTP tunnelled over an SSL connection. This means that the request, response, headers and content are all within the SSL tunnel and should therefore be encrypted.

Resources