IE8 XSS filter: what does it really do?

Internet Explorer 8 has a new security feature, an XSS filter that tries to intercept cross-site scripting attempts. It's described this way:
The XSS Filter, a feature new to Internet Explorer 8, detects JavaScript in URL and HTTP POST requests. If JavaScript is detected, the XSS Filter searches for evidence of reflection, information that would be returned to the attacking Web site if the attacking request were submitted unchanged. If reflection is detected, the XSS Filter sanitizes the original request so that the additional JavaScript cannot be executed.
I'm finding that the XSS filter kicks in even when there's no "evidence of reflection", and am starting to think that the filter simply notices when a request is made to another site and the response contains JavaScript.
But even that is hard to verify because the effect seems to come and go. IE has different zones, and just when I think I've reproduced the problem, the filter doesn't kick in anymore, and I don't know why.
Anyone have any tips on how to combat this? What is the filter really looking for? Is there any way for a good-guy to POST data to a 3rd-party site which can return HTML to be displayed in an iframe and not trigger the filter?
Background: I'm loading a JavaScript library from a 3rd-party site. That JavaScript harvests some data from the current HTML page, and posts it to the 3rd-party site, which responds with some HTML to be displayed in an iframe. To see it in action, visit an AOL Food page and click the "Print" icon just above the story.

What does it really do? It allows third parties to link to a messed-up version of your site.
It kicks in when [a few conditions are met and] it sees a string in the query submission that also exists verbatim in the page, and which it thinks might be dangerous.
It assumes that if <script>something()</script> exists in both the query string and the page code, then it must be because your server-side script is insecure and reflected that string straight back out as markup without escaping.
But of course, apart from the fact that it's a perfectly valid query someone might have typed that matches by coincidence, it's also just as possible that they match because someone looked at the page and deliberately copied part of it out. For example:
http://www.bing.com/search?q=%3Cscript+type%3D%22text%2Fjavascript%22%3E
Follow that in IE8 and I've successfully sabotaged your Bing page so it'll give script errors, and the pop-out result bits won't work. Essentially it gives an attacker whose link is being followed license to pick out and disable parts of the page he doesn't like — and that might even include other security-related measures like framebuster scripts.
What does IE8 consider ‘potentially dangerous’? A lot more, and a lot stranger, things than just this script tag. What's more, it appears to match against a set of ‘dangerous’ templates using a text pattern system (presumably regex), instead of any kind of HTML parser like the one that will eventually parse the page itself. Yes, use IE8 and your browser is pařṣinͅg HT̈́͜ML w̧̼̜it̏̔h ͙r̿e̴̬g̉̆e͎x͍͔̑̃̽̚.
‘XSS protection’ by looking at the strings in the query is utterly bogus. It can't be ‘fixed’; the very concept is intrinsically flawed. Apart from the problem of stepping in when it's not wanted, it can't ever really protect you from anything but the most basic attacks — and attackers will surely work around such blocks as IE8 becomes more widely used. If you've been forgetting to escape your HTML output correctly you'll still be vulnerable; all XSS “protection” has to offer you is a false sense of security. Unfortunately Microsoft seem to like this false sense of security; there is similar XSS “protection” in ASP.NET too, on the server side.
So if you've got a clue about webapp authoring and you've been properly escaping output to HTML like a good boy, it's definitely a good idea to disable this unwanted, unworkable, wrong-headed intrusion by outputting the header:
X-XSS-Protection: 0
in your HTTP responses. (And using ValidateRequest="false" in your pages if you're using ASP.NET.)
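For example, here is a minimal sketch of sending that header from a Node.js backend (purely illustrative; a PHP header() call or a <customHeaders> entry in an ASP.NET web.config does the same job):
// Minimal Node.js sketch: opt every response out of the IE8 XSS filter.
const http = require('http');

http.createServer((req, res) => {
  res.setHeader('X-XSS-Protection', '0');                    // disable the filter for this response
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end('<p>Hello</p>');
}).listen(8080);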
For everyone else, who still slings strings together in PHP without taking care to encode properly... well you might as well leave it on. Don't expect it to actually protect your users, but your site is already broken, so who cares if it breaks a little more, right?
To see it in action, visit an AOL Food page and click the "Print" icon just above the story.
Ah yes, I can see this breaking in IE8. It's not immediately obvious where IE has altered the content to stop it executing, though... the only cross-domain request I can see that's a candidate for the XSS filter is this one to http://h30405.www3.hp.com/print/start:
POST /print/start HTTP/1.1
Host: h30405.www3.hp.com
Referer: http://recipe.aol.com/recipe/oatmeal-butter-cookies/142275?
csrfmiddlewaretoken=undefined&characterset=utf-8&location=http%253A%2F%2Frecipe.aol.com%2Frecipe%2Foatmeal-butter-cookies%2F142275&template=recipe&blocks=Dd%3Do%7Efsp%7E%7B%3D%25%3F%3D%3C%28%2B.%2F%2C%28%3D3%3F%3D%7Dsp%7Ct#kfoz%3D%25%3F%3D%7E%7C%7Czqk%7Cpspm%3Db3%3Fd%3Do%7Efsp%7E%7B%3D%25%3F%3D%3C%7D%2F%27%2B%2C.%3D3%3F%3D%7Dsp%7Ct#kfoz%3D%25%3F%3D%7E%7C%7Czqk...
That blocks parameter continues with pages more of the same gibberish. Presumably there is something in there that (by coincidence?) is reflected in the returned HTML and triggers one of IE8's messed-up ideas of what an XSS exploit looks like.
To fix this, HP need to make the server at h30405.www3.hp.com include the X-XSS-Protection: 0 header.

You should send me (ericlaw#microsoft) a network capture (www.fiddlercap.com) of the scenario you think is incorrect.
The XSS filter works as follows:
1. Is XSSFILTER enabled for this process?
If yes – proceed to next check
If no – bypass XSS Filter and continue loading
2. Is it a "document" load (like a frame, not a subdownload)?
If yes – proceed to next check
If no – bypass XSS Filter and continue loading
3. Is it a HTTP/HTTPS request?
If yes – proceed to next check
If no – bypass XSS Filter and continue loading
4. Does the RESPONSE contain an x-xss-protection header?
Value = 1: XSS Filter enabled (no URLAction check)
Value = 0: XSS Filter disabled (no URLAction check)
No header: proceed to next check
5. Is the site loading in a Zone where the URLAction enables XSS filtering? (By default: Internet, Trusted, Restricted)
If yes – proceed to next check
If no – bypass XSS Filter and continue loading
6. Is it a cross-site request? (Referer header: does the final (post-redirect) fully-qualified domain name in the HTTP Referer header match the fully-qualified domain name of the URL being retrieved?)
If yes – bypass XSS Filter and continue loading
If no – the URL in the request should be neutered.
7. Does the heuristic indicate that the RESPONSE data came from unsafe REQUEST data?
If yes – modify the response.
Now, the exact details of #7 are quite complicated, but basically, you can imagine that IE does a match of request data (URL/Post Body) to response data (script bodies) and if they match, then the response data will be modified.
In your site's case, you'll want to look at the body of the POST to http://h30405.www3.hp.com/print/start and the corresponding response.

Actually, it's worse than it might seem. The XSS filter can make safe sites unsafe. Read here:
http://www.h-online.com/security/news/item/Security-feature-of-Internet-Explorer-8-unsafe-868837.html
From that article:
However, Google disables IE's XSS filter by sending the X-XSS-Protection: 0 header, which makes it immune.
I don't know enough about your site to judge whether this is a solution for you, but it's probably worth trying.
More in depth, technical discussion of the filter, and how to disable it is here: http://michael-coates.blogspot.com/2009/11/ie8-xss-filter-bug.html

Related

Data URI and a potentially dangerous Request.Path value

I have tried using a Data URI with this CSS property:
background-image: url("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAA9JREFUeNpiYGBg8AUIMAAAUgBOUWVeTwAAAABJRU5ErkJggg==");
And locally it works fine.
However, when I am debugging, the file appears to be missing in Chrome. If I try to navigate to it, I get: A potentially dangerous Request.Path value was detected from the client (:).
So obviously my application considers the URI for this image suspicious.
How do I get it to show?
I tried relaxing the validation using:
<httpRuntime requestPathInvalidCharacters="" requestValidationMode="2.0" />
<pages validateRequest="false"></pages>
Ideally I wouldn't want to relax the rules too much, only enough to get these data URI images loading.
I would bet that the application considers the request suspicious because of the Base-64 encoded URI. Encoding malicious URLs in Base-64 is a common strategy by attackers to get URLs through front end filters that strip and/or escape URLs, and to obscure the request from any humans reading the code. XSS attacks are commonly done by getting one of these URIs stored in a database and served back to other users.
Because of the high risks of XSS these days, I would hesitate to disable the check. If you can, just use a non-encoded URI. If you can't, you should ask yourself why. If you are trying to enhance security by obfuscating the URI, do know that this is very trivial for an attacker to decode. It is not any form of encryption, just a different way to represent data.
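To illustrate that last point, Base64 can be reversed with one built-in call; there is no secret involved:
// Base64 is an encoding, not encryption: anyone can undo it instantly.
var encoded = btoa('http://example.com/?q=alert(1)');   // e.g. "aHR0cDov..." - looks opaque, but isn't
var decoded = atob(encoded);                            // recovers the original string exactly
console.log(decoded);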

A method for standardizing URI/URLs in Ruby in a forgiving manner

I'm trying to find a method to take a URI/URL string from a user and determine a working, canonical form (or failing if the resource isn't valid). Simultaneously, it should also verify that the URL exists. So we're checking for both valid "syntax" and also existence.
For instance, a string like google.com should be turned into http://www.google.com, and a string like google.com/insights should be turned into http://www.google.com/insights. A string like http://thiswebsitedoesntexistatall.com should return some sort of error or exception.
I believe a portion of the solution may likely be calling an HTTP get_response() method and following redirects until I get a 200 OK status.
It seems like the URI.parse() method is not forgiving of leaving off the http. I realize I could write a simple thing to try adding http in front, etc., but I was hoping there was some existing gem or little-known library function that would be really forgiving about URLs and canonicalize them for me.
Both the built in net/http and HTTParty seem to be too strict for what I'm looking for. Is there a nice way to do this?
There are some problems with what you're asking for:
A URL parser shouldn't assume the value passed in is HTTP, when FTP and many other protocols are equally valid. If you know the protocol is likely to be HTTP, then you need to add the protocol.
If you try to connect to a site and follow redirects until you get a 200 response, you've only proven that the URL resolves to a valid page of some sort. That 200 could be an error page returned because the one you want is a dead link or invalid, or because the site is temporarily down. To prove otherwise you have to have some intimate pre-knowledge about the page you're looking for, such as specific content to search for.
Assuming the URL is good after you follow the redirects is not quite safe. Many sites add all sorts of session data to the URL, so what starts as a simple and clean URL can resolve to a long and convoluted one.
I'd recommend you look at the Addressable::URI gem. It's much more full-featured than Ruby's URI. It won't make the decisions for you, but at least it will give you a more complete API and can rewrite/normalize URLs. Cleaning them up and/or determining if they are good is still left as an exercise for you.

What does a Ajax call response like 'for (;;); { json data }' mean? [duplicate]

Possible Duplicate:
Why do people put code like “throw 1; <dont be evil>” and “for(;;);” in front of json responses?
I found this kind of syntax being used on Facebook for Ajax calls. I'm confused on the for (;;); part in the beginning of response. What is it used for?
This is the call and response:
GET http://0.131.channel.facebook.com/x/1476579705/51033089/false/p_1524926084=0
Response:
for (;;);{"t":"continue"}
I suspect the primary reason it's there is control. It forces you to retrieve the data via Ajax, not via JSON-P or similar (which uses script tags, and so would fail because that for loop is infinite), and thus ensures that the Same Origin Policy kicks in. This lets them control what documents can issue calls to the API — specifically, only documents that have the same origin as that API call, or ones that Facebook specifically grants access to via CORS (on browsers that support CORS). So you have to request the data via a mechanism where the browser will enforce the SOP, and you have to know about that preface and remove it before deserializing the data.
So yeah, it's about controlling (useful) access to that data.
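To make that last point concrete, here is a minimal sketch of consuming such a response, assuming the prefix is exactly the string for (;;); and using a placeholder URL:
var GUARD_PREFIX = 'for (;;);';

// Request the response as plain text, strip the guard prefix, then parse the
// remainder as JSON. The data is never executed as script, so the loop never runs.
$.ajax({
  url: '/x/some/api/endpoint',   // placeholder URL
  dataType: 'text',
  success: function (text) {
    if (text.indexOf(GUARD_PREFIX) === 0) {
      text = text.slice(GUARD_PREFIX.length);
    }
    var data = JSON.parse(text);
    console.log(data.t);         // "continue" in the example above
  }
});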
Facebook has a ton of developers working internally on a lot of projects, and it is very common for someone to make a minor mistake, whether it be something as simple and serious as failing to escape data inserted into an HTML or SQL template, or something as intricate and subtle as using eval (sometimes inefficient and arguably insecure) or JSON.parse (a compliant but not universally implemented extension) instead of a "known good" JSON decoder. Given that, it is important to figure out ways to easily enforce best practices on this developer population.
To face this challenge, Facebook has recently been going "all out" with internal projects designed to gracefully enforce these best practices, and to be honest the only explanation that truly makes sense for this specific case is just that: someone internally decided that all JSON parsing should go through a single implementation in their core library, and the best way to enforce that is for every single API response to get for(;;); automatically tacked on the front.
In so doing, a developer can't be "lazy": they will notice immediately if they use eval(), wonder what is up, and then realize their mistake and use the approved JSON API.
The other answers being provided seem to all fall into one of two categories:
misunderstanding JSONP, or
misunderstanding "JSON hijacking".
Those in the first category rely on the idea that an attacker can somehow make a request "using JSONP" to an API that doesn't support it. JSONP is a protocol that must be supported on both the server and the client: it requires the server to return something akin to myFunction({"t":"continue"}) such that the result is passed to a local function. You can't just "use JSONP" by accident.
Those in the second category are citing a very real vulnerability that has been described allowing a cross-site request forgery via <script> tags to APIs that do not use JSONP (such as this one), allowing a form of "JSON hijacking". This is done by changing the Array/Object constructor, which allows one to access the information being returned from the server without a wrapping function.
However, that is simply not possible in this case: the reason it works at all is that a bare array (one possible result of many JSON APIs, such as the famous Gmail example) is a valid expression statement, which is not true of a bare object.
In fact, the syntax for objects defined by JSON (which includes quotation marks around the field names, as seen in this example) conflicts with the syntax for blocks, and therefore cannot be used at the top-level of a script.
js> {"t":"continue"}
typein:2: SyntaxError: invalid label:
typein:2: {"t":"continue"}
typein:2: ....^
For this example to be exploitable by way of Object() constructor remapping, it would require the API to have instead returned the object inside of a set of parentheses, making it valid JavaScript (but then not valid JSON).
js> ({"t":"continue"})
[object Object]
Now, it could be that this for(;;); prefix trick is only "accidentally" showing up in this example, and is in fact being returned by other internal Facebook APIs that are returning arrays; but in this case that should really be noted, as that would then be the "real" cause for why for(;;); is appearing in this specific snippet.
Well the for(;;); is an infinite loop (you can use Chrome's JavaScript console to run that code in a tab if you want, and then watch the CPU-usage in the task manager go through the roof until the browser kills the tab).
So I suspect that maybe it is being put there to frustrate anyone attempting to parse the response using eval or any other technique that executes the returned data.
To explain further, it used to be fairly commonplace to parse a bit of JSON-formatted data using JavaScript's eval() function, by doing something like:
var parsedJson = eval('(' + jsonString + ')');
...this is considered unsafe, however, because if for some reason your JSON-formatted data contains executable JavaScript code instead of (or in addition to) JSON-formatted data, then that code will be executed by eval(). This means that if you are talking to an untrusted server, or if someone compromises a trusted server, they can run arbitrary code on your page.
Because of this, using things like eval() to parse JSON-formatted data is generally frowned upon, and the for(;;); statement in the Facebook JSON will prevent people from parsing the data that way. Anyone that tries will get an infinite loop. So essentially, it's like Facebook is trying to enforce that people work with its API in a way that doesn't leave them vulnerable to future exploits that try to hijack the Facebook API to use as a vector.
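For comparison, here is a small sketch of the two approaches side by side (the sample string is the response above with the prefix already removed):
var jsonString = '{"t":"continue"}';

// Legacy approach: executes whatever the string happens to contain.
var viaEval = eval('(' + jsonString + ')');

// Safer approach: treats the text strictly as data and never runs it.
var viaParse = JSON.parse(jsonString);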
I'm a bit late and T.J. has basically solved the mystery, but I thought I'd share a great paper on this particular topic that has good examples and provides deeper insight into this mechanism.
These infinite loops are a countermeasure against "Javascript hijacking", a type of attack that gained public attention with an attack on Gmail that was published by Jeremiah Grossman.
The idea is as simple as it is beautiful: a lot of users tend to be permanently logged in to Gmail or Facebook. So you set up a site, and in your malicious site's JavaScript you override the object or array constructor:
function Object() {
    // Make an Ajax request to your malicious site, exposing the object data
}
Then you include a <script> tag on your malicious site, such as:
<script src="http://www.example.com/object.json"></script>
And finally you can read all about the JSON objects in your malicious server's logs.
As promised, the link to the paper.
This looks like a hack to prevent a CSRF attack. There are browser-specific ways to hook into object creation, so a malicious website could do that first, and then have the following:
<script src="http://0.131.channel.facebook.com/x/1476579705/51033089/false/p_1524926084=0" />
If there weren't an infinite loop before the JSON, an object would be created, since JSON can be eval()ed as javascript, and the hooks would detect it and sniff the object members.
Now if you visit that site from a browser, while logged into Facebook, it can get at your data as if it were you, and then send it back to its own server via e.g., an AJAX or javascript post.

GET vs POST in Ajax

What is the difference between GET and POST for Ajax requests?
I don't see any difference between the two, except that when I use GET, the parameters are sent in the URL. That doesn't really make any difference to me, since all requests are made in the background and the user doesn't notice any difference.
edit:
What are PUT and DELETE methods used for?
GET is designed for getting data from the server. POST (and lesser-known friends PUT and DELETE) are designed for modifying data on the server.
A GET request should never cause data to be removed from an application. If you have a link you can click on with a GET to remove data, then Google spidering your site could click on all your "Delete" links.
The canonical answer can be found here, which quotes the HTML 2.0 spec:
If the processing of a form is idempotent (i.e. it has no lasting observable effect on the state of the world), then the form method should be GET. Many database searches have no visible side-effects and make ideal applications of query forms.
If the service associated with the processing of a form has side effects (for example, modification of a database or subscription to a service), the method should be POST.
In your AJAX call, you need to use whatever method your server supports. You should always design your server so that operations that modify data are called by POST/PUT/DELETE. Other comments have links to REST, which generally maps C/R/U/D to "POST or PUT"(Create)/GET(Read)/PUT(Update)/DELETE(Delete).
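As a rough illustration of that mapping (the /items URLs here are hypothetical):
$.ajax({ url: '/items',   type: 'POST',   data: { name: 'widget' } }); // Create
$.ajax({ url: '/items/1', type: 'GET' });                              // Read
$.ajax({ url: '/items/1', type: 'PUT',    data: { name: 'gadget' } }); // Update
$.ajax({ url: '/items/1', type: 'DELETE' });                           // Delete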
If you're sending large amounts of data, or sensitive data over HTTPS, you will want to use POST. If it's just a simple parameter, I would use GET.
GET requests have a limit to the amount of data that can be sent. I forget the exact number, but this can cause issues if you're sending anything substantial.
Basically, the difference between GET and POST is that in a GET request the parameters are passed in the URL, whereas in a POST the parameters are included in the message body.
Whether it's AJAX or not is irrelevant. It's about the action that you're taking. I'd recommend following the principles of REST, which has further provisions for updating, deleting, etc.
GET requests are easier to exploit in CSRF (cross-site request forgery) attacks. Namely, fake POST requests require JavaScript to be enabled on the user's side, while fake GET requests are still possible with just img or script tags.
Many web servers limit the length of the data that can be passed as part of the URL, so the GET request may break in odd ways that are hard to debug.
Also, most server software logs URLs in the access logs, so if you pass sensitive information (such as passwords) in a GET request, this will in all likelihood be written to disk in plaintext.
From a REST perspective, GET requests should have no side-effects -- they shouldn't modify data. So, if you're just GETting a resource by ID, this makes sense, but if you're committing changes to a resource, you should be using PUT, POST, or PATCH for the HTTP verb.
Both are used to send some data and receive some response using that data.
GET: retrieves information stored on the server, e.g. a search, a tweet, a person's information. If you want to send information with a GET request, it goes in the URL, e.g. process.php?name=subroto.
So GET basically sends information through the URL. A URL can't handle much more than about 2083 characters (the Internet Explorer limit), so something like a blog post wouldn't fit.
POST: does the same kind of job as GET, but is used for things like user registration, user login, sending big data, and posting a blog entry.
If you need to send sensitive information, or a large amount of data, use POST, since it doesn't go through the URL.
AJAX: $.get() and $.post() contain features that are subsets of $.ajax(), which has much more configuration.
The $.get() method is a kind of shorthand for $.ajax(). When using $.get(), instead of passing in an options object, you pass in arguments. At minimum, you'll need the first two arguments: the URL of the file you want to retrieve (e.g. 'test.txt') and a success callback.
Summary:
$.get( url [, data ] [, success ] [, dataType ] )
$.post( url [, data ] [, success ] [, dataType ] ) // typically for larger or more sensitive data
$.ajax( url [, settings ] ) // much more configuration available
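For instance, a minimal usage sketch (the file name and the #result element are just placeholders):
// Shorthand form: URL plus a success callback.
$.get('test.txt', function (data) {
  $('#result').text(data);
});

// The same request written out with $.ajax() and explicit settings.
$.ajax({
  url: 'test.txt',
  type: 'GET',
  success: function (data) {
    $('#result').text(data);
  }
});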
First, general information: use GET if you only read data; use POST if you change something in a database, text files, etc.
But the problem is that some browsers cache GET results. I had problems with AJAX requests in IE7, and eventually I found out that the browser was caching GET results. I rethought the flow and changed my requests to POST.
So, don't use GET if you don't want caching.
(Of course you can disable caching in GET operations. But I didn't prefer it)
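If you do want to keep using GET, jQuery can do the cache-busting for you; a small sketch with a placeholder URL:
// cache: false makes jQuery append a timestamp parameter ("_=...") to the URL,
// so the browser cannot serve a stale cached copy of the GET response.
$.ajax({
  url: 'data.php',
  type: 'GET',
  cache: false,
  success: function (data) {
    console.log(data);
  }
});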
Personally, I prefer POST. I reserve GET for cases where I know the values sent are limited to data I control, for example retrieving an item by its id: "getitem?id=123", "deleteItem?id=123", and so on. For the other cases, when a user fills in a form, I prefer POST.
As Ryan Smith has said, it's better to use POST to send a large amount of data, and there are fewer worries when the data involves other languages or special characters (generally all major JavaScript frameworks should have no problem dealing with that, but I think it is still less worry to use POST).
From the REST perspective, in my opinion, you can adopt it on a new project (to keep consistency across the entire project).
Finally, some tools on a network (URL loggers, e.g. to see whether employees waste time on non-authorised sites, proxies, or any other kind of tool) can intercept the query. Some will show in their reports the parameters you sent with GET, treating each as a different web page. But whether that is a problem in your situation changes from one project to another! ;)
The difference is the same between GET and POST whether you're using Ajax, HTML forms, or curl. Here are the relevant definitions:
GET
POST
If you are passing on any arguments with characters that can get messed up in the URL (such as spaces), you use POST. Otherwise you can use GET.
Generally, if you're just passing on a few tiny arguments you would use GET. But for passing on user-submitted information such as blog entries, text, etc., it's a good practice to use POST.
There are also certain frameworks that rely completely on segment-based URLs (such as site.com/products/133 rather than site.com/products.php?id=133), and these frameworks unset the GET variables for security. In such cases you would use POST all the time.

Ajax and filenames - Best practices

I am using jQuery to call PHP files via the $.get method
function fetchDepartment(company_id)
{
    $.get("ajax/fetchDepartment.php?sec=departments&company_id=" + company_id, function(data) {
        $("#department_id").html(data);
    });
}
What I am wondering is: can I secure the filename even further?
Currently I have a global access check within the .php file that checks whether the user is logged in, whether he can access this data, etc.
But I am wondering if there are further steps I can take so that a user can't see this filename, or what other steps you would recommend taking.
Encoded requests
You could make the request details effectively invisible to the casual miscreant by encoding almost all of the URL and then decoding the request details server-side.
The request details would include the action you wish to perform plus the parameters relevant to that action.
All requests would be sent to a single URL, where a server-side process would decode the request details and perform the relevant action as required.
Example Original URL:
/ajax/delete.php?parameter1=foo&parameter2=bar
Request details:
action=delete&parameter1=foo&parameter2=bar
Encoded request details (encoded using base64):
YWN0aW9uPWRlbGV0ZSZwYXJhbWV0ZXIxPWZvbyZwYXJhbWV0ZXIyPWJhcg==
Encoded URL:
/ajax/?request=YWN0aW9uPWRlbGV0ZSZwYXJhbWV0ZXIxPWZvbyZwYXJhbWV0ZXIyPWJhcg==
Browsers provide btoa() and atob() for Base64 encoding and decoding in JavaScript, and it's far from impossible to find a suitable method or write your own where they aren't available.
With obfuscated/minified client-side JavaScript it would be quite tricky for someone to determine how to make a request 'by hand'.
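A rough client-side sketch of the idea (the ajax/ endpoint and the action/parameter names are placeholders; the server side would Base64-decode the request parameter and dispatch on the decoded action):
// Serialize the request details, Base64-encode them with btoa(), and send
// everything to one generic endpoint.
function sendEncodedRequest(action, params) {
  var details = $.param($.extend({ action: action }, params)); // "action=delete&parameter1=foo&..."
  var encoded = btoa(details);
  return $.get('ajax/', { request: encoded });
}

sendEncodedRequest('delete', { parameter1: 'foo', parameter2: 'bar' });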
Hide implementation details
There are a number of practices you can follow to make your application less susceptible to attack through URL misuse.
Let's start with a URL of: ajax/fetchDepartment.php?sec=departments&company_id=99
There's no need to reveal what server-side technology you're using (PHP) nor, through the query string (sec, company_id), what the query string values actually mean.
Masking the server-side technology
Assuming you have index.php defined as a default, the following URLs are equivalent:
ajax/fetchDepartment.php?sec=departments&company_id=99
ajax/fetchDepartment/index.php?sec=departments&company_id=99
ajax/fetchDepartment/?sec=departments&company_id=99
The third URL does not reveal the server-side technology you're using. This limits the range of possible attacks. It also makes it easier for you to switch over to a different server-side technology without changing your URLs.
Hiding the meaning of request parameters
ajax/fetchDepartment/?sec=departments&company_id=99
ajax/99/departments/
The latter URL still conveys enough information to perform the request without revealing what the information means.
Whilst someone could still change the values, they won't know what they're changing. This will make it more difficult for an attacker to evaluate and understand the result of any URL changes they make.
Pretty much the only way you can hide the URL for a certain piece of information from the user is by not loading it over HTTP at all. Maybe you can load the data as part of the calling page, or from another page with a more generic URL.
