Generic way to ask which browser they're using? - user-interface

I would like a text input with the question "what browser are you using" above it. Then when a form is submitted, I'd like to compare their answer to their User-Agent HTTP header.
I am stumped on how to reliably make this work.
I could ask them to spell it out instead of using acronyms like IE or FF, but Internet Explorer uses "MSIE" as its' identifier doesn't it?
Another thought I had was to keep a pool of User-Agent strings, then present them with a select element that has theirs inserted randomly among 4 or so other random strings and asking them to select theirs. I fear non-tech-savy users would bungle this enough times for it to be a problem though. I suppose I could use some logic to make sure there's only one of each browser type among the options, but I'm leery about even that.

Why would You want to ask user about its User-Agent?
Pulling appropriate http header - as you've mentioned, should be enough.
But if you need that badly, I'd go for
regular expressions - for checking http user-agent header and cutting out unimportant info from that header
present possible match based on the previous step
ask user if the match is correct, if not, let him enter its own answer
then I'd try to match what he entered to some dictionary values, so as to be able to enter IE and MSIE and get the same result.
The above seems vague enough :) and abstract, maybe you could provide an explanation - why you want that? maybe there is some other way?

Remember: The client sends the HTTP header and potentially the user can put anything in User Agent. So if you want to catch people who "lie about" the browser they are using, you will only catch those who cannot modify the HTTP header before they send it.

You can neither 100% trust the user input nor the string that the browser sends in the HTTP headers...

The obvious question is why you want to ask the user what browser they are using?
But given that:
a) Normalise the user string: lower-case, remove spaces, remove numbers?
b) Build a map between the normalised strings, and user-agent strings.
When you do a lookup, if the normalised string, or the user-agent string is not in the map, pass it to a human to add to the map with appropriate mapping.
Possibly you'll want to normalise the user-agent in some way as well?

Related

Should JSON data contain formatted data?

When using JSON to populate a section of a page I often encounter that data needs special formatting - formatting that need to match that already on the page, which is done serverside.
A number might need to be formatted as a currency, a special date format or wrapped in for negative values.
But where should this formatting take place - doing it clientside will mean that I need to replicate all the formatting that takes place on the serverside. Doing it serverside and placing the formatted values in the JSON object means a less generic and reusable data set.
What is the recommended approach here?
The generic answer is to format data as late/as close to the user as is possible (or perhaps "practical" is a better term).
Irritatingly this means that its an "it depends" answer - and you've more or less already identified the compromise you're going to have to make i.e. do you remove flexibility/portability by formatting server side or do you potentially introduct duplication by doing it client side.
Personally I would tend towards client side unless there's a very good reason not to do so - simply because we're back to trying to format stuff as close to the user as possible, though I would be a bit concerned about making sure that I'm applying the right formatting rules in the browser.
JSON supports the following basic types:
Numbers,
Strings,
Boolean,
Arrays,
Objects
and Null (empty).
A currency is usually nothing else than a number, but formatted according to country-specific rules. And dates are not (yet) included in JSON at all.
Whatever is recommendable depends on what you do in your application and what kind of JScript libraries you are already using. If you are already formatting alot of data in your server side code, then add it there. If not, and you already have some classes included, which can cope with formatting (JQuery and MooTools have some capabilities), do it in the browser.
So either format them in the client or format them before sending them over - both solutions work.
If you want to delve deeper into this, i recommend this wikipedia article about JSON.

Remove Signature from Received Message

I have a python script that receives text messages from users, and processes them as a query. However, some users have signatures automatically appended to their messages, and the script incorrectly treats them as actual content. What's the best programmatic way to recognize and remove these signatures?
(I'd prefer in python, but am fine with any other language too, as well as just saying it in pseudocode)
If the signatures are appended to the body of the message such that they're actually part of the body text, then there are only two ways to remove them:
Heuristics, such as "anything following three dashes must be a signature". These may be effective if you spend some time tuning them.
A classifier. This is a lot of work to set up, and requires that you "train" it by marking some message parts as signatures. These can also be very effective, but like heuristics will never work 100% of the time.
If the signature always follows a specific pattern, you should be able to just use a regular expression to trim it off.
However, if the user can setup their signature any way they wish, and there is no leading characters (ie: -- at the beginning), this is going to be very difficult. The only reliable way to do this would be to know the content of the signature for each user in advance so you can strip it out. Imagine a worst-case scenario: Somebody could always send a blank message, with a signature that was a fully valid "query". There'd be no way for the script to differentiate that from a "query" message with no signature.

filtering user input in php

Am wondering if the combination of trim(), strip_tags() and addslashes() is enough to filter values of variables from $_GET and $_POST
That depends what kind of validation you are wanting to perform.
Here are some basic examples:
If the data is going to be used in MySQL queries make sure to use mysql_real_escape_query() on the data instead of addslashes().
If it contains file paths be sure to remove the "../" parts and block access to sensitive filename.
If you are going to display the data on a web page, make sure to use htmlspecialchars() on it.
But the most important validation is only accepting the values you are expecting, in other words: only allow numeric values when you are expecting numbers, etc.
Short answer: no.
Long answer: it depends.
Basically you can't say that a certain amount of filtering is or isn't sufficient without considering what you want to do with it. For example, the above will allow through "javascript:dostuff();", which might be OK or it might not if you happen to use one of those GET or POST values in the href attribute of a link.
Likewise you might have a rich text area where users can edit so stripping tags out of that doesn't exactly make sense.
I guess what I'm trying to say is that there is simple set of steps to sanitizing your data such that you can cross it off and say "done". You always have to consider what that data is doing.
It highly depends where you are going to use it for.
If you are going to display things as HTML, make absolutely sure you are properly specifying the encoding (e.g.: UTF-8). As long as you strip all tags, you should be fine.
For use in SQL queries, addslashes is not enough! If you use the mysqli library for example, you want to look at mysql::real_escape_string. For other DB libraries, use the designated escape function!
If you are going to use the string in javascript, addslashes will not be enough.
If you are paranoid about browser bugs, check out the OWASP Reform library
If you use the data in another context than HTML, other escaping techniques apply.

Is freemarker safe when allow user edit their template

I'm new with freemarker, I need know about this problem too choose it or not, I will strip XSS by myself but I don't know are other features of freemarker safe when site allow user edit their template?
Oh, goodness no! This is basically equivalent to allowing the user to evaluate arbitrary code. Removing XSS after the fact only removes one potential vulnerability. They'll still be able to do plenty of other things like manipulate POST parameters or perform page redirects.
John is right. And letting the user actually edit freemarker templates themselves seems odd. If you are outputting user input again (like displaying the search term on the results page) I'd suggest using the using the ?html string built-in, it'll save you from the most rudimentary xss attacks (e.g. "you searched for '${term?html}'").
So as others said, it's not safe. However, if those users are employees at your company or something like that (i.e., if they are easily accountable for malevolent actions) then it's not entirely out of question. For more details see: http://freemarker.org/docs/app_faq.html#faq_template_uploading_security

Do you validate your URL variables?

When you're passing variables through your site using GET requests, do you validate (regular expressions, filters, etc.) them before you use them?
Say you have the URL http://www.example.com/i=45&p=custform. You know that "i" will always be an integer and "p" will always contain only letters and/or numbers. Is it worth the time to make sure that no one has attempted to manipulate the values and then resubmit the page?
Yes. Without a doubt. Never trust user input.
To improve the user experience, input fields can (and IMHO should) be validated on the client. This can pre-empt a round trip to the server that only leads to the same form and an error message.
However, input must always be validated on the server side since the user can just change the input data manually in the GET url or send crafted POST data.
In a worst case scenario you can end up with an SQL injection, or even worse, a XSS vulnerability.
Most frameworks already have some builtin way to clean the input, but even without this it's usually very easy to clean the input using a combination of regular exceptions and lookup tables.
Say you know it's an integer, use int.Parse or match it against the regex "^\d+$".
If it's a string and the choices are limited, make a dictionary and run the string through it. If you don't get a match change the string to a default.
If it's a user specified string, match it against a strict regex like "^\w+$"
As with any user input it is extremely important to check to make sure it is what you expect it is. So yes!
Yes, yes, and thrice yes.
Many web frameworks will do this for you of course, e.g., Struts 2.
One important reason is to check for sql injection.
So yes, always sanitize user input.
not just what the others are saying. Imagine a querystring variable called nc, which can be seen to have values of 10, 50 and 100 when the user selects 10, 50 and 100 results per page respectively. Now imagine someone changing this to 50000. If you are just checking that to be an integer, you will be showing 50000 results per page, affecting your pageviews, server loads, script times and so on. Plus this could be your entire database. When you have such rules (10, 50 or 100 results per page), you should additionaly check to see if the value of nr is 10, 50 or 100 only, and if not, set it to a default. This can simply be a min(nc, 100), so it will work if nc is changed to 25, 75 and so on, but will default to 100 if it sees anything above 100.
I want to stress how important this is. I know the first answer discussed SQL Injection and XSS Vulnerabilities. The latest rave in SQL Injection is passing a binary encoded SQL statement in query strings, which if it finds a SQL injection hole, it will add a http://reallybadsite.com'/> to every text field in your database.
As web developers we have to validate all input, and clean all the output.
Remember a hacker isn't going to use IE to compromise your site, so you can't rely on any validation in the web.
Yes, check them as thoroughly as you can. In PHP I always check the types (IsInt(i), IsString(p)).

Resources