It seems I'm coding contact forms for someone or another every week. I've developed a validate/mail client/mail thankyou/save to DB/thankspage process that's fairly robust but I was wondering what lessons others have for performing this most common of website tasks?
The best hint I would give you is to come up with a good way to prevent spam. We use a honeypot technique, which we find very effective and very simple to implement. It involves adding two extra text fields to each form, hidden by CSS. One of these fields (let's call it input1) has a set value which you can check for on your processing page; the other field (input2) has its value left empty. A lot of automated spam comes from bots that detect the presence of a form and autofill every field. With our technique, simply by checking that input1 still has its initial value and that input2 still has a value of "" we can guess the form was not autofilled by a bot or similar. We do have more advanced techniques for checking the content of each field, but as a simple trick this one is a real winner and has almost completely eradicated automated contact form spam.
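As a rough illustration of the honeypot check described above, here is a minimal Python sketch. It assumes the submitted form arrives as a plain dict; the field names input1 and input2 come from the description, while the function name and the preset value "keep-me" are hypothetical.

```python
def looks_like_bot(form, expected_token="keep-me"):
    """Return True when the honeypot fields suggest the form was autofilled.

    `form` is a plain dict of submitted fields. `expected_token` is a
    hypothetical preset value that the page renders into input1.
    """
    # input1 is rendered with a preset value; if it changed (or is missing),
    # something other than a normal browser submission touched it.
    if form.get("input1") != expected_token:
        return True
    # input2 is rendered empty and hidden; any content means it was autofilled.
    if form.get("input2", "") != "":
        return True
    return False
```

A processing page would call this before doing anything else and silently drop submissions for which it returns True.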
Make sure it works without Javascript (or fails gracefully)
Strip out any nasty characters before sending the email
Use parameterized queries when writing to the database
Be sure the From or Reply-to header is set to the sender's email address
Read this article on form validation.
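A minimal Python sketch of two of the points above (rejecting dangerous characters before mailing, and setting Reply-To to the sender's address) might look like this. The function name and the address forms@example.com are hypothetical, and treating "nasty characters" as line breaks in header values is an assumption about what the tip means.

```python
from email.message import EmailMessage


def build_contact_email(sender_email, subject, body, site_address="forms@example.com"):
    """Build the notification email for a contact form submission."""
    # Line breaks in header values are the classic header-injection vector,
    # so refuse them outright instead of trying to repair the input.
    for value in (sender_email, subject):
        if "\r" in value or "\n" in value:
            raise ValueError("line breaks are not allowed in header fields")

    msg = EmailMessage()
    msg["From"] = site_address        # the site sends the mail
    msg["To"] = site_address          # and receives the notification
    msg["Reply-To"] = sender_email    # replies go to the person who wrote
    msg["Subject"] = subject
    msg.set_content(body)             # the body may contain any text
    return msg
```

The returned message could then be handed to smtplib or whatever mail transport the site already uses.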
I often encounter advice for protecting a web application against a number of vulnerabilities, like SQL injection and other types of injection, by doing input validation.
It's sometimes even said to be the single most important technique.
Personally, I feel that input validation for security reasons is never necessary and better replaced with
if possible, not mixing user input with a programming language at all (e.g. using parameterized SQL statements instead of concatenating input in the query strings)
escaping user input before mixing it with a programming or markup language (e.g. html escaping, javascript escaping, ...)
Of course, for a good UX it's best to catch input that would generate errors on the backend early in the GUI, but that's another matter.
Am I missing something or is the only purpose to try to make up for mistakes against the above two rules?
Yes you are generally correct.
A piece of data is only dangerous when "used". And it is only dangerous if it has special meaning in the context it is used.
For example, <script> is only dangerous if used in output to an HTML page.
Robert'); DROP TABLE Students;-- is only dangerous when used in a database query.
Generally, you want to make this data "safe" as late as possible. Such as HTML encoding when output as HTML to an HTML page, and parameterised when inserting into a database. The big advantage of this is that when the data is later retrieved from these locations, it will be returned in its original, unsanitized format.
So if you have the value A&B O'Leary in an input field, it would be encoded like so:
<input type="hidden" value="A&amp;B O&#39;Leary" />
and if this is submitted to your application, your programming framework will automatically decode it for you back to A&B O'Leary. Same with your DB:
string name = "A&B O'Leary";
string sql = "INSERT INTO Customers (Name) VALUES (@Name)";
SqlCommand command = new SqlCommand(sql);
command.Parameters.AddWithValue("@Name", name);
Simples.
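The same round trip can be sketched in Python, assuming an in-memory SQLite database stands in for the real one. The value is stored unmodified through a parameterized insert and is only HTML-encoded at the moment it is written into markup.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT)")

name = "A&B O'Leary"

# The value travels as a bound parameter, so it is never parsed as SQL.
conn.execute("INSERT INTO Customers (Name) VALUES (?)", (name,))

# Reading it back yields the original, unencoded string.
stored = conn.execute("SELECT Name FROM Customers").fetchone()[0]

# Encode only at the point of output, for the HTML attribute context.
html_fragment = '<input type="hidden" value="{}" />'.format(
    html.escape(stored, quote=True)
)
```

Because the database holds the raw value, the same row can later be sent to a plain-text email or a JSON response without first undoing an encoding.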
Additionally if you then need to give the user any output in plain text, you should retrieve it from your DB and spit it out. Or in JavaScript - you just JavaScript entity encode (although best avoided for complexity reasons - I find it easier to secure if I only output to HTML then read the values from the DOM).
If you'd HTML encoded it early, then to output to JavaScript/JSON you'd first have to convert it back and then hex entity encode it. It will get messy, some developers will forget they have to decode first, and you will have &amp;s everywhere.
You can use validation as an additional defence, but it should not be the first port of call. For example, if you are validating a UK postcode you would want to whitelist the alphanumeric characters in upper and lower cases. Any other characters would be rejected or removed by your application. This can reduce the chances of SQLi or XSS occurring on your application, but this method falls down where you need inputs to include characters that have special meaning to your output context (" '<> etc). For example, on Stack Overflow if they did not allow characters such as these you would be preventing questions and answers from including code snippets which would pretty much make the site useless.
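A whitelist check of the kind described for the postcode could be sketched as below. Allowing spaces and bounding the length to 5-8 characters are assumptions added for the example; this is not a full UK postcode grammar.

```python
import re

# Only letters, digits and spaces are allowed (the space and the length
# bounds are assumptions for this sketch).
_ALLOWED = re.compile(r"[A-Za-z0-9 ]{5,8}")


def clean_postcode(raw):
    """Return the postcode in upper case, or None if it has other characters."""
    candidate = raw.strip()
    if not _ALLOWED.fullmatch(candidate):
        return None
    return candidate.upper()
```

This rejects rather than strips unexpected characters, which is one of the two options mentioned above.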
Not all SQL statements are parameterizable. For example, if you need to use dynamic identifiers (as opposed to literals). Even whitelisting can be hard, sometimes it needs to be dynamic.
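For the simple case where the set of identifiers is known in advance, one common approach is to check the identifier against a fixed allow-list before building the query. The sketch below assumes SQLite and a hypothetical customers table; it does not address the harder case where the allowed set itself is dynamic.

```python
import sqlite3

# Hypothetical allow-list of columns a caller may sort by.
SORTABLE_COLUMNS = {"name", "created"}


def fetch_sorted(conn, sort_by):
    """Return customer names ordered by a caller-chosen column."""
    # Identifiers cannot be bound as parameters, so only accept known ones.
    if sort_by not in SORTABLE_COLUMNS:
        raise ValueError("unsupported sort column")
    # Safe to interpolate: sort_by is one of our own literal strings.
    query = "SELECT name FROM customers ORDER BY {}".format(sort_by)
    return [row[0] for row in conn.execute(query)]
```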
Escaping XSS on output is a good idea. Until you forget to escape it on your admin dashboard too and they steal all your admin's cookies. Don't let XSS in your database.
I'm just trying to find a pattern / best practice to come up with interactive user decisions.
So basically I have a (quite large) form the user has to fill. Once he submits, an AJAX-Post-Request is sent to the server. At first some fault checks are done there but some checks require user interaction e.g. "Is this really correct". As the returned "document" is always XML I thought of returning all questions at once like
bla
bla2
And then iterating through them. Oh, and I'm using jQuery and Bootstrap's modal for that. If all of these questions are answered with yes, I'll send the form again with a parameter allyes=true or something.
However, I don't feel very happy with that and I'm just wondering if there are some easier ways to code that.
Best regards,
fire
From a user's perspective, it's better if the form tells you there is a problem as you go along, rather than after you have filled it all in. If the checks are field level, I'd be tempted to validate each field as it is filled in.
When using JSON to populate a section of a page, I often find that the data needs special formatting, formatting that needs to match what is already on the page, which is done serverside.
A number might need to be formatted as a currency, a date might need a special format, or a value might need to be wrapped in extra markup when it is negative.
But where should this formatting take place - doing it clientside will mean that I need to replicate all the formatting that takes place on the serverside. Doing it serverside and placing the formatted values in the JSON object means a less generic and reusable data set.
What is the recommended approach here?
The generic answer is to format data as late/as close to the user as is possible (or perhaps "practical" is a better term).
Irritatingly, this means that it's an "it depends" answer, and you've more or less already identified the compromise you're going to have to make, i.e. do you remove flexibility/portability by formatting server side, or do you potentially introduce duplication by doing it client side.
Personally I would tend towards client side unless there's a very good reason not to do so - simply because we're back to trying to format stuff as close to the user as possible, though I would be a bit concerned about making sure that I'm applying the right formatting rules in the browser.
JSON supports the following basic types:
Numbers,
Strings,
Boolean,
Arrays,
Objects
and Null (empty).
A currency is usually nothing else than a number, but formatted according to country-specific rules. And dates are not (yet) included in JSON at all.
What is advisable depends on what you do in your application and what kind of JavaScript libraries you are already using. If you are already formatting a lot of data in your server side code, then add it there. If not, and you already have some libraries included which can cope with formatting (jQuery and MooTools have some capabilities), do it in the browser.
So either format them in the client or format them before sending them over - both solutions work.
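To make the "format late" option concrete, here is a small Python sketch in which the payload carries raw values and the consumer applies the display rules. The consumer function stands in for whatever runs in the browser; the field names, the date format and the parentheses for negative amounts are all assumptions for illustration.

```python
import json
from datetime import date


def to_payload(amount, due):
    """Serialize raw values; the date travels as an ISO 8601 string."""
    return json.dumps({"amount": amount, "due": due.isoformat()})


def format_for_display(payload):
    """Apply presentation rules as late as possible, next to the output."""
    data = json.loads(payload)
    amount = data["amount"]
    due = date.fromisoformat(data["due"])
    # Thousands separator and two decimals; negatives shown in parentheses.
    text = "{:,.2f}".format(abs(amount))
    if amount < 0:
        text = "(" + text + ")"
    return {"amount": text, "due": due.strftime("%d/%m/%Y")}
```

Keeping the payload raw means another consumer can apply different rules to the same data, at the cost of each consumer having to know its own formatting.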
If you want to delve deeper into this, I recommend this Wikipedia article about JSON.
I'm going to use PHP in my example, but my question applies to any programming language.
Whenever I deal with forms that can be filled out by users who are not logged in (in other words, untrusted users), I do several things to make sure it is safe to store in the database:
Verify that all of the expected fields are present in $_POST (none were removed using a tool such as Firebug)
Verify that there are no unexpected fields in $_POST. This way, a field in the database doesn't accidentally get written over.
Verify that all of the expected fields are of the expected type (almost always "string"). This way, problems don't come up if a malicious user is tinkering with the code and adds "[]" to the end of a field name, thus making PHP consider the field to be an array and then performing checks on it as though it were a string.
Verify that all of the required fields were filled out.
Verify that all of the fields (both required and optional) were filled out correctly (for example, email addresses and phone numbers are in the expected format).
Related to the previous item, but worthy of being its own item: verify that fields that are dropdown menus were submitted with values that are actually in the dropdown menu. Again, a user could tinker with the code and change the dropdown menu to be anything they want.
Sanitize all fields just in case the user intentionally or unintentionally included malicious code.
I don't believe that any of the above things are overkill because, as I mentioned, the user filling out the form is not trusted.
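Most of the checks in the list above can be expressed as one reusable function. The following Python sketch is language-neutral in spirit: `post` plays the role of $_POST as a plain dict, and the function name, parameter names and the deliberately loose email pattern are hypothetical. Sanitizing for a particular output context is left out.

```python
import re

# Deliberately loose pattern, only for illustration.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate(post, required, optional=(), choices=None, formats=None):
    """Return a list of problems found in a submitted form (empty if none)."""
    choices = choices or {}
    formats = formats or {}
    expected = set(required) | set(optional)
    errors = []
    # Every expected field must be present, and nothing else may be.
    for field in sorted(expected - set(post)):
        errors.append("missing field: " + field)
    for field in sorted(set(post) - expected):
        errors.append("unexpected field: " + field)
    for field in sorted(expected & set(post)):
        value = post[field]
        # A field turned into an array (name[]) would not be a string.
        if not isinstance(value, str):
            errors.append("wrong type: " + field)
            continue
        if field in required and value.strip() == "":
            errors.append("empty required field: " + field)
            continue
        # Format and dropdown checks apply only to non-empty values.
        if value and field in formats and not formats[field].fullmatch(value):
            errors.append("bad format: " + field)
        if value and field in choices and value not in choices[field]:
            errors.append("invalid choice: " + field)
    return errors
```

Each form then only needs to declare its required fields, optional fields, dropdown options and formats.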
When it comes to admin backends, however, I'm not sure all of those things are necessary. These are the things that I still consider to be necessary:
Verify that all of the required fields were filled out.
Verify that all of the fields (both required and optional) were filled out correctly (for example, email addresses and phone numbers are in the expected format).
Sanitize all fields just in case the user intentionally or unintentionally included malicious code.
I'm considering dropping the remaining items in order to save time and have less code (and, therefore, more readable code). Is that a reasonable thing to do or is it worthwhile to treat all forms equally regardless of whether or not they are being filled out by a trusted user?
These are the only two reasons I can think of for why it might be wise to treat all forms equally:
The trusted user's credentials might be found out by an untrusted user.
The trusted user's machine could be infected with malware that messes with forms. I have never heard of such malware and doubt that this is something to really be worried about, but it is something to consider anyway.
Thanks!
Without knowing all the details, it's hard to say.
However, in general this feels like a situation where code re-use should be possible. In other words, it feels like this boiler-plate form validation shouldn't need to be re-written for each unique form. Instead, I would aim to create some reusable external class that could be used for any form.
You mentioned PHP and there are already lots of form validation classes available:
http://www.google.com/search?gcx=w&sourceid=chrome&ie=UTF-8&q=form+validation+php+class
Best of luck!
I'm new to FreeMarker and need to understand this issue before deciding whether to use it. I will strip XSS myself, but I don't know whether FreeMarker's other features are safe when the site allows users to edit their own templates.
Oh, goodness no! This is basically equivalent to allowing the user to evaluate arbitrary code. Removing XSS after the fact only removes one potential vulnerability. They'll still be able to do plenty of other things like manipulate POST parameters or perform page redirects.
John is right. And letting the user actually edit FreeMarker templates themselves seems odd. If you are outputting user input again (like displaying the search term on the results page), I'd suggest using the ?html string built-in; it'll save you from the most rudimentary XSS attacks (e.g. "you searched for '${term?html}'").
So, as others said, it's not safe. However, if those users are employees at your company or something like that (i.e., if they are easily held accountable for malicious actions), then it's not entirely out of the question. For more details see: http://freemarker.org/docs/app_faq.html#faq_template_uploading_security