Prevent data tampering in Response

While reading The Web Application Hacker's Handbook, I tried to make a small test on my own website (ASP.NET MVC3).
I have a model which contains two fields. The first is a disabled dropdown list; the second is an enabled text field.
The first field is disabled in View.cshtml by passing new {disabled="disabled"} as a parameter.
Here is what happened: I ran Burp Suite as a proxy and intercepted the response.
In the response, I removed the disabled="disabled" attribute from the HTML, then forwarded the response to the browser. Obviously, the page now has two enabled fields.
The question is: how do I prevent tampering with fields using tools such as Burp Suite?

You can't. For that matter you can't be sure that a post back you receive in your controller is the result from your view in a browser. Just posting whatever you want using some script is easy and something hackers frequently do.
The bottom line: never trust input, and always validate on the server that it is permissible.
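As a rough illustration of "never trust input", here is a minimal sketch in Python (not ASP.NET MVC) of a handler that takes only the fields the user is allowed to edit and ignores everything else in the post. The field names `category` and `comment`, the length limit, and the `apply_post` helper are all hypothetical.

```python
# Hypothetical model: "category" is rendered as a disabled dropdown and must
# never change; "comment" is the only field the user may edit.
EDITABLE_FIELDS = {"comment"}
MAX_COMMENT_LENGTH = 200  # assumed limit, for illustration only


def apply_post(stored: dict, posted: dict) -> dict:
    """Return an updated copy of `stored`, taking only permitted values from `posted`."""
    updated = dict(stored)
    for name in EDITABLE_FIELDS:
        if name not in posted:
            continue
        value = str(posted[name]).strip()
        if len(value) > MAX_COMMENT_LENGTH:
            raise ValueError(f"{name} is too long")
        updated[name] = value
    # Anything else in `posted` (including a tampered "category") is ignored.
    return updated


stored = {"category": "books", "comment": ""}
tampered = {"category": "admin-only", "comment": "hello"}
result = apply_post(stored, tampered)
```

Here the tampered `category` value never reaches the stored record, whether or not the dropdown was re-enabled in the browser.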


Ways to programmatically check if a website is up and functioning as expected

I know this is an open ended question, but hopefully it will get some good answers before the thread is locked...
I'm wondering what methods there are to programmatically check (language agnostic) if a website is online from a client perspective (assume you can't make changes to the site/server, but you can rely on certain behaviours of the site.)
The result of each method could stack to provide a measure of certainty that the site is up/down - that is, a method does not have to provide a definite indication if the site is up/down on its own.
Some common tests just to check 'upness' may be:
Ping the site (which in the case of shared hosting isn't very
Send an HTTP HEAD/GET request and check the status
Others I can think of to check that the site is up and functioning:
Check you received a well-formed HTML response, i.e. an opening <html> tag through to a closing </html> tag. If the site is experiencing trouble it may spit out an error and exit without writing the rest of the page (not all that reliable, though, because the site may handle most errors in a better way)
Check certain content is or is not on the page, i.e. perhaps there is some content that is always present on your pages, or always present in the case of an error
Can anybody think of any other methods that could be used to help determine if a site is in fact up/down and functioning/not functioning correctly from within a program?
If your get request on a page that displays info from database comes back with status 200 and matching keywords are found, you can be pretty certain that your site is up and running.
And you don't really need to write your own script to do that. There are free services such as GotSiteMonitor, Pingdom, UptimeRobot etc. that allow you to monitor your site.
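If you do want to script the status-plus-keywords check yourself, a minimal Python sketch could look like the following. The `fetch` and `looks_up` names are made up for illustration, and the keyword lists are whatever you know to be reliable markers on your own pages.

```python
import urllib.error
import urllib.request


def fetch(url: str, timeout: float = 10.0):
    """Return (status, body); status is 0 if the site could not be reached at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with a 4xx/5xx status.
        return err.code, ""
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, etc.
        return 0, ""


def looks_up(status: int, body: str, required=(), forbidden=()) -> bool:
    """True if the status is 200, every required keyword is present and no
    forbidden keyword (e.g. an error banner) appears. Case-insensitive."""
    if status != 200:
        return False
    text = body.lower()
    if any(word.lower() not in text for word in required):
        return False
    return not any(word.lower() in text for word in forbidden)
```

A typical call would be `looks_up(*fetch(url), required=[...], forbidden=[...])`, run on whatever schedule you need.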
Base your set of tests on the unit-testing principle. It is normally used in programming to test classes, modules or other artefacts after changes have been made. You can use any of the available frameworks, so you don't have to reinvent the wheel. You must describe (implement) the tests to be run; in your case a typical test should request a URL inside the page and then do some evaluations like:
call result (for example return code of curl execution)
http return code
http headers
response mime type
response size
response content (test against a regular expression)
This way you can add, remove and modify single tests without having to care about the framework once it is up and running. You can also chain tests, e.g. perform a login in one test and virtually click a button in a subsequent test.
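The list of evaluations above can be sketched as small, independent checks that each look at one aspect of a response. This is only an illustration in Python: the `Response` class, the check names and the thresholds are assumptions, and in practice each check could just as well be a test method in a framework such as unittest.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Response:
    """Hypothetical container for whatever your HTTP client returned."""
    status: int
    headers: dict = field(default_factory=dict)  # assumes canonical header names
    body: str = ""


def check_status(r: Response) -> bool:
    return r.status == 200


def check_mime_type(r: Response) -> bool:
    mime = r.headers.get("Content-Type", "").split(";")[0].strip()
    return mime == "text/html"


def check_size(r: Response) -> bool:
    return len(r.body) >= 20  # assumed minimum; tune for your own pages


def check_content(r: Response) -> bool:
    return re.search(r"<title>[^<]+</title>", r.body) is not None


CHECKS = [
    ("status", check_status),
    ("mime type", check_mime_type),
    ("size", check_size),
    ("content", check_content),
]


def run_checks(r: Response, checks=CHECKS) -> dict:
    """Run every check and report each result by name."""
    return {name: fn(r) for name, fn in checks}
```

Because each check is a separate function, adding or removing one means editing a single entry in `CHECKS`.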
There are also tools to handle such test runs automatically including visualization of results, statistics and the like.
OK, it sounds like you want to test and monitor your website from a customer experience perspective rather than purely establishing if a server is up (using ping for example). An effective way to replicate the customer experience is to simulate tests against the site using one of the headless browser testing tools (PhantomJS is a great choice) as they will render the page fully (including images, CSS, JS etc.), giving you a real page load time. These tools also allow you to make assertions on all aspects of the HTML content and HTTP response.
Pingdom recently started offering a (paid for) service to perform these exact types of checks alongside their existing monitoring solution. The demo is worth looking at; their interface for writing the actual tests is very nice.

Content negotiation ignored when using browser Back button

Here's the situation:
I have a web application which responds to a request for a list of resources, let's say:
This is initially requested directly by the web browser by navigating to that path. The browser uses its standard "Accept" header which includes "text/html" and my application notices this and returns the HTML content for the item list.
Within the returned HTML is some JavaScript (jQuery), which then does an ajax request to retrieve the actual data:
Only this time, the "Accept" header is explicitly set to "application/json". Again, my application notices this and JSON is correctly returned to the request, the data is inserted into the page, and everything is happy.
Here comes the problem: The user navigates to another page, and later presses the BACK button. They are then prompted to save a file. This turns out to be the JSON data of the item list.
So far I've confirmed this to happen in both Google Chrome and Firefox 3.5.
There are two possible types of answers here:
How can I fix the problem. Is there some magic combination of Cache-Control headers, or other voodoo which cause the browser to do the right thing here?
If you think I am doing something horribly wrong here, how should I go about this? I'm seeking correctness, but also trying not to sacrifice
If it helps, the application is a JAX-RS web application, using Restlet 2.0m4. I can provide sample request/response headers if it's helpful but I believe the issue is completely reproducible.
Is there some magic combination of Cache-Control headers, or other voodoo which cause the browser to do the right thing here?
If you serve different responses to different Accept: headers, you must include the header:
Vary: Accept
in your response. The Vary header should also contain any other request headers that influence the response, so for example if you do gzip/deflate compression you'd have to include Accept-Encoding.
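To make the idea concrete, here is a small Python sketch (not JAX-RS or Restlet) of a handler that picks a representation from the Accept header and always sends Vary: Accept. The `wants_json` and `respond` helpers are hypothetical, and the Accept parsing is deliberately crude: it ignores q-values and wildcards.

```python
import html
import json


def wants_json(accept: str) -> bool:
    """Crude Accept parsing: JSON only if it is listed and text/html is not."""
    types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    return "application/json" in types and "text/html" not in types


def respond(accept: str, items: list):
    """Return (headers, body) for the item list, negotiated on Accept."""
    # The response depends on Accept, so caches must be told about it.
    headers = {"Vary": "Accept"}
    if wants_json(accept):
        headers["Content-Type"] = "application/json"
        body = json.dumps(items)
    else:
        headers["Content-Type"] = "text/html"
        rows = "".join(f"<li>{html.escape(str(item))}</li>" for item in items)
        body = f"<ul>{rows}</ul>"
    return headers, body
```

Both branches carry the same Vary value, so a cache holding the JSON variant has no grounds to reuse it for a later request that asks for text/html.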
IE unfortunately handles many values of Vary poorly, breaking caching completely, which might or might not matter to you.
If you think I am doing something horribly wrong here, how should I go about this?
I don't think the idea of serving different content for different types at the same URL is horribly wrong, but you are letting yourself in for more compatibility problems than you really need. Relying on headers working through JSON isn't really a great idea in practice; you'd be best off just having a different URL, such as /items/json or /items?format=json.
I know this question is old, but just in case anyone else runs into this:
I was having this same problem with a Rails application using jQuery, and I fixed it by telling the browser not to cache the JSON response with the solution given here to a different question:
jQuery $.getJSON works only once for each control. Doesn't reach the server again
The problem only seemed to occur with Chrome and Firefox. Safari was handling the back behavior okay without explicitly having to tell it to not cache.
Old question, but for anyone else seeing this, there is nothing wrong with the questioner's usage of the Accept header.
This is a confirmed bug in Chrome. (Previously also in Firefox but since fixed.)

Automate website log-in and form filling?

I'm trying to log in to a website and save an HTML page automatically (I want to be able to do this on a regular time interval). From the surface, this is a typical modern website where, if the user navigates directly to a "locked" URL, a log-in form pops up, and after logging in, the user is redirected to the intended page.
I gave mechanize a shot, but it wasn't finding some form elements which were needed for login (hidden elements that have some values put in by a JavaScript function that runs when the user clicks the "log in" button).
I played a bit with the "web browser" control in .NET but quickly lost interest because I couldn't even get it to submit a query on the Google page.
I don't care what the language is; I'll learn it to solve this problem. At a minimum it has to work in Windows.
A simple example, say, typing in a query into the Google search box would be a great bonus.
In my experience, the most reliable way is to use javascript. It works well in .Net. To test, browse to the following addresses one after another in Firefox or Internet Explorer:
javascript:function f(){document.forms[0]['q'].value='stackoverflow';}f();
That performs a search for "stackoverflow" on Google. To do it in VB .Net using the webbrowser control, do this:
Do While WebBrowser1.IsBusy OrElse WebBrowser1.ReadyState <> WebBrowserReadyState.Complete
Threading.Thread.Sleep(2000) 'wait for javascript to run
Threading.Thread.Sleep(2000) 'wait for javascript to run
Notice how the space in the URL is converted to %20. I'm not certain if this is necessary but it can't hurt. It is important that the first javascript be in a function. The calls to Sleep() are to wait for Google to load and also for the javascript stuff. The Do While Loop might run forever if the page fails to load so for automation purposes have a counter that will timeout after, say, 60 seconds.
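The timeout idea is independent of the browser control, so here it is as a minimal Python sketch rather than VB .Net. The `wait_until` helper and its parameters are hypothetical; `is_ready` stands in for whatever "page has finished loading" test your automation library offers.

```python
import time


def wait_until(is_ready, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll `is_ready()` until it returns True or `timeout` seconds have passed.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True
```

The caller can then decide what a timeout means (retry, log and skip, abort) instead of hanging forever on a page that never loads.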
Of course, for Google you can just navigate directly to the search results URL, but if your site has hidden input fields, etc., then this is the way to go. It only works for HTML sites; Flash is a whole other matter.
If I understand you right, you want to log in to only one webpage, and that form always stays the same. You could either reverse engineer the JavaScript, or debug it via a JavaScript debugger in the browser (e.g. Firebug for Firefox). Or you can fill in the form in your browser and look at the HTTP request via a network packet sniffer. Once you have all the required form data to submit, you can do the same with your program (that's what I did the last time I had a pretty similar task to do). Don't forget to store all cookie data you receive back from the web server and send it with the next request, to 'stay logged in'.
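A minimal Python sketch of that approach, using only the standard library: one opener with a cookie jar is reused for the login POST and the later page request, so cookies set at login are sent back automatically. The helper names are hypothetical, and the URLs and form field names are whatever you captured from the real login request.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Build an opener that remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields: dict) -> bytes:
    """Encode form fields the way a browser submits a normal HTML form."""
    return urllib.parse.urlencode(fields).encode("ascii")


def login_and_fetch(opener, login_url: str, fields: dict, page_url: str) -> str:
    """POST the captured login fields, then fetch the protected page."""
    # Passing `data` makes urllib send a POST instead of a GET.
    with opener.open(login_url, data=encode_form(fields), timeout=30) as resp:
        resp.read()
    # Cookies from the login response are attached to this request by the jar.
    with opener.open(page_url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

The returned HTML can then be written to disk, and the whole script run from a scheduler at whatever interval you need.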
It's already been discussed here.
The gist is that you can use Selenium, an open source web automation tool, which has API libraries available in various languages such as Java, Ruby, etc.
Neoload can handle the form filling with authentication, assuming you don't want to collect data, just perform actions. It's a web stress tool, so it's not really meant to be used as a time-based service, but you COULD just leave it running.
I've used Ruby and Watir (a web app testing suite) for something similar, but it was a very small task (basically visiting URLs from a text file and downloading an image).
There's also an extension called iMacros that can do some automation, but I'm not personally familiar with it (just aware of it).
"I'm trying to log in to a website and save an HTML page automatically"
These commands, played in the iMacros add-on, will save the page on the C: drive and name it page.html
Goes on the particular website you want to save. You can also use scripting in iMacros and set different websites in macro.

Screen scraping an ASP.NET web page to retrieve data displayed in the grid view

I am using Ruby to screen scrape a web page (created in ASP.NET) which uses a GridView to display data. I am successfully able to read the data displayed on page 1 of the grid but unable to figure out how I can move to the next page in the grid to read all the data.
The problem is that the page number hyperlinks are not normal hyperlinks (with a URL) but are instead JavaScript hyperlinks which cause a postback to the same page.
An example of the hyperlink:-
I recommend using Watir, a ruby library designed for browser testing, if you're already using ruby for processing. For one thing, it gives you a much nicer interface to the DOM elements on the page, and it makes clicking links like this easier:, '6').click
Then, of course you have easier methods for navigating the table as well. It's easy enough to automate this process:
1..total_number_of_pages.each do |next_page|, next_page).click
# table processing goes here
I don't know your use case, but this approach has its advantages and disadvantages. For one thing, it actually runs a browser instance, so if this is something you need to frequently run quietly in the background in completely automated way, this may not be the best approach. On the other hand, if it's ok to launch a browser instance, then you don't have to worry about all that postback nonsense, and you can just click the link as if you were a user.
You'll need to figure out the actual URL.
Option 1a: Open the page in a browser with good developer support (e.g. Firefox with the web development tools) and look through the source to find where __doPostBack is defined. Figure out what URL it's constructing. Note that it might not be in the main page source, but instead in something that the page loads.
Option 1b: Ditto, but have Ruby do it. If you're fetching the page with Net::HTTP you've got the tools to find the definition of __doPostBack already (the body as a string, Ruby's grep, and the ability to request additional files, such as those in script tags).
Option 2: Monitor the traffic between a browser and the page (e.g. with a logging proxy) to find out what the URL is.
Option 3: Ask the owner of the web page.
Option 4: Guess. This may not be as bad as it sounds (e.g. if the original URL ends with "...?page=1" or something) but in general this is the least likely to work.
Edit (in response to your comment on the other question):
Assuming you're using the Net::HTTP library, you can do a postback by just replacing your get with a post, e.g. instead of my_http.get(my_url)
Edit (in response to danieltalsky's answer):
Watir may be a really good solution for you (I'm kicking myself for not having thought of it), but be aware that you may have to manually fire the event or jump through other hoops to get what you want. As a specific gotcha, with any asynchronous fetch like this you need to make sure that the full response has come back before you scrape it; that isn't a problem when you're doing the request inline yourself.
You will have to perform the postback. The data is passed with a form POST back to the server. Like Markus said, use something like Firebug or the Developer Tools in IE 8 and Fiddler to watch the traffic. But honestly this is a web form using the bloated GridView and you will be in for a fun adventure. ;)
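As a rough sketch of what "performing the postback" involves (in Python rather than Ruby): ASP.NET Web Forms pages carry state in hidden inputs such as __VIEWSTATE, and __doPostBack submits the form with __EVENTTARGET and __EVENTARGUMENT filled in. The code below collects the hidden fields from the page you already fetched and builds the POST body. The control name `GridView1` and the argument `Page$2` used in the usage note are assumptions; take the real values from the page's own link or from the traffic you captured.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> on a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attributes.get("name"):
            self.fields[attributes["name"]] = attributes.get("value") or ""


def build_postback(page_html: str, target: str, argument: str) -> bytes:
    """Build the form body that a __doPostBack(target, argument) call would submit."""
    parser = HiddenFieldParser()
    parser.feed(page_html)
    fields = dict(parser.fields)
    fields["__EVENTTARGET"] = target
    fields["__EVENTARGUMENT"] = argument
    return urlencode(fields).encode("ascii")
```

For example, `build_postback(page_html, "GridView1", "Page$2")` would produce the body to POST back to the same URL, along with any cookies the site set.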
You'll need to do some investigation in order to figure out what HTTP request the javascript execution is performing. I've used the Mozilla browser with the Firebug plugin and also the "Live HTTP Headers" plugin to help determine what is going on. It will likely become clear to you which requests you will need to make in order to traverse to the next page. Make sure you pay attention to any cookies getting set.
I've had really good success using Mechanize for scraping. It wraps all of the HTTP communication, HTML parsing and searching (using Nokogiri), redirection, and holding onto cookies. But it doesn't know how to execute JavaScript, which is why you will need to figure out what HTTP request to perform on your own.

show "webpage has expired" on back button

What is the requirement for the browser to show the ubiquitous "this page has expired" message when the user hits the back button?
What are some user-friendly ways to prevent the user from using the back button in a webapp?
Well, by default, whenever you're dealing with a form POST and the user hits back and then refresh, they'll see the message indicating that the browser is resubmitting data. But if the page is set to expire immediately, they won't even have to hit refresh: they'll see the "page has expired" message as soon as they hit back.
To avoid both messages there are a couple things to try:
1) Use a form GET instead. It depends on what you're doing but this isn't always a good solution as there are still size restrictions on a GET request. And the information is passed along in the querystring which isn't the most secure of options.
-- or --
2) Perform a server-side redirect to a different page after the form POST.
Looks like a similar question was answered here:
Redirect with a 303 after POST to avoid "Webpage has expired": Will it work if there are more bytes than a GET request can handle?
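The redirect-after-POST pattern can be sketched without any particular web framework. In this minimal Python illustration the `handle` function, the `/orders` paths and the in-memory `ORDERS` list are all hypothetical stand-ins for your own routing and storage.

```python
ORDERS = []  # stands in for a database


def handle(method: str, path: str, form=None):
    """Return (status, headers, body) for a tiny order-taking application."""
    if method == "POST" and path == "/orders":
        ORDERS.append(dict(form or {}))
        order_id = len(ORDERS)
        # 303 See Other: the browser follows up with a GET, so back/refresh
        # re-requests the confirmation page instead of re-sending the POST.
        return 303, {"Location": f"/orders/{order_id}"}, ""
    if method == "GET" and path.startswith("/orders/"):
        raw_id = path.rsplit("/", 1)[1]
        if raw_id.isdigit() and 1 <= int(raw_id) <= len(ORDERS):
            item = ORDERS[int(raw_id) - 1].get("item", "")
            return 200, {"Content-Type": "text/plain"}, f"Order {raw_id}: {item}"
    return 404, {"Content-Type": "text/plain"}, "Not found"
```

Since the page the user ends up on was fetched with GET, the browser has no form data to resubmit when they come back to it.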
As a third option one could prevent a user from going back in their browser at all. The only time I've felt a need to do this was to prevent them from doing something stupid such as paying twice, although there are better server-side methods to handle that. If your site uses sessions then you can prevent them from paying twice by first disabling caching on the checkout page and setting it to expire immediately. And then you can utilize a flag of some sort stored in the session which will actually change the behavior of the page if you go back to it.
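The session flag idea might look like this in outline. It is a Python sketch with a plain dict standing in for the session; the `pay` function, the `order_paid` key and the `charge` callback are hypothetical names.

```python
def pay(session: dict, charge) -> str:
    """Charge the customer at most once per session.

    `session` stands in for your framework's session store; `charge` is
    whatever callable actually takes the payment.
    """
    if session.get("order_paid"):
        # The user came back to the checkout page: show the receipt instead.
        return "already-paid"
    charge()
    session["order_paid"] = True
    return "paid"
```

A second submission, whether from the back button or a refresh, then falls into the "already-paid" branch rather than charging again.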
You need to set the Pragma and Cache-Control options in the HTTP headers:
However, from the usability point of view, this is a discouraged approach to the matter. I strongly encourage you to look for other options.
ps: as proposed by Steve, redirection via GET is the proper way (or check page movement with JS).
Try using the following code in the Page_Load
use one of the following before session_start:
session_cache_expire(60); // in minutes
ini_set('session.cache_limiter', 'private');
Language is PHP
I'm not sure if this is standard practice, but I typically solve this issue by not sending a Vary header for IE only. In Apache, you can put the following in httpd.conf:
BrowserMatch MSIE force-no-vary
According to the RFC:
The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation.
The practical effect is that when you go "back" to a POST, IE simply gets the page from the history cache. No request at all goes to the server side. I can see this clearly in HTTPWatch.
I would be interested to hear potential bad side-effects of this approach.
