show "webpage has expired" on back button - http-post

What is the requirement for the browser to show the ubiquitous "this page has expired" message when the user hits the back button?
What are some user-friendly ways to prevent the user from using the back button in a webapp?

By default, whenever you're dealing with a form POST and the user hits Back and then Refresh, they'll see a message indicating that the browser is resubmitting data. But if the page is set to expire immediately, they won't even have to hit Refresh; they'll see the "page has expired" message as soon as they hit Back.
To avoid both messages there are a couple of things to try:
1) Use a form GET instead. Depending on what you're doing this isn't always a good solution, since there are still size restrictions on a GET request, and the information is passed along in the query string, which isn't the most secure of options.
-- or --
2) Perform a server-side redirect to a different page after the form POST.
Looks like a similar question was answered here:
Redirect with a 303 after POST to avoid "Webpage has expired": Will it work if there are more bytes than a GET request can handle?
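For reference, here's a minimal sketch of that POST/Redirect/GET idea in PHP (the handler, URL, and save_order() helper are made up for illustration):
<?php
// form-handler.php: process the POST, then answer with a redirect so that
// Back/Refresh re-issues a harmless GET instead of re-submitting the form.
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    save_order($_POST);                                   // hypothetical processing step
    header('Location: /order-confirmation', true, 303);   // 303 See Other
    exit;
}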
As a third option, one could prevent a user from going back in their browser at all. The only time I've felt a need to do this was to prevent them from doing something stupid, such as paying twice, although there are better server-side methods to handle that. If your site uses sessions, you can prevent them from paying twice by first disabling cache on the checkout page and setting it to expire immediately, and then utilizing a flag of some sort, stored in the session, which actually changes the behavior of the page if you go back to it.
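A rough PHP sketch of that session-flag idea (process_payment() and the URLs are hypothetical; the point is the guard around the sensitive action):
<?php
session_start();
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    if (empty($_SESSION['order_paid'])) {
        process_payment($_POST);          // hypothetical payment step
        $_SESSION['order_paid'] = true;   // flag: this order has already been handled
    }
    // Whether it was just paid or already paid, land on the same confirmation page.
    header('Location: /order-confirmation', true, 303);
    exit;
}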

You need to set the Pragma and Cache-Control options in the HTTP headers:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
However, from a usability point of view this approach is discouraged. I strongly encourage you to look for other options.
PS: as proposed by Steve, redirection via a GET is the proper way (or detect page navigation with JS).
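If you do go the header route, here is a minimal PHP illustration of those Pragma/Cache-Control values (the same header values apply in any server-side language); sending them is what makes the browser treat the POSTed page as expired on Back:
<?php
header('Cache-Control: no-store, no-cache, must-revalidate');  // don't cache the POST result
header('Pragma: no-cache');                                    // HTTP/1.0 fallback
header('Expires: Thu, 01 Jan 1970 00:00:00 GMT');              // a date in the past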

Try using the following code in Page_Load:
Response.Cache.SetCacheability(HttpCacheability.Private)

Use one of the following before session_start():
session_cache_expire(60); // in minutes
ini_set('session.cache_limiter', 'private');
Note: the language is PHP.

I'm not sure if this is standard practice, but I typically solve this issue by not sending a Vary header for IE only. In Apache, you can put the following in httpd.conf:
BrowserMatch MSIE force-no-vary
According to the RFC:
The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation.
The practical effect is that when you go "back" to a POST, IE simply gets the page from the history cache. No request at all goes to the server side. I can see this clearly in HTTPWatch.
I would be interested to hear potential bad side-effects of this approach.


Cache-control Immutable Header

I was reading about the immutable header and I came across this article, which says:
Cache-Control: max-age=365000000, immutable
When a client supporting immutable sees this attribute it should assume that the resource, if unexpired, is unchanged on the server and therefore should not send a conditional revalidation for it (e.g. If-None-Match or If-Modified-Since) to check for updates. Correcting possible corruption (e.g. shift reload in Firefox) never uses conditional revalidation and still makes sense to do with immutable objects if you're concerned they are corrupted.
source
I can't understand this phrase: "if unexpired, is unchanged on the server and therefore should not send a conditional revalidation".
The client, by default, doesn't send a revalidation until the max-age has expired.
So what's the point of defining immutable in the first place?
People pressing the refresh button.
Facebook, who first proposed the immutable cache-control directive, have a good post about how it saved them a huge number of requests, including this quote:
The problem with reloads
The browser’s reload button exists to allow the user to get an updated version of the current page. In order to meet this goal, when you reload, browsers revalidate the page that you are currently on, even if that page hasn’t expired yet. However, they also go a step further and revalidate all sub-resources on the page — things like images and JavaScript files.
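For context, a rough illustration of how immutable is typically deployed (PHP, with a made-up fingerprinted file name): it is sent for versioned assets whose URL changes whenever the content does, so the revalidation requests triggered by a reload can be skipped safely.
<?php
// app.3f9c2b.js is a content-hashed build artifact (hypothetical name);
// a new build produces a new URL, so this response can be marked immutable.
header('Content-Type: application/javascript');
header('Cache-Control: public, max-age=31536000, immutable');
readfile('/var/www/assets/app.3f9c2b.js');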

How to automatically resubmit the form in Firefox on error

When a form POST submission (say, a login) fails, Firefox displays a "Try Again" message.
Is there any way to click this "Try Again" automatically, or a setting in Firefox's about:config that makes it click automatically?
"Clicking" the Try Again button is relatively easy. There is an extension that does just that, and lets you set the number of seconds between retries.
The real rub here is that you want to "blindly" retry form POSTs. As we all know, just because you didn't get a response, that doesn't necessarily imply that nothing was changed on the server.
Re-submitting a login form sounds harmless enough, and usually is. But if you imagine forms that result in orders being placed or money being moved, it's easy to understand why browsers have implemented this kind of warning.
That warning is what you'll see if you enable an extension like TryAgain and a form post fails. It's the same behavior you'd get by pressing F5 yourself. The extension will dutifully try to POST again, but the browser is going to intervene with an alert and refuse to send the POST until "Resend" is clicked.
This kind of safety feature does a fair amount to protect end-users and developers from poor implementations and network hiccups. However, it's really going to work against what you're trying to accomplish.
That said, if you could figure out a way to modify the extension to detect the alert and somehow click "Resend", you'd be in business. I can't say for sure that this is impossible, but it kind of looks that way, at least for now: this issue was marked as "won't fix", and this issue is still open.
Here is an extension for Firefox:
auto reload
But I would warn you: you could automatically send sensitive data. Web browsers ask before resubmitting because they don't want sensitive data to be submitted without the user's discretion.

Prevent data tampering in Response

While reading The Web Application Hacker's Handbook, I tried to make a small test on my own website (ASP.NET MVC3).
I have a model which contains two fields; the first field is a disabled dropdown list.
The second is an enabled text field.
The first field is disabled in the View (.cshtml) by adding
new { disabled = "disabled" } as a parameter.
Here is what happened: I ran the Burp Suite tool as a proxy and trapped the response.
In the response, I removed the disabled="disabled" attribute from the HTML, then forwarded the response to the browser. Obviously, the page now has two enabled fields.
The question is how to prevent tampering fields using tools such as Burp Suite?
You can't. For that matter, you can't be sure that a postback you receive in your controller is the result of your view in a browser. Just posting whatever you want using some script is easy, and something hackers frequently do.
The bottom line is: never trust input, and always validate that it is permissible.
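To illustrate (sketched in PHP only because it's compact; the same check belongs in your ASP.NET MVC controller or model binder, and the field/value names here are made up), re-check on the server that the submitted value is one of the options you actually rendered:
<?php
// Hypothetical whitelist check: never trust that the client left the
// "disabled" dropdown untouched.
$allowed  = ['standard', 'express'];        // the options the view rendered
$shipping = $_POST['shipping'] ?? '';
if (!in_array($shipping, $allowed, true)) {
    http_response_code(400);                // reject tampered input
    exit('Invalid value');
}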

Content negotiation ignored when using browser Back button

Here's the situation:
I have a web application which responds to a request for a list of resources, let's say:
/items
This is initially requested directly by the web browser by navigating to that path. The browser uses its standard "Accept" header, which includes "text/html", and my application notices this and returns the HTML content for the item list.
Within the returned HTML is some JavaScript (jQuery), which then does an ajax request to retrieve the actual data:
/items
Only this time, the "Accept" header is explicitly set to "application/json". Again, my application notices this and JSON is correctly returned for the request; the data is inserted into the page, and everything is happy.
Here comes the problem: The user navigates to another page, and later presses the BACK button. They are then prompted to save a file. This turns out to be the JSON data of the item list.
So far I've confirmed this to happen in both Google Chrome and Firefox 3.5.
There are two possible types of answers here:
1) How can I fix the problem? Is there some magic combination of Cache-Control headers, or other voodoo, that causes the browser to do the right thing here?
2) If you think I am doing something horribly wrong here, how should I go about this? I'm seeking correctness, but also trying not to sacrifice flexibility.
If it helps, the application is a JAX-RS web application, using Restlet 2.0m4. I can provide sample request/response headers if it's helpful but I believe the issue is completely reproducible.
Is there some magic combination of Cache-Control headers, or other voodoo, that causes the browser to do the right thing here?
If you serve different responses to different Accept: headers, you must include the header:
Vary: Accept
in your response. The Vary header should also contain any other request headers that influence the response, so for example if you do gzip/deflate compression you'd have to include Accept-Encoding.
IE, unfortunately, handles many values of Vary poorly, breaking caching completely, which may or may not matter to you.
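A minimal sketch of the idea (PHP purely as illustration; your app is JAX-RS/Restlet, and the payloads here are placeholders): serve the representation the Accept header asks for, and declare that with Vary so caches and the back/forward history don't reuse the JSON response for a plain navigation.
<?php
header('Vary: Accept');                               // the cache key must include Accept
$accept = $_SERVER['HTTP_ACCEPT'] ?? '';
if (strpos($accept, 'application/json') !== false) {
    header('Content-Type: application/json');
    echo json_encode(['items' => []]);                // placeholder payload
} else {
    header('Content-Type: text/html; charset=utf-8');
    echo '<ul><li>item list rendered as HTML</li></ul>';
}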
If you think I am doing something horribly wrong here, how should I go about this?
I don't think the idea of serving different content for different types at the same URL is horribly wrong, but you are letting yourself in for more compatibility problems than you really need. Relying on Accept headers for the JSON case isn't really a great idea in practice; you'd be best off just having a different URL, such as /items/json or /items?format=json.
I know this question is old, but just in case anyone else runs into this:
I was having this same problem with a Rails application using jQuery, and I fixed it by telling the browser not to cache the JSON response, using the solution given here for a different question:
jQuery $.getJSON works only once for each control. Doesn't reach the server again
The problem only seemed to occur with Chrome and Firefox. Safari was handling the back behavior okay without explicitly having to tell it to not cache.
Old question, but for anyone else seeing this, there is nothing wrong with the questioner's usage of the Accept header.
This is a confirmed bug in Chrome. (Previously also in Firefox but since fixed.)
http://code.google.com/p/chromium/issues/detail?id=94369

Screen scraping an ASP.NET web page to retrieve data displayed in the grid view

I am using Ruby to screen-scrape a web page (created in ASP.NET) which uses a GridView to display data. I am able to read the data displayed on page 1 of the grid, but unable to figure out how to move to the next page of the grid to read all the data.
The problem is that the page-number hyperlinks are not normal hyperlinks (with a URL) but JavaScript hyperlinks which cause a postback to the same page.
An example of the hyperlink:
6
I recommend using Watir, a ruby library designed for browser testing, if you're already using ruby for processing. For one thing, it gives you a much nicer interface to the DOM elements on the page, and it makes clicking links like this easier:
ie.link(:text, '6').click
Then, of course you have easier methods for navigating the table as well. It's easy enough to automate this process:
(1..total_number_of_pages).each do |next_page|
  ie.link(:text, next_page.to_s).click
  # table processing goes here
end
I don't know your use case, but this approach has its advantages and disadvantages. For one thing, it actually runs a browser instance, so if this is something you need to frequently run quietly in the background in completely automated way, this may not be the best approach. On the other hand, if it's ok to launch a browser instance, then you don't have to worry about all that postback nonsense, and you can just click the link as if you were a user.
Watir: http://wtr.rubyforge.org/
You'll need to figure out the actual URL.
Option 1a: Open the page in a browser with good developer support (e.g. Firefox with the web development tools) and look through the source to find where __doPostBack is defined. Figure out what URL it's constructing. Note that it might not be in the main page source, but instead in something that the page loads.
Option 1b: Ditto, but have Ruby do it. If you're fetching the page with Net::HTTP you've got the tools to find the definition of __doPostBack already (the body as a string, Ruby's grep, and the ability to request additional files, such as those in script tags).
Option 2: Monitor the traffic between a browser and the page (e.g. with a logging proxy) to find out what the URL is.
Option 3: Ask the owner of the web page.
Option 4: Guess. This may not be as bad as it sounds (e.g. if the original URL ends with "...?page=1" or something) but in general this is the least likely to work.
Edit (in response to your comment on the other question):
Assuming you're using the Net::HTTP library, you can do a postback by just replacing your get with a post, e.g. my_http.post(my_url, form_data) instead of my_http.get(my_url) (post also takes the URL-encoded form data as its second argument).
Edit (in response to danieltalsky's answer):
watir may be a really good solution for you (I'm kicking myself for not having thought of it), but be aware that you may have to manually fire the event or go through other hoops to get what you want. As a specific gotcha, with any asynchronous fetch like this you need to make sure that the full response has come back before you scrape it; that isn't a problem when you're doing the request inline yourself.
You will have to perform the postback. The data is passed with a form POST back to the server. Like Markus said, use something like Firebug or the Developer Tools in IE 8, plus Fiddler, to watch the traffic. But honestly, this is a web form using the bloated GridView, so you will be in for a fun adventure. ;)
You'll need to do some investigation in order to figure out what HTTP request the javascript execution is performing. I've used the Mozilla browser with the Firebug plugin and also the "Live HTTP Headers" plugin to help determine what is going on. It will likely become clear to you which requests you will need to make in order to traverse to the next page. Make sure you pay attention to any cookies getting set.
I've had really good success using Mechanize for scraping. It wraps all of the HTTP communication, HTML parsing and searching (using Nokogiri), redirection, and holding onto cookies. But it doesn't know how to execute JavaScript, which is why you will need to figure out what HTTP request to perform on your own.
