Symfony2, Doctrine2, session: best way to paginate a search page

Let's imagine we have some simple data and want to paginate it. That's not hard to do: a simple $_GET variable with the page number, plus Doctrine with an offset, lets us do it the easy way. BUT how should it look on a search page? Let me explain.
For example, we have a simple route with a /search URL, where we have a form for our search. When the user inputs a string, we use the POST method on the same page and get the result. Simple enough, but if we add pagination here, it becomes a problem of storing the input string.
Storing the search query in the session looks like a solution, BUT... it's not. Why? The user inputs a search string and gets a result with pagination (at this point the search string is already in the session). Then he leaves the page (or closes the browser, or moves on to another page). When he returns, the data from the session will show him the 'result of the old query'...
So the question is: what is the best practice for such a situation? I want a simple search query plus pagination of it, but if the user leaves the page, the result should be cleared.

Using POST instead of GET for a search query is kind of unusual and not really appropriate. Since search operations are read-only, you should use GET to access/get the data; POST is used for updating or creating resources.
And how will you go back/forward through the pagination (using the browser's buttons)? You will always get an alert box (the browser's re-submission warning). AND you cannot share/bookmark the search query URL.
BTW, to answer your question: sessions and hidden input fields would be the way to go. You can also use a combination of GET and POST.
When should I use GET or POST method? What's the difference between them?
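To make the GET-based approach concrete, here is a minimal sketch of a Symfony2 controller action; the AcmeDemoBundle:Article entity, its title field, and the template name are hypothetical, not from the original question. With both q and page in the query string, pagination links are plain URLs, the back button works, and there is no stale session state to clear:

// In a controller extending Symfony\Bundle\FrameworkBundle\Controller\Controller,
// with "use Symfony\Component\HttpFoundation\Request;" at the top of the file
public function searchAction(Request $request)
{
    $q     = $request->query->get('q', '');                 // search string from the URL
    $page  = max(1, (int) $request->query->get('page', 1)); // current page, 1-based
    $limit = 20;                                            // results per page

    $results = $this->getDoctrine()->getEntityManager()
        ->createQuery('SELECT a FROM AcmeDemoBundle:Article a WHERE a.title LIKE :q')
        ->setParameter('q', '%' . $q . '%')
        ->setFirstResult(($page - 1) * $limit) // offset for the current page
        ->setMaxResults($limit)                // page size
        ->getResult();

    return $this->render('AcmeDemoBundle:Search:results.html.twig',
        array('results' => $results, 'q' => $q, 'page' => $page));
}

A pagination link is then just /search?q=term&page=2, which can be bookmarked, shared, and revisited without any session cleanup.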

Related

Use of Mechanize

I want to get responses from websites that take a simple input, which is also reflected in a parameter of the URL. Is it better to simply get the result using conventional methods, for example OpenURI.open_uri(...) with some parameters set, or is it better to use Mechanize, extract the form, and get the result through submit?
The Mechanize page gives an example of extracting a form and submitting it to get the search results from a Google search. However, the same can be done simply as OpenURI.open_uri("http://www.google.com/search?q=...").read. Is there any reason I should use one way or the other?
There are lots of sites where it turns out to be easiest to use Mechanize. If you need to log in and set a cookie before accessing the data, then Mechanize is a simple way of doing this. Similarly, if there are lots of hidden fields that need to be matched (such as a CSRF token), then fetching the page with Mechanize and then submitting it with the data filled out is often a more foolproof method than crafting the URL yourself.
If it is a simple URI, like Google's search pages, then manually constructing it may be simpler.

XSS protection and HTML Purifier

I am currently using the CodeIgniter framework, and am looking to strengthen the XSS protection by using HTMLPurifier (http://htmlpurifier.org/).
Is my understanding correct that you want to 'clean' data on POST, so that it's purified before it's inserted into the database? Or do I run it before displaying in the view?
If so, do I want to run HTMLPurifier on every single POST that takes place? Since the app contains a lot of forms, I'd hate to have to selectively choose what gets cleaned and what doesn't - assuming I can intercept all POSTs, is this the way to go? Of course, I validate some fields anyway (like email addresses, numeric values, etc.).
Use $this->input->post() to get $_POST data. CodeIgniter filters it automatically if the global XSS filter is set to true.
See the docs: http://codeigniter.com/user_guide/libraries/input.html
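CodeIgniter also lets you enable the XSS filter per call rather than globally, by passing TRUE as the second parameter to post() (the field name below is just an example):

$comment = $this->input->post('comment', TRUE); // TRUE runs the value through xss_clean()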
Edit: to clarify
Yes, you should filter before inserting into the DB, and yes, you should filter all user input.
A quick Google search, http://www.google.com/search?q=codeigniter+htmlpurifier, led to this page: http://codeigniter.com/wiki/htmlpurifier, which is a helper for HTMLPurifier. Regarding catching all $_POST data: you have to do something with the data, right? In your models, when you're doing that something, just make purify() part of that process:
$postdata = purify($_POST);
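For context, such a helper is essentially a thin wrapper around the HTMLPurifier library. A minimal sketch (the include path and the array handling are assumptions, not necessarily the wiki helper's exact code):

// application/helpers/purify_helper.php
if (!function_exists('purify')) {
    function purify($dirty)
    {
        // Adjust this path to wherever HTMLPurifier lives in your project
        require_once APPPATH . 'third_party/htmlpurifier/HTMLPurifier.auto.php';

        // Accept either a single string or an array such as the whole $_POST
        if (is_array($dirty)) {
            return array_map('purify', $dirty);
        }

        $purifier = new HTMLPurifier(HTMLPurifier_Config::createDefault());
        return $purifier->purify($dirty);
    }
}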

CakePHP session data cleared on paginator sort

My session data is being saved in my form as expected.
However, when I run a sort on any column of my results, my form session values are cleared.
I am pulling in my search form through an element, as it's used in specific locations of the site.
Does anyone know why pagination is clearing out my session? Is this standard Cake?
The paginator sort elements are simply links generated by the paginator and won't consider any of your form data. The first thing you need to make sure you're doing is telling the paginator to include any URL parameters for the current page in the URLs it generates. Put this anywhere in the view before you call any of the $paginator functions:
$paginator->options(array('url' => $this->passedArgs));
Secondly, make sure that your search parameters are being included in the URL; it sounds like they probably aren't. I just answered another question on best practices for search result URLs here: CakePHP Search Results Best Practices
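A common way to get the parameters into the URL (sketched below with made-up controller, model, and field names) is to accept the POSTed form once, then redirect to a GET URL that carries the query as a CakePHP named parameter, which $this->passedArgs and the paginator links will then preserve:

// PostsController::search(), CakePHP 1.x style; 'Post' and 'q' are hypothetical
function search() {
    if (!empty($this->data)) {
        // Turn the POSTed form into a bookmarkable GET URL, e.g. /posts/search/q:term
        $this->redirect(array('action' => 'search', 'q' => $this->data['Post']['q']));
    }

    $q = isset($this->passedArgs['q']) ? $this->passedArgs['q'] : '';
    $this->paginate = array('conditions' => array('Post.title LIKE' => "%$q%"));
    $this->set('posts', $this->paginate('Post'));
}

With the $paginator->options() line above in the view, every sort and page link keeps the q:term parameter, so nothing has to live in the session.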
I solved this:
CakePHP session ID path or other method to share the results of a url - recommendations welcome

Scraping pages with asynchronous responses with Hpricot

I'm trying to scrape a page, but the initial response has nothing in the body, as the content is pumped in asynchronously, e.g. the results from a search on the Apple website: http://www.apple.com/uk/search/?q=searching+for+something&sec=global
Any ideas on how I can successfully grab the results from the search with Hpricot?
Thanks.
When the search page you refer to is loaded, it makes a request via JavaScript/AJAX to some other location and then populates the search results. This is what you're seeing in the page. Hpricot itself can't help you here, because it has no way to interpret the JavaScript that comes with the page in order to fetch the actual search results list.
Now, if what you're interested in is the search results, you need to analyze a bit what happens when you enter that page and type a search query. Some JavaScript in the page takes your query and calls (via XMLHttpRequest or similar AJAX techniques) some other script on Apple's server. That is the one that actually does the search in a database and returns the result.
I suggest you install Firefox with the Firebug plugin, or use some other way of seeing the actual requests a page and its JavaScript components send and/or receive. You'll see that, for the search page you referred to, it fetches two parts. First, the "featured" results come from this URL:
http://www.apple.com/global/scripts/search_featured.php?q=mac+mini&section=global&geo=uk
Notice the search string is in the "q" parameter.
Second, a long results list comes from here:
http://www.apple.com/search/service/nph-search10?site=uk_www&filter=1&snum=50&q=mac+mini
Both of these are XML documents; you might have better luck fetching and parsing those URLs directly with Hpricot.

Web Programming with AJAX, Problem with caching (I think)

Web programmer here - using AJAX (HTML, CSS, JavaScript, AJAX, PHP, MySQL), but for some reason Internet Explorer is acting up (surprise surprise).
AJAX is updating query results on the HTML page, via a PHP script that queries a MySQL Database.
Everything is working fine, except when I use Internet Explorer 8.0.
There are several php scripts, which allow for the data to be ordered according to certain criteria, and for testing purposes I have attached the mktime field (current time, in the format HH:MM:SS) to the beginning of the results for each query.
When I use IE, these times appear to remain constant, whereas with ALL other browsers these times are correct and display the current time.
I think the issue has something to do with caching or something along those lines anyway.
Any thoughts or suggestions welcome...
Here is an article on the caching issue.
If your request is a GET, change it to a POST; this will prevent the results from being cached.
GET requests are cached in IE; switch it to a POST request and it won't be cached anymore.
Instead of switching to POST, which can be ugly if you're not really using it to update or create content, you should append a random number to the query string, as in http://domain.com/ajax/some-request?r=123456. If this number is unique for every request, you won't have caching problems.
What I have done is keep the GET and add a new dummy query parameter to the query string, as follows:
./BaseServlet?sname=3d_motor&calcdir=20110514&dummyParam=datetime
I set dummyParam to the value of a Date object in the JavaScript, so that every time the URL is generated the browser will treat it as a new URL and fetch new (fresh) results.
var d = new Date();
url = url + '&dummyParam=' + d.valueOf(); // the millisecond timestamp makes every URL unique
So instead of generating random numbers, this is an easy way!
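Another common fix, if you control the PHP script, is to send no-cache headers before any output, so that IE never caches the GET response at all. A minimal sketch (the JSON payload is just an illustration):

// At the top of the PHP script that serves the AJAX response, before any output
header('Cache-Control: no-cache, no-store, must-revalidate'); // HTTP/1.1 clients
header('Pragma: no-cache');                                   // HTTP/1.0 clients
header('Expires: 0');                                         // and proxies

echo json_encode(array('time' => date('H:i:s'))); // e.g. the HH:MM:SS test value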
