Suggestions on how to handle a static response with dynamic session data - ajax

Imagine a heavily cached website where the output of almost every GET action is cached into an HTML file that is served directly by the HTTP server, without any server-side CGI operation. Now imagine that, in addition, JavaScript is used to filter the response of the HTML request using AJAX. The AJAX response contains only the relevant part of the page (so for standard HTML pages it contains everything except the surrounding layout, for modals it contains only the modal box HTML, etc.).
Now let's imagine that the HTML content may be cached neutrally (when nobody is logged in) or cached for someone who is logged in. Certain areas of the page are tied to session data (the welcome message, the profile link, etc.), and that data is specific to the session. But since we're using JavaScript, we can buffer the AJAX response, change the session element values, and then insert it into the DOM, all while the user is unaware of any session hot swapping. This of course relies on using only GET requests and on pages where the actual content is not 100% session dependent.
Now here is my question. If I were to implement this (and trust me, I will), how might I actually keep track of session activity while the user is browsing the page? With a traditional server-side operation, whenever the user accesses a page, the server-side framework updates the session and keeps tabs on the session-related variables. With a static HTTP request, all server-side involvement is avoided. So I will need to figure out some way of keeping track of what's going on with the session; here are my approaches:
1) Perform two AJAX requests (or an additional one when needed):
When the user requests a page, its contents are downloaded as static HTML. At the same time, a second AJAX request is sent to a session-specific URL/server to update or query the status of the session. This can be done side by side with every page request, or only once every few requests.
Pros = HTML files are left unchanged, HTML files can be given an ETag or a far-future Expires header, JavaScript can cache only the static HTML and use it for offline browsing, a session server can be dedicated, optimized and configured for session activity.
Cons = Two AJAX requests are performed, excessive polling for potentially redundant data, session handling may have to be separated from the content server.
2) Use a midway proxy that appends the session-data as a trailing session JSON
A request is made to the server. A proxy in between reads the session data locally and then performs another HTTP request (either locally or remotely) for the content, which it concatenates with the session data fetched just before. The browser receives a clean copy of the HTML with the session data attached for JavaScript to use, and everything is updated at the same moment.
Pros = Everything is downloaded at once, only one connection required, works like a normal HTTP request would
Cons = Caching gets difficult when a dynamic content proxy is used, the Content-Length header may need to be rewritten to account for the appended data, may not work with some browsers?
3) Use Comet for session data
A persistent, reverse-AJAX Comet connection could be established at the start of the website connection. Then, all static HTML requests could be accessed normally. All session-related requests could be accessed from the Comet connection.
Pros = Separation of static content and dynamic content.
Cons = Comet isn't supported very well and doesn't work very well, server latency, may conflict with same origin policy.
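To make approach 2 a little more concrete, here is a minimal server-side sketch of the "trailing session JSON" idea. The function name, the `session-data` element id and the header choices are my own assumptions for illustration, not part of any particular proxy. It appends the session data to the cached HTML and recomputes Content-Length from the final body instead of patching the cached value.

```python
import json
from typing import Dict, Tuple


def append_session_json(cached_html: bytes, session: dict) -> Tuple[Dict[str, str], bytes]:
    """Append session data as a trailing JSON <script> block and rebuild headers.

    Hypothetical helper: the name and the "session-data" id are assumptions.
    """
    # Escape "</" so a value such as "</script>" cannot close the tag early.
    # "\/" is a legal JSON escape, so the payload still parses normally.
    payload = json.dumps(session).replace("</", "<\\/")
    trailer = (
        '<script type="application/json" id="session-data">'
        + payload
        + "</script>"
    ).encode("utf-8")
    body = cached_html + trailer
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Recomputed from the combined body, not copied from the cached file.
        "Content-Length": str(len(body)),
        # The combined response is per-session, so shared caches must not keep it.
        "Cache-Control": "private, no-store",
    }
    return headers, body
```

The page's JavaScript would then read the JSON block and update the welcome message, profile link and so on. Note that this is exactly where the caching drawback listed above comes from: the combined response can no longer be cached like the plain HTML file.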
How do you guys think this problem should be solved? Do you think it's doable?

The solution I've found is to keep templated data and dynamic data separate from each other. It's too much work and too messy to implement this on your own, so you can go as far as using an MVC framework that combines JSON requests with templating (AngularJS, KnockoutJS, EmbedJS, etc.) or you can just stick to using templates in general. Keep in mind that this hurts SEO, since crawlers that don't execute JavaScript will not see the content that is filled in on the client.

Related

Caching web pages with high-frequently changing elements

What would be a good general approach to caching a web page where most of the content, which lives in a database, almost never changes (e.g. description) but a small part changes very frequently (e.g. stock items)?
I want to keep the web page cached as long as possible. Would it be an option to get the dynamic content via AJAX request? Do better approaches exist?
You could request the stock data from a separate URL and use JavaScript to insert it into the document. That way, the HTML/CSS/JS remains the same and can be cached. The stock information is loaded using JavaScript and it's not inserted into the HTML by the server.
You could create a URL that returns JSON for this purpose (and similarly for other information that you wish to include using JavaScript).
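As a rough sketch of what such a JSON URL could return, assuming a hypothetical in-memory `STOCK` table and a handler that hands back a status, headers and body (the names are made up for illustration and not tied to any framework):

```python
import json

# Hypothetical in-memory stock levels; a real site would query its database.
STOCK = {"sku-1": 12, "sku-2": 0}


def stock_response(sku):
    """Return (status, headers, body) for a small JSON stock lookup."""
    headers = {
        "Content-Type": "application/json",
        # Stock changes often, so ask caches not to keep this response,
        # while the surrounding HTML/CSS/JS stays cacheable.
        "Cache-Control": "no-store",
    }
    if sku not in STOCK:
        return 404, headers, json.dumps({"error": "unknown sku"})
    return 200, headers, json.dumps({"sku": sku, "in_stock": STOCK[sku]})
```

The page's JavaScript would request this URL after the cached page has loaded and write the `in_stock` value into the document.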

Designing an application around HMVC and AJAX [Kohana 3.2]

I am currently designing an application that will have a few different pages, and each page will have components that update through AJAX. The layout is similar to the new Twitter design where 'Home', 'Discover', and 'Connect' are separate pages, but interacting within the page (such as clicking 'Followers' or 'Following') uses AJAX.
Since the design requires an initial page load with several components (in the context of Twitter: tweets, followers, following), each of which can be updated individually through AJAX, I thought it'd be best to have a default controller for serving pages, and other controllers with actions that, rather than serving full pages, strictly handle querying the database and returning JSON objects. This way, on initial page load several HMVC requests can be made to gather the data for each component, and AJAX calls can also be made to update each component individually.
My idea is to have a Controller_Default that handles serving pages. In the context of Twitter, Controller_Default would contain:
action_home()
action_connect()
action_discover()
I would then have other Controllers that don't deal with serving full pages, but rather components of pages. For instance, in the context of Twitter Controller_Tweet may have:
action_get()
which returns a JSON object containing tweets for a specific user. action_home() could then make several HMVC requests to get the data for the different components of the page (i.e. make requests to 'tweet/get', 'followers/get', 'following/get'). While on the page, however, AJAX calls could be made to the function-specific controllers (i.e. 'tweet/get') to update the content.
My question: is this a good design? Does it make sense to have the pages served through a default controller, with page components served (in JSON format) through other function-specific controllers?
If there is any confusion regarding the question please feel free to ask for clarification!
One of the strengths of the HMVC pattern is that employing this type of layered application doesn't lock you into a workflow that might be difficult to change later on.
From what you've indicated above, this would be perfectly acceptable as a way of serving content to a client; the default controller wraps sub-requests, which avoids multiple AJAX calls from the client to achieve the same goal.
Two suggestions I might make:
Ensure that your Twitter back-end requests are abstracted out and managed in a library to make the application DRY'er and easier to maintain.
Consider whether the default controller is making only the absolutely necessary calls on each request. Employ caching to avoid pulling infrequently changed data on every request (e.g., followers might only be updated every 30 seconds). This of course depends entirely on your application requirements, but if you get heavily loaded you could quickly find your Twitter API request limit being reached.
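The caching suggestion could look roughly like the sketch below. It is written in Python rather than Kohana/PHP purely to show the idea, and `TTLCache` and `get_or_load` are hypothetical names, not a Kohana API. A value is reloaded only once its time-to-live has expired, so a component such as followers hits the back end at most once every 30 seconds.

```python
import time


class TTLCache:
    """Tiny time-to-live cache (illustrative only, not thread-safe)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store = {}    # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: skip the expensive call
        value = loader()             # expired or missing: fetch again
        self._store[key] = (now + self.ttl, value)
        return value
```

For example, `cache = TTLCache(30)` followed by `cache.get_or_load("followers", fetch_followers)` would call the assumed `fetch_followers` function at most once per 30 seconds per key.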
One final observation: if you do find the server is experiencing high load and Twitter API requests are taking a long time to return, consider provisioning another server and installing a copy of your application. You can then "point" sub-requests from the default gateway application to your second application server, which should help improve response times if the two servers are connected by a high-speed link.

What's the minimum an application needs to be considered an Ajax application?

Is anything that uses JavaScript and asynchronous communication of XML data considered Ajax?
Most people who deal with AJAX would consider any usage of XMLHttpRequest to be AJAX.
This doesn't mean that the request need be async either.
These days, JSON replaces XML for communications.
From wikipedia:
With Ajax, web applications can send data to, and retrieve data from, a server asynchronously (in the background) without interfering with the display and behavior of the existing page. Data is usually retrieved using the XMLHttpRequest object. Despite the name, the use of XML is not needed (JSON is often used instead), and the requests need not be asynchronous.
AJAX seems to encompass any application which retrieves data using the XMLHttpRequest object. Despite its name you don't need to use XML, and I'd wager most AJAX apps these days are using JSON instead. Also they don't necessarily make asynchronous requests. We probably need a new buzzword at this point. Maybe websockets will take off!
The term AJAX and its abbreviation is a misnomer. It has nothing to do with XML. It typically refers to the XMLHttpRequest function. The name of this function again is a misnomer because you could use it to get or send JSON data, plain text, or even binary data now.
AsyncHttpRequest would have been a more appropriate term for the function, and AJAH (Asynchronous JavaScript and HTML) instead of AJAX. On a side note, although XMLHttpRequest allows synchronous requests too, they'd probably be better off getting rid of it altogether.
Typically AJAX applications make good use of asynchronous calls and avoid page refreshes as much as possible. Gmail is a good example. Facebook, on a modern browser, also uses AJAX. Clicking on different links like "News Feed", "Events", etc. doesn't cause a page reload although the path in the address bar changes. GitHub does the same on modern browsers.

Lazy HTTP caching

I have a website which is displayed to visitors via a kiosk. People can interact with it. However, since the website is not locally hosted, and uses an internet connection - the page loads are slow.
I would like to implement some kind of lazy caching mechanism such that as and when people browse the pages - the pages and the resources referenced by the pages get cached, so that subsequent loads of the same page are instant.
I considered using HTML5 offline caching - but it requires me to specify all the resources in the manifest file, and this is not feasible for me, as the website is pretty large.
Is there any other way to implement this? Perhaps using HTTP caching headers? I would also need some way to invalidate the cache at some point to "push" the new changes to the browser...
The usual approach to handling problems like this is with HTTP caching headers, combined with smart construction of URLs for resources referenced by your pages.
The general idea is this: every resource loaded by your page (images, scripts, CSS files, etc.) should have a unique, versioned URL. For example, instead of loading /images/button.png, you'd load /images/button_v123.png and when you change that file its URL changes to /images/button_v124.png. Typically this is handled by URL rewriting over static file URLs, so that, for example, the web server knows that /images/button_v124.png should really load the /images/button.png file from the web server's file system. Creating the version numbers can be done by appending a build number, using a CRC of file contents, or many other ways.
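Here is a small sketch of the CRC variant, assuming a `_v<crc32>` suffix before the file extension. The function names and the exact suffix format are my own choices for illustration (the `_v123` style above would work the same way). One function builds the versioned URL from the file contents, the other does what the URL rewrite rule would do and maps it back to the real file.

```python
import posixpath
import re
import zlib


def versioned_url(path, content):
    """Insert a CRC32 of the file contents before the extension."""
    root, ext = posixpath.splitext(path)
    checksum = zlib.crc32(content) & 0xFFFFFFFF
    return "%s_v%08x%s" % (root, checksum, ext)


# Matches the "_v" + 8 hex digits suffix that sits right before the extension.
_VERSION_RE = re.compile(r"_v[0-9a-f]{8}(?=\.[^./]+$)")


def strip_version(url):
    """Map a versioned URL back to the real file path (the rewrite step)."""
    return _VERSION_RE.sub("", url)
```

Because the version is derived from the contents, the URL changes only when the file actually changes, so no build number has to be maintained by hand.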
Then you need to make sure that, wherever URLs are constructed in the parent page, they refer to the versioned URL. This obviously requires dynamic code used to construct all URLs, which can be accomplished either by adjusting the code used to generate your pages or by server-wide plugins which affect all text/html requests.
Then you set the Expires header for all resource requests (images, scripts, CSS files, etc.) to a date far in the future (e.g. 10 years from now). This effectively caches them forever. This means that all resources loaded by each of your pages will always be fetched from cache; cache invalidation never happens, which is OK because when the underlying resource changes, the parent page will use a new URL to find it.
Finally, you need to figure out how you want to cache your "parent" pages. How you do this is a judgement call. You can use ETag/If-None-Match HTTP headers to check for a new version of the page every time, which will very quickly load the page from cache if the server reports that it hasn't changed. Or you can use Expires (and/or Cache-Control: max-age) to load the parent page from cache for a given period of time before checking the server again.
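The ETag/If-None-Match option can be sketched as follows. This is a simplified illustration with hypothetical function names; deriving the tag from a hash of the page bytes is just one possible choice. The server returns 304 with an empty body when the browser's tag still matches, and the full page otherwise.

```python
import hashlib


def make_etag(body):
    """Derive an ETag from the page bytes (quoted, as HTTP requires)."""
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]


def conditional_response(body, if_none_match=None):
    """Return (status, headers, body) for a page request.

    if_none_match is the raw If-None-Match request header, or None.
    """
    etag = make_etag(body)
    # "no-cache" lets the browser store the page but revalidate it every time.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            if tag.startswith("W/"):
                tag = tag[2:]  # If-None-Match uses weak comparison
            candidates.append(tag)
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # unchanged: browser reuses its copy
    return 200, headers, body
```

The 304 path still costs one round trip per page view, which is the trade-off against the Expires/max-age option: that one avoids the round trip but delays new changes by the chosen period.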
If you want to do something even more sophisticated, you can always put a custom proxy server on the kiosk-- in that case you'd have total, centralized control over how caching is done.

Cross site scripting(XSS)

I am loading content from another page and, depending on the content of that page, changing the content of my page, and this is giving me cross-site scripting issues.
When I use an iframe, the content of the iframe becomes inaccessible because it comes from another domain.
When I use AJAX and try to inject the content as plain HTML code, the XMLHttpRequest object throws a permission denied exception due to cross-site scripting.
When I use JSONP, such as getJSON in jQuery, it only supports the GET method, which is not adequate for further processing.
I wonder what other options I can try. I heard that Dojo, GWT and Adobe AIR do some XSS, but I don't know which one is best.
Thanks,
Ebe.
Without JSON-P, your only option is to run a proxy script on your own server that fetches the content from the external site and pipes it back to the browser.
The browser fetches the content from the script on your server, hence no cross-domain issues, but the script on your server dynamically fetches it from the external site.
There's an example of such a script in PHP here: http://www.daniweb.com/code/snippet494.html (NB. I haven't personally used it).
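For comparison, a minimal Python sketch of the same proxy idea is below. The allowlist, the function names and the injectable `opener` parameter are my own assumptions for illustration. Without some kind of host allowlist, a script like this would fetch any URL a visitor hands it, so the check is part of the sketch.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

# Hypothetical allowlist of external hosts this proxy may fetch from.
ALLOWED_HOSTS = {"example.com"}


def is_allowed(url):
    """Accept only http(s) URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.hostname in ALLOWED_HOSTS


def proxy_fetch(url, opener=urlopen, timeout=10):
    """Fetch url on the server and return (status, body) for the browser."""
    if not is_allowed(url):
        return 403, b"host not allowed"
    # The request leaves from the server, so the browser only ever talks
    # to its own origin and no cross-domain restriction applies.
    with opener(url, timeout=timeout) as response:
        return response.status, response.read()
```

A server-side handler on your own domain would call `proxy_fetch` with the requested external URL and write the returned body back to the browser.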
If you have control over both domains, take a look at EasyXDM. It's a library which wraps cross-browser quirks and provides an easy-to-use API for communicating in client script between different domains using the best available mechanism for that browser (e.g. postMessage if available, other mechanisms if not).
Caveat: you need to have control over both domains in order to make it work (where "control" means you can place static files on both of them). But you don't need any server-side code changes.
To add to what RichieHindle says, there are some good scripts (Python + cron) that you can plonk on your server to check for changes at a POST/GET location and cache the changes on your server.
Set your triggers low (once every 10 minutes, or once per day), or you might get blacklisted by the target.
This way, a local cache won't incur the HTTP overhead on every AJAX call from the client.

Resources