Web application design with/without AJAX

Let's say I am creating a webapp for a library. My base url is http://mylibrary.com. I want to use "pretty" URLs as follows:
http://mylibrary.com/books (list all books)
http://mylibrary.com/books/book1 (details of a particular book)
At present my approach is to create a single-page app and use the History API to manage the URLs, i.e. I load all CSS and JS files when the user visits the home page. From then on I just get data from the server using AJAX, in JSON format, and create the required HTML using JavaScript.
But I have learnt that this is not so good from an SEO point of view. If a crawler were to visit http://mylibrary.com/books, it would not see the book list at all because the AJAX calls would not take place.
My question is: what is the other approach to designing this kind of app? Specifically:
Should the server create the entire web page and send it to the browser? I mean, will the response from the server include everything from <html> to </html>, or only the required parts?
Do programming languages like PHP manage to send the HTML to clients efficiently? I would rather have the web server do it.
It appears to me that in this scenario AJAX would have very little role to play other than maybe changing minor parts of the page. Is that a correct understanding? And here I was thinking AJAX is the modern way of doing things.

A library would have many books, so the list would be long.
Using AJAX allows you to fetch only the part of the list the user is trying to read, without having to retrieve the entire list or navigate by reloading.
So for low-bandwidth connections and impatient users, AJAX is a godsend.
For crawlers that need the entire page to collect data from, not so much.
So really you want to provide different content depending on the visitor.
How to identify web-crawler?
IMHO: serve the page from PHP. If the user agent is a robot, provide the full list; otherwise provide the fancy AJAX-based site that shows only what you want, when you want.
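The user-agent check suggested above could be sketched roughly like this. It is only an illustration: the token list and the names isLikelyCrawler and chooseResponse are made up for this example, and user-agent strings can be spoofed, so the check is a heuristic at best.

```javascript
// Hypothetical helper: guess whether a User-Agent header belongs to a crawler.
// The token list is an assumption for illustration, not an exhaustive registry.
const CRAWLER_TOKENS = ["bot", "crawler", "spider", "slurp"];

function isLikelyCrawler(userAgent) {
  const ua = (userAgent || "").toLowerCase();
  return CRAWLER_TOKENS.some((token) => ua.includes(token));
}

// Decide which variant of /books to send: the full server-rendered list for
// crawlers, or the lightweight shell that loads data over AJAX for everyone else.
function chooseResponse(userAgent) {
  return isLikelyCrawler(userAgent) ? "full-html" : "ajax-shell";
}
```

The server-side script would call chooseResponse with the request's User-Agent header and render accordingly.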

Related

Caching web pages with high-frequently changing elements

What would be a good general approach to caching a web page where most of the content, which lives in a database, almost never changes (e.g. the description) but a small part changes very frequently (e.g. stock items)?
I want to keep the web page cached as long as possible. Would it be an option to get the dynamic content via AJAX request? Do better approaches exist?
You could request the stock data from a separate URL and use JavaScript to insert it into the document. That way, the HTML/CSS/JS remains the same and can be cached. The stock information is loaded using JavaScript and it's not inserted into the HTML by the server.
You could create a URL that returns JSON for this purpose (and similarly for other information that you wish to include using JavaScript).
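A minimal sketch of that idea, assuming the cached HTML contains placeholder elements whose ids match the keys of the stock JSON. The JSON shape and the names stockLabel and applyStock are assumptions made for this example.

```javascript
// Hypothetical shape of the JSON returned by the separate stock URL:
//   { "sku-1": 12, "sku-2": 0 }

// Turn a stock count into the text shown to the user.
function stockLabel(count) {
  return count > 0 ? `${count} in stock` : "Out of stock";
}

// Fill placeholders with fresh stock data. `placeholders` is any iterable of
// objects with an `id` and a writable `textContent` (DOM elements qualify).
// Placeholders without a matching entry are left untouched.
function applyStock(placeholders, stockById) {
  for (const el of placeholders) {
    if (Object.prototype.hasOwnProperty.call(stockById, el.id)) {
      el.textContent = stockLabel(stockById[el.id]);
    }
  }
  return placeholders;
}
```

In a browser, the placeholders could come from document.querySelectorAll and the data from a fetch call to the stock URL, so only that small JSON response bypasses the cache.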

Designing an application around HMVC and AJAX [Kohana 3.2]

I am currently designing an application that will have a few different pages, and each page will have components that update through AJAX. The layout is similar to the new Twitter design where 'Home', 'Discover', and 'Connect' are separate pages, but interacting within the page (such as clicking 'Followers' or 'Following') uses AJAX.
Since the design requires an initial page load with several components (in the context of Twitter: tweets, followers, following), each of which can be updated individually through AJAX, I thought it'd be best to have a default controller for serving pages, and other controllers with actions that, rather than serving full pages, strictly handle querying the database and returning JSON objects. This way, on initial page load several HMVC requests can be made to gather the data for each component, and AJAX calls can also be made to update each component individually.
My idea is to have a Controller_Default that handles serving pages. In the context of Twitter, Controller_Default would contain:
action_home()
action_connect()
action_discover()
I would then have other Controllers that don't deal with serving full pages, but rather components of pages. For instance, in the context of Twitter Controller_Tweet may have:
action_get()
which returns a JSON object containing tweets for a specific user. action_home() could then make several HMVC requests to get the data for the different components of the page (i.e. make requests to 'tweet/get', 'followers/get', 'following/get'). While on the page, however, AJAX calls could be made to the function-specific controllers (e.g. 'tweet/get') to update the content.
My question: is this a good design? Does it make sense to have the pages served through a default controller, with page components served (in JSON format) through other function-specific controllers?
If there is any confusion regarding the question please feel free to ask for clarification!
One of the strengths of the HMVC pattern is that employing this type of layered application doesn't lock you into a workflow that might be difficult to change later on.
From what you've indicated above, this would be perfectly acceptable as a way of serving content to a client; the default controller wraps sub-requests, which avoids multiple AJAX calls from the client to achieve the same goal.
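The split between a page-serving controller and JSON component controllers can be sketched in a framework-neutral way. This is plain JavaScript rather than Kohana code; the route names come from the question, while the hard-coded data and the function names handleAjax and actionHome are assumptions for illustration.

```javascript
// Component handlers: each returns plain data for one part of the page.
// The data here is hard-coded; a real handler would query the database.
const components = {
  "tweet/get": (user) => ({ tweets: [`hello from ${user}`] }),
  "followers/get": (user) => ({ followers: ["alice", "bob"] }),
  "following/get": (user) => ({ following: ["carol"] }),
};

// AJAX entry point: one route, one component, JSON out.
function handleAjax(route, user) {
  const handler = components[route];
  if (!handler) {
    return { status: 404, body: JSON.stringify({ error: "unknown route" }) };
  }
  return { status: 200, body: JSON.stringify(handler(user)) };
}

// Page entry point (the role of action_home): gathers every component
// server-side through internal sub-requests so the first response is complete.
function actionHome(user) {
  const data = {};
  for (const route of Object.keys(components)) {
    Object.assign(data, components[route](user));
  }
  return data;
}
```

Both entry points reuse the same handlers, so the initial page and later AJAX updates cannot drift apart.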
Two suggestions I might make:
Ensure that your Twitter back-end requests are abstracted out and managed in a library to make the application DRY'er and easier to maintain.
Consider whether the default controller is making only the absolutely necessary calls on each request. Employ caching to avoid pulling infrequently changed data on every request (e.g., followers might only be updated every 30 seconds). This of course depends entirely on your application requirements, but if you get heavily loaded you could quickly find your Twitter API request limit being reached.
One final observation: if you do find the server is experiencing high load and Twitter API requests are taking a long time to return, consider provisioning another server and installing a copy of your application. You can then "point" sub-requests from the default gateway application to your second application server, which should help improve response times if the two servers are connected by a high-speed link.
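The caching suggestion could look something like the following time-based cache. It is a sketch only: createTtlCache is a hypothetical name, and the 30-second figure in the usage note is simply the example interval mentioned above.

```javascript
// Minimal time-based cache. `now` is injectable so the behaviour can be
// checked without waiting; it defaults to the system clock.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    // Return the cached value for `key`, or call `load` and remember the result.
    get(key, load) {
      const hit = entries.get(key);
      if (hit && now() - hit.storedAt < ttlMs) {
        return hit.value;
      }
      const value = load();
      entries.set(key, { value, storedAt: now() });
      return value;
    },
  };
}
```

With createTtlCache(30000), a followers lookup would hit the backing API at most once every 30 seconds per key, however many page requests arrive in between.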

when to use AJAX and when not to use AJAX in web application

We have web applications elgifto.com, roadbrake.com in which we used AJAX at many places, especially to update major portions of a page. All the important functionality of elgifto.com was implemented using AJAX. Now we realize a few issues due to AJAX implementation.
1. All the content implemented using AJAX is not available to the SEO bots and it is hurting the page rank of our site.
2. Users will not be able to bookmark some of the pages as they are always available through AJAX.
3. When we want to direct the user from one page through an anchor link to another page having AJAX, we find it difficult.
So now we are thinking of removing AJAX from these pages and using it only for small functionality, such as something similar to marking a question as a favorite on SO. Before going ahead and removing it, we would like to hear experts' opinions on this. Thanks.
The problem is not "AJAX" per se, but your implementation of it. Just as a for instance, you can fix the 'bookmark' problem like google maps does it: provide a generated link for each state of your webapp.
SEO can be fixed by supplying several of these state links to the crawlers, either organically through links in your site or by supplying a list (sitemap).
If you implement 2, you can fix 1 and 3 with those links.
In the end you must figure out whether the effort is worth it, and of course whether you are overusing AJAX, but the statements you've made are not set in stone at all.
I'm constantly developing AJAX-based websites, with no problems for SEO at all. You just have to use it in the best possible way.
For example, I have a website with normal links pointing to normal webpages (PHP pages), this for normal navigation if a user doesn't have JS enabled. But if a user has JS enabled, a script will change the links behavior, only fetching the content of the page needed.
This way you still have physically separate webpages with all their content, which will be indexed as normal.
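A rough sketch of such a link-enhancing script, assuming a callback that fetches the target page's content and swaps it into the current page. The names enhanceLinks and loadContent are hypothetical.

```javascript
// Attach AJAX behaviour to ordinary links. Each link only needs `href` and
// `addEventListener`, so real <a> elements work; `loadContent` is a
// hypothetical callback that fetches the content and inserts it into the page.
function enhanceLinks(links, loadContent) {
  for (const link of links) {
    link.addEventListener("click", (event) => {
      event.preventDefault(); // stop the full page load
      loadContent(link.href);
    });
  }
}
```

Without JavaScript the script never runs, so the links keep pointing at the normal pages, which is also what a crawler follows.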

Google Search optimisation for ajax calls

I have a page on my site which has a list of things which gets updated frequently. This list is created by calling the server via jsonp, getting json back and transforming it into html. Fast and slick.
Unfortunately, Google isn't able to index it. After reading up on how to get this done according to Google's AJAX crawling guide, I am bit confused and need some clarification and confirmation:
Only the AJAX pages need to implement the rules, right?
I currently have a rest url like
[site]/base/junkets/browse.aspx?page=1&rows=18&sidx=ScoreAll&sord=desc&callback=jsonp1295964163067
this would need to become something like:
[site]/base/junkets/browse.aspx#page=1&rows=18&sidx=ScoreAll&sord=desc&callback=jsonp1295964163067
And when google calls it like this
[site]/base/junkets/browse.aspx#!page=1&rows=18&sidx=ScoreAll&sord=desc&callback=jsonp1295964163067
I would have to deliver the html snapshot.
Why replace the ? with # ?
Creating html snapshots seems very cumbersome. Would it suffice to just serve simple links? In my case I would be happy if google would only index the things pages.
It looks like you've misunderstood the AJAX crawling guide. The #! notation is to be used on links to the page your AJAX application lives within, not on the URL of the service your application makes calls to. For example, if I access your app by going to example.com/app/, then you'd make the page crawlable by instead linking to example.com/app/#!page=1.
Now when Googlebot sees that URL in a link, instead of going to example.com/app/#!page=1 – which means issuing a request for example.com/app/ (recall that the hash is never sent to the server) – it will request example.com/app/?_escaped_fragment_=page=1. If _escaped_fragment_ is present in a request, you know to return the static HTML version of your content.
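To make the rewrite concrete, here is a small sketch of the URL mapping described above. The function name is made up, and the escaping is a simplification: only %, #, & and + in the fragment are percent-escaped here, which is an assumption and not the complete rule set.

```javascript
// Rewrite a #! URL into the form the crawler is described as requesting.
// Simplification (assumption): only %, #, & and + in the fragment are
// percent-escaped; other special characters are passed through unchanged.
function toEscapedFragmentUrl(url) {
  const marker = url.indexOf("#!");
  if (marker === -1) {
    return url; // nothing to rewrite
  }
  const base = url.slice(0, marker);
  const fragment = url
    .slice(marker + 2)
    .replace(/[%#&+]/g, (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase());
  // Append to an existing query string if the base URL already has one.
  const separator = base.includes("?") ? "&" : "?";
  return `${base}${separator}_escaped_fragment_=${fragment}`;
}
```

Escaping & matters because an unescaped one would split the fragment into several separate query parameters on the server.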
Why is all of this necessary? Googlebot does not execute script (nor does it know how to index your JSON objects), so it has no way of knowing what ends up in front of your users after your scripts run and content is loaded. So, your server has to do the heavy lifting of producing an HTML version of what your users ultimately see in the AJAXy version.
So what are your next steps?
First, either change the links pointing to your application to include #!page=1 (or whatever), or add <meta name="fragment" content="!"> to your app's HTML. (See item 3 of the AJAX crawling guide.)
When the user changes pages (if this is applicable), you should also update the hash to reflect the current page. You could simply set location.hash='#!page=n';, but I'd recommend using the excellent jQuery BBQ plugin to help you manage the page's hash. (This way, you can listen to changes to the hash if the user manually changes it in the address bar.) Caveat: the currently released version of BBQ (1.2.1) does not support AJAX crawlable URLs, but the most recent version in the Git master (1.3pre) does, so you'll need to grab it here. Then, just set the AJAX crawlable option:
$.param.fragment.ajaxCrawlable(true);
Second, you'll have to add some server-side logic to example.com/app/ to detect the presence of _escaped_fragment_ in the query string, and return a static HTML version of the page if it's there. This is where Google's guidance on creating HTML snapshots might be helpful. It sounds like you might want to pursue option 3. You could also modify your service to output HTML in addition to JSON.
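The server-side check might be sketched like this. It is written in JavaScript for illustration (the question uses ASP.NET), and requestedSnapshotState, respond, renderSnapshot and renderApp are hypothetical names.

```javascript
// Return the application state a crawler asked for, or null for a normal visit.
function requestedSnapshotState(requestUrl) {
  // The base is only needed so that relative request paths parse.
  const params = new URL(requestUrl, "http://example.com").searchParams;
  return params.has("_escaped_fragment_") ? params.get("_escaped_fragment_") : null;
}

// Hypothetical dispatcher: `renderSnapshot` and `renderApp` stand in for
// whatever produces the static HTML and the normal AJAX page.
function respond(requestUrl, renderSnapshot, renderApp) {
  const state = requestedSnapshotState(requestUrl);
  return state === null ? renderApp() : renderSnapshot(state);
}
```

The state string comes back percent-decoded, so the snapshot renderer can parse it the same way the client-side code parses the hash.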
I've more or less given up on this. There really seems to be no alternative to generating the HTML on the server and delivering it in the HTML body if you want Google to index your directory.
I even tried adding a section wrapped in a .NET user control which implemented a simple HTML version of the directory. But Google managed to ignore that too.
So in the end my directory has been de-ajaxified. :(

With Google's #! mess, what effect would a redirect on the converted URL have?

So Google takes:
http://www.mysite.com/mypage/#!pageState
and converts it to:
http://www.mysite.com/mypage/?_escaped_fragment_=pageState
...So... Would it be fair game to redirect that with a 301 status to something like:
http://www.mysite.com/mypage/pagestate/
and then return an HTML snapshot?
My thought is if you have an existing html structure, and you just want to add ajax as a progressive enhancement, this would be a fair way to do it, if Google just skipped over _escaped_fragment_ and indexed the redirected URL. Then your ajax links are configured by javascript, and underneath them are the regular links that go to your regular site structure.
So then when a user comes in on a static url (ie http://www.mysite.com/mypage/pagestate/ ), the first link he clicks takes him to the ajax interface if he has javascript, then it's all ajax.
On a side note, does anyone know if Yahoo/MSN are on board with this 'spec' (loosely used)? I can't seem to find anything that says for sure.
If you redirect the "?_escaped_fragment_" URL, it will likely result in the final URL being indexed (which might result in a suboptimal user experience, depending on how you have your site set up). There might be a reason to do it like that, but it's hard to say in general.
As far as I know, other search engines are not yet following the AJAX-crawling proposal.
You've pretty much got it. I recently did some tests and experimented with sites like Twitter (which uses #!) to see how they handle this. From what I can tell they handle it like you're describing.
If this is your primary URL
http://www.mysite.com/mypage/#!pageState
Google/Facebook will go to
http://www.mysite.com/mypage/?_escaped_fragment_=pageState
You can set up a server-side 301 redirect to a prettier URL, perhaps something like
http://www.mysite.com/mypage/pagestate/
On these HTML snapshot pages you can add a client-side redirect to send most people back to the dynamic version of the page. This ensures most people share the dynamic URL. For example, if you try to go to http://twitter.com/brettdewoody it'll redirect you to the dynamic (https://twitter.com/#!/brettdewoody) version of the page.
To answer your last question, both Google and Facebook use the _escaped_fragment_ method right now.
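The redirect step could be computed along these lines. This is a sketch under assumptions: the function name is made up, and lower-casing the state and using it as a single path segment simply mirrors the example URLs above.

```javascript
// Map /mypage/?_escaped_fragment_=pageState to /mypage/pagestate/.
// Lower-casing the state and using it as one path segment is an assumption
// taken from the example URLs; other mappings are equally possible.
function snapshotRedirect(requestUrl) {
  const url = new URL(requestUrl);
  const state = url.searchParams.get("_escaped_fragment_");
  if (!state) {
    return null; // not a crawler request, or no state: serve normally
  }
  const path = url.pathname.endsWith("/") ? url.pathname : url.pathname + "/";
  const segment = encodeURIComponent(state.toLowerCase());
  return { status: 301, location: `${url.origin}${path}${segment}/` };
}
```

A null result means the request should be handled as usual; otherwise the server answers with the given status and a Location header.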
