Combining apache requests? - ajax

I have an ajax client that often needs to retrieve 3-10 static documents from the server. Those 3-10 documents are selected by the client from roughly 100 documents in total. I have no way of knowing in advance which 3-10 documents the client will require.
Additionally, those 100 documents are generated from database content, and are dynamic.
It does not seem logical, though, to make a separate Ajax request per document.
My first thought was to write a JSP that makes use of the include action.
i.e., in pseudocode:
for (param in params){
jsp:include page="[param]"
}
Tomcat doesn't support this well, because it doesn't just include the HTML resource; it recompiles it, generating a class file each time, which also seems costly.
Can the community provide a solution for combining Apache requests for static files into a single request rather than several, without the overhead of an extra class file for each static file and in a way that avoids regeneration each time a static file changes?

You can certainly write a Servlet (or JSP if you're really insistent on doing so) which will serve the contents of a set of documents.
However, doing so raises some issues:
How do you delimit the documents so that clients know which is which?
How do you stop clients from requesting something they don't have permission to access? (This one is tricky.)
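As a rough illustration of both points, here is a minimal sketch in JavaScript rather than a servlet; the same logic would apply in Java. It assumes a fixed whitelist of document names (ALLOWED_DOCS) and a loadDoc callback, both hypothetical. Keying the combined response by document name is one possible way to delimit the documents, and the whitelist is only one possible permission check.

```javascript
// Hypothetical whitelist: the only document names clients may request.
const ALLOWED_DOCS = new Set(["intro", "pricing", "faq"]);

// Combine the requested documents into one object keyed by name.
// `loadDoc` is an assumed callback that returns a document's content.
function combineDocuments(requestedNames, loadDoc) {
  const documents = {};
  const rejected = [];
  for (const name of requestedNames) {
    if (ALLOWED_DOCS.has(name)) {
      documents[name] = loadDoc(name);
    } else {
      rejected.push(name); // storage is never touched for unknown names
    }
  }
  return { documents, rejected };
}

// Stand-in document store for the example.
const store = { intro: "<p>Intro</p>", pricing: "<p>Pricing</p>", faq: "<p>FAQ</p>" };
const combined = combineDocuments(["intro", "../secret", "faq"], (name) => store[name]);
console.log(JSON.stringify(combined));
```

The client can then read each document from the documents object by name, and the rejected list tells it which names were refused.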

Related

REST API for main page - one JSON or many?

I'm providing RESTful API to my (JS) client from (Java Spring) server.
The main page of the site contains a number of logical blocks (news, latest comments, some trending stuff), each of which has a corresponding entity on the server. Which is the right way to go: handle one request like
/api/main_page/ ->
{
news: {...}
comments: {...}
...
}
or let the client do a few requests like
/api/news/
/api/comments/
...
I know that in general it's better to have one large request/response, but does that apply to this situation as well?
Ideally, you should have separate API calls for fetching the individual, configurable content blocks of the page from the same API.
This way your content blocks are only loosely coupled to each other.
You can extend, port (to a new framework), and modify them independently at any time.
This becomes extremely useful as the application grows.
Switching off a feature is fairly easy in this case.
A/B testing is also easy in this case.
Writing automation is also very easy.
Overall, it helps reduce the testing effort.
But if you really want to fetch everything in one call, add extra parameters to the request; when the server sees such a parameter, it adds the corresponding independent JSON to the response by calling its own method in the business-logic (BL) layer.
And if speed is your concern, try caching these calls on the server for some time (how long depends on the type of application).
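The "additional params" idea could look roughly like the sketch below. It is written in JavaScript for brevity, not Spring, and everything in it is an assumption for illustration: the include parameter name, the blockProviders map, and the sample data standing in for the business-logic methods.

```javascript
// Hypothetical block providers standing in for business-logic methods.
const blockProviders = {
  news: () => [{ id: 1, title: "Example headline" }],
  comments: () => [{ id: 7, text: "Example comment" }],
};

// Build a response containing only the blocks named in `include`,
// a comma-separated query parameter (an assumed name).
function buildMainPage(include) {
  const response = {};
  for (const name of include.split(",").map((s) => s.trim())) {
    // The own-property check keeps names like "constructor" from matching.
    if (Object.prototype.hasOwnProperty.call(blockProviders, name)) {
      response[name] = blockProviders[name]();
    }
  }
  return response;
}

// e.g. /api/main_page/?include=news,comments
const page = buildMainPage("news,comments,unknown");
console.log(Object.keys(page));
```

Each block still has its own provider, so the individual endpoints and the combined one can share the same code.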
I think multiple requests can generally be justified when the requested resources reflect parts of the system state (my personal rule of thumb, still a work in progress).
For example, if a news item is displayed in many places in your client application, I would request it once and reuse it wherever I can. If you aggregate instead, you would have to request it again later, some of the aggregated items might never actually be displayed, and you would need some magic if the representation of a news item in the aggregation differs from the /news/{id} resource.
This approach increases communication when the page is loaded for the first time, but decreases it the longer your client application runs.
The state on the server is copied to your client request by request, or updated when needed (ETags, Last-Modified, etc.).
In your example it looks like /news and /comments return some sort of latest or since-last-visit subset, not everything.
If that is true, I would design those subsets as resources as well, such as /comments/latest or similar.
In any case, I would have them contain only self-links to /news/{id} or /comments/{id} respectively. A request to /comments/latest would then return a list of self-links, and I would start a request for each one only if I don't already have that item (or if I want to check whether the cached copy is still up to date).
It is also possible to trigger the request to /news/{id} only when the item is actually displayed (scrolling, swiping).
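A minimal sketch of the "only request what I don't already have" step, assuming the client keeps a Map keyed by self-link. The /news/latest response and the cached item below are made up for the example.

```javascript
// Client-side cache keyed by self-link, e.g. "/news/12".
const cache = new Map([["/news/12", { id: 12, title: "Cached item" }]]);

// Return the self-links that still need a request.
function linksToFetch(selfLinks, knownItems) {
  return selfLinks.filter((link) => !knownItems.has(link));
}

// Assumed response of a request to /news/latest: a list of self-links.
const latest = ["/news/12", "/news/13", "/news/14"];
const missing = linksToFetch(latest, cache);
console.log(missing);
```

Only the links in missing would trigger further requests; the cached item could still be revalidated with a conditional request if freshness matters.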
The lifespan of a news item or a comment is probably a criterion for answering this question, meaning that caching in the client is not that vital to the system, as opposed to, say, a book in a bookstore app.

Is a CDN an appropriate solution to cache query results?

I'm working on a little search engine, where I'm trying to find out how to cache query results.
These results are simple JSON text, retrieved using an ajax request.
Storing results in memory is not an option, so I can see two remaining options:
Use a nosql database to retrieve cached results.
Store results on a CDN and redirect the http request (307 - Temporary Redirect) in case the result was already cached.
However, I don't have much experience with CDNs, and I wonder whether using one for a huge number of small temporary text files is good practice.
Is it a good practice to use redirection on an ajax request?
Is a CDN an appropriate solution to cache small text files?
Short answer: no.
Long: Usually, you use a CDN for large static files that you want mirrored all around the world, so that a copy is close to the user when she requests it. When your data changes a lot, it always takes a while to propagate the changes to all nodes of the CDN, and in the meantime users get inconsistent results (this may or may not matter to you).
Also, to avoid higher latency I wouldn't use an HTTP redirect (where you tell the client to make a second request to somewhere else). Instead, decide on your end whether to get the data from the cache or from the engine (e.g. using a caching proxy or a load balancer) and then serve it directly to the client.
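To make the "decide on your end" idea concrete, here is a small sketch of a cache-aside lookup. The names are assumptions: runQuery stands for the search engine, and resultStore for whatever external store you pick (the question rules out process memory, so the Map below is only a stand-in for something like a NoSQL client with get/set).

```javascript
// Wrap a search function so results are served from a store when possible.
// `resultStore` needs get(key) and set(key, value); `ttlMs` is an assumed lifetime.
function createCachedSearch(runQuery, resultStore, ttlMs, now = Date.now) {
  return function search(query) {
    const hit = resultStore.get(query);
    if (hit && hit.expires > now()) {
      return { body: hit.body, fromCache: true }; // served directly, no redirect
    }
    const body = runQuery(query);
    resultStore.set(query, { body, expires: now() + ttlMs });
    return { body, fromCache: false };
  };
}

// Stand-ins for the example: a counting fake engine and a Map as the store.
let engineCalls = 0;
const fakeEngine = (q) => {
  engineCalls += 1;
  return JSON.stringify({ query: q, hits: [] });
};
const search = createCachedSearch(fakeEngine, new Map(), 60000);
const first = search("ajax");
const second = search("ajax");
console.log(first.fromCache, second.fromCache, engineCalls);
```

The client makes one request either way and never needs to know whether the answer came from the cache or from the engine.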

Ajax and Performance/Speed

I'm currently creating a small todo site, and I have multiple questions related to Ajax and performance. Here they are:
In order to reduce the number of requests, I want to get all data from one request, so I will pass, for example, these attributes:
1.1. to get 1 task:
entity=task&id=2&type=single&extra=subtasks%%contexts
1.2. to get list of tasks and events in one listing:
entity=task%%event&user_id=2&type=multiple&order=date&limit=10
Do you think this will reduce the number of requests and improve performance somehow?
If all requests go to one file, that .php file might get quite big. Is that bad, or does it not really matter?
For the listing. I will be able to change the order of listing and maybe filter it somehow. Do you think it will be better to load all tasks and event to
To keep things fast, there are two concerns:
Reduce HTTP requests – if you need two separate bits of data, send them in one file.
Keep the content delivered in each AJAX request small – gzip and caching work wonders here.
So, yes, bundle things together. A large PHP file doesn't make any difference; DB queries are the only real bottleneck in a normally trafficked web page.
For filtering and sorting, a good approach is to use JSON for the AJAX response and then sort/filter on the client side, provided you are talking about a smallish number of items (probably up to 1,000). If you have hundreds of thousands of items, returning a subset from the server will be better.
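For the client-side part, a sketch along these lines may help. The item shape (entity, title, date) is an assumption loosely based on the question's parameters, not something the question specifies.

```javascript
// Assumed shape of the JSON list returned by a single AJAX call.
const items = [
  { entity: "task", title: "Write report", date: "2024-03-02" },
  { entity: "event", title: "Team meeting", date: "2024-03-01" },
  { entity: "task", title: "Book flights", date: "2024-02-27" },
];

// Filter by entity (null means "all") and sort by the given key.
// filter() returns a new array, so the original list is left untouched.
function filterAndSort(list, entity, key) {
  return list
    .filter((item) => entity === null || item.entity === entity)
    .sort((a, b) => (a[key] < b[key] ? -1 : a[key] > b[key] ? 1 : 0));
}

const tasksByDate = filterAndSort(items, "task", "date");
console.log(tasksByDate.map((t) => t.title));
```

Changing the order or the filter then needs no further request, as long as the full list fits comfortably in the browser.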

Large number of concurrent ajax calls and ways to deal with it

I have a web page which, upon loading, needs to do a lot of JSON fetches from the server to populate various things dynamically. In particular, it updates parts of a large-ish data structure from which I derive a graphical representation of the data.
So it works great in Chrome; however, Safari and Firefox appear to suffer somewhat. Upon the querying of the numerous JSON requests, the browsers become sluggish and unusable. I am under the assumption that this is due to the rather expensive iteration of said data structure. Is this a valid assumption?
How can I mitigate this without changing the query language so that it's a single fetch?
I was thinking of applying a queue that could limit the number of concurrent Ajax queries (and hence also limit the number of concurrent updates to the data structure)... Any thoughts? Useful pointers? Other suggestions?
In browser-side JS, create a wrapper around jQuery.post() (or whichever method you are using) that appends requests to a queue.
Also create a function 'queue_send' that actually calls jQuery.post(), passing the entire queue structure.
On the server, create a proxy function called 'queue_receive' that replays the JSON to your server interfaces as though it came from the browser, collects the results into a single response, and sends it back to the browser.
The browser-side queue_send_success() (the success handler for queue_send) must decode this response and populate your data structure.
With this, you should be able to reduce your initialization traffic to one actual request, and maybe consolidate some other requests on your website as well.
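The browser side of that scheme could be sketched roughly as follows. The transport function is an assumption: in a real page it would call jQuery.post() against the queue_receive endpoint, while here a fake one is used so the example runs on its own. The sketch also assumes the server returns results in the same order as the queued requests.

```javascript
// `transport(payload, onSuccess)` is an assumed function that sends the whole
// batch in one request and passes the decoded results to onSuccess.
function createRequestQueue(transport) {
  let pending = [];
  return {
    // Wrapper used in place of a direct jQuery.post() call.
    enqueue(url, data, callback) {
      pending.push({ url, data, callback });
    },
    // queue_send: ship everything queued so far as a single request.
    send() {
      const batch = pending;
      pending = [];
      if (batch.length === 0) return;
      const payload = batch.map(({ url, data }) => ({ url, data }));
      transport(payload, (results) => {
        // queue_send_success: hand each result to its original callback.
        results.forEach((result, i) => batch[i].callback(result));
      });
    },
  };
}

// Fake transport standing in for the server-side replay.
let requestCount = 0;
const fakeTransport = (payload, onSuccess) => {
  requestCount += 1;
  onSuccess(payload.map((p) => ({ echoed: p.url })));
};

const queue = createRequestQueue(fakeTransport);
const received = [];
queue.enqueue("/api/a", {}, (r) => received.push(r.echoed));
queue.enqueue("/api/b", {}, (r) => received.push(r.echoed));
queue.send();
console.log(requestCount, received);
```

Existing call sites keep their own callbacks, which is what lets this avoid re-engineering the per-request logic.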
"In particular, it updates parts of a large-ish data structure from which I derive a graphical representation of the data."
I'd try:
Queuing responses as they come in, then updating the structure once
Keeping the representation hidden until the responses are in
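The first suggestion might be sketched like this, assuming you know up front how many responses to expect. The applyAll callback is hypothetical and is where the single update of the data structure (and the redraw) would happen.

```javascript
// Collect responses and call `applyAll` once, when all have arrived.
function createResponseCollector(expected, applyAll) {
  const responses = [];
  return function onResponse(response) {
    responses.push(response);
    if (responses.length === expected) {
      applyAll(responses); // one update instead of one per response
    }
  };
}

// Example: three partial responses merged into the model in one pass.
let updates = 0;
const model = {};
const collect = createResponseCollector(3, (all) => {
  updates += 1;
  for (const part of all) Object.assign(model, part);
});
collect({ a: 1 });
collect({ b: 2 });
collect({ c: 3 });
console.log(updates, model);
```

The collect function would be used as the success handler of each individual Ajax call, so the requests themselves stay unchanged.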
Magicianeer's answer is also good - but I'm not sure if it fits your definition of "without changing the query language so that it's a single fetch" - it would avoid re-engineering existing logic.

How do I get around the Twitter API caching problem?

I'm building a Twitter app that needs to check user data fairly frequently, but I'm having trouble with a cache that is, oddly, on Twitter's side, not mine.
Try the following user:
users/show in XML: http://twitter.com/users/show.xml?screen_name=technolocus
users/show in JSON: http://twitter.com/users/show.json?screen_name=technolocus
normal page: http://twitter.com/technolocus
All these methods of accessing data should return the same values, right? Check the statuses_count for each of them.
XML: 12548
JSON: 12513
normal: 12498
The normal method (i.e. just visiting the profile non-programmatically) serves up the most correct value of 12498. If I post or delete tweets to this account, it gets updated on the profile page instantly, but the XML and JSON methods still return cached data.
At this point, the values of the XML and JSON methods are 12 to 18 hours old respectively.
I first tried to access these methods from my website (hosted on Dreamhost). I thought it was Dreamhost caching the responses. Then I tried to access the API directly from my browser. I did a cURL from the command line from my machine after that. It wasn't dreamhost. I thought it was probably my ISP (I think they use NetApp or something like that). Then I asked a friend in another corner of India to try it. He's getting the exact same cached responses as I am.
So it isn't Dreamhost's cache; it isn't my ISP or my country's cache. There's only one conclusion - Twitter is caching responses.
How in the heavens do I get around this?!?
Forgot to mention this: The script on the server is in PHP and is using cURL to retrieve the XML and JSON data from Twitter, while the local tests have been just using the browser. Both have the exact same result!
First, I think you should report this as a bug to Twitter. I see the same discrepancy as you, and no matter what, that seems like a bug. Even if they're caching, I'd expect that a cache on their side would store an abstract form that would then be rendered into HTML, JSON, and XML. I wonder if what's actually going on is that these requests are performing similar but different queries.
Are you sure that the values are "old"? For example, did you actually delete about 50 updates recently (since you say the HTML one is newest but shows a lower count than the other two)? If you create another update do you see the HTML number increment while the other numbers stay the same, or do they all increment simultaneously?
If what you are saying is accurate, and it probably is, generally, you can't get around it. Twitter would want to be caching its responses since they are costly to reproduce every single time.
When you use Twitter's APIs, you end up being bound by its conventions, even if that includes caching.
Your best bet is to tweet to #twitterapi and get them to give you a response as to why the two representations are divergent.
Add ?blah=xxxx to all urls.
I don't develop anything against Twitter, and I occasionally "follow" three tweets manually by going to them in my browser. They always lag behind by half a day. When I add ?asdsadsadsad to the URL (something different every time), it always updates. I don't know what Twitter is doing here, and I came here while searching for the problem. But I guess this trick of appending a random value to the URL via GET will probably work for your API requests, too.
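If you want to try that trick programmatically, a sketch could look like this. The parameter name "_" and the token format are arbitrary choices, and whether Twitter's cache actually honours such a parameter is an assumption based only on the browser experience described above.

```javascript
// Append a throwaway query parameter so the URL differs on every call.
// The token defaults to a timestamp plus random suffix; "_" is an arbitrary name.
function withCacheBuster(
  url,
  token = Date.now().toString(36) + Math.random().toString(36).slice(2)
) {
  const parsed = new URL(url);
  parsed.searchParams.set("_", token); // keeps existing parameters intact
  return parsed.toString();
}

// Fixed token here only to make the example output predictable.
const busted = withCacheBuster(
  "http://twitter.com/users/show.json?screen_name=technolocus",
  "abc123"
);
console.log(busted);
```

Using the URL API means the parameter is added with & when a query string already exists, which the plain "?blah=xxxx" suffix would get wrong for the users/show URLs above.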
