what's the skinny on long polling with ajax and webapi...is it going to kill my server? and string comparisons - ajax

I have a very simple long polling ajax call like this:
(function poll(){
$.ajax({ url: "myserver", success: function(data){
//do my stuff here
}, dataType: "json", complete: poll, timeout: 30000 });
})();
I just picked this example up this afternoon and it seems to work great. I'm using it to build out some html on my page and it's nearly instantaneous as best I can tell. I'm a little worried though that this is going to keep worker threads open on my server and that if I have too big of a load on the server, it's going to stop entirely. Can someone shed some light on this theory? On the back end I have a webapi service (.net mvc 4) that calls a database, build the object, then passes the object back down. It also seems to me that in order for this to work, the server would have to be calling the database constantly...and that can't be good right???
My next question is what is the best way on the client to determine if I need to update the html on my page? Currently I"m using JSON.stringify() to turn my object into a string and comparing the string that comes down to the old string and if there's a delta, it re-writes the html on the page. right now there's not a whole lot in the object coming down, but it could potentially get very large and I think doing that string comparison could be pretty resource intensive on the client...especially if it's doing it nearly constantly.
bottom line for me is this: I"m not sure exactly sure how long polling works. I just googled it and found the above sample code and implemented it and, on the surface, it's awesome. I just fear that it's going to bog things down (on the server) and my way of comparing old results to new is going to bog thigns down (on the client).
any and all information you can provide is greatly appreciated.
TIA.

OK, my two cents:
As others said, SignalR is tried and tested code so I would really consider using that instead of writing my own.
SignalR does change some of the IIS settings to optimise IIS for this sort of work. So if you are looking to implement your own, have a look at IIS setting changes done in SignalR
I suppose you are doing long polling so that your server could implement some form of Server Push. Just bear in mind that this will turn your stateless HTTP machine into a stateful machine which is not good if you want to scale. Long polling behind a load ballancer is not nice :) For me this is the worst thing about server push.
ASP.NET uses ThreadPool for serving requests. A long poll will hog a ThreadPool thread. If you have too many of these threads you might end up in thread starvation (and tears). As a ballpark figure, 100 is not too many but +1000 is.
Even SignalR team say that the IIS box which is optimised for SingalR, probably not optimised for normal ASP.NET and they recommend to separate these boxes. So this means cost and overhead.
At the end of the day, I recommend to using long polling if you are solving a business problem (and not because it is just cool) because then that will pay its costs and overheads and headaches.

I agreee with SLaks - i.e. use SignalR if you need realtime web with WebApi http://www.asp.net/signalr. Long polling is difficult to implement well, let someone else handle that complexity i.e. use SignalR (natural choice for WebApi) or Comet.
SignalR attempts 3 other forms of communication before resorting to long polling, web sockets, server sent events and forever frame (here).
In some circumstances you may be better of with simple polling i.e. a hit every second or so to update... take a look at this article. But here is a quote:
when you have a high message volume, long-polling does not provide any substantial
performance improvements over traditional polling. In fact, it could be worse,
because the long-polling might spin out of control into an
unthrottled, continuous loop of immediate polls.
The fear is that with any significant load on your web page your 30 second ajax query could end up being your own denial of service attack.
Even Bayeux (CometD) will resort to simple polling if the load gets too much:
Increased server load and resource starvation are addressed by using
the reconnect and interval advice fields to throttle clients, which in
the worst-case degenerate to traditional polling behaviour.
As for the second part of you question.
If you are using long polling then your server should ideally only be returning an update if something actually has changed thus your UI should probably "trust" the response and assume that a response means new data. The same goes for any of the Server Push type approaches.
If you did move back down towards simple polling pullmethod then you can use the inbuilt http methods for detecting an update using the If-Modified-Since header which would allow you to return a 304 Not Modified, so the server would check the timestamp of an object and only return a 200 with an object if it had been modified since the last request.

Related

AJAX or WebSockets for a Heartstone-like game?

Game is a one-vs-one turn based 2D card management game to be played in a browser.
It is very much like Hearstone where a player plays a number of cards, observes effects and then passes turn to the opponent.
Game mechanics and prototype are ready and I need to decide on technology.
Server is PHP + MySQL, heard of node.js but have no experience with it.
I cannot have loss of packets, so need to use HTTP I guess.
Initial idea is to have scheduled AJAX call every 5 seconds to get game state for each client to check for:
end of turn
change of game state (and render animation based on it)
Obviously I would also need to validate every action of an active player on the server.
I am concerned of the number of calls to my server (not an expensive hosting) and how many calls would a modest server be capable of handling...
As a plus of Ajax I see guaranteed packet delivery and no issues with proxies involved (which may cut persistent connections).
Websockets reduce latency and server workload ( no need to open a new connection, meaning a key exchange in case of https), provided that you interact frequently.
A great advantage is that you are able to 'push' a message to the client (as opposed to having to 'pull' via Ajax every few seconds.
The server language shouldn't be a problem, but if you plan to maintain / extend it, you should choose carefully (I'm guessing you're a rather new programmer, thus gaining experience in a better suited environment would not be a huge amount of work).
Edit: just to clarify, I would recommend the usage of a websocket for your use case

Do browsers limit AJAX polling rate? What is the limit?

I just read that some browsers would prevent HTTP polling (I guess by limiting the rate of requests)...
From https://github.com/sstrigler/JSJaC:
Note: As security restrictions of most modern browsers prevent HTTP
Polling from being usable anymore this module is disabled by default
now. If you want to compile it in use 'make polling'.
This could explain some misbehavior of some of my JavaScripts (sometimes requests are just not sent or retried, even if they were actually successful). But I couldn't find further information on details..
Questions
if it's "max. number of requests n per x seconds", what are the usual/default settings for x and n?
Is there any way good resource for this?
Any way to detect if a request has been "delayed" or "rejected" because of a rate limit?
Thanks for your help...
Stefan
Yes, as far as I am aware there is a default pool limit of 10 and a default request timeout of 30 seconds per request, however the timeout and poll limits can be controlled and different browsers implement different limitations!
Check out this Google implementation.
and this is an awesome implementation of catching a timeout error!
You can find the Firefox specifics HERE!
Internet Explorer specifics are controlled from inside the Windows registry.
Also have a look at this question.
Basically, the way you control is not by changing the browser limitations, but by abiding them. So you apply a technique called throttle-ing.
Think of it as creating a FIFO/priority queue of functions. A queue struct that takes xhr requests as members and enforces delay between them is an Xhr Poll. For instance, I am using
Jsonp to get data from a node.js server located on another domain and I am polling of course due to browser limitations. Otherwise, I get zero response back from the server and that is only because of browser limitations.
I am actually doing a console log for every request that's supposed to be sent, but not all of them are being logged. So the browser limits them.
I'll be even more specific with helping you out. I have a page on my website which is supposed to render a view for tens or even hundreds of articles. You go through them using a cool horizontal slider.
The current value of the slider matches the currrent 'page'. Since I am only displaying 5 articles per page and I can't exactly load thousands of articles 'onload' without severe performance implications, I load the articles for the current page. I get them from a MongoDB by sending a cross-domain request to a Python script.
The script is supposed to return an array of five objects with all the details I need to build the DOM elements for a 'page'. However, there are a couple of issues.
First, the slider works extremely fast, as it's more or less a value change. Even if there is drag drop functionality, key down events etc, the actual change takes miliseconds. However, the code of the slider looks something like this:
goog.events.listen(slider, goog.events.EventType.CHANGE, function() {
myProject.Articles.page(slider.getValue());
}
The slider.getValue() method returns an int with the current page number, so basically I have to load from:
currentPage * articlesPerPage to (currentPage * articlesPerPage + 1) - 1
But in order to load, i do something like this:
I have a storage engine(think of it as an array):
I check if the content is not already there
If it is, there is no point to make another request, so go forward with getting the DOM elements from the array with the already created DOM elements in place.
If it isn't, then I need to get it so I need to send that request I was mentioning, which would look something like(without accounting for browser limitations):
JSONP.send({'action':'getMeSomeArticles','start':start,'length': itemsPerPage, function(callback){
// now I just parse the callback quickly to make sure it is consistent
// create DOM elements, and populate the client side storage
// and update the view for the user.
}}
The problem comes from the speed with which you can change that slider. Since every change supposedly triggers a request(same would happen for normal Xhr requests), then you are basically crossing the limitations of all browsers, so without throttle-ing, there would be no 'callback' for most of the requests. 'callback' is the JS code returned by the JSONP request(which is more of a remote script inclusion than anything else).
So what I do is push a request to a priority queue, not POLL, as now I don't need to send multiple simultaneous requests. If the queue is empty, the recently added member is executed and everyone is happy. If it's not, then all non-completed requests in progress are cancelled and only the last one is executed.
Now in my particular case, I do a binary search(0(log n)) to see if the storage engine doesn't have data for the previous requests yet, which tells me if the previous request has been completed or not. If it has, then it's removed from the queue and the current one is processed, otherwise the new one fires. So an and so forth.
Again, for speed consideration and shit browser wanna-bes such as Internet Explorer, I do the above described procedure about 3-4 steps ahead. So I pre-load 20 pages ahead till everything is the client side storage engine. This way, every limitation is successfully dealt with.
The cooldown time is covered by the minimum time it would take to slide through 20 pages and the throttle-ing makes sure there are no more than 1 active requests at any given time(with backwards compatibility going as far as Internet Explorer 5).
The reason why I wrote all this is to give you an example trying to say that you cannot always enforce delay directly from the FIFO structure, as your calls may need to turn into what a user sees, and you don't exactly want to make a user wait 10-15 seconds for a single page to render.
Also, always minimize the polling and the need to poll(simultaneously fired Ajax events, as not all browsers actually do good things with them). For instance, instead of doing something like sending one request to get content and sending another for that content to be tracked as viewed in your app metrics, do as many tasks at server level as you possibly can!
Of course, you probably want to track your errors properly, so your Xhr object from your library of choice implement error handling for ajax and because you are an awesome developer you want to make use of them.
so say you have a try - catch block in place
The scenario is this:
An Ajax call has finished and it's supposed to return a JSON, but the call somehow failed. However, you try to parse the JSON and do whatever you need to do with it.
so
function onAjaxSuccess (ajaxResponse) {
try {
var yourObj = JSON.parse(ajaxRespose);
} catch (err) {
// Now I've actually seen this on a number of occasions, to log that an error occur
// a lot of developers will attempt to send yet another ajax request to log the
// failure of the previous one.
// for these reasons, workers exist.
myProject.worker.message('preferrably a pre-determined error code should go here');
// Then only the worker should again throttle and poll the ajax requests that log the
//specific error.
};
};
While I have seen various implementations that try to fire as many Xhr requests at the same time as they possible can until they encounter browser limitations, then do quite a good job at stalling the ones that haven't fired in wait for the browser 'cooldown', what I can advise you is to think about the following:
How important is speed for your app?
Just how scalable and how intensive the I/O will be?
If the answer to the first one is 'very' and to the latter 'OMFG modern technology', then try to optimize your code and architecture as much as you can so that you never need to send 10 simultaneous Xhr requests. Also, for large scale apps, multi-thread your processes. The JavaScript way to accomplish that is by using workers. Or you could call the ECMA board, tell them to make this a default, and then post it here so that the rest of us JS devs can enjoy native multi-threading in JS:)(how dafuq did they not think about this?!?!)
Stefan, quick answers below:
-if it's "max. number of requests n per x seconds", what are the usual/default settings for x and n?
This sounds more like a server restriction. The browser ones usually sound like:
-"the maximum requests for the same hostname is x"
-"the maximum connections for ANY hostname is y"
-Is there any way good resource for this?
http://www.browserscope.org/?category=network (also hover over table headers to see what is measured)
http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections
-Any way to detect if a request has been "delayed" or "rejected" because of a rate limit?
You could look at the http headers for "Connection: close" to detect server restrictions but I am not aware of being able in JavaScript to read settings from so many browsers in a consistent, browser-independent way. (For Firefox, you could read this http://support.mozilla.org/en-US/questions/746848)
Hope this quick answer helps?
No, browser does not in any way affect polling. I think what was meant on that page is the same origin policy - you can only access the same host and port as your original page.
Only known limitation to connections themselves is that you usually can only have from two to four simultaneous connections to the same host.
I've written some apps with long poll, some with C++ backend with my own webserver, and one with PHP backend with Apache2.
My long poll timeout is 4..10 s. When something occurs, or 4..10 s passes, my server returns an empty response. Then the client immediatelly starts another AJAX request. I found that some browsers hangs up when I start AJAX call from previous AJAX handler, so I am using setTimeout() with a small value to start the next AJAX request.
When something happens on the client side, which should be sent to server, I use another AJAX request for it, but it's a one-way thing: the server does not send any response, and the client does not process anything. The result of the operation (if any) will be received on the long poll. It requires max. 2 connection to the server, which all browsers supports.
Keep in mind, that if there's 500 client, it means 500 server-side webserver thread, which will move together, occurring load peaks, because when something happens, the server have to report it at the same time for each clients, the clients will process it near same time long, they will start the next long request in the same time, and from then, the timeout will expire also at the same time, and furthcoming ones too. You can trick with rnd timeout, say 4 rnd(0..4), but it's worthless, if anything happens, they will "sync" again, all the request have to be served at the same time, when something reportable happens.
I've tested it thru a router, and it works. I assume, routers respects 4..10 lag, it's around the speed of a slow webapge (far, far away), which no router think, that it should be canceled.
My PHP work is a collaborative spreadsheet, it looks amazing when you hit enter and the stuff is updating simultaneously in several browsers. Have fun!
No limit for no of ajax requests. However it will be on same host & port.
Server can limit no of request from a machine based on its setting.
For example. A server can set so that if there are more than few request from same machine within specified time it will reject request.
After small mistake in javascript code, neverending loop was made witch each step calling 2 ajax requests. In firebug i could see more and more requests until firefox started to slow down, dont response and finally crash.
So, yes, there is a "limit" ;)

AJAX Real Time and collaborative

I am trying to create real-time and collaborative application like - google wave for example.
When user1 writes something at the same time it shows on user2 screens.
I started a little research,and found some ways to this with Ajax -
1.every X seconds send request to the server and to check what is "happening"
2.timeout - long request ,Problem - I saw i can do this only with IE8
there are other options?what is the best way to this?
And with way number 2,this true I can do this only with IE8?
Yosy
The whole point of AJAX is that the server can wait for notifications from each clients, and notify all the other clients when something happens. There's no need for polling. Look up keywords like comet, and bayeux. Dojo has a good implementation.
I'm not sure what you are referring to in 2, but if I were going to implement something like this, I'd do what you explain in 1. Basically your server will be keeping track of the conversation, and the clients will constantly ask for updates.
Another possible option would be flash, but I don't know much about that other than it would be capable, so your on your own for researching that.
Some notes on keeping things running quickly in option 1:
Remember you only have 2 "ajax"
calls to work with on the client side (you can only have 2 calls
out at once). So keep track
of the calls that are out. Make use
of abort() if a call takes too long or its response is not going to be valid anymore.
Get the most out of your calls, if
you need to send text to the server,
use the response to get an update on
the current "conversation".

Ajax use on website design

I just want to ask for your experience. I'm designing a public website, using jQuery Ajax in most of operations. I'm having some timeouts, and I think it should be for hosting provider cause. Any of you have expirience in this case and may advise me on some hints (especially on timeouts handling)?
Thanks in advance to all.
Esteve
If you have a half-decent host, chances are these aren't network timeouts but are rather due to insufficient hardware which causes your server-side scripts to take too long to answer. For example if you have an autocomplete field and the script goes through a database of 100,000 entries, this is a breeze for newer servers but older "budget" servers or overcrowded shared hosting servers might croak on it.
Depending on what your Ajax operations are, you may be able to break them down in shorter chunks. If you're doing database queries for example, use LIMIT and OFFSET and only return say, 5 entries at a time. When those 5 entries arrive on the client, make another Ajax call for 5 more, so from the user's point of view the entries will keep coming in and it will look fluid (instead of waiting 30s and possibly timing out before they see all entries at once). If you do this make sure you display a spiffy web 2.0 turning wheel to let the user know if they should be waiting some more or if it's done.

Distributed time synchronization and web applications

I'm currently trying to build an application that inherently needs good time synchronization across the server and every client. There are alternative designs for my application that can do away with this need for synchronization, but my application quickly begins to suck when it's not present.
In case I am missing something, my basic problem is this: firing an event in multiple locations at exactly the same moment. As best I can tell, the only way of doing this requires some kind of time synchronization, but I may be wrong. I've tried modeling the problem differently, but it all comes back to either a) a sucky app, or b) requiring time synchronization.
Let's assume I Really Really Do Need synchronized time.
My application is built on Google AppEngine. While AppEngine makes no guarantees about the state of time synchronization across its servers, usually it is quite good, on the order of a few seconds (i.e. better than NTP), however sometimes it sucks badly, say, on the order of 10 seconds out of sync. My application can handle 2-3 seconds out of sync, but 10 seconds is out of the question with regards to user experience. So basically, my chosen server platform does not provide a very reliable concept of time.
The client part of my application is written in JavaScript. Again we have a situation where the client has no reliable concept of time either. I have done no measurements, but I fully expect some of my eventual users to have computer clocks that are set to 1901, 1970, 2024, and so on. So basically, my client platform does not provide a reliable concept of time.
This issue is starting to drive me a little mad. So far the best thing I can think to do is implement something like NTP on top of HTTP (this is not as crazy as it may sound). This would work by commissioning 2 or 3 servers in different parts of the Internet, and using traditional means (PTP, NTP) to try to ensure their sync is at least on the order of hundreds of milliseconds.
I'd then create a JavaScript class that implemented the NTP intersection algorithm using these HTTP time sources (and the associated roundtrip information that is available from XMLHTTPRequest).
As you can tell, this solution also sucks big time. Not only is it horribly complex, but only solves one half the problem, namely giving the clients a good notion of the current time. I then have to compromise on the server, either by allowing the clients to tell the server the current time according to them when they make a request (big security no-no, but I can mitigate some of the more obvious abuses of this), or having the server make a single request to one of my magic HTTP-over-NTP servers, and hoping that request completes speedily enough.
These solutions all suck, and I'm lost.
Reminder: I want a bunch of web browsers, hopefully as many as 100 or more, to be able to fire an event at exactly the same time.
Let me summarize, to make sure I understand the question.
You have an app that has a client and server component. There are multiple servers that can each be servicing many (hundreds) of clients. The servers are more or less synced with each other; the clients are not. You want a large number of clients to execute the same event at approximately the same time, regardless of which server happens to be the one they connected to initially.
Assuming that I described the situation more or less accurately:
Could you have the servers keep certain state for each client (such as initial time of connection -- server time), and when the time of the event that will need to happen is known, notify the client with a message containing the number of milliseconds after the beginning value that need to elapse before firing the event?
To illustrate:
client A connects to server S at time t0 = 0
client B connects to server S at time t1 = 120
server S decides an event needs to happen at time t3 = 500
server S sends a message to A:
S->A : {eventName, 500}
server S sends a message to B:
S->B : {eventName, 380}
This does not rely on the client time at all; just on the client's ability to keep track of time for some reasonably short period (a single session).
It seems to me like you're needing to listen to a broadcast event from a server in many different places. Since you can accept 2-3 seconds variation you could just put all your clients into long-lived comet-style requests and just get the response from the server? Sounds to me like the clients wouldn't need to deal with time at all this way ?
You could use ajax to do this, so yoǘ'd be avoiding any client-side lockups while waiting for new data.
I may be missing something totally here.
If you can assume that the clocks are reasonable stable - that is they are set wrong, but ticking at more-or-less the right rate.
Have the servers get their offset from a single defined source (e.g. one of your servers, or a database server or something).
Then have each client calculate it's offset from it's server (possible round-trip complications if you want lots of accuracy).
Store that, then you the combined offset on each client to trigger the event at the right time.
(client-time-to-trigger-event) = (scheduled-time) + (client-to-server-difference) + (server-to-reference-difference)
Time synchronization is very hard to get right and in my opinion the wrong way to go about it. You need an event system which can notify registered observers every time an event is dispatched (observer pattern). All observers will be notified simultaneously (or as close as possible to that), removing the need for time synchronization.
To accommodate latency, the browser should be sent the timestamp of the event dispatch, and it should wait a little longer than what you expect the maximum latency to be. This way all events will be fired up at the same time on all browsers.
Google found the way to define time as being absolute. It sounds heretic for a physicist and with respect to General Relativity: time is flowing at different pace depending on your position in space and time, on Earth, in the Universe ...
You may want to have a look at Google Spanner database: http://en.wikipedia.org/wiki/Spanner_(database)
I guess it is used now by Google and will be available through Google Cloud Platform.

Resources