AJAX Polling Question - Blocking Or Frequent?

I have a web application that relies on very "live" data - so it needs an update every 1 second if something has changed.
I was wondering what the pros and cons of the following solutions are.
Solution 1 - Poll A Lot
So every 1 second, I send a request to the server and get back some data. Once I have the data, I wait for 1 second before doing it all again. I would detect client-side if the state had changed and take action appropriately.
Solution 2 - Block A Lot
So I start a request to the server that will time-out after 30 seconds. The server keeps an eye on the data on the server by checking it once per second. If the server notices the data has changed it sends the data back to the client, which takes action appropriately.
Scenario
Essentially, the data is reasonably small in size, but changes at random intervals based on live events. The thing is, the web UI will be running something in the region of 2,000 instances, so do I have 2,000 requests per second coming from the UI or do I have 2,000 long-running requests that take up to 30 seconds?
Help and advice would be much appreciated, especially if you have worked with AJAX requests under similar volumes.
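To put rough numbers on the two options in the question, here is a back-of-the-envelope sketch. The function names are made up for illustration, and the model ignores network latency and server processing time, so treat the results as approximations rather than measurements.

```python
def short_poll_rate(clients: int, interval_s: float) -> float:
    """Approximate requests per second when every client polls once per interval."""
    return clients / interval_s


def long_poll_rate(clients: int, timeout_s: float, changes_per_s: float) -> float:
    """Upper bound on requests per second under long polling.

    A client re-issues a request after each timeout and after each change
    it receives, so the total cannot exceed the sum of the two rates.
    """
    return clients * (1.0 / timeout_s + changes_per_s)


if __name__ == "__main__":
    # Solution 1: 2,000 clients polling every second.
    print(short_poll_rate(2000, 1.0))
    # Solution 2: 2,000 clients, 30 s timeout, no data changes at all.
    print(round(long_poll_rate(2000, 30.0, 0.0), 1))
    # Solution 2 again, with one change every 10 s reaching every client.
    print(round(long_poll_rate(2000, 30.0, 0.1), 1))
```

The trade-off is that long polling lowers the request rate but keeps up to 2,000 connections open at once, which is the concern raised in the question.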

One common solution for such cases is to use static JSON files. Server-side scripts update them whenever the data changes, and a fast, lightweight web server (like nginx) serves them. Since the files are static and small, the web server can serve them straight from cache, very quickly.
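A minimal sketch of the "server-side script updates a static file" step, assuming the producer is written in Python and that the file name and payload shape are placeholders. Writing to a temporary file and renaming it means the web server never serves a half-written file.

```python
import json
import os
import tempfile


def publish_json(path: str, payload: dict) -> None:
    """Atomically replace the static JSON file that the web server serves."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp)
        # mkstemp creates owner-only files; make the result readable by the web server.
        os.chmod(tmp_path, 0o644)
        # os.replace swaps the file in one step, overwriting any previous version.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    # "state.json" and the payload fields are hypothetical examples.
    publish_json("state.json", {"version": 1, "status": "ok"})
```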

Consider a better architecture. Implementing this kind of messaging system is trivial to do right in something like nodeJS. Message dispatch will be instantaneous, and you won't need to poll for your data on either side.
You don't need to rewrite your whole system: The data producer could simply POST the updates to the nodeJS server instead of writing them to a file, and as a bonus, you don't even need to waste time on disk IO.
If you started without knowing any nodeJS, you could still be done in a couple hours, because you can just hack up the chat example.

I can't comment yet, but I agree with geocar. Running live or almost-live web services with polling alone will leave you stuck between a rock and a hard place.
You could also look into web sockets to allow push, as this sounds like a better solution than polling every 1 to 30 seconds.
Good luck!

Related

Is this a correct scenario to use WebSocket?

I have a browser plugin which will be installed on 40,000 desktops.
This plugin will connect to a backend configuration file available via https, e.g. http://somesite/config_file.js.
The plugin is configured to poll this backend resource once a day.
But there is only one backend server, so if 40,000 endpoints start polling at the same time the server might crash.
I could randomize the polling frequency of the desktop plugins, but randomization still does not guarantee that the server will not be overloaded.
Would using WebSocket in this scenario solve the scalability issue?
Polling once a day is very little.
I don't see any upside for Websockets unless you switch to Push and have more notifications.
However, staggering the polling does make a lot of sense, since syncing requests for the same time is like writing a DoS attack against your own server.
Staggering doesn't necessarily have to be random and IMHO, it probably shouldn't.
You could start with a fixed time and add a second per client ID, allowing for ~86K connections in 24 hours which should be easy for any server to handle.
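The "fixed time plus one second per client ID" idea could look roughly like this. It assumes client IDs are integers, which the answer does not state; the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 one-second slots per day


def poll_offset_seconds(client_id: int) -> int:
    """Seconds after the fixed start time at which this client should poll."""
    # The modulo wraps IDs beyond 86,400 back into the same 24-hour window.
    return client_id % SECONDS_PER_DAY


def as_clock_time(offset_s: int) -> str:
    """Format an offset as HH:MM:SS relative to a midnight start."""
    hours, remainder = divmod(offset_s, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    for client_id in (0, 1, 39_999):
        print(client_id, as_clock_time(poll_offset_seconds(client_id)))
```

With 40,000 sequential IDs every client gets its own slot, so the server sees at most one request per second.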
As a side note, 40K concurrent connections might not be as hard to achieve as you imagine.
EDIT (relating to the comments)
Websockets vs. Server Sent Events:
IMHO, when pushing data (vs. polling), I would prefer Websockets over Server Sent Events (SSE).
Websockets have a few advantages, such as bidirectional communication, which allows clients to ping the server and confirm that the connection is still alive.
The Specific Use-Case:
From the description in the question and the comments it seems that you're using browser clients with a custom plugin and that the updates you wish to install daily might require the browser to be active.
This raises different questions that affect the implementation (are the client browsers open all day? do you have any control over the client browsers and their environment? can you guarantee installation while the browser is closed?).
...
IMHO, you might consider having the client plugins test for an update each morning as they load for the first time during that day (first access).
People arrive at work at different times and open their browsers for the first time on different schedules, so the 40K requests you're expecting will be naturally scattered across that timeline (probably a 20-30 minute timespan).
This approach makes sure that the browsers and computers are actually open (making the update possible) and that the update requests are staggered over a period of time (about 33.3 requests per second over a 20-minute window, if my assumption is correct).
If you're serving a pre-written static configuration file (perhaps updated by the server daily), avoiding dynamic content and minimizing any database calls, then 33 req/sec should be very easy to manage.
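The request-rate estimate above can be checked with a couple of lines. The 20 and 30 minute windows are the answer's own guess, and the model assumes arrivals are spread evenly, which real traffic will not be.

```python
def average_request_rate(clients: int, window_minutes: float) -> float:
    """Average requests per second if all clients arrive evenly within the window."""
    return clients / (window_minutes * 60)


if __name__ == "__main__":
    print(round(average_request_rate(40_000, 20), 1))  # 20-minute window
    print(round(average_request_rate(40_000, 30), 1))  # 30-minute window
```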

browser implication when ajax call every 3 sec

We would like to check every 3 seconds whether there are any updates in our database, using jQuery's $.ajax. The technology is clear, but are there any reasons not to fire so many Ajax calls (browser, cache, performance, etc.)? The web application runs for roughly 10 hours per day on every client.
We are using Firefox.
Ajax calls have implications not on the client side (browser, etc.) but on the server side. Every Ajax call is a hit on the server, which means more bandwidth consumption and more requests, which in turn increases server load. Ajax is really meant to improve client friendliness at the cost of server-side load.
Regards,
Ravi
You should think carefully before implementing infinite repeating AJAX calls with an arbitrary delay between them. How did you come up with 3 seconds? If you're going to be polling your server in this way, you need to reduce the frequency of requests to as low a number as possible. Here are some things to think about:
Is the data you're fetching really going to change that often?
Can your server handle a request every 3 seconds? How long does the operation take for a single request?
Could you increase the delay after inactivity or guess based on previous server responses how long the next delay should be?
Can you stop the polling completely when the window loses focus, and restart it when it's in the foreground again?
If a user opens the same page in a website 10 times, your server should recognise this and throttle its responses, either using a cookie with a unique value in it (recommended) or based on the client IP address.
Above all, instead of polling, consider using HTML 5 web sockets to "push" data to the client - most modern browsers support this. Several frameworks are available that will fall back to polling if web sockets are not available - one excellent .NET example is SignalR.
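The suggestion to increase the delay after inactivity could be sketched as a simple backoff rule. The base of 3 seconds comes from the question; the doubling factor and the 60 second ceiling are arbitrary assumptions.

```python
def next_delay(current_s: float, changed: bool,
               base_s: float = 3.0, max_s: float = 60.0,
               factor: float = 2.0) -> float:
    """Return how long to wait before the next poll.

    Reset to the base delay as soon as the server reports new data;
    otherwise back off geometrically up to a ceiling.
    """
    if changed:
        return base_s
    return min(current_s * factor, max_s)


if __name__ == "__main__":
    delay = 3.0
    # Five quiet polls followed by one that returns new data.
    for changed in (False, False, False, False, False, True):
        delay = next_delay(delay, changed)
        print(delay)
```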
I've seen a lot of applications making a request every 5 seconds or so, for instance a remote control (web player) or a chat, so that should not be a problem for the browser.
A good practice is to wait for an answer before making a new request, which means not firing the requests with a setInterval, for instance.
(If the user loses their connection, that prevents opening too many connections.)
Also verify that all the calculations associated with one answer are done before the next answer arrives.
And if you have access to the server side, configure your server to set the HTTP header Connection: Keep-Alive, so you won't add too much TCP overhead to each of your requests. That could speed up small requests a lot.
The last point I see is of course verifying that your server is able to answer that many requests.
You are checking for changes every 3 seconds, so traffic will increase because you are fetching data continuously at short intervals. It may also continuously increase memory usage in the browser. Since you need to detect updates in the database, you could consider alternatives such as Sheepjax, Comet or SignalR. (SignalR generally broadcasts the data to all users, and Comet needs a license.) Hope this helps.
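The advice to wait for an answer before sending the next request, rather than firing on a fixed timer, can be sketched language-independently. This is Python rather than browser JavaScript, and `fetch` and `handle` are hypothetical callables supplied by the caller.

```python
import time
from typing import Any, Callable


def poll_sequentially(fetch: Callable[[], Any],
                      handle: Callable[[Any], None],
                      delay_s: float,
                      max_polls: int,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll one request at a time and return the number of failed requests.

    The next request is only sent after the previous one has been answered
    (or has failed) and its result has been fully processed, so a lost
    connection cannot cause requests to pile up.
    """
    failures = 0
    for _ in range(max_polls):
        try:
            result = fetch()   # blocks until the server answers or errors
        except OSError:
            failures += 1      # connection problem: skip handling, try again later
        else:
            handle(result)     # finish processing before scheduling the next poll
        sleep(delay_s)         # the delay starts only after the answer arrived
    return failures
```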

Push (i.e. Netty) vs. Pull (i.e. Nginx) for a live "ping" server?

Imagine an application which receives something like news updates every minute.
Would it be more efficient to build the server side with something like Netty, which would maintain the connections and push the data once a minute, or something like nginx/PHP, which would open and drop a connection each time a pull request is made?
Each request would require a database lookup custom tailored to that user (i.e. no caching) and some basic processing (i.e. encryption/decryption).
Once every minute does not sound like it should put too much load on your infrastructure, so I would say whichever way is easiest for you.
However, if capacity is an issue, I would say the push method is better because it will only send when there is data. The pull method will take up resources no matter if there is data to retrieve or not.
Hope this helps.

Periodic Ajax POST calls versus COMET/Websocket Push

On a site like Trello.com, I noticed in firebug console that it makes frequent and periodic Ajax POST calls to its server to retrieve new data from the database and update the dom as and when something new is available.
On the other hand, something like Facebook notifications seem to be implementing a COMET push mechanism.
What are the advantages and disadvantages of each approach? Specifically, why does Trello.com use a "pull" mechanism? I have always thought such an approach is not scalable as more and more users sign up, especially since it pings its server so frequently.
Short Answer to Your Question
Your gut instinct is correct. Long-polling (aka comet) will be more efficient than straight up polling. And when available, websockets will be more efficient than long-polling. So why do some companies use "pull polling"? Quite simply, they are out of date and need to put some time into updating their code base!
Comparing Polling, Long-Polling (comet) and WebSockets
With traditional polling you will make the same request repeatedly, often parsing the response as JSON or stuffing the results into a DOM container as content. The frequency of this polling is not in any way tied to the frequency of the data updates. For example you could choose to poll every 3 seconds for new data, but maybe the data stays the same for 30 seconds at a time? In that case you are wasting HTTP requests, bandwidth, and server resources to process many totally useless HTTP requests (the 9 repeats of the same data before anything has actually changed).
With long polling (aka comet), we significantly reduce the waste. When your request goes out for the updated data, the server accepts the request but doesn't respond if there are no new changes. Instead it holds the request open for 10, 20, 30, or 60 seconds or until some new data is ready and it can respond. Eventually the request will either time out or the server will respond with an update. The idea here is that you won't be repeating the same data so often like in the 3 second polling above, but you still get very fast notification of new data as there is likely already an open request just waiting for the server to respond to.
You'll notice that long polling reduced the waste considerably, but there will still be the chance for some waste. 30-60 seconds is a common timeout period for long polling as many routers and gateways will shutdown hanging connections beyond that time anyway. So what if your data is actually changed every 15 minutes? Polling every 3 seconds would be horribly inefficient, but long-polling with timeouts at 60 seconds would still have some wasted round trips to the server.
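The waste figures in the comparison above can be reproduced with a small calculation. The model is simplified: it assumes data changes at perfectly regular intervals and that each request (or long-poll timeout) lasts exactly one interval.

```python
import math


def requests_per_change(change_interval_s: float, request_interval_s: float) -> int:
    """Requests issued per data change (a poll interval or a long-poll timeout)."""
    return math.ceil(change_interval_s / request_interval_s)


def wasted_requests(change_interval_s: float, request_interval_s: float) -> int:
    """Requests per change that come back with nothing new."""
    return requests_per_change(change_interval_s, request_interval_s) - 1


if __name__ == "__main__":
    # Polling every 3 s while data changes every 30 s: the "9 repeats" case.
    print(wasted_requests(30, 3))
    # Data changes every 15 minutes: 3 s polling versus 60 s long-poll timeouts.
    print(wasted_requests(15 * 60, 3))
    print(wasted_requests(15 * 60, 60))
```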
Websockets is the next technology advancement that will allow a browser to open a connection with the server and keep it open for as long as it wants and deliver multiple messages or chunks of data via the same open websocket. The server can then send down updates exactly when new data is ready. The websocket connection is already established and waiting for data, so it is quick and efficient.
Reality Check
The problem is that Websockets is still in its infancy. Only the very latest generation of browsers support it, if at all. The spec hasn't been fully ratified as of this posting, so implementations can vary from browser to browser. And of course your visitors may be using browsers a few years old. So unless you can control what browsers your visitors are using (say corporate intranet where IT can dictate the software on the workstations) you'll need a mechanism to abstract away this transport layer so that your code can use the best technique available for that particular visitor's browser.
There are other benefits to having an abstracted communications layer. For example what if you had 3 grid controls on your page all pull polling every 3 seconds, see the mess this would be? Now rolling your own long-polling implementation can clean this up some, but it would be even cooler if you aggregated the updates for all 3 of these tables into one long-polling request. That will again cut down on waste. If you have a small project, again you could roll your own, but there is a standard Bayeux Protocol that many server push implementations follow. The Bayeux protocol automatically aggregates messages for delivery and then segregates messages out by "channel" (an arbitrary path-like string you as a developer use to direct your messages). Clients can listen on channels, and you can publish data on channels, the messages will get to all clients listening on the channel(s) you published to.
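To illustrate the channel idea only, here is a toy in-memory hub. It is not an implementation of the Bayeux protocol and has no network layer; the class, method names and channel paths are invented for this sketch. It shows how updates for several grids can be queued per client and delivered together in one response.

```python
from collections import defaultdict
from typing import Any, DefaultDict, List, Set, Tuple


class ChannelHub:
    """Toy publish/subscribe hub with one aggregated outbox per client."""

    def __init__(self) -> None:
        self._subscriptions: DefaultDict[str, Set[str]] = defaultdict(set)
        self._outboxes: DefaultDict[str, List[Tuple[str, Any]]] = defaultdict(list)

    def subscribe(self, client_id: str, channel: str) -> None:
        self._subscriptions[channel].add(client_id)

    def publish(self, channel: str, message: Any) -> int:
        """Queue the message for every listener; return how many received it."""
        listeners = self._subscriptions.get(channel, set())
        for client_id in listeners:
            self._outboxes[client_id].append((channel, message))
        return len(listeners)

    def drain(self, client_id: str) -> List[Tuple[str, Any]]:
        """Return and clear everything queued for a client, as one batch."""
        return self._outboxes.pop(client_id, [])


if __name__ == "__main__":
    hub = ChannelHub()
    hub.subscribe("page-1", "/grid/orders")
    hub.subscribe("page-1", "/grid/stock")
    hub.publish("/grid/orders", {"id": 7})
    hub.publish("/grid/stock", {"sku": "A1"})
    # One long-poll response could carry both updates.
    print(hub.drain("page-1"))
```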
The number of server side server push tool kits available is growing quite fast these days as Push technology is becoming the next big thing. There are probably 20 or more working implementations of server push out there. Do your own search for "{Your favorite platform} comet implementation" as it will continue to change every few months I'm sure (and has been covered on stackoverflow before).

Is there an alternative of ajax that does not require polling without server side modifications?

I'm trying to create a small and basic "ajax" based multiplayer game. Coordinates of objects are provided by a PHP "handler". This handler.php file is polled every 200 ms using ajax.
Since there is no need to poll when nothing happens, I wonder, is there something that could do the same thing without frequent polling? Eg. Comet, though I heard that you need to configure server side applications for Comet. It's a shared webserver, so I can't do that.
Maybe prevent the handler.php file from even returning a response if nothing has to be changed at the client, is that possible? Then again you'd still have the client uselessly asking for a response even though nothing has changed yet. Basically, it should only use bandwidth and server resources if something needs to be told to the client, e.g. the change of an object's coordinates.
Comet is generally used for this kind of thing, and it can be a fragile setup as it's not a particularly common technology so it can be easy not to "get it right." That said, there are more resources available now than when I last tried it ~2 years ago.
I don't think you can do what you're thinking and have handler.php simply not return anything and stop execution: The web server will keep the connection open and prevent any further polling until handler.php does something (terminates or provides output). When it does, you're still handling a response.
You can try a long polling technique, where your AJAX allows a very large timeout (e.g. 30 seconds), and handler.php spins without responding until it has something to report, then returns. (You'll want to make sure the spinning is not resource-intensive). If handler.php "expires" and nothing happens, have it exit and let AJAX poll again. Since it only happens every 30 seconds, it will be a huge improvement over ~5 times a second. That would keep your polling to a minimum.
But that's the sort of thing Comet is designed for.
As Ajax only offers you a client-server request model (normally termed pull, rather than push), the only way to get data from the server is via requests. However, a common technique to get around this is for the server to only respond when it has new data. So the client makes a request, the server hangs on to that request until something happens and then replies. This gets around the need for frequent polling even when the data hasn't changed, as the client only needs to send a new request after it gets a response.
Since you are using PHP, one simple method might be to have the PHP code call the sleep command for 200ms at a time between checks for data changes and then return the data to the client when it does change.
EDIT: I would also recommend having a timeout on the request. So if nothing happens for say 2 seconds, a "no change" message is sent back. That way the client knows the server is still alive and processing its request.
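The server-side half of long polling described in these answers (sleep between checks, return on change, send "no change" after a timeout) might be sketched like this. It is written in Python rather than PHP, and `get_version` is a hypothetical callable that returns something comparable, such as a counter or timestamp of the last change.

```python
import time
from typing import Any, Callable, Tuple


def wait_for_change(get_version: Callable[[], Any],
                    last_seen: Any,
                    timeout_s: float = 30.0,
                    check_interval_s: float = 0.2,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[str, Any]:
    """Hold a request open until the data changes or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        current = get_version()
        if current != last_seen:
            return ("changed", current)      # reply immediately with the new state
        if clock() >= deadline:
            return ("no change", last_seen)  # tells the client the server is alive
        sleep(check_interval_s)              # sleep instead of spinning on the CPU
```

The client passes back the version it last saw and simply issues a new request as soon as either reply arrives. As noted in the answers, each waiting request still occupies a connection on the server.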
Since this is tagged “html5”: HTML5 has <eventsource> and WebSocket, but the implementation side is still in the future tense in practice.
Opera implemented an old version of <eventsource> called <event-source>.
Here's a solution - use a SaaS comet provider, such as WebSync On-Demand. No server resources to worry about, shared hosting or not, since it's all offloaded, and you can push out the information as needed.
Since it's SaaS, it'll work with any server language. For PHP, there's already a publisher written and ready to go.
The server must take part in this. Check with the hosting provider what modules are available. Or try to convince them to support Comet.
Maybe you should consider a small Virtual Private Server (VPS) for this.
One thing to add on the long polling suggestions: If you're on a shared server, this solution will have limited scalability, as each active long poll will keep a connection (and a server-side process to service that connection) active. Your provider most likely has limits (either policy-defined or de facto) on the number of connections you can have open at a time, so you'll hit a wall if you have more sessions/windows than that playing concurrently.
