How fast can I poll with AJAX? - performance

Let's say I'm updating some data on a page using Ajax. I need to call it on a timer so if a user sits on the page it will keep updating. What's a reasonable rate to poll at to try and maintain a "real-time" feel without running up the client's CPU usage or impeding them in some other noticeable way?

If you really want to maintain that level of a real time feel, I would strongly suggest that you look into Comet. Comet -- also known as Reverse Ajax -- involves the client JavaScript opening a connection to the server and the server keeping that connection open until it's ready to send a response/update to the client.
This is used a lot in live chat applications, and there are Chat Demos and other Comet Demos that demonstrate the concept.
If you poll any more often than about 10 seconds, you will waste bandwidth and CPU cycles. The overhead of opening and closing a connection and the load on your server would be intense.

Related

Is this a correct scenario to use WebSocket?

I have a browser plugin which will be installed on 40,000 dekstops.
This plugin will connect to a backend configuration file available via https, e.g. http://somesite/config_file.js.
The plugin is configured to poll this backend resource once/day.
But there is only one backend server. So if 40,000 endpoints start polling together the server might crash.
I could think of randomize the polling frenquency from the desktop plugins. But randomization still does not gurantee that there will not be a overload at the server.
Is using websocket in this scenario solves the scalability issue?
Polling once a day is very little.
I don't see any upside for Websockets unless you switch to Push and have more notifications.
However, staggering the polling does make a lot of sense, since syncing requests for the same time is like writing a DoS attack against your own server.
Staggering doesn't necessarily have to be random and IMHO, it probably shouldn't.
You could start with a fixed time and add a second per client ID, allowing for ~86K connections in 24 hours which should be easy for any server to handle.
As a side note, 40K concurrent connections might not as hard to achieve as you imagine.
EDIT (relating to the comments)
Websockets vs. Server Sent Events:
IMHO, when pushing data (vs. polling), I would prefer Websockets over Server Sent Events (SSE).
Websockets have a few advantages, such as client side communication which allows clients to ping the server and confirm that the connection is still alive.
The Specific Use-Case:
From the description in the question and the comments it seems that you're using browser clients with a custom plugin and that the updates you wish to install daily might require the browser to be active.
This raises different questions that effect the implementation (are the client browsers open all day? do you have any control over the client browsers and their environment? can you guarantee installation while the browser is closed?).
...
IMHO, you might consider having the client plugins test for an update each morning as they load for the first time during that day (first access).
People arrive at work in different times and they open their browsers for the first time at different schedules. So the 40K requests you're expecting will be naturally scattered across that timeline (probably a 20-30 minute timespan).
This approach makes sure that the browsers and computers are actually open (making the update possible) and that the update requests are staggered over a period of time (about 33.3 requests per second, if my assumption is correct).
If you're serving a pre-written static configuration file (perhaps updated by the server daily), avoiding dynamic content and minimizing any database calls, than 33 req/sec should be very easy to manage.

How to read data from an Ajax web service with Qt?

I would like to process some data in a Qt application. This data can be found on a web page which uses Ajax to dynamically update itself.
For example, the page itself is www.example.com, and it uses Ajax to load data from www.example.com/data, which is a plain text file. If I view www.example.com in a browser, I can clearly see when the data is updated.
The brute force solution would be to just call the QWebView's load(QUrl("www.example.com/data")) every couple of seconds, or every time its loadFinished() signal is emitted, but that would be a waste of bandwidth, an I will be downloading the same data over and over. The time between updates could theoretically be a few seconds, but it could also be minutes, hours, or longer.
Is there a possibility to only reload the data when the page is updated?
The traditional AJAX model uses the following sequence of events:
Browser opens connection
Browser sends request
Server sends response
Server closes connection
Because the connection is closed, there is no way for the server to notify your browser if any data have changed. In order to get this information, you have no option but to query the server periodically.
As you mentioned in your question, this is not very efficient since you can waste a lot of bandwidth if nothing changes for a long while.
WebSockets is a more up-to-date technology that tries to overcome this inefficiency and Qt has a module that caters for this.
Unfortunately, it's not universal yet so, if you want to use WebSocket technology on a third-party server, you need to have traditional AJAX code to fall back on in case WebSockets are not supported.
EDIT:
Unfortunately, WebSockets are not the golden solution. It's still up to the server to have been programmed to send out notifications of changes. If the server does not have this feature, it won't matter if you're using WebSockets or traditional AJAX, you'll still have to keep querying for changes.

browser implication when ajax call every 3 sec

We would like to check every 3 seconds if there are any updates in our database, using jquery $.ajax. Technology is clear but are there any reasons why not to fire so many ajax calls? (browser, cache, performance, etc.). The web application is running for round about 10 hrs per day on every client.
We are using Firefox.
Ajax calls has implications not on client side(Browser,...) but on the server side. For example, every ajax call is a hit on server. ie. more bandwidth consumption, no of server request hit increases which in turn increases server load etc etc. Ajax call is actually meant to increase client friendliness at the cost of Server side implications.
Regards,
Ravi
You should think carefully before implementing infinite repeating AJAX calls with an arbitrary delay between them. How did you come up with 3 seconds? If you're going to be polling your server in this way, you need to reduce the frequency of requests to as low a number as possible. Here are some things to think about:
Is the data you're fetching really going to change that often?
Can your server handle a request every 3 seconds, how long does the operation take for a single request?
Could you increase the delay after inactivity or guess based on previous server responses how long the next delay should be?
Can you stop the polling completely when the window loses focus, and restart it when it's in the foreground again.
If a user opens the same page in a website 10 times, your server should recognise this and throttle its responses, either using a cookie with a unique value in it (recommended) or based on the client IP address.
Above all, instead of polling, consider using HTML 5 web sockets to "push" data to the client - most modern browsers support this. Several frameworks are available that will fall back to polling if web sockets are not available - one excellent .NET example is SignalR.
I've seen a lot of application making request each 5sec or so, for instance a remote control (web player) or a chat. So that should not be a problem for the browser to do so.
What would be a good practice is to wait an answer before making a new request, that means not firing the requests with a setInterval for instance.
(In the case the user lose its connection that would prevent opening too much connections).
Also verifying that all the calculations associated with an answer are done when receiving the next answer.
And if you have access to that in the server side, configure you server to set http headers Connection: Keep-Alive, so you won't add to much TCP overhead to each of your requests. That could speed up small requests a lot.
The last point I see is of course verifying that you server is able to answer that much request.
You are looking for any changes after each 3sec , In this way the traffic would be increased as you fetching data after short duration and continuously . It may also continuous increase the memory usage on browser side . As you need to check any update done in the database , you can go for any other alternatives like Sheepjax , Comet or SignalR . (SignalR generally broadcast the data to all users and comet needs license ) . Hope this may help you .

Periodic Ajax POST calls versus COMET/Websocket Push

On a site like Trello.com, I noticed in firebug console that it makes frequent and periodic Ajax POST calls to its server to retrieve new data from the database and update the dom as and when something new is available.
On the other hand, something like Facebook notifications seem to be implementing a COMET push mechanism.
What's the advantage and disadvantage of each approach and specifically, my question is why Trello.com uses a "pull" mechanism as I have always thought using such an approach (especially since it pings its server so frequently) as it seems like it is not a scalable solution - when more and more users sign up to use its services?
Short Answer to Your Question
Your gut instinct is correct. Long-polling (aka comet) will be more efficient than straight up polling. And when available, websockets will be more efficient than long-polling. So why some companies use the "pull polling" is quite simply: they are out of date and need to put some time into updating their code base!
Comparing Polling, Long-Polling (comet) and WebSockets
With traditional polling you will make the same request repeatedly, often parsing the response as JSON or stuffing the results into a DOM container as content. The frequency of this polling is not in any way tied to the frequency of the data updates. For example you could choose to poll every 3 seconds for new data, but maybe the data stays the same for 30 seconds at a time? In that case you are wasting HTTP requests, bandwidth, and server resources to process many totally useless HTTP requests (the 9 repeats of the same data before anything has actually changed).
With long polling (aka comet), we significantly reduce the waste. When your request goes out for the updated data, the server accepts the request but doesn't respond if there is no new changes, instead it holds the request open for 10, 20, 30, or 60 seconds or until some new data is ready and it can respond. Eventually the request will either timeout or the server will respond with an update. The idea here is that you won't be repeating the same data so often like in the 3 second polling above, but you still get very fast notification of new data as there is likely already an open request just waiting for the server to respond to.
You'll notice that long polling reduced the waste considerably, but there will still be the chance for some waste. 30-60 seconds is a common timeout period for long polling as many routers and gateways will shutdown hanging connections beyond that time anyway. So what if your data is actually changed every 15 minutes? Polling every 3 seconds would be horribly inefficient, but long-polling with timeouts at 60 seconds would still have some wasted round trips to the server.
Websockets is the next technology advancement that will allow a browser to open a connection with the server and keep it open for as long as it wants and deliver multiple messages or chunks of data via the same open websocket. The server can then send down updates exactly when new data is ready. The websocket connection is already established and waiting for data, so it is quick and efficient.
Reality Check
The problem is that Websockets is still in its infancy. Only the very latest generation of browsers support it, if at all. The spec hasn't been fully ratified as of this posting, so implementations can vary from browser to browser. And of course your visitors may be using browsers a few years old. So unless you can control what browsers your visitors are using (say corporate intranet where IT can dictate the software on the workstations) you'll need a mechanism to abstract away this transport layer so that your code can use the best technique available for that particular visitor's browser.
There are other benefits to having an abstracted communications layer. For example what if you had 3 grid controls on your page all pull polling every 3 seconds, see the mess this would be? Now rolling your own long-polling implementation can clean this up some, but it would be even cooler if you aggregated the updates for all 3 of these tables into one long-polling request. That will again cut down on waste. If you have a small project, again you could roll your own, but there is a standard Bayeux Protocol that many server push implementations follow. The Bayeux protocol automatically aggregates messages for delivery and then segregates messages out by "channel" (an arbitrary path-like string you as a developer use to direct your messages). Clients can listen on channels, and you can publish data on channels, the messages will get to all clients listening on the channel(s) you published to.
The number of server side server push tool kits available is growing quite fast these days as Push technology is becoming the next big thing. There are probably 20 or more working implementations of server push out there. Do your own search for "{Your favorite platform} comet implementation" as it will continue to change every few months I'm sure (and has been covered on stackoverflow before).

Online chat - Ajax poll or Reverse Ajax

after an entire day of searches, I would to talk about the best solution for an online chat.
This is what I know:
Ajax poll is the old, bandwith consuming, and not scalable way of doing it. It makes a request for new data to the server every X seconds. This implies one database query every X seconds * number_of_connected_users.
Reverse Ajax and one of its application (comet) requires a customized web-server or a dedicated comet server which can handle the number_of_connected_users amount of long-time http connections.
My actual server is: 1 Xeon CPU, 1 GB of RAM and 1 Gb/s of bandwith. The server is a virtual machine (hence highly scalable).
I need a solution that could scale with the server and the future growing user base.
My doubts are:
How much the ajax polling method can impact my bandwith usage?
In which way can I optimize the ajax polling to make a db query only when necessary?
Can the comet server be run in the same machine of the web-server (Apache)?
With the comet way, I still need an interval to do the queries on the database and then send the response, so where is the real-time?
With my actual server, can the comet way work?
Thank you in advance.
You should never use polling if you can get away with it. It clogs up resources on both the server and client. The server must make more database requests with polling, more checks to see if data has changed.
The ajax polling method also generates more unneccessary requests. With polling, you use memory and CPU. Comet (if it's done properly) uses only memory.
The comet server can probably not run under Apache. Apache does not seem to be designed for long running requests. I'd recommend implementing your comet server in ruby (using EventMachine) an example, in Python (using Twisted), or in C.
I don't see why you need to have an interval to do database queries. When you make a change, you can just tell your comet server to notify the neccessary users of the change.
I'm writing my website in PHP.
So, I need to run a server (like twisted) and write my chat application in python?
This app should take care of incoming ajax request and push new data to clients.
If I understand well, this approach doesn't need a database, right?

Resources