I have noticed that some AJAX-heavy sites (ones I visit, not ones I have built) have certain auto-refresh features. For example, in Gmail, if I get a new message, I see the new message without a page reload. It's the same with the Facebook browser-based IM client. From what I can tell, there aren't any Java applets handling the server-browser binding, so I'm left to assume it's being done by AJAX and perhaps some element I'm unaware of. So by my best guess, it's done in one of two ways:
The JavaScript does a steady "ping" to a server-side script, checking for any updates that might be available (which would explain why some of these pages bring any other heavy-duty pages to a crawl), or
The JavaScript sits idly by and a server-side script actually "pushes" any updates to the browser. But I'm not sure if this is possible. I'd imagine there is some kind of AJAX function that still pings, but all it asks is "any updates?" and the server-side script has a simple boolean that says "nope" or "I'm glad you asked." But if this is the case, any data change would need to call the script directly so that it has the change ready and can update that boolean.
So is that possible/feasible/how it works? I imagine something like:
Someone sends an email/IM/DB update to the server, the server calls the script using the script's URL plus some relevant GET variable, the script notes the change and updates the "updates available" variable, the AJAX gets the response that there are in fact updates, the AJAX runs its normal "update page" functions, which executes the normal update scripts and outputs them to the browser.
I ask because it seems really inefficient that the js is just doing a constant check which requires a) the server to do work every 1.5 seconds, and b) my browser to do work every 1.5 seconds just so that on my end I can say "Oh boy, I got an IM! just like a real IM client!"
Read about Comet
I've actually been working on a small .NET Web App that uses the Ajax with long polling technique described.
Depending on what technology you're using, you could use thread signaling mechanisms to hold your request until an update is retrieved.
With ASP.NET I'm running my server on a single machine, so I store a reference to my Producer object (which contains a thread that processes the data). To initiate the data pull, my service's Subscribe method is called, which creates a Consumer object that's registered with the Producer. If the Consumer is in long polling mode, it has an AutoResetEvent which is signaled whenever it receives new data. Whenever the web client makes a request for data, the Consumer first waits on the reset event and then returns the new data.
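For anyone who wants to see the shape of that Producer/Consumer arrangement without .NET, here is a minimal Python sketch. It is not my actual code: the class and method names (Producer, Consumer, subscribe, publish, poll) are just illustrative, and threading.Event stands in for AutoResetEvent, so the reset has to be done by hand.

```python
import threading
from collections import deque


class Consumer:
    """Buffers data for one web client (hypothetical name, mirrors the description above)."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()
        self._has_data = threading.Event()  # rough stand-in for AutoResetEvent

    def deliver(self, item):
        # Called from the producer side when new data arrives.
        with self._lock:
            self._items.append(item)
            self._has_data.set()

    def poll(self, timeout=30.0):
        # Called by the request handler: block until data arrives or the timeout expires.
        if not self._has_data.wait(timeout):
            return []  # timed out; the client simply issues another long poll
        with self._lock:
            items = list(self._items)
            self._items.clear()
            self._has_data.clear()  # manual reset, since Event does not reset itself
        return items


class Producer:
    """Fans new data out to every registered consumer."""

    def __init__(self):
        self._consumers = []
        self._lock = threading.Lock()

    def subscribe(self):
        consumer = Consumer()
        with self._lock:
            self._consumers.append(consumer)
        return consumer

    def publish(self, item):
        with self._lock:
            consumers = list(self._consumers)
        for consumer in consumers:
            consumer.deliver(item)
```

The request thread sleeps inside poll() instead of spinning, which is the whole point of using a signaling primitive here.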
But you're mentioning something about PHP - as far as I know persistence is maintained through serialization, not actually keeping the object in memory, so I don't know how you could reference a Producer object using $_CACHE[] or $_SESSION[]. When I developed in PHP I never really knew anything about multithreading so I didn't play around with it, but I guess you can look into that.
Using infinite loops is going to consume a lot of your processing power - I would exhaust all other options first.
I would like to process some data in a Qt application. This data can be found on a web page which uses Ajax to dynamically update itself.
For example, the page itself is www.example.com, and it uses Ajax to load data from www.example.com/data, which is a plain text file. If I view www.example.com in a browser, I can clearly see when the data is updated.
The brute force solution would be to just call the QWebView's load(QUrl("www.example.com/data")) every couple of seconds, or every time its loadFinished() signal is emitted, but that would be a waste of bandwidth, and I would be downloading the same data over and over. The time between updates could theoretically be a few seconds, but it could also be minutes, hours, or longer.
Is there a possibility to only reload the data when the page is updated?
The traditional AJAX model uses the following sequence of events:
Browser opens connection
Browser sends request
Server sends response
Server closes connection
Because the connection is closed, there is no way for the server to notify your browser if any data have changed. In order to get this information, you have no option but to query the server periodically.
As you mentioned in your question, this is not very efficient since you can waste a lot of bandwidth if nothing changes for a long while.
WebSockets is a more up-to-date technology that tries to overcome this inefficiency and Qt has a module that caters for this.
Unfortunately, it's not universal yet so, if you want to use WebSocket technology on a third-party server, you need to have traditional AJAX code to fall back on in case WebSockets are not supported.
EDIT:
Unfortunately, WebSockets are not the golden solution. It's still up to the server to have been programmed to send out notifications of changes. If the server does not have this feature, it won't matter if you're using WebSockets or traditional AJAX, you'll still have to keep querying for changes.
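If you do end up polling, you can at least avoid re-processing data that has not changed. Here is a rough Python sketch of the idea (the same logic would apply in a Qt timer slot). The function name poll_for_changes and its parameters are made up for illustration, and fetch is assumed to be any callable that returns the body of www.example.com/data as bytes. Note that this still downloads the data on every poll; it only skips the processing step.

```python
import hashlib
import time


def poll_for_changes(fetch, on_change, interval=5.0, max_polls=None, sleep=time.sleep):
    """Call fetch() repeatedly and invoke on_change(body) only when the body differs."""
    last_digest = None
    polls = 0
    while max_polls is None or polls < max_polls:
        body = fetch()  # assumed to return bytes
        digest = hashlib.sha256(body).hexdigest()
        if digest != last_digest:
            on_change(body)  # first poll, or the content changed
            last_digest = digest
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)  # wait before asking the server again
```

Picking the interval is the usual trade-off: shorter means fresher data and more wasted requests.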
Use a StackOverflow Q&A thread as an example - when you vote up, vote down, or favorite a question, you can see the UI quickly respond to that action with changes in the # of up-votes on the side.
How can we achieve that effect? If you send every such action to the back-end for processing and use the returned response to update the UI, you will see a slow update and feel the glitches. But if you put some of the logic on the front-end, you also need to take care of fraud/abuse etc. before reflecting the action in the UI, i.e. before changing the # of up-votes, don't you need to make sure that it's a valid click by a valid user first?
You make sure that a valid user is using the app before a user clicks on anything. This is done through authentication, and it must include various protection mechanisms against malicious users.
When a user clicks, a call is made to a server. In a properly architected app this call is lightweight, and the server responds very quickly. I don't know why you believe that "you will see a slow update and feel the glitches". Adding an upvote to the database should take a few hundred milliseconds at most (including the roundtrip from the client), especially if the commit is asynchronous or a memcache is used.
If a database update results in a need to do some complex operations, typically these operations are not done right away. For example, a cron job may run periodically to compute new rankings, etc., precisely because you do not want every user to wait. Alternatively, a task is created and put in a task queue to be executed when resources are available - again to make sure that a user does not wait.
In some apps a UI is updated immediately after the call to the server is made, before any response from a server arrives. You can do it when the consequences of a failed call are negligible. For example, if an upvote fails to be saved in the database, it's not a disaster, especially if it happens once in a million tries. Again, in a properly architected app calls fail extremely rarely.
This is a decision that an app developer needs to make. I would not update a UI before a server response if such an update may lead a user to believe that some other action is now possible. For example, if a user uploads a new photo, I would not show icons to edit or share this photo until I know that the photo is safely saved.
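To make the "update first, confirm later" idea concrete, here is a small Python sketch. It is deliberately synchronous and UI-free; in a real browser app the server call would be asynchronous. VoteWidget and send_vote are hypothetical names, and send_vote is assumed to return True when the server accepted the vote.

```python
class VoteWidget:
    """Toy model of a vote counter that is updated before the server confirms."""

    def __init__(self, count, send_vote):
        self.count = count
        self._send_vote = send_vote  # hypothetical callable, True on success

    def upvote(self):
        self.count += 1  # show the new count immediately
        try:
            accepted = self._send_vote()
        except Exception:
            accepted = False  # treat any error as a failed call
        if not accepted:
            self.count -= 1  # roll the displayed count back
        return accepted
```

The rollback is cheap here because nothing else depends on the count. That is exactly the kind of case where an early update is safe.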
We're running into a transaction issue with Grails.
During a performance test we have a scenario where a single API is called multiple times for the same user.
During each call something is changed on the domain object and it is saved in the database.
We have discovered that the update in the database can be made after the response was sent to the client, so another request for the same API can arrive on the server before the update lands.
So we end up with another API call which selects data from the database before the first API call updates it, and we get a StaleObjectStateException when the second request tries to save to the database.
We were using the auto commit feature in Grails, which saves everything when the transaction is finished. So the first decision was to start using .save() before render() in the controller.
It's OK when doing it for simple APIs, but we do have some more complex APIs where we would have to keep track of quite a lot of objects and save them explicitly. Currently it seems to work without flush:true, but we are still testing.
So my question is: is there any way to make sure that the response is not sent before the transaction is committed in Grails?
This is probably due to caching. If you require guarantees as to the state of the db, you need to .save(flush:true).
Do note that this flushes all across your session, so it might affect performance in a negative way.
Edit:
Not sure how I managed to read your question without seeing the part about you already doing flush:true.
Anyways, that is indeed the way you must go.
This was due to a tool we used for testing - SoapUI. It can send a duplicate request before response comes back. This made us think it was a Grails fault. It was not.
A web application I am developing needs to perform tasks that are too long to be executed during the http request/response cycle. Typically, the user will perform the request, the server will take this request and, among other things, run some scripts to generate data (for example, render images with povray).
Of course, these tasks can take a long time, so the server should not wait for the scripts to complete execution before sending the response to the client. I therefore need to execute the scripts asynchronously, give the client a "the resource is here, but not ready" response, and probably tell it an AJAX endpoint to poll, so it can retrieve and display the resource when ready.
Now, my question is not related to the design (although I would very much enjoy any hints in this regard as well). My question is: does a system to solve this issue already exist, so I do not reinvent the square wheel? If I had to, I would use a process queue manager to submit the task and put up an HTTP endpoint to report the status, something like "pending", "aborted", "completed", to the AJAX client, but if something similar already exists specifically for this task, I would much prefer to use it.
I am working in python+django.
Edit: Please note that the main issue here is not how the server and the client must negotiate and exchange information about the status of the task.
The issue is how the server handles the submission and enqueue of very long tasks. In other words, I need a better system than having my server submit scripts on LSF. Not that it would not work, but I think it's a bit too much...
Edit 2: I added a bounty to see if I can get some other answer. I checked pyprocessing, but I cannot perform submission of a job and reconnect to the queue at a later stage.
You should avoid re-inventing the wheel here.
Check out Gearman. It has libraries in a lot of languages (including Python) and is fairly popular. Not sure if anyone has any out-of-the-box ways to easily connect up Django to Gearman and AJAX calls, but it shouldn't be too complicated to do that part yourself.
The basic idea is that you run the gearman job server (or multiple job servers), have your web request queue up a job (like 'resize_photo') with some arguments (like '{photo_id: 1234}'). You queue this as a background task. You get a handle back. Your ajax request is then going to poll on that handle value until it's marked as complete.
Then you have a worker (or probably many), a separate Python process that connects to this job server, registers itself for 'resize_photo' jobs, does the work and then marks it as complete.
I also found this blog post that does a pretty good job summarizing its usage.
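To show the flow (queue a job, get a handle back, poll on the handle, worker marks it complete), here is a tiny in-process Python sketch. To be clear, this is not the Gearman API: it uses only the standard library's queue and threading modules as a stand-in for the job server, and submit, get_status and worker are names I made up. A real setup would have the worker in a separate process talking to the job server.

```python
import queue
import threading
import uuid

jobs = queue.Queue()  # stand-in for the job server
status = {}           # handle -> "pending" or "completed"
status_lock = threading.Lock()


def submit(task_name, args):
    """Queue a background job and return a handle the client can poll on."""
    handle = uuid.uuid4().hex
    with status_lock:
        status[handle] = "pending"
    jobs.put((handle, task_name, args))
    return handle


def get_status(handle):
    """What the AJAX polling endpoint would report for a handle."""
    with status_lock:
        return status.get(handle, "unknown")


def worker(handlers):
    """Pull jobs off the queue and run the handler registered for each task name."""
    while True:
        item = jobs.get()
        if item is None:  # sentinel: shut the worker down
            break
        handle, task_name, args = item
        handlers[task_name](**args)
        with status_lock:
            status[handle] = "completed"
```

Usage would be something like starting threading.Thread(target=worker, args=({"resize_photo": resize_photo},)), where resize_photo is your own function, and then calling submit("resize_photo", {"photo_id": 1234}) from the web request.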
You can try two approaches:
Call the web server at a fixed interval with a job id; the server looks up the job and returns some information about the current state of that task.
Implement a long-running page that sends data at a fixed interval; for the client, that HTTP request will "always" be "loading", and it needs to collect the new information every time a new piece of data is received.
For the second option, you can learn more by reading about Comet. Using ASP.NET, you can do something similar by implementing the System.Web.IHttpAsyncHandler interface.
I don't know of a system that does it, but it would be fairly easy to implement one's own system:
create a database table with jobid, jobparameters, jobresult
jobresult is a string that will hold a pickle of the result
jobparameters is a pickled list of input arguments
when the server starts working on a job, it creates a new row in the table, and spawns a new process to handle that, passing that process the jobid
the task handler process updates the jobresult in the table when it has finished
a webpage (xmlrpc or whatever you are using) contains a method 'getResult(jobid)' that will check the table for a jobresult
if it finds a result, it returns the result, and deletes the row from the table
otherwise it returns an empty list, or None, or your preferred return value to signal that the job is not finished yet
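A minimal sketch of the table and the getResult logic, assuming sqlite3 as the database and leaving out the part that spawns the handler process. The function names (open_db, create_job, finish_job, get_result) are just for illustration, and the pickles are stored as BLOBs rather than strings.

```python
import pickle
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the jobs table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "jobid INTEGER PRIMARY KEY, jobparameters BLOB, jobresult BLOB)"
    )
    return conn


def create_job(conn, params):
    """Insert a row for a new job and return its jobid."""
    cur = conn.execute(
        "INSERT INTO jobs (jobparameters, jobresult) VALUES (?, NULL)",
        (pickle.dumps(list(params)),),
    )
    conn.commit()
    return cur.lastrowid


def finish_job(conn, jobid, result):
    """Called by the task handler process once the work is done."""
    conn.execute(
        "UPDATE jobs SET jobresult = ? WHERE jobid = ?",
        (pickle.dumps(result), jobid),
    )
    conn.commit()


def get_result(conn, jobid):
    """Return the result and delete the row, or None if the job is not finished."""
    row = conn.execute(
        "SELECT jobresult FROM jobs WHERE jobid = ?", (jobid,)
    ).fetchone()
    if row is None or row[0] is None:
        return None
    conn.execute("DELETE FROM jobs WHERE jobid = ?", (jobid,))
    conn.commit()
    return pickle.loads(row[0])
```

One of the edge cases is visible here already: a job whose real result is None cannot be told apart from an unfinished one. Also, only unpickle data your own code wrote.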
There are a few edge-cases to take care of so an existing framework would clearly be better as you say.
First, you need a separate "worker" service, which is started separately at power-up and communicates with the HTTP request handlers via some local IPC such as a UNIX socket (fast) or a database (simple).
While handling a request, the CGI script asks the worker for state or other data and replies to the client.
You can signal that a resource is being "worked on" by replying with a 202 HTTP code: the Client side will have to retry later to get the completed resource. Depending on the case, you might have to issue a "request id" in order to match a request with a response.
Alternatively, you could have a look at existing COMET libraries which might fill your needs more "out of the box". I am not sure if there are any that match your current Django design though.
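For the 202 approach, the handler logic could look roughly like the Python sketch below. It is framework-agnostic on purpose: handle_get just returns a (status, body) pair, and the pending/finished stores and the function names are assumptions for the example, not part of Django or any library.

```python
from http import HTTPStatus

pending = set()  # request ids still being worked on
finished = {}    # request id -> completed resource (bytes)


def handle_get(request_id):
    """Return (status, body) for a client asking about request_id."""
    if request_id in finished:
        return HTTPStatus.OK, finished[request_id]
    if request_id in pending:
        return HTTPStatus.ACCEPTED, b""  # 202: not ready, retry later
    return HTTPStatus.NOT_FOUND, b""


def complete(request_id, body):
    """Called by whatever does the work once the resource is ready."""
    pending.discard(request_id)
    finished[request_id] = body
```

The client keeps retrying with the same request id while it gets 202, and stops when it gets 200.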
Probably not a great answer for the python/django solution you are working with, but we use Microsoft Message Queue for things just like this. It basically runs like this
Website updates a database row somewhere with a "Processing" status
Website sends a message to the MSMQ (this is a non blocking call so it returns control back to the website right away)
Windows service (could be any program really) is "watching" the MSMQ and gets the message
Windows service updates the database row with a "Finished" status.
That's the gist of it anyways. It's been quite reliable for us and really straight forward to scale and manage.
Another good option for python and django is Celery.
And if you think that Celery is too heavy for your needs then you might want to look at simple distributed taskqueue.
I'm trying to create a small and basic "ajax" based multiplayer game. Coordinates of objects are provided by a PHP "handler". This handler.php file is polled every 200 ms using AJAX.
Since there is no need to poll when nothing happens, I wonder, is there something that could do the same thing without frequent polling? Eg. Comet, though I heard that you need to configure server side applications for Comet. It's a shared webserver, so I can't do that.
Maybe prevent the handler.php file from even returning a response if nothing has to be changed at the client, is that possible? Then again you'd still have the client uselessly asking for a response even though nothing has changed yet. Basically, it should only use bandwidth and server resources if something needs to be told to the client, e.g. the change of an object's coordinates.
Comet is generally used for this kind of thing, and it can be a fragile setup as it's not a particularly common technology so it can be easy not to "get it right." That said, there are more resources available now than when I last tried it ~2 years ago.
I don't think you can do what you're thinking and have handler.php simply not return anything and stop execution: The web server will keep the connection open and prevent any further polling until handler.php does something (terminates or provides output). When it does, you're still handling a response.
You can try a long polling technique, where your AJAX allows a very large timeout (e.g. 30 seconds), and handler.php spins without responding until it has something to report, then returns. (You'll want to make sure the spinning is not resource-intensive). If handler.php "expires" and nothing happens, have it exit and let AJAX poll again. Since it only happens every 30 seconds, it will be a huge improvement over ~5 times a second. That would keep your polling to a minimum.
But that's the sort of thing Comet is designed for.
As Ajax only offers you a client server request model (normally termed pull, rather than push), the only way to get data from the server is via requests. However a common technique to get around this is for the server to only respond when it has new data. So the client makes a request, the server hangs on to that request until something happens and then replies. This gets around the need for frequent polling even when the data hasn't changed as you only need the client send a new request after it gets a response.
Since you are using PHP, one simple method might be to have the PHP code pause for 200 ms at a time between checks for data changes (usleep rather than sleep, since sleep takes whole seconds) and then return the data to the client when it does change.
EDIT: I would also recommend having a timeout on the request. So if nothing happens for say 2 seconds, a "no change" message is sent back. That way the client knows the server is still alive and processing its request.
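Here is the sleep-and-check loop with a timeout, sketched in Python rather than PHP to keep it short. The names long_poll, get_version and last_seen are made up: get_version is assumed to be a cheap check (say, reading a counter or timestamp) and last_seen is whatever version the client reported having.

```python
import time


def long_poll(get_version, last_seen, timeout=30.0, check_interval=0.2,
              clock=time.monotonic, sleep=time.sleep):
    """Hold the request until the data changes or the timeout expires."""
    deadline = clock() + timeout
    while True:
        current = get_version()
        if current != last_seen:
            return {"changed": True, "version": current}  # reply right away
        if clock() >= deadline:
            return {"changed": False, "version": last_seen}  # "no change" reply
        sleep(check_interval)  # sleep between checks instead of spinning
```

On a "no change" reply the client just sends the next request, so the number of round trips is bounded by the timeout rather than by the check interval.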
Since this is tagged “html5”: HTML5 has <eventsource> and WebSocket, but the implementation side is still in the future tense in practice.
Opera implemented an old version of <eventsource> called <event-source>.
Here's a solution - use a SaaS comet provider, such as WebSync On-Demand. No server resources to worry about, shared hosting or not, since it's all offloaded, and you can push out the information as needed.
Since it's SaaS, it'll work with any server language. For PHP, there's already a publisher written and ready to go.
The server must take part in this. Check with the hosting provider what modules are available. Or try to convince them to support Comet.
Maybe you should consider a small Virtual Private Server (VPS) for this.
One thing to add on the long polling suggestions: If you're on a shared server, this solution will have limited scalability, as each active long poll will keep a connection (and a server-side process to service that connection) active. Your provider most likely has limits (either policy-defined or de facto) on the number of connections you can have open at a time, so you'll hit a wall if you have more sessions/windows than that playing concurrently.