Include picture in result, JAX-RS, any performance penalty?

Someone on this forum pointed out that sending back the picture itself is a bad idea since it has performance issues, and suggested sending back the picture's URL instead. I'm wondering: how could sending back a picture have a performance impact?

If the picture is rather large, it will take up considerable bandwidth and also slow down the rest of the JAX-RS framework. Let the client do the work of retrieving the image; it is their bandwidth, after all.
If the pictures are hosted elsewhere (on another server), why hog your bandwidth sending them? After all, while your server is handling hundreds of concurrent requests, each sending a photo, each client only needs to make a few requests.
No matter where the pictures are hosted, if you send them back through JAX-RS you are putting load on JAX-RS. Other requests will be served slowly because different threads are each tied up sending a large file. Overall, this will make your application considerably slower.
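The same principle can be sketched outside JAX-RS. Below is a minimal illustration in Python using Flask (not JAX-RS; the framework, routes, paths and IMAGE_BASE_URL are all assumptions made for the example): one endpoint streams the image bytes through the application server, the other returns only a small JSON payload containing the picture's URL.

    # Minimal sketch, not JAX-RS: Flask is used only to illustrate the trade-off.
    # The routes, file paths and IMAGE_BASE_URL are hypothetical.
    from flask import Flask, jsonify, send_file

    app = Flask(__name__)
    IMAGE_BASE_URL = "https://images.example.com"  # hypothetical static/CDN host

    @app.route("/items/<int:item_id>/picture")
    def picture_bytes(item_id):
        # Streams the whole file through the application server: a worker
        # thread and your bandwidth are tied up for the entire transfer.
        return send_file(f"/var/data/pictures/{item_id}.jpg", mimetype="image/jpeg")

    @app.route("/items/<int:item_id>")
    def item(item_id):
        # Returns a small JSON payload; the client fetches the image directly
        # from the image host, so the API server stays lightweight.
        return jsonify({"id": item_id, "pictureUrl": f"{IMAGE_BASE_URL}/{item_id}.jpg"})

A JAX-RS resource would do the same thing by returning a small entity containing the URL instead of the image bytes.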

Related

Parameters that affect performance of web app on different clients

I have built a web application, in which I display a map and I use both WMS and WFS requests in order to show network (lines) and points of interest on the map. The application has several filters in order to send queries to database (such as date filters etc.).
The app runs on a remote server. The speed when accessing the app from my browser (Firefox or Chrome) is satisfactory; everything runs quite smoothly.
My issue is that I get complaints related to speed and performance. So my question is: what does the performance depend on, and how is it possible that I have a great experience while others don't?
One hypothesis is that the computer power of the other client is too low (which is not the case).
Another one is that the server doesn't have enough resources (which is also not the case).
What are other parameters that affect the performance of a web app?
Connection speed is a very important factor, and so is response time, which is mostly a function of the distance (and therefore the network latency) between client and server.
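If you want to quantify that, a rough client-side timing check can help separate network latency from server-side slowness. The sketch below is only an example (the URL is a placeholder and it uses the third-party requests library); running it from your own machine and from an affected user's network makes the comparison concrete.

    # Rough client-side timing check; the URL is a placeholder.
    import time
    import requests  # third-party: pip install requests

    URL = "https://your-app.example.com/some/endpoint"

    samples = []
    for _ in range(10):
        start = time.perf_counter()
        requests.get(URL, timeout=30)
        samples.append(time.perf_counter() - start)

    print(f"min={min(samples):.3f}s "
          f"avg={sum(samples) / len(samples):.3f}s "
          f"max={max(samples):.3f}s")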

How can I improve the performance of this architecture?

I'm running a website that is CPU heavy due to a lot of thumbnailing of images.
This is how I currently do things:
User uploads image to server
Server keeps a copy, and stores the image on Amazon S3
When a thumbnail is requested, the server uses the local copy to generate it and then stores it on S3; it then gives the S3 URL to the client
Subsequent requests are optimized like this: the server caches the S3 URL in memcached, so it won't do the work again; the server never generates a thumbnail again if the file already exists; the server uses mid-sized thumbnails to generate small-sized ones, so as not to work with large files when it isn't necessary
Now, I'm hosting on a Linode 4G instance (8 cores with 4x priority, 4GB RAM), and despite my optimizations and having a memcached hit ratio of 70%, my average CPU is 170%. I'm constantly seeing all 8 CPUs working, with frequent spikes of 100% for many of them at the same time.
I'm using nginx and gunicorn to serve a Django application, and the thumbnails are generated with PIL.
How can I improve this architecture?
I was thinking about a few possibilities:
#1. Easiest: add a second identical server with a load balancer in front, so that they'd share the load.
The problem with this is that the two servers would not share the local image cache. Could I solve this by placing the cache on a shared network drive, or would the latency ultimately cancel out the gains?
#2. A little harder: split the thumbnailing code out of my app into a separate web service that would run on a second server. This way the main application and database would not suffer from high CPU usage, and the web pages would be served fast. The thumbnails are already served asynchronously with JavaScript anyway.
Can anyone recommend some other solution?
Are you sure your performance problems come from thumbnails? OK, I suppose you've checked that.
You can downsize and upload the two thumbnails to S3 immediately (or shortly) after the user uploads the image. This way you avoid the CPU load you're now spending on every HTTP request checking for those thumbnails and doing IPC with memcached.
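A minimal sketch of that upload-time approach, assuming Pillow (PIL) for the resizing and boto3 for S3; the bucket name, key layout and sizes are made up for illustration:

    # Generate both thumbnail sizes once, at upload time, and push them to S3.
    # Assumes Pillow and boto3; bucket name, key layout and sizes are hypothetical.
    import io
    import boto3
    from PIL import Image

    s3 = boto3.client("s3")
    BUCKET = "my-thumbnails"                      # hypothetical bucket
    SIZES = {"mid": (400, 400), "small": (100, 100)}

    def create_thumbnails(original_path, image_id):
        for name, size in SIZES.items():
            with Image.open(original_path) as img:
                img = img.convert("RGB")          # so PNG/alpha images save as JPEG
                img.thumbnail(size)               # in-place resize, keeps aspect ratio
                buf = io.BytesIO()
                img.save(buf, format="JPEG", quality=85)
                buf.seek(0)
                s3.upload_fileobj(buf, BUCKET, f"{image_id}/{name}.jpg",
                                  ExtraArgs={"ContentType": "image/jpeg"})

Hooking something like this into the upload handler (or a background job queue) means the request path only ever has to build the S3 URL.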
In a way your problem is a "good" problem to have (or at least it could have been a lot worse), in that there are no dependencies between separate image resizing tasks, so you can trivially distribute them over multiple servers. A few comments:
Have you checked to see if there is anything you can do to make the image resizing operations faster? (Google brought this up, don't know if it's any help: http://dmmartins.appspot.com/blog/speeding-up-image-resizing-with-python-and-pil) Even if you still find you need to add more servers, anything you can do to make each resize operation more efficient will make each server go farther.
If your user base keeps growing, you will eventually need to "scale out", but for the short term it is possible you could solve the problem simply by paying another $80 for the next "tier" of service (8 cores at 8x priority).
Is image resizing really your app's only bottleneck? If image resizing were "free", how much further could you scale on your existing server before rendering pages, running DB queries, etc. begin to limit throughput? If you don't know, it would be good to do some simulated load testing and find out. I ask because if rendering pages, DB queries, etc. are also bottlenecks, or are soon to become bottlenecks, you are going to have to distribute the app anyway. In that case, you might as well keep thumbnailing in the main app and distribute it right now, rather than making your thumbnailing run as a web service on a second server.
Regardless of whether you distribute the main app or split thumbnailing out into a separate app on a different server, you need some kind of authoritative store to keep track of where each thumbnail is kept on S3. You can keep that information in memcached, in a database, or wherever you want. It doesn't really matter. Even if you keep it in memcached, that doesn't mean you can't share the cache between two servers: one server can connect to a memcached instance running on the other server (a minimal sketch of this follows after these comments).
You asked if "the latency" of checking a cache which is held on a different server will "hinder the gains". I don't think you need to worry about that. Your problem is throughput, not latency. Those high-latency network operations parallelize very well. So if you just service more requests in parallel, you can still make full use of your CPUs (which is the resource bottleneck right now).
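As mentioned in the comment about the authoritative store, sharing one memcached instance between two app servers is straightforward; the sketch below uses the pymemcache client and a placeholder hostname, both of which are assumptions:

    # Both app servers point at the same memcached address; the hostname,
    # key format and choice of pymemcache are assumptions for illustration.
    from pymemcache.client.base import Client

    cache = Client(("cache-host.internal", 11211))  # memcached running on one of the servers

    def cached_thumbnail_url(image_id, size):
        key = f"thumb:{image_id}:{size}"
        value = cache.get(key)                      # returns bytes, or None on a miss
        return value.decode("utf-8") if value is not None else None

    def remember_thumbnail_url(image_id, size, url):
        cache.set(f"thumb:{image_id}:{size}", url)

Both servers then consult the same cache (and the same S3 bucket), so the shared-cache concern from option #1 largely goes away.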

Minimising number of requests vs Browser Caching & Multiple domains

I have recently been working on improving the front end performance of our website and have been employing a number of best practices.
However, I have recently had an example where some of the practices are slightly at odds with each other:
Minimise HTTP requests
In order to "trick" the browser into making more concurrent requests have some assets served from a different domain
Leverage browser caching
Why?
We used to bundle almost all of our JavaScript into one file to minimise HTTP requests. This included jQuery and jQuery UI.
I thought this was silly, as many users are likely to have jQuery already cached in their browser, so I decided we should remove it from our all.js and instead serve it from Google's CDN. This saves users downloading the code again, and because it's on a different domain it can be downloaded in parallel with other resources from our own domains.
The concurrent downloading is shown in the graph below:
This has, of course, raised the number of requests for people without jQuery already cached, which isn't great.
So my question is this:
Is the change a sensible one? Do the benefits of leveraging caching and allowing concurrent requests outweigh a slight increase in the number of requests?
That is a very good question.
You have explained your reasoning well, and those are all good reasons for making this change.
But there are still benefits to both approaches.
Keeping everything combined in one file
Reduces the number of HTTP requests, which reduces the negative effects of round-trip latency on the user's connection.
All libraries/plugins are downloaded at once, and should remain cached for when they are later needed.
Reduces dependency on other services (although Google is going to be quite reliable).
Separate files spread across domains
Increases parallelisation of downloads, which reduces the negative effects of bandwidth shaping on the user's connection. (Note that most browsers no longer limit concurrent per-domain requests to 2, though.)
Increases granularity: separate parts can be downloaded on demand as needed, i.e. if a particular plugin is not needed on the first page hit, it isn't downloaded.
Personally, I'd normally lean a little bit towards the former (reducing HTTP requests by combining them into one big file). I feel like most of my audience is going to be on a fairly high-bandwidth connection and I can reduce latency. Remember to use Google and Yahoo's page speed tools to find other ways of speeding things up.

HTTPS on Apache; Will it slow Apache?

Our company runs a website which currently supports only http traffic.
We plan to support https traffic too as some of the customers who link to our pages want us to support https traffic.
Our website gets a moderate amount of traffic, but this is expected to increase over time.
So my question is this:
Is it a good idea to make our website HTTPS only (redirect all HTTP traffic to HTTPS)?
Will this bring down the website's performance?
Has anyone done any sort of measurement?
PS: I am a developer who also doubles up as an Apache admin.
Yes, it will impact performance, but it's usually not too bad compared to running all the DB queries that go into a typical dynamically generated page.
Of course the real answer is: don't guess, benchmark it. Try it both ways and see the difference. You can use tools like siege and ab to simulate traffic.
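If you just want a rough number before setting up siege or ab, a quick sketch like the one below (placeholder URLs, third-party requests library) compares per-request cost; because each call opens a fresh connection, the HTTPS figures include the TLS handshake that keep-alive would normally amortise.

    # Compare per-request cost over HTTP vs HTTPS; URLs are placeholders.
    import time
    import requests  # third-party: pip install requests

    def average_request_time(url, n=20):
        total = 0.0
        for _ in range(n):
            start = time.perf_counter()
            requests.get(url, timeout=30)       # new connection (and handshake) each time
            total += time.perf_counter() - start
        return total / n

    print("http :", average_request_time("http://www.example.com/"))
    print("https:", average_request_time("https://www.example.com/"))

Connection reuse changes the picture considerably, which is why the advice below stresses keep-alives.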
Also, I think you may have more luck with this question over at http://www.serverfault.com/
I wouldn't worry about the load on the server; unless you are serving high volumes of static content, the encryption itself won't create much of a burden, in my experience.
However, using SSL dramatically slows down web sites by creating a lot more latency in connection setup.
An encrypted session requires about* three times as much time to set up as an unencrypted one, and the exact time depends on the latency.
Even on low latency connections, it is noticeable to the end user, but on higher latency (e.g. different continents, especially Australasia where latency to America/Europe is quite high) it makes a dramatic difference and will severely impact the user experience.
There are things you can do to mitigate it, such as ensuring that keep-alives are on (but don't turn them on without understanding exactly what the impact is), minimising the number of requests and maximising the use of the browser cache.
Using HTTPS also affects browser behaviour in some cases. Certain optimisations tend to get turned off for security reasons, and some web browsers don't store objects loaded over HTTPS in the disc cache, which means they'll need to get them again in a later session, further impacting the user experience.
* An estimate based on some informal measurement
Is it a good idea to make our website HTTPS only (redirect all HTTP traffic to HTTPS)? Will this bring down the website's performance?
I'm not sure if you really mean all HTTP traffic or just page traffic. A lot of sites unnecessarily encrypt images, JavaScript and a bunch of other content that doesn't need to be hidden. This kind of content comprises most of the data transferred in a request, so if you do find that HTTPS is taking too much out of the system, you can recommend that the programmers separate the content that needs to be secured from the content that does not.
Most web servers, unless severely underpowered, use only a fraction of their CPU power for serving up content. Most production servers I've seen are under 10%, even when serving some SSL traffic. I think it would be best to see where your current CPU usage is at, and then do some of your own benchmarking to see how much extra CPU an SSL request uses. I would guess it isn't that much.
No, it is not a good idea to make any website HTTPS only. Page loading might be a little slower, because your server has to perform an unnecessary redirect for every HTTP page request. It is a better idea to serve over HTTPS only the pages that may contain secure/personal/sensitive information about users or the organization. Whenever user information passes through web pages, you can use HTTPS. Pages whose information can be shown to the whole world can normally use HTTP. Finally, it is up to your requirements: if all pages contain secure information, you may make the website HTTPS only.

Does Ajax deteriorate performance?

Does excessive use of AJAX affect performance? In the context of large web applications, how do you handle AJAX requests to control asynchronous requests?
Excess use of anything degrades performance; using AJAX where necessary will improve performance, especially if the alternative is a complete full-page round trip to the server [a 'postback' in ASP.NET terminology].
There are two sides to this story.
AJAX generally improves performance from the client's perspective. Rather than loading an entire page, a smaller amount of data is requested from the server when it is needed. Given that an HTML page often references many dependent files (images, CSS, JavaScript, etc., each requiring a hit to the server or the cache), the client-side gain from judicious use of AJAX can be remarkable (a small sketch of such a lightweight endpoint follows at the end of this answer).
On the server-side, the issue becomes one of having many more connections to manage. Polling applications, such as in-browser chat in particular, can really start to increase the load on the server because the browser is now hitting the server much more rapidly. In a typical dynamic application (where the response is generated by code rather than from a static file) you may start running into issues - but these are generally balanced by the fact that the complexity of your request is often much lower (again, you aren't generating the entire page but a small subset of the page) and so therefore your platform can probably get a higher throughput in any case.
The exact outcome of any performance issue is going to depend on a number of factors including your server, platform, framework, and prevailing climatic conditions at the time.
My ultimate advice - focus on creating a good user experience, develop intelligently, collect as many metrics as you can and optimise when you know you need it.
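To make the smaller-payload point concrete, here is a minimal sketch of a full-page route next to a tiny AJAX endpoint; Flask, the routes and the data are all assumptions used purely for illustration.

    # A full page versus a small JSON fragment an AJAX call can fetch on demand.
    # Flask, the routes and the data are hypothetical.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/inbox")
    def inbox_page():
        # Full page load: the whole document, plus every CSS/JS/image it references.
        return "<html><body>...entire inbox rendered here...</body></html>"

    @app.route("/inbox/unread-count")
    def unread_count():
        # AJAX endpoint: a few dozen bytes of JSON, cheap to generate and transfer.
        return jsonify({"unread": 3})

The full page still gets requested on the first load; the win is that subsequent interactions only hit the small endpoint.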
AJAX itself (being asynchronous requests)? No, not generally.
However, if you have an abundance of JavaScript and markup and transfer large amounts of data via your XMLHttpRequests, then yes, you can see a performance hit. It really depends on how you want your website to function; any degradation is generally avoidable if things are sculpted correctly.
Performance of what exactly? I'm going to assume you meant performance of an application in terms of user experience.
What Ajax appears to be best at is causing network traffic only when it's needed. Rather than downloading a honkin' great web page in one hit, it downloads only what's needed in as quick a manner as possible.
Then, if you do something that needs more info, it goes and gets it from the network then.
This means unused stuff is never downloaded (if you design it right, of course - bad code can be written in Ajax as much as any other environment).
I prefer to mix Ajax methods for data transfer and a client-side library like jQuery for pretty interface.
Depending on the situation, AJAX may have a performance overhead, or it can actually perform better than an equivalently functioning website that doesn't use AJAX.
It's very easy to overuse AJAX and overload the server with tons of frivolous requests, and it can also be a burden on the client's CPU. Conversely, AJAX can be used to deliver small bits of HTML and other code rather than a whole page for each request, which is at least less of a burden on the server.
Ajax is just an ordinary HTTP request, so as long as your server can handle those requests it won't be a problem. The upside to Ajax is faster perceived performance by the user, since the page doesn't have to reload and redraw itself for every user action.
If scalability is a concern, I'm sure you are also looking at scaling the system horizontally by adding more web servers to the farm. Same goes with even non-Ajax web apps anyway.
AJAX, like any technology can be a good thing or a bad thing depending on the situation and how it is implemented. If you have a specific need for the asynchronous process then it is a good tool to use. However, if you use it irresponsibly you can get into trouble. If you do use it, try to find a good framework that does most of the heavy lifting and be aware of some of the downsides of AJAX...
http://learningremix.net/w2007integ/vangoori/2007/01/the_downsides_of_ajax.shtml
I would agree with quite a few other posts in here. If you are using it in an intelligent way (i.e. not firing an AJAX request every 30 seconds), then it will be fine. I use AJAX on my website (and there is also a JS-free version), and from a client's perspective the AJAX version loads anywhere from near-equal speed to four times faster. It all depends on the design (graphics and other content) of the website and what you are updating.
The downside is that, since you have to load some framework (even if you create your own, like I have), you will see a slightly slower load for the first page or any full refresh, and it does increase the processing load a bit. But that is just because AJAX has increased productivity, and therefore the user can make more requests/updates.
If the site is busy then it will, eventually, kill the server, unless you're in a farm.
As to the site itself, it shouldn't.
