Websocket server performance comparison

Are there any reliable performance test results/benchmarks/comparisons for websocket server frameworks?
I was googling back and forth without any significant results. Going by various of Google's instant search suggestions, it seems to be a really hot topic, but the search results are far from satisfying, so a good answer would be highly appreciated!
Background
I need to implement websockets for an application that has to be able to scale well, and would therefore like to know if there are any big performance differences amongst the available frameworks. I do not really care (too much) about the specific programming language; as stated above, performance and scalability (plus, obviously, stability) are the important factors.
What I found so far
A more or less functional check with some time measurements by the Autobahn test suite, comparing their implementation (with CPython, PyPy, wsaccel) against four others (Jetty, WebSocket++, cowboy, ws)
A websockets byproduct of the HTTP comparison between MochiWeb and cowboy
So it is not much, but maybe at least these help someone also looking for an answer.
Aside: If you think this question is not good as is, please consider editing it rather than voting to close (I have found out that I am not a good OP ^^). I would really appreciate it, and there really is a great demand for a good answer to this kind of question.

Related

How can I make SocketIO more performant?

We have used SocketIO quite extensively in Etherpad (since very early on), and we're really grateful for all of the efforts of the team in providing such a useful thingy :)
Etherpad is a Node.js project.
My problem with SocketIO is probably due to me misconfiguring or misunderstanding something, but after quite a lot of test tool generation, tweaking of memory settings, etc., we still get a frustratingly low maximum number of messages per second, topping out at the 10k mark.
Etherpad latest simulated load test
Reading around online, it looks like switching to ws would be more performant, but I fail to see how that could be the case in our scenario, where our bottleneck is not negotiations (which end up as websockets) but the number of messages per second handled by the server.
I'm reluctant to try other packages, so I thought I'd come here and ask for some insight or things to try to see if we can improve performance by, well, a lot. The usual Node tricks (access to more hardware, i.e. RAM/CPU) help a bit, but it still feels like we're getting really small gains and not the huge numbers you see in other modules' benchmarks.
A dream outcome of this question would be for someone to look at the Etherpad code and tell me why I'm an idiot and hopefully we can get Etherpad into the competitive ~100k changes per second but also I may be being misty eyed about other modules so if anyone has benchmarks that contradict the likes of ws then I'm all ears.
I feel like I should just add: we tested to see if internal Etherpad logic is the cause, and it's not. It really is the communication layer that ends up bottlenecking an operational transform algorithm; we're like 99.95% sure...
Throwing more hardware at this problem is not the solution, nor is any method of reverse proxy/passing the problem.
If you are blind to where the "problem" is, you don't have many options. You could be looking for a "misconfiguration" that does not exist, which could waste a lot of time and money, and in the end you will probably still have to switch.
Maturity, one discovers, has everything to do with the acceptance of "not knowing".
Rewrite the pieces of the code that are relevant for the load test, to check whether using e.g. uWebSockets would help push the boundary. There are multiple sources stating that the uWebSockets server is A LOT faster. I bet it will not take that much time, and you will get the very information you need to decide whether it's worth switching. New web technology is moving forward extremely fast, and if you want to be able to make the right choice for the future of the product, you have to be willing to experiment with it. Alex Hultman wrote an article,
how-µwebsockets-achieves-efficient-pub-sub,
where he encourages switching and explains why it's worth a try.
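If you want to try that experiment, here is a minimal sketch of what a uWebSockets.js pub/sub endpoint could look like (assuming uWebSockets.js v20; the topic name, port, and fan-out pattern are illustrative, not Etherpad's actual wiring):

```typescript
// Minimal pub/sub sketch with uWebSockets.js. Hypothetical topic and port;
// not Etherpad's actual message handling.
import { App } from 'uWebSockets.js';

App()
  .ws('/*', {
    // Subscribe every client to a shared topic on connect.
    open: (ws) => {
      ws.subscribe('pad/updates');
    },
    // Fan incoming messages out to all subscribers; the broadcast happens
    // in native code, which is where the claimed speedups come from.
    message: (ws, message, isBinary) => {
      ws.publish('pad/updates', message, isBinary);
    },
  })
  .listen(9001, (listenSocket) => {
    if (listenSocket) {
      console.log('Listening on port 9001');
    }
  });
```

Driving this with the same simulated load should show fairly quickly whether the messages-per-second ceiling actually moves.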

Performance improvement in JSF

I read that two parameters, javax.faces.FACELETS_REFRESH_PERIOD=-1 and
javax.faces.FACELETS_SKIP_COMMENTS=true, can improve performance. I tried those, but I don't see any improvement. Is there anything I am missing, or is there any better option that can achieve the improvement?
You hardly see a performance improvement because the author of this blog highly exaggerates when he/she states:
we will talk about the most important aspects that can be tuned in order to enhance the performance of JSF 2.x applications.
And then lists the two you mention as important ones, without adding that at least the refresh-period check is highly load-dependent and has been optimized a lot by Java (N)IO and much faster disks.
javax.faces.FACELETS_REFRESH_PERIOD =-1
The OS IO-layer caching, in combination with Java NIO and faster disks, makes it really quick to check whether the file timestamp against which JSF checks for changes has changed. This is so quick that you'll hardly notice any improvement these days, maybe only a little when you have thousands of simultaneous users. So yes, it helps, a little, but not as much as you'd expect from the wording in the blog.
javax.faces.FACELETS_SKIP_COMMENTS=true
This would help if you had a comment-to-useful-code ratio of 1:1, but you'll hardly have that much comment in a page, so the gain of sending 100-500 bytes less in a 10k-100k page (all example numbers) is negligible.
This setting is (at least for me) more useful in that our internal comments in the page do not end up at the end user.
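For completeness, both parameters are set as context parameters in web.xml; a minimal sketch with the values from the question:

```xml
<!-- web.xml: the two Facelets parameters discussed above -->
<context-param>
    <param-name>javax.faces.FACELETS_REFRESH_PERIOD</param-name>
    <!-- -1 disables checking Facelets files for changes (production) -->
    <param-value>-1</param-value>
</context-param>
<context-param>
    <param-name>javax.faces.FACELETS_SKIP_COMMENTS</param-name>
    <!-- strips XML comments from the rendered output -->
    <param-value>true</param-value>
</context-param>
```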
For other improvements see
JSF Performance improvement

Cookies driving me crazy, or am I crazy?

I'm working on a project for a client, and an aspect of this job is to utilize sessions for individual users on a series of separate websites.
Unfortunately, all of a sudden, I'm running into innumerable issues with this approach. Mechanize doesn't pick up the cookies (obviously, because they are set via JavaScript, which Mechanize doesn't execute).
The next choice would be to utilize something like Watir or Capybara to work around this problem. This is what I originally had done, but my client was not satisfied with the speed.
So, questions:
Is there a good way to get this done without using a browser driver and without deeply analyzing the JS of each website, reverse-engineering its cookie-setting process, and re-implementing that in our codebase? (My guess: absolutely not.)
If not, what would be the best way around this problem? (My guess: a browser driver, like Watir or Capybara.)
If my guesses on questions #1 and #2 sound correct to you, then how can I convince this client that I know what I'm talking about?
It's quite irritating to be hired to develop something on the magnitude of difficulty that I was hired to develop, and then to have your recommendations and insight be ignored. How can I better handle this problem?
This probably isn't a great question for Stack Overflow, and I apologize for that. I find myself coming here for help quite a bit. You guys usually have good answers. Thank you in advance for answering, if you do.
EDIT: To be a bit clearer, the issue is one of speed vs. reliability. He wants the utmost speed. Obviously, a browser driver is not going to give the best speed, but it does sort of "guarantee", in a way, that you can persist a given session. Mechanize is much speedier, but requires much more finicky diddling about to get things working correctly. Considering we're working with dozens of websites, I'm thinking the best route is to sacrifice speed and gain reliability/accuracy. What do you all think?
Ultimately, I'm looking for your help, because I'm at a loss for any more good arguments. I had plenty, but have exhausted them all, and he seems convinced that there is a way to do this without sacrificing either speed or accuracy (at least with the amount of human resources invested [a.k.a me]). I've tried explaining this, that we can have one or the other. Basically, as far as I know, we can only have one. The one we should choose is accuracy. How might I be able to argue this in a way that this person might listen more acutely?
Well, you can always log in with Watir (I'm assuming that's where you can't solve the cookies issue) and then load the browser cookies into Mechanize. I know there's some sample code in other Mechanize questions for this.
BTW, switching to Watir doesn't gain you reliability; it just makes it easier to solve your cookies problem. In my experience, Mechanize is generally more reliable.
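The technique itself is language-agnostic. As a rough illustration (shown here in Node with Puppeteer and fetch rather than Watir/Mechanize, with placeholder URL, selectors, and credentials): log in once with a real browser so all the JavaScript runs, then replay its cookies on fast, browser-free requests.

```typescript
// Sketch: log in once with a real browser so all the JavaScript runs,
// then reuse the session cookies with a plain HTTP client.
// Placeholders: URL, selectors, credentials.
import puppeteer from 'puppeteer';

async function main(): Promise<void> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Let the browser execute whatever JS the login flow needs.
  await page.goto('https://example.com/login');
  await page.type('#user', 'me');
  await page.type('#pass', 'secret');
  await Promise.all([page.waitForNavigation(), page.click('#submit')]);

  // Export the cookies the site set during login...
  const cookies = await page.cookies();
  await browser.close();

  // ...and replay them on fast, browser-free requests.
  const cookieHeader = cookies.map((c) => `${c.name}=${c.value}`).join('; ');
  const res = await fetch('https://example.com/account', {
    headers: { Cookie: cookieHeader },
  });
  console.log(res.status);
}

main().catch(console.error);
```

The Watir/Mechanize version follows the same shape: drive the login in the browser, export the browser's cookies, and feed them into Mechanize's cookie jar.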

Is Performance Always Important?

Since I am a Lone Developer, I have to think about every aspect of the systems I am working on. Lately I've been thinking about performance of my two websites, and ways to improve it. Sites like StackOverflow proclaim, "performance is a feature." However, "premature optimization is the root of all evil," and none of my customers have complained yet about the sites' performance.
My question is, is performance always important? Should performance always be a feature?
Note: I don't think this question is the same as this one, as that poster is asking when to consider performance and I am asking if the answer to that question is always, and if so, why. I also don't think this question should be CW, as I believe there is an answer and reasoning for that answer.
Adequate performance is always important.
Absolute fastest possible performance is almost never important.
It's always worth keeping an eye on performance and being aware of anything outrageously non-optimal that you're doing (particularly at a design/architecture level) but that's not the same as micro-optimising every line of code.
Performance != Optimization.
Performance is a feature indeed, but premature optimization will cost you time and will not yield the same result as when you optimize the parts that need optimization. And you can't really know which parts need optimization until you can actually profile something.
Performance is the feature that your clients will not tell you about if it's missing, unless it's really painfully slow and they're forced to use your product. Existing customers may report it in the end, but new customers will simply not bother if the performance isn't there.
You need to know what performance you need, and formulate it as a requirement. Then, you have to meet your own requirement.
That 'root of all evil' quote is almost always misused and misunderstood.
Designing your application to perform well can mostly be done with just good design. Good design != premature optimization, and it's utterly ridiculous to go off writing crap code and blowing off doing a better job on the design as an "evil" waste. Now, I'm not specifically talking about you here... but I see people do this a lot.
It usually saves you time to do a good job on the design. If you emphasize that, you'll get better at it... and get faster and faster at writing systems that perform well from the start.
Understanding what kinds of structures and access methods work best in certain situations is key here.
Sure, if your app becomes truly massive or has insane speed requirements, you may find yourself doing tricked-out optimizations that make your code uglier or harder to maintain... and it would be wrong to do those things before you need to.
But that is absolutely NOT the same thing as making an effort to understand and use the right algorithms or data patterns or whatever in the first place.
Your users are probably not going to complain about bad performance if it's bearable. They possibly wouldn't even know it could be faster. Reacting to complaints as a primary driver is a bad way to operate. Sure, you need to address complaints you receive... but a lack of them does not mean there isn't a problem. The fact that you are considering improving performance is a bit of an indicator right there. Was it just a whim, or is some part of you telling you it should be better? Why did you consider improving it?
Just don't go crazy doing unnecessary stuff.
Keep performance in mind but given your situation it would be unwise to spend too much time up front on it.
Performance is important but it's often hard to know where your bottleneck will be. Therefore I'd suggest planning to dedicate some time to this feature once you've got something to work with.
Thus you need to set up metrics that are important to your clients and to you. Keep and analyse these measurements. Then estimate how long each improvement would take and how much it would cost to implement. Now you can aim at getting as much bang for your buck/time as possible.
If it's a website, it would be wise to note your page size and performance using Firebug + YSlow and/or Google Page Speed. Again, know what applies to a small site like yours versus things that only apply to Yahoo and Google.
Jackson's Rules of Optimization:
Rule 1. Don't do it.
Rule 2 (for experts only). Don't do it yet — that is, not until you have a perfectly clear and unoptimized solution.
— M. A. Jackson
Extracted from Code Complete, 2nd edition.
To give a generalized answer to a general question:
First make it work, then make it right, then make it fast.
http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast
This puts a more constructive perspective on "premature optimization is the root of all evil".
So to parallel Jon Skeet's answer, adequate performance (as part of making something work, and making it right) is always important. Even then it can often be addressed after other functionality.
Jon Skeet's "adequate" nails it, with the additional provision that for a library you don't know yet what's adequate, so it's better to err on the safe side.
It is one of the many aspects you must not get wrong, and the quality of your app is largely determined by the weakest link.
Performance is definitely always important in a certain sense - maybe not the one you mean: namely in all phases of development.
In Big O notation, what's inside the parentheses is largely decided by design, both component isolation and data storage. The choice of algorithm will usually only affect best/worst-case behavior (unless you start with decidedly substandard algorithms). Code optimizations will mostly affect the constant factor, which shouldn't be neglected either.
But that's true for all aspects of code: in any stage, you have a good chance to fail any aspect - stability, maintainability, compatibility etc. Performance needs to be balanced, so that no aspect is left behind.
In most applications, 90% or more of execution time is spent in 10% or less of the code. Usually there is little use in optimizing code other than these 10%.
Performance is only important to the extent that developing the performance improvement takes less time than the total amount of time that will be saved for the user(s).
The result is that if you're developing something for millions... yeah, it's important to save them time. If you're coding up a tool for your own use... it might be more trouble than it's worth to save a minute or even an hour or more.
(This is clearly not a rule set in stone... there are times when performance is truly critical, no matter how much development time it takes.)
There should be a balance to everything. Cost (or time to develop) vs Performance for instance. More performance = more cost. If a requirement of the system being built is high performance then the cost should not matter, but if cost is a factor then you optimize within reason. After a while, your return on investment suffers in that more performance does not bring in more returns.
The importance of performance is IMHO highly correlated with your problem set. If you are creating a site with an expectation of heavy load and a lot of server-side processing, then you might want to put some more time into performance (otherwise your site might end up being unusable). However, for most applications the time put into optimizing performance on a website is not going to pay off; users won't notice the difference.
So I guess it breaks down to this:
Will users notice the improvements?
How does this improvement compare to competing sites?
If users will notice AND the improvement would be enough to differentiate you from the competition, performance is an important feature; otherwise, not so much. (To a point: I don't recommend ignoring it entirely; you don't want your site to turtle along, after all.)
No. Fast enough is generally good enough.
It's not necessarily true, however, that your client's ideas about "fast enough" should trump your own. If you think it's fast enough and your client doesn't, then yes, you need to accommodate your ideas to theirs. But if your client thinks it's fast enough and you don't, you should seriously consider going with your opinion, not theirs (since you may be more knowledgeable about performance standards in the wider world).
How important performance is depends largely and foremost on what you do.
For example, if you write a library that can be used in any environment, this can hardly ever have too much performance. In some environments, a 10% performance advantage can be a major feature for a library.
If you, OTOH, write an application, there's always a point where it is fast enough. Users will neither realize nor care whether a button press reacts within 0.05 or 0.2 seconds, even though that's a factor of 4.
However, it is always easier to get working code faster, than it is to get fast code working.
No. Performance is not important.
Lack of performance is important.
Performance is something to be designed in from the outset, not tacked on at the end. For the past 15 years I have been working in the performance engineering space, and the cause of most project failures that I work on is a lack of requirements on performance. A couple of posts have noted "fast enough" as an observation, and whether your expectation matches that of your clients, but what about when your client, your architectural team, your platform engineering team, your functional test team, your performance test team, and your operations team all have different expectations on performance, none of which have been committed to stone and measured against? Bad magic, to be certain.
Capture those expectations on the part of your clients. Commit them to a specific, objective, measurable requirement that you can evaluate at each stage of production of your software. Expectations may not be uniform, with one section of your app/code needing to be faster than others, nor will every customer have the same expectations on what is considered acceptable. Having this information will force you to confront decisions in the design and implementation that you may have overlooked in the past, and it will result in a product which is a better match for your clients' expectations.

What strategies have you employed to improve web application performance? [closed]

Any personal experience in overcoming web application performance hurdles?
Any recommended strategies for improving the performance of a data-driven web application?
My development team works on a web application (JSP reports, HTML, JavaScript) that uses an Oracle database (PL/SQL). The key functionality the application delivers is in reporting, where a user can get PDFs of reports at a high level and drill down to lower levels of supporting details.
As the number of supporting detail records has grown into the millions, the performance of the system has significantly degraded. Based on our current analysis of the metrics, the bottleneck seems to be in the logic hitting the DB and the DB performance. Changing the DB model and re-doing some of the server side logic is currently being explored.
Partitioning, indexing, explain plans, and running statistics are things that have been done on the DB side to try to help improve performance. While they've helped, they haven't solved the issue satisfactorily. The toughest part of analyzing performance data is that the database and web servers are remotely administered by a different part of the IT organization, so the developers don't have regular, full access to see what's going on (especially in the production environment, which is not mirrored exactly in any other development/testing environment).
While my answer may not contain any concrete steps to help, this is always where I start.
The first thing I would do is try to throw away all of your assumptions about what the trouble is and take steps to install metrics everywhere you can. Let the metrics guide you rather than your intuition. I've chased many, many, many white rabbits going on a hunch... they let me down more times than they've been right.
Have you checked this out?
Best practices for making web pages fast from Yahoo!'s Exceptional Performance team
If you really are having trouble at the backend, this won't help. But we used their advice to great effect to make our site faster, and there is still more to do.
Also use the YSlow add-on for Firebug. You may be surprised when you see where the actual time is being taken up.
Have you considered building your data ahead of time? In other words, are there groups of data that are requested again and again? If so, have them ready before the user asks. I'm not exactly talking about caching, but I think that is part of the equation.
It might be worth it to take a step back from the code and examine the usage patterns of the system. For example, if you are showing people monthly inventory or sales information, do they look at it only at the end of the month? If so, just build the data on the last day and store it. If they look at it daily, maybe try building each previous day's results, storing them, and avoiding the recalculation. I guess ultimately I am pushing you towards a dynamic programming solution: if you know an answer, don't solve for it again (a small sketch of the idea follows below).
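A minimal sketch of that idea (the report shape and computation are made up for illustration): compute each closed day's result once, store it, and serve repeats from the store.

```typescript
// Sketch of precomputing immutable results: once a day is over, its report
// can never change, so compute it once and serve repeats from the store.
// The Report shape and computation are hypothetical.
interface Report {
  day: string;
  total: number;
}

const reportStore = new Map<string, Report>();

// Stand-in for the expensive database aggregation.
function computeDailyReport(day: string): Report {
  return { day, total: Math.floor(Math.random() * 1000) };
}

function getDailyReport(day: string): Report {
  let report = reportStore.get(day);
  if (report === undefined) {
    report = computeDailyReport(day);
    reportStore.set(day, report); // safe: past days are immutable
  }
  return report;
}

console.log(getDailyReport('2009-10-01')); // computed
console.log(getDailyReport('2009-10-01')); // served from the store
```

In practice the store would be a table or cache rather than an in-process Map, but the principle is the same: never recompute an answer that cannot change.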
As Webjedi says, metrics are your friend.
Also look at your stack and see where there are opportunities for caching, then employ it mercilessly wherever possible!
As I said in another question:
Use a profiler. Yes they cost money, and using them can occasionally be a bit awkward, but they do provide you with a great deal more real evidence rather than guesswork.
Human beings are universally bad at guessing where performance bottlenecks are. It just seems to be something our brains aren't built to do very well. It may seem obvious, and you may have great ideas about what the problem is, but the real world often turns out to be doing something different. And optimising the wrong part of the code means, at best, lots of work for minimal benefit. More often it makes things slower, and sometimes it breaks things entirely. So before you make any changes for the sake of optimisation, you should always have real evidence from a profiler or other accurate tool.
Not all profilers cost (extra) money. For .NET, I'm successfully using an old build of NProf (currently abandoned, but it still works for me) for profiling my ASP.NET applications. For SQL Server, the query profiler is part of the package. There's also the CLR Profiler from MS, but I've never been able to get it to work successfully.
That being said, profilers are definitely the way to go. That way you can see where your program is spending most of its time, and not focus on things that you think are slow. Plus it means you don't have to write anything in your code to actually record the metrics.
As I hinted at the beginning, there are different types of profilers. The three I find most useful are application profilers, which let you see which functions you actually spend most of your time in; SQL profilers, which let you see how long your queries take to run; and memory profilers, which help show you what types of objects your memory is being used up by. All three of these are really useful, and although you won't use them every day, the times you do use them will save you a lot of headache.
