Performance Tuning for RabbitMQ installed on Windows? - performance

I work with a Windows application that uses RabbitMQ, which the vendor provided. The only value at all in the advanced.config is a timeout value that the application vendor gave me when troubleshooting some issue with the connections timing out. I have been searching for performance tuning documentation as it pertains to Windows deployments of RabbitMQ, but all of the most useful stuff that I am finding is for Linux or containerized deployments. Unfortunately, the application that is using RabbitMQ is Windows dependent, so I can't just go a different/better route with the RabbitMQ deployment.
The problem we're running into is that we have close to 13,000 queues without any way to trim that down substantially, and even though the CPU and RAM resources don't ever seem taxed, RabbitMQ itself seems to be struggling. We're often seeing unacknowledged messages in the queues, and even the web interface itself sometimes stalls out. With all of the Linux information I am seeing, I feel like there has to be a list of Windows best practices too, or values that can be added and then tweaked in the advanced.config. Any advice great appreciated.

Related

Websockets server implementation real performance for high concurrent production environment

I'm evaluating the substitution of some http pooling features of my production application with the new JEE7 supported Websocket feature. I'm planning to use Wildfly 8 as my next production server environment and I've migrated some of my websockets compatible modules with good results on development time; but I have the doubt about how it will work on production and what performance will have the websockets implementation on a a high load enviroment.
I´ve been searching documentation about the most used JEE servers but the main manufacturers haven´t yet a production JEE7 enviroment and when they have a JEE7 version, they haven´t enought documentation about how the implementation works or some values of maximum concurrency users. In addition, some not official comments says websocket connections are associated "with a server socket" but this seems to be not very efficient.
We might assume the Websocket is used only for receive data from the client point of view and we assume each user will receive, for example, an average of 10 messages per minute with a little json serialized object (a typical model data class). My requirement is more like a stock market than a chat application, for example.
What´s the real performance can I expect for a Websockets use on production enviroment in Wildfly 8 Server, for example? Also I´m interested about some comparision with another JEE7 implementations you are acquainted with.
Web sockets are TCP/IP based, no matter the implementation they will always use a socket (and hence an open file).
Performance wise it really depends on what you are doing, basically how many clients, how many requests/sec per client, and obviously on how big your hardware is. Undertow is based on non-blocking IO and is generally pretty fast, so it should be enough for what you need.
If you are testing lots of clients just be aware that you will hit OS based limits (definitely open files, and maybe available ports depending on your test).

Communication between Heroku apps

I've build a distributed system consisting of several web-services and some web applications consuming them.
They are all hosted on Heroku.
Is there some way for request between these applications to be done "inside heroku" without going through the web.
Something analog to using localhost.
You are maybe in luck: such a feature has currently reached the experimental phase.
Let me take a moment to underscore that: this feature may disappear or change at any time. It's not supported, but bug reports are appreciated. Don't build a bank with it. Don't get yourself in a position to be incredibly sad if severe problems are found that render it unshippable and it's aborted.
However, it is still cool, and here it is: containerized-network
You can use, for example, the pub-sub interface of any of the hosted Redis solutions. Or any of the message brokers (IronMQ, RabbitMQ) to pass messages.

Running NServiceBus on Amazon EC2

So I have seen a number of references and links from a year +/- ago asking about support for NServiceBus on Amazon EC2. Wondering if anyone out there has attempted to do anything with this recently?
I have seen the following articles/posts but fear the information and related links are dated.
A Less Than Positive Experience w/NServiceBus on Amazon EC2
The right idea, any movement on this?
Azure Love, but no Amazon?
I see a lot of chatter on the NServiceBus forums about the "next version" having a big focus on support for the cloud (at the time the current version was 2.5). I have a scenario where I would like to run NServiceBus w/MSMQ or RabbitMQ on a cluster of Amazon EC2 instances but it concerns me that there is not more discussion around people actually using NServiceBus on Amazon.
Anyone doing it successfully or have reasons to avoid considering it?
[EDIT] - Does anyone know if using reserved instances gets around the issue with EC2 restarts described in the article above?
There are different ways to successfully run NServiceBus on EC2. Picking which option to go with requires weighing the balance of cost, scalability, & operational overhead.
MSMQ
NServiceBus runs fine on EC2 with MSMQ, but there are a few obstacles that need attention. The main issue is that the computer names / DNS names on EC2 instances change during each restart. This is an issue because the computer name is used when sending messages to endpoint as well when subscribing to messages. One simple option to overcome this overhead is to attach an elastic IP to the instance & use its DNS name. The benefit is that it's pretty easy to do this. The downside is that you're only given 5 Elastic IPs by default. You can ask for more & Amazon is usually pretty liberal with handing out extra Elastic IPs. You will also be limited in how you scale. For instance, you won't be able to simply plug into the elastic scaling features of AWS. You also have to deal with backups. I would put the queues on a separate EBS volume & take snapshots on an interval.
I'd pick this option if you want to use messaging, but you don't have really crazy SLA's, you don't need to scale up and down machines quickly, & you don't need to deal with high message volumes. This is the case with most projects.
Amazon SQS
You could write a custom transport for SQS. The benefit of using NSB with SQS remote queues is that you get highly available queues, you don't have to manage them on your EC2 instances & you don't have to worry about backups. It's also easier to leverage elastic scaling with this approach. The downside is that each read costs $$$, so it may not be economically feasible to read at the same speeds as MSMQ or RabbitMQ - although this problem is mitigated by support for long polling and the ability to download many messages in a single call. Another downside is that it doesn't support distributed transactions with DTC. If you're using NServiceBus 5 or later, you could implement the Outbox pattern in your transport as described here, to ensure your messages are still processed only once. Otherwise it's up to you to ensure that your endpoints and handlers have idempotency solutions in place. You can play around with speed vs cost by adjusting the polling intervals of each of your endpoints & perhaps even have a back-off strategy where you decrease your polling intervals if you haven't received messages in a while. You will also have to worry about the size of your messages, as SQS has a small size limitation (256 K). You don't hit this in most messages.
I'd pick this option if read / write speeds aren't an issue, but you don't want to worry about operationally supporting your queuing infrastructure.
RabbitMQ
I haven't personally played with RabbitMQ on EC2, but a quick search came up with a few articles on how to get it up and running on an EC2 instance. There is a mature RabbitMQ transport available and it supports guaranteed once-only processing of messages as of NServiceBus version 5, as described in the link above. This would be cheaper to operate than SQS & I've heard that it's easier to cluster than MSMQ. Finally, like MSMQ, you would have to come up with a backup strategy (probably using snapshots).
Mixed
Nobody says that you have to pick one queuing system. You could use SQS for endpoints that need high availability & you don't mind paying the $$$, then use MSMQ / RabbitMQ for the rest of your system.

Scaling Tigase XMPP server on Amazon EC2

Does anyone have an experience running clustered Tigase XMPP servers on Amazon's EC2, primarily I wish to know about anything that might trip me up that is non-obvious. (For example apparently running Ejabberd on EC2 can cause issues due to Mnesia.)
Or if you have any general advice to installing and running Tigase on Ubuntu.
Extra information:
The system I’m developing uses XMPP just to communicate (in near real-time) between a mobile app and the server(s).
The number of users will initially be small, but hopefully will grow. This is why the system needs to be scalable. Presumably for a just a few thousand users you wouldn’t need a cc1.4xlarge EC2 instance? (Otherwise this is going to be very expensive to run!)
I plan on using a MySQL database hosted in Amazon RDS for the XMPP server database.
I also plan on creating an external XMPP component written in Python, using SleekXMPP. It will be this external component that does all the ‘work’ of the server, as the application I’m making is quite different from instant messaging. For this part I have not worked out how to connect an external XMPP component written in Python to a Tigase server. The documentation seems to suggest that components are written specifically for Tigase - and not for a general XMPP server, using XEP-0114: Jabber Component Protocol, as I expected.
With this extra information, if you can think of anything else I should know about I’d be glad to know.
Thank you :)
I have lots of experience. I think there is a load of non-obvious problems. Like the only reliable instance to run application like Tigase is cc1.4xlarge. Others cause problems with CPU availability and this is just a lottery whether you are lucky enough to run your service on a server which is not busy with others people work.
Also you need an instance with the highest possible I/O to make sure it can cope with network traffic. The high I/O applies especially to database instance.
Not sure if this is obvious or not, but there is this problem with hostnames on EC2, every time you start instance the hostname changes and IP address changes. Tigase cluster is quite sensitive to hostnames. There is a way to force/change the hostname for the instance, so this might be a way around the problem.
Of course I am talking about a cluster for millions of online users and really high traffic 100k XMPP packets per second or more. Generally for large installation it is way cheaper and more efficient to have a dedicated servers.
Generally Tigase runs very well on Amazon EC2 but you really need the latest SVN code as it has lots of optimizations added especially after tests on the cloud. If you provide some more details about your service I may have some more suggestions.
More comments:
If it comes to costs, a dedicated server is always cheaper option for constantly running service. Unless you plan to switch servers on/off on hourly basis I would recommend going for some dedicated service. Costs are lower and performance is way more predictable.
However, if you really want/need to stick to Amazon EC2 let me give you some concrete numbers, below is a list of instances and how many online users the cluster was able to reliably handle:
5*cc1.4xlarge - 1mln 700k online users
1*c1.xlarge - 118k online users
2*c1.xlarge - 127k online users
2*m2.4xlarge (with 5GB RAM for Tigase) - 236k online users
2*m2.4xlarge (with 20GB RAM for Tigase) - 315k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 400k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 312k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 327k online users
5*m2.4xlarge (with 60GB RAM for Tigase) - 280k online users
A few more comments:
Why amount of memory matters that much? This is because CPU power is very unreliable and inconsistent on all but cc1.4xlarge instances. You have 8 virtual CPUs but if you look at the top command you often see one CPU is working and the rest is not. This insufficient CPU power leads to internal queues grow in the Tigase. When the CPU power is back Tigase can process waiting packets. The more memory Tigase has the more packets can be queued and it better handles CPU deficiencies.
Why there is 5*m2.4xlarge 4 times? This is because I repeated tests many times at different days and time of the day. As you can see depending on the time and date the system could handle different load. I guess this is because Tigase instance shared CPU power with some other services. If they were busy Tigase suffered from CPU under power.
That said I think with installation of up to 10k online users you should be fine. However, other factors like roster size greatly matter as they affect traffic, and load. Also if you have other elements which generate a significant traffic this will put load on your system.
In any case, without some tests it is impossible to tell how really your system behaves or whether it can handle the load.
And the last question regarding component:
Of course Tigase does support XEP-0114 and XEP-0225 for connecting external components. So this should not be a problem with components written in different languages. On the other hand I recommend using Tigase's API for writing component. They can be deployed either as internal Tigase components or as external components and this is transparent for the developer, you do not have to worry about this at development time. This is part of the API and framework.
Also, you can use all the goods from Tigase framework, scripting capabilities, monitoring, statistics, much easier development as you can easily deploy your code as internal component for tests.
You really do not have to worry about any XMPP specific stuff, you just fill body of processPacket(...) method and that's it.
There should be enough online documentation for all of this on the Tigase website.
Also, I would suggest reading about Python support for multi-threading and how it behaves under a very high load. It used to be not so great.

Has anyone tried using ZooKeeper?

I was currently looking into memcached as way to coordinate a group of server, but came across Apache's ZooKeeper along the way. It looks interesting, and Yahoo uses it, so it shouldn't be bad, but I'd never heard of it before, so I'm kind of skeptical. Has anyone else given it a try? Any comments or ideas?
ZooKeeper and Memcached have different purposes. You can use memcached to do server coordination, but you'll have to do most of this work yourself. Memcached only allows coordination in that it caches common data lookups to be used by multiple clients. From reading ZooKeeper's documentation, it has a much broader focus than this. ZooKeeper seems to provide support for server clustering, which isn't the same as the cache clustering memcached provides.
Have a look at Brad Fitzpatrick's Linux Journal article on memcached to get a better idea what I mean.
To get an overview of what Zookeper is capable of, watch the following presentation by it's creators. It's capable of so much more (creating queue's, electing master processes amongst a group of peers, distributed high performance run time configurations, rendezvous points for dis-joined processes, determining if processes are still running, etc).
http://zookeeper.sourceforge.net/index.sf.shtml
To answer your question, if "coordination" is what you are looking for Zookeeper is much better targeted at that than memcached.
Zookeeper is great for coordinating data across servers. It does a good job of ordering every transaction and making guarantees that transactions happen in order. However when first breaking into it the documentation sucks; it's very 'high-level' without enough concrete examples or explanations as how to properly handle certain events. One of the included examples (as of version 3.3.3) had its own bugs in it.
Your code will also need to be cognizant of event driven interactions, and polling interactions. With massively distributed architecture, when acting upon 'events' you can inadvertently create a stampede that could not be desirable for your environment (herding effect).

Resources