Performance Testing MSMQ Server

Has anyone done any sort of performance tests against MSMQ?
We have a solution in a production environment where errors are added to an MSMQ queue for distribution to databases or event monitors.
We need to test the capacity of this system but aren't sure where to start.
Does anyone know of any tools, or have any tips?

Try overloading it with a test program and see where it balks/fails
[analogous to "destructive testing" in materials engineering]
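For example, a minimal sketch of such a test program, using the MSMQ COM API from Python via pywin32 (the queue path, payload size, and message count are placeholders, and the queue is assumed to already exist):

```python
# Hedged sketch: flood a private MSMQ queue and measure raw send throughput.
# Assumes pywin32 is installed and the queue already exists.
import time
import win32com.client

QUEUE_PATH = r".\private$\errors"   # placeholder - point this at your queue
MESSAGE_COUNT = 100_000

qinfo = win32com.client.Dispatch("MSMQ.MSMQQueueInfo")
qinfo.PathName = QUEUE_PATH
queue = qinfo.Open(2, 0)            # MQ_SEND_ACCESS = 2, MQ_DENY_NONE = 0

start = time.perf_counter()
for i in range(MESSAGE_COUNT):
    msg = win32com.client.Dispatch("MSMQ.MSMQMessage")
    msg.Label = f"load-test-{i}"
    msg.Body = "x" * 2048           # ~2 KB payload, roughly an error record
    msg.Send(queue)
elapsed = time.perf_counter() - start
queue.Close()

print(f"sent {MESSAGE_COUNT} messages in {elapsed:.1f}s "
      f"({MESSAGE_COUNT / elapsed:.0f} msg/s)")
```

Run something like this from several machines at once while watching queue depth and how quickly your consumers drain it; the point where send throughput or consumer lag degrades is your capacity limit.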

QueueExplorer has a "Mass send" option which can be used to send a bunch of messages to a queue, with or without a delay between them. I know it's not a fully automated stress test, but running it from a few instances, or even a few machines, could generate significant load.
Disclaimer: I'm author of QueueExplorer.

Yeah, I was thinking of that; I was hoping for a more publicly available tool.

Related

Performance Testing in Mirth Connect Using JMeter

Mirth Connect is software designed to handle message flows. It has built-in support for HL7 messages in particular, which is why it is widely used for interfacing in healthcare applications. Over the years I have seen Mirth experience performance issues, primarily due to messages building up over time and in scenarios where it receives a heavy message load in quick succession.
Mirth has a channel-based architecture, so it would be ideal if there were some way to performance test a Mirth channel and get JMeter statistics for it, allowing us to gather the information needed to optimize the channel transformers and set the purge routines accordingly.
However, there is little to no information online about this area, that is, how one can use JMeter to test a Mirth channel. A team in Sri Lanka did some research in this area back in 2013, and I found their findings and achievements here:
http://pragmatictestlabs.com/2016/10/09/performance-testing-healthcare-application-hl7-jmeter/
However, that work is very specific: the output was a JSON object which they extracted, whereas in Mirth outputs can take various forms, so there needs to be a more general approach. An important takeaway is that the input side is general: we can use JMeter to generate HL7 messages and pass them to Mirth, which is great. The open question is how to capture the response generically. It would be ideal if there were a way to read the Mirth Dashboard through JMeter; all the output statistics are there, it's just a matter of reading them.
I have an application where Mirth reads HL7 messages (both ADT and RDE), creates a text file with the appropriate content, and drops it in a shared location. The application then reads the files and shows the information to the user.
I want to run two performance tests here:
Measure how much time the complete system takes, from the arrival of a message to its information being available to the user, and how that varies with load
Measure how much time the channel itself takes, and how that changes as the load increases
I can do the first one because I can generate HL7 messages using JMeter and can get JMeter to read the output in the application or the database. The problem is with the second: can I do this in a general way?
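For concreteness, a minimal sketch of the first measurement (outside JMeter, just to show the shape of it) might look like this; the host, port, shared folder, and sample message are all placeholders:

```python
# Hedged sketch of measurement 1: send one HL7 message over MLLP and time how
# long until a new file appears in the shared output folder.
import socket
import time
from pathlib import Path

MIRTH_HOST, MIRTH_PORT = "mirth.example.local", 6661     # placeholders
OUTPUT_DIR = Path(r"\\share\mirth-out")                   # placeholder

VT, FS, CR = b"\x0b", b"\x1c", b"\x0d"                    # MLLP framing bytes

HL7_ADT = (
    "MSH|^~\\&|LOADTEST|TEST|APP|FAC|20240101120000||ADT^A01|MSG0001|P|2.3\r"
    "PID|1||123456||DOE^JOHN||19700101|M\r"
)

def send_and_time(message: str, timeout: float = 30.0) -> float:
    before = set(OUTPUT_DIR.glob("*.txt"))
    start = time.perf_counter()
    with socket.create_connection((MIRTH_HOST, MIRTH_PORT), timeout=10) as sock:
        sock.sendall(VT + message.encode("ascii") + FS + CR)
        sock.recv(4096)                    # wait for the ACK from Mirth
    while time.perf_counter() - start < timeout:
        if set(OUTPUT_DIR.glob("*.txt")) - before:
            return time.perf_counter() - start
        time.sleep(0.1)
    raise TimeoutError("no output file appeared within the timeout")

print(f"end-to-end latency: {send_and_time(HL7_ADT):.2f}s")
```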
You asked for suggestions, so I'm going to share my general strategy for performance testing Mirth channels. I suspect that this won't be a complete answer to your question, and I might not be telling you anything you don't already know, but I'm hoping this will help you find an answer that you are comfortable with.
For several reasons, try not to spend too much time "testing the complete system":
Firstly, testing the entire system necessarily includes testing low-level configuration like the number of CPU cores, the NICs being used in the box, and kernel level software like the TCP/IP stack. You don't usually have any control over these things, so you can't optimize them in any way.
Secondly, the performance of the entire system is going to be heavily dependent on whatever ancillary code is running on the box. If a sysadmin decides to 'nice' my Mirth process down, or to use that box to also host a SQL server, that will have an impact on the system that I (again) have no control over.
Thirdly and most frankly, I find that the "performance of an entire system" is something that management asks about during system setup so they can get a cost estimate; but they know that they're only getting an estimate. You do your best to use test metrics to give a good guess for the initial hardware provisioning, but everyone knows that it's really the production performance metrics that will drive later provisioning costs.
Make sure that you build your channels for testability. I find that it's much easier to test a channel when the source and destination can be changed to "Channel Reader" and "Channel Writer" without changing message handling. One way to look at this is that you're not going to overhaul Mirth's MLLP stack or Java's TCP stack, so just eliminate these things from your testing.
I keep a source of useful test messages. I have a couple of files on a network drive that have around a hundred messages that test for nasty edge cases that I've run into over the years on my HL7 interfaces. I wrote a small Mirth channel that reads these in from a file and spews out copies as fast as it can. By turning on "Queueing" on the destination side of that channel, I can queue up a bajillion test messages that are ready to send to the channel I want to test. In the past I took the time to build a test interface that acted like a fake EMR to spew out randomly constructed messages, but there didn't seem to be any advantage over just spewing copies of the same messages from my test files.
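Purely as an illustration, an external equivalent of that replay channel could look like the sketch below; the host, port, file path, and blank-line message separator are assumptions:

```python
# Hedged sketch: read saved edge-case HL7 messages from a file (assumed to be
# separated by blank lines) and send each one COPIES times over MLLP.
import socket
from pathlib import Path

MIRTH_HOST, MIRTH_PORT = "mirth.example.local", 6661   # placeholders
VT, FS, CR = b"\x0b", b"\x1c", b"\x0d"                  # MLLP framing bytes
COPIES = 1000

messages = Path("edge_case_messages.hl7").read_text(encoding="ascii").split("\n\n")

with socket.create_connection((MIRTH_HOST, MIRTH_PORT)) as sock:
    for _ in range(COPIES):
        for msg in messages:
            payload = msg.strip().replace("\n", "\r").encode("ascii")
            sock.sendall(VT + payload + FS + CR)
            sock.recv(4096)        # wait for the ACK before sending the next one
```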
Finally, and most importantly, it's critical that you measure the performance of your test instance using the same metrics that you'll use to measure the performance of your production instance. If the sole production metric you care about is 'messages per second', then that's what you need to measure on your test box. If memory footprint is a concern in production, then you need to measure memory usage in your test environment as well. When you make a change to your test instance that decreases an important metric by 10%, you'll need to make sure your management is aware before you push that change to production.
Note that getting some of these metrics can be tricky, since Mirth doesn't include good tools to monitor its own performance. The Mirth dashboard is a good place to keep an eye on errors or crashes, but it's not a great place to find performance data. During my testing I make sure to use whatever resource monitoring tool the sysadmins will be using to monitor the performance of the production instance. Beyond that, I use a manual process to test performance: if I want to count messages per second, I send through a batch of messages and look at the timestamps of the first and last messages. If I want to get an idea of the CPU load of a Mirth channel, I use the Windows Performance Monitor or the POSIX 'top' command.
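As a small sketch of that "first and last timestamp" arithmetic, assuming the channel under test writes one output file per message into a folder (the path is a placeholder):

```python
# Hedged sketch: derive messages-per-second from the modification times of the
# first and last output files produced during a batch run.
from pathlib import Path

OUTPUT_DIR = Path(r"\\share\mirth-out")    # placeholder

files = sorted(OUTPUT_DIR.glob("*.txt"), key=lambda f: f.stat().st_mtime)
if len(files) > 1:
    elapsed = files[-1].stat().st_mtime - files[0].stat().st_mtime
    print(f"{len(files)} messages in {elapsed:.1f}s -> "
          f"{len(files) / elapsed:.1f} msg/s")
```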

JMeter use in sanity testing in production servers

I'm using JMeter in a development environment, and I'm thinking of running sanity tests on the production servers.
These are sanity tests of website logins and other actions.
Is it reasonable to use JMeter on production servers? How can I limit JMeter so it won't impact real users? The only tutorial I found advises against it:
Do not run these tests against your production servers unless you know they can handle the load, or you may negatively impact your server's performance.
From JMeter's point of view it doesn't really matter where you run your tests. Running load tests against a production environment is very useful, as this way you can discover "real" limitations, bottlenecks, and integration and interoperability problems, as opposed to load testing in scaled-down environments where you can only guess or calculate the anticipated production metrics.
Ideally you should have some form of "staging environment" which is an exact replica of production environment in terms of hardware, software and data.
If you cannot afford a "staging" environment to play with, you can run your tests against production; however, you need to keep several important constraints in mind to avoid "surprises":
Run your tests during "dead" time, when real-life usage of your application is minimal, e.g. overnight or at weekends.
Make sure the JMeter test leaves the system in the same state it was in before the test, i.e. if you create users, content, data, etc., clean it up afterwards so your system isn't filled with "junk" data used for load testing. Consider using a setUp Thread Group to set up the necessary test data and a tearDown Thread Group to clean up after yourself.
Make sure you monitor your servers' health so you will be notified when (if) your system gets overloaded. You can use the JMeter PerfMon Plugin for this.
It would also be good to have the AutoStop Listener enabled so the JMeter test stops automatically if its thresholds are exceeded.
Consider adding an SMTP Sampler to your test plan so you are informed in case of unexpected errors.
As an engineering manager I would say: not in my lifetime ;-)
So what do you want to hear: that it is not a problem?
Only you can tell whether it would be an issue if something behaves different from what you expect.
My advice would be the same as what you are quoting: don't do it. Unless you know what you are doing, and even then...

How can I monitor my application server or database server from JMeter scripts? Can we check CPU, memory utilization, etc.?

I need to know to what extent we can analyze our application using Apache JMeter.
My script creation is complete: parametrized and correlated. Now I need a deeper understanding of analysis.
Earlier, I used to focus just on response time, standard deviation, throughput, etc.
But now my boss wants me to do more analysis. Please help me guys.
You can use these Samplers from JMeter-plugins project:
http://jmeter-plugins.org/wiki/DbMon/
http://jmeter-plugins.org/wiki/JMXMon/
It's still best to separate the tasks from the tools used to solve them. If you need to monitor server utilization, use an appropriate tool for that, for example Zabbix. If you need to understand how many resources your server applications consume, turn to the appropriate monitoring tools and plug-ins, such as Zorka for WebSphere feeding into Zabbix.
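If all you need is CPU and memory figures and you can run an agent script on the box, a hedged sketch of the kind of sampling that PerfMon or Zabbix agents collect (not a JMeter feature; assumes the psutil package, and the file name and duration are placeholders) looks like this, and the CSV can be lined up against JMeter's result timestamps afterwards:

```python
# Hedged sketch: sample CPU and memory once per second and write them to CSV
# so they can be correlated with JMeter's results later.
import csv
import time
import psutil

with open("server_utilization.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "cpu_percent", "mem_percent"])
    for _ in range(300):                        # ~5 minutes of samples
        writer.writerow([
            time.time(),
            psutil.cpu_percent(interval=1),     # blocks ~1 s while measuring
            psutil.virtual_memory().percent,
        ])
        f.flush()
```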

TDD Scenario: Looking for advice

I'm currently in an environment where we are parsing data off of the client's website. I want to use my tests to ensure that when the client changes their site, I know when we are no longer receiving the information.
My first approach was to do pure integration tests where my tests hit the client's site and assert that the data was found. However, halfway through and 500 tests in, the test run became unbearable and in some cases started timing out. So I cleared out as many tests as I could without losing the core protection they provide, and I'm down to 350 or so. I'm now afraid that adding more tests will only break the whole suite. I also find myself no longer running the 5+ minute suite (some clients will take longer, as this depends on the speed of communication with their site) when I make changes. I consider this a complete failure.
I've been putting a lot of thought into this and asking around the office. My plan for my next attempt is to pull down the client's pages and write tests against these embedded resources in my projects. This would give me higher test coverage and allow me to go back to testing in isolation. However, I would need to be notified when they make changes and then re-pull the pages to test against. I don't think the clients will adhere to this.
A suggestion was made to me to augment this with a suite of 'random' integration tests that serve the same function as my failed tests (hitting the client's site) but in far smaller numbers than before. I really don't like the idea of random testing, where the same code can sometimes produce red lights and sometimes green. But so far this sounds like the best idea I've heard for still becoming aware when the client's site has changed and my code no longer finds the data.
Has anyone found themselves testing an environment like this? Any suggestions from the testing community for me?
When you say the big test has become unbearable, it suggests that you are running this test suite manually. You shouldn't have to. It should just be running constantly in the background, at whatever speed it takes to complete the suite - and then start over again (perhaps after a delay if there are associated costs). Only when something goes wrong should you get an alert.
If there is something about your tests that causes them to get slower as their number grows - find it and fix it. Tests should be independent of one another, so simply having more of them shouldn't cause individual tests to time out.
My recommendation would be to try to isolate as much as possible the part of code that deals with the uncertainty. This part should be an API that works as a service used by all the other code. This way you would be protecting most of your code against changes.
The stable parts of the code should be unit-tested. With that part being independent from the connection to client's site running the tests should be way quicker and it would also make those tests more reliable.
The part that has to deal with the changes on the client's websites can be reduced. This way you are not solving the problem but at least you're minimising it and centralising it in only one module of your code.
Suggesting that the clients expose the data as a web service would be best for you, but I guess that doesn't depend on you :P
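As a hedged illustration of that isolation (the function name, selector, fixture path, and use of BeautifulSoup are assumptions, not the poster's actual stack): one small function knows the client's page layout, fast unit tests run against saved copies of the pages, and only a small separate suite hits the live site.

```python
# Hedged sketch: isolate the scraping logic behind one function and unit test
# it against a saved fixture, with no network involved.
from pathlib import Path
from bs4 import BeautifulSoup

def extract_error_count(html: str) -> int:
    """The single place that knows the client's page layout."""
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.select_one("#error-summary .count")      # placeholder selector
    if cell is None:
        raise ValueError("page layout changed: error count not found")
    return int(cell.get_text(strip=True))

def test_extract_error_count_from_saved_page():
    # Fast, isolated unit test against a saved copy of the client's page.
    html = Path("fixtures/client_page.html").read_text(encoding="utf-8")
    assert extract_error_count(html) == 7
```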
You should look at dividing your tests up, maybe into separate assemblies that can be run independently. I typically have a unit tests assembly and a slower running integration tests assembly.
My unit tests assembly is very fast (because the code is tested in isolation using mocks) and gets run very frequently as I develop. The integration tests are slower and I only run them when I finish a feature / check in or if I have a bad feeling about breaking something.
Maybe you could do something similar or even take the idea further and have 3 test suites with the third containing even slower client UI polling tests.
If you don't have a continuous integration server / process you should look at setting one up. This would continuously build you software and execute the tests. This could be set up to monitor check-ins and work in the background, sending out a notification if anything fails. With this in place you wouldn't care how long your client UI polling tests take because you wouldn't ever have to run them yourself.
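The poster's stack is .NET, where the split would be separate unit and integration test assemblies (or categories); purely as an illustration, the same fast/slow split with pytest markers looks like this (the marker name and test bodies are placeholders):

```python
# Hedged sketch: mark the slow, site-hitting tests so the default run stays
# fast. Register the 'integration' marker in pytest.ini to avoid warnings.
import pytest

def test_parser_handles_missing_field():
    # Fast, isolated unit test - runs on every change.
    assert True

@pytest.mark.integration
def test_live_site_still_exposes_data():
    # Slow test that hits the client's site - run on a schedule.
    assert True
```

Run the fast loop with pytest -m "not integration" while developing, and let the CI server run the full suite on its own schedule.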
Definitely split the tests out - separate unit tests from integration tests as a minimum.
As Martyn said, get a Continuous Integration system in place. I use Teamcity, which is excellent, easy to use, free for the first 20 builds, and you can happily run it on your own machine if you don't have a server at your disposal - http://www.jetbrains.com/teamcity/
Set up one build to run on every check in, and make that build run your unit tests, or fast-running tests if you will.
Set up a second build to run at midnight every night (or some other convenient time), and include in this the longer running client-calling integration tests. With this in place, it won't matter how long the tests take, and you'll get a big red flag first thing in the morning if your client has broken your stuff. You can also run these manually on demand, if you suspect there might be a problem.

Diagnosing pathological behavior of a piece of cluster software

I'm using a kind of load balancer over a small cluster that is able to achieve >2000 rps on zero-duration requests (i.e. ones that are immediately satisfied by the worker nodes).
But as soon as the requests stop being zero-duration and start taking even 1 ms, performance immediately drops by more than 10x. The data being transferred in both directions is identical and is about 2 KB in size.
This is for sure not related to saturation of the cluster or network throughput, because 200rps of 1ms requests is a very tiny load and the network is 10Gbit. Besides, the CPU load is just some 2-5% both on the load balancer and on the worker nodes.
I wonder whether this might be related to some pathological behavior of the OS scheduler or the OS network stack (i.e. some special-case behavior for very short interactions).
How might I diagnose the cause? Which performance counters should I watch? What tools or methodologies should I use?
(Just in case someone simply knows the answer to my particular problem, I'm talking about the MS HPC Server 2008 R2's "WCF Broker", running on Windows Server 2008 R2 over Hyper-V)
One thing you can do is use ETW tracing to try and understand what the nodes are doing while your WCF job is running. On HPC server, I sometimes clusrun xperf to collect traces on all or specific nodes. There are a number of tools that you can use for analyzing ETW traces, including xperf itself. I haven't done any serious work using HPC SOA (WCF), but I did write a simple WCF raytracer app and then used xperf to profile it on several of the nodes.
Turned out it was a completely network-unrelated issue having to do with peculiarities of the scheduling mechanism of HPC Server. I resolved the issue by tweaking a configuration option "serviceRequestPrefetchCount" to 0 in the loadBalancing section of the WCF service config file.
I'm assuming that there are some shared resources with some kind of locking system in place? Is locking a bottleneck? It's hard to guess without seeing the system.
Do you have a way to profile the workers? What are they spending most of their time on, especially in the fast vs slow scenarios?
