I am able to get Performance counters for every two seconds in Windows Server 2008 machine using Powershell script. But when i go to Task Manager and check for the CPU Usage, powershell.exe is taking 50% of CPU. So i am trying to get those Performance counters using other third party tools. I have searched and found this and this. Those two are need to refresh manually and not getting automatically for every two seconds. Can anyone Please suggest some tool which gives the Performance Counters for every two seconds and analyzes the Maximum, Average of those counters and stores the results in text/xls or any other format. Please help me.
I found some Performance tools from here, listed below:
Apache JMeter
NeoLoad
LoadRunner
LoadUI
WebLOAD
WAPT
Loadster
LoadImpact
Rational Performance Tester
Testing Anywhere
OpenSTA
QEngine (ManageEngine)
Loadstorm
CloudTest
Httperf.
There are a number of tools that do this -- Google for "server monitor". Off the top of my head:
PA Server Monitor
Tembria FrameFlow
ManageEngine
SolarWinds Orion
GFI Max Nagios
SiteScope. This tool leverages either the perfmon API or the SNMP interface to collect the stats without having to run an additional non-native app on the box. If you go the open source route then you might consider Hyperic. Hyperic does require an agent to be on the box.
In either case I would look to your sample window as part of the culprit for the high CPU and not powershell. The higher your sample rate the higher you will drive the CPU, independent of tool. You can see this yourself just by running perfmon. Use the same sets of stats and what what happens to the CPU as you adjust the sample rate from once every 30 seconds, to once in 20, then ten, 5 and finally 2 seconds as the interval. When engaged in performance testing we rarely go below ten seconds on a host as this will cause the sampling tool to distort the performance of the host. If we have a particularly long term test, say 24 hours, then adjusting the interval to once in 30 seconds will be enough to spot long term trends in resource utilization.
If you are looking to collect information over a long period of time, 12 hours to more, consider going to a longer term interval. If you are going for a short period of sampling, an hour for instance, you may want to run a couple of different periods of one hour at lesser and greater levels of sampling (2 seconds vs 10 seconds) to ensure that the shorter sample interval is generating additional value for the additional overhead to the system.
To repeat, tools just to collect OS stats:
Commercial: SiteScope (Agentless). Leverages native interfaces
Open Source: Hyperic (Agent)
Native: Perfmon. Can dump data to a file for further analysis
This should be possible without third party tools. You should be able to collect the data using Windows Performance Monitor (see Creating Data Collector Sets) and then translate that data to a custom format using Tracerpt.
If you are still looking for other tools, I have compiled a list of windows server performance monitoring tools that also includes third party solutions.
Related
I have a scenario with 5K HTTP requests. When I start JMeter with it, JMeter simply hangs after about 170 users. I followed all the guidelines for successful stress testing (no listeners, headless, increased heap space).
I must say that some of those requests are a little big, the overall file is ~115M.
When I only take a subset of the requests (~100), the simulation works better (faster initialization of users, holds more than 170 users, etc).
My question is, first, as I understand JMeter loads the scenario tree and every threads plays it, there should not be any duplication, so what exactly causes this extensive load? and second, what can I do about it?
PS: when I view the system bottlenecks I notice both CPU and memory are at very high values on the long file, both of the metrics have low values on the shorter version. Anyone can explain?
PS2: the requests have about 7 seconds of delay between them
First I need to let you know that if you are using a single system to do the load testing, the maximum your hardware or the port can handle at a time is 1 Gig of data. and your firewall(if any) would again receive/pass not more than I Gig of data. Try doing the same load test with Distributed System of load testing in Jmeter(Master-Slave-Distributed System). Even then, I don't think it would run for 4k requests(if these requests are heavy).
Best possible solution:
Try Distributed system as I mentioned above.
Try running the load test in Non GUI Mode- CLI
Increase the ramp up time as needed.
Increase the Ram of your system and allocate maximum available heap space to jmeter.
Drastic change- Use 1. Blazemeter cloud or 2. Move the complete setup of your load testing to Amazon Server which is more reliable and scalable.
while using nagios with multiple hosts spread over the network,hosts status shows a recognizable lag and takes a long time to reflect on nagios server cgi.Thus what is the optimal nrpe/nagios configration to speed up the status process for a distributed host environment.
In my case I use nagios core 4.1
nrpe 1.5
server/clients: Amazon ec2
The GUI is usually only updated once each minute (automatically), though clicking refresh can provide you with 'nearly' the latest information. I say nearly because there is a distinct processing loop inside of the Nagios core that causes it to never be real time. NRPE is going to run at the speed of your network connection - it does little else besides sending and receiving tiny amounts of data. About the only delay here is the time it takes to actually perform the check and send back the response - which, of course, has way to many factors to mention. Try looking at the output of
[nagioshome]/bin/nagiostats
There are several entries that tell you:
'Latency' - the time between when the check was scheduled to start, and the actual start time.
'Execution Time' - the amount of time checks are actually taking to run.
These entries will have three numbers, which are; Min / Max / Avg
High latency numbers (in my book that means Avg is greater than 1 second) usually means your Nagios server is over worked. There are a few things you can do to improve latency times, and these are outlined in the 'nagios.cfg' file. This latency has nothing to do with network speed or the speed of NRPE - it is primarily hardware speed. If you're already using the optimal values specified in nagios.cfg, then its time to find some faster hardware.
High execution times (for me an Avg greater than 5 seconds) can be blamed on just about everything except your Nagios system. This can be caused by faulty networks (improper packet routing), over loaded networks, faulty and/or poorly designed checks, slow target systems, ... the list is endless. Nothing you do with the Nagios and/or NRPE configs will help lower these values. Well, you could disable NRPE's encryption to improve wire time; but if you have encryption enabled in the first place, then its not likely you'd want it disabled.
Are there any tools to generate an automated scenario with a predefined ramp up of user requests (running same map-reduce job) and monitoring some specific metrics of Hadoop cluster under load? I am looking ideally for something like LoadRunner but free/open source tool.
The tool does not have to have a cool UI but rather an ability to record and save scenarios that include a ramp up and a rendezvous point for several users (wait until other users reach some point and do some action simultaneously).
The Hadoop distribution I am going to test is the latest MapR.
Searching internet did not bring any good free alternatives to HP LoadRunner. In case you had an experience with Hadoop (or MapR in particular) load testing, please share what tool you have used.
Every solution you will look at has both a tool quotient and a labor quotient in the total price. There are many open source tools which take the tool cost to zero but the labor charge is so high that your total cost to deliver will be higher than a purchase of a commercial tool with a lower labor charge. Also, many people look at performance testing tools as load generation alone, ignoring the automated collection of monitoring data and the analysis of the results where you can pin an increase in response times with a correlated use of resources at the same time. This is a laborious process made longer to do when you are using decoupled tools.
As you have mentioned LoadRunner, when you are provided a tool you should compare what is available in that tool to whatever you are provided. For instance,
there are Java, C, C++ & VB interfaces available in LoadRunner. You are going to find a way to exercise your map and reduce infrastructure. Compare the integrated monitoring capabilities (native/SNMP/terminal user with command line...) as well as analysis and reporting. Where capabilities do not exist you will either need to build the capability or acquire it elsewhere.
You have also brought up the concept of Rendezvous. You will want to be careful with its application in any tool. Unless you have a very large population the odds of Simultaneous collision in the same area of code/action at the same time becomes quite small. Humans are chaotic instruments, arriving and departing independently from one another. On the other hand, if you are automating an agent which is based upon a clock tick then rendezvous makes a lot more sense. Taking a look at your job submission logs by IP address can provide an objective model for how many are submitted simultaneously (rendezvous) versus how many are running concurrently. I audit a lot of tests and rendezvous is the most abused item across tools, resulting in thousands of lost engineering hours chasing engineering ghosts that would never occur in natural use.
I want to do load testing for 10 million users for my site. The site is a Java based web-app. My approach is to create a Jmeter test plan for all the links and then take a report for the 10 million users. Then use jvisualVM to do profiling and check if there are any bottlenecks.
Is there any better way to do this? Is there any existing demo for doing this? I am doing this for the first time, so any assistance will be very helpful.
You are on the correct path, but your load limit is of with a high factor.
Why I'm saying this is cause your site probably will need more machine to handle 10Milj Concurrent users. A process alone would probably struggle to handle concurrent 32K TCP-streams. Also do some math of the bandwidth it would take to actually handle 10Milj users.
Now I do not know what kind of service you thinking of providing on your site, but when thinking of that JVisualVM slows down processing by a factor 10 (or more for method tracing), you would not actually measure the "real world" if you got JMeter and JVisualVM to work at the same time.
JVisualVM is more useful when you run on lower loads.
To create a good measurement first make sure your have a good baseline.
Make a test with 10 concurrent users, connect up JVisuamVM and let it run for a while, not down all interesting values.
After you have your baseline, then you can start adding more load.
Add 10times the load (ea: 100 users), look at the changes in JVisualVM. Continue this until it becomes obvious that JVisualVM slows you down, for every time to add extra load, make sure you have written down the numbers your are interested in. Plot down the numbers in a graph.
Now... Interpolate the graph (by hand) for the number of users you want. This works for memory usage, disc access etc, but not for used CPU time, cause JVisualVM will eat CPU and give you invalid numbers on that (especially if you have method tracing turned on).
If you really want to go as high as 10Milj users, I would not trust JMeter either, I would write a little test program of my own that performs the test you want. This would be okey, since the the setting up the site to handle 10Milj will also take time, so spending a little extra time of the test tools are not a waste.
Just because you have 10 million users in the database, doesn't mean that you need to load test using that many users. Think about it - is your site really going to have 10 million simultaneous users? For web applications, a ratio of 1:100 registered users is common i.e. you are unlikely to have more than 100K users at any moment.
Can JMeter handle that kind of load? I doubt it. Please try faban instead. It is very light-weight and can support thousands of users on a single VM. You also have much better flexibility in creating your workload and can also automate monitoring of your entire test infrastructure.
Now to the analysis part. You didn't say what server you were using. Any Java appserver will provide sufficient monitoring support. Commercial servers provide nice GUI tools while Tomcat provides extensive monitoring via JMX. You may want to start here before getting down to the JVM level.
For the JVM, you really don't want to use VisualVM while running such a large performance test. Besides to support such a load, I assume you are using multiple appserver/JVM instances. The major performance issue is usually GC, so use the JVM options to collect and log GC information. You will have to post-process the data.
This is a non-trivial exercise - good luck!
There are two types of load testing - bottleneck identification and throughput. The question leads me to believe this is about bottlenecks, so number of users is a something of a red herring, instead the goal being for a given configuration finding areas that can be improved to increase concurrency.
Application bottlenecks usually fall into three categories: database, memory leak, or slow algorithm. Finding them involves putting the application in question under stress (i.e. load) for an extended period of time - at least an hour, perhaps up to several days. Jmeter is a good tool for this purpose. One of the things to consider is running the same test with cookie handling enabled (i.e. Jmeter retains cookies and sends with each subsequent request) and disabled - sometimes you get very different results and this is important because the latter is effectively a simulation of what some crawlers do to your site. Details for bottleneck detection follow:
Database
Tables without indices or SQL statements involving multiple joins are frequent app bottlenecks. Every database server I've dealt with, MySQL, SQL Server, and Oracle has some way of logging or identifying slow running SQL statements. MySQL has the slow query log, whereas SQL Server has dynamic management views that track the slowest running SQL. Once you've got your hands on the slow statements use explain plan to see what the database engine is trying to do, use any features that suggest indices, and consider other strategies - such as denormalization - if those two options do not solve the bottleneck.
Memory Leak
Turn on verbose garbage collection logging and a JMX monitoring port. Then use jConsole, which provides much better graphs, to observe trends. In particular leaks usually show up as filling the Old Gen or Perm Gen spaces. Leaks are a bottleneck with the JVM spends increasing amounts of time attempting garbage collection unsuccessfully until an OOM Error is thrown.
Perm Gen implies the need to increase the space as a command line parameter to the JVM. While Old Gen implies a leak where you should stop the load test, generate a heap dump, and then use Eclipse Memory Analysis Tool to identify the leak.
Slow Algorithm
This is more difficult to track down. The most frequent offenders are synchronization, inter process communication (e.g. RMI, web services), and disk I/O. Another common issue is code using nested loops (look mom O(n^2) performance!).
Best way I've found to find these issues absent some deeper knowledge is generating stack traces. These will tell what all threads are doing at a given point in time. What you're looking for are BLOCKED threads or several threads all accessing the same code. This usually points at some slowness within the codebase.
I blogged, the way I proceeded with the performance test:
Make sure that the server (hardware can be as per the staging/production requirements) has no other installations that can affect the performance.
For setting up the users in DB, a procedure can be used and can be called as a part of jmeter test plan.
Install jmeter on a separate machine, so that jmeter won't affect the performance.
Create a test plan in jmeter (as shown in the figure 1) for all the uri's, with response checking and timer based requests.
Take the initial benchmark, using jmeter.
Check for the low performance uri's. These are the points to expect for bottlenecks.
Try different options for performance improvement, but focus on only one bottleneck at a time.
Try any one fix from step 6 and then take an benchmark. If there is any improvement commit the changes and repeat from step 5. Otherwise revert and try for any other options from step 6.
The next step would be to use load balancing, hardware scaling, clustering, etc. This may include some physical setup and hardware/software cost. Give the results with the scalability options.
For detailed explanation: http://www.daemonthread.com/2011/06/site-performance-tuning-using-jmeter.html
I started using JMeter plugins.
This allows me to gather application metrics available over JMX to use in my Load Test.
I have some basic questions around understanding fundamentals of Performance testing. I know that under various circumstances we might want to do
- Stress Testing
- Endurance Testing etc.
But my main objective here is to ensure that response time is decent from application under a set of load which is towards a higher end or in least above average load.
My questions are as follows:
When you start to plan your expected response time of application; what do you consider. If thats the first step at all. I mean, I have a web application now. Do I just pull out a figure from air and say "I would expect application to take 3 seconds to respond to each request". and then go about figuring out what my application is lacking to get that response time?
OR is it the other way round, and you start performance test with a given set of hardware and say, lets see what response time I get now, and then look at results and say, well it's 8 seconds right now, I'd like it to be 3 seconds at max, so lets see how we can optimize it to be 3 seconds? But again is 3 seconds out of air? I am sure, scaling up machines only will not get response time up. It'll get response time up only when single machine/server is under load and you start clustering?
Now for one single user I have response time as 3 seconds but as the load increases it goes down exponentially; so where do I draw the line between "I need to optimize code further" (which has it's upper limit) and "I need to scale up my servers" (Which has a limit too)
What are the best free tools to do performance and load testing? I have used Jmeter a bit. But is there anything else, that is good and open source?
If I have to optimize code, I start profiling the specific flows which took lot of time responding to requests?
Basically I'd like to see how one goes about from end to end doing performance testing for their application. Any links or articles would be very helpful.
Thanks.
The Performance Testing Council is your gateway to freely exchange experiences, knowledge, and practice of performance testing.
Also read Microsoft Patterns & Practises for Performance testing. This guide shows you an end-to-end approach for implementing performance testing.
phoenix mentioned the Open Source tools.
First of all you can read
Best Practices for Speeding Up Your Web Site
For tools
Open source performance testing tools
performance: tools
This link and this show an example and method of performance tuning an application when the application does not have any obvious "bottlenecks". It works most intuitively on individual threads. I have no experience using it on web applications, although other people do. I agree that profiling is not easy, but I've always relied on this technique, and I think it is pretty easy / effective.
First of all, design your application properly.
Use a profiler, see where the bottlenecks in your application are, and take them away if possible. MEASURE performance before improving it.
I will try to provide basic step by step guide, which can be used for implementing Performance testing in you project.
1 - Before you start testing you should know amount of physical memory and amount of memory allocated for JVM, or whatever. DB size collect as much metrics as possible for your current environment. Know you environment
2 - Next step would be to identify common DB production size and expected yearly growth. You will want to test how your application will behave after year, two, five etc.,
3 - Automate environment setup, this is will help you a lot in future for regression testing and defect fix validation. So you need to have DB dumps for your tests. With current (baseline), one year, five year volume.
4 - Once you're done if gathering basic information - Think about monitoring your servers under load, maybe you already have some monitoring solution like http://newrelic.com/ this will help you to identify cause of performance degradation (CPU/Mem/Amount of threads etc.,) Some performance testing tools do have built in monitoring systems.
At this you are ready to move with tooling and load selection, there is already provided materials on how to do that so I will skip part with workload selection.
5 - Select tool I think that JMeter + http://blazemeter.com/ is what you need at this point, both do have a lot nice articles and education materials, for your script recording I would recommend to use blazemeters Chrom Extension instead of inbuilt JMeters solution. If you still think that you do lack knowledge on how things are done in JMeter I recommend to get this book - Performance Testing With JMeter 2.9 by Bayo Erinle
6 - Analyze results, review test plan and take corresponding actions.