Elasticsearch Load Testing

I have a single-node Elasticsearch server running on EC2. I want to do some load testing using search requests with random search queries. I am using JMeter for the load testing with two different approaches:
HTTP Client - When I test using this client with 10k/20k/50k requests, it works fine.
ES Transport Client - This works fine only up to approximately 2k requests.
Here are the steps I have followed:
Instantiate the client on every run and close it once the test has finished.
Once the client is instantiated, start the JMeter sampling and send the search requests.
After the run, stop the sampling.
I am getting NoNodeAvailableException after about 2k requests with the transport client.
The ES server is running with 3 GB of memory and I have given 6 GB of memory to the load tester.
Please let me know if some configuration change is required, or whether I am not using the correct approach to test the load.
Thanks in advance.
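For reference, a minimal sketch of the transport-client approach described above, assuming a pre-2.x TransportClient API; the cluster name, host, index and field names are placeholders, not values taken from the question:

```java
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.index.query.QueryBuilders;

public class RandomSearchSampler {
    public static void main(String[] args) {
        // cluster.name must match the server's setting, otherwise the client drops the
        // node and every request fails with NoNodeAvailableException.
        TransportClient client = new TransportClient(
                ImmutableSettings.settingsBuilder()
                        .put("cluster.name", "elasticsearch")   // placeholder
                        .build())
                .addTransportAddress(
                        new InetSocketTransportAddress("ec2-host.compute.amazonaws.com", 9300)); // placeholder
        try {
            // One search per sample; index and field names are placeholders.
            SearchResponse response = client.prepareSearch("my_index")
                    .setQuery(QueryBuilders.matchQuery("title", "some random term"))
                    .execute()
                    .actionGet();
            System.out.println("hits=" + response.getHits().getTotalHits());
        } finally {
            // TransportClient is thread-safe and is normally created once, shared by all
            // sampler threads for the whole test, and closed at the very end.
            client.close();
        }
    }
}
```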

What kind of responses are you getting from the HTTP test? Have you verified you are getting valid responses for all 10k-50k requests? It might be that your cluster cannot take the load you're putting on it in either test. Since the TransportClient is more intimately coupled to the ES server, you will explicitly see the errors it returns, but if you're simply sending requests via HTTP without validating the responses, it's easy to miss any issues.
Before taking a stab in the dark like I just did, though, I would also check what kind of QPS you are getting with the HTTP method vs. the TransportClient method, what your CPU/memory look like throughout both tests, what the response times look like, etc. It helps to monitor the health of your system throughout the process to detect any symptoms that might help explain the cause.
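To make the "validate the responses" point concrete, here is a rough sketch of the kind of check a JMeter assertion would perform against the HTTP _search endpoint. The host and index are placeholders, and the string matching is deliberately crude (a real assertion would parse the JSON):

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SearchResponseCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder host/index; 9200 is the default Elasticsearch HTTP port.
        String term = URLEncoder.encode("some random term", "UTF-8");
        URL url = new URL("http://ec2-host.compute.amazonaws.com:9200/my_index/_search?q=" + term);

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        int status = conn.getResponseCode();

        boolean valid = false;
        if (status == 200) {
            try (InputStream in = conn.getInputStream()) {
                String body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                // A 200 alone is not enough: the body should contain a hits section and
                // no failed shards before the sample is counted as a pass.
                valid = body.contains("\"hits\"") && !body.contains("\"failed\":1");
            }
        }
        System.out.println("status=" + status + " valid=" + valid);
    }
}
```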

Related

JMeter returns 401 Unauthorized for a few tests

I am using JMeter to load test a website. I have tested from 1 to 400 users; however, while testing with more than 500 users/threads, I am getting a 401 Unauthorized error for a few users. I hope you can help me find a solution to this problem.
I can think of 3 possible reasons:
There is a parameterization problem with your test, i.e. it sends wrong credentials. You can use e.g. a Simple Data Writer configured to save request and response data for failing samplers and inspect them using the View Results Tree listener.
JMeter gets overloaded and cannot properly perform correlation. In addition to point 1, check whether you're following JMeter Best Practices and ensure that JMeter has enough headroom to operate in terms of CPU, RAM, etc., using e.g. the JMeter PerfMon Plugin.
Your application gets overloaded and cannot handle 500+ users; check your application logs, resource usage, the output of APM tools, etc.

How to handle Socket Exception when response time is high

We are executing a test of an upload scenario where we know the response time will be more than 5 minutes. Hence we have configured the timeout in HTTP Request Defaults, as well as in the HTTP Request sampler, as 3600000 milliseconds. But we are still getting a SocketException in the upload transaction. Could you please suggest how to handle this?
Thanks,
SocketException doesn't necessarily mean "timeout"; it indicates that JMeter was not able to create or use a socket connection (the sketch after this list shows how the two failure modes differ). There are many possible reasons; the most common are:
The network configuration of your server doesn't allow as many connections as you're trying to open; check the maximum number of open connections at the application-server and operating-system level.
Your application server is overloaded and cannot handle such a big load. Make sure it has enough headroom to operate in terms of CPU, RAM and especially network metrics (these can be monitored using the JMeter PerfMon Plugin).
You might be experiencing the behaviour described in the JMeterSocketClosed article.
Basically the same as points 1 and 2, but this time you need to check JMeter's health; make sure you're following JMeter Best Practices and maybe even consider going for distributed testing.
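As a minimal illustration of that distinction, the sketch below issues a plain Java upload with the same 3600000 ms timeouts mentioned in the question (the endpoint and payload are placeholders): only SocketTimeoutException means the configured timeout was exceeded, while other SocketExceptions point at connection-level problems such as the ones listed above.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.net.URL;

public class UploadProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint; the timeouts mirror HTTP Request Defaults in the question.
        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://example.com/upload").openConnection();
        conn.setConnectTimeout(3600000);   // "Connect" timeout in JMeter terms
        conn.setReadTimeout(3600000);      // "Response" timeout in JMeter terms
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);

        try (OutputStream out = conn.getOutputStream()) {
            out.write(new byte[1024 * 1024]);   // dummy payload standing in for the upload
            System.out.println("status=" + conn.getResponseCode());
        } catch (SocketTimeoutException e) {
            // Only this exception means the configured timeout was actually exceeded.
            System.err.println("Timed out: " + e.getMessage());
        } catch (SocketException e) {
            // Connection refused/reset, broken pipe, too many open files, etc. -
            // connection-level failures that no timeout setting will fix.
            System.err.println("Socket-level failure: " + e.getMessage());
        }
    }
}
```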

What's the impact of response codes 400 and 503? Can we ignore these codes if my primary focus is to measure the loading time of a web application?

I am testing a web application's login page loading time with 300 thread users and a ramp-up period of 300 seconds. Most of my samples return response code 200, but a few of them return response codes 400 and 503.
My goal is just to check the performance of the web application if 300 users start using it.
I am new to JMeter and have basic knowledge of programming.
My questions:
1. Can I ignore these errors and focus just on the timings from the summary report?
2. If I really need to fix these errors, how do I fix them?
There are 2 different problems indicated by these errors:
HTTP Status 400 stands for Bad Request - it means that you're sending malformed requests which cannot be understood by the server. You should inspect the request details and amend the JMeter configuration, as this is a problem in your script.
HTTP Status 503 stands for Service Unavailable - it indicates a problem on the server side, i.e. the server is not capable of handling the load you're generating. This is something you can already report as an application issue. You can try to identify the underlying cause by:
looking into your application log files
checking whether your application has enough headroom to operate in terms of CPU, RAM, network, disk, etc. This can be done using an APM tool or the JMeter PerfMon Plugin
re-running your test with profiler tool telemetry to deep-dive into what is going on under the hood of the longest response times
So first of all you should ensure that your test is doing what it is supposed to do by running it with 1-2 users/loops and inspecting the request/response details. At this stage you should not be getting any errors.
Going forward, you should increase the load gradually and correlate the increasing number of virtual users with the increasing response times / number of errors.
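To answer question 1 more concretely, it is easy to quantify how many samples failed before deciding whether the timings are trustworthy. The following sketch assumes a CSV .jtl results file with the default header row; the file name is a placeholder and the naive comma split ignores quoted fields:

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ResponseCodeSummary {
    public static void main(String[] args) throws Exception {
        // Placeholder file name; assumes the default CSV header written by JMeter.
        List<String> lines = Files.readAllLines(Paths.get("results.jtl"));
        List<String> header = Arrays.asList(lines.get(0).split(","));
        int codeCol = header.indexOf("responseCode");
        int elapsedCol = header.indexOf("elapsed");

        Map<String, Integer> counts = new TreeMap<>();
        long totalElapsed = 0;
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split(",");
            counts.merge(fields[codeCol], 1, Integer::sum);
            totalElapsed += Long.parseLong(fields[elapsedCol]);
        }
        // Failed samples (400/503) often return faster than successful ones, so an
        // average taken over everything can look better than the real user experience.
        System.out.println("response codes: " + counts);
        System.out.println("average elapsed (ms): " + totalElapsed / (lines.size() - 1));
    }
}
```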
Performance testing is different from load testing. What you are doing is load testing.
Performance testing is more about how long an action takes. I typically capture performance for a given action on a system that is not under load.
This gives a baseline that I can then refer to during load tests.
Hopefully, you've been given some performance figures to test against, e.g. the system must be able to handle 300 requests in two minutes.
When moving onto load, I run a series of load tests with increasing number of users/threads and capture the results from each test.
Armed with this, I can see how load degrades performance to the point where errors start to show up. This gives you an idea of how much typical load the system can handle.
I'd also look to run soak tests too. This is where I'd run JMeter for a long period with typical (not peak) load to make sure the system can handle sustained load.
In terms of the errors you're seeing, no, I would not ignore them. Assuming your test is calling the same endpoint, it seems safe to say the code is fine; it's the infrastructure struggling with the load you're throwing at it.

Interpreting JMeter Results

I have been running some load tests against APIs using JMeter. I am trying to understand the cause of the two different patterns of slow behaviour I am seeing:
Pattern 1: Time to connect is low, Latency is high
Pattern 2: Time to connect is high, Latency is low
*Note: the majority of calls are returning in around 45-50ms.
My current thoughts are as follows:
Pattern 1: This is "server processing time", so for some reason the back-end server is taking longer than usual to respond. We will need to do a deeper dive to figure out why.
Pattern 2: This pattern shows a long time to establish a TCP connection. Is there a way to determine whether this is a problem on the outgoing side, i.e. JMeter itself is running out of threads to make API connections, or whether the API server is running out of connections and is unable to accept more?
How should I interpret these results? Are there any additional data points I could pull or tools I could use to better understand the findings?
Both Connect Time and Latency are network-related metrics, the formula is:
Elapsed Time = Connect Time + Latency + Server Response time
It looks like your server itself is not the problem; the issue is either at the network level or connected with JMeter, which might lack the resources to send requests fast enough.
With regards to additional information sources:
Generate an HTML Reporting Dashboard and look into the "Over Time" charts. This should allow you to correlate increasing load with increasing connect time / latency.
Consider setting up monitoring of the essential health metrics of the JMeter load generator(s) and the application under test. You can use the JMeter PerfMon Plugin for this.
Make sure to follow JMeter Best Practices, as JMeter's default configuration is good for test development and debugging, but you need to fine-tune JMeter for high loads.
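One cheap additional data point is to classify the slow samples directly from the results file, since a CSV .jtl can record both metrics per sample. The sketch below assumes a default header row with the Connect column enabled, applies the two patterns exactly as defined in the question, and uses an arbitrary 200 ms threshold (the file name and threshold are placeholders):

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class SlowSampleClassifier {
    public static void main(String[] args) throws Exception {
        // Placeholder file name; assumes a CSV header row containing Connect and Latency.
        List<String> lines = Files.readAllLines(Paths.get("results.jtl"));
        List<String> header = Arrays.asList(lines.get(0).split(","));
        int connectCol = header.indexOf("Connect");
        int latencyCol = header.indexOf("Latency");
        int labelCol = header.indexOf("label");

        long threshold = 200;   // arbitrary cut-off, well above the ~45-50 ms norm
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split(",");
            long connect = Long.parseLong(fields[connectCol]);
            long latency = Long.parseLong(fields[latencyCol]);
            if (connect > threshold) {
                // Pattern 2: establishing the TCP connection itself was slow.
                System.out.println(fields[labelCol] + " slow connect: " + connect + " ms");
            } else if (latency > threshold) {
                // Pattern 1: the connection was quick but the first byte of the response was not.
                System.out.println(fields[labelCol] + " slow first byte: " + latency + " ms");
            }
        }
    }
}
```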

How can I increase SSL performance with Elastic Beanstalk

I really like Elastic Beanstalk and managed to get my webapp (Spring MVC, Hibernate, ...) up and running using SSL on a Tomcat7 64-bit container.
A major concern to me is performance (I thought using the Amazon cloud would help here).
To benchmark my server performance I am using blitz.io (which uses the amazon cloud to have multiple clients access my webservice simultaneously).
My very first simple performance test already got me wondering:
I benchmarked a health check url (which basically just prints "I'm ok").
Without SSL: Looks fine.
13 Hits/s with a response time of 9ms
230 Hits/s with a response time of 8ms
With SSL: Not so fine.
13 Hits/s with a response time of 44ms (Ok, this should be a bit larger due to encryption overhead)
30 Hits/s with a response time of 3.6s!
Going higher left me with connection timeouts (timeout = 10s).
I tried using a larger EC2 instance in the background with essentially the same result.
If I am not mistaken, the load balancer in front of the EC2 instances serves as the endpoint for SSL encryption. How do I increase its performance?
Can this be done with elastic beanstalk? Or do I need to setup my own load balancer etc.?
I also did some tests using Heroku (albeit with a slightly different technology stack, play! vs. Spring MVC). There I also saw increased response times, but they stayed mostly constant. I am assuming they are using quite performant SSL endpoints. How do I get that with Elastic Beanstalk?
It seems my testing method was flawed.
Amazon's Elastic Load Balancers seem to go up to 10k SSL requests per second.
See this great writeup:
http://blog.mattheworiordan.com/post/24620577877/part-2-how-elastic-are-amazon-elastic-load-balancers
SSL requires a handshake before a secure transmission channel is opened. Once the handshake is done, which involves several round trips, the data is transmitted.
When you are just hitting a page using a load tester, it is doing the handshake for each and every hit. It is not reusing an already established session.
That's not how browsers behave. A browser will do the handshake once and then reuse the open encrypted session for all subsequent requests for a certain duration.
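To see the difference between a full handshake and a resumed session from plain Java, something like the sketch below can be pointed at the HTTPS endpoint. The hostname is a placeholder, and TLS 1.2 is pinned because session IDs are only comparable under classic session resumption:

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.util.Arrays;

public class SslSessionReuseCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder hostname; point this at the load balancer's HTTPS listener.
        String host = "example.elasticbeanstalk.com";
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();

        byte[] firstId;
        byte[] secondId;

        // First connection: full TLS handshake, a new session is negotiated.
        try (SSLSocket first = (SSLSocket) factory.createSocket(host, 443)) {
            first.setEnabledProtocols(new String[] {"TLSv1.2"});
            first.startHandshake();
            firstId = first.getSession().getId();
        }

        // Second connection from the same JVM: the default client-side session cache
        // allows an abbreviated handshake, so the session id should match if the
        // server permits resumption - roughly what a browser does between page hits.
        try (SSLSocket second = (SSLSocket) factory.createSocket(host, 443)) {
            second.setEnabledProtocols(new String[] {"TLSv1.2"});
            second.startHandshake();
            secondId = second.getSession().getId();
        }

        System.out.println("session reused = " + Arrays.equals(firstId, secondId));
    }
}
```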
So, I would not be too worried about the results. I suggest you try a tool like www.browsermob.com to see how long a full page with many images, JS, CSS, etc. takes to load over SSL vs. non-SSL. That will be a fairer comparison.
Hope this helps.
