java.net.SocketException: Connection reset in JMeter - performance

I'm doing a load test on a web application, and with a minimum of 14-15 users I am getting this connection-reset issue. I have ensured the following on my end:
- Request retries have been set to 1 in the user.properties file.
- Stale check is set to true.
- Test data and LAN connectivity are good.
- The number of users is small, so JMeter won't need more RAM.
Given all that, can this be attributed to the application's design rather than to JMeter?
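For reference, a minimal sketch of where the first two settings live, assuming the default HttpClient4 implementation (property names per JMeter's documentation; values illustrative):

# user.properties - retry a failed request once
httpclient4.retrycount=1

# hc.parameters - validate connections for staleness before reuse
http.connection.stalecheck$Boolean=true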

To avoid a long trail of comments, I'll try to summarize and answer.
This issue looks like it comes from the application deployment stack:
JMeter ---------------> ( Web server <-> App server <-> DB )
Find out which area the bottleneck is in, using profilers.
The issue could be in any of the layers below:
Web server:
If the web server is the bottleneck, tune it to handle more load: a larger thread pool, longer timeouts, bigger buffers and queues (see the sketch after this list).
Application server:
If the app server is the bottleneck, tune it: again, check the configuration and any settings specific to handling more load, and improve the code if required.
Database server:
If the DB is the bottleneck, check queries, indexes and statistics, and optimize them for your needs; configuration settings sometimes help as well.
For all layers, check server resource utilization. If it is low, there is room for performance improvement; otherwise, vertical/horizontal scaling of the server is required.
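As an illustration of the web-server tuning above, a sketch for an Apache Tomcat connector (the server choice and the values are assumptions for the example, not recommendations):

<!-- server.xml: HTTP connector sized for more concurrent load.
     maxThreads = worker thread pool, acceptCount = backlog queue,
     connectionTimeout is in milliseconds. -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="400"
           acceptCount="200"
           connectionTimeout="20000" />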
You say the problem is that some IDs were not generated in the DB, so you can start with the DB layer when looking for bottlenecks.
Hope this helps :)

Related

How to handle Socket Exception when response time is high

We are executing a test of an upload scenario where we know the response time will be more than 5 minutes, so we have configured the timeout in HTTP Request Defaults, as well as in the HTTP Request sampler, as 3600000 milliseconds. But we are still getting a SocketException in the upload transaction. Could you please suggest how to handle this?
Thanks,
SocketException doesn't necessarily mean "timeout"; it indicates that JMeter is not able to create or access a socket connection. There are many possible reasons; the most common are:
The network configuration of your server doesn't allow as many connections as you're trying to open; check the maximum number of open connections at the application-server and operating-system level (see the sketch after this list).
Your application server is overloaded and cannot handle such a big load. Make sure it has enough headroom to operate in terms of CPU, RAM and especially network metrics (these can be monitored using the JMeter PerfMon Plugin).
You might be experiencing the behaviour described in the JMeterSocketClosed article.
Basically the same as points 1 and 2, but this time you need to check JMeter's own health: make sure you're following JMeter Best Practices, and maybe even consider distributed testing.
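For point 1, a sketch of inspecting and raising the OS-level limits, assuming a Linux application server (values illustrative; persist them via limits.conf and sysctl.conf for production):

ulimit -n                                    # current per-process open-file (socket) limit
ulimit -n 65535                              # raise it for the current session
sysctl net.core.somaxconn                    # kernel accept-queue (listen backlog) limit
sysctl -w net.core.somaxconn=4096            # allow a deeper accept queue
sysctl -w net.ipv4.ip_local_port_range="1024 65535"   # more ephemeral ports for connections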

BizTalk SOAP receive location poor performance

I've encountered poor performance in my BizTalk application, which uses a SOAP/ASMX receive-location web service hosted in IIS on the same server. The service only invokes one function on an Oracle DB (connected via the Oracle driver).
I've done load tests via SoapUI and stressed the DB a little with the PL/SQL Profiler in SQL Navigator. It turned out that the average request time is 700 ms, the average DB query time is 15 ms and the average orchestration completion time is 30 ms (per the BizTalk Admin Console), so a tremendous amount of time is being lost somewhere in IIS, ASMX or SOAP.
I've read Configuration Parameters that Affect Adapter Performance and tweaked minFreeThreads and minFreeLocalRequestFreeThreads, but nothing really changed.
But as I understand it, that article describes a send port, whereas my problem is with a receive location, right?
I also read this article: BizTalk: Performance problems using the SOAP adapter.
There is no such registry key as:
HKLM\SYSTEM\CurrentControlSet\Services\BTSSvc$BizTalkServerApplication\CLR Hosting
How do I achieve Option 2?
Option 2:
Look into process isolation - this would mean using a different instance of the .NET thread pool, executed in a separate address space from the BizTalk NT service.
Please guide me.
Go to your receive host's properties and change the message polling interval from the default 500 ms to 50 ms; that will give you improved performance. If you're using an orchestration on a separate host to process the service request and response, do the same on the orchestration host, reducing its polling interval as well. This improves performance in low-latency scenarios, but it adds overhead on the SQL message box, so tune it according to your volume and needs.
Also try upgrading to WCF services

Using Azure load balancer to reboot/update server with zero downtime

I have a really simple setup: an Azure load balancer for HTTP(S) traffic, two application servers running Windows, and one database, which also holds session data.
The goal is to be able to reboot the servers, or update their software, without a single request being dropped. The problem is that the health probe tests every 5 seconds and needs to fail twice in a row, so when I kill an application server, many requests during those 10 seconds will time out. How can I avoid this?
I have already tried running the health probe on a different port and then denying all traffic to that port using Windows Firewall. The load balancer then considers the application down on that node and no longer sends it new traffic. However, the Azure LB does hash-based load balancing, so the traffic that was already going to the now-killed node keeps going there for a few seconds!
First of all, could you give us some additional details: is your database load-balanced as well? Are you performing reads and writes on this database, or only reads?
For your information, you can change the Azure Load Balancer distribution mode; please refer to this article for details: https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-distribution-mode
I would suggest disabling the server you are updating at the load-balancer level. Wait a couple of minutes (depending on your application) before starting your updates; this should "purge" the endpoint. When the update is done, update your load balancer again and put the server back in.
The cloud concept is infrastructure as code: this can easily be scripted and included in your deployment/update procedure.
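A sketch of such a script, using the Az PowerShell module with hypothetical resource names (myRG, myLB, vm1-nic):

# Take the node out of the backend pool so the LB stops sending it traffic
$nic = Get-AzNetworkInterface -Name "vm1-nic" -ResourceGroupName "myRG"
$nic.IpConfigurations[0].LoadBalancerBackendAddressPools.Clear()
$nic | Set-AzNetworkInterface

# ... wait for in-flight requests to drain, then reboot/update the server ...

# Put the node back into the pool
$lb = Get-AzLoadBalancer -Name "myLB" -ResourceGroupName "myRG"
Set-AzNetworkInterfaceIpConfig -NetworkInterface $nic `
    -Name $nic.IpConfigurations[0].Name `
    -LoadBalancerBackendAddressPool $lb.BackendAddressPools[0]
$nic | Set-AzNetworkInterface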
Another solution would be to use Traffic Manager. It gives you additional options for managing your endpoints (though it might be oversized for 2 VMs/endpoints).
The last solution is to migrate to a PaaS offering where this kind of feature is already available (deployment slots).
Hope this helps.
Best regards

MaxConcurrentRequest in selfhost application

I have a self-hosted SignalR application. Everything is OK, but when there are more than 5000 users, users start reconnecting rapidly. I know that the default value of appConcurrentRequestLimit is 5000, so I ran:
cd %windir%\system32\inetsrv
appcmd.exe set config /section:system.webserver/serverRuntime /appConcurrentRequestLimit:100000
but nothing changed. I also increased maxConcurrentRequestsPerCPU and requestQueueLimit according to this, but I still have the problem.
I'm using Windows Server 2012 and IIS 8.
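For reference, on .NET 4 those two settings normally live in aspnet.config (under %windir%\Microsoft.NET\Framework64\v4.0.30319); a sketch with illustrative values:

<configuration>
  <system.web>
    <!-- raise the per-CPU concurrency cap and the process-wide request queue -->
    <applicationPool maxConcurrentRequestsPerCPU="20000"
                     requestQueueLimit="100000" />
  </system.web>
</configuration>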
You are shooting in the dark here; you have no data about the actual performance or about what's happening. The users could be reconnecting for different reasons (server timeouts, regular-interval reconnects, server errors). There are countless possibilities.
The correct way to find out what's happening and to measure performance is to run a baseline load test using the default configuration, collecting the relevant performance counters: current requests, queued requests, current connections, maximum connections, etc.
You should also collect any relevant error logs on the server that could help you figure out what's happening.
You can find the full list of performance counters you need below, followed by a sample collection command:
Memory
.NET CLR Memory\# Bytes in all Heaps (for w3wp)
ASP.NET
ASP.NET\Requests Current
ASP.NET\Requests Queued
ASP.NET\Requests Rejected
CPU
Processor Information\% Processor Time
TCP/IP
TCPv6\Connections Established
TCPv4\Connections Established
Web Service
Web Service\Current Connections
Web Service\Maximum Connections
Threading
.NET CLR LocksAndThreads\# of current logical Threads
.NET CLR LocksAndThreads\# of current physical Threads
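A sketch of collecting a few of these with the built-in typeperf tool (the w3wp instance name is an assumption and will vary; this samples every 5 seconds into baseline.csv):

typeperf "\ASP.NET\Requests Current" "\ASP.NET\Requests Queued" ^
         "\.NET CLR Memory(w3wp)\# Bytes in all Heaps" ^
         "\Web Service(_Total)\Current Connections" ^
         "\Processor Information(_Total)\% Processor Time" ^
         -si 5 -o baseline.csv -f CSV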
Once you have your baseline performance results on a graph, you can modify the configuration (e.g. the number of concurrent requests, as you tried above), re-run the test, and collect the same performance counters again.
The performance counter results will speak for themselves and will lead you to a solution.
You can generate the load with a tool like Crank:
https://github.com/SignalR/SignalR/tree/dev/src/Microsoft.AspNet.SignalR.Crank
You can also check the SignalR troubleshooting guide:
http://www.asp.net/signalr/overview/testing-and-debugging/troubleshooting

Troubleshooting MVC4 Web API Performance Issues

I have an ASP.NET MVC4 Web API interface that gets about 54k requests a day.
http://myserv.x.com/api/123/getstuff?whatstuff=thisstuff
I have 3 web servers behind a load balancer that are set up to handle the HTTP requests.
On average, response times are ~300 ms. However, lately something has gone awry (or maybe it has always been there): response times sporadically come back at 10-20 seconds, even for the same request hitting the same server directly instead of through the load balancer.
GIVEN:
- The system has been passed down to me, so there may be gaps in the IIS configuration, etc.
- Database: SQL Server 2008 R2
- Web servers: Windows Server 2008 R2 Enterprise SP1
- IIS 7.5
- Using MemoryCache aggressively for model and business objects, with eviction set to 2 hrs
- Looked at the logs but really don't see anything significantly relevant
- One application pool; no other LOB applications running on this server
Assumptions & ask:
Somehow I suspect that something is recycling the application pool, or that IIS worker processes are shutting down and restarting, causing each new request to warm up and re-cache. It's so sporadic that it's tough to troubleshoot right now. The same request to the same server comes back as fast as expected (back-to-back N requests) in about 300 ms, since the result was cached... but wait about 5-20 minutes and that same request to the same server takes 16 seconds.
I have limited tracing to go by, as these are production systems, so I can only expose so much logging detail. Any help, or information from somebody else who has run into this or similar behavior, is appreciated. Thx
UPDATE:
The w3wp.exe process grows to ~3 GB. Somehow it gets wiped out and the PID changes, so either it or something else is killing it every 3-4 minutes, and I see tons of warnings in my web server (IIS) log:
A process serving application pool 'MyApplication' suffered a fatal
communication error with the Windows Process Activation Service. The
process id was '1732'. The data field contains the error number.
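Before digging into dumps, one cheap check for the recycling theory is to make IIS log the reason for every recycle of the pool named in that event; a sketch using appcmd (the pool name is taken from the event text above):

cd %windir%\system32\inetsrv
appcmd.exe set apppool "MyApplication" /recycling.logEventOnRecycle:Time,Requests,Schedule,Memory,IsapiUnhealthy,OnDemand,ConfigChange,PrivateMemory

Any deliberate recycle will then appear in the System event log with its cause; a WAS fatal-communication error with no matching recycle entry points to a process crash instead.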
After 4-5 days of assessing IIS and configuration versus internal code issues, I finally found the problem, with little to no help from the windbg or DebugDiag IIS tools. Those tools produce so much information, even with mini dumps or log trace stacks, that it is full of red herrings. The best bet was to reproduce it on an intelligently copied instance of a production system, which we did not have at the time and which took ops a while to set up.
Needless to say, the problem had to do with over-caching business objects. There was a race condition where updates on a certain table were updating an attribute of the corresponding cached business object (the updates were coming from multiple servers), which caused an out-of-control stack overflow that pretty much made the caching recursively cache itself to death, killing the w3wp.exe process and pseudo-recycling it. It was one of those edge cases that are incredibly hard to test and reproduce in a non-production environment.
