I find that my WCF service takes 8-10 seconds to respond to the first hit. After that, it takes less than a second.
Any thoughts?
Probably due to .NET's cold start. Have you looked at setting up the IIS Application Warm-Up module, which initializes dependencies before the first request arrives?
From the Learn IIS website:
Decrease the response time for first requests by pre-loading worker processes. The IIS Application Warm-Up module lets you configure the Web application to be pre-loaded before the first request arrives so that the worker process responds to the first Web request more quickly.
Increase reliability by pre-loading worker processes when overlapped recycling occurs. Because the recycled worker process in an overlapped recycling scenario only communicates its readiness and starts accepting requests after it finishes loading and initializing the resources as specified by the configuration, pre-loading the dependencies reduces the response times for the first requests.
Customize the pre-loading of applications. You can configure the IIS Application Warm-Up module to initialize Web applications by using specific Web pages and user identities. This makes it possible to create specific initialization processes that can be executed synchronously or asynchronously, depending on the initialization logic. In addition, these procedures can use specific identities to ensure a proper initialization.
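Note: the Application Warm-Up module beta was later superseded by the built-in IIS Application Initialization feature (IIS 8.0+, also available as a download for IIS 7.5). A minimal sketch of the web.config side, assuming a hypothetical /warmup page that touches your expensive dependencies:

    <!-- web.config -->
    <system.webServer>
      <applicationInitialization doAppInitAfterRestart="true">
        <add initializationPage="/warmup" />
      </applicationInitialization>
    </system.webServer>

For the pre-loading to happen before the first request, you also set startMode="AlwaysRunning" on the application pool and preloadEnabled="true" on the application.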
Related
Scenario: we have a PHP 7 web application running on IIS 10 via FastCGI. We did performance testing on our application and noticed a slowdown. We ruled out the database tier. We looked at the IIS server and, even under heavy load, when the app slows down dramatically, the resources are not strained: no CPU spikes, no RAM spikes. Digging further, we came to the conclusion that all the incoming requests are simply being queued. When some requests take longer (for some very large reports that take 1-3 minutes), every other request gets queued up waiting for the first ones to go through.
So, the question is: where do we look to increase the number of concurrent requests IIS can handle at one time?
I have found these settings under FastCGI, but VERY little documentation. Can someone explain what these four settings do?
Instance MaxRequests
Max Instances
Queue Length
Rapid Fails PerMinute
Are there any other settings we should be looking at under DefaultAppPool?
Queue Length
Maximum Worker Processes
Recycling
UPDATE:
A few things should be clarified for others who might search for this:
a request in this context means one call from the browser to the IIS server
we used JMeter (https://jmeter.apache.org/) to do some basic load testing
a request (for the sake of this context) from the browser to the IIS server is processed like this: browser > DefaultAppPool worker process (this defaults to 1 in IIS; look into web gardens if you want to increase it) > FastCGI instance (think of these as process threads; the language is tricky, and people on the web use threads/processes/instances interchangeably, which can be confusing). FastCGI defaults to 4 instances. This means that when 5 concurrent requests come in, they all get funneled through the 1 DefaultAppPool worker process, 4 of the 5 are worked on concurrently by FastCGI, and the 5th is queued. We tested this: after an IIS restart there are no w3wp.exe or php-cgi.exe processes running. When the 5 concurrent requests come in, w3wp.exe gets started and spawns 4 php-cgi.exe processes (check Task Manager).
to increase concurrency we set the FastCGI "Max Instances" setting to 0, which allows IIS to decide how many instances to run based on available resources, but you can also set it to a specific higher number (see the appcmd sketch below). We tested this and I believe it to be accurate. You can see the w3wp.exe process and the php-cgi.exe processes increasing in number as requests come in.
you should also be able to increase the DefaultAppPool worker processes; if you were to set this to 4 and leave the FastCGI instances at 4 as well, that should mean, in theory, that each worker process would spawn its own 4 FastCGI instances for a total of 4x4=16 concurrent requests. I have not tested this sufficiently to be 100% sure that that's how it actually plays out.
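For reference, the "Max Instances" change described above can be scripted with appcmd (a sketch; the php-cgi.exe path is illustrative and must match the fullPath registered for your FastCGI application):

    %windir%\system32\inetsrv\appcmd.exe set config -section:system.webServer/fastCgi "/[fullPath='C:\PHP\php-cgi.exe'].maxInstances:0" /commit:apphost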
Instance MaxRequests: Controls the FastCGI process-recycling behavior. Specifies the maximum number of requests that a FastCGI application is allowed to handle before the process is recycled. The default value is 200.
Max Instances: Specifies the maximum number of FastCGI processes to allow in the application process pool for the selected FastCGI application. This number also represents the maximum number of concurrent requests that the FastCGI application can handle. The default value is 4.
Queue Length: Specifies the maximum number of requests that are queued for the FastCGI application pool. When the queue is full, subsequent requests return the HTTP error code 503 (Service Unavailable) to clients. This error code indicates that the application is too busy. The default value is 1000.
Rapid Fails PerMinute: Specifies the number of FastCGI process failures allowed within one minute before IIS takes the application offline. The default value is 10. (The maximum time allowed for a single request is a separate setting, Request Timeout, which defaults to 90 seconds.)
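All four map to attributes on the FastCGI application entry in applicationHost.config. A sketch with the defaults, plus maxInstances="0" for automatic scaling; the php-cgi.exe path is illustrative:

    <!-- applicationHost.config, system.webServer/fastCgi section -->
    <fastCgi>
      <application fullPath="C:\PHP\php-cgi.exe"
                   maxInstances="0"
                   instanceMaxRequests="200"
                   queueLength="1000"
                   rapidFailsPerMinute="10" />
    </fastCgi>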
Application Pools:
Queue Length: Indicates to HTTP.sys how many requests to queue for an application pool before rejecting future requests. The default value is 1000.
Maximum Worker Processes: Indicates the maximum number of worker processes that would be used for the application pool.
For the attributes in Recycling, you can refer to this link:
https://learn.microsoft.com/en-us/iis/configuration/system.applicationhost/applicationpools/add/recycling/
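The two application pool settings above live in applicationHost.config as well; a sketch (the values are illustrative, not recommendations):

    <!-- applicationHost.config, system.applicationHost/applicationPools section -->
    <add name="DefaultAppPool" queueLength="5000">
      <!-- maxProcesses greater than 1 turns the pool into a web garden -->
      <processModel maxProcesses="4" />
    </add>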
It may also be relevant, since you say:
Digging further we came to the conclusion that all the requests coming in are simply being queued. When some of requests take longer (for some very large reports that take 1-3 minutes) then every other request is being queued up waiting for the first ones to go through.
If this is true within one session, it may also be PHP's session handler.
If you are using PHP sessions with the default configuration (files), you will encounter the described problem, because the first request locks the session file and a second request from the same session has to wait until the lock is released.
You can, for example, use another session handler. I preferred WinCache for a long time; now I use memcached.
Both can be used as session save handlers, and both are non-blocking.
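The memcached variant is a two-line php.ini change (a sketch; it requires the php-memcached extension, and the host/port are illustrative):

    ; php.ini
    session.save_handler = memcached
    session.save_path    = "127.0.0.1:11211"

If you have to stay on file-based sessions, calling session_write_close() as soon as a script is done writing session data releases the lock early, so a long-running report stops blocking other requests from the same session.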
What is the advantage of using a thread pool in Hystrix?
Suppose we are calling a third-party service. When we call a service or a DB, that thread goes into a waiting state, so what is the point of creating a new thread for each call?
So, I mean, how is the short-circuited (thread-pooled) method better than the normal (non-short-circuited) method?
Let's say a remote service (any service) starts to respond slowly, while a typical application (the service making calls to that remote service) still continues to call it. The short-circuited (thread-pooled) approach helps you build a defensive system in exactly this case.
The calling service does not know whether the remote service is healthy, and a new thread is spawned every time a request comes in, so threads on an already struggling server keep being consumed.
We don't want that to happen: we need those threads for other remote calls and for processes running on our own server, and we also want to avoid CPU utilization spiking. Isolating remote calls in a pool prevents resources from becoming blocked when latency occurs, and a bounded thread pool also gives downstream services some breathing room to recover.
For detail: ThreadPool in Hystrix
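To illustrate, a minimal sketch of a command with its own bounded thread pool (the group key, pool sizes, and callRemoteService() are illustrative, not from the question):

    import com.netflix.hystrix.HystrixCommand;
    import com.netflix.hystrix.HystrixCommandGroupKey;
    import com.netflix.hystrix.HystrixThreadPoolProperties;

    public class RemoteServiceCommand extends HystrixCommand<String> {

        public RemoteServiceCommand() {
            super(Setter.withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteService"))
                    .andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter()
                            .withCoreSize(10)         // at most 10 concurrent calls to the remote service
                            .withMaxQueueSize(5)));   // tiny queue; anything beyond fails fast to the fallback
        }

        @Override
        protected String run() throws Exception {
            // Runs on the Hystrix pool, not the caller's thread, so a slow
            // remote service cannot tie up the server's own request threads.
            return callRemoteService();
        }

        @Override
        protected String getFallback() {
            // Used when the pool is saturated, the call times out, or the circuit is open.
            return "default response";
        }

        private String callRemoteService() throws Exception {
            return "real response"; // placeholder for the actual HTTP/RPC call
        }
    }

The caller runs new RemoteServiceCommand().execute() and gets either a real response or the fallback within the configured timeout, no matter how sick the remote service is.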
I'm experiencing an issue with an ASP.NET Core application that virtually hangs when started under load. Once started, the application can handle the load without a problem. It feels like initialization is happening for every concurrent request. Is anyone else experiencing similar behavior?
test scenario:
start a test application that hits the service with 50 simultaneous tasks
start the service
notice the requests getting started, but with extremely long delays before any finish, and most time out
As a temporary workaround, I created middleware that throttles subsequent requests until the very first one finishes. This effectively lets ASP.NET MVC initialize before processing the bulk of the requests. This particular app is ASP.NET Core 1.1 (Web API) with EF Core.
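A minimal sketch of that kind of gate, assuming a SemaphoreSlim-based middleware (the class name is illustrative, not the poster's actual code):

    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.AspNetCore.Http;

    // Serializes requests until the very first one completes, so framework/EF
    // initialization happens once instead of under 50 concurrent requests.
    public class WarmupGateMiddleware
    {
        private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);
        private static volatile bool _warmedUp;

        private readonly RequestDelegate _next;

        public WarmupGateMiddleware(RequestDelegate next)
        {
            _next = next;
        }

        public async Task Invoke(HttpContext context)
        {
            if (_warmedUp)
            {
                await _next(context);
                return;
            }

            await Gate.WaitAsync();   // queue everyone behind the first request
            try
            {
                await _next(context);
                _warmedUp = true;
            }
            finally
            {
                Gate.Release();
            }
        }
    }

It would be registered early in Configure() with app.UseMiddleware<WarmupGateMiddleware>().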
When using a real database located halfway across the country, I see a good 900 ms delay on the first request to my ASP.NET Core Web API. This is because it needs to establish a connection pool for the database, and I do not eagerly create that pool when the service starts; instead, it is initialized lazily when I request a connection via a connection factory registered as a singleton in the services container.
Perhaps you're experiencing the same type of delay, since you stated you're using Entity Framework Core, which is presumably backed by SQL Server. Is the database connection pool being initialized lazily as part of that first request?
Try making a controller that has no dependencies and returns a vanilla 200 OK. Assuming you do not have global filters with expensive services to hydrate, that should show you the baseline initialization performance of your web service.
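Something like this, as a sketch (the route name is illustrative):

    using Microsoft.AspNetCore.Mvc;

    // No constructor dependencies and no database: any latency measured here
    // is framework start-up cost, not your application's services.
    [Route("api/[controller]")]
    public class BaselineController : Controller
    {
        [HttpGet]
        public IActionResult Get() => Ok();
    }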
For a web application deployed on Heroku (on a free dyno), is there a limit for making HTTP calls to an external web service?
The calls would be performed within the request, synchronously.
By limit I mean maximum number of requests, bandwidth and so on.
I hope the question is appropriate for this place.
The external web service might have its own rate limiting.
Other than that, you mention that "the calls would be performed within the request, synchronously".
In that case, you need to be aware of the Heroku Request Timeout. Your web dyno must not spend more than 30 seconds processing a single request, before it returns a response to the client. For decent performance, your web dyno should always respond to the client within a few hundred milliseconds, and delegate any long running jobs to a background worker dyno.
First-time poster, so go easy on me.
I am currently trying to address a performance issue when hitting my web service after a one-minute period of inactivity. Literally, after one minute of THAT user not hitting the web service, the next call takes 15 seconds before actually reaching the service operation. If you keep making random service operation calls (not the same operation each time, just so you don't think the call is being "cached"), the service returns immediately (in less than a second).
Here are some "timings" I decided to take so you can see how I came to the one minute of inactivity:
2:04PM
2:16PM --15 seconds
2:21PM --15 seconds
2:24PM --15 seconds
2:25PM --15 seconds
Again, if you hit the web service continuously, without a one-minute period of inactivity, ALL methods return in less than a second.
Here are some details regarding my web service:
WCF, WebHttpBinding, RESTful, using HTTPS.
Basic Authentication + Custom Authentication using IDispatchMessageInspector. Authentication happens with EVERY call (except to the Initializer.aspx page).
A custom Initialization.aspx page has been created, which is called every night after the application pool is recycled. This page caches a bunch of global data used by all users and kicks off the compile.
The application pool ONLY recycles every night at 2 AM. Worker processes are never killed off because the idle timeout is disabled.
I have heard about reliableSession, but as the name implies, it sounds like it would only apply to PerSession, not PerCall.
Is there any way to resolve this, or am I stuck resorting to "pinging" the server every 45 seconds with a dummy service operation?
Found the issue. We have multiple domain controllers. When a user was being authenticated, the lookup would start at the forest level and work its way down to the domain controller the server actually resided on. The firewalls that were in place were blocking all of the domain controllers except the one the server resided on.
So basically, authentication would fail to communicate with each of the other domain controllers in turn until it finally reached the only one it could.
You can fix this in a number of ways, but we just created firewall rules to allow the web server to communicate with the domain controller the users needed to be authenticated against.