Web API concurrency and scalability - asp.net-web-api

We need to convert a REST service built on custom code to Web API. The service receives a substantial number of requests and operates on data that can take some time to load, but once loaded it can be cached and used to serve all incoming requests. The previous version of the service had one thread responsible for loading the data and getting it into the cache. To prevent IIS from running out of worker threads, clients would get a "come back later" response until the cache was ready.
My understanding of Web API is that it has an asynchronous behavior built in by operating on tasks, and as a result the number of requests will not directly relate to the number of physical threads being held.
In the new implementation of the service I am planning to let the requests wait until the cache is ready and then send a valid reply. I have made a very rough sketch of the code to illustrate:
using System;
using System.Collections.Generic;
using System.Threading;
using System.Web.Http;

public class ContactsController : ApiController
{
    private readonly IContactRepository _contactRepository;

    public ContactsController(IContactRepository contactRepository)
    {
        if (contactRepository == null)
            throw new ArgumentNullException("contactRepository");
        _contactRepository = contactRepository;
    }

    public IEnumerable<Contact> Get()
    {
        return _contactRepository.Get();
    }
}

public class ContactRepository : IContactRepository
{
    private readonly Lazy<IEnumerable<Contact>> _contactsLazy;

    public ContactRepository()
    {
        _contactsLazy = new Lazy<IEnumerable<Contact>>(LoadFromDatabase,
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public IEnumerable<Contact> Get()
    {
        // Blocks the calling thread until the first caller has finished loading.
        return _contactsLazy.Value;
    }

    private IEnumerable<Contact> LoadFromDatabase()
    {
        // This method could take a long time to execute.
        throw new NotImplementedException();
    }
}
Please do not read too much into the design of the code; it is only constructed to illustrate the problem and is not how we did it in the actual solution. IContactRepository is registered in the IoC container as a singleton and is injected into the controller. The Lazy with LazyThreadSafetyMode.ExecutionAndPublication ensures that only the first thread/request runs the initialization code; the following requests are blocked until the initialization completes.
Would Web API be able to handle 1000 requests waiting for the initialization to complete, while other requests that do not hit this Lazy are being serviced, without IIS running out of worker threads?

Returning Task<T> from the action lets the work run on a background (ThreadPool) thread and releases the IIS request thread. So in this case, I would change
public IEnumerable<Contact> Get()
to
public Task<IEnumerable<Contact>> Get()
Remember to return a started task, otherwise nothing will ever run it and the request will just sit there.
A Lazy implementation, while it can be useful, has little to do with the behaviour of Web API, so I am not going to comment on that. With or without Lazy, a Task-based return type is the way to go for long-running operations.
I have got two blog posts on this which are probably useful to you: here and here.

Related

Is HealthIndicator thread safe?

Do I have to make the method check() thread-safe?
@Component
public class MyHealthIndicator implements HealthIndicator {
    @Autowired
    private MyComponent myComponent;

    @Override
    public Health health() {
        int errorCode = myComponent.check();
        if (errorCode != 0) {
            return Health.down().withDetail("Error Code", errorCode).build();
        }
        return Health.up().build();
    }
}
Is the request to the corresponding actuator endpoint executed in a separated thread?
The app logic itself has only one thread.
To answer the direct question you asked ...
Do I have to make the method check() thread-safe?
You don't have to make it thread-safe, but if your application requires that myComponent.check() is only executed by one thread at a time, then yes, you'll need to mark it synchronized.
To answer the more general question
Is HealthIndicator thread safe?
By default, each health check initiated (often by an HTTP call, perhaps to /actuator/health) will run on a single thread, and check the health of each component that's registered a HealthIndicator sequentially, and thus the individual request is single-threaded.
HOWEVER ... there's nothing to stop multiple clients each making a request to /actuator/health at the same time, and thus there may be multiple health checks in progress at the same time, each of which will be executing on a different thread.
Therefore, if there's some reason why myComponent.check() should not be executed by more than one thread concurrently, you will need to mark it synchronized or else add in some other concurrency limiting mechanisms (e.g. java.util.concurrent.Semaphore).
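As a rough illustration of the Semaphore option mentioned above, here is a minimal plain-Java sketch. GuardedCheck, the BUSY sentinel and the IntSupplier wrapper are hypothetical names introduced for this example, not part of Spring Boot or of the code in the question. It also makes one design assumption: instead of blocking, a second concurrent caller is told the check is busy, so health requests do not pile up behind a slow check.

```java
import java.util.concurrent.Semaphore;
import java.util.function.IntSupplier;

/** Hypothetical wrapper: lets at most one thread run the real check at a time. */
class GuardedCheck {
    /** Illustrative sentinel returned when another check is already running. */
    static final int BUSY = -1;

    // A single permit means a single concurrent check.
    private final Semaphore permit = new Semaphore(1);
    private final IntSupplier realCheck;

    GuardedCheck(IntSupplier realCheck) {
        this.realCheck = realCheck;
    }

    int check() {
        // tryAcquire() does not block; it fails immediately if the permit is taken.
        if (!permit.tryAcquire()) {
            return BUSY;
        }
        try {
            return realCheck.getAsInt();
        } finally {
            // Always give the permit back, even if the check throws.
            permit.release();
        }
    }
}
```

If callers should wait for their turn instead, replacing tryAcquire() with acquireUninterruptibly() gives behaviour close to a synchronized method.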

Safe processing data coming from KafkaListener

I'm implementing a Spring Boot app which reads some data from Kafka and provides it to all requesting clients. Let's say I have the following class:
@Component
public class DataProvider {
    private Prices prices;

    public DataProvider() {
        this.prices = Prices.of();
    }

    public Prices getPrices() {
        return prices;
    }
}
Each client may perform GET /api/prices to get the newest prices. Live price updates are consumed from Kafka. Because an update arrives only every 5 seconds, which is not very often, the topic has only one partition.
I tried the most basic option, a Kafka listener:
@Component
public class DataProvider {
    private Prices prices;

    public DataProvider() {
        this.prices = Prices.of();
    }

    public Prices getPrices() {
        return prices;
    }

    @KafkaListener(topics = "test-topic")
    public void consume(String message) {
        Prices prices = Prices.of(message);
        this.prices = prices;
    }
}
Is this approach safe?
The prices field must be volatile. But again, you need to be sure it is acceptable for different clients to see different data: one HTTP request may return one value while a concurrent request returns another, simply because the Kafka consumer has just updated it.
You could make consume() and getPrices() synchronized, so everyone gets the current data at the same moment. However, the calls will then no longer run in parallel, since synchronized ensures only one thread can access the object at a time.
Another way to get consistency is a ReadWriteLock. getPrices() calls can then run in parallel, but while consume() holds the write lock, everyone is blocked until it is done.
So, technically your code is safe. The only question is whether it is safe from a business point of view.
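To make the ReadWriteLock option concrete, here is a minimal plain-Java sketch. PriceBoard is a hypothetical stand-in for DataProvider, and a String stands in for the Prices type, whose definition is not shown in the question. For a single reference swap like this one a volatile field is usually enough; the lock starts to pay off when readers need several fields to change together.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical holder: many parallel readers, one exclusive writer. */
class PriceBoard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private String prices; // stand-in for the Prices object

    PriceBoard(String initialPrices) {
        this.prices = initialPrices;
    }

    /** Readers share the read lock, so concurrent GET requests do not block each other. */
    String getPrices() {
        lock.readLock().lock();
        try {
            return prices;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** The writer takes the write lock; readers wait until the update is complete. */
    void consume(String message) {
        lock.writeLock().lock();
        try {
            prices = message;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Whichever variant is used, it helps if the published object is immutable, so that a reader holding an older snapshot never sees it change underneath it.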

Spring-Boot: scalability of a component

I am trying out Spring Boot and thinking about scalability.
Let's say I have a component that does a job (e.g. checking for new mails).
It is done by a scheduled method, e.g.
@Component
public class MailMan
{
    @Scheduled(fixedRateString = "5000")
    private void run() throws Exception
    {
        //...
    }
}
Now the application gets a new customer. So the job has to be done twice.
How can I scale this component to exist or run twice?
Interesting question, but why multiple components per customer? Can't the scheduler pull the data for every customer on each scheduled run and process the records for each customer? Component scaling should not be decided by the entities involved in your application but by the component's resource utilization. You can have dedicated component types for processing messages from queues and others for REST, and scale them based on how heavily each of them is used.
Instead of using annotations to schedule a task, you could do the same thing programmatically by using a ScheduledTaskRegistrar. You can register the same bean multiple times, even if it is a singleton.
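A minimal plain-Java sketch of that idea: one job that visits every customer on each run, so adding a customer does not require a second component. MailSweep, addCustomer and the Consumer callback are hypothetical names for this illustration; in a Spring app the run() body would sit inside the existing scheduled method.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical job: a single scheduled run visits every known customer. */
class MailSweep implements Runnable {
    // CopyOnWriteArrayList lets customers be added while a sweep is iterating.
    private final List<String> customers = new CopyOnWriteArrayList<>();
    private final Consumer<String> checkMailFor;

    MailSweep(Consumer<String> checkMailFor) {
        this.checkMailFor = checkMailFor;
    }

    void addCustomer(String customerId) {
        customers.add(customerId);
    }

    @Override
    public void run() {
        for (String customerId : customers) {
            try {
                checkMailFor.accept(customerId);
            } catch (RuntimeException e) {
                // One failing mailbox should not stop the remaining customers.
                System.err.println("Mail check failed for " + customerId + ": " + e.getMessage());
            }
        }
    }
}
```

A new customer is then just another addCustomer call; the schedule itself stays unchanged.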
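import java.util.stream.IntStream;
import org.springframework.lang.NonNull;
import org.springframework.scheduling.annotation.SchedulingConfigurer;
import org.springframework.scheduling.config.ScheduledTaskRegistrar;

public class SomeSchedulingConfigurer implements SchedulingConfigurer {
    private final SomeJob someJob; // a bean that is Runnable

    public SomeSchedulingConfigurer(SomeJob someJob) {
        this.someJob = someJob;
    }

    @Override
    public void configureTasks(@NonNull ScheduledTaskRegistrar taskRegistrar) {
        int concurrency = 2;
        // Register the same job once per desired concurrent copy.
        IntStream.range(0, concurrency).forEach(
            __ -> taskRegistrar.addFixedDelayTask(someJob, 5000));
    }
}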
Make sure the executor you are using has enough threads to run that number of jobs concurrently. The default executor has exactly one thread :-). Be aware that this approach has scaling limits.
I also recommend adding a delay or skew between jobs, so that not all jobs run at exactly the same moment.
See SchedulingConfigurer and ScheduledTaskRegistrar for reference.
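The two recommendations above (enough threads, and a skew between jobs) can be sketched with the JDK's own ScheduledExecutorService. This is not the Spring registrar API; StaggeredScheduler and its methods are hypothetical helpers, and spreading the start times evenly across one period is just one possible skew policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: schedules several copies of a job with evenly spread start times. */
class StaggeredScheduler {

    /** Initial delays that spread `jobs` copies evenly across one period. */
    static long[] staggeredDelays(int jobs, long periodMillis) {
        long[] delays = new long[jobs];
        for (int i = 0; i < jobs; i++) {
            delays[i] = i * periodMillis / jobs;
        }
        return delays;
    }

    /** Schedules `copies` fixed-delay runs of the job, each with its own initial delay. */
    static List<ScheduledFuture<?>> schedule(ScheduledExecutorService pool, Runnable job,
                                             int copies, long periodMillis) {
        List<ScheduledFuture<?>> handles = new ArrayList<>();
        for (long delay : staggeredDelays(copies, periodMillis)) {
            handles.add(pool.scheduleWithFixedDelay(job, delay, periodMillis, TimeUnit.MILLISECONDS));
        }
        return handles;
    }
}
```

With two copies and a 5000 ms period, the second copy starts 2500 ms after the first. The pool passed in should have at least as many threads as copies, for example one created with Executors.newScheduledThreadPool(2), otherwise the copies simply queue behind each other.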
The job needs to run only once even with multiple customers. The component itself doesn't need to scale at all. It is just a mechanism to "signal" that some logic needs to be run at some moment in time. I would keep the component really thin and just call the desired business logic that handles all the rest, e.g.
@Component
public class MailMan {
    @Autowired
    private NewMailCollector newMailCollector;

    @Scheduled(fixedRateString = "5000")
    private void run() throws Exception {
        // Collects emails for customers
        newMailCollector.collect();
    }
}
If you want to check for new e-mails per customer you might want to avoid using scheduled tasks in a backend service as it will make the implementation very inflexible.
Better make an endpoint available for clients to call to trigger that logic.

Why is it beneficial to make async REST services?

Spring allows a method annotated with @RequestMapping to return a variety of objects, including a CompletableFuture or a Future. This allows me to spawn off an async method and let Spring return the value whenever it is ready. What I am not sure I understand is whether there are any benefits to this. For instance:
@RestController
public class MyController {
    @RequestMapping("/user/{userId}")
    public CompletableFuture<User> getUser(@PathVariable("userId") String userId) {
        return CompletableFuture.supplyAsync(
            () -> this.dataAccess.getUser(userId));
    }
}
In this case, even though the actual computation is happening in the background, the connection will still not close and the request thread will still be active until it is done. How is it better than, say:
@RequestMapping("/user/{userId}")
public User getUser(@PathVariable("userId") String userId) {
    return this.dataAccess.getUser(userId);
}
At first glance, this seems to be a better approach, as there is no overhead from an additional thread and a watcher that looks for completion.
This takes advantage of Servlet 3 asynchronous request processing, using the request.startAsync() method. Read here and here.
To achieve this, a Servlet 3 web application can call request.startAsync() and use the returned AsyncContext to continue to write to the response from some other separate thread. At the same time from a client's perspective the request still looks like any other HTTP request-response interaction. It just takes longer to complete. The following is the sequence of events:

ActionFilter for Nhibernate Transaction Management is this an ok way to go

I have the following wrapper:
public interface ITransactionScopeWrapper : IDisposable
{
    void Complete();
}

public class TransactionScopeWrapper : ITransactionScopeWrapper
{
    private readonly TransactionScope _scope;
    private readonly ISession _session;
    private readonly ITransaction _transaction;

    public TransactionScopeWrapper(ISession session)
    {
        _session = session;
        _scope = new TransactionScope(TransactionScopeOption.Required,
            new TransactionOptions {IsolationLevel = IsolationLevel.ReadCommitted});
        _transaction = session.BeginTransaction();
    }

    #region ITransactionScopeWrapper Members
    public void Dispose()
    {
        try
        {
            _transaction.Dispose();
        }
        finally
        {
            _scope.Dispose();
        }
    }

    public void Complete()
    {
        _session.Flush();
        _transaction.Commit();
        _scope.Complete();
    }
    #endregion
}
In my ActionFilter I have the following:
public class NhibernateTransactionAttribute : ActionFilterAttribute
{
    public ITransactionScopeWrapper TransactionScopeWrapper { get; set; }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        TransactionScopeWrapper.Complete();
        base.OnActionExecuted(filterContext);
    }
}
I am using Castle to manage my ISession using a lifestyle of per web request:
container.Register(
    Component.For<ISessionFactory>()
        .UsingFactoryMethod(x => x.Resolve<INHibernateInit>().GetConfiguration().BuildSessionFactory())
        .LifeStyle.Is(LifestyleType.Singleton));
container.Register(
    Component.For<ISession>()
        .UsingFactoryMethod(x => container.Resolve<ISessionFactory>().OpenSession())
        .LifeStyle.Is(LifestyleType.PerWebRequest));
container.Register(
    Component.For<ITransactionScopeWrapper>()
        .ImplementedBy<TransactionScopeWrapper>()
        .LifeStyle.Is(LifestyleType.PerWebRequest));
So now on to my questions.
1. Are there any issues with managing the transaction this way?
2. Do an ActionFilter's OnActionExecuting and OnActionExecuted methods run on the same thread?
I ask number 2 because BeginRequest and EndRequest are not guaranteed to run on the same thread, and if you tie transactions to them you will run into big problems.
In my ActionFilter TransactionScopeWrapper is property injected.
There are some other aspects you should also look into.
First, decide where to dispose of your transaction. Be aware that if you use lazy loading and pass a data entity back to your view and access a property or reference that is configured to be lazy loaded, you'll encounter problems because your transaction has already been closed in your OnActionExecuted. As far as I know you should only use viewmodels in your views, but sometimes an entity is a little more convenient. Regardless of the reason, if you do want to use lazy loading and access lazy-loaded members in your views, you'll have to move your transaction completion into the OnResultExecuted method so that it doesn't get committed prematurely.
Second, you should also look into checking whether there were any exceptions or model errors before committing your transaction. I ended up using inspiration from here and here for my final filter for dealing with my NHibernate transaction.
Third, if you decide to dispose of your transaction in the OnResultExecuted handler, make sure you do not do so for child action requests. The reason is that, like you, I scoped my session to the web request, but I found that child actions don't count as a new request, and when they are called and try to open their own session they get the already open session context instead. When the child action then completed, it tried to close ITS session but was actually closing the session used by the parent view as well. This caused any logic after the child action that relied on lazy-loaded data to fail as well.
I'd like to go through and try to remove my lazy loaded data from my app when it comes to views but until I get the time to do so you should be aware of these issues that may come up.
I was going to post my own action filter when I realized I had some DRY issues I needed to fix. Suffice it to say, I am checking filterContext.Exception and filterContext.ExceptionHandled to see whether there were any errors and whether they have already been handled. Note that just because an exception was handled doesn't mean that your transaction is OK to commit. And though this depends on how your app is coded, you may also want to check filterContext.Controller.ViewData.ModelState.IsValid before you commit your transaction.
UPDATE: Unlike you, I'm using StructureMap, not Castle, for dependency injection, but in my case I added this line to my Application_EndRequest method in the Global.asax file as a final bit of cleanup. I'm assuming there is something similar in Castle?
StructureMap.ObjectFactory.ReleaseAndDisposeAllHttpScopedObjects();
UPDATE 2: Anyway, a more direct answer to your question. I don't see anything wrong with using a wrapper like you opted to, though I am not sure why you feel the need to wrap it. NHibernate does a really good job of handling the transaction itself, so another abstraction layer around it seems unneeded to me. You could just as easily start the transaction explicitly in your OnActionExecuting and complete it explicitly in the OnActionExecuted. By retrieving the ISession object through the DependencyResolver you eliminate any concerns you may have with threading, as the IoC container is thread-safe I believe, and from there you can get your current transaction using Session.Transaction and check its current state from the IsActive property. My understanding is that it's possible for the two methods to run on different threads though, particularly when dealing with an action on a class inheriting from AsyncController.
I've got a problem with such a method. What does it do if you use @Html.Action("TestMethod", "TestController")?
As for me, I prefer to use an explicit transaction call:
using (var tx = session.BeginTransaction())
{
    // perform your insert here
    tx.Commit();
}
As for thread safety, I'd like to know too.

Resources