I am working on a project using ASP.NET Web API that will be receiving a large number of POST operations, where I will need to write many successive/simultaneous records to the DB. I don't have an exact number per second, so this is more of a conceptual design question.
I am thinking of having either a standard message queue (RabbitMQ, etc.) or an in-memory data store such as Redis handle the initial intake of the data, and then persisting that data to disk via another process (or even a built-in one, if the queue mechanism provides it).
I know I could also use threading to improve performance of the API.
Does anyone have any suggestions as far as which message queues or memory storage to look at or even just architectural recommendations?
Thanks for any and all help everyone.
-c
Using all this middleware will help your web application scale, but it still means the same load on your DB. Your ASP.NET Web API can be pretty fast just using async/await. With async/await you need to be careful to use it all the way down, from the controller to the database and external requests; don't block on Tasks (e.g. with .Result or .Wait()), because you will end up with deadlocks.
And don't spawn your own threads, because you will consume the application's threads and then it will not be able to scale; leave those threads to be used by ASP.NET Web API.
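To make "async all the way down" concrete, here is a minimal sketch of the same pattern expressed in Java with Spring MVC rather than ASP.NET (the controller, RecordRepository, and its saveAsync method are all invented for the example). Returning a CompletableFuture from the handler releases the request thread while the database write is in flight, which is the analogue of awaiting a Task instead of blocking on it:

```java
import java.util.concurrent.CompletableFuture;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Minimal sketch, not the poster's code: a handler that stays non-blocking
// from the controller down to the persistence call.
@RestController
public class RecordsController {

    /** Assumed interface: the future completes when the row is persisted. */
    interface RecordRepository {
        CompletableFuture<Long> saveAsync(String payload);
    }

    private final RecordRepository repo;

    public RecordsController(RecordRepository repo) {
        this.repo = repo;
    }

    @PostMapping("/records")
    public CompletableFuture<ResponseEntity<Void>> create(@RequestBody String payload) {
        // The request thread is released here; the response is written when
        // the future completes. Calling .join()/.get() here instead would be
        // the Java equivalent of the Task.Result deadlock trap noted above.
        return repo.saveAsync(payload)
                   .thenApply(id -> ResponseEntity.accepted().build());
    }
}
```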
Related
I am developing a web application backend with Spring, where client and server talk through RESTful APIs. There is a specific API that I expect will be hit much harder than the others. Is there any way to scale this specific API (like assigning more threads)?
In this application everything is interdependent, so I guess microservices wouldn't be the best approach.
There are two possible ways I can think of:
Use a load balancer; this will let you add multiple application instances of the REST API. This is the classical approach in such cases.
Depending on the existing implementation, the API can be refactored to just receive the message and decouple the processing onto a separate thread, as shown in the sketch after this list.
Your suggested way of increasing threads has limitations and requires more fine-tuning. If the use case is just to support a limited number of users, the Tomcat thread pool can be configured accordingly.
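A minimal sketch of option 2 in Spring (endpoint and class names are invented): the endpoint accepts the message, hands the work to a small pool, and frees the request thread immediately:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Illustrative only: accept the message, queue the processing, return 202.
@RestController
public class IngestController {

    // A small, bounded pool for the decoupled processing; an unbounded pool
    // would just move the overload problem elsewhere.
    private final ExecutorService worker = Executors.newFixedThreadPool(8);

    @PostMapping("/ingest")
    public ResponseEntity<Void> ingest(@RequestBody String payload) {
        worker.submit(() -> process(payload)); // off the request thread
        return ResponseEntity.accepted().build(); // 202: received, not yet processed
    }

    private void process(String payload) {
        // validate, persist, call downstream systems, etc.
    }
}
```

For option 3, if you are on Spring Boot, the Tomcat request pool is tuned with the server.tomcat.threads.max property (server.tomcat.max-threads in older versions); raising it helps only up to what the hardware and downstream resources can absorb.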
Just have multiple instances of the same service. REST has a statelessness constraint, so this is easy to do.
Let's say I have 22 microservices, developed locally with Docker.
The client wants to get product model data, which combines data from 3 different services and aggregates them.
Should I use an aggregator gateway API, or should the SPA fetch from each service separately? Does an aggregator service couple the services?
These microservices patterns always come with trade-offs. Here you need to consider more than just the coupling issue when you go with the Aggregator pattern (Backend for Frontend).
The following are some of the points you need to think about before going with this pattern.
The latency problem. If you want this implementation to perform well without latency problems, then your services and the aggregator should be in the same location or the same data center. Avoid third-party calls from the aggregator.
This can introduce a single point of failure. Make sure the aggregator is designed to be highly available.
Implement a resilient design with timeouts, since this aggregator calls other services to get data. If one or more service calls take too long, it should time out and return a partial set of data. Consider how your application will handle this scenario; see the sketch after this list.
Monitor your aggregator and its child service calls. Implement distributed tracing using correlation IDs to track each call.
Ensure the aggregator has adequate performance to handle the load and can be scaled to meet your anticipated growth.
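As a sketch of point 3, here is one way an aggregator in Java might give each downstream call a deadline and return a partial product model instead of failing outright; every type and service name below is invented for the example:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Invented names throughout; the pattern, not the API, is the point.
public class ProductAggregator {

    /** Each client is assumed to expose a non-blocking call. */
    interface ServiceClient {
        CompletableFuture<String> fetch(String productId);
    }

    private final ServiceClient details, pricing, stock;

    public ProductAggregator(ServiceClient details, ServiceClient pricing, ServiceClient stock) {
        this.details = details;
        this.pricing = pricing;
        this.stock = stock;
    }

    /** Fan out to all three services; a slow service yields a partial result. */
    public CompletableFuture<ProductModel> aggregate(String productId) {
        CompletableFuture<String> d = withDeadline(details.fetch(productId));
        CompletableFuture<String> p = withDeadline(pricing.fetch(productId));
        CompletableFuture<String> s = withDeadline(stock.fetch(productId));
        return CompletableFuture.allOf(d, p, s)
                .thenApply(v -> new ProductModel(d.join(), p.join(), s.join()));
    }

    // After 500 ms, substitute a null marker instead of failing the response;
    // the caller decides how to render the missing part.
    private static CompletableFuture<String> withDeadline(CompletableFuture<String> call) {
        return call.completeOnTimeout(null, 500, TimeUnit.MILLISECONDS);
    }

    /** A null field means "this part of the model timed out". */
    record ProductModel(String details, String pricing, String stock) {}
}
```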
These are the best practices I can suggest; you are the best person to decide, based on your system requirements and these points.
There are some compelling advantages to using a BfF service as an orchestration layer that aggregates calls to various backend data services.
It will reduce the complexity in the data access areas of your SPA.
It can also reduce load times.
Over time, your frontend devs will be less likely to get blocked on the backend devs, assuming the BfF is maintained by the frontend devs.
Take a look at this article on Consistency, Coupling, and Complexity at the Edge, which goes into more detail on this and proposes some best practices, such as GraphQL vs. REST.
In the web app that I'm currently working on, I have to make multiple calls to the database and combine the results at the end to show in the UI. Right now I'm making the calls one by one and combining the results afterwards. Since the web app will be hosted on a multi-core machine (Intel i5), I think I can use the TPL to make parallel DB calls. Is it a good idea? What are the things/pitfalls I should consider when making parallel calls to the DB?
There are two things to remember here. Firstly, your DB provider's API may not be thread-safe; for example, ADO.NET is explicitly not 100% thread-safe. Secondly, by doing this you are moving load from the client to the DB. In other words, if your client creates 5 concurrent connections to the DB at once, it has a larger impact on the DB's load. The latency seen by an individual user may be reduced, but at the expense of overall throughput in terms of the number of clients a single DB can support.
It largely depends on your scenario whether you think this is a good trade-off.
You say the "we app" if you mean web app then their are similar tradeoffs, I'd recommend this blog post on using the TPL from a web application.
http://blogs.msdn.com/b/pfxteam/archive/2010/02/08/9960003.aspx
It's the same issue: you trade individual request latency for throughput, or vice versa.
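The thread-safety caveat is not specific to ADO.NET; JDBC connections are not thread-safe either. Here is a hedged Java/JDBC sketch of the fan-out-and-combine pattern with one connection per parallel query (the DataSource, the SQL, and the Results type are all illustrative):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.sql.DataSource;

// Each task gets its OWN connection, because connection objects are
// generally not safe to share across threads.
public class ParallelQueries {

    public static Results fetchBoth(DataSource ds) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Integer> orders = pool.submit(() -> count(ds, "SELECT COUNT(*) FROM orders"));
            Future<Integer> users  = pool.submit(() -> count(ds, "SELECT COUNT(*) FROM users"));
            return new Results(orders.get(), users.get()); // combine at the end
        } finally {
            pool.shutdown();
        }
    }

    private static int count(DataSource ds, String sql) throws SQLException {
        try (Connection c = ds.getConnection();  // one connection per task
             Statement st = c.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public record Results(int orders, int users) {}
}
```

Note that the two concurrent connections also illustrate the answer's second point: the individual request finishes sooner, but the database is doing two things at once.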
I am writing a Java EE application that is supposed to consume SAP BAPIs/RFCs using JCo and expose them as web services to other downstream systems. The application needs to scale to huge volumes, on the order of tens of thousands of requests and thousands of simultaneous users.
I would like to have suggestions on how to design this application so that it can meet the required volume.
It's good that you are thinking of scalability right from the design phase. Martin Abbott and Michael Fisher (of PayPal/eBay fame) lay out a framework called the AKF Scale Cube for scaling web apps. The main principle is to scale your app along 3 axes.
X-axis: cloning of services/data such that work can be easily distributed across instances. For a web app, this implies the ability to add more web servers (clustering).
Y-axis: separation of work by responsibility, action, or data. For example, in your case you could serve different API calls from different servers.
Z-axis: separation of work by customer or requester. In your case you could say that requesters from region 1 access Server 1, requesters from region 2 access Server 2, and so on.
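As a toy illustration of the Z-axis (hostnames and region codes below are invented), the routing decision can be as small as:

```java
// Toy Z-axis router: direct each requester to the shard for its region.
public class RegionRouter {
    static String backendFor(String region) {
        switch (region) {
            case "region-1": return "https://api-r1.example.com";
            case "region-2": return "https://api-r2.example.com";
            default:         return "https://api-global.example.com";
        }
    }
}
```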
Design your system so that you can follow all 3 above if you need to. But when you initially deploy, you may not need to use all three methods.
You can check out the book "The Art of Scalability" by the above authors: http://amzn.to/oSQGHb
A final answer is not possible, but based on the information you provided this does not seem to be a problem, as long as your application is stateless, so that it only forwards requests to SAP and returns the responses and maintains no state at all. When it comes to things like asynchronous message handling, temporary database storage, or session state management, it becomes more complex. If there really is no need to maintain state, you can easily scale out your application to dozens of application servers without changing your application architecture.
In my experience this is not necessarily the case when it comes to SAP integration, think of a shopping cart you want to fill based on products available in SAP. You may want to maintain this cart in your application and only submit the final cart to SAP. Otherwise you end up building an e-commerce application inside your backend.
Most important is that you keep CPU utilization in your application low, to avoid a cluster that grows too large, and that you reduce all kinds of I/O wherever possible, e.g. small SOAP messages to reduce network I/O.
Furthermore, I recommend designing a proper abstraction layer on top of JCo, including the JCO.PoolManager for connection pooling. You may also need a well-thought-out authorization concept if you work with a connection pool managed by only one technical user.
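For reference, here is a rough sketch of what such an abstraction layer might look like with the current JCo 3.x API (JCO.PoolManager belongs to the older JCo 2.x API; in JCo 3.x, pooling is configured per destination, e.g. via the jco.destination.pool_capacity property). The destination and function names are placeholders:

```java
import java.util.Map;
import com.sap.conn.jco.JCoDestination;
import com.sap.conn.jco.JCoDestinationManager;
import com.sap.conn.jco.JCoException;
import com.sap.conn.jco.JCoFunction;
import com.sap.conn.jco.JCoParameterList;

// Rough sketch of a thin facade over JCo 3.x; destination and function
// names are placeholders. Pooling is configured on the destination itself,
// not in application code.
public class SapGateway {

    private final String destinationName; // e.g. "MY_SAP_SYSTEM"

    public SapGateway(String destinationName) {
        this.destinationName = destinationName;
    }

    /** Execute a BAPI/RFC with simple string imports and return its exports. */
    public JCoParameterList call(String functionName, Map<String, String> imports)
            throws JCoException {
        JCoDestination dest = JCoDestinationManager.getDestination(destinationName);
        JCoFunction fn = dest.getRepository().getFunction(functionName);
        if (fn == null) {
            throw new IllegalArgumentException("Unknown function: " + functionName);
        }
        imports.forEach((name, value) -> fn.getImportParameterList().setValue(name, value));
        fn.execute(dest); // the connection is borrowed from and returned to the pool
        return fn.getExportParameterList();
    }
}
```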
Just some (not well structured) thoughts...
I've found no clear answer so far, but maybe I've searched the wrong way.
My question is: can Core Data be used as a persistence store for a server project? Where are Core Data's limits, and how much data can be handled with Core Data and SQLite? SQLite should handle a lot of data very well, according to its website. I know of a proprietary Java persistence manager with an Oracle DB as storage that handles millions of entries and 3000 clients without problems. For my own project I wonder if I can use Core Data on the server side for user management and internal microblogging/texting with up to 5000 clients. Will it handle such large amounts of data, or do I have to manage something like that myself? Does anyone happen to have experience with huge amounts of data and Core Data?
Thank you
twickl
I wouldn't advise using Core Data for a server-side project. Core Data was designed to handle the data of individual, object-oriented applications; therefore it lacks many of the common features of dedicated server software, such as easily handling multiple simultaneous accesses.
Really, the only circumstance where I would advise using it is when the server side logic is very complex and the number of users small. For example, if you wanted to write an in house web app and have almost all the logic on the server, then Core Data might serve well.
Apple used to have WebObjects which was a package to manage servers using an object-oriented DB much like Core Data. (Core Data was inspired by a component of WebObjects called Enterprise Objects.) However, IIRC Apple no longer supports WebObjects for external use.
You're better off using one of the many dedicated server packages out there than trying to roll your own.
I have no experience using Core Data in the manner you describe, but my understanding of the architecture leads me to believe that it could be used, depending on how you plan to query and manipulate the data.
Core Data is very good at maintaining an object graph and using faults to bring parts into memory as needed. In that manner, it could be good on a server for reducing memory requirements even with a large data set.
Core Data is not very good at manipulating collections of objects without loading them into memory, making a change, and writing them back out to disk. Brent Simmons wrote a blog post about this, where he decided to stop using Core Data for some of his RSS reader's model objects because an operation like "mark all as read" didn't scale. While you would like to be able to say something like UPDATE articles SET status = 'read', Core Data must load each article, set its status property, then write it back to disk.
This isn't because Apple engineers are stupid, but because the query layer can't make assumptions about the storage layer (you could be using XML instead of SQLite) and it also must take into account cascading changes and the fact that some article objects may already be loaded into memory and will need to be updated there.
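The trade-off is easier to see outside of Core Data. Here is the same contrast sketched in plain Java/JDBC (Article, ArticleRepository, and the schema are stand-ins): one set-based statement versus the per-object load-modify-save loop the blog post describes.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import javax.sql.DataSource;

// Illustrative contrast, not Core Data code: the same "mark all as read"
// operation done set-based versus object-by-object.
public class MarkAllAsRead {

    // One statement, one round trip: the database does all the work.
    static void setBased(DataSource ds) throws SQLException {
        try (Connection c = ds.getConnection(); Statement st = c.createStatement()) {
            st.executeUpdate("UPDATE articles SET status = 'read'");
        }
    }

    // Object-graph style: every row is materialized, mutated, and written
    // back individually. This is the pattern that did not scale.
    static void objectGraphStyle(ArticleRepository repo) {
        List<Article> all = repo.loadAll();   // N rows into memory
        for (Article a : all) {
            a.status = "read";
            repo.save(a);                     // N individual writes
        }
    }

    static class Article { String status; }

    interface ArticleRepository {
        List<Article> loadAll();
        void save(Article a);
    }
}
```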
Note that you can also write your own storage providers for Core Data, see Aaron Hillegass's BNRPersistence project. So if Core Data was "mostly good" you might be able to improve on it for your application.
So, a possible answer to your question is that Core Data may be appropriate for your application, as long as you do not need to rely on batch updates to large numbers of objects. In general, no algorithm or data structure is appropriate for every scenario. Engineering is about wisely choosing between trade-offs. You won't find anything that works well for many clients in every case; it always matters what you are doing.