Choosing between performance and ease of use in a WS API

We're in the process of creating a new API for our product, which will be exposed via web services. We have an internal debate about whether the API should be as easy as possible to use (at the price of making more calls) or as efficient as possible (making it harder to use). For example, here are two issues that came up:
Should we manage session info on the server, or should we pass that info back to the user, and expect him to send it back to us when needed? (Please ignore the security implications of a session)
Should we combine calls that are likely to be made consecutively, in order to save the round trip between them, even if they don't really share the same logical functionality?
Basically, the desktop people are in favor of a clear, easy to use API, while the internet people would like to make it as efficient as possible. This is the first public API we're providing, and we need a strategy.
Personally, I'm in favor of making the API as usable as possible. Other components of the system will probably have a much larger effect on performance, and hard-to-use APIs are much more error-prone. But I'm a desktop programmer…
So, what should be our strategy? What are the common practices when creating such an API?

What are the typical usage scenarios? Will the latency of your API determine the UI responsiveness? How big will the performance trade-off be? It's hard to suggest anything without knowing your circumstances.
But my guess is that passing session info to the client will scale better. In-proc session management won't allow you to share state between service instances, and managing sessions in a DB will make your services more complex. Anyway, this all depends on your bandwidth/memory/computational capabilities.
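As a rough illustration of the client-held-session idea, here is a minimal sketch (Java, all names hypothetical): the server hands the client an opaque, HMAC-signed blob and verifies it on each call, so no instance needs shared session state.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: the server issues an opaque, signed session blob to the client
// and verifies it on every call, so no service instance keeps in-proc session state.
public class ClientSessionCodec {
    private final SecretKeySpec key;  // signing key shared by all service instances

    public ClientSessionCodec(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    // "payload" is whatever session info you need, e.g. "userId=42;lang=en"
    public String issue(String payload) throws Exception {
        String data = Base64.getUrlEncoder().encodeToString(payload.getBytes(StandardCharsets.UTF_8));
        return data + "." + sign(data);
    }

    // Returns the payload if the signature checks out, otherwise rejects the token.
    public String verify(String token) throws Exception {
        String[] parts = token.split("\\.");
        if (parts.length != 2 || !sign(parts[0]).equals(parts[1])) {
            throw new IllegalArgumentException("invalid session token");
        }
        return new String(Base64.getUrlDecoder().decode(parts[0]), StandardCharsets.UTF_8);
    }

    private String sign(String data) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(key);
        return Base64.getUrlEncoder().encodeToString(mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
    }
}
```

In practice you would more likely use an established token format (e.g. a signed JWT) than roll your own, but the scaling property is the same: any instance can validate the session without shared state.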
I would start from the most granular operations and only provide composite methods when a performance problem becomes obvious.

IMO, the best approach is to keep the API as simple as possible for the people who have to use it, while still giving them deep customization options to make it efficient, even if those are harder to use.
But simplicity for users > all.

I would recommend you create your web service following the RESTful model and that means stateless. Efficiency is more critical than ease of use. You can always create a framework on top of the API later that eases implementation headaches.

You're debating a circular argument. This is all usability (ease of use).
You're likely going to factor in a bit of both aspects - as they both influence user performance.
It's a case of (i) the extent of customisable features versus (ii) efficiency in manipulating those features. There will be an intersection between the two.
I'd say give a simple API that puts control into the users' hands - this is the primary purpose of a CMS. The more detailed aspects you could combine initially, introducing them as added control later on.
This way you manage the users' learning curve of the system, so as (i) not to bombard them with excessive options initially and (ii) to allow them to adopt your system quickly at first. Then extend your users' control (system functionality from a user perspective) later on by making more of the API features available.
Another good tip would be to ask your users upfront & as you go.

My 2 cents:
First, start with a very granular low-level API that gives high performance.
Then create an easy-to-use high-level API that is "composed" of the above low-level API.
This way clients can customize their behavior as they want. Clients can start with the high-level API, but if high performance is needed for certain user actions, they can use the faster-but-more-difficult low-level API on a case-by-case basis.
One more important point to consider in service design: try to keep the services as stateless as possible. The advantage of stateless web services is that they can easily be distributed using network load balancing.
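As a rough illustration of that low-level/high-level layering (all names here are hypothetical), the high-level call is just a convenience wrapper composed from the granular operations, so clients can drop down to the low-level API whenever a round trip really matters:

```java
// Hypothetical low-level, granular operations: each maps to one fine-grained service call.
interface OrderServiceLowLevel {
    String createOrder(String customerId);
    void addItem(String orderId, String sku, int quantity);
    void submitOrder(String orderId);
}

// Easy-to-use facade composed from the granular calls. Clients who need fewer
// round trips can bypass it and use the low-level interface directly.
class OrderServiceFacade {
    private final OrderServiceLowLevel lowLevel;

    OrderServiceFacade(OrderServiceLowLevel lowLevel) {
        this.lowLevel = lowLevel;
    }

    /** One simple call for the common case: create, fill and submit an order. */
    String placeOrder(String customerId, java.util.Map<String, Integer> items) {
        String orderId = lowLevel.createOrder(customerId);
        items.forEach((sku, qty) -> lowLevel.addItem(orderId, sku, qty));
        lowLevel.submitOrder(orderId);
        return orderId;
    }
}
```

The design choice here is simply the facade pattern: the composite call saves round trips and cognitive load for the common case, while the granular interface stays available for clients that need fine control.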

Related

Should event driven architecture be targeted for all data & analytics platforms?

For example,
You have an IT estate where a mix of batch and real-time data sources exists from multiple systems, e.g. ERP, Project management, asset, website, monitoring etc.
The aim is to integrate the data sources into a cloud environment (agnostic).
There is a need for reporting and analytics on combinations of all data sources.
Inevitably, some source systems are not capable of streaming, hence batch loading is required.
There are potential use cases for triggering functionality/changes/updates based on the ingested data.
Given a steer for creating a future-proofed platform, architecturally, how would you look to design it?
It's a very open-ended question, but there are some good principles you can adopt to help point you in the right direction:
Avoid point-to-point integration, and get everything going through a few common points - ideally one. Using an API gateway can be a good place to start; the big players (Azure, AWS, GCP) all have their own options, plus there are lots of decent independent ones like Tyk or Kong.
Batches and event streams are totally different, but even then you can still potentially route them all through the gateway so that you get centralised observability (reporting, analytics, alerting, etc.).
Use standards-based API specifications where possible. A good REST-based API, built on a proper resource model, is a non-trivial undertaking, and it may not fit if you are dealing with lots of disparate legacy integration. If you do adopt REST, use OpenAPI to specify the APIs. Using this standard not only makes it easier for consumers, but also gives you better tooling, as many design, build and test tools support OpenAPI. There's also AsyncAPI for event/async APIs.
Do some architecture. Moving sh*t to the cloud doesn't remove the sh*t - it just moves it to the cloud. Don't recreate old problems in a new place.
Work out the logical components in your new solution: what does each of them do (what's its reason to exist)? Don't forget ancillary components like API catalogues, etc.
Think about layering the integration (usually depending on how they will be consumed and what role they need to play, e.g. system interface, orchestration, experience APIs, etc).
Want to handle data in a consistent way regardless of source (your 'agnostic' comment)? You'll need to think through how data is ingested and processed. This might lead you into more data / ETL centric considerations rather than integration ones.
Co-design. Is the integration mainly data coming in or going out? Is the integration with 3rd parties or strictly internal?
If you are designing for external / 3rd party consumers then a co-design process is advised, since you're essentially designing the API for them.
If the APIs are for internal use, consider designing them as if they were for external use, so that when/if you decide to open them up later it's not so hard.
Take a step back:
Continually ask yourselves "what problem are we trying to solve?". Usually, a technology initiative is successful if there's a well-understood reason for doing it, which has solid buy-in from the business (non-IT).
Who wants the reporting, and why - what problem are they trying to solve?
As you mentioned, it's an IT estate (i.e. an enterprise-level solution with a mix of batch and real-time sources), so first you have to identify the end goal of this migration. You can think about refactoring applications. If you are trying to make it event-driven, then assess the refactoring effort and cost. Separation of responsibility is the key factor for refactoring and migration.
If you are thinking about future-proofing your solution, then consider the cloud for storing and processing your data. It won't necessarily be cheap, but a mix of cloud and on-prem could be a way forward. There are services from cloud providers to move your data at minimal cost, and cloud-native solutions are available for analysing it. A database migration service in AWS or Azure can move the data and then capture ongoing changes, so you can keep using your on-prem DB and apps while performing analysis and reporting in the cloud. This eases the load on your transactional DB, and most data sync from on-prem to cloud is near real time.

Spring HATEOAS: Practicable for a microservice architecture?

I know this question was already asked but I could not find a satisfying answer.
I started to dive deeper into building a real RESTful API and I like its constraint of using links for decoupling. So I built my first service (with Java / Spring) and it works well (although I struggled a bit with finding the right format, but that's another question). After this first step I thought about my real-world use case: microservices. Highly decoupled individual services. So I mapped my previous scenario onto that and came up against some problems or doubts.
SCENARIO:
My setup consists of a reverse proxy (Traefik, which acts as service discovery and API gateway) and 2 microservices. In addition, there is an OpenID Connect security layer. My services are a Player service and a Team service.
So after auth I have an access token with the userId, and I am able to call player/userId to get the player information and teams?playerId=userId to get all the teams of the player.
In my opinion, both responses should link to the opposite service: the player/userId response would link to teams?playerId=userId and vice versa.
QUESTION:
I haven't found a solution besides linking via a hardcoded URL. But this comes with so many downsides that I can't imagine it's the solution used in real-world applications. I mean, just imagine your API is a bit more advanced and you have to link to 10 resources. If something changes, you have to refactor and redeploy them all.
Besides the synchronisation problem, how do you handle state in such a case? I mean, REST is all about state transfer. So I won't offer the link from the player to the Teams service if the player is in no team. Of course I can add the team IDs as an attribute of the player to decide whether to include the link or not, but this again increases coupling between the services.
The more I dive in, the more obstacles I find, and I'm about to just stick with my Spring REST Docs and neglect the core of REST, which is a pity to me.
Practicable for a microservice architecture?
Fielding, 2000
The REST interface is designed to be efficient for large-grain hypermedia data transfer, optimizing for the common case of the Web, but resulting in an interface that is not optimal for other forms of architectural interaction.
Fielding, 2008
REST is intended for long-lived network-based applications that span multiple organizations.
It is not immediately clear to me that "microservices" are going to fall into the sweet spot of "the web". We're not, as a rule, trying to communicate with a microservice that is controlled by another company; we often don't get a lot of benefit out of caching, or code on demand, or the other REST architectural constraints. How important is it to us that we can use general-purpose components to exchange information between different microservices within our solution? And so on.
If something changes, you have refactor and redeploy them all.
Yes; and if that's going to be a problem for us, then we need to invest more work up front to define a stable interface between the two. (The fact that we are using "links" isn't special in that regard - if these two things are going to talk to each other, then they are going to need to speak a common language; if that common language needs to evolve over time (likely) then you need to build those capabilities into it).
If you want change over time, then you have to plan for it.
If you want backwards/forwards compatibility, then you have to plan for it.
Your identifiers don't need to be static - there are lots of possible ways of deferring the definition of an identifier; the most obvious being that you can use one identifier to look up the identifier you want, or the formula for calculating it, or whatever.
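As a rough sketch of what "deferring the identifier" can look like in the Player service (assuming Spring HATEOAS 1.x; the registry, assembler and Player type are hypothetical), the Team service's base URL is resolved at runtime - from configuration or service discovery - rather than baked into the code:

```java
import org.springframework.hateoas.EntityModel;
import org.springframework.hateoas.Link;

// Hypothetical lookup: could be backed by configuration, Traefik/Consul discovery, etc.
interface ServiceRegistry {
    String baseUrlOf(String serviceName);
}

// Minimal player representation for the sketch (Java 16+ record).
record Player(String id, boolean inATeam) {}

class PlayerAssembler {
    private final ServiceRegistry registry;

    PlayerAssembler(ServiceRegistry registry) {
        this.registry = registry;
    }

    EntityModel<Player> toModel(Player player) {
        EntityModel<Player> model = EntityModel.of(player);
        // Only advertise the "teams" transition when it actually applies, and
        // resolve the Team service's address instead of hardcoding it.
        if (player.inATeam()) {
            String href = registry.baseUrlOf("team-service") + "/teams?playerId=" + player.id();
            model.add(Link.of(href, "teams"));
        }
        return model;
    }
}
```

The client still just follows the "teams" link it is given; only the server-side construction of that link changes when the Team service moves.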
Think about how Google works - the links they use change all the time, but it doesn't matter, because the protocol (refresh your bookmarked search form, enter your text in "the" one field, click the button) hasn't changed in 20 years. The interface is stable (even though the underlying spellings of the identifiers are not) and that's enough.

System API in mulesoft

I have a requirement to persist some data in a table (single table). The data is coming from the UI. Do I need to write just the system API and persist the data, or do I need to write both a process API and a system API? I don't see a use for a process API in this case. Please suggest. Is it always necessary to access a system API through a process API, or can a system API be invoked without a process API as well?
I would recommend a fine-grained approach to this. We should be following it through the experience layer even though we do not have much customization of the data.
In short: an experience layer API directly calling the system layer API (if there is no orchestration/data conversion/formatting needed).
Why do we need a system API & experience API? A couple of points:
The system API should be tied to the underlying system, so that if the system changes in the future, the change does not impact any of the clients.
Secondly, an upper layer gives us the ability to attach different SLAs, policies, logging and lots more to different clients. Even if you have a single client right now, it's better to architect for the future. Reuse is the key advantage of these APIs.
Please check Pattern 2 in this document
That is a question for the enterprise architect in your organisation. In this case, the process API would probably be a simple proxy for the system API, but that might not always be the case in future. Also, it is sometimes useful to follow a standard architectural pattern even if it creates some spurious complexity in the implementation. As always, there are design trade-offs and the answer will depend on factors that cannot be known by people outside of your organisation.

using both api gateway and message broker in microservice

I have a question about microservice implementation. Right now I am using an API gateway to process all GET requests to my individual services and using Kafka to handle asynchronous POST, PUT and DELETE requests. Is this a good way of handling requests in a microservice architecture?
Your question is too unspecific to give a good answer. What makes a good architecture totally depends on the details of your use cases. Are you serving web pages, streaming media, amassing data for analysis, or something completely different? We would also need to know your requirements in terms of concurrency, consistency and scalability. What are the constraints on budget, size of development teams, ease of development, dev skills, etc.?
For example the decisions you have taken may be considered good if you have strong requirements for a highly scalable input of large data sets and very frequent data collection as well as the team to support it. But it may be considered bad if you have a small team only and are trying to get a quick and cheap MVP for a new service that has limited scalability requirements (because the complexity of the solution slows down your development unnecessarily).
It may be good because the development team is familiar with those technologies and can effectively develop with those. Or it may be bad because your team does not know anything about those and the investment in learning those will not be justifiable by long term gains.
Don't forget that one of the ideas of the microservices architectural style is that each service can be owned by a distinct team that makes its own decisions about what technology to use for implementation (for whatever reason: ease of development, business reasons, etc.). So in other words, the microservices style embraces the old wisdom that architecture follows organization.
Here is a link to a recommended further read.

Best scaling methodologies for a high-traffic web application?

We have a new project for a web app that will display banners ads on websites (as a network) and our estimate is for it to handle 20 to 40 billion impressions a month.
Our current language is ASP... but we are moving to PHP. Does PHP 5 have limits when scaling a web application? Or should I have our team invest in picking up JSP?
Or, is it a matter of the app server and/or DB? We plan to use Oracle 10g as the database.
No offense, but I strongly suspect you're vastly overestimating how many impressions you'll serve.
That said:
PHP or other languages used in the application tier really have little to do with scalability. Since the application tier delegates its state to the database or equivalent, it's straightforward to add as much capacity as you need behind appropriate load balancing. Choice of language does influence per-server efficiency and hence costs, but that's different from scalability.
It's scaling the state/data storage that gets more complicated.
For your app, you have three basic jobs:
what ad do we show?
serving the ad
logging the impression
Each of these will require thought and likely different tools.
The second, serving the ad, is the simplest: use a CDN. If you actually serve the volume you claim, you should be able to negotiate favorable rates.
Deciding which ad to show is going to be very specific to your network. It may be as simple as reading a few rows from a database that give ad placements for a given property for a given calendar period. Or it may be complex contextual advertising like google. Assuming it's more the former, and that the database of placements is small, then this is the simple task of scaling database reads. You can use replication trees or alternately a caching layer like memcached.
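As a small sketch of that read-through pattern (plain in-memory Java here as a stand-in for memcached or a read replica; all names are hypothetical), the hot lookup is answered from the cache and only misses fall through to the placements database:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through cache for ad placements: the hot lookup ("which ads for this
// property right now?") is answered from memory; only misses hit the database.
class PlacementCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<String>> loadFromDatabase;  // hypothetical DB read

    PlacementCache(Function<String, List<String>> loadFromDatabase) {
        this.loadFromDatabase = loadFromDatabase;
    }

    List<String> placementsFor(String propertyId) {
        return cache.computeIfAbsent(propertyId, loadFromDatabase);
    }
}
```

In production the map would be replaced by memcached (or similar) with an expiry, so placement changes propagate within the TTL.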
The last will ultimately be the most difficult: how to scale the writes. A common approach would be to still use databases, but to adopt a sharding strategy. More exotic options might be a key/value store that supports counter instructions, such as Redis, or a scalable OLAP database such as Vertica.
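A minimal sketch of that counter approach, assuming the Jedis client for Redis (the key naming and shard-picking scheme are illustrative): each impression becomes a single cheap INCR against one of several shards.

```java
import redis.clients.jedis.Jedis;
import java.util.List;

// Impression logging as sharded counters: each hit is one INCR against one of
// several Redis shards, chosen by hashing the counter key.
// (Connection pooling and error handling are omitted for brevity.)
class ImpressionLogger {
    private final List<Jedis> shards;  // one client per Redis shard

    ImpressionLogger(List<Jedis> shards) {
        this.shards = shards;
    }

    void logImpression(String adId, String hourBucket) {
        String key = "impressions:" + adId + ":" + hourBucket;  // e.g. "impressions:ad42:2024-05-01T13"
        int shard = Math.floorMod(key.hashCode(), shards.size());
        shards.get(shard).incr(key);
    }
}
```

The counters can then be rolled up into the reporting store asynchronously, keeping the hot write path to a single in-memory increment per impression.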
All of the above assumes that you're able to secure data center space and network provisioning capable of serving this load, which is not trivial at the numbers you're talking.
You do realize that 40 billion per month is roughly 15,500 per second, right?
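(For the arithmetic, assuming a 30-day month: 40,000,000,000 ÷ (30 × 24 × 3,600 s) ≈ 15,400 requests per second on average, before allowing for traffic peaks.)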
Scaling isn't going to be your problem - infrastructure period is going to be your problem. No matter what technology stack you choose, you are going to need an enormous amount of hardware - as others have said in the form of a farm or cloud.
This question (and the entire subject) is a bit subjective. You can write a dog slow program in any language, and host it on anything.
I think your best bet is to see how your current implementation works under load. Maybe just a few tweaks will make things work for you - but changing your underlying framework seems a bit much.
That being said - your infrastructure team will also have to be involved as it seems you have some serious load requirements.
Good luck!
I think that it is not a matter of language; it can be a matter of database speed as much as CPU processing speed. Have you considered a web farm? That way you can have more than one machine serving your application. There are several ways to implement this solution; you can start with two servers and add more as the app requires more processing volume.
On another point, Oracle 10g is a very good database server; in my humble opinion a standalone Oracle server is enough to handle the volume of requests. Remember that a SQL server is faster when people request more or less the same things each time, which is what happens in a web application if you plan your database schema carefully.
You should also check out the existing ad server solutions - there are some very good ones; just try Googling "open source ad servers".
PHP will be capable of serving your needs. However, as others have said, your first limits will be your network infrastructure.
But your second limits will be writing scalable code. You will need good abstraction and isolation so that resources can easily be added at any level. Things like a fast data-object mapper, multiple data caching mechanisms, separate configuration files, and so on.
