Setup
My app uses gRPC for frontend/backend separation
Backend (server + services) is written in Python
Goals
I'd love to achieve live-updates including:
Updating existing services
Adding new services
Removing services
Problem
I can achieve #1 by using Python's importlib feature. But there are fewer options for adding/removing services. It seems to depend on the app's gRPC implementation. The major constraints seem to be the fact that servicers can only be registered before running the server, i.e., through the call to
add_MyServiceServicer_to_server()
So does adding reflection support, which is through the call to
service_names = [
MyService_pb.DESCRIPTOR.services_by_name[''].full_name,
...
]
reflection.enable_server_reflection(service_names, my_server)
Solution Candidates
Approach 1: Have one servicer per service, much like the official SayHello example of gRPC
Approach 2: Have one giant servicer that includes all the other services as its RPC methods.
Approach 1 seems to be intuitive, but it won't support adding/removing services while the server is running.
Approach 2 seems promising, but it is confusing by sticking the entire universe in a single servicer. And I'm not sure how gRPC's thread pool would like this approach.
Questions
Can I achieve my goals with these approaches?
If true, which one is better in terms of both maintainability and performance?
If false, are there alternatives?
As per #DougFawley's comments,
Typically microservices would have multiple replicas deployed, and be restarted in a rolling update when new services are added.
"What's the best user experience like in this scenario?" ->
microservice clients should expect and be resilient to RPC failures.
They can happen for many other reasons in steady state. Typically you
will run multiple replicas and when one is restarted, if you
gracefully shut it down, clients will have a chance to create
connections to other backends and no RPCs will fail. But if they do
fail, clients should retry and will use another backend. Users should
not really be impacted by this.
In short, it's a bad idea to add/remove services without rebooting server. So for this reason, I'd better adopt the one-to-one servicer-service binding and not hack it for this particular live-update intent.
Related
gRPC is a modern open source high performance RPC framework that can
run in any environment. It can efficiently connect services in and
across data centers with pluggable support for load balancing,
tracing, health checking and authentication. It is also applicable in
last mile of distributed computing to connect devices, mobile
applications and browsers to backend services.
I'm finding GRPC is becoming increasingly more pertinent in backend infrastructure, and would've liked to have it in my favorite language/tsdb kdb+/q.
I was surprised to find that kdb+ does not have a grpc implementation. Obviously, the (https://code.kx.com/q/interfaces/protobuf/)
package doesn't support the parsing of rpc's, is there anything quantitatively preventing there being a KDB+ implementation of the rpc requests/services etc. found in grpc?
Why would one not want to implement rpc's (grpc) in kdb+ and would it be a good idea to wrap a c++/c implemetation therin inorder to achieve this functionality.
Thanks for your advice.
Interesting post:
https://zimarev.com/blog/event-sourcing/myth-busting/2020-07-09-overselling-event-sourcing/
outlines event sourcing, which I think might be a better fit for kdb?
What is the main issue with services using RPC calls to exchange information? Well, it’s the high degree of coupling introduced by RPC by its nature. The whole group of services or even the whole system can go down if only one of the services stops working. This approach diminishes the whole idea of independent components.
In my practice I hardly encounter any need to use RPC for inter-service communication. Partially because I often use Event Sourcing, more about it later. But we always use asynchronous communication and exchange information between services using events, even without Event Sourcing.
For example, an order microservice in an e-commerce system needs customer data from the customer microservice. These dependencies between microservices are not ideal. Other microservices can go down and synchronous RESTful requests over https do not scale well due to their blocking nature. If there was a way to completely eliminate dependencies between microservices completely the result would be a more robust architecture with less bottlenecks.
You don’t need Event Sourcing to fix this issue. Event-driven systems are perfectly capable of doing that. Event Sourcing can eliminate some of the associated issues like two-phase commits, but again, not a requirement to remove the temporal coupling from your system.
I have a micro-service project with multiple services in .NET Core. When it comes to placing the controllers, there are 2 approaches:
Place the controllers in respective Micro Services, with Startup.cs in each micro-service.
Place all controllers in a separate project and have them call the individual services.
I think the 1st approach will involve less coding effort but the 2nd one separates controllers from actual services using interfaces etc.
Is there a difference in terms of how they are created and managed in Fabric using both approaches.
This is very broad topic and can raise points for discussion because it all depends on preferences, experiences and tech stacks. I will add my two cents, but do not consider it as a rule, just my view for both approaches.
First approach (APIs for each service isolated from each other):
the services will expose their public APIs themselves and you will need to put a service discovery approach in place to enable clients to call each microservice, a simple one is using the reverse proxy to forward the calls using the service name.
Each service and it's APIs scales independently
This approach is better to deploy individual updates without taking down other microservices.
This approach tends to have more code repetition to handle authorization, authentication, and other common aspects, from there you will end up doing shared libraries using on all services.
This approach increase the points of failures, it is good because failures will affect less services, if one API is failing, other services won't be impacted (if the failure does not affect the machine like memory leak or high CPU usage).
The second approach (Single API to forward the calls to right services):
You have a single endpoint and the service discovery will happen in the API, all work will be handled by each services.
The API must scale for everyone even though one service consumes much more resources than others. just the service will scale independently.
This approach, to add or modify api endpoints, you will likely update the API and the service, taking down the API will affect other services.
This approach reduces the code duplication and you can centralize many common aspects like Authorization, request throttling and so on.
This approach has less points of failures, if one microservices goes down, and a good amount of calls depend on this service, the API will handle more connection and pending requests, this will affect other services and performance. If it goes down, every services will be unavailable. Compared to the first approach, the first approach will offloaded the resilience to the proxy or to the client.
In summary,
both approaches will have a similar effort, the difference is that the effort will be split into different areas, you should evaluate both and consider which one to maintain. Don't consider just code in the comparison, because code has very little impact on the overall solution when compared with other aspects like release, monitoring, logging, security, performance.
In our current project we have a public facing API. We have several individual microservice projects for each domain. Being individual allows us to scale according to the resources each microservice use. For example we have an imaging service that consumes a lot of resources, so scaling this is easier. You also have the chance to deploy them individually and if any service fails it doesn't break the whole application.
In front of all the microservices we have an API Gateway that handles all the authentication, throttles, versioning, health checks, metrics, logging etc. We have interfaces for each microservice, and keep the Request and Response models seperately for each context. There is no business logic on this layer, and you also have the chance to aggregate responses where several services need to be called.
If you would like to ask anything about this structure please feel free to ask.
How can I share database connection aong in spring cloud module microservices. If there are many microservices how can i use same db connection or should i use db connection per microservices?
In my opinion, the thing that you've asked for is impossible only because each microservice is a dedicated process and it runs inside its own JVM (probably in more than one server). When you create a connection to the database (assuming you use connection pool) its always at the level of a single JVM.
I understand that the chances are that you meant something different but I had to put it on because it directly answers your question
Now, you can share the same database between microservices (the same schema, tables, etc) so that each JVM will have a set of connections opened (in accordance with connection pool definitions).
However, this is a really bad practice - you don't want to share the databases between microservice. The reason is the cost of change: if you (as a maintainer of microservice A) decide to, say, alter one of the tables, now all microservices will have to support this, and this is not a trivial thing to do.
So, a better approach is to have a service that has a "sole responsibility" for your data in some domain. Now, all the services could contact this service and ask for the required data through well-established APIs that should never be broken. In this approach, the cost of change is much "cheaper" since only this "data service" should be changed in a way that it doesn't break existing APIs.
Now regarding the database connection thing: you usually will have more than one JVM that runs the same microservice (like data microservice) so, it's not that you share connections between them, but rather you share the same way of working with database (because after all its the same code).
When dealing with a mircoservice architecture it is usually the case that you have a distributed system.
Most microservices that communicate with each other are not on the same machine, instance or container. Communication between them is most commonly done via http, though there are many other ways.
I would suggest designing mircoservices around a single concern of your application. For example, in your case, you could have a "persistence microservice" that would be responsible for dealing with data persistence operations on a single or multiple types data-stores. It could possibly deal with relational DBs, noSQL, file storage etc. Then, via REST endpoints, you can expose any persistence functionality to the mircoservices that deal with business logic.
A very easy way to build a REST service like this would be with the help of Spring Data REST project.
To answer your actual question, I'm not aware of any way to share actual connections between processes. Beyond that, having many microservices running on the same instance is not a good practice most of the time.
Mircoservices are very popular these days and everybody is trying to transition to them. My advice would be to make sure you don't "over-engineer" your project.
Hope I didn't misunderstand your question, but to be fair it is a little vague. If you could provide a longer more detailed description of your architecture and use case I can suggest more tools/frameworks you can use to achieve your cloudy goals.
First and most important - your microservice should be responsible for handling all data in a given business domain/bounded context. So the question is - 'Why do you need to share database connection between microservices and isn't this a sign you went too far with slicing your system?' Microservice is a tool and word 'micro' may be misleading a bit :)
For more reading I would suggest e.g. https://learn.microsoft.com/en-us/dotnet/standard/microservices-architecture/architect-microservice-container-applications/identify-microservice-domain-model-boundaries (don' t worry, it's general enough to be applicable also to Spring).
As a mechanism to connect microservices together and make them work it is usually suggested to use APIs and Service Discovery. But they usually work as their own microservices, but these ones should apparently be "hard-coded" into others, as every microservice is supposed to register with them and query for other microservices' locations. Doesn't this break the idea of loose coupling, since a loss of a discovery service implies others being unable to communicate?
Pretty much yes. If one microservice "knows" about another microservice - it means that they are highly coupled. It doesn't matter where specifically this knowledge about other service is coming from: hardcoded, config file or maybe some service discovery, it still leads to high coupling.
What's important to understand is that many microservice evangelists are just preaching about how to build monolith apps on top of Web APIs. Some of them probably think that the more buzz words they use - the better ... not quite sure why this happens. It is probably easier to fake a language and become an "expert" by using buzzword salad instead of really building fault tolerant and horizontally scalable app.
But there is another way to look at service discovery: when it is used by service clients like SPA application or an API Gateway it may be very useful. Clients and gateways should know about service APIs, otherwise, the whole thing doesn't make sense. But they can use a registry to make it more flexible/dynamic.
So, to summarize:
if services are using discovery to get more information about each other - this is probably a bad thing and design flaw (pretty sure there are corner cases where this may be a valid scenario, please post a comment if you know some)
if discovery is used by other parts of the app, like SPA or API Gateway, this may be useful, but not necessarily it always is.
PS: to avoid high coupling, consider reading series of articles by Jeppe Cramon that illustrate the problem and possible solutions very nicely.
In a distributed system, you will always have some amount of coupling, what you want to do is to reduce all aspects of coupling to a minimum.
I'd argue that is does matter how you design your service location. if your code knows of the other service i.e. OrderService.Send(SubmitOrderMessage); (where 'OrderService' is an instance of the other service's proxy)
as opposed to transportAgent.Send(SubmitOrderMessage); (where 'transportAgent' is an instance of the transport's proxy i.e. the queuing service/agent and the actual address of the queue can be in your config), this reduces the coupling and your business logic code (Service) and delegates the routing to your infrastructure.
Make Sense?
Each microservice is expected to be functionally independent. For interacting with other microservices, it should rely on rest api calls only. Service Discovery plays a role to keep the services fairly loose coupled with each other. Also due to dynamic nature of the service urls, the hard dependencies are removed. Hope this helps
Reduced coupling or fairly loose coupling still has one thing in common; coupling. In my opinion, coupling to any degree will always create rigid communication patterns that are difficult to maintain and difficult to troubleshoot as a platform grows into a large distributed platform. Isn't the idea behind microservices to allow consumers to engage in "Permissionless innovation?" I would suggest that this is only possible by decomposing to microservices that have high cohesion and low coupling and then let the consumer decide how to route, orchestrate or aggregate.
I'm porting a huge application to Windows Azure. It will have a web service frontend and a processing backend. So far I thought I would use web roles for servicing client requests and worker roles for backend processing.
Managing two kinds of roles seems problematic - I'll need to decide how to scale two kinds of roles and also I'll need several (at least two) instances of each to ensure reasonable fault tolerance and this will slightly increase operational costs. Also in my application client requests are rather lightweight and backend processing is heavyweight, so I'd expect that backend processing would consume far more processing power than servicing client requests.
This is why I'm thinking of using web roles for everything - just spawn threads and do both servicing requests and backend processing in each instance. This will make the role more complicated but will I guess simplify management. I'll have more instances of a uniform role and better fault tolerance.
Is it a good idea to reuse web roles for backend processing? What drawbacks should I expect?
Sounds like you already have a pretty good idea of what to think about when using multiple roles:
Cost for 2 instances to meet SLA (although some background tasks really don't need SLA if the end user doesn't see the impact)
Separate scale units
However: If you run everything in one role, then everything scales together. If, say, you have an administrative web site on port 8000, you might have difficulty reaching it if your user base is slamming the main site on port 80 with traffic.
I blogged about combining web and worker roles, here, which goes into a bit more detail along what we're discussing here. Also, as of some time in March, the restriction of 5 endpoints per role was lifted - see my blog post here for just how far you can push endpoints now. Having this less-restrictive endpoint model really opens up new possibilities for single-role deployments.
From what I understand your are asking if it makes sense to consolidate service layers so that you only have to deal with a single layer. At a high level, I think that makes sense. The simpler the better, as long as it's not so simple that you can't meet your primary objectives.
If your primary objective is performance, and the calls to your services are inline (meaning that the caller is waiting for an answer), then consolidating the layers may help you in achieving greater performance because you won't have to deal with the overhead of additional network latency of additional physical layers. You can use the Task Parallel Library (TPL) to implement your threading logic.
If your primary objective is scalability, and the calls to your services are out-of-band (meaning that the caller implements a fire-and-forget pattern), then using processing queues and worker roles may make more sense. One of the tenets of cloud computing is loosely coupled services. While you have more maintenance work, you also have more flexibility to grow your layers independendly. Your worker roles could also use the TPL mentioned above so that you can deploy your worker roles on larger VMs (say with 4CPUs, or 8), which would keep the number of instances deployed to a minimum.
My 2 cents. :)
I would suggest you to develop them as separated roles: a web role and a worker role, and then just combine them into a single web role.
this way, in the future you can easaly convert to real separated roles, if needed.
for more details:
http://www.31a2ba2a-b718-11dc-8314-0800200c9a66.com/2010/12/how-to-combine-worker-and-web-role-in.html
http://wely-lau.net/2011/02/25/combining-web-and-worker-role-by-utilizing-worker-role-concept/