KDB+/Q: GRPC implementation? - protocol-buffers

gRPC is a modern open source high performance RPC framework that can
run in any environment. It can efficiently connect services in and
across data centers with pluggable support for load balancing,
tracing, health checking and authentication. It is also applicable in
last mile of distributed computing to connect devices, mobile
applications and browsers to backend services.
I'm finding GRPC is becoming increasingly more pertinent in backend infrastructure, and would've liked to have it in my favorite language/tsdb kdb+/q.
I was surprised to find that kdb+ does not have a grpc implementation. Obviously, the (https://code.kx.com/q/interfaces/protobuf/)
package doesn't support the parsing of rpc's, is there anything quantitatively preventing there being a KDB+ implementation of the rpc requests/services etc. found in grpc?
Why would one not want to implement rpc's (grpc) in kdb+ and would it be a good idea to wrap a c++/c implemetation therin inorder to achieve this functionality.
Thanks for your advice.

Interesting post:
https://zimarev.com/blog/event-sourcing/myth-busting/2020-07-09-overselling-event-sourcing/
outlines event sourcing, which I think might be a better fit for kdb?
What is the main issue with services using RPC calls to exchange information? Well, it’s the high degree of coupling introduced by RPC by its nature. The whole group of services or even the whole system can go down if only one of the services stops working. This approach diminishes the whole idea of independent components.
In my practice I hardly encounter any need to use RPC for inter-service communication. Partially because I often use Event Sourcing, more about it later. But we always use asynchronous communication and exchange information between services using events, even without Event Sourcing.
For example, an order microservice in an e-commerce system needs customer data from the customer microservice. These dependencies between microservices are not ideal. Other microservices can go down and synchronous RESTful requests over https do not scale well due to their blocking nature. If there was a way to completely eliminate dependencies between microservices completely the result would be a more robust architecture with less bottlenecks.
You don’t need Event Sourcing to fix this issue. Event-driven systems are perfectly capable of doing that. Event Sourcing can eliminate some of the associated issues like two-phase commits, but again, not a requirement to remove the temporal coupling from your system.

Related

Circuit breaker as stand alone service

I am building microservices architecture for the first time and despite I have read a lot of articles I am still confused how to correctly implement circuit breaker.
Let's suppose that I have several microservices that call each other. So I implemented circuit breaker into each of them as an request interceptor and it works. But I dont like it.
Firstly each service now needs to hit the fail threshold separately before the breaker open. Secondly I write the same functionality for each service over and over again.
So my first thought was to create circuit breaker as stand alone service but I can not find any pattern describing such a functionality. How it would works? Every service before making request calls circuit breaker service firs if target circuit is closed. If so it sends request and when request is finished then reports back to circuit breaker service whether the request was successful or failed?
Or how should be circuit breaker correctly put into microservices architecture?
When you are talking about real micro services architecture circuit breaking is a cross-cutting-concern
You should not implement it by yourself. First of all I should say please be careful of creating spaghetti between your micro services, It's too dangerous and anti-pattern.
Although its an anti-pattern I highly recommend you to use cloud native platforms to deploy your micro-services like Kubernetes or mabye Docker.
There are lots of useful tools like Envoy-implemented side-cars, service mesh implementations using Istio (not recommended), Consul and other Hashicorp products.
You can improve your service discovery, observability, monitoring, circuit-breaking, logging, side-micro-service-communication and other useful concepts using cloud native tools.
Hint: I highly recommend you to use grpc instead of http requests between your services (To reduce latency based on http3 and tcp connections)
Secondly I write the same functionality for each service over and
over again.
One of ways to address this issue in the world of microservices is (as you correctly noticed) to have this functionality moved away of your service. Circuit breaking is just one element, there is many many more other aspects, related to inter-service communication, that you'd have to take care of, such as: handling retries, failovers, authentication and authorization, tracing, monitoring etc.
If you were to handle it in all services separately, you'd end up writing the same code (or configuring various frameworks/plugins) over and over again.
The solution that emerged from that need is a service mesh. You can think of it as a middle-man that intercept all the communication between your services and taking care of all above mentioned aspects.
There are various solutions. You can check https://github.com/cncf/landscape to find out what is now "hot" and considered a standard.
I'd however recommend you getting familiar with the https://istio.io/latest/about/service-mesh/ as it's really mature and powerful.

Microservice Aggregator Service BFF

Let say I have 22 microservices. I developed with docker on local.
Client wants to get product model data which contains 3 different service data and aggregate them.
Should I use aggregator gateway api or SPA get separately from each service. Does Aggregator service couple services ?
These Microservices patterns always come with Trade-offs. Here you need to consider more than just a coupling issue when you are going with Aggregator pattern (Backend for Frontend).
The following are some of the points you need to think about before going with this pattern.
The Latency problem. If you want this implementation to make it better without any latency problem, then your services and aggregator should be in the same location or the same data center. Avoid third party calls from aggregators.
This can introduce a single point of failure. Make sure that you've designed in such a way that the service is highly available.
Implement a resilient design and timeout since this aggregator is calling other services and getting data. If one or more service calls take too long, it should timeout and return a partial set of data. Consider how your application will handle this scenario
Monitoring of your aggregator and it's child service calls. Implement distributed tracing using correlation IDs to track each call.
Ensure the aggregator has the adequate performance to handle the load and can be scaled to meet your anticipated growth.
These are the best practices that I can suggest, You are the best person who can decide based on your system requirements and these points.
There are some compelling advantages to using a BfF service as an orchestration layer that aggregates calls to various backend data services.
It will reduce the complexity in the data access areas of your SPA.
It can also reduce load times.
Over time, your frontend devs will be less likely to get blocked on the backend devs assuming that the BfF is maintained by the frontend devs.
Take a look at this article on Consistency, Coupling, and Complexity at the Edge that goes into more detail on this and proposes some best practices such as GraphQL vs REST.

Reactive approach of Marklogic database

Does Marklogic supports backpressure or allow to send data in chunks that is reactive approach ?
'Reactive' is a fairly new term describing a particular incarnation of old concepts common in server and database technologies, but fairly new to modern client and middle-tier programming.
I am assuming the question is prompted by the need/desire to work within an existing 'Reactive' framework (such as vert.x or Rx/Java). For that question, the answer is 'no' - there is not an 'official' API which integrates directly with these frameworks to my knowledge. There are community APIs which I have not personally used, an example is https://github.com/etourdot/vertx-marklogic (reactive, vert.x marklogic API).
MarkLogic is a 'reactive' design internally in that it implements the functionality the modern 'reactive' term is used to describe -- but does not expose any standard 'reactive APIs' for this (there are very few standards in this area). Code running within MarkLogic server (xquery,javascript) implicitly benefits from this - although there is not an explicit backpressure API, a side effect of single threaded blocking IO (from the app perspective) is that the equivalent of 'back pressure' is implemented by implicit flow control of the IO APIS - you cannot over drive a properly configured ML server on a single thread doing blocking IO. Connections to an overloaded server will take longer and eventually time out ('backpressure' :)
Similarly, (most of) the external APIs (REST, XCC) are also blocking, single threaded.
The server core manages rate control via a variety of methods such as actively managing the TCP connection queue size, keep alive times, numbers of active threads etc.
In general the server does a very good job at this without explicit low level programming needed, balancing the latency across all clients. If this needs improving, the administration guides have good direction on how to tune the various parameters so the system behaves well on its own.
If you want to implement a per-connection client aware 'reactive' API you will need to implement it yourself. This can be done using the same techniques used for other blocking IO APis -- i.e. either use multiple threads or non-blocking IO. Some of the ML SDK's have provision for non-blocking IO or control over timeouts which can be used to implement a 'reactive' API.
Similarly, code running in the server itself (XQuery or JavaScript) can implement 'reactive' type behaviour by making use of the task queue -- as exposed by the xdmp:spawn-xxx apis. This is done in many libraries to manage bulk ingest. Care must be taken to carefully control the amount of concurrency as you can easily overload the server by spawning too many concurrent requests. Managing state is a bit tricky as there is a interaction/opposition between the transaction model and task creation -- the former generally presenting an idempotent view of data that can be incongruous with the concept of 'current' wrt asynchronous tasks.

web Api application subscribing to a queue. Is it a good idea?

We are designing a reporting system using microservice architecture. All the services are supposed to be subscribers to the event bus and they communicate by raising events. We also decided to expose each of our services using REST api. Now the question is , is it a good idea to create our services as web api [RESTful] applications which are also subscribers to the event bus? so basically there are 2 ponits of entry to each service - api and events. I have a feeling that we should separate out these 2 as these are 2 different concerns. Any ideas?
Since Microservices architecture are Un-opinionated software design. So you may get different answers on this questions.
Yes, REST and Event based are two different things but sometime both combined gives design to achieve better flexibility.
Answering to your concerns, I don't see any harm if REST APIs also subscribe to a queue as long as you can maintain both of them i.e changes to message does not have any impact of APIs and you have proper fallback and Eventual consistency mechanism in place. you can check discussion . There are already few project which tried it such as nakadi and ponte.
So It all depends on your service's communication behaviour to choose between REST APIs and Event-Based design Or Both.
What you do is based on your requirement you can choose REST APIs where you see synchronous behaviour between services
and go with Event based design where you find services needs asynchronous behaviour, there is no harm combining both also.
Ideally for inter-process communication protocol it is better to go with messaging and for client-service REST APIs are best fitted.
Check the Communication style in microservices.io
REST based Architecture
Advantage
Request/Response is easy and best fitted when you need synchronous environments.
Simpler system since there in no intermediate broker
Promotes orchestration i.e Service can take action based on response of other service.
Drawback
Services needs to discover locations of service instances.
One to one Mapping between services.
Rest used HTTP which is general purpose protocol built on top of TCP/IP which adds enormous amount of overhead when using it to pass messages.
Event Driven Architecture
Advantage
Event-driven architectures are appealing to API developers because they function very well in asynchronous environments.
Loose coupling since it decouples services as on a event of once service multiple services can take action based on application requirement. it is easy to plug-in any new consumer to producer.
Improved availability since the message broker buffers messages until the consumer is able to process them.
Drawback
Additional complexity of message broker, which must be highly available
Debugging an event request is not that easy.

Does Service Discovery microservice break idea of loose coupling?

As a mechanism to connect microservices together and make them work it is usually suggested to use APIs and Service Discovery. But they usually work as their own microservices, but these ones should apparently be "hard-coded" into others, as every microservice is supposed to register with them and query for other microservices' locations. Doesn't this break the idea of loose coupling, since a loss of a discovery service implies others being unable to communicate?
Pretty much yes. If one microservice "knows" about another microservice - it means that they are highly coupled. It doesn't matter where specifically this knowledge about other service is coming from: hardcoded, config file or maybe some service discovery, it still leads to high coupling.
What's important to understand is that many microservice evangelists are just preaching about how to build monolith apps on top of Web APIs. Some of them probably think that the more buzz words they use - the better ... not quite sure why this happens. It is probably easier to fake a language and become an "expert" by using buzzword salad instead of really building fault tolerant and horizontally scalable app.
But there is another way to look at service discovery: when it is used by service clients like SPA application or an API Gateway it may be very useful. Clients and gateways should know about service APIs, otherwise, the whole thing doesn't make sense. But they can use a registry to make it more flexible/dynamic.
So, to summarize:
if services are using discovery to get more information about each other - this is probably a bad thing and design flaw (pretty sure there are corner cases where this may be a valid scenario, please post a comment if you know some)
if discovery is used by other parts of the app, like SPA or API Gateway, this may be useful, but not necessarily it always is.
PS: to avoid high coupling, consider reading series of articles by Jeppe Cramon that illustrate the problem and possible solutions very nicely.
In a distributed system, you will always have some amount of coupling, what you want to do is to reduce all aspects of coupling to a minimum.
I'd argue that is does matter how you design your service location. if your code knows of the other service i.e. OrderService.Send(SubmitOrderMessage); (where 'OrderService' is an instance of the other service's proxy)
as opposed to transportAgent.Send(SubmitOrderMessage); (where 'transportAgent' is an instance of the transport's proxy i.e. the queuing service/agent and the actual address of the queue can be in your config), this reduces the coupling and your business logic code (Service) and delegates the routing to your infrastructure.
Make Sense?
Each microservice is expected to be functionally independent. For interacting with other microservices, it should rely on rest api calls only. Service Discovery plays a role to keep the services fairly loose coupled with each other. Also due to dynamic nature of the service urls, the hard dependencies are removed. Hope this helps
Reduced coupling or fairly loose coupling still has one thing in common; coupling. In my opinion, coupling to any degree will always create rigid communication patterns that are difficult to maintain and difficult to troubleshoot as a platform grows into a large distributed platform. Isn't the idea behind microservices to allow consumers to engage in "Permissionless innovation?" I would suggest that this is only possible by decomposing to microservices that have high cohesion and low coupling and then let the consumer decide how to route, orchestrate or aggregate.

Resources