I'm fairly new to Microservices...
I've taken an interest in two main patterns, service discovery and circuit breaker, and have researched how they can be implemented.
As a Java Developer, I'm using Spring Boot. From what I understand, these patterns are useful if microservices communicate via HTTP.
One of the topics I've recently seen is the importance of event-driven architecture, which makes use of an event message bus: services publish messages to the bus, and other services subscribe to the bus and process the messages.
Given this event-driven nature, how can service-discovery and circuit breakers be achieved/implemented, given that these are commonly applicable for services communicating via HTTP?
From what I understand, these patterns are useful if microservices communicate via HTTP.
It is irrelevant that the communication is HTTP. The circuit breaker is useful for preventing cascading failures, which are more likely to occur in architectures that use a synchronous communication style.
Event-driven architectures are in general asynchronous, so cascading failures are less likely to occur.
Service discovery is used so that microservices can discover each other, but in event-driven architectures microservices communicate only with the messaging infrastructure (e.g. the event store in event sourcing), so discoverability applies only at the infrastructure level.
I. Circuit breaker and service discovery are patterns, and patterns can be implemented in any programming language; the HTTP protocol is only a means of transferring data.
A circuit breaker can be implemented in Java. You can find many implementations (of course, with varying capabilities and interpretations of the pattern) on GitHub; a minimal sketch also follows the list below.
Some of the well-known, built-for-purpose implementations are:
Hystrix from Netflix OSS. To use Hystrix, you can follow the Spring guide - Spring Circuit Breaker
Apache Polygene - which has an example of a JMX circuit breaker
Resilience4j
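For illustration, here is a minimal sketch using Resilience4j (the breaker name, the thresholds, and the callRemoteService() placeholder are made up for the example):

```java
import java.time.Duration;
import java.util.function.Supplier;

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;

public class CircuitBreakerSketch {

    public static void main(String[] args) {
        // Open the circuit when 50% of the last 10 calls failed,
        // and stay open for 30 seconds before allowing trial calls.
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)
                .slidingWindowSize(10)
                .waitDurationInOpenState(Duration.ofSeconds(30))
                .build();

        CircuitBreaker breaker = CircuitBreaker.of("downstream-service", config);

        // Wrap the remote call; while the circuit is open, calls fail fast
        // with CallNotPermittedException instead of hitting the dependency.
        Supplier<String> guarded =
                CircuitBreaker.decorateSupplier(breaker, CircuitBreakerSketch::callRemoteService);

        System.out.println(guarded.get());
    }

    // Placeholder for the actual HTTP (or other) call to a dependency.
    private static String callRemoteService() {
        return "response";
    }
}
```

Note that nothing in this sketch depends on HTTP: the guarded supplier could just as well wrap a messaging call.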
II. About,
Given this event-driven nature, how can service-discovery and circuit breakers be achieved/implemented, given that these are commonly applicable for services communicating via HTTP?
It seems you need a bit more research on the topic of microservice interactions.
There are two styles in which microservice interactions are possible. You have to choose one over the other; you should not mix both.
Orchestration: An interaction style that has an intelligent controller dispatching events to processes. Please note that the word 'processes' here means business processes. The orchestration style was also preferred in older SOA implementations.
Choreography: An interaction style that allows processes to subscribe to events and handle them independently or through integration with other processes without the need for a central controller.
These topics are greatly covered under
Orchestration vs. Choreography
Need for Service Discovery:
With choreography, two or more microservices can coordinate their activities and processes to share information and value.
But these microservices may not be aware of each other's existence, i.e. no hard-coded service references or dependency endpoints are configured or coded into them. We do this to avoid any kind of coupling between services. So the question remains: how will one service, if required, find another service's endpoint? This is where a service discovery mechanism is used.
Another perspective: with microservices deployed in containers, endpoints are not even tied to particular hosts (due to containers spinning up and down). For this case as well, we need a 'service discovery' mechanism.
So, in a service discovery mechanism, a centralized service discovery tool helps services register themselves and discover other services via a DNS or HTTP interface.
Service discovery can be implemented with
1. Server-side service discovery
2. Client-side service discovery
Consul, etcd, and ZooKeeper are some of the key tools in the service discovery space.
Spring Boot integrates well with Spring Cloud, and Spring Cloud provides Eureka (for service discovery) as well as Hystrix (for the circuit breaker pattern). It also provides Spring Cloud Stream for event-driven patterns. All of these are very easy to use with Spring Boot.
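As a rough illustration of client-side discovery with Spring Cloud, a lookup at runtime could look like the sketch below (the "user-service" ID is hypothetical, and a discovery server such as Eureka is assumed to be configured):

```java
import java.util.List;

import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.stereotype.Service;

@Service
public class UserServiceLocator {

    private final DiscoveryClient discoveryClient;

    public UserServiceLocator(DiscoveryClient discoveryClient) {
        this.discoveryClient = discoveryClient;
    }

    // Look up a registered instance of a (hypothetical) "user-service"
    // from the discovery server instead of hard-coding a URL.
    public String userServiceUrl() {
        List<ServiceInstance> instances = discoveryClient.getInstances("user-service");
        if (instances.isEmpty()) {
            throw new IllegalStateException("No user-service instances registered");
        }
        return instances.get(0).getUri().toString();
    }
}
```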
I believe there is a misunderstanding in the question in that you assume that event-driven architectures cannot be implemented on top of HTTP.
An event-driven architecture may be implemented in many different ways and (when the architecture is that of a distributed system), on top of many different protocols.
It can be implemented using a message broker (e.g. Kafka, RabbitMQ, ActiveMQ), as you suggested. However, this is just one choice and certainly not the only way to do it.
For example, the seminal book Building Microservices by Sam Newman, in Chapter 4: Integration, under Implementing Asynchronous Event-Based Collaboration says:
“Another approach is to try to use HTTP as a way of propagating events. ATOM is a REST-compliant specification that defines semantics (among other things) for publishing feeds of resources. Many client libraries exist that allow us to create and consume these feeds. So our customer service could just publish an event to such a feed when our customer service changes. Our consumers just poll the feed, looking for changes. On one hand, the fact that we can reuse the existing ATOM specification and any associated libraries is useful, and we know that HTTP handles scale very well. However, HTTP is not good at low latency (where some message brokers excel), and we still need to deal with the fact that the consumers need to keep track of what messages they have seen and manage their own polling schedule. I have seen people spend an age implementing more and more of the behaviors that you get out of the box with an appropriate message broker to make ATOM work for some use cases. For example, the Competing Consumer pattern describes a method whereby you bring up multiple worker instances to compete for messages, which works well for scaling up the number of workers to handle a list of independent jobs. However, we want to avoid the case where two or more workers see the same message, as we’ll end up doing the same task more than we need to. With a message broker, a standard queue will handle this. With ATOM, we now need to manage our own shared state among all the workers to try to reduce the chances of reproducing effort. If you already have a good, resilient message broker available to you, consider using it to handle publishing and subscribing to events. But if you don’t already have one, give ATOM a look, but be aware of the sunk-cost fallacy. If you find yourself wanting more and more of the support that a message broker gives you, at a certain point you might want to change your approach.”
Likewise, if your design uses a message broker for the event-driven architecture, then I'm not sure a circuit breaker is needed, because in that case the consumer applications control the rate at which event messages are consumed from the queues. The producer application can publish event messages at its own pace, and the consumer applications can add as many competing consumers as they want to keep up with that pace. If the producer application is down, the consumer applications can continue consuming any remaining messages in the queues, and once the queues are empty they simply wait for more messages to arrive. None of this puts any burden on the producer application. The producer and consumer applications are decoupled in this scenario, and all the work the circuit breaker does in other scenarios is handled by the message broker instead.
Something similar can be said of the service discovery feature. Since the producer and the consumer do not talk to each other directly, but only through the message broker, the only service you need to discover is the message broker itself.
Related
Spring + Apache Kafka noob here. I'm wondering if it's advisable to run a single Spring Boot application that handles both producing and consuming messages.
A lot of the applications I've seen using Kafka lately usually have one separate application send/emit the message to a Kafka topic, and another one that consumes/processes the message from that topic. For larger applications, I can see a case for separate producer and consumer applications, but what about smaller ones?
For example: I have a simple app that processes HTTP requests and sends them on to a third-party service, but to ensure retryability, I put each request on a Kafka topic and process it with a service using the @Retryable annotation?
And what other considerations might come into play since it would be on the Spring framework?
Note: As your question states, what I'll say is more advice based on my beliefs and experience than some absolute truth written in stone.
Your use case seems more like a proxy than an actual application with business logic. You should make sure that making this an asynchronous service makes sense - maybe it's good enough to simply hold the connection until you get a response from the 3p, and let your client handle retries if you get an error - of course, you can also retry until some timeout.
This would avoid common asynchronous issues such as making your client need to poll or have a webhook in order to get a result, or making sure a record still makes sense to be processed after a lot of time has elapsed after an outage or a high consumer lag.
If your client doesn't care about the result as long as it gets done, and you don't expect high-throughput on either side, a single Spring Boot application should be enough for handling both producer and consumer sides - while also keeping it simple.
If you do expect high throughput, I'd look into building a WebFlux based application with the reactor-kafka library - high throughput proxies are an excellent use case for reactive applications.
Another option would be having a simple serverless function that handles the http requests and produces the records, and a standard Spring Boot application to consume them.
TBH, I don't see a use case where having two full-fledged Java applications to handle a proxy duty would pay off, unless maybe you have such a sound infrastructure for managing them that it makes no difference having two applications instead of one, and using more resources is not an issue.
Actually, if you expect really high traffic and a serverless function wouldn't work, or maybe you want to stick to Java-based solutions, then you could have a simple WebFlux-based application to handle the HTTP requests and send the messages, and a standard Spring Boot or another WebFlux application to handle consumption. This way you'd be able to scale up the former to accommodate the high traffic, and independently scale the latter in line with your performance requirements.
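For the reactive option mentioned above, a minimal reactor-kafka consumer sketch might look like this (the topic, group ID, and broker address are made up; error handling is omitted):

```java
import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

import reactor.kafka.receiver.KafkaReceiver;
import reactor.kafka.receiver.ReceiverOptions;

public class ReactiveConsumerSketch {

    public static void main(String[] args) {
        Map<String, Object> props = Map.of(
                ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092",
                ConsumerConfig.GROUP_ID_CONFIG, "proxy-consumer",
                ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class,
                ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        ReceiverOptions<String, String> options =
                ReceiverOptions.<String, String>create(props)
                        .subscription(List.of("outbound-requests"));

        // Records arrive as a non-blocking Flux; offsets are acknowledged
        // explicitly once each record has been handled.
        KafkaReceiver.create(options)
                .receive()
                .doOnNext(record -> {
                    // forward record.value() to the third-party service here
                    record.receiverOffset().acknowledge();
                })
                .blockLast(); // block only to keep this demo main() alive
    }
}
```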
As for the retry part, if you stick to non-reactive Spring Kafka applications, you might want to look into the non-blocking retries feature of Spring Kafka. This will enable your consumer application to process other records while waiting to retry a failed one - the @Retryable approach is deprecated in favor of DefaultErrorHandler, and both will block consumption while waiting.
Note that with that you lose ordering guarantees, so use it only if the order in which the requests are processed is not important.
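A hedged sketch of the non-blocking retries feature (the topic name and retry settings are invented for the example):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.RetryableTopic;
import org.springframework.retry.annotation.Backoff;
import org.springframework.stereotype.Component;

@Component
public class ThirdPartyForwarder {

    // Non-blocking retries: failed records are forwarded to retry topics
    // with a backoff, so the main topic's consumer keeps processing.
    @RetryableTopic(
            attempts = "4",
            backoff = @Backoff(delay = 1000, multiplier = 2.0))
    @KafkaListener(topics = "outbound-requests")
    public void forward(String payload) {
        callThirdParty(payload); // throwing here sends the record to a retry topic
    }

    // Placeholder for the actual HTTP call to the third-party service.
    private void callThirdParty(String payload) {
    }
}
```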
I am investigating solutions for implementing the microservice saga pattern on a platform hosted in K8s on GCP.
There are two options: Eventuate Tram and Axon. However, these frameworks do not seem to support message brokers managed by a cloud provider, such as Google Cloud Pub/Sub, and I do not want to deploy either Kafka or RabbitMQ to K8s since GCP already supports Pub/Sub.
So is there any way to integrate either Eventuate or Axon with Google Cloud Pub/Sub?
Thanks
I'm uncertain about Eventuate's angle on this, but Axon supports message brokers other than Axon Server through extensions. Throughout Axon's lifecycle (read: the last 10 years), several of these have been provided, but none of them is currently used for all the message types defined by Axon Framework. So you wouldn't be able to use Kafka for sending commands in Axon, for example.
Reasoning for this? Commands, events and queries have different routing requirements which should be reflected by using the right tool for the job.
To be a bit more specific on Axon's side, the following extensions can be used for distributing your messages:
AMQP -> for Events
Kafka -> for Events
JGroups -> for Commands
Spring Cloud Discovery -> for Commands
As you can tell, there is currently no Pub/Sub extension out there to distribute your messages. On top of that, my gut tells me that if one were available, it would likely only be used for event messages, given Pub/Sub's intent as a message broker.
Luckily, this actually makes it rather straightforward to create such an extension yourself. Going into all the details would be a little much, so I would recommend having a look at Axon's AMQP extension first. Hints on the matter: for publication, you should add a component that handles Axon's events and publishes them to Pub/Sub. For handling events, you are required to build a StreamableMessageSource or a SubscribableMessageSource. These interfaces are used respectively by the TrackingEventProcessor and the SubscribingEventProcessor, which in turn are the components in charge of the technical aspects of handling events.
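To give a feel for the shape of such an extension, here is a very rough, hypothetical sketch of a SubscribableMessageSource backed by Google Cloud Pub/Sub (the class names on the Pub/Sub and Axon side are real, but the wiring, the payload handling, and the absence of a proper Axon Serializer are all simplifications):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

import org.axonframework.common.Registration;
import org.axonframework.eventhandling.EventMessage;
import org.axonframework.eventhandling.GenericEventMessage;
import org.axonframework.messaging.SubscribableMessageSource;

import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

// Hypothetical adapter: feeds Pub/Sub messages to a SubscribingEventProcessor.
public class PubSubMessageSource implements SubscribableMessageSource<EventMessage<?>> {

    private final List<Consumer<List<? extends EventMessage<?>>>> processors =
            new CopyOnWriteArrayList<>();

    public PubSubMessageSource(String projectId, String subscriptionId) {
        MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer ack) -> {
            // A real extension would use an Axon Serializer to rebuild the
            // original event; here we simply wrap the raw payload.
            EventMessage<?> event =
                    GenericEventMessage.asEventMessage(message.getData().toStringUtf8());
            processors.forEach(processor -> processor.accept(List.of(event)));
            ack.ack();
        };
        Subscriber.newBuilder(ProjectSubscriptionName.of(projectId, subscriptionId), receiver)
                  .build()
                  .startAsync();
    }

    @Override
    public Registration subscribe(Consumer<List<? extends EventMessage<?>>> processor) {
        processors.add(processor);
        return () -> processors.remove(processor);
    }
}
```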
By the way, if you are going to build such an extension and you need a hand, it would be best to ask at the AxonIQ forum, which you can find here.
One last, and rather important, note: such a connector would not be able to deal with all types of messages. If you require a more full-fledged Axon application running in a distributed fashion, I would highly recommend giving Axon Server a try before building your own solution from the ground up.
There are a lot of tutorials and articles (including the official site) promoting Spring Boot as a good tool for building microservices.
Let's say we have some REST API endpoint (User profile) which aggregates data from multiple services (User service, Stat service, Friends service).
To achieve this, user profile endpoint makes 3 http calls to those services.
But in Spring, requests are blocking, and as far as I can see, the server will quickly run out of available resources (threads) to serve requests in such a system.
So to me, it seems quite an inefficient way to build such systems (compared to non-blocking frameworks like the Play framework or Node.js).
Do I miss something?
P.S.: I am not referring here to Spring 5 with its new WebFlux framework.
No one prevents you from building an asynchronous microservice architecture with Spring Boot :).
Something along these lines:
Instead of one service calling another synchronously, a service can put events to a queue (e.g. RabbitMQ). The events are delivered to services that subscribe to those events.
Using RabbitMQ and its "exchange" concept, the event-producing service doesn't even need to know the consumers of its events.
A blog post detailing this with Spring Boot code can be found here: https://reflectoring.io/event-messaging-with-spring-boot-and-rabbitmq/
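To make this concrete, here is a minimal sketch using Spring AMQP (the exchange, routing key, and queue names are invented, and the exchange-to-queue bindings are assumed to be declared elsewhere):

```java
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Component;

// Producer side: publish an event to an exchange and move on.
@Component
public class UserEventPublisher {

    private final RabbitTemplate rabbitTemplate;

    public UserEventPublisher(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    public void userCreated(String userId) {
        // The exchange decides which queues receive the event;
        // the producer never blocks on, or even knows, its consumers.
        rabbitTemplate.convertAndSend("user-events", "user.created", userId);
    }
}

// Consumer side (typically in another service): subscribe to a queue
// bound to that exchange and react to events at its own pace.
@Component
class UserEventListener {

    @RabbitListener(queues = "stat-service.user-events")
    public void onUserCreated(String userId) {
        // e.g. update statistics for the new user
    }
}
```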
This is not a limitation of Spring; rather, it has more to do with the application architecture.
For instance, the scenario that you have is commonly solved using the Aggregate Design Pattern.
While this solution is quite prevalent, it has the limitation of being synchronous, and thus blocking. Asynchronous behaviour in such scenarios should be implemented in an application-specific way.
Having said that, if you have to call other services in order to serve a response to a request from an outside client, this is typically an architectural problem. It really doesn't matter whether you are using HTTP or asynchronous message passing (with a request-reply pattern); the overall response time for the outside client will be bad.
Also, I have seen quite a few applications which use synchronous REST calls for external clients, but when communication is needed between internal microservices, it should always be asynchronous. You can read an interesting paper on this topic here: MicroServices Messaging Patterns.
Please consider the scenario shown in the attached image:
The Portal (producer) will send messages to the bus, which have to be processed by multiple applications (consumers) - PAYROLLAPP, HELPDESK, etc.
Multiple instances of the consumer applications may be running, and these instances can be added/removed dynamically.
Now, it is critical to ensure that a message is processed only once per application, i.e. if PAYROLLAPP-1 processes the message, PAYROLLAPP-2 should NOT process it; of course, in the above diagram, HELPDESK-1 must still process it. In short, in the case of multiple instances, exactly one must process the message, once.
When I searched for answers, most of the material was about creating a 'selective consumer' - a consumer that accepts/rejects a message based on some logic. Please note that no changes/additions/wrapping can be done to the applications shown in the diagram; the logic has to reside somewhere in the provider that manages the bus.
Please guide about the same.
Adding more details after Petter's answer:
The items to the left of the left dotted line are the 'approaches' - pure JMS, ESB, EAI.
The items to the right of the right dotted line are the 'implementations'.
Now, the big part - QUERIES:
1. Irrespective of the solution (pure JMS, ESB, EAI), does the part below the horizontal dotted line (application-specific queues) need to be implemented?
2. How does the usage of an ESB (JBoss ESB etc.), instead of 'pure' JMS (ActiveMQ etc.), help or hamper? Does an ESB provide any advantage over JMS, which is 'Java-only'(?). I am hell confused - 'ESB or JMS', even after referring to threads like these: JMS and ESB - how they are related?. It has one reply which says “JMS is not well suited for the integration of REST services, File systems, S/FTP, Email, Hessian, SOAP etc. which are better handled with an ESB that supports these types natively. For example, if you have a process that dumps a CSV file of 500MB at midnight, and you want another system to pickup the file, parse CSV and import into a database, this can easily be accomplished by an ESB - whereas a solution with just JMS will be bad. Similarly, integration of REST services, with load balancing/failover to multiple backend instances can be done better with an ESB supporting HTTP/S natively.” It only added to my confusion!!!
3. Is the usage of an EAI framework (Apache Camel etc.) an approach entirely different from the pure JMS or ESB approach? If yes, how, and what are the pros/cons?
4. I was told that an ESB alone won't help; BPM (or something else?) needs to be used to define and store the 'routing' logic - is this true?
I see the point. This might be a bit tricky with "pure" JMS.
What you essentially want to do is let the portal publish messages to a topic, but not let the PAYROLLAPPs subscribe to that topic directly (since each of them would get a copy of the message). So what you need is some logic in between that distributes the message from the topic subscription to one queue per application type. From that queue, normal load balancing (the competing consumers pattern) can be implemented with JMS.
Different JMS providers have special implementations that can accomplish this task.
ActiveMQ has its Virtual Destinations; WebSphere MQ has server-side subscriptions that can subscribe from a topic to a queue. In case your JMS provider does not have any way to handle this, you might want to look at adding some routing middleware to your topology. Apache Camel is a nice, lightweight one, but there are lots of others that can set up some routing in the middle without affecting the real applications.
Update for detailed questions
The queues below the line have to be there for sure (if your applications use messaging). The "Some distrib. logic" box shouldn't be needed. The "Some routing logic" box could be an ESB or, in this very case, be implemented in the messaging server, for instance ActiveMQ with virtual destinations (or WebSphere MQ, or perhaps RabbitMQ, among others).
There are a lot of buzzwords in the domain of integration. Simplified (depending on who you ask - an ESB can also be seen as an architectural pattern, but let's keep it simple), an ESB is a server application (or, in practice, a topology of multiple servers) that is the centerpiece of an integration landscape. The ESB server simply contains logic and small message flows that take messages (files, or whatever) from one application and route them to many applications, transform them to other formats, encrypt them, convert from one transport protocol (such as HTTP/SOAP) to another (such as file), etc.
JMS is a rather confusing and misused word. Java has to some extent dominated the enterprise messaging domain in recent years, so JMS is sometimes used pretty much as a synonym for messaging. However, messaging (or message queueing, asynchronous messaging, MOM = message-oriented middleware, etc.) is simply a family of similar transport protocols that feature a central relaying server. It is not at all a Java-only thing. Many successful ESB setups I have worked with actually leverage a messaging backbone.
In your situation, I would not go too deep into the academic/philosophical differences between ESB and EAI software. They will most likely do pretty much the same things for you. Instead, look at the hard facts such as price, support, resource footprint, monitoring, technical features, learning curve, etc. Be it Camel/ServiceMix, Mule, JBoss ESB, Microsoft BizTalk, IBM Message Broker, Tibco, etc.
Hah! Was it perhaps a salesman? An ESB will do just fine. A messaging server will do in your case as well, such as ActiveMQ, as has been pointed out already. BPM suites are fine for orchestrating semi-automated business processes, or if there is major business logic in the integration layer. Otherwise, avoid that added complexity.
Irrespective of the solution (pure JMS, ESB, EAI), does the part below the horizontal dotted line (application-specific queues) need to be implemented?
The consumers need to be implemented in such a way that they work with your chosen solution, but you shouldn't have to worry about creating a queue per consumer or the distribution logic (assuming the consumers can consume directly from the chosen technology).
How does the usage of an ESB (JBoss ESB etc.), instead of 'pure' JMS (ActiveMQ etc.), help or hamper? Does an ESB provide any advantage over JMS, which is 'Java-only'(?). I am hell confused - 'ESB or JMS', even after referring to threads like these: JMS and ESB - how they are related?. It has one reply which says “JMS is not well suited for the integration of REST services, File systems, S/FTP, Email, Hessian, SOAP etc. which are better handled with an ESB that supports these types natively. For example, if you have a process that dumps a CSV file of 500MB at midnight, and you want another system to pickup the file, parse CSV and import into a database, this can easily be accomplished by an ESB - whereas a solution with just JMS will be bad. Similarly, integration of REST services, with load balancing/failover to multiple backend instances can be done better with an ESB supporting HTTP/S natively.” It only added to my confusion!!!
My opinion is that an ESB would overcomplicate this solution. It's designed (amongst other things) to assist integration with different technologies, but simpler solutions do this too - e.g. Apache Camel provides a very easy way of communicating over a huge variety of transports (including ActiveMQ).
Not all JMS implementations cater for connectivity from other languages, but ActiveMQ does, using its STOMP connector.
Is the usage of an EAI framework (Apache Camel etc.) an approach entirely different from the pure JMS or ESB approach? If yes, how, and what are the pros/cons?
Apache Camel and JMS are complementary technologies, as are JMS and ESB. Camel (and Spring Integration) are lightweight, simple and portable. ESBs are much more heavyweight and will normally lead to greater coupling with the ESB/application server.
I was told that an ESB alone won't help; BPM (or something else?) needs to be used to define and store the 'routing' logic - is this true?
It depends on what your 'routing' logic is. It looks to me like you don't require routing logic; you just require guaranteed delivery to one Payroll consumer and one Helpdesk consumer. BPM would be more useful where you want to selectively publish data or invoke a service based on some characteristic of that data.
I strongly suggest reading http://activemq.apache.org/virtual-destinations.html. Using these, you would:
Send messages to the ActiveMQ broker, onto a VirtualTopic, e.g. VirtualTopic.X
Register the Payroll and Helpdesk consumers, as consumers on queues that ActiveMQ dynamically creates on the topic - e.g. Consumer.Payroll.VirtualTopic.X. Both Payroll consumers should be registered with the same string.
ActiveMQ will automatically retain a marker that represents what each set of consumers hasn't consumed. This means that 100% of messages will be processed by a Payroll consumer but a message will never be sent to > 1 payroll consumer.
Add/remove consumers at will.
N.B. I believe that other products, e.g. Apache Qpid, provide similar functionality - I'm just most familiar with ActiveMQ, and have had success with this approach.
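Following that recipe, a hedged plain-JMS sketch against ActiveMQ could look like this (the broker URL and the VirtualTopic.Portal name are invented; the Consumer.Payroll.* queue naming follows ActiveMQ's default virtual-topic convention):

```java
import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

import org.apache.activemq.ActiveMQConnectionFactory;

public class VirtualTopicSketch {

    public static void main(String[] args) throws Exception {
        Connection connection =
                new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // Producer publishes to the virtual topic.
        Topic virtualTopic = session.createTopic("VirtualTopic.Portal");
        MessageProducer producer = session.createProducer(virtualTopic);
        producer.send(session.createTextMessage("employee record updated"));

        // Each application consumes from its own dynamically created queue;
        // multiple instances of the same app share the queue, so each message
        // is handled by exactly one instance per application.
        Queue payrollQueue = session.createQueue("Consumer.Payroll.VirtualTopic.Portal");
        MessageConsumer payrollConsumer = session.createConsumer(payrollQueue);
        TextMessage received = (TextMessage) payrollConsumer.receive(5000);
        System.out.println("Payroll got: " + (received != null ? received.getText() : "nothing"));

        connection.close();
    }
}
```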
How does RabbitMQ compare to Mule? I am going to build an application using a message-oriented architecture, and AMQP (RabbitMQ) provides everything I want, but I am perplexed by the many related technology choices and similar concepts like ESB. I have a nagging doubt that I am making a choice without considering other alternatives.
I am mostly clear that RabbitMQ is a message broker that helps mediate messages between producers and consumers (all forms of publish/subscribe), and I could understand how it's used from real examples like Twitter or Facebook updates, etc.
What is Mule? If I could achieve with Mule what I do with RabbitMQ, should I consider Mule similar to RabbitMQ?
Does Mule have a different objective than that of a message broker?
Does Mule assume that underlying it there is a message broker that delivers messages to the appropriate Mule listeners? (I could easily write a listener in RabbitMQ.)
Is Mule a completely Java-based system? (In my experiment with RabbitMQ it took me less than 30 minutes to write a simple RPC client/server with the client in C# and the server in Java; can such things be done as easily in Mule?)
Mule is an ESB (Enterprise Service Bus). RabbitMQ is a message broker.
An ESB provides added layers atop a message broker, such as routing, transformations, and business process management. It is a mediator between applications, integrating Web Services, REST endpoints, database connections, email and FTP servers - you name it. It is a high-level integration backbone which orchestrates interoperability within a network of applications that speak different protocols.
A message broker is a lower level component which enables you as a developer to relay raw messages between publishers and subscribers, typically between components of the same system but not always. It is used to enable asynchronous processing to keep response times low. Some tasks take longer to process and you don't want them to hold things up if they're not time-sensitive. Instead, post a message to a queue (as a publisher) and have a subscriber pick it up and process it "later".
Mule is a "higher level" service implemented with message broker. From the docs
The messaging backbone of the ESB is
usually implemented using JMS, but any
other message server implementation
could be used
You can build an ESB with Rabbit; however, you're going to be limited to sending byte[] packages, and you'll have to build your system out of messaging primitives like topics and queues. It might be a bit faster (based on absolutely no benchmarking, testing or data) because there are fewer layers of translation. Mule provides an abstraction on top of this, speaks a variety of transports, and can handle some routing logic.
Mule is an Enterprise Service Bus providing an end-to-end integration solution, whereas RabbitMQ is a message broker for queueing messages between producers and consumers.
RabbitMQ, an open source message broker, is written in the Erlang programming language and built on the Open Telecom Platform for clustering and failover. It is easy to use, supports a huge number of developer platforms, and runs on all major operating systems. It works on a concept called exchanges.
Mule connects to RabbitMQ with an AMQP connector.