What is Hystrix in Spring? [closed] - spring

Closed 6 years ago.
Can someone please explain Hystrix to me? I googled it, but it is still not clear to me.
What is Hystrix?
Why do we use Hystrix?
Please provide me with an example of Hystrix usage.

What is Hystrix?
Hystrix is a library developed by Netflix and is available in Spring through the Spring Cloud Netflix project. It is a fault tolerance library, used as a strategy against failures (at different levels) in a service layer.
Why do we use Hystrix?
Hystrix can be used in situations where your application depends on remote services. If one or more remote services are down, you can handle the situation by using a circuit breaker in your application.
In simpler terms: how do you keep one service functioning when the external services it calls are failing?
Hystrix watches methods for failing calls to related services. When the calls made by such a method fail often enough, it opens the circuit, which means it forwards the call to a fallback method instead of the remote service. Once the service is restored, it closes the circuit and the application behaves as expected again.
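To make the open/close behaviour described above concrete, here is a minimal sketch of a circuit breaker in plain Java. It is not Hystrix's implementation: the class name, the "consecutive failures" rule and both constructor parameters are assumptions made for illustration, and the whole call is synchronized only to keep the sketch short.

```java
import java.util.function.Supplier;

/** Minimal circuit breaker sketch (hypothetical; not how Hystrix is implemented). */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // assumed rule: consecutive failures before opening
    private final long openMillis;      // assumed rule: time to stay open before one trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized State state() {
        return state;
    }

    /** Runs the remote call, or the fallback when the circuit is open or the call fails. */
    synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                return fallback.get();   // fail fast: the remote service is not touched
            }
            state = State.HALF_OPEN;     // the wait is over: let one trial call through
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0;
            state = State.CLOSED;        // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }
}
```

With a threshold of 2, the first two failing calls still reach the remote service and return the fallback; from the third call on, the fallback is returned without calling the remote service until the open period has passed.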
See this great article for more background.

What Is Hystrix?
Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.
In a distributed environment, inevitably some of the many service dependencies will fail. Hystrix is a library that helps you control the interactions between these distributed services by adding latency tolerance and fault tolerance logic. Hystrix does this by isolating points of access between the services, stopping cascading failures across them, and providing fallback options, all of which improve your system’s overall resiliency.
What does it do?
1) Latency and Fault Tolerance
Stop cascading failures. Fallbacks and graceful degradation. Fail fast and rapid recovery. Thread and semaphore isolation with circuit breakers.
2) Realtime Operations
Realtime monitoring and configuration changes. Watch service and property changes take effect immediately as they spread across a fleet. Be alerted, make decisions, effect change and see results in seconds.
3) Concurrency
Parallel execution. Concurrency aware request caching. Automated batching through request collapsing.
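As an illustration of the "semaphore isolation" mentioned in point 1, the following plain-Java sketch caps the number of concurrent calls to one dependency and rejects the rest immediately. It only demonstrates the idea; the class and parameter names are hypothetical and this is not Hystrix code.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Semaphore-style isolation sketch: caps concurrent calls to a single dependency. */
class SemaphoreBulkhead {
    private final Semaphore permits;

    SemaphoreBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free; otherwise rejects at once and returns the fallback. */
    <T> T call(Supplier<T> dependencyCall, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get();   // fail fast instead of queueing behind a slow dependency
        }
        try {
            return dependencyCall.get();
        } finally {
            permits.release();       // always hand the permit back, even if the call throws
        }
    }
}
```

Because a saturated dependency is answered from the fallback straight away, callers do not pile up waiting on it, which is how this kind of isolation helps stop cascading failures.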
Two of the main places where Hystrix is used are:
Circuit Breaker
This guide walks you through the process of applying circuit breakers to potentially-failing method calls using the Netflix Hystrix fault tolerance library.
Hystrix Dashboard
The Hystrix Dashboard allows you to monitor Hystrix metrics in real time.
For further information on Hystrix, visit https://github.com/Netflix/Hystrix/wiki/How-To-Use
For further information on the Hystrix Dashboard, visit https://github.com/Netflix/Hystrix/wiki/Dashboard

Related

Advisable to run a Kafka producer + consumer in same application?

Spring + Apache Kafka noob here. I'm wondering if it's advisable to run a single Spring Boot application that handles both producing and consuming messages.
A lot of the applications I've seen using Kafka lately usually have one separate application send/emit the message to a Kafka topic, and another one that consumes/processes the message from that topic. For larger applications, I can see a case for separate producer and consumer applications, but what about smaller ones?
For example, say I have a simple app that processes HTTP requests by sending them on to a third-party service, and to ensure retryability I put each request on a Kafka topic that is consumed by a service using the @Retryable annotation?
And what other considerations might come into play since it would be on the Spring framework?
Note: As your question states, what I'll say is more advice based on my beliefs and experience than some absolute truth written in stone.
Your use case seems more like a proxy than an actual application with business logic. You should make sure that making this an asynchronous service makes sense - maybe it's good enough to simply hold the connection until you get a response from the third party, and let your client handle retries if you get an error. Of course, you can also retry until some timeout.
This would avoid common asynchronous issues such as making your client need to poll or have a webhook in order to get a result, or making sure a record still makes sense to be processed after a lot of time has elapsed after an outage or a high consumer lag.
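The synchronous option (hold the connection and retry until some timeout) could look roughly like the sketch below. It is plain Java with hypothetical names; the fixed pause between attempts is an assumption, and a real service would more likely use a retry library with backoff.

```java
import java.util.function.Supplier;

/** Synchronous retry sketch: retry a call until it succeeds or a deadline would be crossed. */
class DeadlineRetry {
    static <T> T callWithRetry(Supplier<T> thirdPartyCall, long timeoutMillis, long pauseMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            try {
                return thirdPartyCall.get();
            } catch (RuntimeException e) {
                // Give up once another pause would cross the deadline, so the client sees the error.
                if (System.currentTimeMillis() + pauseMillis > deadline) {
                    throw e;
                }
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw e;   // stop retrying when the request thread is interrupted
                }
            }
        }
    }
}
```

The client either gets the third party's answer on the same connection or the last error once the deadline is reached, so no polling or webhook is needed.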
If your client doesn't care about the result as long as it gets done, and you don't expect high-throughput on either side, a single Spring Boot application should be enough for handling both producer and consumer sides - while also keeping it simple.
If you do expect high throughput, I'd look into building a WebFlux based application with the reactor-kafka library - high throughput proxies are an excellent use case for reactive applications.
Another option would be having a simple serverless function that handles the http requests and produces the records, and a standard Spring Boot application to consume them.
TBH, I don't see a use case where having two full-fledged Java applications handle a proxy duty would pay off, unless you have infrastructure sound enough that managing two applications instead of one makes no difference and the extra resource usage is not an issue.
Actually, if you expect really high traffic and a serverless function wouldn't work, or maybe you want to stick to Java-based solutions, then you could have a simple WebFlux-based application to handle the HTTP requests and send the messages, and a standard Spring Boot or another WebFlux application to handle consumption. This way you'd be able to scale up the former to accommodate the high traffic, and independently scale the latter according to your performance requirements.
As for the retry part, if you stick to non-reactive Spring Kafka applications, you might want to look into the non-blocking retries feature from Spring Kafka. This lets your consumer application process other records while waiting to retry a failed one. The @Retryable approach is deprecated in favor of DefaultErrorHandler, and both block consumption while waiting.
Note that with that you lose ordering guarantees, so use it only if the order the requests are processed is not important.

How to find why and which microservice is slow?

In a recent interview at Sapient, the interviewer asked a few questions:
Q1: How do you find which microservice is slow if your query goes through multiple services?
Q2: How do you use logs in microservices, and which information would you put in the logs?
If anybody has an answer, please explain.
Thanks in advance.
Speaking generally, for the first one you can follow the circuit breaker pattern: set timeouts on the called methods, so that if they don't respond within a threshold, fallback methods are used to return a mock object of the kind of data expected from the called method.
There are libraries for this that work with Spring, such as Resilience4j or Hystrix.
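The timeout-plus-fallback idea can be sketched with the JDK alone. This is only an illustration of the pattern, not the API of Resilience4j or Hystrix; the class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class TimeoutFallback {
    /** Returns the call's result, or the fallback value if it fails or exceeds the timeout. */
    static <T> T callWithTimeout(Supplier<T> remoteCall, long timeoutMillis, T fallbackValue) {
        CompletableFuture<T> future = CompletableFuture.supplyAsync(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Too slow or failed: answer with the mock value.
            // Note that the slow call itself keeps running in the background.
            return fallbackValue;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallbackValue;
        }
    }
}
```

Counting how often each downstream call ends up in the fallback branch already gives a first hint about which service is the slow one.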
For logging you can use distributed tracing, e.g. via Zipkin (which Spring Cloud integrates through Spring Cloud Sleuth). What gets logged for your application is purely your choice.
If you are working in a Kubernetes-based environment, you can also use Jaeger for distributed tracing, and Istio for service mesh and circuit breakers.
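What a tracing tool records can be approximated in a few lines: every downstream call is timed and tagged with the same trace id, and the slowest entry points at the slow service. The sketch below is a toy in plain Java with hypothetical names and a hypothetical log format; it is not the Zipkin or Jaeger API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Toy request trace: one timed entry per downstream call, all tagged with the same trace id. */
class RequestTrace {
    private final String traceId;
    private final LongSupplier clockMillis;   // injectable clock, so timings can be tested
    private final List<String> services = new ArrayList<>();
    private final List<Long> durations = new ArrayList<>();

    RequestTrace(String traceId, LongSupplier clockMillis) {
        this.traceId = traceId;
        this.clockMillis = clockMillis;
    }

    /** Runs the call and records how long the named service took, even if it throws. */
    <T> T timed(String service, Supplier<T> call) {
        long start = clockMillis.getAsLong();
        try {
            return call.get();
        } finally {
            services.add(service);
            durations.add(clockMillis.getAsLong() - start);
        }
    }

    /** Name of the service with the longest recorded call, or "none" if nothing was recorded. */
    String slowestService() {
        int slowest = -1;
        for (int i = 0; i < durations.size(); i++) {
            if (slowest < 0 || durations.get(i) > durations.get(slowest)) {
                slowest = i;
            }
        }
        return slowest < 0 ? "none" : services.get(slowest);
    }

    /** One log line per call; the shared trace id lets logs from different services be joined. */
    List<String> logLines() {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < services.size(); i++) {
            lines.add("traceId=" + traceId + " service=" + services.get(i)
                    + " durationMs=" + durations.get(i));
        }
        return lines;
    }
}
```

In production the clock would be System::currentTimeMillis and the trace id would be passed along in a request header, which is the part the tracing libraries automate.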
Hope this turns out to be useful!

Microservices: Service discovery/ circuit breaker for Event-driven architecture

I'm fairly new to Microservices...
I've taken an interest in learning more about two main patterns like service discovery and circuit breaker and I have conducted research on how these could be implemented.
As a Java Developer, I'm using Spring Boot. From what I understand, these patterns are useful if microservices communicate via HTTP.
One of the topics I've recently seen is the importance of event-driven architecture, which makes use of an event message bus: services send messages to the bus, and other services subscribe to it and process the messages.
Given this event-driven nature, how can service-discovery and circuit breakers be achieved/implemented, given that these are commonly applicable for services communicating via HTTP?
From what I understand, these patterns are useful if microservices communicate via HTTP.
It is irrelevant that the communication is HTTP. The circuit breaker is useful in prevention of cascade failures that are more probable to occur in the architectures that use a synchronous communication style.
Event-driven architectures are in general asynchronous so cascade failure is less probable to occur.
Service discovery is used in order for the microservices to discover each other but in Event-driven architectures microservices communicate only to the messaging infrastructure (i.e. the Event store in Event sourcing) so discoverability could be used only at the infrastructure level.
I. Circuit breaker and service discovery are patterns. Being patterns, they can be implemented in any programming language. HTTP is just the protocol used to transfer data.
A circuit breaker can be implemented in Java. You can find many implementations (of course, with varying capabilities and interpretations of the pattern) on GitHub.
Some of the well-known, purpose-built implementations are:
Hystrix from NetflixOSS. For using Hystrix, you can follow Spring Guide - Spring Circuit Breaker
Apache Polygene - which has example of JMX circuit breaker
Resilience4j
II. About,
Given this event-driven nature, how can service-discovery and circuit
breakers be achieved/ implemented, given that these are commonly
applicable for services communicating via HTTP?
It seems you need a bit more research on the topic of microservice interactions.
There are two ways in which microservices can interact. You have to choose one over the other; you should not mix the two.
Orchestration: An interaction style that has an intelligent controller that dispatches events to processes. Please note the word 'processes' which is representing business processes here. Orchestration style was preferred in old SOA implementations as well.
Choreography: An interaction style that allows processes to subscribe to events and handle them independently or through integration with other processes without the need for a central controller.
These topics are covered well under
Orchestration vs. Choreography
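To illustrate choreography, here is a small in-memory event bus in plain Java. It merely stands in for a real message broker, and the class, event names and payloads are hypothetical: each service subscribes to the events it cares about and reacts on its own, with no central controller deciding the order of steps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for a message broker (illustration only, single-threaded). */
class EventBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String eventType, Consumer<String> handler) {
        subscribers.computeIfAbsent(eventType, key -> new ArrayList<>()).add(handler);
    }

    void publish(String eventType, String payload) {
        // Iterate over a copy so handlers may subscribe or publish follow-up events.
        List<Consumer<String>> handlers =
                new ArrayList<>(subscribers.getOrDefault(eventType, Collections.emptyList()));
        for (Consumer<String> handler : handlers) {
            handler.accept(payload);
        }
    }
}
```

A payment service could subscribe to an "OrderPlaced" event and publish "PaymentTaken" when it is done, and a shipping service could subscribe to "PaymentTaken"; neither needs to know that the other exists.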
Need of Service Discovery:
With choreography, two or more microservices can coordinate their activities and processes to share information and value.
But these microservices may not be aware of each other's existence, i.e. there are no hard-coded references to dependency endpoints configured or coded into them. We do this to avoid any kind of coupling between services. So the question remains: how will one service, if required, find another service's endpoint? This is where a service discovery mechanism is used.
Another perspective: when microservices are deployed in containers, their endpoints are not even tied to any particular host (due to the spin-up and spin-down of containers). So for this case as well, we need a service discovery mechanism.
So, in a service discovery mechanism, a centralized service discovery tool helps services register themselves and discover other services via a DNS or HTTP interface.
Service discovery can be implemented with
1. Server-side service discovery
2. Client-side service discovery
Consul, etcd and ZooKeeper are some of the key tool names in the service discovery space.
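The register-and-discover cycle can be sketched as an in-memory registry with client-side, round-robin resolution. This is a toy in plain Java with hypothetical names, not the API of Consul, etcd, ZooKeeper or Eureka; real registries add health checks, leases and replication.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy service registry: instances register themselves, clients resolve an endpoint by name. */
class ServiceRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> cursors = new ConcurrentHashMap<>();

    void register(String serviceName, String endpoint) {
        instances.computeIfAbsent(serviceName, key -> new CopyOnWriteArrayList<>()).add(endpoint);
    }

    void deregister(String serviceName, String endpoint) {
        List<String> endpoints = instances.get(serviceName);
        if (endpoints != null) {
            endpoints.remove(endpoint);
        }
    }

    /** Client-side discovery: returns the next registered endpoint in round-robin order. */
    String resolve(String serviceName) {
        List<String> endpoints = instances.get(serviceName);
        // Work on a snapshot so a concurrent deregister cannot change the size under us.
        String[] snapshot = endpoints == null ? new String[0] : endpoints.toArray(new String[0]);
        if (snapshot.length == 0) {
            throw new IllegalStateException("no instance registered for " + serviceName);
        }
        int next = cursors.computeIfAbsent(serviceName, key -> new AtomicInteger()).getAndIncrement();
        return snapshot[Math.floorMod(next, snapshot.length)];
    }
}
```

Callers only know the logical name (for example "orders"); containers can come and go as long as they register and deregister their current endpoint.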
Spring Boot integrates well with Spring Cloud, and Spring Cloud provides Eureka (for service discovery) as well as Hystrix (for the circuit breaker pattern). There is also Spring Cloud Stream, which provides event-driven patterns.
These are very easy to use with Spring Boot.
I believe there is a misunderstanding in the question in that you assume that event-driven architectures cannot be implemented on top of HTTP.
An event-driven architecture may be implemented in many different ways and (when the architecture is that of a distributed system), on top of many different protocols.
It can be implemented using a message broker (e.g. Kafka, RabbitMQ, ActiveMQ, etc.), as you suggested. However, this is just a choice and certainly not the only way to do it.
For example, the seminal book Building Microservices by Sam Newman, in Chapter 4: Integration, under Implementing Asynchronous Event-Based Collaboration says:
“Another approach is to try to use HTTP as a way of propagating
events. ATOM is a REST-compliant specification that defines semantics
(among other things) for publishing feeds of resources. Many client
libraries exist that allow us to create and consume these feeds. So
our customer service could just publish an event to such a feed when
our customer service changes. Our consumers just poll the feed,
looking for changes. On one hand, the fact that we can reuse the
existing ATOM specification and any associated libraries is useful,
and we know that HTTP handles scale very well. However, HTTP is not
good at low latency (where some message brokers excel), and we still
need to deal with the fact that the consumers need to keep track of
what messages they have seen and manage their own polling schedule.
I have seen people spend an age implementing more and more of the
behaviors that you get out of the box with an appropriate message
broker to make ATOM work for some use cases. For example, the
Competing Consumer pattern describes a method whereby you bring up
multiple worker instances to compete for messages, which works well
for scaling up the number of workers to handle a list of independent
jobs. However, we want to avoid the case where two or more workers see
the same message, as we’ll end up doing the same task more than we
need to. With a message broker, a standard queue will handle this.
With ATOM, we now need to manage our own shared state among all the
workers to try to reduce the chances of reproducing effort. If you
already have a good, resilient message broker available to you,
consider using it to handle publishing and subscribing to events. But
if you don’t already have one, give ATOM a look, but be aware of the
sunk-cost fallacy. If you find yourself wanting more and more of the
support that a message broker gives you, at a certain point you might
want to change your approach.”
Likewise, if your design uses a message broker for the event-driven architecture, then I'm not sure a circuit breaker is needed, because in that case the consumer applications control the rate at which event messages are consumed from the queues. The producer application can publish event messages at its own pace, and the consumer applications can add as many competing consumers as they want to keep up with that pace. If the producer application is down, the consumer applications can still continue consuming any remaining messages in the queues, and once the queues are empty, they will just wait for more messages to arrive. That does not put any burden on the producer application. The producer and the consumer applications are decoupled in this scenario, and the work the circuit breaker does in other scenarios is handled by the message broker.
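The decoupling described here can be sketched with a JDK BlockingQueue standing in for the broker queue. The class name, the worker count and the stop marker are assumptions made for this illustration: messages are queued up front, and several competing workers drain them at their own pace, each message going to exactly one worker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class CompetingConsumers {
    private static final String STOP = "__stop__";   // hypothetical marker that ends a worker

    /** Drains the messages with workerCount workers and reports how often each one was handled. */
    static Map<String, Integer> process(List<String> messages, int workerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(messages);
        for (int i = 0; i < workerCount; i++) {
            queue.add(STOP);                         // one marker per worker, behind the real messages
        }
        Map<String, Integer> handledCount = new ConcurrentHashMap<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    // take() hands every queued message to exactly one of the competing workers.
                    for (String m = queue.take(); !STOP.equals(m); m = queue.take()) {
                        handledCount.merge(m, 1, Integer::sum);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return handledCount;
    }
}
```

Adding workers speeds up consumption without the producer noticing, which is the property that makes a circuit breaker between producer and consumer unnecessary in this setup.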
Something similar can be said of service discovery. Since the producer and the consumer do not talk to each other directly, but only through the message broker, the only service you need to discover is the message broker.

Measuring Microservices - Fault Tolerant

One of the benefits of a microservices architecture is fault tolerance, which means that an issue in one service should not impact other services. As a result, it should improve the availability of each particular service. However, measures such as HA and auto scaling also help availability. Instead of measuring general service availability, how can we get a more specific quantitative measurement showing that microservices improve fault tolerance?
Fault tolerance or resilience has more to do with your internal application architecture than with using microservices or another architectural style. For example, if you compare a well-structured monolith with internal error handling and fallback strategies to a bunch of microservices that are designed with interdependencies but no built-in resilience, the microservices will be far more likely to fail all together.
Here are some ideas for building a resilient system:
Avoid interdependencies. Most important, but not always possible.
Use an infrastructure with built in self-healing capabilities, such as Kubernetes.
Use an API gateway with built in resilience, such as Zuul.
Use specialized libraries for resilient calling with promises and circuit breakers, such as Hystrix.
Buffer requests in a streaming platform, such as Kafka, to protect against load spikes and intermittent service failures.
Design your APIs to be idempotent.
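As an example of the last point, one common way to make an operation idempotent is to have the client send an idempotency key and to store the response under that key. The sketch below assumes such a key exists and keeps responses in memory; the names are hypothetical, and a real service would persist the keys and expire them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Idempotency sketch: the operation runs once per key; retries get the stored response. */
class IdempotentHandler {
    private final Map<String, String> responsesByKey = new ConcurrentHashMap<>();

    String handle(String idempotencyKey, Supplier<String> operation) {
        // computeIfAbsent runs the operation at most once per key, even under concurrent retries.
        // If the operation throws, nothing is stored, so a later retry runs it again.
        return responsesByKey.computeIfAbsent(idempotencyKey, key -> operation.get());
    }
}
```

With this in place a client can safely retry after a timeout: a repeated request with the same key returns the first response instead of, say, charging a customer twice.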
When you ask about measuring fault tolerance, you should look into automated testing of your application. For example, you can write tests that use randomized input, wrong input or ultra-high loads in an attempt to disturb the services. So measuring or proving fault tolerance is really a task for the testing team.

Spring cloud - how to get benefits of retry,load balancing and circuit breaker for distributed spring application

I want the following features in spring-cloud-Eureka backed microservices application.
1) Load balancing - if I have 3 nodes for one service, load balancing should happen between them
2) Retry logic - if one of the nodes does not respond, a retry should happen a certain number of times (e.g. 3, should be configurable) before falling back to another node.
3) Circuit breaker - if for some reason all 3 nodes of the service have trouble accessing the DB and throw exceptions or do not respond, the circuit should open, the fallback method should be called, and the circuit should close automatically after the service recovers.
Looking at many examples of Spring Cloud, I figured out:
1) RestTemplate will help with option 1. But when RestTemplate accesses one instance of the service and that node fails, will it try the other two nodes?
2) Hystrix will help with the circuit breaker option (3 above). But if just one node is not responding, will it try other nodes before opening the circuit and calling the fallback method? And will it automatically close the circuit once the service recovers?
3) How do I get retry logic with spring-cloud? I do know about the @Retryable annotation. But will it help in the following situation?
Retry with one node for 3 times and after it fails, try the next node 3 times and the last node 3 times before circuit breaker kicks in.
I see that all these configurations are available in Spring Cloud, but I am having a hard time understanding how to configure all of them into an efficient solution.
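Here is one proposal:
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.retry.annotation.Retryable;

@HystrixCommand
@Retryable
public Object doSomething() {
    // use your RestTemplate here and return its result
}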
But I don't totally know if it is going to help me with all the subtleties I mentioned above.
I do see there is a @FeignClient. But from this blog, I understand that it provides a high-level feature for HTTP client requests. Does it help with retry, circuit breaker and load balancing all-in-one?
Thanks
I do see there is a @FeignClient. Does it help with retry and circuit breaker and load balancing all-in-one?
If you are using the full spring-cloud stack, it actually solves everything you mentioned.
The Netflix components in this scenario are the following in spring-cloud:
Eureka - Service Registry
Lets you dynamically register your services, so you only need to fix one host in your app (Eureka).
Ribbon - Load balancer
Out of the box it provides round-robin load balancing, but you can declare your own @RibbonClient configuration (even for a specific service) and design custom load balancing, for example based on Eureka metadata. The load balancing happens on the client side.
Feign - Http client
With @FeignClient you can rapidly develop clients for your other services (or services outside of your infrastructure). It is integrated with Ribbon and Eureka, so you can refer to your services as @FeignClient(yourServiceNameInEureka), and what you end up with is a client which load balances between the registered instances with your preferred logic. If you are using Spring you can use the familiar @RequestMapping annotation to describe the endpoint you are using.
Hystrix - Circuit breaker
By default your Feign clients will use Hystrix: every request is wrapped in a Hystrix command (depending on your Spring Cloud version you may have to switch this on with feign.hystrix.enabled=true). You can of course create Hystrix commands by hand and configure them for your needs.
You have to configure a little to get these working (actually just a few @Enable... annotations on your configuration).
I highly recommend reading the provided Spring documentation, because it covers almost all of these aspects in a fairly quick read.
http://cloud.spring.io/spring-cloud-netflix/spring-cloud-netflix.html
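The behaviour asked for in the question (retry one node a few times, move on to the next node, and only then fall back) can be written out in plain Java to show the intended control flow. This sketch is not how Ribbon, Feign or Hystrix implement it, and all names in it are hypothetical; in Spring Cloud Netflix the per-node and next-node retry counts are normally set through the Ribbon properties MaxAutoRetries and MaxAutoRetriesNextServer rather than coded by hand.

```java
import java.util.List;
import java.util.function.Function;

class FailoverCaller {
    /**
     * Tries each node up to attemptsPerNode times, moving to the next node when one keeps failing.
     * Returns the fallback only after every node has been exhausted.
     */
    static <T> T call(List<String> nodes, int attemptsPerNode,
                      Function<String, T> request, T fallback) {
        for (String node : nodes) {
            for (int attempt = 1; attempt <= attemptsPerNode; attempt++) {
                try {
                    return request.apply(node);
                } catch (RuntimeException e) {
                    // Swallow and retry; a real client would log and probably back off here.
                }
            }
        }
        return fallback;   // all nodes failed: this is where a circuit breaker fallback takes over
    }
}
```

With three nodes and three attempts per node, the fallback is reached only after nine failed requests, which is worth keeping in mind when choosing the Hystrix timeout around such a call.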
