Docker Swarm Spring Boot and Eureka service discoverer not working - spring-boot

Currently working on swarmifying our Spring Boot microservice back-end with a Eureka service discoverer. The first problem was making sure the service discoverer doesn't pick the ingress IP address but instead the IP address from the overlay network. After some searching I found a post that suggests the following Eureka client configuration:
import java.io.IOException;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.Enumeration;

import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.cloud.commons.util.InetUtils;
import org.springframework.cloud.netflix.eureka.EurekaInstanceConfigBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.core.env.ConfigurableEnvironment;

@Configuration
@EnableConfigurationProperties
public class EurekaClientConfig {

    private ConfigurableEnvironment env;

    public EurekaClientConfig(final ConfigurableEnvironment env) {
        this.env = env;
    }

    @Bean
    @Primary
    public EurekaInstanceConfigBean eurekaInstanceConfigBean(final InetUtils inetUtils) throws IOException {
        final String hostName = System.getenv("HOSTNAME");
        String hostAddress = null;
        final Enumeration<NetworkInterface> networkInterfaces = NetworkInterface.getNetworkInterfaces();
        // Walk all network interfaces and pick the address whose hostname matches the container's HOSTNAME
        for (NetworkInterface netInt : Collections.list(networkInterfaces)) {
            for (InetAddress inetAddress : Collections.list(netInt.getInetAddresses())) {
                if (hostName.equals(inetAddress.getHostName())) {
                    hostAddress = inetAddress.getHostAddress();
                    System.out.printf("Inet used: %s", netInt.getName());
                }
                System.out.printf("Inet %s: %s / %s\n", netInt.getName(), inetAddress.getHostName(), inetAddress.getHostAddress());
            }
        }
        if (hostAddress == null) {
            throw new UnknownHostException("Cannot find ip address for hostname: " + hostName);
        }
        final int nonSecurePort = Integer.valueOf(env.getProperty("server.port", env.getProperty("port", "8080")));
        final EurekaInstanceConfigBean instance = new EurekaInstanceConfigBean(inetUtils);
        instance.setHostname(hostName);
        instance.setIpAddress(hostAddress);
        instance.setNonSecurePort(nonSecurePort);
        System.out.println(instance);
        return instance;
    }
}
After deploying the new discoverer I got the correct result and the service discoverer had the correct overlay IP address.
In order to understand the next step, here is some information about the environment we run this Docker swarm on. We currently have two droplets, one for development and the other for production. Currently we are only working on the development server to swarmify it. The production one hasn't been touched in months.
The next step is to deploy a discovery client Spring Boot application that will connect to the correct service discoverer and also use the overlay IP address instead of the ingress one. But when I build the application it always connects to our production service discoverer outside the Docker swarm on the other droplet. I can see the application being deployed on the swarm, but looking at the Eureka dashboard of the production server I can see that it connects to it.
The second problem is that the application also has the EurekaClient config you see above, but it is ignored. Even the log statements within the method are not printed when starting up the application.
Here is the configuration from the Discovery Client application:
eureka:
  client:
    serviceUrl:
      defaultZone: service-discovery_service:8761/eureka
    enabled: false
  instance:
    instance-id: ${spring.application.name}:${random.value}
    prefer-ip-address: true
spring:
  application:
    name: account-service
I assume that you can use defaultZone to point at the correct service discoverer, but I could be wrong.
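As an aside, two things in that snippet stand out: eureka.client.enabled: false switches the Eureka client off entirely, and the defaultZone value is missing its http:// scheme. A minimal corrected sketch, reusing the service name from the question (whether that name resolves on your overlay network is an assumption):

eureka:
  client:
    enabled: true
    serviceUrl:
      defaultZone: http://service-discovery_service:8761/eureka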

Just don't use a Eureka service discoverer but something else like Traefik. Much easier solution.
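For illustration, a hypothetical Swarm deploy-label sketch for fronting a service with Traefik v2 (the router/service name, path rule, and port are made up for this example):

deploy:
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.account.rule=PathPrefix(`/account`)"
    - "traefik.http.services.account.loadbalancer.server.port=8080"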

Related

Load balancing problems with Spring Cloud Kubernetes

We have Spring Boot services running in Kubernetes and are using the Spring Cloud Kubernetes load balancer functionality with RestTemplate to make calls to other Spring Boot services. One of the main reasons we have this in place is historical: previously we ran our services in EC2 using Eureka for service discovery, and after the migration we kept the Spring discovery client / client-side load balancing in place (updating dependencies etc. for it to work with the Spring Cloud Kubernetes project).
We have a problem that when one of the target pods goes down we get multiple request failures for a period of time with java.net.NoRouteToHostException, i.e. the Spring load balancer is still trying to send to that pod.
So I have a few questions on this:
Shouldn't the target instance get removed automatically when this happens? So it might happen once but after that, the target pod list will be repaired?
Or if not is there some other configuration we need to add to handle this - eg retry / circuit breaker, etc?
A more general question is what benefit does Spring's client-side load balancing bring with Kubernetes? Without it, our service would still be able to call other services using Kubernetes built-in service / load-balancing functionality and this should handle the issue of pods going down automatically. The Spring documentation also talks about being able to switch from POD mode to SERVICE mode (https://docs.spring.io/spring-cloud-kubernetes/docs/current/reference/html/index.html#loadbalancer-for-kubernetes). But isn't this service mode just what Kubernetes does automatically? I'm wondering if the simplest solution here isn't to remove the Spring Load Balancer altogether? What would we lose then?
An update on this: we had the spring-retry dependency in place, but the retry was not working since by default it only works for GETs, and most of our calls are POSTs (but OK to call again). Adding the configuration spring.cloud.loadbalancer.retry.retryOnAllOperations: true fixed this, and hence most of these failures should be avoided by the retry using an alternative instance on the second attempt.
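For reference, a minimal application.yml sketch of that setting, using the property name given above:

spring:
  cloud:
    loadbalancer:
      retry:
        retryOnAllOperations: true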
We have also added a RetryListener that clears the load balancer cache for the service on certain connection exceptions:
import java.net.NoRouteToHostException;

import org.apache.http.conn.ConnectTimeoutException; // assumption: the Apache HttpClient variant of this exception
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.BeanFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.Cache;
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.loadbalancer.reactive.ReactiveLoadBalancer;
import org.springframework.cloud.loadbalancer.blocking.retry.BlockingLoadBalancedRetryFactory;
import org.springframework.cloud.loadbalancer.cache.LoadBalancerCacheManager;
import org.springframework.cloud.loadbalancer.core.CachingServiceInstanceListSupplier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.RetryListener;

@Configuration
public class RetryConfig {

    private static final Logger logger = LoggerFactory.getLogger(RetryConfig.class);

    // Need to use bean factory here as can't autowire LoadBalancerCacheManager -
    // - it's set to 'autowireCandidate = false' in LoadBalancerCacheAutoConfiguration
    @Autowired
    private BeanFactory beanFactory;

    @Bean
    public CacheClearingLoadBalancedRetryFactory cacheClearingLoadBalancedRetryFactory(ReactiveLoadBalancer.Factory<ServiceInstance> loadBalancerFactory) {
        return new CacheClearingLoadBalancedRetryFactory(loadBalancerFactory);
    }

    // Extension of the default bean that defines a retry listener
    public class CacheClearingLoadBalancedRetryFactory extends BlockingLoadBalancedRetryFactory {

        public CacheClearingLoadBalancedRetryFactory(ReactiveLoadBalancer.Factory<ServiceInstance> loadBalancerFactory) {
            super(loadBalancerFactory);
        }

        @Override
        public RetryListener[] createRetryListeners(String service) {
            RetryListener cacheClearingRetryListener = new RetryListener() {
                @Override
                public <T, E extends Throwable> boolean open(RetryContext context, RetryCallback<T, E> callback) { return true; }

                @Override
                public <T, E extends Throwable> void close(RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {}

                @Override
                public <T, E extends Throwable> void onError(RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {
                    logger.warn("Retry for service {} picked up exception: context {}, throwable class {}", service, context, throwable.getClass());
                    if (throwable instanceof ConnectTimeoutException || throwable instanceof NoRouteToHostException) {
                        try {
                            LoadBalancerCacheManager loadBalancerCacheManager = beanFactory.getBean(LoadBalancerCacheManager.class);
                            Cache loadBalancerCache = loadBalancerCacheManager.getCache(CachingServiceInstanceListSupplier.SERVICE_INSTANCE_CACHE_NAME);
                            if (loadBalancerCache != null) {
                                boolean result = loadBalancerCache.evictIfPresent(service);
                                logger.warn("Load Balancer Cache evictIfPresent result for service {} is {}", service, result);
                            }
                        } catch (Exception e) {
                            logger.error("Failed to clear load balancer cache", e);
                        }
                    }
                }
            };
            return new RetryListener[] { cacheClearingRetryListener };
        }
    }
}
Are there any issues with this approach? Could something like this be added to the built-in functionality?
Shouldn't the target instance get removed automatically when this happens? So it might happen once but after that the target pod list will be repaired?
To resolve this issue you have to use readiness and liveness probes in Kubernetes.
The readiness probe checks the health of an endpoint your application exposes, at a set interval. If the check fails, Kubernetes marks the pod as unready to accept traffic, so no traffic goes to that pod (replica).
The liveness probe restarts your application if it fails, so the container (pod) comes up again; once Kubernetes gets a 200 response from the app, it marks the pod as ready to accept traffic again.
You can create a simple endpoint in the application that returns a 200 or 204 response as needed.
Read more at: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
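A minimal container-spec sketch, assuming the app listens on port 8080 and exposes the standard actuator health groups (paths and timings are illustrative):

livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  periodSeconds: 10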
Make sure your applications use the Kubernetes Service to talk to each other:
Application 1 > Kubernetes service of App 2 > Application 2 PODs
To enable load balancing based on the Kubernetes Service name, use the following property. The load balancer will then try to call the application using an address such as service-a.default.svc.cluster.local:
spring.cloud.kubernetes.loadbalancer.mode=SERVICE
The most typical way to use Spring Cloud LoadBalancer on Kubernetes is with service discovery. If you have any DiscoveryClient on your classpath, the default Spring Cloud LoadBalancer configuration uses it to check for service instances. As a result, it only chooses from instances that are up and running. All that is needed is to annotate your Spring Boot application with @EnableDiscoveryClient to enable K8s-native service discovery.
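A minimal sketch of that annotation on a Boot application class (the class name is illustrative):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

@SpringBootApplication
@EnableDiscoveryClient // lets the discovery-backed load balancer see only live instances
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}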
References: https://stackoverflow.com/a/68536834/5525824

Fail-fast behavior for Eureka client

It seems that the following problem doesn't have a common solution, so I am trying to approach it from another side. The microservices infrastructure consists of Spring Boot microservices with Eureka-Zuul-Config-Admin servers as a service mesh. All microservices run inside Docker containers on the Kubernetes platform. Kubernetes monitors the application health check (liveness/readiness probes) and redeploys it when the health check stays down longer than the liveness probe timeout.
The problem is the following: sometimes a microservice doesn't get the correct Eureka server address after redeployment. Service discovery registration fails, but the microservice continues working with health check 'UP', and dependent microservices miss it.
Microservices are interdependent, and the failure of one microservice causes a cascade failure of all dependent microservices. I don't use Hystrix for various reasons, and it actually would not resolve my problem: missing data from the failed microservice simply disables all functionality related to the set of dependent microservices.
Question: Is it possible to configure something like 'fail-fast' behavior for the Eureka client without writing a custom HealthIndicator? The actuator health check should stay in the 'DOWN' state while the Eureka client hasn't received a 204 successful registration response from Eureka.
Here is an example of how I fixed it in code. It has pretty simple behavior: the health check goes down 'forever' after the timeout for successful registration in Eureka is exceeded, on start or/and during runtime. The main goal is for Kubernetes to redeploy the microservice when the liveness probe timeout is exceeded.
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

import com.netflix.appinfo.InstanceInfo;
import com.netflix.discovery.EurekaClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;

@Component
public class CustomHealthIndicator implements HealthIndicator {

    private static final Logger logger = LoggerFactory.getLogger(CustomHealthIndicator.class);

    @Autowired
    @Qualifier("eurekaClient")
    private EurekaClient eurekaClient;

    private static final int HEALTH_CHECK_DOWN_LIMIT_MIN = 15;

    private LocalDateTime healthCheckDownTimeLimit = getHealthCheckDownLimit();

    @Override
    public Health health() {
        int errCode = registeredInEureka();
        return errCode != 0
                ? Health.down().withDetail("Eureka registration fails", errCode).build()
                : Health.up().build();
    }

    private int registeredInEureka() {
        int status = 0;
        if (isStatusUp()) {
            // Registered: push the deadline forward
            healthCheckDownTimeLimit = getHealthCheckDownLimit();
        } else if (LocalDateTime.now().isAfter(healthCheckDownTimeLimit)) {
            logger.error("Exceeded {} min. limit for getting 'UP' state in Eureka", HEALTH_CHECK_DOWN_LIMIT_MIN);
            status = HttpStatus.GONE.value();
        }
        return status;
    }

    private boolean isStatusUp() {
        return eurekaClient.getInstanceRemoteStatus().compareTo(InstanceInfo.InstanceStatus.UP) == 0;
    }

    private LocalDateTime getHealthCheckDownLimit() {
        return LocalDateTime.now().plus(HEALTH_CHECK_DOWN_LIMIT_MIN, ChronoUnit.MINUTES);
    }
}
Is it possible to do the same by just configuring Spring components?

How to communicate between two services in k8s using spring cloud

I have a Spring Boot app which uses the spring-cloud-kubernetes dependencies. It is deployed in K8s. I have implemented service discovery and I have a DiscoveryClient which gives me the service IDs in the K8s namespace. My problem is that I want to make a REST call to one of these found services (which has multiple pods running). How do I do this? Do I have to use a Ribbon client?
My code is:
import java.util.Date;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HelloController {

    // 'log' was used but never declared in the original snippet
    private static final Logger log = LoggerFactory.getLogger(HelloController.class);

    @Autowired
    private DiscoveryClient discoveryClient;

    @RequestMapping("/services")
    public List<String> services() {
        log.info("/services - Request Received " + new Date());
        List<String> services = this.discoveryClient.getServices();
        log.info("Found services " + services.toString());
        for (String service : services) {
            // TODO call to this service
            List<ServiceInstance> instances = discoveryClient.getInstances(service);
            for (ServiceInstance instance : instances) {
                // getStringVal is the asker's helper, elided from the original snippet
                log.info("Service ID >> " + service + " : Instance >> " + getStringVal(instance));
            }
        }
        return services;
    }
}
From the service instances I can find the host and port to call, but I want to call the service in such a way that some load-balancing mechanism routes the call to an actual pod instance.
You need to use Spring Cloud Kubernetes Ribbon to call the service by name instead of host and port; the call then goes through the Kubernetes Service, and you get the load balancing provided by the Kubernetes Service and kube-proxy.
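A minimal sketch of that client-side pattern, assuming a @LoadBalanced RestTemplate and a target service named account-service exposing /hello (both names are illustrative):

import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class ClientConfig {

    // @LoadBalanced makes the template resolve logical service names
    // (e.g. http://account-service/...) to real instances.
    @Bean
    @LoadBalanced
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

// usage:
// String reply = restTemplate.getForObject("http://account-service/hello", String.class);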

Netflix Ribbon throws No instances available for MY-MICROSERVICE exception

My application uses Eureka and Ribbon. I'm trying to get two microservices to talk to each other. Below is my method of concern.
@Autowired
@LoadBalanced
private RestTemplate client;

@Autowired
private DiscoveryClient dClient;

public String getServices() {
    List<String> services = dClient.getServices();
    List<ServiceInstance> serviceInstances = new ArrayList<>();
    List<String> serviceHosts = new ArrayList<>();
    for (String service : services) {
        serviceInstances.addAll(dClient.getInstances(service));
    }
    for (ServiceInstance service : serviceInstances) {
        serviceHosts.add(service.getHost());
    }
    // throws No instances available exception here
    try {
        System.out.println(this.client.getForObject("http://MY-MICROSERVICE/rest/hello", String.class, new HashMap<String, String>()));
    } catch (Exception e) {
        e.printStackTrace();
    }
    return serviceHosts.toString();
}
The method returns an array of two hostnames (IPs). So DiscoveryClient is able to see the instances of the two services registered with Eureka. But RestTemplate, or more precisely Ribbon, throws an IllegalStateException: No instances available exception.
DynamicServerListLoadBalancer for client MY-MICROSERVICE initialized: DynamicServerListLoadBalancer:{NFLoadBalancer:name=MY-MICROSERVICE,current list of Servers=[],Load balancer stats=Zone stats: {},Server stats: []}ServerList:org.springframework.cloud.netflix.ribbon.eureka.DomainExtractingServerList@23edc38f
java.lang.IllegalStateException: No instances available for MY-MICROSERVICE
at org.springframework.cloud.netflix.ribbon.RibbonLoadBalancerClient.execute(RibbonLoadBalancerClient.java:119)
at org.springframework.cloud.netflix.ribbon.RibbonLoadBalancerClient.execute(RibbonLoadBalancerClient.java:99)
at org.springframework.cloud.client.loadbalancer.LoadBalancerInterceptor.intercept(LoadBalancerInterceptor.java:58)
Even the Eureka dashboard shows the two services registered. I feel the problem is specifically with Ribbon. Here's my config file:
spring.application.name="my-microservice"
logging.level.org.springframework.boot.autoconfigure.logging=INFO
spring.devtools.restart.enabled=true
spring.devtools.add-properties=true
server.ribbon.eureka.enabled=true
eureka.client.serviceUrl.defaultZone = http://localhost:8761/eureka/
The other microservice also has the same configs except for a different name. What's the problem here?
Solved. I was using application.yml with the Eureka server and application.properties with the client. Once I converted everything to YML, all works fine.
spring:
  application:
    name: "my-microservice"
  devtools:
    restart:
      enabled: true
    add-properties: true
logging:
  level:
    org.springframework.boot.autoconfigure.logging: INFO
eureka:
  client:
    serviceUrl:
      defaultZone: "http://localhost:8761/eureka/"
This is the YML file for both apps, which only differ by the application name.

How to use SpringBoot actuator over JMX

I have an existing Spring Boot application and I want to monitor the application through the actuator. I tried the HTTP endpoints and they are working fine for me. Instead of HTTP endpoints I need JMX endpoints for my existing running application.
If you add the spring-boot-starter-actuator dependency to your build.gradle or pom.xml file you will have the JMX beans enabled by default, as well as the HTTP endpoints.
You can use JConsole in order to view your JMX exposed beans. You'll find more info about this here.
More details about how to access JMX endpoints here.
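A minimal application.yml sketch for exposing actuator endpoints over JMX (note that since Spring Boot 2.2, spring.jmx.enabled defaults to false and must be switched on; the wildcard exposure here is illustrative):

spring:
  jmx:
    enabled: true
management:
  endpoints:
    jmx:
      exposure:
        include: "*"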
Assuming you're using a Docker image where the entry point is the Spring Boot app started with java, the PID is "1", and so is the Attach API's virtual machine ID. You can implement a health probe as follows.
import com.sun.tools.attach.spi.AttachProvider;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class HealthProbe {

    public static void main(String[] args) throws Exception {
        // Attach to PID 1 (the Spring Boot app) and start a local management agent
        final var attachProvider = AttachProvider.providers().get(0);
        final var virtualMachine = attachProvider.attachVirtualMachine("1");
        final var jmxServiceUrl = virtualMachine.startLocalManagementAgent();
        try (final var jmxConnection = JMXConnectorFactory.connect(new JMXServiceURL(jmxServiceUrl))) {
            final MBeanServerConnection serverConnection = jmxConnection.getMBeanServerConnection();
            @SuppressWarnings("unchecked")
            final var healthResult =
                (Map<String, ?>)
                    serverConnection.invoke(
                        new ObjectName("org.springframework.boot:type=Endpoint,name=Health"),
                        "health",
                        new Object[0],
                        new String[0]);
            if ("UP".equals(healthResult.get("status"))) {
                System.exit(0);
            } else {
                System.exit(1);
            }
        }
    }
}
This will use the Attach API and make the original process start a local management agent.
The org.springframework.boot:type=Endpoint,name=Health object instance would have its health method invoked, which provides a Map version of the /actuator/health output. From there the value of status should be UP if things are OK.
Then exit with 0 if ok, or 1 otherwise.
This can be embedded in an existing Spring Boot app so long as loader.main is set. The following is the HEALTHCHECK probe I used:
HEALTHCHECK --interval=5s --start-period=60s \
CMD ["java", \
"-Dloader.main=net.trajano.swarm.gateway.healthcheck.HealthProbe", \
"org.springframework.boot.loader.PropertiesLauncher" ]
This is the technique I used in a distroless Docker image.
Side note: don't try to put this in a CommandLineRunner, because it will try to pull the configuration from the main app, and you likely won't need the whole web stack.
