I'm building a training app with Netflix micro-services APIs.
This is my edge, starting on localhost:9999:
@EnableHystrix
@EnableZuulProxy
@EnableEurekaClient
@SpringBootApplication
public class EdgeApplication {
    public static void main(String[] args) {
        SpringApplication.run(EdgeApplication.class, args);
    }
}
I defined the following two apps:
app-a exposes a simple web service service-a and starts on localhost:8081
app-b exposes a web service service-b which calls service-a, and starts on localhost:8082
service-b calls service-a using Netflix Feign:
@FeignClient(value = "app-a", fallback = AppAFallback.class)
public interface AppAClient {
    @RequestMapping(value = "service-a", method = RequestMethod.GET)
    List<Entity> serviceA();
}

@Component
public class AppAFallback implements AppAClient {
    private static final Entity DEFAULT_ENTITY = new Entity();

    @Override
    public List<Entity> serviceA() {
        return Collections.singletonList(DEFAULT_ENTITY);
    }
}
While app-a and app-b are running, every service answers as expected:
http://localhost:8081/service-a
http://localhost:8082/service-b
http://localhost:9999/app-a/service-a (through edge)
http://localhost:9999/app-b/service-b (through edge)
The fallback AppAFallback should be called if app-b is down. However, I have to wait about a minute before that happens.
Just after the app-b is down:
http://localhost:8081/service-a works well and the fallback is called
http://localhost:8082/service-b is not reachable
http://localhost:9999/app-a/service-a TIMEOUT : HystrixRuntimeException: app-a timed-out and no fallback available.
http://localhost:9999/app-b/service-b TIMEOUT : HystrixRuntimeException: app-b timed-out and no fallback available.
And 1 minute after app-b is down:
http://localhost:8081/service-a works well and the fallback is called
http://localhost:8082/service-b is not reachable
http://localhost:9999/app-a/service-a works well and the fallback is called
http://localhost:9999/app-b/service-b GENERAL : load balancer does not have available server for client: app-b
And this is the result I expected. Any idea about why the calls to app-a/service-a just after app-b is down are giving me TIMEOUT?
Thanks in advance for your help.
I've experienced the same problem, and I think (not tested) that Eureka's update frequency is the cause. Just after app-b goes down, Eureka still thinks app-b is up because it has not yet checked the heartbeat. About a minute after app-b goes down, Eureka knows it is down and tells your app-a that there is no app-b, so the fallback fires immediately.
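To make that explanation concrete, here is a small plain-JDK simulation of a client that only refreshes its local copy of the registry at fixed intervals. It is a sketch, not Eureka's actual code: the class name CachedRegistry, the 30-second interval and the instance string are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical model of a client-side registry cache; not Eureka's real implementation.
class CachedRegistry {
    private final Supplier<List<String>> source; // stands in for the discovery server
    private final long refreshIntervalMs;
    private List<String> cached;
    private long lastRefreshMs;

    CachedRegistry(Supplier<List<String>> source, long refreshIntervalMs, long nowMs) {
        this.source = source;
        this.refreshIntervalMs = refreshIntervalMs;
        this.cached = new ArrayList<>(source.get());
        this.lastRefreshMs = nowMs;
    }

    // Returns the instances the client believes are up at time nowMs.
    // The local copy is only replaced once the refresh interval has elapsed.
    List<String> instances(long nowMs) {
        if (nowMs - lastRefreshMs >= refreshIntervalMs) {
            cached = new ArrayList<>(source.get());
            lastRefreshMs = nowMs;
        }
        return cached;
    }
}

class StaleRegistryDemo {
    public static void main(String[] args) {
        List<String> live = new ArrayList<>(Arrays.asList("app-b:8082"));
        CachedRegistry registry = new CachedRegistry(() -> live, 30_000, 0);

        live.clear(); // app-b goes down at t = 0

        // Before the refresh the client still routes to the dead instance.
        System.out.println("t=1s  -> " + registry.instances(1_000));
        // After the refresh the instance list is empty and a fallback can fire at once.
        System.out.println("t=30s -> " + registry.instances(30_000));
    }
}
```

In a real setup several caches of this kind may be stacked (the registry server, the Eureka client and the load balancer each refresh on their own schedule), which could explain a delay of up to a minute. Settings such as eureka.client.registryFetchIntervalSeconds and eureka.instance.leaseRenewalIntervalInSeconds are worth checking if you want to shorten it.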
Related
We have Spring Boot services running in Kubernetes and are using the Spring Cloud Kubernetes Load Balancer functionality with RestTemplate to make calls to other Spring Boot services. One of the main reasons we have this in place is historical - in that previously we ran our services in EC2 using Eureka for service discovery and after the migration we kept the Spring discovery client/client-side load balancing in place (updating dependencies etc for it to work with the Spring Cloud Kubernetes project)
We have a problem: when one of the target pods goes down, requests fail repeatedly for a period of time with java.net.NoRouteToHostException, i.e. the Spring load balancer is still trying to send to that pod.
So I have a few questions on this:
Shouldn't the target instance get removed automatically when this happens? So it might happen once but after that, the target pod list will be repaired?
Or if not is there some other configuration we need to add to handle this - eg retry / circuit breaker, etc?
A more general question is what benefit does Spring's client-side load balancing bring with Kubernetes? Without it, our service would still be able to call other services using Kubernetes built-in service / load-balancing functionality and this should handle the issue of pods going down automatically. The Spring documentation also talks about being able to switch from POD mode to SERVICE mode (https://docs.spring.io/spring-cloud-kubernetes/docs/current/reference/html/index.html#loadbalancer-for-kubernetes). But isn't this service mode just what Kubernetes does automatically? I'm wondering if the simplest solution here isn't to remove the Spring Load Balancer altogether? What would we lose then?
An update on this: we had the spring-retry dependency in place, but the retry was not working as by default it only works for GETs and most of our calls are POST (but OK to call again). Adding the configuration spring.cloud.loadbalancer.retry.retryOnAllOperations: true fixed this, and hence most of these failures should be avoided by the retry using an alternative instance on the second attempt.
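For readers unfamiliar with the idea, "retry using an alternative instance" can be sketched in plain Java as below. This is only an illustration of the behaviour, not Spring Cloud LoadBalancer code; the class NextInstanceRetry, the pod addresses and the simulated failure are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of "retry on the next instance"; not Spring Cloud LoadBalancer code.
class NextInstanceRetry {
    // Tries the instances in order, moving to the next one after each failure,
    // until a call succeeds or maxAttempts is reached.
    static <R> R call(List<String> instances, int maxAttempts, Function<String, R> request) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("No instances available");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String instance = instances.get(attempt % instances.size());
            try {
                return request.apply(instance);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next instance
            }
        }
        throw new IllegalStateException("All " + maxAttempts + " attempts failed", last);
    }

    public static void main(String[] args) {
        List<String> pods = Arrays.asList("10.0.0.1:8080", "10.0.0.2:8080");
        // The first pod is simulated as gone; the second attempt lands on the other pod.
        String response = call(pods, 2, pod -> {
            if (pod.startsWith("10.0.0.1")) {
                throw new IllegalStateException("No route to host: " + pod);
            }
            return "OK from " + pod;
        });
        System.out.println(response);
    }
}
```

Note that this only helps when the operation is safe to repeat, which is why retrying POSTs has to be enabled explicitly.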
We have also added a RetryListener that clears the load balancer cache for the service on certain connection exceptions:
@Configuration
public class RetryConfig {

    private static final Logger logger = LoggerFactory.getLogger(RetryConfig.class);

    // Need to use bean factory here as can't autowire LoadBalancerCacheManager -
    // - it's set to 'autowireCandidate = false' in LoadBalancerCacheAutoConfiguration
    @Autowired
    private BeanFactory beanFactory;

    @Bean
    public CacheClearingLoadBalancedRetryFactory cacheClearingLoadBalancedRetryFactory(ReactiveLoadBalancer.Factory<ServiceInstance> loadBalancerFactory) {
        return new CacheClearingLoadBalancedRetryFactory(loadBalancerFactory);
    }

    // Extension of the default bean that defines a retry listener
    public class CacheClearingLoadBalancedRetryFactory extends BlockingLoadBalancedRetryFactory {

        public CacheClearingLoadBalancedRetryFactory(ReactiveLoadBalancer.Factory<ServiceInstance> loadBalancerFactory) {
            super(loadBalancerFactory);
        }

        @Override
        public RetryListener[] createRetryListeners(String service) {
            RetryListener cacheClearingRetryListener = new RetryListener() {
                @Override
                public <T, E extends Throwable> boolean open(RetryContext context, RetryCallback<T, E> callback) { return true; }

                @Override
                public <T, E extends Throwable> void close(RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {}

                @Override
                public <T, E extends Throwable> void onError(RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {
                    logger.warn("Retry for service {} picked up exception: context {}, throwable class {}", service, context, throwable.getClass());
                    if (throwable instanceof ConnectTimeoutException || throwable instanceof NoRouteToHostException) {
                        try {
                            LoadBalancerCacheManager loadBalancerCacheManager = beanFactory.getBean(LoadBalancerCacheManager.class);
                            Cache loadBalancerCache = loadBalancerCacheManager.getCache(CachingServiceInstanceListSupplier.SERVICE_INSTANCE_CACHE_NAME);
                            if (loadBalancerCache != null) {
                                boolean result = loadBalancerCache.evictIfPresent(service);
                                logger.warn("Load Balancer Cache evictIfPresent result for service {} is {}", service, result);
                            }
                        } catch (Exception e) {
                            logger.error("Failed to clear load balancer cache", e);
                        }
                    }
                }
            };
            return new RetryListener[] { cacheClearingRetryListener };
        }
    }
}
Are there any issues with this approach? Could something like this be added to the built in functionality?
Shouldn't the target instance get removed automatically when this
happens? So it might happen once but after that the target pod list
will be repaired?
To resolve this issue, use readiness and liveness probes in Kubernetes.
A readiness probe checks your application's health endpoint at a fixed interval. If the check fails, Kubernetes marks the pod as not ready to accept traffic, so no traffic goes to that pod (replica).
A liveness probe restarts your container if the check fails. The pod then comes up again, and once the app returns a 200 response, Kubernetes marks the pod as ready to accept traffic.
You can create a simple endpoint in the application that returns 200 or 204 as needed.
Read more at : https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
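As a rough idea of what such a probe target can look like, here is a minimal sketch that uses only the JDK's built-in HTTP server so that it stays self-contained. In a Spring Boot service the Actuator health endpoint would normally fill this role; the path /healthz, the port and the class name HealthEndpoint are assumptions.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

// Minimal health endpoint built on the JDK's HTTP server.
// The path "/healthz" and the port are assumptions, not values from the question.
class HealthEndpoint {
    // Starts a server on the given port (0 picks a free port) and returns it.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/healthz", exchange -> {
            // 204 No Content: healthy, nothing to report in the body.
            exchange.sendResponseHeaders(204, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080);
        System.out.println("Health endpoint listening on port " + server.getAddress().getPort());
    }
}
```

The readiness and liveness probes in the pod spec would then point at this path and port.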
Make sure your applications use the Kubernetes Service to talk to each other.
Application 1 > Kubernetes service of App 2 > Application 2 PODs
To enable load balancing based on Kubernetes Service name use the
following property. Then load balancer would try to call application
using address, for example service-a.default.svc.cluster.local
spring.cloud.kubernetes.loadbalancer.mode=SERVICE
The most typical way to use Spring Cloud LoadBalancer on Kubernetes is
with service discovery. If you have any DiscoveryClient on your
classpath, the default Spring Cloud LoadBalancer configuration uses it
to check for service instances. As a result, it only chooses from
instances that are up and running. All that is needed is to annotate
your Spring Boot application with @EnableDiscoveryClient to enable
K8s-native Service Discovery.
References : https://stackoverflow.com/a/68536834/5525824
I have a feign client like this with endpoints to two APIs from PROJECT-SERVICE
@FeignClient(name = "PROJECT-SERVICE", fallbackFactory = ProjectServiceFallbackFactory.class)
public interface ProjectServiceClient {
    @GetMapping("/api/projects/{projectKey}")
    public ResponseEntity<Project> getProjectDetails(@PathVariable("projectKey") String projectKey);

    @PostMapping("/api/projects")
    public ResponseEntity<Project> createProject(@RequestBody Project project);
}
I'm using those clients like this:
@Service
public class MyService {
    @Autowired
    private ProjectServiceClient projectServiceClient;

    public void doSomething() {
        // Some code
        ResponseEntity<Project> projectResponse = projectServiceClient.getProjectDetails(projectKey);
        // Some more code
    }

    public void doSomethingElse() {
        // Some code
        ResponseEntity<Project> projectResponse = projectServiceClient.createProject(projectToBeCreated);
        // Some more code
    }
}
My problem is that most of the time (around 60%), one of these Feign calls results in a HystrixTimeoutException.
I initially thought there could be a problem in the downstream micro service (PROJECT-SERVICE in this case), but that is not the case. In fact, when getProjectDetails() or createProject() is called, the PROJECT-SERVICE actually does the job and returns a ResponseEntity<Project> with status 200 and 201 respectively, but my fallback is activated with the HystrixTimeoutException.
I'm trying in vain to find what might be causing this issue.
I, however, have this in my main application configuration:
feign.hystrix.enabled=true
feign.client.config.default.connect-timeout=5000
feign.client.config.default.read-timeout=60000
Can anyone point me towards a solution?
Thanks,
Sriram Sridharan
Hystrix's timeout is not tied to Feign's. Hystrix has a default execution timeout of 1 second. You need to set this timeout slightly longer than Feign's so that a HystrixTimeoutException is not thrown before the timeout you actually want. Like so:
feign.client.config.default.connect-timeout=5000
feign.client.config.default.read-timeout=5000
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=6000
Doing so lets the FeignException caused by Feign's 5-second timeout be thrown first, and Hystrix then passes that exception to the fallback instead of timing out on its own.
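The interaction between the two timeouts can be reproduced without Hystrix or Feign. The sketch below is a plain-JDK stand-in: slowCall plays the role of the Feign call, and the outer timeout plays the role of the Hystrix execution timeout. All names and durations are assumptions for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Plain-JDK illustration of two independent timeouts; it does not use Hystrix or Feign.
class NestedTimeoutDemo {
    // Simulates a downstream call that succeeds after callMs milliseconds.
    static String slowCall(long callMs) {
        try {
            Thread.sleep(callMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "201 Created";
    }

    // Wraps the call in an outer timeout, similar to a circuit breaker wrapping an HTTP client.
    static String callWithOuterTimeout(long callMs, long outerTimeoutMs) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> slowCall(callMs));
        try {
            return future.get(outerTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The outer timeout fired although the call itself would have succeeded.
            return "fallback";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Outer timeout shorter than the call: the fallback wins.
        System.out.println(callWithOuterTimeout(300, 100));
        // Outer timeout longer than the call: the real response comes back.
        System.out.println(callWithOuterTimeout(300, 2_000));
    }
}
```

This mirrors the symptom in the question: the downstream service does its job and returns 200 or 201, but the caller has already given up and run the fallback.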
I have a JAX-RS API that performs long-running work, and the client calls it via an Ajax call. After 50 seconds the client gets a 503 (Service Unavailable) status.
How can I increase this timeout value? I tried increasing the connection timeout in Tomcat (which hosts the API). I also tried adding a timeout to the Ajax call, but that didn't work either.
You could use the @Suspended annotation and create a TimeoutHandler.
With this approach, I'm not sure whether you also need to increase the timeout in Tomcat.
public class Resource {
    private Executor executor = Executors.newSingleThreadExecutor();

    @GET
    public void asyncGet(@Suspended final AsyncResponse asyncResponse) {
        asyncResponse.setTimeoutHandler(new TimeoutHandler() {
            @Override
            public void handleTimeout(AsyncResponse asyncResponse) {
                asyncResponse.resume("Processing timeout.");
                executor.shutdown();
            }
        });
        asyncResponse.setTimeout(60, TimeUnit.SECONDS);
        executor.submit(() -> {
            String result = someService.expensiveOperation();
            asyncResponse.resume(result);
            executor.shutdown();
        });
    }
}
Jersey documentation here
I have a CXF client configured in my Spring Boot app like so:
@Bean
public ConsumerSupportService consumerSupportService() {
    JaxWsProxyFactoryBean jaxWsProxyFactoryBean = new JaxWsProxyFactoryBean();
    jaxWsProxyFactoryBean.setServiceClass(ConsumerSupportService.class);
    jaxWsProxyFactoryBean.setAddress("https://www.someservice.com/service?wsdl");
    jaxWsProxyFactoryBean.setBindingId(SOAPBinding.SOAP12HTTP_BINDING);
    WSAddressingFeature wsAddressingFeature = new WSAddressingFeature();
    wsAddressingFeature.setAddressingRequired(true);
    jaxWsProxyFactoryBean.getFeatures().add(wsAddressingFeature);
    ConsumerSupportService service = (ConsumerSupportService) jaxWsProxyFactoryBean.create();
    Client client = ClientProxy.getClient(service);
    AddressingProperties addressingProperties = new AddressingProperties();
    AttributedURIType to = new AttributedURIType();
    to.setValue(applicationProperties.getWex().getServices().getConsumersupport().getTo());
    addressingProperties.setTo(to);
    AttributedURIType action = new AttributedURIType();
    action.setValue("http://serviceaction/SearchConsumer");
    addressingProperties.setAction(action);
    client.getRequestContext().put("javax.xml.ws.addressing.context", addressingProperties);
    setClientTimeout(client);
    return service;
}

private void setClientTimeout(Client client) {
    HTTPConduit conduit = (HTTPConduit) client.getConduit();
    HTTPClientPolicy policy = new HTTPClientPolicy();
    policy.setConnectionTimeout(applicationProperties.getWex().getServices().getClient().getConnectionTimeout());
    policy.setReceiveTimeout(applicationProperties.getWex().getServices().getClient().getReceiveTimeout());
    conduit.setClient(policy);
}
This same service bean is accessed by two different threads in the same application sequence. If I execute this particular sequence 10 times in a row, I will get a connection timeout from the service call at least 3 times. What I'm seeing is:
Caused by: java.io.IOException: Timed out waiting for response to operation {http://theservice.com}SearchConsumer.
at org.apache.cxf.endpoint.ClientImpl.waitResponse(ClientImpl.java:685) ~[cxf-core-3.2.0.jar:3.2.0]
at org.apache.cxf.endpoint.ClientImpl.processResult(ClientImpl.java:608) ~[cxf-core-3.2.0.jar:3.2.0]
If I change the sequence such that one of the threads does not call this service, then the error goes away. So, it seems like there's some sort of a race condition happening here. If I look at the logs in our proxy manager for this service, I can see that both of the service calls do return a response very quickly, but the second service call seems to get stuck somewhere in the code and never actually lets go of the connection until the timeout value is reached. I've been trying to track down the cause of this for quite a while, but have been unsuccessful.
I've read some mixed opinions as to whether or not CXF client proxies are thread-safe, but I was under the impression that they were. If that is actually not the case, should I be creating a new client proxy for each invocation, or using a pool of proxies?
Turns out that it is an issue with the proxy not being thread-safe. What I wound up doing was leveraging a solution kind of like one posted at the bottom of this post: Is this JAX-WS client call thread safe? - I created a pool for the proxies and I use that to access proxies from multiple threads in a thread-safe manner. This seems to work out pretty well.
public class JaxWSServiceProxyPool<T> extends GenericObjectPool<T> {
    JaxWSServiceProxyPool(Supplier<T> factory, GenericObjectPoolConfig poolConfig) {
        super(new BasePooledObjectFactory<T>() {
            @Override
            public T create() throws Exception {
                return factory.get();
            }

            @Override
            public PooledObject<T> wrap(T t) {
                return new DefaultPooledObject<>(t);
            }
        }, poolConfig != null ? poolConfig : new GenericObjectPoolConfig());
    }
}
I then created a simple "registry" class to keep references to various pools.
@Component
public class JaxWSServiceProxyPoolRegistry {
    private static final Map<Class, JaxWSServiceProxyPool> registry = new HashMap<>();

    public synchronized <T> void register(Class<T> serviceTypeClass, Supplier<T> factory, GenericObjectPoolConfig poolConfig) {
        Assert.notNull(serviceTypeClass);
        Assert.notNull(factory);
        if (!registry.containsKey(serviceTypeClass)) {
            registry.put(serviceTypeClass, new JaxWSServiceProxyPool<>(factory, poolConfig));
        }
    }

    public <T> void register(Class<T> serviceTypeClass, Supplier<T> factory) {
        register(serviceTypeClass, factory, null);
    }

    @SuppressWarnings("unchecked")
    public <T> JaxWSServiceProxyPool<T> getServiceProxyPool(Class<T> serviceTypeClass) {
        Assert.notNull(serviceTypeClass);
        return registry.get(serviceTypeClass);
    }
}
To use it, I did:
JaxWSServiceProxyPoolRegistry jaxWSServiceProxyPoolRegistry = new JaxWSServiceProxyPoolRegistry();
jaxWSServiceProxyPoolRegistry.register(ConsumerSupportService.class,
this::buildConsumerSupportServiceClient,
getConsumerSupportServicePoolConfig());
Where buildConsumerSupportServiceClient uses a JaxWsProxyFactoryBean to build up the client.
To retrieve an instance from the pool I inject my registry class and then do:
JaxWSServiceProxyPool<ConsumerSupportService> consumerSupportServiceJaxWSServiceProxyPool = jaxWSServiceProxyPoolRegistry.getServiceProxyPool(ConsumerSupportService.class);
And then borrow/return the object from/to the pool as necessary.
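The borrow/return step is where leaks tend to creep in, so here is a sketch of the pattern. To keep it free of external libraries it uses a BlockingQueue-based stand-in rather than the commons-pool class above; SimpleProxyPool and withProxy are hypothetical names. With GenericObjectPool the equivalent calls would be borrowObject() and returnObject(...), wrapped in the same try/finally.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Stand-in for a commons-pool based proxy pool, built on a BlockingQueue so the
// example needs no external libraries. Class and method names are hypothetical.
class SimpleProxyPool<T> {
    private final BlockingQueue<T> idle;

    SimpleProxyPool(Supplier<T> factory, int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Borrows a proxy, runs the action, and always returns the proxy to the pool.
    <R> R withProxy(Function<T, R> action) {
        T proxy;
        try {
            proxy = idle.take(); // blocks until a proxy is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a proxy", e);
        }
        try {
            return action.apply(proxy);
        } finally {
            idle.offer(proxy); // returned even if the action throws
        }
    }

    int idleCount() {
        return idle.size();
    }
}

class SimpleProxyPoolDemo {
    public static void main(String[] args) {
        // StringBuilder stands in for a service proxy that must not be shared between threads.
        SimpleProxyPool<StringBuilder> pool = new SimpleProxyPool<StringBuilder>(StringBuilder::new, 2);
        String result = pool.withProxy(sb -> sb.append("searchConsumer").toString());
        System.out.println(result + ", idle proxies: " + pool.idleCount());
    }
}
```

Each thread gets exclusive use of one proxy for the duration of the call, which is what avoids the race described in the question.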
This seems to work well so far. I've executed some fairly heavy load tests against it and it's held up.
I am connecting through SockJS over STOMP to my Spring backend. Everything works fine, and the configuration works well in all browsers. However, I cannot find a way to send an initial message. The scenario would be as follows:
The client connects to the topic
function connect() {
var socket = new SockJS('http://localhost:8080/myEndpoint');
stompClient = Stomp.over(socket);
stompClient.connect({}, function(frame) {
setConnected(true);
console.log('Connected: ' + frame);
stompClient.subscribe('/topic/notify', function(message){
showMessage(JSON.parse(message.body).content);
});
});
}
and the backend config looks more or less like this:
@Configuration
@EnableWebSocketMessageBroker
public class WebSocketAppConfig extends AbstractWebSocketMessageBrokerConfigurer {
    ...
    @Override
    public void registerStompEndpoints(final StompEndpointRegistry registry) {
        registry.addEndpoint("/myEndpoint").withSockJS();
    }
I want the backend to send the client an automatic reply on the connection event, so that I can provide the client with a dataset up front (e.g. something read from the DB) without the client having to send a GET request (or any other request). To sum up, I just want to send a message on the topic with the SimpMessagingTemplate object right after the client has connected.
Usually I do it the following way, e.g. in a REST controller, when the template is already autowired:
@Autowired
private SimpMessagingTemplate template;
...
template.convertAndSend(TOPIC, new Message("it works!"));
How can I achieve this on the connect event?
UPDATE
I have managed to make it work. However, I am still a bit confused with the configuration. I will show here 2 configurations how the initial message can be sent:
1) First solution
JS part
stompClient.subscribe('/app/pending', function(message){
showMessage(JSON.parse(message.body).content);
});
stompClient.subscribe('/topic/incoming', function(message){
showMessage(JSON.parse(message.body).content);
});
Java part
@Controller
public class WebSocketBusController {
    @SubscribeMapping("/pending")
Configuration
@Override
public void configureMessageBroker(final MessageBrokerRegistry config) {
    config.enableSimpleBroker("/topic");
    config.setApplicationDestinationPrefixes("/app");
}
... and other calls
template.convertAndSend("/topic/incoming", outgoingMessage);
2) Second solution
JS part
stompClient.subscribe('/topic/incoming', function(message){
showMessage(JSON.parse(message.body).content);
})
Java part
@Controller
public class WebSocketBusController {
    @SubscribeMapping("/topic/incoming")
Configuration
@Override
public void configureMessageBroker(final MessageBrokerRegistry config) {
    config.enableSimpleBroker("/topic");
    // NO APPLICATION PREFIX HERE
}
... and other calls
template.convertAndSend("/topic/incoming", outgoingMessage);
SUMMARY:
The first case uses two subscriptions - this I wanted to avoid and thought this can be managed with one only.
The second one however has no prefix for application. But at least I can have a single subscription to listen on the provided topic as well as send initial message.
If you just want to send a message to the client upon connection, use an appropriate ApplicationListener:
@Component
public class StompConnectedEvent implements ApplicationListener<SessionConnectedEvent> {
    private static final Logger log = Logger.getLogger(StompConnectedEvent.class);

    @Autowired
    private Controller controller;

    @Override
    public void onApplicationEvent(SessionConnectedEvent event) {
        log.debug("Client connected.");
        // you can use a controller to send your msg here
    }
}
You can't do that on connect; however, @SubscribeMapping does the job in that case.
You just need to mark the service method with that annotation, and its return value is delivered to the client's subscribe callback.
From Spring Reference Manual:
An @SubscribeMapping annotation can also be used to map subscription requests to @Controller methods. It is supported on the method level, but can also be combined with a type level @MessageMapping annotation that expresses shared mappings across all message handling methods within the same controller.
By default the return value from an @SubscribeMapping method is sent as a message directly back to the connected client and does not pass through the broker. This is useful for implementing request-reply message interactions; for example, to fetch application data when the application UI is being initialized. Or alternatively an @SubscribeMapping method can be annotated with @SendTo in which case the resulting message is sent to the "brokerChannel" using the specified target destination.
UPDATE
Referring to this example: https://github.com/revelfire/spring4Test. How would it be possible to send anything from the Spring controllers when line 24 of index.html is invoked: stompClient.subscribe('/user/queue/socket/responses' ... ?
Well, it would look like this:
@SubscribeMapping("/queue/socket/responses")
public List<Employee> list() {
    return getEmployees();
}
The Stomp client part remains the same.