Call service method from spark - spring

I'm using Spark with Spring Boot. I have a service (a Spring bean) which I want to call for batches of data. The issue is that I keep getting a Task not serializable exception.
javaRDD.foreachPartition(iterator -> {
    Iterators.partition(iterator, 200)
        .forEachRemaining(value -> {
            log.info("got value {}", value);
            notificationService.post("topic", value);
        });
});
As far as I understand from various sources, this happens because we're calling a method of the service on the cluster, so the service class needs to be Serializable in order to be shipped to the executors that will run it.
The question is: how can I get rid of this, and are there any best practices for doing it?
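One common way to avoid the exception (a sketch of the usual workaround, with the NotificationService construction assumed for illustration) is to keep the Spring bean out of the serialized closure and initialize the client lazily per executor JVM instead:
public final class NotificationServiceHolder {

    // static state is not captured by Spark's closure serializer;
    // each executor JVM creates its own instance lazily on first use
    private static volatile NotificationService service;

    private NotificationServiceHolder() {
    }

    public static NotificationService get() {
        NotificationService local = service;
        if (local == null) {
            synchronized (NotificationServiceHolder.class) {
                local = service;
                if (local == null) {
                    service = local = new NotificationService(); // assumption: constructible on the executor
                }
            }
        }
        return local;
    }
}
Inside foreachPartition you would then call NotificationServiceHolder.get().post("topic", value) instead of referencing the injected field, so nothing non-serializable is captured by the lambda.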

Related

Load balancing problems with Spring Cloud Kubernetes

We have Spring Boot services running in Kubernetes and are using the Spring Cloud Kubernetes load balancer functionality with RestTemplate to make calls to other Spring Boot services. One of the main reasons we have this in place is historical: previously we ran our services in EC2 using Eureka for service discovery, and after the migration we kept the Spring discovery client/client-side load balancing in place (updating dependencies etc. so it works with the Spring Cloud Kubernetes project).
The problem is that when one of the target pods goes down, we get multiple request failures for a period of time with java.net.NoRouteToHostException, i.e. the Spring load balancer is still trying to send requests to that pod.
So I have a few questions on this:
Shouldn't the target instance get removed automatically when this happens? So it might happen once but after that, the target pod list will be repaired?
Or if not, is there some other configuration we need to add to handle this, e.g. retry / circuit breaker, etc.?
A more general question is what benefit does Spring's client-side load balancing bring with Kubernetes? Without it, our service would still be able to call other services using Kubernetes built-in service / load-balancing functionality and this should handle the issue of pods going down automatically. The Spring documentation also talks about being able to switch from POD mode to SERVICE mode (https://docs.spring.io/spring-cloud-kubernetes/docs/current/reference/html/index.html#loadbalancer-for-kubernetes). But isn't this service mode just what Kubernetes does automatically? I'm wondering if the simplest solution here isn't to remove the Spring Load Balancer altogether? What would we lose then?
An update on this: we had the spring-retry dependency in place, but the retry was not working as by default it only works for GETs and most of our calls are POST (but OK to call again). Adding the configuration spring.cloud.loadbalancer.retry.retryOnAllOperations: true fixed this, and hence most of these failures should be avoided by the retry using an alternative instance on the second attempt.
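For reference, the same settings in application.yml form (a sketch; property names as given above, with the retry enabled flag shown explicitly):
spring:
  cloud:
    loadbalancer:
      retry:
        enabled: true
        retryOnAllOperations: true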
We have also added a RetryListener that clears the load balancer cache for the service on certain connection exceptions:
@Configuration
public class RetryConfig {

    private static final Logger logger = LoggerFactory.getLogger(RetryConfig.class);

    // Need to use bean factory here as can't autowire LoadBalancerCacheManager -
    // - it's set to 'autowireCandidate = false' in LoadBalancerCacheAutoConfiguration
    @Autowired
    private BeanFactory beanFactory;

    @Bean
    public CacheClearingLoadBalancedRetryFactory cacheClearingLoadBalancedRetryFactory(ReactiveLoadBalancer.Factory<ServiceInstance> loadBalancerFactory) {
        return new CacheClearingLoadBalancedRetryFactory(loadBalancerFactory);
    }

    // Extension of the default bean that defines a retry listener
    public class CacheClearingLoadBalancedRetryFactory extends BlockingLoadBalancedRetryFactory {

        public CacheClearingLoadBalancedRetryFactory(ReactiveLoadBalancer.Factory<ServiceInstance> loadBalancerFactory) {
            super(loadBalancerFactory);
        }

        @Override
        public RetryListener[] createRetryListeners(String service) {
            RetryListener cacheClearingRetryListener = new RetryListener() {
                @Override
                public <T, E extends Throwable> boolean open(RetryContext context, RetryCallback<T, E> callback) { return true; }

                @Override
                public <T, E extends Throwable> void close(RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {}

                @Override
                public <T, E extends Throwable> void onError(RetryContext context, RetryCallback<T, E> callback, Throwable throwable) {
                    logger.warn("Retry for service {} picked up exception: context {}, throwable class {}", service, context, throwable.getClass());
                    if (throwable instanceof ConnectTimeoutException || throwable instanceof NoRouteToHostException) {
                        try {
                            LoadBalancerCacheManager loadBalancerCacheManager = beanFactory.getBean(LoadBalancerCacheManager.class);
                            Cache loadBalancerCache = loadBalancerCacheManager.getCache(CachingServiceInstanceListSupplier.SERVICE_INSTANCE_CACHE_NAME);
                            if (loadBalancerCache != null) {
                                boolean result = loadBalancerCache.evictIfPresent(service);
                                logger.warn("Load Balancer Cache evictIfPresent result for service {} is {}", service, result);
                            }
                        } catch (Exception e) {
                            logger.error("Failed to clear load balancer cache", e);
                        }
                    }
                }
            };
            return new RetryListener[] { cacheClearingRetryListener };
        }
    }
}
Are there any issues with this approach? Could something like this be added to the built-in functionality?
Shouldn't the target instance get removed automatically when this happens? So it might happen once but after that the target pod list will be repaired?
To resolve this issue you have to use readiness and liveness probes in Kubernetes.
The readiness probe checks your application's health endpoint at a configured interval. If the check fails, Kubernetes marks the pod as not ready to accept traffic, so no traffic goes to that pod (replica).
The liveness probe restarts your application if it fails, so the container (pod) comes up again; once the app returns a 200 response, Kubernetes marks the pod as ready to accept traffic again.
You can create a simple endpoint in the application that returns 200 or 204 as needed.
Read more at: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
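On the Spring Boot side, a minimal sketch (an assumption of this setup, not part of the original answer) using the actuator on Spring Boot 2.3+ to expose dedicated probe endpoints that the pod's readinessProbe/livenessProbe can point at:
# application.properties
# exposes /actuator/health/liveness and /actuator/health/readiness
management.endpoint.health.probes.enabled=true
# the Deployment's livenessProbe and readinessProbe then use those paths as httpGet targets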
Make sure your applications use Kubernetes Services to talk to each other:
Application 1 > Kubernetes Service of App 2 > Application 2 pods
To enable load balancing based on the Kubernetes Service name, use the following property. The load balancer will then try to call the application using an address such as service-a.default.svc.cluster.local:
spring.cloud.kubernetes.loadbalancer.mode=SERVICE
The most typical way to use Spring Cloud LoadBalancer on Kubernetes is with service discovery. If you have any DiscoveryClient on your classpath, the default Spring Cloud LoadBalancer configuration uses it to check for service instances. As a result, it only chooses from instances that are up and running. All that is needed is to annotate your Spring Boot application with @EnableDiscoveryClient to enable K8s-native Service Discovery.
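As a minimal illustration of the quoted guidance (assuming a Spring Cloud Kubernetes discovery starter is on the classpath):
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

@SpringBootApplication
@EnableDiscoveryClient // enables K8s-native service discovery for the load balancer
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}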
References: https://stackoverflow.com/a/68536834/5525824

Spring boot: Separate thread pool for specific endpoint

Given a microservice in Spring Boot that offers 2 endpoints, consumed by 2 separate systems.
One of these systems is critical while the other one is not.
I would like to prevent the non-critical one from consuming (due to unexpected problems) all (or many) of the threads of the HTTP thread pool, so I would like to configure separate thread pools for each of these endpoints.
Is that possible?
There are multiple ways to do this. Using DeferredResult is probably the easiest way:
@RestController
public class Controller {

    private final Executor performancePool = Executors.newFixedThreadPool(128);
    private final Executor normalPool = Executors.newFixedThreadPool(16);

    @GetMapping("/performance")
    DeferredResult<String> performanceEndPoint() {
        DeferredResult<String> result = new DeferredResult<>();
        performancePool.execute(() -> {
            try {
                Thread.sleep(5000); // a long-running task
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            result.setResult("Executed in performance pool");
        });
        return result;
    }

    @GetMapping("/normal")
    DeferredResult<String> normalEndPoint() {
        DeferredResult<String> result = new DeferredResult<>();
        normalPool.execute(() -> result.setResult("Executed in normal pool"));
        return result;
    }
}
You immediately release the Tomcat thread by returning a DeferredResult from a controller, allowing it to serve other requests. The actual response is written to the user when the .setResult method is called.
DeferredResult is one of the many ways you can perform asynchronous request processing in Spring. Check out this section of the docs to learn more about the other ways:
https://docs.spring.io/spring-framework/docs/current/reference/html/web.html#mvc-ann-async
Not sure you can prevent that, but you can surely increase the thread pool capacity. By default, Tomcat (if it is the default server) can handle 200 simultaneous requests; you can increase that number.
Check if this article helps
https://stackoverflow.com/questions/46893237/can-spring-boot-application-handle-multiple-requests-simultaneously#:~:text=Yes%2C%20Spring%20boot%20can%20handle,can%20handle%20200%20simultaneous%20requests.&text=However%2C%20you%20can%20override%20this,tomcat.
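If you only want to resize the default pool rather than isolate the endpoints, a sketch of the relevant Spring Boot properties (names as of Spring Boot 2.3+; older versions use server.tomcat.max-threads):
# maximum number of Tomcat request-processing threads (default 200)
server.tomcat.threads.max=400
# connection backlog used when all worker threads are busy (default 100)
server.tomcat.accept-count=100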

Propagating errors between Hazelcast Server and Hazelcast Client

I have the following scenario:
- a Hazelcast Server as a microservice which performs some computations when it receives a method call.
- a Hazelcast Client as another microservice which calls the Hazelcast Server through the specified method call.
When I throw an exception from the Hazelcast Server, I want to receive it on the Hazelcast Client side as-is (currently I'm receiving something like this: java.util.concurrent.ExecutionException: com.hazelcast.client.UndefinedErrorCodeException: Class name: ro.orange.eshop.personalisationengineapi.application.exception.ValidationException).
I've dug a little into the APIs and on the Hazelcast Client side I've found a way to register a new exception:
@Bean
fun addHazelcastKnownExceptions(hazelcastInstance: HazelcastInstance): Int {
    val hazelcastClientInstance = (hazelcastInstance as HazelcastClientProxy).client
    hazelcastClientInstance.clientExceptionFactory.register(400, ValidationException::class.java) { message, cause -> ValidationException(message, cause) }
    return 1
}
But it seems that this exception must also be registered on the server side. And here comes the problem! On the server side I've found a class called ClientExceptions, which has a method public void register(int errorCode, Class clazz), but I can't find a way to obtain a ClientExceptions instance (I should mention that I'm using Hazelcast Spring).
Thank you!
Registering a custom exception factory is not supported as a public API as of 3.12.x.
Related issue to follow: https://github.com/hazelcast/hazelcast/issues/9753
As a workaround, I could suggest using the class name (UndefinedErrorCodeException.getOriginalClassName()) to recreate exception classes on the client side.
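For illustration, a rough sketch of that workaround on the client side (the future here is a hypothetical client-side call result; ValidationException is the class from the question):
try {
    future.get(); // hypothetical: result of a client-side executor/task call
} catch (ExecutionException e) {
    Throwable cause = e.getCause();
    if (cause instanceof UndefinedErrorCodeException) {
        UndefinedErrorCodeException undefined = (UndefinedErrorCodeException) cause;
        // map the original server-side class name back to a local exception type
        if (ValidationException.class.getName().equals(undefined.getOriginalClassName())) {
            throw new ValidationException(undefined.getMessage(), undefined);
        }
    }
    throw e;
}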
== EDIT ==
The client API does not support it. What you have found is the private API.
If you are ok with relying on private API here is the hack for registering classes on the hazelcast server:
Note that I DO NOT recommend this solution since it relies on private API that can change.
HazelcastInstance instance = Hazelcast.newHazelcastInstance();
if (instance instanceof HazelcastInstanceProxy) {
    HazelcastInstanceImpl original = ((HazelcastInstanceProxy) instance).getOriginal();
    ClientExceptions clientExceptions = original.node.getClientEngine().getClientExceptions();
    clientExceptions.register(USER_EXCEPTIONS_RANGE_START + 1, UndefinedCustomFormatException.class);
}

How do I register a microservice (or its methods) to Task in Netflix Conductor?

I was looking for a more sophisticated workflow than Saga from AxonFramework -- which we are currently using -- and I found one in Netflix Conductor.
Sadly, I have searched the Internet for a decent example but to no avail.
My question is, in Netflix Conductor, how might one define and create a Task or WorkflowTask and, most importantly, link a microservice to it? Here is some Netflix Conductor code from GitHub:
WorkflowDef def = new WorkflowDef();
def.setName("test");
WorkflowTask t0 = new WorkflowTask();
t0.setName("t0");
t0.setType(Type.SIMPLE);
t0.setTaskReferenceName("t0");
WorkflowTask t1 = new WorkflowTask();
t1.setName("t1");
t1.setType(Type.SIMPLE);
t1.setTaskReferenceName("t1");
def.getTasks().add(t0);
def.getTasks().add(t1);
Pardon my confusion as I am new to Netflix Conductor.
Assuming the microservice has a REST endpoint over HTTP, you have to use HttpTask, which is a system task. HttpTask makes an HTTP call and the response is available as the task output. Please refer to the below link: HttpTask
Please remember to set the schemaVersion to 2 for the WorkflowDef which contains the HttpTask. You would also need a corresponding task type registered.
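A rough sketch of what that could look like with the metadata classes from the question (the endpoint URL is a placeholder and the exact setters vary between Conductor versions):
WorkflowDef def = new WorkflowDef();
def.setName("test");
def.setSchemaVersion(2);

WorkflowTask callService = new WorkflowTask();
callService.setName("call_my_service");
callService.setTaskReferenceName("call_my_service_ref");
callService.setType("HTTP"); // system task executed by the Conductor server itself

Map<String, Object> httpRequest = new HashMap<>();
httpRequest.put("uri", "http://my-service:8080/api/compute"); // hypothetical microservice endpoint
httpRequest.put("method", "GET");
callService.setInputParameters(Collections.singletonMap("http_request", httpRequest));

def.getTasks().add(callService);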
(disclaimer: I haven't tried this, I just looked at the documentation...)
- implement your own WorkflowSystemTask
- override the start() / execute() method to call your microservice
- set the task type to SIMPLE according to https://netflix.github.io/conductor/intro/concepts/#worker-taks
Define a TaskClient bean and override the execute method of the Worker class.
Pass the TaskClient and worker beans to a TaskRunnerConfigurer:
@Configuration
public class Configuration {

    @Bean
    public TaskClient taskClient(@Value("${conductor url}") String conductorServerURL) {
        TaskClient taskClient = new TaskClient();
        taskClient.setRootURI(conductorServerURL);
        return taskClient;
    }

    @Bean
    public TaskRunnerConfigurer taskRunnerConfigurer(
            @Autowired final TaskClient taskClient,
            @Autowired final List<Worker> workers) {
        final TaskRunnerConfigurer taskRunnerConfigurer = new TaskRunnerConfigurer.Builder(taskClient, workers)
                .withThreadCount(3)
                .build();
        taskRunnerConfigurer.init();
        return taskRunnerConfigurer;
    }
}
These workers will poll for tasks from the Conductor server.
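For completeness, a sketch of what one of those Worker beans could look like (the task name and output values are illustrative; Worker, Task and TaskResult come from the conductor-client library):
@Component
public class MyServiceWorker implements Worker {

    @Override
    public String getTaskDefName() {
        return "t1"; // must match a registered task definition name
    }

    @Override
    public TaskResult execute(Task task) {
        TaskResult result = new TaskResult(task);
        // call your microservice here and copy its response into the task output
        result.getOutputData().put("response", "ok");
        result.setStatus(TaskResult.Status.COMPLETED);
        return result;
    }
}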
There are now a number of SDKs to connect your microservice worker to Conductor: https://github.com/conductor-sdk/
You can create a SIMPLE task in Conductor (using the API endpoint, and these parameters: https://conductor.netflix.com/configuration/taskdef.html).
Workers poll for your tasks in Conductor. When a task has work to be run, it is assigned to a worker. On completion, the results from the worker are taken back to the Conductor workflow.
Here's a worker in Go: https://github.com/conductor-sdk/conductor-examples/tree/main/go-samples
And a Java example: https://github.com/orkes-io/orkesworkers
Finally - there is now a free cloud playground for Netflix conductor at https://play.orkes.io

How to set a Message Handler programmatically in Spring Cloud AWS SQS?

Maybe someone has an idea for my following problem:
I am currently on a project where I want to use AWS SQS with the Spring Cloud integration. For the receiver part I want to provide an API where a user can register a "message handler" on a queue; the handler is an interface and will contain the user's business logic, e.g.
MyAwsSqsReceiver receiver = new MyAwsSqsReceiver();
receiver.register("a-queue-name", new MessageHandler() {
    @Override
    public void handle(String message) {
        //... business logic for the received message
    }
});
I found examples, e.g.
https://codemason.me/2016/03/12/amazon-aws-sqs-with-spring-cloud/
and read the docs
http://cloud.spring.io/spring-cloud-aws/spring-cloud-aws.html#_sqs_support
But the only thing I found there to "connect" functionality for processing an incoming message is an annotation on a method, e.g. @SqsListener or @MessageMapping.
These annotations are fixed to a certain queue name, though. So now I am at a loss as to how to dynamically "connect" my provided MessageHandler (from my API) to incoming messages for the specified queue name.
In the config of the example there is a SimpleMessageListenerContainer, which gets a QueueMessageHandler set, but this QueueMessageHandler does not seem to be the right place to set my handler, or to override its methods and provide my own subclass of QueueMessageHandler.
I already did something like this with the Spring AMQP integration and RabbitMQ and thought it would be similar here with AWS SQS.
Does anyone have an idea how to accomplish this?
thx + bye,
Ximon
EDIT:
I found that Spring JMS could actually do that, e.g. www.javacodegeeks.com/2016/02/aws-sqs-spring-jms-integration.html. Does anybody know what consequences using the JMS protocol has here, good or bad?
I am facing the same issue.
I am trying an unusual way where I set up an AWS client bean and then, instead of using the @SqsListener annotation to consume from a specific queue, I use the @Scheduled annotation so I can programmatically poll (every 10 seconds in my case) whichever queue I want to consume from.
In my example I iterate over the queues defined in properties and consume from each one.
Client Bean:
@Bean
@Primary
public AmazonSQSAsync awsSqsClient() {
    return AmazonSQSAsyncClientBuilder
            .standard()
            .withRegion(Regions.EU_WEST_1.getName())
            .build();
}
Consumer:
// injected in the constructor
private final AmazonSQSAsync awsSqsClient;

@Scheduled(fixedDelay = 10000)
public void pool() {
    properties.getSqsQueues()
            .forEach(queue -> {
                val receiveMessageRequest = new ReceiveMessageRequest(queue)
                        .withWaitTimeSeconds(10)
                        .withMaxNumberOfMessages(10);

                // reading the messages
                val result = awsSqsClient.receiveMessage(receiveMessageRequest);
                val sqsMessages = result.getMessages();
                log.info("Received Message on queue {}: message = {}", queue, sqsMessages.toString());

                // deleting the messages
                sqsMessages.forEach(message -> {
                    val deleteMessageRequest = new DeleteMessageRequest(queue, message.getReceiptHandle());
                    awsSqsClient.deleteMessage(deleteMessageRequest);
                });
            });
}
Just to clarify, in my case I need multiple queues, one for each tenant, with the queue URL for each one passed in a property file. Of course, in your case you could get the queue names from another source, maybe a ThreadLocal that holds the queues you have created at runtime.
If you wish, you can also try the JMS approach, where you create message consumers and add a listener to each one you need (see the AWS JMS documentation).
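For illustration, a sketch of that JMS route (assuming the amazon-sqs-java-messaging-lib and Spring JMS are on the classpath; the queue name and handler delegation are placeholders):
SQSConnectionFactory connectionFactory = new SQSConnectionFactory(
        new ProviderConfiguration(),
        AmazonSQSClientBuilder.standard().withRegion(Regions.EU_WEST_1).build());

DefaultMessageListenerContainer container = new DefaultMessageListenerContainer();
container.setConnectionFactory(connectionFactory);
container.setDestinationName("a-queue-name"); // chosen at runtime, e.g. from the register(...) call
container.setMessageListener((MessageListener) message -> {
    // delegate to the user-supplied MessageHandler here
});
container.afterPropertiesSet();
container.start();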
When we do Spring and SQS we use the spring-cloud-starter-aws-messaging.
Then just create a Listener class
@Component
public class MyListener {

    @SqsListener(value = "myqueue")
    public void listen(MyMessageType message) {
        // process the message
    }
}
