Could you run Spring Batch inside a Java EE server (e.g. WebLogic), let's say as a web application? Is there any issue with Spring Batch creating more threads (for multi-threaded steps and parallel steps) inside a Java EE server? Is this creation of threads by the framework against the Java EE specification?
After reading the following link, I think it is okay and that people are doing this:
http://static.springsource.org/spring-batch/reference/html-single/index.html#runningJobsFromWebContainer
Please help.
This is an old question, but I will add an answer after all.
Yes, there may be some problems. I encountered such problems on a WebSphere server.
According to their documentation: http://www-01.ibm.com/support/docview.wss?uid=swg21246676
Using a Java™ call such as "newThread()" to spawn a new thread is not
supported according to the J2EE specification. This spawned thread
does not inherit the J2EE context. What is recommended to do instead
is to use an asynchronous bean or Commonj WorkManager thread. These
threads have a proper J2EE context and support an indirect JNDI
lookup.
Spring Batch creates its own threads using new Thread, and these threads do not inherit the J2EE context.
In my specific case, one Spring Batch job consumed a few REST services over HTTPS, and it turned out that threads spawned by Spring Batch could not see the HTTPS certificates installed in the WebSphere server, which caused certificate errors.
I see no issue here.
When Spring Batch (like Quartz Scheduler) runs as a web application, it is not bound by the prohibition on creating threads, which applies only to EJB components (not to servlets).
So, provided you do not exceed the server capacity limits, Spring Batch can run in any EE application.
It is a common practice. Section 4.4 of the book Spring Batch in Action discusses exactly this scenario: launching the batch job from the web container. The batch job should run within a thread pool of N threads, and N should be determined from the throughput results of a performance load test.
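As a minimal sketch of the "pool with N threads" idea, the following uses only java.util.concurrent. The class name BoundedJobRunner and the string results are hypothetical stand-ins; a real application would launch Spring Batch jobs through a launcher configured with a task executor rather than run lambdas directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: a fixed pool of N worker threads runs the submitted "jobs".
class BoundedJobRunner {

    // Runs every job on a pool of poolSize threads and returns the results in submission order.
    static List<String> runAll(int poolSize, List<String> jobNames) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : jobNames) {
                // Stand-in for launching a real batch job (assumption, not Spring Batch API).
                futures.add(pool.submit(() -> name + ":COMPLETED"));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that job has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e);
        } finally {
            pool.shutdown(); // no new work is accepted; running jobs are allowed to finish
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(2, List.of("jobA", "jobB", "jobC")));
    }
}
```

The pool size is the one knob here, which is why it should come from load-test results rather than a guess.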
We have a Spring Boot microservice running on a Liberty server. Liberty is configured with core and maximum sizes for its executor threads. Can we also use a Callable implementation in the service in parallel to make maximum use of the machine, or will that be counterproductive?
You should be able to instruct Spring to use Liberty's default managed executor (which runs tasks on the Liberty thread pool) by supplying a DefaultManagedTaskExecutor to the Spring AsyncSupportConfigurer.setTaskExecutor method.
This will cause Spring to look up java:comp/DefaultManagedExecutorService which is available in Liberty if you enable the concurrent-1.0 feature.
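To illustrate what that lookup amounts to, here is a rough sketch that resolves the same JNDI name with plain javax.naming. The class name ManagedExecutorLookup and the fallback to a local fixed pool are assumptions made so the example also runs outside a server; this is not how Spring implements it internally.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Hypothetical helper: prefer the container-managed executor, fall back to a local pool.
class ManagedExecutorLookup {

    static final String JNDI_NAME = "java:comp/DefaultManagedExecutorService";

    // Returns the executor bound in JNDI if there is one, otherwise a fixed pool
    // (the fallback is an assumption of this sketch, useful for tests outside a server).
    static ExecutorService lookupOrFallback(int fallbackThreads) {
        try {
            Object found = new InitialContext().lookup(JNDI_NAME);
            if (found instanceof ExecutorService) {
                return (ExecutorService) found;
            }
        } catch (NamingException e) {
            // No JNDI provider available, or the name is not bound: use the fallback below.
        }
        return Executors.newFixedThreadPool(fallbackThreads);
    }

    public static void main(String[] args) {
        ExecutorService executor = lookupOrFallback(2);
        executor.submit(() -> System.out.println("running on " + Thread.currentThread().getName()));
        // Only the application-owned fallback pool should be shut down like this;
        // a container-managed executor's lifecycle belongs to the server.
        executor.shutdown();
    }
}
```

Tasks submitted to the managed executor run on server threads, which is what gives them the container context that self-created threads lack.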
This page has an example of setting the task executor in Spring, except it uses a custom thread pool, and you would replace that with DefaultManagedTaskExecutor:
https://howtodoinjava.com/spring-boot2/rest/async-rest-controller-callable/
I am working on a project where we are planning to use WLP (WebSphere Liberty) instead of traditional WAS.
The code uses the WAS scheduler for scheduling activities.
Does Liberty have the same level of scheduler support and features as WAS?
How can I migrate the scheduler tasks from WebSphere to Liberty?
Code using the Scheduler in traditional WebSphere Application Server should not be migrated to EE Concurrency Utilities unless you are certain that you do not need the transactional/persistent quality of service that the Scheduler provides (Scheduler tasks run in a transaction and can roll back and be retried, and they can also persist across server restart). To obtain a similar quality of service in Liberty, you should migrate your Scheduler tasks to Persistent EJB Timers. Note that failover support across multiple servers is not present in Persistent EJB Timers in Liberty at the time of writing, although it is being worked on.
I am using Spring Boot 2.0.4.RELEASE with Tomcat as my server. My question is whether my application is running in event-loop style or not.
I am running some performance tests on my application, and after a certain time I see strange behaviour. Once the load reaches 500 requests/second, my application cannot serve any more than that. Via Prometheus I found that the maximum number of Tomcat threads is 200 by default. It looks like all the threads were consumed, and that is why it could not serve more than 500 requests/second. Please correct me if I am wrong.
Can the Tomcat server run in event-loop style?
If so, how can I change the event-loop size for Tomcat?
I tried switching to Jetty and still saw the same issue, so I am wondering whether my application is running in event-loop style at all.
I think something is wrong in your project; maybe one of your dependencies does not support reactive programming. If you want to benefit from asynchronous (reactive) programming, your code must be 100% reactive. Even for security you must use reactive Spring Security.
Normally a reactive Spring application runs on Netty rather than Tomcat, so check your dependencies; seeing Tomcat suggests the application is not on the reactive stack.
This is more of an analysis. After running some performance tests on my local machine, I was able to figure out what was actually happening inside my application.
I ran the performance test on my local machine and analysed the application through JConsole.
As I said, I scheduled all my blocking DB calls on Schedulers.elastic. I realised that this was causing the bottleneck: my DB connections are limited and I am using Hikari for connection pooling, so it does not matter how many threads I create from the elastic pool.
Reactive programming is about using resources to the fullest with fewer threads. Because the threads were being created in an unbounded way, the application was no different from a normal one.
As a resolution, I limited the number of threads used for DB calls to 100, and throughput jumped from 500 TPS to 2300 TPS.
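The effect of capping the threads used for blocking calls can be sketched with plain java.util.concurrent. This swaps in a JDK fixed pool for the Reactor scheduler used in the application, and the class name BoundedBlockingCalls and the 5 ms sleep are hypothetical stand-ins for a real JDBC call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: blocking calls are confined to a bounded pool.
class BoundedBlockingCalls {

    // Runs `calls` simulated blocking calls on at most `maxThreads` threads
    // and returns the highest number of calls observed in flight at once.
    static int peakConcurrency(int maxThreads, int calls) {
        ExecutorService dbPool = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                futures.add(CompletableFuture.runAsync(() -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(5); // stand-in for a blocking DB call (assumption)
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }, dbPool));
            }
            // Wait for every simulated call to complete before reading the peak.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return peak.get();
        } finally {
            dbPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak in-flight calls: " + peakConcurrency(4, 40));
    }
}
```

However many calls are queued, no more than maxThreads run at once, so the pool size can be matched to the number of connections the connection pool actually offers.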
I know this is not the number one should expect from a reactive application, which is capable of much more, but right now I have no choice but to put up with non-reactive drivers. I am waiting for production-grade reactive drivers for MS SQL Server.
We are planning to retire our existing legacy Java batch applications and recreate them with the latest available batch framework.
Given that we have a large number of batch jobs to modernise, we are looking for a framework or architecture that meets the following needs:
We want a batch solution that lets us dynamically deploy new batches as and when they are created, without disturbing the applications already deployed. Does Spring Cloud Task provide any such feature? Note: we only want to deploy the apps to our local server; this has nothing to do with the cloud.
If Spring Batch/Boot can provide the features we typically expect from a batch application, what is the special added value of Spring Cloud Task? I wasn't able to fully understand this from the Spring documentation available online.
From the Spring Cloud Task documentation, I understood that it allows an application to have many tasks within it. What should I do if each task has its own library dependencies, which might conflict with the dependencies of other tasks? In that case, should each of these tasks be moved to a new application, or is there a workaround?
To answer your questions:
Does Spring Cloud Task handle orchestration - No. Spring Cloud Task does not handle orchestration of tasks or jobs. The component in this ecosystem that handles the deployment/orchestration of tasks or jobs is really Spring Cloud Data Flow (which is why I asked if you use any type of cloud platform including YARN, Cloud Foundry, Kubernetes, or Mesos...the environments supported by Spring Cloud Data Flow).
What added value does Spring Cloud Task provide over Spring Boot/Spring Batch - Spring Cloud Task is designed to provide a few things:
Similar abilities to Spring Batch with regards to state management without needing to create a batch job. When running a Boot application on a cloud environment, there is no standard way of getting the results from environment to environment (YARN handles job results differently from tasks on Cloud Foundry which is different from jobs on Kubernetes, etc). Spring Batch provides this but now all short lived processes need the overhead of the Batch API so Spring Cloud Task provides a lighter touch to those use cases.
Automatically adds informational listeners. With Spring XD, when you ran a job in an XD container, the XD container automatically added a number of informational listeners that broadcast events that you could listen for. Spring Cloud Task brings the same functionality without the need for the XD container.
Integration with Spring Cloud Stream. Spring Cloud Task provides the ability to launch tasks from messages received from Spring Cloud Stream. Also, the informational messages previously mentioned (both Batch events as well as Task events) are sent via Spring Cloud Stream channels.
The DeployerPartitionHandler. When working in a cloud environment, this PartitionHandler implementation allows you to launch workers for a partitioned batch job as tasks. This allows for the dynamic scaling of partitioned batch jobs instead of the traditional option of pre-deploying workers that listen for work which wastes resources in a modern cloud environment.
How does the packaging of multiple tasks work with dependencies - In short, this is not recommended. The idea of a Spring Cloud Task is that the execution of the Spring Boot application is the Task. While you could package up multiple tasks and, using different methods, have them execute based on different stimuli, that goes against the 12-factor application concepts, which are essential for correct use of Spring Cloud Task.
My two cents
For the best option for a modern batch platform, you really need to look into some form of platform first, and that begins at the Cloud Foundry/Kubernetes/Mesos/YARN layer. Without that, you end up building a large part of the infrastructure yourself. That is why Spring XD evolved into Spring Cloud Data Flow. The added complexity that lived in the containers of Spring XD is removed by requiring a modern platform to run on (since they all handle those guarantees themselves). Without that piece, you're going to spend a lot of time managing the deployment and orchestration of applications that most modern platforms handle for you.
From there, the choice becomes pretty easy IMHO with Spring Cloud Task for simple tasks, Spring Batch for batch jobs, and Spring Cloud Data Flow for orchestration.
I am new to Spring Batch. I want to run Spring Batch jobs on server A and launch those jobs from server B using Spring Batch Admin. Is this possible? I have looked into the following two approaches:
1. JMX: I could convert the Spring Batch beans into MBeans, but I can't read them from Spring Batch Admin. Can you tell me how to read MBeans from Spring Batch Admin and launch them?
2. Common repository: I think that if I use the same DB repository for both Spring Batch and Spring Batch Admin, I can launch remote jobs from Spring Batch Admin (on server B). But in the job XML file in Spring Batch Admin, what should the classpath for the tasklet be?
Can you help with the above, or tell me if another way exists?
We ended up implementing a framework that uses MQ communication to handle this. Each 'batch node' registers itself along with any 'batch class' parameters such as 'nodeType=A' or 'jobSizeiCanHandle=BIG' (these are fictitious, but you get the point). The client console reads this information and queries the nodes via MQ for the job list. It then submits job requests with parameters via a rudimentary text-based protocol (property file format).
command=START_JOB
job=JobABC
param1=x
param2=y
One of the batch nodes picks up the message and starts the job. It returns the success/fail status in the same manner, in a message with the same correlation ID, so the client can show the response to the user.
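A message in this property-file format can be read with java.util.Properties. The sketch below is only an illustration: the class name JobCommandMessage and the rule that every key other than command and job is a job parameter are assumptions, not part of the framework described here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical value class for one request in the property-file style protocol.
class JobCommandMessage {
    final String command;
    final String job;
    final Map<String, String> params;

    private JobCommandMessage(String command, String job, Map<String, String> params) {
        this.command = command;
        this.job = job;
        this.params = params;
    }

    // Parses "key=value" lines; keys other than command/job are treated as job parameters.
    static JobCommandMessage parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable message", e);
        }
        String command = props.getProperty("command");
        String job = props.getProperty("job");
        if (command == null || job == null) {
            throw new IllegalArgumentException("message needs 'command' and 'job'");
        }
        Map<String, String> params = new TreeMap<>(); // sorted for stable output
        for (String key : props.stringPropertyNames()) {
            if (!key.equals("command") && !key.equals("job")) {
                params.put(key, props.getProperty(key));
            }
        }
        return new JobCommandMessage(command, job, params);
    }

    public static void main(String[] args) {
        JobCommandMessage m = parse("command=START_JOB\njob=JobABC\nparam1=x\nparam2=y\n");
        System.out.println(m.command + " " + m.job + " " + m.params);
    }
}
```

A reply could reuse the same format, with the correlation ID carried by the messaging layer rather than in the body.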
This allows us to do what you're talking about AND trigger the jobs via an external scheduler (Control-M). The 'nodeType=A' mentioned above allows us to query individual nodes (the nodes listen where 'nodeType=A or nodeType=*'), so commands can be targeted at specific nodes if necessary.
Keep in mind, this is our own console, not the Spring Batch Admin console. So perhaps that doesn't help you, but building a simple console doesn't take that long using the Spring Batch APIs (4 or 5 asps).
The batch nodes could also have started up simple services like HTTP REST services or 'whatever', but we use MQ heavily and I liked the idea of not having to preregister nodes (the framework code doesn't know/care that it's in an HTTP container, so it couldn't register the endpoint easily). With MQ, the channel is preconfigured and all apps just 'use it', so it seemed easier.
Good luck.
I am trying to do the same thing, but it seems that in order to launch a job directly from Spring Batch Admin, all the job resources have to be added to the Spring Batch web app. Maybe try RESTful job submission with Spring MVC.
#chau
One way to use Spring Batch Admin as is, yet still discover and invoke remote jobs, is to provide your own implementations of org.springframework.batch.admin.service.JobService and org.springframework.batch.core.launch.JobOperator that can query and invoke jobs from a remote job registry/repository.
You can find a custom implementation of JobService and a JMX-enabled job administrator at https://github.com/regunathb/Trooper/tree/master/batch-core as org.trpr.platform.batch.impl.spring.admin.SimpleJobService and org.trpr.platform.batch.impl.spring.jmx.JobAdministrator.
The Spring beans XML that uses these beans is here: https://github.com/regunathb/Trooper/blob/master/batch-core/src/main/resources/packaged/common-batch-config.xml