Spring Batch CompositeItemProcessor get value from other delegates

I have a compositeItemProcessor as below
<bean id="compositeItemProcessor" class="org.springframework.batch.item.support.CompositeItemProcessor">
<property name="delegates">
<list>
<bean class="com.example.itemProcessor1"/>
<bean class="com.example.itemProcessor2"/>
<bean class="com.example.itemProcessor3"/>
<bean class="com.example.itemProcessor4"/>
</list>
</property>
</bean>
The issue I have is that within itemProcessor4 I require values from both itemProcessor1 and itemProcessor3.
I have looked at using the step execution context, but this does not work because everything happens within one step. I have also looked at using @AfterProcess within ItemProcessor1, but this does not work because it isn't called until after ItemProcessor4.
What is the correct way to share data between delegates in a CompositeItemProcessor?
Would a util:map that is updated in itemProcessor1 and read in itemProcessor4 be a workable solution, given that the commit-interval is set to 1?

Using the step execution context won't work because it is persisted at chunk boundaries, so it can't be used to share data between processors within the same chunk.
@AfterProcess is called after the registered item processor, which in your case is the composite processor (so after ItemProcessor4). This won't work either.
The only option left is to use a data holder object that you share between the item processors.
Hope this helps.

This page seems to state that there are two types of ExecutionContexts, one at step-level, one at job-level.
https://docs.spring.io/spring-batch/trunk/reference/html/patterns.html#passingDataToFutureSteps
You should be able to get the job context and set keys on that, from the step context

I had a similar requirement in my application. I created a data transfer object, ItemProcessorDto, which is shared by all the ItemProcessors. You can store data in this DTO in the first processor, and the remaining processors can read it from there. Any ItemProcessor can also update the data in the DTO.
Below is a code snippet:
@Bean
public ItemProcessor1<ItemProcessorDto> itemProcessor1() {
    log.info("Generating ItemProcessor1");
    return new ItemProcessor1<>();
}
@Bean
public ItemProcessor2<ItemProcessorDto> itemProcessor2() {
    log.info("Generating ItemProcessor2");
    return new ItemProcessor2<>();
}
@Bean
public ItemProcessor3<ItemProcessorDto> itemProcessor3() {
    log.info("Generating ItemProcessor3");
    return new ItemProcessor3<>();
}
@Bean
public ItemProcessor4<ItemProcessorDto> itemProcessor4() {
    log.info("Generating ItemProcessor4");
    return new ItemProcessor4<>();
}
@Bean
@StepScope
public CompositeItemProcessor<ItemProcessorDto, ItemProcessorDto> compositeItemProcessor() {
    log.info("Generating CompositeItemProcessor");
    // CompositeItemProcessor takes an input and an output type; both are the DTO here
    CompositeItemProcessor<ItemProcessorDto, ItemProcessorDto> compositeItemProcessor = new CompositeItemProcessor<>();
    compositeItemProcessor.setDelegates(Arrays.asList(itemProcessor1(), itemProcessor2(), itemProcessor3(), itemProcessor4()));
    return compositeItemProcessor;
}
// @Data is the Lombok annotation that generates getters and setters
@Data
public class ItemProcessorDto {
    private List<String> sharedData_1;
    private Map<String, String> sharedData_2;
}
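
To make the data flow concrete, here is a minimal sketch in plain Java, without Spring Batch on the classpath. Each delegate is modelled as a java.util.function.Function, and the chain hands the output of one delegate to the next and stops when a delegate returns null, which mimics what CompositeItemProcessor does. The names SharedItem, DelegateChainSketch, valueFromFirst, valueFromThird and combined are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the shared DTO; the field names are illustrative.
class SharedItem {
    String valueFromFirst;
    String valueFromThird;
    String combined;
}

class DelegateChainSketch {
    // Passes the item through every delegate in order, like a composite processor.
    static SharedItem runChain(SharedItem item, List<Function<SharedItem, SharedItem>> delegates) {
        SharedItem current = item;
        for (Function<SharedItem, SharedItem> delegate : delegates) {
            current = delegate.apply(current);
            if (current == null) {
                return null; // a null result means the item was filtered out
            }
        }
        return current;
    }

    static SharedItem demo() {
        List<Function<SharedItem, SharedItem>> delegates = new ArrayList<>();
        // Delegate 1 stores a value on the item.
        delegates.add(item -> { item.valueFromFirst = "A"; return item; });
        // Delegate 2 does not care about the shared values.
        delegates.add(item -> item);
        // Delegate 3 stores another value.
        delegates.add(item -> { item.valueFromThird = "C"; return item; });
        // Delegate 4 reads what delegates 1 and 3 left on the item.
        delegates.add(item -> { item.combined = item.valueFromFirst + item.valueFromThird; return item; });
        return runChain(new SharedItem(), delegates);
    }
}
```

Because the shared values travel on the item itself, each item carries its own state and no processor needs a field that other threads could overwrite.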

Related

Are my app's Controllers thread safe? Spring 4.1

I am developing an app in Spring 4.1. I know that controllers, like any other Spring bean, are singletons by default and therefore not inherently thread safe: the same controller instance is used to process multiple concurrent requests. That much is clear to me. What I want to confirm is whether I need to explicitly set @Scope("prototype") or request scope on the controller class. I read in a previous Stack Overflow post that even if the scope is not set to request/prototype, the Spring container can process each request individually based on the @RequestParam values passed or the @ModelAttribute associated with the method arguments.
So I want to confirm: is my code below safe for handling multiple requests concurrently?
@Controller
public class LogonController {
    /** Logger for this class and subclasses */
    protected final Log logger = LogFactory.getLog(getClass());
    @Autowired
    SimpleProductManager productManager;
    @Autowired
    LoginValidator validator;
    @RequestMapping("logon")
    public String renderForm(@ModelAttribute("employee") Logon employeeVO) {
        return "logon";
    }
    @RequestMapping(value = "Welcome", method = RequestMethod.POST)
    public ModelAndView submitForm(@ModelAttribute("employee") Logon employeeVO,
            BindingResult result) {
        // Check validation errors
        validator.validate(employeeVO, result);
        if (result.hasErrors()) {
            return new ModelAndView("logon");
        }
        if (!productManager.chkUserValidation(employeeVO.getUsername(), employeeVO.getPassword())) {
            return new ModelAndView("logon");
        }
        ModelAndView model = new ModelAndView("Welcome");
        return model;
    }
}
I also have another question.
Since I am using SimpleProductManager productManager, do I need to specify scope="prototype" in its bean declaration in app-servlet.xml?
Below is my configuration.xml:
<bean id="mySessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
<property name="dataSource"><ref bean="dataSource"/></property>
<property name="configLocation" value="classpath:hibernate.cfg.xml" />
</bean>
<bean id="productManager" class="com.BlueClouds.service.SimpleProductManager" >
<property name="productDao" ref="productDao"/>
</bean>
<bean id="productDao" class="com.BlueClouds.dao.HbmProductDao">
<property name="sessionFactory"><ref bean="mySessionFactory"/></property>
</bean>
<bean id="loginValidator" class="com.BlueClouds.service.LoginValidator" >
</bean>
Because it is a singleton, a single validator instance is shared among all requests. Do I need to add scope="request" in the bean configuration XML, or do I need to wrap validate() in a synchronized block? Please advise.
Thanks much .
You can tell whether your code is thread safe by answering the following questions:
Can multiple threads modify a static field that is not thread safe (for example, an ArrayList) at the same time?
Can multiple threads modify an instance field that is not thread safe at the same time?
If the answer to either question is yes, then your code is not thread safe.
Your code doesn't change any field, so it should be thread safe.
The general idea is that if multiple threads can change or access the same memory at the same time, the code is not thread safe, and some form of synchronization (such as "synchronized") is needed.
It is worth learning more about stack memory, heap memory and global memory in Java, so that you can tell whether your code changes the same memory at the same time.
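
The two questions above can be illustrated with a small sketch in plain Java (no Spring involved). The class name ThreadSafetySketch and its methods are hypothetical. The isValid method touches only its arguments and local variables, which live on each thread's own stack, so sharing one instance is harmless. The attempts counter is shared mutable state, so it uses an AtomicInteger rather than a plain int field.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadSafetySketch {
    // Shared mutable state: every thread using this instance sees the same counter,
    // so it needs a thread-safe type (or synchronization).
    private final AtomicInteger attempts = new AtomicInteger();

    // Stateless: works only on arguments and locals, so concurrent calls cannot interfere.
    boolean isValid(String username, String password) {
        String trimmed = username == null ? "" : username.trim();
        return !trimmed.isEmpty() && password != null && !password.isEmpty();
    }

    int recordAttempt() {
        return attempts.incrementAndGet();
    }

    int attempts() {
        return attempts.get();
    }

    // Runs several threads against one shared instance and returns the final counter value.
    static int runConcurrently(int threads, int callsPerThread) {
        ThreadSafetySketch shared = new ThreadSafetySketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    shared.isValid("user", "secret");
                    shared.recordAttempt();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.attempts();
    }
}
```

With a plain int field and attempts++ instead of the AtomicInteger, increments could be lost under contention, which is exactly the situation the second question describes.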

Spring REST webservice serializing to multiple JSON formats

I have a Spring REST web service which populates a generic object based on data we have in a database. The goal is to have the users pass a parameter to the web service to indicate the format they want the output to be in. Based on their input we will use the correct JsonSerializer to give them what they want.
I have set up my webservice as follows, in my spring-ws-servlet.xml I have set our company ObjectMapper to be used by the mvc:message-converters, I have also set it on the RestController so that it can adjust the ObjectMapper to register the serializer. It looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:mvc="http://www.springframework.org/schema/mvc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd http://www.springframework.org/schema/mvc http://www.springframework.org/schema/mvc/spring-mvc.xsd">
<mvc:annotation-driven>
<mvc:message-converters register-defaults="true">
<bean
class="org.springframework.http.converter.json.MappingJackson2HttpMessageConverter">
<property name="objectMapper" ref="jacksonObjectMapper" />
</bean>
</mvc:message-converters>
</mvc:annotation-driven>
<bean id="endpoint" class="org.company.Controller">
<property name="objectMapper" ref="jacksonObjectMapper" />
</bean>
<bean id="jacksonObjectMapper" class="org.company.CompanyObjectMapper" />
</beans>
The controller looks like this:
@RestController
public class Controller {
    private ObjectMapper objectMapper;
    @RequestMapping(...)
    public GenericObject getObject(@PathVariable ...) {
        // Get Object from database, just creating an object for example
        GenericObject object = new GenericObject();
        // Based on the user input we will pick out
        // a Serializer that extends JsonSerializer<GenericObject>
        BaseSerializer serializer = getSerializer();
        // Create a SimpleModule and use it to register our serializer
        SimpleModule module = new SimpleModule();
        module.addSerializer(GenericObject.class, serializer);
        // Get the shared mapper and register the module on it
        ObjectMapper mapper = getObjectMapper();
        mapper.registerModule(module);
        return object;
    }
    public ObjectMapper getObjectMapper() {
        return objectMapper;
    }
    public void setObjectMapper(ObjectMapper objectMapper) {
        this.objectMapper = objectMapper;
    }
}
The issue is that when I publish my webapp, the first query works correctly, if I specify format=format1, I will get the output in format1. However, after that I can only receive format1. I may specify format=format2, but still get the output in format1. I believe the issue is that the ObjectMapper still has the module registered to it from the first query. I have read that I can avoid this problem by creating a new ObjectMapper every time, but I am not sure how to set that to be used by Spring when it outputs the JSON.
Could someone help me come up with a solution to either create a new ObjectMapper every time I run the code and set that ObjectMapper to the Spring rest service, or help me figure out how I can "unregister" any modules that are registered on the object mapper before setting the latest desired serializer?
One idea is to create and configure all the mappers you need at startup as Spring beans.
Then create a default object mapper that works as a dispatcher to the other object mappers (or as the fallback one); it can be made aware of the current HTTP request.
You can register all the mappers in this object mapper, and register it as the default one in Spring.
Something like this, maybe:
public class RequestAwareObjectMapper extends ObjectMapper {
    private Map<String, ObjectMapper> mappers = new HashMap<>();
    @Override
    public String writeValueAsString(Object value) throws JsonProcessingException {
        HttpServletRequest req = null; // get the request from the Spring context, if any; this is a managed Spring bean, so it won't be a problem
        String param = null; // read the param from the query
        ObjectMapper mapper = mappers.get(param);
        if (mapper == null) {
            // Fall back to this mapper's own configuration; calling this.writeValueAsString here would recurse forever
            return super.writeValueAsString(value);
        }
        return mapper.writeValueAsString(value);
    }
    public void registerMapper(String key, ObjectMapper mapper){...}
}
This way you do not pollute your controller with references to the object mapper, and you can carry on using @ResponseBody (thanks to @RestController).
I am sure there is a cleaner way to achieve the same result by integrating a similar solution into the Spring flow, but I can't think of anything better right now.
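
The dispatcher idea can be sketched without Jackson or Spring on the classpath. In the sketch below, plain functions stand in for the pre-configured ObjectMapper instances, and the names Payload, FormatDispatcherSketch, register and write are hypothetical. The point it tries to show is that every format gets its own writer, configured once at startup, so a request never mutates shared serializer state and an earlier format cannot leak into a later request.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the GenericObject from the question.
class Payload {
    final String name;
    final int value;

    Payload(String name, int value) {
        this.name = name;
        this.value = value;
    }
}

class FormatDispatcherSketch {
    // One writer per format; populated at startup and only read afterwards.
    private final Map<String, Function<Payload, String>> writers = new HashMap<>();
    private final Function<Payload, String> fallback;

    FormatDispatcherSketch(Function<Payload, String> fallback) {
        this.fallback = fallback;
    }

    void register(String format, Function<Payload, String> writer) {
        writers.put(format, writer);
    }

    // Picks the writer for the requested format; unknown or missing formats use the fallback.
    String write(String format, Payload payload) {
        return writers.getOrDefault(format, fallback).apply(payload);
    }

    static FormatDispatcherSketch demo() {
        FormatDispatcherSketch dispatcher = new FormatDispatcherSketch(p -> p.name);
        dispatcher.register("format1", p -> p.name + "=" + p.value);
        dispatcher.register("format2", p -> "{\"" + p.name + "\":" + p.value + "}");
        return dispatcher;
    }
}
```

Asking for format1, then format2, then format1 again returns the matching output each time, because nothing is registered or unregistered while requests are being served.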
Create your own CustomObjectMapper class and autowire it into your controller using the @Autowired annotation. You can then create different methods to produce differently formatted output.
You can also pass the serializer as a parameter.
public class CustomObjectMapper extends ObjectMapper {
public CustomObjectMapper() {
super();
super.setSerializationInclusion(JsonInclude.Include.ALWAYS);
super.configure(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY, true);
..... etc.....
super.setDateFormat(df);
}
public byte[] generateJsonFormat1(Object value, BaseSerializer serializer) throws IOException, JsonGenerationException, JsonMappingException {
Hibernate4Module hm = new Hibernate4Module();
hm.configure(Hibernate4Module.Feature.USE_TRANSIENT_ANNOTATION, false);
hm.configure(Hibernate4Module.Feature.FORCE_LAZY_LOADING, false);
.....
.....
hm.addSerializer(Object.class, serializer);
return super.registerModule(hm).writeValueAsBytes(value);
}
public byte[] generateJsonFormat2(Object value, BaseSerializer serialiser) throws IOException, JsonGenerationException, JsonMappingException {
SimpleModule sm = new SimpleModule();
sm.addSerializer(Object.class, serialiser);
return super.registerModule(sm).writeValueAsBytes(value);
}
}
Above code is a snippet from my own application. I hope it gives the idea.

CSV File To DB2 Database - skip columns - Spring Batch project

I am working on a Spring Batch project where I have to push data from a CSV file into a DB. I managed to implement the batch and the rest, and the data is currently being pushed as it should, but I wonder if there is any way to skip some of the columns in the CSV file, as some of them are irrelevant.
I did a bit of research but I wasn't able to find an answer, unless I missed something.
Sample of my code below.
<bean id="mysqlItemWriter"
class="org.springframework.batch.item.database.JdbcBatchItemWriter">
<property name="dataSource" ref="dataSource" />
<property name="sql">
<value>
<![CDATA[
insert into WEBREPORT.RAWREPORT(CLIENT,CLIENTUSER,GPS,EXTENSION) values (:client, :clientuser, :gps, :extension)
]]>
</value>
</property>
You can implement your own FieldSetMapper, which maps the structure of one line to your POJO in the reader.
Let's say you have:
name, surname, email
Mike, Evans, test@test.com
And you have a Person model with only name and email; you are not interested in surname. Here is a reader example:
@Component
@StepScope
public class PersonReader extends FlatFileItemReader<Person> {
    @Override
    public void afterPropertiesSet() throws Exception {
        // load file in csvResource variable
        setResource(csvResource);
        setLineMapper(new DefaultLineMapper<Person>() {
            {
                setLineTokenizer(new DelimitedLineTokenizer());
                setFieldSetMapper(new PersonFieldSetMapper());
            }
        });
        super.afterPropertiesSet();
    }
}
And you can define PersonFieldSetMapper:
@Component
@JobScope
public class PersonFieldSetMapper implements FieldSetMapper<Person> {
    @Override
    public Person mapFieldSet(final FieldSet fieldSet) throws BindException {
        final Person person = new Person();
        person.setName(fieldSet.readString(0)); // columns are zero based
        person.setEmail(fieldSet.readString(2)); // index 1 (surname) is never read, so it is skipped
        return person;
    }
}
This is for skipping columns, which, if I understood correctly, is what you want. Skipping rows can be done as well; I explained how to skip blank lines, for example, in this question.
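
The index-based mapping can be tried out in isolation with a small plain-Java sketch. It assumes a simple comma-separated line without quoted fields; the naive String.split used here is only a stand-in for the tokenizer, and the names ColumnSkipSketch and mapLine are hypothetical.

```java
class ColumnSkipSketch {
    // Minimal model holding only the columns we care about.
    static final class Person {
        final String name;
        final String email;

        Person(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    // Maps "name, surname, email" to a Person by reading columns 0 and 2 only.
    static Person mapLine(String line) {
        // Naive split: does not handle quoted fields that contain commas.
        String[] fields = line.split(",", -1);
        if (fields.length < 3) {
            throw new IllegalArgumentException("Expected at least 3 columns: " + line);
        }
        // Column 1 (surname) is simply never read.
        return new Person(fields[0].trim(), fields[2].trim());
    }
}
```

Skipping a column is therefore nothing more than not reading its index; the INSERT statement of the writer then only references the properties that the model actually has.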
If the check for the skip is simple and does not need a database round trip, you can use a simple ItemProcessor which returns null for skipped items.
Very simple pseudocode:
public class SkipProcessor implements ItemProcessor<Foo,Foo>{
public Foo process(Foo foo) throws Exception {
//check for a skip
if(skip(foo)) {
return null;
} else {
return foo;
}
}
}
If the skip check is more complex and needs a database round trip, you can still use the item processor, but performance will suffer.
If performance is critical, it depends on your setup, requirements and possibilities. I would try it with two steps: the first step loads the CSV into the database (without any checks), and the second step reads the data from the database, with the skip check done by a clever SQL JOIN in the query for the ItemReader.

How does Spring Batch share data between jobs

I have a question about Spring Batch jobs.
I want to share the data of one job with another job in the same execution context. Is it possible? If so, how?
My requirement is caching. I have a file where some data is stored. My job runs daily and needs the data from that file. I don't want the job to read the file every day; instead, I want to store the file's data in a cache (a HashMap), so that when the same job runs the next day it uses the data from the cache only. Is this possible in Spring Batch?
Your suggestions are welcome.
You can use a Spring bean with an init method that initializes your cache at startup.
Add the initializing bean to your application context:
<bean id="yourCacheBean" class="yourpackage.YourCacheBean" init-method="initialize">
</bean>
YourCacheBean looks like this:
public class YourCacheBean {
private Map<Object, Object> yourCache;
public void initialize() {
//TODO: Initialize your cache
}
}
Pass the cache bean to the itemReader, itemProcessor or itemWriter in job.xml:
<bean id="exampleProcessor" class="yourpackage.ExampleProcessor" scope="step">
<property name="cacheBean" ref="yourCacheBean" />
</bean>
ExampleProcessor looks like this:
public class ExampleProcessor implements ItemProcessor<String, String> {
private YourCacheBean cacheBean;
public String process(String arg0) {
return "";
}
public void setCacheBean(YourCacheBean cacheBean) {
this.cacheBean = cacheBean;
}
}
Create a job to import the file into a database. Other jobs can then use the data from the database as a cache.
Another way may be to read the file into a Map<> and serialize the object to a file, then deserialize it when needed (but I still prefer the database as a cache).
Spring has a cache annotation that may help in this kind of case, and it is really easy to implement. The first call to a method is executed; afterwards, if you call the same method with exactly the same arguments, the value is returned from the cache.
Here is a short tutorial: http://www.baeldung.com/spring-cache-tutorial
In your case, if the call that reads the file always uses the same arguments, it will work as you want. Just take care of the TTL.
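
The behaviour described for the cache annotation (run the method on the first call, return the stored value on later calls with the same argument) can be sketched in plain Java with ConcurrentHashMap.computeIfAbsent. This is not Spring's implementation; FileCacheSketch, get, evict and loadCount are hypothetical names, and the loader function stands in for the code that reads the file.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class FileCacheSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader;
    // Counts how often the expensive load actually ran.
    private final AtomicInteger loads = new AtomicInteger();

    FileCacheSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    // First call per path runs the loader; later calls return the cached value.
    String get(String path) {
        return cache.computeIfAbsent(path, p -> {
            loads.incrementAndGet();
            return loader.apply(p);
        });
    }

    // Manual eviction, standing in for a TTL expiring the entry.
    void evict(String path) {
        cache.remove(path);
    }

    int loadCount() {
        return loads.get();
    }
}
```

Note that an in-memory cache like this lives only as long as the JVM does. If the daily job is launched as a fresh process each time, the map starts empty on every run, which is one reason the database or serialized-file suggestions above may fit that setup better.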

Ignore some classes while scanning PackagesToScan

I have a package (say packagesToScan) containing classes annotated with @Entity that I wish to persist.
While defining the ApplicationContext configuration, I've done the following.
@Configuration
@EnableJpaRepositories("packagesToScan")
@EnableTransactionManagement
@PropertySource("server/jdbc.properties")
@ComponentScan("packagesToScan")
public class JpaContext {
    ...
    // Other configurations
    ....
    @Bean
    public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
        LocalContainerEntityManagerFactoryBean emf = new LocalContainerEntityManagerFactoryBean();
        emf.setDataSource(this.dataSource());
        emf.setJpaVendorAdapter(this.jpaVendorAdapter());
        emf.setPackagesToScan("packagesToScan");
        emf.setJpaProperties(this.hibernateProperties());
        return emf;
    }
While developing, I have some classes within packagesToScan which don't satisfy the requirements for persistence (no primary keys, etc.), and because of this I can't run tests: the ApplicationContext setup fails.
Is there any way that I can scan just some selected classes, or ignore some classes within packagesToScan?
I have been trying to solve the same problem and finally got a solution as below:
<bean id="mySessionFactory" class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
<property name="dataSource" ref="myDataSource"/>
<property name="packagesToScan" value="com.mycompany.bean"/>
<property name="entityTypeFilters" ref="packagesToScanIncludeRegexPattern">
</property>
<property name="hibernateProperties">
// ...
</property>
</bean>
<bean id="packagesToScanIncludeRegexPattern" class="org.springframework.core.type.filter.RegexPatternTypeFilter" >
<constructor-arg index="0" value="^(?!com.mycompany.bean.Test).*"/>
</bean>
I realized that there is a setEntityTypeFilters function on the LocalSessionFactoryBean class which can be used to filter which classes to be included. In this example I used RegexPatternTypeFilter but there are other types of filters as well.
Also note that the filters work with include semantics. In order to convert to exclude semantics I had to use negative lookahead in the regex.
This example shows the xml configuration but it should be trivial to convert to java based configuration.
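
The negative lookahead trick can be checked on its own with java.util.regex, independent of Spring. The sketch below is a small stand-alone test of the pattern, not the Spring filter itself; LookaheadFilterSketch and included are hypothetical names. It escapes the dots, which the XML example above leaves unescaped (there they match any character, which is looser but usually harmless for class names).

```java
import java.util.regex.Pattern;

class LookaheadFilterSketch {
    // Matches every class name that does NOT start with com.mycompany.bean.Test
    static final Pattern INCLUDE = Pattern.compile("^(?!com\\.mycompany\\.bean\\.Test).*");

    // Returns true when the class would be included by the filter.
    static boolean included(String className) {
        return INCLUDE.matcher(className).matches();
    }
}
```

Keep in mind that the lookahead is a prefix check: it excludes com.mycompany.bean.Test, but also any other class whose fully qualified name begins with the same characters, such as com.mycompany.bean.TestEntity.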
I stumbled upon a similar problem. I needed to add some but not all entities from a package. Here is how I did it:
// add all entities from some package
localContainerEntityManagerFactoryBean.setPackagesToScan("com.companyname.model");
// add only the required entities from a library
localContainerEntityManagerFactoryBean.setPersistenceUnitPostProcessors(new PersistenceUnitPostProcessor() {
    @Override
    public void postProcessPersistenceUnitInfo(MutablePersistenceUnitInfo persistenceUnit) {
        persistenceUnit.addManagedClassName("com.companyname.common.model.SomeEntityName");
    }
});
I found a solution: use setPackagesToScan and then remove the unwanted packages. It turns out that persistenceUnit.getManagedClassNames() in a PersistenceUnitPostProcessor returns a regular ArrayList, which can be manipulated.
em.setPersistenceUnitPostProcessors(new PersistenceUnitPostProcessor() {
    @Override
    public void postProcessPersistenceUnitInfo(MutablePersistenceUnitInfo persistenceUnit) {
        List<String> managedClassNames = persistenceUnit.getManagedClassNames();
        managedClassNames.removeIf(fullClassName -> fullClassName.startsWith("com.example.twodatasources.product"));
    }
});
In my case the persistence unit shown is responsible for all entities except the ones in the com.example.twodatasources.product package. The other one is responsible only for com.example.twodatasources.product.
The code above is for the Hibernate part. It's worth mentioning that the Spring part should also be filtered. This can be achieved by adding a ComponentScan.Filter to @EnableJpaRepositories as shown below:
@Configuration
@PropertySource({ "classpath:application.yml" })
@EnableJpaRepositories(
    basePackages = "com.example.twodatasources",
    excludeFilters = {@ComponentScan.Filter(type = FilterType.REGEX, pattern = "com\\.example\\.twodatasources\\.product\\..*")},
    entityManagerFactoryRef = "userEntityManager",
    transactionManagerRef = "userTransactionManager"
)
I also used entityTypeFilters to ignore some classes while scanning packagesToScan,
but as I found out, setting entityTypeFilters via
<property name="entityTypeFilters" ref="packagesToScanIncludeRegexPattern">
clears the existing default filters, which are (according to the documentation):
Specify custom type filters for Spring-based scanning for entity classes.
Default is to search all specified packages for classes annotated with @javax.persistence.Entity, @javax.persistence.Embeddable or @javax.persistence.MappedSuperclass.
As I verified in the debugger, this results in Spring scanning every class in packagesToScan that is not excluded by the packagesToScanIncludeRegexPattern regex, regardless of its annotations. This has an impact on startup time.
As a workaround I implemented a custom filter that combines the default filters with the regex filter:
public class RegexPatternEntitiesTypeFilter extends AbstractClassTestingTypeFilter {
    private final Set<String> entityPackageRegistry;
    private final Pattern pattern = Pattern.compile("^(?!my.company.package.exclude).*");
    public RegexPatternEntitiesTypeFilter() {
        entityPackageRegistry = ImmutableSet.of(Entity.class.getName(), Embeddable.class.getName(),
                MappedSuperclass.class.getName());
    }
    @Override
    protected boolean match(ClassMetadata metadata) {
        boolean result = true;
        if (metadata instanceof AnnotationMetadataReadingVisitor) {
            // Keep only classes carrying one of the default JPA annotations
            Set<String> annotationTypes = ((AnnotationMetadataReadingVisitor) metadata).getAnnotationTypes();
            result = !Sets.intersection(entityPackageRegistry, annotationTypes).isEmpty();
        }
        // Then apply the exclusion regex to the class name
        boolean matches = this.pattern.matcher(metadata.getClassName()).matches();
        return result && matches;
    }
}

Resources