Reading multiple files from a file system that match the job parameters, using MultiResourceItemReader - Spring

Use Case:
I would like to launch a job that takes employee ids as job parameters; there can be multiple employee ids.
The files reside in a file system (a remote file system, not a local one), and each file name contains an employee id.
I need to process the files whose names contain those employee ids and pass them to the reader.
I am thinking of using MultiResourceItemReader, but I am not sure how to match the file names in the file system against the employee ids given as job parameters.
Please suggest.

The class MultiResourceItemReader has a method setResources(Resource[] resources) which lets you specify the resources to read, either as an explicit list or with a wildcard expression (or both).
Example (explicit list):
<bean class="org.springframework.batch.item.file.MultiResourceItemReader">
    <property name="resources">
        <list>
            <value>file:C:/myFiles/employee-1.csv</value>
            <value>file:C:/myFiles/employee-2.csv</value>
        </list>
    </property>
</bean>
Example (wildcard):
<bean class="org.springframework.batch.item.file.MultiResourceItemReader">
    <property name="resources" value="file:C:/myFiles/employee-*.csv" />
</bean>
As you may know, you can use job parameters in the configuration with late binding, #{jobParameters['key']}, provided the bean is declared with scope="step":
<bean class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step">
    <property name="resources" value="file:C:/myFiles/employee-#{jobParameters['id']}.csv" />
</bean>
Unfortunately, a wildcard expression can't express an OR over a list of values with a separator (id1, id2, id3...). And I'm guessing you don't know in advance how many ids there will be, so you can't declare an explicit list with a predefined number of entries.
However, a working solution is to use the loop mechanism of Spring Batch with a classic FlatFileItemReader: the last step declares next="" back to the first step, and the job keeps looping until every item has been read. I will provide code samples if needed.
EDIT
Let's say you have a single chunk-oriented step that reads one file at a time. First of all, you need to put the current id from the job parameters into the execution context so it can be passed to the reader.
public class ParametersManagerTasklet implements Tasklet, StepExecutionListener {

    private Integer count = 0;
    private Boolean repeat = true;

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // Nothing to do before the step
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Get the job parameter and split it into individual ids
        String[] ids = ((String) chunkContext.getStepContext()
                .getJobParameters().get(PARAMETER_KEY)).split(DELIMITER);
        // Check for end of list
        if (count >= ids.length) {
            // Stop the loop
            repeat = false;
        } else {
            // Save the current id and increment the counter.
            // Note: getJobExecutionContext() returns a read-only copy,
            // so write through the StepExecution instead.
            chunkContext.getStepContext().getStepExecution()
                    .getJobExecution().getExecutionContext().put(CURRENT_ID_KEY, ids[count++]);
        }
        return RepeatStatus.FINISHED;
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        if (!repeat) {
            return new ExitStatus("FINISHED");
        } else {
            return new ExitStatus("CONTINUE");
        }
    }
}
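The constants PARAMETER_KEY, DELIMITER and CURRENT_ID_KEY are not defined in the snippet above; illustrative values might be:
public static final String PARAMETER_KEY = "ids";              // job parameter holding e.g. "1,2,3"
public static final String DELIMITER = ",";                    // separator between ids
public static final String CURRENT_ID_KEY = "CURRENT_ID_KEY";  // must match the key referenced by the reader's resource expression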
Now you declare this step in your XML and create a loop :
<batch:step id="ParametersStep">
<batch:tasklet>
<bean class="xx.xx.xx.ParametersManagerTasklet" />
</batch:tasklet>
<batch:next on="CONTINUE" to="ReadStep" />
<batch:end on="FINISHED" />
</batch:step>
<batch:step id="ReadStep">
<batch:tasklet>
<batch:chunk commit-interval="10">
<batch:reader>
<bean class="org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources" value="file:C:/myFiles/employee-#{jobExecutionContext[CURRENT_ID_KEY]}.csv" />
</bean>
</batch:reader>
<batch:writer>
</batch:writer>
</batch:chunk>
</batch:tasklet>
<batch:next on="*" to="ParametersStep" />
</batch:step>

You can write your own FactoryBean to perform a custom resource search.
public class ResourcesFactoryBean extends AbstractFactoryBean<Resource[]> {

    private String[] ids;
    private String path;

    public void setIds(String[] ids) {
        this.ids = ids;
    }

    public void setPath(String path) {
        this.path = path;
    }

    @Override
    protected Resource[] createInstance() throws Exception {
        final List<Resource> l = new ArrayList<Resource>();
        final PathMatchingResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
        for (final String id : ids) {
            // Build the concrete location for this id and resolve it
            final String p = String.format(path, id);
            l.addAll(Arrays.asList(resolver.getResources(p)));
        }
        return l.toArray(new Resource[l.size()]);
    }

    @Override
    public Class<?> getObjectType() {
        return Resource[].class;
    }
}
---
<bean id="reader" class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step">
<property name="delegate" ref="itemReader" />
<property name="resources">
<bean class="ResourcesFactoryBean">
<property name="path"><value>file:C:/myFiles/employee-%s.cvs</value> </property>
<property name="ids">
<value>#{jobParameters['id']}</value>
</property>
</bean>
</property>
</bean>
The job parameter 'id' is a comma-separated list of your ids; Spring converts the comma-separated String into the String[] expected by setIds.
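For completeness, a sketch of how such a job might be launched with the comma-separated id parameter. It assumes a JobLauncher and the job are defined in the application context; the file name "job-config.xml" and the bean name "employeeJob" are illustrative, not from the original answer.
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class EmployeeJobLauncher {
    public static void main(String[] args) throws Exception {
        // Load the batch configuration (file and bean names are illustrative)
        ApplicationContext ctx = new ClassPathXmlApplicationContext("job-config.xml");
        JobLauncher jobLauncher = ctx.getBean(JobLauncher.class);
        Job job = ctx.getBean("employeeJob", Job.class);

        JobParameters params = new JobParametersBuilder()
                .addString("id", "1,2,3")                     // comma-separated employee ids
                .addLong("time", System.currentTimeMillis())  // makes each run a distinct job instance
                .toJobParameters();

        jobLauncher.run(job, params);
    }
}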


Spring Batch read step running in loop

I came across a piece of code that reads some data as follows:
public class StudioReader implements ItemReader<List<Studio>> {

    @Setter private AreaDao areaDao;
    @Getter @Setter private BatchContext context;

    private HopsService hopsService = new HopsService();

    @Override
    public List<Studio> read() throws Exception {
        List<Studio> list = hopsService.getStudioHops();
        if (!isEmpty(list)) {
            for (Studio studio : list) {
                log.info("Studio being read: {}", studio.getCode());
                List areaList = areaDao.getArea(studio.getCode());
                if (areaList.size() > 0) {
                    studio.setArea((String) areaList.get(0));
                    log.info("Area {} is fetched for studio {}", areaList.get(0), studio.getCode());
                }
                this.getContext().setReadCount(1);
            }
        }
        return list;
    }
}
However, when I run the job this read method runs in a loop. I found from another Stack Overflow answer that this is the expected behavior. My question, then, is: what is the best solution for this particular example? Should StudioReader extend JdbcCursorItemReader? I found one example that defines everything in XML, which I don't want. Here is the context.xml part for the reader:
<bean class="org.springframework.batch.core.scope.StepScope" />
<bean id="ItemReader" class="com.syc.studio.reader.StudioReader" scope="step">
<property name="context" ref="BatchContext" />
<property name="areaDao" ref="AreaDao" />
</bean>
And here is the job definition in xml:
<bean id="StudioJob" class="org.springframework.batch.core.job.SimpleJob">
<property name="steps">
<list>
<bean id="StudioStep" parent="SimpleStep" >
<property name="itemReader" ref="ItemReader"/>
<property name="itemWriter" ref="ItemWriter"/>
<property name="retryableExceptionClasses">
<map>
<entry key="com.syc.studio.exception.CustomException" value="true"/>
</map>
</property>
<property name="retryLimit" value="2" />
</bean>
</list>
</property>
<property name="jobRepository" ref="jobRepository" />
</bean>
Writer:
public void write(List<? extends Object> obj) throws Exception {
    List<Studio> list = (List<Studio>) obj.get(0);
    for (int i = 0; i < list.size(); i++) {
        Studio studio = list.get(i);
        if (apiClient == null) {
            apiClient = new APIClient("v2");
        }
        this.uploadXML(studio);
    }
}
The read method after the suggestion from @holi-java:
public List<Studio> read() throws Exception {
    if (this.listIterator == null) {
        this.listIterator = initializing();
    }
    return this.listIterator.hasNext() ? this.listIterator.next() : null;
}

private Iterator<List<Studio>> initializing() {
    List<Studio> listOfStudiosFromApi = hopsService.getStudioLocations();
    for (Studio studio : listOfStudiosFromApi) {
        log.info("Studio being read: {}", studio.getCode());
        List areaList = areaDao.getArea(studio.getCode());
        if (areaList.size() > 0) {
            studio.setArea((String) areaList.get(0));
            log.info("Area {} is fetched for studio {}", areaList.get(0), studio.getCode());
        }
        this.getContext().setReadCount(1);
    }
    return Collections.singletonList(listOfStudiosFromApi).iterator();
}
The spring-batch documentation for ItemReader.read asserts:
Implementations must return null at the end of the input data set.
But your read method always returns a List; it should look like this:
public Studio read() throws Exception {
    if (this.results == null) {
        List<Studio> list = hopsService.getStudioHops();
        ...
        this.results = list.iterator();
    }
    return this.results.hasNext() ? this.results.next() : null;
}
If you want your read method to return a List, then you must page the results, like this:
public List<Studio> read() throws Exception {
    List<Studio> results = hopsService.getStudioHops(this.page++);
    ...
    return results.isEmpty() ? null : results;
}
If you can't page the results from the service, you can solve it like this:
public List<Studio> read() throws Exception {
    if (this.results == null) {
        this.results = Collections.singletonList(hopsService.getStudioHops()).iterator();
    }
    return this.results.hasNext() ? this.results.next() : null;
}
It's better not to read a list of items (List<Studio>); read one item (Studio) at a time instead. When you read a list of items, you may end up duplicating the iteration logic between writers and processors, as the demo shown in the comments does. If you have a huge amount of data to process, you can combine this with pagination in your reader, for example:
public Studio read() throws Exception {
    if (this.results == null || !this.results.hasNext()) {
        List<Studio> list = hopsService.getStudioHops(this.page++);
        ...
        this.results = list.iterator();
    }
    return this.results.hasNext() ? this.results.next() : null;
}
Maybe you need to look at the step processing mechanism:
ItemReader - reads one item at a time.
ItemProcessor - processes one item at a time.
ItemWriter - writes an entire chunk of items out.
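To illustrate that split of responsibilities, here is a sketch that moves the per-item area lookup out of the reader and into an ItemProcessor. Studio, AreaDao and getArea come from the question; everything else is illustrative, not the poster's actual code.
import java.util.List;
import org.springframework.batch.item.ItemProcessor;

// Enriches one Studio at a time; the reader only hands out items
// and the writer only uploads them.
public class StudioAreaProcessor implements ItemProcessor<Studio, Studio> {

    private AreaDao areaDao; // DAO assumed from the question

    public void setAreaDao(AreaDao areaDao) {
        this.areaDao = areaDao;
    }

    @Override
    public Studio process(Studio studio) throws Exception {
        List areaList = areaDao.getArea(studio.getCode());
        if (!areaList.isEmpty()) {
            studio.setArea((String) areaList.get(0));
        }
        return studio; // returning null would filter the item out of the chunk
    }
}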

Skip header, body and footer lines from file on Spring Batch

I have this specifically file:
H;COD;CREATION_DATE;TOT_POR;TYPE
H;001;2013-10-30;20;R
D;DETAIL_VALUE;PROP_VALUE
D;003;3030
D;002;3031
D;005;3032
T;NUM_FOL;TOT
T;1;503.45
As you can see, it has header, body and footer blocks, each starting with a label line. I'm looking for an ItemReader that skips those label lines. I've written the ItemReader below, which identifies the line types using PatternMatchingCompositeLineMapper.
<bean id="fileReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" ref="myFileReference" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
<property name="tokenizers">
<map>
<entry key="H*" value-ref="headerLineTokenizer"/>
<entry key="D*" value-ref="bodyLineTokenizer"/>
<entry key="T*" value-ref="footerLineTokenizer"/>
</map>
</property>
<property name="fieldSetMappers">
<map>
<entry key="H*" value-ref="headerMapper"/>
<entry key="D*" value-ref="bodyMapper"/>
<entry key="T*" value-ref="footerMapper"/>
</map>
</property>
</bean>
</property>
</bean>
I tried adding the linesToSkip property with a value of 1, but that only skipped the first header line. Is there a way to skip the first line of each block (header, body and footer)?
Thanks.
Nope. linesToSkip (as you wrote it) just skips the first linesToSkip lines.
You have to write your own reader, using the multi-line order sample (or this post) as a base, and manage skipping the first line of each block manually.
Another option would be this one:
1- Create a Reader Factory
public class CustomFileReaderFactory implements BufferedReaderFactory {

    @Override
    public BufferedReader create(Resource resource, String encoding) throws IOException {
        return new CustomFileReader(new InputStreamReader(resource.getInputStream(), encoding));
    }
}
2- Create your CustomFileReader (this reads one line and decides whether to keep it or skip it) and make sure to override the readLine() method.
public class CustomFileReader extends BufferedReader {

    public CustomFileReader(Reader in) {
        super(in);
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        // your logic here: keep reading until a line that should not be ignored,
        // because returning null would signal end-of-input to the caller
        while (line != null && hasToBeIgnored(line)) {
            line = super.readLine();
        }
        return line;
    }
}
3- Set your brand new Factory into your FlatFileItemReader:
yourFlatFileItemReader.setBufferedReaderFactory(new CustomFileReaderFactory());
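For the file layout shown in the question, hasToBeIgnored could simply drop the label line that opens each block. One possible implementation to add inside CustomFileReader, assuming the second field of a label line is never numeric (this rule is an assumption; adapt it to your real format):
// Skip the first line of each block: in the sample file those lines repeat
// the column names, so their second field is not a number.
private boolean hasToBeIgnored(String line) {
    String[] fields = line.split(";");
    return fields.length > 1 && !fields[1].matches("\\d+(\\.\\d+)?");
}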

Getting ClassCastException error

I have two classes ClientLogic1 and WelcomeBean1 as follows
public class ClientLogic1 {
    public static void main(String[] args) {
        Resource res = new ClassPathResource("spconfig.xml");
        BeanFactory factory = new XmlBeanFactory(res);
        Object o = factory.getBean("id1");
        WelcomeBean1 wb = (WelcomeBean1) o;
        wb.show();
    }
}
The second class:
public class WelcomeBean1 {
    private Map data;

    public void setData(Map data) {
        this.data = data;
    }

    public void show() {
        Set s = data.entrySet();
        Iterator it = s.iterator();
        while (it.hasNext()) {
            Map.Entry me = (Map.Entry) it.next();
            System.out.println(me.getKey() + " - " + me.getValue());
        }
    }
}
I have an XML file as follows:
<beans>
    <bean id="id1" class="WelcomeBean1">
        <property name="data">
            <map>
                <entry key="k1">
                    <value>1323</value>
                </entry>
                <entry key="k2">
                    <value>feed</value>
                </entry>
            </map>
        </property>
    </bean>
</beans>
I have given the right path. It's just that when I run this program I get the following error:
Exception in thread "main" java.lang.ClassCastException: WelcomeBean cannot be
cast to mapexmpl.WelcomeBean1 at mapexmpl.ClientLogic1.main(ClientLogic1.java:15)
I am not sure where I am going wrong. Can someone please help me?
Make sure there is no duplicate bean id in the Spring configuration file; for instance, you might have another bean, WelcomeBean, registered with id id1.
Change the class attribute to the fully qualified name: <bean id="id1" class="mapexmpl.WelcomeBean1">
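For reference, a corrected spconfig.xml might look like this (only the fully qualified class name and the fixed tags matter; the namespace header is the standard Spring beans one):
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <bean id="id1" class="mapexmpl.WelcomeBean1">
        <property name="data">
            <map>
                <entry key="k1" value="1323"/>
                <entry key="k2" value="feed"/>
            </map>
        </property>
    </bean>
</beans>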
The error actually says WelcomeBean cannot be cast, but your code shows WelcomeBean1 everywhere, so you must have used WelcomeBean somewhere; please check. You probably used WelcomeBean earlier and then renamed it to WelcomeBean1, so rebuild with a clean build.

Spring: import a module with specified environment

Is there anything that can achieve the equivalent of the below:
<import resource="a.xml">
<prop name="key" value="a"/>
</import>
<import resource="a.xml">
<prop name="key" value="b"/>
</import>
Such that the beans defined in resource a.xml would see the property key with two different values? The intention is to use the property to name the beans in the imports, so that in a.xml a bean would appear as:
<bean id="${key}"/>
And hence the application would have two beans, named a and b, available with the same definition but as distinct instances. I know about prototype scope; it is not suitable here, since there will be many objects created with interdependencies that are not actually prototypes. Currently I simply copy a.xml to b.xml and rename all the beans with the equivalent of a sed command. I feel there must be a better way.
I suppose that PropertyPlaceholderConfigurers work on a per-container basis, so you can't achieve this with XML imports.
Re: the application would have two beans named a and b now available with the same definition but as distinct instances
I think you should consider creating additional application contexts (ClassPathXmlApplicationContext, for example) manually, using your current application context as the parent.
That way each set of interdependent objects will reside in its own container.
However, in this case you will not be able to reference b-beans from the a-container.
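A rough sketch of that parent/child wiring (file names are illustrative; each child would also need its own PropertyPlaceholderConfigurer, or an equivalent mechanism, to supply key=a or key=b to a.xml):
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class ParentChildContexts {
    public static void main(String[] args) {
        // Parent context with the shared beans
        ApplicationContext parent = new ClassPathXmlApplicationContext("shared.xml");

        // Two child containers, each loading the same definitions;
        // beans in a child can reference parent beans, but the two
        // children cannot see each other's beans.
        ApplicationContext ctxA = new ClassPathXmlApplicationContext(
                new String[] {"a.xml"}, parent);
        ApplicationContext ctxB = new ClassPathXmlApplicationContext(
                new String[] {"a.xml"}, parent);
    }
}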
Update: you can also post-process the bean definitions (add new ones) manually by registering a specialized BeanDefinitionRegistryPostProcessor bean, but this solution does not seem easy either.
OK, here's my rough attempt to import an XML file manually.
Disclaimer: I'm actually not very good at Java IO programming, so double-check the resource-related code :-)
public class CustomXmlImporter implements BeanDefinitionRegistryPostProcessor {

    @Override
    public void postProcessBeanFactory(
            ConfigurableListableBeanFactory beanFactory) throws BeansException {
    }

    private Map<String, String> properties;

    public void setProperties(Map<String, String> properties) {
        this.properties = properties;
    }

    public Map<String, String> getProperties() {
        return properties;
    }

    private void readXml(XmlBeanDefinitionReader reader) {
        InputStream inputStream;
        try {
            inputStream = new ClassPathResource(this.classpathXmlLocation).getInputStream();
        } catch (IOException e1) {
            throw new AssertionError();
        }
        try {
            Scanner sc = new Scanner(inputStream);
            try {
                sc.useDelimiter("\\A");
                if (!sc.hasNext())
                    throw new AssertionError();
                String entireXml = sc.next();
                PropertyPlaceholderHelper helper = new PropertyPlaceholderHelper("${",
                        "}", null, false);
                Properties props = new Properties();
                props.putAll(this.properties);
                String newXml = helper.replacePlaceholders(entireXml, props);
                reader.loadBeanDefinitions(new ByteArrayResource(newXml.getBytes()));
            } finally {
                sc.close();
            }
        } finally {
            try {
                inputStream.close();
            } catch (IOException e) {
                throw new AssertionError();
            }
        }
    }

    private String classpathXmlLocation;

    public void setClassPathXmlLocation(String classpathXmlLocation) {
        this.classpathXmlLocation = classpathXmlLocation;
    }

    public String getClassPathXmlLocation() {
        return this.classpathXmlLocation;
    }

    @Override
    public void postProcessBeanDefinitionRegistry(
            BeanDefinitionRegistry registry) throws BeansException {
        XmlBeanDefinitionReader reader = new XmlBeanDefinitionReader(registry);
        readXml(reader);
    }
}
XML configuration:
<bean class="CustomXmlImporter">
<property name="classPathXmlLocation" value="a.xml" />
<property name="properties">
<map>
<entry key="key" value="a" />
</map>
</property>
</bean>
<bean class="CustomXmlImporter">
<property name="classPathXmlLocation" value="a.xml" />
<property name="properties">
<map>
<entry key="key" value="b" />
</map>
</property>
</bean>
This code loads the resource from the classpath. I would think twice before doing something like that; anyway, you can use it as a starting point.

Spring init-method params

I am new to Spring and I wanted to ask whether or not it is possible to pass parameters to the init and destroy methods of a bean.
Thanks.
No, you can't. If you need parameters, you will have to inject them as fields beforehand.
Sample Bean
public class Foo {

    @Autowired
    private Bar bar;

    public void init() {
        bar.doSomething();
    }
}
Sample XML:
<bean class="Foo" init-method="init" />
The following approach is especially useful when you cannot change the class you are trying to create (unlike in the previous answer), for example because you are working with an API and must use the provided bean as it is.
You could always create a class (MyObjectFactory) that implements FactoryBean, and inside the getObject() method you would write:
@Autowired
private MyReferenceObject myRef;

public Object getObject() {
    MyObject myObj = new MyObject();
    myObj.init(myRef);
    return myObj;
}
And in the spring context.xml you would have a simple :
<bean id="myObject" class="MyObjectFactory"/>
protected void invokeCustomInitMethod(String beanName, Object bean, String initMethodName)
        throws Throwable {
    if (logger.isDebugEnabled()) {
        logger.debug("Invoking custom init method '" + initMethodName +
                "' on bean with beanName '" + beanName + "'");
    }
    try {
        Method initMethod = BeanUtils.findMethod(bean.getClass(), initMethodName, null);
        if (initMethod == null) {
            throw new NoSuchMethodException("Couldn't find an init method named '" + initMethodName +
                    "' on bean with name '" + beanName + "'");
        }
        if (!Modifier.isPublic(initMethod.getModifiers())) {
            initMethod.setAccessible(true);
        }
        initMethod.invoke(bean, (Object[]) null);
    }
    catch (InvocationTargetException ex) {
        throw ex.getTargetException();
    }
}
See the Spring source code above: in Method initMethod = BeanUtils.findMethod(bean.getClass(), initMethodName, null);
the init method is looked up with null parameter types and then invoked with null arguments, so no parameters can be passed to it.
You cannot pass params to init-method, but you can still achieve the same effect this way:
<bean id="beanToInitialize" class="com.xyz.Test"/>
<bean class="org.springframework.beans.factory.config.MethodInvokingFactoryBean">
<property name="targetObject" ref="beanToInitialize" />
<property name="targetMethod" value="init"/> <!-- you can use any name -->
<property name="arguments" ref="parameter" /> <!-- reference to init parameter, can be value as well -->
</bean>
Note: you can also pass multiple arguments as a list, like this:
<property name="arguments">
<list>
<ref local="param1" />
<ref local="param2" />
</list>
</property>
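On the receiving side, the init method invoked by MethodInvokingFactoryBean is just an ordinary method, so it can take whatever arguments you configure. A sketch of the target bean (the Parameter type is a placeholder, not from the original answer):
public class Test {

    private Parameter parameter; // placeholder type for whatever "parameter" refers to

    // Called by MethodInvokingFactoryBean with the configured argument
    public void init(Parameter parameter) {
        this.parameter = parameter;
        // ... initialization work that needs the parameter
    }
}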
