Spring Batch: how to regroup/aggregate user data into a single object

I am trying to transform user operations (like purchases) into a user summary class (expenses by user). A user can have multiple operations but only one summary. I cannot sum purchases in the reader because I need a processor to reject some operations depending on another service.
So, some code:
class UserOperation {
    String userId;
    Integer price;
}

class UserSummary {
    String userId;
    Long sum;
}
@Bean
public Step retrieveOobClientStep1(StepBuilderFactory stepBuilderFactory,
        ItemReader<UserOperation> userInformationJdbcCursorItemReader,
        ItemProcessor<UserOperation, UserSummary> userInformationsProcessor,
        ItemWriter<UserSummary> flatFileWriter) {
    return stepBuilderFactory.get("Step1").<UserOperation, UserSummary>chunk(100) // chunk results that need to be aggregated... not good
            .reader(userInformationJdbcCursorItemReader) // read all user operations from DB
            .processor(userInformationsProcessor) // I need to reject some operations - but here 1 operation = 1 summary, which is not good
            .writer(flatFileWriter) // write result into flat file
            .build();
}
I think that ItemReader/ItemProcessor/ItemWriter is meant for single-item processing.
But how do you regroup multiple records into a single object with Spring Batch? Only with a Tasklet?
Here is a possibility, but it causes problems with a small commit interval:
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.ItemStreamWriter;
import org.springframework.beans.factory.InitializingBean;
import org.springframework.util.Assert;

public class UserSummaryAggregatorItemStreamWriter implements ItemStreamWriter<UserSummary>, InitializingBean {

    private ItemStreamWriter<UserSummary> delegate;

    @Override
    public void afterPropertiesSet() throws Exception {
        Assert.notNull(delegate, "'delegate' may not be null.");
    }

    public void setDelegate(ItemStreamWriter<UserSummary> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(List<? extends UserSummary> items) throws Exception {
        // Aggregate the summaries of the current chunk by userId
        Map<String, UserSummary> userSummaryMap = new HashMap<String, UserSummary>();
        for (UserSummary item : items) {
            UserSummary savedUserSummary = userSummaryMap.get(item.getUserId());
            if (savedUserSummary != null) {
                savedUserSummary.incrementSum(item.getSum()); // sum
            } else {
                savedUserSummary = item;
            }
            userSummaryMap.put(item.getUserId(), savedUserSummary); // key by userId, not subscription code
        }
        Collection<UserSummary> values = userSummaryMap.values();
        delegate.write(new ArrayList<UserSummary>(values)); // values() is never null
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        delegate.open(executionContext);
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        delegate.update(executionContext);
    }

    @Override
    public void close() throws ItemStreamException {
        delegate.close();
    }
}
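One way to avoid aggregating in the writer (and the chunk-boundary problem that comes with a small commit interval) is to group in the reader instead. Below is a minimal sketch, not the asker's code: it assumes the query is sorted by userId, and that UserOperation/UserSummary expose the usual getters/setters plus the incrementSum(...) used above (none of which are shown in the classes at the top). The rejection check would then have to happen inside this reader or upstream of it. It wraps the cursor reader in Spring Batch's SingleItemPeekableItemReader so it can look ahead without consuming:

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.support.SingleItemPeekableItemReader;

public class UserSummaryGroupReader implements ItemReader<UserSummary> {

    private final SingleItemPeekableItemReader<UserOperation> delegate;

    public UserSummaryGroupReader(SingleItemPeekableItemReader<UserOperation> delegate) {
        this.delegate = delegate;
    }

    @Override
    public UserSummary read() throws Exception {
        UserOperation op = delegate.read();
        if (op == null) {
            return null; // end of input
        }
        UserSummary summary = new UserSummary();
        summary.setUserId(op.getUserId());
        summary.setSum(op.getPrice().longValue());
        // keep consuming while the next operation belongs to the same user
        UserOperation next = delegate.peek();
        while (next != null && next.getUserId().equals(op.getUserId())) {
            delegate.read(); // consume the item we just peeked
            summary.incrementSum(next.getPrice().longValue());
            next = delegate.peek();
        }
        return summary;
    }
}

The peekable reader would be created with setDelegate(userInformationJdbcCursorItemReader). With this in place each item handed to the writer is already a complete summary, so the commit interval no longer matters for correctness.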

Related

Spring Kafka Consumer seek particular offset to read message

I created a SpringBootApplication to consume a message from a particular offset. But the consumer poll method is returning zero records. If I run the application multiple times, it should return the same message each time from offset 108134L.
@Configuration
public class FlightEventListener {

    @Bean
    public void listenForMessage() throws Exception {
        TopicPartition tp = new TopicPartition("topic-name", 0);
        KafkaConsumer<String, Object> consumer = new KafkaConsumer<>(clusterOneProps);
        try {
            consumer.subscribe(Collections.singletonList("topic-name"), new ConsumerRebalanceListener() {
                @Override
                public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                    // TODO Auto-generated method stub
                }
                @Override
                public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                    consumer.seek(tp, 108134L);
                }
            });
            ConsumerRecords<String, Object> crs = consumer.poll(Duration.ofMillis(100L));
            System.out.println(crs.count());
            for (ConsumerRecord<String, Object> record : crs) {
                System.out.println("consumer Record is >>>>" + record.offset());
                System.out.println("consumer Record is >>>>" + record);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            consumer.close();
        }
    }
}
================================================
I also implemented ConsumerSeekAware, but the method is not being invoked. How do I invoke the method? I am looking for method invocation during startup.
@Configuration
public class MessageSeeker extends AbstractConsumerSeekAware {

    @Autowired
    private FlightEventKafkaConfiguration clusterOneConfig;

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // logic
    }
}
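A likely cause of the zero-record polls, and a sketch of a fix (untested against the setup above): with subscribe(), the seek inside onPartitionsAssigned only runs once the group rebalance completes, which can easily take longer than a single poll(Duration.ofMillis(100L)). Assigning the partition manually avoids the rebalance entirely, so the seek is guaranteed to take effect before the first poll:

// assign() instead of subscribe(): no consumer group rebalance is involved,
// so seek() can be called immediately and deterministically
TopicPartition tp = new TopicPartition("topic-name", 0);
KafkaConsumer<String, Object> consumer = new KafkaConsumer<>(clusterOneProps);
try {
    consumer.assign(Collections.singletonList(tp));
    consumer.seek(tp, 108134L);
    ConsumerRecords<String, Object> crs = consumer.poll(Duration.ofSeconds(5));
    System.out.println(crs.count());
} finally {
    consumer.close();
}

Alternatively, keep subscribe() but poll in a loop until records arrive, since the first polls merely drive the rebalance.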

Spring Batch MultiResourceItemReader re-reading first resource

I am writing a Spring Batch application using Spring Boot 1.5. The following are my classes:
CustomMultiResourceItemReader.java
@StepScope
@Component
public class CustomMultiResourceItemReader
        extends MultiResourceItemReader<MyDTO> {

    public CustomMultiResourceItemReader(
            @NonNull final MyResourceAwareItemReader itemReader,
            @NonNull final ApplicationContext ctx)
            throws IOException {
        setResources(
                ctx.getResources(
                        String.format("file:%s/*.xml", "~/data"))); // gives me a Resource[] array fine
        setDelegate(itemReader);
    }

    @PreDestroy
    void destroy() {
        close();
    }
}
MyResourceAwareItemReader.java
@RequiredArgsConstructor
@StepScope
@Component
@Slf4j
public class MyResourceAwareItemReader
        implements ResourceAwareItemReaderItemStream<MyDTO> {

    private static final String RESOURCE_NAME_KEY = "RESOURCE_NAME_KEY";

    @NonNull private final Unmarshaller unmarshaller; // JAXB Unmarshaller
    private Resource resource;

    @Override
    public void setResource(Resource resource) {
        this.resource = resource; // **gets called only once**
    }

    @Override
    public MyDTO read() throws Exception {
        final MyDTO dto = (MyDTO) unmarshaller.unmarshal(resource.getFile()); // standard JAXB unmarshalling
        return dto;
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        if (executionContext.containsKey(RESOURCE_NAME_KEY)) {
        } else if (resource != null) {
            executionContext.put(RESOURCE_NAME_KEY, resource.getFilename());
        }
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        if (resource != null) executionContext.put(RESOURCE_NAME_KEY, resource.getFilename());
    }

    @Override
    public void close() throws ItemStreamException {}
}
The problem is that the setResource method in the delegate reader (MyResourceAwareItemReader.java) gets called only once at the beginning, while the read method gets called multiple times. As a result, I read the same item multiple times instead of reading the next item as expected.
I have also browsed the source code of MultiResourceItemReader in Spring Batch; it seems the read method of the delegate class is supposed to return null after each item is read, and I can clearly see that my code doesn't do that.
I am a bit lost how to make this work. Any help is much appreciated.
Looking further into it, the ItemReader documentation clearly states that a reader must return null at the end of the input data set. So I implemented my ItemReader with a boolean flag as follows:
@RequiredArgsConstructor
@StepScope
@Component
@Slf4j
public class MyResourceAwareItemReader
        implements ResourceAwareItemReaderItemStream<MyDTO> {

    private static final String RESOURCE_NAME_KEY = "RESOURCE_NAME_KEY";

    @NonNull private final Unmarshaller unmarshaller; // JAXB Unmarshaller
    private Resource resource;
    private boolean isResourceRead;

    @Override
    public void setResource(Resource resource) {
        this.resource = resource;
        isResourceRead = false;
    }

    @Override
    public MyDTO read() throws Exception {
        if (isResourceRead) return null;
        final MyDTO dto = (MyDTO) unmarshaller.unmarshal(resource.getFile());
        isResourceRead = true;
        return dto;
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        if (executionContext.containsKey(RESOURCE_NAME_KEY)) {
        } else if (resource != null) {
            executionContext.put(RESOURCE_NAME_KEY, resource.getFilename());
        }
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        if (resource != null) executionContext.put(RESOURCE_NAME_KEY, resource.getFilename());
    }

    @Override
    public void close() throws ItemStreamException {}
}
MultiResourceItemReader does not return null each time. If there are no more resources to read, it returns null; otherwise it sets the next resource on the delegate, i.e. your actual reader.
I can see the problem in your read() method: you are not moving to the next file. As you are implementing your own MultiResourceItemReader, it is your responsibility to move to the next resource.
This is how it is implemented in MultiResourceItemReader; you will need a similar implementation of your own:
private T readNextItem() throws Exception {
    T item = delegate.read();
    while (item == null) {
        currentResource++;
        if (currentResource >= resources.length) {
            return null;
        }
        delegate.close();
        delegate.setResource(resources[currentResource]);
        delegate.open(new ExecutionContext());
        item = delegate.read();
    }
    return item;
}
You need to maintain an index into the resources array. Please check the implementation of MultiResourceItemReader; you need to do it in exactly a similar way.
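Alternatively, instead of subclassing MultiResourceItemReader, you can wire up the stock class with the boolean-flag delegate above and let Spring Batch maintain the resource index and restart state for you. A minimal sketch (the bean wiring and file path are assumptions, mirroring the question):

@Bean
@StepScope
public MultiResourceItemReader<MyDTO> multiResourceReader(
        MyResourceAwareItemReader delegate, ApplicationContext ctx) throws IOException {
    MultiResourceItemReader<MyDTO> reader = new MultiResourceItemReader<>();
    reader.setResources(ctx.getResources("file:~/data/*.xml")); // same pattern as the question
    reader.setDelegate(delegate);
    return reader;
}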

Why the intemReader is always sending the exact same value to CustomItemProcessor

Why is the itemReader method always sending the exact same file name to be processed in CustomItemProcessor?
As far as I understand, since I set up the reader with @Scope and set more than 1 in the chunk, I was expecting "return s" to move forward to the next value of the String array.
Let me clarify my question with a debug example in the read method:
1 - the variable stringArray is filled with 3 file names (f1.txt, f2.txt and f3.txt)
2 - "return s" is invoked with s = f1.txt
3 - "return s" is invoked again before the customItemProcessor method is invoked (perfect until here, since chunk = 2)
4 - looking at s, it contains f1.txt again (different from what I expected; I expected f2.txt)
5 and 6 - the processor runs with the same name f1.txt (it would work correctly if the second turn of "return s" contained f2.txt)
7 - the writer method works as expected (processedFiles contains the two names processed in customItemProcessor: f1.txt, and f1.txt again since the same name was processed twice)
CustomItemReader
public class CustomItemReader implements ItemReader<String> {

    @Override
    public String read() throws Exception, UnexpectedInputException,
            ParseException, NonTransientResourceException {
        String[] stringArray;
        try (Stream<Path> stream = Files.list(Paths.get(env
                .getProperty("my.path")))) {
            stringArray = stream.map(String::valueOf)
                    .filter(path -> path.endsWith("out"))
                    .toArray(size -> new String[size]);
        }
        // *** the problem is here
        // every turn the s variable receives the first file name from the stringArray
        if (stringArray.length > 0) {
            for (String s : stringArray) {
                return s;
            }
        } else {
            log.info("read method - no file found");
            return null;
        }
        return null;
    }
}
CustomItemProcessor
public class CustomItemProcessor implements ItemProcessor<String, String> {

    @Override
    public String process(String singleFileToProcess) throws Exception {
        log.info("process method: " + singleFileToProcess);
        return singleFileToProcess;
    }
}
CustomItemWriter
public class CustomItemWriter implements ItemWriter<String> {

    private static final Logger log = LoggerFactory
            .getLogger(CustomItemWriter.class);

    @Override
    public void write(List<? extends String> processedFiles) throws Exception {
        processedFiles.stream().forEach(
                processedFile -> log.info("**** write method"
                        + processedFile.toString()));
        FileSystem fs = FileSystems.getDefault();
        for (String s : processedFiles) {
            Files.deleteIfExists(fs.getPath(s));
        }
    }
}
Configuration
@Configuration
@ComponentScan(...
@EnableBatchProcessing
@EnableScheduling
@PropertySource(...
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private JobRepository jobRepository;

    @Bean
    public TaskExecutor getTaskExecutor() {
        return new TaskExecutor() {
            @Override
            public void execute(Runnable task) {
            }
        };
    }

    // I can see the number in chunk reflects how many times customReader is triggered before it triggers customProcessor
    @Bean
    public Step step1(ItemReader<String> reader,
            ItemProcessor<String, String> processor, ItemWriter<String> writer) {
        return stepBuilderFactory.get("step1").<String, String>chunk(2)
                .reader(reader).processor(processor).writer(writer)
                .allowStartIfComplete(true).build();
    }

    @Bean
    @Scope
    public ItemReader<String> reader() {
        return new CustomItemReader();
    }

    @Bean
    public ItemProcessor<String, String> processor() {
        return new CustomItemProcessor();
    }

    @Bean
    public ItemWriter<String> writer() {
        return new CustomItemWriter();
    }

    @Bean
    public Job job(Step step1) throws Exception {
        return jobBuilderFactory.get("job1").incrementer(new RunIdIncrementer()).start(step1).build();
    }
}
Scheduler
@Component
public class QueueScheduler {

    private static final Logger log = LoggerFactory
            .getLogger(QueueScheduler.class);

    private Job job;
    private JobLauncher jobLauncher;

    @Autowired
    public QueueScheduler(JobLauncher jobLauncher, @Qualifier("job") Job job) {
        this.job = job;
        this.jobLauncher = jobLauncher;
    }

    @Scheduled(fixedRate = 60000)
    public void runJob() {
        try {
            jobLauncher.run(job, new JobParameters());
        } catch (Exception ex) {
            log.info(ex.getMessage());
        }
    }
}
Your issue is that you are relying on an internal loop to iterate over the items instead of letting Spring Batch do it for you by calling ItemReader#read multiple times.
What I'd recommend is changing your reader to something like the following:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.ItemStreamReader;
import org.springframework.core.env.Environment;

public class JimsItemReader implements ItemStreamReader<String> {

    private final Environment env; // source of the my.path property, as in the question

    private String[] items;
    private int curIndex = -1;

    public JimsItemReader(Environment env) {
        this.env = env;
    }

    @Override
    public void open(ExecutionContext ec) {
        curIndex = ec.getInt("curIndex", -1); // resume from the saved index on restart
        try (Stream<Path> stream = Files.list(Paths.get(env.getProperty("my.path")))) {
            items = stream.map(String::valueOf)
                    .filter(path -> path.endsWith("out"))
                    .toArray(size -> new String[size]);
        } catch (IOException e) {
            throw new ItemStreamException("Could not list input files", e);
        }
    }

    @Override
    public void update(ExecutionContext ec) {
        ec.putInt("curIndex", curIndex);
    }

    @Override
    public String read() {
        curIndex++;
        if (curIndex < items.length) {
            return items[curIndex];
        }
        return null; // null signals the end of the input to Spring Batch
    }
}
The above example should loop through the items of your array as they are read. It should also be restartable: since we store the index in the ExecutionContext, if the job is restarted after a failure you'll pick up where you left off.
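For completeness, a sketch of how such a reader might be exposed as a step-scoped bean (the constructor injection of Environment is an assumption, to match the my.path lookup above); since the reader implements ItemStream, the step registers it so that open() and update() are called:

@Bean
@StepScope
public ItemStreamReader<String> reader(Environment env) {
    return new JimsItemReader(env);
}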

JMeter Plugin - How to Listen to TestState

I am working on developing a JMeter plugin. I'm trying to create an AbstractVisualizer that is capable of monitoring the current test state. However, implementing the TestStateListener doesn't seem to be working.
I'm testing this by creating a basic listener that logs arbitrary info to JMeter's logging console. When a sample is sent through the add function, a line is sent to the console, but nothing is ever triggered by the various test-state functions. Is there something more structural I'm missing?
public class TestListener extends AbstractVisualizer
        implements TestStateListener
{
    private static final Logger log = LoggingManager.getLoggerForClass();

    @Override
    public void add(SampleResult arg0) {
        log.info("add");
    }

    @Override
    public void clearData() {
        // TODO Auto-generated method stub
    }

    @Override
    public String getStaticLabel()
    {
        return "Test Listener";
    }

    @Override
    public String getLabelResource() {
        return null;
    }

    @Override
    public void testEnded() {
        log.info("Test Ended");
    }

    @Override
    public void testEnded(String arg0) {
        log.info("Test Ended");
    }

    @Override
    public void testStarted() {
        log.info("Test started");
    }

    @Override
    public void testStarted(String arg0) {
        log.info("Test started");
    }
}
I'm not sure how to do it in 1 class. I have 2 classes:
The UI:
public class MonitorGui extends AbstractListenerGui
{
    // ...

    @Override
    public TestElement createTestElement()
    {
        TestElement element = new Monitor(); // <-- this is the backend
        modifyTestElement(element);
        return element;
    }

    // ...
}
And then the backend goes like this:
public class Monitor extends AbstractListenerElement
        implements SampleListener,
                   Clearable, Serializable,
                   TestStateListener, Remoteable,
                   NoThreadClone
{
    private static final String TEST_IS_LOCAL = "*local*";

    // ...

    @Override
    public void testStarted()
    {
        testStarted(TEST_IS_LOCAL);
    }

    @Override
    public void testEnded()
    {
        testEnded(TEST_IS_LOCAL);
    }

    @Override
    public void testStarted(String host)
    {
        // ...
    }

    // ...
}
You may not need to implement SampleListener like I do, but the other parts are probably quite similar.
I based that implementation on the built-in pair of ResultSaverGui and ResultCollector, which are the components that save results into file(s) for the Simple Data Writer, Summary Report, and so on.

Spring Batch: pass data between reader and writer

I would like to get data in the Writer that I've set in the Reader of my step. I know about the ExecutionContexts (step and job) and about the ExecutionContextPromotionListener via http://docs.spring.io/spring-batch/trunk/reference/html/patterns.html#passingDataToFutureSteps
The problem is that in the Writer I'm retrieving a null value for 'npag'.
Line in the ItemWriter:
LOG.info("INSIDE WRITE, NPAG: " + nPag);
I've been trying some workarounds without luck, looking at answers to other similar questions... Any help? Thanks!
Here's my code:
READER
@Component
public class LCItemReader implements ItemReader<String> {

    private StepExecution stepExecution;
    private int nPag = 1;

    @Override
    public String read() throws CustomItemReaderException {
        ExecutionContext stepContext = this.stepExecution.getExecutionContext();
        stepContext.put("npag", nPag);
        nPag++;
        return "content";
    }

    @BeforeStep
    public void saveStepExecution(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }
}
WRITER
@Component
@StepScope
public class LCItemWriter implements ItemWriter<String> {

    private String nPag;

    @Override
    public void write(List<? extends String> continguts) throws Exception {
        try {
            LOG.info("INSIDE WRITE, NPAG: " + nPag);
        } catch (Throwable ex) {
            LOG.error("Error: " + ex.getMessage());
        }
    }

    @BeforeStep
    public void retrieveInterstepData(StepExecution stepExecution) {
        JobExecution jobExecution = stepExecution.getJobExecution();
        ExecutionContext jobContext = jobExecution.getExecutionContext();
        this.nPag = jobContext.get("npag").toString();
    }
}
JOB/STEP BATCH CONFIG
@Bean
public Job lCJob() {
    return jobs.get("lCJob")
            .listener(jobListener)
            .start(lCStep())
            .build();
}

@Bean
public Step lCStep() {
    return steps.get("lCStep")
            .<String, String>chunk(1)
            .reader(lCItemReader)
            .processor(lCProcessor)
            .writer(lCItemWriter)
            .listener(promotionListener())
            .build();
}
LISTENER
@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener executionContextPromotionListener = new ExecutionContextPromotionListener();
    executionContextPromotionListener.setKeys(new String[]{"npag"});
    return executionContextPromotionListener;
}
The ExecutionContextPromotionListener specifically works at the end of a step, i.e. after the writer executes, so the promotion you are counting on does not occur when you think it does.
If I were you, I would set the value in the step context and get it from the step context when you need it within a single step; otherwise I would set it in the job context.
The other aspect is the @BeforeStep annotation: it marks a method to execute before the step starts, whereas the reader sets the nPag value only after the step has started executing.
You are trying to read the value of nPag even before it is set in the reader, ending up with the default value, which is null. You need to read the value of nPag from the execution context directly at the time of logging. You can keep a reference to the jobContext. Try this:
@Component
@StepScope
public class LCItemWriter implements ItemWriter<String> {

    private String nPag;
    private ExecutionContext jobContext;

    @Override
    public void write(List<? extends String> continguts) throws Exception {
        try {
            this.nPag = jobContext.get("npag").toString();
            LOG.info("INSIDE WRITE, NPAG: " + nPag);
        } catch (Throwable ex) {
            LOG.error("Error: " + ex.getMessage());
        }
    }

    @BeforeStep
    public void retrieveInterstepData(StepExecution stepExecution) {
        JobExecution jobExecution = stepExecution.getJobExecution();
        jobContext = jobExecution.getExecutionContext();
    }
}
In your Reader and Writer you need to implement the ItemStream interface and keep the ExecutionContext as a member variable. Here I have given an example with a Processor instead of a Writer, but the same is applicable to a Writer as well. It's working fine for me, and I am able to pass values from the reader to the processor.
I set the value in the context in the reader and get the value in the processor.
public class EmployeeItemReader implements ItemReader<Employee>, ItemStream {

    ExecutionContext context;

    @Override
    public Employee read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        context.put("ajay", "i am going well");
        Employee emp = new Employee();
        emp.setEmpId(1);
        emp.setFirstName("ajay");
        emp.setLastName("goswami");
        return emp;
    }

    @Override
    public void close() throws ItemStreamException {
        // TODO Auto-generated method stub
    }

    @Override
    public void open(ExecutionContext arg0) throws ItemStreamException {
        context = arg0;
    }

    @Override
    public void update(ExecutionContext arg0) throws ItemStreamException {
        // TODO Auto-generated method stub
        context = arg0;
    }
}
My processor
public class CustomItemProcessor implements ItemProcessor<Employee, ActiveEmployee>, ItemStream {

    ExecutionContext context;

    @Override
    public ActiveEmployee process(Employee emp) throws Exception {
        // See this line
        System.out.println(context.get("ajay"));
        ActiveEmployee actEmp = new ActiveEmployee();
        actEmp.setEmpId(emp.getEmpId());
        actEmp.setFirstName(emp.getFirstName());
        actEmp.setLastName(emp.getLastName());
        actEmp.setAdditionalInfo("Employee is processed");
        return actEmp;
    }

    @Override
    public void close() throws ItemStreamException {
        // TODO Auto-generated method stub
    }

    @Override
    public void open(ExecutionContext arg0) throws ItemStreamException {
        // TODO Auto-generated method stub
    }

    @Override
    public void update(ExecutionContext arg0) throws ItemStreamException {
        context = arg0;
    }
}
Hope this helps.
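One wiring detail worth noting (a sketch; whether the processor is picked up automatically may depend on your Spring Batch version): the open()/update() callbacks above only fire if the component is registered as an ItemStream on the step, which can be forced explicitly with the builder's stream(...) method:

@Bean
public Step employeeStep(StepBuilderFactory steps, EmployeeItemReader reader,
        CustomItemProcessor processor, ItemWriter<ActiveEmployee> writer) {
    return steps.get("employeeStep")
            .<Employee, ActiveEmployee>chunk(10)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .stream(processor) // ensures ItemStream callbacks fire for the processor
            .build();
}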
