Reduce stream of Flux to Flux - spring-boot

I am having some problems applying a reduce operation to a Stream<Flux<T>>; I would like to reduce it to a single Flux<T>. Each AdProvider provides offers as a Flux, and I would like to use the stream to get all offers from each of them and concatenate them into one pipeline. How could I do that with reduce?
Set<AdProvider> adProviders;

@Override
@LogBefore
public void gather()
{
    adProviders
        .parallelStream()
        .map(this::gatherOffers)
        .reduce(?)
        .subscribe();
}

private Flux<Ad> gatherOffers(AdProvider adProvider)
{
    try
    {
        return adProvider.offers();
    }
    catch(Exception e)
    {
        log.warn(EXCEPTION_WHILE_PROCESSING_OFFERS, adProvider.getClass().getSimpleName(), e);
        return Flux.empty();
    }
}

Flatten Stream<Flux> using Flux#fromStream() + Flux#flatMap()
To solve the problem, you may combine Flux#fromStream() (which converts a Stream<Flux> to a Flux<Flux>) and Flux#flatMap() (which flattens the inner fluxes into a single Flux), as in the following example:
Set<AdProvider> adProviders;

@Override
public void gather()
{
    Flux.fromStream(adProviders.stream())
        .parallel() // replace .parallelStream with separate parallel + runOn
        .runOn(Schedulers.parallel())
        .flatMap(this::gatherOffers)
        .subscribe();
}

private Flux<Ad> gatherOffers(AdProvider adProvider)
{
    try
    {
        return adProvider.offers();
    }
    catch(Exception e)
    {
        log.warn(EXCEPTION_WHILE_PROCESSING_OFFERS, adProvider.getClass().getSimpleName(), e);
        return Flux.empty();
    }
}
As you may notice, I replaced parallelStream with a plain .stream plus parallel() + runOn(), which do almost the same thing.
Alternatively, you may avoid using a stream at all and simply rely on Flux#fromIterable + the same Flux#flatMap:
Set<AdProvider> adProviders;

@Override
public void gather()
{
    Flux.fromIterable(adProviders)
        .parallel() // replace .parallelStream with separate parallel + runOn
        .runOn(Schedulers.parallel())
        .flatMap(this::gatherOffers)
        .subscribe();
}

private Flux<Ad> gatherOffers(AdProvider adProvider)
{
    try
    {
        return adProvider.offers();
    }
    catch(Exception e)
    {
        log.warn(EXCEPTION_WHILE_PROCESSING_OFFERS, adProvider.getClass().getSimpleName(), e);
        return Flux.empty();
    }
}
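If you specifically want the reduce form from the question, a minimal sketch (not from the original answer, reusing the same gatherOffers method) would use Flux.empty() as the identity and concatenate with Flux::concatWith:

// Hypothetical sketch: reduce the Stream<Flux<Ad>> into one Flux<Ad>.
Flux<Ad> allOffers = adProviders.stream()
        .map(this::gatherOffers)
        .reduce(Flux.<Ad>empty(), Flux::concatWith); // or Flux::mergeWith for interleaved offers
allOffers.subscribe();

Note that this only assembles the pipeline sequentially; the flatMap-based variants above are generally preferred because they subscribe to the inner fluxes concurrently.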

Related

How to add default behavior to @KafkaListener

I would like to add custom behavior before my method annotated with @KafkaListener is called.
Currently I'm using an abstract class, and the concrete class uses

@KafkaListener(topics = ....)
public void onMessages(List<ConsumerRecord> records) {
    super.onMessages(records);
}

@Override
public void process(ConsumerRecord record) {
    // Called by the abstract class to actually process the message (one by one)
}

But I also need to configure the abstract class in a @PostConstruct.
What would be the best approach?
I would prefer to just decorate the default container and use it with something like this:

@MyCustomKafkaListener(topics = ....)
public void onMessage(ConsumerRecord record) {
    // Just handle the message
}

Or by customizing the MessageListenerFactory to create a CustomMessageListener, inherited from the default one, which would call my method annotated with @KafkaListener after some custom behavior.
But I don't know how.
Edit 1
I want my abstract processing to do the following:

public void process(List<ConsumerRecord> records) {
    for (ConsumerRecord<K, V> record : records) {
        // Check message
        try {
            if (record.value() == null) {
                checkDeser(record, ErrorHandlingDeserializer2.VALUE_DESERIALIZER_EXCEPTION_HEADER);
            }
            if (record.key() == null) {
                checkDeser(record, ErrorHandlingDeserializer2.KEY_DESERIALIZER_EXCEPTION_HEADER);
            }
        } catch (DeserializationException ex) {
            this.deadLetterPublishingRecoverer.accept(record, ex);
            LOGGER.error("Deserialization error recovered to DLT.", ex);
        }
        // Process message
        try {
            // Here I'm calling the original @KafkaListener, i.e. the subclass
            myRealListenerObject.processOneByOne(record);
        } catch (Exception ex) {
            LOGGER.warn("Exception while processing record. Key : {}", record.key(), ex);
            handleException(record, ex);
        }
    }
}
This calls "myRealListenerObject.processOneByOne(record);", which should be my listener implementation using @KafkaListener (or @CustomKafkaListener).
Edit 2
I would like my listeners to look like

@CustomKafkaListener(topics = "myTopic", ...)
public void process(ConsumerRecord record) {
    // Do my stuff
}

rather than having something like this for every listener:

@KafkaListener(topics = "myTopic", ...)
public void process(List<ConsumerRecord> records) {
    for (ConsumerRecord record : records) {
        try {
            if (record.value() == null) {
                checkDeser(record, ErrorHandlingDeserializer2.VALUE_DESERIALIZER_EXCEPTION_HEADER);
            }
            if (record.key() == null) {
                checkDeser(record, ErrorHandlingDeserializer2.KEY_DESERIALIZER_EXCEPTION_HEADER);
            }
        } catch (DeserializationException ex) {
            this.deadLetterPublishingRecoverer.accept(record, ex);
            LOGGER.error("Deserialization error recovered to DLT.", ex);
        }
        // Process message
        try {
            // Do my stuff
        } catch (Exception ex) {
            LOGGER.warn("Exception while processing record. Key : {}", record.key(), ex);
            MyExceptionHandler.handleException(record, ex);
        }
    }
}
You can perform that logic using a FilteringBatchMessageListenerAdapter with a custom RecordFilterStrategy to check for the deserialization exceptions.
Simply add the adapter to the listener container factory.
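For illustration, a minimal sketch of that idea (the consumer factory bean and its generic types are assumptions, not from the original answer): registering a RecordFilterStrategy on the batch listener container factory is how the filtering adapter is typically applied, so records whose headers mark a deserialization failure never reach the listener.

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) { // assumed existing consumer factory
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setBatchListener(true);
    // Returning true means "filter this record out" before it reaches the listener.
    factory.setRecordFilterStrategy(record ->
            record.headers().lastHeader(ErrorHandlingDeserializer2.VALUE_DESERIALIZER_EXCEPTION_HEADER) != null
                    || record.headers().lastHeader(ErrorHandlingDeserializer2.KEY_DESERIALIZER_EXCEPTION_HEADER) != null);
    return factory;
}

The dead-letter publishing and per-record error handling from the question would still live in the listener (or an error handler); the filter only keeps poisoned records out of the batch.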

How to make resolvers run async

I am setting up a Java-based GraphQL app and find graphql-java-tools really convenient. The problem is that, while it is pretty straightforward to make field resolvers async with plain graphql-java, I couldn't find a way to do it using graphql-java-tools.
I tried
@Bean
public ExecutionStrategy executionStrategy() {
    return new AsyncExecutionStrategy();
}
Here are the resolvers I use for testing:
@Component
public class VideoResolver implements GraphQLResolver<Video> {

    public Episode getEpisode(Video video) {
        Episode result = new Episode();
        result.setTitle("episodeTitle");
        result.setUuid("EpisodeUuid");
        result.setBrand("episodeBrand");
        try {
            Thread.sleep(2000L);
            System.out.println(Thread.currentThread().getName());
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return result;
    }

    public List<Images> getImages(Video video) {
        Images image = new Images();
        image.setFileName("Image FileName1");
        List<Images> imageList = new ArrayList<>();
        imageList.add(image);
        try {
            Thread.sleep(2000L);
            System.out.println(Thread.currentThread().getName());
        } catch (InterruptedException e) {
            System.out.println("Exxxxxxxxxx");
        }
        return imageList;
    }
}
I was assuming this would run in about 2 seconds and print two different thread names, but it takes 4 seconds and prints the same thread name for both.
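One commonly used option with graphql-java-tools (a sketch, not from the original question, with a hypothetical executor and class name) is to have the resolver methods return CompletableFuture, so each field is resolved asynchronously and slow resolvers can overlap:

public class VideoResolverAsync implements GraphQLResolver<Video> {

    // Hypothetical thread pool; size and lifecycle management are up to you.
    private final Executor executor = Executors.newFixedThreadPool(4);

    public CompletableFuture<Episode> getEpisode(Video video) {
        return CompletableFuture.supplyAsync(() -> {
            Episode result = new Episode();
            result.setTitle("episodeTitle");
            sleep(2000L); // simulate slow work
            return result;
        }, executor);
    }

    public CompletableFuture<List<Images>> getImages(Video video) {
        return CompletableFuture.supplyAsync(() -> {
            Images image = new Images();
            image.setFileName("Image FileName1");
            sleep(2000L); // simulate slow work
            return Collections.singletonList(image);
        }, executor);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

With both fields returning futures that complete on the pool, the two 2-second resolvers can run concurrently instead of back to back.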

Reactor: Backpressure and buffering without overflow

I can't seem to get an unbounded source to work with bufferTimeout without falling into unlimited demand. My source has a lot of data, but I can selectively pull from it, so there is no need to buffer a lot of data in memory if it isn't requested. However, I cannot figure out how to get Reactor to both (1) not request unlimited demand and (2) not overflow when the source is a bit slow to respond.
Here is a JUnit test case:
@Test
void bufferAllowsRequested() throws InterruptedException {
    ExecutorService workers = Executors.newFixedThreadPool(4);
    AtomicBoolean down = new AtomicBoolean();
    Flux.<List<Integer>>generate(sink -> produceRequestedTo(down, sink))
            .concatMap(Flux::fromIterable)
            .bufferTimeout(400, Duration.ofMillis(200))
            .doOnError(t -> {
                t.printStackTrace();
                down.set(true);
            })
            .publishOn(Schedulers.fromExecutor(workers), 4)
            .subscribe(this::processBuffer);
    Thread.sleep(3500);
    workers.shutdownNow();
    assertFalse(down.get());
}

private void processBuffer(List<Integer> buf) {
    System.out.println("Received " + buf.size());
    try {
        Thread.sleep(400);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

private void produceRequestedTo(AtomicBoolean down, SynchronousSink<List<Integer>> sink) {
    try {
        // simulate a source that is a bit slow to respond
        Thread.sleep(new Random().nextInt(1000));
        sink.next(IntStream.range(0, 500).boxed().collect(Collectors.toList()));
    } catch (Exception e) {
        e.printStackTrace();
        down.set(true);
    }
}
I've tried both Flux.create and Flux.generate, but both seem to suffer from this problem. I don't understand how this isn't a common use case.
I filed an issue here: https://github.com/reactor/reactor-core/issues/1557

Process multiple objects read through an ItemReader at the same time (concurrently) with Spring Batch

I am trying to read data from the database and run a process on each object concurrently.
My config is as below:
@Bean
public Job job() {
    return jobBuilderFactory.get("job").incrementer(new RunIdIncrementer()).listener(new Listener(videoDao))
            .flow(step1()).end().build();
}

@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .<VideosDTO, VideosDTO>chunk(3)
            .reader(databaseVideoItemReader(null))
            .processor(new Processor())
            .writer(new Writer(videoDao))
            .build();
}

@Bean
@StepScope
ItemReader<VideosDTO> databaseVideoItemReader(@Value("#{jobParameters[userId]}") String userId) {
    logger.info("Fetching videos for userId:" + userId);
    JdbcCursorItemReader<VideosDTO> databaseReader = new JdbcCursorItemReader<>();
    databaseReader.setDataSource(dataSource);
    databaseReader.setSql("SELECT * FROM voc.t_videos WHERE user_id=" + userId + " AND job_success_ind='N'");
    databaseReader.setRowMapper(new BeanPropertyRowMapper<>(VideosDTO.class));
    // databaseReader.open(new ExecutionContext());
    ExecutionContext executionContext = new ExecutionContext();
    executionContext.size();
    databaseReader.open(executionContext);
    return databaseReader;
}
My item processor is as below:
@Override
public VideosDTO process(VideosDTO videosDTO) throws Exception {
    log.info("processing........" + videosDTO.getVideoUrl());
    try {
        Process p = Runtime.getRuntime()
                .exec("C:\\Program Files\\Git\\bin\\bash.exe " + "D:\\DRM\\script.sh " + videosDTO.getVideoUrl());
        // .exec("D:\\PortableGit\\bin\\bash.exe
        // D:\\Vocabimate_Files\\script.sh "+videosDTO.getVideoUrl());
        Thread.sleep(1000);
        p.destroy();
        try {
            p.waitFor();
        } catch (Exception e) {
            e.printStackTrace();
        }
        try (InputStream is = p.getErrorStream()) {
            int in = -1;
            while ((in = is.read()) != -1) {
                System.out.print((char) in);
            }
        }
        try (InputStream is = p.getInputStream()) {
            int in = -1;
            while ((in = is.read()) != -1) {
                System.out.print((char) in);
            }
        }
    } catch (IOException e2) {
        e2.printStackTrace();
    }
    return videosDTO;
}
The writer is as below:

@Override
public void write(List<? extends VideosDTO> videosList) throws Exception {
    for (VideosDTO vid : videosList) {
        log.info("writing...." + vid.getVideoUrl());
    }
}
Suppose there are 3 objects fetched from the DB. This code first completes processing on the first object, then the second, then the third, and only then starts writing. I want to run the process on the three objects concurrently, and then perform the writing operation.
Is there any way to do this?
Using a multi-threaded step, as suggested by @dimitrisli, is the way to go. In addition to that, another way is to use the AsyncItemProcessor (in combination with an AsyncItemWriter).
A similar use case (calling a rest endpoint asynchronously from the processor) can be found here: https://stackoverflow.com/a/52309260/5019386 where I gave some more details.
Hope this helps.
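For illustration, a minimal sketch of that combination reusing the question's own names (Processor, Writer, databaseVideoItemReader), assuming spring-batch-integration is on the classpath: the AsyncItemProcessor submits each item to a TaskExecutor and emits a Future, and the AsyncItemWriter unwraps those futures before delegating to the real writer.

@Bean
public AsyncItemProcessor<VideosDTO, VideosDTO> asyncItemProcessor() {
    AsyncItemProcessor<VideosDTO, VideosDTO> asyncProcessor = new AsyncItemProcessor<>();
    asyncProcessor.setDelegate(new Processor());                 // existing processor from the question
    asyncProcessor.setTaskExecutor(new SimpleAsyncTaskExecutor());
    return asyncProcessor;
}

@Bean
public AsyncItemWriter<VideosDTO> asyncItemWriter() {
    AsyncItemWriter<VideosDTO> asyncWriter = new AsyncItemWriter<>();
    asyncWriter.setDelegate(new Writer(videoDao));               // existing writer from the question
    return asyncWriter;
}

@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .<VideosDTO, Future<VideosDTO>>chunk(3)              // items now flow through the step as futures
            .reader(databaseVideoItemReader(null))
            .processor(asyncItemProcessor())
            .writer(asyncItemWriter())
            .build();
}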
Without getting into the details of your custom Reader/Processor/Writer, I think what you're looking for is a multi-threaded Step.
As described in the linked documentation, in order to make your step multi-threaded (meaning each chunk is read/processed/written in a separate thread), you first need to register a SimpleAsyncTaskExecutor:
@Bean
public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor("myAsyncTaskExecutor");
}
and then register this task executor in your Step's builder:
@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .<VideosDTO, VideosDTO>chunk(3)
            .reader(databaseVideoItemReader(null))
            .processor(new Processor())
            .writer(new Writer(videoDao))
            // making the Step multi-threaded
            .taskExecutor(taskExecutor())
            .build();
}

Java Streams to iterate over a ResultSet object

I have the following code snippet:

ResultSet rs = stmt.executeQuery();
List<String> userIdList = new ArrayList<String>();
while (rs.next()) {
    userIdList.add(rs.getString(1));
}
Can I make use of Java streams/Lambda expressions to perform this iteration instead of a while loop to populate the List?
You may create a wrapper for the ResultSet making it an Iterable. From there you can iterate as well as create a stream. Of course you have to define a mapper function to get the iterated values from the result set.
The ResultSetIterable may look like this:

public class ResultSetIterable<T> implements Iterable<T> {
    private final ResultSet rs;
    private final Function<ResultSet, T> onNext;

    public ResultSetIterable(ResultSet rs, CheckedFunction<ResultSet, T> onNext) {
        this.rs = rs;
        // onNext is the mapper function to get the values from the resultSet
        this.onNext = onNext;
    }

    private boolean resultSetHasNext() {
        try {
            return rs.next();
        } catch (SQLException e) {
            // you should add proper exception handling here
            throw new RuntimeException(e);
        }
    }

    @Override
    public Iterator<T> iterator() {
        try {
            return new Iterator<T>() {
                // the iterator state is initialized by calling next() to
                // know whether there are elements to iterate
                boolean hasNext = resultSetHasNext();

                @Override
                public boolean hasNext() {
                    return hasNext;
                }

                @Override
                public T next() {
                    T result = onNext.apply(rs);
                    // after each get, we need to update the hasNext info
                    hasNext = resultSetHasNext();
                    return result;
                }
            };
        } catch (Exception e) {
            // you should add proper exception handling here
            throw new RuntimeException(e);
        }
    }

    // adding stream support based on an Iterable is easy
    public Stream<T> stream() {
        return StreamSupport.stream(this.spliterator(), false);
    }
}
Now that we have our wrapper, you can stream over the results:

ResultSet rs = stmt.executeQuery();
List<String> userIdList = new ResultSetIterable<String>(rs, row -> row.getString(1))
        .stream()
        .collect(Collectors.toList());
EDIT
As Lukas pointed out, rs.getString(1) may throw a checked SQLException; therefore we need to use a CheckedFunction instead of a plain java.util.function.Function, one that is capable of wrapping any checked Exception in an unchecked one.
A very simple implementation could be:
public interface CheckedFunction<T, R> extends Function<T, R> {

    @Override
    default R apply(T t) {
        try {
            return applyAndThrow(t);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    R applyAndThrow(T t) throws Exception;
}
Alternatively, you could use a library that provides such a function, e.g. jOOλ or Vavr.
If using a third-party library is an option, you could use jOOQ, which supports wrapping a JDBC ResultSet in a jOOQ Cursor and then streaming it, for example using DSLContext.fetchStream().
Essentially, you could write:
try (ResultSet rs = stmt.executeQuery()) {
    DSL.using(con)                        // DSLContext
       .fetchStream(rs)                   // Stream<Record>
       .map(r -> r.get(0, String.class))  // Stream<String>
       .collect(toList());
}
Disclaimer: I work for the vendor.
Try library: abacus-jdbc
List<String> userIdList = StreamEx.<String> rows(resultSet, 1).toList(); // Don't forget to close ResultSet
Or: If you want to close the ResultSet after toList.
StreamEx.<String> rows(resultSet, 1).onClose(() -> JdbcUtil.closeQuitely(resultSet)).toList();
Or: If you use the utility classes provided in abacus-jdbc:
String sql = "select user_id from user";
// No need to worry about closing the Connection/Statement/ResultSet manually. It will be taken care of by the framework.
JdbcUtil.prepareQuery(dataSource, sql).stream(String.class).toList();
// Or:
JdbcUtil.prepareQuery(dataSource, sql).toList(String.class);
Disclaimer: I'm the developer of abacus-jdbc.