stream() vs. parallelStream() when adding to an ArrayList - java-8

I had this piece of code:
List<UserNotification> userNotifications = new ArrayList<UserNotification>();
teatreAlertNotifications
    .parallelStream()
    .forEach(can -> userNotifications.add(new UserNotification(can)));
But since ArrayList is not synchronized, I think this is bad practice and that I should use .stream() instead.

Or just:
List<UserNotification> userNotifications = teatreAlertNotifications
    .parallelStream()
    .map(UserNotification::new)
    .collect(Collectors.toList());
Your original code is an example of unneeded side effects, which are generally discouraged in the Stream documentation.
You could keep your original code and use a synchronized (thread-safe) data structure instead, but in that case the order of the elements is not guaranteed.
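For illustration, a minimal sketch of that alternative (my addition, assuming teatreAlertNotifications is a List of notification objects):

List<UserNotification> userNotifications =
        Collections.synchronizedList(new ArrayList<>());
teatreAlertNotifications
    .parallelStream()
    .forEach(can -> userNotifications.add(new UserNotification(can)));
// Thread-safe to mutate concurrently, but the resulting element order is unspecified.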

Related


How to write test case using StepVerifier for Flux interval and delayElements
I want to write a StepVerifier test for the scenarios below.
Flux.interval(Duration.ofMillis(1000))
    .onBackpressureDrop()
    .flatMap(ignore -> doSomething())

Mono.just(1).repeat() // infinite Flux with backpressure
    .delayElements(Duration.ofMillis(1000))
    .concatMap(ignore -> doSomething())
TL;DR: You can use StepVerifier.withVirtualTime to test time-based operators and avoid long delays. In addition, because the stream is infinite, you need to cancel the subscription using thenCancel at some point.
Here are some examples:
@Test
void testDelayElements() {
    StepVerifier.withVirtualTime(() ->
            Mono.just(1).repeat() // infinite Flux with backpressure
                .delayElements(Duration.ofMillis(10000))
                .concatMap(ignore -> doSomething())
        )
        .expectSubscription()
        .expectNoEvent(Duration.ofMillis(10000))
        .expectNextCount(1)
        .expectNoEvent(Duration.ofMillis(10000))
        .expectNextCount(1)
        .expectNoEvent(Duration.ofMillis(10000))
        .expectNextCount(1)
        .thenCancel()
        .verify();
}
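For the first scenario (Flux.interval with onBackpressureDrop), a similar virtual-time test could look like the sketch below. Treat it as an assumption-laden illustration: it presumes doSomething() returns a publisher that emits exactly one element per upstream tick.

@Test
void testInterval() {
    StepVerifier.withVirtualTime(() ->
            Flux.interval(Duration.ofMillis(1000))
                .onBackpressureDrop()
                .flatMap(ignore -> doSomething())
        )
        .expectSubscription()
        .expectNoEvent(Duration.ofMillis(1000)) // advance virtual time; nothing emitted yet
        .expectNextCount(1)                     // first tick, mapped through doSomething()
        .thenAwait(Duration.ofMillis(1000))     // advance another interval period
        .expectNextCount(1)
        .thenCancel()                           // the stream is infinite, so cancel explicitly
        .verify();
}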
For more information, check the Manipulating Time section in the Reactor 3 Reference Guide.
A very important point from the documentation:
Take extra care to ensure the Supplier<Publisher> can be used in a lazy fashion. Otherwise, virtual time is not guaranteed. Especially avoid instantiating the Flux earlier in the test code and having the Supplier return that variable. Instead, always instantiate the Flux inside the lambda.
Note that the publisher is created lazily using the Supplier<Publisher<T>>. For example, the following will not work as expected:
@Test
void testDelayElements() {
    var stream = Mono.just(1).repeat() // infinite Flux with backpressure
        .delayElements(Duration.ofMillis(10000))
        .concatMap(ignore -> doSomething());
    StepVerifier.withVirtualTime(() -> stream)
    ....
}

How do I use multiple reactive streams in the same pipeline?

I'm using WebFlux to pull data from two different REST endpoints, and trying to correlate some data from one stream with the other. I have Flux instances called events and egvs, and for each event I want to find the EGV with the nearest timestamp.
final Flux<Tuple2<Double, Object>> data = events
    .map(e -> Tuples.of(e.getValue(),
        egvs.map(egv -> Tuples.of(egv.getValue(),
                Math.abs(Duration.between(e.getDisplayTime(),
                    egv.getDisplayTime()).toSeconds())))
            .sort(Comparator.comparingLong(Tuple2::getT2))
            .take(1)
            .map(v -> v.getT1())));
When I send data to my Thymeleaf template, the first element of the tuple renders as a number, as I'd expect, but the second element renders as a FluxMapFuseable. It appears that the egvs.map(...) portion of the pipeline isn't executing. How do I get that part of the pipeline to execute?
UPDATE
Thanks, @Toerktumlare - your answer helped me figure out that my approach was wrong. On each iteration through the map operation, the event needs the context of the entire set of EGVs to find the one it matches with. So the working code looks like this:
final Flux<Tuple2<Double, Double>> data =
    Flux.zip(events, egvs.collectList().repeat())
        .map(t -> Tuples.of(
            // Grab the event
            t.getT1().getValue(),
            // Find the EGV (from the full set of EGVs) with the closest timestamp
            t.getT2().stream()
                .map(egv -> Tuples.of(
                    egv.getValue(),
                    Math.abs(Duration.between(
                        t.getT1().getDisplayTime(),
                        egv.getDisplayTime()).toSeconds())))
                // Sort the stream of (value, time difference) tuples and
                // take the smallest time difference.
                .sorted(Comparator.comparingLong(Tuple2::getT2))
                .map(Tuple2::getT1)
                .findFirst()
                .orElse(0.)));
What I think you are doing is breaking the reactive chain.
During the assembly phase, Reactor calls each operator backwards until it finds a producer that can start producing items, and I think you are breaking that chain here:
egvs.map(egv -> Tuples.of( ..., ... )
You see, egvs.map(...) returns a new publisher that you need to take care of and chain onto the return value of events.map.
I'll give you an example:
// This works because we always return from flatMap,
// so we keep the chain intact.
Mono.just("foobar").flatMap(f -> {
    return Mono.just(f);
}).subscribe(s -> {
    System.out.println(s);
});
On the other hand, this behaves differently:
Mono.just("foobar").flatMap(f -> {
Mono.just("foo").doOnSuccess(s -> { System.out.println("this will never print"); });
return Mono.just(f);
});
In this example, you can see that we neglect to take care of the return value of the inner Mono, thus breaking the chain: it is assembled but never subscribed to, so it never executes.
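For illustration (my sketch, not from the original answer), the inner Mono does execute once it becomes part of the chain returned from flatMap:

Mono.just("foobar")
    .flatMap(f -> Mono.just("foo")
        .doOnSuccess(s -> System.out.println("this will print: " + s)))
    .subscribe();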
You haven't really disclosed what egvs actually is, so I won't be able to give you a full answer, but you should most likely do something like this:
final Flux<Tuple2<Double, Object>> data = events
    // chain on egvs here instead
    // and then return your full tuple object instead
    .map(e -> egvs
        .map(egv -> Tuples.of(e.getValue(),
            Tuples.of(egv.getValue(),
                Math.abs(Duration.between(e.getDisplayTime(),
                    egv.getDisplayTime()).toSeconds()))))
        // sort on the time difference held in the nested tuple
        .sort(Comparator.comparingLong(t -> t.getT2().getT2()))
        .take(1)
        .map(v -> v.getT1()));
I don't have a compiler to check against at the moment, but I believe that is your problem, at least. It's a bit tricky to read your code.

Spring Webflux: efficiently using Flux and/or Mono stream multiple times (possible?)

I have the method below, where I am calling several ReactiveMongoRepositories in order to receive and process certain documents. Since I am kind of new to Webflux, I am learning as I go.
To me, the code below doesn't feel very efficient, as I am opening multiple streams at the same time. This non-blocking way of writing code somehow makes it complicated to get a value from a stream and re-use that value in the cascaded flatMaps down the line.
In the example below I have to call the userRepository twice, since I want the user at the beginning and then again later. Is there a possibility to do this more efficiently with Webflux?
public Mono<Guideline> addGuideline(Guideline guideline, String keycloakUserId) {
    Mono<Guideline> guidelineMono = userRepository.findByKeycloakUserId(keycloakUserId)
        .flatMap(user -> {
            return teamRepository.findUserInTeams(user.get_id());
        })
        .zipWith(instructionRepository.findById(guideline.getInstructionId()))
        .zipWith(userRepository.findByKeycloakUserId(keycloakUserId))
        .flatMap(objects -> {
            User user = objects.getT2();
            Instruction instruction = objects.getT1().getT2();
            Team team = objects.getT1().getT1();
            if (instruction.getTeamId().equals(team.get_id())) {
                guideline.setAddedByUser(user.get_id());
                guideline.setTeamId(team.get_id());
                guideline.setDateAdded(new Date());
                guideline.setGuidelineStatus(GuidelineStatus.ACTIVE);
                guideline.setGuidelineSteps(Arrays.asList());
                return guidelineRepository.save(guideline);
            } else {
                return Mono.error(new InstructionDoesntBelongOrExistException(
                    "Unable to add, since this Instruction does not belong to you or doesn't exist anymore!"));
            }
        });
    return guidelineMono;
}
I'll post my earlier comment as an answer. If anyone feels like writing the correct code for it, then go ahead.
I don't have access to an IDE currently, so I can't write an example, but you could start by fetching the Instruction from the database.
Keep that Mono<Instruction>. Then fetch your User, and flatMap the User to fetch the Team from the database. Inside that flatMap, build a Mono<Tuple2<User, Team>>.
After that you take your two Monos and use zipWith with a combinator function to build a Mono<Tuple3<User, Team, Instruction>> that you can flatMap over.
So basically: fetch one item, then fetch two items, then combine into three items. You can create tuples using the Tuples.of(...) function. A sketch follows below.
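Since the answer leaves the code as an exercise, here is a rough sketch of the described approach, reusing the repository and entity names from the question (untested; an illustration of the shape rather than a verified implementation):

public Mono<Guideline> addGuideline(Guideline guideline, String keycloakUserId) {
    // Fetch the Instruction once and keep the Mono.
    Mono<Instruction> instructionMono =
        instructionRepository.findById(guideline.getInstructionId());

    // Fetch the User once, then flatMap to fetch the Team,
    // carrying the User along in a Tuple2.
    Mono<Tuple2<User, Team>> userAndTeam =
        userRepository.findByKeycloakUserId(keycloakUserId)
            .flatMap(user -> teamRepository.findUserInTeams(user.get_id())
                .map(team -> Tuples.of(user, team)));

    // Combine into a Tuple3 via a combinator function and flatMap over it.
    return userAndTeam
        .zipWith(instructionMono, (ut, instruction) ->
            Tuples.of(ut.getT1(), ut.getT2(), instruction))
        .flatMap(t -> {
            User user = t.getT1();
            Team team = t.getT2();
            Instruction instruction = t.getT3();
            if (instruction.getTeamId().equals(team.get_id())) {
                guideline.setAddedByUser(user.get_id());
                guideline.setTeamId(team.get_id());
                guideline.setDateAdded(new Date());
                guideline.setGuidelineStatus(GuidelineStatus.ACTIVE);
                guideline.setGuidelineSteps(Arrays.asList());
                return guidelineRepository.save(guideline);
            }
            return Mono.error(new InstructionDoesntBelongOrExistException(
                "Unable to add, since this Instruction does not belong to you or doesn't exist anymore!"));
        });
}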

Java 8 JPA Repository Stream produce two (or more) results?

I have a Java 8 Stream being returned by a Spring Data JPA repository. I don't think my use case is all that unusual: there are two (actually three in my case) collections off of the resulting stream that I would like collected.
Set<Long> ids = // initialized
try (Stream<SomeDatabaseEntity> someDatabaseEntityStream =
         someDatabaseEntityRepository.findSomeDatabaseEntitiesStream(ids)) {
    Set<Long> theAlphaComponentIds = someDatabaseEntityStream
        .map(v -> v.getAlphaComponentId())
        .collect(Collectors.toSet());
    // operations on 'theAlphaComponentIds' here
}
I need to pull out the 'Beta' objects and do some work on those too, so I think I have to repeat the code, which seems completely wrong:
try (Stream<SomeDatabaseEntity> someDatabaseEntityStream =
         someDatabaseEntityRepository.findSomeDatabaseEntitiesStream(ids)) {
    Set<BetaComponent> theBetaComponents = someDatabaseEntityStream
        .map(v -> v.getBetaComponent())
        .collect(Collectors.toSet());
    // operations on 'theBetaComponents' here
}
These two code blocks occur serially in the processing. Is there a clean way to get both Sets from processing the Stream only once? Note: I do not want some kludgy solution that makes up a wrapper class for the Alphas and Betas, as they don't really belong together.
You can always refactor code by putting the common parts into a method and turning the uncommon parts into parameters. E.g.
public <T> Set<T> getAll(Set<Long> ids, Function<SomeDatabaseEntity, T> f)
{
    try (Stream<SomeDatabaseEntity> someDatabaseEntityStream =
             someDatabaseEntityRepository.findSomeDatabaseEntitiesStream(ids)) {
        return someDatabaseEntityStream.map(f).collect(Collectors.toSet());
    }
}
usable via
Set<Long> theAlphaComponentIds = getAll(ids, v -> v.getAlphaComponentId());
// operations on 'theAlphaComponentIds' here
and
Set<BetaComponent> theBetaComponents = getAll(ids, v -> v.getBetaComponent());
// operations on 'theBetaComponents' here
Note that this pulls the “operations on … here” parts out of the try block, which is a good thing, as it implies that the associated resources are released earlier. This requires that BetaComponent can be processed independently of the Stream’s underlying resources (otherwise, you shouldn’t collect it into a Set anyway). For the Longs, we know for sure that they can be processed independently.
Of course, you could process the result out of the try block even without moving the common code into a method. Whether the original code bears a duplication that requires this refactoring is debatable. Actually, the operation consists of a single statement within a try block that looks big only due to the verbose identifiers. Ask yourself whether you would still deem the refactoring necessary if the code looked like this:
Set<Long> alphaIDs, ids = // initialized
try (Stream<SomeDatabaseEntity> s = repo.findSomeDatabaseEntitiesStream(ids)) {
    alphaIDs = s.map(v -> v.getAlphaComponentId()).collect(Collectors.toSet());
}
// operations on 'alphaIDs' here
Well, different developers may come to different conclusions…
If you want to reduce the number of repository queries, you can simply store the result of the query:
List<SomeDatabaseEntity> entities;
try (Stream<SomeDatabaseEntity> someDatabaseEntityStream =
         someDatabaseEntityRepository.findSomeDatabaseEntitiesStream(ids)) {
    entities = someDatabaseEntityStream.collect(Collectors.toList());
}
Set<Long> theAlphaComponentIds = entities.stream()
    .map(v -> v.getAlphaComponentId()).collect(Collectors.toSet());
// operations on 'theAlphaComponentIds' here

Set<BetaComponent> theBetaComponents = entities.stream()
    .map(v -> v.getBetaComponent()).collect(Collectors.toSet());
// operations on 'theBetaComponents' here
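For completeness, here is a minimal single-pass sketch (my addition, not from the answers above) that fills both sets while traversing the stream exactly once, at the cost of using forEach with side effects on local collections:

Set<Long> theAlphaComponentIds = new HashSet<>();
Set<BetaComponent> theBetaComponents = new HashSet<>();
try (Stream<SomeDatabaseEntity> s =
         someDatabaseEntityRepository.findSomeDatabaseEntitiesStream(ids)) {
    s.forEach(v -> {
        theAlphaComponentIds.add(v.getAlphaComponentId());
        theBetaComponents.add(v.getBetaComponent());
    });
}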

Java8 streams map - check if all map operations succeeded?

I am trying to map one list to another using streams.
Some elements of the original list fail to map. That is, the mapping function may not be able to find an appropriate new value.
I want to know if any of the mappings has failed. Ideally I would also like to stop the processing once a failure happened.
What I am currently doing is:
The mapping function returns null if there's no mapped value
I filter() to remove nulls from the stream
I collect(), and then
I compare the size of the result to the size of the original list.
For example:
List<String> func(List<String> old, Map<String, String> oldToNew)
{
    List<String> holger = old.stream()
        .map(oldToNew::get)
        .filter(Objects::nonNull)
        .collect(Collectors.toList());
    if (holger.size() < old.size()) {
        // ... appropriate error handling code ...
    }
    else {
        return holger;
    }
}
This is not very elegant. Also, everything is processed even when the whole thing should fail.
Suggestions for a better way of doing it?
Or maybe I should ditch streams altogether and use good old loops?
There is no best solution because that heavily depends on the use case. E.g. if lookup failures are expected to be unlikely or the error handling implies throwing an exception anyway, just throwing an exception at the first failed lookup within the mapping function might indeed be a good choice. Then, no follow-up code has to care about error conditions.
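For instance, a minimal sketch of that first option (my illustration; the exception type and message are placeholders):

List<String> func(List<String> old, Map<String, String> oldToNew) {
    return old.stream()
        .map(key -> {
            String value = oldToNew.get(key);
            if (value == null) // fail fast on the first missed lookup
                throw new IllegalStateException("lookup failed for: " + key);
            return value;
        })
        .collect(Collectors.toList());
}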
Another way of handling it might be:
List<String> func(List<String> old, Map<String, String> oldToNew) {
    Map<Boolean, List<String>> map = old.stream()
        .map(oldToNew::get)
        .collect(Collectors.partitioningBy(Objects::nonNull));
    List<String> failed = map.get(false);
    if (!failed.isEmpty())
        throw new IllegalStateException(failed.size() + " lookups failed");
    return map.get(true);
}
This can still be considered optimized for the successful case, as it collects a mostly meaningless list containing null values for the failures. But it has the advantage of being able to tell the number of failures (unlike a throwing map function).
If a detailed error analysis has a high priority, you may use a solution like this:
List<String> func(List<String> old, Map<String, String> oldToNew) {
    Map<Boolean, List<String>> map = old.stream()
        .map(s -> new AbstractMap.SimpleImmutableEntry<>(s, oldToNew.get(s)))
        .collect(Collectors.partitioningBy(e -> e.getValue() != null,
            Collectors.mapping(e -> Optional.ofNullable(e.getValue()).orElse(e.getKey()),
                Collectors.toList())));
    List<String> failed = map.get(false);
    if (!failed.isEmpty())
        throw new IllegalStateException("The following key(s) failed: " + failed);
    return map.get(true);
}
It collects two meaningful lists, containing the failed keys for failed lookups and a list of successfully mapped values. Note that both lists could be returned.
You could change your filter to Objects::requireNonNull (as a map step, since it returns the value rather than a boolean) and catch a NullPointerException outside the stream.
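A sketch of that approach (my illustration; it aborts the pipeline at the first failed lookup):

List<String> func(List<String> old, Map<String, String> oldToNew) {
    try {
        return old.stream()
            .map(oldToNew::get)
            .map(Objects::requireNonNull) // throws NullPointerException on the first null
            .collect(Collectors.toList());
    } catch (NullPointerException e) {
        // ... appropriate error handling code ...
        return Collections.emptyList();
    }
}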
