RxJava: Caching IDs from a Stream of Network Call Results (Redis or similar caching solution)

I need some help with RxJava. I have an expensive network call which returns an Observable (stream of Adverts from elasticsearch). I want to cache the ID property of each emitted item (Advert) for 10 minutes (in Redis), so that subsequent calls in the following 10 minutes use the IDs from Cache to fetch the adverts from Elasticsearch.
I've got some code which goes some way towards achieving the desired outcome (credit to the following blog:
http://blog.danlew.net/2015/06/22/loading-data-from-multiple-sources-with-rxjava/)
It can cache each emitted item from the stream individually; what I need is to cache all the IDs from those items as a single cache entry.
The code so far is at https://github.com/tonymurphy/rxjava for anyone interested; snippets below.
@Component
public class CachingObservable {

    private final Logger logger = LoggerFactory.getLogger(CachingObservable.class);

    @Autowired
    private AdvertService advertService;

    // Each "network" response is different
    private Long requestNumber = 0L;

    public Observable<Advert> getAdverts(final String location) {
        Observable<Advert> memory = memory(location);
        Observable<Advert> network = network(location);

        Observable<Advert> networkWithSave = network.doOnNext(new Action1<Advert>() {
            @Override
            public void call(Advert advert) {
                List<Long> ids = new ArrayList<Long>();
                ids.add(advert.getId());
                advertService.cache(location, ids);
            }
        });

        // Retrieve the first source with data - concat checks in order
        Observable<Advert> source = Observable.concat(memory, networkWithSave)
                .first();
        return source;
    }
From my understanding, the concat method is not really useful for my use case. I need to know if/when the network observable completes, get the list of advert IDs returned, and store them in the cache. I could subscribe to the network observable, but I want this to be lazy: only called if no data is found in the cache. So the following updated code doesn't work either; any ideas appreciated.
public Observable<Advert> getAdverts(final String location) {
    Observable<Advert> memory = memory(location);
    Observable<Advert> network = network(location);

    Observable<Advert> networkWithSave = network.doOnNext(new Action1<Advert>() {
        @Override
        public void call(Advert advert) {
            List<Long> ids = new ArrayList<Long>();
            ids.add(advert.getId());
            advertService.cache(location, ids);
        }
    });

    // Retrieve the first source with data - concat checks in order
    Observable<Advert> source = Observable.concat(memory, networkWithSave)
            .first();

    Observable<List<Advert>> listObservable = networkWithSave.toList();
    final Func1<List<Advert>, List<Long>> transformer = new Func1<List<Advert>, List<Long>>() {
        @Override
        public List<Long> call(List<Advert> adverts) {
            List<Long> ids = new ArrayList<Long>();
            for (Advert advert : adverts) {
                ids.add(advert.getId());
            }
            return ids;
        }
    };
    listObservable.map(transformer).subscribe(new Action1<List<Long>>() {
        @Override
        public void call(List<Long> ids) {
            logger.info("ids {}", ids);
        }
    });
    return source;
}

What I would do is use filter to make sure the old content of the cache is not emitted so the concat jumps to the network call:
Subject<Pair<Long, List<Advert>>, Pair<Long, List<Advert>>> cache =
        BehaviorSubject.create().toSerialized();

static final long RETENTION_TIME = 10L * 60 * 1000;

Observable<Advert> memory = cache
        .filter(v -> v.first + RETENTION_TIME > System.currentTimeMillis())
        .flatMapIterable(v -> v.second);

Observable<Advert> network = ...

Observable<Advert> networkWithSave = network.toList()
        .doOnNext(v -> cache.onNext(Pair.of(System.currentTimeMillis(), v)))
        .flatMapIterable(v -> v);

return memory.switchIfEmpty(networkWithSave);

OK, I think I have a solution which will work for me. I may have overlooked something, but it seems straightforward enough:
public Observable<Advert> getAdverts(final String location) {
    Observable<Advert> memory = memory(location);
    final Observable<Advert> network = network(location);
    final Func1<List<Advert>, List<Long>> advertToIdTransformer = convertAdvertsToIds();

    memory.isEmpty().subscribe(new Action1<Boolean>() {
        @Override
        public void call(Boolean aBoolean) {
            if (aBoolean.equals(Boolean.TRUE)) {
                Observable<List<Long>> listObservable = network.toList().map(advertToIdTransformer);
                listObservable.subscribe(new Action1<List<Long>>() {
                    @Override
                    public void call(List<Long> ids) {
                        logger.info("Caching ids {}", ids);
                        advertService.cache(location, ids);
                    }
                });
            }
        }
    });

    // Retrieve the first source with data - concat checks in order
    Observable<Advert> source = Observable.concat(memory, network)
            .first();
    return source;
}
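For comparison, a lazier variant of the same idea (my sketch, not from the original post) collects the whole network response once, caches all the IDs as a single entry as a side effect, and only runs when the cache observable is empty, avoiding the eager isEmpty() subscription above. It assumes the same memory(location), network(location) and advertService.cache(location, ids) helpers:
public Observable<Advert> getAdverts(final String location) {
    Observable<Advert> memory = memory(location);
    // Collect the full network response, cache every ID as one entry,
    // then unroll the list back into a stream of Adverts. Nothing here
    // executes until something subscribes, so the network call stays lazy.
    Observable<Advert> networkWithSave = network(location)
            .toList()
            .doOnNext(new Action1<List<Advert>>() {
                @Override
                public void call(List<Advert> adverts) {
                    List<Long> ids = new ArrayList<Long>(adverts.size());
                    for (Advert advert : adverts) {
                        ids.add(advert.getId());
                    }
                    advertService.cache(location, ids); // one cache entry for all IDs
                }
            })
            .flatMapIterable(new Func1<List<Advert>, Iterable<Advert>>() {
                @Override
                public Iterable<Advert> call(List<Advert> adverts) {
                    return adverts;
                }
            });
    // Fall through to the network only when the cache emits nothing.
    return memory.switchIfEmpty(networkWithSave);
}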

Java 8 Distinct by property

In Java 8, how can I filter a collection using the Stream API by checking the distinctness of a property of each object?
For example, I have a list of Person objects and I want to remove people with the same name:
persons.stream().distinct();
will use the default equality check for a Person object, so I need something like
persons.stream().distinct(p -> p.getName());
Unfortunately the distinct() method has no such overload. Without modifying the equality check inside the Person class, is it possible to do this succinctly?
Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t));
}
Then you can write:
persons.stream().filter(distinctByKey(Person::getName))
Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct() does.
(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)
An alternative would be to place the persons in a map, using the name as a key:
persons.stream().collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();
Note that the Person that is kept, in case of a duplicate name, will be the first encountered.
You can wrap the person objects into another class, that only compares the names of the persons. Afterward, you unwrap the wrapped objects to get a person stream again. The stream operations might look as follows:
persons.stream()
.map(Wrapper::new)
.distinct()
.map(Wrapper::unwrap)
...;
The class Wrapper might look as follows:
class Wrapper {

    private final Person person;

    public Wrapper(Person person) {
        this.person = person;
    }

    public Person unwrap() {
        return person;
    }

    @Override
    public boolean equals(Object other) {
        if (other instanceof Wrapper) {
            return ((Wrapper) other).person.getName().equals(person.getName());
        } else {
            return false;
        }
    }

    @Override
    public int hashCode() {
        return person.getName().hashCode();
    }
}
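A generic variant of the same idea (my sketch, not from the answer): wrap by an arbitrary extracted key so the helper can be reused for other types:
final class Keyed<T, K> {
    final T value;
    final K key;

    Keyed(T value, K key) {
        this.value = value;
        this.key = key;
    }

    @Override
    public boolean equals(Object other) {
        // two wrappers are equal when their extracted keys are equal
        return other instanceof Keyed && ((Keyed<?, ?>) other).key.equals(key);
    }

    @Override
    public int hashCode() {
        return key.hashCode();
    }
}
Used like:
persons.stream()
        .map(p -> new Keyed<>(p, p.getName()))
        .distinct()
        .map(k -> k.value)
        .collect(Collectors.toList());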
Another solution uses a Set. It may not be the ideal solution, but it works:
Set<String> set = new HashSet<>(persons.size());
persons.stream().filter(p -> set.add(p.getName())).collect(Collectors.toList());
Or if you can modify the original list, you can use removeIf method
persons.removeIf(p -> !set.add(p.getName()));
There's a simpler approach using a TreeSet with a custom comparator.
persons.stream()
.collect(Collectors.toCollection(
() -> new TreeSet<Person>((p1, p2) -> p1.getName().compareTo(p2.getName()))
));
We can also use RxJava (very powerful reactive extension library)
Observable.from(persons).distinct(Person::getName)
or
Observable.from(persons).distinct(p -> p.getName())
You can use the groupingBy collector:
persons.stream().collect(Collectors.groupingBy(p -> p.getName())).values().forEach(t -> System.out.println(t.get(0).getId()));
If you want to have another stream you can use this:
persons.stream().collect(Collectors.groupingBy(p -> p.getName())).values().stream().map(l -> l.get(0));
You can use the distinct(HashingStrategy) method in Eclipse Collections.
List<Person> persons = ...;
MutableList<Person> distinct =
ListIterate.distinct(persons, HashingStrategies.fromFunction(Person::getName));
If you can refactor persons to implement an Eclipse Collections interface, you can call the method directly on the list.
MutableList<Person> persons = ...;
MutableList<Person> distinct =
persons.distinct(HashingStrategies.fromFunction(Person::getName));
HashingStrategy is simply a strategy interface that allows you to define custom implementations of equals and hashCode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Note: I am a committer for Eclipse Collections.
A similar approach to the one Saeed Zarinfam used, but more Java 8 style:
persons.stream().collect(Collectors.groupingBy(p -> p.getName())).values().stream()
        .map(plans -> plans.stream().findFirst().get())
        .collect(toList());
You can use StreamEx library:
StreamEx.of(persons)
.distinct(Person::getName)
.toList()
I recommend using Vavr, if you can. With this library you can do the following:
io.vavr.collection.List.ofAll(persons)
.distinctBy(Person::getName)
.toJavaSet() // or any another Java 8 Collection
Extending Stuart Marks's answer, this can be done in a shorter way and without a concurrent map (if you don't need parallel streams):
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    final Set<Object> seen = new HashSet<>();
    return t -> seen.add(keyExtractor.apply(t));
}
Then call:
persons.stream().filter(distinctByKey(p -> p.getName()));
My approach to this is to group all the objects with same property together, then cut short the groups to size of 1 and then finally collect them as a List.
List<YourPersonClass> listWithDistinctPersons = persons.stream()
//operators to remove duplicates based on person name
.collect(Collectors.groupingBy(p -> p.getName()))
.values()
.stream()
//cut short the groups to size of 1
.flatMap(group -> group.stream().limit(1))
//collect distinct users as list
.collect(Collectors.toList());
A distinct list of objects can be found using:
List<Person> distinctPersons = persons.stream()
        .collect(Collectors.collectingAndThen(
                Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Person::getName))),
                ArrayList::new));
I made a generic version:
private <T, R> Collector<T, ?, Stream<T>> distinctByKey(Function<T, R> keyExtractor) {
    return Collectors.collectingAndThen(
            toMap(
                    keyExtractor,
                    t -> t,
                    (t1, t2) -> t1
            ),
            (Map<R, T> map) -> map.values().stream()
    );
}
An example:
Stream.of(new Person("Jean"),
          new Person("Jean"),
          new Person("Paul")
)
.filter(...)
.collect(distinctByKey(Person::getName)) // returns a stream of Person with 2 elements, Jean and Paul
.map(...)
.collect(toList())
Another library that supports this is jOOλ, and its Seq.distinct(Function<T,U>) method:
Seq.seq(persons).distinct(Person::getName).toList();
Under the hood, it does practically the same thing as the accepted answer, though.
Set<YourPropertyType> set = new HashSet<>();
list
.stream()
.filter(it -> set.add(it.getYourProperty()))
.forEach(it -> ...);
While the highest upvoted answer is absolutely the best answer with respect to Java 8, it is at the same time absolutely the worst in terms of performance. If you really want a low-performing application, go ahead and use it. The simple requirement of extracting a unique set of Person names can be achieved by a mere for-each and a Set.
Things get even worse if the list is above size 10.
Consider you have a collection of 20 Objects, like this:
public static final List<SimpleEvent> testList = Arrays.asList(
        new SimpleEvent("Tom"), new SimpleEvent("Dick"), new SimpleEvent("Harry"), new SimpleEvent("Tom"),
        new SimpleEvent("Dick"), new SimpleEvent("Huckle"), new SimpleEvent("Berry"), new SimpleEvent("Tom"),
        new SimpleEvent("Dick"), new SimpleEvent("Moses"), new SimpleEvent("Chiku"), new SimpleEvent("Cherry"),
        new SimpleEvent("Roses"), new SimpleEvent("Moses"), new SimpleEvent("Chiku"), new SimpleEvent("gotya"),
        new SimpleEvent("Gotye"), new SimpleEvent("Nibble"), new SimpleEvent("Berry"), new SimpleEvent("Jibble"));
Where your SimpleEvent object looks like this:
public class SimpleEvent {

    private String name;
    private String type;

    public SimpleEvent(String name) {
        this.name = name;
        this.type = "type_" + name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getType() {
        return type;
    }

    public void setType(String type) {
        this.type = type;
    }
}
And to test, you have JMH code like this (please note, I'm using the same distinctByKey Predicate mentioned in the accepted answer):
@Benchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void aStreamBasedUniqueSet(Blackhole blackhole) throws Exception {
    Set<String> uniqueNames = testList
            .stream()
            .filter(distinctByKey(SimpleEvent::getName))
            .map(SimpleEvent::getName)
            .collect(Collectors.toSet());
    blackhole.consume(uniqueNames);
}

@Benchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void aForEachBasedUniqueSet(Blackhole blackhole) throws Exception {
    Set<String> uniqueNames = new HashSet<>();
    for (SimpleEvent event : testList) {
        uniqueNames.add(event.getName());
    }
    blackhole.consume(uniqueNames);
}

public static void main(String[] args) throws RunnerException {
    Options opt = new OptionsBuilder()
            .include(MyBenchmark.class.getSimpleName())
            .forks(1)
            .mode(Mode.Throughput)
            .warmupBatchSize(3)
            .warmupIterations(3)
            .measurementIterations(3)
            .build();
    new Runner(opt).run();
}
Then you'll have Benchmark results like this:
Benchmark                                Mode   Samples        Score   Score error   Units
c.s.MyBenchmark.aForEachBasedUniqueSet   thrpt        3  2635199.952   1663320.718   ops/s
c.s.MyBenchmark.aStreamBasedUniqueSet    thrpt        3   729134.695    895825.697   ops/s
And as you can see, the simple for-each is roughly 3 times better in throughput, with a lower error score, compared to the Java 8 stream. The higher the throughput, the better the performance.
I would like to improve Stuart Marks' answer. What if the key is null? It will throw a NullPointerException. Here I ignore null keys by adding one more check, keyExtractor.apply(t) != null:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> keyExtractor.apply(t) != null && seen.add(keyExtractor.apply(t));
}
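As a small refinement (my sketch, not part of the original answer), the key can be computed once per element instead of calling the extractor twice:
public static <T> Predicate<T> distinctByNonNullKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> {
        Object key = keyExtractor.apply(t); // evaluate the extractor only once
        return key != null && seen.add(key);
    };
}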
This works like a charm:
Grouping the data by unique key to form a map.
Returning the first object from every value of the map (There could be multiple person having same name).
persons.stream()
.collect(groupingBy(Person::getName))
.values()
.stream()
.flatMap(values -> values.stream().limit(1))
.collect(toList());
The easiest way to implement this is to jump on the sort feature, as it already provides an optional Comparator which can be created using an element's property. Then you have to filter duplicates out, which can be done using a stateful Predicate which uses the fact that for a sorted stream all equal elements are adjacent:
Comparator<Person> c = Comparator.comparing(Person::getName);
stream.sorted(c).filter(new Predicate<Person>() {
    Person previous;
    public boolean test(Person p) {
        if (previous != null && c.compare(previous, p) == 0)
            return false;
        previous = p;
        return true;
    }
})./* more stream operations here */;
Of course, a stateful Predicate is not thread-safe; however, if that's your need you can move this logic into a Collector and let the stream take care of the thread-safety when using your Collector. This depends on what you want to do with the stream of distinct elements, which you didn't tell us in your question.
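To illustrate that remark, here is one way such a Collector could look (my sketch, not code from the answer): keep the first element per key in a LinkedHashMap, and let the combiner merge partial maps so that parallel streams stay safe:
static <T, K> Collector<T, ?, List<T>> distinctBy(Function<? super T, ? extends K> key) {
    // toMap keeps the first value per key; LinkedHashMap preserves encounter order
    return Collectors.collectingAndThen(
            Collectors.toMap(key, Function.identity(), (first, second) -> first, LinkedHashMap::new),
            m -> new ArrayList<>(m.values()));
}
Usage would then be persons.stream().collect(distinctBy(Person::getName)).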
There are a lot of approaches; this one will also help - simple, clean and clear:
List<Employee> employees = new ArrayList<>();
employees.add(new Employee(11, "Ravi"));
employees.add(new Employee(12, "Stalin"));
employees.add(new Employee(23, "Anbu"));
employees.add(new Employee(24, "Yuvaraj"));
employees.add(new Employee(35, "Sena"));
employees.add(new Employee(36, "Antony"));
employees.add(new Employee(47, "Sena"));
employees.add(new Employee(48, "Ravi"));
List<Employee> empList = new ArrayList<>(employees.stream().collect(
        Collectors.toMap(Employee::getName, obj -> obj,
                (existingValue, newValue) -> existingValue))
        .values());
empList.forEach(System.out::println);
// Collectors.toMap(
//   Employee::getName, - key (the value by which you want to eliminate duplicates)
//   obj -> obj, - value (the entire employee object)
//   (existingValue, newValue) -> existingValue) - to avoid IllegalStateException: duplicate key
Output - toString() overloaded
Employee{id=35, name='Sena'}
Employee{id=12, name='Stalin'}
Employee{id=11, name='Ravi'}
Employee{id=24, name='Yuvaraj'}
Employee{id=36, name='Antony'}
Employee{id=23, name='Anbu'}
Here is an example:
public class PayRoll {

    private int payRollId;
    private int id;
    private String name;
    private String dept;
    private int salary;

    public PayRoll(int payRollId, int id, String name, String dept, int salary) {
        super();
        this.payRollId = payRollId;
        this.id = id;
        this.name = name;
        this.dept = dept;
        this.salary = salary;
    }

    // getters and toString() omitted here; the code below relies on
    // getPayRollId(), getId() and getDept() accessors being present
}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class Prac {
    public static void main(String[] args) {
        int salary = 70000;
        PayRoll payRoll  = new PayRoll(1311, 1, "A", "HR", salary);
        PayRoll payRoll2 = new PayRoll(1411, 2, "B", "Technical", salary);
        PayRoll payRoll3 = new PayRoll(1511, 1, "C", "HR", salary);
        PayRoll payRoll4 = new PayRoll(1611, 1, "D", "Technical", salary);
        PayRoll payRoll5 = new PayRoll(711, 3, "E", "Technical", salary);
        PayRoll payRoll6 = new PayRoll(1811, 3, "F", "Technical", salary);

        List<PayRoll> list = new ArrayList<PayRoll>();
        list.add(payRoll);
        list.add(payRoll2);
        list.add(payRoll3);
        list.add(payRoll4);
        list.add(payRoll5);
        list.add(payRoll6);

        Map<Object, Optional<PayRoll>> k = list.stream()
                .collect(Collectors.groupingBy(p -> p.getId() + "|" + p.getDept(),
                        Collectors.maxBy(Comparator.comparingInt(PayRoll::getPayRollId))));

        k.entrySet().forEach(p -> {
            if (p.getValue().isPresent()) {
                System.out.println(p.getValue().get());
            }
        });
    }
}
Output:
PayRoll [payRollId=1611, id=1, name=D, dept=Technical, salary=70000]
PayRoll [payRollId=1811, id=3, name=F, dept=Technical, salary=70000]
PayRoll [payRollId=1411, id=2, name=B, dept=Technical, salary=70000]
PayRoll [payRollId=1511, id=1, name=C, dept=HR, salary=70000]
Late to the party, but I sometimes use this one-liner as an equivalent:
((Function<Value, Key>) Value::getKey).andThen(new HashSet<>()::add)::apply
The expression is a Predicate<Value>, but since the map is inline, it works as a filter. This is of course less readable, but sometimes it can be helpful to avoid a separate method.
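Concretely, with the Person example from the question (my adaptation), that looks like:
// the HashSet is created once when the filter expression is built,
// and its add() doubles as the "seen for the first time" check
persons.stream()
        .filter(((Function<Person, String>) Person::getName)
                .andThen(new HashSet<String>()::add)::apply)
        .collect(Collectors.toList());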
Building on @josketres's answer, I created a generic utility method. You could make this more Java 8-friendly by creating a Collector:
public static <T> Set<T> removeDuplicates(Collection<T> input, Comparator<T> comparer) {
    return input.stream()
            .collect(toCollection(() -> new TreeSet<>(comparer)));
}

@Test
public void removeDuplicatesWithDuplicates() {
    ArrayList<C> input = new ArrayList<>();
    Collections.addAll(input, new C(7), new C(42), new C(42));
    Collection<C> result = removeDuplicates(input, (c1, c2) -> Integer.compare(c1.value, c2.value));
    assertEquals(2, result.size());
    assertTrue(result.stream().anyMatch(c -> c.value == 7));
    assertTrue(result.stream().anyMatch(c -> c.value == 42));
}

@Test
public void removeDuplicatesWithoutDuplicates() {
    ArrayList<C> input = new ArrayList<>();
    Collections.addAll(input, new C(1), new C(2), new C(3));
    Collection<C> result = removeDuplicates(input, (t1, t2) -> Integer.compare(t1.value, t2.value));
    assertEquals(3, result.size());
    assertTrue(result.stream().anyMatch(c -> c.value == 1));
    assertTrue(result.stream().anyMatch(c -> c.value == 2));
    assertTrue(result.stream().anyMatch(c -> c.value == 3));
}

private class C {
    public final int value;

    private C(int value) {
        this.value = value;
    }
}
Maybe this will be useful for somebody. I had a slightly different requirement: given a list of objects A from a third party, remove all which have the same A.b field for the same A.id (multiple A objects with the same A.id in the list). The stream partition answer by Tagir Valeev inspired me to use a custom Collector which returns Map<A.id, List<A>>. A simple flatMap will do the rest.
public static <T, K, K2> Collector<T, ?, Map<K, List<T>>> groupingDistinctBy(Function<T, K> keyFunction, Function<T, K2> distinctFunction) {
    return groupingBy(keyFunction, Collector.of((Supplier<Map<K2, T>>) HashMap::new,
            (map, item) -> map.putIfAbsent(distinctFunction.apply(item), item),
            (left, right) -> {
                left.putAll(right);
                return left;
            },
            map -> new ArrayList<>(map.values()),
            Collector.Characteristics.UNORDERED));
}
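A hypothetical usage, assuming a class A with getId() and getB() accessors as described above:
// for each id, keep only one A per distinct b value
Map<Long, List<A>> deduped = list.stream()
        .collect(groupingDistinctBy(A::getId, A::getB));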
I had a situation where I was supposed to get distinct elements from a list based on 2 keys. If you want distinct based on two keys, or a composite key, try this:
class Person {
    int rollno;
    String name;
}

List<Person> personList;

Function<Person, List<Object>> compositeKey = person ->
        Arrays.<Object>asList(person.getName(), person.getRollno());

Map<Object, List<Person>> map = personList.stream()
        .collect(Collectors.groupingBy(compositeKey, Collectors.toList()));

List<Object> duplicateEntrys = map.entrySet().stream()
        .filter(settingMap -> settingMap.getValue().size() > 1)
        .collect(Collectors.toList());
A variation of the top answer that handles null:
public static <T, K> Predicate<T> distinctBy(final Function<? super T, K> getKey) {
    val seen = ConcurrentHashMap.<Optional<K>>newKeySet();
    return obj -> seen.add(Optional.ofNullable(getKey.apply(obj)));
}
In my tests:
assertEquals(
asList("a", "bb"),
Stream.of("a", "b", "bb", "aa").filter(distinctBy(String::length)).collect(toList()));
assertEquals(
asList(5, null, 2, 3),
Stream.of(5, null, 2, null, 3, 3, 2).filter(distinctBy(x -> x)).collect(toList()));
val maps = asList(
hashMapWith(0, 2),
hashMapWith(1, 2),
hashMapWith(2, null),
hashMapWith(3, 1),
hashMapWith(4, null),
hashMapWith(5, 2));
assertEquals(
asList(0, 2, 3),
maps.stream()
.filter(distinctBy(m -> m.get("val")))
.map(m -> m.get("i"))
.collect(toList()));
In my case I needed to control what the previous element was. I then created a stateful Predicate where I checked whether the previous element was different from the current element; in that case I kept it.
public List<Log> fetchLogById(Long id) {
    return this.findLogById(id).stream()
            .filter(new LogPredicate())
            .collect(Collectors.toList());
}

public class LogPredicate implements Predicate<Log> {

    private Log previous;

    public boolean test(Log current) {
        boolean isDifferent = previous == null || verifyIfDifferentLog(current, previous);
        if (isDifferent) {
            previous = current;
        }
        return isDifferent;
    }

    private boolean verifyIfDifferentLog(Log current, Log previous) {
        return !current.getId().equals(previous.getId());
    }
}
My solution in this listing:
List<HolderEntry> result ....
List<HolderEntry> dto3s = new ArrayList<>(result.stream().collect(toMap(
        HolderEntry::getId,
        holder -> holder, //or Function.identity() if you want
        (holder1, holder2) -> holder1
)).values());
In my situation I want to find distinct values and put them in a List.

Spring reactive: mixing RestTemplate & WebClient

I have two endpoints: /parent and /child/{parentId}
I need to return a list of all Child objects:
public class Parent {
    private long id;
    private Child child;
}
public class Child {
    private long childId;
    private String someAttribute;
}
However, the call to /child/{parentId} is quite slow, so I'm trying to do this:
Call /parent to get 100 parent records, using the asynchronous RestTemplate
For each parent record, call /child/{parentId} to get the detail
Add the result of each call to /child/{parentId} into resultList
When all 100 calls to /child/{parentId} are done, return resultList
I use a wrapper class since most endpoints return JSON in the format:
{
    "next": "String",
    "data": [
        // parent goes here
    ]
}
So I wrap it in this:
public class ResponseWrapper<T> {
    private List<T> data;
    private String next;
}
I wrote this code, but resultList always comes back empty.
What is the correct way to achieve this?
public List<Child> getAllParents() {
    var endpointParent = StringUtils.join(HOST, "/parent");
    var resultList = new ArrayList<Child>();
    var responseParent = restTemplate.exchange(endpointParent, HttpMethod.GET, httpEntity,
            new ParameterizedTypeReference<ResponseWrapper<Parent>>() {
            });
    responseParent.getBody().getData().stream().forEach(parent -> {
        var endpointChild = StringUtils.join(HOST, "/child/", parent.getId());
        // async call due to slow endpoint child
        webClient.get().uri(endpointChild).retrieve()
                .bodyToMono(new ParameterizedTypeReference<ResponseWrapper<Child>>() {
                })
                .map(wrapper -> wrapper.getData())
                .subscribe(children -> {
                    children.stream().forEach(child -> resultList.add(child));
                });
    });
    return resultList;
}
Calling subscribe on a reactive type starts the processing but returns immediately; you have no guarantee at that point that the processing is done. So by the time your snippet calls return resultList, the WebClient is probably still busy fetching things.
You're better off discarding AsyncRestTemplate (which is now deprecated in favour of WebClient) and building a single pipeline like:
public List<Child> getAllParents() {
    var endpointParent = StringUtils.join(HOST, "/parent");
    Flux<Parent> parents = webClient.get().uri(endpointParent)
            .retrieve()
            .bodyToMono(new ParameterizedTypeReference<ResponseWrapper<Parent>>() {
            })
            .flatMapMany(wrapper -> Flux.fromIterable(wrapper.getData()));
    return parents.flatMap(parent -> {
        var endpointChild = StringUtils.join(HOST, "/child/", parent.getId());
        return webClient.get().uri(endpointChild).retrieve()
                .bodyToMono(new ParameterizedTypeReference<ResponseWrapper<Child>>() {
                })
                .flatMapMany(wrapper -> Flux.fromIterable(wrapper.getData()));
    }).collectList().block();
}
By default, the parents.flatMap operator will process elements with some concurrency (256 by default in Reactor). You can choose a different value by calling another variant of the Flux.flatMap operator with an explicit concurrency argument.
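For example, the child calls can be capped with the Flux.flatMap(mapper, concurrency) overload (a sketch based on the code above; the value 8 is arbitrary):
// at most 8 /child/{parentId} requests in flight at any time
return parents.flatMap(parent -> {
    var endpointChild = StringUtils.join(HOST, "/child/", parent.getId());
    return webClient.get().uri(endpointChild).retrieve()
            .bodyToMono(new ParameterizedTypeReference<ResponseWrapper<Child>>() {
            })
            .flatMapMany(wrapper -> Flux.fromIterable(wrapper.getData()));
}, 8).collectList().block();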

Limit and get flat list in Java 8

I have an object like this:
public class Keyword {
    private int id;
    private DateTime creationDate;
    private int subjectId;
    ...
}
So now I have a data list like below:
KeywordList = [{1,'2018-10-20',10},{1,'2018-10-21',10},{1,'2018-10-22',10},{1,'2018-10-23',20},{1,'2018-10-24',20},{1,'2018-10-25',20},{1,'2018-10-26',30},{1,'2018-10-27',30},{1,'2018-10-28',40}]
I want to limit this list per subject id.
Ex: If I provide a limit of 2, it should only include the latest 2 records for each subject id, sorted by creationDate, and return the result as a list too.
resultList = [{1,'2018-10-21',10},{1,'2018-10-22',10},{1,'2018-10-24',20},{1,'2018-10-25',20},{1,'2018-10-26',30},{1,'2018-10-27',30},{1,'2018-10-28',40}]
How can we achieve this kind of thing in Java 8?
I have achieved it in this kind of way, but I have a doubt about this code performance-wise:
dataList.stream()
.collect(Collectors.groupingBy(Keyword::getSubjectId,
Collectors.collectingAndThen(Collectors.toList(),
myList-> myList.stream().sorted(Comparator.comparing(Keyword::getCreationDate).reversed()).limit(limit)
.collect(Collectors.toList()))))
.values().stream().flatMap(List::stream).collect(Collectors.toList())
Well, you could do it in two steps I guess (assuming DateTime is comparable):
Map<Integer, List<Keyword>> map = yourInitialList
        .stream()
        .collect(Collectors.groupingBy(Keyword::getSubjectId));

List<Keyword> result = map.values()
        .stream()
        .flatMap(x -> x.stream()
                .sorted(Comparator.comparing(Keyword::getCreationDate).reversed()) // latest first
                .limit(2))
        .collect(Collectors.toList());
This is doable in a single step too with Collectors.collectingAndThen I guess, but I'm not sure how readable it would be.
private static final Comparator<Keyword> COMPARE_BY_CREATION_DATE_DESC =
        (k1, k2) -> k2.getCreationDate().compareTo(k1.getCreationDate());

private static <T> Collector<T, ?, List<T>> limitingList(int limit) {
    return Collector.of(ArrayList::new,
            (l, e) -> {
                if (l.size() < limit)
                    l.add(e);
            },
            (l1, l2) -> {
                l1.addAll(l2.subList(0, Math.min(l2.size(), Math.max(0, limit - l1.size()))));
                return l1;
            }
    );
}

public static <R> Map<R, List<Keyword>> retrieveLast(List<Keyword> keywords, Function<Keyword, ? extends R> classifier, int limit) {
    return keywords.stream()
            .sorted(COMPARE_BY_CREATION_DATE_DESC)
            .collect(Collectors.groupingBy(classifier, limitingList(limit)));
}
And the client's code:
List<Keyword> keywords = Collections.emptyList();
Map<Integer, List<Keyword>> map = retrieveLast(keywords, Keyword::getSubjectId, 2);

Entity framework 6 code first Many to many insert slow

I am trying to find a way to improve insert performances with the following code (please, read my questions after the code block):
//Domain classes
[Table("Products")]
public class Product
{
    [Key]
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public int Id { get; set; }

    public string Sku { get; set; }

    [ForeignKey("Orders")]
    public virtual ICollection<Order> Orders { get; set; }

    public Product()
    {
        Orders = new List<Order>();
    }
}

[Table("Orders")]
public class Order
{
    [Key]
    [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public int Id { get; set; }

    public string Title { get; set; }
    public decimal Total { get; set; }

    [ForeignKey("Products")]
    public virtual ICollection<Product> Products { get; set; }

    public Order()
    {
        Products = new List<Product>();
    }
}

//Data access
public class MyDataContext : DbContext
{
    public MyDataContext()
        : base("MyDataContext")
    {
        Configuration.LazyLoadingEnabled = true;
        Configuration.ProxyCreationEnabled = true;
        Database.SetInitializer(new CreateDatabaseIfNotExists<MyDataContext>());
    }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Product>().ToTable("Products");
        modelBuilder.Entity<Order>().ToTable("Orders");
    }
}
}
//Service layer
public interface IServices<T, K>
{
    T Create(T item);
    T Read(K key);
    IEnumerable<T> ReadAll(Expression<Func<IEnumerable<T>, IEnumerable<T>>> pre);
    T Update(T item);
    void Delete(K key);
    void Save();
    void Dispose();
    void BatchSave(IEnumerable<T> list);
    void BatchUpdate(IEnumerable<T> list, Action<UpdateSpecification<T>> spec);
}

public class BaseServices<T, K> : IDisposable, IServices<T, K> where T : class
{
    protected MyDataContext Context;

    public BaseServices()
    {
        Context = new MyDataContext();
    }

    public T Create(T item)
    {
        T created;
        created = Context.Set<T>().Add(item);
        return created;
    }

    public void Delete(K key)
    {
        var item = Read(key);
        if (item == null)
            return;
        Context.Set<T>().Attach(item);
        Context.Set<T>().Remove(item);
    }

    public T Read(K key)
    {
        T read;
        read = Context.Set<T>().Find(key);
        return read;
    }

    public IEnumerable<T> ReadAll(Expression<Func<IEnumerable<T>, IEnumerable<T>>> pre)
    {
        IEnumerable<T> read;
        read = Context.Set<T>().ToList();
        read = pre.Compile().Invoke(read);
        return read;
    }

    public T Update(T item)
    {
        Context.Set<T>().Attach(item);
        Context.Entry<T>(item).CurrentValues.SetValues(item);
        Context.Entry<T>(item).State = System.Data.Entity.EntityState.Modified;
        return item;
    }

    public void Save()
    {
        Context.SaveChanges();
    }
}

public interface IOrderServices : IServices<Order, int>
{
    //custom logic goes here
}

public interface IProductServices : IServices<Product, int>
{
    //custom logic goes here
}
//Web project's controller
public ActionResult TestCreateProducts()
{
    //Create 100 new test products
    for (int i = 0; i < 100; i++)
    {
        _productServices.Create(new Product
        {
            Sku = i.ToString()
        });
    }
    _productServices.Save();

    var products = _productServices.ReadAll(r => r); //get a list of saved products to add them to orders
    var random = new Random();
    var orders = new List<Order>();
    var count = 0;

    //Create 3000 orders
    for (int i = 1; i <= 3000; i++)
    {
        //Generate a random list of products to attach to the current order
        var productIds = new List<int>();
        var x = random.Next(1, products.Count() - 1);
        for (int j = 0; j < x; j++)
        {
            productIds.Add(random.Next(products.Min(r => r.Id), products.Max(r => r.Id)));
        }
        //Create the order
        var order = new Order
        {
            Title = "Order" + i,
            Total = i,
            Products = products.Where(p => productIds.Contains(p.Id)).ToList()
        };
        orders.Add(order);
    }
    _orderServices.CreateRange(orders);
    _orderServices.Save();
    return RedirectToAction("Index");
}
This code works fine, but is very VERY slow when SaveChanges is executed.
Behind the scenes, the annotations on the domain objects create all the relationships needed: an OrderProducts table with the proper foreign keys is automatically created, and the inserts are done by EF properly.
I've tried many things with bulk inserts using EntityFramework.Utilities, SqlBulkCopy, etc... but none worked.
Is there a way to achieve this?
Understand this is only for testing purposes, and my goal is to optimize as best I can any operation in our software that uses EF.
Thanks!
Just before you do your inserts, disable your context's AutoDetectChangesEnabled (by setting it to false). Do your inserts and then set AutoDetectChangesEnabled back to true, e.g.:
try
{
    MyContext.Configuration.AutoDetectChangesEnabled = false;
    // do your inserts, updates etc..
}
finally
{
    MyContext.Configuration.AutoDetectChangesEnabled = true;
}
You can find more information on what this is doing here
I see two reasons why your code is slow.
Add vs. AddRange
You add entities one by one using the Create method.
You should always use AddRange over Add. The Add method will try to DetectChanges every time it is invoked, while AddRange does it only once.
You should add a "CreateRange" method to your code:
public IEnumerable<T> CreateRange(IEnumerable<T> list)
{
    return Context.Set<T>().AddRange(list);
}

var products = new List<Product>();
//Create 100 new test products
for (int i = 0; i < 100; i++)
{
    products.Add(new Product { Sku = i.ToString() });
}
_productServices.CreateRange(products);
_productServices.Save();
Disabling/enabling the AutoDetectChanges property also works, as @mark_h proposed; however, personally I don't like this kind of solution.
Database Round Trip
A database round trip is required for every record to add, modify or delete. So if you insert 3,000 records, then 3,000 database round trips will be required, which is VERY slow.
You already tried EntityFramework.BulkInsert or SqlBulkCopy, which is great. I recommend you first try them again using the "AddRange" fix to see the new performance.
Here is a biased comparison of libraries supporting BulkInsert for EF:
Entity Framework - Bulk Insert Library Reviews & Comparisons
Disclaimer: I'm the owner of the project Entity Framework Extensions
This library allows you to BulkSaveChanges, BulkInsert, BulkUpdate, BulkDelete and BulkMerge within your Database.
It supports all inheritances and associations.
// Easy to use
public void Save()
{
// Context.SaveChanges();
Context.BulkSaveChanges();
}
// Easy to customize
public void Save()
{
// Context.SaveChanges();
Context.BulkSaveChanges(bulk => bulk.BatchSize = 100);
}
EDIT: Added answer to sub-question:
"An entity object cannot be referenced by multiple instances of IEntityChangeTracker"
The issue happens because you use two different DbContexts: one for the product and one for the order.
You may find a better answer than mine in a different thread, like this answer.
The Add method successfully attaches the product; subsequent calls with the same product don't throw an error because it's the same product.
The AddRange method, however, attaches the product multiple times, since it doesn't come from the same context, so when Detect Changes is called it doesn't know how to handle it.
One way to fix it is by re-using the same context:
var _productServices = new BaseServices<Product, int>();
var _orderServices = new BaseServices<Order, int>(_productServices.Context);
While it may not be elegant, the performance will be improved.

Page expired issue with back button and wicket SortableDataProvider and DataTable

I've got an issue with SortableDataProvider and DataTable in wicket.
I've defined my DataTable as such:
IColumn<Column>[] columns = new IColumn[9];
//column values are mapped to the private attributes listed in ColumnImpl.java
columns[0] = new PropertyColumn<Column>(new Model<String>("#"), "columnPosition", "columnPosition");
columns[1] = new PropertyColumn<Column>(new Model<String>("Description"), "description");
columns[2] = new PropertyColumn<Column>(new Model<String>("Type"), "dataType", "dataType");
Adding it to the table:
DataTable<Column> dataTable = new DataTable<Column>("columnsTable", columns, provider, maxRowsPerPage) {
    @Override
    protected Item<Column> newRowItem(String id, int index, IModel<Column> model) {
        return new OddEvenItem<Column>(id, index, model);
    }
};
My data provider:
public class ColumnSortableDataProvider extends SortableDataProvider<Column> {

    private static final long serialVersionUID = 1L;

    private List<Column> list = null;

    public ColumnSortableDataProvider(Table table, String sortProperty) {
        this.list = Arrays.asList(table.getColumns().toArray(new Column[0]));
        setSort(sortProperty, true);
    }

    public ColumnSortableDataProvider(List<Column> list, String sortProperty) {
        this.list = list;
        setSort(sortProperty, true);
    }

    @Override
    public Iterator<? extends Column> iterator(int first, int count) {
        /*
         first - first row of data
         count - minimum number of elements to retrieve
         So this method returns an iterator capable of iterating over {first, first+count} items
        */
        Iterator<Column> iterator = null;
        try {
            if (getSort() != null) {
                Collections.sort(list, new Comparator<Column>() {
                    private static final long serialVersionUID = 1L;

                    @Override
                    public int compare(Column c1, Column c2) {
                        int result = 1;
                        PropertyModel<Comparable> model1 = new PropertyModel<Comparable>(c1, getSort().getProperty());
                        PropertyModel<Comparable> model2 = new PropertyModel<Comparable>(c2, getSort().getProperty());
                        if (model1.getObject() == null && model2.getObject() == null)
                            result = 0;
                        else if (model1.getObject() == null)
                            result = 1;
                        else if (model2.getObject() == null)
                            result = -1;
                        else
                            result = ((Comparable) model1.getObject()).compareTo(model2.getObject());
                        result = getSort().isAscending() ? result : -result;
                        return result;
                    }
                });
            }
            if (list.size() > (first + count))
                iterator = list.subList(first, first + count).iterator();
            else
                iterator = list.iterator();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return iterator;
    }
The problem is the following:
- I click a column header to sort by that column.
- I navigate to a different page
- I click Back (or Forward if I do the opposite scenario)
- Page has expired.
It'd be nice to generate the page using PageParameters but I somehow need to intercept the sort event to do so.
Any pointers would be greatly appreciated. Thanks a ton!!
I don't know at a quick glance what might be causing this, but in order to help diagnose, you might want to enable debug logging for org.apache.wicket.Session or possibly more of the wicket code.
The retrieval of a page definitely involves calls to a method
public final Page getPage(final String pageMapName, final String path, final int versionNumber)
in this class, and it has some debug logging.
For help with setting up this logging, have a look at How to initialize log4j properly? or at the docs for log4j.
