Filter Map keys into a set - java-8

I have a Map<String, EnrollmentData> that maps each student ID to that student's data.
The student IDs need to be filtered on certain EnrollmentData attributes and returned as a Set.
Map<String, EnrollmentData> studentData = .........;
if (MapUtils.isNotEmpty(studentData)) {
    Set<String> idSet = studentData.entrySet().stream()
        .filter(x -> x.getValue().equals(...))
        .collect(Collectors.toSet(x -> x.getKey()));
}
However, this gives me a compilation error at toSet: [ Collectors is not applicable for the arguments (( x) -> {}) ].
What needs to be done here?

After the filtering, you have a Stream<Map.Entry<String, EnrollmentData>>. Collecting with toSet() (which accepts no arguments) would collect Entry<String, EnrollmentData> objects, but you want to map each element to its key before collecting.
You must first map the elements of the resulting stream to the Entry's key:
.filter(yourFilterFunction)
.map(Map.Entry::getKey)
.collect(Collectors.toSet());
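For reference, here is a minimal self-contained sketch of the filter, map, collect chain. The EnrollmentData class and its isActive() flag are hypothetical stand-ins, since the question does not show the real class or the real filter condition.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class EnrollmentFilterSketch {

    // Hypothetical stand-in for the question's EnrollmentData; the "active" flag is an assumption.
    static class EnrollmentData {
        private final boolean active;
        EnrollmentData(boolean active) { this.active = active; }
        boolean isActive() { return active; }
    }

    // Returns the IDs of all students whose enrollment data passes the filter.
    static Set<String> activeStudentIds(Map<String, EnrollmentData> studentData) {
        return studentData.entrySet().stream()
                .filter(e -> e.getValue().isActive())   // keep only matching entries
                .map(Map.Entry::getKey)                  // Entry -> student ID
                .collect(Collectors.toSet());            // toSet() takes no arguments
    }

    public static void main(String[] args) {
        Map<String, EnrollmentData> studentData = new HashMap<>();
        studentData.put("s1", new EnrollmentData(true));
        studentData.put("s2", new EnrollmentData(false));
        studentData.put("s3", new EnrollmentData(true));
        // The resulting set contains s1 and s3; iteration order is unspecified.
        System.out.println(activeStudentIds(studentData));
    }
}
```

Streaming an empty map simply yields an empty set, so a guard such as MapUtils.isNotEmpty is only needed if the map itself may be null.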


Assign random UUID on a key's first occurrence in a stream

I'm looking for a solution on how to assign a random UUID to a key only on its first occurrence in a stream.
Example:
time key value assigned uuid
| 1 A fff17a1e-9943-11eb-a8b3-0242ac130003
| 2 B f01d2c42-9943-11eb-a8b3-0242ac130003
| 3 C f8f1e880-9943-11eb-a8b3-0242ac130003
| 1 X fff17a1e-9943-11eb-a8b3-0242ac130003 (same as above)
v 1 Y fff17a1e-9943-11eb-a8b3-0242ac130003 (same as above)
As you can see fff17a1e-9943-11eb-a8b3-0242ac130003 is assigned to key "1" on its first occurrence. This uuid is subsequently reused on its second and third occurrence. The order doesn't matter, though. There is no seed for the generated uuid either.
My idea was to use a leftJoin() with a KStream and a KTable with key/uuid mappings. If the right side of the leftJoin is null I have to create a new UUID and add it to the mapping table. However, I think this does not work when there are several new entries with the same key in a short period of time. I guess this will create several UUIDs for the same key.
Is there an easy solution for this or is this simply not possible with streaming?
I don't think you need a join in your use case, because joins merge two different streams whose records arrive with equal keys. You said that you receive just one stream of events, so your use case is an aggregation over one stream.
As I understand your question, you receive events A, B, C, ... and want to assign an ID to each of them. You say that the ID is random, which leaves it unclear how you would know that A -> fff17a1e-9943-11eb-a8b3-0242ac130003 and X -> fff17a1e-9943-11eb-a8b3-0242ac130003 (the same). I suppose that you might have a seed to generate this UUID, and that you then create a key based on this seed as well.
I suggest you start with this word count sample. Then, on the first map:
.map((key, value) -> new KeyValue<>(value, value))
you replace it with your map function. Something like this:
.map((k, v) -> {
if (v.equalsIgnoreCase("A")) {
return new KeyValue<String, ValueWithUUID>("1", new ValueWithUUID(v));
} else if (v.equalsIgnoreCase("B")) {
return new KeyValue<String, ValueWithUUID>("2", new ValueWithUUID(v));
} else {
return new KeyValue<String, ValueWithUUID>("0", new ValueWithUUID(v));
}
})
...
class ValueWithUUID {
String value;
String uuid;
public ValueWithUUID(String value) {
this.value = value;
// generate your UUID based on the value. It is random, but as you show in your question it might have a seed.
this.uuid = generateRandomUUIDWithSeed();
}
public String generateRandomUUIDWithSeed() {
return "fff17a1e-9943-11eb-a8b3-0242ac130003";
}
}
Then you decide whether you want a windowed aggregation (every 30 seconds, for instance) or a non-windowed aggregation that updates the result for every event that arrives. Here is one nice example.
You can aggregate the raw stream into a KTable, generating or reusing the UUID inside the aggregator, and then use the stream of that KTable.
final KStream<String, String> streamWithoutUUID = builder.stream("topic_name");
KTable<String, String> tableWithUUID = streamWithoutUUID.groupByKey().aggregate(
() -> "",
(k, v, t) -> {
if (!t.startsWith("uuid:")) {
return "uuid:" + "call your buildUUID function here" + ";value:" + v;
} else {
return t.split(";", 2)[0] + ";value:" + v;
}
},
Materialized.<String, String, KeyValueStore<Bytes, byte[]>>as("state_name")
.withKeySerde(Serdes.String()).withValueSerde(Serdes.String()));
final KStream<String, String> streamWithUUID = tableWithUUID.toStream();
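The aggregator logic above can be tried out without a Kafka cluster. The sketch below lifts the lambda into a plain static method and replays the events from the question. The HashMap is only a stand-in for the state store, and the use of UUID.randomUUID() in place of the "buildUUID" placeholder is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class UuidAggregatorSketch {

    // Same shape as the aggregator lambda: (key, newValue, currentAggregate) -> newAggregate.
    static String aggregate(String key, String value, String aggregate) {
        if (!aggregate.startsWith("uuid:")) {
            // First occurrence of this key: generate a fresh random UUID.
            return "uuid:" + UUID.randomUUID() + ";value:" + value;
        }
        // Later occurrence: keep the stored "uuid:..." prefix and replace the value.
        return aggregate.split(";", 2)[0] + ";value:" + value;
    }

    public static void main(String[] args) {
        // A plain HashMap stands in for the state store (for illustration only).
        Map<String, String> store = new HashMap<>();
        String[][] events = { {"1", "A"}, {"2", "B"}, {"3", "C"}, {"1", "X"}, {"1", "Y"} };
        for (String[] event : events) {
            // "" plays the role of the initializer () -> "" for unseen keys.
            String updated = aggregate(event[0], event[1], store.getOrDefault(event[0], ""));
            store.put(event[0], updated);
            System.out.println(event[0] + " -> " + updated);
        }
    }
}
```

In this run the three lines printed for key "1" share one UUID, while keys "2" and "3" each get their own.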

How to get an element at a specific index in an Optional List?

Optional<List<Long>> optionalList = getList(someInput);
How do I retrieve an element from this list?
How do I iterate over this list?
You can unwrap the list and use it like a normal list.
List<Long> list = optionalList.orElseGet(Collections::emptyList);
list.forEach(System.out::println); // process the list
// e.g. printing its elements
If you just want a forEach(..) and don't need the list unwrapped:
optionalList.orElseGet(Collections::emptyList)
.forEach(System.out::println);
You can check whether a value is present:
if (optionalList.isPresent()) {
List<Long> myList = optionalList.get();
// process list present
} else {
// process not present
}
Or keep using Optional to access one of its elements:
Optional<Long> longAt5 = optionalList.filter(list -> list.size() > 5)
.map(list -> list.get(5));
Check if there is a value present and then do some logic:
optionalList.ifPresent(list -> {
...
});
As for processing the list, you could do:
optionalList.orElseGet(() -> Collections.emptyList()).forEach(e -> {...});
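The filter-then-map idea for reading a single element can be generalised to any index. This is a minimal sketch; the helper name elementAt is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class OptionalListSketch {

    // Returns the element at the given index, or an empty Optional
    // if the list is absent or the index is out of range.
    static <T> Optional<T> elementAt(Optional<List<T>> optionalList, int index) {
        return optionalList
                .filter(list -> index >= 0 && index < list.size())
                .map(list -> list.get(index));
    }

    public static void main(String[] args) {
        Optional<List<Long>> present = Optional.of(Arrays.asList(10L, 20L, 30L));
        Optional<List<Long>> absent = Optional.empty();
        System.out.println(elementAt(present, 1)); // Optional[20]
        System.out.println(elementAt(present, 5)); // Optional.empty
        System.out.println(elementAt(absent, 0));  // Optional.empty
    }
}
```

Because the bounds check happens inside filter, the caller never sees an IndexOutOfBoundsException and handles "no list" and "list too short" the same way.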

Fastest way to convert key value pairs to grouped by key objects map using java 8 stream

Model:
public class AgencyMapping {
private Integer agencyId;
private String scoreKey;
}
public class AgencyInfo {
private Integer agencyId;
private Set<String> scoreKeys;
}
My code:
List<AgencyMapping> agencyMappings;
Map<Integer, AgencyInfo> agencyInfoByAgencyId = agencyMappings.stream()
.collect(groupingBy(AgencyMapping::getAgencyId,
collectingAndThen(toSet(), e -> e.stream().map(AgencyMapping::getScoreKey).collect(toSet()))))
.entrySet().stream().map(e -> new AgencyInfo(e.getKey(), e.getValue()))
.collect(Collectors.toMap(AgencyInfo::getAgencyId, identity()));
Is there a way to get the same result with simpler and faster code?
You can simplify the call to collectingAndThen(toSet(), e -> e.stream().map(AgencyMapping::getScoreKey).collect(toSet())) by replacing it with a call to mapping(AgencyMapping::getScoreKey, toSet()):
Map<Integer, AgencyInfo> resultSet = agencyMappings.stream()
.collect(groupingBy(AgencyMapping::getAgencyId,
mapping(AgencyMapping::getScoreKey, toSet())))
.entrySet()
.stream()
.map(e -> new AgencyInfo(e.getKey(), e.getValue()))
.collect(toMap(AgencyInfo::getAgencyId, identity()));
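To see what the mapping downstream collector produces on its own, here is a small runnable sketch of just the grouping step. The constructor and getters on AgencyMapping are assumed, since the question only shows the fields.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class MappingCollectorSketch {

    // Minimal version of the question's model; constructor and getters are assumed.
    static class AgencyMapping {
        private final Integer agencyId;
        private final String scoreKey;
        AgencyMapping(Integer agencyId, String scoreKey) {
            this.agencyId = agencyId;
            this.scoreKey = scoreKey;
        }
        Integer getAgencyId() { return agencyId; }
        String getScoreKey() { return scoreKey; }
    }

    // Groups by agency ID and collects only the score keys of each group into a set.
    static Map<Integer, Set<String>> scoreKeysByAgency(List<AgencyMapping> mappings) {
        return mappings.stream()
                .collect(Collectors.groupingBy(
                        AgencyMapping::getAgencyId,
                        Collectors.mapping(AgencyMapping::getScoreKey, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<AgencyMapping> mappings = Arrays.asList(
                new AgencyMapping(1, "a"),
                new AgencyMapping(1, "b"),
                new AgencyMapping(2, "c"));
        // Agency 1 maps to the keys a and b, agency 2 maps to the key c.
        System.out.println(scoreKeysByAgency(mappings));
    }
}
```

The mapping collector transforms each element before it reaches toSet(), so no intermediate Set<AgencyMapping> is built and re-streamed per group.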
A different way to see it using a toMap collector:
Map<Integer, AgencyInfo> resultSet = agencyMappings.stream()
.collect(toMap(AgencyMapping::getAgencyId, // key extractor
e -> new HashSet<>(singleton(e.getScoreKey())), // value extractor
(left, right) -> { // a merge function, used to resolve collisions between values associated with the same key
left.addAll(right);
return left;
}))
.entrySet()
.stream()
.map(e -> new AgencyInfo(e.getKey(), e.getValue()))
.collect(toMap(AgencyInfo::getAgencyId, identity()));
The latter example is arguably more complicated than the former. Nevertheless, your approach is pretty much the way to go apart from using mapping as opposed to collectingAndThen as mentioned above.
Apart from that, I don't see anything else you can simplify with the code shown.
As for faster code, if you're suggesting that your current approach is slow in performance then you may want to read the answers here that speak about when you should consider going parallel.
You are collecting to an intermediate map, then streaming the entries of this map to create AgencyInfo instances, which are finally collected to another map.
Instead of all this, you could use Collectors.toMap to collect directly to a map, mapping each AgencyMapping object to the desired AgencyInfo and merging the scoreKeys as needed:
Map<Integer, AgencyInfo> agencyInfoByAgencyId = agencyMappings.stream()
.collect(Collectors.toMap(
AgencyMapping::getAgencyId,
mapping -> new AgencyInfo(
mapping.getAgencyId(),
new HashSet<>(Set.of(mapping.getScoreKey()))),
(left, right) -> {
left.getScoreKeys().addAll(right.getScoreKeys());
return left;
}));
This works by grouping the AgencyMapping elements of the stream by AgencyMapping::getAgencyId, but storing AgencyInfo objects in the map instead. We get these AgencyInfo instances from manually mapping each original AgencyMapping object. Finally, we're merging AgencyInfo instances that are already in the map by means of a merge function that folds left scoreKeys from one AgencyInfo to another.
I'm using Java 9's Set.of to create a singleton set. If you don't have Java 9, you can replace it with Collections.singleton.
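Here is a self-contained sketch of the single-pass toMap approach, written against Java 8 with Collections.singleton. The constructors and getters on both model classes are assumptions, as the question only lists the fields.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class AgencyInfoToMapSketch {

    // Model classes from the question; constructors and getters are assumed.
    static class AgencyMapping {
        private final Integer agencyId;
        private final String scoreKey;
        AgencyMapping(Integer agencyId, String scoreKey) {
            this.agencyId = agencyId;
            this.scoreKey = scoreKey;
        }
        Integer getAgencyId() { return agencyId; }
        String getScoreKey() { return scoreKey; }
    }

    static class AgencyInfo {
        private final Integer agencyId;
        private final Set<String> scoreKeys;
        AgencyInfo(Integer agencyId, Set<String> scoreKeys) {
            this.agencyId = agencyId;
            this.scoreKeys = scoreKeys;
        }
        Integer getAgencyId() { return agencyId; }
        Set<String> getScoreKeys() { return scoreKeys; }
    }

    // Collects straight into the target map, merging score keys on key collisions.
    static Map<Integer, AgencyInfo> byAgencyId(List<AgencyMapping> mappings) {
        return mappings.stream()
                .collect(Collectors.toMap(
                        AgencyMapping::getAgencyId,
                        // A mutable HashSet is required so that the merge function can add to it.
                        m -> new AgencyInfo(m.getAgencyId(),
                                new HashSet<>(Collections.singleton(m.getScoreKey()))),
                        (left, right) -> {
                            left.getScoreKeys().addAll(right.getScoreKeys());
                            return left;
                        }));
    }

    public static void main(String[] args) {
        List<AgencyMapping> mappings = Arrays.asList(
                new AgencyMapping(1, "a"),
                new AgencyMapping(1, "b"),
                new AgencyMapping(2, "c"));
        byAgencyId(mappings).forEach((id, info) ->
                System.out.println(id + " -> " + info.getScoreKeys()));
    }
}
```

This variant creates one short-lived AgencyInfo per input element, so whether it is actually faster than the two-step version is worth measuring on real data rather than assuming.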

Using Java 8 streams for aggregating list objects

We are using three lists, ListA, ListB and ListC, to keep the marks of 10 students in 3 subjects (A, B, C).
Subjects B and C are optional, so only a few of the 10 students have marks in those subjects.
class Student {
String studentName;
int marks;
}
ListA has records for 10 students, ListB for 5 and ListC for 3 (which are also the sizes of the lists).
I want to know how we can sum up the marks of each student across their subjects using a Java 8 stream.
I tried the following:
List<Integer> list = IntStream.range(0,listA.size() -1).mapToObj(i -> listA.get(i).getMarks() +
listB.get(i).getMarks() +
listC.get(i).getMarks()).collect(Collectors.toList());;
There are 2 issues with this:
a) It throws an IndexOutOfBoundsException, as listB and listC don't have 10 elements.
b) The returned list is of type Integer, and I want it to be of type Student.
Any input would be very helpful.
You can make a stream of the 3 lists and then call flatMap to put all the lists' elements into a single stream. That stream will contain one element per student per mark, so you will have to aggregate the result by student name. Something along the lines of:
Map<String, Integer> studentMap = Stream.of(listA, listB, listC)
.flatMap(Collection::stream)
.collect(groupingBy(student -> student.studentName, summingInt(student -> student.marks)));
Alternatively, if your Student class has getters for its fields, you can change the last line to make it more readable:
Map<String, Integer> studentMap = Stream.of(listA, listB, listC)
.flatMap(Collection::stream)
.collect(groupingBy(Student::getName, summingInt(Student::getMarks)));
Then check the result by printing out the studentMap:
studentMap.forEach((key, value) -> System.out.println(key + " - " + value));
If you want to create a list of Student objects instead, you can use the result of the first map and create a new stream from its entries (this particular example assumes your Student class has an all-args constructor so you can one-line it):
List<Student> studentList = Stream.of(listA, listB, listC)
.flatMap(Collection::stream)
.collect(groupingBy(Student::getName, summingInt(Student::getMarks)))
.entrySet().stream()
.map(mapEntry -> new Student(mapEntry.getKey(), mapEntry.getValue()))
.collect(toList());
I would do it as follows:
Map<String, Student> result = Stream.of(listA, listB, listC)
.flatMap(List::stream)
.collect(Collectors.toMap(
Student::getName, // key: student's name
s -> new Student(s.getName(), s.getMarks()), // value: new Student
(s1, s2) -> { // merge students with same name: sum marks
s1.setMarks(s1.getMarks() + s2.getMarks());
return s1;
}));
Here I've used Collectors.toMap to create the map (I've also assumed you have a constructor for Student that receives a name and marks).
This version of Collectors.toMap expects three arguments:
A function that returns the key for each element (here it's Student::getName)
A function that returns the value for each element (I've created a new Student instance that is a copy of the original element, this is to not modify instances from the original stream)
A merge function that is to be used when there are elements that have the same key, i.e. for students with the same name (I've summed the marks here).
If you could add the following copy constructor and method to your Student class:
public Student(Student another) {
this.name = another.name;
this.marks = another.marks;
}
public Student merge(Student another) {
this.marks += another.marks;
return this;
}
Then you could rewrite the code above in this way:
Map<String, Student> result = Stream.of(listA, listB, listC)
.flatMap(List::stream)
.collect(Collectors.toMap(
Student::getName,
Student::new,
Student::merge));
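Put together, the copy-constructor variant can be run as follows. The Student class here is a sketch that assumes a name field, a marks field, and the constructors and getters used in the answer; the sample data is invented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StudentTotalsSketch {

    // Assumed shape of Student: name, marks, a copy constructor and a merge method.
    static class Student {
        private final String name;
        private int marks;
        Student(String name, int marks) {
            this.name = name;
            this.marks = marks;
        }
        Student(Student another) {
            this(another.name, another.marks);
        }
        String getName() { return name; }
        int getMarks() { return marks; }
        Student merge(Student another) {
            this.marks += another.marks;
            return this;
        }
    }

    // Sums the marks per student name across all three lists.
    static Map<String, Student> totals(List<Student> listA, List<Student> listB, List<Student> listC) {
        return Stream.of(listA, listB, listC)
                .flatMap(List::stream)
                .collect(Collectors.toMap(
                        Student::getName,   // key: the student's name
                        Student::new,       // value: a copy, so the input objects stay untouched
                        Student::merge));   // same name seen again: add the marks
    }

    public static void main(String[] args) {
        List<Student> listA = Arrays.asList(new Student("Ann", 10), new Student("Bob", 20));
        List<Student> listB = Collections.singletonList(new Student("Ann", 5));
        List<Student> listC = Collections.emptyList();
        totals(listA, listB, listC).forEach((name, s) ->
                System.out.println(name + " - " + s.getMarks()));
    }
}
```

Because the lists are flattened into one stream and matched by name, their different sizes no longer matter, which avoids the IndexOutOfBoundsException from the index-based attempt.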

Filter on map of map

I have the map of maps below and want to filter it based on a value. The result should be assigned back to the same map. Please let me know the best approach for this.
Map<String, Map<String, Employee>> employeeMap;
<
dep1, <"empid11", employee11> <"empid12",employee12>
dep2, <"empid21", employee21> <"empid22",employee22>
>
Filter: employee.getState="MI"
I tried the code below, but I was not able to access the Employee object:
currentMap = currentMap.entrySet().stream()
**.filter(p->p.getValue().getState().equals("MI"))**
.collect(Collectors.toMap(p -> p.getKey(),p->p.getValue()));
If you want to modify the map in place (and it allows modification), you can use forEach to iterate over the entries of the map, and then use removeIf on the values of each inner map to remove the employees that satisfy the predicate:
employeeMap.forEach((k, v) -> v.values().removeIf(e -> e.getState().equals("MI")));
Otherwise, you can use the toMap collector, where the function that maps the values removes the concerned employees by streaming over the entry set of each inner map. The result is assigned back to the existing variable:
employeeMap =
    employeeMap.entrySet()
        .stream()
        .collect(toMap(Map.Entry::getKey,
            e -> e.getValue().entrySet().stream()
                .filter(emp -> !emp.getValue().getState().equals("MI"))
                .collect(toMap(Map.Entry::getKey, Map.Entry::getValue))));
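Both variants can be compared side by side in a small runnable sketch. The Employee class is reduced to a single state field for illustration, and the method names are made up. Like the answer, both methods remove the employees whose state matches; invert the predicate if the matching employees should be kept instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class NestedMapFilterSketch {

    // Reduced Employee with only the field that the filter needs (an assumption).
    static class Employee {
        private final String state;
        Employee(String state) { this.state = state; }
        String getState() { return state; }
    }

    // In-place variant: removes matching employees from the existing inner maps.
    static void removeByStateInPlace(Map<String, Map<String, Employee>> employeeMap, String state) {
        employeeMap.forEach((dept, employees) ->
                employees.values().removeIf(e -> e.getState().equals(state)));
    }

    // Copying variant: builds new outer and inner maps and leaves the input untouched.
    static Map<String, Map<String, Employee>> withoutState(
            Map<String, Map<String, Employee>> employeeMap, String state) {
        return employeeMap.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        dept -> dept.getValue().entrySet().stream()
                                .filter(emp -> !emp.getValue().getState().equals(state))
                                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue))));
    }

    public static void main(String[] args) {
        Map<String, Employee> dep1 = new HashMap<>();
        dep1.put("empid11", new Employee("MI"));
        dep1.put("empid12", new Employee("OH"));
        Map<String, Map<String, Employee>> employeeMap = new HashMap<>();
        employeeMap.put("dep1", dep1);

        System.out.println(withoutState(employeeMap, "MI").get("dep1").keySet()); // [empid12]
        removeByStateInPlace(employeeMap, "MI");
        System.out.println(employeeMap.get("dep1").keySet());                     // [empid12]
    }
}
```

Note that a department whose employees are all removed stays in the outer map with an empty inner map in both variants.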
