Java Streaming: get max if no duplicates - java-8

I'm trying to write a function that takes in a Map and returns an Entry. If the entry with the max Integer value is unique, it should return that entry. However, if there are duplicate entries with the same max value, it should return a new Entry with a key of "MULTIPLE" and a value of 0. It's easy enough for me to get the max value ignoring duplicates:
public static Entry<String,Integer> getMax(Map<String,Integer> map1) {
return map1.entrySet().stream()
.max((a,b) -> a.getValue().compareTo(b.getValue()))
.get();
}
To get the behavior I described, though, the only solution I found uses one stream to check whether the max value occurs more than once and, if it does not, a second stream to fetch the entry. I'd like a solution that does both with a single stream.
Here's my little test case:
@Test
public void test1() {
Map<String,Integer> map1 = new HashMap<>();
map1.put("A", 100);
map1.put("B", 100);
map1.put("C", 100);
map1.put("D", 105);
Assert.assertEquals("D", getMax(map1).getKey());
Map<String,Integer> map2 = new HashMap<>();
map2.put("A", 100);
map2.put("B", 105);
map2.put("C", 100);
map2.put("D", 105);
Assert.assertEquals("MULTIPLE", getMax(map2).getKey());
}

This is a simple case of reduction, and you don't need any external libraries.
Map.Entry<String, Integer> max(Map<String, Integer> map) {
return map.entrySet().stream()
.reduce((e1, e2) -> {
if (e1.getValue().equals(e2.getValue())) { // compare Integer values, not object references
return new SimpleImmutableEntry<>("MULTIPLE", 0);
} else {
return Collections.max(asList(e1, e2), comparingInt(Map.Entry::getValue));
}
})
.orElse(new SimpleImmutableEntry<>("NOT_FOUND", 0));
}
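One caveat with a pairwise reduction like the one above: once a tie is replaced by a placeholder whose value is 0, a later, smaller entry can beat the placeholder (values 105, 105, 100 in that encounter order end with the 100 entry). A minimal JDK-only sketch that avoids this keeps the best entry and a tie flag in a small mutable accumulator and decides only at the end. The names UniqueMax and Acc are illustrative and not from the original answers.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Map.Entry;

class UniqueMax {
    // Mutable accumulator: the best entry seen so far and whether its value is tied.
    private static final class Acc {
        Entry<String, Integer> best;
        boolean tied;

        // Called once per stream element.
        void accept(Entry<String, Integer> e) {
            merge(e, false);
        }

        // Called only when partial results of a parallel stream are combined.
        void combine(Acc other) {
            if (other.best != null) {
                merge(other.best, other.tied);
            }
        }

        private void merge(Entry<String, Integer> candidate, boolean candidateTied) {
            int cmp = (best == null) ? 1 : candidate.getValue().compareTo(best.getValue());
            if (cmp > 0) {          // strictly larger: new unique maximum (so far)
                best = candidate;
                tied = candidateTied;
            } else if (cmp == 0) {  // same value as the current maximum: mark as tied
                tied = true;
            }
        }
    }

    static Entry<String, Integer> getMax(Map<String, Integer> map) {
        Acc acc = map.entrySet().stream().collect(Acc::new, Acc::accept, Acc::combine);
        if (acc.best == null) {
            return new SimpleImmutableEntry<>("NOT_FOUND", 0); // empty map
        }
        if (acc.tied) {
            return new SimpleImmutableEntry<>("MULTIPLE", 0);
        }
        return acc.best;
    }
}
```

With the test data from the question, this returns the "D" entry for map1 and the "MULTIPLE" entry for map2.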

Here is a solution using StreamEx:
public Entry<String, Integer> getMax(Map<String, Integer> map) {
return StreamEx.of(map.entrySet()).collect(collectingAndThen(MoreCollectors.maxAll(Map.Entry.comparingByValue()),
l -> l.size() == 1 ? l.get(0) : new AbstractMap.SimpleImmutableEntry<>("MULTIPLE", 0)));
}
Another solution is to iterate over the map twice, which potentially performs better:
public Entry<String, Integer> getMax(Map<String, Integer> map) {
int max = map.entrySet().stream().mapToInt(e -> e.getValue()).max().getAsInt();
return StreamEx.of(map.entrySet()).filter(e -> e.getValue().intValue() == max).limit(2)
.toListAndThen(l -> l.size() == 1 ? l.get(0) : new AbstractMap.SimpleImmutableEntry<>("MULTIPLE", 0));
}

Related

Java Map: group by key's attribute and max over value

I have an instance of Map<Reference, Double>. The challenge is that different key objects may refer to the same underlying object. I need to return a map of the same type as the input, grouped by that key attribute and retaining the max value for each group.
I tried by using groupingBy and maxBy but I'm stuck.
private void run () {
Map<Reference, Double> vote = new HashMap<>();
Student s1 = new Student(12L);
vote.put(new Reference(s1), 66.5);
vote.put(new Reference(s1), 71.71);
Student s2 = new Student(44L);
vote.put(new Reference(s2), 59.75);
vote.put(new Reference(s2), 64.00);
// I need to have a Collection of Reference objs related to the max value of the "vote" map
Collection<Reference> maxVote = vote.entrySet().stream().collect(groupingBy(Map.Entry.<Reference, Double>comparingByKey(new Comparator<Reference>() {
@Override
public int compare(Reference r1, Reference r2) {
return r1.getObjId().compareTo(r2.getObjId());
}
}), maxBy(Comparator.comparingDouble(Map.Entry::getValue))));
}
class Reference {
private final Student student;
public Reference(Student s) {
this.student = s;
}
public Long getObjId() {
return this.student.getId();
}
}
class Student {
private final Long id;
public Student (Long id) {
this.id = id;
}
public Long getId() {
return id;
}
}
I have an error in the maxBy argument: Comparator.comparingDouble(Map.Entry::getValue) and I don't know how to fix it. Is there a way to achieve the expected result?
You can use Collectors.toMap to get the collection of Map.Entry<Reference, Double>
Collection<Map.Entry<Reference, Double>> result = vote.entrySet().stream()
.collect(Collectors.toMap( a -> a.getKey().getObjId(), Function.identity(),
BinaryOperator.maxBy(Comparator.comparingDouble(Map.Entry::getValue)))).values();
Then stream over the values again to get a List<Reference>:
List<Reference> result = vote.entrySet().stream()
.collect(Collectors.toMap(a -> a.getKey().getObjId(), Function.identity(),
BinaryOperator.maxBy(Comparator.comparingDouble(Map.Entry::getValue))))
.values().stream().map(e -> e.getKey()).collect(Collectors.toList());
Using your approach of groupingBy and maxBy:
Comparator<Entry<Reference, Double>> c = Comparator.comparing(e -> e.getValue());
Map<Object, Optional<Entry<Reference, Double>>> map =
vote.entrySet().stream()
.collect(
Collectors.groupingBy
(
e -> ((Reference) e.getKey()).getObjId(),
Collectors.maxBy(c)));
// iterate to get the result (or invoke another stream)
for (Entry<Object, Optional<Entry<Reference, Double>>> obj : map.entrySet()) {
System.out.println("Student Id:" + obj.getKey() + ", " + "Max Vote:" + obj.getValue().get().getValue());
}
Output (For input in your question):
Student Id:12, Max Vote:71.71
Student Id:44, Max Vote:64.0
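For reference, the toMap-with-merge approach can be put into a self-contained form. This is only a sketch: the class name MaxPerStudent and the simplified Reference (holding the id directly instead of a Student) are assumptions made for brevity. Declaring the comparator as a typed local variable sidesteps the inference trouble mentioned in the question.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class MaxPerStudent {
    // Simplified stand-in for the question's Reference class.
    static final class Reference {
        private final Long studentId;
        Reference(Long studentId) { this.studentId = studentId; }
        Long getObjId() { return studentId; }
    }

    // One entry per student id: the entry with the highest value.
    static Map<Long, Map.Entry<Reference, Double>> maxByStudent(Map<Reference, Double> vote) {
        Comparator<Map.Entry<Reference, Double>> byValue = Map.Entry.comparingByValue();
        return vote.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey().getObjId(),      // group key: the student id
                Function.identity(),             // keep the whole entry
                BinaryOperator.maxBy(byValue))); // on collision keep the larger value
    }

    // Only the Reference objects that carry the per-student maximum.
    static List<Reference> maxReferences(Map<Reference, Double> vote) {
        return maxByStudent(vote).values().stream()
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }
}
```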

How to convert List<Person> into Map<String,List<Employee>>

Here is a piece of code that was written in Java 7 and that I want to convert to Java 8 using Streams and lambdas.
public static Map<String, List<Employee>> getEmployees(List<Person> personList) {
Map<String, List<Employee>> result = new HashMap<>();
for (Person person : personList) {
String[] perArr = person.getName().split("-");
List<Employee> employeeList = result.get(perArr[0]);
if (employeeList == null) {
employeeList = new ArrayList<>();
}
employeeList.add(new Employee(person.getPersonId(), perArr[1]));
result.put(perArr[0], employeeList);
}
return result;
}
You were fairly close, I would say. The problem is that you need to pass three things along to the next stage of the stream pipeline: the first and second tokens (from split("-")) and person::getPersonId. I've used a List and some casting for this purpose (you could use a Triple instead; I've heard Apache has one):
personList.stream()
.map(person -> {
String[] tokens = person.getName().split("-");
return Arrays.asList(tokens[0], tokens[1], person.getPersonId());
})
.collect(Collectors.groupingBy(
list -> (String) list.get(0),
Collectors.mapping(
list -> new Employee((Integer) list.get(2), (String) list.get(1)),
Collectors.toList())));
A straightforward way to improve your loop is to use Map.computeIfAbsent to manage creation of new map entries:
for (Person person : personList) {
String[] perArr = person.getName().split("-");
List<Employee> employeeList = result.computeIfAbsent(perArr[0], x -> new ArrayList<>());
employeeList.add(new Employee(person.getPersonId(), perArr[1]));
}
Doing this with streams is somewhat awkward because you cannot conveniently carry the result of an intermediate computation and so you would have to either complicate matters with intermediate objects or just split the string again:
import static java.util.stream.Collectors.*;
personList.stream()
.collect(groupingBy(
p -> p.getName().split("-")[0],
mapping(
p -> new Employee(p.getPersonId(), p.getName().split("-")[1]),
toList()
)
));
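The "intermediate object" alternative mentioned above can be sketched with a plain JDK pair, so the name is split only once per person. The Person and Employee classes below are minimal assumed stand-ins for the question's types, and GroupEmployees is a made-up holder class.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupEmployees {
    // Minimal stand-ins for the question's classes (assumed shapes).
    static final class Person {
        private final int personId;
        private final String name; // expected form: "group-employeeName"
        Person(int personId, String name) { this.personId = personId; this.name = name; }
        int getPersonId() { return personId; }
        String getName() { return name; }
    }

    static final class Employee {
        final int id;
        final String name;
        Employee(int id, String name) { this.id = id; this.name = name; }
    }

    static Map<String, List<Employee>> getEmployees(List<Person> personList) {
        return personList.stream()
            // carry the split tokens together with the person: key = tokens, value = person
            .map(p -> new SimpleImmutableEntry<>(p.getName().split("-"), p))
            .collect(Collectors.groupingBy(
                e -> e.getKey()[0],
                Collectors.mapping(
                    e -> new Employee(e.getValue().getPersonId(), e.getKey()[1]),
                    Collectors.toList())));
    }
}
```

Like the original loop, this throws if a name contains no "-".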

How to group objects in Java 8

WalletCreditNoteVO a1 = new WalletCreditNoteVO(1L, 1L, "A", WalletCreditNoteStatus.EXPIRED, null, null, CreditNoteType.CAMPAIGN_VOUCHER, BigDecimal.ONE, BigDecimal.ONE, "GBP");
WalletCreditNoteVO a2 = new WalletCreditNoteVO(1L, 1L, "A", WalletCreditNoteStatus.EXPIRED, null, null, CreditNoteType.CAMPAIGN_VOUCHER, BigDecimal.ONE, BigDecimal.TEN, "GBP");
WalletCreditNoteVO a3 = new WalletCreditNoteVO(2L, 1L, "A", WalletCreditNoteStatus.EXPIRED, null, null, CreditNoteType.CAMPAIGN_VOUCHER, BigDecimal.ONE, BigDecimal.ONE, "GBP");
WalletCreditNoteVO a4 = new WalletCreditNoteVO(2L, 1L, "A", WalletCreditNoteStatus.EXPIRED, null, null, CreditNoteType.CAMPAIGN_VOUCHER, BigDecimal.ONE, BigDecimal.TEN, "GBP");
final List<WalletCreditNoteVO> walletCreditNoteVOs = Lists.newArrayList(a1, a2, a3, a4);
Map<WalletCreditNoteVO, BigDecimal> collect2 = walletCreditNoteVOs.stream().collect(
groupingBy(wr -> new WalletCreditNoteVO(wr.getCreditNoteId(), wr.getWalletCustomerId(), wr.getCreditNoteTitle(),
wr.getWalletCreditNoteStatus(), wr.getCreditNoteStartDate(), wr.getCreditNoteExpiryDate(), wr.getCreditNoteType(), wr.getCreditNoteValue(), wr.getCurrency()),
mapping(WalletCreditNoteVO::getAvailableBalance,
reducing(BigDecimal.ZERO, (sum, elem) -> sum.add(elem)))));
I want the final reduction to be conditional: either the sum (as written above) or the last value in the list of BigDecimal, depending on getWalletCreditNoteStatus.
Can someone please help?
Thanks @xiumeteo. Below is the improved solution:
Function<WalletCreditNoteVO, WalletCreditNoteVO> function = wr -> new WalletCreditNoteVO(wr.getCreditNoteId(), wr.getWalletCustomerId(), wr.getCreditNoteTitle(),
wr.getWalletCreditNoteStatus(), wr.getCreditNoteStartDate(), wr.getCreditNoteExpiryDate(), wr.getCreditNoteType(), wr.getCreditNoteValue(), wr.getCurrency());
final Map<WalletCreditNoteVO, BigDecimal> collectMap =
walletCreditNoteVOs.stream()
.collect(groupingBy(function, LinkedHashMap::new, Collectors.collectingAndThen(
toList(),
(list) -> {
final List<BigDecimal> availableBalances = list.stream().map(WalletCreditNoteVO::getAvailableBalance).collect(toList());
if (list.stream().allMatch(WalletCreditNoteVO::isStatusExpired)) {
return availableBalances.stream().filter(o -> o != null).reduce((a, b) -> b).map(BigDecimal::abs).orElse(null); // avoids an NPE when no balance is present
} else {
return availableBalances.stream().filter(o -> o != null).reduce(BigDecimal.ZERO, BigDecimal::add);
}
})));
List<WalletCreditNoteVO> walletCreditNoteVOGrouped = new ArrayList<>();
for(Map.Entry<WalletCreditNoteVO, BigDecimal> entry : collectMap.entrySet()){
WalletCreditNoteVO key = entry.getKey();
key.setAvailableBalance(entry.getValue());
walletCreditNoteVOGrouped.add(key);
}
I now want to remove the for loop, so that the stream logic gives me a single list of WalletCreditNoteVO (with the value set directly on each WalletCreditNoteVO) instead of a Map with WalletCreditNoteVO as key and BigDecimal as value.
Thanks all again (I can't add code in my comments so adding it here).
I tested your case with a dummy class that resembles yours:
public static class Something{
private String name;
private Integer sum;
private boolean checker;
public Something(String name, Integer sum, boolean checker) {
this.name = name;
this.sum = sum;
this.checker = checker;
}
public String getName() {
return name;
}
public boolean isChecker() {
return checker;
}
public Integer getSum() {
return sum;
}
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Something something = (Something) o;
return new EqualsBuilder().append(getName(), something.getName()).append(getSum(), something.getSum()).isEquals();
}
@Override
public int hashCode() {
return new HashCodeBuilder(17, 37).append(getName()).append(getSum()).toHashCode();
}
}
And then I did this little test
List<Something> items = Arrays.asList(new Something("name", 10, false), new Something("name", 14, true), new Something("name", 11, false),
new Something("name", 11, false), new Something("noName", 12, false));
final Map<Something, Integer> somethingToSumOrLastElement =
items.stream()
.collect(Collectors.groupingBy(Function.identity(),
Collectors.collectingAndThen(
Collectors.toList(), // first we collect all your related items into a list
(list) -> { //this collector allow us to have a finisher, Function<List<Something>, Object>, let's define it
final List<Integer> integerStream = list.stream().map(Something::getSum).collect(Collectors.toList());
if (list.stream().allMatch(Something::isChecker)) { // we check for the method you want to check
//you have to change this depending on required logic
//for this case if that's true for every element in the list, we do the reduce by summing
return integerStream.stream().reduce(0, (sum, next) -> sum + next);
}
//if not, we just get the last element of that list
return integerStream.stream().reduce(0, (sum, next) -> next);
})));
I think this is ok, but maybe someone has a better idea on how to handle your issue.
Ping me if you need clarification :)
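The finisher idea can be condensed into a small runnable sketch. It follows the rule from the question's own version (last value if every item in the group is expired, otherwise the sum). The Note class, its fields, and the SumOrLast holder are hypothetical simplifications of WalletCreditNoteVO, and the sketch assumes balances are never null.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SumOrLast {
    // Hypothetical, stripped-down stand-in for WalletCreditNoteVO.
    static final class Note {
        final String title;
        final boolean expired;
        final BigDecimal balance;
        Note(String title, boolean expired, BigDecimal balance) {
            this.title = title;
            this.expired = expired;
            this.balance = balance;
        }
    }

    static Map<String, BigDecimal> balances(List<Note> notes) {
        return notes.stream().collect(Collectors.groupingBy(
            n -> n.title,
            LinkedHashMap::new, // keep groups in encounter order
            Collectors.collectingAndThen(Collectors.toList(), SumOrLast::finish)));
    }

    // Finisher: runs once per group, after all of its elements were collected.
    private static BigDecimal finish(List<Note> group) {
        if (group.stream().allMatch(n -> n.expired)) {
            return group.get(group.size() - 1).balance; // last value; groups are never empty
        }
        return group.stream().map(n -> n.balance).reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```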

Java 8 is not maintaining the order while grouping

I'm using Java 8 to group data, but the results are not in the order in which they were formed.
Map<GroupingKey, List<Object>> groupedResult = null;
if (!CollectionUtils.isEmpty(groupByColumns)) {
Map<String, Object> mapArr[] = new LinkedHashMap[mapList.size()];
if (!CollectionUtils.isEmpty(mapList)) {
int count = 0;
for (LinkedHashMap<String, Object> map : mapList) {
mapArr[count++] = map;
}
}
Stream<Map<String, Object>> people = Stream.of(mapArr);
groupedResult = people
.collect(Collectors.groupingBy(p -> new GroupingKey(p, groupByColumns), Collectors.mapping((Map<String, Object> p) -> p, toList())));
}
public static class GroupingKey {
private ArrayList<Object> keys;
public GroupingKey(Map<String, Object> map, List<String> cols) {
keys = new ArrayList<>();
for (String col : cols) {
keys.add(map.get(col));
}
}
// Add appropriate equals() and hashCode(); your IDE can generate these
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final GroupingKey other = (GroupingKey) obj;
if (!Objects.equals(this.keys, other.keys)) {
return false;
}
return true;
}
@Override
public int hashCode() {
int hash = 7;
hash = 37 * hash + Objects.hashCode(this.keys);
return hash;
}
@Override
public String toString() {
return keys + "";
}
public ArrayList<Object> getKeys() {
return keys;
}
public void setKeys(ArrayList<Object> keys) {
this.keys = keys;
}
}
Here I am using my GroupingKey class, built from the group-by columns that are passed in dynamically from the UX. How can I get the result of grouping by groupByColumns in sorted form?
Not maintaining the order is a property of the Map that stores the result. If you need a specific Map behavior, you need to request a particular Map implementation. E.g. LinkedHashMap maintains the insertion order:
groupedResult = people.collect(Collectors.groupingBy(
p -> new GroupingKey(p, groupByColumns),
LinkedHashMap::new,
Collectors.mapping((Map<String, Object> p) -> p, toList())));
By the way, there is no reason to copy the contents of mapList into an array before creating the Stream. You may simply call mapList.stream() to get an appropriate Stream.
Further, Collectors.mapping((Map<String, Object> p) -> p, toList()) is obsolete. p->p is an identity mapping, so there’s no reason to request mapping at all:
groupedResult = mapList.stream().collect(Collectors.groupingBy(
p -> new GroupingKey(p, groupByColumns), LinkedHashMap::new, toList()));
But even the GroupingKey is obsolete. It basically wraps a List of values, so you could just use a List as key in the first place. Lists implement hashCode and equals appropriately (but you must not modify these key Lists afterwards).
Map<List<Object>, List<Object>> groupedResult=
mapList.stream().collect(Collectors.groupingBy(
p -> groupByColumns.stream().map(p::get).collect(toList()),
LinkedHashMap::new, toList()));
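A self-contained sketch of the final form, using a List of column values as the grouping key and LinkedHashMap as the map factory so that groups appear in first-encounter order. The class name OrderedGrouping and the parameter name rows are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedGrouping {
    static Map<List<Object>, List<Map<String, Object>>> group(
            List<Map<String, Object>> rows, List<String> groupByColumns) {
        return rows.stream().collect(Collectors.groupingBy(
            // key: the row's values for the group-by columns, in column order
            row -> groupByColumns.stream().map(row::get).collect(Collectors.toList()),
            LinkedHashMap::new, // keeps keys in the order they were first seen
            Collectors.toList()));
    }
}
```

The key lists must not be modified after grouping, since their hashCode would change.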
Based on @Holger's great answer, I post this to help those who want to keep the order after grouping as well as change the mapping.
Let's simplify and suppose we have a list of persons (int age, String name, String address, etc.) and we want the names grouped by age while keeping the ages in order:
final LinkedHashMap<Integer, List<String>> map = myList
.stream()
.sorted(Comparator.comparing(p -> p.getAge())) //sort list by ages
.collect(Collectors.groupingBy(p -> p.getAge(),
LinkedHashMap::new, //keeps the order
Collectors.mapping(p -> p.getName(), //map name
Collectors.toList())));

Hadoop seems to modify my key object during an iteration over values of a given reduce call

Hadoop Version: 0.20.2 (On Amazon EMR)
Problem: I have a custom key, shown below, that I write during the map phase. During the reduce call, I do some simple aggregation on the values for a given key. The issue is that while iterating over the values in the reduce call, my key changes and I get values belonging to that new key.
My key type:
class MyKey implements WritableComparable<MyKey>, Serializable {
private MyEnum type; //MyEnum is a simple enumeration.
private TreeMap<String, String> subKeys;
MyKey() {} //for hadoop
public MyKey(MyEnum t, Map<String, String> sK) { type = t; subKeys = new TreeMap<>(sK); }
public void readFields(DataInput in) throws IOException {
Text typeT = new Text();
typeT.readFields(in);
this.type = MyEnum.valueOf(typeT.toString());
subKeys.clear();
int i = WritableUtils.readVInt(in);
while ( 0 != i-- ) {
Text keyText = new Text();
keyText.readFields(in);
Text valueText = new Text();
valueText.readFields(in);
subKeys.put(keyText.toString(), valueText.toString());
}
}
public void write(DataOutput out) throws IOException {
new Text(type.name()).write(out);
WritableUtils.writeVInt(out, subKeys.size());
for (Entry<String, String> each: subKeys.entrySet()) {
new Text(each.getKey()).write(out);
new Text(each.getValue()).write(out);
}
}
public int compareTo(MyKey o) {
if (o == null) {
return 1;
}
int typeComparison = this.type.compareTo(o.type);
if (typeComparison == 0) {
if (this.subKeys.equals(o.subKeys)) {
return 0;
}
int x = this.subKeys.hashCode() - o.subKeys.hashCode();
return (x != 0 ? x : -1);
}
return typeComparison;
}
}
Is there anything wrong with this implementation of key? Following is the code where I am facing the mixup of keys in reduce call:
reduce(MyKey k, Iterable<MyValue> values, Context context) {
Iterator<MyValue> iterator = values.iterator();
int sum = 0;
while(iterator.hasNext()) {
MyValue value = iterator.next();
// when I get here in the 2nd iteration and print k, it is different from what it was in iteration 1
sum += value.getResult();
}
//write sum to context
}
Any help in this would be greatly appreciated.
This is expected behavior (with the new API at least).
When the next method of the underlying iterator of the values Iterable is called, the next key/value pair is read from the sorted mapper/combiner output, and the framework checks that its key is still part of the same group as the previous key.
Because Hadoop reuses the objects passed to the reduce method (it just calls the readFields method on the same object), the underlying contents of the key parameter 'k' will change with each iteration of the values Iterable.
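To make the reuse concrete without a cluster, here is a plain-Java simulation of the effect. MutableKey, the hand-written iterator, and the sample keys are hypothetical stand-ins and not Hadoop classes. The only point is that a retained reference to a reused object sees later contents, while a copy taken earlier does not.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class KeyReuseDemo {
    // Hypothetical stand-in for a Writable key: one instance whose fields get overwritten.
    static final class MutableKey {
        String value;

        MutableKey copy() {
            MutableKey k = new MutableKey();
            k.value = value;
            return k;
        }
    }

    // Mimics the framework: every next() refills the shared key before returning the value.
    static Iterator<Integer> values(MutableKey shared, List<String> keys, List<Integer> vals) {
        return new Iterator<Integer>() {
            private int i = 0;

            @Override
            public boolean hasNext() { return i < vals.size(); }

            @Override
            public Integer next() {
                shared.value = keys.get(i); // the same object now holds the next key
                return vals.get(i++);
            }
        };
    }

    // Returns [contents of the copy taken in iteration 1, contents seen via the alias afterwards].
    static List<String> run() {
        MutableKey shared = new MutableKey();
        Iterator<Integer> it = values(shared, Arrays.asList("k#1", "k#2"), Arrays.asList(10, 20));
        it.next();
        MutableKey alias = shared;           // just another reference to the reused object
        MutableKey snapshot = shared.copy(); // independent copy of the current contents
        it.next();
        return Arrays.asList(snapshot.value, alias.value);
    }
}
```

If the key's contents from an earlier iteration are needed later, copy them before advancing the iterator.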
