Flip key and value in a HashMap using Java 8 Stream

I have a Map with the following structure and I want to flip its keys and values.
Map<String, List<String>> dataMap
Sample data:
acct01: [aa, ab, ad],
acct02: [ac, ad],
acct03: [ax, ab]
I want this data to be converted to:
aa: [acct01],
ab: [acct01, acct03],
ac: [acct02],
ad: [acct01, acct02],
ax: [acct03]
I want to know if there is a Java 8 stream way to transform the Map.
My current implementation (without streams):
Map<String, List<String>> originalData = new HashMap<String, List<String>>();
originalData.put("Acct01", Arrays.asList("aa", "ab", "ad"));
originalData.put("Acct02", Arrays.asList("ac", "ad"));
originalData.put("Acct03", Arrays.asList("ax", "ab"));
System.out.println(originalData);
Map<String, List<String>> newData = new HashMap<String, List<String>>();
originalData.entrySet().forEach(entry -> {
    entry.getValue().forEach(v -> {
        if(newData.get(v) == null) {
            List<String> t = new ArrayList<String>();
            t.add(entry.getKey());
            newData.put(v, t);
        } else {
            newData.get(v).add(entry.getKey());
        }
    });
});
System.out.println(newData);
Input and output:
{Acct01=[aa, ab, ad], Acct02=[ac, ad], Acct03=[ax, ab]}
{aa=[Acct01], ab=[Acct01, Acct03], ac=[Acct02], ad=[Acct01, Acct02], ax=[Acct03]}
I am looking for a way to implement this using Stream.

Get the stream for the entry set, flatten it out into one entry per key-value pair, group by value, collect associated keys into a list.
import static java.util.Arrays.asList;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
<K, V> Map<V, List<K>> invert(Map<K, List<V>> map) {
    return map.entrySet().stream().flatMap(
        entry -> entry.getValue().stream().map(
            value -> new SimpleImmutableEntry<>(
                entry.getKey(),
                value
            )
        )
    ).collect(
        groupingBy(
            Entry::getValue,
            mapping(
                Entry::getKey,
                toList()
            )
        )
    );
}
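The approach above can be tried end to end with the sample data from the question. The following is only a sketch: the class name InvertDemo and the sample() helper are made up for illustration, and TreeMap is used purely so that the printed order is predictable.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class InvertDemo {

    // Hypothetical helper: the sample data from the question, in sorted key order.
    static Map<String, List<String>> sample() {
        Map<String, List<String>> data = new TreeMap<>();
        data.put("Acct01", Arrays.asList("aa", "ab", "ad"));
        data.put("Acct02", Arrays.asList("ac", "ad"));
        data.put("Acct03", Arrays.asList("ax", "ab"));
        return data;
    }

    // Flatten every (key, [values]) entry into (key, value) pairs,
    // then group the keys by value.
    static <K, V> Map<V, List<K>> invert(Map<K, List<V>> map) {
        return map.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> new SimpleImmutableEntry<>(e.getKey(), v)))
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Copy into a TreeMap only to get a stable print order.
        System.out.println(new TreeMap<>(invert(sample())));
        // prints {aa=[Acct01], ab=[Acct01, Acct03], ac=[Acct02], ad=[Acct01, Acct02], ax=[Acct03]}
    }
}
```

The order of accounts inside each list follows the iteration order of the input map, so with a plain HashMap as input it is not guaranteed.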

Here is a solution using StreamEx, a third-party library that extends the Java 8 Stream API:
newData = EntryStream.of(originalData).invert().flatMapKeys(k -> k.stream()).grouping();

If you are open to using a third-party library like Eclipse Collections, you can use a ListMultimap (each key can have a List of values). Multimap has flip(), so this will work:
MutableListMultimap<String, String> originalData = Multimaps.mutable.list.empty();
originalData.putAll("Acct01", Arrays.asList("aa", "ab", "ad"));
originalData.putAll("Acct02", Arrays.asList("ac", "ad"));
originalData.putAll("Acct03", Arrays.asList("ax", "ab"));
System.out.println(originalData);
MutableBagMultimap<String, String> newData = originalData.flip();
System.out.println(newData);
input: {Acct03=[ax, ab], Acct02=[ac, ad], Acct01=[aa, ab, ad]}
output: {ac=[Acct02], ad=[Acct02, Acct01], aa=[Acct01], ab=[Acct03, Acct01], ax=[Acct03]}
Please note that flip() returns a BagMultimap, where each key can have a Bag of values. A Bag is a special data structure that is unordered and allows duplicates.
Note: I am a committer for Eclipse Collections.

Your current implementation already relies on Java 8 features. The forEach method was added to a number of data structures in Java 8. While you could use streams, there would be little point: much of the advantage of streams comes from being able to execute filters, sorting, and other operations in a lazy fashion, which does not apply to key remapping.
If you really wanted to, you could sprinkle in a couple of streams by changing all .forEach calls to .stream().forEach in your example.
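If staying with forEach, the null check in the original loop can be folded into Map.computeIfAbsent, which was also added in Java 8. This is a sketch rather than a drop-in replacement: the class name, the flip and sample helpers, and the choice of TreeMap (for a stable print order) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FlipWithComputeIfAbsent {

    // Hypothetical helper: the sample data from the question.
    static Map<String, List<String>> sample() {
        Map<String, List<String>> data = new TreeMap<>();
        data.put("Acct01", Arrays.asList("aa", "ab", "ad"));
        data.put("Acct02", Arrays.asList("ac", "ad"));
        data.put("Acct03", Arrays.asList("ax", "ab"));
        return data;
    }

    static Map<String, List<String>> flip(Map<String, List<String>> original) {
        Map<String, List<String>> flipped = new TreeMap<>();
        // For every value, create its list on first sight, then append the key.
        original.forEach((key, values) ->
                values.forEach(v ->
                        flipped.computeIfAbsent(v, k -> new ArrayList<>()).add(key)));
        return flipped;
    }

    public static void main(String[] args) {
        System.out.println(flip(sample()));
        // prints {aa=[Acct01], ab=[Acct01, Acct03], ac=[Acct02], ad=[Acct01, Acct02], ax=[Acct03]}
    }
}
```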

Related

Efficient way to group by a given list based on a key and collect in same list java 8

I have the below class:
class A{
String property1;
String property2;
Double property3;
Double property4;
}
So property1 and property2 together form the key.
class Key{
String property1;
String property2;
}
I already have a list of A like below:
List<A> list=new ArrayList<>();
I want to group by using the key and add to another list of A in order to avoid having multiple items with same key in the list:
Function<A, Key> keyFunction= r-> Key.valueOf(r.getProperty1(), r.getProperty2());
But then while doing group by I have to take a sum of property3 and average of property4.
I need an efficient way to do it.
Note: I have skipped the methods of the given classes.
Collecting to a Map is unavoidable since you want to group things. A brute-force way to do that would be:
yourListOfA
.stream()
.collect(Collectors.groupingBy(
x -> new Key(x.getProperty1(), x.getProperty2()),
Collectors.collectingAndThen(Collectors.toList(),
list -> {
double first = list.stream().mapToDouble(A::getProperty3).sum();
// or any other default
double second = list.stream().mapToDouble(A::getProperty4).average().orElse(0D);
A a = list.get(0);
return new A(a.getProperty1(), a.getProperty2(), first, second);
})))
.values();
This could be slightly improved: the function passed to Collectors.collectingAndThen iterates the List twice, and iterating it only once would require a custom collector. It is not that complicated to write one.
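One possible shape for such a single-pass collector is sketched below. Everything here is an assumption for illustration: the classes A and Key are minimal stand-ins for the ones in the question (whose methods were skipped), Acc is a made-up accumulator, and the average is a plain sum divided by count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class SinglePassGrouping {

    // Hypothetical stand-in for the question's class A.
    static final class A {
        final String property1, property2;
        final double property3, property4;
        A(String p1, String p2, double p3, double p4) {
            property1 = p1; property2 = p2; property3 = p3; property4 = p4;
        }
    }

    // Hypothetical key with value semantics, required for grouping.
    static final class Key {
        final String property1, property2;
        Key(String p1, String p2) { property1 = p1; property2 = p2; }
        @Override public boolean equals(Object o) {
            return o instanceof Key
                    && Objects.equals(property1, ((Key) o).property1)
                    && Objects.equals(property2, ((Key) o).property2);
        }
        @Override public int hashCode() { return Objects.hash(property1, property2); }
    }

    // Mutable accumulator: running sum of property3, running sum and count of property4.
    static final class Acc {
        String p1, p2;
        double sum3, sum4;
        long count;
        void add(A a) {
            p1 = a.property1; p2 = a.property2;
            sum3 += a.property3; sum4 += a.property4; count++;
        }
        Acc merge(Acc other) { // only used when partial results are combined
            if (other.count > 0) { p1 = other.p1; p2 = other.p2; }
            sum3 += other.sum3; sum4 += other.sum4; count += other.count;
            return this;
        }
        A finish() { return new A(p1, p2, sum3, count == 0 ? 0d : sum4 / count); }
    }

    static Collector<A, Acc, A> sumAndAverage() {
        return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finish);
    }

    // Each group is reduced while it is being built, so no intermediate List is kept.
    static List<A> aggregate(List<A> input) {
        Map<Key, A> grouped = input.stream().collect(Collectors.groupingBy(
                a -> new Key(a.property1, a.property2),
                LinkedHashMap::new, // keeps groups in first-seen order
                sumAndAverage()));
        return new ArrayList<>(grouped.values());
    }

    // Hypothetical sample data.
    static List<A> sample() {
        return Arrays.asList(
                new A("x", "y", 1, 10), new A("x", "y", 2, 20), new A("p", "q", 5, 7));
    }

    public static void main(String[] args) {
        for (A a : aggregate(sample())) {
            System.out.println(a.property1 + "/" + a.property2
                    + " sum=" + a.property3 + " avg=" + a.property4);
        }
    }
}
```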
Try it like this:
Map<A,List<A>> map = aList
    .stream()
    .collect(Collectors
        .groupingBy(item->new A(item.property1,item.property2)));
List<A> result= map.entrySet().stream()
    .map(list->new A(list.getValue().get(0).property1,list.getValue().get(0).property2)
        .avgProperty4(list.getValue())
        .sumProperty3(list.getValue()))
    .collect(Collectors.toList());
Then create the avgProperty4 and sumProperty3 methods like this (A also needs equals and hashCode based on property1 and property2 for the grouping to work):
public A sumProperty3(List<A> a){
    this.property3 = a.stream().mapToDouble(A::getProperty3).sum();
    return this;
}
public A avgProperty4(List<A> a){
    this.property4 = a.stream().mapToDouble(A::getProperty4).average().getAsDouble();
    return this;
}
Alternatively, do both steps in one collect call (this yields a Map<A, A>; its values() are the aggregated items):
Map<A, A> resultMap = aList.stream().collect(Collectors
    .groupingBy(item -> new A(item.property1, item.property2),
        Collectors.collectingAndThen(Collectors.toList(), list ->
            new A(list.get(0).property1, list.get(0).property2)
                .avgProperty4(list).sumProperty3(list))
    )
);

java 8 list grouping with value mapping function producing list

I have a following Person class
public class Person {
public String name;
public List<Brand> brands;
//Getters
}
and a List<Person> persons (possibly with the same names). I need to group them into a Map<String, List<Brand>> with each Person's name as the key and the list of accumulated Brands as the value.
Something like this:
Map<String, List<List<String>>> collect = list.stream().collect(
groupingBy(Person::getName, mapping(Person::getBrands, toList()))
);
produces an undesired result, and I know why. Could the values somehow be flattened during grouping? Is there a way to do it right there with the Streams API?
Java 9 added the flatMapping collector specifically for this type of task:
list.stream().collect(
    groupingBy(
        Person::getName,
        flatMapping(
            p -> p.getBrands().stream(),
            toList()
        )
    )
);
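A self-contained sketch of the flatMapping approach follows. It needs Java 9 or later. The Person class here is a made-up stand-in that stores brands as plain strings, and the class and helper names are assumptions for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class FlatMappingDemo {

    // Hypothetical stand-in for the question's Person, with brands as Strings.
    static final class Person {
        private final String name;
        private final List<String> brands;
        Person(String name, List<String> brands) { this.name = name; this.brands = brands; }
        String getName() { return name; }
        List<String> getBrands() { return brands; }
    }

    // Hypothetical sample data with a repeated name.
    static List<Person> sample() {
        return Arrays.asList(
                new Person("ann", Arrays.asList("a", "b")),
                new Person("bob", Arrays.asList("c")),
                new Person("ann", Arrays.asList("d")));
    }

    // Groups by name and flattens each person's brand list into the group's list.
    static Map<String, List<String>> brandsByName(List<Person> persons) {
        return persons.stream().collect(
                Collectors.groupingBy(
                        Person::getName,
                        Collectors.flatMapping(p -> p.getBrands().stream(), Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> result = brandsByName(sample());
        System.out.println(result.get("ann")); // prints [a, b, d]
        System.out.println(result.get("bob")); // prints [c]
    }
}
```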
Guessing at the desired result, you can achieve it with just the toMap collector:
Map<String, List<String>> collect = persons.stream().collect(
toMap(
Person::getName,
Person::getBrands,
(l1, l2) -> ImmutableList.<String /*Brand*/>builder().addAll(l1).addAll(l2).build())
);
You will need to merge brands into a single List:
list.stream().collect(Collectors.toMap(
Person::getName,
Person::getBrands,
(left, right) -> {
left.addAll(right);
return left;
},
HashMap::new));
You can create a custom collector for the downstream of your groupingBy:
Collector.of(LinkedList::new,
    (list, person) -> list.addAll(person.brands),
    (lhs, rhs) -> { lhs.addAll(rhs); return lhs; })
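On Java 8, a downstream collector of this kind might be wired into groupingBy as sketched below. The Person class is a made-up stand-in with String brands, and the names flatteningBrands and brandsByName are invented for the example. Note that the combiner returns the list that received the merged elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class FlatteningCollectorDemo {

    // Hypothetical stand-in for the question's Person, with brands as Strings.
    static final class Person {
        private final String name;
        private final List<String> brands;
        Person(String name, List<String> brands) { this.name = name; this.brands = brands; }
        String getName() { return name; }
        List<String> getBrands() { return brands; }
    }

    // Downstream collector: appends every person's brands to one list per group.
    static Collector<Person, List<String>, List<String>> flatteningBrands() {
        return Collector.of(
                ArrayList::new,
                (list, person) -> list.addAll(person.getBrands()),
                (left, right) -> { left.addAll(right); return left; });
    }

    static Map<String, List<String>> brandsByName(List<Person> persons) {
        return persons.stream()
                .collect(Collectors.groupingBy(Person::getName, flatteningBrands()));
    }

    // Hypothetical sample data with a repeated name.
    static List<Person> sample() {
        return Arrays.asList(
                new Person("ann", Arrays.asList("a", "b")),
                new Person("bob", Arrays.asList("c")),
                new Person("ann", Arrays.asList("d")));
    }

    public static void main(String[] args) {
        System.out.println(brandsByName(sample()).get("ann")); // prints [a, b, d]
    }
}
```

Because each group gets a fresh ArrayList, the lists held by the Person objects are not modified.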
The open-source library StreamEx provides MoreCollectors.flatMapping:
list.stream().collect(
    groupingBy(Person::getName, MoreCollectors.flatMapping(p -> p.getBrands().stream())));

Convert ImmutableListMultimap to Map using Collectors.toMap

I would like to convert an ImmutableListMultimap<String, Character> to a Map<String, List<Character>>.
I used to do it the non-stream way, as follows:
void convertMultiMaptoList(ImmutableListMultimap<String, Character> reverseImmutableMultiMap) {
Map<String, List<Character>> result = new TreeMap<>();
for( Map.Entry<String, Character> entry: reverseImmutableMultiMap.entries()) {
String key = entry.getKey();
Character t = entry.getValue();
result.computeIfAbsent(key, x-> new ArrayList<>()).add(t);
}
//reverseImmutableMultiMap.entries().stream().collect(Collectors.toMap)
}
I was wondering how to write the same logic the Java 8 stream way (Collectors.toMap).
Please share your thoughts.
Well, there is already an asMap method that you can use to make this easier:
Builder<String, Character> builder = ImmutableListMultimap.builder();
builder.put("12", 'c');
builder.put("12", 'c');
ImmutableListMultimap<String, Character> map = builder.build();
Map<String, List<Character>> map2 = map.asMap()
.entrySet()
.stream()
.collect(Collectors.toMap(Entry::getKey, e -> new ArrayList<>(e.getValue())));
If, on the other hand, you are OK with the return type of asMap, then it's a simple method call:
ImmutableMap<String, Collection<Character>> asMap = map.asMap();
Map<String, List<Character>> result = reverseImmutableMultiMap.entries().stream()
.collect(groupingBy(Entry::getKey, TreeMap::new, mapping(Entry::getValue, toList())));
The important detail is mapping. It adapts the downstream collector (toList) so that, by applying the mapping function Entry::getValue, it collects a List<Character> instead of a List<Entry<String, Character>>.
groupingBy will group all entries by their String key.
toList will collect all values with the same key into a list.
Also, passing TreeMap::new as an argument to groupingBy makes sure you get this specific type of Map instead of the default HashMap.
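The interplay of groupingBy, TreeMap::new and mapping can be tried without Guava. In this sketch a plain list of Map.Entry objects stands in for the result of the multimap's entries() call; the class name and helpers are assumptions for the example.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class GroupEntriesDemo {

    // Hypothetical stand-in for multimap.entries(): one entry per key-value pair.
    static List<Map.Entry<String, Character>> sample() {
        List<Map.Entry<String, Character>> entries = new ArrayList<>();
        entries.add(new SimpleImmutableEntry<>("b", 'x'));
        entries.add(new SimpleImmutableEntry<>("a", 'y'));
        entries.add(new SimpleImmutableEntry<>("b", 'z'));
        return entries;
    }

    static Map<String, List<Character>> toSortedMap(Collection<Map.Entry<String, Character>> entries) {
        return entries.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey,     // group by the String key
                TreeMap::new,          // result map is a TreeMap, so keys are sorted
                Collectors.mapping(    // keep only the value of each entry
                        Map.Entry::getValue,
                        Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(toSortedMap(sample())); // prints {a=[y], b=[x, z]}
    }
}
```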

How to view features list and their importance in TokenNameFinder model in OpenNLP

I have trained a TokenNameFinder model in OpenNLP, which outputs a .bin file. Now I need to list its features with their importance.
I read the code of TokenNameFinder and NameFinderME but could not find a way to print the features. Is there any way to list all features of the model along with their importance?
Finally, I figured out a way to list the features. The getDataStructures() method of the AbstractModel class returns an array of Object instances. The second element of this array is a Map<String, Integer> whose keys are combinations of features and their values. The following code snippet accesses the features and their values:
AbstractModel maxModel = model.getArtifact("nameFinder.model");
Object[] obj = maxModel.getDataStructures();
if(obj!=null) {
Map<String, Integer> pmap = (HashMap<String, Integer>) obj[1];
Set<String> keySet = pmap.keySet();
for(String key: keySet) {
System.out.println(key +" **** "+ pmap.get(key));
}
} else {
System.out.println("obj is null." );
}

Spark RDD to update

I am loading a file from HDFS into a JavaRDD and want to update that RDD. To do that I am converting it to an IndexedRDD (https://github.com/amplab/spark-indexedrdd), but the conversion fails with a ClassCastException.
Basically, I will make key-value pairs and update the key. IndexedRDD supports updates. Is there any way to convert?
JavaPairRDD<String, String> mappedRDD = lines.flatMapToPair( new PairFlatMapFunction<String, String, String>()
{
@Override
public Iterable<Tuple2<String, String>> call(String arg0) throws Exception {
String[] arr = arg0.split(" ",2);
System.out.println( "length" + arr.length);
List<Tuple2<String, String>> results = new ArrayList<Tuple2<String, String>>();
results.addAll(results);
return results;
}
});
IndexedRDD<String,String> test = (IndexedRDD<String,String>) mappedRDD.collectAsMap();
collectAsMap() returns a java.util.Map containing all the entries from your JavaPairRDD, but it has nothing to do with Spark. That function collects the values onto one node so that you can work with them in plain Java. Therefore, you cannot cast it to IndexedRDD or any other RDD type, as it's just a normal Map.
I haven't used IndexedRDD, but from the examples you can see that you need to create it by passing a PairRDD to its constructor:
// Create an RDD of key-value pairs with Long keys.
val rdd = sc.parallelize((1 to 1000000).map(x => (x.toLong, 0)))
// Construct an IndexedRDD from the pairs, hash-partitioning and indexing
// the entries.
val indexed = IndexedRDD(rdd).cache()
So in your code it should be:
IndexedRDD<String,String> test = new IndexedRDD<String,String>(mappedRDD.rdd());
