How to extract values from nested Maps using lambda expressions? - java-8

I need to extract the foreign-exchange conversion rate from a nested map using Java 8 lambda expressions:
I was able to solve it the old-school way with nested for-each loops, but I wanted to see how it works with Java 8 lambda expressions.
E.g. I want to filter the maps inside the map.
For cmp1, fee1, Inr-Try the value present is 31, which is the desired output.
// cmp1
Map<String, Map<String, Map<String, String>>> campaigns = new HashMap<>();
Map<String, String> forexMap3_1 = new HashMap<>();
forexMap3_1.put("Eur-Try", "11");
forexMap3_1.put("Usd-Try", "21");
forexMap3_1.put("Inr-Try", "31");
Map<String, String> forexMap3_2 = new HashMap<>();
forexMap3_2.put("Eur-Try", "12");
forexMap3_2.put("Usd-Try", "22");
forexMap3_2.put("Inr-Try", "32");
Map<String, Map<String, String>> feeMap2 = new HashMap<>();
feeMap2.put("fee1", forexMap3_1);
feeMap2.put("fee2", forexMap3_2);
campaigns.put("cmp1", feeMap2);
// cmp2
Map<String, String> forexMap3_3 = new HashMap<>();
forexMap3_3.put("Eur-Try", "11");
forexMap3_3.put("Usd-Try", "21");
forexMap3_3.put("Inr-Try", "31");
Map<String, String> forexMap3_4 = new HashMap<>();
forexMap3_4.put("Eur-Try", "12");
forexMap3_4.put("Usd-Try", "22");
forexMap3_4.put("Inr-Try", "32");
Map<String, Map<String, String>> feeMap3 = new HashMap<>();
feeMap3.put("fee3", forexMap3_3);
feeMap3.put("fee4", forexMap3_4);
campaigns.put("cmp2", feeMap3);

Try this:
out.entrySet().stream()
    .filter(x -> x.getKey().equals(yourkey))
    .flatMap(x -> x.getValue().entrySet().stream())
    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

Just iterate over the campaign children:
HashMap<String, String> finalMap = new HashMap<>();
campaigns.forEach((s, stringMapMap) -> stringMapMap.forEach((s1, map) -> finalMap.putAll(map)));
System.out.println(finalMap.get("Inr-Try")); // output: 31
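For completeness, here is a stream-only sketch (assuming campaigns is declared as Map<String, Map<String, Map<String, String>>>, as in the setup above) that targets the cmp1 / fee1 / Inr-Try path directly instead of flattening every inner map, where later putAll calls can overwrite earlier values:
String rate = campaigns.entrySet().stream()
    .filter(c -> c.getKey().equals("cmp1"))          // pick the campaign
    .flatMap(c -> c.getValue().entrySet().stream())
    .filter(f -> f.getKey().equals("fee1"))          // pick the fee map
    .flatMap(f -> f.getValue().entrySet().stream())
    .filter(e -> e.getKey().equals("Inr-Try"))       // pick the currency pair
    .map(Map.Entry::getValue)
    .findFirst()
    .orElse(null);
System.out.println(rate); // 31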

Related

ElasticSearch new Java API: print created query

I am trying out the new Java Client for Elasticsearch 8.1.1.
In older versions I was able to print out the generated JSON query by using searchRequest.source().
I cannot figure out which method/service I can use to do the same with the new client.
My code looks like this:
final Query range_query = new Query.Builder()
    .range(r -> r.field("pixel_x")
        .from(String.valueOf(lookupDto.getPixel_x_min()))
        .to(String.valueOf(lookupDto.getPixel_x_max())))
    .build();
final Query bool_query = new Query.Builder().bool(t -> t.must(range_query)).build();
SearchRequest sc = SearchRequest.of(s -> s.query(bool_query).index(INDEX).size(100));
The SearchRequest object offers a source() method, but its value is null.
You can use the code below to print a query with the new Elasticsearch Java client:
Query termQuery = TermQuery.of(t -> t.field("field_name").value("search_value"))._toQuery();
StringWriter writer = new StringWriter();
JsonGenerator generator = JacksonJsonProvider.provider().createGenerator(writer);
termQuery.serialize(generator, new JacksonJsonpMapper());
generator.flush();
System.out.println(writer.toString());
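Assuming SearchRequest implements JsonpSerializable the way the query types do (an assumption worth verifying against your client version), the same approach should print the full request from the question:
StringWriter writer = new StringWriter();
JsonGenerator generator = JacksonJsonProvider.provider().createGenerator(writer);
sc.serialize(generator, new JacksonJsonpMapper()); // sc is the SearchRequest built above
generator.flush();
System.out.println(writer.toString());             // the generated query JSON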

How to add values into a list in case of collision in java 8?

I am new to Java 8. I want to do something like this:
Map<String, List<Tuple>> userTupleMap = new HashMap<>();
for (Tuple tuple : tupleList) {
    // create the list on the first occurrence of a user_id, then add the tuple
    // (a plain get() would return null here and throw a NullPointerException)
    userTupleMap.computeIfAbsent(tuple.get("user_id", String.class), k -> new ArrayList<>()).add(tuple);
}
I want to create a list of the tuples that have the same "user_id".
You can use Collectors.groupingBy from the Stream API:
Map<String, List<Tuple>> userTupleMap = tupleList.stream()
.collect(Collectors.groupingBy(tuple -> tuple.get("user_id",String.class)));
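A minimal self-contained sketch of the same idea, using a hypothetical UserEvent class in place of Tuple (which in the question presumably comes from JPA):
import java.util.*;
import java.util.stream.Collectors;

public class GroupByUser {
    // hypothetical stand-in for Tuple
    static class UserEvent {
        final String userId;
        final String action;
        UserEvent(String userId, String action) { this.userId = userId; this.action = action; }
        String getUserId() { return userId; }
    }

    public static void main(String[] args) {
        List<UserEvent> events = Arrays.asList(
                new UserEvent("u1", "login"),
                new UserEvent("u1", "click"),
                new UserEvent("u2", "login"));

        // groupingBy collects colliding keys into a List automatically
        Map<String, List<UserEvent>> byUser = events.stream()
                .collect(Collectors.groupingBy(UserEvent::getUserId));

        System.out.println(byUser.get("u1").size()); // 2
    }
}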

Convert ImmutableListMultimap to Map using Collectors.toMap

I would like to convert an ImmutableListMultimap<String, Character> to Map<String, List<Character>>.
I used to do it in the non-stream way as follows:
void convertMultiMaptoList(ImmutableListMultimap<String, Character> reverseImmutableMultiMap) {
    Map<String, List<Character>> result = new TreeMap<>();
    for (Map.Entry<String, Character> entry : reverseImmutableMultiMap.entries()) {
        String key = entry.getKey();
        Character t = entry.getValue();
        result.computeIfAbsent(key, x -> new ArrayList<>()).add(t);
    }
    // reverseImmutableMultiMap.entries().stream().collect(Collectors.toMap)
}
I was wondering how to write the same logic the Java 8 stream way (Collectors.toMap).
Please share your thoughts.
Well, there is already an asMap() method that you can use to make this easier:
Builder<String, Character> builder = ImmutableListMultimap.builder();
builder.put("12", 'c');
builder.put("12", 'c');
ImmutableListMultimap<String, Character> map = builder.build();
Map<String, List<Character>> map2 = map.asMap()
.entrySet()
.stream()
.collect(Collectors.toMap(Entry::getKey, e -> new ArrayList<>(e.getValue())));
If, on the other hand, you are OK with the return type of asMap(), then it's a simple method call:
ImmutableMap<String, Collection<Character>> asMap = map.asMap();
Alternatively, you can stream the entries and group them:
Map<String, List<Character>> result = reverseImmutableMultiMap.entries().stream()
    .collect(groupingBy(Entry::getKey, TreeMap::new, mapping(Entry::getValue, toList())));
The important detail is mapping: it adapts the downstream collector (toList) so that it collects List<Character> instead of List<Entry<String, Character>>, using the mapping function Entry::getValue.
groupingBy groups all entries by the String key.
toList collects all values with the same key into a list.
Also, passing TreeMap::new as an argument to groupingBy makes sure you get that specific type of Map instead of the default HashMap.
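As a side note, Guava can also hand you the List-typed view directly: Multimaps.asMap(ListMultimap) is declared to return Map<K, List<V>>. It is a live view, so copy it if you need an independent map; a small sketch reusing the question's variable name:
// view with the more specific List value type, no per-entry copying needed
Map<String, List<Character>> listView = Multimaps.asMap(reverseImmutableMultiMap);
// copy the mappings into a TreeMap for a sorted map as in the question
// (the values remain Guava's immutable lists)
Map<String, List<Character>> sortedCopy = new TreeMap<>(listView);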

Perform aggregation in Elasticsearch index with Spark in Java

I want to prepare a Java class that will read an index from Elasticsearch, perform aggregations using Spark and then write the results back to Elasticsearch. The target schema (in the form of StructType) is the same as the source one. My code is as follows
SparkConf conf = new SparkConf().setAppName("Aggregation").setMaster("local");
JavaSparkContext sc = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(sc);
JavaPairRDD<String, Map<String, Object>> pairRDD = JavaEsSpark.esRDD(sc, "kpi_aggregator/record");
RDD rdd = JavaPairRDD.toRDD(pairRDD);
Dataset df = sqlContext.createDataFrame(rdd, customSchema);
df.registerTempTable("data");
Dataset kpi1 = sqlContext.sql("SELECT host, SUM(bytes_uplink), SUM(bytes_downlink) FROM data GROUP BY host");
JavaEsSparkSQL.saveToEs(kpi1, "kpi_aggregator_total/record");
I am using the latest version of spark-core_2.11 and elasticsearch-spark-20_2.11. The previous code results in the following exception
java.lang.ClassCastException: scala.Tuple2 cannot be cast to org.apache.spark.sql.Row
Any ideas what I am doing wrong?
You get this exception because sqlContext.createDataFrame(rdd, customSchema) expects an RDD of your bean class (CustomSchemaBean), but you are instead passing it the result of JavaPairRDD.toRDD(pairRDD), which is an RDD<Tuple2<String, Map<String, Object>>>. You have to map your JavaPairRDD<String, Map<String, Object>> to a JavaRDD<CustomSchemaBean>:
SparkConf conf = new SparkConf().setAppName("Aggregation").setMaster("local");
JavaSparkContext sc = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(sc);
JavaRDD<CustomSchemaBean> rdd = JavaEsSpark.esRDD(sc, "kpi_aggregator/record")
    .map(tuple2 -> {
        /** transform Tuple2<String, Map<String, Object>> to CustomSchemaBean **/
        return new CustomSchemaBean(????);
    });
Dataset df = sqlContext.createDataFrame(rdd, customSchema);
df.registerTempTable("data");
Dataset kpi1 = sqlContext.sql("SELECT host, SUM(bytes_uplink), SUM(bytes_downlink) FROM data GROUP BY host");
JavaEsSparkSQL.saveToEs(kpi1, "kpi_aggregator_total/record");
Notice that I used JavaRDD rather than RDD; both approaches are legal.
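An alternative, if you would rather not write a bean class at all, is to map each Tuple2 to a Spark Row that matches customSchema and use the createDataFrame(JavaRDD<Row>, StructType) overload; a rough sketch (the field names and their order inside customSchema are assumptions here):
// uses org.apache.spark.sql.Row and org.apache.spark.sql.RowFactory
JavaRDD<Row> rowRDD = JavaEsSpark.esRDD(sc, "kpi_aggregator/record")
    .map(tuple2 -> {
        Map<String, Object> doc = tuple2._2();
        // field order must match customSchema; these names are only for illustration
        return RowFactory.create(doc.get("host"), doc.get("bytes_uplink"), doc.get("bytes_downlink"));
    });
Dataset<Row> df = sqlContext.createDataFrame(rowRDD, customSchema);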

Spark RDD to update

I am loading a file from HDFS into a JavaRDD and want to update that RDD. For that I am converting it to an IndexedRDD (https://github.com/amplab/spark-indexedrdd), but I am not able to, as I am getting a ClassCastException.
Basically I will make key-value pairs and update the key. IndexedRDD supports updates. Is there any way to convert?
JavaPairRDD<String, String> mappedRDD = lines.flatMapToPair(new PairFlatMapFunction<String, String, String>() {
    @Override
    public Iterable<Tuple2<String, String>> call(String arg0) throws Exception {
        String[] arr = arg0.split(" ", 2);
        System.out.println("length " + arr.length);
        List<Tuple2<String, String>> results = new ArrayList<Tuple2<String, String>>();
        results.add(new Tuple2<String, String>(arr[0], arr[1])); // key-value pair built from the line
        return results;
    }
});
IndexedRDD<String,String> test = (IndexedRDD<String,String>) mappedRDD.collectAsMap();
collectAsMap() returns a java.util.Map containing all the entries from your JavaPairRDD, but nothing related to Spark; that function collects the values onto one node so you can work with them in plain Java. Therefore, you cannot cast it to IndexedRDD or any other RDD type, as it's just a normal Map.
I haven't used IndexedRDD, but from the examples you can see that you need to create it by passing a PairRDD to its constructor:
// Create an RDD of key-value pairs with Long keys.
val rdd = sc.parallelize((1 to 1000000).map(x => (x.toLong, 0)))
// Construct an IndexedRDD from the pairs, hash-partitioning and indexing
// the entries.
val indexed = IndexedRDD(rdd).cache()
So in your code it should be:
IndexedRDD<String,String> test = new IndexedRDD<String,String>(mappedRDD.rdd());
