Java 8 collect and change the format of the result

I have a data structure called MyPojo which has fields called time, name and timetaken (all Strings). I'm trying to do some grouping as follows:
List<MyPojo> myPojos = Arrays.asList(
        new MyPojo("2017", "ABC", "30"),
        new MyPojo("2017", "ABC", "20"),
        new MyPojo("2016", "ABC", "25"),
        new MyPojo("2017", "XYZ", "40")
);
Map<String, Map<String, Double>> resultMap = myPojos.stream()
        .collect(Collectors.groupingBy(MyPojo::getName,
                Collectors.groupingBy(MyPojo::getTime,
                        Collectors.averagingDouble(MyPojo::getTimeTakenAsDouble))));
Please note that I have a method called getTimeTakenAsDouble to convert the timetaken string to a double value.
This results in the following:
{ABC={2017=25.0, 2016=25.0}, XYZ={2017=40.0}}
However, my frontend developer wants the data in one of the following formats:
{ABC={2017=25.0, 2016=25.0}, XYZ={2017=40.0, 2016=0.0}}
or
[
  {
    "time": "2017",
    "name": "ABC",
    "avgTimeTaken": 25.0
  },
  {
    "time": "2017",
    "name": "XYZ",
    "avgTimeTaken": 40.0
  },
  {
    "time": "2016",
    "name": "ABC",
    "avgTimeTaken": 25.0
  },
  {
    "time": "2016",
    "name": "XYZ",
    "avgTimeTaken": 0.0
  }
]
I'm thinking of iterating over the resultMap again to prepare the second format. Is there any other way to handle this?

Actually, it's pretty interesting what you are trying to achieve: it's a sort of logical padding. The way I've done it is to use Collectors.collectingAndThen. Once the grouped result is there, I simply pad it with the needed data.
Notice that I'm using Sets.difference from Guava, but that can easily be replaced with a static method of your own.
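For reference, a plain-JDK stand-in for Sets.difference might look like this (a minimal sketch):

// Elements of a that are not in b, without the Guava dependency.
static <T> Set<T> difference(Set<T> a, Set<T> b) {
    Set<T> result = new HashSet<>(a);
    result.removeAll(b);
    return result;
}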
So I assume your MyPojo looks like this:
static class MyPojo {
    private final String time;
    private final String name;
    private final String timetaken;

    public MyPojo(String time, String name, String timetaken) {
        this.name = name;
        this.time = time;
        this.timetaken = timetaken;
    }

    public String getName() {
        return name;
    }

    public String getTime() {
        return time;
    }

    public String getTimetaken() {
        return timetaken;
    }

    public static double getTimeTakenAsDouble(MyPojo pojo) {
        return Double.parseDouble(pojo.getTimetaken());
    }
}
And the input data that I've checked against is:
List<MyPojo> myPojos = Arrays.asList(
        new MyPojo("2017", "ABC", "30"),
        new MyPojo("2017", "ABC", "20"),
        new MyPojo("2016", "ABC", "25"),
        new MyPojo("2017", "XYZ", "40"),
        new MyPojo("2018", "RDF", "80"));
Here is the code that does what you want:
Set<String> distinctYears = myPojos.stream().map(MyPojo::getTime).collect(Collectors.toSet());

Map<String, Map<String, Double>> resultMap = myPojos.stream()
        .collect(Collectors.groupingBy(MyPojo::getName,
                Collectors.collectingAndThen(
                        Collectors.groupingBy(MyPojo::getTime,
                                Collectors.averagingDouble(MyPojo::getTimeTakenAsDouble)),
                        map -> {
                            Set<String> localYears = map.keySet();
                            SetView<String> diff = Sets.difference(distinctYears, localYears);
                            Map<String, Double> toReturn = new HashMap<>(localYears.size() + diff.size());
                            toReturn.putAll(map);
                            diff.forEach(e -> toReturn.put(e, 0.0));
                            return toReturn;
                        }
                )));
Result of that would be:
{ABC={2016=25.0, 2018=0.0, 2017=25.0},
RDF={2016=0.0, 2018=80.0, 2017=0.0},
XYZ={2016=0.0, 2018=0.0, 2017=40.0}}
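If the frontend prefers the second, flat format, the padded resultMap from above can be flattened in one more pass. A minimal sketch, assuming a hypothetical AvgEntry holder that Jackson (or similar) can serialize:

// Hypothetical holder for one row of the flat format.
static class AvgEntry {
    final String time;
    final String name;
    final double avgTimeTaken;

    AvgEntry(String time, String name, double avgTimeTaken) {
        this.time = time;
        this.name = name;
        this.avgTimeTaken = avgTimeTaken;
    }
}

// resultMap is name -> (time -> average); emit one entry per inner pair.
List<AvgEntry> flat = resultMap.entrySet().stream()
        .flatMap(byName -> byName.getValue().entrySet().stream()
                .map(byTime -> new AvgEntry(byTime.getKey(), byName.getKey(), byTime.getValue())))
        .collect(Collectors.toList());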

Related

MongoDB sink connector: How to upsert deep objects

I have a use case of reading messages from Kafka topics and loading them into MongoDB. As part of this process, I'm also looking to handle the data-update part.
For example, consider this Kafka message:
{
  "_id": "123",
  "meta": {
    "id": "456",
    "name": "abc",
    "lastname": "xyz"
  }
}
After this is added to the MongoDB sink, consider the next message:
{
  "meta": {
    "id": "456",
    "lastname": "oxy"
  }
}
I'm expecting the sink connector to update only the lastname field without overwriting other fields, so the document should look like:
{
  "_id": "123",
  "meta": {
    "id": "456",
    "name": "abc",
    "lastname": "oxy"
  }
}
Basically, how do I achieve this upsert functionality in the MongoDB sink connector's write model strategy? Here are the custom strategy and the sink configuration:
public class UpsertAsPartOfDocumentStrategy implements WriteModelStrategy, Configurable {
    private static final String ID_FIELD_NAME = "_id";
    private boolean isPartialId = false;
    private static final String CREATE_PREFIX = "%s%s.";
    private static final String ELEMENT_NAME_PREFIX = "%s%s";
    private static final UpdateOptions UPDATE_OPTIONS = new UpdateOptions().upsert(true);
    static final String FIELD_NAME_MODIFIED_TS = "_modifiedTS";
    static final String FIELD_NAME_INSERTED_TS = "_insertedTS";

    @Override
    public WriteModel<BsonDocument> createWriteModel(SinkDocument document) {
        BsonDocument vd =
                document
                        .getValueDoc()
                        .orElseThrow(
                                () ->
                                        new DataException(
                                                "Could not build the WriteModel, the value document was missing unexpectedly"));
        BsonValue idValue = vd.get(ID_FIELD_NAME);
        if (idValue == null || !idValue.isDocument()) {
            throw new DataException(
                    "Could not build the WriteModel, the value document does not contain an _id field of"
                            + " type BsonDocument which holds the business key fields.\n\n If you are including an"
                            + " existing `_id` value in the business key then ensure `document.id.strategy.overwrite.existing=true`.");
        }
        BsonDocument businessKey = idValue.asDocument();
        if (isPartialId) {
            businessKey = flattenKeys(businessKey);
        }
        System.out.println("document" + vd);
        return new UpdateOneModel<>(businessKey, vd, UPDATE_OPTIONS);
    }
Reference: https://github.com/mongodb/mongo-kafka/blob/r1.7.0/src/main/java/com/mongodb/kafka/connect/sink/writemodel/strategy/UpdateOneBusinessKeyTimestampStrategy.java
https://www.mongodb.com/docs/drivers/go/current/fundamentals/crud/write-operations/upsert/
Sink properties
# Connection details
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
connection.uri=mongodb://<connection>
tasks.max=1
topics=topc
database=db
collection=col
# Specific global MongoDB Sink Connector configuration
document.id.strategy.overwrite.existing=true
#writemodel.strategy=com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneBusinessKeyStrategy
writemodel.strategy=custom.writestrategy.UpsertAsPartOfDocumentStrategy
document.id.strategy=com.mongodb.kafka.connect.sink.processor.id.strategy.PartialValueStrategy
document.id.strategy.partial.value.projection.list=meta.id
document.id.strategy.partial.value.projection.type=AllowList
errors.tolerance=all
errors.deadletterqueue.topic.name=error_queue
errors.deadletterqueue.context.headers.enable=true
errors.log.include.messages=true
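For the partial-update behavior described above, one approach (a sketch, not the connector's built-in behavior) is to flatten the value document into dot-notation paths and wrap them in a $set, so fields that are absent from the incoming message are left untouched:

// Hypothetical helper: flattens {"meta": {"lastname": "oxy"}} into
// {"meta.lastname": "oxy"} so $set only touches the supplied leaves.
private static void flatten(String prefix, BsonDocument doc, BsonDocument target) {
    for (Map.Entry<String, BsonValue> entry : doc.entrySet()) {
        String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
        if (entry.getValue().isDocument()) {
            flatten(path, entry.getValue().asDocument(), target);
        } else {
            target.put(path, entry.getValue());
        }
    }
}

// Inside createWriteModel, instead of passing the raw document:
BsonDocument setFields = new BsonDocument();
flatten("", vd, setFields);
BsonDocument update = new BsonDocument("$set", setFields);
return new UpdateOneModel<>(businessKey, update, UPDATE_OPTIONS);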

Java 8 Map & Stream - Sorting by value desc and group

I need to sort the response data in descending order by the summed count after grouping, in Java 8.
I have a view table query result like:
count (bigint) | category (varchar) | myEnum (int)
10             | A                  | 0
35             | B                  | 0
30             | A                  | 1
25             | C                  | 1
I have a projection interface for the view table to customize the result of the JPA queries.
public interface MyView {
    Long getCount();
    String getCategory();
    MyEnum getMyEnum();
}
And this is my DTO for the response:
public class MyResponse {
    private List<Long> count = new ArrayList<>();
    private List<String> categories = new ArrayList<>();
    private List<List<MyEnum>> myEnums = new ArrayList<>();
    // ctors, getters and setters
}
I need to group the data by category and sum the total counts, then collect the enum types in a list for each category. Accordingly, the count for category A should be 40, with enum types 0 and 1.
So, the client side needs to get a result like the following after the GET request:
{
  "count": [40, 35, 25],
  "categories": ["A", "B", "C"],
  "myEnums": [
    ["ENUM_A", "ENUM_B"],
    ["ENUM_A"],
    ["ENUM_B"]
  ]
}
This is the related function in my service:
public MyResponse foo() {
    // This list holds the MyView results.
    List<MyView> myList = myRepository.getCountView();
    Map<String, List<MyView>> myMap = myList.stream().collect(Collectors.groupingBy(MyView::getCategory));
    MyResponse response = new MyResponse();
    myMap.forEach((key, value) -> {
        response.getCategories().add(key);
        response.getCount().add(value.stream().mapToLong(MyView::getCount).sum());
        response.getMyEnums().add(value.stream().map(MyView::getMyEnum).collect(Collectors.toList()));
    });
    return response;
}
Alright, I have completed grouping by category and summing the counts, but I couldn't sort them. The result is correct, but I need to order the data by total count, descending.
I would appreciate any suggestions. Thanks!
You can map each entry in the map to a CategorySummary, then reduce to a MyResponse:
List<MyView> myList = Arrays.asList(
        new MyView(30, "B", MyEnum.ENUM_B),
        new MyView(10, "A", MyEnum.ENUM_A),
        new MyView(35, "B", MyEnum.ENUM_A),
        new MyView(25, "C", MyEnum.ENUM_B)
);

Map<String, List<MyView>> myMap = myList.stream()
        .collect(Collectors.groupingBy(MyView::getCategory, TreeMap::new, Collectors.toList()));

MyResponse myResponse = myMap.entrySet().stream()
        .map(e -> new CategorySummary(
                e.getValue().stream().mapToLong(MyView::getCount).sum(),
                e.getKey(),
                e.getValue().stream().map(MyView::getMyEnum).collect(Collectors.toList())
        ))
        .sorted(Comparator.comparing(CategorySummary::getCount).reversed())
        .reduce(new MyResponse(),
                (r, category) -> {
                    r.addCategory(category);
                    return r;
                }, (myResponse1, myResponse2) -> { // the combiner won't be invoked because it's a sequential stream
                    myResponse1.getMyEnums().addAll(myResponse2.getMyEnums());
                    myResponse1.getCount().addAll(myResponse2.getCount());
                    myResponse1.getCategories().addAll(myResponse2.getCategories());
                    return myResponse1;
                });

System.out.println(new ObjectMapper().writeValueAsString(myResponse));
System.out.println(new ObjectMapper().writeValueAsString(myResponse));
Output:
{
"count": [65, 25, 10],
"categories": ["B", "C", "A"],
"myEnums": [
["ENUM_B", "ENUM_A"],
["ENUM_B"],
["ENUM_A"]
]
}
CategorySummary:
@AllArgsConstructor
@Getter
public class CategorySummary {
    private long count;
    private String name;
    private List<MyEnum> myEnums;
}
I also had to add an additional method to the MyResponse class:
public void addCategory(CategorySummary categorySummary) {
    this.count.add(categorySummary.getCount());
    this.categories.add(categorySummary.getName());
    this.myEnums.add(categorySummary.getMyEnums());
}
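If the reduce feels heavy, a simpler variant (a sketch, using the same classes as above) sorts first and appends in encounter order:

// Sort the summaries by total count, descending, then append in order.
MyResponse response = new MyResponse();
myMap.entrySet().stream()
        .map(e -> new CategorySummary(
                e.getValue().stream().mapToLong(MyView::getCount).sum(),
                e.getKey(),
                e.getValue().stream().map(MyView::getMyEnum).collect(Collectors.toList())))
        .sorted(Comparator.comparingLong(CategorySummary::getCount).reversed())
        .forEachOrdered(response::addCategory);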

How to process a CSV file using Reactor Flux and output as JSON

I've got a CSV file which I want to process using Spring Reactor Flux.
Given a CSV file with a header, where the first two columns are fixed and there can be more than one optional data column:
Id, Name, Group, Status
6EF3C06E-6240-1A4A-17D6-27E73F0CDD31, Harlan Ferguson, xy1, true
6B261437-217C-0FDF-741A-92477EE354EC, Risa Greene, xy2, false
4FADC070-FCD0-C7E8-1963-A7FACDB6D8D1, Samson Blanchard, xy3, false
562C3486-E009-2C2D-9D3E-14355DB7D4D7, Damian Carson, xy4, true
...
...
...
I want to process the input using Flux, so that the output is:
[{
  "Id": "6EF3C06E-6240-1A4A-17D6-27E73F0CDD31",
  "Name": "Harlan Ferguson",
  "data": {
    "Group": "xy1",
    "Status": true
  }
}, {
  "Id": "6B261437-217C-0FDF-741A-92477EE354EC",
  "Name": "Risa Greene",
  "data": {
    "Group": "xy2",
    "Status": false
  }
}, {
  "Id": "4FADC070-FCD0-C7E8-1963-A7FACDB6D8D1",
  "Name": "Samson Blanchard",
  "data": {
    "Group": "xy3",
    "Status": false
  }
}, {
  "Id": "562C3486-E009-2C2D-9D3E-14355DB7D4D7",
  "Name": "Damian Carson",
  "data": {
    "Group": "xy4",
    "Status": true
  }
}]
I'm using CSVReader to stream the file and creating a Flux using:
CSVReader reader = new CSVReader(Files.newBufferedReader(file));
Flux<String[]> fluxOfCsvRecords = Flux.fromIterable(reader);
I'm coming back to Spring Reactor after a couple of years, so my understanding is a bit rusty.
Creating a Mono of the header using:
Mono<String[]> headerMono = fluxOfCsvRecords.next();
And then,
fluxOfCsvRecords.skip(1)
        .flatMap(csvRecord -> headerMono.map(header -> header[0] + " : " + csvRecord[0]))
        .subscribe(System.out::println);
This is halfway code, just to test that I'm able to combine data from the header and the rest of the flux; I expect to see:
Id : 6EF3C06E-6240-1A4A-17D6-27E73F0CDD31
Id : 6B261437-217C-0FDF-741A-92477EE354EC
Id : 4FADC070-FCD0-C7E8-1963-A7FACDB6D8D1
Id : 562C3486-E009-2C2D-9D3E-14355DB7D4D7
But my output is just:
4FADC070-FCD0-C7E8-1963-A7FACDB6D8D1 : 6EF3C06E-6240-1A4A-17D6-27E73F0CDD31
I'll appreciate if anyone can help me understand how to achieve this.
Update: I tried another approach:
Flux<String[]> take1 = fluxOfCsvRecords.take(1);
take1.flatMap(header -> fluxOfCsvRecords.map(csvRecord -> header[0] + " : " + csvRecord[0]))
        .subscribe(System.out::println);
The output is
Id : 6B261437-217C-0FDF-741A-92477EE354EC
Id : 4FADC070-FCD0-C7E8-1963-A7FACDB6D8D1
Id : 562C3486-E009-2C2D-9D3E-14355DB7D4D7
The row right after the header is missing.
Add two classes like:
public class TopJson {
    private String id; // the CSV Ids are UUID strings, so String rather than int
    private String name;
    private InnerJson data;

    public TopJson() {}

    public TopJson(String id, String name, InnerJson data) {
        this.id = id;
        this.name = name;
        this.data = data;
    }
}
class InnerJson {
    private String group;
    private String status;

    public InnerJson() {}

    public InnerJson(String group, String status) {
        this.group = group;
        this.status = status;
    }
}
Convert to the appropriate types and use them to create the objects:
// fluxOfCsvRecords already emits String[] rows, so no manual splitting is needed
fluxOfCsvRecords.skip(1)
        .map(csvRecord -> new TopJson(csvRecord[0], csvRecord[1],
                new InnerJson(csvRecord[2], csvRecord[3])))
        .collect(Collectors.toList());
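To also pair every row with the header without consuming the source twice (the cause of the missing-row problem above), one option is Reactor's switchOnFirst. A sketch, assuming the header is the first emitted row:

// The first signal carries the header; the inner flux re-emits all rows,
// so skip(1) drops the header before mapping the data rows.
Flux<Map<String, Object>> records = fluxOfCsvRecords
        .switchOnFirst((signal, flux) -> {
            String[] header = signal.get(); // first row is the header
            return flux.skip(1).map(row -> {
                Map<String, Object> json = new LinkedHashMap<>();
                json.put(header[0].trim(), row[0].trim()); // Id
                json.put(header[1].trim(), row[1].trim()); // Name
                Map<String, Object> data = new LinkedHashMap<>();
                // Remaining columns are optional and keyed by the header names.
                for (int i = 2; i < header.length && i < row.length; i++) {
                    data.put(header[i].trim(), row[i].trim());
                }
                json.put("data", data);
                return json;
            });
        });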

Aggregation for multiple fields in elasticsearch

Is there any option in Elasticsearch to use aggregation for multiple fields and get a total count?
My query is:
"SELECT COUNT(*), currency, type, status, channel FROM temp_index WHERE country='SG' AND received_time=now/d GROUP BY currency, type, status, channel"
I'm trying to implement the above in Java code using RestHighLevelClient; any suggestions or assistance would be helpful. Currently we are using the Count API:
List<Object> dashboardsDataTotal = new ArrayList<>();
String[] channelList = { "test1", "test2", "test3", "test4", "test5", "test6" };
String[] currencyList = { "SGD", "HKD", "USD", "INR", "IDR", "PHP", "CNY" };
String[] statusList = { "COMPLETED", "FAILED", "PENDING", "FUTUREPROCESSINGDATE" };
String[] paymentTypeList = { "type1", "type2" };
String[] countryList = { "SG", "HK" };
CountRequest countRequest = new CountRequest(INDEX);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
try {
    for (String country : countryList) { // per country
        Map<String, Object> dashboardsDataPerCountry = new HashMap<>();
        for (String channel : channelList) { // per channel
            Map<String, Object> channelStore = new HashMap<>();
            for (String paymentType : paymentTypeList) {
                List<Object> paymentTypeStore = new ArrayList<>();
                for (String currency : currencyList) {
                    Map<String, Object> currencyStore = new HashMap<>();
                    int receivedCount = 0;
                    for (String latestStatus : statusList) {
                        BoolQueryBuilder searchBoolQuery = QueryBuilders.boolQuery();
                        searchBoolQuery.must(QueryBuilders.termQuery("channel", channel.toLowerCase()));
                        searchBoolQuery.must(QueryBuilders.termQuery("currency", currency.toLowerCase()));
                        searchBoolQuery.must(QueryBuilders.matchPhraseQuery("source_country", country.toLowerCase()));
                        if ("FUTUREPROCESSINGDATE".equalsIgnoreCase(latestStatus)) {
                            searchBoolQuery.must(QueryBuilders.rangeQuery("processing_date")
                                    .gt(currentDateS).timeZone(getTimeZone(country)));
                        } else {
                            searchBoolQuery.must(QueryBuilders.termQuery("txn_latest_status", latestStatus.toLowerCase()));
                        }
                        searchBoolQuery.must(QueryBuilders.termQuery("paymentType", paymentType.toLowerCase()));
                        searchBoolQuery.must(QueryBuilders.rangeQuery("received_time")
                                .gte(currentDateS).lte(currentDateS).timeZone(getTimeZone(country)));
                        searchSourceBuilder.query(searchBoolQuery);
                        countRequest.source(searchSourceBuilder);
                        CountResponse countResponse = restHighLevelClient.count(countRequest, RequestOptions.DEFAULT);
                        if (!latestStatus.equals("FUTUREPROCESSINGDATE")) {
                            receivedCount += countResponse.getCount();
                        }
                        currencyStore.put(latestStatus, countResponse.getCount());
                    }
                    currencyStore.put("RECEIVED", receivedCount); // received = pending + completed + failed
                    currencyStore.put("currency", currency);
                    paymentTypeStore.add(currencyStore);
                } // per currency end
                channelStore.put(paymentType, paymentTypeStore);
            } // per paymentType end
            dashboardsDataPerCountry.put(channel, channelStore);
            dashboardsDataPerCountry.put("country", country);
        } // per channel end
        dashboardsDataTotal.add(dashboardsDataPerCountry);
    } // per country end
    restHighLevelClient.close();
} catch (IOException e) { // count() and close() throw IOException
    e.printStackTrace();
}
I'd appreciate it if someone can provide a better solution to the above.
I made use of CompositeAggregationBuilder and got the aggregated results:
CompositeAggregationBuilder compositeAgg = new CompositeAggregationBuilder("aggregate_buckets", sources);
searchSourceBuilder.aggregation(compositeAgg);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
Aggregations aggregations = searchResponse.getAggregations();
ParsedComposite parsedComposite = aggregations.get("aggregate_buckets");
List<ParsedBucket> list = parsedComposite.getBuckets();
Map<String, Object> data = new HashMap<>();
for (ParsedBucket parsedBucket : list) {
    data.clear();
    for (Map.Entry<String, Object> m : parsedBucket.getKey().entrySet()) {
        data.put(m.getKey(), m.getValue());
    }
    data.put("count", parsedBucket.getDocCount());
    System.out.println(data);
}
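The `sources` list and `searchRequest` aren't shown above; a sketch of how they might be built (field names assumed from the question):

// One terms source per GROUP BY field; size(0) because only the buckets matter.
List<CompositeValuesSourceBuilder<?>> sources = Arrays.asList(
        new TermsValuesSourceBuilder("currency").field("currency"),
        new TermsValuesSourceBuilder("type").field("paymentType"),
        new TermsValuesSourceBuilder("status").field("txn_latest_status"),
        new TermsValuesSourceBuilder("channel").field("channel"));

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder()
        .size(0)
        .query(QueryBuilders.boolQuery()
                .must(QueryBuilders.termQuery("source_country", "sg"))
                .must(QueryBuilders.rangeQuery("received_time").gte("now/d")));

SearchRequest searchRequest = new SearchRequest(INDEX).source(searchSourceBuilder);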

Convert String to Double in Spring Expression Language

I'm quite new to Spring Expression Language. I have a map which contains values as strings:
{
  "id": "1",
  "object": {
    "object1": {
      "fields": {
        "value1": {
          "value": "3"
        },
        "value2": {
          "value": "2"
        }
      }
    }
  }
}
The Spring expression that I have looks something like this:
((object['object1'].fields['value1'].value * object['object1'].fields['value2'].value) * 0.5)
My calculation method looks like this, where the SampleObject class holds the map:
public Double calculation(String expression, SampleObject sampleObject) {
    ExpressionParser parser = new SpelExpressionParser();
    Expression exp = parser.parseExpression(expression);
    Double evaluatedValue = (Double) exp.getValue(sampleObject);
    return evaluatedValue;
}
Since the value is a string, I get an error saying it cannot convert String to Double. This can be eliminated if I change the expression like this:
((new Double(object['object1'].fields['value1'].value) * new Double(object['object1'].fields['value2'].value)) * 0.5)
But I'm looking for a solution where I don't have to change the expression or the object. Is there a solution?
Thanks in advance.
You can plug in a class that supports operations on those operand types:
class StringDoubleOperatorOverloader implements OperatorOverloader {
    @Override
    public boolean overridesOperation(Operation operation, Object leftOperand,
            Object rightOperand) throws EvaluationException {
        return operation == Operation.MULTIPLY &&
                leftOperand instanceof String &&
                rightOperand instanceof String;
    }

    @Override
    public Object operate(Operation operation, Object leftOperand, Object rightOperand)
            throws EvaluationException {
        Double l = Double.valueOf((String) leftOperand);
        Double r = Double.valueOf((String) rightOperand);
        return l * r;
    }
}
Then pass a customized StandardEvaluationContext to getValue():
ExpressionParser parser = new SpelExpressionParser();
Expression exp = parser.parseExpression(expression);
StandardEvaluationContext sec = new StandardEvaluationContext();
sec.setOperatorOverloader(new StringDoubleOperatorOverloader());
Double evaluatedValue = (Double) exp.getValue(sec, new SampleObject());
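As a quick sanity check (a sketch; the string literals stand in for the map lookups in your real expression):

// "3" * "2" goes through the overloader; multiplying the result by 0.5 is plain arithmetic.
ExpressionParser parser = new SpelExpressionParser();
StandardEvaluationContext ctx = new StandardEvaluationContext();
ctx.setOperatorOverloader(new StringDoubleOperatorOverloader());
Double result = parser.parseExpression("('3' * '2') * 0.5").getValue(ctx, Double.class);
System.out.println(result); // 3.0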
Since your JSON has value1.value and value2.value as Strings, the expression throws that exception.
Either change your JSON to have decimal values:
{
  "id": "1",
  "object": {
    "object1": {
      "fields": {
        "value1": {
          "value": 3.0
        },
        "value2": {
          "value": 2.0
        }
      }
    }
  }
}
Or, in the calculation method, parse instead of casting:
public Double calculation(String expression, SampleObject sampleObject) {
    ExpressionParser parser = new SpelExpressionParser();
    Expression exp = parser.parseExpression(expression);
    Double evaluatedValue = Double.parseDouble(exp.getValue(sampleObject).toString()); // parse instead of cast
    return evaluatedValue;
}
