Spring MongoDB distinct - can't get full document - spring

I have a collection in the following format:
{
id:____,
name:"Carlos",
city:"Mumbai"
},
{
id:____,
name:"Pravin",
city:"Mumbai"
},
{
id:_____,
name:"Gaurav",
city:"Ahmedabad"
}
I want the whole document, distinct by city. I tried db.collection.distinct("city"), but it returns only the distinct cities.
Current Output:
["Mumbai","Ahmedabad"]
Expected Output:
{
id:____,
name:"Carlos",
city:"Mumbai"
},
{
id:_____,
name:"Gaurav",
city:"Ahmedabad"
}
As you can see above, there is only one "Mumbai" record. This is the kind of output I need.
Does anyone know how to get the whole document with distinct in spring-mongodb?

You could try running an aggregation pipeline in which you include the other fields inside the $group pipeline stage using the $first operator. Two examples of this approach follow:
Mongo Shell:
pipeline = [
{
"$group": {
"_id": "$city",
"id": { "$first": "$_id" },
"name": { "$first": "$name" }
}
}
]
db.collection.aggregate(pipeline);
Spring Data MongoDB:
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
MongoTemplate mongoTemplate = repository.getMongoTemplate();
Aggregation agg = newAggregation(
group("city")
.first("_id").as("id")
.first("name").as("name")
);
AggregationResults<OutputType> result = mongoTemplate.aggregate(agg,
"collection", OutputType.class);
List<OutputType> mappedResult = result.getMappedResults();
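To make the semantics of $group with $first concrete, here is a small plain-Java sketch (no MongoDB involved) that keeps the first document seen for each city. The class name FirstPerCity and the map-based documents are made up for illustration. Keep in mind that in a real pipeline without a preceding $sort stage, which document counts as "first" depends on the order in which documents reach $group.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of {$group: {_id: "$city", ...: {$first: ...}}}.
class FirstPerCity {

    // Returns the first document encountered for each distinct "city" value,
    // preserving the order in which the cities first appear.
    static List<Map<String, String>> firstPerCity(List<Map<String, String>> docs) {
        Map<String, Map<String, String>> byCity = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            // putIfAbsent keeps the earlier document, which mirrors $first.
            byCity.putIfAbsent(doc.get("city"), doc);
        }
        return new ArrayList<>(byCity.values());
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
            Map.of("name", "Carlos", "city", "Mumbai"),
            Map.of("name", "Pravin", "city", "Mumbai"),
            Map.of("name", "Gaurav", "city", "Ahmedabad"));
        for (Map<String, String> doc : firstPerCity(docs)) {
            System.out.println(doc.get("name") + " / " + doc.get("city"));
        }
    }
}
```

In the shell pipeline itself, {"$first": "$$ROOT"} can be used instead of the per-field accumulators if you want the whole original document carried through the group.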

Related

Use query result as parameter for another query in Elasticsearch DSL

I'm using Elasticsearch DSL, I'm trying to use a query result as a parameter for another query like below:
{
"query": {
"bool": {
"must_not": {
"terms": {
"request_id": {
"query": {
"match": {
"processing.message": "OUT Followup Synthesis"
}
},
"fields": [
"request_id"
],
"_source": false
}
}
}
}
}
}
As you can see above, I'm trying to search for documents whose request_id is not one of the request_ids that have processing.message equal to OUT Followup Synthesis.
I'm getting an error with this query:
Error loading data [x_content_parse_exception] [1:1660] [terms_lookup] unknown field [query]
How can I achieve my goal using Elasticsearch DSL?
Original question extracted from the comments
I'm trying to fetch data with processing.message equal to 'IN Followup Sythesis' whose request_id doesn't appear in data with processing.message equal to 'OUT Followup Sythesis'. In SQL:
SELECT d FROM data d
WHERE d.processing.message = 'IN Followup Sythesis'
AND d.request_id NOT IN (SELECT request_id FROM data WHERE processing.message = 'OUT Followup Sythesis');
Answer: generally speaking, neither server-side joins nor subqueries are supported in Elasticsearch.
So you'll have to run your first query, take the retrieved IDs and put them into a second query — ideally a terms query.
Of course, this limitation can be overcome by "hijacking" a scripted metric aggregation.
Taking these 3 documents as examples:
POST reqs/_doc
{"request_id":"abc","processing":{"message":"OUT Followup Synthesis"}}
POST reqs/_doc
{"request_id":"abc","processing":{"message":"IN Followup Sythesis"}}
POST reqs/_doc
{"request_id":"xyz","processing":{"message":"IN Followup Sythesis"}}
you could run
POST reqs/_search
{
"size": 0,
"query": {
"match": {
"processing.message": "IN Followup Sythesis"
}
},
"aggs": {
"subquery_mock": {
"scripted_metric": {
"params": {
"disallowed_msg": "OUT Followup Synthesis"
},
"init_script": "state.by_request_ids = [:]; state.disallowed_request_ids = [];",
"map_script": """
def req_id = params._source.request_id;
def msg = params._source.processing.message;
if (msg.contains(params.disallowed_msg)) {
state.disallowed_request_ids.add(req_id);
// won't need this particular doc so continue looping
return;
}
if (state.by_request_ids.containsKey(req_id)) {
// there may be multiple docs under the same ID
// so concatenate them
state.by_request_ids[req_id].add(params._source);
} else {
// initialize an appendable arraylist
state.by_request_ids[req_id] = [params._source];
}
""",
"combine_script": """
state.by_request_ids.entrySet()
.removeIf(entry -> state.disallowed_request_ids.contains(entry.getKey()));
return state.by_request_ids
""",
"reduce_script": "return states"
}
}
}
}
which'd return only the correct request:
"aggregations" : {
"subquery_mock" : {
"value" : [
{
"xyz" : [
{
"processing" : { "message" : "IN Followup Sythesis" },
"request_id" : "xyz"
}
]
}
]
}
}
⚠️ This is almost guaranteed to be slow and goes against the suggested guidance of not accessing the _source field. But it also goes to show that subqueries can be "emulated".
💡 I'd recommend testing this script on a smaller set of documents before letting it target your whole index, for example by restricting it through a date range query or similar.
FYI, Elasticsearch exposes an SQL API, though it ships as part of X-Pack, so check which features your license tier covers.
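The two-step approach mentioned earlier in this answer (run the first query, collect the IDs, feed them into a second query) is essentially an anti-join done in the application. Here is a minimal plain-Java sketch of that logic over in-memory maps. In a real client each step would be an Elasticsearch request; the class and method names, and the flat "message" key, are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the two-step "query, then terms filter" approach.
class TwoStepFilter {

    // Step 1: collect the request_ids of all documents carrying the given message.
    static Set<String> idsWithMessage(List<Map<String, String>> docs, String message) {
        return docs.stream()
            .filter(d -> message.equals(d.get("message")))
            .map(d -> d.get("request_id"))
            .collect(Collectors.toSet());
    }

    // Step 2: keep documents with the wanted message whose request_id is not excluded.
    static List<Map<String, String>> wanted(List<Map<String, String>> docs,
                                            String message, Set<String> excludedIds) {
        return docs.stream()
            .filter(d -> message.equals(d.get("message")))
            .filter(d -> !excludedIds.contains(d.get("request_id")))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
            Map.of("request_id", "abc", "message", "OUT Followup Synthesis"),
            Map.of("request_id", "abc", "message", "IN Followup Sythesis"),
            Map.of("request_id", "xyz", "message", "IN Followup Sythesis"));
        Set<String> excluded = idsWithMessage(docs, "OUT Followup Synthesis");
        System.out.println(wanted(docs, "IN Followup Sythesis", excluded));
    }
}
```

Against Elasticsearch, the second step would typically be a bool query with the collected IDs placed in a must_not terms clause.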

Reducing output of GraphQL

I have set up a GraphQL-mongoose-express-apollo combo as per this guide.
When I run a query to get multiple results, is there a way to reduce the resulting array before I actually get to processing the response from the query?
Query:
query GetSomeUsers {
userMany (limit: 3){
_id
}
}
Actual output:
{
"data": {
"userMany": [
{
"_id": "5e950543cb48dbaafc60722d"
},
{
"_id": "5e950543cb48dbaafc60722e"
},
{
"_id": "5e950547cb48dbaafc60722f"
}
]
}
}
Desired output:
{
"data": {
"userMany": [
"5e950543cb48dbaafc60722d",
"5e950543cb48dbaafc60722e",
"5e950547cb48dbaafc60722f"
]
}
}
So far I have only found something that seems to be relevant in an article on GraphQL Leveler, but I don't see how it would work with graphql-compose-mongoose, as the GraphQL schema is automatically generated and there does not seem to be any place in the code to put in that LevelerObjectType in place of a GraphQLObjectType.

Spring mongodb - group operation after unwind - can not find $first or $push

I have articles and tags collections. Each article contains tags, which is an array of ObjectIds. I want to fetch the tag names as well, so I unwind (this gives me multiple rows, one per tag array entry) => lookup (joins with the tags collection) => group (combines the rows back into the original result set).
My mongodb query is as follows, which gives me correct result:
db.articles.aggregate([
{"$unwind": "$tags"},
{
"$lookup": {
"localField": "tags",
"from": "tags",
"foreignField": "_id",
"as": "materialTags"
}
},
{
"$group": {
"_id": "$_id",
"title": {"$first": "$title"},
"materialTags": {"$push": "$materialTags"}
}
}
])
My corresponding Spring code:
UnwindOperation unwindOperation = Aggregation.unwind("tags");
LookupOperation lookupOperation1 = LookupOperation.newLookup()
.from("tags")
.localField("tags")
.foreignField("_id")
.as("materialTags");
//I also want to add group operation but unable to find the proper syntax ??.
Aggregation aggregation = Aggregation.newAggregation(unwindOperation,
lookupOperation1, ??groupOperation?? );
AggregationResults<Article> resultList
= mongoTemplate.aggregate(aggregation, "articles", Article.class);
I tried to play around with the group operation, but without much luck. How can I add a group operation matching the original query?
Thanks in advance.
Group query syntax in Spring for
{
"$group": {
"_id": "$_id",
"title": {"$first": "$title"},
"materialTags": {"$push": "$materialTags"}
}
}
is
Aggregation.group("_id").first("title").as("title").push("materialTags").as("materialTags")
Final query
UnwindOperation unwindOperation = Aggregation.unwind("tags");
LookupOperation lookupOperation1 = LookupOperation.newLookup()
.from("tags")
.localField("tags")
.foreignField("_id")
.as("materialTags");
Aggregation aggregation = Aggregation.newAggregation(unwindOperation,
lookupOperation1, Aggregation.group("_id").first("title").as("title").push("materialTags").as("materialTags") );
AggregationResults<Article> resultList
= mongoTemplate.aggregate(aggregation, "articles", Article.class);
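If it helps to see what first(...) and push(...) do to the unwound rows, here is a plain-Java sketch of the same grouping over an in-memory list. It does not use Spring or MongoDB; the Row and Grouped types and the sample field values are hypothetical stand-ins for the pipeline's intermediate documents.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the $group stage that follows $unwind and $lookup.
class GroupAfterUnwind {

    // One row per (article, tag) pair, as the pipeline has it just before $group.
    static final class Row {
        final String articleId;
        final String title;
        final String tagName;

        Row(String articleId, String title, String tagName) {
            this.articleId = articleId;
            this.title = title;
            this.tagName = tagName;
        }
    }

    // Result of grouping: the first title seen plus every pushed tag.
    static final class Grouped {
        final String title;
        final List<String> materialTags = new ArrayList<>();

        Grouped(String title) {
            this.title = title;
        }
    }

    static Map<String, Grouped> groupById(List<Row> rows) {
        Map<String, Grouped> byId = new LinkedHashMap<>();
        for (Row row : rows) {
            // The title is recorded only once per article, like $first.
            Grouped g = byId.computeIfAbsent(row.articleId, id -> new Grouped(row.title));
            // Every row contributes one entry to the array, like $push.
            g.materialTags.add(row.tagName);
        }
        return byId;
    }

    public static void main(String[] args) {
        Map<String, Grouped> out = groupById(List.of(
            new Row("a1", "Intro", "java"),
            new Row("a1", "Intro", "mongodb"),
            new Row("a2", "Other", "spring")));
        out.forEach((id, g) -> System.out.println(id + " " + g.title + " " + g.materialTags));
    }
}
```

One difference from the real pipeline: $lookup writes an array into materialTags, so pushing that field produces an array of arrays, whereas the sketch flattens it to plain strings for brevity.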
For more information, see the references below:
http://www.baeldung.com/spring-data-mongodb-projections-aggregations
spring data mongodb group by
Create Spring Data Aggregation from MongoDb aggregation query
https://www.javacodegeeks.com/2016/04/data-aggregation-spring-data-mongodb-spring-boot.html

How to get selected object only from an array

I have a collection with documents of the following structure:
{
"category": "movies",
"movies": [
{
"name": "HarryPotter",
"language": "english"
},
{
"name": "Fana",
"language": "hindi"
}
]
}
I want to query with movie name="fana" and the response should be
{
"category": "movies",
"movies": [
{
"name": "Fana",
"language": "hindi"
}
]
}
How do I get the above using spring mongoTemplate?
You can try something like this.
Non-Aggregation based approach:
public MovieCollection getMoviesByName() {
// Project "category" plus only the array element whose name is "Fana".
BasicDBObject fields = new BasicDBObject("category", 1)
.append("movies", new BasicDBObject("$elemMatch", new BasicDBObject("name", "Fana")));
BasicQuery query = new BasicQuery(new BasicDBObject(), fields);
MovieCollection groupResults = mongoTemplate.findOne(query, MovieCollection.class);
return groupResults;
}
Aggregation based approach:
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
import static org.springframework.data.mongodb.core.query.Criteria.where;
public List<BasicDBObject> getMoviesByName() {
// Unwind the array, keep only the "Fana" element, then project the fields of interest.
Aggregation aggregation = newAggregation(unwind("movies"), match(where("movies.name").is("Fana")),
project(fields().and("category", "$category").and("movies", "$movies")));
AggregationResults<BasicDBObject> groupResults = mongoTemplate.aggregate(
aggregation, "movieCollection", BasicDBObject.class);
return groupResults.getMappedResults();
}
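The $unwind stage of MongoDB's aggregation framework can also be used for this. Note that the field path needs the $ prefix and that the match is case-sensitive:
db.Collection.aggregate([
{$unwind : '$movies'},
{$match : {'movies.name' : 'Fana'}}
])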
You can try the above query to get required output.
The approaches above give you a solution using aggregation and a basic query. But if you don't want to use BasicDBObject, the code below will also work:
Query query = new Query();
query.fields().elemMatch("movies", Criteria.where("name").is("Fana"));
List<Movies> movies = mongoTemplate.find(query, Movies.class);
The drawback of this query is that it may return results from several different documents, since more than one document may match the criteria. So you can add _id to the criteria like below:
Criteria criteria = Criteria.where("_id").is(movieId);
Query query = new Query().addCriteria(criteria);
query.fields().elemMatch("movies", Criteria.where("name").is("Fana"));
query.fields().exclude("_id");
List<Movies> movies = mongoTemplate.find(query, Movies.class);
I am excluding "_id" of the document in the response.
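For intuition about what an $elemMatch projection hands back, here is a plain-Java sketch of its behaviour on a single document's array: it yields at most one element, the first that matches, and the comparison is case-sensitive. The class name and the extra duplicate "Fana" entry in the usage are hypothetical, and the empty-list return is a simplification of MongoDB omitting the field when nothing matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of an $elemMatch projection on the "movies" array.
class ElemMatchProjection {

    // Returns a list holding the first movie whose "name" equals the given name,
    // or an empty list if no element matches.
    static List<Map<String, String>> firstMatching(List<Map<String, String>> movies, String name) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> movie : movies) {
            if (name.equals(movie.get("name"))) {
                result.add(movie);
                break; // only the first matching element is kept
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> movies = List.of(
            Map.of("name", "HarryPotter", "language", "english"),
            Map.of("name", "Fana", "language", "hindi"),
            Map.of("name", "Fana", "language", "dubbed"));
        System.out.println(firstMatching(movies, "Fana").size());
        System.out.println(firstMatching(movies, "fana").size());
    }
}
```

If you need every matching element rather than just the first, the unwind-and-match aggregation shown earlier is the route to take.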

Using subtract in a Spring MongoDB group aggregation

I have the following aggregation query that works when I use the command line in Mongo.
{'$group':
{ '_id':
{'serviceName': '$serviceName'},
'timeAverage':
{'$avg':
{'$subtract': ['$lastCheckTime', '$enqueuedTime']}
}
}
}
But as far as I can tell, in Spring MongoDB there is no support for doing "subtract" inside of an avg operation in a group operation.
How would I go about making this work?
You could try projecting the difference field first by using the SpEL andExpression in the projection operation and then use it in the avg accumulator in the group operation:
Aggregation agg = newAggregation(
project("serviceName")
.andExpression("lastCheckTime - enqueuedTime").as("interval"),
group("serviceName")
.avg("interval").as("timeAverage")
);
Or use the $subtract arithmetic aggregation operator, which is supported in Spring Data MongoDB as minus():
Aggregation agg = newAggregation(
project("serviceName")
.and("lastCheckTime").minus("enqueuedTime").as("interval"),
group("serviceName")
.avg("interval").as("timeAverage")
);
This translates to the following native aggregation operation:
[
{
"$project": {
"serviceName": 1,
"interval": { "$subtract":[ "$lastCheckTime", "$enqueuedTime" ] }
}
},
{
"$group": {
"_id": "$serviceName",
"timeAverage": { "$avg": "$interval" }
}
}
]
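As a sanity check on what the two stages compute, here is a plain-Java sketch of the same "subtract, then average per service" logic over an in-memory list. No MongoDB is involved; the Check class and the sample numbers in the usage are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-memory model of $project (interval) followed by $group ($avg).
class AverageInterval {

    // Stand-in for one document with the three fields used by the pipeline.
    static final class Check {
        final String serviceName;
        final long enqueuedTime;
        final long lastCheckTime;

        Check(String serviceName, long enqueuedTime, long lastCheckTime) {
            this.serviceName = serviceName;
            this.enqueuedTime = enqueuedTime;
            this.lastCheckTime = lastCheckTime;
        }
    }

    // Groups by serviceName and averages lastCheckTime - enqueuedTime within each group.
    static Map<String, Double> timeAverage(List<Check> checks) {
        return checks.stream().collect(Collectors.groupingBy(
            c -> c.serviceName,
            Collectors.averagingLong(c -> c.lastCheckTime - c.enqueuedTime)));
    }

    public static void main(String[] args) {
        Map<String, Double> avg = timeAverage(List.of(
            new Check("mail", 100L, 160L),
            new Check("mail", 100L, 140L),
            new Check("sms", 0L, 10L)));
        System.out.println(avg);
    }
}
```

The subtraction happens per document before any grouping, which is the same reason the Spring version computes "interval" in a project stage ahead of the group stage.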

Resources