Spring Data Elasticsearch wildcard query with sort

I wrote a search query for elasticsearch:
{
"query": {
"query_string": {
"fields": ["studentName", "countryName"],
"query" : "*o*"
}
},
"sort" : [{
"studentName" : { "order": "desc" }
}]
}
This is executed for localhost:9200/myindex/mytype/_search (POST).
I get correct results for the query, sorted according to the sort clause. But when I convert this into a Spring Data query like:
@Query("{ \"query\": { \"query_string\" : { \"fields\" : [\"studentName\", \"countryName\"], \"query\":\"*?0*\"}}," +
" \"sort\" : [{ \"?1\" : { \"order\": \"?2\" }}]}")
Page<Student> freeTextSearchPortSort(String freeText, String sortBy, String sortOrder, Pageable pageable);
I always get the same result, sorted in insertion order. What do I need to do differently?

You need to create a PageRequest, which implements the Pageable interface and has a constructor that takes page, size, direction, and properties. The sort clause is dropped from the query string and the sort is passed through the Pageable instead.
Your query would look like:
@Query("{ \"query\": { \"query_string\" : { \"fields\" : [\"studentName\", \"countryName\"], \"query\":\"*?0*\"}}}")
Page<Student> freeTextSearchPortSort(String freeText, Pageable pageable);
The call site will look like this:
PageRequest pageRequest = new PageRequest(0, no_of_rec_to_be_fetched, Sort.Direction.fromString("desc"), "studentName");
freeTextSearchPortSort("someText", pageRequest);
Hope this helps!!
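A side note on the first PageRequest argument: the page index is zero-based, so page 0 is the first page. Here is a minimal sketch, using only the JDK, of how a page index and size translate into a result offset. The class and method names are hypothetical and only illustrate the arithmetic.

```java
final class PageWindow {

    // Offset of the first record on a zero-based page (hypothetical helper).
    static long offset(int page, int size) {
        if (page < 0 || size < 1) {
            throw new IllegalArgumentException("page must be >= 0 and size must be >= 1");
        }
        return (long) page * size;
    }

    public static void main(String[] args) {
        // First page of 10 records, then third page of 25 records.
        System.out.println(offset(0, 10));
        System.out.println(offset(2, 25));
    }
}
```

So a PageRequest for page 0 with size no_of_rec_to_be_fetched starts at the very first hit.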

Related

Run a subquery for each of the filtered elasticsearch documents

I have an index named employees with the following structure:
{
id: integer,
name: text,
age: integer,
cityId: integer,
resumeText: text <--------- parsed resume text
}
I want to search employees by certain criteria, e.g. age > 40, resumeText containing a specific skill, or the employee belonging to a certain city. So far I have the following query for this requirement:
{
  query: {
    bool: {
      should: [
        { term: { cityId: 2990 } },
        { match: { resumeText: "marketing" } },
        { match: { resumeText: "critical thinking" } }
      ],
      filter: {
        range: {
          age: { gte: 40 }
        }
      }
    }
  }
}
This gives me the expected results, but I also want to know which of the returned employees have a resumeText containing the mentioned skills. For example, the response should tell me that one document matched "critical thinking", another matched both skills, and a third matched none (it was returned because of the other criteria).
What changes do I need to make to get the desired results?
Can aggregation help?
Can we run a script for EACH filtered document to compute the desired result (a subquery for each document)?
Is there any other approach?

Yes, you can use an aggregation.
A filters aggregation gives you one bucket per skill, with the number of resumes matching that skill:
GET employees/_search
{
"size": 0,
"aggs" : {
"messages" : {
"filters" : {
"filters" : {
"marketing_resume_count" : { "match" : { "resumeText" : "marketing" }},
"thinking_resume_count" : { "match" : { "resumeText" : "thinking" }}
}
}
}
}
}
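To make the bucket idea concrete, here is a rough in-memory sketch in plain Java of what such a response contains: one count per named filter. It is not Elasticsearch code. The class name, the sample resumes, and the lowercase substring test (standing in for the analyzed match query) are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class FiltersAggregationSketch {

    // For each named term, counts how many texts contain it.
    // Assumption: a lowercase substring test stands in for the match query.
    static Map<String, Long> bucketCounts(List<String> resumeTexts, Map<String, String> namedTerms) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Map.Entry<String, String> filter : namedTerms.entrySet()) {
            String term = filter.getValue().toLowerCase(Locale.ROOT);
            long hits = resumeTexts.stream()
                    .filter(text -> text.toLowerCase(Locale.ROOT).contains(term))
                    .count();
            counts.put(filter.getKey(), hits);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the question.
        List<String> resumes = List.of(
                "Ten years of marketing experience",
                "Critical thinking and marketing",
                "Backend developer");
        Map<String, String> filters = new LinkedHashMap<>();
        filters.put("marketing_resume_count", "marketing");
        filters.put("thinking_resume_count", "thinking");
        System.out.println(bucketCounts(resumes, filters));
    }
}
```

Note that the buckets only give totals per skill. They do not say which individual document matched which skill.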
To extend to your use case:
You can add query section to the query as below
GET employees/_search
{
"size": 0,
"query":{
"match":{
"region":"AM"
}
},
"aggs" : {
"messages" : {
"filters" : {
"filters" : {
"marketing_resume_count" : { "match" : { "resumeText" : "marketing" }},
"thinking_resume_count" : { "match" : { "resumeText" : "thinking" }}
}
}
}
}
}
You can use a range query to handle gte and lte conditions, such as the age filter in your query. It can be used in place of the query section above.

Elasticsearch case insensitive nested query string

I haven't defined any analyzers and assume it's using the standard analyzer
ES version : 5.6
Lucene_version : 6.6.1
A normal query string results in a case-insensitive search, but if the query string is put inside a nested query, the search becomes case sensitive.
Non-nested query string
GET articles/_search
{
  query : {
    bool : {
      must : [ {
        query_string : {
          query : "*ellow*"
        }
      } ]
    }
  }
}
Nested query string, which doesn't return case-insensitive matches
GET articles/_search
{
  query : {
    bool : {
      must : [ {
        nested : {
          path : "comments",
          query : {
            query_string : {
              query : "*ellow*"
            }
          }
        }
      } ]
    }
  }
}
Any help is greatly appreciated.
Thanks

Springdata mongodb aggregation match

After asking a question to understand the aggregation framework in MongoDB a bit better, I finally found a way to do the aggregation I need (thanks to a StackExchange user).
So basically here is a document from my collection:
{
"_id" : ObjectId("s4dcsd5s4d6c54s6d"),
"items" : [
{
type : "TYPE_1",
text : "blablabla"
},
{
type : "TYPE_2",
text : "blablabla"
},
{
type : "TYPE_3",
text : "blablabla"
},
{
type : "TYPE_1",
text : "blablabla"
},
{
type : "TYPE_2",
text : "blablabla"
},
{
type : "TYPE_1",
text : "blablabla"
}
]
}
The idea was to be able to filter only some elements of my collections (avoiding Type 2 and 3). In fact I have more than 30 types and 6 are not allowed but for simplicity I made this example.
So the aggregation command in command line is this one:
db.history.aggregate([{
$match: {
_id: ObjectId("s4dcsd5s4d6c54s6d")
}
}, {
$unwind: '$items'
}, {
$match: {
'items.type': { '$nin': [ "TYPE_2" , "TYPE_3"] }
}
},
{ $limit: 10 }
]);
With this I am able to retrieve the first 10 items of this document whose type is neither TYPE_2 nor TYPE_3.
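For readers less familiar with the pipeline, here is a rough in-memory equivalent of the $unwind, $match with $nin, and $limit stages, written with plain Java streams. It is only a sketch: the Item class is a hypothetical stand-in for one array element, and the skip parameter is an addition that the shell command above does not use.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class UnwindPipelineSketch {

    // Hypothetical stand-in for one element of the "items" array.
    static final class Item {
        final String type;
        final String text;

        Item(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Mirrors $unwind + $match($nin) + $skip + $limit for the items of one document.
    static List<Item> visibleItems(List<Item> items, Set<String> ignoredTypes, long skip, long limit) {
        return items.stream()                                       // $unwind: one entry per array element
                .filter(item -> !ignoredTypes.contains(item.type)) // $match with $nin
                .skip(skip)                                         // $skip
                .limit(limit)                                       // $limit
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("TYPE_1", "a"), new Item("TYPE_2", "b"), new Item("TYPE_3", "c"),
                new Item("TYPE_1", "d"), new Item("TYPE_2", "e"), new Item("TYPE_1", "f"));
        List<Item> kept = visibleItems(items, Set.of("TYPE_2", "TYPE_3"), 0, 10);
        kept.forEach(item -> System.out.println(item.type + " " + item.text));
    }
}
```

Stages run in the order they are listed, so for paging the skip normally comes before the limit. Limiting to 3 first and then skipping 3 would leave nothing.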
However, when I use Spring Data there is no output. I looked at the example to build mine, but it's still not working.
So I did:
Aggregation aggregation = newAggregation(
match(Criteria.where("id").is(myID)),
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
AggregationResults<PersonnalHistory> results = mongAccess.getOperation().aggregate(query,
"items", PersonnalHistory.class);
PersonnalHistory is marked with the annotation @Document(collection = "history"), and its id field with the @Id annotation
ignoreditemstype is a list containing TYPE_2 and TYPE_3
Here is what I have in the toString method of aggregation:
{
"aggregate" : "__collection__" ,
"pipeline" : [
{ "$match": { "id" : "s4dcsd5s4d6c54s6d"} },
{ "$unwind": "$items"},
{ "$match": { "items.type": { "$nin" : [ "TYPE_2" , "TYPE_3" ] } } },
{ "$limit" : 3},
{ "$skip" : 0 }
]
}
I tried a lot of stuff (to have at least an answer :) ) like removing id or the nin:
aggregation = newAggregation(
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
aggregation = newAggregation(
match(Criteria.where("id").is(myid)),
unwind("items")
);
For information when I do a simple query like:
query.addCriteria(Criteria.where("id").is(myID));
My document is returned. However, it has thousands of items, so I only want the first 15 (which are in fact the 15 most recently added).
Do you see what I am doing wrong?
It looks like you are passing a plain String where an ObjectId is expected, and matching on id rather than _id:
Aggregation aggregation = newAggregation(
match(Criteria.where("_id").is(new ObjectId(myID))),
unwind("items"),
match(Criteria.where("items.type").nin(ignoreditemstype)),
limit(3),
skip(offsetLong)
);
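As for why it works with the simple query: my understanding is that for a plain Query, Spring Data uses the entity mapping to translate the id property to _id and to convert the String to an ObjectId, while an aggregation run against a collection name gets no such conversion, so the field name and value must be supplied as they are stored.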
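One practical caveat when converting by hand: an ObjectId is 12 bytes, written as 24 hexadecimal characters, and the ObjectId String constructor rejects anything else. A minimal pre-check using only the JDK could look like the sketch below. The class and method names are hypothetical, and the first sample value is a made-up but well-formed id.

```java
import java.util.regex.Pattern;

final class ObjectIdHexCheck {

    // An ObjectId is 12 bytes, written as 24 hexadecimal characters.
    private static final Pattern HEX_24 = Pattern.compile("[0-9a-fA-F]{24}");

    // True if the candidate has the textual shape of an ObjectId (hypothetical helper).
    static boolean looksLikeObjectId(String candidate) {
        return candidate != null && HEX_24.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeObjectId("507f1f77bcf86cd799439011"));
        System.out.println(looksLikeObjectId("s4dcsd5s4d6c54s6d"));
    }
}
```

The placeholder id used in the question would not pass this check, so the real id has to be used when trying the answer's code.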

How to get selected object only from an array

I have a collection with documents of the following structure:
{
"category": "movies",
"movies": [
{
"name": "HarryPotter",
"language": "english"
},
{
"name": "Fana",
"language": "hindi"
}
]
}
I want to query with movie name="fana" and the response should be
{
"category": "movies",
"movies": [
{
"name": "Fana",
"language": "hindi"
}
]
}
How do I get the above using spring mongoTemplate?
You can try something like this.
Non-Aggregation based approach:
public MovieCollection getMoviesByName() {
BasicDBObject fields = new BasicDBObject("category", 1).append("movies", new BasicDBObject("$elemMatch", new BasicDBObject("name", "Fana").append("size", new BasicDBObject("$lt", 3))));
BasicQuery query = new BasicQuery(new BasicDBObject(), fields);
MovieCollection groupResults = mongoTemplate.findOne(query, MovieCollection.class);
return groupResults;
}
Aggregation based approach:
import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
import static org.springframework.data.mongodb.core.query.Criteria.where;
public List<BasicDBObject> getMoviesByName() {
Aggregation aggregation = newAggregation(unwind("movies"), match(where("movies.name").is("Fana").and("movies.size").lt(1)),
project(fields().and("category", "$category").and("movies", "$movies")));
AggregationResults<BasicDBObject> groupResults = mongoTemplate.aggregate(
aggregation, "movieCollection", BasicDBObject.class);
return groupResults.getMappedResults();
}
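The $unwind stage of the MongoDB aggregation framework can be used for this.
db.Collection.aggregate([
{$unwind : '$movies'},
{$match : {'movies.name' : 'Fana'}}
])
You can try the above query to get the required output. Note that the match is case sensitive, so the name must be written as it is stored.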
The approaches above use aggregation and a basic query. If you don't want to use BasicDBObject, the code below also works:
Query query = new Query();
query.fields().elemMatch("movies", Criteria.where("name").is("Fana"));
List<Movies> movies = mongoTemplate.find(query, Movies.class);
The drawback of this query is that it may return results from several documents, since more than one document may match the criteria. To restrict it to a single document, add _id to the criteria like below:
Criteria criteria = Criteria.where("_id").is(movieId);
Query query = new Query().addCriteria(criteria);
query.fields().elemMatch("movies", Criteria.where("name").is("Fana"));
query.fields().exclude("_id");
List<Movies> movies = mongoTemplate.find(query, Movies.class);
Here I am excluding the "_id" of the document from the response.
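For intuition about what an $elemMatch projection hands back: it keeps only the first array element that satisfies the condition. Below is a small in-memory sketch of that behaviour in plain Java. The class name and the map-based movie representation are assumptions for illustration, not MongoDB or Spring code.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class ElemMatchProjectionSketch {

    // Returns the first array element whose "name" equals the wanted name, if any.
    static Optional<Map<String, String>> firstMatch(List<Map<String, String>> movies, String name) {
        return movies.stream()
                .filter(movie -> name.equals(movie.get("name")))
                .findFirst();
    }

    public static void main(String[] args) {
        List<Map<String, String>> movies = List.of(
                Map.of("name", "HarryPotter", "language", "english"),
                Map.of("name", "Fana", "language", "hindi"));
        System.out.println(firstMatch(movies, "Fana").isPresent());
        System.out.println(firstMatch(movies, "fana").isPresent());
    }
}
```

The comparison is exact, so "fana" does not find the element stored as "Fana". The same applies to the equality criteria in the queries above.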

Spring Data Elasticsearch to fetch Documents between two dates generates wrong query

I am trying to fetch documents whose date is greater or less than a specified date.
I am using the query builder below for this purpose.
QueryBuilder queryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.rangeQuery("date")
.gt("2015-06-25T00:00:00")
.lt("2015-06-25T00:00:00"));
The query generated from the above querybuilder is like this.
{
"bool" : {
"must" : [ {
"range" : {
"date" : {
"from" : "2015-06-25T00:00:00",
"to" : "2015-06-25T00:00:00",
"include_lower" : false,
"include_upper" : false
}
}
} ]
}
}
Even when I use the gt and lt functions of the range query, the query is generated with from and to.
How can I get a query generated like this instead?
{
"bool" : {
"must" : [ {
"range" : {
"date" : {
"gt" : "2015-06-25T00:00:00",
"lt" : "2015-06-25T00:00:00",
"include_lower" : false,
"include_upper" : false
}
}
} ]
}
}
This is the test class I have written.
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(classes = { ElasticSearchConfiguration.class }, loader = AnnotationConfigContextLoader.class)
public class ElasticSearchTest {
    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;
    @Autowired
    private Client client;

    @Test
    public void testAggregation() {
        // Inclusive date range on the receiptdate field.
        QueryBuilder querybuilder = QueryBuilders.boolQuery()
                .must(QueryBuilders.rangeQuery("receiptdate").gte("2015-06-25T00:00:00").lte("2015-07-25T00:00:00"));
        final SearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(querybuilder)
                .build();
        final List<Test> records = elasticsearchTemplate.queryForList(searchQuery, Test.class);
    }
}
Any suggestions on how to achieve this in Spring Data Elasticsearch would be helpful.
Your query would not return any results, since you're looking for dates strictly greater than and strictly less than the same date. You need to use gte and lte instead:
QueryBuilder queryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.rangeQuery("date")
.gte("2015-06-25T00:00:00")
.lte("2015-06-25T00:00:00"));
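The difference between exclusive and inclusive bounds is easy to check outside Elasticsearch. The sketch below uses only java.time to compare the two kinds of bounds. The class and method names are hypothetical and it says nothing about how Elasticsearch evaluates the range internally.

```java
import java.time.LocalDateTime;

final class RangeBoundsCheck {

    // Exclusive bounds, like gt / lt.
    static boolean matchesExclusive(LocalDateTime value, LocalDateTime lower, LocalDateTime upper) {
        return value.isAfter(lower) && value.isBefore(upper);
    }

    // Inclusive bounds, like gte / lte.
    static boolean matchesInclusive(LocalDateTime value, LocalDateTime lower, LocalDateTime upper) {
        return !value.isBefore(lower) && !value.isAfter(upper);
    }

    public static void main(String[] args) {
        LocalDateTime bound = LocalDateTime.parse("2015-06-25T00:00:00");
        // Same value used as lower bound, upper bound, and candidate.
        System.out.println(matchesExclusive(bound, bound, bound));
        System.out.println(matchesInclusive(bound, bound, bound));
    }
}
```

With identical lower and upper bounds, the inclusive version still only matches that exact instant. To cover a whole day or longer, the upper bound has to be moved, as in the test class from the question.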
The official parameters of the range query are gt, gte, lt and lte.
The from, to, include_lower and include_upper parameters are old, deprecated parameters, which the RangeQueryBuilder is still using but which can (and will) be removed at any time.
Just know that:
from + include_lower: false is equivalent to gt
from + include_lower: true is equivalent to gte
to + include_upper: false is equivalent to lt
to + include_upper: true is equivalent to lte
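The four equivalences above can be written down as a small translation function. This is only a sketch in plain Java with hypothetical names, not part of the Elasticsearch client. It turns the legacy from/to/include_lower/include_upper form into the gt/gte/lt/lte form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LegacyRangeParams {

    // Translates legacy range parameters into their gt/gte/lt/lte equivalents.
    // A null bound means that side of the range is open.
    static Map<String, Object> toModern(Object from, Object to, boolean includeLower, boolean includeUpper) {
        Map<String, Object> range = new LinkedHashMap<>();
        if (from != null) {
            range.put(includeLower ? "gte" : "gt", from);
        }
        if (to != null) {
            range.put(includeUpper ? "lte" : "lt", to);
        }
        return range;
    }

    public static void main(String[] args) {
        // The generated query from the question: both bounds exclusive.
        System.out.println(toModern("2015-06-25T00:00:00", "2015-06-25T00:00:00", false, false));
    }
}
```

In other words, the generated from/to query from the question already expresses the gt/lt query that was asked for.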

Resources