How do I write a search query in Elasticsearch? - spring

I use the Grails ElasticSearch Plugin and want to use the following query:
"bool" : {
"must" : {
"term" : { "user" : "kimchy" }
},
"must_not" : {
"range" : {
"age" : { "from" : 10, "to" : 20 }
}
},
"should" : [
{
"term" : { "tag" : "wow" }
},
{
"term" : { "tag" : "elasticsearch" }
}
],
"minimum_should_match" : 1,
"boost" : 1.0
}
Using the groovy api from the Grails plugin I would write something like:
def res = userAgentIdentService.search() {
"bool" {
"must" {
term("user" : "kimchy" )
}
"must_not" {
"range" {
age("from" : 10, "to" : 20 }
}
}
"should" : [
{
term( "tag" : "wow" )
}
{
term("tag" : "elasticsearch" )
}
]
"minimum_should_match" = 1
"boost" = 1.0
}
}
My query is not working.
Where and how do I define minimum_should_match?
How do I write the "should" : [ ... ] square-bracket notation in the Grails / Groovy style?

I think you're missing a couple of JSON levels in your search request. I don't think you can use the query without specifying that it's a query (it could just as well be a filter, or something else). Have a look at this example from the Groovy API reference:
def search = node.client.search {
indices "test"
types "type1"
source {
query {
terms(test: ["value1", "value2"])
}
}
}
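For comparison, the request body from the question can be sketched as a plain Python dict, with the bool clause nested under a top-level "query" key. This only illustrates the JSON nesting and is not the Grails plugin's API; the helper name build_bool_query is hypothetical. Note that minimum_should_match and boost sit directly inside bool, as siblings of must and should.

```python
import json


def build_bool_query(user, tags, age_from, age_to):
    """Hypothetical helper: wrap the question's bool query in a "query" key."""
    return {
        "query": {
            "bool": {
                "must": {"term": {"user": user}},
                # "from"/"to" are copied from the question's range clause.
                "must_not": {"range": {"age": {"from": age_from, "to": age_to}}},
                # A JSON array of clauses maps to a plain list.
                "should": [{"term": {"tag": tag}} for tag in tags],
                # Both settings belong to the bool query itself.
                "minimum_should_match": 1,
                "boost": 1.0,
            }
        }
    }


body = build_bool_query("kimchy", ["wow", "elasticsearch"], 10, 20)
print(json.dumps(body, indent=2))
```

Printing the dict as JSON makes it easy to compare the nesting with what the Groovy closure is expected to produce.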

Related

Elasticsearch search by keywords and boost

I'm using Spring Boot 2.0.5, Spring Data Elasticsearch 3.1.0 and Elasticsearch 6.4.2
I have loaded ElasticSearch with a set of articles. For each article, I have a keywords field with a string list of keywords e.g.
"keywords": ["Football", "Barcelona", "Cristiano Ronaldo", "Real Madrid", "Zinedine Zidane"],
For each user using the application, they can specify their keyword preferences with a weight factor.
e.g.
User 1:
keyword: Football, weight:3.0
keyword: Tech, weight:1.0
keyword: Health, weight:2.0
What I would like to do is find articles based on their keyword preferences and display them based on their weight factor preference (I think this relates to elastic search boost) and sort by latest article time.
This is what I have so far (only for one keyword):
public Page<Article> getArticles(String keyword, float boost, Pageable pageable) {
SearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(QueryBuilders.matchQuery("keywords", keyword).boost(boost))
.build();
return articleRepository.search(searchQuery);
}
As a user may have n number of keyword preferences, what would I need to change in the above code to support this?
Any suggestions would be highly appreciated.
Solution
OK, I enabled logging so I could see the Elasticsearch query being produced. Then I updated the getArticles method to the following:
public Page<Article> getArticles(List<Keyword> keywords, Pageable pageable) {
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
List<FilterFunctionBuilder> functions = new ArrayList<FilterFunctionBuilder>();
for (Keyword keyword : keywords) {
queryBuilder.should(QueryBuilders.termsQuery("keywords", keyword.getKeyword()));
functions.add(new FunctionScoreQueryBuilder.FilterFunctionBuilder(
QueryBuilders.termQuery("keywords", keyword.getKeyword()),
ScoreFunctionBuilders.weightFactorFunction(keyword.getWeight())));
}
FunctionScoreQueryBuilder functionScoreQueryBuilder = QueryBuilders.functionScoreQuery(queryBuilder,
functions.toArray(new FunctionScoreQueryBuilder.FilterFunctionBuilder[functions.size()]));
NativeSearchQueryBuilder searchQuery = new NativeSearchQueryBuilder();
searchQuery.withQuery(functionScoreQueryBuilder);
searchQuery.withPageable(pageable);
// searchQuery.withSort(SortBuilders.fieldSort("createdDate").order(SortOrder.DESC));
return articleRepository.search(searchQuery.build());
}
This produces the following elastic search query:
{
"from" : 0,
"size" : 20,
"query" : {
"function_score" : {
"query" : {
"bool" : {
"should" : [
{
"terms" : {
"keywords" : [
"Football"
],
"boost" : 1.0
}
},
{
"terms" : {
"keywords" : [
"Tech"
],
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"functions" : [
{
"filter" : {
"term" : {
"keywords" : {
"value" : "Football",
"boost" : 1.0
}
}
},
"weight" : 3.0
},
{
"filter" : {
"term" : {
"keywords" : {
"value" : "Tech",
"boost" : 1.0
}
}
},
"weight" : 1.0
}
],
"score_mode" : "multiply",
"max_boost" : 3.4028235E38,
"boost" : 1.0
}
},
"version" : true
}
What you are looking for is the function_score query. Something along the lines of
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{"term":{"keywords":"Football"}},
{"term":{"keywords":"Tech"}},
{"term":{"keywords":"Health"}}
]
}
},
"functions": [
{"filter":{"term":{"keywords":"Football"}},"weight": 3},
{"filter":{"term":{"keywords":"Tech"}},"weight": 1},
{"filter":{"term":{"keywords":"Health"}},"weight": 2}
]
}
}
}
See here for API help https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-compound-queries.html#java-query-dsl-function-score-query
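As a rough sketch of how that body scales to any number of preferences, the same structure can be generated from a keyword-to-weight mapping. This is plain Python producing the JSON body, not the Spring Data API; the helper name build_function_score is hypothetical, and the field name keywords is taken from the question.

```python
import json


def build_function_score(preferences, field="keywords"):
    """Hypothetical helper: one should clause and one weight function per keyword."""
    should = [{"term": {field: keyword}} for keyword in preferences]
    functions = [
        {"filter": {"term": {field: keyword}}, "weight": weight}
        for keyword, weight in preferences.items()
    ]
    return {
        "query": {
            "function_score": {
                "query": {"bool": {"should": should}},
                "functions": functions,
            }
        }
    }


prefs = {"Football": 3.0, "Tech": 1.0, "Health": 2.0}
body = build_function_score(prefs)
print(json.dumps(body, indent=2))
```

The logged query above shows "score_mode" : "multiply", so when an article matches several keywords the matching weights are multiplied together. If adding them up fits the use case better, "score_mode": "sum" could be set inside function_score; whether that is preferable here is an assumption to check against real results.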

What's the functionality of "path" in ES_dsl.Q()?

I have a statement: ES_dsl.Q('nested', path='student', query=nest_filter)
What kind of role does the "path" play in the above one?
The path is simply the path to the nested field you're using in your query.
In nest_filter, you need to reference your nested field as student.xyz.
Check the equivalence in the query below:
GET /_search
{
"query": {
"nested" : {
"path" : "student", <--- this is the path
"query" : { <--- this is nest_filter
"bool" : {
"must" : [
{ "match" : {"student.name" : "john"} },
{ "range" : {"student.age" : {"gt" : 20}} }
]
}
}
}
}
}
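The same equivalence can be sketched with plain Python dicts. This does not use elasticsearch_dsl itself; the helper nested_query is hypothetical and only mirrors the JSON shape shown above, on the assumption that Q('nested', path=..., query=...) serializes to that shape.

```python
def nested_query(path, query):
    """Hypothetical helper mirroring the JSON shape of a nested query."""
    return {"nested": {"path": path, "query": query}}


# Inside the inner query, fields are referenced with the path as a prefix.
nest_filter = {
    "bool": {
        "must": [
            {"match": {"student.name": "john"}},
            {"range": {"student.age": {"gt": 20}}},
        ]
    }
}

body = {"query": nested_query("student", nest_filter)}
print(body)
```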

Elasticsearch Multiple Prefix Keywords

I need to use the prefix filter, but allow multiple different prefixes, i.e.
{"prefix": {"myColumn": ["This", "orThis", "orEvenThis"]}}
This does not work. And if I add each as a separate prefix, it obviously doesn't work either.
Help is appreciated.
Update
I tried should but without any luck:
$this->dsl['body']['query']['bool']['should'] = [
["prefix" => ["myColumn" => "This"]],
["prefix" => ["myColumn" => "orThis"]]
];
When I add those two constraints, I get ALL responses (as though filter is not working). But if I use must with either of those clauses, then I do get a response back with the correct prefix.
Based on your comments, it sounds like it may just be an issue with the syntax. With all ES queries (just like SQL ones), I suggest starting simple and submitting them to ES as raw DSL outside of code (although in your case this wasn't easily doable). The request itself is pretty straightforward:
{
"query" : {
"bool" : {
"must" : [ ... ],
"filter" : [
{
"bool" : {
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
]
}
}
]
}
}
}
I added it as a filter because the optional nature of your prefixing is not improving relevancy: it's literally asking that one of them must match. In such cases where the question is "does this match? yes / no", then you should use a filter (with the added bonus that that's cacheable!). If you're asking "does this match, and which matches better?" then you want a query (because that's relevancy / scoring).
Note: The initial issue appeared to be that the bool / must was unmentioned and the suggestion was to just use a bool / should.
{
"bool" : {
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
]
}
}
behaves differently than
{
"bool" : {
"must" : [ ... ],
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
]
}
}
because the must impacts the required nature of should. Without must, should behaves like a boolean OR. However, with must, it behaves as a completely optional function to improve relevancy (score). To make it go back to the boolean OR behavior with must, you must add minimum_should_match to the bool compound query.
{
"bool" : {
"must" : [ ... ],
"should" : [
{
"prefix" : {
"myColumn" : "This"
}
},
{
"prefix" : {
"myColumn" : "orThis"
}
},
{
"prefix" : {
"myColumn" : "orEvenThis"
}
}
],
"minimum_should_match" : 1
}
}
Notice that it's a component of the bool query, and not of either should or must!
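A minimal sketch of that last variant, written as a Python dict builder rather than the PHP client used in the question. The helper name any_prefix is hypothetical; it always sets minimum_should_match explicitly so the OR behavior does not depend on whether a must clause is present.

```python
def any_prefix(field, prefixes, must=None):
    """Hypothetical helper: require at least one of the given prefixes to match."""
    bool_query = {
        "should": [{"prefix": {field: prefix}} for prefix in prefixes],
        # Set on the bool query itself, not inside should or must.
        "minimum_should_match": 1,
    }
    if must:
        bool_query["must"] = must
    return {"bool": bool_query}


# Standalone OR over three prefixes.
only_prefixes = any_prefix("myColumn", ["This", "orThis", "orEvenThis"])

# The same OR combined with another required clause (match_all as a stand-in).
with_must = any_prefix("myColumn", ["This", "orThis"], must=[{"match_all": {}}])
print(only_prefixes)
print(with_must)
```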

Elastic Search NEST - How to have multiple levels of filters in search

I would like to have multiple levels of filters to derive a result set using NEST API in Elastic Search. Is it possible to query the results of another filter...? If yes can I do that in multiple levels?
My requirement is that a user can select / unselect options for various fields.
Example: There are 1000 documents in total in my index 'people'. There may be 3 ListBoxes: 1) City 2) Favourite Food 3) Favourite Colour. If the user selects a city, that narrows the set down to 600 documents. Out of those 600 documents I would like to filter by favourite food, which may leave some 300 documents. I would then like to filter further by favourite colour to retrieve 50 documents out of the previously derived 300.
You don't need to query within filters to achieve what you want. Just use filtered queries, http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html, and provide several filters. In your instance I would assume you would do something like this for your first query:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
}
]
}
}
}
You would then return the results from that and display them. You'd then let them select the next filter and do the following:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
},
{
"term" : {
"food" : "some food"
}
}
]
}
}
}
You'd then rinse and repeat for the third filter parameter:
{
"filtered" : {
"query" : {
"match_all" : { }
},
"filter" : {
"and" : [
{
"term" : {
"city" : "some city"
}
},
{
"term" : {
"food" : "some food"
}
},
{
"term" : {
"colour" : "some colour"
}
}
]
}
}
}
I haven't tested this, but the principle is sound and will work.
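The filtered query and the and filter were deprecated in Elasticsearch 2.0 and removed in 5.0, so on newer versions the same idea is usually written as a bool query with a filter list. The sketch below builds that body in Python from whatever the user has selected so far; the helper name build_filter_query is hypothetical, and the field names are the ones used in the answer above.

```python
def build_filter_query(selections):
    """Hypothetical helper: one term filter per selected field, all required."""
    return {
        "query": {
            "bool": {
                # Every clause in "filter" must match; none affects scoring.
                "filter": [
                    {"term": {field: value}}
                    for field, value in selections.items()
                ]
            }
        }
    }


selections = {"city": "some city"}
first = build_filter_query(selections)

# Each new selection just adds another filter and the search is re-run.
selections["food"] = "some food"
selections["colour"] = "some colour"
third = build_filter_query(selections)
print(third)
```

Each request is independent: the full set of selected filters is sent every time, rather than querying the results of the previous request.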

elasticsearch query_string and term search syntax

I would like to search for query_string that contains Bob and an Industry_Type_ID of either 8 OR 9.
I am getting a parse error: Parse Failure [No parser for element [Industry_Type_ID]]
{
"query" : {
"query_string":{"query":"Bob"},
"terms" : {
"Industry_Type_ID" : [ "8", "9" ],
"minimum_match" : 1
}
}
}
I am sure I am missing something obvious.
You can do it using a bool query with two must clauses:
{
"query" : {
"bool" : {
"must" : [
{
"query_string":{"query":"Bob"}
},
{
"terms" : {
"Industry_Type_ID" : [ "8", "9" ],
"minimum_match" : 1
}
}
]
}
}
}
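The parse error in the question comes from placing two query clauses side by side under "query", which accepts a single clause; wrapping them in bool resolves that. A small Python sketch of building this body follows. The helper name name_and_industry is hypothetical, and it leaves out minimum_match on the assumption that terms already matches documents containing any one of the listed values.

```python
def name_and_industry(text, industry_ids):
    """Hypothetical helper: both clauses are required via bool/must."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": text}},
                    # Matches when the field holds any of the given values.
                    {"terms": {"Industry_Type_ID": industry_ids}},
                ]
            }
        }
    }


body = name_and_industry("Bob", ["8", "9"])
print(body)
```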
