ElasticSearch exclude result from being filtered where a condition is met - elasticsearch

I want to exempt some results from a geo_distance filter when a condition is met.
For example, I filter my results on geo distance, but I want to include all results where status is abnormal and the match_phrase query matches, even if they are outside the geo_distance.
GET /drive/_search
{
"query": {
"bool": {
"should": [
{
"match_phrase": {
"keywords": "wheels"
}
},
{
"match_phrase": {
"name": "car sale"
}
}
],
"filter": [
{
"term": {
"status": "normal"
}
},
{
"geo_distance": {
"distance": "0.09km",
"address.coordinate": {
"lat": -33.703082,
"lon": 18.981069
}
}
}
]
}
}
}
I've been reading the documentation and googling, but I think I might be going in the wrong direction.
If you can point me in the right direction, or explain what a better solution would be, I'd be very grateful.

From the docs:
should
The clause (query) should appear in the matching document. If the bool
query is in a query context and has a must or filter clause then a
document will match the bool query even if none of the should queries
match. In this case these clauses are only used to influence the
score. If the bool query is in a filter context or has neither must or
filter then at least one of the should queries must match a document
for it to match the bool query. This behavior may be explicitly
controlled by setting the minimum_should_match parameter.
So in your case the geo condition goes in "should", since it is optional, and the rest goes in "filter", since those clauses are mandatory: the status and the match_phrase queries.
Try this:
{
"query": {
"bool": {
"filter": [
{
"term": {
"status": "normal"
}
},
{
"match_phrase": {
"name": "car sale"
}
},
{
"match_phrase": {
"keywords": "wheels"
}
}
],
"should": [
{
"geo_distance": {
"distance": "0.09km",
"address.coordinate": {
"lat": -33.703082,
"lon": 18.981069
}
}
}
]
}
}
}
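A different reading of the question is "(status is normal AND within the distance) OR (status is abnormal AND a phrase matches)". The Python sketch below builds such a request body as a plain dict. It is only one possible interpretation: the function name geo_or_exempt_query is hypothetical, the status value "abnormal" is taken from the question's prose rather than from its query, and treating the two match_phrase clauses as "at least one must match" is an assumption.

```python
def geo_or_exempt_query(lat, lon, distance, phrases):
    """Build a body for: (normal AND in range) OR (abnormal AND any phrase).

    `phrases` is a list of (field, text) pairs turned into match_phrase clauses.
    """
    # Branch 1: normal documents must fall inside the geo_distance circle.
    within_range = {
        "bool": {
            "filter": [
                {"term": {"status": "normal"}},
                {"geo_distance": {
                    "distance": distance,
                    "address.coordinate": {"lat": lat, "lon": lon},
                }},
            ]
        }
    }
    # Branch 2: abnormal documents skip the geo check but need a phrase match.
    exempt = {
        "bool": {
            "filter": [{"term": {"status": "abnormal"}}],
            "should": [{"match_phrase": {field: text}} for field, text in phrases],
            "minimum_should_match": 1,
        }
    }
    # A document is returned when at least one of the two branches matches.
    return {
        "query": {
            "bool": {
                "should": [within_range, exempt],
                "minimum_should_match": 1,
            }
        }
    }


query = geo_or_exempt_query(
    -33.703082, 18.981069, "0.09km",
    [("keywords", "wheels"), ("name", "car sale")],
)
```

The resulting dict can be sent as the JSON body of GET /drive/_search. If the phrase queries should also apply to the in-range branch, the same should block could be added there.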

Related

elasticsearch multi field query is not working as expected

I've been facing some issues with a multi-field Elasticsearch query. I am trying to fetch all documents whose func_name field matches either of two hard-coded strings. My index has documents with both function names, but the query result always contains only one func_name. So far I have tried the following queries.
1) The following returns only one function match, even though the index has documents with the other function as well:
GET /_search
{
"query": {
"multi_match": {
"query": "FEM_DS_GetTunerStatusInfo MDM_TunerStatusPrint",
"operator": "OR",
"fields": [
"func_name"
]
}
}
}
2) The following intermittently gives me both functions:
GET /_search
{
"query": {
"match": {
"func_name": {
"query": "MDM_TunerStatusPrint FEM_DS_GetTunerStatusInfo",
"operator": "or"
}
}
}
}
3) The following returns only one function match, even though the index has documents with the other function as well:
{
"query": {
"bool": {
"should": [
{ "match": { "func_name": "FEM_DS_GetTunerStatusInfo" }},
{ "match": { "func_name": "MDM_TunerStatusPrint" }}
]
}
}
}
Any help is much appreciated.
Thanks for your reply. Let's assume that I have the following kinds of documents in my Elasticsearch index. I want my search to return the first two documents, as they match my func_name.
{
"_index": "diag-178999",
"_source": {
"severity": "MIL",
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "MDM_TunerStatusPrint",
"timestamp": "2017-06-01T02:04:51.000Z"
}
},
{
"_index": "diag-344563",
"_source": {
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "FEM_DS_GetTunerStatusInfo",
"timestamp": "2017-07-20T02:04:51.000Z"
}
},
{
"_index": "diag-101010",
"_source": {
"severity": "MIL",
"t_id": "03468500",
"p_id": "000007c6",
"func_name": "some_func",
"timestamp": "2017-09-15T02:04:51.000Z"
}
}
Two good ways to query ES here are to filter by terms on a particular field, or to use an aggregation so that you can name the buckets, apply multiple rules, and get a more readable response.
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html and the general aggregations page, which is also very useful:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
In your case, you could do:
{
"from": 0, "size": 2,
"query": {
"bool": {
"filter": {
"terms": {
"func_name": ["FEM_DS_GetTunerStatusInfo", "MDM_TunerStatusPrint"]
}
}
}
}
}
OR
{
"aggs": {
"aggregationName": {
"terms": {
"field": "func_name",
"include": ["FEM_DS_GetTunerStatusInfo", "MDM_TunerStatusPrint"]
}
}
}
}
The aggregation at the end is only there to show how to get the same values as buckets instead of hits. Let me know if it works :)
Best regards
As I understand it, you should use a filtered query to match any document with one of the func_name values mentioned above:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
]
}
}
}
}
}
See:
Filtered Query, Terms Query
UPDATE for ES 5.0, where the filtered query was removed:
{
"query": {
"bool": {
"must": [
{
"terms": {
"func_name": [
"FEM_DS_GetTunerStatusInfo",
"MDM_TunerStatusPrint"
]
}
}
]
}
}
}
See: this answer
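To see why a terms query with a list of values behaves differently from a single term holding "A OR B", here is a toy Python model of exact-term matching. It is not Elasticsearch code: the helper names are hypothetical, and it assumes func_name is indexed as one exact, unanalyzed value per document.

```python
def term_matches(indexed_value, term):
    """A term query compares the whole indexed value; it parses no operators."""
    return indexed_value == term


def terms_matches(indexed_value, terms):
    """A terms query matches when any one of the listed values is equal."""
    return any(term_matches(indexed_value, t) for t in terms)


# func_name values of the three sample documents from the question.
docs = ["MDM_TunerStatusPrint", "FEM_DS_GetTunerStatusInfo", "some_func"]

wanted = ["FEM_DS_GetTunerStatusInfo", "MDM_TunerStatusPrint"]
joined = "FEM_DS_GetTunerStatusInfo OR MDM_TunerStatusPrint"

hits_terms = [d for d in docs if terms_matches(d, wanted)]
hits_joined = [d for d in docs if term_matches(d, joined)]
```

In this model the list form selects the first two documents, while the single "A OR B" string selects none, because the string is compared literally.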

geo_distance doesn't return any hit Elasticsearch

I have a problem with this query: when I use the geo_distance filter, nothing is returned. When I remove it I get proper results. The query is below:
GET _search
{
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": 20,
"distance_unit": "km",
"coordinates": [48.8488576, 2.3354223]
}
},
"must": {
"term": {
"_type": {
"value": "staff"
}
}
},
"must_not": [
{
"term": {
"cabinet.zipcode": {
"value": "75006"
}
}
},
{
"term": {
"next_availability_in_days": {
"value": "-1"
}
}
}
]
}
}
}
I would appreciate it if someone could give me a hint.
UPDATE
When I run the same query logic through the Elasticsearch Ruby DSL, I get proper results:
<Elasticsearch::Model::Searching::SearchRequest:0x007ff335763560
#definition=
{:index=>["development_app_scoped_index_20170428134744",
"development_app_scoped_index_20170428134744"], :type=>["staff", "light_staff"],
:body=>
{:query=>
{:bool=>
{:must_not=>[
{:term=>{"cabinet.zipcode"=>75006}},
{:term=> {:next_availability_in_days=>-1}}
],
:must=>[
{:term=>{:_type=>"staff"}}
],
:filter=>{:geo_distance=>
{:coordinates=>
{:lat=>48.8488576, :lon=>2.3354223},
:distance=>"6km"
}
}}},
:sort=>[
{:type=>{:order=>"desc"}},
{"_geo_distance"=>{"coordinates"=>"48.8488576,2.3354223", "order"=>"asc",
"unit"=>"km"}},
{:next_availability_in_days=>{:order=>"asc"}},
{:priority=>{:order=>"asc"}}
]
}}
So this is really weird and I'm not sure what's wrong with my ES syntax, but it definitely should work as expected.
Thanks.
There may be nothing within the range that you have entered.
Try increasing "distance": 20 to "distance": 500 and check the results. For reference, the distance between the two geo points [0,0] and [0,1] is about 111 km.
Another suggestion is to get rid of the "distance_unit" field and put the unit inside the "distance" field. Also note that Elasticsearch reads a geo point given as an array in [lon, lat] order, so passing the point as an object with explicit lat and lon keys (as your working Ruby request does) avoids swapping the two:
{
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "20km",
"coordinates": {
"lat": 48.8488576,
"lon": 2.3354223
}
}
}
}
}
}
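Elasticsearch interprets a geo point written as a two-element array in [lon, lat] order, so an array written latitude-first lands somewhere else entirely. The Python sketch below uses a plain haversine formula to show how far off the swapped point is, and to sanity-check how long one degree is at the equator. The helper names and the mean Earth radius of 6371 km are assumptions made for illustration; Elasticsearch's own distance calculation may differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def point_from_array(coords):
    """Read a two-element array the way Elasticsearch does: [lon, lat]."""
    lon, lat = coords
    return {"lat": lat, "lon": lon}


# One degree of longitude along the equator.
one_degree = haversine_km(0, 0, 0, 1)

# The point the question intends, and the point the array form describes.
intended = {"lat": 48.8488576, "lon": 2.3354223}
as_parsed = point_from_array([48.8488576, 2.3354223])
offset = haversine_km(intended["lat"], intended["lon"],
                      as_parsed["lat"], as_parsed["lon"])
```

Here offset comes out at several thousand kilometres, so a 20 km radius around the parsed point cannot contain documents located near the intended one.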

Minimum should match on filtered query

Is it possible to have a query like this
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and"
}
}
}
}
With the "minimum_should_match": "2" statement?
I know that I can use a simple query (I've tried it and it works), but I don't need the score to be computed. My goal is just to filter documents which contain 2 of the values.
Does scoring generally have a heavy impact on the time needed to retrieve documents?
Using this query:
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and",
"minimum_should_match": "2"
}
}
}
}
I got this error:
QueryParsingException[[my_db] [terms] filter does not support [minimum_should_match]]
minimum_should_match is not a parameter of the terms filter. If that is the functionality you are looking for, I would rewrite your query like this, using a bool query wrapped in a query filter:
{
"filter": {
"query": {
"bool": {
"should": [
{
"term": {
"names": "Anna"
}
},
{
"term": {
"names": "Mark"
}
},
{
"term": {
"names": "Joe"
}
}
],
"minimum_should_match": 2
}
}
}
}
This matches documents that contain at least two of the three terms: those with all three, and those with exactly two. No score is computed, because the query is executed as a filter.
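On Elasticsearch 5.0 and later, where the filtered query no longer exists, the same idea can be expressed by placing the should clauses inside the filter clause of a bool query, which runs in filter context and does not score. The Python sketch below builds that body and adds a toy matcher that mirrors the "at least n" rule; both function names are hypothetical.

```python
def at_least_n_terms(field, values, n):
    """Body that keeps documents matching at least `n` of `values` on `field`."""
    return {
        "query": {
            "bool": {
                # bool.filter runs in filter context, so no score is computed.
                "filter": {
                    "bool": {
                        "should": [{"term": {field: v}} for v in values],
                        "minimum_should_match": n,
                    }
                }
            }
        }
    }


def matches(doc_values, values, n):
    """Toy model of the rule: count how many wanted values the document has."""
    return len(set(doc_values) & set(values)) >= n


body = at_least_n_terms("names", ["Anna", "Mark", "Joe"], 2)
```

For instance, in the toy model a document with names ["Anna", "Joe", "Zed"] passes, while one with only ["Mark"] does not.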

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following Elasticsearch query:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, if I query using "match", I get an error message with a 400 status response:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is the match query not supported in nested bool filters?
Since the term query looks for the exact term in the field's inverted index and I want to query the gender data case-insensitively, which approach should I try?
Settings of the index:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for the gender field:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is that there is no match filter, only a match query, whereas term exists as both a query and a filter.
Your query can be as simple as this, i.e. there is no need for a filtered query; simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
This answer is for Elasticsearch 7.x. As I understand the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for these fields should look like this:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search request would be:
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
This search returns all documents where gender is "Male", "MALE", "mALe", etc. So even if you indexed the gender field as "mALe", the match query for "gender": "malE" will still retrieve it. This is because a match query analyzes the query text with the field's analyzer before searching, and the default analyzer for text fields lowercases it. That said, it should not be hard for a client of the API to pass a lowercase value to the match query in the first place. As for the sentiment field: since it's a keyword field, you can also search for exact values that contain spaces, like very positive.
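The difference between term and match on an analyzed field can be illustrated with a toy Python model of the question's analyzer_keyword (keyword tokenizer plus lowercase filter). This is a simplified simulation, not Elasticsearch itself, and the function names are hypothetical.

```python
def analyzer_keyword(text):
    """Keyword tokenizer + lowercase filter: the whole input, lowercased."""
    return [text.lower()]


def term_query(indexed_tokens, value):
    """term: the query value is NOT analyzed, it is compared as given."""
    return value in indexed_tokens


def match_query(indexed_tokens, value, analyzer):
    """match: the query value is analyzed first, then compared."""
    return any(token in indexed_tokens for token in analyzer(value))


# What ends up in the index for a document whose gender is "mALe".
indexed = analyzer_keyword("mALe")

term_hit = term_query(indexed, "malE")                      # compared as "malE"
match_hit = match_query(indexed, "malE", analyzer_keyword)  # compared as "male"
```

In this model the term lookup misses because "malE" is compared against the lowercased token, while the match lookup hits because the query text goes through the same analyzer as the indexed value.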

multiple search conditions in one query in es and distinguish the items according to the conditions

For one case I need to put multiple search conditions in one query to reduce the number of queries we need.
However, I need to distinguish the returning items based on the conditions.
Currently I achieve this with a function score query: each condition is assigned a score, and I can differentiate the results based on those scores.
However, the performance is not that good. Plus, we now need to get the doc count for each condition.
So is there any way to do it? I'm thinking of using an aggregation, but I'm not sure whether it can do this.
Thanks!
update:
curl -X GET 'localhost:9200/locations/_search?fields=_id&from=0&size=1000&pretty' -d '{
"query":{
"bool":{
"should":[
{
"filtered":{
"filter":{
"bool":{
"must":[{"term":{"city":"new york"}},{"term":{"state":"ny"}}]
}
}
}
},
{
"filtered":{
"filter":{
"bool":{
"must":[{"term":{"city":"los angeles"}},{"term":{"state":"ca"}}]
}
}
}
}
]
}
}}'
Well, to answer the first part of your question, named queries are the best fit.
For example:
{
"query": {
"bool": {
"should": [
{
"match": {
"field1": {
"query": "qbox",
"_name": "firstQuery"
}
}
},
{
"match": {
"field2": {
"query": "hosted Elasticsearch",
"_name": "secondQuery"
}
}
}
]
}
}
}
This returns an additional field called matched_queries for each hit, listing the named queries that matched that document.
You can find more info on named queries here.
But this information can't be used for aggregation.
So you need to handle the second part of your question separately.
A filter aggregation for each query type would be the ideal solution here.
For example:
{
"query": {
"bool": {
"should": [
{
"match": {
"text": {
"query": "qbox",
"_name": "firstQuery"
}
}
},
{
"match": {
"source": {
"query": "elasticsearch",
"_name": "secondQuery"
}
}
}
]
}
},
"aggs": {
"firstQuery": {
"filter": {
"term": {
"text": "qbox"
}
}
},
"secondQuery": {
"filter": {
"term": {
"source": "elasticsearch"
}
}
}
}
}
You can find more on the filter aggregation here.
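On the client side, the matched_queries array of each hit can be used to group or tally the returned documents by condition. The Python sketch below does this over a hand-written response dict. The sample response is invented for illustration and count_matched_queries is a hypothetical helper. Note that it only counts the hits actually returned in the page, so for totals across the whole index the filter aggregations above are the better source.

```python
from collections import Counter


def count_matched_queries(response):
    """Tally how many returned hits matched each named query."""
    counts = Counter()
    for hit in response["hits"]["hits"]:
        # Hits that matched no named query may lack the field entirely.
        counts.update(hit.get("matched_queries", []))
    return counts


# Invented sample shaped like a search response with named queries.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "matched_queries": ["firstQuery"]},
            {"_id": "2", "matched_queries": ["firstQuery", "secondQuery"]},
            {"_id": "3", "matched_queries": ["secondQuery"]},
            {"_id": "4"},
        ]
    }
}

counts = count_matched_queries(sample_response)
```

With the sample above, each of the two named queries is counted twice, and the hit without matched_queries contributes nothing.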

Resources