How to make a match query on an array field more accurate - Elasticsearch

Example: here is a document:
{
"_source": {
"name": [
"beef soup",
"chicken rice"
]
}
}
It is returned by the query below:
{
"match": {
"name": {
"query": "soup chicken noodle",
"minimum_should_match": "67%"
}
}
}
But I only want it to be returned for queries such as "hot beef soup" or "rice chicken hainan", where the matching terms all come from the same array element. Is there any way to do this other than a nested or span query? Thanks.
My ES query is complex. Does anyone know how to rewrite it as a span query?
{
"query": {
"bool": {
"filter": [
...
],
"must": {
"dis_max": {
"queries": [
{
"match": {
"array_field_3": {
"boost": 2,
"minimum_should_match": "67%",
"query": "keyword aa bb"
}
}
},
......
{
"nested": {
"path": "path_1",
"query": {
"bool": {
"must": {
"match": {
"array_field_6": {
......
"query": "keyword aa bb"
}
}
}
}
}
}
}
],
"tie_breaker": 0.15
}
}
}
}
}

You can use match_phrase, but it only matches the entire phrase. If you want keyword matching against each array element individually, that is not possible without nested or span queries, as mentioned in the documentation:
Arrays of objects do not work as you would expect: you cannot query
each object independently of the other objects in the array. If you
need to be able to do this then you should use the nested data type
instead of the object data type.
When you get a document back from Elasticsearch, any arrays will be in the same order as when you indexed the document. The _source field that you get back contains exactly the same JSON document that you indexed.
However, arrays are indexed — made searchable — as multi-value fields, which are unordered. At search time you can’t refer to “the first element” or “the last element”.
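To make the multi-value behaviour concrete, here is a minimal Python sketch (a simulation, not Elasticsearch code). It assumes a whitespace tokenizer as a rough stand-in for the real analyzer, and the helper names are hypothetical. It contrasts how the match query sees the array (all terms pooled) with the per-element behaviour the question asks for.

```python
def tokenize(text):
    # Rough stand-in for an analyzer: lowercase and split on whitespace.
    return text.lower().split()

def required_matches(num_terms, percent):
    # A percentage minimum_should_match is rounded down: 67% of 3 terms -> 2.
    return num_terms * percent // 100

def matches_flattened(values, query, percent):
    # How a match query sees an array field: one pooled bag of terms.
    indexed = {tok for value in values for tok in tokenize(value)}
    terms = tokenize(query)
    hits = sum(1 for term in terms if term in indexed)
    return hits >= required_matches(len(terms), percent)

def matches_per_element(values, query, percent):
    # The behaviour the question wants: enough terms from a single element.
    terms = tokenize(query)
    need = required_matches(len(terms), percent)
    return any(
        sum(1 for term in terms if term in set(tokenize(value))) >= need
        for value in values
    )

names = ["beef soup", "chicken rice"]
print(matches_flattened(names, "soup chicken noodle", 67))    # True
print(matches_per_element(names, "soup chicken noodle", 67))  # False
print(matches_per_element(names, "hot beef soup", 67))        # True
```

"soup" and "chicken" come from different elements, yet the pooled view counts both, which is why the document is returned.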

Please try a match_phrase query:
POST index1/_search
{
"query": {
"match_phrase": {
"text": {
"query": "chicken soup"
}
}
}
}
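As a rough illustration of why a phrase query does not match across array elements, the Python sketch below simulates token positions. It assumes the default position_increment_gap of 100 for text fields and a whitespace tokenizer; the function names are hypothetical and slop is not modelled.

```python
POSITION_INCREMENT_GAP = 100  # assumed default gap between array elements

def index_positions(values, gap=POSITION_INCREMENT_GAP):
    # Map each token to the positions it occupies in the multi-value field.
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # artificial gap inserted between elements
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
    return positions

def phrase_matches(values, phrase):
    # Exact phrase (slop 0): term i must sit at start + i.
    positions = index_positions(values)
    terms = phrase.lower().split()
    if not terms or any(term not in positions for term in terms):
        return False
    return any(
        all(start + i in positions[term] for i, term in enumerate(terms))
        for start in positions[terms[0]]
    )

names = ["beef soup", "chicken rice"]
print(phrase_matches(names, "beef soup"))     # True
print(phrase_matches(names, "soup chicken"))  # False
```

Under these assumptions, a phrase matches only when its terms appear in order inside one element, so "chicken soup" would not match the sample document either.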

Related

Elastic query combining should (boolean OR) with retrieval of nested documents

I have an Elastic index with nested documents. I am trying to retrieve multiple documents by ids along with their nested documents. Retrieving the documents themselves is simple enough with a Should query, but where and how would I include the nested doc query in this?
Boolean "Should" query:
GET myIndex/_search
{
"query": {
"bool": {
"should": [
{
"term": {
"id": "123"
}
},
{
"term": {
"id": "456"
}
},
{
"term": {
"id": "789"
}
}
]
}
}
}
Query to retrieve nested docs:
"nested": {
"path": "myNestedDocs",
"query": {
"match_all": {}
}
}
It is not possible to add the nested query to each term query, because this gives a parsing exception: "[term] malformed query, expected [END_OBJECT] but found [FIELD_NAME]"
Also, it is not possible to add the nested doc query on the same level as the term queries, because then it would be treated as just another OR clause and simply retrieve all docs from the index.
Any ideas? Thanks!
As I understand it, you want to match any one id from the list and retrieve the nested documents. If that is correct, you need to combine both queries in a must clause, like below:
{
"query": {
"bool": {
"must": [
{
"terms": {
"id": [
"123",
"456",
"789"
]
}
},
{
"nested": {
"path": "myNestedDocs",
"query": {
"match_all": {}
}
}
}
]
}
}
}
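If the request is built in application code, the same body can be assembled programmatically. This is a minimal sketch in plain Python (no client library); the function name is hypothetical, and the field names come from the question.

```python
import json

def ids_with_nested_query(ids, nested_path):
    # Both clauses sit in "must", so a document has to match the id list
    # AND have at least one nested document under nested_path.
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {"id": list(ids)}},
                    {"nested": {"path": nested_path, "query": {"match_all": {}}}},
                ]
            }
        }
    }

body = ids_with_nested_query(["123", "456", "789"], "myNestedDocs")
print(json.dumps(body, indent=2))
```

A single terms clause replaces the three should/term clauses, so the OR over ids stays inside one required clause.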

Elasticsearch boolean query doesn't work with filter

I'm not very strong in Elasticsearch. I'm trying to set up search in my app and got some strange problems. I have two documents:
{
"title": "Second insight",
"content": "Bla bla bla",
"library": "workspace"
}
{
"title": "Test source",
"content": "Bla bla bla",
"library": "workspace"
}
Then, I want to be able to make a search by text fields like title and content and apply some filters on fields like library. I have a query:
{
"query": {
"bool": {
"should": [
{ "match": { "title": "insight" }}
],
"filter": [
{
"term": {
"library": "workspace"
}
}
]
}
}
}
Despite the fact that I clearly defined title to be matched to insight, the query above returns both documents, not only the first one.
If I remove filter block:
{
"query": {
"bool": {
"should": [
{ "match": { "title": "insight" }}
]
}
}
}
the query returns correct results.
Then I also tried a partial search. For some reason, the query below, which uses ins instead of insight, doesn't work and returns an empty list:
{
"query": {
"bool": {
"should": [
{ "match": { "title": "ins" }}
]
}
}
}
How should I do a partial search, and how can I set up filters correctly? In other words, how can I run a partial-match query on some fields while filtering on other fields?
Thanks.
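You need to supply minimum_should_match in your first query. When a bool query contains a filter (or must) clause, its should clauses become optional and only affect scoring, so both documents pass the filter.
I did the following and got only a single document (your desired outcome):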
POST test_things/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"match": {
"title": "insight"
}
}
],
"filter": [
{
"term": {
"library": "workspace"
}
}
]
}
}
}
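The default behaviour can be sketched in Python. This is a simulation of the bool semantics, not Elasticsearch code; it assumes the documented default (minimum_should_match is 1 when there are only should clauses, 0 when a must or filter clause is present), and the helper names are hypothetical.

```python
def bool_matches(doc, must=(), filters=(), should=(), minimum_should_match=None):
    # Default: should clauses are required only when nothing else is.
    if minimum_should_match is None:
        minimum_should_match = 0 if (must or filters) else 1
    if not all(clause(doc) for clause in (*must, *filters)):
        return False
    if not should:
        return True
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match

def title_contains(word):
    return lambda doc: word in doc["title"].lower().split()

def library_is(value):
    return lambda doc: doc["library"] == value

docs = [
    {"title": "Second insight", "library": "workspace"},
    {"title": "Test source", "library": "workspace"},
]

def search(**clauses):
    return [d["title"] for d in docs if bool_matches(d, **clauses)]

print(search(should=[title_contains("insight")], filters=[library_is("workspace")]))
print(search(should=[title_contains("insight")], filters=[library_is("workspace")],
             minimum_should_match=1))
```

The first call returns both titles, the second only "Second insight", mirroring the behaviour described in the question and the fix above.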
As for why ins doesn't work, it depends on your mapping and the analyzer being used. You are matching against the analyzed terms in the index, so if you want to match against ins you need to change your analyzer (possibly using the ngram tokenizer) or use a wildcard query.
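To see why an n-gram based analyzer helps, here is a small Python sketch of edge n-grams (prefixes), a common variant of the n-gram approach. The min_gram and max_gram values are illustrative assumptions, not recommended settings, and the function is a simulation rather than the Elasticsearch tokenizer itself.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    # Emit every prefix of the token between min_gram and max_gram characters.
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]

# With a plain analyzer the index holds only the full token.
plain_terms = {"second", "insight"}
# With edge n-grams applied at index time, prefixes are indexed as terms too.
ngram_terms = {gram for token in plain_terms for gram in edge_ngrams(token)}

print("ins" in plain_terms)   # False
print("ins" in ngram_terms)   # True
```

The query term "ins" finds nothing among the full tokens but does match once prefixes are indexed as terms.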

Minimum should match on filtered query

Is it possible to have a query like this
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and"
}
}
}
}
With the "minimum_should_match": "2" statement?
I know that I can use a simple query (I've tried, it works), but I don't need the score to be computed. My goal is just to filter documents that contain 2 of the values.
Does scoring generally have a heavy impact on the time needed to retrieve documents?
Using this query:
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and",
"minimum_should_match": "2"
}
}
}
}
I got this error:
QueryParsingException[[my_db] [terms] filter does not support [minimum_should_match]]
Minimum should match is not a parameter for the terms filter. If that is the functionality you are looking for, I might rewrite your query like this, to use the bool query wrapped in a query filter:
{
"filter": {
"query": {
"bool": {
"should": [
{
"term": {
"names": "Anna"
}
},
{
"term": {
"names": "Mark"
}
},
{
"term": {
"names": "Joe"
}
}
],
"minimum_should_match": 2
}
}
}
}
This matches documents that contain all three names as well as documents that contain exactly two of the three. (A must clause, by contrast, is an implicit and.) No score is computed, because the query is executed as a filter.
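The matching rule of the rewritten query can be sketched in Python as a simple count. This is a simulation only; the function name and sample values are hypothetical.

```python
def matches_minimum(doc_names, wanted, minimum_should_match):
    # Count how many of the wanted terms the document contains.
    matched = sum(1 for name in wanted if name in doc_names)
    return matched >= minimum_should_match

wanted = ["Anna", "Mark", "Joe"]
print(matches_minimum(["Anna", "Mark"], wanted, 2))         # True
print(matches_minimum(["Anna", "Bob"], wanted, 2))          # False
print(matches_minimum(["Anna", "Mark", "Joe"], wanted, 2))  # True
```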

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following Elasticsearch query:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, if I query using "match", I get an error message with a 400 status response:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is the match query not supported in nested bool filters?
Since the term query looks for the exact term in the field's inverted index and I want to query the gender data as a case-insensitive field, which approach should I try?
Settings of the index:
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for the gender field:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is because there is no match filter, only match queries, even though there are both term queries and term filters.
Your query can be as simple as this, i.e. no need for a filtered query, simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
This answer is for Elasticsearch 7.x. As I understand from the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for these fields should look like this:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search request would be:
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
This search request returns all documents where gender is "Male", "MALE", "mALe", and so on. So even if you indexed the gender field as "mALe", the match query for "gender": "malE" will still retrieve it. This works because a match query analyzes its input with the field's analyzer before searching, and the default standard analyzer used for text fields lowercases tokens; the indexed value is lowercased the same way. Still, it is easy for a client of the API to pass a lowercase value to the match query in the first place. As for the sentiment field, since it is a keyword field, you can also search for values that contain spaces, such as very positive.
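The difference between term and match on an analyzed field can be sketched in Python. This is a simulation under the assumption that the field uses a keyword tokenizer plus lowercase filter, as in the question's analyzer_keyword; the function names are hypothetical.

```python
def analyzer_keyword(text):
    # keyword tokenizer + lowercase filter: the whole input is one lowercased token.
    return [text.lower()]

def term_query(indexed_tokens, value):
    # term: the value is NOT analyzed, it must equal an indexed token exactly.
    return value in indexed_tokens

def match_query(indexed_tokens, value, analyzer):
    # match: the value is analyzed first, then looked up.
    return any(token in indexed_tokens for token in analyzer(value))

indexed = set(analyzer_keyword("Male"))  # stored in the index as {"male"}

print(term_query(indexed, "malE"))                     # False
print(match_query(indexed, "malE", analyzer_keyword))  # True
print(term_query(indexed, "male"))                     # True
```

In this model a term lookup for "malE" misses, while match, or a term lookup with an already lowercased value, finds the document.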

How to display "ALL" the nested documents in an object in separate rows from elasticsearch?

I have a nested object in the following form:
{
"name": "Multi G. Enre",
"books": [
{
"name": "Guns and lasers",
"genre": "scifi",
"publisher": "orbit"
},
{
"name": "Dead in the night",
"genre": "thriller",
"publisher": "penguin"
}
]
}
I tried the following JSON query for the above document:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "books",
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"and": [
{
"term": {
"books.publisher": "penguin"
}
},
{
"term": {
"books.genre": "thriller"
}
}
]
}
}
}
}
}
}
}
}
I would like to see the second nested document, i.e. "Dead in the night", as the result, but whatever I search for, only the first document, i.e. "Guns and lasers", is displayed in the table in the elasticsearch head plugin.
Is there any way I can display the nested documents separately based on the search query, and not just the first document?
I'm new to Elasticsearch, so I would appreciate any type of response. Thank you!
You need to use inner_hits in your query.
Moreover, if you want to only retrieve the matching nested document and nothing else, you can add "_source":["books"] to your query and only the matching nested books will be returned, nothing else.
UPDATE
Sorry, I misunderstood your comment. You can add "_source": false and the top-level document will not be returned, only the matching nested documents.
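A request body using inner_hits could be assembled like this. It is a minimal sketch in plain Python (no client library); the function name is hypothetical, and it assumes the newer bool/filter syntax rather than the filtered/and syntax used in the question, so it may need adapting to your Elasticsearch version.

```python
import json

def nested_with_inner_hits(path, term_filters, include_parent_source=False):
    # inner_hits asks Elasticsearch to return the nested documents that matched,
    # alongside (or instead of) the top-level _source.
    return {
        "_source": include_parent_source,
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {field: value}}
                            for field, value in term_filters.items()
                        ]
                    }
                },
                "inner_hits": {},
            }
        },
    }

body = nested_with_inner_hits(
    "books", {"books.publisher": "penguin", "books.genre": "thriller"}
)
print(json.dumps(body, indent=2))
```

With a body like this, the matching nested documents should appear under inner_hits in each hit of the response.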

Resources