Is there any option to minimize this elastic search must not match query? - elasticsearch

I'm trying to exclude certain values of a field, and for that I have used a must_not condition, but it is static and takes a lot of lines. Please let me know whether there is any other way to optimize this query.
Here is the query,
"must_not": [
{
"match": {
"field.keyword": "welcome"
}
},
{
"match": {
"field.keyword": "Welcome"
}
},
{
"match": {
"field.keyword": "entry_point"
}
},
{
"match": {
"field.keyword": "Entry point"
}
}
]
Thanks,

If the search text is the same, you can use multi_match, which searches for the text in multiple fields:
"bool": {
"must_not": [
{
"multi_match": {
"query": "text",
"fields": ["field1.keyword","field2.keyword"]
}
}
]
}
If the field is the same and the texts are different, you can use a terms query:
"must_not": [
{
"terms": {
"field.keyword": [
"VALUE1",
"VALUE2"
]
}
}
]
If both the fields and the texts are different, you will have to use the query from your question.
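If the clause list is built in application code, the same-field case can be collapsed automatically. Below is a minimal Python sketch of that idea; collapse_must_not is a hypothetical helper, not part of any Elasticsearch client, and it assumes the clauses target keyword fields, where a match on a single value behaves like an exact term lookup.

```python
from collections import defaultdict


def collapse_must_not(clauses):
    """Merge single-value match/term clauses on the same field into one terms clause.

    Hypothetical helper: assumes keyword fields, where match and term
    both amount to an exact-value comparison.
    """
    values_by_field = defaultdict(list)
    passthrough = []
    for clause in clauses:
        body = clause.get("match") or clause.get("term")
        if isinstance(body, dict) and len(body) == 1:
            field, value = next(iter(body.items()))
            if isinstance(value, (str, int, float, bool)):
                values_by_field[field].append(value)
                continue
        # Anything more complex is left untouched.
        passthrough.append(clause)
    merged = [{"terms": {field: values}} for field, values in values_by_field.items()]
    return merged + passthrough


original = [
    {"match": {"field.keyword": "welcome"}},
    {"match": {"field.keyword": "Welcome"}},
    {"match": {"field.keyword": "entry_point"}},
    {"match": {"field.keyword": "Entry point"}},
]
collapsed = collapse_must_not(original)
```

With the four clauses from the question, collapsed holds a single terms clause on field.keyword that lists all four values.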

As you said you are not looking for an exact match, I would just use query_string for single words and match_phrase for phrases.
"must_not": [
{
"query_string": {
"query": "welcome OR Welcome"
}
},
{
"match_phrase": {
"title": {
"query": "entry point"
}
}
}
]
I'm not sure which analyzer you use, but if you use one that lowercases and keeps only alphanumeric characters, for example, you won't need "duplicate" queries like "welcome" and "Welcome".

Related

Elasticsearch boolean query doesn't work with filter

I'm not very strong in Elasticsearch. I'm trying to set up search in my app and got some strange problems. I have two documents:
{
"title": "Second insight",
"content": "Bla bla bla",
"library": "workspace"
}
{
"title": "Test source",
"content": "Bla bla bla",
"library": "workspace"
}
Then, I want to be able to make a search by text fields like title and content and apply some filters on fields like library. I have a query:
{
"query": {
"bool": {
"should": [
{ "match": { "title": "insight" }}
],
"filter": [
{
"term": {
"library": "workspace"
}
}
]
}
}
}
Despite the fact that I clearly defined title to be matched to insight, the query above returns both documents, not only the first one.
If I remove filter block:
{
"query": {
"bool": {
"should": [
{ "match": { "title": "insight" }}
]
}
}
}
the query returns correct results.
Then, I also tried to make a partial search. For some reason, the query below, which uses ins instead of insight, doesn't work and returns an empty list:
{
"query": {
"bool": {
"should": [
{ "match": { "title": "ins" }}
]
}
}
}
How should I make a partial search? And how can I set up filters correctly? In other words, how do I run a partial search on some fields while filtering by other fields?
Thanks.
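You need to supply minimum_should_match in your first query. When a bool query also contains a must or filter clause, minimum_should_match defaults to 0, so the should clauses become optional and only influence scoring. Both documents pass the filter, so both are returned.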
I did the following and only got a single document (your desired outcome)
POST test_things/_search
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"match": {
"title": "insight"
}
}
],
"filter": [
{
"term": {
"library": "workspace"
}
}
]
}
}
}
As for why ins doesn't work: it depends on the mapping and analyzer being used. A match query matches against the analyzed terms in the index, so if you want to match ins you need to change your analyzer (possibly using the ngram tokenizer) or use a wildcard query.
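To make the should/filter interaction concrete, here is a small pure-Python model of how a bool query decides whether a document matches. It is only a sketch of the documented default for minimum_should_match, not Elasticsearch code; bool_matches, title_has_insight and in_workspace are names made up for the illustration, and scoring is ignored.

```python
def bool_matches(doc, should=(), filters=(), minimum_should_match=None):
    """Toy model of bool query matching (no scoring)."""
    if minimum_should_match is None:
        # Default: 1 if there are only should clauses, 0 once a filter (or must) is present.
        minimum_should_match = 1 if should and not filters else 0
    if not all(clause(doc) for clause in filters):
        return False
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match


docs = [
    {"title": "Second insight", "content": "Bla bla bla", "library": "workspace"},
    {"title": "Test source", "content": "Bla bla bla", "library": "workspace"},
]


def title_has_insight(doc):
    # Stand-in for {"match": {"title": "insight"}} with a lowercasing analyzer.
    return "insight" in doc["title"].lower().split()


def in_workspace(doc):
    # Stand-in for {"term": {"library": "workspace"}}.
    return doc["library"] == "workspace"


def search(**kwargs):
    return [d["title"] for d in docs if bool_matches(d, **kwargs)]


with_filter = search(should=[title_has_insight], filters=[in_workspace])
with_minimum = search(should=[title_has_insight], filters=[in_workspace], minimum_should_match=1)
should_only = search(should=[title_has_insight])
```

In this model with_filter contains both titles, while with_minimum and should_only contain only "Second insight", which mirrors the behaviour described in the question.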

ElasticSearch multimatch substring search

I have to combine two filters to match requirements:
- a specific list of values in r.status field
- one of the multiple text fields contains the value.
The resulting query (built with NEST, though that doesn't matter) looks like:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"term": {
"isActive": {
"value": true
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"r.status": [
"VALUE_1",
"VALUE_2",
"VALUE_3"
]
}
},
{
"bool": {
"should": [
{
"match": {
"r.g.firstName": {
"type": "phrase",
"query": "SUBSTRING_VALUE"
}
}
},
{
"match": {
"r.g.lastName": {
"type": "phrase",
"query": "SUBSTRING_VALUE"
}
}
}
]
}
}
]
}
},
"path": "r"
}
}
]
}
}
]
}
}
}
Also tried with multi_match query:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"term": {
"isActive": {
"value": true
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"r.status": [
"VALUE_1",
"VALUE_2",
"VALUE_3"
]
}
},
{
"multi_match": {
"query": "SUBSTRING_VALUE",
"fields": [
"r.g.firstName",
"r.g.lastName"
]
}
}
]
}
},
"path": "r"
}
}
]
}
}
]
}
}
}
FirstName and LastName are configured in index mappings as text:
"firstName": {
"type": "text"
},
"lastName": {
"type": "text"
}
Elasticsearch offers a lot of full-text search options: multi_match, phrase, wildcards, etc. But all of them fail in my case when looking for a substring in my text fields. (The terms query and the isActive one work well; I tried running only them.)
What other options do I have, or where did I make a mistake?
UPD: Combined wildcards worked for me, but such a query looks ugly. Looking for a more elegant solution.
The Elasticsearch way is to use the ngram tokenizer.
The ngram analyzer will split your terms with a sliding window. For example, the input "Hello World" will generate the following terms:
Hel
Hell
Hello
ell
ello
...
Wor
World
orl
...
You can configure the minimum and maximum size of the sliding window (in the example the minimum size is 3). Once the sub-terms are generated, you can use a match query on the subfield.
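The sliding window is easy to reproduce outside Elasticsearch. The Python sketch below only illustrates the idea: the function name ngrams and the maximum size of 5 are assumptions chosen to match the "Hello World" listing, and splitting on whitespace stands in for a tokenizer configured to break on non-letter characters.

```python
def ngrams(text, min_gram=3, max_gram=5):
    """Return the sliding-window sub-terms of every whitespace-separated word."""
    terms = []
    for word in text.split():
        # Each start position yields every window size from min_gram up to max_gram.
        for start in range(len(word) - min_gram + 1):
            longest = min(max_gram, len(word) - start)
            for size in range(min_gram, longest + 1):
                terms.append(word[start:start + size])
    return terms


hello_terms = ngrams("Hello World")
insight_terms = ngrams("insight")
```

Because "ins" is one of the sub-terms generated for "insight", a match query for ins against a field indexed this way can find the document.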
Another point: it is odd to use must within a filter. If you are interested in the score, use must; otherwise, use filter. Read this article for a good understanding.

ElasticSearch combine must-match with multi-match

I have been trying to combine MUST-MATCH with MULTI-MATCH but couldn't get it to work. Basically I want these MUST conditions:
"must": [{ "match": { "city": $city } },
{ "match": { "is_displayed": 1 } },
{ "match": { "status": "active" } }]
and I want these matches:
"multi_match": {
"query": $query,
"type": $selectedType,
"fields": fieldArray,
}
where $query is the text box value, $selectedType is one of the multi_match query types, and fieldArray is the list of fields to search. For example, when the text box value is "hello world" and fieldArray is ['title', 'cuisine'], either "hello" and/or "world" must match any or all of the specified fields. Any insight and advice is appreciated.
I guess adding another clause to the must block will do it.
{
"query": {
"bool": {
"must": [
{
"match": {
"city": "$city"
}
},
{
"match": {
"is_displayed": 1
}
},
{
"match": {
"status": "active"
}
},
{
"query_string": {
"fields": fieldArray,
"query": "*$query*"
}
}
]
}
}
}
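Since the question asked for multi_match rather than query_string, the same pattern also works with a multi_match clause added to must. Here is a hedged Python sketch that builds such a request body as a plain dict; build_search and the sample argument values are hypothetical, and the result would still have to be sent with whatever client you use.

```python
import json


def build_search(city, text, match_type, fields):
    """Build a bool query: three required match clauses plus one multi_match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"city": city}},
                    {"match": {"is_displayed": 1}},
                    {"match": {"status": "active"}},
                    {
                        "multi_match": {
                            "query": text,
                            "type": match_type,
                            "fields": list(fields),
                        }
                    },
                ]
            }
        }
    }


# Illustrative values only.
body = build_search("Paris", "hello world", "best_fields", ["title", "cuisine"])
payload = json.dumps(body, indent=2)
```

With the default operator of multi_match, either "hello" or "world" appearing in any of the listed fields is enough for that clause to match.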

match query on elastic search with multiple or conditions

I have three fields: status, type and search. I want to find the data where status equals NEW or IN PROGRESS, type equals abc or xyz, and search contains a given value (partial match).
My request looks like this:
{
"query": {
"bool" : {
"must" : [{
"match": {
"status": {
"query": "abc"
}
}
}, {
"match": {
"type": {
"query": "NEW"
}
}
},{
"query_string": {
"query": "*abc*", /* for partial search */
"fields": ["title", "name"]
}
}]
}
}
}
Nest your bool queries. I think what you are missing is this:
"bool": { "should": [
{ "match": { "status": "abc" } },
{ "match": { "status": "xyz" } }
]}
This query MUST match at least one of the should clauses, because only should clauses are given.
EDIT to explain the differences:
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"status": "abc"
}
},
{
"match": {
"status": "xyz"
}
}
]
}
},
{
"terms": {
"type": [
"NEW",
"IN_PROGRESS"
]
}
},
{
"query_string": {
"query": "*abc*",
"fields": [
"title",
"name"
]
}
}
]
}
}
}
So you have a bool query at the top. Each of the three inner queries must match.
The first is a nested bool query, which matches if status matches either abc or xyz.
The second matches if type is exactly NEW or IN_PROGRESS. Note the difference here: the first one would also match ABC or aBc, or potentially "abc XYZ", depending on your analyzer. You might want terms for both.
The third is what you had before.
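If you build such queries in code, the "one of these values" part can be factored into a small helper. The Python sketch below is only an illustration: any_of and build_query are hypothetical names, and minimum_should_match is set explicitly so the OR group keeps its meaning even if someone later adds a filter to the inner bool.

```python
def any_of(field, values):
    """OR group: at least one match clause on the field must match."""
    return {
        "bool": {
            "should": [{"match": {field: value}} for value in values],
            "minimum_should_match": 1,
        }
    }


def build_query(statuses, types, partial, fields):
    """AND together an OR group, an exact terms lookup and a wildcard search."""
    return {
        "query": {
            "bool": {
                "must": [
                    any_of("status", statuses),
                    {"terms": {"type": list(types)}},
                    {"query_string": {"query": f"*{partial}*", "fields": list(fields)}},
                ]
            }
        }
    }


query = build_query(["abc", "xyz"], ["NEW", "IN_PROGRESS"], "abc", ["title", "name"])
```

This produces the same structure as the query above, with the extra minimum_should_match on the nested bool.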

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following elastic search query :
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, if I query using "match", I get an error message with a 400 status response:
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is the match query not supported in nested bool filters?
Since the term query looks for the exact term in the field's inverted index and I want to query gender as a case-insensitive field, which approach should I try?
Settings of the index :
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for field Gender:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is because there is no match filter, only match queries, even though there are both term queries and term filters.
Your query can be as simple as this, i.e. no need for a filtered query, simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
This answer is for Elasticsearch 7.x. As I understand from the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for these fields should look like this:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search API would be:
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
This search returns all documents where gender is "Male", "MALE", "mALe", and so on. A text field is analyzed at index time, and a match query analyzes its input with the same analyzer at search time; the default standard analyzer lowercases tokens, so a document indexed with "mALe" is still found by "gender": "malE". Still, it should not be hard for an API client to pass a lowercase value to the match query in the first place. As for the sentiment field, since it is a keyword field and is not tokenized, you can also search for exact values that contain spaces, such as very positive.
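The difference between term and match here comes down to whether the query text goes through the analyzer. Below is a toy Python model of that, assuming an analyzer that emits the whole input as one lowercased token (like the keyword tokenizer plus lowercase filter from the question); all function names are made up for the illustration.

```python
def analyze(text):
    """Toy analyzer: keyword tokenizer + lowercase filter, one token per value."""
    return [text.lower()]


def index_value(value):
    # Terms stored in the inverted index for one field value.
    return set(analyze(value))


def term_query(indexed_terms, value):
    # term: the query value is NOT analyzed, it must equal an indexed term.
    return value in indexed_terms


def match_query(indexed_terms, value):
    # match: the query value is analyzed first, then looked up.
    return any(token in indexed_terms for token in analyze(value))


indexed = index_value("mALe")
term_hit_mixed_case = term_query(indexed, "malE")
term_hit_lowercase = term_query(indexed, "male")
match_hit_mixed_case = match_query(indexed, "malE")
```

In this model the term lookup for "malE" misses, while the term lookup for "male" and the match lookup for "malE" both hit, which is why match is the natural choice for a case-insensitive field.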

Resources