Elasticsearch exact match query (not fuzzy) - elasticsearch

I have the following query over a string field:
const query = {
"query": {
"match_phrase": {
name: "ron"
}
},
"from": 0,
"size": 10
};
these are the names I have in the database
1. "ron"
2. "ron martin"
3. "ron ron"
4. "ron howard"
the result of this query is very fuzzy: all of the rows are returned instead of row number 1 only.
It's like it is performing "contains" instead of "equals".
Thanks

In your case, all the documents are returned because every one of them contains the token ron in the name field.
If you want only an exact match on the whole field value, add a keyword subfield to the name field and query that subfield (notice the ".keyword" after the field name). A keyword field is indexed as a single unanalyzed term instead of being tokenized by the standard analyzer. Try the query below:
Index Mapping:
{
"mappings": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
Index Data:
{
"name":"ron"
}
{
"name":"ron martin"
}
{
"name":"ron ron"
}
{
"name":"ron howard"
}
{
"name": "john howard"
}
Search Query:
{
"query": {
"match_phrase": {
"name.keyword": "ron"
}
},
"from": 0,
"size": 10
}
Search Result:
"hits": [
{
"_index": "64982377",
"_type": "_doc",
"_id": "1",
"_score": 1.2039728,
"_source": {
"name": "ron"
}
}
]
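A rough way to see why the original query behaved like "contains": the text field is split into tokens, so a phrase query for "ron" matches any document that contains that token, while the keyword subfield keeps the whole string as one term. The Python sketch below only imitates this with a simplified tokenizer (lowercase, split on non-alphanumerics). That tokenizer and the helper names are assumptions for illustration, not Elasticsearch's actual standard analyzer.

```python
import re

NAMES = ["ron", "ron martin", "ron ron", "ron howard"]


def simple_tokens(text):
    """Simplified stand-in for an analyzer: lowercase and split into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_match(doc, query):
    """True if the query tokens appear as a contiguous run in the doc tokens."""
    d, q = simple_tokens(doc), simple_tokens(query)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def keyword_match(doc, query):
    """A keyword field holds the whole value as one term, so compare it whole."""
    return doc == query


text_hits = [n for n in NAMES if phrase_match(n, "ron")]
keyword_hits = [n for n in NAMES if keyword_match(n, "ron")]

print("match_phrase on name:        ", text_hits)
print("match_phrase on name.keyword:", keyword_hits)
```

Because name.keyword holds a single unanalyzed term, a term query on name.keyword is another common way to express the same exact match.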
Update 1:
Based on the comments below, if you want to search for both exact match and fuzzy match (according to your requirement), then you can use multi_match query.
Search Query:
{
"query": {
"multi_match": {
"query": "howard",
"fields": [
"name",
"name.keyword"
],
"type": "phrase"
}
}
}
Search Result:
"hits": [
{
"_index": "64982377",
"_type": "_doc",
"_id": "4",
"_score": 0.83740485,
"_source": {
"name": "ron howard"
}
},
{
"_index": "64982377",
"_type": "_doc",
"_id": "5",
"_score": 0.83740485,
"_source": {
"name": "john howard"
}
}
]
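If exact matches should also rank above partial phrase matches, multi_match accepts a per-field boost written with the caret notation. A minimal sketch that builds such a request body in Python; the helper name and the boost value of 2 are illustrative assumptions, not part of the answer above, and the right boost depends on your data.

```python
import json


def phrase_or_exact(text, field="name", exact_boost=2):
    """Build a multi_match phrase query that favours the keyword subfield.

    `exact_boost` is a hypothetical tuning knob; adjust it for your index.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": [field, f"{field}.keyword^{exact_boost}"],
                "type": "phrase",
            }
        }
    }


body = phrase_or_exact("howard")
print(json.dumps(body, indent=2))
```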

Related

elk's elastic search dsl case sensitive

I'm doing an Elasticsearch Query DSL query on ELK such as:
{
"query": {
"wildcard": {
"url.path": {
"value": "*download*",
"boost": 1,
"rewrite": "constant_score"
}
}
}
}
but it seems to be case sensitive (it only returns entries containing "download", not "Download" or "DOWNLOAD").
Can I disable this and search case-insensitively?
Version used: 7.9.1
The query below performs a case-insensitive search: it fetches results for *download, *Download and *DOWNLOAD. Replace <my-index> with your index and <field1> with the field you would like to search.
Search Query
GET /<my-index>/_search
{
"query" : {
"bool" : {
"must" : {
"query_string" : {
"query" : "*download",
"fields": ["<field1>"]
}
}
}
}
}
If you wish to perform the same search on multiple fields, add them to the fields list.
Search on multiple fields
GET /<my-index>/_search
{
"query" : {
"bool" : {
"must" : {
"query_string" : {
"query" : "*download",
"fields": ["<field1>","<field2>","<field3>"]
}
}
}
}
}
There is a case_insensitive parameter available for wildcard query, but it was introduced in Elasticsearch 7.10.0, so you need to upgrade if you are still on 7.9.1.
If you can upgrade to 7.10.0 or higher:
Ideally, the field should use the wildcard type in the index mapping:
{
"mappings": {
"properties": {
"url.path": {
"type": "wildcard"
}
}
}
}
Then a wildcard query with case insensitivity enabled will find all the variants ("download", "DOWNLOAD", "Download", etc.):
{
"query": {
"wildcard": {
"url.path": {
"value": "*download*",
"boost": 1,
"rewrite": "constant_score",
"case_insensitive": true
}
}
}
}
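If the same client code has to talk to clusters on both sides of the 7.10.0 boundary, the request body can be built conditionally. A minimal Python sketch, assuming the cluster version is already known and passed in as a tuple; the helper name is hypothetical, and on older versions the field still needs a lowercasing mapping for the search to be case-insensitive.

```python
import json


def wildcard_query(field, pattern, es_version):
    """Build a wildcard query, adding case_insensitive only where supported."""
    params = {"value": pattern}
    # case_insensitive was introduced in 7.10.0; older versions reject it.
    if es_version >= (7, 10, 0):
        params["case_insensitive"] = True
    return {"query": {"wildcard": {field: params}}}


old = wildcard_query("url.path", "*download*", (7, 9, 1))
new = wildcard_query("url.path", "*download*", (7, 10, 0))
print(json.dumps(old))
print(json.dumps(new))
```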
If you must remain at 7.9.1:
Define your mapping in such a way that Elasticsearch treats the field contents as lowercase. The following will mimic wildcard type (it's a keyword, so only one token) indexed as lowercase.
{
"mappings": {
"properties": {
"url": {
"type": "text",
"analyzer": "lowercase-keyword"
}
}
},
"settings": {
"analysis": {
"analyzer": {
"lowercase-keyword": {
"type": "custom",
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
The query, without the case_insensitive parameter which is unsupported in this version:
{
"query": {
"wildcard": {
"url": {
"value": "*download*",
"boost": 1,
"rewrite": "constant_score"
}
}
}
}
Example results (note that searching for "*download*" and "*DoWnLoAd*" both work in the same way):
{
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "my-index",
"_type": "_doc",
"_id": "PtbQe3wByTvslqtrs7Cn",
"_score": 1.0,
"_source": {
"url": "http://example.com/download"
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "P9bQe3wByTvslqtrvbDt",
"_score": 1.0,
"_source": {
"url": "http://example.com/Download"
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "QNbQe3wByTvslqtrzbDw",
"_score": 1.0,
"_source": {
"url": "http://example.com/DOWNLOAD"
}
}
]
}
}
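The effect of the lowercase-keyword analyzer can be imitated outside Elasticsearch: each value is stored as one lowercased term, and the wildcard pattern is lowercased before matching. The Python sketch below uses fnmatchcase as a stand-in for wildcard matching (it also understands [seq] patterns, which Elasticsearch wildcards do not), so treat it as an illustration of the idea rather than the engine's behaviour.

```python
from fnmatch import fnmatchcase

URLS = [
    "http://example.com/download",
    "http://example.com/Download",
    "http://example.com/DOWNLOAD",
]


def build_index(urls, lowercase):
    """Pair each original value with the single term that would be indexed."""
    return [(u, u.lower() if lowercase else u) for u in urls]


def wildcard_search(index, pattern, lowercase):
    """Match the pattern against indexed terms, lowercasing it if the index is."""
    if lowercase:
        pattern = pattern.lower()
    return [orig for orig, term in index if fnmatchcase(term, pattern)]


case_sensitive = wildcard_search(build_index(URLS, False), "*download*", False)
case_insensitive = wildcard_search(build_index(URLS, True), "*DoWnLoAd*", True)

print(case_sensitive)
print(case_insensitive)
```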
You can use the case_insensitive parameter of the wildcard query. This parameter was introduced in version 7.10.0.
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"url": {
"properties": {
"path": {
"type": "wildcard"
}
}
}
}
}
}
Index Data:
{
"url":{
"path":"xx/download"
}
}
Search Query:
{
"query": {
"wildcard": {
"url.path": {
"value": "*Download*",
"boost": 1,
"rewrite": "constant_score",
"case_insensitive": false
}
}
}
}
Search Result:
No results are returned when you search for *Download* or *DOWNLOAD*, because case_insensitive is set to false.
Update:
You can use the wildcard query with "case_insensitive": true parameter
Adding a sample index data, search query, and search result
Index Data:
{
"url": {
"path": "download"
}
}
{
"url": {
"path": "DOWNLOAD"
}
}
{
"url": {
"path": "Download"
}
}
Search Query:
{
"query": {
"wildcard": {
"url.path": {
"value": "*DOWNLOAD*",
"boost": 1,
"rewrite": "constant_score",
"case_insensitive": true
}
}
}
}
Search Result:
"hits": [
{
"_index": "67210888",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"url": {
"path": "download"
}
}
},
{
"_index": "67210888",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"url": {
"path": "Download"
}
}
},
{
"_index": "67210888",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"url": {
"path": "DOWNLOAD"
}
}
}
]

Elasticsearch query to return part of words searched for

I would like to know how I can return "thanks" or "thanking" if I search for "thank"
Currently I have a multi-match query which returns only occurrences of "thank" like "thank you" but not "thanksgiving" or "thanks". I am using ElasticSearch 7.9.1
query: {
bool: {
must: [
{match: {accountId}},
{
multi_match: {
query: "thank",
type: "most_fields",
fields: ["text", "address", "description", "notes", "name"],
}
}
],
filter: {match: {type: "personaldetails"}}
}
},
Also is it possible to combine the multimatch query with a queryString on one of the fields (say description, where I would do a querystring search only on description and a phrase match on other fields)
{ "query": {
"query_string": {
"query": "(new york city) OR (big apple)",
"default_field": "content"
}
}
}
Any input is appreciated.
thanks
You can use the edge_ngram tokenizer, which first breaks text down into words whenever it encounters one of a list of specified characters, then emits N-grams of each word where the start of the N-gram is anchored to the beginning of the word.
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "edge_ngram",
"min_gram": 5,
"max_gram": 20,
"token_chars": [
"letter",
"digit"
]
}
}
},
"max_ngram_diff": 50
},
"mappings": {
"properties": {
"notes": {
"type": "text",
"analyzer": "my_analyzer",
"search_analyzer": "standard" // note this
}
}
}
}
Index Data:
{
"notes":"thank"
}
{
"notes":"thank you"
}
{
"notes":"thanks"
}
{
"notes":"thanksgiving"
}
Search Query:
{
"query": {
"multi_match" : {
"query": "thank",
"fields": [ "notes", "name" ]
}
}
}
Search Result:
"hits": [
{
"_index": "65511630",
"_type": "_doc",
"_id": "1",
"_score": 0.1448707,
"_source": {
"notes": "thank"
}
},
{
"_index": "65511630",
"_type": "_doc",
"_id": "3",
"_score": 0.1448707,
"_source": {
"notes": "thank you"
}
},
{
"_index": "65511630",
"_type": "_doc",
"_id": "2",
"_score": 0.12199639,
"_source": {
"notes": "thanks"
}
},
{
"_index": "65511630",
"_type": "_doc",
"_id": "4",
"_score": 0.06264679,
"_source": {
"notes": "thanksgiving"
}
}
]
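To check the min_gram and max_gram settings by hand, the tokens produced at index time can be approximated in a few lines of Python. This is a simplified sketch: it splits on ASCII letters and digits only, whereas the real letter and digit token_chars are Unicode-aware, and the function name is an assumption for illustration.

```python
import re


def edge_ngram_tokens(text, min_gram=5, max_gram=20):
    """Emit prefixes of each word, from min_gram up to max_gram characters."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        longest = min(len(word), max_gram)
        # Words shorter than min_gram produce no tokens at all.
        tokens.extend(word[:n] for n in range(min_gram, longest + 1))
    return tokens


DOCS = ["thank", "thank you", "thanks", "thanksgiving"]

# The search analyzer is "standard", so the query "thank" stays one token
# and matches every document whose indexed tokens include it.
matches = [d for d in DOCS if "thank" in edge_ngram_tokens(d)]

print(edge_ngram_tokens("thanks"))
print(matches)
```

With min_gram set to 5, a shorter search term such as "than" would match nothing, so pick min_gram according to the shortest prefix you want to support.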
To combine multi-match query with query string, use the below query:
{
"query": {
"bool": {
"must": {
"multi_match": {
"query": "thank",
"fields": [
"notes",
"name"
]
}
},
"should": {
"query_string": {
"query": "(new york city) OR (big apple)",
"default_field": "content"
}
}
}
}
}

Elastic Search - weighting base on an attribute

Is there a way in Elastic Search to weight results based on an attribute other than the one used for the search query? For example, we search the field 'name', but all documents that have the 'with_pictures' attribute set to true should be weighted higher.
You can set a boost on an individual field in the mapping with the boost parameter; matches on that field then count more towards the relevance score at query time.
Adding working example with index data, mapping and search query
Index mapping:
{
"mappings": {
"properties": {
"with_pictures": {
"type": "boolean",
"boost": 2
},
"name": {
"type": "keyword"
}
}
}
}
Index data:
{
"name": "A",
"with_pictures": false
}
{
"name": "A",
"with_pictures": true
}
{
"name": "B",
"with_pictures": true
}
Search Query:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"bool": {
"should": [
{
"term": {
"name": "A"
}
},
{
"term": {
"with_pictures": true
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "fd_cb1",
"_type": "_doc",
"_id": "1",
"_score": 1.4100108,
"_source": {
"name": "A",
"with_pictures": true
}
},
{
"_index": "fd_cb1",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"name": "B",
"with_pictures": true
}
},
{
"_index": "fd_cb1",
"_type": "_doc",
"_id": "2",
"_score": 0.4700036,
"_source": {
"name": "A",
"with_pictures": false
}
}
]
The document satisfying both conditions (name and with_pictures) has the highest score. However, the document with name: B and with_pictures: true scores higher than the one with name: A and with_pictures: false, because of the boost applied to the with_pictures field.
You can also refer to the function score query, which allows you to modify the score of documents that are retrieved by a query.
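Note that the mapping-level boost parameter is deprecated and was removed in Elasticsearch 8.0, so on newer clusters the boost has to be set in the query instead. A minimal Python sketch of a query-time variant: the name is required through must, and with_pictures only raises the score through should. The helper name and the boost value are assumptions, and this differs from the query above in that documents with other names are not returned.

```python
import json


def boosted_name_query(name, picture_boost=2.0):
    """Require a name match and rank documents with pictures higher."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {"name": name}}],
                "should": [
                    {
                        "term": {
                            "with_pictures": {
                                "value": True,
                                "boost": picture_boost,
                            }
                        }
                    }
                ],
            }
        }
    }


body = boosted_name_query("A")
print(json.dumps(body, indent=2))
```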

Elastic Search text search not scoring more for exact match

I have created below mapping in ES index :
{
"field_to_search": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
And using below query to get the data :
{
"query": {
"bool": {
"should": [
{
"match": {
"field_to_search": {
"query": "is this test?",
"boost": 10,
"fuzziness": "2",
"prefix_length": 2
}
}
}
],
"minimum_should_match": 1
}
},
"size": 20
}
Getting results :
{
"_index": "test",
"_id": "2551",
"_score": 70.02259,
"_source": {
"id": "2551",
"field_to_search": "is this test value?",
}
},
{
"_index": "test",
"_id": "2545",
"_score": 61.861847,
"_source": {
"id": "2545",
"field_to_search": "is this test?",
}
},
{
"_index": "test",
"_id": "2355",
"_score": 50.987878,
"_source": {
"id": "2355",
"field_to_search": "is this test performance value?",
}
}
Expected : is this test? doc on top
Here I'm not getting exact match on top. Score for exact match is less than the fuzzy match. Can someone please help here?
I have tried a fuzzy query with min boost as well, but it didn't work.
Have you reindexed your data after each mapping update?
Adding &explain=true to the search URL (or "explain": true to the request body) will provide more insight into how each score is computed.
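One approach that is often used for this kind of ranking problem, offered here as a suggestion rather than a confirmed fix for the scores above, is to add a second should clause on the keyword subfield so that an exact value gets an extra boost on top of the fuzzy match. A minimal Python sketch that builds such a body; the helper name and the boost of 20 are assumptions, and the term clause only matches when case and punctuation are identical to the stored value.

```python
import json


def fuzzy_with_exact_boost(text, field="field_to_search", exact_boost=20):
    """Fuzzy match on the text field plus a boosted exact match on .keyword."""
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "match": {
                            field: {
                                "query": text,
                                "fuzziness": "2",
                                "prefix_length": 2,
                            }
                        }
                    },
                    {
                        "term": {
                            f"{field}.keyword": {
                                "value": text,
                                "boost": exact_boost,
                            }
                        }
                    },
                ],
                "minimum_should_match": 1,
            }
        },
        "size": 20,
    }


body = fuzzy_with_exact_boost("is this test?")
print(json.dumps(body, indent=2))
```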

Elastic search query for name / value pair columns pull

We have one document in Elasticsearch with multiple sections of name/value pairs, and we want to fetch only the values whose name column matches a given value.
"envelopeData": {
"envelopeName": "Bills",
"details": {
"detail": [
{
"name": "UC_CORP",
"value": "76483"
},
{
"name": "UC_CYCLE",
"value": "V"
}
We are expecting only 76483 as result based on name equals to UC_CORP
If the field envelopeData.details.detail is of nested type, you can run a match query for the desired name on the nested path and use inner_hits to get just the value.
Map the field envelopeData.details.detail as nested (if it is not already):
PUT stackoverflow
{
"mappings": {
"_doc": {
"properties": {
"envelopeData.details.detail": {
"type": "nested"
}
}
}
}
}
then you can perform the following query to get value using inner_hits:
GET stackoverflow/_search
{
"_source": "false",
"query": {
"nested": {
"path": "envelopeData.details.detail",
"query": {
"match": {
"envelopeData.details.detail.name.keyword": "UC_CORP"
}
},
"inner_hits": {
"_source": "envelopeData.details.detail.value"
}
}
}
}
which outputs:
{
"_index": "stackoverflow",
"_type": "_doc",
"_id": "W5GUW2gB3GnGVyg-Sf4T",
"_score": 0.6931472,
"_source": {},
"inner_hits": {
"envelopeData.details.detail": {
"hits": {
"total": 1,
"max_score": 0.6931472,
"hits": [
{
"_index": "stackoverflow",
"_type": "_doc",
"_id": "W5GUW2gB3GnGVyg-Sf4T",
"_nested": {
"field": "envelopeData.details.detail",
"offset": 0
},
"_score": 0.6931472,
"_source": {
"value": "76483" -> Outputs value only
}
}
]
}
}
}
}
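On the client side, the value still has to be pulled out of the inner_hits structure. A minimal Python sketch, assuming each hit has the shape shown above and has already been parsed into a dict; the function name is hypothetical and the sample hit is trimmed to the fields the function reads.

```python
def nested_values(hit, path="envelopeData.details.detail"):
    """Collect the `value` of every inner hit under the given nested path."""
    inner = hit.get("inner_hits", {}).get(path, {})
    return [h["_source"]["value"] for h in inner.get("hits", {}).get("hits", [])]


sample_hit = {
    "_index": "stackoverflow",
    "_id": "W5GUW2gB3GnGVyg-Sf4T",
    "_source": {},
    "inner_hits": {
        "envelopeData.details.detail": {
            "hits": {
                "total": 1,
                "hits": [
                    {
                        "_nested": {
                            "field": "envelopeData.details.detail",
                            "offset": 0,
                        },
                        "_source": {"value": "76483"},
                    }
                ],
            }
        }
    },
}

values = nested_values(sample_hit)
print(values)
```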

Resources