Currently, I am using the ngram tokenizer to do partial matching of Employees. I can match on FullName, Email address and Employee Number. My current setup looks as follows:
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
}
}
The problem I am facing is that an Employee Number can be 1 character long, and because of the min_gram and max_gram settings, it can never match. I can't set min_gram to 1 either, because then the results do not look correct.
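For example, analyzing a short employee number with my_analyzer returns no tokens at all, because the input is shorter than min_gram (a quick check, assuming the index defined below already exists):
GET index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "11"
}
The token list comes back empty, so a query for 11 has nothing to match.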
So I tried mixing the ngram tokenizer with a standard tokenizer, and instead of a multi_match search I am now doing a simple_query_string.
This also seems to work, but only partially.
My question is: how can I partially match on all 3 fields, bearing in mind that an employee number can be 1 or 2 characters long? And how can I get an exact match when I wrap a word or number in quotes?
In the example below, how can I search for 11 and return documents 4 and 5?
Also, I would like document 2 to be returned if I search for 706, which is a partial match; but if I search for "7061" (quoted), only document 2 should be returned.
Full Code
PUT index
{
"settings": {
"analysis": {
"analyzer": {
"english_exact": {
"tokenizer": "standard",
"filter": [
"lowercase"
]
},
"my_analyzer": {
"filter": [
"lowercase",
"asciifolding"
],
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
}
},
"normalizer": {
"lowersort": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"properties": {
"number": {
"type": "text",
"analyzer": "english",
"fields": {
"exact": {
"type": "text",
"analyzer": "english_exact"
}
}
},
"fullName": {
"type": "text",
"fields": {
"ngram": {
"type": "text",
"analyzer": "my_analyzer"
}
},
"analyzer": "standard"
}
}
}
}
PUT index/_doc/1
{
"number" : 1,
"fullName": "Brenda eaton"
}
PUT index/_doc/2
{
"number" : 7061,
"fullName": "Bruce wayne"
}
PUT index/_doc/3
{
"number" : 23,
"fullName": "Bruce Banner"
}
PUT index/_doc/4
{
"number" : 111,
"fullName": "Cat woman"
}
PUT index/_doc/5
{
"number" : 1112,
"fullName": "0723568521"
}
GET index/_search
{
"query": {
"simple_query_string": {
"fields": [ "fullName.ngram", "number.exact"],
"query": "11"
}
}
}
You need to change the analyzer of the number.exact field and reduce min_gram to 2. Modify the index mapping as shown below.
Adding a working example
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"english_exact": {
"tokenizer": "standard",
"filter": [
"lowercase"
]
},
"my_analyzer": {
"filter": [
"lowercase",
"asciifolding"
],
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 2,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
}
},
"normalizer": {
"lowersort": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"properties": {
"number": {
"type": "keyword", // note this
"fields": {
"exact": {
"type": "text",
"analyzer": "my_analyzer"
}
}
},
"fullName": {
"type": "text",
"fields": {
"ngram": {
"type": "text",
"analyzer": "my_analyzer"
}
},
"analyzer": "standard"
}
}
}
}
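You can confirm the generated grams with the Analyze API (a quick check; 66311552 is just the index name used for this example):
GET 66311552/_analyze
{
  "analyzer": "my_analyzer",
  "text": "1112"
}
With min_gram reduced to 2, the output should include the gram 11, which is why the query below now matches documents 4 and 5.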
Search Query:
{
"query": {
"simple_query_string": {
"fields": [ "fullName.ngram", "number.exact"],
"query": "11"
}
}
}
Search Result:
"hits": [
{
"_index": "66311552",
"_type": "_doc",
"_id": "4",
"_score": 0.9929736,
"_source": {
"number": 111,
"fullName": "Cat woman"
}
},
{
"_index": "66311552",
"_type": "_doc",
"_id": "5",
"_score": 0.8505551,
"_source": {
"number": 1112,
"fullName": "0723568521"
}
}
]
Update 1:
If you need to search for just 1, modify the data type of the number field from text to keyword, as shown in the index mapping above. A keyword field indexes 1 as the single term 1, so even a one-character query can match it exactly.
Search Query:
{
"query": {
"simple_query_string": {
"fields": [ "fullName.ngram", "number.exact","number"],
"query": "1"
}
}
}
The search result will be:
"hits": [
{
"_index": "66311552",
"_type": "_doc",
"_id": "1",
"_score": 1.3862942,
"_source": {
"number": 1,
"fullName": "Brenda eaton"
}
}
]
Update 2:
You can use two separate analyzers with ngram tokenizers for the fullName field and the number field. Modify the index mapping as shown below:
{
"settings": {
"analysis": {
"analyzer": {
"english_exact": {
"tokenizer": "standard",
"filter": [
"lowercase"
]
},
"name_analyzer": {
"filter": [
"lowercase",
"asciifolding"
],
"tokenizer": "name_tokenizer"
},
"number_analyzer": {
"filter": [
"lowercase",
"asciifolding"
],
"tokenizer": "number_tokenizer"
}
},
"tokenizer": {
"name_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
},
"number_tokenizer": {
"type": "ngram",
"min_gram": 2,
"max_gram": 3,
"token_chars": [
"letter",
"digit"
]
}
},
"normalizer": {
"lowersort": {
"type": "custom",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"properties": {
"number": {
"type": "keyword",
"fields": {
"exact": {
"type": "text",
"analyzer": "number_analyzer"
}
}
},
"fullName": {
"type": "text",
"fields": {
"ngram": {
"type": "text",
"analyzer": "name_analyzer"
}
},
"analyzer": "standard"
}
}
}
}
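You can sanity-check number_analyzer the same way (again, 66311552 is just the example index name):
GET 66311552/_analyze
{
  "analyzer": "number_analyzer",
  "text": "7061"
}
The output should contain the grams 70, 706, 06, 061 and 61, so a search for 706 partially matches document 2, while a quoted "7061" can still be matched exactly against the keyword number field.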
Related
I have a value in the "handle" field of my users index. Let's say the exact string of this value is "exactstring". My query returns the correct result for "exactstrin", but it fails (returns nothing) for "exactstring" itself. N-grams like "actstr" also return the correct result. WTF?
{
"query": {
"bool": {
"must": [{
"multi_match": {
"fields": ["handle","name","bio"],
"prefix_length": 2,
"query": "exacthandle",
"fuzziness": "AUTO"
}
}],
"should" : [{
"multi_match": {
"fields": ["handle^6", "name^3", "bio"],
"query": "exacthandle"
}
}]
}
},
"size": 100,
"from": 0
}
Here are my settings:
"settings": {
"analysis": {
"analyzer": {
"search_term_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"standard",
"lowercase",
"asciifolding",
"no_stop"
],
"tokenizer": "whitespace"
},
"ngram_token_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"standard",
"lowercase",
"asciifolding",
"no_stop",
"ngram_filter"
],
"tokenizer": "whitespace"
}
},
"filter": {
"no_stop": {
"type": "stop",
"stopwords": "_none_"
},
"ngram_filter": {
"type": "nGram",
"min_gram": "2",
"max_gram": "9"
}
}
}
}
And my mappings:
"handle": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"bio": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
This happens because you have set too small a value for max_gram (the maximum length of characters in a gram). When you query for exactstring, the tokens generated from it have a maximum length of 9 (i.e. exactstri). Because of this, the query matches exactstri but not exactstring.
You need to specify the analyzer of the field. When mapping an index, you can use the analyzer mapping parameter to specify an analyzer for each text field. If no analyzer is specified, the standard analyzer is used.
You also need to increase max_gram to at least 11 (the length of exactstring) for ngram_token_analyzer in your index settings, and set "max_ngram_diff": 20. The generated tokens (which you can check using the Analyze API) will then include exactstrin as well as exactstring, and your query will match both.
Index Mapping:
{
"settings": {
"analysis": {
"analyzer": {
"search_term_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"lowercase",
"asciifolding",
"no_stop"
],
"tokenizer": "whitespace"
},
"ngram_token_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"lowercase",
"asciifolding",
"no_stop",
"ngram_filter"
],
"tokenizer": "whitespace"
}
},
"filter": {
"no_stop": {
"type": "stop",
"stopwords": "_none_"
},
"ngram_filter": {
"type": "nGram",
"min_gram": "2",
"max_gram": "11" <-- note this
}
}
},
"max_ngram_diff": 20
},
"mappings": {
"properties": {
"handle": {
"type": "text",
"analyzer": "ngram_token_analyzer"
}
}
}
}
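For example, the Analyze API check mentioned above could look like this (64957062 is just the example index name):
GET 64957062/_analyze
{
  "analyzer": "ngram_token_analyzer",
  "text": "exactstring"
}
With max_gram increased to 11, the returned tokens should include both exactstrin and exactstring.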
Index Data:
{
"handle":"exactstring"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": [
"handle",
"name",
"bio"
],
"prefix_length": 2,
"query": "exactstring",
"fuzziness": "AUTO"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "64957062",
"_type": "_doc",
"_id": "1",
"_score": 0.6292809,
"_source": {
"handle": "exactstring"
}
}
]
I ended up using Multi-fields to solve this. Querying the field "handle.text" now returns exact string matches of arbitrary length.
I was also able to create ngram and edge_ngram filters with different min and max grams and then use them simultaneously.
The mapping looks like this now:
"handle": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"text": {
"type": "text"
},
"edge_ngram": {
"type": "text",
"analyzer": "edge_ngram_token_analyzer"
},
"ngram": {
"type": "text",
"analyzer": "ngram_token_analyzer"
}
}
}
The multi-match query can now use different analyzers on the same field simultaneously. (The big difference between the must and should queries comes down to fuzziness: the bio field tends to contain a lot of small tokens, and fuzziness produces a lot of bad hits there.)
{
"query": {
"bool": {
"must" : [{
"multi_match": {
"fields": [
"handle.text^6",
"handle.edge_ngram^4",
"handle.ngram^2",
"bio.text^6",
"bio.keyword^4",
"name.text^6",
"name.keyword^5",
"name.edge_ngram^2",
"name.ngram^3"
],
"query": "exactstring",
"prefix_length": 2
}
}],
"should" : [{
"multi_match": {
"fields": [
"handle.text^6",
"handle.edge_ngram^2",
"handle.ngram^1",
"name.text^6",
"name.keyword^5",
"name.edge_ngram^2",
"name.ngram^3"
],
"query": "exactstring",
"prefix_length": 2,
"fuzziness": "AUTO"
}
}]
}
},
"size": 100,
"from": 0
}
For completeness, here are the settings of my index:
"settings": {
"analysis": {
"analyzer": {
"search_term_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"standard",
"lowercase",
"asciifolding",
"no_stop"
],
"tokenizer": "whitespace"
},
"ngram_token_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"standard",
"lowercase",
"asciifolding",
"no_stop",
"ngram_filter"
],
"tokenizer": "whitespace"
},
"edge_ngram_token_analyzer": {
"type": "custom",
"stopwords": "_none_",
"filter": [
"standard",
"lowercase",
"asciifolding",
"no_stop",
"edge_ngram_filter"
],
"tokenizer": "whitespace"
}
},
"filter": {
"no_stop": {
"type": "stop",
"stopwords": "_none_"
},
"ngram_filter": {
"type": "nGram",
"min_gram": "3",
"max_gram": "12"
},
"edge_ngram_filter": {
"type": "edgeNGram",
"min_gram": "3",
"max_gram": "50"
}
}
},
"max_ngram_diff": 50
}
h/t to #bhavya for max_ngram_diff
I have implemented auto-suggest using Elasticsearch, where I give users suggestions based on the value typed into the 'where' field. Most of it works fine if I type a full word or the first few characters of a word. I want to highlight only the characters the user typed: say the user types 'ca', then a suggestion should highlight just the 'Ca' in 'California' and not the whole word.
The highlight tag should produce a result like <b>Ca</b>lifornia, not <b>California</b>.
Here are my index settings:
{
"settings": {
"index": {
"analysis": {
"filter": {
"edge_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 50
},
"lowercase_filter":{
"type":"lowercase",
"language": "greek"
},
"metro_synonym": {
"type": "synonym",
"synonyms_path": "metro_synonyms.txt"
},
"profession_specialty_synonym": {
"type": "synonym",
"synonyms_path": "profession_specialty_synonyms.txt"
}
},
"analyzer": {
"auto_suggest_analyzer": {
"filter": [
"lowercase",
"edge_filter"
],
"type": "custom",
"tokenizer": "whitespace"
},
"auto_suggest_search_analyzer": {
"filter": [
"lowercase"
],
"type": "custom",
"tokenizer": "whitespace"
},
"lowercase": {
"filter": [
"trim",
"lowercase"
],
"type": "custom",
"tokenizer": "keyword"
}
}
}
}
},
"mappings": {
"properties": {
"what_auto_suggest": {
"type": "text",
"analyzer": "auto_suggest_analyzer",
"search_analyzer": "auto_suggest_search_analyzer",
"fields": {
"raw":{
"type":"keyword"
}
}
},
"company": {
"type": "text",
"analyzer": "lowercase"
},
"where_auto_suggest": {
"type": "text",
"analyzer": "auto_suggest_analyzer",
"search_analyzer": "auto_suggest_search_analyzer",
"fields": {
"raw":{
"type":"keyword"
}
}
},
"tags_auto_suggest": {
"type": "text",
"analyzer": "auto_suggest_analyzer",
"search_analyzer": "auto_suggest_search_analyzer",
"fields": {
"raw":{
"type":"keyword"
}
}
}
}
}
}
The query I am using to pull suggestions:
GET /autosuggest_index_test/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"where_auto_suggest": {
"query": "ca",
"operator": "and"
}
}
}
]
}
},
"aggs": {
"NAME": {
"terms": {
"field": "where_auto_suggest.raw",
"size": 10
}
}
},
"highlight": {
"pre_tags": [
"<b>"
],
"post_tags": [
"</b>"
],
"fields": {
"where_auto_suggest": {
}
}
}
}
One of the JSON results I am getting:
{
"_index" : "autosuggest_index_test",
"_type" : "_doc",
"_id" : "Calabasas CA",
"_score" : 5.755663,
"_source" : {
"where_auto_suggest" : "Calabasas CA"
},
"highlight" : {
"where_auto_suggest" : [
"<b>Calabasas</b> <b>CA</b>"
]
}
}
Can someone please suggest how to get output here (in where_auto_suggest) like "<b>Ca</b>labasas <b>CA</b>"?
I don't really know why, but if you use an edge_ngram tokenizer instead of an edge_ngram filter, you get highlighted characters instead of highlighted words.
So in your settings, you could declare such a tokenizer:
"settings": {
"index": {
"analysis": {
"tokenizer": {
"edge_tokenizer": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 50,
"token_chars": [
"letter",
"digit",
"punctuation",
"symbol"
]
}
},
...
}
}
}
And change your analyzer to :
"analyzer": {
"auto_suggest_analyzer": {
"filter": [
"lowercase"
],
"type": "custom",
"tokenizer": "edge_tokenizer"
}
...
}
Thus your example request will return
{
...
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.2876821,
"hits": [
{
"_index": "autosuggest_index_test",
"_type": "_doc",
"_id": "grIzo28BY9R4-IxJhcFv",
"_score": 0.2876821,
"_source": {
"where_auto_suggest": "california"
},
"highlight": {
"where_auto_suggest": [
"<b>ca</b>lifornia"
]
}
}
]
}
...
}
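My best guess at the why: highlighting works on token offsets. An edge_ngram tokenizer emits each gram with its own start/end offsets, while the edge_ngram filter keeps the offsets of the whole original token, so with the filter the highlighter wraps the entire word. You can compare the offsets with the Analyze API (a sketch using an inline tokenizer definition mirroring the one above):
GET _analyze
{
  "tokenizer": {
    "type": "edge_ngram",
    "min_gram": 1,
    "max_gram": 50,
    "token_chars": ["letter", "digit", "punctuation", "symbol"]
  },
  "text": "california"
}
Each gram (c, ca, cal, ...) comes back with its own offsets (0-1, 0-2, 0-3, ...), so a match on ca highlights only the first two characters.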
Elasticsearch query: changing the displayed results according to the scoring
The current query returns results on the title field in the following order:
1. Quick 123
2. Foxes Quick
3. Quick
4. Foxes Quick Quick
5. Quick Foxes
Shouldn't 3. Quick be the first result instead? Also, Foxes Quick Quick contains two occurrences of Quick, so it should get some preference in the results, yet it comes in 4th position.
Index Settings:
{
"fundraisers": {
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "fundraisers",
"creation_date": "1546515635025",
"analysis": {
"analyzer": {
"my_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "my_tokenizer"
},
"search_analyzer_search": {
"filter": [
"lowercase"
],
"tokenizer": "search_tokenizer_search"
}
},
"tokenizer": {
"my_tokenizer": {
"token_chars": [
"letter",
"digit"
],
"min_gram": "3",
"type": "edge_ngram",
"max_gram": "50"
},
"search_tokenizer_search": {
"token_chars": [
"letter",
"digit",
"whitespace"
],
"min_gram": "3",
"type": "ngram",
"max_gram": "50"
}
}
},
"number_of_replicas": "1",
"uuid": "mVweO4_sT3Ww00MzdLyavw",
"version": {
"created": "6020399"
}
}
}
}
}
Query
GET fundraisers/_search?explain=true
{
"query": {
"match_phrase": {
"title": {
"query": "qui",
"analyzer": "my_analyzer"
}
}
}
}
Mapping
{
"fundraisers": {
"mappings": {
"fundraisers": {
"properties": {
"status": {
"type": "text"
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
},
"title": {
"type": "text",
"analyzer": "my_analyzer"
},
"twitterUrl": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"videoLinks": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"zipCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
Am I complicating this too much by using match_phrase, a search analyzer and ngrams, or is there a simpler way to achieve the expected result?
Ref:
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-match-query.html
Ok, first let's create a minimal and reproducible setup:
PUT test
{
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "1",
"analysis": {
"analyzer": {
"my_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "my_tokenizer"
},
"search_analyzer_search": {
"filter": [
"lowercase"
],
"tokenizer": "search_tokenizer_search"
}
},
"tokenizer": {
"my_tokenizer": {
"token_chars": [
"letter",
"digit"
],
"min_gram": "3",
"type": "edge_ngram",
"max_gram": "50"
},
"search_tokenizer_search": {
"token_chars": [
"letter",
"digit",
"whitespace"
],
"min_gram": "3",
"type": "ngram",
"max_gram": "50"
}
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"title": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
}
PUT test/_doc/1
{
"title": "Quick 123"
}
PUT test/_doc/2
{
"title": "Foxes Quick"
}
PUT test/_doc/3
{
"title": "Quick"
}
PUT test/_doc/4
{
"title": "Foxes Quick Quick"
}
PUT test/_doc/5
{
"title": "Quick Foxes"
}
Then let's try the simplest query:
GET test/_search
{
"query": {
"match": {
"title": {
"query": "qui"
}
}
}
}
And now your order is:
Quick
Foxes Quick Quick
Quick 123
Foxes Quick
Quick Foxes
That's pretty much what you were expecting, right? There might be other use cases which are not covered by this query, but IMO you'll have to use multi_match and search with different analyzers, because I'm not sure a phrase search on an edge-gram field makes much sense.
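For example, assuming you added a hypothetical ngram subfield title.ngram analyzed with search_analyzer_search (it is not part of the mapping above), a multi_match across both analyzers could look roughly like this:
GET test/_search
{
  "query": {
    "multi_match": {
      "query": "qui",
      "fields": [
        "title^2",
        "title.ngram"
      ]
    }
  }
}
This boosts the edge-gram (prefix) matches while still allowing infix matches via the ngram subfield.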
I've been trying to create my own users index, where searches are run against the "name" value.
These are my current index settings:
{
"users": {
"settings": {
"index": {
"analysis": {
"filter": {
"shingle_filter": {
"max_shingle_size": "2",
"min_shingle_size": "2",
"output_unigrams": "true",
"type": "shingle"
},
"edgeNGram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete_query_analyzer": {
"filter": [
"standard",
"asciifolding",
"lowercase"
],
"tokenizer": "standard"
},
"autocomplete_index_analyzer": {
"filter": [
"standard",
"asciifolding",
"lowercase",
"shingle_filter",
"edgeNGram_filter"
],
"tokenizer": "standard"
}
}
},
"number_of_shards": "1",
"number_of_replicas": "1"
}
}
}
}
and my mapping:
{
"users": {
"mappings": {
"data": {
"properties": {
"name": {
"type": "string",
"analyzer": "autocomplete_index_analyzer",
"search_analyzer": "autocomplete_query_analyzer"
}
}
}
}
}
}
Right now my problem is that search queries only match from the start of the term. For example, if I have a user "David", the queries "Da", "Dav", "Davi", etc. return the document, but searching for "vid" or "avid" returns nothing.
Is this because of some value I'm missing in the settings?
You need to use nGram instead of edgeNGram. So simply change this
"edgeNGram_filter": {
"type": "edgeNGram",
"min_gram": "1",
"max_gram": "20"
}
into this
"edgeNGram_filter": {
"type": "nGram", <--- change here
"min_gram": "1",
"max_gram": "20"
}
Note that you need to wipe your index, recreate it, and then populate it again.
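After reindexing you can compare the tokens with the Analyze API (a quick check, assuming the index is called users):
GET users/_analyze
{
  "analyzer": "autocomplete_index_analyzer",
  "text": "David"
}
With the nGram filter the output should now include grams such as vid and avid, which is why those searches start returning the document.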
I am trying to build a suggestion feature with Elasticsearch, following this article: https://qbox.io/blog/multi-field-partial-word-autocomplete-in-elasticsearch-using-ngrams
What I have now works, but not for two words in the same sentence.
The data I currently have in ES is:
{
"_index": "books",
"_type": "book",
"_id": "AVJp8p4ZTfM-Ee45GnF5",
"_score": 1,
"_source": {
"title": "Making a dish",
"author": "Jim haunter"
}
},
{
"_index": "books",
"_type": "book",
"_id": "AVJp8jaZTfM-Ee45GnF4",
"_score": 1,
"_source": {
"title": "The big fish",
"author": "Jane Stewart"
}
},
{
"_index": "books",
"_type": "book",
"_id": "AVJp8clRTfM-Ee45GnF3",
"_score": 1,
"_source": {
"title": "The Hunter",
"author": "Jame Franco"
}
}
Here are the mapping and settings:
{"settings": {
"analysis": {
"filter": {
"nGram_filter": {
"type": "nGram",
"min_gram": 2,
"max_gram": 20,
"token_chars": [
"letter",
"digit"
]
}
},
"analyzer": {
"nGram_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"nGram_filter"
]
},
"whitespace_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"books": {
"_all": {
"index_analyzer": "nGram_analyzer",
"search_analyzer": "whitespace_analyzer"
},
"properties": {
"title": {
"type": "string",
"index": "no"
},
"author": {
"type": "string",
"index": "no"
}
}
}
}
}
Here is the search
{
"size": 10,
"query": {
"match": {
"_all": {
"query": "Hunter",
"operator": "and",
"fuzziness": 1
}
}
}
}
When I search for "The" I get "The big fish" and "The Hunter". However, when I enter "The Hunt" I get nothing; to get the book again I need to enter "The Hunte".
Any suggestions? Any help is appreciated.
Removing "index": "no" from the fields worked for me. Also, since I'm using ES 2.x, I had to replace "index_analyzer" with "analyzer". So here is the mapping:
PUT /test_index
{
"settings": {
"analysis": {
"filter": {
"nGram_filter": {
"type": "nGram",
"min_gram": 2,
"max_gram": 20,
"token_chars": [
"letter",
"digit"
]
}
},
"analyzer": {
"nGram_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase",
"nGram_filter"
]
},
"whitespace_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
"books": {
"_all": {
"analyzer": "nGram_analyzer",
"search_analyzer": "whitespace_analyzer"
},
"properties": {
"title": {
"type": "string"
},
"author": {
"type": "string"
}
}
}
}
}
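With that mapping in place, the partial two-word phrase from the question should match as well (a quick check, reusing the original search):
GET test_index/_search
{
  "size": 10,
  "query": {
    "match": {
      "_all": {
        "query": "The Hunt",
        "operator": "and",
        "fuzziness": 1
      }
    }
  }
}
This should return "The Hunter", since hunt is now one of the indexed grams.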
Here's some code I used to test it:
http://sense.qbox.io/gist/0140ee0f5043f66e76cc3109a18d573c1d09280b