How to implement space-ignored search across a document with multiple fields in Elasticsearch

I'm trying to implement a space-agnostic product catalog search solution using Elasticsearch for a chemistry-oriented product database. The use case is as follows:
Consider a chemical with the name Dimethyl Sulphoxide.
Some manufacturers label it as "Dimethylsulphoxide" and some as "Dimethyl Sulphoxide", so I could have two item entries in my ES index as follows:
Item 1: {"item_name":"Dimethyl Sulphoxide", "brand":"Merck"}
Item 2: {"item_name":"Dimethylsulphoxide","brand":"Spectrochem"}
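For concreteness, here is a hedged sketch of how these two items could be indexed against the mapping shown further below (note the mapping defines the field as name rather than item_name, and the document ids here are made up for illustration):
POST test-item-summary-5/_bulk
{"index":{"_id":"1"}}
{"name":"Dimethyl Sulphoxide","brand":"Merck"}
{"index":{"_id":"2"}}
{"name":"Dimethylsulphoxide","brand":"Spectrochem"}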
Ideally, if the user enters either string (i.e. "Dimethyl Sulphoxide" or "Dimethylsulphoxide"), I want both documents to be displayed in the hits.
To achieve this I'm doing two things:
1) At index time, I run the item_name field through a custom analyzer with the following flow:
tokenize with the keyword tokenizer, then filter with lowercase, then a word joiner (word_delimiter_graph with catenate_all), then an edge_ngram filter.
So the string "Dimethyl Sulphoxide" becomes ("Dimethyl", "Sulphoxide"), then ("dimethyl", "sulphoxide"), then ("dimethyl", "sulphoxide", "dimethylsulphoxide"), and finally ("d", "di", "dim", "dime", ..., "dimethyl", "s", "su", "sul", ..., "sulphoxide", "d", "di", ..., "dimethylsulphoxide").
I also run the other fields in the product document, such as brand, through the same analyzer.
2) At query time, I run the query string through a similar analyzer without the edge_ngram, by specifying a custom search analyzer for each field at index time. So a query string of "Dimethyl Sul" becomes ("Dimethyl", "Sul"), then ("dimethyl", "sul"), then ("dimethyl", "sul", "dimethylsul").
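To verify the two token streams, the _analyze API can be pointed at the analyzers defined in the mapping below (a minimal sketch; whitespaceWithEdgeNGram is the index analyzer and shingleSearchAnalyzer is the search analyzer assigned in that mapping):
GET test-item-summary-5/_analyze
{
  "analyzer": "whitespaceWithEdgeNGram",
  "text": "Dimethyl Sulphoxide"
}

GET test-item-summary-5/_analyze
{
  "analyzer": "shingleSearchAnalyzer",
  "text": "Dimethyl Sul"
}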
Now I can surface both results whether the user types the space or not, but this approach gets in the way of my other use cases.
Consider the second use case, where the user should also be able to search for an item by the name plus the brand and other fields, all in one search box. For example, a user could search for one of the items above by entering "dimethyl spectrochem" or "sulphoxide merck".
To allow this, I have tried a multi_match query with type cross_fields and a combined_fields query, both with the operator set to and. But this combination of the word joiner in the query string with cross_fields/combined_fields gives undesired results for the second use case.
When a user enters "dimethyl spectrochem", the search analyzer generates three tokens ("dimethyl", "spectrochem", and "dimethylspectrochem"), and when these are passed to cross_fields/combined_fields it essentially generates a query like:
+("item_name":"dimethyl","brand":"dimethyl")
+("item_name":"spectrochem","brand":"spectrochem")
+("item_name":"dimethylspectrochem","brand":"dimethylspectrochem")
Given how cross_fields works, it looks for documents where each of the three query strings is present in at least one of the fields. Since it cannot find "dimethylspectrochem" in any single field, it returns zero results.
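For reference, here is a minimal sketch of the failing query shape (field names taken from the mapping below; the combined_fields variant is analogous):
GET test-item-summary-5/_search
{
  "query": {
    "multi_match": {
      "query": "dimethyl spectrochem",
      "type": "cross_fields",
      "operator": "and",
      "fields": [ "name", "brand" ]
    }
  }
}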
Is there a way I can satisfy both use cases?
The mapping I specified during index creation is below:
curl -XPUT http://localhost:9200/test-item-summary-5 -H 'Content-Type: application/json' -d'
{
"settings":{
"analysis":{
"tokenizer":{
"whitespace":{
"type":"whitespace"
},
"keyword":{
"type":"keyword"
}
},
"filter":{
"lowercase":{
"type":"lowercase"
},
"shingle_word_joiner":{
"type":"shingle",
"token_separator":""
},
"word_joiner":{
"type":"word_delimiter_graph",
"catenate_all":true,
"split_on_numerics":false,
"stem_english_possessive":false
},
"edge_ngram_filter":{
"type":"edge_ngram",
"min_gram":1,
"max_gram":20,
"token_chars":[
"letter",
"digit"
]
}
},
"analyzer":{
"whitespaceWithEdgeNGram":{
"tokenizer":"keyword",
"filter":[
"lowercase",
"word_joiner",
"edge_ngram_filter"
]
},
"spaceIgnoredWithLowerCase":{
"tokenizer":"keyword",
"char_filter":[
"dash_char_filter"
],
"filter":[
"lowercase",
"word_joiner"
]
},
"shingleSearchAnalyzer":{
"tokenizer":"whitespace",
"char_filter":[
"dash_char_filter"
],
"filter":[
"lowercase",
"shingle_word_joiner"
]
},
"whitespaceWithLowerCase":{
"tokenizer":"whitespace",
"char_filter":[
"dash_char_filter"
],
"filter":[
"lowercase"
]
}
},
"char_filter":{
"dash_char_filter":{
"type":"mapping",
"mappings":[
"- => ",
", => ",
". => ",
"( => ",
") => ",
"? => ",
"! => ",
": => ",
"; => ",
"_ => ",
"% => ",
"& => ",
"+ => ",
"\" => ",
"\/ => ",
"\\[ => ",
"\\] => ",
"* => ",
"\u0027 => "
]
}
}
}
},
"mappings":{
"properties":{
"item_code":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"mfr_item_code":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"brand":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"name":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"short_name":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"alias":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"attrs":{
"type":"nested",
"properties":{
"name":{
"type":"text",
"index":"false"
},
"value":{
"type":"text",
"copy_to":"item:attrs:value",
"index":"false"
},
"primaryAttribute":{
"type":"boolean",
"index":"false"
}
}
},
"variant_summaries":{
"type":"nested",
"properties":{
"item_code":{
"type":"text",
"index":"false"
},
"variant_code":{
"type":"text",
"copy_to":"variant:variant_code",
"index":"false"
},
"mfr_item_code":{
"type":"text",
"index":"false"
},
"mfr_variant_code":{
"type":"text",
"copy_to":"variant:mfr_variant_code",
"index":"false"
},
"brand":{
"type":"text",
"index":"false"
},
"unit":{
"type":"text",
"copy_to":"variant:unit",
"index":"false"
},
"unit_mag":{
"type":"float",
"copy_to":"variant:unit",
"index":"false"
},
"primary_alternate_unit":{
"type":"nested",
"properties":{
"unit":{
"type":"text",
"copy_to":"variant:unit",
"index":"false"
},
"unit_mag":{
"type":"float",
"copy_to":"variant:unit",
"index":"false"
}
}
},
"attrs":{
"type":"nested",
"properties":{
"name":{
"type":"text",
"index":"false"
},
"value":{
"type":"text",
"copy_to":"variant:attrs:value",
"index":"false"
},
"primaryAttribute":{
"type":"boolean",
"index":"false"
}
}
},
"image":{
"type":"text",
"index":"false"
},
"in_stock":{
"type":"boolean",
"index":"false"
}
}
},
"added_by":{
"type":"text",
"index":"false"
},
"modified_by":{
"type":"text",
"index":"false"
},
"created_on":{
"type":"date",
"index":"false"
},
"updated_on":{
"type":"date",
"index":"false"
},
"is_deleted":{
"type":"boolean",
"index":"false"
},
"variant:variant_code":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"variant:mfr_variant_code":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"variant:attrs:value":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"variant:unit":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
},
"item:attrs:value":{
"type":"text",
"analyzer":"whitespaceWithEdgeNGram",
"search_analyzer":"shingleSearchAnalyzer"
}
}
}
}'
Any suggestions on implementing a space-ignored search across multiple fields would be highly appreciated.

Related

Error while mapping: unknown setting [index.knn.algo_param.m]

I am trying to change a mapping in Elasticsearch and am getting this error:
https://ibb.co/q5LkfWz
"reason": "unknown setting [index.knn.algo_param.m] please check that any required plugins are installed, or check the breaking changes documentation for removed settings
and this is the PUT request i am trying to make
PUT posting
{
"settings":{
"index":{
"number_of_shards":1,
"number_of_replicas":0,
"knn":{
"algo_param":{
"ef_search":40,
"ef_construction":40,
"m":"4"
}
}
},
"knn":true
},
"mappings":{
"properties":{
"vector":{
"type":"knn_vector",
"dimension":384
},
"title":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"index":false
}
}
},
"company":{
"type":"keyword",
"index":false
},
"location":{
"type":"keyword",
"index":false
},
"salary":{
"type":"keyword",
"index":false
},
"job_description":{
"type":"keyword",
"index":false
}
}
}
}
The reason indicates that the KNN plugin is not installed on the OpenSearch cluster.
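As a quick sanity check (a hedged suggestion, not part of the original answer), you can list the plugins installed on each node and confirm the k-NN plugin is present:
GET _cat/plugins?v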

Error while performing aggregation query in Elasticsearch: "illegal_argument_exception / Fielddata is disabled on text fields by default"

Hi, I am performing a curl request against an Elasticsearch instance. However, I am getting the error below.
curl -X GET "localhost:57457/mep-reports*/_search?pretty&size=0" -H 'Content-Type: application/json' --data-binary @query.txt
Response:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [status] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
}
],
"type" : "search_phase_execution_exception",
"reason" : "all shards failed",
"phase" : "query",
"grouped" : true,
"failed_shards" : [
{
"shard" : 0,
"index" : "mep-reports",
"node" : "NJuAFq3YSni4TIK9PzgJxg",
"reason" : {
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [status] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
}
}
],
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [status] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [status] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
}
}
},
"status" : 400
}
Any idea how to fix it? The following is the mapping definition I got:
curl -XGET "localhost:57457/mep-reports*/_mapping/field/*?pretty"
{
"mep-reports":{
"mappings":{
"doc":{
"_index":{
"full_name":"_index",
"mapping":{
}
},
"status.keyword":{
"full_name":"status.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"inventory":{
"full_name":"inventory",
"mapping":{
"inventory":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"flight_name.keyword":{
"full_name":"flight_name.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"nof_segments":{
"full_name":"nof_segments",
"mapping":{
"nof_segments":{
"type":"long"
}
}
},
"_all":{
"full_name":"_all",
"mapping":{
}
},
"_ignored":{
"full_name":"_ignored",
"mapping":{
}
},
"campaign_name":{
"full_name":"campaign_name",
"mapping":{
"campaign_name":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"_parent":{
"full_name":"_parent",
"mapping":{
}
},
"flight_id.keyword":{
"full_name":"flight_id.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"flight_name":{
"full_name":"flight_name",
"mapping":{
"flight_name":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"#version":{
"full_name":"#version",
"mapping":{
"#version":{
"type":"long"
}
}
},
"_version":{
"full_name":"_version",
"mapping":{
}
},
"campaign_id":{
"full_name":"campaign_id",
"mapping":{
"campaign_id":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"_routing":{
"full_name":"_routing",
"mapping":{
}
},
"_type":{
"full_name":"_type",
"mapping":{
}
},
"msg_text":{
"full_name":"msg_text",
"mapping":{
"msg_text":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"delivery_ts":{
"full_name":"delivery_ts",
"mapping":{
"delivery_ts":{
"type":"long"
}
}
},
"sender.keyword":{
"full_name":"sender.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"submission_ts":{
"full_name":"submission_ts",
"mapping":{
"submission_ts":{
"type":"long"
}
}
},
"flight_id":{
"full_name":"flight_id",
"mapping":{
"flight_id":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"_seq_no":{
"full_name":"_seq_no",
"mapping":{
}
},
"#timestamp":{
"full_name":"#timestamp",
"mapping":{
"#timestamp":{
"type":"date"
}
}
},
"account_id":{
"full_name":"account_id",
"mapping":{
"account_id":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"_field_names":{
"full_name":"_field_names",
"mapping":{
}
},
"sender":{
"full_name":"sender",
"mapping":{
"sender":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"recipient":{
"full_name":"recipient",
"mapping":{
"recipient":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"account_id.keyword":{
"full_name":"account_id.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"_source":{
"full_name":"_source",
"mapping":{
}
},
"_id":{
"full_name":"_id",
"mapping":{
}
},
"campaign_name.keyword":{
"full_name":"campaign_name.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"_uid":{
"full_name":"_uid",
"mapping":{
}
},
"recipient.keyword":{
"full_name":"recipient.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"inventory.keyword":{
"full_name":"inventory.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"msg_text.keyword":{
"full_name":"msg_text.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"status":{
"full_name":"status",
"mapping":{
"status":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
},
"campaign_id.keyword":{
"full_name":"campaign_id.keyword",
"mapping":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
}
}
}
I'd really appreciate it if you can help.
From the error, it appears that you are trying to perform an aggregation on a text field, i.e. status.
Note that you cannot aggregate on a text field unless you set fielddata: true on it.
However, this is not recommended, as it would consume a lot of heap space, which is exactly what the error warns about.
The mapping you've shared has the below details for the status field:
{
"status":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
}
}
I see that status has a sibling keyword field, status.keyword.
Just change the query in your query.txt to use status.keyword instead of status in your aggregation, and it should fix the issue.
Likewise, if you see more errors like this, you may want to make similar changes for the other fields. Note that this change needs to be made in your aggregation query.
Let me know if this helps!
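For illustration, here is a minimal sketch of what the corrected aggregation might look like (the aggregation name status_counts is made up; the only point is the switch to status.keyword):
GET mep-reports*/_search?size=0
{
  "aggs": {
    "status_counts": {
      "terms": {
        "field": "status.keyword"
      }
    }
  }
}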

Elasticsearch error: mapper [email.keyword] of different type, current_type [text], merged_type [keyword]

I'm doing an upsert in PHP on a newly created index, so there is no data present. I'm getting an exception that I would expect to see if data were already there, but the index is freshly created. Is there something special I have to do with upserts on newly created indexes as well? The upsert works fine until I add the custom analyzer.
{
"error":{
"root_cause":[
{
"type":"remote_transport_exception",
"reason":"[8902bb997443][127.0.0.1:9300][indices:data/write/update[s]]"
}
],
"type":"illegal_argument_exception",
"reason":"mapper [email.keyword] of different type, current_type [text], merged_type [keyword]"
},
"status":400
}
Listed below is my creation code for the index
{
"index":"myindex",
"body":{
"settings":{
"analysis":{
"analyzer":{
"my_email_analyzer":{
"type":"custom",
"tokenizer":"uax_url_email",
"filter":[
"lowercase",
"stop"
]
}
}
}
},
"mappings":{
"properties":{
"ak_additional_recovery_email":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"ak_city_town":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"ak_first_name":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"ak_last_name":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"ak_second_additional_recovery_email":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"ak_state":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"email":{
"type":"text",
"fields":{
"keyword":{
"type":"text",
"analyzer":"my_email_analyzer"
}
}
},
"indexedHash":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"uID":{
"type":"text",
"fields":{
"keyword":{
"type":"keyword",
"ignore_above":256
}
}
},
"uName":{
"type":"text",
"fields":{
"keyword":{
"type":"text",
"analyzer":"my_email_analyzer"
}
}
}
}
}
}
}
And here is the PHP code trying to do the upsert
$this->client->update([
'id' => $data['uID'],
'body' => [
'doc' => $data,
'upsert' => [
'uName' => $data['uName'],
'email' => $data['email'],
'ak_first_name' => $data['ak_first_name'],
'ak_last_name' => $data['ak_last_name'],
'ak_city_town' => $data['ak_city_town'],
'ak_state' => $data['ak_state']
]
],
'index' => $this->dbName,
'type' => 'general'
]);
Simple mistake! I was using an incorrect value for the index's document type. I'm not sure why this particular error was reported, though.
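For anyone hitting the same merge error: the creation code above maps the email.keyword subfield as text with a custom analyzer, while the conventional multi-field layout types .keyword subfields as keyword, which matches what the exception complains about. A hedged sketch of one way to sidestep the clash (the subfield name analyzed is made up) is to keep the conventional keyword subfield and give the analyzed variant its own name:
"email": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    },
    "analyzed": {
      "type": "text",
      "analyzer": "my_email_analyzer"
    }
  }
}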

Elasticsearch accented and non-accented word handling

I created an index:
PUT members
{
"settings":{
"number_of_shards":1,
"analysis":{
"analyzer":{
"accentedNames":{
"tokenizer":"standard",
"filter":[
"lowercase",
"asciifolding"
]
},
"standardNames":{
"tokenizer":"standard",
"filter":[
"lowercase"
]
}
}
}
},
"mappings":{
"member":{
"properties":{
"id":{
"type":"text"
},
"name":{
"type":"text",
"analyzer":"standardNames",
"fields":{
"accented":{
"type":"text",
"analyzer":"accentedNames"
}
}
}
}
}
}
}
Assume that the following documents are in the index (EDIT):
{"1", "Maéllys Macron"};
{"2", "Maêllys Alix"};
{"3", "Maëllys Rosa"};
{"4", "Maèllys Alix"};
{"5", "Maellys du Bois"};
I want the following result:
If I search for "Maéllys", I expect the document with the exact accent ("Maéllys Macron") as the best match, and the others with the same score.
What I did is to use my analyzers with a request like this:
GET members/member/_search
{
"query":{
"multi_match" : {
"query" : "Maéllys",
"fields" : [ "name", "name.accented" ]
}
}
}
"Maéllys Richard" has the best score. The documents "Ma(ê|ë|é|è)llys Richard have the same score that is higher than "Maellys Richard" document.
Can someone help me ?
Thanks.
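One knob to experiment with (a hedged sketch, not a confirmed fix) is per-field boosting, so that a hit on the exact, unfolded name field outweighs a hit on the folded name.accented field:
GET members/member/_search
{
  "query": {
    "multi_match": {
      "query": "Maéllys",
      "fields": [ "name^2", "name.accented" ]
    }
  }
}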

Why can't I get data from Elasticsearch?

Elasticsearch version 6.2.4.
I set up an Elasticsearch environment and created a mapping like this:
{
"state":"open",
"settings":{
"index":{
"number_of_shards":"5",
"provided_name":"lara_cart",
"creation_date":"1529082175034",
"analysis":{
"filter":{
"engram":{
"type":"edgeNGram",
"min_gram":"1",
"max_gram":"36"
},
"maxlength":{
"type":"length",
"max":"36"
},
"word_delimiter":{
"split_on_numerics":"false",
"generate_word_parts":"true",
"preserve_original":"true",
"generate_number_parts":"true",
"catenate_all":"true",
"split_on_case_change":"true",
"type":"word_delimiter",
"catenate_numbers":"true"
}
},
"char_filter":{
"normalize":{
"mode":"compose",
"name":"nfkc",
"type":"icu_normalizer"
},
"whitespaces":{
"pattern":"\s[2,]",
"type":"pattern_replace",
"replacement":"\u0020"
}
},
"analyzer":{
"keyword_analyzer":{
"filter":[
"lowercase",
"trim",
"maxlength"
],
"char_filter":[
"normalize",
"whitespaces"
],
"type":"custom",
"tokenizer":"keyword"
},
"autocomplete_index_analyzer":{
"filter":[
"lowercase",
"trim",
"maxlength",
"engram"
],
"char_filter":[
"normalize",
"whitespaces"
],
"type":"custom",
"tokenizer":"keyword"
},
"autocomplete_search_analyzer":{
"filter":[
"lowercase",
"trim",
"maxlength"
],
"char_filter":[
"normalize",
"whitespaces"
],
"type":"custom",
"tokenizer":"keyword"
}
},
"tokenizer":{
"engram":{
"type":"edgeNGram",
"min_gram":"1",
"max_gram":"36"
}
}
},
"number_of_replicas":"1",
"uuid":"5xyW07F-RRCuIJlvBufNbA",
"version":{
"created":"6020499"
}
}
},
"mappings":{
"products":{
"properties":{
"sale_end_at":{
"format":"yyyy-MM-dd HH:mm:ss",
"type":"date"
},
"image_5":{
"type":"text"
},
"image_4":{
"type":"text"
},
"created_at":{
"format":"yyyy-MM-dd HH:mm:ss",
"type":"date"
},
"description":{
"analyzer":"keyword_analyzer",
"type":"text",
"fields":{
"autocomplete":{
"search_analyzer":"autocomplete_search_analyzer",
"analyzer":"autocomplete_index_analyzer",
"type":"text"
}
}
},
"sale_start_at":{
"format":"yyyy-MM-dd HH:mm:ss",
"type":"date"
},
"sale_price":{
"type":"integer"
},
"category_id":{
"type":"integer"
},
"updated_at":{
"format":"yyyy-MM-dd HH:mm:ss",
"type":"date"
},
"price":{
"type":"integer"
},
"image_1":{
"type":"text"
},
"name":{
"analyzer":"keyword_analyzer",
"type":"text",
"fields":{
"autocomplete":{
"search_analyzer":"autocomplete_search_analyzer",
"analyzer":"autocomplete_index_analyzer",
"type":"text"
},
"keyword":{
"analyzer":"keyword_analyzer",
"type":"text"
}
}
},
"image_3":{
"type":"text"
},
"categories":{
"type":"nested",
"properties":{
"parent_category_id":{
"type":"integer"
},
"updated_at":{
"type":"text",
"fields":{
"keyword":{
"ignore_above":256,
"type":"keyword"
}
}
},
"name":{
"analyzer":"keyword_analyzer",
"type":"text",
"fields":{
"autocomplete":{
"search_analyzer":"autocomplete_search_analyzer",
"analyzer":"autocomplete_index_analyzer",
"type":"text"
}
}
},
"created_at":{
"type":"text",
"fields":{
"keyword":{
"ignore_above":256,
"type":"keyword"
}
}
},
"id":{
"type":"long"
}
}
},
"id":{
"type":"long"
},
"image_2":{
"type":"text"
},
"stock":{
"type":"integer"
}
}
}
},
"aliases":[
],
"primary_terms":{
"0":1,
"1":1,
"2":1,
"3":1,
"4":1
},
"in_sync_allocations":{
"0":[
"clYoJWUKTru2Z78h0OINwQ"
],
"1":[
"MGQC73KiQsuigTPg4SQG4g"
],
"2":[
"zW6v82gNRbe3wWKefLOAug"
],
"3":[
"5TKrfz7HRAatQsJudKX9-w"
],
"4":[
"gqiblStYSYy_NA6fYtkghQ"
]
}
}
I want to use suggest-style search via the autocomplete field.
So I added a document like this:
{
"_index":"lara_cart",
"_type":"products",
"_id":"19",
"_version":1,
"_score":1,
"_source":{
"id":19,
"name":"Conqueror, whose.",
"description":"I should think you'll feel it a bit, if you wouldn't mind,' said Alice: 'besides, that's not a regular rule: you invented it just missed her. Alice caught the flamingo and brought it back, the fight.",
"category_id":81,
"stock":79,
"price":11533,
"sale_price":15946,
"sale_start_at":null,
"sale_end_at":null,
"image_1":"https://lorempixel.com/640/480/?56260",
"image_2":"https://lorempixel.com/640/480/?15012",
"image_3":"https://lorempixel.com/640/480/?14138",
"image_4":"https://lorempixel.com/640/480/?94728",
"image_5":"https://lorempixel.com/640/480/?99832",
"created_at":"2018-06-01 16:12:41",
"updated_at":"2018-06-01 16:12:41",
"deleted_at":null,
"categories":{
"id":81,
"name":"A secret, kept.",
"parent_category_id":"33",
"created_at":"2018-06-01 16:12:41",
"updated_at":"2018-06-01 16:12:41",
"deleted_at":null
}
}
}
After that, I tried to search with the query below, but it returns nothing.
Do you know how to resolve it? I think the cause is in the mapping and settings.
{
"query":{
"bool":{
"must":[
{
"term":{
"name.autocomplete":"Conqueror"
}
}
],
"must_not":[
],
"should":[
]
}
},
"from":0,
"size":10,
"sort":[
],
"aggs":{
}
}
It's because the field you are querying is analyzed, and "term" does not analyze its input.
You can try a "match" query on the field whose analyzer is autocomplete; some basic knowledge of autocomplete and n-grams will help you understand this problem better.
e.g.
Say you defined the following analyzer:
PUT /my_index
{
"settings": {
"number_of_shards": 1,
"analysis": {
"filter": {
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 20
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"autocomplete_filter"
]
}
}
}
}
}
After that you can test the autocomplete analyzer with the following request:
GET /my_index/_analyze
{
  "analyzer": "autocomplete",
  "text": "quick brown"
}
As configured above, the analyzer generates edge n-grams for the input with lengths from 1 to 20, and the request returns:
q
qu
qui
quic
quick
b
br
bro
brow
brown
As we all know, a term query looks for the exact token in the field, just like a WHERE condition in MySQL.
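Putting it together for the index above, a minimal sketch of the suggested match query (using the field and term from the question) would be:
GET lara_cart/_search
{
  "query": {
    "match": {
      "name.autocomplete": "Conqueror"
    }
  }
}
Because match runs the query text through the field's search analyzer (autocomplete_search_analyzer here) before matching, "conqueror" lines up with the edge n-grams produced at index time.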
