I create an index like this, using PUT http://localhost:9200/test:
{
"settings": {
"number_of_shards": 1,
"analysis": {
"analyzer": {
"sortable": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase"
]
}
}
}
},
"mappings": {
}
}
This returned:
{"acknowledged":true}
Then I make sure that the analyzer is there:
http://localhost:9200/test/_analyze?_analyzer=sortable&text=HeLLo
{"tokens":[{"token":"hello","start_offset":0,"end_offset":5,"type":"<ALPHANUM>","position":0}]}
So I create a mapping for it, by PUT http://localhost:9200/test/_mapping/company:
{
"properties": {
"name": {
"type": "string",
"analyzer": "standard",
"fields": {
"raw": {
"type": {
"analyzer": "sortable"
}
}
}
}
}
}
This returns:
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"no handler for type [{analyzer=sortable}] declared on field [raw]"}],"type":"mapper_parsing_exception","reason":"no handler for type [{analyzer=sortable}] declared on field [raw]"},"status":400}
What is wrong?
In your raw sub-field the analyzer ended up nested inside the type object, but type must be a plain type name, with analyzer as a sibling key. Your company mapping needs to be fixed to this:
{
"properties": {
"name": {
"type": "string",
"analyzer": "standard",
"fields": {
"raw": {
"type": "string",
"analyzer": "sortable"
}
}
}
}
}
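With that in place, name is analyzed with the standard analyzer for full-text search, while name.raw holds a single lowercased token suitable for sorting. A minimal sketch of a sorted search, assuming documents of type company are indexed:
POST http://localhost:9200/test/company/_search
{
  "query": { "match_all": {} },
  "sort": [ { "name.raw": "asc" } ]
}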
I have an Elasticsearch project with my aggregation and filter working correctly, before I added a synonym analyzer to the mapping.
Current working mapping:
"settings": {
"analysis": {
"normalizer": {
"lowercase": {
"type": "custom",
"filter": ["lowercase"]
}
}
}
},
"mappings": {
"doc": {
"dynamic": "false",
"properties": {
"primarytrades": {
"type": "nested",
"properties" :{
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase"
}
}
}
}
}
}
}
}
This is the request and response, with the expected bucket values:
Request:
{"aggs":{"filter_trades":{"aggs":{"nested_trades":{"aggs":{"autocomplete_trades":{"terms":{"field":"primarytrades.name.keyword","include":".*p.*l.*u.*m.b.","size":10}}},"nested":{"path":"primarytrades"}}},"filter":{"nested":{"path":"primarytrades","query":{"bool":{"should":[{"match":{"primarytrades.name":{"fuzziness":2,"query":"plumb"}}},{"match_phrase_prefix":{"primarytrades.name":{"query":"plumb"}}}]}}}}}},"query":{"bool":{"filter":[{"nested":{"path":"primarytrades","query":{"bool":{"should":[{"match":{"primarytrades.name":{"fuzziness":2,"query":"plumb"}}},{"match_phrase_prefix":{"primarytrades.name":{"query":"plumb"}}}]}}}}]}},"size":0}
Response:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":7216,"max_score":0.0,"hits":[]},"aggregations":{"filter#filter_trades":{"doc_count":7216,"nested#nested_trades":{"doc_count":48496,"sterms#autocomplete_trades":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"plumbing","doc_count":7192},{"key":"plumbing parts","doc_count":179}]}}}}}
To add a synonym search feature to this, I changed the mapping to include a synonym analyzer, like this:
"settings": {
"analysis": {
"normalizer": {
"lowercase": {
"type": "custom",
"filter": [ "lowercase" ]
}
},
"analyzer": {
"synonym_analyzer": {
"type": "custom",
"tokenizer": "standard",
"filter": [ "lowercase", "my_synonyms" ]
}
},
"filter": {
"my_synonyms": {
"type": "synonym",
"synonyms": [ "piping, sink, plumbing" ],
"updateable": true
}
}
}
},
"mappings": {
"doc": {
"dynamic": "false",
"properties": {
"primarytrades": {
"type": "nested",
"properties" :{
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"analyzed": {
"type": "text",
"analyzer": "standard",
"search_analyzer": "synonym_analyzer"
}
}
}
}
}
}
}
}
And I also changed my query to use the search_analyzer, as below:
{"aggs":{"filter_trades":{"aggs":{"nested_trades":{"aggs":{"autocomplete_trades":{"match":{"field":"primarytrades.name.analyzed","include":".*p.*l.*u.*m.b.","size":10}}},"nested":{"path":"primarytrades"}}},"filter":{"nested":{"path":"primarytrades","query":{"bool":{"should":[{"match":{"primarytrades.name":{"fuzziness":2,"query":"plumb","search_analyzer":"synonym_analyzer"}}},{"match_phrase_prefix":{"primarytrades.name":{"query":"plumb","search_analyzer":"synonym_analyzer"}}}]}}}}}},"query":{"bool":{"filter":[{"nested":{"path":"primarytrades","query":{"bool":{"should":[{"match":{"primarytrades.name":{"fuzziness":2,"query":"plumb","search_analyzer":"synonym_analyzer"}}},{"match_phrase_prefix":{"primarytrades.name":{"query":"plumb","search_analyzer":"synonym_analyzer"}}}]}}}}]}}}
I am getting this error:
"type": "named_object_not_found_exception",
"reason": "[8:24] unable to parse BaseAggregationBuilder with name [match]: parser not found"
Can someone help me correct the query? Thanks in advance!
In your match queries, you need to specify analyzer and not search_analyzer. search_analyzer is only a valid keyword in the mapping section.
{
"match": {
"primarytrades.name": {
"fuzziness": 2,
"query": "plumb",
"analyzer": "synonym_analyzer" <--- change this
}
}
},
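Note also that the named_object_not_found_exception itself is raised by the aggregation section: match is a query, not an aggregation, so there is no aggregation parser for it. The autocomplete_trades aggregation should stay a terms aggregation on the keyword sub-field, as in your originally working request:
"autocomplete_trades": {
  "terms": {
    "field": "primarytrades.name.keyword",
    "include": ".*p.*l.*u.*m.b.",
    "size": 10
  }
}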
I am trying to create an index with a mapping of text and keyword fields, with an analyzer defined. Here is what I have tried so far:
{
"settings" : {
"number_of_shards" : 2,
"number_of_replicas" : 1
},
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"char_filter": [],
"filter": ["lowercase", "asciifolding"]
}
}
}
,
"mappings": {
"properties": {
"question": {
"type":"text",
"fields": {
"keyword": {
"type": "keyword"
},
"normalize": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
}
I have tried this but I am getting this error:
"error": {
"root_cause": [
{
"type": "parse_exception",
"reason": "unknown key [analysis] for create index"
}
],
"type": "parse_exception",
"reason": "unknown key [analysis] for create index"
},
"status": 400
}
question is the field where I need to add this mapping.
I am trying this on the AWS Elasticsearch Service.
Great start, you're almost there!
The analysis section needs to be located inside the top-level settings section, like this:
{
"settings": {
"index": {
"number_of_shards": 2,
"number_of_replicas": 1
},
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"char_filter": [],
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"properties": {
"question": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
},
"normalize": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
},
"answer": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
},
"normalize": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
}
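Once the index is created you can confirm the normalizer is wired up, since the _analyze API accepts a normalizer parameter. A quick check, assuming the index is named test:
GET /test/_analyze
{
  "normalizer": "my_normalizer",
  "text": "Héllo World"
}
This should return the whole input as a single token, hello world, because normalizers lowercase and ASCII-fold without tokenizing.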
I have read that in previous versions of ES (< 2) the "token_analyzer" key needed to be changed to "analyzer", but no matter what I do I am still getting this error:
"type": "mapper_parsing_exception",
"reason": "analyzer on field [email] must be set when search_analyzer is set"
Here is what I am passing to ES via a PUT request when I get the error:
{
"settings": {
"analysis": {
"analyzer": {
"my_email_analyzer": {
"type": "custom",
"tokenizer": "uax_url_email",
"filter": ["lowercase", "stop"]
}
}
}
},
"mappings" : {
"uuser": {
"properties": {
"email": {
"type": "text",
"search_analyzer": "my_email_analyzer",
"fields": {
"email": {
"type": "text",
"analyzer": "my_email_analyzer"
}
}
},
"facebookId": {
"type": "text"
},
"name": {
"type": "text"
},
"profileImageUrl": {
"type": "text"
},
"signupDate": {
"type": "date"
},
"username": {
"type": "text"
},
"phoneNumber": {
"type": "text"
}
}
}
}
}
Any ideas what is wrong?
Because you have specified a search_analyzer for the field, you also have to specify the analyzer to be used at indexing time. For example, add this line right below where you specify the search_analyzer:
"analyzer": "standard",
To give you this:
{
"settings": {
"analysis": {
"analyzer": {
"my_email_analyzer": {
"type": "custom",
"tokenizer": "uax_url_email",
"filter": ["lowercase", "stop"]
}
}
}
},
"mappings" : {
"uuser": {
"properties": {
"email": {
"type": "text",
"search_analyzer": "my_email_analyzer",
"analyzer": "standard",
"fields": {
"email": {
"type": "text",
"analyzer": "my_email_analyzer"
}
}
},
"facebookId": {
"type": "text"
},
"name": {
"type": "text"
},
"profileImageUrl": {
"type": "text"
},
"signupDate": {
"type": "date"
},
"username": {
"type": "text"
},
"phoneNumber": {
"type": "text"
}
}
}
}
}
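To see what the custom analyzer actually emits, the _analyze API is handy here too. A quick sketch, assuming the index is named test:
GET /test/_analyze
{
  "analyzer": "my_email_analyzer",
  "text": "John.Doe@Example.COM"
}
The uax_url_email tokenizer keeps the address intact, so this should return a single lowercased token, john.doe@example.com.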
See also: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html
Currently I am using a dynamic template as follows; here I am applying an n-gram analyzer to all the string fields.
However, to improve efficiency, I would like to apply n-gram only to specific fields and not to all string fields.
{
"template": "*",
"settings": {
"analysis": {
"filter": {
"ngram_filter": {
"type": "ngram",
"min_gram": 1,
"max_gram": 25
}
},
"analyzer": {
"case_insensitive": {
"tokenizer": "whitespace",
"filter": [
"ngram_filter",
"lowercase"
]
},
"search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": "lowercase"
}
}
}
},
"mappings": {
"my_type": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "string",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "case_insensitive",
"search_analyzer": "search_analyzer"
}
}
}
]
}
}
}
I have a payload like this:
{
"userId":"abc123-pqr180-xyz124-njd212",
"email" : "someuser#test.com",
"name" : "somename",
.
.
20 more fields
}
Now I want to apply n-gram only for "email" and "userId".
How can we do this?
Since you cannot rename the fields, I suggest the following solution, i.e. duplicating the dynamic template: once for the userId field and once for the email field.
{
"template": "*",
"settings": {
"analysis": {
"filter": {
"ngram_filter": {
"type": "ngram",
"min_gram": 1,
"max_gram": 25
}
},
"analyzer": {
"case_insensitive": {
"tokenizer": "whitespace",
"filter": [
"ngram_filter",
"lowercase"
]
},
"search_analyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": "lowercase"
}
}
}
},
"mappings": {
"my_type": {
"dynamic_templates": [
{
"names": {
"match_mapping_type": "string",
"match": "name",
"mapping": {
"type": "string",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "case_insensitive",
"search_analyzer": "search_analyzer"
}
}
},
{
"emails": {
"match_mapping_type": "string",
"match": "email",
"mapping": {
"type": "string",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "case_insensitive",
"search_analyzer": "search_analyzer"
}
}
}
]
}
}
}
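As an alternative to duplicating the template, setting match_pattern to regex lets a single dynamic template cover both fields. A sketch of just the dynamic_templates entry, with the same mapping body as above (the template name ngram_strings is arbitrary):
{
  "ngram_strings": {
    "match_mapping_type": "string",
    "match_pattern": "regex",
    "match": "^(userId|email)$",
    "mapping": {
      "type": "string",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      },
      "analyzer": "case_insensitive",
      "search_analyzer": "search_analyzer"
    }
  }
}
Either way, dynamic templates are evaluated in order and only the first match applies, so the remaining string fields simply fall back to Elasticsearch's default dynamic mapping.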
I have the following mapping for Elasticsearch:
{
"mappings": {
"hotel": {
'properties': {"name": {
"type": "string",
"search_analyzer": "str_search_analyzer",
"index_analyzer": "str_index_analyzer"},
"destination": {'properties': {'en': {
"type": "string",
"search_analyzer": "str_search_analyzer",
"index_analyzer": "str_index_analyzer"}}},
"country": {"properties": {"en": {
"type": "string",
"search_analyzer": "str_search_analyzer",
"index_analyzer": "str_index_analyzer"}}},
"destination_facets": {"properties": {"en": {
"type": "string",
"search_analyzer": "facet_analyzer"
}}}
}
}
},
"settings": {
"analysis": {
"analyzer": {
"str_search_analyzer": {
"tokenizer": "keyword",
"filter": ["lowercase"]
},
"str_index_analyzer": {
"tokenizer": "keyword",
"filter": ["lowercase", "substring"]
},
"facet_analyzer": {
"type": "keyword",
"tokenizer": "keyword"
}
},
"filter": {
"substring": {
"type": "edgeNGram",
"min_gram": 1,
"max_gram": 20,
}
}
}
}
}
I want my destination_facets not to be tokenized, but it comes out whitespace-tokenized. Is there a way to skip tokenization entirely?
You probably need to set your facet_analyzer not only as the search_analyzer but also as the index_analyzer (Elasticsearch probably uses the latter for faceting; the search_analyzer is only used to parse query strings).
Note that if you want the same analyzer for both, you can just use the analyzer key in your mapping.
Example:
{
"mappings": {
"hotel": {
...
"destination_facets": {"properties": {"en": {
"type": "string",
"analyzer": "facet_analyzer"
}}}
}
}
},
"settings": {
...
}
}
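To verify the fix, check that facet_analyzer emits the whole value as one token. A quick sketch, assuming the index is named hotels:
GET /hotels/_analyze
{
  "analyzer": "facet_analyzer",
  "text": "South East Asia"
}
With the keyword tokenizer this should come back as a single South East Asia token instead of three whitespace-split ones.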