How to create a custom reusable type in ElasticSearch? - elasticsearch

My json for ElasticSearch schema looks like this :-
{
"mappings": {
"properties": {
"DESCRIPTION_FR": {
"type": "text",
"analyzer": "french"
},
"FEEDBACK_FR": {
"type": "text",
"analyzer": "french"
},
"SOURCE_FR": {
"type": "text",
"analyzer": "french"
}
}
}
}
There are 100 of properties like this. Replicating a change across all the properties with this approach is redundant and erroneous.
Is there a way in ElasticSearch 7.2 to write custom data type and reuse it in property mapping.
{
"settings": {
//definition of custom type "text_fr"
},
"mappings": {
"properties": {
"DESCRIPTION_FR": {
"type": "text_fr"
},
"FEEDBACK_FR": {
"type": "text_fr"
},
"SOURCE_FR": {
"type": "text_fr"
}
}
}
}

Yes! What you're after is dynamic mapping templates. More specifically the match feature.
Define the target field names with a leading wildcard:
PUT my_index
{
"mappings": {
"dynamic_templates": [
{
"is_french_text": {
"match_mapping_type": "*",
"match": "*_FR",
"mapping": {
"type": "text",
"analyzer": "french"
}
}
}
]
}
}
Insert a doc:
POST my_index/_doc
{
"DESCRIPTION_FR": "je",
"FEEDBACK_FR": "oui",
"SOURCE_FR": "je ne sais quoi"
}
Verify the dynamically generated mapping:
GET my_index/_mapping

Related

In Elasticsearch, can I set language-specific multi-fields?

Is it possible to use multi-fields to set and query multilingual fields?
Consider this mapping:
PUT multi_test
{
"mappings": {
"data": {
"_field_names": {
"enabled": false
},
"properties": {
"book_title": {
"type": "text",
"fields": {
"english": {
"type": "text",
"analyzer": "english"
},
"german": {
"type": "text",
"analyzer": "german"
},
"italian": {
"type": "text",
"analyzer": "italian"
}
}
}
}
}
}
}
I tried the following, but it doesn't work:
PUT multi_test/data/1
{
"book_title.english": "It's good",
"book_title.german": "Das gut"
}
The error seems to indicate I'm trying to add new fields:
{ "error": { "root_cause": [ { "type": "mapper_parsing_exception",
"reason": "Could not dynamically add mapping for field
[book_title.english]. Existing mapping for [book_title] must be of
type object but found [text]." } ], "type":
"mapper_parsing_exception", "reason": "Could not dynamically add
mapping for field [book_title.english]. Existing mapping for
[book_title] must be of type object but found [text]." }, "status":
400 }
What am I doing wrong here?
If my approach is unworkable, what is a better way to do this?
The problem is that you are using using fields for the field book_title.
Fields keyword is used when you want to keep same field and data in multiple ways i.e using different analyzers or some other setting changes but values should be same in all field names under fields.Here is the link describing what is keyword fields https://www.elastic.co/guide/en/elasticsearch/reference/2.4/multi-fields.html
In you use case the mapping should be like below
PUT multi_test
{
"mappings": {
"data": {
"_field_names": {
"enabled": false
},
"properties": {
"book_title": {
"properties": {
"english": {
"type": "text",
"analyzer": "english"
},
"german": {
"type": "text",
"analyzer": "german"
},
"italian": {
"type": "text",
"analyzer": "italian"
}
}
}
}
}
}
}
This will define book_title as object type and you can add multiple fields with different data under book_title

Combining Index Templates with Dynamic Templates

I would like to combine Index Templates and Dynamic Templates so that the dynamic mapping I define is automatically added to all indices created.
Is this possible?
Regards Morten
In the index template define the mappings as dynamic template.
For example :
PUT /_template/template_1
{
"template": "yourindex*",
"mappings": {
"my_type": {
"dynamic_templates": [
{your dynamic templates ...}
]
}
}
}
You can do something like this:
PUT /_template/my_template
{
"template": "name-*",
"mappings": {
"my_type": {
"dynamic_templates": [
{
"rule1": {
"match": "field*",
"mapping": {
"type": "string",
"index": "analyzed"
}
}
},
{
"rule2": {
"match": "another*",
"mapping": {
"type": "integer"
}
}
}
],
"properties": {
"field": {
"type": "string"
}
}
}
}
}
In Elasticsearch 7.x, document types are now deprecated. For this reason, the examples from old answers will raise exception. Solution below:
PUT /_template/test_template
{
"index_patterns" : [
"test*"
],
"mappings": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 1024
}
}
}
}
}
]
}
}
P.S. Also I recommend set index type "_doc" into old applications for compatibility with these templates now (fluent-bit v1.3, for example).

Update and search in multi field properties in ElasticSearch

I'm trying to use multi field properties for multi language support. I created following mapping for this:
{
"mappings": {
"product": {
"properties": {
"prod-id": {
"type": "string"
},
"prod-name": {
"type": "string",
"fields": {
"en": {
"type": "string",
"analyzer": "english"
},
"fr": {
"type": "string",
"analyzer": "french"
}
}
}
}
}
}
}
I created test record:
{
"prod-id": "1234567",
"prod-name": [
"Test product",
"Produit d'essai"
]
}
and tried to query using some language:
{
"query": {
"bool": {
"must": [
{"match": {
"prod-name.en": "Produit"
}}
]
}
}
}
As a result I got my document. But I expected that I will have empty result when I use French but choose English. It seems ElasticSearch ignores which field I specified in query. There is no difference in search result when I use "prod-name.en" or "prod-name.fr" or just "prod-name". Is this behaviour expected? Should I do some special things to have searching just in one language?
Another problem with updating multi field property. I can't update just one field.
{
"doc" : {
"prod-name.en": "Test"
}
}
I got following error:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Field name [prod-name.en] cannot contain '.'"
}
],
"type": "mapper_parsing_exception",
"reason": "Field name [prod-name.en] cannot contain '.'"
},
"status": 400
}
Is there any way to update just one field in multi field property?
In your mapping, the prod-name.en field will simply be analyzed using the english analyzer and the same for the french field. However, ES will not choose for you which value to put in which field.
Instead, you need to modify your mapping like this
{
"mappings": {
"product": {
"properties": {
"prod-id": {
"type": "string"
},
"prod-name": {
"type": "object",
"properties": {
"en": {
"type": "string",
"analyzer": "english"
},
"fr": {
"type": "string",
"analyzer": "french"
}
}
}
}
}
}
}
and input document to be like this and you'll get the results you expect.
{
"prod-id": "1234567",
"prod-name": {
"en": "Test product",
"fr": "Produit d'essai"
}
}
As for the updating part, your partial document should be like this instead.
{
"doc" : {
"prod-name": {
"en": "Test"
}
}
}

Elasticsearch: How to make all properties of object type as non analyzed?

I need to create an Elasticsearch mapping with an Object field whose keys are not known in advance. Also, the values can be integers or strings. But I want the values to be stored as non analyzed fields if they are strings. I tried the following mapping:
PUT /my_index/_mapping/test
{
"properties": {
"alert_text": {
"type": "object",
"index": "not_analyzed"
}
}
}
Now the index is created fine. But if I insert values like this:
POST /my_index/test
{
"alert_text": {
"1": "hello moto"
}
}
The value "hello moto" is stored as an analyzed field using standard analyzer. I want it to be stored as a non analyzed field. Is it possible if I don't know in advance what all keys can be present ?
Try dynamic templates. With this feature you can configure a set of rules for the fields that are created dynamically.
In this example I've configured the rule that I think you need, i.e, all the strings fields within alert_text are not_analyzed:
PUT /my_index
{
"mappings": {
"test": {
"properties": {
"alert_text": {
"type": "object"
}
},
"dynamic_templates": [
{
"alert_text_strings": {
"path_match": "alert_text.*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "not_analyzed"
}
}
}
]
}
}
}
POST /my_index/test
{
"alert_text": {
"1": "hello moto"
}
}
After executing the requests above you can execute this query to show the current mapping:
GET /my_index/_mapping
And you will obtain:
{
"my_index": {
"mappings": {
"test": {
"dynamic_templates": [
{
"alert_text_strings": {
"mapping": {
"index": "not_analyzed",
"type": "string"
},
"match_mapping_type": "string",
"path_match": "alert_text.*"
}
}
],
"properties": {
"alert_text": {
"properties": {
"1": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
Where you can see that alert_text.1 is stored as not_analyzed.

How exactly do mapped fields in elastic search work?

The documentation is sparse and not entirely helpful. So say I have the following fields for my attribute:
{
"my_index": {
"mappings": {
"my_type": {
"my_attribute": {
"mapping": {
"my_attribute": {
"type": "string",
"analyzer": "my_analyzer",
"fields": {
"lowercased": {
"type": "string"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
}
my_analyzer lowercases tokens (in addition to other stuff).
So now I would like to know if the following statements are true:
my_analyzer does not get applied to raw, because the not_analyzed index does not have any analyzers, as its name implies.
my_attribute and my_attribute.lowercased are the exact same, so it is redundant to have the field my_attribute.lowercased
Your first statement is correct, however the second is not. my_attribute and my_attribute.lowercased might not be the same since the former has your custom my_analyzer search and index analyzer, while my_attribute.lowercased has the standard analyzer (since no analyzer is specified the standard one kicks in).
Besides, your mapping is not correct the way it is written, it should be like this:
{
"mappings": {
"my_type": {
"properties": {
"my_attribute": {
"type": "string",
"analyzer": "my_analyzer",
"fields": {
"lowercased": {
"type": "string"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}

Resources