Elasticsearch - dynamic mapping with multi-field support

Is it possible to add new fields with multi-field support dynamically?
My index has properties that will only be known at indexing time. So these fields will be included with dynamic mapping.
But, when a new field is added dynamically, I need it to be mapped as text and with three sub-fields: keyword, date (if it fits with dynamic_date_formats) and long.
With these three sub-fields I will be able to search and aggregate many queries with maximum performance.
I know I can do a "pre" mapping my index with these "dynamic fields" using nested field with key and value properties so I can create the value property with these three sub-fields. But I don't want to create a nested key/value field because it's not very fast when performing aggregations with a lot of documents.

I found it: dynamic templates are the answer.
Very simple :)
{
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
          "objs": {
            "match_mapping_type": "object",
            "mapping": {
              "type": "{dynamic_type}"
            }
          }
        },
        {
          "attrs": {
            "match_mapping_type": "*",
            "mapping": {
              "type": "text",
              "fields": {
                "raw": {
                  "type": "keyword"
                },
                "long": {
                  "type": "long",
                  "ignore_malformed": true
                },
                "double": {
                  "type": "double",
                  "ignore_malformed": true
                },
                "date": {
                  "type": "date",
                  "format": "dd/MM/yyyy||dd/MM/yyyy HH:mm:ss||dd/MM/yyyy HH:mm",
                  "ignore_malformed": true
                }
              }
            }
          }
        }
      ],
      "dynamic": "strict",
      "properties": {
        "fixed": {
          "properties": {
            "aaa": {
              "type": "text"
            },
            "bbb": {
              "type": "long"
            },
            "ccc": {
              "type": "date",
              "format": "dd/MM/yyyy"
            }
          }
        },
        "dyn": {
          "dynamic": true,
          "properties": {}
        }
      }
    }
  }
}
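With this mapping in place, any new field that first appears under dyn is mapped as text with the raw, long, double and date sub-fields automatically. As a quick sketch of the payoff (the field names price and created below are made up for illustration):

PUT test/doc/1
{
  "fixed": { "aaa": "hello", "bbb": 1, "ccc": "01/01/2020" },
  "dyn": { "price": "42", "created": "15/03/2020 10:00" }
}

After that, a terms aggregation on dyn.price.raw, a sum on dyn.price.long, or a range query on dyn.created.date should all work without any further mapping changes.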

Related

Elasticsearch - nested unknown mapping properties - set default type

I'm wondering if there is a way to set all nested field to a specific type
The main reason I'm looking for something like that is because the attributes inside category are not known and can vary.
That's the mapping I have, and every single property must have its type explicitly set:
PUT data
{
  "mappings": {
    "properties": {
      "category": {
        "type": "nested",
        "properties": {
          "mode": {
            "type": "text",
            "analyzer": "keyword"
          },
          "sequence": {
            "type": "text",
            "analyzer": "keyword"
          }
        }
      }
    }
  }
}
I'm looking for something like this pseudo mapping below:
PUT data
{
  "mappings": {
    "properties": {
      "category": {
        "type": "nested",
        "properties.*": {
          "type": "text",
          "analyzer": "keyword"
        }
      }
    }
  }
}
Maybe this is not the way to go and if you have any other solution to handle these dynamic attributes, it will be really appreciated.
You can achieve this with dynamic templates (note that both dynamic_templates and properties belong inside mappings):
PUT data
{
  "mappings": {
    "dynamic_templates": [
      {
        "category_fields": {
          "path_match": "category.*",
          "mapping": {
            "type": "text",
            "analyzer": "keyword"
          }
        }
      }
    ],
    "properties": {
      "category": {
        "type": "nested"
      }
    }
  }
}
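With this template, any attribute that later shows up under category is mapped as keyword-analyzed text without declaring it up front. A minimal sketch (the attribute names below are invented):

PUT data/_doc/1
{
  "category": [
    { "mode": "auto", "some_new_attribute": "whatever" }
  ]
}

A GET data/_mapping afterwards should show some_new_attribute as text with the keyword analyzer, picked up via path_match.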

Is there a way to know datatype of key in Elasticsearch?

I am facing an issue where the datatype of a key gets changed. On creating the index I have the datatype as nested, but for some reason it gets changed to object. I make CRUD operations through painless scripts, but those seem to be fine.
Elastic version 7.3.0
Initial template:
"settings": {
  "number_of_shards": 1
},
"mappings": {
  "properties": {
    "deleted_at": { "type": "date" },
    "updated_at": { "type": "date" },
    "id": { "type": "integer" },
    "user_id": { "type": "integer" },
    ... some more keys
    "user_tags": {
      "type": "nested"
    },
    "user_files": {
      "type": "nested"
    }
  }
}
Mapping after some bulk inserts/updates:
"mappings": {
  "properties": {
    "deleted_at": { "type": "date" },
    "updated_at": { "type": "date" },
    "id": { "type": "integer" },
    "user_id": { "type": "integer" },
    "user_tags": {
      "properties": {
        ...some properties
      }
    },
    "user_files": {
      "properties": {
        ...some properties
      }
    }
  }
}
I have to reindex to fix this issue, but it's happening very often. Also, is there any way to know whether the datatype of a key is nested or object?
Thanks in advance.
By setting "dynamic": "strict", your mapping won't be changed and unsuitable documents will be rejected. To solve this problem you need to define all the fields that you want inside your nested field. For example:
{
  "user_tags": {
    "type": "nested",
    "properties": {
      "code": {
        "type": "keyword",
        "store": true
      },
      "score": {
        "type": "float",
        "store": true
      }
    }
  }
}
If you just want to store a list, you can use a mapping like the one below:
{
  "user_tags": {
    "type": "keyword",
    "store": true
  }
}
With the second mapping you can store user_tags as a plain list, e.g. ["tag1", "tag2", ...].
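As for the second part of the question, checking whether a field is currently nested or object, the mapping itself tells you; a minimal sketch (index name assumed):

GET my_index/_mapping

In the response, a nested field carries an explicit "type": "nested", while an object field has no "type" key at all, only its "properties". That is also a quick way to notice when dynamic mapping has silently turned a nested field into an object.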

How do I create a template that allows dynamic index alias in elasticsearch

I'm trying to follow this doc using elasticsearch 2.4 where multiple tenant data can be put into one index but by using alias and routes I can search an aliased index and only retrieve information from one particular tenant.
Briefly, I could have an index filled with widget doc types under the index search_widgets, but the template would create an alias as I enter widget docs, based on the value of user-id. If a widget doc has user-id = 1, it would create an index alias search_widget-1; a get-all search on that aliased index would then return only the widget docs that have user-id = 1 as a field.
Here's an incomplete example of how I think it should work:
PUT /search_widgets
{
  "settings": {
    "number_of_shards": 2
  },
  "mappings": {
    "widget": {
      "properties": {
        "#timestamp": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "#version": {
          "type": "string"
        },
        "active": {
          "type": "boolean"
        },
        "user-id": {
          "type": "long",
          "index": "not_analyzed"
        },
        "created_date": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "name": {
          "type": "string"
        },
        "updated_date": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "id": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  },
  "aliases": {
    "widget": {
      "routing": "search_widget_*",
      "filter": {
        "term": {
          "user-id": "*"
        }
      }
    }
  }
}
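For reference, an index template can't expand a wildcard into one alias per user-id; the per-tenant filtered aliases from the linked doc have to be created explicitly. A hedged sketch for tenant 1, using the names from the question:

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "search_widgets",
        "alias": "search_widget-1",
        "routing": "1",
        "filter": { "term": { "user-id": 1 } }
      }
    }
  ]
}

A get-all search on search_widget-1 would then return only docs with user-id = 1, routed to a single shard.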

Elasticsearch - Setting up default analyzers on all fields

I have an index where the mappings will vary drastically. Consider, for example, that I'm indexing the Wikipedia infobox data of every other article. The data in an infobox is neither structured nor uniform, so it can be of the form:
Data1 - {
  'title': 'Sachin',
  'Age': 41,
  'Occupation': 'Cricketer'
}
Data2 - {
  'title': 'India',
  'Population': '23456987654',
  'GDP': '23',
  'NationalAnthem': 'Jan Gan Man'
}
Since all the fields are different and I want to apply a completion field on the relevant ones, I'm thinking of applying analyzers to all the fields.
How can I apply analyzers on every field by default while indexing?
You need a _default_ template for your index, so that whenever new fields are added to it, those string fields will take the mapping from the _default_ template:
{
  "template": "infobox*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "my_completion_analyzer",
              "fielddata": {
                "format": "disabled"
              },
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}
Or if your index is not a daily/weekly one, you can just create it once with the _default_ mapping defined:
PUT /infobox
{
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "my_completion_analyzer",
              "fielddata": {
                "format": "disabled"
              },
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}
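Either way, once the _default_ mapping exists, new string fields pick up the analyzer as they appear. A quick sanity check (index/type names assumed):

PUT /infobox/article/1
{
  "title": "Sachin",
  "Age": 41,
  "Occupation": "Cricketer"
}

GET /infobox/_mapping

The mapping returned should show Occupation as an analyzed string using my_completion_analyzer, plus an Occupation.raw not_analyzed sub-field.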

elasticsearch index_name with multi_field

I have 2 separate indexes, each containing a different type, and I want to get combined records from both.
The problem is that one type has the field 'email' while the other has 'work_email', but I want to treat them as the same field for sorting purposes.
That is why I tried to use index_name in one of the types.
Here are mappings:
Index1:
"mappings": {
  "people": {
    "properties": {
      "work_email": {
        "type": "string",
        "index_name": "email",
        "fields": {
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
Index2:
"mappings": {
  "companies": {
    "properties": {
      "email": {
        "type": "string",
        "fields": {
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
I expect this to work:
GET /index1,index2/people,companies/_search
{
  "sort": [
    {
      "email.raw": {
        "order": "asc"
      }
    }
  ]
}
But, I get an error that there is no such field in the 'people' type.
Am I doing something wrong, or is there a better way to achieve what I need?
Here you can find a recreation script that illustrates the problem: https://gist.github.com/pmishev/11375297
There is a problem in the way you map the multi-field. Check out the mapping below and try to index; you should get the results:
"mappings": {
  "people": {
    "properties": {
      "work_email": {
        "type": "multi_field",
        "fields": {
          "work_email": {
            "type": "string",
            "index_name": "email"
          },
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
We should set the type to multi_field, and under fields specify the required sub-fields.
I ended up adding a 'copy_to' property in my mapping:
"mappings": {
  "people": {
    "properties": {
      "work_email": {
        "type": "string",
        "copy_to": "email",
        "fields": {
          "raw": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
So now I can address both fields as email.
It's not ideal, as it means the email value is effectively indexed twice, but it was the only thing that worked.
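One subtlety with copy_to worth noting: the target field gets its own mapping, so for the sort on email.raw to work in the first index, the email field there needs an explicit mapping with the raw sub-field as well; otherwise it would be dynamically mapped as a plain analyzed string. A sketch of that addition to the first index's mapping:

"email": {
  "type": "string",
  "fields": {
    "raw": {
      "type": "string",
      "index": "not_analyzed"
    }
  }
}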
