Elasticsearch fielddata vs. fields mapping - elasticsearch

Can somebody explain me please what is the difference between settings fielddata and fields while mapping in Elasticsearch?
For example what is the difference between this two codes:
PUT my_index
{
"mappings": {
"properties": {
"my_field": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword" // for ordering
}
}
}
}
}
}
and
PUT my_index/_mapping
{
"properties": {
"my_field": {
"type": "text",
"fielddata": true // what is the difference?
}
}
}
Or can you tell me if this code does make any sence?
PUT my_index
{
"mappings": {
"properties": {
"my_field": {
"type": "text",
"fielddata": true,
"fields": {
"keyword": {
"type": "keyword" // for ordering
}
}
}
}
}
}

Since the main intent is to do sorting and aggregations, then definitely use the first option, i.e. the keyword (sub-)field.
fielddata is the old-fashioned way of doing it and eats up a lot more memory.
You can find more detailed information and a link to a related article here

Related

How to detect whether elasticsearch has enabled dynamic field

I don't know whether my index has enabled/disabled dynamic field. When I use get index mapping command it just responses these informations:
GET /my_index1/_mapping
{
"my_index1": {
"mappings": {
"properties": {
"goodsName": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
},
"auditTime": {
"type": "long"
},
"createUserId": {
"type": "long"
}
}
}
}
}
If you don't explicitly set the dynamic to false or strict, it will be true by default. If you explicitly set that, you will see that in your mappings:
{
"mappings": {
"dynamic": false,
"properties": {
"name": {
"type": "text"
}
}
}
}
And when you index the following document:
{"name":"products", "clickCount":1, "bookingCount":2, "isPromoted":1}
Only the field name will be indexed, the rest won't. If you call the _mapping endpoint again, it will give you the exact mappings above.

How to create a custom reusable type in ElasticSearch?

My json for ElasticSearch schema looks like this :-
{
"mappings": {
"properties": {
"DESCRIPTION_FR": {
"type": "text",
"analyzer": "french"
},
"FEEDBACK_FR": {
"type": "text",
"analyzer": "french"
},
"SOURCE_FR": {
"type": "text",
"analyzer": "french"
}
}
}
}
There are 100 of properties like this. Replicating a change across all the properties with this approach is redundant and erroneous.
Is there a way in ElasticSearch 7.2 to write custom data type and reuse it in property mapping.
{
"settings": {
//definition of custom type "text_fr"
},
"mappings": {
"properties": {
"DESCRIPTION_FR": {
"type": "text_fr"
},
"FEEDBACK_FR": {
"type": "text_fr"
},
"SOURCE_FR": {
"type": "text_fr"
}
}
}
}
Yes! What you're after is dynamic mapping templates. More specifically the match feature.
Define the target field names with a leading wildcard:
PUT my_index
{
"mappings": {
"dynamic_templates": [
{
"is_french_text": {
"match_mapping_type": "*",
"match": "*_FR",
"mapping": {
"type": "text",
"analyzer": "french"
}
}
}
]
}
}
Insert a doc:
POST my_index/_doc
{
"DESCRIPTION_FR": "je",
"FEEDBACK_FR": "oui",
"SOURCE_FR": "je ne sais quoi"
}
Verify the dynamically generated mapping:
GET my_index/_mapping

How to set elasticsearch index mapping as not_analysed for all the fields

I want my elasticsearch index to match the exact value for all the fields. How do I map my index to "not_analysed" for all the fields.
I'd suggest making use of multi-fields in your mapping (which would be default behavior if you aren't creating mapping (dynamic mapping)).
That way you can switch to traditional search and exact match searches when required.
Note that for exact matches, you would need to have keyword datatype + Term Query. Sample examples are provided in the links I've specified.
Hope it helps!
You can use dynamic_templates mapping for this. As a default, Elasticsearch is making the fields type as text and index: true like below:
{
"products2": {
"mappings": {
"product": {
"properties": {
"color": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
As you see, also it creates a keyword field as multi-field. This keyword fields indexed but not analyzed like text. if you want to drop this default behaviour. You can use below configuration for the index while creating it :
PUT products
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"product": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "keyword",
"index": false
}
}
}
]
}
}
}
After doing this the index will be like below :
{
"products": {
"mappings": {
"product": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"mapping": {
"type": "keyword",
"index": false
}
}
}
],
"properties": {
"color": {
"type": "keyword",
"index": false
},
"type": {
"type": "keyword",
"index": false
}
}
}
}
}
}
Note: I don't know the case but you can use the multi-field feature as mentioned by #Kamal. Otherwise, you can not search on the not analyzed fields. Also, you can use the dynamic_templates mapping set some fields are analyzed.
Please check the documentation for more information :
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-templates.html
Also, I was explained the behaviour in this article. Sorry about that but it is Turkish. You can check the example code samples with google translate if you want.

kibana keyword occurrency across documents

I have been unable to show words occurrency in kibana inside a full_text field mapped as "type": "keyword" across documents in the index.
My first attempt involved the usage of an analyzer. However I have been unable to change the document in any way, the index mapping relfect the analyzer but no field reflect the analysis.
This is the simplified mapping:
{
"mappings": {
"doc": {
"properties": {
"text": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
},
"analyzed": {
"type": "text",
"analyzer": "rebuilt"
}
}
}
}
}
},
"settings": {
"analysis": {
"analyzer": {
"rebuilt": {
"tokenizer": "standard"
}
}
},
"index.mapping.ignore_malformed": true,
"index.mapping.total_fields.limit": 2000
}
}
but still I'm unable to see the array of words that I expect to be saved under the text.analyzed field, indeed that fields does not exists and I'm wondering why
It seems like settings fielddata=true link, in spite of being heavily discouraged, solved my problem (at least for now), and allows me to visualize in kibana the occurrence (or absolute frequency) of each word in the text field across documents.
The final version of the proposed simplified mapping therefore became:
{
"mappings": {
"doc": {
"properties": {
"text": {
"type": "text",
"analyzer": "rebuilt",
"fielddata": true
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
},
"settings": {
"analysis": {
"analyzer": {
"rebuilt": {
"tokenizer": "standard"
}
}
},
"index.mapping.ignore_malformed": true,
"index.mapping.total_fields.limit": 2000
}
}
Getting rid of the useless analyzed field.
I still have to check the performance of kibana. If someone has a performance safe solution to this problem please do not hesitate.
Thanks.

How exactly do mapped fields in elastic search work?

The documentation is sparse and not entirely helpful. So say I have the following fields for my attribute:
{
"my_index": {
"mappings": {
"my_type": {
"my_attribute": {
"mapping": {
"my_attribute": {
"type": "string",
"analyzer": "my_analyzer",
"fields": {
"lowercased": {
"type": "string"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
}
my_analyzer lowercases tokens (in addition to other stuff).
So now I would like to know if the following statements are true:
my_analyzer does not get applied to raw, because the not_analyzed index does not have any analyzers, as its name implies.
my_attribute and my_attribute.lowercased are the exact same, so it is redundant to have the field my_attribute.lowercased
Your first statement is correct, however the second is not. my_attribute and my_attribute.lowercased might not be the same since the former has your custom my_analyzer search and index analyzer, while my_attribute.lowercased has the standard analyzer (since no analyzer is specified the standard one kicks in).
Besides, your mapping is not correct the way it is written, it should be like this:
{
"mappings": {
"my_type": {
"properties": {
"my_attribute": {
"type": "string",
"analyzer": "my_analyzer",
"fields": {
"lowercased": {
"type": "string"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}

Resources