Can Elasticsearch index a field without storing the original value, like Lucene's Field.Store.NO?

I have a large field in MySQL and do not want to store its original value in Elasticsearch. Is there an equivalent of Lucene's Field.Store.NO?
Thanks.

You just need to set the "store" mapping parameter accordingly, e.g.:
PUT your-index
{
"mappings": {
"properties": {
"some_field": {
"type": "text",
"index": true,
"store": false
}
}
}
}
You may also want to disable the _source field. From the documentation:
The _source field contains the original JSON document body that was passed at index time [...] Though very handy to have around, the source field does incur storage overhead within the index.
For this reason, it can be disabled as follows:
PUT your-index
{
"mappings": {
"_source": {
"enabled": false
}
}
}
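As a rough sketch of the two options above, the request bodies can be built as plain Python dictionaries before being sent with whatever client you use. The helper name index_only_mapping and its parameters are hypothetical, and no Elasticsearch client is called here; the code only builds the JSON body.

```python
import json


def index_only_mapping(field, field_type="text", disable_source=False):
    """Build a mapping body that indexes `field` without storing it separately.

    Hypothetical helper for illustration; it only builds the request body.
    """
    mappings = {
        "properties": {
            field: {"type": field_type, "index": True, "store": False}
        }
    }
    if disable_source:
        # Optional: also drop the original JSON body from the index.
        mappings["_source"] = {"enabled": False}
    return {"mappings": mappings}


body = index_only_mapping("some_field", disable_source=True)
print(json.dumps(body, indent=2))
```

Keeping the _source switch as an explicit argument makes it harder to disable it by accident, since that choice applies to the whole index rather than to one field.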

Related

How to update index mapping with dynamic as strict in future?

I am new to Elasticsearch. I have an index named users with many fields that I already know, but a few more fields may be added in the future.
So when defining my mapping, I want to include the fields I currently know with dynamic set to "strict". If I later want to add a new field, how do I update the mapping, and if I do, will I have to reindex everything?
I read in the ES documentation that mappings are applied only at index creation time, so I am a little confused about the right way to approach this.
You can always update the mapping later with the put mapping API, even when dynamic is set to strict. Existing data does not need to be reindexed unless you want the newly added field to have a value in older documents that were indexed before the mapping was updated.
Let's assume you already have an index test with one field, field1, of type keyword. Later you need to add a new field, field2, of type integer. You can do so with the put mapping API as below:
PUT test/_mapping
{
"properties": {
"field2": {
"type": "integer"
}
}
}
After executing the above, if you check the mapping using
GET test/_mapping
you can see the new field in the response as well:
{
"test" : {
"mappings" : {
"dynamic" : "strict",
"properties" : {
"field1" : {
"type" : "keyword"
},
"field2" : {
"type" : "integer"
}
}
}
}
}
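The sequence above can be mimicked in a few lines of Python to make the behaviour concrete. This is only a simulation under simplifying assumptions, not how Elasticsearch is implemented: the names put_mapping, check_document and StrictMappingError are made up for this sketch, only top-level fields are checked, and any change to an existing field is treated as an error, which is stricter than the real API.

```python
class StrictMappingError(Exception):
    """Raised by this sketch when a document has an unmapped field."""


def put_mapping(mapping, new_properties):
    """Merge new field definitions into an existing mapping (additive only)."""
    for name, definition in new_properties.items():
        existing = mapping["properties"].get(name)
        if existing is not None and existing != definition:
            raise ValueError(f"cannot change existing field {name!r}")
        mapping["properties"][name] = definition
    return mapping


def check_document(mapping, doc):
    """Reject unmapped top-level fields when dynamic is "strict"."""
    if mapping.get("dynamic") == "strict":
        unknown = sorted(set(doc) - set(mapping["properties"]))
        if unknown:
            raise StrictMappingError(f"unmapped fields: {unknown}")


mapping = {"dynamic": "strict", "properties": {"field1": {"type": "keyword"}}}
doc = {"field1": "a", "field2": 3}

try:
    check_document(mapping, doc)
    rejected_before = False
except StrictMappingError:
    rejected_before = True  # field2 is not mapped yet

put_mapping(mapping, {"field2": {"type": "integer"}})
check_document(mapping, doc)  # no exception once field2 is mapped
```

Older documents are untouched by put_mapping here, which mirrors the point that no reindex is needed unless those documents should carry a value for the new field.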
Inner objects inherit the dynamic setting from their parent object or from the mapping type. In the following example, dynamic mapping is disabled at the type level, so no new top-level fields will be added dynamically.
However, the user.social_networks object enables dynamic mapping, so you can add fields to this inner object.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html
PUT my-index-000001
{
"mappings": {
"dynamic": false,
"properties": {
"user": {
"properties": {
"name": {
"type": "text"
},
"social_networks": {
"dynamic": true,
"properties": {}
}
}
}
}
}
}
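To illustrate the inheritance rule, here is a small Python sketch that walks a mapping and reports which dynamic value applies at a given object path. The function effective_dynamic is a hypothetical helper written for this example; it assumes the typeless mapping layout shown above and falls back to true, the default, when nothing is set.

```python
def effective_dynamic(mappings, path):
    """Return the dynamic setting that applies to the object at `path`.

    `path` is a dotted object path such as "user.social_networks"; an empty
    string means the top level. Each level may override the value inherited
    from its parent.
    """
    current = mappings
    dynamic = current.get("dynamic", True)  # default when nothing is set
    if path:
        for part in path.split("."):
            current = current.get("properties", {}).get(part)
            if current is None:
                break  # unmapped object: the inherited value applies
            dynamic = current.get("dynamic", dynamic)
    return dynamic


mappings = {
    "dynamic": False,
    "properties": {
        "user": {
            "properties": {
                "name": {"type": "text"},
                "social_networks": {"dynamic": True, "properties": {}},
            }
        }
    },
}

for path in ("", "user", "user.social_networks"):
    print(repr(path), effective_dynamic(mappings, path))
```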

Elasticsearch how to specify Array of keyword with index false in mapping?

I am trying to define an array of "keyword" values in an Elasticsearch mapping with "index": false. According to the ES docs there is no "array" type, so I was thinking of using the mapping below:
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"arr_field": {
"type": "keyword", "index": false
}
}
}
}
}
Is this the correct way or not?
Yes, there is no dedicated data type for arrays. If you want a field that stores an array of integers, all you need to do is define the field as type integer and, when indexing, pass the value for that field as an array, even if there is only one value.
For example:
PUT test
{
"mappings": {
"_doc": {
"properties": {
"intArray": {
"type": "integer"
}
}
}
}
}
PUT test/_doc/1
{
"intArray": [10, 12, 50]
}
PUT test/_doc/1
{
"intArray": [7]
}
The same goes for the keyword data type, so what you are doing is right. All you need to take care of is that, when indexing a document, the value for arr_field is always an array.
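One way to follow that advice on the client side is to normalise values before indexing. The sketch below is plain Python with hypothetical helper names (as_list, prepare_document); it does not talk to Elasticsearch and simply guarantees that the chosen fields are lists.

```python
def as_list(value):
    """Wrap a single value in a list; copy lists and tuples; map None to []."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def prepare_document(doc, array_fields):
    """Return a copy of `doc` where every field in `array_fields` is a list."""
    prepared = dict(doc)
    for field in array_fields:
        if field in prepared:
            prepared[field] = as_list(prepared[field])
    return prepared


single = prepare_document({"arr_field": "red"}, ["arr_field"])
many = prepare_document({"arr_field": ["red", "blue"]}, ["arr_field"])
print(single, many)
```

Note that strings are deliberately treated as single values rather than as sequences of characters.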

How to specify a field which should not be indexed?

As mentioned in the title, I want to disable indexing for a specific field in Elasticsearch. For example, I have a field named #fields which contains three sub-fields: name, age, and salary. I do not want to index #fields.age. How can I achieve that? I have tried the include_in_all parameter, but it doesn't work. My mapping configuration looks like this:
"mappings": {
"fluentd": {
"properties": {
"#fields": {
"properties": {
"age": {
"type": "text",
"include_in_all": false,
"index": "no"
}
}
}
}
}
}
With the mapping configuration above, I can only see #fields.age in the index's mapping. I expected #fields.name and #fields.salary to appear in the mapping, not #fields.age. How can this happen? Any answers will be appreciated.

How to map dynamic field value in elasticsearch?

I'm mapping a Couchbase gateway document and I'd like to tell Elasticsearch not to index the internal attributes added by the gateway, such as "_sync". This object contains another object named "channels", which has the following form:
"channels": {
"i7de5558-32ad-48ca-bf91-858c3a1e4588": 12
}
So I guess the mapping of this object would be like:
"channels": {
"type": "object",
"properties": {
"i7de5558-32ad-48ca-bf91-858c3a1e4588": {
"type": "integer",
"index": "not_analyze"
}
}
}
The problem is that the keys are always changing, so I don't know whether I should use a wildcard like "*": {"type": "integer", "index": "not_analyze"} for this property or do something else.
Any advice, please?
If the fields are integers, you don't have to declare them explicitly in the mapping. You can create an empty mapping and index documents with these fields; Elasticsearch will infer the field type and update the mapping dynamically. You can also use dynamic templates for this:
{
"mappings": {
"my_type": {
"dynamic_templates": [
{
"analysed_string_template": {
"path_match": "channels.*",
"mapping": {
"type": "integer"
}
}
}
]
}
}
}
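To show how a path_match pattern such as "channels.*" picks up keys that keep changing, here is a simplified Python sketch. It assumes that * is the only wildcard and looks at path_match alone, ignoring the other matching conditions a dynamic template can have; simple_match and mapping_for are names invented for this example, not part of any Elasticsearch client.

```python
import re


def simple_match(pattern, path):
    """Match `path` against a pattern where `*` stands for any characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, path) is not None


def mapping_for(path, templates):
    """Return the mapping of the first template whose path_match fits `path`."""
    for template in templates:
        for _name, rule in template.items():
            if simple_match(rule["path_match"], path):
                return rule["mapping"]
    return None


templates = [
    {
        "analysed_string_template": {
            "path_match": "channels.*",
            "mapping": {"type": "integer"},
        }
    }
]

key = "channels.i7de5558-32ad-48ca-bf91-858c3a1e4588"
print(mapping_for(key, templates))
print(mapping_for("name", templates))
```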
There is a dynamic way to do what you need, called dynamic templates.
Using templates you are able to create rules like this:
PUT /my_index
{
"mappings": {
"my_type": {
"date_detection": false
}
}
}
In your case you could create a template to set all new fields inside the channels object as not_analyzed.
Hope it helps.

Specify which fields are indexed in ElasticSearch

I have a document with a number of fields that I never query on so I would like to turn indexing on those fields off to save resources. I believe I need to disable the _all field, but how do I specify which fields are indexed then?
By default all the fields are also indexed within the special _all field, which provides the so-called catch-all feature out of the box. However, for each field in your mapping you can specify whether to add it to the _all field, through the include_in_all option:
"person" : {
"properties" : {
"name" : {
"type" : "string", "store" : "yes", "include_in_all" : false
}
}
}
The above example disables the default behaviour for the name field, which won't be part of the _all field.
Otherwise, if you don't need the _all field at all for a specific type you can disable it like this, again in your mapping:
"person" : {
"_all" : {"enabled" : false},
"properties" : {
"name" : {
"type" : "string", "store" : "yes"
}
}
}
When you disable it your fields will still be indexed separately, but you won't have the catchall feature that _all provides. You will need then to query your specific fields instead of relying on the _all special field, that's it. In fact, when you query and don't specify a field, elasticsearch queries the _all field under the hood, unless you override the default field to query.
Each string field has an index parameter in the mapping configuration, which defaults to analyzed. That means that, besides the _all field, each field is also indexed individually.
And for the _all field, the reference says:
By default, it is enabled and all fields are included in it for ease of use.
So, to completely disable indexing for a field you have to specify (if the _all field is enabled):
"mappings": {
"your_mapping": {
"properties": {
"field_not_to_index": {
"type": "string",
"include_in_all": false,
"index": "no"
}
}
}
}
For the fields that you do query, there are two options. If you query only through the _all field, include them in _all and set "index": "no" on the fields themselves to save resources. If you query those fields individually, set the index parameter to analyzed or not_analyzed and disable the _all field to save resources.
The following doc page is important for understanding the index settings in Elasticsearch:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-intro.html
For your problem, ideally you should set the "index" flag to "no" in the field properties.
You can use the enabled parameter to disable a particular field or the entire mapping. Note that enabled applies only to object fields and to the top-level mapping definition.
Disable a field mapping (e.g. the session_data field):
{
"mappings": {
"_doc": {
"properties": {
"session_data": {
"enabled": false
}
}
}
}
}
Disable the entire mapping:
{
"mappings": {
"_doc": {
"enabled": false
}
}
}
Set dynamic to false and disable the _all field, then specify the required fields in the mapping.
https://www.elastic.co/guide/en/elasticsearch/guide/current/dynamic-mapping.html
{
"mappings":{
"candidates":{
"_all":{
"enabled":false
},
"dynamic": "false",
"properties":{
"tags":{
"type":"text"
},
"derivedAttributes":{
"properties":{
"city":{
"type":"text"
},
"zip5":{
"type":"keyword"
}
}
}
}
}
}
}
_all has been deprecated since 6.0. Use the mapping below instead:
"mappings": {
"dynamic": "false",
"properties": {
"field_to_index": {"index": true, "type": "text"}
}
}
According to the Elasticsearch docs:
Setting dynamic to false doesn’t alter the contents of the _source field at all. The _source will still contain the whole JSON document that you indexed. However, any unknown fields will not be added to the mapping and will not be searchable.
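The effect described in that quote can be pictured with a short Python sketch that splits a document into fields known to the mapping and fields that would only live in _source. It is an illustration under simplifying assumptions (top-level fields only, dynamic already set to false), and split_by_mapping is a hypothetical helper, not an Elasticsearch API.

```python
def split_by_mapping(mappings, doc):
    """Split top-level fields of `doc` into (mapped_fields, source_only).

    With dynamic set to false, unmapped fields stay in _source but are not
    added to the mapping, so they cannot be searched.
    """
    mapped = mappings.get("properties", {})
    mapped_fields = {k: v for k, v in doc.items() if k in mapped}
    source_only = {k: v for k, v in doc.items() if k not in mapped}
    return mapped_fields, source_only


mappings = {
    "dynamic": "false",
    "properties": {"field_to_index": {"index": True, "type": "text"}},
}
doc = {"field_to_index": "hello", "big_blob": "x" * 10, "counter": 7}

mapped_fields, source_only = split_by_mapping(mappings, doc)
print(sorted(mapped_fields), sorted(source_only))
```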

Resources