Disable dynamic mapping creation for only specific indexes on elasticsearch? - elasticsearch

I'm trying to disable dynamic mapping creation for specific indexes only, not for all of them. For some reason I can't put a default mapping with 'dynamic' : 'false'.
As far as I can see, that leaves two options:
specify the property 'index.mapper.dynamic' in the file elasticsearch.yml.
put 'index.mapper.dynamic' at index creation time, as described here https://www.elastic.co/guide/en/kibana/current/setup.html#kibana-dynamic-mapping
The first option only accepts the values true, false and strict, so there is no way to target a subset of indexes (as we can by pattern with the property 'action.auto_create_index' https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-creation).
The second option just doesn't work.
I created the index:
POST http://localhost:9200/test_idx/
{
"settings" : {
"mapper" : {
"dynamic" : false
}
},
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
}
}
}
}
}
Then checked index settings:
GET http://localhost:9200/test_idx/_settings
{
"test_idx" : {
"settings" : {
"index" : {
"mapper" : {
"dynamic" : "false"
},
"creation_date" : "1445440252221",
"number_of_shards" : "1",
"number_of_replicas" : "0",
"version" : {
"created" : "1050299"
},
"uuid" : "5QSYSYoORNqCXtdYn51XfA"
}
}
}
}
and mapping:
GET http://localhost:9200/test_idx/_mapping
{
"test_idx" : {
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
}
}
}
}
}
}
So far so good. Now let's index a document with an undeclared field:
POST http://localhost:9200/test_idx/test_type/1
{
"field1" : "it's ok, field must be in mapping and in source",
"somefield" : "but this field must be in source only, not in mapping"
}
Then I've checked mapping again:
GET http://localhost:9200/test_idx/_mapping
{
"test_idx" : {
"mappings" : {
"test_type" : {
"properties" : {
"field1" : {
"type" : "string"
},
"somefield" : {
"type" : "string"
}
}
}
}
}
}
As you can see, the mapping is extended regardless of the index setting "dynamic" : false.
I've also tried to create the index exactly as described in the doc:
PUT http://localhost:9200/test_idx
{
"index.mapper.dynamic": false
}
but got the same behavior.
Maybe I've missed something?
Thanks a lot in advance!

You're almost there: the value needs to be set to strict.
And the correct usage is the following:
PUT /test_idx
{
"mappings": {
"test_type": {
"dynamic":"strict",
"properties": {
"field1": {
"type": "string"
}
}
}
}
}
And pushing this a bit further, if you want to forbid the creation even of new types, not only fields in that index, use this:
PUT /test_idx
{
"mappings": {
"_default_": {
"dynamic": "strict"
},
"test_type": {
"properties": {
"field1": {
"type": "string"
}
}
}
}
}
Without _default_ template:
PUT /test_idx
{
"settings": {
"index.mapper.dynamic": false
},
"mappings": {
"test_type": {
"dynamic": "strict",
"properties": {
"field1": {
"type": "string"
}
}
}
}
}

Note that the part below only means that ES can't create a type dynamically:
"mapper" : {
"dynamic" : false
}
You should configure ES like this:
PUT http://localhost:9200/test_idx/_mapping/test_type
{
"dynamic":"strict"
}
After that you can no longer index a field that has no mapping, and you get an error like the following:
mapping set to strict, dynamic introduction of [hatae] within [data] is not allowed
If you want to store the data but keep the new field from being indexed, use this setting instead:
PUT http://localhost:9200/test_idx/_mapping/test_type
{
"dynamic":false
}
Hope these can help the people with the same issue :).

The answer is in the docs (7.x): https://www.elastic.co/guide/en/elasticsearch/reference/7.x/dynamic.html
The dynamic setting controls whether new fields can be added
dynamically or not. It accepts three settings:
true
Newly detected fields are added to the mapping. (default)
false
Newly detected fields are ignored. These fields will not be indexed so
will not be searchable but will still appear in the _source field of
returned hits. These fields will not be added to the mapping, new
fields must be added explicitly.
strict
If new fields are detected, an exception is thrown and the document is
rejected. New fields must be explicitly added to the mapping.
PUT my_index
{
"mappings": {
"dynamic": "strict",
"properties": {
"user": {
"properties": {
"name": {
"type": "text"
},
"social_networks": {
"dynamic": true,
"properties": {}
}
}
}
}
}
}
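To make the difference between the three values concrete, here is a minimal Python sketch. It is a toy model, not Elasticsearch code: the function index_document, the exception class and the flat field-to-type dict are all assumptions made for illustration, and real type detection is far richer.

```python
# Toy model of the three `dynamic` values; NOT Elasticsearch code.
class StrictDynamicMappingError(Exception):
    """Stands in for the exception Elasticsearch raises under "strict"."""


def index_document(mapping, doc, dynamic):
    """Return (new_mapping, stored_source) for a flat document.

    mapping: dict of field name -> type name (already declared fields)
    dynamic: True, False, or "strict"
    """
    new_mapping = dict(mapping)
    for field, value in doc.items():
        if field in new_mapping:
            continue
        if dynamic == "strict":
            # The whole document is rejected.
            raise StrictDynamicMappingError(field)
        if dynamic is True:
            # Simplified detection: only strings are recognised here.
            new_mapping[field] = "text" if isinstance(value, str) else "unknown"
        # dynamic is False: the field is left out of the mapping
        # but still kept in the stored source below.
    return new_mapping, dict(doc)
```

With the mapping {"field1": "text"} and a document that also carries somefield, True grows the mapping, False leaves the mapping unchanged while keeping somefield in the source, and "strict" raises.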

You cannot disable dynamic mapping in ES 7 anymore. What you can do, if you have completely unstructured data, is to disable the mapping for the index completely, like this:
curl -X PUT "localhost:9200/my_index?pretty" -H 'Content-Type: application/json' -d'
{
"mappings": {
"enabled": false
}
}
'
If you are using Python, you can do this:
from elasticsearch import Elasticsearch
# Connect to the elastic cluster
es=Elasticsearch([{'host':'localhost','port':9200}])
request_body = {
"mappings": {
"enabled": False
}
}
es.indices.create(index = 'my_index', body = request_body)

For ES 7 if you want to update an existing index:
PUT customers/_mapping
{
"dynamic": "strict"
}

First, note that the values false and strict work in different ways.
With "dynamic": false, if you create documents with fields not covered by the mapping, those fields are ignored for indexing: they are not added to the mapping and are not searchable, but they still show up in _source when you GET the document.
With the value strict, the document is not created at all; an exception is thrown instead.
Inner objects inherit the dynamic setting from their parent object or from the mapping type. In the following example, dynamic mapping is disabled at the type level, so no new top-level fields will be added dynamically.
However, the user.social_networks object enables dynamic mapping, so you can add fields to this inner object.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html
PUT my-index-000001
{
"mappings": {
"dynamic": false,
"properties": {
"user": {
"properties": {
"name": {
"type": "text"
},
"social_networks": {
"dynamic": true,
"properties": {}
}
}
}
}
}
}
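The inheritance rule can be sketched in a few lines of Python. The helper effective_dynamic is hypothetical (it is not part of any Elasticsearch client) and assumes an ES 7 style mappings body without a mapping type.

```python
def effective_dynamic(mappings, path):
    """Resolve the `dynamic` value that applies to the object at `path`.

    mappings: the body under "mappings" (no mapping type level)
    path: dotted object path such as "user.social_networks"; "" is the top level
    """
    current = mappings.get("dynamic", True)  # true is the documented default
    node = mappings
    for part in filter(None, path.split(".")):
        node = node.get("properties", {}).get(part, {})
        # An object without its own `dynamic` inherits the parent's value.
        current = node.get("dynamic", current)
    return current
```

For the mapping above, the top level and user resolve to False, while user.social_networks resolves to True.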
If you are using the Node.js client:
await this.client.indices.putMapping({
index: ElasticIndex.UserDataFactory,
body: {
dynamic: 'strict',
properties: {
...this.schema,
},
},
});

Related

ElasticSearch - How create Index template/mapping per alias and perform search against each alias separately

Is there any way in Elastic to store an index template per alias?
I mean creating an index with multiple aliases (alias1, alias2, ...) and attaching a different template to each of them, then indexing and searching docs on a specific alias.
The reason I'm doing this is that I have multiple different data structures (up to 50 types) of documents.
What I did so far is:
1. PUT /dynamic_index
2. POST /_aliases
{ "actions" : [
{ "add" : { "index" : "dynamic_index", "alias" : "alias_type1" } },
{ "add" : { "index" : "dynamic_index", "alias" : "alias_type2" } },
{ "add" : { "index" : "dynamic_index", "alias" : "alias_type3" } }
]}
3. PUT _template/template1
{
"index_patterns": [
"dynamic_index"
],
"mappings": {
"dynamic_templates": [
{
"strings_as_keywords": {
"match_mapping_type": "string",
"mapping": {
"type": "text",
"analyzer": "standard",
"copy_to": "_all",
"fields": {
"keyword": {
"type": "keyword",
"normalizer": "lowercase_normalizer"
}
}
}
}
}
],
"properties": {
"source": {
"type": "keyword"
}
}
},
"aliases": {
"alias_type1": {
}
}
}
4. Same for alias_type2 and alias_type3, but with different fields ...
Indexing/Search: I try to create and search docs per alias, as in this example:
POST alias_type1/_doc
{
"source": "foo"
, .....
}
POST alias_type2/_doc
{
"source": "foo123"
, .....
}
GET alias_type1/_search
{
"query": {
"match_all": {}
}
}
GET alias_type2/_search
{
"query": {
"match_all": {}
}
}
What I actually see is that even if I index documents per alias,
when searching I don't get results per alias; the results are the same on alias_type1, alias_type2 and even on the index.
Is there any way to achieve separation on each alias, so that docs are indexed and searched per type (alias)?
Any ideas?
You can't have separate mappings for aliases pointing to the same index! An alias is like a virtual link pointing to an index, so if your aliases point to the same index you will get the same results back.
If you want different mappings based on your data structure, you will need to create multiple indices.
Update
You can also use custom routing based on a field. For more information, check the official Elastic documentation here.
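A toy in-memory model in Python shows why this happens. The dictionaries and the functions resolve, index_doc and search_all are assumptions for illustration only, not the Elasticsearch API; they just mirror the index and alias names from the question.

```python
# Toy in-memory model; NOT the Elasticsearch client API.
indices = {"dynamic_index": []}
aliases = {
    "alias_type1": "dynamic_index",
    "alias_type2": "dynamic_index",
}


def resolve(name):
    """Map an alias to its backing index; plain index names pass through."""
    return aliases.get(name, name)


def index_doc(name, doc):
    """Store the document in whatever index the name resolves to."""
    indices[resolve(name)].append(doc)


def search_all(name):
    """Equivalent of match_all: return every document in the resolved index."""
    return list(indices[resolve(name)])


index_doc("alias_type1", {"source": "foo"})
index_doc("alias_type2", {"source": "foo123"})
```

Both documents land in dynamic_index, so search_all returns the same two documents for alias_type1, alias_type2 and the index itself.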

ElasticSearch filtering for a tag in array

I've got a bunch of events that are tagged for their audience:
{ id = 123, audiences = ["Public", "Lecture"], ... }
I'm trying to do an ElasticSearch query with filtering, so that the search only returns events that have an exact entry of "Public" in that audiences array (and doesn't return events that are "Not Public").
How do I do that?
This is what I have so far, but it's returning zero results, even though I definitely have "Public" events:
curl -XGET 'http://localhost:9200/events/event/_search' -d '
{
"query" : {
"filtered" : {
"filter" : {
"term" : {
"audiences": "Public"
}
},
"query" : {
"match" : {
"title" : "[searchterm]"
}
}
}
}
}'
You could use this mapping for your content type
{
"your_index": {
"mappings": {
"your_type": {
"properties": {
"audiences": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
not_analyzed
Index this field, so it is searchable, but index the
value exactly as specified. Do not analyze it.
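With this mapping, the term value in the query must match the stored value exactly ("Public"). Only if the field stays analyzed do you need a lowercase term value in the search query, because the default analyzer lowercases the indexed terms.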
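Why the query in the question returns nothing can be shown with a rough Python simulation. The function names are made up, and standard_like_analyze is only a crude stand-in for the default analyzer (split on non-word characters, then lowercase), so treat this as a sketch.

```python
import re


def standard_like_analyze(text):
    """Crude stand-in for the default analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def indexed_terms(values, analyzed):
    """Terms that end up in the inverted index for an array field."""
    terms = set()
    for value in values:
        # not_analyzed keeps each value as a single, unchanged term.
        terms.update(standard_like_analyze(value) if analyzed else [value])
    return terms


def term_filter_matches(values, term, analyzed):
    """A term filter compares the query term as-is against the indexed terms."""
    return term in indexed_terms(values, analyzed)
```

In this model the term "Public" misses an analyzed field (only "public" is indexed), while a not_analyzed field matches "Public" exactly and does not match "Not Public".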

Set every property type to not_analyzed for custom object

I have a custom object that I wish to store in ElasticSearch as its own type in an index, but I don't want any field in the object to be analyzed. How should I go about doing this?
I have been using the ElasticSearch NEST client but can also manually create the mapping if needed.
You have a few options that will all work. Personally, I would go with either of the first two. If it's a daily index, then the second one is the better option.
Define the mapping upfront and disable dynamic fields. This is by far the safest approach and it will help you to avoid mistakes, and it will prevent fields from being added afterward.
{
"mappings": {
"_default_": {
"_all": {
"enabled": false
}
},
"mytype" : {
"dynamic" : "strict",
"properties" : {
...
}
}
}
}
Create an index template that also disables dynamic fields, but allows you to continuously roll new indices with the same mapping(s).
You can create tiered index templates so that more than one applies to any given index.
{
"template": "mytimedindex-*",
"settings": {
"number_of_shards": 2
},
"mappings": {
"_default_": {
"_all": {
"enabled": false
}
},
"mytype" : {
"dynamic" : "strict",
"properties" : {
...
}
}
}
}
Create a dynamic mapping that allows new fields, but defaults all strings to not_analyzed:
"dynamic_templates" : [ {
"strings" : {
"mapping" : {
"index" : "not_analyzed",
"type" : "string"
},
"match" : "*",
"match_mapping_type" : "string"
}
} ]
This will allow you to dynamically add fields to the mapping.
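How such a template gets picked for a new field can be approximated in Python. pick_mapping is a hypothetical helper, and fnmatchcase is only an approximation of the simple wildcard matching Elasticsearch applies to match.

```python
from fnmatch import fnmatchcase


def pick_mapping(dynamic_templates, field_name, detected_type):
    """Return the mapping of the first template matching a new field, else None."""
    for entry in dynamic_templates:
        # Each list entry is a one-key dict: {template_name: template_body}.
        for _name, template in entry.items():
            wanted_type = template.get("match_mapping_type", detected_type)
            pattern = template.get("match", "*")
            if wanted_type == detected_type and fnmatchcase(field_name, pattern):
                return template["mapping"]
    return None
```

With the template above, any newly seen string field gets the not_analyzed mapping, while a field detected as another type falls through to the default dynamic behavior (None here).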

Setting up a Kibana terms panel for an Elasticsearch field that is a list of strings

I have a Kibana dashboard that contains a terms panel to show the number of instances for a particular field (let's call it field1). Field1 is, effectively, a list of strings. Each string usually contains multiple words. Since it's analyzed, Elasticsearch breaks the terms up into separate columns. I need to keep the text together, so I need a not_analyzed version. Here's my attempt to do that with a template, located at ~\config\templates\doc_template.json on a Windows box, which does not seem to be working. Elasticsearch is running as a Windows service.
{
"doc_template": {
"template": "*",
"mappings": {
"Type-*": {
"properties": {
"Field1": {
"type": "multi_field",
"fields": {
"Field1": { "index": "analyzed" },
"RawField1": { "index": "not_analyzed" }
}
}
}
}
}
}
}
In the terms panel, I expect the necessary field to be either RawField1 or Field1.RawField1, but I've tried other variations including and excluding .raw, with no luck.
New indexes are created daily. Field1 exists in 4 separate types, each of which begin with "Type-". I suspect my attempt at using a wildcard there is problematic, but I'm not sure. All data is being sent to Elasticsearch via NEST in a C# .NET application. Here's the mapping for Field1 as it currently exists for one of the types:
{
"index-2014.12.08" : {
"mappings" : {
"Type-1" : {
"properties" : {
"Field1" : {
"type" : "string"
},
"Field2" : {
"type" : "string"
},
"Field3" : {
"type" : "string"
}
}
}
}
}
}
Obviously, the mapping doesn't look like how I expect. What's the best way to remedy this issue?

Elasticsearch Map case insensitive to not_analyzed documents

I have a type with following mapping
PUT /testindex
{
"mappings" : {
"products" : {
"properties" : {
"category_name" : {
"type" : "string",
"index" : "not_analyzed"
}
}
}
}
}
I wanted to search for an exact word. That's why I set this to not_analyzed.
But the problem is that I want the search to be case insensitive, matching both lower case and upper case.
I searched for it and found a way to make it case insensitive.
curl -XPOST localhost:9200/testindex -d '{
"mappings" : {
"products" : {
"properties" : {
"category_name":{"type": "string", "index": "analyzed", "analyzer":"lowercase_keyword"}
}
}
}
}'
Is there any way to apply these two mappings to the same field?
Thanks..
I think this example meets your needs:
$ curl -XPUT localhost:9200/testindex/ -d '
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"analyzer_keyword":{
"tokenizer":"keyword",
"filter":"lowercase"
}
}
}
}
},
"mappings":{
"test":{
"properties":{
"title":{
"analyzer":"analyzer_keyword",
"type":"string"
}
}
}
}
}'
taken from here: How to setup a tokenizer in elasticsearch
it uses both the keyword tokenizer and the lowercase filter on a string field which I believe does what you want.
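As a rough illustration of what this analyzer does, here is a toy Python version. The function names are assumptions, not Elasticsearch APIs; the point is only that the keyword tokenizer keeps the whole value as one token and the lowercase filter then normalizes its case.

```python
def keyword_tokenizer(text):
    """Emit the whole input as a single token."""
    return [text]


def lowercase_filter(tokens):
    """Token filter that lowercases every token."""
    return [token.lower() for token in tokens]


def analyzer_keyword(text):
    """One tokenizer followed by one token filter, as in the settings above."""
    return lowercase_filter(keyword_tokenizer(text))
```

Because the same analyzer runs on the indexed value and on an analyzed query, "Home Appliances" and "HOME appliances" both become the single token "home appliances" and therefore match, while a partial value such as "home" does not.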
If you want case insensitive queries ONLY, consider changing both your data AND your query to either of lower/upper case before you go about doing your business.
That would mean you keep your field not_analyzed and enter data/query in only one of the cases.
I believe this Gist answers your question best:
* https://gist.github.com/mtyaka/2006966
You can index a field several times during mapping and we do this all the time where one is not_analyzed and another is. We typically set the not_analyzed version to .raw
Like John P. wrote, you can set up analyzer during runtime, or you can set one up in the config at server start like in link above:
# Register the custom 'lowercase_keyword' analyzer. It doesn't do anything else
# other than changing everything to lower case.
index.analysis.analyzer.lowercase_keyword.type: custom
index.analysis.analyzer.lowercase_keyword.tokenizer: keyword
index.analysis.analyzer.lowercase_keyword.filter: [lowercase]
Then you define your mapping for your field(s) with both the not_analyzed version and the analyzed one:
# Map the 'tags' property to two fields: one that isn't analyzed,
# and one that is analyzed with the 'lowercase_keyword' analyzer.
curl -XPUT 'http://localhost:9200/myindex/images/_mapping' -d '{
"images": {
"properties": {
"tags": {
"type": "multi_field",
"fields": {
"tags": {
"index": "not_analyzed",
"type": "string"
},
"lowercased": {
"index": "analyzed",
"analyzer": "lowercase_keyword",
"type": "string"
}
}
}
}
}
}'
And finally your query (note lowercased values before building query to help find match):
# Issue queries against the index. The search query must be manually lowercased.
curl -XPOST 'http://localhost:9200/myindex/images/_search?pretty=true' -d '{
"query": {
"terms": {
"tags.lowercased": [
"event:battle at the boardwalk"
]
}
},
"facets": {
"tags": {
"terms": {
"field": "tags",
"size": "500",
"regex": "^team:.*"
}
}
}
}'
Just create your custom analyzer with the keyword tokenizer and the lowercase token filter.
For this scenario, I suggest combining the lowercase filter and the keyword tokenizer into a custom analyzer, and lowercasing your search input keywords.
1. Create the index with an analyzer that combines the lowercase filter and the keyword tokenizer
curl -XPUT localhost:9200/test/ -d '
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"your_custom_analyzer":{
"tokenizer":"keyword",
"filter": ["lowercase"]
}
}
}
}
}
}'
2. Put the mapping and set the field to use the analyzer
curl -XPUT localhost:9200/test/_mappings/twitter -d '
{
"twitter": {
"properties": {
"content": {"type": "string", "analyzer": "your_custom_analyzer" }
}
}
}'
3. You can then search for what you want with a wildcard query.
curl -XPOST localhost:9200/test/twitter/_search -d '{
"query": {
"wildcard": {"content": "**the words you want to search**"}
}
}'
Another way is to search a field in different ways. My suggestion is to use the multi_field type.
You could define the field as a multi_field:
curl -XPUT localhost:9200/test/_mapping/twitter -d '
{
"properties": {
"content": {
"type": "multi_field",
"fields": {
"default": {"type": "string"},
"search": {"type": "string", "analyzer": "your_custom_analyzer"}
}
}
}
}'
You can then index data with the mapping above and finally search it in two ways (default/your_custom_analyzer).
We could achieve case insensitive searching on non-analyzed strings using ElasticSearch scripting.
Example Query Using Inline Scripting:
{
"query" : {
"bool" : {
"must" : [{
"query_string" : {
"query" : "\"apache\"",
"default_field" : "COLLECTOR_NAME"
}
}, {
"script" : {
"script" : "if(doc['verb'].value != null) {doc['verb'].value.equalsIgnoreCase(\"geT\")}"
}
}
]
}
}
}
You need to enable scripting in the elasticsearch.yml file. Using scripts in search queries could reduce your overall search performance. If you want scripts to perform better, then you should make them "native" using java plugin.
Example Plugin Code:
public class MyNativeScriptPlugin extends Plugin {
@Override
public String name() {
return "Indexer scripting Plugin";
}
public void onModule(ScriptModule scriptModule) {
scriptModule.registerScript("my_script", MyNativeScriptFactory.class);
}
public static class MyNativeScriptFactory implements NativeScriptFactory {
@Override
public ExecutableScript newScript(@Nullable Map<String, Object> params) {
return new MyNativeScript(params);
}
@Override
public boolean needsScores() {
return false;
}
}
public static class MyNativeScript extends AbstractSearchScript {
Map<String, Object> params;
MyNativeScript(Map<String, Object> params) {
this.params = params;
}
@Override
public Object run() {
ScriptDocValues<?> docValue = (ScriptDocValues<?>) doc().get(params.get("key"));
if (docValue instanceof Strings) {
return ((String) params.get("value")).equalsIgnoreCase(((Strings) docValue).getValue());
}
return false;
}
}
}
Example Query Using Native Script:
{
"query" : {
"bool" : {
"must" : [{
"query_string" : {
"query" : "\"apache\"",
"default_field" : "COLLECTOR_NAME"
}
}, {
"script" : {
"script" : "my_script",
"lang" : "native",
"params" : {
"key" : "verb",
"value" : "GET"
}
}
}
]
}
}
}
It is simple: just create the mapping as follows
{
"mappings" : {
"products" : {
"properties" : {
"category_name" : {
"type" : "string"
}
}
}
}
}
There is no need to set index if you want case-insensitive matching, because the default analyzer is "standard", which takes care of case insensitivity.
I wish I could add a comment, but I can't. So the answer to this question is "this is not possible".
Analyzers are composed of a single Tokenizer and zero or more TokenFilters.
I wish I could tell you something else, but after spending 4 hours researching, that's the answer. I'm in the same situation. You can't skip tokenization. It's either all on or all off.
