For production I would like to restrict Elasticsearch automatic index creation. Following the documentation, I have set these in the server's elasticsearch.yml:
action.auto_create_index
index.mapper.dynamic: false
However, I'm still able to insert new documents with fields that do not match the custom mapping.
According to documentation, I think dynamic mapping should be set to strict.
The dynamic creation of fields within a type can be completely disabled by setting the dynamic property of the type to strict.
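To make the strict setting concrete, here is a minimal sketch that builds an index-creation body with dynamic set to strict and imitates, locally, the check that would reject an unmapped field. The field names (title, views) and both helper functions are hypothetical. The body assumes the typeless mapping layout of Elasticsearch 7 and later; on older versions the properties would sit under a type name.

```python
import json


def strict_mapping_body(properties):
    """Build a create-index body whose mapping rejects unmapped fields."""
    return {"mappings": {"dynamic": "strict", "properties": properties}}


def unmapped_fields(body, document):
    """Return top-level document fields that the mapping does not declare.

    This only imitates the strict check locally for illustration;
    Elasticsearch performs the real validation on the server.
    """
    declared = body["mappings"]["properties"]
    return sorted(field for field in document if field not in declared)


# Hypothetical fields, for illustration only.
body = strict_mapping_body({
    "title": {"type": "text"},
    "views": {"type": "integer"},
})

if __name__ == "__main__":
    # This JSON would be sent as the body of: PUT /my_index
    print(json.dumps(body, indent=2))
    # "author" is not declared, so a strict mapping would reject the document.
    print(unmapped_fields(body, {"title": "hello", "author": "sam"}))
```

With a mapping like this in place, indexing a document that carries an undeclared field should fail instead of silently extending the mapping.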
Related
When using a SetProcessor to enrich documents with a new field, what is the behavior if strict mapping is used for the index? Does the field being set by the SetProcessor need to be added to the mapping beforehand?
Yes, the new field needs to be added to the mapping before the pipeline runs. It makes no difference whether you add a new field to your source document yourself or an ingest pipeline creates one out of the blue: strict mapping is strict, no matter what.
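One way to catch this before deploying is to compare the fields written by the pipeline's set processors against the mapping. The sketch below does that in plain Python. The pipeline, the field names (message, environment) and the helper functions are hypothetical, and the check only covers top-level field names, not dotted paths or other processor types.

```python
def set_processor_fields(pipeline):
    """Collect the target field of every `set` processor in a pipeline body."""
    return [
        processor["set"]["field"]
        for processor in pipeline.get("processors", [])
        if "set" in processor
    ]


def fields_missing_from_mapping(pipeline, properties):
    """Return set-processor fields that the mapping properties do not declare."""
    return [f for f in set_processor_fields(pipeline) if f not in properties]


# Hypothetical pipeline body, as it would be sent to PUT _ingest/pipeline/<id>.
pipeline = {
    "description": "hypothetical enrichment pipeline",
    "processors": [{"set": {"field": "environment", "value": "production"}}],
}

# Mapping properties before and after declaring the enriched field.
properties_before = {"message": {"type": "text"}}
properties_after = {**properties_before, "environment": {"type": "keyword"}}

if __name__ == "__main__":
    print(fields_missing_from_mapping(pipeline, properties_before))  # ['environment']
    print(fields_missing_from_mapping(pipeline, properties_after))   # []
```

Any field reported as missing would have to be added to the strict mapping before documents go through the pipeline.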
By default, when you index a document into an index that does not exist, Elasticsearch creates the index and its mapping automatically.
I know that I can change this behavior at the cluster level using this call:
PUT _cluster/settings
{
  "persistent": {
    "action.auto_create_index": "false"
  }
}
but I can't control the customer's Elasticsearch.
Is there a parameter I can send with the index-document request that tells Elasticsearch to fail, rather than create the index, when the index doesn't exist?
If you can't change the cluster settings or elasticsearch.yml, I'm afraid it's not possible, since the document POST/PUT requests have no special parameter for this.
Another possible solution is to put an API layer in front of Elasticsearch that does not forward the request at all when the index does not exist.
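A minimal sketch of such a layer, assuming plain HTTP access to the cluster: it issues HEAD /<index> (which answers 200 or 404) and only forwards the document when the index is there. The function names, the base URL and the POST /<index>/_doc endpoint (the typeless form used by recent versions) are assumptions for illustration. The check is not atomic, so an index deleted between the check and the write would still be auto-created.

```python
import json
import urllib.error
import urllib.request


def index_exists(base_url, index):
    """Return True if `HEAD /<index>` answers 200, False if it answers 404."""
    request = urllib.request.Request(f"{base_url}/{index}", method="HEAD")
    try:
        with urllib.request.urlopen(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # any other HTTP error is unexpected here


def send_document(base_url, index, document):
    """POST the document to the index and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_doc",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def guarded_index(base_url, index, document,
                  exists=index_exists, send=send_document):
    """Index a document only if the target index already exists."""
    if not exists(base_url, index):
        raise LookupError(f"index {index!r} does not exist; refusing to create it")
    return send(base_url, index, document)
```

The exists and send parameters are injectable so the guard can be exercised without a running cluster; in real use you would call guarded_index("http://localhost:9200", "my_index", {...}) with the defaults.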
There is an issue on GitHub proposing to set action.auto_create_index to false by default, but unfortunately I couldn't see any progress on it.
After reading some Elasticsearch index tuning guides like How to Maximize Elasticsearch Index Performance and elastic's Tune for indexing speed I wanted to take a look at updating the refresh_interval.
We are using AWS Elasticsearch domains (Elasticsearch version 6.2). There's no mention of refresh_interval on CloudFormation's doc page for AWS::Elasticsearch::Domain.
So I wanted to see what the default setting was for AWS Elasticsearch.
Using the _settings API doesn't show the refresh_interval.
GET /my_index/_settings
And specifying the refresh_interval doesn't show anything either.
GET /my_index/_settings/index.refresh_interval
Only returns an empty object.
{}
How can I find the current refresh_interval for Elasticsearch?
You need to add a parameter called include_defaults in order to also retrieve the default values:
GET /my_index/_settings?include_defaults=true
In the response, you'll get a defaults section which includes the default value of the refresh_interval setting, most probably 1s.
NOTE: The refresh_interval is missing from the plain _settings response because your index has not set the value explicitly, so it uses the default value.
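If you want to read the value programmatically, a small helper can look in the explicit settings first and fall back to the defaults section. This is a sketch under the assumption that the response is nested as shown in sample_response; that sample is hand-written for illustration, not real output from an AWS domain, and the function name is hypothetical.

```python
def effective_refresh_interval(response, index):
    """Return (value, source) for refresh_interval from a _settings response.

    `response` is the parsed JSON of
    GET /<index>/_settings?include_defaults=true
    """
    entry = response[index]
    explicit = entry.get("settings", {}).get("index", {}).get("refresh_interval")
    if explicit is not None:
        return explicit, "explicit"
    default = entry.get("defaults", {}).get("index", {}).get("refresh_interval")
    return default, "default"


# Hand-written sample, trimmed to the keys this helper reads.
sample_response = {
    "my_index": {
        "settings": {"index": {"number_of_shards": "5"}},
        "defaults": {"index": {"refresh_interval": "1s"}},
    }
}

if __name__ == "__main__":
    print(effective_refresh_interval(sample_response, "my_index"))  # ('1s', 'default')
```

Once the index sets refresh_interval explicitly, the same helper reports the value with the source "explicit".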
I have Elasticsearch 2.3.2 and Lucene 5.5.0.
Some indexes were already existing in my nodes with documents.
To search them I was using
GET /index/type/id
But after I created a new index, I have to search with
GET /index/type/id?routing=routing_name
How should I create the index so that I can search with the previous query form? I can see that the mapping of the new records is the same as the old data.
Also, why do I need to specify routing? Previously this was not the case.
According to the Elasticsearch documentation, routing does not need to be specified by default, because the document id is used as the default routing value. Also, I have not specified {routing: {required: true}} anywhere while creating the index or adding a record.
I'm confused about mapping and indexing. As far as I know, mapping an index is a bit like defining a schema for its documents.
My point is that there are several ways to go about creating a document:
1) mapping an index -> indexing documents
2) indexing documents directly, so that the mapping is created at the same time
So why do I have to define a mapping in some cases?
Elasticsearch lets you skip defining a mapping for fields because it can detect field types and apply a default mapping. https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html
It's good practice to always define the mapping explicitly; relying on the ES detection algorithm can cause unpredictable results.
If you really need some dynamic mapping, for example because you don't know all the required fields when defining the mapping, you can use something like dynamic templates or the default mapping.
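As an illustration of the dynamic templates option, the sketch below builds a create-index body that declares the known fields explicitly and maps any other string field as keyword. The template name strings_as_keywords and the fields created_at and message are hypothetical, and the body again assumes the typeless mapping layout of Elasticsearch 7 and later.

```python
import json


def mapping_with_string_template(properties):
    """Build a mapping: explicit properties plus a catch-all rule for strings."""
    return {
        "mappings": {
            "dynamic_templates": [
                {
                    # Any newly seen string field is mapped as keyword
                    # instead of going through default type detection.
                    "strings_as_keywords": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword"},
                    }
                }
            ],
            "properties": properties,
        }
    }


# Hypothetical known fields, declared up front.
body = mapping_with_string_template({
    "created_at": {"type": "date"},
    "message": {"type": "text"},
})

if __name__ == "__main__":
    # This JSON would be sent as the body of: PUT /my_index
    print(json.dumps(body, indent=2))
```

This keeps control over the fields you know about while still accepting unknown ones in a predictable way.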
Default mappings have several known limitations (for example, in recent versions a string is mapped as text with a keyword sub-field, and that sub-field ignores values longer than 256 characters). That may work for most cases where the data comes from logs. But defining the mapping yourself gives you more control over how each field is indexed and whether it is indexed at all. Depending on your use case, the preferred option (explicit mapping vs. on-the-fly mapping) may differ.