Does a SetProcessor require dynamic mapping? - elasticsearch

When using a SetProcessor to enrich documents with a new field, what is the behavior if strict mapping is used for the index? Does the field being set by the SetProcessor need to be added to the mapping beforehand?

Yes, the new field needs to be added to the mapping before the pipeline runs. It makes no difference whether you add a new field to your source document yourself or an ingest pipeline creates one out of the blue: strict mapping is strict, no matter what.
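As a minimal sketch of what that means in practice, the Python below builds the two request bodies involved: an index mapping with `"dynamic": "strict"` that already declares the field, and a pipeline whose set processor writes that field. The field names (`message`, `environment`) and the value are hypothetical, and the typeless mapping format of Elasticsearch 7 and later is assumed. The bodies would be sent with `PUT /<index>` and `PUT /_ingest/pipeline/<id>`.

```python
import json

# Hypothetical field names; "environment" is the field the pipeline adds.
index_body = {
    "mappings": {
        "dynamic": "strict",  # reject documents that contain unmapped fields
        "properties": {
            "message": {"type": "text"},
            "environment": {"type": "keyword"},  # declared before the pipeline runs
        },
    }
}

pipeline_body = {
    "description": "Adds a constant field to every document",
    "processors": [
        {"set": {"field": "environment", "value": "production"}}
    ],
}

# Local sanity check: every field written by a set processor is already mapped.
set_fields = [p["set"]["field"] for p in pipeline_body["processors"] if "set" in p]
unmapped = [f for f in set_fields if f not in index_body["mappings"]["properties"]]

print(json.dumps(index_body, indent=2))
print(json.dumps(pipeline_body, indent=2))
print("unmapped fields:", unmapped)
```

If `unmapped` is not empty, indexing through the pipeline into the strict index would be rejected for those fields.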

Related

How to create an Elasticsearch index with index sorting via Spring annotation

I'm using Spring Data for Elasticsearch. I need to create an index with index sorting, as described here.
Is there a way to define a POJO field to be used as a sorting field during indexing?
I'm using annotations, and that would be the preferred way, but any other option would be OK too.
Currently this is not possible. Index sorting must be defined when the index is created. Although it is currently possible to define a JSON file with index settings and attach it to the entity with @Setting, this fails in this case. The reason is that when index sorting is defined, the corresponding field must also be defined in the mappings on index creation. Spring Data Elasticsearch first creates the index with the settings and only afterwards writes the mappings, which is then too late.
Please open an issue in the issue tracker asking that index creation with index sorting be made possible. We have to think about how to define the sort fields.
Edit 28.03.2021:
From Spring Data Elasticsearch 4.2.0.RC1 on, index creation always happens in one step together with writing the mapping, so it's possible to provide a settings file that will be used along with the mapping.
It is now also possible to define the index sorting parameters with arguments of the @Setting annotation, so there is no need for a JSON file at all.
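For illustration, here is a rough sketch of the single create-index request that has to carry both parts. It is plain Python building the body for `PUT /<index>`, not Spring Data code, and the field name `created_at` is hypothetical. The point is that the field named in `sort.field` appears in the mappings of the same request.

```python
import json

# Hypothetical sort field; it must be mapped in the same create-index request.
create_index_body = {
    "settings": {
        "index": {
            "sort.field": "created_at",
            "sort.order": "desc",
        }
    },
    "mappings": {
        "properties": {
            "created_at": {"type": "date"},
        }
    },
}

sort_field = create_index_body["settings"]["index"]["sort.field"]
is_mapped = sort_field in create_index_body["mappings"]["properties"]

print(json.dumps(create_index_body, indent=2))
print("sort field is mapped:", is_mapped)
```

Sending the settings first and the mappings in a later request is exactly the two-step sequence that fails, because the sort field is not known yet when the index is created.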

Elasticsearch set default field analyzer for index

I was wondering if it is possible to modify the behaviour of ES when dynamically mapping a field. In my case I don't want ES to map anything. Most of the fields I have are considered text by ES when the field occurs for the first time.
The correct mapping for our application, though, is keyword in 99% of cases, since we don't want the tokenizer to run on it. Can we modify the behaviour so that new fields are always mapped as keyword (unless defined otherwise in the index mapping, of course)?
Cheers and thanks!
You can use dynamic templates to solve your issue. Moreover, the Elasticsearch guide has a snippet that suits your case.
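A minimal sketch of such a dynamic template, written as a Python dict for the index creation body. The template name `strings_as_keywords` is arbitrary, and the typeless mapping format of Elasticsearch 7 and later is assumed.

```python
import json

# Any newly seen string field is mapped as keyword instead of text.
index_body = {
    "mappings": {
        "dynamic_templates": [
            {
                "strings_as_keywords": {  # arbitrary template name
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }
            }
        ]
    }
}

print(json.dumps(index_body, indent=2))
```

Fields that are listed explicitly under `properties` in the same mapping keep their explicit definition; the template only applies to fields seen for the first time.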

What's the difference between indexing a document after creating an index mapping and creating a document directly with indexing in Elasticsearch

I'm confused about mapping and indexing. As far as I know, mapping an index is like defining a schema for its documents.
My point is that when I'm creating a document, there are several ways:
1) mapping an index -> indexing documents
2) creating documents directly, so that the mapping is created at the same time
So why do I have to define a mapping in some cases?
Elasticsearch lets you skip defining a mapping for fields because it can detect field types and apply its default mapping: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html
It's good practice to always define the mapping explicitly; relying on the ES detection algorithm can cause unpredictable results.
If you really need some dynamic mapping, for example because you don't know all the required fields when defining the mapping, you can use something like dynamic templates or the default mapping.
There are several known limitations with default mappings (for example, a string is mapped as text with a keyword sub-field, and that sub-field ignores values longer than 256 characters). That might work for most cases where the data comes from logs. But defining the mapping yourself gives you more control over what kind of indexing is done and whether a field is indexed at all. Depending on your use case, the preferred option (explicit mapping vs. on-the-fly mapping) might differ.
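To make the trade-off concrete, here is a small sketch comparing the mapping that dynamic mapping typically generates for a new string field (Elasticsearch 5 and later) with an explicit keyword mapping. The helper `keyword_subfield_indexed` is hypothetical and only a rough local model of the `ignore_above` setting, not an Elasticsearch API.

```python
# Mapping that dynamic mapping typically generates for a new string field.
dynamic_string_mapping = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}

# Explicit alternative when the field should never be analyzed.
explicit_keyword_mapping = {"type": "keyword"}


def keyword_subfield_indexed(value: str, mapping: dict) -> bool:
    """Rough local model of ignore_above: values longer than the limit are
    skipped by the keyword sub-field (the text field is still indexed)."""
    limit = mapping["fields"]["keyword"]["ignore_above"]
    return len(value) <= limit


print(keyword_subfield_indexed("short value", dynamic_string_mapping))
print(keyword_subfield_indexed("x" * 300, dynamic_string_mapping))
```

With the explicit mapping there is no analyzed text field and no length limit unless you add `ignore_above` yourself.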

ElasticSearch1.5 : Add new field in existing working Index

I have an existing index named "MyIndex", which I am using to store a certain kind of data in ElasticSearch. That index holds millions of records. I am using ElasticSearch version 1.5.
Now I have a new requirement: I want to add two more fields to the documents I am storing in "MyIndex". In future I want to use both new-schema and old-schema documents.
What can I do?
Can I insert new documents into the same index?
Do we need to change the ElasticSearch mapping?
If we don't change anything, will it affect the existing search capability?
Please help me resolve this issue with your opinions.
Thanks in advance.
You can add new fields to an existing index by updating the mapping, but in many cases it is fine to index documents with the new fields directly and let ES infer the types (although this is not always recommended). It depends on what type of data you're indexing and whether you need special analyzers for strings.
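A sketch of the first option, assuming the Elasticsearch 1.x put-mapping format, where the body is wrapped in the mapping type name and sent to `PUT /<index>/_mapping/<type>`. The type name and both field names are hypothetical; on newer versions the type wrapper is gone and `string` is replaced by `text` or `keyword`.

```python
import json


def put_mapping_body(type_name: str, new_fields: dict) -> dict:
    """Build a put-mapping body that lists only the fields being added."""
    return {type_name: {"properties": new_fields}}


# Hypothetical type and field names, using ES 1.x field types.
body = put_mapping_body(
    "mytype",
    {
        "category": {"type": "string", "index": "not_analyzed"},
        "updated_at": {"type": "date"},
    },
)

print(json.dumps(body, indent=2))
```

Only the new fields need to be listed. Existing fields keep their mapping, and old documents simply have no value for the new fields.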

Is it possible to force a mapping, server side, on a specific index?

I send data to Elasticsearch, to an index named mydata. This index may or may not exist when the data reaches Elasticsearch, and it is created automatically if needed.
The mapping guessed from my data was correct up to now, when I added a new field of geo_point type. This type, as far as I understand, must be explicitly provided with a mapping.
My understanding is that mapping is handled
either dynamically, like in my case
or when creating an index "manually"
or via the Put Mapping API
None of these solutions works for me: the index is deleted and recreated rarely (but unpredictably), and adding the mapping to each document sent to the server would be too much.
Is there a way to store, on the server, a rule of the kind "if you create index mydata, the field position must be of type geo_point"?
Index templates will do exactly what you need. Simply create a template (with mappings and settings) whose pattern matches your index name. As soon as a new document comes in for an index that doesn't exist yet, the index will be created automatically with the proper mappings and settings.
To answer your second question, yes, your mapping may only contain the definition of a few fields (the geo_point you mentioned, etc) and you can let ES map the other ones dynamically.
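A minimal sketch of such a template, assuming the composable index template format of recent Elasticsearch versions (`PUT /_index_template/<name>`); older versions use `PUT /_template/<name>` with a different body layout. The index name `mydata` and the field `position` come from the question. The helper `template_applies` is hypothetical and only approximates the wildcard match locally.

```python
import json
from fnmatch import fnmatchcase

# Only the field that needs an explicit type is listed; everything else
# is left to dynamic mapping.
template_body = {
    "index_patterns": ["mydata*"],
    "template": {
        "mappings": {
            "properties": {
                "position": {"type": "geo_point"},
            }
        }
    },
}


def template_applies(index_name: str, body: dict) -> bool:
    """Local approximation of the wildcard match done on index creation."""
    return any(fnmatchcase(index_name, p) for p in body["index_patterns"])


print(json.dumps(template_body, indent=2))
print(template_applies("mydata", template_body))
print(template_applies("otherdata", template_body))
```

Because the template is stored on the server, it is applied again each time the index is deleted and recreated.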
