ElasticSearch 1.5: Add new field in existing working Index - elasticsearch

I have an existing index named "MyIndex", which I am using to store one kind of data in ElasticSearch. That index already has millions of records. I am using ElasticSearch version 1.5.
Now I have a new requirement: I want to add two more fields to the documents I am storing in the "MyIndex" index. Going forward, I want to use both the new-schema and the old-schema documents.
What can I do?
Can I insert new documents into the same index?
Do we need to make changes to the ElasticSearch mapping?
If we don't change anything, will it affect the existing search capability?
Please help me resolve this issue with your opinions.
Thanks in advance.

You can add new fields to an existing index by updating its mapping, but in many cases it is fine to simply index documents with the new fields directly and let ES infer the types (although that is not always recommended). It depends on what type of data you're indexing and whether you need special analyzers for your strings.
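For the explicit-mapping route on ES 1.x, a minimal sketch looks like the following; the document type "mytype" and the two field names are made up for illustration, so adjust them to your data:

# add two new fields to an existing type's mapping (ES 1.x syntax)
curl -XPUT 'http://localhost:9200/MyIndex/_mapping/mytype' -d '
{
  "properties": {
    "new_string_field": { "type": "string", "index": "not_analyzed" },
    "new_date_field":   { "type": "date" }
  }
}'

Existing documents are unaffected by the mapping update; the new fields simply come back empty for them until you update or reindex those documents.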

Related

How to populate data for existing documents after new fields are added in mapping

Recently we came across a requirement where we need to add 3 new fields to our existing index. We pull this data from our source database using Logstash. We already have thousands of documents stored in the current index. In the past, we were told that whenever a change happens to an existing index (such as adding a new field) we need to reindex with a complete data reload, since we want the previous documents to have these new fields populated with data.
Is there any other way we can achieve this without dropping the existing index or deleting any documents and reloading? I was hoping there is a better way of doing this with the latest 7.x version.
No need to drop the index and recreate it. After you have updated the mapping of the index, you just upsert the documents into the index again with the new fields. Each document will be overwritten by the new one.
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html
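As a sketch of what that looks like on 7.x (the index name, field name, and document id below are hypothetical): first extend the mapping, then re-send the documents through the bulk API as partial updates:

# 1. add the new field to the mapping (7.x, no type in the path)
curl -XPUT 'http://localhost:9200/my-index/_mapping' -H 'Content-Type: application/json' -d '
{
  "properties": {
    "new_field": { "type": "keyword" }
  }
}'

# 2. upsert the documents with the new field via the bulk API
curl -XPOST 'http://localhost:9200/_bulk' -H 'Content-Type: application/json' --data-binary '
{ "update": { "_index": "my-index", "_id": "1" } }
{ "doc": { "new_field": "some value" }, "doc_as_upsert": true }
'

Using a partial update ("doc") merges the new field into each stored document, so fields you are not resending are left as they were.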

Elastic search for batch update of old documents

In my application, I am using Elasticsearch for indexing and searching of documents. As expected, the documents have some fields.
Due to new requirements, users want those documents to have some more new fields. I can add the new fields for newly created documents, but I also need the old documents to have these fields.
I am thinking of writing a framework which would accept generic criteria for reading old documents and updating them. By generic criteria, I mean it must be able to accept any user-defined condition for reading older documents.
I am new to ES, and hence not sure if it's feasible.
So I want to know: is it feasible to write such a framework using Elasticsearch?
If you provide a custom document id, you can reindex your existing data with the update API (also available in upsert mode). This way you can update the documents, adding the new fields, when you re-import the old data.
It is important to provide a document id; otherwise it is impossible to add fields to the existing documents, since only inserts are possible.
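A minimal sketch of that update-with-upsert call (the index, type, id, and field name are invented for illustration; the URL shape is the pre-7.x one with a type in the path):

# partially update document 42, creating it if it does not exist yet
curl -XPOST 'http://localhost:9200/my-index/mytype/42/_update' -d '
{
  "doc": { "new_field": "value for the new field" },
  "doc_as_upsert": true
}'

With "doc_as_upsert": true, the document is created if the id does not exist yet, and patched with the new field if it does.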

Specifying data type and analyzer while creating index

I am using Elasticsearch to index my data, and I was able to do it with the following POST request:
http://localhost:9200/index/type/id
{
JSON data over here
}
Yesterday, while I was going through some of the Elasticsearch tutorials, I found one person mentioning setting an analyzer on those fields where we are planning to do full-text search. I found during my googling that the mapping API can be used to update datatypes and analyzers, but in my case I want to do it as I am creating the index.
How can I do it?
You can create the index with custom settings (and mappings) in a first request and then index your data with a second request. In this case you cannot do both at the same time.
However, if you index your data first and the index does not exist yet, it will be created automatically with default settings. You can then update your mappings.
Source: the Elasticsearch "Create Index" documentation
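For illustration, here is a create-index request that defines a custom analyzer and assigns it to a field in one shot; the analyzer, type, and field names are hypothetical, and the mapping shape matches the pre-7.x type-in-mapping layout implied by the question's URL:

# create the index with analysis settings and a mapping in a single request
curl -XPUT 'http://localhost:9200/index' -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_text_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "title": { "type": "string", "analyzer": "my_text_analyzer" }
      }
    }
  }
}'

After this, the same POST http://localhost:9200/index/type/id request from the question indexes into the prepared index instead of triggering auto-creation with default settings.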

reindexing elastic search or updating indexes?

I am new to Elasticsearch, and I can't figure out how to update an Elasticsearch index, type, or document without deleting and reindexing. Or is that the best way to achieve it?
So if I have products in my SQL product table, is it better to delete the product type and reindex it, or even the entire DB as an index on Elasticsearch? What is the best use case and how can I achieve it?
I would like to do it with NEST preferably, but if it is easier, plain Elasticsearch works for me as well.
Thanks
This can be a real challenge! Historic records in Elasticsearch will need to be reindexed when the template changes. New records will automatically be formatted according to the template you specify.
Using this link has helped us a lot:
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html
You'll want to be sure to have the Logstash filter set up to match the fields in your template.
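As an example of what such a template can look like (the name, index pattern, and fields below are invented; on versions before 6.0 the "index_patterns" key was called "template"):

# store a legacy index template; it applies to indices created after this point
curl -XPUT 'http://localhost:9200/_template/products' -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["products-*"],
  "settings": { "number_of_shards": 1 },
  "mappings": {
    "properties": {
      "name":  { "type": "text" },
      "price": { "type": "double" }
    }
  }
}'

Any index whose name matches products-* and is created after the template is stored picks these settings up automatically; existing indices are untouched, which is why the historic data still needs a reindex.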

Does ElasticSearch Snapshot/Restore functionality cause the data to be analyzed again during restore?

I have a decent amount of data in my ElasticSearch index. I changed the default analyzer for the index, and hence I essentially need to reindex my data so that it is analyzed again using the new analyzer. Rather than creating a test script that deletes all of the existing data in the ES index and re-adds it, I wondered whether there is a back-up/restore module I could use. As part of that, I found the snapshot/restore module that ES supports - ElasticSearch-SnapshotAndRestore.
My question is: if I use the above ES snapshot/restore module, will it actually cause the data to be re-analyzed? Since I changed the default analyzer, I need the data to be re-analyzed. If not, is there an alternate tool/module you would suggest that allows for a pure export and import of data, and hence causes the data to be re-analyzed during import?
No, it does not re-analyze the data. You will need to reindex your data.
Fortunately that's fairly straightforward with Elasticsearch, as it stores the source of your documents by default:
Reindexing your data
While you can add new types to an index, or add new fields to a type, you can't add new analyzers or make changes to existing fields. If you were to do so, the data that has already been indexed would be incorrect and your searches would no longer work as expected.
The simplest way to apply these changes to your existing data is just to reindex: create a new index with the new settings and copy all of your documents from the old index to the new index.
One of the advantages of the _source field is that you already have the whole document available to you in Elasticsearch itself. You don't have to rebuild your index from the database, which is usually much slower.
To reindex all of the documents from the old index efficiently, use scan & scroll to retrieve batches of documents from the old index, and the bulk API to push them into the new index.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/reindex.html
I'd read up on Scan and Scroll prior to taking this approach:
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/scan-scroll.html
TaskRabbit open-sourced an import/export tool; I've not used it, so I cannot vouch for it, but it is worth a look:
https://github.com/taskrabbit/elasticsearch-dump
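Putting the quoted recipe into concrete requests, here is a rough scan-and-scroll sketch in the ES 1.x syntax the question implies; the index names, type, and document values are placeholders, and the scroll id comes from each previous response:

# 1. open a scan-type scroll over the old index; the response holds a _scroll_id but no hits yet
curl -XGET 'http://localhost:9200/old_index/_search?search_type=scan&scroll=1m' -d '{ "size": 100 }'

# 2. fetch the next batch of documents, passing the _scroll_id returned by the previous call
curl -XGET 'http://localhost:9200/_search/scroll?scroll=1m' -d "$SCROLL_ID"

# 3. push each batch into the new index (created beforehand with the new analyzer) via the bulk API
curl -XPOST 'http://localhost:9200/new_index/_bulk' --data-binary '
{ "index": { "_type": "mytype", "_id": "1" } }
{ "field": "value from the old document" }
'

Repeat steps 2 and 3 until the scroll returns no more hits, then point your application (or an alias) at new_index.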
