Do changes to elasticsearch mapping apply to already indexed documents? - elasticsearch

If I change the mapping so certain properties have new/different boost values, does that work even if the documents have already been indexed? Or do the boost values get applied when the document is indexed?

You cannot change field level boost factors after indexing data. It's not even possible for new data to be indexed once the same fields have been indexed already for previous data.
The only way to change the boost factor is to reindex your data. The pattern to do this without changing the code of your application is to use aliases. An alias points to a specific index. In case you want to change the index, you create a new index, then reindex data from the old index to the new index and finally you change the alias to point to the new index. Reindexing data is either supported by the elasticsearch library or can be achieved with a scan/scroll.
First version of mapping
Index: items_v1
Alias: items -> items_v1
Change necessary, sencond version of the index with new field level boost values :
Create new index: items_v2
Reindex data: items_v1 => items_v2
Change alias: items -> items_v2
This might be useful in other situations where you want to change your mapping.
Field level boosts are, however, not recommended. The better approach is to use boosting at query time.
Alias commands are:
Adding an alias
POST /_aliases
{
"actions": [
{ "add": {
"alias": "tems",
"index": "items_v1"
}}
]
}
Removing an alias
POST /_aliases
{
"actions": [
{ "remove": {
"alias": "tems",
"index": "items_v1"
}}
]
}

They do not.
Index time boosting is generally not recommended. Instead, you should do your boosting when you search.

Related

Elastic Search synonyms - Delete Action

Are synonyms in Elastic Search (version 6.2.3) stored in the items when these are created/updated or are synonyms applied in every search query to the index?
We need to remove the synonyms of an index with 6 million items and I cannot see in the documentation if removing these synonyms from the index will be enough
DELETE /api/as/v1/engines/{ENGINE_NAME}/synonyms/{SYNONYM_SET_ID}
Or it is needed to reindex all the items afterwards, in which case it might be better to delete the current index and create a new one.
If synonyms are applied during inserting the document
Deletion of current synonyms, won't change anything in the existing data of an index, existing data should be searchable by synonyms.
If synonyms are applied during query time
In this case, removing the synonyms will stop searching the document using synonym.
Now the question is whether you are using index-time-analysis or query-time-analysis. You can check in your mappings. E.g
"mappings": {
"properties": {
"text": {
"type": "text",
"analyzer": "autocomplete", // <======== For index time analysis
"search_analyzer": "synonym_analyzer" //<====== For Query time analysis
}
}
}
}

what is offline and online indexing in Elastic search? and when do we need to reindex?

what is offline and online indexing in Elastic search? I did my research but I couldn't find enough resources to see what these terms mean? any idea? and also when do we need to reindex? any examples would be great
The terms offline and online indexing are used here.
https://spark-summit.org/2014/wp-content/uploads/2014/07/Streamlining-Search-Indexing-using-Elastic-Search-and-Spark-Holden-Karau.pdf
Reindexing
The most basic form if reindexing just copies one index to another.
I have used this form of reindexing to change a mapping.
Elasticsearch doesn't allow you to change a mapping, so if you want to change a mapping you have to create a new index (index2) with a new mapping and then reindex. The reindex will fill that new mapping with the data of the old index.
The command below will move everything from index to index2.
curl -XPOST 'localhost:9200/_reindex?pretty' -d'
{
"source": {
"index": "index"
},
"dest": {
"index": "index2"
}
}'
You can also use reindexing to fill a new index with a part of the old one. You can do so by using a couple of parameters. The example below will copy the newest 1000 documents.
POST /_reindex
{
"size": 1000,
"source": {
"index": "index",
"sort": { "date": "desc" }
},
"dest": {
"index": "index2"
}
}
For more examples about reindexing please have a look at the official documentation.
offline vs online indexing
In ONLINE mode the new index is built while the old index is accessible to reads and writes. any update on the old index will also get applied to the new index.
In OFFLINE mode the table is locked up front for any read or write, and then the new index gets built from the old index. No read or write operation is permitted on the table while the index is being rebuilt. Only when the operation is done is the lock on the table released and reads and writes are allowed again.

Elastic search update mappings

I have mappings created wrongly for an object in elastic search. Is there a way to update the mappings. The mapping has been created wrongly for type of the object(String instead of double).
In general, the mapping for existing fields cannot be updated. There are some exceptions to this rule. For instance:
new properties can be added to Object datatype fields.
new multi-fields can be added to existing fields.
doc_values can be disabled, but not enabled.
the ignore_above parameter can be updated.
Source : https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html
That's entirely possible, by PUTting the new mapping over the existing one, here are some examples.
Please note, that you will probably need to reindex all your data after you have done this, because I don't think that ES can convert string indexes to double indexes. (what will instead happen is, that you won't find any document when you search in that field)
PUT Mapping API allows you to add/modified datatype in an existing index.
PUT /assets/asset/_mapping
{
"properties": {
"common_attributes.asset_id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"doc_values": true,
"normalizer": "lowercase_normalizer"
}
}
},
}
}
After updating the mapping, update the existing documents using bulk Update API.
POST /_bulk
{"update":{"_id":"59519","_type":"asset","_index":"assets"}}
{"doc":{"facility_id":491},"detect_noop":false}
Note - Use 'detect_noop' for detecting noop update.

How to set existing elastic search mapping from index: no to index: analyzed

I am new to elastic search, I want to updated the existing mapping under my index. My existing mapping looks like
"load":{
"mappings": {
"load": {
"properties":{
"customerReferenceNumbers": {
"type": "string",
"index": "no"
}
}
}
}
}
I would like to update this field from my mapping to be analyzed, so that my 'customerReferenceNumber' field will be available for search.
I am trying to run the following query in Sense plugin to do so,
PUT /load/load/_mapping { "load": {
"properties": {
"customerReferenceNumbers": {
"type": "string",
"index": "analyzed"
}
}
}}
but I am getting following error with this command,
MergeMappingException[Merge failed with failures {[mapper customerReferenceNumbers] has different index values]
Though there exist data associated with these mappings, here I am unable to understand why elastic search not allowing me to update mapping from no-index to indexed?
Thanks in advance!!
ElasticSearch doesn't allow this kind of change.
And even if it was possible, as you will have to reindex your data for your new mapping to be used, it is faster for you to create a new index with the new mapping, and reindex your data into it.
If you can't afford any downtime, take a look at the alias feature which is designed for these use cases.
This is by design. You cannot change the mapping of an existing field in this way. Read more about this at https://www.elastic.co/blog/changing-mapping-with-zero-downtime and https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html.

Elasticsearch field name aliasing

Is it possible to setup alias for field names in elasticsearch? (Just like how index names can be aliased)
For example: i have a document {'firstname': 'John', 'lastname': 'smith'}
I would like to alias 'firstname' to 'fn'...
Just a quick update, Elasticsearch 6.4 came up with feature called Alias Datatype. Check the below mapping and query as sample.
Note that the type of the field is alias in the below mapping for fieldname fn
Sample Mapping:
PUT myindex
{
"mappings": {
"_doc": {
"properties": {
"firstname": {
"type": "text"
},
"fn": {
"type": "alias",
"path": "firstname"
}
}
}
}
}
Sample Query:
GET myindex/_search
{
"query": {
"match" : {
"fn" : "Steve"
}
}
}
The idea is to use the alias for actual field on which inverted index is created. Note that fields with alias datatype aren't meant for write operations and its only meant for querying purpose.
Although you can refer to the link I've mentioned for more details, below are just some of the important points.
Field alias is only meant to be used when your index has a single mapping. Index has to be created post 6.xx version or be created in older version with the setting index.mapping.single_type: true
Can be used in querying, aggregations, sorting, highlighting and suggestion operations
Target field must be actual field on which inverted index is created
Cannot create alias of another alias field
Cannot use alias on multiple fields. Single alias, Single field.
Cannot be used as part of source filtering using _source.
There is no direct field alias functionality. However, you could rename the fields upon indexing using the index_name property in your mappings.
index_name : The name of the field that will be stored in the index.
Defaults to the property/field name.
See here for more information: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html
Adding alias fn for existing field firstname
PUT myindex/_mapping
{
"properties": {
"fn": {
"type": "alias",
"path": "firstname"
}
}
}
Should work this way as of Elasticsearch 7.
Probably you can try creating an alias on your index with filter on the desired field. Your filter must be written in such a way that it selects all the entries from your field. Please refer Filtered aliases section in here. But I am interested in knowing your use case. Why you want to create alias on particular field.

Resources