Documents with new field added before mapping update not queryable via new field - elasticsearch

I have an index that for one reason or another we've added fields to that don't exist in our mapping. For example:
{
  "name": "Bob", // exists in mapping
  "age": 12      // doesn't exist in mapping
}
After updating the mapping to add the age field, any document we add the age field to is queryable, but none of the documents that had age added before we updated the mapping are queryable.
Is there a way to tell Elasticsearch to make those older documents queryable on the new field, not just documents created or updated after the mapping update?

This implies that you must have dynamic: false in your mapping, i.e. whenever you send a new field, you prevent ES from creating it automatically.
Once you have updated your mapping, you can simply call _update_by_query on your index to have it reindex the data it contains with the new mapping.
Your queries will then work also on the "older" data.
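A minimal sketch of those two steps, assuming the index is called myindex and the new field is age (adjust the names and field type to your case):

PUT myindex/_mapping
{
  "properties": {
    "age": { "type": "integer" }
  }
}

POST myindex/_update_by_query?conflicts=proceed

Once the _update_by_query has finished, searches on age should also match the documents that were indexed before the mapping change.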

Related

What is the field "your_type" in Elasticsearch PUT request?

I am trying to resolve this error:
Fielddata is disabled on text fields by default. Set fielddata=true on
and saw one post which suggested doing this, but I didn't understand what the your_type part is in the given snippet:
PUT your_index/_mapping/your_type
I don't know what version of Elasticsearch you have, but as of 7.x mapping types have been removed.
In your case it could look like this (version 7.x and later):
PUT my-index-000001/_mapping
{
  "properties": {
    "name-field": {
      "type": "text",
      "fielddata": true
    }
  }
}
A little about the mapping type:
Since the first release of Elasticsearch, each document has been
stored in a single index and assigned a single mapping type. A mapping
type was used to represent the type of document or entity being
indexed, for instance a twitter index might have a user type and a
tweet type.
Each mapping type could have its own fields, so the user type might
have a full_name field, a user_name field, and an email field, while
the tweet type could have a content field, a tweeted_at field and,
like the user type, a user_name field.
More information here:
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/removal-of-types.html#_why_are_mapping_types_being_removed

Filtering collapsed results in Elasticsearch

I have an elasticsearch index containing documents that represent entities at a given point in time. When an entity changes state, a new document is created with a timestamp. When I need to get the current state of all entities, I can do the following:
GET https://127.0.0.1:9200/myindex/_search
{
  "collapse": {
    "field": "entity_id"
  },
  "sort": [{
    "timestamp": {
      "order": "desc"
    }
  }]
}
However, I would like to further filter the result of the collapse. When entities are deleted I create a new document that includes an is_deleted flag along with the timestamp in a nested metadata field. I would like to extend the above query to entirely filter out those entities that have been deleted. Using a term filter on entity_metadata.is_deleted: true obviously does not work, because then my result just includes the last document with that entity_id before it got marked as deleted. How can I filter my results after the collapse is done to exclude any tombstoned entities?
What I would suggest is that instead of adding an is_deleted flag to all of an entity's documents, you could add a deleted_date field with the date of the deletion to all documents of that entity; then when you view a document, given its date and the deleted_date, you'd know whether the document was live or deleted at that date.
In addition, it would allow you to consider:
all documents that don't have a deleted_date field (i.e. not deleted) and
all documents that have a deleted_date before/after a given date.
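A rough sketch of the first case, assuming deleted_date is a top-level date field written to every document of a deleted entity (names are illustrative): add a bool filter that keeps only documents without a deleted_date, and keep the collapse as before:

GET myindex/_search
{
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "deleted_date" }
      }
    }
  },
  "collapse": {
    "field": "entity_id"
  },
  "sort": [{
    "timestamp": { "order": "desc" }
  }]
}

Because the filter removes every document of a deleted entity, the collapse then only returns the latest state of entities that are still live.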

Setting doc_values for _id field in Elasticsearch

I want to set doc_values for the _id field in Elasticsearch, as I want to perform sorting based on _id.
Hitting the below API to update the mapping gives me an error:
PUT my_index/my_type/_mapping
{
  "properties": {
    "_id": {
      "type": "keyword",
      "doc_value": true
    }
  }
}
reason : Mapping definition for [_id] has unsupported parameters: [doc_value : true]
It is "doc_values"; you are using an incorrect parameter name. https://www.elastic.co/guide/en/elasticsearch/reference/current/doc-values.html
Elastic discourages sorting on _id field. See this
The value of the _id field is also accessible in aggregations or for sorting, but doing so is discouraged as it requires to load a lot of data in memory. In case sorting or aggregating on the _id field is required, it is advised to duplicate the content of the _id field in another field that has doc_values enabled.
EDIT
Create a scripted field for your index pattern with a name, e.g. id, of type string and the script doc['_id'].value. See this link for more information on scripted fields. This will create a new field id that exposes the _id field's value for every document in the indices matching your index pattern. You can then perform sorting on the id field.
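Alternatively, following the advice quoted from the docs above, a minimal sketch of duplicating the id into a sortable field (assuming a 7.x-style typeless index; the field name id is illustrative):

PUT my_index
{
  "mappings": {
    "properties": {
      "id": { "type": "keyword" }
    }
  }
}

PUT my_index/_doc/1
{
  "id": "1",
  "name": "Bob"
}

GET my_index/_search
{
  "sort": [{ "id": "asc" }]
}

Since id is a keyword field, doc_values are enabled for it by default, so sorting on it does not load the whole field into memory the way sorting on _id would.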

Add typed additional attributes to an existing document elasticsearch

I added a field to the document:
POST /erection/shop/1/_update
{
  "doc": {
    "my_field": ""
  }
}
The new field is assigned the type "String". How can I create a new field with the type "Boolean"/"Integer"?
And a second question:
is it possible to add one field to all documents using one query (without updating each document)?
1) Explicitly define a mapping prior to the first update you do.
2) No, you can't. You can do it in your application using "scan" and then "bulk update".
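For 1), a sketch of an explicit mapping update before you first send the new fields (the field names here are only examples; on pre-7.x versions the type name, e.g. shop, would appear in the path as /erection/_mapping/shop):

PUT /erection/_mapping
{
  "properties": {
    "in_stock": { "type": "boolean" },
    "quantity": { "type": "integer" }
  }
}

Documents updated afterwards with these fields will then be indexed as boolean and integer rather than string.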

Elasticsearch: Do I need to index '_id' field specifically?

Looks like the _id field is automatically mapped to _uid by Elasticsearch.
I have the below query to get a document by passing the document's _id:
query: {
  match: {
    _id: myDocumentId
  }
}
Should I specify indexing for the _id field to make the above query work fast, or is that taken care of internally by using the _uid field?
You can just use
GET /<index_name>/<type_name>/<myDocumentId>
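For example, assuming an index called myindex (on 7.x and later there are no mapping types, so _doc takes the place of the type name):

GET /myindex/_doc/myDocumentId

This fetches the document directly by its _id without going through a search query, so no extra indexing of _id is needed for this use case.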
