elasticsearch reindex. select nested fields - elasticsearch

Is it possible to set particular nested fields for reindexing?
According to the docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#docs-reindex-filter-source), the selected fields are given as an array:
POST _reindex
{
"source": {
"index": "twitter",
"_source": ["user", "_doc"]
},
"dest": {
"index": "new_twitter"
}
}
For example, we need to reindex only some nested fields of user, such as "name" and "birthdate".
How could this be done? We need something like this:
POST _reindex
{
"source": {
"index": "twitter",
"_source": { "user": ["name", "birthdate"], "_doc"]
},
"dest": {
"index": "new_twitter"
}
}

POST _reindex
{
"source": {
"index": "twitter",
"_source": [ "user.name", "user.birthdate", "_doc"]
},
"dest": {
"index": "twitter_new"
}
}
Use dot notation (for example user.name) to refer to fields inside an object.
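To see what the dotted paths select, here is a small local sketch in Python. It does not call Elasticsearch: `filter_source` is a hypothetical helper that only mimics the include behaviour of `_source` filtering for plain dotted paths (no wildcards), and the sample document is made up.

```python
_MISSING = object()  # sentinel so that legitimate None values are kept


def filter_source(doc, includes):
    """Keep only the dotted paths listed in `includes` (no wildcard support)."""
    result = {}
    for path in includes:
        keys = path.split(".")
        value = doc
        # Walk down the document; stop if any part of the path is absent.
        for key in keys:
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                value = _MISSING
                break
        if value is _MISSING:
            continue
        # Rebuild the same nesting in the filtered copy.
        target = result
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = value
    return result


# Made-up sample document for illustration.
doc = {
    "user": {"name": "kimchy", "birthdate": "1980-01-01", "email": "k@example.com"},
    "message": "hello",
}
includes = ["user.name", "user.birthdate"]
filtered = filter_source(doc, includes)

# The same list of paths goes into the _reindex request body.
reindex_body = {
    "source": {"index": "twitter", "_source": includes},
    "dest": {"index": "twitter_new"},
}
print(filtered)
```

Only `user.name` and `user.birthdate` survive; `user.email` and `message` are dropped.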

Related

Elasticsearch remove a field from an object of an array in a dynamically generated index

I'm trying to delete fields from an object of an array in Elasticsearch. The index has been dynamically generated.
This is the mapping:
{
"mapping": {
"_doc": {
"properties": {
"age": {
"type": "long"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"result": {
"properties": {
"resultid": {
"type": "long"
},
"resultname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"timestamp": {
"type": "date"
}
}
}
}
}
this is a document:
{
"result": [
{
"resultid": 69,
"resultname": "SFO"
},
{
"resultid": 151,
"resultname": "NYC"
}
],
"age": 54,
"name": "Jorge",
"timestamp": "2020-04-02T16:07:47.292000"
}
My goal is to remove the resultid field from every object in result, in all documents of the index. After the update, the document should look like this:
{
"result": [
{
"resultname": "SFO"
},
{
"resultname": "NYC"
}
],
"age": 54,
"name": "Jorge",
"timestamp": "2020-04-02T16:07:47.292000"
}
I tried following these Stack Overflow questions, but with no luck:
Remove elements/objects From Array in ElasticSearch Followed by Matching Query
remove objects from array that satisfying the condition in elastic search with javascript api
Delete nested array in elasticsearch
Removing objects from nested fields in ElasticSearch
Hopefully someone can help me find a solution.
You should reindex your index into a new one with the _reindex API and use a script to remove the field:
POST _reindex
{
"source": {
"index": "my-index"
},
"dest": {
"index": "my-index-reindex"
},
"script": {
"source": """
for (int i=0;i<ctx._source.result.length;i++) {
ctx._source.result[i].remove("resultid")
}
"""
}
}
Afterwards you can delete the original index:
DELETE my-index
And reindex the data back into it:
POST _reindex
{
"source": {
"index": "my-index-reindex"
},
"dest": {
"index": "my-index"
}
}
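As a rough local illustration of what the script does to each document, the Python sketch below applies the same loop to the sample document from the question. `remove_result_ids` is a hypothetical helper, not part of any Elasticsearch client, and unlike the Painless script it tolerates documents that have no `result` field at all.

```python
import copy


def remove_result_ids(source):
    """Mirror the Painless loop: drop "resultid" from every object in "result"."""
    updated = copy.deepcopy(source)  # leave the input document untouched
    for item in updated.get("result", []):
        item.pop("resultid", None)  # no error if the key is already absent
    return updated


# Sample document taken from the question.
document = {
    "result": [
        {"resultid": 69, "resultname": "SFO"},
        {"resultid": 151, "resultname": "NYC"},
    ],
    "age": 54,
    "name": "Jorge",
    "timestamp": "2020-04-02T16:07:47.292000",
}

cleaned = remove_result_ids(document)
print(cleaned["result"])
```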
I combined the answer from Luc E with some of my own knowledge in order to reach a solution without reindexing.
POST INDEXNAME/TYPE/_update_by_query?wait_for_completion=false&conflicts=proceed
{
"script": {
"source": "for (int i=0;i<ctx._source.result.length;i++) { ctx._source.result[i].remove(\"resultid\")}"
},
"query": {
"bool": {
"must": [
{
"exists": {
"field": "result.resultid"
}
}
]
}
}
}
Thanks again Luc!
If your array contains more than one copy of the element you want to remove, use this:
ctx._source.some_array.removeIf(tag -> tag == params['c'])
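The difference between removing one occurrence and removing every match can be shown with a plain Python list. This is only an analogy for the `removeIf` call above; the `tags` values are made up, and `remove_all` is a hypothetical helper.

```python
def remove_all(values, target):
    """Return a new list without any element equal to `target`."""
    return [value for value in values if value != target]


tags = ["a", "b", "a", "c"]  # made-up sample data

# A single remove call drops only the first matching element.
removed_once = list(tags)
removed_once.remove("a")

# Filtering with a predicate drops every matching element.
removed_everywhere = remove_all(tags, "a")

print(removed_once)
print(removed_everywhere)
```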

How to _reindex elasticsearch data to new mapping (from flat fields to objects)?

I have an old Elasticsearch index with more than 20K documents. Each document has fields like these:
{
"title": "Test title",
"title_ar": "عنوان تجريبي",
"body": "<p>......</p>"
}
I want to _reindex them so that all the data is converted to a new mapping like this:
{
"title_1": {
"en": "Test title",
"ar": "عنوان تجريبي"
},
"body": "<p>......</p>"
}
What is the best elasticsearch pipeline processor to make this conversion available in _reindex API?
I suggest simply using the reindex API with a script to do this:
POST _reindex
{
"source": {
"index": "old_index"
},
"dest": {
"index": "new_index"
},
"script": {
"source": "ctx._source.title = [ 'en' : ctx._source.title, 'ar': ctx._source.remove('title_ar')]",
"lang": "painless"
}
}
If in your old_index index you have this:
{
"title": "Test title",
"title_ar": "عنوان تجريبي",
"body": "<p>......</p>"
}
In your new index, you'll have this:
{
"title": {
"en": "Test title",
"ar": "عنوان تجريبي"
},
"body": "<p>......</p>"
}
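The reshaping itself can be tried locally before running the reindex. The sketch below is an assumption-laden stand-in for the script, written in Python: `merge_titles` is a hypothetical helper that nests both titles under `title` and drops the flat `title_ar` field.

```python
def merge_titles(source):
    """Nest the flat title fields under one "title" object."""
    updated = dict(source)  # shallow copy so the input is not modified
    updated["title"] = {
        "en": updated["title"],         # read the old flat title first
        "ar": updated.pop("title_ar"),  # then move title_ar into the object
    }
    return updated


old_doc = {
    "title": "Test title",
    "title_ar": "عنوان تجريبي",
    "body": "<p>......</p>",
}

new_doc = merge_titles(old_doc)
print(new_doc)
```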

Elastic Search change index to a document

How can I change the _index of an existing document in Elasticsearch?
Example:
1) I create an index:
PUT /customer?pretty
2) I add a document:
POST /customer/_doc?pretty
{
"name": "John Doe"
}
3) I create another index:
PUT /customer2?pretty
How do I move the document created in step 2 into the new _index customer2?
POST _reindex
{
"source": {
"index": "customer",
"type": "_doc",
"query": {
"term": {
"_id": "fMn2OmcBEGEHUvm1g7Mi"
}
}
},
"dest": {
"index": "customer2"
}
}
DELETE /customer/_doc/fMn2OmcBEGEHUvm1g7Mi
where "fMn2OmcBEGEHUvm1g7Mi" is the id of the document.
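Moving a document is a copy into the destination index followed by a delete from the index it came from. The Python sketch below only builds the two requests as data so the order and the target of each call are explicit; `move_document_requests` is a hypothetical helper and nothing is sent to a cluster.

```python
def move_document_requests(source_index, dest_index, doc_id):
    """Return the (method, path, body) tuples needed to move one document."""
    reindex = (
        "POST",
        "/_reindex",
        {
            "source": {
                "index": source_index,
                # Restrict the copy to the single document being moved.
                "query": {"term": {"_id": doc_id}},
            },
            "dest": {"index": dest_index},
        },
    )
    # The delete targets the SOURCE index, once the copy has succeeded.
    delete = ("DELETE", f"/{source_index}/_doc/{doc_id}", None)
    return [reindex, delete]


requests_to_send = move_document_requests("customer", "customer2", "fMn2OmcBEGEHUvm1g7Mi")
for method, path, body in requests_to_send:
    print(method, path, body)
```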
There isn't a way to edit the meta fields in a document. The best way would be to reindex it into a new index and delete the older index.
POST _reindex
{
"source": {
"index": "customer"
},
"dest": {
"index": "customer2"
}
}

ElasticSearch NEST Reindex, edit name fields

I have an index with nested objects, something like:
"_index": "originindex",
"_source": {
"message": "",
"environment": "",
"nestedObj": {
"field1": "field1",
"field2": 1 },
"anotherfield": 1}
And I want to reindex it to something like:
"_index": "newindex",
"_source": {
"message": "",
"nestedObj-field1":"field1",
"nestedObj-field2": 1 ,
"anotherfield": 1}
I'm new to all of this. I'm using NEST on .NET v4.5; it offers a Reindex API, but I don't know how to use it for this purpose.
Thank you!
POST _reindex
{
"source": {
"index": "originindex"
},
"dest": {
"index": "newindex"
},
"script":{
"source":"def obj = ctx._source.remove('nestedObj'); ctx._source['nestedObj-field1'] = obj.field1; ctx._source['nestedObj-field2'] = obj.field2;"
}
}
Just make sure your mappings are in place on the dest index before you execute this.
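The flattening can be checked locally on the sample document before reindexing. This Python sketch is a generic stand-in for the script, not NEST or Painless code; `flatten_object` and its `sep` parameter are hypothetical names chosen for illustration.

```python
def flatten_object(source, field, sep="-"):
    """Replace the object at `field` with top-level "<field><sep><key>" entries."""
    updated = dict(source)       # shallow copy so the input is not modified
    nested = updated.pop(field)  # remove the nested object from the copy
    for key, value in nested.items():
        updated[f"{field}{sep}{key}"] = value
    return updated


origin_doc = {
    "message": "",
    "environment": "",
    "nestedObj": {"field1": "field1", "field2": 1},
    "anotherfield": 1,
}

flat_doc = flatten_object(origin_doc, "nestedObj")
print(flat_doc)
```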

Raw nested aggregation

I would like to create a raw nested aggregation in ElasticSearch, but I'm unable to get it working.
My documents look like this:
{
"_index": "items",
"_type": "frame_spec",
"_id": "19770602001",
"_score": 1,
"_source": {
"item_type_name": "frame_spec",
"status": "published",
"creation_date": "2016-02-18T11:19:15Z",
"last_change_date": "2016-02-18T11:19:15Z",
"publishing_date": "2016-02-18T11:19:15Z",
"attributes": [
{
"brand": "Sun"
},
{
"model": "Sunglasses1"
},
{
"eyesize": "56"
},
{
"opc": "19770602001"
},
{
"madein": "UNITED KINGDOM"
}
]
}
}
What I want to do is to aggregate based on one of the attributes. I can't do a normal aggregation with "attributes.model" (for example) because some of them contain spaces. So I've tried using the "raw" property but it appears that ES considers it as a normal property and does not return any result.
This is what I've tried:
{
"size": 0,
"aggs": {
"brand": {
"terms": {
"field": "attributes.brand.raw"
}
}
}
}
But I get no results.
Do you have any solution I could use for this problem?
You should use a dynamic_template in your mapping that will catch all attributes.* string fields and create a raw sub-field for all of them. For other types than string, you don't really need raw fields. You need to delete your current index and then recreate it with this:
DELETE items
PUT items
{
"mappings": {
"frame_spec": {
"dynamic_templates": [
{
"strings": {
"match_mapping_type": "string",
"path_match": "attributes.*",
"mapping": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}
]
}
}
}
After that, you need to re-populate your index and then you'll be able to run this:
POST /items/_search
{
"size": 0,
"aggs": {
"brand": {
"terms": {
"field": "attributes.brand.raw"
}
}
}
}
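Why the raw sub-field matters for values with spaces can be illustrated without a cluster. The Python sketch below compares term buckets for an analyzed field and a not_analyzed one. The lowercase-and-split function is only a rough stand-in for the analyzer (an assumption, not the real tokenizer), and the "UNITED STATES" value is made up.

```python
from collections import Counter


def analyzed_terms(value):
    """Rough stand-in for an analyzed string: lowercase, split on whitespace."""
    return value.lower().split()


def raw_terms(value):
    """A not_analyzed field keeps the whole value as one term."""
    return [value]


# Sample "madein" values; the second country is made up for contrast.
values = ["UNITED KINGDOM", "UNITED KINGDOM", "UNITED STATES"]

analyzed_buckets = Counter(term for v in values for term in analyzed_terms(v))
raw_buckets = Counter(term for v in values for term in raw_terms(v))

print(analyzed_buckets)
print(raw_buckets)
```

The analyzed buckets split each country into separate words, while the raw buckets keep one bucket per full value, which is what the terms aggregation on `attributes.*.raw` relies on.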
