Is there a way in Elasticsearch to remove some of the objects in a nested field array?
I have a nested field that returns an array of objects, and I need to remove some of those objects.
Is it possible to do this in the query, or do I need to do it in my code?
These extra nested documents are hidden; we can’t access them directly. To update, add, or remove a nested object, we have to reindex the whole document. It’s important to note that, the result returned by a search request is not the nested object alone; it is the whole document.
Nested Objects Elastic search
As far as I know, in Elasticsearch you can't remove just part of an existing document. You have to change the document (remove the objects you don't need) and reindex (rewrite) the whole document.
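One way to apply this from application code is to fetch the document's _source, filter the nested array, and index the result back under the same id. The following is a minimal Python sketch of the filtering step only. The field name `users`, the sample data, and the helper `remove_nested` are assumptions for illustration, and the actual get/index calls against Elasticsearch are left out.

```python
from copy import deepcopy
from typing import Any, Callable, Dict


def remove_nested(
    source: Dict[str, Any],
    field: str,
    should_remove: Callable[[Dict[str, Any]], bool],
) -> Dict[str, Any]:
    """Return a copy of `source` with matching objects dropped from the array `field`."""
    updated = deepcopy(source)
    updated[field] = [obj for obj in updated.get(field, []) if not should_remove(obj)]
    return updated


# Hypothetical _source, as it might come back from a GET on the document.
source = {
    "title": "example",
    "users": [
        {"username": "alice", "country": "US"},
        {"username": "bob", "country": "DE"},
    ],
}

# Drop every nested user whose country is "DE". The result is the body
# that would be indexed back under the same document id.
new_source = remove_nested(source, "users", lambda u: u["country"] == "DE")
print(new_source)
```

The update API with a script that edits ctx._source can probably do the same thing on the server side, but Elasticsearch still rewrites the whole document internally in that case.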
We want a way to compare a nested key's value with a root-level field. Can we access a nested key's value in a filter script? We can get the value in a script field (via params), but we can't use params in a filter. As far as I know, only doc can be used there, and doc in a filter script can't access the nested structure. Conversely, if the script runs under the nested path, it can't access the root-level field.
You could leverage the copy_to / include_in_root parameters in order to allow _source access in a script, as I've outlined in my answer to How to iterate through a nested array in elasticsearch with filter script?
Alternatively, you could hijack a function_score query which, unlike standard filters, still has access to params._source without any mapping adjustments. Discussed in more detail in this thread.
I want to understand the cost of fetching a parent document that has nested documents in it.
Internally, nested objects index each object in the array as a separate hidden document, meaning that each nested object can be queried independently of the others...
I can't find an explanation of how a nested document relates to its parent in Elasticsearch. Does the parent document hold the nested objects' _id values, so that when we fetch the parent, Elasticsearch looks up each nested object's source by id and substitutes it for the id in the result?
The overall idea of nested objects is the following: instead of relying on ids for the join, as the parent-child approach does, it uses the logical organisation of the documents.
Each nested object is written just before the parent document:
NESTED_DOC11 NESTED_DOC12 PARENT_DOC1 NESTED_DOC21 NESTED_DOC22 PARENT_DOC2
This is a smart trick that is used all the time to query nested objects efficiently without heavy lookups by id.
However, it implies some limitations: for example, you can't update, delete, or add a nested document without reindexing the whole "block".
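The block layout described above can be illustrated with a toy model in Python. This is only a sketch of the idea, not how Lucene actually stores segments. The names `to_block`, `build_segment`, and `children_of`, and the sample documents, are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Entry = Tuple[str, Dict[str, Any]]  # (kind, payload), kind is "nested" or "parent"


def to_block(doc_id: str, source: Dict[str, Any], nested_field: str) -> List[Entry]:
    """Lay out one document as a block: nested objects first, parent last."""
    block: List[Entry] = [("nested", obj) for obj in source.get(nested_field, [])]
    block.append(("parent", {"_id": doc_id, **source}))
    return block


def build_segment(docs: Dict[str, Dict[str, Any]], nested_field: str) -> List[Entry]:
    """Concatenate the blocks of all documents into one flat list."""
    segment: List[Entry] = []
    for doc_id, source in docs.items():
        segment.extend(to_block(doc_id, source, nested_field))
    return segment


def children_of(segment: List[Entry], parent_index: int) -> List[Dict[str, Any]]:
    """Collect the nested entries that sit directly before a parent, with no id lookup."""
    children: List[Dict[str, Any]] = []
    i = parent_index - 1
    while i >= 0 and segment[i][0] == "nested":
        children.append(segment[i][1])
        i -= 1
    children.reverse()
    return children


docs = {
    "1": {"users": [{"username": "a"}, {"username": "b"}]},
    "2": {"users": [{"username": "c"}]},
}
segment = build_segment(docs, "users")
print([kind for kind, _ in segment])
print(children_of(segment, 2))
```

Because the relationship is encoded only by position, changing one nested entry in this model means rebuilding the whole block with `to_block`, which mirrors the reindexing limitation mentioned above.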
More information on this approach is there
Referring to https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html:
To update, add, or remove a nested object, we have to reindex the whole document.
In this context, does 'whole' mean:
all documents in a single type that contain the document, or
only the single document that was updated, added to, or removed from?
A nested object is just a component of a single document.
{
"users" : [
{ "username" : "pickypg", "country" : "US" }
]
}
If users is the nested array, then each object in it is a nested document. Every change to that array (add, update, or delete) causes all nested documents in the array to be rewritten with it.
On a related note: you never need the nested type unless the field is generally going to be an array (not every document has to use it as one, but at least some of them should, or it's a big waste).
Not coincidentally, the entire document is also rewritten. The real burden is that each nested document, unlike the normal, whole documents, adds extra work when the segments get written.
In Elasticsearch, is there any way to exclude the nested objects that don't match a particular query/filter from the resulting _source?
For example, let's say that a document has four objects in a nested field. Querying on the required filters results in only matching objects 1 and 3. When we get the results via _source, we will pull back the entire document along with objects 1,2,3,4.
Is it possible to exclude objects 2 and 4 from the results? Or is that something that we have to re-iterate and exclude using application-side logic?
At the moment there is no way to include only the matched nested objects in the result.
There is an inner_hits feature coming out in Elasticsearch 1.5.0 which should help with this.
You can achieve this with inner_hits, which returns only the matching nested objects. You can then exclude the nested field from _source.
Suggested by Val at:
ElasticSearch - Get only matching nested objects with All Top level fields in search response
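As a rough sketch of that approach, the Python snippet below builds a search request body that combines a nested query, inner_hits, and a _source exclusion. The helper `nested_inner_hits_query`, the path `users`, and the field `users.country` are assumptions for illustration. Sending the body to a cluster is left out.

```python
import json
from typing import Any, Dict


def nested_inner_hits_query(path: str, nested_query: Dict[str, Any]) -> Dict[str, Any]:
    """Build a search body that returns matching nested objects via inner_hits
    and leaves the full nested array out of the top-level _source."""
    return {
        "_source": {"excludes": [path]},
        "query": {
            "nested": {
                "path": path,
                "query": nested_query,
                "inner_hits": {},  # default options; named after the path
            }
        },
    }


body = nested_inner_hits_query("users", {"term": {"users.country": "US"}})
print(json.dumps(body, indent=2))
```

With a body like this, the matching nested objects should appear under each hit's inner_hits section rather than in _source, so no application-side filtering of the array is needed.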
If I have an object of class Car that has a nested object of class Engine, and both classes have a field named "id", do I have to do anything special when I create the mapping? Or is it sufficient to add the type "nested" to the engine mapping?
The Elasticsearch head GUI is showing unexpected rows, but the search seems to give the correct result, so it would be good to know whether I need to do anything else in the mapping when two or more objects have the same field name.
It seems the structured query builder returns the engine document with the id that I search for when I select car.id from the dropdown.
There shouldn't be any problem; you can just use dot notation to refer to the fields in the nested documents.
Also, if you have a single engine per car, you don't need to declare the engine as nested in your mapping.
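To make the dot notation concrete, here is a small Python sketch that builds a mapping and two query bodies. It assumes a recent typeless mapping (7.x or later) and keyword ids. The helper names `car_id_query` and `engine_id_query` are made up for illustration.

```python
from typing import Any, Dict

# Mapping for a car document: both the car and its engine have an "id" field.
# The two never clash because the engine's field is addressed as "engine.id".
mapping: Dict[str, Any] = {
    "mappings": {
        "properties": {
            "id": {"type": "keyword"},
            "engine": {
                "type": "nested",
                "properties": {"id": {"type": "keyword"}},
            },
        }
    }
}


def car_id_query(car_id: str) -> Dict[str, Any]:
    """Match on the root-level id of the car."""
    return {"query": {"term": {"id": car_id}}}


def engine_id_query(engine_id: str) -> Dict[str, Any]:
    """Match on the id of the nested engine, using dot notation inside a nested query."""
    return {
        "query": {
            "nested": {
                "path": "engine",
                "query": {"term": {"engine.id": engine_id}},
            }
        }
    }


print(car_id_query("car-1"))
print(engine_id_query("engine-7"))
```

If engine were mapped as a plain object instead of nested, the nested wrapper in `engine_id_query` would be dropped and a term query on engine.id alone would be enough.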