I want to understand the cost of fetching a parent document that contains nested documents.
Internally, nested objects index each object in the array as a separate hidden document, meaning that each nested object can be queried independently of the others...
I can't find an explanation of how a nested document relates to its parent in the ES documentation. Does the parent document hold the nested object's _id, so that when we fetch the parent, it just looks up the source of each nested object by id and substitutes that object for the id in the result?
The overall idea of nested objects is the following: instead of relying on ids for the join, as the parent-child approach does, it utilises the logical organisation of the documents.
Each nested object is written just before the parent document:
NESTED_DOC11 NESTED_DOC12 PARENT_DOC1 NESTED_DOC21 NESTED_DOC22 PARENT_DOC2
This is a smart trick, used all the time to query nested objects efficiently without heavy lookups by id.
However, this implies some limitations. For example, you can't update, delete, or add a nested document without reindexing the whole "block".
More information on this approach is there
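The block layout described above can be modeled in a few lines of Python. This is only a toy sketch, not how Lucene implements block joins: the `SEGMENT` list, the `kind`/`name` keys, and the helpers `parent_of` and `children_of` are all made up for illustration. It shows why no id lookup is needed: the parent of a nested doc is simply the next parent in document order.

```python
from typing import Dict, List

# Toy model of one index segment: nested docs sit directly before their parent.
# The "kind" and "name" keys are illustrative, not Lucene's real structures.
SEGMENT: List[Dict[str, str]] = [
    {"kind": "nested", "name": "NESTED_DOC11"},
    {"kind": "nested", "name": "NESTED_DOC12"},
    {"kind": "parent", "name": "PARENT_DOC1"},
    {"kind": "nested", "name": "NESTED_DOC21"},
    {"kind": "nested", "name": "NESTED_DOC22"},
    {"kind": "parent", "name": "PARENT_DOC2"},
]


def parent_of(segment: List[Dict[str, str]], nested_pos: int) -> str:
    """Walk forward from a nested doc to the first parent doc after it."""
    for doc in segment[nested_pos + 1:]:
        if doc["kind"] == "parent":
            return doc["name"]
    raise ValueError("nested doc has no parent after it")


def children_of(segment: List[Dict[str, str]], parent_pos: int) -> List[str]:
    """Walk backward from a parent, collecting nested docs up to the previous parent."""
    names: List[str] = []
    pos = parent_pos - 1
    while pos >= 0 and segment[pos]["kind"] == "nested":
        names.append(segment[pos]["name"])
        pos -= 1
    return list(reversed(names))
```

In this model, changing one nested doc means rewriting its whole run of entries, which mirrors the reindexing limitation mentioned above.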
I would like to index documents with a nested field. In some cases a document is going to have more than 10k elements (even 80k). I'm also going to query this index with inner_hits using the from and size parameters (I need to paginate the nested objects). My question is: is it a good approach to use a nested field when I need to paginate a list with a huge amount of data, or would it be better to denormalize the model?
Is there a way in Elasticsearch to remove some of the objects in a nested field array?
I have a nested field and it returns an array of objects. I need to remove some of the objects in the nested field.
Is it possible to do so in the query, or do I need to do that in my code?
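For reference, a paginated inner_hits request of the kind described in this question might be built like this. It is a minimal sketch that only constructs the request body as a Python dict. The helper name `paged_inner_hits_query` and the field name `items` are assumptions, and sending the body is left to whatever client you use. As far as I know, `from + size` for inner hits is capped by the `index.max_inner_result_window` index setting (default 100), so paging deep into 80k nested objects would need that limit raised or a different model.

```python
from typing import Any, Dict


def paged_inner_hits_query(path: str, page: int, page_size: int) -> Dict[str, Any]:
    """Build a search body that returns one page of nested objects via inner_hits."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    return {
        # Skip the parent _source; the nested objects come back under inner_hits.
        "_source": False,
        "query": {
            "nested": {
                "path": path,
                "query": {"match_all": {}},
                "inner_hits": {"from": page * page_size, "size": page_size},
            }
        },
    }


# Hypothetical nested field "items", third page of 50 objects.
body = paged_inner_hits_query("items", page=2, page_size=50)
```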
These extra nested documents are hidden; we can’t access them directly. To update, add, or remove a nested object, we have to reindex the whole document. It’s important to note that, the result returned by a search request is not the nested object alone; it is the whole document.
Nested Objects Elastic search
As far as I know, in Elasticsearch you can't just remove part of an existing document. You have to change the document (remove the objects you don't need) and rewrite (reindex) the whole document.
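The "change the document and rewrite it" step can be sketched in application code as follows. This is a minimal sketch, assuming you already have the document's `_source` as a dict. The helper `drop_nested` and the field name `comments` are hypothetical, and the final call that sends the updated document back to Elasticsearch is not shown.

```python
from copy import deepcopy
from typing import Any, Callable, Dict


def drop_nested(
    source: Dict[str, Any],
    field: str,
    should_drop: Callable[[Dict[str, Any]], bool],
) -> Dict[str, Any]:
    """Return a copy of the document with the unwanted nested objects removed."""
    updated = deepcopy(source)  # leave the fetched document untouched
    updated[field] = [obj for obj in updated.get(field, []) if not should_drop(obj)]
    return updated


# Hypothetical document with a nested "comments" field.
doc = {"title": "post", "comments": [{"id": 1}, {"id": 2}, {"id": 3}]}
new_doc = drop_nested(doc, "comments", lambda c: c["id"] == 2)
# new_doc is what would be indexed again under the same document id.
```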
In Elasticsearch, is there any way to exclude the nested objects that don't match a particular query/filter from the resulting _source?
For example, let's say that a document has four objects in a nested field. Querying on the required filters results in only matching objects 1 and 3. When we get the results via _source, we will pull back the entire document along with objects 1,2,3,4.
Is it possible to exclude objects 2 and 4 from the results? Or is that something that we have to re-iterate and exclude using application-side logic?
At the moment there is no way to include only the matched nested objects in the result.
There is an inner_hits feature coming out in Elasticsearch 1.5.0 which should help with this.
You can achieve this with inner_hits, which returns only the matching nested objects. You can then exclude the nested field from _source.
Suggested by Val at:
ElasticSearch - Get only matching nested objects with All Top level fields in search response
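A request combining the two ideas above (inner_hits plus excluding the nested field from _source) might look like the following. It is a sketch that only builds the request body. The helper `matching_nested_only` and the field names `comments` and `comments.author` are assumptions, and the `excludes` key is the spelling used by recent Elasticsearch versions.

```python
from typing import Any, Dict


def matching_nested_only(path: str, nested_query: Dict[str, Any]) -> Dict[str, Any]:
    """Return top-level fields in _source and only matching nested objects in inner_hits."""
    return {
        # Drop the full nested array from the returned _source.
        "_source": {"excludes": [path]},
        "query": {
            "nested": {
                "path": path,
                "query": nested_query,
                # An empty inner_hits object asks for the matching nested objects.
                "inner_hits": {},
            }
        },
    }


# Hypothetical nested field "comments" filtered by author.
body = matching_nested_only("comments", {"term": {"comments.author": "val"}})
```

Each hit would then carry the top-level fields in `_source` and the matching nested objects under its `inner_hits` section.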
In Elasticsearch, is it possible to reference a top-level (non-nested) property in a nested filter?
I have a situation where I need a condition to be true either at a global level or in one of any number of associated nested objects. Inside of the nested filter I have an or-filter to check one or the other, but the outer property appears to be ignored. An example is here.
I have a feeling that what I need is not supported and that everything inside the nested filter must apply at or below the specified path (from the docs, "The query is executed against the nested objects / docs as if they were indexed as separate docs (they are, internally)"). I'm about to just duplicate the top-level data in each nested object (it really is just a boolean field), but I'd like to know if this is possible or if there's another obvious solution I'm missing.
You are correct in that the feature you are looking for is not supported. Elasticsearch uses the various Lucene join queries such as ToParentBlockJoinQuery underneath and it does not reference both layers of properties.
You can use the include_in_parent/include_in_root properties to push the property to a higher level, but you lose the ability to filter on multiple properties belonging to the same nested document.
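The trade-off can be illustrated with a small simulation. This is a toy model in plain Python, not Elasticsearch itself: `flatten_into_parent` is a hypothetical helper that mimics what copying nested fields into the parent does, and the `parts`, `color`, and `size` fields are made up.

```python
from typing import Any, Dict, List


def flatten_into_parent(doc: Dict[str, Any], path: str) -> Dict[str, Any]:
    """Mimic include_in_parent: copy each nested field into a flat list on the parent."""
    flat: Dict[str, Any] = {k: v for k, v in doc.items() if k != path}
    for obj in doc.get(path, []):
        for key, value in obj.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat


doc = {
    "active": False,
    "parts": [{"color": "red", "size": "S"}, {"color": "blue", "size": "L"}],
}
flat = flatten_into_parent(doc, "parts")

# Per-object check (nested semantics): no single part is both red and L.
nested_match = any(p["color"] == "red" and p["size"] == "L" for p in doc["parts"])

# Flattened check: each value exists somewhere, so the document matches anyway.
flat_match = "red" in flat["parts.color"] and "L" in flat["parts.size"]
```

Here `nested_match` is false while `flat_match` is true: once the fields are flattened, the link between `color` and `size` within one object is gone.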
If I have an object of class Car that has a nested object of class Engine, where both classes have a field named "id", do I have to do anything special when I create the mapping? Or is it sufficient to add the type "nested" to the engine mapping?
Elasticsearch head GUI is showing unexpected rows, but the search seems to give the correct result so it would be good to know if I need to do anything else in the mapping if two or more objects have the same field name.
Seems like the structured query builder returns the engine document with the id that I search for when I select car.id from the dropdown.
There shouldn't be any problem; you can just use dot notation to refer to the fields in the nested documents.
Also, if you have a single engine per car you don't need to declare the engine as nested in your mapping.
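As a rough illustration of the dot notation, a mapping and query for the Car/Engine case could be written as below. This is a sketch that only builds the request bodies as dicts. It assumes a recent Elasticsearch where the `keyword` type exists, and the names `CAR_MAPPING` and `engine_id_query` are made up for the example.

```python
from typing import Any, Dict

# Both the car and its engine have an "id" field; the nested path keeps them apart.
CAR_MAPPING: Dict[str, Any] = {
    "properties": {
        "id": {"type": "keyword"},
        "engine": {
            "type": "nested",
            "properties": {"id": {"type": "keyword"}},
        },
    }
}


def engine_id_query(engine_id: str) -> Dict[str, Any]:
    """Find cars by the id of their engine, addressed as "engine.id"."""
    return {
        "query": {
            "nested": {
                "path": "engine",
                "query": {"term": {"engine.id": engine_id}},
            }
        }
    }


body = engine_id_query("e-42")
```

The car's own id is addressed as plain `id` at the top level, so the two fields do not clash.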