elasticsearch parent/children aggregation performance - elasticsearch

I am new to Elasticsearch. The Elasticsearch documentation says the following:
join datatype
The join field shouldn’t be used like joins in a relation database. In Elasticsearch the key to good performance is to de-normalize your data into documents. Each join field, has_child or has_parent query adds a significant tax to your query performance.
has_child query
Note that the has_child is a slow query compared to other queries in the query dsl due to the fact that it performs a join.
has_parent query
Note that the has_parent is a slow query compared to other queries in the query dsl due to the fact that it performs a join.
I understand that these query types are slow and should be avoided. But what about the parent and children aggregations? I cannot find any documentation or performance test results that say whether these aggregations are slow or acceptable.
I will have to test this myself, but can someone give me some advice?

Parent and children aggregations are definitely slower than other aggregations. I have tested them in my applications and found them much slower than regular (non-join) aggregations.
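If you want to measure this on your own data, here is a minimal sketch of the two request bodies you could compare. It assumes a join mapping with a child relation called "answer" and a keyword field called "owner"; both names are hypothetical, as are the aggregation names "to_children" and "top_values".

```python
import json


def children_agg_body(child_type, field, size=10):
    """Search body that buckets child documents through the children aggregation."""
    return {
        "size": 0,  # aggregation results only, no hits
        "aggs": {
            "to_children": {
                # Join step: move from the matching parents to their children.
                "children": {"type": child_type},
                "aggs": {
                    "top_values": {"terms": {"field": field, "size": size}}
                },
            }
        },
    }


def denormalized_agg_body(field, size=10):
    """Same terms aggregation on an index where the field lives on one flat document."""
    return {
        "size": 0,
        "aggs": {"top_values": {"terms": {"field": field, "size": size}}},
    }


if __name__ == "__main__":
    # "answer" and "owner" are hypothetical names used for illustration only.
    print(json.dumps(children_agg_body("answer", "owner"), indent=2))
    print(json.dumps(denormalized_agg_body("owner"), indent=2))
```

Sending each body to the _search endpoint of the corresponding index several times and comparing the "took" value in the responses should give a rough idea of the cost of the join on your data.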

Related

How does Elasticsearch/Lucene achieve such performance when querying multiple fields?

According to the answer given here, Elasticsearch doesn't seem to use compound indexes for querying multiple fields, and instead queries multiple indexes and then intersects the results.
My question is: how does it achieve such high performance? Surely a composite index is faster, since it leads you straight to the desired data, rather than querying multiple indexes, which in turn return more data, and then comparing the results?
I get the advantages of the multiple indexes, regarding the field order, etc., but in terms of performance, surely it's inferior...
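To make the "query multiple indexes and intersect" step concrete, here is a simplified sketch of intersecting two sorted lists of document IDs by skipping ahead rather than scanning every entry. It only illustrates the general idea; it is not Lucene's actual implementation, and the function name is made up for this example.

```python
from bisect import bisect_left
from typing import List


def intersect_postings(a: List[int], b: List[int]) -> List[int]:
    """Intersect two sorted doc-ID lists, jumping ahead with binary search."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Document appears in both lists: it matches both fields.
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Skip everything in a that is smaller than the candidate from b.
            i = bisect_left(a, b[j], i + 1)
        else:
            # Skip everything in b that is smaller than the candidate from a.
            j = bisect_left(b, a[i], j + 1)
    return result


if __name__ == "__main__":
    color_red = [1, 3, 5, 7, 9]   # docs matching the first field
    size_large = [3, 4, 7, 10]    # docs matching the second field
    print(intersect_postings(color_red, size_large))  # [3, 7]
```

Because both lists are sorted, each skip discards a whole run of non-matching documents, so the work tends to be driven by the shorter list rather than by the total amount of data returned.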

composite aggregation vs nested terms aggregation

Hi, I am currently using nested terms aggregations (three or more levels) to query Elasticsearch. I would rather use the composite aggregation that I just discovered, with 3+ source fields, since it is much more manageable in my opinion, but I was wondering whether this is a bad choice performance-wise. Any recommendations?
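For comparison, here is a rough sketch of how the two request bodies could be built for the same list of fields. The field names ("country", "city", "device") and the aggregation names ("by_...", "combos") are hypothetical. Note that the two are not strictly equivalent: nested terms returns the top buckets per level, while composite pages through all combinations in key order.

```python
import json


def nested_terms_body(fields, size=10):
    """Nest one terms aggregation per field, outermost field first."""
    aggs = {}
    for field in reversed(fields):
        level = {"terms": {"field": field, "size": size}}
        if aggs:
            level["aggs"] = aggs  # wrap the levels built so far
        aggs = {"by_" + field: level}
    return {"size": 0, "aggs": aggs}


def composite_body(fields, page_size=1000, after_key=None):
    """One composite aggregation with a terms source per field."""
    composite = {
        "size": page_size,
        "sources": [{field: {"terms": {"field": field}}} for field in fields],
    }
    if after_key is not None:
        # Continue from the "after_key" returned by the previous page.
        composite["after"] = after_key
    return {"size": 0, "aggs": {"combos": {"composite": composite}}}


if __name__ == "__main__":
    fields = ["country", "city", "device"]  # hypothetical keyword fields
    print(json.dumps(nested_terms_body(fields), indent=2))
    print(json.dumps(composite_body(fields, page_size=500), indent=2))
```

The composite version stays flat no matter how many fields are added, and each response carries an "after_key" that can be passed back as after_key to fetch the next page.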

Elasticsearch Search Queries Count

We have a use case for aggregating the count of Elasticsearch search queries/operations. Initially, we decided to use the /_stats endpoint to aggregate results on a per-index basis. However, we would also like to explore filtering search operations so that we can distinguish them by origin/source. I was wondering how we can do this efficiently. Any references to documentation or implementations would be highly appreciated.
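One possible approach is to tag each search request with a stats group and then read per-group counters back from the index stats. The sketch below assumes the "stats" key in the search body and a "groups" section in the search stats response (requested with something like /_stats/search?groups=web,batch); please verify both against the documentation for your version. The group names, the index name and the sample response are made up for illustration.

```python
def tag_search_body(query, origin):
    """Attach a stats group naming the origin of this search request."""
    return {"query": query, "stats": [origin]}


def query_totals_by_group(stats_response, index):
    """Extract query_total per stats group for one index from a stats response."""
    search = stats_response["indices"][index]["total"]["search"]
    return {
        name: group["query_total"]
        for name, group in search.get("groups", {}).items()
    }


if __name__ == "__main__":
    body = tag_search_body({"match_all": {}}, "web")
    print(body)

    # Hypothetical, heavily trimmed response; the numbers are invented.
    sample = {
        "indices": {
            "my-index": {
                "total": {
                    "search": {
                        "query_total": 12,
                        "groups": {
                            "web": {"query_total": 9},
                            "batch": {"query_total": 3},
                        },
                    }
                }
            }
        }
    }
    print(query_totals_by_group(sample, "my-index"))
```

This keeps the counting inside Elasticsearch, so the clients only need to add the tag to the requests they already send.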

Disable scoring in Elasticsearch and improve search performance

I want to improve the speed and performance of my search queries.
Currently, I am using filtered queries and also applying the "fuzziness" parameter on some of the fields in my index while searching. I have already kept the fields as "not_analyzed" to improve performance.
The query is equivalent to the SQL query (Select from where =).
Also, while analyzing my query in QueryProfiler, I found that the boost query consumes a certain amount of time. I am not concerned with the scoring of the records and just want to fetch the data as it is matched.
For that, I am planning to set the "norms" parameter to "false" for insertion and search operations.
What other steps should I follow to disable boosting and scoring and to speed up the search operations?
And what properties should I enable or disable to achieve this?
Thanks in advance.
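As a starting point, here is a minimal sketch of a request body that runs the match in filter context, so no relevance score is computed, and sorts by index order. It assumes a newer Elasticsearch version where the old "filtered" query has been replaced by a bool query with a filter clause; the field name "status" and the value "active" are hypothetical.

```python
import json


def unscored_search_body(field, value, fuzziness=None, size=10):
    """Build a search body whose only clause runs in filter context (no scoring)."""
    if fuzziness is None:
        clause = {"term": {field: value}}  # exact match, like WHERE field = value
    else:
        clause = {"fuzzy": {field: {"value": value, "fuzziness": fuzziness}}}
    return {
        "size": size,
        # Clauses under "filter" match documents without contributing to the score.
        "query": {"bool": {"filter": [clause]}},
        # Sorting by _doc returns hits in index order instead of by score.
        "sort": ["_doc"],
    }


if __name__ == "__main__":
    print(json.dumps(unscored_search_body("status", "active"), indent=2))
    print(json.dumps(unscored_search_body("status", "activ", fuzziness="AUTO"), indent=2))
```

Exact-match filters can also be cached by Elasticsearch, which may help when the same conditions are repeated across requests.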

Does Solr support sorting while creating the index?

In my test environment, there are nearly 130,000,000 documents on each server. Searching is fast without sorting by date, but extremely slow when sorting is enabled.
I think that if Solr could sort an indexed field while creating the index, searching would be more efficient. So, how do I configure Solr to sort some fields while indexing?
The initial query would be slower but all the subsequent queries should be fast.
Solr should be able to use the Filter Query Cache for sorting.
You can also warm the sort fields.
Also check whether the overhead really comes from sorting alone, and not from querying and scoring.
