I am facing a problem with Elasticsearch. I am using Elasticsearch 5.6.
When I search an index on some fields, I get more than 40,000 results.
I found 2 problems:
When trying to access page 1001 (results from 10,001 onward) I get an error. I understand that I can increase the default limit of 10,000; however, I can accept this limitation and expose only the first 10,000 results to the user.
When I try to sort by a specific field, the sort does not work. This is a huge problem for me, as this search is used by a client UI and I must enable paging through the results. I read about the scroll API, but it does not fit my requirements (user requests from the UI).
Do you have any idea how to solve this problem?
Thank you.
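For the first problem, a minimal sketch of how a UI backend could cap paging at the first 10,000 results is shown below. It only builds the request body; the function name, the `created_at` sort field and the page size are hypothetical, and it assumes the sort field is a keyword, numeric or date field and that `index.max_result_window` is left at its default of 10,000.

```python
MAX_RESULT_WINDOW = 10_000  # assumed: the default index.max_result_window


def build_page_request(query, page, page_size=10,
                       sort_field="created_at", order="desc"):
    """Build a from/size search body for a 1-based page number.

    Returns None when the page would reach past the result window,
    so the caller can show "no more results" instead of an error.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    # Elasticsearch rejects requests where from + size exceeds the window.
    if start + page_size > MAX_RESULT_WINDOW:
        return None
    return {
        "query": query,
        "from": start,
        "size": page_size,
        # sort_field is a placeholder; use a field that exists in the mapping.
        "sort": [{sort_field: {"order": order}}],
    }


if __name__ == "__main__":
    print(build_page_request({"match_all": {}}, page=1000))  # last allowed page
    print(build_page_request({"match_all": {}}, page=1001))  # None
```

With a page size of 10, page 1000 is the last page that fits inside the window, which matches the error seen on page 1001.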
I am curious whether there is some way to check if a document ID is part of a large (million+ results) Elasticsearch query/filter.
Essentially, I'll have a group of related document IDs and only want to return them if they are part of a larger query. I am hoping to do this on the database side. It seemed theoretically possible, since ES has to cache data related to large scrolls.
It's an interesting use case, but you need to understand that Elasticsearch (ES) doesn't return the IDs of all matching documents in the search result. By default it returns only 10 documents in the response, which can be changed with the size parameter.
If you increase the size parameter and your query matches millions of documents, query performance will be very bad, and frequently firing such queries might even bring the entire cluster down (in the absence of a circuit breaker), so be cautious about it.
You are right that ES caches data, but if you try to cache a huge amount of data that is invalidated very frequently, you will not get the expected performance benefit, so it is better to benchmark it.
You are already on the correct path by using the scroll API to iterate over millions of search results; see the points below to improve further.
First get the count of search results. This is included in the default search response, reported either as an exact value (eq) or as a lower bound (gte). It gives you an idea of how many results you have, based on which you can set the size parameter for subsequent calls to check whether your ID is present.
Check whether you are making effective use of the filter context in your query, which ES caches by default.
Benchmark some of your heavy scroll API calls with your own data.
Refer to this thread to fine-tune your cluster and index configuration and optimize ES response times further.
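As an alternative to scrolling through every hit (this is a different technique from the scroll approach above), the membership check can be pushed to ES by combining the large query with an `ids` query in filter context. The sketch below only builds the request body and reads IDs out of a response; the function names are hypothetical and the response shape is the standard `hits.hits` layout.

```python
def build_membership_query(base_query, candidate_ids):
    """Wrap base_query so only the candidate IDs that also match it come back."""
    ids = list(candidate_ids)
    return {
        "size": len(ids),      # at most one hit per candidate ID
        "_source": False,      # only the IDs are needed, not the documents
        "query": {
            "bool": {
                # Filter context: no scoring, and filters can be cached by ES.
                "filter": [
                    base_query,
                    {"ids": {"values": ids}},
                ]
            }
        },
    }


def matching_ids(response):
    """Extract the set of document IDs from a search response."""
    return {hit["_id"] for hit in response["hits"]["hits"]}


if __name__ == "__main__":
    body = build_membership_query({"term": {"status": "active"}}, ["a1", "b2"])
    print(body)
```

The body would be sent to the index's `_search` endpoint; any candidate ID missing from `matching_ids(response)` is not part of the larger query.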
I have 1000 indexes (not indexed documents, but indexes) in my Elasticsearch cluster.
How do I paginate them to show 20 indexes per page? I obviously can't show all 1000 on one page. I have searched, but could not find any information on this case.
Also, it does not let me pass from or limit as a parameter when requesting info on all indexes.
There's no pagination option for listing indices in Elasticsearch.
The from/size parameters apply only to the search API.
I'm not sure what you're trying to do, but if you're showing them in a client-facing GUI then you're going to have to implement the pagination logic yourself (which honestly should not be hard).
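A minimal sketch of such client-side pagination is below. It assumes the full list of index names has already been fetched (for example from `GET _cat/indices?h=index`) and is held in memory; the function name and the sample index names are hypothetical.

```python
import math


def paginate(names, page, per_page=20):
    """Return (names on the requested 1-based page, total number of pages)."""
    ordered = sorted(names)  # stable ordering so pages do not shuffle between requests
    total_pages = max(1, math.ceil(len(ordered) / per_page))
    if not 1 <= page <= total_pages:
        raise ValueError(f"page must be between 1 and {total_pages}")
    start = (page - 1) * per_page
    return ordered[start:start + per_page], total_pages


if __name__ == "__main__":
    all_names = [f"index-{i:04d}" for i in range(1000)]  # stand-in for real names
    page_items, total = paginate(all_names, page=3)
    print(total, page_items[0], page_items[-1])
```

With 1000 names and 20 per page this gives 50 pages, and the list only needs to be re-fetched when indices are created or deleted.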
Firstly, I do not like asking questions without code, but I could not find any solution and need help with requesting large amounts of log data from Kibana and Elasticsearch.
I am trying to get 10,000 documents from Kibana in the Discover panel, but I get an error. The error says that my 10,000 documents are 5 GB in size and only 2 GB is allowed. I searched for ways to split the data, but I could not do that in Kibana.
I also tried
_msearch
but it is not what I am looking for.
1- Can you tell me how I can scroll in Kibana Discover (if it is possible)?
2- How can I get more than 2 GB of data?
If you can give me examples or links to resources, I would be very grateful.
Why do you want to search more than 10k docs in the Discover module?
If I understand correctly, you want to download more than 2 GB of data from Kibana, right? Put this in kibana.yml:
xpack.reporting.csv.maxSizeBytes: 50971520
I'm using Elasticsearch to search more than 10 million records; most records contain 1 to 25 words. I want to retrieve data from it, but the method I'm using now is drastically slow for large retrievals, as I'm reading the data from the source field. I want a method that makes this process faster. I'm free to use another database or anything else alongside Elasticsearch. Can anyone suggest some good ideas and examples for this?
I've searched for a solution on Google, and one solution I found was pagination. I've already applied it wherever possible, but pagination is not an option when I want to retrieve many (5000+) hits in one query.
Thanks in advance.
Try using scroll
While a search request returns a single “page” of results, the scroll
API can be used to retrieve large numbers of results (or even all
results) from a single search request, in much the same way as you
would use a cursor on a traditional database.
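A minimal sketch of the scroll loop is shown below. To stay independent of any particular client library, it takes a caller-supplied `post(path, body)` function (hypothetical) that sends a POST request to the cluster and returns the parsed JSON response; the index name, page size and keep-alive are illustrative.

```python
def scroll_all(post, index, query, page_size=1000, keep_alive="1m"):
    """Yield every hit for a query by following the scroll cursor.

    post(path, body) is an assumed helper supplied by the caller: it must
    send a POST to Elasticsearch and return the decoded JSON response.
    """
    # The first request opens the scroll context and returns the first page.
    response = post(f"/{index}/_search?scroll={keep_alive}",
                    {"size": page_size, "query": query})
    while True:
        hits = response["hits"]["hits"]
        if not hits:
            break  # an empty page means the scroll is exhausted
        yield from hits
        # Always pass the most recent scroll ID to fetch the next page.
        response = post("/_search/scroll",
                        {"scroll": keep_alive,
                         "scroll_id": response["_scroll_id"]})


if __name__ == "__main__":
    # Tiny in-memory stand-in for a cluster, for demonstration only.
    pages = iter([
        {"_scroll_id": "s1", "hits": {"hits": [{"_id": "1"}]}},
        {"_scroll_id": "s2", "hits": {"hits": []}},
    ])
    print(list(scroll_all(lambda path, body: next(pages), "my-index",
                          {"match_all": {}})))
```

Once iteration finishes, the scroll context can be released early with a `DELETE /_search/scroll` request rather than waiting for the keep-alive to expire.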
I am using Elasticsearch for the project I'm working on, and I was wondering if there is a way to narrow the results I get from an indices stats request.
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-stats.html
I currently use the docs metric to narrow the data I get back about the indices, but now I want to get back only those with a doc count greater than 0. Does anyone know if this is possible, and if so, how?
Thanks!
This is for Elasticsearch 1.5.2.
If you're concerned about the size of the response (i.e. if you have many indices with many shards), the best you can do is use response filtering (available only since ES 1.7) and retrieve only the docs field, which you can then filter further on the client side:
curl 'localhost:9200/_stats/docs?pretty&filter_path=**.docs.count'
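The client-side step could look like the sketch below. It assumes the filtered response has already been parsed into a dictionary and has the usual `indices.<name>.primaries.docs.count` layout of the indices stats API; the function name and the sample index names are hypothetical.

```python
def non_empty_indices(stats):
    """Return {index name: primary doc count} for indices with at least one doc."""
    result = {}
    for name, data in stats.get("indices", {}).items():
        # Use the primaries count so replicas are not counted twice.
        count = data.get("primaries", {}).get("docs", {}).get("count", 0)
        if count > 0:
            result[name] = count
    return result


if __name__ == "__main__":
    sample = {
        "indices": {
            "logs-a": {"primaries": {"docs": {"count": 12}}},
            "logs-b": {"primaries": {"docs": {"count": 0}}},
        }
    }
    print(non_empty_indices(sample))
```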