How to paginate ElasticSearch index names? - elasticsearch

I have 1000 indexes (not documents in an index, but indexes) in my Elasticsearch cluster.
How do I paginate them to show 20 indexes per page? I obviously can't show all 1000 on one page. I have searched but could not find any information on this case.
Also, the API does not let me pass from or a limit as a parameter when requesting info on all indexes.

There's no pagination option for listing indices in Elasticsearch.
The from/size parameters apply only to the search API.
I'm not sure what you're trying to do, but if you're showing them in a client-facing GUI, you'll have to implement the pagination logic yourself (which honestly should not be hard).
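To illustrate the "implement it yourself" part, here is a minimal sketch in Python. It assumes the full list of index names has already been fetched once (for example from GET /_cat/indices?h=index&format=json) and is held as a plain list; the function name and the sample data are made up for illustration.

```python
import math


def paginate_index_names(names, page, per_page=20):
    """Return (items_on_page, total_pages) for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    # Sort so that page contents stay stable between requests.
    ordered = sorted(names)
    total_pages = math.ceil(len(ordered) / per_page)
    start = (page - 1) * per_page
    return ordered[start:start + per_page], total_pages


# Stand-in for names fetched from the cluster (hypothetical data).
all_names = [f"logs-{i:04d}" for i in range(1000)]
first_page, total_pages = paginate_index_names(all_names, page=1)
```

Asking for a page past the end simply returns an empty list, so the GUI can decide how to handle it.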

Related

Filter result in memory to search in elasticsearch from multiple indexes

I have 2 indexes, and they share one common field (basically a relationship).
Since Elasticsearch does not seem to offer filtering across multiple indexes, should we hold the results in memory in a variable and filter them in node.js (which basically means that my application itself is now working as a database server)?
We previously used MongoDB, which is also a NoSQL DB, and we were able to handle this through aggregate queries, but it seems Elasticsearch does not provide that.
So even if we use both databases together, we have to store their results somewhere to filter the data further, because we give users advanced search functionality that lets them filter data from multiple collections.
So should we store results in memory to filter data further? We currently offer advanced search over 100 million records to customers, but without the advanced text search that Elasticsearch provides; now we are planning to offer Elasticsearch text search to customers.
What approach do you suggest for using MongoDB and Elasticsearch together? We are using node.js to serve data.
Or which of these options should we choose?
Denormalizing: Flatten your data
Application-side joins: Run multiple queries on normalized data
Nested objects: Store arrays of objects
Parent-child relationships: Store multiple documents through joins
https://blog.mimacom.com/parent-child-elasticsearch/
https://spoon-elastic.com/all-elastic-search-post/simple-elastic-usage/denormalize-index-elasticsearch/
Storing things client-side in memory is not the solution.
First of all, the simplest way to solve this problem is to make one combined index. It's trivial to do: just insert all the documents from index 2 into index 1, and prefix all fields coming from index 2 with something like "idx2". That way you won't overwrite any similarly named fields. You can use an ingest pipeline to do this, or just do it client-side. You will only ever do this once.
After that you can perform aggregations on the single index, since all the data is in one index.
If you are using something other than ES as your primary data store, you need to reconfigure the indexing operation so that everything that previously went into index 2 goes into index 1 as well (with the prefixed fields).
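As a rough sketch of the client-side variant of the prefixing step, something like the following could be applied to each index-2 document before it is written to the combined index. The function name, the "idx2_" separator, the sample field names, and the optional keep parameter (for leaving the shared relationship field unprefixed) are all assumptions for illustration, not anything prescribed by ES.

```python
def prefix_fields(doc, prefix="idx2_", keep=()):
    """Return a copy of doc with top-level fields renamed to prefix + name.

    Fields listed in keep are left unchanged (assumption: useful for the
    common relationship field shared by both indexes).
    """
    return {
        (name if name in keep else prefix + name): value
        for name, value in doc.items()
    }


# Hypothetical index-2 document sharing "user_id" with index 1.
source_doc = {"user_id": 7, "status": "active"}
combined_doc = prefix_fields(source_doc, keep=("user_id",))
```

This only renames top-level fields; nested objects keep their inner field names, which is usually enough to avoid collisions.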
100 million records is trivial for something like Elasticsearch. Doing any kind of "joins" client-side is NOT RECOMMENDED, as this defeats the entire value of using ES.
If you need any further help executing this, feel free to contact me. I have 11 years of experience with ES, and I have seen people struggle with "joins" 99% of the time. :)
The first thing to do when coming from MySQL/PostgreSQL or even MongoDB is to restructure the indices to suit your querying needs. Never try to work across multiple indices; ES is not built for that.
HTH.

How to get most retrieved items from an elasticsearch index?

I have uploaded some data to Elasticsearch and would like to keep track of how many times a data point is returned by past searches, that is to say, the most popular searched items.
Does Elasticsearch provide such functionality, so that I can achieve this without implementing and updating a counter myself?
Cheers.

How to get all the index patterns which never had any documents?

For Kibana server decommissioning purposes, I want to get a list of index patterns that never had a single document and a list of those that had documents.
How can I achieve this using Kibana only?
I tried this, but it doesn't give the list based on the document count.
GET /_cat/indices
Also, getting the count for each index pattern individually to check whether documents exist is time-consuming.
GET index-pattern*/_count
You can try this. v is for verbose and s stands for sort.
GET /_cat/indices?v&s=store.size:desc
From the docs:
These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents.
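If the cat output is requested as JSON (GET /_cat/indices?format=json), the rows can be split by document count outside Kibana. The sketch below is Python and assumes each row carries the "index" and "docs.count" columns as strings; the function name and sample rows are made up. Note that this only reflects the current count, so it cannot tell whether an index held documents at some point in the past.

```python
def split_by_doc_count(rows):
    """Split cat-indices rows into (empty, non_empty) lists of index names."""
    empty, non_empty = [], []
    for row in rows:
        # Assumption: a missing count (e.g. a closed index) is treated as 0.
        count = int(row.get("docs.count") or 0)
        (empty if count == 0 else non_empty).append(row["index"])
    return sorted(empty), sorted(non_empty)


# Hypothetical rows shaped like the JSON cat output.
rows = [
    {"index": "app-logs-2021", "docs.count": "0"},
    {"index": "app-logs-2022", "docs.count": "1532"},
    {"index": "archived", "docs.count": None},
]
empty, non_empty = split_by_doc_count(rows)
```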

ElasticSearch | Efficient Pagination with more than 10k documents

I have a microservice with Elasticsearch as a backend store. I have multiple indices with hundreds of thousands of documents in them.
Now I need to expose GET APIs for those indices, for example GET /employees/get.
I have gone through ES pagination using scroll and search_after, but both require meta information (a scroll_id or a search_after key) to paginate.
My concern is that my microservice shouldn't expose these scroll_ids or search_after keys. With the current approach I can list up to 10k docs but nothing beyond that, and I don't want users of the microservice to know anything about the backend store. So how can I achieve this in Elasticsearch?
I have the following approaches in mind:
Store the scroll_id in memory and retrieve results based on it for subsequent queries. The GET query will be as below:
GET /employees/get?page=1 By default each page will have 10k documents.
Implement the scroll API internally behind the GET API and return all matching documents to users. But this increases latency and memory usage, because at times I may end up returning 100k docs to the user in a single call.
Expose the GET API with a search string. By default, return 10k documents, and refine the results further with the search string as explained:
Let's say GET /employees/get returns 10k documents and accepts a query_string to narrow the 10k, like auto-suggestion using n-grams. Then we show the most relevant 10k docs every time. I know this is not actual pagination, but it solves the problem in a hacky way. This is my Plan B.
Edited:
This is my use case:
Return the list of employees of a company. There are more than 100k employees, so I have to return the results in pages: GET /employees/get?from=0&size=1000 and GET /employees/get?from=1000&size=1000
But once from + size exceeds 10k, ES rejects the query.
Please suggest the ideal way to implement pagination in a microservice with ES as the backend store, without letting users know about the internals of ES.
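One possible direction, sketched here in Python under stated assumptions rather than as a definitive answer, is to wrap the search_after values in an opaque page token so that callers only ever see a string they pass back for the next page. The helper names and the "employee_id" sort field are hypothetical; the sort field is assumed to be unique per document so that it can act as a tiebreaker.

```python
import base64
import json


def encode_cursor(sort_values):
    """Pack the last hit's sort values into an opaque URL-safe token."""
    raw = json.dumps(sort_values, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(token):
    """Recover the sort values from a token produced by encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


def build_search_body(page_size, token=None, sort_field="employee_id"):
    """Build a search request body; sort_field is a hypothetical unique field."""
    body = {
        "size": page_size,
        "query": {"match_all": {}},
        "sort": [{sort_field: "asc"}],
    }
    if token:
        body["search_after"] = decode_cursor(token)
    return body


def next_token(hits):
    """Token for the following page, or None when the page came back empty."""
    return encode_cursor(hits[-1]["sort"]) if hits else None
```

The service would return next_token(hits) alongside each page and accept it back as a query parameter. A base64 token hides the ES detail from casual view but is not tamper-proof; signing it would be a separate step.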

retrieve sorted search results from elasticsearch

I am facing a problem with Elasticsearch. I am using Elasticsearch 5.6.
When I search an index on some fields, I get more than 40,000 results.
I found 2 problems:
When trying to access page 1001 (result 10,001), I get an error. I understand I can increase the default limit of 10,000; however, I can accept this limitation and expose only the first 10,000 results to the user.
When I try to sort by a specific field, the sort does not work. This is a huge problem for me, as this search is used by a client UI and I must enable paging through the results. I read about the scroll API, but it does not fit my requirements (user requests from the UI).
Do you have any idea how to solve this problem?
Thank you.
