Pagination in a REST API using Spring Boot and GBQ

I have an Experience API endpoint that returns results according to a given limit and offset. My REST API calls BigQuery to get the data set. Because the data set has more than 500 records, I need to store the product IDs in a cache so that I can reuse them for every pagination request with a given limit and offset. Is there a better alternative for this kind of scenario?
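
The caching approach described in the question can be sketched roughly as below. This is only an illustration under stated assumptions: the class name `ProductIdPageCache`, the `loader` function (which stands in for the BigQuery call) and the string query key are all hypothetical, and a real service would likely also need expiry and a size bound on the cache.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: loads the full ID list once per query key,
// then serves every limit/offset request from the cached list.
class ProductIdPageCache {
    private final Map<String, List<String>> idsByQuery = new ConcurrentHashMap<>();
    // Assumption: stands in for the call that runs the BigQuery query.
    // It must not return null.
    private final Function<String, List<String>> loader;

    ProductIdPageCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    List<String> page(String queryKey, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be >= 0");
        }
        // The loader runs only on the first request for this key.
        List<String> ids = idsByQuery.computeIfAbsent(queryKey, loader);
        // Clamp both ends so an offset past the end yields an empty page.
        int from = Math.min(offset, ids.size());
        int to = (int) Math.min((long) from + limit, ids.size());
        return new ArrayList<>(ids.subList(from, to));
    }
}
```

With this shape, a request for offset 0 and limit 50 and a later request for offset 50 and limit 50 both read from the same cached list, so the underlying query runs only once per key.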

Related

Spring Boot Suggestions Endpoint Optimization

I have a microservice with a REST endpoint that returns suggested item names. The endpoint is invoked for every character entered in the text field (minimum 3 characters).
Internally, the endpoint calls Elasticsearch, which fetches the items based on the query string.
APM monitoring shows that when the RPM (requests per minute) increases, there is also a huge spike in heap consumption.
Any idea how I could optimize this endpoint?

Elasticsearch search api to get total hit count?

I have a use case:
I need to use the _search API to fetch a large number of records in a paginated way.
At the same time, I want to get the total hit count in the same _search API call.
Example:
The page size is 50, that is, I want to fetch results in batches of 50. At the same time, I want each search call to return the total hit count, let's say 5000.
I have 2 questions:
Is this possible? Can I get the total hit count as part of the result of a _search API call?
Would the total hit count be affected by the pagination?
You can get the total hit count from the search API by adding the track_total_hits=true option.
GET localhost:9200/_search?pretty&track_total_hits=true
If you paginate with from=X&size=50, then yes, the number of matching documents can change between pages, because newly indexed documents become searchable after each refresh. You can increase the refresh interval to reduce this. Another solution is the point in time (PIT) API, which keeps a consistent view of the index across requests.
https://www.elastic.co/guide/en/elasticsearch/reference/current/point-in-time-api.html
Also note that from/size pagination has a limit: by default, from + size cannot exceed 10,000 (the index.max_result_window setting). You can raise this limit or use the scroll API instead.
The total is reported in the search response under hits -> total.

ElasticSearch | Efficient Pagination with more than 10k documents

I have a microservice with Elasticsearch as its backend store. I have multiple indices with hundreds of thousands of documents in them.
Now I need to expose GET APIs for those indices, for example GET /employees/get.
I have looked at ES pagination using scroll and search_after, but both require meta information (the scroll_id or the search_after key) to paginate.
My concern is that the microservice shouldn't expose the scroll_id or search_after values. With the current approach, I can list up to 10k docs but nothing after that. I also don't want users of the microservice to know anything about the backend store. So how can I achieve this with Elasticsearch?
I have the following approaches in mind:
1. Store the scroll_id in memory and use it to retrieve the results for subsequent queries. The GET query would look like this:
GET /employees/get?page=1 By default, each page will have 10k documents.
2. Implement the scroll API internally behind the GET API and return all matching documents to the user. But this increases latency and memory use, because at times I may end up returning 100k docs to the user in a single call.
3. Expose the GET API with a search string. By default, return 10k documents, and refine the results further with the search string, as explained below:
Let's say GET /employees/get returns 10k documents and accepts a query_string to narrow those 10k down, like auto-suggestion using n-grams. Then we show the 10k most relevant docs every time. I know this is not actual pagination, but it solves the problem in a hacky way. This is my Plan B.
Edited:
This is my use case:
Return the list of employees of a company. There are more than 100k employees, so I have to return the results in pages: GET /employees/get?from=0&size=1000 and GET /employees/get?from=1000&size=1000
But once from + size goes beyond 10k, ES rejects the query.
Please suggest the ideal way to implement pagination in a microservice with ES as the backend store, without letting the user know about the internals of ES.
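
One way to keep search_after internal is to wrap the last hit's sort values in an opaque token that the client simply echoes back on the next request. The sketch below shows only that wrapping step, under assumptions: the class name `OpaqueCursor` and the idea of a `cursor` query parameter are hypothetical, and the sort values are assumed to be already serialized as a JSON array string. Note that Base64 only obscures the value; it does not hide it from a determined client, so a service that must truly conceal the contents would need to sign or encrypt the token as well.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: turns search_after sort values into a
// URL-safe token and back, so the API exposes only "cursor".
class OpaqueCursor {

    // sortValuesJson is assumed to be the last hit's sort values
    // serialized as a JSON array, e.g. ["Smith","emp-1042"].
    static String encode(String sortValuesJson) {
        return Base64.getUrlEncoder()
                .withoutPadding()
                .encodeToString(sortValuesJson.getBytes(StandardCharsets.UTF_8));
    }

    // Returns the original JSON array string; throws
    // IllegalArgumentException if the token is not valid Base64.
    static String decode(String cursor) {
        byte[] raw = Base64.getUrlDecoder().decode(cursor);
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

A request could then look like GET /employees/get?size=1000&cursor=TOKEN, where TOKEN comes from the previous response. The service decodes it and passes the values to ES as search_after, together with the same sort order it used for the previous page.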

Using elastic search for a UI dashboard behind a proxy

I am working on a search dashboard with full-text search capabilities, backed by ES. The search would initially be consumed by a UI dashboard. I am planning to have an application web service (WS) API layer between the UI dashboard and ES, which will route the business search to ES.
There can be multiple clients of the WS going forward, each with its own business use cases and complex data requirements (basically response fields). There are many entities and a huge number of fields across them. Each client would need to specify which entities it wants returned and with which fields.
To support this dynamically changing requirement, one approach could be to make the WS a pass-through to ES (with pre-validations such as access control, and post-transformations of the ES response). The WS APIs would look exactly like the ES APIs; the UI would build ES queries through a JS client and send them to the WS, which would apply access control and then get the data from ES.
I am new to ES and skeptical of this approach. Are there any particular challenges with it? One of my colleagues has worked with ES before, but always with a backend Java client, so he's not too sure.
I looked up an ES JS client and there's an official one here.
Some Context here:
We have around 4 different entities (this can increase in future) with both full-text and keyword type fields. A typical search could have multiple filters and search terms and would want to specify the result fields. Also, some searches would be across entities and some against individual ones. We are maintaining a separate index for each entity.
From what I understand of your post, this is what you want to achieve at a high level:
"There can be multiple clients to WS going forward, each with its own business use cases, and complex data requirements (basically response fields)"
Since you are not sure how to do this, you are thinking of building the Elasticsearch queries in JavaScript in your front end. I am not a big fan of this approach, because it exposes how you build your queries. An attacker who learns crucial information like the following can bring your entire ES cluster to its knees:
What types of wildcard queries you use.
Index names and ES cluster details (you may have access control, but you are still exposing crucial information).
How you build your search queries.
These are just a few examples; I will add more information.
Right approach
You already have a backend where you check access, so build the Elasticsearch queries there. You also have the advantage of a teammate who knows how to do that.
For building complex response fields, you can use source filtering, which lets you specify in the search request exactly which fields you want returned in the search results.
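
As a rough sketch of how the backend could combine access control with source filtering, the helper below narrows a client's requested fields to a per-client allow-list before the query is built. The class name `SourceFieldFilter`, the client IDs and the field names are all hypothetical; only the idea of passing the resulting list as the `_source` includes of the search request comes from the answer above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: keeps only the requested fields that the
// given client is allowed to see, preserving request order.
class SourceFieldFilter {
    private final Map<String, Set<String>> allowedByClient;

    SourceFieldFilter(Map<String, Set<String>> allowedByClient) {
        this.allowedByClient = allowedByClient;
    }

    List<String> resolve(String clientId, List<String> requestedFields) {
        // Unknown clients get an empty allow-list, so nothing passes.
        Set<String> allowed =
                allowedByClient.getOrDefault(clientId, Collections.emptySet());
        List<String> includes = new ArrayList<>();
        for (String field : requestedFields) {
            // Drop disallowed fields and duplicates.
            if (allowed.contains(field) && !includes.contains(field)) {
                includes.add(field);
            }
        }
        return includes;
    }
}
```

The returned list would then be used as the `_source` includes when the backend builds the search request, so the UI never sends raw ES queries and cannot ask for fields outside its allow-list.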

Elastic Search High Level Client - How to search with Post Request?

I have used the Elasticsearch High Level REST Client to search an index and process the results, using the following code:
restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
However, the REST client uses "GET" to query the data, and I want to send this as a POST request to Elasticsearch. Any help on this would be highly appreciated.
After discussion (see comments), it turned out there was no need to force the High Level REST Client to use POST instead of GET, because behind the scenes the client sends the search as a GET with a request body.
