p:datatable paging all entries elastic search index - elasticsearch

I want to use a pageable PrimeFaces datatable to display all entries in an Elasticsearch index.
Unfortunately, the following code returns only 10 entries.
public ArrayList<Video> getAllVideos(String indexName) throws IOException{
makeConnection();
SearchRequest searchRequest = new SearchRequest(indexName);
Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));
searchRequest.scroll(scroll);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.matchAllQuery());
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
return buildResponse(searchResponse, restHighLevelClient);
}
Does anybody have a working example for getting all records? Thanks

Related

Setting default operator to AND in elasticsearch when using SearchSourceBuilder and XContentParser

I have the sample code below to query Elasticsearch. It receives the user query and uses XContentParser and SearchSourceBuilder. I don't want to change the existing code much, as it has been in production for a while. I would like to change the default operator from "OR" to "AND" so that we get more meaningful results. I couldn't find a way to set the default operator in this scenario. Any help is appreciated.
RestHighLevelClient esClient;
String[] indicesArray={"index1","index2"};
String fullQueryJson = "{\"query\":{\"nested\":{\"path\":\"_ROOT\",\"query\":{\"query_string\":{\"query\":\"_ROOT.LATEST.FULL_NAME:John Mary C\"}}}},\"_source\":[\"FULL_NAME\",\"TEXT_MSG\",\"USER_NAME\"],\"size\":9010}";
SearchSourceBuilder accessSearchSource = new SearchSourceBuilder();
final SearchModule searchModule = new SearchModule(Settings.EMPTY, false, Collections.emptyList());
final XContentParser fullQueryJsonParser = XContentFactory.xContent(XContentType.JSON)
.createParser(new NamedXContentRegistry(searchModule.getNamedXContents()), LoggingDeprecationHandler.INSTANCE, fullQueryJson);
accessSearchSource.parseXContent(fullQueryJsonParser);
SearchRequest request=new SearchRequest(indicesArray);
request.source(accessSearchSource);
esClient.search(request);
The operator is set on the query builder that you pass to the SearchSourceBuilder.
Example:
SearchRequest firstSearchRequest = new SearchRequest(type.toLowerCase());
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// The operator belongs to the match query; swap in Operator.AND to require every term.
searchSourceBuilder.query(QueryBuilders.matchQuery("user", claim).operator(Operator.OR));
firstSearchRequest.source(searchSourceBuilder);
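To make the difference between the two operators concrete, here is a small plain-Java analogy. It is only a sketch: OperatorSketch and its methods are hypothetical names, the tokenizer is a naive whitespace split, and none of this is Elasticsearch code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Plain-Java analogy of OR vs AND term matching; not Elasticsearch code. */
final class OperatorSketch {

    /** Lower-cases the text and splits it on whitespace (a naive stand-in for an analyzer). */
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
    }

    /** OR semantics: at least one query term must occur in the document. */
    static boolean matchesOr(String query, String doc) {
        Set<String> docTokens = tokens(doc);
        return tokens(query).stream().anyMatch(docTokens::contains);
    }

    /** AND semantics: every query term must occur in the document. */
    static boolean matchesAnd(String query, String doc) {
        return tokens(doc).containsAll(tokens(query));
    }

    public static void main(String[] args) {
        String query = "John Mary C";
        System.out.println(matchesOr(query, "John Smith"));   // true: "john" is present
        System.out.println(matchesAnd(query, "John Smith"));  // false: "mary" and "c" are missing
        System.out.println(matchesAnd(query, "Mary C John")); // true: all three terms are present
    }
}
```

For the query_string query in the question's JSON, the query DSL also accepts a default_operator parameter, so adding "default_operator":"AND" inside the query_string object may be the least invasive change.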

Elasticsearch Search Scroll API doesn't retrieve all the documents from an index

I would like to retrieve all the documents from Elasticsearch, so I turned to the Search Scroll API.
The problem is that it does not return all the documents: one index holds 36 documents, but only 26 are returned.
I checked another index with more than 10k documents, and there too the last 10 documents are missing.
I don't understand why this happens. Any help will be appreciated. Thanks in advance!
Below is the code I've tried:
final Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));
SearchRequest searchRequest = new SearchRequest("myindex");
searchRequest.scroll(scroll);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchAllQuery()); // placeholder only; substitute the actual query here
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
String scrollId = searchResponse.getScrollId();
SearchHit[] searchHits = searchResponse.getHits().getHits();
while (searchHits != null && searchHits.length > 0) {
SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
scrollRequest.scroll(scroll);
searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
scrollId = searchResponse.getScrollId();
searchHits = searchResponse.getHits().getHits();
for (SearchHit hit : searchHits) {
String source=hit.getSourceAsString();
}
}
ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
clearScrollRequest.addScrollId(scrollId);
ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
boolean succeeded = clearScrollResponse.isSucceeded();
I ran into the same problem today while working with the example from the Elastic Scroll API documentation.
First, about the missing documents: 10 is the default value of size for a search request, so losing exactly 10 documents suggests that one batch was never processed.
In your code, the first batch of 10 documents is never handled:
SearchHit[] searchHits = searchResponse.getHits().getHits();
You should iterate over searchHits before the while loop fetches the next batch.
This was not clear to me from the official documentation at first.
You should change your while loop logic to execute the hit iteration first and the scroll after.
while (searchHits != null && searchHits.length > 0) {
// Execute this block first; otherwise the scroll call overwrites the initial hits.
for (SearchHit hit : searchHits) {
String source=hit.getSourceAsString();
}
SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
scrollRequest.scroll(scroll);
searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
scrollId = searchResponse.getScrollId();
searchHits = searchResponse.getHits().getHits();
}
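The effect of the loop order can be reproduced without a cluster. The sketch below is an assumption-laden simulation: FakeScroll is a hypothetical in-memory stand-in for successive scroll responses, using the 36 documents and batch size of 10 from the question.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulates scroll-style batching in memory; FakeScroll is a hypothetical stand-in, not a client API. */
final class ScrollOrderSketch {

    /** Hands out consecutive batches, like successive scroll responses. */
    static final class FakeScroll {
        private final List<Integer> docs = new ArrayList<>();
        private final int size;
        private int position = 0;

        FakeScroll(int totalDocs, int size) {
            for (int i = 0; i < totalDocs; i++) docs.add(i);
            this.size = size;
        }

        /** Returns the next batch, or an empty list once everything has been handed out. */
        List<Integer> nextBatch() {
            int end = Math.min(position + size, docs.size());
            List<Integer> batch = new ArrayList<>(docs.subList(position, end));
            position = end;
            return batch;
        }
    }

    /** Order from the question: fetch the next batch, then read it. The first batch is never read. */
    static int scrollThenProcess(FakeScroll scroll) {
        int processed = 0;
        List<Integer> hits = scroll.nextBatch(); // plays the role of the initial search response
        while (!hits.isEmpty()) {
            hits = scroll.nextBatch();           // overwrites hits that were not read yet
            processed += hits.size();
        }
        return processed;
    }

    /** Corrected order: read the current batch, then fetch the next one. */
    static int processThenScroll(FakeScroll scroll) {
        int processed = 0;
        List<Integer> hits = scroll.nextBatch();
        while (!hits.isEmpty()) {
            processed += hits.size();
            hits = scroll.nextBatch();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(scrollThenProcess(new FakeScroll(36, 10))); // 26
        System.out.println(processThenScroll(new FakeScroll(36, 10))); // 36
    }
}
```

With 36 documents the first ordering reads 26, which matches the count reported in the question.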
Another thing to consider is that you can increase the number of hits returned per response. From the docs:
The index.max_result_window which defaults to 10,000 is a safeguard, search requests take heap memory and time proportional to from + size.
Since max_result_window defaults to 10,000 hits (and can be changed in the index settings), you can fetch up to 10,000 hits in one search call instead of paging through many small batches.
To do this, set the size property on searchSourceBuilder before executing the search call:
searchSourceBuilder.size(10000);
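As a rough worked example of what a larger size saves, the sketch below counts the requests issued by a read-then-scroll loop. It is plain arithmetic under that assumption; ScrollRequestCount and requestsNeeded are hypothetical names.

```java
/** Counts requests made by a read-then-scroll loop; plain arithmetic, no Elasticsearch calls. */
final class ScrollRequestCount {

    /**
     * One request per batch that contains hits, plus the final request
     * that returns no hits and ends the loop.
     */
    static long requestsNeeded(long totalDocs, int size) {
        if (totalDocs < 0 || size <= 0) {
            throw new IllegalArgumentException("totalDocs must be >= 0 and size must be > 0");
        }
        long batchesWithHits = (totalDocs + size - 1) / size; // ceiling division
        return batchesWithHits + 1;
    }

    public static void main(String[] args) {
        System.out.println(requestsNeeded(36, 10));         // 5
        System.out.println(requestsNeeded(36, 10_000));     // 2
        System.out.println(requestsNeeded(10_000, 10));     // 1001
        System.out.println(requestsNeeded(10_000, 10_000)); // 2
    }
}
```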

How to use SearchAfter API correctly?

I recently started working with Elasticsearch and have a question. I have a million documents in an index and want to retrieve more than 10,000 of them. For this I can use the scroll API or the SearchAfter API. I understand how the scroll API works, but I have a problem with SearchAfter.
Here is my SearchSourceBuilder method:
public SearchRequest buildRequest(SearchDistanceParameters args) {
final SearchSourceBuilder searchSourceBuilder = prepareSearchSourceBuilder(args);
final SearchRequest searchRequest = new SearchRequest();
return searchRequest.source(searchSourceBuilder);
}
private SearchSourceBuilder prepareSearchSourceBuilder(SearchDistanceParameters searchDistanceParameters) {
final FieldSortBuilder fieldSortBuilder = new FieldSortBuilder("_id").order(SortOrder.ASC);
final SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
final GeoDistanceQueryBuilder geoDistanceQueryBuilder = geoDistanceQuery(GeoLocationModelFieldName.LOCATION.name().toLowerCase());
geoDistanceQueryBuilder.point(searchDistanceParameters.getLatitude(), searchDistanceParameters.getLongitude());
geoDistanceQueryBuilder.distance(searchDistanceParameters.getDistance(), DistanceUnit.KILOMETERS);
searchSourceBuilder.query(geoDistanceQueryBuilder);
searchSourceBuilder.sort(fieldSortBuilder);
searchSourceBuilder.searchAfter();
return searchSourceBuilder;
}
Here I sort before calling searchAfter(), as mentioned in the SearchAfter API docs.
Here I send my request to Elasticsearch:
public SearchResponse sendRequestToElastic(SearchDistanceParameters args) throws IOException {
SearchRequest searchRequest = searchByDistanceRequestBuilder.buildRequest(args);
return elasticDao.search(searchRequest, RequestOptions.DEFAULT); // standard RestHighLevelClient.search method inside elasticDao.
}
And finally I try to get my objects from the SearchResponse:
public List<GeoPointsFromElasticSearchResponse> searchByDistance(SearchDistanceParameters searchDistanceParameters) {
final SearchResponse searchResponse = searchRepository.searchByDistance(searchDistanceParameters);
return getGeoPointsFromElasticSearchResponses(searchResponse);
}
private List<GeoPointsFromElasticSearchResponse> getGeoPointsFromElasticSearchResponses(SearchResponse searchResponse) {
SearchHit[] hits = searchResponse.getHits().getHits();
return Arrays.stream(hits)
.map(hit -> {
final GeoPointsFromElasticSearchResponse geoPointsFromElasticSearchResponse = new GeoPointsFromElasticSearchResponse();
final Map<String, Object> sourceMap = hit.getSourceAsMap();
final Map map = (Map) sourceMap.get(GeoLocationModelFieldName.LOCATION.name().toLowerCase());
geoPointsFromElasticSearchResponse.setLatitude((Double) map.get("lat"));
geoPointsFromElasticSearchResponse.setLongitude((Double) map.get("lon"));
log.info("Sorted hits: {}", hit.getSortValues());
return geoPointsFromElasticSearchResponse;
}).collect(Collectors.toList());
}
But I get only 10,000 objects. It seems I am doing something wrong in the last part. What am I doing wrong? How do I use the SearchAfter API in Java correctly?
The search API won't return all the documents in one request; it behaves like pagination.
You have to pass an argument to searchAfter:
https://www.elastic.co/guide/en/elasticsearch/reference/6.7/search-request-search-after.html
According to the method signature: searchSourceBuilder.searchAfter(new Object[]{sortAfterValue});
The value to pass is the sort value of the last hit returned by the previous search request (hits => getAt(lastIndex) => getSortValues()); repeat the request with the new value until no hits come back.
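The cursor idea can be sketched in plain Java. This is a simulation under stated assumptions: the sorted list stands in for an index sorted by a unique field, and SearchAfterSketch, searchPage and fetchAll are hypothetical names rather than client API calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** In-memory simulation of search_after paging over ids sorted ascending; not client code. */
final class SearchAfterSketch {

    /** Returns up to size ids that sort strictly after the cursor; a null cursor means "start". */
    static List<String> searchPage(List<String> sortedIds, String after, int size) {
        return sortedIds.stream()
                .filter(id -> after == null || id.compareTo(after) > 0)
                .limit(size)
                .collect(Collectors.toList());
    }

    /** Repeats the search, feeding the last hit's sort value back in, until a page is empty. */
    static List<String> fetchAll(List<String> sortedIds, int size) {
        List<String> all = new ArrayList<>();
        String after = null;
        while (true) {
            List<String> page = searchPage(sortedIds, after, size);
            if (page.isEmpty()) {
                break;
            }
            all.addAll(page);
            after = page.get(page.size() - 1); // sort value of the last hit becomes the next cursor
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            ids.add(String.format("doc-%03d", i)); // zero-padded so string order equals numeric order
        }
        System.out.println(fetchAll(ids, 10).size()); // 25
    }
}
```

The sort field should be unique per document; otherwise hits that share the cursor value can be skipped at a page boundary.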

How to achieve pagination with the REST high-level client and Spring, or an alternative to elasticsearchTemplate.queryForPage()

I'm having an issue migrating from the transport client to the REST high-level client.
The following code, which I use to get a response of aggregated pages of type Class, does not work with RestHighLevelClient.
elasticsearchTemplate.queryForPage(searchQuery, Class.class)
Suggestions for achieving the same with another method are also welcome.
My workaround uses restHighLevelClient without Spring Data Elasticsearch. It is not a complete solution, but it may help you build yours:
BoolQueryBuilder criteriaQuerySpecification = getCriteriaQuerySpecification(transactionFilter);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.sort(new FieldSortBuilder("operation_created_at").order(SortOrder.DESC));
sourceBuilder.query(criteriaQuerySpecification);
SearchRequest searchRequest = generateSearchRequest(totalElementsInt, pageNumberInt, sourceBuilder);
SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
// Extract the hits array from the response before mapping it.
SearchHit[] hits = search.getHits().getHits();
List<OperationDto> operations = Arrays.stream(hits).map(hit -> {
// getOperationDto is a method that maps a hit to your DTO, e.g. using Map<String, Object> sourceAsMap = hit.getSourceAsMap();
return getOperationDto(hit);
}).collect(Collectors.toList());
private SearchRequest generateSearchRequest(Integer totalElementsInt, Integer pageNumberInt, SearchSourceBuilder sourceBuilder) {
SearchRequest searchRequest = new SearchRequest("operation-index").types("operation");
int offset = pageNumberInt *totalElementsInt;
sourceBuilder.from(offset);
sourceBuilder.size(totalElementsInt);
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
searchRequest.source(sourceBuilder);
return searchRequest;
}
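The offset arithmetic in generateSearchRequest can be checked in isolation. The sketch below is a minimal illustration with hypothetical names (FromSizeSketch, offsetFor); it assumes zero-based page numbers and the default index.max_result_window of 10,000, beyond which from + size paging is rejected.

```java
/** Computes the from offset for a zero-based page; a sketch assuming the default result window. */
final class FromSizeSketch {

    /** Default value of index.max_result_window; the index setting may differ on your cluster. */
    static final int MAX_RESULT_WINDOW = 10_000;

    static int offsetFor(int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize must be > 0");
        }
        long offset = (long) pageNumber * pageSize; // long avoids int overflow for large pages
        if (offset + pageSize > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException(
                    "from + size exceeds " + MAX_RESULT_WINDOW + "; use scroll or search_after instead");
        }
        return (int) offset;
    }

    public static void main(String[] args) {
        System.out.println(offsetFor(0, 20));   // 0
        System.out.println(offsetFor(3, 20));   // 60
        System.out.println(offsetFor(499, 20)); // 9980, the last page that fits in the window
    }
}
```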
This worked for me:
public Page<T> search() {
// T stands for your entity type; replace T.class with the concrete class.
Criteria nameCriteria = new Criteria("name").is(text)
.and(new Criteria("jsonNode").in(String ...)); // this can be any criteria
Query query = new CriteriaQuery(nameCriteria).setPageable(paging);
SearchHits<T> searchHits = elasticsearchOperations.search(query, T.class);
return (Page) SearchHitSupport.searchPageFor(searchHits, query.getPageable());
}

Unable to fetch the data with help of QueryBuilders.termQuery

I am new to Elasticsearch and am trying to work with the high-level REST client.
CRUD operations work, but I am stuck on the search functionality.
My objective is to fetch all the data whose book id starts with E106:
http://localhost:5918/book-elastic/books/book/E106
I have added part of the code below.
I am able to get all the data using
QueryBuilders.matchAllQuery()
But I can't get the documents for a specific id value with
QueryBuilders.termQuery("_id",bookId)
I have also shared screenshots of both results.
Can somebody help me with the query?
Please let me know if any further information is needed.
Thanks in advance
public Page<BookEntity> findByBookId(String bookId, Pageable pageable) throws IOException{
SearchRequest searchRequest = new SearchRequest(INDEX);
searchRequest.types(TYPE);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.fetchSource(false);
//searchSourceBuilder.fetchSource(null, new String[]{"excludedProperty"});
/*MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("id",bookId);
matchQueryBuilder.fuzziness(Fuzziness.AUTO);
matchQueryBuilder.prefixLength(3);
matchQueryBuilder.maxExpansions(7);*/
searchSourceBuilder.from((int)pageable.getOffset());
searchSourceBuilder.size(pageable.getPageSize());
//searchSourceBuilder.query(matchQueryBuilder);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
//searchSourceBuilder.query(QueryBuilders.termQuery("_id",bookId));
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
SearchHits hits = searchResponse.getHits();
SearchHit[] objectHits = hits.getHits();
for (SearchHit searchHit : objectHits) {
System.out.println("***************************");
System.out.println("Search Hit :: "+searchHit);
System.out.println("***************************");
}
return null;
}
Result Screenshot
matchAllQuery
Input : {"from":0,"size":20,"query":{"match_all":{"boost":1.0}},"_source":false}
Response
Response Value: {"took":5,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"bookdata","_type":"books","_id":"E106401","_score":1.0},{"_index":"bookdata","_type":"books","_id":"E106403","_score":1.0},{"_index":"bookdata","_type":"books","_id":"E10640","_score":1.0}]}}
Input Values: QueryBuilders.termQuery("_id",bookId)
{"from":0,"size":20,"query":{"term":{"_id":{"value":"E106","boost":1.0}}},"_source":false}
Response
The Response is Null
Using matchPhrasePrefixQuery from QueryBuilders solved my issue. A term query matches only the exact value, so E106 does not match ids such as E106401.
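The difference between exact and prefix matching can be illustrated with the three ids from the response above. This is only a plain-string analogy with hypothetical names (IdMatchSketch, exactMatch, prefixMatch); it does not reproduce how Elasticsearch analyzes or scores a match_phrase_prefix query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Plain-string analogy of exact (term-like) vs prefix matching; not Elasticsearch code. */
final class IdMatchSketch {

    /** Term-like behaviour: the whole id must equal the search value. */
    static List<String> exactMatch(List<String> ids, String value) {
        return ids.stream().filter(id -> id.equals(value)).collect(Collectors.toList());
    }

    /** Prefix-like behaviour: the id only has to start with the search value. */
    static List<String> prefixMatch(List<String> ids, String prefix) {
        return ids.stream().filter(id -> id.startsWith(prefix)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("E106401", "E106403", "E10640");
        System.out.println(exactMatch(ids, "E106"));  // []
        System.out.println(prefixMatch(ids, "E106")); // [E106401, E106403, E10640]
    }
}
```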

Resources