How to use SearchAfter API correctly? - elasticsearch

I recently started working with Elasticsearch and have a question. I have a million documents in an index and I want to retrieve more than 10,000 of them. For this I can use either the scroll API or the SearchAfter API. I understand how the scroll API works, but I have a problem with SearchAfter.
Here is my SearchSourceBuilder method:
public SearchRequest buildRequest(SearchDistanceParameters args) {
    final SearchSourceBuilder searchSourceBuilder = prepareSearchSourceBuilder(args);
    final SearchRequest searchRequest = new SearchRequest();
    return searchRequest.source(searchSourceBuilder);
}
private SearchSourceBuilder prepareSearchSourceBuilder(SearchDistanceParameters searchDistanceParameters) {
    final FieldSortBuilder fieldSortBuilder = new FieldSortBuilder("_id").order(SortOrder.ASC);
    final SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    final GeoDistanceQueryBuilder geoDistanceQueryBuilder = geoDistanceQuery(GeoLocationModelFieldName.LOCATION.name().toLowerCase());
    geoDistanceQueryBuilder.point(searchDistanceParameters.getLatitude(), searchDistanceParameters.getLongitude());
    geoDistanceQueryBuilder.distance(searchDistanceParameters.getDistance(), DistanceUnit.KILOMETERS);
    searchSourceBuilder.query(geoDistanceQueryBuilder);
    searchSourceBuilder.sort(fieldSortBuilder);
    searchSourceBuilder.searchAfter();
    return searchSourceBuilder;
}
Here I set the sort before calling searchAfter(), as mentioned in the SearchAfter API docs.
Here I send my request to Elasticsearch:
public SearchResponse sendRequestToElastic(SearchDistanceParameters args) throws IOException {
    SearchRequest searchRequest = searchByDistanceRequestBuilder.buildRequest(args);
    return elasticDao.search(searchRequest, RequestOptions.DEFAULT); // standard RestHighLevelClient.search method inside elasticDao.
}
And finally I am trying to get my objects from SearchResponse:
public List<GeoPointsFromElasticSearchResponse> searchByDistance(SearchDistanceParameters searchDistanceParameters) {
    final SearchResponse searchResponse = searchRepository.searchByDistance(searchDistanceParameters);
    return getGeoPointsFromElasticSearchResponses(searchResponse);
}
private List<GeoPointsFromElasticSearchResponse> getGeoPointsFromElasticSearchResponses(SearchResponse searchResponse) {
    SearchHit[] hits = searchResponse.getHits().getHits();
    return Arrays.stream(hits)
            .map(hit -> {
                final GeoPointsFromElasticSearchResponse geoPointsFromElasticSearchResponse = new GeoPointsFromElasticSearchResponse();
                final Map<String, Object> sourceMap = hit.getSourceAsMap();
                final Map map = (Map) sourceMap.get(GeoLocationModelFieldName.LOCATION.name().toLowerCase());
                geoPointsFromElasticSearchResponse.setLatitude((Double) map.get("lat"));
                geoPointsFromElasticSearchResponse.setLongitude((Double) map.get("lon"));
                log.info("Sorted hits: {}", hit.getSortValues());
                return geoPointsFromElasticSearchResponse;
            }).collect(Collectors.toList());
}
But I only get 10,000 objects. It seems I am doing something wrong in the last part. What am I doing wrong? How do I use the SearchAfter API in Java correctly?

The search API won't return all the documents in one request; it behaves like pagination.
You have to pass an argument to searchAfter:
https://www.elastic.co/guide/en/elasticsearch/reference/6.7/search-request-search-after.html
According to the method signature: searchSourceBuilder.searchAfter(new Object[]{sortAfterValue});
The value to pass is the sort value of the last hit returned by the previous search request (hits => getAt(lastIndex) => getSortValues())
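To make the loop concrete, here is a minimal sketch in plain Java with no Elasticsearch dependency. The names SearchAfterPaging, Hit, fetchAll and fakeSearch are hypothetical and only for illustration: fetchPage stands in for building a request with searchSourceBuilder.searchAfter(cursor) and running it, and Hit.sortValues stands in for hit.getSortValues(). As far as I can tell, calling searchAfter() with no arguments, as in the question, only reads the current value and sets nothing, so every request would return the same first page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

class SearchAfterPaging {

    /** Minimal stand-in for a SearchHit: an id plus the sort values returned with it. */
    static final class Hit {
        final String id;
        final Object[] sortValues;

        Hit(String id, Object[] sortValues) {
            this.id = id;
            this.sortValues = sortValues;
        }
    }

    /**
     * Collects every page. fetchPage is a hypothetical hook: in real code it would set
     * size(pageSize) and, when the cursor is non-null, searchAfter(cursor) on the
     * SearchSourceBuilder, run the search, and return the hits.
     */
    static List<Hit> fetchAll(BiFunction<Object[], Integer, List<Hit>> fetchPage, int pageSize) {
        List<Hit> all = new ArrayList<>();
        Object[] cursor = null; // the first request carries no search_after values
        while (true) {
            List<Hit> page = fetchPage.apply(cursor, pageSize);
            if (page.isEmpty()) {
                break; // an empty page means everything has been read
            }
            all.addAll(page);
            // Equivalent of hits[hits.length - 1].getSortValues()
            cursor = page.get(page.size() - 1).sortValues;
        }
        return all;
    }

    /** In-memory imitation of an index sorted ascending by a unique key (assumption: ids are unique). */
    static List<Hit> fakeSearch(List<String> sortedIds, Object[] cursor, int size) {
        List<Hit> page = new ArrayList<>();
        for (String id : sortedIds) {
            if (cursor != null && id.compareTo((String) cursor[0]) <= 0) {
                continue; // skip everything up to and including the cursor
            }
            page.add(new Hit(id, new Object[] {id}));
            if (page.size() == size) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a", "b", "c", "d", "e");
        List<Hit> all = fetchAll((cursor, size) -> fakeSearch(ids, cursor, size), 2);
        System.out.println(all.size());
    }
}
```

The important part is the last line of the loop: each request reuses the same query and sort, and only the cursor changes. The sort should end in a field that is unique per document so that no hit is skipped or repeated between pages.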

Related

Setting default operator to AND in elasticsearch when using SearchSourceBuilder and XContentParser

I have the sample code below to query Elasticsearch. It receives the user query and uses XContentParser and SearchSourceBuilder. I don't want to change much of the existing code, as it has been in production for a while. I would like to change the default operator from "OR" to "AND" so we get more meaningful messages. I couldn't find a way to set the default operator in this scenario. Any help appreciated.
RestHighLevelClient esClient;
String[] indicesArray = {"index1", "index2"};
String fullQueryJson = "{\"query\":{\"nested\":{\"path\":\"_ROOT\",\"query\":{\"query_string\":{\"query\":\"_ROOT.LATEST.FULL_NAME:John Mary C\"}}}},\"_source\":[\"FULL_NAME\",\"TEXT_MSG\",\"USER_NAME\"],\"size\":9010}";
SearchSourceBuilder accessSearchSource = new SearchSourceBuilder();
final SearchModule searchModule = new SearchModule(Settings.EMPTY, false, Collections.emptyList());
final XContentParser fullQueryJsonParser = XContentFactory.xContent(XContentType.JSON)
        .createParser(new NamedXContentRegistry(searchModule.getNamedXContents()), LoggingDeprecationHandler.INSTANCE, fullQueryJson);
accessSearchSource.parseXContent(fullQueryJsonParser);
SearchRequest request = new SearchRequest(indicesArray);
request.source(accessSearchSource);
esClient.search(request);
The operator is set on the query builder that you pass to the SearchSourceBuilder.
Example:
SearchRequest firstSearchRequest = new SearchRequest(type.toLowerCase());
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchQuery("user", claim).operator(Operator.AND));
firstSearchRequest.source(searchSourceBuilder);
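Since the code in the question parses the query from JSON rather than building it in Java, another option may be to set the operator in the JSON itself: the query_string query accepts a default_operator parameter. Below is a minimal sketch that rebuilds the JSON from the question with that parameter added. The class and method names (DefaultOperatorQuery, buildQueryJson) are made up for illustration, and the sketch assumes the user query contains no characters that need JSON escaping.

```java
class DefaultOperatorQuery {

    /**
     * Rebuilds the query JSON from the question, adding default_operator to the
     * query_string clause. Assumption: userQuery needs no JSON escaping.
     */
    static String buildQueryJson(String userQuery, String defaultOperator) {
        return "{\"query\":{\"nested\":{\"path\":\"_ROOT\",\"query\":{\"query_string\":{"
                + "\"query\":\"" + userQuery + "\","
                + "\"default_operator\":\"" + defaultOperator + "\""
                + "}}}},\"_source\":[\"FULL_NAME\",\"TEXT_MSG\",\"USER_NAME\"],\"size\":9010}";
    }

    public static void main(String[] args) {
        // The resulting string can be handed to the existing parser code unchanged.
        System.out.println(buildQueryJson("_ROOT.LATEST.FULL_NAME:John Mary C", "AND"));
    }
}
```

This keeps the existing XContentParser and parseXContent code as it is; only the JSON string changes.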

p:datatable paging all entries elastic search index

I want to use a pageable PrimeFaces datatable to display all entries in an Elasticsearch index.
Unfortunately the following code only returns 10 entries.
public ArrayList<Video> getAllVideos(String indexName) throws IOException {
    makeConnection();
    SearchRequest searchRequest = new SearchRequest(indexName);
    Scroll scroll = new Scroll(TimeValue.timeValueMinutes(1L));
    searchRequest.scroll(scroll);
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery());
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest);
    return buildResponse(searchResponse, restHighLevelClient);
}
Does anybody have a working example for getting all records? Thanks

How to achieve pagination with REST high-level client and Spring? (or an alternative to elasticsearchTemplate.queryForPage())

I'm having an issue migrating from the transport client to the REST high-level client.
The following code does not work with RestHighLevelClient, which I want to use to get a response of aggregated pages of type Class.
elasticsearchTemplate.queryForPage(searchQuery, Class.class)
Any suggestions for achieving the same with another method are also welcome.
My workaround uses restHighLevelClient without Spring Data Elasticsearch (this is not a full solution, but it may help you get to one):
BoolQueryBuilder criteriaQuerySpecification = getCriteriaQuerySpecification(transactionFilter);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.sort(new FieldSortBuilder("operation_created_at").order(SortOrder.DESC));
sourceBuilder.query(criteriaQuerySpecification);
SearchRequest searchRequest = generateSearchRequest(totalElementsInt, pageNumberInt, sourceBuilder);
SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] hits = search.getHits().getHits();
List<OperationDto> operations = Arrays.stream(hits).map(hit -> {
    // getOperationDto maps a hit to your DTO, e.g. using Map<String, Object> sourceAsMap = hit.getSourceAsMap();
    OperationDto operation = getOperationDto(hit);
    return operation;
}).collect(Collectors.toList());
private SearchRequest generateSearchRequest(Integer totalElementsInt, Integer pageNumberInt, SearchSourceBuilder sourceBuilder) {
    SearchRequest searchRequest = new SearchRequest("operation-index").types("operation");
    int offset = pageNumberInt * totalElementsInt;
    sourceBuilder.from(offset);
    sourceBuilder.size(totalElementsInt);
    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    searchRequest.source(sourceBuilder);
    return searchRequest;
}
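For reference, the from/size arithmetic in generateSearchRequest can be sketched on its own. By default Elasticsearch rejects requests where from + size exceeds index.max_result_window (10,000), so this style of paging only reaches the first 10,000 hits. The names PageWindow, offset and fitsInWindow are hypothetical, and the sketch assumes zero-based page numbers and the default window.

```java
class PageWindow {

    /** Default value of the index.max_result_window setting. */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Converts a zero-based page number into the "from" offset. */
    static int offset(int pageNumber, int pageSize) {
        return pageNumber * pageSize;
    }

    /** True when from + size stays inside the default result window. */
    static boolean fitsInWindow(int pageNumber, int pageSize) {
        // Computed in long to avoid int overflow for very large page numbers.
        return (long) pageNumber * pageSize + pageSize <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(offset(3, 20));         // from value for the fourth page
        System.out.println(fitsInWindow(499, 20)); // last page that still fits
        System.out.println(fitsInWindow(500, 20)); // would exceed the window
    }
}
```

Pages beyond the window need the scroll API or search_after instead of from/size.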
This worked for me:
public Page<T> search() {
    Query query;
    SearchHits<T> searchHits;
    Criteria nameCriteria = new Criteria("name").is(text).and(new Criteria("jsonNode").in(String ...)); //This can be any Aggregator
    query = new CriteriaQuery(nameCriteria).setPageable(paging);
    searchHits = elasticsearchOperations.search(query, T.class);
    return (Page) SearchHitSupport.searchPageFor(searchHits, query.getPageable());
}

Unable to fetch the data with help of QueryBuilders.termQuery

I am a newcomer to Elasticsearch and am trying to work with the high-level REST client.
CRUD operations work, but I got stuck on the search functionality.
My objective is to fetch all documents whose book id starts with E106:
http://localhost:5918/book-elastic/books/book/E106
I have added the relevant part of the code below.
I am able to get all the data using
QueryBuilders.matchAllQuery()
But I couldn't get the documents for a specific id value using
QueryBuilders.termQuery("_id",bookId)
I have also shared screenshots of both results.
Can somebody help me with the query?
Please let me know if any further information is needed.
Thanks in advance
public Page<BookEntity> findByBookId(String bookId, Pageable pageable) throws IOException {
    SearchRequest searchRequest = new SearchRequest(INDEX);
    searchRequest.types(TYPE);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.fetchSource(false);
    //searchSourceBuilder.fetchSource(null, new String[]{"excludedProperty"});
    /*MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("id",bookId);
    matchQueryBuilder.fuzziness(Fuzziness.AUTO);
    matchQueryBuilder.prefixLength(3);
    matchQueryBuilder.maxExpansions(7);*/
    searchSourceBuilder.from((int) pageable.getOffset());
    searchSourceBuilder.size(pageable.getPageSize());
    //searchSourceBuilder.query(matchQueryBuilder);
    searchSourceBuilder.query(QueryBuilders.matchAllQuery());
    //searchSourceBuilder.query(QueryBuilders.termQuery("_id",bookId));
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    SearchHits hits = searchResponse.getHits();
    SearchHit[] objectHits = hits.getHits();
    for (SearchHit searchHit : objectHits) {
        System.out.println("***************************");
        System.out.println("Search Hit :: " + searchHit);
        System.out.println("***************************");
    }
    return null;
}
Result Screenshot
matchAllQuery
Input : {"from":0,"size":20,"query":{"match_all":{"boost":1.0}},"_source":false}
Response
Response Value: {"took":5,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"bookdata","_type":"books","_id":"E106401","_score":1.0},{"_index":"bookdata","_type":"books","_id":"E106403","_score":1.0},{"_index":"bookdata","_type":"books","_id":"E10640","_score":1.0}]}}
Input Values: QueryBuilders.termQuery("_id",bookId)
{"from":0,"size":20,"query":{"term":{"_id":{"value":"E106","boost":1.0}}},"_source":false}
Response
The Response is Null
Using matchPhrasePrefixQuery from QueryBuilders solved my issue.
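The reason the term query came back empty is that a term query matches exact values only, and none of the three documents has the exact _id E106. The sketch below is plain Java rather than Elasticsearch code: it only imitates exact matching versus prefix matching on the ids from the response above, and the names IdMatching, exactMatch and prefixMatch are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class IdMatching {

    /** Imitates a term query: keeps only ids equal to the given value. */
    static List<String> exactMatch(List<String> ids, String value) {
        return ids.stream().filter(id -> id.equals(value)).collect(Collectors.toList());
    }

    /** Imitates prefix matching: keeps ids that start with the given prefix. */
    static List<String> prefixMatch(List<String> ids, String prefix) {
        return ids.stream().filter(id -> id.startsWith(prefix)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> ids = List.of("E106401", "E106403", "E10640");
        System.out.println(exactMatch(ids, "E106"));  // no id equals "E106"
        System.out.println(prefixMatch(ids, "E106")); // all three ids start with "E106"
    }
}
```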

How can I make the documents expired in elastic search using jest api from java application?

I am new to Elasticsearch. I want to expire the documents indexed in Elasticsearch using the Jest API from my application. I found that there is a parameter called TTL for that, but I am having trouble setting it to enabled: true from the Jest client. Please let me know how to accomplish this.
Thanks in advance!
This seemed to work for me:
final String index = "myindex";
final String type = "mytype";
final String id = "myid";
final PutMapping putMapping = new PutMapping.Builder(index, type, "{ \"_ttl\" : { \"enabled\" : true } }").build();
client.execute(putMapping);
final Map<String, String> documentToIndex = new HashMap<String, String>();
documentToIndex.put("name", "Fred");
documentToIndex.put("phoneNumber", "1234");
documentToIndex.put("_ttl", "30s"); // Set the TTL
final String jsonDocument = gson.toJson(documentToIndex);
final JestResult result = client.execute(new Index.Builder(jsonDocument).index(index).type(type).id(id).build());
