How to sort by _doc using elasticsearch java client - elasticsearch

I want to iterate over an entire Elasticsearch index/type. I am using scroll in the Java client as shown below:
SearchResponse scrollResp = client.prepareSearch(test)
.setSearchType(SearchType.SCAN)
.setScroll(new TimeValue(60000))
.setQuery(qb)
.setSize(100).execute().actionGet();
As suggested in the docs at the link:
"Scroll requests have optimizations that make them faster when the sort order is _doc. If you want to iterate over all documents regardless of the order, this is the most efficient option"
"sort": [
"_doc"
]
How do I set the sort order to "_doc" in the Java client code above?

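Add the sort with addSort("_doc", SortOrder.ASC). Note that the scan search type is deprecated as of Elasticsearch 2.1, where a scroll sorted by _doc replaces it, so on those versions the setSearchType line can be dropped: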
SearchResponse scrollResp = elasticsearchTemplate.client.prepareSearch(test)
.setSearchType(SearchType.SCAN)
.setScroll(new TimeValue(60000))
.setQuery(qb).addSort("_doc" , SortOrder.ASC)
.setSize(100).execute().actionGet();
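To actually walk the whole index, the first response is only the starting point: each following page has to be requested with the scroll id until an empty page comes back. Here is a minimal sketch of that loop shape. It does not talk to a cluster; Page, drain and fakePage are hypothetical stand-ins invented for this example, and with the transport client the next page would come from client.prepareSearchScroll(scrollResp.getScrollId()) instead. The loop fetches before checking for emptiness because, with the scan search type, the initial response carries no hits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

final class ScrollLoopSketch {

    /** Hypothetical page type: the hits of one response plus the id for the next request. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Collects the hits of every page, stopping at the first empty follow-up page. */
    static List<String> drain(Page first, Function<String, Page> nextPage) {
        List<String> all = new ArrayList<>();
        Page page = first;
        while (true) {
            all.addAll(page.hits);                 // process the current page (may be empty for scan)
            page = nextPage.apply(page.scrollId);  // ask for the next page
            if (page.hits.isEmpty()) {
                break;                             // an empty follow-up page means the scroll is done
            }
        }
        return all;
    }

    /** Hypothetical stand-in for the cluster: serves a list in fixed-size pages. */
    static Page fakePage(List<String> docs, int from, int size) {
        int end = Math.min(docs.size(), from + size);
        int start = Math.min(from, end);
        return new Page(String.valueOf(end), new ArrayList<>(docs.subList(start, end)));
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a", "b", "c", "d", "e");
        List<String> seen = drain(fakePage(docs, 0, 2),
                id -> fakePage(docs, Integer.parseInt(id), 2));
        System.out.println(seen);
    }
}
```

In real code the body of the loop would handle each SearchHit instead of collecting strings, and the same scroll timeout would be passed on every follow-up request.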

Related

How can I make phrase suggester query with Elasticsearch java api?

I am using version 7.10 of Elasticsearch. I created an index and defined its settings and mappings. Then I queried the index using HTTP requests and got the results I need, but I want to do the same thing with the Java API and have not managed to.
Can you help me send the request and get the result as a list in Java, from scratch?
Here is the query I used to obtain suggestions:
{
  "suggest": {
    "text": "some title I want to search",
    "phrase_suggester": {
      "phrase": {
        "field": "title.shingle",
        "max_errors": 2,
        "size": 5,
        "confidence": 0.0,
        "direct_generator": [
          {
            "field": "title.shingle",
            "max_edits": 2
          }
        ]
      }
    }
  }
}
How can I write this query with the Elasticsearch Java API? Can you help me figure this out?
This would be the way to build the request:
client.search(searchRequestBuilder -> searchRequestBuilder
.suggest(suggestBuilder -> suggestBuilder
.text("some title I want to search")
.suggesters("phrase_suggester", fieldSuggesterBuilder -> fieldSuggesterBuilder
.phrase(phraseBuilder -> phraseBuilder.field("title.shingle")
.maxErrors(2d)
.size(5)
.confidence(0.0)
.directGenerator(directGeneratorBuilder -> directGeneratorBuilder
.field("title.shingle")
.maxEdits(2))))),
YourEntity.class);
By the way, the new Java client was only introduced in 7.16, and you wrote that you are on 7.10.
I finally found my own answer. It was hard to find a solution because there is so little documentation on this kind of specific topic. I'm sharing it for anyone who is wondering:
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
PhraseSuggestionBuilder builder = SuggestBuilders.phraseSuggestion("title.shingle")
.addCandidateGenerator(new DirectCandidateGeneratorBuilder("title.shingle")
.suggestMode("always"))
.text(query)
.maxErrors(2f)
.confidence(0f);
SuggestBuilder suggestBuilder = new SuggestBuilder().addSuggestion("suggestion", builder);
searchSourceBuilder.suggest(suggestBuilder);
SearchRequest searchRequest = new SearchRequest();
searchRequest.indices("index_name");
searchRequest.source(searchSourceBuilder);
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);

Elasticsearch: Use match query along with autocomplete

I want to use a match query along with an autocomplete suggestion in ES5. Basically, I want to restrict my autocomplete results based on an attribute; for example, autocomplete should return results within one city only.
MatchQueryBuilder queryBuilder = QueryBuilders.matchQuery("cityName", city);
SuggestBuilder suggestBuilder = new SuggestBuilder()
.addSuggestion("region", SuggestBuilders.completionSuggestion("region").text(text));
SearchResponse response = client.prepareSearch(index).setTypes(type)
.suggest(suggestBuilder)
.setQuery(queryBuilder)
.execute()
.actionGet();
The above doesn't seem to work correctly. I get both sets of results in the response, each independent of the other.
Any suggestions?
It looks like the suggestion builder is creating a completion suggester. Completion suggesters are stored in a specialized structure that is separate from the main index, which means they have no access to your filter fields such as cityName. To filter suggestions, you need to define those filter values explicitly when you create the suggestion, separately from the attributes you index for the document to which the suggestion is attached. These suggester filters are called contexts. More information can be found in the docs.
The docs linked to above are going to explain this better than I can, but here is a short example. Using a mapping like the following:
"auto_suggest": {
  "type": "completion",
  "analyzer": "simple",
  "contexts": [
    {
      "name": "cityName",
      "type": "category",
      "path": "cityName"
    }
  ]
}
This section of the mapping defines a completion suggester field called auto_suggest with a cityName context that can be used to filter the suggestions. Note that the path value is set, which means this context gets its value from the cityName attribute in your main index. You can remove the path value if you want to set the context explicitly to something that isn't already in the main index.
To request suggestions while providing context, something like this in combination with the settings above should work:
"suggest": {
  "auto_complete": {
    "text": "Silv",
    "completion": {
      "field": "auto_suggest",
      "size": 10,
      "fuzzy": {
        "fuzziness": 2
      },
      "contexts": {
        "cityName": [ "Los Angeles" ]
      }
    }
  }
}
Note that this request also allows for fuzziness, to make it a little resilient to spelling mistakes. It also restricts the number of suggestions returned to 10.
It's also worth noting that in ES 5.x completion suggesters are document-centric, so if multiple documents have the same suggestion, you will receive that suggestion once per matching document. ES 6 has an option to de-duplicate suggestions, but there is nothing similar in 5.x. Again, it's best to think of completion suggesters as existing in their own index, specifically an FST, which is explained in more detail here.
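Since 5.x has no built-in de-duplication, one possible workaround (my assumption, not something the docs prescribe) is to de-duplicate on the client after reading the suggestion texts out of the response. The sketch below uses only the JDK; the class name, method name and sample strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class SuggestionDedupSketch {

    /**
     * Removes repeated suggestion texts while keeping the order in which
     * they were returned, then keeps at most {@code limit} entries.
     */
    static List<String> dedupe(List<String> suggestions, int limit) {
        // LinkedHashSet drops duplicates and preserves first-seen order.
        LinkedHashSet<String> unique = new LinkedHashSet<>(suggestions);
        List<String> out = new ArrayList<>();
        for (String text : unique) {
            if (out.size() == limit) {
                break;
            }
            out.add(text);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for suggestion texts from a response.
        List<String> raw = Arrays.asList("Silver Lake", "Silverado", "Silver Lake", "Silverton");
        System.out.println(dedupe(raw, 10));
    }
}
```

Because duplicates are dropped after the fact, it may make sense to request a larger size than you intend to show, so that enough distinct suggestions remain.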

Elasticsearch Query: Boosting specific field

I am using Elasticsearch 2.4.3 and want to boost specific fields in my query. Is this possible? I only see how I can boost an index.
Greetings!
UPDATE
Mapping:
"firstName": {
  "type": "string",
  "analyzer": "customNGram"
},
"lastName": {
  "type": "string",
  "analyzer": "customNGram"
},
"note": {
  "type": "string",
  "analyzer": "customNGram"
}
Query (Java API):
QueryBuilder qb = new BoolQueryBuilder()
.must(QueryBuilders.matchQuery("_all", term)
.analyzer("atsCustomSearchAnalyzer")
.operator(Operator.AND));
SearchRequestBuilder searchRequestBuilder = elasticsearchClient.prepareSearch("persons", "activities").setTypes("person", "activity")
.setQuery(qb)
.addHighlightedField("*").setHighlighterRequireFieldMatch(false)
.setHighlighterOrder("score")
.setHighlighterFragmentSize(150)
.setHighlighterForceSource(true)
.setSize(100)
.addIndexBoost("persons", 200)
.setFrom(offset);
return searchRequestBuilder.execute().get();
If you split up your match query so that it matches individual fields, e.g. by using a multi match query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html), you can boost the fields you like. So something like:
QueryBuilder qb = new BoolQueryBuilder()
.must(QueryBuilders.multiMatchQuery(term, "firstName^3",
"lastName^3", "note")
.analyzer("atsCustomSearchAnalyzer")
.operator(Operator.AND));
should boost firstName and lastName 3 times relative to the note field.
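If the boosts come from configuration rather than being hard-coded, the "field^boost" strings can be generated. This is only a small sketch using the JDK; FieldBoostSketch and boostedFields are names I made up, and the resulting array is meant to be passed as the field names of multiMatchQuery(term, ...).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FieldBoostSketch {

    /**
     * Turns field -> boost pairs into the "field^boost" notation used above.
     * A boost of 1 is the default, so the plain field name is emitted for it.
     */
    static String[] boostedFields(Map<String, Integer> boosts) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : boosts.entrySet()) {
            int boost = entry.getValue();
            out.add(boost == 1 ? entry.getKey() : entry.getKey() + "^" + boost);
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("firstName", 3);
        boosts.put("lastName", 3);
        boosts.put("note", 1);
        System.out.println(Arrays.toString(boostedFields(boosts)));
    }
}
```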

elasticsearch sort by _doc for scroll not returning any results

I am trying to run a scroll query and get the results sorted on the "_doc" field, but the scroll always returns an empty result set. Below is the scroll request I am trying. I am using Elasticsearch version 2.3.4.
SearchResponse scrollResp = client.prepareSearch(indexName).setTypes(indexType)
.setScroll(TimeValue.timeValueMillis(scrollTimeout))
.addSort("_doc" , SortOrder.ASC)
.setSearchType(SearchType.QUERY_THEN_FETCH)
.setQuery(query)
.setFrom(pageIndex)
.setSize(scrollSize)
//.setFetchSource(true)
.execute()
.actionGet();
return scrollResp;
But if I replace the sort in the same query with a sort on the "id" field, it works fine. Am I doing anything wrong here?
SearchResponse scrollResp = client.prepareSearch(indexName).setTypes(indexType)
.setScroll(TimeValue.timeValueMillis(scrollTimeout))
.addSort(new FieldSortBuilder("id"))
.setSearchType(SearchType.QUERY_THEN_FETCH)
.setQuery(query)
.setFrom(pageIndex)
.setSize(scrollSize)
//.setFetchSource(true)
.execute()
.actionGet();
return scrollResp;

Post Filter Query in Elasticsearch 2.3.3 using Java

I have built a web app on top of Elasticsearch (v2.3.3). To filter the query results, I am using Elasticsearch's post filter. However, I have learned that with a post filter the performance benefit of filtering is lost, since I am not using any aggregations or differential filtering. (Reference: https://www.elastic.co/guide/en/elasticsearch/guide/current/_post_filter.html)
This is what my Elasticsearch client looks like:
Client client = TransportClient.builder().build().addTransportAddress(
new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"),
9300));
SearchResponse response = client.prepareSearch("index_name")
.setTypes("index_type")
.setQuery(QueryBuilders.simpleQueryStringQuery(query)
.field("newContent").field("T"))
.setPostFilter(QueryBuilders.termQuery(Collection, true))
.setFetchSource(new String[] { "U", "UE", "UD", "T" }, null)
.setVersion(true).addHighlightedField("newContent").setFrom(0)
.setSize(10).execute().actionGet();
I have also read that the filtered query is deprecated in Elasticsearch 2.x. Is there another way to apply a filter before the query is executed? I might be missing something obvious. I would appreciate your help.
You simply need to move the filter from the post filter into a bool/filter query. Try this instead:
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
.must(QueryBuilders.simpleQueryStringQuery(query)
.field("newContent").field("T"))
.filter(QueryBuilders.termQuery(Collection, true));
SearchResponse response = client.prepareSearch("index_name")
.setTypes("index_type")
.setQuery(boolQuery)
.setFetchSource(new String[] { "U", "UE", "UD", "T" }, null)
.setVersion(true).addHighlightedField("newContent").setFrom(0)
.setSize(10).execute().actionGet();
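To check what is being sent, it can help to compare against the query DSL this bool query should correspond to: the text query sits under must and the term filter under filter. The sketch below just assembles that JSON as a string with the JDK. BoolFilterBodySketch, body and escape are hypothetical names, the escaping only covers backslashes and double quotes, and the filter field is a parameter because the snippet above passes it as the identifier Collection, whose value is not shown.

```java
final class BoolFilterBodySketch {

    /** Minimal JSON string escaping: backslashes and double quotes only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds the request body with the query text under "must" and the term filter under "filter". */
    static String body(String queryText, String filterField) {
        return "{\"query\":{\"bool\":{"
                + "\"must\":{\"simple_query_string\":{"
                + "\"query\":\"" + escape(queryText) + "\","
                + "\"fields\":[\"newContent\",\"T\"]}},"
                + "\"filter\":{\"term\":{\"" + escape(filterField) + "\":true}}"
                + "}}}";
    }

    public static void main(String[] args) {
        // "Collection" is used here as an assumed field name for illustration.
        System.out.println(body("hello world", "Collection"));
    }
}
```

The filter clause does not contribute to scoring, which is why moving the term query there should not change the ranking produced by the simple query string.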
