Post Filter Query in Elasticsearch 2.3.3 using Java - elasticsearch

I have built a web app on top of Elasticsearch (v2.3.3). To filter the query, I am using Elasticsearch's post filter. However, I have learned that a post filter gives up the performance benefit of filtering unless it is needed for aggregations or differential filtering, and I am using neither. (Reference: https://www.elastic.co/guide/en/elasticsearch/guide/current/_post_filter.html)
This is what my Elasticsearch client code looks like:
Client client = TransportClient.builder().build().addTransportAddress(
new InetSocketTransportAddress(InetAddress.getByName("127.0.0.1"),
9300));
SearchResponse response = client.prepareSearch("index_name")
.setTypes("index_type")
.setQuery(QueryBuilders.simpleQueryStringQuery(query)
.field("newContent").field("T"))
.setPostFilter(QueryBuilders.termQuery(Collection, true))
.setFetchSource(new String[] { "U", "UE", "UD", "T" }, null)
.setVersion(true).addHighlightedField("newContent").setFrom(0)
.setSize(10).execute().actionGet();
I have also read that the filtered query is deprecated in Elasticsearch 2.x. Is there another way to apply a filter before the query is executed? I might be missing something obvious. I would appreciate your help.

You simply need to move the filter from the post filter into the filter clause of a bool query. Try this instead:
BoolQueryBuilder boolQuery = QueryBuilders.boolQuery()
.must(QueryBuilders.simpleQueryStringQuery(query)
.field("newContent").field("T"))
.filter(QueryBuilders.termQuery(Collection, true));
SearchResponse response = client.prepareSearch("index_name")
.setTypes("index_type")
.setQuery(boolQuery)
.setFetchSource(new String[] { "U", "UE", "UD", "T" }, null)
.setVersion(true).addHighlightedField("newContent").setFrom(0)
.setSize(10).execute().actionGet();
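To make the difference concrete, here is a minimal sketch of the two request bodies as they would look in the JSON query DSL, built with plain string concatenation so it runs without the Elasticsearch client on the classpath. The class name, the helper names, and the field name "collection" are assumptions for illustration (the original code passes a variable named Collection whose value is not shown).

```java
class BoolFilterBody {

    // Minimal JSON string quoting for user-supplied text (quotes, backslashes, control chars).
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Filter placed inside bool/filter: it is part of the query itself.
    static String boolFilterBody(String query, String filterField) {
        return "{\"query\":{\"bool\":{"
             + "\"must\":{\"simple_query_string\":{\"query\":" + quote(query)
             + ",\"fields\":[\"newContent\",\"T\"]}},"
             + "\"filter\":{\"term\":{" + quote(filterField) + ":true}}"
             + "}}}";
    }

    // Same filter as post_filter: it is applied to the hits after the query has run.
    static String postFilterBody(String query, String filterField) {
        return "{\"query\":{\"simple_query_string\":{\"query\":" + quote(query)
             + ",\"fields\":[\"newContent\",\"T\"]}},"
             + "\"post_filter\":{\"term\":{" + quote(filterField) + ":true}}}";
    }

    public static void main(String[] args) {
        // "collection" is a hypothetical field name.
        System.out.println(boolFilterBody("quick fox", "collection"));
        System.out.println(postFilterBody("quick fox", "collection"));
    }
}
```

In the first body the term filter sits next to the must clause, so it narrows the documents the query has to score; in the second it only trims the final hit list.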

Related

Elasticsearch Query: Boosting specific field

I am using Elasticsearch 2.4.3 and want to boost specific fields in my query. Is this possible? I only see how I can boost an index.
Greetings!
UPDATE
Mapping:
"firstName":{"type":"string",
"analyzer":"customNGram"
},
"lastName":{
"type":"string",
"analyzer":"customNGram"
},
"note":{
"type":"string",
"analyzer":"customNGram"
}
Query (Java API):
QueryBuilder qb = new BoolQueryBuilder()
.must(QueryBuilders.matchQuery("_all", term)
.analyzer("atsCustomSearchAnalyzer")
.operator(Operator.AND));
SearchRequestBuilder searchRequestBuilder = elasticsearchClient.prepareSearch("persons", "activities").setTypes("person", "activity")
.setQuery(qb)
.addHighlightedField("*").setHighlighterRequireFieldMatch(false)
.setHighlighterOrder("score")
.setHighlighterFragmentSize(150)
.setHighlighterForceSource(true)
.setSize(100)
.addIndexBoost("persons", 200)
.setFrom(offset);
return searchRequestBuilder.execute().get();
If you split your match query up so that it matches individual fields, e.g. by using a multi match query (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html), you can boost whichever fields you like. So something like:
QueryBuilder qb = new BoolQueryBuilder()
.must(QueryBuilders.multiMatchQuery(term, "firstName^3",
"lastName^3", "note")
.analyzer("atsCustomSearchAnalyzer")
.operator(Operator.AND));
should boost firstName and lastName 3 times relative to the note field.
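If the boosts are kept in configuration rather than hard-coded, the "field^boost" strings can be generated. The following is a small sketch under that assumption; the class and method names are hypothetical, and the resulting strings are meant to be passed as the field names of multiMatchQuery as in the snippet above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BoostedFields {

    // Builds "name^boost" specs; a boost of 1 is the default, so the suffix is omitted.
    static List<String> toFieldSpecs(Map<String, Integer> boosts) {
        List<String> specs = new ArrayList<>();
        for (Map.Entry<String, Integer> e : boosts.entrySet()) {
            int boost = e.getValue();
            specs.add(boost == 1 ? e.getKey() : e.getKey() + "^" + boost);
        }
        return specs;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("firstName", 3);
        boosts.put("lastName", 3);
        boosts.put("note", 1);
        System.out.println(toFieldSpecs(boosts));
    }
}
```

The list can be converted with toArray(new String[0]) before handing it to the varargs parameter.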

How to sort by _doc using elasticsearch java client

I want to iterate over an entire Elasticsearch index/type. I am using scroll in the Java client as below:
SearchResponse scrollResp = client.prepareSearch(test)
.setSearchType(SearchType.SCAN)
.setScroll(new TimeValue(60000))
.setQuery(qb)
.setSize(100).execute().actionGet();
The docs suggest the following:
"Scroll requests have optimizations that make them faster when the sort order is _doc. If you want to iterate over all documents regardless of the order, this is the most efficient option"
"sort": [
"_doc"
]
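How do I set the sort order to "_doc" in the Java client code above?
Use addSort. Note that the scan search type disables sorting and is deprecated as of Elasticsearch 2.1 in favor of a regular scroll sorted by _doc, so on those versions the setSearchType(SearchType.SCAN) line can be dropped: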
SearchResponse scrollResp = elasticsearchTemplate.client.prepareSearch(test)
.setSearchType(SearchType.SCAN)
.setScroll(new TimeValue(60000))
.setQuery(qb).addSort("_doc" , SortOrder.ASC)
.setSize(100).execute().actionGet();

Aggregations in Java client through JSON query - without AggregationBuilder

I am able to run aggregations via a JSON query with the HTTP-based JEST client, but not with the TCP-based Java client.
With the JEST client (HTTP REST based), an aggregation can be sent as part of the query string.
JEST sample code:
JestClientFactory factory = new JestClientFactory();
HttpClientConfig httpClientConfig = new HttpClientConfig
.Builder("http://localhost:9201")
.build();
factory.setHttpClientConfig(httpClientConfig);
JestClient client = factory.getObject();
String queryString ="{\"query\":{\"match_all\": {}},\"aggs\":{\"avg1\":{\"avg\":{\"field\":\"age\"} } }}";
Search.Builder searchBuilder = new Search.Builder(queryString)
.addIndex("st1index")
.addType("st1type");
SearchResult response = client.execute(searchBuilder.build());
System.out.println(response.getJsonString());
client.shutdownClient();
Printing the response of the JEST client shows the aggregation results.
With the TCP client in Elasticsearch, aggregations are possible through AggregationBuilder.
When I tried to send the JSON query over TCP, it did not return aggregation results.
Is there any reason why the TCP client does not support aggregations through a query string but does support them when they are added through the aggregation options?
TCP Java client sample code:
Edited
Removed WrapperQueryBuilder surrounding the queryString.
Settings settings = ImmutableSettings.settingsBuilder()
.put("cluster.name", "javaEscluster")
.put("node.name", "arivu").build();
Client client = new TransportClient(settings)
.addTransportAddress(new InetSocketTransportAddress("localhost", 9303));
String queryString ="{\"match_all\": {},\"aggs\":{\"avg1\":{\"avg\":{\"field\":\"age\"} } }}";
SearchResponse response = client.prepareSearch("st1index").setTypes("st1type").setQuery(queryString).execute().actionGet();
System.out.println("Getresponse-->" +"Index-->"+ response.toString());
//closing node
client.close();
System.out.println("completed");
This code retrieves only search results and empty aggregation result data.
Edited:
Any reference material which explains the reason would be great.
In the main documentation of the WrapperQueryBuilder class, it is stated:
A Query builder which allows building a query given JSON string or binary data provided as input. This is useful when you want to use the Java Builder API but still have JSON query strings at hand that you want to combine with other query builders.
The key word here is query, i.e. the part named query in the request you send to the ES _search endpoint:
{
"sort": {
... <--- whatever sorting definition you have goes here
},
"_source": {
... <--- whatever source definition you have goes here
},
"query": {
... <--- this is the content you can use with WrapperQueryBuilder
},
"aggs": {
... <--- whatever aggs definition you have goes here
}
}
WrapperQueryBuilder will only ever consider what fits inside that query section. As you can see, that does not include aggregations, which live in another top-level section of the request.
So, in the JSON query string you give, only the match_all is considered, because it is the only token that is allowed to appear in the query section. The aggs:{...} part is not.
"{\"match_all\": {},\"aggs\":{\"avg1\":{\"avg\":{\"field\":\"age\"} } }}"
^ ^
| |
this is valid this is NOT valid
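As a rough illustration of that split, the sketch below keeps the query fragment and the aggregations fragment apart and only joins them when assembling a full _search body. It uses plain strings so it runs without any client library; the class and method names are hypothetical.

```java
class SearchBodySections {

    // A query fragment: the only part a wrapper query can carry.
    static final String QUERY = "{\"match_all\":{}}";

    // An aggregations fragment: a separate top-level section of the request.
    static final String AGGS = "{\"avg1\":{\"avg\":{\"field\":\"age\"}}}";

    // Assembles the full body as it would be sent to the _search endpoint.
    static String fullBody(String queryJson, String aggsJson) {
        return "{\"query\":" + queryJson + ",\"aggs\":" + aggsJson + "}";
    }

    public static void main(String[] args) {
        System.out.println(fullBody(QUERY, AGGS));
    }
}
```

The assembled body has the same shape as the JEST query string in the question, which is why the HTTP client returns aggregation results while a query-only string does not.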

Obtaining string query (JSON) from SearchQuery object

For debugging purposes, I need to know what query spring-data-elasticsearch is sending to the Elasticsearch cluster. I have tried calling the toString method on the SearchQuery object, but it doesn't return what I need.
What I am doing in Java (using spring-data-elasticsearch) is:
private FilterBuilder getFilterBuilder(String id) {
return orFilter(
termFilter("yaddayaddayadda.id", id),
termFilter("blahblahblah.id", id)
);
}
FilterBuilder fb = getFilterBuilder(id);
SearchQuery sq = new NativeSearchQueryBuilder()
.withQuery(matchAllQuery())
.withFilter(fb)
.build();
I expect to get back something like this plain query, which is what I would send to the ES cluster through the REST API:
{
"query": {
"filtered": {
"filter": {
"or": [
{
"term": {
"yaddayaddayadda.id": "9"
}
},
{
"term": {
"blahblahblah.id": "9"
}
}
]
}
}
}
}
Thanks in advance!
One way to achieve this is to log the queries on the ES/server-side into the slowlog file. Open your elasticsearch.yml config file and towards the bottom uncomment/edit the two lines below:
...
index.search.slowlog.threshold.query.info: 1ms
...
index.search.slowlog.threshold.fetch.info: 1ms
...
The advantage of this solution is that whatever client technology you're using to query your ES server (Spring Data, Ruby, Browser, Javascript, etc), you'll be able to dump and debug your queries in a single location.
The SearchQuery interface has the methods getQuery() and getFilter(), which give you the information you need.
System.out.println(searchQuery.getQuery());
System.out.println(searchQuery.getFilter());
Hope this helps.
When using SearchRequest or SearchSourceBuilder, calling the toString() method on an instance gives you the actual JSON query:
SearchRequest searchRequest = new SearchRequest("index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// building the query
// ...
searchSourceBuilder.query(query);
searchRequest.source(searchSourceBuilder);
System.out.println(searchSourceBuilder.toString()); // prints json query
System.out.println(searchRequest.toString()); // prints json query + other information

How to rewrite ElasticSearch DSL query with the Java API

I have a working query for Elasticsearch, but I am having trouble executing the same query with the Elasticsearch Java API.
How can I express the query below with the Java API of ElasticSearch?
---
size: 0
query:
match_all: []
facets:
age:
statistical:
field : timestamp
It should be something like:
client.prepareSearch("yourindex")
.setTypes("yourtype")
.setQuery(QueryBuilders.matchAllQuery())
.addFacet(FacetBuilders.statisticalFacet("age").field("timestamp"))
.setSize(0)
.execute()
.actionGet();
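You can convert your query DSL to a JSON string, wrap it with QueryBuilders.wrapperQuery() or WrapperQueryBuilder, and then run the query with the Java API like this, where dslQB is the resulting wrapper query builder. Note that a wrapper query only carries the query part of the DSL, so the facets section still has to be added separately: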
SearchResponse response = client.prepareSearch("yourIndex")
.setTypes("yourType")
.setQuery(dslQB)
.setFrom(currentItem)
.setSize(pageSize)
.execute()
.actionGet();
