Make "query + aggregations" elasticsearch, using java query dsl - spring-boot

Is it possible to build a query with aggregations (Elasticsearch) using the Java query DSL?

Elasticsearch provides a client library that helps you build searches. You can find more about it here.
Here's an example of how you can do it:
// build the client
HttpHost host = new HttpHost("localhost", 9200, "http");
RestHighLevelClient client = new RestHighLevelClient(RestClient
.builder(new HttpHost[]{host}));
// build the search (set the conditions here)
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.must(QueryBuilders.rangeQuery("age")
.from(25)
.to(40));
// build the aggregations (set the aggregations here)
TermsAggregationBuilder groupByGender = AggregationBuilders.terms("gender")
.field("gender")
.size(5);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(boolQueryBuilder);
sourceBuilder.aggregation(groupByGender);
// create and execute the search request
SearchRequest request = new SearchRequest()
.indices("customers")
.types("customer")
.allowPartialSearchResults(false)
.source(sourceBuilder)
.requestCache(true);
SearchResponse response = client.search(request, RequestOptions.DEFAULT);
which will produce something like:
GET customers/customer/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"age": {
"gte": 25,
"lte": 40
}
}
}
]
}
},
"aggs": {
"gender": {
"terms": {
"field": "gender",
"size": 5
}
}
}
}
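To make the semantics of that request concrete, here is a plain-Java analogy (no Elasticsearch client involved): the query narrows the documents first, and the terms aggregation then counts the survivors per gender value. The Customer class and the sample data are made up for illustration, and both bounds are treated as inclusive, which is how from/to behave by default as far as I know.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class QueryThenAggregateSketch {

    // Hypothetical document shape, for illustration only.
    static final class Customer {
        final String gender;
        final int age;

        Customer(String gender, int age) {
            this.gender = gender;
            this.age = age;
        }
    }

    // Mirrors the request: keep ages in [from, to] (both inclusive),
    // then count the remaining documents per gender value.
    static Map<String, Long> countByGender(List<Customer> docs, int from, int to) {
        return docs.stream()
                .filter(c -> c.age >= from && c.age <= to)
                .collect(Collectors.groupingBy(c -> c.gender, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Customer> docs = List.of(
                new Customer("female", 25),
                new Customer("male", 31),
                new Customer("female", 40),
                new Customer("male", 52));
        System.out.println(countByGender(docs, 25, 40)); // prints {female=2, male=1}
    }
}
```

The point of the analogy is the ordering: the aggregation only ever sees documents that matched the query.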

Related

Elasticsearch search templates - How to construct the search terms in NEST

I have a search template to which I am trying to pass a couple of parameters.
How can I construct my search terms using NEST to get the following result?
Template
PUT _scripts/company-index-template
{
"script": {
"lang": "mustache",
"source": "{\"query\": {\"bool\": {\"filter\":{{#toJson}}clauses{{/toJson}},\"must\": [{\"query_string\": {\"fields\": [\"companyTradingName^2\",\"companyName\",\"companyContactPerson\"],\"query\": \"{{query}}\"}}]}}}",
"params": {
"query": "",
"clauses": []
}
}
}
The DSL query looks as follows:
GET company-index/_search/template
{
"id": "company-index-template",
"params": {
"query": "sky*",
"clauses": [
{
"terms": {
"companyGroupId": [
1595
]
}
},
{
"terms": {
"companyId": [
158,
836,
1525,
2298,
2367,
3176,
3280
]
}
}
]
}
}
I would like to construct the above query in NEST but can't seem to find a good way to generate the clauses value.
This is what I have so far...
var responses = this.client.SearchTemplate<Company>(descriptor =>
descriptor
.Index(SearchConstants.CompanyIndex)
.Id("company-index-template")
.Params(objects => objects
.Add("query", queryBuilder.Query)
.Add("clauses", "*How do I construct this JSON*")));
UPDATE:
This is how I ended up doing it. I just created a dictionary with all my terms in it.
I think there might be a better way of doing it, but I can't find it.
var filterClauses = new List<Dictionary<string, object>>
{
new() {{"terms", new Dictionary<string, object> {{"companyGroupId", companyGroupId}}}},
new() {{"terms", new Dictionary<string, object> {{"companyId", availableCompanies}}}}
};
I then had to serialize it when passing it to the Params method.
var response = this.client.SearchTemplate<Company>(descriptor =>
descriptor.Index(SearchConstants.CompanyIndex)
.Id("company-index-template")
.Params(objects => objects
.Add("query", "*" + query + "*")
.Add("clauses", JsonConvert.SerializeObject(filterClauses))));
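For readers on the Java side, the same clauses value can be sketched with plain collections: one terms map per filter, collected in a list. The toJson helper below is hypothetical and only handles maps, lists, numbers, and plain strings (no escaping); a real project would more likely use a JSON library for that step.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ClausesSketch {

    // Builds one {"terms": {field: values}} clause.
    static Map<String, Object> termsClause(String field, List<Integer> values) {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put(field, values);
        Map<String, Object> clause = new LinkedHashMap<>();
        clause.put("terms", inner);
        return clause;
    }

    // Hypothetical minimal serializer: maps, lists, numbers, and plain strings only.
    static String toJson(Object value) {
        if (value instanceof Map) {
            return ((Map<?, ?>) value).entrySet().stream()
                    .map(e -> "\"" + e.getKey() + "\":" + toJson(e.getValue()))
                    .collect(Collectors.joining(",", "{", "}"));
        }
        if (value instanceof List) {
            return ((List<?>) value).stream()
                    .map(ClausesSketch::toJson)
                    .collect(Collectors.joining(",", "[", "]"));
        }
        if (value instanceof Number) {
            return value.toString();
        }
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        List<Map<String, Object>> clauses = List.of(
                termsClause("companyGroupId", List.of(1595)),
                termsClause("companyId", List.of(158, 836)));
        System.out.println(toJson(clauses));
        // prints [{"terms":{"companyGroupId":[1595]}},{"terms":{"companyId":[158,836]}}]
    }
}
```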

Elasticsearch Java - use Search Template to query

The code below works fine in my Java service.
Query searchQuery = new StringQuery(
"{\"bool\":{\"must\":[{\"match\":{\"id\":\"" + id + "\"}}]}}");
SearchHits<Instance> instanceSearchHits = elasticsearchOperations.search(searchQuery, Instance.class, IndexCoordinates.of("test"));
log.info("hits :: " + instanceSearchHits.getSearchHits().size());
Now I want to save this query as a template in Elasticsearch, so that the Java service only passes the template id and the params to execute the query.
Search template added in Elasticsearch:
PUT _scripts/search-template-1
{
"script": {
"lang": "mustache",
"source": {
"query": {
"bool": {
"must": [
{
"term": {
"id": "{{id}}"
}
}
]
}
}
},
"params": {
"id": "id to search"
}
}
}
Calling this template:
GET test/_search/template
{
"id": "search-template-1",
"params": {
"id": "f52c2c62-e921-4410-847f-25ea0f3eeb40"
}
}
Unfortunately, I am not able to find an API reference for calling this search template from Java (spring-data-elasticsearch).
As mentioned by val, this can be used to call the search template query from Java:
SearchTemplateRequest request = new SearchTemplateRequest();
request.setRequest(new SearchRequest("posts"));
request.setScriptType(ScriptType.STORED);
request.setScript("title_search");
Map<String, Object> params = new HashMap<>();
params.put("field", "title");
params.put("value", "elasticsearch");
params.put("size", 5);
request.setScriptParams(params);
SearchTemplateResponse response = client.searchTemplate(request, RequestOptions.DEFAULT);
SearchResponse searchResponse = response.getResponse();
SearchHits searchHits = searchResponse.getHits();
log.info("max score :: " + searchHits.getMaxScore());
searchHits.forEach(searchHit -> {
log.info("this is the response, " + searchHit.getSourceAsString());
});
This is currently not yet possible. There is an issue for this in Spring Data Elasticsearch.
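To see what the stored template from the question turns into once the params are applied, the substitution can be imitated in plain Java. This is only a naive stand-in for Mustache rendering (it replaces {{name}} placeholders and nothing else), and the render helper is hypothetical; Elasticsearch does the real rendering on the server.

```java
import java.util.Map;

class TemplateRenderSketch {

    // Naive placeholder substitution: replaces each {{key}} with its value.
    // Not a Mustache implementation; sections, escaping, and toJson are ignored.
    static String render(String template, Map<String, String> params) {
        String result = template;
        for (Map.Entry<String, String> p : params.entrySet()) {
            result = result.replace("{{" + p.getKey() + "}}", p.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "{\"query\":{\"bool\":{\"must\":[{\"term\":{\"id\":\"{{id}}\"}}]}}}";
        String body = render(source, Map.of("id", "f52c2c62-e921-4410-847f-25ea0f3eeb40"));
        System.out.println(body);
    }
}
```

Printing the rendered body this way can help when checking that the params map uses the same keys as the placeholders in the stored script.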

Spring data Elasticsearch count in groups by date range

I use Spring Data Elasticsearch and have the following documents in my index.
{"appUserId": "id-test-app-user11", "apkId": 1, "event": "INSTALL", "date": "2020-06-01"}
...
{"appUserId": "id-test-app-user168", "apkId": 1, "event": "INSTALL", "date": "2020-12-06"}
I want to count, per day, the number of installs of an apkId within a date range.
With this request, I can get all data within my date range for an apkId provided as a parameter:
LocalDate today = LocalDate.now();
LocalDate beginningDate = today.minusDays(intervalle);
BoolQueryBuilder query = QueryBuilders.boolQuery();
query.must(QueryBuilders.rangeQuery("date")
.gte(convertToDateViaInstant(beginningDate))
.lte(convertToDateViaInstant(today)));
query.must(QueryBuilders.matchQuery("apkId", apkId));
query.must(QueryBuilders.matchQuery("event", Event.INSTALL));
return apkHistoryRepo.search(query);
But I don't know how to aggregate by date in order to have something like
{"2020-06-01": "500"}
...
{"2020-12-06": "10"}
How could I achieve this?
Thanks in advance.
You are looking for the date histogram aggregation. Here is how you can use it in your query:
{
"query": {
"bool": {
"must": [
{
"term": {
"apkId": "1"
}
},
{
"term": {
"event": "INSTALL"
}
},
{
"range": {
"date": {
"gte": <start_date_here>,
"lte": <end_date_here>
}
}
}
]
}
},
"aggs": {
"per_day_count": {
"date_histogram": {
"field": "date",
"calendar_interval": "1d"
}
}
}
}
Final solution in Java
LocalDate today = LocalDate.now();
LocalDate beginningDate = today.minusDays(intervalle);
BoolQueryBuilder query = QueryBuilders.boolQuery();
query.must(QueryBuilders.rangeQuery("date")
.gte(beginningDate)
.lte(today));
query.must(QueryBuilders.matchQuery("apkId", apkId));
query.must(QueryBuilders.matchQuery("event", Event.INSTALL));
Iterable<ApkHistory> list = apkHistoryRepo.search(query);
AggregationBuilder aggregation = AggregationBuilders
.dateHistogram("nb_install_per_day")
.field("date")
.dateHistogramInterval(DateHistogramInterval.DAY);
SearchRequest searchRequest = new SearchRequest("apkhistory");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(query).aggregation(aggregation);
searchRequest.source(searchSourceBuilder);
try {
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
ParsedDateHistogram dateGroupBy = searchResponse.getAggregations().get("nb_install_per_day");
List<? extends Histogram.Bucket> bucketList = dateGroupBy.getBuckets();
for (Histogram.Bucket b : bucketList) {
System.out.println(b.getKeyAsString() + " "+b.getDocCount());
}
System.out.println("test");
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
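What the date histogram computes can be pictured in plain Java: filter by the date range, then count documents per calendar day. This is an analogy with made-up sample dates, not client code, and the countPerDay helper is hypothetical. Note that the sketch only lists days that have at least one event.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class InstallsPerDaySketch {

    // Counts events per calendar day inside [from, to], both inclusive.
    // Only days with at least one event appear in the result.
    static Map<LocalDate, Long> countPerDay(List<LocalDate> eventDates, LocalDate from, LocalDate to) {
        return eventDates.stream()
                .filter(d -> !d.isBefore(from) && !d.isAfter(to))
                .collect(Collectors.groupingBy(d -> d, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<LocalDate> installs = List.of(
                LocalDate.parse("2020-06-01"),
                LocalDate.parse("2020-06-01"),
                LocalDate.parse("2020-06-03"),
                LocalDate.parse("2020-12-06"));
        Map<LocalDate, Long> perDay = countPerDay(installs,
                LocalDate.parse("2020-06-01"), LocalDate.parse("2020-06-30"));
        perDay.forEach((day, count) -> System.out.println(day + " " + count));
        // prints:
        // 2020-06-01 2
        // 2020-06-03 1
    }
}
```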

Elastic Search NEST client raw + custom query

I'm using NEST client for querying ES, but now I have a specific situation - I'm trying to proxy query to ES, but with specific query applied by default:
public IEnumerable<TDocument> Search<TDocument>(string indexName, string query, string sort, int page, int pageSize) where TDocument : class
{
var search = new SearchRequest(indexName)
{
From = page,
Size = pageSize,
Query = new RawQuery(query),
};
var response = this.client.Search<TDocument>(search);
return response.Documents;
}
The code above just proxies the query to ES, but what if I need a specific filter that is always applied along with the passed query?
For example, I'd want the Active field to be true by default. How can I merge this raw query with a specific, always-applied filter (without concatenating strings to build the merged ES API call, if possible)?
Assuming that query is well formed JSON that corresponds to the query DSL, you could deserialize it into an instance of QueryContainer and combine it with other queries. For example
var client = new ElasticClient();
string query = @"{
""multi_match"": {
""query"": ""hello world"",
""fields"": [
""description^2.2"",
""myOtherField^0.3""
]
}
}";
QueryContainer queryContainer = null;
using (var stream = client.ConnectionSettings.MemoryStreamFactory.Create(Encoding.UTF8.GetBytes(query)))
{
queryContainer = client.RequestResponseSerializer.Deserialize<QueryContainer>(stream);
}
queryContainer = queryContainer && +new TermQuery
{
Field = "another_field",
Value = "term"
};
var searchResponse = client.Search<TDocument>(s => s.Query(q => queryContainer));
which will translate to the following query (assuming default index is _all)
POST http://localhost:9200/_all/_search?pretty=true&typed_keys=true
{
"query": {
"bool": {
"filter": [{
"term": {
"another_field": {
"value": "term"
}
}
}],
"must": [{
"multi_match": {
"fields": ["description^2.2", "myOtherField^0.3"],
"query": "hello world"
}
}]
}
}
}

Can't nest terms aggregations more than two deep without further aggs being ignored?

I'm querying ElasticSearch using the Nest library for C#, to fetch graph data with multiple pivots. Each pivot is a nested TermsAggregation on a query, and everything works fine with one or two pivots. Once I get to three pivots, though, the SearchRequest object won't generate further aggregations.
The code to build the aggregations looks like this:
TermsAggregation topTermAgg = null;
TermsAggregation currentAgg = null;
foreach (var pivotName in activePivots)
{
var newTermAgg = new TermsAggregation("pivot")
{
Field = pivotName.ToString().ToLower()
};
if (topTermAgg == null)
{
topTermAgg = newTermAgg;
}
else
{
currentAgg.Aggregations = newTermAgg;
}
currentAgg = newTermAgg;
}
The SearchRequest itself is pretty straightforward:
var searchRequest = new SearchRequest(Indices.Index("a", "b", "c"))
{
Size = 0,
Aggregations = topTermAgg,
Query = query,
};
Unfortunately, the SearchRequest for 3 or more pivots, when converted to string, looks like this (via nestClient.Serializer.SerializeToString(searchRequest)):
{
"size": 0,
"query": {
"bool": <Fairly complex query, that works fine. It's the aggregation that has the problem.>
},
"aggs": {
"pivot": {
"terms": {
"field": "pivot1"
},
"aggs": {
"pivot": {
"terms": {
"field": "pivot2"
}
}
}
}
}
}
When I inspect the searchRequest object in the debugger, it quite definitely has 3 or more aggregations. What's going on here, and how can I get 3 or more nested terms aggregations to work properly?
I am using Nest version 5.01.
This must be related to the way in which you're building up the nested aggregations. Arbitrarily deep nested aggregations can be built with the client. Here's an example of a three deep nested aggregation
client.Search<Question>(s => s
.Aggregations(a => a
.Terms("top", ta => ta
.Field("top_field")
.Aggregations(aa => aa
.Terms("nested_1", nta => nta
.Field("nested_field_1")
.Aggregations(aaa => aaa
.Terms("nested_2", nnta => nnta
.Field("nested_field_3")
)
)
)
)
)
)
);
which serializes to
{
"aggs": {
"top": {
"terms": {
"field": "top_field"
},
"aggs": {
"nested_1": {
"terms": {
"field": "nested_field_1"
},
"aggs": {
"nested_2": {
"terms": {
"field": "nested_field_3"
}
}
}
}
}
}
}
}
You can also add values to AggregationDictionary directly
var request = new SearchRequest<Question>
{
Aggregations = new AggregationDictionary
{
{ "top", new TermsAggregation("top")
{
Field = "top_field",
Aggregations = new AggregationDictionary
{
{ "nested_1", new TermsAggregation("nested_1")
{
Field = "nested_field_1",
Aggregations = new AggregationDictionary
{
{ "nested_2", new TermsAggregation("nested_2")
{
Field = "nested_field_2"
}
}
}
}
}
}
}
}
}
};
client.Search<Question>(request);
is the same as the previous request. You can shorten this even further to
var request = new SearchRequest<Question>
{
Aggregations = new TermsAggregation("top")
{
Field = "top_field",
Aggregations = new TermsAggregation("nested_1")
{
Field = "nested_field_1",
Aggregations = new TermsAggregation("nested_2")
{
Field = "nested_field_2"
}
}
}
};
client.Search<Question>(request);
I got my code working by constructing the aggregation from the bottom-up, rather than from the top-down.
var terminalAggregation = <some aggregation. In my code, there's a lowest aggregation that's different from the rest. For the code I presented, you could just build the lowest pivot.>
TermsAggregation topTermAgg = null;
activePivots.Reverse();
foreach (var pivotName in activePivots)
{
var newTermAgg = new TermsAggregation("pivot")
{
Field = pivotName.ToString().ToLower(),
Aggregations = topTermAgg ?? terminalAggregation
};
topTermAgg = newTermAgg;
}
This looks like a bug in the Nest library; there are different classes like AggregationBase and BucketAggregationBase and AggregationDictionary that are all assignable to the "Aggregations" property, but it seems like there's some subtle flaw after the second assignment when you do this recursively.
The documentation is also not up to date: it claims that you can create an AggregationDictionary yourself, but since AggregationDictionary doesn't have a public Add() method, I really can't. Nor can I use C#'s {}-after-instantiation syntax to populate it, again because Add() is not public.
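The bottom-up idea does not depend on NEST. As a sketch in plain Java (maps standing in for the client's aggregation classes; the nest helper and the pivot_N names are made up here), each new level wraps the level built before it, so no reference to an inner level has to be patched afterwards. Each level also gets its own name in this sketch, which keeps the levels easy to tell apart.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedAggsSketch {

    // Builds {"pivot_0": {"terms": {...}, "aggs": {"pivot_1": {...}}}} from the
    // innermost field outwards. Returns an empty map for an empty field list.
    static Map<String, Object> nest(List<String> fields) {
        Map<String, Object> inner = new LinkedHashMap<>();
        for (int i = fields.size() - 1; i >= 0; i--) {
            Map<String, Object> terms = new LinkedHashMap<>();
            terms.put("field", fields.get(i));

            Map<String, Object> agg = new LinkedHashMap<>();
            agg.put("terms", terms);
            if (!inner.isEmpty()) {
                agg.put("aggs", inner); // attach the level built in the previous pass
            }

            Map<String, Object> level = new LinkedHashMap<>();
            level.put("pivot_" + i, agg);
            inner = level;
        }
        return inner;
    }

    public static void main(String[] args) {
        System.out.println(nest(List.of("pivot1", "pivot2", "pivot3")));
    }
}
```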

Resources