Elasticsearch collapse and sort using the Java API - elasticsearch-api

I am new to Elasticsearch. I need to fetch data from Elasticsearch through the Java API using Spring Boot.
I have written a search query with collapse and sort, and it works fine. But I don't know how to
rewrite this query in Java with Spring Boot. Could you please help me out?
Below is my Elasticsearch query:
GET /test/_search
{
"query": {
"function_score": {
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"match" : {
"job_status" : "SUCCESS"
}
},
{
"range": {
"input_count": {
"gte": 0
}
}
},
{
"range": {
"output_count": {
"gte": 0
}
}
},
{
"range": {
"#timestamp": {
"from" : "20/04/2020",
"to" : "26/04/2020",
"format" : "dd/MM/yyyy"
}
}
},
{
"script": {
"script": {
"source": "doc['output_count'].value < doc['input_count'].value",
"params": {}
}
}
}
]
}
}
}
}
}
},
"collapse": {
"field": "run_id.keyword"
},
"sort": [
{
"#timestamp": {
"order": "desc"
}
}
]
}
This is my Java code, which is working fine. I need your help adding the collapse and sort calls to it.
MultiSearchRequest multiRequest = new MultiSearchRequest();
SearchRequest rowCountMatchRequest = new SearchRequest();
SearchSourceBuilder rowCountMatchSearchSourceBuilder = new SearchSourceBuilder();
MultiSearchResponse response = null;
BoolQueryBuilder rowCountMatchQuery = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("job_status", Constants.SUCCESS))
        .must(QueryBuilders.rangeQuery("input_record_count").gte(0))
        .must(QueryBuilders.rangeQuery("output_record_count").gte(0))
        .must(QueryBuilders.rangeQuery("#timestamp").format("dd/MM/yyyy").gte(fromDate).lte(toDate))
        .must(QueryBuilders.scriptQuery(
                new Script("doc['output_count'].value >= doc['input_count'].value")));
rowCountMatchSearchSourceBuilder.query(rowCountMatchQuery);
rowCountMatchRequest.indices(stblstreamsetindex);
rowCountMatchRequest.source(rowCountMatchSearchSourceBuilder);
multiRequest.add(rowCountMatchRequest);
response = restHighLevelClient.msearch(multiRequest, RequestOptions.DEFAULT);
Hope I am clear with my question.

Just add a SortBuilder to the SearchSourceBuilder:
rowCountMatchSearchSourceBuilder.sort(SortBuilders.fieldSort("#timestamp").order(SortOrder.DESC));
For "collapse" it could work like this:
rowCountMatchSearchSourceBuilder.collapse(new CollapseBuilder("run_id.keyword"));
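To sanity-check what the sorted, collapsed search should return, the same logic can be sketched in plain Java with no Elasticsearch dependency. This is only an illustration, assuming each hit can be reduced to a run id and a timestamp; the CollapseSketch class, the JobDoc class and the collapseLatestPerRun method are made-up names, not part of any Elasticsearch API. Sorting by timestamp descending and collapsing on the run id should leave the most recent document per run:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CollapseSketch {
    // Hypothetical stand-in for one search hit; not an Elasticsearch type.
    static final class JobDoc {
        final String runId;
        final long timestamp;

        JobDoc(String runId, long timestamp) {
            this.runId = runId;
            this.timestamp = timestamp;
        }
    }

    // Sort by timestamp descending, then keep the first document seen per runId.
    // This mimics "sort" combined with "collapse" on the run id field.
    static List<JobDoc> collapseLatestPerRun(List<JobDoc> docs) {
        List<JobDoc> sorted = new ArrayList<>(docs);
        sorted.sort((a, b) -> Long.compare(b.timestamp, a.timestamp));
        Map<String, JobDoc> firstPerRun = new LinkedHashMap<>();
        for (JobDoc doc : sorted) {
            firstPerRun.putIfAbsent(doc.runId, doc);
        }
        return new ArrayList<>(firstPerRun.values());
    }

    public static void main(String[] args) {
        List<JobDoc> docs = Arrays.asList(
                new JobDoc("run-a", 1L),
                new JobDoc("run-b", 5L),
                new JobDoc("run-a", 9L));
        for (JobDoc doc : collapseLatestPerRun(docs)) {
            System.out.println(doc.runId + " " + doc.timestamp);
        }
    }
}
```

If the hits coming back from the real search do not match this shape (one hit per run id, newest first), the sort or collapse was probably not applied to the request that was actually sent.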

Related

Spring Data Elasticsearch query equivalent to HasChildQuery

{
"query": {
"bool": {
"must": [
{
"match": {
"Id": "xxxxxx"
}
},
{
"has_child": {
"type": "component",
"query": {
"bool": {
"should": [
{
"term": {
"type": "xxxxx"
}
},
{
"term": {
"name": "xxxxx"
}
}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
I want to replace the above query using Criteria or NativeSearchQueryBuilder. I have tried the following, but the search hangs for a long time.
QueryBuilder parentQuery = QueryBuilders.matchQuery("Id", Id);
HasChildQueryBuilder childQuery = JoinQueryBuilders.hasChildQuery("component",
        QueryBuilders.termQuery("type","xxxxx"), ScoreMode.None);
HasChildQueryBuilder childQuery2 = JoinQueryBuilders.hasChildQuery("component",
        QueryBuilders.termQuery("name","xxxxx"), ScoreMode.None);
NativeSearchQuery query = new NativeSearchQueryBuilder().withQuery(parentQuery)
        .withQuery(childQuery).withQuery(childQuery2).build();
SearchHits<ENTITY> recipeSearchHits= elasticsearchOperations.search(query, ENTITY.class);
I followed the official Spring Data documentation at https://docs.spring.io/spring-data/elasticsearch/docs/current/reference/html/#elasticsearch.jointype but I am doing something wrong here, so it seems to go into a loop.

Elasticsearch querying number of dates in array matching query

I have documents in the following form
PUT test_index/_doc/1
{
"dates" : [
"2018-07-15T14:12:12",
"2018-09-15T14:12:12",
"2018-11-15T14:12:12",
"2019-01-15T14:12:12",
"2019-03-15T14:12:12",
"2019-04-15T14:12:12",
"2019-05-15T14:12:12"],
"message" : "hello world"
}
How do I query for documents such that there are n number of dates within the dates array falling in between two specified dates?
For example: Find all documents with 3 dates in the dates array falling in between "2018-05-15T14:12:12" and "2018-12-15T14:12:12" -- this should return the above document as "2018-07-15T14:12:12", "2018-09-15T14:12:12" and "2018-11-15T14:12:12" fall between "2018-05-15T14:12:12" and "2018-12-15T14:12:12".
I recently faced the same problem and came up with two solutions.
1) If you do not want to change your current mapping, you can query for the documents using query_string. Note that you will have to build the query string from the dates in your range, for example "\"2019-04-08\" OR \"2019-04-09\" OR \"2019-04-10\" ":
{
"query": {
"query_string": {
"default_field": "dates",
"query": "\"2019-04-08\" OR \"2019-04-09\" OR \"2019-04-10\" "
}
}
}
However, this type of query only makes sense if the range is short.
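Building that query string by hand gets tedious, so here is a minimal sketch of generating it from a start and end date in plain Java. The DateQueryString class and the buildOrQuery method are made-up names, and the sketch assumes the dates are matched day by day in yyyy-MM-dd form, as in the example above:

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class DateQueryString {
    // Builds a query_string expression with one quoted term per day,
    // from the start date up to and including the end date.
    static String buildOrQuery(LocalDate from, LocalDate toInclusive) {
        List<String> terms = new ArrayList<>();
        for (LocalDate day = from; !day.isAfter(toInclusive); day = day.plusDays(1)) {
            // LocalDate.toString() yields the ISO form, e.g. 2019-04-08.
            terms.add("\"" + day + "\"");
        }
        return String.join(" OR ", terms);
    }

    public static void main(String[] args) {
        String query = buildOrQuery(LocalDate.of(2019, 4, 8), LocalDate.of(2019, 4, 10));
        System.out.println(query);
    }
}
```

The returned string would go into the "query" value of the query_string clause. The number of terms grows with every day in the range, which is why this only suits short ranges.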
2) The second way is to use a nested field, but you will have to change your current mapping as follows:
{
"properties": {
"dates": {
"type": "nested",
"properties": {
"key": {
"type": "date",
"format": "yyyy-MM-dd"
}
}
}
}
}
Your query will then look something like this:
{
"query": {
"nested": {
"path": "dates",
"query": {
"bool": {
"must": [
{
"range": {
"dates.key": {
"gte": "2018-04-01",
"lte": "2018-12-31"
}
}
}
]
}
}
}
}
}
You can create dates as a nested document and use bucket selector aggregation.
{
"empId":1,
"dates":[
{
"Days":"2019-01-01"
},
{
"Days":"2019-01-02"
}
]
}
Mapping:
"mappings" : {
"properties" : {
"empId" : {
"type" : "keyword"
},
"dates" : {
"type" : "nested",
"properties" : {
"Days" : {
"type" : "date"
}
}
}
}
}
GET profile/_search
{
"query": {
"bool": {
"filter": {
"nested": {
"path": "dates",
"query": {
"range": {
"dates.Days": {
"format": "yyyy-MM-dd",
"gte": "2019-05-01",
"lte": "2019-05-30"
}
}
}
}
}
}
},
"aggs": {
"terms_parent_id": {
"terms": {
"field": "empId"
},
"aggs": {
"availabilities": {
"nested": {
"path": "dates"
},
"aggs": {
"avail": {
"range": {
"field": "dates.Days",
"ranges": [
{
"from": "2019-05-01",
"to": "2019-05-30"
}
]
},
"aggs": {
"count_Total": {
"value_count": {
"field": "dates.Days"
}
}
}
},
"max_hourly_inner": {
"max_bucket": {
"buckets_path": "avail>count_Total"
}
}
}
},
"bucket_selector_page_id_term_count": {
"bucket_selector": {
"buckets_path": {
"children_count": "availabilities>max_hourly_inner"
},
"script": "params.children_count>=19;" ---> give the number of days that should match
}
},
"hits": {
"top_hits": {
"size": 10
}
}
}
}
}
}
I found my own answer to this, although I'm not sure how efficient it is compared to the other answers:
GET test_index/_search
{
"query":{
"bool" : {
"filter" : {
"script" : {
"script" : {"source":"""
int count = 0;
for (int i=0; i<doc['dates'].length; ++i) {
if (params.first_date < doc['dates'][i].toInstant().toEpochMilli() && doc['dates'][i].toInstant().toEpochMilli() < params.second_date) {
count += 1;
}
}
if (count >= 2) {
return true
} else {
return false
}
""",
"lang":"painless",
"params": {
"first_date": 1554818400000,
"second_date": 1583020800000
}
}
}
}
}
}
}
where the parameters are the two dates in epoch time. I've chosen 2 matches here, but obviously you can generalise to any number.
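For anyone who wants to check the counting logic outside Elasticsearch, here is the same loop as a rough plain-Java sketch. The DateRangeCount class and its method names are made up for illustration, and the sketch assumes the date strings carry no time zone and are interpreted as UTC:

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;

class DateRangeCount {
    // Converts an ISO local date-time string to epoch millis, assuming UTC.
    static long toEpochMillis(String isoDateTime) {
        return LocalDateTime.parse(isoDateTime).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Counts the dates strictly between the two bounds, like the script's loop.
    static int countBetween(List<String> dates, long firstDate, long secondDate) {
        int count = 0;
        for (String date : dates) {
            long millis = toEpochMillis(date);
            if (firstDate < millis && millis < secondDate) {
                count += 1;
            }
        }
        return count;
    }

    // True when at least minCount dates fall inside the range.
    static boolean matches(List<String> dates, long firstDate, long secondDate, int minCount) {
        return countBetween(dates, firstDate, secondDate) >= minCount;
    }

    public static void main(String[] args) {
        List<String> dates = Arrays.asList(
                "2018-07-15T14:12:12", "2018-09-15T14:12:12", "2018-11-15T14:12:12",
                "2019-01-15T14:12:12", "2019-03-15T14:12:12", "2019-04-15T14:12:12",
                "2019-05-15T14:12:12");
        long first = toEpochMillis("2018-05-15T14:12:12");
        long second = toEpochMillis("2018-12-15T14:12:12");
        System.out.println(countBetween(dates, first, second));
    }
}
```

The minCount argument plays the role of the hard-coded 2 in the script. In the script itself it could be passed as an extra entry in params rather than written into the source.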

How to join two queries in one using elasticsearch?

Hi, I want to join two queries into one in Elasticsearch, but I don't know how to do it. I think I should use an aggregation, but it is not clear to me how. Could you help me? My ES version is 5.1.2.
First filter by status and name:
POST test_lite/_search
{
"aggs": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"match": {
"STATUS": "Now"
}
},
{
"match": {
"NAME": "PRUDENTL"
}
}
]
}
}
}
}
}
Then search the filtered records for the word "english" in the description:
POST /test_lite/_search
{
"query": {
"wildcard" : { "DESCRIPTION" : "*english*" }
}
}
The only query needed is:
POST test_lite/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"STATUS": "Now"
}
},
{
"match": {
"NAME": "PRUDENTL"
}
},
{"wildcard" : { "DESCRIPTION" : "*english*" }}
]
}
}
}

Elasticsearch aggregation using a bool filter

I have the following query, which works fine on Elasticsearch 1.x but does not work on 2.x (I get doc_count: 0) since the bool filter has been deprecated. It's not quite clear to me how to rewrite this query using the new Bool Query.
{
"aggregations": {
"events_per_period": {
"filter": {
"bool": {
"must": [
{
"terms": {
"message.facility": [
"facility1",
"facility2",
"facility3"
]
}
}
]
}
}
}
},
"size": 0
}
Any help is greatly appreciated.
I think you might want an aggregation on multiple fields with a filter.
Here I assume a filter on id and an aggregation on facility1 and facility2.
{
"_source":false,
"query": {
"match": {
"id": "value"
}
},
"aggregations": {
"byFacility1": {
"terms": {
"field": "facility1"
},
"aggs": {
"byFacility2": {
"terms": {
"field": "facility2"
}
}
}
}
}
}
If you want to aggregate on three fields, check link.
For a Java implementation, see link2.

Convert elasticsearch query to java with multi aggregation

I am using Elasticsearch in Java 1.7.5, and after testing the query in the console I want to transform it into Java code. It is a query with multiple sub-aggregations, which is what confuses me.
{
"query": {
"bool": {
"must": [
{
"range": {
"rawlog.auAid": {
"from": "3007145536"
}
}
},
{
"term": {
"rawlog.ip": "118.70.204.171"
}
}
],
"must_not": [],
"should": []
}
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "loggedTime",
"interval": "second"
},
"aggs": {
"id": {
"terms": {
"field": "auAid"
}
},
"url": {
"terms": {
"field": "urlId1"
}
},
"devVerId": {
"terms": {
"field": "devVerId"
}
},
"devTypeId": {
"terms": {
"field": "devTypeId"
}
},
"osVerId": {
"terms": {
"field": "osVerId"
}
},
"browserId": {
"terms": {
"field": "browserId"
}
}
}
}
}
}
Can anyone help me do this? Thanks so much.
You have everything you need in the documentation here and here, but it basically goes like this:
// 1. build the query
QueryBuilder qb = boolQuery()
        // the L suffix is required: 3007145536 does not fit in an int
        .must(rangeQuery("rawlog.auAid").from(3007145536L))
        .must(termQuery("rawlog.ip", "118.70.204.171"));
// 2. build the aggregations
AggregationBuilder articlesOverTime =
        AggregationBuilders
                .dateHistogram("articles_over_time")
                .field("loggedTime")
                .interval(DateHistogramInterval.SECOND);
articlesOverTime.subAggregation(AggregationBuilders.terms("id").field("auAid"));
articlesOverTime.subAggregation(AggregationBuilders.terms("url").field("urlId1"));
articlesOverTime.subAggregation(AggregationBuilders.terms("devVerId").field("devVerId"));
articlesOverTime.subAggregation(AggregationBuilders.terms("devTypeId").field("devTypeId"));
articlesOverTime.subAggregation(AggregationBuilders.terms("osVerId").field("osVerId"));
articlesOverTime.subAggregation(AggregationBuilders.terms("browserId").field("browserId"));
// 3. make the query
SearchResponse sr = node.client().prepareSearch()
        .setQuery(qb)
        .addAggregation(articlesOverTime)
        .execute().actionGet();
