I have the Elasticsearch query below.
What would the equivalent Java API code be?
GET my_index/_search
{
"aggs": {
"bucket_id": {
"terms": {
"field": "id"
, "size": 1000
},
"aggs": {
"bucket_name": {
"terms": {
"field": "name.keyword"
, "size": 1
}
}
}
}
}
}
I figured this out:
AggregationBuilder aggregationBuilder = AggregationBuilders.terms("bucket_id").field("id").size(1000);
// size(1) mirrors the "size": 1 of the bucket_name aggregation in the query above
aggregationBuilder.subAggregation(AggregationBuilders.terms("bucket_name").field("name.keyword").size(1));
In my contains field I have the value "xr" as well as "xra", "xrb", and "xrc" as separate values. When I query for the count of "xr", Elasticsearch returns 4 instead of 1. How can I handle this?
This is my query:
"aggs": {
"Group1": {
"terms": {
"field": "method.keyword",
"include": ".*POST.*"
},
"aggs": {
"Group3": {
"terms": {
"field": "contains.keyword",
"size": 11593
}
}
}
}
}
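The query above does not show how the count for "xr" is requested, so the following is only a sketch under an assumption: that the value is selected with a wildcard pattern in the same style as the `include` on `method.keyword` (for example `.*xr.*`). The class and method names are hypothetical, and plain Java regular expressions stand in for the Elasticsearch pattern matching. Both are matched against the whole value here, which is why the exact pattern `xr` selects a single value while the wildcard form selects all four.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class IncludePatternSketch {
    // Counts the values that the pattern matches in full (anchored match).
    static long countMatching(List<String> values, String regex) {
        Pattern p = Pattern.compile(regex);
        return values.stream().filter(v -> p.matcher(v).matches()).count();
    }

    public static void main(String[] args) {
        // The four values described in the question.
        List<String> contains = Arrays.asList("xr", "xra", "xrb", "xrc");
        System.out.println(countMatching(contains, ".*xr.*")); // prints 4
        System.out.println(countMatching(contains, "xr"));     // prints 1
    }
}
```

If that assumption holds, restricting the pattern to the exact value would be one way to get a count of 1.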
I am using Elasticsearch 1.6.0.
Here is my aggregation query:
GET /a/dummydata/_search
{
"size": 0,
"aggs": {
"sum_trig_amber": {
"terms": {
"field": "TRIGGER_COUNT_AMBER"
}
},
"sum_trig_green": {
"terms": {
"field": "TRIGGER_COUNT_GREEN"
}
},
"sum_trig_red": {
"terms": {
"field": "TRIGGER_COUNT_RED"
}
}
}
}
Is there any way to add the three results together (sum_trig_amber + sum_trig_red + sum_trig_green)?
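One way to read the question is that a single total across the three fields is wanted. As a rough sketch under that assumption, the totals could be combined on the client: each terms bucket has a key (the field value) and a document count, so a field's total is the sum of key × count over its buckets. The class name, method names, and sample numbers below are hypothetical, and the buckets are assumed to have already been copied into plain maps. Note that a terms aggregation returns only the top 10 buckets by default, so the total is exact only if every bucket is returned.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TriggerTotalsSketch {
    // Reconstructs the total of a numeric field from its terms buckets:
    // each entry maps a field value (bucket key) to its document count.
    static long fieldTotal(Map<Long, Long> buckets) {
        long total = 0;
        for (Map.Entry<Long, Long> bucket : buckets.entrySet()) {
            total += bucket.getKey() * bucket.getValue();
        }
        return total;
    }

    // Adds the totals of the three trigger-count fields.
    static long combinedTotal(Map<Long, Long> amber, Map<Long, Long> green, Map<Long, Long> red) {
        return fieldTotal(amber) + fieldTotal(green) + fieldTotal(red);
    }

    public static void main(String[] args) {
        // Made-up bucket data, for illustration only.
        Map<Long, Long> amber = new LinkedHashMap<>();
        amber.put(2L, 3L);  // value 2 appears in 3 documents
        amber.put(5L, 1L);  // value 5 appears in 1 document
        Map<Long, Long> green = new LinkedHashMap<>();
        green.put(1L, 4L);
        Map<Long, Long> red = new LinkedHashMap<>();
        red.put(0L, 10L);
        red.put(7L, 2L);

        System.out.println(combinedTotal(amber, green, red)); // prints 29
    }
}
```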
I am using Elasticsearch 1.7.5 from Java, and I want to translate the console query below into Java code. It is a query with multiple sub-aggregations, which is what confuses me.
{
"query": {
"bool": {
"must": [
{
"range": {
"rawlog.auAid": {
"from": "3007145536"
}
}
},
{
"term": {
"rawlog.ip": "118.70.204.171"
}
}
],
"must_not": [],
"should": []
}
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "loggedTime",
"interval": "second"
},
"aggs": {
"id": {
"terms": {
"field": "auAid"
}
},
"url": {
"terms": {
"field": "urlId1"
}
},
"devVerId": {
"terms": {
"field": "devVerId"
}
},
"devTypeId": {
"terms": {
"field": "devTypeId"
}
},
"osVerId": {
"terms": {
"field": "osVerId"
}
},
"browserId": {
"terms": {
"field": "browserId"
}
}
}
}
}
}
Can anyone help me with this? Thanks so much.
You have everything you need in the documentation here and here, but it basically goes like this:
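// Assumes: import static org.elasticsearch.index.query.QueryBuilders.*;
// 1. build the query
QueryBuilder qb = boolQuery()
.must(rangeQuery("rawlog.auAid").from(3007145536L)) // long literal: the value exceeds the int range
.must(termQuery("rawlog.ip", "118.70.204.171"));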
// 2. build the aggregations
AggregationBuilder articlesOverTime =
AggregationBuilders
.dateHistogram("articles_over_time")
.field("loggedTime")
.interval(DateHistogramInterval.SECOND);
articlesOverTime.subAggregation(AggregationBuilders.terms("id").field("auAid"));
articlesOverTime.subAggregation(AggregationBuilders.terms("url").field("urlId1"));
articlesOverTime.subAggregation(AggregationBuilders.terms("devVerId").field("devVerId"));
articlesOverTime.subAggregation(AggregationBuilders.terms("devTypeId").field("devTypeId"));
articlesOverTime.subAggregation(AggregationBuilders.terms("osVerId").field("osVerId"));
articlesOverTime.subAggregation(AggregationBuilders.terms("browserId").field("browserId"));
// 3. make the query
SearchResponse sr = node.client().prepareSearch()
.setQuery(qb)
.addAggregation(articlesOverTime)
.execute().actionGet();
We have an index with approximately 1 million documents, each representing a product for an e-commerce store. We are using aggregations to calculate buckets representing attribute values for each attribute of the product. If we run a search that limits the result set to, say, 2000 products, performance is great (Elastic returns the result in less than 10 milliseconds). However, if we run a match_all query to get all products and their corresponding aggregations, Elastic takes more than 3 seconds to return a result. If we disable the aggregations, performance is blazing fast again. Thus, it seems that performance, when using aggregations, depends heavily on the size of the result set (with a match_all query we get around 1000 buckets). Is there anything special we need to be aware of in Elastic in order to have a match_all query perform similarly to a query that returns 2000 products?
Before we started using Elastic, we worked with Lucene and built our own facet abstraction on top of it to handle the above scenario. In this case, we precalculated facets on index startup and represented each facet as a term with a corresponding bitset. When doing searches, we retrieved a bitset for the query in question and ANDed it with the precalculated bitset for each facet we wanted to show in the given scenario. With this implementation, the speed at which we were able to calculate facet results did not depend on the size of the result set of a given query. Only the number of documents and the number of facets influenced the speed. We were able to calculate more than 10,000 facets (buckets) per search request with an index of 1 million documents and still get the result set and facet results in less than 100 milliseconds.
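The precomputed-bitset approach described above can be sketched roughly as follows. This is not the original implementation, only a minimal illustration using `java.util.BitSet`; the class name, method names, and sample data are hypothetical. Each facet value owns a bitset of document ids built up front, and a facet count is the cardinality of that bitset ANDed with the query's bitset, so the cost depends on the number of documents and facets rather than on how many documents the query matched.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class BitSetFacetSketch {
    // Facet value -> bitset of the document ids that carry this value (built once, up front).
    private final Map<String, BitSet> facetBits = new LinkedHashMap<>();

    // Registers a document id under a facet value.
    void index(int docId, String facetValue) {
        facetBits.computeIfAbsent(facetValue, k -> new BitSet()).set(docId);
    }

    // Returns, per facet value, how many of the documents in queryBits carry it.
    Map<String, Integer> count(BitSet queryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : facetBits.entrySet()) {
            BitSet hits = (BitSet) entry.getValue().clone(); // keep the precomputed set intact
            hits.and(queryBits);
            counts.put(entry.getKey(), hits.cardinality());
        }
        return counts;
    }

    public static void main(String[] args) {
        BitSetFacetSketch facets = new BitSetFacetSketch();
        facets.index(0, "red");
        facets.index(1, "red");
        facets.index(2, "blue");
        facets.index(3, "blue");
        facets.index(4, "red");

        // Bitset of the documents matched by some query (ids 1, 2 and 4).
        BitSet query = new BitSet();
        query.set(1);
        query.set(2);
        query.set(4);

        System.out.println(facets.count(query)); // prints {red=2, blue=1}
    }
}
```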
Can anyone tell us whether this is possible to achieve with Elastic, and give us any pointers to what we are doing wrong? (We are currently running tests on a setup at found.no with 1 cluster having 4 2.5 GHz cores and 1 GB RAM. The Elastic index takes up around 3.5 GB of disk space.)
Example query (we can save about 1/3 of the query time by not using nested aggregations):
{
"query": {
"match_all": {}
},
"aggs": {
"nested_aggs": {
"nested": {
"path": "specs"
},
"aggs": {
"combined": {
"filter": {
"match_all": { }
},
"aggs": {
"ul1": {
"terms": {
"field": "unspscNameLevel1",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"ul2": {
"terms": {
"field": "unspscNameLevel2",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"ul3": {
"terms": {
"field": "unspscNameLevel3",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"ul4": {
"terms": {
"field": "unspscNameLevel4",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"rl1": {
"terms": {
"field": "requirementSpecificationNameLevel1",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"rl2": {
"terms": {
"field": "requirementSpecificationNameLevel2",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"rl3": {
"terms": {
"field": "requirementSpecificationNameLevel3",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"t1": {
"terms": {
"field": "tilslutningskrav",
"size": 50
}
},
"t2": {
"terms": {
"field": "tildelingsform",
"size": 50
}
},
"t3": {
"terms": {
"field": "tildelingskriterie",
"size": 50
}
}
}
}
}
},
"nested_aggs2": {
"nested": {
"path": "specs.attributes"
},
"aggs": {
"combined": {
"filter": {
"match_all": { }
},
"aggs": {
"n4": {
"terms": {
"field": "4. niv. kategori",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"ffv": {
"terms": {
"field": "form forvaltningsopgave",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"fh": {
"terms": {
"field": "form hovedområde",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"fo": {
"terms": {
"field": "form opgaveområde",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"fs": {
"terms": {
"field": "form serviceområde",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
},
"s": {
"terms": {
"field": "styresystem",
"size": 50
} ,
"aggs": {
"parent_count": {
"reverse_nested": {}
}
}
}
}
}
}
},
"supplierCategory": {
"filter": {
"match_all": { }
},
"aggs": {
"sc": {
"terms": {
"field": "supplierCategory.raw",
"size": 50
}
},
"on": {
"terms": {
"field": "organisationName.raw",
"size": 50
}
},
"mn": {
"terms": {
"field": "manufacturerName.raw",
"size": 50
}
}
}
}
}
}