Elasticsearch Filter Query

I am using Elasticsearch 1.5.2. I stored some products with a field named "allergic" and others without it. The values of this field can be fish, milk, nuts, etc. I want a query that returns only the products that do not have the "allergic" field at all, and I want to combine it with an aggregation query. I want to send just one request: first eliminate the products that have the "allergic" field, then run the aggregation from the second block on what remains.
How can I integrate this:
{
"constant_score" : {
"filter" : {
"missing" : { "field" : "allergic" }
}
}
}
into this aggregation query:
POST tes1/_search?search_type=count
{
"aggs" : {
"fruits" : {
"filter" : {
"query":{
"query_string": {
"query": "Fruits",
"fields": [
"category"
]
}
}},
"aggs" : {
"minprice": {
"top_hits": {
"sort": [
{
"prix en €/kg": {
"order": "asc"
}
}
], "size":400
}
}
}
}} }

Add the filter as the top-level "query" of the same request, next to "aggs". Aggregations run only on the documents matched by the query, so the products with an "allergic" field are excluded before the aggregation is computed.
POST tes1/_search
{
"_source": false,
"size": 1000,
"query":
{ "constant_score" : {
"filter" : {
"missing" : { "field" : "allergic" }
}
}
},
"aggs" : {
"fruits" : {
"filter" : {
"query":{
"query_string": {
"query": "Fruits",
"fields": [
"category"
]
}
}},
"aggs" : {
"minprice": {
"top_hits": {
"sort": [
{
"prix en €/kg": {
"order": "asc"
}
}
], "size":400
}
}
}
}} }
On a side note, please consider upgrading Elasticsearch to the latest version, as 1.x is no longer supported.
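For readers who do upgrade, the request above needs a few adjustments. As far as I know, `search_type=count` and the `missing` filter were both removed in 5.0 (replaced by `"size": 0` and a `must_not`/`exists` clause), and the `query` wrapper inside a filter aggregation is no longer needed. The Python sketch below builds such a body as a plain dict. The helper name `build_request` and the `top_n` parameter are made up for illustration; the field names come from the question.

```python
import json


def build_request(missing_field="allergic", category="Fruits", top_n=100):
    """Build a search body that drops documents having `missing_field`
    and aggregates the rest (hypothetical helper, not a client API)."""
    return {
        "size": 0,  # only aggregations are needed, no top-level hits
        "query": {
            "bool": {
                # must_not runs in filter context, so no scoring is involved
                "must_not": {"exists": {"field": missing_field}}
            }
        },
        "aggs": {
            "fruits": {
                "filter": {
                    "query_string": {"query": category, "fields": ["category"]}
                },
                "aggs": {
                    "minprice": {
                        "top_hits": {
                            "sort": [{"prix en €/kg": {"order": "asc"}}],
                            # Assumption: newer versions cap this via
                            # index.max_inner_result_window (default 100).
                            "size": top_n,
                        }
                    }
                },
            }
        },
    }


body = build_request()
print(json.dumps(body, indent=2, ensure_ascii=False))
```

The printed JSON can be sent as the body of `POST tes1/_search`. If 400 hits per bucket are really needed, the index setting mentioned in the comment would have to be raised first.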

Related

Elasticsearch Aggregation Query Builder

I am new to working with aggregations in Elasticsearch, and I am on version 5.3. I have a query that combines a terms aggregation with a filter, but when I try to build the same query with AggregationBuilders, I can't get it to look like the manual query. The manual query works:
{
"size": 0,
"aggs": {
"data": {
"nested": {
"path": "data"
},
"aggs": {
"my_filters": {
"filter": {
"bool": {
"must": [
{
"term": {
"data.genre": "sci-fi"
}
}
]
}
},
"aggs": {
"my_agg_field": {
"terms": {
"field": "data.director.keyword"
}
}
}
}
}
}
}
}
But in my code, I need to use AggregationBuilder to create the query. So I use the following:
AggregationBuilder aggregation =
AggregationBuilders
.nested("data", "data")
.subAggregation(
AggregationBuilders.filter("filters", query)
.subAggregation(
AggregationBuilders
.terms("bucket_field").field("data.director.keyword")
)
);
With this builder code, I get the following query:
{
"aggs": {
"filters" : {
"filter" : {
"nested" : {
"query" : {
"bool" : {
"must" : [
{
"match" : {
"data.genre" : {
"query" : "sci-fi",
"operator" : "OR"
}
}
}
]
}
},
"path" : "data"
}
},
"aggregations" : {
"bucket_field" : {
"terms" : {
"field" : "data.director.keyword"
}
}
}
}
}
}
This aggregation query returns results but doesn't aggregate the buckets like the manual query does and just returns:
"aggregations": {
"filters": {
"doc_count": 62,
"bucket_field": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
}
}
},
I need help converting my manual query into an AggregationsBuilder query. Thanks!

How can Elasticsearch run multiple random queries without repeating results?

I am on Elasticsearch 5.0.
I need to query user information randomly several times, but the combined result must not contain duplicates.
For example,
the first random query result:
user0 user1 user2
the second random query result:
user0 user3 user4
Here user0 is a duplicate.
This is my random query. How can I modify it?
{
"size" : 10,
"query" : {
"match_all" : {
"boost" : 1.0
}
},
"_source" : {
"includes" : [
],
"excludes" : [ ]
},
"sort" : [
{
"_script" : {
"script" : {
"inline" : "Math.random()",
"lang" : "painless"
},
"type" : "number",
"order" : "asc"
}
}
],
"ext" : { }
}
{
"size": 1,
"query": {
"function_score": {
"functions": [
{
"random_score": {
"seed": "1477072619038"
}
}
]
}
}
}
The query above uses random_score with a seed inside a function_score query; see https://www.elastic.co/guide/en/elasticsearch/reference/5.4/query-dsl-function-score-query.html#function-random
You can use a bool query with a must_not clause containing an ids query to exclude the documents you have already retrieved.
{
"query": {
"bool": {
"must": {
"match_all": {
"boost": 1.0
}
},
"must_not": [
{
"ids": {
"values": [The set of previous Ids]
}
}
]
}
},
...
}
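To make the exclusion approach concrete, here is a small Python sketch that combines the two ideas above: it rebuilds the request before every call and adds the ids already returned to a `must_not` clause under a seeded `random_score`. The helper names `build_random_query` and `remember_hits` are hypothetical, and the response dict is a hand-written stand-in for what a client would return, not real output.

```python
def build_random_query(seen_ids, size=10, seed="1477072619038"):
    """Random-order query that skips every id in `seen_ids`.

    The seed value is the one used in the answer above.
    """
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": {"match_all": {}},
                        "must_not": [{"ids": {"values": sorted(seen_ids)}}],
                    }
                },
                "functions": [{"random_score": {"seed": seed}}],
            }
        },
    }


def remember_hits(seen_ids, response):
    """Add the ids of the returned hits to the exclusion set."""
    for hit in response["hits"]["hits"]:
        seen_ids.add(hit["_id"])
    return seen_ids


seen = set()
first = build_random_query(seen)

# Stand-in for the first response (made-up ids, for illustration only).
fake_response = {"hits": {"hits": [{"_id": "user0"}, {"_id": "user1"}]}}
remember_hits(seen, fake_response)

second = build_random_query(seen)
print(second["query"]["function_score"]["query"]["bool"]["must_not"])
```

The exclusion list grows with every round, so this is probably best suited to a modest number of calls.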

Converting SQL query to ElasticSearch Query

I want to convert the following SQL query to an Elasticsearch query. Can anyone help with this?
select csgg, sum(amount) from table1
where type in ('a','b','c') and year=2016 and fc="33" group by csgg having sum(amount)=0
I tried the following:
{
"size": 500,
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"must" : [
{"term" : {"fc" : "33"}},
{"term" : {"year" : 2016}}
],
"should" : [
{"terms" : {"type" : ["a","b","c"] }}
]
}
}
}
},
"aggs": {
"group_by_csgg": {
"terms": {
"field": "csgg"
},
"aggs": {
"sum_amount": {
"sum": {
"field": "amount"
}
}
}
}
}
}
but I am not sure I am doing it right, as I cannot validate the results.
It seems the query needs to be added inside the aggregation.
Assuming that you use Elasticsearch 2.x, it is possible to get HAVING semantics in Elasticsearch.
I'm not aware of a way to do this prior to 2.0.
You can use the new Bucket Selector aggregation, a pipeline aggregation that keeps only the buckets that meet a given criterion:
POST test/test/_search
{
"size": 0,
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"must" : [
{"term" : {"fc" : "33"}},
{"term" : {"year" : 2016}},
{"terms" : {"type" : ["a","b","c"] }}
]
}
}
}
},
"aggs": {
"group_by_csgg": {
"terms": {
"field": "csgg",
"size": 100
},
"aggs": {
"sum_amount": {
"sum": {
"field": "amount"
}
},
"no_amount_filter": {
"bucket_selector": {
"buckets_path": {"sumAmount": "sum_amount"},
"script": "sumAmount == 0"
}
}
}
}
}
}
However, there are two caveats. Depending on your configuration, it might be necessary to enable scripting like this:
script.aggs: true
script.groovy: true
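Moreover, because the selector works on the buckets returned by the parent terms aggregation, you are not guaranteed to get every bucket with a summed amount of 0. If the terms aggregation returns only terms whose summed amount is not 0 (for example because "size" cuts the list off), you will get no result.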
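The second caveat can be illustrated with a small in-memory simulation in Python. This is only a toy model under stated assumptions (buckets ordered by document count, ties broken by key, a single shard); the function name `having_sum_zero` and the sample rows are made up for illustration.

```python
from collections import defaultdict


def having_sum_zero(rows, size):
    """Mimic terms(size) + sum + bucket_selector on in-memory rows.

    rows: (csgg, amount) pairs that already passed the query filter.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for csgg, amount in rows:
        totals[csgg] += amount
        counts[csgg] += 1

    # terms aggregation: keep only the `size` buckets with the most documents
    top = sorted(counts, key=lambda k: (-counts[k], k))[:size]

    # bucket_selector: sees those buckets only
    return [k for k in top if totals[k] == 0]


rows = [
    ("x", 5), ("x", 7), ("x", 1),  # 3 docs, sum 13
    ("y", 2), ("y", -2),           # 2 docs, sum 0
    ("z", 0),                      # 1 doc,  sum 0
]

small = having_sum_zero(rows, size=2)   # "z" is cut off before the selector runs
large = having_sum_zero(rows, size=10)  # every bucket reaches the selector
print(small, large)
```

With `size=2` only `y` survives, although `z` also sums to 0. Choosing a terms `size` at least as large as the number of distinct `csgg` values avoids the problem in this model.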

Elasticsearch Facets: Search on _index returned no results

I want to search data in ES in this order: by index, then by index type, then by text search.
When I use the query below on "_index", I expect to get the list of index types under that particular index along with the related data, but it returns nothing. On the other hand, when I search by _type, I get the data for that index type. Where have I gone wrong?
curl -XGET 'http://localhost:9200/_all/_search?pretty' -d '{
"facets": {
"terms": {
"terms": {
"field": "_index",
"size": 10,
"order": "count",
"exclude": []
},
"facet_filter": {
"fquery": {
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "*"
}
}
]
}
},
"filter": {
"bool": {
"must": [
{
"terms": {
"_index": [
"<index_name>"
]
}
}
]
}
}
}
}
}
}
}
},
"size": 0
}'
Note: I faced this problem first on Kibana, where I used the filter "_index":"name_of_index"; it returned no results but "_type":"name_of_index_type" returned the expected result. I found Kibana uses the above query behind the scenes to get the results of the filter I tried.
Here is an example of a query with a pre-filter ("query": "*") followed by a bool filter with must and must_not clauses. The result is then used to compute the aggregations:
curl -XGET 'http://localhost:9200/YOUR_INDEX_NAME/_search?size=10' -d '{
"query" : {
"filtered" : {
"query" : {
"query_string" : {
"query" : "*"
}
},
"filter" : {
"bool" : {
"must" : [
{ "term" : { "E_RECORDEDBY" : "malençon, g."} },
{ "term" : { "T_SCIENTIFICNAME" : "peniophora incarnata" } }
],
"must_not" : [
{"term" : { "L_CONTINENT" : "africa" } },
{"term" : { "L_CONTINENT" : "europe" } }
]
}
}
}
},
"aggs" : {
"L_CONTINENT" : {
"terms" : {
"field" : "L_CONTINENT",
"size" : 20
}
}
},
"sort" : "_score"
}'
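The `filtered` query used above was, as far as I know, deprecated in 2.0 and removed in 5.0. On later versions the same filter-then-aggregate request is usually written as a `bool` query whose `filter` clause holds the conditions. A Python sketch of that body follows; `build_body` is a hypothetical helper and the field values are the ones from the example above.

```python
def build_body(must_terms, must_not_terms, agg_field, agg_size=20):
    """Filter documents with term clauses, then run a terms aggregation."""
    return {
        "query": {
            "bool": {
                # scoring part, same as the "query" of the filtered query
                "must": {"query_string": {"query": "*"}},
                # non-scoring part, same as the "filter" of the filtered query
                "filter": {
                    "bool": {
                        "must": [{"term": {f: v}} for f, v in must_terms],
                        "must_not": [{"term": {f: v}} for f, v in must_not_terms],
                    }
                },
            }
        },
        "aggs": {agg_field: {"terms": {"field": agg_field, "size": agg_size}}},
        "sort": ["_score"],
    }


body = build_body(
    must_terms=[
        ("E_RECORDEDBY", "malençon, g."),
        ("T_SCIENTIFICNAME", "peniophora incarnata"),
    ],
    must_not_terms=[("L_CONTINENT", "africa"), ("L_CONTINENT", "europe")],
    agg_field="L_CONTINENT",
)
print(body["query"]["bool"]["filter"])
```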

Post filter on subaggregation in elasticsearch

I am trying to run a post filter on the aggregated data, but it is not working as I expected. Can someone review my query and tell me whether I am doing anything wrong?
{
"query" : {
"bool" : {
"must" : {
"range" : {
"versionDate" : {
"from" : null,
"to" : "2016-04-22T23:13:50.000Z",
"include_lower" : false,
"include_upper" : true
}
}
}
}
},
"aggregations" : {
"associations" : {
"terms" : {
"field" : "association.id",
"size" : 0,
"order" : {
"_term" : "asc"
}
},
"aggregations" : {
"top" : {
"top_hits" : {
"from" : 0,
"size" : 1,
"_source" : {
"includes" : [ ],
"excludes" : [ ]
},
"sort" : [ {
"versionDate" : {
"order" : "desc"
}
} ]
}
},
"disabledDate" : {
"filter" : {
"missing" : {
"field" : "disabledDate"
}
}
}
}
}
}
}
Steps in the query:
Filter by indexDate less than or equal to a given date.
Aggregate on formId, forming one bucket per formId.
Sort in descending order and return the top hit per bucket.
Run a sub-aggregation filter after the sort sub-aggregation and remove from the buckets all documents whose disabled date is not null. (This is the part that is not working.)
The whole purpose of post_filter is to run after aggregations have been computed. As such, post_filter has no effect whatsoever on aggregation results.
What you can do in your case is apply a top-level filter aggregation so that documents without disabledDate are left out of the aggregations, i.e. only documents with disabledDate are considered.
{
"query": {
"bool": {
"must": {
"range": {
"versionDate": {
"from": null,
"to": "2016-04-22T23:13:50.000Z",
"include_lower": true,
"include_upper": true
}
}
}
}
},
"aggregations": {
"with_disabled": {
"filter": {
"exists": {
"field": "disabledDate"
}
},
"aggs": {
"form.id": {
"terms": {
"field": "form.id",
"size": 0
},
"aggregations": {
"top": {
"top_hits": {
"size": 1,
"_source": {
"includes": [],
"excludes": []
},
"sort": [
{
"versionDate": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}
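The point about post_filter running after the aggregations can be shown with a toy in-memory model in Python. It is a simplified sketch, not how Elasticsearch is implemented; the `search` function, the `form_id` field and the sample documents are invented for illustration.

```python
from collections import Counter


def search(docs, query, post_filter=None, agg_field="form_id"):
    """Toy model of the order: query, then aggregations, then post_filter."""
    matched = [d for d in docs if query(d)]
    # Aggregations see everything the query matched.
    buckets = Counter(d[agg_field] for d in matched)
    # post_filter trims the hit list only, after the buckets are built.
    hits = [d for d in matched if post_filter is None or post_filter(d)]
    return hits, dict(buckets)


docs = [
    {"form_id": "a", "disabledDate": None},
    {"form_id": "a", "disabledDate": "2016-01-01"},
    {"form_id": "b", "disabledDate": None},
]


def match_all(doc):
    return True


def not_disabled(doc):
    return doc["disabledDate"] is None


# Condition applied as post_filter: hits shrink, buckets do not.
hits, buckets = search(docs, match_all, post_filter=not_disabled)
print(len(hits), buckets)

# Condition applied before aggregating: the buckets change as well.
hits2, buckets2 = search(docs, not_disabled)
print(len(hits2), buckets2)
```

In the first call bucket "a" still counts both documents; only in the second call, where the condition runs before the aggregation (as a filter aggregation or a query clause would), does the count drop.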
