I have indexed entries with optional properties. For example:
{
"id":1,
"field1":"XYZ"
},
{
"id":2,
"field2":"XYZ"
},
{
"id":3,
"field1":"XYZ"
}
I would like an aggregation that tells me how many entries have field1 populated and how many have field2 populated.
The expected result should be:
{
"field1":2,
"field2":1
}
Is this even possible with Elasticsearch?
Yes, you can do it with a filters aggregation that has one exists filter per field:
POST myindex/_search
{
"size": 0,
"aggs": {
"field_exists": {
"filters": {
"filters": {
"field1": {
"exists": {
"field": "field1"
}
},
"field2": {
"exists": {
"field": "field2"
}
}
}
}
}
}
}
You'll get an answer like this one:
"aggregations" : {
"field_exists" : {
"buckets" : {
"field1" : {
"doc_count" : 2
},
"field2" : {
"doc_count" : 1
}
}
}
}
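If you need exactly the flat shape from the question, one option is to reshape the response on the client. The sketch below is only an illustration in Python: the helper names build_exists_filters and flatten_counts are made up here, and the sample response is copied from the answer above rather than fetched from a cluster.

```python
def build_exists_filters(fields):
    """Build a `filters` aggregation with one `exists` filter per field name."""
    return {
        "size": 0,
        "aggs": {
            "field_exists": {
                "filters": {
                    "filters": {name: {"exists": {"field": name}} for name in fields}
                }
            }
        },
    }


def flatten_counts(response, agg_name="field_exists"):
    """Reduce the keyed buckets of a `filters` aggregation to {name: doc_count}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {name: bucket["doc_count"] for name, bucket in buckets.items()}


# Response shape copied from the answer above (not a live result).
sample_response = {
    "aggregations": {
        "field_exists": {
            "buckets": {"field1": {"doc_count": 2}, "field2": {"doc_count": 1}}
        }
    }
}

print(flatten_counts(sample_response))  # {'field1': 2, 'field2': 1}
```

The request body from build_exists_filters(["field1", "field2"]) is the same as the query shown above, so adding a third optional field only means adding its name to the list.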
I wrote an Elasticsearch query to get the aggregated doc count for the matching keyword "webserver1". Below is the query:
POST _search?filter_path=aggregations.*.buckets
{
"query": {
"bool": {
"must": [
{
"match": {
"hostname": "webserver1"
}
}
]
}
},
"aggs": {
"webserver1": {
"terms": {
"field": "webserver1"
}
}
}
}
Response:
{
"aggregations" : {
"webserver1" : {
"buckets" : [
{
"key" : "webserver1",
"doc_count" : 36715
}
]
}
}
}
Is there a way to return only the key and its count, like this:
{
"webserver1" : 36715
}
I have checked multiple resources but I'm not able to find any filters/options to do it.
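No answer to this question appears here. As far as I know, filter_path only trims the response and does not rename or flatten it, so one possible approach is to reshape the buckets on the client. A minimal Python sketch, assuming the response has already been parsed into a dict (the helper name terms_buckets_to_map is made up for illustration):

```python
def terms_buckets_to_map(response, agg_name):
    """Turn the bucket list of a `terms` aggregation into {key: doc_count}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Response copied from the question above (not a live result).
sample_response = {
    "aggregations": {
        "webserver1": {"buckets": [{"key": "webserver1", "doc_count": 36715}]}
    }
}

print(terms_buckets_to_map(sample_response, "webserver1"))  # {'webserver1': 36715}
```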
Structure:
{
.................
"mp": "CAR",
"nPhoto": 1,
"items": [
{
"availableQuantity": 3
},
{
"availableQuantity": 0
},
{
"availableQuantity": 0
}
],
............................
}
}
To filter by the mp field, I generate the following query:
GET catalog/_search
{
"from" : 0,
"size" : 0,
"aggregations" : {
"brand" : {
"filter" : {
"bool" : {
"must" : {
"term" : {
"mp" : "CAR"
}
}
}
},
"aggregations" : {
"photosQuantity" : { "sum" : { "field" : "nPhoto" } }
}
}
}
}
But how do I write the query if I need to filter on availableQuantity, so that at least one of the items has availableQuantity > 0?
What you probably want is a nested query in the filter part (this assumes items is mapped as a nested field).
Something along the lines of this:
{
"from": 0,
"size": 0,
"aggregations": {
"brand": {
"filter": {
"nested": {
"path": "items",
"query": {
"range": {
"items.availableQuantity": {
"gt": 0
}
}
}
}
},
"aggregations": {
"photosQuantity": {
"sum": {
"field": "nPhoto"
}
}
}
}
}
}
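If you want to keep the mp filter from the question and add the stock condition, both clauses can go into one bool filter. The Python sketch below only builds the request body as a dict; the function name brand_aggregation and the min_quantity parameter are made up for illustration, and it assumes items is mapped as nested. It uses gt so that items with a quantity of 0 do not match.

```python
import json


def brand_aggregation(mp_value, min_quantity=0):
    """Sum nPhoto over documents with the given mp and at least one item in stock."""
    in_stock = {
        "nested": {
            "path": "items",  # assumption: `items` is mapped as a nested field
            "query": {"range": {"items.availableQuantity": {"gt": min_quantity}}},
        }
    }
    return {
        "from": 0,
        "size": 0,
        "aggregations": {
            "brand": {
                # Both clauses must hold for a document to be counted.
                "filter": {"bool": {"filter": [{"term": {"mp": mp_value}}, in_stock]}},
                "aggregations": {"photosQuantity": {"sum": {"field": "nPhoto"}}},
            }
        },
    }


print(json.dumps(brand_aggregation("CAR"), indent=2))
```

The printed JSON can be pasted as the body of GET catalog/_search.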
I'm having trouble aggregating my nested data to include null values as well.
I'm using Elasticsearch version 6.8
To simplify the problem: I have a nested field that looks like this:
PUT test/doc/_mapping
{
"properties": {
"fields": {
"type" : "nested",
"properties" : {
"name" : {
"type" : "keyword"
},
"value" : {
"type" : "long"
}
}
}
}
}
I created 3 documents:
PUT test/doc/1
{
"fields" : {
"name" : "aaa",
"value" : 1
}
}
PUT test/doc/2
{
"fields" : [{
"name" : "aaa",
"value" : 1
},
{
"name" : "bbb",
"value" : 2
}]
}
PUT test/doc/3
{
"fields" : [
{
"name" : "bbb",
"value" : 2
}]
}
Now I want to count the documents where name="bbb", grouped by value.
For the above data I want to get:
2 – 2 documents
N/A – 1 document (the first document where bbb is missing)
The problem is with the null values: I cannot find a way to match the documents where "bbb" is missing and put them in an N/A bucket.
So far I have written a query that matches the values where "bbb" exists:
GET test/doc/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"my_agg": {
"nested": {
"path": "fields"
},
"aggs": {
"my_filter": {
"filter": {
"term": {
"fields.name": "bbb"
}
},
"aggs": {
"my_term": {
"terms": {
"field": "fields.value"
}
}
}
}
}
}
}
}
And the response is:
"aggregations" : {
"my_agg" : {
"doc_count" : 4,
"my_filter" : {
"doc_count" : 2,
"my_term" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 2,
"doc_count" : 2
}
]
}
}
}
}
I also want to get:
"key" : 0 (for N/A)
"doc_count" : 1
What am I missing?
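If I understand this correctly, you want to see the buckets that have zero/null/no matches. You can set min_doc_count to 0 on the terms aggregation, which includes buckets whose count is zero. Note that, as far as I can tell, this only adds zero-count buckets for values that exist in the index; it does not by itself create a bucket for documents that have no "bbb" entry.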
GET test/doc/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"my_agg": {
"nested": {
"path": "fields"
},
"aggs": {
"my_filter": {
"filter": {
"term": {
"fields.name": "bbb"
}
},
"aggs": {
"my_term": {
"terms": {
"field": "fields.value",
"min_doc_count": 0
}
}
}
}
}
}
}
}
You could also use inner_hits to see which nested entry matched in each document, or aggregate on _id in the aggregation query above.
POST test/_search
{
"query": {
"bool": {
"should": [
{
"match_all": {}
},
{
"nested": {
"path": "fields",
"query": {
"match": {
"fields.name": "bbb"
}
},
"inner_hits": {}
}
}
]
}
}
}
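To make the expected result concrete, here is a small Python sketch that groups the three sample documents in memory, putting documents without a "bbb" entry into an N/A bucket. It is not an Elasticsearch call, and the function name group_by_value is made up for illustration. The dict at the end is one way the N/A count could be requested from Elasticsearch: a filter aggregation with must_not around a nested query, added next to my_agg at the top level. That part is an assumption on my side and has not been run against 6.8.

```python
from collections import Counter

# The three documents from the question.
docs = [
    {"fields": {"name": "aaa", "value": 1}},
    {"fields": [{"name": "aaa", "value": 1}, {"name": "bbb", "value": 2}]},
    {"fields": [{"name": "bbb", "value": 2}]},
]


def group_by_value(documents, name, missing_key="N/A"):
    """Count parent documents per value of `name`; documents without it go to N/A."""
    counts = Counter()
    for doc in documents:
        fields = doc.get("fields", [])
        if isinstance(fields, dict):  # a single object behaves like a one-element list
            fields = [fields]
        values = {f["value"] for f in fields if f["name"] == name}
        if values:
            counts.update(values)  # one count per distinct value in this document
        else:
            counts[missing_key] += 1
    return dict(counts)


print(group_by_value(docs, "bbb"))  # {'N/A': 1, 2: 2}

# Hypothetical sibling aggregation for the N/A count (untested assumption).
missing_bbb_agg = {
    "filter": {
        "bool": {
            "must_not": {
                "nested": {"path": "fields", "query": {"term": {"fields.name": "bbb"}}}
            }
        }
    }
}
```

The doc_count of that filter aggregation would then play the role of the N/A bucket, while the existing nested aggregation keeps providing the per-value buckets.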
I'm trying to do a simple unique aggregation, but getting this error:
java.lang.IllegalStateException: Field data loading is forbidden on eid
This is my query:
POST /logstash-2016.06.*/Nginx/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"pid": "1"
}
},
{
"term": {
"cvprogress": "0"
}
},
{
"range" : {
"ServerTime" : {
"gte" : "2016-06-28T00:00:00"
}
}
}
]
}
},
"aggs": {
"distinct_colors" : {
"cardinality" : {
"field" : "eid"
}
}
}
}
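After going through the entire thread at https://github.com/elastic/elasticsearch/issues/15267, what worked was adding .raw to the field name (presumably eid.raw is the not_analyzed sub-field created by the default Logstash template),
like this: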
"aggs": {
"distinct_colors" : {
"cardinality" : {
"field" : "eid.raw"
}
}
}
I have a list (or array, whatever it is called in the language you are familiar with), e.g. names : ["John","Bas","Peter"], and I want to query the name field to see if it matches one of those names.
One way is with an or filter, e.g.:
{
"filtered" : {
"query" : {
"match_all": {}
},
"filter" : {
"or" : [
{
"term" : { "name" : "John" }
},
{
"term" : { "name" : "Bas" }
},
{
"term" : { "name" : "Peter" }
}
]
}
}
}
Is there a neater way? Preferably a query rather than a filter.
You can use a terms filter, which takes the whole list at once:
{
"query": {
"filtered" : {
"filter" : {
"terms": {
"name": ["John","Bas","Peter"]
}
}
}
}
}
Elasticsearch rewrites this as if you had used this one:
{
"query": {
"filtered" : {
"filter" : {
"bool": {
"should": [
{
"term": {
"name": "John"
}
},
{
"term": {
"name": "Bas"
}
},
{
"term": {
"name": "Peter"
}
}
]
}
}
}
}
}
When combining filters, it is usually better to use the bool filter than the and or or filters. The reason is explained on the Elasticsearch blog: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/
When I tried the filtered query I got "no [query] registered for [filtered]". It seems the filtered query has been deprecated and was removed in ES 5.0, so I suggest using a bool query with a filter clause instead:
{
"query": {
"bool": {
"filter": {
"terms": {
"name": ["John","Bas","Peter"]
}
}
}
}
}
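If the list of names lives in application code, the request body can be generated from it. A minimal Python sketch, assuming the query is sent as plain JSON (the helper name names_filter is made up for illustration):

```python
import json


def names_filter(field, values):
    """Build a bool query whose filter clause matches any of the given exact values."""
    return {"query": {"bool": {"filter": {"terms": {field: list(values)}}}}}


names = ["John", "Bas", "Peter"]
print(json.dumps(names_filter("name", names), indent=2))
```

Keep in mind that terms matches exact indexed values. If name is an analyzed text field, the indexed tokens are typically lowercased, so "John" may not match unless you query a keyword (or not_analyzed) version of the field.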
Example query: filter by a keyword and a list of values
{
"query": {
"bool": {
"must": [
{
"term": {
"fguid": "9bbfe844-44ad-4626-a6a5-ea4bad3a7bfb.pdf"
}
}
],
"filter": {
"terms": {
"page": [
"1",
"2",
"3"
]
}
}
}
}
}