Elasticsearch returns null on stats aggregation

I have a small dataset of 1,200 entries in Elasticsearch that is automatically ingested into the mapped fields of the document types: floats go into float fields and doubles into double fields.
When I run a 'stats' aggregation over the data, like:
GET /statsd-2015.09.28/timer_data/_search
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "range" : { "ns" : { "lte" : "gunicorn" } }
      }
    }
  },
  "aggs" : {
    "value_val" : { "stats" : { "field" : "u'count_90'" } }
  }
}
I get null on return like this:
...
"aggregations": {
  "value_val": {
    "count": 0,
    "min": null,
    "max": null,
    "avg": null,
    "sum": null
  }
}
...
Here is my mapping of fields:
{"statsd-2015.09.28":{"mappings":{
  "timer":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},
  "gauge":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},
  "counter":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},
  "timer_data":{"properties":{"#timestamp":{"type":"double"},"act":{"type":"string"},"count":{"type":"float"},"count_90":{"type":"float"},"count_ps":{"type":"float"},"grp":{"type":"string"},"lower":{"type":"float"},"mean":{"type":"float"},"mean_90":{"type":"float"},"median":{"type":"float"},"ns":{"type":"string"},"std":{"type":"float"},"sum":{"type":"float"},"sum_90":{"type":"float"},"sum_squares":{"type":"float"},"sum_squares_90":{"type":"float"},"tgt":{"type":"string"},"upper":{"type":"float"},"upper_90":{"type":"float"}}}
}}}
Why is the output null instead of the expected stats, and how can I fix it?

GET /statsd-2015.09.28/timer_data/_search
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "range" : { "ns" : { "lte" : "gunicorn" } }
      }
    }
  },
  "aggs" : {
    "value_val" : { "stats" : { "field" : "count_90" } }
  }
}
I am new to this, but I realized the field name in my query was not the actual field name: I was sending u'count_90' (a leftover Python unicode repr) instead of count_90. After fixing that, everything became clear.
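A minimal sketch of the fix, assuming the request body was built from Python 2 code where the repr of a unicode string (u'count_90') leaked into the template; the strip call and dict below are illustrative, not the asker's actual code:

```python
# What likely happened: interpolating repr(u'count_90') from Python 2
# into the request template yields a field name that does not exist in
# the mapping, so the stats aggregation matches zero values.
bad_field = "u'count_90'"
good_field = bad_field.strip("u'")  # drop the leaked repr characters

query = {
    "query": {
        "filtered": {
            "query": {"match_all": {}},
            "filter": {"range": {"ns": {"lte": "gunicorn"}}},
        }
    },
    "aggs": {"value_val": {"stats": {"field": good_field}}},
}

print(good_field)  # count_90
```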

Related

Elasticsearch null values aggregation, sum being 0 and avg being null?

I am using Elasticsearch for analysis and found that, when aggregating, if all elements in a bucket are null, the sum result is 0 but the avg result is null.
{
  "size" : 0,
  "query" : {
    "bool" : {
      "must" : {
        "bool" : {
          "must" : {
            "bool" : {
              "should" : [
                { "term" : { "2219" : "AAA" } },
                { "term" : { "2219" : "BBB" } },
                { "term" : { "2219" : "CCC" } },
                { "term" : { "2219" : "DDD" } }
              ]
            }
          }
        }
      }
    }
  },
  "explain" : false,
  "aggregations" : {
    "2224" : {
      "terms" : {
        "field" : "2224",
        "missing" : "null",
        "size" : 2000
      },
      "aggregations" : {
        "2219" : {
          "terms" : {
            "field" : "2219",
            "missing" : "null",
            "size" : 2000
          },
          "aggregations" : {
            "a" : {
              "avg" : { "field" : "2255" }
            },
            "count" : {
              "value_count" : { "field" : "1982" }
            }
          }
        }
      }
    }
  }
}
The result will be
...
{
  "key": "rock",
  "doc_count": 1,
  "2219": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "DDD",
        "doc_count": 1,
        "a": {
          "value": null
        }
      }
    ]
  }
}
...
The result for "a" is null, but if I change avg to sum, the result of "a" is 0.
Strange difference in behavior.
There's a similar issue on the Elasticsearch GitHub: https://github.com/elastic/elasticsearch/issues/9745
null is considered a correct value for the avg aggregation when ES has found 0 entities.
Try adding this script to the aggregation to map nulls to 0:
"avg" : {
  "field" : "2255",
  "script" : {
    "lang" : "painless",
    "source" : "if (_value == null) { return 0 } else { return _value }"
  }
}
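The asymmetry mirrors ordinary arithmetic: the sum over an empty set is well defined (the additive identity, 0), while the average of an empty set is undefined. A quick illustration in Python:

```python
import statistics

values = []  # a bucket in which every element is missing/null

# The sum of an empty set is well defined: the additive identity 0.
print(sum(values))  # 0

# The average of an empty set is undefined, which Elasticsearch reports
# as null; Python's statistics module raises an error instead.
try:
    statistics.mean(values)
except statistics.StatisticsError:
    print("mean is undefined for an empty bucket")
```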

Elasticsearch DSL for all null values between 2 timestamps

I have this JSON structure in Elasticsearch. I am having trouble writing a query DSL that finds all documents whose awsKafkaTimestamp is null and whose post.timestamp is between A and B. How can I do this?
{
  "tracer": {
    "post": { "timestamp": 123123 },
    "awsKafkaTimestamp": null,
    "qdcKafkaTimestamp": null
  }
}
Try this (works for ES 2.4; it will not work for 2.2 and below):
{
  "fields" : ["your_field"],
  "query" : {
    "bool" : {
      "must_not" : {
        "exists" : {
          "field" : "awsKafkaTimestamp"
        }
      },
      "must" : [{
        "nested" : {
          "path" : "post",
          "filter" : {
            "bool" : {
              "must" : {
                "range" : {
                  "post.timestamp" : {
                    "gte" : A,
                    "lte" : B
                  }
                }
              }
            }
          }
        }
      }]
    }
  }
}
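A sketch of the same request built as a Python dict, using the current nested-query syntax rather than the 2.x filter form shown above; A and B are placeholder bounds, and the field names are taken from the question's document:

```python
# Placeholder bounds for post.timestamp; real values come from the app.
A, B = 123000, 124000

query = {
    "query": {
        "bool": {
            # matches documents where awsKafkaTimestamp is null or absent
            "must_not": {"exists": {"field": "awsKafkaTimestamp"}},
            # ...and whose nested post.timestamp lies between A and B
            "must": [{
                "nested": {
                    "path": "post",
                    "query": {"range": {"post.timestamp": {"gte": A, "lte": B}}},
                }
            }],
        }
    }
}

print(query["query"]["bool"]["must_not"]["exists"]["field"])  # awsKafkaTimestamp
```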

How can I sort by a calculated value in Elasticsearch?

I want to use Elasticsearch as a search engine. I'm copying records from MySQL to Elasticsearch, and when I query I want to calculate a value from the indexed data and use it to sort the results.
My index looks like:
{
  "busquedas" : {
    "aliases" : { },
    "mappings" : {
      "coche" : {
        "properties" : {
          "coeff_e" : { "type" : "double" },
          "coeff_r" : { "type" : "double" },
          "desc" : { "type" : "string" }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1460116924258",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "N6jhy_ilQmSb6og16suZ4g",
        "version" : {
          "created" : "2030199"
        }
      }
    },
    "warmers" : { }
  }
}
And I want to compute a value per record, like
myCustomOrder = (coeff_e + coeff_r) * timestamp
and use it to sort the results:
{
  "sort" : [
    { "myCustomOrder" : { "order" : "asc" } },
    "_score"
  ],
  "query" : {
    "term" : { ... }
  }
}
I know I can use Groovy to compute values, but I have only managed to use it for filtering, as shown in the examples:
{
  "from": 10,
  "size": 3,
  "filter": {
    "script": {
      "script": "doc['coeff_e'].value < 0.5"
    }
  }
}
Thank you in advance, I'm a total newbie to Elasticsearch :D
The same as with filtering. Take a look at this section of the documentation. It should be self-explanatory once you know about scripts :-).
For completeness' sake:
{
  "query" : {
    ....
  },
  "sort" : {
    "_script" : {
      "type" : "number",
      "script" : {
        "inline" : "doc['field_name'].value * factor",
        "params" : {
          "factor" : 1.1
        }
      },
      "order" : "asc"
    }
  }
}
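To sanity-check the formula before putting it in a script sort, here is a client-side sketch of the same ordering, computed over hypothetical sample docs (the doc values and the assumption that timestamp is a numeric field are mine, not from the question):

```python
# Hypothetical sample docs mirroring the "coche" mapping; timestamp is
# assumed to be a numeric field alongside coeff_e and coeff_r.
docs = [
    {"desc": "car A", "coeff_e": 0.2, "coeff_r": 0.1, "timestamp": 100},
    {"desc": "car B", "coeff_e": 0.5, "coeff_r": 0.4, "timestamp": 10},
    {"desc": "car C", "coeff_e": 0.1, "coeff_r": 0.1, "timestamp": 50},
]

def custom_order(doc):
    # mirrors the script sort:
    # (doc['coeff_e'].value + doc['coeff_r'].value) * doc['timestamp'].value
    return (doc["coeff_e"] + doc["coeff_r"]) * doc["timestamp"]

for doc in sorted(docs, key=custom_order):
    print(doc["desc"])  # car B, then car C, then car A
```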

Elasticsearch: has_child query with children aggregation - bucket counts are wrong

I'm attempting to find parents based on matches in their children and retrieve children terms aggregations for the matches. For some reason, the bucket count for the children aggregation shows a higher count than the actual results (I would be happy if it showed the count of the parents, or of the children, in each children bucket).
The query is similar to the following (NOTE: I use the filtered query because I will later add a filter in addition to the query):
{
  "query" : {
    "filtered" : {
      "query" : {
        "has_child" : {
          "type" : "blog_tag",
          "query" : {
            "filtered" : {
              "query" : {
                "term" : {
                  "tag" : "something"
                }
              }
            }
          }
        }
      }
    }
  },
  "aggs" : {
    "my_children" : {
      "children" : {
        "type" : "my_child_type"
      },
      "aggs" : {
        "field_name" : {
          "terms" : {
            "field" : "blog.blog_tag.field_name"
          }
        }
      }
    }
  }
}
What is the correct way to do this?
The problem was as noted in the comments. The solution was to filter the aggregation with the same query:
{
  "query" : {
    "filtered" : {
      "query" : {
        "has_child" : {
          "type" : "blog_tag",
          "query" : {
            "filtered" : {
              "query" : {
                "term" : {
                  "tag" : "something"
                }
              }
            }
          }
        }
      }
    }
  },
  "aggs" : {
    "my_children" : {
      "children" : {
        "type" : "my_child_type"
      },
      "aggs" : {
        "results" : {
          "filter" : {
            "query" : {
              "filtered" : {
                "query" : {
                  "term" : {
                    "tag" : "something"
                  }
                }
              }
            }
          },
          "aggs" : {
            "field_name" : {
              "terms" : {
                "field" : "blog.blog_tag.field_name"
              }
            }
          }
        }
      }
    }
  }
}
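The inflated counts come from the children aggregation spanning all children of the matching parents, not just the children that matched the has_child query. A small in-memory simulation (hypothetical blogs and tags, not the asker's data) shows the difference and why repeating the term filter inside the aggregation fixes it:

```python
# Hypothetical parent blogs with their blog_tag children.
blogs = {
    "blog1": [{"tag": "something", "field_name": "x"},
              {"tag": "other", "field_name": "y"}],
    "blog2": [{"tag": "other", "field_name": "y"}],
}

# has_child query: parents with at least one child tagged "something".
matching_parents = [name for name, children in blogs.items()
                    if any(c["tag"] == "something" for c in children)]

# Unfiltered children aggregation: it buckets ALL children of the
# matching parents, which is why the counts look inflated.
all_children = [c for name in matching_parents for c in blogs[name]]

# The fix: repeat the term filter inside the aggregation so only the
# children that actually matched are bucketed.
matching_children = [c for c in all_children if c["tag"] == "something"]

print(len(all_children), len(matching_children))  # 2 1
```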

How to exclude a filter from a facet?

I have come from a Solr background and am trying to find the equivalent of "tagging" and "excluding" in Elasticsearch.
In the following example, how can I exclude the price filter from the calculation of the prices facet? In other words, the prices facet should take into account all of the filters except for price.
{
  "query" : {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "and" : [
          { "term" : { "colour" : "Red" } },
          { "term" : { "feature" : "Square" } },
          { "term" : { "feature" : "Shiny" } },
          { "range" : { "price" : { "from" : "10", "to" : "20" } } }
        ]
      }
    }
  },
  "facets" : {
    "colours" : {
      "terms" : { "field" : "colour" }
    },
    "features" : {
      "terms" : { "field" : "feature" }
    },
    "prices" : {
      "statistical" : { "field" : "price" }
    }
  }
}
You can apply the price filter as a top-level filter to your query and add it to all facets except prices as a facet_filter:
{
  "query" : {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "and" : [
          { "term" : { "colour" : "Red" } },
          { "term" : { "feature" : "Square" } },
          { "term" : { "feature" : "Shiny" } }
        ]
      }
    }
  },
  "facets" : {
    "colours" : {
      "terms" : { "field" : "colour" },
      "facet_filter" : {
        "range" : { "price" : { "from" : "10", "to" : "20" } }
      }
    },
    "features" : {
      "terms" : { "field" : "feature" },
      "facet_filter" : {
        "range" : { "price" : { "from" : "10", "to" : "20" } }
      }
    },
    "prices" : {
      "statistical" : { "field" : "price" }
    }
  },
  "filter" : {
    "range" : { "price" : { "from" : "10", "to" : "20" } }
  }
}
Btw, an important change since ES 1.0.0: the top-level filter was renamed to post_filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_search_requests.html#_search_requests). And using filtered queries is still preferred, as described here: http://elasticsearch-users.115913.n3.nabble.com/Filters-vs-Queries-td3219558.html
There is also a global option for facets to avoid filtering by the query filter (elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html#_scope).
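The behaviour can be checked with a small in-memory simulation (hypothetical documents and simplified filters, not the asker's data): the prices facet sees every filter except its own, while the final hit list sees all of them:

```python
# Hypothetical documents standing in for the indexed products.
docs = [
    {"colour": "Red",  "feature": "Square", "price": 15},
    {"colour": "Red",  "feature": "Square", "price": 25},
    {"colour": "Blue", "feature": "Square", "price": 12},
]

def colour_f(d):  return d["colour"] == "Red"
def feature_f(d): return d["feature"] == "Square"
def price_f(d):   return 10 <= d["price"] <= 20

# prices facet: every filter EXCEPT price, so its statistics cover the
# full price range of otherwise-matching documents.
prices_docs = [d for d in docs if colour_f(d) and feature_f(d)]
print(min(d["price"] for d in prices_docs),
      max(d["price"] for d in prices_docs))  # 15 25

# The final hit list (price applied as the top-level/post filter) still
# honours the price range.
hits = [d for d in docs if colour_f(d) and feature_f(d) and price_f(d)]
print(len(hits))  # 1
```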
