I have a small dataset of 1200 entries in Elasticsearch, automatically indexed into the mapped fields of each document type: floats go into float fields and doubles into double fields.
When I run a 'stats' aggregation over the data, like:
GET /statsd-2015.09.28/timer_data/_search
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "range" : { "ns" : { "lte" : "gunicorn" } }
      }
    }
  },
  "aggs" : {
    "value_val" : { "stats" : { "field" : "u'count_90'" } }
  }
}
I get nulls back, like this:
...
"aggregations": {
  "value_val": {
    "count": 0,
    "min": null,
    "max": null,
    "avg": null,
    "sum": null
  }
}
...
Here is my mapping of fields:
{"statsd-2015.09.28":{"mappings":{"timer":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},"gauge":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},"counter":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},"timer_data":{"properties":{"#timestamp":{"type":"double"},"act":{"type":"string"},"count":{"type":"float"},"count_90":{"type":"float"},"count_ps":{"type":"float"},"grp":{"type":"string"},"lower":{"type":"float"},"mean":{"type":"float"},"mean_90":{"type":"float"},"median":{"type":"float"},"ns":{"type":"string"},"std":{"type":"float"},"sum":{"type":"float"},"sum_90":{"type":"float"},"sum_squares":{"type":"float"},"sum_squares_90":{"type":"float"},"tgt":{"type":"string"},"upper":{"type":"float"},"upper_90":{"type":"float"}}}}}}
My question is: why is the output not what I expect, and how can I get the desired result?
GET /statsd-2015.09.28/timer_data/_search
{
  "query" : {
    "filtered" : {
      "query" : { "match_all" : {} },
      "filter" : {
        "range" : { "ns" : { "lte" : "gunicorn" } }
      }
    }
  },
  "aggs" : {
    "value_val" : { "stats" : { "field" : "count_90" } }
  }
}
I am new to this, but I realized that the field name I was querying ("u'count_90'") was not the actual field name in the mapping ("count_90"). Once I fixed that, everything became clear.
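For what it's worth, the stray u' prefix looks like a Python 2 repr() leaking into the request body. A minimal sketch of how that kind of mistake can happen (the dict-building code here is hypothetical, not from the question):

```python
# Using repr() when building the request embeds quote characters (and,
# under Python 2, a u prefix) into the JSON, so Elasticsearch looks for
# a field literally named "u'count_90'" / "'count_90'" and finds nothing.
field = "count_90"

bad_body = {"aggs": {"value_val": {"stats": {"field": repr(field)}}}}
good_body = {"aggs": {"value_val": {"stats": {"field": field}}}}

print(bad_body["aggs"]["value_val"]["stats"]["field"])   # 'count_90' (quotes included)
print(good_body["aggs"]["value_val"]["stats"]["field"])  # count_90
```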
Related
I am using Elasticsearch for analysis and found that, when aggregating, if all elements in a bucket have null values, the sum result is 0 but the avg result is null.
{
"size" : 0,
"query" : {
"bool" : {
"must" : {
"bool" : {
"must" : {
"bool" : {
"should" : [ {
"term" : {
"2219" : "AAA"
}
}, {
"term" : {
"2219" : "BBB"
}
}, {
"term" : {
"2219" : "CCC"
}
}, {
"term" : {
"2219" : "DDD"
}
} ]
}
}
}
}
}
},
"explain" : false,
"aggregations" : {
"2224" : {
"terms" : {
"field" : "2224",
"missing" : "null",
"size" : 2000
},
"aggregations" : {
"2219" : {
"terms" : {
"field" : "2219",
"missing" : "null",
"size" : 2000
},
"aggregations" : {
"a" : {
"avg" : {
"field" : "2255"
}
},
"count" : {
"value_count" : {
"field" : "1982"
}
}
}
}
}
}
}
}
The result will be
...
{
"2219": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "DDD",
"doc_count": 1,
"a": {
"value": null
}
}
]
},
"key": "rock",
"doc_count": 1
}
...
The result for "a" is null.
But if I change avg to sum, the result of "a" is 0.
Strange difference in behavior.
There's a similar issue on the Elasticsearch GitHub: https://github.com/elastic/elasticsearch/issues/9745
null is considered the correct value for an avg aggregation when ES has found 0 entities.
Try adding this script to the aggregation to replace nulls with 0:
"avg" : {
  "field" : "2255",
  "script" : {
    "lang" : "painless",
    "source" : "if (_value == null) {return 0} else {return _value}"
  }
}
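The asymmetry itself is not a bug; this sketch mimics (it does not reimplement) the aggregation semantics for an empty bucket:

```python
# Mimic of Elasticsearch's behavior on a bucket with no values:
# the sum over no values is the additive identity 0, while an average
# over no values is undefined, so ES reports null (None here).
def agg_sum(values):
    return sum(values)  # sum([]) == 0

def agg_avg(values):
    return sum(values) / len(values) if values else None

empty_bucket = []
print(agg_sum(empty_bucket))  # 0
print(agg_avg(empty_bucket))  # None
```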
I have this JSON structure in Elasticsearch. I am having trouble creating a DSL to search for all null values of awsKafkaTimestamp that are in between a post.timestamp range of A and B. How can I do this?
{
"tracer": {
"post": {"timestamp": 123123},
"awsKafkaTimestamp": null,
"qdcKafkaTimestamp": null
}
}
Try this (works for ES 2.4; will not work for 2.2 and below):
{
"fields" : ["your_field"],
"query" : {
"bool" : {
"must_not" : {
"exists" : {
"field" : "awsKafkaTimestamp "
}
},
"must" : [{
"nested" : {
"path" : "post",
"filter" : {
"bool" : {
"must" : {
"range" : {
"post.timestamp" : {
"lte" : A,
"gte" : B
}
}
}
}
}
}
}
]
}
}
}
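The same request body can be built as a Python dict before sending it to ES (e.g. with the official client or requests); a hedged sketch, where lower and upper are hypothetical values standing in for the A and B placeholders:

```python
lower, upper = 100, 200  # hypothetical timestamp bounds (A and B above)

body = {
    "query": {
        "bool": {
            # "awsKafkaTimestamp is null" is expressed as "the field
            # does not exist", hence must_not + exists.
            "must_not": {"exists": {"field": "awsKafkaTimestamp"}},
            "must": [{
                "nested": {
                    "path": "post",
                    "filter": {"bool": {"must": {
                        "range": {"post.timestamp": {"gte": lower, "lte": upper}}
                    }}},
                }
            }],
        }
    }
}

print(body["query"]["bool"]["must_not"]["exists"]["field"])  # awsKafkaTimestamp
```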
I want to use Elasticsearch as a search engine. I'm copying records from MySQL to Elasticsearch, and when I query Elasticsearch I want to calculate a value from the indexed data and use it to sort the results.
My index looks like:
{
"busquedas" : {
"aliases" : { },
"mappings" : {
"coche" : {
"properties" : {
"coeff_e" : {
"type" : "double"
},
"coeff_r" : {
"type" : "double"
},
"desc" : {
"type" : "string"
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1460116924258",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "N6jhy_ilQmSb6og16suZ4g",
"version" : {
"created" : "2030199"
}
}
},
"warmers" : { }
}
}
And I want to compute a value per record, like
myCustomOrder = (coeff_e + coeff_r) * timestamp
and use it to sort the results:
{
"sort" : [
{ "myCustomOrder" : {"order" : "asc"}},
"_score"
],
"query" : {
"term" : { ... }
}
}
I know I can use Groovy to compute values, but I have only managed to use it for filtering, as shown in the examples:
{
"from": 10,
"size": 3,
"filter": {
"script": {
"script": "doc['coeff_e'].value < 0.5"
}
}
}
Thank you in advance, I'm a total newbie to Elasticsearch :D
The same approach works as with filtering. Take a look at this section of the documentation; it should be self-explanatory once you know about scripts :-).
For completeness' sake:
{
"query" : {
....
},
"sort" : {
"_script" : {
"type" : "number",
"script" : {
"inline": "doc['field_name'].value * factor",
"params" : {
"factor" : 1.1
}
},
"order" : "asc"
}
}
}
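To see what that _script sort computes, here is the same formula evaluated client-side on toy documents (the timestamp field is a hypothetical addition, since the mapping above only shows coeff_e, coeff_r and desc):

```python
docs = [
    {"desc": "car a", "coeff_e": 0.2, "coeff_r": 0.3, "timestamp": 100},
    {"desc": "car b", "coeff_e": 0.1, "coeff_r": 0.1, "timestamp": 100},
]

def my_custom_order(doc):
    # myCustomOrder = (coeff_e + coeff_r) * timestamp
    return (doc["coeff_e"] + doc["coeff_r"]) * doc["timestamp"]

# ascending, like "order" : "asc" in the _script sort
ranked = sorted(docs, key=my_custom_order)
print([d["desc"] for d in ranked])  # ['car b', 'car a']
```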
I'm attempting to find parents based on matches in their children and retrieve children term aggregations for the matches. For some reason, the bucket count for the children aggregation is showing a higher count than actual results (I would be happy if it showed the count of the parents - or the children - in the particular children bucket).
The query is similar to the following (NOTE: I use the filtered query as I will later add a filter in addition to the query):
{
"query" : {
"filtered" : {
"query" : {
"has_child" : {
"type" : "blog_tag",
"query" : {
"filtered" : {
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
}
}
},
"aggs" : {
"my_children" : {
"children" : {
"type" : "my_child_type"
},
"aggs" : {
"field_name" : {
"terms" : {
"field" : { "blog.blog_tag.field_name" }
}
}
}
}
}
}
What is the correct way to do this?
The problem was as noted in the comments. The solution was to filter the aggregation with the query:
"query" : {
"filtered" : {
"query" : {
"has_child" : {
"type" : "blog_tag",
"query" : {
"filtered" : {
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
}
}
},
"aggs" : {
"my_children" : {
"children" : {
"type" : "my_child_type"
},
"aggs" : {
"results" : {
"filter" : {
"query" : {
"filtered" : {
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
},
"aggs" : {
"field_name" : {
"terms" : {
"field" : { "blog.blog_tag.field_name" }
}
}
}
}
}
}
}
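A client-side mimic of why the extra filter sub-aggregation is needed: the children aggregation ranges over all children of the matching parents, so children that do not themselves match the "something" tag inflate the term counts unless the query is re-applied as a filter (the toy documents below are hypothetical):

```python
from collections import Counter

children_of_matching_parents = [
    {"parent": 1, "tag": "something", "field_name": "x"},
    {"parent": 1, "tag": "other",     "field_name": "y"},  # same parent, no match
    {"parent": 2, "tag": "something", "field_name": "x"},
]

# Without the filter sub-aggregation: every child is counted.
unfiltered = Counter(c["field_name"] for c in children_of_matching_parents)

# With it: only children matching the query contribute to the buckets.
filtered = Counter(
    c["field_name"] for c in children_of_matching_parents
    if c["tag"] == "something"
)

print(unfiltered["y"])  # 1 -> the inflated bucket
print(filtered["y"])    # 0 -> gone after re-applying the query as a filter
```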
I have come from a Solr background and am trying to find the equivalent of "tagging" and "excluding" in Elasticsearch.
In the following example, how can I exclude the price filter from the calculation of the prices facet? In other words, the prices facet should take into account all of the filters except for price.
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"and" : [
{
"term" : {
"colour" : "Red"
}
},
{
"term" : {
"feature" : "Square"
}
},
{
"term" : {
"feature" : "Shiny"
}
},
{
"range" : {
"price" : {
"from" : "10",
"to" : "20"
}
}
}
]
}
}
},
"facets" : {
"colours" : {
"terms" : {
"field" : "colour"
}
},
"features" : {
"terms" : {
"field" : "feature"
}
},
"prices" : {
"statistical" : {
"field" : "price"
}
}
}
}
You can apply the price filter as a top-level filter to your query and add it to all facets except prices as a facet_filter:
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"and" : [
{
"term" : {
"colour" : "Red"
}
},
{
"term" : {
"feature" : "Square"
}
},
{
"term" : {
"feature" : "Shiny"
}
}
]
}
}
},
"facets" : {
"colours" : {
"terms" : {
"field" : "colour"
},
"facet_filter" : {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
},
"features" : {
"terms" : {
"field" : "feature"
},
"facet_filter" : {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
},
"prices" : {
"statistical" : {
"field" : "price"
}
}
},
"filter": {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
}
Btw, an important change since ES 1.0.0: the top-level filter was renamed to post_filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_search_requests.html#_search_requests). Using filtered queries is still preferred, as described here: http://elasticsearch-users.115913.n3.nabble.com/Filters-vs-Queries-td3219558.html
And there is a global option for facets to avoid filtering by the query filter (elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html#_scope).
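A hedged sketch of the post_filter form of the same idea for newer ES versions, built as a Python dict (field names reuse the example above; the aggregation replaces the deprecated statistical facet):

```python
body = {
    "query": {"bool": {"filter": [
        {"term": {"colour": "Red"}},
        {"term": {"feature": "Square"}},
        {"term": {"feature": "Shiny"}},
    ]}},
    "aggs": {
        # sees all prices, because post_filter does not affect aggregations
        "prices": {"stats": {"field": "price"}},
    },
    # filters the hits only, after aggregations are computed
    "post_filter": {"range": {"price": {"gte": 10, "lte": 20}}},
}

print(sorted(body))  # ['aggs', 'post_filter', 'query']
```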