Elasticsearch DSL for all null values between 2 timestamps - elasticsearch

I have this JSON structure in Elasticsearch. I am having trouble creating a DSL to search for all null values of awsKafkaTimestamp that are in between a post.timestamp range of A and B. How can I do this?
{
"tracer": {
"post": {"timestamp": 123123},
"awsKafkaTimestamp": null,
"qdcKafkaTimestamp": null
}
}

Try this: (works for ES 2.4 - will not work for 2.2 and below)
{
"fields" : ["your_field"],
"query" : {
"bool" : {
"must_not" : {
"exists" : {
"field" : "awsKafkaTimestamp "
}
},
"must" : [{
"nested" : {
"path" : "post",
"filter" : {
"bool" : {
"must" : {
"range" : {
"post.timestamp" : {
"lte" : A,
"gte" : B
}
}
}
}
}
}
}
]
}
}
}

Related

elasticsearch nested functionScoreQuery cannot access parent properties

I have a type in elasticsearch that looks like this:
"hotel" : {
"field" : 1,
"rooms" : [
{
"type" : "single",
"magicScore" : 1
},
{
"type" : "double",
"magicScore" : 2
}
]
}
where rooms is of type nested. I sort using a nested functionScoreQuery:
{
"query" : {
"filtered" : {
"query" : {
"nested" : {
"query" : {
"function_score" : {
"filter" : {
"match_all" : { }
},
"functions" : [ {
"script_score" : {
"script" : "return doc['hotel.field'].value"
}
} ]
}
},
"path" : "rooms",
"score_mode" : "max"
}
}
}
}
Problem is hotel.field returns 0 always. Is there a way to access the parent field inside a nested query? I know I can always pack the field inside the nested document but its a hack not a solution. Would using a dismax query help me? https://discuss.elastic.co/t/nested-value-on-function-score/29935
The query I am actually using looks something like this:
{
"query" : {
"bool" : {
"must" : {
"nested" : {
"query" : {
"function_score" : {
"query" : {
"not" : {
"query" : {
"terms" : {
"rooms.type" : [ "single", "triple" ]
}
}
}
},
"functions" : [ {
"script_score" : {
"script" : {
"inline" : "return doc['rooms.magicScore'].value;",
"lang" : "groovy",
"params" : {
"ratings" : {
"sample" : 0.5
},
"variable" : [ 0.0, 0.0, 0.0, 0.0, -0.5, -2.5]
}
}
}
} ],
"score_mode" : "max"
}
},
"path" : "rooms"
}
},
"filter" : {
"bool" : {
"filter" : [ {
"bool" : {
"should" : [ {
"term" : {
"cityId" : "166"
}
}, {
"term" : {
"cityId" : "165"
}
} ]
}
}, {
"nested" : {
"query" : {
"not" : {
"query" : {
"terms" : {
"rooms.type" : [ "single", "triple" ]
}
}
}
},
"path" : "rooms"
}
} ]
}
}
}
}
}
What I am trying to achieve is to access for example the cityId inside the function_score query which is nested.
The question is why are you accessing the parent values in a nested query. Once you are in the nested context, you cannot access parent fields or other fields from other nested fields.
From the documentation:
The nested clause “steps down” into the nested comments field. It no longer has access to fields in the root document, nor fields in any other nested document.
So, rewrite your queries so that the nested part touches the fields in that nested field and anything else is accessed outside the nested part.

elasticsearch returns null on stats aggregation

I have small data of 1200 entries in Elasticsearch that is automatically input in mapped fields of document-types. The float goes in float and double goes in double.
When taking 'aggs' of the data on 'stats' like:
GET /statsd-2015.09.28/timer_data/_search
{
"query" : {
"filtered" : {
"query" : { "match_all" : {}},
"filter" : {
"range" : { "ns" : { "lte" : "gunicorn" }}
}
}
},
"aggs" : {
"value_val" : { "stats" : { "field" : "u'count_90'" } }
}
}
I get null on return like this:
...
"aggregations": {
"value_val": {
"count": 0,
"min": null,
"max": null,
"avg": null,
"sum": null
}
}
...
Here is my mapping of fields:
{"statsd-2015.09.28":{"mappings":{"timer":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},"gauge":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},"counter":{"properties":{"#timestamp":{"type":"string"},"act":{"type":"string"},"grp":{"type":"string"},"ns":{"type":"string"},"tgt":{"type":"string"},"val":{"type":"float"}}},"timer_data":{"properties":{"#timestamp":{"type":"double"},"act":{"type":"string"},"count":{"type":"float"},"count_90":{"type":"float"},"count_ps":{"type":"float"},"grp":{"type":"string"},"lower":{"type":"float"},"mean":{"type":"float"},"mean_90":{"type":"float"},"median":{"type":"float"},"ns":{"type":"string"},"std":{"type":"float"},"sum":{"type":"float"},"sum_90":{"type":"float"},"sum_squares":{"type":"float"},"sum_squares_90":{"type":"float"},"tgt":{"type":"string"},"upper":{"type":"float"},"upper_90":{"type":"float"}}}}}}
What I want to ask is that why is my output not desired? And how can I get it?
GET /statsd-2015.09.28/timer_data/_search
{
"query" : {
"filtered" : {
"query" : { "match_all" : {}},
"filter" : {
"range" : { "ns" : { "lte" : "gunicorn" }}
}
}
},
"aggs" : {
"value_val" : { "stats" : { "field" : "count_90" } }
}
}
I am new to this but I realized that field name was not what I was using. After this, everything became clear.

Elasticsearch: has_child query with children aggregation - bucket counts are wrong

I'm attempting to find parents based on matches in their children and retrieve children term aggregations for the matches. For some reason, the bucket count for the children aggregation is showing a higher count than actual results (I would be happy if it showed the count of the parents - or the children - in the particular children bucket).
The query is similar to the following (NOTE: I use the filtered query as I will later add a filter in addition to the query):
{
"query" : {
"filtered" : {
"query" : {
"has_child" : {
"type" : "blog_tag",
"query" : {
"filtered" : {
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
}
}
},
"aggs" : {
"my_children" : {
"children" : {
"type" : "my_child_type"
},
"aggs" : {
"field_name" : {
"terms" : {
"field" : { "blog.blog_tag.field_name" }
}
}
}
}
}
}
What is the correct way to do this?
The problem was as noted in the comments. The solution was to filter the aggregation with the query,
"query" : {
"filtered" : {
"query" : {
"has_child" : {
"type" : "blog_tag",
"query" : {
"filtered" : {
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
}
}
},
"aggs" : {
"my_children" : {
"children" : {
"type" : "my_child_type"
},
"aggs" : {
"results" : {
"filter" : {
"query" : {
"filtered" : {
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
},
"aggs" : {
"field_name" : {
"terms" : {
"field" : { "blog.blog_tag.field_name" }
}
}
}
}
}
}
}

filtering on geo_distance in elasticsearch

I'm trying to set a maximum distance from my center location in my elasticsearch query, there's no problems with the sorting part:
{
"query" : {
"match_all" : {}
},
"sort" : [
{
"_geo_distance" : {
"location" : "56,14",
"order" : "asc",
"unit" : "km"
}
}
]
}
however when I try adding a filter I get the "[geo_distance] filter does not support [location]":
{
"query" : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"geo_distance" : {
"distance" : "200m",
"location" : {
"location" : "56,14"
}
}
}
}
},
"sort" : [
{
"_geo_distance" : {
"location" : "56,14",
"order" : "asc",
"unit" : "km"
}
}
]
}
any ideas of what I'm doing wrong?
Use this filter instead
"filter" : {
"geo_distance" : {
"distance" : "200m",
"location" : "56,14"
}
}
Location can be any name for the field like if you field name is loc or locator,then the query would be
"filter" : {
"geo_distance" : {
"distance" : "200m",
"loc"/"locator" : "56,14"
}
}

How to exclude a filter from a facet?

I have come from a Solr background and am trying to find the equivalent of "tagging" and "excluding" in Elasticsearch.
In the following example, how can I exclude the price filter from the calculation of the prices facet? In other words, the prices facet should take into account all of the filters except for price.
{
query : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"and" : [
{
"term" : {
"colour" : "Red"
}
},
{
"term" : {
"feature" : "Square"
}
},
{
"term" : {
"feature" : "Shiny"
}
},
{
"range" : {
"price" : {
"from" : "10",
"to" : "20"
}
}
}
]
}
}
},
"facets" : {
"colours" : {
"terms" : {
"field" : "colour"
}
},
"features" : {
"terms" : {
"field" : "feature"
}
},
"prices" : {
"statistical" : {
"field" : "price"
}
}
}
}
You can apply price filter as a top level filter to your query and add it to all facets expect prices as a facet_filter:
{
query : {
"filtered" : {
"query" : {
"match_all" : {}
},
"filter" : {
"and" : [
{
"term" : {
"colour" : "Red"
}
},
{
"term" : {
"feature" : "Square"
}
},
{
"term" : {
"feature" : "Shiny"
}
}
]
}
}
},
"facets" : {
"colours" : {
"terms" : {
"field" : "colour"
},
"facet_filter" : {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
},
"features" : {
"terms" : {
"field" : "feature"
},
"facet_filter" : {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
},
"prices" : {
"statistical" : {
"field" : "price"
}
}
},
"filter": {
"range" : { "price" : { "from" : "10", "to" : "20" } }
}
}
Btw, important change since ES 1.0.0. Top-level filter was renamed to post_filter (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_search_requests.html#_search_requests). And filtered queries using is still preferred as described here: http://elasticsearch-users.115913.n3.nabble.com/Filters-vs-Queries-td3219558.html
And there is global option for facets to avoid filtering by query filter (elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets.html#_scope).

Resources