I am trying to fetch all the documents within a radius of a particular location (lat,long).
Here's the mapping with location as the geo_point:
{
"mappings": {
"_doc": {
"properties": {
"color": {
"type": "long"
},
"createdTime": {
"type": "date"
},
"location": {
"properties": {
"lat": {
"type": "float"
},
"lon": {
"type": "float"
}
}
}
}
}
}
}
And here's my query
{
"aggregations": {
"weather_agg": {
"geo_distance": {
"field": "location",
"origin": "41.12,-100.77",
"unit": "km",
"distance_type": "plane",
"ranges": [
{
"from": 0,
"to": 100
}
]
},
"aggregations": {
"timerange": {
"filter": {
"range": {
"createdTime": {
"gte": "now-40h",
"lte": "now"
}
}
},
"aggregations": {
"weather_stats": {
"stats": {
"field": "color"
}
}
}
}
}
}
}
}
I am getting 0 hits for this. Is there something wrong with the mapping or with the query? We recently migrated to a newer cloud version, so it is possible that something broke because of that.
Instead of mapping lat and lon as floats, you should map location as a geo_point. The geo_distance aggregation expects a geo_point field.
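A minimal sketch of what that change could look like, written as a Python dict that serializes to the mapping request body. It keeps the layout of the mapping in the question, including the `_doc` level, which typeless Elasticsearch versions omit. The helper name `build_geo_mapping` and the sample document values are hypothetical. Since the type of an existing field cannot be changed in place, this assumes a new index is created and the data is reindexed into it.

```python
import json


def build_geo_mapping():
    """Return the question's mapping with `location` declared as a geo_point."""
    return {
        "mappings": {
            "_doc": {  # kept from the question; drop this level on typeless versions
                "properties": {
                    "color": {"type": "long"},
                    "createdTime": {"type": "date"},
                    "location": {"type": "geo_point"},
                }
            }
        }
    }


# Hypothetical document: a geo_point accepts the same {"lat", "lon"} object shape
sample_doc = {
    "color": 3,
    "createdTime": "2021-01-21T18:00:00Z",
    "location": {"lat": 41.12, "lon": -100.77},
}

mapping = build_geo_mapping()
print(json.dumps(mapping, indent=2))
```

With this mapping the documents themselves do not need to change shape; only the index definition does.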
We have multiple nested fields which need to be summed and then graphed almost as if it were a value of the parent document (using scripted fields is not an ideal solution for us).
Given the example index mapping:
{
"mapping": {
"_doc": {
"properties": {
"build_name": { "type": "keyword" },
"start_ms": { "type": "date" },
"projects": {
"type": "nested",
"properties": {
"project_duration_ms": { "type": "long" },
"project_name": { "type": "keyword" }
}
}
}
}
}
}
Example doc._source:
{
"build_name": "example_build_1",
"start_ms": "1611252094540",
"projects": [
{ "project_duration_ms": "19381", "project_name": "example_project_1" },
{ "project_duration_ms": "2081", "project_name": "example_project_2" }
]
},
{
"build_name": "example_build_2",
"start_ms": "1611252097638",
"projects": [
{ "project_duration_ms": "21546", "project_name": "example_project_1" },
{ "project_duration_ms": "2354", "project_name": "example_project_2" }
]
}
It would be ideal to get an aggregation result something like:
....
"aggregations" : {
"builds" : {
"total_durations" : {
"buckets" : [
{
"key": "example_build_1",
"start_ms": "1611252094540",
"total_duration": "21462"
},
{
"key": "example_build_2",
"start_ms": "1611252097638",
"total_duration": "23900"
}
]
}
}
}
}
No scripted fields necessary. This nested sum aggregation should do the trick:
{
"size": 0,
"aggs": {
"builds": {
"terms": {
"field": "build_name"
},
"aggs": {
"total_durations_parent": {
"nested": {
"path": "projects"
},
"aggs": {
"total_durations": {
"sum": {
"field": "projects.project_duration_ms"
}
}
}
}
}
}
}
}
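As a sanity check on what this aggregation should return, here is a small Python sketch that computes the same per-build sums locally from the two example documents in the question. It only mirrors the arithmetic and does not talk to Elasticsearch; the function name is made up for illustration.

```python
from collections import defaultdict

# The two example documents from the question
docs = [
    {
        "build_name": "example_build_1",
        "start_ms": "1611252094540",
        "projects": [
            {"project_duration_ms": "19381", "project_name": "example_project_1"},
            {"project_duration_ms": "2081", "project_name": "example_project_2"},
        ],
    },
    {
        "build_name": "example_build_2",
        "start_ms": "1611252097638",
        "projects": [
            {"project_duration_ms": "21546", "project_name": "example_project_1"},
            {"project_duration_ms": "2354", "project_name": "example_project_2"},
        ],
    },
]


def total_duration_per_build(documents):
    """Sum the nested project durations, grouped by build_name."""
    totals = defaultdict(int)
    for doc in documents:
        for project in doc["projects"]:
            totals[doc["build_name"]] += int(project["project_duration_ms"])
    return dict(totals)


totals = total_duration_per_build(docs)
print(totals)
```

The two sums match the `total_duration` values in the desired output (21462 and 23900).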
Your use case is a great candidate for employing the copy_to parameter which'll put the build durations into one top-level list of longs so that the nested query won't be required when we're summing them up.
Adjust the mapping like so:
"properties": {
"build_name": { "type": "keyword" },
"start_ms": { "type": "date" },
"total_duration_ms": { "type": "long" }, <--
"projects": {
"type": "nested",
"properties": {
"project_duration_ms": {
"type": "long",
"copy_to": "total_duration_ms" <--
},
"project_name": { "type": "keyword" }
}
}
}
After reindexing (which is required due to the newly added field), the above query gets simplified to:
{
"size": 0,
"aggs": {
"builds": {
"terms": {
"field": "build_name"
},
"aggs": {
"total_durations": {
"sum": {
"field": "total_duration_ms"
}
}
}
}
}
}
I am new to Elasticsearch and need to compute the difference between two sets.
Here is the mapping of the index:
{
"mappings": {
"properties": {
"Date": { "type": "date", "format": "yyyyMMdd"},
"areaID": { "type": "keyword" },
"deviceID": { "type": "keyword" }
}
}
}
The dates range from October to November.
I want to count, grouped by 'areaID', the distinct 'deviceID' values that are new in November, i.e. that appear in November but not in October.
I have no idea how to express this in ES syntax. Could anyone give me some hints?
THANKS SO MUCH!
You can use Elasticsearch aggregations to group by areaID.
Here is an example for the Kibana console:
GET your_index/_search
{
"size": 0,
"query": {
"range": {
"Date": {
"gte": "20201001",
"lte": "20201130"
}
}
},
"aggs": {
"area_id": {
"terms": {
"field": "areaID"
},
"aggs": {
"Date": {
"date_range": {
"field": "Date",
"ranges": [
{
"from": "20201101",
"to": "20201201"
}
]
},
"aggs": {
"device_id": {
"terms": {
"field": "deviceID"
}
}
}
}
}
}
}
}
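The query above groups November's devices by area, but the "new in November" part of the question is a set difference: devices seen in November minus devices seen in October, per area. One way to get it is to do that last step client-side. The sketch below shows the idea in plain Python on made-up rows; the data and the function name are hypothetical, and it assumes the Date values are available as yyyyMMdd strings as in the mapping.

```python
from collections import defaultdict

# Hypothetical rows of (Date as yyyyMMdd, areaID, deviceID)
rows = [
    ("20201005", "area_1", "dev_a"),
    ("20201020", "area_1", "dev_b"),
    ("20201103", "area_1", "dev_b"),
    ("20201110", "area_1", "dev_c"),
    ("20201012", "area_2", "dev_x"),
    ("20201115", "area_2", "dev_y"),
    ("20201120", "area_2", "dev_z"),
]


def new_devices_per_area(records, prev_month="202010", this_month="202011"):
    """Count devices per area that appear in this_month but not in prev_month."""
    seen_before = defaultdict(set)
    seen_now = defaultdict(set)
    for date, area, device in records:
        if date.startswith(prev_month):
            seen_before[area].add(device)
        elif date.startswith(this_month):
            seen_now[area].add(device)
    # Set difference: November devices minus October devices, per area
    return {
        area: len(devices - seen_before[area])
        for area, devices in seen_now.items()
    }


new_counts = new_devices_per_area(rows)
print(new_counts)
```

In practice the two sets per area would come from the device buckets of an October and a November range rather than from raw rows.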
Here is the structure of my index:
[
{
"Id":"1",
"Path":"/Series/Current/SerieA/foo/foo",
"PlayCount":100
},
{
"Id":"2",
"Path":"/Series/Current/SerieA/bar/foo",
"PlayCount":1000
},
{
"Id":"3",
"Path":"/Series/Current/SerieA/bar/bar",
"PlayCount":50
},
{
"Id":"4",
"Path":"/Series/Current/SerieB/bla/bla",
"PlayCount":300
},
{
"Id":"5",
"Path":"/Series/Current/SerieB/goo/boo",
"PlayCount":200
},
{
"Id":"6",
"Path":"/Series/Current/SerieC/foo/zoo",
"PlayCount":100
}
]
I'd like to run an aggregation that gives me the sum of "PlayCount" for each series, like this:
[
{
"key":"serieA",
"TotalPlayCount":1150
},
{
"key":"serieB",
"TotalPlayCount":500
},
{
"key":"serieC",
"TotalPlayCount":100
}
]
This is how I tried to do it, but the query fails, so this is obviously not the proper way:
{
"size": 0,
"query":{
"filtered":{
"query":{
"regexp":{
"Path":"/Series/Current/.*"
}
}
}
},
"aggs":{
"play_count_for_current_series":{
"terms": {
"field": "Path",
"regexp": "/Series/Current/([^/]+)"
},
"aggs":{
"Total_play": { "sum": { "field": "PlayCount" } }
}
}
}
}
Is there a way to do it?
My suggestion is as follows:
DELETE test
PUT /test
{
"settings": {
"analysis": {
"filter": {
"my_special_filter": {
"type": "pattern_capture",
"preserve_original": 0,
"patterns": [
"/Series/Current/([^/]+)"
]
}
},
"analyzer": {
"my_special_analyzer": {
"tokenizer": "whitespace",
"filter": [
"my_special_filter"
]
}
}
}
},
"mappings": {
"test": {
"properties": {
"Path": {
"type": "string",
"fields": {
"for_aggregations": {
"type": "string",
"analyzer": "my_special_analyzer"
},
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
Create a special analyzer that uses a pattern_capture filter to catch only the terms you are interested in. Because I didn't want to change your current mapping for that field, I added a fields section with a sub-field that uses this special analyzer. I also added a raw sub-field, which is not_analyzed and helps with the query itself.
POST test/test/_bulk
{"index":{}}
{"Id":"1","Path":"/Series/Current/SerieA/foo/foo","PlayCount":100}
{"index":{}}
{"Id":"2","Path":"/Series/Current/SerieA/bar/foo","PlayCount":1000}
{"index":{}}
{"Id":"3","Path":"/Series/Current/SerieA/bar/bar","PlayCount":50}
{"index":{}}
{"Id":"4","Path":"/Series/Current/SerieB/bla/bla","PlayCount":300}
{"index":{}}
{"Id":"5","Path":"/Series/Current/SerieB/goo/boo","PlayCount":200}
{"index":{}}
{"Id":"6","Path":"/Series/Current/SerieC/foo/zoo","PlayCount":100}
{"index":{}}
{"Id":"7","Path":"/Sersdasdies/Curradent/SerieC/foo/zoo","PlayCount":100}
For the query, you don't need the regular expression in the query because your aggregation will use that sub-field which only has your needed SerieX terms.
GET /test/test/_search
{
"size": 0,
"query": {
"filtered": {
"query": {
"regexp": {
"Path.raw": "/Series/Current/.*"
}
}
}
},
"aggs": {
"play_count_for_current_series": {
"terms": {
"field": "Path.for_aggregations"
},
"aggs": {
"Total_play": {
"sum": {
"field": "PlayCount"
}
}
}
}
}
}
And the result is:
"play_count_for_current_series": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "SerieA",
"doc_count": 3,
"Total_play": {
"value": 1150
}
},
{
"key": "SerieB",
"doc_count": 2,
"Total_play": {
"value": 500
}
},
{
"key": "SerieC",
"doc_count": 1,
"Total_play": {
"value": 100
}
}
]
}
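To see why the captured group is enough to bucket the documents, this sketch applies the same regular expression in Python to the sample paths and sums PlayCount per captured series name. It is only a local illustration of the grouping logic, not of how the analyzer runs inside Elasticsearch; paths that do not match are skipped here, which mirrors the regexp filter on Path.raw in the query. The function name is made up.

```python
import re
from collections import defaultdict

# Same pattern as in the pattern_capture filter
SERIES_PATTERN = re.compile(r"/Series/Current/([^/]+)")

# Sample documents from the bulk request, including the non-matching path
docs = [
    {"Id": "1", "Path": "/Series/Current/SerieA/foo/foo", "PlayCount": 100},
    {"Id": "2", "Path": "/Series/Current/SerieA/bar/foo", "PlayCount": 1000},
    {"Id": "3", "Path": "/Series/Current/SerieA/bar/bar", "PlayCount": 50},
    {"Id": "4", "Path": "/Series/Current/SerieB/bla/bla", "PlayCount": 300},
    {"Id": "5", "Path": "/Series/Current/SerieB/goo/boo", "PlayCount": 200},
    {"Id": "6", "Path": "/Series/Current/SerieC/foo/zoo", "PlayCount": 100},
    {"Id": "7", "Path": "/Sersdasdies/Curradent/SerieC/foo/zoo", "PlayCount": 100},
]


def play_count_per_series(documents):
    """Group by the first path segment after /Series/Current/ and sum PlayCount."""
    totals = defaultdict(int)
    for doc in documents:
        match = SERIES_PATTERN.match(doc["Path"])
        if match is None:
            continue  # not under /Series/Current/, excluded like in the query
        totals[match.group(1)] += doc["PlayCount"]
    return dict(totals)


series_totals = play_count_per_series(docs)
print(series_totals)
```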
My mapping is
{
"myapp": {
"mappings": {
"attempts": {
"properties": {
"answers": {
"properties": {
"question_id": {
"type": "long"
},
"status": {
"type": "long"
}
}
},
"exam_id": {
"type": "long"
}
}
}
}
}
}
I want to group by question_id and status.
For every question_id, I want to know how many answers have status 1 and how many have status 2.
P.S. Two attempts can contain the same questions.
First of all, you need to update your mapping and make answers a nested field. If it is not nested, each answer loses the correlation between its question_id field and its status field.
{
"myapp": {
"mappings": {
"attempts": {
"properties": {
"answers": {
"type":"nested", <-- Update here
"properties": {
"question_id": {
"type": "long"
},
"status": {
"type": "long"
}
}
},
"exam_id": {
"type": "long"
}
}
}
}
}
}
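The reason nested matters can be shown with a small Python sketch. It mimics, in a simplified way, how an array of plain objects ends up as one list of values per sub-field, so a question_id from one answer can be paired with the status of another. The sample attempt and the helper names are hypothetical.

```python
# Hypothetical attempt with two answers
attempt = {
    "exam_id": 1,
    "answers": [
        {"question_id": 10, "status": 1},
        {"question_id": 20, "status": 2},
    ],
}


def flatten_object_array(doc, field):
    """Mimic a non-nested object array: one flat value list per sub-field."""
    flat = {}
    for item in doc[field]:
        for key, value in item.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def matches_flat(flat, question_id, status):
    """Each condition is checked against its own list, independently."""
    return (question_id in flat["answers.question_id"]
            and status in flat["answers.status"])


def matches_nested(doc, question_id, status):
    """Both conditions must hold within the same answer object."""
    return any(
        a["question_id"] == question_id and a["status"] == status
        for a in doc["answers"]
    )


flat = flatten_object_array(attempt, "answers")
print(flat)
print(matches_flat(flat, 10, 2))       # question 10 never had status 2
print(matches_nested(attempt, 10, 2))
```

The flattened form reports a match for question 10 with status 2 even though no single answer has that combination, which is exactly the miscount a per-question status aggregation would suffer from.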
You can then use status in a sub-aggregation, as shown below:
"aggs": {
"nested_qid_agg": {
"nested": {
"path": "answers"
},
"aggs": {
"qid": {
"terms": {
"field": "answers.question_id",
"size": 0
},
"aggs": {
"status": {
"terms": {
"field": "answers.status",
"size": 0
}
}
}
}
}
}
}
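For reference, the shape of the result this produces (one bucket per question_id, each with a count per status) can be sketched locally in Python. The two attempts below are made-up data that share the same questions, as mentioned in the question; the function name is hypothetical.

```python
from collections import Counter

# Hypothetical attempts; both contain the same two questions
attempts = [
    {"exam_id": 1, "answers": [{"question_id": 10, "status": 1},
                               {"question_id": 20, "status": 2}]},
    {"exam_id": 1, "answers": [{"question_id": 10, "status": 2},
                               {"question_id": 20, "status": 2}]},
]


def status_counts_per_question(docs):
    """Count how often each status occurs for each question_id."""
    counts = {}
    for doc in docs:
        for answer in doc["answers"]:
            per_question = counts.setdefault(answer["question_id"], Counter())
            per_question[answer["status"]] += 1
    return counts


counts = status_counts_per_question(attempts)
for question_id, per_status in sorted(counts.items()):
    print(question_id, dict(per_status))
```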
I'm no longer able to use now in a Term Filter with Elasticsearch 1.7.1. It worked fine in previous versions, but now it returns:
nested: IllegalArgumentException[Invalid format: \"now/y\"]
A query example is:
GET _search
{
"size": 0,
"aggs": {
"price": {
"nested": {
"path": "prices"
},
"aggs": {
"valid": {
"filter": {
"term": {
"prices.referred_year": "now/y"
}
},
"aggs": {
"ranged": {
"range": {
"field": "prices.price",
"ranges": [
{
"to": 10
},
{
"from": 10
}
]
}
}
}
}
}
}
}
}
Schema:
curl -XPUT 'http://localhost:9200/test/' -d '{
"mappings": {
"product": {
"properties": {
"prices": {
"type": "nested",
"include_in_parent": true,
"properties": {
"price": {
"type": "float"
},
"referred_year": {
"type": "date",
"format": "year"
}
}
}
}
}
}
}'
Document example:
curl -XPUT 'http://localhost:9200/test/product/1' -d '{
"prices": [
{
"referred_year": "2015",
"price": "10.00"
},
{
"referred_year": "2016",
"price": "11.00"
}
]
}'
Expected result for the aggregation (gotten by substituting now/y with 2015):
"aggregations": {
"price": {
"doc_count": 2,
"valid": {
"doc_count": 1,
"ranged": {
"buckets": [
{
"key": "*-10.0",
"to": 10,
"to_as_string": "10.0",
"doc_count": 0
},
{
"key": "10.0-*",
"from": 10,
"from_as_string": "10.0",
"doc_count": 1
}
]
}
}
}
}
now/y etc still works fine in the Range Filter and in queries.
I appreciate any help on this. Thanks!
------- UPDATE -------
So, it seems now doesn't work in Term Filters at all, no matter the rounding.
Although I haven't found any documentation saying so, it seems that the now operator is not allowed in Term Filters, which actually makes sense.
The correct query would be:
GET test/_search
{
"size": 0,
"aggs": {
"price": {
"nested": {
"path": "prices"
},
"aggs": {
"valid": {
"filter": {
"range": {
"prices.referred_year": {
"gte": "now/y",
"lte": "now/y"
}
}
},
"aggs": {
"ranged": {
"range": {
"field": "prices.price",
"ranges": [
{
"to": 10
},
{
"from": 10
}
]
}
}
}
}
}
}
}
}
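The range form can stand in for the term filter because date math rounding depends on the bound it is applied to: with /y, a gte bound rounds down to the start of the year and an lte bound rounds up to the last millisecond of the year, so together they cover the whole current year. A small Python sketch of those two bounds, using a fixed "now" chosen for illustration (the function name is made up):

```python
from datetime import datetime, timedelta


def year_bounds(now):
    """Return the (start, end) instants that now/y covers as gte and lte bounds."""
    start = datetime(now.year, 1, 1)  # gte: rounded down to start of year
    # lte: rounded up to the last millisecond of the same year
    end = datetime(now.year + 1, 1, 1) - timedelta(milliseconds=1)
    return start, end


# Fixed "now" so the output is reproducible
start, end = year_bounds(datetime(2015, 8, 15, 12, 30))
print(start.isoformat())
print(end.isoformat())
```

With the sample document, only the price entry whose referred_year is 2015 falls inside these bounds, which matches the expected result obtained by substituting now/y with 2015.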