Calculate Average of an Array's indexes in ElasticSearch

I am trying to calculate the average of the locations returned in a result set from Elasticsearch. Here is what I am trying:
'aggs' => [
    "avg_location" => [
        'avg' => [
            'field' => 'location'
        ]
    ]
]
This returns an error because location is itself an object/array holding the [lat, long] of a point. I need to calculate the average of the lats and longs of all the points returned.
How can I do that? I tried quite a few things, but none of them worked.
Here is the whole code.
$json = [
'query' => [
'bool' => [
'must_not' => [
'terms' => ['rarity' => []],
],
'must' => [
'range' => [
'disappearTime' => [
'gte' => 'now',
'lte' => 'now+1d'
]
]
],
'filter' => [
[
'geo_bounding_box' => [
'location' => [
'top_left' => [
'lat' => $northWestLat,
'lon' => $northWestLng
],
'bottom_right' => [
'lat' => $southEastLat,
'lon' => $southEastLng
]
]
]
]
]
]
],
'aggs' => [
"avg_location" => [
'avg' => [
'field' => 'location'
]
]
]
];
Thanks
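One possible approach, assuming location is mapped as a geo_point (which the geo_bounding_box filter above suggests): Elasticsearch has a geo_centroid metric aggregation that returns a single lat/lon centroid for all matching points, whereas avg expects a numeric field. The sketch below swaps the avg aggregation for geo_centroid; the helper name buildCentroidAggs and the match_all placeholder query are assumptions made for illustration.

```php
<?php
// Hypothetical helper: builds an 'aggs' section that asks Elasticsearch
// for the centroid of a geo_point field instead of a numeric average.
function buildCentroidAggs(string $field = 'location'): array
{
    return [
        'avg_location' => [
            'geo_centroid' => [
                'field' => $field,
            ],
        ],
    ];
}

$json = [
    // Placeholder: in practice this would be the bool query shown above.
    'query' => ['match_all' => new stdClass()],
    'aggs'  => buildCentroidAggs(),
];

echo json_encode($json), PHP_EOL;
```

If this applies to your version, the centroid should appear in the response under aggregations.avg_location.location as lat and lon values.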

Related

Elasticsearch - [function_score] malformed query, expected [END_OBJECT] but found [FIELD_NAME]

I have a query in PHP that is already working and now I want to expand it with function_score. The goal is that I can boost more recent content based on a timestamp.
I found this article https://discuss.elastic.co/t/how-to-prioritize-more-recent-content/134100 and was also reading this doc https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html.
I guess the new part is misplaced somehow, but I can't figure out where to put it. I'm pretty new to Elasticsearch.
The error
"type":"parsing_exception",
"reason":"[function_score] malformed query, expected [END_OBJECT] but found [FIELD_NAME]"
The new part
'function_score' => [
'functions' => [
[
'filter'=> [
'range' => [
'tstamp' => [
'gte' => 'now-1y',
'lte' => 'now'
]
]
],
'weight' => 5
],
[
'filter' => [
'range' => [
'tstamp' => [
'gte' => 'now-3yr',
'lte' => 'now-1yr'
]
]
],
'weight' => 2
]
],
'boost_mode' => 'multiply'
],
The full query
'query' => [
'function_score' => [
'functions' => [
[
'filter'=> [
'range' => [
'tstamp' => [
'gte' => 'now-1y',
'lte' => 'now'
]
]
],
'weight' => 5
],
[
'filter' => [
'range' => [
'tstamp' => [
'gte' => 'now-3yr',
'lte' => 'now-1yr'
]
]
],
'weight' => 2
]
],
'boost_mode' => 'multiply'
],
'bool' => [
'filter' => [
['range' => [
'starttime' => ['lte' => $now],
]]
],
'must' => [
["multi_match" => [
'fuzziness' => 'auto',
'query' => $_REQUEST['kw'],
'fields' => [
'content^2',
'teaser^2',
'bodytext^2',
'title^5',
'header^3'
],
]],
['bool' => [
'should' => [
['match' => ['sys_language_uid' => $sysLanguageUid]],
['match' => ['sys_language_uid' => -1]],
],
'minimum_should_match' => 1,
]],
],
'must_not' => [
['match' => ['hidden' => 1]],
['match' => ['deleted' => 1]],
['match' => ['no_search' => 1]]
],
]
],
Any help is much appreciated
I found the solution. Here is the full query that works; I hope it helps someone else.
The "query" part belongs inside "function_score":
'query' => [
'function_score' => [
'query' => [
'bool' => [
'filter' => [
['range' => [
'starttime' => ['lte' => $now],
]],
],
'should' => [
['multi_match' => [
'query' => $keyword,
'fields' => [
'content^2',
'teaser^2',
'bodytext^2',
'title^5',
'header^3'
],
'type'=>'best_fields',
'boost' => 3,
'operator' => 'and'
]],
['multi_match' => [
'query' => $keyword,
'fields' => [
'content^2',
'teaser^2',
'bodytext^2',
'title^5',
'header^3'
],
'type'=>'best_fields',
'boost' => 2
]],
['multi_match' => [
'fuzziness' => 'auto',
'query' => $keyword,
'fields' => [
'content^2',
'teaser^2',
'bodytext^2',
'title^5',
'header^3'
],
]],
],
'must' => [
['bool' => [
'should' => [
['match' => ['sys_language_uid' => $sysLanguageUid]],
['match' => ['sys_language_uid' => -1]],
],
'minimum_should_match' => 1,
]],
],
'must_not' => [
['match' => ['hidden' => 1]],
['match' => ['deleted' => 1]],
['match' => ['no_search' => 1]]
],
]
],
'functions' => [
[
'filter'=> [
'range' => [
'tstamp' => [
'gte' => $now - 31556952,
'lte' => $now
]
]
],
'weight' => 3
],
[
'filter' => [
'range' => [
'tstamp' => [
'gte' => $now - 94670856,
'lte' => $now - 31556952
]
]
],
'weight' => 2
]
],
'boost_mode' => 'multiply'
],
],
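Reduced to its skeleton, the fix is that function_score takes the original query under its own query key, as a sibling of functions and boost_mode. A minimal sketch of that shape follows; the helper name wrapInFunctionScore and the sample match query are illustrative assumptions, not part of the original code.

```php
<?php
// Hypothetical helper: wraps an existing query in function_score so that
// 'query', 'functions' and 'boost_mode' are siblings inside function_score.
function wrapInFunctionScore(array $query, array $functions, string $boostMode = 'multiply'): array
{
    return [
        'function_score' => [
            'query'      => $query,
            'functions'  => $functions,
            'boost_mode' => $boostMode,
        ],
    ];
}

$now = time();

// Sample inner query (assumed); any existing bool query can go here.
$boolQuery = ['bool' => ['must' => [['match' => ['title' => 'example']]]]];

// One weight function: boost documents changed within the last year.
$functions = [
    [
        'filter' => ['range' => ['tstamp' => ['gte' => $now - 31556952, 'lte' => $now]]],
        'weight' => 3,
    ],
];

$body = ['query' => wrapInFunctionScore($boolQuery, $functions)];
```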

How to include mapped fields subfield in result in Elasticsearch?

For aggregation, I have a raw value of my field, but I can't access this value in my query. For example, I have the brand Tommy Hilfiger and its raw value tommy-hilfiger stored as brand.keyword. How can I include this value in the search results?
'body' => [
'settings' => [
'analysis' => [
'filter' => [
'remove_spaces_inside' => [
'type' => 'pattern_replace',
'pattern' => '\\s+',
'replacement' => ' '
],
'convert_spaces' => [
'type' => 'pattern_replace',
'pattern' => '\\s+',
'replacement' => '-'
],
],
'char_filter' => [
'convert_amp' => [
'type' => 'pattern_replace',
'pattern' => '&',
'replacement' => 'and'
]
],
'analyzer' => [
'slug' => [
'char_filter' => ['convert_amp'],
'tokenizer' => 'keyword',
'filter' => ['trim', 'lowercase', 'asciifolding', 'remove_spaces_inside', 'convert_spaces']
],
'format' => [
'char_filter' => ['convert_amp'],
'tokenizer' => 'keyword',
'filter' => ['trim', 'remove_spaces_inside']
]
]
]
],
'mappings' => [
'my_type' => [
'properties' => [
'brand' => [
'type' => 'string',
'fields' => [
'keyword' => [
'type' => 'string',
'analyzer' => 'slug',
'index_options' => 'docs',
]
]
]
]
]
]
]
Update:
In my case, I store the brand in two fields: the default "Tommy Hilfiger" for full-text search, and a formatted keyword (slug) "tommy-hilfiger" for exact search. I can aggregate data by slug, but I can't get this field back from my query. For example, this query returns all records with the brand Tommy Hilfiger, but only the default values, not the slug.
'body' => [
    '_source' => [
        'brand',
        'brand.keyword'
    ],
    'query' => [
        'bool' => [
            'must' => [
                [
                    'terms' => [
                        'brand.keyword' => [
                            'tommy-hilfiger',
                        ]
                    ]
                ]
            ]
        ]
    ]
]
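A likely explanation: _source holds only the JSON that was originally indexed, and a multi-field such as brand.keyword is derived from brand at index time, so there is no brand.keyword entry in _source to return. One workaround, sketched under that assumption, is to compute the slug in PHP and index it as a regular field. The function name slugify and the field name brand_slug are hypothetical, and the sketch skips the asciifolding step, so it only mirrors the analyzer for ASCII input.

```php
<?php
// Hypothetical PHP counterpart of the 'slug' analyzer (ASCII input only):
// '&' -> 'and', trim, lowercase, then each whitespace run -> '-'.
function slugify(string $value): string
{
    $value = str_replace('&', 'and', $value);
    $value = strtolower(trim($value));
    return preg_replace('/\s+/', '-', $value);
}

// Document to index: the slug is stored as its own field ('brand_slug' is
// an assumed name), so it comes back in _source like any other field.
$document = [
    'brand'      => 'Tommy Hilfiger',
    'brand_slug' => slugify('Tommy Hilfiger'),
];
```

With a field like this in place, the _source filter in the query could list brand_slug instead of brand.keyword.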

Elastic search to match top level field, and a nested field

I can't seem to get this query to work.
I am trying to match all tags whose name matches $channel_name
and which also have a video with the name $search.
I think the nested part is messed up. How can I fix this query?
$body = [
    'query' => [
        "bool" => [
            "must" => [
                'match_phrase' => ['name' => $channel_name]
            ],
            'nested' => [
                'path' => 'videos',
                'query' => [
                    'bool' => [
                        [
                            'must' => [
                                'match' => [
                                    'videos.name' => $search
                                ]
                            ]
                        ]
                    ]
                ]
            ]
        ],
    ],
    'inner_hits' => [
        'videos' => [
            'path' => [
                'videos' => [
                    'size' => $this->pageSize,
                    'sort' => $sortOrder,
                    'from' => ($pageNumber - 1) * $this->pageSize,
                    'query' => [
                        'match_all' => []
                    ]
                ]
            ]
        ]
    ]
];
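One way the query above might be restructured, offered as a sketch rather than a verified fix: nested is itself a query clause, so it belongs inside bool/must next to match_phrase, and inner_hits is an option of the nested query rather than a top-level key. The helper name buildTagSearchBody, the sample arguments, and the sort field videos.created_at are assumptions made for illustration.

```php
<?php
// Hypothetical helper: 'nested' sits inside bool/must as a second clause,
// and paging/sorting of the matched videos goes into its 'inner_hits'.
function buildTagSearchBody(
    string $channelName,
    string $search,
    int $pageNumber,
    int $pageSize,
    array $sortOrder
): array {
    return [
        'query' => [
            'bool' => [
                'must' => [
                    ['match_phrase' => ['name' => $channelName]],
                    ['nested' => [
                        'path'  => 'videos',
                        'query' => ['match' => ['videos.name' => $search]],
                        'inner_hits' => [
                            'size' => $pageSize,
                            'from' => ($pageNumber - 1) * $pageSize,
                            'sort' => $sortOrder,
                        ],
                    ]],
                ],
            ],
        ],
    ];
}

// Sample call; 'videos.created_at' is an assumed field name.
$body = buildTagSearchBody('news', 'interview', 2, 10, [['videos.created_at' => 'desc']]);
```

Note that inner_hits placed this way returns only the videos that matched the nested query, which differs from the match_all in the original attempt.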

Using geo_polygon and geo_bounding_box filter together in ElasticSearch

I am new to ElasticSearch and am trying to use the geo_bounding_box and geo_polygon filters together in a search. Each works when used on its own, but when I use them together, I get an error. Here is my code.
$json = [
'query' => [
'bool' => [
'must_not' => [
['terms' => ['_id' => []]],
['terms' => ['rarity' => []]]
],
'must' => [
'range' => [
'disappearTime' => [
'gte' => 'now',
'lte' => 'now+1d'
]
]
],
'filter' => [
'geo_bounding_box' => [
'location' => [
'top_left' => [
'lat' => 52.280577919216356,
'lon' => -113.78533601760866
],
'bottom_right' => [
'lat' => 52.26306210545918,
'lon' => -113.81855249404909
]
]
],
'geo_polygon' => [
'location' => [
"points" => [
[-113.78721646175813, 52.29637194474555],
[-113.76335508934484, 52.281770664368565],
[-113.76335508934484, 52.26112133563143],
[-113.78721646175813, 52.24652005525444],
[-113.82096153824187, 52.24652005525444],
[-113.84482291065517, 52.26112133563143],
[-113.84482291065517, 52.281770664368565],
[-113.82096153824187, 52.29637194474555],
[-113.78721646175813, 52.29637194474555],
[-113.69997059121626, 52.298658944745554],
[-113.67610798767082, 52.28405766436857],
[-113.67610798767082, 52.26340833563143],
[-113.69997059121626, 52.248807055254446],
[-113.73371740878373, 52.248807055254446],
[-113.75758001232917, 52.26340833563143],
[-113.75758001232917, 52.28405766436857],
[-113.73371740878373, 52.298658944745554],
[-113.69997059121626, 52.298658944745554]
]
]
]
]
]
]
];
Any help would be greatly appreciated.
Thanks
You simply need to put each geo filter inside its own PHP array, like you did with the terms queries in the bool/must_not clause:
$json = [
'query' => [
'bool' => [
'must_not' => [
['terms' => ['_id' => []]],
['terms' => ['rarity' => []]]
],
'must' => [
'range' => [
'disappearTime' => [
'gte' => 'now',
'lte' => 'now+1d'
]
]
],
'filter' => [
[
'geo_bounding_box' => [
'location' => [
'top_left' => [
'lat' => 52.280577919216356,
'lon' => -113.78533601760866
],
'bottom_right' => [
'lat' => 52.26306210545918,
'lon' => -113.81855249404909
]
]
]
],
[
'geo_polygon' => [
'location' => [
"points" => [
[-113.78721646175813, 52.29637194474555],
[-113.76335508934484, 52.281770664368565],
[-113.76335508934484, 52.26112133563143],
[-113.78721646175813, 52.24652005525444],
[-113.82096153824187, 52.24652005525444],
[-113.84482291065517, 52.26112133563143],
[-113.84482291065517, 52.281770664368565],
[-113.82096153824187, 52.29637194474555],
[-113.78721646175813, 52.29637194474555],
[-113.69997059121626, 52.298658944745554],
[-113.67610798767082, 52.28405766436857],
[-113.67610798767082, 52.26340833563143],
[-113.69997059121626, 52.248807055254446],
[-113.73371740878373, 52.248807055254446],
[-113.75758001232917, 52.26340833563143],
[-113.75758001232917, 52.28405766436857],
[-113.73371740878373, 52.298658944745554],
[-113.69997059121626, 52.298658944745554]
]
]
]
]
]
]
]
];
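To see why the wrapping matters, it may help to compare what json_encode produces for both shapes. In the sketch below the filter bodies are shortened to placeholder strings (an assumption for brevity): the first form yields one JSON object carrying two query keys, the second a JSON array of single-key objects, which is the shape the corrected query uses.

```php
<?php
// Placeholder bodies; the real ones are the geo definitions shown above.
$box     = ['location' => 'box'];
$polygon = ['location' => 'polygon'];

// Two query types as keys of ONE associative array (the failing shape).
$combined = ['filter' => ['geo_bounding_box' => $box, 'geo_polygon' => $polygon]];

// Each query type wrapped in its own array element (the working shape).
$separate = ['filter' => [['geo_bounding_box' => $box], ['geo_polygon' => $polygon]]];

echo json_encode($combined), PHP_EOL;
echo json_encode($separate), PHP_EOL;
```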

How to order by date documents in aggregator with elasticsearch

I have this aggregation to deduplicate documents:
$searchParams['body'] = [
    'aggs' => [
        'dedup' => [
            'terms' => [
                'field' => 'source',
                'size' => 50,
                'order' => [
                    'max_hits' => "desc"
                ]
            ],
            'aggs' => [
                'dedup_hits' => [
                    'top_hits' => [
                        'size' => 1
                    ]
                ],
                'max_hits' => [
                    'max' => [
                        "script" => "doc.score"
                    ]
                ]
            ]
        ]
    ]
];
This query orders by document score. However, I want to order by the _timestamp field. Is that possible? I tested the Date Histogram aggregation, but without success.
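One way this might be approached, assuming _timestamp is enabled and indexed in the mapping (the field only exists in older Elasticsearch versions): replace the score-based max sub-aggregation with a max on _timestamp, order the terms buckets by it, and sort top_hits on the same field so each bucket returns its newest document. The aggregation name latest is an assumption.

```php
<?php
$searchParams = [];

$searchParams['body'] = [
    'aggs' => [
        'dedup' => [
            'terms' => [
                'field' => 'source',
                'size'  => 50,
                // Order buckets by the newest timestamp found in each bucket.
                'order' => ['latest' => 'desc'],
            ],
            'aggs' => [
                'dedup_hits' => [
                    'top_hits' => [
                        'size' => 1,
                        // Return the most recent document of the bucket.
                        'sort' => [['_timestamp' => ['order' => 'desc']]],
                    ],
                ],
                // 'latest' is an assumed name for the max-timestamp metric.
                'latest' => [
                    'max' => ['field' => '_timestamp'],
                ],
            ],
        ],
    ],
];
```

If _timestamp is not available, the same structure should work with any date field stored on the documents.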
