Elasticsearch Aggregation sorting

My Elasticsearch mapping is
{
"mappings" : {
"loc" : {
"dynamic": "true",
"properties" : {
"geoip" : {
"properties" : {
"location" : { "type": "geo_point"}
}
},
"lon" : { "type" : "double" },
"lat" : { "type" : "double" },
"altitude" : { "type" : "double" },
"id" : { "type" : "long" },
"date" : { "type" : "date", "format" : "epoch_millis" },
"ip" : { "type" : "string" },
"port" : { "type" : "string" }
}
}
}
}
I want to sort by time, so I wrote this query:
{
  "query": {
    "bool" : {
      "must" : {
        "match_all" : {}
      },
      "filter" : {
        "geo_distance" : {
          "distance" : "0.2km",
          "geoip.location" : {
            "lat" : 36.773353,
            "lon" : 126.933847
          }
        }
      }
    }
  },
  "size" : 0,
  "sort" : { "date" : { "order" : "desc" } },
  "aggs" : {
    "ids" : {
      "terms" : {
        "field" : "id"
      },
      "aggs" : {
        "dedup_docs" : {
          "top_hits" : {"size" : 1}
        }
      }
    }
  }
}
I want to apply the GPS filter, group the results by id, and return the latest document for each id, with the results sorted chronologically.
However, the date values in the result are not in order.
How should I modify the query?
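One possible direction, offered as a sketch rather than a verified fix: the top-level "sort" only orders search hits, and those are suppressed here by "size" : 0, so it has no effect on which document top_hits selects. The body below moves the sort into top_hits and orders the id buckets by a max sub-aggregation on date. The helper name build_latest_per_id_query and the aggregation name latest_date are illustrative and do not come from the original post.

```python
import json


def build_latest_per_id_query(lat, lon, distance="0.2km"):
    """Return a search body that keeps the newest document per id."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "geoip.location": {"lat": lat, "lon": lon},
                    }
                },
            }
        },
        "size": 0,
        "aggs": {
            "ids": {
                "terms": {
                    "field": "id",
                    # Order the buckets by the newest date found in each bucket.
                    "order": {"latest_date": "desc"},
                },
                "aggs": {
                    "latest_date": {"max": {"field": "date"}},
                    "dedup_docs": {
                        "top_hits": {
                            "size": 1,
                            # Pick the newest document inside each bucket.
                            "sort": [{"date": {"order": "desc"}}],
                        }
                    },
                },
            }
        },
    }


body = build_latest_per_id_query(36.773353, 126.933847)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search request body in place of the original query.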

Related

Kibana index pattern mapping conflict

I am tired of reindexing: every two to three weeks I have to reindex.
{
"winlogbeat_sysmon" : {
"order" : 0,
"index_patterns" : [
"log-wlb-sysmon-*"
],
"settings" : {
"index" : {
"lifecycle" : {
"name" : "winlogbeat_sysmon_policy",
"rollover_alias" : "log-wlb-sysmon"
},
"refresh_interval" : "1s",
"number_of_shards" : "1",
"number_of_replicas" : "1"
}
},
"mappings" : {
"properties" : {
"thread_id" : {
"type" : "long"
},
"z_elastic_ecs.event.code" : {
"type" : "long"
},
"geoip" : {
"type" : "object",
"properties" : {
"ip" : {
"type" : "ip"
},
"latitude" : {
"type" : "half_float"
},
"location" : {
"type" : "geo_point"
},
"longitude" : {
"type" : "half_float"
}
}
},
"dst_ip_addr" : {
"type" : "ip"
}
}
},
"aliases" : { }
}
}
This is the template I set up earlier; I have not changed anything since then.
In the current and previous log-wlb-sysmon indices, dst_ip_addr is an ip field, while in older log-wlb-sysmon indices it is a text field. I did not see any warning about this in Logstash.
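To see which fields are affected, the mappings of the individual indices can be compared field by field. The following is a minimal sketch under stated assumptions: the index names and the trimmed mappings are made up for illustration, and the helpers flatten_types and find_conflicts are hypothetical names, not part of any Elasticsearch or Kibana API.

```python
def flatten_types(properties, prefix=""):
    """Map dotted field paths to their declared leaf types."""
    types = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: descend into its sub-fields.
            types.update(flatten_types(spec["properties"], path + "."))
        elif "type" in spec:
            types[path] = spec["type"]
    return types


def find_conflicts(mappings_by_index):
    """Return {field: {type: [indices]}} for fields mapped with more than one type."""
    seen = {}
    for index, properties in mappings_by_index.items():
        for field, field_type in flatten_types(properties).items():
            seen.setdefault(field, {}).setdefault(field_type, []).append(index)
    return {field: types for field, types in seen.items() if len(types) > 1}


# Made-up index names and trimmed mappings, for illustration only.
mappings_by_index = {
    "log-wlb-sysmon-old": {
        "thread_id": {"type": "long"},
        "dst_ip_addr": {"type": "text"},
    },
    "log-wlb-sysmon-new": {
        "thread_id": {"type": "long"},
        "dst_ip_addr": {"type": "ip"},
        "geoip": {"type": "object", "properties": {"ip": {"type": "ip"}}},
    },
}

conflicts = find_conflicts(mappings_by_index)
print(conflicts)
```

In practice the input would be the "properties" section of each index as returned by the get mapping API for log-wlb-sysmon-*.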

Elasticsearch Suggestions Multi Index and Multi Fields

I have different indexes that contain different fields, and I am trying to figure out how to get suggestions from all indexes and all fields. I know that with GET /_all/_search I can search across all indexes. But how can I get suggestions from all indexes and all fields? I want a feature like Google's "Did you mean" suggestions.
So, I tried this out:
GET /_all/_search
{
"query" : {
"multi_match" : {
"query" : "berlin"
}
},
"suggest" : {
"text" : "berlin",
"my-suggest-1" : {
"term" : {
"field" : "street"
}
},
"my-suggest-2" : {
"term" : {
"field" : "city"
}
},
"my-suggest-3" : {
"term" : {
"field" : "description"
}
}
}
}
"my-suggest-1" and "my-suggest-2" belong to the index address (see below) and "my-suggest-3" belongs to the index product. I get the following error:
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "no mapping found for field [street]"
},
{
"type" : "illegal_argument_exception",
"reason" : "no mapping found for field [city]"
},
{
"type" : "illegal_argument_exception",
"reason" : "no mapping found for field [description]"
}
]
}
But if I use only the fields of one index, I get suggestions:
GET /_all/_search
{
"query" : {
"multi_match" : {
"query" : "berlin"
}
},
"suggest" : {
"text" : "berlin",
"my-suggest-1" : {
"term" : {
"field" : "street"
}
},
"my-suggest-2" : {
"term" : {
"field" : "city"
}
}
}
}
Response
...
"failures" : {
...
},
"hits" : {
...
}
"suggest" : {
"my-suggest-1" : [
{
"text" : "berlin",
"offset" : 0,
"length" : 10,
"options" : [
{
"text" : "berliner",
"score" : 0.9,
"freq" : 12
},
{
"text" : "berlinger",
"score" : 0.9,
"freq" : 1
}
]
}
],
"my-suggest-2" : [
{
"text" : "berlin",
"offset" : 0,
"length" : 10,
"options" : []
}
]
...
I don't know how to get suggestions from both the address and product indexes. I would be happy if someone could help me.
Index 1 - Address:
"address" : {
"aliases" : {
....
},
"mappings" : {
"dynamic" : "strict",
"properties" : {
"_entity_type" : {
"type" : "keyword",
"index" : false
},
"street" : {
"type" : "text"
},
"city" : {
"type" : "text"
}
}
},
"settings" : {
...
}
}
Index 2 - Product:
"product" : {
"aliases" : {
...
},
"mappings" : {
"dynamic" : "strict",
"properties" : {
"_entity_type" : {
"type" : "keyword",
"index" : false
},
"description" : {
"type" : "text"
}
}
},
"settings" : {
...
}
}
You can add multiple indices to your search, but in that case you need to search over fields that exist in all of them. So in your case, you need to define all three fields in both indices. The fields "street" and "city" are filled only in the first index, and the field "description" is filled only in the second index. In the "Address" index, the "description" field exists but has no data; in the second index, "street" and "city" exist but have no data. This will be your mapping for the "Address" index:
"address" : {
"aliases" : {
....
},
"mappings" : {
"dynamic" : "strict",
"properties" : {
"_entity_type" : {
"type" : "keyword",
"index" : false
},
"street" : {
"type" : "text"
},
"city" : {
"type" : "text"
},
"description" : {
"type" : "text"
}
}
},
"settings" : {
...
}
}
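The idea of giving every index the full set of fields can be sketched as follows. This is only an illustration under assumptions: union_properties is a hypothetical helper, the mappings are trimmed copies of the two shown in the question, and a field missing from an index simply receives the definition found in the first index that declares it.

```python
import copy


def union_properties(mappings_by_index):
    """Return new mappings in which every index declares every field."""
    merged = {}
    for mapping in mappings_by_index.values():
        for field, spec in mapping["properties"].items():
            # Keep the first definition seen for each field name.
            merged.setdefault(field, spec)
    return {
        index: {
            **mapping,
            # An index's own definitions take precedence over borrowed ones.
            "properties": {**copy.deepcopy(merged), **mapping["properties"]},
        }
        for index, mapping in mappings_by_index.items()
    }


mappings = {
    "address": {
        "dynamic": "strict",
        "properties": {
            "_entity_type": {"type": "keyword", "index": False},
            "street": {"type": "text"},
            "city": {"type": "text"},
        },
    },
    "product": {
        "dynamic": "strict",
        "properties": {
            "_entity_type": {"type": "keyword", "index": False},
            "description": {"type": "text"},
        },
    },
}

merged_mappings = union_properties(mappings)
for index, mapping in merged_mappings.items():
    print(index, sorted(mapping["properties"]))
```

The input mappings are left untouched, so the result can be compared with the originals before any index is recreated.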

ElasticSearch Multi Search query not return results

I am new to ElasticSearch and am running version 2.3.5.
I am running this query:
{
"query" : {
"multi_match" : {
"type" : "cross_fields",
"query" : "John Schmidt Sankt Boulevard 118b 2554 Island",
"minimum_should_match" : "50%",
"operator" : "and",
"fields" : ["*Name", "*Street.*hasStringValue", "*hasStreetNumber", "*hasPostCode", "*PostalLocality.*hasStringValue"]
}
}
}
However, it does not return any results. If I remove the 'b' after 118 from the query, it returns the document.
All other fields match, so how can I make ElasticSearch return the document?
Here is the mapping:
{
"my_index" : {
"mappings" : {
"datasubject" : {
"properties" : {
"#context" : {
"properties" : {
"con" : {
"type" : "string"
},
"cor" : {
"type" : "string"
},
"geo" : {
"type" : "string"
},
"per" : {
"type" : "string"
}
}
},
"cor:Person" : {
"properties" : {
"con:hasContactPoint" : {
"properties" : {
"con:Mobile" : {
"properties" : {
"con:hasAreaCode" : {
"type" : "string"
},
"con:hasCompleteTelephoneNumberString" : {
"type" : "string"
},
"con:hasCountryCode" : {
"type" : "string"
}
}
},
"con:PostalAddress" : {
"properties" : {
"con:hasAddressPoint" : {
"properties" : {
"geo:StreetAddress" : {
"properties" : {
"con:hasPostCode" : {
"type" : "string"
},
"con:hasPostalLocality" : {
"properties" : {
"geo:PostalLocality" : {
"properties" : {
"cor:hasStringValue" : {
"type" : "string"
}
}
}
}
},
"geo:hasStreet" : {
"properties" : {
"geo:Street" : {
"properties" : {
"cor:hasStringValue" : {
"type" : "string"
}
}
}
}
},
"geo:hasStreetNumber" : {
"type" : "string"
}
}
}
}
}
}
}
}
},
"cor:hasBirthDate" : {
"properties" : {
"cor:Date" : {
"properties" : {
"cor:hasDateValue" : {
"type" : "date",
"format" : "strict_date_optional_time||epoch_millis"
}
}
}
}
},
"cor:hasName" : {
"properties" : {
"per:Name" : {
"properties" : {
"per:familyName" : {
"type" : "string"
},
"per:givenName" : {
"type" : "string"
}
}
}
}
},
"cor:isIdentifiedBy" : {
"properties" : {
"cor:GEDIvA" : {
"properties" : {
"cor:hasCompleteIdentifierValue" : {
"type" : "string"
}
}
},
"dataset/pdi:IndividualId" : {
"properties" : {
"cor:hasCompleteIdentifierValue" : {
"type" : "string"
}
}
}
}
}
}
}
}
}
}
}
}
And here are the index settings:
{
"gdprui" : {
"settings" : {
"index" : {
"creation_date" : "1525442279108",
"analysis" : {
"filter" : {
"my_ascii_folding" : {
"type" : "asciifolding",
"preserve_original" : "true"
},
"substring" : {
"type" : "edgeNGram",
"min_gram" : "1",
"max_gram" : "10"
}
},
"analyzer" : {
"default" : {
"filter" : [ "standard", "my_ascii_folding", "lowercase", "substring", "reverse" ],
"tokenizer" : "standard"
}
}
},
"number_of_shards" : "5",
"number_of_replicas" : "2",
"uuid" : "EMqhJwGWRKi1F5gFwuSKTQ",
"version" : {
"created" : "2030599"
}
}
}
}
}
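When comparing the two queries, it may help to look at the tokens that the default analyzer in these settings produces for "118" and "118b". The sketch below is a rough emulation, not the real analysis chain: it assumes plain ASCII input (so the ASCII folding filter changes nothing), splits on whitespace instead of using the standard tokenizer, and then applies lowercase, edge n-grams of length 1 to 10, and reverse in the order listed in the settings. The function names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the prefixes of token with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def approximate_default_analyzer(text):
    """Emulate lowercase -> edge n-gram -> reverse on whitespace-split words."""
    tokens = []
    for word in text.split():
        for gram in edge_ngrams(word.lower()):
            tokens.append(gram[::-1])  # the reverse filter runs last
    return tokens


tokens_with_b = approximate_default_analyzer("118b")
tokens_without_b = approximate_default_analyzer("118")
print(tokens_with_b)
print(tokens_without_b)
```

On a real cluster, the _analyze API reports the exact tokens for a given analyzer and would be the authoritative check.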

Mysteriously wrong values of numerical fields in ElasticSearch

I've spent the last 2 days investigating this mind-bending issue: I have an index with custom mappings on which I perform some aggregations. The problem is that the results of the aggregation on numerical fields contain values that do not appear in the database from which the data was imported, even though the number of results is the same.
I found a similar issue here where the problem was inconsistent mapping of a field across an index, but in my case it is mapped as the same type. As far as I have checked, the problem happens with the fields award.value.amount, award.value.x_amountEur and tender.value.x_amountEur. This is my current mapping as returned by curl -XGET 'http://localhost:9200/documents/_mappings?pretty&human'
(the part that contains the target fields):
{
"documents" : {
"mappings" : {
"document" : {
"properties" : {
"additionalIdentifiers" : {
"type" : "string",
"index" : "not_analyzed"
},
"award" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"contract_number" : {
"type" : "string",
"index" : "not_analyzed"
},
"date" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"x_day" : {
"type" : "integer"
},
"x_month" : {
"type" : "integer"
},
"x_year" : {
"type" : "integer"
}
}
},
"description" : {
"type" : "string"
},
"initialValue" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"currency" : {
"type" : "string"
},
"x_vat" : {
"type" : "float"
}
}
},
"minValue" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"x_amountEur" : {
"type" : "float"
}
}
},
"title" : {
"type" : "string"
},
"value" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"currency" : {
"type" : "string"
},
"x_amountEur" : {
"type" : "float"
},
"x_vat" : {
"type" : "float"
},
"x_vatbool" : {
"type" : "boolean"
}
}
},
"x_initialValue" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"x_amountEur" : {
"type" : "float"
},
"x_vatbool" : {
"type" : "boolean"
}
}
}
}
},
"awardCriteria" : {
"type" : "string"
},
"contract_number" : {
"type" : "string"
},
"document_id" : {
"type" : "string",
"index" : "not_analyzed"
},
"numberOfTenderers" : {
"type" : "string"
},
"procurementMethod" : {
"type" : "string"
},
"procuring_entity" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"address" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"country" : {
"type" : "string"
},
"countryName" : {
"type" : "string",
"index" : "not_analyzed"
},
"email" : {
"type" : "string"
},
"locality" : {
"type" : "string"
},
"postalCode" : {
"type" : "string"
},
"streetAddress" : {
"type" : "string"
},
"telephone" : {
"type" : "string"
},
"x_url" : {
"type" : "string"
}
}
},
"name" : {
"type" : "string"
},
"x_slug" : {
"type" : "string",
"index" : "not_analyzed"
}
}
},
"suppliers" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"address" : {
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"email" : {
"type" : "string"
},
"locality" : {
"type" : "string"
},
"postalCode" : {
"type" : "string"
},
"streetAddress" : {
"type" : "string"
},
"telephone" : {
"type" : "string"
},
"x_url" : {
"type" : "string"
}
}
},
"name" : {
"type" : "string"
},
"x_slug" : {
"type" : "string",
"index" : "not_analyzed"
}
}
},
"tender" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"value" : {
"type" : "nested",
"properties" : {
"_id" : {
"properties" : {
"$oid" : {
"type" : "string"
}
}
},
"amount" : {
"type" : "float"
},
"currency" : {
"type" : "string"
},
"x_amountEur" : {
"type" : "float"
},
"x_vat" : {
"type" : "float"
},
"x_vatbool" : {
"type" : "boolean"
}
}
}
}
}
This is the aggregation I am using in order to get the values of contracts between each pair of supplier - procuring_entity:
Document.es.search({
"search_type": "count" ,
"body":{
"aggregations": {
"entities":{
"nested": {
"path": "procuring_entity"
},
"aggs": {
"procuring_entity_names": {
"terms": {
"field": "procuring_entity.x_slug",
"size": 0
},
"aggs": {
"suppliers": {
"nested": {
"path": "suppliers"
},
"aggs": {
"suppliers_names": {
"terms":{
"field": "suppliers.x_slug",
"size": 0
},
"aggs": {
"awards": {
"nested": {
"path": "award.value"
},
"aggs": {
"award_amounts": {
"terms":{
"field": "award.value.x_amountEur",
"size": 0
}
}
}
}
}
}
}
}
}
}
}
}
}
}})
The result with type float is:
{"entities"=>
{"doc_count"=>24300,
"procuring_entity_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"vsia-bernu-kliniska-universitates-slimnica",
"doc_count"=>1360,
"suppliers"=>
{"doc_count"=>1360,
"suppliers_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"recipe-plus-as",
"doc_count"=>388,
"awards"=>
{"doc_count"=>388,
"awards"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>3679.086669921875, "doc_count"=>373},
{"key"=>0.0, "doc_count"=>12},
{"key"=>73610.3203125, "doc_count"=>1},
{"key"=>244000.0, "doc_count"=>1},
{"key"=>342348.9375, "doc_count"=>1}]}}}
The problem is that in MongoDB the same query returns 388 documents that all have award.value.x_amountEur = 3679.08661250056, as shown by this Mongoid query:
Document.where(:"procuring_entity.x_slug" => "vsia-bernu-kliniska-universitates-slimnica")
.keep_if{|doc| doc.suppliers.first.x_slug == "recipe-plus-as"}
.map{|doc| doc.award.value.x_amountEur}.uniq
=>[3679.08661250056]
A query directly into MongoDB returns the same.
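One part of the difference can be checked independently of Elasticsearch: a field mapped as float holds a 32-bit value, so 3679.08661250056 cannot be stored exactly. The sketch below uses only Python's standard struct module to round the MongoDB value to 32-bit precision; to_float32 is a hypothetical helper name. It accounts only for the changed digits of the most frequent bucket key, not for the additional buckets.

```python
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


mongo_value = 3679.08661250056
as_float32 = to_float32(mongo_value)
print(as_float32)       # value after a round trip through 32-bit storage
print(int(mongo_value)) # value after truncation to an integer
```

The first printed number can be compared with the largest bucket key in the float result, and the second with the largest bucket key of a long mapping.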
I have also tried to map the targeted fields as double, which gave the same result, and as long, which returned the following (even more incorrect) result:
{"entities"=>
{"doc_count"=>24300,
"procuring_entity_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"vsia-bernu-kliniska-universitates-slimnica",
"doc_count"=>1360,
"suppliers"=>
{"doc_count"=>1360,
"suppliers_names"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>"recipe-plus-as",
"doc_count"=>388,
"awards"=>
{"doc_count"=>388,
"awards"=>
{"doc_count_error_upper_bound"=>0,
"sum_other_doc_count"=>0,
"buckets"=>
[{"key"=>3679, "doc_count"=>371},
{"key"=>0, "doc_count"=>12},
{"key"=>44300, "doc_count"=>1},
{"key"=>80472, "doc_count"=>1},
{"key"=>331636, "doc_count"=>1},
{"key"=>342348, "doc_count"=>1},
{"key"=>1658805, "doc_count"=>1}]}}}
I'm using Elasticsearch 2.0, mongoid 5.0.1 and mongoid-elasticsearch for indexing. I can't think of anything else to try, so any suggestion is welcome and appreciated.
I tried to test your scenario with ES 2.0 and there is something that I'm missing. I cannot make it create buckets for the award.value.x_amountEur unless I use a reverse_nested aggregation to "get out" from one nested path and into another.
So, instead of the awards aggregation that you have I'm using the same aggregation but "wrapped" in a reverse_nested aggregation:
"aggs": {
  "getting_back": {
    "reverse_nested": {},
    "aggs": {
      "awards": {
        "nested": {
          "path": "award.value"
        },
        "aggs": {
          "award_amounts": {
            "terms": {
              "field": "award.value.x_amountEur"
            }
          }
        }
      }
    }
  }
}
With this one, the results look OK.
Later edit: following my suggestion and @Val's more general one, the complete solution was to use reverse_nested on both the awards and suppliers aggregations.
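Read literally, applying reverse_nested to both aggregations could produce a body like the one sketched below. This is one interpretation of the answer, not a confirmed copy of the final query: the wrapper names back_to_root_for_suppliers and back_to_root_for_awards and both helper functions are made up, and "size": 0 is carried over from the Elasticsearch 2.0 query in the question.

```python
import json


def terms_under_nested(path, terms_name, field, sub_aggs=None):
    """Build a nested aggregation holding one terms aggregation."""
    terms = {"terms": {"field": field, "size": 0}}
    if sub_aggs:
        terms["aggs"] = sub_aggs
    return {"nested": {"path": path}, "aggs": {terms_name: terms}}


def from_root(name, agg):
    """Wrap agg in reverse_nested so it is evaluated from the root document."""
    return {"reverse_nested": {}, "aggs": {name: agg}}


awards = terms_under_nested(
    "award.value", "award_amounts", "award.value.x_amountEur"
)
suppliers = terms_under_nested(
    "suppliers", "suppliers_names", "suppliers.x_slug",
    sub_aggs={"back_to_root_for_awards": from_root("awards", awards)},
)
entities = terms_under_nested(
    "procuring_entity", "procuring_entity_names", "procuring_entity.x_slug",
    sub_aggs={"back_to_root_for_suppliers": from_root("suppliers", suppliers)},
)

body = {"aggregations": {"entities": entities}}
print(json.dumps(body, indent=2))
```

Each reverse_nested step leaves the current nested path before the next nested aggregation enters a different one, which is the point made in the answer.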

elasticsearch nested query with ruby gem

I am using the elasticsearch ruby gem to connect to an ES server and currently have an index with the mapping below. I am trying to understand the proper syntax to query these nested objects. I have been experimenting with queries such as the following, but keep getting errors. Could someone get me started on the proper syntax for querying a structure such as this? Thanks!
client = Elasticsearch::Client.new log:true
client.search index: 'injuries', nested: { path: { week: {id: '1' } } }
Returns:
Elasticsearch::Transport::Transport::Errors::BadRequest: [400] {"error":"SearchPhaseExecutionException[Failed to execute phase [query
Sample Mapping:
{
"injuries" : {
"mappings" : {
"tbd" : {
"properties" : {
"injuries" : {
"properties" : {
"timestamp" : {
"properties" : {
"__content__" : {
"type" : "string"
},
"timeZone" : {
"type" : "string"
}
}
}
}
}
}
},
"football" : {
"properties" : {
"injuries" : {
"properties" : {
"timestamp" : {
"properties" : {
"__content__" : {
"type" : "string"
},
"timeZone" : {
"type" : "string"
}
}
},
"week" : {
"properties" : {
"id" : {
"type" : "string"
},
"inactivePlayers" : {
"properties" : {
"inactivePlayer" : {
"properties" : {
"firstName" : {
"type" : "string"
},
"lastName" : {
"type" : "string"
},
"playerId" : {
"type" : "string"
},
"position" : {
"type" : "string"
},
"status" : {
"type" : "string"
},
"teamId" : {
"type" : "string"
}
}
}
}
},
"injuredPlayers" : {
"properties" : {
"injuredPlayer" : {
"properties" : {
"displayName" : {
"type" : "string"
},
"firstName" : {
"type" : "string"
},
"gameStatus" : {
"type" : "string"
},
"injury" : {
"type" : "string"
},
"lastName" : {
"type" : "string"
},
"playerId" : {
"type" : "string"
},
"position" : {
"type" : "string"
},
"practiceStatus" : {
"type" : "string"
},
"teamId" : {
"type" : "string"
}
}
}
}
},
"season" : {
"type" : "string"
},
"seasonType" : {
"type" : "string"
}
}
}
}
}
}
}
}
}
}
Your nested query doesn't appear to have a query defined. I think it should be something like:
"nested" : {
  "path" : "week",
  "query" : {
    "match" : {"week.id" : "1"}
  }
}
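In a full request, the nested clause sits under a top-level "query" key in the search body. The sketch below builds such a body as a plain dictionary; nested_match is a hypothetical helper, and the path and field are taken from the snippet above. Two assumptions are worth stating: a nested query only works on a field mapped with "type": "nested", which the sample mapping does not show, and in that mapping week sits under injuries, so the path may need to be "injuries.week".

```python
import json


def nested_match(path, field, value):
    """Build a search body with a match query wrapped in a nested query."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {field: value}},
            }
        }
    }


body = nested_match("week", "week.id", "1")
print(json.dumps(body, indent=2))
```

With the Ruby gem, an equivalent hash would be passed as the body: argument of client.search rather than as a top-level nested: option.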
