ElasticSearch nested query score - elasticsearch

I have an index :
PUT my_index2
{
"mappings": {
"my_type": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
}
I have two documents:
POST my_index2/my_type/
{
"user": [
{
"name": "Alice Don"
},
{
"name": "Smith"
}
]
}
POST my_index2/my_type/
{
"user": [
{
"name": "Alice David"
}
]
}
When I search it:
GET my_index2/_search
{
"query": {
"nested" : {
"path" : "user",
"query" : {
"bool" : {
"should" : [
{ "match" : {"user.name" : "Alice"} }
]
}
}
}
}
}
Although both documents have one "Alice", the score of the first one is higher. How could that possible?

Your first document has shorter "name", so you got more same chars between "query" and "name"

Related

ElasticSearch DSL Matching all elements of query in list of list of strings

I'm trying to query ElasticSearch to match every document that in a list of list contains all the values requested, but I can't seem to find the perfect query.
Mapping:
"id" : {
"type" : "keyword"
},
"mainlist" : {
"properties" : {
"format" : {
"type" : "keyword"
},
"tags" : {
"type" : "keyword"
}
}
},
...
Documents:
doc1 {
"id" : "abc",
"mainlist" : [
{
"type" : "big",
"tags" : [
"tag1",
"tag2"
]
},
{
"type" : "small",
"tags" : [
"tag1"
]
}
]
},
doc2 {
"id" : "abc",
"mainlist" : [
{
"type" : "big",
"tags" : [
"tag1"
]
},
{
"type" : "small",
"tags" : [
"tag2"
]
}
]
},
doc3 {
"id" : "abc",
"mainlist" : [
{
"type" : "big",
"tags" : [
"tag1"
]
}
]
}
The query I've tried that got me closest to the result is:
GET /index/_doc/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"mainlist.tags": "tag1"
}
},
{
"term": {
"mainlist.tags": "tag2"
}
}
]
}
}
}
although I get as result doc1 and doc2, while I'd only want doc1 as contains tag1 and tag2 in a single list element and not spread across both sublists.
How would I be able to achieve that?
Thanks for any help.
As mentioned by #caster, you need to use the nested data type and query as in normal way Elasticsearch treats them as object and relation between the elements are lost, as explained in offical doc.
You need to change both mapping and query to achieve the desired output as shown below.
Index mapping
{
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"mainlist" :{
"type" : "nested"
}
}
}
}
Sample Index doc according to your example, no change there
Query
{
"query": {
"nested": {
"path": "mainlist",
"query": {
"bool": {
"must": [
{
"term": {
"mainlist.tags": "tag1"
}
},
{
"match": {
"mainlist.tags": "tag2"
}
}
]
}
}
}
}
}
And result
hits": [
{
"_index": "71519931_new",
"_id": "1",
"_score": 0.9139043,
"_source": {
"id": "abc",
"mainlist": [
{
"type": "big",
"tags": [
"tag1",
"tag2"
]
},
{
"type": "small",
"tags": [
"tag1"
]
}
]
}
}
]
use nested field type,this is work for it
https://www.elastic.co/guide/en/elasticsearch/reference/8.1/nested.html

Elasticsearch Multi-Term Auto Completion

I'm trying to implement the Multi-Term Auto Completion that's presented here.
Filtering down to the correct documents works, but when aggregating the completion_terms they are not filtered to those that match the current partial query, but instead include all completion_terms from any matched documents.
Here are the mappings:
{
"mappings": {
"dynamic" : "false",
"properties" : {
"completion_ngrams" : {
"type" : "text",
"analyzer" : "completion_ngram_analyzer",
"search_analyzer" : "completion_ngram_search_analyzer"
},
"completion_terms" : {
"type" : "keyword",
"normalizer" : "completion_normalizer"
}
}
}
}
Here are the settings:
{
"settings" : {
"index" : {
"analysis" : {
"filter" : {
"edge_ngram" : {
"type" : "edge_ngram",
"min_gram" : "1",
"max_gram" : "10"
}
},
"normalizer" : {
"completion_normalizer" : {
"filter" : [
"lowercase",
"german_normalization"
],
"type" : "custom"
}
},
"analyzer" : {
"completion_ngram_search_analyzer" : {
"filter" : [
"lowercase"
],
"tokenizer" : "whitespace"
},
"completion_ngram_analyzer" : {
"filter" : [
"lowercase",
"edge_ngram"
],
"tokenizer" : "whitespace"
}
}
}
}
}
}
}
I'm then indexing data like this:
{
"completion_terms" : ["Hammer", "Fortis", "Tool", "2000"],
"completion_ngrams": "Hammer Fortis Tool 2000"
}
Finally, the autocomplete search looks like this:
{
"query": {
"bool": {
"must": [
{
"term": {
"completion_terms": "fortis"
}
},
{
"term": {
"completion_terms": "hammer"
}
},
{
"match": {
"completion_ngrams": "too"
}
}
]
}
},
"aggs": {
"autocomplete": {
"terms": {
"field": "completion_terms",
"size": 100
}
}
}
}
This correctly returns documents matching the search string "fortis hammer too", but the aggregations include ALL completion terms that are included in any of the matched documents, e.g. for the query above:
"buckets": [
{ "key": "fortis" },
{ "key": "hammer" },
{ "key": "tool" },
{ "key": "2000" },
]
Ideally, I'd expect
"buckets": [
{ "key": "tool" }
]
I could filter out the terms that are already covered by the search query ("fortis" and "hammer" in this case) in the app, but the "2000" doesn't make any sense from a user's perspective, because it doesn't partially match any of the provided search terms.
I understand why this is happening, but I can't think of a solution. Can anyone help?
try filters agg please
{
"query": {
"bool": {
"must": [
{
"term": {
"completion_terms": "fortis"
}
},
{
"term": {
"completion_terms": "hammer"
}
},
{
"match": {
"completion_ngrams": "too"
}
}
]
}
},
"aggs": {
"findOuthammerAndfortis": {
"filters": {
"filters": {
"fortis": {
"term": {
"completion_terms": "fortis"
}
},
"hammer": {
"term": {
"completion_terms": "hammer"
}
}
}
}
}
}
}

ES query to match all elements in array

So I got this document with a
nested array that I want to filter with this query.
I want ES to return all documents where all items have changes = 0 and that only.
If document has even a single item in the list with a change = 1, that's discarded.
Is there any way I can achieve this starting from the query I have already wrote? Or should I use a script instead?
DOCUMENTS:
{
"id": "abc",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 1
}
]
}
},
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
QUERY:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"id": {
"value": "abc"
}
}
},
{
"nested": {
"path": "trips",
"query": {
"range": {
"trips.changes": {
"gt": -1,
"lt": 1
}
}
}
}
}
]
}
}
}
EXPECTED RESULT:
{
"id": "def",
"_source" : {
"trips" : [
{
"type" : "home",
"changes" : 0
},
{
"type" : "home",
"changes" : 0
}
]
}
}
Elasticsearch version: 7.6.2
Already read this answers but they didn't help me:
https://discuss.elastic.co/t/how-to-match-all-item-in-nested-array/163873
ElasticSearch: How to query exact nested array
First off, if you filter by id: abc, you obviously won't be able to get id: def back.
Second, due to the nature of nested fields which are treated as separate subdocuments, you cannot query for all trips that have the changes equal to 0 -- the connection between the individual trips is lost and they "don't know about each other".
What you can do is return only the trips that matched your nested query using inner_hits:
GET trips_solutions/_search
{
"_source": "false",
"query": {
"bool": {
"must": [
{
"nested": {
"inner_hits": {},
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
}
]
}
}
}
The easiest solution then is to dynamically save this nested info on a parent object like discussed here and using range/term query on the resulting array.
EDIT:
Here's how you do it using copy_to onto the doc's top level:
PUT trips_solutions
{
"mappings": {
"properties": {
"trips_changes": {
"type": "integer"
},
"trips": {
"type": "nested",
"properties": {
"changes": {
"type": "integer",
"copy_to": "trips_changes"
}
}
}
}
}
}
trips_changes will be an array of numbers -- I presume they're integers but more types are available.
Then syncing a few docs:
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":1}]}
POST trips_solutions/_doc
{"trips":[{"type":"home","changes":0},{"type":"home","changes":0}]}
And finally querying:
GET trips_solutions/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "trips",
"query": {
"term": {
"trips.changes": {
"value": 0
}
}
}
}
},
{
"script": {
"script": {
"source": "doc.trips_changes.stream().filter(val -> val != 0).count() == 0"
}
}
}
]
}
}
}
Note that we first filter normally using the nested term query to narrow down our search context (scripts are slow so this is useful). We then check if there are any non-zero changes in the accumulated top-level changes and reject those that apply.

Query documents that contains all values in nested array Elasticsearch

I'm trying to query documents where the nested array contains all of the elements passed in the query.
The index stores groups and each group has a list of members. I want to query all the groups that contains the given members.
Mapping:
"properties" : {
"members" : {
"type" : "nested",
"properties" : {
"name" : {
"type" : "keyword"
}
}
},
"name" : {
"type" : "text"
}
}
}
}
Example content:
[
{
"name" : "group 1",
"members" : [
{
"name" : "alice"
},
{
"name" : "bob"
}
]
},
{
"name" : "group 2",
"members" : [
{
"name" : "alice"
},
{
"name" : "foo"
},
{
"name" : "bob"
}
]
},
{
"name" : "group 3",
"members" : [
{
"name" : "foo"
},
{
"name" : "bar"
}
]
}
]
How can I find all groups that have both "alice" and "foo" as members?
I have tried the following query but it returns nothing:
GET /group/_search
{
"query": {
"nested": {
"path": "members",
"query": {
"bool": {
"must": [
{"match": {"members.name": "alice"}},
{"match": {"members.name": "foo"}}
]
}
}
}
}
}
I have also tried with term instead of match but it gives no results.
You can use the nested within a must clause. Like this:
GET /group/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "members",
"query": {
"term": {
"members.name": {
"value": "alice"
}
}
}
}
},
{
"nested": {
"path": "members",
"query": {
"term": {
"members.name": {
"value": "foo"
}
}
}
}
}
]
}
}
}

Watcher alert if no records matching filter in x minutes

I need to get ElasticSearch watcher to alert if there is no record matching a pattern inserted into the index in a time frame, it needs to be able to do this whilst grouping on another pair of field.
i.e. the records will be of the pattern:
Date Timestamp Level Message Client Site
It needs to check that Message matches "is running" for each Client's site(s) (i.e. Google Maps and Bing Maps have the same site of Maps). I tihnk the best(?) way to do this right now is to run a wacher per client site.
Sofar I have this, assume the task should write is running into the log every 20 minutes :
{
"trigger" : {
"schedule" : {
"interval" : "25m"
}
},
"input" : {
"search" : {
"request" : {
"search_type" : "count",
"indices" : "<logstash-{now/d}>",
"body" : {
"filtered" : {
"query" : {
"match_phrase" : { "Message" : "Is running" }
},
"filter" : {
"match" : { "Client" : "Example" } ,
"match" : { "Site" : "SomeSite" }
}
}
}
}
}
},
"condition" : {
"script" : "return ctx.payload.hits.total < 1"
},
"actions" : {
},
"email_administrator" : {
"email" : {
"to" : "me#host.tld",
"subject" : "Tasks are not running for {{ctx.payload.client}} on their site {{ctx.payload.site}}",
"body" : "Too many error in the system, see attached data",
"attach_data" : true,
"priority" : "high"
}
}
}
}
For anyone looking how to do this in the future, a few things need nesting in query as part of filter and match becomes term. Fun!...
{
"trigger": {
"schedule": {
"interval": "25m"
}
},
"input": {
"search": {
"request": {
"search_type": "count",
"indices": "<logstash-{now/d}>",
"body": {
"query": {
"filtered": {
"query": {
"match_phrase": {
"Message": "Its running"
}
},
"filter": {
"query": {
"term": {
"Client": "Example"
}
},
"query": {
"term": {
"Site": "SomeSite"
}
},
"query": {
"range": {
"event_timestamp": {
"gte": "now-25m",
"lte": "now"
}
}
}
}
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"lte": 1
}
}
},
"actions": {
"email_administrator": {
"email": {
"to": "me#host.tld",
"subject": "Tasks are not running for {{ctx.payload.client}} on their site {{ctx.payload.site}}",
"body": "Tasks are not running for {{ctx.payload.client}} on their site {{ctx.payload.site}}",
"attach_data": true,
"priority": "high"
}
}
}
}
You have to change your condition,It support json format:
"condition" : {
"script" : "return ctx.payload.hits.total : 1"
}
Please refer below link,
https://www.elastic.co/guide/en/watcher/current/condition.html

Resources