Elasticsearch combine term and range query on nested key/value data - elasticsearch

I store Elasticsearch documents as a flat list of key/value entries using the nested data type, because they hold arbitrary JSON that we don't control and we need to avoid a mapping explosion. Here's an example document:
{
"doc_flat":[
{
"key":"timestamp",
"type":"date",
"key_type":"timestamp.date",
"value_date":[
"2023-01-20T12:00:00Z"
]
},
{
"key":"status",
"type":"string",
"key_type":"status.string",
"value_string":[
"warning"
]
},
... more arbitrary fields ...
]
}
I've figured out how to query this nested data to find documents with a particular key/value pair, using a query such as:
{
"query": {
"nested": {
"path": "doc_flat",
"query": {
"bool": {
"must": [
{"term": {"doc_flat.key": "status"}},
{"term": {"doc_flat.value_string": "warning"}}
]
}
}
}
}
}
And I figured out how to find documents matching a particular date range:
{
"query": {
"nested": {
"path": "doc_flat",
"query": {
"bool": {
"must": [
{"term": {"doc_flat.key": "timestamp"}},
{
"range": {
"doc_flat.value_date": {
"gte": "2023-01-20T00:00:00Z",
"lte": "2023-01-21T00:00:00Z"
}
}
}
]
}
}
}
}
}
But I'm struggling to combine these two queries so that I find documents that have nested documents matching both of these conditions:
a doc_flat.key of status, and a doc_flat.value_string of warning
a doc_flat.key of timestamp, and a doc_flat.value_date in a range
Obviously I can't just shove the second set of query filters into the same must array, because then no documents will match. I think I need to go "one level higher" in my query and wrap it in another bool query? But I can't get my head around how that would look.
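The reason a single must array matches nothing can be sketched in plain Python. This is only a toy model of nested matching, not Elasticsearch code: the sample document is a trimmed copy of the one above, and the helper names (is_status_warning, is_timestamp_in_range, nested_match) are made up for illustration. The assumption it encodes is that all clauses inside one nested query have to hold for the same nested object.

```python
# Toy model: a nested query matches when ONE nested object satisfies
# every clause of the inner bool/must at the same time.

doc = {
    "doc_flat": [
        {"key": "timestamp", "value_date": ["2023-01-20T12:00:00Z"]},
        {"key": "status", "value_string": ["warning"]},
    ]
}


def is_status_warning(obj):
    return obj.get("key") == "status" and "warning" in obj.get("value_string", [])


def is_timestamp_in_range(obj):
    # ISO-8601 UTC strings in the same format compare correctly as plain strings.
    return obj.get("key") == "timestamp" and any(
        "2023-01-20T00:00:00Z" <= d <= "2023-01-21T00:00:00Z"
        for d in obj.get("value_date", [])
    )


def nested_match(document, predicate):
    """True if at least one nested object satisfies the predicate."""
    return any(predicate(obj) for obj in document["doc_flat"])


# All four clauses in one must array: a single object would need both keys.
single_nested = nested_match(
    doc, lambda o: is_status_warning(o) and is_timestamp_in_range(o)
)

# Two nested clauses joined by an outer bool: each may hit a different object.
two_nested = nested_match(doc, is_status_warning) and nested_match(
    doc, is_timestamp_in_range
)

print(single_nested, two_nested)  # False True
```

In this model the combined predicate fails because no single entry has both the key status and the key timestamp, while the two separate checks each find their own entry.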

Have you tried two nested queries inside a bool query?
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "doc_flat",
"query": {
"bool": {
"must": [
{
"term": {
"doc_flat.key": "timestamp"
}
},
{
"range": {
"doc_flat.value_date": {
"gte": "2023-01-20T00:00:00Z",
"lte": "2023-01-21T00:00:00Z"
}
}
}
]
}
}
}
}
],
"must": [
{
"nested": {
"path": "doc_flat",
"query": {
"bool": {
"must": [
{
"term": {
"doc_flat.key": "status"
}
},
{
"term": {
"doc_flat.value_string": "warning"
}
}
]
}
}
}
}
]
}
}
}
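If there are many such key/value conditions, the query body above can be generated rather than written by hand. The following is a minimal sketch that only builds the request body as a Python dict; nested_kv and all_of are hypothetical helper names, and it assumes every condition can go into the filter context (no scoring), whereas the query above puts one clause in filter and one in must.

```python
import json


def nested_kv(key, value_clause, path="doc_flat"):
    """Build one nested clause: the key term plus one value clause,
    both evaluated against the same nested object."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"term": {f"{path}.key": key}},
                        value_clause,
                    ]
                }
            },
        }
    }


def all_of(*nested_clauses):
    """Require every nested clause; each may match a different nested object."""
    return {"query": {"bool": {"filter": list(nested_clauses)}}}


query = all_of(
    nested_kv("status", {"term": {"doc_flat.value_string": "warning"}}),
    nested_kv(
        "timestamp",
        {
            "range": {
                "doc_flat.value_date": {
                    "gte": "2023-01-20T00:00:00Z",
                    "lte": "2023-01-21T00:00:00Z",
                }
            }
        },
    ),
)

print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a _search request. If relevance scoring matters, the clauses would go into must instead of filter.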

Related

With Elasticsearch, how to use an OR instead of AND within filter->terms query?

I have the following Elasticsearch query:
{
"query": {
"bool": {
"filter": [{
"terms": {
"participants.group": ["group1","group2"]
}
}, {
"range": {
"recordDate": {
"gte": "2020-05-14 00:00:00.000",
"lte": "2020-07-22 20:30:56.566"
}
}
}]
}
}
}
Currently, this finds records with participants in group "group1" and "group2".
How can I change the query so that it finds records with participants from "group1" or "group2"?
Is it possible to do this without changing the structure of the query?

I'm assuming that the field participants.group is of keyword type and not text type.
Under that assumption, the query you have roughly translates to (group1) or (group2) or (group1 and group2), because a terms query matches a document that contains any of the listed values.
If you want records that have one of the groups but not both, modify the query and add a must_not clause as shown below:
POST my_filter_index/_search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"range": {
"recordDate": {
"gte": "2020-05-14 00:00:00.000",
"lte": "2020-07-22 20:30:56.566"
}
}
}
],
"should": [
{
"terms": {
"participants.group": ["group1", "group2"]
}
}
],
"minimum_should_match": 1
}
}
],
"must_not": [
{
"bool": {
"must": [
{
"term": {
"participants.group": "group1"
}
},
{
"term": {
"participants.group": "group2"
}
}
]
}
}
]
}
}
}
Let me know if that works!
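The difference between the original terms filter and the version with the extra must_not can be modelled in a few lines of plain Python. This is only a sketch of the matching logic, not Elasticsearch code; terms_match, exactly_one and the sample records are invented for illustration, and the field is assumed to hold a list of exact keyword values.

```python
def terms_match(values, wanted):
    """Model of a terms filter: true if the field holds ANY of the wanted values."""
    return any(v in wanted for v in values)


def exactly_one(values, wanted):
    """Model of terms plus must_not(all terms together):
    at least one wanted value is present, but not all of them."""
    return terms_match(values, wanted) and not all(w in values for w in wanted)


records = {
    "only_group1": ["group1"],
    "only_group2": ["group2"],
    "both": ["group1", "group2"],
    "neither": ["group3"],
}
wanted = ["group1", "group2"]

for name, groups in records.items():
    print(name, terms_match(groups, wanted), exactly_one(groups, wanted))
```

In this model the plain terms filter already behaves as an OR, and the must_not only changes the result for the record that belongs to both groups.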

How to combine must and must_not in elasticsearch with same field

I'm using Elasticsearch 6.8.8. As an example of my question: I want a query that returns documents whose "Test" field has the value "1", and that excludes a "Test" field with the value "3". I know I could write just the first condition, without the "3", and get the one document with value "1". But I want to know whether there is a way to use must and must_not at the same time, on the same field, and still get only the value "1".
I wrote this basic example to show what I mean:
{
"from": 0,
"query": {
"nested": {
"path": "attributes",
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"attributes.key": {
"query": "Test"
}
}
},
{
"match": {
"attributes.value": {
"query": "1"
}
}
}
],
"must_not": [
{
"match": {
"attributes.key": {
"query": "Test"
}
}
},
{
"match": {
"attributes.value": {
"query": "3"
}
}
}
]
}
}
]
}
}
}
}
}
attributes is a nested field holding key/value pairs, and both subfields are mapped as string types.

You'll need to leave attributes.key:Test out of the must_not, because there it excludes every nested object whose key is Test:
GET combine_flat/_search
{
"from": 0,
"query": {
"nested": {
"inner_hits": {},
"path": "attributes",
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"attributes.key": {
"query": "Test"
}
}
},
{
"match": {
"attributes.value": {
"query": "1"
}
}
}
],
"must_not": [
{
"match": {
"attributes.value": {
"query": "3"
}
}
}
]
}
}
]
}
}
}
}
}
Tip: use inner_hits to just return the matched nested key-value pairs as opposed to the whole field.
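Why the original must_not removes every hit can be checked with a small plain-Python model. This is a sketch under simplifying assumptions: the sample attributes are invented, match is modelled as exact string equality (a real match query on an analyzed field is looser), and the functions original and corrected are hypothetical names for the two bool clauses.

```python
attributes = [
    {"key": "Test", "value": "1"},
    {"key": "Test", "value": "3"},
    {"key": "Other", "value": "1"},
]


def original(obj):
    must = obj["key"] == "Test" and obj["value"] == "1"
    # must_not rejects the object if ANY of its clauses matches,
    # so key == "Test" alone is enough to exclude it.
    must_not = obj["key"] == "Test" or obj["value"] == "3"
    return must and not must_not


def corrected(obj):
    must = obj["key"] == "Test" and obj["value"] == "1"
    must_not = obj["value"] == "3"
    return must and not must_not


print([o for o in attributes if original(o)])   # []
print([o for o in attributes if corrected(o)])  # [{'key': 'Test', 'value': '1'}]
```

In the corrected version the must_not on value "3" is redundant for this single bool clause, since must already requires the value "1"; it only matters if the clause is later widened.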

Query elasticsearch where a key's value is at least some number

I am processing files to detect which labels they contain and with what confidence each label was recognized.
I created a nested mapping called tags, which contains label (text) and confidence (a float between 0 and 100).
Here is an example of how I think the query should work (I know it's invalid). It should express something like "Find documents that have tags labelled A and B, where A has a confidence of at least 37 and B has a confidence of at least 80".
{
"query": {
"nested": {
"path": "tags",
"query": {
"bool": {
"must": [
{
"match": {
"tags.label": "A"
},
"range": {
"tags.confidence": {
"gte": 37
}
}
},
{
"match": {
"tags.label": "B"
},
"range": {
"tags.confidence": {
"gte": 80
}
}
}
]
}
}
}
}
}
Any ideas? I am pretty sure I need to approach this differently (perhaps with a different mapping), but I am not sure how to accomplish it in Elasticsearch. Is this possible?

Let's say your parent document contains two nested documents, something like this:
{
"tags":[
{
"label":"A",
"confidence":40
},
{
"label":"B",
"confidence":85
}
]
}
In that case, you need one nested query per tag:
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "tags",
"query": {
"bool": {
"must": [
{
"match": {
"tags.label": "A"
}
},
{
"range": {
"tags.confidence": {
"gte": 37
}
}
}
]
}
}
}
},
{
"nested": {
"path": "tags",
"query": {
"bool": {
"must": [
{
"match": {
"tags.label": "B"
}
},
{
"range": {
"tags.confidence": {
"gte": 80
}
}
}
]
}
}
}
}
]
}
}
}
Note that each nested object is indexed as a separate hidden document, which is why you need two separate nested queries. With a single nested query, as in your attempt, Elasticsearch would look for all four conditions inside one and the same nested document of the parent.
Hope this helps!
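Since the query repeats the same nested clause once per label, it can be generated from a mapping of label to minimum confidence. A minimal sketch that only builds the request body as a Python dict; tag_at_least and tags_query are hypothetical helper names, and the field names are taken from the question.

```python
import json


def tag_at_least(label, min_confidence):
    """One nested clause: a tag with this label AND at least this confidence,
    both checked on the same nested tag object."""
    return {
        "nested": {
            "path": "tags",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"tags.label": label}},
                        {"range": {"tags.confidence": {"gte": min_confidence}}},
                    ]
                }
            },
        }
    }


def tags_query(thresholds):
    """Require every (label, minimum confidence) pair, one nested clause each."""
    return {
        "query": {
            "bool": {
                "must": [tag_at_least(label, conf) for label, conf in thresholds.items()]
            }
        }
    }


query = tags_query({"A": 37, "B": 80})
print(json.dumps(query, indent=2))
```

Adding a third label then only means adding one more entry to the mapping passed to tags_query.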

Search in multiple indexes: 'query_shard_exception' when fields are not present

I'm trying to search across multiple indexes, but the fields and mappings differ between them. For example, only one of the indexes has a nested path.
When I run the query across these indexes, I get an error for each index that does not have the nested path.
{
"query": {
"bool": {
"should": [
{
"term": {
"a": "good"
}
},
{
"term": {
"a.b": "sample"
}
},
{
"nested": {
"path": "x.y.z",
"query": {
"bool": {
"should": [
{
"term": {
"x.y.z.id.keyword": "test#gamil.com"
}
}
]
}
}
}
}
]
}
}
}
In the query above, the nested path x.y.z is present in only one index.
I looked for a solution and found ignore_unavailable, but that ignores the index without the nested path entirely, and I still need the documents in that index that match the other conditions in the query.

Try the following query, replacing your-index with the name of the index that contains the nested field.
{
"query": {
"bool": {
"should": [
{
"term": {
"a": "good"
}
},
{
"term": {
"a.b": "sample"
}
},
{
"bool": {
"must": [
{
"term": {
"_index": "your-index"
}
},
{
"nested": {
"path": "x.y.z",
"query": {
"bool": {
"should": [
{
"term": {
"x.y.z.id.keyword": "test#gamil.com"
}
}
]
}
}
}
}
]
}
}
]
}
}
}
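The guard on _index can be wrapped in a small helper so that the same pattern is reusable for other index-specific clauses. This is a sketch that only builds the request body as a Python dict; only_on_index and nested_clause are hypothetical helper names, and the inner bool/should wrapper around the single term is dropped here on the assumption that it is equivalent for one clause. The nested query also documents an ignore_unmapped option for unmapped paths, which the helper can set; whether that alone is enough in a given setup would need testing.

```python
import json


def nested_clause(path, query, ignore_unmapped=False):
    """Build a nested clause; optionally set the nested query's ignore_unmapped flag."""
    clause = {"nested": {"path": path, "query": query}}
    if ignore_unmapped:
        clause["nested"]["ignore_unmapped"] = True
    return clause


def only_on_index(index_name, clause):
    """Wrap a clause so that it can only match documents of one index."""
    return {"bool": {"must": [{"term": {"_index": index_name}}, clause]}}


id_match = {"term": {"x.y.z.id.keyword": "test#gamil.com"}}

query = {
    "query": {
        "bool": {
            "should": [
                {"term": {"a": "good"}},
                {"term": {"a.b": "sample"}},
                only_on_index("your-index", nested_clause("x.y.z", id_match)),
            ]
        }
    }
}

print(json.dumps(query, indent=2))
```

Calling nested_clause("x.y.z", id_match, ignore_unmapped=True) instead produces the variant that relies on the flag rather than on the index guard.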

Multiple values in nested elastic search 2 query

I have a nested object named 'bundles' that usually contains more than one object. Using this query I can successfully query on the id of an object in bundles, but I can't work out how to write a query that matches multiple ids. Suggestions?
{
"query": {
"nested": {
"path": "bundles",
"query": {
"bool": {
"must": [
{
"match": {
"bundles.id": 43273
}
}
]
}
},
"inner_hits": {}
}
}
}

Perhaps you want "should" instead of "must" in the bool query. For example:
{
"query": {
"nested": {
"path": "bundles",
"query": {
"bool": {
"should": [
{
"match": {
"bundles.id": 43273
}
},
{
"match": {
"bundles.id": 433373
}
}
]
}
}
}
}
}
You could also use a terms query if the field can be matched exactly. For example:
{
"query": {
"nested": {
"path": "bundles",
"query": {
"bool": {
"must": [
{
"terms": {
"bundles.id": [1140000000, 114]
}
}
]
}
}
}
}
}
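When the list of ids comes from application code, the terms variant is easy to build programmatically. A minimal sketch that only constructs the request body as a Python dict; bundles_by_ids is a hypothetical helper name, and it assumes bundles.id can be matched exactly. The inner_hits entry from the question is kept so that the matching bundles are returned.

```python
import json


def bundles_by_ids(ids):
    """Match documents with at least one nested bundle whose id is in the list."""
    return {
        "query": {
            "nested": {
                "path": "bundles",
                "query": {"terms": {"bundles.id": list(ids)}},
                "inner_hits": {},
            }
        }
    }


query = bundles_by_ids([43273, 433373])
print(json.dumps(query, indent=2))
```

Like the should version, this matches a document if any one of its bundles has one of the ids; it does not require every id to be present.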
