ElasticSearch - query documents where the nested field is an empty array []

I am trying to filter for documents where a nested field is an empty array.
I tried many different queries, including scripts and flattened fields, but couldn't retrieve any results. Does anyone have experience with this? Is it possible in ES? I also want to aggregate (count the results) on the same empty array value [].
mapping
suitability:
type: "nested"
properties:
group:
type: "keyword"
code:
type: "keyword"
in the index, I have this nested field in every document
"suitability": [
{
"group": "RG309",
"code": 1
},
{
"group": "RG318",
"code": 1
},
{
"group": "RG355",
"code": 2
}
]
also some documents have an empty nested field
"suitability": []
query for empty suitability results (DOESN'T WORK: always returns total_hits: 0)
GET /_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"ignore_unmapped": [
true
],
"path": "suitability",
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "suitability"
}
}
]
}
}
}
}
]
}
},
"track_total_hits": true
}
query for non-empty suitability (THIS WORKS: returns all results)
{
"query": {
"bool": {
"must": [
{
"nested": {
"ignore_unmapped": [
true
],
"path": "suitability",
"query": {
"bool": {
"must": [
{
"terms": {
"suitability.rule_result": [
"1",
"2",
"3"
]
}
}
]
}
}
}
}
]
}
},
"track_total_hits": true
}
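One likely reason the first query returns nothing: a nested query only matches a parent document through its nested objects, and a document with "suitability": [] has none, so no inner condition (not even must_not exists) can match it. A minimal sketch of the usual alternative, following the pattern from the related answer further down rather than a fix verified against this index: negate the whole nested query at the top level, and reuse the same clause in a filter aggregation for the count. The helper names and the aggregation name empty_suitability are made up for illustration.

```python
import json


def no_nested_objects(path):
    """Bool clause matching parents that have no nested objects under `path`."""
    return {
        "bool": {
            "must_not": [
                {"nested": {"path": path, "query": {"exists": {"field": path}}}}
            ]
        }
    }


def empty_array_search(path, agg_name):
    """Search body that filters on the empty array and counts the same documents."""
    clause = no_nested_objects(path)
    return {
        "query": clause,
        "aggs": {agg_name: {"filter": clause}},
        "track_total_hits": True,
    }


# "empty_suitability" is a hypothetical aggregation name.
body = empty_array_search("suitability", "empty_suitability")
print(json.dumps(body, indent=2))
```

Note that this clause cannot tell an empty array apart from a missing or null field: in all three cases the parent has no nested objects.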

Related

ElasticSearch: Query nested array for empty and specific value in single query

Documents structure -
{
"hits": [
{
"_type": "_doc",
"_id": "ef0a2c44179a513476b080cc2a585d95",
"_source": {
"DIVISION_NUMBER": 44,
"MATCHES": [
{
"MATCH_STATUS": "APPROVED",
"UPDATED_ON": 1599171303000
}
]
}
},
{
"_type": "_doc",
"_id": "ef0a2c44179a513476b080cc2a585d95",
"_source": {
"DIVISION_NUMBER": 44,
"MATCHES": [ ]
}
}
]
}
Question - MATCHES is a nested array; inside it there is a text field MATCH_STATUS that can take values such as "APPROVED" or "REJECTED".
I want to find ALL documents where MATCH_STATUS is one of, say, "APPROVED" or "RECOMMENDED", as well as documents where MATCHES has no data (empty array "MATCHES": [ ]). Please note I want this in a single query.
I am able to do this in two separate queries like this -
GET all matches with status = RECOMMENDED, APPROVED
"must": [
{
"nested": {
"path": "MATCHES",
"query": {
"terms": {
"MATCHES.MATCH_STATUS.keyword": [
"APPROVED",
"RECOMMENDED"
]
}
}
}
}
]
GET all matches having empty array "MATCHES" : [ ]
{
"size": 5000,
"query": {
"bool": {
"filter": [],
"must_not": [
{
"nested": {
"path": "MATCHES",
"query": {
"exists": {
"field": "MATCHES"
}
}
}
}
]
}
},
"from": 0
}
You can combine both queries using a should clause.
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"nested": {
"path": "MATCHES",
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"terms": {
"MATCHES.MATCH_STATUS.keyword": [
"APPROVED",
"RECOMMENDED"
]
}
}
]
}
}
}
},
{
"bool": {
"must_not": [
{
"nested": {
"path": "MATCHES",
"query": {
"bool": {
"filter": {
"exists": {
"field": "MATCHES"
}
}
}
}
}
}
]
}
}
]
}
}
}
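To make the logic of the combined query concrete, here is a small in-memory model in Python. It only illustrates the boolean semantics and is not how Elasticsearch evaluates the query; the function names are hypothetical, the first two documents mirror the ones in the question, and the third (REJECTED) is added here purely for contrast.

```python
def nested_any(doc, path, predicate):
    """Model of a nested query: true if any nested object satisfies predicate."""
    return any(predicate(obj) for obj in doc.get(path) or [])


def matches_combined(doc, statuses):
    """Model of: should [ nested terms on MATCH_STATUS, must_not nested exists ]."""
    has_status = nested_any(
        doc, "MATCHES", lambda m: m.get("MATCH_STATUS") in statuses
    )
    has_no_matches = not nested_any(doc, "MATCHES", lambda m: True)
    return has_status or has_no_matches


docs = [
    {"DIVISION_NUMBER": 44, "MATCHES": [{"MATCH_STATUS": "APPROVED"}]},
    {"DIVISION_NUMBER": 44, "MATCHES": []},
    {"DIVISION_NUMBER": 44, "MATCHES": [{"MATCH_STATUS": "REJECTED"}]},  # added for contrast
]
wanted = {"APPROVED", "RECOMMENDED"}
results = [matches_combined(d, wanted) for d in docs]
print(results)  # [True, True, False]
```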
Update: To answer your comment.
The missing aggregation does not support nested fields for now; there is an open issue for this.
To get the count of documents with empty MATCHES, you can use a filter aggregation with the nested query wrapped in the must_not clause of a bool query.
{
"aggs": {
"missing_matches_agg": {
"filter": {
"bool": {
"must_not": {
"nested": {
"query": {
"match_all": {}
},
"path": "MATCHES"
}
}
}
}
}
}
}
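If only the count is needed, the same aggregation can be sent with "size": 0 so that no hits are returned, and the number read from doc_count in the response. A rough sketch in Python, with hypothetical helper names and a hand-written response fragment standing in for a real cluster response:

```python
def empty_matches_count_body(path="MATCHES", agg_name="missing_matches_agg"):
    """Request body: filter aggregation counting parents without nested objects."""
    return {
        "size": 0,  # skip the hits, only the aggregation result is needed
        "aggs": {
            agg_name: {
                "filter": {
                    "bool": {
                        "must_not": {
                            "nested": {"path": path, "query": {"match_all": {}}}
                        }
                    }
                }
            }
        },
    }


def read_count(response, agg_name="missing_matches_agg"):
    """A filter aggregation reports its result as a single doc_count."""
    return response["aggregations"][agg_name]["doc_count"]


body = empty_matches_count_body()
# Hypothetical response fragment, written by hand for illustration only.
sample_response = {"aggregations": {"missing_matches_agg": {"doc_count": 7}}}
print(read_count(sample_response))  # 7
```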

Query that satisfies all conditions in an array

The documents are stored in the form below in an Elasticsearch index.
mapping
{
"mappings": {
"properties": {
"data": {
"type": "nested"
}
}
}
}
first document
{
"data": [
{
"value": "a"
},
{
"value": "a"
},
{
"value": "b"
}
]
}
second document
{
"data": [
{
"value": "a"
},
{
"value": "a"
},
{
"value": "a"
}
]
}
I want to return a document only when all values in the array are 'a' (the second document).
How should I write the query condition in this case?
The nested query searches nested field objects as if they were indexed
as separate documents. If an object matches the search, the nested
query returns the root parent document.
When you combine must and must_not in a bool query inside a nested query, each clause is evaluated against every nested object individually: objects that do not match are skipped, but as long as at least one nested object still matches, the parent document is returned. To exclude a document because of any one of its nested objects, place the nested query inside must_not at the top level instead.
Try the search query below, which discards every document that has a nested object with the value b.
Search Query:
{
"query": {
"bool": {
"must_not": {
"nested": {
"path": "data",
"query": {
"term": {
"data.value": "b"
}
}
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64329782",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"data": [
{
"value": "a"
},
{
"value": "a"
},
{
"value": "a"
}
]
}
}
]
Search Query with the combination of multiple bool and nested queries:
The below search query will also give you the required result.
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "data",
"query": {
"bool": {
"must": [
{
"match": {
"data.value": "a"
}
}
]
}
}
}
}
],
"must_not": [
{
"nested": {
"path": "data",
"query": {
"bool": {
"must": [
{
"match": {
"data.value": "b"
}
}
]
}
}
}
}
]
}
}
}
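Both queries above rely on b being the only other value. A more general way to express "every nested value is a" is a double negation: require at least one a, and exclude any document that has a nested object whose value is not a. The sketch below builds such a body as a Python dict and checks the intended logic against the two sample documents with a small in-memory model. The function names are hypothetical and the query has not been run against a live cluster.

```python
def all_values_equal_query(path, field, value):
    """Body matching parents whose nested objects all carry `value` in `field`."""
    term = {"term": {f"{path}.{field}": value}}
    return {
        "query": {
            "bool": {
                # at least one nested object has the value
                "must": [{"nested": {"path": path, "query": term}}],
                # and no nested object has anything else
                "must_not": [
                    {
                        "nested": {
                            "path": path,
                            "query": {"bool": {"must_not": [term]}},
                        }
                    }
                ],
            }
        }
    }


def all_values_equal(doc, path, field, value):
    """In-memory model of the same condition, for checking the logic only."""
    objs = doc.get(path) or []
    has_value = any(o.get(field) == value for o in objs)
    has_other = any(o.get(field) != value for o in objs)
    return has_value and not has_other


first = {"data": [{"value": "a"}, {"value": "a"}, {"value": "b"}]}
second = {"data": [{"value": "a"}, {"value": "a"}, {"value": "a"}]}
body = all_values_equal_query("data", "value", "a")
print(all_values_equal(first, "data", "value", "a"))   # False
print(all_values_equal(second, "data", "value", "a"))  # True
```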

ElasticSearch and nested query

I'm having a problem getting records with an intersecting ('and') condition.
I have a doc:
{
"uuid": "1e2a0c06-af24-42e1-a31a-0f84233521de",
"subject": "subj",
"relations": [
{
"userUuid": "0f38e576-6b1f-4c1a-86a8-67a55a06d504",
"signed": false
},
{
"userUuid": "15979293-6b04-41a9-a6aa-bba99499496f",
"signed": true
}
]
}
I run the query below and expect an EMPTY result, because the two conditions are only met by different nested elements:
"bool": {
"must": [
{
"nested": {
"query": {
"term": {
"relations.userUuid": {
"value": "15979293-6b04-41a9-a6aa-bba99499496f",
"boost": 1.0
}
}
},
"path": "relations",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1.0
}
},
{
"nested": {
"query": {
"term": {
"relations.signed": {
"value": false,
"boost": 1.0
}
}
},
"path": "relations",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1.0
}
}
],
"adjust_pure_negative": true,
"boost": 1.0
}
}
How can I query so that the conditions are ANDed within the same nested object?
Updated the answer after looking at your comment. You need to specify path in your nested query.
Scenario 1: If you want any of the nested documents to contain 15979293-6b04-41a9-a6aa-bba99499496f as userUuid and any (possibly different) nested document to have signed as false
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "relations", <---- Note this
"query": {
"term": {
"relations.userUuid": "15979293-6b04-41a9-a6aa-bba99499496f"
}
}
}
},
{
"nested": {
"path": "relations",
"query": {
"term": {
"relations.signed": false
}
}
}
}
]
}
}
}
This matches a document even when the conditions are satisfied by two different nested documents, for example the first nested doc containing the userUuid and the second nested doc containing signed as false.
Scenario 2: If you want both the fields to be present in a single nested document
POST <your_index_name>/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "relations", <---- Note this
"query": {
"bool": {
"must": [
{
"term": {
"relations.userUuid": "15979293-6b04-41a9-a6aa-bba99499496f"
}
},
{
"term": {
"relations.signed": false
}
}
]
}
}
}
}
]
}
}
}
In this scenario, a single nested document must contain both values.
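The difference between the two scenarios can be checked with a small in-memory model in Python. This is only an illustration of the matching logic on the document from the question, not Elasticsearch itself, and the function names are hypothetical.

```python
def nested_any(doc, path, predicate):
    """Model of a nested query: true if any nested object satisfies predicate."""
    return any(predicate(obj) for obj in doc.get(path) or [])


USER = "15979293-6b04-41a9-a6aa-bba99499496f"

doc = {
    "uuid": "1e2a0c06-af24-42e1-a31a-0f84233521de",
    "subject": "subj",
    "relations": [
        {"userUuid": "0f38e576-6b1f-4c1a-86a8-67a55a06d504", "signed": False},
        {"userUuid": USER, "signed": True},
    ],
}


def scenario_1(d):
    """Two nested queries: each condition may hit a different nested object."""
    return nested_any(d, "relations", lambda r: r["userUuid"] == USER) and nested_any(
        d, "relations", lambda r: r["signed"] is False
    )


def scenario_2(d):
    """One nested query: both conditions must hold on the same nested object."""
    return nested_any(
        d, "relations", lambda r: r["userUuid"] == USER and r["signed"] is False
    )


print(scenario_1(doc), scenario_2(doc))  # True False
```

Scenario 2 is the one that yields the empty result the question expects for this document.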
Let me know if this helps!

How to query multiple parameters in a nested field in elasticsearch

I'm trying to search for a keyword and then add nested queries for amenities, which is a nested field holding an array of objects.
With the query below I am able to search when I'm matching only one amenity id, but when I have more than one it doesn't return anything.
Does anyone have an idea what is wrong with my query?
{
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"_geo_distance": {
"geolocation": [
100,
10
],
"order": "asc",
"unit": "m",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": [
"name^2",
"city",
"state",
"zip"
],
"fuzziness": 5,
"query": "complete"
}
},
{
"nested": {
"path": "amenities",
"query": {
"bool": {
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}
]
}
}
}
}
]
}
}
}
When you do:
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
What you're actually saying is: find me any nested object where "amenities.id"="1" and "amenities.id"="2". Unless "amenities.id" holds a list of values within a single nested object, that can never match.
What you probably want to say is: find me any document where "amenities.id"="1" or "amenities.id"="2".
To do that, use should instead of must:
"should": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
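If the intent is actually to find documents that have every listed amenity (an assumption, the question does not say), should is not enough, since it matches documents that have any one of them. Because each nested object holds a single id, one common approach is one nested clause per id under the outer must. A minimal sketch that builds those clauses as Python dicts; the helper name is hypothetical:

```python
import json


def require_all_amenities(ids, path="amenities", field="id"):
    """One nested clause per id, so each id may match a different nested object."""
    return [
        {"nested": {"path": path, "query": {"term": {f"{path}.{field}": amenity_id}}}}
        for amenity_id in ids
    ]


clauses = require_all_amenities(["1", "2"])
print(json.dumps(clauses, indent=2))
```

These clauses would go into the outer must array next to the multi_match clause, replacing the single nested clause.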

Elasticsearch AND Parens

I'm attempting to do the following with the query dsl but I'll express it as SQL:
(matrices.matrix = 'Matrix1' AND matrices.count = 1) AND (matrices.matrix = 'Matrix2' AND matrices.count >= 0)
So, I need to get docs that have both of these nested docs with these values.
This is the nested document; it sits at the _source level:
"matrices": [
{
"terms": [],
"count": 0,
"matrix": "none"
},
{
"terms": [
"greater"
],
"count": 1,
"matrix": "Matrix1"
}
]
And here is the mapping for the sub-doc:
"matrices": {
"type": "nested",
"include_in_parent": true,
"properties": {
"count": {
"type": "long"
},
"matrix": {
"type": "string"
},
"terms": {
"type": "string"
}
}
}
So, I need to generate a query that will allow me to get docs that match both (matrix = 'none' && count=0) && (matrix = 'Matrix' && count = 1)
Thanks,
So basically you want to retrieve documents that MUST contain two nested documents with the following criteria:
one nested document with matrices.count=0 AND matrices.matrix=none
another nested document with matrices.count=1 AND matrices.matrix=Matrix
Then with the mapping you have, you can achieve that result using the following query. We use bool/must with two nested queries, each of which matches the criteria for one of the nested documents that must be present.
curl -XPOST localhost:9200/_search -d '{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "matrices",
"query": {
"bool": {
"must": [
{
"term": {
"matrices.count": 0
}
},
{
"term": {
"matrices.matrix": "none"
}
}
]
}
}
}
},
{
"nested": {
"path": "matrices",
"query": {
"bool": {
"must": [
{
"term": {
"matrices.count": 1
}
},
{
"term": {
"matrices.matrix": "matrix"
}
}
]
}
}
}
}
]
}
}
}
}
}'
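Note that the filtered query used above was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the same structure is usually written with bool and filter. A sketch of that form built as a Python dict; the helper name is hypothetical, the values are copied from the answer above, and whether "matrix" matches still depends on how the field is analyzed:

```python
import json


def nested_all(path, conditions):
    """One nested clause whose term conditions must hold on the same object."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"term": {f"{path}.{field}": value}}
                        for field, value in conditions.items()
                    ]
                }
            },
        }
    }


body = {
    "query": {
        "bool": {
            "filter": [
                nested_all("matrices", {"count": 0, "matrix": "none"}),
                nested_all("matrices", {"count": 1, "matrix": "matrix"}),
            ]
        }
    }
}
print(json.dumps(body, indent=2))
```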
