Filtering with deep nested items - elasticsearch

I have the following mapping:
PUT /test
{
"mappings": {
"test": {
"properties": {
"parent": {
"type": "nested",
"properties": {
"#id": {
"type": "string",
"index": "not_analyzed"
},
"#type": {
"type": "string"
},
"child": {
"type": "nested",
"properties": {
"#id": {
"type": "string",
"index": "not_analyzed"
},
"subchild": {
"type": "nested",
"properties": {
"#id": {
"type": "string",
"index": "not_analyzed"
},
"hasA": {
"type": "nested",
"properties": {
"#value": {
"type": "string"
}
}
},
"hasB": {
"type": "nested",
"properties": {
"#id": {
"type": "string",
"index": "not_analyzed"
}
}
},
"hasC": {
"type": "nested",
"properties": {
"#id": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}
}
}
}
}
}
And the following document:
POST /test/test/1
{
"parent": {
"#id": "12345",
"#type": "test",
"child": [
{
"#id": "1",
"subchild": [
{
"#id": "1.1",
"hasA": {
"#value": "hasA value"
},
"hasB": {
"#id": "hasB_1"
},
"hasC": {
"#id": "hasC_1"
}
}
]
},
{
"#id": "2",
"subchild": [
{
"#id": "2.1",
"hasA": {
"#value": "hasA value"
},
"hasB": {
"#id": "hasB_2"
},
"hasC": {
"#id": "hasC_2"
}
}
]
}
]
}
}
And the following query:
POST test/test/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "parent.child.subchild.hasB",
"filter": {
"bool": {
"must": [
{
"term": {
"parent.child.subchild.hasB.#id": "hasB_2"
}
}
]
}
},
"_cache": false
}
}
}
}
}
I'm unable to set the path to just parent.child.subchild so that I can match on both hasB and hasC. It seems I can only select one nested item at a time. This is what I would like to be able to do:
POST test/test/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "parent.child.subchild",
"filter": {
"bool": {
"must": [
{
"term": {
"parent.child.subchild.hasB.#id": "hasB_2"
}
},
{
"term": {
"parent.child.subchild.hasC.#id": "hasC_2"
}
}
]
}
},
"_cache": false
}
}
}
}
}

Are you looking for something like this?
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "parent.child.subchild",
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "parent.child.subchild.hasB",
"query": {
"term": {
"parent.child.subchild.hasB.#id": "hasB_2"
}
}
}
},
{
"nested": {
"path": "parent.child.subchild.hasC",
"query": {
"term": {
"parent.child.subchild.hasC.#id": "hasC_2"
}
}
}
}
]
}
},
"_cache": false
}
}
}
}
}

The correct syntax for querying multi-level nested docs can be found here. Look at @martijnvg's comments in it.
The ES docs do not explain multi-level nested queries well.
Basically, you need to nest subchild inside child inside parent and give each level its own path. You will need three nested queries.
P.S. I have not tested this myself. Please let me know if it does not work.
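The nesting described above can be sketched as a small Python helper that builds the request body as a plain dict. This is only an illustration under assumptions: the helper names `wrap_nested` and `term_in` are hypothetical, the sketch uses the `nested` query form (`"query"` rather than `"filter"`), and it has not been run against a cluster.

```python
import json


def wrap_nested(path, inner_query):
    """Wrap inner_query in one nested clause per level of a dotted path.

    wrap_nested is a hypothetical helper, not part of any Elasticsearch client.
    """
    parts = path.split(".")
    query = inner_query
    # Start at the deepest level and wrap outwards, so the outermost
    # clause ends up with the shortest path ("parent").
    for depth in range(len(parts), 0, -1):
        query = {"nested": {"path": ".".join(parts[:depth]), "query": query}}
    return query


def term_in(path, field, value):
    """A term query on path.field, wrapped in a nested clause for that path."""
    return {"nested": {"path": path, "query": {"term": {path + "." + field: value}}}}


# Both conditions must hold inside the same subchild object.
same_subchild = {
    "bool": {
        "must": [
            term_in("parent.child.subchild.hasB", "#id", "hasB_2"),
            term_in("parent.child.subchild.hasC", "#id", "hasC_2"),
        ]
    }
}

body = {"query": wrap_nested("parent.child.subchild", same_subchild)}
print(json.dumps(body, indent=2))
```

The printed body contains three nested clauses (parent, parent.child, parent.child.subchild) around the bool, plus one nested clause each for hasB and hasC because those are mapped as nested too.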

Related

Elasticsearch nested query with aggregation using nested term doesn't return any bucket

I have an ES index with this mapping:
{
"_doc": {
"dynamic": "false",
"properties": {
"original": {
"properties":{
"id": {
"type": "keyword"
},
"purchaseStatus": {
"type": "keyword"
},
"marketCode": {
"type": "keyword"
},
"salesProfiles": {
"type": "nested",
"properties": {
"marketCode": {
"type": "keyword"
},
"purchaseStatus": {
"type": "keyword"
}
}
}
}
},
"recommended": {
"properties":{
"id": {
"type": "keyword"
},
"purchaseStatus": {
"type": "keyword"
},
"marketCode": {
"type": "keyword"
},
"salesProfiles": {
"type": "nested",
"properties": {
"marketCode": {
"type": "keyword"
},
"purchaseStatus": {
"type": "keyword"
}
}
}
}
},
"distance": {
"type": "double"
},
"rank": {
"type": "double"
},
"source": {
"properties": {
"application": {
"type": "keyword"
},
"platform": {
"type": "keyword"
}
}
},
"timestamp": {
"properties": {
"createdAt": {
"type": "date"
},
"updatedAt": {
"type": "date"
}
}
}
}
},
"_default_": {
"dynamic": "false"
}
}
I need to obtain the recommended docs with salesProfiles.marketCode equal to original.marketCode, but my query doesn't return any buckets:
GET index/_search
{
"aggs": {
"similarities": {
"filter": {
"bool": {
"must": [
{
"term": {
"original.storefrontId": "12345"
}
},
{
"nested": {
"path": "recommended.salesProfiles",
"query": {
"bool": {
"must": [
{
"match": {
"recommended.salesProfiles.purchaseStatus": "PAID"
}
}
]
}
}
}
}
]
}
},
"aggs": {
"markets": {
"nested": {
"path": "recommended.salesProfiles"
},
"aggs": {
"recommendedMarket": {
"terms": {
"field": "recommended.salesProfiles.marketCode",
"size": 100
}
}
}
}
}
}
},
"explain": false
}
Any suggestion would be really appreciated. Thanks in advance!
It's hard to debug this without any example docs, but I think this might work:
{
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"original.storefrontId": "12345"
}
},
{
"nested": {
"path": "recommended.salesProfiles",
"query": {
"bool": {
"must": [
{
"match": {
"recommended.salesProfiles.purchaseStatus": "PAID"
}
}
]
}
}
}
}
]
}
},
"aggs": {
"Profiles": {
"nested": {
"path": "recommended.salesProfiles"
},
"aggs": {
"by_term": {
"terms": {
"field": "recommended.salesProfiles.marketCode",
"size": 100
}
}
}
}
}
}
I don't think you can use "nested" under the filter agg without being under a nested aggregation, so I believe that's why you didn't get any docs.
I basically moved all the filtering into the query and then aggregated on the terms afterwards.
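One thing the query above leaves open: the nested aggregation counts every salesProfiles entry of the matching documents, not only the PAID ones. If the buckets should cover just the PAID profiles, a filter aggregation can be placed inside the nested aggregation. Here is a rough sketch in Python that builds such a request body as a dict. The aggregation names `profiles`, `paid_only` and `by_market` are made up for illustration, and nothing here was run against the index from the question.

```python
import json

NESTED_PATH = "recommended.salesProfiles"

# Condition reused for both document selection and bucket selection.
# purchaseStatus is mapped as keyword, so an exact term query is used.
paid = {"term": {NESTED_PATH + ".purchaseStatus": "PAID"}}

body = {
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {"term": {"original.storefrontId": "12345"}},
                {"nested": {"path": NESTED_PATH, "query": paid}},
            ]
        }
    },
    "aggs": {
        "profiles": {
            # Step into the nested documents first...
            "nested": {"path": NESTED_PATH},
            "aggs": {
                "paid_only": {
                    # ...then keep only the PAID profiles...
                    "filter": paid,
                    "aggs": {
                        # ...and bucket those by market code.
                        "by_market": {
                            "terms": {"field": NESTED_PATH + ".marketCode", "size": 100}
                        }
                    },
                }
            },
        }
    },
}
print(json.dumps(body, indent=2))
```

Also worth checking: original.storefrontId does not appear in the mapping shown in the question, which sets "dynamic": "false". If the field really is unmapped, it is not indexed, and a term query on it matches nothing, which would also explain the empty buckets.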

OR query in nested objects ElasticSearch

I use Elasticsearch version 1.7.5 and I am trying to fetch all documents where some fields are missing.
My mapping:
...
"participant": {
"properties": {
"id": {
"type": "string"
},
"firstName": {
"type": "string"
},
"lastName": {
"type": "string"
},
"name": {
"type": "string"
}
},
"coordinator": {
"properties": {
"id": {
"type": "string"
},
"firstName": {
"type": "string"
},
"lastName": {
"type": "string"
},
"name": {
"type": "string"
}
}
...
I want to query all documents that don't have coordinator.id or participant.id assigned yet.
My query looks like:
"query": {
"nested": {
"path": "coordinator, participant",
"query": {
"constant_score": {
"filter": {
"or": [
{
"missing": {
"field": "coordinator.id"
}
},
{
"missing": {
"field": "participant.id"
}
}
]
}
}
}
}
}
You do OR queries via the bool query:
https://www.elastic.co/guide/en/elasticsearch/reference/1.7/query-dsl-bool-filter.html
So this query would work:
{
"query": {
"bool": {
"should": [
{
"constant_score": {
"filter": {
"missing": {
"field": "participant.id"
}
}
}
},
{
"constant_score": {
"filter": {
"missing": {
"field": "coordinator.id"
}
}
}
}
]
}
}
}
I noticed that you were using a nested query, but the mapping does not declare coordinator and participant as nested field types, so that will not work:
https://www.elastic.co/guide/en/elasticsearch/reference/1.7/mapping-nested-type.html
Mapping a field as a nested type is only useful when you need search terms to match within the same object, so I don't think it is necessary for you.
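As a sketch, the same OR can be built programmatically for any list of fields. The helper below is hypothetical (`missing_any` is not a client API) and only assembles the request body in Python. It assumes plain object fields, as in the mapping above. The `use_exists` switch reflects that the `missing` filter is available in 1.x but was removed in later major versions, where `must_not` with `exists` is the usual replacement.

```python
import json


def missing_any(fields, use_exists=False):
    """Build a query matching documents where at least one field is missing.

    Hypothetical helper. use_exists=False mirrors the 1.7 answer above
    (constant_score + missing filter); use_exists=True builds the
    must_not/exists form for versions without the missing filter.
    """
    clauses = []
    for field in fields:
        if use_exists:
            clauses.append({"bool": {"must_not": [{"exists": {"field": field}}]}})
        else:
            clauses.append({"constant_score": {"filter": {"missing": {"field": field}}}})
    # should + minimum_should_match=1 makes the OR explicit.
    return {"query": {"bool": {"should": clauses, "minimum_should_match": 1}}}


body = missing_any(["participant.id", "coordinator.id"])
print(json.dumps(body, indent=2))
```

Calling `missing_any([...], use_exists=True)` yields the same OR structure with each branch expressed as a negated `exists` clause.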

An Elasticsearch filter to determine the absence of a value

I have a document that has students and grades for each student. It looks something like this:
"name": "bill",
"year": 2015,
"grades": [
{"subject": "math", "grade": "A"},
{"subject": "english", "grade": "B"}
], ...
I'm looking for query filter(s) that can give me:
a list of students who have studied 'math', and
a list of students who have not studied 'math'.
I'm thinking that an exists filter should do it, but I'm struggling to get my head around it.
It's a stylised example but the mappings are something like this:
"mappings": {
"student": {
"properties": {
"name": {
"type": "string"
},
"grades": {
"type": "nested",
"properties": {
"subject": {
"type": "string"
},
"grade": {
"type": "string"
}
}
}
}
}
}
You need to change your mapping a bit and, depending on your needs, I'd suggest aggregations.
First, your nested object needs "include_in_parent": true so that you can easily do the not studied 'math' part:
PUT /grades
{
"mappings": {
"student": {
"properties": {
"name": {
"type": "string"
},
"grades": {
"type": "nested",
"include_in_parent": true,
"properties": {
"subject": {
"type": "string"
},
"grade": {
"type": "string"
}
}
}
}
}
}
}
And the full query, using aggregations:
GET /grades/student/_search?search_type=count
{
"aggs": {
"studying_math": {
"filter": {
"nested": {
"path": "grades",
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"grades.subject": "math"
}
}
]
}
}
}
}
}
},
"aggs": {
"top_10": {
"top_hits": {
"size": 10
}
}
}
},
"not_studying_math": {
"filter": {
"bool": {
"must_not": [
{
"term": {
"grades.subject": "math"
}
}
]
}
},
"aggs": {
"top_10": {
"top_hits": {
"size": 10
}
}
}
}
}
}
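A term filter should do just fine, as long as the grades fields are also visible on the parent document (for example with "include_in_parent": true); a term filter outside a nested filter does not see fields that exist only in nested documents. For the inverse query, just negate it with a not filter: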
"query":
{
"filtered" : {
"query": {
"match_all": {}
},
"filter" : {
"term": {
"grades.subject": "math"
}
}
}
}
And for the ones who did not study math:
"query":
{
"filtered" : {
"query": {
"match_all": {}
},
"filter" : {
"not": {
"filter": {
"term": {
"grades.subject": "math"
}
}
}
}
}
}
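If grades stays a plain nested field (without include_in_parent), the negation can wrap the nested clause instead: a must_not around a nested query matches students for whom no grades entry has the subject math. Below is a small Python sketch that assembles both request bodies. The function names are hypothetical and the queries are untested against a live cluster.

```python
import json


def studied(subject):
    """Nested clause: some grades entry has this subject.

    subject is an analyzed string in the mapping, so with the default
    analyzer the term must be given in lowercase, e.g. "math".
    """
    return {"nested": {"path": "grades", "query": {"term": {"grades.subject": subject}}}}


def students_with(subject):
    """Request body for students who have studied the subject."""
    return {"query": studied(subject)}


def students_without(subject):
    """Request body for students who have not studied the subject."""
    # Negate the whole nested clause, not the term inside it: negating
    # inside would match students with any non-math grades entry.
    return {"query": {"bool": {"must_not": [studied(subject)]}}}


print(json.dumps(students_with("math"), indent=2))
print(json.dumps(students_without("math"), indent=2))
```

The placement of the negation is the design point here: must_not outside nested means "no entry matches", while must_not inside nested would mean "some entry does not match".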

Elasticsearch using nested object mappings in has_parent filter

Is it possible to use nested object mappings in a has_parent query or filter? Whenever I try I always get:
[customers_and_messages] [nested] nested object under path [accounts] is not of nested type]
Mapping
{
"customer": {
"properties": {
"name": {
"analyzer": "name_edge_ngram_analyzer",
"type": "string"
},
"phone_number": {
"analyzer": "phone_number_edge_ngram_analyzer",
"type": "string"
},
"location_id": {
"index": "not_analyzed",
"type": "string"
},
"accounts": {
"type": "nested",
"properties": {
"id": {
"index": "not_analyzed",
"type": "string"
},
"tags": {
"type": "string"
},
"notes": {
"type": "string"
}
}
}
}
}
}
Query
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"has_parent": {
"type": "customer",
"filter": {
"nested": {
"path": "accounts",
"filter": {
"term": {
"accounts.id": "8392f356-d6ec-11e4-8737-b74c1bacfb15"
}
}
}
}
}
}
],
"should": [
{
"term": {
"kind": "task"
}
},
{
"term": {
"kind": "alert"
}
}
]
}
}
}
}
}

ElasticSearch Nested Query formulation

I have a mapping like this
{
"experience": {
"type": "nested",
"properties": {
"end": {
"type": "string"
},
"organization": {
"type": "nested",
"properties": {
"details": {
"type": "string"
},
"name": {
"type": "string"
}
}
}
}
}
}
Now I want to make a query like this:
{
"nested": {
"path": "experience",
"query": {
"bool": {
"must": [{
"match": {
"experience.organization.name": {
"query": company_name,
"operator": "and"
}
}
}, {
"match": {
"experience.end": "Present"
}
}]
}
}
}
}
The above query is not returning any results. Is this the correct way to index and query the above scenario?
I am confused about what the value of path should be, since experience.organization.name and experience.end are not at the same level.
Here is a complete working sample with your code:
PUT index1/test1/_mapping
{
"test1": {
"properties": {
"experience": {
"type": "nested",
"properties": {
"end": {
"type": "string"
},
"organization": {
"type": "nested",
"properties": {
"details": {
"type": "string"
},
"name": {
"type": "string"
}
}
}
}
}
}
}
}
POST index1/test1
{
"experience": {
"end": "Present",
"organization": {
"name": "org1",
"details": "some details here"
}
}
}
GET index1/test1/_search
{
"query": {
"nested": {
"path": "experience",
"query": {
"bool": {
"must": [
{
"match": {
"experience.end": "present"
}
},
{
"nested": {
"path": "experience.organization",
"query": {
"match": {
"experience.organization.name": "org1"
}
}
}
}
]
}
}
}
}
}
That being said, you have a doubly nested object here, which will probably work against you in the long run. I would consider flattening the data so that the inner nested level is not necessary.
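To illustrate the flattening idea, here is one possible sketch in Python that rewrites a document before indexing so that each experience entry carries its organization fields directly. The function names and the flattened field names (organization_name, organization_details) are assumptions made for this example, not something from the question.

```python
def as_list(value):
    """Normalize a missing value, a single object or a list to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def flatten_experience(doc):
    """Return a copy of doc with organization fields pulled up into experience.

    Hypothetical helper: one output entry per (experience, organization) pair.
    """
    flat = []
    for exp in as_list(doc.get("experience")):
        base = {k: v for k, v in exp.items() if k != "organization"}
        # Keep the experience entry even when it has no organization.
        orgs = as_list(exp.get("organization")) or [{}]
        for org in orgs:
            entry = dict(base)
            for key, value in org.items():
                entry["organization_" + key] = value
            flat.append(entry)
    out = dict(doc)
    out["experience"] = flat
    return out


doc = {
    "experience": {
        "end": "Present",
        "organization": {"name": "org1", "details": "some details here"},
    }
}
flat_doc = flatten_experience(doc)
print(flat_doc)
```

With documents in this shape, a single nested level on experience would be enough, and both match clauses (on experience.end and experience.organization_name) could sit in the same nested query.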

Resources