Multiple (AND) queries for a nested index structure in Elasticsearch - elasticsearch

I have an index with the below mapping
{
"mappings": {
"xxxxx": {
"properties": {
"ID": {
"type": "text"
},
"pairs": {
"type": "nested"
},
"xxxxx": {
"type": "text"
}
}
}
}
}
the pairs field is essentially an array of objects - each object has a unique ID associated with it
What i'm trying to do is to get only one object from the pairs field for updates. To that extent , i've tried this
GET /sample/_search/?size=1000
{
"query": {
"bool": {
"must": [
{
"match": {
"ID": "2rXdCf5OM9g1ebPNFdZNqW"
}
},
{
"match": {
"pairs.id": "c1vNGnnQLuk"
}
}
]
}
},
"_source": "pairs"
}
but this just returns an empty object despite them being valid IDs. If i remove the pairs.id rule - i get the entire array of objects .
What do i need to add/edit to ensure that i can query via both IDS (original and nested)

Since pairs is of nested type, you need to use a nested query. Also you might probably want to leverage nested inner-hits as well:
GET /sample/_search/?size=1000
{
"query": {
"bool": {
"must": [
{
"match": {
"ID": "2rXdCf5OM9g1ebPNFdZNqW"
}
},
{
"nested": {
"path": "pairs",
"query": {
"match": {
"pairs.id": "c1vNGnnQLuk"
}
},
"inner_hits": {}
}
}
]
}
},
"_source": false
}

Related

Search-as-you-type inside arrays

I am trying to implement a search-as-you-type query inside an array.
This is the structure of the documents:
{
"guid": "6f954d53-df57-47e3-ae9e-cb445bd566d3",
"labels":
[
{
"name": "London",
"lang": "en"
},
{
"name": "Llundain",
"lang": "cy"
},
{
"name": "Lunnainn",
"lang": "gd"
}
]
}
and up to now this is what I came with:
{
"query": {
"multi_match": {
"fields": ["labels.name"],
"query": name,
"type": "phrase_prefix"
}
}
which works exactly as requested.
The problem is that I would like to search also by language.
What I tried is:
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
but these queries act on separate values of the array.
So, for example, I would like to search only Welsh language (cy). That means that my query that contains the city name should match only values that have "cy" on the "lang" tag.
How do I write this kind of query?
Internally, ElasticSearch flattens nested JSON objects, so it can't correlate the lang and name of a specific element in the labels array. If you want this kind of correlation, you'll need to index your documents differently.
The usual way to do this is to use the nested data type with a matching nested query.
The query would end up looking something like this:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}
But note that you'll need to also specify nested mappings for your labels, e.g.:
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
/* you might want to add other mapping-related configuration here */
},
"lang": {
"type": "keyword"
}
}
}
}
Other ways to do this include:
Indexing each label as a separate document, repeating the guid field
Using parent/child documents
You should use Nested datatype in mapping instead of Object datatype. For detail explanation refer this:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
So, you should define mapping of your field something like this:
{
"properties": {
"labels": {
"type": "nested",
"properties": {
"name": {
"type": "text"
},
"lang": {
"type": "keyword"
}
}
}
}
}
After this you could query using Nested Query as:
{
"query": {
"nested": {
"path": "labels",
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": ["labels.name"],
"query": "london",
"type": "phrase_prefix"
}
},
{
"term": {
"labels.lang": "gd"
}
}
]
}
}
}
}
}

Elastic Search query for an AND condition on two properties of a nested object

I have the post_filter as below, Where I am trying to filter records where the school name is HILL SCHOOL AND containing a nested child object with name JOY AND section A.
school is present in the parent object, Which is holding children list of nested objects.
All of the above are AND conditions.
But the query doesn't seem to work. Any idea why ? And is there a way to combine the two nested queries?
GET /test_school/_search
{
"query": {
"match_all": {}
},
"post_filter": {
"bool": {
"must_not": [
{
"bool": {
"must": [
{
"term": {
"schoolname": {
"value": "HILL SCHOOL"
}
}
},
{
"nested": {
"path": "children",
"query": {
"bool": {
"must": [
{
"match": {
"name": "JACK"
}
}
]
}
}
}
},
{
"term": {
"children.section": {
"value": "A"
}
}
}
]
}
}
]
}
}
}
The schema is as below:
PUT /test_school
{
"mappings": {
"_doc": {
"properties": {
"schoolname": {
"type": "keyword"
},
"children": {
"type": "nested",
"properties": {
"name": {
"type": "keyword",
"index": true
},
"section": {
"type": "keyword",
"index": true
}
}
}
}
}
}
}
Sample data as below:
POST /test_school/_doc
{
"schoolname":"HILL SCHOOL",
"children":{
"name":"JOY",
"section":"A"
}
}
second record
POST /test_school/_doc
{
"schoolname":"HILL SCHOOL",
"children":{
"name":"JACK",
"section":"B"
}
}
https://stackoverflow.com/a/17543151/183217 suggests special mapping is needed to work with nested objects. You appear to be falling foul of the "cross object matching" problem.

How do I refer to multiple nesting levels in an Elastic Search's Filter Aggregation?

Let's call my root level foo and my child level events. I want to aggregate on the events level but with a filter that EITHER the event has color "orange" OR the parent foo has customerId "35".
So, I want to have a filter aggregation that's inside a nested aggregation. In this filter's query clause, I have one child that refers to a field on foo and the other refers to a field on events. However, that first child has no way to actually reference the parent like that! I can't use a reverse_nested aggregation because I can't put one of those as a child of a compound query, and I can't filter before nesting because I'd lose the OR semantics that way. How do I reference the field on foo?
Concrete example if it helps. Mapping:
{
"foo": {
"properties": {
"customer_id": { "type": "long" },
"events": {
"type": "nested",
"properties": {
"color": { "type": "keyword" },
"coord_y": { "type": "double" }
}
}
}
}
}
(update for clarity: that's an index named foo with the root mapping named foo)
The query I want to be able to make:
{
"aggs": {
"OP0_nest": {
"nested": { "path": "events" },
"aggs": {
"OP0_custom_filter": {
"filter": {
"bool": {
"should": [
{ "term": { "events.color": "orange" } },
{ "term": { "customer_id": 35 } }
]
}
},
"aggs": {
"OP0_op": {
"avg": { "field": "events.coord_y" }
}
}
}
}
}
}
}
Of course, this does not work, because the child of the should clause containing customer_id does not work. That term query is always false because customer_id can't be accessed inside the nested aggregation.
Thanks in advance!
Since the fields you want to apply filter on are at different levels you need to make query for each level separately and place them in should clause of bool query which becomes the filter for our filter aggregation. In this aggregation we then add a nested aggregation to get the avg of coord_y.
The aggregation will be (UPDATED: since foo is index name removed foo from field names):
{
"aggs": {
"OP0_custom_filter": {
"filter": {
"bool": {
"should": [
{
"term": {
"customer_id": 35
}
},
{
"nested": {
"path": "events",
"query": {
"term": {
"events.color": "orange"
}
}
}
}
]
}
},
"aggs": {
"OP0_op": {
"nested": {
"path": "events"
},
"aggs": {
"OP0_op_avg": {
"avg": {
"field": "events.coord_y"
}
}
}
}
}
}
}
}

elasticsearch - find document by exactly matching a nested object

I have documents that contain multiple role/right definitions as an array of nested objects:
{
...
'roleRights': [
{'roleId':1, 'right':1},
{'roleId':2, 'right':1},
{'roleId':3, 'right':2},
]
}
I am trying to filter out document with specific roleRights, but my query seems to mix up combinations. Here is my filterQuery as "pseudoCode"
boolFilter > must > termQuery >roleRights.roleId: 1
boolFilter > must > termQuery >roleRights.type: 2
The above should only return
documents that have role 1 assigned with right 2.
But it looks like i get
all document that have role 1 assigned disregarding the right
and all documents that have right 2 assigned disregarding the role.
Any hints?
You need to map roleRights as nested (see a good explanation here), like below:
PUT your_index
{
"mappings": {
"your_type": {
"properties": {
"roleRights": {
"type": "nested",
"properties": {
"roleId": { "type": "integer" },
"right": { "type": "integer" }
}
}
}
}
}
}
Make sure to delete your index first, recreate it and re-populate it.
Then you'll be able to make your query like this:
POST your_index/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "roleRights",
"query": {
"term": { "roleRights.roleId": 1}
}
}
},
{
"nested": {
"path": "roleRights",
"query": {
"term": { "roleRights.type": 2}
}
}
}
]
}
}
}

How to query for two fields in one and the same tuple in an array in ElasticSearch?

Let's say there are some documents in my index which look like this:
{
"category":"2020",
"properties":[
{
"name":"foo",
"value":"2"
},
{
"name":"boo",
"value":"2"
}
]
},
{
"category":"2020",
"properties":[
{
"name":"foo",
"value":"8"
},
{
"name":"boo",
"value":"2"
}
]
}
I'd like to query the index in a way to return only those documents that match "foo":"2"but not "boo":"2".
I tried to write a query that matches both properties.name and properties.value, but then I'm getting false positives. I need a way to tell ElasticSearch that name and value have to be part of the same properties tuple.
How can I do that?
You need to map properties as a nestedtype. So your mapping would look similar to this:
{
"your_type": {
"properties": {
"category": {
"type": "string"
},
"properties": {
"type": "nested",
"properties": {
"name": {
"type": "string"
},
"value": {
"type": "string"
}
}
}
}
}
}
Then, your query to match documents having "foo=2" in the same tuple but not "boo=2" in the same tuple would need to use the nested query accordingly, like the one below.
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "properties",
"query": {
"bool": {
"must": [
{
"match": {
"properties.name": "foo"
}
},
{
"match": {
"properties.value": "2"
}
}
]
}
}
}
}
],
"must_not": [
{
"nested": {
"path": "properties",
"query": {
"bool": {
"must": [
{
"match": {
"properties.name": "boo"
}
},
{
"match": {
"properties.value": "2"
}
}
]
}
}
}
}
]
}
}
}
#Val's answer is as good as it gets. One thing I would add, though, since it makes the difference between one type of query and others that might benefit from nesteds "opposite" feature.
In Elasticsearch, the default type for "properties":[{"name":"foo","value":"2"},{"name":"boo","value":"2"}] that is used to auto-create such a field is object. The object has the drawback that it doesn't associate one sub-field's value with another sub-field's value, meaning foo is not necessarily associated with 2. name is just an array of values and value is the again another array of values with not association between the two.
If one needs the above association to work then nested is a must.
But, I have encountered situations where both these features were needed. If you need both of these, you can set include_in_parent: true for the mapping so that you can take advantage of both. One of the situations that I have seen is here.
"properties": {
"type": "nested",
"include_in_parent": true,
"properties": {
"name": {
"type": "string"
...

Resources