Bool AND search in properties in ElasticSearch - elasticsearch

I have a very small dataset of documents indexed in ES:
{"id":1, "name": "John", "team":{"code":"red", "position":"P"}}
{"id":2, "name": "Jack", "team":{"code":"red", "position":"S"}}
{"id":3, "name": "Emily", "team":{"code":"green", "position":"P"}}
{"id":4, "name": "Grace", "team":{"code":"green", "position":"P"}}
{"id":5, "name": "Steven", "team":[
{"code":"green", "position":"S"},
{"code":"red", "position":"S"}]}
{"id":6, "name": "Josephine", "team":{"code":"red", "position":"S"}}
{"id":7, "name": "Sydney", "team":[
{"code":"red", "position":"S"},
{"code":"green", "position":"P"}]}
I want to query ES for people who are in the red team, with position P.
With the request
curl -XPOST 'http://localhost:9200/teams/aff/_search' -d '{
"query": {
"bool": {
"must": [
{
"match": {
"team.code": "red"
}
},
{
"match": {
"team.position": "P"
}
}
]
}
}
}'
I get the wrong result.
ES returns
"name": "John",
"team":
{ "code": "red", "position": "P" }
and
"name": "Sydney",
"team":
[
{ "code": "red", "position": "S"},
{ "code": "green", "position": "P"}
]
For the last entry, ES matched code=red in the first team record and position=P in the second one.
How can I specify that both terms must match within the same record (whether or not it sits in a list of nested records)?
In fact, the only correct answer is document 1, John.
Here is the gist that creates the dataset :
https://gist.github.com/flrt/4633ef59b9b9ec43d68f
Thanks in advance

When you index a document like
{
"name": "Sydney",
"team": [
{"code": "red", "position": "S"},
{"code": "green","position": "P"}
]
}
ES implicitly creates an inner object for the field (team in this example) and flattens it to a structure like
{
'team.code': ['red', 'green'],
'team.position': ['S', 'P']
}
So you lose the association between each code and its position. To avoid this, explicitly map the field as nested, index your documents as usual, and query them with a nested query.
So, this
PUT so/nest/_mapping
{
"nest": {
"properties": {
"team": {
"type": "nested"
}
}
}
}
POST so/nest/
{
"name": "Sydney",
"team": [
{
"code": "red",
"position": "S"
},
{
"code": "green",
"position": "P"
}
]
}
GET so/nest/_search
{
"query": {
"nested": {
"path": "team",
"query": {
"bool": {
"must": [
{
"match": {
"team.code": "red"
}
},
{
"match": {
"team.position": "P"
}
}
]
}
}
}
}
}
returns no hits, as expected: no single team entry has both code red and position P.
Further reading on relation management: https://www.elastic.co/blog/managing-relations-inside-elasticsearch
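The flattening described in this answer can be imitated locally. The following is a minimal plain-Python sketch, not Elasticsearch code: flatten, matches_flat and matches_nested are hypothetical helper names, and values are compared exactly (no text analysis), which is a simplification.

```python
from collections import defaultdict


def flatten(doc):
    """Collect every leaf value under its dotted path, the way an
    object (non-nested) field loses the grouping of array entries."""
    flat = defaultdict(list)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{path}.{key}" if path else key)
        elif isinstance(value, list):
            for item in value:
                walk(item, path)
        else:
            flat[path].append(value)

    walk(doc, "")
    return dict(flat)


def matches_flat(doc, conditions):
    """Bool/must over flattened fields: each condition is checked
    against the whole document, independently of the others."""
    flat = flatten(doc)
    return all(value in flat.get(field, []) for field, value in conditions.items())


def matches_nested(doc, path, conditions):
    """Nested-style check: all conditions must hold on the same entry."""
    items = doc.get(path, [])
    if isinstance(items, dict):  # a single object behaves like a one-element array
        items = [items]
    return any(
        all(item.get(field) == value for field, value in conditions.items())
        for item in items
    )


sydney = {"name": "Sydney", "team": [{"code": "red", "position": "S"},
                                     {"code": "green", "position": "P"}]}
john = {"name": "John", "team": {"code": "red", "position": "P"}}

print(matches_flat(sydney, {"team.code": "red", "team.position": "P"}))      # True
print(matches_nested(sydney, "team", {"code": "red", "position": "P"}))      # False
print(matches_nested(john, "team", {"code": "red", "position": "P"}))        # True
```

Sydney passes the flattened check because red and P both occur somewhere in the document, but fails the per-entry check, which is the behaviour the nested mapping restores.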

You can use a Nested Query so that your searches happen individually on the subdocuments in the team array, rather than across the entire document.
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "team",
"query": {
"bool": {
"must": [
{ "match": { "team.code": "red" } },
{ "match": { "team.position": "P" } }
]
}
}
}
}
]
}
}
}
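If the request body is assembled in application code, a small builder keeps the path prefix consistent. This is only a sketch that produces a plain dict; nested_all is a hypothetical helper name, and the outer bool/must wrapper from the query above is omitted because it holds a single clause.

```python
import json


def nested_all(path, conditions):
    """Build a nested query that requires every condition to match
    on the same object under `path`."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        {"match": {f"{path}.{field}": value}}
                        for field, value in conditions.items()
                    ]
                }
            },
        }
    }


body = {"query": nested_all("team", {"code": "red", "position": "P"})}
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request against an index where team is mapped as nested.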

Related

ElasticSearch array data match multiple properties in nested element with AND condition

I'm facing a problem where I have two documents, each containing an array of objects. I would like to find the one document containing a nested object that matches two properties (both at the same time, in the same object), but I always get both documents.
I created the documents with:
POST /respondereval/_doc
{
"resp_id": "1236",
"responses": [
{"key": "meta","text":"abc"},
{"key": "property 1", "text": "yes"},
{"key": "property 2", "text": "yes"}
]
}
POST /respondereval/_doc
{
"resp_id": "1237",
"responses": [
{"key": "meta","text":"abc"},
{"key": "property 1", "text": "no"},
{"key": "property 2", "text": "yes"}
]
}
I defined a mapping for the index to prevent ES from flattening the objects, like this:
PUT /respondereval
{
"mappings" : {
"properties": {
"responses" : {
"type": "nested"
}
}
}
}
I would now like to find the first document (resp_id 1236) with the following query:
GET /respondereval/_search
{
"query": {
"nested": {
"path": "responses",
"query": {
"bool": {
"must": [
{ "match": { "responses.key": "property 1" } },
{ "match": { "responses.text": "yes" } }
]
}
}
}
}
}
This should return only the one document in which a single element matches both conditions at the same time.
Unfortunately, it always returns both documents. I assume it's because at some point ES still flattens the values in the nested object arrays into something like this (simplified):
resp_id 1236: "key": ["meta", "property 1", "property 2"], "text": ["abc", "yes", "yes"]
resp_id 1237: "key": ["meta", "property 1", "property 2"], "text": ["abc", "no", "yes"]
both of which contain property 1 and yes.
What is the correct way to solve this, so that only documents are returned that contain an element in the objects array matching both conditions ("key": "property 1" AND "text": "yes") at the same time?
The problem is with your mapping. The key and text fields are mapped as text, which uses the standard analyzer by default.
The standard analyzer splits text into tokens on word boundaries, so
property 1 is tokenized as
{
"tokens": [
{
"token": "property",
"start_offset": 0,
"end_offset": 8,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "1",
"start_offset": 9,
"end_offset": 10,
"type": "<NUM>",
"position": 1
}
]
}
property 2 is tokenized the same way, into property and 2.
By default a match query only needs one of its tokens to match. So in the second document the object {"key": "property 2", "text": "yes"} satisfies both clauses: property 1 matches the property token of the key, and yes matches the text.
Hence both documents are returned.
To make it work, query the keyword sub-fields:
{
"query": {
"nested": {
"path": "responses",
"query": {
"bool": {
"must": [
{ "match": { "responses.key.keyword": "property 1" } },
{ "match": { "responses.text.keyword": "yes" } }
]
}
}
}
}
}
Alternatively, use a phrase query on the key:
{
"query": {
"nested": {
"path": "responses",
"query": {
"bool": {
"must": [
{ "match_phrase": { "responses.key": "property 1" } },
{ "match": { "responses.text": "yes" } }
]
}
}
}
}
}
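The difference between the three variants can be checked locally. This is a rough plain-Python sketch: analyze only approximates the standard analyzer (lowercasing and splitting on non-alphanumeric characters), and all function names are hypothetical, not Elasticsearch APIs.

```python
import re


def analyze(text):
    """Approximate the standard analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(value, query):
    """match with the default OR operator: one shared token is enough."""
    return bool(set(analyze(value)) & set(analyze(query)))


def match_phrase(value, query):
    """All query tokens must appear consecutively and in order."""
    tokens, wanted = analyze(value), analyze(query)
    return any(tokens[i:i + len(wanted)] == wanted
               for i in range(len(tokens) - len(wanted) + 1))


def keyword(value, query):
    """Keyword fields are not analyzed: the whole value must be equal."""
    return value == query


def nested_hit(responses, key_matcher):
    """True if one single response object satisfies both clauses."""
    return any(key_matcher(r["key"], "property 1") and match(r["text"], "yes")
               for r in responses)


doc_1236 = [{"key": "meta", "text": "abc"},
            {"key": "property 1", "text": "yes"},
            {"key": "property 2", "text": "yes"}]
doc_1237 = [{"key": "meta", "text": "abc"},
            {"key": "property 1", "text": "no"},
            {"key": "property 2", "text": "yes"}]

for name, matcher in [("match", match), ("keyword", keyword), ("match_phrase", match_phrase)]:
    print(name, nested_hit(doc_1236, matcher), nested_hit(doc_1237, matcher))
# match True True
# keyword True False
# match_phrase True False
```

Only the plain match variant lets document 1237 through, because the token property is shared between property 1 and property 2.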
Have you tried the bool must query directly, without the nested wrapper?
{
"query": {
"bool": {
"must": [
{
"match": {
"responses.key": "property 1"
}
},
{
"match": {
"responses.text": "yes"
}
}
]
}
}
}

Multi match query with terms lookup searching multiple indices elasticsearch 6.x

All,
I am working on building a NEST 6.x query that takes a search term and looks in different fields in different indices.
This is what I have so far, but it is not returning the results I expect.
Please see the details below.
Indices used
dev-sample-search
user-agents-search
The way the search should work is as follows.
The value in the query field (27921093) is searched against the fields agentNumber, customerName, fileNumber and documentid (these are all analyzed fields).
The search should limit the documents to the agentNumbers that the user sampleuser#gmail.com has access to (sample data for user-agents-search is added below).
agentNumber, customerName, fileNumber, documentid and status are part of the index dev-sample-search.
The status field is defined as a keyword.
The fields in the user-agents-search index are all keywords.
Sample user-agents-search index data:
{
"id": "sampleuser#gmail.com",
"user": "sampleuser#gmail.com",
"agentNumber": [
"123.456.789",
"1011.12.13.14"
]
}
Sample dev-sample-search index data:
{
"agentNumber": "123.456.789",
"customerName": "Bank of america",
"fileNumber":"test_file_1123",
"documentid":"1234456789"
}
GET dev-sample-search/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"multi_match": {
"type": "best_fields",
"query": "27921093",
"operator": "and",
"fields": [
"agentNumber",
"customerName",
"fileNumber",
"documentid^10"
]
}
}
],
"filter": [
{
"bool": {
"must": [
{
"terms": {
"agentNumber": {
"index": "user-agents-search",
"type": "_doc",
"user": "sampleuser#gmail.com",
"path": "agentNumber"
}
}
},
{
"bool": {
"must_not": [
{
"terms": {
"status": {
"value": "pending"
}
}
},
{
"term": {
"status": {
"value": "cancelled"
}
}
},
{
"term": {
"status": {
"value": "app cancelled"
}
}
}
],
"should": [
{
"term": {
"status": {
"value": "active"
}
}
},
{
"term": {
"status": {
"value": "terminated"
}
}
}
]
}
}
]
}
}
]
}
}
}
I see a couple of things that you may want to look at:
In the terms lookup query, "user": "sampleuser#gmail.com" should be "id": "sampleuser#gmail.com".
If at least one of the should clauses in the filter must match, set "minimum_should_match": 1 on the bool query that contains the should clauses.
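Applied to the filter part of the question's query, the two suggestions might look like the following sketch. It only builds the JSON body as a Python dict; access_filter is a hypothetical helper name, and folding the three must_not clauses into a single terms list is an assumption made here for brevity, not something the answer prescribes.

```python
import json


def access_filter(user_id):
    """Filter clause: restrict to the user's agent numbers and allowed statuses."""
    return {
        "bool": {
            "must": [
                {
                    # Terms lookup: the source document is addressed by "id"
                    "terms": {
                        "agentNumber": {
                            "index": "user-agents-search",
                            "type": "_doc",
                            "id": user_id,
                            "path": "agentNumber",
                        }
                    }
                },
                {
                    "bool": {
                        "must_not": [
                            {"terms": {"status": ["pending", "cancelled", "app cancelled"]}}
                        ],
                        "should": [
                            {"term": {"status": {"value": "active"}}},
                            {"term": {"status": {"value": "terminated"}}},
                        ],
                        # Require at least one of the should clauses to match
                        "minimum_should_match": 1,
                    }
                },
            ]
        }
    }


print(json.dumps(access_filter("sampleuser#gmail.com"), indent=2))
```

The returned dict is meant to replace the single element of the "filter" array in the original request.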

ElasticSearch Nested Query Does not work as expected

I need some quick help: I need to fetch the entire document only when certain conditions are all fulfilled within the same array element.
Example
The document should be returned only when all three of these conditions are fulfilled in one array block, i.e.
"profile.bud.buddies.code": "1"
"profile.bud.buddies.moredata.key":"one"
"profile.bud.buddies.moredata.val": "0"
Unfortunately, right now it goes through the entire document and tries to match the values across all of those arrays, so code=1 may match in one array element, key=one in another and val=0 in a third. In that case it returns the entire document, even though the conditions were never fulfilled within a single array element, so the document should not have been returned.
I mapped moredata as a nested type but still cannot get it to work. Please help.
Query I am using
"query": {
"bool": {
"should": [
{
"match": {
"profile.bud.buddies.code": "1"
}
}
]
},
"nested": {
"path": "profile.bud.buddies.moredata",
"query": {
"bool": {
"must": [
{
"match": {
"profile.bud.buddies.moredata.key": "one"
}
},
{
"match": {
"profile.bud.buddies.moredata.val": "0"
}
}
]
}
}
}
}
Document Structure
"profile": {
"x":{},
"y":{},
"a":{},
"b":{},
"bud":{
"buddies": [
{
"code":"1",
"moredata": [
{
"key": "one",
"val": 0,
"setup": "2323",
"data": "myid"
},
{
"key": "two",
"val": 1,
"setup": "23",
"data": "id"
}]
},
{
"code":"2",
"moredata": [
{
"key": "two",
"val": 0,
"setup": "2323",
"data": "myid"
},
{
"key": "three",
"val": 1,
"setup": "23",
"data": "id"
}]
}]
}
This is how I have defined the mappings:
"profile": {
"bug": {
"properties": {
"buddies": {
"properties": {
"moredata": {
"type": "nested",
"properties": {
"key": {"type": "string"},
"val": {"type": "string"}
Your query structure is incorrect; it should be something like
"query": {
"bool": {
"must": [{
"match": {
"profile.bud.buddies.code": "1"
}
},
{
"nested": {
"path": "profile.bud.buddies.moredata",
"query": {
"bool": {
"must": [{
"match": {
"profile.bud.buddies.moredata.key": "one"
}
},
{
"match": {
"profile.bud.buddies.moredata.val": "0"
}
}
]
}
}
}
}
]
}
}
where the nested query is inside of the array of must clauses of the outer bool query. Note that profile.bud.buddies.moredata must be mapped as a nested data type.

Elasticsearch query fails to return results when querying a nested object

I have an object which looks something like this:
{
"id": 123,
"language_id": 1,
"label": "Pablo de la Pena",
"office": {
"count": 2,
"data": [
{
"id": 1234,
"is_office_lead": false,
"office": {
"id": 1,
"address_line_1": "123 Main Street",
"address_line_2": "London",
"address_line_3": "",
"address_line_4": "UK",
"address_postcode": "E1 2BC",
"city_id": 1
}
},
{
"id": 5678,
"is_office_lead": false,
"office": {
"id": 2,
"address_line_1": "77 High Road",
"address_line_2": "Edinburgh",
"address_line_3": "",
"address_line_4": "UK",
"address_postcode": "EH1 2DE",
"city_id": 2
}
}
]
},
"primary_office": {
"id": 1,
"address_line_1": "123 Main Street",
"address_line_2": "London",
"address_line_3": "",
"address_line_4": "UK",
"address_postcode": "E1 2BC",
"city_id": 1
}
}
My Elasticsearch mapping looks like this:
"mappings": {
"item": {
"properties": {
"office": {
"properties": {
"data": {
"type": "nested"
}
}
}
}
}
}
My Elasticsearch query looks something like this:
GET consultant/item/_search
{
"from": 0,
"size": 24,
"query": {
"bool": {
"must": [
{
"term": {
"language_id": 1
}
},
{
"term": {
"office.data.office.city_id": 1
}
}
]
}
}
}
This returns zero results. However, if I remove the second term and leave only the language_id clause, it works as expected.
I'm sure this is down to a misunderstanding on my part of how the nested object is flattened, but I'm out of ideas; I've tried all kinds of permutations of the query and mappings.
Any guidance is hugely appreciated. I am using Elasticsearch 6.1.1.
I'm not sure whether you need the entire record or not; this solution returns every record that has language_id: 1 and an office.data.office.city_id value of 1.
GET consultant/item/_search
{
"from": 0,
"size": 100,
"query": {
"bool":{
"must": [
{
"term": {
"language_id": {
"value": 1
}
}
},
{
"nested": {
"path": "office.data",
"query": {
"match": {
"office.data.office.city_id": 1
}
}
}
}
]
}
}
}
I put 3 different records in my test index to guard against false hits: one with a different language_id, one with different office IDs, and the matching one. Only the matching one was returned.
If you only need the office data, then that's a bit different but still solvable.

Elasticsearch: Query nested object contained within an object

I'm struggling to build a query where I can do a nested search across a sub-object of a document.
Say I have the following index/mapping:
curl -XPOST "http://localhost:9200/author/" -d '
{
"mappings": {
"item": {
"properties": {
"books": {
"type": "object",
"properties": {
"data": {
"type": "nested"
}
}
}
}
}
}
}
'
And the following 2 documents in the index:
{
"id": 1,
"name": "Robert Louis Stevenson",
"books": {
"count": 2,
"data": [
{
"id": 1,
"label": "Treasure Island"
},
{
"id": 3,
"label": "Dr Jekyll and Mr Hyde"
}
]
}
}
and
{
"id": 2,
"name": "Philip K. Dick",
"books": {
"count": 1,
"data": [
{
"id": 4,
"label": "Do Android Dream of Electric Sheep"
}
]
}
}
I have an array of book IDs, say [1,4]; how would I write a query which does a keyword search on the author name AND only returns authors who wrote one of the books in the array?
I haven't managed to get a query which doesn't cause some sort of query parse_exception, but as a starting block, here's the current iteration of my query - maybe it's obvious where I'm going wrong?
{
"query": {
"bool": {
"must": {
"match": {
"label": "Louis"
}
}
},
"nested": {
"path": "books.data",
"query": {
"bool": {
"must": {
"terms": {
"books.data.id": [
1,
4
]
}
}
}
}
}
},
"from": 0,
"size": 8
}
In the above scenario I'd like the document for Mr Robert Louis Stevenson to be returned, as his name contains Louis and he wrote book ID 1.
For what it's worth, the current error I get looks like this:
{
"error": {
"root_cause": [
{
"type": "parse_exception",
"reason": "failed to parse search source. expected field name but got [START_OBJECT]"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "author",
"node": "sCk3su4YSnqhvdTGjOztlw",
"reason": {
"type": "parse_exception",
"reason": "failed to parse search source. expected field name but got [START_OBJECT]"
}
}
]
},
"status": 400
}
This makes me feel like I've got my "nested" object all wrong, but the docs suggest that I'm right!
You have it almost right: the nested query simply has to be placed inside the bool query, as in the query below. Also, the match query needs to be made on the name field, since this is where the author name is stored:
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "Louis"
}
},
{
"nested": {
"path": "books.data",
"query": {
"bool": {
"must": {
"terms": {
"books.data.id": [
1,
4
]
}
}
}
}
}
}
]
}
},
"from": 0,
"size": 8
}
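The underlying rule is that a query object holds exactly one query type as its key, so bool and nested cannot sit side by side under "query". A minimal sketch of a client-side check follows; check_query and all_of are hypothetical helper names, not part of any Elasticsearch client.

```python
def check_query(query):
    """Raise if a query object does not have exactly one top-level key."""
    if not isinstance(query, dict) or len(query) != 1:
        raise ValueError(f"expected exactly one query type, got {list(query)}")
    return query


def all_of(*clauses):
    """Combine several query clauses under a single bool/must."""
    return {"bool": {"must": [check_query(c) for c in clauses]}}


name_clause = {"match": {"name": "Louis"}}
books_clause = {
    "nested": {
        "path": "books.data",
        "query": {"terms": {"books.data.id": [1, 4]}},
    }
}

# Shape used in the question: two query types as siblings, which is rejected
bad = {"bool": {"must": name_clause}, **books_clause}
try:
    check_query(bad)
except ValueError as err:
    print("rejected:", err)

# Shape used in the answer: the nested clause lives inside bool.must
body = {"query": all_of(name_clause, books_clause), "from": 0, "size": 8}
```

Here the inner bool around the terms clause is dropped, since a nested query can take the terms query directly; that simplification is a choice made for this sketch.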
