ElasticSearch: Complex filter by nested document

I have the following document structure:
{
product_name: "Product1",
product_id: 1,
...,
articles: [
{
article_name: 'Article 101',
id: 101,
some_param: 10,
clients: []
},
{
article_name: 'Article 102',
id: 102,
some_param: 11,
clients: [
{
client_id: 10001,
client_name: "some client 1001"
}
...
]
}
]
},
{
product_name: "Product2",
product_id: 2,
...,
articles: [
{
article_name: 'Article 101',
id: 101,
some_param: 10,
clients: []
},
{
article_name: 'Article 102',
id: 102,
some_param: 10,
clients: [
{
client_id: 10001,
client_name: "some client 1001"
}
...
]
}
]
}
I need to get documents (products) ONLY if at least one of their articles matches two conditions (a single article must match both): articles.some_param = 10 AND articles.clients.client_id = 10001
So I need to get only the product with id 2.
I'm currently using this query, which is incorrect (and I know why), because it fetches both documents:
{
"query": {
"bool": {
"filter": [
{
"term": {
"articles.clients.client_id": 10001
}
},
{
"term": {
"articles.some_param": 10
}
}
]
}
}
}
How can I write a query that returns only products that have at least one article matching both conditions: articles.some_param = 10 AND articles.clients.client_id = 10001
i.e., so that only the product with ID 2 is returned?

Something like this:
{
"query": {
"nested": {
"path": "articles",
"query": {
"bool": {
"must": [
{
"term": {
"articles.some_param": {
"value": 10
}
}
},
{
"nested": {
"path": "articles.clients",
"query": {
"term": {
"articles.clients.client_id":{
"value": 10001
}
}
}
}
}
]
}
}
}
}
}
UPDATE:
Try wrapping the second query in a bool:
{
"query": {
"nested": {
"path": "articles",
"query": {
"bool": {
"must": [
{
"term": {
"articles.some_param": {
"value": 10
}
}
},
{
"bool":{
"must" : [
{
"nested": {
"path": "articles.clients",
"query": {
"term": {
"articles.clients.client_id":{
"value": 10001
}
}
}
}
}
]
}
}
]
}
}
}
}
}
p.s. I could be mistaken about the path in the second nested query; I couldn't check it, so you may need to experiment with that path.
p.p.s. Note that a filter does not calculate scores, so it is not what you want if you need relevance scoring.
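To see why the nested query is needed, here is a minimal Python sketch of the two matching behaviours, assuming the two sample products from the question (trimmed to the relevant fields). The function names are made up for illustration; this models the semantics only and is not Elasticsearch code.

```python
# Two sample products shaped like the documents in the question.
products = [
    {
        "product_id": 1,
        "articles": [
            {"id": 101, "some_param": 10, "clients": []},
            {"id": 102, "some_param": 11, "clients": [{"client_id": 10001}]},
        ],
    },
    {
        "product_id": 2,
        "articles": [
            {"id": 101, "some_param": 10, "clients": []},
            {"id": 102, "some_param": 10, "clients": [{"client_id": 10001}]},
        ],
    },
]


def matches_flattened(product, some_param, client_id):
    """Each condition may be satisfied by a different article (no nested mapping)."""
    has_param = any(a["some_param"] == some_param for a in product["articles"])
    has_client = any(
        c["client_id"] == client_id
        for a in product["articles"]
        for c in a["clients"]
    )
    return has_param and has_client


def matches_nested(product, some_param, client_id):
    """Both conditions must hold for the same article (nested query semantics)."""
    return any(
        a["some_param"] == some_param
        and any(c["client_id"] == client_id for c in a["clients"])
        for a in product["articles"]
    )


flattened_ids = [p["product_id"] for p in products if matches_flattened(p, 10, 10001)]
nested_ids = [p["product_id"] for p in products if matches_nested(p, 10, 10001)]
print(flattened_ids)  # [1, 2]
print(nested_ids)     # [2]
```

The flattened check reproduces the problem from the question (both products match), while the per-article check returns only product 2.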

Related

Elastic search multiple AND & OR operator in a query

I have the following mapping applied to my index:
PUT /testing
PUT /testing/_mapping?pretty
{
"properties": {
"empID": {
"type":"long"
},
"state":{
"type":"text"
},
"Balance":{
"type":"long"
},
"loanid":{
"type":"long"
},
"rating":{
"type":"text"
},
"category":{
"type":"text"
}
}
}
Sample documents added to the index:
POST testing/_doc?pretty
{
"empID":1,
"state":"NY",
"Balance":55,
"loanid":89,
"rating":"A",
"category":"PRO"
}
POST /testing/_doc?pretty
{
"empID":1,
"state":"TX",
"Balance":56,
"loanid":65,
"rating":"B",
"category":"TRIAL"
}
POST /testing/_doc?pretty
{
"empID":2,
"state":"TX",
"Balance":34,
"loanid":76,
"rating":"C",
"category":"PAID"
}
POST /testing/_doc?pretty
{
"empID":3,
"state":"TX",
"Balance":72,
"loanid":23,
"rating":"D",
"category":"FREE"
}
POST /testing/_doc?pretty
{
"dealID":3,
"state":"NY",
"Balance":23,
"loanid":67,
"rating":"E",
"category":"FREE"
}
POST /testing/_doc?pretty
{
"empID":2,
"state":"NY",
"Balance":23,
"loanid":98,
"rating":"F",
"category":"PRE"
}
POST /testing/_doc?pretty
{
"empID":2,
"state":"TX",
"Balance":19,
"loanid":100,
"rating":"D",
"category":"PAID"
}
I am trying to create an ES query that is equivalent to a SQL query like:
select * from table_name
where empID =1 or state = 'NY'
and balance >=20 or loanid in (23, 67, 89) or rating = 'D'
and category!='FREE' or empID = 2 ;
vs. the ES query:
GET testing/_search?pretty
{
"query": {
"bool": {
"should": [
{
"match": {
"state": {
"query": "NY"
}
}
},
{
"term": {
"empID": 1
}
},
{
"bool": {
"must": [
{
"range": {
"Balance": {
"gte": 20
}
}
},
{
"bool": {
"should": [
{
"terms": {
"loanid": [
23,
67,
89
]
}
},
{
"match": {
"rating": {
"query": "D"
}
}
},
{
"bool": {
"must_not": [
{
"match": {
"category": {
"query": "FREE"
}
}
}
]
}
}
]
}
}
]
}
}
]
}
}
}
I am only getting 6 documents back, whereas the SQL query returns 7. Could you confirm whether this is how multiple AND and OR conditions work in ES, and help me resolve the issue?
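One way to locate the difference is to evaluate both conditions by hand on the sample data. The Python sketch below does that, assuming standard SQL precedence (AND binds tighter than OR) and treating the text matches as simple equality. The helper names `sql_where` and `es_bool` are made up for illustration.

```python
# The seven sample documents from the question (one has dealID instead of empID).
docs = [
    {"empID": 1, "state": "NY", "Balance": 55, "loanid": 89, "rating": "A", "category": "PRO"},
    {"empID": 1, "state": "TX", "Balance": 56, "loanid": 65, "rating": "B", "category": "TRIAL"},
    {"empID": 2, "state": "TX", "Balance": 34, "loanid": 76, "rating": "C", "category": "PAID"},
    {"empID": 3, "state": "TX", "Balance": 72, "loanid": 23, "rating": "D", "category": "FREE"},
    {"dealID": 3, "state": "NY", "Balance": 23, "loanid": 67, "rating": "E", "category": "FREE"},
    {"empID": 2, "state": "NY", "Balance": 23, "loanid": 98, "rating": "F", "category": "PRE"},
    {"empID": 2, "state": "TX", "Balance": 19, "loanid": 100, "rating": "D", "category": "PAID"},
]


def sql_where(d):
    """The WHERE clause with AND grouped before OR, as SQL precedence dictates."""
    return (
        d.get("empID") == 1
        or (d["state"] == "NY" and d["Balance"] >= 20)
        or d["loanid"] in (23, 67, 89)
        or (d["rating"] == "D" and d["category"] != "FREE")
        or d.get("empID") == 2
    )


def es_bool(d):
    """The grouping actually expressed by the bool query in the question."""
    return (
        d["state"] == "NY"
        or d.get("empID") == 1
        or (
            d["Balance"] >= 20
            and (
                d["loanid"] in (23, 67, 89)
                or d["rating"] == "D"
                or d["category"] != "FREE"
            )
        )
    )


sql_hits = [d["loanid"] for d in docs if sql_where(d)]
es_hits = [d["loanid"] for d in docs if es_bool(d)]
missing = [loanid for loanid in sql_hits if loanid not in es_hits]
print(len(sql_hits), len(es_hits), missing)  # 7 6 [100]
```

Under these assumptions the document with loanid 100 is the one that goes missing: it matches the SQL only through `empID = 2`, a clause the bool query does not contain, and the bool query also groups the remaining conditions differently from the SQL.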

Query to get random n items from top 100 items in Elastic Search

I need to write an Elasticsearch query that returns 12 random items from the top 100 sorted items.
I tried something like this, but I cannot get 12 random items (I only get the top 12 items).
The query I used:
GET product/_search
{
"sort": [
{
"DateAdded": {
"order": "desc"
}
}
],
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"definitionName": {
"value": "ABC"
}
}
},
{
"range": {
"price": {
"gt": 0
}
}
}
]
}
},
"functions": [
{
"random_score": {
"seed": 314159265359
}
}
]
}
},
"size": 12
}
Can anybody point out where I am going wrong? (I am a beginner at writing Elasticsearch queries.)
Thanks in advance.
EDIT: this does not work; window_size only recalculates the score on the top X results.
Also:
you need to set "track_scores" to true at the top level.
The correct syntax is:
"rescore": {
"window_size": 10,
"query": {
"score_mode": "max", // whatever
"rescore_query": {
"bool": {
"should": [
{
//your query here - you can use a function or a script score too
}
]
}
},
"query_weight": 0.7,
"rescore_query_weight": 1.2
}
}
OK, I understand better now.
Indeed, you have to sort by date (top 100) and rescore with a random function (see https://www.elastic.co/guide/en/elasticsearch/reference/7.x/search-request-body.html#request-body-search-post-filter).
It should be something like:
{
"sort": [
{
"DateAdded": {
"order": "desc"
}
}
],
"query": {
"bool": {
"must": [
{
"term": {
"definitionName": {
"value": "ABC"
}
}
},
{
"range": {
"price": {
"gt": 0
}
}
}
]
}
},
"size": 100,
"rescore": {
"window_size": 12,
"query": {
"rescore_query": {
"random_score": {
"seed": 314159265359
}
}
}
}
}
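Since the edit above reports that the rescore approach did not work, a different and simpler technique is to fetch the top 100 sorted hits and pick 12 at random on the client. The Python sketch below shows only the sampling step; the `hits` list is a stand-in for the result of a search sorted by DateAdded with "size": 100, and `pick_random_from_top` is a hypothetical helper, not part of any Elasticsearch client.

```python
import random


def pick_random_from_top(hits, top_n=100, k=12, seed=None):
    """Return k distinct random items drawn from the first top_n already-sorted hits."""
    window = hits[:top_n]
    rng = random.Random(seed)  # pass a seed for repeatable picks, None for fresh ones
    return rng.sample(window, min(k, len(window)))


# Stand-in data: 250 hits already sorted by DateAdded descending.
hits = [{"_id": str(i)} for i in range(250)]

picked = pick_random_from_top(hits, top_n=100, k=12, seed=42)
print(len(picked))  # 12
```

This trades a slightly larger response (100 hits instead of 12) for not depending on scoring at all, which may be acceptable when the documents are small.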

Combination of querystring and terms in elasticsearch

I have this query:
query: {
query: {
query_string: {
query: "Perspolis OR Branco",
default_field: "body"
}
},
from: 1,
size: 1
}
How can I combine this query with an exact check that the field processed is true and that age is between 10 and 20?
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "Perspolis OR Branco",
"default_field": "body"
}
},
{
"term": {
"processed": {
"value": true
}
}
},
{
"range": {
"age": {
"gte": 10,
"lte": 20
}
}
}
]
}
},
"from": 1,
"size": 1
}

ElasticSearch query filter on nested documents

I need to filter my index based on a nested property:
myNestedProperty: [
{ id: 1, displayName: toto },
{ id: 2, displayName: tata },
{ id: 3, displayName: titi }
]
myNestedProperty: [
{ id: 4, displayName: dodo },
{ id: 5, displayName: dada },
{ id: 6, displayName: didi }
]
I would like to count how many documents have a Toto and how many do not. I tried the following query:
"aggs": {
"HasToto": {
"filter": {
"nested": {
"path": "myNestedProperty",
"query": {
"match": {
"myNestedProperty.id": "1"
}
}
}
}
},
"NoToto": {
"filter": {
"nested": {
"path": "myNestedProperty",
"query": {
"bool": {
"must_not": [
{"match": {
"myNestedProperty.id": "1"
}}
]
}
}
}
}
}
}
The "HasToto" filter seems to return the expected result, but the "NoToto" filter does not (it returns too much data).
Rules:
"Toto" can appear only once in myNestedProperty. If I have "Toto", I can't have "Dodo" or any other one.
It is a hierarchical object:
-- Toto
---- Tata
------- Titi
I have simplified the data because of its complexity; I hope this simple object is clear enough.
How can I achieve this? Thanks in advance.
I found the solution \o/
"aggs": {
"HasToto": {
"filter": {
"bool": {
"must": {
"nested": {
"path": "myNestedProperty",
"query": {
"match": {
"myNestedProperty.id": "1"
}
}
}
}
}}
},
"NoToto": {
"filter": {
"bool": {
"must_not": [
{
"nested": {
"path": "myNestedProperty",
"query": {
"match": {
"myNestedProperty.id": "1"
}
}
}
}
]
}
}
}
}
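The reason the fix works can be sketched in Python, assuming the two sample documents from the question reduced to their ids (the "name" labels are hypothetical). A must_not placed inside the nested query asks whether any nested object is not Toto, while a must_not wrapped around the nested query asks whether no nested object is Toto.

```python
# The two sample documents, keeping only the nested ids.
docs = [
    {"name": "A", "myNestedProperty": [{"id": 1}, {"id": 2}, {"id": 3}]},
    {"name": "B", "myNestedProperty": [{"id": 4}, {"id": 5}, {"id": 6}]},
]


def has_toto(doc):
    """nested + match id 1: some nested object has id 1."""
    return any(n["id"] == 1 for n in doc["myNestedProperty"])


def must_not_inside_nested(doc):
    """nested { must_not: id 1 }: some nested object has an id other than 1."""
    return any(n["id"] != 1 for n in doc["myNestedProperty"])


def must_not_around_nested(doc):
    """must_not { nested: id 1 }: no nested object has id 1."""
    return not has_toto(doc)


inside = [d["name"] for d in docs if must_not_inside_nested(d)]
around = [d["name"] for d in docs if must_not_around_nested(d)]
print(inside)  # ['A', 'B']
print(around)  # ['B']
```

Document A matches the first form through its Tata and Titi entries, which is why the original "NoToto" bucket returned too much.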

Match multiple properties on the same nested document in ElasticSearch

I'm trying to accomplish what boils down to a boolean AND on nested documents in ElasticSearch. Let's say I have the following two documents.
{
"id": 1,
"secondLevels": [
{
"thirdLevels": [
{
"isActive": true,
"user": "anotheruser#domain.com"
}
]
},
{
"thirdLevels": [
{
"isActive": false,
"user": "user#domain.com"
}
]
}
]
}
{
"id": 2,
"secondLevels": [
{
"thirdLevels": [
{
"isActive": true,
"user": "user#domain.com"
}
]
}
]
}
I want to match only documents (here, ID: 2) that have a nested document with both isActive: true AND user: user#domain.com.
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "secondLevels.thirdLevels",
"query": {
"bool": {
"must": [
{
"term": {
"secondLevels.thirdLevels.isActive": true
}
},
{
"term": {
"secondLevels.thirdLevels.user": "user#domain.com"
}
}
]
}
}
}
}
]
}
}
}
However, what seems to be happening is that my query returns both documents, because the first document has one thirdLevel with isActive: true and another thirdLevel with the matching user.
Is there any way to enforce this strictly at query/filter time, or do I have to do this in a script?
With nested objects and a nested query, you are most of the way there.
All you have to do now is add the inner_hits option and use source filtering to keep entire secondLevels documents out of the response:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "secondLevels.thirdLevels",
"query": {
"bool": {
"must": [
{
"term": {
"secondLevels.thirdLevels.isActive": true
}
},
{
"term": {
"secondLevels.thirdLevels.user": "user#domain.com"
}
}
]
}
},
"inner_hits": {
"size": 100
}
}
}
]
}
}
}
