Elasticsearch - combine must and must_not

I have documents that hold product data. The mapping is as follows:
"mappings" : {
"properties" : {
"view_score" : {
"positive_score_impact" : true,
"type" : "rank_feature"
},
"recipients" : {
"dynamic" : false,
"type" : "nested",
"enabled" : true,
"properties" : {
"type" : {
"similarity" : "boolean",
"type" : "keyword"
},
"title" : {
"type" : "text",
"fields" : {
"key" : {
"type" : "keyword"
}
}
}
}
}
}
}
And I have 2 documents with the following data:
{
"view_score": 10,
"recipients": [{"type":"gender", "title":"male"}, {"type":"gender", "title":"female"}]
}
{
"view_score": 10,
"recipients": [{"type":"gender", "title":"female"}]
}
When a user searches for a product, she can say "I prefer products for females". In that case, products that specify only female as their gender should rank above products that specify both male and female.
I have the following query which gives more score to products with just female gender:
GET _search
{
"sort": [
"_score"
],
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"nested": {
"path": "recipients",
"ignore_unmapped": true,
"query": {
"bool": {
"boost": 10,
"must": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "female"
}
}
],
"must_not": {
"bool": {
"filter": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "male"
}
}
]
}
}
}
}
}
}
]
}
},
"script": {
"source": "return _score;"
}
}
}
}
But if I add another clause to the should array, the query no longer behaves the same way: it gives the same score to products with one or two genders in their specifications.
Here is my final query, which does not work as expected:
GET _search
{
"sort": [
"_score"
],
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"rank_feature": {
"field": "view_score",
"linear": {}
}
},
{
"nested": {
"path": "recipients",
"ignore_unmapped": true,
"query": {
"bool": {
"boost": 10,
"must": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "female"
}
}
],
"must_not": {
"bool": {
"filter": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "male"
}
}
]
}
}
}
}
}
}
]
}
},
"script": {
"source": "return _score;"
}
}
}
}
So my question is: how do I combine these should clauses so that products specifying only one gender get more weight?
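One possible reading of the problem, offered as a sketch rather than a confirmed fix: a nested query is evaluated against each nested object on its own, so a must_not placed inside it can only reject an individual recipients object, never the whole product. The small Python model below uses plain dicts and no Elasticsearch client, and its function names are made up for illustration. It contrasts that per-object check with a must_not applied at the parent level.

```python
# Simplified model of nested-query matching; this is not Elasticsearch itself.
# All function names here are hypothetical, chosen for illustration.

def is_gender(obj, title):
    """True if one nested `recipients` object is a gender entry with this title."""
    return obj["type"] == "gender" and obj["title"] == title

def per_object_match(doc):
    """must / must_not inside the nested query: tested on one object at a time."""
    return any(
        is_gender(obj, "female") and not is_gender(obj, "male")
        for obj in doc["recipients"]
    )

def parent_level_match(doc):
    """must_not wrapped around its own nested query: tested on the whole document."""
    has_female = any(is_gender(obj, "female") for obj in doc["recipients"])
    has_male = any(is_gender(obj, "male") for obj in doc["recipients"])
    return has_female and not has_male

both = {"view_score": 10, "recipients": [
    {"type": "gender", "title": "male"},
    {"type": "gender", "title": "female"},
]}
female_only = {"view_score": 10, "recipients": [
    {"type": "gender", "title": "female"},
]}

for name, doc in [("both", both), ("female_only", female_only)]:
    print(name, per_object_match(doc), parent_level_match(doc))
```

In query terms, the second variant would correspond to an outer bool (carrying the boost) whose must holds a nested query for female and whose must_not holds a separate nested query for male. Whether that ranks the two products as desired next to the rank_feature clause would still need to be checked against a real index.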

Related

Elasticsearch - query multiple levels of a nested object with inner_hits

I have a huge nested object with many levels.
I want to write a query that returns only a leaf (or some object in the middle of the tree),
and the query needs to filter on multiple levels of the tree.
For example:
My DB stores the whole company structure:
company -> wards -> employees -> working hours
I want a query that returns only the working hours of the employees in ward 2 that started later than 3 pm this month.
I tried to use inner_hits, but to no avail.
As requested, a sample document and the expected result:
company: [{
properties: {companyId: 112},
ward: [{
properties: {wardId: 223},
employee: [{
properties: {employeeId: 334},
workingHours: [
{ date: "1.1.2021", numOfHours: 4},
{ date: "1.2.2021", numOfHours: 7}
]
}]
}]
}]
The query:
I need to return the working hours for date "1.1.2021" of employee 334 in ward 223, and only the working hours, not the whole tree.
Expected result:
4 or { date: "1.1.2021", numOfHours: 4}, whichever is simpler.
I hope it's clear now.
You need to add inner_hits to all of the nested queries.
You can either parse the entire result to get the matched working hours (from the inner hits), or use response filtering to remove the additional data.
Mapping
PUT index123
{
"mappings": {
"properties": {
"company": {
"type": "nested",
"properties": {
"ward": {
"type": "nested",
"properties": {
"employee": {
"type": "nested",
"properties": {
"workingHours": {
"type": "nested",
"properties": {
"date": {
"type": "date"
}
}
}
}
}
}
}
}
}
}
}
}
Data
{
"_index" : "index123",
"_type" : "_doc",
"_id" : "9gGYI3oBt-MOenya6BcN",
"_score" : 1.0,
"_source" : {
"company" : [
{
"companyId" : 112,
"ward" : [
{
"wardId" : 223,
"employee" : {
"employeeId" : 334,
"workingHours" : [
{
"date" : "2021-01-01",
"numOfHours" : 4
},
{
"date" : "2021-01-02",
"numOfHours" : 7
}
]
}
}
]
}
]
}
}
Query
GET index123/_search?filter_path=hits.hits.inner_hits.ward.hits.hits.inner_hits.employee.hits.hits.inner_hits.workingHours.hits.hits._source
{
"query": {
"nested": {
"inner_hits": {
"name":"ward"
},
"path": "company.ward",
"query": {
"bool": {
"must": [
{
"term": {
"company.ward.wardId": {
"value": 223
}
}
},
{
"nested": {
"inner_hits": {
"name":"employee"
},
"path": "company.ward.employee",
"query": {
"bool": {
"must": [
{
"term": {
"company.ward.employee.employeeId": {
"value":334
}
}
},
{
"nested": {
"inner_hits": {
"name":"workingHours"
},
"path": "company.ward.employee.workingHours",
"query": {
"range": {
"company.ward.employee.workingHours.date": {
"gte": "2021-01-01",
"lte": "2021-01-01"
}
}
}
}
}
]
}
}
}
}
]
}
}
}
}
}
Result
{
"hits" : {
"hits" : [
{
"inner_hits" : {
"ward" : {
"hits" : {
"hits" : [
{
"inner_hits" : {
"employee" : {
"hits" : {
"hits" : [
{
"inner_hits" : {
"workingHours" : {
"hits" : {
"hits" : [
{
"_source" : {
"date" : "2021-01-01",
"numOfHours" : 4
}
}
]
}
}
}
}
]
}
}
}
}
]
}
}
}
}
]
}
}
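For the "parse the result" route, a minimal Python sketch of walking a response shaped like the one above. It assumes the response is already a parsed dict. The helpers `collect_inner_sources` and `wrap` are hypothetical names, not part of any client library, and `wrap` exists only to rebuild the sample response compactly.

```python
# Sketch: pull the leaf _source objects out of nested inner_hits.
# `collect_inner_sources` and `wrap` are hypothetical helpers for illustration.

def collect_inner_sources(hit, names):
    """Follow inner_hits by name, level by level, and return the leaf _source dicts."""
    if not names:
        return [hit["_source"]] if "_source" in hit else []
    found = []
    inner = hit.get("inner_hits", {}).get(names[0], {})
    for child in inner.get("hits", {}).get("hits", []):
        found.extend(collect_inner_sources(child, names[1:]))
    return found

def wrap(name, children):
    """Build one inner_hits level in the same shape as the response above."""
    return {"inner_hits": {name: {"hits": {"hits": children}}}}

# The sample response from above, rebuilt level by level.
leaf = {"_source": {"date": "2021-01-01", "numOfHours": 4}}
response = {"hits": {"hits": [
    wrap("ward", [wrap("employee", [wrap("workingHours", [leaf])])])
]}}

hours = []
for top_hit in response["hits"]["hits"]:
    hours.extend(collect_inner_sources(top_hit, ["ward", "employee", "workingHours"]))
print(hours)
```

The list of names must match the "name" values given to inner_hits in the query, in nesting order. For the query that also filters on the company ID, the list would start with "company".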
Update:
Query with company ID
GET index123/_search?filter_path=hits.hits.inner_hits.company.hits.hits.inner_hits.ward.hits.hits.inner_hits.employee.hits.hits.inner_hits.workingHours.hits.hits._source
{
"query": {
"nested": {
"path": "company",
"inner_hits": {
"name": "company"
},
"query": {
"bool": {
"must": [
{
"term": {
"company.companyId": {
"value": 112
}
}
},
{
"nested": {
"inner_hits": {
"name": "ward"
},
"path": "company.ward",
"query": {
"bool": {
"must": [
{
"term": {
"company.ward.wardId": {
"value": 223
}
}
},
{
"nested": {
"inner_hits": {
"name": "employee"
},
"path": "company.ward.employee",
"query": {
"bool": {
"must": [
{
"term": {
"company.ward.employee.employeeId": {
"value": 334
}
}
},
{
"nested": {
"inner_hits": {
"name": "workingHours"
},
"path": "company.ward.employee.workingHours",
"query": {
"range": {
"company.ward.employee.workingHours.date": {
"gte": "2021-01-01",
"lte": "2021-01-01"
}
}
}
}
}
]
}
}
}
}
]
}
}
}
}
]
}
}
}
}
}

ElasticSearch filter by nested boolean type fields

I need to query on multiple nested fields on boolean types.
Structure of mapping:
"mappings" : {
"properties" : {
"leaders" : {
"type" : "nested",
"properties" : {
"except_1" : {
"type" : "boolean"
},
"except_2" : {
"type" : "boolean"
},
"counter" : {
"type" : "integer"
}
}
}
}
}
I am trying to match only documents where both except_1 and except_2 are false.
Below is my attempt; unfortunately it returns both true and false values for both fields, and I cannot fix it.
"query": {
"nested": {
"path": "leaders",
"query": {
"bool": {
"must": [
{
"term": {
"leaders.except_1": False
}
},
{
"term": {
"leaders.except_2": False
}
}
]
}
}
}
}
What you're probably looking for is the inner_hits option -- showing only the matched nested subdocuments.
PUT leaders
{"mappings":{"properties":{"leaders":{"type":"nested","properties":{"except_1":{"type":"boolean"},"except_2":{"type":"boolean"},"counter":{"type":"integer"}}}}}}
POST leaders/_doc
{
"leaders": [
{
"except_1": true,
"except_2": false
},
{
"except_1": false,
"except_2": false
}
]
}
then
GET leaders/_search
{
"query": {
"nested": {
"path": "leaders",
"inner_hits": {},
"query": {
"bool": {
"must": [
{
"term": {
"leaders.except_1": false
}
},
{
"term": {
"leaders.except_2": false
}
}
]
}
}
}
}
}
yielding
{
"hits":[
{
"_index":"leaders",
"_type":"_doc",
"_id":"u-he8HEBG_KW3EFn-gMz",
"_score":0.87546873,
"_source":{ <-- default behavior
"leaders":[
{
"except_1":true,
"except_2":false
},
{
"except_1":false,
"except_2":false
}
]
},
"inner_hits":{
"leaders":{
"hits":{
"total":{
"value":1,
"relation":"eq"
},
"max_score":0.87546873,
"hits":[ <------- only the matching nested subdocument
{
"_index":"leaders",
"_type":"_doc",
"_id":"u-he8HEBG_KW3EFn-gMz",
"_nested":{
"field":"leaders",
"offset":1
},
"_score":0.87546873,
"_source":{
"except_1":false,
"except_2":false
}
}
]
}
}
}
}
]
}
Furthermore, you can make the response carry only the inner_hits by setting "_source": "inner_hits" at the top level of your search request; since no top-level field has that name, the parent _source comes back empty.
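On the client side, the matched subdocuments can then be read from inner_hits together with their position in the original array. The following is a small sketch that assumes the hit is already a parsed dict shaped like the response above; `matched_nested` is a hypothetical helper name.

```python
# Sketch: list (offset, _source) pairs for the nested objects that matched.
# `matched_nested` is a hypothetical helper, not a client-library function.

def matched_nested(hit, path):
    """Return (offset, source) for every inner hit recorded under `path`."""
    inner = hit.get("inner_hits", {}).get(path, {}).get("hits", {}).get("hits", [])
    return [(h["_nested"]["offset"], h["_source"]) for h in inner]

# One hit, trimmed to the fields used here.
hit = {
    "_source": {"leaders": [
        {"except_1": True, "except_2": False},
        {"except_1": False, "except_2": False},
    ]},
    "inner_hits": {"leaders": {"hits": {"hits": [
        {"_nested": {"field": "leaders", "offset": 1},
         "_source": {"except_1": False, "except_2": False}},
    ]}}},
}

for offset, source in matched_nested(hit, "leaders"):
    print(offset, source)
```

The offset is the index of the matching object inside the parent's leaders array, which makes it possible to relate an inner hit back to the full _source when that is returned as well.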

Elasticsearch - Search and filter by price

I have an array of promocodes (it comes from the request):
$promocodes = ['K1H5E1F1', 'M4C8A5K6', 'A3B9A45KL'];
And I have product data in Elasticsearch (as an example, here is some sample product data):
// First product (2 promocodes matched, take the lower price 265.5 and filter this product at this price)
"price": 199,
"promocodes" : [
{
"code" : "K1H5E1F1",
"price" : 265.5
},
{
"code" : "LKDS3534K",
"price" : 357
},
{
"code" : "A3B9A45KL",
"price" : 327.5
}
]
// Second product (1 promocode matched, take a price 700 and filter this product at this price)
"price": 800,
"promocodes" : [
{
"code" : "AJ543HJB",
"price" : 500
},
{
"code" : "M4C8A5K6",
"price" : 700
}
]
// Third product (0 promocode matched, take a base price 900 and filter this product at this price)
"price": 900,
"promocodes" : [
{
"code" : "AJ87HJ90",
"price" : 750
}
]
I need to filter products by price, taking promocodes into account. When a price range is set and promocodes are supplied, the filter should work as follows. If a product has one of the supplied promocodes, use the price attached to that promocode instead of the base price. If two or more supplied promocodes match the same product, use the lowest of their prices. In my example, the first product matches two promocodes, so the lower of the two promocode prices is the one to filter on.
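To pin the rule down, here is a plain-Python sketch of the intended behaviour, run against the three sample products. It is a reference model only, not an Elasticsearch query, and the function names `effective_price` and `in_range` are made up for illustration.

```python
# Reference model of the pricing rule described above (not an Elasticsearch query).
# `effective_price` and `in_range` are hypothetical names for illustration.

def effective_price(product, codes):
    """Lowest price among the matching promocodes, or the base price if none match."""
    matching = [p["price"] for p in product["promocodes"] if p["code"] in codes]
    return min(matching) if matching else product["price"]

def in_range(product, codes, gte, lte):
    """True if the effective price falls inside the inclusive range."""
    return gte <= effective_price(product, codes) <= lte

requested = {"K1H5E1F1", "M4C8A5K6", "A3B9A45KL"}

products = [
    {"price": 199, "promocodes": [
        {"code": "K1H5E1F1", "price": 265.5},
        {"code": "LKDS3534K", "price": 357},
        {"code": "A3B9A45KL", "price": 327.5},
    ]},
    {"price": 800, "promocodes": [
        {"code": "AJ543HJB", "price": 500},
        {"code": "M4C8A5K6", "price": 700},
    ]},
    {"price": 900, "promocodes": [
        {"code": "AJ87HJ90", "price": 750},
    ]},
]

prices = [effective_price(p, requested) for p in products]
kept = [p for p in products if in_range(p, requested, 100, 350)]
print(prices, len(kept))
```

With the range 100 to 350 only the first product should be kept, even though its base price of 199 would also pass on its own; what matters is that the comparison uses 265.5.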
This request does not filter prices as I need:
GET dev_products/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"price": {
"gte": 100,
"lte": 350
}
}
},
{
"nested": {
"path": "promocodes",
"query": {
"terms": {
"promocodes.code": [
"K1H5E1F1",
"M4C8A5K6",
"A3B9A45KL"
]
}
}
}
}
]
}
}
}
I don't know how to write this request correctly, so I am asking for help.
You need to use inner hits.
{
"query": {
"bool": {
"must": [
{
"range": {
"price": {
"gte": 100,
"lte": 350
}
}
},
{
"nested": {
"path": "promocodes",
"query": {
"terms": {
"promocodes.code": [
"K1H5E1F1",
"A3B9A45KL"
]
}
},
"inner_hits": {
"sort": {"promocodes.price": "asc"},----> sort nested document by price
"size": 1 ---> return top 1 document
}
}
}
]
}
}
}
Result:
"hits" : [
{
"_index" : "index4",
"_type" : "_doc",
"_id" : "NTBFgm0BFLPFo7KPt70j",
"_score" : 2.0,
"_source" : {
"price" : 199,
"promocodes" : [
{
"code" : "K1H5E1F1",
"price" : 265.5
},
{
"code" : "LKDS3534K",
"price" : 357
},
{
"code" : "A3B9A45KL",
"price" : 327.5
}
]
},
"inner_hits" : { -----> inner hits contains nested data
"promocodes" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ -----> returns one matched field
{
"_index" : "index4",
"_type" : "_doc",
"_id" : "NTBFgm0BFLPFo7KPt70j",
"_nested" : {
"field" : "promocodes",
"offset" : 0
},
"_score" : null,
"_source" : {
"code" : "K1H5E1F1",
"price" : 265.5
},
"sort" : [
265.5
]
}
]
}
}
}
}
]
EDIT:
The logic below checks whether a promocode matches; if so, it returns the document with the promocode price in inner_hits. If no promocode matches and the parent price is in range (the gte and lte values), it returns that document.
GET dev_products/_search
{
"_source": "price",
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"range": {
"price": {
"gte": 100,
"lte": 350
}
}
}
],
"must_not": [
{
"nested": {
"path": "promocodes",
"query": {
"bool": {
"must": [
{
"terms": {
"promocodes.code.keyword": [
"K1H5E1F1",
"A3B9A45KL"
]
}
}
]
}
},
"inner_hits": {
"sort": {
"promocodes.price": "asc"
},
"size": 1
}
}
}
]
}
},
{
"nested": {
"path": "promocodes",
"query": {
"bool": {
"must": [
{
"terms": {
"promocodes.code.keyword": [
"K1H5E1F1",
"A3B9A45KL"
]
}
},
{
"range": {
"promocodes.price": {
"gte": 100,
"lte": 350
}
}
}
]
}
},
"inner_hits": {
"sort": {
"promocodes.price": "asc"
},
"size": 1
}
}
}
]
}
}
}
EDIT-2
Query
GET dev_products/_search
{
"_source": "price",
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"range": {
"price": {
"gte": 100,
"lte": 350
}
}
}
],
"must_not": [
{
"nested": {
"path": "promocodes",
"query": {
"bool": {
"must": [
{
"terms": {
"promocodes.code.keyword": [
"K1H5E1F1",
"A3B9A45KL"
]
}
}
]
}
}
}
}
]
}
},
{
"bool": {
"must": [
{
"nested": {
"path": "promocodes",
"query": {
"bool": {
"must": [
{
"terms": {
"promocodes.code.keyword": [
"K1H5E1F1",
"A3B9A45KL"
]
}
},
{
"range": {
"promocodes.price": {
"lte": 350,
"gte": 100
}
}
}
]
}
},
"inner_hits": {
"sort": {
"promocodes.price": "asc"
},
"size": 1
}
}
}
],
----> don't include document if any matched promocode has value less than given range
"must_not": [
{
"nested": {
"path": "promocodes",
"query": {
"bool": {
"must": [
{
"terms": {
"promocodes.code.keyword": [
"K1H5E1F1",
"A3B9A45KL"
]
}
},
{
"range": {
"promocodes.price": {
"lt": 100
}
}
}
]
}
}
}
}
]
}
}
]
}
}
}
If the range for both price and promocodes.price is "gte": 270, "lte": 271 and the terms for promocodes.code are ["promo1", "promo2", "promo4"], the request does not work. It should not select this product, because the price for the lowest matching promocode is 265.5, which falls outside the range. Yet it still selects the product, and inner_hits does not contain the expected promocode (for some reason it picks "promo2" with price 270).
"price": 275,
"promocodes" : [
{
"code" : "promo1",
"price" : 265.5
},
{
"code" : "promo2",
"price" : 270
},
{
"code" : "promo3",
"price" : 250
}
]

Elasticsearch - OR in term conditions

I need a little help translating a MySQL query to ES. The query looks like this:
SELECT * FROM `xyz` WHERE visibility IN (1,2) AND (active=0 OR (active=1 AND finished=1))
It's easy to write AND-only conditions, but how do I mix AND with OR in term queries?
"query" : {
"bool" : {
"must" : [{
"terms" : { "visibility" : ["1", "2"] }
}, {
"term" : { "active" : "1" }
}, {
"term" : { "active" : "0", "finished" : "1" } // OR
},]
}
}
Try like this by nesting a bool/should and bool/filter query inside the main bool/filter query:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"visibility": [
"1",
"2"
]
}
},
{
"bool": {
"should": [
{
"term": {
"active": "0"
}
},
{
"bool": {
"filter": [
{
"term": {
"active": "1"
}
},
{
"term": {
"finished": "1"
}
}
]
}
}
]
}
}
]
}
}
}
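To check that this shape really mirrors the SQL condition, here is a small Python sketch that evaluates a tiny subset of the query DSL (term, terms, bool) against plain dicts and compares it with the WHERE clause for every combination of values. It is a toy model, not how Elasticsearch executes queries: values are compared as strings, and must_not is assumed to be a list. The names `matches` and `sql_where` are made up for illustration.

```python
from itertools import product

# Toy evaluator for a small subset of the bool query DSL; illustration only.

def matches(query, doc):
    """Evaluate term / terms / bool clauses against a flat dict."""
    (kind, body), = query.items()
    if kind == "term":
        (field, value), = body.items()
        return str(doc[field]) == str(value)
    if kind == "terms":
        (field, values), = body.items()
        return str(doc[field]) in {str(v) for v in values}
    if kind == "bool":
        required = body.get("filter", []) + body.get("must", [])
        optional = body.get("should", [])
        if any(matches(q, doc) for q in body.get("must_not", [])):
            return False
        if not all(matches(q, doc) for q in required):
            return False
        # With no must/filter clauses, at least one should clause has to match.
        if optional and not required:
            return any(matches(q, doc) for q in optional)
        return True
    raise ValueError(f"unsupported clause: {kind}")

# The query from the answer above, without the outer "query" wrapper.
query = {"bool": {"filter": [
    {"terms": {"visibility": ["1", "2"]}},
    {"bool": {"should": [
        {"term": {"active": "0"}},
        {"bool": {"filter": [
            {"term": {"active": "1"}},
            {"term": {"finished": "1"}},
        ]}},
    ]}},
]}}

def sql_where(doc):
    """The WHERE clause from the question, written directly in Python."""
    return doc["visibility"] in (1, 2) and (
        doc["active"] == 0 or (doc["active"] == 1 and doc["finished"] == 1)
    )

docs = [
    {"visibility": v, "active": a, "finished": f}
    for v, a, f in product((0, 1, 2, 3), (0, 1), (0, 1))
]
agree = all(matches(query, d) == sql_where(d) for d in docs)
print(agree)
```

The point to notice is the inner bool that has only a should array: with no must or filter next to it, at least one of its clauses must match, which is what makes it behave like OR.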

Selecting documents with a specific field is set to NULL in Elasticsearch

Could someone please help me add expires_at IS NULL to the ES query below? I looked at the missing filter in the "Dealing with Null Values" section, but the way I used it (shown at the bottom) keeps documents that have not expired out of the results, so I'm obviously doing something wrong.
Note: I don't want to use the or query because it is deprecated in 2.0.0-beta1.
QUERY
{
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"term": {
"order_id": "123"
}
},
{
"term": {
"is_active": 1
}
},
{
"range": {
"expires_at": {
"gt": "2016-07-01T00:00:00+0000"
}
}
}
]
}
}
}
}
}
This is what I'm aiming at:
SELECT * FROM orders
WHERE
order_id = '123' AND
is_active = '1' AND
(expires_at > '2016-07-01T00:00:00+0000' OR expires_at IS NULL)
This is what I did, but un-expired documents won't show up in this case so this is wrong.
{
"query": {
"filtered": {
"filter": {
"missing": {
"field": "expires_at"
}
},
"query": {
"bool": {
"must": [
......
......
]
}
}
}
}
}
My ES version:
{
"status" : 200,
"name" : "Fan Boy",
"version" : {
"number" : "1.3.4",
"build_hash" : "a70f3ccb52200f8f2c87e9c370c6597448eb3e45",
"build_timestamp" : "2014-09-30T09:07:17Z",
"build_snapshot" : false,
"lucene_version" : "4.9"
},
"tagline" : "You Know, for Search"
}
This should do it:
{
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"term": {
"order_id": "123"
}
},
{
"term": {
"is_active": 1
}
},
{
"bool": {
"should": [
{
"range": {
"expires_at": {
"gt": "2016-07-01T00:00:00+0000"
}
}
},
{
"filtered": {
"filter": {
"missing": {
"field": "expires_at"
}
}
}
}
]
}
}
]
}
}
}
}
}
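The filtered query and the missing filter used here belong to the 1.x line the asker is running; both were removed in Elasticsearch 5.0. As a sketch only, and not something the original answer covers, the same condition on later versions is usually expressed with bool, must_not and exists. The Python helper below just builds the request body as a dict. `build_query` is a hypothetical name, and the date string is copied from the question on the assumption that the field's mapping accepts that format.

```python
import json

# Sketch for Elasticsearch 5.0 and later; `build_query` is a hypothetical helper.

def build_query(order_id, after):
    """order_id matches, is_active = 1, and (expires_at > after OR expires_at is absent)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"order_id": order_id}},
                    {"term": {"is_active": 1}},
                    {"bool": {"should": [
                        {"range": {"expires_at": {"gt": after}}},
                        # "IS NULL": no value indexed for the field
                        {"bool": {"must_not": {"exists": {"field": "expires_at"}}}},
                    ]}},
                ]
            }
        }
    }

body = build_query("123", "2016-07-01T00:00:00+0000")
print(json.dumps(body, indent=2))
```

Because the inner bool contains only should clauses, at least one of them has to match, which gives the OR between the range and the missing-value check.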

Resources