Search with multi filter - elasticsearch

I have a question. Shopping cart sites commonly let you search for products with multiple filters. For example, I'm searching for sports gear with filters like this:
Manufacturer
  [x] Nike
  [ ] Adidas
  [ ] Umbro
Options
  Size
    [x] S
    [x] M
    [ ] L
  Color
    [x] White
    [ ] Yellow
    [ ] Red
    [x] Blue
Here's my mapping
PUT test/product/_mapping
{
"product":{
"properties" : {
"name" : {"type" : "string", "store":"yes"},
"manufacturer" : {"type": "string"},
"options" : {
"type": "nested"
}
}
}
}
Some test data
POST test/product/1
{
"name": "Shirt 1",
"manufacturer": "Adidas",
"options":[
{
"Color" : ["Red", "Green"]
},
{
"Size" : ["S","M","L"]
}
],
"price":250000
}
POST test/product/2
{
"name": "Shirt 2",
"manufacturer": "Nike",
"options":[
{
"Color" : ["Blue", "Green", "White"]
},
{
"Size" : ["S", "L", "XL"]
}
],
"price":100000
}
POST test/product/3
{
"name": "Shirt 3",
"manufacturer": "Umbro",
"options": [
{
"Color" : ["Red"]
},
{
"Size" : ["S","XXL"]
}
],
"price": 300000
}
With this query, everything's fine
POST test/product/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "options",
"filter": {
"bool": {
"must": [
{
"terms": {
"options.Color": [
"white"
]
}
}
]
}
}
}
},
{
"term": {
"manufacturer": "nike"
}
}
]
}
}
}
}
}
But if I add another condition to the Options filter, I get no results:
POST test/product/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"nested": {
"path": "options",
"filter": {
"bool": {
"must": [
{
"terms": {
"options.Color": [
"white"
]
}
},
{
"terms": {
"options.Size": [
"s"
]
}
}
]
}
}
}
},
{
"term": {
"manufacturer": "nike"
}
}
]
}
}
}
}
}
I don't know whether the problem is in my mapping or in my query. Can you show me the best way to create the mapping for this scenario? Thank you for all your help.

The problem here is the use of the nested type. A nested filter is not evaluated over all children together, but against each child individually. Since no single nested object satisfies your filter (having both Color and Size), you get no results. You have two options:
Merge those individual nested objects together:
POST test/product/1
{
"name": "Shirt 1",
"manufacturer": "Adidas",
"options":[
{
"Color" : ["Red", "Green"],
"Size" : ["S","M","L"]
}
],
"price":250000
}
Your mapping and query stay the same.
Alternatively, do not use a nested type, but a simple object type. You have to change your mapping for options:
PUT test/product/_mapping
{
"product":{
"properties" : {
"options" : {
"type": "object"
}
}
}
}
And drop the nested filter:
POST test/product/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"terms": {
"options.Color": [
"white"
]
}
},
{
"terms": {
"options.Size": [
"s"
]
}
},
{
"term": {
"manufacturer": "nike"
}
}
]
}
}
}
}
}
But your data can stay the same.
Nested objects are really meant for differently structured data. If you had something like
"options":[
{
"Color" : "Blue",
"Size": "S"
},
{
"Color": "Red",
"Size" : "L"
}
]
and you wanted to filter for items that are both Blue and S, then you would have to use a nested filter.
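To see why the two-condition nested filter matches nothing, it can help to model the semantics outside Elasticsearch. The following is a minimal Python sketch, not Elasticsearch code: the helpers `child_satisfies`, `matches_nested` and `matches_object` are made up for illustration, and values are compared as-is, so the lowercasing done by the analyzer is ignored.

```python
# Illustrative model only: these helpers are hypothetical and are not part of
# any Elasticsearch client. Analysis (lowercasing) is ignored for simplicity.

def child_satisfies(child, conditions):
    """True if one object has at least one wanted value for every field."""
    return all(
        bool(set(child.get(field, [])) & set(wanted))
        for field, wanted in conditions.items()
    )


def matches_nested(options, conditions):
    """Nested semantics: some single child must satisfy all conditions."""
    return any(child_satisfies(child, conditions) for child in options)


def matches_object(options, conditions):
    """Object semantics: fields of all children are flattened together first."""
    flattened = {}
    for child in options:
        for field, values in child.items():
            flattened.setdefault(field, []).extend(values)
    return child_satisfies(flattened, conditions)


conditions = {"Color": ["White"], "Size": ["S"]}

# "Shirt 2" as indexed in the question: Color and Size live in separate objects.
split_options = [{"Color": ["Blue", "Green", "White"]}, {"Size": ["S", "L", "XL"]}]
# The same data with both fields merged into a single nested object.
merged_options = [{"Color": ["Blue", "Green", "White"], "Size": ["S", "L", "XL"]}]

print(matches_nested(split_options, conditions))   # False
print(matches_nested(merged_options, conditions))  # True
print(matches_object(split_options, conditions))   # True
```

The first call corresponds to the failing query, the second to merging the nested objects, and the third to switching the mapping to a plain object type.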

Related

Elastic search combine must and must_not

I have a document that holds data for a product. The mapping is as follows:
"mappings" : {
"properties" : {
"view_score" : {
"positive_score_impact" : true,
"type" : "rank_feature"
},
"recipients" : {
"dynamic" : false,
"type" : "nested",
"enabled" : true,
"properties" : {
"type" : {
"similarity" : "boolean",
"type" : "keyword"
},
"title" : {
"type" : "text",
"fields" : {
"key" : {
"type" : "keyword"
}
}
}
}
}
}
}
And I have 2 documents with the following data:
{
"view_score": 10,
"recipients": [{"type":"gender", "title":"male"}, {"type":"gender", "title":"female"}]
}
{
"view_score": 10,
"recipients": [{"type":"gender", "title":"female"}]
}
When a user searches for a product, she can say "I prefer products for females", so products that specify only female as gender should come before products that specify both male and female.
I have the following query, which gives a higher score to products with only the female gender:
GET _search
{
"sort": [
"_score"
],
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"nested": {
"path": "recipients",
"ignore_unmapped": true,
"query": {
"bool": {
"boost": 10,
"must": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "female"
}
}
],
"must_not": {
"bool": {
"filter": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "male"
}
}
]
}
}
}
}
}
}
]
}
},
"script": {
"source": "return _score;"
}
}
}
}
But if I add another clause to the should query, it no longer behaves the same and gives the same score to products with one or two genders in their specifications.
Here is my final query, which doesn't work as expected:
GET _search
{
"sort": [
"_score"
],
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"rank_feature": {
"field": "view_score",
"linear": {}
}
},
{
"nested": {
"path": "recipients",
"ignore_unmapped": true,
"query": {
"bool": {
"boost": 10,
"must": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "female"
}
}
],
"must_not": {
"bool": {
"filter": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "male"
}
}
]
}
}
}
}
}
}
]
}
},
"script": {
"source": "return _score;"
}
}
}
}
So my problem is how to combine these should clauses to give more weight to products that specify only one gender.

Filter query by length of nested objects. ie. min_child

I'm trying to filter my query by the number of nested objects found. The Elasticsearch documentation mentions that using a script is an expensive task, so I've set out to do it with a score, though I can't seem to get the script to work either.
Here's my mappings:
"mappings": {
"properties": {
"dates" : {
"type" : "nested",
"properties" : {
"rooms" : {
"type" : "integer"
},
"timestamp" : {
"type" : "long"
}
}
},
"doc_id" : {
"type" : "text"
},
"distance" : {
"type" : "integer"
}
...
}
}
Here's some example data:
PUT /test/_doc/1
{
"doc_id": "1",
"distance": 1,
"dates": [
{
"rooms": 1,
"timestamp": 1
},
{
"rooms": 1,
"timestamp": 2
},
...
]
}
I'm filtering by the parent's distance field, among others, and filtering the nested dates by their timestamps and rooms. I need to restrict the results to documents with an exact number of matching nested dates.
I tried to borrow from here.
This is my search query:
GET /test/_search
{
"query" : {
"function_score": {
"min_score": 20,
"boost": 1,
"functions": [
{
"script_score": {
"script": {
"source": "if (_score > 20) { return - 1; } return _score;"
}
}
}
],
"query": {
"bool" : {
"filter": [
{ "range": { "distance": { "lt": 5 }}},
{
"nested": {
"score_mode": "sum",
"boost": 10,
"path": "dates",
"query": {
"bool": {
"filter": [
{ "range": { "dates.rooms": { "gte": 1 } } },
{ "range": { "dates.timestamp": { "lte": 2 }}},
{ "range": { "dates.timestamp": { "gte": 1 }}}
]
}
}
}
}
]
}
}
}
}
}
This returns all the results that match, yet they all have a score of 0.0 and aren't getting filtered by the number of nested objects found.
If this is the right solution, how can I get this working? If not, how can I get a script to do it within this search?
Thanks!
Before getting started, keep in mind that the scoring function changed between Elasticsearch 6 and 7. You can find the updated code samples in this gist.
Your question didn't outline the specifics of your search. Reading the code, it seems you want to retrieve all documents where the distance is less than five and the number of matching rooms is exactly 2. If this is correct, the code you submitted does not achieve this.
The reason: your function score wraps both your primary condition and your condition on the number of matching rooms (mixing both is tricky, though not impossible). To make things simpler, isolate them so that the function score applies only to the number of rooms.
Assuming you are using Elasticsearch 7+, this might work:
{
"_source": {
"includes": ["*"],
"excludes": ["dates"]
},
"query": {
"bool": {
"must": [
{"range": {"distance": {"lt": 5}}},
{
"function_score": {
"min_score": 20,
"boost": 1,
"score_mode": "multiply",
"boost_mode": "replace",
"functions": [
{
"script_score": {
"script": {
"source": "if (_score > 20) { return 0; } return _score;"
}
}
}
],
"query": {
"nested": {
"path": "dates",
"boost": 10,
"score_mode": "sum",
"query": {
"constant_score": {
"boost": 1,
"filter": {
"bool": {
"should": [
{
"bool": {
"must": [
{"term": {"dates.timestamp": 1}},
{"range": {"dates.rooms": {"lt": 5}}}
],
"should": [
{"term": {"dates.other_prop": 1}},
{"term": {"dates.other_prop": 4}}
]
}
},
{
"bool": {
"must": [
{"term": {"dates.timestamp": 2}},
{"range": {"dates.rooms": {"lt": 5}}}
],
"should": [
{"term": {"dates.other_prop": 1}},
{"term": {"dates.other_prop": 3}}
]
}
}
]
}
}
}
}
}
}
}
}
]
}
}
}
I managed to get it all working with scoring, since filtering doesn't allow scoring. Using GET /test/_explain/[id] helped me understand exactly what was happening:
GET /test/_search
{
// Don't return the nested fields, they are returned in the inner_hits
"_source": {
"includes": [ "*" ],
"excludes": [ "dates" ]
},
"query": {
"function_score": {
// Score is calculated with 1 point for each matched inner property and outer property.
// 7 is the exact score to allow
"min_score": 7,
"boost": 1,
"score_mode": "sum",
"boost_mode": "multiply",
"functions": [
{
"script_score": {
"script": {
// Ignore any results that don't match exactly
"source": "if (_score == 7) { return 1; } return 0;",
"lang": "painless"
}
}
}
],
"query": {
"bool" : {
"must" : [
{ "range" : { "distance" : { "lt": 10 }}},
{
"nested": {
"inner_hits" : {},
"path": "dates",
"score_mode": "sum",
"query": {
"bool": {
// Match each required nested object individually, then verify with the score if we got 1 match for each should
"should": [
{
"bool": {
"must": [
{ "term": { "dates.timestamp": 1 }},
{ "range": { "dates.rooms": { "lt": 5 } } }
],
"should": [
{ "term": { "dates.other_prop": 1 }},
{ "term": { "dates.other_prop": 4 }}
]
}
},
{
"bool": {
"must": [
{ "term": { "dates.timestamp": 2 }},
{ "range": { "dates.rooms": { "lt": 5 } } }
],
"should": [
{ "term": { "dates.other_prop": 1 }},
{ "term": { "dates.other_prop": 3 }}
]
}
}
]
}
}
}
}
]
}
}
}
}
}
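The idea behind the query above is to count matches through the score and then keep only documents whose score hits an exact target. A minimal Python sketch of that idea follows. It assumes that every matching nested object contributes exactly one point, which is a simplification: real Elasticsearch scores depend on the clauses involved (the query above ends up with a target of 7). The helper names and the second sample document are invented for illustration.

```python
# Simplified model: every nested object that matches contributes one point,
# which mimics "score_mode": "sum" over constant-scoring clauses.

def nested_match_count(dates, predicate):
    """Number of nested objects for which the predicate holds."""
    return sum(1 for d in dates if predicate(d))


def filter_by_exact_count(docs, predicate, required):
    """Keep only documents whose number of matching nested objects is exact."""
    return [
        doc for doc in docs
        if nested_match_count(doc["dates"], predicate) == required
    ]


def wanted(d):
    """Mirrors the range conditions from the question's nested filter."""
    return 1 <= d["timestamp"] <= 2 and d["rooms"] >= 1


docs = [
    # Document 1 follows the example data from the question.
    {"doc_id": "1", "dates": [{"rooms": 1, "timestamp": 1}, {"rooms": 1, "timestamp": 2}]},
    # Document 2 is invented: its second date has no rooms available.
    {"doc_id": "2", "dates": [{"rooms": 1, "timestamp": 1}, {"rooms": 0, "timestamp": 2}]},
]

kept = filter_by_exact_count(docs, wanted, required=2)
print([d["doc_id"] for d in kept])  # ['1']
```

In the real query the same effect comes from combining "score_mode": "sum" on the nested clause with min_score and the script that zeroes out any score other than the target.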

How to query multiple parameters in a nested field in elasticsearch

I'm trying to search by keyword and then add nested queries for amenities, which is a nested field holding an array of objects.
With the query below, I can search when matching only one amenity id, but with more than one it returns nothing.
Does anyone have an idea what is wrong with my query?
{
"sort": [
{
"_score": {
"order": "desc"
}
},
{
"_geo_distance": {
"geolocation": [
100,
10
],
"order": "asc",
"unit": "m",
"mode": "min",
"distance_type": "sloppy_arc"
}
}
],
"query": {
"bool": {
"must": [
{
"multi_match": {
"fields": [
"name^2",
"city",
"state",
"zip"
],
"fuzziness": 5,
"query": "complete"
}
},
{
"nested": {
"path": "amenities",
"query": {
"bool": {
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}
]
}
}
}
}
]
}
}
}
When you do:
"must": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
What you're actually saying is: find me any document where a single nested amenity has "amenities.id" = "1" and "amenities.id" = "2". Unless "amenities.id" is a list of values, that can never match.
What you probably want to say is: find me any document where "amenities.id" = "1" or "amenities.id" = "2".
To do that, use should instead of must:
"should": [
{
"term": {
"amenities.id": "1"
}
},
{
"term": {
"amenities.id": "2"
}
}]
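The difference between must and should inside a nested query can be modelled in a few lines of Python. This is only a sketch of the matching logic, not Elasticsearch code: `nested_must` and `nested_should` are hypothetical helpers, and the sample amenities list is invented for illustration.

```python
# Hypothetical model of how a nested query checks each nested object on its own.

def nested_must(amenities, ids):
    """must: a single nested object would need to carry every id at once."""
    return any(all(a["id"] == i for i in ids) for a in amenities)


def nested_should(amenities, ids):
    """should: a single nested object only needs to carry one of the ids."""
    return any(a["id"] in ids for a in amenities)


# Invented sample document: each amenity is a separate nested object.
amenities = [{"id": "1"}, {"id": "2"}, {"id": "7"}]

print(nested_must(amenities, ["1", "2"]))    # False
print(nested_should(amenities, ["1", "2"]))  # True
```

Note that the should form accepts documents having either amenity. If every listed amenity is required, the usual pattern is one nested clause per id, each placed in the outer must.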

Elastic Search 1.7.3 Nested filter: matching terms in an array of objects

I am trying to query for the following document in my elasticsearch:
"amenity": [
"Free Wifi",
"Free Breakfast",
"Veg Only",
"Swimming Pool",
"Newspaper",
"Bar",
"Credit Card",
"Pickup & Drop",
"Gym",
"Elevator",
"Valet Parking"
],
"dodont": [
{
"do_or_dont": "Do",
"what": "Vegetarians"
},
{
"do_or_dont": "Do",
"what": "Family"
},
{
"do_or_dont": "Dont",
"what": "Loud Music"
},
{
"do_or_dont": "Dont",
"what": "Booze"
}
]
and here is the query I have written:
"filter": {
"and": {
"filters": [
{
"nested" : {
"path" : "dodont",
"filter" : {
"bool" : {
"must": [{"and" : [
{
"term" : {"dodont.do_or_dont" : "Do"}
},
{
"term" : {"dodont.what" : "Vegetarians"}
}
]},
{"and" : [
{
"term" : {"dodont.do_or_dont" : "Do"}
},
{
"term" : {"dodont.what" : "Family"}
}
]}]
}
}
}
}
]
}
}
This query returns an empty result. When I change "must" to "should" in the bool above, it returns the document shown (the only one matching this filter). Ideally, though, the "must" condition should return that document: I want to pass multiple objects for dos and don'ts and get only the results that match all of them, but I am not able to do so. How should I go about it?
You need to split out the two conditions on your nested document, since each element of the dodont nested array is conceptually a separate document:
{
"filter": {
"and": {
"filters": [
{
"nested": {
"path": "dodont",
"filter": {
"and": [
{
"term": {
"dodont.do_or_dont": "Do"
}
},
{
"term": {
"dodont.what": "Vegetarians"
}
}
]
}
}
},
{
"nested": {
"path": "dodont",
"filter": {
"and": [
{
"term": {
"dodont.do_or_dont": "Do"
}
},
{
"term": {
"dodont.what": "Family"
}
}
]
}
}
}
]
}
}
}
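Since the filter above repeats the same nested clause once per condition, it can be generated programmatically when the list of dos and don'ts varies. The following Python sketch builds the same Elasticsearch 1.x style body as a plain dictionary; the function name `build_dodont_filter` is made up for illustration, and sending the body to a cluster is left out.

```python
import json


def build_dodont_filter(pairs):
    """Build an ES 1.x style 'and' filter with one nested clause per pair.

    Each (do_or_dont, what) pair gets its own nested clause, because every
    element of the dodont array is matched as a separate nested document.
    """
    nested_clauses = [
        {
            "nested": {
                "path": "dodont",
                "filter": {
                    "and": [
                        {"term": {"dodont.do_or_dont": do_or_dont}},
                        {"term": {"dodont.what": what}},
                    ]
                },
            }
        }
        for do_or_dont, what in pairs
    ]
    return {"filter": {"and": {"filters": nested_clauses}}}


body = build_dodont_filter([("Do", "Vegetarians"), ("Do", "Family")])
print(json.dumps(body, indent=2))
```

With the two pairs shown, the printed body has the same structure as the filter in the answer above.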

Elasticsearch additional boost if multiple conditions are met

Imagine I have a document, which looks like this:
{
"Title": "Smartphones in United Kingdom",
"Text": "A huge text about the topic",
"CategoryTags": [
{
"CategoryID": 1,
"CategoryName": "Smartphone"
},
{
"CategoryID": 2,
"CategoryName": "Apple"
},
{
"CategoryID": 3,
"CategoryName": "Samsung"
}
],
"GeographyTags": [
{
"GeographyID": 1,
"GeographyName": "Western Europe"
},
{
"GeographyID": 2,
"GeographyName": "United Kingdom"
}
]
}
CategoryTags and GeographyTags are stored as nested subdocuments.
Suppose I type "apple united kingdom" in my search bar. How would I write a query that boosts this document if it matches both a category and a geography at the same time?
I was thinking of a multi_match query, but I couldn't figure out how to deal with nested documents here.
I was also thinking of nesting a must inside a should clause. Would that make any sense?
POST /_search
{
"template": {
"size": "50",
"_source": {
"include": "Title"
},
"query": {
"filtered": {
"query": {
"bool": {
"minimum_number_should_match": "2<50%",
"must": [
{
"match": {
"Text": {
"query": "{{SearchPhrase}}"
}
}
}
],
"should": [
{
"match": {
"Title": {
"query": "{{SearchPhrase}}",
"type": "phrase",
"boost": "20"
}
}
},
{
"bool": {
"must": [
{
"nested": {
"path": "CategoryTags",
"query": {
"match": {
"CategoryTags.CategoryName": "{{SearchPhrase}}"
}
}
}
},
{
"nested": {
"path": "GeographyTags",
"query": {
"match": {
"GeographyTags.GeographyName": "{{SearchPhrase}}"
}
}
}
}
]
}
}
]
}
}
}
}
}
}
