Get available apartments query - elasticsearch

Overview
I have apartments which have reservations. My index has the reservations as nested fields with date fields for start_date and end_date.
I'm using the chewy Ruby gem, but I don't think that matters here; I just need to get my query right.
Goal
I want to fetch all available apartments which have no reservation at the given date or no reservations at all.
Current query
Unfortunately, this returns all apartments:
:query => {
:bool => {
:must_not => [
{
:range => {:"reservations.start_date" => {:gte => "2017-02-10"}}
},
{
:range => {:"reservations.end_date" => {:lte => "2017-02-12"}}
}
]
}
}
Index Settings
{
"apartments" : {
"aliases" : { },
"mappings" : {
"apartment" : {
"properties" : {
"city" : {
"type" : "string"
},
"coordinates" : {
"type" : "geo_point"
},
"email" : {
"type" : "string"
},
"reservations" : {
"type" : "nested",
"properties" : {
"end_date" : {
"type" : "date",
"format" : "yyyy-MM-dd"
},
"start_date" : {
"type" : "date",
"format" : "yyyy-MM-dd"
}
}
},
"street" : {
"type" : "string"
},
"zip" : {
"type" : "string"
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1487289727161",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "-rM79OUvQ3qkkLJmQCsoCg",
"version" : {
"created" : "2040499"
}
}
},
"warmers" : { }
}
}

We have to list free apartments and those apartments that will be available in the desired period (the start_date and end_date variables).
So it should be an or query: free_apartments or available_apartments.
The free apartments (those that have no value in the reservations field) should be easy to query with a missing filter, but this is a nested field, which needs special handling.
If we perform the query with a missing filter, all docs are returned. It's weird, but it happens. The solution is explained here: https://gist.github.com/Erni/7484095 and the issue is here: https://github.com/elastic/elasticsearch/issues/3495. The gist snippet works with all Elasticsearch versions.
The other part of the or query is the available apartments. I solved this part with a not query: build a list of range filters that match the apartments that do have a reservation in the period, then negate the result with a must_not filter.
elasticsearch_query = {
"query": {
"filtered": {
"filter": {
"bool": {
"should": [
{
"nested": {
"filter": {
"bool": {
"must_not" : [
{
"range": {
"start_date": {
"gte" : start_date,
"lt" :end_date
}
}
},
{
"range": {
"end_date": {
"gte" : end_date,
#"lte" :end_date
}
}
}
]
}
},
"path": "reservations"
}
},
{
#{ "missing" : { "field" : "reservations"} }
"not": {
"nested": {
"path": "reservations",
"filter": {
"match_all": {}
}
}
}
}
],
}
}
},
},
"sort" : {"id":"desc"}
}
You can have a look at my solution in this notebook.
I've created an example, populating a sample index and searching for the desired apartments with this query.
Answers to comments:
Prefix: since the nested filter sets the path that will be queried, the prefix is not needed at all (at least in the version I tested). And yes, you can add a field named start_date at document level or in another nested field.
Apartment matches: yes, it matches 91 sample apartments, but since I searched with the default size parameter (I didn't specify a value), only 10 are returned. If you need to get ALL of them, use a scroll search.
(The notebook has been modified to clarify these points.)

First of all, I think you must use the nested query.
I am not familiar with chewy-gem but the query would look something like:
:query => {
:nested => {
:path => "reservations",
:query => {
:bool => {
:must_not => [
{
:range => {:"reservations.start_date" => {:gte => "2017-02-10"}}
},
{
:range => {:"reservations.end_date" => {:lte => "2017-02-12"}}
}
]
}
}
}
}
But it might also not work: if there is a reservation in 2018, the first range clause will be true (as the start date will be >= 2017-02-10), so the apartment will not be returned, if I'm correct.
I would do something like:
:query => {
:nested => {
:path => "reservations",
:query => {
:bool => {
:must_not => [
{
:range => {:"reservations.start_date" => {:gte => "2017-02-10", :lte => "2017-02-12"}}
},
{
:range => {:"reservations.end_date" => {:gte => "2017-02-10", :lte => "2017-02-12"}}
}
]
}
}
}
}
This means: no start date within the range you want, and no end date within the range you want.
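One case the two range clauses above do not cover is a reservation that starts before the requested period and ends after it, since neither of its dates falls inside the range. The following is a minimal Python sketch of the date arithmetic only (no Elasticsearch involved); the function names and the sample reservation are made up for illustration.

```python
from datetime import date

def endpoint_in_range(res_start, res_end, start, end):
    """Mirror of the two range clauses: the reservation's start or end lies in [start, end]."""
    return start <= res_start <= end or start <= res_end <= end

def overlaps(res_start, res_end, start, end):
    """General overlap test for two closed date intervals."""
    return res_start <= end and res_end >= start

requested = (date(2017, 2, 10), date(2017, 2, 12))
spanning = (date(2017, 2, 1), date(2017, 2, 20))  # hypothetical reservation

print(endpoint_in_range(*spanning, *requested))  # False: the two clauses miss it
print(overlaps(*spanning, *requested))           # True: the apartment is occupied
```

If such enclosing reservations can occur in your data, the query probably needs a third condition for them.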

This is the query I came up with which is supposed to take into account all conditions, namely:
either there are no reservations (1st top-level bool/should)
or there is at least one reservation and the reservation start and end dates do not overlap with the requested dates.
Here, we're asking for free apartments between 2017-02-10 and 2017-02-12:
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"nested": {
"path": "reservations",
"query": {
"bool": {
"must_not": {
"exists": {
"field": "reservations.start_date"
}
}
}
}
}
},
{
"bool": {
"must": [
{
"nested": {
"path": "reservations",
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"range": {
"reservations.start_date": {
"gt": "2017-02-10"
}
}
},
{
"range": {
"reservations.end_date": {
"lt": "2017-02-10"
}
}
}
]
}
}
}
},
{
"nested": {
"path": "reservations",
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"range": {
"reservations.start_date": {
"gt": "2017-02-12"
}
}
},
{
"range": {
"reservations.end_date": {
"lt": "2017-02-12"
}
}
}
]
}
}
}
}
]
}
}
]
}
}

Related

Elastic search combine must and must_not

I have a document that holds data for a product the mapping is as follow:
"mappings" : {
"properties" : {
"view_score" : {
"positive_score_impact" : true,
"type" : "rank_feature"
},
"recipients" : {
"dynamic" : false,
"type" : "nested",
"enabled" : true,
"properties" : {
"type" : {
"similarity" : "boolean",
"type" : "keyword"
},
"title" : {
"type" : "text",
"fields" : {
"key" : {
"type" : "keyword"
}
}
}
}
}
}
}
And I have 2 documents with the following data:
{
"view_score": 10,
"recipients": [{"type":"gender", "title":"male"}, {"type":"gender", "title":"female"}]
}
{
"view_score": 10,
"recipients": [{"type":"gender", "title":"female"}]
}
When a user searches for a product, she can say "I prefer products for females", so products that specify the gender as just female should come before products that specify both male and female.
I have the following query which gives more score to products with just female gender:
GET _search
{
"sort": [
"_score"
],
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"nested": {
"path": "recipients",
"ignore_unmapped": true,
"query": {
"bool": {
"boost": 10,
"must": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "female"
}
}
],
"must_not": {
"bool": {
"filter": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "male"
}
}
]
}
}
}
}
}
}
]
}
},
"script": {
"source": "return _score;"
}
}
}
}
But if I add another query to the should clause, it doesn't behave the same way and gives the same score to products with one or two genders in their specifications.
Here is my final query, which doesn't work as expected:
GET _search
{
"sort": [
"_score"
],
"query": {
"script_score": {
"query": {
"bool": {
"should": [
{
"rank_feature": {
"field": "view_score",
"linear": {}
}
},
{
"nested": {
"path": "recipients",
"ignore_unmapped": true,
"query": {
"bool": {
"boost": 10,
"must": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "female"
}
}
],
"must_not": {
"bool": {
"filter": [
{
"term": {
"recipients.type": "gender"
}
},
{
"match": {
"recipients.title": "male"
}
}
]
}
}
}
}
}
}
]
}
},
"script": {
"source": "return _score;"
}
}
}
}
So my problem is how to combine these should clauses so that products that specify only one gender get more weight.

In Elasticsearch, how do I search string on multiple fields from multi-level nested objects

In Elasticsearch 6, I have data with nested objects like this:
{
"brands" :
[
{
"brand_name" : "xyz",
"products" :
[
{
"title" : "test",
"mrp" : 100,
"sp" : 90,
"status" : 1
},
{
"title" : "test1",
"mrp" : 50,
"sp" : 45,
"status" : 1
}
]
},
{
"brand_name" : "aaa",
"products" :
[
{
"title" : "xyz",
"mrp" : 100,
"sp" : 90,
"status" : 1
},
{
"title" : "abc",
"mrp" : 50,
"sp" : 45,
"status" : 1
}
]
}
]
}
I want to search on either the brand_name field or the title field, and I want all results returned in the same inner_hits.
For example: if I input the search string "xyz", it should return both brand objects with the corresponding product object.
If I input the search string "test", it should return only the first brand, with only the first product object.
How can I achieve this? Any ideas?
I have tried a nested path query like this:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "brands",
"query": {
"bool": {
"should": [
{
"term": {
"brands.brand_name": "xyz"
}
},
{
"term": {
"brands.brand_name.keyword": "aaa"
}
},
{
"nested": {
"path": "brands.products",
"query": {
"bool": {
"should": [
{
"match": {
"brands.products.title": "xyz"
}
}
]
}
},
"inner_hits": {}
}
}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
But this query returns a response with multiple inner_hits, with multiple array objects for each brand and for each product.
I want a response where all brand names matching the string are listed under one array and all the products are listed under another array, under the same inner_hits.
Since you want the inner hits to differ based on where the match happened (brands.brand_name or brands.products.title), you can use two independent nested queries, one for the brand name and one for the product title. These queries should then go inside the should clause of a bool query. Each nested query should have its own inner_hits, as below:
{
"query": {
"bool": {
"should": [
{
"nested": {
"path": "brands",
"inner_hits": {},
"query": {
"term": {
"brands.brand_name.keyword": "test"
}
}
}
},
{
"nested": {
"path": "brands.products",
"inner_hits": {},
"query": {
"term": {
"brands.products.title": "test"
}
}
}
}
]
}
},
"_source": false
}
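For reuse with different search strings, the same request body can be generated by a small helper. This is only a sketch in Python that builds a plain dict; the function name brand_or_product_query is hypothetical, and the field names are the ones used in the answer above.

```python
import json

def brand_or_product_query(text):
    """Build a bool/should body with one nested query per level, each with its own inner_hits."""
    def nested(path, field):
        return {
            "nested": {
                "path": path,
                "inner_hits": {},
                "query": {"term": {field: text}},
            }
        }

    return {
        "query": {
            "bool": {
                "should": [
                    nested("brands", "brands.brand_name.keyword"),
                    nested("brands.products", "brands.products.title"),
                ]
            }
        },
        "_source": False,
    }

body = brand_or_product_query("xyz")
print(json.dumps(body, indent=2))
```

The resulting dict can be passed as the search body to whichever client you use.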

Query to get available dates using start and end date

I’m trying to create a query which returns available products with no reservation at that date (or date range) or no reservations at all. It’s driving me crazy.
Here is my current mapping with index settings:
{
"development_product_instances" : {
"aliases" : { },
"mappings" : {
"product_instance" : {
"properties" : {
"reservations" : {
"type" : "nested",
"properties" : {
"end_date" : {
"type" : "date",
"format" : "yyyy-MM-dd"
},
"start_date" : {
"type" : "date",
"format" : "yyyy-MM-dd"
}
}
}
}
}
},
"settings" : {
"index" : {
"creation_date" : "1503327829680",
"number_of_shards" : "5",
"number_of_replicas" : "1",
"uuid" : "9b9BhF-ITta2dlCKRLrnfA",
"version" : {
"created" : "2040499"
}
}
},
"warmers" : { }
}
}
And the query:
{
bool: {
should: [
{
nested: {
path: "reservations",
filter: {
bool: {
must_not: [
{
range:
{
"reservations.start_date":
{
gte: start_date,
lte: end_date
}
}
},
{
range:
{
"reservations.end_date":
{
gte: start_date,
lt: end_date
}
}
}
]
}
}
}
},
{
not: {
nested: {
path: "reservations",
filter: {
match_all: {}
}
}
}
}
]
}
}
When there is more than one reservation, it returns all of them.
I hope someone can see the bug in there. Maybe I'm missing something in the bigger picture.
Your problem is that the must_not is inside the nested query. That means that if it matches for any of the nested reservations, then the parent document matches. So when there are multiple reservations, unless the range you're querying overlaps all the existing reservations, you get a match. You can rewrite it like this (note that this query also matches when reservations is empty):
{
"query": {
"bool": {
"must_not": {
"nested": {
"path": "reservations",
"query": {
"bool": {
"should": [
{
"range": {
"reservations.start_date": {
"gte": start_date,
"lt": end_date
}
}
},
{
"range": {
"reservations.end_date": {
"gte": start_date,
"lt": end_date
}
}
},
{
"bool": {
"must": [
{
"range": {
"reservations.start_date": {
"lt": start_date
}
}
},
{
"range": {
"reservations.end_date": {
"gt": end_date
}
}
}
]
}
}
]
}
}
}
}
}
}
}
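The difference between the two placements of must_not can be reproduced without Elasticsearch. The sketch below is a plain-Python model of the semantics described above (a nested query matches the parent as soon as any nested document matches). The function names and sample dates are illustrative assumptions, and overlaps is a simplified half-open interval test standing in for the should clauses.

```python
from datetime import date

def overlaps(res, start, end):
    """Simplified stand-in for the should clauses: half-open interval overlap."""
    res_start, res_end = res
    return res_start < end and res_end > start

def must_not_inside_nested(reservations, start, end):
    """nested { must_not: overlap }: matches if ANY reservation does not overlap."""
    return any(not overlaps(r, start, end) for r in reservations)

def must_not_around_nested(reservations, start, end):
    """must_not { nested: overlap }: matches only if NO reservation overlaps."""
    return not any(overlaps(r, start, end) for r in reservations)

start, end = date(2017, 2, 10), date(2017, 2, 12)
reservations = [
    (date(2017, 2, 9), date(2017, 2, 11)),  # overlaps the request
    (date(2017, 3, 1), date(2017, 3, 5)),   # unrelated later booking
]

print(must_not_inside_nested(reservations, start, end))  # True: wrongly reported as free
print(must_not_around_nested(reservations, start, end))  # False: correctly excluded
print(must_not_around_nested([], start, end))            # True: no reservations at all
```

The last call also illustrates why the rewritten query matches documents whose reservations field is empty.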

how to use not query in index field in elasticsearch

This is the mapping.
curl -XPUT 'localhost:9200/products/' -d '{
"settings" : {
"index" : {
"number_of_shards" : 6,
"number_of_replicas" : 1
}
},
"mappings" : {
"product" : {
"_all":{ "enabled": true },
"properties":{
"id" : { "type" : "string", "index" : "not_analyzed", "include_in_all": true },
"description" : { "type" : "string" },
"title" : { "type" : "string", "boost" : 2 }
}
}
}
}'
I don't want to get the ads which have no description, but as you can see in the mapping, "description" is indexed.
So how do I use a not query on description?
Please help me out.
I read the Elasticsearch docs and I use this query:
query => {
filtered => {
filter => {
not => {
filter => {
term => {description => ''}
}
}
},
query => {
match => { _all => $q }
}
}
}
But it's not working. I think that's because description is indexed, right?
For 2.4 this would be the correct syntax and query approach:
{
"query": {
"bool": {
"must": [
{"match_all": {}}
],
"filter": {
"bool": {
"must": [
{
"exists": {
"field": "description"
}
},
{
"wildcard": {
"description": "*"
}
}
]
}
}
}
}
}
Instead of filtered you have a bool with must as the query and filter as the filter. What's inside of must is what you had as query, and what's inside of filter is what you had as filter. The approach you used with filtered is deprecated in ES 2.x.
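The mapping from filtered to bool described above is mechanical, so it can be written as a small helper. This is a minimal Python sketch operating on plain dicts; filtered_to_bool is a hypothetical name, and the search term "phone" is a placeholder.

```python
def filtered_to_bool(filtered):
    """Rewrite the body of a `filtered` query as an equivalent `bool` query.

    The old `query` part goes under `must`, the old `filter` part under `filter`.
    A missing `query` is treated as match_all, which was the default for `filtered`.
    """
    bool_body = {"must": [filtered.get("query", {"match_all": {}})]}
    if "filter" in filtered:
        bool_body["filter"] = filtered["filter"]
    return {"bool": bool_body}

old = {
    "query": {"match": {"_all": "phone"}},           # placeholder search term
    "filter": {"exists": {"field": "description"}},
}
print(filtered_to_bool(old))
```

Note that this only moves the two parts; deprecated filters nested inside them (such as not, and, or) would still have to be rewritten by hand.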

Elasticsearch match list against field

I have a list (or array, or whatever your language calls it), e.g. names: ["John","Bas","Peter"], and I want to query the name field for documents that match any of those names.
One way is with an OR filter, e.g.:
{
"filtered" : {
"query" : {
"match_all": {}
},
"filter" : {
"or" : [
{
"term" : { "name" : "John" }
},
{
"term" : { "name" : "Bas" }
},
{
"term" : { "name" : "Peter" }
}
]
}
}
}
Any fancier way? Better if it's a query than a filter.
{
"query": {
"filtered" : {
"filter" : {
"terms": {
"name": ["John","Bas","Peter"]
}
}
}
}
}
Which Elasticsearch rewrites as if you had used this one:
{
"query": {
"filtered" : {
"filter" : {
"bool": {
"should": [
{
"term": {
"name": "John"
}
},
{
"term": {
"name": "Bas"
}
},
{
"term": {
"name": "Peter"
}
}
]
}
}
}
}
}
When combining filters, it is usually better to use the bool filter than the and or or filters. The reason is explained on the Elasticsearch blog: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/
When I tried the filtered query, I got "no [query] registered for [filtered]". Based on the answer here, it seems the filtered query was deprecated and then removed in ES 5.0, so I suggest using:
{
"query": {
"bool": {
"filter": {
"terms": {
"name": ["John","Bas","Peter"]
}
}
}
}
}
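When the list of names comes from application code, the body above can be generated from it. A minimal Python sketch follows; terms_query is a hypothetical helper name, and it only builds the dict without sending anything to a cluster.

```python
import json

def terms_query(field, values):
    """Build a bool/filter/terms body that matches documents whose `field` equals any of `values`."""
    return {"query": {"bool": {"filter": {"terms": {field: list(values)}}}}}

names = ["John", "Bas", "Peter"]
body = terms_query("name", names)
print(json.dumps(body, indent=2))
```

Keep in mind that terms is not analyzed: if name is an analyzed text field, the indexed tokens are typically lowercased, so exact-case values such as "John" may not match unless you query a keyword (not analyzed) field.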
Example query that filters by a keyword and a list of values:
{
"query": {
"bool": {
"must": [
{
"term": {
"fguid": "9bbfe844-44ad-4626-a6a5-ea4bad3a7bfb.pdf"
}
}
],
"filter": {
"terms": {
"page": [
"1",
"2",
"3"
]
}
}
}
}
}

Resources