SQL equivalent of a correlated query for an Elasticsearch aggregation

I have a use case for an aggregation that, in SQL, could be written with correlated queries.
I have an index called listings whose properties/columns are ListDate, ListPrice, SoldDate, SoldPrice and OffMarketDate.
ListDate is not nullable, but SoldDate, SoldPrice and OffMarketDate can be null.
I want to aggregate stats from this index based on the following requirement.
I want monthly stats, which I see can be achieved with DateHistogramAggregation.
For each month from the DateHistogramAggregation, I want to find the listings as follows:
Example: For Jan 2019, get all the listings where (ListDate < Feb 1st, 2019) and (SoldDate is null or SoldDate < Jan 1st, 2019) and (OffMarketDate is null or OffMarketDate < Jan 1st, 2019)
Then run the aggregation function over those listings for each month.
I'd appreciate any suggestions for implementing this use case. Thanks in advance for the help.

Here is how you can approach this problem:
Mapping:
PUT listings
{
"mappings": {
"properties": {
"listDate":{
"type": "date"
},
"listPrice":{
"type": "long"
},
"soldDate":{
"type": "date"
},
"soldPrice": {
"type": "long"
},
"offMarketDate": {
"type": "date"
}
}
}
}
Note that I've constructed the above mapping looking at your question.
Sample Documents:
POST listings/_doc/1
{
"listDate": "2020-01-01",
"listPrice": "100.00",
"soldDate": "2019-12-25",
"soldPrice": "120.00",
"offMarketDate": "2019-12-20"
}
POST listings/_doc/2
{
"listDate": "2020-01-01",
"listPrice": "100.00",
"soldDate": "2019-12-24",
"soldPrice": "122.00",
"offMarketDate": "2019-12-20"
}
POST listings/_doc/3
{
"listDate": "2020-01-25",
"listPrice": "120.00",
"soldDate": "2020-01-30",
"soldPrice": "140.00",
"offMarketDate": "2020-01-26"
}
POST listings/_doc/4
{
"listDate": "2020-01-25",
"listPrice": "120.00",
"soldDate": "2020-02-02",
"soldPrice": "135.00",
"offMarketDate": "2020-01-26"
}
POST listings/_doc/5
{
"listDate": "2020-01-25",
"listPrice": "120.00"
}
POST listings/_doc/6
{
"listDate": "2020-02-02",
"listPrice": "120.00"
}
Note that I've left soldDate and offMarketDate out of docs 5 and 6, as omitting the fields is a better option than storing them with a null value.
Request Query:
I've come up with the query below for your use case.
For the sake of the aggregation, it calculates the total soldPrice for the docs having
listDate in the month of Jan 2020 AND
(soldDate either null OR soldDate before the month of Jan 2020) AND
(offMarketDate either null OR offMarketDate before the month of Jan 2020).
Below is the query:
POST listings/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"listDate": {
"gte": "2020-01-01",
"lt": "2020-02-01"
}
}
},
{
"bool": {
"should": [
{
"range": {
"soldDate": {
"lt": "2020-01-01"
}
}
},
{
"bool": {
"must_not": [
{
"exists": {
"field": "soldDate"
}
}
]
}
}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{
"range": {
"offMarketDate": {
"lt": "2020-01-01"
}
}
},
{
"bool": {
"must_not": [
{
"exists": {
"field": "offMarketDate"
}
}
]
}
}
],
"minimum_should_match": 1
}
}
]
}
},
"aggs": {
"my_histogram": {
"date_histogram": {
"field": "listDate",
"calendar_interval": "month"
},
"aggs": {
"total_sales_price": {
"sum": {
"field": "soldPrice"
}
}
}
}
}
}
The query above should be fairly readable. I'd suggest reading about the queries and aggregations it uses:
Bool Query
Range Query
Exists Query, to check whether a field exists
Date Histogram Aggregation
Sum Metrics Aggregation
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 3.0,
"hits" : [
{
"_index" : "listings",
"_type" : "_doc",
"_id" : "1",
"_score" : 3.0,
"_source" : {
"listDate" : "2020-01-01",
"listPrice" : "100.00",
"soldDate" : "2019-12-25",
"soldPrice" : "120.00",
"offMarketDate" : "2019-12-20"
}
},
{
"_index" : "listings",
"_type" : "_doc",
"_id" : "2",
"_score" : 3.0,
"_source" : {
"listDate" : "2020-01-01",
"listPrice" : "100.00",
"soldDate" : "2019-12-24",
"soldPrice" : "122.00",
"offMarketDate" : "2019-12-20"
}
},
{
"_index" : "listings",
"_type" : "_doc",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"listDate" : "2020-01-25",
"listPrice" : "120.00"
}
}
]
},
"aggregations" : {
"my_histogram" : {
"buckets" : [
{
"key_as_string" : "2020-01-01T00:00:00.000Z",
"key" : 1577836800000,
"doc_count" : 3,
"total_sales_price" : {
"value" : 242.0
}
}
]
}
}
}
As expected, documents 1, 2 and 5 are returned, and the aggregated sum of soldPrice is correct (120 + 122 = 242; document 5 has no soldPrice).
Hope that helps!
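The query above covers a single month. For the per-month figures the question asks for, one possible approach is a filters aggregation with one named filter per month, because a listing can count towards several months and so cannot be assigned to a single date_histogram bucket. The following is a minimal Python sketch that only builds the request body. The helper names and the avg_list_price metric are assumptions made for illustration; the field names come from the mapping above.

```python
import json
from datetime import date


def next_month(d: date) -> date:
    """Return the first day of the month after d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def null_or_before(field: str, day: date) -> dict:
    """Match documents where `field` is missing or earlier than `day`."""
    return {
        "bool": {
            "should": [
                {"range": {field: {"lt": day.isoformat()}}},
                {"bool": {"must_not": [{"exists": {"field": field}}]}},
            ],
            "minimum_should_match": 1,
        }
    }


def month_filter(month_start: date) -> dict:
    """Filter for one month, following the rule stated in the question."""
    return {
        "bool": {
            "filter": [
                {"range": {"listDate": {"lt": next_month(month_start).isoformat()}}},
                null_or_before("soldDate", month_start),
                null_or_before("offMarketDate", month_start),
            ]
        }
    }


def monthly_stats_body(months: list[date]) -> dict:
    """Build a search body with one named bucket per month."""
    return {
        "size": 0,
        "aggs": {
            "per_month": {
                "filters": {
                    "filters": {m.strftime("%Y-%m"): month_filter(m) for m in months}
                },
                # Illustrative metric only; replace with the stats you need.
                "aggs": {"avg_list_price": {"avg": {"field": "listPrice"}}},
            }
        },
    }


if __name__ == "__main__":
    body = monthly_stats_body([date(2019, 12, 1), date(2020, 1, 1)])
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST listings/_search. The list of months has to be generated by the caller, which is the price of expressing a per-bucket condition that depends on the bucket's own boundaries.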

Related

elasticsearch filter nested object

I have an index with a nested object containing two attributes, scopeId and categoryName. The following is the relevant part of the index mapping:
"mappedCategories" : {
"type" : "nested",
"properties": {
"scopeId": {"type":"long"},
"categoryName": {"type":"text",
"analyzer" : "productSearchAnalyzer",
"search_analyzer" : "productSearchQueryAnalyzer"}
}
}
A sample document containing the nested mappedCategories object is as follows:
POST productsearchna_2/_doc/1
{
"categoryName" : "Operating Systems",
"contexts" : [
0
],
"countryCode" : "US",
"id" : "10076327-1",
"languageCode" : "EN",
"localeId" : 1,
"mfgpartno" : "test123",
"manufacturerName" : "Hewlett Packard Enterprise",
"productDescription" : "HPE Microsoft Windows 2000 Datacenter Server - Complete Product - Complete Product - 1 Server - Standard",
"productId" : 10076327,
"skus" : [
{"sku": "43233004",
"skuName": "UNSPSC"},
{"sku": "43233049",
"skuName": "SP Richards"},
{"sku": "43234949",
"skuName": "Ingram Micro"}
],
"mappedCategories" : [
{"scopeId": 3228552,
"categoryName": "Laminate Bookcases"},
{"scopeId": 3228553,
"categoryName": "Bookcases"},
{"scopeId": 3228554,
"categoryName": "Laptop"}
]
}
I want to filter categoryName "lap" on scopeId: 3228553, i.e. my query should return 0 hits since Laptop is mapped to scopeId 3228554. But the following query returns 1 hit, with scopeId: 3228554:
POST productsearchna_2/_search
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "mappedCategories",
"query": {
"term": {
"mappedCategories.categoryName": "lap"
}
},
"inner_hits": {}
}
}
],
"filter": [
{
"nested": {
"path": "mappedCategories",
"query": {
"term": {
"mappedCategories.scopeId": {
"value": 3228552
}
}
}
}
}
]
}
},
"_source": ["mappedCategories.categoryName", "productId"]
}
Following is part of the result of the query:
"inner_hits" : {
"mappedCategories" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.5586993,
"hits" : [
{
"_index" : "productsearchna_2",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "mappedCategories",
"offset" : 2
},
"_score" : 1.5586993,
"_source" : {
"scopeId" : 3228554,
"categoryName" : "Laptop"
}
}
]
}
}
I want my query to return zero hits, and in case I search for "book" with scopeId: 3228552, I want my query to return 2 hits, 1 for Bookcases and another for Laminate Bookcases categoryNames. Please help.
This query solves part of the problem, but when searching for "book" with scopeId: 3228552 it will only get 1 result.
GET idx_test/_search?filter_path=hits.hits.inner_hits
{
"query": {
"nested": {
"path": "mappedCategories",
"query": {
"bool": {
"filter": [
{
"term": {
"mappedCategories.scopeId": {
"value": 3228553
}
}
}
],
"must": [
{
"match": {
"mappedCategories.categoryName": "laptop"
}
}
]
}
},
"inner_hits": {}
}
}
}
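The original request returned a hit because its two nested clauses are evaluated independently: one nested object can satisfy the scopeId clause while a different one satisfies the categoryName clause. Placing both conditions inside a single nested query, as above, requires the same nested object to satisfy both. A small Python sketch of a builder for that request body follows; the function name is hypothetical and the field names are taken from the question.

```python
import json


def scoped_category_query(scope_id: int, text: str) -> dict:
    """Build a search body whose two conditions must hold on the same nested object."""
    return {
        "query": {
            "nested": {
                "path": "mappedCategories",
                "query": {
                    "bool": {
                        # Exact match on the scope; no scoring needed.
                        "filter": [{"term": {"mappedCategories.scopeId": scope_id}}],
                        # Analyzed match on the category name.
                        "must": [{"match": {"mappedCategories.categoryName": text}}],
                    }
                },
                # Report which nested objects matched.
                "inner_hits": {},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(scoped_category_query(3228553, "laptop"), indent=2))
```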

Getting only the most recent records from ElasticSearch

I have a user index in Elasticsearch where each user has a name, various other user-related information, and an indexedAt field that specifies when the user information was indexed. When any information about a user changes, I create a new record for the user and store it. Therefore each user can have multiple records in the index.
Now I simply want to get only the most up-to-date information for the queried users.
For example, if I run the following query, it returns all of the records for John and Smith. But I want only the most recent record for each user.
{
"size": 10000,
"query": {
"bool": {
"should": [
{
"match_phrase": {
"name": "John"
}
},
{
"match_phrase": {
"name": "Smith"
}
}
]
}
},
"sort": [
{
"indexedAt": {
"order": "desc"
}
}
]
}
You can use field collapsing (collapse) together with inner_hits to get your answer:
GET /temp_index/_search
{
"size": 10,
"query": {
"bool": {
"should": [
{
"match_phrase": {
"name": "John"
}
},
{
"match_phrase": {
"name": "Smith"
}
}
]
}
},
"collapse": {
"field": "name.keyword",
"inner_hits": {
"name": "most_recent",
"size": 1,
"sort": [{"indexedAt": "desc"}]
}
}
}
This will get you a result similar to below
{
"_index" : "temp_index",
"_type" : "_doc",
"_id" : "KSHBjnMBPr3VGlJjXe3d",
"_score" : 0.8266786,
"_source" : {
"name" : "John",
"indexedAt" : 1015
},
"fields" : {
"name.keyword" : [
"John"
]
},
"inner_hits" : {
"most_recent" : {
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "temp_index",
"_type" : "_doc",
"_id" : "LyHBjnMBPr3VGlJji-24",
"_score" : null,
"_source" : {
"name" : "John",
"indexedAt" : 1050
},
"sort" : [
1050
]
}
]
}
}
}
},
You can read the inner_hits section of each hit to get the most recently indexed document (i.e. the one with the largest indexedAt value).
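As a rough illustration of reading that section, the Python sketch below pulls the newest record of every collapsed group out of a search response that has already been parsed into a dict. The function name and the trimmed SAMPLE_RESPONSE are assumptions modelled on the response fragment above; only the inner hit name most_recent comes from the query.

```python
# Trimmed response shaped like the fragment above (illustrative data only).
SAMPLE_RESPONSE = {
    "hits": {
        "hits": [
            {
                "_source": {"name": "John", "indexedAt": 1015},
                "inner_hits": {
                    "most_recent": {
                        "hits": {
                            "hits": [
                                {"_source": {"name": "John", "indexedAt": 1050}}
                            ]
                        }
                    }
                },
            }
        ]
    }
}


def most_recent_sources(response: dict, inner_name: str = "most_recent") -> list[dict]:
    """Return the newest document (_source) of each collapsed group."""
    newest = []
    for hit in response["hits"]["hits"]:
        inner = hit["inner_hits"][inner_name]["hits"]["hits"]
        if inner:  # size is 1 in the query, so the first entry is the newest
            newest.append(inner[0]["_source"])
    return newest


if __name__ == "__main__":
    print(most_recent_sources(SAMPLE_RESPONSE))
```

Note that the top-level _source of each hit is the best-scoring record of the group, not necessarily the newest one, which is why the inner hit is read instead.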

Elasticsearch filter by multiple fields in an object which is in an array field

The goal is to filter products with multiple prices.
The data looks like this:
{
"name":"a",
"price":[
{
"membershipLevel":"Gold",
"price":"5"
},
{
"membershipLevel":"Silver",
"price":"50"
},
{
"membershipLevel":"Bronze",
"price":"100"
}
]
}
I would like to filter by membershipLevel and price. For example, if I am a silver member and query price range 0-10, the product should not appear, but if I am a gold member, the product "a" should appear. Is this kind of query supported by Elasticsearch?
You need to map price as a nested datatype and use a nested query for your use case.
Please see the mapping, sample document, query and response below:
Mapping:
PUT my_price_index
{
"mappings": {
"properties": {
"name":{
"type":"text"
},
"price":{
"type":"nested",
"properties": {
"membershipLevel":{
"type":"keyword"
},
"price":{
"type":"double"
}
}
}
}
}
}
Sample Document:
POST my_price_index/_doc/1
{
"name":"a",
"price":[
{
"membershipLevel":"Gold",
"price":"5"
},
{
"membershipLevel":"Silver",
"price":"50"
},
{
"membershipLevel":"Bronze",
"price":"100"
}
]
}
Query:
POST my_price_index/_search
{
"query": {
"nested": {
"path": "price",
"query": {
"bool": {
"must": [
{
"term": {
"price.membershipLevel": "Gold"
}
},
{
"range": {
"price.price": {
"gte": 0,
"lte": 10
}
}
}
]
}
},
"inner_hits": {}
}
}
}
The above query returns all the documents that have a nested price entry whose price.price is in the range 0 to 10 and whose price.membershipLevel is Gold.
Notice that I've made use of inner_hits. The reason is that, even though price is a nested field, Elasticsearch returns the entire parent document in the response rather than only the nested object that the query clause matched.
To find the exact nested object that matched, you need to use inner_hits.
Below is what the response looks like.
Response:
{
"took" : 128,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808291,
"hits" : [
{
"_index" : "my_price_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.9808291,
"_source" : {
"name" : "a",
"price" : [
{
"membershipLevel" : "Gold",
"price" : "5"
},
{
"membershipLevel" : "Silver",
"price" : "50"
},
{
"membershipLevel" : "Bronze",
"price" : "100"
}
]
},
"inner_hits" : {
"price" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.9808291,
"hits" : [
{
"_index" : "my_price_index",
"_type" : "_doc",
"_id" : "1",
"_nested" : {
"field" : "price",
"offset" : 0
},
"_score" : 1.9808291,
"_source" : {
"membershipLevel" : "Gold",
"price" : "5"
}
}
]
}
}
}
}
]
}
}
Hope this helps!
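To see why the nested mapping matters here, the following pure-Python sketch emulates the two behaviours (it does not call Elasticsearch). With the default object mapping, an array of objects is flattened into one list of values per field, so the link between a membership level and its price is lost; a nested query checks both conditions on the same object. The function names are hypothetical and the prices are written as numbers for simplicity.

```python
product = {
    "name": "a",
    "price": [
        {"membershipLevel": "Gold", "price": 5},
        {"membershipLevel": "Silver", "price": 50},
        {"membershipLevel": "Bronze", "price": 100},
    ],
}


def matches_flattened(doc: dict, level: str, low: float, high: float) -> bool:
    """Emulate the default object mapping: each field becomes a flat list of values."""
    levels = [p["membershipLevel"] for p in doc["price"]]
    prices = [p["price"] for p in doc["price"]]
    return level in levels and any(low <= v <= high for v in prices)


def matches_nested(doc: dict, level: str, low: float, high: float) -> bool:
    """Emulate a nested query: both conditions must hold on the same object."""
    return any(
        p["membershipLevel"] == level and low <= p["price"] <= high
        for p in doc["price"]
    )


if __name__ == "__main__":
    for level in ("Gold", "Silver"):
        print(level, matches_flattened(product, level, 0, 10), matches_nested(product, level, 0, 10))
```

In this emulation a Silver member searching the range 0-10 matches under the flattened model (Silver exists, and some price is within range) but not under the nested model, which is the behaviour the question asks for.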
Let me show you how to do it using nested fields together with query and filter context. I will use your example to show how to define the index mapping, index sample documents, and write the search query.
It's important to note the include_in_parent parameter in the Elasticsearch mapping, which lets us query these nested fields without using the nested query.
Please refer to the Elasticsearch documentation about it:
If true, all fields in the nested object are also added to the parent
document as standard (flat) fields. Defaults to false.
Index Def
{
"mappings": {
"properties": {
"product": {
"type": "nested",
"include_in_parent": true
}
}
}
}
Index sample docs
{
"product": {
"price" : 5,
"membershipLevel" : "Gold"
}
}
{
"product": {
"price" : 50,
"membershipLevel" : "Silver"
}
}
{
"product": {
"price" : 100,
"membershipLevel" : "Bronze"
}
}
Search query to show Gold with price range 0-10
{
"query": {
"bool": {
"must": [
{
"match": {
"product.membershipLevel": "Gold"
}
}
],
"filter": [
{
"range": {
"product.price": {
"gte": 0,
"lte" : 10
}
}
}
]
}
}
}
Result
"hits": [
{
"_index": "so-60620921-nested",
"_type": "_doc",
"_id": "1",
"_score": 1.0296195,
"_source": {
"product": {
"price": 5,
"membershipLevel": "Gold"
}
}
}
]
Search query to exclude Silver, with same price range
{
"query": {
"bool": {
"must": [
{
"match": {
"product.membershipLevel": "Silver"
}
}
],
"filter": [
{
"range": {
"product.price": {
"gte": 0,
"lte" : 10
}
}
}
]
}
}
}
The above query doesn't return any results, as no document matches.
P.S. This SO answer might help you understand nested fields and how to query them in detail.
You have to use nested fields and a nested query to achieve this: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
Define your price property with type "nested" and then you will be able to filter on every property of the nested object.

Query unsold products from ElasticSearch

I'm new to Elasticsearch and need some help.
I have an index called sales containing records of this type:
{product.id : 12 , sales.datetime : October 12th 2019},
{product.id : 13 , sales.datetime : October 12th 2019},
{product.id : 14 , sales.datetime : October 13th 2019},
{product.id : 14 , sales.datetime : October 14th 2019},
{product.id : 14 , sales.datetime : October 14th 2019},
{product.id : 13 , sales.datetime : October 18th 2019},
...
I would like to retrieve the products that have not been sold since October 12th 2019.
I tried to filter on sales.datetime with:
"range" => [
"datetime" => [
"lt" => "October 12th 2019"
]
]
But obviously, this type of query returns unexpected values. For example, in this case it retrieves product.id : [12,13], but as you can see, product.id 13 was sold again on October 18th 2019...
Mappings:
PUT sales
{
"mappings": {
"properties": {
"id":{
"type": "keyword"
},
"date":{
"type": "date",
"format": "MM-dd-yyyy"
}
}
}
}
Data:
[
{
"_index" : "sales",
"_type" : "_doc",
"_id" : "I-6Y7W0B_-hMjUaqpQH4",
"_score" : 1.0,
"_source" : {
"id" : 12,
"date" : "10-12-2019"
}
},
{
"_index" : "sales",
"_type" : "_doc",
"_id" : "JO6Y7W0B_-hMjUaqtwEL",
"_score" : 1.0,
"_source" : {
"id" : 13,
"date" : "10-12-2019"
}
},
{
"_index" : "sales",
"_type" : "_doc",
"_id" : "Je6Y7W0B_-hMjUaqxgEH",
"_score" : 1.0,
"_source" : {
"id" : 13,
"date" : "10-18-2019"
}
}
]
Query: for each id, get the max date over all of its documents, and the max date over only the documents dated on or before 10-12-2019.
If both are the same, keep the bucket.
GET sales/_search
{
"size": 0,
"aggs": {
"transactionId": {
"terms": {
"field": "id",
"size": 10000
},
"aggs": {
"maxDate": {
"max": {
"field": "date"
}
},
"pending_status": {
"filter": {
"range": {
"date": {
"lte": "10-12-2019"
}
}
},
"aggs": {
"filtered_maxdate": {
"max": {
"field": "date"
}
}
}
},
"buckets_latest_status_pending": {
"bucket_selector": {
"buckets_path": {
"filtereddate": "pending_status>filtered_maxdate",
"maxDate": "maxDate"
},
"script": "params.filtereddate==params.maxDate"
}
}
}
}
}
}
Response:
[
{
"key" : "12",
"doc_count" : 1,
"pending_status" : {
"doc_count" : 1,
"filtered_maxdate" : {
"value" : 1.5708384E12,
"value_as_string" : "10-12-2019"
}
},
"maxDate" : {
"value" : 1.5708384E12,
"value_as_string" : "10-12-2019"
}
}
]
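The logic of the aggregation can be easier to follow in plain code. The sketch below is a pure-Python emulation of the three steps (terms bucket per id, the two max values, and the bucket_selector comparison); it does not call Elasticsearch, and the function name is hypothetical. The sample records mirror the data shown above.

```python
from collections import defaultdict
from datetime import date

sales = [
    {"id": 12, "date": date(2019, 10, 12)},
    {"id": 13, "date": date(2019, 10, 12)},
    {"id": 13, "date": date(2019, 10, 18)},
]


def unsold_since(records: list[dict], cutoff: date) -> list[int]:
    """Return ids whose latest sale is on or before cutoff."""
    by_id = defaultdict(list)
    for r in records:
        by_id[r["id"]].append(r["date"])  # terms aggregation on id

    result = []
    for pid, dates in by_id.items():
        max_date = max(dates)  # maxDate
        filtered = [d for d in dates if d <= cutoff]  # pending_status filter
        # Ids with no sale on or before the cutoff are not returned.
        if filtered and max(filtered) == max_date:  # bucket_selector script
            result.append(pid)
    return sorted(result)


if __name__ == "__main__":
    print(unsold_since(sales, date(2019, 10, 12)))
```

Product 13 is dropped because its overall latest sale (10-18-2019) differs from its latest sale on or before the cutoff (10-12-2019), which is the same comparison the bucket_selector script makes.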
EDIT 1:
You can use the top_hits aggregation (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html) to get the documents in a bucket:
GET sales/_search
{
"size": 0,
"aggs": {
"transactionId": {
"terms": {
"field": "id",
"size": 10000
},
"aggs": {
"maxDate": {
"max": {
"field": "date"
}
},
"pending_status": {
"filter": {
"range": {
"date": {
"lte": "10-12-2019"
}
}
},
"aggs": {
"filtered_maxdate": {
"max": {
"field": "date"
}
}
}
},
"buckets_latest_status_pending": {
"bucket_selector": {
"buckets_path": {
"filtereddate": "pending_status>filtered_maxdate",
"maxDate": "maxDate"
},
"script": "params.filtereddate==params.maxDate"
}
},
"top_hits":{
"top_hits": {
"size": 10
}
}
}
}
}
}
For pagination you can use a composite aggregation, or terms partitioning (include with partition and num_partitions).
Let us take a step back and understand what we are trying to do.
First, we want to filter the documents. Second, we want the filter logic to depend on the values of other documents, which is essentially a self-join scenario. I don't think that is possible with the way your documents are ingested. You can read more on join here.
As the simplest solution, you would need to change the document structure to something like the one below. Of course, you could also choose to use the nested type, but I think the approach below is much simpler.
Mapping:
PUT sales
{
"mappings": {
"properties": {
"id":{
"type": "keyword"
},
"date":{
"type": "date",
"format": "MM-dd-yyyy"
}
}
}
}
Sample Documents (note that date is now an array holding every sale date of the product):
POST sales/_doc/1
{
"id" : 12 ,
"date" : ["10-12-2019", "10-18-2019"]
}
POST sales/_doc/2
{
"id" : 13 ,
"date" : ["10-12-2019"]
}
POST sales/_doc/3
{
"id" : 14 ,
"date" : ["10-10-2019", "10-14-2019"]
}
Query:
POST sales/_search
{
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"lte": "10-12-2019"
}
}
}
],
"must_not": [
{
"range": {
"date": {
"gt": "10-12-2019"
}
}
}
]
}
}
}
Note how the query has been constructed in a very simplistic fashion using must and must_not clauses.
Response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "sales",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"id" : 13,
"date" : [
"10-12-2019"
]
}
}
]
}
}
Note that in the response, you get only the document you are looking for.
Hope that helps!
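The reason this works is that a range query on a multi-valued field matches when any one of the values satisfies it. The pure-Python sketch below emulates the must and must_not clauses on the three sample documents above; it does not call Elasticsearch and the function name is hypothetical.

```python
from datetime import date

docs = [
    {"id": 12, "date": [date(2019, 10, 12), date(2019, 10, 18)]},
    {"id": 13, "date": [date(2019, 10, 12)]},
    {"id": 14, "date": [date(2019, 10, 10), date(2019, 10, 14)]},
]


def unsold_after(documents: list[dict], cutoff: date) -> list[int]:
    """Emulate the bool query: must (any date <= cutoff), must_not (any date > cutoff)."""
    hits = []
    for doc in documents:
        has_sale_up_to_cutoff = any(d <= cutoff for d in doc["date"])  # must
        has_later_sale = any(d > cutoff for d in doc["date"])  # must_not
        if has_sale_up_to_cutoff and not has_later_sale:
            hits.append(doc["id"])
    return hits


if __name__ == "__main__":
    print(unsold_after(docs, date(2019, 10, 12)))
```

Together the two clauses mean "at least one sale on or before the cutoff and none after it", so only the document with id 13 remains, matching the response above.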

Elasticsearch Match Date Range or Number in Array

My goal is to filter my records by date and by day of the week (Mon = 1, Tue = 2, Wed = 3, ..., Sun = 7). Either the date should fall within the range or the weekday should match any of the days in the array. Or both, of course. I am new to Elasticsearch and seem to have a number of mistakes in my query. I have documented everything here as far as I got, and hope for a couple of helpful insights. Thanks in advance.
Current Mapping
{
"index":{
"mappings":{
"entity":{
"_meta":{
"model":"AppBundle\\Entity\\Entity"
},
"properties":{
"subEntity":{
"properties":{
"date":{
"type":"date",
"format":"strict_date_optional_time||epoch_millis"
},
"days":{
"properties":{
"day":{
"type":"string"
}
}
}
}
}
}
}
}
}
}
Current Records
curl -XGET 'localhost:9200/index/_search?pretty=1'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [ {
"_index" : "index",
"_type" : "entity",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"subEntity" : [ {
"date" : "2016-09-20T00:00:00+02:00",
"days" : [ ]
}, {
"date" : "2016-09-21T00:00:00+02:00",
"days" : [ ]
}, {
"date" : "2016-09-22T00:00:00+02:00",
"days" : [ {
"day" : 4
}, {
"day" : 5
}, {
"day" : 6
} ]
}, {
"date" : "2016-09-20T00:00:00+02:00",
"days" : [ ]
} ]
}
},
[...]
}
}
Current Request
{
"query":{
"should":{
"filter":[ {
"range":{
"entity.subEntity.date":{
"gte":"2016-09-20",
"lte":"2016-09-21"
}
}
}, {
"term":{
"entity.subEntity.days.day": 2
}
} ]
}
}
}
MySQL Equivalent
SELECT entity
FROM entity
LEFT JOIN subEntity ON (subEntity.entity_id = entity.id)
LEFT JOIN day ON (day.subEntity_id = subEntity.id)
WHERE subEntity.date BETWEEN '2016-09-20' AND '2016-09-21'
OR day = 2
If you want to query across properties of a sub-object within a document (where a document may have a collection of such sub-objects), you need to map subEntity as a nested type. In your example, since you are only looking for documents that are within the date range or match the day value, you can use an object mapping as you have now; but if you need to combine the conditions with an AND operation on the same sub-object, you need a nested type mapping. If you expect to need this, it makes sense to map subEntity as nested. Additionally, since day is a numeric value, you should map it as a byte.
{
"index":{
"mappings":{
"entity":{
"_meta":{
"model":"AppBundle\\Entity\\Entity"
},
"properties":{
"subEntity":{
"type": "nested",
"properties":{
"date":{
"type":"date",
"format":"strict_date_optional_time||epoch_millis"
},
"days":{
"properties":{
"day":{
"type":"byte"
}
}
}
}
}
}
}
}
}
}
Now that subEntity is mapped as a nested type, a nested query needs to be used to query against it, so the query becomes
{
"query": {
"nested": {
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"range": {
"subEntity.date": {
"gte": "2016-09-20",
"lte": "2016-09-21"
}
}
}
]
}
},
{
"bool": {
"filter": [
{
"terms": {
"subEntity.days.day": [
2
]
}
}
]
}
}
]
}
},
"path": "subEntity"
}
}
}
Both queries are issued as bool filter queries because we don't need to calculate a relevancy score for either; we simply need to know whether a document matches, i.e. a simple yes/no answer. Wrapping a query in a bool filter means that the query runs in a filter context.
Next, either query can match, so we add both as should clauses to an outer bool query.
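If the date range and the weekdays vary per request, the same structure can be generated rather than written by hand. The following Python sketch only builds the request body described above; the function name and its parameters are hypothetical, and the field names are those of the nested mapping.

```python
import json


def date_or_day_query(start: str, end: str, days: list[int]) -> dict:
    """Build a nested query matching a date range OR any of the given weekdays."""
    # Each condition runs in filter context, so no relevancy score is computed.
    date_clause = {
        "bool": {"filter": [{"range": {"subEntity.date": {"gte": start, "lte": end}}}]}
    }
    day_clause = {"bool": {"filter": [{"terms": {"subEntity.days.day": days}}]}}
    return {
        "query": {
            "nested": {
                "path": "subEntity",
                # Either clause may match, hence should.
                "query": {"bool": {"should": [date_clause, day_clause]}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(date_or_day_query("2016-09-20", "2016-09-21", [2]), indent=2))
```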
As a complete example:
Create index and mapping
PUT http://localhost:9200/entities?pretty=true
{
"settings": {
"index.number_of_replicas": 0,
"index.number_of_shards": 1
},
"mappings": {
"entity": {
"properties": {
"id": {
"type": "integer"
},
"subEntity": {
"type": "nested",
"properties": {
"date": {
"type": "date"
},
"days": {
"properties": {
"day": {
"type": "short"
}
},
"type": "object"
}
}
}
}
}
}
}
Bulk index four entities
POST http://localhost:9200/_bulk?pretty=true
{"index":{"_index":"entities","_type":"entity","_id":"1"}}
{"subEntity":{"date":"2016-09-19T05:00:00+00:00"}}
{"index":{"_index":"entities","_type":"entity","_id":"2"}}
{"subEntity":{"date":"2016-09-20T05:00:00+00:00"}}
{"index":{"_index":"entities","_type":"entity","_id":"3"}}
{"subEntity":{"date":"2016-09-18T18:00:00+00:00","days":[{"day":2},{"day":5}]}}
{"index":{"_index":"entities","_type":"entity","_id":"4"}}
{"subEntity":{"date":"2016-09-18T18:00:00+00:00","days":[{"day":3},{"day":4}]}}
Issue the search query above
POST http://localhost:9200/entities/entity/_search?pretty=true
{
"query": {
"nested": {
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"range": {
"subEntity.date": {
"gte": "2016-09-20",
"lte": "2016-09-21"
}
}
}
]
}
},
{
"bool": {
"filter": [
{
"terms": {
"subEntity.days.day": [
2
]
}
}
]
}
}
]
}
},
"path": "subEntity"
}
}
}
We should only get back entities with ids 2 and 3; id 2 matches on date and id 3 matches on day
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.0,
"hits" : [ {
"_index" : "entities",
"_type" : "entity",
"_id" : "2",
"_score" : 0.0,
"_source" : {
"subEntity" : {
"date" : "2016-09-20T05:00:00+00:00"
}
}
}, {
"_index" : "entities",
"_type" : "entity",
"_id" : "3",
"_score" : 0.0,
"_source" : {
"subEntity" : {
"date" : "2016-09-18T18:00:00+00:00",
"days" : [ {
"day" : 2
}, {
"day" : 5
} ]
}
}
} ]
}
}
Your solution could easily be achieved using the "or" query, but from ES 2.0.0 onwards the "or" query is deprecated. Instead of the "or" query, we can now use the "bool" query. A sample query is given below:
{
"query": {
"bool" : {
"should" : [
{
"term" : { "CREAT_DT": "2015-11-03T07:49:07.000Z" }
},
{
"term" : { "TableName": "dwd" }
}
],
"minimum_should_match" : 1,
"boost" : 1.0
}
}
}
More details about its use can be found at the link below:
https://www.elastic.co/guide/en/elasticsearch/reference/2.0/query-dsl-bool-query.html

Resources