elasticsearch find doc by time with datetime field - elasticsearch

I'm trying to retrieve all documents whose date falls between two dates and whose time of day falls between two hours.
I can't get the query to work.
Is this possible? If so, how?
[
{
"_index": "a1",
"_type": "_doc",
"_id": "50c09e31-1fad-4d25-ab9d-35154a1b765b",
"_score": 5.0,
"_source":
{
"start_at": "2022-06-23 14:00",
"end_at": "2022-06-23 14:15",
...
}
},
{
"_index": "a1",
"_type": "_doc",
"_id": "d96ba291-63de-422a-9123-3d1a1d573861",
"_score": 5.0,
"_source":
{
"start_at": "2022-06-24 16:30",
"end_at": "2022-06-24 17:00",
...
}
}
]
GET /a1/_search?pretty
{
"query": {
"bool": {
"must": [
{
"range": {
"start_at": {
"gte": "2022-06-20",
"format": "yyyy-MM-dd"
}
}
},
{
"range": {
"start_at": {
"lt": "2022-06-27",
"format": "yyyy-MM-dd"
}
}
},
{
"range": {
"start_at": {
"gte": "14:00",
"format": "HH:mm"
}
}
},
{
"range": {
"start_at": {
"lt": "18:00",
"format": "HH:mm"
}
}
}
]
}
},
"size": 10
}
Thanks.

The immediate solution is to use a script query similar to this one, changing the script part to:
doc['start_at'].value.getHourOfDay() ...
Because scripting can hurt performance, a better solution is to index the hour into a dedicated field and then run a range query on that field.
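The dedicated-field approach could be sketched in Python roughly as follows. This is only an illustration under assumptions: the field name `start_hour` and the helpers `with_start_hour` and `build_query` are hypothetical, and the timestamps are assumed to follow the "yyyy-MM-dd HH:mm" layout seen in the sample documents.

```python
from datetime import datetime

# Matches values such as "2022-06-23 14:00" from the sample documents.
SOURCE_FORMAT = "%Y-%m-%d %H:%M"


def with_start_hour(doc):
    """Return a copy of doc with a hypothetical integer field `start_hour`."""
    start = datetime.strptime(doc["start_at"], SOURCE_FORMAT)
    return {**doc, "start_hour": start.hour}


def build_query(date_from, date_to, hour_from, hour_to):
    """Combine a date range on start_at with an hour range on start_hour."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "range": {
                            "start_at": {
                                "gte": date_from,
                                "lt": date_to,
                                "format": "yyyy-MM-dd",
                            }
                        }
                    },
                    # Plain integer range on the precomputed hour (assumed field).
                    {"range": {"start_hour": {"gte": hour_from, "lt": hour_to}}},
                ]
            }
        },
        "size": 10,
    }


doc = with_start_hour({"start_at": "2022-06-23 14:00", "end_at": "2022-06-23 14:15"})
query = build_query("2022-06-20", "2022-06-27", 14, 18)
```

The clauses sit in `filter` rather than `must` because no relevance score is needed here; the documents would have to be indexed through something like `with_start_hour` before the hour range can match.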

Related

conditionally query for fields in elasticsearch

I'm new to Elasticsearch. Before posting this question I googled for help, but I still don't understand how to write the query I need.
My problem is that I have a set of documents I want to query. Some of them have the field "dueDate" and some have "plannedCompleteDate", but no single document has both. So I want to write a query that conditionally checks for either field and returns all matching documents.
For example, below I'm providing a sample document of each type. My query should return both of them, so it needs to check for field existence and return the document.
"_source": {
...
"plannedCompleteDate": "2019-06-30T00:00:00.000Z",
...
}
"_source": {
...
"dueDate": "2019-07-26T07:00:00.000Z",
...
}
You can combine range queries inside a boolean query to achieve your use case.
Here is a working example with index mapping, data, search query, and search result:
Index Mapping:
{
"mappings": {
"properties": {
"plannedCompleteDate": {
"type": "date",
"format": "yyyy-MM-dd"
},
"dueDate": {
"type": "date",
"format": "yyyy-MM-dd"
}
}
}
}
Index Data:
{
"plannedCompleteDate": "2019-05-30"
}
{
"plannedCompleteDate": "2020-06-30"
}
{
"dueDate": "2020-05-30"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"range": {
"plannedCompleteDate": {
"gte": "2020-01-01",
"lte": "2020-12-31"
}
}
},
{
"range": {
"dueDate": {
"gte": "2020-01-01",
"lte": "2020-12-31"
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "65808850",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"plannedCompleteDate": "2020-06-30"
}
},
{
"_index": "65808850",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"dueDate": "2020-05-30"
}
}
]
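Since the question asks about field existence rather than a date window, a variant using `exists` clauses may also be worth considering. The sketch below only builds the request body as a Python dict; the helper name `either_field_exists` is hypothetical, and the field names are taken from the sample documents.

```python
def either_field_exists(*fields):
    """Build a bool query matching documents that contain at least one field."""
    return {
        "query": {
            "bool": {
                # One exists clause per candidate field.
                "should": [{"exists": {"field": name}} for name in fields],
                # Require at least one of the should clauses to match.
                "minimum_should_match": 1,
            }
        }
    }


query = either_field_exists("plannedCompleteDate", "dueDate")
```

The `exists` clauses can be mixed with the `range` clauses above if both a presence check and a date window are needed.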

Elasticsearch - Trouble querying for exact date with range query

I have the following mapping definition in my events index:
{
"events": {
"mappings": {
"properties": {
"data": {
"properties": {
"reportDate": {
"type": "date",
"format": "M/d/YYYY"
}
}
}
}
}
}
And an example doc:
{
"_index": "events",
"_type": "_doc",
"_id": "12345",
"_version": 1,
"_seq_no": 90,
"_primary_term": 1,
"found": true,
"_source": {
"data": {
"reportDate": "12/4/2018"
}
}
}
My goal is to query for docs with an exact data.reportDate of 12/4/2018, but when I run this query:
{
"query": {
"range": {
"data.reportDate": {
"lte": "12/4/2018",
"gte": "12/4/2018",
"format": "M/d/YYYY"
}
}
}
}
I instead get all of the docs that have a data.reportDate that is in the year 2018, not just 12/4/2018. I've tried setting relation to CONTAINS and WITHIN with no luck. Any ideas?
You need to change your date format from M/d/YYYY to M/d/yyyy. Refer to this official ES documentation to learn more about date formats. You can also refer to this documentation on the difference between yyyy and YYYY:
yyyy specifies the calendar year whereas YYYY specifies the year (of
“Week of Year”)
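The gap between a calendar year and a week-based year can be illustrated in Python. This is an analogy only: Python's ISO week-year is used here as a stand-in for the week-based year behind `YYYY`, whose exact week rules in Java depend on the locale.

```python
from datetime import date

d = date(2018, 12, 31)

calendar_year = d.year          # what yyyy refers to
week_year = d.isocalendar()[0]  # ISO week-based year, analogous to YYYY

print(calendar_year, week_year)  # prints "2018 2019"
```

December 31, 2018 is a Monday that belongs to ISO week 1 of 2019, so the two notions of "year" disagree near year boundaries.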
Here is a working example with index mapping, data, search query, and search result:
Index Mapping:
{
"mappings": {
"properties": {
"data": {
"properties": {
"reportDate": {
"type": "date",
"format": "M/d/yyyy"
}
}
}
}
}
}
Index Data:
{
"data": {
"reportDate": "12/3/2018"
}
}
{
"data": {
"reportDate": "12/4/2018"
}
}
{
"data": {
"reportDate": "12/5/2018"
}
}
Search Query:
{
"query": {
"bool": {
"must": {
"range": {
"data.reportDate": {
"lte": "12/4/2018",
"gte": "12/4/2018"
}
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65312594",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"data": {
"reportDate": "12/4/2018"
}
}
}
]

Elasticsearch Date parsing error in 7.x version

I'm using Elasticsearch 7.1 and I have defined the format in my index mappings as below:
"ManufacturerDate": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'|| yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
}
But I'm getting a date parsing error when searching against the date "2020-07-09T00:12:22.011-00:00". The format yyyy-MM-dd'T'HH:mm:ss.SSSXXX is already defined as one of the accepted formats.
The error is:
Failed to parse date field [2020-07-09T00:12:22.011-00:00] with format [yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSSXXX]:
Can anyone please help?
Adding a working example with mapping and search query.
To learn more about the date data type, refer to this documentation.
The search query below finds exact date values.
To return documents whose values fall within a provided range, refer to this.
Mapping :
{
"mappings": {
"properties": {
"ManufacturerDate": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
}
}
}
}
Search Query:
{
"query": {
"term": {
"ManufacturerDate": {
"value": "2020-07-09T00:12:22.011-00:00"
}
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
}
]
Update 1:
You can also use a constant score query.
Search query:
{
"query": {
"constant_score": {
"filter": {
"term": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
},
"boost": 1.2
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.2,
"_source": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
}
]
Update 2: Changing the order of the patterns makes the query work (using ES version 7.2).
Mapping:
{
"mappings": {
"properties": {
"ManufacturerDate": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSXXX||yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSS"
}
}
}
}
Index data:
{
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
Search Query:
{
"query": {
"constant_score": {
"filter": {
"term": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
},
"boost": 1.2
}
}
}
Search Result :
"hits": [
{
"_index": "my_index5",
"_type": "_doc",
"_id": "1",
"_score": 1.2,
"_source": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
}
]

Complex aggregations with Elastic Search

Supposing this is my elasticsearch structure:
{
"_index": "my_index",
"_type": "person",
"_id": "ID",
"_source": {
...DATA...
}
}
{
"_index": "my_index",
"_type": "result",
"_id": "ID",
"_source": {
"personID": "personID",
"date": "timestamp",
"result": "integer",
"speciality": "categoryID"
}
}
I would like to get the 10 most influential people based on:
number of competitions in the last 30 days
number of competitions in the last year
competition results in the last 30 days
number of different specialities in the last 30 days
I'm thinking about using _score, but I don't know how to influence the score using values aggregated from the documents of type "result". This is what I'm trying to achieve:
POST my_index/_search?search_type=dfs_query_then_fetch
{
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"_type": {
"value": "person"
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": {
"query": {
// competitions in the last 30 days
},
"aggs": {
// count
}
},
"factor": 1
},
"weight": 0.1
}
]
}
}
}
Is this possible with just 1 request?
Is this a good approach?
Any tip on what to look at is appreciated

Optimize MLT elasticsearch query

I want to run a more_like_this query, so I use this (with the Python wrapper for Elasticsearch):
{
"query": {
"more_like_this": {
"fields": ["title", "content"],
"docs": [
{
"_index": "kavosh",
"_type": "articles",
"_id": str(news_id)
}
]
}
},
"size": 1,
}
But I get many timeouts, so I decided to restrict the MLT search to the last week (is that effective?), for example by adding this:
{
"range": {
"publication_date": {
"lte": now,
"gte": now - 1week
}
}
}
How can I apply this filter to the MLT query, and do you have any suggestions for optimizing the query?
You can use the query below:
{
"query": {
"filtered": {
"query": {
"more_like_this": {
"fields": [
"title",
"content"
],
"docs": [
{
"_index": "kavosh",
"_type": "articles",
"_id": str(news_id)
}
]
}
},
"filter": {
"range": {
"publication_date": {
"lte": "now",
"gte": "now-1w"
}
}
}
}
}
}
Hope it helps.
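On newer Elasticsearch versions the `filtered` query is no longer available (it was removed in 5.0), and `docs` was superseded by `like`. A rough equivalent using `bool` with a `filter` clause might look like the sketch below; the helper name `build_mlt_query` and its parameters are hypothetical, and `_type` is left out on the assumption of a typeless index.

```python
def build_mlt_query(news_id, index="kavosh", since="now-1w"):
    """Build a more_like_this query restricted to recently published documents."""
    return {
        "query": {
            "bool": {
                "must": {
                    "more_like_this": {
                        "fields": ["title", "content"],
                        # `like` takes the reference document(s) to compare against.
                        "like": [{"_index": index, "_id": str(news_id)}],
                    }
                },
                # Filter context: narrows the candidates without affecting scores.
                "filter": {
                    "range": {
                        "publication_date": {"gte": since, "lte": "now"}
                    }
                },
            }
        },
        "size": 1,
    }


body = build_mlt_query(42)
```

The resulting dict would be passed as the request body of a search call; `"now-1w"` is Elasticsearch date math for one week before the current time.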
