Elasticsearch Date parsing error in 7.x version

I'm using Elasticsearch 7.1 and have defined the date format in my index mapping as below:
"ManufacturerDate": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'|| yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
}
But I'm getting a date parsing error when searching for the date "2020-07-09T00:12:22.011-00:00", even though yyyy-MM-dd'T'HH:mm:ss.SSSXXX is already defined as one of the accepted formats.
The error is:
Failed to parse date field [2020-07-09T00:12:22.011-00:00] with format [yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSSXXX]:
Can anyone please help?

Adding a working example with mapping and search query.
To learn more about the date data type, refer to the Elasticsearch documentation on the date field type.
The search query below finds documents with an exact date value.
To return documents whose values fall within a given range, use a range query instead.
Mapping:
{
"mappings": {
"properties": {
"ManufacturerDate": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
}
}
}
}
Search Query:
{
"query": {
"term": {
"ManufacturerDate": {
"value": "2020-07-09T00:12:22.011-00:00"
}
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
}
]
Update 1:
You can also use a constant score query.
Search query:
{
"query": {
"constant_score": {
"filter": {
"term": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
},
"boost": 1.2
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.2,
"_source": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
}
]
Update 2: Changing the order of the patterns makes the query work (using ES version 7.2).
Mapping:
{
"mappings": {
"properties": {
"ManufacturerDate": {
"type": "date",
"format": "yyyy-MM-dd'T'HH:mm:ss.SSSXXX||yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'||yyyy-MM-dd'T'HH:mm:ss.SSS"
}
}
}
}
Index data:
{
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
Search Query:
{
"query": {
"constant_score": {
"filter": {
"term": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
},
"boost": 1.2
}
}
}
Search Result :
"hits": [
{
"_index": "my_index5",
"_type": "_doc",
"_id": "1",
"_score": 1.2,
"_source": {
"ManufacturerDate": "2020-07-09T00:12:22.011-00:00"
}
}
]
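As a rough illustration of how a multi-pattern format behaves: Elasticsearch tries each ||-separated pattern in turn until one matches. The sketch below mimics that with Python's strptime rather than Elasticsearch's own parser, so the FORMATS list and the parse_first helper are hypothetical stand-ins and only approximate the mapping's patterns. It also shows that a quoted 'ZZ' is literal text, which the sample value does not contain.

```python
from datetime import datetime, timedelta

# Hypothetical strptime approximations of two of the mapping's patterns.
# The quoted 'ZZ' is literal text, so it is written as the characters "ZZ";
# XXX (an offset such as -00:00) is approximated with %z.
FORMATS = [
    "%Y-%m-%dT%H:%M:%S.%fZZ",  # ~ yyyy-MM-dd'T'HH:mm:ss.SSS'ZZ'
    "%Y-%m-%dT%H:%M:%S.%f%z",  # ~ yyyy-MM-dd'T'HH:mm:ss.SSSXXX
]


def parse_first(value, formats=FORMATS):
    """Return (format, datetime) for the first format that parses value."""
    for fmt in formats:
        try:
            return fmt, datetime.strptime(value, fmt)
        except ValueError:
            continue  # this alternative did not match, try the next one
    raise ValueError(f"no format matched {value!r}")


fmt, parsed = parse_first("2020-07-09T00:12:22.011-00:00")
print(fmt)
print(parsed.isoformat())
```

In this sketch the first pattern is rejected because the value has no literal "ZZ", and the offset pattern is the one that parses it.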

Related

conditionally query for fields in elasticsearch

I'm new to Elasticsearch. Before posting this question I searched for help, but I still don't understand how to write the query I need.
I have a set of documents to query. Some of them have the field "DueDate" and others have "PlannedCompletionDate", but no document has both. I want a query that checks whichever of the two fields a document has and returns all matching documents.
For example, below are sample documents of each type. My query should return results from both kinds of document, so it needs to check for field existence and return the document.
"_source": {
...
"plannedCompleteDate": "2019-06-30T00:00:00.000Z",
...
}
"_source": {
...
"dueDate": "2019-07-26T07:00:00.000Z",
...
}
You can combine range queries inside a boolean query to achieve your use case.
Adding a working example with index mapping, data, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"plannedCompleteDate": {
"type": "date",
"format": "yyyy-MM-dd"
},
"dueDate": {
"type": "date",
"format": "yyyy-MM-dd"
}
}
}
}
Index Data:
{
"plannedCompleteDate": "2019-05-30"
}
{
"plannedCompleteDate": "2020-06-30"
}
{
"dueDate": "2020-05-30"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"range": {
"plannedCompleteDate": {
"gte": "2020-01-01",
"lte": "2020-12-31"
}
}
},
{
"range": {
"dueDate": {
"gte": "2020-01-01",
"lte": "2020-12-31"
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "65808850",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"plannedCompleteDate": "2020-06-30"
}
},
{
"_index": "65808850",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"dueDate": "2020-05-30"
}
}
]
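If the goal is only to check that one of the two fields is present, regardless of its value, an exists query inside the same bool/should structure may be closer to what the question asks. Below is a minimal sketch that builds such a request body as a Python dict. The helper name any_field_exists is made up for illustration, and the body has not been run against the index above.

```python
import json


def any_field_exists(*fields):
    """Build a bool query matching documents that contain at least one of the fields."""
    return {
        "query": {
            "bool": {
                # one exists clause per candidate field
                "should": [{"exists": {"field": name}} for name in fields],
                # require at least one of the should clauses to match
                "minimum_should_match": 1,
            }
        }
    }


body = any_field_exists("plannedCompleteDate", "dueDate")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request in the same way as the range query above.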

Elasticsearch - Trouble querying for exact date with range query

I have the following mapping definition in my events index:
{
"events": {
"mappings": {
"properties": {
"data": {
"properties": {
"reportDate": {
"type": "date",
"format": "M/d/YYYY"
}
}
}
}
}
}
}
And an example doc:
{
"_index": "events",
"_type": "_doc",
"_id": "12345",
"_version": 1,
"_seq_no": 90,
"_primary_term": 1,
"found": true,
"_source": {
"data": {
"reportDate": "12/4/2018"
}
}
}
My goal is to query for docs with an exact data.reportDate of 12/4/2018, but when I run this query:
{
"query": {
"range": {
"data.reportDate": {
"lte": "12/4/2018",
"gte": "12/4/2018",
"format": "M/d/YYYY"
}
}
}
}
I instead get all of the docs that have a data.reportDate that is in the year 2018, not just 12/4/2018. I've tried setting relation to CONTAINS and WITHIN with no luck. Any ideas?
You need to change your date format from M/d/YYYY to M/d/yyyy. Refer to the official Elasticsearch documentation to learn more about date formats, and to the documentation on the difference between yyyy and YYYY:
yyyy specifies the calendar year whereas YYYY specifies the year (of “Week of Year”)
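To see how a calendar year and a week-based year can differ, here is a small Python sketch. It uses the ISO week-based year from date.isocalendar() as an analogy only. The Java week-based year behind YYYY can depend on locale settings, so treat this as an illustration of the concept rather than of Elasticsearch's exact behavior. The helper name calendar_and_week_year is made up for the example.

```python
from datetime import date


def calendar_and_week_year(d):
    """Return (calendar year, ISO week-based year) for a date."""
    return d.year, d.isocalendar()[0]


for d in (date(2018, 12, 4), date(2018, 12, 31)):
    print(d.isoformat(), calendar_and_week_year(d))
```

For 2018-12-04 both values are 2018, but 2018-12-31 falls in the first ISO week of 2019, so the two kinds of year disagree near year boundaries.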
Adding a working example with index mapping, data, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"data": {
"properties": {
"reportDate": {
"type": "date",
"format": "M/d/yyyy"
}
}
}
}
}
}
Index Data:
{
"data": {
"reportDate": "12/3/2018"
}
}
{
"data": {
"reportDate": "12/4/2018"
}
}
{
"data": {
"reportDate": "12/5/2018"
}
}
Search Query:
{
"query": {
"bool": {
"must": {
"range": {
"data.reportDate": {
"lte": "12/4/2018",
"gte": "12/4/2018"
}
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "65312594",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"data": {
"reportDate": "12/4/2018"
}
}
}
]

Elasticsearch associating exact match terms

I have a search index of filenames containing over 100,000 entries that share about 500 unique variations of the main filename field. I have recently made some modifications to certain filename values that are being generated from my data. I was wondering if there is a way to link certain queries to return an exact match. In the following query:
"query": {
"bool": {
"must": [
{
"match": {
"filename": "foo-bar"
}
}
],
}
}
how would it be possible to modify the index and associate the results so that above query will also match results foo-bar-baz, but not foo-bar-foo or any other variation?
Thanks in advance for your help
You can use a term query instead of a match query. It is well suited to keyword fields:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html
Adding a working example with index data and search query. (Using the default mapping)
Index Data:
{
"fileName": "foo-bar"
}
{
"fileName": "foo-bar-baz"
}
{
"fileName": "foo-bar-foo"
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"match": {
"fileName.keyword": "foo-bar"
}
},
{
"match": {
"fileName.keyword": "foo-bar-baz"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9808291,
"_source": {
"fileName": "foo-bar-baz"
}
}
]
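A more compact way to express the same idea might be a single terms query on the keyword subfield, listing the exact values to accept. The sketch below builds that body in Python and checks the intended behavior against the three sample values locally. The helper name exact_filename_query is hypothetical, and the local check only imitates exact matching instead of calling Elasticsearch.

```python
import json


def exact_filename_query(*names, field="fileName.keyword"):
    """Build a terms query that matches any of the given exact values."""
    return {"query": {"terms": {field: list(names)}}}


body = exact_filename_query("foo-bar", "foo-bar-baz")
print(json.dumps(body, indent=2))

# Local imitation of exact (keyword) matching on the sample data
sample = ["foo-bar", "foo-bar-baz", "foo-bar-foo"]
accepted = body["query"]["terms"]["fileName.keyword"]
matched = [name for name in sample if name in accepted]
print(matched)
```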

How to change the order of search results on Elastic Search?

I am getting results from following Elastic Search query:
"query": {
"bool": {
"should": [
{"match_phrase_prefix": {"title": keyword}},
{"match_phrase_prefix": {"second_title": keyword}}
]
}
}
The results are good, but I want to change their order so that results with a matching title come first.
Any help would be appreciated!!!
I was able to reproduce the issue with sample data. My solution uses a query-time boost, since index-time boost has been deprecated since Elasticsearch 5.0.
I've also created the sample data so that, without a boost, both documents get the same score, so there is no guarantee that the one matching on title comes first in the search results. This should help you understand it better.
1. Index Mapping
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"second_title" :{
"type" :"text"
}
}
}
}
2. Index Sample docs
a)
{
"title": "opster",
"second_title" : "Dimitry"
}
b)
{
"title": "Dimitry",
"second_title" : "opster"
}
3. Search query
{
"query": {
"bool": {
"should": [
{
"match_phrase_prefix": {
"title": {
"query" : "dimitry",
"boost" : 2.0 <-- Notice the boost in `title` field
}
}
},
{
"match_phrase_prefix": {
"second_title": {
"query" : "dimitry"
}
}
}
]
}
}
}
4. Output
"hits": [
{
"_index": "60454337",
"_type": "_doc",
"_id": "1",
"_score": 1.3862944,
"_source": {
"title": "Dimitry", <-- Dimitry in title field has double score
"second_title": "opster"
}
},
{
"_index": "60454337",
"_type": "_doc",
"_id": "2",
"_score": 0.6931472,
"_source": {
"title": "opster",
"second_title": "Dimitry"
}
}
]
Let me know if you have any trouble understanding it.
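To make the arithmetic behind the two scores explicit, here is a simplified Python model. It assumes each matching should clause contributes its base score multiplied by its boost, and it takes the unboosted score 0.6931472 from the output above. Real scoring also depends on the similarity model and on index statistics, so this is only a sketch. BASE_SCORE and should_score are names invented for the example.

```python
BASE_SCORE = 0.6931472  # unboosted clause score, taken from the output above


def should_score(matches):
    """Sum boost * score over the should clauses that matched."""
    return sum(boost * score for score, boost in matches)


title_hit = should_score([(BASE_SCORE, 2.0)])         # matched on title, boost 2.0
second_title_hit = should_score([(BASE_SCORE, 1.0)])  # matched on second_title, default boost

ranked = sorted(
    {"title match": title_hit, "second_title match": second_title_hit}.items(),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked)
```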

Elastic search ---: MUST_NOT query not working

I have a query to which I want to add a must_not clause that discards all records with blank data for a given field. I tried many approaches but none worked. When I issue the same query (shown below) against other specific fields, it works fine.
This query should return all records whose "registrationType1" field is not empty/blank.
Query:
{
"size": 20,
"_source": [
"registrationType1"
],
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1": ""
}
}
]
}
}
}
The results below still contain "registrationType1" with empty values.
Results:
"_source": {
"registrationType1": ""
}
},
{
"_index": "oh_animal",
"_type": "animals",
"_id": "3842002",
"_score": 1,
"_source": {
"registrationType1": "A&R"
}
},
{
"_index": "oh_animal",
"_type": "animals",
"_id": "3842033",
"_score": 1,
"_source": {
"registrationType1": "AMHA"
}
},
{
"_index": "oh_animal",
"_type": "animals",
"_id": "3842213",
"_score": 1,
"_source": {
"registrationType1": "AMHA"
}
},
{
"_index": "oh_animal",
"_type": "animals",
"_id": "3842963",
"_score": 1,
"_source": {
"registrationType1": ""
}
},
{
"_index": "oh_animal",
"_type": "animals",
"_id": "3869063",
"_score": 1,
"_source": {
"registrationType1": ""
}
}
Please find below the mapping for the field above:
"registrationType1": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
You need to use the keyword subfield in order to do this:
{
"size": 20,
"_source": [
"registrationType1"
],
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1.keyword": "" <-- change this
}
}
]
}
}
}
An empty string in a text field yields no tokens when it is analyzed, so there is nothing in the index for a term query on "" to match, and must_not therefore excludes nothing.
Similarly, if you replace must_not with must, the query returns no results.
Looking at your mapping, what you can do is run the must_not query on the keyword subfield. Keyword fields are not analyzed, so the query returns the results you expect.
Query
POST myemptyindex/_search
{
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1.keyword": ""
}
}
]
}
}
}
Hope this helps!
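The difference between the analyzed text field and the keyword subfield can be imitated locally. The sketch below uses a toy analyzer (lowercased word tokens via a regex) as a stand-in for the real analyzer, so it only approximates what Elasticsearch indexes. The helpers toy_analyze and must_not_term are made up for the example, and the ids and values are taken from the results in the question.

```python
import re


def toy_analyze(text):
    """Very rough stand-in for a text analyzer: lowercased word tokens."""
    return re.findall(r"\w+", text.lower())


docs = {"3842002": "A&R", "3842033": "AMHA", "3842963": ""}


def must_not_term(docs, term, analyzed):
    """Return ids of docs whose indexed terms do NOT include `term`."""
    kept = []
    for doc_id, value in docs.items():
        # text field: index the analyzer's tokens; keyword field: index the raw value
        indexed_terms = toy_analyze(value) if analyzed else [value]
        if term not in indexed_terms:
            kept.append(doc_id)
    return kept


print(must_not_term(docs, "", analyzed=True))   # text field: nothing is excluded
print(must_not_term(docs, "", analyzed=False))  # keyword field: the empty value is excluded
```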
I am using Elasticsearch version 7.2.
I replicated your data, ingested it into my index, and tried querying with and without .keyword.
I get the desired result when using ".keyword" in the field name: it does not return the docs that have registrationType1="".
Note: the query does not work without ".keyword".
I have added my sample code below; have a look to see if it helps.
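from elasticsearch import Elasticsearch
# Connect to the default local node
es = Elasticsearch()
# Create the index; ignore=400 suppresses any 400 response, such as "index already exists"
# Typeless mapping (no "_doc" wrapper) as expected by 7.x; the multi-field key is "fields"
es.indices.create(index="test", ignore=400, body={
"mappings": {
"properties": {
"registrationType1": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
})
# Index one document whose field is an empty string
data = {
"registrationType1": ""
}
# refresh=True makes the document visible to the search that follows
es.index(index="test", body=data, id=1, refresh=True)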
search = es.search(index="test", body={
"size": 20,
"_source": [
"registrationType1"
],
"query": {
"bool": {
"must_not": [
{
"term": {
"registrationType1.keyword": ""
}
}
]
}
}
})
print(search)
Executing the above should not return any results, since the only document inserted has an empty value for the field.
There was an issue with the mappings themselves. I deleted the index and re-indexed it with new mappings, and it is working now.
