How to range query a date mapped as long - elasticsearch

I have an external ES db (i.e. I can't change its structure) with the following mapping:
"failure_url": {
"properties": {
"lastAccessTime": {
"type": "long"
},
"url": {
"type": "keyword"
}
}
}
lastAccessTime represents a date, but is mapped as a long. A standard "range" filter fails with
"caused_by": {
"type": "number_format_exception",
"reason": "For input string: "now""
}
Is the error in my filter expression, or is it due to the field not being a "date"? If the latter, how can I still query this date?

If you want to query your field with now and other date math expressions, then your field must be defined as a date type.
One thing you can do is to create another field of type date and then update your index to populate that new field.
So first change your mapping like this:
PUT my-index/_mapping
{
  "properties": {
    "lastAccessDate": {
      "type": "date"
    }
  }
}
And then you can leverage the update by query API in order to index your dates from the long field:
POST my-index/_update_by_query
{
  "script": {
    "source": "ctx._source.lastAccessDate = ctx._source.lastAccessTime"
  }
}
When that's done, you'll be able to run your range query on the new lastAccessDate field of type date.
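For example, assuming lastAccessTime holds milliseconds since the epoch (which the date type's default epoch_millis format accepts, so the script above copies it as-is), a date math filter like the one that previously failed will now work; the 7-day window here is just an illustration:
GET my-index/_search
{
  "query": {
    "range": {
      "lastAccessDate": {
        "gte": "now-7d/d",
        "lte": "now"
      }
    }
  }
}
If lastAccessTime actually stores seconds or another unit, convert it to milliseconds in the update script before copying.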

Related

Partial search on date fields in elasticsearch

I'm trying to implement partial search on a date field in Elasticsearch. For example, if startDate is stored as "2019-08-28", I should be able to retrieve the same document while querying just "2019", "2019-08", or "2019-0".
For other fields I'm doing this:
{
  "simple_query_string": {
    "fields": [
      "customer"
    ],
    "query": "* Andrew *",
    "analyze_wildcard": "true",
    "default_operator": "AND"
  }
}
which works perfectly on text fields, but the same doesn't work on date fields.
This is the mapping:
{
  "mappings": {
    "properties": {
      "startDate": {
        "type": "date"
      }
    }
  }
}
Any way this can be achieved, be it a change in mapping or another query method? Also, I found this discussion related to partial dates in Elasticsearch; not sure if it's relevant, but here it is:
https://github.com/elastic/elasticsearch/issues/45284
Excerpt from the ES docs:
Internally, dates are converted to UTC (if the time-zone is specified)
and stored as a long number representing milliseconds-since-the-epoch.
It is not possible to search a date field the way we can search a text field. However, we can tell ES to index the date field as both date and text, e.g.:
Index the date field as a multi-field:
PUT sample
{
  "mappings": {
    "properties": {
      "my_date": {
        "type": "date",
        "format": "year_month_day", // <======= yyyy-MM-dd
        "fields": {
          "formatted": {
            "type": "text", // <======= another representation of type text, accessible as my_date.formatted
            "analyzer": "whitespace" // <======= whitespace analyzer (standard would tokenize 2020-01-01 into 2020, 01 & 01)
          }
        }
      }
    }
  }
}
POST sample/_doc
{
  "my_date": "2020-01-01"
}
POST sample/_doc
{
  "my_date": "2019-01-01"
}
Use a wildcard query to search (you can even use n-grams at indexing time for faster searches if required; see the sketch after the query below):
GET sample/_search
{
  "query": {
    "wildcard": {
      "my_date.formatted": {
        "value": "2020-0*"
      }
    }
  }
}
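If wildcards turn out to be too slow, here is a rough sketch of the n-gram idea mentioned above, using an edge_ngram tokenizer (the index, analyzer, and tokenizer names are made up for illustration): prefixes of the formatted date get indexed as tokens, so a prefix lookup becomes a cheap exact-term match.
PUT sample_ngram
{
  "settings": {
    "analysis": {
      "analyzer": {
        "date_prefix_analyzer": {
          "tokenizer": "date_prefix_tokenizer"
        }
      },
      "tokenizer": {
        "date_prefix_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 4,  // <======= shortest searchable prefix, e.g. "2019"
          "max_gram": 10  // <======= full yyyy-MM-dd length
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_date": {
        "type": "date",
        "format": "year_month_day",
        "fields": {
          "formatted": {
            "type": "text",
            "analyzer": "date_prefix_analyzer",
            "search_analyzer": "keyword" // <======= don't n-gram the query itself
          }
        }
      }
    }
  }
}
A plain match query on my_date.formatted for "2019-0" would then hit the indexed prefix token directly, no wildcard needed.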

Cannot create elasticsearch mapping or get date field to work

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Types cannot be provided in put mapping requests, unless the include_type_name parameter is set to true."
      }
    ],
    ...
  },
  "status" : 400
}
I've read that the above is because types are deprecated in Elasticsearch 7.7; is that also the case for data types? I mean, how am I supposed to say I want the data to be considered a date?
My current mapping has this element:
"Time": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
I just wanted to create it again with type: "date", but I noticed that even copy-pasting the current mapping (which works) yields an error... The index and mappings I have are generated automatically by https://github.com/jayzeng/scrapy-elasticsearch
My goal is simply to have a date field. I have all my dates in my index, but when I want to filter in Kibana I can see it is not considered a date field. And modifying the mapping doesn't seem like an option.
Obvious ELK noob here, please bear with me (:
The error is quite ironic because I pasted the mapping from an existing index/mapping...
You're experiencing this because of the breaking change in 7.0 regarding named doc types.
Long story short, instead of
PUT example_index
{
  "mappings": {
    "_doc_type": {
      "properties": {
        "Time": {
          "type": "date",
          "fields": {
            "keyword": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
you should do
PUT example_index
{
  "mappings": { <-- note: no top-level doc type definition
    "properties": {
      "Time": {
        "type": "date",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
I'm not familiar with your scrapy package, but at a glance it looks like a wrapper around the native Python ES client, which should forward your mapping definitions to ES.
Caveat: if you have your field defined as text and you intend to change it to date (or add a field of type date), you'll get the following error:
mapper [Time] of different type, current_type [text], merged_type [date]
You basically have two options:
- drop the index, set the new mapping & reindex everything (easiest but introduces downtime)
- extend the mapping with a new field, let's call it Time_as_datetime, and update your existing docs using a script (introduces a brand new field/property -- that's somewhat verbose); see the sketch below
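A minimal sketch of that second option, assuming your Time values are strings Elasticsearch can parse as dates (the yyyy-MM-dd HH:mm:ss format below is a guess -- adjust it to your actual data):
PUT example_index/_mapping
{
  "properties": {
    "Time_as_datetime": {
      "type": "date",
      "format": "yyyy-MM-dd HH:mm:ss"
    }
  }
}
POST example_index/_update_by_query
{
  "script": {
    "source": "ctx._source.Time_as_datetime = ctx._source.Time"
  }
}
Once the update finishes, point Kibana at Time_as_datetime instead of Time.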

Elastic Search Date Range Query

I am new to Elasticsearch and I am struggling with a date range query. I have to query the records which fall between particular dates. The JSON records pushed into the Elasticsearch database look like this:
"messageid": "Some message id",
"subject": "subject",
"emaildate": "2020-01-01 21:09:24",
"starttime": "2020-01-02 12:30:00",
"endtime": "2020-01-02 13:00:00",
"meetinglocation": "some location",
"duration": "00:30:00",
"employeename": "Name",
"emailid": "abc#xyz.com",
"employeecode": "141479",
"username": "username",
"organizer": "Some name",
"organizer_email": "cde#xyz.com",
I have to query the records whose start time is between "2020-01-02 12:30:00" and "2020-01-10 12:30:00". I have written a query like this:
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "starttime": {
              "gte": "2020-01-02 12:30:00",
              "lte": "2020-01-10 12:30:00"
            }
          }
        }
      ]
    }
  }
}
This query is not giving the expected results. I assume that the person who pushed the data into the Elasticsearch database at my office did not set the mapping, and Elasticsearch dynamically decided the data type of "starttime" to be "text". Hence I am getting inconsistent results.
I can set the mapping like this:
PUT /meetings
{
  "mappings": {
    "dynamic": false,
    "properties": {
      ...
      "starttime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
      ...
    }
  }
}
And then the query would work, but I am not allowed to do so (office policies). What alternatives do I have to achieve my task?
Update:
I assumed the data type to be "text", but by default Elasticsearch applies both "text" and "keyword" so that we can implement both full-text and keyword-based searches. Since it is also set as "keyword", will this benefit me in any case? I do not have access to a lot of things at the office, which is why I am unable to debug the query; I only have the search API, for which I have to build the query.
GET /meetings/_mapping output:
...
"starttime" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}
...
Date range queries will not work on a text field; for that, you have to use a date field.
Since you are working with dates, best practice is to use the date field type.
I would suggest reindexing your index into another index so that you can change the type of your text field to a date field.
Step 1: Create index2 using index1's mapping, and make sure to change the type of your date field from text to date (a minimal sketch follows below).
Step 2: Run the Elasticsearch reindex and reindex all your data from index1 to index2. Since you have changed your field type to date, Elasticsearch will now recognize this field as a date.
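For step 1, assuming starttime uses the yyyy-MM-dd HH:mm:ss layout shown in the question (all other fields omitted here):
PUT index2
{
  "mappings": {
    "properties": {
      "starttime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}
Then, for step 2, run the reindex: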
POST _reindex
{
  "source": { "index": "index1" },
  "dest": { "index": "index2" }
}
Now you can run your normal date queries on index2.
As @jzzfs suggested, the idea is to add a date sub-field to the starttime field. You first need to modify the mapping like this:
PUT meetings/_mapping
{
  "properties": {
    "starttime": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        },
        "date": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}
When done, you need to update your data in place using the update by query API so that the starttime.date field gets populated and indexed:
POST meetings/_update_by_query
When the update is done, you'll be able to leverage the starttime.date sub-field in your query:
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "starttime.date": {
              "gte": "2020-01-02 12:30:00",
              "lte": "2020-01-10 12:30:00"
            }
          }
        }
      ]
    }
  }
}
There are ways of parsing text fields as dates at search time but the overhead is impractical... You could, however, keep the starttime as text by default but make it a multi-field and query it using starttime.as_date, for example.
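For completeness, on Elasticsearch 7.11+ one such search-time option is a runtime field that parses the text on the fly. This is only a hedged sketch (the starttime_as_date name is invented here), and the per-document _source parsing is exactly the overhead being warned about:
GET meetings/_search
{
  "runtime_mappings": {
    "starttime_as_date": {
      "type": "date",
      "script": {
        "source": "emit(new SimpleDateFormat('yyyy-MM-dd HH:mm:ss').parse(params._source.starttime).getTime())"
      }
    }
  },
  "query": {
    "range": {
      "starttime_as_date": {
        "gte": "2020-01-02 12:30:00",
        "lte": "2020-01-10 12:30:00",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}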

Why can't I mix my data types in Elasticsearch?

Currently, I'm trying to implement a variation of the car example here:
https://www.elastic.co/blog/managing-relations-inside-elasticsearch
If I run:
PUT /vehicle/_doc/1
{
  "metadata": [
    {
      "key": "make",
      "value": "Saturn"
    },
    {
      "key": "year",
      "value": "2015"
    }
  ]
}
the code works correctly.
But if I delete the index and change 2015 from a string to a number:
DELETE /vehicle

PUT /vehicle/_doc/1
{
  "metadata": [
    {
      "key": "make",
      "value": "Saturn"
    },
    {
      "key": "year",
      "value": 2015
    }
  ]
}
I get the following error message:
{ "error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "mapper [metadata.value] of different type, current_type [long], merged_type [text]"
}
],
"type": "illegal_argument_exception",
"reason": "mapper [metadata.value] of different type, current_type [long], merged_type [text]" }, "status": 400 }
How do I fix this error?
PUT /vehicle/_doc/1
{
  "metadata": [
    {
      "key": "make",
      "value": "Saturn"
    },
    {
      "key": "year",
      "value": 2015
    }
  ]
}
After deleting the index and then trying to index a new document as above, the following steps take place:
When Elasticsearch cannot find an index named vehicle and auto index creation is enabled (which it is by default), it creates a new index named vehicle.
Based on the input document, Elasticsearch now tries to make a best guess of the data type for each field of the document. This is known as dynamic field mapping.
For the above document, since metadata is an array of objects, the field metadata is assumed to be of the object data type.
Now comes the step of deciding the data types of the individual object's fields. When it encounters the first object, it finds two fields, key and value. Both fields have string values (make and Saturn respectively), and hence Elasticsearch identifies the data type of both fields as text.
Elasticsearch then tries to define the mapping as below:
{
  "properties": {
    "metadata": {
      "properties": {
        "key": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "value": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}
When it encounters the second object, the value of the value field is numeric (2015), so it guesses the data type as long. This conflicts with the previously identified data type, which was text. A field cannot be of mixed data types; data types are strict, and hence the error.
To resolve the error, you need to make sure the input values for the fields are of the same type in each object, as below:
PUT /vehicle/_doc/1
{
  "metadata": [
    {
      "key": "make",
      "value": 2016
    },
    {
      "key": "year",
      "value": 2015
    }
  ]
}
For the above, the data could be better modeled as:
PUT /vehicle/_doc/1
{
  "metadata": [
    {
      "make": "Saturn",
      "year": 2015
    }
  ]
}
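Alternatively, if you control index creation, a hedged option is to map metadata.value explicitly as keyword before indexing anything. Keyword fields accept JSON numbers and store them as strings, so both "Saturn" and 2015 would be accepted, at the cost of numeric range semantics on value:
PUT /vehicle
{
  "mappings": {
    "properties": {
      "metadata": {
        "properties": {
          "key": { "type": "keyword" },
          "value": { "type": "keyword" }
        }
      }
    }
  }
}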

Unable to retrieve a geopoint data field when searching

I have an index with a field called loc which is correctly mapped as a geopoint.
When running a search like:
curl -XGET 'http://localhost:9200/DB/_search'
I get 10 or so results and all of them appear to have loc inside the _source object.
If I try:
curl -XGET 'http://localhost:9200/DB/_search?fields=name'
I get a fields object with the name field correctly set up (name exists, it is another field, it is a string). Thing is, if I try the same thing with the loc field, as in:
curl -XGET 'http://localhost:9200/DB/_search?fields=loc'
I don't get anything back, neither the _source nor the fields objects.
How may I return the loc field when running this query?
Bonus question: Is there a way to return the loc field as a geohash?
Update, here's the mapping:
{
  "geonames": {
    "mappings": {
      "place": {
        "properties": {
          "ele": {
            "type": "string"
          },
          "geoid": {
            "type": "string"
          },
          "loc": {
            "type": "geo_point"
          },
          "name": {
            "type": "string"
          },
          "pop": {
            "type": "string"
          },
          "tz": {
            "type": "string"
          }
        }
      }
    }
  }
}
You should use source filtering instead of fields and you'll get the loc field as you expect.
curl -XGET 'localhost:9200/DB/_search?_source=loc'
Quoting from the official documentation on fields (emphasis added):
The fields parameter is about fields that are explicitly marked as stored in the mapping, which is off by default and generally not recommended. Use source filtering instead to select subsets of the original source document to be returned.
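The same source filtering can also be expressed in the request body, which scales better once the query grows (the match_all here is just a placeholder for your real query):
curl -XGET 'localhost:9200/DB/_search' -d '{
  "_source": ["loc", "name"],
  "query": { "match_all": {} }
}'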
