Is there any way to set a default date range in Elasticsearch?

Is there any way in Elasticsearch to set a default date range for when the to and from fields are null? That is, whenever to and from are empty, Elasticsearch should search using a defined default range. I have written a query, but it only works when to and from are defined:
"range": {
"time": {
"from": "2018-01-16T07:05:00",
"to": "2018-01-16T10:59:09",
"include_lower": true,
"include_upper": true
}
}

There is no default date range in Elasticsearch right now.
If you don't provide a date range filter, Elasticsearch will search all records that match your query, across the entire index or alias you point the search at.
My suggestion: if you want a default time frame in your filter, you have to handle that in your code (i.e., client side).
So when to and from are empty, your program should substitute a time frame, for example the last 30 days.
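For example, when the client sees empty to/from values, it could fall back to a relative range using date math (a sketch; the time field name and the 30-day window are just illustrative defaults, not anything Elasticsearch supplies on its own):

"range": {
  "time": {
    "gte": "now-30d/d",
    "lte": "now"
  }
}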

Related

Elasticsearch date based function scoring boosting the wrong way

I would like to boost scores of documents based on how "recent" a document is. I am trying to do this using a function_score. Here is an example of me doing this on a field called updated_at:
{
  "function_score": {
    "boost_mode": "sum",
    "functions": [
      {
        "exp": {
          "updated_at": {
            "origin": "now",
            "scale": "1h",
            "decay": 0.01
          }
        },
        "weight": 1
      }
    ],
    "query": query
  }
}
I would expect documents close to the datetime now to have a score closer to 1, and documents whose updated_at is about scale away to have a score closer to decay (as described in the docs). Therefore, I'm using the boost_mode sum to keep the original document scores and increase them depending on how close to now the updated_at value is. (Also, the query score is useful, so I would rather add than multiply, which is the default.)
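As a quick sanity check, the documented exp decay (with offset assumed to be 0 here) works out to:

score_exp = exp(λ · |updated_at − origin|),  where λ = ln(decay) / scale

With scale = 1h and decay = 0.01, a document updated right now contributes exp(0) = 1 and one updated exactly an hour ago contributes 0.01, which matches the expectation above that newer documents should score higher.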
To test this scenario, I create a document (A) that returns a query score of about 2. I then duplicate it (B) and modify the new document's updated_at timestamp to be an hour in the past.
In this scenario, I would expect (A) to have a higher score and (B) to have a lower score. However, when I run this scenario, I get the exact opposite. (B) ends up with a score of 3 and (A) ends up with a score of 2.
What am I misunderstanding here to cause this to happen? And how would I modify my function score to do what I would like?
This turned out to be a timezone issue.
I ended up using the explain API to look at what was contributing to the score. When doing that, I noticed that the origin set to now was actually in a different timezone from the one I was setting in the documents.
I fixed this by manually providing a UTC timestamp in the Elasticsearch query rather than using now as the value.
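In query form, the fix amounts to replacing the relative origin with an explicit UTC instant (a sketch; the timestamp below is just a placeholder for "the current time in UTC", which the client would compute):

"exp": {
  "updated_at": {
    "origin": "2018-01-16T10:00:00Z",
    "scale": "1h",
    "decay": 0.01
  }
}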
(If there is a better way to do this, please let me know)

Elastic Search Scoring based on the date time fields

How do I write a custom scoring function in Elasticsearch based on a date field? Can anyone help me with this?
When asc is requested, the other scoring functions should still compute the base score, and then extra score should be added to the documents with the least recent dates; when desc is requested, the score should favor the most recent dates.
I bet what you are looking for is so-called function queries.
In the case of a date field you could use field_value_factor. It will take your date value and transform it into milliseconds (a Unix timestamp). So you should supply something like:
"field_value_factor": {
"field": "your_date_field",
"factor": 1,
"modifier": "none",
"missing": 1
}
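Embedded in a full query, that factor adds the document's timestamp in milliseconds to the score, so more recent documents score higher (a sketch around the snippet above; your_date_field is the placeholder from the answer, and match_all stands in for your real query):

{
  "query": {
    "function_score": {
      "query": { "match_all": {} },
      "boost_mode": "sum",
      "field_value_factor": {
        "field": "your_date_field",
        "factor": 1,
        "modifier": "none",
        "missing": 1
      }
    }
  }
}

For the ascending case, the reciprocal modifier could be swapped in, since 1/x scores smaller (older) timestamps higher.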

ElasticSearch 2.4 date range histogram using the difference between two date fields

I haven't been able to find anything on this problem for ES 2.*, either here or in the docs, so sorry if this is a duplicate.
What I am trying to do is create an aggregation in an ElasticSearch query that will allow me to create buckets based on the difference in a record between 2 date fields.
For example, if I had data in ES for a shop, I might like to see the time difference between a purchase_date field and a shipped_date field.
So in that instance I'd want to create an aggregate that had buckets to give me the hits for when shipped_date - purchase_date is < 1 day, 1-2 days, 3-4 days or 5+ days.
Ideally I was hoping this was possible in an ES query. Is that the case or would the best approach be to process the results into my own array based on the time difference for each hit?
I was able to achieve this by using the built-in expression scripting language, which is enabled by default in ES 2.4. The functionality I wanted was to group my results by the difference between EndDate and DateProcessed in increments of 15 days. The relevant part of the query is:
{
  ...,
  "aggs": {
    "reason": {
      "date_histogram": {
        "min_doc_count": 1,
        "interval": "1296000000ms", // 15 days
        "format": "epoch_millis",
        "script": {
          "lang": "expression",
          "inline": "doc['DateProcessed'].value > doc['EndDate'].value ? doc['DateProcessed'].value - doc['EndDate'].value : -1"
        }
      },
      ...
    }
  }
}
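The question actually asked for uneven buckets (< 1 day, 1-2 days, and so on). The same kind of script can feed a range aggregation instead of a date_histogram; here is a sketch, assuming the shop example's purchase_date and shipped_date fields and converting the millisecond difference to days:

"aggs": {
  "ship_time": {
    "range": {
      "script": {
        "lang": "expression",
        "inline": "(doc['shipped_date'].value - doc['purchase_date'].value) / 86400000"
      },
      "ranges": [
        { "to": 1 },
        { "from": 1, "to": 3 },
        { "from": 3, "to": 5 },
        { "from": 5 }
      ]
    }
  }
}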

String range query in Elasticsearch

I'm trying to query data in an Elasticsearch cluster (2.3) using the following range query. To clarify, I'm searching on a field that contains an array of values that were derived by concatenating two ids together with a count. For example:
Schema:
{
  id1: 111,
  id2: 222,
  count: 5
}
The query I'm using looks like the following:
Query:
{
  "query": {
    "bool": {
      "must": {
        "range": {
          "myfield": {
            "from": "111_222_1",
            "to": "111_222_2147483647",
            "include_lower": true,
            "include_upper": true
          }
        }
      }
    }
  }
}
The to field uses Integer.MAX_VALUE.
This works alright but doesn't exactly match the underlying data. Querying through other means produces more results than this method.
More strangely, trying 111_222_5 in the from field produces 0 results, while trying 111_222_10 does produce results.
How is ES (and/or Lucene) interpreting this range query and why is it producing such strange results? My initial guess is that it's not looking at the full value of the last portion of the String and possibly only looking at the first digit.
Is there a way to specify a format for the TermRange? I understand date ranging allows formatting.
A look here provides the answer.
The range is evaluated lexicographically: "5" comes before "50", which comes before "6", and so on.
To get around this, I reindexed using a fixed-length, zero-padded string for the count:
0000000001
0000000100
0001000101
...
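With the counts padded to a fixed width, lexicographic order matches numeric order, so the original query works once its bounds are padded the same way (a sketch; the 10-digit width is chosen so that Integer.MAX_VALUE, 2147483647, still fits):

"range": {
  "myfield": {
    "from": "111_222_0000000001",
    "to": "111_222_2147483647",
    "include_lower": true,
    "include_upper": true
  }
}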

Kibana: filter events for today

I'm using Kibana on top of Logstash, and I want to filter items in the index to today; the last 24 hours would be fine too.
So apparently this requires me to run a range query against the underlying Elasticsearch engine that would look like:
"range" : {
"timestamp" : {
"gte": "now-24h",
"lte": "now",
}
}
However, I can't put that in the filter box in Kibana 3. What I tried there was a numeric range query, and it doesn't work, but it shows the input box and the idea.
So my question: how can I create a filter that restricts the events to a date range in Kibana 3?
Found it: it's in the top menu.
Clicking it generates the range filter, which then appears as the second filter in the filter list on the left.
