Elasticsearch: storing a range of values in a field

This is the first time I am asking a question.
I am planning to use Elasticsearch for storing certain data that I have.
The problem I face is that I need to store a field's value as a tolerated range.
For example:
Field name: tolerated pH
Example value: 5.1 - 7.0
I need to save it like this and when a query is executed, it has to see if the entered value lies in the range.
I can't find it in the reference and guide.
All I find is Range filter.
Can someone please help me out?
And guide me how it can be done?

I would create two fields:
{ ...
"minToleratedPh": 5.1,
"maxToleratedPh": 7.0
}
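And then use two range queries to check that the constraint minToleratedPh < input_value < maxToleratedPh holds (replace input_value with the pH value to check; use lte and gte instead of lt and gt if the endpoints 5.1 and 7.0 should themselves count as tolerated):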
{
"query": {
"bool": {
"filter": [
{
"range": {
"minToleratedPh": {
"lt": input_value
}
}
},{
"range": {
"maxToleratedPh": {
"gt": input_value
}
}
}
]
}
}
}
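If you build the request from application code, the two range clauses can be generated from the input value. This is only a minimal Python sketch: the function names tolerated_ph_query and matches_locally are made up for illustration, and the second function merely mirrors the comparison in plain Python so the strict bounds are easy to see. It does not talk to Elasticsearch.

```python
import json


def tolerated_ph_query(input_value: float) -> dict:
    """Build the bool/filter query from the answer for one pH value."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # matches documents where minToleratedPh < input_value
                    {"range": {"minToleratedPh": {"lt": input_value}}},
                    # matches documents where maxToleratedPh > input_value
                    {"range": {"maxToleratedPh": {"gt": input_value}}},
                ]
            }
        }
    }


def matches_locally(doc: dict, input_value: float) -> bool:
    """Plain-Python mirror of the two range clauses (illustration only)."""
    return doc["minToleratedPh"] < input_value < doc["maxToleratedPh"]


if __name__ == "__main__":
    # The printed JSON is the request body to send to the _search endpoint.
    print(json.dumps(tolerated_ph_query(6.2), indent=2))
```

With lt and gt, a value exactly equal to 5.1 or 7.0 does not match the example document.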

Related

Find same text within time range

I'm storing articles of blogs in ElasticSearch in this format:
{
blog_id: keyword,
blog_article_id: keyword,
timestamp: date,
article_text: text
}
Suppose I want to find all blogs with articles that mention X at least twice within the last 30 days. Is there a simple query to find all blog_ids that have articles with the same word at least n times within a date range?
Is this the right way to model the problem, or should I use nested objects for an easier query?
Can this be made into a report in Kibana?
The simplest query that comes to mind is
{
"_source": "blog_id",
"query": {
"bool": {
"must": [
{
"match": {
"article_text": "xyz"
}
},
{
"range": {
"timestamp": {
"gte": "now-30d"
}
}
}
]
}
}
}
Nested objects are most probably not going to simplify anything; on the contrary.
Can it be made into a Kibana report?
Sure. Just apply the filters either in KQL (Kibana Query Language) or using the dropdowns, and choose a metric that you want to track (total blog_id count, time-series frequency, etc.).
EDIT re # of occurrences:
I know of 2 ways:
There's the term vectors API, which gives you the word frequency information, but it's a standalone API and cannot be used at query time.
Then there's the scripted approach, whereby you look at the whole article text, treat it as a case-sensitive keyword, and count the # of substrings, thereby eliminating the articles with insufficient word frequency. Note that you don't have to use function_score as I did; a simple script query will do. It may take a non-trivial amount of time to resolve if you have a non-trivial # of docs. The script reads article_text.keyword, so it needs a keyword sub-field on article_text, which the mapping in the question does not define.
In your case it could look like this:
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"source": """
def word = 'xyz';
def docval = doc['article_text.keyword'].value;
String temp = docval.replace(word, "");
def no_of_occurrences = ((docval.length() - temp.length()) / word.length());
return no_of_occurrences >= 2;
"""
}
}
}
]
}
}
}
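The counting trick in the script (remove every occurrence of the word, then divide the lost length by the word length) is easy to try outside Elasticsearch. Here is a small Python sketch of the same arithmetic. The function name count_occurrences is only illustrative, and the sketch assumes a non-empty word.

```python
def count_occurrences(text: str, word: str) -> int:
    """Count non-overlapping, case-sensitive occurrences of word in text.

    Mirrors the script: strip every occurrence, then divide the number
    of removed characters by the length of the word.
    """
    without_word = text.replace(word, "")
    return (len(text) - len(without_word)) // len(word)


def mentions_at_least(text: str, word: str, n: int = 2) -> bool:
    """Same threshold test as the script's final return statement."""
    return count_occurrences(text, word) >= n
```

Note that this counts substrings, not whole words, so "xyzzy" counts as a mention of "xyz", and "XYZ" does not.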

How to use multifield search in elasticsearch combining should and must clause

This may be a repeated question, but I'm not finding a good solution.
I'm trying to search Elasticsearch in order to get documents that contain:
- "event":"myevent1"
- "event":"myevent2"
- "event":"myevent3"
A document does not have to contain all of them, but the result should contain only documents that have one of those event types.
This part is simple, because Elasticsearch helps me with the should clause, which returns exactly what I want.
But then I want every document to satisfy another condition as well: the field result.example.example = 200. This must hold for every single document, and the document must also have one of the previously described events.
So, for example, one document has "event":"myevent1" and result.example.example = 200, another one has "event":"myevent2" and result.example.example = 200, and so on.
I've tried this configuration:
{
"query": {
"bool": {
"must":{"match":{"operation.result.http_status":200}},
"should": [
{
"match": {
"event": "bank.account.patch"
}
},
{
"match": {
"event": "bank.account.add"
}
},
{
"match": {
"event": "bank.user.patch"
}
}
]
}
}
}
but it is not working, because I also get documents that do not contain any of the should events.
I hope I explained it well.
Thanks in advance!
As is, your query tells ES to look for documents that must have "operation.result.http_status":200 and to boost those that have a matching event type.
You're looking to combine two must clauses:
one that matches one of your event types,
one for your other condition.
The event clause accepts multiple values and those values are exact matches: you're looking for a terms query.
Try:
{
"query": {
"bool": {
"must": [
{"match":{"operation.result.http_status":200}},
{
"terms" : {
"event" : [
"bank.account.patch",
"bank.account.add",
"bank.user.patch"
]
}
}
]
}
}
}
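If the list of events comes from code, the same query can be generated. The Python sketch below builds the must + terms form from the answer. It also shows an alternative that is not in the answer above: keeping the original should clauses and adding minimum_should_match set to 1, so that at least one of them has to match. The function names are made up for illustration.

```python
from typing import Iterable


def events_with_status(events: Iterable[str], status: int = 200) -> dict:
    """must + terms form: the status must match AND event is one of the list."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"operation.result.http_status": status}},
                    {"terms": {"event": list(events)}},
                ]
            }
        }
    }


def events_with_status_should(events: Iterable[str], status: int = 200) -> dict:
    """Alternative: keep the should clauses but require at least one of them."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"operation.result.http_status": status}},
                "should": [{"match": {"event": e}} for e in events],
                # Without this, should clauses next to a must only affect scoring.
                "minimum_should_match": 1,
            }
        }
    }
```

Both bodies express "status is 200 and the event is one of the listed values". The terms version does exact matching, whereas the match clauses go through analysis of the event field.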

How to calculate difference between two datetime in ElasticSearch

I'm working with ES and I need a query that returns the difference between two datetimes (like MySQL's TIMEDIFF), but I have not found any ES function that does that. Can someone help me?
MySQL Query
SELECT SEC_TO_TIME(
AVG(
TIME_TO_SEC(
TIMEDIFF(r.acctstoptime,r.acctstarttime)
)
)
) as average_access
FROM radacct
Thanks!
Your best bet is scripted fields. The search query below should work, provided you have enabled dynamic scripting and these date fields are mapped as date.
{
"script_fields": {
"test1": {
"script": "doc['acctstoptime'].value - doc['acctstarttime'].value"
}
}
}
Note that the result is a difference between two epoch timestamps, i.e. a duration in milliseconds, which you need to convert to the unit you want.
You can read about scripted field here and some of its examples here.
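The script field only gives you one raw difference per document, so the SEC_TO_TIME and AVG parts of the MySQL query still have to happen somewhere. As a rough sketch, assuming the differences come back as non-negative milliseconds and are averaged on the client, the conversion could look like this in Python. The function names are hypothetical.

```python
from typing import Sequence


def millis_to_hms(millis: int) -> str:
    """Format a non-negative duration in milliseconds as HH:MM:SS."""
    total_seconds = millis // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def average_access(diffs_millis: Sequence[int]) -> str:
    """Average several per-document differences and format the result."""
    if not diffs_millis:
        raise ValueError("need at least one difference")
    return millis_to_hms(sum(diffs_millis) // len(diffs_millis))
```

For example, millis_to_hms(3723000) gives "01:02:03".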
Here is another example using a script query. It converts the dates to milliseconds since the epoch, subtracts the two, and converts the result into the number of days between the two dates.
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "priorTransactionDate"
}
},
{
"script": {
"script": "(doc['transactionDate'].date.millis - doc['priorTransactionDate'].date.millis)/1000/86400 < 365"
}
}
]
}
}
}
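The arithmetic in that script (milliseconds, divided by 1000 for seconds, divided by 86400 for days, compared with 365) can be checked outside Elasticsearch. Here is a minimal Python sketch, assuming transactionDate is not earlier than priorTransactionDate. It counts whole days; whether the script itself truncates depends on the scripting language in use, which the answer does not state. The function name is illustrative.

```python
from datetime import datetime, timedelta


def within_a_year(transaction_date: datetime, prior_transaction_date: datetime) -> bool:
    """True if fewer than 365 whole days separate the two dates."""
    whole_days = (transaction_date - prior_transaction_date) // timedelta(days=1)
    return whole_days < 365
```

A gap of exactly 365 days is not "within a year" under this comparison, because the test is strictly less than.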

Elastic Search Date Range

I have a query that properly parses date ranges. However, all dates in my database are stored with a default time of 00:00:00. This means that items that should still be valid today are shown as expired. How can I adjust the following to look at just the date and not the time of the item (expirationDate)?
{
"range": {
"expirationDate": {
"gte": "now"
}
}
}
An example of the data is:
"expirationDate": "2014-06-24T00:00:00.000Z",
Did you look into the different format options for dates stored in Elasticsearch? If that does not work for you, or you don't want to store dates without the time, you can try this query, which should work for your exact use case:
{
"range": {
"expirationDate": {
"gt": "now-1d"
}
}
}
You can also round the time down so that your query returns anything that expires at or after the beginning of the current day.
Assuming that
now is 2017-03-07T07:00:00.000,
now/d is 2017-03-07T00:00:00.000
Your query would be:
{
"range": {
"expirationDate": {
"gte": "now/d"
}
}
}
See the Elasticsearch documentation on rounding times in date math.
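To see why rounding fixes the problem, here is a small Python sketch of what now/d does in the example above. It assumes everything is in UTC, and the function names are invented for illustration.

```python
from datetime import datetime, timezone


def round_down_to_day(now: datetime) -> datetime:
    """Equivalent of now/d in the example: truncate to the start of the day."""
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


def is_still_valid(expiration: datetime, now: datetime) -> bool:
    """Mirror of the range clause: expirationDate >= now/d."""
    return expiration >= round_down_to_day(now)


if __name__ == "__main__":
    now = datetime(2017, 3, 7, 7, 0, tzinfo=timezone.utc)
    expires_today = datetime(2017, 3, 7, tzinfo=timezone.utc)
    # Compared with plain "now", this item would already count as expired.
    print(expires_today >= now, is_still_valid(expires_today, now))
```

An item stamped 00:00:00 today fails the comparison against plain now once the day has started, but passes against the rounded value.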

Elasticsearch date range intersection

I'm storing something like the following information in Elasticsearch:
{ "timeslot_start_at" : "2013-02-01", "timeslot_end_at" : "2013-02-03" }
Given another date range (from user input, for example), I want to search for intersecting time ranges. This is similar to "Determine Whether Two Date Ranges Overlap", which outlines the logic I'm after:
(StartDate1 <= EndDate2) and (StartDate2 <= EndDate1)
But I'm unsure how to fit this into an Elasticsearch query. Would I use a range filter and only set the 'to' values, leaving the 'from' blank? Or is there a more efficient way of doing this?
Update: It is now possible to use the date_range data type, which was added in Elasticsearch v5.2. For earlier versions of Elasticsearch the following solution still applies.
To test for intersection, combine two range queries into a single bool query (here the user-supplied range is 2013-02-03 to 2013-02-28):
{
"bool": {
"must": [
{
"range": {
"timeslot_start_at": {
"lte": "2013-02-28"
}
}
},
{
"range": {
"timeslot_end_at": {
"gte": "2013-02-03"
}
}
}
]
}
}
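The mapping from the overlap formula to the two range clauses is easy to get backwards, so here is a short Python sketch that builds the query for an arbitrary search range and mirrors the same test locally. The function names overlap_query and ranges_overlap are only for illustration.

```python
from datetime import date


def overlap_query(search_start: str, search_end: str) -> dict:
    """Build the bool query for timeslots intersecting [search_start, search_end]."""
    return {
        "bool": {
            "must": [
                # StartDate1 <= EndDate2: the slot starts before the search range ends
                {"range": {"timeslot_start_at": {"lte": search_end}}},
                # StartDate2 <= EndDate1: the slot ends after the search range starts
                {"range": {"timeslot_end_at": {"gte": search_start}}},
            ]
        }
    }


def ranges_overlap(start1: date, end1: date, start2: date, end2: date) -> bool:
    """The overlap test quoted in the question, with inclusive endpoints."""
    return start1 <= end2 and start2 <= end1
```

Calling overlap_query("2013-02-03", "2013-02-28") reproduces the query above. The stored slot 2013-02-01 to 2013-02-03 matches it because both comparisons are inclusive and the ranges share 2013-02-03.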
