Search for a certain text within a timestamp range with Elasticsearch

I have worked with Elasticsearch and have researched how to query data containing a certain text and how to query data within a timestamp range, using the Elasticsearch PHP Client API. Now I would like to combine these two queries into one: search for a certain text within a certain timestamp range. Can someone please tell me how to do that using the Elasticsearch PHP Client API? Thanks in advance! I have searched the Internet but still cannot combine these two queries into one :-(

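Here is an example of a bool query. The record must fall within the date range (the must clause), while the match on textfield sits in the should clause, so it is optional and only raises the score of documents that contain the text. If the text is required as well, put both conditions in the must clause. Replace the "from" and "to" placeholders with real date values.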
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must": [
{
"range": {
"datefield": {
"gte": "from",
"lte": "to"
}
}
}
],
"should": [
{
"match": {
"textfield": {
"query": "Name",
"boost": 10
}
}
}
]
}
}
}
UPDATE: if a document must satisfy both conditions, put both in the must clause:
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must": [
{
"range": {
"datefield": {
"gte": "from",
"lte": "to"
}
}
},
{
"match": {
"textfield": {
"query": "Name",
"boost": 10
}
}
}
]
}
}
}
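The difference between the two variants above can be sketched in Python by building the request body as a plain dictionary. This is only an illustration: the function name text_in_time_range and the sample dates are hypothetical, and no Elasticsearch client is involved. With the PHP client, the same nested structure would presumably be written as a PHP array and passed as the body of the search call.

```python
import json


def text_in_time_range(text, start, end, text_required=True):
    """Build a bool query body: the date range is always required;
    the text match is either required (must) or score-only (should)."""
    range_clause = {"range": {"datefield": {"gte": start, "lte": end}}}
    match_clause = {"match": {"textfield": {"query": text, "boost": 10}}}

    bool_query = {"must": [range_clause]}
    if text_required:
        # Both conditions must hold for a document to be returned.
        bool_query["must"].append(match_clause)
    else:
        # The text only influences ranking; it does not exclude documents.
        bool_query["should"] = [match_clause]

    return {"from": 0, "size": 20, "query": {"bool": bool_query}}


# Hypothetical sample values, for illustration only.
body = text_in_time_range("Name", "2017-06-01", "2017-06-30")
print(json.dumps(body, indent=2))
```

Calling the function with text_required=False reproduces the first variant, where the text is optional.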

Related

Is it ok to use only filter query in elastic search

I have to query Elasticsearch for some data, and all my filters are drop-down values, so they are exact matches only. I therefore thought of using only the filter clause, without any must or match query. Is there any problem with this approach?
In the example below I am trying to get the last 15 minutes of data where l1 is any one of ("XYZ", "CFG") and l2 is any one of ("ABC", "CDE").
My query looks like this:
{
"size": 20,
"sort": [
{
"eventTs": "desc"
}
],
"query": {
"bool": {
"filter": [
{
"range": {
"eventTs": {
"gte": "now-15m",
"lte": "now",
"format": "epoch_millis",
"boost": 1
}
}
},
{
"terms": {
"l1": [
"XYZ","CFG"
]
}
},
{
"terms": {
"l2":[
"ABC","CDE"
]
}
}
]
}
}
}
If you don't need _score, which ranks documents by relevance, you can use filter alone. Filter clauses run faster because score calculation is skipped, and their results can be cached.
The Elasticsearch documentation on query and filter context is a must-read for an in-depth understanding of these concepts.
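Since every condition here is an exact match, the filter list can be generated directly from the drop-down selections. The following Python sketch assumes the selections arrive as a mapping from field name to the chosen values; the function name dropdown_filter_query and its parameters are hypothetical, and the field names are taken from the question.

```python
def dropdown_filter_query(selections, minutes=15, size=20):
    """Build a filter-only bool query from drop-down selections.

    selections maps a field name to the list of values picked in the UI
    (an assumed input shape, not something from the original question).
    """
    # The time window is always applied.
    filters = [{"range": {"eventTs": {"gte": f"now-{minutes}m", "lte": "now"}}}]

    for field, values in selections.items():
        if values:  # a drop-down with nothing selected adds no restriction
            filters.append({"terms": {field: list(values)}})

    return {
        "size": size,
        "sort": [{"eventTs": "desc"}],
        "query": {"bool": {"filter": filters}},
    }


body = dropdown_filter_query({"l1": ["XYZ", "CFG"], "l2": ["ABC", "CDE"]})
```

Because nothing is placed in must or should, every clause runs in filter context and no relevance score is computed.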

ElasticSearch: how to use timezone from the doc to query on date range

I am trying to achieve something like this:
GET /services/_search
{
"size": 10,
"_source": [
"doc.time",
"doc.timezone"
],
"query": {
"bool": {
"must": [
{
"range": {
"doc.time": {
"gte": "now-7d/d",
"time_zone": "US/Pacific"
}
}
}
]
}
}
}
This query works.
Each document has its own time zone, which is stored inside the Elasticsearch document.
I need to use the value of doc.timezone instead of the hard-coded "US/Pacific".
Is that possible to achieve in Elasticsearch?

elasticsearch must query combine OR?

I have been trying to use a must query with bool but I am failing to get the results.
In pseudo-SQL:
SELECT * FROM info WHERE (ulevel = '1.3.10' OR ulevel = '1.3.6') AND (#timestamp BETWEEN '2017-06-05T07:00:00.000Z' AND '2017-06-05T07:20:00.000Z')
Here is what I have:
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "_all",
"query": "*"
},
"range": {
"#timestamp": {
"from": "2017-06-05T07:00:00.000Z",
"to": "2017-06-05T07:20:00.000Z"
}
},
"bool": {
"should": [
{"term": { "ulevel": "1.3.10"}},
{"term": { "ulevel": "1.3.6"}}
]
}
}
]
}
}
Does anyone have a solution?
Thank you so much.
You can use a terms query for the first part and a range query for the second part:
GET _search
{
"query": {
"bool": {
"must": [
{
"terms": {
"ulevel": [
"1.3.10",
"1.3.6"
]
}
},
{
"range": {
"#timestamp": {
"gte": "2017-06-05T07:00:00.000Z",
"lte": "2017-06-05T07:20:00.000Z"
}
}
}
]
}
},
"from": 0,
"size": 20
}
Some notes:
The terms query matches documents whose field contains any of the provided terms; the terms themselves are not analyzed.
You can also use date-specific expressions with the range query. See the range query page https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#ranges-on-dates for more information.
Update:
Added from and size in response to a question in the comments.
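For comparison, the pseudo-SQL can also be translated literally, with the OR written as a nested bool/should. The key point is that each element of the must array is its own object holding exactly one query; the original attempt placed query_string, range and bool inside a single object. A minimal Python sketch follows (the function name or_and_range is hypothetical, and the explicit minimum_should_match is added here for clarity rather than taken from the original):

```python
def or_and_range(field, values, ts_field, start, end):
    """Build (field = v1 OR field = v2 ...) AND ts_field BETWEEN start AND end."""
    any_of = {
        "bool": {
            "should": [{"term": {field: v}} for v in values],
            # Require at least one of the OR branches to match.
            "minimum_should_match": 1,
        }
    }
    in_range = {"range": {ts_field: {"gte": start, "lte": end}}}
    # Each must element is a separate object with a single query inside.
    return {"query": {"bool": {"must": [any_of, in_range]}}}


body = or_and_range(
    "ulevel",
    ["1.3.10", "1.3.6"],
    "#timestamp",
    "2017-06-05T07:00:00.000Z",
    "2017-06-05T07:20:00.000Z",
)
```

The terms query shown in the answer is the more compact way to express the same OR over a single field.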

ElasticSearch filtering by field1 THEN field2 THEN take max of field3

I am struggling to get the information that I need from ElasticSearch.
My log statements are like this:
field1: Example
field2: Example2
field3: Example3
I would like to search a timeframe (using last 24 hours) to find all data that has this in field1 and that in field2.
There then may be multiple this.that.[field3] entries, so I want to only return the maximum of that field.
In fact, in my data, field3 is actually the key of the entry.
What is the best way of retrieving the information I need? I have managed to get the results returned using aggs, but the data is in buckets, and I am only interested in the data with the max value of field3.
I have added an example of the query that I am looking to do: https://jsonblob.com/54535d49e4b0d117eeaf6bb4
{
"size": 0,
"aggs": {
"agg_129": {
"filters": {
"filters": {
"CarName: Toyota": {
"query": {
"query_string": {
"query": "CarName: Toyota"
}
}
}
}
},
"aggs": {
"agg_130": {
"filters": {
"filters": {
"Attribute: TimeUsed": {
"query": {
"query_string": {
"query": "Attribute: TimeUsed"
}
}
}
}
},
"aggs": {
"agg_131": {
"terms": {
"field": "#timestamp",
"size": 0,
"order": {
"_count": "desc"
}
}
}
}
}
}
}
},
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"bool": {
"must": [
{
"range": {
"#timestamp": {
"gte": "2014-10-27T00:00:00.000Z",
"lte": "2014-10-28T23:59:59.999Z"
}
}
}
],
"must_not": []
}
}
}
}
}
So, that example above is showing only those that have CarName = Toyota and Attribute = TimeUsed.
My data is as follows:
There are x cars (CarName), each car has y Attributes, and each of those Attributes has documents with a timestamp.
To begin with, I was looking for a query for CarName.Attribute.timestamp (latest). However, if I could use just ONE query to get the latest timestamp for EVERY Attribute of EVERY CarName, that would decrease the number of query calls from ~50 to one.
If you are using Elasticsearch v1.3+, you can add a top_hits aggregation with size: 1 and a descending sort on the field3 value.
This will return the whole document with the maximum value of that field, as you wish.
This example in the documentation might do the trick.
Edit:
OK, it seems you don't need the whole document, only the maximum timestamp value. In that case you can use a max aggregation instead of top_hits.
The following query (not tested) should give you the maximum timestamp value for each of the top 10 Attribute values within each of the top 10 CarName values, in a single request.
A terms aggregation is like a GROUP BY clause, so you should not have to query 50 times to retrieve the values for each CarName/Attribute combination: this is the point of nesting a terms aggregation on Attribute inside the CarName aggregation.
Note that, to work properly, the CarName and Attribute fields should be not_analyzed. Otherwise the values are split into tokens and you will get "funny" results in your buckets. The problem (and a possible solution) is very well described here.
Feel free to change the size parameter of the terms aggregations to fit your case.
{
"size": 0,
"aggs": {
"by_carnames": {
"terms": {
"field": "CarName",
"size": 10
},
"aggs": {
"by_attribute": {
"terms": {
"field": "Attribute",
"size": 10
},
"aggs": {
"max_timestamp": {
"max": {
"field": "#timestamp"
}
}
}
}
}
}
},
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"range": {
"#timestamp": {
"gte": "2014-10-27T00:00:00.000Z",
"lte": "2014-10-28T23:59:59.999Z"
}
}
}
]
}
}
}
}
}
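Once the response comes back, the nested buckets can be flattened into one row per CarName/Attribute pair. The Python sketch below assumes the usual aggregation response layout (a buckets list per terms aggregation and a value field for the max aggregation) and the aggregation names used in the query above; the function name latest_per_attribute and the sample response are made up for illustration.

```python
def latest_per_attribute(response):
    """Flatten nested terms buckets into (car, attribute, max_timestamp) rows."""
    rows = []
    for car in response["aggregations"]["by_carnames"]["buckets"]:
        for attr in car["by_attribute"]["buckets"]:
            # For a date field, the max aggregation value is in epoch milliseconds.
            rows.append((car["key"], attr["key"], attr["max_timestamp"]["value"]))
    return rows


# Hypothetical response fragment, trimmed to the parts the function reads.
sample = {
    "aggregations": {
        "by_carnames": {
            "buckets": [
                {
                    "key": "Toyota",
                    "doc_count": 3,
                    "by_attribute": {
                        "buckets": [
                            {
                                "key": "TimeUsed",
                                "doc_count": 2,
                                "max_timestamp": {"value": 1414454400000.0},
                            }
                        ]
                    },
                }
            ]
        }
    }
}

for row in latest_per_attribute(sample):
    print(row)
```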

Combine two range filters inside a bool query

In my ES index, documents have two fields, score_min and score_max, which I am trying to boost in a bool query.
I want to boost all documents for which score_min <= expected_score <= score_max is true.
I know that I can put two range queries in the must clause, but that would mean that other documents would get overlooked.
Is there a way to do something like this
..
..
"should": [
...
...
"some_query": {
"and": [
"range": {
"score_min": {
"lte": expected_score
},
},
"range": {
"score_max": {
"gte": expected_score
}
}
"boost": 2
]
}
]
You can do this using the function_score query. One of the added benefits is that your range query can be written as a filter instead, and so take advantage of filter caching:
curl -XGET "http://localhost:9200/_search" -d'
{
"query": {
"function_score": {
"query": {
"match": { "some_field": "foo bar" }
},
"functions": [
{
"boost_factor": 1.2,
"filter": {
"bool": {
"must": [
{ "range": { "score_min": { "lte": 10 }}},
{ "range": { "score_max": { "gte": 10 }}}
]
}
}
}
]
}
}
}'
All results matching the query are returned, but any results which additionally match the filter will have their score multiplied by boost_factor.
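The scoring behaviour described above can be imitated locally to make it concrete. This Python sketch does not call Elasticsearch; it only mirrors the rule that a document's score is multiplied by the boost factor when score_min <= expected_score <= score_max. The function name boosted_score and the sample documents are hypothetical.

```python
def boosted_score(base_score, doc, expected_score, boost_factor=1.2):
    """Return the score after applying the range-based boost, if it applies."""
    in_range = doc["score_min"] <= expected_score <= doc["score_max"]
    # Documents outside the range keep their score and are still returned.
    return base_score * boost_factor if in_range else base_score


# Hypothetical documents with the same base score from the main query.
docs = [
    {"id": "a", "score_min": 5, "score_max": 15},   # 10 falls inside the range
    {"id": "b", "score_min": 20, "score_max": 30},  # 10 falls outside the range
]

ranked = sorted(docs, key=lambda d: boosted_score(2.0, d, 10), reverse=True)
print([d["id"] for d in ranked])
```

Document "a" ranks ahead of "b" because only "a" receives the multiplier, while "b" remains in the result set.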
