Give higher priority to specific ranges in Elasticsearch

Here is a sample document from my index
{
"name" : "Neil Buckinson",
"insuranceType" : "personal",
"premiumAmount": 4000,
"age": 36
}
I want to give documents with a premium amount between 3500 and 4500 higher priority than the others. How can I do that in Elasticsearch?

I would say a better approach is a function_score query with a filter function. Documents that match the filter have their score multiplied by the weight:
{
"query": {
"function_score": {
"functions": [
{
"weight": 100,
"filter": {
"range": {
"premiumAmount": {
"gte": 3500,
"lte": 4500
}
}
}
}
]
}
}
}
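To make the effect of the weight concrete, here is a minimal Python sketch. The helper names (boosted_range_query, simulated_score) and the second sample document are made up for illustration. The simulation assumes the default behaviour, where the weight multiplies the base score, and a base score of 1.0 because no inner query is given. It is a local stand-in, not Elasticsearch itself.

```python
def boosted_range_query(field, gte, lte, weight, base_query=None):
    """Build a function_score body that boosts documents whose
    `field` lies in [gte, lte] by `weight`."""
    function_score = {
        "functions": [
            {
                "weight": weight,
                "filter": {"range": {field: {"gte": gte, "lte": lte}}},
            }
        ]
    }
    if base_query is not None:
        # Optional: plug an existing query in as the base query.
        function_score["query"] = base_query
    return {"query": {"function_score": function_score}}


def simulated_score(doc, field, gte, lte, weight, base_score=1.0):
    """Local stand-in for the assumed default (multiply) behaviour."""
    in_range = gte <= doc[field] <= lte
    return base_score * (weight if in_range else 1.0)


body = boosted_range_query("premiumAmount", 3500, 4500, 100)

neil = {"name": "Neil Buckinson", "premiumAmount": 4000}
other = {"name": "Hypothetical Person", "premiumAmount": 9000}  # invented doc

print(simulated_score(neil, "premiumAmount", 3500, 4500, 100))   # 100.0
print(simulated_score(other, "premiumAmount", 3500, 4500, 100))  # 1.0
```

Documents outside the range are still returned; they simply keep their unboosted score.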

That's quite simple. Just use a bool query and add your condition in a should clause.
The bool query takes a more-matches-is-better approach, so the score
from each matching must or should clause will be added together to
provide the final _score for each document.
GET /index/_search
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"should": [
{
"range": {
"premiumAmount": {
"gte": 3500,
"lte": 4500,
"boost": 2
}
}
}
]
}
}
}
Just place your current query in the must clause instead of match_all.
should means that the condition is optional: documents that do not match it are still returned, but documents that do match it get a higher score.
That is exactly what you are looking for.
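The additive scoring can be sketched in a few lines of Python. This is only a local simulation under two assumptions: match_all contributes a score of 1.0, and the range clause contributes a constant score equal to its boost. The function name bool_score and the two extra documents are invented for the example.

```python
def bool_score(doc, must_score=1.0, boost=2.0, low=3500, high=4500):
    """Add the should clause's score only when the document is in range."""
    should_score = boost if low <= doc["premiumAmount"] <= high else 0.0
    return must_score + should_score


docs = [
    {"name": "Neil Buckinson", "premiumAmount": 4000},
    {"name": "Hypothetical A", "premiumAmount": 1200},  # invented doc
    {"name": "Hypothetical B", "premiumAmount": 4500},  # invented doc
]

# Highest score first, as Elasticsearch does by default.
ranked = sorted(docs, key=bool_score, reverse=True)
for doc in ranked:
    print(doc["name"], bool_score(doc))
```

Every document still appears in the result; the ones inside the range just move to the top.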

Related

Is it ok to use only filter query in elastic search

I have to query Elasticsearch for some data, and all my filters are drop-down values, so they are exact matches only. I therefore thought of using only the filter clause, without any must or match query. Is there any problem with this approach?
In the example below I am trying to get the last 15 minutes of data where l1 is any one of ("XYZ", "CFG") and l2 is any one of ("ABC", "CDE").
My query looks like this:
{
"size": 20,
"sort": [
{
"eventTs": "desc"
}
],
"query": {
"bool": {
"filter": [
{
"range": {
"eventTs": {
"gte": "now-15m",
"lte": "now",
"format": "epoch_millis",
"boost": 1
}
}
},
{
"terms": {
"l1": [
"XYZ","CFG"
]
}
},
{
"terms": {
"l2":[
"ABC","CDE"
]
}
}
]
}
}
}
If you don't need _score, which is used to rank documents by relevance, you can use filter only. Filter clauses execute faster because score calculation is skipped, and their results can be cached.
Read the documentation on query and filter context for an in-depth understanding of these concepts.
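If the drop-down values arrive as a mapping of field to selected values, the filter-only body can be assembled like this. It is a rough Python sketch; the helper name filter_only_query and the shape of the selections argument are assumptions, while the field names come from the question.

```python
def filter_only_query(time_field, window, selections, size=20):
    """Build a bool query that uses only filter context (no scoring)."""
    filters = [{"range": {time_field: {"gte": f"now-{window}", "lte": "now"}}}]
    for field, values in selections.items():
        if values:  # skip drop-downs with nothing selected
            filters.append({"terms": {field: list(values)}})
    return {
        "size": size,
        "sort": [{time_field: "desc"}],
        "query": {"bool": {"filter": filters}},
    }


body = filter_only_query(
    "eventTs", "15m", {"l1": ["XYZ", "CFG"], "l2": ["ABC", "CDE"]}
)
print(body["query"]["bool"]["filter"])
```

Because every clause sits under filter, all clauses must match and no relevance score is computed, which is fine here since the results are sorted by eventTs anyway.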

Compound query with Elasticsearch

I'm trying to perform a search with the intended criteria being (activationDate in range 1598889600 to 1602051579) or someFlag=true.
Below is the query I tried, but it does not yield any records with someFlag=true (even with a big size, e.g. 5000). My Elasticsearch does have a lot of records with someFlag=true.
There are about 3000 total documents and this query returns around 280 documents.
{
"query": {
"bool": {
"must": [
{
"range": {
"activationDate": {
"gte": 1598889600
}
}
},
{
"range": {
"activationDate": {
"lte": 1602051579
}
}
}
],
"should": {
"match": {
"someFlag": true
}
}
}
},
"from": 1,
"size": 1000
}
Am I missing something?
This should work:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"activationDate": {
"gte": 1598889600,
"lte": 1602051579
}
}
},
{
"term": {
"someFlag": true
}
}
]
}
}
]
}
}
}
In theory this should do the same:
{
"query": {
"bool": {
"should": [
{
"range": {
"activationDate": {
"gte": 1598889600,
"lte": 1602051579
}
}
},
{
"term": {
"someFlag": true
}
}
]
}
}
}
However, the first query wraps the bool clause in a filter context, so it does not need to compute scores and the query becomes cacheable.
Your original query did not return those records because the range conditions sat in must: when a bool query has a must clause, the should clause is optional and only affects scoring, so documents outside the date range were excluded whatever their someFlag value. I also used term rather than match, since match is normally used for full-text search.
Move the range condition from must into should and set minimum_should_match=1: this is an OR query, so it is enough for a record to meet just one of the criteria. Then reduce the two range criteria to a single one that combines gte and lte.
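The difference between the two shapes can be illustrated with plain Python predicates. This is a local simulation of the matching rules, not a call to Elasticsearch. The three documents are invented, and the function names are only for illustration.

```python
LOW, HIGH = 1598889600, 1602051579


def in_range(doc):
    return LOW <= doc["activationDate"] <= HIGH


def original_matches(doc):
    # must + should: the must clauses are mandatory, the should clause
    # only influences scoring, so someFlag never widens the result set.
    return in_range(doc)


def or_matches(doc):
    # should-only bool: at least one of the clauses has to match.
    return in_range(doc) or doc["someFlag"] is True


docs = [  # invented sample documents
    {"id": 1, "activationDate": 1600000000, "someFlag": False},
    {"id": 2, "activationDate": 1500000000, "someFlag": True},
    {"id": 3, "activationDate": 1500000000, "someFlag": False},
]

print([d["id"] for d in docs if original_matches(d)])  # [1]
print([d["id"] for d in docs if or_matches(d)])        # [1, 2]
```

Document 2 has someFlag=true but falls outside the date range, which is the kind of record the original query was missing.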

elasticsearch averaging a field on a bucket

I am new to Elasticsearch and trying to understand how aggregations and metrics work. I ran an aggregation query to retrieve the average number of bytes out per ClientIP_Hash from an Elasticsearch instance. The query I created (using Kibana) is as follows:
{
"size": 0,
"query": {
"filtered": {
"query": {
"query_string": {
"query": "*",
"analyze_wildcard": true
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"@timestamp": {
"gte": 1476177616965,
"lte": 1481361616965,
"format": "epoch_millis"
}
}
}
],
"must_not": []
}
}
}
},
"aggs": {
"2": {
"terms": {
"field": "ClientIP_Hash",
"size": 50,
"order": {
"1": "desc"
}
},
"aggs": {
"1": {
"avg": {
"field": "Bytes Out"
}
}
}
}
}
}
It gives me some output (supposed to be avg) grouped on clientIPHash like below:
ClientIP_Hash: Descending Average Bytes Out
64e6b1f6447fd044c5368740c3018f49 1,302,210
4ff8598a995e5fa6930889b8751708df 94,038
33b559ac9299151d881fec7508e2d943 68,527
c2095c87a0e2f254e8a37f937a68a2c0 67,083
...
The problem is, if I replace the avg with sum or min or any other metric type, I still get same values.
ClientIP_Hash: Descending Sum of Bytes Out
64e6b1f6447fd044c5368740c3018f49 1,302,210
4ff8598a995e5fa6930889b8751708df 94,038
33b559ac9299151d881fec7508e2d943 68,527
c2095c87a0e2f254e8a37f937a68a2c0 67,083
I checked the query generated by kibana, and it seems to correctly put the keyword 'sum' or 'avg' accordingly. I am puzzled why I get the same values for avg and sum or any other metric.
Could you check whether your sample data set has more than one value per bucket? min, max, avg, and sum all return the same number when a bucket contains only one value.
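A quick way to see this is to compute the metrics by hand in Python. The value 94038 is taken from the output above; the other numbers are made up to show a bucket with several documents.

```python
from statistics import mean


def metrics(values):
    """Compute the same metrics a bucket would report for `values`."""
    return {
        "avg": mean(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
    }


single = metrics([94038])              # bucket with one document
several = metrics([94038, 1000, 500])  # invented multi-document bucket

print(single)
print(several)
```

With a single value all four metrics coincide, so identical avg and sum columns suggest that each ClientIP_Hash bucket holds only one document in the selected time range.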

Search for a certain text within a range of a certain timestamp with Elasticsearch

I have worked with Elasticsearch and have researched how to query data containing a certain text, and how to query data within a timestamp range, using the Elasticsearch PHP Client API. Now I would like to combine these two queries into one, that is, search for a certain text within a certain timestamp range. Can someone please tell me how to do that using the Elasticsearch PHP Client API? Thanks in advance! I have searched the Internet but still cannot combine these two queries into one :-(
Here is an example of a bool query. The logic is that the record must fall within the date range, and should also contain the text in the textfield field (the text match is optional and only raises the score). You could instead put both conditions in the must clause.
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must": [
{
"range": {
"datefield": {
"gte": "from",
"lte": "to"
}
}
}
],
"should": [
{
"match": {
"textfield": {
"query": "Name",
"boost": 10
}
}
}
]
}
}
}
UPDATE: or, if a record must satisfy both conditions:
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must": [
{
"range": {
"datefield": {
"gte": "from",
"lte": "to"
}
}
},
{
"match": {
"textfield": {
"query": "Name",
"boost": 10
}
}
}
]
}
}
}
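Both variants differ only in where the match clause goes, so they can be produced by one small builder. The sketch below is in Python rather than PHP, and the function name, the require_text switch, and the example dates are all assumptions. The resulting nested structure is the same one you would express as a PHP associative array and pass as the request body.

```python
def text_in_time_range(text_field, text, date_field, start, end,
                       require_text=True):
    """Combine a date range with a text match in one bool query."""
    range_clause = {"range": {date_field: {"gte": start, "lte": end}}}
    match_clause = {"match": {text_field: {"query": text}}}

    bool_query = {"must": [range_clause]}
    if require_text:
        # Both conditions are mandatory.
        bool_query["must"].append(match_clause)
    else:
        # The text is optional and only raises the score.
        bool_query["should"] = [match_clause]
    return {"from": 0, "size": 20, "query": {"bool": bool_query}}


# Placeholder dates; use whatever format the date field is mapped with.
strict = text_in_time_range("textfield", "Name", "datefield",
                            "2016-01-01", "2016-01-31")
loose = text_in_time_range("textfield", "Name", "datefield",
                           "2016-01-01", "2016-01-31", require_text=False)
print(strict)
print(loose)
```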

Combine two range filters inside a bool query

In my ES index, documents have two fields, score_min and score_max, which I am trying to boost in a bool query.
I want to boost all documents for which score_min <= expected_score <= score_max is true.
I know that I can put two range queries in the must clause, but that would mean that other documents would get overlooked.
Is there a way to do something like this (pseudo-query)?
..
..
"should": [
...
...
"some_query": {
"and": [
"range": {
"score_min": {
"lte": expected_score
},
},
"range": {
"score_max": {
"gte": expected_score
}
}
"boost": 2
]
}
]
You can do this using the function_score query. One of the added benefits is that your range query can be written as a filter instead, and so take advantage of filter caching:
curl -XGET "http://localhost:9200/_search" -d'
{
"query": {
"function_score": {
"query": {
"match": { "some_field": "foo bar" }
},
"functions": [
{
"boost_factor": 1.2,
"filter": {
"bool": {
"must": [
{ "range": { "score_min": { "lte": 10 }}},
{ "range": { "score_max": { "gte": 10 }}}
]
}
}
}
]
}
}
}'
All results matching the query are returned, but any results that additionally match the filter have their score multiplied by boost_factor. (Newer Elasticsearch versions use weight in place of boost_factor.)
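The filter is simply an interval-containment test on each document, followed by a multiplication. Here is a small Python simulation of that logic, with invented documents and a base query score of 1.0 assumed for every hit. It mirrors the behaviour described above and does not call Elasticsearch.

```python
def boosted_score(doc, expected_score, query_score=1.0, factor=1.2):
    """Multiply the score when score_min <= expected_score <= score_max."""
    contains = doc["score_min"] <= expected_score <= doc["score_max"]
    return query_score * factor if contains else query_score


docs = [  # invented sample documents
    {"id": "a", "score_min": 5, "score_max": 15},
    {"id": "b", "score_min": 11, "score_max": 20},
]

for doc in docs:
    print(doc["id"], boosted_score(doc, expected_score=10))
```

Document "a" contains 10 in its interval and gets the factor applied; document "b" is still returned with its original score.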

Resources