Boost result which has the current date in between dates - elasticsearch

My mapping has two properties:
"news_from_date" : {
"type" : "string"
},
"news_to_date" : {
"type" : "string"
},
Search results have the properties news_from_date, news_to_date
curl -X GET 'http://172.2.0.5:9200/test_idx1/_search?pretty=true' 2>&1
Result:
{
"news_from_date" : "2022-05-30 00:00:00",
"news_to_date" : "2022-06-23 00:00:00"
}
Question is: How can I boost all results with the current date being in between their "news_from_date"-"news_to_date" interval, so they are shown as highest ranking results?

Tldr;
First off, if you are going to work with dates, you should probably use one of the date types provided by Elasticsearch.
There are many ways to approach your problem: using Painless, using a scoring function, or even more classical query types.
Using Should
Using the Boolean query type, you have multiple clauses.
Must
Filter
Must_not
Should
Should allows optional clauses to be factored into the final score.
So you go with:
GET _search
{
"query": {
"bool": {
"should": [
{
"range": {
"news_from_date": {
"lte": "now"
}
}
},
{
"range": {
"news_to_date": {
"gte": "now"
}
}
}
]
}
}
}
Be aware that:
You can use the minimum_should_match parameter to specify the number or percentage of should clauses returned documents must match.
If the bool query includes at least one should clause and no must or filter clauses, the default value is 1. Otherwise, the default value is 0.
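The two should clauses above each add score on their own, so a document matching only one of them is still boosted a little. A minimal sketch of one way to boost only documents whose whole interval contains the current date is to nest both ranges in a single bool clause. It assumes both fields have been remapped to a date type; the helper name `boost_current_interval` and the `match_all` base query are illustrative, not part of the original answer.

```python
import json

def boost_current_interval(base_query=None):
    """Build a bool query that boosts documents whose interval contains now."""
    # Both range conditions sit inside one nested bool, so only documents
    # matching BOTH of them earn the extra score from the should clause.
    in_interval = {
        "bool": {
            "must": [
                {"range": {"news_from_date": {"lte": "now"}}},
                {"range": {"news_to_date": {"gte": "now"}}},
            ]
        }
    }
    return {
        "query": {
            "bool": {
                # Assumption: no other search criteria, so match_all keeps
                # every document; the should clause only adds to the score.
                "must": [base_query or {"match_all": {}}],
                "should": [in_interval],
            }
        }
    }

body = boost_current_interval()
print(json.dumps(body, indent=2))
```

Because the outer bool has a must clause, minimum_should_match defaults to 0 here, so documents outside the interval are still returned, just ranked lower.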
Using a script
As described in the documentation, you can create a custom function to score your documents according to your own business rules.
The script is written in Painless (a stripped-down, Java-like scripting language).
GET /_search
{
"query": {
"function_score": {
"query": {
"match": { "message": "elasticsearch" }
},
"script_score": {
"script": {
"source": "Math.log(2 + doc['my-int'].value)"
}
}
}
}
}
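The example above is the generic one from the documentation. For the date-interval question, function_score can also apply a fixed weight to documents that pass a filter, which avoids scripting entirely. The following is only a sketch under assumptions: both fields are mapped as dates, the weight of 10 is arbitrary, and the helper name `weighted_interval_boost` is made up for illustration.

```python
import json

def weighted_interval_boost(weight=10):
    """Build a function_score query that multiplies the score of
    documents whose interval contains the current date."""
    in_interval = {
        "bool": {
            "filter": [
                {"range": {"news_from_date": {"lte": "now"}}},
                {"range": {"news_to_date": {"gte": "now"}}},
            ]
        }
    }
    return {
        "query": {
            "function_score": {
                # Assumption: no other search criteria.
                "query": {"match_all": {}},
                # The weight applies only to documents passing the filter.
                "functions": [{"filter": in_interval, "weight": weight}],
                "boost_mode": "multiply",
            }
        }
    }

body = weighted_interval_boost()
print(json.dumps(body, indent=2))
```

Documents outside the interval keep their original score, so they still appear in the results below the boosted ones.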

Related

Elasticsearch: How to filter results with a specific word in a value using elasticsearch

I need to add a parameter to my search that filters results containing a specific word in a value. The query is searching for user history records and contains a url key. I need to filter out /history and any other url containing that string.
Here's my current query:
GET /user_log/_search
{
"size" : 50,
"query": {
"match": {
"user_id": 56678
}
}
}
Here's an example of a record, boiled down to just the value we're looking at:
"_source": {
"url": "/history?page=2&direction=desc",
},
How can the parameters of the search be changed to filter out this result?
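You can use the filter clause of the bool query in Elasticsearch. Note that a filter clause keeps only the matching documents; to exclude them instead, place the same term clause under must_not.
If your url field is of type keyword, you can use the query below: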
{
"query": {
"bool": {
"must": {
"match": {
"user_id": 56678
}
},
"filter": {
"term": {
"url": "/history"
}
}
}
}
}
I found a way to solve my specific issue. Instead of filtering on the url I'm filtering on a different value. Here's what I'm using now:
{
"size" : 50,
"query": {
"bool" : {
"must" : {
"match" : { "user_id" : 56678 }
},
"must_not": {
"match" : { "controller": "History" }
}
}
}
}
I'm still going to leave this question open for a while to see if anyone has other ways of solving the original problem.
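For the original problem (excluding every url that contains /history), one possible approach is a wildcard query under must_not. This is a sketch that assumes url is mapped as keyword; the helper name `exclude_history_query` is hypothetical, and the local check with `fnmatchcase` only approximates how the wildcard pattern behaves on the server.

```python
from fnmatch import fnmatchcase

PATTERN = "*/history*"  # '*' matches any run of characters

def exclude_history_query(user_id, size=50):
    """Match one user's records but drop urls containing /history."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": {"match": {"user_id": user_id}},
                # must_not removes matching documents from the results.
                "must_not": {"wildcard": {"url": {"value": PATTERN}}},
            }
        },
    }

body = exclude_history_query(56678)

# Local illustration of which urls the pattern would exclude.
urls = ["/history?page=2&direction=desc", "/profile"]
kept = [u for u in urls if not fnmatchcase(u, PATTERN)]
print(kept)
```

Patterns with a leading wildcard can be slow on large indices, so filtering on a dedicated field (as in the controller workaround above) may still be the better choice.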

Fuzziness in date type field

I have a date field in my mapping and I want to do a fuzzy search on it.
Below is my code:
GET _search
{
"query": {
"fuzzy": {
"death_date": {
"value": "3548"
}
}
}
}
The current result doesn't return data as expected.
Although I have a document with the value 3548, its score is lower than that of the 3549 value, which appears at the top of the results.
I have changed my query to include a range parameter, as suggested:
GET _search
{
"query" : {
"bool": {
"must":
{
"match": {
"marriages.marriage_year": "1630"
}
},
"should":
{
"match": {
"first_name":
{ "query" : "mary",
"fuzziness":"2"
}
}
},
"must":
{
"range" : {
"marriages.marriage_year": {
"gt" : "1620",
"lte" : "1740"
}
}
}
}
}
}
It is returning data with marriages.marriage_year = "1630" and Mary as first_name with the highest score. I also want to include documents with marriages.marriage_year between 1620 and 1740, which are not shown in the results. It is showing data only for marriage_year 1630.
The fuzzy query is meant to work on string fields to accommodate typing errors. It gives you results on the basis of edit distance, so it doesn't make sense to use it on numeric fields: 1000 and 9000 have an edit distance of only 1, yet they are far apart numerically. You can do a range query as suggested by Russ or, if you care about edit distance rather than range, index the value as a string field and then run a fuzzy query on it.
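To make the edit-distance point concrete, here is a small Levenshtein sketch, followed by one possible rewrite of the query above. The rewrite assumes marriages is a plain object field (not nested) and moves the 1630 match from must to should, since a must clause restricts the results to that single year. Both helper names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

print(edit_distance("1000", "9000"))  # 1
print(edit_distance("3548", "3549"))  # 1

def marriage_query(year, low, high, first_name):
    """Restrict by year range, then prefer the exact year and the name."""
    return {
        "query": {
            "bool": {
                # The range decides which documents are returned at all.
                "filter": {"range": {"marriages.marriage_year":
                                     {"gt": low, "lte": high}}},
                # should clauses only influence ranking.
                "should": [
                    {"match": {"marriages.marriage_year": year}},
                    {"match": {"first_name":
                               {"query": first_name, "fuzziness": "2"}}},
                ],
            }
        }
    }

body = marriage_query("1630", "1620", "1740", "mary")
```

Both pairs of values are one edit apart, which is why a fuzzy query cannot tell "numerically close" from "numerically far".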

How to achieve an exists filter on ES 5.0?

The exists filter has been replaced by an exists query in ES5.0.
So how can we achieve the equivalent within the same query? In other words, we don't want to run two queries but just one for various aggregations, including the exists count.
So I want to count the number of documents in which the field "the_field" exists (or is not null):
"aggregation":{
"exists_count":{
"filter":{
"exists":{
"field":"the_field"
}
}
}
}
I think you can use stats aggregation,
{ "aggs" :
{ "time_stats" :
{ "extended_stats" :
{ "field" : "time" }
}
}
}
Look at elastic stats doc
With Elastic 5.0, filters didn't so much get replaced by queries, but combined. Syntactically they look the same, but the context in which you use it determines if it gets interpreted as a query (factors into scoring) or as a filter to simply weed out documents. The below code should achieve exactly what you want:
{
"query": {
"match_all": {}
},
"aggs": {
"field_exists": {
"filter": {
"exists": {
"field": "name"
}
}
}
}
}
The aggregation returned will look something like this, with doc_count representing the number of documents where the "name" field exists. Hope this helps!
{
"aggregations": {
"field_exists": {
"doc_count": 11984
}
}
}
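A short sketch of how the request body and the count could be handled from client code. The helper names are hypothetical, and `"size": 0` is an addition that assumes only the aggregation result is wanted, not the hits themselves.

```python
def exists_count_body(field):
    """Build a request that counts documents in which `field` exists."""
    return {
        "size": 0,  # assumption: hits are not needed, only the aggregation
        "query": {"match_all": {}},
        "aggs": {
            "field_exists": {"filter": {"exists": {"field": field}}}
        },
    }

def read_exists_count(response):
    """Extract doc_count from a response shaped like the one above."""
    return response["aggregations"]["field_exists"]["doc_count"]

body = exists_count_body("the_field")
sample_response = {"aggregations": {"field_exists": {"doc_count": 11984}}}
print(read_exists_count(sample_response))
```

Other aggregations can be added next to field_exists in the same "aggs" object, so everything is computed in a single request.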

To find difference between two integer fields and check it falls under a specific range, using scripts in elasticsearch

I have two fields in my documents, let us call them "fieldA" and "fieldB". I need to find the difference between them, check whether that value falls within a specific range, say "rangeA" or "rangeB", and then return the documents that match my criteria.
The schema for data is as shown below:
{
"fieldA": 45,
"fieldB":13
}
I need to find all the documents for which the difference between "fieldA" and "fieldB" is between 30 and 35. How can I do this using scripting in Elasticsearch?
This can also be done using aggregations and scripts like below:
{
"aggregations": {
"age_diff": {
"range": {
"script": "doc[\"fieldA\"].value - doc[\"fieldB\"].value",
"ranges": [
{
"from": 30,
"to": 35
}
]
}
}
}
}
This way you can just check how many documents fall within the specified range. But if you want to get the documents under the aggregation, you can use the "top_hits" aggregation.
More detailed discussion on aggregations can be found here and more about "top_hits" can be found in detail here.
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "difference=doc['fieldA'].value-doc['fieldB'].value;return (difference>param1 && difference<param2);",
"params": {
"param1":30,
"param2":35
}
}
}
}
}
}
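The filtered query used above was removed in Elasticsearch 5.0. On later versions, a roughly equivalent request can be sketched with a script query inside a bool filter, written in Painless. The helper names and the parameter names low and high are illustrative; the bounds are exclusive, as in the script above.

```python
def difference_filter(low, high):
    """Build a bool/filter script query on fieldA - fieldB."""
    source = (
        "long diff = doc['fieldA'].value - doc['fieldB'].value; "
        "return diff > params.low && diff < params.high;"
    )
    return {
        "query": {
            "bool": {
                "filter": {
                    "script": {
                        "script": {
                            "lang": "painless",
                            "source": source,
                            "params": {"low": low, "high": high},
                        }
                    }
                }
            }
        }
    }

def difference_in_range(field_a, field_b, low, high):
    """Python mirror of the Painless predicate, for local checking."""
    diff = field_a - field_b
    return low < diff < high

body = difference_filter(30, 35)
print(difference_in_range(45, 13, 30, 35))  # True: 45 - 13 = 32
```

The sample document from the question (fieldA 45, fieldB 13) has a difference of 32 and would therefore pass the filter.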

Elasticsearch query for matching two parameters at the same time

I have to search two fields in a DB using Elasticsearch, where the total hits I get should be equal to the sum of the individual field searches. I did it on port 9200 like this and it's working. How do I write a must match query for this?
http://localhost:9200/indexname/typename/_search?q=Both:Yes++Type:Comm
Where Both is one field and Type is the other.
Thank you
You need to use an "AND" query.
GET hilden1/type1/_search
{
"query": {
"filtered": {
"filter": {
"and": {
"filters": [
{
"term": {
"both": "yes"
}
},
{
"term": {
"type": "comm"
}
}
]
}
}
}
}
}
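The filtered query and the and filter shown above were removed in Elasticsearch 5.0. On newer versions, the same "both conditions must hold" logic can be sketched with a bool query whose filter list contains both term clauses. The helper name is hypothetical, and the lowercase values assume the fields were analyzed to lowercase, as in the answer above.

```python
import json

def both_terms_query(both_value, type_value):
    """Require both term conditions; every clause in the filter list
    must match for a document to be returned."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"both": both_value}},
                    {"term": {"type": type_value}},
                ]
            }
        }
    }

body = both_terms_query("yes", "comm")
print(json.dumps(body, indent=2))
```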
I think this is what you need:
Elasticsearch URI based query with AND operator
_search?q=%2Bboth:yes%20%2Btype:comm
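The percent-encoding in that URL matters because a raw + in a URL query string is conventionally decoded as a space, so the ++ in the original request most likely never reached the query parser as a "must" operator. A small sketch using only the Python standard library shows how the encoded form can be produced:

```python
from urllib.parse import quote, unquote_plus

# Lucene query-string syntax: a leading '+' marks a term as required.
lucene_query = "+both:yes +type:comm"

# Keep ':' readable, but encode '+' as %2B and the space as %20.
encoded = quote(lucene_query, safe=":")
print(encoded)  # %2Bboth:yes%20%2Btype:comm

# Form-style decoding turns raw '+' characters into spaces.
print(unquote_plus("Both:Yes++Type:Comm"))
```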
