ElasticSearch : How can I boost score depending on field value? - elasticsearch

I am trying to avoid sorting in Elasticsearch by boosting _score based on field values. Here is my scenario:
I have a field in my documents, applicationDate, which holds the time elapsed since the epoch. I want records with a greater (more recent) applicationDate to get a higher score.
If two documents have the same score, I want to order them by another field of type String. Say "status" is another field that can have the values Available, In Progress, or Closed. Documents with the same applicationDate should then get a _score based on status.
Available should score highest, In Progress lower, and Closed lowest. That way I won't have to sort the documents after getting the results.
Please give me some pointers.

You should be able to achieve this using a function_score query.
Depending on your requirements, it could be as simple as the following.
Example:
put test/test/1
{
"applicationDate" : "2015-12-02",
"status" : "available"
}
put test/test/2
{
"applicationDate" : "2015-12-02",
"status" : "progress"
}
put test/test/3
{
"applicationDate" : "2016-03-02",
"status" : "progress"
}
post test/_search
{
"query": {
"function_score": {
"functions": [
{
"field_value_factor" : {
"field" : "applicationDate",
"factor" : 0.001
}
},
{
"filter": {
"term": {
"status": "available"
}
},
"weight": 360
},
{
"filter": {
"term": {
"status": "progress"
}
},
"weight": 180
}
],
"boost_mode": "multiply",
"score_mode": "sum"
}
}
}
Results:
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "3",
"_score": 1456877060,
"_source": {
"applicationDate": "2016-03-02",
"status": "progress"
}
},
{
"_index": "test",
"_type": "test",
"_id": "1",
"_score": 1449014780,
"_source": {
"applicationDate": "2015-12-02",
"status": "available"
}
},
{
"_index": "test",
"_type": "test",
"_id": "2",
"_score": 1449014660,
"_source": {
"applicationDate": "2015-12-02",
"status": "progress"
}
}
]
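The arithmetic behind the query above can be sketched in Python. This is only a rough model: it assumes the date field is scored as epoch milliseconds and that the inner query contributes a score of 1. The names `approx_score` and `STATUS_WEIGHT` are made up for illustration, and the float rounding Elasticsearch applies to `_score` is ignored, so the exact numbers can differ from the results shown.

```python
from datetime import datetime, timezone

# Weights taken from the two filter functions in the query above.
STATUS_WEIGHT = {"available": 360, "progress": 180}

def approx_score(doc, query_score=1.0, factor=0.001):
    """Approximate function_score with score_mode=sum and boost_mode=multiply."""
    day = datetime.strptime(doc["applicationDate"], "%Y-%m-%d")
    # Assumption: the date is scored as milliseconds since the epoch (UTC).
    epoch_millis = day.replace(tzinfo=timezone.utc).timestamp() * 1000
    date_part = epoch_millis * factor                  # field_value_factor
    status_part = STATUS_WEIGHT.get(doc["status"], 0)  # weight of the matching filter
    return query_score * (date_part + status_part)     # sum the functions, then multiply

docs = {
    "1": {"applicationDate": "2015-12-02", "status": "available"},
    "2": {"applicationDate": "2015-12-02", "status": "progress"},
    "3": {"applicationDate": "2016-03-02", "status": "progress"},
}

ranked = sorted(docs, key=lambda doc_id: approx_score(docs[doc_id]), reverse=True)
print(ranked)
```

With these inputs the sketch ranks the documents 3, 1, 2, the same order as the results above. The status weights only act as a tie-breaker while they stay below the smallest date gap you care about: one day is worth 86,400 after the 0.001 factor, far more than 360.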

Have you looked at function scores?
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
Specifically look at decay functions in the above documentation.
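As a rough illustration of what a decay function does, here is a Python sketch of the gauss decay formula as described in that documentation. The function name, the 30-day scale, and the "now" timestamp are assumptions chosen for the example, not values from the question.

```python
import math

def gauss_decay(value, origin, scale, offset=0.0, decay=0.5):
    """Gaussian decay: 1.0 at the origin, `decay` at a distance of `scale`."""
    distance = max(0.0, abs(value - origin) - offset)
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    return math.exp(-(distance ** 2) / (2.0 * sigma_sq))

DAY = 86_400          # seconds per day
now = 1_456_876_800   # hypothetical "now" in epoch seconds (2016-03-02 UTC)

for age_days in (0, 30, 90):
    score = gauss_decay(now - age_days * DAY, now, scale=30 * DAY)
    print(age_days, round(score, 4))
```

A document dated exactly at the origin gets the full score, one that is `scale` away gets `decay` times that, and older documents fade quickly, which gives a bounded recency boost instead of the very large raw values produced by field_value_factor on a date.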

There is also a field type called rank_feature that can be useful for this use case:
https://www.elastic.co/guide/en/elasticsearch/reference/current/rank-feature.html

Related

Elasticsearch - Find documents missing two fields

I'm trying to create a query that returns how many documents have no data for two fields (date.new and date.old). I have tried the query below, but it works with OR logic: it counts every document missing either date.new or date.old. Does anyone know how I can make it return only documents missing both fields?
{
"aggs":{
"Missing_field_count1":{
"missing":{
"field":"date.new"
}
},
"Missing_field_count2":{
"missing":{
"field":"date.old"
}
}
}
}
Aggregations are not the right feature for this. You need the exists query wrapped in a bool/must_not query, like this:
GET index/_count
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "date.new"
}
},
{
"exists": {
"field": "date.old"
}
}
]
}
}
}
When the same query is run through _search, hits.total.value is the number of documents that match the request, and relation indicates whether that value is exact (eq) or a lower bound (gte).
Index Data:
{
"date": {
"new": 1501,
"old": 10
}
}
{
"title": "elasticsearch"
}
{
"title": "elasticsearch-query"
}
{
"date": {
"new": 1400
}
}
The search query given by @Val shows how to achieve your use case.
Search Result:
"hits": {
"total": {
"value": 2, <-- note this
"relation": "eq"
},
"max_score": 0.0,
"hits": [
{
"_index": "65112793",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"title": "elasticsearch"
}
},
{
"_index": "65112793",
"_type": "_doc",
"_id": "5",
"_score": 0.0,
"_source": {
"title": "elasticsearch-query"
}
}
]
}
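The must_not/exists logic can be checked locally with a small Python sketch. `has_value` is only a rough stand-in for the exists query (it treats a missing key or a null as "no value"), and the documents are hypothetical samples modeled on the ones above.

```python
def has_value(doc, path):
    """Rough stand-in for `exists`: True if the dotted path holds a non-null value."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return node is not None

def missing_all(doc, paths):
    """Mirror bool/must_not with one exists clause per path: none of them may match."""
    return not any(has_value(doc, path) for path in paths)

# Hypothetical sample documents.
docs = [
    {"date": {"new": 1501, "old": 10}},
    {"title": "elasticsearch"},
    {"title": "elasticsearch-query"},
    {"date": {"new": 1400}},
]

matches = [doc for doc in docs if missing_all(doc, ["date.new", "date.old"])]
print(len(matches))
```

Only the two title-only documents pass, because a document is excluded as soon as any one of the exists clauses matches. That is what turns the OR behaviour of the two separate missing aggregations into "missing both".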

ElasticSearch: why it is not possible to get suggest by criteria?

I want to get suggestions from some text for a specific user.
As I understand it, Elasticsearch provides suggestions based on the whole dictionary (inverted index), which contains all the terms in the index.
So if user1 posts some text, that text can be suggested to user2. Am I right?
Is it possible to add a filter by some criterion (by user, for example) to reduce the set of terms to be suggested?
Yes, that's very much possible. Let me show you with an example that uses a query with a filter context:
Index definition
{
"mappings": {
"properties": {
"title": {
"type": "text" --> inverted index for storing suggestions on title field
},
"userId" : {
"type" : "keyword" --> like in your example
}
}
}
}
Index sample doc
{
"title" : "foo baz",
"userId" : "katrin"
}
{
"title" : "foo bar",
"userId" : "opster"
}
Search query without userId filter
{
"query": {
"bool": {
"must": {
"match": {
"title": "foo"
}
}
}
}
}
Search results (both documents are returned)
"hits": [
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar",
"userId": "opster" --> note another user
}
},
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "foo baz",
"userId": "katrin" --> note user
}
}
]
Now let's reduce the suggestions by keeping only the docs created by user katrin
Search query
{
"query": {
"bool": {
"must": {
"match": {
"title": "foo"
}
},
"filter": { --> note filter on userId field
"term": {
"userId": "katrin"
}
}
}
}
}
Search result
"hits": [
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "foo baz",
"userId": "katrin"
}
}
]
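If the request body is built in application code, the per-user restriction can be added conditionally. A minimal Python sketch, assuming the title and userId fields from the mapping above; the helper name `title_query` is made up for illustration.

```python
import json

def title_query(text, user_id=None):
    """Build the bool query above; add the userId term filter only when a user is given."""
    bool_query = {"must": {"match": {"title": text}}}
    if user_id is not None:
        # Filter context: restricts the result set without affecting _score.
        bool_query["filter"] = {"term": {"userId": user_id}}
    return {"query": {"bool": bool_query}}

print(json.dumps(title_query("foo", user_id="katrin"), indent=2))
```

Calling `title_query("foo")` without a user yields the unfiltered query, so the same helper covers both searches shown above.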

How to change the order of search results on Elastic Search?

I am getting results from the following Elasticsearch query:
"query": {
"bool": {
"should": [
{"match_phrase_prefix": {"title": keyword}},
{"match_phrase_prefix": {"second_title": keyword}}
]
}
}
The results are good, but I want to change their order so that results matching on title come first.
Any help would be appreciated!!!
I was able to reproduce the issue with sample data. My solution uses a query-time boost, as index-time boosting has been deprecated since Elasticsearch 5.0.
I've also created the sample data so that, without a boost, both documents get the same score, so there is no guarantee that the one matching on title comes first in the search results. This should help you understand the problem better.
1. Index Mapping
{
"mappings": {
"properties": {
"title": {
"type": "text"
},
"second_title" :{
"type" :"text"
}
}
}
}
2. Index Sample docs
a)
{
"title": "opster",
"second_title" : "Dimitry"
}
b)
{
"title": "Dimitry",
"second_title" : "opster"
}
Search query
{
"query": {
"bool": {
"should": [
{
"match_phrase_prefix": {
"title": {
"query" : "dimitry",
"boost" : 2.0 <-- Notice the boost in `title` field
}
}
},
{
"match_phrase_prefix": {
"second_title": {
"query" : "dimitry"
}
}
}
]
}
}
}
Output
"hits": [
{
"_index": "60454337",
"_type": "_doc",
"_id": "1",
"_score": 1.3862944,
"_source": {
"title": "Dimitry", <-- Dimitry in title field has double score
"second_title": "opster"
}
},
{
"_index": "60454337",
"_type": "_doc",
"_id": "2",
"_score": 0.6931472,
"_source": {
"title": "opster",
"second_title": "Dimitry"
}
}
]
Let me know if you have any doubt understanding it.
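Since the original query takes the keyword from a variable, the boosted version can be generated in code. A minimal Python sketch; the helper name `prefix_query` and the idea of passing boosts as a dict are assumptions for illustration, not part of any client library.

```python
def prefix_query(keyword, field_boosts):
    """Build one match_phrase_prefix clause per field, with an optional boost."""
    should = []
    for field, boost in field_boosts.items():
        clause = {"query": keyword}
        if boost != 1.0:
            clause["boost"] = boost  # query-time boost, as in the title clause above
        should.append({"match_phrase_prefix": {field: clause}})
    return {"query": {"bool": {"should": should}}}

body = prefix_query("dimitry", {"title": 2.0, "second_title": 1.0})
print(body)
```

Raising the title boost further widens the gap between title matches and second_title matches; a boost of 2.0 is enough here only because both clauses would otherwise score the same.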

Elasticsearch GET the last document for a given field if it exists

I have a short question which seems to be simple, but I wasn't able to find any answer so far.
I want to retrieve, from an Elasticsearch node, the latest document according to a date field, but only among documents that contain a specific field.
For instance, let's say I want to get the last purchase that contains the field "promotionCode":
Query:
http://elasticsearch:9200/store1/purchase/_search?q=vendor:Marie&size=1&sort=date:desc
where store1 is my index and purchase is a document type.
Now let's say I have these two documents in my Elasticsearch:
"hits": [
{
"_index": "store1",
"_type": "purchase",
"_id": "1",
"_score": 1,
"_source": {
"date": "2016-03-16T12:53:16.000Z",
"vendor": "Marie",
"promotionCode": "XYZ123"
}
},
{
"_index": "store1",
"_type": "purchase",
"_id": "2",
"_score": 1,
"_source": {
"date": "2016-03-18T12:53:16.000Z",
"vendor": "Marie"
}
}
]
The above query retrieves the document with id 2, which has no "promotionCode" field.
If I want to get the last document that contains a specific field, how do I do that?
I explored the "fields" filter, but it only returns an empty document if the field is not present. I also read about source filtering, but I'm not sure it does what I want.
Thanks a lot for any hints!
You can try this query:
{
"query": {
"term": { "vendor": "Marie" }
},
"filter": {
"bool": {
"must_not": { "missing": { "field": "promotionCode" } }
}
},
"sort": { "date" : "desc" },
"size": 1
}
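You can use the exists query. Move the vendor condition into the request body as well, because a q parameter in the URL overrides the query in the body:
GET /store1/purchase/_search
{
"size": 1,
"sort": [ { "date": "desc" } ],
"query": {
"bool": {
"must": { "match": { "vendor": "Marie" } },
"filter": { "exists": { "field": "promotionCode" } }
}
}
}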
Hope it helps!!
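The intended result can be sanity-checked in plain Python against the two documents from the question. This is only an in-memory sketch of the logic (vendor matches, field present, newest date first, one hit); `latest_with_field` is a made-up helper name.

```python
# The two purchases from the question, flattened for the example.
purchases = [
    {"_id": "1", "date": "2016-03-16T12:53:16.000Z", "vendor": "Marie",
     "promotionCode": "XYZ123"},
    {"_id": "2", "date": "2016-03-18T12:53:16.000Z", "vendor": "Marie"},
]

def latest_with_field(docs, vendor, field):
    """Return the newest doc for `vendor` that has a non-null `field`, or None."""
    candidates = [
        doc for doc in docs
        if doc.get("vendor") == vendor and doc.get(field) is not None
    ]
    # ISO 8601 timestamps in the same format and zone sort correctly as strings.
    return max(candidates, key=lambda doc: doc["date"], default=None)

hit = latest_with_field(purchases, "Marie", "promotionCode")
print(hit["_id"])
```

Document 2 is newer but is dropped before sorting because it has no promotionCode, so the sketch returns document 1, which is the behaviour the exists condition is meant to produce.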

Elasticsearch exact match of specific field(s)

I'm trying to filter my elasticsearch index by specific fields, the "country" field to be exact. However, I keep getting loads of other results (other countries) back that are not exact.
Please could someone point me in the right direction.
I've tried the following searches:
GET http://127.0.0.1:9200/decision/council/_search
{
"query": {
"filtered": {
"filter": {
"term": {
"country": "Algeria"
}
}
}
}
}
Here is an example document:
{
"_index": "decision",
"_id": "54290140ec882c6dac5ae9dd",
"_score": 1,
"_type": "council",
"_source": {
"document": "DEV DOCUMENT",
"id": "54290140ec882c6dac5ae9dd",
"date_updated": 1396448966,
"pdf_file": null,
"reported": true,
"date_submitted": 1375894031,
"doc_file": null,
"country": "Algeria"
}
}
You can use the match_phrase query instead
POST http://127.0.0.1:9200/decision/council/_search
{
"query" : {
"match_phrase" : { "country" : "Algeria"}
}
}
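One common reason match_phrase behaves differently from term can be sketched in Python. This assumes that country is an analyzed string field whose analyzer lowercases tokens; the mapping is not shown in the question, so treat that as an assumption. The `analyze` function is a deliberately crude stand-in, not Elasticsearch's analyzer.

```python
def analyze(text):
    """Crude stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()

def term_lookup(indexed_terms, value):
    """term: the value is compared to the indexed terms as-is, without analysis."""
    return value in indexed_terms

def phrase_lookup(indexed_terms, value):
    """match_phrase: the input is analyzed first, then matched as a token sequence."""
    wanted = analyze(value)
    n = len(wanted)
    return any(indexed_terms[i:i + n] == wanted
               for i in range(len(indexed_terms) - n + 1))

indexed = analyze("Algeria")  # what the analyzed field would store: ["algeria"]

print(term_lookup(indexed, "Algeria"))
print(phrase_lookup(indexed, "Algeria"))
```

Under that assumption the term lookup for "Algeria" finds nothing because the stored token is lowercase, while the phrase lookup analyzes the input the same way the field was indexed and matches.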
