Function score ignored - elasticsearch

I have two nearly identical documents, one of which has the fields CONSTRUCTION: 1 and EDUCATION: 0.1, the other with CONSTRUCTION: 0.1 and EDUCATION: 1. I want to be able to sort results by the value of either the CONSTRUCTION or EDUCATION field
GET /objects/_search
{
"query": {
"function_score": {
"query": {
"match": {
"name": {
"query": "Monkeys"
}
}
},
"field_value_factor": {
"field" : "CONSTRUCTION",
"missing": 1
}
}
},
"_source": ["name", "CONSTRUCTION", "EDUCATION"]
}
Returns the incorrect results:
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1.7622693,
"hits": [
{
"_index": "objects__feed_id_key_pages__date_2019-12-10__timestamp_1575988952__batch_id_3gpnz7fc__",
"_type": "_doc",
"_id": "dit:greatDomesticUi:KeyPages:12",
"_score": 1.7622693,
"_source": {
"CONSTRUCTION": 0.1,
"name": "Space Monkeys - education",
"EDUCATION": 1
}
},
{
"_index": "objects__feed_id_key_pages__date_2019-12-10__timestamp_1575988952__batch_id_3gpnz7fc__",
"_type": "_doc",
"_id": "dit:greatDomesticUi:KeyPages:11",
"_score": 1.0226655,
"_source": {
"CONSTRUCTION": 1,
"name": "Space Monkeys - construction",
"EDUCATION": 0.1
}
}
]
}
}
This only always returns the same results. Indeed if you misspell the field_value_factor field, you get the same score "field_value_factor": { "field" : "WHATEVER",... }. This suggests the field simply isn't being read.

Dynamic mapping was turned off. The EDUCATION and CONSTRUCTION fields were not mapped. Mystery solved!

Related

Elasticsearch scripted fields. Dynamic param selection

I'm trying to have a dynamic param based on a value in the document.
I tried so far this here
body: {
"script_fields": {
"potentialIncome": {
"script": {
"lang": "painless",
"source": "doc.rentPrice.value - params['doc.buyingPrice.value']",
"params": {
120000: 1200,
150000: 1500
}
}
}
}
}
This throws me following error:
type: 'script_exception',
reason: 'runtime error',
script_stack:
[ 'doc.rentPrice.value - params[\'doc.buyingPrice.value\']',
' ^---- HERE' ],
script: 'doc.rentPrice.value - params[\'doc.buyingPrice.value\']',
lang: 'painless'
I would like to have params dynamic in a way that the doc value buyingPrice decides which value to deduct.
Using ElasticSearch 7.2
A complicated and bad way is to use following script
if(doc['buyingPrice'].value==120000){return doc['rentPrice'].value-params['120000']}
else if(doc['buyingPrice'].value==150000){return doc['rentPrice'].value-params['150000']}
The Es object:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.0,
"hits": [{
"_index": "immo",
"_type": "objects",
"_id": "1",
"_score": 1.0,
"_source": {
"buyingPrice": 120000,
"rentPrice": 500
}
}, {
"_index": "immo",
"_type": "objects",
"_id": "2",
"_score": 1.0,
"_source": {
"buyingPrice": 150000,
"rentPrice": 500
}
}]
}
}
You need to try without the single quotes.
"source": "return (params[String.valueOf(doc.buyingPrice.value)] != null) ? doc.rentPrice.value - params[String.valueOf(doc.buyingPrice.value)] : 0",
^ ^
| |

How do i get accurate sum in elasticsearch based on source hits?

How do i get an exact sum aggregation in elasticsearch? Fore reference i am currently using elasticsearch 5.6 and the my index mapping looks like this:
{
"my-index":{
"mappings":{
"my-type":{
"properties":{
"id":{
"type":"keyword"
},
"fieldA":{
"type":"double"
},
"fieldB":{
"type":"double"
},
"fieldC":{
"type":"double"
},
"version":{
"type":"long"
}
}
}
}
}
}
The search query generated (using java client) is:
{
/// ... some filters here
"aggregations" : {
"fieldA" : {
"sum" : {
"field" : "fieldA"
}
},
"fieldB" : {
"sum" : {
"field" : "fieldB"
}
},
"fieldC" : {
"sum" : {
"field" : "fieldC"
}
}
}
}
However my result hits generate the following:
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 3.8466966,
"hits": [
{
"_index": "my-index",
"_type": "my-type",
"_id": "25a203b63e264fd2be13db006684b06d",
"_score": 3.8466966,
"_source": {
"fieldC": 108,
"fieldA": 108,
"fieldB": 0
}
},
{
"_index": "my-index",
"_type": "my-type",
"_id": "25a203b63e264fd2be13db006684b06d",
"_score": 3.8466966,
"_source": {
"fieldC": -36,
"fieldA": 108,
"fieldB": 144
}
},
{
"_index": "my-index",
"_type": "my-type",
"_id": "25a203b63e264fd2be13db006684b06d",
"_score": 3.8466966,
"_source": {
"fieldC": -7.2,
"fieldA": 1.8,
"fieldB": 9
}
},
{
"_index": "my-index",
"_type": "my-type",
"_id": "25a203b63e264fd2be13db006684b06d",
"_score": 3.8466966,
"_source": {
"fieldC": 14.85,
"fieldA": 18.9,
"fieldB": 4.05
}
},
{
"_index": "my-index",
"_type": "my-type",
"_id": "25a203b63e264fd2be13db006684b06d",
"_score": 3.8466966,
"_source": {
"fieldC": 36,
"fieldA": 36,
"fieldB": 0
}
}
]
},
"aggregations": {
"fieldA": {
"value": 272.70000000000005
},
"fieldB": {
"value": 157.05
},
"fieldC": {
"value": 115.64999999999999
}
}
}
why do i get:
115.64999999999999 instead of 115.65 in fieldC
272.70000000000005 instead of 272.7 in fieldA
should i use float instead of double? or is there a way i can change the query without using painless script and using java's BigDecimal with specified precision and rounding mode?
It has to do with float number precision in JavaScript (similar to what can be seen here and explained here).
Here are two ways to check this:
A. If you node.js installed, just type node at the prompt and then enter the sum of all fieldA values:
$ node
108 - 36 - 7.2 + 14.85 + 36
115.64999999999999 <--- this is the answer
B. Open the Developer tools of your browser and pick the Console view. Then type the same sum as above:
> 108-36-7.2+14.85+36
< 115.64999999999999
As you can see, both results are consistent with what you're seeing in your ES response.
One way to circumvent this is to store your numbers either as normal integers (i.e. 1485 instead of 14.85, 3600 instead of 36, etc) or as scaled_float with a scaling factor of 100 (or bigger depending on the precision you need)

Elasticsearch query that requires all values in array to be present

Heres a sample query:
{
"query":{
"constant_score":{
"filter":{
"terms":{
"genres_slugs":["simulator", "strategy", "adventure"]
}
}
}
},
"sort":{
"name.raw":{
"order":"asc"
}
}
}
The value mapped to the genres_slugs property is just a simple array.
What i'm trying to do here is match all games that have all the values in the array: ["simulator","strategy","adventure"]
As in, the resulting items MUST have all those values. What's returning instead are results that have only one value and not the others.
Been going at this for 6 hours now :(
Ok, if the resulting items MUST have all those values, use MUST param instead of FILTER.
{ "query":
{ "constant_score" :
{ "filter" :
{ "bool" :
{ "must" : [
{ "term" :
{"genres_slugs":"simulator"}
},
{ "term" :
{"genres_slugs":"strategy"}
},
{ "term" :
{"genres_slugs":"adventure"}
}]
}
}
}
}
}
This returns:
{
"took": 54,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "try",
"_type": "stackoverflowtry",
"_id": "123",
"_score": 1,
"_source": {
"genres_slugs": [
"simulator",
"strategy",
"adventure"
]
}
},
{
"_index": "try",
"_type": "stackoverflowtry",
"_id": "126",
"_score": 1,
"_source": {
"genres_slugs": [
"simulator",
"strategy",
"adventure"
]
}
}
]
}
}
Doc:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_multiple_exact_values.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-common-terms-query.html

Custom scoring function in Elasticsearch does not return expected field value

I create a custom scoring function for my documents that just returns the value of the field a for each document. But for some reason, in the example below, the last digits of the _score in the results differ from the last digits of the value of a for each document. What is happening here?
PUT test/doc/1
{
"a": 851459198
}
PUT test/doc/2
{
"a": 984968088
}
GET test/_search
{
"query": {
"function_score": {
"script_score": {
"script": {
"inline": "doc[\"a\"].value"
}
}
}
}
}
That will return the following:
{
"took": 16,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 984968060,
"hits": [
{
"_index": "test",
"_type": "doc",
"_id": "2",
"_score": 984968060,
"_source": {
"a": 984968088
}
},
{
"_index": "test",
"_type": "doc",
"_id": "1",
"_score": 851459200,
"_source": {
"a": 851459198
}
}
]
}
}
Why is the _score different than the value of the field a?
I'm using Elasticsearch 2.1.1
The _score value is internally hard coded as a float which can only accurately represent integers up to the value 134217728. Therefore, if you want to make use, in the scoring function, of a field stored as a number larger than that, it will overflow the buffer and be truncated. See this github issue

How to filter out elements from an array that doesn’t match the query?

stackoverflow won't let me write that much example code so I put it on gist.
So I have this index
with this mapping
here is a sample document I insert into newly created mapping
this is my query
GET products/paramSuggestions/_search
{
"size": 10,
"query": {
"filtered": {
"query": {
"match": {
"paramName": {
"query": "col",
"operator": "and"
}
}
}
}
}
}
this is the unwanted result I get from previous query
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.33217794,
"hits": [
{
"_index": "products",
"_type": "paramSuggestions",
"_id": "1",
"_score": 0.33217794,
"_source": {
"productName": "iphone 6",
"params": [
{
"paramName": "color",
"value": "white"
},
{
"paramName": "capacity",
"value": "32GB"
}
]
}
}
]
}
}
and finally the wanted result, how I want the query result to look like
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.33217794,
"hits": [
{
"_index": "products",
"_type": "paramSuggestions",
"_id": "1",
"_score": 0.33217794,
"_source": {
"productName": "iphone 6",
"params": [
{
"paramName": "color",
"value": "white"
},
]
}
}
]
}
}
How should the query look like to achieve the wanted result with filtered array field which matches the query? In other words, all other non-matching array items should not appear in the final result.
The final result is the _source document that you indexed. There is no feature that lets you mask field elements of your document out of the Elasticsearch response.
That said, depending on your goal, you can look into how Highlighters and Suggesters identify result terms matching the query, or possibly, roll-your-own client-side masking using info returned from setting "explain": true in your query.

Resources