Painless script to increase the count if the full path exists or else add the full path and add the count - elasticsearch

I am creating a script to increase the count value of the field if the field full path exist, or else I have to add the full path dynamically. For example, In the below example
If the record already has inner->board1->count, I should increment the value of it by the value of the count
If I don't have inner or board1 or count, I should add them and add the value of the count. Please also note here the inner or board1orcount` are not fixed.
If the value is not an object, I can check using ctx._source.myCounts == null, but I am not sure how to check for the object fields and subfields and sub subfields.
Code
POST test/_update/3
{
"script": {
"source": "ctx._source.board_counts = params.myCounts",
"lang": "painless",
"params": {
"myCounts": {
"inner":{
"board1":{"count":5},
"board2":{"count":4},
"board3":{"temp":1,"temp2":3}
},
"outer":{
"board1":{"count":5},
"board10":{"temp":1,"temp2":3}
}
}
}
}
}

I am able to come up with this and working fine.
POST test/_update/3
{
"script": {
"source": "{"source": "if (ctx._source['myCounts'] == null) {ctx._source['myCounts'] = [:];} for (mainItem in params.myCounts) { for (accessItemKey in mainItem.keySet()) { if (ctx._source.myCounts[accessItemKey] == null) { ctx._source.myCounts[accessItemKey] = [:];}for (boardItemKey in mainItem[accessItemKey].keySet()) {if (ctx._source.myCounts[accessItemKey][boardItemKey] == null) {ctx._source.myCounts[accessItemKey][boardItemKey] = [:];} for (countItemKey in mainItem[accessItemKey][boardItemKey].keySet()) { if (ctx._source.myCounts[accessItemKey][boardItemKey][countItemKey] == null) { ctx._source.myCounts[accessItemKey][boardItemKey][countItemKey] =mainItem[accessItemKey][boardItemKey][countItemKey]; }else {ctx._source.myCounts[accessItemKey][boardItemKey][countItemKey] += mainItem[accessItemKey][boardItemKey][countItemKey];}}}}}",
"lang": "painless",
"params": {
"myCounts": {
"inner":{
"board1":{"count":5},
"board2":{"count":4},
"board3":{"temp":1,"temp2":3}
},
"outer":{
"board1":{"count":5},
"board10":{"temp":1,"temp2":3}
}
}
}
}
}

Related

Elasticsearch: Use loop in Painless script

I have an old version of Elasticsearch (5.6.16) on a production environment that I can't upgrade.
I'm trying to use a loop in a painless script_score script but I always face a runtime error.
All my documents can have one or several "badges", here the mapping:
"myDocument":{
"properties":{
"badges":{
"type":"nested",
"properties":{
"name":{
"type":"keyword"
}
}
},
}
},
My goal is to do a custom script that will provide a better score for documents with a specific type of badge
So I made this script
for (item in doc['badges']) {
if (item['name'] == "myCustomBadge") {
return _score * 10000;
}
}
return _score;
But unfortunately, I'm getting errors while I try to use it
{
"query":{
"function_score":{
"query":{
"match_all": {}
},
"functions":[
{
"script_score":{
"script":{
"inline":"for (item in doc['badges']) { if (item['name'] == \"myCustomBadge\") { return _score * 10000; }}return _score;",
"lang":"painless"
}
}
},
]
}
}
}
"error":{
"root_cause":[
{
"type":"script_exception",
"reason":"runtime error",
"script_stack":[
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:77)",
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:36)",
"for (item in doc['badges']) { ",
" ^---- HERE"
],
"script":"for (item in doc['badges']) { if (item['name'] == \"myCustomBadge\") { return _score * 10000; }}return _score;",
"lang":"painless"
}
],
"type":"search_phase_execution_exception",
"reason":"all shards failed"
}
I tried to change the for with an other variant, but same error.
for(int i = 0; i < doc['badges']; i++) {
if (doc['badges'][i]['name'] == "uaWorker") {
return _score * 10000;
}
}
return _score;
Could you help me find what I did wrong?
Thanks you all
The problem is not the loop but the fact that badges is nested and you're trying to access it from doc values. In this case, you need to access the array of badges from the _source document directly, like this:
for (item in params._source['badges'])

Run Elasticsearch processor on all the fields of a document

I am trying to trim and lowercase all the values of the document that is getting indexed into Elasticsearch
The processors available has the field key is mandatory. This means one can use a processor on only one field
Is there a way to run a processor on all the fields of a document?
There sure is. Use a script processor but beware of reserved keys like _type, _id etc:
PUT _ingest/pipeline/my_string_trimmer
{
"description": "Trims and lowercases all string values",
"processors": [
{
"script": {
"source": """
def forbidden_keys = [
'_type',
'_id',
'_version_type',
'_index',
'_version'
];
def corrected_source = [:];
for (pair in ctx.entrySet()) {
def key = pair.getKey();
if (forbidden_keys.contains(key)) {
continue;
}
def value = pair.getValue();
if (value instanceof String) {
corrected_source[key] = value.trim().toLowerCase();
} else {
corrected_source[key] = value;
}
}
// overwrite the original
ctx.putAll(corrected_source);
"""
}
}
]
}
Test with a sample doc:
POST my-index/_doc?pipeline=my_string_trimmer
{
"abc": " DEF ",
"def": 123,
"xyz": false
}

Elasticsearch scripted_metric null_pointer_exception

I'm trying to use the scripted_metric aggs of Elasticsearch and normally, it's working perfectly fine with my other scripts
However, with script below, I'm encountering an error called "null_pointer_exception" but they're just copy-pasted scripts and working for 6 modules already
$max = 10;
{
"query": {
"match_all": {}
//omitted some queries here, so I just turned it into match_all
}
},
"aggs": {
"ARTICLE_CNT_PDAY": {
"histogram": {
"field": "pub_date",
"interval": "86400"
},
"aggs": {
"LATEST": {
"nested": {
"path": "latest"
},
"aggs": {
"SUM_SVALUE": {
"scripted_metric": {
"init_script": "
state.te = [];
state.g = 0;
state.d = 0;
state.a = 0;
",
"map_script": "
if(state.d != doc['_id'].value){
state.d = doc['_id'].value;
state.te.add(state.a);
state.g = 0;
state.a = 0;
}
state.a = doc['latest.soc_mm_score'].value;
",
"combine_script": "
state.te.add(state.a);
double count = 0;
for (t in state.te) {
count += ((t*10)/$max)
}
return count;
",
"reduce_script": "
double count = 0;
for (a in states) {
count += a;
}
return count;
"
}
}
}
}
}
}
}
}
I tried running this script in Kibana, and here's the error message:
What I'm getting is, that there's something wrong with the reduce_script portion, tried to change this part:
FROM
for (a in states) {
count += a;
}
TO
for (a in states) {
count += 1;
}
And worked perfectly fine, I felt that the a variable isn't getting what it's supposed to hold
Any ideas here? Would appreciate your help, thank you very much!
The reason is explained here:
If a parent bucket of the scripted metric aggregation does not collect any documents an empty aggregation response will be returned from the shard with a null value. In this case the reduce_script's states variable will contain null as a response from that shard. reduce_script's should therefore expect and deal with null responses from shards.
So obviously one of your buckets is empty, and you need to deal with that null like this:
"reduce_script": "
double count = 0;
for (a in states) {
count += (a ?: 0);
}
return count;
"

ElasticSearch/Painless: How do I skip an item when iterating?

I have a for loop that iterates a list. If the list contains a certain value, say "5", I want the loop to skip that value. But Painless seems determined to not permit that by not letting me have an empty if block or use a continue statement. How can I accomplish this?
"script_fields": {
"HResultCount": {
"script": {
"lang": "painless",
"inline": "int instance = 0; for (int i = 0; i < doc['numbers'].length; ++i) { if (doc['numbers'] == '5') { /* bail out */ } else { return 1.0; } }"
}
}
Since a script has to return a value in all cases, you can remove the value 5 from the list before iterating as you suggested.
You can achieve this like that by calling removeIf on a copy of your list with a Java 8 lambda:
"script_fields": {
"HResultCount": {
"script": {
"lang": "painless",
"inline": "int instance = 0; List copy = new ArrayList(doc['numbers']); copy.removeIf(i -> i == 5); for (int i = 0; i < copy.length; ++i) { instance += copy[i]; } return instance;"
}
}

How to pass Date as parameter to ElasticSearch

We areable to pass the integer values as part of inline params but not date..
We are trying it like this.
"script": {
"inline": "if ((doc['enddate'].date >= param1) && (doc['enddate'].date <= param2)) { return param2 }",
"params": {
"param1": new DateTime(),
"param2": new DateTime(doc['enddate'].date).plusDays(+1)
}
}
You cannot reference document fields in inline parameters and in your case you don't really need any parameters. I suggest doing it the following way:
"script": {
"inline": "def now = new DateTime(); def tomorrow = now.plusDays(1); if ((doc['enddate'].date >= now) && (doc['enddate'].date <= tomorrow)) { return tomorrow }"
}
Note that you still need to return something in case the condition is not satisfied.

Resources