Elasticsearch/Painless: How do I skip an item when iterating?

I have a for loop that iterates a list. If the list contains a certain value, say "5", I want the loop to skip that value. But Painless seems determined to not permit that by not letting me have an empty if block or use a continue statement. How can I accomplish this?
"script_fields": {
"HResultCount": {
"script": {
"lang": "painless",
"inline": "int instance = 0; for (int i = 0; i < doc['numbers'].length; ++i) { if (doc['numbers'] == '5') { /* bail out */ } else { return 1.0; } }"
}
}

Since a script has to return a value in all cases, you can remove the value 5 from the list before iterating, as you suggested.
You can do this by calling removeIf on a copy of the list with a Java 8 lambda. Note that a List exposes size() rather than length:
"script_fields": {
"HResultCount": {
"script": {
"lang": "painless",
"inline": "int instance = 0; List copy = new ArrayList(doc['numbers']); copy.removeIf(i -> i == 5); for (int i = 0; i < copy.size(); ++i) { instance += copy[i]; } return instance;"
}
}
}
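As a rough illustration of the same idea outside Elasticsearch, here is a minimal plain-Java sketch (Painless borrows Java's collection API, so removeIf behaves the same way there). The class and method names are hypothetical and not part of any Elasticsearch API; the input list merely stands in for doc['numbers'].

```java
import java.util.ArrayList;
import java.util.List;

class SkipValueSketch {
    // Sums every element except those equal to `skip`.
    // Works on a copy, so the caller's list is left untouched.
    static long sumSkipping(List<Long> numbers, long skip) {
        List<Long> copy = new ArrayList<>(numbers);
        copy.removeIf(n -> n == skip); // drop the unwanted value before iterating
        long total = 0;
        for (long n : copy) {
            total += n;
        }
        return total; // a value is returned on every path
    }
}
```

Filtering up front keeps the loop body free of special cases, which sidesteps the need for continue or an empty if block.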

Related

Elasticsearch painless sorting null_pointer_error

I'm trying to create a sorting script with painless, filtering nested documents.
The reason I'm doing this with a script, is because I need to emulate a COALESCE statement.
My documents have titles stored like:
{
title: [
{
type: MainTitle,
value: [
{
language: eng,
label: The title
},
{
language: ger,
label: Das title
}
]
},
{
type: AvailabilityTitle,
value: [
{
language: eng,
label: New thing!
}
]
}
]
}
title and title.value are nested documents.
I want to sort documents primarily by their English MainTitle, and by their German MainTitle only if no English MainTitle exists, even if the German title gave a higher score.
To try it out, I'm first sorting only by the English MainTitle. This is the script:
def source = params._source;
def titles = source.title;
if (titles != null && titles.length > 0) {
  for (int i = 0; i < titles.length; i++) {
    def t = titles[i];
    if (t.type == 'MainTitle') {
      def values = t.value;
      if (values != null && values.length > 0) {
        for (int j = 0; j < values.length; j++) {
          def v = values[j];
          if (v.language == 'eng') {
            return v.label;
          }
        }
      }
    }
  }
}
return "";
For some reason I'm getting a null_pointer_exception:
"script_stack": [
"if (values != null && values.length > 0) { ",
" ^---- HERE"
],
I don't get how values can be null at that point since I'm specifically checking for null just before it.
The null_pointer_exception is thrown not because values is null, but because values has no length property. For some reason values is an ArrayList, even though titles earlier is an array. Apparently both support size(), so I can just use that.
So this works:
def source = params._source;
def titles = source.title;
if (titles != null && titles.size() > 0) {
  for (int i = 0; i < titles.size(); i++) {
    def t = titles[i];
    if (t.type == 'MainTitle') {
      def values = t.value;
      if (values != null && values.size() > 0) {
        for (int j = 0; j < values.size(); j++) {
          def v = values[j];
          if (v != null && v.language == 'eng') {
            return v.label;
          }
        }
      }
    }
  }
}
return '';
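The script above only covers the English title. The COALESCE-style fallback described in the question (English first, German only if no English title exists) could be sketched as follows. This is plain Java rather than Painless, the class and method names are hypothetical, and the nested List/Map shape is an assumption modelled on the sample document.

```java
import java.util.List;
import java.util.Map;

class TitleFallbackSketch {
    // Returns the label of the first MainTitle value in the given language,
    // or null if no such value exists.
    static String mainTitle(List<Map<String, Object>> titles, String language) {
        if (titles == null) {
            return null;
        }
        for (Map<String, Object> t : titles) {
            if (t == null || !"MainTitle".equals(t.get("type"))) {
                continue;
            }
            Object values = t.get("value");
            if (!(values instanceof List)) {
                continue;
            }
            for (Object o : (List<?>) values) {
                if (o instanceof Map) {
                    Map<?, ?> v = (Map<?, ?>) o;
                    if (language.equals(v.get("language"))) {
                        return (String) v.get("label");
                    }
                }
            }
        }
        return null;
    }

    // COALESCE-style sort key: English, then German, then the empty string.
    static String sortKey(List<Map<String, Object>> titles) {
        String eng = mainTitle(titles, "eng");
        if (eng != null) {
            return eng;
        }
        String ger = mainTitle(titles, "ger");
        return ger != null ? ger : "";
    }
}
```

Splitting the lookup from the fallback keeps each language check identical, so adding a third language would only touch sortKey.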

Painless script to increase the count if the full path exists or else add the full path and add the count

I am creating a script that increments the count value of a field if the field's full path exists, and otherwise adds the full path dynamically. In the example below:
If the record already has inner->board1->count, I should increment it by the incoming count value.
If inner, board1, or count is missing, I should add it and set the incoming count value. Please also note that the names inner, board1, and count are not fixed.
If the value is not an object, I can check it with ctx._source.myCounts == null, but I am not sure how to check object fields, subfields, and sub-subfields.
Code
POST test/_update/3
{
"script": {
"source": "ctx._source.board_counts = params.myCounts",
"lang": "painless",
"params": {
"myCounts": {
"inner":{
"board1":{"count":5},
"board2":{"count":4},
"board3":{"temp":1,"temp2":3}
},
"outer":{
"board1":{"count":5},
"board10":{"temp":1,"temp2":3}
}
}
}
}
}
I was able to come up with this, and it works fine.
POST test/_update/3
{
"script": {
"source": "if (ctx._source['myCounts'] == null) { ctx._source['myCounts'] = [:]; } for (mainItem in params.myCounts) { for (accessItemKey in mainItem.keySet()) { if (ctx._source.myCounts[accessItemKey] == null) { ctx._source.myCounts[accessItemKey] = [:]; } for (boardItemKey in mainItem[accessItemKey].keySet()) { if (ctx._source.myCounts[accessItemKey][boardItemKey] == null) { ctx._source.myCounts[accessItemKey][boardItemKey] = [:]; } for (countItemKey in mainItem[accessItemKey][boardItemKey].keySet()) { if (ctx._source.myCounts[accessItemKey][boardItemKey][countItemKey] == null) { ctx._source.myCounts[accessItemKey][boardItemKey][countItemKey] = mainItem[accessItemKey][boardItemKey][countItemKey]; } else { ctx._source.myCounts[accessItemKey][boardItemKey][countItemKey] += mainItem[accessItemKey][boardItemKey][countItemKey]; } } } } }",
"lang": "painless",
"params": {
"myCounts": {
"inner":{
"board1":{"count":5},
"board2":{"count":4},
"board3":{"temp":1,"temp2":3}
},
"outer":{
"board1":{"count":5},
"board10":{"temp":1,"temp2":3}
}
}
}
}
}
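The script above hard-codes three levels of nesting. One way to think about the general case is a recursive merge. The following is a minimal plain-Java sketch of that idea, not a Painless script; CountMergeSketch and merge are hypothetical names. It assumes that numeric leaves should be summed and that any other leaf simply overwrites the existing value.

```java
import java.util.HashMap;
import java.util.Map;

class CountMergeSketch {
    // Recursively merges `incoming` into `target`:
    // missing paths are created, existing numeric leaves are incremented.
    @SuppressWarnings("unchecked")
    static void merge(Map<String, Object> target, Map<String, Object> incoming) {
        for (Map.Entry<String, Object> e : incoming.entrySet()) {
            String key = e.getKey();
            Object value = e.getValue();
            if (value instanceof Map) {
                Object existing = target.get(key);
                if (!(existing instanceof Map)) {
                    // Path segment is missing (or not an object): create it.
                    existing = new HashMap<String, Object>();
                    target.put(key, existing);
                }
                merge((Map<String, Object>) existing, (Map<String, Object>) value);
            } else if (value instanceof Number) {
                Object existing = target.get(key);
                long current = existing instanceof Number ? ((Number) existing).longValue() : 0L;
                target.put(key, current + ((Number) value).longValue());
            } else {
                // Assumption: non-numeric leaves overwrite the stored value.
                target.put(key, value);
            }
        }
    }
}
```

Because the recursion follows whatever keys are present in the incoming map, nothing depends on the names inner, board1, or count, or on the depth of the structure.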

Elasticsearch scripted_metric null_pointer_exception

I'm trying to use the scripted_metric aggregation of Elasticsearch, and it normally works perfectly fine with my other scripts.
However, with the script below I'm getting a "null_pointer_exception", even though it is just a copy-paste of scripts that already work for 6 other modules.
$max = 10;
{
"query": {
"match_all": {}
//omitted some queries here, so I just turned it into match_all
}
},
"aggs": {
"ARTICLE_CNT_PDAY": {
"histogram": {
"field": "pub_date",
"interval": "86400"
},
"aggs": {
"LATEST": {
"nested": {
"path": "latest"
},
"aggs": {
"SUM_SVALUE": {
"scripted_metric": {
"init_script": "
state.te = [];
state.g = 0;
state.d = 0;
state.a = 0;
",
"map_script": "
if(state.d != doc['_id'].value){
state.d = doc['_id'].value;
state.te.add(state.a);
state.g = 0;
state.a = 0;
}
state.a = doc['latest.soc_mm_score'].value;
",
"combine_script": "
state.te.add(state.a);
double count = 0;
for (t in state.te) {
count += ((t*10)/$max)
}
return count;
",
"reduce_script": "
double count = 0;
for (a in states) {
count += a;
}
return count;
"
}
}
}
}
}
}
}
}
I ran this script in Kibana and got the null_pointer_exception mentioned above.
As far as I can tell, something is wrong with the reduce_script portion. I tried changing this part:
FROM
for (a in states) {
count += a;
}
TO
for (a in states) {
count += 1;
}
That worked perfectly fine, so I suspect the a variable isn't holding what it's supposed to hold.
Any ideas? I would appreciate your help, thank you very much!
The reason is explained in the Elasticsearch documentation for the scripted metric aggregation:
If a parent bucket of the scripted metric aggregation does not collect any documents an empty aggregation response will be returned from the shard with a null value. In this case the reduce_script's states variable will contain null as a response from that shard. reduce_script's should therefore expect and deal with null responses from shards.
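To make the quoted behaviour concrete, here is a small plain-Java sketch (not Painless) of a reduce step that receives a null from one shard. The class name and the list of per-shard results are hypothetical stand-ins for the states variable.

```java
import java.util.List;

class NullSafeReduceSketch {
    // Sums the per-shard results, treating a null entry
    // (a shard whose parent bucket collected no documents) as 0.
    static double reduce(List<Double> states) {
        double count = 0;
        for (Double a : states) {
            count += (a != null ? a : 0.0);
        }
        return count;
    }
}
```

Without the null check, adding a null entry to count would fail, which matches the symptom described in the question.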
So obviously one of your buckets is empty, and you need to deal with that null like this:
"reduce_script": "
double count = 0;
for (a in states) {
count += (a ?: 0);
}
return count;
"

How to pass Date as parameter to ElasticSearch

We are able to pass integer values as inline params, but not dates.
We are trying it like this:
"script": {
"inline": "if ((doc['enddate'].date >= param1) && (doc['enddate'].date <= param2)) { return param2 }",
"params": {
"param1": new DateTime(),
"param2": new DateTime(doc['enddate'].date).plusDays(+1)
}
}
You cannot reference document fields in inline parameters and in your case you don't really need any parameters. I suggest doing it the following way:
"script": {
"inline": "def now = new DateTime(); def tomorrow = now.plusDays(1); if ((doc['enddate'].date >= now) && (doc['enddate'].date <= tomorrow)) { return tomorrow }"
}
Note that you still need to return something in case the condition is not satisfied.
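The point about always returning something can be illustrated with a plain-Java sketch of the same window check. It uses java.time instead of the DateTime class from the script, the names are hypothetical, and returning null as the fallback is an assumption; a real script might prefer a default date or a boolean.

```java
import java.time.ZonedDateTime;

class DateWindowSketch {
    // Returns `now + 1 day` when endDate lies within [now, now + 1 day],
    // and null otherwise, so that every code path returns something.
    static ZonedDateTime windowEnd(ZonedDateTime endDate, ZonedDateTime now) {
        ZonedDateTime tomorrow = now.plusDays(1);
        boolean inWindow = !endDate.isBefore(now) && !endDate.isAfter(tomorrow);
        if (inWindow) {
            return tomorrow;
        }
        return null; // explicit fallback when the condition is not satisfied
    }
}
```

Passing now in as an argument, rather than reading the clock inside the method, keeps the check deterministic and easy to test.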

create parametric scripted_metric in elasticsearch

I want to use scripted_metric within an aggregation. My script has some parametric values that I want to set per query. Is it possible to create such a query at all?
Below is an example of what I'm looking for:
"aggs": {
"testAgg": {
"scripted_metric": {
"init_script": "_agg['maximum'] = []",
"map_script": "max = 0; for(tv in _source.tvs){ if(tv.att1>= param1 && tv.attr2 <= param2 && tv.att3 > max){max = tv.att3; }}; _agg.maximum.add(max);",
"combine_script": "sum = 0; for (m in _agg.maximum) { sum += m }; return sum;",
"reduce_script": "sum = 0; for (a in _aggs) { sum += a }; return sum;"
}
}
}
param1 and param2 are my parametric values. How should I change this aggregation for my purpose?
tnx :)
You can do it by specifying a global params map. When you supply params yourself, the map must also include the _agg entry that the scripts use:
"aggs": {
"testAgg": {
"scripted_metric": {
"params": {
"_agg": {},
"param1": 10,
"param2": 20
},
"init_script": "_agg['maximum'] = []",
"map_script": "max = 0; for(tv in _source.tvs){ if(tv.att1>= param1 && tv.attr2 <= param2 && tv.att3 > max){max = tv.att3; }}; _agg.maximum.add(max);",
"combine_script": "sum = 0; for (m in _agg.maximum) { sum += m }; return sum;",
"reduce_script": "sum = 0; for (a in _aggs) { sum += a }; return sum;"
}
}
}
