NPE when partially updating Elasticsearch Index - elasticsearch

I'm following an example for updating a set of tags using a partial update in Elasticsearch.
This is my script:
{
"script": {
"lang": "painless",
"inline": "ctx._source.deviceTags.add(params.tags)",
"params": {
"tags": "search"
}
}
}
Request URL is:
https://aws-es-service-url/devices/device/123/_update
But I'm getting the following response:
{
"error": {
"root_cause": [
{
"type": "remote_transport_exception",
"reason": "[fBaExM8][x.x.x.x:9300][indices:data/write/update[s]]"
}
],
"type": "illegal_argument_exception",
"reason": "failed to execute script",
"caused_by": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"ctx._source.deviceTags.add(params.tags)",
" ^---- HERE"
],
"script": "ctx._source.deviceTags.add(params.tags)",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null
}
}
},
"status": 400
}
Any idea of what I have done wrong?

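Since your deviceTags array is initially null, there are two ways to solve this:
A. Use upsert so that deviceTags is set when the document is first created. Note that the upsert body is only used when the document does not exist yet; for an existing document without deviceTags, use option B.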
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.deviceTags.add(params.tags)",
    "params": {
      "tags": "search"
    }
  },
  "upsert": {
    "deviceTags": ["search"]
  }
}
B. Protect your code against NPE
{
  "script": {
    "lang": "painless",
    "inline": "(ctx._source.deviceTags = ctx._source.deviceTags ?: []).add(params.tags)",
    "params": {
      "tags": "search"
    }
  }
}
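If the same tag may be sent more than once, option B can be extended so that the list is created when it is missing and the tag is appended only when it is not already present. This is a minimal sketch, not part of the original answer: the parameter name tag is an assumption, and the body is sent to the same _update URL as in the question. The script text key is inline here to match the question; recent Elasticsearch versions name it source.

```json
{
  "script": {
    "lang": "painless",
    "inline": "if (ctx._source.deviceTags == null) { ctx._source.deviceTags = []; } if (!ctx._source.deviceTags.contains(params.tag)) { ctx._source.deviceTags.add(params.tag); }",
    "params": {
      "tag": "search"
    }
  }
}
```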

Related

Is there any problem in my elasticsearch request?

I'm trying to update a doc in elasticsearch by using:
POST /rcqmkg_eco_ugc_rec/_doc/aaa/_update
{
"script": {
"source": "ctx._source.flower_cnt_0 += params.flower_cnt_0",
"lang": "painless",
"params": {
"flower_cnt_0": 1
}
}
}
But I got an illegal_argument_exception error. The response from Elasticsearch is:
{
"error": {
"root_cause": [
{
"type": "remote_transport_exception",
"reason": "[data1_xxx][xxx][indices:data/write/update[s]]"
}
],
"type": "illegal_argument_exception",
"reason": "failed to execute script",
"caused_by": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"ctx._source.flower_cnt_0 += params.flower_cnt_0",
" ^---- HERE"
],
"script": "ctx._source.flower_cnt_0 += params.flower_cnt_0",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null
}
}
},
"status": 400
}
Where is the problem in my update request of es?
If the document you're trying to update might not contain a field called flower_cnt_0, you need to account for that in your script, for example by defaulting the missing value to 0.
Try this instead:
POST /rcqmkg_eco_ugc_rec/_doc/aaa/_update
{
  "script": {
    "source": "ctx._source.flower_cnt_0 = (ctx._source.flower_cnt_0 ?: 0) + params.flower_cnt_0",
    "lang": "painless",
    "params": {
      "flower_cnt_0": 1
    }
  }
}
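The null-safe script above covers an existing document that lacks flower_cnt_0. If the document itself might not exist yet, the update API also accepts an upsert body, which is indexed as the new document in that case instead of running the script. A sketch under that assumption, sent to the same _update endpoint (the initial value 1 is illustrative):

```json
{
  "script": {
    "source": "ctx._source.flower_cnt_0 = (ctx._source.flower_cnt_0 ?: 0) + params.flower_cnt_0",
    "lang": "painless",
    "params": {
      "flower_cnt_0": 1
    }
  },
  "upsert": {
    "flower_cnt_0": 1
  }
}
```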

Scripted Metric Aggregation fails on unexpected = character

I'm trying to build an object containing all time deltas between two logs with the same recordId.
To do so, I'm executing the following query (a Scripted Metric Aggregation) against the Elasticsearch index where the logs are stored, using Fiddler:
{
"query": {
"exists": {
"field": "recordId"
}
},
"aggs": {
"deltas": {
"scripted_metric": {
"init_script": {
"source": "state.deltas = {};",
"lang": "expression"
},
"map_script": {
"source": " if (!(doc['topic'].value in state.deltas)) { state.deltas[doc['topic'].value] = {} } state.deltas[doc['topic'].value][doc['recordId'].value] = !(doc['recordId'].value in state.deltas[doc['topic'].value]) ? doc['#timestamp'].date.millisOfDay : Math.abs(state.deltas[doc['topic'].value][doc['recordId'].value] - doc['#timestamp'].date.millisOfDay) ",
"lang": "expression"
},
"reduce_script": {
"source": " res = {}; for (s in states) { for(topic in Object.keys(s.deltas)) { if (!(topic in res)) { res[topic] = {} } for(recordId in Object.keys(s.deltas[topic])) { res[topic][recordId] = !(recordId in res[topic]) ? s.deltas[topic][recordId] : Math.abs(res[topic][recordId] - s.deltas[topic][recordId]) } } } return res;",
"lang": "expression"
}
}
}
}
}
But it fails because of an unexpected '=' character.
I tried other ways, but it always throws the same error.
What am I missing?
{
"error": {
"root_cause": [
{
"type": "lexer_no_viable_alt_exception",
"reason": "lexer_no_viable_alt_exception: null"
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "il0:il1-sys-logs-2020.10.07",
"node": "5bmIYI_iSVCtx41qEEbASg",
"reason": {
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"state.deltas = {};",
" ^---- HERE"
],
"script": "state.deltas = {};",
"lang": "expression",
"caused_by": {
"type": "parse_exception",
"reason": "parse_exception: unexpected character '= ' on line (1) position (13)",
"caused_by": {
"type": "lexer_no_viable_alt_exception",
"reason": "lexer_no_viable_alt_exception: null"
}
}
}
}
],
"caused_by": {
"type": "lexer_no_viable_alt_exception",
"reason": "lexer_no_viable_alt_exception: null"
}
},
"status": 500
}
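The compile error is reported for a script whose lang is expression. Lucene expressions only evaluate numeric expressions, so they cannot parse an assignment such as state.deltas = {}. The scripts are also written in a JavaScript-like syntax ({} for maps, the in membership operator, Object.keys) that Painless would not accept either. Below is a hedged sketch of the same aggregation in Painless. It assumes a 7.x or later cluster (for state, combine_script and toInstant()), that topic and recordId are keyword fields with doc values, and that epoch milliseconds are an acceptable replacement for millisOfDay. It has not been run against the asker's index.

```json
{
  "query": {
    "exists": {
      "field": "recordId"
    }
  },
  "aggs": {
    "deltas": {
      "scripted_metric": {
        "init_script": {
          "lang": "painless",
          "source": "state.deltas = [:]"
        },
        "map_script": {
          "lang": "painless",
          "source": "def topic = doc['topic'].value; def id = doc['recordId'].value; long ts = doc['#timestamp'].value.toInstant().toEpochMilli(); if (!state.deltas.containsKey(topic)) { state.deltas[topic] = [:]; } def perTopic = state.deltas[topic]; perTopic[id] = perTopic.containsKey(id) ? Math.abs(perTopic[id] - ts) : ts;"
        },
        "combine_script": {
          "lang": "painless",
          "source": "return state.deltas"
        },
        "reduce_script": {
          "lang": "painless",
          "source": "def res = [:]; for (s in states) { if (s == null) { continue; } for (t in s.entrySet()) { if (!res.containsKey(t.getKey())) { res[t.getKey()] = [:]; } def merged = res[t.getKey()]; for (r in t.getValue().entrySet()) { merged[r.getKey()] = merged.containsKey(r.getKey()) ? Math.abs(merged[r.getKey()] - r.getValue()) : r.getValue(); } } } return res;"
        }
      }
    }
  }
}
```

In Painless, [:] creates an empty map and containsKey replaces the in test. Each shard returns its own map from combine_script, and reduce_script merges them.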

How do I overwrite the #timestamp field with another field in Elasticsearch?

I incorrectly ingested lots of documents into Elasticsearch using the wrong #timestamp field. I already changed the affected Logstash pipeline to use the correct timestamps, but I cannot re-ingest the old data.
I do however have another document field that can be used as the timestamp (json.created_at). So I'd like to update the field. I've found that I can use the _update_by_query action to do that, but I've tried several versions that didn't work, including this:
POST logstash-rails_models-*/_update_by_query
{
"script": {
"lang": "painless",
"source": "ctx._source.#timestamp = ctx._source.json.created_at"
}
}
This complains about an unexpected character:
{
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"ctx._source.#timestamp = ctx._source. ...",
" ^---- HERE"
],
"script": "ctx._source.#timestamp = ctx._source.json.created_at",
"lang": "painless"
}
],
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"ctx._source.#timestamp = ctx._source. ...",
" ^---- HERE"
],
"script": "ctx._source.#timestamp = ctx._source.json.created_at",
"lang": "painless",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "unexpected character [#].",
"caused_by": {
"type": "lexer_no_viable_alt_exception",
"reason": null
}
}
},
"status": 500
}
What should I do?
The correct way to access this field is with bracket notation, with the field name wrapped in quotes:
POST logstash-rails_models-*/_update_by_query
{
  "script": {
    "lang": "painless",
    "source": "ctx._source['#timestamp'] = ctx._source.json.created_at"
  }
}
See also this thread and some more info about updating fields with Painless.
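Documents that have no json.created_at would end up with a null #timestamp, or raise a null pointer exception when the json object itself is missing. A possible refinement, offered as a sketch rather than as part of the original answer, restricts the update to documents that contain the source field and skips any others by setting ctx.op to noop. It assumes json.created_at is indexed so that the exists query can match it, and is sent to the same logstash-rails_models-*/_update_by_query endpoint:

```json
{
  "query": {
    "exists": {
      "field": "json.created_at"
    }
  },
  "script": {
    "lang": "painless",
    "source": "def created = ctx._source.json?.created_at; if (created != null) { ctx._source['#timestamp'] = created; } else { ctx.op = 'noop'; }"
  }
}
```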

How to aggregate over 'ip' field using script

I am trying to perform a terms aggregation over a field of type 'ip' using an inline script like this:
{
"aggs": {
"by_ipaddress": {
"terms": {
"script": {
"inline": "doc['ipAddressFrom'].value",
"lang": "painless"
}
}
}
}
}
It throws the following exception:
"reason": {
"type": "script_exception",
"reason": "runtime error",
"caused_by": {
"type": "array_index_out_of_bounds_exception",
"reason": "16"
},
"script_stack": [
"org.apache.lucene.util.UnicodeUtil.UTF8toUTF16(UnicodeUtil.java:602)",
"org.apache.lucene.util.BytesRef.utf8ToString(BytesRef.java:152)",
"org.elasticsearch.index.fielddata.ScriptDocValues$Strings.getValue(ScriptDocValues.java:83)",
"doc['ipAddressFrom'].value",
" ^---- HERE"
],
"script": "doc['ipAddressFrom'].value",
"lang": "painless"
}
But when I aggregate over the same field :
{
"aggs": {
"by_ipaddress": {
"terms": {
"field": "ipAddressFrom"
}
}
}
}
It works.
The mapping for the field "ipAddressFrom" is:
"ipAddressFrom" : {
"type" : "ip"
}
Please let me know how to use ip fields in a script.
For Elasticsearch 6.x, there is nothing wrong with using the ip type in Painless scripts.
Your aggregation with the inline script fails because the field ipAddressFrom does not exist in some documents.
You can fix the aggregation with something like:
"script": {
  "inline": "if (doc.containsKey('ipAddressFrom') && !doc['ipAddressFrom'].empty){ return doc['ipAddressFrom'].value} else {return '0'}",
  "lang": "painless"
}
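Placed back into the aggregation from the question, the guarded script would look roughly like this. The fallback bucket key '0' comes from the answer above; "size": 0 is an added assumption that only suppresses the search hits so the response contains just the buckets.

```json
{
  "size": 0,
  "aggs": {
    "by_ipaddress": {
      "terms": {
        "script": {
          "inline": "if (doc.containsKey('ipAddressFrom') && !doc['ipAddressFrom'].empty) { return doc['ipAddressFrom'].value; } else { return '0'; }",
          "lang": "painless"
        }
      }
    }
  }
}
```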

Update nested string field

I am trying to update a field image.uri by _update_by_query:
POST user/_update_by_query
{
"script": {
"source": "ctx._source.image.uri = 'https://example.com/default/image/profile.jpg'",
"lang": "painless"
},
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "image.id"
}
}
]
}
}
}
But it throws an error:
{
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"ctx._source.image.uri = 'https://example.com/default/image/profile.jpg'",
" ^---- HERE"
],
"script": "ctx._source.image.uri = 'https://example.com/default/image/profile.jpg'",
"lang": "painless"
}
],
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"ctx._source.image.uri = 'https://example.com/default/image/profile.jpg'",
" ^---- HERE"
],
"script": "ctx._source.image.uri = 'https://example.com/default/image/profile.jpg'",
"lang": "painless",
"caused_by": {
"type": "null_pointer_exception",
"reason": null
}
},
"status": 500
}
A sample document:
{
"image": {
"uri": "https://example.com/resources/uploads/default_files/profile/thumb/large/default_profile.jpg"
},
"created": "2018-06-06T21:49:26Z",
"uid": 1,
"name": "Jason Cameron",
"username": "jason"
}
UPDATED RESPONSE
The problem could be coming from a document without an image object in it.
If possible, add a strict mapping to avoid indexing documents without an image object.
OLD RESPONSE (superseded: both " and ' are correct for delimiting strings inside a Painless script)
Your problem comes from using ' to enclose your URI; strings must be enclosed in ".
Try modifying your script as follows:
"script": {
  "source": "ctx._source.image.uri = \"https://example.com/default/image/profile.jpg\"",
  "lang": "painless"
}
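If the null pointer exception does come from documents that lack the image object, as the updated response suggests, the script itself can create the object before assigning to it. The query in the question matches every document without image.id, which includes documents with no image at all. A minimal sketch for the same POST user/_update_by_query request, assuming it is acceptable to create an empty image object on such documents (the parameter name uri is illustrative):

```json
{
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.image == null) { ctx._source.image = [:]; } ctx._source.image.uri = params.uri",
    "params": {
      "uri": "https://example.com/default/image/profile.jpg"
    }
  },
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "image.id"
          }
        }
      ]
    }
  }
}
```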
