Accessing Text Keyword Field through a Script - elasticsearch

I am trying to do some scripting in Elasticsearch. Here is the relevant JSON segment of the request:
{
  "script_score": {
    "script": {
      "source": "doc.containsKey('var')?params.adder[doc['var'].keyword]:0 ",
      "params": {
        "adder": {
          "type1": 1,
          "type2": 1000
        }
      }
    }
  },
  "weight": 100000
}
This is the error that is thrown
{
  "shard": 0,
  "index": "",
  "node": "4eX6EgO2QAuBdc5zkUiDBg",
  "reason": {
    "type": "script_exception",
    "reason": "runtime error",
    "script_stack": [
      "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:759)",
      "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
      "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:290)",
      "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:101)",
      "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:98)",
      "java.base/java.security.AccessController.doPrivileged(AccessController.java:312)",
      "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:98)",
      "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
      "doc.containsKey('var')?params.adder[doc['var'].keyword]:0 ",
      " ^---- HERE"
    ],
    "script": "doc.containsKey('var')?params.adder[doc['var'].keyword]:0 ",
    "lang": "painless",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [var] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
    }
  }
}
It surprises me that the script cannot access the keyword sub-field this way. Do I need to create a separate field of type keyword?
Thank you

Multi-fields are addressed with dot notation inside the doc map, not as a property of the parent field. Use doc['var.keyword'] (or doc['var.keyword'].value to get the string itself). Writing doc['var'].keyword makes Elasticsearch try to load fielddata for the text field var, which is disabled by default; hence the error.
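A sketch of the corrected script_score fragment, assuming a default text-with-keyword mapping for var (the size() guard handles documents that have no value for the field, and .value yields the plain string needed as the params.adder map key):

```json
"script_score": {
  "script": {
    "source": "doc.containsKey('var.keyword') && doc['var.keyword'].size() != 0 ? params.adder[doc['var.keyword'].value] : 0",
    "params": {
      "adder": {
        "type1": 1,
        "type2": 1000
      }
    }
  },
  "weight": 100000
}
```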

Related

Elastic Search Sort by text error: illegal_argument_exception

I have the below index configuration for Products in Elastic Search.
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "fields": {
          "original": {
            "type": "keyword"
          }
        },
        "type": "text",
        "fielddata": true,
        "analyzer": "portuguese"
      },
      "product_data": {
        "type": "object"
      }
    }
  }
}
HERE I UPSERT THE DATA
http://127.0.0.1:9200/product2/_update/IMOB01
BODY
{
  "doc": {
    "name": "Test",
    "product_data": {
      "symbol": "IMOB01",
      "release_date": "2013-01-01T00:00:00"
    }
  },
  "doc_as_upsert": true
}
RESPONSE
201 created
The problem is: if I _search and sort by name, everything is OK. But if, for instance, I _search with a sort on product_data.symbol, I receive the error below...
localhost:9200/product/_search
BODY
{
  "from": 0,
  "size": 50,
  "query": {
    "bool": {
      "must": {
        "match": {
          "product_data.symbol": "imob01"
        }
      }
    }
  },
  "sort": [
    { "product_data.symbol": { "order": "asc", "unmapped_type": "text" } }
  ]
}
RESPONSE
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "product",
        "node": "PfwDOAwzTZm63IHq_rt1TA",
        "reason": {
          "type": "illegal_argument_exception",
          "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
        }
      }
    ],
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory.",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [product_data.symbol] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    }
  },
  "status": 400
}
What am I doing wrong?
Note: I created product_data as an object because different products return different sets of fields; they may have symbol or other fields (symbol is just an example).
I tried setting fielddata: true on product_data, with no success.
Text fields are not optimised for operations that require per-document
field data like aggregations and sorting, so these operations are
disabled by default. Please use a keyword field instead.
Alternatively, set fielddata=true on [product_data.symbol] in order to
load field data by uninverting the inverted index. Note that this can
use significant memory.
Because product_data is an object with dynamic mapping, product_data.symbol was mapped by default as text with a keyword sub-field. Sort on that sub-field instead:
"sort": [
  { "product_data.symbol.keyword": { "order": "asc", "unmapped_type": "keyword" } }
]
(unmapped_type is switched to keyword here as well, since treating an unmapped field as text would hit the same sorting restriction)
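Putting it together, the search body from the question would then look like this; only the sort clause changes:

```json
{
  "from": 0,
  "size": 50,
  "query": {
    "bool": {
      "must": {
        "match": { "product_data.symbol": "imob01" }
      }
    }
  },
  "sort": [
    { "product_data.symbol.keyword": { "order": "asc", "unmapped_type": "keyword" } }
  ]
}
```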

Elasticsearch removing an array list when reindexing all records

So I am trying to reindex one of my indices into a temporary one while removing values from an array field, platform.platforms.
This is what my Kibana query looks like:
POST /_reindex
{
  "source": {
    "index": "devops-ucd-000001"
  },
  "dest": {
    "index": "temp-ucd"
  },
  "conflicts": "proceed",
  "script": {
    "lang": "painless",
    "inline": "ctx._source.platform.platforms.removeAll(Collections.singleton('1'))"
  }
}
However, what I get is a null pointer exception:
"script_stack": [
  "ctx._source.platform.platforms.removeAll(Collections.singleton('1'))",
  " ^---- HERE"
],
"script": "ctx._source.platform.platforms.removeAll(Collections.singleton('1'))",
"lang": "painless",
"caused_by": {
  "type": "null_pointer_exception",
  "reason": null
}
I tried following this question: how to remove arraylist value in elastic search using curl? to no avail.
Any help would be appreciated here.
This is probably because some documents do not have a platform field. You need to add additional checks in your script to skip such documents:
"script": {
  "lang": "painless",
  "inline": """
    if (ctx._source.platform != null && ctx._source.platform.platforms != null && ctx._source.platform.platforms instanceof List) {
      ctx._source.platform.platforms.removeAll(Collections.singleton('1'))
    }
  """
}
The script above null-checks both platform and platform.platforms, and also verifies that platforms is actually a List.
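For reference, the whole reindex request with the guarded script might look like this (note that in Painless, as in Java, x instanceof List is already false when x is null, so the instanceof check alone covers the missing-platforms case):

```json
POST /_reindex
{
  "source": { "index": "devops-ucd-000001" },
  "dest": { "index": "temp-ucd" },
  "conflicts": "proceed",
  "script": {
    "lang": "painless",
    "inline": """
      if (ctx._source.platform != null && ctx._source.platform.platforms instanceof List) {
        ctx._source.platform.platforms.removeAll(Collections.singleton('1'))
      }
    """
  }
}
```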

How does elasticsearch handle returns inside a scripted update query?

I can't find the relevant documentation describing the return keyword. Where is this documented?
I am running the following query
POST /myindex/mytype/FwOaGmQBdhLB1nuQhK1Q/_update
{
  "script": {
    "source": """
      if (ctx._source.owner._id.equals(params.signedInUserId)) {
        for (int i = 0; i < ctx._source.managers.length; i++) {
          if (ctx._source.managers[i].email.equals(params.managerEmail)) {
            ctx._source.managers.remove(i);
            return;
          }
        }
      }
      ctx.op = 'noop';
    """,
    "lang": "painless",
    "params": {
      "signedInUserId": "auth0|5a78c1ccebf64a46ecdd0d9c",
      "managerEmail": "d#d.com"
    }
  },
  "_source": true
}
but I'm getting the error
"type": "illegal_argument_exception",
"reason": "failed to execute script",
"caused_by": {
  "type": "script_exception",
  "reason": "compile error",
  "script_stack": [
    "... ve(i);\n return;\n }\n }\n ...",
    " ^---- HERE"
  ],
  "script": <the script here>,
  "lang": "painless",
  "caused_by": {
    "type": "illegal_argument_exception",
    "reason": "invalid sequence of tokens near [';'].",
    "caused_by": {
      "type": "no_viable_alt_exception",
      "reason": null
    }
  }
}
If I remove the return keyword, the script compiles and runs, but then (as expected) the behavior is wrong: ctx.op is set to 'noop' even after a successful removal. I can fix this by tracking the removal with a boolean flag, but why can't I return early?
It's hard to say why the bare return is rejected, but you can sidestep it entirely by passing a lambda predicate to removeIf:
ctx._source.managers.removeIf(m -> m.email.equals(params.managerEmail))
Lambda expressions and method references work the same as Java’s.
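A sketch of the full update request using removeIf, assuming the same document shape as in the question; removeIf returns true when anything was removed, which stands in for the boolean flag:

```json
POST /myindex/mytype/FwOaGmQBdhLB1nuQhK1Q/_update
{
  "script": {
    "source": """
      boolean removed = false;
      if (ctx._source.owner._id.equals(params.signedInUserId)) {
        removed = ctx._source.managers.removeIf(m -> m.email.equals(params.managerEmail));
      }
      if (!removed) {
        ctx.op = 'noop';
      }
    """,
    "lang": "painless",
    "params": {
      "signedInUserId": "auth0|5a78c1ccebf64a46ecdd0d9c",
      "managerEmail": "d#d.com"
    }
  },
  "_source": true
}
```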

elasticsearch kibana update document which has space using update by query

I have a field to update which has space in it.
POST /index/type/_update_by_query
{
  "query": {
    "match_phrase": {
      "field": "value"
    }
  },
  "script": {
    "lang": "painless",
    "inline": "ctx._source.Existing Field = New_Value"
  }
}
But I get this error.
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "compile error",
        "script_stack": [
          "ctx._source.Existing Field = New_Value",
          " ^---- HERE"
        ],
        "script": "ctx._source.Existing Field = New_Value",
        "lang": "painless"
      }
    ],
    "type": "script_exception",
    "reason": "compile error",
    "script_stack": [
      "ctx._source.Existing Field = New_Value",
      " ^---- HERE"
    ],
    "script": "ctx._source.Existing Field = New_Value",
    "lang": "painless",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "unexpected token ['Field'] was expecting one of [{<EOF>, ';'}]."
    }
  },
  "status": 500
}
When I execute this query on a field without a space, it works fine.
How do I handle field names that contain spaces?
ELK version: 5.4.3
I have read in the documentation that spaces in field names are not advised, but these fields are created dynamically by a server, with around 1M entries arriving every day. Hence I want to fix them with an _update_by_query over all matching entries.
Try this one:
POST index/type/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source['Existing Field'] = 'New Value'"
  }
}
This works because ctx._source is a Map (in Painless it is backed by a plain Java HashMap), so bracket notation lets you access fields with unusual characters, and it also lets you add and remove fields in update queries.
Hope that helps!
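Combined with the match_phrase filter from the question (field names and values are the question's placeholders), the full request would be:

```json
POST /index/type/_update_by_query
{
  "query": {
    "match_phrase": { "field": "value" }
  },
  "script": {
    "lang": "painless",
    "inline": "ctx._source['Existing Field'] = 'New Value'"
  }
}
```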

ElasticSearch script query & geo distance query

Elasticsearch version: 2.3.4
How can I combine a geo_distance-style check with a script query?
I have an index with store information; each document has a location (geo_point) field and a distance field.
Given a query point, I want to match only the stores whose distance from that point is less than the value stored in their own distance field.
I think this needs a combination of a script query and the geo distance computation, but I don't know how to achieve it.
Try the following code:
query: {
  bool: {
    must: [
      {
        script: {
          script: {
            inline: "doc['location'].arcDistance(" + _.lat + "," + _.lon + ") < doc['distance'].value",
            lang: "painless"
          }
        }
      }
    ]
  }
}
This results in an Elasticsearch error:
[illegal_argument_exception] script_lang not supported [painless]
Has anyone encountered and solved this problem? What is the right way to implement this query?
After changing the script language to groovy:
{
  bool: {
    must: [
      {
        script: {
          script: {
            inline: "doc['location'].arcDistance(" + _.lat + "," + _.lon + ") < doc['distance'].value",
            lang: "groovy"
          }
        }
      }
    ]
  }
}
I also get an error:
message: '[script_exception] failed to run inline script [doc[\'location\'].arcDistance(31.89484,120.287825) < doc[\'distance\'].value] using lang [groovy]',
Full error detail:
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "failed to run inline script [doc['location'].arcDistance(31.89484,120.287825) < 3000] using lang [groovy]"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query_fetch",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "business_stores_v1",
        "node": "Z_65eOYXT6u8aDf7mp2ZRg",
        "reason": {
          "type": "script_exception",
          "reason": "failed to run inline script [doc['location'].arcDistance(31.89484,120.287825) < 3000] using lang [groovy]",
          "caused_by": {
            "type": "null_pointer_exception",
            "reason": null
          }
        }
      }
    ]
  },
  "status": 500
}
First, the script_lang not supported [painless] error is expected: Painless was only introduced in Elasticsearch 5.0, so on 2.3.4 groovy is the right choice. The remaining null_pointer_exception occurs because at least one document in your business_stores_v1 index has no location field, so this part of the expression fails:
doc['location'].arcDistance(...)
^
|
null_pointer_exception here
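A possible guarded version of the script clause (a sketch only: it assumes the empty property on doc values and direct binding of params as script variables behave this way on 2.3.4, and the qlat/qlon parameter names are made up for illustration):

```json
{
  "bool": {
    "must": [
      {
        "script": {
          "script": {
            "inline": "!doc['location'].empty && !doc['distance'].empty && doc['location'].arcDistance(qlat, qlon) < doc['distance'].value",
            "lang": "groovy",
            "params": { "qlat": 31.89484, "qlon": 120.287825 }
          }
        }
      }
    ]
  }
}
```

Passing the coordinates as params instead of concatenating them into the script string also lets Elasticsearch cache the compiled script across requests.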
