update_by_query for multi field - elasticsearch

I have added a new multi-field (raw) to an existing field (response) in the index. Since the new multi-field raw will have no data, I tried to populate it from the source as below.
POST /y_metrics/response/_update_by_query
{
  "script": {
    "inline": "ctx._source.response['raw'] = ctx._source.response;"
  },
  "query": {
    "match_all": {}
  }
}
It fails with:
"type": "missing_property_exception",
"reason": "No such property: raw for class: java.lang.String"
Second try:
POST /y_metrics/response/_update_by_query
{
  "script": {
    "inline": "ctx._source['response.raw'] = ctx._source.response;"
  },
  "query": {
    "match_all": {}
  }
}
This one fails with:
"type": "mapper_parsing_exception",
"reason": "Field name [response.raw] cannot contain '.'"
Apparently, the problem is the ".". But how else would one access a multi-field in this case? I read about the de_dot filter; would it help in my case?

If you add a field to an existing field (thus a multi-field), there is no need to use a script; just reindex and Elasticsearch will handle the rest. You can simply drop the script part of your update by query call.
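For instance, a minimal sketch of that call (same index and type as in the question, with the script dropped) would be:
POST /y_metrics/response/_update_by_query
{
  "query": {
    "match_all": {}
  }
}
Updating the existing documents this way makes Elasticsearch re-parse the response field and populate the new response.raw sub-field according to the updated mapping.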


Kibana dashboard filter that compares two values returns an illegal_argument_exception

I am new to Kibana and I am probably missing something very basic here.
I am trying to use the filter suggested in this question to filter an entire dashboard, but I am getting an exception. The values are mapped as strings, and I can't tell where to set fielddata to true as the exception suggests (or what that actually means).
Here is the filter:
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": "doc['server.name'].value == doc['client.name'].value",
            "lang": "painless"
          }
        }
      }
    }
  }
}
Here is the exception:
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:40)",
"doc['server.name'].value == doc['client.name'].value",
" ^---- HERE"
],
"script": "doc['server.name'].value == doc['client.name'].value",
"lang": "painless",
"position": {
"offset": 4,
"start": 0,
"end": 48
},
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [server.name] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
}
},
I tried checking whether there is anything I can set, or something that is not configured properly, at Stack Management --> Index Patterns --> Pattern --> the value server.name, but all I can see is that it is set to string.
As I said before, I am new to Kibana, yet I would expect a "string".equals("other_string") comparison to be a very generic, no-drama filter... What am I missing here?
The solution was to use the exact same script, but with ".keyword" appended to the field names.
So instead of:
"source": "doc['server.name'].value == doc['client.name'].value"
I now use:
"source": "doc['server.name.keyword'].value == doc['client.name.keyword'].value"

How to range query a date mapped as long

I have an external ES db (i.e. I can't change its structure) with the following mapping
"failure_url": {
"properties": {
"lastAccessTime": {
"type": "long"
},
"url": {
"type": "keyword"
}
}
}
lastAccessTime represents a date, but is mapped as a long. A standard "range" filter fails with
"caused_by": {
"type": "number_format_exception",
"reason": "For input string: "now""
}
Is the error in my filter expression, or is it due to the field not being a "date"? If the latter, how can I still query this date?
If you want to query your field with now and other date match expressions, then your field must be defined as a date type.
One thing you can do is create another field of type date and then update your index to populate that new field.
So first change your mapping like this:
PUT my-index/_mappings
{
  "properties": {
    "lastAccessDate": {
      "type": "date"
    }
  }
}
And then you can leverage the update by query API in order to index your dates from the long field:
POST my-index/_update_by_query
{
  "script": {
    "source": "ctx._source.lastAccessDate = ctx._source.lastAccessTime"
  }
}
When that's done, you'll be able to run your range query on the new lastAccessDate field of type date.
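For example, a range query using date math (a sketch; the bounds are placeholders) could then look like:
GET my-index/_search
{
  "query": {
    "range": {
      "lastAccessDate": {
        "gte": "now-7d/d",
        "lt": "now"
      }
    }
  }
}
Note that a date field accepts epoch milliseconds by default, so copying the long value over works as long as lastAccessTime actually holds milliseconds since the epoch; if it holds seconds, the date format (or the copy script) would need to be adjusted accordingly.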

How do I convert to uppercase and delete a particular field while using reindex?

I am trying to migrate from ES 1.4 to ES 5.5. In one of the indices, I need to change the name of a field and also convert its value to uppercase. I am able to reindex with the change in field name and remove the unwanted field, but I need help converting the value to uppercase.
This is what I tried
POST _reindex?wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "http://source_ip:17002"
    },
    "index": "log_event_2017-08-11",
    "size": 1000,
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "logs-ics-2017-08-11"
  },
  "script": {
    "inline": "ctx._source.product = ctx._source.remove(\"product_name\")",
    "lang": "painless"
  }
}
The above POST request is able to remove "product_name" and create "product" with its value. So, in order to uppercase the value of "product", I tried the inline script below, but it gives a null_pointer_exception.
I am new to Elasticsearch scripting. Please help.
"ctx._source.product = ctx._source.remove(\"product_name\");ctx._source.product = doc[\"product\"].toUpperCase()"
You can add an ingest pipeline before you trigger the _reindex API. There are processors to rename a field and to convert a field's value to uppercase. You can then reference the pipeline in your reindex call like this (a sample pipeline is sketched after the reindex body):
{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "dest",
    "pipeline": "<id_of_your_pipeline>"
  }
}
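Such a pipeline could look roughly like this (a sketch, assuming the product_name/product field names from the question; the id my-reindex-pipeline is just a placeholder for <id_of_your_pipeline> above):
PUT _ingest/pipeline/my-reindex-pipeline
{
  "description": "rename product_name to product and uppercase its value",
  "processors": [
    {
      "rename": {
        "field": "product_name",
        "target_field": "product"
      }
    },
    {
      "uppercase": {
        "field": "product"
      }
    }
  ]
}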

extract text from field arrays

One of the fields called "resources" has the following 2 inner documents.
{
"type": "AWS::S3::Object",
"ARN": "arn:aws:s3:::sms_vild/servers_backup/db_1246/db/reports_201706.schema"
},
{
"accountId": "934331768510612",
"type": "AWS::S3::Bucket",
"ARN": "arn:aws:s3:::sms_vild"
}
I need to split the ARN field and get the last part of it, i.e. "reports_201706.schema", preferably using a scripted field.
What I have tried:
1) I checked the fields list and found only 2 entries: resources.accountId and resources.type
2) I tried with a date-time field and it worked correctly in the scripted field option (expression).
doc['eventTime'].value
3) But the same does not work with other text fields, e.g.
doc['eventType'].value
Getting this error:
"caused_by":{"type":"script_exception","reason":"link error","script_stack":["doc['eventType'].value","^---- HERE"],"script":"doc['eventType'].value","lang":"expression","caused_by":{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [eventType] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."}}},"status":500}
It means I need to change the mapping. Is there any other way to extract text from nested arrays in an object?
Update:
Please visit sample kibana here...
https://search-accountact-phhofxr23bjev4uscghwda4y7m.us-east-1.es.amazonaws.com/_plugin/kibana/
search for "ebs_attach.png" and then check resources field. You will see 2 nested arrays like this...
{
"type": "AWS::S3::Object",
"ARN": "arn:aws:s3:::datameetgeo/ebs_attach.png"
},
{
"accountId": "513469704633",
"type": "AWS::S3::Bucket",
"ARN": "arn:aws:s3:::datameetgeo"
}
I need to split the ARN field and extract the last part, which again is "ebs_attach.png".
If I can somehow display it as a scripted field, then I can see the bucket name and the file name side by side on the Discover tab.
Update 2
In other words, I am trying to extract the text shown in this image as a new field on the Discover tab.
While you can use scripting for this, I highly encourage you to extract that kind of information at index time. I have provided two examples here, which are far from failsafe (you need to test with different paths or with the field missing entirely), but they should provide a base to start with.
PUT foo/bar/1
{
  "resources": [
    {
      "type": "AWS::S3::Object",
      "ARN": "arn:aws:s3:::sms_vild/servers_backup/db_1246/db/reports_201706.schema"
    },
    {
      "accountId": "934331768510612",
      "type": "AWS::S3::Bucket",
      "ARN": "arn:aws:s3:::sms_vild"
    }
  ]
}
# this is slow!!!
GET foo/_search
{
  "script_fields": {
    "document": {
      "script": {
        "inline": "return params._source.resources.stream().filter(r -> 'AWS::S3::Object'.equals(r.type)).map(r -> r.ARN.substring(r.ARN.lastIndexOf('/') + 1)).findFirst().orElse('NONE')"
      }
    }
  }
}
# Do this at index time, by adding a pipeline
PUT _ingest/pipeline/my-pipeline-id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "script" : {
        "inline": "ctx.filename = ctx.resources.stream().filter(r -> 'AWS::S3::Object'.equals(r.type)).map(r -> r.ARN.substring(r.ARN.lastIndexOf('/') + 1)).findFirst().orElse('NONE')"
      }
    }
  ]
}
# Store the document, specify the pipeline
PUT foo/bar/1?pipeline=my-pipeline-id
{
  "resources": [
    {
      "type": "AWS::S3::Object",
      "ARN": "arn:aws:s3:::sms_vild/servers_backup/db_1246/db/reports_201706.schema"
    },
    {
      "accountId": "934331768510612",
      "type": "AWS::S3::Bucket",
      "ARN": "arn:aws:s3:::sms_vild"
    }
  ]
}
# lets check the filename field of the indexed document by getting it
GET foo/bar/1
# We can even search for this file now
GET foo/_search
{
  "query": {
    "match": {
      "filename": "reports_201706.schema"
    }
  }
}
Note: Considering "resources" is an array:
NSArray *array_ARN_Values = [resources valueForKey:@"ARN"];
Hope it will work for you!

elasticsearch routing on specific field

Hi, I want to set custom routing on a specific field, "userId", on my ES v2.0, but it is giving me an error. I don't know how to set custom routing on ES v2.0. Please help me out; thanks in advance. Below is the error message I get while creating custom routing on an existing index.
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [_routing] has unsupported parameters: [path : userId]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Mapping definition for [_routing] has unsupported parameters: [path : userId]"
  },
  "status": 400
}
In ES 2.0, the path parameter of the _routing meta-field has been removed. So now you need to do it like this instead:
In your mapping, you can only specify that routing is required (but you cannot specify path anymore):
PUT my_index
{
  "mappings": {
    "my_type": {
      "_routing": {
        "required": true
      },
      "properties": {
        "name": {
          "type": "string"
        }
      }
    }
  }
}
And then when you index a document, you can specify the routing value in the query string like this:
PUT my_index/my_type/1?routing=bar
{
  "name": "foo"
}
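At search time you can pass the same routing value so that only the relevant shard is queried; a quick sketch reusing the values above:
GET my_index/my_type/_search?routing=bar
{
  "query": {
    "match": {
      "name": "foo"
    }
  }
}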
You can still use custom routing based on a field from the data being indexed. You can set up a simple pipeline and then use that pipeline every time you index a document, or you can change the index settings so the pipeline is applied whenever the index receives an indexing request.
Read about pipelines here.
Do read around that part of the docs for more clarity. Pipelines are not meant for setting custom routing, but they can be used for the purpose. Routing from a document field was disabled for a good reason: the field can turn out to have null values, leading to unexpected behavior, so take care of that issue yourself.
For routing, here is a sample pipeline PUT:
PUT localhost:9200/_ingest/pipeline/nameRouting
Content-Type: application/json
{
  "description" : "Sets the routing value from the name field in the document",
  "processors" : [
    {
      "set" : {
        "field": "_routing",
        "value": "{{_source.name}}"
      }
    }
  ]
}
The index settings would then be:
{
  "settings" : {
    "index" : {
      "default_pipeline" : "nameRouting"
    }
  }
}
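For an existing index, that setting could also be applied via the settings API, for example (assuming the index is called my_index; note that default_pipeline requires a considerably newer Elasticsearch version than the 2.0 mentioned in the question):
PUT my_index/_settings
{
  "index": {
    "default_pipeline": "nameRouting"
  }
}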
