ElasticSearch - Copy one field value to other field for all documents

ElasticSearch - Copy one field value to other field for all documents - elasticsearch

We have a field "name" in the index. We recently added a new field "alias".
I want to copy name field value to the new field alias for all documents.
Is there any Update query that will do this?
If that is not possible , Help me to achieve this.
Thanks in advance
I am trying this query
http://URL/index/profile/_update_by_query
{
"query": {
"constant_score" : {
"filter" : {
"exists" : { "field" : "name" }
}
}
},
"script" : "ctx._source.alias = name;"
}
In the script , I am not sure how to give name field.
I getting error
{
"error": {
"root_cause": [
{
"type": "class_cast_exception",
"reason": "java.lang.String cannot be cast to java.util.Map"
}
],
"type": "class_cast_exception",
"reason": "java.lang.String cannot be cast to java.util.Map"
},
"status": 500
}

Indeed, the syntax has changed a tiny little bit since. You need to modify your query to this:
POST index/_update_by_query
{
"query": {
"constant_score" : {
"filter" : {
"exists" : { "field" : "name" }
}
}
},
"script" : {
"inline": "ctx._source.alias = ctx._source.name;"
}
}
UPDATE for ES 6
Use source instead of inline

Related

Getting a timestamp exception when I try to update an unrelated field using painless in elasticsearch

Im trying to run the following script
POST /data_hip/_update/1638643727.0
{
"script":{
"source":"ctx._source.avgmer=4;"
}
}
But I am getting the following error.
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "failed to parse field [#timestamp] of type [date] in document with id '1638643727.0'. Preview of field's value: '1.638642742E12'"
}
],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse field [#timestamp] of type [date] in document with id '1638643727.0'. Preview of field's value: '1.638642742E12'",
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "failed to parse date field [1.638642742E12] with format [epoch_millis]",
"caused_by" : {
"type" : "date_time_parse_exception",
"reason" : "Failed to parse with all enclosed parsers"
}
}
},
"status" : 400
}
this is strange because on queries (not updates) the date time is parsed fine.
The timestamp field mapping is as follows
"#timestamp": {
"type":"date",
"format":"epoch_millis"
},
I am running elasticsearch 7+
EDIT:
Adding my index settings
{
"data_hip" : {
"settings" : {
"index" : {
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"number_of_shards" : "1",
"provided_name" : "data_hip",
"creation_date" : "1638559533343",
"number_of_replicas" : "1",
"uuid" : "CHjkvSdhSgySLioCju9NqQ",
"version" : {
"created" : "7150199"
}
}
}
}
}
Im not running an ingest pipeline

The problem is the scientific notation, the 'E12' suffix, being in a field that ES is expecting to be an integer.
Using this reprex:
PUT so_test
{
"mappings": {
"properties": {
"ts": {
"type": "date",
"format": "epoch_millis"
}
}
}
}
# this works
POST so_test/_doc/
{
"ts" : "123456789"
}
# this does not, throws the same error you have IRL
POST so_test/_doc/
{
"ts" : "123456789E12"
}
I'm not sure how/where those values are creeping in, but they are there in the document you are passing to ES.

ElasticSearch Accessing Nested Documents in Script - Null Pointer Exception

Gist: Trying to write a custom filter on nested documents using painless. Want to write error checks when there are no nested documents to surpass null_pointer_exception
I have a mapping as such (simplified and obfuscated)
{
"video_entry" : {
"aliases" : { },
"mappings" : {
"properties" : {
"captions_added" : {
"type" : "boolean"
},
"category" : {
"type" : "keyword"
},
"is_votable" : {
"type" : "boolean"
},
"members" : {
"type" : "nested",
"properties" : {
"country" : {
"type" : "keyword",
},
"date_of_birth" : {
"type" : "date",
}
}
}
}
Each video_entry document can have 0 or more members nested documents.
Sample Document
{
"captions_added": true,
"category" : "Mental Health",
"is_votable: : true,
"members": [
{"country": "Denmark", "date_of_birth": "1998-04-04T00:00:00"},
{"country": "Denmark", "date_of_birth": "1999-05-05T00:00:00"}
]
}
If one or more nested document exist, we want to write some painless scripts that'd check certain fields across all the nested documents. My script works on mappings with a few documents but when I try it on larger set of documents I get null pointer exceptions despite having every null check possible. I've tried various access patterns, error checking mechanisms but I get exceptions.
POST /video_entry/_search
{
"query": {
"script": {
"script": {
"source": """
// various NULL checks that I already tried
// also tried short circuiting on finding null values
if (!params['_source'].empty && params['_source'].containsKey('members')) {
def total = 0;
for (item in params._source.members) {
// custom logic here
// if above logic holds true
// total += 1;
}
return total > 3;
}
return true;
""",
"lang": "painless"
}
}
}
}
Other Statements That I've Tried
if (params._source == null) {
return true;
}
if (params._source.members == null) {
return true;
}
if (!ctx._source.contains('members')) {
return true;
}
if (!params['_source'].empty && params['_source'].containsKey('members') &&
params['_source'].members.value != null) {
// logic here
}
if (doc.containsKey('members')) {
for (mem in params._source.members) {
}
}
Error Message
&& params._source.members",
^---- HERE"
"caused_by" : {
"type" : "null_pointer_exception",
"reason" : null
}
I've looked into changing the structure (flattening the document) and the usage of must_not as indicated in this answer. They don't suit our use case as we need to incorporate some more custom logic.
Different tutorials use ctx, doc and some use params. To add to the confusion Debug.explain(doc.members), Debug.explain(params._source.members) return empty responses and I'm having a hard time figuring out the types.
Gist: Trying to write a custom filter on nested documents using painless. Want to write error checks when there are no nested documents to surpass null_pointer_exception
Any help is appreciated.

TLDr;
Elastic flatten objects. Such that
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
Turn into:
{
"group" : "fans",
"user.first" : [ "alice", "john" ],
"user.last" : [ "smith", "white" ]
}
To access members inner value you need to reference it using doc['members.<field>'] as members will not exist on its own.
Details
As you may know, Elastic handles inner documents in its own way. [doc]
So you will need to reference them accordingly.
Here is what I did to make it work.
Btw, I have been using the Dev tools of kibana
PUT /so_test/
PUT /so_test/_mapping
{
"properties" : {
"captions_added" : {
"type" : "boolean"
},
"category" : {
"type" : "keyword"
},
"is_votable" : {
"type" : "boolean"
},
"members" : {
"properties" : {
"country" : {
"type" : "keyword"
},
"date_of_birth" : {
"type" : "date"
}
}
}
}
}
POST /so_test/_doc/
{
"captions_added": true,
"category" : "Mental Health",
"is_votable" : true,
"members": [
{"country": "Denmark", "date_of_birth": "1998-04-04T00:00:00"},
{"country": "Denmark", "date_of_birth": "1999-05-05T00:00:00"}
]
}
PUT /so_test/_doc/
{
"captions_added": true,
"category" : "Mental breakdown",
"is_votable" : true,
"members": []
}
POST /so_test/_doc/
{
"captions_added": true,
"category" : "Mental success",
"is_votable" : true,
"members": [
{"country": "France", "date_of_birth": "1998-04-04T00:00:00"},
{"country": "Japan", "date_of_birth": "1999-05-05T00:00:00"}
]
}
And then I did this query (it is only a bool filter, but I guess making it work for your own use case should not prove too difficult)
GET /so_test/_search
{
"query":{
"bool": {
"filter": {
"script": {
"script": {
"lang": "painless",
"source": """
def flag = false;
// /!\ notice how the field is referenced /!\
if(doc['members.country'].size() != 0)
{
for (item in doc['members.country']) {
if (item == params.country){
flag = true
}
}
}
return flag;
""",
"params": {
"country": "Japan"
}
}
}
}
}
}
}
BTW you were saying you were a bit confused about the context for painless. you can find in the documentation so details about it.
[doc]
In this case the filter context is the one we want to look at.

Add geoIP data to old data from Elasticsearch index

I recently added a GeoIP processor to my ingestion pipeline in Elasticsearch. this works well and adds new fields to the newly ingested documents.
I wanted to add the GeoIP fields to older data by doing an _update_by_query on an index, however, it seems that it doesn't accept "processors" as a parameter.
What I want to do is something like this:
POST my_index*/_update_by_query
{
"refresh": true,
"processors": [
{
"geoip" : {
"field": "doc['client_ip']",
"target_field" : "geo",
"database_file" : "GeoLite2-City.mmdb",
"properties":["continent_name", "country_iso_code", "country_name", "city_name", "timezone", "location"]
}
}
],
"script": {
"day_of_week": {
"type": "long",
"script": "emit(doc['#timestamp'].value.withZoneSameInstant(ZoneId.of(doc['geo.timezone'])).getDayOfWeek().getValue())"
},
"hour_of_day": {
"type": "long",
"script": "emit(doc['#timestamp'].value.withZoneSameInstant(ZoneId.of(doc['geo.timezone'])).getHour())"
},
"office_hours": {
"script": "if (doc['day_of_week'].value< 6 && doc['day_of_week'].value > 0) {if (doc['hour_of_day'].value> 7 && doc['hour_of_day'].value<19) {return 1;} else {return -1;} } else {return -1;}"
}
}
}
I receive the following error:
{
"error" : {
"root_cause" : [
{
"type" : "parse_exception",
"reason" : "Expected one of [source] or [id] fields, but found none"
}
],
"type" : "parse_exception",
"reason" : "Expected one of [source] or [id] fields, but found none"
},
"status" : 400
}

Since you have the ingestion pipeline ready, you simply need to reference it in your call to the _update_by_query endpoint, like this:
POST my_index*/_update_by_query?pipeline=my-pipeline
^
|
add this

Can only use wildcard queries on keyword, text and wildcard fields - not on [id] which is of type [long]

Elasticsearch version 7.13.1
GET test/_mapping
{
"test" : {
"mappings" : {
"properties" : {
"id" : {
"type" : "long"
},
"name" : {
"type" : "text"
}
}
}
}
}
POST test/_doc/101
{
"id":101,
"name":"hello"
}
POST test/_doc/102
{
"id":102,
"name":"hi"
}
Wildcard Search pattern
GET test/_search
{
"query": {
"query_string": {
"query": "*101* *hello*",
"default_operator": "AND",
"fields": [
"id",
"name"
]
}
}
}
Error is : "reason" : "Can only use wildcard queries on keyword, text and wildcard fields - not on [id] which is of type [long]",
It was working fine in version 7.6.0 ..
What is new change in latest ES and what is the resolution of this issue?

It's not directly possible to perform wildcards on numeric data types. It is better to convert those integers to strings.
You need to modify your index mapping to
PUT /my-index
{
"mappings": {
"properties": {
"code": {
"type": "text"
}
}
}
}
Otherwise, if you want to perform a partial search you can use edge n-gram tokenizer

How to use nested in Ealsticsearch 7.10

Followings are the steps on how using nested field in elastersearch.
First step:
curl -XPUT 'localhost:9200/my_index/my_type/1?pretty' -d'
{
"group" : "fans",
"user" : [ // 1
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}'
Second step:
curl -XPUT 'localhost:9200/my_index?pretty' -d'
{
"mappings": {
"my_type": {
"properties": {
"user": {
"type": "nested" // 1
}
}
}
}
}'
Before i copy the code, i have delete all the index on my machine.
However, after running step 2, something went woring like the following .
{
"error" : {
"root_cause" : [
{
"type" : "resource_already_exists_exception",
"reason" : "index [my_index/yHhgr8iEQqGnHo5Ugex2dA] already exists",
"index_uuid" : "yHhgr8iEQqGnHo5Ugex2dA",
"index" : "my_index"
}
],
"type" : "resource_already_exists_exception",
"reason" : "index [my_index/yHhgr8iEQqGnHo5Ugex2dA] already exists",
"index_uuid" : "yHhgr8iEQqGnHo5Ugex2dA",
"index" : "my_index"
},
"status" : 400
}
I really don't konw what to do about this.(I have also tried create nested field first. It also went wrong)
I'm new to elastersearch, really need help. Thankyou very mutch!!!

Since you are using Elasticsearch version 7.10, you cannot add the mapping type in the index mapping definition. Refer to this to know more about the removal of mapping types.
You can not change the mapping of an index that already exists, you need to delete it and index the data with the new mapping, or reindex into a new index with the new mapping.
You need to first create the index with the following mapping:
PUT /my_index
{
"mappings": {
"properties": {
"user": {
"type": "nested" // 1
}
}
}
}
And then index the documents into the index. Refer to this official documentation, to know more about nested type.
PUT /my_index/_doc/1
{
"group" : "fans",
"user" : [ // 1
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

ElasticSearch - Copy one field value to other field for all documents - elasticsearch

Related

Getting a timestamp exception when I try to update an unrelated field using painless in elasticsearch

ElasticSearch Accessing Nested Documents in Script - Null Pointer Exception

Add geoIP data to old data from Elasticsearch index

Can only use wildcard queries on keyword, text and wildcard fields - not on [id] which is of type [long]

How to use nested in Ealsticsearch 7.10

Categories

Resources