Upsert does not add a new document when the document doesn't exist (Elasticsearch)

I am creating new docs like this:
PUT test/_doc/1
{
"counter" : 1,
"tags" : "red"
}
Now I want to update or insert a document whether or not it already exists:
POST test/_update/2
{
"script" : {
"source": "ctx._source.counter += params.count",
"lang": "painless",
"params" : {
"count" : 4
}
},
"upsert" : {
"counter" : 1
}
}
In my case, the document with id 2 does not exist, which is why I added the upsert section to the request, so that the document is created automatically when it does not exist.
Instead, I am getting this error message:
{
"error": {
"root_cause": [
{
"type": "invalid_type_name_exception",
"reason": "Document mapping type name can't start with '_', found: [_update]"
}
],
"type": "invalid_type_name_exception",
"reason": "Document mapping type name can't start with '_', found: [_update]"
},
"status": 400
}
Have I misunderstood how this works?
UPDATE
Mapping:
PUT /test
{
"mappings": {
"type_name": {
"properties": {
"counter" : { "type" : "integer" },
"tags": { "type" : "text" }
}}},
"settings": {
"number_of_shards": 1
}
}
ElasticSearch version: "6.8.4"

Try this. You were looking at the 7.x documentation; this is the documentation for your version: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/docs-update.html
In 6.8 the update endpoint still includes the mapping type in the path ({index}/{type}/{id}/_update):
POST test/type_name/2/_update
{
"script" : {
"source": "ctx._source.counter += params.count",
"lang": "painless",
"params" : {
"count" : 4
}
},
"upsert" : {
"counter" : 1
}
}
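For reference, when document 2 does not exist, the contents of the upsert block are indexed as-is and the script is not executed, so a follow-up GET on the same index, type and id should return a source of roughly this shape:
GET test/type_name/2

{
  "_index" : "test",
  "_type" : "type_name",
  "_id" : "2",
  "found" : true,
  "_source" : {
    "counter" : 1
  }
}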

Related

How to update field format in Opensearch/Elasticsearch?

I am trying to change the format of a string field in opensearch:
PUT my_index/_mapping
{
"mappings": {
"properties": {
"timestamp": {
"type": "date",
"format": "YYYY-MM-DD HH:mm:ss.SSS"
}
}
}
}
Response is
{
"error" : {
"root_cause" : [
{
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [mappings : {properties={timestamp={format=YYYY-MM-DD HH:mm:ss.SSS, type=date}}}]"
}
],
"type" : "mapper_parsing_exception",
"reason" : "Root mapping definition has unsupported parameters: [mappings : {properties={timestamp={format=YYYY-MM-DD HH:mm:ss.SSS, type=date}}}]"
},
"status" : 400
}
I've spent days trying to figure this out; Opensearch seems unnecessarily complex to me.
You cannot change the type of an existing field once it has been created. You need to reindex the index that has the wrong mapping into a new index with the correct mapping.
First, create the new index:
PUT new_index
{
"mappings": {
"properties": {
"timestamp": {
"type": "date",
"format": "YYYY-MM-DD HH:mm:ss.SSS"
}
}
}
}
Then, reindex the old index into the new one:
POST _reindex
{
"source": {
"index": "old_index"
},
"dest": {
"index": "new_index"
}
}
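Optionally, once the reindex has finished and you have verified the data, you can delete the old index and add its name as an alias of the new one so that existing queries keep working. A sketch, assuming the index names used above:
DELETE old_index

POST _aliases
{
  "actions": [
    { "add": { "index": "new_index", "alias": "old_index" } }
  ]
}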

Bulk API error while indexing data into Elasticsearch

I want to import some data into Elasticsearch using the bulk API. This is the mapping I created using Kibana Dev Tools:
PUT /main-news-test-data
{
"mappings": {
"properties": {
"content": {
"type": "text"
},
"title": {
"type": "text"
},
"lead": {
"type": "text"
},
"agency": {
"type": "keyword"
},
"date_created": {
"type": "date"
},
"url": {
"type": "keyword"
},
"image": {
"type": "keyword"
},
"category": {
"type": "keyword"
},
"id":{
"type": "keyword"
}
}
}
}
and this is my bulk data:
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{
"content":"\u0641\u0647\u06cc\u0645\u0647 \u062d\u0633\u0646\u200c\u0645\u06cc\u0631\u06cc: \u0627\u06af\u0631\u0686\u0647 \u062f\u0631 \u0647\u06cc\u0627\u0647\u0648\u06cc ",
"title":"\u06a9\u0627\u0631\u0647\u0627\u06cc \u0642\u0627\u0644\u06cc\u0628\u0627\u0641",
"lead":"\u062c\u0627\u0645\u0639\u0647 > \u0634\u0647\u0631\u06cc -.",
"agency":"13",
"date_created":1494518193,
"url":"http://www.khabaronline.ir/(X(1)S(bud4wg3ebzbxv51mj45iwjtp))/detail/663749/society/urban",
"image":"uploads/2017/05/11/1589793661.jpg",
"category":"15",
"id":"2981643"
}
{ "index" : { "_index" : "main-news-test-data", "_id" : "2" } }
{
....
But when I try to post the data, I receive this error:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"
}
]
},
"status" : 400
}
What is the problem? I tried both PowerShell and the POST method in Kibana Dev Tools, but I receive the same error in both.
Each document's source should be specified on a single line, like this:
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{ "content":"\u0641\u0647","title":"\u06a9" }
Please refer to this SO answer.
Try the below format of bulk JSON. I have tested this bulk API request locally as well, and it works fine:
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{"content":"\u0641\u0647\u06cc\u0645\u0647 \u062d\u0633\u0646\u200c\u0645\u06cc\u0631\u06cc: \u0627\u06af\u0631\u0686\u0647 \u062f\u0631 \u0647\u06cc\u0627\u0647\u0648\u06cc ", "title":"\u06a9\u0627\u0631\u0647\u0627\u06cc \u0642\u0627\u0644\u06cc\u0628\u0627\u0641", "lead":"\u062c\u0627\u0645\u0639\u0647 > \u0634\u0647\u0631\u06cc -.", "agency":"13", "date_created":1494518193, "url":"http://www.khabaronline.ir/(X(1)S(bud4wg3ebzbxv51mj45iwjtp))/detail/663749/society/urban", "image":"uploads/2017/05/11/1589793661.jpg", "category":"15", "id":"2981643"}
Don't forget to add a newline at the end of your content.
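For example, a complete request in the Kibana Dev Tools console would look like the following, with every action line and every document source on its own single line and a trailing newline at the end (only a few fields are shown here, with placeholder values):
POST _bulk
{ "index" : { "_index" : "main-news-test-data", "_id" : "1" } }
{ "content" : "...", "title" : "...", "agency" : "13", "date_created" : 1494518193, "category" : "15", "id" : "2981643" }
{ "index" : { "_index" : "main-news-test-data", "_id" : "2" } }
{ "content" : "...", "title" : "...", "id" : "..." }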

Add date field and boolean with ? in name to existing Elasticsearch documents

We need to add two new fields to an existing ElasticSearch (7.9 oss) instance.
Field 1: Date Field
We want to add an optional date field. It shouldn't have a value upon creation.
How can we do this with update_by_query?
Tried this:
POST orders/_update_by_query
{
"query": {
"match_all": {}
},
"script": {
"source": "ctx._source.new_d3_field",
"lang": "painless",
"type": "date",
"format": "yyyy/MM/dd HH:mm:ss"
}
}
Field 2: Boolean field with ? in name
We want to keep the ? so that it matches the other fields that we already have in ES.
It's also worth noting that even when we remove the ? and run the request below, the field doesn't end up as a boolean.
Tried this:
POST orders/_update_by_query
{
"query": {
"match_all": {}
},
"script": {
"source": "ctx._source.new_b_field? = false",
"lang": "painless"
}
}
Which gave the error:
{
"error" : {
"root_cause" : [
{
"type" : "script_exception",
"reason" : "compile error",
"script_stack" : [
"ctx._source.new_b_field? = false",
" ^---- HERE"
],
"script" : "ctx._source.new_b_field? = false",
"lang" : "painless",
"position" : {
"offset" : 25,
"start" : 0,
"end" : 32
}
}
],
"type" : "script_exception",
"reason" : "compile error",
"script_stack" : [
"ctx._source.new_b_field? = false",
" ^---- HERE"
],
"script" : "ctx._source.new_b_field? = false",
"lang" : "painless",
"position" : {
"offset" : 25,
"start" : 0,
"end" : 32
},
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "invalid sequence of tokens near ['='].",
"caused_by" : {
"type" : "no_viable_alt_exception",
"reason" : null
}
}
},
"status" : 400
}
Also tried:
POST orders/_update_by_query?new_b_field%3F=false
Which gave:
{
"error" : {
"root_cause" : [
{
"type" : "illegal_argument_exception",
"reason" : "request [/orders/_update_by_query] contains unrecognized parameter: [new_b_field?]"
}
],
"type" : "illegal_argument_exception",
"reason" : "request [/orders/_update_by_query] contains unrecognized parameter: [new_b_field?]"
},
"status" : 400
}
If you want to add two new fields that have no value upon creation to an existing Elasticsearch index, you should update its mapping using the put mapping API:
PUT /orders/_mapping
{
"properties": {
"new_d3_field": {
"type": "date",
"format": "yyyy/MM/dd HH:mm:ss"
},
"new_b_field?": {
"type": "boolean"
}
}
}
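After this, the new fields exist in the index mapping (existing documents simply have no value for them yet), which you can verify with a quick mapping lookup:
GET orders/_mapping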
If you still want to use _update_by_query, you have to set an initial value; only then will the field be added:
POST orders/_update_by_query?wait_for_completion=false&conflicts=proceed
{
"query": {
"match_all": {}
},
"script": {
"source": "ctx._source.new_d3_field=params.date;ctx._source.new_b_field = params.val",
"lang": "painless",
"params": {
"date": "1980/01/01",
"val": false
}
}
}
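If you want to keep the ? in the field name when scripting, Painless will not accept it in dot notation, but ctx._source is a map, so bracket notation should work. A sketch based on the same request as above:
POST orders/_update_by_query?wait_for_completion=false&conflicts=proceed
{
  "query": {
    "match_all": {}
  },
  "script": {
    "source": "ctx._source.new_d3_field = params.date; ctx._source['new_b_field?'] = params.val",
    "lang": "painless",
    "params": {
      "date": "1980/01/01",
      "val": false
    }
  }
}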
The update by query API is used to update documents, so I don't think you can add a field to your schema without updating at least one document. What you can do is pick a dummy document and update only that specific document. Something like this:
POST orders/_update_by_query
{
"query": {
"match": {
"my-field":"my-value"
}
},
"script": {
"source": "ctx._source.new_d3_field=params.date;ctx._source.new_b_field = params.val",
"lang": "painless",
"params": {
"date": "1980/01/01",
"val": false
}
}
}

For an Elasticsearch index, how to get the documents where an array field has length greater than 0?

In an Elasticsearch index, how do I get the documents where an array field has length greater than 0?
I tried the following multiple syntaxes but didn't get any breakthrough. I got the same error with all of them.
GET http://{{host}}:{{elasticSearchPort}}/student_details/_search
Syntax 1:
{
"query": {
"bool": {
"filter": {
"script": {
"script": {
"source": "doc['enrolledCourses'].values.length > 0",
"lang": "painless"
}
}
}
}
}
}
Error:
"caused_by": {
"type": "illegal_argument_exception",
"reason": "No field found for [enrolledCourses] in mapping with types []"
}
Syntax 2:
{
"query": {
"bool": {
"filter": {
"script": {
"script": {
"source": "doc['enrolledCourses'].values.size() > 0",
"lang": "painless"
}
}
}
}
}
}
Error:
"caused_by": {
"type": "illegal_argument_exception",
"reason": "No field found for [enrolledCourses] in mapping with types []"
}
Syntax 3:
{
"query": {
"bool": {
"filter" : {
"script" : {
"script" : "doc['enrolledCourses'].values.size() > 0"
}
}
}
}
}
Error:
"caused_by": {
"type": "illegal_argument_exception",
"reason": "No field found for [enrolledCourses] in mapping with types []"
}
Syntax 4:
{
"query": {
"bool": {
"filter" : {
"script" : {
"script" : "doc['enrolledCourses'].values.length > 0"
}
}
}
}
}
Error:
"caused_by": {
"type": "illegal_argument_exception",
"reason": "No field found for [enrolledCourses] in mapping with types []"
}
Please help me solve this.
I don't know which version of Elasticsearch you run, so I ran all my tests on the latest version, 7.9.0.
I will use a Painless script for the scripting.
I put two documents into the index test:
PUT test/_doc/1
{
"name": "Vasia",
"enrolledCourses" : ["test1", "test2"]
}
PUT test/_doc/2
{
"name": "Petya"
}
As you can see, one document contains the enrolledCourses field and the second does not.
In Painless you don't need to use the values field; you can take the length directly, according to the Painless documentation. So I skip the values accessor in my script:
GET test/_search
{
"query": {
"bool": {
"filter": [
{
"script": {
"script": {
"source": "doc['enrolledCourses'].length > 0",
"lang": "painless"
}
}
}
]
}
}
}
After running this I received two different errors:
{
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:757)",
"org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
"org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:331)",
"org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:97)",
"org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:94)",
"java.base/java.security.AccessController.doPrivileged(AccessController.java:312)",
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:94)",
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
"doc['enrolledCourses'].length > 0",
" ^---- HERE"
]
}
and
{
"type" : "illegal_argument_exception",
"reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [enrolledCourses] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}
Both errors are pretty clear. The first is for the document where the field doesn't exist, and the second occurs because Elasticsearch indexed the string array field with the default mapping type text.
Both cases are easy to fix by mapping the enrolledCourses field as keyword.
With an explicit keyword mapping the first error goes away because doc['enrolledCourses'] is always defined (it is simply empty for documents without the field), and the second goes away because keyword fields have doc values, so the fielddata restriction on text fields no longer applies.
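Since the mapping of an existing field cannot be changed in place, I first delete the test index, recreate it with the new mapping below, and then re-index the two sample documents from the beginning of this answer:
DELETE test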
PUT test
{
"settings": {
"number_of_replicas": 0
},
"mappings": {
"properties": {
"name": {
"type": "keyword"
},
"enrolledCourses": {
"type": "keyword"
}
}
}
}
Now I receive the right answer for the query:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "test",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.0,
"_source" : {
"name" : "Vasia",
"enrolledCourses" : [
"test1",
"test2"
]
}
}
]
}
}

Elasticsearch: how to update with a partial document using update_by_query

I want to update the documents in my index whose cname is wang.
My index code is as follows:
PUT index_c
{
"mappings": {
"_doc" : {
"properties" : {
"cid" : {
"type" : "keyword"
},
"cname" : {
"type" : "keyword"
},
"cage" : {
"type" : "short"
},
"chome" : {
"type" : "text"
}
}
}
}
}
And my update request is as follows:
POST index_c/_update_by_query
{
"query" : {
"match": {
"cname": "wang"
}
},
"doc" : {
"cage" : "100",
"chome" : "china"
}
}
But I got an error like this:
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "Unknown key for a START_OBJECT in [doc].",
"line": 1,
"col": 43
}
],
"type": "parsing_exception",
"reason": "Unknown key for a START_OBJECT in [doc].",
"line": 1,
"col": 43
},
"status": 400
}
So I want to know how to implement this using update_by_query.
I think this will work for you: just replace the doc part with a script. If inline shows as deprecated for you, just use source instead.
POST index_c/_update_by_query
{
"query" : {
"match": {
"cname": "wang"
}
},
"script" : {
"inline" : "ctx._source.cage='100'; ctx._source.chome= 'china';",
"lang" : "painless"
}
}
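If inline is deprecated in your version, the equivalent request with source and script parameters would look something like this (a sketch; since cage is mapped as short you can also pass it as a number instead of a string):
POST index_c/_update_by_query
{
  "query" : {
    "match": {
      "cname": "wang"
    }
  },
  "script" : {
    "source": "ctx._source.cage = params.cage; ctx._source.chome = params.chome",
    "lang": "painless",
    "params": {
      "cage": 100,
      "chome": "china"
    }
  }
}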
