I have a document in elasticsearch that looks like this:
{
"_index": "stats",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"publishTime": {
"lastUpdate": 1580991095131,
"h0_4": 0,
"h4_8": 0,
"h8_12": 3,
"h12_16": 5,
"h16_20": 2,
"h20_24": 1
},
"postCategories": {
"lastUpdate": 1580991095131,
"tech": 56,
"lifestyle": 63,
"healthcare": 49,
"finances": 25,
}
}
}
Updating/Incrementing existing property values by sending a POST request to /stats/_update/1 works great! However, if I try to upsert a non-existing property name under postCategories, I get a Bad Request (400) error of type remote_transport_exception/illegal_argument_exception:
"ctx._source.postCategories.relationships += params.postCategories.relationships",
^---- HERE"
Upsert
{
"script": {
"source": "ctx._source.postCategories.relationships += params.postCategories.relationships",
"lang": "painless",
"params": {
"postCategories": {
"relationships": 2
}
}
},
"upsert": {
"postCategories": {
"relationships": 2
}
}
}
I also tried the Scripted Upsert method by following the documentation here; however, the same error occurs:
Scripted Upsert
{
"scripted_upsert":true,
"script": {
"source": "ctx._source.postCategories.relationships += params.postCategories.relationships",
"params": {
"postCategories": {
"relationships": 2
}
}
},
"upsert": {}
}
Can anyone tell me how can I properly add/upsert new property names under postCategories object, please?
Thank You!
It's basically saying that you are trying to assign a value to a field that doesn't exist. I think the script below should work (not tested): check whether the field exists and, if it does, continue with the operation; otherwise add the new field and assign the value.
"if (ctx._source.postCategories.containsKey(\"relationships\")) { ctx._source.postCategories.relationships += params.postCategories.relationships} else { ctx._source.postCategories[\"relationships\"] = params.postCategories.relationships}",
Related
I noticed that when using script_fields, the values are always returned as an array. I wonder why this is happening and whether it is possible to return a non-array data type like an object or a bool instead?
Example to illustrate (taken from the web):
GET sat/_search
{
"script_fields": {
"some_scores": {
"script": {
"lang": "painless",
"inline": "def scores = 0; scores = doc['AvgScrRead'].value + doc['AvgScrWrit'].value; return scores;" <--- we are returning a number here!
}
}
}
}
Result:
{
"_index": "sat",
"_type": "scores",
"_id": "AV3CYR8JFgEfgdUCQSON",
"_score": 1,
"_source": {
"cds": 1611760130062,
"rtype": "S",
"sname": "American High",
"dname": "Fremont Unified",
"cname": "Alameda",
"enroll12": 444,
"NumTstTakr": 298,
"AvgScrRead": 576,
"AvgScrMath": 610,
"AvgScrWrit": 576,
"NumGE1500": 229,
"PctGE1500": 76.85,
"year": 1516
},
"fields": {
"some_scores": [ <----- here it's an array
1152
]
}
}
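Values returned under fields (which is where script_fields end up) are always wrapped in an array because a field can be multi-valued, so the usual workaround is to unwrap the value on the client side. A minimal sketch, assuming the search response above is stored in a variable named response:
var score = response.hits.hits[0].fields.some_scores[0]; // 1152, a plain number again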
"chods": {
"mappings": {
"chod": {
"properties": {
"state": {
"type": "text"
}
}
},
"chods": {},
"variant": {
"_parent": {
"type": "chod"
},
"_routing": {
"required": true
},
"properties": {
"percentage": {
"type": "double"
}
}
}
}
},
When I execute:
PUT /chods/variant/565?parent=36442
{ // some data }
It returns:
{
"_index":"chods",
"_type":"variant",
"_id":"565",
"_version":6,
"result":"updated",
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"created":false
}
But when I run this query:
GET /chods/variant/565?parent=36442
It returns the variant with parent=36443:
{
"_index": "chods",
"_type": "variant",
"_id": "565",
"_version": 7,
"_routing": "36443",
"_parent": "36443",
"found": true,
"_source": {
...
}
}
Why does it return parent 36443 and not 36442?
When I tried to reproduce this with your steps, I got the expected result (_parent=36442). I noticed that after your PUT of the document with parent=36442 the output is "_version": 6, while your GET of the document returns "_version": 7. Is it possible that you indexed another version of the document in between?
I also noticed that GET /chods/variant/565?parent=36443 would not actually filter by the parent id - the query parameter is disregarded. If you actually want to filter by parent id, this is the query you're looking for:
GET /chods/_search
{
"query": {
"parent_id": {
"type": "variant",
"id": "36442"
}
}
}
As #fylie pointed out, the main problem is that if you reuse the same document id, the document gets overridden by the last version - sort of.
Let's say that we have an index /tests and a type "a" which is a child of type "test", and we run the following commands:
PUT /tests/a/50?parent=25
{
"item": "C"
}
PUT /tests/a/50?parent=26
{
"item": "D"
}
PUT /tests/a/50?parent=50
{
"item": "E",
"item2": "F",
}
What will the result be? It can result in creating one to three documents.
If all three requests route to the same shard, you will end up with one document that has 3 versions.
If they route to three different shards, you will end up with three separate documents.
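To check which case you ended up in, a search by id hits all shards regardless of routing, so it finds every copy of id 50 (a sketch against the hypothetical /tests index from above; the total hit count tells you how many documents were actually created):
GET /tests/_search
{
  "query": {
    "ids": {
      "values": ["50"]
    }
  }
}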
You have an Elasticsearch index with two docs:
[
{
"_index": "myIndex",
"_type": "myType",
"_id": "es1472002807930",
"_source": {
"animal": "turtle",
"color": "green",
"weight": 20,
}
},
{
"_index": "myIndex",
"_type": "myType",
"_id": "es1472002809463",
"_source": {
"animal": "bear",
"color": "brown"
"weight": 400,
}
}
]
Later, you get this updated data about the bear:
{
"color": "pink",
"weight": 500,
"diet": "omnivore",
}
So, you want to update the "color" and "weight" values of the bear, and add the "diet" key to the "bear" doc. You know there's only one doc with "animal": "bear" (but you don't know the _id):
Using the Nodejs driver, what updateByQuery syntax would update the "bear" doc with these new values?
(NOTE: this question has been entirely edited to be more useful to the SO community!)
The answer was provided by Val in this other SO question:
How to update a document based on query using elasticsearch-js (or other means)?
Here is the answer:
var theScript = {
"inline": "ctx._source.color = 'pink'; ctx._source.weight = 500; ctx._source.diet = 'omnivore';"
}
client.updateByQuery({
index: myindex,
type: mytype,
body: {
"query": { "match": { "animal": "bear" } },
"script": theScript
}
}, function(err, res) {
if (err) {
reportError(err)
}
cb(err, res)
}
)
The other answer is missing the point since it doesn't have any script to carry out the update.
You need to do it like this:
POST /myIndex/myType/_update_by_query
{
"query": {
"term": {
"animal": "bear"
}
},
"script": "ctx._source.color = 'green'"
}
Important notes:
you need to make sure to enable dynamic scripting in order for this to work (see the config sketch right after these notes).
if you are using ES 2.3 or later, then the update-by-query feature is built-in.
if you are using ES 1.7.x or an earlier release, you need to install the update-by-query plugin.
if you are using anything between ES 2.0 and 2.2, then you don't have any way to do this in one shot; you need to do it in two operations.
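For reference, on ES 2.x enabling dynamic scripting means adding something like the following to elasticsearch.yml and restarting the node (a sketch; on 1.x the equivalent setting was script.disable_dynamic: false):
script.inline: on
script.indexed: on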
UPDATE
Your node.js code should look like this; you're missing the body parameter:
client.updateByQuery({
index: index,
type: type,
body: {
"query": { "match": { "animal": "bear" } },
"script": { "inline": "ctx._source.color = 'pink'"}
}
}, function(err, res) {
if (err) {
reportError(err)
}
cb(err, res)
}
)
For Elasticsearch 7.4 you could use:
await client.updateByQuery({
index: "indexName",
body: {
query: {
match: { fieldName: "valueSearched" }
},
script: {
source: "ctx._source.fieldName = params.newValue",
lang: 'painless',
params: {
newValue: "newValue"
}
}
}
});
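Note that passing the new value via params instead of hard-coding it into source lets Elasticsearch compile the script once and reuse it for every call, which keeps you clear of the script compilation limit.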
I have an upsert query running in bulk. The final document should be stored like this:
{
"email": "abc#xyz.com",
"sources": [1,2]
}
Here is the code:
var doc = {
"source": parseInt(id),
"email": email
}
var upsert_query = {
"script": "if (ctx._source.containsKey(\"sources\")) { if (!ctx._source.sources.contains(source)) { ctx._source.sources += source; } } else {ctx._source.sources = [source] }",
"params": {
"source": doc.source
},
"upsert": {
"email": doc.email,
"sources": [doc.source]
}
}
bulkRequestBody.push({"update": {"_index": "my_index", "_type": "email", "_id": doc.email, "_retry_on_conflict": 3}});
bulkRequestBody.push(upsert_query);
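For context, the accumulated bulkRequestBody is then sent through the client's bulk API, roughly like this (a sketch in the same callback style as the rest of the code; each entry in resp.items reports the outcome of one upsert):
client.bulk({
  body: bulkRequestBody
}, function (err, resp) {
  if (err) {
    console.error(err);
  }
});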
The code works perfectly fine on Elasticsearch version 1.4 but does not work on version 2.1.1.
I also tried restructuring my query:
var upsert_query = {
"script": {
"inline": "if (ctx._source.containsKey(\"sources\")) { if (!ctx._source.sources.contains(source)) { ctx._source.sources += source; } } else {ctx._source.sources = [source] }",
"params": {
"source": doc.source
}
},
"upsert": {
"email": doc.email,
"sources": [doc.source]
}
}
but still no luck. Any help?
Scripting needs to be enabled to run scripts like this.
In the elasticsearch.yml file in the config directory, add the following lines:
script.inline: on
script.indexed: on
I am new to Elasticsearch. I have created an index "cmn" with a type "mention". I am trying to import data from my existing Solr setup into Elasticsearch, and I want to map an existing field to the _id field.
I have created the following file under /config/mappings/cmn/:
{
"mappings": {
"mentions":{
"_id" : {
"path" : "docKey"
}
}
}
}
But this doesn't seem to be working; every time I index a record, an auto-generated _id like the following is created:
"_index": "cmn",
"_type": "mentions",
"_id": "k4E0dJr6Re2Z39HAIjYMmg",
"_score": 1
Also, the mapping is not reflected. I have also tried the following option:
{
"mappings": {
"_id" : {
"path" : "docKey"
}
}
}
SAMPLE DOCUMENT: Basically a tweet.
{
"usrCreatedDate": "2012-01-24 21:34:47",
"sex": "U",
"listedCnt": 2,
"follCnt": 432,
"state": "Southampton",
"classified": 0,
"favCnt": 468,
"timeZone": "Casablanca",
"twitterId": 473333038,
"lang": "en",
"stnostem": "#ootd #ootw #fashion #styling #photography #white #pink #playsuit #prada #sunny #spring http://t.co/YbPFrXlpuh",
"sourceId": "tw",
"timestamp": "2014-04-09T22:58:00.396Z",
"sentiment": 0,
"updatedOnGMTDate": "2014-04-09T22:56:57.000Z",
"userLocation": "Southampton",
"age": 0,
"priorityScore": 57.4700012207031,
"statusCnt": 14612,
"name": "YazzyK",
"profilePicUrl": "http://pbs.twimg.com/profile_images/453578494556270594/orsA0pKi_normal.jpeg",
"mentions": "",
"sourceStripped": "Instagram",
"collectionName": "STREAMING",
"tags": "557/161/193/197",
"msgid": 1397084280396.33,
"_version_": 1464949081784713200,
"url2": "{\"urls\":[{\"url\":\"http://t.co/YbPFrXlpuh\",\"expandedURL\":\"http://instagram.com/p/mliZbgxVZm/\",\"displayURL\":\"instagram.com/p/mliZbgxVZm/\",\"start\":88,\"end\":110}]}",
"links": "http://t.co/YbPFrXlpuh",
"retweetedStatus": "",
"twtScreenName": "YazKader",
"postId": "454030232501358592",
"country": "Bermuda",
"message": "#ootd #ootw #fashion #styling #photography #white #pink #playsuit #prada #sunny #spring http://t.co/YbPFrXlpuh",
"source": "Instagram",
"parentStatusId": -1,
"bio": "Live and breathe Fashion. Persian and proud- Instagram: #Yazkader",
"createdOnGMTDate": "2014-04-09T22:56:57.000Z",
"searchText": "#ootd #ootw #fashion #styling #photography #white #pink #playsuit #prada #sunny #spring http://t.co/YbPFrXlpuh",
"isFavorited": "False",
"frenCnt": 214,
"docKey": "tw_454030232501358592"
}
Also, how can we create a unique mapping for each "type" and not just for the index?
Thanks
Do it like this.
Put the mapping as:
PUT index_name/type_name/_mapping
{
"type_name": {
"_id": {
"path": "docKey"
},
"properties": {
"docKey": {
"type": "string"
}
}
}
}
And it will work. (When you index docKey, the _id is set.) You shouldn't have to provide the full mapping.
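As a quick sanity check (a sketch based on the sample tweet above; note that _id.path only exists up to 1.x and was removed in Elasticsearch 2.0), indexing without an explicit id should now pick it up from docKey:
POST index_name/type_name
{
  "docKey": "tw_454030232501358592",
  "twtScreenName": "YazKader"
}
GET index_name/type_name/tw_454030232501358592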