ElasticSearch: Partially update a document or remove it (opposite of upsert)

In ElasticSearch I'm using upsert to update a document that may not exist:
POST /website/pageviews/1/_update
{
"script" : "ctx._source.online+=1",
"upsert": {
"online": 1
}
}
Since my data changes frequently, I want to remove the document when online == 0.
Fetching the document and checking the online value before every update would defeat the purpose of using update, and I don't want to accumulate a lot of stale documents.
What is the best way to remove my document when online == 0? Something like:
POST /website/pageviews/1/_update
{
"script" : "ctx._source.online-=1",
"remove_doc": "ctx._source.online == 0"
}

You can use the delete operation like this:
POST /website/pageviews/1/_update
{
"script" : "if (ctx._source.online == 0) { ctx.op = 'delete' } else { ctx._source.online += 1 }",
"upsert": {
"online": 1
}
}
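The branching such a script performs can be modelled outside Elasticsearch. The following is a minimal Python sketch, not a client call: apply_online_delta is a hypothetical helper that mirrors in memory what a scripted update with upsert does to the online counter. It assumes the intent stated in the question, namely to apply the change first and delete the document once the counter reaches zero.

```python
from typing import Optional, Tuple


def apply_online_delta(source: Optional[dict], delta: int) -> Tuple[str, Optional[dict]]:
    """Model a scripted update with upsert on the 'online' counter.

    Hypothetical helper: it only mirrors the script's branching in memory
    and never talks to Elasticsearch.
    """
    if source is None:
        # Document is missing: the upsert body is indexed as-is,
        # the script itself is not executed.
        return "create", {"online": 1}

    updated = dict(source)
    updated["online"] = updated.get("online", 0) + delta

    if updated["online"] <= 0:
        # Counterpart of setting ctx.op = 'delete' in the update script.
        return "delete", None

    # Counterpart of the default operation: re-index the changed source.
    return "index", updated
```

For example, apply_online_delta({"online": 1}, -1) reports a delete, while the same call on a missing document falls back to the upsert body.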

Related

Elasticsearch. Painless script to search based on the last result

Let's see if someone could shed a light on this one, which seems to be a little hard.
We need to correlate data from multiple indices and various fields. We are trying a Painless script.
Example:
We run a search against an index to gather the queue IDs of mails sent by someone@domain.
Once we have the queue IDs, we need to store them in an array and iterate over it to run new searches that gather data such as email receivers, spam checks, postfix results and so on.
Problem: How can we store the data from one search and use it later in a second search?
We are testing something like:
GET here_an_index/_search
{
"query": {
"bool" : {
"must": [
{
"range": {
"@timestamp": {
"gte": "now-15m",
"lte": "now"
}
}
}
],
"filter" : {
"script" : {
"script" : {
"source" : "doc['postfix_from'].value == params.from; qu = doc['postfix_queueid'].value; return qu",
"params" : {
"from" : "someone@domain"
}
}
}
}
}
}
}
And, of course, it throws an error.
"doc['postfix_from'].value ...",
"^---- HERE"
So, in a nutshell: is there any way to execute a search for some field value based on a filter (like from:someone@domain) and use those values in later searches?
We have evaluated using script fields or nested documents, but for architectural reasons and because of what those changes would entail, they cannot be used right now.
Thank you very much!
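One common way to approach this is to issue two requests from the client: collect the queue IDs from the first response, then pass them to a terms query in the second search. The Python sketch below only builds and inspects request and response bodies as dictionaries. The helper names are hypothetical, the response shape assumes the usual hits.hits[]._source layout, the field names are taken from the question, and the terms filter assumes postfix_queueid is indexed as an exact-value field.

```python
from typing import Dict, List


def extract_queue_ids(response: Dict) -> List[str]:
    """Collect distinct postfix_queueid values from a search response."""
    seen: List[str] = []
    for hit in response.get("hits", {}).get("hits", []):
        queue_id = hit.get("_source", {}).get("postfix_queueid")
        # Skip hits without the field and avoid duplicate IDs.
        if queue_id is not None and queue_id not in seen:
            seen.append(queue_id)
    return seen


def build_followup_query(queue_ids: List[str]) -> Dict:
    """Build a second search body matching any of the collected queue IDs."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"terms": {"postfix_queueid": queue_ids}},
                    {"range": {"@timestamp": {"gte": "now-15m", "lte": "now"}}},
                ]
            }
        }
    }
```

The first search would then use a plain term or match filter on postfix_from rather than a script, and the body returned by build_followup_query would be sent to whichever index holds the receiver, spam check and postfix result data.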

Elasticsearch partial update script: Clear array and replace with new values

I have documents like:
{
MyProp: ["lorem", "ipsum", "dolor"]
... lots of stuff here ...
}
My documents can be quite big (but these MyProp fields are not), and expensive to generate from scratch.
Sometimes I need to update batches of these - it would therefore be beneficial to do a partial update (to save "indexing client" processing power and bandwidth, and thus time) and replace the MyProp values with new values.
Example of original document:
{
MyProp: ["lorem", "ipsum", "dolor"]
... lots of stuff here ...
}
Example of updated document (or rather how it should look):
{
MyProp: ["dolor", "sit"]
... lots of stuff here ...
}
From what I have seen, this includes scripting.
Can anyone enlighten me with the remaining bits of the puzzle?
Bounty added:
I'd like to also have some instructions of how to make these in a batch statement, if possible.
You can use the update by query API to do batch updates. It is available from ES 2.3 onwards; on earlier versions you need to install a plugin.
POST index/_update_by_query
{
"script": {
"inline": "ctx._source.MyProp += newProp",
"params": {
"newProp": "sit"
}
},
"query": {
"match_all": {}
}
}
You can of course use whatever query you want in order to select the documents on which MyProp needs to be updated. For instance, you could have a query to select documents having some specific MyProp values to be replaced.
The above will only add a new value to the existing array. If you need to completely replace the MyProp array, then you can also change the script to this:
POST index/_update_by_query
{
"script": {
"inline": "ctx._source.MyProp = newProps",
"params": {
"newProps": ["dolor", "sit"]
}
},
"query": {
"match_all": {}
}
}
Note that you also need to enable dynamic scripting in order for this to work.
UPDATE
If you simply want to update a single document you can use the partial document update API, like this:
POST test/type1/1/_update
{
"doc" : {
"MyProp" : ["dolor", "sit"]
}
}
This will effectively replace the MyProp array in the specified document.
If you want to go the bulk route, you don't need scripting to achieve what you want:
POST index/type/_bulk
{ "update" : {"_id" : "1"} }
{ "doc" : {"MyProp" : ["dolor", "sit"] } }
{ "update" : {"_id" : "2"} }
{ "doc" : {"MyProp" : ["dolor", "sit"] } }
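When many documents are involved, the bulk body is usually generated rather than written by hand. The following Python sketch produces the newline-delimited action and doc pairs shown above. build_bulk_update_body is a hypothetical helper, and it assumes every document receives the same MyProp values.

```python
import json
from typing import Iterable, List


def build_bulk_update_body(doc_ids: Iterable[str], new_values: List[str]) -> str:
    """Return an NDJSON bulk body with one partial update per document ID."""
    lines = []
    for doc_id in doc_ids:
        # Action line: which document to update.
        lines.append(json.dumps({"update": {"_id": doc_id}}))
        # Payload line: the partial document that replaces MyProp.
        lines.append(json.dumps({"doc": {"MyProp": new_values}}))
    # Every line of a bulk request, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)
```

The returned string can be sent as the request body of the _bulk call; only the IDs and the small MyProp array travel over the wire, not the full documents.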
Would a _bulk update work for you?
POST test/type1/_bulk
{"update":{"_id":1}}
{"script":{"inline":"ctx._source.MyProp += new_param","params":{"new_param":"bla"},"lang":"groovy"}}
{"update":{"_id":2}}
{"script":{"inline":"ctx._source.MyProp += new_param","params":{"new_param":"bla"},"lang":"groovy"}}
{"update":{"_id":3}}
{"script":{"inline":"ctx._source.MyProp += new_param","params":{"new_param":"bla"},"lang":"groovy"}}
....
You would also need to enable inline scripting for groovy. The above adds a bla value to the MyProp field of each listed document. Depending on your requirements, many other changes can be performed in that script.

Partially Update Nested Records in ElasticSearch Index

I am using ElasticSearch 2.x with NEST 2 in my project.
I am facing an issue where I need to update nested records, but Elasticsearch does not update them in place; instead it deletes the records and re-indexes them.
Because of this, I always have to send all the nested records along with the updated one.
Does anyone have a solution for this? Can I update only the changed record without re-indexing all the records?
Thanks in advance for your help!
Try this,
It works for me
POST /yourindex/type/_id/_update
{
"script" : {
"inline" : "if (ctx._source.yourarray == null || ctx._source.yourarray.size() == 0){ ctx._source.yourarray = params.newarray} else {ctx._source.yourarray.add(params.newarray[0]) } ",
"params" : {
"newarray" :[
{"c1":"dfgfgsdf",
"c2":"can2",
"ce":" can2@can.co",
"cp":475522778,
"d1":[
{
"e1":"fffff",
"ffff":[{"g1":"hhhhh"},{"g2":"iiiiii"}]
}
]
}
]
}
}
}

elasticsearch filter by length of a string field

I am trying to get records whose 'title' has more than X characters.
NOTE: not all records contain a title field.
I have tried:
GET books/_search
{
"filter" : {
"script" : {
"script" : "_source.title.length() > 10"
}
}
}
As a result, I get this error:
GroovyScriptExecutionException[NullPointerException[Cannot invoke method length() on null object
How can I solve it?
You need to take into account that some documents might have a null title field. So you can use the groovy null-safe operator. Also make sure to use the POST method instead:
POST books/_search
{
"filter" : {
"script" : {
"script" : "_source.title?.size() > 10"
}
}
}
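The effect of the null-safe check can be illustrated outside Elasticsearch. This is a small Python sketch of the same predicate, with title_longer_than as a hypothetical helper operating on a plain dictionary standing in for _source: a missing or null title simply fails the filter instead of raising an error.

```python
def title_longer_than(source: dict, min_length: int) -> bool:
    """Return True only when a title exists and exceeds min_length characters."""
    title = source.get("title")
    # A missing or null title never matches, mirroring the null-safe operator.
    return title is not None and len(title) > min_length
```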
You can also use custom tokenizers to count the number of characters. See this answer for one possible approach: https://stackoverflow.com/a/47556098/463846

Elasticsearch - Create field using script if doesn't exist

Is there a way to dynamically add fields using scripts? I am running a script that checks whether a field exists. If not then creates it.
I'm trying out:
script: 'if (ctx._source.attending == null) { ctx._source.attending = { events: newField } } else if (ctx._source.attending.events == null) { ctx._source.attending.events = newField } else { ctx._source.attending.events += newField }'
But unless my _source already has a field explicitly named attending, I get:
[Error: ElasticsearchIllegalArgumentException[failed to execute script];
nested: PropertyAccessException[
[Error: could not access: attending; in class: java.util.LinkedHashMap]
To check whether a field exists use the ctx._source.containsKey function, e.g.:
curl -XPOST "http://localhost:9200/myindex/message/1/_update" -d'
{
"script": "if (!ctx._source.containsKey(\"attending\")) { ctx._source.attending = newField }",
"params" : {"newField" : "blue" },
"myfield": "data"
}'
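The same key-existence checks extend to the nested attending.events case from the question. The Python sketch below models that branching on a plain dictionary standing in for ctx._source. add_event is a hypothetical helper, and it assumes events is meant to be a list of values.

```python
def add_event(source: dict, new_field: str) -> dict:
    """Create attending.events if needed, otherwise append to it."""
    if "attending" not in source:
        # Neither level exists yet: create both in one step.
        source["attending"] = {"events": [new_field]}
    elif "events" not in source["attending"]:
        # attending exists but has no events list yet.
        source["attending"]["events"] = [new_field]
    else:
        # Both levels exist: append the new value.
        source["attending"]["events"].append(new_field)
    return source
```

Testing for the key first, as containsKey does in the script, avoids the property access error that occurs when reading a field that is absent from the map.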
I would consider whether it's really necessary to check if the field exists at all. Just apply the new mapping to ES: it will add the field if required and do nothing if it already exists.
Our system re-applies the mappings on every application startup.
