Looking for Elasticsearch updateByQuery syntax example (Node driver) - elasticsearch

You have an Elasticsearch index with two docs:
[
  {
    "_index": "myIndex",
    "_type": "myType",
    "_id": "es1472002807930",
    "_source": {
      "animal": "turtle",
      "color": "green",
      "weight": 20
    }
  },
  {
    "_index": "myIndex",
    "_type": "myType",
    "_id": "es1472002809463",
    "_source": {
      "animal": "bear",
      "color": "brown",
      "weight": 400
    }
  }
]
Later, you get this updated data about the bear:
{
  "color": "pink",
  "weight": 500,
  "diet": "omnivore"
}
So, you want to update the "color" and "weight" values of the bear, and add the "diet" key to the "bear" doc. You know there's only one doc with "animal": "bear" (but you don't know the _id).
Using the Node.js driver, what updateByQuery syntax would update the "bear" doc with these new values?
(NOTE: this question has been entirely edited to be more useful to the SO community!)

The answer was provided by Val in this other SO:
How to update a document based on query using elasticsearch-js (or other means)?
Here is the answer:
// Script that sets the new values on every matching document
var theScript = {
  "inline": "ctx._source.color = 'pink'; ctx._source.weight = 500; ctx._source.diet = 'omnivore';"
};
client.updateByQuery({
  index: myindex, // variables holding the index and type names
  type: mytype,
  body: {
    "query": { "match": { "animal": "bear" } },
    "script": theScript
  }
}, function(err, res) {
  if (err) {
    reportError(err);
  }
  cb(err, res);
});

The other answer is missing the point since it doesn't have any script to carry out the update.
You need to do it like this:
POST /myIndex/myType/_update_by_query
{
"query": {
"term": {
"animal": "bear"
}
},
"script": "ctx._source.color = 'green'"
}
Important notes:
You need to enable dynamic scripting for this to work.
If you are using ES 2.3 or later, the update-by-query feature is built in.
If you are using ES 1.7.x or an earlier release, you need to install the update-by-query plugin.
If you are using anything between ES 2.0 and 2.2, there is no way to do this in one shot; you need to do it in two operations.
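For the ES 2.0 to 2.2 case, the two operations are typically a search for the matching ids followed by a bulk partial update. Here is a minimal sketch of the second half, assuming the search response has the usual hits.hits array. buildBulkUpdateBody is a hypothetical helper name, not part of the client:

```javascript
// Hypothetical helper: turn search hits into a bulk body of partial updates.
// Each hit yields an action line followed by a { doc: ... } line.
function buildBulkUpdateBody(hits, partialDoc) {
  const body = [];
  for (const hit of hits) {
    body.push({ update: { _index: hit._index, _type: hit._type, _id: hit._id } });
    body.push({ doc: partialDoc });
  }
  return body;
}

// Example with a hit shaped like the "bear" document from the question.
const hits = [{ _index: "myIndex", _type: "myType", _id: "es1472002809463" }];
const bulkBody = buildBulkUpdateBody(hits, { color: "pink", weight: 500, diet: "omnivore" });
console.log(JSON.stringify(bulkBody, null, 2));
```

The resulting array is meant to be passed as the body of a client.bulk call. A partial doc update like this does not rely on scripting.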
UPDATE
Your Node.js code should look like this (you're missing the body parameter):
client.updateByQuery({
  index: index,
  type: type,
  // the query and the script both go inside body
  body: {
    "query": { "match": { "animal": "bear" } },
    "script": { "inline": "ctx._source.color = 'pink'" }
  }
}, function(err, res) {
  if (err) {
    reportError(err);
  }
  cb(err, res);
});

For Elasticsearch 7.4 you could use:
await client.updateByQuery({
  index: "indexName",
  body: {
    query: {
      match: { fieldName: "valueSearched" }
    },
    script: {
      source: "ctx._source.fieldName = params.newValue",
      lang: 'painless',
      params: {
        newValue: "newValue"
      }
    }
  }
});
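When several fields need updating at once, as with the bear data in the question, the script and its params can be generated from the update object instead of being written by hand. This is a rough sketch; buildUpdateScript is a hypothetical helper, and it assumes field names are plain identifiers:

```javascript
// Hypothetical helper: build a parameterized Painless script that copies
// every key of `fields` into ctx._source.
function buildUpdateScript(fields) {
  const keys = Object.keys(fields);
  for (const key of keys) {
    // Assumption: only plain identifier-like field names are supported here.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(key)) {
      throw new Error("unsupported field name: " + key);
    }
  }
  return {
    lang: "painless",
    source: keys.map((k) => `ctx._source.${k} = params.${k};`).join(" "),
    params: { ...fields },
  };
}

const script = buildUpdateScript({ color: "pink", weight: 500, diet: "omnivore" });
console.log(script.source);
```

The returned object is intended to be used as the script value inside body. Passing values through params rather than concatenating them into the source avoids quoting problems with string values.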

Related

How to filter match in top 3 - elasticsearch?

I have the following data in Elasticsearch:
[
{
"_index": "media",
"_type": "information",
"_id": "6838",
"_source": {
"demographics_countries": {
"AE": 0.17543859649122806,
"CA": 0.013157894736842105,
"FR": 0.017543859649122806,
"GB": 0.043859649122807015,
"IT": 0.02631578947368421,
"LB": 0.013157894736842105,
"SA": 0.49122807017543857,
"TR": 0.017543859649122806,
"US": 0.09210526315789472
}
}
},
{
"_index": "media",
"_type": "information",
"_id": "57696",
"_source": {
"demographics_countries": {
"TN": 0.8125,
"MA": 0.034375,
"DZ": 0.032812,
"FR": 0.0125,
"EG": 0.0125,
"IN": 0.009375,
"SA": 0.009375
}
}
}
]
Expected result:
Find documents where a specific country, SA (Saudi Arabia), is among the top 3 in demographics_countries.
For example:
"_id": "6838" (the first document) matches because SA (Saudi Arabia) is among the top 3 of its demographics_countries.
What I tried: I tried to filter using top_hits, but it's not working as expected.
Any suggestions would be appreciated.
With the current data model it's quite difficult to do that. What I'd suggest might be not the easiest way to do it, but it will definitely be the fastest to query eventually.
I'd suggest remodelling your documents to already include top countries:
[
{
"_index": "media",
"_type": "information",
"_id": "6838",
"_source": {
"top_demographics_countries": ["TN", "MA", "DZ"],
"demographics_countries": {
"AE": 0.17543859649122806,
"CA": 0.013157894736842105,
"FR": 0.017543859649122806,
"GB": 0.043859649122807015,
"IT": 0.02631578947368421,
"LB": 0.013157894736842105,
"SA": 0.49122807017543857,
"TR": 0.017543859649122806,
"US": 0.09210526315789472
}
}
},
{
"_index": "media",
"_type": "information",
"_id": "57696",
"_source": {
"top_demographics_countries": ["TN", "MA", "DZ"],
"demographics_countries": {
"TN": 0.8125,
"MA": 0.034375,
"DZ": 0.032812,
"FR": 0.0125,
"EG": 0.0125,
"IN": 0.009375,
"SA": 0.009375
}
}
}
]
Ignore the values I've picked for top_demographics_countries. With this approach, you precalculate the top countries at index time and can then use a simple term query to check whether a document contains that value:
{
"query": {
"bool": {
"filter": {
"term": {
"top_demographics_countries": "SA"
}
}
}
}
}
It's going to be cheaper to compute them once during saving compared to always building that clause dynamically.
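A minimal sketch of that precalculation step, run in application code before the document is indexed. topCountries is a hypothetical helper name, and the sample values are taken from the first document in the question:

```javascript
// Hypothetical helper: return the keys of the n largest values.
function topCountries(demographics, n = 3) {
  return Object.entries(demographics)
    .sort(([, a], [, b]) => b - a) // descending by share
    .slice(0, n)
    .map(([code]) => code);
}

const doc = {
  demographics_countries: {
    AE: 0.17543859649122806,
    CA: 0.013157894736842105,
    FR: 0.017543859649122806,
    GB: 0.043859649122807015,
    IT: 0.02631578947368421,
    LB: 0.013157894736842105,
    SA: 0.49122807017543857,
    TR: 0.017543859649122806,
    US: 0.09210526315789472,
  },
};
doc.top_demographics_countries = topCountries(doc.demographics_countries);
console.log(doc.top_demographics_countries); // SA, AE, US
```

The field has to be recomputed whenever demographics_countries changes, which is the trade-off for the cheaper query.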
@Evaldas is right: it's better to extract the top 3 beforehand.
But if you can't help yourself and feel compelled to use Java/Painless, here's one approach:
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "demographics_countries.SA"
}
},
{
"script": {
"script": {
"source": """
def tuple_list = new ArrayList();
for (def c : params.all_countries) {
def key = 'demographics_countries.'+c;
if (!doc.containsKey(key) || doc[key].size() == 0) {
continue;
}
def val = doc[key].value;
tuple_list.add([c, val]);
}
// sort tuple list by the country values, largest first
Collections.sort(tuple_list, (arr1, arr2) -> arr1[1] < arr2[1] ? 1 : -1);
// subList(0, 3) throws when fewer than 3 countries are present, so bail out first
if (tuple_list.size() < 3) {
return false;
}
// slice & take only the top 3
def top_3_countries = tuple_list.subList(0, 3).stream().map(arr -> arr[0]).collect(Collectors.toList());
return top_3_countries.contains(params.country_of_interest);
""",
"params": {
"country_of_interest": "SA",
"all_countries": [
"AE",
"CA",
"FR",
"GB",
"IT",
"LB",
"SA",
"TR",
"US",
"TN",
"MA",
"DZ",
"EG",
"IN"
]
}
}
}
}
]
}
}
}

Spring Mongo - An aggregation to order by objects in an array

I have the following data:
{
"_id": ObjectID("5e2fa881c3a1a70006c5743c"),
"name": "Some name",
"policies": [
{
"cId": "dasefa-2738-4cf0-90e0d568",
"weight": 12
},
{
"cId": "c640ad67dasd0-92f981583568",
"weight": 50
}
]
}
I'm able to query this with Spring Mongo fine; however, I want to be able to order the policies by weight.
At the moment I get my results fine with:
return mongoTemplate.find(query, CArea::class.java)
However say I make the following aggregations:
val unwind = Aggregation.unwind("policies")
val sort = Aggregation.sort(Sort.Direction.DESC,"policies.weight")
How can I actually apply those to the results returned above? I was hoping that dot notation in my query would do the job, but it didn't do anything, e.g. Query().with(Sort.by(options.sortDirection, "policies.weight"))
Any help appreciated.
Thanks.
I am not familiar with Spring Mongo, but I guess you can convert the following aggregation to Spring code.
db.collection.aggregate([
{
$unwind: "$policies"
},
{
$sort: {
"policies.weight": -1
}
},
{
$group: {
_id: "$_id",
"policies": {
"$push": "$policies"
},
parentFields: {
$first: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$parentFields",
{
policies: "$policies"
}
]
}
}
}
])
This results in:
[
{
"_id": "5e2fa881c3a1a70006c5743c",
"name": "Some name",
"policies": [
{
"cId": "c640ad67dasd0-92f981583568",
"weight": 50
},
{
"cId": "dasefa-2738-4cf0-90e0d568",
"weight": 12
}
]
}
]
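To make the effect of the pipeline concrete, here is the same reordering sketched in plain JavaScript on an in-memory document. This only illustrates what the stages achieve per document; it is not Spring Mongo code, and sortPoliciesByWeight is a made-up name:

```javascript
// Returns a copy of the document with policies ordered by weight, descending.
function sortPoliciesByWeight(doc) {
  return {
    ...doc,
    policies: [...doc.policies].sort((a, b) => b.weight - a.weight),
  };
}

const area = {
  _id: "5e2fa881c3a1a70006c5743c",
  name: "Some name",
  policies: [
    { cId: "dasefa-2738-4cf0-90e0d568", weight: 12 },
    { cId: "c640ad67dasd0-92f981583568", weight: 50 },
  ],
};
console.log(JSON.stringify(sortPoliciesByWeight(area), null, 2));
```

Sorting after the fetch like this can be a reasonable fallback when the arrays are small, at the cost of doing the work in the application rather than in the database.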

MongoDB aggregation query using spring

db.getCollection('questionbank').aggregate([
{ "$group": {
"_id": {
"technology": "$technology",
"level":"$level",
"type":"$type"
},
"Count": { "$sum": 1 }
}},
{ "$group": {
"_id": "$_id.technology",
"QuestionCount": {
"$push": {
"level":"$_id.level",
"type":"$_id.type",
"count": "$Count"
},
}
}}
])
I am trying to get the same output structure.
Can anyone please help me write the above query in Spring?
I have tried a lot but failed.
You can use the following stages:
group("technology", "level", "type").count().as("count"),
group("_id.technology")
    .push(new BasicDBObject("level", "$_id.level")
        .append("type", "$_id.type")
        .append("count", "$count"))
    .as("questionCount")
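For reference, the output structure the two group stages are expected to produce can be sketched with an in-memory equivalent in plain JavaScript. The sample documents below are made up for illustration, and questionCounts is a hypothetical name:

```javascript
// In-memory equivalent of the two $group stages, for illustration only.
function questionCounts(questions) {
  // Stage 1: count documents per (technology, level, type).
  const counts = new Map();
  for (const q of questions) {
    const key = JSON.stringify([q.technology, q.level, q.type]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Stage 2: regroup by technology and push { level, type, count }.
  const byTechnology = new Map();
  for (const [key, count] of counts) {
    const [technology, level, type] = JSON.parse(key);
    if (!byTechnology.has(technology)) byTechnology.set(technology, []);
    byTechnology.get(technology).push({ level, type, count });
  }
  return [...byTechnology].map(([technology, QuestionCount]) => ({
    _id: technology,
    QuestionCount,
  }));
}

// Made-up sample data.
const sampleQuestions = [
  { technology: "java", level: "easy", type: "mcq" },
  { technology: "java", level: "easy", type: "mcq" },
  { technology: "java", level: "hard", type: "mcq" },
  { technology: "python", level: "easy", type: "code" },
];
console.log(JSON.stringify(questionCounts(sampleQuestions), null, 2));
```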

Children are not mapping properly in elastic to parents

"chods": {
"mappings": {
"chod": {
"properties": {
"state": {
"type": "text"
}
}
},
"chods": {},
"variant": {
"_parent": {
"type": "chod"
},
"_routing": {
"required": true
},
"properties": {
"percentage": {
"type": "double"
}
}
}
}
},
When I execute:
PUT /chods/variant/565?parent=36442
{ // some data }
It returns:
{
"_index":"chods",
"_type":"variant",
"_id":"565",
"_version":6,
"result":"updated",
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"created":false
}
But when I run this query:
GET /chods/variant/565?parent=36442
It returns variant with parent=36443
{
"_index": "chods",
"_type": "variant",
"_id": "565",
"_version": 7,
"_routing": "36443",
"_parent": "36443",
"found": true,
"_source": {
...
}
}
Why does it return parent 36443 and not 36442?
When I tried to reproduce this with your steps, I got the expected result (parent=36442). I noticed that after your PUT of the document with "_parent": "36442" the output is "_version":6. In your GET of the document, "_version": 7 is returned. Is it possible that you posted another version of the document in between?
I also noticed that GET /chods/variant/565?parent=36443 would not actually filter by the parent id - the query parameter is disregarded. If you actually want to filter by parent id, this is the query you're looking for:
GET /chods/_search
{
"query": {
"parent_id": {
"type": "variant",
"id": "36442"
}
}
}
As @fylie pointed out, the main problem is that if you reuse the same document id, the document gets overwritten by the latest version (sort of).
Let's say that we have an index /tests and a type "a" that is a child of type "test", and we run the following commands:
PUT /tests/a/50?parent=25
{
"item": "C"
}
PUT /tests/a/50?parent=26
{
"item": "D"
}
PUT /tests/a/50?parent=50
{
"item": "E",
"item2": "F"
}
What will the result be? It can create anywhere from 1 to 3 documents.
If all three requests route to the same shard, you end up with one document that has 3 versions.
If they route to 3 different shards, you end up with 3 separate documents.
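The reason is that a child document is routed by its parent id, and the target shard is derived from a hash of the routing value modulo the number of primary shards. Here is a toy sketch of that rule, assuming 5 primary shards. The hash below is a simple stand-in and not the function Elasticsearch uses, so the actual shard numbers will differ:

```javascript
// Toy stand-in for the routing hash. Elasticsearch uses its own hash
// function internally; this one only illustrates the modulo step.
function toyHash(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit int
  }
  return h;
}

// shard = hash(routing) % number_of_primary_shards
function shardFor(routing, numberOfShards) {
  return toyHash(routing) % numberOfShards;
}

// The same _id (50) indexed with three different parents.
for (const parent of ["25", "26", "50"]) {
  console.log(`_id=50 parent=${parent} -> shard ${shardFor(parent, 5)}`);
}
```

Because uniqueness of an _id is only enforced within a shard, the same _id sent with different parents can coexist on different shards.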

Elasticsearch: upsert working on older version but not on newer version

I have an upsert query running in bulk. The final document should be stored like this:
{
"email": "abc@xyz.com",
"sources": [1,2]
}
Here is the code:
var doc = {
"source": parseInt(id),
"email": email
}
var upsert_query = {
"script": "if (ctx._source.containsKey(\"sources\")) { if (!ctx._source.sources.contains(source)) { ctx._source.sources += source; } } else {ctx._source.sources = [source] }",
"params": {
"source": doc.source
},
"upsert": {
"email": doc.email,
"sources": [doc.source]
}
}
bulkRequestBody.push({"update": {"_index": "my_index", "_type": "email", "_id": doc.email, "_retry_on_conflict": 3}});
bulkRequestBody.push(upsert_query);
The code works perfectly fine on Elasticsearch version 1.4 but does not work on version 2.1.1.
I also tried to restructure my query:
var upsert_query = {
"script": {
"inline": "if (ctx._source.containsKey(\"sources\")) { if (!ctx._source.sources.contains(source)) { ctx._source.sources += source; } } else {ctx._source.sources = [source] }",
"params": {
"source": doc.source
}
},
"upsert": {
"email": doc.email,
"sources": [doc.source]
}
}
But still no luck. Any help?
Scripting needs to be enabled to run scripts like this.
In the elasticsearch.yml file in the config directory, add the following lines:
script.inline: on
script.indexed: on
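Once scripting is enabled, it can help to check the script's logic separately from the cluster settings. Below is a plain-JavaScript rendering of what the script in the question does to an existing document. This is an illustration only; applySourceScript is a made-up name and nothing here runs inside Elasticsearch:

```javascript
// Plain-JavaScript rendering of the script's logic, for illustration only.
// docSource plays the role of ctx._source, source the role of the param.
function applySourceScript(docSource, source) {
  if ("sources" in docSource) {
    if (!docSource.sources.includes(source)) {
      docSource.sources.push(source);
    }
  } else {
    docSource.sources = [source];
  }
  return docSource;
}

const existing = { email: "abc@xyz.com", sources: [1] };
applySourceScript(existing, 2); // appends a new source
applySourceScript(existing, 2); // no duplicate on a repeat
console.log(JSON.stringify(existing));
```

If the document does not exist yet, the upsert body is stored as is and the script is not run, which is why the upsert part already carries sources as a one-element array.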
