Elasticsearch: add new fields to existing documents

I would like to add to an existing document matched by a query a new object with new fields.
PUT test/_doc/1
{
  "id": 1,
  "text": "My life is beautiful",
  "category": "optimistic"
}
I would like to add to all the "category": "optimistic" documents a new object, something like
{
  "references": {
    "group": "Pro-life",
    "responsable": "Mr. Happy Guy",
    "job": "Happiness bringer"
  }
}
I would like to try with update_by_query, but I cannot make it work with an object like this. Any ideas?
I did try with this:
{
"script": "ctx._source.references='{\"hello\":\"world\"}'",
"query": {
"match": {
"category": "optimistic"
}
}
}
But it doesn't give me the expected result: the value is saved as the string "{\"hello\":\"world\"}", whereas I wanted it as a JSON object.

Not tested, but you could try passing the object via params along with update_by_query:
POST test/_update_by_query
{
  "query": {
    "match": {
      "category": "optimistic"
    }
  },
  "script": {
    "source": "ctx._source.references = params.new_fields",
    "params": {
      "new_fields": {
        "group": "Pro-life",
        "responsable": "Mr. Happy Guy",
        "job": "Happiness bringer"
      }
    }
  }
}

You should use Painless map-initialization syntax to do that; try it like this:
"script": "ctx._source.references = [ \"hello\": \"world\" ]"
More info: https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-operators-reference.html#map-initialization-operator
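To see why the first attempt stored a string, note that the failing script assigns a quoted JSON string, while the params approach (or Painless map syntax) assigns an actual map. A minimal Python sketch of the difference, using a plain dict to stand in for the document's _source (this simulates the semantics, it is not a Painless interpreter):

```python
import json

# Stand-in for a document's _source.
doc = {"id": 1, "text": "My life is beautiful", "category": "optimistic"}

# Approach 1: assigning a quoted JSON string, as in
# ctx._source.references = '{"hello":"world"}' -- this stores a string.
doc["references"] = '{"hello":"world"}'
print(type(doc["references"]).__name__)  # str

# Approach 2: passing the object via params (or using Painless map
# syntax) stores a real nested object.
new_fields = {"group": "Pro-life", "responsable": "Mr. Happy Guy",
              "job": "Happiness bringer"}
doc["references"] = new_fields
print(type(doc["references"]).__name__)  # dict
print(json.dumps(doc["references"], sort_keys=True))
```

The key point carries over directly: whatever type the script assigns is what gets serialized back into _source, so a quoted string stays a string.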

Related

Is it possible to upsert documents in elasticsearch based on a given field other than _id?

I have an index with documents as follows
{ "StartId": "123a", "EndId": "123b", "Tag": "tag1" }
{ "StartId": "234a", "EndId": "234b", "Tag": "tag2" }
{ "StartId": "345a", "EndId": "345b", "Tag": "tag3" }
{ "StartId": "456a", "EndId": "456b", "Tag": "tag4" }
Now I have a new document
{ "StartId": "567a", "EndId": "567b", "Tag": "tag5" }
If StartId from the new doc already exists in the index, I want to update that document with the data from the new doc; otherwise I want to insert the new document.
I know how to upsert by _id
POST test/_update/1
{
"doc": {
"name": "new_name"
},
"doc_as_upsert": true
}
But is it possible to upsert by any field other than _id? If yes, how can I do it?
No, it's not possible in one go.
doc_as_upsert is reserved for the Update API which requires an _id in the request's URI path.
You can, however, update documents without knowing the _id through the Update by query API:
POST test-index/_update_by_query
{
"query": {
"match_all": {}
},
"script": {
"source": """
if (ctx._source['StartId'] == params['StartId']) {
ctx._source.putAll(params['upsertWith']);
}
""",
"lang": "painless",
"params": {
"StartId": "567a",
"upsertWith": {
"name": "new_name"
}
}
}
}
A true upsert is not possible this way, though, because ctx['_id'] is read-only. You can, however, delete matched documents with it (by setting ctx.op = 'delete').
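Since a single-call upsert keyed on StartId isn't possible, the usual workaround is a client-side two-step: run the update-by-query first, and index the document only if nothing matched. A minimal in-memory sketch of that logic, with a plain Python list standing in for the index (upsert_by_start_id is a hypothetical helper name, not an Elasticsearch API):

```python
def upsert_by_start_id(index, new_doc):
    """Update every doc whose StartId matches; insert new_doc if none did.

    `index` is a list of dicts standing in for the ES index. A real
    implementation would run _update_by_query, check the `updated`
    count in its response, and fall back to an index request.
    """
    updated = 0
    for doc in index:
        if doc.get("StartId") == new_doc["StartId"]:
            doc.update(new_doc)       # mirrors ctx._source.putAll(params)
            updated += 1
    if updated == 0:
        index.append(dict(new_doc))   # fallback: insert as a new document
    return updated

index = [
    {"StartId": "123a", "EndId": "123b", "Tag": "tag1"},
    {"StartId": "234a", "EndId": "234b", "Tag": "tag2"},
]

# Existing StartId -> in-place update.
upsert_by_start_id(index, {"StartId": "123a", "EndId": "123b", "Tag": "tag1-new"})
# New StartId -> insert.
upsert_by_start_id(index, {"StartId": "567a", "EndId": "567b", "Tag": "tag5"})
print(len(index), index[0]["Tag"])  # 3 tag1-new
```

Note that this two-step approach is not atomic: a concurrent writer could insert the same StartId between the update and the fallback insert, so it works best when one process owns the index.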

How to match first and last names with Elasticsearch?

A typical Elasticsearch JSON response looks something like:
[
  {
    "_index": "articles",
    "_id": "993",
    "_score": 10.443843,
    "_source": {
      "title": "This is a test title",
      "authors": [
        {
          "first_name": "john",
          "last_name": "smith"
        },
How can I query for all articles where one of the authors is 'john smith'? Currently I have:
const {
hits: { hits }
} = await client.search({
index: "articles",
body: {
query: {
bool: {
should: [
{
match: {
"authors.first_name": "john"
}
},
{
match: {
"authors.first_name": "Smith"
}
}
]
}
}
}
});
But this returns articles where first or last name are john or smith, not articles with a 'john smith' as an author.
I think you are facing the nested vs. object dilemma here. You can achieve what you are looking for by changing the type of the authors field to the nested type (you didn't share your index mapping, so I'm assuming here) and using this query:
{
  "query": {
    "nested": {
      "path": "authors",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "authors.first_name": {
                  "query": "john"
                }
              }
            },
            {
              "match": {
                "authors.last_name": {
                  "query": "smith"
                }
              }
            }
          ]
        }
      }
    }
  }
}
Hope that helps.
Well, in this case you're using a "should" clause, which can be read as
firstname:john OR lastname:smith
This is easily fixed with a "must" instead, which reads as
firstname:john AND lastname:smith
Also, as Rob mentioned in his answer, nested vs. object is indeed a dilemma, but that dilemma only appears when you're dealing with arrays of objects.
For example, say you have the following entry:
entry #1
{
"serviceType": "mysql",
"allowedUsers": [
{
"firstName": "Daniel",
"lastName": "Acevedo"
},
{
"firstName": "John",
"lastName": "Smith"
},
{
"firstName": "Mike",
"lastName": "K"
}
]
}
and you do the following search
{
"size": 10,
"query": {
"query_string": {
"query": "allowedUsers.firstName:john AND allowedUsers.lastName:acevedo"
}
}
}
you WILL have a match in the document, because both firstName and lastName match the document even though they match in different user objects. This is an example of OBJECT mapping.
In that case there is no workaround: you must use NESTED mapping in order to accomplish a natural match.
In your specific case I don't think you're facing this, so going with OBJECT mapping and a "must" (AND instead of should/OR) query, you should be fine.
If you need further explanation, let me know and I'll edit in more details.
Cheers.
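The cross-object false positive described above can be reproduced without a cluster. With object mapping, Elasticsearch effectively flattens the array into independent per-field value lists, losing which first name belongs with which last name; nested mapping keeps each author as its own hidden document. A rough Python simulation of the two behaviors (an assumed simplification for illustration, not the actual Lucene data layout):

```python
def object_match(doc, first, last):
    # OBJECT mapping: fields are flattened into independent value lists,
    # so the two conditions can match in *different* author objects.
    firsts = [a["firstName"].lower() for a in doc["allowedUsers"]]
    lasts = [a["lastName"].lower() for a in doc["allowedUsers"]]
    return first in firsts and last in lasts

def nested_match(doc, first, last):
    # NESTED mapping: both conditions must hold within one author object.
    return any(a["firstName"].lower() == first and a["lastName"].lower() == last
               for a in doc["allowedUsers"])

doc = {"serviceType": "mysql",
       "allowedUsers": [{"firstName": "Daniel", "lastName": "Acevedo"},
                        {"firstName": "John", "lastName": "Smith"},
                        {"firstName": "Mike", "lastName": "K"}]}

print(object_match(doc, "john", "acevedo"))  # True  (the false positive)
print(nested_match(doc, "john", "acevedo"))  # False (no such author)
print(nested_match(doc, "john", "smith"))    # True
```

This is exactly why "john acevedo" matches under object mapping: john and acevedo each exist somewhere in the flattened lists, just not on the same author.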

Autocomplete is not working in Elasticsearch

If we give an exact match or only one character, it works fine, but with 2 or 3 characters autocomplete does not work. For example, "T" or "Test" works, but "Tes" does not.
My data looks like this
PUT /test/test/1
{
"id": "1",
"input": "Test",
"output": ["Testing", "Testing"]
}
PUT /test/test/2
{
"id": "2",
"input": "Test two",
"output":["Testing", "Testing"]
}
My elastic query is
{
"query": {
"query_string": {
"query": "tes"
}
}
}
You forgot a wildcard I believe:
GET /test/test/_search
{
"query": {
"query_string": {
"query": "tes*"
}
}
}
You may also want to use "query": "input:tes*" to autocomplete only one specific field.
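The effect of the trailing wildcard can be checked locally: query_string lowercases the input terms by default, and tes* then prefix-matches the indexed token test, while the bare term tes matches nothing. A rough sketch using Python's fnmatch as a stand-in for Lucene wildcard matching (an approximation of the behavior, not the real analyzer chain):

```python
from fnmatch import fnmatch

# Tokens roughly as they would appear in the index after the default
# (lowercasing) analyzer processes "Test" and "Test two".
tokens = ["test", "test", "two"]

def matches(query):
    # query_string lowercases the input; wildcards then run per token.
    return any(fnmatch(tok, query.lower()) for tok in tokens)

print(matches("Test"))  # True  -- exact term match
print(matches("tes"))   # False -- no indexed token equals "tes"
print(matches("tes*"))  # True  -- wildcard prefix matches "test"
```

For real autocomplete at scale, leading/trailing wildcards are expensive; a completion suggester or an edge n-gram analyzer is the usual production approach.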

Casting when querying ElasticSearch data

Is there a way in elasticsearch where I can cast a string to a long value at query time?
I have something like this in my document:
"attributes": [
{
"key": "age",
"value": "23"
},
{
"key": "name",
"value": "John"
},
],
I would like to write a query to get all the persons that have an age > 23. For that I need to cast the value to an int such that I can compare it when the key is age.
The above document is an example very specific to this problem.
I would greatly appreciate your help.
Thanks!
You can use scripting for that
POST /index/type/_search
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "foreach(attr : _source['attributes']) {if ( attr['key']=='age') { return attr['value'] > ageValue;} } return false;",
"params" : {
"ageValue" : 23
}
}
},
"query": {
"match_all": {}
}
}
}
}
UPD: Note that dynamic scripting must be enabled in elasticsearch.yml for this to work.
Also, you can likely achieve better query performance by refactoring your document structure and applying an appropriate mapping (e.g. an integer type) for the age field, so the cast happens at index time rather than on every query.
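The script's per-document logic, restated in Python: walk the attributes array, cast value to an integer when the key is age, and compare against the parameter. A minimal stand-in for what the script filter evaluates for each document (not tied to any Elasticsearch client):

```python
def age_filter(doc, age_value):
    # Mirrors the script: find the "age" attribute, cast its string
    # value to an int, and compare it against the parameter.
    for attr in doc["attributes"]:
        if attr["key"] == "age":
            return int(attr["value"]) > age_value
    return False  # no "age" attribute present

docs = [
    {"attributes": [{"key": "age", "value": "23"},
                    {"key": "name", "value": "John"}]},
    {"attributes": [{"key": "age", "value": "42"},
                    {"key": "name", "value": "Jane"}]},
]

print([age_filter(d, 23) for d in docs])  # [False, True]
```

Note the strict comparison: a stored "23" does not satisfy age > 23, which is why casting matters; a lexicographic string comparison would also misorder values like "9" and "10".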

Is there any way not to return arrays when specifying return fields in an Elasticsearch query?

If I have documents like this:
[
{
"model": "iPhone",
"brand": "Apple"
},
{
"model": "Nexus 5",
"brand": "Google"
}
]
And I make a query that returns only the model field, like this:
{
"fields": ["model"],
"query": {
"term": {
"brand": "apple"
}
}
}
Then each document field is returned within an array like this:
{ "model": ["iPhone"] }
instead of
{ "model": "iPhone" }
How can I avoid that and get the fields in the same format as when the fields query option is not defined?
In the end the answer was pretty easy: you have to use the _source query option instead of fields.
Example:
{
"_source": ["model"],
"query": {
"term": {
"brand": "apple"
}
}
}
This way I get documents in the same format as the original one (as when no field-selection option is defined):
{ "model": "iPhone" }
I had the same problem, and indeed (as Wax Cage said) I worried that _source might bring some performance problems. I think using both fields and _source solves it:
const fields = ['model']
{
  fields: fields,
  _source: fields,
  query: {
    term: {
      brand: 'apple'
    }
  }
}
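The difference between the two response shapes can also be handled client-side: the fields option always wraps values in arrays (fields are inherently multi-valued in Lucene), while _source filtering returns the stored document fragment as-is. A small Python sketch of how a client might normalize the fields form (unwrap_fields is a hypothetical helper, assuming single-valued fields):

```python
def unwrap_fields(hit_fields):
    """Collapse single-element arrays returned by the `fields` option.

    Leaves genuinely multi-valued fields (arrays with != 1 element)
    untouched, since unwrapping those would lose information.
    """
    return {k: v[0] if isinstance(v, list) and len(v) == 1 else v
            for k, v in hit_fields.items()}

# Shape returned with "fields": ["model"]
fields_hit = {"model": ["iPhone"]}
# Shape returned with "_source": ["model"]
source_hit = {"model": "iPhone"}

print(unwrap_fields(fields_hit) == source_hit)  # True
```

This is mostly useful when you want the retrieval characteristics of fields but your downstream code expects the flat _source shape.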
