Does ES support adding a new item to a nested field of an existing document? - elasticsearch

I have an ES document like this:
{
"title": "Nest eggs",
"comments": [
{
"name": "John Smith",
"comment": "Great article"
},
{
"name": "Alice White",
"comment": "More like this please"
}
]
}
Now I'd like to add a new entry to "comments" in this document, so that the document finally becomes:
{
"title": "Nest eggs",
"comments": [
{
"name": "John Smith",
"comment": "Great article"
},
{
"name": "Alice White",
"comment": "More like this please"
},
{
"name": "New guy",
"comment": "something here"
}
]
}
I don't want to resend the existing "comments" objects on every append, so what is the best approach for adding a new object to this nested field each time?
My solution:
POST test_v2/_update/Z_nM_2wBjkGOA-r6ArOb
{
"script": {
"lang": "painless",
"source": "ctx._source.nested_field.add(params.object)",
"params": {
"object": {
"model" : "tata nano",
"value" : "2"
}
}
}
}

You can also check for a missing field in the script itself: if the field doesn't exist, it is created first.
POST test3/_update/30RaAG0BY3127H1HaOEv
{
"script": {
"lang": "painless",
"source": "if(!ctx._source.containsKey('comments')){ctx._source['comments']=[]}ctx._source.comments.add(params.object)",
"params": {
"object": {
"model": "tata nano",
"value": "2"
}
}
}
}
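To make the logic of the guarded script easier to follow, here is a minimal sketch in plain Python that mimics it on an ordinary dict. It does not talk to Elasticsearch; the function name append_to_list_field is just an illustrative helper, not part of any client library.

```python
def append_to_list_field(source, field, obj):
    """Mimic the guarded Painless script on a plain dict (no Elasticsearch involved)."""
    # Same guard as `!ctx._source.containsKey('comments')`: create the list if it is missing.
    if field not in source:
        source[field] = []
    # Same as `ctx._source.comments.add(params.object)`: append without resending old items.
    source[field].append(obj)
    return source


doc = {"title": "Nest eggs"}
append_to_list_field(doc, "comments", {"name": "John Smith", "comment": "Great article"})
append_to_list_field(doc, "comments", {"name": "New guy", "comment": "something here"})
print(doc["comments"])
```

The point is that only the new object travels in params; the existing list is modified in place on the server side.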

Related

Insert child document in Elasticsearch update_by_query

I have a parent-child model in Elasticsearch and I would like to add a new child to a parent based on a query. Naturally, the update_by_query call is a candidate for this. However, I am struggling to insert a new child document in the script section. Does anyone have a suggestion on how to do this? Manual inserts would work, but performance would be very slow.
Example scenario: Two parents, one is married, one is single. The married one already has a child. The update should now add a new child to all married parents.
Index mappings
PUT /family
{
"mappings": {
"properties": {
"familyRelation": {
"type": "join",
"relations": {
"parent": "child"
}
},
"name": {
"type": "keyword"
},
"status": {
"type": "keyword"
}
}
}
}
Parent document:
PUT /family/_doc/dave
{
"name": "dave",
"familyRelation": "parent",
"status": "married"
}
PUT /family/_doc/tom
{
"name": "tom",
"familyRelation": "parent",
"status": "single"
}
Child document:
PUT /family/_doc/amelie/?routing=dave
{
"familyRelation": {
"name": "child",
"parent": "dave"
},
"name": "amelie"
}
An (incomplete) update by query would look like this:
POST /family/_update_by_query
{
"script": {
"source": "???",
"lang": "painless",
"params": {
"child": {
"familyRelation": {
"name": "child",
"parent": "dave"
},
"name": "Jonathan"
}
}
},
"query": {
"term": {
"status": "married"
}
}
}

Elasticsearch - nested types vs collapse/aggs

I have a use case where I need to find the latest data based on some fields.
The fields are:
category.name
category.level
createdAt
For example: search for the newest data where category.name = 'John G.' AND category.level = 'A'. I expect the document with ID = 1, since it matches the criteria and is the newest one based on the createdAt field ("createdAt": "2022-04-18 19:09:27.527+0200").
The problem is that category.* is a nested field, and I can't use aggs or collapse on these fields because ES doesn't support it.
Mapping:
PUT data
{
"mappings": {
"properties": {
"createdAt": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss.SSSZ"
},
"category": {
"type": "nested",
"properties": {
"name": {
"type": "text",
"analyzer": "keyword"
}
}
},
"approved": {
"type": "text",
"analyzer": "keyword"
}
}
}
}
Data:
POST data/_create/1
{
"category": [
{
"name": "John G.",
"level": "A"
},
{
"name": "Chris T.",
"level": "A"
}
],
"createdBy": "John",
"createdAt": "2022-04-18 19:09:27.527+0200",
"approved": "no"
}
POST data/_create/2
{
"category": [
{
"name": "John G.",
"level": "A"
},
{
"name": "Chris T.",
"level": "A"
}
],
"createdBy": "Max",
"createdAt": "2022-04-10 10:09:27.527+0200",
"approved": "no"
}
POST data/_create/3
{
"category": [
{
"name": "Rick J.",
"level": "B"
}
],
"createdBy": "Rick",
"createdAt": "2022-03-02 02:09:27.527+0200",
"approved": "no"
}
I'm looking for either a search query that can handle this with acceptable performance, or a new object design without the nested type where I could take advantage of the aggs/collapse features.
Any suggestion will be really appreciated.
About your first question,
For example: search for the newest data where category.name = 'John G.' AND category.level = 'A'. I expect the document with ID = 1, since it matches the criteria and is the newest one based on the createdAt field ("createdAt": "2022-04-18 19:09:27.527+0200").
I believe you can do something along those lines:
GET /data/_search
{
"query": {
"nested": {
"path": "category",
"query": {
"bool": {
"must": [
{
"match": {
"category.name": "John G."
}
},
{
"match": {
"category.level": "A"
}
}
]
}
}
}
},
"sort": [
{
"createdAt": {
"order": "desc"
}
}
],
"size":1
}
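As a rough illustration of what the nested query plus sort and size 1 is expected to return, here is a small pure-Python model over the three sample documents. It is only a sketch of the semantics (both conditions must hold on the same category object, then the newest createdAt wins); the helper name newest_match is made up for this example.

```python
from datetime import datetime

# Python equivalent of the mapping's "yyyy-MM-dd HH:mm:ss.SSSZ" date format.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"

docs = {
    "1": {"category": [{"name": "John G.", "level": "A"}, {"name": "Chris T.", "level": "A"}],
          "createdAt": "2022-04-18 19:09:27.527+0200"},
    "2": {"category": [{"name": "John G.", "level": "A"}, {"name": "Chris T.", "level": "A"}],
          "createdAt": "2022-04-10 10:09:27.527+0200"},
    "3": {"category": [{"name": "Rick J.", "level": "B"}],
          "createdAt": "2022-03-02 02:09:27.527+0200"},
}


def newest_match(docs, name, level):
    """Return the id of the newest doc with one category object matching both values."""
    matching = [
        (datetime.strptime(doc["createdAt"], DATE_FORMAT), doc_id)
        for doc_id, doc in docs.items()
        # Nested semantics: name and level must match inside the SAME object.
        if any(c.get("name") == name and c.get("level") == level for c in doc["category"])
    ]
    # Sort by createdAt descending and keep one hit, like "sort" + "size": 1.
    return max(matching)[1] if matching else None


print(newest_match(docs, "John G.", "A"))
```

For the sample data this picks document 1, because documents 1 and 2 both match and document 1 has the later createdAt.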
For the second matter, it really depends on what you are aiming to do. You could merge category.name and category.level into a single field, so that your document would look like:
{
"category": ["John G. A","Chris T. A"],
"createdBy": "Max",
"createdAt": "2022-04-10 10:09:27.527+0200",
"approved": "no"
}
The nested type is then no longer needed, although I agree it feels like using tape to fix your issue.
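If you go down that road, the merge could be done in your indexing code before the document is sent. A minimal sketch, assuming a plain "name level" concatenation as in the example document (the helper name merge_category is hypothetical):

```python
def merge_category(doc):
    """Return a copy of doc whose nested category objects are merged into flat strings."""
    merged = dict(doc)
    # One string per original nested object, e.g. "John G. A".
    merged["category"] = [f'{c["name"]} {c["level"]}' for c in doc["category"]]
    return merged


original = {
    "category": [{"name": "John G.", "level": "A"}, {"name": "Chris T.", "level": "A"}],
    "createdBy": "Max",
    "createdAt": "2022-04-10 10:09:27.527+0200",
    "approved": "no",
}
print(merge_category(original)["category"])
```

One caveat, as far as I know: collapse expects a single-valued field, so a multi-valued array like this is probably better suited to a terms aggregation than to collapse.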

Elastic Search Wildcard query with space failing 7.11

My data is indexed in Elasticsearch 7.11. This is the mapping I got when I added documents directly to my index:
{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}
I haven't added the keyword part and have no idea where it came from.
I am running a wildcard query on this field, but I am unable to get results for search terms that contain spaces.
{
"query": {
"bool":{
"should":[
{"wildcard": {"name":"*hello world*"}}
]
}
}
}
I have seen many answers mentioning not_analyzed, and I have tried setting {"index":"true"} in the mapping, but it did not help. How can I make the wildcard search work in this version of Elasticsearch?
I tried adding a wildcard field:
PUT http://localhost:9001/indexname/_mapping
{
"properties": {
"name": {
"type" :"wildcard"
}
}
}
And got following response
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "mapper [name] cannot be changed from type [text] to [wildcard]"
}
],
"type": "illegal_argument_exception",
"reason": "mapper [name] cannot be changed from type [text] to [wildcard]"
},
"status": 400
}
Here is a sample document that should match:
{
"_index": "accelerators",
"_type": "_doc",
"_id": "602ec047a70f7f30bcf75dec",
"_score": 1.0,
"_source": {
"acc_id": "602ec047a70f7f30bcf75dec",
"name": "hello world example",
"type": "Accelerator",
"description": "khdkhfk ldsjl klsdkl",
"teamMembers": [
{
"userId": "karthik.r#gmail.com",
"name": "Karthik Ganesh R",
"shortName": "KR",
"isOwner": true
},
{
"userId": "anand.sajan#gmail.com",
"name": "Anand Sajan",
"shortName": "AS",
"isOwner": false
}
],
"sectorObj": [
{
"item_id": 14,
"item_text": "Cross-sector"
}
],
"geographyObj": [
{
"item_id": 4,
"item_text": "Global"
}
],
"technologyObj": [
{
"item_id": 1,
"item_text": "Artificial Intelligence"
}
],
"themeColor": 1,
"mainImage": "assets/images/Graphics/Asset 35.svg",
"features": [
{
"name": "Ideation",
"icon": "Asset 1007.svg"
},
{
"name": "Innovation",
"icon": "Asset 1044.svg"
},
{
"name": "Strategy",
"icon": "Asset 1129.svg"
},
{
"name": "Intuitive",
"icon": "Asset 964.svg"
}
],
"logo": {
"actualFileName": "",
"fileExtension": "",
"fileName": "",
"fileSize": 0,
"fileUrl": ""
},
"customLogo": {
"logoColor": "#B9241C",
"logoText": "EC",
"logoTextColor": "#F6F6FA"
},
"collaborators": [
{
"userId": "muhammed.arif#gmail.com",
"name": "muhammed Arif P T",
"shortName": "MA"
},
{
"userId": "anand.sajan#gmail.com",
"name": "Anand Sajan",
"shortName": "AS"
}
],
"created_date": "2021-02-18T19:30:15.238000Z",
"modified_date": "2021-03-11T11:45:49.583000Z"
}
}
You cannot modify a field mapping once created. However, you can create another sub-field of type wildcard, like this:
PUT http://localhost:9001/indexname/_mapping
{
"properties": {
"name": {
"type": "text",
"fields": {
"wildcard": {
"type" :"wildcard"
},
"keyword": {
"type" :"keyword",
"ignore_above":256
}
}
}
}
}
When the mapping is updated, you need to reindex your data so that the new field gets indexed, like this:
POST http://localhost:9001/indexname/_update_by_query
And then when this finishes, you'll be able to query on this new field like this:
{
"query": {
"bool": {
"should": [
{
"wildcard": {
"name.wildcard": "*hello world*"
}
}
]
}
}
}
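To see why the original query on the text field finds nothing, here is a simplified Python model. It assumes the text field is analyzed into lowercase, whitespace-separated terms (a rough stand-in for the default analyzer) and that the wildcard pattern is compared against each term, whereas the wildcard sub-field compares it against the whole original value. The function names are made up for this sketch.

```python
from fnmatch import fnmatchcase


def text_terms(value):
    # Rough stand-in for the default analyzer: lowercase and split on whitespace.
    return value.lower().split()


def wildcard_on_text(value, pattern):
    # On a text field the pattern is tested against each individual term.
    return any(fnmatchcase(term, pattern) for term in text_terms(value))


def wildcard_on_whole_value(value, pattern):
    # On a wildcard (or keyword) field the pattern is tested against the full string.
    return fnmatchcase(value, pattern)


name = "hello world example"
print(wildcard_on_text(name, "*hello world*"))         # False: no single term contains a space
print(wildcard_on_whole_value(name, "*hello world*"))  # True: the whole value is matched
```

So no single term of "hello world example" can match a pattern containing a space, while the unanalyzed value in name.wildcard can.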

Is it possible to update a document with an array of fields in Elasticsearch using the update_by_query API?

All the documents in my index contain the following content:
"universities": {
"number": 1,
"state": [
{
"Name": "michigan",
"country": "us",
"code": 5696
}
]
}
I want to update all the documents in the index like this
"universities": {
"number": 1,
"state": [
{
"Name": "michigan",
"country": "us",
"code": 5696
},
{
"Name": "seatle",
"country": "us",
"code": 5695
}
]
}
Is this possible using update_by_query in Elasticsearch 2.4.1?
I tried the query below:
{
"script": {
"inline": "for(i in ctx._source.univeristies.state){i.name=Text}",
"params": {
"Text": "seatle"
}
}
}
but it appends the name to the existing entry rather than adding a new entry to the list.
You need to use this script instead:
{
"script": {
"inline": "ctx._source.universities.state.add(new_state)",
"params": {
"new_state": {
"Name": "Seattle",
"country": "us",
"code": 5695
}
}
}
}
UPDATE:
For later versions of ES (6+), the query looks like this instead:
{
"script": {
"source": "ctx._source.universities.state.add(params.new_state)",
"params": {
"new_state": {
"Name": "Seattle",
"country": "us",
"code": 5695
}
}
}
}

Elasticsearch query on inner list and get only matching objects from list instead of entire list in result document

In the following Elasticsearch document I need to find the comments from a specific name, e.g. "Mary Brown". Basically, I want to query on the inner list and get only the matching objects from the list instead of the entire list in the result document. Is this possible? I have mapped 'comments' as nested.
{
"title": "Investment secrets",
"body": "What they don't tell you ...",
"tags": [ "shares", "equities" ],
"comments": [
{
"name": "Mary Brown",
"comment": "Lies, lies, lies",
"age": 42,
"stars": 1,
"date": "2014-10-18"
},
{
"name": "John Smith",
"comment": "You're making it up!",
"age": 28,
"stars": 2,
"date": "2014-10-16"
},
{
"name": "Mary Brown",
"comment": "making it!!!",
"age": 42,
"stars": 3,
"date": "2014-10-20"
}
]
}
Since you have properly mapped your comments field as nested, then yes this is possible using inner_hits, like this:
{
"_source": false,
"query": {
"nested": {
"path": "comments",
"inner_hits": {
"_source": [
"comment", "date"
]
},
"query": {
"bool": {
"must": [
{
"term": {
"comments.name": "Mary Brown"
}
}
]
}
}
}
}
}
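To give an idea of what comes back, here is a small Python sketch that models the effect of inner_hits on the sample document: only the nested objects that match are returned, each trimmed to the requested _source fields. This is an approximation of the behaviour, not the real response format, and the helper name nested_inner_hits is hypothetical.

```python
def nested_inner_hits(doc, path, field, value, source_fields):
    """Return only the nested objects under `path` whose `field` equals `value`."""
    hits = []
    for offset, item in enumerate(doc.get(path, [])):
        if item.get(field) == value:
            # Keep the position in the original array plus the requested fields only.
            hits.append({
                "offset": offset,
                "_source": {k: item[k] for k in source_fields if k in item},
            })
    return hits


post = {
    "title": "Investment secrets",
    "comments": [
        {"name": "Mary Brown", "comment": "Lies, lies, lies", "age": 42, "stars": 1, "date": "2014-10-18"},
        {"name": "John Smith", "comment": "You're making it up!", "age": 28, "stars": 2, "date": "2014-10-16"},
        {"name": "Mary Brown", "comment": "making it!!!", "age": 42, "stars": 3, "date": "2014-10-20"},
    ],
}

for hit in nested_inner_hits(post, "comments", "name", "Mary Brown", ["comment", "date"]):
    print(hit)
```

For the sample document this yields the first and third comments only, which is the behaviour the question asks for.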

Resources