ElasticSearch - how to edit a field inside an array of a document - elasticsearch

I have a document in our ElasticSearch index which looks like this:
{
"_index": "nm_doc",
"_type": "nm_doc",
"_id": "JRPXqmQBatyecf67YEfq",
"_score": 0.86147696,
"_source": {
"text": "A 29-year-old IT professional from Bhopal was convicted and sentenced to life imprisonment by an Additional Sessions Court in Pune on Wednesday for the rape and brutal murder of a woman in 2008, after she had refused his advances. Watch What Else is Making News The court found Manu Mohinder Ebrol, who worked in the same firm as the girl, of raping and killing the woman after stabbing her 18 times on the night of October 20, 2008, in her rented apartment. After committing the crime, Ebrol had fled to Bhopal. He was arrested later by Pune Police. The prosecution examined 26 witnesses for the case and forensic evidence such as call details and medical records also proved crucial. For all the latest Pune News , download Indian Express App",
"entities": [
{
"name": "Mohinder Ebrol"
},
{
"name": "Sessions Court"
},
{
"name": "Pune Police"
},
{
"name": "Pune News"
},
{
"name": "Indian Express"
}
]
}
If I wanted to edit just the first name in that array (Mohinder Ebrol) to be Manu Ebrol, how would I accomplish this via API call? Do I need to pass in the entire array to update the one name?

I have figured it out via the documentation:
The call Url is:
POST http://elastichost:9200/indexname/_doc/JRPXqmQBatyecf67YEfq/_update?pretty
And the body simply looks like this (yes, you do have to provide the entire array):
{
"doc": { "entities": [
{
"name": "Manu Ebrol"
},
{
"name": "Sessions Court"
},
{
"name": "Pune Police"
},
{
"name": "Pune News"
},
{
"name": "Indian Express"
}
] }
}
Hope this can help someone in the future.

Related

#mention via incoming webhook in MS Teams

I'm trying to mention a user from an incoming webhook.
I tried a few iterations via Postman of
{
"text": "test #user"
}
or
{
"text": "test #user#email.com"
}
but none of these seem to work.
Is this simple but very important thing just not possible?
Thanks.
I'm afraid this isn't possible yet - the only way to do # mentions is by using the full Bot Framework APIs.
You're not the only one to have asked for this though, so I'll get it on the backlog.
This is now supported and documented here (https://learn.microsoft.com/en-us/microsoftteams/platform/task-modules-and-cards/cards/cards-format?tabs=adaptive-md%2Cconnector-html#user-mention-in-incoming-webhook-with-adaptive-cards).
Sample:
{
"type": "message",
"attachments": [
{
"contentType": "application/vnd.microsoft.card.adaptive",
"content": {
"type": "AdaptiveCard",
"body": [
{
"type": "TextBlock",
"size": "Medium",
"weight": "Bolder",
"text": "Sample Adaptive Card with User Mention"
},
{
"type": "TextBlock",
"text": "Hi <at>Adele UPN</at>, <at>Adele AAD</at>"
}
],
"$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
"version": "1.0",
"msteams": {
"entities": [
{
"type": "mention",
"text": "<at>Adele UPN</at>",
"mentioned": {
"id": "AdeleV#contoso.onmicrosoft.com",
"name": "Adele Vance"
}
},
{
"type": "mention",
"text": "<at>Adele AAD</at>",
"mentioned": {
"id": "87d349ed-44d7-43e1-9a83-5f2406dee5bd",
"name": "Adele Vance"
}
}
]
}
}
}]
}
If it helps anybody, after looking in to this and seeing it couldn't be done (still!?), a workaround for me was to change the channel notification settings to banner + feed for all new posts for relevant users in the channel. This eliminates the need to use the tag (if tagging the team).
It is supported in Teams now, however the sample code does not work in Microsoft card playground, I didn't know the code actually worked until I tried in Postman.

Best approch of Elastic Search time based feeds module?

I am new with elastic search and looking for the best solution with which i can create a feed module which have time based feeds along with there group and comment.
I learned little and come up with following.
PUT /group
{
"mappings": {
"groupDetail": {},
"content": {
"_parent": {
"type": "groupDetail"
}
},
"comment": {
"_parent": {
"type": "content"
}
}
}
}
so that will be placed separately as per index.
but than after i found one post where i found that parent child is costly operation for search than nested objects.
something like following is two group(feed) having details with content and comments as nested element.
{
"_index": "group",
"_type": "groupDetail",
"_id": 6829,
"_score": 1,
"_source": {
"groupid": 6829,
"name": "Jignesh Public",
"insdate": "2016-10-01T04:09:33.916Z",
"upddate": "2017-04-19T05:19:40.281Z",
"isVerified": true,
"tags": [
"spotrs",
"surat"
],
"content": [
{
"contentid": 1,
"type": "1",
"byUser": 5858,
"insdate": "2016-10-01 11:20",
"info": [
{
"t": 1,
"v": "lorem ipsum long text 1"
},
{
"t": 2,
"v": "http://www.imageurl.com/1"
}
],
"comments": [
{
"byuser": 5859,
"comment": "Comment 1",
"upddate": "2016-10-01T04:09:33.916Z"
},
{
"byuser": 5860,
"comment": "Comment 2",
"upddate": "2016-10-01T04:09:33.916Z"
}
]
},
{
"contentid": 2,
"type": "2",
"byUser": 5859,
"insdate": "2016-10-01 11:20",
"info": [
{
"t": 4,
"v": "http://www.videoURL.com/1"
}
],
"comments": [
{
"byuser": 5859,
"comment": "Comment 1",
"upddate": "2016-10-01T04:09:33.916Z"
},
{
"byuser": 5860,
"comment": "Comment 2",
"upddate": "2016-10-01T04:09:33.916Z"
}
]
}
]
}
}
{
"_index": "group",
"_type": "groupDetail",
"_id": 6849,
"_score": 1,
"_source": {
"groupid": 6849,
"name": "Xyz Group Public",
"insdate": "2016-10-01T04:09:33.916Z",
"upddate": "2017-04-19T05:19:40.281Z",
"isVerified": false,
"tags": [
"spotrs",
"food"
],
"content": [
{
"contentid": 3,
"type": "1",
"byUser": 5858,
"insdate": "2016-10-01 11:20",
"info": [
{
"t": 1,
"v": "lorem ipsum long text 3"
},
{
"t": 2,
"v": "http://www.imageurl.com/1"
}
],
"comments": [
{
"byuser": 5859,
"comment": "Comment 1",
"upddate": "2016-10-01T04:09:33.916Z"
},
{
"byuser": 5860,
"comment": "Comment 2",
"upddate": "2016-10-01T04:09:33.916Z"
}
]
},
{
"contentid": 4,
"type": "2",
"byUser": 5859,
"insdate": "2016-10-01 11:20",
"info": [
{
"t": 4,
"v": "http://www.videoURL.com/1"
}
],
"comments": [
{
"byuser": 5859,
"comment": "Comment 1",
"upddate": "2016-10-01T04:09:33.916Z"
},
{
"byuser": 5860,
"comment": "Comment 2",
"upddate": "2016-10-01T04:09:33.916Z"
}
]
}
]
}
}
now if i try to think with nested object than i confused if user add comment very frequently than reindexing factor will effect?
So main think i want to ask is which is the best approach with which i can add comment frequently and my content searching result is also faster.
Performance
Parent/child stores relevant data in same shards, as separately doc, which avoid the network;
Parent/child needs a joining process when retrieving data;
Nested object store the inner and outer object together, as a single doc;
So, we can infer:
Update nested object will re-index whole index, which can very expensive if your document is large;
Update parent or child alone will not affect the other one;
Searching nested object is a little fast, which save the process of joining;
Suggestions
As far as I understand your problem, you should use parent/child.
When your group's comments become more and more, adding a new comment will still re-index whole content, which can be very time-consuming;
On the other hand, search a comment with parent/child just need one more look up after finding the child, which is relative acceptable.
Furthermore, you should also take the rate of searching a comment comparing to adding a comment into account:
If you need searching a lot but a little new comments, maybe you can choose nested object;
Otherwise, choose parent/child;
By the way, you may combine both of them:
When this feed is active, use parent/child to store them;
When it is closed, i.e., no more comments can be added, move them to a new index with nested object;
If you do not specify more detailed info other than very frequently it is going to be hard to come up with a recommendation. Also you have not mentioned how your data looks like. A comment in a blog post might be happening rare, even in heated discussions. A comment/reply in a forum post (that will result in a huge document) might be sth very different. I'd personally start with nested and see how it goes, but I also do not know all the requirements, so this might be a very wrong answer.

How to insert 100 million entries in FireBase database without painting myself into a corner

I am learning and trying to understand how to optimizing my FireBase database for my use case.
Looking at the code below: Lets say that the US/STREET_ADDRESS get pushed 150 million entries(the number of street addresses in the US), and I want to sort on the "path": key, since its a unique identifier for all addresses. What would the performance be here when it comes to responce time? If this setup regarding my question is not advicable how can I change the US/STREET_ADDRESS?
Reading about fan-out but dont think i need that because of no multichat or other multi anything. In my case I simply want to create a databas that holdes the intere worlds all street addresses, for whatever reason :).
Should I denormalize the US/STREET_ADDRESS even more, like splitting the 10 million entries into US/A/STREET_ADDRESS, US/B/STREET_ADDRESS and US/C/STREET_ADDRESS. This can be done since the "path": "US/California/Orange County/3138 E Maple Ave" in the sample code below is unique, and could even be used as a Key instead of Firebase postsRef.push() auto key, that this code sample use. But still there will be 100 millions
{
"AE": {
"name": "United Arab Emirates"
},
"GB": {
"name": "United Kingdom"
},
"US": {
"name": "United States",
"STREET_ADDRESS": {
"-KUxrqYlItjme2v1I_W5": {
"id": "hjg86-tg33-8hu4-yh5u",
"path": "US/California/Orange County/3138 E Maple Ave"
},
"-KUJHjj7HG5gGHNNJ_T5": {
"id": "ds86-tg12-7eu4-juw3",
"path": "US/Florida/Tampa/104 Biscayne Ave"
}
}
},
"CH": {
"name": "Switzerland"
},
"SY": {
"name": "Syrian Arab Republic"
},
"TW": {
"name": "Taiwan"
},
"TJ": {
"name": "Tajikistan"
},
"TZ": {
"name": "Tanzania, United Republic of"
},
"TH": {
"name": "Thailand"
}
}

Use a many to many relation in Elasticsearch

Currently we have a problem to perform a query (or more precisely to design a mapping) in elasticsearch, which help us to perform a query over a relational problem, that we didn't get solved with our non-document orientated thinking from sql.
We want to create a many-to-many relation between different Elasticsearch entries. We need this to edit an entry once and keep all using’s updated to this.
To describe the problem, we'll use the following simple data model:
Broadcast Content
------------ ---------
Id Id
Day Title
Contents [] Description
So we have two different types to index, broadcasts and contents.
A broadcast can have many contents and single contents could also be part of different broadcasts (e.g. repetition).
JSON like:
index/broadcasts
{
"Id": "B1",
"Day": "2014-10-15",
"Contents": [
"C1",
"C2"
]
}
{
"Id": "B2",
"Day": "2014-10-16",
"Contents": [
"C1",
"C3"
]
}
index/contents
{
"Id": "C1",
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
}
{
"Id": "C2",
"Title": "Have a break!",
"Description": "Everything about Android"
}
{
"Id": "C3",
"Title": "Late Night Disaster",
"Description": "Comedy show"
}
Now we want to rename the "Late Night Disaster" into something more precisely and keep all references up to date.
How could we approach this? Are there fourther options in ES, like includes in RavenDB?
Nested objects or child-parent relations didn't helped us so far.
What about denormalizing? seems difficult if we come from the SQL mindset, but give you a try, even with millions of documents, LUCENE indexing can help, and renaming will be a batch job.
[
{
"Id": "B1",
"Day": "2014-10-15",
"Contents": [
{
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
},
{
"Title": "Have a break!",
"Description": "Everything about Android"
}
]
},
{
"Id": "B2",
"Day": "2014-10-16",
"Contents": [
{
"Title": "Wake up",
"Description": "Morning show with Jeff Bridges"
},
{
"Title": "Late Night Disaster",
"Description": "Comedy show"
}
]
}
]

ElasticSearch - Querying only for particular array elements that are not empty

I'm relatively new to ES and am having difficulty finding really good references or tutorials on the query dsl.
We have a document type of the example below. The query I wish to conduct is thus: "Return all the email_package records that have at least one entities record (one record in the 'entities' array)." And yes I want the complete 'email' record.
Could anyone assist? Also if you could point to a reference or tutorial or cookbook somewhere that addresses question like this, that would be also greatly appreciated.
"email_package": {
"email": {
"date": "2007-02-13T18:24:22-04:00",
"subject": "this is the subject",
"body": "this is the body"
},
"entities": [
{
"Louisville": {
"City": "South"
}
},
{
"Memphis": {
"City": "South"
}
}
]
}
// more 'email_package records follow...
Your document is a bit problematic, since you seems to be nesting objects and giving them different names. If you are not bound to the current structure, I would have changed the mapping into something that is more manageable, and queries will be straight forward, e.g:
"email_package": {
"email": {
"body": "this is the body1",
"date": "2007-02-13T18:24:22-04:00",
"subject": "this is the subject"
},
"entities": [
{
"name": "Louisville"
"City": "South",
},
{
"name": "Memphis"
"City": "South",
}
]
}
Query:
{ "filter": {
"exists": {
"field": "email_package.entities.name"
}
}

Resources