How to store info regarding which notifications have been read by user? - elasticsearch

I have a set of notification or information items stored in Elasticsearch. Once a user has seen a notification, I need to mark it as seen by that user. A user can filter documents by read/unread status. Notifications will be viewed by a lot of users, and the seen status will be updated constantly. What is the best way to store this data? Should I store the list of users who have seen a notification in the same document, or should I create a parent-child relationship?

You should definitely avoid parent-child and nested types because they are computationally costly. The best way to model this relationship with a lot of data is to denormalize it and spread it across different indices. Please read here and here. Example:
PUT notification
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      },
      "id_notification": {
        "type": "keyword"
      }
    }
  }
}
Then the user index:
PUT user
{
  "mappings": {
    "properties": {
      "general_information": {
        "type": "text"
      },
      "id_user": {
        "type": "keyword"
      }
    }
  }
}
And another index for the relationship:
PUT seen
{
  "mappings": {
    "properties": {
      "seen": {
        "properties": {
          "notification_id": {
            "type": "keyword",
            "fields": {
              "user_id": {
                "type": "keyword"
              }
            }
          }
        }
      },
      "unseen": {
        "properties": {
          "notification_id": {
            "type": "keyword",
            "fields": {
              "user_id": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
Sorry for the text format, I don't have Kibana at hand right now. Note that to go from the information indices (user, notification) to the support index (seen) you have to run a multi-index query (docs here). It works because the names and values of the fields (user_id, notification_id) are the same across the different indices. The user_id subfields in the seen index are arrays of keywords. Alternatively, you could make user_id a single keyword and the parent of a notification_id keyword array field. Either way they keep the one-to-many relationship; the best choice depends on your data.
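For illustration, a minimal sketch of such a multi-index search, following the index names above; the shared notification_id field and the value notif-42 are assumptions made up for the example:
# sketch only: assumes both indices expose a notification_id keyword field with matching values
GET notification,seen/_search
{
  "query": {
    "term": {
      "notification_id": "notif-42"
    }
  }
}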

Related

How to convert JSON to a collection in Power Apps

I have a Power App that uses a flow from Power Automate.
My flow does an HTTP GET and responds with JSON to Power Apps, like below.
Here is the JSON as text:
{"value": "[{\"dataAreaId\":\"mv\",\"AccountNum\":\"100000\",\"Name\":\"*****L FOOD AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100001\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100014\",\"Name\":\"****(SEB)\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100021\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100029\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"500100\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"500210\",\"Name\":\"****\"}]"}
But when I try to convert this JSON to a collection, it doesn't behave like a list.
It just seems like text. Here is how I try to bind the list.
How can I create a collection from JSON to bind to the gallery view?
I found the solution. I finally created a collection from the response of the flow.
The flow's name is GetVendor.
The response of the flow is like this:
{"value": "[{\"dataAreaId\":\"mv\",\"AccountNum\":\"100000\",\"Name\":\"*****L FOOD AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100001\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100014\",\"Name\":\"****(SEB)\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100021\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"100029\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"500100\",\"Name\":\"**** AB\"},{\"dataAreaId\":\"mv\",\"AccountNum\":\"500210\",\"Name\":\"****\"}]"}
The code below creates a list from this response:
ClearCollect(_vendorData, MatchAll(GetVendors.Run(_token.value).value, "\{""dataAreaId"":""(?<dataAreaId>[^""]*)"",""AccountNum"":""(?<AccountNum>[^""]*)"",""Name"":""(?<Name>[^""]*)""\}"));
And I could bind the AccountNum and Name from the _vendorData collection to the gallery view.
In my case I had the same issue, but I couldn't get data into the _vendorData collection: the MatchAll regex part was not working correctly even though I had exactly the same scenario.
My solution was to modify the flow itself: I used a Response action instead of Respond to a PowerApp or flow, so I could basically return the full response from the HTTP action.
This also caused me some issues, because when I generated the schema from a sample I could not register the flow with the Power App, getting the error Failed during http send request.
The solution was to manually review the response schema and change all column types to one of the following three, because the others are not supported: string, integer, or boolean. Object and array can be set only on top-level items, never on children, so if you have anything other than those three, replace it with string. And no property can be left with an undefined type.
Basically I like this solution even more, because in Power Apps itself you do not need to do any conversion: you simply use the data as is, since an array is already recognized as a collection and all the properties are already named for you.
A Response step schema example is below.
{
  "type": "object",
  "properties": {
    "PropertyOne": {
      "type": "string"
    },
    "PropertyTwo": {
      "type": "integer"
    },
    "PropertyThree": {
      "type": "boolean"
    },
    "PropertyFour": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "PropertyArray1": {
            "type": "string"
          },
          "PropertyArray2": {
            "type": "integer"
          },
          "PropertyArray3": {
            "type": "boolean"
          }
        }
      }
    }
  }
}
It is easy now.
Power Apps has introduced the ParseJSON function, which makes it easy to convert a string to a collection.
Table(ParseJSON(JSONString));
In the gallery, map columns like ThisItem.Value.ColumnName.

Elastic Beats - Changing the Field Type of Default Fields in Beats Documents?

I'm still fairly new to the Elastic Stack and I'm still not seeing the entire picture from what I'm reading on this topic.
Let's say I'm using the latest version of Filebeat or Metricbeat, for example, and pushing that data to a Logstash output (which is then configured to push to ES). I want an "out of the box" field from one of these Beats to have its field type changed (for example: change beat.hostname from its current default "text" type to "keyword"). What is the best place/practice for configuring this? This kind of change is something I would want to be consistent across multiple hosts running the same Beat.
I wouldn't change any existing fields, since Kibana builds a lot of visualizations, dashboards, SIEM features,... on the expected fields and data types.
Instead, extend (add, don't change) the default mapping if needed. On top of the default index template, you can add your own and they will be merged. Adding more fields will require some more disk space (and probably memory when loading), but it should be manageable and avoids a lot of the drawbacks of other approaches.
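For example, a hedged sketch of such an additional template; the template name, the filebeat-* index pattern, and the added field are assumptions, and the exact template API and mapping layout depend on your stack version:
# sketch only: merged on top of the default Beats template, with order controlling precedence
PUT _template/my-custom-beats-fields
{
  "index_patterns": ["filebeat-*"],
  "order": 1,
  "mappings": {
    "properties": {
      "my_custom_field": {
        "type": "keyword"
      }
    }
  }
}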
Agreed with #xeraa. It is not advised to change the default template, since those fields might be used in default visualizations.
Create a new template; you can have multiple templates for the same index pattern. All the mappings will be merged. The order of the merging can be controlled using the order parameter, with lower orders being applied first and higher orders overriding them.
For your case, probably create a multi-field for any field that needs to be changed. E.g., as shown here, create a new keyword multi-field; then you can refer to the new field as fieldname.raw.
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
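With that multi-field in place, exact-match queries and aggregations can target the keyword subfield. A small hedged example, assuming the city field above lives in an index named my-index:
# sketch only: the index name "my-index" is an assumption
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "cities": {
      "terms": {
        "field": "city.raw"
      }
    }
  }
}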
The other answers are correct, but I did the below in the Dev console to update the message field from text to text & keyword:
PUT /index_name/_mapping
{
  "properties": {
    "message": {
      "type": "match_only_text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 10000
        }
      }
    }
  }
}

ElasticSearch terms query match against all values in multi-value field

In ElasticSearch (5.4) I have documents of the following structure:
{
  "email": {
    "type": "text",
    "fields": {
      "keyword": {
        "type": "keyword",
        "ignore_above": 256
      }
    }
  }
}
where email is a multi-valued field (any document can have multiple emails associated with it) representing an email address. I also have a list of "allowed email addresses."
I would like to write a query to find documents that include any email addresses outside of the white list.
For instance, if we had:
whitelist = ['email1#test.com', 'email2#test.com']
document1: {email: ['email1#test.com', 'email4#test.com']}
document2: {email: ['email1#test.com']}
document3: {email: ['email5#test.com', 'email6#test.com']}
We would want the query to find documents 1 and 3.
My first instinct was to use a query of the form:
{
  "bool": {
    "must_not": {
      "terms": { "email.keyword": [whitelist] }
    }
  }
}
However, this only returns document3, a document where NONE of the emails match the whitelist.
Is there an efficient way to achieve this? Unfortunately, there is too much data to use a script query.

Scripted Field Kibana Not Working

I am trying to get scripted fields in Kibana to work.
I have two fields in my documents, customer and site
I'd like to create a new scripted field called friendly_name which is customer+" "+site
I've tried
return doc["customer"].value + " "+doc["site"].value
and it doesn't yield any results.
I've even tried just return 1 to see if I can get anything to return.
How can I get this to work?
Scripted fields work with doc_values only, and I am guessing that, since this doesn't work for you, your customer and site fields are text fields.
From https://www.elastic.co/blog/using-painless-kibana-scripted-fields:
Both Painless and Lucene expressions operate on fields stored in doc_values. So for string data, you will need to have the string to be stored in data type keyword.
So, you either define your two fields as keyword, or you add a subfield to them and in your script you use customer.keyword and site.keyword. The changed mapping would then be:
"customer": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
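With those keyword subfields in place, the scripted field from the question would then reference them, for example:
// assumes the "keyword" subfields mapped above exist for both customer and site
return doc['customer.keyword'].value + " " + doc['site.keyword'].value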

ElasticSearch performance when querying by element type

Assume that we have a dataset containing a collection of domains { domain.com, domain2.com } and also a collection of users { user#domain.com, angryuser#domain2.com, elastic#domain3.com }.
That being so, let's assume that domains and users have several attributes in common, such as "domain", and that when the attribute name matches, the mapping and the possible values match as well.
Then we load up our Elasticsearch index with both collections, separating them by type: domain and user.
Obviously in our system we would have many more users than domains, so when querying for domain-related data the expectation is that it would be much faster to filter the query by type, right?
My question is: with around 5 million users and 200k domains, why is it that when my index only contains domain data (the users were deleted), queries run much faster than when I filter the objects by their type? Shouldn't the performance be at least similar? Currently we can match 20 domains per second when there are no users in the index, but that drops to 4 when we load up the users, even though we still filter by type.
Maybe it is something that I'm missing, as I'm new to Elasticsearch.
UPDATE:
This is the query, basically:
"query" : {
"flt_field": {
"domain_address": {
"like_text": "chroma",
"fuzziness": 0.3
}
}
}
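The type filtering mentioned above is not shown in this fragment; presumably it is combined with the query along these lines (a hedged sketch in the pre-2.0 DSL that flt_field implies; the index name and the filtered/type wrapper are assumptions, not the original query):
# sketch only: "catalog" and the filtered/type wrapper are assumptions
GET catalog/_search
{
  "query": {
    "filtered": {
      "query": {
        "flt_field": {
          "domain_address": {
            "like_text": "chroma",
            "fuzziness": 0.3
          }
        }
      },
      "filter": {
        "type": {
          "value": "domain"
        }
      }
    }
  }
}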
And the mapping is something like this
"user": {
"properties": {
...,
"domain_address": {
"type": "string",
"boost": 2.4,
"similarity": "linear"
}
}
},
"domain": {
"properties": {
...,
"domain_address": {
"type": "string",
"boost": 2.4,
"similarity": "linear"
}
}
}
Other fields go where the ... is, but their mapping should not influence the outcome, should it?
