Timelion statement: How to filter data from an array in a Timelion visualization query - elasticsearch

An index in Kibana has a field that holds an array of objects. For example, below is a sample of the field blocked_by:
"blocked_by": [
{
"error_category_name": "Record is not a new one",
"error_category_owner": "AB",
"created_when": "2022-05-18T09:52:44.000Z",
"name": "ERROR IN RCS: Delete Subscriber",
"resolved_when": "2022-05-18T10:52:55.963+01:00",
"id": "8163578639440138764"
},
{
"error_category_name": "NM-1009 Phone Number is not in appropriate state",
"error_category_owner": "AB",
"created_when": "2022-05-18T09:52:45.000Z",
"name": "ERROR IN NC NM: Change MSISDN status",
"resolved_when": "2022-05-18T10:53:16.230+01:00",
"id": "8163578637640138764"
},
I want to extract only the latest record out of this array in my Timelion expression.
Can someone help me out? Is this possible to do in Timelion?
My expression:
.es(index=sales_order,timefield=created_when,q='blocked_by.error_category_owner.keyword:(AB OR Undefined OR null OR "") AND _exists_:blocked_by').divide(.es(index=sales_order,timefield=created_when)).yaxis(2,position=right,units=percent).label(Fallout)

Related

openrefine - Sorting column contents within a record?

Scoured the internet as best as I could but couldn't find an answer -- I was wondering, is there some way to sort the contents of a column within a record? E.g. take the following table:
Key | Row to sort | Other row
a   | bca         | A
    | cab         | cab
    | abc         | f
b   | zyx         |
    | yxz         | u
c   | def         | h
    | fed         | h
and turn it into:
Key | Row to sort | Other row
a   | abc         | A
    | bca         | cab
    | cab         | f
b   | yxz         |
    | zyx         | u
c   | def         | h
    | fed         | h
The ultimate goal is to sort all of the columns for each record alphabetically, and then blank up so that each record is a single row.
I've tried sorting on the column within the record itself, but that orders the records by whichever record contains the entry that comes first alphabetically (regardless of whether it's the first entry in the record or not, interestingly).
Here is a solution using sort
Prerequisite: assuming that the values in the "Key" column are unique.
Switch to rows mode
Fill down the "Key" column via Key => Edit cells => Fill down.
Sort the "Key" column via Key => Sort...
Sort the "Row to sort" column via Row to sort => Sort... as an additional sort.
Make the sorting permanent by selecting Reorder rows permanently in the Sort menu.
Blank down the "Key" and "Row to sort" columns.
Here is a solution using GREL
As deduplicating and sorting records is quite a common task, I have a GREL expression that reduces it to two steps:
Transform the "Row to sort" column with the following GREL expression:
if(
  row.index - row.record.fromRowIndex == 0,
  row.record.cells[columnName].value.uniques().sort().join(","),
  null
)
Split the multi-valued cells in the "Row to sort" column on the separator ,.
The GREL expression takes all the record's cells in the current column, extracts their values into an array, makes the values in the array unique, sorts the remaining values, and joins them into a string using , as the separator.
The joining into a string is necessary as OpenRefine currently has no support for displaying arrays in the GUI.
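For example, applied to record "a" from the question, the transform turns the "Row to sort" cells bca / cab / abc into "abc,bca,cab" in the record's first row and null in the other rows; the subsequent split on , then restores one value per row: abc / bca / cab.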
I would do it as follows:
For all columns except the key column, use the Edit cells > Join multi-valued cells operation, with a separator that is not present in the cell values
Transform all columns except the key column with: value.split(',').sort().join(',')
Split back your columns with Edit cells > Split multi-valued cells
Then you can blank down / fill down as you wish.
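For instance, record "a"'s joined "Row to sort" cell would be transformed as:

"bca,cab,abc".split(',').sort().join(',')  =>  "abc,bca,cab"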
Here is the JSON representation of the workflow for your example:
[
  {
    "op": "core/multivalued-cell-join",
    "columnName": "Row to sort",
    "keyColumnName": "Key",
    "separator": ",",
    "description": "Join multi-valued cells in column Row to sort"
  },
  {
    "op": "core/multivalued-cell-join",
    "columnName": "Other row",
    "keyColumnName": "Key",
    "separator": ",",
    "description": "Join multi-valued cells in column Other row"
  },
  {
    "op": "core/text-transform",
    "engineConfig": {
      "facets": [],
      "mode": "record-based"
    },
    "columnName": "Row to sort",
    "expression": "grel:value.split(',').sort().join(',')",
    "onError": "keep-original",
    "repeat": false,
    "repeatCount": 10,
    "description": "Text transform on cells in column Row to sort using expression grel:value.split(',').sort().join(',')"
  },
  {
    "op": "core/text-transform",
    "engineConfig": {
      "facets": [],
      "mode": "record-based"
    },
    "columnName": "Other row",
    "expression": "grel:value.split(',').sort().join(',')",
    "onError": "keep-original",
    "repeat": false,
    "repeatCount": 10,
    "description": "Text transform on cells in column Other row using expression grel:value.split(',').sort().join(',')"
  },
  {
    "op": "core/multivalued-cell-split",
    "columnName": "Row to sort",
    "keyColumnName": "Key",
    "mode": "separator",
    "separator": ",",
    "regex": false,
    "description": "Split multi-valued cells in column Row to sort"
  },
  {
    "op": "core/multivalued-cell-split",
    "columnName": "Other row",
    "keyColumnName": "Key",
    "mode": "separator",
    "separator": ",",
    "regex": false,
    "description": "Split multi-valued cells in column Other row"
  }
]

How do I split a comma-separated field and concatenate field1 and field2 (first 3 words) in a NiFi processor?

I want to split a comma-separated field into field1 and field2, and then concatenate field1 and field2 (the first 3 words).
Example input:
2022-09-05T00:00:10,677 abc.1 ,
After split and concatenate:
2022-09-05T00:00:10:677,abc.1,
You can use UpdateRecord and add a user-defined property such as /field3 set to concat( /field1, /field2 ). You can change /field3 to whatever you want the output field name to be, and if you want to remove the other fields you can specify a schema in your Record Writer that only has the field(s) you want, such as:
{
  "type": "record",
  "name": "nifiRecord",
  "namespace": "org.apache.nifi",
  "fields": [
    {
      "name": "field3",
      "type": ["string", "null"]
    }
  ]
}
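A minimal illustration of the idea (the record field values and the ':' literal are assumptions for the sketch; RecordPath's concat accepts literal string arguments). Given an input record such as:

{ "field1": "2022-09-05T00:00:10", "field2": "677" }

setting the user-defined property /field3 to concat( /field1, ':', /field2 ), with the reduced schema above in the Record Writer, would produce:

{ "field3": "2022-09-05T00:00:10:677" }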

elasticsearch - query between document types

I have a production_order document_type
i.e.
{
  part_number: "abc123",
  start_date: "2018-01-20"
},
{
  part_number: "1234",
  start_date: "2018-04-16"
}
I want to create a commodity document type
i.e.
{
  part_number: "abc123",
  commodity: "1 meter machining"
},
{
  part_number: "1234",
  commodity: "small flat & form"
}
Production orders are data-warehoused every week and are immutable.
Commodities, on the other hand, could change over time; e.g. abc123 could change from 1 meter machining to 5 meter machining, so I don't want to store this data with the production_order records.
If a user searches for "small flat & form" in the commodity document type, I want to pull all matching records from the production_order document type, the match being between part number.
Obviously I can do this in a relational database with a join. Is it possible to do the same in elasticsearch?
If it helps, we have about 500k part numbers that will be commoditized and our production order data warehouse currently holds 20 million records.
I have found that you can indeed now query across indices in Elasticsearch; however, you have to ensure your data is stored correctly. Here is an example from the 6.3 Elasticsearch docs:
Terms lookup twitter example: At first we index the information for user with id 2, specifically its followers, then index a tweet from user with id 1. Finally we search on all the tweets that match the followers of user 2.
PUT /users/user/2
{
  "followers" : ["1", "3"]
}

PUT /tweets/tweet/1
{
  "user" : "1"
}

GET /tweets/_search
{
  "query" : {
    "terms" : {
      "user" : {
        "index" : "users",
        "type" : "user",
        "id" : "2",
        "path" : "followers"
      }
    }
  }
}
Here is the link to the original page
https://www.elastic.co/guide/en/elasticsearch/reference/6.1/query-dsl-terms-query.html
In my case above, I need to set up my storage so that commodity is a field and its values are an array of part numbers, i.e.
{
  "1 meter machining": ["abc123", "1234"]
}
I can then look up the 1 meter machining part numbers against my production_order documents
I have tested and it works.
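A sketch of what that lookup could look like, modeled on the docs example above (the index names, type, and document id "1" are assumptions; the path matches the commodity field in the example document):

GET /production_order/_search
{
  "query" : {
    "terms" : {
      "part_number" : {
        "index" : "commodities",
        "type" : "commodity",
        "id" : "1",
        "path" : "1 meter machining"
      }
    }
  }
}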
Joins are not supported in Elasticsearch.
You can query twice: first get all the part numbers matching "small flat & form", then use those part numbers to query the other index.
Else, try to find a way to merge these into a single index; that would be better. Updating the commodities would not cause you any problems once the two are combined.
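A sketch of the two-query approach, with index and field names assumed from the question. First, collect the part numbers for the commodity:

GET /commodities/_search
{
  "query" : {
    "match" : { "commodity" : "small flat & form" }
  }
}

Then feed the part_number values from the hits into a terms query on the production orders:

GET /production_order/_search
{
  "query" : {
    "terms" : { "part_number" : ["abc123", "1234"] }
  }
}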

Elasticsearch sort by inner field

I have documents in which one of the fields looks like the following:
"ingredients": [{
"unit": "MG",
"value": 123,
"key": "abc"
}]
I would like to sort the records in ascending order of the value of a specific ingredient. That is, if I have 2 records that use the ingredient with key "abc", one with value 1 and one with value 2, the one with value 1 should appear first.
Each record may have more than one ingredient.
Thank you in advance!
The search query to sort will be:
{
  "sort": {
    "ingredients.value": {
      "order": "asc"
    }
  }
}
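Note that this sorts each document by its ingredients.value values without distinguishing which ingredient they belong to. If ingredients is mapped as a nested type and you only want to sort by the ingredient with key "abc", Elasticsearch's nested sort supports a filter for that; a sketch, assuming a nested mapping:

{
  "sort": [
    {
      "ingredients.value": {
        "order": "asc",
        "nested": {
          "path": "ingredients",
          "filter": {
            "term": { "ingredients.key": "abc" }
          }
        }
      }
    }
  ]
}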

How to create a line chart based upon column value instead of column name in Kibana

In Elasticsearch I have some sample data that I would like to visualize as line charts in Kibana 4. The documents look like this:
"_id": "AVhNy_dxcW7axK5BvIEO",
"timeStamp": "2016-11-11T05:39:10.5844951Z",
"analyticSource": [
{
"analyticId": "A",
"analyticUnit": "sec",
"analyticValue": 0.22743704915046692
},
{
"analyticId": "B",
"analyticUnit": "sec",
"analyticValue": 0.14946113526821136
}]
and another sample:
"_id": "AVhNxnjscW7axK5Bu-Tl",
"timeStamp": "2016-11-11T05:40:10.5954951Z",
"analyticSource": [
{
"analyticId": "A",
"analyticUnit": "sec",
"analyticValue": 0.20143736898899078
},
{
"analyticId": "B",
"analyticUnit": "sec",
"analyticValue": 0.09747125953435898
}]
For now Kibana just plots according to the column id, and in this case a single line chart is plotted for analyticValue. What I really want is to plot 2 line charts in Kibana, for A and B, against the timestamp. Is there some kind of script (query) or something where I can tell Kibana to segregate analyticValue according to analyticId?
Objects nested in arrays are not supported in Kibana 4, so I had to create a flat mapping with analyticId, analyticValue, and analyticUnit as columns. Then I aggregated over analyticId and created the line chart with the Y axis set to the max of analyticValue, and on the X axis selected a Date Histogram on the timestamp. I hope this helps users who land here.
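A sketch of the flattened shape this implies, built from the first sample above (one document per analyticSource entry, field names kept from the original):

{ "timeStamp": "2016-11-11T05:39:10.5844951Z", "analyticId": "A", "analyticUnit": "sec", "analyticValue": 0.22743704915046692 }
{ "timeStamp": "2016-11-11T05:39:10.5844951Z", "analyticId": "B", "analyticUnit": "sec", "analyticValue": 0.14946113526821136 }

With this mapping, aggregating on analyticId splits the chart into one series per id.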
