Jolt Spec for Shift Operation - apache-nifi

I have the following JSON input
{
"Status": "PENDING",
"TaskId": "0000000001",
"EventType": "change",
"Timestamp": "2019-01-09-15.41.57.473940T01",
"Comment": "{\"comment\":[{\"createDate\":\"2019-01-09T15:41:57:473000-05:00\",\"type\":\"system\",\"text\":\"Assigned to: RAJ\",\"userId\":\"RAJ\",\"userName\":\"RAJA MADHIE\"},{\"createDate\":\"2019-01-09T15:45:59:150000-05:00\",\"type\":\"manual\",\"text\":\"Comments entered for 0000000001\",\"userId\":\"RAJ\",\"userName\":\"RAJA MADHIE\"},{\"createDate\":\"2019-01-09T15:49:09:586000-05:00\",\"type\":\"manual\",\"text\":\"Comments entered for 0000000001 - processed.\",\"userId\":\"RAJ\",\"userName\":\"RAJA MADHIE\"}]}"
}
and expecting the output to be something like:
{
"inputs": [
{
"richInput": {
"subjectId": "0000000001",
"body": {
"messageSegments": [
{
"type": "system",
"text": "Assigned to: RAJ"
}
]
},
"feedElementType": "FeedItem"
}
},
{
"richInput": {
"subjectId": "0000000001",
"body": {
"messageSegments": [
{
"type": "manual",
"text": "Comments entered for 0000000001"
}
]
},
"feedElementType": "FeedItem"
}
},
{
"richInput": {
"subjectId": "0000000001",
"body": {
"messageSegments": [
{
"type": "manual",
"text": "Comments entered for 0000000001-processed."
}
]
},
"feedElementType": "FeedItem"
}
}
]
}
I have tried to transform this using a couple of Jolt specs, but no luck... Any suggestions or recommendations are much appreciated. Thanks!

You need to make Comment a proper JSON structure first; it will then look something like this:
{
"Status": "PENDING",
"TaskId": "0000000001",
"EventType": "change",
"Timestamp": "2019-01-09-15.41.57.473940T01",
"Comment": {
"comment": [
{
"createDate": "2019-01-09T15:41:57:473000-05:00",
"type": "system",
"text": "Assigned to: RAJ",
"userId": "RAJ",
"userName": "RAJA MADHIE"
},
{
"createDate": "2019-01-09T15:45:59:150000-05:00",
"type": "manual",
"text": "Comments entered for 0000000001",
"userId": "RAJ",
"userName": "RAJA MADHIE"
},
{
"createDate": "2019-01-09T15:49:09:586000-05:00",
"type": "manual",
"text": "Comments entered for 0000000001 - processed.",
"userId": "RAJ",
"userName": "RAJA MADHIE"
}
]
}
}
Spec file to transform this:
[
{
"operation": "shift",
"spec": {
"Comment": {
"comment": {
"*": {
"@(3,TaskId)": "inputs[&1].richInput.subjectId",
"type": "inputs[&1].richInput.body.messageSegments[0].type",
"text": "inputs[&1].richInput.body.messageSegments[0].text",
"#FeedItem": "inputs[&1].richInput.feedElementType"
}
}
}
}
},
{
"operation": "shift",
"spec": {
"inputs": {
"*": {
"richInput": {
"subjectId": "inputs[&2].richInput.subjectId",
"body": "inputs[&2].richInput.body",
"feedElementType": "inputs[&2].richInput.feedElementType"
}
}
}
}
}
]
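Jolt's shift cannot parse the stringified Comment field itself, so that parsing has to happen before the spec runs. As a rough illustration only (not NiFi code), here is a Python sketch of the whole transform: parse the embedded JSON, then fan each comment out into its own richInput element, copying TaskId into every one, as the spec's TaskId lookup does. The sample values are taken from the input above, trimmed to two comments:

```python
import json

# Input flowfile from the question (Comment is a stringified JSON blob)
flowfile = {
    "Status": "PENDING",
    "TaskId": "0000000001",
    "Comment": '{"comment":[{"type":"system","text":"Assigned to: RAJ"},'
               '{"type":"manual","text":"Comments entered for 0000000001"}]}',
}

# Step 1: turn the stringified Comment into a real JSON structure
comments = json.loads(flowfile["Comment"])["comment"]

# Step 2: fan each comment out into its own richInput element,
# copying TaskId into every one
inputs = [
    {
        "richInput": {
            "subjectId": flowfile["TaskId"],
            "body": {"messageSegments": [{"type": c["type"], "text": c["text"]}]},
            "feedElementType": "FeedItem",
        }
    }
    for c in comments
]

print(json.dumps({"inputs": inputs}, indent=2))
```

In NiFi the same parsing step is typically done before the JoltTransformJSON processor; the sketch only shows the shape of the mapping.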

Related

Order documents by multiple geolocations

I am new to Elasticsearch and I am trying to create an index for companies that have multiple branches in the city.
Each branch has its own geolocation point.
My companies document looks like this:
{
"company_name": "Company X",
"branch": [
{
"address": {
// ... other fields
"location": "0.0000,1.1111"
}
}
]
}
The index has the following mapping:
{
"companies": {
"mappings": {
"dynamic_templates": [
{
"ids": {
"match": "id",
"match_mapping_type": "long",
"mapping": {
"type": "long"
}
}
},
{
"company_locations": {
"match": "location",
"match_mapping_type": "string",
"mapping": {
"type": "geo_point"
}
}
}
],
"properties": {
"branch": {
"properties": {
"address": {
"properties": {
// ...
"location": {
"type": "geo_point"
},
// ...
}
},
}
}
}
}
}
}
Now, in Elasticsearch, I've indexed the following documents:
{
"company_name": "Company #1",
"branch": [
{
"address": {
"location": "39.615,19.8948"
}
}
]
}
and
{
"company_name": "Company #2",
"branch": [
{
"address": {
"location": "39.586,19.9028"
}
},
{
"address": {
"location": "39.612,19.9134"
}
},
{
"address": {
"location": "39.607,19.8946"
}
}
]
}
Now, here is my problem: if I run the following search query, the company displayed first is, unfortunately, Company #2, although the geo-distance query uses the location data of Company #1:
GET companies/_search
{
"fields": [
"company_name",
"branch.address.location"
],
"_source": false,
"sort": [
{
"_geo_distance": {
"branch.address.location": {
"lon": 39.615,
"lat": 19.8948
},
"order": "asc",
"unit": "km"
}
}
]
}
Am I doing something wrong? Is there a way to sort the search results using this method?
Please keep in mind that if, for example, I search with a geolocation that is closer to one of Company #2's locations, then I need Company #2 to come first.
Finally, if my setup isn't right for what I require, and there is another way to achieve the same result with a different document structure, please let me know. I am still at the beginning of the project, and it is easy to adapt to whatever is more appropriate.
The documentation here says: "Geopoint expressed as a string with the format: "lat,lon"."
Your location is "location": "39.615,19.8948", so lat and lon appear to be swapped in your query; it should probably be:
"branch.address.location": {
"lat": 39.615,
"lon": 19.8948
}
My Tests:
PUT idx_test
{
"mappings": {
"properties": {
"branch": {
"properties": {
"address": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
}
}
POST idx_test/_doc/1
{
"company_name": "Company #1",
"branch": [
{
"address": {
"location": "39.615,19.8948"
}
}
]
}
POST idx_test/_doc/2
{
"company_name": "Company #2",
"branch": [
{
"address": {
"location": "39.586,19.9028"
}
},
{
"address": {
"location": "39.612,19.9134"
}
},
{
"address": {
"location": "39.607,19.8946"
}
}
]
}
Search by location "39.607,19.8946" (a Company #2 branch):
GET idx_test/_search
{
"fields": [
"company_name",
"branch.address.location"
],
"_source": false,
"sort": [
{
"_geo_distance": {
"branch.address.location": {
"lat": 39.607,
"lon": 19.8946
},
"order": "asc",
"unit": "km"
}
}
]
}
Response:
"hits": [
{
"_index": "idx_test",
"_id": "2",
"_score": null,
"fields": {
"branch.address.location": [
{
"coordinates": [
19.9028,
39.586
],
"type": "Point"
},
{
"coordinates": [
19.9134,
39.612
],
"type": "Point"
},
{
"coordinates": [
19.8946,
39.607
],
"type": "Point"
}
],
"company_name": [
"Company #2"
]
},
"sort": [
0
]
},
{
"_index": "idx_test",
"_id": "1",
"_score": null,
"fields": {
"branch.address.location": [
{
"coordinates": [
19.8948,
39.615
],
"type": "Point"
}
],
"company_name": [
"Company #1"
]
},
"sort": [
0.8897252783915647
]
}
]
Search by location "39.615,19.8948" (Company #1's branch):
GET idx_test/_search
{
"fields": [
"company_name",
"branch.address.location"
],
"_source": false,
"sort": [
{
"_geo_distance": {
"branch.address.location": {
"lat": 39.615,
"lon": 19.8948
},
"order": "asc",
"unit": "km"
}
}
]
}
Response:
"hits": [
{
"_index": "idx_test",
"_id": "1",
"_score": null,
"fields": {
"branch.address.location": [
{
"coordinates": [
19.8948,
39.615
],
"type": "Point"
}
],
"company_name": [
"Company #1"
]
},
"sort": [
0
]
},
{
"_index": "idx_test",
"_id": "2",
"_score": null,
"fields": {
"branch.address.location": [
{
"coordinates": [
19.9028,
39.586
],
"type": "Point"
},
{
"coordinates": [
19.9134,
39.612
],
"type": "Point"
},
{
"coordinates": [
19.8946,
39.607
],
"type": "Point"
}
],
"company_name": [
"Company #2"
]
},
"sort": [
0.8897285575578558
]
}
]
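The sort values in the responses above can be sanity-checked offline. With an array of geo_points, _geo_distance sorting defaults to the minimum distance across the array, so each company is ranked by its closest branch. A quick haversine computation (a simplified sketch using a 6371 km mean earth radius, which may differ slightly from Elasticsearch's internals) reproduces the ~0.89 km figure from the second response:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (mean earth radius 6371 km)."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

query = (39.615, 19.8948)  # Company #1's branch location
company2 = [(39.586, 19.9028), (39.612, 19.9134), (39.607, 19.8946)]

# _geo_distance with the default sort mode "min" ranks by the closest branch
d2 = min(haversine_km(*query, lat, lon) for lat, lon in company2)
print(round(d2, 4))  # ~0.89 km, matching the "sort" value for Company #2
```

This also shows why the requirement in the question works out of the box: whichever company has the nearest branch to the query point sorts first.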

Elastic Watcher not returning results

I am trying to simulate a watch and see if the actions trigger fine, but my problem is that the search returns no results.
My query:
Checks a particular index.
Checks a range.
Checks that the servicename field has a particular value.
This is my watch definition
{
"trigger": {
"schedule": {
"interval": "10m"
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"datasolutions-svc-*"
],
"body": {
"query": {
"bool": {
"filter": [
{
"term": {
"level": {
"value": "ERROR"
}
}
},
{
"term": {
"servicename": [
"Iit.Det.Urm.MepsSubscriber"
]
}
},
{
"range": {
"@timestamp": {
"gte": "now-60m"
}
}
}
]
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"actions": {
"notify-slack": {
"slack": {
"account": "elastic_watcher_alerts",
"proxy": {
"host": "proxy.dom",
"port": 80
},
"message": {
"from": "Error Monitor",
"to": [
"#det-errors"
],
"text": "The following error(s) have been logged",
"dynamic_attachments": {
"list_path": "ctx.payload.items",
"attachment_template": {
"color": "#f00",
"title": "{{msg}}",
"title_link": "https://elastic.mid.dom:port/{{index}}/doc/{{id}}?pretty",
"text": "{{msg}}",
"fields": [
{
"title": "Server",
"value": "{{host}}",
"short": true
},
{
"title": "Servicename",
"value": "{{service}}",
"short": true
}
]
}
}
}
}
}
},
"transform": {
"script": {
"source": "['items': ctx.payload.hits.hits.collect(hit -> ['msg': hit._source.message, 'service': hit._source.servicename, 'index': hit._index, 'id' : hit._id, 'host': hit._source.agent.hostname ])]",
"lang": "painless"
}
}
}
I am now trying to test it using the simulate option and giving it an input. This input is copied from actual data in the index. I copied a JSON document from Kibana (in the Discover section), so the alternative input JSON should be OK.
Here's the alternative input
{
"_index": "datasolutions-svc-live-7.7.0-2021.01",
"_type": "doc",
"_id": "Hre9SHcB1QIqYEnyxSCw",
"_version": 1,
"_score": null,
"_source": {
"exception": "System.Data.SqlClient.SqlException (0x80131904): blabla",
"agent": {
"hostname": "SATSVC3-DK1",
"name": "datasolutions-svc-live",
"id": "8c826ae1-e411-4257-a31f-08824dd58b5a",
"type": "filebeat",
"ephemeral_id": "e355bf8a-be67-4ed1-85f4-b9043674700e",
"version": "7.7.0"
},
"log": {
"file": {
"path": "D:\\logs\\7DaysRetention\\Iit.Det.Urm.MepsSubscriber\\Iit.Det.Urm.MepsSubscriber.log.20210128.log"
},
"offset": 17754757
},
"level": "ERROR",
"message": "Error while starting service.",
"@timestamp": "2021-02-17T10:00:28.343Z",
"ecs": {
"version": "1.5.0"
},
"host": {
"name": "datasolutions-svc-live"
},
"servicename": "Iit.Det.Urm.MepsSubscriber",
"codelocation": "Iit.Det.Urm.MepsSubscriber.MepsSubscriberService.OnStart:29"
},
"fields": {
"@timestamp": [
"2021-02-17T10:00:28.343Z"
]
},
"highlight": {
"servicename": [
"@kibana-highlighted-field@Iit.Det.Urm.MepsSubscriber@/kibana-highlighted-field@"
]
},
"sort": [
1611833128343
]
}
But when I run "simulate", I get ctx.payload.hits.total as null, because apparently it does not find any results. Result of the simulate:
{
"watch_id": "_inlined_",
"node": "eMS-E34eT4-zZhGwtPNSmw",
"state": "execution_not_needed",
"user": "sum",
"status": {
"state": {
"active": true,
"timestamp": "2021-02-17T10:57:04.077Z"
},
"last_checked": "2021-02-17T10:57:04.077Z",
"actions": {
"notify-slack": {
"ack": {
"timestamp": "2021-02-17T10:57:04.077Z",
"state": "awaits_successful_execution"
}
}
},
"execution_state": "execution_not_needed",
"version": -1
},
"trigger_event": {
"type": "manual",
"triggered_time": "2021-02-17T10:57:04.077Z",
"manual": {
"schedule": {
"scheduled_time": "2021-02-17T10:57:04.077Z"
}
}
},
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"datasolutions-svc-*"
],
"rest_total_hits_as_int": true,
"body": {
"query": {
"bool": {
"filter": [
{
"term": {
"level": {
"value": "ERROR"
}
}
},
{
"term": {
"servicename": [
"Iit.Det.Urm.MepsSubscriber"
]
}
},
{
"range": {
"@timestamp": {
"gte": "now-60m"
}
}
}
]
}
}
}
}
}
},
"condition": {
"compare": {
"ctx.payload.hits.total": {
"gt": 0
}
}
},
"metadata": {
"name": "datasolutions-svc-mepssubscriber",
"xpack": {
"type": "json"
}
},
"result": {
"execution_time": "2021-02-17T10:57:04.077Z",
"execution_duration": 0,
"input": {
"type": "simple",
"status": "success",
"payload": {
"highlight": {
"servicename": [
"@kibana-highlighted-field@Iit.Det.Urm.MepsSubscriber@/kibana-highlighted-field@"
]
},
"_index": "datasolutions-svc-live-7.7.0-2021.01",
"_type": "doc",
"_source": {
"exception": "System.Data.SqlClient.SqlException (0x80131904): blabla",
"agent": {
"hostname": "SATSVC3-DK1",
"name": "datasolutions-svc-live",
"id": "8c826ae1-e411-4257-a31f-08824dd58b5a",
"type": "filebeat",
"ephemeral_id": "e355bf8a-be67-4ed1-85f4-b9043674700e",
"version": "7.7.0"
},
"@timestamp": "2021-02-17T10:00:28.343Z",
"ecs": {
"version": "1.5.0"
},
"log": {
"file": {
"path": "D:\\logs\\7DaysRetention\\Iit.Det.Urm.MepsSubscriber\\Iit.Det.Urm.MepsSubscriber.log.20210128.log"
},
"offset": 17754757
},
"level": "ERROR",
"host": {
"name": "datasolutions-svc-live"
},
"servicename": "Iit.Det.Urm.MepsSubscriber",
"message": "Error while starting service.",
"codelocation": "Iit.Det.Urm.MepsSubscriber.MepsSubscriberService.OnStart:29"
},
"_id": "Hre9SHcB1QIqYEnyxSCw",
"sort": [
1611833128343
],
"_score": null,
"fields": {
"@timestamp": [
"2021-02-17T10:00:28.343Z"
]
},
"_version": 1
}
},
"condition": {
"type": "compare",
"status": "success",
"met": false,
"compare": {
"resolved_values": {
"ctx.payload.hits.total": null
}
}
},
"actions": []
},
"messages": []
}
I am not sure why it can't find the results. Can someone tell me what I am doing wrong?
I was able to solve it using the "Inspect" section of the index's Discover page.
Finally, my input for the watcher query had to be changed to:
"input": {
"search": {
"request": {
"search_type": "query_then_fetch",
"indices": [
"datasolutions-svc-*"
],
"rest_total_hits_as_int": true,
"body": {
"query": {
"bool": {
"must": [],
"filter": [
{
"bool": {
"should": [
{
"match_phrase": {
"servicename": "Iit.Det.Urm.MepsSubscriber"
}
}
],
"minimum_should_match": 1
}
},
{
"match_phrase": {
"level": "ERROR"
}
},
{
"range": {
"@timestamp": {
"gte": "now-10m",
"format": "strict_date_optional_time"
}
}
}
],
"should": [],
"must_not": []
}
}
}
}
}
}
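A plausible reason the original term query matched nothing (an assumption; the mapping isn't shown): if servicename is an analyzed text field, its value is normalized before it reaches the inverted index, so an exact term lookup on the original string finds nothing, while match_phrase analyzes the query text the same way and therefore matches. A deliberately simplified Python sketch of that mismatch:

```python
# Crude stand-in for what an analyzed "text" field does at index time:
# the stored tokens are normalized (lowercased, possibly split), so they
# no longer equal the original string.
def rough_analyze(value):
    return [token.lower() for token in value.split()]

indexed_tokens = rough_analyze("Iit.Det.Urm.MepsSubscriber")
print(indexed_tokens)  # ['iit.det.urm.mepssubscriber']

# A `term` query performs no analysis: it looks up the raw query string,
# which no longer matches any indexed token.
print("Iit.Det.Urm.MepsSubscriber" in indexed_tokens)  # False

# `match_phrase` analyzes the query text the same way before comparing,
# so it lines up with the indexed tokens and matches.
print(rough_analyze("Iit.Det.Urm.MepsSubscriber") == indexed_tokens)  # True
```

Also note that, in the simulate run, the alternative input replaces the whole payload, so the condition reads ctx.payload.hits.total from a document that has no hits wrapper at all, which is why the resolved value is null.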

ruby extract data from nested array

How can I access a deeply nested key's value in a hash?
I want to extract the field and size values inside data[] of _Data, printed as fieldvalue_sizevalue, for example:
elemento3_3
I have this code, but it only prints all the data keys and values. I have had some trouble with this iteration, as I am new to Ruby.
_Data[0][:checks].each do |itera|
  puts "#{itera[:data]}"
end
This is the nested array:
_Data =[
{
"agent": "",
"bottom_comments": [],
"checks": [
{
"title": "new",
"data": [{
"field": "elemento1",
"value": "0",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento2",
"value": "0",
"size":"4",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento3",
"value": "0",
"size":"3",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento4",
"value": "0",
"size":"17",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento5",
"value": "0",
"size":"12",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento6",
"value": "0",
"size":"19",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento7",
"value": "0",
"size":"9",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento8",
"value": "0",
"size":"10",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento11",
"value": "0",
"size":"14",
}
]
},
{
"title": "new",
"data": [{
"field": "elemento19",
"value": "0",
"size":"14",
}
]
},
{
"type": "condiciones",
"elementos": [
{
"table": [
{
"title": "",
"name": "radio",
},
{
"title": "",
"name": "xenon",
},
{
"title": "",
"name": "aluminio",
},
{
"title": "",
"name": "boro",
},
{
"title": "",
"name": "oro",
},
{
"title": "",
"name": "bromo",
},
{
"title": "",
"name": "oxigeno",
}
]
}
]
}
]
}
]
_Data[0][:checks].each do |check|
  next unless check[:data] # the last entry has no :data key
  check[:data].each do |data|
    puts "#{data[:field]}_#{data[:size]}" if data[:field] && data[:size]
  end
end
Hope this helps.
Here is a one-liner that prints the values when they exist (note: pluck on plain arrays needs ActiveSupport, so map is used here for plain Ruby). Let me know if any further detail on the methods used would help.
_Data[0][:checks].map { |c| c[:data] }.flatten.compact.each { |i| puts "#{i[:field]}_#{i[:size]}" if i[:field] && i[:size] }

Jolt Transform to Array with Parent Attributes

I am using NiFi Jolt Processor to transform some JSON data.
My JSON field genre contains shared attributes and an array with the list of actual genre names.
I need to transform the "genre" attribute into an array of "genres", where each element contains the common attributes plus one of the genre names.
I have the following input JSON:
{
"programIdentifier": "9184663",
"programInstance": {
"genre": {
"source": "GN",
"locked": false,
"lastModifiedDate": 1527505462094,
"lastModifiedBy": "Some Service",
"genres": [
"Miniseries",
"Drama"
]
}
}
}
I have tried the following spec:
[{
"operation": "shift",
"spec": {
"programIdentifier": ".&",
"genre": {
"source": "genres[].source.value",
"locked": "genres[].locked",
"lastModifiedDate": "genres[].lastModifiedDate",
"lastModifiedBy": "genres[].lastModifiedBy",
"genres": {
"*": "genres[&0].name"
}
}
}
}]
This is my expected output:
{
"programIdentifier": "9184663",
"programInstance": {
"genres": [
{
"source": {
"value": "GN"
},
"locked": false,
"lastModifiedDate": 1527505462094,
"lastModifiedBy": "Some Service",
"name": "Miniseries"
},
{
"source": {
"value": "GN"
},
"locked": false,
"lastModifiedDate": 1527505462094,
"lastModifiedBy": "Some Service",
"name": "Drama"
}
]
}
}
But it's coming out as:
{
"programIdentifier": "9184663",
"programInstance": {
"genres": [
{
"source": {
"value": "GN"
},
"name": "Miniseries"
}, {
"locked": false,
"name": "Drama"
}, {
"lastModifiedDate": 1527505462094
}, {
"lastModifiedBy": "Some Service"
}]
}
}
Is this what you want to achieve?
[
{
"operation": "shift",
"spec": {
"programIdentifier": ".&",
"programInstance": {
"genre": {
"genres": {
"*": {
"@2": {
"source": "programInstance.genres[&2].source[]",
"locked": "programInstance.genres[&2].locked",
"lastModifiedDate": "programInstance.genres[&2].lastModifiedDate",
"lastModifiedBy": "programInstance.genres[&2].lastModifiedBy"
},
"@": "programInstance.genres[&1].name"
}
}
}
}
}
}
]
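To make the target shape concrete, the fan-out this spec performs (pair each genre name with a copy of the shared attributes) can be sketched in plain Python. This is only an illustration of the mapping, not a NiFi component; the input is the document from the question:

```python
# Input document from the question
doc = {
    "programIdentifier": "9184663",
    "programInstance": {
        "genre": {
            "source": "GN",
            "locked": False,
            "lastModifiedDate": 1527505462094,
            "lastModifiedBy": "Some Service",
            "genres": ["Miniseries", "Drama"],
        }
    },
}

genre = doc["programInstance"]["genre"]

# One output element per genre name, each carrying the shared attributes;
# "source" is wrapped as {"value": ...} to match the expected output
genres = [
    {
        "source": {"value": genre["source"]},
        "locked": genre["locked"],
        "lastModifiedDate": genre["lastModifiedDate"],
        "lastModifiedBy": genre["lastModifiedBy"],
        "name": name,
    }
    for name in genre["genres"]
]

result = {
    "programIdentifier": doc["programIdentifier"],
    "programInstance": {"genres": genres},
}
```

The key idea, in the spec as in the sketch, is to iterate over the genres array and pull the parent attributes into each element, rather than iterating over the parent attributes (which is what produced the fragmented output above).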

FHIR and DocumentReference

How can I get the DocumentReference for a particular document from a DocumentManifest list of documents?
Here is an example of returned DocumentManifest
{
"title": {
},
"id": {
},
"updated": {
},
"content": {
"type": {
"#id": "urn:uuid:xxxxxx-xxxx-xxxx-xxxx-xxxxxxx",
"text": "Patient Document List"
},
"resourceType": "DocumentManifest",
"text": {
"status": "generated",
"div": "<div xmlns=\"http://www.w3.org/1999/xhtml\">Some Test ORG</div>"
},
"contained": {
"resourceType": "Patient",
"identifier": {
"use": "official",
"system": "",
"value": "12345678987654321"
}
},
"subject": {
"reference": "Patient Documents"
},
"recipient": {
"organization": {
"display": "Some Test ORG"
}
},
"created": "2018-02-09T13:26:53-07:00",
"status": "current",
"content": {
"reference": [
"Binary/DOCUMENT-1000123",
"Binary/DOCUMENT-1000124",
"Binary/DOCUMENT-1000125"
]
}
}
}
I have tried to use something like
GET [service-url]/DocumentReference/?_query=generate&uri=[service-url]/BINARY/DOCUMENT-1000125
but I had no luck.
That DocumentManifest is not pointing at a DocumentReference. It is pointing at a document.
FYI: What you have shown is not compliant with IHE-MHD use of DocumentReference/DocumentManifest/Binary.
