Match keys with sibling object JSONATA - jsonata

I have an JSON object with the structure below. When looping over key_two I want to create a new object that I will return. The returned object should contain a title with the value from key_one's name where the id of key_one matches the current looped over node from key_two.
Both objects contain other keys that also will be included but the first step I can't figure out is how to grab data from a sibling object while looping and match it to the current value.
{
"key_one": [
{
"name": "some_cool_title",
"id": "value_one",
...
}
],
"key_two": [
{
"node": "value_one",
...
}
],
}

This is a good example of a 'join' operation (in SQL terms). JSONata supports this in a path expression. See https://docs.jsonata.org/path-operators#-context-variable-binding
So in your example, you could write:
key_one#$k1.key_two[node = $k1.id].{
"title": $k1.name
}
You can then add extra fields into the resulting object by referencing items from either of the original objects. E.g.:
key_one#$k1.key_two[node = $k1.id].{
"title": $k1.name,
"other_one": $k1.other_data,
"other_two": other_data
}
See https://try.jsonata.org/--2aRZvSL

I seem to have found a solution for this.
[key_two].$filter($$.key_one, function($v, $k){
$v.id = node
}).{"title": name ? name : id}
Gives:
[
{
"title": "value_one"
},
{
"title": "value_two"
},
{
"title": "value_three"
}
]
Leaving this here if someone have a similar issue in the future.

Related

Filtering JSON based on sub array in a Power Automate Flow

I have some json data that I would like to filter in a Power Automate Flow.
A simplified version of the json is as follows:
[
{
"ItemId": "1",
"Blah": "test1",
"CustomFieldArray": [
{
"Name": "Code",
"Value": "A"
},
{
"Name": "Category",
"Value": "Test"
}
]
},
{
"ItemId": "2",
"Blah": "test2",
"CustomFieldArray": [
{
"Name": "Code",
"Value": "B"
},
{
"Name": "Category",
"Value": "Test"
}
]
}
]
For example, I wish to filter items based on Name = "Code" and Value = "A". I should be left with the item with ItemId 1 in that case.
I can't figure out how to do this in Power Automate. It would be nice to change the data structure, but this is the way the data is, and I'm trying to work out if this is possible in Power Automate without changing the data itself.
Firstly, I had to fix your JSON, it wasn't complete.
Secondly, filtering on sub array information isn't what I'd call easy. However, to get around the limitations, you can perform a bit of trickery.
Prior to the step above, I create a variable of type Array and called it Array.
In the step above, the left hand side expression is ...
string(item()?['CustomFieldArray'])
... and the contains comparison on the right hand side is simply as you can see, a string with the appropriate filter value ...
{"Name":"Code","Value":"A"}
... it's not an expression or a proper object, just a string.
If you need to enhance it to cater for case sensitive values, just set everything to lower case using the toLower expression on the left.
Although it's hard to see, that will produce your desired result ...
... you can see by the vertical scrollbars that it's reduced the size of the array.

How to create a HashMap with custom object as a key?

In Elasticsearch, I have an object that contains an array of objects. Each object in the array have type, id, updateTime, value fields.
My input parameter is an array that contains objects of the same type but different values and update times. Id like to update the objects with new value when they exist and create new ones when they aren't.
I'd like to use Painless script to update those but keep them distinct, as some of them may overlap. Issue is that I need to use both type and id to keep them unique. So far I've done it with bruteforce approach, nested for loop and comparing elements of both arrays, but I'm not too happy about that.
One of the ideas is to take array from source, build temporary HashMap for fast lookup, process input and later store all objects back into source.
Can I create HashMap with custom object (a class with type and id) as a key? If so, how to do it? I can't add class definition to the script.
Here's the mapping. All fields are 'disabled' as I use them only as intermidiate state and query using other fields.
{
"properties": {
"arrayOfObjects": {
"properties": {
"typ": {
"enabled": false
},
"id": {
"enabled": false
},
"value": {
"enabled": false
},
"updated": {
"enabled": false
}
}
}
}
}
Example doc.
{
"arrayOfObjects": [
{
"typ": "a",
"id": "1",
"updated": "2020-01-02T10:10:10Z",
"value": "yes"
},
{
"typ": "a",
"id": "2",
"updated": "2020-01-02T11:11:11Z",
"value": "no"
},
{
"typ": "b",
"id": "1",
"updated": "2020-01-02T11:11:11Z"
}
]
}
And finally part of the script in it's current form. The script does some other things, too, so I've stripped them out for brevity.
if (ctx._source.arrayOfObjects == null) {
ctx._source.arrayOfObjects = new ArrayList();
}
for (obj in params.inputObjects) {
def found = false;
for (existingObj in ctx._source.arrayOfObjects) {
if (obj.typ == existingObj.typ && obj.id == existingObj.id && isAfter(obj.updated, existingObj.updated)) {
existingObj.updated = obj.updated;
existingObj.value = obj.value;
found = true;
break;
}
}
if (!found) {
ctx._source.arrayOfObjects.add([
"typ": obj.typ,
"id": obj.id,
"value": params.inputValue,
"updated": obj.updated
]);
}
}
There's technically nothing suboptimal about your approach.
A HashMap could potentially save some time but since you're scripting, you're already bound to its innate inefficiencies... Btw here's how you initialize & work with HashMaps.
Another approach would be to rethink your data structure -- instead of arrays of objects use keyed objects or similar. Arrays of objects aren't great for frequent updates.
Finally a tip: you said that these fields are only used to store some intermediate state. If that weren't the case (or won't be in the future), I'd recommend using nested arrays to enable querying independently of other objects in the array.

How to access parent node from child

How do I access a parent object from a child node. Seems that i can't access the scope
This is the source json
{
"content" : {
"date" : "2019-02-10T02:40:48Z",
"production" : {
"productionId" : "918",
}
}
}
This is my Jsonata
{
"productionType": "specificProducts",
"products": [
content.production.(
{"usedProducts" : {
"id" = productionId,
"productDate" = content.date // how do I access content
}
})
]
}
do I have to save "content" in some kind of variable and pass it to the child ?
The Answer is $$.content.date
Here is the documentation of it
https://docs.jsonata.org/programming#built-in-variables
{
"productionType": "specificProducts",
"products": [
content.production.(
{"usedProducts" : {
"id" = productionId,
"productDate" = $$.content.date
}
})
]
}
Another solution is to not dive down into the production element until you want to access its 'productionId' property -- like this:
{
"productionType": "specificProducts",
"products": [
content.{
"usedProducts": {
"id": production.productionId,
"productDate": date
}
}
]
}
Then you can just access the 'date' property in the context of its parent content object.
Of course, these answers may or may not work as expected if the source object is more deeply nested, or contains arrays of child objects...
But to answer your original question, "no" -- in JSONata, elements cannot know what "path" was used to dereference them. Iirc, it was a concious design decision to ensure maximum flexibility and speed.
Use the % symbol to access the parent node from the context of a child node. You can use %.% to access the grandparent node, and so on.
You can read more about it in the documentation here: https://docs.jsonata.org/path-operators
This might be what you were trying to accomplish. Since the query content.production returns an array, your query had to be adjusted slightly.
{
"productionType": "specificProducts",
"products": [
{
"usedProducts": [
content.production.{
"id": $.productionId,
"productDate": %.date
}
]
}
]
}

Index main-object, sub-objects, and do a search on sub-objects (that return sib-objects)

I've an object like it (simplified here), Each strain have many chromosomes, that have many locus, that have many features, that have many products, ... Here I just put 1 of each.
The structure in json is:
{
"name": "my strain",
"public": false,
"authorized_users": [1, 23, 51],
"chromosomes": [
{
"name": "C1",
"locus": [
{
"name": "locus1",
"features": [
{
"name": "feature1",
"products": [
{
"name": "product1"
//...
}
]
}
]
}
]
}
]
}
I want to add this object in Elasticsearch, for the moment I've add objects separatly: locus, features and products. It's okay to do a search (I want type a keyword, watch in name of locus, name of features, and name of products), but I need to duplicate data like public and authorized_users, in each subobject.
Can I register the whole object in elasticsearch and just do a search on each locus level, features and products ? And get it individually ? (no return the Strain object)
Yes you can search at any level (ie, with a query like "chromosomes.locus.name").
But as you have arrays at each level, you will have to use nested objects (and nested query) to get exactly what you want, which is a bit more complex:
https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.3/query-dsl-nested-query.html
For your last question, no, you cannot get subobjects individually, elastic returns the whole json source object.
If you want only data from subobjects, you will have to use nested aggregations.

Which is the better design for this API response

I'm trying to decide upon the best format of response for my API. I need to return a reports response which provides information on the report itself and the fields contained on it. Fields can be of differing types, so there can be: SelectList; TextArea; Location etc..
They each use different properties, so "SelectList" might use "Value" to store its string value and "Location" might use "ChildItems" to hold "Longitude" "Latitude" etc.
Here's what I mean:
"ReportList": [
{
"Fields": [
{
"Id": {},
"Label": "",
"Value": "",
"FieldType": "",
"FieldBankFieldId": {},
"ChildItems": [
{
"Item": "",
"Value": ""
}
]
}
]
}
The problem with this is I'm expecting the users to know when a value is supposed to be null. So I'm expecting a person looking to extract the value from "Location" to extract it from "ChildItems" and not "Value". The benefit to this however, is it's much easier to query for things than the alternative which is the following:
"ReportList": [
{
"Fields": [
{
"SelectList": [
{
"Id": {},
"Label": "",
"Value": "",
}
]
"Location": [
{
"Id": {},
"Label": "",
"Latitude": "",
"Longitude": "",
"etc": "",
}
]
}
]
}
So this one is a reports list that contains a list of fields which on it contains a list of fieldtype for every fieldtype I have (15 or something like that). This is opposed to just having a list of reports which has a list of fields with a "fieldtype" enum which I think is fairly easy to manipulate.
So the Question: Which format is best for a response? Any alternatives and comments appreciated.
EDIT:
To query all fields by fieldtype in a report and get values with the first way it would go something like this:
foreach(field in fields)
{
switch(field.fieldType){
case FieldType.Location :
var locationValue = field.childitems;
break;
case FieldType.SelectList:
var valueselectlist = field.Value;
break;
}
The second one would be like:
foreach(field in fields)
{
foreach(location in field.Locations)
{
var latitude = location.Latitude;
}
foreach(selectList in field.SelectLists)
{
var value= selectList.Value;
}
}
I think the right answer is the first one. With the switch statement. It makes it easier to query on for things like: Get me the value of the field with the id of this guid. It just means putting it through a big switch statement.
I went with the first one because It's easier to query for the most common use case. I'll expect the client code to put it into their own schema if they want to change it.

Resources