I have a question that is an extension/follow-up to a previous question I've asked:
How do I concatenate dummy values in JQ based on field value, and then CSV-aggregate these concatenations?
In my bash script, when I run the following jq against my curl result:
curl -u someKey:someSecret someURL 2>/dev/null | jq -r '.schema' | jq -r -c '.fields'
I get back a JSON array as follows:
[
{"name":"id", "type":"int"},
{
"name": "agents",
"type": {
"type": "array",
"items": {
"name": "carSalesAgents",
"type": "record"
"fields": [
{
"name": "agentName",
"type": ["string", "null"],
"default": null
},
{
"name": "agentEmail",
"type": ["string", "null"],
"default": null
},
{
"name": "agentPhones",
"type": {
"type": "array",
"items": {
"name": "SalesAgentPhone",
"type": "record"
"fields": [
{
"name": "phoneNumber",
"type": "string"
}
]
}
},
"default": []
}
]
}
},
"default": []
},
{"name":"description","type":"string"}
]
Note: line breaks and indentation added here for ease of reading; in reality this is all a single blob of text.
My goal is to apply a jq call that returns the following, given the example above (again, line breaks and spaces added for readability; it only needs to return a valid JSON blob):
{
"id":1234567890,
"agents": [
{
"agentName": "xxxxxxxxxx",
"agentEmail": "xxxxxxxxxx",
"agentPhones": [
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
}
]
},
{
"agentName": "xxxxxxxxxx",
"agentEmail": "xxxxxxxxxx",
"agentPhones": [
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
}
]
}
],
"description":"xxxxxxxxxx"
}
To summarise, I am trying to automatically generate templated values that match the "schema" JSON shown above.
So just to clarify, the values for "name" (including their surrounding double quotes) are concatenated with either:
:1234567890 ...when the "type" for that object is "int"
":xxxxxxxxxx" ...when the "type" for that object is "string"
...and when the type is "array" or "record", the appropriate enclosures ({} or []) are added with the nested content inside.
If it's an array of records, generate TWO records in the output.
The approach I have started down for parsing nested content like this is a series of if-then-elses for every combination of possible jq types.
But this is fast becoming very hard to manage and painful. Here are my initial scratch efforts:
echo '[{"name":"id","type":"int"},{"name":"test_string","type":"string"},{"name":"string3ish","type":["string","null"],"default":null}]' | jq -c 'map({(.name): (if .type == "int" then 1234567890 else (if .type == "string" then "xxxxxxxxxx" else (if .type|type == "array" then "xxARRAYxx" else "xxUNKNOWNxx" end) end) end)})|add'
I was wondering if anyone knew of a smarter way to do this in bash/shell with JQ.
PS: I have found alternate solutions for such parsing using Java and Python modules, but jq is preferable due to some unique portability constraints. :)
Thanks!
jq supports functions. Those functions can recurse.
#!/usr/bin/env jq -f
# Ignore all but the first type, in the case of "type": ["string", "null"]
def takeFirstTypeFromArray:
if (.type | type) == "array" then
.type = .type[0]
else
.
end;
def sampleData:
takeFirstTypeFromArray |
if .type == "int" then
1234567890
elif .type == "string" then
"xxxxxxxxxx"
elif .type == "array" then # generate two entries for any test array
[(.items | sampleData), (.items | sampleData)]
elif .type == "record" then
(.fields | map({(.name): sampleData}) | add)
elif (.type | type) == "array" then
(.type[] | sampleData)
elif (.type | type) == "object" then
(.type | sampleData)
else
["UNKNOWN", .]
end;
map({(.name): sampleData}) | add
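To run it, save the program to a file (sample.jq is a name chosen here just for illustration) and feed it the fields array from the original pipeline:
curl -u someKey:someSecret someURL 2>/dev/null | jq -r '.schema' | jq '.fields' | jq -f sample.jq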
Related
I have a YAML file which I am converting into JSON files using yq.
It generates the following output,
{
"a" : 1,
"b" : 2,
"c" : {
"id": "9ee ...",
"parent": "abc..."
}
}
Then I am creating another JSON based on the above JSON's keys and values. Please find the code snippet below:
# Extract the properties from the YAML file
json=$(yq -j "$file_name")
# Iterate over the properties
parameters=""
for key in $(echo "${json}" | jq -r 'keys[]'); do
# Extract the key and value of the property
value=$(echo "${json}" | jq -r ".$key")
echo "Adding parameter $key with value $value to SSM"
# Add the property to the list of parameters
parameters+="{\"Name\": \"$key\", \"Value\": \"$value\", \"Type\": \"String\", \"Overwrite\": true}"
done
Since the value for "c" in the 1st JSON is itself JSON, we are not able to generate the 2nd JSON correctly; it produces an invalid-JSON error.
Is there any way to escape/stringify the JSON characters in a bash script or jq, so that we can generate the 2nd JSON?
Any help would be really appreciated.
Actual output:
[
{
"Name": "a",
"Value": "1",
"Type": "String",
"Overwrite": "true"
},
{
"Name": "b",
"Value": "2",
"Type": "String",
"Overwrite": "true"
},
{
"Name": "c",
"Value": "{
"id": "9ee ...",
"parent": "abc..."
}",
"Type": "String",
"Overwrite": "true"
}
]
The above is not valid JSON.
Expected output:
[
{
"Name": "a",
"Value": "1",
"Type": "String",
"Overwrite": "true"
},
{
"Name": "b",
"Value": "2",
"Type": "String",
"Overwrite": "true"
},
{
"Name": "c",
"Value": "{\r\n \"id\": \"9ee ...\",\r\n \"parent\": \"abc...\"\r\n }",
"Type": "String",
"Overwrite": "true"
}
]
Why try to emulate jq's behavior with a shell loop (which should generally be avoided) instead of using jq directly?
yq -j "$file_name" | jq 'to_entries | map({
Name: .key,
Value: .value,
Type: "String",
Overwrite: true
})'
Or directly transform using yq only:
yq 'to_entries | map({
Name: .key,
Value: .value,
Type: "String",
Overwrite: true
})' -j "$file_name"
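(Note: the yq here is assumed to be the Python-based wrapper around jq, which accepts jq filter syntax as shown; the Go implementation of yq uses a different expression language.)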
Update after a clarifying edit of the question: it has become clear that you want to transform the value into a string. jq has the tostring filter for that, so the program becomes:
to_entries | map({
Name: .key,
Value: (.value | tostring),
Type: "String",
Overwrite: true
})
Note that this will not keep the line breaks and indents, but formats the JSON object in a "compact" way. Let us know if that's a problem.
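For illustration, tostring applied to a non-string value behaves like tojson and always emits compact output:
$ jq -n '{id: "9ee"} | tostring'
"{\"id\":\"9ee\"}"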
$ jq 'to_entries | map({
Name: .key,
Value: (.value | tostring),
Type: "String",
Overwrite: true
})' <<JSON
{
"a": 1,
"b": 2,
"c": {
"id": "9ee ...",
"parent": "abc..."
}
}
JSON
[
{
"Name": "a",
"Value": "1",
"Type": "String",
"Overwrite": true
},
{
"Name": "b",
"Value": "2",
"Type": "String",
"Overwrite": true
},
{
"Name": "c",
"Value": "{\"id\":\"9ee ...\",\"parent\":\"abc...\"}",
"Type": "String",
"Overwrite": true
}
]
This should achieve what's expected:
jq 'to_entries | map({
Name: .key,
Value: (.value|if (type == "object")
then tojson
else tostring
end),
Type: "String",
Overwrite: true})
' input.json
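The object check matters because tostring and tojson differ on plain strings: tostring returns the string unchanged, while tojson wraps it in quotes:
$ jq -n '"abc" | tostring, tojson'
"abc"
"\"abc\""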
To JSON-encode data from within jq you can use the tojson (or @json) builtins.
Also, to get the actual type of some data, there is the type function.
Maybe you are trying to accomplish something like this:
# Convert the YAML file into JSON
json="$(yq -j "$file_name")"
# Transform the JSON data into desired format
jq 'to_entries | map({Name: .key} + (.value | {
EncodedValue: tojson,
OriginalType: type,
Overwrite: true
}))' <<< "$json" > "$new_file_name"
Checking the contents of the new file would now give you something like:
[
{
"Name": "a",
"EncodedValue": "1",
"OriginalType": "number",
"Overwrite": true
},
{
"Name": "b",
"EncodedValue": "2",
"OriginalType": "number",
"Overwrite": true
},
{
"Name": "c",
"EncodedValue": "{\"id\":\"9ee ...\",\"parent\":\"abc...\"}",
"OriginalType": "object",
"Overwrite": true
}
]
In this Vega chart, if I download and convert the flare-dependencies.json to CSV using the following jq command,
jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv' flare-dependencies.json > flare-dependencies.csv
And change the corresponding data property in the edge-bundling.vg.json file from:
{
"name": "dependencies",
"url": "data/flare-dependencies.json",
"transform": [
{
"type": "formula",
"expr": "treePath('tree', datum.source, datum.target)",
"as": "treepath",
"initonly": true
}
]
},
to
{
"name": "dependencies",
"url": "data/flare-dependencies.csv",
"format": { "type": "csv" },
"transform": [
{
"type": "formula",
"expr": "treePath('tree', datum.source, datum.target)",
"as": "treepath",
"initonly": true
}
]
},
The hovering effect won't work (the colors won't change when I hover over edges/nodes).
I suspect that the issue is with this section:
"name": "selected",
"source": "dependencies",
"transform": [
{
"type": "filter",
"expr": "datum.source === active || datum.target === active"
}
]
What am I missing? How can I fix this?
JSON data is typed; that is, the file format distinguishes between string and numerical data. CSV data is untyped: all entries are expressed as strings.
The chart specification above requires some fields to be numerical. When you convert the input data to CSV, you must add a format specifier to specify numerical types for the numerical data columns.
In the case of this chart, you can use the following for the nodes data:
"format": {
"type": "tsv",
"parse": { "id": "number", "name": "string", "parent": "number" }
},
And the following for the links data:
"format": {
"type": "tsv",
"parse": { "source": "number", "target": "number" }
},
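Putting it together, the dependencies data entry would look something like this (a sketch based on the snippets above):
{
  "name": "dependencies",
  "url": "data/flare-dependencies.csv",
  "format": {
    "type": "csv",
    "parse": { "source": "number", "target": "number" }
  },
  "transform": [
    {
      "type": "formula",
      "expr": "treePath('tree', datum.source, datum.target)",
      "as": "treepath",
      "initonly": true
    }
  ]
},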
In the JSON example below, how do I find all the elements which contain the string "Choice" and replace them with another string, for example "Grade"?
So with the below, all the field names "***Choice" should change to "***Grade".
I have pasted the expected output below. Given that I don't know how many fields will have the string "Choice", I don't want to simply do $ ~> | ** [firstChoice] | {"firstGrade": firstChoice}, ["firstChoice"] |, which is a straight find-and-replace.
{
"data": {
"resourceType": "Bundle",
"id": "e919c820-71b9-4e4b-a1c8-c2fef62ea911",
"firstChoice": "xxx",
"type": "collection",
"entry": [
{
"resource": {
"resourceType": "Condition",
"id": "SMART-Condition-342",
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "38341003",
"display": "Essential hypertension",
"firstChoice": "xxx"
}
],
"text": "Essential hypertension"
},
"clinicalStatus": "active",
"secondChoice": "xxx"
},
"search": {
"mode": "match"
}
}
]
}
}
Expected output
{
"data": {
"resourceType": "Bundle",
"id": "e919c820-71b9-4e4b-a1c8-c2fef62ea911",
"firstGrade": "xxx",
"type": "collection",
"entry": [
{
"resource": {
"resourceType": "Condition",
"id": "SMART-Condition-342",
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "38341003",
"display": "Essential hypertension",
"firstGrade": "xxx"
}
],
"text": "Essential hypertension"
},
"clinicalStatus": "active",
"secondGrade": "xxx"
},
"search": {
"mode": "match"
}
}
]
}
}
There might be simpler ways, but this is an expression I came up with in JSONata:
(
$prefixes := $keys(**)[$ ~> /Choice$/].$substringBefore('Choice');
$reduce($prefixes, function($acc, $prefix) {(
$choice := $prefix & "Choice";
$acc ~> | ** [$lookup($choice)] | {$prefix & "Grade": $lookup($choice)}, [$choice] |
)}, $$)
)
It looks terrible, but I'll explain how I built it up anyway.
You started with the expression
$ ~> | ** [firstChoice] | {"firstGrade": firstChoice}, ["firstChoice"] |
which is fine if you only want to replace one choice, and you know the full name. If you want to replace more than one, then you can chain these together as follows:
$ ~> | ** [firstChoice] | {"firstGrade": firstChoice}, ["firstChoice"] |
~> | ** [secondChoice] | {"secondGrade": secondChoice}, ["secondChoice"] |
~> | ** [thirdChoice] | {"thirdGrade": thirdChoice}, ["thirdChoice"] |
At this point, you could create a higher-order function that takes the choice prefix and returns a partial substitution (note that the |...|...| syntax generates a function). Then you can chain these together for an array of prefixes using the built-in $reduce() higher-order function. So you get something like this:
(
$prefixes := ["first", "second", "third"];
$reduce($prefixes, function($acc, $prefix) {(
$choice := $prefix & "Choice";
$acc ~> | ** [$lookup($choice)] | {$prefix & "Grade": $lookup($choice)}, [$choice] |
)}, $$)
)
But if you don't know the set of prefixes up front, and want to select, say, all property names that end in 'Choice', then the following expression will get you that:
$prefixes := $keys(**)[$ ~> /Choice$/].$substringBefore('Choice')
That brings us to my final expression above. You can experiment with it in the JSONata exerciser on your data.
I know it was 2019, but here's an alternative solution to help future readers.
$replace($string(), /(first|second|third)Choice/, "$1Grade") ~> $eval()
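Note that this stringifies the whole document and re-parses it with $eval(), so the pattern would also be replaced if it appeared inside a string value rather than a key.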
I have a JSON file that I want to convert into a CSV file using jq in a shell script. I want to create a single row from this entire JSON file, extracting the value from each values array. The row output should be something like
null,642,412,0,null,null
Here is my JSON file
{
"data": [
{
"name": "exits",
"period": "lifetime",
"values": [
{
"value": {}
}
],
"title": "Exits",
"description": "Number of times someone exited the carousel"
},
{
"name": "impressions",
"period": "lifetime",
"values": [
{
"value": 642
}
],
"title": "Impressions",
"description": "Total number of times the media object has been seen"
},
{
"name": "reach",
"period": "lifetime",
"values": [
{
"value": 412
}
],
"title": "Reach",
"description": "Total number of unique accounts that have seen the media object"
},
{
"name": "replies",
"period": "lifetime",
"values": [
{
"value": 0
}
],
"title": "Replies",
"description": "Total number of replies to the carousel"
},
{
"name": "taps_forward",
"period": "lifetime",
"values": [
{
"value": {}
}
],
"title": "Taps Forward",
"description": "Total number of taps to see this story's next photo or video"
},
{
"name": "taps_back",
"period": "lifetime",
"values": [
{
"value": {}
}
],
"title": "Taps Back",
"description": "Total number of taps to see this story's previous photo or video"
}
]
}
I tried using this jq command:
.data | map(.values[].value) | @csv
This is giving the following output:
jq: error (at <stdin>:70): object ({}) is not valid in a csv row
exit status 5
So when an empty JSON object is encountered, jq raises an error.
Please help!
The row output should be something like
null,642,412,0,null,null
Using length==0 here is dubious at best. To check for {} one could write:
jq '.data | map(.values[].value | if . == {} then "null" else . end) | @csv'
Similarly for [].
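For instance, to cover both empty objects and empty arrays in one pass (a sketch):
jq '.data | map(.values[].value | if . == {} or . == [] then "null" else . end) | @csv'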
If you run the command without the @csv part you will see that the output is:
[
{},
642,
412,
0,
{},
{}
]
You can fix this by replacing the empty objects (length == 0) with "null":
jq '.data | map(.values[].value) | map(if (type == "object" and length == 0) then "null" else . end) | @csv'
Output:
"\"null\",642,412,0,\"null\",\"null\""
Per the suggestion from @aaron (see comment), the following can produce the requested output without extra post-processing. Disclaimer: this does not work with my jq 1.5, but works on jqplay with jq 1.6.
jq --raw-output '.data | map(.values[].value) | map(if (type == "object" and length == 0) then "null" else . end) | join(",")'
Output:
null,642,412,0,null,null
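One caveat when swapping @csv for join(","): join does not quote or escape values, so a string containing a comma would corrupt the row. For example:
$ jq -rn '["a,b", 1] | join(",")'
a,b,1
$ jq -rn '["a,b", 1] | @csv'
"a,b",1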
I've got two Apache Avro schemas (essentially JSON): one being a "common" part shared across many schemas, and another one being an extension. Looking for a way to merge them in a shell script.
base.avsc
{
"type": "record",
"fields": [
{
"name": "id",
"type": "string"
}
]
}
schema1.avsc
{
"name": "schema1",
"namespace": "test",
"doc": "Test schema",
"fields": [
{
"name": "property1",
"type": [
"null",
"string"
],
"default": null,
"doc": "Schema 1 specific field"
}
]
}
jq -s '.[0] * .[1]' base.avsc schema1.avsc doesn't merge the array for me:
{
"type": "record",
"fields": [
{
"name": "property1",
"type": [
"null",
"string"
],
"default": null,
"doc": "Schema 1 specific field"
}
],
"name": "schema1",
"namespace": "test",
"doc": "Test schema"
}
I don't expect to have the same keys in the "fields" array. And "type": "record" could be moved into schema1.avsc if that makes it easier.
An expected result should be something like this (the order of the keys doesn't make a difference):
{
"name": "schema1",
"namespace": "test",
"doc": "Test schema",
"type": "record",
"fields": [
{
"name": "property1",
"type": [
"null",
"string"
],
"default": null,
"doc": "Schema 1 specific field"
},
{
"name": "id",
"type": "string"
}
]
}
Can't figure out how to write an expression in jq for what I want.
You need the addition (+) operator to perform a union of the two objects, and then combine the record fields from both files:
jq -s '.[0] as $o1 | .[1] as $o2 | ($o1 + $o2) |.fields = ($o2.fields + $o1.fields) ' base.avsc schema1.avsc
Answer adapted from pkoppstein's comment on the GitHub issue "Merge arrays in two json files".
The jq manual says this under the addition operator +
Objects are added by merging, that is, inserting all the key-value pairs from both objects into a single combined object. If both objects contain a value for the same key, the object on the right of the + wins. (For recursive merge use the * operator.)
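A quick illustration of the difference between the two operators:
$ jq -cn '{a: {x: 1}} + {a: {y: 2}}'
{"a":{"y":2}}
$ jq -cn '{a: {x: 1}} * {a: {y: 2}}'
{"a":{"x":1,"y":2}}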
Here's a concise solution that avoids "slurping":
jq --argfile base base.avsc '
$base + .
| .fields += ($base|.fields)
' schema1.avsc
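Note that --argfile is marked as deprecated in recent jq releases; an equivalent using the documented --slurpfile option (which binds an array of parsed values) would be:
jq --slurpfile base base.avsc '
$base[0] + .
| .fields += $base[0].fields
' schema1.avsc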
Or you could go with brevity:
jq -s '
.[0].fields as $f | add | .fields += $f
' base.avsc schema1.avsc
As an alternative solution, you may consider handling hierarchical JSON using jtc, a walk-path based Unix utility.
The ask here is merely a recursive merge, which with jtc looks like this:
bash $ <schema1.avsc jtc -mi base.avsc
{
"doc": "Test schema",
"fields": [
{
"default": null,
"doc": "Schema 1 specific field",
"name": "property1",
"type": [
"null",
"string"
]
},
{
"name": "id",
"type": "string"
}
],
"name": "schema1",
"namespace": "test",
"type": "record"
}
bash $
PS. Disclosure: I'm the creator of jtc, a shell CLI tool for JSON operations.