unable to parse json into csv using jq - bash

I have a JSON file that I want to convert into a CSV file using jq in a shell script. I want to create a single row from the entire file by extracting each value from the values arrays. The row output should be something like
null,642,642,412,0,null,null
Here is my JSON file
{
"data": [
{
"name": "exits",
"period": "lifetime",
"values": [
{
"value": {}
}
],
"title": "Exits",
"description": "Number of times someone exited the carousel"
},
{
"name": "impressions",
"period": "lifetime",
"values": [
{
"value": 642
}
],
"title": "Impressions",
"description": "Total number of times the media object has been seen"
},
{
"name": "reach",
"period": "lifetime",
"values": [
{
"value": 412
}
],
"title": "Reach",
"description": "Total number of unique accounts that have seen the media object"
},
{
"name": "replies",
"period": "lifetime",
"values": [
{
"value": 0
}
],
"title": "Replies",
"description": "Total number of replies to the carousel"
},
{
"name": "taps_forward",
"period": "lifetime",
"values": [
{
"value": {}
}
],
"title": "Taps Forward",
"description": "Total number of taps to see this story's next photo or video"
},
{
"name": "taps_back",
"period": "lifetime",
"values": [
{
"value": {}
}
],
"title": "Taps Back",
"description": "Total number of taps to see this story's previous photo or video"
}
]
}
I tried using this jq command:
.data | map(.values[].value) | @csv
This is giving the following output:
jq: error (at :70): object ({}) is not valid in a csv row
exit status 5
So whenever a value is an empty JSON object, jq raises this error.
Please Help!!

Using length==0 here is dubious at best. To check for {} one could write:
jq '.data | map(.values[].value | if . == {} then "null" else . end) | @csv'
Similarly for [].

If you run the command without the @csv part you will see that the output is:
[
{},
642,
412,
0,
{},
{}
]
You can replace the empty objects (type "object" with length == 0) with the string "null":
jq '.data | map(.values[].value) | map(if (type == "object" and length == 0 ) then "null" else . end) | @csv'
Output:
"\"null\",642,412,0,\"null\",\"null\""
Per a suggestion from @aaron (see comment), the following produces the requested output without extra post-processing. Disclaimer: this does not work with my jq 1.5, but it works on jqplay with jq 1.6.
jq --raw-output '.data | map(.values[].value) | map(if (type == "object" and length == 0 ) then "null" else . end) | join(",")'
Output:
null,642,412,0,null,null
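As a minimal sketch of the join-based approach above: converting each number with tostring avoids relying on join accepting non-string elements, which is an assumption about why the answer's command did not work on jq 1.5. The input below is a trimmed copy of the question's data that keeps only the fields the filter reads.

```shell
#!/bin/sh
# Trimmed copy of the question's input: only "name" and "values" are kept.
json='{"data":[
  {"name":"exits","values":[{"value":{}}]},
  {"name":"impressions","values":[{"value":642}]},
  {"name":"reach","values":[{"value":412}]},
  {"name":"replies","values":[{"value":0}]},
  {"name":"taps_forward","values":[{"value":{}}]},
  {"name":"taps_back","values":[{"value":{}}]}
]}'

# Map each empty object to the string "null", stringify everything else,
# then join the pieces into one comma-separated row.
row=$(printf '%s' "$json" | jq -r '
  .data
  | map(.values[].value | if . == {} then "null" else tostring end)
  | join(",")')

printf '%s\n' "$row"
```

Because every element is already a string before join runs, the same filter should behave identically across jq versions; --raw-output (-r) strips the surrounding JSON quotes from the final row.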

Related

How do I get an id from JSON with jq?

I have the JSON below. How can I get the id of the object whose attributes contain the value 0fda6bb8-4fc9-4463-9d26-af2d503cb19c?
[
{
"id": "c3b1516d-5b2c-4838-b5eb-77d94d634832",
"versionId": "c3b1516d-5b2c-4838-b5eb-77d94d634832",
"name": "выписка маленькая заявка с лендинга ИБ",
"entityTypeName": "TestCases",
"projectId": "6dfe2ace-dd40-4e36-b66e-4a655a855a2f",
"sectionId": "bf7fbece-4fdf-466a-b041-2d830debc844",
"isAutomated": false,
"globalId": 264511,
"duration": 300,
"attributes": {
"1be40893-5dad-4b37-b70d-b830c4bd273f": "0fda6bb8-4fc9-4463-9d26-af2d503cb19c",
"f4b408ae-5418-4a8d-99d9-4a67cb34870b": "fa000fb2-375d-4eb5-901c-fb5df30785ad"
},
"createdById": "995b1f08-cc65-409c-aa1c-a16c82dabf1d",
"modifiedById": "995b1f08-cc65-409c-aa1c-a16c82dabf1d",
"createdDate": "2022-10-12T00:22:43.544Z",
"modifiedDate": "2022-10-12T00:22:43.544Z",
"state": "NeedsWork",
"priority": "Medium",
"isDeleted": false,
"tagNames": [
"master"
],
"iterations": []
},
{
"id": "ec423701-f2a8-4667-8459-939a6e079941",
"versionId": "0dfe176e-b172-47ae-8049-e6974086d497",
"name": "[iOS] СБПэй фичатоглы. Fts.SBPay.Settings выключен Fts.C2B.Settings.Subscriptions включен",
"entityTypeName": "TestCases",
"projectId": "6dfe2ace-dd40-4e36-b66e-4a655a855a2f",
"sectionId": "8626c9f5-a5aa-4584-bbca-e9cd60369a5e",
"isAutomated": false,
"globalId": 402437,
"duration": 300,
"attributes": {
"1be40893-5dad-4b37-b70d-b830c4bd273f": "b52bfc88-9b13-41e1-8b4c-098ebfa673e0",
"240b7589-9461-44dc-8b13-361132877c50": "cfd99bad-fb3f-43fe-be8a-cb745f2d4c78",
"6639eb1a-1335-44ec-ba8b-c3c52bff9e79": "ed3bc553-e873-472f-8dc1-7f2720ad457d",
"9ae36ef5-ca0e-4273-bb39-aedf289a119d": "6687017f-138b-4d75-91bd-c6465f1f5331",
"b862c3ee-55eb-486f-8125-a7a034d69340": "IBANK5-37207",
"f4b408ae-5418-4a8d-99d9-4a67cb34870b": "36dc55ac-359c-4312-9b1a-646ad5fd5aa9"
},
"createdById": "11a30c8b-73e2-4233-bbf5-7cc41556d3e0",
"modifiedById": "11a30c8b-73e2-4233-bbf5-7cc41556d3e0",
"createdDate": "2022-11-01T12:05:56.821Z",
"modifiedDate": "2022-11-02T14:16:55.246Z",
"state": "Ready",
"priority": "Medium",
"isDeleted": false,
"tagNames": [],
"iterations": []
}
]
I tried using
cat new2.xml | jq '.' | jq '.[] | select(."1be40893-5dad-4b37-b70d-b830c4bd273f" | index("0fda6bb8-4fc9-4463-9d26-af2d503cb19c")) | .[] .id'
but the search returns nothing
You could select on .attributes[] and display the id field only:
jq '.[] | select(.attributes[] == "0fda6bb8-4fc9-4463-9d26-af2d503cb19c").id'
Output:
"c3b1516d-5b2c-4838-b5eb-77d94d634832"
With the input given, you'd get the same result with the more specific:
jq '.[] | select(.attributes["1be40893-5dad-4b37-b70d-b830c4bd273f"] == "0fda6bb8-4fc9-4463-9d26-af2d503cb19c").id'
(because there's only one attribute Key with the Value "0fda6bb8-4fc9-4463-9d26-af2d503cb19c" in your example)
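One caveat with select(.attributes[] == ...) is that it emits the object once per matching attribute, so an object with two matching values would print its id twice. A hedged sketch of a variant that uses any/2 to emit each object at most once, and passes the search value in with --arg (the variable name v is just an illustrative choice), run against a trimmed copy of the question's input:

```shell
#!/bin/sh
# Trimmed copy of the question's input: only "id" and part of "attributes".
json='[
  {"id":"c3b1516d-5b2c-4838-b5eb-77d94d634832",
   "attributes":{"1be40893-5dad-4b37-b70d-b830c4bd273f":"0fda6bb8-4fc9-4463-9d26-af2d503cb19c",
                 "f4b408ae-5418-4a8d-99d9-4a67cb34870b":"fa000fb2-375d-4eb5-901c-fb5df30785ad"}},
  {"id":"ec423701-f2a8-4667-8459-939a6e079941",
   "attributes":{"1be40893-5dad-4b37-b70d-b830c4bd273f":"b52bfc88-9b13-41e1-8b4c-098ebfa673e0"}}
]'

# any(generator; condition) is true as soon as one attribute value matches,
# so each matching object is selected exactly once.
ids=$(printf '%s' "$json" | jq -r \
  --arg v "0fda6bb8-4fc9-4463-9d26-af2d503cb19c" \
  '.[] | select(any(.attributes[]; . == $v)) | .id')

printf '%s\n' "$ids"
```

Passing the value with --arg also keeps shell quoting out of the jq program, which helps when the value comes from a script variable.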

Select an object only if it has a different element from the previous one with jq

The goal is to print the object only if a value of it is different from the previous object.
Imagine this json:
[
{
"date": "10-03-20",
"value": 3
},
{
"date": "11-03-20",
"value": 3
},
{
"date": "12-03-20",
"value": 3
},
{
"date": "13-03-20",
"value": 8
},
{
"date": "14-03-20",
"value": 8
},
{
"date": "15-03-20",
"value": 5
}
]
The expected output should be:
[
{
"date": "10-03-20",
"value": 3
},
{
"date": "13-03-20",
"value": 8
},
{
"date": "15-03-20",
"value": 5
}
]
Only print an object if its value differs from that of the previous object.
This is what I started with in jq:
[foreach .[] as $row (null;
select($row.value != $lastValue) | $row.value
$lastValue = $row.value
)]
Of course this doesn't work, since I can't use a variable in jq like this, and I don't know whether this is the right approach.
Using reduce here is probably easier than using foreach:
reduce .[] as $x (null;
if . == null then [$x]
elif .[-1].value == $x.value then .
else . + [$x] end)
Solution using foreach
For reference, here's a solution using foreach. The key idea is to use foreach's state variable to store the "local variables" that you might use in a different language.
[ foreach .[] as $x ({};
if .prev == $x.value then .emit = null
else {prev: $x.value, emit: $x}
end;
select(.emit).emit ) ]
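For comparison, here is a sketch of a third approach that is not taken from the answers above: compare each element with its predecessor by index, which needs no accumulated state at all. It is run against the question's input.

```shell
#!/bin/sh
# The question's input, compacted.
json='[
  {"date":"10-03-20","value":3},
  {"date":"11-03-20","value":3},
  {"date":"12-03-20","value":3},
  {"date":"13-03-20","value":8},
  {"date":"14-03-20","value":8},
  {"date":"15-03-20","value":5}
]'

# Keep element $i when it is the first one, or when its value differs
# from the value of element $i - 1. An empty input array yields [].
out=$(printf '%s' "$json" | jq -c '
  [ range(0; length) as $i
    | select($i == 0 or .[$i].value != .[$i - 1].value)
    | .[$i] ]')

printf '%s\n' "$out"
```

The index-based form reads closest to the problem statement, while the reduce and foreach versions avoid repeated indexing and so may be preferable for very large arrays.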

How do I parse nested JSON with JQ into CSV-aggregated output?

I have a question that is an extension/followup to a previous question I've asked:
How do I concatenate dummy values in JQ based on field value, and then CSV-aggregate these concatenations?
In my bash script, when I run the following jq against my curl result:
curl -u someKey:someSecret someURL 2>/dev/null | jq -r '.schema' | jq -r -c '.fields'
I get back a JSON array as follows:
[
{"name":"id", "type":"int"},
{
"name": "agents",
"type": {
"type": "array",
"items": {
"name": "carSalesAgents",
"type": "record",
"fields": [
{
"name": "agentName",
"type": ["string", "null"],
"default": null
},
{
"name": "agentEmail",
"type": ["string", "null"],
"default": null
},
{
"name": "agentPhones",
"type": {
"type": "array",
"items": {
"name": "SalesAgentPhone",
"type": "record",
"fields": [
{
"name": "phoneNumber",
"type": "string"
}
]
}
},
"default": []
}
]
}
},
"default": []
},
{"name":"description","type":"string"}
]
Note: line breaks and indentation added here for ease of reading. This is all in reality a single blob of text.
My goal is to do a call with jq applied to return the following, given the example above (again lines and spaces added for readability, but only need to return valid JSON blob):
{
"id":1234567890,
"agents": [
{
"agentName": "xxxxxxxxxx",
"agentEmail": "xxxxxxxxxx",
"agentPhones": [
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
}
]
},
{
"agentName": "xxxxxxxxxx",
"agentEmail": "xxxxxxxxxx",
"agentPhones": [
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
},
{
"phoneNumber": "xxxxxxxxxx"
}
]
}
],
"description":"xxxxxxxxxx"
}
To summarise, I am trying to automatically generate templated values that match the "schema" JSON shown above.
So just to clarify, the values for "name" (including their surrounding double-quotes) are concatenated with either:
:1234567890 ...when the "type" for that object is "int"
":xxxxxxxxxx" ...when the "type" for that object is "string"
...and when type is "array" or "record" the appropriate enclosures are added {} or [] with the nested content inside.
If it's an array of records, generate TWO records for the output.
The approach I have started down to cater for parsing nested content like this is to have a series of if-then-else's for every combination of each possible jq type.
But this is fast becoming very hard to manage and painful. From my initial scratch efforts...
echo '[{"name":"id","type":"int"},{"name":"test_string","type":"string"},{"name":"string3ish","type":["string","null"],"default":null}]' | jq -c 'map({(.name): (if .type == "int" then 1234567890 else (if .type == "string" then "xxxxxxxxxx" else (if .type|type == "array" then "xxARRAYxx" else "xxUNKNOWNxx" end) end) end)})|add'
I was wondering if anyone knew of a smarter way to do this in bash/shell with JQ.
PS: I have found alternate solutions for such parsing using Java and Python modules, but JQ is preferable for a unique case of limitations around portability. :)
Thanks!
jq supports functions. Those functions can recurse.
#!/usr/bin/env jq -f
# Ignore all but the first type, in the case of "type": ["string", "null"]
def takeFirstTypeFromArray:
if (.type | type) == "array" then
.type = .type[0]
else
.
end;
def sampleData:
takeFirstTypeFromArray |
if .type == "int" then
1234567890
elif .type == "string" then
"xxxxxxxxxx"
elif .type == "array" then # generate two entries for any test array
[(.items | sampleData), (.items | sampleData)]
elif .type == "record" then
(.fields | map({(.name): sampleData}) | add)
elif (.type | type) == "array" then
(.type[] | sampleData)
elif (.type | type) == "object" then
(.type | sampleData)
else
["UNKNOWN", .]
end;
map({(.name): sampleData}) | add
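To see the recursion in action, here is a hedged usage sketch that inlines a condensed copy of the program above (the unreachable second array branch is dropped) and runs it on a small hypothetical schema. The field names tags, tag and label are made up for illustration and do not come from the question.

```shell
#!/bin/sh
# Hypothetical mini-schema: one int field and one array-of-records field.
schema='[
  {"name":"id","type":"int"},
  {"name":"tags",
   "type":{"type":"array",
           "items":{"name":"tag","type":"record",
                    "fields":[{"name":"label","type":["string","null"],"default":null}]}},
   "default":[]}
]'

# Condensed copy of the answer's functions; sampleData calls itself
# for array items, record fields and nested type objects.
filter='
def takeFirstTypeFromArray:
  if (.type | type) == "array" then .type = .type[0] else . end;
def sampleData:
  takeFirstTypeFromArray
  | if .type == "int" then 1234567890
    elif .type == "string" then "xxxxxxxxxx"
    elif .type == "array" then [(.items | sampleData), (.items | sampleData)]
    elif .type == "record" then (.fields | map({(.name): sampleData}) | add)
    elif (.type | type) == "object" then (.type | sampleData)
    else ["UNKNOWN", .]
    end;
map({(.name): sampleData}) | add
'

out=$(printf '%s' "$schema" | jq -c "$filter")
printf '%s\n' "$out"
```

In a real script the program would more likely live in its own file and be run with jq -f, as the shebang line in the answer suggests.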

Nested Filtering json file with jq statement

I have a JSON file in which each object has the following structure:
{
"id": 2400321267,
"data": {
"q": "quinoa black bean and shrimp r",
"r": "quinoa black bean and shrimps r",
"s": "3"
},
"job_id": 1413792,
"results": {
"judgments": [
{
"id": 5022700047,
"unit_state": "good",
"data": {
"rewrite_quality": "1"
}
}
]
}
},
{
"id": 2400321267,
"data": {
"q": "quinoa black bean and shrimp r",
"r": "quinoa black bean and shrimps r",
"s": "3"
},
"job_id": 1413792,
"results": {
"judgments": [
{
"id": 5022700047,
"unit_state": "good",
"data": {
"rewrite_quality": "2"
}
}
]
}
}
I tried the command jq '.[] | select(any(.Tags[]; .rewrite_quality == "1"))' | less to check whether the output is correct, but I don't see any output.
I want the output to have only entries with rewrite_quality == '1', in this case only the first entry.
Reading between the lines, it would appear that the following filter should achieve the stated goals:
.[]
| select( .results | any(.judgments[]; .data.rewrite_quality == "1"))
"Tags"
If the intent in using ".Tags" was to indicate that it does not matter what path leads to .rewrite_quality, then the filter to use would be:
.[]
| select( any(.. | objects | .rewrite_quality == "1"))
Alternative to using less
If you want a brief indication of whether there are any matches, you could use this filter, which has the added value of revealing how many objects satisfy the criterion:
map(select(any(.. | objects | .rewrite_quality == "1"))) | length
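A minimal sketch of the first filter above, assuming the file is a valid JSON array (objects wrapped in [ ], no trailing commas), since .[] needs an array or object to iterate. The ids 1 and 2 are hypothetical stand-ins so the two entries can be told apart in the output.

```shell
#!/bin/sh
# Two entries modeled on the question's structure; ids are hypothetical.
json='[
  {"id":1,"results":{"judgments":[{"unit_state":"good","data":{"rewrite_quality":"1"}}]}},
  {"id":2,"results":{"judgments":[{"unit_state":"good","data":{"rewrite_quality":"2"}}]}}
]'

# Select entries where at least one judgment has rewrite_quality "1",
# then print only their ids.
ids=$(printf '%s' "$json" | jq -r '
  .[]
  | select(.results | any(.judgments[]; .data.rewrite_quality == "1"))
  | .id')

printf '%s\n' "$ids"
```

Note that the comparison is against the string "1", matching how the question's data stores the value; comparing against the number 1 would select nothing.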

Find fields which contains a text and replace it with another text

In the JSON example below, how can I find all the field names that contain the string "Choice" and replace that part with another string, for example "Grade"?
So in the example below, every field named "***Choice" should be renamed to "***Grade".
I have pasted the expected output below. Given that I don't know how many fields will have the string "Choice", I don't want to simply do [$in ~> | ** [firstChoice] | {"firstGrade": firstChoice}, ["firstChoice"] | ;] which is a straight find and replace.
{
"data": {
"resourceType": "Bundle",
"id": "e919c820-71b9-4e4b-a1c8-c2fef62ea911",
"firstChoice": "xxx",
"type": "collection",
"entry": [
{
"resource": {
"resourceType": "Condition",
"id": "SMART-Condition-342",
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "38341003",
"display": "Essential hypertension",
"firstChoice": "xxx"
}
],
"text": "Essential hypertension"
},
"clinicalStatus": "active",
"secondChoice": "xxx"
},
"search": {
"mode": "match"
}
}
]
}
}
Expected output
{
"data": {
"resourceType": "Bundle",
"id": "e919c820-71b9-4e4b-a1c8-c2fef62ea911",
"firstGrade": "xxx",
"type": "collection",
"entry": [
{
"resource": {
"resourceType": "Condition",
"id": "SMART-Condition-342",
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "38341003",
"display": "Essential hypertension",
"firstGrade": "xxx"
}
],
"text": "Essential hypertension"
},
"clinicalStatus": "active",
"secondGrade": "xxx"
},
"search": {
"mode": "match"
}
}
]
}
}
There might be simpler ways, but this is an expression I came up with in JSONata:
(
$prefixes := $keys(**)[$ ~> /Choice$/].$substringBefore('Choice');
$reduce($prefixes, function($acc, $prefix) {(
$choice := $prefix & "Choice";
$acc ~> | ** [$lookup($choice)] | {$prefix & "Grade": $lookup($choice)}, [$choice] |
)}, $$)
)
It looks terrible, but I'll explain how I built it up anyway.
You started with the expression
$ ~> | ** [firstChoice] | {"firstGrade": firstChoice}, ["firstChoice"] |
which is fine if you only want to replace one choice, and you know the full name. If you want to replace more than one, then you can chain these together as follows:
$ ~> | ** [firstChoice] | {"firstGrade": firstChoice}, ["firstChoice"] |
~> | ** [secondChoice] | {"secondGrade": secondChoice}, ["secondChoice"] |
~> | ** [thirdChoice] | {"thirdGrade": thirdChoice}, ["thirdChoice"] |
At this point, you could create a higher-order function that takes the choice prefix and returns a partial substitution (note that the |...|...| syntax generates a function). Then you can chain these together for an array of prefixes using the built in $reduce() higher-order function. So you get something like this:
(
$prefixes := ["first", "second", "third"];
$reduce($prefixes, function($acc, $prefix) {(
$choice := $prefix & "Choice";
$acc ~> | ** [$lookup($choice)] | {$prefix & "Grade": $lookup($choice)}, [$choice] |
)}, $$)
)
But if you don't know the set of prefixes up front, and want to select, say, all property names that end in 'Choice', then the following expression will get you that:
$prefixes := $keys(**)[$ ~> /Choice$/].$substringBefore('Choice')
This leads to my final expression. You can experiment with it on your data in the JSONata Exerciser.
I know it was 2019, but here's an alternative solution to help future readers.
$replace($string(),/(first|second|third)Choice/,"$1Grade")~>$eval()
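For readers who have jq rather than JSONata at hand, the same key renaming can be sketched in jq. This is an alternative technique, not a translation of the JSONata answers, and it assumes jq 1.6 or later, where walk is built in. The input is a trimmed copy of the question's data.

```shell
#!/bin/sh
# Trimmed copy of the question's input, keeping the three "...Choice" keys.
json='{"data":{"firstChoice":"xxx","type":"collection",
  "entry":[{"resource":{
    "code":{"coding":[{"code":"38341003","firstChoice":"xxx"}]},
    "secondChoice":"xxx"}}]}}'

# walk visits every node; for each object, rename keys ending in "Choice"
# by swapping that suffix for "Grade". Values are left untouched.
out=$(printf '%s' "$json" | jq -c '
  walk(if type == "object"
       then with_entries(.key |= (if endswith("Choice")
                                  then rtrimstr("Choice") + "Grade"
                                  else . end))
       else . end)')

printf '%s\n' "$out"
```

Unlike a textual replace over the serialized document, this only touches key names, so a value that happens to contain "Choice" would be left as it is.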
