Parsing JSON file-jq [duplicate] - bash

This question already has answers here:
jq not working on tag name with dashes and numbers
(2 answers)
Closed 4 years ago.
Whole file: https://1drv.ms/u/s!AizscpxS0QM4hJpEkp12VPHiKO_gBg
Using this command I get the part below (the latest job):
jq '.|[ .executions[] | select(.job.name != null) | select(.job.name) ]
| sort_by(.id)
| reverse
| .[0] ' 1.json
{
"argstring": null,
"date-ended": {
"date": "2018-04-03T17:43:38Z",
"unixtime": 1522777418397
},
"date-started": {
"date": "2018-04-03T17:43:34Z",
"unixtime": 1522777414646
},
"description": "",
"executionType": "user",
"failedNodes": [
"172.30.61.88"
],
"href": "http://172.30.61.88:4440/api/21/execution/126",
"id": 126,
"job": {
"averageDuration": 4197,
"description": "",
"group": "",
"href": "http://172.30.61.88:4440/api/21/job/271cbcec-5042-4d52-b794-ede2056b2ab8",
"id": "271cbcec-5042-4d52-b794-ede2056b2ab8",
"name": "aa",
"permalink": "http://172.30.61.88:4440/project/demo/job/show/271cbcec-5042-4d52-b794-ede2056b2ab8",
"project": "demo"
},
"permalink": "http://172.30.61.88:4440/project/demo/execution/show/126",
"project": "demo",
"status": "failed",
"user": "administrator"
I managed to extract job name and status, now want to get date-ended.date ?
jq '.|[ .executions[] |select(.job.name != null) | select(.job.name) ]
| sort_by(.id)
| reverse
| .[0]
| "\(.status), \(.job.name)"' 1.json

With the "-r" command-line option, the following filter:
[.executions[] | select(.job.name != null)]
| sort_by(.id)
| reverse
| .[0]
| [.status, .job.name, ."date-ended".date]
| @csv
produces:
"failed","aa","2018-04-03T17:43:38Z"
An important point that you might have missed is that "-" is a "special" character in that it can signify negation or subtraction.
If your jq does not support the syntax ."date-ended".date, then you could fall back to the basic syntax: (.["date-ended"] | .date)

I guess you are having trouble extracting .date-ended.date because the name contains a dash, which jq interprets as subtraction.
The solution is listed in the documentation:
If the key contains special characters, you need to surround it with double quotes like this: ."foo$", or else .["foo$"].
This means the last filter of your jq program should be:
"\(.status), \(.job.name), \(."date-ended".date)"


Merge two array of objects with common key using jq command

I have two datasets:
data1='[
{ "bookings": 2984, "timestamp": 1675854900 },
{ "bookings": 2967, "timestamp": 1675855200 }
]'
data2='[
{ "errors": 51, "timestamp": 1675854900 },
{ "errors": 90, "timestamp": 1675855200 }
]'
I want the output to be:
combined='[
{ "errors": 51, bookings: 2984, "timestamp": 1675854900 },
{ "errors": 90, bookings: 2967, "timestamp": 1675855200 }
]'
Can this be achieved by shell scripting and jq command?
Assume that timestamp will always be present and will always have a common value across the two datasets. Even the order is the same.
This last paragraph just caught my attention:
Assume that timestamp will always be present and will always have a common value across the two datasets. Even the order is the same.
If this is truly the case, then it is reasonable to assume that both arrays have the same length and that their items are aligned pairwise. Thus, there is no need to build a hash-based INDEX, since the items can already be accessed in constant time by their numeric keys (their positions within the arrays).
jq -n --argjson data1 "$data1" --argjson data2 "$data2" '
$data1 | [keys[] | $data2[.] + $data1[.]]
'
[
{
"errors": 51,
"timestamp": 1675854900,
"bookings": 2984
},
{
"errors": 90,
"timestamp": 1675855200,
"bookings": 2967
}
]
A simple JOIN operation could do:
jq -n --argjson data1 "$data1" --argjson data2 "$data2" '
[JOIN(INDEX($data1[]; .timestamp); $data2[]; .timestamp | @text; add)]
'
[
{
"errors": 51,
"timestamp": 1675854900,
"bookings": 2984
},
{
"errors": 90,
"timestamp": 1675855200,
"bookings": 2967
}
]
I'm getting this error: jq: error: JOIN/4 is not defined at <top-level>, line 2: [JOIN(INDEX($data1[]; .timestamp); $data2[]; .timestamp | @text; add)] jq: 1 compile error
You are probably using an older version of jq. JOIN and INDEX were introduced in jq 1.6. Either define them yourself by taking their definitions from the source, or take those definitions and modify them to fit your specific use case (both approaches work well with jq 1.5).
Definitions from source:
jq -n --argjson data1 "$data1" --argjson data2 "$data2" '
def INDEX(stream; idx_expr):
reduce stream as $row ({}; .[$row | idx_expr | tostring] = $row);
def JOIN($idx; stream; idx_expr; join_expr):
stream | [., $idx[idx_expr]] | join_expr;
[JOIN(INDEX($data1[]; .timestamp); $data2[]; .timestamp | @text; add)]
'
Adapted to your use case:
jq -n --argjson data1 "$data1" --argjson data2 "$data2" '
($data1 | with_entries(.key = (.value.timestamp | @text))) as $ix
| $data2 | map(. + $ix[.timestamp | @text])
'
In general, if you find JOIN a bit tricky to understand or use, then consider using INDEX for this type of problem. In the present case, you could get away with a trivially simple approach, e.g.:
jq -n --argjson data1 "$data1" --argjson data2 "$data2" '
INDEX($data1[]; .timestamp) as $dict
| $data2 | map(. + $dict[.timestamp | tostring])
'
Another way to do this is to build a map from timestamps to error counts, and perform a lookup in it.
jq -n '
input as $data1
| input as $data2
| ($data2
| map({ "key": (.timestamp | tostring), "value": .errors })
| from_entries
) as $errors_by_timestamp
| $data1 | map(.errors = $errors_by_timestamp[(.timestamp | tostring)])
' <<<"$data1 $data2"
By the way, I had been trying to get an answer out of an AI since this morning, and this time it finally gave me a correct solution:
#!/bin/bash
data1='[
{ "bookings": 2984, "timestamp": 1675854900 },
{ "bookings": 2967, "timestamp": 1675855200 }
]'
data2='[
{ "errors": 51, "timestamp": 1675854900 },
{ "errors": 90, "timestamp": 1675855200 }
]'
combined=$(jq -n --argjson d1 "$data1" --argjson d2 "$data2" '
# concatenate both arrays, group the items that share a timestamp, and merge each group into one object
[ $d1, $d2 ] | add | group_by(.timestamp) | map(
reduce .[] as $i ({}; . * $i)
)
')
echo "$combined"
Just pasting it here for you guys in case you didn't think of this method

nested Bash for loop not iterating correctly? [duplicate]

This question already has answers here:
bash script loop through two variables in lock step
(2 answers)
Closed 4 months ago.
I am using a curl command to get JSON data from an application.
The example curl response:
{
  "count": 2,
  "value": [
    {
      "id": 344,
      "name": "ip-172-20-94-68",
      "status": "offline"
    },
    {
      "id": 345,
      "name": "ip-172-20-95-119",
      "status": "offline"
    }
  ]
}
My bash script
ipAddresses=$(curl -s -X GET "$hostUrl" -u :"$EncodedPAT" | jq -r ".value[] | select(.status==\"offline\").name")
ids=$(curl -s -X GET "$hostUrl" -u :"$EncodedPAT" | jq -r ".value[] | select(.status==\"offline\").id")
for ipAddress in "${ipAddresses[@]}"
do
for id in "${ids[@]}"
do
echo "ipAddress: ${ipAddress} id: ${id}"
done
done
output
ipAddress: ip-172-20-94-68
ip-172-20-95-119
ip-172-20-95-113
ip-172-20-94-96
ip-172-20-94-86
id: 344
345
346
348
350
So it looks like it only goes through the outer loop once and never really iterates the inner loop either (it just uses the first element of each).
expected output
ipAddress: ip-172-20-94-68 id: 344
ipAddress: ip-172-20-95-119 id: 345
ipAddress: ip-172-20-95-113 id: 346
ipAddress: ip-172-20-94-96 id: 348
ipAddress: ip-172-20-94-86 id: 350
I've looked elsewhere on the net but am having trouble finding anything relevant.
Any guesses as to what might be wrong?
You can combine this into a single jq call:
.value[] | select(.status == "offline") | "ipAddress: \(.name) id: \(.id)"
Gives:
"ipAddress: ip-172-20-94-68 id: 344"
"ipAddress: ip-172-20-95-119 id: 345"
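The underlying problem with the original loops is that $(...) assigns a single multi-line string, not an array, so the [@] expansion yields exactly one element per variable. If you do need the two-variable form, here is a sketch that keeps both values aligned (assuming bash 4+ for mapfile, and reusing the endpoint and filters from the question):
#!/bin/bash
# Fetch once, then read each jq result into a real bash array, one element per line
response=$(curl -s -X GET "$hostUrl" -u :"$EncodedPAT")
mapfile -t ipAddresses < <(jq -r '.value[] | select(.status=="offline").name' <<<"$response")
mapfile -t ids < <(jq -r '.value[] | select(.status=="offline").id' <<<"$response")

# Iterate both arrays in lock step by index
for i in "${!ipAddresses[@]}"; do
  echo "ipAddress: ${ipAddresses[$i]} id: ${ids[$i]}"
done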

How to get the content of a JSON without showing the names of the key values

I am new to Bash and I am currently trying to get the content of a JSON file without showing the key names.
This is what the JSON looks like (part of it):
[
{
"V1": 65,
"V2": "Female",
"V3": 0.7,
"V4": 0.1,
"V5": 187,
"V6": 16,
"V7": 18,
"V8": 6.8,
"V9": 3.3,
"V10": 0.9,
"Class": 1
},
{
"V1": 62,
"V2": "Male",
"V3": 10.9,
"V4": 5.5,
"V5": 699,
"V6": 64,
"V7": 100,
"V8": 7.5,
"V9": 3.2,
"V10": 0.74,
"Class": 1
},
{
"V1": 62,
"V2": "Male",
"V3": 7.3,
"V4": 4.1,
"V5": 490,
"V6": 60,
"V7": 68,
"V8": 7,
"V9": 3.3,
"V10": 0.89,
"Class": 1
}
]
This is my script
#!/bin/bash
echo "Albumin =3";
echo "Age Sex Albumin Proteins";
echo "******";
echo " "
echo "Women";
echo "--------------";
cat csvjson.json | jq -c '.[] | {V1, V2, V8, V9} | select(.V9 ==3) | select(.V2 =="Female")';
echo " "
echo "Men";
echo "-------------";
cat csvjson.json | jq -c '.[] | {V1, V2, V8, V9} | select(.V9 ==3) | select(.V2 =="Male")';
This is what the script shows
Women
--------------
{"V1":38,"V2":"Female","V8":5.6,"V9":3}
{"V1":38,"V2":"Female","V8":5.6,"V9":3}
{"V1":32,"V2":"Female","V8":6,"V9":3}
{"V1":31,"V2":"Female","V8":6,"V9":3}
{"V1":19,"V2":"Female","V8":5.5,"V9":3}
{"V1":38,"V2":"Female","V8":7,"V9":3}
{"V1":20,"V2":"Female","V8":6.1,"V9":3}
{"V1":32,"V2":"Female","V8":7,"V9":3}
{"V1":42,"V2":"Female","V8":6.7,"V9":3}
Men
-------------
{"V1":72,"V2":"Male","V8":7.4,"V9":3}
{"V1":60,"V2":"Male","V8":6.3,"V9":3}
{"V1":33,"V2":"Male","V8":5.4,"V9":3}
{"V1":60,"V2":"Male","V8":6.8,"V9":3}
{"V1":60,"V2":"Male","V8":7.4,"V9":3}
{"V1":60,"V2":"Male","V8":7,"V9":3}
{"V1":72,"V2":"Male","V8":6.2,"V9":3}
And this is what I want to show
Women
--------------
38,Female,3, 5.6
38,Female,3, 5.6
32,Female,3, 6
31,Female,3, 6
19,Female,3, 5.5
38,Female,3, 7
20,Female,3, 6.1
32,Female,3, 7
42,Female,3, 6.7
Men
--------------
72,Male,3, 7.4
60,Male,3, 6.3
33,Male,3, 5.4
60,Male,3, 6.8
60,Male,3, 7.4
60,Male,3, 7
72,Male,3, 6.2
So, how can I hide the key names and only show the values of the JSON after applying the filters I did?
This can be accomplished entirely within jq (although some constraints are not entirely clear, so please comment and I will update the code):
jq --raw-output '
group_by(.V2)[]
| if first.V2 == "Male" then "Men" else "Women" end,
"--------------",
(
.[]
| select(.V9 == 3.3) # this filters to matching records
| [.V1, .V2, .V9, .V8]
| join(",")
)
' csvjson.json
A stand-alone jq script version of pmf's answer above, with a shebang line so it can be run directly:
#!/usr/bin/env -S jq --raw-output --from-file
group_by(.V2)[]
| if first.V2 == "Male" then "Men" else "Women" end,
"--------------",
(
.[]
| select(.V9 == 3.3) # this filters to matching records
| [.V1, .V2, .V9, .V8]
| join(",")
)
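To run the stand-alone version, save it to a file (report.jq is just an example name), make it executable, and pass the JSON file as the argument; note that the -S flag of env used in the shebang requires an env that supports it (e.g. GNU coreutils 8.30+):
chmod +x report.jq
./report.jq csvjson.json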

How can I pass a variable in aws cli command in bash?

I am unable to pass a variable to the tag-user CLI command.
A=$(aws iam list-user-tags --user-name user --query 'Tags[].{Key:Key,Value:Value}' | grep -B2 "Description" | grep Value | awk -F ":" '{print $2}' | tr -d '",'| awk '$1=$1')
aws iam list-user-tags --user-name user --query 'Tags[].{Key:Key,Value:Value}' | grep -B2 "Description" | grep Value
"Value": "Used for SSO",
A=Used for SSO
Passing the value of A to the CLI command below:
aws iam tag-user --user-name azure-sso-user --tags "[{"Key": "own:team","Value": "test@test.com"},{"Key": "security","Value": "Service"},{"Key": "comment","Value": "$A"}]"
This is the error I get:
Error parsing parameter '--tags': Invalid JSON:
[{Key: own:team,Value: test@test.com},{Key: security,Value: Service},{Key: own:comment,Value: Used
This worked:
aws iam tag-user --user-name user --tags '[{"Key": "own:team","Value": "test@test.com"},{"Key": "security","Value": "Service"},{"Key": "own:comment","Value": "'"$A"'"}]'
That is, using the following, where the single-quoted shell string is closed just before $A, the variable is expanded inside double quotes, and the single-quoted string is then reopened:
[
  {
    "Key": "own:team",
    "Value": "test@test.com"
  },
  {
    "Key": "security",
    "Value": "Service"
  },
  {
    "Key": "own:comment",
    "Value": "'"$A"'"
  }
]
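An alternative that avoids the quoting gymnastics (a sketch, not part of the original answer) is to let jq build the tag list, so the variable is JSON-escaped automatically:
aws iam tag-user --user-name user --tags "$(jq -n --arg comment "$A" '
  [
    {Key: "own:team", Value: "test@test.com"},
    {Key: "security", Value: "Service"},
    {Key: "own:comment", Value: $comment}
  ]')"
Here --arg passes $A into jq as the variable $comment, and jq takes care of any characters that would otherwise break the JSON.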

Elasticsearch query time boosting produces result in inadequate order

The ES search results for the search keywords one two three seem to be wrong after applying a boost per keyword. Please help me modify my "faulty" query so that it produces the "expected result" described below. I'm on ES 1.7.4 with Lucene 4.10.4.
Boosting criteria (three is regarded as the most important keyword):
word    boost
-----   -----
one     1
two     2
three   3
ES index content - just showing MySQL dump to make the post shorter
mysql> SELECT id, title FROM post;
+----+-------------------+
| id | title |
+----+-------------------+
| 1 | one |
| 2 | two |
| 3 | three |
| 4 | one two |
| 5 | one three |
| 6 | one two three |
| 7 | two three |
| 8 | none |
| 9 | one abc |
| 10 | two abc |
| 11 | three abc |
| 12 | one two abc |
| 13 | one two three abc |
| 14 | two three abc |
+----+-------------------+
14 rows in set (0.00 sec)
Expected ES query result - The user is searching for one two three. I'm not fussed about the order of equally scored records; I mean, if records 6 and 13 switch places, I don't mind.
+----+-------------------+
| id | title | my scores for demonstration purposes
+----+-------------------+
| 6 | one two three | (1+2+3 = 6)
| 13 | one two three abc | (1+2+3 = 6)
| 7 | two three | (2+3 = 5)
| 14 | two three abc | (2+3 = 5)
| 5 | one three | (1+3 = 4)
| 4 | one two | (1+2 = 3)
| 12 | one two abc | (1+2 = 3)
| 3 | three | (3 = 3)
| 11 | three abc | (3 = 3)
| 2 | two | (2 = 2)
| 10 | two abc | (2 = 2)
| 1 | one | (1 = 1)
| 9 | one abc | (1 = 1)
| 8 | none | <- This shouldn't appear
+----+-------------------+
14 rows in set (0.00 sec)
Unexpected ES query result - Unfortunately, this is what I get.
+----+-------------------+
| id | title | _score
+----+-------------------+
| 6 | one two three | 1.0013864
| 13 | one two three abc | 1.0013864
| 4 | one two | 0.57794875
| 3 | three | 0.5310148
| 7 | two three | 0.50929534
| 5 | one three | 0.503356
| 14 | two three abc | 0.4074363
| 11 | three abc | 0.36586377
| 12 | one two abc | 0.30806428
| 10 | two abc | 0.23231897
| 2 | two | 0.12812772
| 1 | one | 0.084527075
| 9 | one abc | 0.07408653
+----+-------------------+
ES query
curl -XPOST "http://127.0.0.1:9200/_search?post_dev" -d'
{
"query": {
"bool": {
"must": {
"match": {
"title": {
"query": "one two three"
}
}
},
"should": [
{
"match": {
"title": {
"query": "one",
"boost": 1
}
}
},
{
"match": {
"title": {
"query": "two",
"boost": 2
}
}
},
{
"match": {
"title": {
"query": "three",
"boost": 3
}
}
}
]
}
},
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"from": "0",
"size": "100"
}'
Some more test queries:
This query doesn't produce any result.
This query doesn't order correctly, as seen here.
# Index some test data
curl -XPUT "localhost:9200/test/doc/1" -d '{"title": "one"}'
curl -XPUT "localhost:9200/test/doc/2" -d '{"title": "two"}'
curl -XPUT "localhost:9200/test/doc/3" -d '{"title": "three"}'
curl -XPUT "localhost:9200/test/doc/4" -d '{"title": "one two"}'
curl -XPUT "localhost:9200/test/doc/5" -d '{"title": "one three"}'
curl -XPUT "localhost:9200/test/doc/6" -d '{"title": "one two three"}'
curl -XPUT "localhost:9200/test/doc/7" -d '{"title": "two three"}'
curl -XPUT "localhost:9200/test/doc/8" -d '{"title": "none"}'
curl -XPUT "localhost:9200/test/doc/9" -d '{"title": "one abc"}'
curl -XPUT "localhost:9200/test/doc/10" -d '{"title": "two abc"}'
curl -XPUT "localhost:9200/test/doc/11" -d '{"title": "three abc"}'
curl -XPUT "localhost:9200/test/doc/12" -d '{"title": "one two abc"}'
curl -XPUT "localhost:9200/test/doc/13" -d '{"title": "one two three abc"}'
curl -XPUT "localhost:9200/test/doc/14" -d '{"title": "two three abc"}'
# Make test data available for search
curl -XPOST "localhost:9200/test/_refresh?pretty"
# Search using function_score: "score_mode": "sum" adds the weights of the matching filters (1+2+3),
# and "boost_mode": "replace" discards the query's relevance score so the final score is just that sum
curl -XPOST "localhost:9200/test/doc/_search?pretty" -d'{
"query": {
"function_score": {
"query": {
"match": {
"title": "one two three"
}
},
"functions": [
{
"filter": {
"query": {
"match": {
"title": "one"
}
}
},
"weight": 1
},
{
"filter": {
"query": {
"match": {
"title": "two"
}
}
},
"weight": 2
},
{
"filter": {
"query": {
"match": {
"title": "three"
}
}
},
"weight": 3
}
],
"score_mode": "sum",
"boost_mode": "replace"
}
},
"sort": [
{
"_score": {
"order": "desc"
}
}
],
"from": "0",
"size": "100"
}'
