yq v4: delete node based on key-value of child - yaml

How can I remove all (parent) nodes with the "type: array" key/value pair using yq v4?
Before:
info:
  title: My API
components:
  schemas:
    pets:
      type: array
      items:
        $ref: "#/components/schemas/pet"
    pet:
      type: object
      properties:
        petName:
          type: string
After:
info:
  title: My API
components:
  schemas:
    pet:
      type: object
      properties:
        petName:
          type: string
I use yq (https://github.com/mikefarah/yq/) version v4.30.8
I tried many yq commands, for example:
yq 'del(.components.schemas.[] | select(. == "array") | parent)' filename.yaml
, but without success.

How can I remove all (parent) nodes with the "type: array" key/value pair
To reach all items, use .. to recurse down the document tree. To filter for a given key/value pair, compare the key against the value inside select.
yq 'del(.. | select(.type == "array"))' file.yaml
EDIT: How to only consider nodes under .components.schemas?
If you still want recursion, prepend .. with .components.schemas |. Without recursion, replace .. with .components.schemas[].
yq 'del(.components.schemas | .. | select(.type == "array"))' file.yaml
yq 'del(.components.schemas[] | select(.type == "array"))' file.yaml
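The recursive variant can be pictured as a tree walk. Below is a minimal Python sketch of that idea on plain dicts and lists; it is an illustration of the logic, not how yq is implemented. The function name drop_matching is hypothetical, and unlike the yq expression it never considers the root node itself.

```python
def drop_matching(node, key="type", value="array"):
    """Return a copy of node without any child mapping where child[key] == value."""
    def matches(child):
        return isinstance(child, dict) and child.get(key) == value

    if isinstance(node, dict):
        # Skip matching children, recurse into the ones that are kept.
        return {k: drop_matching(v, key, value)
                for k, v in node.items() if not matches(v)}
    if isinstance(node, list):
        return [drop_matching(v, key, value) for v in node if not matches(v)]
    return node  # scalars are returned unchanged


# The "Before" document from the question, written as a Python dict.
doc = {
    "info": {"title": "My API"},
    "components": {
        "schemas": {
            "pets": {"type": "array",
                     "items": {"$ref": "#/components/schemas/pet"}},
            "pet": {"type": "object",
                    "properties": {"petName": {"type": "string"}}},
        }
    },
}

cleaned = drop_matching(doc)
print(sorted(cleaned["components"]["schemas"]))
```

Restricting the walk to .components.schemas corresponds to calling the function on that sub-dict only.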

Related

Get the YAML path of a given line in a file

Using yq (or any other tool), how can I return the full YAML path of an arbitrary line number ?
e.g. with this file :
a:
  b:
    c: "foo"
  d: |
    abc
    def
I want to get the full path of line 2; it should yield: a.b.c. Line 0 → a, Line 4 → a.d (multiline support), etc.
Any idea how I could achieve that?
Thanks
I have coded two solutions that differ slightly in their behaviour (see remarks below)
Use the YAML processor mikefarah/yq.
I have also tried to solve the problem using kislyuk/yq, but it is not suitable,
because the operator input_line_number only works in combination with the --raw-input option
Version 1
FILE='sample.yml'
export LINE=1
yq e '[..
| select(line == env(LINE))
| {"line": line,
"path": path | join("."),
"type": type,
"value": .}
]' $FILE
Remarks
LINE=3 returns two results, because line 3 contains two nodes
the key 'c' of map 'a.b'
the string value 'foo' of key 'c'.
LINE=5 does not return a match, because the multiline text node starts in line 4.
the results are wrapped in an array, as multiple nodes can be returned
Output for LINE=1
- line: 1
  path: ""
  type: '!!map'
  value:
    a:
      b:
        c: "foo"
    d: |-
      abc
      def
Output for LINE=2
- line: 2
  path: a
  type: '!!map'
  value:
    b:
      c: "foo"
Output for LINE=3
- line: 3
  path: a.b
  type: '!!map'
  value:
    c: "foo"
- line: 3
  path: a.b.c
  type: '!!str'
  value: "foo"
Output for LINE=4
- line: 4
  path: d
  type: '!!str'
  value: |-
    abc
    def
Output for LINE=5
[]
Version 2
FILE='sample.yml'
export LINE=1
if [[ $(wc -l < $FILE) -lt $LINE ]]; then
echo "$FILE has less than $LINE lines"
exit
fi
yq e '[..
| select(line <= env(LINE))
| {"line": line,
"path": path | join("."),
"type": type,
"value": .}
]
| sort_by(.line, .type)
| .[-1]' $FILE
Remarks
at most one node is returned, even if there are more nodes in the selected row. So the result does not have to be wrapped in an array.
Which node of one line is returned can be controlled by the sort_by function, which can be adapted to your own needs.
In this case, text nodes are preferred over maps because "!!map" is sorted before "!!str".
LINE=3 returns only the text node of line 3 (not node of type "!!map")
LINE=5 returns the multiline text node starting at line 4
LINE=99 does not return the last multiline text node of sample.yaml because the maximum number of lines is checked in bash beforehand
Output for LINE=1
line: 1
path: ""
type: '!!map'
value:
  a:
    b:
      c: "foo"
  d: |-
    abc
    def
Output for LINE=2
line: 2
path: a
type: '!!map'
value:
  b:
    c: "foo"
Output for LINE=3
line: 3
path: a.b.c
type: '!!str'
value: "foo"
Output for LINE=4
line: 4
path: d
type: '!!str'
value: |-
  abc
  def
Output for LINE=5
line: 4
path: d
type: '!!str'
value: |-
  abc
  def
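The selection rule of Version 2 (keep nodes that start at or before the requested line, sort by line and type, take the last one) can be sketched in Python. The node list below is copied by hand from the outputs above rather than produced by yq, and the function name node_for_line is hypothetical.

```python
# Nodes as reported by Version 1 for the sample file: (start line, type, path).
NODES = [
    {"line": 1, "type": "!!map", "path": ""},
    {"line": 2, "type": "!!map", "path": "a"},
    {"line": 3, "type": "!!map", "path": "a.b"},
    {"line": 3, "type": "!!str", "path": "a.b.c"},
    {"line": 4, "type": "!!str", "path": "d"},
]


def node_for_line(nodes, line):
    """Return the last node starting at or before `line`, or None if there is none."""
    candidates = [n for n in nodes if n["line"] <= line]
    if not candidates:
        return None
    # Same ordering as sort_by(.line, .type): "!!map" sorts before "!!str",
    # so a string node wins over a map node that starts on the same line.
    return sorted(candidates, key=lambda n: (n["line"], n["type"]))[-1]


print(node_for_line(NODES, 3)["path"])
print(node_for_line(NODES, 5)["path"])
```

Changing the sort key is the place to express a different preference, for example preferring maps over strings.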
Sharing my findings since I've spent too much time on this.
As @Inian mentioned, line numbers won't necessarily be accurate.
yq does provide us with the line operator, but I was not able to find a decent way of mapping an input line number to it.
That said, if you're sure the input file will not contain any multi-line values, you could do something like this:
Use awk to get the key of your input line, e.g. 3 --> c
This assumes the value will never contain :, the regex can be edited if needed to go around this
Select row in awk
Trim leading and trailing spaces from a string in awk
export searchKey=$(awk -F':' 'FNR == 3 { gsub(/ /,""); print $1 }' ii)
Use yq to loop recursively (..) over the values, and create each path using (path | join("."))
yq e '.. | (path | join("."))' ii
Filter the values from step 2, using a regex so that we only keep those paths that end in the key from step 1 (strenv(searchKey))
yq e '.. | (path | join(".")) | select(match(strenv(searchKey) + "$"))' ii
Print the path if it's found
Some examples from my local machine, where your input file is named ii and both awk + yq commands are wrapped in a bash function
$ function getPathByLineNumber () {
key=$1
export searchKey="$(awk -v key=$key -F':' 'FNR == key { gsub(/ /, ""); print $1 }' ii)"
yq e '.. | (path | join(".")) | select(match(strenv(searchKey) + "$"))' ii
}
$
$
$
$
$ yq e . ii
a:
  b:
    c: "foo"
$
$
$ getPathByLineNumber 1
a
$ getPathByLineNumber 2
a.b
$ getPathByLineNumber 3
a.b.c
$
$

Assign YAML array from input to key using `yq`

I'm trying to take a YAML style array I get from an AWS command, and assign it to a key while updating my own YAML.
This represents what I have right now:
yq '(.HostedZones[] | select(.Id=="/hostedzone/ABC123")).ResourceRecordSets |= "'"$(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" --output yaml | yq '.ResourceRecordSets')"'"' -i route53.yml
This is what route53.yml looks like before I run the command:
HostedZones:
  - CallerReference: abc-123
    Id: /hostedzone/ABC123
    Name: domain.name.com.
    ResourceRecordSetCount: 5
and this is what route53.yml looks like after:
HostedZones:
  - CallerReference: abc-123
    Id: /hostedzone/ABC123
    Name: domain.name.com.
    ResourceRecordSetCount: 5
    ResourceRecordSets: |-
      - Name: a.domain.name.com.
        ResourceRecords:
        - Value: some.value.com
        TTL: 300
        Type: CNAME
      - Name: b.domain.name.com.
        ResourceRecords:
        - Value: some.value.com
        TTL: 300
        Type: CNAME
      - Name: c.domain.name.com.
        ResourceRecords:
        - Value: some.value.com
        TTL: 300
        Type: CNAME
      - Name: d.domain.name.com.
        ResourceRecords:
        - Value: some.value.com
        TTL: 300
        Type: CNAME
      - Name: e.domain.name.com.
        ResourceRecords:
        - Value: some.value.com
        TTL: 300
        Type: CNAME
As you can see, there's a |- right after the key, and the value seems to be treated as a multiline string instead of an array of maps. How can I avoid this and assign the value as a YAML array? When I manually assigned an array in the style of ['a', 'b', 'c'], the update worked as it should, adding a YAML array under the key. How can I achieve the same with the output of the aws command?
The reason why yq is interpreting the output of the $(...) expression as a multiline string is that you have quoted it; your yq expression, simplified, looks like:
yq '.ResourceRecordSets |= "some string here"'
The quotes mean "this is a string, not a structure", so that's what you get. You could try dropping the quotes, like this:
yq '(.HostedZones[] | select(.Id=="/hostedzone/ABC123")).ResourceRecordSets |= '"$(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" --output yaml | yq '.ResourceRecordSets')" route53.yml
That might work, but it is fragile. A better solution is to have yq parse the output of the subexpression as a separate document, and then merge it as a structured document rather than a big string. Like this:
yq \
'(.HostedZones[] | select(.Id=="/hostedzone/ABC123")).ResourceRecordSets |= input.ResourceRecordSets' \
route53.yml \
<(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" --output yaml)
This takes advantage of the fact that you can provide jq (and hence yq) multiple files on the command line, and then refer to the input variable (described in the IO section of the jq manual). The above command line is structured like this:
yq <expression> <file1> <file2>
Where <file1> is route53.yml, and <file2> is a bash process substitution.
This solution simplifies issues around quoting and formatting.
(You'll note I dropped your use of -i here; that seems to throw yq for a loop, and it's easy to output to a temporary file and then rename it.)
Eventually I solved it using a combination of larsks' answer about the unnecessary quotes, and using jq instead of yq (no matter what I did, using solely yq wouldn't work) to assign the array to a separate variable before editing the YAML document:
record_sets=$(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" | jq '.[]')
yq '(.HostedZones[] | select(.Id=="ABC123")).ResourceRecordSets |= '"$record_sets"'' -i route53.yml

How to remove an empty map from YAML using yq

I need to remove an empty map from a YAML file using yq.
Sometimes this map has values, and sometimes it is empty.
My YAML code looks like this:
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  annotations: {}
  creationTimestamp: "2021-03-24T13:16:10Z"
I need to remove annotations: {}
My desired output:
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  creationTimestamp: "2021-03-24T13:16:10Z"
Can anybody help me?
mikefarah/yq
For a generic approach you can use the command
yq e 'del(.. | select(tag == "!!map" and length == 0))'
to remove all empty objects in the input.
Replace !!map with !!seq if you want to do the same for empty arrays.
kislyuk/yq
Remove empty objects: yq -y 'del(.. | select(objects and length == 0))'
Remove empty arrays: yq -y 'del(.. | select(arrays and length == 0))'
Remove empty objects, arrays and strings: yq -y 'del(.. | select(length == 0))'
You can delete the annotations map when its length is 0. Using mikefarah/yq, this can be done as below (verified on yq version 4.9.6)
yq e 'del(.metadata.annotations | select(length==0))' yaml
Note: Since 4.18.1, yq's eval/e command is the default command and no longer needs to be specified.
Building on @jpseng's answer: that command only removes empty objects at the leaf nodes, so it may leave behind new empty objects, like key a or d below (and eventually c):
$ cat example.yaml
a:
  b: {}
c:
  d:
    e: {}
f: "exists"
$ yq 'del(.. | select(tag=="!!map" and length == 0))' example.yaml
a: {}
c:
  d: {}
f: "exists"
Remove these recursively with a Bash script:
$ cat delete_empty_map.sh
#!/usr/bin/env bash
result=$(yq 'del( .. | select(tag == "!!map" and length == 0))' "$1")
while [ "$(echo "$result" | yq 'map(.. | select(tag == "!!map" and length == 0)) | any')" = "true" ]
do
  result=$(echo "$result" | yq 'del( .. | select(tag == "!!map" and length == 0))')
done
echo "$result" | yq
$ ./delete_empty_map.sh example.yaml
f: "exists"
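The loop above repeats the deletion until no empty map is left. The same result can be reached in a single bottom-up pass, as in the following Python sketch on plain dicts and lists. This is only an illustration of the idea, with a hypothetical function name prune_empty_maps, not a replacement for the yq command.

```python
def prune_empty_maps(node):
    """Remove empty mappings bottom-up, so parents emptied by the removal go too."""
    def is_empty_map(value):
        return isinstance(value, dict) and not value

    if isinstance(node, dict):
        pruned = {}
        for key, value in node.items():
            value = prune_empty_maps(value)  # prune children first
            if not is_empty_map(value):
                pruned[key] = value
        return pruned
    if isinstance(node, list):
        items = [prune_empty_maps(item) for item in node]
        return [item for item in items if not is_empty_map(item)]
    return node  # scalars are returned unchanged


# example.yaml from above, written as a Python dict.
example = {"a": {"b": {}}, "c": {"d": {"e": {}}}, "f": "exists"}
print(prune_empty_maps(example))
```

Because children are pruned before their parent is checked, a, d and c all disappear in one call.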

Extract Key Value pairs which matches the regex in YAML with boolean values

Given the YAML input below, I am trying to extract the output shown using yq. I want to remove each pair whose value is a single placeholder {{a.b.KEY}} in which the part after a.b. matches the key name (for example, VAR-A: '{{a.b.VAR-A}}'). If a value contains more than one placeholder separated by -, I want to keep the pair.
VAR-A: '{{a.b.VAR-A}}'
VAR-B: '{{a.b.VAR-B}}'
VAR-C: v0.0
VAR-D: '{{a.b.VAR-D}}-{{a.b.VAR-A}}'
VAR-E: '{{a.b.VAR-C}}-{{a.b.VAR-B}}-{{a.b.VAR-A}}'
VAR-F: True
Expected Output:
VAR-C: v0.0
VAR-D: '{{a.b.VAR-D}}-{{a.b.VAR-A}}'
VAR-E: '{{a.b.VAR-C}}-{{a.b.VAR-B}}-{{a.b.VAR-A}}'
VAR-F: True
The answer to the following question works if all values are strings, but it fails when the YAML contains a boolean value: Extract Key Value pairs which matches the regex in YAML using yq/sed/grep
I get below error:
Error: cannot substitute with !!bool, can only substitute strings. Hint: Most often you'll want to use '|=' over '=' for this operation.
There are at least two very different extant "yq" projects: a Python-based one, which is the focus of Part 1 below, and a Go-based one, which is the focus of Part 2.
Part 1
python-yq 'del(.[] | select( ( type == "string" and test("^{{a[.]b[.][^}]*}}$" ))))' so-vars.yaml
or
python-yq 'map_values( select( ( type == "string" and test("^{{a[.]b[.][^}]*}}$" )) | not))' so-vars.yaml
Output:
{
"VAR-C": "v0.0",
"VAR-D": "{{a.b.VAR-D}}-{{a.b.VAR-A}}",
"VAR-E": "{{a.b.VAR-C}}-{{a.b.VAR-B}}-{{a.b.VAR-A}}",
"VAR-F": true
}
Part 2
The Go-based version of yq that I have (4.6.3) might not be able to handle your requirements directly, but here's a solution that uses this yq to translate to and from JSON, and jq to do the rest:
yq -j eval . input.yaml |
jq 'del(.[] | select(( type == "string" and test("^{{a[.]b[.][^}]*}}$" ))))' > tmp.json
yq -P eval . tmp.json
The del-free version of the jq program:
map_values( select( ( type == "string" and test("^{{a[.]b[.][^}]*}}$" )) | not))
Output:
VAR-C: v0.0
VAR-D: '{{a.b.VAR-D}}-{{a.b.VAR-A}}'
VAR-E: '{{a.b.VAR-C}}-{{a.b.VAR-B}}-{{a.b.VAR-A}}'
VAR-F: true
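The reason the type check matters is that the regex test is only defined for strings, so non-string values such as booleans must be excluded before testing. A minimal Python sketch of the same filter follows. The pattern is an escaped rendition of the regex used above, and, like that regex, it checks only that the whole value is a single {{a.b.…}} placeholder, not that the placeholder name equals the key. The function name is hypothetical.

```python
import re

# Whole value is exactly one {{a.b.NAME}} placeholder.
SINGLE_PLACEHOLDER = re.compile(r"^\{\{a\.b\.[^}]*\}\}$")


def drop_single_placeholders(mapping):
    """Keep every pair except those whose value is a lone {{a.b.*}} placeholder."""
    return {
        key: value
        for key, value in mapping.items()
        # Check the type first: booleans and numbers are never regex-tested.
        if not (isinstance(value, str) and SINGLE_PLACEHOLDER.search(value))
    }


variables = {
    "VAR-A": "{{a.b.VAR-A}}",
    "VAR-B": "{{a.b.VAR-B}}",
    "VAR-C": "v0.0",
    "VAR-D": "{{a.b.VAR-D}}-{{a.b.VAR-A}}",
    "VAR-E": "{{a.b.VAR-C}}-{{a.b.VAR-B}}-{{a.b.VAR-A}}",
    "VAR-F": True,
}

filtered = drop_single_placeholders(variables)
print(list(filtered))
```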

Convert Cloudflare API response to yaml

When I retrieve all records for a hosted zone in Cloudflare, e.g. of response, I need to create from it the following yaml structure:
name_zones: # this line we create
.zone_name: # the value is taken from the response
auth_key: XXX # this line we create
records: # this line we create
# iterate over all records
- name: .name
type: .type
priority: .priority # create line if value set|exist
content: .content
ttl: .ttl # create line if value set|exist
Here is the jq code that almost does this:
jq '.result[] | {name: .name, type: .type, content: .content, ttl: .ttl} + if has("priority") then {priority} else null end' | jq -n '.name_zone.zone_name.auth_key.records |= [inputs]' | yq r -P -
How to pass or create the value of zone_name and auth_key: XXX?
First, your two invocations of jq can be replaced by just one, as shown in the answer below.
Second, there are currently (at least) two yqs in the wild:
python-based yq (https://kislyuk.github.io/yq) - hereafter python-yq
go-based yq (https://github.com/mikefarah/yq)
For better and/or worse, version 4 of the go-based yq is significantly different from earlier versions, so if you want to use the current go-based version you may have to make adjustments accordingly. To simplify things (at least from my point of view), I will replace your yq r -P - by:
python-yq -y .
The following produces the output shown below.
< cf_response.json \
jq '{name_zones:
{zone_name: .result[0].zone_name,
auth_key: "XXX",
records:
[.result[]
| {name, type, content, ttl}
+ if has("priority")
then {priority}
else null end] }} '|
python-yq -y .
Output
name_zones:
  zone_name: test.com
  auth_key: XXX
  records:
    - name: test.com
      type: A
      content: 111.111.111.111
      ttl: 1
    - name: test.com
      type: TXT
      content: google-site-verification=content
      ttl: 1
    - name: test.com
      type: MX
      content: smtp.test.com
      ttl: 1
      priority: 0
auth_key
If you want to pass in the value of auth_key as a parameter to jq, you could use the command-line sequence --arg auth_key XXX, and then use $auth_key in the jq program.
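To show the shape of the transformation with auth_key supplied from outside, here is a minimal Python sketch that mirrors the jq program above. The function name build_zone_config and the trimmed sample response are hypothetical; the sample is modelled on the output shown above, not on a real Cloudflare response.

```python
def build_zone_config(response, auth_key):
    """Build the name_zones structure; auth_key is passed in like jq's --arg."""
    results = response["result"]
    records = []
    for item in results:
        # Like {name, type, content, ttl} in jq: missing keys become None (null).
        record = {field: item.get(field)
                  for field in ("name", "type", "content", "ttl")}
        if "priority" in item:  # only add the line when the value exists
            record["priority"] = item["priority"]
        records.append(record)
    return {
        "name_zones": {
            "zone_name": results[0]["zone_name"],
            "auth_key": auth_key,
            "records": records,
        }
    }


# Hypothetical, trimmed response shaped like the output above.
sample_response = {
    "result": [
        {"name": "test.com", "type": "A", "content": "111.111.111.111",
         "ttl": 1, "zone_name": "test.com"},
        {"name": "test.com", "type": "MX", "content": "smtp.test.com",
         "ttl": 1, "priority": 0, "zone_name": "test.com"},
    ]
}

config = build_zone_config(sample_response, auth_key="XXX")
print(config["name_zones"]["zone_name"])
```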

Resources