How to merge two .yaml files such that shared keys between the files use only one of their values?

I am attempting to merge two YAML files and would like any shared keys under a specific key to use the values from one of the files, rather than merging both. This may be easier to describe with an example. Given file1.yaml and file2.yaml, I am trying to achieve the following:
file1.yaml
name: 'file1'
paths:
  path1:
    content: "t"
  path2:
    content: "b"
file2.yaml
name: 'file2'
paths:
  path1:
    value: "t"
My ideal result in merging is the following file:
file3.yaml
name: 'file2'
paths:
  path1:
    value: "t"
  path2:
    content: "b"
Specifically, I would like to overwrite any key under paths such that if both YAML files have the same key under paths, only the value from file2 is used. Is there a tool that enables this? I was looking into yq, but I'm not sure whether it would work.

Please specify which implementation of yq you are using. They are quite similar, but sometimes differ a lot.
For instance, using kislyuk/yq, you can use input to access the second file, which you can provide alongside the first one:
yq -y 'input as $in | .name = $in.name | .paths += $in.paths' file1.yaml file2.yaml
name: file2
paths:
  path1:
    value: t
  path2:
    content: b
With mikefarah/yq, you'd use load and name the second file inside the expression, while only the first file is passed as regular input:
yq 'load("file2.yaml") as $in | .name = $in.name | .paths += $in.paths' file1.yaml
name: 'file2'
paths:
  path1:
    value: "t"
  path2:
    content: "b"
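To try the commands above, the two input files need their original indentation. The following is a minimal sketch that recreates them in a scratch directory and then runs the mikefarah/yq variant from the answer. The `workdir` variable is only a name chosen for this illustration, and the yq step is skipped when no `yq` binary is installed.

```shell
# Recreate the question's input files in a scratch directory.
workdir=$(mktemp -d)

cat > "$workdir/file1.yaml" <<'EOF'
name: 'file1'
paths:
  path1:
    content: "t"
  path2:
    content: "b"
EOF

cat > "$workdir/file2.yaml" <<'EOF'
name: 'file2'
paths:
  path1:
    value: "t"
EOF

# Run the mikefarah/yq command from the answer, if yq is available.
# With a different yq implementation this expression is expected to fail.
if command -v yq >/dev/null 2>&1; then
  (cd "$workdir" &&
    yq 'load("file2.yaml") as $in | .name = $in.name | .paths += $in.paths' file1.yaml) ||
    echo "yq failed: this expression assumes mikefarah/yq v4" >&2
fi
```

For kislyuk/yq, the other command from the answer would be run in the same directory with both file names as arguments.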

Related

yq append multiple file into list yaml

Assume I have a file root.yml
keyA: valA
keyB: valB
myList:
Then I receive some yml file, such as
1.yml
project_id: abc
description: xyz
2.yml
project_id: cba
description: zyx
And so on (they may be stored in the same folder).
Now I want to append the content of 1.yml, 2.yml (and so on) to myList in root.yml and print the result to the console.
Expected:
keyA: valA
keyB: valB
myList:
  - project_id: abc
    description: xyz
  - project_id: cba
    description: zyx
  - (so on...)
I have searched for examples, but they hard-code the list items in the yq command, like this post: Stack Overflow
I want the items to be loaded from files instead of being hard-coded.
Please forgive my bad English.
With mikefarah/yq, you could use the load function with the names of the YAML files to be included, i.e. with your example
yq '.myList += [ load("1.yml"), load("2.yml") ]' root.yml
producing a YAML result as
keyA: valA
keyB: valB
myList:
  - project_id: abc
    description: xyz
  - project_id: cba
    description: zyx
As indicated in your comment, if one of the objects has a parent structure and you want to extract the elements from it, you can do
yq 'load("1.yml") as $f | .myList += [ $f.config[], load("2.yml") ]' root.yml
Tested on yq version 4.27.2
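Since the question mentions an open-ended set of files ("and so on"), the expression can also be assembled from a list of file names instead of being typed by hand. This is only a sketch: `build_append_expr` is a hypothetical helper name, and it assumes the file names contain no double quotes or backslashes. It only prints the expression, so it runs without yq installed.

```shell
# Build an expression such as '.myList += [ load("1.yml"), load("2.yml") ]'
# from the file names passed as arguments.
build_append_expr() {
  loads=''
  for f in "$@"; do
    # Put a comma between entries, but not before the first one.
    if [ -n "$loads" ]; then
      loads="$loads, "
    fi
    loads="${loads}load(\"$f\")"
  done
  printf '.myList += [ %s ]\n' "$loads"
}

expr_text=$(build_append_expr 1.yml 2.yml)
echo "$expr_text"
# prints: .myList += [ load("1.yml"), load("2.yml") ]

# With mikefarah/yq installed and the files present, it would be applied as:
#   yq "$expr_text" root.yml
```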

Assign YAML array from input to key using `yq`

I'm trying to take a YAML style array I get from an AWS command, and assign it to a key while updating my own YAML.
This represents what I have right now:
yq '(.HostedZones[] | select(.Id=="/hostedzone/ABC123")).ResourceRecordSets |= "'"$(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" --output yaml | yq '.ResourceRecordSets')"'"' -i route53.yml
This is what route53.yml looks like before I run the command:
HostedZones:
- CallerReference: abc-123
  Id: /hostedzone/ABC123
  Name: domain.name.com.
  ResourceRecordSetCount: 5
and this is what route53.yml looks like after:
HostedZones:
- CallerReference: abc-123
  Id: /hostedzone/ABC123
  Name: domain.name.com.
  ResourceRecordSetCount: 5
  ResourceRecordSets: |-
    - Name: a.domain.name.com.
      ResourceRecords:
      - Value: some.value.com
      TTL: 300
      Type: CNAME
    - Name: b.domain.name.com.
      ResourceRecords:
      - Value: some.value.com
      TTL: 300
      Type: CNAME
    - Name: c.domain.name.com.
      ResourceRecords:
      - Value: some.value.com
      TTL: 300
      Type: CNAME
    - Name: d.domain.name.com.
      ResourceRecords:
      - Value: some.value.com
      TTL: 300
      Type: CNAME
    - Name: e.domain.name.com.
      ResourceRecords:
      - Value: some.value.com
      TTL: 300
      Type: CNAME
As you can see, there's a |- right after the key, so the value is treated as a multiline string instead of an array of maps. How can I avoid this and assign the array as a YAML array? When I manually assign an array in the style of ['a', 'b', 'c'], the update works as it should and adds a YAML array under the key. How can I achieve the same with the output of the aws command?
yq interprets the output of the $(...) expression as a multiline string because you have quoted it; your yq expression, simplified, looks like:
yq '.ResourceRecordSets |= "some string here"'
The quotes mean "this is a string, not a structure", so that's what you get. You could try dropping the quotes, like this:
yq '(.HostedZones[] | select(.Id=="/hostedzone/ABC123")).ResourceRecordSets |= '"$(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" --output yaml | yq '.ResourceRecordSets')" route53.yml
That might work, but it is fragile. A better solution is to have yq parse the output of the subexpression as a separate document, and then merge it as a structured document rather than a big string. Like this:
yq \
'(.HostedZones[] | select(.Id=="/hostedzone/ABC123")).ResourceRecordSets |= input.ResourceRecordSets' \
route53.yml \
<(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" --output yaml)
This takes advantage of the fact that you can provide jq (and hence yq) multiple files on the command line, and then read the next one with the input filter (described in the IO section of the jq manual). The above command line is structured like this:
yq <expression> <file1> <file2>
Where <file1> is route53.yml, and <file2> is a bash process substitution.
This solution simplifies issues around quoting and formatting.
(You'll note I dropped your use of -i here; that seems to throw yq for a loop, and it's easy to output to a temporary file and then rename it.)
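The "output to a temporary file and then rename it" step can be sketched as a small wrapper. `rewrite_file` is a hypothetical helper name, not part of yq or jq; it runs whatever command it is given, captures its standard output in a temporary file next to the target, and replaces the target only if the command succeeded. The demo uses sed as a stand-in command so that it runs without yq or the aws CLI.

```shell
# Usage: rewrite_file TARGET COMMAND [ARGS...]
# Runs COMMAND, writes its stdout to a temp file, then renames it over TARGET.
rewrite_file() {
  target=$1
  shift
  tmp=$(mktemp "${target}.XXXXXX") || return 1
  if "$@" > "$tmp"; then
    # Replace the original only after the command finished successfully.
    mv "$tmp" "$target"
  else
    # Leave the original untouched on failure.
    rm -f "$tmp"
    return 1
  fi
}

# Demo with sed standing in for the yq command.
demo=$(mktemp)
printf 'ResourceRecordSetCount: 5\n' > "$demo"
rewrite_file "$demo" sed 's/5$/6/' "$demo"
cat "$demo"
# prints: ResourceRecordSetCount: 6
```

Note that the replaced file gets the permissions of the temporary file, which may differ from those of the original.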
Eventually I solved it using a combination of larsks answer about the unnecessary quotes, and using jq instead of yq (no matter what I did, using solely yq wouldn't work) to assign the array to a separate variable before editing the YAML document:
record_sets=$(aws route53 list-resource-record-sets --hosted-zone-id "ABC123" | jq '.[]')
yq '(.HostedZones[] | select(.Id=="ABC123")).ResourceRecordSets |= '"$record_sets"'' -i route53.yml

Convert Cloudflare API response to yaml

When I retrieve all records for a hosted zone in Cloudflare (e.g. of response), I need to create the following YAML structure from it:
name_zones: # this line we create
.zone_name: # the value is taken from the response
auth_key: XXX # this line we create
records: # this line we create
# iterate over all records
- name: .name
type: .type
priority: .priority # create line if value set|exist
content: .content
ttl: .ttl # create line if value set|exist
Here is jq code that almost does this:
jq '.result[] | {name: .name, type: .type, content: .content, ttl: .ttl} + if has("priority") then {priority} else null end' | jq -n '.name_zone.zone_name.auth_key.records |= [inputs]' | yq r -P -
How can I pass or create the value of zone_name and auth_key: XXX?
First, your two invocations of jq can be replaced by just one, as shown in the answer below.
Second, there are currently (at least) two yqs in the wild:
python-based yq (https://kislyuk.github.io/yq) - hereafter python-yq
go-based yq (https://github.com/mikefarah/yq)
For better and/or worse, version 4 of the go-based yq is significantly different from earlier versions, so if you want to use the current go-based version you may have to make adjustments accordingly. To simplify things (at least from my point of view), I will replace your yq r -P - by:
python-yq -y .
The following produces the output shown below.
< cf_response.json \
jq '{name_zones:
      {zone_name: .result[0].zone_name,
       auth_key: "XXX",
       records:
         [.result[]
          | {name, type, content, ttl}
            + if has("priority")
              then {priority}
              else null end] }} ' |
python-yq -y .
Output
name_zones:
  zone_name: test.com
  auth_key: XXX
  records:
  - name: test.com
    type: A
    content: 111.111.111.111
    ttl: 1
  - name: test.com
    type: TXT
    content: google-site-verification=content
    ttl: 1
  - name: test.com
    type: MX
    content: smtp.test.com
    ttl: 1
    priority: 0
auth_key
If you want to pass in the value of auth_key as a parameter to jq, you could use the command-line sequence --arg auth_key XXX, and then use $auth_key in the jq program.

How should I extract and combine parts of files based on a common text string effectively in bash?

Suppose I have two similar files:
a.yaml
data:
  - name: a1
    args: ["cmd", "something"]
    config:
      - name: some
        val: thing
  - name: a2
    args: ["cmd2", "else"]
  [...other array values...]
tags: ["something-in-a"]
values: ["else-in-a"]
substitutions:
  key1: a-value
  key2: a-value
  key3: a-value
b.yaml
data:
  - name: b1
    args: ["cmd", "something"]
    config:
      - name: some
        val: thing
  - name: b2
    args: ["cmd2", "else"]
  [...other array values...]
tags: ["something-in-b"]
values: ["else-in-b"]
substitutions:
  key1: b-value
  key2: b-value
  key3: b-value
My goal is to combine parts of the two files into a new file that consists of the content before substitutions: from b.yaml, followed by substitutions: and everything after it from a.yaml.
So in this case, my desired output would be like this:
c.yaml
data:
  - name: b1
    args: ["cmd", "something"]
    config:
      - name: some
        val: thing
  - name: b2
    args: ["cmd2", "else"]
  [...other array values...]
tags: ["something-in-b"]
values: ["else-in-b"]
substitutions:
  key1: a-value
  key2: a-value
  key3: a-value
The parts before and after substitutions: in both file contents might have different lengths.
Currently, my method is like this:
head -q -n `awk '/substitution/ {print FNR-1}' b.yaml` b.yaml >! c.yaml ; \
tail -q -n `awk '/substitution/ {ROWNUM=FNR} END {print NR-ROWNUM+1}' a.yaml` a.yaml >> c.yaml; \
rm a.yaml b.yaml; mv c.yaml a.yaml; # optional newfile renaming to original
But I wonder if there's an alternative or better method for combining parts of different files based on a common text string in bash?
Use awk, you just need to flag the flow based on the string:
awk '$1 == "substitutions:"{skip = FNR==NR ? 1:0}!skip' b.yaml a.yaml
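Explanation:
FNR==NR: true while awk is reading the first file (b.yaml), false for the second file (a.yaml). So skip is set to 1 at the substitutions: line of b.yaml and reset to 0 at the substitutions: line of a.yaml.
!skip: if true, print the line; otherwise skip it.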
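The flag logic can be checked on two tiny stand-in files. This is a sketch with made-up file contents, much shorter than the ones in the question; `dir` and `combined` are names chosen for the illustration, and the awk program is the one from the answer with the ternary written as a plain comparison.

```shell
dir=$(mktemp -d)

cat > "$dir/b.yaml" <<'EOF'
data:
- name: b1
substitutions:
  key1: b-value
EOF

cat > "$dir/a.yaml" <<'EOF'
data:
- name: a1
substitutions:
  key1: a-value
EOF

# skip becomes 1 at "substitutions:" in the first file (FNR == NR)
# and 0 again at "substitutions:" in the second file.
combined=$(awk '$1 == "substitutions:" { skip = (FNR == NR) } !skip' \
  "$dir/b.yaml" "$dir/a.yaml")

printf '%s\n' "$combined"
# data:
# - name: b1
# substitutions:
#   key1: a-value
```

The substitutions: line of b.yaml is dropped because skip is set before the print rule runs, while the one from a.yaml is kept because skip has just been cleared.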
Another option is grep with context flags (the head -n -1 form requires GNU head):
{
  grep -B9999 'substitutions:' b.yaml | head -n -1
  grep -A9999 'substitutions:' a.yaml
} > c.yaml
A oneliner:
{ grep -B9999 'substitutions:' b.yaml | head -n -1; grep -A9999 'substitutions:' a.yaml; } > c.yaml
The -A9999 and -B9999 are a bit dirty, here's a solution with sed:
{
  sed '/substitutions:/,$d' b.yaml
  echo substitutions:
  sed '1,/substitutions:/d' a.yaml
} > c.yaml

Replacing a yaml file value using bash scripting

I have a yaml file and want to replace a particular value in it. Note that this value can have multiple entries under different parameters however I want to replace only that particular occurrence. Here is my sample code:
parameters:
- name: COUNT_1
  displayName: First Counter
  required: true
  value: "1"
- name: COUNT_2
  displayName: Second Counter
  required: false
  value: "1"
Here I want to replace value: "1" under COUNT_2 only, not under COUNT_1, with something like value: "2". I cannot use an external YAML processing library of any kind, only sed, awk, etc.
I have tried a looping approach in which I loop through the YAML file with while, but it gets too cluttered and confusing: I first note the line number of parameters, then try to loop starting from that line number, then compare the name value, then look up the value under that name and replace it. This does not seem like the proper approach to me. Can anyone suggest an easier way?
You may use this awk:
awk '$2 == "name:" { tag = ($3 == "COUNT_2") }
tag && $1 == "value:" { $1 = "  " $1; $2 = "2" } 1' file.yaml
parameters:
- name: COUNT_1
  displayName: First Counter
  required: true
  value: "1"
- name: COUNT_2
  displayName: Second Counter
  required: false
  value: 2
Too bad you're restricting yourself. Here's how to do it with ruby
ruby -e '
require "yaml"
data = YAML.load File.read ARGV.shift
data["parameters"].select {|h| h["name"] == "COUNT_2"}.each {|h| h["value"] = "2"}
puts YAML.dump(data)
' file.yml
Using a proper YAML parser is important. For example, if your YAML looks like
parameters:
- value: "1"
  name: COUNT_1
  displayName: First Counter
  required: true
- value: "1"
  name: COUNT_2
  displayName: Second Counter
  required: false
i.e. with the value appearing before the name, the awk approach will stop working as you expect.
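If only awk is allowed, the ordering problem can be reduced by buffering each list item and deciding what to rewrite once the whole item has been read. The following is a sketch, not a YAML parser: `set_param_value` is a hypothetical helper, it assumes every list item starts with "- ", that name: and value: sit on their own lines, that nothing follows the list in the file, and that the new value contains no & or backslash characters.

```shell
# Usage: set_param_value NAME NEW_VALUE FILE
set_param_value() {
  awk -v want="$1" -v new="$2" '
    # Print the buffered item, rewriting its value line if the name matched.
    function emit_item(   i) {
      for (i = 1; i <= n; i++) {
        if (match_item && buf[i] ~ /^[ -]*value:/)
          sub(/value:.*/, "value: \"" new "\"", buf[i])
        print buf[i]
      }
      n = 0
      match_item = 0
    }
    /^ *- / { emit_item() }            # a new list item starts here
    {
      buf[++n] = $0
      line = $0
      sub(/^[ -]*/, "", line)          # drop indentation and the list dash
      if (line == ("name: " want)) match_item = 1
    }
    END { emit_item() }
  ' "$3"
}

# Demo on a layout where value: comes before name:.
sample=$(mktemp)
cat > "$sample" <<'EOF'
parameters:
- value: "1"
  name: COUNT_1
- value: "1"
  name: COUNT_2
EOF

updated=$(set_param_value COUNT_2 2 "$sample")
printf '%s\n' "$updated"
```

In the demo only the value line of the COUNT_2 item changes to value: "2", regardless of whether name: appears before or after it within the item. A real YAML parser remains the more robust choice when one is available.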

Resources