For readability purposes when displaying information in YAML format, I'd like to be able to replace a YAML array with its JSON equivalent.
The catch is that I may have several instances to replace in the YAML file/output, with different paths.
Example:
objects:
- object:
    name: objA
    inputs:
    - dims:
      - 1
      - 3
- object:
    name: objB
    outputs:
    - dims:
      - 5
but I'd like the output to be in JSON format for the dims arrays, like:
objects:
- object:
    name: objA
    inputs:
    - dims: [1,3]
- object:
    name: objB
    outputs:
    - dims: [5]
Converting the value from YAML to JSON is easy, and modifying the value of a YAML node is easy, but I don't see how I can get the value of a "dims" node, convert it to a JSON string, and put it back in the node (I mean without explicitly searching for all instances).
More generally, I'm looking for a way to replace the value of a node with the result of some processing on that value (another example: replacing an id with the name of the corresponding object retrieved through a REST API request):
objects:
- object:
    name: objA
    dependency: 3fc4bd5b-a6ee-4469-946d-5f780476784e
would be displayed as
objects:
- object:
    name: objA
    dependency: name-of-dependency
where the id is replaced by the friendly name of the dependency
thanks
With mikefarah's yq
yq e '.objects[].object["inputs","outputs"][].dims? |= "["+join(",")+"]"' data.yml
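If you would rather keep dims as a real YAML sequence and only change how it is rendered, mikefarah's yq also has a style operator; a sketch reusing the same path expression (assuming yq v4, this prints dims: [1, 3] rather than a quoted string):
yq e '(.objects[].object["inputs","outputs"][].dims?) style="flow"' data.yml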
With kislyuk's yq (the Python one), you could use tojson and the update operator |=. This will encode your arrays as JSON, which is a string and therefore itself enclosed in quotes:
yq -y '(.. | .dims? | arrays) |= tojson'
objects:
- object:
    name: objA
    inputs:
    - dims: '[1,3]'
- object:
    name: objB
    outputs:
    - dims: '[5]'
Run with Python yq
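For the more general case in the question (replacing an id with a name fetched over REST), a jq/yq expression cannot make HTTP calls, so a small script is the usual route. A minimal Python sketch with PyYAML and requests; the endpoint and the lookup_name helper are hypothetical:
import yaml
import requests

def lookup_name(dep_id):
    # hypothetical REST endpoint; adjust to your real API
    r = requests.get(f"https://api.example.com/objects/{dep_id}")
    r.raise_for_status()
    return r.json()["name"]

def replace_ids(node):
    # walk the parsed document and rewrite every "dependency" value in place
    if isinstance(node, dict):
        for k, v in node.items():
            if k == "dependency" and isinstance(v, str):
                node[k] = lookup_name(v)
            else:
                replace_ids(v)
    elif isinstance(node, list):
        for item in node:
            replace_ids(item)

with open("data.yml") as f:
    doc = yaml.safe_load(f)

replace_ids(doc)
print(yaml.safe_dump(doc, sort_keys=False))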
I have a strange problem that I can't seem to get my head around. I am trying to define some variables for use as part of the job that will deploy Bicep files via Azure CLI and execute PowerShell tasks.
I get this validation error when I try to execute the pipeline: While parsing a block mapping, did not find expected key
The line that it refers to is: - name: managementResourceDNSPrivateResolverName
From the research I have done on this problem, it sounds like an indentation problem, but on the face of it the YAML looks fine.
jobs:
  - job: 'Deploy_Management_Resources'
    pool:
      vmImage: ${{ parameters.buildAgent }}
    variables:
      - name: managementResourceDNSPrivateResolverName
        value: 'acme-$[ lower(parameters['environmentCode']) ]-$[ lower(variables['resourceLocationShort']) ]-private-dns-resolver'
      - name: managementResourceGroupManagement
        value: 'acme-infrastructure-rg-management'
      - name: managementResourceRouteTableName
        value: 'acme-$[ lower(variables['subscriptionCode']) ]-$[ lower(variables['resourceLocationShort']) ]-route-table'
      - name: managementResourceVirtualNetworkName
        value: 'acme-$[ lower(variables['subscriptionCode']) ]-$[ lower(variables['resourceLocationShort']) ]-vnet-internal-mng'
Thanks!
The error message ...parsing a block mapping, did not find expected key is usually a side-effect of malformed YAML. You'll see it often with variables if you have mixed formats of array elements and property elements:
variables: # an array of objects
  # variable group reference object
  - group: myvariablegroup
  # variable template reference object
  - template: my-variables.yml
  # variable object
  - name: myVariable
    value: 'value1'
  # variable shorthand syntax
  myVariable: 'value1' # this fails because it's a property instead of an array element
While it doesn't appear that the sample you've provided is malformed, I am curious about the use of $[ ], which is a runtime expression. The expression $[ lower(parameters['environmentCode']) ] refers to parameters, which are only available at compile time.
Change:
$[ lower(parameters['environmentCode']) ] to ${{ lower(parameters.environmentCode) }}
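Applied to the first variable, the result would look something like this (a sketch; the value is wrapped in double quotes so the single quotes in the bracket syntax parse cleanly):
variables:
  - name: managementResourceDNSPrivateResolverName
    value: "acme-${{ lower(parameters.environmentCode) }}-$[ lower(variables['resourceLocationShort']) ]-private-dns-resolver"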
Hello, I'm currently trying to find a tool (I'm pretty sure yq doesn't do the magic for me) to remove some content from a YAML file. My file looks like the following:
paths:
  /entity/{id}:
    get:
      tags: a
      summary: b
      ...
So it's a typical OpenAPI specification. I would like to add a magic property, for example env: prod, so that some endpoints look like the following:
paths:
  /entity/{id}:
    get:
      env: prod
      tags: a
      summary: b
      ...
Is there a solution to remove all endpoints which contain env: prod?
I am also free to change the concept; if there were some transformation with an if/else, I would be very happy.
Using kislyuk/yq:
yq -y '.[][][] |= ({env: "prod"} + .)'
Using mikefarah/yq:
yq '.[][][] |= ({"env": "prod"} + .)'
Both produce:
paths:
  /entity/{id}:
    get:
      env: prod
      tags: a
      summary: b
This adds env: prod to every object that is three levels deep. If you want the criteria to be more sophisticated, you will have to adapt .[][][] accordingly.
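For example, to restrict the update to the operations under paths only (useful once the file contains other top-level sections), a narrower path expression would be a sketch like:
yq -y '.paths[][] |= ({env: "prod"} + .)'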
To then remove all endpoints that contain env: prod, with mikefarah/yq:
yq 'del(.. | select(.env == "prod"))' file.yaml
Explanation:
You want to delete all the nodes that have a child env property set to prod.
.. recursively matches all nodes
select(.env == "prod") selects the ones that have an env property equal to "prod"
del(.. | select(.env == "prod")) deletes those nodes :)
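Applied to the tagged sample from above, this would leave something like:
paths:
  /entity/{id}: {}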
Disclaimer: I wrote mikefarah/yq
Say I have a YAML like:
Resources:
  AlarmTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - !If
          - ShouldAlarm
          Protocol: email
How do I get each key and value of all the children if I'm walking over each resource and I want to know if one of the values may contain a certain string? I'm using PyYAML but I'm also open to using some other library.
You can use the low-level event API if you only want to inspect scalar values:
import yaml

input = """
Resources:
  AlarmTopic:
    Type: AWS::SNS::Topic
    Properties:
      Subscription:
        - !If
          - ShouldAlarm
          - Protocol: email
"""

for e in yaml.parse(input):
    if isinstance(e, yaml.ScalarEvent):
        print(e.value)
(I fixed your YAML because it had a syntax error.) This yields:
Resources
AlarmTopic
Type
AWS::SNS::Topic
Properties
Subscription
ShouldAlarm
Protocol
email
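And if the goal is just to know whether any value contains a certain string, the same event stream can be scanned directly; a minimal sketch:
def any_scalar_contains(yaml_text, needle):
    # True if any scalar value in the document contains the substring
    return any(
        isinstance(e, yaml.ScalarEvent) and needle in e.value
        for e in yaml.parse(yaml_text)
    )

print(any_scalar_contains(input, "email"))  # True for the sample above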
I have a command-line tool in CWL which can take the following input:
fastq:
  class: File
  path: /path/to/fastq_R1.fq.gz
fastq2:
  class: File
  path: /path/to/fastq_R2.fq.gz
sample_name: foo
Now I want to scatter over this command-line tool, and the only way I can think to do it is with scatterMethod: dotproduct and an input of something like:
fastqs:
  - class: File
    path: /path/to/fastq_1_R1.fq.gz
  - class: File
    path: /path/to/fastq_2_R1.fq.gz
fastq2s:
  - class: File
    path: /path/to/fastq_1_R2.fq.gz
  - class: File
    path: /path/to/fastq_2_R2.fq.gz
sample_names: [foo, bar]
Is there any other way for me to design the workflow and/or input file such that each input group is kept together? Something like:
paired_end_fastqs:
  - fastq:
      class: File
      path: /path/to/fastq_1_R1.fq.gz
    fastq2:
      class: File
      path: /path/to/fastq_1_R2.fq.gz
    sample_name: foo
  - fastq:
      class: File
      path: /path/to/fastq_2_R1.fq.gz
    fastq2:
      class: File
      path: /path/to/fastq_2_R2.fq.gz
    sample_name: bar
You could accomplish this with a nested workflow wrapper that maps a record to the individual fields, and then use that workflow to scatter over an array of records. The workflow would look something like this:
---
class: Workflow
cwlVersion: v1.0
id: workflow
inputs:
  - id: paired_end_fastqs
    type:
      type: record
      name: paired_end_fastqs
      fields:
        - name: fastq
          type: File
        - name: fastq2
          type: File
        - name: sample_name
          type: string
outputs: []
steps:
  - id: tool
    in:
      - id: fastq
        source: paired_end_fastqs
        valueFrom: "$(self.fastq)"
      - id: fastq2
        source: paired_end_fastqs
        valueFrom: "$(self.fastq2)"
      - id: sample_name
        source: paired_end_fastqs
        valueFrom: "$(self.sample_name)"
    out: []
    run: "./tool.cwl"
requirements:
  - class: StepInputExpressionRequirement
Specify a workflow input of type record, with fields for each of the inputs the tool accepts (the ones you want to keep together while scattering). Connect the workflow input to each tool input as its source. Using valueFrom on the step inputs, transform the record (self is the source in this context) to pass only the appropriate field to the tool.
More about valueFrom in workflow steps: https://www.commonwl.org/v1.0/Workflow.html#WorkflowStepInput
Then use this wrapper in your actual workflow and scatter over paired_end_fastqs with an array of records.
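The outer workflow would then look roughly like this (a sketch; wrapper.cwl is a hypothetical filename for the workflow above):
---
class: Workflow
cwlVersion: v1.0
id: outer_workflow
requirements:
  - class: ScatterFeatureRequirement
inputs:
  - id: paired_end_fastq_array
    type:
      type: array
      items:
        type: record
        name: paired_end_fastqs
        fields:
          - name: fastq
            type: File
          - name: fastq2
            type: File
          - name: sample_name
            type: string
outputs: []
steps:
  - id: scattered_tool
    run: "./wrapper.cwl"  # the wrapper workflow above, saved to a file
    scatter: paired_end_fastqs
    in:
      - id: paired_end_fastqs
        source: paired_end_fastq_array
    out: []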
I am looking for a way to dynamically set the key using the path of the file below.
For example, if I have this YAML:
prospectors.config:
  - fields:
      queue_name: <somehow get the globbed string below in here>
    paths:
      - /var/log/casino/*.log
    type: log
output.redis:
  hosts:
    - "producer:6379"
  key: "%{[fields.queue_name]}"
Then, if I had a file called /var/log/casino/test.log, key would become test.
I'm not sure that what you want is possible.
You could use the source field and configure your Redis output using that as the key:
output.redis:
  hosts:
    - "producer:6379"
  key: "%{source}"
This would have the disadvantage of being the absolute path of the source file, not the basename as your question asks for.
If you have a small number of possible basename patterns and want a queue for each, there is a workaround. For example, say you have the files:
/common/path/test-1.log
/common/path/foo-0.log
/common/path/01-bar.log
/common/path/test-3.log
...
and you want three queues in Redis: test, foo, and bar. You could use the source field and the conditionals available in the keys configuration of the Redis output, something like this:
output.redis:
  hosts:
    - "producer:6379"
  key: "default_key"
  keys:
    - key: "test_key"
      when.contains:
        source: "test"
    - key: "foo_key"
      when.contains:
        source: "foo"
    - key: "bar_key"
      when.contains:
        source: "bar"