I have defined a mapping in yaml that looks like:
default: &DEFAULT
bucket: &bucket default_path
# Make sure that the second parameter of join doesn't start with a /
# otherwise it is interpreted as an absolute path and join won't work
path1: !!python/object/apply:os.path.join [*bucket, work_area/test1]
path2: !!python/object/apply:os.path.join [*bucket, work_area/test2]
I need to define more keys where the only value to be overwritten is bucket, sth like:
production:
<<: *DEFAULT
bucket: "s3://production-bucket"
but I still get
conf['production']['path1'] => 'default_path/work_area/test1'
instead of
conf['production']['path1'] => 's3://production-bucket/work_area/test1'.
Is there any way to do this in yaml?
As obvious from the syntax, I use pyyaml to parse the file.
YAML interpreters should take the most recent definition of an anchor:
An alias node is denoted by the “*” indicator. The alias refers to the most recent preceding node having the same anchor. It is an error for an alias node to use an anchor that does not previously occur in the document. It is not an error to specify an anchor that is not used by any alias node.
So even if PyYAML (3.10/3.11) would not throw a ComposerError if you try to parse:
default: &DEFAULT
bucket: &bucket default_path
# Make sure that the second parameter of join doesn't start with a /
# otherwise it is interpreted as an absolute path and join won't work
path1: !!python/object/apply:os.path.join [*bucket, work_area/test1]
path2: !!python/object/apply:os.path.join [*bucket, work_area/test2]
production:
<<: *DEFAULT
bucket: &bucket "s3://production-bucket"
inserting the path1 and path2 keys with <<: *DEFAULT* would give you their expanded versions with default_path as that is the definition available to the parser when reading [*bucket, work_area/test1]
The "expansion" of the alias is done as soon as the alias is read in from the YAML source, not at some point at the end of the file, when all anchored data has been read in.
In you updated example, there is no other anchor bucket defined than the one for the scalar "default_path". You are confusing yourself by using the same name for the anchor and the keys (bucket), but the key names are completely irrelevant for resolving the alias *bucket.
If you can rearrange your YAML you might get something acceptable to your use case by doing ¹:
import ruamel.yaml
yaml_str = """\
default: &DEFAULT
bucket: &klm default_path
production:
&klm "s3://production-bucket"
result:
<<: *DEFAULT
# Make sure that the second parameter of join doesn't start with a /
# otherwise it is interpreted as an absolute path and join won't work
path1: !!python/object/apply:os.path.join [*klm, work_area/test1]
path2: !!python/object/apply:os.path.join [*klm, work_area/test2]
"""
conf = ruamel.yaml.load(yaml_str)
print(conf['result']['path1'])
which will give you:
s3://production-bucket/work_area/test1
¹ This was done using ruamel.yaml of which I am the author.
Related
I am building an open source project using Ruby for testing HTTP services: https://github.com/Comcast/http-blackbox-test-tool
I want to be able to reference environment variables in my test-plan.yaml file. I could use ERB, however I don't want to support embedding any random Ruby code and ERB syntax is odd for non-rubyists, I just want to access environment variables using the commonly used Unix style ${ENV_VAR} syntax.
e.g.
order-lunch-app-health:
request:
url: ${ORDER_APP_URL}
headers:
content-type: 'application/text'
method: get
expectedResponse:
statusCode: 200
maxRetryCount: 5
All examples I have found for Ruby use ERB. Does anyone have a suggestion on the best way to deal with this? I an open to using another tool to preprocess the YAML and then send that to the Ruby application.
I believe something like this should work under most circumstances:
require 'yaml'
def load_yaml(file)
content = File.read file
content.gsub! /\${([^}]+)}/ do
ENV[$1]
end
YAML.load content
end
p load_yaml 'sample.yml'
As opposed to my original answer, this is both simpler and handles undefined ENV variables well.
Try with this YAML:
# sample.yml
path: ${PATH}
home: ${HOME}
error: ${NO_SUCH_VAR}
Original Answer (left here for reference)
There are several ways to do it. If you want to allow your users to use the ${VAR} syntax, then perhaps one way would be to first convert these variables to Ruby string substitution format %{VAR} and then evaluate all environment variables together.
Here is a rough proof of concept:
require 'yaml'
# Transform environments to a hash of { symbol: value }
env_hash = ENV.to_h.transform_keys(&:to_sym)
# Load the file and convert ${ANYTHING} to %{ANYTHING}
content = File.read 'sample.yml'
content.gsub! /\${([^}]+)}/, "%{\\1}"
# Use Ruby string substitution to replace %{VARS}
content %= env_hash
# Done
yaml = YAML.load content
p yaml
Use it with this sample.yml for instance:
# sample.yml
path: ${PATH}
home: ${HOME}
There are many ways this can be improved upon of course.
Preprocessing is easy, and I recommend you use a YAML loaderd/dumper
based solution, as the replacement might require quotes around the
replacement scalar. (E.g. you substitute the string true, if that
were not quoted, the resulting YAML would be read as a boolean).
Assuming your "source" is in input.yaml and your env. variable
ORDER_APP_URL set to https://some.site/and/url. And the following
script in expand.py:
import sys
import os
from pathlib import Path
import ruamel.yaml
def substenv(d, env):
if isinstance(d, dict):
for k, v in d.items():
if isinstance(v, str) and '${' in v:
d[k] = v.replace('${', '{').format(**env)
else:
substenv(v, env)
elif isinstance(d, list):
for idx, item in enumerate(d):
if isinstance(v, str) and '${' in v:
d[idx] = item.replace('${', '{').format(**env)
else:
substenv(item, env)
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load(Path(sys.argv[1]))
substenv(data, os.environ)
yaml.dump(data, Path(sys.argv[2]))
You can then do:
python expand.py input.yaml output.yaml
which writes output.yaml:
order-lunch-app-health:
request:
url: https://some.site/and/url
headers:
content-type: 'application/text'
method: get
expectedResponse:
statusCode: 200
maxRetryCount: 5
Please note that the spurious quotes around 'application/text' are preserved, as would be any comments
in the original file.
Quotes around the substituted URL are not necessary, but the would have been added if they were.
The substenv routine recursively traverses the loaded data, and substitutes even if the substitution is in mid-scalar, and if there are more than substitution in one scalar. You can "tighten" the test:
if isinstance(v, str) and '${' in v:
if that would match too many strings loaded from YAML.
I need to remove the tags field from each of the methods in my OpenAPI spec.
The spec must be in YAML format, as converting to JSON causes issues later on when publishing.
I couldn't find a ready tool for that, and my programming skills are insufficient. I tried Python with ruamel.yaml, but could not achieve anything.
I'm open to any suggestions how to approach this - a repo with a ready tool somewhere, a hint on what to try in Python... I'm out of my own ideas.
Maybe a regex that catches all cases all instances of tags so I can do a search and replace with Python, replacing them with nothing? Empty lines don't seem to break the publishing engine.
Here's a sample YAML piece (I know this is not a proper spec, just want to show where in the YAML tags sits)
openapi: 3.0.0
info:
title: ""
description: ""
paths:
/endpoint
get:
tags:
-
tag1
-
tag3
#rest of GET definition
post:
tags:
- tag2
/anotherEndpoint
post:
tags:
- tag1
I need to get rid of all tags arrays entirely (not just make them empty)
I am not sure why you couldn't achieve anything with Python + ruamel.yaml. Assuing your spec
is in a file input.yaml:
import sys
from pathlib import Path
import ruamel.yaml
in_file = Path('input.yaml')
out_file = Path('output.yaml')
yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
data = yaml.load(in_file)
# if you only have the three instances of 'tags', you can hard-code them:
# del data['paths']['/endpoint']['get']['tags']
# del data['paths']['/endpoint']['post']['tags']
# del data['paths']['/anotherEndpoint']['post']['tags']
# generic recursive removal of any key names 'tags' in the datastructure:
def rm_tags(d):
if isinstance(d, dict):
if 'tags' in d:
del d['tags']
for k in d:
rm_tags(d[k])
elif isinstance(d, list):
for item in d:
rm_tags(item)
rm_tags(data)
yaml.dump(data, out_file)
which gives as output.yaml:
openapi: 3.0.0
info:
title: ""
description: ""
paths:
/endpoint:
get: {}
post: {}
/anotherEndpoint:
post: {}
You can write back data to input.yaml if you need that.
Please note that normally the comment #rest of GET definition would be preserved, but
not now as it is associated during loading with the key before it and that key gets deleted.
When I run puppet agent --test I have no errors output but the user did not create.
My puppet hira.yaml configuration is:
---
version: 5
datadir: "/etc/puppetlabs/code/environments"
data_hash: yaml_data
hierarchy:
- name: "Per-node data (yaml version)"
path: "%{::environment}/nodes/%{::trusted.certname}.yaml"
- name: "Common YAML hierarchy levels"
paths:
- "defaults/common.yaml"
- "defaults/users.yaml"
users.yaml is:
accounts::user:
joed:
locked: false
comment: System Operator
uid: '1700'
gid: '1700'
groups:
- admin
- sudonopw
sshkeys:
- ssh-rsa ...Hw== sysop+moduledevkey#puppetlabs.com
I use this module
Nothing in Hiera data itself causes anything to be applied to target nodes. Some kind of declaration is required in a manifest somewhere or in the output of an external node classifier script. Moreover, the puppetlabs/accounts module provides only defined types, not classes. You can store defined-type data in Hiera and read it back, but automated parameter binding via Hiera applies only to classes, not defined types.
In short, then, no user is created (and no error is reported) because no relevant resources are declared into the target node's catalog. You haven't given Puppet anything to do.
If you want to apply the stored user data presented to your nodes, you would want something along these lines:
$user_data = lookup('accounts::user', Hash[String,Hash], 'hash', {})
$user_data.each |$user,$props| {
accounts::user { $user: * => $props }
}
That would go into the node block matched to your target node, or, better, into a class that is declared by that node block or an equivalent. It's fairly complicated for so few lines, but in brief:
the lookup function looks up key 'accounts::user' in your Hiera data
performing a hash merge of results appearing at different levels of the hierarchy
expecting the result to be a hash with string keys and hash values
and defaulting to an empty hash if no results are found;
the mappings in the result hash are iterated, and for each one, an instance of the accounts::user defined type is declared
using the (outer) hash key as the user name,
and the value associated with that key as a mapping from parameter names to parameter values.
There are a few problems here.
You are missing a line in your hiera.yaml namely the defaults key. It should be:
---
version: 5
defaults: ## add this line
datadir: "/etc/puppetlabs/code/environments"
data_hash: yaml_data
hierarchy:
- name: "Per-node data (yaml version)"
path: "%{::environment}/nodes/%{::trusted.certname}.yaml"
- name: "Common YAML hierarchy levels"
paths:
- "defaults/common.yaml"
- "defaults/users.yaml"
I detected that using the puppet-syntax gem (included if you use PDK, which is recommended):
▶ bundle exec rake validate
Syntax OK
---> syntax:manifests
---> syntax:templates
---> syntax:hiera:yaml
ERROR: Failed to parse hiera.yaml: (hiera.yaml): mapping values are not allowed in this context at line 3 column 10
Also, in addition to what John mentioned, the simplest class to read in your data would be this:
class test (Hash[String,Hash] $users) {
create_resources(accounts::user, $users)
}
Or if you want to avoid using create_resources*:
class test (Hash[String,Hash] $users) {
$users.each |$user,$props| {
accounts::user { $user: * => $props }
}
}
Note that I have relied on the Automatic Parameter Lookup feature for that. See the link below.
Then, in your Hiera data, you would have a key named test::users to correspond (class name "test", key name "users"):
---
test::users: ## Note that this line changed.
joed:
locked: false
comment: System Operator
uid: '1700'
gid: '1700'
groups:
- admin
- sudonopw
sshkeys:
- ssh-rsa ...Hw== sysop+moduledevkey#puppetlabs.com
Use of automatic parameter lookup is generally the more idiomatic way of writing Puppet code compared to calling the lookup function explicitly.
For more info:
PDK
Automatic Parameter Lookup
create_resources
(*Note that create_resources is "controversial". Many in the Puppet community prefer not to use it.)
1. Summary
I can't find, how I can automatically prettify my YAML files.
2. Data
Example:
I have SashaPrettifyYAML.yaml file:
sasha_commands:
# Sasha comment
sasha_command_help: {call: sublime.command_help, caption: 'Sasha Command: Command Help'}
3. Expected behavior
I want to delete {braces}:
sasha_commands:
# Sasha comment
sasha_command_help:
call: sublime.command_help
caption: 'Sasha Command: Command Help'
4. Not helped
Pretty YAML (based on PyYAML) and online formatters as YAML Formatter and OnlineYAMLTools delete comments;
I can't find the required option in ruamel.yaml.cmd;
align-yaml align, not prettify YAML file.
There is no option to do this in ruamel.yaml.cmd, but it is fairly straightforward to do this with a small python program and using ruamel.yaml, by loading and dumping in round-trip mode (the default).
The only thing you need to do is make sure the flow-style on the data-structure that is the value for the key sasha_command_help is set to block-style (which is how I interpret your definition of "prettifying YAML"):
import sys
import ruamel.yaml
yaml_str = """\
sasha_commands:
# Sasha comment
sasha_command_help: {call: sublime.command_help, caption: 'Sasha Command: Command Help'}
"""
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load(yaml_str)
data['sasha_commands']['sasha_command_help'].fa.set_block_style()
yaml.dump(data, sys.stdout)
this will exactly give the output you expect.
A recursive data structure walker can be found in scalarstring.py in the ruamel.yaml source, and adapted to make a generic "make-everything-block-style" routine:
import sys
import ruamel.yaml
def block_style(base):
"""
This routine walks over a simple, i.e. consisting of dicts, lists and
primitives, tree loaded from YAML. It recurses into dict values and list
items, and sets block-style on these.
"""
if isinstance(base, dict):
for k in base:
try:
base.fa.set_block_style()
except AttributeError:
pass
block_style(base[k])
elif isinstance(base, list):
for elem in base:
try:
base.fa.set_block_style()
except AttributeError:
pass
block_style(elem)
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
file_in = sys.argv[1]
file_out = sys.argv[2]
with open(file_in) as fp:
data = yaml.load(fp)
block_style(data)
with open(file_out, 'w') as fp:
yaml.dump(data, fp)
If you store the above in prettifyyaml.py you can call it with:
python prettifyyaml.py SashaPrettifyYAML.yaml Prettified.yaml
Since you are already using single quotes around the scalar that has embedded spaces, you won't see a change if you leave out yaml.preserve_quotes = True. But if you had used a double quoted scalar then that line makes sure the double quotes are preserved.
I had the same problem. I wrote my own YAML beautifier https://github.com/wangkuiyi/yamlfmt. I hope it helps.
I tried top results from Google, but none of them address the requirements of https://sqlflow.org/sqlflow, which I am leading:
https://pypi.org/project/yamlfmt cannot handle a file of multiple YAML documents separated by ---
https://github.com/devopyio/yamlfmt cannot handle multiple files.
https://github.com/miekg/yamlfmt/blob/master/fmt.go cannot replace (inline edit) the input files.
You can use yq tool - it's easy to install and use, and it's well maintained.
Supposing you have example.yml file to format, it can be processed by following ways:
from file: yq r --unwrapScalar -p pv -P example.yml '*'
from stdin: cat example.yml | yq r --unwrapScalar -p pv -P - '*'
I am looking for a way to dynamically set the key using the path of the file below.
For example if I have this YAML:
prospectors.config:
- fields:
queue_name: <somehow get the globbed string below in here>
paths:
- /var/log/casino/*.log
type: log
output.redis:
hosts:
- "producer:6379"
key: "%{[fields.queue_name]}"
And then I had a file called /var/log/casino/test.log, then key would become test.
Im not sure that what you want is possible.
You could use the source field and configure your Redis output using that as the key:
output.redis:
hosts:
- "producer:6379"
key: "%{source}"
This would have the disadvantage of being the absolute path of the source file, not the basename as your question asks for.
If you have a small number of possible basename patterns, and want a queue for each. For example, you have files:
/common/path/test-1.log
/common/path/foo-0.log
/common/path/01-bar.log
/common/path/test-3.log
...
and wanted to have three queues in redis test, foo and bar you could use the source field and the conditionals available in the keys configuration of redis output something like this
output.redis:
hosts:
- "producer:6379"
key: "default_key"
keys:
- key: "test_key"
when.contains:
source: "test"
- key: "foo_key"
when.contains:
source: "foo"
- key: "bar_key"
when.contains:
source: "bar"