Writing to a Yaml file - yaml

I am using python to write in to a YAML file. But I am unable to write it in a specific format as I require
awesome_people:
name: Awesome People
entities:
- device_tracker.dad_smith
- device_tracker.mom_smith
I am getting problem in the entities part, as I am unable to create a list with proper indents as in the above YAML.
How can I create the above exact format?

You can do this with ruamel.yaml, setting the sequence indent from default two to four,
and the offset of the dash+space (which need a minimum indent of two), from zero to two.
You should leave the mapping indent unchanged from the default:
import sys
import ruamel.yaml
ent = ['device_tracker.dad_smith', 'device_tracker.mom_smith']
data = dict(awesome_people = dict(name='Awesome People', entities=ent))
yaml_str = """\
"""
yaml = ruamel.yaml.YAML()
yaml.indent(sequence=4, offset=2)
# print(data)
yaml.dump(data, sys.stdout)
which gives:
awesome_people:
name: Awesome People
entities:
- device_tracker.dad_smith
- device_tracker.mom_smith

Related

Remove a field from YAML OpenAPI specification

I need to remove the tags field from each of the methods in my OpenAPI spec.
The spec must be in YAML format, as converting to JSON causes issues later on when publishing.
I couldn't find a ready tool for that, and my programming skills are insufficient. I tried Python with ruamel.yaml, but could not achieve anything.
I'm open to any suggestions how to approach this - a repo with a ready tool somewhere, a hint on what to try in Python... I'm out of my own ideas.
Maybe a regex that catches all cases all instances of tags so I can do a search and replace with Python, replacing them with nothing? Empty lines don't seem to break the publishing engine.
Here's a sample YAML piece (I know this is not a proper spec, just want to show where in the YAML tags sits)
openapi: 3.0.0
info:
title: ""
description: ""
paths:
/endpoint
get:
tags:
-
tag1
-
tag3
#rest of GET definition
post:
tags:
- tag2
/anotherEndpoint
post:
tags:
- tag1
I need to get rid of all tags arrays entirely (not just make them empty)
I am not sure why you couldn't achieve anything with Python + ruamel.yaml. Assuing your spec
is in a file input.yaml:
import sys
from pathlib import Path
import ruamel.yaml
in_file = Path('input.yaml')
out_file = Path('output.yaml')
yaml = ruamel.yaml.YAML()
yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True
data = yaml.load(in_file)
# if you only have the three instances of 'tags', you can hard-code them:
# del data['paths']['/endpoint']['get']['tags']
# del data['paths']['/endpoint']['post']['tags']
# del data['paths']['/anotherEndpoint']['post']['tags']
# generic recursive removal of any key names 'tags' in the datastructure:
def rm_tags(d):
if isinstance(d, dict):
if 'tags' in d:
del d['tags']
for k in d:
rm_tags(d[k])
elif isinstance(d, list):
for item in d:
rm_tags(item)
rm_tags(data)
yaml.dump(data, out_file)
which gives as output.yaml:
openapi: 3.0.0
info:
title: ""
description: ""
paths:
/endpoint:
get: {}
post: {}
/anotherEndpoint:
post: {}
You can write back data to input.yaml if you need that.
Please note that normally the comment #rest of GET definition would be preserved, but
not now as it is associated during loading with the key before it and that key gets deleted.

pyyaml parse data with tag

I have yaml data like the input below and i need output as key value pairs
Input
a="""
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""
ouput needed
{'code': ['716', '718'], 'id': [488, 499]}
The default constructor was giving me an error. I tried adding new constructor and now its not giving me error but i am not able to get key value pairs.
FYI, If i remove the !ruby/hash:ActiveSupport::HashWithIndifferentAccess line from my yaml then it gives me desired output.
def new_constructor(loader, tag_suffix, node):
if type(node.value)=='list':
val=''.join(node.value)
else:
val=node.value
val=node.value
ret_val="""
{0}
""".format(val)
return ret_val
yaml.add_multi_constructor('', new_constructor)
yaml.load(a)
output
"\n [(ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'code'), SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'716'), ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'718')])), (ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'id'), SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:int', value=u'488'), ScalarNode(tag=u'tag:yaml.org,2002:int', value=u'499')]))]\n "
Please suggest.
This is not a solution using PyYAML, but I recommend using ruamel.yaml instead. If for no other reason, it's more actively maintained than PyYAML. A quote from the overview
Many of the bugs filed against PyYAML, but that were never acted upon, have been fixed in ruamel.yaml
To load that string, you can do
import ruamel.yaml
parser = ruamel.yaml.YAML()
obj = parser.load(a) # as defined above.
I strongly recommend following #Andrew F answer, but in case you
wonder why your code did not get the proper result, that is because
you don't correctly process the node under the tag in your tag
handling.
Although the node's value is a list (of tuples with key value pairs),
you should test for the type of the node itself (using isinstance)
and then hand it over to the "normal" mapping processing routine as
the tag is on a mapping:
import yaml
from yaml.loader import SafeLoader
a = """\
--- !ruby/hash:ActiveSupport::HashWithIndifferentAccess
code:
- '716'
- '718'
id:
- 488
- 499
"""
def new_constructor(loader, tag_suffix, node):
if isinstance(node, yaml.nodes.MappingNode):
return loader.construct_mapping(node, deep=True)
raise NotImplementedError
yaml.add_multi_constructor('', new_constructor, Loader=SafeLoader)
data = yaml.load(a, Loader=SafeLoader)
print(data)
which gives:
{'code': ['716', '718'], 'id': [488, 499]}
You should not use PyYAML's yaml.load(), it is documented to be potentially unsafe
and above all it is not necessary. Just add the new constructor to the SafeLoader.

Prettify YAML with comments

1. Summary
I can't find, how I can automatically prettify my YAML files.
2. Data
Example:
    I have SashaPrettifyYAML.yaml file:
sasha_commands:
# Sasha comment
sasha_command_help: {call: sublime.command_help, caption: 'Sasha Command: Command Help'}
3. Expected behavior
I want to delete {braces}:
sasha_commands:
# Sasha comment
sasha_command_help:
call: sublime.command_help
caption: 'Sasha Command: Command Help'
4. Not helped
Pretty YAML (based on PyYAML) and online formatters as YAML Formatter and OnlineYAMLTools delete comments;
I can't find the required option in ruamel.yaml.cmd;
align-yaml align, not prettify YAML file.
There is no option to do this in ruamel.yaml.cmd, but it is fairly straightforward to do this with a small python program and using ruamel.yaml, by loading and dumping in round-trip mode (the default).
The only thing you need to do is make sure the flow-style on the data-structure that is the value for the key sasha_command_help is set to block-style (which is how I interpret your definition of "prettifying YAML"):
import sys
import ruamel.yaml
yaml_str = """\
sasha_commands:
# Sasha comment
sasha_command_help: {call: sublime.command_help, caption: 'Sasha Command: Command Help'}
"""
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load(yaml_str)
data['sasha_commands']['sasha_command_help'].fa.set_block_style()
yaml.dump(data, sys.stdout)
this will exactly give the output you expect.
A recursive data structure walker can be found in scalarstring.py in the ruamel.yaml source, and adapted to make a generic "make-everything-block-style" routine:
import sys
import ruamel.yaml
def block_style(base):
"""
This routine walks over a simple, i.e. consisting of dicts, lists and
primitives, tree loaded from YAML. It recurses into dict values and list
items, and sets block-style on these.
"""
if isinstance(base, dict):
for k in base:
try:
base.fa.set_block_style()
except AttributeError:
pass
block_style(base[k])
elif isinstance(base, list):
for elem in base:
try:
base.fa.set_block_style()
except AttributeError:
pass
block_style(elem)
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
file_in = sys.argv[1]
file_out = sys.argv[2]
with open(file_in) as fp:
data = yaml.load(fp)
block_style(data)
with open(file_out, 'w') as fp:
yaml.dump(data, fp)
If you store the above in prettifyyaml.py you can call it with:
python prettifyyaml.py SashaPrettifyYAML.yaml Prettified.yaml
Since you are already using single quotes around the scalar that has embedded spaces, you won't see a change if you leave out yaml.preserve_quotes = True. But if you had used a double quoted scalar then that line makes sure the double quotes are preserved.
I had the same problem. I wrote my own YAML beautifier https://github.com/wangkuiyi/yamlfmt. I hope it helps.
I tried top results from Google, but none of them address the requirements of https://sqlflow.org/sqlflow, which I am leading:
https://pypi.org/project/yamlfmt cannot handle a file of multiple YAML documents separated by ---
https://github.com/devopyio/yamlfmt cannot handle multiple files.
https://github.com/miekg/yamlfmt/blob/master/fmt.go cannot replace (inline edit) the input files.
You can use yq tool - it's easy to install and use, and it's well maintained.
Supposing you have example.yml file to format, it can be processed by following ways:
from file: yq r --unwrapScalar -p pv -P example.yml '*'
from stdin: cat example.yml | yq r --unwrapScalar -p pv -P - '*'

Is there a way to dump YAML in same version it was loaded in with ruamel.yaml?

Is there a good way with ruamel.yaml to dump a YAML file back out in the same version as it is loaded in? If I have a %YAML 1.1 directive in the file I would like to be able to dump the file back out in YAML 1.1 without having to hard-code version='1.1'.
So given some data like,
%YAML 1.1
---
is_string: 'on'
is_boolean: on
I would like to avoid hard-coding version='1.1' on the round_trip_dump(),
x = f.read()
d = round_trip_load(x)
round_trip_dump(d, f, explicit_start=True)
The version of the YAML file is a fleeting value, which gets reset after loading. It was (is) my plan to make the version of the latest document loaded available somehow, but with multiple documents in a stream this needs some more thought.
For single document streams you can do the following to capture the version from the directive. This is all done with the new API. With the old API that you are using in the example the same is possible, but more difficult because there is no YAML() instance to attach attributes to:
import sys
from ruamel.yaml import YAML
from ruamel.yaml.parser import Parser
yaml_str = """\
%YAML 1.1
---
is_string: 'on'
is_boolean: on
"""
class MyParser(Parser):
def dispose(self):
self.loader.last_yaml_version = self.yaml_version
Parser.dispose(self)
yaml = YAML()
yaml.Parser = MyParser
data = yaml.load(yaml_str)
yaml2 = YAML()
yaml2.version = yaml.last_yaml_version
yaml2.dump(data, sys.stdout)
which gives:
%YAML 1.1
---
is_string: 'on'
is_boolean: true
Please note that it is necessary to create a clean, new object for output, as the "unversioned" reading doesn't fully reset the yaml instance, when encountering the %YAML 1.1 directive.
It is also possible to dump the value associated with is_boolean as on, but that would affect all booleans in the stream.

How can I include a YAML file inside another?

So I have two YAML files, "A" and "B" and I want the contents of A to be inserted inside B, either spliced into the existing data structure, like an array, or as a child of an element, like the value for a certain hash key.
Is this possible at all? How? If not, any pointers to a normative reference?
No, standard YAML does not include any kind of "import" or "include" statement.
Your question does not ask for a Python solution, but here is one using PyYAML.
PyYAML allows you to attach custom constructors (such as !include) to the YAML loader. I've included a root directory that can be set so that this solution supports relative and absolute file references.
Class-Based Solution
Here is a class-based solution, that avoids the global root variable of my original response.
See this gist for a similar, more robust Python 3 solution that uses a metaclass to register the custom constructor.
import yaml
import os
class Loader(yaml.SafeLoader):
def __init__(self, stream):
self._root = os.path.split(stream.name)[0]
super(Loader, self).__init__(stream)
def include(self, node):
filename = os.path.join(self._root, self.construct_scalar(node))
with open(filename, 'r') as f:
return yaml.load(f, Loader)
Loader.add_constructor('!include', Loader.include)
An example:
foo.yaml
a: 1
b:
- 1.43
- 543.55
c: !include bar.yaml
bar.yaml
- 3.6
- [1, 2, 3]
Now the files can be loaded using:
>>> with open('foo.yaml', 'r') as f:
>>> data = yaml.load(f, Loader)
>>> data
{'a': 1, 'b': [1.43, 543.55], 'c': [3.6, [1, 2, 3]]}
For Python users, you can try pyyaml-include.
Install
pip install pyyaml-include
Usage
import yaml
from yamlinclude import YamlIncludeConstructor
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.FullLoader, base_dir='/your/conf/dir')
with open('0.yaml') as f:
data = yaml.load(f, Loader=yaml.FullLoader)
print(data)
Consider we have such YAML files:
├── 0.yaml
└── include.d
├── 1.yaml
└── 2.yaml
1.yaml 's content:
name: "1"
2.yaml 's content:
name: "2"
Include files by name
On top level:
If 0.yaml was:
!include include.d/1.yaml
We'll get:
{"name": "1"}
In mapping:
If 0.yaml was:
file1: !include include.d/1.yaml
file2: !include include.d/2.yaml
We'll get:
file1:
name: "1"
file2:
name: "2"
In sequence:
If 0.yaml was:
files:
- !include include.d/1.yaml
- !include include.d/2.yaml
We'll get:
files:
- name: "1"
- name: "2"
ℹ Note:
File name can be either absolute (like /usr/conf/1.5/Make.yml) or relative (like ../../cfg/img.yml).
Include files by wildcards
File name can contain shell-style wildcards. Data loaded from the file(s) found by wildcards will be set in a sequence.
If 0.yaml was:
files: !include include.d/*.yaml
We'll get:
files:
- name: "1"
- name: "2"
ℹ Note:
For Python>=3.5, if recursive argument of !include YAML tag is true, the pattern “**” will match any files and zero or more directories and subdirectories.
Using the “**” pattern in large directory trees may consume an inordinate amount of time because of recursive search.
In order to enable recursive argument, we shall write the !include tag in Mapping or Sequence mode:
Arguments in Sequence mode:
!include [tests/data/include.d/**/*.yaml, true]
Arguments in Mapping mode:
!include {pathname: tests/data/include.d/**/*.yaml, recursive: true}
Includes are not directly supported in YAML as far as I know, you will have to provide a mechanism yourself however, this is generally easy to do.
I have used YAML as a configuration language in my python apps, and in this case often define a convention like this:
>>> main.yml <<<
includes: [ wibble.yml, wobble.yml]
Then in my (python) code I do:
import yaml
cfg = yaml.load(open("main.yml"))
for inc in cfg.get("includes", []):
cfg.update(yaml.load(open(inc)))
The only down side is that variables in the includes will always override the variables in main, and there is no way to change that precedence by changing where the "includes: statement appears in the main.yml file.
On a slightly different point, YAML doesn't support includes as its not really designed as as exclusively as a file based mark up. What would an include mean if you got it in a response to an AJAX request?
The YML standard does not specify a way to do this. And this problem does not limit itself to YML. JSON has the same limitations.
Many applications which use YML or JSON based configurations run into this problem eventually. And when that happens, they make up their own convention.
e.g. for swagger API definitions:
$ref: 'file.yml'
e.g. for docker compose configurations:
services:
app:
extends:
file: docker-compose.base.yml
Alternatively, if you want to split up the content of a yml file in multiple files, like a tree of content, you can define your own folder-structure convention and use an (existing) merge script.
Expanding on #Josh_Bode's answer, here's my own PyYAML solution, which has the advantage of being a self-contained subclass of yaml.Loader. It doesn't depend on any module-level globals, or on modifying the global state of the yaml module.
import yaml, os
class IncludeLoader(yaml.Loader):
"""
yaml.Loader subclass handles "!include path/to/foo.yml" directives in config
files. When constructed with a file object, the root path for includes
defaults to the directory containing the file, otherwise to the current
working directory. In either case, the root path can be overridden by the
`root` keyword argument.
When an included file F contain its own !include directive, the path is
relative to F's location.
Example:
YAML file /home/frodo/one-ring.yml:
---
Name: The One Ring
Specials:
- resize-to-wearer
Effects:
- !include path/to/invisibility.yml
YAML file /home/frodo/path/to/invisibility.yml:
---
Name: invisibility
Message: Suddenly you disappear!
Loading:
data = IncludeLoader(open('/home/frodo/one-ring.yml', 'r')).get_data()
Result:
{'Effects': [{'Message': 'Suddenly you disappear!', 'Name':
'invisibility'}], 'Name': 'The One Ring', 'Specials':
['resize-to-wearer']}
"""
def __init__(self, *args, **kwargs):
super(IncludeLoader, self).__init__(*args, **kwargs)
self.add_constructor('!include', self._include)
if 'root' in kwargs:
self.root = kwargs['root']
elif isinstance(self.stream, file):
self.root = os.path.dirname(self.stream.name)
else:
self.root = os.path.curdir
def _include(self, loader, node):
oldRoot = self.root
filename = os.path.join(self.root, loader.construct_scalar(node))
self.root = os.path.dirname(filename)
data = yaml.load(open(filename, 'r'))
self.root = oldRoot
return data
With Yglu, you can import other files like this:
A.yaml
foo: !? $import('B.yaml')
B.yaml
bar: Hello
$ yglu A.yaml
foo:
bar: Hello
As $import is a function, you can also pass an expression as argument:
dep: !- b
foo: !? $import($_.dep.toUpper() + '.yaml')
This would give the same output as above.
Disclaimer: I am the author of Yglu.
Standard YAML 1.2 doesn't include natively this feature. Nevertheless many implementations provides some extension to do so.
I present a way of achieving it with Java and snakeyaml:1.24 (Java library to parse/emit YAML files) that allows creating a custom YAML tag to achieve the following goal (you will see I'm using it to load test suites defined in several YAML files and that I made it work as a list of includes for a target test: node):
# ... yaml prev stuff
tests: !include
- '1.hello-test-suite.yaml'
- '3.foo-test-suite.yaml'
- '2.bar-test-suite.yaml'
# ... more yaml document
Here is the one-class Java that allows processing the !include tag. Files are loaded from classpath (Maven resources directory):
/**
* Custom YAML loader. It adds support to the custom !include tag which allows splitting a YAML file across several
* files for a better organization of YAML tests.
*/
#Slf4j // <-- This is a Lombok annotation to auto-generate logger
public class MyYamlLoader {
private static final Constructor CUSTOM_CONSTRUCTOR = new MyYamlConstructor();
private MyYamlLoader() {
}
/**
* Parse the only YAML document in a stream and produce the Java Map. It provides support for the custom !include
* YAML tag to split YAML contents across several files.
*/
public static Map<String, Object> load(InputStream inputStream) {
return new Yaml(CUSTOM_CONSTRUCTOR)
.load(inputStream);
}
/**
* Custom SnakeYAML constructor that registers custom tags.
*/
private static class MyYamlConstructor extends Constructor {
private static final String TAG_INCLUDE = "!include";
MyYamlConstructor() {
// Register custom tags
yamlConstructors.put(new Tag(TAG_INCLUDE), new IncludeConstruct());
}
/**
* The actual include tag construct.
*/
private static class IncludeConstruct implements Construct {
#Override
public Object construct(Node node) {
List<Node> inclusions = castToSequenceNode(node);
return parseInclusions(inclusions);
}
#Override
public void construct2ndStep(Node node, Object object) {
// do nothing
}
private List<Node> castToSequenceNode(Node node) {
try {
return ((SequenceNode) node).getValue();
} catch (ClassCastException e) {
throw new IllegalArgumentException(String.format("The !import value must be a sequence node, but " +
"'%s' found.", node));
}
}
private Object parseInclusions(List<Node> inclusions) {
List<InputStream> inputStreams = inputStreams(inclusions);
try (final SequenceInputStream sequencedInputStream =
new SequenceInputStream(Collections.enumeration(inputStreams))) {
return new Yaml(CUSTOM_CONSTRUCTOR)
.load(sequencedInputStream);
} catch (IOException e) {
log.error("Error closing the stream.", e);
return null;
}
}
private List<InputStream> inputStreams(List<Node> scalarNodes) {
return scalarNodes.stream()
.map(this::inputStream)
.collect(toList());
}
private InputStream inputStream(Node scalarNode) {
String filePath = castToScalarNode(scalarNode).getValue();
final InputStream is = getClass().getClassLoader().getResourceAsStream(filePath);
Assert.notNull(is, String.format("Resource file %s not found.", filePath));
return is;
}
private ScalarNode castToScalarNode(Node scalarNode) {
try {
return ((ScalarNode) scalarNode);
} catch (ClassCastException e) {
throw new IllegalArgumentException(String.format("The value must be a scalar node, but '%s' found" +
".", scalarNode));
}
}
}
}
}
Unfortunately YAML doesn't provide this in its standard.
But if you are using Ruby, there is a gem providing the functionality you are asking for by extending the ruby YAML library:
https://github.com/entwanderer/yaml_extend
I make some examples for your reference.
import yaml
main_yaml = """
Package:
- !include _shape_yaml
- !include _path_yaml
"""
_shape_yaml = """
# Define
Rectangle: &id_Rectangle
name: Rectangle
width: &Rectangle_width 20
height: &Rectangle_height 10
area: !product [*Rectangle_width, *Rectangle_height]
Circle: &id_Circle
name: Circle
radius: &Circle_radius 5
area: !product [*Circle_radius, *Circle_radius, pi]
# Setting
Shape:
property: *id_Rectangle
color: red
"""
_path_yaml = """
# Define
Root: &BASE /path/src/
Paths:
a: &id_path_a !join [*BASE, a]
b: &id_path_b !join [*BASE, b]
# Setting
Path:
input_file: *id_path_a
"""
# define custom tag handler
def yaml_import(loader, node):
other_yaml_file = loader.construct_scalar(node)
return yaml.load(eval(other_yaml_file), Loader=yaml.SafeLoader)
def yaml_product(loader, node):
import math
list_data = loader.construct_sequence(node)
result = 1
pi = math.pi
for val in list_data:
result *= eval(val) if isinstance(val, str) else val
return result
def yaml_join(loader, node):
seq = loader.construct_sequence(node)
return ''.join([str(i) for i in seq])
def yaml_ref(loader, node):
ref = loader.construct_sequence(node)
return ref[0]
def yaml_dict_ref(loader: yaml.loader.SafeLoader, node):
dict_data, key, const_value = loader.construct_sequence(node)
return dict_data[key] + str(const_value)
def main():
# register the tag handler
yaml.SafeLoader.add_constructor(tag='!include', constructor=yaml_import)
yaml.SafeLoader.add_constructor(tag='!product', constructor=yaml_product)
yaml.SafeLoader.add_constructor(tag='!join', constructor=yaml_join)
yaml.SafeLoader.add_constructor(tag='!ref', constructor=yaml_ref)
yaml.SafeLoader.add_constructor(tag='!dict_ref', constructor=yaml_dict_ref)
config = yaml.load(main_yaml, Loader=yaml.SafeLoader)
pk_shape, pk_path = config['Package']
pk_shape, pk_path = pk_shape['Shape'], pk_path['Path']
print(f"shape name: {pk_shape['property']['name']}")
print(f"shape area: {pk_shape['property']['area']}")
print(f"shape color: {pk_shape['color']}")
print(f"input file: {pk_path['input_file']}")
if __name__ == '__main__':
main()
output
shape name: Rectangle
shape area: 200
shape color: red
input file: /path/src/a
Update 2
and you can combine it, like this
# xxx.yaml
CREATE_FONT_PICTURE:
PROJECTS:
SUNG: &id_SUNG
name: SUNG
work_dir: SUNG
output_dir: temp
font_pixel: 24
DEFINE: &id_define !ref [*id_SUNG] # you can use config['CREATE_FONT_PICTURE']['DEFINE'][name, work_dir, ... font_pixel]
AUTO_INIT:
basename_suffix: !dict_ref [*id_define, name, !product [5, 3, 2]] # SUNG30
# ↓ This is not correct.
# basename_suffix: !dict_ref [*id_define, name, !product [5, 3, 2]] # It will build by Deep-level. id_define is Deep-level: 2. So you must put it after 2. otherwise, it can't refer to the correct value.
With Symfony, its handling of yaml will indirectly allow you to nest yaml files. The trick is to make use of the parameters option. eg:
common.yml
parameters:
yaml_to_repeat:
option: "value"
foo:
- "bar"
- "baz"
config.yml
imports:
- { resource: common.yml }
whatever:
thing: "%yaml_to_repeat%"
other_thing: "%yaml_to_repeat%"
The result will be the same as:
whatever:
thing:
option: "value"
foo:
- "bar"
- "baz"
other_thing:
option: "value"
foo:
- "bar"
- "baz"
I think the solution used by #maxy-B looks great. However, it didn't succeed for me with nested inclusions. For example if config_1.yaml includes config_2.yaml, which includes config_3.yaml there was a problem with the loader. However, if you simply point the new loader class to itself on load, it works! Specifically, if we replace the old _include function with the very slightly modified version:
def _include(self, loader, node):
oldRoot = self.root
filename = os.path.join(self.root, loader.construct_scalar(node))
self.root = os.path.dirname(filename)
data = yaml.load(open(filename, 'r'), loader = IncludeLoader)
self.root = oldRoot
return data
Upon reflection I agree with the other comments, that nested loading is not appropriate for yaml in general as the input stream may not be a file, but it is very useful!
Based on previous posts:
class SimYamlLoader(yaml.SafeLoader):
'''
Simple custom yaml loader that supports include, e.g:
main.yaml:
- !include file1.yaml
- !include dir/file2.yaml
'''
def __init__(self, stream):
self.root = os.path.split(stream.name)[0]
super().__init__(stream)
def _include(loader, node):
filename = os.path.join(loader.root, loader.construct_scalar(node))
with open(filename, 'r') as f:
return yaml.load(f, SimYamlLoader)
SimYamlLoader.add_constructor('!include', _include)
# example:
with open('main.yaml', 'r') as f:
lists = yaml.load(f, SimYamlLoader)
# if you want to merge the lists
data = functools.reduce(
lambda x, y: x if y is None else {**x, **dict(y)}, lists, {})
# python 3.10+:lambda x, y: x if y is None else x | dict(y), lists, {})
Maybe this could inspire you, try to align to jbb conventions:
https://docs.openstack.org/infra/jenkins-job-builder/definition.html#inclusion-tags
- job:
name: test-job-include-raw-1
builders:
- shell:
!include-raw: include-raw001-hello-world.sh
Adding on #Joshbode's initial answer above , I modified the snippet a little to support UNIX style wild card patterns.
I haven't tested in windows though. I was facing an issue of splitting an array in a large yaml across multiple files for easy maintenance and was looking for a solution to refer multiple files within a same array of the base yaml. Hence the below solution. The solution does not support recursive reference. It only supports wildcards in a given directory level referenced in the base yaml.
import yaml
import os
import glob
# Base code taken from below link :-
# Ref:https://stackoverflow.com/a/9577670
class Loader(yaml.SafeLoader):
def __init__(self, stream):
self._root = os.path.split(stream.name)[0]
super(Loader, self).__init__(stream)
def include(self, node):
consolidated_result = None
filename = os.path.join(self._root, self.construct_scalar(node))
# Below section is modified for supporting UNIX wildcard patterns
filenames = glob.glob(filename)
# Just to ensure the order of files considered are predictable
# and easy to debug in case of errors.
filenames.sort()
for file in filenames:
with open(file, 'r') as f:
result = yaml.load(f, Loader)
if isinstance(result, list):
if not isinstance(consolidated_result, list):
consolidated_result = []
consolidated_result += result
elif isinstance(result, dict):
if not isinstance(consolidated_result, dict):
consolidated_result = {}
consolidated_result.update(result)
else:
consolidated_result = result
return consolidated_result
Loader.add_constructor('!include', Loader.include)
Usage
a:
!include a.yaml
b:
# All yamls included within b folder level will be consolidated
!include b/*.yaml
Combining other answers, here is a short solution without overloading Loader class and it works with any loader operating on files:
import json
from pathlib import Path
from typing import Any
import yaml
def yaml_include_constructor(loader: yaml.BaseLoader, node: yaml.Node) -> Any:
"""Include file referenced with !include node"""
# noinspection PyTypeChecker
fp = Path(loader.name).parent.joinpath(loader.construct_scalar(node)).resolve()
fe = fp.suffix.lstrip(".")
with open(fp, 'r') as f:
if fe in ("yaml", "yml"):
return yaml.load(f, type(loader))
elif fe in ("json", "jsn"):
return json.load(f)
else:
return f.read()
def main():
loader = yaml.SafeLoader # Works with any loader
loader.add_constructor("!include", yaml_include_constructor)
with open(...) as f:
yml = yaml.load(f, loader)
# noinspection PyTypeChecker is there to prevent PEP-check warning Expected type 'ScalarNode', got 'Node' instead when passing node: yaml.Node to loader.construct_scalar().
This solution fails if yaml.load input stream is not file stream, as loader.name does not contain the path in that case:
class Reader(object):
...
def __init__(self, stream):
...
if isinstance(stream, str):
self.name = "<unicode string>"
...
elif isinstance(stream, bytes):
self.name = "<byte string>"
...
else:
self.name = getattr(stream, 'name', "<file>")
...
In my use case, I know that only YAML files will be included, so the solution can be simplified further:
def yaml_include_constructor(loader: yaml.Loader, node: yaml.Node) -> Any:
"""Include YAML file referenced with !include node"""
with open(Path(loader.name).parent.joinpath(loader.construct_yaml_str(node)).resolve(), 'r') as f:
return yaml.load(f, type(loader))
Loader = yaml.SafeLoader # Works with any loader
Loader.add_constructor("!include", yaml_include_constructor)
def main():
with open(...) as f:
yml = yaml.load(f, Loader=Loader)
or even one-liner using lambda:
Loader = yaml.SafeLoader # Works with any loader
Loader.add_constructor("!include",
lambda l, n: yaml.load(Path(l.name).parent.joinpath(l.construct_scalar(n)).read_text(), type(l)))
Probably it was not supported when question was asked but you can import other YAML file into one:
imports: [/your_location_to_yaml_file/Util.area.yaml]
Though I don't have any online reference but this works for me.

Resources