I am building an open source project using Ruby for testing HTTP services: https://github.com/Comcast/http-blackbox-test-tool
I want to be able to reference environment variables in my test-plan.yaml file. I could use ERB, however I don't want to support embedding any random Ruby code and ERB syntax is odd for non-rubyists, I just want to access environment variables using the commonly used Unix style ${ENV_VAR} syntax.
e.g.
order-lunch-app-health:
request:
url: ${ORDER_APP_URL}
headers:
content-type: 'application/text'
method: get
expectedResponse:
statusCode: 200
maxRetryCount: 5
All examples I have found for Ruby use ERB. Does anyone have a suggestion on the best way to deal with this? I an open to using another tool to preprocess the YAML and then send that to the Ruby application.
I believe something like this should work under most circumstances:
require 'yaml'
def load_yaml(file)
content = File.read file
content.gsub! /\${([^}]+)}/ do
ENV[$1]
end
YAML.load content
end
p load_yaml 'sample.yml'
As opposed to my original answer, this is both simpler and handles undefined ENV variables well.
Try with this YAML:
# sample.yml
path: ${PATH}
home: ${HOME}
error: ${NO_SUCH_VAR}
Original Answer (left here for reference)
There are several ways to do it. If you want to allow your users to use the ${VAR} syntax, then perhaps one way would be to first convert these variables to Ruby string substitution format %{VAR} and then evaluate all environment variables together.
Here is a rough proof of concept:
require 'yaml'
# Transform environments to a hash of { symbol: value }
env_hash = ENV.to_h.transform_keys(&:to_sym)
# Load the file and convert ${ANYTHING} to %{ANYTHING}
content = File.read 'sample.yml'
content.gsub! /\${([^}]+)}/, "%{\\1}"
# Use Ruby string substitution to replace %{VARS}
content %= env_hash
# Done
yaml = YAML.load content
p yaml
Use it with this sample.yml for instance:
# sample.yml
path: ${PATH}
home: ${HOME}
There are many ways this can be improved upon of course.
Preprocessing is easy, and I recommend you use a YAML loaderd/dumper
based solution, as the replacement might require quotes around the
replacement scalar. (E.g. you substitute the string true, if that
were not quoted, the resulting YAML would be read as a boolean).
Assuming your "source" is in input.yaml and your env. variable
ORDER_APP_URL set to https://some.site/and/url. And the following
script in expand.py:
import sys
import os
from pathlib import Path
import ruamel.yaml
def substenv(d, env):
if isinstance(d, dict):
for k, v in d.items():
if isinstance(v, str) and '${' in v:
d[k] = v.replace('${', '{').format(**env)
else:
substenv(v, env)
elif isinstance(d, list):
for idx, item in enumerate(d):
if isinstance(v, str) and '${' in v:
d[idx] = item.replace('${', '{').format(**env)
else:
substenv(item, env)
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
data = yaml.load(Path(sys.argv[1]))
substenv(data, os.environ)
yaml.dump(data, Path(sys.argv[2]))
You can then do:
python expand.py input.yaml output.yaml
which writes output.yaml:
order-lunch-app-health:
request:
url: https://some.site/and/url
headers:
content-type: 'application/text'
method: get
expectedResponse:
statusCode: 200
maxRetryCount: 5
Please note that the spurious quotes around 'application/text' are preserved, as would be any comments
in the original file.
Quotes around the substituted URL are not necessary, but the would have been added if they were.
The substenv routine recursively traverses the loaded data, and substitutes even if the substitution is in mid-scalar, and if there are more than substitution in one scalar. You can "tighten" the test:
if isinstance(v, str) and '${' in v:
if that would match too many strings loaded from YAML.
So I have two YAML files, "A" and "B" and I want the contents of A to be inserted inside B, either spliced into the existing data structure, like an array, or as a child of an element, like the value for a certain hash key.
Is this possible at all? How? If not, any pointers to a normative reference?
No, standard YAML does not include any kind of "import" or "include" statement.
Your question does not ask for a Python solution, but here is one using PyYAML.
PyYAML allows you to attach custom constructors (such as !include) to the YAML loader. I've included a root directory that can be set so that this solution supports relative and absolute file references.
Class-Based Solution
Here is a class-based solution, that avoids the global root variable of my original response.
See this gist for a similar, more robust Python 3 solution that uses a metaclass to register the custom constructor.
import yaml
import os
class Loader(yaml.SafeLoader):
def __init__(self, stream):
self._root = os.path.split(stream.name)[0]
super(Loader, self).__init__(stream)
def include(self, node):
filename = os.path.join(self._root, self.construct_scalar(node))
with open(filename, 'r') as f:
return yaml.load(f, Loader)
Loader.add_constructor('!include', Loader.include)
An example:
foo.yaml
a: 1
b:
- 1.43
- 543.55
c: !include bar.yaml
bar.yaml
- 3.6
- [1, 2, 3]
Now the files can be loaded using:
>>> with open('foo.yaml', 'r') as f:
>>> data = yaml.load(f, Loader)
>>> data
{'a': 1, 'b': [1.43, 543.55], 'c': [3.6, [1, 2, 3]]}
For Python users, you can try pyyaml-include.
Install
pip install pyyaml-include
Usage
import yaml
from yamlinclude import YamlIncludeConstructor
YamlIncludeConstructor.add_to_loader_class(loader_class=yaml.FullLoader, base_dir='/your/conf/dir')
with open('0.yaml') as f:
data = yaml.load(f, Loader=yaml.FullLoader)
print(data)
Consider we have such YAML files:
├── 0.yaml
└── include.d
├── 1.yaml
└── 2.yaml
1.yaml 's content:
name: "1"
2.yaml 's content:
name: "2"
Include files by name
On top level:
If 0.yaml was:
!include include.d/1.yaml
We'll get:
{"name": "1"}
In mapping:
If 0.yaml was:
file1: !include include.d/1.yaml
file2: !include include.d/2.yaml
We'll get:
file1:
name: "1"
file2:
name: "2"
In sequence:
If 0.yaml was:
files:
- !include include.d/1.yaml
- !include include.d/2.yaml
We'll get:
files:
- name: "1"
- name: "2"
ℹ Note:
File name can be either absolute (like /usr/conf/1.5/Make.yml) or relative (like ../../cfg/img.yml).
Include files by wildcards
File name can contain shell-style wildcards. Data loaded from the file(s) found by wildcards will be set in a sequence.
If 0.yaml was:
files: !include include.d/*.yaml
We'll get:
files:
- name: "1"
- name: "2"
ℹ Note:
For Python>=3.5, if recursive argument of !include YAML tag is true, the pattern “**” will match any files and zero or more directories and subdirectories.
Using the “**” pattern in large directory trees may consume an inordinate amount of time because of recursive search.
In order to enable recursive argument, we shall write the !include tag in Mapping or Sequence mode:
Arguments in Sequence mode:
!include [tests/data/include.d/**/*.yaml, true]
Arguments in Mapping mode:
!include {pathname: tests/data/include.d/**/*.yaml, recursive: true}
Includes are not directly supported in YAML as far as I know, you will have to provide a mechanism yourself however, this is generally easy to do.
I have used YAML as a configuration language in my python apps, and in this case often define a convention like this:
>>> main.yml <<<
includes: [ wibble.yml, wobble.yml]
Then in my (python) code I do:
import yaml
cfg = yaml.load(open("main.yml"))
for inc in cfg.get("includes", []):
cfg.update(yaml.load(open(inc)))
The only down side is that variables in the includes will always override the variables in main, and there is no way to change that precedence by changing where the "includes: statement appears in the main.yml file.
On a slightly different point, YAML doesn't support includes as its not really designed as as exclusively as a file based mark up. What would an include mean if you got it in a response to an AJAX request?
The YML standard does not specify a way to do this. And this problem does not limit itself to YML. JSON has the same limitations.
Many applications which use YML or JSON based configurations run into this problem eventually. And when that happens, they make up their own convention.
e.g. for swagger API definitions:
$ref: 'file.yml'
e.g. for docker compose configurations:
services:
app:
extends:
file: docker-compose.base.yml
Alternatively, if you want to split up the content of a yml file in multiple files, like a tree of content, you can define your own folder-structure convention and use an (existing) merge script.
Expanding on #Josh_Bode's answer, here's my own PyYAML solution, which has the advantage of being a self-contained subclass of yaml.Loader. It doesn't depend on any module-level globals, or on modifying the global state of the yaml module.
import yaml, os
class IncludeLoader(yaml.Loader):
"""
yaml.Loader subclass handles "!include path/to/foo.yml" directives in config
files. When constructed with a file object, the root path for includes
defaults to the directory containing the file, otherwise to the current
working directory. In either case, the root path can be overridden by the
`root` keyword argument.
When an included file F contain its own !include directive, the path is
relative to F's location.
Example:
YAML file /home/frodo/one-ring.yml:
---
Name: The One Ring
Specials:
- resize-to-wearer
Effects:
- !include path/to/invisibility.yml
YAML file /home/frodo/path/to/invisibility.yml:
---
Name: invisibility
Message: Suddenly you disappear!
Loading:
data = IncludeLoader(open('/home/frodo/one-ring.yml', 'r')).get_data()
Result:
{'Effects': [{'Message': 'Suddenly you disappear!', 'Name':
'invisibility'}], 'Name': 'The One Ring', 'Specials':
['resize-to-wearer']}
"""
def __init__(self, *args, **kwargs):
super(IncludeLoader, self).__init__(*args, **kwargs)
self.add_constructor('!include', self._include)
if 'root' in kwargs:
self.root = kwargs['root']
elif isinstance(self.stream, file):
self.root = os.path.dirname(self.stream.name)
else:
self.root = os.path.curdir
def _include(self, loader, node):
oldRoot = self.root
filename = os.path.join(self.root, loader.construct_scalar(node))
self.root = os.path.dirname(filename)
data = yaml.load(open(filename, 'r'))
self.root = oldRoot
return data
With Yglu, you can import other files like this:
A.yaml
foo: !? $import('B.yaml')
B.yaml
bar: Hello
$ yglu A.yaml
foo:
bar: Hello
As $import is a function, you can also pass an expression as argument:
dep: !- b
foo: !? $import($_.dep.toUpper() + '.yaml')
This would give the same output as above.
Disclaimer: I am the author of Yglu.
Standard YAML 1.2 doesn't include natively this feature. Nevertheless many implementations provides some extension to do so.
I present a way of achieving it with Java and snakeyaml:1.24 (Java library to parse/emit YAML files) that allows creating a custom YAML tag to achieve the following goal (you will see I'm using it to load test suites defined in several YAML files and that I made it work as a list of includes for a target test: node):
# ... yaml prev stuff
tests: !include
- '1.hello-test-suite.yaml'
- '3.foo-test-suite.yaml'
- '2.bar-test-suite.yaml'
# ... more yaml document
Here is the one-class Java that allows processing the !include tag. Files are loaded from classpath (Maven resources directory):
/**
* Custom YAML loader. It adds support to the custom !include tag which allows splitting a YAML file across several
* files for a better organization of YAML tests.
*/
#Slf4j // <-- This is a Lombok annotation to auto-generate logger
public class MyYamlLoader {
private static final Constructor CUSTOM_CONSTRUCTOR = new MyYamlConstructor();
private MyYamlLoader() {
}
/**
* Parse the only YAML document in a stream and produce the Java Map. It provides support for the custom !include
* YAML tag to split YAML contents across several files.
*/
public static Map<String, Object> load(InputStream inputStream) {
return new Yaml(CUSTOM_CONSTRUCTOR)
.load(inputStream);
}
/**
* Custom SnakeYAML constructor that registers custom tags.
*/
private static class MyYamlConstructor extends Constructor {
private static final String TAG_INCLUDE = "!include";
MyYamlConstructor() {
// Register custom tags
yamlConstructors.put(new Tag(TAG_INCLUDE), new IncludeConstruct());
}
/**
* The actual include tag construct.
*/
private static class IncludeConstruct implements Construct {
#Override
public Object construct(Node node) {
List<Node> inclusions = castToSequenceNode(node);
return parseInclusions(inclusions);
}
#Override
public void construct2ndStep(Node node, Object object) {
// do nothing
}
private List<Node> castToSequenceNode(Node node) {
try {
return ((SequenceNode) node).getValue();
} catch (ClassCastException e) {
throw new IllegalArgumentException(String.format("The !import value must be a sequence node, but " +
"'%s' found.", node));
}
}
private Object parseInclusions(List<Node> inclusions) {
List<InputStream> inputStreams = inputStreams(inclusions);
try (final SequenceInputStream sequencedInputStream =
new SequenceInputStream(Collections.enumeration(inputStreams))) {
return new Yaml(CUSTOM_CONSTRUCTOR)
.load(sequencedInputStream);
} catch (IOException e) {
log.error("Error closing the stream.", e);
return null;
}
}
private List<InputStream> inputStreams(List<Node> scalarNodes) {
return scalarNodes.stream()
.map(this::inputStream)
.collect(toList());
}
private InputStream inputStream(Node scalarNode) {
String filePath = castToScalarNode(scalarNode).getValue();
final InputStream is = getClass().getClassLoader().getResourceAsStream(filePath);
Assert.notNull(is, String.format("Resource file %s not found.", filePath));
return is;
}
private ScalarNode castToScalarNode(Node scalarNode) {
try {
return ((ScalarNode) scalarNode);
} catch (ClassCastException e) {
throw new IllegalArgumentException(String.format("The value must be a scalar node, but '%s' found" +
".", scalarNode));
}
}
}
}
}
Unfortunately YAML doesn't provide this in its standard.
But if you are using Ruby, there is a gem providing the functionality you are asking for by extending the ruby YAML library:
https://github.com/entwanderer/yaml_extend
I make some examples for your reference.
import yaml
main_yaml = """
Package:
- !include _shape_yaml
- !include _path_yaml
"""
_shape_yaml = """
# Define
Rectangle: &id_Rectangle
name: Rectangle
width: &Rectangle_width 20
height: &Rectangle_height 10
area: !product [*Rectangle_width, *Rectangle_height]
Circle: &id_Circle
name: Circle
radius: &Circle_radius 5
area: !product [*Circle_radius, *Circle_radius, pi]
# Setting
Shape:
property: *id_Rectangle
color: red
"""
_path_yaml = """
# Define
Root: &BASE /path/src/
Paths:
a: &id_path_a !join [*BASE, a]
b: &id_path_b !join [*BASE, b]
# Setting
Path:
input_file: *id_path_a
"""
# define custom tag handler
def yaml_import(loader, node):
other_yaml_file = loader.construct_scalar(node)
return yaml.load(eval(other_yaml_file), Loader=yaml.SafeLoader)
def yaml_product(loader, node):
import math
list_data = loader.construct_sequence(node)
result = 1
pi = math.pi
for val in list_data:
result *= eval(val) if isinstance(val, str) else val
return result
def yaml_join(loader, node):
seq = loader.construct_sequence(node)
return ''.join([str(i) for i in seq])
def yaml_ref(loader, node):
ref = loader.construct_sequence(node)
return ref[0]
def yaml_dict_ref(loader: yaml.loader.SafeLoader, node):
dict_data, key, const_value = loader.construct_sequence(node)
return dict_data[key] + str(const_value)
def main():
# register the tag handler
yaml.SafeLoader.add_constructor(tag='!include', constructor=yaml_import)
yaml.SafeLoader.add_constructor(tag='!product', constructor=yaml_product)
yaml.SafeLoader.add_constructor(tag='!join', constructor=yaml_join)
yaml.SafeLoader.add_constructor(tag='!ref', constructor=yaml_ref)
yaml.SafeLoader.add_constructor(tag='!dict_ref', constructor=yaml_dict_ref)
config = yaml.load(main_yaml, Loader=yaml.SafeLoader)
pk_shape, pk_path = config['Package']
pk_shape, pk_path = pk_shape['Shape'], pk_path['Path']
print(f"shape name: {pk_shape['property']['name']}")
print(f"shape area: {pk_shape['property']['area']}")
print(f"shape color: {pk_shape['color']}")
print(f"input file: {pk_path['input_file']}")
if __name__ == '__main__':
main()
output
shape name: Rectangle
shape area: 200
shape color: red
input file: /path/src/a
Update 2
and you can combine it, like this
# xxx.yaml
CREATE_FONT_PICTURE:
PROJECTS:
SUNG: &id_SUNG
name: SUNG
work_dir: SUNG
output_dir: temp
font_pixel: 24
DEFINE: &id_define !ref [*id_SUNG] # you can use config['CREATE_FONT_PICTURE']['DEFINE'][name, work_dir, ... font_pixel]
AUTO_INIT:
basename_suffix: !dict_ref [*id_define, name, !product [5, 3, 2]] # SUNG30
# ↓ This is not correct.
# basename_suffix: !dict_ref [*id_define, name, !product [5, 3, 2]] # It will build by Deep-level. id_define is Deep-level: 2. So you must put it after 2. otherwise, it can't refer to the correct value.
With Symfony, its handling of yaml will indirectly allow you to nest yaml files. The trick is to make use of the parameters option. eg:
common.yml
parameters:
yaml_to_repeat:
option: "value"
foo:
- "bar"
- "baz"
config.yml
imports:
- { resource: common.yml }
whatever:
thing: "%yaml_to_repeat%"
other_thing: "%yaml_to_repeat%"
The result will be the same as:
whatever:
thing:
option: "value"
foo:
- "bar"
- "baz"
other_thing:
option: "value"
foo:
- "bar"
- "baz"
I think the solution used by #maxy-B looks great. However, it didn't succeed for me with nested inclusions. For example if config_1.yaml includes config_2.yaml, which includes config_3.yaml there was a problem with the loader. However, if you simply point the new loader class to itself on load, it works! Specifically, if we replace the old _include function with the very slightly modified version:
def _include(self, loader, node):
oldRoot = self.root
filename = os.path.join(self.root, loader.construct_scalar(node))
self.root = os.path.dirname(filename)
data = yaml.load(open(filename, 'r'), loader = IncludeLoader)
self.root = oldRoot
return data
Upon reflection I agree with the other comments, that nested loading is not appropriate for yaml in general as the input stream may not be a file, but it is very useful!
Based on previous posts:
class SimYamlLoader(yaml.SafeLoader):
'''
Simple custom yaml loader that supports include, e.g:
main.yaml:
- !include file1.yaml
- !include dir/file2.yaml
'''
def __init__(self, stream):
self.root = os.path.split(stream.name)[0]
super().__init__(stream)
def _include(loader, node):
filename = os.path.join(loader.root, loader.construct_scalar(node))
with open(filename, 'r') as f:
return yaml.load(f, SimYamlLoader)
SimYamlLoader.add_constructor('!include', _include)
# example:
with open('main.yaml', 'r') as f:
lists = yaml.load(f, SimYamlLoader)
# if you want to merge the lists
data = functools.reduce(
lambda x, y: x if y is None else {**x, **dict(y)}, lists, {})
# python 3.10+:lambda x, y: x if y is None else x | dict(y), lists, {})
Maybe this could inspire you, try to align to jbb conventions:
https://docs.openstack.org/infra/jenkins-job-builder/definition.html#inclusion-tags
- job:
name: test-job-include-raw-1
builders:
- shell:
!include-raw: include-raw001-hello-world.sh
Adding on #Joshbode's initial answer above , I modified the snippet a little to support UNIX style wild card patterns.
I haven't tested in windows though. I was facing an issue of splitting an array in a large yaml across multiple files for easy maintenance and was looking for a solution to refer multiple files within a same array of the base yaml. Hence the below solution. The solution does not support recursive reference. It only supports wildcards in a given directory level referenced in the base yaml.
import yaml
import os
import glob
# Base code taken from below link :-
# Ref:https://stackoverflow.com/a/9577670
class Loader(yaml.SafeLoader):
def __init__(self, stream):
self._root = os.path.split(stream.name)[0]
super(Loader, self).__init__(stream)
def include(self, node):
consolidated_result = None
filename = os.path.join(self._root, self.construct_scalar(node))
# Below section is modified for supporting UNIX wildcard patterns
filenames = glob.glob(filename)
# Just to ensure the order of files considered are predictable
# and easy to debug in case of errors.
filenames.sort()
for file in filenames:
with open(file, 'r') as f:
result = yaml.load(f, Loader)
if isinstance(result, list):
if not isinstance(consolidated_result, list):
consolidated_result = []
consolidated_result += result
elif isinstance(result, dict):
if not isinstance(consolidated_result, dict):
consolidated_result = {}
consolidated_result.update(result)
else:
consolidated_result = result
return consolidated_result
Loader.add_constructor('!include', Loader.include)
Usage
a:
!include a.yaml
b:
# All yamls included within b folder level will be consolidated
!include b/*.yaml
Combining other answers, here is a short solution without overloading Loader class and it works with any loader operating on files:
import json
from pathlib import Path
from typing import Any
import yaml
def yaml_include_constructor(loader: yaml.BaseLoader, node: yaml.Node) -> Any:
"""Include file referenced with !include node"""
# noinspection PyTypeChecker
fp = Path(loader.name).parent.joinpath(loader.construct_scalar(node)).resolve()
fe = fp.suffix.lstrip(".")
with open(fp, 'r') as f:
if fe in ("yaml", "yml"):
return yaml.load(f, type(loader))
elif fe in ("json", "jsn"):
return json.load(f)
else:
return f.read()
def main():
loader = yaml.SafeLoader # Works with any loader
loader.add_constructor("!include", yaml_include_constructor)
with open(...) as f:
yml = yaml.load(f, loader)
# noinspection PyTypeChecker is there to prevent PEP-check warning Expected type 'ScalarNode', got 'Node' instead when passing node: yaml.Node to loader.construct_scalar().
This solution fails if yaml.load input stream is not file stream, as loader.name does not contain the path in that case:
class Reader(object):
...
def __init__(self, stream):
...
if isinstance(stream, str):
self.name = "<unicode string>"
...
elif isinstance(stream, bytes):
self.name = "<byte string>"
...
else:
self.name = getattr(stream, 'name', "<file>")
...
In my use case, I know that only YAML files will be included, so the solution can be simplified further:
def yaml_include_constructor(loader: yaml.Loader, node: yaml.Node) -> Any:
"""Include YAML file referenced with !include node"""
with open(Path(loader.name).parent.joinpath(loader.construct_yaml_str(node)).resolve(), 'r') as f:
return yaml.load(f, type(loader))
Loader = yaml.SafeLoader # Works with any loader
Loader.add_constructor("!include", yaml_include_constructor)
def main():
with open(...) as f:
yml = yaml.load(f, Loader=Loader)
or even one-liner using lambda:
Loader = yaml.SafeLoader # Works with any loader
Loader.add_constructor("!include",
lambda l, n: yaml.load(Path(l.name).parent.joinpath(l.construct_scalar(n)).read_text(), type(l)))
Probably it was not supported when question was asked but you can import other YAML file into one:
imports: [/your_location_to_yaml_file/Util.area.yaml]
Though I don't have any online reference but this works for me.