I am modifying a YAML file in Ruby. After I write back the modified YAML, I see a --- added on top of the file. How is this getting added and how do I get rid of it?
YAML spec says:
YAML uses three dashes (“---”) to separate directives from document content. This also serves to signal the start of a document if no directives are present.
Example:
# Ranking of 1998 home runs
---
- Mark McGwire
- Sammy Sosa
- Ken Griffey
# Team ranking
---
- Chicago Cubs
- St Louis Cardinals
So if you have multiple documents per YAML file, you have to separate them by three dashes. If you only have one document, you can remove/omit it (I never had a problem with YAML in ruby if three-dashes was missing). The reason why it's added when you yamlify your object is that, I guess, the dumper is written "by the spec" and doesn't care to implement such "shortcuts" (omit three-dashes when it's only one document).
Related
I came across this yaml document:
--- !ruby/object:MyClass
myint: 100
mystring: hello world
What does the line:
--- !ruby/object:MyClass
mean?
In YAML, --- is the end of directives marker.
A YAML document may begin with a number of YAML directives (currently, two directives are defined, %YAML and %TAG). Since a text node (for example) can also start with a % character, there needs to be a way to distinguish between directives and text. This is achieved using the end of directives marker --- which signals the end of the directives and the beginning of the document.
Since directives are allowed to be empty, --- can also serve as a document separator.
YAML also has an end of document marker .... However, this is not often used, because and end of directives marker / document separator also implies the end of the document. You need it if you want to have multiple documents with directives within the same stream or when you want to indicate that a document is finished without necessarily starting a new one (e.g. in cases where there may be significant time passing between the end of one document and the start of another).
Many YAML emitters, and Psych is no exception, always emit an end of directives marker at the beginning of each document. This allows you to easily concatenate multiple documents into a single stream without doing any additional processing of the documents.
The other half of that line, !ruby/object:MyClass, is a tag. A tag is used to give a type to the following node. In YAML, every node has a type, even if it is implicit. You can also write the tag explicitly, for example text nodes have the type (tag) !!str. This can be useful in certain circumstances, for example here:
!!str 2018-10-31
This tells YAML that 2018-10-31 is text, not a date.
!ruby/object:MyClass is a tag used by Psych to indicate that the node is a serialized Ruby Object which is an instance of class MyClass. This way, when deserializing the document, Psych knows what class to instantiate and how to treat the node.
According to yaml.org, '---' indicates the start of a document.
https://yaml.org/spec/1.2/spec.html
for official specifications.
So I just started using YAML file instead of application.properties as it is more readable. I see in YAML files they start with ---. I googled and found the below explanation.
YAML uses three dashes (“---”) to separate directives from document
content. This also serves to signal the start of a document if no
directives are present.
Also, I tried a sample without --- and understood that it is not mandatory to have them.
I think I don't have a clear understanding of directive and document. Can anyone please explain with a simple example?
As you already found out, the three dashes --- are used to signal the start of a document, i.e.:
To signal the document start after directives, i.e., %YAML or %TAG lines according to the current spec. For example:
%YAML 1.2
%TAG !foo! !foo-types/
---
myKey: myValue
To signal the document start when you have multiple yaml documents in the same stream, e.g., a yaml file:
doc 1
---
doc 2
If doc 2 has some preceding directives, then we have to use three dots ... to indicate the end of doc 1 (and the start of potential directives preceding doc 2) to the parser. For example:
doc 1
...
%TAG !bar! !bar-types/
---
doc 2
The spec is good for yaml parser implementers. However, I find this article easier to read from a user perspective.
It's not mandatory to have them if you do not begin your YAML with a directive. If it's the case, you should use them.
Let's take a look at the documentation
3.2.3.4. Directives
Each document may be associated with a set of directives. A directive has a name and an optional sequence of
parameters. Directives are instructions to the YAML processor, and
like all other presentation details are not reflected in the YAML
serialization tree or representation graph. This version of YAML
defines a two directives, “YAML” and “TAG”. All other directives are
reserved for future versions of YAML.
One example of this can also be found in the documentation for directive YAML
%YAML 1.2 # Attempt parsing
# with a warning
---
"foo"
Is there any difference if i use one space, two or four spaces per indent level in YAML?
Are there any specific rules for space numbers per Structure type??
For example 4 spaces for nesting maps , 1 space per list item etc??
I am writing a yaml configuration file for elastic beanstalk .ebextensions and i am having really hard time constructing this correctly. Although i have valid yaml in YAML Validator elastic beanstalk seems to understand a different structure.
There is no requirement in YAML to indent any concrete number of spaces. There is also no requirement to be consistent. So for example, this is valid YAML:
a:
b:
- c
- d
- e
f:
"ghi"
Some rules might be of interest:
Flow content (i.e. everything that starts with { or [) can span multiple lines, but must be indented at least as many spaces as the surrounding current block level.
Block list items can (but don't need to) have the same indentation as the surrounding block level because - is considered part of the indentation:
a: # top-level key
- b # value of that key, which is a list
- c
c: # next top-level key
d # non-list value which must be more indented
The YAML spec for v 1.2 merely says that
In YAML block styles, structure is determined by indentation. In general, indentation is defined as a zero or more space characters at the start of a line.
To maintain portability, tab characters must not be used in indentation, since different systems treat tabs differently. Note that most modern editors may be configured so that pressing the tab key results in the insertion of an appropriate number of spaces.
The amount of indentation is a presentation detail and must not be used to convey content information.
So you can set the indent depth to your preference, as long as you use spaces and not tabs. Interestingly, IntelliJ uses 2 spaces by default.
INDENTATION
The suggested syntax for YAML files is to use 2 spaces for indentation, but YAML will follow whatever indentation system that the individual file uses. Indentation of two spaces works very well for SLS files given the fact that the data is uniform and not deeply nested.
NESTED DICTIONARIES
When dictionaries are nested within other data structures (particularly lists), the indentation logic sometimes changes. Examples of where this might happen include context and default options from the file.managed state:
/etc/http/conf/http.conf:
file:
- managed
- source: salt://apache/http.conf
- user: root
- group: root
- mode: 644
- template: jinja
- context:
custom_var: "override"
- defaults:
custom_var: "default value"
other_var: 123
Notice that while the indentation is two spaces per level, for the values under the context and defaults options there is a four-space indent. If only two spaces are used to indent, then those keys will be considered part of the same dictionary that contains the context key, and so the data will not be loaded correctly. If using a double indent is not desirable, then a deeply-nested dict can be declared with curly braces:
/etc/http/conf/http.conf:
file:
- managed
- source: salt://apache/http.conf
- user: root
- group: root
- mode: 644
- template: jinja
- context: {
custom_var: "override" }
- defaults: {
custom_var: "default value",
other_var: 123 }
you can read more from this link
I'm working with a yaml file that I'm not supposed to break it. The problem is I'm not familiar with it, so is not sure if I can change some of its format...
The source file we received looks like this:
- items:
- heading: Maps
description: >
Integrate 3D buildings and tacos.
image_path: /music/images/v2/web_api-music.png
After processing the files, it looks like this:
- items:
- heading: Maps
description: > Integrate 3D buildings and tacos.
image_path: /music/images/v2/web_api-music.png
Does it break the code if the line break is missing between the greater than sign and the string? Would it have any potential impact on the UI format?
Also does it matter if there's extra spaces before "Integrate 3D buildings and tacos"? like below
- items:
- heading: Maps
description: >
Integrate 3D buildings and tacos.
image_path: /music/images/v2/web_api-music.png
Thank you and happy Thanksgiving!
It is easiest to check your files using some online YAML validator. For example: yamllint. Also, there are libraries for many languages, so if possible I recommend you use one of these to process your yaml files.
Your first processes file is not valid. There should be a newline after the >, or you can leave out the >.
Your last example is valid. The amount of indentation doesn't matter. From the spec:
The amount of indentation is a presentation detail and must not be used to convey content information.
[...]
Each node must be indented further than its parent node. All sibling nodes must use the exact same indentation level. However the content of each sibling node may be further indented independently.
See the spec for the version of yaml you are interested in
Generally speaking, the > is only significant at the end of the line, and means that the subsequent indented block should be folded on to this line with all newlines and leading/trailing space removed (replaced by sinlge spaces.) So
- heading: Maps
description: >
Integrate 3D buildings and tacos.
image_path: /music/images/v2/web_api-music.png
would be equivalent to
- heading: Maps
description: Integrate 3D buildings and tacos.
and leaving the > in when you remove the newline essentially adds it to the string value.
Changing the amount of indentation of any given block is generally irrelevant, as long as the lines of a block are consistently indented
I have a system that filters template files through erb. Using convention over configuration, the output files get created in a file hierarchy that mirrors the input files. Many of the files have the same names, and I was able to use the directories to differentiate them.
That plan worked until I needed to associate additional info with each file. So I created a YAML file in each directory with the metadata. Now I have both convention and configuration. Yuck.
Then I learned Webby, and the way it includes a YAML metadata section at the top of each template file. They look like this:
---
title: Baxter the Dog
filter: textile
---
All the best little blogs use Webby.
If I could implement a header like that, I could ditch my hierarchy and the separate YAML files. The Webby implementation is very generic, implementing a new MetaFile class that separates the header from the "real text", but it seems more complicated than I need.
Putting the metadata in an erb comment seems good -- it will be automatically ignored by erb, but I'm not sure how to access the comment data.
<%#
title: Baxter the Dog
%>
Is there a way to access the erb comments? Or maybe a different approach? A lot of my templates do a bunch of erb stuff, but I could run erb in a separate step if it makes the rest easier.
How about if you dump your content as YAML too. Presumably the metadata is simply a Hash dumped to YAML. You could just append the content as string in a second YAML document in the same file :-
---
title: Baxter the Dog
filter: textile
--- |
Content line 1
Content line 2
Content line 3
Dumping is as simple as :-
File.open('file.txt', 'w') do |output|
YAML.dump(metadata, output)
YAML.dump(content, output)
end
Loading is as simple as :-
File.open('file.txt') do |input|
stream = YAML.load_stream(input)
metadata, content = stream.documents
end
Note that the pipe character appears in the YAML so that newlines in the content string are preserved.