Any issues with using ''' block string for block commenting in yaml? - yaml

I have been using ''' for block comments in yaml. Like:
'''
This
is
a
comment
'''
I have noticed that this approach isn't one of the answers to the "How do you block comment in YAML?" question. Is there a reason not to do this (other than terrible multiline string formatting glitches in Vim)? Does it get loaded into memory, or is there something else that could be problematic?

YAML comments start with a # that is separated from other tokens by whitespace, and they terminate at the end of the line.
If you do:
'''
This
is
a
comment
'''
You specify a scalar node that starts and ends with one (1) single quote. That is because in single-quoted scalars you insert a literal single quote by escaping it with another single quote. Since YAML folds the line breaks into spaces, the above loads as the string ' This is a comment ' (the string including the quotes).
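The two rules at work here (a doubled quote is an escaped quote, and a lone line break folds to a space) can be sketched in plain Python. This is a simplified illustration, not a YAML parser: the helper name decode_single_quoted is hypothetical, it assumes the input is exactly one single-quoted scalar, and it ignores the special handling of empty lines.

```python
import re


def decode_single_quoted(raw: str) -> str:
    """Decode the raw source text of one YAML single-quoted scalar (simplified)."""
    if len(raw) < 2 or raw[0] != "'" or raw[-1] != "'":
        raise ValueError("not a single-quoted scalar")
    body = raw[1:-1]  # drop the opening and closing quote
    # Simplified flow folding: a line break plus surrounding blanks becomes one space.
    body = re.sub(r"[ \t]*\n[ \t]*", " ", body)
    # Inside single quotes, '' is the escape for one literal quote.
    return body.replace("''", "'")


source = "'''\nThis\nis\na\ncomment\n'''"
print(repr(decode_single_quoted(source)))
```

The loaded value keeps one quote at each end, which is why the "comment" ends up as data rather than being ignored.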
However if you insert that as comment after a scalar node like 42 as in:
answer: 42 '''
  This
  is
  a
  comment
  '''
You still have valid YAML, but this will load, e.g. in Python, as a dict with a key answer and an associated value of 42 ''' This is a comment ''': a string, which would probably give you some error if you expected the integer value 42.

Based on the spec, use # only:
http://yaml.org/spec/1.2/spec.html#comment/
As to why? Short of 'Because they said so' I would guess that some of the readability of YAML is lost with multiline comments.
Your use of ''' is the standard for Python docstrings.

Related

Express variables within raw string in bash

Problem
I have a variable called boiler, and I want the variable si1 to be expressed, and I am unsure of how to do this in a simple and minimal fashion.
boiler='#!/bin/bash
source ../../functions.sh
current="${si1}"
ready custom
title
breadcrumbs \""$current"\" \"Options\"
# END OF BOILER (DO NOT REMOVE ABOVE CODE OR MODIFY IT)
'
ISSUE
The issue is that I want everything within this string to be ignored (i.e. printed raw) except for the ${si1} variable.
EXPECTED OUTPUT
How could I concatenate the first part, the variable, and then the rest of the string while keeping it minimal and saving it back into the boiler variable?
You can delimit the string around ${si1}.
boiler='#!/bin/bash
source ../../functions.sh
current='"${si1}"'
ready custom
title
breadcrumbs \""$current"\" \"Options\"
# END OF BOILER (DO NOT REMOVE ABOVE CODE OR MODIFY IT)
'
This is ordinary string concatenation. The strings delimited with ' will be literal, while the string delimited with " will have the variable expanded.
Difference between single and double quotes in Bash

Reading and writing back yaml files with multi-line strings

I have to read a YAML file, modify it and write it back using PyYAML. Everything works fine except when there are multi-line string values in single quotes, e.g. if the input YAML file looks like
FOO:
  - Bar: '{"HELLO":
         "WORLD"}'
then reading it as data=yaml.load(open("foo.yaml")) and writing it yaml.dump(data, fref, default_flow_style=False) generates something like
FOO:
- Bar: '{"HELLO": "WORLD"}'
i.e. without the extra line for Bar value. Strange thing is that if input file has something like
FOO:
  - Bar: '{"HELLO":

         "WORLD"}'
i.e. one extra new line for Bar value then writing it back generates the correct number of new lines. Any idea what I am doing wrong?
You are not doing anything wrong, but you probably should have read more of the YAML specification.
According to the (outdated) 1.1 spec that PyYAML implements, within single quoted scalars:
In a multi-line single-quoted scalar, line breaks are subject to (flow) line folding, and any trailing white space is excluded from the content.
And line-folding:
Line folding allows long lines to be broken for readability, while retaining the original semantics of a single long line. When folding is done, any line break ending an empty line is preserved. In addition, any specific line breaks are also preserved, even when ending a non-empty line.
This means that your first two examples are the same, as the line-break is read as if there is a space.
The third example is different, because it actually contains a newline after loading, because "any line break ending an empty line is preserved".
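The folding rule quoted above can be sketched in plain Python. This is only an illustration of the rule, not PyYAML's implementation: the function name fold_flow_lines is hypothetical, and it assumes it receives the text between the quotes with quote escapes already handled.

```python
def fold_flow_lines(text: str) -> str:
    """Apply simplified YAML flow line folding to the content of a quoted scalar."""
    lines = text.split("\n")
    if len(lines) == 1:
        return text
    # Whitespace next to a line break is not part of the content.
    lines = (
        [lines[0].rstrip(" \t")]
        + [line.strip(" \t") for line in lines[1:-1]]
        + [lines[-1].lstrip(" \t")]
    )
    out = lines[0]
    empty_run = 0
    last = len(lines) - 1
    for i in range(1, len(lines)):
        line = lines[i]
        # The final line holds the closing quote, so it never counts as an empty line.
        if line == "" and i != last:
            empty_run += 1
            continue
        # A lone break folds to a space; each empty line is kept as one newline.
        out += "\n" * empty_run if empty_run else " "
        out += line
        empty_run = 0
    return out


print(repr(fold_flow_lines('{"HELLO":\n         "WORLD"}')))
print(repr(fold_flow_lines('{"HELLO":\n\n         "WORLD"}')))
```

Only the second call keeps a newline in the result, which matches the difference between the inputs in the question.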
In order to understand why that dumps back as it was loaded, you have to know that PyYAML doesn't maintain any information about the quoting (nor about the single newline in the first example); it just loads that scalar into a Python string. During dumping PyYAML evaluates how that string can best be written, and the options it considers (unless you try to force things using the default_style argument to dump()) are: plain style, single quoted style, double quoted style.
PyYAML will use plain style (without quotes) when possible, but since the string starts with {, this leads to confusion (collision) with that character's use as the start of a flow style mapping. So quoting is necessary. Since there are also double quotes in the string, and there are no characters that need backslash escaping, the "cleanest" representation that PyYAML can choose is single quoted style, and in that style it needs to represent a line-break by including an empty line within the single quoted scalar.
I would personally prefer using a block style literal scalar to represent your last example:
FOO:
- Bar: |
    {"HELLO":
    "WORLD"}
but if you load, then dump that using PyYAML its readability would be lost.
Although worded differently in the YAML 1.2 specification (released almost 10 years ago) the line-folding works the same, so this would "work" in a similar way with a more up-to-date YAML loader/dumper. My package ruamel.yaml, for loading/dumping YAML 1.2 will properly maintain the block style if you set the attribute preserve_quotes = True on the YAML() instance, but it will still get rid of the newline in your first example. This could be implemented (as is shown by ruamel.yaml preserving appropriate newline positions in folded style block scalars), but nobody ever asked for that, probably because if people want that kind of control over wrapping they use a block style to start with.

Keep text delimiters after !!str "some text" or !!str 'some text'

I have some data to be used to generate SQL, therefore it is important which text delimiters are used (single quotes ' delimits string literal but double quotes " delimit identifiers, at least in Oracle db).
For load procedure generator I used this
someKey: !!str 'Some SQL text'
and expected that someKey would contain the whole string including single quotes: 'Some SQL text'.
However, js-yaml.safeLoad() interprets the data as Some SQL text which is not what I wanted.
The workaround is easy, I can put the literal into additional quotes:
someKey: "'Some SQL text'"
which gives the expected result. However, I am not quite sure why in that case do we need !!str tag in YAML if it does virtually nothing (it is useful only for explicit interpretation number literals, true, false and null) and it is actually almost the same as putting double quotes around the text.
I would prefer to post this into some YAML-spec-related forum but it seems there is none.
Apart from the standard workaround, is there any trick that would do what I originally wanted, i.e. interpret any content after object key as string (+trimming off any initial and trailing spaces) without dealing with double quotes?
In YAML, !!str is a predefined tag denoting a string scalar. If you specify it, even things that without the tag (or without quotes) would not be considered string scalars, such as 123, True or null, are loaded as strings.
Some string scalars need quotes, e.g. if they start with a single or double quote, if special characters need backslash escaping, or if there is a ": " (colon followed by a space) in the string, which could confuse the parser into interpreting the string scalar as a key-value pair.
However putting !!str before something doesn't make it quoted (which should be obvious as it doesn't define what kind of quoting and single quoted scalars have vastly different rules from double quoted scalars).
Your workaround is not a workaround, that is just one of the ways in YAML you can specify a string scalar that starts and ends with a single quote. Another way is:
someKey: |-
  'Some SQL text'
Within literal block style scalars, quotes (single or double) are interpreted as-is, even at the beginning of the scalar. The - makes sure you don't get an extra newline after the final '.

YAML comments in multi-line strings

Does YAML support comments in multi-line strings?
I'm trying to do things like this, but the validator is throwing errors:
key:
#comment
value
#comment
value
value #comments here don't work either
No. Per the YAML 1.2 spec, "Comments must not appear inside scalars", which is exactly the case here. There's no way in YAML to escape the octothorpe symbol (#), so within a multi-line string there's no way to disambiguate the comment from the raw string value.
You can however interleave comments within a collection. For example, if you really needed to, you could break your string into a sequence of strings one per line:
key: #comment
- value line 1
#comment
- value line 2
#comment
- value line 3
Should work...
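If you go this route, whatever consumes the file has to put the lines back together. A minimal Python sketch, assuming the document above was loaded into a dict shaped like the data literal below (the literal and the join_lines helper are illustrative stand-ins, not part of any YAML library):

```python
# Assumed result of loading the YAML above: the comments are discarded
# by the loader and only the sequence items remain.
data = {"key": ["value line 1", "value line 2", "value line 3"]}


def join_lines(items):
    """Rebuild one multi-line string from a sequence of per-line strings."""
    return "\n".join(items)


text = join_lines(data["key"])
print(text)
```

The comments live only in the YAML source, so they never reach the joined string.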

String#split in Ruby not behaving as expected

File.open(path, 'r').each do |line|
  row = line.chomp.split('\t')
  puts "#{row[0]}"
end
path is the path of file having content like name, age, profession, hobby
I'm expecting output to be name only but I am getting the whole line.
Why is it so?
The question already has an accepted answer, but it's worth noting what the cause of the original problem was:
This is the problem part:
split('\t')
Ruby has several forms for quoted string, which have differences, usually useful ones.
Quoting from Ruby Programming at wikibooks.org:
...double quotes are designed to interpret escaped characters such as new lines and tabs so that they appear as actual new lines and tabs when the string is rendered for the user. Single quotes, however, display the actual escape sequence, for example displaying \n instead of a new line.
Read further in the linked article to see the use of %q and %Q strings. Or Google for "ruby string delimiters", or see this SO question.
So '\t' is interpreted as "backslash+t", whereas "\t" is a tab character.
String#split will also take a Regexp, which in this case might remove the ambiguity:
split(/\t/)
Your question was not very clear
split("\n") - if you want to split by lines
split - if you want to split by spaces
Also, as far as I can tell, you do not need chomp, because the split already removes the "\n".
