Considering writing/reading files in YAML format (http://yaml.org/)
I'm surprised by the apparent lack of output formatting options in the default YAML.dump (Ruby 2.2.3). Without any pretty-printing option, the output of YAML.dump looks really ugly. Let me explain:
Consider this hand-written YAML configuration file 'config/bots.yml', which contains a list of items (hashes, each with the keys 'token' and 'comment'):
input file:
- token: 070743004:yuSJJdB5L354Zq41iGIggSdYGJ1IhVh8XSrA
  comment: ROSPOshop.com

- token: 998001334:zAFo4dBdd3ZZtqKiGdPqkkYGJ1ppVW8pUZ
  comment: pagoSALDO.com bot

- token: 184679990:BBBBBBBBBCCCCCCCGIDDDDDDDHHHHHHHHHH
  comment: SOLYARISoftware demo bot

- token: 184679990:BBBBBBBBBCCCCCCUUUUUUUUUUHHHHHHHHHH
  comment: Another demo bot

- token: 184679990:BBBBBBBBBCCCCCCCGHGGHHGHGHHHHHHHHHH
  comment: Yet Another demo bot
No processing at all: the script just loads the file and dumps it straight back:
require 'yaml'

# load_file opens and closes the file itself, so no file handle is left open
config = YAML.load_file('config/bots.yml')
File.write('config/bots.yml', YAML.dump(config))
output file:
---
- token: 070743004:yuSJJdB5L354Zq41iGIggSdYGJ1IhVh8XSrA
  comment: ROSPOshop.com
- token: 998001334:zAFo4dBdd3ZZtqKiGdPqkkYGJ1ppVW8pUZ
  comment: pagoSALDO.com bot
- token: 184679990:BBBBBBBBBCCCCCCCGIDDDDDDDHHHHHHHHHH
  comment: SOLYARISoftware demo bot
- token: 184679990:BBBBBBBBBCCCCCCUUUUUUUUUUHHHHHHHHHH
  comment: Another demo bot
- token: 184679990:BBBBBBBBBCCCCCCCGHGGHHGHGHHHHHHHHHH
  comment: Yet Another demo bot
I'm unhappy because all array items are now collapsed (the blank lines between them have been removed). That's a pity when the list of items is long and/or the data structure of each item varies: the result is messy to read!
Question (1)
Is there any YAML option to get prettier output from YAML.dump?
For example, to separate each item of an Array with a blank line?
Question (2)
I found this very helpful tutorial ("YAML Cookbook"):
http://www.yaml.org/YAML_for_ruby.html#yaml_for_ruby
Is there a more recent update, or official Ruby documentation, explaining YAML tips & tricks (data conversions, etc.)?
Question (3)
Is there any possible YAML alternative? I mean, maybe an alternative gem to read/write YAML? BTW, of course I considered JSON, but I prefer the clearer YAML format when reading textual data!
UPDATED
BTW, there is a lot of useful info and tips about the YAML format here:
https://en.wikipedia.org/wiki/YAML
You can write your own pretty-print solution, if that's all you're looking for. For example:
config = YAML.load(File.open('bots.yml'))
puts config.to_yaml.gsub("\n-", "\n\n-")
Output:
---

- token: 070743004:yuSJJdB5L354Zq41iGIggSdYGJ1IhVh8XSrA
  comment: ROSPOshop.com

- token: 998001334:zAFo4dBdd3ZZtqKiGdPqkkYGJ1ppVW8pUZ
  comment: pagoSALDO.com bot

- token: 184679990:BBBBBBBBBCCCCCCCGIDDDDDDDHHHHHHHHHH
  comment: SOLYARISoftware demo bot

- token: 184679990:BBBBBBBBBCCCCCCUUUUUUUUUUHHHHHHHHHH
  comment: Another demo bot

- token: 184679990:BBBBBBBBBCCCCCCCGHGGHHGHGHHHHHHHHHH
  comment: Yet Another demo bot
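Blank lines between sequence items are not significant in YAML, so the prettified text should still load back to the same data. Here is a minimal sketch of that check; the helper name dump_with_blank_lines and the placeholder tokens are made up for illustration, and it assumes the top level of the data is an Array:

```ruby
require 'yaml'

# Hypothetical helper: dumps data as YAML and inserts a blank line
# before every top-level sequence item (a line starting with "- ").
def dump_with_blank_lines(data)
  YAML.dump(data).gsub(/\n(?=- )/, "\n\n")
end

# Placeholder data, not real tokens.
bots = [
  { 'token' => 'aaa', 'comment' => 'first bot' },
  { 'token' => 'bbb', 'comment' => 'second bot' }
]

pretty = dump_with_blank_lines(bots)
puts pretty

# The extra blank lines do not change what the parser sees.
puts YAML.safe_load(pretty) == bots
```

Only lines that start with "- " in column 0 are touched, so nested sequences (which are indented) keep their compact form.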
Related
I have a YAML file like below.
Details:
  Name: Jack
  Location: USA
  ABC: TestValue
Refer:
  Test1: %Details.Name%
  Test2: %Details.Location%
I want to check whether the value given in Test1 works. As far as I know, %Details.Name% works when Details.Name is defined under parameters, but the keys above are not parameters. So, is there any way to refer to the value of another key?
There are anchors and aliases, which you can use like this:
Details:
  Name: &name Jack
  Location: &location USA
  ABC: TestValue
Refer:
  Test1: *name
  Test2: *location
However, there is no way to refer to other values via their "paths". Applications using YAML may support pre- or postprocessing to do this (often via templating engines like Jinja), but plain YAML doesn't implement this feature.
If you are under the impression that %Details.Name% would work in some context, you are already using a pre- or postprocessing feature that is not plain YAML.
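To see the anchors and aliases resolve, here is a small sketch using Ruby's bundled YAML library (Psych). It assumes a Psych version whose safe_load accepts the aliases: keyword (shipped with Ruby 2.6 and later), and the document is a shortened copy of the example above:

```ruby
require 'yaml'

doc = <<~DOC
  Details:
    Name: &name Jack
    Location: &location USA
  Refer:
    Test1: *name
    Test2: *location
DOC

# safe_load rejects aliases unless they are explicitly enabled.
data = YAML.safe_load(doc, aliases: true)

puts data['Refer']['Test1']
puts data['Refer']['Test2']
```

After loading, the aliases are gone: Test1 and Test2 simply hold the same values as the anchored nodes.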
Consider the following yaml file:
topics:
  topicA:
    bins:
      type: multi-bins
      map:
        FirstBin:
          source: value-field
          field-name: ServiceID
        SecondBin:
          source: value-field
          field-name: ServiceID
    message-transformer:
      class: com.aerospike.connect.inbound.transforms.TombstoneMessageTransformer
      params:
        shouldDeleteOnNull: "FirstBin, SecondBin"
  topicB:
    ...
As you can see, there is duplication between the keys of topics.<topic-name>.bins.map and topics.<topic-name>.message-transformer.params.shouldDeleteOnNull
Is there a way to extract the values dynamically? I want to send to shouldDeleteOnNull all the keys of topics.<topic-name>.bins.map
Note: I don't want to create an env variable for that and use Yaml anchors.
YAML is not a programming language. It doesn't let you extract anything and it doesn't let you send anything, because it does not provide you with any kind of actions or processing instructions.
While you can reference nodes multiple times with anchors, that doesn't help you here because you would need to concatenate values, which is not something that is possible in YAML. A solution that would work looks like this (shortened for clarity):
map:
  &a FirstBin:
    source: value-field
    field-name: ServiceID
  &b SecondBin:
    source: value-field
    field-name: ServiceID
shouldDeleteOnNull: [*a, *b]
As you can see, I needed to make shouldDeleteOnNull a sequence rather than a scalar to make this work. This does not seem like much of an improvement.
Anything more sophisticated would need to be implemented in the code loading the file and therefore does not make sense to be discussed in a pure YAML context.
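As an illustration of doing this in the code that loads the file, here is a rough Ruby sketch that fills in shouldDeleteOnNull from the keys of bins.map after parsing. It is an assumption about how a loader could behave, not something the application consuming this configuration is known to do, and the document is a shortened copy of the one in the question with params left empty:

```ruby
require 'yaml'

doc = <<~DOC
  topics:
    topicA:
      bins:
        type: multi-bins
        map:
          FirstBin:
            source: value-field
            field-name: ServiceID
          SecondBin:
            source: value-field
            field-name: ServiceID
      message-transformer:
        class: com.aerospike.connect.inbound.transforms.TombstoneMessageTransformer
        params: {}
DOC

config = YAML.safe_load(doc)

# Post-processing step: derive the parameter from the bin names of each topic.
config['topics'].each_value do |topic|
  bin_names = topic.dig('bins', 'map').keys
  topic['message-transformer']['params']['shouldDeleteOnNull'] = bin_names.join(', ')
end

puts config.dig('topics', 'topicA', 'message-transformer', 'params', 'shouldDeleteOnNull')
```

The duplication disappears from the file, at the cost of moving that knowledge into the loader.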
I'm trying to model a YAML-like DSL in Xtext. In this DSL, I need multiline strings as in YAML.
description: |
  Line 1
  line 2
  ...
My first try was this:
terminal BEGIN:
    'synthetic:BEGIN'; // increase indentation
terminal END:
    'synthetic:END'; // decrease indentation
terminal MULTI_LINE_STRING:
    "|"
    BEGIN ANY_OTHER END;
and my second try was
terminal MULTI_LINE_STRING:
    "|"
    BEGIN
    ((!('\n'))+ '\n')+
    END;
but both of them did not succeed. Is there any way to do this in Xtext?
UPDATE 1:
I've tried this alternative as well.
terminal MULTI_LINE_STRING:
"|"
BEGIN ->END
When I triggered the "Generate Xtext Artifacts" process, I got this error:
3492 [main] INFO nerator.ecore.EMFGeneratorFragment2 - Generating EMF model code
3523 [main] INFO clipse.emf.mwe.utils.GenModelHelper - Registered GenModel 'http://...' from 'platform:/resource/.../model/generated/....genmodel'
error(201): ../.../src-gen/.../parser/antlr/lexer/Internal..Lexer.g:236:71: The following alternatives can never be matched: 1
error(3): cannot find tokens file ../.../src-gen/.../parser/antlr/internal/Internal...Lexer.tokens
error(201): ../....idea/src-gen/.../idea/parser/antlr/internal/PsiInternal....g:4521:71: The following alternatives can never be matched: 1
This slide deck shows how we implemented a whitespace block scoping in an Xtext DSL.
We used synthetic tokens called BEGIN corresponding to an indent, and END corresponding to an outdent.
(Note: the language was subsequently renamed to RAPID-ML, included as a feature of RepreZen API Studio.)
I think your main problem is that you have not defined when your multiline token ends. Before you can arrive at a solution, you have to be clear in your own mind about how an algorithm should determine the end of the token. No tool can take this mental burden from you.
Issue: there is no end-marking character. Either you have to define such a character (unlike YAML) or define the end of the token in another way, for example through some sort of semantic whitespace (I think YAML does it like that).
The first approach would make things very easy: just read content until you find the closing character. The second approach would probably be manageable using a custom lexer. Basically, you replace the generated lexer with your own implementation that is able to count blanks or similar.
Here are some starting points about how this could be done (different approaches thinkable):
Writing a custom Xtext/ANTLR lexer without a grammar file
http://consoliii.blogspot.de/2013/04/xtext-is-incredibly-powerful-framework.html
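The second approach from this answer (ending the token by indentation rather than by a closing character) can be sketched independently of Xtext. The snippet below is plain Ruby used only to show the rule a custom lexer would have to implement. The function name read_block_scalar is hypothetical, and the rule is a simplification: the block ends at the first non-blank line that is indented no deeper than the line carrying the "|".

```ruby
# Hypothetical helper: lines[start] is the "key: |" line. Returns the block
# text and the index of the first line after the block.
def read_block_scalar(lines, start)
  key_indent = lines[start][/\A */].size
  body = []
  index = start + 1
  while index < lines.size
    line = lines[index]
    indent = line[/\A */].size
    # A non-blank line at or left of the key's indentation ends the block.
    break if !line.strip.empty? && indent <= key_indent
    body << line
    index += 1
  end
  # Strip the common indentation of the block's non-blank lines.
  block_indent = body.reject { |l| l.strip.empty? }.map { |l| l[/\A */].size }.min || 0
  text = body.map { |l| l[block_indent..-1].to_s }.join("\n")
  [text, index]
end

lines = [
  'description: |',
  '  Line 1',
  '  line 2',
  'name: demo'
]

text, next_index = read_block_scalar(lines, 0)
puts text
puts next_index
```

In a real lexer the same comparison of indentation widths is what would decide when to emit the synthetic END token.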
While building a blog using Django, I realized that it would be extremely practical to store the text of an article and all the related information (title, author, etc.) together in a human-readable file format, and then load those files into the database using a simple script.
That said, YAML caught my attention for its readability and ease of use. The only downside of the YAML syntax is the indentation:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
text: |
  This is the text of the article. I can write whatever I want
  but I need to be careful with the indentation...and this is a
  bit boring.
---
I believe that's not the best solution (especially if the files are going to be written by casual users). A format like this one would be much better:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
---
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
Is there any solution? Preferably using python.
Other file formats propositions are welcome as well!
Unfortunately this is not possible. What one would think could work is using | for a single scalar in a separate document:
import ruamel.yaml

yaml_str = """\
title: Title of the article
author: Somebody
---
|
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""

for d in ruamel.yaml.load_all(yaml_str):
    print(d)
    print('-----')
but it doesn't, because | is the literal block scalar indicator and the parser expects the lines that follow it to be indented. And although at the top level an indentation of 0 (zero) would easily work, ruamel.yaml (and PyYAML) don't allow this.
It is however easy to parse this yourself, which has the advantage over using the frontmatter package that you can use YAML 1.2 and are not restricted to YAML 1.1 because frontmatter uses PyYAML. Also note that I used the more appropriate end-of-document marker ... to separate the YAML from the markdown:
import ruamel.yaml

combined_str = """\
title: Title of the article
author: Somebody
...
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""

with open('test.yaml', 'w') as fp:
    fp.write(combined_str)

data = None
lines = []
yaml_str = ""
with open('test.yaml') as fp:
    for line in fp:
        if data is not None:
            lines.append(line)
            continue
        if line == '...\n':
            data = ruamel.yaml.round_trip_load(yaml_str)
            continue
        yaml_str += line

print(data['author'])
print(lines[2])
which gives:
Somebody
I want...
(the round_trip_load allows dumping with preservation of comments, anchor names etc).
I found Front Matter does exactly what I want to do.
There is also a python package.
I have heard the terms "front matter" and "back matter" used for YAML that is parsed at the beginning or end of a non-YAML file. However, I can't seem to find any examples or documentation of how to implement this. Maybe this isn't a standard YAML feature. How can I make use of this feature in my Ruby project?
FYI: The reason I want to do this is to be able to require some ruby files at the top, and assume the rest is YAML. I don't think this is normally allowed in a YAML file.
I just came across a nice example of something similar to what I am trying to do. It isn't necessarily an example of "front/back matter" but it might help someone in the future:
Using the __END__ keyword, you can stop Ruby from parsing the rest of the file. The rest of the file is available through the DATA constant, which is actually a File object:
#!/usr/bin/env ruby
%w(yaml pp).each { |dep| require dep }

obj = YAML::load(DATA)

pp obj

__END__
---
-
  name: Adam
  age: 28
  admin: true
-
  name: Maggie
  age: 28
  admin: false
Source
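For the "front matter" layout itself (a YAML header between two --- lines, followed by free text), a minimal Ruby sketch could look like the following. Front matter is a convention of the tools that use it rather than a YAML feature, so the splitting is done by hand here. The helper name split_front_matter and the sample text are made up for illustration:

```ruby
require 'yaml'

# Hypothetical helper: splits "---\n<yaml>\n---\n<body>" into the parsed
# header and the untouched body. Returns an empty header if no front
# matter is present.
def split_front_matter(text)
  match = text.match(/\A---[ \t]*\n(.*?)\n---[ \t]*\n(.*)\z/m)
  return [{}, text] unless match

  [YAML.safe_load(match[1]) || {}, match[2]]
end

source = <<~DOC
  ---
  title: Title of the article
  author: Somebody
  ---
  Here is the text of the article. It is not YAML,
  just plain text.
DOC

meta, body = split_front_matter(source)
puts meta['author']
puts body
```

Only the header is handed to the YAML parser; the body is returned as-is, so it can hold Markdown, HTML or anything else.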