I have heard the terms "front matter" and "back matter" used to refer to YAML that is parsed at the beginning or end of a non-YAML file. However, I can't find any examples or documentation of how to implement this. Maybe this isn't a standard YAML feature. How can I make use of this feature in my Ruby project?
FYI: The reason I want to do this is to be able to require some Ruby files at the top, and assume the rest is YAML. I don't think this is normally allowed in a YAML file.
I just came across a nice example of something similar to what I am trying to do. It isn't necessarily an example of "front/back matter" but it might help someone in the future:
Using the __END__ keyword, you can stop Ruby from parsing the rest of the file. The rest of the file is made available through the DATA constant, which is actually a File object:
#!/usr/bin/env ruby
%w(yaml pp).each { |dep| require dep }
obj = YAML::load(DATA)
pp obj
__END__
---
-
  name: Adam
  age: 28
  admin: true
-
  name: Maggie
  age: 28
  admin: false
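For the front matter case asked about in the question, here is a minimal sketch. It assumes the common convention (used by Jekyll, for example) of a YAML block between two --- lines at the very top of the file. The helper name split_front_matter and the sample document are made up for illustration; this is not a standard YAML feature or library API.

```ruby
require "yaml"

# Hypothetical helper: returns [metadata_hash, body_string].
# Assumes the front matter is delimited by two lines containing only "---".
def split_front_matter(text)
  match = text.match(/\A---[ \t]*\n(.*?)\n---[ \t]*\n(.*)\z/m)
  return [{}, text] unless match # no front matter: everything is body

  [YAML.safe_load(match[1]) || {}, match[2]]
end

sample = <<~DOC
  ---
  title: Hello
  tags:
    - ruby
    - yaml
  ---
  Body text goes here.
DOC

meta, body = split_front_matter(sample)
p meta
puts body
```

Back matter could be handled the same way by matching from the end of the file instead of the start.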
This does what I want, but going via to_ruby seems unnecessary:
doc = Psych.parse("foo: 123")
doc.to_ruby.to_yaml
# => "---\nfoo: 123\n"
When I try to do this, I get an error:
DEV 16:49:08 >> Psych.parse("foo: 123").to_yaml
RuntimeError: expected STREAM-START
from /opt/…/lib/ruby/2.5.0/psych/visitors/emitter.rb:42:in `start_mapping'
I get the impression that the input needs to be a stream of some sort, but I don't quite get what incantation I need. Any ideas?
(The problem I'm trying to solve here, by the way (in case you know of a better way) is to fix some YAML that can't be deserialised into Ruby, because it references classes that don't exist. The YAML is quite complex, so I don't want to just search-and-replace in the YAML string. My thinking was that I could use Psych.parse to get a syntax tree, modify that tree, then dump it back into a YAML string.)
Figured out the incantation after finding the higher-level docs at https://ruby-doc.org//stdlib-2.3.0_preview1/libdoc/psych/rdoc/Psych/Nodes.html, though please let me know if there's a better way:
doc = Psych.parse("foo: 123")
stream = Psych::Nodes::Stream.new
stream.children << doc
stream.to_yaml
# => "foo: 123\n"
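For the underlying goal (editing the syntax tree and dumping it back), a possible sketch follows. It uses Psych.parse_stream, which returns a Stream node directly, so no manual wrapping is needed. It assumes the unknown classes show up as !ruby/object: tags on mappings; MissingClass and the sample YAML are invented for illustration.

```ruby
require "yaml"

# Hypothetical input: a mapping tagged with a class that no longer exists.
source = <<~DOC
  --- !ruby/object:MissingClass
  name: example
  count: 3
DOC

# parse_stream returns a Psych::Nodes::Stream, which can be emitted as-is.
stream = Psych.parse_stream(source)

# Node#each walks the whole tree depth-first.
stream.each do |node|
  next unless node.is_a?(Psych::Nodes::Mapping)
  next unless node.tag.to_s.start_with?("!ruby/object:")

  node.tag = nil       # drop the reference to the missing class
  node.implicit = true # mark the mapping as untagged for the emitter
end

fixed = stream.to_yaml
puts fixed
```

After this, the result should load as a plain Hash rather than trying to instantiate the missing class.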
While building a blog using Django, I realized that it would be extremely practical to store the text of an article and all the related information (title, author, etc.) together in a human-readable file format, and then load those files into the database using a simple script.
That said, YAML caught my attention for its readability and ease of use. The only downside of the YAML syntax is the indentation:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
text: |
    This is the text of the article. I can write whatever I want
    but I need to be careful with the indentation...and this is a
    bit boring.
---
I believe that's not the best solution (especially if the files are going to be written by casual users). A format like this one would be much better:
---
title: Title of the article
author: Somebody
# Other stuffs here ...
---
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
Is there any solution? Preferably using Python.
Suggestions for other file formats are welcome as well!
Unfortunately this is not possible. What one would think could work is using | for a single scalar in a separate document:
import ruamel.yaml
yaml_str = """\
title: Title of the article
author: Somebody
---
|
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
for d in ruamel.yaml.load_all(yaml_str):
    print(d)
    print('-----')
but it doesn't work, because | is the literal block scalar indicator and the scalar's content is expected to be indented. Although at the top level an indentation of 0 (zero) could easily work, ruamel.yaml (and PyYAML) don't allow this.
It is however easy to parse this yourself, which has the advantage over using the front matter package that you can use YAML 1.2 and are not restricted to YAML 1.1, as you would be because frontmatter uses PyYAML. Also note that I used the more appropriate end-of-document marker ... to separate the YAML from the Markdown:
import ruamel.yaml
combined_str = """\
title: Title of the article
author: Somebody
...
Here there is the text of the article, it is not valid YAML but
just plain text. Here I could put **Markdown** or <html>...or whatever
I want...
"""
with open('test.yaml', 'w') as fp:
    fp.write(combined_str)
data = None
lines = []
yaml_str = ""
with open('test.yaml') as fp:
    for line in fp:
        if data is not None:
            lines.append(line)
            continue
        if line == '...\n':
            data = ruamel.yaml.round_trip_load(yaml_str)
            continue
        yaml_str += line
print(data['author'])
print(lines[2])
which gives:
Somebody
I want...
(Using round_trip_load allows dumping with preservation of comments, anchor names, etc.)
I found Front Matter does exactly what I want to do.
There is also a python package.
I'm trying to understand how YAML works and I'm working from examples in a book to learn it. Here's the code I'm using:
require "yaml"
def savelist(songs, playlist)
  File.open playlist, 'w' do |f|
    f.write(songs.to_yaml)
  end
end
def openlist(playlist)
  loadlist = File.read playlist
  YAML::load loadlist
end
podcasts = Dir['C:/Users/7/Music/Amazon MP3/*.mp3']
puts "Welcome to the Playlist Maker!"
list = 'playlist.txt'
savelist(podcasts, list)
openlist(list)
I thought the second method, openlist, would actually call the file to the screen and print out its contents. I read it as "create a loadlist object then load the object to the screen with YAML". I expected it to print the file contents (YAML::load).
The program works, but I don't really understand how it works. I'm also not really sure what the second method is there for if it doesn't actually open the file.
Thanks for your insight. I sorta get what YAML does, but I don't really know how or why this works.
Consider this:
require "yaml"
def savelist(songs, playlist)
  File.write(playlist, songs.to_yaml)
end
def openlist(playlist)
  YAML::load_file(playlist)
end
Learning how YAML works is easier if you look at what it's emitting, and play with it in IRB:
>> require 'yaml'
false
>> songs = ['foo', 'bar']
[
    [0] "foo",
    [1] "bar"
]
>> puts songs.to_yaml
---
- foo
- bar
nil
>>
>> yaml = songs.to_yaml
"---\n- foo\n- bar\n"
>> YAML::load(yaml)
[
    [0] "foo",
    [1] "bar"
]
YAML is just text, so writing a YAML file is simple if you use write, as I did above. YAML has a convenient method load_file to read a file, then parse it back into the equivalent Ruby object. If you've already read the file into a variable, then use load to parse it and return the Ruby object.
Playing with YAML this way, by converting an array or hash to YAML, then using load to mess with it, is a great way to learn it. The YAML spec is useful once you start to get an idea how it works, but doing the "round-trip" thing is how to boot-strap yourself. (This is also useful when learning about JSON.)
We actually use a variation of this when initially defining complex YAML files. I'll write a bit of Ruby code with the object defined inside it, then save that to a file. From that point on we can tweak the YAML, or, as I prefer, we tweak the Ruby and re-create the YAML. Using the Ruby code to recreate the file means we've got a fall-back if someone hoses the configuration. At least we can rebuild our default setup.
Let's step through this and see how we might learn what a block of code does.
def openlist(playlist)
  loadlist = File.read playlist
  YAML::load loadlist
end
For starters, we're passing some playlist object to File.read. We can look up the File class, but we won't find a read method on it. We will, however, see that File is a subclass of IO, and if we look at IO we will find IO.read:
read(name, [length [, offset]] ) → string
So read takes a file name and an optional length and offset and returns a string. loadlist is then just a string containing the contents of this file. Useful but not easy for us to work with yet.
If we look up the YAML module we learn that it is a wrapper around Psych where we can finally find load
load(yaml, filename = nil)
Load yaml in to a Ruby data structure...
So load takes some string and returns a Ruby data structure created by parsing the YAML syntax it contains.
Now we see why this example is using two different methods, one to load the contents of a file and one to parse those contents into a Ruby data structure. While we're looking at Psych we might also notice the load_file method, which gives us another way to do the same thing, as @the-tin-man suggested.
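To make the point about printing concrete, here is a small sketch (the song names and the temporary directory are just placeholders). Loading returns a Ruby object; nothing appears on screen until you explicitly print it.

```ruby
require "yaml"
require "tmpdir"

songs = ["first.mp3", "second.mp3"]
loaded = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "playlist.txt")
  File.write(path, songs.to_yaml) # same job as savelist
  loaded = YAML.load_file(path)   # same job as openlist; returns an Array
end

puts loaded.class # the parsed object is an ordinary Ruby Array
puts loaded       # printing only happens because we ask for it
```

In the original program, the return value of openlist(list) is simply discarded, which is why nothing is shown.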
I'd like to get a word list from a text file using Ruby. I found how to use a regex to parse only words here, so I made a script like the following:
src = File.open("text.txt")
word_list = []
src.each do |line|
  word_list << line.downcase.split(/[^[:alpha:]]/).delete_if {|x| x == ""}
end
word_list.flatten!.uniq!.sort!
p word_list
And the following is a sample text file text.txt:
TextMate may be the latest craze for developing Ruby on Rails
applications, but Vim is forever. This plugin offers the following
features for Ruby on Rails application development.
Automatically detects buffers containing files from Rails applications, and applies settings to those buffers (and only those
buffers). You can use an autocommand to apply your own custom
settings as well.
Unintrusive. Only files in a Rails application should be affected; regular Ruby scripts are left untouched. Even when enabled, the
plugin should keep out of your way if you're not using its features.
Easy navigation of the Rails directory structure. gf considers context and knows about partials, fixtures, and much more. There are
two commands, :A (alternate) and :R (related) for easy jumping between
files, including favorites like model to migration, template to
helper, and controller to functional test. For more advanced usage,
:Rmodel, :Rview, :Rcontroller, and several other commands are
provided.
As a Ruby novice, I'd like to learn better (clearer, more concise, and more conventional) solutions to this problem.
Thanks for any advice and corrections.
More idiomatic code would be:
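# File.foreach yields each line and closes the file when done
# (IO#lines is no longer available in current Ruby versions)
word_list = File.foreach("text.txt")
  .flat_map { |line| line.downcase.split(/[^[:alpha:]]/).reject(&:empty?) }
  .uniq
  .sort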
# I suppose you want each line and collect the results
word_list = File.foreach("text.txt").collect do |line|
  # collecting is done via collect above, no need anymore
  # .reject(&:empty?) calls .empty? on each element
  line.downcase.split(/[^[:alpha:]]/).reject(&:empty?)
  # you can chain on blocks as well; the non-bang methods are used because
  # flatten! and uniq! return nil when they have nothing to change
end.flatten.uniq.sort
p word_list
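As a further alternative (a different technique from the split-based answers above, offered only as a sketch), you can match the words directly with scan instead of splitting on non-letters and discarding the empty strings. The sample string stands in for the file contents; with a real file you could start from File.read("text.txt").

```ruby
# Stand-in for the contents of text.txt
text = "TextMate may be the latest craze, but Vim is forever. Vim!"

# scan returns every run of alphabetic characters, so no empty
# strings are produced and no flattening is needed
word_list = text.downcase.scan(/[[:alpha:]]+/).uniq.sort

p word_list
```

Reading the whole file at once is fine for small inputs; for very large files the line-by-line versions above use less memory.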
I want to preserve the order of the keys in a YAML file loaded from disk, processed in some way and written back to disk.
Here is a basic example of loading YAML in Ruby (v1.8.7):
require 'yaml'
configuration = nil
File.open('configuration.yaml', 'r') do |file|
  configuration = YAML::load(file)
  # at this point configuration is a hash with keys in an undefined order
end
# process configuration in some way
File.open('output.yaml', 'w+') do |file|
  YAML::dump(configuration, file)
end
Unfortunately, this will destroy the order of the keys in configuration.yaml once the hash is built. I cannot find a way of controlling what data structure is used by YAML::load(), e.g. alib's orderedmap.
I've had no luck searching the web for a solution.
Use Ruby 1.9.x. Earlier versions of Ruby do not preserve the insertion order of Hash keys, but 1.9 does.
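A quick way to check this on Ruby 1.9 or later is a round trip like the sketch below (the keys are arbitrary examples, deliberately not in alphabetical order):

```ruby
require "yaml"

yaml = "zebra: 1\napple: 2\nmango: 3\n"

config = YAML.safe_load(yaml)
p config.keys # keys come back in the order they appear in the document

# Dumping writes the keys in the Hash's insertion order again
output = YAML.dump(config)
puts output
```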
If you're stuck using 1.8.7 for whatever reason (like I am), you can do what I resorted to: use active_support/ordered_hash. I know activesupport seems like a big dependency, but later versions were refactored so that you can require only the part you need and leave the rest out. Just gem install activesupport and require it as shown below. Also, in your YAML file, be sure to use an !!omap declaration (and an array of Hashes). Example time!
# config.yml #
months: !!omap
- january: enero
- february: febrero
- march: marzo
- april: abril
- may: mayo
Here's what the Ruby behind it looks like.
# loader.rb #
require 'yaml'
require 'active_support/ordered_hash'
# Load up the file into a Hash
config = File.open('config.yml','r') { |f| YAML::load f }
# So long as you specified an !!omap, this is actually a
# YAML::PrivateClass, an array of Hashes
puts config['months'].class
# Parse through its value attribute, stick results in an OrderedHash,
# and reassign it to our hash
ordered = ActiveSupport::OrderedHash.new
config['months'].value.each { |m| ordered[m.keys.first] = m.values.first }
config['months'] = ordered
I'm looking for a solution that allows me to recursively dig through a Hash loaded from a .yml file, look for those YAML::PrivateClass objects, and convert them into an ActiveSupport::OrderedHash. I may post a question on that.
Someone else ran into the same issue. There is a gem called ordered hash. Note that it does not replace Hash; it creates a subclass of Hash. You might give it a try, but if you run into problems using it with YAML, you should consider upgrading to Ruby 1.9.