Search/check values in YAML document with Ruby - ruby

My goal:
check whether a YAML document includes a value for a specific key, using YPath/XPath
select the value for a specified key, using YPath/XPath
YAML document:
app:
  name: xxx
  version: xxx
description:
  author:
    name: xxx
    surname: xxx
    email: xxx@xxx.xx
What I checked:
Google
Stack Overflow
Ruby API (YAML::DBM, since one of the methods it provides is select)
Example of what I am looking for:
Module::Class.select('description/author/name')
Module::Class.select('*/name')
Module::Class.isset?('*/name')

Use yaml:
require 'yaml'
yml = YAML.load_file('your_file.yml')
Now yml is a hash, and you can use it like one. Here is a simple, if ugly, solution for what you are trying to do:
if !yml["description"].nil? && !yml["description"]["author"].nil? && !yml["description"]["author"]["name"].nil? && !yml["description"]["author"]["name"].empty?
  puts "An author is set!"
end
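On Ruby 2.3 or later, the chain of nil checks above can be collapsed with Hash#dig. The following is only a minimal sketch, not a YPath implementation: select_path and path_set? are hypothetical helper names, the inline document mirrors the one in the question, and wildcards such as */name are not handled.

```ruby
require 'yaml'

# Sample document mirroring the question (values are placeholders).
yml = YAML.safe_load(<<~DOC)
  app:
    name: xxx
    version: xxx
  description:
    author:
      name: xxx
      surname: xxx
DOC

# Hypothetical helper: follow a slash-separated path through nested hashes.
# Returns nil when a step is missing or when a step is not a container.
def select_path(hash, path)
  hash.dig(*path.split('/'))
rescue TypeError
  nil
end

# Hypothetical helper: true when the path leads to a non-empty value.
def path_set?(hash, path)
  value = select_path(hash, path)
  !(value.nil? || (value.respond_to?(:empty?) && value.empty?))
end

puts select_path(yml, 'description/author/name')
puts path_set?(yml, 'description/author/email')
```

The rescue covers paths that run through a scalar (for example app/name/first), where dig would otherwise raise.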

Since there are no up-to-date YPath implementations around, I would suggest giving ActiveSupport and Nokogiri a chance:
require 'yaml'
require 'active_support/core_ext/hash/conversions'
require 'nokogiri'
yml = YAML.load_file('your_file.yml') # or load it with your preferred YAML engine
# ActiveSupport adds a to_xml method to Hash
xml = yml.to_xml(:root => 'yaml')
doc = Nokogiri::XML(xml)
# The document root is <yaml>, so the path has to start there
doc.xpath("/yaml/description/author/name").each do |name|
  puts name.text
end

Related

How does a YAML double exclamation point work in this i18n gem?

I'm not using Rails and I haven't done any internationalization before, so I'm trying to understand how this particular example works but I'm a little bit stumped:
The r18n-desktop gem reads from a YAML file for translations. Pretty straightforward.
YAML file en.yml:
user:
  edit: Edit user
  name: User name is %1
  count: !!pl
    1: There is 1 user
    n: There are %1 users
log:
  signup: !!gender
    male: Он зарегистрировался
    female: Она зарегистрировалась
Test ruby code:
require 'r18n-desktop'
R18n.from_env('./localizations/')
R18n::Filters.add('gender') do |translation, config, user|
  puts translation
  puts config
  puts user
  translation[user.gender]
end
include R18n::Helpers
class Ayy
  attr_accessor :gender
end
girl = Ayy.new
girl.gender = :female
puts t.user.count(5)
puts t.log.signup girl
Output:
There are 5 users
localization-test.rb:13:in `puts': can't convert R18n::Translation to Array (R18n::Translation#to_ary gives R18n::Untranslated) (TypeError)
from localization-test.rb:13:in `puts'
from localization-test.rb:13:in `<main>'
Addendum: It looks like the error comes from puts rather than from the translation itself. The actual result of the translation is log.signup[], though, so the gender isn't getting through.
What is t.log.signup() expecting?
It seems you forgot to set a filter for the custom !!gender type.
R18n has only a few built-in filters, such as !!pl. The gender filter is not built in, so you need to define it manually.
The R18n filter docs already contain a simple example of a gender filter:
R18n::Filters.add('gender') do |translation, config, user|
  translation[user.gender]
end

Create nested object from YAML to access attributes via method calls in Ruby

I am completely new to ruby.
I have to parse a YAML file to construct an object
YAML File
projects:
  - name: Project1
    developers:
      - name: Dev1
        certifications:
          - name: cert1
      - name: Dev2
        certifications:
          - name: cert2
  - name: Project2
    developers:
      - name: Dev1
        certifications:
          - name: cert3
      - name: Dev2
        certifications:
          - name: cert4
I want to create an object from this YAML, for which I wrote the following Ruby code:
require 'yaml'
object = YAML.load(File.read('./file.yaml'))
I can successfully access the attributes of this object with [].
For example:
puts object['projects'].first['developers'].last['certifications'].first['name']
# prints cert2
However, I want to access the attributes via method calls.
For example:
puts object.projects.first.developers.last.certifications.first.name
# should print cert2
Is there any way to construct such an object whose attributes can be accessed with dots, as shown above?
I have read about OpenStruct and hashugar.
I would also like to avoid third-party gems.
Nice answer from Xavier, but it can be shorter: just require yaml, json and ostruct, parse your YAML, convert it to JSON, and parse that into an OpenStruct (a Struct would also be possible), like this:
object = JSON.parse(YAML.load(yaml).to_json, object_class: OpenStruct)
To load your YAML from a file, use:
object = JSON.parse(YAML::load_file("./test.yaml").to_json, object_class: OpenStruct)
This gives:
object
=>#<OpenStruct projects=[#<OpenStruct name="Project1", developers=[#<OpenStruct name="Dev1", certifications=[#<OpenStruct name="cert1">]>, #<OpenStruct name="Dev2", certifications=[#<OpenStruct name="cert2">]>]>, #<OpenStruct name="Project2", developers=[#<OpenStruct name="Dev1", certifications=[#<OpenStruct name="cert3">]>, #<OpenStruct name="Dev2", certifications=[#<OpenStruct name="cert4">]>]>]>
object.projects.first.developers.last.certifications.first.name
=>cert2
I use this for loading configuration from a file: YAML is easy to maintain, and in your code the result is easier to use than a configuration held in a Hash.
Don't do this for repetitive tasks.
If you are just experimenting, there is a quick and dirty way to do this:
class Hash
  def method_missing(name, *args)
    send(:[], name.to_s, *args)
  end
end
I wouldn't use that in production code though, since both method_missing and monkey-patching are usually recipes for trouble down the road.
A better solution is to recursively traverse the data-structure and replace hashes with openstructs.
require 'ostruct'
def to_ostruct(object)
  case object
  when Hash
    OpenStruct.new(Hash[object.map {|k, v| [k, to_ostruct(v)] }])
  when Array
    object.map {|x| to_ostruct(x) }
  else
    object
  end
end
puts to_ostruct(object).projects.first.developers.last.certifications.first.name
Note that there are potentially performance issues with either approach if you are doing them a lot - if your application is time-sensitive make sure you benchmark them! This probably isn't relevant to you though.
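If you do want to measure it, a rough sketch with the standard Benchmark module could look like this. The data hash is a hand-written stand-in for the parsed YAML, the run count is arbitrary, and the numbers will vary by machine and Ruby version, so treat it as a starting point rather than a verdict on either approach.

```ruby
require 'benchmark'
require 'json'
require 'ostruct'

# Same recursive conversion as above, repeated so the sketch runs on its own.
def to_ostruct(object)
  case object
  when Hash  then OpenStruct.new(object.map { |k, v| [k, to_ostruct(v)] }.to_h)
  when Array then object.map { |x| to_ostruct(x) }
  else object
  end
end

# Stand-in for the parsed YAML (one project, two developers).
data = {
  'projects' => [
    { 'name' => 'Project1',
      'developers' => [
        { 'name' => 'Dev1', 'certifications' => [{ 'name' => 'cert1' }] },
        { 'name' => 'Dev2', 'certifications' => [{ 'name' => 'cert2' }] }
      ] }
  ]
}

runs = 1_000 # arbitrary; raise it for steadier numbers

# Time the JSON round trip and the recursive conversion over the same data.
via_json  = Benchmark.realtime { runs.times { JSON.parse(data.to_json, object_class: OpenStruct) } }
recursive = Benchmark.realtime { runs.times { to_ostruct(data) } }

puts format('JSON round trip: %.4fs, recursive: %.4fs', via_json, recursive)
```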

Jekyll - generating JSON files alongside the HTML files

I'd like to make Jekyll create an HTML file and a JSON file for each page and post. This is to offer a JSON API of my Jekyll blog - e.g. a post can be accessed either at /posts/2012/01/01/my-post.html or /posts/2012/01/01/my-post.json
Does anyone know if there's a Jekyll plugin, or how I would begin to write such a plugin, to generate two sets of files side-by-side?
I was looking for something like this too, so I learned a bit of ruby and made a script that generates JSON representations of Jekyll blog posts. I’m still working on it, but most of it is there.
I put this together with Gruntjs, Sass, Backbonejs, Requirejs and Coffeescript. If you like, you can take a look at my jekyll-backbone project on Github.
# encoding: utf-8
#
# Title:
# ======
# Jekyll to JSON Generator
#
# Description:
# ============
# A plugin for generating JSON representations of your
# site content for easy use with JS MVC frameworks like Backbone.
#
# Author:
# ======
# Jezen Thomas
# jezenthomas@gmail.com
# http://jezenthomas.com
require 'fileutils'
require 'json'
module Jekyll
  class JSONGenerator < Generator
    safe true
    priority :low
    def generate(site)
      # Converter for .md > .html
      converter = site.getConverterImpl(Jekyll::Converters::Markdown)
      # Iterate over all posts
      site.posts.each do |post|
        # Encode the HTML to JSON
        hash = { "content" => converter.convert(post.content) }
        title = post.title.downcase.tr(' ', '-').delete("’!")
        # Start building the path
        path = "_site/dist/"
        # Add categories to path if they exist
        if (post.data['categories'].class == String)
          path << post.data['categories'].tr(' ', '/')
        elsif (post.data['categories'].class == Array)
          path << post.data['categories'].join('/')
        end
        # Add the sanitized post title to complete the path
        path << "/#{title}"
        # Create the directories from the path
        FileUtils.mkpath(path) unless File.exist?(path)
        # Create the JSON file and inject the data (the block closes the file)
        File.open("#{path}/raw.json", "w+") { |f| f.puts JSON.generate(hash) }
      end
    end
  end
end
There are two ways you can accomplish this, depending on your needs. If you want to use a layout to accomplish the task, then you want to use a Generator. You would loop through each page of your site and generate a new .json version of the page. You could optionally make which pages get generated conditional upon the site.config or the presence of a variable in the YAML front matter of the pages. Jekyll uses a generator to handle slicing blog posts up into indices with a given number of posts per page.
The second way is to use a Converter (same link, scroll down). The converter will allow you to execute arbitrary code on your content to translate it to a different format. For an example of how this works, check out the markdown converter that comes with Jekyll.
I think this is a cool idea!
Take a look at JekyllBot and the following code.
require 'json'
module Jekyll
  class JSONPostGenerator < Generator
    safe true
    def generate(site)
      site.posts.each do |post|
        render_json(post,site)
      end
      site.pages.each do |page|
        render_json(page,site)
      end
    end
    def render_json(post, site)
      #add `json: false` to YAML to prevent JSONification
      if post.data.has_key? "json" and !post.data["json"]
        return
      end
      path = post.destination( site.source )
      #only act on post/pages index in /index.html
      return if /\/index\.html$/.match(path).nil?
      #change file path
      path['/index.html'] = '.json'
      #render post using no template(s)
      post.render( {}, site.site_payload)
      #prepare output for JSON
      post.data["related_posts"] = related_posts(post,site)
      output = post.to_liquid
      output["next"] = output["next"].id unless output["next"].nil?
      output["previous"] = output["previous"].id unless output["previous"].nil?
      #write
      #todo, figure out how to overwrite post.destination
      #so we can just use post.write
      FileUtils.mkdir_p(File.dirname(path))
      File.open(path, 'w') do |f|
        f.write(output.to_json)
      end
    end
    def related_posts(post, site)
      related = []
      return related unless post.instance_of?(Post)
      post.related_posts(site.posts).each do |post|
        related.push :url => post.url, :id => post.id, :title => post.to_liquid["title"]
      end
      related
    end
  end
end
Both should do exactly what you want.

Read and write YAML files without destroying anchors and aliases?

I need to open a YAML file with aliases used inside it:
defaults: &defaults
  foo: bar
  zip: button
node:
  <<: *defaults
  foo: other
This obviously expands out to an equivalent YAML document of:
defaults:
  foo: bar
  zip: button
node:
  foo: other
  zip: button
which is how YAML::load reads it.
I need to set new keys in this YAML document and then write it back out to disk, preserving the original structure as much as possible.
I have looked at YAML::Store, but this completely destroys the aliases and anchors.
Is there anything available that could do something along the lines of:
thing = Thing.load("config.yml")
thing[:node][:foo] = "yet another"
saving the document back as:
defaults: &defaults
  foo: bar
  zip: button
node:
  <<: *defaults
  foo: yet another
?
I opted to use YAML for this due to the fact it handles this aliasing well, but writing YAML that contains aliases appears to be a bit of a bleak-looking playing field in reality.
The use of << to indicate an aliased mapping should be merged in to the current mapping isn’t part of the core Yaml spec, but it is part of the tag repository.
The current Yaml library provided by Ruby – Psych – provides the dump and load methods which allow easy serialization and deserialization of Ruby objects and use the various implicit type conversion in the tag repository including << to merge hashes. It also provides tools to do more low level Yaml processing if you need it. Unfortunately it doesn’t easily allow selectively disabling or enabling specific parts of the tag repository – it’s an all or nothing affair. In particular the handling of << is pretty baked in to the handling of hashes.
One way to achieve what you want is to provide your own subclass of Psych’s ToRuby class and override its revive_hash method, so that it just treats mapping keys of << as literals. This involves overriding a private method in Psych, so you need to be a little careful:
require 'psych'
class ToRubyNoMerge < Psych::Visitors::ToRuby
  def revive_hash hash, o
    @st[o.anchor] = hash if o.anchor
    o.children.each_slice(2) { |k,v|
      key = accept(k)
      hash[key] = accept(v)
    }
    hash
  end
end
You would then use it like this:
tree = Psych.parse your_data
data = ToRubyNoMerge.new.accept tree
With the Yaml from your example, data would then look something like
{"defaults"=>{"foo"=>"bar", "zip"=>"button"},
"node"=>{"<<"=>{"foo"=>"bar", "zip"=>"button"}, "foo"=>"other"}}
Note the << as a literal key. Also the hash under the data["defaults"] key is the same hash as the one under the data["node"]["<<"] key, i.e. they have the same object_id. You can now manipulate the data as you want, and when you write it out as Yaml the anchors and aliases will still be in place, although the anchor names will have changed:
data['node']['foo'] = "yet another"
puts Psych.dump data
This produces the following. Psych used the object_id of the hash to ensure unique anchor names; the current version of Psych uses sequential numbers instead:
---
defaults: &2151922820
  foo: bar
  zip: button
node:
  <<: *2151922820
  foo: yet another
If you want to have control over the anchor names, you can provide your own Psych::Visitors::Emitter. Here’s a simple example based on your example and assuming there’s only the one anchor:
class MyEmitter < Psych::Visitors::Emitter
  def visit_Psych_Nodes_Mapping o
    o.anchor = 'defaults' if o.anchor
    super
  end
  def visit_Psych_Nodes_Alias o
    o.anchor = 'defaults' if o.anchor
    super
  end
end
When used with the modified data hash from above:
#create an AST based on the Ruby data structure
builder = Psych::Visitors::YAMLTree.new
builder << data
ast = builder.tree
# write out the tree using the custom emitter
MyEmitter.new($stdout).accept ast
the output is:
---
defaults: &defaults
  foo: bar
  zip: button
node:
  <<: *defaults
  foo: yet another
(Update: another question asked how to do this with more than one anchor, where I came up with a possibly better way to keep anchor names when serializing.)
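A different route, sketched here as an alternative rather than as part of the approach above, is to stay at the AST level: parse with Psych.parse_stream, change the scalar node in place, and emit the tree again. Nothing is resolved or merged at that level, so the anchor names and the << key should survive as written. mapping_value is a hypothetical helper, and comments in the source file are still lost because the parser does not keep them.

```ruby
require 'psych'

source = <<~DOC
  defaults: &defaults
    foo: bar
    zip: button
  node:
    <<: *defaults
    foo: other
DOC

# Hypothetical helper: find the value node stored under +key+ in a mapping
# node. Psych keeps a mapping's children as a flat [key, value, ...] list.
def mapping_value(mapping, key)
  mapping.children.each_slice(2) do |k, v|
    return v if k.is_a?(Psych::Nodes::Scalar) && k.value == key
  end
  nil
end

stream = Psych.parse_stream(source) # AST only; aliases are not expanded
root   = stream.children.first.root # top-level mapping of the first document

# Change node.foo by editing the scalar node's value directly.
node = mapping_value(root, 'node')
mapping_value(node, 'foo').value = 'yet another'

puts stream.to_yaml
```

This only suits edits that map onto existing nodes; adding new keys means building Psych::Nodes::Scalar objects by hand and appending them to the mapping's children.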
YAML has aliases and they can round-trip, but you disable it by hash merging. << as a mapping key seems a non-standard extension to YAML (both in 1.8's syck and 1.9's psych).
require 'rubygems'
require 'yaml'
yaml = <<EOS
defaults: &defaults
  foo: bar
  zip: button
node: *defaults
EOS
data = YAML.load yaml
print data.to_yaml
prints
---
defaults: &id001
  zip: button
  foo: bar
node: *id001
but the << in your data merges the aliased hash into a new one which is no longer an alias.
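The difference can be checked through object identity. This is a small sketch under the assumption that a reasonably recent Psych is installed; load_with_aliases is a hypothetical helper that picks unsafe_load where it exists, because newer Psych versions reject aliases in a plain load.

```ruby
require 'yaml'

# Hypothetical helper: load with aliases enabled on both old and new Psych.
def load_with_aliases(text)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(text) : YAML.load(text)
end

# Plain alias: node points at the very same hash as defaults.
aliased = load_with_aliases(<<~DOC)
  defaults: &defaults
    foo: bar
    zip: button
  node: *defaults
DOC

# Merge key: node becomes a new hash built from defaults plus its own keys.
merged = load_with_aliases(<<~DOC)
  defaults: &defaults
    foo: bar
    zip: button
  node:
    <<: *defaults
    foo: other
DOC

puts aliased['node'].equal?(aliased['defaults'])
puts merged['node'].equal?(merged['defaults'])
puts YAML.dump(aliased)
puts YAML.dump(merged)
```

The first check should report the same object and the second a different one, which is why only the first dump has anything to write an anchor and alias for.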
Have you tried Psych? There is another question about Psych here.
I'm generating my CircleCI config file with a Ruby script and ERB templates. My script parses and regenerates the YAML, so I wanted to preserve all the anchors. The anchors in my config all have the same name as the key that defines them, e.g.
docker_images:
  docker_auth: &docker_auth
    username: '$DOCKERHUB_USERNAME'
    password: '$DOCKERHUB_TOKEN'
  cimg_base_image: &cimg_base_image
    image: cimg/base:2022.09
    auth: *docker_auth
jobs:
  tests:
    docker:
      - *cimg_base_image
So I was able to solve this with regular expressions on the generated YAML string. I wrote a #restore_yaml_anchors method that converts &1 and *1 back into &docker_auth and *docker_auth.
# Ruby 3.1.2
require 'rubygems'
require 'yaml'
yaml = <<EOS
docker_images:
  docker_auth: &docker_auth
    username: '$DOCKERHUB_USERNAME'
    password: '$DOCKERHUB_TOKEN'
  cimg_base_image: &cimg_base_image
    image: cimg/base:2022.09
    auth: *docker_auth
jobs:
  tests:
    docker:
      - *cimg_base_image
EOS
data = YAML.load yaml, aliases: true # needed for Ruby 3.x
def restore_yaml_anchors(yaml)
  yaml.scan(/([A-Z0-9a-z_]+|<<): &(\d+)/).each do |anchor_name, anchor_id|
    yaml.gsub!(/([:-]) (\*|&)#{anchor_id}/, "\\1 \\2#{anchor_name}")
  end
  yaml
end
puts [
  "Original #to_yaml:",
  data.to_yaml,
  "-----------------------", '',
  "With restored anchors:",
  restore_yaml_anchors(data.to_yaml)
].join("\n")
Output:
Original #to_yaml:
---
docker_images:
  docker_auth: &1
    username: "$DOCKERHUB_USERNAME"
    password: "$DOCKERHUB_TOKEN"
  cimg_base_image: &2
    image: cimg/base:2022.09
    auth: *1
jobs:
  tests:
    docker:
    - *2
-----------------------
With restored anchors:
---
docker_images:
  docker_auth: &docker_auth
    username: "$DOCKERHUB_USERNAME"
    password: "$DOCKERHUB_TOKEN"
  cimg_base_image: &cimg_base_image
    image: cimg/base:2022.09
    auth: *docker_auth
jobs:
  tests:
    docker:
    - *cimg_base_image
It's working well for my CI config, but you may need to update it to handle some other cases in your own YAML.

How to parse a JSON friend list to hash in Ruby?

I have a file that contains this JSON (from the Facebook API):
--- !seq:Koala::Facebook::API::GraphCollection
- name: pop ool
  id: "1032225"
- name: Rose kak
  id: "2312010"
and in Ruby I try to do:
jsonFriends = File.open("friends.json" ,"r")
puts JSON.parse(jsonFriends.readline)
but I get this error:
from /usr/local/Cellar/ruby/1.9.3-p194/lib/ruby/1.9.1/json/common.rb:148:in `parse' from try.rb:22:in `<main>'
That's YAML, not JSON.
require 'yaml'
friends = YAML.load(File.read('friends.json'))
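If you are not sure which format a file holds, one option is to try JSON first and fall back to YAML when the JSON parser rejects the text. This is a minimal sketch: parse_json_or_yaml is a hypothetical helper, and the sample text leaves out the !seq:Koala::Facebook::API::GraphCollection tag from the question, which may need extra handling depending on your YAML engine.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper: try JSON first, then fall back to YAML.
def parse_json_or_yaml(text)
  JSON.parse(text)
rescue JSON::ParserError
  YAML.safe_load(text)
end

# Same friend list as in the question, without the custom tag.
yaml_text = <<~DOC
  - name: pop ool
    id: "1032225"
  - name: Rose kak
    id: "2312010"
DOC

friends = parse_json_or_yaml(yaml_text)
friends.each { |friend| puts "#{friend['id']}: #{friend['name']}" }
```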
Try this:
require 'json'
result = File.read("friends.json")
puts JSON.parse(result)
