Annotating Ruby structures to include anchors/references on #to_yaml

I have some large hashes (>10⁵ keys) with interlocking structures. They're stored on disk as YAML. I'd like to avoid duplication by using anchors and references in the YAML, but I haven't been able to figure out if there's a way to do it implicitly in the hash such that the #to_yaml method will label the anchor nodes properly.
Desired YAML:
---
parent1:
  common-element-1: &CE1
    complex-structure-goes: here
parent2:
  uncommon-element-1:
    blah: blah
  <<: *CE1
Ruby code:
hsh = {
  'parent1' => {
    'common-element-1' => {
      'complex-structure-goes' => 'here',
    },
  },
  'parent2' => {
    'uncommon-element-1' => {
      'blah' => 'blah',
    },
    '<<' => '*CE1',
  },
}
The reference is quite straightforward -- but how to embed the &CE1 anchor in the 'common-element-1' item in the Ruby hash?
I want to work as much as possible with native Ruby primitive types (like Hash) rather than mucking about with builders and emitters and such -- and I definitely don't want to write the YAML manually!
I've looked at Read and write YAML files without destroying anchors and aliases? and its relative, among other places, but haven't found an answer yet -- at least not that I've understood.
Thanks!

If you use the same Ruby object, the YAML library will set up references for you:
> common = {"ohai" => "I am common"}
> doc = {"parent1" => {"id" => 1, "stuff" => common}, "parent2" => {"id" => 2, "stuff" => common}}
> puts doc.to_yaml
---
parent1:
  id: 1
  stuff: &70133422893680
    ohai: I am common
parent2:
  id: 2
  stuff: *70133422893680
I'm not sure there's a straightforward way of defining Hashes that are subsets of each other, though. Perhaps tweaking your structure a bit would be warranted?
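A minimal sketch of the shared-object behaviour described above, assuming a Ruby whose Psych accepts the `aliases:` keyword on `YAML.safe_load` (Psych 3.1 or later). The variable names are illustrative only, and the exact anchor label in the output depends on the Psych version, so it may not match the object-id style shown above.

```ruby
require "yaml"

# One Hash instance referenced from two places.
common = { "ohai" => "I am common" }
doc = {
  "parent1" => { "id" => 1, "stuff" => common },
  "parent2" => { "id" => 2, "stuff" => common }
}

# The emitter writes an anchor (&...) at the first occurrence
# and an alias (*...) at the second.
yaml = doc.to_yaml
puts yaml

# Loading with aliases enabled restores the sharing: both parents
# refer to the same Hash instance again.
loaded = YAML.safe_load(yaml, aliases: true)
puts loaded["parent1"]["stuff"].equal?(loaded["parent2"]["stuff"])
```

The sharing only appears when both references point at the same object; two separate hashes with equal contents are written out twice.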

Related

Freeze arrays and hashes by default?

Just wondering if something like:
# frozen_string_literal: true
exists but for Array and Hash?
The goal is to avoid having to call .freeze on every single one of those within the same globals file.
I didn't find any library that monkey-patches core Ruby classes like Array or Hash, but I found an interesting gem, immutable-ruby, that may fit your needs.
A simple example:
require "immutable/hash"
person = Immutable::Hash[name: "Simon", gender: :male]
# => Immutable::Hash[:name => "Simon", :gender => :male]
You cannot modify its values in place, because it is immutable. You can still perform operations on the hash, but each one returns a new copy:
friend = person.put(:name, "James") # => Immutable::Hash[:name => "James", :gender => :male]
person # => Immutable::Hash[:name => "Simon", :gender => :male]
friend[:name] # => "James"
person[:name] # => "Simon"
I found a way to handle it without another gem, using only VS Code and RuboCop:
Install the RuboCop extension in VS Code.
Open your .vscode/settings.json.
Append these rules:
{
"editor.formatOnSave": true,
"editor.formatOnSaveTimeout": 5000,
"ruby.format": "rubocop"
}
save
enjoy
Thanks to Tom Lord for the hint.
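For the original goal of not writing .freeze on every literal in a globals file, another option is a small recursive helper applied once per top-level value. The sketch below is only an illustration: `deep_freeze` is a hypothetical helper, not part of Ruby or of any gem mentioned here, and the sample settings are made up.

```ruby
# Hypothetical helper: freezes a value and, for Hashes and Arrays,
# everything nested inside it.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each do |key, value|
      deep_freeze(key)
      deep_freeze(value)
    end
  when Array
    obj.each { |element| deep_freeze(element) }
  end
  obj.freeze
end

# Made-up example data standing in for a globals file entry.
settings = deep_freeze({
  "hosts"  => ["a.example", "b.example"],
  "limits" => { "retries" => 3 }
})

puts settings.frozen?           # true
puts settings["hosts"].frozen?  # true
```

Any later attempt to mutate `settings` or one of its nested collections raises a FrozenError.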

Ruby find key by name inside converted JSON array of hashes

I have a Ruby hash converted from JSON data, it looks like this:
{ :query => {
:pages => {
:"743958" => {
:pageid => 743958,
:ns => 0,
:title => "Asterix the Gaul",
:revisions => [ {
:contentformat => "text/x-wiki",
:contentmodel => "wikitext",
:* => "{{Cleanup|date=April 2010}}\n{{Infobox graphic novel\n<!--Wikipedia:WikiProject Comics-->...
All the good stuff is inside the revisions array and then the Infobox hash.
The problem I have is getting to the Infobox hash. I can't seem to get to it. The pages and pageid hashes might not exist for other entries and of course the ID would be different.
I've tried all sorts of methods I could think of like .map, .select, .find, .include?, etc to no avail because they are not recursive and will not go into each key and array.
All the answers I've seen on Stack Overflow only show how to get a value by name from a one-dimensional array, which doesn't help.
How can I get the Infobox data from this?
Is this what you're looking for?
pp data
=> {:query=> {:pages=>
{:"743958"=>
{:pageid=>743958,
:ns=>0,
:title=>"Asterix the Gaul",
:revisions=>
[{:contentformat=>"text/x-wiki",
:contentmodel=>"wikitext",
:*=>"{{Cleanup..."}]}}}}
# just return data from the first revision
data[:query][:pages].map{|page_id,page_hash| page_hash[:revisions].first[:"*"]}
=> ["{{Cleanup..."]
# get data from all revisions
data[:query][:pages].map{|page_id,page_hash| page_hash[:revisions].map{|revision| revision[:"*"] }}.flatten
=> ["{{Cleanup..."]
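The lookups above assume that the :query and :pages keys are always present. Since the question asks for a search that descends into every hash and array, a recursive walk is one way to sketch it. `deep_find_all` is a hypothetical helper written for this example, and the data below is a shortened, made-up stand-in for the real response.

```ruby
# Hypothetical helper: collects every value stored under `key`
# anywhere inside nested Hashes and Arrays.
def deep_find_all(node, key, found = [])
  case node
  when Hash
    node.each do |k, v|
      found << v if k == key
      deep_find_all(v, key, found)
    end
  when Array
    node.each { |item| deep_find_all(item, key, found) }
  end
  found
end

# Shortened stand-in for the parsed JSON from the question.
data = {
  query: {
    pages: {
      "743958": {
        pageid: 743958,
        title: "Asterix the Gaul",
        revisions: [{ contentmodel: "wikitext", "*": "{{Cleanup...}}" }]
      }
    }
  }
}

p deep_find_all(data, :*)      # ["{{Cleanup...}}"]
p deep_find_all(data, :title)  # ["Asterix the Gaul"]
```

Because the walk does not depend on the page ID or on any intermediate key, it still works when those differ between entries.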

In which version of Ruby did ':' appear as an alternative to '=>'?

I mean
some: true
vs
:some => true
I have a compatibility problem between my Rails version and my Ruby version, and I need to know in which version the : form appeared as an alternative to =>.
I don't know how to search Google for this kind of information.
This is a feature introduced in Ruby 1.9:
{ example: 'key' }
# => { :example => 'key' }
This is similar to how JavaScript and other languages define their dictionary-type structures. The keys generated this way are always Symbol-type.
It's also possible to mix and match:
variable = :foo
{ example: 'key', 'string' => 'stored', variable => 'thing' }
# => {:example=>"key", "string"=>"stored", :foo=>"thing"}
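Being able to mix styles is a good thing, because the x: approach is more limited: it always produces a Symbol key. If you want keys that are not plain identifiers (keys containing dots, for example), you'll need the older => style; since Ruby 2.2 a quoted "some.key": form is also accepted, but it still produces a Symbol.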
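A short sketch comparing the two literal styles, with made-up keys. The quoted-label form in the last part assumes Ruby 2.2 or later.

```ruby
# Both literals build the same Hash with a Symbol key.
new_style = { some: true }
old_style = { :some => true }
puts new_style == old_style  # true

# String keys are only possible with the => form.
string_key = { "some" => true }
puts string_key == new_style  # false

# Quoted label (Ruby 2.2+): allows dots, but the key is still a Symbol.
quoted = { "dotted.key": 1 }
p quoted.keys.first  # :"dotted.key"
```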

Tabbed text file to MultiDimensional hash using Ruby?

I'm having a bit of trouble figuring out how I'd go about this for a part of my project. Basically I need to take a normal tabbed text file and convert it into a multidimensional hash in Ruby so I can cycle through and detect which parts have children. An example of the file:
hello
    world
    how
are
    you
        today
Would become:
{'hello' => ['world', 'how'], 'are' => {'you' => ['today']}}
Since your input format is up to you, I really don't understand why you're not using YAML:
puts({ 'hello' => ['world', 'how'], 'are' => { 'you' => ['today'] } }.to_yaml)
yields:
---
hello:
- world
- how
are:
  you:
  - today
Calling YAML.load with that string, of course, returns the original data structure. Contrary to what you believe, YAML does not require a "key value syntax".
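As a quick check of that claim, the sketch below loads hand-written YAML text of the same shape and compares it with the structure from the question. The heredoc label and variable names are arbitrary.

```ruby
require "yaml"

# Hand-written YAML: nesting comes from indentation, and the leaf
# items are plain list entries with no key/value pair of their own.
text = <<~DOC
  hello:
  - world
  - how
  are:
    you:
    - today
DOC

data = YAML.load(text)
p data
puts data == { "hello" => ["world", "how"], "are" => { "you" => ["today"] } }
```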

What are good examples of mapping YAML data to Ruby objects?

I am looking for basic examples of YAML syntax and how to work with it in Ruby.
Basically, by looking at the examples, I hope to better understand how to map YAML scalars to object attributes, and whether to use separate YAML files or one YAML file containing multiple objects.
There is a YAML module in Ruby's standard library; its documentation has a short tutorial and a few links.
YAML in Five Minutes
Serializing and Deserializing objects with Ruby
require "yaml"
test_obj = ["dogs", "cats", "badgers"]
yaml_obj = YAML::dump( test_obj )
# -> ---
#    - dogs
#    - cats
#    - badgers
ruby_obj = YAML::load( yaml_obj )
# => ["dogs", "cats", "badgers"]
ruby_obj == test_obj
# => true
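The example above round-trips a plain array. For the part of the question about mapping YAML scalars to object attributes, one possible approach is to load a list of mappings from a single file and pass each mapping to a constructor. This is a sketch under assumptions: the `Pet` struct and the sample data are invented for illustration, and it relies on `keyword_init:` (Ruby 2.5+) and the `symbolize_names:` option of `YAML.safe_load`.

```ruby
require "yaml"

# Hypothetical value class; each YAML mapping supplies its attributes.
Pet = Struct.new(:name, :species, keyword_init: true)

# One document holding several objects as a list of mappings.
text = <<~DOC
  - name: Rex
    species: dog
  - name: Tom
    species: cat
DOC

# Keys are loaded as Symbols so they can be splatted as keyword arguments.
pets = YAML.safe_load(text, symbolize_names: true).map { |attrs| Pet.new(**attrs) }

puts pets.first.name    # Rex
puts pets.last.species  # cat
```

Keeping the file as plain mappings and building the objects in Ruby avoids Ruby-specific tags in the YAML, so the same file stays readable from other languages.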
