ruby yaml ypath like xpath? - ruby

Hi, I have a YAML file like so:
---
data:
  - date: "2004-06-11"
    description: First description
  - date: "2008-01-12"
    description: Another description
How can I do a "ypath" query, similar to XPath for XML? Something like "get the description where the date is 2004-06-11":
YAML.parse_file('myfile.yml').select('/data/*/date == 2004-06-11')
How do you do it and, if that's possible, how can I similarly edit the description by "ypath"?
Thanks

There is indeed such a thing as YPath: github.com/peterkmurphy/YPath-Specification
It's implemented in Ruby's older Syck-based YAML lib (Psych, the current default engine, does not include it); see the doc for BaseNode#search.

If Ruby is not a hard constraint you might take a look at the dpath tool.
It provides an XPath-like query language for YAML (and other) files. You could
call it externally to filter your data.

The YAML file describes a hash mapping strings to arrays of hashes, which in turn map strings to strings. There is no such thing as XPath for nested hashes (at least not in the standard library), but it's simple enough with standard Hash and Enumerable methods:
require 'yaml'
hash = YAML.load_file('myfile.yml')
item = hash["data"].find { |inner_hash| inner_hash["date"] == "2004-06-11" }
#=> {"date"=>"2004-06-11", "description"=>"First description"}
To change the description, you can then simply do item["description"] = "new description" and then serialize the hash back to YAML using hash.to_yaml.
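The find-then-edit steps above can be sketched end to end. This is only a minimal sketch, not a YPath implementation: the sample document is inlined so it runs on its own, and update_description is a hypothetical helper name, not part of Ruby's YAML library.

```ruby
require 'yaml'

# Sample document inlined for the sketch; in practice it would come from
# YAML.load_file('myfile.yml').
source = <<~DOC
  ---
  data:
    - date: "2004-06-11"
      description: First description
    - date: "2008-01-12"
      description: Another description
DOC

# Hypothetical helper (not a library API): finds the first entry under "data"
# with the given date and replaces its description. Returns the entry, or nil
# when no entry matches.
def update_description(doc, date, new_description)
  item = doc["data"].find { |entry| entry["date"] == date }
  return nil unless item

  item["description"] = new_description
  item
end

doc = YAML.load(source)
update_description(doc, "2004-06-11", "new description")

# Serialize the modified structure back to YAML text.
updated_yaml = doc.to_yaml
puts updated_yaml
```

Writing updated_yaml back with File.write would complete the edit on disk. Note that comments and formatting in the original file are not preserved by a load/dump round trip.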

Related

bash Iterate over a hash table

I have read the output of a function into a variable.
The data looks like this:
---
data:
  pkg:
    -
      NAME: 'bob'
      FEATURE: Big
    -
      NAME: 'sue'
      FEATURE: Tall
    -
      NAME: 'jim'
      FEATURE: Slim
I see examples of iterating over an array.
Those examples always create the array by hand.
Is there a way to transform the hash into an array? How do I do that? Or can I deal with it in this form?
I'd like to echo the FEATURE of each pkg.
The YAML in your example represents an array of dictionaries. Bash doesn't do multidimensional arrays of any sort.
You can, however, simulate the result by parsing your data into parallel arrays, so that ${name[0]} (bob) lines up by index with ${feature[0]} (Big).
The real problem is manually parsing YAML, which I don't recommend.
If you really need to dive into that, check out this discussion which has some options.
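If calling out to another tool is acceptable, one way to avoid hand-parsing is to let a YAML-aware language do the work and have bash consume the plain-text result. Below is a minimal Ruby sketch. It assumes pkg is nested under data (the indentation in the question is not shown), and the sample is inlined so the sketch is self-contained.

```ruby
require 'yaml'

# Sample inlined for the sketch; a real script might read the text from
# standard input instead. The nesting data -> pkg -> list is an assumption.
source = <<~DOC
  ---
  data:
    pkg:
      -
        NAME: 'bob'
        FEATURE: Big
      -
        NAME: 'sue'
        FEATURE: Tall
      -
        NAME: 'jim'
        FEATURE: Slim
DOC

doc = YAML.safe_load(source)

# Collect the FEATURE value of every entry in the pkg list.
features = doc.dig("data", "pkg").map { |pkg| pkg["FEATURE"] }

# One value per line, which is easy to read back in a shell loop.
puts features
```

Since each feature is printed on its own line, a bash `while read -r feature` loop could consume the output directly.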

SuperCollider: convert a Dictionary to YAML

SuperCollider has a String:parseYAML method that can create a nested Dictionary:
"{44: 'woo'}".parseYAML
Dictionary[ (44 -> woo) ]
But how to go the other way, output a YAML string given a (possibly nested) Dictionary?
[answer is from someone else outside]
Does the document have to be readable?
I've been using JSON.stringify from Felix's API quark in order to share dictionaries with a Max/MSP application.
The result from this method is not readable, that is, it doesn't generate any newlines, tabs, etc. So it doesn't look pretty in a text document, but I imagine that was not the intent of the method's design.

Is there a way to alias/anchor an array in YAML?

I'm using Jammit to package assets up for a Rails application and I have a few asset files that I'd like to be included in each of a few groups. For example, I'd like Sammy and its plugins to be in both my mobile and screen JS packages.
I've tried this:
sammy: &SAMMY
  - public/javascripts/vendor/sammy.js
  - public/javascripts/vendor/sammy*.js
mobile:
  <<: *SAMMY
  - public/javascripts/something_else.js
and this:
mobile:
  - *SAMMY
but both put the Sammy JS files in a nested Array, which Jammit can't understand. Is there a syntax for including the elements of an Array directly in another Array?
NB: I realize that in this case there are only two elements in the SAMMY Array, so it wouldn't be too bad to give each an alias and reference both in each package. That's fine for this case, but quickly gets unmaintainable when there are five or ten elements that have a specific load order.
Closest solution I know of is this one:
sammy:
  - &SAMMY1
    public/javascripts/vendor/sammy.js
  - &SAMMY2
    public/javascripts/vendor/sammy*.js
mobile:
  - *SAMMY1
  - *SAMMY2
  - public/javascripts/something_else.js
Alternatively, as already suggested, flatten the nested lists in a code snippet.
Note: according to yaml-online-parser, your first suggestion is not a valid use of << (which is used to merge keys from two dictionaries). The anchor then has to point to another dictionary, I believe.
If you want mobile to be equal to sammy, you can just do:
mobile: *SAMMY
However if you want mobile to contain other elements in addition to those in sammy, there's no way to do that in YAML to the best of my knowledge.
Your example is valid YAML (a convenient place to check is YPaste), but it's not defined what the merge does. Per the spec, a merge key can have a value:
A mapping, in which case it's merged into the parent mapping.
A sequence of mappings, in which case each is merged, one-by-one, into the parent mapping.
There's no way of merging sequences at the YAML level.
You can, however, do this in code. Using the YAML from your second idea:
mobile:
  - *SAMMY
you'll get nested sequences - so flatten them! Assuming you have a mapping of such nested sequences:
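require 'yaml'
# On Psych 4+ (Ruby 3.1+) aliases must be enabled: YAML.load_file('test.yaml', aliases: true)
data = YAML.load_file('test.yaml')
data.each_pair { |_key, value| value.flatten! }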
(Of course, if you have a more complicated YAML file, and you don't want every sequence flattened (or they're not all sequences), you'll have to do some filtering.)
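A self-contained sketch of the flattening approach follows. It assumes a Ruby recent enough for YAML.safe_load to accept the aliases: keyword, and it uses the non-mutating flatten so that the aliased sammy array itself is left untouched. The inlined document and the variable names are illustrative only.

```ruby
require 'yaml'

# Illustrative config: mobile pulls in the sammy list through an alias,
# which produces a nested array when loaded.
source = <<~DOC
  sammy: &SAMMY
    - public/javascripts/vendor/sammy.js
    - public/javascripts/vendor/sammy*.js
  mobile:
    - *SAMMY
    - public/javascripts/something_else.js
DOC

# Aliases have to be enabled explicitly with safe_load.
packages = YAML.safe_load(source, aliases: true)

# Flatten only the values that are arrays; anything else passes through.
flattened = packages.transform_values do |value|
  value.is_a?(Array) ? value.flatten : value
end

puts flattened["mobile"]
```

The filtering mentioned above would go into the block passed to transform_values, for example by checking the key before flattening.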
This solution is for Symfony/PHP only (for considerations for other languages, see below).
Note about array keys from the PHP array manual page:
Strings containing valid decimal ints, unless the number is preceded by a + sign, will be cast to the int type. E.g. the key "8" will actually be stored under 8. [...]
This means that if you actually index your anchor array with integer keys, you can simply add new keys by continuing the initial list. So your solution would look like this:
sammy: &SAMMY
  1: public/javascripts/vendor/sammy.js
  2: public/javascripts/vendor/sammy*.js
mobile:
  <<: *SAMMY
  3: public/javascripts/something_else.js
You can even overwrite keys and still add new ones:
laptop:
  <<: *SAMMY
  1: public/javascripts/sammy_laptop.js
  3: public/javascripts/something_else.js
In both cases the end result is a perfectly valid indexed array, just like before.
Other programming languages
Depending on your YAML implementation and how you iterate over your array, this could conceivably also be used in other programming languages, though with a caveat.
For instance, in JS you can access numerical string keys by their integer value as well:
const sammy = {"1": "public/javascripts/vendor/sammy.js"}
sammy["1"]; // "public/javascripts/vendor/sammy.js"
sammy[1]; // "public/javascripts/vendor/sammy.js"
But you'd need to keep in mind that your initial array is now an object, and that you would need to iterate over it accordingly, e.g.:
Object.keys(sammy).forEach(key => console.log(sammy[key]))
As has been suggested, when you need to flatten a list, at least in Ruby, it is trivial to add a "!flatten" type specifier to mobile and implement a class that extends Array, registers the yaml_tag, and flattens the coder's seq in init_with.
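A rough sketch of that idea, assuming Psych (the YAML engine bundled with current Ruby). FlattenedList is a hypothetical class name chosen for illustration, and it has to be passed in permitted_classes because safe_load rejects unknown classes by default.

```ruby
require 'yaml'

# Hypothetical Array subclass bound to the local "!flatten" tag.
class FlattenedList < Array
  yaml_tag '!flatten'

  # Called by Psych for sequences carrying the tag; coder.seq holds the
  # already loaded children, including arrays pulled in through aliases.
  def init_with(coder)
    replace(coder.seq.flatten)
  end
end

source = <<~DOC
  sammy: &SAMMY
    - public/javascripts/vendor/sammy.js
    - public/javascripts/vendor/sammy*.js
  mobile: !flatten
    - *SAMMY
    - public/javascripts/something_else.js
DOC

packages = YAML.safe_load(source, permitted_classes: [FlattenedList], aliases: true)
puts packages["mobile"]
```

With this in place only the sequences explicitly tagged !flatten are flattened, so no post-processing pass over the loaded data is needed.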

What's the xpath syntax to get tag names?

I'm using Nokogiri to parse a large XML file. Say I've got the following structure:
<menagerie>
<penguin>Pablo</penguin>
<penguin>Mortimer</penguin>
<bull>Ferdinand</bull>
<aardvark>James Cornelius Madison Humphrey Zophar Handlebrush III</aardvark>
</menagerie>
I can count the non-penguins like this:
xml.xpath('//menagerie//*[not(self::penguin)]').length # => 2
But how do I get a list of the tags, like this? (The exact format isn't important; I just want to visually scan the non-penguins.)
bull
aardvark
Update
This gave me the list I wanted - thanks Oded and TMN and delnan!
xml.xpath('//menagerie/*[not(self::penguin)]').each do |node|
  puts node.name
end
You can use the name() or local-name() XPath function.
See the examples on zvon.
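I know it's a bit outdated, but you should use xml.xpath('//menagerie/*[not(self::penguin)]/name()') as the expression. Note the slash, not the dot. This is how you call functions on the current node in XPath 2.0; XPath 1.0 engines, such as the libxml2 one behind Nokogiri, do not accept a function call as a path step.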

save/edit array in and outside ruby

I have an array like "author", "post title", "date", "time", "post category", etc.
I scrape the details from a forum and I want to
save the data using Ruby
update the data using Ruby
update the data using a text editor, or I was thinking of one of the OpenOffice programs? Calc would be the best.
I guess some kind of SQL database would be a solution, but I need a quick solution (something that I can do by myself :-)
Any suggestions?
Thank you
YAML is your friend here.
require "yaml"
yaml = ["author", "post title", "date", "time", "post category"].to_yaml
File.open("filename", "w") do |f|
  f.write(yaml)
end
This will give you
---
- author
- post title
- date
- time
- post category
Going the other way, you get
require "yaml"
YAML.load(File.read("filename")) # => ["author","post title","date","time","post category"]
YAML is easily human readable, so you can edit it with any text editor (not a word processor like OpenOffice). You are not limited to serializing arrays and strings: YAML works out of the box for most Ruby objects, even for objects of user-defined classes. This is a good introduction to the YAML syntax: http://yaml.kwiki.org/?YamlInFiveMinutes.
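For scraped posts, each record would probably be a hash rather than a bare string. A minimal round-trip sketch, with made-up sample records and field names, kept to plain arrays, hashes and strings so that it loads without extra options:

```ruby
require 'yaml'

# Made-up sample records standing in for the scraped forum data.
posts = [
  { "author" => "alice", "title" => "First post",  "category" => "general" },
  { "author" => "bob",   "title" => "Second post", "category" => "help" }
]

# Serialize; this text is what would be written to a file and hand-edited.
yaml_text = posts.to_yaml

# Load it back and update one field from Ruby.
restored = YAML.load(yaml_text)
restored[0]["category"] = "announcements"

puts restored.to_yaml
```

File.write("posts.yml", yaml_text) and YAML.load_file("posts.yml") would be the file-based equivalents; the file name here is only an example.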
If you want to use a spreadsheet, CSV is the way to go. You can use the stdlib csv API like this:
require 'csv'
my2DArray = [[1, 2], ["foo", "bar"]]
# CSV::Writer only existed in Ruby 1.8; CSV.open is the equivalent in later versions.
CSV.open('data.csv', 'w') do |csv|
  my2DArray.each do |row|
    csv << row
  end
end
You can then open the resulting file in Calc or in most statistics applications.
The same API can be used to re-import the result in ruby if you need.
You could serialize it to JSON and save it to a file. This would allow you to edit it using a simple text editor.
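A short sketch of the JSON variant, using Ruby's json library and made-up sample records. JSON.pretty_generate is used here, rather than to_json, so that the output has line breaks and is easier to edit by hand.

```ruby
require 'json'

# Made-up sample records standing in for the scraped forum data.
posts = [
  { "author" => "alice", "title" => "First post" },
  { "author" => "bob",   "title" => "Second post" }
]

# Indented, multi-line JSON text suitable for a text editor.
json_text = JSON.pretty_generate(posts)

# Parsing gives back arrays and string-keyed hashes.
restored = JSON.parse(json_text)

puts json_text
```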
If you want to edit it in something like Calc, you could consider generating a CSV (comma-separated values) file and importing it.
If I understand correctly, you have a two-dimensional array. You could output it in CSV format like so:
array.each do |row|
  puts row.join(",")
end
Then you import it with Calc to edit it or just use a text editor.
If your data might contain commas, you should have a look at the csv module instead:
http://ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
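To illustrate the comma problem, here is a small sketch comparing a naive join with the csv library. The sample rows are made up; CSV.generate and CSV.parse work on strings, which keeps the example free of file handling.

```ruby
require 'csv'

# Made-up rows; the second title contains a comma.
rows = [
  ["author", "post title", "date"],
  ["alice", "Hello, world", "2010-01-01"]
]

# Naive approach: the embedded comma splits the title into two fields.
naive_line = rows[1].join(",")
naive_fields = naive_line.split(",")

# The csv library quotes fields that contain commas.
csv_text = CSV.generate do |csv|
  rows.each { |row| csv << row }
end

# Parsing restores the original rows (all values come back as strings).
restored = CSV.parse(csv_text)

puts csv_text
```

Here naive_fields ends up with four entries instead of three, while restored matches the original rows.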

Resources