YAML generated by Ruby appears invalid (!omap vs !!omap) - ruby

I am trying to use PyYAML to parse some YAML generated by Ruby code (https://github.com/devrandom/gitian-builder/blob/81bf5d70252363a95cb75eea70f8d1d129948013/bin/gbuild#L322). When PyYAML had trouble parsing it, I tried an online validator (http://yaml-online-parser.appspot.com/), which failed with the following error:
ERROR:
could not determine a constructor for the tag '!omap'
in "<unicode string>", line 1, column 5:
--- !omap
^
I see on the YAML website (can't post more than two links yet) that !!omap appears to be correct, not !omap. So why does Ruby output !omap when YAML::Omap is used?
I can't find anything online to explain this behavior.
If you want to see an example of the YAML I am trying to parse, search for the gitian.sigs repo on GitHub under the bitcoin account and look at any of the .assert files there (again, I can't post more than two links).

It turned out to be a legacy thing from Syck. While it should be !!omap, I was told I should be able to tell my parser that !omap is the same as !!omap.
See: https://github.com/tenderlove/psych/issues/241
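For the PyYAML side, telling the parser that !omap means the same as !!omap could look roughly like the sketch below. It assumes PyYAML is installed; OmapLoader and load_with_omap are names made up for this example, and the sample document in the usage note is invented rather than taken from a real .assert file. It reuses the handler PyYAML already has for the standard !!omap tag, so the result is a list of (key, value) tuples.

```python
import yaml  # PyYAML


class OmapLoader(yaml.SafeLoader):
    """SafeLoader subclass (hypothetical name) that also accepts the local tag '!omap'."""


# Register PyYAML's existing !!omap constructor under the legacy '!omap' tag.
# add_constructor on the subclass leaves the stock SafeLoader untouched.
OmapLoader.add_constructor("!omap", yaml.SafeLoader.construct_yaml_omap)


def load_with_omap(text):
    """Parse a YAML document whose ordered maps may be tagged '!omap'."""
    return yaml.load(text, Loader=OmapLoader)
```

For example, load_with_omap("--- !omap\n- name: bitcoin\n- release: 1\n") should return the pairs in document order.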

Related

Validating YAML file in PhpStorm

I'm working on a project where YAMLs are used (among other use cases) for storing synonym lists. A file may look a little like this:
- "streifen,gestreift"
- "fleeceoverall,fleeceanzug"- "federball,badminton"
- "hochgarage,parkgarage"
In this case - "federball,badminton" is on the same row as - "fleeceoverall,fleeceanzug" which causes the build of the application to fail with an error stating
Unexpected characters near "- "federball,badminton".
I tried to configure a code style profile for code inspections as mentioned here in the PhpStorm documentation:
https://www.jetbrains.com/help/phpstorm/customizing-profiles.html?keymap=secondary_default_for_macos
But I don't know what to adjust here. I'm using the IDE default profile, which looks like this for wrapping and braces (which I guess is what I'm looking for ;) :
I also took a look at validating my YAML against a JSON file as mentioned here: https://www.jetbrains.com/help/phpstorm/yaml.html# but ultimately I don't understand how this works :/
So I guess I'm a little lost on how to avoid the errors at build time beforehand and would love some advice!

Convert non-standard yml localization files to php array or other format usable for web translation

I tried to translate Stellaris localization files online with Transifex, but it doesn't import them correctly because they don't follow the YAML localization format it expects:
http://docs.transifex.com/formats/yaml/
Here for example there is one file:
http://pastebin.com/abKLLSpX
I tried to convert it to a PHP array or another format usable on Transifex with some online (and offline) tools and scripts, but I didn't find anything that converts it without an error. For example, Symfony gave me this error:
PHP Fatal error: Uncaught exception 'Symfony\Component\Yaml\Exception\ParseException' with message 'Unable to parse at line 8 (near "DERELICT_SHIP_PROJECT:0 "Derelict Ship"").' in /usr/share/php/Symfony/Component/Yaml/Parser.php:246
Can someone give me advice on how to convert it correctly to a format usable in Transifex, please?
Thanks for any reply.
The file you linked contains lines like
DERELICT_SHIP_PROJECT:0 "Derelict Ship"
You probably want it to be a key-value pair:
DERELICT_SHIP_PROJECT: "Derelict Ship"
I don't know what the 0 is for and how it got there, but if you delete it from every line, you'll have a proper YAML source.
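Deleting the digits by hand gets tedious for a large file, so here is a minimal sketch of doing it with a regular expression. It assumes every affected line has the shape KEY:<digits> "text", as in the line quoted above; normalize_line is a hypothetical helper name, and any line that does not match that shape is left untouched.

```python
import re

# KEY, a colon, one or more digits, whitespace, then the opening quote of the value.
VERSIONED_KEY = re.compile(r'^(\s*[\w.\-]+):\d+\s+(")')


def normalize_line(line):
    """Turn 'KEY:0 "text"' into 'KEY: "text"'; return other lines unchanged."""
    return VERSIONED_KEY.sub(r'\1: \2', line)


def normalize_text(text):
    """Apply normalize_line to every line of a localization file's contents."""
    return "\n".join(normalize_line(line) for line in text.split("\n"))
```

Run normalize_text over the file contents and write the result to a new file, so the original stays available for comparison.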

How to debug Octopress markdown source files?

I use Octopress for blogging. Generally it works well, except in one situation -- after typing rake generate, I get depressing output that says something like:
psych.rb:203:in `parse': (<unknown>): mapping values are not allowed in this context at line 3 column 6 (Psych::SyntaxError)
I can't remember how many times I've encountered this situation. Every time, I google the keywords above but find nothing helpful.
What I can do is to exclude all the source files (*.mkd) from _posts and add them back one by one to check which one goes wrong. I keep checking, and finally it turns out that a minor syntax mistake makes Octopress angry.
Life should NOT be that hard. So is it possible to debug an Octopress source file to show which line of the file has the syntax error? The output from rake generate doesn't make sense at all.
The reason could be invalid YAML in the top part of the post (the front matter), e.g. a ':' in the title; see https://github.com/jekyll/jekyll/issues/549 for more info.
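Rather than removing posts one by one, a small script can flag the likely culprit. The following is only a heuristic sketch, not a YAML parser: it assumes the front matter sits between two --- lines at the top of the file, and it reports unquoted values that contain ': ' or end with ':'. suspect_front_matter_lines is a name invented for this example.

```python
import re

# A top-level "key: value" line inside the front matter.
KEY_VALUE = re.compile(r"^([A-Za-z_][\w-]*):\s+(.*)$")


def suspect_front_matter_lines(text):
    """Return (line_number, line) pairs whose unquoted value looks like a nested mapping."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # the file has no front matter block
    suspects = []
    for number, line in enumerate(lines[1:], start=2):
        if line.strip() == "---":
            break  # reached the end of the front matter
        match = KEY_VALUE.match(line)
        if not match:
            continue
        value = match.group(2)
        if value[:1] in ('"', "'", "[", "{", "|", ">"):
            continue  # quoted or structured values may contain ': ' legitimately
        if ": " in value or value.endswith(":"):
            suspects.append((number, line))
    return suspects
```

Calling it on the contents of each file in _posts and printing the file name with the returned line numbers narrows the search to a line or two; quoting the flagged value is the usual fix.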
I've seen a similar error ("mapping values are not allowed in this context") when I try to convert markdown files, using Pandoc. Perhaps your error message is coming from pandoc somehow?
Don't bother debugging Octopress. Please migrate to Pelican -- a Python-powered static site generator. It is full-featured, easy to use, and no doubt generates useful debug information.

How to export scrubyt extractor?

I've written a scrubyt extractor based on the 'learning' technique - that is, specifying the current text on the page and getting it to work out the XPath expressions itself. However, I now want to export the extractor so that it can be used even when the page has changed.
The documentation for scrubyt seems to be all over the place now, but from what I can find I should be able to put the line extractor.export(__FILE__) and it should work. It doesn't - I just get an error saying that there is the wrong number of arguments for export, it should have 0. I've tried it without any arguments and it still fails.
I would ask on the scrubyt forum, but it seems like no-one's been there for ages!
Any ideas what to do here?
Just had the same problem and tried "puts google_data.export()" (trying to get some stuff from google)
This gave me the following:
=== Extractor tree ===
export() is not working at the moment, due to the removal or
ParseTree, ruby2ruby and RubyInline.
For now, in case you are using examples, you can replace them by hand
based on the output below.
So if your pattern in the learning extractor looks like
book "Ruby Cookbook"
and you see the following below:
[book] /table[1]/tr/td[2]
then replace "Ruby Cookbook" with "/table[1]/tr/td[2]" (and all the
other XPaths) and you are ready!
[link] /body/div/div/div/div/div/ol/li/h3/a
which gave me the xpath I was looking for
scrubyt version is 0.4.06

finding json errors with ruby

I am using the Ruby gem json_pure, and when I get parsing errors I am not able to determine the line number where the error is occurring. I was expecting to find a validator written in Ruby that would tell me the line number. What is the best Ruby approach to finding JSON errors quickly?
thanks!
You could try Kwalify: http://www.kuwata-lab.com/kwalify/ruby/users-guide.html
It's not just a JSON/YAML validator, but you give it a schema and it validates against that.
You could use that to verify that your config file is correct (as per your definition of "correct") as well as correct JSON (or YAML), and it will tell you what line number the error happened on, and a bit of context for the error as well.
I removed a ']' from a sample in their documentation and the error message it gave me was
ERROR: document12a.json:10:1 [/favorite] flow sequence is not closed by ']'.
It also does data binding class generation if you want. It seems like a pretty good configuration management/validation tool.
You might find it easier to use http://www.jsonlint.com/ to check the JSON is valid - it will highlight any problematic lines.
It's an old question, but it's still an issue in current Ruby.
The default JSON parser in Ruby does not tell you the line number or column number where the parse error occurred. Just a 'parse' error and that's it.
You could change the parser to Oj and use it with the MultiJson gem.
Oj is great: it's very fast, and it will show the line and even the column number where the parsing error occurred.
Here is a sample error for a very big one-line JSON document (170KB+), where Oj reports column 102421:
.../adapters/oj.rb:15:in `load': unexpected character at line 1, column 102421 [parse.c:666] (MultiJson::ParseError)
JSON is supposed to be sent over the network as a string, so there's actually only one line.
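When the whole document is one line, the column (or character offset) is what matters, and a bit of text around it is more useful than the line number. As an illustration outside Ruby, and not a replacement for the Oj approach above, Python's standard json module exposes the line, column and offset on its JSONDecodeError; describe_json_error is a hypothetical helper name.

```python
import json


def describe_json_error(text, context=20):
    """Return None if text is valid JSON, else a message with position and nearby text."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the 0-based character offset of the failure.
        start = max(err.pos - context, 0)
        snippet = text[start:err.pos + context]
        return f"{err.msg} at line {err.lineno}, column {err.colno}: near {snippet!r}"
    return None
```

The snippet makes a column such as 102421 actionable without opening the file in an editor.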
