Ruby feed parsing: "Input is not proper UTF-8, indicate encoding!" - ruby

I am trying to parse RSS feeds using Feedzirra.
Some of them are ok, but others return the error:
Error while parsing. Input is not proper UTF-8, indicate encoding !
How do I fix it?

This does not seem to be a Feedzirra issue, IMO.
Your libxml or Nokogiri dependencies may not be up to date. Update these gems and try again.
As mentioned here, encoding detection is not 100% accurate.
If you'd like to ignore the feeds that give you errors, Feedzirra has callback functions:
Another feature present in Feedzirra is the ability to create callback
functions that get called “on success” and “on failure” when getting a
feed. This makes it easy to do things like log errors or update data
stores.
Also, please give us more context: which code raises the error, and which feed are you trying to parse?
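If the feed bytes really are in another encoding, one option is to normalise the body to UTF-8 before handing it to the parser. This is a minimal sketch using only Ruby's built-in String encoding methods. `to_utf8` is a hypothetical helper, and the ISO-8859-1 fallback is an assumption; the real encoding should come from the HTTP headers or the XML declaration when they are available.

```ruby
# Hypothetical helper: returns a UTF-8 copy of the raw feed body.
# Assumption: bytes that are not valid UTF-8 are treated as ISO-8859-1.
def to_utf8(raw, fallback: Encoding::ISO_8859_1)
  text = raw.dup.force_encoding(Encoding::UTF_8)
  # Already valid UTF-8: nothing to convert.
  return text if text.valid_encoding?

  # Reinterpret the bytes in the fallback encoding and transcode,
  # replacing anything that cannot be converted.
  raw.dup.force_encoding(fallback)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

latin1_body = "caf\xE9".b # "café" encoded as ISO-8859-1
puts to_utf8(latin1_body)
```

The converted string can then be passed to whatever parses the feed, so the parser never sees invalid byte sequences.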

Related

valid url is causing rest-client to throw bad URI (trying to work with an awful API)

The following is a valid URL in the browser (I know, browser is less strict).
https://api.siteconfidence.co.uk/current/30294ie0sdafhwe89rh5/Return/[Account[AccountId,Pages[Page],ServiceStatus[HighestStatusCode]]]/AccountId/123123jjh/Id/213123123/StartDate/2015-08-12/StartTime/12:00:00/EndDate/2015-08-07/EndTime/07:05:05/StatusCode/1
However, when I give it to rest-client it just throws a `bad URI(is not URI?)` error.
So, none of the things I've tried after having read the answers from this question work. They all give various errors, like can't gsub addressable, undefined method match, etc.
I'm not sure what to do next.
I know it's not a real URI, but that's how their API is (rubbish, I know, but I can't change it).
Also, I can't put the [] bit in quotes; their API just ignores it.
EDIT:
I replaced the placeholders with fake values.
Your browser is URL-encoding it automatically. You have characters in there that are invalid in a URL. You need to encode it before you try making a request with it.
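As a rough illustration of encoding before the request, the square brackets can be percent-encoded by hand so that `URI.parse` accepts the string. `encode_brackets` is a hypothetical helper, the URL below is a shortened stand-in for the one in the question, and whether the API accepts `%5B`/`%5D` in place of literal brackets is an assumption.

```ruby
require 'uri'

# Hypothetical helper: percent-encodes "[" and "]", which are not
# allowed unescaped in a URI path.
def encode_brackets(url)
  url.gsub(/[\[\]]/) { |char| format('%%%02X', char.ord) }
end

# Shortened stand-in for the URL in the question (KEY is a placeholder).
raw = 'https://api.siteconfidence.co.uk/current/KEY/Return/[Account[AccountId]]/AccountId/123'
encoded = encode_brackets(raw)

uri = URI.parse(encoded) # parses without raising once the brackets are encoded
puts uri
```

The encoded string is what would be handed to the HTTP client instead of the raw one.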
In the end, I solved the problem by doing something else.
I used the Curb gem to make the request and then used the JSON gem to parse the output (after ensuring the return format is JSON).

How to make Compass/Sass compilation command generate parsable output

I’d like to automate the compilation of Compass projects and be able to get output that I can parse so I can take only what I need (the errors) and further format them how I want.
The issue is that Compass output is not in a format that can be easily parsed (it has error messages on multiple lines).
Is there any reliable way to parse this output? Or… any idea what would need to be changed and where in Compass’s code to allow a new param that would allow you to specify the output format (e.g. JSON, XML)?
I’m asking this because I don’t know Ruby, so I would need a starting point. Their current code is not easy to understand (due to the fact that I don’t know Ruby), but if I at least have a starting point I would try to see what I can do and hopefully create a pull request with this if I get it working.
I think there is another way to solve this problem: what do you think about parsing the output CSS instead of touching Compass?
There is a good framework for writing CSS postprocessors:
https://github.com/postcss/postcss
You can do whatever you want with the output CSS, then print a message to the console, send an email, and so on.

make error latex

make -C doc html latexpdf
yields this:
Package hyperref Message: Driver (autodetected): hpdftex.
(/usr/share/texlive/texmf-dist/tex/latex/hyperref/hpdftex.def
(/usr/share/texlive/texmf-dist/tex/latex/oberdiek/rerunfilecheck.sty))
(/usr/share/texlive/texmf-dist/tex/latex/oberdiek/hypcap.sty))
(/usr/share/texlive/texmf-dist/tex/latex/multirow/multirow.sty)
Writing index file Arakoon.idx
(./Arakoon.aux)
Runaway argument?
{{1.10.3}{9}{Client side support\relax }{subsection.1.10.
! File ended while scanning use of \@newl@bel.
<inserted text>
\par
l.113 \begin{document}
?
If someone is still looking for this: For me usually, when a prior build had failed due to an error, subsequent builds would fail with this error. Solved it by deleting the main .aux file and building it again.
It is impossible to say with any confidence, but it looks like a fragile (non-robust) command in a subsubsection heading or a maths label (the label would be 1.10.3), because:
It occurs at \begin{document}, when the .aux files are loaded,
The error indicates a malformed directive in the .aux file, and there is a \relax there, which is typically what a command reduces to after performing its side effect.
Two suggestions:
Generate an MWE by making a new document from this one with all of the body deleted except that heading/equation (and perhaps the sentence following). Does this create the same error? If so, post it here. You might need some trial and error to find out which LaTeX command is responsible, but it should contain the text Client side support.
Do read https://tex.stackexchange.com/questions/4736/what-is-the-difference-between-fragile-and-robust-commands - if I am right, you have a fragile command where it shouldn't be. Figure out what should be there instead.

How to debug Octopress markdown source files?

I use Octopress for blogging. Generally it works well, except on one occasion: after typing rake generate, I got depressing output which says something like:
psych.rb:203:in `parse': (<unknown>): mapping values are not allowed in this context at line 3 column 6 (Psych::SyntaxError)
I can't remember how many times I've encountered this situation. Every time, I google the keywords above but find nothing helpful.
What I can do is move all the source files (*.mkd) out of _posts and add them back one by one to check which one goes wrong. I keep checking, and finally it turns out that a minor syntax mistake makes Octopress angry.
Life should NOT be that hard. So is it possible to debug an Octopress source file to show which line of the file has the syntax error? The output from rake generate doesn't make sense at all.
The reason could be invalid YAML in the front matter at the top of the post (e.g. a ':' in the title); see https://github.com/jekyll/jekyll/issues/549 for more info.
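To find the offending post without removing files one by one, a small script can parse each post's front matter on its own and report which file fails. This is only a sketch: `front_matter`, `front_matter_error` and `check_posts` are hypothetical helpers, the `source/_posts` path is an assumption about the blog layout, and the reported line is relative to the front matter, not to the whole file.

```ruby
require 'yaml'

# Hypothetical helper: returns the text between the leading '---' lines,
# or nil when the file has no front matter.
def front_matter(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*$/m)
  match && match[1]
end

# Hypothetical helper: returns a description of the YAML syntax error in
# the front matter, or nil when it parses cleanly (or is absent).
def front_matter_error(text)
  yaml = front_matter(text)
  return nil if yaml.nil?

  YAML.parse(yaml)
  nil
rescue Psych::SyntaxError => e
  "front matter line #{e.line}, column #{e.column}: #{e.problem}"
end

# Prints one line per post whose front matter fails to parse.
def check_posts(dir)
  Dir.glob(File.join(dir, '*')).sort.each do |path|
    next unless File.file?(path)

    error = front_matter_error(File.read(path))
    puts "#{path}: #{error}" if error
  end
end

check_posts('source/_posts') # assumed location of the posts
```

Run from the blog's root directory, this names the file to fix, for example a post whose title contains an unquoted ':'.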
I've seen a similar error ("mapping values are not allowed in this context") when I try to convert markdown files, using Pandoc. Perhaps your error message is coming from pandoc somehow?
Don't bother debugging Octopress. Please migrate to Pelican, a Python-powered static site generator. It is full-featured, easy to use, and no doubt generates useful debug information.

finding json errors with ruby

I am using the Ruby gem json_pure, and when I get parsing errors I am not able to determine the line number where the error occurs. I was expecting to find a validator written in Ruby that would tell me the line number. What is the best Ruby approach to finding JSON errors quickly?
Thanks!
You could try Kwalify: http://www.kuwata-lab.com/kwalify/ruby/users-guide.html
It's not just a JSON/YAML validator, but you give it a schema and it validates against that.
You could use that to verify that your config file is correct (as per your definition of "correct") as well as correct JSON (or YAML), and it will tell you what line number the error happened on, and a bit of context for the error as well.
I removed a ']' from a sample in their documentation and the error message it gave me was
ERROR: document12a.json:10:1 [/favorite] flow sequence is not closed by ']'.
It also does data binding class generation if you want. It seems like a pretty good configuration management/validation tool.
You might find it easier to use http://www.jsonlint.com/ to check the JSON is valid - it will highlight any problematic lines.
It's an old question, but it's still an issue in current Ruby.
The default JSON parser in Ruby does not tell you the line number or column number where the parse error occurred. Just a 'parse' error and that's it.
You could change the parser to Oj and use it with the MultiJson gem.
Oj is great: it's very fast, and it will show the line and even the column number where the parsing error occurred!
Sample error for a very big one-line JSON document (170KB+); Oj reports column 102421.
.../adapters/oj.rb:15:in `load': unexpected character at line 1, column 102421 [parse.c:666] (MultiJson::ParseError)
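If installing another gem is not an option, a different trick (not the Oj approach above) is to run the failing text through Ruby's bundled YAML parser, whose syntax errors carry a line and column. This is a sketch under the assumption that the input is close enough to JSON for YAML's flow syntax to cover it; YAML is more lenient than JSON, so some JSON errors will not be caught this way. `locate_json_error` is a hypothetical helper.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper: returns nil for valid JSON, otherwise a message
# that includes a position when the YAML parser can supply one.
def locate_json_error(text)
  JSON.parse(text)
  nil
rescue JSON::ParserError => json_error
  begin
    # Assumption: the text is also malformed as YAML flow syntax.
    YAML.parse(text)
    "no position available: #{json_error.message}"
  rescue Psych::SyntaxError => yaml_error
    "line #{yaml_error.line}, column #{yaml_error.column}: #{yaml_error.problem}"
  end
end

broken = "{\n  \"favorite\": [1, 2\n}\n" # missing the closing ']'
puts locate_json_error(broken)
```

The position comes from the YAML parser, so it points at where YAML gave up, which is usually at or just after the real mistake.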
JSON is supposed to be sent over the network as a string, so there's actually only one line.

Resources