How to Call/Require a Ruby 1.8 Lib from Ruby 1.9

I'm using kakasi-ruby, a Ruby 1.8 library, but it seems that it can only be compiled against Ruby 1.8 (https://github.com/hogelog/kakasi-ruby/issues/2).
My application runs on Ruby 1.9.3, so I need to call kakasi-ruby from Ruby 1.9.3.
How should I do this?
Do I have to open a subprocess with Ruby 1.8 and wait for it to finish to get the process's return value?
Edit:
https://github.com/hogelog/kakasi-ruby

Found 3 possible paths:
There seems to be a branch for 1.9 in the repo. Maybe try to compile that instead?
Otherwise your fastest option is probably to go back to 1.8 depending on what kind of app it is.
Calling out to 1.8 may work, but since the library seems to be a binding to some C code, you could probably call that code directly just as well.
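If you do go the subprocess route, a minimal sketch could look like the following. It assumes a Ruby 1.8 interpreter with kakasi-ruby installed is available at a path you supply; run_in_other_ruby and KAKASI_SCRIPT are hypothetical names for illustration, not part of kakasi-ruby.

```ruby
require 'open3'

# Script executed by the *other* interpreter. It reads raw bytes from stdin,
# passes them to Kakasi and writes the result to stdout.
# Assumption: that interpreter has the kakasi extension installed.
KAKASI_SCRIPT = <<-'RUBY'
  require 'kakasi'
  print Kakasi.kakasi('-w', STDIN.read)
RUBY

# Runs `script` under the interpreter at `interpreter`, feeding `input` on
# stdin, and returns whatever the child wrote to stdout.
# Raises if the child exits with a non-zero status.
def run_in_other_ruby(interpreter, script, input)
  stdout, stderr, status = Open3.capture3(interpreter, '-e', script, :stdin_data => input)
  raise "child ruby failed: #{stderr}" unless status.success?
  stdout
end

# Usage (the interpreter path is a placeholder you must fill in):
#   euc = text.encode("EUC-JP", "UTF-8")
#   out = run_in_other_ruby('/path/to/ruby-1.8', KAKASI_SCRIPT, euc)
```

Open3.capture3 waits for the child to finish, so the caller gets the output and exit status in one step. Starting an interpreter per call is slow, so this is probably only reasonable for batch work.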

BTW, here is the usage in Ruby 1.9
plee@sos:~/Japanese$ irb
1.9.3p194 :001 > require 'kakasi'
=> true
1.9.3p194 :002 > src="前原誠司経済財政相は4日、朝日新聞などのインタビューに対し"
=> "前原誠司経済財政相は4日、朝日新聞などのインタビューに対し"
1.9.3p194 :003 > src=src.encode("EUC-JP", "UTF-8")
=> "\x{C1B0}\x{B8B6}\x{C0BF}\x{BBCA}\x{B7D0}\x{BAD1}\x{BAE2}\x{C0AF}\x{C1EA}\x{A4CF}\x{A3B4}\x{C6FC}\x{A1A2}\x{C4AB}\x{C6FC}\x{BFB7}\x{CAB9}\x{A4CA}\x{A4C9}\x{A4CE}\x{A5A4}\x{A5F3}\x{A5BF}\x{A5D3}\x{A5E5}\x{A1BC}\x{A4CB}\x{C2D0}\x{A4B7}"
1.9.3p194 :004 > dst=Kakasi.kakasi("-w", src)
=> "\xC1\xB0\xB8\xB6 \xC0\xBF\xBB\xCA \xB7\xD0\xBA\xD1 \xBA\xE2\xC0\xAF \xC1\xEA \xA4\xCF \xA3\xB4 \xC6\xFC \xA1\xA2 \xC4\xAB\xC6\xFC\xBF\xB7\xCA\xB9 \xA4\xCA\xA4\xC9\xA4\xCE \xA5\xA4\xA5\xF3\xA5\xBF\xA5\xD3\xA5\xE5\xA1\xBC \xA4\xCB \xC2\xD0\xA4\xB7"
1.9.3p194 :005 > dst.force_encoding("EUC-JP")
=> "\x{C1B0}\x{B8B6} \x{C0BF}\x{BBCA} \x{B7D0}\x{BAD1} \x{BAE2}\x{C0AF} \x{C1EA} \x{A4CF} \x{A3B4} \x{C6FC} \x{A1A2} \x{C4AB}\x{C6FC}\x{BFB7}\x{CAB9} \x{A4CA}\x{A4C9}\x{A4CE} \x{A5A4}\x{A5F3}\x{A5BF}\x{A5D3}\x{A5E5}\x{A1BC} \x{A4CB} \x{C2D0}\x{A4B7}"
1.9.3p194 :006 > dst=dst.encode("UTF-8", "EUC-JP")
=> "前原 誠司 経済 財政 相 は 4 日 、 朝日新聞 などの インタビュー に 対し"
1.9.3p194 :007 >
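The round trip in the session above (UTF-8 to EUC-JP, call the extension, tag the raw result as EUC-JP, convert back) can be wrapped in a small helper. This is only a sketch: via_euc_jp is a hypothetical name, and the block stands in for the Kakasi.kakasi call so the helper itself does not depend on the extension.

```ruby
# Converts a UTF-8 string to EUC-JP, yields it to the block, and converts the
# block's result back to UTF-8. The block is expected to return EUC-JP bytes,
# possibly without an encoding tag, as Kakasi.kakasi does in the session above.
def via_euc_jp(utf8_text)
  euc = utf8_text.encode("EUC-JP", "UTF-8")
  result = yield(euc)
  # Label the returned bytes as EUC-JP before transcoding them to UTF-8.
  result.dup.force_encoding("EUC-JP").encode("UTF-8")
end

# Usage with the extension loaded:
#   dst = via_euc_jp(src) { |euc| Kakasi.kakasi("-w", euc) }
```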

Related

Loading help documentation alters irb results

I'm learning Ruby, and went into irb to test something out with the Date class. In short, I did the following:
$ irb
irb(main):001:0> Date.new
=> #<Date:0x007f983103ee60>
irb(main):002:0> Date.constants
=> []
irb(main):003:0> help Date
=> nil
irb(main):004:0> Date.constants
=> [:MONTHNAMES, :ABBR_MONTHNAMES, :DAYNAMES, :ABBR_DAYNAMES, :ITALY, :ENGLAND, :JULIAN, :GREGORIAN, :Infinity]
irb(main):005:0>
I'm so confused by this. Questions:
Why would reading help documentation cause the output of Date.constants to change?
Presumably help is loading/initializing something. What is it? And why did Date.new work?
Is whatever this is something I need to worry about when writing .rb files?
This is tough to Google for. I'm on ruby 2.1.2 and irb 0.9.6.
The most likely cause of the addition of the Date constants after running help Date is that somewhere in the execution of the command, require 'date' (or require 'time') is called:
2.1.0 :001 > Date.constants
=> []
2.1.0 :002 > require 'date'
=> true
2.1.0 :003 > Date.constants
=> [:MONTHNAMES, :ABBR_MONTHNAMES, :DAYNAMES, :ABBR_DAYNAMES, :ITALY, :ENGLAND, :JULIAN, :GREGORIAN, :Infinity]
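As for Date.new, it works because a bare Date class is already defined before the date library is loaded (note that the result prints as a plain #<Date:0x...> object), so it gets the default constructor that every class has. The full implementation, including the constants, only arrives with require 'date'.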
I doubt this will ever be problematic for you.
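In a .rb file, the safe habit is to require the library explicitly rather than rely on something else having loaded it. A minimal sketch:

```ruby
require 'date'  # loads the full Date implementation from the standard library

# After the require, the constants and the real constructors are available.
today = Date.today
puts Date::MONTHNAMES[today.month]
puts Date.constants.include?(:MONTHNAMES)
```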

Mistaken call to a hash table results in strange ruby irb prompt

I'm running the Ruby irb on a DOS environment.
I've defined a dictionary.
irb(main):001:0> stuff = {'name'=> 'Zed', 'age'=>36, 'height'=>6*12+2}
I've made a mistake in calling it
irb(main):004:0> puts stuff['age]
The Ruby prompt changes to an apostrophe ' instead of the usual >:
irb(main):006:1'
irb(main):007:1'
IRB doesn't work anymore.
What has happened here and how do I get the shell to function again without quitting the program?
irb is waiting for the closing ' that you left out in puts stuff['age]. Press Ctrl+C to abandon the unfinished input and get the normal prompt back.
See below:
2.0.0p0 :001 > stuff = {'name'=> 'Zed', 'age'=>36, 'height'=>6*12+2}
=> {"name"=>"Zed", "age"=>36, "height"=>74}
2.0.0p0 :002 > puts stuff['age]
2.0.0p0 :003'> ^C
2.0.0p0 :003 >

How do I avoid pretty-printing HTML in Nokogiri while using to_html?

I am using Nokogiri with Ruby on Rails v2.3.8.
Is there a way in which I can avoid pretty-printing in Nokogiri while using to_html?
I read that to_xml allows this to be done using to_xml(:indent => 0), but this doesn't work with to_html.
Right now I am using gsub to strip away new-line characters. Does Nokogiri provide any option to do it?
I solved this using .to_html(save_with: 0):
2.1.0 :001 > require 'nokogiri'
=> true
2.1.0 :002 > doc = Nokogiri::HTML.fragment('<ul><li><span>hello</span> boom!</li></ul>')
=> #<Nokogiri::HTML::DocumentFragment:0x4e4cbd2 name="#document-fragment" children=[#<Nokogiri::XML::Element:0x4e4c97a name="ul" children=[#<Nokogiri::XML::Element:0x4e4c47a name="li" children=[#<Nokogiri::XML::Element:0x4e4c240 name="span" children=[#<Nokogiri::XML::Text:0x4e4c0a6 "hello">]>, #<Nokogiri::XML::Text:0x4e4c86c " boom!">]>]>]>
2.1.0 :003 > doc.to_html
=> "<ul><li>\n<span>hello</span> boom!</li></ul>"
2.1.0 :004 > doc.to_html(save_with: 0)
=> "<ul><li><span>hello</span> boom!</li></ul>"
tested on: nokogiri (1.6.5) + libxml2 2.7.6.dfsg-1ubuntu1 + ruby 2.1.0p0 (2013-12-25 revision 44422) [i686-linux]
You can use Nokogiri::HTML.fragment() instead of just Nokogiri::HTML(). When you perform to_html it won't add newlines, a DOCTYPE header or make it 'pretty' in any way.

Ruby JSON.parse returning incorrect data for unicode

I'm trying to parse some JSON containing escaped unicode characters using JSON.parse. But on one machine, using json/ext, it gives back incorrect values. For example, \u2030 should return E2 80 B0 in UTF-8, but instead I'm getting 01 00 00. It fails with either the escaped "\\u2030" or the unescaped "\u2030".
1.9.2p180 :001 > require 'json/ext'
=> true
1.9.2p180 :002 > s = JSON.parse '{"f":"\\u2030"}'
=> {"f"=>"\u0001\u0000\u0000"}
1.9.2p180 :003 > s["f"].encoding
=> #<Encoding:UTF-8>
1.9.2p180 :004 > s["f"].valid_encoding?
=> true
1.9.2p180 :005 > s["f"].bytes.map do |x| x; end
=> [1, 0, 0]
It works on my other machine with the same version of ruby and similar environment variables. The Gemfile.lock on both machines is identical, including json (= 1.6.3). It does work with json/pure on both machines.
1.9.2p180 :001 > require 'json/pure'
=> true
1.9.2p180 :002 > s = JSON.parse '{"f":"\\u2030"}'
=> {"f"=>"‰"}
1.9.2p180 :003 > s["f"].encoding
=> #<Encoding:UTF-8>
1.9.2p180 :004 > s["f"].valid_encoding?
=> true
1.9.2p180 :005 > s["f"].bytes.map do |x| x; end
=> [226, 128, 176]
So is there something else in my environment or setup that could be causing it to parse incorrectly?
I recently ran into this same problem and tracked it down to a Ruby bug caused by the way a buffer is declared in Ruby 1.9.2 and how GCC optimizes it. It has since been fixed upstream.
You can recompile Ruby with -O0 or use a newer version of Ruby (1.9.3 or better) to fix it.
Alternatively, try upgrading your json gem to at least 1.6.6, or to the newest release, 1.7.1.
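To tell quickly whether a given machine is affected, you can compare the parsed bytes against the expected UTF-8 sequence from the question. This is a small sketch; json_unicode_escape_ok? is a hypothetical helper name.

```ruby
require 'json'

# Returns true when the loaded JSON parser decodes the \u2030 escape into the
# expected three UTF-8 bytes (E2 80 B0), false otherwise.
def json_unicode_escape_ok?
  parsed = JSON.parse('{"f":"\\u2030"}')
  parsed["f"].bytes.to_a == [0xE2, 0x80, 0xB0]
end

warn "JSON parser mis-decodes \\u escapes on this machine" unless json_unicode_escape_ok?
```

Running this once at application start, or in a test, makes the problem visible instead of letting corrupted strings reach your data.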

JSON with JRuby - Not parsing the result in UTF-8

I am using the JSON implementation for Ruby in my Rails project to parse a JSON string sent via Ajax. Although the JSON string is in UTF-8, the parsed result comes out as ASCII-8BIT by default. See below:
jruby-1.6.7 :068 > json_text = '["に到着を待っている"]'
=> "[\"に到着を待っている\"]"
jruby-1.6.7 :069 > json_text.encoding
=> #<Encoding:UTF-8>
jruby-1.6.7 :070 > json_parsed = JSON.parse(json_text)
=> ["\u00E3\u0081\u00AB\u00E5\u0088\u00B0\u00E7\u009D\u0080\u00E3\u0082\u0092\u00E5\u00BE\u0085\u00E3\u0081\u00A3\u00E3\u0081\u00A6\u00E3\u0081\u0084\u00E3\u0082\u008B"]
jruby-1.6.7 :071 > json_parsed.first.encoding
=> #<Encoding:ASCII-8BIT>
I don't want the result to be escaped; I would like it in UTF-8. Is there a way to set that? I checked the documentation of the JSON project and found no encoding option for JSON.parse. Maybe I missed something. How can I do that?
UPDATE:
As @fl00r pointed out, this example works fine in MRI, but not in JRuby.
This looks like a bug, as this actually works when using the pure version:
jruby-1.6-head :001 > require 'json/pure'
=> true
jruby-1.6-head :002 > json_text = '["に到着を待っている"]'
=> "[\"に到着を待っている\"]"
jruby-1.6-head :003 > json_parsed = JSON.parse(json_text)
=> ["に到着を待っている"]
jruby-1.6-head :004 > json_parsed.first.encoding
=> #<Encoding:UTF-8>
jruby-1.6-head :005 >
Edit: Just saw you opened a ticket for this...
Edit 2: This actually seems to have been fixed upstream already. To install the latest code from the json repository:
$ git clone https://github.com/flori/json.git
$ cd json
$ rake jruby_gem
$ jruby -S gem install pkg/json-1.6.6-java.gem
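Until a fixed gem is installed, one possible stopgap is to relabel the parsed strings. This sketch rests on an assumption the thread does not confirm: that the buggy parser returns intact UTF-8 bytes and only the encoding tag is wrong. retag_utf8 is a hypothetical helper name.

```ruby
# Walks a parsed JSON structure and relabels ASCII-8BIT strings as UTF-8 when
# their bytes form valid UTF-8. Strings that do not validate are left alone.
def retag_utf8(obj)
  case obj
  when String
    return obj unless obj.encoding == Encoding::ASCII_8BIT
    candidate = obj.dup.force_encoding(Encoding::UTF_8)
    candidate.valid_encoding? ? candidate : obj
  when Array
    obj.map { |value| retag_utf8(value) }
  when Hash
    obj.each_with_object({}) { |(key, value), out| out[retag_utf8(key)] = retag_utf8(value) }
  else
    obj
  end
end

# Usage:
#   json_parsed = retag_utf8(JSON.parse(json_text))
```

force_encoding only changes the label and leaves the bytes untouched, which is why the validity check matters: if the assumption is wrong, the original string is returned unchanged.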
