How to write raw unicode ruby string unprocessed - ruby

When I run this in the Rails console, Ruby automatically changes how my string is displayed:
2.7.1 :124 > "\u001b!\u0010\u001bE\u0001"
=> "\e!\u0010\eE\u0001"
(the "\u001b" escape is shown as "\e")
How can I keep the \u001b form in the output?
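A minimal sketch of what seems to be happening, assuming only the displayed form matters: \e and \u001b are two spellings of the same ESC character (code point 27), so the string itself is unchanged and only inspect picks the shorter escape. The variable names and the format-based rendering are illustrative and not part of the question.

```ruby
# "\e" and "\u001b" are two escape spellings of the same character (ESC, code point 27).
escaped = "\u001b!\u0010\u001bE\u0001"
same = (escaped == "\e!\u0010\eE\u0001") # true: identical contents
bytes = escaped.bytes                    # [27, 33, 16, 27, 69, 1]

# Single quotes do not process \u escapes, so this is a 6-character literal
# (backslash, u, 0, 0, 1, b), not the ESC character.
literal = '\u001b'

# Illustrative rendering (not a built-in): show every character in \uXXXX form.
as_unicode_escapes = escaped.each_char.map { |ch| format('\\u%04x', ch.ord) }.join

puts as_unicode_escapes
```

If the bytes are being sent to a device such as a printer, the string can probably be used as-is, since the contents are the same whichever escape the console shows.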

Related

Dir.entries(src).sort.each do |item| in natural order

File names look like 1835 Some text. p1, 1835 Some text. p1035, 1835 Some text. p2
I want the files to be processed in natural order, but Ruby sorts in binary order so p1035 comes before p2. Can this be done in natural order using something like Dir.entries(src).sort.each do |item|?
I would like to process .jpg files and change the creation date according to the p(age) order. For example the files above will have creation dates of 1897-01-01 00:00:**01** -0800, 1897-01-01 00:00:**02** -0800 etc. That is the seconds are incremented so they appear in order in a photo management program and show up in the calendar in that program.
On macOS
Yes, you can use the naturally gem.
$ gem install naturally
Fetching naturally-2.2.1.gem
Successfully installed naturally-2.2.1
Parsing documentation for naturally-2.2.1
Installing ri documentation for naturally-2.2.1
Done installing documentation for naturally after 0 seconds
1 gem installed
$ irb
2.7.4 :001 > require 'naturally'
=> true
2.7.4 :002 > Naturally.sort(['1835 Some text. p1', '1835 Some text. p1035', '1835 Some text. p2'])
=> ["1835 Some text. p1", "1835 Some text. p2", "1835 Some text. p1035"]
2.7.4 :003 > Naturally.sort(["336", "335a", "3356", "335.1"])
=> ["335.1", "335a", "336", "3356"]
Adding to @Schwern's answer:
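require 'naturally'

# List image files relative to src without changing the working directory (base: needs Ruby 2.5+)
files = Dir.glob("*.{jpg,png}", base: src)
files_sorted = Naturally.sort(files)
files_sorted.each do |item|
  fn = File.join(src, item) # File.join adds the path separator if src lacks one
  # process each file
end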
This has a small advantage over Dir.entries(src).each: with this pattern, Dir.glob does not return the . and .. entries.
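For comparison, here is a sketch of the same ordering without the gem, plus the one-second-apart timestamps the question describes. The natural_key helper is hypothetical (it is not part of naturally or the standard library), and the sketch only computes the timestamps; writing them to the files as creation dates is left out.

```ruby
# Hypothetical helper: split a name into digit and non-digit runs so that
# numeric runs compare as integers. Tagging each token ([0, int] or [1, str])
# keeps Integer and String from ever being compared directly.
def natural_key(name)
  name.scan(/\d+|\D+/).map do |token|
    token.match?(/\A\d+\z/) ? [0, token.to_i] : [1, token]
  end
end

names = ['1835 Some text. p1', '1835 Some text. p1035', '1835 Some text. p2']
sorted = names.sort_by { |name| natural_key(name) }

# One timestamp per file, one second apart, starting at 1897-01-01 00:00:01 -08:00.
base = Time.new(1897, 1, 1, 0, 0, 0, '-08:00')
stamps = sorted.each_with_index.map { |name, index| [name, base + index + 1] }

stamps.each { |name, time| puts "#{time} #{name}" }
```

The gem handles more cases (for example the "335.1" and "335a" style names shown earlier), so this sketch is only a fit for simple names like the ones in the question.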

Why is my Ruby (2.2.2) using ASCII-8 instead of UTF-8

The proximate problem I'm having is that in my app, these errors are happening in a JSON#dump call:
Encoding::UndefinedConversionError ("\xEF" from ASCII-8BIT to UTF-8):
I'm trying to understand encoding issues with Ruby better. I read everywhere that as of Ruby 2.0, UTF-8 is the default encoding. Yet I find:
> RUBY_VERSION
=> "2.2.2"
> __ENCODING__
=> #<Encoding:US-ASCII>
I know the RUBYOPT env variable is one solution, but since I have to coordinate this solution between many developers and a few production & staging servers, I'd really like to fill the holes in my understanding first.
I have a file that the "file" utility correctly identifies as UTF-8:
[none] andrew#~/ws$ file sample.txt
sample.txt: UTF-8 Unicode text
But this is not working as expected in ruby.
[none] andrew#~/ws$ irb
> text = File.read("sample.txt")
=> "\xE2\x99\xAA It seems today\nthat all you see \xE2\x99\xAA\n"
> text.encoding
=> #<Encoding:US-ASCII>
Of course there are numerous manual ways to do it:
> Encoding.default_external = Encoding.list[1]
=> #<Encoding:UTF-8>
> text = File.read("sample.txt")
=> "♪ It seems today\nthat all you see ♪\n"
But why might I not be getting utf-8 by default? For now I'll likely do something like the above in an initializer, but I'm really hoping to understand what's going on in ruby. Is there some environment variable or setting I should be looking for that I might have missed ($RUBYOPT is blank in my local environment)?
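A hedged sketch of two per-call alternatives to changing Encoding.default_external globally. Ruby derives default_external from the process locale (the LANG and LC_ALL variables), so a C/POSIX locale is one plausible cause here, though the question does not confirm it. The temporary file stands in for sample.txt, and the variable names are illustrative.

```ruby
require 'tempfile'

# Stand-in for sample.txt: write the raw UTF-8 bytes of a short line starting with a music note.
file = Tempfile.new('sample')
file.binmode
file.write("\xE2\x99\xAA It seems today\n".b)
file.close

# Option 1: state the encoding at read time instead of relying on the default.
text = File.read(file.path, encoding: 'UTF-8')

# Option 2: relabel bytes that were read as binary (ASCII-8BIT); no bytes are changed.
raw = File.binread(file.path)
fixed = raw.dup.force_encoding('UTF-8')

puts text.encoding   # UTF-8
puts raw.encoding    # ASCII-8BIT
puts Encoding.default_external # depends on the locale of the current process

file.unlink
```

Relabeling with force_encoding is only appropriate when the bytes really are UTF-8; valid_encoding? can be used to check before passing the string on to JSON.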

Charlock_Holmes not returning anything in Ruby?

I'm trying to use the gem charlock_holmes (https://github.com/brianmario/charlock_holmes) to detect and correct character formatting errors. However, the program doesn't return anything.
My code is:
require 'charlock_holmes'
contents = File.read('./myfile.csv')
detection = CharlockHolmes::EncodingDetector.detect(contents)
# => {:encoding => 'UTF-8', :confidence => 100, :type => :text}
as specified in the documentation.
When I run this in the directory, I just get nothing at all:
user$ ruby detector.rb
user$
Expected behavior is that it returns the detected encoding (and, if desired, can change it as well). I've got all the gems installed, I think, and I've tried under both 1.9.2 and 2.0.0.
Any ideas what I'm doing wrong or how to find out? I'm afraid I'm new to ruby, but I have tried to do a pretty comprehensive search before asking and have come up blank.
The script computes the result but never prints it; the # => line in the documentation is only a comment showing the return value. Add p detection to detector.rb:
require 'charlock_holmes'
contents = File.read('./myfile.csv')
detection = CharlockHolmes::EncodingDetector.detect(contents)
p detection
Now run it as you ran earlier.

JSON Parser Acts Differently

I am trying to parse the following string called result:
{
"status":0,
"id":"faxxxxx-1",
"hypotheses":[
{"utterance":"skateboard","confidence":0.90466744},
{"utterance":"skate board"},
{"utterance":"skateboarding"},
{"utterance":"skateboards"},
{"utterance":"skate bored"}
]
}
I am using obj = JSON.parse(result) in Ruby 1.8 with the json gem.
The command in question is:
puts "#{obj['hypotheses'][0]}"
My old workstation (whose hard drive died) gave me:
{"utterance" => "skateboard", "confidence" => 0.90466744}
My current workstation gives me:
confidence0.90466744utteranceskateboard
The old workstation was not set up by me, so I don't know which packages were installed on it, whereas I set up the current one myself.
Why is there a difference in the output of the exact same script?
How can I make the current one look like the old one?
I am completely new to this btw.
In Ruby 1.8, Hash#to_s simply joins all of the elements together without spaces, equivalent to to_a.flatten.join('').
In Ruby 1.9, Hash#to_s is an alias to inspect and produces well-formatted output.
To get the equivalent thing in both cases:
puts obj['hypotheses'][0].inspect
The same thing applies to Array.
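A small sketch of the version-independent approach, using a shortened copy of the JSON from the question. The exact text inspect produces varies between Ruby versions, so no expected output is shown.

```ruby
require 'json'

# Shortened copy of the response from the question.
result = '{"status":0,"id":"faxxxxx-1","hypotheses":[' \
         '{"utterance":"skateboard","confidence":0.90466744},' \
         '{"utterance":"skate board"}]}'

obj = JSON.parse(result)
first = obj['hypotheses'][0]

# inspect gives the readable hash form regardless of how to_s behaves.
puts first.inspect

# Reading individual keys avoids depending on either representation.
puts first['utterance']
puts first['confidence']
```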

Ruby error Webrick or CGI?

I am using WEBrick + CGI, and when I instantiate CGI it prints what looks like an error: (offline mode: enter name=value pairs on standard input)
irb(main):001:0> require 'cgi'
=> true
irb(main):002:0> cgi = CGI.new
(offline mode: enter name=value pairs on standard input)
Nope, not an error. That's the way it works.
From the ruby-doc CGI documentation:
If the CGI object is not created in a standard CGI call environment (that is, it can’t locate REQUEST_METHOD in its environment), then it will run in “offline” mode. In this mode, it reads its parameters from the command line or (failing that) from standard input
In the irb console, after the (offline mode: enter name=value pairs on standard input) message, the console waits for you to enter the values. Enter name=value pairs, one per line, then press Ctrl+D to finish entering data.
irb(main):001:0> require 'cgi'
=> true
irb(main):002:0> cgi = CGI.new
(offline mode: enter name=value pairs on standard input)
name=Prakash
number=432
Ctrl+D
=> #<CGI:0x007fa4eb2abd30 @options={:accept_charset=>"UTF-8"}, @accept_charset="UTF-8", @multipart=false, @params={"name"=>["Prakash"], "number"=>["432"]}, @cookies={}, @output_cookies=nil, @output_hidden=nil>
irb(main):003:0>
Refer to CGI Programming Documentation on PLEAC-Ruby for further code examples of working with CGI in ruby.
