I'm currently trying to output the euro symbol through a simple ruby script that I made but I keep getting "?" whenever I try.
I'm currently using puts "\244".
Any thoughts?
btw. I'm using ruby 1.9.2 p180
You need to add a "magic comment" at the top of your script like this:
# encoding: UTF-8
puts "€"
... assuming that you want to use UTF-8 to allow for double-byte characters. You can then use the Euro symbol directly.
You can read more about Ruby 1.9 string encoding and magic comments here:
http://blog.grayproductions.net/articles/ruby_19s_three_default_encodings
Related
I can't do a very simple thing in ruby shoes.
The code is the seguent:
#encoding:UTF-8
Shoes.app do
#var = File.readlines('temp.txt')
para #var
end
If in the txt there is a word with an accented character, the code doesn't work. Shoes says "not a valid UTF8 string"
How can I solve this problem???
edit: the result is the same if I remove the line #encoding:UTF-8
The line:
#encoding:UTF-8
only governs the characters you use for your actual code. And if you are using a recent version of ruby, it does nothing because by default ruby code is assumed to be written with UTF-8 characters.
edit: the result is the same if I remove the line #encoding:UTF-8
Yes, that's because you are probably using a recent version of ruby, and that is the default.
You have to know the encoding of the file you want to read. If the file is encoded in ISO-8859-1, you can do:
#var = File.readlines('temp.txt', encoding: "ISO-8859-1")
What characters are in your file? What do you get when you do this:
$ ruby -e"puts Encoding.default_external"
I am trying to get access to the text in the macOS clipboard from within Automator using a Ruby script. This script calls macOS's internal Ruby (/usr/bin/ruby). After running into much trouble with unidentified character sequence errors, I noticed that Automator's Ruby defaults to ASCII instead of UTF-8, while this is not the default behaviour of modern Ruby since years ago.
So, running the following:
require 'clipboard'
puts(Clipboard.paste.encoding)
always yields "ASCII", while running the same Ruby interpreter from the command line to run the same script and to paste the same pieces of text always yields "UTF-8".
This becomes an issue when I copy multibyte characters like the accented characters (e.g. ê). For instance if I copy the following text:
Bourdieu, P., & Passeron, J.-C. (1970). La reproduction: éléments pour une théorie du système d’enseignement. Ed. de Minuit.
And then run:
require 'clipboard'
puts(Clipboard.paste)
I get nothing in Automator while I get a copy of the original text on the command line.
If I try to transform the text in any way, I get an error. Let's say I run the following:
require 'clipboard'
puts(Clipboard.paste.gsub(/\r/,""))
In response, I will receive:
-e:2:in `gsub': invalid byte sequence in US-ASCII (ArgumentError)
from -e:2:in `<main>'
How can I avoid this and make sure what I get from the clipboard is already converted into proper UTF-8?
I have tried encode and force_encoding methods, as well as a variety of combinations of # encoding: UTF-8, Encoding.default_external='utf-8' and Encoding.default_internal='utf-8', but it seems there are corrupt characters that hinder the conversion, so no success in the end.
Is there anything I am ignoring here, or any combination I haven't tried?
Notes:
It is Automator that calls the interpreter, and not me. So, I can't modify Automator's call to add switches and modify options.
string.encode!('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '') works, but the sanitization comes at the cost of chopping off the multibyte characters, which is obviously not the intended behaviour here.
I found that in macOS Mojave 10.14.6, starting the Automator 'Run Shell Script' with # coding: UTF-8 solved the problem. Not sure the #!/usr/bin/ruby is useful or necessary, but I include it. You can test by using this code with and without the # coding: UTF-8:
#!/usr/bin/ruby
# coding: UTF-8
test_s = "will print ✪"
puts test_s
Credit for the answer is from here: discussions.apple.com
I can't seem to find the right combination of String#encode shenanigans.
I think I'd got confused on this one so I'll post this here to hopefully help anyone else who is similarly confused.
I was trying to do my encoding in an irb session, which gives you
irb(main):002:0> 'I’d'.force_encoding('UTF-8')
=> "I’d"
And if you try using encode instead of force_encoding then you get
irb(main):001:0> 'I’d'.encode('UTF-8')
=> "I’d"
This is with irb set to use an output and input encoding of UTF-8. In my case to convert that string the way I want it involves telling Ruby that the source string is in windows-1252 encoding. You can do this by using the -E argument in which you specify `inputencoding:outputencoding' and then you get this
$ irb -EWindows-1252:UTF-8
irb(main):001:0> 'I’d'
=> "I\xC3\xA2\xE2\x82\xAC\xE2\x84\xA2d"
That looks wrong unless you pipe it out, which gives this
$ ruby -E Windows-1252:UTF-8 -e "puts 'I’d'"
I’d
Hurrah. I'm not sure about why Ruby showed it as "I\xC3\xA2\xE2\x82\xAC\xE2\x84\xA2d" (something to do with the code page of the terminal?) so if anyone can comment with further insight that would be great.
I expect your script is using the encoding cp1251 and you have ruby >= 1.9.
Then you can use force_encoding:
#encoding: cp1251
#works also with encoding: binary
source = 'I’d'
puts source.force_encoding('utf-8') #-> I’d
If my exceptions are wrong: Which encoding do you use and which ruby version?
A little background:
Problems with encoding are difficult to analyse. There may be conflicts between:
Encoding of the source code (That's defined by the editor).
Expected encoding of the source code (that's defined with #encoding on the first line). This is used by ruby.
Encoding of the string (see e.g. section String encodings in http://nuclearsquid.com/writings/ruby-1-9-encodings/ )
Encoding of the output shell
this is my ruby code
require 'json'
a=Array.new
value="¿value"
data=value.gsub('¿','-')
a[0]=data
puts a
puts "json is"
puts jsondata=a.to_json
getting following error
C:\Ruby193>new.rb
C:/Ruby193/New.rb:3: invalid multibyte char (US-ASCII)
C:/Ruby193/New.rb:3: syntax error, unexpected tIDENTIFIER, expecting $end
value="┐value"
^
That's not a JSON problem — Ruby can't decode your source because it contains a multibyte character. By default, Ruby tries to decode files as US-ASCII, but ¿ isn't representable in US-ASCII, so it fails. The solution is to provide a magic comment as described in the documentation. Assuming your source file's encoding is UTF-8, you can tell Ruby that like so:
# encoding: UTF-8
# ...
value = "¿value"
# ...
With an editor or an IDE the soluton of icktoofay (# encoding: UTF-8 - in the first line) is perfect.
In a shell with IRB or PRY it is difficult to find a working configuration. But there is a workaround that at least worked for my encoding problem which was to enter German umlaut characters.
Workaround for PRY:
In PRY I use the edit command to edit the contents of the input buffer
as described in this pry wiki page.
This opens an external editor (you can configure which editor you want). And the editor accepts special characters that can not be entered in PRY directly.
Originally this bug was posted here: https://rails.lighthouseapp.com/projects/8994/tickets/5713-ruby-19-ku-incompatible-with-mem_cache_store
And now, as we've run into the same issue, I'll copy here a question from that issue, hoping someone have an answer already:
When Ruby 1.9 is started in unicode mode (-Ku), mem_cache_store.rb fails to parse:
/usr/local/ruby19/bin/ruby -Ku /usr/local/ruby-1.9.2-p0/lib/ruby/gems/1.9.1/gems/
activesupport-3.0.0/lib/active_support/cache/mem_cache_store.rb
/usr/local/ruby-1.9.2-p0/lib/ruby/gems/1.9.1/gems/activesupport-3.0.0/lib/active_support/
cache/mem_cache_store.rb:32: invalid multibyte escape: /[\x00-\x20%\x7F-\xFF]/
Our case is practically identical: when you set config.action_controller.cache_store to :mem_cache_store, and try to run tests, console, or server, you recieve this in return:
/Users/%username%/.rvm/gems/ruby-1.9.2-p0/gems/activesupport-3.0.1/lib/active_support/
cache/mem_cache_store.rb:32: invalid multibyte escape: /[\x00-\x20%\x7F-\xFF]/
Any ideas how this can be avoided?..
Ruby 1.9 in unicode mode will attempt to interpret the regular expression as unicode. To avoid this you need to pass the regular expression option "n" for "no encoding":
ESCAPE_KEY_CHARS = /[\x00-\x20%\x7F-\xFF]/n
Now we have our raw 8-bit encoding (the only thing Ruby 1.8 speaks) as intended:
ruby-1.9.2-p136 :001 > ESCAPE_KEY_CHARS = /[\x00-\x20%\x7F-\xFF]/n.encoding
=> # <Encoding:ASCII-8BIT>
Hopefully the Rails teams fixes this, for now you have to edit the file.