I'm really confused on this one, and maybe it's a bug in Ruby 2.6.2. I have files that were written as UTF-8 with BOM, so I'm using the following:
filelist = Dir.entries(@input_dirname).join(' ')
filelist = filelist.split(' ').grep(/xml/)
filelist.each do |indfile|
  filecontents_tmp = File.read("#{@input_dirname}/#{indfile}", :encoding => 'bom|utf-8')
  puts filecontents_tmp
end
If I put a debug breakpoint at the puts line, my file is read in properly. If I just run the simple script, I get the following error:
in `read': ASCII incompatible encoding needs binmode (ArgumentError)
I'm confused as to why this would work in debug, but not when run normally. Ideas?
Have you tried printing the default encoding when you run the file as opposed to when you debug the file? There are 3 ways to set / change the encoding in Ruby (that I'm aware of), so I wonder if it's different between running the file and debugging. You should be able to tell by printing the default encoding: puts Encoding.default_external.
As for actually fixing the issue, I ran into a similar problem and found this answer which said to add bin mode as an option to the File.open call and it worked for me.
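A minimal sketch of the binary-mode fix described above, assuming the input starts with a UTF-8 BOM. The helper name read_utf8_bom and the sample file are hypothetical and only serve the illustration; the "rb:BOM|UTF-8" mode string asks Ruby to open the file in binary mode and to drop the BOM if one is present.

```ruby
require "tmpdir"

# Hypothetical helper: read a UTF-8 file, dropping a leading BOM if present.
# The "b" in the mode string requests binary mode, which is the change
# the answer above reports as fixing the error.
def read_utf8_bom(path)
  File.read(path, mode: "rb:BOM|UTF-8")
end

Dir.mktmpdir do |dir|
  sample = File.join(dir, "sample.xml")
  # Write a small file that starts with the three BOM bytes EF BB BF.
  File.binwrite(sample, "\xEF\xBB\xBF<root/>\n")

  contents = read_utf8_bom(sample)
  puts contents.encoding
  puts contents
end
```

Putting the "b" into the mode string keeps the call the same on every platform, so it should not matter whether the script runs under a debugger or on its own.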
I'm writing a Ruby script which accesses an API based on HTTP POST calls.
The API returns a zip file containing text documents when I call it with specific POST parameters. At the moment I'm doing that with the Net::HTTP package.
Now my problem:
As far as I can tell, it returns the zip file as a string. I can see "PK" (which I suppose is part of the PK header of zip files) and the text from the documents.
And the Content-Type header is telling me "application/x-zip-compressed; name="somename.zip"".
When I save the zip file like so:
result = comodo.get_cert("<somenumber>")
puts result['Content-Type']
puts result.inspect
puts result.body
File.open("test.zip", "w") do |file|
file.write result.body
end
I can unzip it on my MacBook without further problems. But when I run the same code on my Windows 10 PC, it tells me that the file is corrupt or not a zip file.
Does it have something to do with the encoding? Can I change it so that it works on both?
Or is this a completely wrong approach to receiving a zip file from a POST request?
PS:
My ruby-version on Mac:
ruby 2.2.3p173
My ruby-version on Windows:
ruby 2.2.4p230
Many thanks in advance!
The problem is due to the way Windows handles line endings (\r\n for Windows, whereas OS X and other Unix based operating systems use just \n). When using File.open, using the mode of just w makes the file subject to line ending changes, so any occurrences of byte 0x0A (or \n) are converted into bytes 0x0D 0x0A (or \r\n), which effectively breaks the zip.
When opening the file for write, use the mode wb instead, as this will suppress any line ending changes.
http://ruby-doc.org/core-2.2.0/IO.html#method-c-new-label-IO+Open+Mode
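As a rough illustration of the point above, the sketch below writes a few bytes that contain 0x0A and checks that the file on disk has exactly the same length. The helper name save_binary and the sample bytes are made up for the example; they stand in for result.body from the question.

```ruby
require "tmpdir"

# Hypothetical helper: write bytes to disk without any newline translation
# and return the resulting file size.
def save_binary(path, bytes)
  File.open(path, "wb") { |file| file.write(bytes) }
  File.size(path)
end

# Stand-in for result.body: the "PK" signature plus bytes that include 0x0A,
# which text mode on Windows would expand to 0x0D 0x0A.
body = "PK\x03\x04\n\x00\n\x00".b

Dir.mktmpdir do |dir|
  written = save_binary(File.join(dir, "test.zip"), body)
  puts "#{written} of #{body.bytesize} bytes on disk"
end
```

File.binwrite(path, bytes) is a one-call alternative that always writes in binary mode.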
Many thanks! Just as you posted the solution, I found it out myself.
So much trouble because of one missing 'b' :/
Thank you very much!
The solution (see Ben Y's answer):
result = comodo.get_cert("<somenumber>")
puts result['Content-Type']
puts result.inspect
puts result.body
File.open("test.zip", "wb") do |file|
file.write result.body
end
I'm writing a small program in Ruby which essentially changes some files within a zip file. The zip file is specified as a parameter on the command line and interpreted via OptionParser.
The problem is that when I specify a file whose name contains non-ASCII characters, the file cannot be opened, and I get an error saying that it could not be found. This problem occurs when using cmd.exe under Windows.
Here is a minimal example:
# example.rb
require "zip"
require "optparse"
zip_file_name = String.new
# read and interpret command line arguments:
OptionParser.new do |opts|
opts.on("-f", "--file FILE", String, "The zip-file, which will be modified") do |f|
zip_file_name = f
end
end.parse!
# Open the zip file:
Zip::File.open(zip_file_name) do |zipfile|
end
If you create a zip-file test.zip and run example.rb -f test.zip everything is okay (it does finish without errors). Doing the same with a zip-file täst.zip gives me an error. I tried doing zip_file_name.encode!(Encoding::UTF_8), but this didn't solve the problem.
It seems to be an encoding problem (the encoding of zip_file_name is cp850) but the transcoding does not seem to work correctly.
So my question would be: How can I change my program to also allow non-ascii characters for specifying files on the command line?
Adding zip_file_name.force_encoding(Encoding::Windows_1252) before opening the file solves the issue (on Western Europe Windows).
Apparently, Ruby wrongly assumes that the file name is encoded in CP850. On my Windows system, it seems that file names are encoded in Windows-1252 (Microsoft's variant of Latin-1, i.e. ISO 8859-1).
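To show why encode! did not help while force_encoding did, here is a small sketch. encode transcodes the characters on the assumption that the current label is right, whereas force_encoding only changes the label and leaves the bytes alone. The helper name relabel_to_utf8 is hypothetical, and the raw bytes are an assumption about how a Western European Windows system would hand over the name.

```ruby
# Bytes of the file name as Windows-1252 would store them (0xE4 is "ä" there).
raw = "t\xE4st.zip".b

# Hypothetical helper: relabel the bytes with the encoding they really use,
# then transcode the result to UTF-8.
def relabel_to_utf8(bytes, actual_encoding)
  bytes.dup.force_encoding(actual_encoding).encode(Encoding::UTF_8)
end

# Same bytes, two different labels, two different file names.
puts relabel_to_utf8(raw, Encoding::Windows_1252)
puts relabel_to_utf8(raw, Encoding::CP850)
```

If the label is wrong to begin with, transcoding carries the wrong character over into UTF-8, which would explain the "file not found" error.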
Running the command ocra script.rb --no-autoload --no-enc --add-all-core gives me the error initialize: can't convert nil into String (TypeError) for the following line:
doc = Nokogiri::XML(File.open(ARGV[0]))
What's going on here? I want to build the executable so that it can take any argument and use that file as the XML configuration.
It has been a long time, but the accepted solution doesn't work for me.
The working solution is to add -- followed by any dummy data as an argument, so that the execution flow is the same as in a normal run.
For example, you need to run:
ocra yourscript.rb -- ANYDATAHERE
Just add this above that line:
exit if defined? Ocra
# skip anything below this line when we're building the exe
Unless there's a require or otherwise loaded dependency below that line you should be fine.
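A sketch of how the guard could look in context, on the assumption that OCRA runs the script once without arguments while building, so ARGV[0] is nil at that point. The helper name config_path_from and the messages are hypothetical; the real script would parse the XML where the last branch prints.

```ruby
# Hypothetical helper: return the config path from the argument list,
# or nil when no usable argument was given.
def config_path_from(argv)
  path = argv.first
  path unless path.nil? || path.empty?
end

config_path = config_path_from(ARGV)

if defined?(Ocra)
  # Only true while OCRA is running the script to build the executable.
  puts "Build run: skipping the XML load"
elsif config_path.nil?
  puts "Usage: yourscript.exe CONFIG.xml"
else
  # The real script would call Nokogiri::XML(File.open(config_path)) here.
  puts "Would load #{config_path}"
end
```

Checking for a missing argument as well means the built executable prints a usage line instead of raising a TypeError when it is started without a file.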
I cannot figure out the proper way to encode a shell command to run from Ruby on Windows. The following script reproduces the problem:
# encoding: utf-8
def test(word)
returned = `echo #{word}`.chomp
puts "#{word} == #{returned}"
raise "Cannot roundtrip #{word}" unless word == returned
end
test "good"
test "bÃd"
puts "Success"
# win7, cmd.exe font set to Lucida Console, chcp 65001
# good == good
# bÃd == bÃd
Is this a bug in Ruby, or do I need to encode the command string manually to a specific encoding, before it gets passed to the cmd.exe process?
Update: I want to make it clear that the problem is not with reading the output back into Ruby; it's purely with sending the command to the shell. To demonstrate:
# encoding: utf-8
File.open("bbbÃd.txt", "w") do |f|
f.puts "nothing to see here"
end
filename = Dir.glob("bbb*.txt").first
command = "attrib #{filename}"
puts command.encoding
puts "#{filename} exists?: #{ File.exists?(filename) }"
system command
File.delete(filename)
#=>
# UTF-8
# bbbÃd.txt exists?: true
# File not found - bbbÃd.txt
You can see that the file gets created correctly and that the File.exists? method confirms that Ruby can see it, but when I try to run the attrib command on it, it tries to use a different filename.
Try setting the environment variable LC_CTYPE like this:
LC_CTYPE=en_US.UTF-8
Set this globally in the command shell or inside your Ruby script:
ENV['LC_CTYPE']='en_US.UTF-8'
I had the same issue using drag-and-drop in Windows.
When I dropped a file with Unicode characters in its name, the Unicode characters got replaced by question marks.
Tried everything with encoding, changing the drophandler etc.
The only thing that worked was creating a batch file with following contents.
ruby.exe -Eutf-8 C:\Users\user\myscript.rb %*
The batch file does receive the Unicode characters correctly, as you can see if you first do an echo %* followed by a pause.
I needed to add the -Eutf-8 parameter to have the filename come in as UTF-8 in the script itself. Having the following lines in my script was not enough:
#encoding: UTF-8
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8
Hope this helps people with similar problems.
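When comparing a run with and without -Eutf-8, it can help to print what Ruby thinks the encodings are. The following is only a diagnostic sketch; the helper name encoding_report is made up, and it does not change any setting.

```ruby
# Hypothetical helper: collect the encodings Ruby is using for this run,
# plus the encoding label and validity of each argument.
def encoding_report(args)
  {
    default_external: Encoding.default_external.name,
    default_internal: Encoding.default_internal&.name,
    locale: Encoding.find("locale").name,
    filesystem: Encoding.find("filesystem").name,
    arguments: args.map { |arg| [arg, arg.encoding.name, arg.valid_encoding?] }
  }
end

encoding_report(ARGV).each { |key, value| puts "#{key}: #{value.inspect}" }
```

Running the script once directly and once through the batch file should show which of these values differ between the two runs.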
I'm still learning Ruby, so I'm sure I'm doing something wrong here, but using Ruby 1.9.3 on Windows, I'm having a problem writing a file of random ASCII garbage at a specific size. I need to be able to write these files for a test on an application I'm QAing. On Mac and on *nix, the file size is written correctly every time. But on Windows, it generates files of random size, generally between 1,024 bytes and 1,031 bytes.
I suspect the problem is that one of the characters rstr generates is being counted as two characters, but it seems like this shouldn't happen.
Here is my code:
num = 10
k = 1
for i in 1..num
fname = "f#{i}.txt"
f = File.new(fname, "w")
for k in 1..size
rstr = "#{(1..1024).map{rand(255).chr}.join}"
f.write rstr
print " #{rstr.size} " # this returns 1024 every time.
rstr = ""
end
f.close
end
Also tried:
opts = {}
opts[:encoding] = "UTF-8"
fname = "f#{i}.txt"
f = File.new(fname, "w", opts)
By default, files opened on Windows are opened in text mode, meaning that line endings and other details are adjusted.
If you want the files to be written byte for byte exactly as you intend, you need to open them in binary mode:
File.open("foo", "wb") do |f|
  # ...
end
The b is ignored on POSIX operating systems, so your scripts are now cross-platform compatible.
Note: I used File.open with block syntax so that the file is properly closed and the file handle released once the block has executed. You no longer need to worry about closing the file ;-)
Hope this helps.
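Applied to the question's use case, a sketch could look like the following. The helper name write_random_file and the chunk count are assumptions for the example; Random.new.bytes is used here instead of the question's rand(255).chr loop to get 1,024 random bytes per chunk.

```ruby
require "tmpdir"

# Hypothetical helper: write `chunks` blocks of 1024 random bytes to `path`
# in binary mode and return the resulting file size.
def write_random_file(path, chunks)
  File.open(path, "wb") do |f|
    chunks.times { f.write(Random.new.bytes(1024)) }
  end
  File.size(path)
end

Dir.mktmpdir do |dir|
  size = write_random_file(File.join(dir, "f1.txt"), 4)
  puts size
end
```

In text mode on Windows, every random byte that happens to be 0x0A would be written as two bytes, which would account for files that come out a few bytes too large.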
There is no ASCII character 255: rand(255) only returns values from 0 to 254.
If you try to printf 255.chr, you'll get a multibyte character.
As Windows does not default to UTF-8, you'll get incorrect values. Hence the problem you're facing!
Try adding #coding: utf-8 at the top of your file. It should get things working.