encoding error when calling FileUtils.copy_entry in Ruby - ruby

I'm writing code that copies directories recursively,
but I get an encoding error when calling FileUtils.copy_entry.
Error message:
C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1535:in `join': incompatible character encodings: CP949 and UTF-8 (Encoding::CompatibilityError)
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1535:in `join'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1218:in `path'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:463:in `block in copy_entry'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1485:in `call'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1485:in `wrap_traverse'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1488:in `block in wrap_traverse'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1487:in `each'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1487:in `wrap_traverse'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1488:in `block in wrap_traverse'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1487:in `each'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1487:in `wrap_traverse'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:460:in `copy_entry'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:435:in `block in cp_r'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1558:in `block in fu_each_src_dest'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1574:in `fu_each_src_dest0'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1556:in `fu_each_src_dest'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:434:in `cp_r'
from copy_test.rb:11:in `copy_files'
from copy_test.rb:81:in `block in <main>'
from copy_test.rb:76:in `each'
from copy_test.rb:76:in `<main>'
I'm calling copy_entry like this:
def copy_files(src, dst)
  FileUtils.mkdir_p(File.dirname(dst))
  FileUtils.copy_entry(src, dst)
end
Some sub-folders and files in src have names in my local language.
So I think the Encoding::CompatibilityError occurs because of these non-English names.
When I tested with only English folder and file names, it worked.
But I need to copy the non-English folders and files too.
How can I solve this problem?
Should I write a new method to replace copy_entry?
My code (added):
# -*- encoding: cp949 -*-
require 'fileutils'
LIST_FILE = "backup_target_list.txt"
DEST_FILE = "backup_dest.txt"
def copy_files(src, dst)
  FileUtils.mkdir_p(File.dirname(dst))
  FileUtils.copy_entry(src, dst)
end
def get_dst_base_name()
  cur_date = Time.now.to_s[0..9]
  return "backup_#{cur_date}"
end
def get_backup_list(list_file)
  if !File.exist?(list_file) then
    return nil
  end
  path_arr = []
  File.open(list_file, "r") do |f|
    f.each_line { |line|
      path_arr.push(make_path(line.gsub("\n", "")))
    }
  end
  return path_arr
end
def get_dest(dest_file)
  if !File.exist?(dest_file) then
    return nil
  end
  return File.open(dest_file, "r").readline
end
def make_path(*str)
  path_new = nil
  str.each do |item|
    if item.class == Array then
      path_new = (path_new == nil ? File.join(item) : File.join(path_new, item))
    else
      if item.include?(File::ALT_SEPARATOR) then
        path_new = (path_new == nil ? File.join(item.split(File::ALT_SEPARATOR)) : File.join(path_new, item.split(File::ALT_SEPARATOR)))
      else
        path_new = (path_new == nil ? File.join(item) : File.join(path_new, item))
      end
    end
  end
  return path_new
end
get_backup_list(LIST_FILE).each do |path|
  src = path
  tmp = src.split(File::SEPARATOR)
  dst = make_path(get_dest(DEST_FILE), get_dst_base_name, tmp[1..tmp.length])
  print "src: #{src}\n=> dst: #{dst}\n"
  copy_files(src, dst)
end

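I see you are declaring CP949 as the source encoding, and the error says that CP949 and UTF-8 are incompatible, so why not use UTF-8 throughout? Replace the encoding magic comment at the top of the script with
# coding: UTF-8
To make sure every string read from LIST_FILE is tagged as UTF-8, add the following inside the each_line block. Note that force_encoding only relabels the bytes without converting them, so this assumes the list file itself is saved as UTF-8:
line.force_encoding('utf-8')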
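If the list file is actually saved as CP949 rather than UTF-8, relabeling the bytes would produce invalid strings. A minimal sketch of an alternative under that assumption is to let Ruby transcode while reading. The helper name read_path_list and its file_encoding parameter are hypothetical and not part of the original script.

```ruby
# Hypothetical helper (not in the original script): reads one path per line
# and returns the paths as UTF-8 strings.
# Assumption: the list file is saved as CP949; pass another name if it is not.
def read_path_list(list_file, file_encoding = "CP949")
  # "CP949:UTF-8" means: external encoding CP949, transcoded to UTF-8 on read.
  File.foreach(list_file, encoding: "#{file_encoding}:UTF-8").map(&:chomp)
end
```

Unlike force_encoding, this converts the bytes, so the returned paths are valid UTF-8 whatever the file's original encoding was, provided the encoding name passed in is the right one.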

Related

How to copy files and sub-folders

I'm trying to copy files including sub-folders, but I'm getting a RuntimeError: copy: unknown file type.
I also created folders and files to test myself.
Folders and files for test:
- D:
- backup_target
- target1
- file1.txt
- sub1
- sub_file1.txt
- sub_file2.txt
- sub2
- sub1.txt
- sub.txt
- target2
- file1.txt
- file.txt
And I made a list as a text file in "backup_target_list2.txt":
D:\backup_target\target1
D:\backup_target\target2
My code is:
require 'fileutils'
LIST_FILE = "backup_target_list2.txt"
def copy_files(src, dst)
  FileUtils.mkdir_p(File.dirname(dst))
  FileUtils.copy_entry(src, dst)
end
def dst_naming(src, src_head, dst_head)
  cur_date = Time.now.to_s[0..9]
  dst = src.gsub(src_head, dst_head + "/backup_#{cur_date}/")
  return dst
end
def get_backup_list(list_file)
  if !File.exist?(list_file) then
    return nil
  end
  path_arr = []
  File.open(list_file, "r") do |f|
    f.each_line { |line|
      path_arr.push(replace_delim(line).gsub("\n", ""))
    }
  end
  return path_arr
end
def replace_delim(str_obj, delim_org = "\\", delim_new = "/")
  if str_obj.class == Array
    str_arr = []
    str_obj.each do |str|
      str_arr.push(str.gsub! delim_org, delim_new)
    end
    return str_arr
  else
    return str_obj.gsub! delim_org, delim_new
  end
end
get_backup_list(LIST_FILE).each do |path|
  src = path
  dst = dst_naming(src, /D:\//, "C:/Users/MyName/Desktop/")
  print "src: #{src}\n=> dst: #{dst}\n"
  copy_files(src, dst)
end
It works fine with the test folders.
The problem is that I get a RuntimeError when I run the code with my real folders:
C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1375:in `copy': unknown file type: D:/MyRealFolder (RuntimeError)
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:463:in `block in copy_entry'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1485:in `call'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:1485:in `wrap_traverse'
from C:/Ruby200/lib/ruby/2.0.0/fileutils.rb:460:in `copy_entry'
from copy_test.rb:8:in `copy_files'
from copy_test.rb:52:in `block in <main>'
from copy_test.rb:48:in `each'
from copy_test.rb:48:in `<main>'
The difference between "real" and "test" is that "real" folders have more sub-folders and more files.
I also tried FileUtils.cp_r and FileUtils.cp but I'm still getting an error.
How can I fix it?

Ruby Filename error?

I forked this gist from https://gist.github.com/mattdipasquale/571405
I am getting the following error:
deDUPER.rb:14:in `read': Invalid argument - /Volumes/Drobo #1 2009-2012/AMNH Video/2012/2012-01-17 Creatures of Light/Capture Scratch/Art 3:9/Capture Scratch/2012-03-09_microraptor livestream/A Cam_Microraptor livestream.mov (Errno::EINVAL)
from deDUPER.rb:14:in `block in <main>'
from deDUPER.rb:10:in `each'
from deDUPER.rb:10:in `<main>'
I think it is caused by illegal characters in the file or folder names, but I'm not sure. I don't want to change the file or folder names because they are linked to old Final Cut Pro project files that rely on referenced file paths to keep the project intact. Does anyone have experience with this? Is there a way I can get this script to work without having to change the file or folder names?
# Define the unique method that removes duplicates
#!/usr/bin/ruby
require 'digest/md5'
library_path = ARGV[0]
hash = {}
Dir.glob(library_path + "/**/*", File::FNM_DOTMATCH).each do |filename|
  next if File.directory?(filename)
  puts 'Checking ' + filename
  key = Digest::MD5.hexdigest(IO.read(filename)).to_sym
  if hash.has_key? key
    # puts "same file #{filename}"
    hash[key].push filename
  else
    hash[key] = [filename]
  end
end
hash.each_value do |filename_array|
  if filename_array.length > 1
    puts "=== Identical Files ===\n"
    filename_array.each { |filename| puts ' '+filename }
  end
end
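The following sketch does not identify why read fails on that particular path. It only shows one way to keep the scan going: stream each file through Digest::MD5.file instead of loading it with IO.read, and collect any path that raises a system-call error (Errno::EINVAL is one) so it can be inspected afterwards. The helper name group_by_digest is hypothetical.

```ruby
require 'digest/md5'

# Hypothetical helper: groups regular files under root by MD5 digest.
# Returns [duplicates, failures], where duplicates maps a digest to the
# paths sharing it and failures lists [path, error message] pairs.
def group_by_digest(root)
  groups = Hash.new { |hash, key| hash[key] = [] }
  failures = []
  Dir.glob(File.join(root, "**", "*"), File::FNM_DOTMATCH).each do |path|
    next unless File.file?(path)
    begin
      # Digest::MD5.file reads the file in chunks rather than all at once.
      groups[Digest::MD5.file(path).hexdigest] << path
    rescue SystemCallError => e
      # Record the unreadable path instead of aborting the whole run.
      failures << [path, e.message]
    end
  end
  [groups.select { |_digest, paths| paths.length > 1 }, failures]
end
```

Calling duplicates, failures = group_by_digest(ARGV[0]) and printing failures with p shows the exact strings that could not be opened, which may help narrow down whether a character in the name is responsible.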

ruby - zlib header error when uncompressing tar.gz

I wanted to write a Ruby program to unpack tar.gz files, and I ran into some issues. I read the documentation on the ruby-doc site for Zlib::GzipReader and Zlib::Inflate. Then I found a module someone wrote on GitHub, and that didn't work either. Using the examples from the Ruby documentation page for Zlib::GzipReader, I was able to run these successfully:
irb(main):027:0* File.open('zlib_inflate.tar.gz') do |f|
irb(main):028:1* gz = Zlib::GzipReader.new(f)
irb(main):029:1> print gz.read
irb(main):030:1> gz.close
irb(main):031:1> end
#
#
#
irb(main):023:0* Zlib::GzipReader.open('zlib_inflate.tar.gz') { |gz|
irb(main):024:1* print gz.read
irb(main):025:1> }
Then, when trying to use the Zlib::Inflate options, I kept running into incorrect header check errors.
irb(main):047:0* zstream = Zlib::Inflate.new
=> #<Zlib::Inflate:0x00000002b15790 @dictionaries={}>
irb(main):048:0> buf = zstream.inflate('zlib_inflate.tar.gz')
Zlib::DataError: incorrect header check
from (irb):48:in `inflate'
from (irb):48
from C:/ruby/Ruby200p353/Ruby200-x64/bin/irb:12:in `<main>'
#
#
#
irb(main):049:0> cf = File.open("zlib_inflate.tar.gz")
=> #<File:zlib_inflate.tar.gz>
irb(main):050:0> ucf = File.open("ruby_inflate.rb", "w+")
=> #<File:ruby_inflate.rb>
irb(main):051:0> zi = Zlib::Inflate.new(Zlib::MAX_WBITS)
=> #<Zlib::Inflate:0x00000002c097f0 @dictionaries={}>
irb(main):052:0> ucf << zi.inflate(cf.read)
Zlib::DataError: incorrect header check
from (irb):52:in `inflate'
from (irb):52
from C:/ruby/Ruby200p353/Ruby200-x64/bin/irb:12:in `<main>'
#
#
#
irb(main):057:0* File.open("zlib_inflate.tar.gz") {|cf|
irb(main):058:1* zi = Zlib::Inflate.new
irb(main):059:1> File.open("zlib_inflate.rb", "w+") {|ucf|
irb(main):060:2* ucf << zi.inflate(cf.read)
irb(main):061:2> }
irb(main):062:1> zi.close
irb(main):063:1> }
Zlib::DataError: incorrect header check
from (irb):60:in `inflate'
from (irb):60:in `block (2 levels) in irb_binding'
from (irb):59:in `open'
from (irb):59:in `block in irb_binding'
from (irb):57:in `open'
from (irb):57
from C:/ruby/Ruby200p353/Ruby200-x64/bin/irb:12:in `<main>'
How would I go about taking a tar.gz file and extracting its contents so that the extracted files are not empty? I did read another post about Ruby Zlib, but that did not work for me either. What am I doing wrong?
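Some background on the attempts above: Zlib::Inflate.new with default arguments expects a zlib header, while a .tar.gz file carries a gzip header, and the first attempt passes the file name string to inflate rather than the file's bytes. Even once the gzip layer is removed, the result is a tar stream that still has to be split into files. A minimal sketch, assuming the tar reader bundled with RubyGems (Gem::Package::TarReader) is acceptable for the job and that the archive comes from a trusted source; the helper name extract_tar_gz is hypothetical.

```ruby
require 'zlib'
require 'rubygems/package'
require 'fileutils'

# Hypothetical helper: unpacks archive_path (a .tar.gz) into dest_dir and
# returns the list of files written.
# Assumption: the archive is trusted; entry names are not checked for "..".
def extract_tar_gz(archive_path, dest_dir)
  extracted = []
  Zlib::GzipReader.open(archive_path) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each do |entry|
        target = File.join(dest_dir, entry.full_name)
        if entry.directory?
          FileUtils.mkdir_p(target)
        elsif entry.file?
          FileUtils.mkdir_p(File.dirname(target))
          # entry.read can return nil for an empty file, hence the fallback.
          File.binwrite(target, entry.read || "")
          extracted << target
        end
      end
    end
  end
  extracted
end
```

If only the raw decompressed stream is wanted, Zlib::Inflate.new(Zlib::MAX_WBITS + 32) accepts both gzip and zlib headers, but its output for a .tar.gz is still the tar container rather than the individual files.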

Encoding::UndefinedConversionError: U+00A0 from UTF-8 to US-ASCII

I'm trying to scrape the 52 inside the anchor link:
<div class="zg_usedPrice">
52 new
</div>
With this code:
def self.parse_products
  product_hash = {}
  product = @data.css('#zg_centerListWrapper')
  product.css('.zg_itemImmersion').each do | product |
    product_name = product.css('.zg_title a').text
    product_used_price_status = product.css('.zg_usedPrice > a').text[/(\D+)/]
    product_hash[:product] ||= []
    product_hash[:product] << { :name => product_name,
                                :used_status => product_used_price_status }
  end
  product_hash
end
But I think the http://www.amazon.com/gp/offer-listing/B000O3GCFU/ref=zg_bs_baby-products_price?ie=UTF8&condition=new part in the URL is producing the following error:
Encoding::UndefinedConversionError:
U+00A0 from UTF-8 to US-ASCII
# ./parser_spec.rb:175:in `block (2 levels) in <top (required)>'
I tried what they suggested in "Ruby error UTF-8 to ASCII", but I'm still getting the same problem. Is there any workaround for that?
Full error trace:
1) Product (Baby) should return correct keys
Failure/Error: expect(product_hash[:product]["Pet Supplies"].keys).to eq(["Birds", "Cats", "Dogs", "Fish & Aquatic Pets", "Horses", "Insects", "Reptiles & Amphibians", "Small Animals"])
TypeError:
can't convert String into Integer
# ./parser_spec.rb:179:in `[]'
# ./parser_spec.rb:179:in `block (2 levels) in <top (required)>'
2) Product (Baby) should return correct values
Failure/Error: expect(product_hash[:product]["Pet Supplies"].values).to eq([16281, 245512, 513926, 46811, 14805, 364, 5816, 19769])
TypeError:
can't convert String into Integer
# ./parser_spec.rb:183:in `[]'
# ./parser_spec.rb:183:in `block (2 levels) in <top (required)>'
3) Product (Baby) should return correct hash
Failure/Error: expect(product_hash[:product]).to eq({"Pet Supplies"=>{"Birds"=>16281, "Cats"=>245512, "Dogs"=>513926, "Fish & Aquatic Pets"=>46811, "Horses"=>14805, "Insects"=>364, "Reptiles & Amphibians"=>5816, "Small Animals"=>19769}})
Encoding::UndefinedConversionError:
U+00A0 from UTF-8 to US-ASCII
# ./parser_spec.rb:187:in `block (2 levels) in <top (required)>'
Your HTML sample doesn't match the code you're showing, plus the URL you gave doesn't exist any more, so it's difficult to help you.
Here's a start:
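require 'nokogiri'
# Assumes the anchor holds a non-breaking space, as the selector and the
# output shown below imply.
html = '<div class="zg_usedPrice">
<a>52&nbsp;new</a>
</div>
'
doc = Nokogiri::HTML(html)
text = doc.at('div.zg_usedPrice a').text # => "52\u00A0new"
text.gsub(/\u00A0/, ' ') # => "52 new"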
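The reason the replacement helps can be shown without Nokogiri. U+00A0 (the non-breaking space) has no US-ASCII equivalent, so any step that converts the string to US-ASCII raises the error from the question. The sketch below reproduces that and then converts cleanly after normalizing; the helper name normalize_nbsp is hypothetical.

```ruby
# Hypothetical helper: replaces non-breaking spaces (U+00A0) with plain spaces.
def normalize_nbsp(text)
  text.gsub("\u00A0", " ")
end

text = "52\u00A0new"

begin
  # Raises because U+00A0 cannot be represented in US-ASCII.
  text.encode("US-ASCII")
rescue Encoding::UndefinedConversionError => e
  puts "conversion failed: #{e.message}"
end

# After normalizing, the same conversion succeeds.
puts normalize_nbsp(text).encode("US-ASCII")
```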

ruby: `read': Invalid argument -(Errno::EINVAL) at File.read

I'm doing a simple script to check crc of all files...
require "zlib"
exit if Object.const_defined?(:Ocra)
files = Dir.glob("*")
File.open('dir.txt', 'a+') do |file|
  file.puts files
end
File.read('dir.txt').each_line { |line|
  file = File.read(line) ; nil
  file_crc = Zlib.crc32(file,0).to_s(16)
  puts line, file_crc
}
The problem is at the line File.read('dir.txt').each_line { |line|
I get this error:
test.rb:13:in `read': Invalid argument - 1.exe (Errno::EINVAL)
from C:/Users/Administrador/Desktop/1.rb:13:in `block in <main>'
from C:/Users/Administrador/Desktop/1.rb:12:in `each_line'
from C:/Users/Administrador/Desktop/1.rb:12:in `<main>'
P.S.: 1.exe is a file listed in "dir.txt".
Have you checked that the line doesn't contain extra characters? Print it with p line to see.
IIRC, each line still ends with the newline character; use line.chomp to strip it before passing the name to File.read.
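A minimal sketch of the chomp fix, assuming the list file holds one file name per line. The helper name crc_for_listed_files is hypothetical; File.binread is used here instead of File.read so that binary files such as 1.exe are read byte for byte.

```ruby
require 'zlib'

# Hypothetical helper: returns [name, crc32 in hex] for each file named in
# list_file, one name per line.
def crc_for_listed_files(list_file)
  File.foreach(list_file).map(&:chomp).reject(&:empty?).map do |name|
    # chomp removed the trailing newline, so name is a usable path.
    # binread avoids newline and encoding translation of binary content.
    [name, Zlib.crc32(File.binread(name)).to_s(16)]
  end
end
```

Usage would look like crc_for_listed_files('dir.txt').each { |name, crc| puts name, crc }.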
