I have a .rb file that when run takes a string input for UTF-8, but for some reason the input is modified automatically. Here is an example of what my code looks like:
# encoding: UTF-8
print "Enter a UTF-8 input: "
text = gets.chomp
p text
So, if I input "\n\u001C\u0018\t\u001C", it prints out "\\n\\u001C\\u0018\\t\\u001C", which is not what I typed!
Out of curiosity I compared the lengths, and both are 22. But I know the value is modified, because when I pass text to a function in the same file, the function sees the second form. My actual code works as intended in irb, but not when I run it from the file.
EDIT: Sean answered my question about the printing, but that doesn't explain why the value in text is not seen as it should be when I pass it to a function within the same Ruby file. The function works perfectly in irb when I type the string literal in directly: "\t\u001C\u001C".xor "key" should return "bye". Once again, this works in irb, but when I run it from the file it raises "'*': negative argument (ArgumentError)". Below is the function:
class String
  def xor(key)
    text = dup
    b1 = text.unpack("U*")
    b2 = key.unpack("U*")
    longest = key.length #[b1.length,b2.length].max
    b1 = [0]*(longest-b1.length) + b1
    b2 = [0]*(longest-b2.length) + b2
    result = b1.zip(b2).map{ |a,b| a^b }
    result.pack("U*")
  end
end

This is happening because you are using:
p text
instead of:
puts text
When you use p, Ruby outputs the result of:
puts text.inspect
which shows the extra backslashes that inspect adds as escape characters. If you just use puts, you will see the expected result!
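A small sketch of the difference, and of a likely reason the xor call fails when run from the file. The variable typed is a stand-in for what gets.chomp returns when the escapes are typed at the prompt; the names literal and typed are only for illustration. In a string literal Ruby interprets the escapes, while typed input keeps every backslash as an ordinary character, so the string is much longer than the key.

```ruby
# A double-quoted literal: the parser turns the escapes into 3 characters.
literal = "\t\u001C\u001C"

# Stand-in for gets.chomp after typing \t\u001C\u001C at the prompt:
# single quotes keep each backslash as a plain character, as gets does.
typed = '\t\u001C\u001C'

literal.length  # => 3
typed.length    # => 14

# p shows the inspect form, which escapes each backslash again;
# puts would print the raw characters exactly as typed.
escaped = typed.inspect

# With longest = key.length (3), the padding size for the typed input
# is negative, which is what Array#* rejects.
padding = "key".length - typed.length  # => -11
```

Under this reading, irb works only because the literal is parsed by Ruby before xor ever sees it.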


Ruby script which can replace a string in a binary file to a different, but same length string?

I would like to write a Ruby script (repl.rb) that replaces a string in a binary file (the string is defined by a regex) with a different string of the same length.
It works like a filter and outputs to STDOUT, which can be redirected (ruby repl.rb data.bin > data2.bin). The regex and replacement can be hardcoded. My approach is:
fn = ARGV[0]
regex = /\-\-[0-9a-z]{32,32}\-\-/
replacement = "--0ca2765b4fd186d6fc7c0ce385f0e9d9--"
blk_size = 1024
File.open(fn, "rb") {|f|
  while not f.eof?
    data = f.read(blk_size)
    data.gsub!(regex, replacement)
    print data
  end
}
My problem is that the string can be positioned in the file so that it straddles a block boundary. For example, when blk_size=1024 and the first occurrence of the string begins at byte position 1000, the string is split across two reads, so I will not find it in the "data" variable, and the next read cycle will not find it either. Should I process the whole file twice with different block sizes to avoid this worst-case scenario, or is there another approach?
I would posit that a tool like sed might be a better choice for this. That said, here's an idea: Read block 1 and block 2 and join them into a single string, then perform the replacement on the combined string. Split them apart again and print block 1. Then read block 3, join blocks 2 and 3, and perform the replacement as above. Split them again and print block 2. Repeat until the end of the file. I haven't tested it, but it ought to look something like this:
File.open(fn, "rb") do |f|
  last_block, this_block = nil
  while not f.eof?
    last_block, this_block = this_block, f.read(blk_size)
    data = "#{last_block}#{this_block}".gsub(regex, replacement)
    last_block, this_block = data.slice!(0, blk_size), data
    print last_block
  end
  print this_block
end
There's probably a nontrivial performance penalty for doing it this way, but it could be acceptable depending on your use case.
Maybe a cheeky
f.pos = f.pos - replacement.size
at the end of the while loop, just before reading the next chunk.
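One way to sketch the overlap idea without seeking backwards is to hold back the last few bytes of each buffer and prepend them to the next chunk. The helper name replace_in_stream and its parameters are hypothetical. The sketch assumes every match has exactly the same byte length as the replacement (as in the question), so holding back one byte less than that length is enough to catch a match cut by a block boundary. Matches that sit back to back and could share their dashes might be treated differently than by a single gsub over the whole file.

```ruby
require 'stringio'

# Hypothetical helper: copies input to output, replacing regex matches,
# while reading only blk_size bytes at a time.
def replace_in_stream(input, output, regex, replacement, blk_size = 1024)
  # Longest prefix of a match that a block boundary can cut off.
  overlap = replacement.bytesize - 1
  carry = "".b # bytes held back from the previous buffer

  while (chunk = input.read(blk_size)) # read(n) returns nil at EOF
    buffer = carry + chunk
    buffer.gsub!(regex) { replacement }

    # Write everything except the tail, which may hold the start of a match.
    keep = [overlap, buffer.bytesize].min
    output.write(buffer.byteslice(0, buffer.bytesize - keep))
    carry = buffer.byteslice(buffer.bytesize - keep, keep)
  end

  output.write(carry) # nothing can follow, so flush the tail
end
```

In the original script this would be called as replace_in_stream(f, $stdout, regex, replacement, blk_size) inside the File.open(fn, "rb") block. Because the replacement has the same length as the match, the output has the same size as the input.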

Ruby: How do you search for a substring, and increment a value within it?

I am trying to change a file by finding this string:
<aspect name=\"lineNumber\"><![CDATA[{CLONEINCR}]]>
and replacing {CLONEINCR} with an incrementing number. Here's what I have so far:
file = File.open('input3400.txt', 'rb')
contents = file.readlines
contents.each_index do |i|
  contents.join["<aspect name=\"lineNumber\"><![CDATA[{CLONEINCR}]]></aspect>"] = "<aspect name=\"lineNumber\"><![CDATA[#{i}]]></aspect>"
end
But this seems to go on forever - do I have an infinite loop somewhere?
Note: my text file is 533,952 lines long.
You are repeatedly concatenating all the elements of contents, making a substitution, and throwing away the result. This is happening once for each line, so no wonder it is taking a long time.
The easiest solution would be to read the entire file into a single string and use gsub on that to modify the contents. In your example you are inserting the (zero-based) file line numbers into the CDATA. I suspect this is a mistake.
This code replaces all occurrences of <![CDATA[{CLONEINCR}]]> with <![CDATA[1]]>, <![CDATA[2]]> etc., with the number incrementing for each matching CDATA found. The modified file is sent to STDOUT. Hopefully that is what you need.
File.open('input3400.txt', 'r') do |f|
  i = 0
  contents = f.read.gsub('<![CDATA[{CLONEINCR}]]>') { |m|
    m.sub('{CLONEINCR}', (i += 1).to_s)
  }
  puts contents
end
If what you want is to replace CLONEINCR with the line number, which is what your above code looks like it's trying to do, then this will work. Otherwise see Borodin's answer.
output = File.readlines('input3400.txt').map.with_index do |line, i|
  line.gsub "<aspect name=\"lineNumber\"><![CDATA[{CLONEINCR}]]></aspect>",
    "<aspect name=\"lineNumber\"><![CDATA[#{i}]]></aspect>"
end
File.write('input3400.txt', output.join(''))
Also, you should be aware that when you read the lines into contents, you are creating a String distinct from the file. You can't operate on the file directly. Instead you have to create a new String that contains what you want and then overwrite the original file.
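To make the difference between the two numbering schemes concrete, here is a small in-memory sketch. The sample text is invented for illustration and stands in for the contents of input3400.txt. It contrasts a running counter per match with the zero-based line index.

```ruby
# Invented sample standing in for the file contents.
sample = <<~XML
  <aspect name="lineNumber"><![CDATA[{CLONEINCR}]]></aspect>
  <other/>
  <aspect name="lineNumber"><![CDATA[{CLONEINCR}]]></aspect>
XML

# Scheme 1: a counter that increments once per match (1, 2, ...).
counter = 0
numbered = sample.gsub('{CLONEINCR}') { (counter += 1).to_s }

# Scheme 2: the zero-based index of the line the match is on (0, 2, ...).
by_line = sample.lines.map.with_index do |line, i|
  line.sub('{CLONEINCR}', i.to_s)
end.join
```

Both variants build a new string in a single pass, so the result still has to be written back with File.write or sent to STDOUT.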

Read string as variable RUBY

I am pulling the following string from a CSV file, from cell A1, and storing it as a variable:
#{collector_id}
So, cell A1 reads #{collector_id}, and my code essentially does this:
test = #excel_cell_A1
However, if I then do this:
puts test
I get this:
#{collector_id}
I need #{collector_id} to read as the actual variable collector_id, not the code that I am using to call the variable. Is that possible?
Thanks for the help. I am using ruby 1.9.3.
You can use sub or gsub to replace expected input values:
collector_id = "foo"
test = '#{collector_id}'
test.sub("\#{collector_id}", "#{collector_id}") #=> "foo"
I would avoid the use of eval (or at least sanity check what you are running) to reduce the risk of running arbitrary code you receive from the CSV file.
Try this:
test_to_s = eval("\"#{ test }\"")
puts test_to_s
"\"#{ test }\"" will build the string "#{collector_id}" (the double quotes are part of the string, so "#{collector_id}".length == 17), which will then be evaluated as Ruby code by eval.
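As a middle ground between a single hardcoded sub and eval, a lookup table can resolve placeholders by name without executing anything from the CSV. This is only a sketch: the values hash and the sample template are hypothetical, and unknown names are deliberately left untouched.

```ruby
# Hypothetical table of the only names the template may reference.
values = { "collector_id" => "foo" }

# Single quotes keep the placeholder text literal, like the CSV cell.
test = '#{collector_id} and #{unknown}'

# Match #{name}; the brace is escaped so the regex is not interpolated.
result = test.gsub(/#\{(\w+)\}/) do |placeholder|
  name = Regexp.last_match(1)
  values.fetch(name, placeholder) # unknown names stay as they are
end
```

Since only hash lookups happen, a cell containing arbitrary Ruby code is treated as plain text.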

Using Ruby to automate a large directory system

So I have the following little script to make a file setup for organizing reports that we get.
#This script is to create a file structure for our survey data
require 'fileutils'
f = File.open('CustomerList.txt') or die "Unable to open file..."
a = f.readlines
x = 0
while a[x] != nil
  Customer = a[x]
  FileUtils.mkdir_p(Customer + "/foo/bar/orders")
  FileUtils.mkdir_p(Customer + "/foo/bar/employees")
  FileUtils.mkdir_p(Customer + "/foo/bar/comments")
  x += 1
end
Everything seems to work before the while, but I keep getting:
'mkdir': Invalid argument - Cust001_JohnJacobSmith(JJS) (Errno::EINVAL)
Which would be the first line from the CustomerList.txt. Do I need to do something to the array entry to be considered a string? Am I mismatching variable types or something?
Thanks in advance.
The following worked for me:
IO.foreach('CustomerList.txt') do |customer|
  customer.chomp!
  ["orders", "employees", "comments"].each do |dir|
    FileUtils.mkdir_p("#{customer}/foo/bar/#{dir}")
  end
end
with data like so:
$ cat CustomerList.txt
A few things to make it more like the Ruby way:
Use blocks when opening a file or iterating through arrays; that way you don't need to worry about closing the file or indexing into the array directly.
As noted by @inger, local variables start with a lower-case letter: customer.
When you want the value of a variable in a string, using #{} is more idiomatic than concatenating with +.
Also note that we removed the trailing newline using chomp! (which changes the variable in place, as indicated by the trailing ! on the method name). Each line read from the file ends with a newline, and that newline is what made the directory name invalid.
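The points above can be put together in a small sketch. The helper name make_customer_dirs and the base_dir parameter are hypothetical, added so the directories are created under a chosen root rather than the current directory. It assumes one customer name per line and skips blank lines.

```ruby
require 'fileutils'

# Hypothetical helper: creates the three report folders for every
# customer listed (one per line) in list_path, under base_dir.
def make_customer_dirs(list_path, base_dir)
  File.foreach(list_path) do |line|
    customer = line.chomp # drop the trailing "\n" (or "\r\n")
    next if customer.empty?

    %w[orders employees comments].each do |dir|
      FileUtils.mkdir_p(File.join(base_dir, customer, "foo", "bar", dir))
    end
  end
end
```

Calling make_customer_dirs('CustomerList.txt', '.') would mirror the original script. File.join is used here in place of string interpolation only to keep the path separators consistent.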

Ruby - detecting the end of the read file

I upload a file through a form and read it in the controller. My problem is that I don't know how to detect the end of the file (that is, when to stop the loop). This part of the code looks like this:
dat = params[:data]
while(d = dat.read)
  puts d
  break if d.eof #this doesn't work
end
The result of this code (apart from the error about eof) is an infinite while loop.
If length is omitted or is nil, it reads until EOF and the encoding conversion is applied. It returns a string even if EOF is met at beginning.
So I guess you should just do
d = dat.read
Edit: if you want all the lines of the file, use dat.readlines - this will return an Array of Strings
