Increment part of a string in Ruby

I have a method in a Ruby script that is attempting to rename files before they are saved. It looks like this:
def increment (path)
  if path[-3,2] == "_#"
    print " Incremented file with that name already exists, renaming\n"
    count = path[-1].chr.to_i + 1
    return path.chop! << count.to_s
  else
    print " A file with that name already exists, renaming\n"
    return path << "_#1"
  end
end
Say you have 3 files with the same name being saved to a directory, we'll say the file is called example.mp3. The idea is that the first will be saved as example.mp3 (since it won't be caught by if File.exists?("#{file_path}.mp3") elsewhere in the script), the second will be saved as example_#1.mp3 (since it is caught by the else part of the above method) and the third as example_#2.mp3 (since it is caught by the if part of the above method).
The problem I have is twofold.
1) if path[-3,2] == "_#" won't work for files with an integer of more than one digit (example_#11.mp3 for example) since the character placement will be wrong (you'd need it to be path[-4,2] but then that doesn't cope with 3 digit numbers etc).
2) I'm never reaching problem 1) since the method doesn't reliably catch file names. At the moment it will rename the first to example_#1.mp3 but the second gets renamed to the same thing (causing it to overwrite the previously saved file).
This is possibly too vague for Stack Overflow but I can't find anything that addresses the issue of incrementing a certain part of a string.
Thanks in advance!
Edit/update:
Wayne's method below seems to work on its own but not when included as part of the whole script: it can increment a file once (from example.mp3 to example_#1.mp3) but doesn't cope with taking example_#1.mp3 and incrementing it to example_#2.mp3. To provide a little more context, currently when the script finds a file to save it passes the name to Wayne's method like this:
file_name = increment(image_name)
File.open("images/#{file_name}.jpeg", 'w') do |output|
  open(image_url) do |input|
    output << input.read
  end
end
I've edited Wayne's script a little so now it looks like this:
def increment (name)
  name = name.gsub(/\s{2,}|(http:\/\/)|(www.)/i, '')
  if File.exists?("images/#{name}.jpeg")
    _, filename, count, extension = *name.match(/(\A.*?)(?:_#(\d+))?(\.[^.]*)?\Z/)
    count = (count || '0').to_i + 1
    "#{name}_##{count}#{extension}"
  else
    return name
  end
end
Where am I going wrong? Again, thanks in advance.

A regular expression will git 'er done:
#!/usr/bin/ruby1.8
def increment(path)
  _, filename, count, extension = *path.match(/(\A.*?)(?:_#(\d+))?(\.[^.]*)?\Z/)
  count = (count || '0').to_i + 1
  "#{filename}_##{count}#{extension}"
end
p increment('example') # => "example_#1"
p increment('example.') # => "example_#1."
p increment('example.mp3') # => "example_#1.mp3"
p increment('example_#1.mp3') # => "example_#2.mp3"
p increment('example_#2.mp3') # => "example_#3.mp3"
This probably doesn't matter for the code you're writing, but if you ever may have multiple threads or processes using this algorithm on the same files, there's a race condition when checking for existence before saving: Two writers can both find the same filename unused and write to it. If that matters to you, then open the file in a mode that fails if it exists, rescuing the exception. When the exception occurs, pick a different name. Roughly:
loop do
  begin
    File.open(filename, File::CREAT | File::EXCL | File::WRONLY) do |file|
      file.puts "Your content goes here"
    end
    break
  rescue Errno::EEXIST
    filename = increment(filename)
    redo
  end
end
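The retry loop can also be wrapped in a method so the caller gets back the name that was finally used. This is only a minimal sketch: write_unique is a hypothetical name that is not part of the original answer, and it assumes the increment method shown earlier (repeated here so the block runs on its own).

```ruby
# Same regex-based increment as in the answer above.
def increment(path)
  _, filename, count, extension = *path.match(/(\A.*?)(?:_#(\d+))?(\.[^.]*)?\Z/)
  count = (count || '0').to_i + 1
  "#{filename}_##{count}#{extension}"
end

# Hypothetical helper (an assumption, not from the original answer):
# creates the file exclusively and picks the next name on a collision.
def write_unique(path, content)
  loop do
    begin
      # File::EXCL makes the open fail with Errno::EEXIST if the file exists,
      # so the existence check and the creation are a single atomic step.
      File.open(path, File::CREAT | File::EXCL | File::WRONLY) do |file|
        file.write(content)
      end
      return path
    rescue Errno::EEXIST
      path = increment(path)
    end
  end
end
```

Calling write_unique('example.mp3', data) three times in an empty directory should produce example.mp3, example_#1.mp3 and example_#2.mp3, in that order.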

Here's a variation that doesn't accept a file name with an existing count:
def non_colliding_filename( filename )
  if File.exist?(filename)
    base,ext = /\A(.+?)(\.[^.]+)?\Z/.match( filename ).to_a[1..-1]
    i = 1
    i += 1 while File.exist?( filename="#{base}_##{i}#{ext}" )
  end
  filename
end
Proof:
%w[ foo bar.mp3 jim.bob.mp3 ].each do |desired|
  3.times{
    file = non_colliding_filename( desired )
    p file
    File.open( file, 'w' ){ |f| f << "tmp" }
  }
end
#=> "foo"
#=> "foo_#1"
#=> "foo_#2"
#=> "bar.mp3"
#=> "bar_#1.mp3"
#=> "bar_#2.mp3"
#=> "jim.bob.mp3"
#=> "jim.bob_#1.mp3"
#=> "jim.bob_#2.mp3"

Related

No output produced

Can anyone tell me why this program is not producing an output? The output it should be producing is:
Line read: 0
Line read: 1
Line read: 2
Line read: 3
and so on.
So far, I am not getting an output even though I have fixed a number of bugs. Any help or suggestions would be much appreciated.
# takes a number and writes that number to a file then on each line
# increments from zero to the number passed
def write(aFile, number)
  # You might need to fix this next line:
  aFile.puts(number)
  index = 0
  while (index < number)
    aFile.puts(index.to_s)
    index += 1
  end
end
# Read the data from the file and print out each line
def read(aFile)
  # Defensive programming:
  count = aFile.gets
  if (is_numeric?(count))
    count = count.to_i
    index = 0
    while (index < count)
      line = aFile.gets
      puts "line read: " + line
      index+=1
    end
  end
end
# Write data to a file then read it in and print it out
def main
  aFile = File.new("mydata.txt", "w") # open for writing
  write(aFile, 10)
  aFile.close
  aFile = File.new("mydata.txt", "r")
  read(aFile)
  aFile.close
end
# returns true if a string contains only digits
def is_numeric?(obj)
  if /[^0-9]/.match(obj) == nil
    true
  end
  false
end
main
Your code isn't written in the Ruby way.
This is how I'd write it if I wanted to closely mimic your code's logic:
# takes a number and writes that number to a file then on each line
# increments from zero to the number passed
def write_data(fname, counter)
  File.open(fname, 'w') do |fo|
    fo.puts(counter)
    counter.times do |n|
      fo.puts n
    end
  end
end
# returns true if a string contains only digits
def is_numeric?(obj)
  obj[/^\d+$/]
end
# Read the data from the file and print out each line
def read_data(fname)
  File.open(fname) do |fi|
    counter = fi.gets.chomp
    if is_numeric?(counter)
      counter.to_i.times do |n|
        line_in = fi.gets
        puts 'Line read: %s' % line_in
      end
    end
  end
end
# Write data to a file then read it in and print it out
DATA_FILE = 'mydata.txt'
write_data(DATA_FILE, 10)
read_data(DATA_FILE)
Which outputs:
Line read: 0
Line read: 1
Line read: 2
Line read: 3
Line read: 4
Line read: 5
Line read: 6
Line read: 7
Line read: 8
Line read: 9
Notice these things:
Method and variable names in Ruby are written in snake_case, not camelCase. ItsAReadabilityThing.
Ruby encourages us to use a block when opening files for reading or writing, so that the file is closed automatically when we're finished with it. Leaving dangling file handles open, in a loop, in a long-running program, is a great way for your program to crash in a way that's hard to figure out. SO has many questions that resulted from doing that. This is from the IO.open documentation:
With no associated block, ::open is a synonym for ::new. If the optional code block is given, it will be passed io as an argument, and the IO object will automatically be closed when the block terminates. In this instance, ::open returns the value of the block.
Usually you'll see code use File.open instead of IO.open, mostly out of habit in Ruby coders. File inherits from IO and adds some additional file-oriented methods to the class, so it's a little more full-featured.
Ruby has many methods that help us avoid using while loops. Getting the counters wrong, or missing a condition that should terminate the loop, is all too common in programming, so Ruby makes it easy to loop "n times" or to iterate over all the elements in an array. The times method accomplishes that nicely.
String's [] method is really powerful and makes it easy to look at the contents of a string and apply a pattern or a slice. /^\d+$/ matches only when every character is a digit (for a single-line string; \A and \z would anchor the whole string), so some_string[/^\d+$/] is shorter than what you're doing and accomplishes the same thing: it returns a "truthy" value on a match and nil otherwise.
We don't use a main method. That's old-school Pascal, C or Java and is artificially structured. Ruby's a little more friendly than that.
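To illustrate the point about blocks closing files automatically, here is a small sketch. The helper name first_line_and_closed and the temporary file are assumptions made up for the demonstration; only File.open with a block comes from the answer.

```ruby
require 'tempfile'

# Hypothetical helper: returns the first line of the file at path (chomped)
# together with a flag saying whether the handle was closed afterwards.
def first_line_and_closed(path)
  handle = nil
  # File.open with a block returns the block's value and closes the file
  # when the block terminates.
  line = File.open(path) do |f|
    handle = f
    f.gets.to_s.chomp
  end
  [line, handle.closed?]
end

Tempfile.create('demo') do |tmp|
  tmp.puts 'hello'
  tmp.flush
  p first_line_and_closed(tmp.path) # => ["hello", true]
end
```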
Instead of using
3.times do |n|
  puts n
end
# >> 0
# >> 1
# >> 2
I'd probably use
puts (0..(3 - 1)).to_a * "\n"
# >> 0
# >> 1
# >> 2
just because I tend to think in Perl terms. It's another old habit.
I found 2 errors. Fixing those errors gives you desired output.
Error #1.
Your method is_numeric? always returns false. Even if your condition is true. The last line of the method is false and therefore the whole method ALWAYS returns false.
You can fix it in 2 steps.
Step #1:
if /[^0-9]/.match(obj) == nil
  true
else
  false
end
It's not good practice to return boolean literals from within a conditional. You can simplify it this way:
def is_numeric?(obj)
  /[^0-9]/.match(obj) == nil
end
or even better
def is_numeric?(obj)
  /[^0-9]/.match(obj).nil?
end
Error #2 is inside your read method. If you try to output the value of count after you read it from the file it gives you "10\n". That \n at the end messes you up.
To get rid of \n when you read from the file you could possibly use chomp. So then your reading line would be:
count = aFile.gets.chomp
and the rest works like magic
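Both errors can be seen in a few lines. The version of is_numeric? below is a hypothetical variant, not the code from either answer: it anchors the pattern with \A and \z and, unlike the original, also rejects nil and the empty string.

```ruby
# Hypothetical variant: true only if str consists solely of digits.
# \A and \z anchor the whole string, so a trailing "\n" is rejected.
def is_numeric?(str)
  str.is_a?(String) && str.match?(/\A\d+\z/)
end

raw = "10\n"             # what gets returns for the first line of the file
p is_numeric?(raw)       # => false, the newline is not a digit
p is_numeric?(raw.chomp) # => true
```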

reading from disk multiple times possibly cause bottleneck

I'm trying to find out where the bottleneck of a Ruby script is. I suspect it arises because the script parses thousands of lines and, for each one, checks whether a certain file is present on disk and, if so, reads its contents.
def sectionsearch(brand, season, video)
  mytab.trs.each_with_index do |row, i|
    # ...some code goes here...
    f = "modeldesc/" + brand.downcase + "/" + modelcode + ".html"
    if File.exist?(f)
      modeldesc = File.read(f)
    else
      modeldesc = ""
    end
    # ...more code here...
  end
end
Given that there are no more than 30 different "modelcode" files for thousands of records, I was looking for a different approach that reads all the contents of the folder before the each loop (since they are not going to change during execution).
Will this approach speed up my script, and is it the right way to implement this?
I would probably use a Hash created with a block, so that the file is checked and read only when an unknown key is looked up:
def sectionsearch(brand, season, video)
  modeldescrs = Hash.new do |cache, model|
    if File.exist?(model)
      cache[model] = File.read(model)
    else
      cache[model] = ''
    end
  end
  mytab.trs.each_with_index do |row, i|
    # ...some code goes here...
    f = "modeldesc/" + brand.downcase + "/" + modelcode + ".html"
    puts modeldescrs[f]
    # ...more code here...
  end
end
Then access modeldescrs[f] wherever you need the content (the puts above is just an example). If the key doesn't exist, the block is executed, which looks the file up and populates the cache. See http://www.ruby-doc.org/core-2.0/Hash.html for more on the block form of Hash.new.
Also you could make modeldescrs an instance variable if it needs to be saved.
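The caching behaviour can be checked in isolation. This is a minimal sketch: build_file_cache is a hypothetical helper name, and it simply packages the Hash.new block from the answer so it can be created once before the loop.

```ruby
# Hypothetical helper: a Hash whose default block reads each path at most once.
def build_file_cache
  Hash.new do |cache, path|
    # Runs only on a cache miss; the assignment stores the result so that
    # later lookups of the same path never touch the disk again.
    cache[path] = File.exist?(path) ? File.read(path) : ''
  end
end
```

With around 30 distinct files and thousands of rows, a cache like this would mean roughly 30 reads in total instead of one File.exist? and one File.read per row. Whether that removes the bottleneck is worth confirming by profiling.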

read file into an array excluding the commented out lines

I'm almost a Ruby-nOOb (I have just enough Ruby knowledge to write a basic .erb template or Puppet custom facts). My requirements look fairly simple, but I can't get my head around them.
I'm trying to write a .erb template that reads a file (with space-delimited lines) into an array and then handles each array element according to the requirements. This is what I've got so far:
fname = "webURI.txt"
def myArray()
  #if defined? $fname
  if File.exist?($fname) and File.file?($fname)
    IO.readlines($fname)
  end
end
myArray.each_index do |i|
  myLine = myArray[i].split(' ')
  puts myLine[0] +"\t=> "+ myLine.last
end
This works just fine, except (for obvious reasons) for lines that are commented out or blank. I also want to make sure that, when split (by space), a line has no more than two fields. Take a file like this:
# This is a COMMENT
#
# Puppet dashboard
puppet controller-all-local.example.co.uk:80
# Nagios monitoring
nagios controller-all-local.example.co.uk::80/nagios
tac talend-tac-local.example.co.uk:8080/org.talend.admin
mng console talend-mca-local.example.co.uk:8080/amc # Line with three fields
So, basically these two things I'd like to achieve:
Read the lines into an array, stripping off everything after the first #
Split each element and print a message if the number of fields is more than two
Any help would be greatly appreciated. Cheers!!
Update 25/02
Thanks guys for your help!!
The blank? approach doesn't work for me at all; it throws this error, and I fail to understand why:
undefined method `blank?' for "\n":String (NoMethodError)
The array myArray that I get actually looks something like this (using p instead of puts):
["\n", "puppet controller-all-local.example.co.uk:80\n", "\n", "\n", "nagios controller-all-local.example.co.uk::80/nagios\n", ..... \n"]
Hence, I had to do this to get around the problem:
$fname = "webURI.txt"
def myArray()
  if File.exist?($fname) and File.file?($fname)
    IO.readlines($fname).map { |arr| arr.gsub(/#.*/,'') }
  end
end
# remove blank lines
SSS = myArray.reject { |ln| ln.start_with?("\n") }
SSS.each_index do |i|
  myLine = SSS[i].split(' ')
  if myLine.length > 2
    puts "Too many arguments!!!"
  elsif myLine.length == 1
    puts "page"+ i.to_s + "\t=> " + myLine[0]
  else
    puts myLine[0] +"\t=> "+ myLine.last
  end
end
You are most welcome to improve the code. cheers!!
goodArray = myArray.reject do |line|
  line.start_with?('#') || line.split(' ').length > 2
end
This rejects every line that either starts with # or splits into more than two fields, leaving you an array of only the good lines.
Edit:
For inline comments you can then do
goodArray.map do |line|
  line.gsub(/#.*/, '')
end
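The two requirements (strip everything after the first #, then complain about lines with more than two fields) can be combined in one pass. This is only a sketch under assumptions: parse_uri_lines is a hypothetical name, and it assumes that a # never appears inside a legitimate field, such as a URL fragment.

```ruby
# Hypothetical helper: returns [entries, errors] for the given lines.
# entries holds the field arrays of valid lines; errors holds messages.
def parse_uri_lines(lines)
  entries = []
  errors = []
  lines.each_with_index do |raw, idx|
    line = raw.sub(/#.*/, '').strip # drop everything after the first '#'
    next if line.empty?             # skip blank and comment-only lines
    fields = line.split
    if fields.length > 2
      errors << "line #{idx + 1}: too many fields"
    else
      entries << fields
    end
  end
  [entries, errors]
end
```

It could be fed with IO.readlines($fname), as in the question. Stripping the comment before counting fields means a valid line followed by an inline comment is still accepted.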

recursive file list in Ruby

I'm new to Ruby (being a Java dev) and trying to implement a method (oh, sorry, a function) that would retrieve and yield all files in the subdirectories recursively.
I've implemented it as:
def file_list_recurse(dir)
  Dir.foreach(dir) do |f|
    next if f == '.' or f == '..'
    f = dir + '/' + f
    if File.directory? f
      file_list_recurse(File.absolute_path f) { |x| yield x }
    else
      file = File.new(f)
      yield file
    end
  end
end
My questions are:
Does File.new really OPEN a file? In Java new File("xxx") doesn't... If I need to yield some structure that I could query file info (ctime, size etc) from what would it be in Ruby?
{ |x| yield x } looks a little strange to me, is this OK to do yields from recursive functions like that, or is there some way to avoid it?
Is there any way to avoid checking for '.' and '..' on each iteration?
Is there a better way to implement this?
Thanks
PS:
the sample usage of my method is something like this:
curr_file = nil
file_list_recurse('.') do |file|
  curr_file = file if curr_file == nil or curr_file.ctime > file.ctime
end
puts curr_file.to_path + ' ' + curr_file.ctime.to_s
(that would get you the oldest file from the tree)
==========
So, thanks to @buruzaemon I found out the great Dir.glob function which saved me a couple of lines of code.
Also, thanks to @Casper I found out the File.stat method, which made my function run two times faster than with File.new
In the end my code is looking something like this:
i=0
curr_file = nil
Dir.glob('**/*', File::FNM_DOTMATCH) do |f|
  file = File.stat(f)
  next unless file.file?
  i += 1
  curr_file = [f, file] if curr_file == nil or curr_file[1].ctime > file.ctime
end
puts curr_file[0] + ' ' + curr_file[1].ctime.to_s
puts "total files #{i}"
=====
By default Dir.glob ignores file names starting with a dot (considered to be 'hidden' in *nix), so it's very important to add the second argument File::FNM_DOTMATCH
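The effect of the flag is easy to see side by side. A small sketch follows; list_files is a hypothetical helper, not part of the code above.

```ruby
# Hypothetical helper: returns [without_dotfiles, with_dotfiles], each a
# sorted list of regular files below dir, given relative to dir.
def list_files(dir)
  Dir.chdir(dir) do
    plain = Dir.glob('**/*').select { |f| File.file?(f) }
    # FNM_DOTMATCH lets the wildcards match names starting with a dot.
    dotted = Dir.glob('**/*', File::FNM_DOTMATCH).select { |f| File.file?(f) }
    [plain.sort, dotted.sort]
  end
end
```

For a directory containing a.txt and .hidden, the first list should contain only a.txt, while the second contains both.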
How about this?
puts Dir['**/*.*']
According to the docs, File.new does open the file. You might want to use File.stat instead, which gathers file-related stats into a queryable object. Note, however, that the stats are gathered when the File::Stat object is created, not when you call query methods such as ctime.
Example:
Dir['**/*'].select { |f| File.file?(f) }.map { |f| File.stat(f) }
this thing tells me to consider accepting an answer, I hope it wouldn't mind me answering it myself:
i=0
curr_file = nil
Dir.glob('**/*', File::FNM_DOTMATCH) do |f|
  file = File.stat(f)
  next unless file.file?
  i += 1
  curr_file = [f, file] if curr_file == nil or curr_file[1].ctime > file.ctime
end
puts curr_file[0] + ' ' + curr_file[1].ctime.to_s
puts "total files #{i}"
You could use the built-in Find module's find method.
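A rough sketch of what that might look like, using Find.find from the standard library. The wrapper each_regular_file is a hypothetical name; it yields each regular file's path together with its File::Stat, and returns an Enumerator when called without a block.

```ruby
require 'find'

# Hypothetical helper: walks dir recursively and yields [path, stat]
# for every regular file, including dotfiles.
def each_regular_file(dir)
  return enum_for(:each_regular_file, dir) unless block_given?
  Find.find(dir) do |path|
    stat = File.lstat(path) # lstat does not follow symlinks
    yield path, stat if stat.file?
  end
end

# Example: the file with the oldest ctime under the current directory.
path, stat = each_regular_file('.').min_by { |_p, s| s.ctime }
puts "#{path} #{stat.ctime}" if path
```

Find.find never yields '.' or '..' as child entries, so the explicit check from the question is not needed.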
If you are on Windows, see my answer to "How to find the file path of file that is not the current file in ruby" for a much faster (~26 times) way than the standard Ruby Dir. Even if you use mtime it's still going to be way faster.
If you use another OS you could use the same technique. I'm curious whether the gain would be as big, but I'm almost certain it would.

Ruby: Deleting last iterated item?

What I'm doing is this: I have one file as input and another as output. I choose a random line in the input, put it in the output, and then delete it from the input.
Now, I've iterated over the file and am on the line I want. I've copied it to the output file. Is there a way to delete it? I'm doing something like this:
for i in 0..number_of_lines_to_remove
  line = rand(lines_in_file-2) + 1 #not removing the first line
  counter = 0
  IO.foreach("input.csv", "r") { |current_line|
    if counter == line
      File.open("output.csv", "a") { |output|
        output.write(current_line)
      }
    end
    counter += 1
  }
end
So, I have current_line, but I'm not sure how to remove it from the source file.
Array#delete_at might do. Given an index, it removes the element at that index and returns it.
input.csv:
one,1
two,2
three,3
Program:
#!/usr/bin/ruby1.8
lines = File.readlines('/tmp/input.csv')
File.open('/tmp/output.csv', 'a') do |file|
  file.write(lines.delete_at(rand(lines.size)))
end
p lines # ["two,2\n", "three,3\n"]
output.csv:
one,1
Here is a randomline class. You create a new randomline object by passing it an input file name and an output file name. You can then call the deleterandom method on that object and pass it a number of lines to delete.
The data is stored internally in arrays as well as being written to file. Currently the output file is opened in append mode, so if you reuse the same file it just adds to the end; change the "a" to "w" if you want to start the file fresh each time.
class Randomline
  attr_accessor :inputarray, :outputarray
  def initialize(filein, fileout)
    @filename = filein
    @filein = File.open(filein,"r+")
    @fileoutput = File.open(fileout,"a")
    @inputarray = []
    @outputarray = []
    readin()
  end
  def readin()
    @filein.each do |line|
      @inputarray << line
    end
  end
  def deleterandom(numtodelete)
    numtodelete.times do |num|
      random = rand(@inputarray.size)
      @outputarray << inputarray[random]
      @fileoutput.puts inputarray[random]
      @inputarray.delete_at(random)
    end
    @filein = File.open(@filename,"w")
    @inputarray.each do |line|
      @filein.puts line
    end
  end
end
Here is an example of it being used:
a = Randomline.new("testin.csv","testout.csv")
a.deleterandom(3)
You have to re-write the source-file after removing a line otherwise the modifications won't stick as they're performed on a copy of the data.
Keep in mind that any operation which modifies a file in-place runs the risk of truncating the file if there's an error of any sort and the operation cannot complete.
It would be safer to use some kind of simple database for this kind of thing as libraries like SQLite and BDB have methods for ensuring data integrity, but if that's not an option, you just need to be careful when writing the new input file.
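One common way to be careful, sketched here as an assumption rather than something the answers above prescribe, is to write the remaining lines to a temporary file next to the source and rename it over the original only once the write has finished. The helper name move_random_line and the ".tmp" suffix are hypothetical.

```ruby
# Hypothetical helper: moves one random line (never the first, which is
# treated as a header) from src to dst. Returns the moved line, or nil if
# only the header is left.
def move_random_line(src, dst)
  lines = File.readlines(src)
  return nil if lines.size < 2
  picked = lines.delete_at(1 + rand(lines.size - 1))
  File.open(dst, 'a') { |out| out.write(picked) }
  # Write the remaining lines to a sibling file, then swap it into place.
  # A failure while writing leaves the original source file untouched.
  tmp = "#{src}.tmp"
  File.write(tmp, lines.join)
  File.rename(tmp, src)
  picked
end
```

Because the line is appended to the output before the source is replaced, a crash between the two steps would duplicate a line rather than lose it.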
