Ruby remove blank lines from a file which include spaces - ruby

I am trying to remove blank lines from a file.
My current code is:
def remove_blank_lines_from_file(file)
  File.write(file, File.read(file).gsub(/^$\n/, ''))
end
The above code removes only the empty lines, but I also want to remove the lines that contain only spaces.
How could I do it?

Since you load the whole file into memory anyway, this might be easier to read:
File.write(file, File.readlines(file).reject { |s| s.strip.empty? }.join)
This simply drops the lines that contain only whitespace.
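A minimal sketch of the same idea on an in-memory string, so it can be tried without touching a file. The helper name strip_blank_lines and the sample text are assumptions, not part of the original post:

```ruby
# Hypothetical helper: returns the text without empty or whitespace-only lines.
def strip_blank_lines(text)
  # String#lines keeps each line's terminator, so join restores the layout.
  text.lines.reject { |line| line.strip.empty? }.join
end

sample  = "first\n\n   \n\t\nsecond\n"
cleaned = strip_blank_lines(sample)
puts cleaned
```

The same reject block works on File.readlines, since both return lines with their newline still attached.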

Related

how to overwrite part of a line in a txt file with regex and .sub in ruby

I have the following layout in a txt file.
[item] label1: comment1 | label2: foo
I have the code below. The goal is to modify part of an existing line in text
def replace_info(item, bar)
  return "please create a file first" unless File.exist?('site_info.txt')
  IO.foreach('site_info.txt','a+') do |line|
    if line.include?(item)
      #regex should find the data from the whitespace after the colon all the way to the end.
      #this should be equivalent to foo
      foo_string = line.scan(/[^"label2: "]*\z/)
      line.sub(foo_string, bar)
    end
  end
end
Please advise. Perhaps my regex is off and .sub is correct, but I cannot overwrite the line.
Tiny problem: Your regular expression does not do what you think. /[^"label2: "]*\z/ means: any number of characters at the end of line that are not a, b, e, l, ", space, colon or 2 (see Character classes). And scan returns an array, which sub doesn't work with. But that doesn't really matter, because...
Small problem: line.sub(foo_string, bar) doesn't do anything. It returns a changed string, but you don't assign it to anything and it gets thrown away. line.sub!(foo_string, bar) would change line itself, but that leads us to...
Big problem: You cannot just change the read line and expect it to change in the file itself. It's like reading a book, thinking you could write a line better, and expecting it to change the book. The way to change a line in a text file is to read from one file and copy what you read to another. If you change a line between reading and writing, the newly written copy will be different. At the end, you can rename the new file to the old file (which will delete the old file and replace it atomically with the new one).
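The first two points can be checked in isolation. A small sketch using the sample line from the question; the lookbehind pattern is just one possible way to target the value, and the variable names are made up for illustration:

```ruby
line = "[item] label1: comment1 | label2: foo\n"

# The brackets form a character class, not a literal prefix,
# and scan returns an Array rather than a String.
scanned = line.scan(/[^"label2: "]*\z/)
puts scanned.class

# sub returns a new string and leaves the receiver untouched...
copy = line.sub(/(?<=label2: ).*/, "bar")

# ...while sub! modifies the receiver in place.
changed = line.dup
changed.sub!(/(?<=label2: ).*/, "bar")

puts line
puts copy
puts changed
```

Neither sub nor sub! touches the file on disk; they only change the string in memory.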
EDIT: Here's some code. First, I dislike IO.foreach, as I like to control the iteration myself (and IMO, IO.foreach is not as readable as IO#each_line). In the regular expression, I used a lookbehind to find the label without including it in the match, so I can replace just the value; I changed to \Z for a similar reason, to exclude the newline from the match. You should not be returning error messages from functions; that's what exceptions are for. I changed the simple include? to #start_with? because your item might be found elsewhere in the line, where we wouldn't want to trigger the change.
class FileNotFoundException < RuntimeError; end

def replace_info(item, bar)
  # check if file exists
  raise FileNotFoundException unless File.exist?('site_info.txt')
  # rewrite the file
  File.open('site_info.txt.bak', 'wt') do |w|
    File.open('site_info.txt', 'rt') do |r|
      r.each_line do |line|
        if line.start_with?("[#{item}]")
          line.sub!(/(?<=label2: ).*?\Z/, bar)
        end
        w.write(line)
      end
    end
  end
  # replace the old file
  File.rename('site_info.txt.bak', 'site_info.txt')
end

replace_info("item", "bar")

ruby match string or space/tab at the beginning of a line and insert uniq lines to a file

This is my code:
File.open(file_name) do |file|
  file.each_line do |line|;
    if line =~ (/SAPK/) || (line =~ /^\t/ and tabs == true) || (line =~ /^ / and spaces == true)
      file = File.open("./1.log", "a"); puts "found a line #{line}"; file.write("#{line}".lstrip!)
    end
  end
end
File.open("./2.log", "a") { |file| file.puts File.readlines("./1.log").uniq }
I want to insert all the lines that match a specific string, or that start with a tab or a space, into the file 1.log. The lines should not keep the space/tab at the beginning, so I strip it.
I want to get the unique lines in 1.log and write them to 2.log.
It would be great if someone could go over the code and tell me if something is not correct.
When using files in Ruby, what is the difference between the w+ and a modes?
I know:
w+ - Create an empty file for both reading and writing.
a - Append to a file. The file is created if it does not exist.
But both options append to the file. I thought w+ should behave like > instead of >>, so I guess w+ also behaves like >>?
Thanks !!
There's a lot of confusion in this code and it isn't helped by your habit of jamming things on a single line for no reason. Do try and keep your code clean, as the functionality should be obvious. There's also a lot of quirky anti-patterns like stringifying strings and testing booleans vs booleans which you should really avoid doing.
One thing you'll want to do is employ Tempfile for those situations where you need an intermediate file.
Here's a reworked version that's cleaned up:
require 'tempfile'

Tempfile.open do |temp|
  File.open(file_name) do |input|
    input.each_line do |line|
      if line.match(/SAPK/) || (line.match(/^\t/) and tabs) || (line.match(/^ /) and spaces)
        puts "found a line #{line}"
        # lstrip always returns a string; lstrip! returns nil when nothing was stripped
        temp.write(line.lstrip)
      end
    end
  end
  File.open("./2.log", "w+") do |file|
    # Rewind the temporary file to read data back
    temp.rewind
    # Join the unique lines; writing the array itself would write its inspect form
    file.write(temp.readlines.uniq.join)
  end
end
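Now a and w+ are not the same thing: w+ truncates an existing file (or creates a new one) and opens it for both reading and writing, like > in the shell, while a opens the file for writing and puts every write at the end, like >>. If w+ seemed to append, it is most likely because the file was opened once and then written to repeatedly; the truncation only happens when the file is opened. Pick the mode that matches your intent and use it consistently or your code will be confusing.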
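One way to see how the two modes actually behave is to try them on a scratch file. A minimal sketch, assuming a throwaway file (the name modes.log is hypothetical) inside a temporary directory:

```ruby
require 'tmpdir'

# Dir.mktmpdir with a block removes the directory afterwards
# and returns the value of the block.
after_append, after_w_plus = Dir.mktmpdir do |dir|
  path = File.join(dir, 'modes.log') # hypothetical scratch file
  File.write(path, "old\n")

  # "a" keeps the existing content and writes at the end
  File.open(path, 'a') { |f| f.puts 'appended' }
  appended = File.read(path)

  # "w+" truncates the file when it is opened
  File.open(path, 'w+') { |f| f.puts 'fresh' }
  truncated = File.read(path)

  [appended, truncated]
end

puts after_append
puts after_w_plus
```

Here the a mode keeps the existing content and adds to it, while w+ empties the file at the moment it is opened.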
My criticism over things like x == true is because something that narrowly specific usually means that x could take on a multitude of values and true is one particular case we're trying to handle, something that implies that we should be aware it might be false and many other things. It's a red herring and will only invite questions.

Removing trailing commas from csv file in ruby

In a csv file, I have trailing commas that I want to get rid of, but the number of commas varies from row to row. So I cannot use gsub to remove them. Does anyone know a way to read a csv file, remove any trailing commas from each row, and rewrite to the same csv file?
You can read the file line by line and sub all trailing ,s. You cannot edit the file in place, so the best thing to do is to create a Tempfile and replace your csv file with it when done. Here:
require 'fileutils'
require 'tempfile'

t_file = Tempfile.new('temp.txt')
File.open("/path/to/csv", 'r') do |f|
  # ,+ matches one or more trailing commas, since their number varies
  f.each_line { |line| t_file.puts line.chomp.sub(/,+$/, '') }
end
t_file.close
FileUtils.mv(t_file.path, "/path/to/csv")
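Since the number of trailing commas varies, the pattern needs a quantifier. A small sketch of just the per-line step, assuming a hypothetical helper name strip_trailing_commas and made-up sample rows:

```ruby
# Hypothetical helper: drops the line ending and every comma at the end of the row.
def strip_trailing_commas(line)
  # ,+ matches one or more commas; \z anchors at the very end of the string
  line.chomp.sub(/,+\z/, '')
end

rows    = ["a,b,,,\n", "c,d\n", "e,,f,\n"]
cleaned = rows.map { |row| strip_trailing_commas(row) }
puts cleaned
```

Commas in the middle of a row (empty fields) are left alone, because only the run anchored at the end is matched.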

Issue copying file into new file gsub with regex, variable and string?

I'm struggling with a script to target specific XML files in a directory and rename them as copies with a different name.
I put in the puts statements for debugging, and, from what I can tell, everything looks OK until the FileUtils.cp line. I tried this with simpler text and it worked, but my overly complicated cp(file, file.gsub()) seems to be causing problems that I can't figure out.
def piano_treatment(cats)
  FileUtils.chdir('12Piano')
  src = Dir.glob('*.xml')
  src.each do |file|
    puts file
    cats.each do |text|
      puts text
      if file =~ /#{text}--\d\d/
        puts "Match Found!!"
        puts FileUtils.pwd
        FileUtils.cp(file, file.gsub!(/#{text}--\d\d/, "#{text}--\d\dBass "))
      end
    end
  end
end
piano_treatment(cats)
I get the following output in Terminal:
12Piano--05Free Stuff--11Test.xml
05Free Stuff
Match Found!!
/Users/mbp/Desktop/Sibelius_Export/12Piano
cp 12Piano--05Free Stuff--ddBass Test.xml 12Piano--05Free Stuff--ddBass Test.xml
/Users/mbp/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/fileutils.rb:1551:in `stat': No such file or directory - 12Piano--05Free Stuff--ddBass Test.xml (Errno::ENOENT)
Why is \d\d showing up as "dd" when it should actually be numbers? Is this a single vs. double quote issue? Both yield errors.
Any suggestions are appreciated. Thanks.
EDIT One additional change was needed to this code. The FileUtils.chdir('12Piano') would change the directory for the first iteration of the loop, but it would revert to the source directory after that. Instead I did this:
def piano_treatment(cats)
src = Dir.glob('12Piano/*.xml')
which sets the match path for the whole method.
Your replacement string is not a regex, so \d has no special meaning, but is just a literal string. You need to specify a group in your regex, and then you can use the captured group in your replacement string:
FileUtils.cp(file, file.gsub(/#{text}--(\d\d)/, "#{text}--\\1Bass "))
The parentheses in the regex form the group, which can be used (by number) in the replacement string: \1 for the first group, \2 for the second, etc. \0 refers to the entire regex match.
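A short sketch of the difference, using the file name from the output in the question. Regexp.escape is an addition of mine (it guards against special characters in text) and is not part of the original answer:

```ruby
file = "12Piano--05Free Stuff--11Test.xml"
text = "05Free Stuff"

# In a double-quoted string, \d is not a known escape, so it collapses to "d".
broken_suffix = "--\d\dBass "

# With a capture group, \\1 in the replacement inserts the digits that matched.
renamed = file.gsub(/#{Regexp.escape(text)}--(\d\d)/, "#{text}--\\1Bass ")

puts broken_suffix
puts renamed
```

Using gsub rather than gsub! also keeps file unchanged, so the source path passed to FileUtils.cp still points at the existing file.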
Update
Replaced gsub!() with gsub() and escaped the backslash in the replacement string (to treat \1 as the capture group, not a literal character... Doh!).

ruby each_line reads line break too?

I'm trying to read data from a text file and join it with a post string. When there's only one line in the file, it works fine. But with 2 lines, my request fails. Is each_line reading the line break? How can I correct it?
File.open('sfzh.txt','r'){|f|
  f.each_line{|row|
    send(row)
  }
}
I worked around this issue with split and an extra delimiter, but it just looks ugly.
Yes, each_line includes line breaks. But you can strip them easily using chomp:
File.foreach('test1.rb') do |line|
  send line.chomp
end
Another way is to map strip onto each line as it is returned. To read a file line by line, strip whitespace, and do something with each line, you can do the following:
File.readlines("path to file").map(&:strip).each do |line|
  # do something with line
end
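To see what each_line hands back without needing a file, here is a small sketch that uses StringIO as a stand-in for the opened file; the sample content is made up:

```ruby
require 'stringio'

io = StringIO.new("first\nsecond\n") # stands in for the opened file

# each_line yields every line with its terminator still attached
raw = io.each_line.to_a

io.rewind
# chomp removes only the trailing line break
chomped = io.each_line.map(&:chomp)

p raw
p chomped
```

chomp only drops the line ending, while strip also removes leading and trailing spaces and tabs, so pick whichever matches the data.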
