Regular Expression matching in ruby, checking for white space

I am trying to check a file for whitespace at the beginning of each line. I want the leading whitespace to be consistent: every indented line should start with spaces, or every indented line should start with tabs. If one line begins with a space and another line begins with a tab, I want to print a warning. I wrote the code below, but it isn't working for me.
file = File.open("file_tobe_checked","r") #I'm opening the file to be checked
while (line = file.gets)
a=(line =~ /^ /).nil?
b=(line =~/^\t/).nil?
if a==false && b==false
print "The white spaces at the beginning of each line are not consistent"
end
end
file.close

This is one solution where you don't read the file or the extracted lines array twice:
#!/usr/bin/env ruby
file = ARGV.shift
tabs = spaces = false
File.readlines(file).each do |line|
line =~ /^\t/ and tabs = true
line =~ /^ / and spaces = true
if spaces and tabs
puts "The white spaces at the beginning of each line are not consistent."
break
end
end
Usage:
ruby script.rb file_to_be_checked
And it may be more efficient to compare lines with these:
line[0] == "\t" and tabs = true
line[0] == ' ' and spaces = true
You can also use each_line instead of readlines. readlines loads the whole file into an array before iterating, whereas each_line reads and yields one line at a time:
File.open(file).each_line do |line|
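The same check can be wrapped in a small method that accepts any enumerable of lines, so it works with an array as well as with a line-by-line enumerator. This is only a sketch; the method name mixed_indentation? is made up for illustration and is not part of the answer above.

```ruby
# Hypothetical helper: returns true as soon as both a space-indented and a
# tab-indented line have been seen. Accepts anything that responds to #each.
def mixed_indentation?(lines)
  tabs = spaces = false
  lines.each do |line|
    tabs   = true if line.start_with?("\t")
    spaces = true if line.start_with?(" ")
    return true if tabs && spaces   # stop early, like the break above
  end
  false
end

sample = ["def foo\n", "  bar\n", "\tbaz\n", "end\n"]
if mixed_indentation?(sample)
  puts "The white spaces at the beginning of each line are not consistent."
end
```

For a real file, something like mixed_indentation?(File.foreach(file)) should work, since File.foreach without a block returns an enumerator over the lines.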

How important is it that you check for the whitespace (and warn/notify accordingly)? If you just want to correct the whitespace, .strip takes care of errant whitespace. Note that it removes all leading and trailing whitespace, including the indentation and the line ending, so write each line back with puts rather than write:
lines_array = File.readlines(file_to_be_checked)
File.open(file_to_be_checked, "w") do |f|
  lines_array.each do |line|
    # Modify the line as you need and write the result;
    # puts restores the newline that strip removed
    f.puts(line.strip)
  end
end

I assume that no line can begin with one or more spaces followed by a tab, or vice-versa.
Merely concluding that the file contains one or more inconsistencies is not much help in fixing them. Instead, you might report the line number of the first line that begins with a space or tab, and then the line numbers of all subsequent lines that begin with the other character. You could do that as follows (sorry, untested).
def check_file(fname)
  file = File.open(fname, "r")
  line_no = 0
  first_white = nil
  # Find the first line that begins with a space or a tab
  until file.eof?
    line_no += 1
    first_white = file.gets[/\A[ \t]/]
    break if first_white
  end
  if first_white
    puts "Line #{line_no} begins with a #{first_white == "\t" ? "tab" : "space"}"
    # Report every later line that begins with the other character
    until file.eof?
      line_no += 1
      preface = file.gets[/\A[ \t]/]
      puts "Line #{line_no} begins with a #{preface == "\t" ? "tab" : "space"}" if preface && preface != first_white
    end
  end
  file.close
end
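If you would rather collect the line numbers than print them, the same idea can be sketched as a method that returns its findings. The name indentation_mismatches and the shape of the return value are assumptions made for this example only.

```ruby
# Hypothetical helper: returns [first_line_no, first_char, mismatched_line_nos].
# Line numbers are 1-based; lines that do not start with a space or a tab
# are ignored.
def indentation_mismatches(lines)
  first_no = first_char = nil
  mismatches = []
  lines.each_with_index do |line, index|
    char = line[/\A[ \t]/]            # leading space or tab, or nil
    next unless char
    if first_char.nil?
      first_no, first_char = index + 1, char
    elsif char != first_char
      mismatches << index + 1
    end
  end
  [first_no, first_char, mismatches]
end

sample = ["a\n", "  b\n", "\tc\n", "  d\n", "\te\n"]
first_no, first_char, bad = indentation_mismatches(sample)
puts "Line #{first_no} begins with a #{first_char == "\t" ? 'tab' : 'space'}"
puts "Lines that begin with the other character: #{bad.join(', ')}"
```

Because it only relies on each_with_index, it should also accept File.foreach(fname) in place of the array.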

Related

How can I read lines from a file into an array that are not comments or empty?

I have a text file where a line may be either blank, a comment (begins with //) or an instruction (i.e. anything not blank or a comment). For instance:
Hiya #{prefs("interlocutor")}!
// Say morning appropriately or hi otherwise
#{(0..11).include?(Time.now.hour) ? 'Morning' : 'Hi'} #{prefs("interlocutor")}
I'm trying to read the contents of the file into an array where only the instruction lines are included (i.e. skip the blank lines and comments). I have this code (which works):
path = Pathname.new("test.txt")
# Get each line from the file and reject comment lines
lines = path.readlines.reject{ |line| line.start_with?("//") }.map{ |line| line.chomp }
# Reject blank lines
lines = lines.reject{ |line| line.length == 0 }
Is there a more efficient or elegant way of doing it? Thanks.

start_with? takes multiple arguments, so you can do
File.open("test.txt").each_line.reject{|line| line.start_with?("//", "\n")}.map(&:chomp)
in one go.

I would do it like so, using regex:
def read_commands(path)
File.read(path).split("\n").reduce([]) do |results, line|
case line
when /^\s*\/\/.*$/ # check for comments
when /^\s*$/ # check for empty lines
else
results.push line
end
results
end
end
To break down the regexes:
comments_regex = %r{
^ # beginning of line
\s* # any number of whitespaces
\/\/ # the // sequence
.* # any number of anything
$ # end of line
}x
empty_lines_regex = %r{
^ # beginning of line
\s* # any number of whitespaces
$ # end of line
}x
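The two patterns can also be folded into a single one, which keeps the filtering to one reject call. This is a sketch rather than a drop-in replacement; the method name instruction_lines is hypothetical, and match? assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper: keeps only instruction lines from any enumerable of lines.
def instruction_lines(lines)
  # Matches a line that is blank, or whose first non-whitespace
  # characters are "//" (a comment).
  skip = %r{\A\s*(//.*)?\z}
  lines.map(&:chomp).reject { |line| line.match?(skip) }
end

sample = ["Hiya!\n", "\n", "// a comment\n", "   \n", "  // indented\n", "Bye\n"]
p instruction_lines(sample)
```

A line such as "see http://example.com" is kept, because the // is not the first non-whitespace text on the line.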

Deleting lines containing specific words in text file in Ruby

I have a text file like this:
User accounts for \\AGGREP-1
-------------------------------------------------------------------------------
Administrator users grzesieklocal
Guest scom SUPPORT_8855
The command completed successfully.
The first line is empty. I want to delete every empty line in this file and every line containing the words "User accounts for", "-------", or "The command". I want to keep only the lines containing users. I can't simply delete the first 4 lines and the last one, because some systems have more users and the file will contain more lines.
I load file using
a = IO.readlines("test.txt")
Is there any way to delete lines containing specific words?

Solution
This structure reads the file line by line and writes a new file directly:
def unwanted?(line)
line.strip.empty? ||
line.include?('User accounts') ||
line.include?('-------------') ||
line.include?('The command completed')
end
File.open('just_users.txt', 'w+') do |out|
File.foreach('test.txt') do |line|
out.puts line unless unwanted?(line)
end
end
If you're familiar with regexp, you could use :
def unwanted?(line)
line =~ /^(User accounts|------------|The command completed|\s*$)/
end
Warning from your code
The message warning: string literal in condition appears when you try to use :
string = "nothing"
if string.include? "a" or "b"
puts "FOUND!"
end
It outputs :
parse_text.rb:16: warning: string literal in condition
FOUND!
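That happens because string.include? "a" or "b" is parsed as (string.include?("a")) or "b", and the string literal "b" is always truthy, so the condition can never be false. It should be written: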
string = 'nothing'
if string.include?('a') || string.include?('b')
puts "FOUND!"
end
See this question for more info.
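When there are more than two substrings to test, chaining || gets long. One possible alternative, sketched here with a made-up variable name needles, is Enumerable#any?:

```ruby
string = "nothing"
needles = ["a", "b"]   # substrings to look for (illustrative values)

# true if at least one of the needles occurs in the string
found = needles.any? { |needle| string.include?(needle) }
puts "FOUND!" if found
```

With these values nothing is printed, since "nothing" contains neither "a" nor "b".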

IO::readlines returns an array, so you could use Array#select to select just the lines you need. Bear in mind that this means that your whole input file will be in memory, which might be a problem, if the file is really large.
An alternative approach would be to use IO::foreach, which processes one line at a time:
selected_lines = []
IO.foreach('test.txt') { |line| selected_lines << line if line_matches_your_requirements }
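Putting the pieces together, here is a sketch that filters the sample listing from the question and then splits the remaining lines into individual names. It assumes that user names never contain spaces, which the question does not confirm; StringIO stands in for the real file so the example runs on its own.

```ruby
require "stringio"

# Lines to drop: the header, the dashed rule, the footer, and blank lines.
unwanted = /\A(User accounts|-{5,}|The command completed|\s*\z)/

# Stand-in for File.foreach('test.txt'); the single-quoted heredoc keeps
# the backslashes exactly as written.
input = StringIO.new(<<~'TEXT')

  User accounts for \\AGGREP-1
  -------------------------------------------------------------------------------
  Administrator users grzesieklocal
  Guest scom SUPPORT_8855
  The command completed successfully.
TEXT

users = input.each_line
             .reject { |line| line.match?(unwanted) }
             .flat_map(&:split)   # assumption: names are separated by whitespace
p users
```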

Replace whole line with sub-string in a text file - Ruby

New to ruby here!
How do I replace a whole line that contains a specific string in a text file using Ruby?
Example: I want to remove the whole line that contains "DB_URL" and put something like "DB_CON=jdbc:mysql:replication://master,slave1,slave2,slave3/test" in its place.
DB_URL=jdbc:oracle:thin:#localhost:TEST
DB_USERNAME=USER
DB_PASSWORD=PASSWORD

Here is your solution.
file_data = ""
word = 'Word you want to match in line'
replacement = 'line you want to set in replacement'
# Build the new contents, replacing every line that contains the word
IO.foreach('path/to/file.txt') do |line|
  file_data += line.gsub(/^.*#{Regexp.quote(word)}.*$/, replacement)
end
puts file_data
# Write the result back to the same file
File.open('path/to/file.txt', 'w') do |f|
  f.write file_data
end
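Applied to the example from the question, the same approach might look like the sketch below. The method name replace_lines is hypothetical, and it works on a string so that it can be tried without touching a file.

```ruby
# Hypothetical helper: returns text with every line containing `word`
# replaced by `replacement`. Line endings are preserved.
def replace_lines(text, word, replacement)
  text.each_line.map { |line|
    # The block form of sub inserts the replacement literally,
    # without interpreting backslashes in it.
    line.include?(word) ? line.sub(/[^\r\n]*/) { replacement } : line
  }.join
end

config = <<~'CONF'
  DB_URL=jdbc:oracle:thin:#localhost:TEST
  DB_USERNAME=USER
  DB_PASSWORD=PASSWORD
CONF

puts replace_lines(config, "DB_URL",
                   "DB_CON=jdbc:mysql:replication://master,slave1,slave2,slave3/test")
```

To update a file in place, one option is File.write(path, replace_lines(File.read(path), word, replacement)), which reads the whole file into memory first.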

Here is my attempt :
file.txt
First line
Second line
foo
bar
baz foo
Last line
test.rb
f = File.open("file.txt", "r")
a = f.map do |l|
(l.include? 'foo') ? "replacing string\n" : l # Please note the double quotes
end
p a.join('')
Output
$ ruby test.rb
"First line\nSecond line\nreplacing string\nbar\nreplacing string\nLast line"
I added the comment # Please note the double quotes because escape sequences are not interpreted inside single quotes: '\n' is a literal backslash followed by n, not a newline. Also, think about the last line of your file: if the original file has no trailing newline and the last line gets replaced, the result will end with a \n that was not there before. If you don't want that, you could do something like:
f = File.open("file.txt", "r")
a = f.map do |l|
(l.include? 'foo') ? "replacing string\n" : l
end
a[-1] = a[-1][0..-2] if a[-1] == "replacing string\n"
p a.join('')

reading text file lines in ruby

I would like to scan each line in a text file, EXCEPT the first line.
I would usually do:
while line = file.gets do
...
...etc
end
but line = file.gets reads EVERY single line starting from the first.
How do I read from the second line onwards?

Why not simply call file.gets once and discard the result:
file.gets
while line = file.gets
# code here
end

I would do it in a simple fashion:
IO.readlines('filename').drop(1).each do |line| # drop the first array element
# do any proc here
end

Do you actually want to avoid reading the first line, or just avoid doing something with it? If you are OK with reading the line but want to skip processing it, you can use lineno to ignore it as follows:
f = File.new "/tmp/xx"
while line = f.gets do
puts line unless f.lineno == 1
end
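Another way to skip only the first line while still reading line by line is each_line.with_index. In this sketch, StringIO stands in for an open File (both respond to each_line), and the sample contents are invented.

```ruby
require "stringio"

# Stand-in for File.open("filename"); the contents are illustrative.
io = StringIO.new("header\nfirst\nsecond\n")

body = []
io.each_line.with_index do |line, index|
  next if index.zero?   # skip the first line only
  body << line.chomp
end
p body
```

Unlike readlines followed by drop(1), this does not build an array of every line before the loop starts.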

Can using the ruby flip-flop as a filter be made less kludgy?

In order to get part of text, I'm using a true if kludge in front of a flip-flop:
desired_portion_lines = text.each_line.find_all do |line|
true if line =~ /start_regex/ .. line =~ /finish_regex/
end
desired_portion = desired_portion_lines.join
If I remove the true if bit, it complains
bad value for range (ArgumentError)
Is it possible to make it less kludgy, or should I merely do
desired_portion_lines = ""
text.each_line do |line|
desired_portion_lines << line if line =~ /start_regex/ .. line =~ /finish_regex/
end
Or is there a better approach that doesn't use enumeration?

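If you are doing it line by line, my preference is a simple flag that is set to false before the loop. Note that 0 is truthy in Ruby, so the flag has to be false or nil to switch printing off:
printing = false if line =~ /finish_regex/
printing = true if line =~ /start_regex/
puts line if printing
If you have everything in one string, I would use split: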
mystring.split(/finish_regex/).each do |item|
if item[/start_regex/]
puts item.split(/start_regex/)[-1]
end
end

I think
desired_portion_lines = ""
text.each_line do |line|
desired_portion_lines << line if line =~ /start_regex/ .. line =~ /finish_regex/
end
is perfectly acceptable. The .. operator is very powerful, but not used by a lot of people, probably because they don't understand what it does. Possibly it looks weird or awkward to you because you're not used to using it, but it'll grow on you. It's very common in Perl when dealing with ranges of lines in text files, which is where I first encountered it, and eventually was using it a lot.
The only thing I'd do differently is add some parentheses to visually separate the logical tests from each other, and from the rest of the line:
desired_portion_lines = ""
text.each_line do |line|
desired_portion_lines << line if ( (line =~ /start_regex/) .. (line =~ /finish_regex/) )
end
Ruby (and Perl) coders seem to abhor using parentheses, but I consider them useful for visually separating the logic tests. For me it's a readability and, by extension, a maintenance thing.
The only other thing I can think of that might help, would be to change desired_portion_lines to an array, and push your selected lines onto it. Currently, using desired_portion_lines << line appends to the string, mutating it each time. It might be faster pushing on the array then joining its elements afterward to build your string.
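Back to the first example. The true if cannot simply be dropped: Ruby only treats .. as a flip-flop inside a conditional, and anywhere else it tries to build a Range, which is where the ArgumentError comes from. You can still chain the join onto the block to get it down to one statement:
desired_portion = text.each_line.find_all { |line| true if line =~ /start_regex/ .. line =~ /finish_regex/ }.join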
The only downside to iterating over all lines in a file using the flip-flop, is that if the start-pattern can occur multiple times, you'll get each found block added to desired_portion.
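To see that behaviour concretely, here is a small sketch with invented sample text in which the start pattern occurs twice. BEGIN and END are placeholder markers, not anything from the question.

```ruby
text = <<~TEXT
  intro
  BEGIN
  one
  END
  middle
  BEGIN
  two
  END
  outro
TEXT

# The flip-flop turns on at a BEGIN line and off after the next END line,
# so both delimited blocks are selected.
selected = text.each_line.select do |line|
  true if line =~ /BEGIN/ .. line =~ /END/
end
print selected.join
```

If only the first block is wanted, one option is to stop iterating once the finish pattern has matched for the first time.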

You can save three characters by replacing true if with !!() (with the flip flop belonging in between the parentheses).
