Deleting lines containing specific words in text file in Ruby - ruby

I have a text file like this:
User accounts for \\AGGREP-1
-------------------------------------------------------------------------------
Administrator users grzesieklocal
Guest scom SUPPORT_8855
The command completed successfully.
The first line is empty. I want to delete every empty line in this file, and every line containing the words "User accounts for", "-------", or "The command", so that only the lines containing users remain. I can't simply delete the first 4 lines and the last one, because some systems have more users and the file will contain more lines.
I load file using
a = IO.readlines("test.txt")
Is there any way to delete lines containing specific words?

Solution
This structure reads the file line by line and writes a new file directly:
def unwanted?(line)
line.strip.empty? ||
line.include?('User accounts') ||
line.include?('-------------') ||
line.include?('The command completed')
end
File.open('just_users.txt', 'w+') do |out|
File.foreach('test.txt') do |line|
out.puts line unless unwanted?(line)
end
end
If you're familiar with regexp, you could use :
def unwanted?(line)
line =~ /^(User accounts|------------|The command completed|\s*$)/
end
Warning from your code
The message warning: string literal in condition appears when you try to use :
string = "nothing"
if string.include? "a" or "b"
puts "FOUND!"
end
It outputs :
parse_text.rb:16: warning: string literal in condition
FOUND!
Because it should be written :
string = 'nothing'
if string.include?('a') || string.include?('b')
puts "FOUND!"
end
See this question for more info.
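When there are more than two substrings to test, chaining || gets long. A minimal sketch of one alternative, using Array#any?; the helper name contains_any? is made up for illustration and is not part of the original answer:

```ruby
# Hypothetical helper: true if +string+ contains at least one of +needles+.
def contains_any?(string, needles)
  needles.any? { |needle| string.include?(needle) }
end

# "nothing" contains neither "a" nor "b", so nothing is printed here,
# unlike the `include? "a" or "b"` version, which always prints.
puts 'FOUND!' if contains_any?('nothing', %w[a b])
```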

IO::readlines returns an array, so you could use Array#select to select just the lines you need. Bear in mind that this means that your whole input file will be in memory, which might be a problem, if the file is really large.
An alternative approach would be to use IO::foreach, which processes one line at a time:
selected_lines = []
IO.foreach('test.txt') { |line| selected_lines << line if line_matches_your_requirements }
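As a rough sketch of the in-memory approach, Array#reject (the inverse of Array#select) can drop the unwanted lines. The array below stands in for the result of IO.readlines("test.txt"), and the constant name UNWANTED is an assumption made for this example:

```ruby
# Sample lines standing in for IO.readlines("test.txt") (assumed data).
lines = [
  "\n",
  "User accounts for \\\\AGGREP-1\n",
  "-------------------------------------------------------------------------------\n",
  "Administrator users grzesieklocal\n",
  "Guest scom SUPPORT_8855\n",
  "The command completed successfully.\n"
]

# Matches header, separator, footer, and blank lines.
UNWANTED = /\A(?:User accounts for|-------|The command|\s*\z)/

users = lines.reject { |line| line.match?(UNWANTED) }
puts users
```

This keeps only the two lines that list users. For a very large file, the File.foreach version above avoids holding every line in memory.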

Related

Reading specific line into an array - ruby

Have a txt file with the following:
Anders Hansen;87442355;11;87
Jens Hansen;22338843;23;11
Nanna Kvist;25233255;24;84
I would like to search the file for a specific name taken from user input, then save that line into an array, split on ";". I can't get it to work, though. This is my code:
user1 = []
puts "Start by entering the full name of user 1: "
input = gets.chomp
File.open("userregister.txt") do |f|
f.each_line { |line|
if line =~ input then do |line|
user1 << line.split(';').map
=~ in Ruby tries to match a string against a regex (or vice versa). Here, you use it with two strings, which raises an error:
'foo' =~ 'bar' # => TypeError: type mismatch: String given
There are more appropriate String methods to use instead. In your case, #start_with? does the job. If you wanted to check whether the input appears anywhere in the line as a substring (not necessarily at the beginning), you can use #include?.
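A minimal sketch of the #start_with? approach, using the sample data from the question. The in-memory array and the hard-coded input are assumptions made to keep the example self-contained; in the real script the lines would come from the file and the name from gets.chomp:

```ruby
# Sample data from the question; in the real script these lines would come
# from File.foreach("userregister.txt").
lines = [
  "Anders Hansen;87442355;11;87\n",
  "Jens Hansen;22338843;23;11\n",
  "Nanna Kvist;25233255;24;84\n"
]

input = 'Jens Hansen' # stands in for gets.chomp

# Appending ';' keeps "Jens Hansen" from also matching "Jens Hansenberg".
match = lines.find { |line| line.start_with?("#{input};") }
user1 = match ? match.chomp.split(';') : []

p user1
```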
In case you actually wanted to take a regex as a user input (generally a bad idea), you can convert it from string to regex:
line =~ /#{input}/
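If the input is meant to be matched literally inside a regex, Regexp.escape neutralises any metacharacters it contains. A small sketch, with a made-up input value:

```ruby
input = 'a.c' # hypothetical user input containing a regex metacharacter

unescaped = /#{input}/
escaped   = /#{Regexp.escape(input)}/

p('abc' =~ unescaped) # "." matches any character, so "abc" matches
p('abc' =~ escaped)   # only a literal "a.c" matches, so this is nil
p('a.c' =~ escaped)   # the literal text matches at index 0
```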
Looking at the file format, I would actually use Ruby's CSV class. By setting the column separator to ;, you get an array for each row.
require 'csv'
input = gets.chomp
CSV.foreach('userregister.txt', col_sep: ';') do |row|
if row[0].downcase == input.downcase
# Do stuff with row[1..-1]
end
end

Regular Expression matching in ruby, checking for white space

I am trying to check a file for whitespace at the beginning of each line. I want the whitespace at the beginning of the lines to be consistent: all start with spaces or all start with tabs. I wrote the code below but it isn't working for me. If one line begins with a space and another line begins with a tab, it should print a warning or something.
file = File.open("file_tobe_checked","r") #I'm opening the file to be checked
while (line = file.gets)
a=(line =~ /^ /).nil?
b=(line =~/^\t/).nil?
if a==false && b==false
print "The white spaces at the beginning of each line are not consistent"
end
end
file.close
This is one solution where you don't read the file or the extracted lines array twice:
#!/usr/bin/env ruby
file = ARGV.shift
tabs = spaces = false
File.readlines(file).each do |line|
line =~ /^\t/ and tabs = true
line =~ /^ / and spaces = true
if spaces and tabs
puts "The white spaces at the beginning of each line are not consistent."
break
end
end
Usage:
ruby script.rb file_to_be_checked
And it may be more efficient to compare lines with these:
line[0] == "\t" and tabs = true
line[0] == ' ' and spaces = true
You may also prefer each_line over readlines: each_line reads the file one line at a time instead of loading all the lines at once:
File.open(file).each_line do |line|
How important is it that you check for the whitespace (and warn/notify accordingly)? If you are aiming to just correct the whitespace, .strip is great at taking care of errant whitespace.
lines_array = File.readlines(file_to_be_checked)
File.open(file_to_be_checked, "w") do |f|
lines_array.each do |line|
# Modify the line as you need and write the result
f.puts(line.strip) # puts re-adds the newline that strip removes
end
end
I assume that no line can begin with one or more spaces followed by a tab, or vice-versa.
To merely conclude that there are one or more inconsistencies within the file is not very helpful in dealing with the problem. Instead you might consider giving the line number of the first line that begins with a space or tab, then giving the line numbers of all subsequent lines that begin with a space or tab that does not match the first line found with such. You could do that as follows (sorry, untested).
def check_file(fname)
  first_white = nil
  # Line numbers are 1-based; only a leading space or tab counts.
  File.foreach(fname).with_index(1) do |line, line_no|
    preface = line[/\A[ \t]/]
    next unless preface
    kind = preface == "\t" ? "tab" : "space"
    if first_white.nil?
      # First line that begins with whitespace: remember which kind.
      first_white = preface
      puts "Line #{line_no} begins with a #{kind}"
    elsif preface != first_white
      # Later line whose leading whitespace differs from the first one found.
      puts "Line #{line_no} begins with a #{kind}"
    end
  end
end
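A sketch of the same idea that collects the mismatching line numbers instead of printing them, which makes it easier to test. The helper name inconsistent_indents and the sample lines are assumptions made for this example:

```ruby
# Hypothetical helper: returns [line_number, kind] pairs for every line whose
# leading whitespace differs from that of the first indented line.
def inconsistent_indents(lines)
  first = nil
  mismatches = []
  lines.each.with_index(1) do |line, line_no|
    lead = line[/\A[ \t]/]
    next unless lead
    first ||= lead
    mismatches << [line_no, lead == "\t" ? :tab : :space] if lead != first
  end
  mismatches
end

sample = ["def foo\n", "  bar\n", "\tbaz\n", "  qux\n", "end\n"]
p inconsistent_indents(sample)
```

Here line 2 sets the expected style (space), so only line 3, which starts with a tab, is reported.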

Search for word in text file and display (Ruby)

I'm trying to take input from the user, search through a text file (case insensitively), and then display the match from the file if it matches (with the case of the word in the file). I don't know how to get the word from the file, here's my code:
found = 0
words = []
puts "Please enter a word to add to text file"
input = gets.chomp
#Open text file in read mode
File.open("filename", "r+") do |f|
f.each do |line|
if line.match(/\b#{input}\b/i)
puts "#{input} was found in the file." # <--- I want to show the matched word here
#Set to 1 indicating word was found
found = 1
end
end
end
So, what you want to do is store the result of the match method; you can then get the actual matched word out of it, i.e.
if m = line.match( /\b#{input}\b/i )
puts "#{m[0]} was found in the file."
# ... etc.
end
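A self-contained sketch of this on a single sample line. The hard-coded input and line are assumptions, and Regexp.escape is an addition not in the original answer; it keeps characters such as "." in the input from being treated as regex syntax:

```ruby
input = 'grocers' # stands in for gets.chomp
line  = "Grocers sell ice cream, and boy do I\n"

# m is a MatchData object on success and nil otherwise; m[0] is the matched
# text exactly as it appears in the line, so the file's casing is preserved.
if (m = line.match(/\b#{Regexp.escape(input)}\b/i))
  puts "#{m[0]} was found in the file."
end
```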
Update
By the way, you didn't ask, but I would use scan in this case, to get an array of the matched words on each line (for when there's more than one match on the same line). Note that scan always returns an array, and an empty array is truthy in Ruby, so test it with empty? rather than using it directly as the condition:
m = line.scan( /\b#{input}\b/i )
unless m.empty?
  puts "Matches found on line #{f.lineno}: #{m.join(', ')}"
  # ... etc.
end
If you don't need to report the locations of the matches and the file is not overly large, you could just do this:
File.read("testfile").scan /\b#{input}\b/i
Let's try it:
text = <<THE_END
Now is the time for all good people
to come to the aid of their local
grocers, because grocers are important
to our well-being. One more thing.
Grocers sell ice cream, and boy do I
love ice cream.
THE_END
input = "grocers"
F_NAME = "text"
File.write(F_NAME, text)
File.read(F_NAME).scan /\b#{input}\b/i
# => ["grocers", "grocers", "Grocers"]
File.read(F_NAME) returns the entire text file in a single string. scan /\b#{input}\b/i is sent to that string.

How to print from specific column range?

I want to grab only the first line of columns 46 to 245 of source.txt and write it to output.txt
source_file.each { |line|
File.open(output_file,"a+") { |f|
f.print ???
}
Bonus: I also need to keep a count of the number of characters in this range, as some may be whitespace. i.e. 38 characters and the rest whitespace.
Example:
source_file: (first line only, columns 45 to 245): 13287912721981239854 + 180 blank columns
output_file: 13287912721981239854
count = 20 characters
Update: appending [46..245].delete(' ').size gives me the desired count.
If I am understanding what you are asking correctly, there's no reason to grab the whole file when you only want the first line. If this isn't what you're asking for, then you need to specify what you're trying to pull out of the source file more clearly.
This should grab the data you need:
output_line = source_file.gets[45..244]
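A small sketch of the slice together with the character count from the question, on a made-up line. The layout (45 filler characters, then the value padded with blanks out to column 245) is an assumption based on the example in the question, with columns counted from 1:

```ruby
# Build a sample first line: 45 filler characters, a 20-digit value padded
# with blanks to 200 characters, then some trailing text (assumed layout).
line = ('x' * 45) + '13287912721981239854'.ljust(200) + "tail\n"

field = line[45..244]     # columns 46..245 when counting from 1
value = field.delete(' ') # drop the blank padding
count = value.size

puts value
puts "count = #{count} characters"
```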
If you write:
source_file.each { |line|
File.open(output_file,"a+") { |f|
f.print ???
}
}
You will open, then close, your output file for each line read from the input file. That is the wrong way to do it, even if you only want to read one line of input.
Instead try something like one of these:
File.open(output_file, 'a') do |fo|
File.open('path/to/input_file') do |fi|
fo.puts fi.readline[46..245]
end
end
This uses IO#readline, which reads a single line from the file. The block falls through afterwards, causing both the input and output files to be closed automatically. Also, it opens the output file as 'a', which is write-only append mode. 'a+' is wrong unless you intend to append and read, which is rarely done. From the documentation:
"a+" Read-write, starts at end of file if file exists,
otherwise creates a new file for reading and
writing
Or:
File.open(output_file, 'a') do |fo|
File.foreach('path/to/input_file') do |li|
fo.puts li[46..245]
break
end
end
foreach is used most often when we're reading a file line-by-line. It's the mainstay for reading files in a scalable manner. It wants to loop over the file inside the block, which is why break is there, to break out of that loop.
Or:
File.foreach('path/to/input_file') do |li|
  File.write(output_file, li[46..245], mode: 'a')
  break
end
File.write is useful when you have a blob of text or binary, and want to write it in one chunk, then move on. mode: 'a' overrides the default mode, which would normally truncate an existing file; in append mode every write goes to the end of the file, so no offset argument is needed.
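Another way to stop after the first line, sketched here under the assumption that only that one line is needed: called without a block, File.foreach returns an Enumerator, and #first pulls a single line from it. The temporary file exists only to make the example self-contained:

```ruby
require 'tempfile'

first_line = nil

# Throwaway input file so the sketch runs on its own.
Tempfile.create('input') do |fi|
  fi.write("first line\nsecond line\n")
  fi.flush

  # Without a block, File.foreach returns an Enumerator; #first reads only
  # as far as the first line.
  first_line = File.foreach(fi.path).first
end

puts first_line
```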
Maybe this will do the job:
line = f.readline
columns = line.split
File.open("output.txt", "w") do |out|
columns[46, (245 - 46 + 1)].each do |column|
out.puts column
end
end
break # only process first line
I have used 245 - 46 + 1 to indicate the number of columns we are interested in. I have also assumed that columns are separated by whitespace. If that is not the case, you will need to change the delimiter passed to split.

Compare Arrays for matching string

I have a script that telnets into a box, runs a command, and saves the output. I run another script after that which parses through the output file, comparing it to key words that are located in another file for matching. If a line is matched, it should save the entire line (from the original telnet-output) to a new file.
Here is the portion of the script that deals with parsing text:
def parse_file
filter = []
temp_file = File.open('C:\Ruby193\scripts\PARSED_TRIAL.txt', 'a+')
t = File.open('C:\Ruby193\scripts\TRIAL_output_log.txt')
filter = File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines
t.each do |line|
filter.each do |segment|
if (line =~ /#{segment}/)
temp_file.puts line
end
end
end
t.close()
temp_file.close()
end
Currently, it is only saving the last run string located in array filter and saving that to temp_file. It looks like the loop does not run all the strings in the array, or does not save them all. I have five strings placed inside the text file Filtered_text.txt. It only prints my last matched line into temp_file.
This (untested code) will duplicate the original code, only more succinctly and idiomatically:
filter = Regexp.union(File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines.map(&:chomp))
File.open('C:\Ruby193\scripts\PARSED_TRIAL.txt', 'a+') do |temp_file|
File.foreach('C:\Ruby193\scripts\TRIAL_output_log.txt') do |l|
temp_file.puts l if (l[filter])
end
end
To give you an idea what is happening:
Regexp.union(%w[a b c])
=> /a|b|c/
This gives you a regular expression that scans the string for any of the substrings. The search is case-sensitive and also matches partial words.
If you want to close those holes, use something like:
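Regexp.new(
  '\b(?:' + Regexp.union(
    File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines.map(&:chomp)
  ).source + ')\b',
  Regexp::IGNORECASE
)
which, using the same sample input array as above, would result in:
/\b(?:a|b|c)\b/i
The (?: ... ) group is needed because | binds more loosely than \b; without it, \b would apply only to the first and last alternatives.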
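To see how word boundaries interact with alternation, here is a short sketch with made-up keywords. A pattern of the form \bx|y\b anchors only the start of the first alternative and the end of the last one, so wrapping the union in a non-capturing group is what makes the boundaries apply to every alternative:

```ruby
words = %w[cat dog] # hypothetical keywords standing in for the filter file

union = Regexp.union(words).source # "cat|dog"

ungrouped = Regexp.new('\b' + union + '\b', Regexp::IGNORECASE)
grouped   = Regexp.new('\b(?:' + union + ')\b', Regexp::IGNORECASE)

p 'catalog'[ungrouped] # "cat": \b applies only before "cat" and after "dog"
p 'catalog'[grouped]   # nil: "cat" is not a whole word here
p 'Hot Dog'[grouped]   # "Dog": whole word, matched case-insensitively
```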

Resources