Reading specific line into an array - ruby

Have a txt file with the following:
Anders Hansen;87442355;11;87
Jens Hansen;22338843;23;11
Nanna Kvist;25233255;24;84
I would like to search the file for a specific name taken from user input, then save that line into an array, split on ";". Can't get it to work though. This is my code:
user1 = []
puts "Start by entering the full name of user 1: "
input = gets.chomp
File.open("userregister.txt") do |f|
  f.each_line { |line|
    if line =~ input then do |line|
      user1 << line.split(';').map

=~ in Ruby tries to match a string with a regex (or vice versa). Here, you use it with two strings, which gives an error:
'foo' =~ 'bar' # => TypeError: type mismatch: String given
There are more appropriate String methods to use instead. In your case, #start_with? does the job. If you want to check whether the input is contained anywhere as a substring (not necessarily at the beginning), you can use #include?.
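For example, a minimal rework of the loop from the question using #start_with? (keeping the file name and the user1 array from the question) might look like this:
user1 = []
puts "Start by entering the full name of user 1: "
input = gets.chomp

File.open("userregister.txt") do |f|
  f.each_line do |line|
    # keep the matching line, split on ';'
    user1 = line.chomp.split(';') if line.start_with?(input)
  end
end
# e.g. for "Anders Hansen": ["Anders Hansen", "87442355", "11", "87"]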
In case you actually wanted to take a regex as a user input (generally a bad idea), you can convert it from string to regex:
line =~ /#{input}/

Looking at the file format, I would actually use Ruby's CSV class. By setting the column separator to ;, you will get an array for each row.
require 'csv'
input = gets.chomp
CSV.foreach('userregister.txt', col_sep: ';') do |row|
  if row[0].downcase == input.downcase
    # Do stuff with row[1..-1]
  end
end
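To tie this back to the original goal (a user1 array holding the matching line split on ';'), the body could simply keep the row, for example:
require 'csv'

user1 = []
input = gets.chomp
CSV.foreach('userregister.txt', col_sep: ';') do |row|
  # row is already split on ';', e.g. ["Anders Hansen", "87442355", "11", "87"]
  user1 = row if row[0].downcase == input.downcase
end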


Deleting lines containing specific words in text file in Ruby

I have a text file like this:
User accounts for \\AGGREP-1
-------------------------------------------------------------------------------
Administrator users grzesieklocal
Guest scom SUPPORT_8855
The command completed successfully.
The first line is an empty line. I want to delete every empty line in this file and every line containing the words "User accounts for", "-------", or "The command". I want to keep only the lines containing users. I don't want to just delete the first four lines and the last one, because on some systems there can be more users and the file will contain more lines.
I load the file using
a = IO.readlines("test.txt")
Is there a way to delete lines containing specific words?
Solution
This structure reads the file line by line and writes a new file directly:
def unwanted?(line)
  line.strip.empty? ||
    line.include?('User accounts') ||
    line.include?('-------------') ||
    line.include?('The command completed')
end

File.open('just_users.txt', 'w+') do |out|
  File.foreach('test.txt') do |line|
    out.puts line unless unwanted?(line)
  end
end
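With the sample file from the question, just_users.txt would end up containing only the two user lines:
Administrator users grzesieklocal
Guest scom SUPPORT_8855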
If you're familiar with regexps, you could use:
def unwanted?(line)
  line =~ /^(User accounts|------------|The command completed|\s*$)/
end
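Either way, the sample lines from the question are classified like this (shown with the include?-based version; the regexp variant returns a match position or nil instead):
unwanted?("")                                     #=> true   (blank line)
unwanted?("The command completed successfully.")  #=> true
unwanted?("Administrator users grzesieklocal")    #=> false  (kept)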
Warning from your code
The message warning: string literal in condition appears when you write something like:
string = "nothing"
if string.include? "a" or "b"
puts "FOUND!"
end
It outputs:
parse_text.rb:16: warning: string literal in condition
FOUND!
This is because it should be written as:
string = 'nothing'
if string.include?('a') || string.include?('b')
puts "FOUND!"
end
IO::readlines returns an array, so you could use Array#select to select just the lines you need. Bear in mind that this means your whole input file will be in memory, which might be a problem if the file is really large.
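A sketch of that approach, reusing the unwanted? helper defined above (the whole file is read into the array first):
selected_lines = IO.readlines('test.txt').select { |line| !unwanted?(line) }
# equivalently: IO.readlines('test.txt').reject { |line| unwanted?(line) }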
An alternative approach would be to use IO::foreach, which processes one line at a time:
selected_lines = []
IO.foreach('test.txt') { |line| selected_lines << line if line_matches_your_requirements }

Extract names from File using Ruby and Grep

I have a file with the following data:
other data
user1=name1
user2=name2
user3=name3
other data
To extract the names I do the following:
names = File.open('resource.cfg', 'r') do |f|
  f.grep(/[a-z][a-z][0-9]/)
end
which returns the following array
user1=name1
user2=name2
user3=name3
but I really want only the name part
name1
name2
name3
Right now I'm doing this after the file step:
names = names.map do |name|
  name[7..9]
end
Is there a better way to do this within the file step?
You could do it like this, using String#scan with a regex:
Code
File.read(FNAME).scan(/(?<==)[A-Za-z]+\d+$/)
Explanation
Let's start by constructing a file:
FNAME = "my_file"
lines =<<_
other data
user1=name1
user2=name2
user3=name3
other data
_
File.write(FNAME,lines)
We can confirm the file contents:
puts File.read(FNAME)
other data
user1=name1
user2=name2
user3=name3
other data
Now run the code:
File.read(FNAME).scan(/(?<==)[A-Za-z]+\d+$/)
#=> ["name1", "name2", "name3"]
A word about the regex I used.
(?<=...)
is called a "positive lookbehind". Whatever is inserted in place of the dots must immediately precede the match, but is not part of the match (and for that reason is sometimes referred to as as "zero-length" group). We want the match to follow an equals sign, so the "positive lookbehind" is as follows:
(?<==)
This is followed by one or more letters, then one or more digits, then an end-of-line, which comprise the pattern to be matched. You could of course change this if you have different requirements, such as names being lowercase or beginning with a capital letter, a specified number of digits, and so on.
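For instance, two hypothetical variations on the same pattern, matching the different requirements mentioned above:
str = File.read(FNAME)
str.scan(/(?<==)[a-z]+\d+$/)          # lowercase letters only
str.scan(/(?<==)[A-Z][A-Za-z]*\d+$/)  # names beginning with a capital letter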
Is your code working as you have posted it?
names = File.open('resource.cfg', 'r') { |f| f.grep(/[a-z][a-z][0-9]/) }
names = names.map { |name| name[7..9] }
=> ["ame", "ame", "ame"]
You could make it into a neat little one-liner by writing it as such:
names = File.readlines('resource.cfg').grep(/=(\w*)/) { |x| x.split('=')[1].chomp }
You can do it all in a single step:
names = File.open('resource.cfg', 'r') do |f|
  f.grep(/[a-z][a-z][0-9]/).map { |x| x.split('=')[1].chomp }
end

How do I execute Date from a string?

I am trying to read a file which has dynamic dates in it such as Date.today or (Date.today - 1 ), and perform my code based on the date requested.
If I have the string defined with the date in quotes it works. When reading the same string from a file it does not. Is there any eval function that I need to use to make it work?
require 'date'
#Works
abc = "something #{Date.today}"
puts abc
# something 2013-04-19
#does not work
f = File.read("test.txt")
f.each_line { |line|
  puts line
  words = line.split("\t")
  puts line
}
Contents of the test.txt file:
something #{Date.today}
# something #{Date.today}
You're going to have to use eval to actually evaluate the contents of each line as it's read.
But since you will be evaluating arbitrary code, you will have to trust that it's not malicious.
So, assuming you trust the file and its lines:
require 'date'
f = File.read("test.txt")
f.each_line do |line|
  puts eval("\"#{line}\"")
end
Notice the double-quote wrapping: the "piece of code" you are evaluating needs to be a valid Ruby expression, so each line is wrapped in quotes to make it an actual double-quoted String literal (whose interpolation is then evaluated).
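For a single line, the wrapping works roughly like this (a small sketch; the date shown is just the one from the question):
require 'date'

line = 'Hello, #{Date.today}'   # single-quoted: the #{} is still literal text
eval("\"#{line}\"")             #=> "Hello, 2013-04-19" (whatever today's date is)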
This works if test.txt looks like:
Hello, #{Date.today}
Goodbye, #{Date.today}

Compare Arrays for matching string

I have a script that telnets into a box, runs a command, and saves the output. I run another script after that which parses through the output file, comparing it to key words that are located in another file for matching. If a line is matched, it should save the entire line (from the original telnet-output) to a new file.
Here is the portion of the script that deals with parsing text:
def parse_file
  filter = []
  temp_file = File.open('C:\Ruby193\scripts\PARSED_TRIAL.txt', 'a+')
  t = File.open('C:\Ruby193\scripts\TRIAL_output_log.txt')
  filter = File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines
  t.each do |line|
    filter.each do |segment|
      if (line =~ /#{segment}/)
        temp_file.puts line
      end
    end
  end
  t.close()
  temp_file.close()
end
Currently, it only saves lines matching the last string in the filter array to temp_file. It looks like the loop does not run through all the strings in the array, or does not save all of them. I have five strings in the text file Filtered_text.txt, but only my last matched line is printed to temp_file.
This (untested code) will duplicate the original code, only more succinctly and idiomatically:
filter = Regexp.union(File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines.map(&:chomp))
File.open('C:\Ruby193\scripts\PARSED_TRIAL.txt', 'a+') do |temp_file|
  File.foreach('C:\Ruby193\scripts\TRIAL_output_log.txt') do |l|
    temp_file.puts l if (l[filter])
  end
end
To give you an idea of what is happening:
Regexp.union(%w[a b c])
=> /a|b|c/
This gives you a regular expression that will walk through the string looking for any substring match. It is a case-sensitive search, and it will happily match inside longer words.
If you want to close those holes (match whole words only and ignore case), use something like:
Regexp.new(
  '\b(?:' + Regexp.union(
    File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines.map(&:chomp)
  ).source + ')\b',
  Regexp::IGNORECASE
)
which, using the same sample input array as above would result in:
/\b(?:a|b|c)\b/i
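As an illustrative check with made-up filter words (not the actual contents of Filtered_text.txt), the word-bounded, case-insensitive pattern behaves like this:
filter = Regexp.new('\b(?:' + Regexp.union(%w[error warn fail]).source + ')\b', Regexp::IGNORECASE)
"Kernel FAIL detected"[filter]     #=> "FAIL"  (case-insensitive, whole word)
"Kernel FAILURE detected"[filter]  #=> nil     (no match inside a longer word)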

What's the best way to parse a tab-delimited file in Ruby?

What's the best (most efficient) way to parse a tab-delimited file in Ruby?
The Ruby CSV library lets you specify the field delimiter. Ruby 1.9 uses FasterCSV. Something like this would work:
require "csv"
parsed_file = CSV.read("path-to-file.csv", col_sep: "\t")
The rules for TSV are actually a bit different from CSV. The main difference is that CSV has provisions for sticking a comma inside a field and then using quotation characters and escaping quotes inside a field. I wrote a quick example to show how the simple response fails:
require 'csv'
line = "boogie\ttime\tis \"now\""   # double-quoted so the \t are real tabs

begin
  line = CSV.parse_line(line, col_sep: "\t")
  puts "parsed correctly"
rescue CSV::MalformedCSVError
  puts "failed to parse line"
end

begin
  line = CSV.parse_line(line, col_sep: "\t", quote_char: "Ƃ")
  puts "parsed correctly with random quote char"
rescue CSV::MalformedCSVError
  puts "failed to parse line with random quote char"
end
#Output:
# failed to parse line
# parsed correctly with random quote char
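For reference, the successful second parse returns the three tab-separated fields of the example line above:
CSV.parse_line("boogie\ttime\tis \"now\"", col_sep: "\t", quote_char: "Ƃ")
#=> ["boogie", "time", "is \"now\""]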
If you want to use the CSV library you could use a random quote character that you don't expect to see in your file (the example above shows this), but you could also use a simpler approach like the StrictTsv class shown below to get the same effect without having to worry about field quoting.
# The main parse method is mostly borrowed from a tweet by @JEG2
class StrictTsv
  attr_reader :filepath

  def initialize(filepath)
    @filepath = filepath
  end

  def parse
    open(filepath) do |f|
      headers = f.gets.strip.split("\t")
      f.each do |line|
        fields = Hash[headers.zip(line.split("\t"))]
        yield fields
      end
    end
  end
end
# Example Usage
tsv = StrictTsv.new("your_file.tsv")
tsv.parse do |row|
  puts row['named field']
end
The choice of using the CSV library or something more strict just depends on who is sending you the file and whether they are expecting to adhere to the strict TSV standard.
Details about the TSV standard can be found at http://en.wikipedia.org/wiki/Tab-separated_values
There are actually two different kinds of TSV files.
TSV files that are actually CSV files with a delimiter set to Tab. This is something you'll get when you e.g. save an Excel spreadsheet as "UTF-16 Unicode Text". Such files use CSV quoting rules, which means that fields may contain tabs and newlines, as long as they are quoted, and literal double quotes are written twice. The easiest way to parse everything correctly is to use the csv gem:
require 'csv'
parsed = CSV.read("file.tsv", col_sep: "\t")
TSV files conforming to the IANA standard. Tabs and newlines are not allowed as field values, and there is no quoting whatsoever. This is something you will get when you e.g. select a whole Excel spreadsheet and paste it into a text file (beware: it will get messed up if some cells do contain tabs or newlines). Such TSV files can be easily parsed line-by-line with a simple line.rstrip.split("\t", -1) (note -1, which prevents split from removing empty trailing fields). If you want to use the csv gem, simply set quote_char to nil:
require 'csv'
parsed = CSV.read("file.tsv", col_sep: "\t", quote_char: nil)
I like mmmries' answer. However, I dislike the way Ruby strips any empty values off the end of a split. It isn't stripping the newline at the end of the lines, either.
Also, I had a file with potential newlines within a field. So, I rewrote his 'parse' as follows:
def parse
  open(filepath) do |f|
    headers = f.gets.strip.split("\t")
    f.each do |line|
      myline = line
      while myline.scan(/\t/).count != headers.count - 1
        myline += f.gets
      end
      fields = Hash[headers.zip(myline.chomp.split("\t", headers.count))]
      yield fields
    end
  end
end
This concatenates any lines as necessary to get a full line of data, and always returns the full set of data (without potential nil entries at the end).
