I have a text file (a.txt) that looks like the following.
open
close
open
open
close
open
I need to find a way to replace the 3rd line with "close". I did some searching, and most methods involve searching for the line and then replacing it. I can't really do that here, since I don't want to turn all the "open" lines into "close".
Essentially (for this case) I'm looking for a write version of IO.readlines("./a.txt")[2].
How about something like:
lines = File.readlines('file')
lines[2] = 'close' << $/
File.open('file', 'w') { |f| f.write(lines.join) }
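The same idea can be wrapped in a small helper. This is only a sketch: the method name replace_line and the scratch file are assumptions made for illustration, and because it reads the whole file into memory it is best suited to small files.

```ruby
require "tmpdir"

# Hypothetical helper: overwrite the line at zero-based `index` in `path`.
def replace_line(path, index, text)
  lines = File.readlines(path)
  raise IndexError, "no line #{index + 1} in #{path}" if index >= lines.size

  lines[index] = text + "\n"
  File.write(path, lines.join)
end

# Scratch copy of the sample file from the question.
SAMPLE_PATH = File.join(Dir.mktmpdir, "a.txt")
File.write(SAMPLE_PATH, "open\nclose\nopen\nopen\nclose\nopen\n")

replace_line(SAMPLE_PATH, 2, "close")
puts File.read(SAMPLE_PATH)
```

Only the third line changes; the other "open" lines are left alone because the line is addressed by position rather than by content.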
str = <<-_
my
dog
has
fleas
_
FNameIn = 'in'
FNameOut = 'out'
First, let's write str to FNameIn:
File.write(FNameIn, str)
#=> 17
Here are a couple of ways to replace the third line of FNameIn with "had" when writing the contents of FNameIn to FNameOut.
#1 Read a line, write a line
If the file is large, you should read from the input file and write to the output file one line at a time, rather than keeping large strings or arrays of strings in memory.
fout = File.open(FNameOut, "w")
File.foreach(FNameIn).with_index { |s,i| fout.puts(i==2 ? "had" : s) }
fout.close
Let's check that FNameOut was written correctly:
puts File.read(FNameOut)
my
dog
had
fleas
Note that IO#puts writes a record separator only if the string does not already end with one (see note 1 below). Also, if fout.close is omitted, FNameOut stays open until fout is garbage-collected or the program exits, so it is safer to close it explicitly.
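A variant of the same loop using the block form of File.open, which closes the output file as soon as the block exits. The paths and sample text below are stand-ins so that the sketch runs on its own.

```ruby
require "tmpdir"

dir = Dir.mktmpdir
IN_PATH  = File.join(dir, "in")   # stand-in for FNameIn
OUT_PATH = File.join(dir, "out")  # stand-in for FNameOut
File.write(IN_PATH, "my\ndog\nhas\nfleas\n")

# The block form closes the output file even if an exception is raised.
File.open(OUT_PATH, "w") do |fout|
  File.foreach(IN_PATH).with_index do |line, i|
    # puts appends a newline to "had" but not to lines that already end in one.
    fout.puts(i == 2 ? "had" : line)
  end
end

puts File.read(OUT_PATH)
```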
#2 Use a regex
r = /
(?:[^\n]*\n) # Match a line in a non-capture group
{2} # Perform the above operation twice
\K # Discard all matches so far
[^\n]+ # Match next line up to the newline
/x # Free-spacing regex definition mode
File.write(FNameOut, File.read(FNameIn).sub(r,"had"))
puts File.read(FNameOut)
my
dog
had
fleas
1 File.superclass #=> IO, so IO's methods are inherited by File.
I have a text file that starts with:
Title
aaa
bbb
ccc
I don't know what the line would include, but I know that the structure of the file will be Title, then an empty line, then the actual lines. I want to modify it to:
New Title
fff
aaa
bbb
ccc
I had this in mind:
lineArray = File.readlines(destinationFile).drop(2)
lineArray.insert(0, 'fff\n')
lineArray.insert(0, '\n')
lineArray.insert(0, 'new Title\n')
File.writelines(destinationFile, lineArray)
but writelines doesn't exist.
`writelines' for File:Class (NoMethodError)
Is there a way to delete the first two lines of the file and add three new lines?
I'd start with something like this:
NEWLINES = {
0 => "New Title",
1 => "\nfff"
}
File.open('test.txt.new', 'w') do |fo|
File.foreach('test.txt').with_index do |li, ln|
fo.puts (NEWLINES[ln] || li)
end
end
Here's the contents of test.txt.new after running:
New Title
fff
aaa
bbb
ccc
The idea is to provide a list of replacement lines in the NEWLINES hash, keyed by line number. As each line is read from the original file, its line number is looked up in the hash; if an entry exists, the corresponding value is written, otherwise the original line is written.
If you want to read the entire file then substitute, it reduces the code a little, but the code will have scalability issues:
NEWLINES = [
"New Title",
"",
"fff"
]
file = File.readlines('test.txt')
File.open('test.txt.new', 'w') do |fo|
fo.puts NEWLINES
fo.puts file[(NEWLINES.size - 1) .. -1]
end
It's not very smart but it'll work for simple replacements.
If you really want to do it right, learn how diff works, create a diff file, then let it do the heavy lifting, as it's designed for this sort of task, runs extremely fast, and is used millions of times every day on *nix systems around the world.
Use puts with the whole array:
File.open(destinationFile, "w") do |f|
f.puts(lineArray)
end
If your files are big, the performance and memory implications of reading them into memory in their entirety are worth thinking about. If that's a concern, then your best bet is to treat the files as streams. Here's how I would do it.
First, define your replacement text:
require "stringio"
replacement = StringIO.new <<END
New Title
fff
END
I've made this a StringIO object, but it could also be a File object if your replacement text is in a file.
Now, open your destination file (a new file) and write each line from the replacement text into it.
dest = File.open(dest_fn, 'wb')
replacement.each_line {|ln| dest << ln }
We could have done this more efficiently, but there's a good reason to do it this way: Now we can call replacement.lineno to get the number of lines read, instead of iterating over it a second time to count the lines.
Next, open the original file and seek ahead by calling gets replacement.lineno times:
orig = File.open(orig_fn, 'r')
replacement.lineno.times { orig.gets }
Finally, write the remaining lines from the original file to the new file. We'll do it more efficiently this time with File.copy_stream:
File.copy_stream(orig, dest)
orig.close
dest.close
That's it. Of course, it's a drag closing those files manually (and when we do we should do it in an ensure block), so it's better to use the block form of File.open to automatically close them. Also, we can move the orig.gets calls into the replacement.each_line loop:
File.open(dest_fn, 'wb') do |dest|
File.open(orig_fn, 'r') do |orig|
replacement.each_line {|ln| dest << ln; orig.gets }
File.copy_stream(orig, dest)
end
end
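Here is a self-contained sketch of the streaming approach that can be run as-is. The file names and sample contents are assumptions. Unlike the loop above, which skips one original line per replacement line, this version skips a fixed number of header lines (assumed to be two: the title and the blank line), so the replacement text may be longer than the header it replaces.

```ruby
require "stringio"
require "tmpdir"

dir = Dir.mktmpdir
ORIG_FN = File.join(dir, "orig.txt")  # hypothetical file names
DEST_FN = File.join(dir, "dest.txt")
HEADER_LINES = 2                      # assumption: title line plus blank line

File.write(ORIG_FN, "Title\n\naaa\nbbb\nccc\n")
replacement = StringIO.new("New Title\n\nfff\n")

File.open(DEST_FN, "wb") do |dest|
  File.open(ORIG_FN, "r") do |orig|
    # Write the new header first.
    replacement.each_line { |ln| dest << ln }
    # Skip the old header in the original file.
    HEADER_LINES.times { orig.gets }
    # Copy the rest of the original file without loading it into memory.
    IO.copy_stream(orig, dest)
  end
end

puts File.read(DEST_FN)
```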
First create an input test file.
FNameIn = "test_in"
text = <<_
Title
How now,
brown cow?
_
#=> "Title\n\nHow now,\nbrown cow?\n"
File.write(FNameIn, text)
#=> 27
Now read and write line-by-line.
FNameOut = "test_out"
File.open(FNameIn) do |fin|
fin.gets; fin.gets
File.open(FNameOut, 'w') do |fout|
fout.puts "New Title"
fout.puts
fout.puts "fff"
until fin.eof?
fout.puts fin.gets
end
end
end
Check the result:
puts File.read(FNameOut)
# New Title
#
# fff
# How now,
# brown cow?
Ruby will close each of the two files when its block terminates.
If the files are not large, you could instead write:
File.write(FNameOut,
["New Title\n", "\n", "fff\n"].concat(File.readlines(FNameIn).drop(2)).join)
I am trying to read in a text file and iterate through every line. If the line contains "_u" then I want to copy that word in that line.
For example:
typedef struct {
reg 1;
reg 2;
} buffer_u;
I want to copy the word buffer_u.
This is what I have so far (everything up to how to copy the word in the string):
f_in = File.open( h_file )
text = f_in.read
text.each_line do |line|
if line.include? "_u"
# copy word
# add to output file
end
end
Thanks in advance for your help!
Don't make it harder than it has to be. If you want to scan a body of text for words that match a criterion, do just that:
text = "
word_u1
something
_u1 foo
bar _u2
another word_u2
typedef struct {
reg 1;
reg 2;
} buffer_u;
"
text.scan(/\w+/).select{ |w| w['_u'] }
# => ["word_u1", "_u1", "_u2", "word_u2", "buffer_u"]
Regexes are useful, but the more complex ("smarter") they are, the slower they run unless you are very careful to anchor them, as anchors give them hints on where to look. Without those, the engine tries a number of things to determine exactly what you want, and that can really bog down the processing.
I recommend instead simply grabbing the words in the text:
scan(/\w+/)
Then filtering out the ones that match:
select{ |w| w['_u'] }
Using select with a simple sub-string search w['_u'] is extremely fast.
It could probably run faster using split() instead of scan(/\w+/) but you'll have to deal with cleaning up non-word characters.
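A quick sketch of what "cleaning up non-word characters" means in practice; the two helper names are made up for this comparison.

```ruby
# scan(/\w+/) yields only runs of word characters.
def words_by_scan(line)
  line.scan(/\w+/).select { |w| w["_u"] }
end

# split breaks on whitespace only, so punctuation stays attached.
def words_by_split(line)
  line.split.select { |w| w["_u"] }
end

line = "} buffer_u;"
p words_by_scan(line)   # ["buffer_u"]
p words_by_split(line)  # ["buffer_u;"]
```

With split, the trailing semicolon would still have to be stripped before the word could be used.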
Note: \w means [a-zA-Z0-9_] so what we generally call a "word" character is actually a "variable" definition for most languages since words generally don't include digits or _.
You can probably reduce your code to:
File.read( h_file ).scan(/\w+/).select{ |w| w['_u'] }
That will return an array of matching words.
Caveat: Using read has scalability issues. If you're concerned about the size of the file being read (which you always should be) then use foreach and iterate over the file line-by-line. You will probably see no change in processing speed.
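A line-by-line version might look like the following sketch. The helper name and the scratch file are assumptions; the scan/select logic is the same as above, just applied to one line at a time.

```ruby
require "tmpdir"

# Hypothetical helper: collect words containing "_u" without reading
# the whole file into memory at once.
def words_with_u(path)
  File.foreach(path).flat_map do |line|
    line.scan(/\w+/).select { |w| w["_u"] }
  end
end

H_FILE = File.join(Dir.mktmpdir, "sample.h")
File.write(H_FILE, "typedef struct {\nreg 1;\nreg 2;\n} buffer_u;\n")

p words_with_u(H_FILE)
```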
You can try something like this:
words = []
File.open( h_file ) { |file| file.each_line { |line|
words << line[/\w*_u\w*/] # first word containing "_u" on the line, or nil
}}
words.compact!
# => ["buffer_u"]
puts words
# buffer_u
This regex should catch a word ending with _u
(\w*_u)(?!\w)
The capture group matches a word ending with _u that is not followed by a letter, digit, or underscore.
If you want _u to appear anywhere in a word use
(\w*_u\w*)
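To compare the two patterns, here is a small sketch on a made-up string. The constant names are only for illustration.

```ruby
ENDS_WITH_U = /(\w*_u)(?!\w)/  # "_u" must be the end of the word
CONTAINS_U  = /(\w*_u\w*)/     # "_u" may appear anywhere in the word

text = "buffer_u; pig_u_o cow_u"

# scan returns one array per match when the pattern has a capture group,
# so flatten gives a plain list of words.
p text.scan(ENDS_WITH_U).flatten  # ["buffer_u", "cow_u"]
p text.scan(CONTAINS_U).flatten   # ["buffer_u", "pig_u_o", "cow_u"]
```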
This will return all such words in the file, even if there are two or more in a line:
r = /
\w* # match >= 0 word characters
_u # match string
\w* # match >= 0 word characters
/x # extended mode
File.read(fname).scan r
For example:
str = "Cat_u has 9 lives, \n!dog_u has none and \n pig_u_o and cow_u, 3."
fname = 'temp'
File.write(fname, str)
#=> 63
Confirm the file contents:
File.read(fname)
#=> "Cat_u has 9 lives, \n!dog_u has none and \n pig_u_o and cow_u, 3."
Extract strings:
File.read(fname).scan r
#=> ["Cat_u", "dog_u", "pig_u_o", "cow_u"]
It's not difficult to modify this code to return at most one string per line. Simply read the file into an array of lines (or read a line at a time) and execute s = line[r]; arr << s if s for each line, where r is the above regex.
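The per-line variant described above could be sketched as follows. The helper name, constant name, and scratch file are assumptions; the pattern is the same one as r.

```ruby
require "tmpdir"

WORD_WITH_U = /\w*_u\w*/  # same pattern as r above, without the comments

# Hypothetical helper: first match on each line, skipping lines with none.
def first_match_per_line(path)
  File.foreach(path).each_with_object([]) do |line, arr|
    s = line[WORD_WITH_U]
    arr << s if s
  end
end

FNAME = File.join(Dir.mktmpdir, "temp")
File.write(FNAME, "Cat_u has 9 lives, \n!dog_u has none and \n pig_u_o and cow_u, 3.")

p first_match_per_line(FNAME)
```

On the sample text this keeps "pig_u_o" from the last line and drops "cow_u", since only the first match per line is taken.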
I have a file like this:
some content
some oterh
*********************
useful1 text
useful3 text
*********************
some other content
How do I get the content of the file between the two lines of stars into an array? For example, after processing the file above, the array should look like this:
a=["useful1 text" , "useful3 text"]
A really hacky solution is to split the content on the lines of stars, grab the middle part, and then split that into lines:
content.split(/^\*+$/)[1].split(/\n+/).reject(&:empty?)
# => ["useful1 text", "useful3 text"]
f = File.open('test_doc.txt', 'r')
content = []
f.each_line do |line|
content << line.rstrip unless !!(line =~ /^\*(\*)*\*$/)
end
f.close
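The regex pattern /^\*(\*)*\*$/ matches lines that consist only of asterisks (at least two). !!(line =~ /^\*(\*)*\*$/) always returns a boolean value, so a line is added to the array only when the pattern does not match. Note that this keeps every line that is not a row of asterisks, including the lines outside the starred block.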
What about this:
def values_between(array, separator)
array.slice array.index(separator)+1..array.rindex(separator)-1
end
filepath = '/tmp/test.txt'
lines = %w(trash trash separator content content separator trash)
separator = "separator\n"
File.write filepath, lines.join("\n")
values_between File.readlines(filepath), separator
#=> ["content\n", "content\n"]
I'd do it like this:
lines = []
File.foreach('./test.txt') do |li|
lines << li if (li[/^\*{5}/] ... li[/^\*{5}/])
end
lines[1..-2].map(&:strip).select{ |l| l > '' }
# => ["useful1 text", "useful3 text"]
/^\*{5}/ means "a line that starts with at least five '*' characters".
... is one of two uses of .. and ... and, in this use, is commonly called a "flip-flop" operator. It isn't used often in Ruby because most people don't seem to understand it. It's sometimes mistaken for the Range delimiters .. and ....
In this use, Ruby watches for the first test, li[/^\*{5}/], to return true. Once it does, .. or ... will return true until the second condition returns true. In this case we're looking for the same delimiter, so the same test, li[/^\*{5}/], will work, and this is where the difference between the two versions, .. and ..., comes into play.
.. will toggle back to false immediately, because it evaluates the second test on the same line that satisfied the first, whereas ... waits until the next line before looking at the second test. That avoids the problem of the first test seeing a delimiter and the second test seeing the same line and triggering.
That lets the test assign to lines, which, prior to the [1..-2].map(&:strip).select{ |l| l > '' } looks like:
# => ["*********************\n",
# "\n",
# "useful1 text\n",
# "\n",
# "useful3 text\n",
# "\n",
# "*********************\n"]
[1..-2].map(&:strip).select{ |l| l > '' } cleans that up. Slicing the array with [1..-2] removes the first and last elements (the delimiter lines). strip removes leading and trailing whitespace, which gets rid of the trailing newlines and leaves empty strings plus the strings containing the desired text. select{ |l| l > '' } keeps the strings that compare greater than the empty string, i.e., the ones that are not empty.
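To see the difference between .. and ... concretely, here is a small sketch on an in-memory array. The helper names and sample data are made up for illustration.

```ruby
LINES = ["before", "*****", "useful1 text", "useful3 text", "*****", "after"]

def star?(line)
  line.start_with?("*****")
end

# Two-dot flip-flop: the closing test runs on the same line that opened
# the range, so each delimiter line opens and closes it at once.
def select_two_dot(lines)
  lines.select { |li| true if (star?(li)..star?(li)) }
end

# Three-dot flip-flop: the closing test is first tried on the next line,
# so the range stays open until the second delimiter.
def select_three_dot(lines)
  lines.select { |li| true if (star?(li)...star?(li)) }
end

p select_two_dot(LINES)    # ["*****", "*****"]
p select_three_dot(LINES)  # ["*****", "useful1 text", "useful3 text", "*****"]
```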
See "When would a Ruby flip-flop be useful?" and its related questions, and "What is a flip-flop operator?" for more information and some background. (Perl programmers use .. and ... often, for just this purpose.)
One warning though: If the file has multiple blocks delimited this way, you'll get the contents of them all. The code I wrote doesn't know how to stop until the end-of-file is reached, so you'll have to figure out how to handle that situation if it could occur.
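If several starred blocks can occur and they need to be kept apart, one option is to track the state explicitly instead of using the flip-flop. The following is a sketch under that assumption; the method name starred_blocks is hypothetical, and it accepts any enumerable of lines, so File.foreach(path) could be passed in place of the sample array.

```ruby
# Hypothetical helper: return one array of non-empty lines per starred block.
def starred_blocks(lines)
  blocks = []
  current = nil                 # nil while outside a block
  lines.each do |li|
    if li.match?(/^\*{5}/)
      if current                # closing delimiter: store the finished block
        blocks << current
        current = nil
      else                      # opening delimiter: start a new block
        current = []
      end
    elsif current
      text = li.strip
      current << text unless text.empty?
    end
  end
  blocks
end

sample = ["junk\n", "*****\n", "a\n", "\n", "b\n", "*****\n",
          "junk\n", "*****\n", "c\n", "*****\n"]
p starred_blocks(sample)  # [["a", "b"], ["c"]]
```

A block that is opened but never closed is dropped here; whether that is the right behavior depends on the input.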
I have a script that telnets into a box, runs a command, and saves the output. I run another script after that which parses through the output file, comparing it to key words that are located in another file for matching. If a line is matched, it should save the entire line (from the original telnet-output) to a new file.
Here is the portion of the script that deals with parsing text:
def parse_file
filter = []
temp_file = File.open('C:\Ruby193\scripts\PARSED_TRIAL.txt', 'a+')
t = File.open('C:\Ruby193\scripts\TRIAL_output_log.txt')
filter = File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines
t.each do |line|
filter.each do |segment|
if (line =~ /#{segment}/)
temp_file.puts line
end
end
end
t.close()
temp_file.close()
end
Currently, it is only saving the last run string located in array filter and saving that to temp_file. It looks like the loop does not run all the strings in the array, or does not save them all. I have five strings placed inside the text file Filtered_text.txt. It only prints my last matched line into temp_file.
This (untested code) will duplicate the original code, only more succinctly and idiomatically:
filter = Regexp.union(File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines.map(&:chomp))
File.open('C:\Ruby193\scripts\PARSED_TRIAL.txt', 'a+') do |temp_file|
File.foreach('C:\Ruby193\scripts\TRIAL_output_log.txt') do |l|
temp_file.puts l if (l[filter])
end
end
To give you an idea what is happening:
Regexp.union(%w[a b c])
=> /a|b|c/
This gives you a regular expression that'll walk through the string looking for any substring matches. It's a case-sensitive search.
If you want to close those holes, use something like:
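Regexp.new(
'\b(?:' + Regexp.union(
File.open('C:\Ruby193\scripts\Filtered_text.txt').readlines.map(&:chomp)
).source + ')\b',
Regexp::IGNORECASE
)
which, using the same sample input array as above, would result in the following (the non-capturing group makes \b apply to every alternative, not just the first and last):
/\b(?:a|b|c)\b/i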
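A small sketch of the filtering step on in-memory data. The keywords and log lines are made up; in the real script they would come from Filtered_text.txt and the telnet output. The alternation is wrapped in a non-capturing group so that \b applies to each keyword.

```ruby
KEYWORDS = ["error", "timeout"]  # hypothetical keywords

# Whole-word, case-insensitive match on any keyword.
FILTER = Regexp.new(
  '\b(?:' + Regexp.union(KEYWORDS).source + ')\b',
  Regexp::IGNORECASE
)

log = ["Port 1 ERROR", "terror alert", "connect Timeout after 5s", "all good"]

# grep keeps the lines that match the pattern.
p log.grep(FILTER)  # ["Port 1 ERROR", "connect Timeout after 5s"]
```

"terror alert" is skipped because "error" is not a whole word there.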
Good afternoon!
I am pretty new to Ruby and want to code a basic search and replace function in Ruby.
When you call the function, you can pass parameters (search pattern, replacing word).
This works like this: multiedit(pattern1, replacement1, pattern2, replacement2, ...)
Now, I want my function to read a text file, search for pattern1 and replace it with replacement1, search for pattern2 and replace it with replacement2 and so on. Finally, the altered text should be written to another text file.
I've tried to do this with an until loop, but all I get is that only the very first pattern is replaced while all the following patterns are ignored (in this example, only apple is replaced with fruit). I think the problem is that I always reread the original unaltered text? But I can't figure out a solution. Can you help me? Calling the function the way I am doing it is important for me.
def multiedit(*_patterns)
return puts "Number of search patterns does not match number of replacement strings!" if (_patterns.length % 2 > 0)
f = File.open("1.txt", "r")
g = File.open("2.txt", "w")
i = 0
until i >= _patterns.length do
f.each_line {|line|
output = line.sub(_patterns[i], _patterns[i+1])
g.puts output
}
i+=2
end
f.close
g.close
end
multiedit("apple", "fruit", "tomato", "veggie", "steak", "meat")
Can you help me out?
Thank you very much in advance!
Regards
Your loop was kind of inside-out ... do this instead ...
f.each_line do |line|
_patterns.each_slice 2 do |a, b|
line.sub! a, b
end
g.puts line
end
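Put together, the whole function might look like the sketch below. The from: and to: keyword arguments and the scratch files are assumptions added so the example can run anywhere; the original hard-codes "1.txt" and "2.txt".

```ruby
require "tmpdir"

# Sketch: same call style as the question, with the file paths made explicit.
def multiedit(*patterns, from:, to:)
  raise ArgumentError, "patterns and replacements must come in pairs" if patterns.length.odd?

  File.open(to, "w") do |g|
    File.foreach(from) do |line|
      # Apply every search/replace pair to the current line before writing it.
      patterns.each_slice(2) { |search, replace| line = line.sub(search, replace) }
      g.puts line
    end
  end
end

dir = Dir.mktmpdir
SRC = File.join(dir, "1.txt")
DST = File.join(dir, "2.txt")
File.write(SRC, "apple pie\ntomato soup\nsteak and apple\n")

multiedit("apple", "fruit", "tomato", "veggie", "steak", "meat", from: SRC, to: DST)
puts File.read(DST)
```

Like the original, this uses sub, so only the first occurrence of each pattern on a line is replaced.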
Perhaps the most efficient way to evaluate all the patterns for every line is to build a single regexp from all the search patterns and use the hash replacement form of String#gsub:
def multiedit(*patterns)
raise ArgumentError, "Number of search patterns does not match number of replacement strings!" if (patterns.length % 2 != 0)
replacements = Hash[*patterns]
regexp = Regexp.new replacements.keys.map {|k| Regexp.quote(k) }.join('|')
File.open("2.txt", "w") do |out|
IO.foreach("1.txt") do |line|
out.puts line.gsub regexp, replacements
end
end
end
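The hash form of gsub can be tried on a plain string first. This is a minimal sketch with a hypothetical helper name; Regexp.union escapes the keys and joins them with |.

```ruby
# Hypothetical helper: replace every key of `replacements` found in `text`
# with its value, in a single pass over the string.
def multi_gsub(text, replacements)
  regexp = Regexp.union(replacements.keys)
  text.gsub(regexp, replacements)
end

pairs = { "apple" => "fruit", "tomato" => "veggie", "steak" => "meat" }
puts multi_gsub("apple and tomato, then apple", pairs)
# prints "fruit and veggie, then fruit"
```

Two differences from the sub-based loop are worth noting: gsub replaces every occurrence rather than only the first, and replaced text is not scanned again for later patterns.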
An easier and better method is to use ERB.
http://apidock.com/ruby/ERB