Following up on this Stack Overflow question:
ruby match string or space/tab at the beginning of a line and insert uniq lines to a file
I have these files. ./files_tmp/60416.log:
AAAAAA555
AAAAAA555
BBBBBB
CCCCC
AAAAAA434343
AAAAAA434343
./files_tmp/60417.log
AAAAAA55544
AAAAAA55544
BBBBBB
CCCCC
AAAAAA434343
AAAAAA434343
I have this code:
files = Dir["./files_tmp/*.log"]
files.each do |file_name|
puts file_name if !File.directory? file_name
Tempfile.open do |temp|
File.open(file_name) do |input|
input.each_line do |line|
if line.match(/AAAAAA/) || (line.match(/^\t/) and tabs)
puts "found a line #{line}"
temp.write(line.lstrip!)
end
end
end
File.open("./temp.log", "a") do |file|
temp.rewind
file.write(temp.readlines.uniq.join(""))
end
end
end
The output of puts "found a line #{line}" is shown below, but I expected it to print only the lines containing AAAAAA:
./files_tmp/60416.log
found a line AAAAAA555
found a line AAAAAA555
found a line BBBBBB
found a line CCCCC
found a line AAAAAA434343
found a line AAAAAA434343
./files_tmp/60417.log
found a line AAAAAA55544
found a line AAAAAA55544
found a line BBBBBB
found a line CCCCC
found a line AAAAAA434343
found a line AAAAAA434343
I can see duplicate lines in ./temp.log, and not all of the AAAAAA lines are there:
AAAAAA434343
AAAAAA434343
I expected:
AAAAAA555
AAAAAA434343
AAAAAA55544
And I wonder why ?
I am using file.write(temp.readlines.uniq.join("")) instead of file.write(temp.readlines.uniq) because otherwise the file contains the inspected array:
["AAAAAA434343\n"]
It would also be great to understand the purpose of rewind. What is it for?
Thanks for your help !
You do not need to mess with Tempfile. Just collect what you want and afterwards write everything into the destination file:
result = Dir["./files_tmp/*.log"].each_with_object([]) do |file_name, lines|
  next if File.directory? file_name # skip dirs
  File.foreach(file_name) do |line| # File.readlines ignores a block, File.foreach yields each line
    next unless line =~ /AAAAAA/
    puts "found a line #{line}"
    stripped = line.lstrip # lstrip! returns nil when there is nothing to strip
    lines << stripped unless lines.include?(stripped) # append only if it is unique
  end
end
File.write("./temp.log", result.join) # each line already ends with a newline
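As for rewind: an open file has a current position, and after writing that position sits at the end of the file, so reading from there returns nothing. rewind moves the position back to the start. Here is a minimal sketch of that behaviour; the helper name read_before_and_after_rewind is made up for illustration and is not part of the code above.

```ruby
require "tempfile"

# Hypothetical helper for illustration: writes the lines to a temporary file
# and reads it once before and once after rewinding.
def read_before_and_after_rewind(lines)
  Tempfile.create("rewind_demo") do |temp|
    lines.each { |line| temp.write(line) }
    before = temp.read # the position is at the end, so nothing is left to read
    temp.rewind        # move the position back to the start of the file
    after = temp.read  # now the whole content comes back
    [before, after]
  end
end

before, after = read_before_and_after_rewind(["AAAAAA555\n", "AAAAAA434343\n"])
p before
p after
```

That is why the original code needs temp.rewind before temp.readlines: without it, readlines starts at the end of the temp file and returns an empty array.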
I already have one line in fd.txt, and when I append three more lines, the first of them ends up directly after the existing data. Here's an example:
fd.txt
This is past data.
New data
Line 1
Line 2
Line 3
When I run the following code:
open('fd.txt', 'a+') { |file|
file.puts "Line 1"
file.puts "Line 2"
file.puts "Line 3"
}
I get the following output:
This is past data.Line 1
Line 2
Line 3
But I need Line 1 to start on the second line. If I add "\n", as in file.puts "\nLine 1", I get an additional empty line right before Line 1. What should I change in my code to get the following output:
This is past data.
Line 1
Line 2
Line 3
Not very elegant, but you could check whether the last character is a \n and add one otherwise: (I assume that you don't know if the file ends with a newline)
open('fd.txt', 'a+') do |file|
file.puts unless file.pread(1, file.size - 1) == "\n"
file.puts "Line 1"
file.puts "Line 2"
file.puts "Line 3"
end
It would be better of course to not have a missing newline in the first place.
Similar to the other proposed answer, you could do:
open('fd.txt', 'a+') do |file|
file.seek(-1, IO::SEEK_END)
file.puts unless file.read == "\n"
file.puts "Line 1"
file.puts "Line 2"
file.puts "Line 3"
end
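Both snippets above seek to the last byte, which I would expect to fail on an empty file. A rough sketch that wraps the same idea in a helper and skips the check when the file is empty could look like this; append_lines is a name I made up, not an existing method.

```ruby
# Hypothetical helper: appends the given lines to path, adding a newline
# first if the existing content does not already end with one.
def append_lines(path, lines)
  File.open(path, "a+") do |file|
    if file.size > 0 # an empty file has no last character to inspect
      file.seek(-1, IO::SEEK_END)
      file.puts unless file.read(1) == "\n"
    end
    lines.each { |line| file.puts line }
  end
end
```

Usage would be append_lines('fd.txt', ["Line 1", "Line 2", "Line 3"]). In "a+" mode writes always go to the end of the file, so the seek only affects where the read happens.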
I have a txt file and I want to extract only the first word of a line which contains the characters 'ath'.
File.open("myfile.txt").readlines.each do |line|
if line =~ /ath/
line.split.first
puts line
$line = line.chomp
puts "Ok"
else
nil
end
end
line.split.first only works if the first word of the line is a match, because when I do the same in irb:
"im here to ask someting easy".split.first
The output is 'im'.
If you only want lines whose first word contains ath at any point, anchor the pattern to the start of the line:
if line =~ /^\S*ath\S*/
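To tie this together, here is a small sketch that returns the first word of each matching line instead of printing the whole line. The helper name first_words_with_ath and the first_word_only flag are my own inventions for the example; with the flag set it uses the anchored pattern above, otherwise it accepts ath anywhere in the line.

```ruby
# Hypothetical helper: returns the first word of every line that matches.
# With first_word_only: true, "ath" must appear inside the first word itself.
def first_words_with_ath(lines, first_word_only: false)
  pattern = first_word_only ? /^\S*ath\S*/ : /ath/
  lines.grep(pattern).map { |line| line.split.first }
end

lines = ["path to file\n", "a bath here\n", "nothing\n"]
p first_words_with_ath(lines)
p first_words_with_ath(lines, first_word_only: true)
```

The point is that line.split.first returns a new string and does not change line, so its result has to be kept or printed rather than the line itself.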
I'm very new to the Ruby world, so please bear with me if it's a simple query.
For one of my assignments, I need to read the contents of all the text files in a folder (top level only) and write them to a single output file, appended one after another.
I'm expecting a format like the one below:
Output File
File Name: 1st file name
all its contents
====================================
File Name: 2nd file name
all its contents
====================================
File Name: 3rd file name
all its contents
====================================
....
....
====================================
I managed to write the below script but the output file is empty. Any suggestions please.
File.open('C:\Users\darkop\Desktop\final_output.txt','a') do |final|
@files = Dir.glob("D:\text\*.txt")
for file in @files
text = File.open(file, 'r').read.sub(/#$/?\z/, $/)
text.each_line do |line|
puts "File Name:"#{file}
puts
final << line
puts "=" * 20
end
end
end
Also, is it possible to redirect the output in aforementioned format to a word document instead of a text file ?
Many thanks.
This should work.
The file name was empty because you have puts "File Name:"#{file}. This way #{file} doesn't get interpolated, because it isn't inside the double quotation marks.
Also, you didn't get the contents of the file because you just used puts, instead of puts line, which is what you want.
File.open('C:\Users\darkop\Desktop\final_output.txt','a') do |final|
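# Use forward slashes: "\t" in a double-quoted string is a tab, and Dir.glob treats "\" as an escape character
@files = Dir.glob("D:/text/*.txt")
for file in @files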
text = File.open(file, 'r').read.sub(/#$/?\z/, $/)
text.each_line do |line|
puts "File Name: #{file}"
puts
puts line
final << line
puts "=" * 20
end
end
end
-EDIT-
Since you are new to Ruby: an each loop is more idiomatic than a for .. in loop. As for Word, you can give the output file a .doc extension and Word will open it, but the content is still plain text, not a real Word document.
File.open('C:\Users\darkop\Desktop\final_output.doc','a') do |final|
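# Use forward slashes: "\t" in a double-quoted string is a tab, and Dir.glob treats "\" as an escape character
@files = Dir.glob("D:/text/*.txt")
@files.each do |file|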
text = File.open(file, 'r').read.sub(/#$/?\z/, $/)
text.each_line do |line|
puts "File Name: #{file}"
puts
puts line
final << line
puts "=" * 20
end
end
end
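The snippets above print the header to the console once per line, while the requested format has one header per file inside the output file. A minimal sketch of that layout follows, assuming the output file lives outside the source folder; merge_text_files and its separator parameter are hypothetical names, and the separator length is a guess at the one in the question.

```ruby
# Hypothetical helper: merges every .txt file directly inside source_dir into
# output_path. Each file gets a "File Name:" header, then its contents, then
# a separator line. Assumes output_path is not itself inside source_dir.
def merge_text_files(source_dir, output_path, separator: "=" * 36)
  File.open(output_path, "w") do |final|
    Dir.glob(File.join(source_dir, "*.txt")).sort.each do |file|
      final.puts "File Name: #{File.basename(file)}"
      final.puts File.read(file) # puts adds a newline only if one is missing
      final.puts separator
    end
  end
end
```

A call might look like merge_text_files("D:/text", "C:/Users/darkop/Desktop/final_output.txt"). I used "w" so that reruns start from a clean file; switch to "a" if you really want to append to earlier output.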
I have a following file:
old_file
new_file
Some string.
end
Text in the middle that is not supposed to go to any of files.
new_file
Another text.
end
How can I use a regex to create two files with the following content:
file1
new_file
Some string.
end
file2
new_file
Another text.
end
How can I get information which is between keywords 'new_file' and 'end' to write it to the file?
If your files are not that large, you can read them in as a string (use File.read(file_name)) and then run the following regex:
file_contents.scan(/^new_file$.*?^end$/m).each { |block| WRITE_TO_FILE_CODE_HERE }
The ^new_file$.*?^end$ regex matches new_file that is a whole line content, then 0+ any characters as few as possible (incl. a newline as /m modifier is used), and then end (a whole line).
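To make the placeholder concrete, here is one possible sketch that writes each matched block to file1, file2 and so on. The helper write_blocks and the out_dir parameter are assumptions of mine; the regex is the one described above.

```ruby
# Hypothetical helper: writes each new_file...end block found in text to
# file1, file2, ... inside out_dir and returns the paths that were written.
def write_blocks(text, out_dir)
  text.scan(/^new_file$.*?^end$/m).each_with_index.map do |block, index|
    path = File.join(out_dir, "file#{index + 1}")
    File.write(path, block + "\n") # the match stops before the final newline
    path
  end
end
```

Something like write_blocks(File.read("old_file"), ".") should then leave file1 and file2 in the current directory.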
Otherwise, you can process the file line by line and track with a flag whether you are inside a block:
printing = false
File.open(my_file).each_line do |line|
printing = true if line =~ /^new_file$/
puts line if printing
printing = false if line =~ /^end$/
end
Open the output file when the starting line is found, write to it where puts line is in the example above, and close it when printing is set back to false.
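Spelled out, that could look roughly like the sketch below. The helper name split_blocks and the file1, file2 naming scheme are assumptions taken from the question, not from any library.

```ruby
# Hypothetical helper: streams input_path line by line, opens a new output
# file at each "new_file" line and closes it at the following "end" line.
def split_blocks(input_path, out_dir)
  paths = []
  out = nil # the currently open output file, or nil when outside a block
  File.foreach(input_path) do |line|
    if out.nil? && line =~ /^new_file$/
      paths << File.join(out_dir, "file#{paths.size + 1}")
      out = File.open(paths.last, "w")
    end
    out&.write(line) # lines outside a block are skipped
    if out && line =~ /^end$/
      out.close
      out = nil
    end
  end
  out&.close # in case the last block has no closing "end"
  paths
end
```

Because it reads one line at a time, this version should also cope with files that are too large to load into memory.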
You can also read the file chunk by chunk by changing what constitutes a "line" in ruby:
File.open("file1.txt", "w") do |file1|
File.open("file2.txt", "w") do |file2|
enum = IO.foreach("old_file.txt", sep="\n\n")
file1.puts enum.next.strip
enum.next #discard
file2.puts enum.next.strip
end #automatically closes file2
end #automatically closes file1
By designating the separator as "\n\n" ruby will read all the characters up to and including two consecutive newlines--and return that as a "line".
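Note that this only works if the sections in the file are separated by blank lines, which is an assumption about the layout of old_file.txt. A quick way to see what the separator does, using an in-memory string in place of the file (split_sections is a made-up helper name):

```ruby
require "stringio"

# Hypothetical helper: splits text into chunks at blank lines by passing
# "\n\n" as the line separator, then strips the trailing newlines.
def split_sections(text)
  StringIO.new(text).each_line("\n\n").map(&:strip)
end

# Sample content with a blank line between the three sections.
sample = "new_file\nSome string.\nend\n\nText in the middle.\n\nnew_file\nAnother text.\nend\n"
split_sections(sample).each { |chunk| p chunk }
```

Each chunk comes back as one string, so the first and third chunks are the two blocks and the second is the text in the middle that gets discarded.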
If the format is fixed, with exactly one line between the keywords, you could also try this pattern: (new_file\n.*\nend)
cat infile
abc 123 678
sda 234 345 321
xyz 234 456 678
I need to grep the file for the keyword sda and report its first and last columns:
sda has the value of 321
If you know bash, I need a Ruby function that does the same as this awk script:
awk '/sda/{print $1 " has the value of " $NF}' infile
How about something like this?
File.open("infile", "r").each_line do |line|
next unless line =~ /^sda/ # don't process the line unless it starts with "sda"
entries = line.split(" ")
var1 = entries.first
var2 = entries.last
puts "#{var1} has the value of #{var2}"
end
I don't know where you are defining the "sda" matcher. If it's fixed, you can just put it in there.
If not, you might try grabbing it from commandline arguments.
key, *_, value = line.split
next unless key == 'sda' # or "next if key != 'sda'"
puts your_string
Alternatively, you could use a regexp matcher in the beginning to see if the line starts with 'sda' or not.
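Putting the pieces together, a sketch that stays close to the awk one-liner might look like this. report_matches is a hypothetical name; like /sda/ in awk, it matches the keyword anywhere in the line rather than only in the first column.

```ruby
# Hypothetical helper mirroring the awk one-liner: for every line containing
# keyword, report its first and last whitespace-separated fields.
def report_matches(lines, keyword)
  lines.grep(/#{Regexp.escape(keyword)}/).map do |line|
    fields = line.split
    "#{fields.first} has the value of #{fields.last}"
  end
end

puts report_matches(["abc 123 678\n", "sda 234 345 321\n", "xyz 234 456 678\n"], "sda")
```

For the real file you could pass File.readlines("infile") as the first argument.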