I'm trying to read all the text files from a directory, iterate through every file, and search for strings in the file and delete those lines. E.g.,
sample.txt
#Wrote for the configuration ideas
mag = some , db\m09oi, id\polki
jio = red\po9i8
[\]
#mag = denk
#jio = tea
I want to delete the lines having mag.
Output
#Wrote for the configuration ideas
jio = red\po9i8
[\]
#jio = tea
I tried:
Dir.glob("D:\\my_folder\\*.txt") do |file_name|
value = File.read(file_name)
change = value.gsub!(/[#m]ag/, "")
File.open(file_name, "w") { |file| file.puts change }
end
But the lines aren't removed.
Any suggestions please.
It is better to read the file line by line and if a line contains mag, just omit it, and only write other lines to the new file.
(Credits to @WiktorStribiżew.)
kept = File.readlines(file_name).reject do |line|
  line[/\bmag\b/]
end
File.write(file_name, kept.join)
Here we read the file as an array of lines with File.readlines, reject the lines containing mag as a whole word ("magistrate" won't be matched), join the remaining lines back together, and write the result to the same file. Each line returned by readlines keeps its trailing newline, so join is called without a separator; joining with $/ would insert a blank line after every kept line.
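To cover the original goal of processing every text file in a directory, the same idea could be wrapped in a loop over Dir.glob. This is only a minimal sketch: delete_matching_lines is a hypothetical helper name, and the path in the example call is the one from the question, written with forward slashes because Ruby's Dir.glob documentation notes that backslashes cannot be used as path separators in a glob pattern on Windows.

```ruby
# Remove every line matching `pattern` from each .txt file in `dir`.
# `delete_matching_lines` is a hypothetical helper name, not a library method.
def delete_matching_lines(dir, pattern)
  Dir.glob(File.join(dir, "*.txt")).each do |file_name|
    lines = File.readlines(file_name)
    kept = lines.reject { |line| line.match?(pattern) }
    # Lines from readlines keep their newline, so join needs no separator.
    # Only rewrite files that actually lost a line.
    File.write(file_name, kept.join) if kept.size != lines.size
  end
end

# Example call:
# delete_matching_lines("D:/my_folder", /\bmag\b/)
```

Note that the pattern /\bmag\b/ also removes commented lines such as "#mag = denk", which matches the expected output in the question.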
Related
My bot reads emails one by one from a document.txt file and after login with this email the bot outputs the comments that I have in another file.
The bot can already read the emails, but I want each account to post its own specific comment rather than a repeated one.
So I have in mind the solution of reading a specific line from the comments file.
For example account 1 reads and puts line 1 of the comments file. I want to know how can I read the second line from a comments file.
This is the code part when I read comments one by one but I want to read for example line two or three!
file = 'comments.txt'
File.readlines(file).each do |line|
comment = ["#{line}"]
comment.each { |val|
comment = ["#{val}"]
}
end
File.readlines returns an array, so you can index it or iterate over it however you like:
lines = []
File.readlines(path_to_file, chomp: true).each.with_index(1) do |line, line_number|
lines << (line_number == 2 ? 'Special line' : line)
end
Try the below.
# set the line number to read
line_number = 2 # <== Reading 2nd line
comment = IO.readlines('comments.txt')[line_number-1]
Your code is overwriting the comment variable in each iteration.
I'd write your code like this:
lines = File.readlines('comments.txt')
lines.each do |line|
# entire line
end
In the loop you can do a lot of things with each line. Unfortunately, I don't fully understand what you want to do (one comment vs. multiple, always the same for specific users, etc.), but I hope this helps anyway.
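If the intent is that account 1 posts line 1 of the comments file, account 2 posts line 2, and so on, the lookup might be sketched as below. The helper name comment_for and its arguments are hypothetical, and the one-line-per-account mapping is an assumption about what the question wants.

```ruby
# Return the comment for a given account: account 1 gets line 1,
# account 2 gets line 2, and so on. Returns nil if the file is too short.
# `comment_for` is a hypothetical helper name.
def comment_for(account_number, path = "comments.txt")
  # File.foreach yields one line at a time, so reading stops at the match.
  File.foreach(path, chomp: true).with_index(1) do |line, line_number|
    return line if line_number == account_number
  end
  nil
end

# Example call:
# comment = comment_for(2)   # second line of comments.txt
```

Unlike File.readlines(...)[n - 1], this does not load the whole file into memory, which only matters if the comments file is large.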
I have a text file that has around 100 plus entries like out.txt:
domain\1esrt
domain\2345p
yrtfj
tkpdp
....
....
I have to read out.txt, line-by-line and check whether the strings like "domain\1esrt" are present in any of the files under a different directory. If present delete only that string occurrence and save the file.
I know how to read a file line-by-line and also know how to grep for a string in multiple files in a directory but I'm not sure how to join those two to achieve my above requirement.
You can create an array with all the words or strings you want to find and then delete/replace:
strings_to_delete = ['aaa', 'domain\1esrt', 'delete_me']
Then read the file and use map to build an array of all the lines that don't match any of the elements in the array created before:
# read the file 'text.txt'
lines = File.open('text.txt', 'r').map do|line|
# unless the line matches with some value on the strings_to_delete array
line unless strings_to_delete.any? do |word|
word == line.strip
end
# then remove the nil elements
end.reject(&:nil?)
And then open the file again but this time to write on it, all the lines which didn't match with the values in the strings_to_delete array:
File.open('text.txt', 'w') do |file|
  lines.each do |element|
    file.write element
  end
end
The txt file looks like:
aaa
domain\1esrt
domain\2345p
yrtfj
tkpdp
....
....
delete_me
I don't know how well it'll work with a bigger file, but I hope it helps.
I would suggest using gsub here. It will run a regex search on the string and replace it with the second parameter. So if you only have to replace any single string, I believe you can simply run gsub on that string (including the newline) and replace it with an empty string:
new_file_text = text.gsub(/regex_string\n/, "")
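Putting the two halves of the question together (read the strings from out.txt, then remove them from files in another directory) might look like the sketch below. strip_listed_strings is a hypothetical helper name, and the sketch assumes only the files directly under the directory should be processed and that they are plain text.

```ruby
# Remove every occurrence of the strings listed in `list_path` (one per line)
# from each regular file directly under `dir`.
# `strip_listed_strings` is a hypothetical helper name.
def strip_listed_strings(list_path, dir)
  needles = File.readlines(list_path, chomp: true).map(&:strip).reject(&:empty?)
  return if needles.empty?

  # Regexp.union escapes each string, so "domain\1esrt" is matched literally.
  # Longer strings go first so a short entry cannot shadow a longer one.
  pattern = Regexp.union(needles.sort_by { |s| -s.length })

  Dir.glob(File.join(dir, "*")).each do |file_name|
    next unless File.file?(file_name)

    text = File.read(file_name)
    cleaned = text.gsub(pattern, "")
    # Only save files in which something was actually removed.
    File.write(file_name, cleaned) unless cleaned == text
  end
end

# Example call (directory name is a placeholder):
# strip_listed_strings("out.txt", "some_directory")
```

This deletes only the matched string and leaves the rest of the line in place, as the question asks.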
I am trying to change a file by finding this string:
<aspect name=\"lineNumber\"><![CDATA[{CLONEINCR}]]>
and replacing {CLONEINCR} with an incrementing number. Here's what I have so far:
file = File.open('input3400.txt' , 'rb')
contents = file.read.lines.to_a
contents.each_index do |i|contents.join["<aspect name=\"lineNumber\"><![CDATA[{CLONEINCR}]]></aspect>"] = "<aspect name=\"lineNumber\"><![CDATA[#{i}]]></aspect>" end
file.close
But this seems to go on forever - do I have an infinite loop somewhere?
Note: my text file is 533,952 lines long.
You are repeatedly concatenating all the elements of contents, making a substitution, and throwing away the result. This is happening once for each line, so no wonder it is taking a long time.
The easiest solution would be to read the entire file into a single string and use gsub on that to modify the contents. In your example you are inserting the (zero-based) file line numbers into the CDATA. I suspect this is a mistake.
This code replaces all occurrences of <![CDATA[{CLONEINCR}]]> with <![CDATA[1]]>, <![CDATA[2]]> etc. with the number incrementing for each matching CDATA found. The modified file is sent to STDOUT. Hopefully that is what you need.
File.open('input3400.txt' , 'r') do |f|
i = 0
contents = f.read.gsub('<![CDATA[{CLONEINCR}]]>') { |m|
m.sub('{CLONEINCR}', (i += 1).to_s)
}
puts contents
end
If what you want is to replace CLONEINCR with the line number, which is what your above code looks like it's trying to do, then this will work. Otherwise see Borodin's answer.
output = File.readlines('input3400.txt').map.with_index do |line, i|
line.gsub "<aspect name=\"lineNumber\"><![CDATA[{CLONEINCR}]]></aspect>",
"<aspect name=\"lineNumber\"><![CDATA[#{i}]]></aspect>"
end
File.write('input3400.txt', output.join(''))
Also, you should be aware that when you read the lines into contents, you are creating a String distinct from the file. You can't operate on the file directly. Instead you have to create a new String that contains what you want and then overwrite the original file.
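If the goal is an incrementing counter (rather than the line number) and the result should be saved instead of printed, the build-a-new-string-then-overwrite step could be sketched as follows. number_placeholders is a hypothetical helper name, and starting the count at 1 is an assumption.

```ruby
# Replace each {CLONEINCR} placeholder with 1, 2, 3, ... and save the result.
# `number_placeholders` is a hypothetical helper name.
def number_placeholders(path)
  counter = 0
  text = File.read(path)
  # The block runs once per match, so the counter advances per placeholder.
  numbered = text.gsub("<![CDATA[{CLONEINCR}]]>") do
    counter += 1
    "<![CDATA[#{counter}]]>"
  end
  File.write(path, numbered)
  counter # number of placeholders that were replaced
end

# Example call:
# number_placeholders("input3400.txt")
```

The file is scanned once, so the running time grows with the file size rather than with the square of the line count as in the original loop.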
I'm having a problem reading a text file with readline and comparing its first line with a string. I want to compare the first line of the text file with a string and, if it matches, go on to the next step, but the comparison never succeeds. Here is my code:
doc = File.open("example.txt", "r")
line1 = doc.readline
if line1 == "sukanta"
line2 = doc.readline
line3 = doc.readline
line4 = doc.readline
end
My example.txt file contains:
sukanta
Software engineer
label2
server:107.108.9.190
Please suggest a solution. Also, when I check the string length with line1.length, it doesn't show the number I expect.
I got the answer. It was a silly mistake: I should compare against "sukanta\n".
When I use readline to read each line, I have to assign the lines in sequence; I can't break the order. When I use a loop like
doc = File.open("example.txt", "r")
doc.each_line do |lines|
puts lines
end
I get the whole text as one line and can't separate each line from the others. I need to access the lines out of order. How can I do that?
I suspect you are not taking into account that a line ends with $/ ("\n" on UNIX). So you probably intended
line1 == "sukanta\n"
or
line1.chomp == "sukanta"
You are also not counting $/ in the length, so the number you expect is one or two characters shorter than what line1.length reports, depending on the OS.
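To get the lines without their terminators and in any order, one option is to load them into an array with chomp: true and index into it. This is a rough sketch: read_profile is a hypothetical helper name, and the hash keys are only guesses at what the four lines of example.txt mean.

```ruby
# Read the file into an array of lines with the line terminator stripped,
# then pick lines by index in whatever order is needed.
# `read_profile` and the hash keys are hypothetical names.
def read_profile(path, expected_name)
  lines = File.readlines(path, chomp: true)
  # With chomp: true the first line compares equal to the bare string.
  return nil unless lines[0] == expected_name

  { server: lines[3], role: lines[1], label: lines[2] }
end

# Example call:
# profile = read_profile("example.txt", "sukanta")
```

Because the lines are in an array, there is no need to call readline in sequence.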
I have a sample position file like below.
789754654 COLORA SOMETHING1 19370119FYY076 2342423234SS323423
742784897 COLORB SOMETHING2 20060722FYY076 2342342342SDFSD3423
I am interested in positions 54-61 (4th column). I want to change the date to be a different format. So final outcome will be:
789754654 COLORA SOMETHING1 01191937FYY076 2342423234SS323423
742784897 COLORB SOMETHING2 07222006FYY076 2342342342SDFSD3423
The columns are separated by spaces, not tabs, and the final file should have exactly the same number of spaces as the original file. The only thing that should change is the date format. How can I do this? I wrote a script, but it loses the original spaces, so the positioning gets messed up.
file.each_line do |line|
dob = line.split(" ")
puts dob[3] #got the date. change its format
5.times { puts "**" }
end
Can anyone suggest a better strategy so that positioning in the original file remains the same?
You can use line[range] to replace part of a line, leaving the rest the same.
#!/usr/bin/ruby1.8
line = "789754654 COLORA SOMETHING1 19370119FYY076 2342423234SS323423"
line[44..57] = "01191937FYY076"
p line
# => "789754654 COLORA SOMETHING1 01191937FYY076 2342423234SS323423"
I would:
Read the file in
Split the lines like you have
Store the data in an array (temporarily)
Do all the date changes, etc. you want to that array
Make a method that knows how to output your data correctly (with spaces) to turn the array back into a string
Print out what that method gives you
Take a look at String#ljust and/or String#rjust for making the conversion method.
You can use String#sub for a simple search/replace.
>> s.sub(/(\d{8}FYY\d{3})(\s*)/){ "Original: '#$1', Spaces: '#$2'" }
=> "789754654 COLORA SOMETHING1 Original: '19370119FYY076', Spaces: ' '2342423234SS323423"
Of course, in your case you'd output the reformatted date.
>> s.sub(/(\d{8}FYY\d{3})/){ $1.reverse }
=> "789754654 COLORA SOMETHING1 670YYF91107391 2342423234SS323423"
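For the actual date change, the block could rearrange the captured year, month, and day instead of reversing the string. A minimal sketch, assuming the date is always the eight digits immediately before "FYY" plus three digits as in the sample rows; reformat_date is a hypothetical helper name.

```ruby
# Rewrite YYYYMMDD as MMDDYYYY for the date that precedes "FYY" + 3 digits.
# The replacement has the same length as the match, so every other
# character (including runs of spaces) stays at its original position.
# `reformat_date` is a hypothetical helper name.
def reformat_date(line)
  line.sub(/(\d{4})(\d{2})(\d{2})(?=FYY\d{3})/) { "#{$2}#{$3}#{$1}" }
end

# Example: apply it to every line of a file object named `file`.
# file.each_line { |line| puts reformat_date(line) }
```

Lines that do not contain such a date are returned unchanged.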
Use a regex to break the line out into its constituent parts, then put them back together in the order you require.
lines = [
'789754654 COLORA SOMETHING1 19370119FYY076 2342423234SS323423',
'742784897 COLORB SOMETHING2 20060722FYY076 2342342342SDFSD3423'
]
rx = Regexp.new(/^(\d{9})(\s+)(\S+)(\s+)(\S+)(\s+)(\d{4})(\d{2})(\d{2})(FYY076)(\s+)(\S+)$/)
lines.each do |line|
match = rx.match(line)
puts sprintf("%s%s%s%s%s%s%s%s%s%s%s%s",
match[1], match[2], match[3], match[4], match[5], match[6],
match[8], match[9], match[7], match[10],match[11],match[12]
)
end