I have a file ImageContainer.xml with the following text:
<leftArrowImage>/apps/mcaui/PAL/Arrows/C0004OptionNavArrowLeft.png</leftArrowImage>
<rightArrowImage>/apps/mcaui/PAL/Arrows/C0003OptionNavArrowRight.png</rightArrowImage>
Now, I am searching for C0004OptionNavArrowLeft.png and C0003OptionNavArrowRight.png in that file.
Code is:
@LangFileName = "ZZZPNG.txt"
fileLangInput = File.open(@LangFileName)
fileLangInput.each_line do |varStrTextSearch|
puts "\nSearching ==>" + varStrTextSearch
Dir.glob("**/*.*") do |file_name|
fileSdfInput = File.open(file_name)
fileSdfInput.each_line do |line|
if line.include?"#{varStrTextSearch}"
puts"Found"
end
end
end
end
Here varStrTextSearch is a string variable that takes a different value on each iteration.
The problem is that it finds C0004OptionNavArrowLeft.png but not C0003OptionNavArrowRight.png.
Can someone tell me what I am doing wrong?
My guess is, newline chars are the problem.
fileLangInput.each_line do |varStrTextSearch|
varStrTextSearch here will contain a \n char at the end. And if your XML is not consistently formatted (for example, like this)
<leftArrowImage>
/apps/mcaui/PAL/Arrows/C0004OptionNavArrowLeft.png
</leftArrowImage>
<rightArrowImage>/apps/mcaui/PAL/Arrows/C0003OptionNavArrowRight.png</rightArrowImage>
Then your problem can be reproduced (there's no newline char after "C0003OptionNavArrowRight", so it can't be found).
Solution? Remove the unwanted whitespace.
fileSdfInput.each_line do |line|
if line.include? varStrTextSearch.chomp # read the docs on String#chomp
puts"Found"
end
end
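A minimal sketch of the whole search with that fix applied, assuming the search terms sit one per line in a list file. The method name find_terms and its arguments are made up for illustration and are not part of the original script; chomp: true needs Ruby 2.4 or later.

```ruby
# Hypothetical helper: returns { term => [files that contain it] }.
def find_terms(list_path, glob_pattern)
  # chomp: true drops the trailing line terminator from every search term
  terms = File.readlines(list_path, chomp: true).reject(&:empty?)
  hits = Hash.new { |hash, term| hash[term] = [] }

  Dir.glob(glob_pattern).each do |file_name|
    next unless File.file?(file_name) # skip directories matched by the glob
    File.foreach(file_name) do |line|
      terms.each { |term| hits[term] << file_name if line.include?(term) }
    end
  end
  hits.transform_values(&:uniq)
end
```

Called as find_terms("ZZZPNG.txt", "**/*.xml"), it reads each file once instead of once per search term. Terms that never match simply have no key in the result.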
I'm going through a file line by line and looking for the word "BIOS" in each line because each line that contains the word BIOS has a version number I need. After I see that the line contains the word "BIOS" I want to take the entire line, and split it into an array. Here's my code:
File.open(file).each do |line|
if line.includes? 'BIOS'
array = line.split(" ")
end
end
The problem I'm having is that I keep getting an error, saying that "includes?" is an undefined method. Is there a better way to parse each line of this file for a specific string?
That's right, includes? is not defined for String in Ruby. Use include? instead, or []: https://ruby-doc.org/core-2.2.0/String.html#method-i-include-3F
So the code should be:
File.open(file).each do |line|
if line.include? 'BIOS'
array = line.split(' ')
end
end
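One more thing to watch: a variable first assigned inside the block, like array here, is local to the block and is overwritten on every matching line. A small sketch that collects the fields of every matching line instead, assuming whitespace-separated fields; the helper name fields_for is hypothetical.

```ruby
# Hypothetical helper: one array of whitespace-separated fields per matching line.
def fields_for(path, needle)
  File.foreach(path)
      .select { |line| line.include?(needle) }
      .map(&:split) # split with no argument splits on any whitespace
end
```

fields_for(file, "BIOS") then returns an array of arrays that is still available after the file has been read.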
I tried to find a simple solution to replace a particular character randomly within a file.
Unfortunately, my solution replaces all found characters and not just some of them.
file_names = ['users_controller.rb']
file_names.each do |file_name|
text = File.read(file_name)
new_contents = text.gsub(",", ";") # replaces , with ; (unfortunately all of them and not just some)
puts new_contents
File.open(file_name, "w") {|file| file.puts new_contents }
end
I appreciate any help, thanks.
The question is not clear. If you want to replace some random occurrences of a particular (fixed) character (",") with a particular (fixed) character (";"), then do:
text.gsub(","){rand(2).zero? ? "," : ";"}
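If you want control over how many occurrences change, or repeatable results while testing, the same idea can be wrapped up as below. This is only a sketch: the helper name replace_some and its keyword arguments are invented for illustration.

```ruby
# Hypothetical helper: replaces each occurrence of `from` with `to`
# with the given probability, leaving the other occurrences alone.
def replace_some(text, from, to, probability: 0.5, rng: Random.new)
  text.gsub(from) { |match| rng.rand < probability ? to : match }
end
```

Passing rng: Random.new(42) (any fixed seed) makes the choice of replaced occurrences reproducible from run to run.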
I know that I can replace text as below in a file
File.write(file, File.read(file).gsub(/text/, "text_to_replace"))
Can we also use sub/gsub to:-
Replace a string on a particular line number (useful when there is a same string at different locations in a file)
Example
root#vikas:~# cat file.txt
fix grammatical or spelling errors
clarify meaning without changing it
correct minor mistakes
add related resources or links
root#box27:~#
I want to insert some text at 3rd line
root#vikas:~# cat file.txt
fix grammatical or spelling errors
clarify meaning without changing it
Hello, how are you ?
correct minor mistakes
add related resources or links
root#box27:~
Replace a string on the line just before/after matching a pattern
Example
root#vikas:~# cat file.txt
fix grammatical or spelling errors
clarify meaning without changing it
correct minor mistakes
add related resources or links
root#box27:~#
I want to search 'minor mistakes' and put text 'Hello, how are you ?' before that.
root#vikas:~# cat file.txt
fix grammatical or spelling errors
clarify meaning without changing it
Hello, how are you ?
correct minor mistakes
add related resources or links
root#box27:~
Here is the answer.
File.open("file.txt", "r").each_line do |line|
if line =~ /minor mistakes/
puts "Hello, how are you ?"
end
puts "#{line}"
end
Here is a Ruby one-liner.
ruby -pe 'puts "Hello, how are you ?" if $_ =~ /minor mistakes/' < file.txt
You can find this functionality in a gem like Thor. Check out the documentation for the inject_into_file method here:
http://www.rubydoc.info/github/erikhuda/thor/master/Thor/Actions#inject_into_file-instance_method.
Here is the source code for the method:
https://github.com/erikhuda/thor/blob/067f6638f95bd000b0a92cfb45b668bca5b0efe3/lib/thor/actions/inject_into_file.rb#L24-L32
If you wish to match on line n (offset from zero):
def match_line_i(fname, line_nbr, regex)
  IO.foreach(fname).with_index { |line, i|
    return line[regex] if i == line_nbr }
end
or
    return line.scan(regex) if i == line_nbr }
depending on your requirements.
If you wish to match on a given line, then return the previous line, for application of gsub (or whatever):
def return_previous_line(fname, regex)
  last_line = nil
  IO.foreach(fname) do |line|
    # foreach already yields each line, so no separate readline call is needed
    return last_line if line =~ regex
    last_line = line
  end
end
Both methods return nil if there is no match.
Okay, as there is no such option available with sub/gsub, I am pasting here my code (with slight modifications to BMW's code) for all three options. Hopefully, this helps someone in a similar situation.
Insert text before a pattern
Insert text after a pattern
Insert text at a specific line number
root#box27:~# cat file.txt
fix grammatical or spelling errors
clarify meaning without changing it
correct minor mistakes
add related resources or links
always respect the original author
root#box27:~#
root#box27:~# cat ruby_script
puts "#### Insert text before a pattern"
pattern = 'minor mistakes'
File.open("file.txt", "r").each_line do |line|
puts "Hello, how are you ?" if line =~ /#{pattern}/
puts "#{line}"
end
puts "\n\n#### Insert text after a pattern"
pattern = 'meaning without'
File.open("file.txt", "r").each_line do |line|
found = 'no'
if line =~ /#{pattern}/
puts "#{line}"
puts "Hello, how are you ?"
found = 'yes'
end
puts "#{line}" if found == 'no'
end
puts "\n\n#### Insert text at a particular line"
insert_at_line = 3
line_number = 1
File.open("file.txt", "r").each_line do |line|
puts "Hello, how are you ?" if line_number == insert_at_line
line_number += 1
puts "#{line}"
end
root#box27:~#
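The script above only prints the result. If the file itself should change, one option is to read all lines, splice in the new one, and write everything back. A rough sketch, assuming the file is small enough to hold in memory; the method names insert_before! and insert_at_line! are made up for illustration.

```ruby
# Hypothetical helper: inserts new_line before every line matching pattern
# and writes the result back to the same file.
def insert_before!(path, pattern, new_line)
  updated = File.readlines(path).flat_map do |line|
    line =~ pattern ? ["#{new_line}\n", line] : [line]
  end
  File.write(path, updated.join)
end

# Hypothetical helper: inserts new_line so that it becomes line number
# line_number (counting from 1).
def insert_at_line!(path, line_number, new_line)
  lines = File.readlines(path)
  lines.insert(line_number - 1, "#{new_line}\n")
  File.write(path, lines.join)
end
```

For example, insert_before!("file.txt", /minor mistakes/, "Hello, how are you ?") gives the file shown in the question. Inserting after a pattern is the same with the two array elements swapped.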
I'm writing a short class to extract email addresses from documents. Here is my code so far:
# Class to scrape documents for email addresses
class EmailScraper
EmailRegex = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
def EmailScraper.scrape(doc)
email_addresses = []
File.open(doc) do |file|
while line = file.gets
temp = line.scan(EmailRegex)
temp.each do |email_address|
puts email_address
emails_addresses << email_address
end
end
end
return email_addresses
end
end
if EmailScraper.scrape("email_tests.txt").empty?
puts "Empty array"
else
puts EmailScraper.scrape("email_tests.txt")
end
My "email_tests.txt" file looks like so:
example#live.com
another_example90#hotmail.com
example3#diginet.ie
When I run this script, all I get is the "Empty array" printout. However, when I fire up irb and type in the regex above, strings of email addresses match it, and the String.scan function returns an array of all the email addresses in each string. Why is this working in irb and not in my script?
Several things (some already mentioned and expanded upon below):
\z matches to the end of the string, which with IO#gets will typically include a \n character. \Z (upper case 'z') matches the end of the string unless the string ends with a \n, in which case it matches just before.
the typo of emails_addresses
using \A and \Z is fine while the entire line is or is not an email address. You say you're seeking to extract addresses from documents, however, so I'd consider using \b at each end to extract emails delimited by word boundaries.
you could use File.foreach()... rather than the clumsy-looking File.open...while...gets thing
I'm not convinced by the Regex - there's a substantial body of work already around:
There's a smarter one here: http://www.regular-expressions.info/email.html (clicking on that odd little in-line icon takes you to a piece-by-piece explanation). It's worth reading the discussion, which points out several potential pitfalls.
Even more mind-bogglingly complex ones may be found here.
class EmailScraper
EmailRegex = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\Z/i # changed \z to \Z
def EmailScraper.scrape(doc)
email_addresses = []
File.foreach(doc) do |line| # less code, same effect
temp = line.scan(EmailRegex)
temp.each do |email_address|
email_addresses << email_address
end
end
email_addresses # "return" isn't needed
end
end
result = EmailScraper.scrape("email_tests.txt") # store it so we don't print them twice if successful
if result.empty?
puts "Empty array"
else
puts result
end
Looks like you're putting the results into emails_addresses, but are returning email_addresses. This would mean that you're always returning the empty array you defined for email_addresses, making the "Empty array" response correct.
You have a typo, try with:
class EmailScraper
EmailRegex = /\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
def EmailScraper.scrape(doc)
email_addresses = []
File.open(doc) do |file|
while line = file.gets
temp = line.scan(EmailRegex)
temp.each do |email_address|
puts email_address
email_addresses << email_address
end
end
end
return email_addresses
end
end
if EmailScraper.scrape("email_tests.txt").empty?
puts "Empty array"
else
puts EmailScraper.scrape("email_tests.txt")
end
You used \z at the end. Try \Z instead: according to http://www.regular-expressions.info/ruby.html, the uppercase \Z also matches just before a trailing newline, whereas \z matches only at the very end of the string.
Otherwise, try ^ and $ (matching the start and the end of a line); this worked for me on Regexr.
When you read the file, the end of line is making the regex fail. In irb, there probably is no end of line. If that is the case, chomp the lines first.
regex=/\A[\w+\-.]+#[a-z\d\-.]+\.[a-z]+\z/i
line_from_irb = "example#live.com"
line_from_file = line_from_irb + "\n"
p line_from_irb.scan(regex) # => ["example#live.com"]
p line_from_file.scan(regex) # => []
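To see the anchors from these answers side by side, here is a small self-contained comparison on a string that ends in a newline, the way a line read with IO#gets usually does. The sample string and the anchors hash are just for illustration.

```ruby
line_from_file = "abc\n" # what IO#gets returns for a line ending in a newline

anchors = {
  '\z'            => line_from_file =~ /c\z/,      # only the very end of the string
  '\Z'            => line_from_file =~ /c\Z/,      # the end, or just before a final newline
  '$'             => line_from_file =~ /c$/,       # the end of any line in the string
  'chomp then \z' => line_from_file.chomp =~ /c\z/ # newline removed first
}

anchors.each { |name, index| puts "#{name}: #{index.inspect}" }
```

Only the first entry is nil; the other three match at index 2.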
I'm writing this little HelloWorld as a followup to this and the numbers do not add up
filename = "testThis.txt"
total_bytes = 0
file = File.new(filename, "r")
file.each do |line|
total_bytes += line.unpack("U*").length
end
puts "original size #{File.size(filename)}"
puts "Total bytes #{total_bytes}"
The result is not the same as the file size. I think I just need to know what format I need to plug in... or maybe I've missed the point entirely. How can I measure the file size line by line?
Note: I'm on Windows, and the file is encoded as type ANSI.
Edit: This produces the same results!
filename = "testThis.txt"
total_bytes = 0
file = File.new(filename, "r")
file.each_byte do |whatever|
total_bytes += 1
end
puts "Original size #{File.size(filename)}"
puts "Total bytes #{total_bytes}"
so anybody who can help now...
IO#gets does return each line with its terminator, but on Windows a file opened in text mode (plain "r") has every \r\n on disk translated to a single \n, so each line comes back one byte shorter than it is in the file. That is why neither the per-line count nor each_byte adds up to File.size: the numbers are definitely not going to match up.
See the relevant Pickaxe section
May I enquire why you're so concerned about the line lengths summing to the file size? You may be solving a harder problem than is necessary...
Aha. I think I get it now.
Lacking a handy iPod (or any other sort, for that matter), I don't know if you want exactly 4K chunks, in which case IO#read(4000) would be your friend (4000 or 4096?) or if you're happier to break by line, in which case something like this ought to work:
class Chunkifier
def Chunkifier.to_chunks(path)
chunks, current_chunk_size = [""], 0
File.readlines(path).each do |line|
line.chomp! # strips off \n, \r or \r\n depending on OS
if chunks.last.size + line.size >= 4_000 # 4096?
chunks.last.chomp! # remove last line terminator
chunks << ""
end
chunks.last << line + "\n" # or whatever terminator you need
end
chunks
end
end
if __FILE__ == $0
require 'test/unit'
class TestFile < Test::Unit::TestCase
def test_chunking
chs = Chunkifier.to_chunks(PATH)
chs.each do |chunk|
assert 4_000 >= chunk.size, "chunk is #{chunk.size} bytes long"
end
end
end
end
Note the use of IO#readlines to get all the text in one slurp: #each or #each_line would do as well. I used String#chomp! to ensure that, whatever the OS is doing, the bytes at the end are removed, so that \n or whatever can be forced into the output.
I would suggest using File#write, rather than #print or #puts for the output, as the latter have a tendency to deliver OS-specific newline sequences.
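For the other option, exact fixed-size chunks, a sketch along these lines might do. It assumes 4096-byte chunks and binary mode so that no newline translation interferes; the helper name fixed_size_chunks is hypothetical.

```ruby
# Hypothetical helper: splits a file into chunks of at most chunk_size bytes.
def fixed_size_chunks(path, chunk_size = 4096)
  chunks = []
  File.open(path, "rb") do |file|
    # IO#read(n) returns nil at end of file, which ends the loop
    while (chunk = file.read(chunk_size))
      chunks << chunk
    end
  end
  chunks
end
```

Every chunk except possibly the last is exactly chunk_size bytes, and joining the chunks gives back the original file contents. Note that a chunk boundary can fall in the middle of a line or of a multi-byte character.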
If you're really concerned about multi-byte characters, consider taking the each_byte or unpack("C*") options and monkey-patching String, something like this:
class String
def size_in_bytes
self.unpack("C*").size
end
end
The unpack version is about 8 times faster than the each_byte one on my machine, btw.
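On Ruby 1.9 and later the monkey patch should not be necessary, since String#bytesize is built in. A quick check, using a made-up sample string with one two-byte character:

```ruby
text = "caf\u00e9" # "café"; the last character takes two bytes in UTF-8

puts text.length            # number of characters
puts text.bytesize          # number of bytes
puts text.unpack("C*").size # number of bytes again, via the unpack route
```

The first line prints 4 and the other two print 5.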
You might try IO#each_byte, e.g.
total_bytes = 0
file_name = "test_this.txt"
File.open(file_name, "r") do |file|
file.each_byte {|b| total_bytes += 1}
end
puts "Original size #{File.size(file_name)}"
puts "Total bytes #{total_bytes}"
That, of course, doesn't give you a line at a time. Your best option for that is probably to go through the file via each_byte until you encounter \r\n. The IO class provides a bunch of pretty low-level read methods that might be helpful.
You potentially have several overlapping issues here:
Linefeed characters \r\n vs. \n (as per your previous post). Also EOF file character (^Z)?
Definition of "size" in your problem statement: do you mean "how many characters" (taking into account multi-byte character encodings) or do you mean "how many bytes"?
Interaction of the $KCODE global variable (deprecated in ruby 1.9. See String#encoding and friends if you're running under 1.9). Are there, for example, accented characters in your file?
Your format string for #unpack. I think you want C* here if you really want to count bytes.
Note also the existence of IO#each_line (just so you can throw away the while and be a little more ruby-idiomatic ;-)).
The issue is that when you save a text file on Windows, each line break is two characters (13 and 10) and therefore 2 bytes, whereas on Linux it is only one (character 10). However, Ruby on Windows reports both as the single character "\n" (character 10). What's worse, if you're on Linux reading a Windows file, Ruby will give you both characters.
So, if you know that your files always come from Windows text files and are processed on Windows, you can add 1 to your count every time you get a newline character. Otherwise it's a couple of conditionals and a little state machine.
BTW there's no EOF 'character'.
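An alternative to counting the extra byte by hand is to open the file in binary mode, where no newline translation happens on any platform. A minimal sketch, with a hypothetical helper name:

```ruby
# Hypothetical helper: sums the byte length of every line, terminators included.
def total_line_bytes(path)
  total = 0
  # "rb" disables newline translation, so "\r\n" stays two bytes everywhere
  File.open(path, "rb") do |file|
    file.each_line { |line| total += line.bytesize }
  end
  total
end
```

With the file opened this way, total_line_bytes(path) should equal File.size(path), and each line still ends in whatever terminator is actually on disk.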
f = File.new("log.txt")
begin
while (line = f.readline)
line = line.chomp # chomp returns a new string, so keep the result
puts line.length
end
rescue EOFError
f.close
end
Here is a simple solution, presuming that the current file pointer is set to the start of a line in the read file:
last_pos = file.pos
next_line = file.gets
current_pos = file.pos
backup_dist = last_pos - current_pos
file.seek(backup_dist, IO::SEEK_CUR)
In this example, "file" is the file from which you are reading. To do this in a loop:
last_pos = file.pos
loop do
  next_line = file.gets
  break if next_line.nil? # stop at end of file
  current_pos = file.pos
  backup_dist = last_pos - current_pos
  last_pos = current_pos
  # the first pass over a line rewinds (a peek); the second pass moves on
  file.seek(backup_dist, IO::SEEK_CUR)
end
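The same idea can be packaged as a small method that remembers the position, reads a line, and seeks straight back with an absolute offset. This is only a sketch: the name peek_line is invented, and it assumes a seekable file (not a pipe or socket).

```ruby
# Hypothetical helper: returns the next line without consuming it
# (nil at end of file).
def peek_line(file)
  start = file.pos
  line = file.gets
  file.seek(start, IO::SEEK_SET) # absolute seek back to where we started
  line
end
```

After peek_line(file), the next file.gets returns the same line again.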