data = File.read("data.txt")
if(data.to_s.eql? "hello")
...
end
My data.txt contains "hello" as well, so the if block should run, but it doesn't. What am I doing wrong?
When reading data from a file, it's likely you get a newline tagged on to the end of your data.
For example, if I run the following in the terminal to create a file containing only the word 'hello':
echo hello > data.txt
And then read this in the terminal, I see:
cat data.txt
# => hello
However, jumping into irb, I get the following:
File.read("data.txt")
# => "hello\n"
The \n is the newline character.
To solve your question, you can use:
if data.chomp == "hello"
...
end
chomp removes any record separator from the end of the string, giving you the comparison you're after.
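As a minimal sketch of that comparison (the helper name same_ignoring_newline? is made up for illustration and is not part of the question's code; the string literal stands in for what File.read would return):

```ruby
# Hypothetical helper: true when the text equals `expected`
# once a single trailing line ending has been removed.
def same_ignoring_newline?(text, expected)
  text.chomp == expected
end

raw = "hello\n" # stands in for File.read("data.txt") after `echo hello > data.txt`
p raw == "hello"                       # false, the trailing newline gets in the way
p same_ignoring_newline?(raw, "hello") # true
p "hello\r\n".chomp                    # "hello", chomp also drops a Windows-style ending
```

Note that chomp removes only one trailing line ending, so a file ending in several blank lines would still need strip or a similar cleanup.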
If you just want to know whether the file contains the specified string, you can also use:
data['hello']
This returns the matched substring (a truthy value) if it is present, or nil if not.
if data["hello"]
...
end
Related
I have a text file like this:
User accounts for \\AGGREP-1
-------------------------------------------------------------------------------
Administrator users grzesieklocal
Guest scom SUPPORT_8855
The command completed successfully.
The first line is an empty line. I want to delete every empty line in this file and every line containing the words "User accounts for", "-------", or "The command". I want to keep only the lines containing users. I don't want to simply delete the first 4 lines and the last one, because some systems have more users and the file will then contain more lines.
I load file using
a = IO.readlines("test.txt")
Is there any way to delete lines containing specific words?
Solution
This structure reads the file line by line and writes a new file directly:
def unwanted?(line)
  line.strip.empty? ||
    line.include?('User accounts') ||
    line.include?('-------------') ||
    line.include?('The command completed')
end
File.open('just_users.txt', 'w+') do |out|
  File.foreach('test.txt') do |line|
    out.puts line unless unwanted?(line)
  end
end
If you're familiar with regexp, you could use:
def unwanted?(line)
  line =~ /^(User accounts|------------|The command completed|\s*$)/
end
Warning from your code
The message "warning: string literal in condition" appears when you write:
string = "nothing"
if string.include? "a" or "b"
  puts "FOUND!"
end
It outputs:
parse_text.rb:16: warning: string literal in condition
FOUND!
The condition parses as string.include?("a") or "b", and the literal "b" is always truthy, so the branch always runs. It should be written:
string = 'nothing'
if string.include?('a') || string.include?('b')
  puts "FOUND!"
end
See this question for more info.
IO::readlines returns an array, so you could use Array#select to select just the lines you need. Bear in mind that this means that your whole input file will be in memory, which might be a problem, if the file is really large.
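A rough sketch of that idea, written with Array#reject (the inverse of Array#select). The UNWANTED list and the helper name keep_user_lines are assumptions for illustration, and the sample array stands in for what IO.readlines("test.txt") would return:

```ruby
# Hypothetical marker strings, taken from the question's sample file.
UNWANTED = ["User accounts for", "-------", "The command"].freeze

# Drops blank lines and any line containing one of the markers.
def keep_user_lines(lines)
  lines.reject do |line|
    line.strip.empty? || UNWANTED.any? { |marker| line.include?(marker) }
  end
end

sample = [
  "\n",
  "User accounts for \\\\AGGREP-1\n",
  "-----------------------------\n",
  "Administrator users grzesieklocal\n",
  "Guest scom SUPPORT_8855\n",
  "The command completed successfully.\n"
]
p keep_user_lines(sample)
# => ["Administrator users grzesieklocal\n", "Guest scom SUPPORT_8855\n"]
```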
An alternative approach would be to use IO::foreach, which processes one line at a time:
selected_lines = []
IO.foreach('test.txt') { |line| selected_lines << line if line_matches_your_requirements }
I have a question about the need for the use of \n\n to make a newline.
Please see below examples.
If I do ..
puts "hello"
puts "hi"
or
puts "hello\n"
puts "hi"
The output is..
hello
hi
If I do ..
puts "hello\n\n"
puts "hi"
The output is..
hello

hi
Why do I need \n\n to make one extra newline?
Why doesn't the single \n make any difference?
From the documentation:
puts(obj, ...) → nil
Writes the given objects to ios as with IO#print. Writes a record separator (typically a newline) after any that do not already end with a newline sequence. If called with an array argument, writes each element on a new line. If called without arguments, outputs a single record separator.
The purpose of puts is to ensure the string ends with the newline character.
If there is none, then one newline character will be appended.
If there is one or more, no newline character will be appended.
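To see the raw characters rather than the rendered lines, here is a small sketch that captures what puts writes into a StringIO (the helper name puts_output is made up for this illustration):

```ruby
require "stringio"

# Hypothetical helper: returns exactly what puts would write for the given arguments.
def puts_output(*args)
  io = StringIO.new
  io.puts(*args)
  io.string
end

p puts_output("hello")     # "hello\n"   (newline appended)
p puts_output("hello\n")   # "hello\n"   (already ends in a newline, nothing added)
p puts_output("hello\n\n") # "hello\n\n" (nothing added, so one blank line shows up)
```

The first two calls write identical output, which is why a single trailing \n appears to make no difference.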
The other answers here nailed it.
If you want to avoid the magic handling of \n, try using print instead of puts. print outputs your string literally, with no line ending unless you put it there.
> 3.times { print 'Zap' }
ZapZapZap=> 3
> 3.times { puts 'Zap' }
Zap
Zap
Zap
=> 3
I have a file like this:
some content
some oterh
*********************
useful1 text
useful3 text
*********************
some other content
How do I get the content of the file between the two lines of stars into an array? For example, after processing the above file, the array should look like this:
a=["useful1 text" , "useful2 text"]
A really hacky solution is to split the content on the star lines, grab the middle part, and then split that into lines, too:
content.split(/^\*+$/)[1].lines.map(&:strip).reject(&:empty?)
# => ["useful1 text", "useful3 text"]
f = File.open('test_doc.txt', 'r')
content = []
f.each_line do |line|
content << line.rstrip unless !!(line =~ /^\*(\*)*\*$/)
end
f.close
The regex pattern /^\*(\*)*\*$/ matches lines that contain only asterisks (two or more). !!(line =~ /^\*(\*)*\*$/) always returns a boolean value. So if the pattern does not match, the line is added to the array.
What about this:
def values_between(array, separator)
  array.slice((array.index(separator) + 1)..(array.rindex(separator) - 1))
end
filepath = '/tmp/test.txt'
lines = %w(trash trash separator content content separator trash)
separator = "separator\n"
File.write filepath, lines.join("\n")
values_between File.readlines(filepath), separator
#=> ["content\n", "content\n"]
I'd do it like this:
lines = []
File.foreach('./test.txt') do |li|
  lines << li if (li[/^\*{5}/] ... li[/^\*{5}/])
end
lines[1..-2].map(&:strip).select{ |l| l > '' }
# => ["useful1 text", "useful3 text"]
/^\*{5}/ means "a line that starts with at least five '*' characters".
... is one of two uses of .. and ... and, in this use, is commonly called a "flip-flop" operator. It isn't used often in Ruby because most people don't seem to understand it. It's sometimes mistaken for the Range delimiters .. and ....
In this use, Ruby watches for the first test, li[/^\*{5}/], to return true. Once it does, .. or ... will return true until the second condition returns true. In this case we're looking for the same delimiter, so the same test will work, li[/^\*{5}/], and this is where the difference between the two versions, .. and ..., comes into play.
.. evaluates the second test on the same line that satisfied the first, so it toggles back to false immediately, whereas ... waits until the next line to look at the second test, which avoids the problem of the first test seeing a delimiter and then the second seeing the same line and triggering.
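A small sketch of that difference, using an in-memory array instead of a file. The helper names two_dot and three_dot are invented for this illustration, and "***" stands in for the delimiter line:

```ruby
# Hypothetical helpers: each keeps the items for which its flip-flop is "on".
def two_dot(items)
  items.select { |s| true if (s.start_with?("***"))..(s.start_with?("***")) }
end

def three_dot(items)
  items.select { |s| true if (s.start_with?("***"))...(s.start_with?("***")) }
end

items = ["a", "***", "x", "y", "***", "b"]
p two_dot(items)   # => ["***", "***"]            switches off on the same element
p three_dot(items) # => ["***", "x", "y", "***"]  stays on until the next delimiter
```

With the same test on both sides, only the three-dot form captures the content between the delimiters.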
That lets the test assign to lines, which, prior to the [1..-2].map(&:strip).select{ |l| l > '' } looks like:
# => ["*********************\n",
# "\n",
# "useful1 text\n",
# "\n",
# "useful3 text\n",
# "\n",
# "*********************\n"]
[1..-2].map(&:strip).select{ |l| l > '' } cleans that up. Slicing the array with [1..-2] removes the first and last elements (the delimiter lines). strip removes leading and trailing whitespace, which gets rid of the trailing newlines and leaves empty strings and strings containing the desired text. select{ |l| l > '' } keeps the strings that compare greater than the empty string, i.e., the ones that are not empty.
See "When would a Ruby flip-flop be useful?" and its related questions, and "What is a flip-flop operator?" for more information and some background. (Perl programmers use .. and ... often, for just this purpose.)
One warning though: If the file has multiple blocks delimited this way, you'll get the contents of them all. The code I wrote doesn't know how to stop until the end-of-file is reached, so you'll have to figure out how to handle that situation if it could occur.
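One possible way to handle several delimited blocks is to track the state by hand and collect each block separately. This is only a sketch under assumptions: the helper name star_blocks and the sample text are invented here, and a block that is opened but never closed is silently dropped.

```ruby
# Hypothetical helper: returns one array of stripped, non-empty lines
# per block enclosed by a pair of star lines.
def star_blocks(lines, delimiter = /^\*{5}/)
  blocks = []
  current = nil # nil means "currently outside a block"
  lines.each do |line|
    if line =~ delimiter
      if current          # closing delimiter: store the finished block
        blocks << current
        current = nil
      else                # opening delimiter: start a new block
        current = []
      end
    elsif current
      text = line.strip
      current << text unless text.empty?
    end
  end
  blocks
end

sample = <<~TEXT.lines
  some content
  *********************
  useful1 text

  useful3 text
  *********************
  some other content
  **********
  second block
  **********
TEXT
p star_blocks(sample)
# => [["useful1 text", "useful3 text"], ["second block"]]
```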
I need to match a line in a string read from a text file and wrap that captured line with a character.
For example imagine a text file as such:
test
foo
test
bar
I would like to use gsub to output:
XtestX
XfooX
XtestX
XbarX
I'm having trouble matching a line though. I've tried using regex starting with ^ and ending with $, but it doesn't seem to work. Any ideas?
I have a text file that has the following in it:
test
foo
test
bag
The text file is being read in as a command line argument.
So I got
string = IO.read(ARGV[0])
string = string.gsub(/^(test)$/,'X\1X')
puts string
It outputs the exact same thing that is in the text file.
If you're trying to match every line, then
gsub(/^.*$/, 'X\&X')
does the trick. If you only want to match certain lines, then replace .* with whatever you need.
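A short sketch of that substitution on an in-memory string. The helper names are made up for illustration; the second one shows the equivalent form with a capture group and \1:

```ruby
# Hypothetical helpers; both wrap every line of `text` in X.
def wrap_with_whole_match(text)
  text.gsub(/^.*$/, 'X\&X')   # \& stands for the entire match
end

def wrap_with_group(text)
  text.gsub(/^(.*)$/, 'X\1X') # \1 stands for the first capture group
end

sample = "test\nfoo\ntest\nbar"
p wrap_with_whole_match(sample) # => "XtestX\nXfooX\nXtestX\nXbarX"
p wrap_with_group(sample)       # => "XtestX\nXfooX\nXtestX\nXbarX"
```

In Ruby, ^ and $ anchor at every line inside the string, which is why a single gsub call handles all lines.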
Update:
Replacing your gsub with mine:
string = IO.read(ARGV[0])
string = string.gsub(/^.*$/, 'X\&X')
puts string
I get:
$ gsub.rb testfile
XtestX
XfooX
XtestX
XbarX
Update 2:
As per @CodeGnome, you might try adding chomp:
IO.readlines(ARGV[0]).each do |line|
  puts "X#{line.chomp}X"
end
This works equally well for me. My understanding of ^ and $ in regular expressions was that chomping wouldn't be necessary, but maybe I'm wrong.
You can do it in one line like this:
IO.write(filepath, File.open(filepath) { |f| f.read.gsub(/<appId>\d+<\/appId>/, "<appId>42</appId>") })
IO.write truncates the given file by default, so if you read the text first, perform the regex substitution with String#gsub, and return the resulting string from File.open in block mode, it will replace the file's content in one fell swoop.
I like the way this reads, but it can be written in multiple lines too, of course:
IO.write(filepath, File.open(filepath) do |f|
  f.read.gsub(/<appId>\d+<\/appId>/, "<appId>42</appId>")
end
)
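The same read-then-overwrite idea can be sketched as two explicit steps. The helper name replace_app_id and the sample file contents are assumptions for illustration, and a temporary file is used so the example does not touch real data:

```ruby
require "tempfile"

# Hypothetical helper: rewrites every <appId>NNN</appId> in the file at `path`.
def replace_app_id(path, new_id)
  # Read the whole file first, then overwrite it with the substituted text.
  updated = File.read(path).gsub(/<appId>\d+<\/appId>/, "<appId>#{new_id}</appId>")
  File.write(path, updated) # File.write truncates the file before writing
end

Tempfile.create("config") do |f|
  f.write("<config><appId>7</appId></config>\n")
  f.flush
  replace_app_id(f.path, 42)
  puts File.read(f.path) # prints <config><appId>42</appId></config>
end
```

Both versions hold the entire file in memory, which is fine for small configuration files but worth reconsidering for very large ones.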
If your file is input.txt, I'd do it as follows:
File.open("input.txt") do |file|
  file.each_line do |line|
    puts line.gsub(/^(.*)$/, 'X\1X')
  end
end
(.*) captures any run of characters as a group
\1 in the replacement string refers to that captured group
If you prefer to do it in one line on the whole content, you can do it as follows:
File.read("input.txt").gsub(/^(.*)$/, 'X\1X')
string.gsub(/^(matchline)$/, 'X\1X')
This uses a backreference (\1) to get the first capture group of the regex and surrounds it with X.
Example:
string = "test\nfoo\ntest\nbar"
string.gsub!(/^test$/, 'X\&X')
p string
=> "XtestX\nfoo\nXtestX\nbar"
Chomp Line Endings
Your lines probably have newline characters. You need to handle this one way or another. For example, this works fine for me:
$ ruby -ne 'puts "X#{$_.chomp}X"' /tmp/corpus
XtestX
XfooX
XtestX
XbarX
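One plausible cause, sketched below, is Windows-style \r\n line endings: $ matches just before \n, so a stray \r between the word and the line end keeps /^(test)$/ from matching. Whether the asker's file really contains \r\n is an assumption, and the helper name wrap_test_lines is invented for this example:

```ruby
# Hypothetical helper: wraps lines that read "test", tolerating an optional "\r".
def wrap_test_lines(text)
  text.gsub(/^(test)(\r?)$/, 'X\1X\2') # \2 puts the carriage return back, if any
end

crlf = "test\r\nfoo\r\n" # assumed Windows-style input

p crlf.gsub(/^(test)$/, 'X\1X') # => "test\r\nfoo\r\n"   (unchanged, "\r" blocks the match)
p wrap_test_lines(crlf)         # => "XtestX\r\nfoo\r\n"
p wrap_test_lines("test\nfoo")  # => "XtestX\nfoo"
```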
I thought this code would work, but the regular expression doesn't ever match the \r\n. I have viewed the data I am reading in a hex editor and verified there really is a hex D and hex A pattern in the file.
I have also tried the regular expressions /\xD\xA/m and /\x0D\x0A/m but they also didn't match.
This is my code right now:
lines2 = lines.gsub( /\r\n/m, "\n" )
if ( lines == lines2 )
  print "still the same\n"
else
  print "made the change\n"
end
In addition to alternatives, it would be nice to know what I'm doing wrong (to facilitate some learning on my part). :)
Use String#strip
Returns a copy of str with leading and trailing whitespace removed.
e.g
" hello ".strip #=> "hello"
"\tgoodbye\r\n".strip #=> "goodbye"
Using gsub
string = string.gsub(/\r/," ")
string = string.gsub(/\n/," ")
Generally when I deal with stripping \r or \n, I'll look for both by doing something like
lines.gsub(/\r\n?/, "\n")
I've found that depending on how the data was saved (the OS used, editor used, Jupiter's relation to Io at the time) there may or may not be the newline after the carriage return. It does seem weird that you see both characters in hex mode. Hope this helps.
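A quick sketch of what that pattern does to each style of line ending (the helper name normalize_newlines is made up for illustration):

```ruby
# Hypothetical helper: turns "\r\n" and lone "\r" into "\n", leaving "\n" alone.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

p normalize_newlines("dos\r\nold mac\runix\n") # => "dos\nold mac\nunix\n"
```

Because the \n after \r is optional in the pattern, a \r\n pair becomes a single \n rather than two.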
If you are using Rails, there is a squish method:
"\tgoodbye\r\n".squish # => "goodbye"
"\tgood \t\r\nbye\r\n".squish # => "good bye"
What do you get when you do puts lines? That will give you a clue.
By default File.open opens the file in text mode, so on Windows your \r\n sequences will be automatically converted to \n. Maybe that's the reason lines is always equal to lines2. To prevent Ruby from translating the line endings, use the rb mode:
C:\> copy con lala.txt
a
file
with
many
lines
^Z
C:\> irb
irb(main):001:0> text = File.open('lala.txt').read
=> "a\nfile\nwith\nmany\nlines\n"
irb(main):002:0> bin = File.open('lala.txt', 'rb').read
=> "a\r\nfile\r\nwith\r\nmany\r\nlines\r\n"
irb(main):003:0>
But from your question and code I see you simply need to open the file with the default modifier. You don't need any conversion and may use the shorter File.read.
modified_string = string.gsub(/\s+/, ' ').strip
lines2 = lines.split.join("\n")
"still the same\n".chomp
or
"still the same\n".chomp!
http://www.ruby-doc.org/core-1.9.3/String.html#method-i-chomp
How about the following?
irb(main):003:0> my_string = "Some text with a carriage return \r"
=> "Some text with a carriage return \r"
irb(main):004:0> my_string.gsub(/\r/,"")
=> "Some text with a carriage return "
irb(main):005:0>
Or...
irb(main):007:0> my_string = "Some text with a carriage return \r\n"
=> "Some text with a carriage return \r\n"
irb(main):008:0> my_string.gsub(/\r\n/,"\n")
=> "Some text with a carriage return \n"
irb(main):009:0>
I think your regex is almost complete - here's what I would do:
lines2 = lines.gsub(/[\r\n]+/m, "\n")
In the above, I've put \r and \n into a character class (that way it doesn't matter in which order they might appear) and added the "+" quantifier (so that "\r\n\r\n\r\n" would also match once, and the whole thing be replaced with "\n").
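One consequence worth seeing side by side: because the + quantifier swallows consecutive line endings, blank lines disappear as well. A small sketch with invented helper names:

```ruby
# Hypothetical helpers comparing the two substitutions.
def collapse_line_endings(text)
  text.gsub(/[\r\n]+/, "\n") # any run of \r and \n becomes one \n
end

def convert_crlf_only(text)
  text.gsub(/\r\n/, "\n")    # each \r\n pair becomes one \n
end

sample = "a\r\n\r\nb\r\n"
p collapse_line_endings(sample) # => "a\nb\n"   (the blank line is gone)
p convert_crlf_only(sample)     # => "a\n\nb\n" (the blank line is kept)
```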
Just another variant:
lines.delete(" \n")
Why not read the file in text mode, rather than binary mode?
lines.map(&:strip).join(" ")
You can use this :
my_string.strip.gsub(/\s+/, ' ')
def dos2unix(input)
  # 13 is the byte value of "\r"
  input.each_byte.map { |c| c.chr unless c == 13 }.join
end
remove_all_the_carriage_returns = dos2unix(some_blob)
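A shorter alternative sketch uses String#delete, which works on characters rather than bytes and so leaves multibyte text intact. The helper name strip_carriage_returns is an assumption for this example and is not from the answer above:

```ruby
# Hypothetical helper: removes every "\r" from the input string.
def strip_carriage_returns(input)
  input.delete("\r")
end

p strip_carriage_returns("line one\r\nline two\r\n") # => "line one\nline two\n"
```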