How do you loop through a multiline string in Ruby? - ruby

Pretty simple question from a first-time Ruby programmer.
How do you loop through a slab of text in Ruby? Every time a newline is met, I want to restart the inner loop.
def parse(input)
...
end

String#each_line
str.each_line do |line|
#do something with line
end

What Iraimbilanja said.
Or you could split the string at new lines:
str.split(/\r?\n|\r/).each { |line| … }
Beware that each_line keeps the line feed chars, while split eats them.
Note the regex I used here will take care of all three line ending formats. String#each_line separates lines by the optional argument sep_string, which defaults to $/, which itself defaults to "\n" simply.
Lastly, if you want to do more complex string parsing, check out the built-in StringScanner class.
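The difference between each_line and split can be seen in a small sketch. The sample string below is made up for illustration and mixes all three line-ending styles:

```ruby
# Hypothetical sample containing \r\n, \n and a bare \r
text = "one\r\ntwo\nthree\rfour"

# each_line splits on "\n" only (the default separator) and keeps the terminator
kept = text.each_line.to_a

# split with the regex recognises \r\n, \n and \r, and drops the terminators
stripped = text.split(/\r?\n|\r/)

p kept      # => ["one\r\n", "two\n", "three\rfour"]
p stripped  # => ["one", "two", "three", "four"]
```

Note that each_line leaves the bare "\r" inside the last element, because it only looks for "\n" by default.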

You can also do this with any pattern:
str.scan(/\w+/) do |w|
#do something
end

str.each_line(chomp: true) do |line|
# do something with a clean line without line feed characters
end
I think this should take care of the newlines. The chomp: true option needs Ruby 2.4 or later.

Related

How can I remove a newline character from the line above in Ruby?

I have a string which has been split anytime that a single line goes over 69 characters. In order to process it, I would like to restore it to how it was pre-split. A split line always starts with a forward slash character on the second and subsequent lines, which needs to be kept. Is there a nice Ruby way to do this?
# Split version
GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T
/002GTS////gts
# Required output
GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T/002GTS////gts
I'm happy matching a line that starts with the forward slash. What I don't know is how to remove the newline character from end of the previous line.
example.lines.each_with_index do |line, index|
if line.match(/^\/.+$/)
# what goes here?
end
end
I would use gsub:
string = "GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T\n/002GTS////gts"
string.gsub("\n/", '/')
#=> "GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T/002GTS////gts"
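You can also use lstrip, which removes all whitespace (spaces, newlines...) from the left of a string. Here the newline sits at the end of the previous line, so the same idea is applied with rstrip! while the lines are joined back together:
result = ""
example.lines.each do |line|
# a continuation line starts with "/": strip the newline that ends the previous line
result.rstrip! if line.match(/^\/.+$/)
result << line
end
.strip removes whitespace (spaces, newlines...) from both ends of the string.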
Another way (but I like @spickermann's answer better):
str = "GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T
/002GTS////gts"
str.split("\n/").join("/")
#=> "GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T/002GTS////gts"
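If the text holds several records, a lookahead keeps the replacement limited to newlines that are followed by a slash. This is only a sketch: the extra NEXT/RECORD line is a hypothetical second record added for illustration, and the optional \r is an assumption about possible Windows line endings.

```ruby
# The question's split record, followed by a hypothetical second record
split_text = "GTSS/230028GG/JUL15/LL:123456X3-0051234G4/DES/000G/57NM/57NM/095T\n" \
             "/002GTS////gts\n" \
             "NEXT/RECORD\n"

# Remove a line break only when the next line starts with "/".
# The lookahead (?=\/) checks for the slash without consuming it.
joined = split_text.gsub(/\r?\n(?=\/)/, "")

puts joined
```

The newline before NEXT/RECORD is left alone because that line does not start with a slash.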

Alternative code to read and process array by newline in Ruby

My code is supposed to read a file on the server, store its content in an Array, then read the array elements (eventually each element is a line) and split each line into 7 parts by (:)
I wrote this code and it works 100% fine.
lines = File.readlines('/etc/passwd')
lines.each do |line|
line = line.chomp # remove the trailing \n (chomp! returns nil when there is nothing to remove)
line_arr = line.split(/:/)
puts line_arr.inspect
puts "*************"
end
I just want to know if there is a shortcut to do this, since each element of the array ends with \n.
Maybe I am a bit confused between array elements ending with \n and a string that contains \n.
the content of the file looks like this
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
As for the output, there's no specific format, because I am going to use this part and extend my code later. As long as I can access those 7 parts that I extracted from the line_arr, I should be fine.
thank you
require 'etc'
[].tap {|ary| Etc.passwd {|u|
ary << [u.name, u.passwd, u.uid, u.gid, u.gecos, u.dir, u.shell, u.change,
u.uclass, u.expire]
}}
Rule of thumb: never try to reimplement behavior that someone else has already written for you. Unless you are really, really, really, REALLY smart.
Actually, now that you have edited your question, I don't even see why you need those arrays in the first place and cannot just use the Etc.passwd iterator and Struct::Passwd directly.
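If the goal is only to avoid the manual chomp, the chomp: true keyword (Ruby 2.4 and later) may be enough. The sketch below uses a hypothetical two-line sample string in place of the real file so that it runs anywhere:

```ruby
# Hypothetical sample standing in for the contents of /etc/passwd
sample = "root:x:0:0:root:/root:/bin/bash\n" \
         "daemon:x:1:1:daemon:/usr/sbin:/bin/sh\n"

# chomp: true strips the line terminator while iterating.
# split(":", -1) keeps empty trailing fields, so every record has 7 parts.
records = sample.each_line(chomp: true).map { |line| line.split(":", -1) }

records.each { |fields| puts fields.inspect }
```

With a real file, File.readlines(path, chomp: true) or File.foreach(path, chomp: true) should behave the same way.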

Ruby: How to append to each line of a string based on a given regex?

I want to append </tag> to each line where it's missing:
text = '<tag>line 1</tag>
<tag>line2 # no closing tag, append
<tag>line3 # no closing tag, append
line4</tag> # no opening tag, but has a closing tag, so ignore
<tag>line5</tag>'
I tried to create a regular expression to match this, but I know it's wrong:
text.gsub! /.*?(<\/tag>)Z/, '</tag>'
How can I create a regular expression to conditionally append each line?
Here you go:
text.gsub!(%r{(?<!</tag>)$}, "</tag>")
Explanation:
$ means end of line and \z means end of string. \Z means something similar, with complications.
(?<! and ) work together to create a negative lookbehind: the position matches only if it is not preceded by </tag>.
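A quick way to check the lookbehind is to run it on the question's sample. In this sketch the inline comments from the question are left out of the text, and the string has no trailing newline:

```ruby
text = "<tag>line 1</tag>\n<tag>line2\n<tag>line3\nline4</tag>\n<tag>line5</tag>"

# $ matches at the end of every line; the lookbehind rejects the positions
# that already have </tag> right before them
fixed = text.gsub(%r{(?<!</tag>)$}, "</tag>")

puts fixed
```

Only line2 and line3 receive a closing tag; the other lines already end with one and are left untouched.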
Given the example provided, I'd just do something like this:
text.split(/<\/?tag>/).
reject {|t| t.strip.length == 0 }.
map {|t| "<tag>%s</tag>" % t.strip }.
join("\n")
You're basically treating either <tag> or </tag> as record delimiters, so you can just split on them, reject any blank records, then construct a new combined string from the extracted values. This works nicely when you can't count on newlines being record delimiters and will generally be tolerant of missing tags.
If you're insistent on a pure regex solution, though, and your data format will always match the given format (one record per line), you can use a negative lookbehind:
text.strip.gsub(/(?<!<\/tag>)(\n|$)/, "</tag>\\1")
One that could work is:
/(<tag>[^\n ]+[^>])[\s]*\n/
This matches every line that opens with <tag> and has no ">" before its newline, and captures the line's content.
Replace it with the captured content followed by "</tag>\n", i.e.
text.gsub!( /(<tag>[^\n ]+[^>])[\s]*\n/ , "\\1</tag>\n")
For more polishing, try http://rubular.com/
text = '<tag>line 1</tag>
<tag>line2
<tag>line3
line4</tag>
<tag>line5</tag>'
result = ""
text.each_line do |line|
line.rstrip!
line << "</tag>" if not line.end_with?("</tag>")
result << line << "\n"
end
puts result
--output:--
<tag>line 1</tag>
<tag>line2</tag>
<tag>line3</tag>
line4</tag>
<tag>line5</tag>

Ruby: line by line match range

Is there a way to do the following Perl structure in Ruby?
while( my $line = $file->each_line() ) {
if($line =~ /first_line/ .. /end_line_I_care_about/) {
do_something;
# this will do something on a line per line basis on the range of the match
}
}
In ruby that would read something like:
file.each_line do |line|
if line.match(/first_line/) .. line.match(/end_line_I_care_about/)
do_something;
# this will only do it based on the first match not the range.
end
end
Reading the whole file into memory is not an option and I don't know how big is the chunk of the range.
EDIT:
Thanks for the answers; the answers I got were basically the same as the code I had in the first place. The problem I was having was: "It can test the right operand and become false on the same evaluation it became true (as in awk), but it still returns true once."
"If you don't want it to test the right operand until the next evaluation, as in sed, just use three dots ("...") instead of two. In all other regards, "..." behaves just like ".." does."
I am marking as correct the answer that pointed out that '..' can be turned off in the same evaluation in which it is turned on.
For reference the code I am using is:
file.each_line do |line|
if line.match(/first_line/) ... line.match(/end_line_I_care_about/)
do_something;
end
end
Yes, Ruby supports flip-flops:
str = "aaa
ON
bbb
OFF
cccc
ON
ddd
OFF
eee"
str.each_line do |line|
puts line if line =~ /ON/..line =~ /OFF/
#puts line if line.match(/ON/)..line.match(/OFF/) #works too
end
Output:
ON
bbb
OFF
ON
ddd
OFF
I'm not perfectly clear on the exact semantics of the Perl code, assuming you want exactly the same. Ruby does have something that looks and works similarly, or perhaps identically: a Range as a condition works as a toggle. The code you presented works exactly as I imagine you intend.
There are a few caveats, however:
Even after you reach the end condition, lines will keep being read until you reach the end of the file. This may be a performance consideration if you expect the end condition to be near the beginning of a large file.
The start condition can be triggered multiple times, flipping the "switch" back on, doing your do_something and testing for the end condition again. This may be fine if your condition is specific enough, or if you want that behavior, but it's something to be aware of.
The end condition can be called at the same time the start condition is called giving you true for just one line.
Here's an alternative:
started = false
file.each_line do |line|
started = true if line =~ /first_line_condition/
next unless started
do_something()
break if line =~ /last_line_condition/
end
That code reads each line of the file until the start condition is reached. Then it does whatever processing you like starting with that line until you reach a line that matches your end condition, at which point it breaks out of the loop, reading no more lines from the file.
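The caveat about the end condition firing on the same line as the start condition can be seen by comparing ".." with "...". This is a small sketch; the method names and the sample lines are hypothetical, chosen so that one line matches both patterns:

```ruby
# Hypothetical input: the second line matches both the start and the end pattern
sample_lines = ["a", "BEGIN END", "b", "END", "c"]

def two_dot(lines)
  picked = []
  lines.each do |line|
    # ".." tests the end condition on the same line that switched it on
    picked << line if (line =~ /BEGIN/)..(line =~ /END/)
  end
  picked
end

def three_dot(lines)
  picked = []
  lines.each do |line|
    # "..." waits until the next line before testing the end condition
    picked << line if (line =~ /BEGIN/)...(line =~ /END/)
  end
  picked
end

p two_dot(sample_lines)    # => ["BEGIN END"]
p three_dot(sample_lines)  # => ["BEGIN END", "b", "END"]
```

With two dots the range opens and closes on "BEGIN END" alone, while three dots keep it open until the later "END" line.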
This solution is the closest to your needs. It almost looks like Perl, but this is valid Ruby (although the flip-flop operator is kind of discouraged).
The file is read line by line, it is not fully loaded in memory.
File.foreach("my_file.txt") do |line|
if (line =~ /first_line/) .. (line =~ /end_line_I_care_about/)
do_something
end
end
The parentheses are optional, but they improve readability.

Ruby regex gsub a line in a text file

I need to match a line in the text of an input file and wrap the captured line with a character.
For example imagine a text file as such:
test
foo
test
bar
I would like to use gsub to output:
XtestX
XfooX
XtestX
XbarX
I'm having trouble matching a line though. I've tried using regex starting with ^ and ending with $, but it doesn't seem to work. Any ideas?
I have a text file that has the following in it:
test
foo
test
bag
The text file is being read in as a command line argument.
So I got
string = IO.read(ARGV[0])
string = string.gsub(/^(test)$/,'X\1X')
puts string
It outputs the exact same thing that is in the text file.
If you're trying to match every line, then
gsub(/^.*$/, 'X\&X')
does the trick. If you only want to match certain lines, then replace .* with whatever you need.
Update:
Replacing your gsub with mine:
string = IO.read(ARGV[0])
string = string.gsub(/^.*$/, 'X\&X')
puts string
I get:
$ gsub.rb testfile
XtestX
XfooX
XtestX
XbarX
Update 2:
As per @CodeGnome, you might try adding chomp:
IO.readlines(ARGV[0]).each do |line|
puts "X#{line.chomp}X"
end
This works equally well for me. My understanding of ^ and $ in regular expressions was that chomping wouldn't be necessary, but maybe I'm wrong.
You can do it in one line like this:
IO.write(filepath, File.open(filepath) {|f| f.read.gsub(/<appId>\d+<\/appId>/, "<appId>42</appId>")})
IO.write truncates the given file by default, so if you read the text first, perform the regex String#gsub and return the resulting string using File.open in block mode, it will replace the file's content in one fell swoop.
I like the way this reads, but it can be written in multiple lines too of course:
IO.write(filepath, File.open(filepath) do |f|
f.read.gsub(/<appId>\d+<\/appId>/, "<appId>42</appId>")
end
)
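The read-then-write idea can be tried safely on a temporary file. This sketch assumes a made-up one-line file and uses Tempfile from the standard library; splitting the read and the write into two statements is only for readability:

```ruby
require "tempfile"

# Create a throwaway file with hypothetical content
file = Tempfile.new("appid")
file.write("<appId>7</appId>\n")
file.close
path = file.path

# Read the whole file first, substitute, then overwrite it
updated = File.read(path).gsub(/<appId>\d+<\/appId>/, "<appId>42</appId>")
File.write(path, updated)

result = File.read(path)
puts result

file.unlink # remove the temporary file
```

Because the whole content is read into memory before the file is truncated, this approach suits small files better than very large ones.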
If your file is input.txt, I'd do as following
File.open("input.txt") do |file|
file.each_line do |line|
puts line.gsub(/^(.*)$/, 'X\1X')
end
end
(.*) captures all the characters on the line as a group
\1 in the string replacement is that captured group
If you prefer to do it in one line on the whole content, you can do it as following
File.read("input.txt").gsub(/^(.*)$/, 'X\1X')
string.gsub(/^(matchline)$/, 'X\1X')
Uses a backreference (\1) to get the first capture group of the regex, and surround it with X
Example:
string = "test\nfoo\ntest\nbar"
string.gsub!(/^test$/, 'X\&X')
p string
=> "XtestX\nfoo\nXtestX\nbar"
Chomp Line Endings
Your lines probably have newline characters. You need to handle this one way or another. For example, this works fine for me:
$ ruby -ne 'puts "X#{$_.chomp}X"' /tmp/corpus
XtestX
XfooX
XtestX
XbarX
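One possible reason the original /^(test)$/ pattern changed nothing is Windows line endings. That is an assumption about the asker's file, not something the question confirms. A sketch with a hypothetical \r\n sample:

```ruby
# Assumed input: the same four words, saved with Windows (\r\n) line endings
crlf = "test\r\nfoo\r\ntest\r\nbar\r\n"

# "$" matches only before "\n" or at the end of the string,
# so the "\r" after "test" blocks the match
unchanged = crlf.gsub(/^(test)$/, 'X\1X')

# Allowing an optional "\r" before the end of the line lets the pattern match;
# \2 puts the "\r" back after the closing X
wrapped = crlf.gsub(/^(test)(\r?)$/, 'X\1X\2')

p unchanged == crlf
p wrapped
```

Chomping each line, as suggested above, sidesteps the problem because chomp removes "\r\n" as well as "\n".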
