Ruby puts<<PARAGRAPH - ruby

puts <<PARAGRAPH
There's somthing going on here.
With the PARAGRAPH thing
We'll be able to type as much as we like.
Even 4 lines if we want, or 5, or 6 .
PARAGRAPH
This works, using Notepad++.
But why doesn't this work?
puts <<PARAGRAPH
aaaa Aa
aaa
AA
PARAGRAPH
test.rb:1: syntax error,unexpected tCONSTANT, expecting $end
Thanks!

My guess is that in your second snippet PARAGRAPH is not at the beginning of the line.
Multi-line strings in Ruby are weird that way. The terminating identifier (whatever it may be) must be the first thing on a line to terminate the string; otherwise you will often see syntax errors like this one.

Ensure that PARAGRAPH (the second instance) is a) spelled the same as your first instance, and b) at the start of the line, or change your code to:
def go
  puts <<-PARAGRAPH # hyphen allows the end marker to be indented
    Hi mom!
  PARAGRAPH
end
For more information, read the intro to Strings and the full description.

The code works for me. One way I broke it was by adding space between << and PARAGRAPH
puts << PARAGRAPH
PARAGRAPH
This is different from the next example.
puts <<PARAGRAPH
PARAGRAPH
Edit: As I continued to play with it, I found that PARAGRAPH is just a placeholder like any other. You can do the following and you will still get a paragraph in a string:
puts <<ANYTHING_YOU_WANT
ANYTHING_YOU_WANT
I thought it was cool that it is not restricted only to the word PARAGRAPH. I didn't know.

I can get either version to error by adding additional spaces after the final PARAGRAPH.
Ensure that the closing PARAGRAPH is truly on a new line (per diedthreetimes' answer) and has no trailing characters (i.e. spaces, tabs, etc.)
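To make the terminator rules from these answers concrete, here is a minimal sketch of the three heredoc forms. The method names are made up for illustration, and the squiggly form (`<<~`) assumes Ruby 2.3 or later.

```ruby
# Plain heredoc: the terminator must start in column 0.
def plain
  text = <<PARAGRAPH
line one
line two
PARAGRAPH
  text
end

# Dashed heredoc: the terminator may be indented; body indentation is kept.
def dashed
  <<-PARAGRAPH
    kept indent
  PARAGRAPH
end

# Squiggly heredoc: the terminator may be indented and the common
# leading indentation is stripped from the body.
def squiggly
  <<~PARAGRAPH
    Hi mom!
  PARAGRAPH
end

puts plain.inspect
puts dashed.inspect
puts squiggly.inspect
```

In all three forms the closing identifier must be spelled exactly like the opening one and must not be followed by any trailing characters.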

Related

StringScanner is matching a string as though it was one position back

I'm trying to use a StringScanner to parse a string into tokens for processing later. All was going well until I tested the regex syntax parsing. Regexen look like this:
r|hello|gmi
r:there|there:gmi
r/:(?=[jedi])[sith]:/gmi
r!hello!gmi
That is, r, followed by | (or a couple of other characters, but that's irrelevant right now), followed by the body of the regex -- which can include escaped characters, like \| and \\ -- then another |, and then the flags of the regex.
To look for regex literals, I'm using code that looks an awful lot like this:
require 'strscan'
scanner = StringScanner.new('r|abc| ')
puts "pre-regex: #{scanner.inspect}"
puts "got a char: #{scanner.getch} (res: #{scanner.inspect})"
divider = scanner.getch
puts "got divider: #{divider.inspect}"
puts "mid-regex: #{scanner.inspect}"
# this bit still fails even if you replace `#{divider}' with `|'
res = scanner.scan_until(/(?<![^\\]\\)#{divider}[a-z]*/)
puts "post-regex: #{scanner.inspect}"
if scanner.skip(/\s+/)# || scanner.skip(/;-.*?-;/m)
puts "Success! #{res}"
else
puts "Fail. Ended at: #{scanner.inspect}"
puts "(res was #{res.inspect})"
end
Try it online at ideone
Here, I've trimmed it down as much as I think I can to show the problem clearly. In the real code, it's part of a much larger piece of code, the vast majority of which isn't relevant. I've narrowed down the bug to this part -- you can use the link to see that it's there -- but I can't figure out why this isn't correctly scanning until the next instance of |, then returning the flags.
As a side note, if there's a better way to do this, please let me know. I've found that I quite like StringScanner, but that might be because I'm obsessed with regexen, to the point that I call them regexen.
TL;DR: Why is StringScanner apparently matching as though its position was one character back, and how can I make it work right?
The problem is that Ruby interpolates the string into the regexp literal as is, so a divider of | is read as the alternation operator rather than a literal character. For example:
divider = '|'
/(?<![^\\]\\)#{divider}[a-z]*/
=> /(?<![^\\]\\)|[a-z]*/
To escape the divider, use Regexp.quote:
divider = '|'
/(?<![^\\]\\)#{Regexp.quote divider}[a-z]*/
=> /(?<![^\\]\\)\|[a-z]*/
This modification makes the code pass, but you still need to verify that the divider is a non-word character.
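As a rough illustration of the difference, the sketch below runs both patterns. It uses Regexp.escape, which is an alias of Regexp.quote; the input strings are invented for the example.

```ruby
require 'strscan'

divider = '|'

# Unescaped: "|" is read as alternation, so the first alternative (the bare
# lookbehind) can match the empty string immediately.
unescaped = /(?<![^\\]\\)#{divider}[a-z]*/

# Escaped: "|" is matched as a literal character.
escaped = /(?<![^\\]\\)#{Regexp.escape(divider)}[a-z]*/

empty_hit = 'abc|gmi'[unescaped]   # zero-length match at the start

scanner = StringScanner.new('r|abc|gmi ')
scanner.getch                      # consume the leading "r"
scanner.getch                      # consume the opening divider
res = scanner.scan_until(escaped)  # body, closing divider, and flags
rest = scanner.rest

puts "unescaped matched: #{empty_hit.inspect}"
puts "escaped scanned:   #{res.inspect}, rest: #{rest.inspect}"
```

The zero-length match is why the scanner in the question appeared not to advance: scan_until succeeded without consuming anything.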

Why strings are not coming in same line in ruby

This code calls puts in three places. The first and the second print the variables on a different line from the rest of the string, but the third prints everything on the same line.
def calliee (name,game)
#puts("#{name}#{game} he might be a bad guy")
return " he might be a bad guy #{name}#{game}"
end
def mymethod(name)
puts("enter your last name")
ss=gets()
#return "#{name}"+"#{ss}"+"he might be a bad guy"
calliee(name,ss)
end
puts("enter tour first name")
tt=gets()
#ww=mymethod(tt)
yy=mymethod(tt)
puts(yy)
puts("#{tt} is 1st name")
puts("prabhu "+"#{2+3}"+"#{4+5}")
I want everything on the same line, and I need to know why this is happening. Please help.
Kernel#gets gives you a string with the \n still attached to the end. That is what causes the output to span multiple lines.
To get the output you want, use the #chomp method, as in gets.chomp.
Adding to Arup's answer:
puts adds a newline to the end of the output. print does not. So you may also want to replace puts with print to have all output in one line.
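A small sketch of the effect of chomp. A StringIO stands in for the keyboard here so the example runs without typing anything; the names are made up.

```ruby
require 'stringio'

input = StringIO.new("John\nSmith\n")  # simulated user input

first = input.gets         # keeps the trailing newline
last  = input.gets.chomp   # trailing newline removed

puts "#{first} is 1st name"   # breaks across two lines
puts "#{last} is last name"   # stays on one line
```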

Format text with footnote by regular expression

I want to transform the annotation of text into the form of footnote. Here is a minimal example of the text.
Paragraph one. This is the first place [1] of paragraph one. This is the second place [2] of paragraph one.
[1] annotation one of paragraph one
[2] annotation two of paragraph one
Paragraph two. This is the first place [1] of paragraph two. This is the second place [2] of paragraph two.
[1] annotation one of paragraph two
[2] annotation two of paragraph two
At the end of each paragraph, there will be several annotations begins with label [1]. Each annotation will form a single paragraph.
What I want to do is to insert those annotations into the text with latex syntax. The desired output of the sample text is,
Paragraph one. This is the first place \footnote{annotation one of paragraph one} of paragraph one. This is the second place \footnote{annotation two of paragraph one} of paragraph one.
Paragraph two. This is the first place \footnote{annotation one of paragraph two} of paragraph two. This is the second place \footnote{annotation two of paragraph two} of paragraph two.
This is not just a simple replacement by matching patterns. It may have to be performed on a paragraph basis. What do you think is the simplest way to do it?
Edit: I have come up with a possible solution that uses sed.
remove the newline in front of the annotation,
Paragraph one. This is the first place [1] of paragraph one. This is the second place [2] of paragraph one. [1] annotation one of paragraph one [2] annotation two of paragraph one
Paragraph two. This is the first place [1] of paragraph two. This is the second place [2] of paragraph two. [1] annotation one of paragraph two [2] annotation two of paragraph two
match the pattern
[1] text1 [1] text2 [2]
and replace it with
text2 text1 [2]
basically the first [1] is where the annotation should be inserted; things between [1] and [2] are annotations to be relocated.
These questions are relevant: "Remove new line / line break characters only for specific lines" and "How can I remove a line-feed/newline BEFORE a pattern using sed", but I can't make that code work for me because of my lack of knowledge of regular expressions.
Fundamentally, sed is the wrong tool for this job. You might be able to write a sed script that preprocesses the file and generates a new sed script that processes the file, but you're clutching at straws when there are many much better tools for the task. I'd reach for Perl (but I learned Perl over twenty years ago, and Python only a couple of years ago), but Python is also capable of handling it, and with care you could probably even use awk. Part of the trouble is that you have to save all the text of paragraph one until you reach the start of paragraph two; only then can you start generating the actual text for paragraph one.
I think that the 'sed is the wrong tool' comment remains valid even if the sed script captures the contents of the paragraph in the hold space. Those would be lines not starting with a square bracket. The trouble is, when you come to a line with a square bracket, you need to write a regex that substitutes the tail of the line into the hold space in lieu of the contents of the square brackets. That requires a sort of 'dynamic regex'. Even if you knew there'd never be more than, say, 9 footnotes in a paragraph, so you could consider some sort of hack that wrote out the code 9 times, there are still problems writing the replacement strings in the right places.
Here's a simple script in Perl — well, a not incredibly complex script in Perl — that does the job. The 'whirling loops' (three nested loops) make it a little tricky to understand.
#!/usr/bin/env perl
use strict;
use warnings;
my $para = "";
TEXT:
while (<>)
{
    NOTES:
    while (m/^\s*\[(\d+)]\s+(.*)/)
    {
        my $tag = $1;
        my $note = $2;
        $para =~ s/\[$tag]/\\footnote{$note}/m;
        while (<>)
        {
            last if $_ =~ m/^\s*\[/;
            if ($_ !~ m/^\s*$/)
            {
                print $para;
                $para = "";
                last NOTES;
            }
        }
        last TEXT if eof;
    }
    $para .= $_;
}
print "$para";
Given the input file:
Paragraph one. This is the first place [1] of paragraph one. This is the second place [2] of paragraph one.
[1] annotation one of paragraph one
[2] annotation two of paragraph one
Paragraph two. This is the first place [1] of paragraph two. This is the second place [2] of paragraph two.
[1] annotation one of paragraph two
[2] annotation two of paragraph two
The output of this script from that file is:
Paragraph one. This is the first place \footnote{annotation one of paragraph one} of paragraph one. This is the second place \footnote{annotation two of paragraph one} of paragraph one.
Paragraph two. This is the first place \footnote{annotation one of paragraph two} of paragraph two. This is the second place \footnote{annotation two of paragraph two} of paragraph two.
What does the script do?
The outer loop (labelled TEXT) reads lines into $_ until EOF.
The loop labelled NOTES processes the material after a paragraph up to the start of the next one. It knows that it is a footnote line because it starts with a number in square brackets (possibly indented with spaces, and definitely with a space after the close square bracket). When it finds such a line, the number is saved in $tag and the replacement text (must be a single line — no extended multiline footnotes here) is saved in $note. Then the first occurrence of the tag inside square brackets in the saved paragraph is replaced with the footnote notation and the text of the note (this is the part that is nigh-on impossible in a single run of sed, and given that the footnote numbers repeat across paragraphs, makes even two runs of sed problematic). Having done that substitution (not caring if there is no match to replace), it reads the next line, and this is where the loops (and the head) start whirling. If the newly read line is a note line, then the initial last exits the innermost while and returns to the next iteration of the NOTES loop. If the line does not match a blank line, then we must have just read the first line of the next paragraph, so print the previous paragraph (which now has as many substitutions made as there are substitutions to make), empty the saved paragraph, and exit the NOTES loop. Otherwise, ignore the blank line in the middle of the notes.
After the loop, check whether we got EOF and exit the main loop if we did. Otherwise, add the paragraph line that was just read to the saved paragraph.
At the end, print the last saved paragraph.
This has not been exhaustively tested. I've not generated paragraphs with references to missing notes, or notes without references, or notes out of sequence. I think it would 'handle' those by ignoring the issues; there'd still be a reference to the missing note, and unreferenced notes would simply not show up in the output. If the same note number reference appears twice in a paragraph but there's only one note number after the paragraph, the second and subsequent ones are ignored. If the same note number appears twice ('text[1] more[1]') and the notes after the paragraph repeat the number ('[1] note 1A', '[1] note 1B'), then the first will be replaced with 'note 1A' and the second with 'note 1B'. I've not tested multiline paragraphs (but I don't expect trouble). Multiline qualifiers aren't needed for the replacement regex because the reference to a tag cannot be split over lines and isn't anchored on a line.
Processing multiline footnotes is an exercise for the reader (and is not entirely trivial). All else apart, you can't begin substituting a multiline footnote until you find a blank line, another footnote line, or the start of the next paragraph.
A less verbose (and less documented) perl version
perl -00 -pe '
    @markers = m{(\[\d+\])}g;
    for $i (0..$#markers) {
        $footnote = <>;
        ($marker, $text) = $footnote =~ m{(\[\d+\])\s+(.*)};
        s{\Q$marker\E}{\\footnote{$text}};
    }
' file
This assumes that if there are 5 footnote markers in a paragraph, 5 footnotes will follow that paragraph.
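For readers more at home in Ruby, the same buffering idea (hold the paragraph until its notes have been seen, then substitute) might be sketched as follows. This is not a port of either Perl script: `inline_footnotes` is a hypothetical helper, it assumes single-line notes that start with a bracketed number, and it has only been tried on input shaped like the sample.

```ruby
# Hypothetical helper: inline "[n] note" lines into the preceding paragraph.
def inline_footnotes(text)
  out = []
  para = +""
  in_notes = false
  text.each_line do |line|
    if (m = line.match(/\A\s*\[(\d+)\]\s+(.*)/))
      # Replace the first "[n]" marker; the block form of sub inserts the
      # replacement literally, so the backslash needs no extra escaping.
      para = para.sub("[#{m[1]}]") { "\\footnote{#{m[2]}}" }
      in_notes = true
    elsif in_notes && line.strip.empty?
      next                       # blank line among the notes
    else
      if in_notes                # first line of the next paragraph
        out << para
        para = +""
        in_notes = false
      end
      para << line
    end
  end
  out << para
  out.join
end

sample = "Para one [1] and [2].\n[1] note A\n[2] note B\nPara two [1].\n[1] note C\n"
result = inline_footnotes(sample)
puts result
```

A marker with no matching note is left in place, and a note with no matching marker is silently dropped.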

Delete first two lines of file with ruby

My script reads in large text files and grabs the first page with a regex. I need to remove the first two lines of each first page or change the regex to match 1 line after the ==Page 1== string. I include the entire script here because I've been asked to in past questions and because I'm new to Ruby and don't always know how to integrate snippets from answers:
#!/usr/bin/env ruby -wKU
require 'fileutils'
source = File.open('list.txt')
source.readlines.each do |line|
line.strip!
if File.exists? line
file = File.open(line)
end
text = (File.read(line))
match = text.match(/==Page 1(.*)==Page 2==/m)
puts match
end
Now that you have updated your question, I had to delete a big part of such a good answer :-)
I guess the main point of your problem was that you wanted to use match[1] instead of match. The object returned by the match method (a MatchData) can be treated like an array, which holds the whole matched string as the first element and each capture group in the following elements. So, in your case the variable match (and match[0]) is the whole matched string (together with the '==Page..==' marks), but you wanted just the first capture group, which is in match[1].
Now about the other, minor problems I sense in your code. Please don't be offended if you already know what I say; maybe others will profit from the warnings.
The first part of your code (if File.exists? line) checked whether the file exists, but then just opened the file (without closing it!), and the code still tried to read the file a few lines later, whether or not it exists.
You may use this line instead (File.exist? is the current spelling; File.exists? was deprecated and then removed in Ruby 3.2):
next unless File.exist? line
The second thing is that the program should be prepared to handle the situation when the file has no page marks, so it does not match the pattern. (The variable match would then be nil)
The third suggestion is that a little more complicated pattern might be used. The current one (/==Page 1==(.*)==Page 2==/m) would return the page content with the End-Of-Line mark as the first character. If you use this pattern:
/==Page 1==\s*\n(.*)==Page 2==/m
then the capture group will not contain the whitespace placed on the same line as the '==Page 1==' text. And if you use this pattern:
/==Page 1==\s*\n(.*\n)==Page 2==/m
then you will be sure that the '==Page 2==' mark starts from the beginning of the line.
And the fourth issue is that programmers (sometimes including me, of course) very often forget to close a file after opening it. In your case you opened the 'source' file, but there was no source.close statement after the loop. The safest way of handling files is to pass a block to the File.open method, so you might use the following form for the first lines of your program:
File.open('list.txt') do |source|
source.readlines.each do |line|
...but in this case it would be cleaner to write just:
File.readlines('list.txt').each do |line|
Taking it all together, the code might look like this (I changed the variable line to fname for better code readability):
#!/usr/bin/env ruby -wKU
require 'fileutils'
File.readlines('list.txt').each do |fname|
  fname.strip!
  next unless File.exist? fname
  text = File.read(fname)
  if match = text.match(/==Page 1==\s*\n(.*\n)==Page 2==/m)
    # The whole 'page' (String):
    puts match[1].inspect
    # The 'page' without the first two lines:
    # (in case you really wanted to delete lines):
    puts match[1].split("\n")[2..-1].inspect
  else
    # What to do if the file does not match the pattern?
    raise "The file #{fname} does NOT include the page separators."
  end
end
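To see what match[0] and match[1] hold with the anchored pattern, here is a small self-contained sketch. The sample text is invented, and dropping lines with lines.drop(2) is just one possible way to remove the first two lines.

```ruby
text = "==Page 1==\nline one\nline two\nline three\n==Page 2==\n"

match = text.match(/==Page 1==\s*\n(.*\n)==Page 2==/m)

whole = match[0]   # everything matched, including both page marks
page  = match[1]   # only the first capture group: the page body

# The page body without its first two lines, kept as a single string.
trimmed = page.lines.drop(2).join

puts whole.inspect
puts page.inspect
puts trimmed.inspect
```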

Multi-Line Comments in Ruby?

How can I comment multiple lines in Ruby?
#!/usr/bin/env ruby
=begin
Every body mentioned this way
to have multiline comments.
The =begin and =end must be at the beginning of the line or
it will be a syntax error.
=end
puts "Hello world!"
<<-DOC
Also, you could create a docstring.
which...
DOC
puts "Hello world!"
"..is kinda ugly and creates
a String instance, but I know one guy
with a Smalltalk background, who
does this."
puts "Hello world!"
##
# most
# people
# do
# this
__END__
But all forgot there is another option.
Only at the end of a file, of course.
=begin
My
multiline
comment
here
=end
Despite the existence of =begin and =end, the normal and more idiomatic way to comment is to use a # on each line. If you read the source of any Ruby library, you will see that this is how multi-line comments are written in almost all cases.
#!/usr/bin/env ruby
=begin
Between =begin and =end, any number
of lines may be written. All of these
lines are ignored by the Ruby interpreter.
=end
puts "Hello world!"
Using either:
=begin
This
is
a
comment
block
=end
or
# This
# is
# a
# comment
# block
are the only two currently supported by rdoc, which is a good reason to use only these I think.
=begin
comment line 1
comment line 2
=end
Make sure =begin and =end are each the first thing on their line (no leading spaces).
Here is an example :
=begin
print "Give me a number:"
number = gets.chomp.to_f
total = number * 10
puts "The total value is : #{total}"
=end
Everything you place between =begin and =end will be treated as a comment, regardless of how many lines of code it contains.
Note: Make sure there is no space between = and begin:
Correct: =begin
Wrong: = begin
=begin
(some code here)
=end
and
# This code
# on multiple lines
# is commented out
are both correct. The advantage of the first type of comment is editability—it's easier to uncomment because fewer characters are deleted. The advantage of the second type of comment is readability—reading the code line by line, it's much easier to tell that a particular line has been commented out. Your call but think about who's coming after you and how easy it is for them to read and maintain.
In case someone is looking for a way to comment out multiple lines in an HTML template in Ruby on Rails, there can be a problem with =begin and =end. For instance:
<%
=begin
%>
... multiple HTML lines to comment out
<%= image_tag("image.jpg") %>
<%
=end
%>
will fail because of the %> closing the image_tag.
In this case, maybe it is arguable whether this is commenting out or not, but I prefer to enclose the undesired section with an "if false" block:
<% if false %>
... multiple HTML lines to comment out
<%= image_tag("image.jpg") %>
<% end %>
This will work.
def idle
  <<~aid
    This is some description of what idle does.
    It does nothing actually, it's just here to show an example of multiline
    documentation. Thus said, this is something that is more common in the
    python community. That's an important point as it's good to also fit the
    expectation of your community of work. Now, if you agree with your team to
    go with a solution like this one for documenting your own base code, that's
    fine: just discuss about it with them first.
    Depending on your editor configuration, it won't be colored like a comment,
    like those starting with a "#". But as any keyword can be used for wrapping
    an heredoc, it is easy to spot anyway. One could even come with separated
    words for different puposes, so selective extraction for different types of
    documentation generation would be more practical. Depending on your editor,
    you possibly could configure it to use the same syntax highlight used for
    monoline comment when the keyword is one like aid or whatever you like.
    Also note that the squiggly-heredoc, using "~", allow to position
    the closing term with a level of indentation. That avoids to break the visual reading flow, unlike this far too long line.
  aid
end
Note that at the time of this post, the Stack Overflow engine doesn't render the syntax coloring correctly. Testing how it renders in your editor of choice is left as an exercise. ;)

Resources