What does $/ mean in Ruby? - ruby

I was reading about Ruby serialization (http://www.skorks.com/2010/04/serializing-and-deserializing-objects-with-ruby/) and came across the following code. What does $/ mean? I assume $ refers to an object?
array = []
$/ = "\n\n"
File.open("/home/alan/tmp/blah.yaml", "r").each do |object|
  array << YAML::load(object)
end

$/ is a pre-defined variable. It's used as the input record separator, and has a default value of "\n".
Methods like gets use $/ to determine how to separate the input. For example:
$/="\n\n"
str = gets
puts str
So you have to press ENTER twice to end the input for str.
Reference: Pre-defined variables

This code is trying to read each object into an array element, so you need to tell it where one ends and the next begins. The line $/="\n\n" sets the string Ruby uses to break your file apart into records.
$/ is known as the "input record separator" and is the value used to split up your file when you are reading it in. By default this value is a newline, so when you read a file, each line becomes one record. By setting this value, you are telling Ruby that a single newline no longer ends a record; it should split on the given string instead.
For example, if I have a comma separated file, I can write $/="," then if I do something like your code on a file like this:
foo, bar, magic, space
I would create an array directly, without having to split again:
["foo", " bar", " magic", " space"]
So your line will look for two newline characters, and split on each group of two instead of on every newline. You will only get two newline characters following each other when one line is empty. So this line tells Ruby, when reading files, break on empty lines instead of every line.
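The same splitting can be had without touching the global, because each_line accepts a separator argument. Below is a minimal sketch, assuming records are separated by a blank line; the sample data and variable names are invented for illustration, and a StringIO stands in for the file.

```ruby
require "stringio"
require "yaml"

# Hypothetical input: two small YAML documents separated by an empty line.
data = "name: foo\nsize: 1\n\nname: bar\nsize: 2\n"

records = []
# Passing "\n\n" makes each_line yield one blank-line-delimited chunk at a
# time, so $/ keeps its default value for the rest of the program.
StringIO.new(data).each_line("\n\n") do |chunk|
  records << YAML.load(chunk)
end

p records
```

Keeping the separator local avoids surprising any other code that calls gets later and expects ordinary lines.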

I found in this page something probably interesting:
http://www.zenspider.com/Languages/Ruby/QuickRef.html#18
$/ # The input record separator (eg #gets). Defaults to newline.

The $ means it is a global variable.
This one is special, however, because Ruby itself uses it: it is the input record separator.
For a full list with the special global variables see:
http://www.rubyist.net/~slagell/ruby/globalvars.html

Related

'gets' doesn't wait for user input

I'm attempting to develop a program on repl.it using its Ruby platform. Here's what I've got:
puts "Copy the entire request page and paste it into this console, then hit ENTER."
request_info = gets("Edit Confirmation").chomp.gsub!(/\s+/m, ' ').strip.split(" ")
puts "What is your name?"
your_name = gets.chomp
puts "Thanks, #{your_name}!"
The way I've got it, the user pastes a multi-line request, which ends with "Edit Confirmation", and then it splits the request, word-by-word, into its own array for me to parse and pull the relevant data.
But I can't seem to use the gets command a second time after initially prompting the user for multi-line input at the start. Any other gets command I attempt to use after that is skipped, and the program ends.
Your code is doing something quite unusual: By passing a string to the gets method, you are actually changing the input separator:
gets(sep, limit [, getline_args]) → string or nil
Reads the next "line" from the I/O stream; lines are separated by sep.
Your code does not work as you expect because a trailing "\n" character is left in the input buffer, so calling gets a second time instantly returns this string.
Perhaps the easiest way to resolve this would just be to absorb this character in the first gets call:
request_info = gets("Edit Confirmation\n").chomp.gsub!(/\s+/m, ' ').strip.split(" ")
For a "complex" multi-line input like this, it would be more common to pass a file name parameter to the ruby script, and read this file rather than pasting it into the terminal.
Or, you could use gets(nil) to read until end of file (EOF) and ask the user to press CTRL+D to signify the end of the multi-line input.
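The leftover newline can be reproduced without a terminal. This is a rough sketch that uses a StringIO as a stand-in for the console; the pasted text and the name "Alice" are made up for the example.

```ruby
require "stringio"

# Hypothetical pasted input followed by the answer to the second prompt.
text = "line one\nline two\nEdit Confirmation\nAlice\n"

# Separator without the newline: the "\n" after it stays in the buffer,
# so the next gets returns immediately with just that newline.
buggy = StringIO.new(text)
buggy.gets("Edit Confirmation")
leftover = buggy.gets

# Separator including the newline: the next gets reads the real answer.
fixed = StringIO.new(text)
request = fixed.gets("Edit Confirmation\n")
words = request.chomp("Edit Confirmation\n").split
name = fixed.gets.chomp

p leftover
p words
p name
```

Note that this sketch uses split on its own instead of gsub!, since gsub! returns nil when it makes no substitution.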

Difference between 2 ways working with pipes using ARGF?

Using ARGF I can create Ruby programs that respect pipelines. Suppose I want to constantly read new entries:
$ tail -f log/test.log | my_prog
I can do this using:
ARGF.each_line do |line|
...
end
Also, I found another way:
while input = ARGF.gets
input.each_line do |line|
...
end
end
It looks like both variants do the same thing. Or is there a difference between them? If so, what is it?
Thanks in advance.
As Stefan mentioned, you made a small mistake in the second case. The proper way to use the ARGF.gets approach in your case looks like this:
while input = ARGF.gets
# input here represents a line
end
If you rewrite the second example as above, there will be no difference in behavior.
The actual difference between ARGF#gets and ARGF#each_line is in semantics: each_line accepts a block or returns an enumerator, while gets returns the next line if one is available.
Another option is to use Kernel#gets. Beware that its behavior may differ from ARGF#gets in some cases, especially if you change the separator:
A separator of nil reads the entire contents, and a zero-length separator reads the input one paragraph at a time, where paragraphs are divided by two consecutive newlines.
But for reading (and then printing) constantly from stdin you may use it as follows:
print while gets
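To see that the two loop styles yield the same lines, here is a small sketch. It uses StringIO as a stand-in for ARGF (both respond to gets and each_line); the helper method names and the sample text are invented for the example.

```ruby
require "stringio"

# Block style: each_line pushes every line into the block.
def collect_with_each_line(io)
  lines = []
  io.each_line { |line| lines << line }
  lines
end

# Pull style: gets returns the next line, or nil at end of input.
def collect_with_gets(io)
  lines = []
  while (line = io.gets)
    lines << line
  end
  lines
end

text = "first\nsecond\nthird\n"
from_each_line = collect_with_each_line(StringIO.new(text))
from_gets = collect_with_gets(StringIO.new(text))

p from_each_line
p from_each_line == from_gets
```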

What does "$," mean in ruby?

I stumbled upon this piece of code in the rails source:
# File actionpack/lib/action_view/helpers/output_safety_helper.rb, line 30
def safe_join(array, sep=$,)
sep ||= "".html_safe
sep = ERB::Util.html_escape(sep)
array.map { |i| ERB::Util.html_escape(i) }.join(sep).html_safe
end
What does $, do? I read the Regexp-documentation but I couldn't find anything about it.
The official documentation for the system variables is in:
http://www.ruby-doc.org/stdlib-2.0/libdoc/English/rdoc/English.html
A lot of Ruby's special variables are accessible via methods in various modules and classes, which hides the fact that the variable is what contains the value. For instance, lineno, available in IO and inherited by File, is the line number of the last line read by an IO stream. It's relying on $/ and $.
The "English" module provides long versions of the cryptic variables, making them more readable. Use of the cryptic variables isn't as idiomatic in Ruby as it is in Perl, which is why they're more curious when you run into them.
They come from a variety of sources: most, if not all, are immediately from Perl, but Perl inherited them from sed, awk, and the rest of its kitchen-sink collection of code. (It's a great language, really.)
There are other variables set by classes like Regexp, which defines variables for pre and post match, plus captures. This is from the documentation:
$~ is equivalent to ::last_match;
$& contains the complete matched text;
$` contains string before match;
$' contains string after match;
$1, $2 and so on contain text matching first, second, etc capture group;
$+ contains last capture group.
Though Ruby defines the short, cryptic, versions of the variables, it's recommended that we use require "English" to provide the long names. It's a readability thing, which translates to a long-term ease-of-maintenance thing.
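As a brief illustration of the long names, the sketch below requires the English library and reads a few of its aliases right after a match. The sample string and local variable names are arbitrary.

```ruby
require "English"

"hello world" =~ /o w/

# Capture the match variables immediately; a later match would reset them.
pre   = $PREMATCH    # alias of $`
whole = $MATCH       # alias of $&
post  = $POSTMATCH   # alias of $'

# The separator variables have long names as well.
same_input_sep  = ($INPUT_RECORD_SEPARATOR == $/)
same_output_sep = ($OUTPUT_FIELD_SEPARATOR == $,)

p [pre, whole, post]
p same_input_sep
p same_output_sep
```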
I finally found the answer myself here:
The output field separator for the print. Also, it is the default separator for Array#join. (Mnemonic: what is printed when there is a , in your print statement.)
The following code snippet shows the effect:
a = [1,2,3]
puts a.join # => 123
$, = ','
puts a.join # => 1,2,3
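For comparison, passing the separator straight to join gives the same result without changing a global, which matters because newer Ruby versions warn when $, is set to a non-nil value. A minimal sketch with illustrative variable names:

```ruby
a = [1, 2, 3]

# With $, left at its default (nil), join concatenates with no separator.
joined_default = a.join

# An explicit argument affects only this call.
joined_commas = a.join(",")

puts joined_default
puts joined_commas
```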

Delete first two lines of file with ruby

My script reads in large text files and grabs the first page with a regex. I need to remove the first two lines of each first page or change the regex to match 1 line after the ==Page 1== string. I include the entire script here because I've been asked to in past questions and because I'm new to ruby and don't always know how to integrate snippets as answers:
#!/usr/bin/env ruby -wKU
require 'fileutils'
source = File.open('list.txt')
source.readlines.each do |line|
line.strip!
if File.exists? line
file = File.open(line)
end
text = (File.read(line))
match = text.match(/==Page 1(.*)==Page 2==/m)
puts match
end
Now that you have updated your question, I had to delete a big part of my answer :-)
I guess the main point of your problem was that you wanted to use match[1] instead of match. The object returned by the match method (a MatchData) can be treated like an array that holds the whole matched string as its first element and each capture group in the following elements. So in your case the variable match (and match[0]) is the whole matched string (including the '==Page..==' marks), but you wanted just the first capture group, which is in match[1].
Now about the other, minor problems I sense in your code. Please don't be offended if you already know what I say; maybe others will profit from the warnings.
The first part of your code (if File.exists? line) checked whether the file exists, but it then just opened the file (without closing it!) and the code still tried to read the file a few lines later.
You may use this line instead (File.exist? is the current spelling; the File.exists? alias was removed in Ruby 3.2):
next unless File.exist? line
The second thing is that the program should be prepared to handle the situation when the file has no page marks, so it does not match the pattern. (The variable match would then be nil)
The third suggestion is to use a slightly more complicated pattern. The current one (/==Page 1==(.*)==Page 2==/m) would return the page content with the end-of-line character as its first character. If you use this pattern:
/==Page 1==\s*\n(.*)==Page 2==/m
then the subexpression will not contain the whitespace placed on the same line as the '==Page 1==' text. And if you use this pattern:
/==Page 1==\s*\n(.*\n)==Page 2==/m
then you will be sure that the '==Page 2==' mark starts from the beginning of the line.
And the fourth issue is that programmers (sometimes including me, of course) very often forget to close a file after opening it. In your case you opened the 'source' file, but there was no source.close statement after the loop. The safest way of handling files is to pass a block to the File.open method, which closes the file when the block ends, so you might use the following form for the first lines of your program:
File.open('list.txt') do |source|
source.readlines.each do |line|
...but in this case it would be cleaner to write just:
File.readlines('list.txt').each do |line|
Taking it all together, the code might look like this (I changed the variable line to fname for better code readability):
#!/usr/bin/env ruby -wKU
require 'fileutils'
File.readlines('list.txt').each do |fname|
  fname.strip!
  # File.exist? replaces File.exists?, which was removed in Ruby 3.2
  next unless File.exist? fname
  text = File.read(fname)
  if match = text.match(/==Page 1==\s*\n(.*\n)==Page 2==/m)
    # The whole 'page' (String):
    puts match[1].inspect
    # The 'page' without the first two lines:
    # (in case you really wanted to delete lines):
    puts match[1].split("\n")[2..-1].inspect
  else
    # What to do if the file does not match the pattern?
    raise "The file #{fname} does NOT include the page separators."
  end
end
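To check the pattern and the difference between match[0] and match[1] without any files, here is a small sketch that runs against an in-memory string. The sample text is hypothetical and only mimics the page markers from the question.

```ruby
# Hypothetical file contents with two page markers.
text = "intro\n==Page 1==\nheader line\nsubheader line\n" \
       "body a\nbody b\n==Page 2==\nmore\n"

match = text.match(/==Page 1==\s*\n(.*\n)==Page 2==/m)

whole = match[0]   # everything matched, including both markers
page  = match[1]   # only the first capture group: the page body

# Drop the first two lines of the page and keep the rest.
trimmed = page.lines.drop(2).join

puts page
puts "---"
puts trimmed
```

Using lines.drop(2) returns an empty array when the page has fewer than three lines, whereas [2..-1] on a shorter array can return nil.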

Why does Ruby's 'gets' include the closing newline?

I never need the ending newline I get from gets. Half of the time I forget to chomp it and it is a pain in the....
Why is it there?
Like puts (which sounds similar), it is designed to work with lines, using the \n character.
gets takes an optional argument that is used for "splitting" the input (or "just reading till it arrives"). It defaults to the special global variable $/, which contains a \n by default.
gets is a pretty generic method for reading streams, and it includes this separator in what it returns. If it did not, part of the stream content would be lost.
var = gets.chomp
This puts it all on one line for you.
If you look at the documentation of IO#gets, you'll notice that the method takes an optional parameter sep which defaults to $/ (the input record separator). You can decide to split input on other things than newlines, e.g. paragraphs ("a zero-length separator reads the input a paragraph at a time (two successive newlines in the input separate paragraphs)"):
>> gets('')
dsfasdf
fasfds
dsafadsf #=> "dsfasdf\nfasfds\n\n"
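Both behaviours can be tried without typing into a terminal. The sketch below reads from a StringIO instead of standard input; the sample text is invented, and the paragraphs are separated by exactly one empty line.

```ruby
require "stringio"

io = StringIO.new("a\nb\n\nc\nd\n")

# Paragraph mode: an empty-string separator reads up to the blank line.
first_paragraph  = io.gets("")
second_paragraph = io.gets("")

# Default mode keeps the trailing newline; chomp strips it.
line = StringIO.new("yes\n").gets
chomped = line.chomp

p first_paragraph
p second_paragraph
p line
p chomped
```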
From a performance perspective, the better question would be "why should I get rid of it?". It's not a big cost, but under the hood you have to pay to chomp the string being returned. While you may never have had a case where you need the newline, you've surely had plenty of cases where you don't care, such as s = gets; puts stuff() if s =~ /y/i. In those cases, you'll see a (tiny, tiny) performance improvement by not chomping.
How I auto-detect line endings:
# file open in binary mode
line_ending = 13.chr + 10.chr
check = file.read(1000)
case check
when /\r\n/
  # already set
when /\n/
  line_ending = 10.chr
when /\r/
  line_ending = 13.chr
end
file.rewind
while !file.eof?
  line = file.gets(line_ending).chomp
  ...
end
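The same detection idea can be wrapped in a method and exercised on in-memory data. This is only a sketch: the method names detect_line_ending and read_lines are made up here, StringIO stands in for a file opened in binary mode, and a sample that happens to end between a "\r" and its "\n" is not handled.

```ruby
require "stringio"

# Inspect the first sample_size bytes and guess the line ending.
# Falls back to "\r\n", mirroring the default in the snippet above.
def detect_line_ending(io, sample_size = 1000)
  sample = io.read(sample_size) || ""
  io.rewind
  case sample
  when /\r\n/ then "\r\n"
  when /\n/   then "\n"
  when /\r/   then "\r"
  else "\r\n"
  end
end

# Read every line using the detected separator, then strip it.
def read_lines(io)
  sep = detect_line_ending(io)
  io.each_line(sep).map { |line| line.chomp(sep) }
end

windows = read_lines(StringIO.new("a\r\nb\r\n"))
unix    = read_lines(StringIO.new("a\nb\n"))
old_mac = read_lines(StringIO.new("a\rb\rc"))

p windows
p unix
p old_mac
```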

Resources