Getting both File input AND STDIN from ARGF? - ruby

I am using the shoes library to run a piece of ruby code and have discovered that it treats the ruby code it's running as File Input, and thus does not allow me to get STDIN anymore (since ARGF allows File Input OR STDIN but apparently not both).
Is there any way to override this? I'm told Perl, for example, allows you to read from STDIN once the IO buffer is empty.
Edit:
I have had some success with the "-" special filename character, which apparently is a signal to switch to STDIN on the command line.
Previous Form of Question: Is Shoes ARGF Broken?
Using general Ruby, I can read either files or Standard In with ARGF. With Shoes, I am only able to read files. Anything from standard in just gets ignored. Is it eating standard in, or is there another way to access it?
Example code lines: Either stand alone in a ruby file, or inside a Shoes app in shoes.
#ruby testargf.rb aus.txt is the same as ruby testargf.rb<aus.txt
#but isn't in shoes. shoes only prints with the first input, not the second
ARGF.each do |line| #readLine.each has same result
puts line
end
Or in Shoes:
#shoes testargfshoes.rb aus.txt should be the same as <aus.txt but isn't.
Shoes.app(title: "File I/0 test",width:800,height:650) do
ARGF.each do |line| #readLine.each has same result
puts line
para line
end
end
In retrospect, I do also see a further difference between Shoes and Ruby: Shoes ALSO prints out the source code of the program I am running, along with any files I pass along. If I try to input a file to standard in, ONLY the source code is printed.
I imagine this means that the shoes app is taking my program as an input, and then not sanitizing (or whatever the correct word would be) the input when it passes it along to my code. This seems to strengthen my "Shoes eats Standard In" hypothesis, since it is clearly USING standard In for something. I guess it can take two files in a row, but not one file and THEN a reference to standard in.
I can confirm that Ruby without Shoes provides identical behavior if I mix file input and STDIN with:
ruby testargf.rb aus_simple.txt < testargf.rb

I have had some success with the "-" special filename character, which apparently is a signal to switch to STDIN on the command line.
Example of use:
shoes testargfshoes.rb - <aus_simple.txt
Don't pass the "-" without also supplying something on standard input; otherwise the program hangs waiting for input.
Found the answer here: https://robots.thoughtbot.com/rubys-argf
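A minimal sketch of the behavior described above, assuming plain Ruby outside Shoes. ARGF walks the names in ARGV in order, and a "-" entry tells it to read standard input at that position. The helper name lines_from is hypothetical, and replacing ARGV from inside the program is only done here so the behavior can be tried without a command line:

```ruby
# Hypothetical helper: point ARGF at the given argument list and collect
# every line it yields. A "-" entry in the list would make ARGF read
# standard input at that position in the sequence.
def lines_from(args)
  ARGV.replace(args)            # ARGF takes its file names from ARGV
  ARGF.readlines.map(&:chomp)   # read every remaining input, drop newlines
end

# Assumed command-line equivalent, mixing a file and piped input:
#   ruby testargf.rb aus_simple.txt - < other.txt
```

Only the file-name case is easy to exercise in isolation; the "-" case needs something actually connected to standard input, as noted above.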

Related

How can I get piped data with arguments on Ruby 2.4

In Python3, I have this code:
arg, unk = parser.parse_known_args()
buf = ''
for line in fileinput.input(unk):
buf += line
fileinput.close()
This code allows me to get piped data along with arguments to the program.
What I am trying to achieve is to receive e-mail piped from Postfix. Postfix pipes the e-mail file to my Python app and also passes some arguments that I want. In Ruby I cannot find a proper way of doing this. The piped data can be up to about 25 MB, so I need a correct and robust way of handling it, one that copes with large files without issues.
ruby test.rb --option ARG
Of course, I can get arguments easily but I also want to get PIPED data.
In fact, I cannot find the exact method Ruby provides for reading piped data. I am stuck at this point. Can anyone give me a hand with this?
It seems that in ruby you want to read from ARGF. It handles files passed as filenames or piped to your program.
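As a rough sketch of how the two pieces could fit together, the standard optparse library can consume the option while ARGF (or $stdin) handles the piped message. The helper name parse_cli and the option name --option are taken from the question or made up for illustration. Note one difference from Python's parse_known_args: OptionParser raises an error on options it does not know instead of passing them through.

```ruby
require "optparse"

# Hypothetical helper: parse the known option and return it together with
# the arguments that are left over (file names for ARGF, if any).
def parse_cli(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on("--option ARG", "example option that takes a value") do |value|
      options[:option] = value
    end
  end
  rest = parser.parse(argv)   # non-destructive; returns the non-option arguments
  [options, rest]
end

# In the real script (assumed to be test.rb) this might be used as:
#   options, rest = parse_cli(ARGV)
#   ARGV.replace(rest)                      # leave only file names for ARGF
#   ARGF.each_line { |line| handle(line) }  # handle is a placeholder
```

Reading line by line with each_line avoids holding the whole message in memory; for a 25 MB message, reading everything at once with ARGF.read is also a reasonable choice.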

What is the difference between "hello".length and "hello" .length?

I was surprised when I ran the following examples in the Ruby console. They both produce the same output.
"hello".length
and
"hello" .length
How does the ruby console remove the space and provide the right output?
You can put spaces wherever you want, the interpreter looks for the end of the line. For example:
Valid
"hello".
length
Invalid
"hello"
.length
The interpreter sees the dot at the end of the line and knows something has to follow it up. While in the second case it thinks the line is finished. The same goes for the amount of spaces in one line. Does it matter how the interpreter removes the spaces? What matters is that you know the behavior.
If you want you can even
"hello" . length
and it will still work.
I know this is not an answer to your question, but does the "how" matter?
EDIT: I was corrected in the comments below. The multi-line examples given above are both valid when run in a script instead of IRB. I mixed them up with the operators, where the following also applies when running a script:
Valid
result = true || false
Valid
result = true ||
false
Invalid
result = true
|| false
This doesn't have as much to do with the console as it has to do with how the language itself is parsed by the compiler.
Most languages are parsed in such a way that items to be parsed are first grouped into TOKENS. Then the compiler is defined to expect a certain SEQUENCE of tokens in order to interpret each programming statement.
Because the compiler is only looking for a TOKEN SEQUENCE, it doesn't matter if there is space in between or not.
In this case the compiler is looking for:
STRING DOT METHOD_NAME
So it won't matter if you write "hello".length, or even "hello" . length. The same sequence of tokens is present in both, and that is all that matters to the compiler.
If you are curious how these token sequences are defined in the Ruby source code, you can look at parse.y starting around line 1042:
https://github.com/ruby/ruby/blob/trunk/parse.y#L1042
This file is written in the YACC language, which is used to define parsers.
Even without knowing anything about YACC, you should already be able to get some clues on how it works by just looking around the file a bit.
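One way to see the token sequence for yourself, as a small sketch: the Ripper library bundled with standard Ruby exposes the lexer. The helper name token_types is made up here, and the token names it returns are Ripper's own event names, not the grammar symbols used in parse.y.

```ruby
require "ripper"

# Return the token types Ruby's lexer produces for a piece of source,
# ignoring the whitespace tokens (:on_sp).
def token_types(source)
  Ripper.lex(source)
        .map { |_position, type, _text| type }
        .reject { |type| type == :on_sp }
end

compact = token_types('"hello".length')
spaced  = token_types('"hello" . length')

p compact
puts compact == spaced   # the two spellings lex to the same token types
```

Once the whitespace tokens are dropped, both spellings give the same sequence, which is the point made above about the parser only looking at tokens.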

Difference between ways to use gets method

I saw two ways to use gets, a simple form:
print 'Insert your name: '
name = gets()
puts "Your name is #{name}"
and a form that drew my attention:
print 'Insert your name: '
STDOUT.flush
name = gets.chomp
puts "Your name is #{name}"
The second sample looks like Perl in its use of the flush method on the default output stream. Perl makes manipulation of the default output stream explicit; the flush method is a mystery to me. It may behave differently from what I'm inferring. The sample also uses chomp to remove the newline character.
What happens behind the scenes in the second form? In what situations is it useful or necessary to use the second form?
"Flushing" the output ensures that it shows the printed message before it waits for your input; this may be just someone being certain unnecessarily, or it may be that on certain operating systems you need it. Alternatively you can use STDOUT.sync = true to force a flush after every output. (You may wonder, "Why wouldn't I always use this?" Well, if your code is outputting a lot of content, repeatedly flushing it may slow it down.)
chomp removes the newline from the end of the input. If you want the newline (the result of the user pressing "Enter" after typing their name) then don't chomp it.
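The two steps can be put together in a small helper. This is only a sketch: the name ask and the input/output keyword parameters are assumptions added so the behavior can be tried with in-memory streams instead of a real terminal.

```ruby
require "stringio"

# Hypothetical helper: print a prompt, flush it, read one line, strip the newline.
def ask(prompt, input: $stdin, output: $stdout)
  output.print(prompt)
  output.flush          # make sure the prompt is visible before blocking on input
  line = input.gets     # returns nil at end of input
  line&.chomp           # drop the trailing newline, if there was a line at all
end

# Trying it with in-memory streams in place of the terminal:
fake_input = StringIO.new("Matz\n")
name = ask("Insert your name: ", input: fake_input, output: StringIO.new)
puts "Your name is #{name}"
```

With the default arguments, ask("Insert your name: ") behaves like the second form in the question.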
Looking at some Github code I can see that STDOUT.flush is used mostly for server-side/multi-threaded jobs, and not in everyday use.
Generally speaking, when you want to accept input from the user, you'd want to use gets.chomp. Just remember, no matter what the user enters, Ruby will ALWAYS interpret that as a string.
To convert it to an integer, you need to call to_i, or to_f for a float. You don't need chomp in these cases, since to_i and to_f ignore the trailing "\n" when parsing the number. There are a lot of subtle things going on implicitly as you'll see, and figuring them out is simply a matter of practice.
I've rarely seen someone use STDOUT.flush except in multi-threading. Also it makes things confusing, defeating the whole purpose of writing elegant code.
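A short sketch of the conversions mentioned in this answer. The strings stand in for what gets would return, and the helper strict_integer is a made-up name showing a stricter alternative based on Kernel#Integer:

```ruby
# What gets returns after the user types a value and presses Enter
# always ends with a newline.
puts "42\n".to_i     # to_i parses the leading digits and ignores the rest
puts "3.5\n".to_f    # to_f does the same for floats
puts "abc\n".to_i    # no leading digits, so the result is 0

# Hypothetical helper: return nil instead of 0 for input that is not a number.
def strict_integer(text)
  Integer(text)       # tolerates surrounding whitespace, rejects "abc"
rescue ArgumentError
  nil
end

p strict_integer("42\n")
p strict_integer("abc\n")
```

The silent 0 from to_i is one of the subtle implicit behaviors worth knowing about when validating user input.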

Ruby script return array to powershell

I have a Ruby script that I am calling from a PowerShell script. I want Ruby to return its result to PowerShell as an array, so that I can use the array in PowerShell. I am a complete beginner in Ruby, so I need help constructing the array in Ruby.
The vast majority of Ruby implementations are built on the Unix model: every input and output is an (unstructured) character stream. So, you will need additional PowerShell code to parse that unstructured character stream into a PowerShell array. This can be made easier if you emit some well-known format as your character stream, such as JSON, YAML, XML, XAML, or CSV.
Alternatively, you could try an approach with IronRuby, and write a PowerShell cmdlet in Ruby.
You left out what your Ruby array should contain, so I would recommend to look at the examples in the Ruby documentation, like
first_array = ["Matz", "Guido"]
on how to create and populate a Ruby array.
The second step is how you transfer the contents of this array into the outer / calling Powershell script. One way is using the standard output stream, thus the Ruby script writes the contents of the array to standard output,
and the Powershell script catches it as described here.
Example for the Ruby side (when you entered the example line above):
puts first_array
will result in such output
Matz
Guido
with an entry on each line, or if you prefer CSV, you might try
puts first_array.join(',')
which will result in this output
Matz,Guido
Or you can use JSON or whatever fits best when you parse that output later.
The final step would be to parse that output string within the Powershell to its array format.
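For the JSON variant, a minimal sketch on the Ruby side could look like this. It uses the json library that ships with Ruby; the helper name array_as_json is made up for illustration:

```ruby
require "json"

# Hypothetical helper: serialize a Ruby array as one line of JSON text.
def array_as_json(items)
  JSON.generate(items)
end

first_array = ["Matz", "Guido"]
puts array_as_json(first_array)   # prints ["Matz","Guido"]
```

On the PowerShell side, the output could then be piped through the ConvertFrom-Json cmdlet to get an array back, assuming a PowerShell version that includes it.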

Most reliable way to get text into ruby script

I have a Ruby script that’ll do some text parsing (à la Markdown). It does it in a sequence of steps, like
string = string.gsub # more code here
string = string.gsub # more code here
# and so on
What is the best (i.e. most reliable) way to feed text into string in the first place? It’s a script, and the text it’ll be fed can vary a lot: it can be multilingual, have some characters that might trip a shell (like ", ', ’, &, $, you get the idea), and will likely be multi-line.
Is there some trick on the lines of
cat << EOF
bunch of text here
EOF
Additional considerations
I’m not looking for a markdown parser, this is something I want to do, not something I want a tool for.
I’m not a big ruby user (I’m starting to use it), so the more detailed the answer you can provide, the better.
It must be completely scriptable (i.e., no interrupting to ask the user for information).
The Kernel#gets method will read a string separated using the record separator from stdin or files specified on the command line. So if you use that you can do things like:
yourscript <filename #read from filename
yourscript file1 file2 # read both file1 and file2
yourscript #lets you type at your script
So to run something like:
cat <<'eof' |ruby yourscript.rb
This' & will $all 'eof' be 'fine'''
eof
Script might contain something like:
s = gets() # read a line
lines = readlines() # read all lines into an array
That's fairly standard for command-line scripts. If you want to have a user-interface then you'll want something more complex. There is an option to the Ruby interpreter to set the encoding of files as they are read.
Just read from stdin (which is an IO object):
$stdin.read
As you can see, stdin is provided in the global variable $stdin. Since it’s an IO object, there are a lot of other methods available if read doesn’t suit your needs.
Here’s a simple one-line example in the shell:
$ printf 'foo\nbar\n' | ruby -e 'puts $stdin.read.upcase'
FOO
BAR
Obviously reading from stdin is extremely flexible since you can pipe input in from anywhere.
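Tying this back to the question, a sketch of a script that reads everything from an IO object and then applies its substitutions might look like the following. The function name convert and the two gsub rules are placeholders, not the asker's actual steps:

```ruby
require "stringio"

# Hypothetical pipeline: read the whole input, then apply each step in turn.
def convert(io)
  text = io.read                                  # entire input as one string
  text = text.gsub(/\*\*(.+?)\*\*/, '<b>\1</b>')  # placeholder step 1
  text = text.gsub(/\*(.+?)\*/, '<i>\1</i>')      # placeholder step 2
  text
end

# In the script itself this would be called with standard input:
#   print convert($stdin)
# Here an in-memory stream stands in for piped text.
sample = StringIO.new(%q(a **b** ’c’ $d & "e"))
puts convert(sample)
```

Because the text arrives through a pipe or redirect rather than as a shell argument, characters such as quotes, & and $ reach the script untouched.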
Ruby is very adept at encodings (see e.g. the Encoding docs). To get text into Ruby, one typically uses gets, reads File objects, or uses a GUI, which one can build with the gtk2 gem or rugui (if already finished). If you are getting text from the wild internet, security should be your concern. Ruby used to have 4 $SAFE levels, but after some discussions, now there might be only 3 of them left. In any case, the best strategy for handling strings is to know as much as possible in advance about the properties of the string that you expect. Handling absolutely arbitrary strings is a surprisingly difficult task. Try to limit the number of possible encodings and work out the maximum size of the string that you expect.
Also, with respect to your original stated goal writing a markdown-processor-like something, you might want to not reinvent the wheel (unless it is for didactic purposes). There is this SO post:
Better ruby markdown interpreter?
The answer will direct you to kramdown gem, which gets a lot of praise, though I have not tried it personally.
