How can I get piped data with arguments on Ruby2.4 - ruby

In Python3, I have this code:
arg, unk = parser.parse_known_args()
buf = ''
for line in fileinput.input(unk):
    buf += line
fileinput.close()
This code allows me to get piped data along with arguments to the program.
What I am trying to achieve is to receive piped e-mail from Postfix. Postfix pipes the e-mail file to my Python app and also passes some arguments that I specify. In Ruby I cannot find a proper way of doing this. The piped data can be up to ~25 MB, so I need a correct and robust way of handling it, one that copes with large files without issues.
ruby test.rb --option ARG
Of course, I can get arguments easily but I also want to get PIPED data.
In fact, I cannot find the exact method Ruby provides for reading piped data. I am stuck at this point. Can anyone give me a hand with this?

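It seems that in Ruby you want to read from ARGF. It handles input passed as filenames on the command line or piped to your program. Note that ARGF treats whatever is left in ARGV as filenames, so options need to be parsed out of ARGV first.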
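As a rough Ruby counterpart to the Python snippet in the question, the sketch below parses the options with the standard-library OptionParser and then reads the whole piped stream in one call. The helper name `read_piped_input` is hypothetical, and `--option` is simply the flag from the question's command line. Reading everything at once should be acceptable for inputs of around 25 MB, though that is an assumption about available memory.

```ruby
require 'optparse'

# Hypothetical helper: strip the known options out of argv (parse! mutates
# the array), then read the entire input stream into one string.
def read_piped_input(argv, input = $stdin)
  options = {}
  OptionParser.new do |opts|
    # '--option ARG' mirrors the command line shown in the question.
    opts.on('--option ARG', 'example option taking a value') do |value|
      options[:option] = value
    end
  end.parse!(argv)

  [options, input.read]
end

# Intended use inside test.rb (not executed here):
#   options, mail = read_piped_input(ARGV)
```

Invoked as `cat mail.eml | ruby test.rb --option ARG` (where `mail.eml` is a made-up filename), `options[:option]` would be `"ARG"` and `mail` would hold the piped message. Passing `ARGF` instead of `$stdin` as the second argument would additionally accept filenames left over in `ARGV`.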

Related

Getting both File input AND STDIN from ARGF?

I am using the shoes library to run a piece of ruby code and have discovered that it treats the ruby code it's running as File Input, and thus does not allow me to get STDIN anymore (since ARGF allows File Input OR STDIN but apparently not both).
Is there any way to override this? I'm told Perl, for example, allows you to read from STDIN once the IO buffer is empty.
Edit:
I have had some success with the "-" special filename character, which apparently is a signal to switch to STDIN on the command line.
Previous Form of Question: Is Shoes ARGF Broken?
Using general Ruby, I can read either files or Standard In with ARGF. With Shoes, I am only able to read files. Anything from standard in just gets ignored. Is it eating standard in, or is there another way to access it?
Example code lines: Either stand alone in a ruby file, or inside a Shoes app in shoes.
#ruby testargf.rb aus.txt is the same as ruby testargf.rb<aus.txt
#but isn't in shoes. shoes only prints with the first input, not the second
ARGF.each do |line| #readLine.each has same result
  puts line
end
Or in Shoes:
#shoes testargfshoes.rb aus.txt should be the same as <aus.txt but isn't.
Shoes.app(title: "File I/0 test",width:800,height:650) do
  ARGF.each do |line| #readLine.each has same result
    puts line
    para line
  end
end
In retrospect, I do also see a further difference between Shoes and Ruby: Shoes ALSO prints out the source code of the program I am running, along with any files I pass along. If I try to input a file to standard in, ONLY the source code is printed.
I imagine this means that the shoes app is taking my program as an input, and then not sanitizing (or whatever the correct word would be) the input when it passes it along to my code. This seems to strengthen my "Shoes eats Standard In" hypothesis, since it is clearly USING standard In for something. I guess it can take two files in a row, but not one file and THEN a reference to standard in.
I can confirm that Ruby without Shoes provides identical behavior if I mix file input and STDIN with:
ruby testargf.rb aus_simple.txt < testargf.rb
I have had some success with the "-" special filename character, which apparently is a signal to switch to STDIN on the command line.
Example of use:
shoes testargfshoes.rb - <aus_simple.txt
Don't pass the "-" without also supplying something on standard input; otherwise the program hangs waiting for input.
Found the answer here: https://robots.thoughtbot.com/rubys-argf
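To make the "-" convention concrete, here is a hand-rolled sketch of the same idea rather than ARGF itself: each source is read in order, and an entry of "-" switches to the standard-input stream. The helper name `each_input_line` is hypothetical, and how Shoes populates ARGV is not covered here.

```ruby
# Hypothetical helper: yield every line from each source in order.
# A source of "-" means "read from the supplied stdin stream", which is
# the convention ARGF follows for entries in ARGV.
def each_input_line(sources, stdin = $stdin)
  return enum_for(:each_input_line, sources, stdin) unless block_given?

  sources.each do |source|
    if source == '-'
      stdin.each_line { |line| yield line }
    else
      File.foreach(source) { |line| yield line }
    end
  end
end

# Intended use (not executed here):
#   each_input_line(ARGV) { |line| puts line }
```

With this shape, a file followed by "-" yields the file's lines first and the piped lines second, which matches the behavior described for `shoes testargfshoes.rb - <aus_simple.txt`.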

Use bash to extract data between two regular expressions while keeping the formatting

but I have a question about a small piece of code using the awk command. I have not found an answer/solution anywhere.
I am trying to parse an output file and extract all data between the 1st expression (including) ATOMIC and 2nd expression (excluding) Bond. This data is to be sent to a new file $1_geom. So far I have the following:
`awk '/ATOMIC/{flag=1;next}/Bond lengths in Bohr/{flag=0}flag' $1` >> $1_geom
This script will extract the correct data for me, but there are 2 problems:
The line ATOMIC is not extracted with the data
The data is extracted and appended to a single line. I want the data to retain the formatting from the parsed file (5 columns, variable amount of lines). Please see attachment to see a visual. Visual Example Attachment. Is there a different way to append data (other than >>) so that I can keep formatting?
Any help is appreciated, thank you.
The next is causing the first match to be skipped; take it out if you don't want that.
The backticks by themselves are a shell syntax error (unless your Awk script happens to produce valid shell commands). I'm guessing you have a useless echo or something like that in your actual script which disarms the error, but instead produces the symptoms you describe.
This was part of a code in a csh script and I did have an "echo" in front of this line. Removing the "echo" makes it work perfectly and addresses the 2 questions that I had.

Ruby script return array to powershell

I have a Ruby script that I call from a PowerShell script. I want Ruby to return its result as an array to PowerShell, so that I can use the array there. I am a complete beginner in Ruby, so I need help constructing the array in Ruby.
The vast majority of Ruby implementations are built on the Unix model: every input and output is an (unstructured) character stream. So you will need additional PowerShell code to parse that unstructured character stream into a PowerShell array. This can be made easier if you emit a well-known format as your character stream, such as JSON, YAML, XML, XAML, or CSV.
Alternatively, you could try an approach with IronRuby and write a PowerShell cmdlet in Ruby.
You left out what your Ruby array should contain, so I would recommend to look at the examples in the Ruby documentation, like
first_array = ["Matz", "Guido"]
on how to create and populate a Ruby array.
The second step is how you transfer the contents of this array into the outer / calling Powershell script. One way is using the standard output stream, thus the Ruby script writes the contents of the array to standard output,
and the Powershell script catches it as described here.
Example for the Ruby side (when you entered the example line above):
puts first_array
will result in such output
Matz
Guido
with an entry on each line, or if you prefer CSV, you might try
puts first_array.join(',')
which will result in this output
Matz,Guido
Or you can use JSON or whatever format fits best when you parse that output later.
The final step is to parse that output string within PowerShell back into an array.
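For the JSON variant, the Ruby side could look roughly like the sketch below, using the standard-library `json` module. The helper name `array_as_json` is hypothetical; the array contents are the ones from the example above.

```ruby
require 'json'

# Hypothetical helper: serialize a Ruby array as a single line of JSON so
# the calling script can parse standard output unambiguously.
def array_as_json(items)
  JSON.generate(items)
end

first_array = ["Matz", "Guido"]
puts array_as_json(first_array)
```

On the PowerShell side, piping the script's output through `ConvertFrom-Json` should turn that line back into an array. Unlike the comma-joined form, JSON stays unambiguous when an element itself contains a comma or a newline.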

Most reliable way to get text into ruby script

I have a Ruby script that’ll do some text parsing (à la Markdown). It does it in a sequence of steps, like
string = string.gsub # more code here
string = string.gsub # more code here
# and so on
What is the best (i.e. most reliable) way to feed text into string in the first place? It’s a script, and the text it’ll be fed can vary a lot: it can be multilingual, have some characters that might trip a shell (like ", ', ’, &, $, you get the idea), and will likely be multi-line.
Is there some trick on the lines of
cat << EOF
bunch of text here
EOF
Additional considerations
I’m not looking for a markdown parser, this is something I want to do, not something I want a tool for.
I’m not a big ruby user (I’m starting to use it), so the more detailed the answer you can provide, the better.
It must be completely scriptable (i.e., no interrupting to ask the user for information).
The Kernel#gets method will read a string separated using the record separator from stdin or files specified on the command line. So if you use that you can do things like:
yourscript <filename #read from filename
yourscript file1 file2 # read both file1 and file2
yourscript #lets you type at your script
So to run something like:
cat <<'eof' |ruby yourscript.rb
This' & will $all 'eof' be 'fine'''
eof
Script might contain something like:
s = gets() # read a line
lines = readlines() # read all lines into an array
That's fairly standard for command-line scripts. If you want to have a user interface then you'll want something more complex. The Ruby interpreter also has an option (-E, or --encoding) to set the encoding of files as they are read.
Just read from stdin (which is an IO object):
$stdin.read
As you can see, stdin is provided in the global variable $stdin. Since it’s an IO object, there are a lot of other methods available if read doesn’t suit your needs.
Here’s a simple one-line example in the shell:
$ printf 'foo\nbar\n' | ruby -e 'puts $stdin.read.upcase'
FOO
BAR
Obviously reading from stdin is extremely flexible since you can pipe input in from anywhere.
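Because the text arrives over stdin rather than through shell arguments, quotes, ampersands and dollar signs reach the script untouched. The sketch below reads the whole stream and checks its encoding before any gsub steps run. The helper name `read_text` is hypothetical, and treating the input as UTF-8 is an assumption that may not hold for every source.

```ruby
# Hypothetical helper: read all of the stream and insist on valid UTF-8.
# UTF-8 is an assumption here; adjust if the input uses another encoding.
def read_text(io = $stdin)
  text = io.read.force_encoding(Encoding::UTF_8)
  raise ArgumentError, 'input is not valid UTF-8' unless text.valid_encoding?
  text
end

# Intended use at the top of the script (not executed here):
#   string = read_text
#   string = string.gsub(...)  # first transformation step, and so on
```

Failing early on invalid bytes keeps later regular-expression steps from raising encoding errors midway through the pipeline.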
Ruby is very adept at encodings (see e.g. the Encoding docs). To get text into Ruby, one typically uses either gets, or reads File objects, or uses a GUI, which one can build with the gtk2 gem or rugui (if already finished). If you are getting text from the wild internet, security should be your concern. Ruby used to offer $SAFE levels for this, but the mechanism was cut back over several releases and removed in Ruby 3.0, so it should not be relied on. In any case, the best strategy for handling strings is to know as much as possible in advance about the properties of the string you expect. Handling absolutely arbitrary strings is a surprisingly difficult task. Try to limit the number of possible encodings and work out the maximum size of the string you expect.
Also, with respect to your original stated goal writing a markdown-processor-like something, you might want to not reinvent the wheel (unless it is for didactic purposes). There is this SO post:
Better ruby markdown interpreter?
The answer will direct you to kramdown gem, which gets a lot of praise, though I have not tried it personally.

Writing a raw data file in Mathematica

I have a number in Mathematica, a large number. I have even gotten this number in base 16 form, using OutputForm[]. I am basically trying to write out a number to a file in hex format.
Please keep in mind I am using 123456 in these examples instead of my 70,000 digit number.
Whenever I write a file using a simple Put[123456, "file.raw"] command, I get a raw data file with the actual data 3132333435360A with a line ending.
If I use Put[OutputForm[BaseForm[123456, 16]], "file.raw"] command, I get a raw data file with the data in hex format 31653234300A202020202031360A but still not written as raw data.
I would like the Hex Form of the Number Dumped as Data.
I have tried Export, BinaryWrite, and DumpSave, but can't figure it out.
I just am getting a headache I guess cause I can't see past what I need to do.
One thing I did try was doing:
Export["file.raw", 123456];
But the file is not raw enough. What I mean by that is that there is header data and extra crap.
Would love to get this working thanks.
Please let us know what you expect to see in your output file, and what you want use it for. Do you want something a human can read, or something in a specified format to be used by a computer? Please provide an example.
The two examples using Put[] correctly provide files containing ASCII characters corresponding to the text representations of your inputs, and which are human-readable.
I think what you're looking for is IntegerString[_,16]:
In[33]:= IntegerString[123456, 16]
Out[33]= "1e240"
str = OpenWrite[];
WriteString[str, IntegerString[123456, 16]];
Close[str];
FilePrint[%]
1e240
(using WriteString instead of Put avoids having the string characters
