Handling ARGF.read in Ruby

I am using the following lines of code in my Ruby program on Ubuntu:
data = ARGF.read
if data.length != 0
  .....
end
The program runs fine when I run it as "cat file.txt | ruby test.rb"; however, I am unable to handle the following issues:
When run as "cat | ruby test.rb", the program hangs in what looks like an endless loop.
When run as "ruby test.rb", the program hangs in what looks like an endless loop.
When run as "cat file1.txt | ruby test.rb" (where file1.txt does not exist), the program gives a "cat: file1.txt: No such file or directory" error.
Any input will be highly appreciated.

I think you misunderstand what ARGF is used for. ARGF.read returns the concatenated contents of all the files passed as arguments (or of standard input when no files are given).
When you don't give any input file, it waits for you to provide input through stdin. Since you are on Ubuntu, you can just press Ctrl+D, which ends the stream, and then the data can be processed normally.
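If you want the script not to block when nothing is piped in and no file is named, one option (a minimal sketch; the usage message and structure are just illustrative) is to check whether stdin is an interactive terminal before reading:
# test.rb - sketch: skip the blocking read when no files are given
# and stdin is an interactive terminal (nothing was piped in)
if ARGV.empty? && $stdin.tty?
  warn "usage: cat file.txt | ruby test.rb  (or: ruby test.rb file.txt)"
  exit 1
end
data = ARGF.read
if data.length != 0
  # ... process data ...
end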

Related

Ruby STDIN, blocking vs not blocking

I'm trying to find some documentation on how STDIN is handled in Ruby.
I've experimented with this simple script:
# test.rb
loop do
  puts "stdin: #{$stdin.gets}"
  sleep 2
end
That I've run from bash (on OS X) with:
$ ruby test.rb
As I expected, the call to $stdin.gets is blocking, and the loop waits for the next input. The 2-second sleep even allows me to enter more lines in one go, and the loop correctly prints them in order, then blocks again once STDIN is drained:
$ ruby test.rb
a
stdin: a
b
stdin: b
c
d
e
stdin: c
stdin: d
stdin: e
So far, all good. I was expecting this.
Then, I made a test with a pipe:
$ mkfifo my_pipe
$ ruby test.rb < my_pipe
And, in another shell:
$ echo "Hello" > my_pipe
This time, it behaved a bit differently.
At first it did wait, blocking the loop. But then, after the first input was passed through the pipe, it kept looping and printing empty strings:
$ ruby test.rb < my_pipe
stdin: Hello
stdin:
stdin:
stdin: Other input
stdin:
So my question is: why the difference? Does it treat the pipe like an empty file? Where is this documented? The docs don't say anything about the blocking behaviour, but they do say that:
Returns nil if called at end of file.
It's a start.
So the short answer is yes, you are getting an EOF from the pipe. The way echo works is that it opens the pipe for writing, writes to it, then closes it (i.e. sends EOF). A new call to echo will open it back up, write to it, and close it again.
If you had instead used a program that prints the lines of a file with a 3-second sleep between them, you would see that your application performs blocking waits until that writer exits (at which point the never-ending EOFs return).
# slow_write.rb
ARGF.each do |line|
  puts line
  STDOUT.flush
  sleep 3
end
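To try it, you could run something like ruby slow_write.rb some_file.txt > my_pipe from the second shell (the file name here is just an example) while test.rb reads from the pipe; the reader then blocks between lines instead of spinning.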
I should note that this behavior is not specific to Ruby. The C stdio library has the exact same behavior, and since most languages use the C primitives as their basis, they behave the same way as well.
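If the goal is simply for the reader loop to stop once the writer closes the pipe, a minimal sketch of test.rb (the same loop as above, just checking for the nil that gets returns at EOF) could look like this:
# test.rb - sketch: stop instead of spinning once the writer closes the pipe
loop do
  line = $stdin.gets
  break if line.nil?   # gets returns nil at EOF
  puts "stdin: #{line}"
  sleep 2
end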

Repeating Bash Task using At

I am running Ubuntu 13.10 and want to write a bash script that will execute a given task at non-predetermined time intervals. My understanding is that cron jobs require me to know when the task will be performed again. Thus, I was recommended to use "at."
I'm having a bit of trouble using "at." Based on some experimentation, I've found that
echo "hello" | at now + 1 minutes
will run in my terminal (with and without quotes). Running "atq" results in my computer telling me that the command is in the queue. However, I never see the results of the command. I assume that I'm doing something wrong, but the manpages don't seem to be telling me anything useful.
Thanks in advance for any help.
Besides the fact that commands are run without a terminal (output and input are probably redirected to /dev/null), your command would also not run, since what you're passing to at is not echo hello but just hello. Unless hello is really an existing command, it won't run. What you probably want is:
echo "echo hello" | at now + 1 minutes
If you want to know if your command is really running, try redirecting the output to a file:
echo "echo hello > /var/tmp/hello.out" | at now + 1 minutes
Check the file later.

$stdin.gets is not working when executing a Ruby script via a pipeline

Here is some sample Ruby code:
r = gets
puts r
If the script is executed standalone from the console, it works fine. But if I run it via a pipeline:
echo 'testtest' | ruby test.rb
gets seems to be redirected to the pipeline input, but I need some user input.
How?
Stdin has been attached to the receiving end of the pipe by the invoking shell. If you really need interactive input, you have a couple of choices. You can open the tty input directly, leaving stdin bound to the pipe:
tty_input = open('/dev/tty') {|f| f.gets }
/dev/tty works under Linux and OS X, but might not work everywhere.
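Putting that together, a minimal sketch (the prompt and variable names are just illustrative) that consumes the piped data from stdin and still asks the user a question on the terminal:
# sketch: consume the piped data, then prompt interactively via /dev/tty
piped_data = $stdin.read
print "Continue? (y/n) "
answer = File.open('/dev/tty') { |tty| tty.gets }
puts "got #{piped_data.bytesize} bytes; you answered #{answer.to_s.chomp}"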
Alternatively, you can use a different form of redirection, process substitution, under bash to supply the (formerly piped) input as a pseudo-file passed as an argument and leave stdin bound to your terminal:
ruby test.rb <(echo 'testtest')
# test.rb
input = open(ARGV[0])                          # the <(echo ...) pseudo-file
std_input = $stdin.gets                        # interactive input still comes from the terminal
input.each_line { |line| process_line(line) }  # process_line is a placeholder
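Run it as ruby test.rb <(echo 'testtest'): the echoed data arrives through the pseudo-file named by ARGV[0], while $stdin.gets (and therefore user input) still comes from your terminal.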

Trouble with UNIX pipes

So I have some Ruby code that loops, printing strings to stdout using puts and then sleeping using sleep. I then have some node.js code that listens on stdin for data events and simply logs what it gets from stdin.
If I run echo 'something' | node my_code.js I'll see something, but if I run ruby my_code.rb | node my_code.js I don't see anything.
Am I not able to redirect the stdout from the ruby code to stdin of the node.js code using a UNIX pipe?
There should be very little difference between the two, and the Ruby code should be fine.
However, you are clearly seeing a problem, so let's narrow it down. What happens if you run the Ruby through tee?
ruby my_code.rb | tee file
Do you see the output? If not, start investigating your Ruby code. (Does it work when you run it without piping its output?) If you do see the output as you expect, does the Ruby program stop (exit)? Do you get your command-line prompt back?
If there's nothing anomalous with the Ruby, what happens with the JavaScript when you pipe a multiline file to it:
cat my_code.rb | node my_code.js
I expect one of these scenarios to provide you with something to chase.
Try this:
ruby my_code.rb | awk '{print;fflush()}' | node my_code.js
Or this:
ruby my_code.rb | grep --line-buffered '.*' | node my_code.js
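The underlying issue is that Ruby buffers stdout when it is writing to a pipe rather than a terminal. If you control the Ruby side, the simplest fix is to disable that buffering in the script itself; a minimal sketch of the producer (the loop body is just illustrative):
# my_code.rb - sketch: flush every write so lines reach node immediately
$stdout.sync = true
loop do
  puts "something"
  sleep 1
end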

Jenkins console output not in realtime

Pretty new to Jenkins, and I have a simple yet annoying problem. When I run a job (build) on Jenkins, I trigger a ruby command to execute my test script.
The problem is that Jenkins is not displaying the console output in real time. Here is the trigger log:
Building in workspace /var/lib/jenkins/workspace/foo_bar
No emails were triggered.
[foo_bar] $ /bin/sh -xe /tmp/hudson4042436272524123595.sh
+ ruby /var/lib/jenkins/test-script.rb
Basically it hangs on this output until the build is complete, then it just shows the full output. The funny thing is this is not consistent behavior; sometimes it works as it should, but most of the time there is no real-time console output.
Jenkins version: 1.461
To clarify some of the answers.
Ruby, Python, or any sensible scripting language will buffer its output; this is done to minimize IO, since writing to disk is slow and writing to a console is slow.
Usually the data gets flush()'ed automatically once there is enough data in the buffer, with special handling for newlines. For example, writing a string without a newline and then calling sleep() would not write anything until after the sleep() is complete (I'm only using sleep as an example; feel free to substitute any other expensive system call).
e.g. this would wait 8 seconds, print one line, wait 5 more seconds, print a second line.
# Python 2 example
import time

def test():
    print "ok",
    time.sleep(3)
    print "now",
    time.sleep(5)
    print "done"
    time.sleep(5)
    print "again"

test()
For Ruby, STDOUT.sync = true turns autoflush on; all writes to STDOUT are followed by a flush() (a short sketch showing this in context follows the Perl note below). This would solve your problem but results in more IO.
STDOUT.sync = true
For Python, you can use python -u or the environment variable PYTHONUNBUFFERED to make stdin/stdout/stderr unbuffered, but there are other solutions that do not change stdin or stderr.
export PYTHONUNBUFFERED=1
For Perl, you have autoflush:
autoflush STDOUT 1;
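To put the Ruby option in context, here is a minimal sketch of a script like the one Jenkins invokes (the loop body is just illustrative), with autoflush enabled before any output is produced:
# test-script.rb - sketch: enable autoflush before producing any output,
# so Jenkins sees each line as soon as it is written
STDOUT.sync = true
STDERR.sync = true
5.times do |i|
  puts "running step #{i}"
  sleep 1   # stands in for real test work
end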
Make sure your script is flushing its stdout and stderr.
In my case I had a buffering issue similar to what you describe, but I was using Python.
The following python code fixed it for me:
import sys
sys.stdout.flush()
I'm not a Ruby coder, but Google reveals the following:
$stdout.flush
It seems to me that python -u works as well.
E.g. in a batch command:
python -u foo.py
The easiest solution here is to turn on syncing of the buffer to the output. This is something that @Craig wrote about in his answer, but it is a one-line solution that covers the whole script and does not require you to flush the buffer many times.
Just write
STDOUT.sync = true
The logic behind it is simple: to avoid performing many small IO operations, output is buffered by default. Setting STDOUT.sync = true disables that buffering; to turn it back off later (re-enabling buffering), use
STDOUT.sync = false
This is the Ruby solution, of course.
Each of the other answers is specific to one program or another, but I found a more general solution here:
https://unix.stackexchange.com/a/25378
You can use stdbuf to alter the buffering behavior of any program.
In my case, I was piping output from a shell script through tee and grep to split lines between the console and a file based on content. The console was hanging as described by the OP. This solved it:
./slowly_parse.py login.csv |tee >(grep -v LOG: > out.csv) | stdbuf -oL -eL grep LOG:
Eventually I discovered I could just pass --line-buffered to grep for the same result:
./slowly_parse.py login.csv |tee >(grep -v LOG: > out.csv) | grep --line-buffered LOG:
The other answers are correct in saying that you need to ensure standard output is not buffered.
The other thing to be aware of is that Jenkins itself does line-by-line buffering. If you have a slow-running process that emits single characters (for example, an NUnit test suite summary that prints a . for a successful test and an E for an error), you will not see anything until the end of the line.
[True for my Jenkins 1.572 running on a Windows box.]
For some commands, including tee, the best choice for unbuffering is a program called unbuffer from the expect package.
Usage example:
instead of
somecommand | tee /some/path
do
somecommand | unbuffer -p tee /some/path
Sources and more info:
https://stackoverflow.com/a/11337310/2693875
https://unix.stackexchange.com/a/25375/53245
The operating system buffers output data by nature, to save CPU, and so does Jenkins.
It looks like you are using a shell command to run your Ruby script.
I suggest running your Ruby script directly via the dedicated plugin:
Jenkins Ruby Plugin
(you may need to install it)
Python buffers its output traces and prints them at the end of the script to minimize writing to the console, as writing to the console is slow.
You can use the following command after your traces. It will flush to the console all traces that were queued before that command:
sys.stdout.flush()
