How to read and process a file in Ruby with EventMachine - ruby

I am wondering if anybody figured out a way to process bigger (500Mb+) text files with EventMachine, where you actually need to access the individual lines.

I guess I found the answer. The only messy part is that read_chunk reschedules itself from inside the proc after io.gets, and I am not sure why it works :)
require 'eventmachine'
def process_line(line)
  puts line
end
EM.run do
  io = File.open('query_profiles.csv')
  read_chunk = proc do
    if line = io.gets
      process_line(line)
      EM.next_tick(read_chunk)
    else
      EM.stop
    end
  end
  EM.next_tick(read_chunk)
end
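As for why this works: the proc closes over the local variable read_chunk, which is already assigned by the time the reactor first calls it, and each EM.next_tick hands control back to the reactor between lines. Scheduling one tick per line can be slow on a 500 MB file, so one option is to read a small batch per tick. The sketch below shows only the batching step in plain Ruby; read_batch and the batch size of 100 are hypothetical, not part of EventMachine, and the commented lines suggest how it might slot into the proc.

```ruby
require 'stringio'

# Hypothetical helper (not an EventMachine API): read at most `limit`
# lines from `io`; returns an empty array once the stream is exhausted.
def read_batch(io, limit)
  lines = []
  while lines.size < limit && (line = io.gets)
    lines << line
  end
  lines
end

# Stand-in for the real file so the sketch runs on its own.
io = StringIO.new("first\nsecond\nthird\n")
first_batch = read_batch(io, 2)   # the first two lines
second_batch = read_batch(io, 2)  # the one remaining line

# Inside EM.run, the proc could then process one batch per tick:
#   read_chunk = proc do
#     batch = read_batch(io, 100)
#     if batch.empty?
#       io.close
#       EM.stop
#     else
#       batch.each { |line| process_line(line) }
#       EM.next_tick(read_chunk)
#     end
#   end
```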

Related

Ruby - STDIN.read

I'm trying to read from a file, and because my more complex code doesn't work, I came back to basics to see whether it even reads properly.
My code:
MyParser.new(STDIN.read).run.lines.each do |line|
  p line.chomp
end
I use
ruby program
(it's placed in a bin directory and I saved it without .rb )
Now program is waiting for me to write something. I type:
../examples/file.txt
and use CTRL + Z (I'm on Windows 10). It produces ^Z and I hit enter.
Now I have an error:
Invalid argument @ rb_sysopen - ../examples/file.txt (Errno::EINVAL)
MyParser class and its whole logic works fine. I'll be grateful for any hints.
Without knowing what your MyParser expects, it is hard to tell what you need.
But maybe this helps:
MyParser.new(STDIN.gets.strip).run.lines.each do |line|
  p line.chomp
end
I would add a prompt that says what input is expected:
puts "Please enter the filename"
STDOUT.flush
MyParser.new(STDIN.gets.strip).run.lines.each do |line|
  p line.chomp
end
With STDOUT.flush the user sees the message before STDIN.gets waits for input.
In your case I would take a look at ARGV and call the program with:
ruby program ../examples/file.txt
Your program should then use:
MyParser.new(ARGV.first).run.lines.each do |line|
  p line.chomp
end
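The two suggestions can be combined. STDIN.read returns everything typed, including the trailing newline, which is presumably why the original open call was rejected; gets followed by strip avoids that. A minimal sketch, assuming a hypothetical helper named filename_from that prefers the command-line argument and falls back to a prompt:

```ruby
require 'stringio'

# Hypothetical helper: prefer a filename passed on the command line,
# otherwise prompt for one. `input` and `output` are parameters only so
# the sketch can be exercised without a terminal.
def filename_from(args, input = $stdin, output = $stdout)
  return args.first unless args.empty?

  output.puts "Please enter the filename"
  output.flush
  line = input.gets
  line ? line.strip : nil
end

# Command-line style: the argument wins and nothing is read.
from_args = filename_from(["../examples/file.txt"])

# Interactive style, simulated with StringIO; strip drops the newline.
from_prompt = filename_from([], StringIO.new("../examples/file.txt\n"), StringIO.new)
```

In the real program the call would be filename_from(ARGV), with the result passed to MyParser.new.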

A ruby script to run other ruby scripts

If I want to run a bunch of ruby scripts (super similar, with maybe a number changed as a commandline argument) and still have them output to stdout, is there a way to do this?
i.e., a script to run these:
ruby program1.rb input_1.txt
ruby program1.rb input_2.txt
ruby program1.rb input_3.txt
like
(1..3).each do |i|
  ruby program1.rb input_#{i}.txt
end
in another script, so I can just run that script and see the output in a terminal from all 3 runs?
EDIT:
I'm struggling to implement the second highest voted suggested answer.
I don't have a main function within my program1.rb, whereas the suggested answer has one.
I've tried this, for script.rb:
require "program1.rb"
(1..6).each do |i|
  driver("cmd_line_arg_#{i}","cmd_line_arg2")
end
but no luck. Is that right?
You can use load to run the script you need (the difference between load and require is that require will not run the script again if it has already been loaded).
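To make each run have different arguments (given that they are read from ARGV), you need to replace the contents of ARGV before each load. ARGV is a constant, so reassigning it triggers an "already initialized constant" warning; ARGV.replace avoids that:
(1..6).each do |i|
  ARGV.replace(["cmd_line_arg_#{i}", "cmd_line_arg2"])
  load 'program1.rb'
end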
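If the scripts do not need to share state, another option is to start each run as its own Ruby process, which leaves ARGV alone and lets every run write to the same terminal. A minimal sketch, assuming the program1.rb and input_N.txt names from the question; commands_for and run_all are hypothetical helper names.

```ruby
require 'rbconfig'

# Hypothetical helper: build one argument list per input file.
# RbConfig.ruby is the path of the interpreter that is currently running.
def commands_for(script, inputs)
  inputs.map { |input| [RbConfig.ruby, script, input] }
end

# Hypothetical helper: run each command in turn. The child processes
# inherit stdout, so their output appears in the calling terminal.
def run_all(script, inputs)
  commands_for(script, inputs).map { |cmd| system(*cmd) }
end

# Usage, assuming program1.rb exists in the current directory:
#   run_all('program1.rb', (1..3).map { |i| "input_#{i}.txt" })
```

system returns true for a zero exit status, false for a non-zero one, and nil if the command could not be started, so the array returned by run_all shows which runs succeeded.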
# script_runner.rb
require_relative 'program_1'
module ScriptRunner
  class << self
    def run
      ARGV.each do |file|
        SomeClass.new(file).process
      end
    end
  end
end
ScriptRunner.run
.
# program_1.rb
class SomeClass
  attr_reader :file_path
  def initialize(file_path)
    @file_path = file_path
  end
  def process
    # File.read opens, reads and closes the file in one call
    puts File.read(file_path)
  end
end
Doing something like the code shown above would allow you to run:
ruby script_runner.rb input_1.txt input_2.txt input_3.txt
from the command line - useful if your input files change. Or even:
ruby script_runner.rb *.txt
if you want to run it on all text files. Or:
ruby script_runner.rb inputs/*
if you want to run it on all files in a specific dir (in this case called 'inputs').
This assumes that program_1.rb is not just a block of procedural code but provides some object (class) that encapsulates the logic you want to apply to each file, meaning you can require program_1.rb once and then use the object it provides as many times as you like. Otherwise you'll need to use #load:
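# script_runner.rb
module ScriptRunner
  class << self
    def run
      # Iterate over a copy, because ARGV is rewritten before each load
      ARGV.dup.each do |file|
        # load's second argument is a wrap flag, not a script argument,
        # so the file name has to be passed through ARGV
        ARGV.replace([file])
        load('program_1.rb')
      end
    end
  end
end
ScriptRunner.run
.
# program_1.rb
puts File.read(ARGV[0])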

Capturing failure details to file using ruby test/unit

I'm writing a quick test using the test/unit gem and want to write error/failure details to a file. In teardown I'm using the @test_passed variable to detect a failure and then write to a file, but I can't find the variable that holds the name of the failing method or any failure details.
I really only want to capture the errors. It seems like it should be fairly simple. Anyone know what variables test/unit is using to store the error details?
Below is an example how I'm trying to dump out the errors:
require "test/unit"
class MyTest < Test::Unit::TestCase
  def setup
  end
  def teardown
    if @test_passed then
      puts "no errors"
    else
      File.open("errors.txt", "a+") do |f|
        f.puts "Error in #{what_is_the_variable_for_the_method_name}"
        f.puts "#{variable_with_error_details_like_expecting_this_but_got_that}"
      end
    end
  end
  def test_fail
    a = 9
    assert_equal(a, 10)
  end
end
The source files of the Test::Unit framework are installed on your system. Look for testcase.rb.
The results are stored in a @_result member.

Get output from program into Ruby processing

I am using the Ruby processing library.
I would like to pipe output from a program into my code. For example, echo "hello" | rp5 run receiver.rb.
In a normal program, I know I can accomplish this with
while $stdin.gets
puts $_
puts "Receiving!"
end
And I know that in processing, the program loops through the draw function continuously. So I tried this code, but it did not work, since it freezes on the line puts $stdin.gets. So I know it must be a problem with the pipes not matching up, so I'm going to try using named pipes so that there is no confusion.
def setup
  puts "setting up"
end
def draw
  puts "drawing"
  puts $stdin
  puts $stdin.gets
  puts "after gets"
  while $stdin.gets
    puts $_
    puts "Receiving!"
  end
  puts "done drawing"
end
Any suggestion would be appreciated. I'm running Ubuntu 12.04.
Yep, the named pipes worked. Check out this example to get you started and make sure you have the latest version of JRuby loaded.
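The freeze comes from gets blocking until a full line arrives while draw is being called over and over. Independent of the named-pipe fix, one way to keep draw responsive is to check whether input is ready before reading. The sketch below uses IO.select with a zero timeout; poll_line is a hypothetical helper, and whether it behaves the same under JRuby and ruby-processing is an assumption that was not tested here.

```ruby
# Hypothetical helper: return the next line from `io` if data is already
# waiting, or nil straight away instead of blocking.
def poll_line(io)
  ready, = IO.select([io], nil, nil, 0)  # timeout 0 means "just poll"
  ready ? io.gets : nil                  # gets still waits for a newline
end

# Demonstration with an in-process pipe standing in for $stdin.
reader, writer = IO.pipe
nothing_yet = poll_line(reader)  # nil, nothing has been written so far
writer.puts "hello"
line = poll_line(reader)         # the line that was just written
writer.close
reader.close

# Inside draw, the blocking loop could become a single poll per frame:
#   if (line = poll_line($stdin))
#     puts line
#     puts "Receiving!"
#   end
```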

Ruby deleting directories

I'm trying to delete a non-empty directory in Ruby and no matter which way I go about it it refuses to work.
I have tried using FileUtils, system calls, recursively going into the given directory and deleting everything, but always seem to end up with (temporary?) files such as
.__afsECFC
.__afs73B9
Anyone know why this is happening and how I can go around it?
require 'fileutils'
FileUtils.rm_rf('directorypath/name')
Doesn't this work?
Safe method: FileUtils.remove_dir(somedir)
Realised my error, some of the files hadn't been closed.
Earlier in my program I was using
File.open(filename).read
which I swapped for a
f = File.open(filename, "r")
while line = f.gets
  puts line
end
f.close
And now
FileUtils.rm_rf(dirname)
works flawlessly
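File.open(filename).read leaves the handle open until it is garbage collected, which is what kept the directory busy. A variation on the fix is the block form of File.open, which closes the file when the block exits, even if an exception is raised. A small sketch in a temporary directory; print_lines is a hypothetical helper name.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: print each line of a file. The block form of
# File.open closes the handle when the block exits.
def print_lines(filename, out = $stdout)
  File.open(filename, "r") do |f|
    f.each_line { |line| out.puts line }
  end
end

# Demonstration in a throwaway directory so nothing real is deleted.
dir = Dir.mktmpdir
path = File.join(dir, "example.txt")
File.write(path, "one\ntwo\n")
print_lines(path)
FileUtils.rm_rf(dir)  # no handles are left open inside dir
```

For reading a whole file in one go, File.read(filename) also closes the file before returning.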
I guess the best way to remove a directory with all its content "without using an additional lib" is a simple recursive method:
def remove_dir(path)
  if File.directory?(path)
    Dir.foreach(path) do |file|
      if ((file.to_s != ".") and (file.to_s != ".."))
        remove_dir("#{path}/#{file}")
      end
    end
    Dir.delete(path)
  else
    File.delete(path)
  end
end
remove_dir(path)
The built-in pathname gem really improves the ergonomics of working with paths, and it has an #rmtree method that can achieve exactly this:
require "pathname"
path = Pathname.new("~/path/to/folder").expand_path
path.rmtree
