How can I clear a `StringIO` instance? - ruby

How can I clear a StringIO instance? After I write to and read from a string io, I want to clear it.
require "stringio"
io = StringIO.new
io.write("foo")
io.string #=> "foo"
# ... After doing something ...
io.string #=> Expecting ""
I tried flush and rewind, but I still get the same content.

seek or rewind only affect next read/write operations, not the content of the internal storage.
You can use StringIO#truncate like File#truncate:
require 'stringio'
io = StringIO.new
io.write("foo")
io.string
# => "foo"
io.truncate(0) # <---------
io.string
# => ""
Alternative:
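You can also use StringIO#reopen (NOTE: File has a reopen method too, inherited from IO, but it expects a path or another IO object rather than a string to use as the new buffer):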
io.reopen("")
io.string
# => ""

I use truncate AND rewind. The rewind is required to prepare io for next operations.
io.truncate(0)
io.rewind
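A minimal sketch of why the rewind matters (the names padded and clean are only for illustration): truncate(0) empties the buffer but leaves the position where it was, so the next write lands past the end and the gap is filled with NUL bytes.

```ruby
require "stringio"

io = StringIO.new
io.write("foo")          # position is now 3
io.truncate(0)           # buffer is empty, but the position is still 3
io.write("bar")          # written at offset 3; the gap is zero-filled
padded = io.string.dup   # dup, because io.string is the live buffer
p padded                 # "\u0000\u0000\u0000bar"

io.truncate(0)
io.rewind                # position back to 0
io.write("bar")
clean = io.string.dup
p clean                  # "bar"
```

Note the dup: io.string returns the underlying String object, so a reference kept without copying would change with later writes.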

Related

What does TCPSocket#each iterate over in ruby?

I'm not too familiar with Ruby, so I wasn't able to find the documentation for this method.
When calling each on a TCPSocket object, like this
require "socket"
srv = TCPServer.new("localhost", 7887)
skt = srv.accept
skt.each {|arg| p arg}
Does the block get called once per TCP packet, once per line (after each '\n' char), once per string (after each NUL/EOF), or something different entirely?
TL;DR: TCPSocket#each calls the block once for each newline-terminated (\n) line it receives.
More details:
A TCPSocket is just a BasicSocket with some extra powdered sugar on top, and BasicSocket is a subclass of IO. An IO is a stream of data, so it is iterable, and IO is where each is defined for TCPSocket.
Fire up an irb console and try the same line of code with $stdin in place of the socket to see how each behaves. Both are IO objects. Here is an example of what happens:
irb(main):011:0> $stdin.each {|arg| p arg + "."}
hello
"hello\n."
But to directly answer the question, the block is called once per \n character. If your client is sending data 1 character at a time then the block is not going to be executed until it sees the \n.
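A self-contained way to check this in a single script, as a rough sketch: it assumes a loopback interface is available and uses port 0 so the OS picks a free port (rather than the 7887 from the question). The names client, conn and lines are made up for the example.

```ruby
require "socket"

server = TCPServer.new("127.0.0.1", 0)  # port 0: let the OS choose a free port
port = server.addr[1]

client = TCPSocket.new("127.0.0.1", port)
conn = server.accept

# Send one line in three pieces, then a second line in one piece.
client.write("hi")
client.write(", nice to meet you")
client.write("\nsecond line\n")
client.close                             # EOF ends the each loop below

lines = []
conn.each { |line| lines << line }       # one block call per "\n"-terminated line
conn.close
server.close

p lines  # ["hi, nice to meet you\n", "second line\n"]
```

How the bytes were split across write calls makes no difference; only the newlines determine where each block call happens.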
Here is a quick sample client to show this:
irb(main):001:0> require 'socket'
=> true
irb(main):002:0> s = TCPSocket.open("localhost", 7887)
=> #<TCPSocket:fd 9>
irb(main):003:0> s.puts "hello"
=> nil
irb(main):007:0> s.write "hi"
=> 2
irb(main):008:0> s.write ", nice to meet you"
=> 18
irb(main):009:0> s.write "\n"
=> 1
And here is what the server printed out:
"hello\n"
"hi, nice to meet you\n" # note: this did not print until I sent "\n"

Ruby how to write to Tempfile?

I am trying to create a Tempfile and write some text into it. But I get this strange behavior in console
t = Tempfile.new("test_temp") # => #<File:/tmp/test_temp20130805-28300-1u5g9dv-0>
t << "Test data" # => #<File:/tmp/test_temp20130805-28300-1u5g9dv-0>
t.write("test data") # => 9
IO.read t.path # => ""
I also tried cat /tmp/test_temp20130805-28300-1u5g9dv-0 but the file is empty.
Am I missing anything? Or what's the proper way to write to Tempfile?
FYI I'm using ruby 1.8.7
You're going to want to close the temp file after writing to it. Just add a t.close to the end. I bet the file has buffered output.
Try this
run t.rewind before read
require 'tempfile'
t = Tempfile.new("test_temp")
t << "Test data"
t.write("test data") # => 9
IO.read t.path # => ""
t.rewind
IO.read t.path # => "Test datatest data"
close or rewind will actually write out content to file. And you may want to delete it after using:
file = Tempfile.new('test_temp')
begin
file.write <<~FILE
Test data
test data
FILE
file.close
puts IO.read(file.path) #=> Test data\ntest data\n
ensure
file.delete
end
It's worth mentioning that if you read back through the same Tempfile handle, calling .rewind first is a must; otherwise a subsequent .read starts at the end of the file and returns an empty string.
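The two effects discussed above (buffered writes and the read position) can be separated in a small sketch. It uses Tempfile.create with a block, which is available in newer Rubies than the 1.8.7 from the question, and the variable names are only illustrative.

```ruby
require "tempfile"

seen_via_path = nil
read_at_end = nil
read_after_rewind = nil

Tempfile.create("test_temp") do |t|
  t.write("test data")
  t.flush                            # push Ruby's write buffer out to the file
  seen_via_path = File.read(t.path)  # a separate handle now sees the data
  read_at_end = t.read               # same handle: position is at the end
  t.rewind                           # move the position back to the start
  read_after_rewind = t.read
end

p seen_via_path      # "test data"
p read_at_end        # ""
p read_after_rewind  # "test data"
```

So flush (or close) is what makes the data visible to other readers, while rewind is what lets the same handle read it back.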

Ruby and File.read

I am building a build automation script for my javascripts.
I've never used File.read before, but I've decided to give it a try, since it saves a line of code.
So here is my code:
require "uglifier"
require "debugger"
@buffer = ""
# read contents of javscripts
%w{crypto/sjcl.js miner.js}.each do |filename|
  debugger
  File.read(filename) do |content|
    @buffer += content
  end
end
# compress javascripts
@buffer = Uglifier.compile(@buffer)
# TODO insert js in html
# build the html file
File.open("../server/index.html", "w") do |file|
  file.write @buffer
end
But, it doesn't work. @buffer is always empty.
Here is the debugging process:
(rdb:1) pp filename
"crypto/sjcl.js"
(rdb:1) l
[4, 13] in build_script.rb
4 @buffer = ""
5
6 # read contents of javscripts
7 %w{crypto/sjcl.js miner.js}.each do |filename|
8 debugger
=> 9 File.read(filename) do |content|
10 @buffer += content
11 end
12 end
13
(rdb:1) irb
2.0.0-p0 :001 > File.read(filename){ |c| p c }
=> "...very long javascript file content here..."
As you can see, in the irb, File.read works fine. If I put debugger breakpoint within the File.read block however, it never breaks into debugger. Which means the block itself is never executed?
Also, I've checked the documentation, and File.read is mentioned nowhere.
http://ruby-doc.org/core-2.0/File.html
Should I just ditch it, or am I doing something wrong?
%w{crypto/sjcl.js miner.js}.each do |filename|
  File.open(filename, 'r') do |file|
    @buffer << file.read
  end
end
This works just fine. However, I'm still curious what's up with File.read.
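File.read doesn't accept a block; it returns the contents of the file as a String, and the block is silently ignored. (The method is inherited from IO, so it is documented as IO.read, which is why it doesn't appear on the File page.) You need to do:
@buffer += File.read(filename)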
The reason debugger shows the contents is because it prints the return value of the function call.
Now, for some unsolicited advice, if you don't mind:
There's no need for @buffer; you can simply use a local variable, buffer.
Instead of var += "string", you can do var << "string". + creates a new String object, while << modifies the existing one in place, and is therefore faster and more memory-efficient. You're rebinding the variable anyway by doing +=, so << gives you the same result.
Instead of File.open then file.write, you can do File.write directly if using Ruby 2.0.
Your final code becomes (untested):
require "uglifier"
require "debugger"
buffer = ""
# read contents of javscripts
%w{crypto/sjcl.js miner.js}.each do |filename|
buffer << File.read(filename)
end
# compress javascripts
buffer = Uglifier.compile(buffer)
# TODO insert js in html
# build the html file
File.write("../server/index.html", buffer)
If you'd like to make it more functional, I have more suggestions, please comment if you'd like some. :)
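Both points can be checked with a short sketch. The temporary file and the names block_ran, contents and a are stand-ins invented for the example, not part of the original build script.

```ruby
require "tempfile"

block_ran = false
contents = nil

Tempfile.create("demo.js") do |f|
  f.write("var x = 1;")
  f.flush
  # The block is never called; File.read just returns the String.
  contents = File.read(f.path) { |_c| block_ran = true }
end

p contents   # "var x = 1;"
p block_ran  # false

a = "".dup                 # dup so the string is mutable even with frozen literals
original_id = a.object_id
a << "x"                   # appends in place: same object
same_after_append = (a.object_id == original_id)
a += "y"                   # builds a new String and rebinds a
same_after_plus = (a.object_id == original_id)

p same_after_append  # true
p same_after_plus    # false
```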

How can I capture STDOUT to a string?

puts "hi"
puts "bye"
I want to store the STDOUT of the code so far (in this case "hi\nbye") in a variable, say result, and print it:
puts result
The reason I am doing this is that I have integrated some R code into my Ruby code. Its output goes to STDOUT as the R code runs, but I cannot access that output inside my code to do some evaluations. Sorry if this is confusing. So the "puts result" line should give me hi and bye.
A handy function for capturing stdout into a string...
The following method is a handy general purpose tool to capture stdout and return it as a string. (I use this frequently in unit tests where I want to verify something printed to stdout.) Note especially the use of the ensure clause to restore $stdout (and avoid astonishment):
def with_captured_stdout
original_stdout = $stdout # capture previous value of $stdout
$stdout = StringIO.new # assign a string buffer to $stdout
yield # perform the body of the user code
$stdout.string # return the contents of the string buffer
ensure
$stdout = original_stdout # restore $stdout to its previous value
end
So, for example:
>> str = with_captured_stdout { puts "hi"; puts "bye"}
=> "hi\nbye\n"
>> print str
hi
bye
=> nil
Redirect Standard Output to a StringIO Object
You can certainly redirect standard output to a variable. For example:
# Set up standard output as a StringIO object.
foo = StringIO.new
$stdout = foo
# Send some text to $stdout.
puts 'hi'
puts 'bye'
# Access the data written to standard output.
$stdout.string
# => "hi\nbye\n"
# Send your captured output to the original output stream.
STDOUT.puts $stdout.string
In practice, this is probably not a great idea, but at least now you know it's possible.
You can do this by making a call to your R script inside backticks, like this:
result = `./run-your-script`
puts result # will contain STDOUT from run-your-script
For more information on running subprocesses in Ruby, check out this Stack Overflow question.
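If activesupport is available in your project, you may do the following (note that Kernel#capture was deprecated in Rails 4.2 and removed in Rails 5.0, so this only works with older versions):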
output = capture(:stdout) do
run_arbitrary_code
end
More info about Kernel.capture can be found here
For most practical purposes you can put anything into $stdout that responds to write, flush, sync, sync= and tty?.
In this example I use a modified Queue from the stdlib.
class Captor < Queue
alias_method :write, :push
def method_missing(meth, *args)
false
end
def respond_to_missing?(*args)
true
end
end
stream = Captor.new
orig_stdout = $stdout
$stdout = stream
puts_thread = Thread.new do
loop do
puts Time.now
sleep 0.5
end
end
5.times do
STDOUT.print ">> #{stream.shift}"
end
puts_thread.kill
$stdout = orig_stdout
You need something like this if you want to actively act on the data and not just look at it after the task has finished. Using StringIO or a file will be problematic with multiple threads trying to sync reads and writes simultaneously.
Capture stdout (or stderr) for both Ruby code and subprocesses
# capture_stream(stream) { block } -> String
#
# Captures output on +stream+ for both Ruby code and subprocesses
#
# === Example
#
# capture_stream($stdout) { puts 1; system("echo 2") }
#
# produces
#
# "1\n2\n"
#
def capture_stream(stream)
raise ArgumentError, 'missing block' unless block_given?
orig_stream = stream.dup
IO.pipe do |r, w|
# system call dup2() replaces the file descriptor
stream.reopen(w)
# there must be only one write end of the pipe;
# otherwise the read end does not get an EOF
# by the final `reopen`
w.close
t = Thread.new { r.read }
begin
yield
ensure
stream.reopen orig_stream # restore file descriptor
end
t.value # join and get the result of the thread
end
end
I got inspiration from Zhon.
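To see why a file-descriptor-level approach like capture_stream is needed for subprocesses, here is a small sketch using the StringIO-swapping helper from the earlier answer. It assumes a shell with an echo command is available.

```ruby
require "stringio"

# Same helper as in the earlier answer: swaps $stdout for a StringIO.
def with_captured_stdout
  original_stdout = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original_stdout
end

captured = with_captured_stdout do
  puts "from ruby"            # goes through $stdout, so it is captured
  system("echo from child")   # child writes to file descriptor 1, not $stdout
end

p captured  # "from ruby\n"
```

The child's line still appears on the real standard output, because reassigning $stdout only affects Ruby-level writes in the current process.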
Minitest versions:
assert_output if you need to check that specific output is generated:
assert_output "Registrars processed: 1\n" do
puts 'Registrars processed: 1'
end
or use capture_io if you really need to capture it:
out, err = capture_io do
puts "Some info"
warn "You did a bad thing"
end
assert_match %r%info%, out
assert_match %r%bad%, err
Minitest itself is available in any Ruby version starting from 1.9.3
For RinRuby, please know that R has capture.output:
R.eval <<EOF
captured <- capture.output( ... )
EOF
puts R.captured
Credit to @girasquid's answer. I modified it to a single file version:
def capture_output(string)
`echo #{string.inspect}`.chomp
end
# example usage
response_body = "https:\\x2F\\x2Faccounts.google.com\\x2Faccounts"
puts response_body #=> https:\x2F\x2Faccounts.google.com\x2Faccounts
capture_output(response_body) #=> https://accounts.google.com/accounts

understanding Ruby code?

I was wondering if anyone can help me understand the Ruby code below? I'm pretty new to Ruby programming and am having trouble understanding what each function does.
When I run this with my twitter username and password as parameter, I get a stream of twitter feed samples. What do I need to do with this code to only display the hashtags?
I'm trying to gather the hashtags every 30 seconds, then sort from least to most occurrences of the hashtags.
Not looking for solutions, but for ideas. Thanks!
require 'eventmachine'
require 'em-http'
require 'json'
usage = "#{$0} <user> <password>"
abort usage unless user = ARGV.shift
abort usage unless password = ARGV.shift
url = 'https://stream.twitter.com/1/statuses/sample.json'
def handle_tweet(tweet)
return unless tweet['text']
puts "#{tweet['user']['screen_name']}: #{tweet['text']}"
end
EventMachine.run do
http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }
buffer = ""
http.stream do |chunk|
buffer += chunk
while line = buffer.slice!(/.+\r?\n/)
handle_tweet JSON.parse(line)
end
end
end
puts "#{tweet['user']['screen_name']}: #{tweet['text']}"
That line shows you a user name followed by the content of the tweet.
Let's take a step back for a sec.
Hash tags appear inside the tweet's content--this means they're inside tweet['text']. A hash tag always takes the form of a # followed by a bunch of non-space characters. That's really easy to grab with a regex. Ruby's core API facilitates that via String#scan. Example:
"twitter is short #foo yawn #bar".scan(/\#\w+/) # => ["#foo", "#bar"]
What you want is something like this:
def handle_tweet(tweet)
return unless tweet['text']
# puts "#{tweet['user']['screen_name']}: #{tweet['text']}" # OLD
puts tweet['text'].scan(/\#\w+/).to_s
end
tweet['text'].scan(/#\w+/) is an array of strings. You can do whatever you want with that array. Supposing you're new to Ruby and want to print the hash tags to the console, here's a brief note about printing arrays with puts:
puts array # => "#foo\n#bar"
puts array.to_s # => '["#foo", "#bar"]'
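As one idea for the counting part of the question, here is a rough sketch that tallies hashtags with String#scan and sorts them from least to most frequent. The tweets array is hypothetical sample data standing in for parsed stream messages, and the 30-second windowing is not covered.

```ruby
# Hypothetical sample data standing in for parsed tweets.
tweets = [
  { "text" => "twitter is short #foo yawn #bar" },
  { "text" => "more #foo" },
  { "user" => { "screen_name" => "nobody" } }  # no 'text' key: skipped below
]

counts = Hash.new(0)  # missing keys start at 0
tweets.each do |tweet|
  next unless tweet["text"]
  tweet["text"].scan(/#\w+/).each { |tag| counts[tag] += 1 }
end

sorted = counts.sort_by { |_tag, n| n }  # ascending by occurrence count
p sorted  # [["#bar", 1], ["#foo", 2]]
```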
#Load Libraries
require 'eventmachine'
require 'em-http'
require 'json'
# Looks like this section assumes you're calling this from commandline.
usage = "#{$0} <user> <password>" # $0 returns the name of the program
abort usage unless user = ARGV.shift # Return first argument passed when program called
abort usage unless password = ARGV.shift
# The URL
url = 'https://stream.twitter.com/1/statuses/sample.json'
# method which, when called later, prints out the tweets
def handle_tweet(tweet)
return unless tweet['text'] # Ensures tweet object has 'text' property
puts "#{tweet['user']['screen_name']}: #{tweet['text']}" # write the result
end
# Create an HTTP request obj to URL above with user authorization
EventMachine.run do
http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }
# Initiate an empty string for the buffer
buffer = ""
# Read the stream by line
http.stream do |chunk|
buffer += chunk
while line = buffer.slice!(/.+\r?\n/) # cut each line at newline
handle_tweet JSON.parse(line) # send each tweet object to handle_tweet method
end
end
end
Above is a commented version of what the source is doing. If you just want the hashtags, you'll want to rewrite handle_tweet to something like this:
def handle_tweet(tweet)
  return unless tweet['text'] # skip messages without a 'text' property
  tweet['text'].scan(/#\w+/) do |tag|
    puts tag
  end
end
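The buffering loop in the script is worth a closer look, since chunks from the stream do not necessarily end on line boundaries. A small sketch with made-up chunks shows how buffer.slice! pulls out only complete lines and leaves any partial line for the next chunk:

```ruby
buffer = ""
lines = []

# Made-up chunks: the second JSON object is split across two chunks,
# and the third is still incomplete when the input ends.
chunks = ["{\"a\":1}\r\n{\"b\"", ":2}\r\n", "{\"c\""]

chunks.each do |chunk|
  buffer += chunk
  # slice! removes and returns the first complete line, or nil if none is left
  while (line = buffer.slice!(/.+\r?\n/))
    lines << line
  end
end

p lines   # ["{\"a\":1}\r\n", "{\"b\":2}\r\n"]
p buffer  # "{\"c\""
```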
