What does TCPSocket#each iterate over in Ruby?

I'm not too familiar with Ruby, so I wasn't able to find the documentation for this method.
When calling each on a TCPSocket object, like this
require "socket"
srv = TCPServer.new("localhost", 7887)
skt = srv.accept
skt.each {|arg| p arg}
Does the block get called once per TCP packet, once per line (after each '\n' char), once per string (after each NUL/EOF), or something different entirely?

TL;DR: TCPSocket#each yields once for each newline-terminated (\n) line it receives.
More details:
A TCPSocket is just a BasicSocket with some extra powdered sugar on top, and BasicSocket is a subclass of IO. An IO is just a stream of data, so it is iterable, and IO is where each is defined for TCPSocket.
Fire up an irb console and run the same line of code against $stdin to see how each behaves; $stdin and a TCPSocket both inherit from IO. Here is an example of what happens:
irb(main):011:0> $stdin.each {|arg| p arg + "."}
hello
"hello\n."
But to directly answer the question, the block is called once per \n character. If your client is sending data 1 character at a time then the block is not going to be executed until it sees the \n.
Here is a quick sample client to show this:
irb(main):001:0> require 'socket'
=> true
irb(main):002:0> s = TCPSocket.open("localhost", 7887)
=> #<TCPSocket:fd 9>
irb(main):003:0> s.puts "hello"
=> nil
irb(main):007:0> s.write "hi"
=> 2
irb(main):008:0> s.write ", nice to meet you"
=> 18
irb(main):009:0> s.write "\n"
=> 1
And here is what the server printed out:
"hello\n"
"hi, nice to meet you\n" # note: this did not print until I sent "\n"
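The same behaviour can be reproduced in a single script. This is only a sketch: it assumes a loopback server on an OS-assigned port (port 0) instead of 7887, and runs the server side in a thread so that nothing has to be typed into irb. It also shows one detail not covered above: when the peer closes the connection, any trailing text without a newline is yielded as a final, unterminated line.

```ruby
require "socket"

# Assumption: port 0 lets the OS pick a free port for this demo.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

received = []
reader = Thread.new do
  conn = server.accept
  # IO#each yields once per "\n"-terminated line, no matter how the
  # bytes were split across individual writes.
  conn.each { |line| received << line }
  conn.close
end

client = TCPSocket.new("127.0.0.1", port)
client.write "hi"
client.write ", nice to meet you"
client.write "\n"
client.write "no trailing newline"
client.close # EOF: the leftover partial line is yielded as well

reader.join
server.close
p received
```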

Related

How can I clear a `StringIO` instance?

How can I clear a StringIO instance? After I write to and read from a string io, I want to clear it.
require "stringio"
io = StringIO.new
io.write("foo")
io.string #=> "foo"
# ... After doing something ...
io.string #=> Expecting ""
I tried flush and rewind, but I still get the same content.
seek or rewind only affect next read/write operations, not the content of the internal storage.
You can use StringIO#truncate like File#truncate:
require 'stringio'
io = StringIO.new
io.write("foo")
io.string
# => "foo"
io.truncate(0) # <---------
io.string
# => ""
Alternative:
You can also use StringIO#reopen (NOTE: File does not have reopen method):
io.reopen("")
io.string
# => ""
I use truncate AND rewind. The rewind is required to prepare io for next operations.
io.truncate(0)
io.rewind
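A small sketch of why the rewind matters, assuming the io is written to again after being cleared: truncate(0) empties the string but leaves the position where it was, so the next write lands at the old offset and the gap is filled with NUL bytes.

```ruby
require "stringio"

# Without rewind: the position is still 3 after truncate(0).
io = StringIO.new
io.write("foo")
io.truncate(0)
io.write("bar")
padded = io.string # "bar" preceded by NUL padding

# With rewind: the next write starts at offset 0.
io2 = StringIO.new
io2.write("foo")
io2.truncate(0)
io2.rewind
io2.write("bar")
clean = io2.string

p padded.bytesize
p clean
```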

What happens to the object I assign to $stdout in Ruby? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
EDIT: Don't bother reading this question, I just can't delete it. It's based on broken code and there's (almost) nothing to learn here.
I am redirecting console output in my Ruby program and although it works perfectly there is one thing I'm curious about:
Here's my code
capture = StringIO.new
$stdout = capture
puts "Hello World"
It looks like even though I'm assigning my capture object to $stdout, $stdout contains a new and different object after the assignment, but at least the type is correct.
In other words:
$stdout.to_s # => #<IO:0x2584b30>
capture = StringIO.new
$stdout = capture
$stdout.to_s # => #<StringIO:0x4fda948>
capture.to_s # => #<StringIO:0x4e3b220>
Subsequently $stdout.string contains "Hello World", but capture.string is empty.
Is there something happening behind the scenes or am I missing something here?
EDIT: This might be specific to certain versions only. I'm using Ruby 2.0.0-p247 on Windows 8.1
It works as expected.
>> capture = StringIO.new
=> #<StringIO:0x00000001ea8c00>
>> $stdout = capture
>> $stdout.to_s
>> capture.to_s
The two lines above do not print anything because $stdout is now disconnected from the terminal.
So I used $stderr.puts in the following lines (you can also use STDOUT.puts, as Stefan commented):
>> $stderr.puts $stdout.to_s
#<StringIO:0x00000001ea8c00>
>> $stderr.puts capture.to_s
#<StringIO:0x00000001ea8c00>
$stdout.to_s and capture.to_s give me the same result.
I used Ruby 1.9.3 (same for 2.0.0).
Are you sure there is no other manipulation of $stdout or capture happening in between?
For me, output looks different. Both capture and $stdout are the same object and subsequently answer to string with the same response (ruby 1.9.2):
require 'stringio'
$stdout.to_s # => #<IO:0x2584b30>
capture = StringIO.new
$stdout = capture
puts $stdout.to_s # => #<StringIO:0x89a38c0>
puts capture.to_s # => #<StringIO:0x89a38c0>
puts "redirected"
$stderr.puts $stdout.string # => '#<StringIO:0x89a38c0>\n#<StringIO:0x89a38c0>\nredirected'
$stderr.puts capture.string # => '#<StringIO:0x89a38c0>\n#<StringIO:0x89a38c0>\nredirected'
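The redirection used in these answers can be wrapped in a small helper that always restores the previous $stdout. This is a sketch only; capture_stdout is a hypothetical name, not something provided by Ruby. It also checks, from inside the block, that $stdout and the StringIO are one and the same object.

```ruby
require "stringio"

# Hypothetical helper: runs the block with $stdout pointing at a StringIO
# and returns whatever was written to it.
def capture_stdout
  original = $stdout
  buffer = StringIO.new
  $stdout = buffer
  yield buffer
  buffer.string
ensure
  $stdout = original # restored even if the block raises
end

same_object = nil
captured = capture_stdout do |buffer|
  same_object = $stdout.equal?(buffer) # identity, not just equality
  puts "Hello World"
end

$stderr.puts captured.inspect
$stderr.puts same_object
```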
Although this question was the result of overlooking a change to the value of $stdout, Ruby does have the ability to override assignment to global vars in this way, at least in the C api, using hooked variables.
$stdout actually does make use of this to check whether the new value is appropriate (it checks whether the new value responds to write) and raises an exception if it doesn’t.
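The check can be seen from plain Ruby. A minimal sketch, assuming a bare Object (which has no write method) as the rejected value:

```ruby
error = nil
begin
  $stdout = Object.new # a plain Object does not respond to write
rescue TypeError => e
  error = e
end

# The assignment was rejected, so $stdout is unchanged and still usable.
$stderr.puts error.class
$stderr.puts error.message
```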
If you really wanted (you don’t) you could create an extension that defines a global variable that automatically stores a different object than the value assigned, perhaps by calling dup on it and using that instead:
#include "ruby.h"
VALUE foo;
static void foo_setter(VALUE val, ID id, VALUE *var){
VALUE dup_val = rb_funcall(val, rb_intern("dup"), 0);
*var = dup_val;
}
void Init_hooked() {
rb_define_hooked_variable("$foo", &foo, 0, foo_setter);
}
You could then use it like:
2.0.0-p247 :001 > require './ext/hooked'
=> true
2.0.0-p247 :002 > s = Object.new
=> #<Object:0x00000100b20560>
2.0.0-p247 :003 > $foo = s
=> #<Object:0x00000100b20560>
2.0.0-p247 :004 > s.to_s
=> "#<Object:0x00000100b20560>"
2.0.0-p247 :005 > $foo.to_s
=> "#<Object:0x00000100b3bea0>"
2.0.0-p247 :006 > s == $foo
=> false
Of course this is very similar to simply creating a setter method in a class that dups the value and stores that, which you can do in plain Ruby:
def foo=(new_foo)
  @foo = new_foo.dup
end
Since using global variables is generally bad design, it seems reasonable that this isn’t possible in Ruby for globals.

understanding Ruby code?

I was wondering if anyone can help me understand the Ruby code below? I'm pretty new to Ruby programming and am having trouble understanding what each function does.
When I run this with my twitter username and password as parameter, I get a stream of twitter feed samples. What do I need to do with this code to only display the hashtags?
I'm trying to gather the hashtags every 30 seconds, then sort from least to most occurrences of the hashtags.
Not looking for solutions, but for ideas. Thanks!
require 'eventmachine'
require 'em-http'
require 'json'
usage = "#{$0} <user> <password>"
abort usage unless user = ARGV.shift
abort usage unless password = ARGV.shift
url = 'https://stream.twitter.com/1/statuses/sample.json'
def handle_tweet(tweet)
return unless tweet['text']
puts "#{tweet['user']['screen_name']}: #{tweet['text']}"
end
EventMachine.run do
http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }
buffer = ""
http.stream do |chunk|
buffer += chunk
while line = buffer.slice!(/.+\r?\n/)
handle_tweet JSON.parse(line)
end
end
end
puts "#{tweet['user']['screen_name']}: #{tweet['text']}"
That line shows you a user name followed by the content of the tweet.
Let's take a step back for a sec.
Hash tags appear inside the tweet's content--this means they're inside tweet['text']. A hash tag always takes the form of a # followed by a bunch of non-space characters. That's really easy to grab with a regex. Ruby's core API facilitates that via String#scan. Example:
"twitter is short #foo yawn #bar".scan(/\#\w+/) # => ["#foo", "#bar"]
What you want is something like this:
def handle_tweet(tweet)
return unless tweet['text']
# puts "#{tweet['user']['screen_name']}: #{tweet['text']}" # OLD
puts tweet['text'].scan(/\#\w+/).to_s
end
tweet['text'].scan(/#\w+/) is an array of strings. You can do whatever you want with that array. Supposing you're new to Ruby and want to print the hash tags to the console, here's a brief note about printing arrays with puts:
puts array # => "#foo\n#bar"
puts array.to_s # => '["#foo", "#bar"]'
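For the counting and sorting part of the question, here is one possible sketch. The tweets array is made-up sample data standing in for parsed stream entries, and lowercasing the tags is an assumption about how they should be grouped.

```ruby
# Made-up sample data in the shape handle_tweet receives.
tweets = [
  { "text" => "twitter is short #foo yawn #bar" },
  { "text" => "more #Foo" },
  { "delete" => { "status" => {} } } # entries without 'text' are skipped
]

counts = Hash.new(0) # missing keys start at 0
tweets.each do |tweet|
  next unless tweet["text"]
  tweet["text"].scan(/#\w+/).each { |tag| counts[tag.downcase] += 1 }
end

# Least to most occurrences; ties are broken alphabetically.
sorted = counts.sort_by { |tag, n| [n, tag] }
sorted.each { |tag, n| puts "#{tag}: #{n}" }
```

To report every 30 seconds, one option is to print and reset counts from a block passed to EventMachine.add_periodic_timer(30) inside the EventMachine.run block.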
#Load Libraries
require 'eventmachine'
require 'em-http'
require 'json'
# Looks like this section assumes you're calling this from commandline.
usage = "#{$0} <user> <password>" # $0 returns the name of the program
abort usage unless user = ARGV.shift # Return first argument passed when program called
abort usage unless password = ARGV.shift
# The URL
url = 'https://stream.twitter.com/1/statuses/sample.json'
# method which, when called later, prints out the tweets
def handle_tweet(tweet)
return unless tweet['text'] # Ensures tweet object has 'text' property
puts "#{tweet['user']['screen_name']}: #{tweet['text']}" # write the result
end
# Create an HTTP request obj to URL above with user authorization
EventMachine.run do
http = EventMachine::HttpRequest.new(url).get :head => { 'Authorization' => [ user, password ] }
# Initiate an empty string for the buffer
buffer = ""
# Read the stream by line
http.stream do |chunk|
buffer += chunk
while line = buffer.slice!(/.+\r?\n/) # cut each line at newline
handle_tweet JSON.parse(line) # send each tweet object to handle_tweet method
end
end
end
Here's a commented version of what the source is doing. If you just want the hashtags, you'll want to rewrite handle_tweet to something like this:
def handle_tweet(tweet)
  return unless tweet['text']
  tweet['text'].scan(/#\w+/) do |tag|
    puts tag
  end
end
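The buffering loop inside http.stream is the least obvious part of the script, so here is a sketch of it in isolation. The chunks array is invented data that simulates the network splitting JSON lines at arbitrary points; no EventMachine is involved.

```ruby
require "json"

# Invented chunks: the second tweet is split across two chunks and the
# third one has not fully arrived yet.
chunks = [
  "{\"text\":\"a #x\"}\r\n{\"te",
  "xt\":\"b\"}\r\n",
  "{\"text\":"
]

buffer = +""
parsed = []
chunks.each do |chunk|
  buffer << chunk
  # slice! removes and returns the first complete line, or nil when the
  # buffer holds only a partial line, which then waits for the next chunk.
  while (line = buffer.slice!(/.+\r?\n/))
    parsed << JSON.parse(line)
  end
end

p parsed
p buffer
```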

How can I only read one line of data from a TCPSocket in Ruby?

I'm using the following code to connect to a network service I'm writing (that's backed by EventMachine), and I'm having a bit of trouble getting to a point where I can use one socket connection to execute multiple commands.
#!/usr/bin/env ruby
require 'socket'
opts = {
:address => "0.0.0.0",
:port => 2478
}
connection = TCPSocket.open opts[:address], opts[:port]
# Get ID
connection.print "ID something"
puts connection.read
# Status
connection.print "STATUS"
puts connection.read
# Close the connection
connection.close
Here's what my EventMachine server handler looks like...
module ConnectionHandler
def receive_data data
send_data "Some output #{data}"
end
end
However, my first Ruby script hangs when it executes connection.read, as I presume it's waiting for the connection to close so it knows it's got all of the data? This is not what I want to happen.
My socket server will just take one command (on one line) and return one line of output.
Any ideas how I can do this? Thanks.
It turns out the connection.gets method will return a line of data as soon as the server sends a response ending in a \n character. So I just added \n to the end of my send_data call and switched to using puts connection.gets, and it worked great!
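A sketch of the resulting one-line-in, one-line-out exchange. To keep it self-contained, a plain TCPServer running in a thread stands in for the EventMachine handler, and an OS-assigned port replaces 2478; both are assumptions made for the demo. The client here also terminates each command with a newline (puts instead of print), because this stand-in server reads commands with gets.

```ruby
require "socket"

# Stand-in for the EventMachine server: answers every line with one line.
server = TCPServer.new("127.0.0.1", 0) # port 0: the OS picks a free port
port = server.addr[1]

responder = Thread.new do
  conn = server.accept
  while (command = conn.gets)
    conn.puts "Some output #{command.chomp}" # puts appends the "\n"
  end
  conn.close
end

connection = TCPSocket.new("127.0.0.1", port)

connection.puts "ID something"
id_reply = connection.gets # returns at the first "\n", not at EOF

connection.puts "STATUS"
status_reply = connection.gets

connection.close
responder.join
server.close

puts id_reply
puts status_reply
```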

How to lock IO shared by fork in ruby

How can we lock an IO that has been shared by multiple ruby process?
Consider this script:
#!/usr/bin/ruby -w
# vim: ts=2 sw=2 et
if ARGV.length != 2
$stderr.puts "Usage: test-io-fork.rb num_child num_iteration"
exit 1
end
CHILD = ARGV[0].to_i
ITERATION = ARGV[1].to_i
def now
t = Time.now
"#{t.strftime('%H:%M:%S')}.#{t.usec}"
end
MAP = %w(nol satu dua tiga empat lima enam tujuh delapan sembilan)
IO.popen('-', 'w') {|pipe|
unless pipe
# Logger child
File.open('test-io-fork.log', 'w') {|log|
log.puts "#{now} Program start"
$stdin.each {|line|
log.puts "#{now} #{line}"
}
log.puts "#{now} Program end"
}
exit!
end
pipe.sync = true
pipe.puts "Before fork"
CHILD.times {|c|
fork {
pid = Process.pid
srand
ITERATION.times {|i|
n = rand(9)
sleep(n / 100000.0)
pipe.puts "##{c}:#{i} #{MAP[n]} => #{n}, #{n} => #{MAP[n]} ##{c}:#{i}"
}
}
}
}
And try it like this:
./test-io-fork.rb 200 50
As expected, the test-io-fork.log file contains signs of an IO race condition.
What I want to achieve is a TCP server for a custom GPS protocol that saves the GPS points to a database. Because this server would handle 1000 concurrent clients, I would like to restrict the database connection to a single child instead of opening 1000 database connections simultaneously. This server would run on Linux.
UPDATE
It may be bad form to update after the answer was accepted, but the original is a bit misleading. Whether or not ruby makes a separate write(2) call for the automatically-appended newline is dependent upon the buffering state of the output IO object.
$stdout (when connected to a tty) is generally line-buffered, so the effect of a puts() -- given a reasonably sized string -- with an implicitly added newline is a single call to write(2). Not so, however, with IO.pipe and $stderr, as the OP discovered.
ORIGINAL ANSWER
Change your chief pipe.puts() argument to be a newline terminated string:
pipe.puts "##{c} ... #{i}\n" # <-- note the newline
Why? You set pipe.sync hoping that the pipe writes would be atomic and non-interleaved, since they are (presumably) less than PIPE_BUF bytes. But it didn't work, because ruby's pipe puts() implementation makes a separate call to write(2) to append the trailing newline, and that's why your writes are sometimes interleaved where you expected a newline.
Here's a corroborating excerpt from a fork-following strace of your script:
$ strace -s 2048 -fe trace=write ./so-1326067.rb
....
4574 write(4, "#0:12 tiga => 3, 3 => tiga #0:12", 32) = 32
4574 write(4, "\n", 1)
....
But putting in your own newline solves the problem, making sure that your entire record is transmitted in one syscall:
....
5190 write(4, "#194:41 tujuh => 7, 7 => tujuh #194:41\n", 39 <unfinished ...>
5179 write(4, "#183:38 enam => 6, 6 => enam #183:38\n", 37 <unfinished ...>
....
If for some reason that cannot work for you, you'll have to coordinate an interprocess mutex (like File.flock()).
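A rough sketch of the flock approach, assuming a Unix-like system where fork is available. The temporary directory, file names and counts are invented for the demo. Each record is deliberately written with two separate write calls, and the exclusive lock keeps the pairs from interleaving.

```ruby
require "tmpdir"
require "fileutils"

dir = Dir.mktmpdir("flock-demo") # scratch directory for the demo
log_path  = File.join(dir, "shared.log")
lock_path = File.join(dir, "shared.lock")

log = File.open(log_path, "a")
log.sync = true # unbuffered, like pipe.sync = true in the question

pids = 4.times.map do |c|
  fork do
    # Open the lock file after forking: flock locks belong to the open
    # file description, so a descriptor inherited from the parent would
    # be shared by all children and would not exclude them.
    File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |lock|
      25.times do |i|
        lock.flock(File::LOCK_EX)
        log.write "##{c}:#{i}" # two separate writes on purpose
        log.write "\n"
        lock.flock(File::LOCK_UN)
      end
    end
  end
end
pids.each { |pid| Process.wait(pid) }
log.close

lines = File.readlines(log_path, chomp: true)
FileUtils.remove_entry(dir)
puts "#{lines.size} lines, all intact: #{lines.all? { |l| l.match?(/\A#\d+:\d+\z/) }}"
```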
