Simple HTTP server in Ruby using TCPServer - ruby

For a school assignment, I am trying to create a simple HTTP server using Ruby and the sockets library.
Right now, I can get it to respond to any connection with a simple hello:
require 'socket'
server = TCPServer.open 2000
puts "Listening on port 2000"
loop {
client = server.accept()
resp = "Hello?"
headers = ["HTTP/1.1 200 OK",
"Date: Tue, 14 Dec 2010 10:48:45 GMT",
"Server: Ruby",
"Content-Type: text/html; charset=iso-8859-1",
"Content-Length: #{resp.length}\r\n\r\n"].join("\r\n")
client.puts headers
client.puts resp
client.close
}
This works as expected. However, when I have the server tell me who just connected with
puts "Client: #{client.addr[2]}"
and use Chromium (browser) to connect to localhost:2000/ (just once), I get:
Client: 127.0.0.1
Client: 127.0.0.1
Client: 127.0.0.1
Client: 127.0.0.1
I assume this is Chromium requesting auxiliary files, like favicon.ico, and not my script doing something weird, so I wanted to investigate the incoming request. I replaced the resp = "Hello?" line with
resp = client.read()
And restarted the server. I re-sent the request in Chromium, and instead of coming back right away, it just hung. Meanwhile, I got the output Client: 127.0.0.1 in my server output. I hit the "stop" button in Chromium, and then the server crashed with
server.rb:16:in `write': Broken pipe (Errno::EPIPE)
from server.rb:16:in `puts'
from server.rb:16:in `block in <main>'
from server.rb:6:in `loop'
from server.rb:6:in `<main>'
Obviously, I'm doing something wrong, as the expected behavior was sending the incoming request back as the response.
What am I missing?

I don't really know about chrome and the four connections, but I'll try to answer your questions on how to read the request properly.
First of all, IO#read won't work in this case. According to the documentation, read without any parameters reads until it encounters EOF, and on a socket EOF only arrives when the other side closes the connection. The browser keeps the connection open while it waits for your response, so read never returns. A socket is an open-ended stream: you can't use that method to read in the entire message, since the socket itself has no notion of an "entire" message. You could use read with an integer, like read(100), but that will block at some point as well.
Basically, reading a socket is very different from reading a file. A socket is updated asynchronously, completely independent of the time you try to read it. If you request 10 bytes, it's possible that, at this point in the code, only 5 bytes are available. With blocking IO, the read(10) call will then hang and wait until 5 more bytes are available, or until the connection is closed. This means that, if you try repeatedly reading packets of 10 bytes, at some point, it will still hang. Another way to read a socket is using non-blocking IO, but that's not very important in your case, and it's a long topic by itself.
So here's an example of how you might access the data by using blocking IO:
loop {
  client = server.accept
  while line = client.gets
    puts line.chomp
    break if line =~ /^\s*$/
  end
  # rest of loop ...
}
The gets method tries to read from the socket until it encounters a newline. This will happen at some point for an HTTP request, so even if the entire message is transferred piece by piece, gets should return a single line from the input. The line.chomp call will cut off the trailing newline if it's present. If the line read is empty, that means the HTTP headers have been transferred and we can safely break the loop (you can put that in the while condition, of course). The request will be dumped to the console that the server has been started on. If you really want to send it back to the browser, the idea's the same; you just need to handle the lines differently:
loop {
  client = server.accept
  lines = []
  while (line = client.gets) && line !~ /^\s*$/
    lines << line.chomp
  end
  resp = lines.join("<br />")
  headers = ["HTTP/1.1 200 OK",
             "Date: Tue, 14 Dec 2010 10:48:45 GMT",
             "Server: Ruby",
             "Content-Type: text/html; charset=iso-8859-1",
             "Content-Length: #{resp.bytesize}\r\n\r\n"].join("\r\n")
  client.print headers # send the headers, ending with a blank line
  client.print resp    # print, not puts: puts would append a newline that Content-Length doesn't count
  client.close
}
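As for the broken pipe: the traceback shows that the error is raised by write (inside puts), not by read. When you hit "stop", the browser closed the connection; read then saw EOF and returned, and the server tried to write its response to a socket whose other end was already gone.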
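To tie the points above together, here is a minimal sketch of a handler that reads only the request head, echoes it back, and survives a client that hangs up early. The helper names read_head and echo_request are made up for this illustration, and the demo uses a raw TCPSocket on an OS-assigned port to stand in for the browser.

```ruby
require 'socket'

# Reads the request line and headers, stopping at the blank line that ends them.
def read_head(client)
  lines = []
  while (line = client.gets) && !line.strip.empty?
    lines << line.chomp
  end
  lines
end

# Echoes the request head back as HTML. Returns false if the peer went away.
def echo_request(client)
  body = read_head(client).join('<br />')
  client.print "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/html; charset=utf-8\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n" \
               "\r\n" + body
  true
rescue Errno::EPIPE, Errno::ECONNRESET
  false # the client hung up before we finished writing
ensure
  client.close
end

# Demo: port 0 asks the OS for any free port (an assumption of this sketch).
server = TCPServer.new('127.0.0.1', 0)
browser = TCPSocket.new('127.0.0.1', server.addr[1])
browser.print "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"
delivered = echo_request(server.accept)
reply = browser.read # returns at EOF, which arrives because the server closed
browser.close
server.close
puts reply
```

Note that the client side can safely use read here only because the server closes the connection after responding; the server side cannot, because the client keeps its end open.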

Related

Read entire message from a TCPSocket without hanging

I'm putting together a TCPServer in Ruby 3.0.2 and I'm finding that I can't seem to read the entire packet without blocking (until the socket is closed).
Edit: There was some confusion on what I was trying to do - my bad - so just to help clarify: I wanted to read everything that had been sent over the TCP connection so far. (end edit)
My first try was:
#!/snap/bin/ruby
require 'socket'
server = TCPServer.new('localhost', 4200)
loop {
Thread.start(server.accept) do |connection|
puts connection.gets # The important line
end
}
But that hangs until the client closes the connection. Okay, so I take a look at connection.methods and the Ruby docs, and try a bunch of options that seem promising. Basically, there are two types of read methods: blocking and nonblocking.
The blocking methods that I tried are .read, .gets, .readlines, .readline, .recv, and .recvmsg. Now .read, .readlines, and .gets all hang (until the socket is closed) - so that's not helpful. The other ones (eg. .readline, the recv methods) don't read the entire message. Now, I could read each line until I see an empty line and parse the HTTP header from there. But there's got to be a better way; I don't want to have to worry about getting a corrupted message and hanging because I didn't read an empty line at the end of the header.
So I went looking at the non-blocking options. Specifically .recv_nonblock and .recvmsg_nonblock. Both of these throw errors (Resource temporarily unavailable - recvfrom(2) would block and Resource temporarily unavailable - recvmsg(2) respectively).
Any ideas on what could be going on? I think it has something to do with me using Ruby 3, because trying out the code on Ruby 2.5, client.gets returns a line (doesn't hang), although .readlines does hang - so not sure what's going on.
Ideally, I could just call something along the lines of client.get_message and I would get the entire message that has been sent, but I'd also be okay with working at the TCP level and getting the packet size, reading that size, and reconstructing the message from there.
TCP just transmits the bytes that you write to the socket, and guarantees that they are received in the order they were sent. If you have the concept of a 'message' then you'll need to add that into your server and client.
.gets specifically will block until it reads a new 'line', or whatever you define as the separator for the string - see the docs IO#gets. This means that until your server receives that byte from the client, it will block.
In your client, have a look at how you're writing your data. If you're using Ruby then puts would work, as it will terminate the string with a newline. If you're using write then it will only write the string, without a newline.
I.e.
# client.rb
require 'socket'
c = TCPSocket.new 'localhost', 5000
c.puts "foo"
c.write "bar"
c.write "baz\n"
# server.rb
require 'socket'
s = TCPServer.new 5000
loop do
  client = s.accept
  puts client.gets
  puts client.gets
end
will output
foo
barbaz
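If newline-terminated lines are not a good fit, another common way to add a 'message' concept on top of TCP is a length prefix. The following is only a sketch of that idea: write_message and read_message are hypothetical helpers, and the 4-byte big-endian length header is an arbitrary choice that both ends would have to agree on.

```ruby
require 'socket'

# Writes one message: a 4-byte big-endian length, then the payload bytes.
def write_message(io, payload)
  io.write([payload.bytesize].pack('N') + payload.b)
end

# Reads one message, or returns nil if the peer closed the connection first.
def read_message(io)
  header = io.read(4)
  return nil if header.nil? || header.bytesize < 4

  io.read(header.unpack1('N'))
end

# Demo on an OS-assigned port (port 0), all in one process.
server = TCPServer.new('127.0.0.1', 0)
sender = TCPSocket.new('127.0.0.1', server.addr[1])
receiver = server.accept

write_message(sender, "foo\nbar") # newlines inside a message are fine
write_message(sender, 'baz')

first = read_message(receiver)
second = read_message(receiver)
sender.close
third = read_message(receiver) # nil: connection closed, no more messages
p first, second, third
```

Because read(n) blocks until exactly n bytes have arrived (or the connection closes), the receiver never has to guess where a message ends, and it never waits for a socket close to see a complete message.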
Thanks to everyone who commented/answered, but I found the solution that I think was intended by the creators of the Socket class!
The recv_nonblock method takes some optional arguments - one of which is a buffer that the Socket will store what it has read to. So a call like client.recv_nonblock(1000, 0, buffer) stores up to 1000 characters from the Socket into buffer and then exits instead of blocking.
Just to make life easy, I put together a monkey patch to the TCPSocket class:
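class TCPSocket
  def eat_buffer
    contents = ''
    buffer = ''
    begin
      loop {
        chunk = recv_nonblock(256, 0, buffer)
        break if chunk.nil? || chunk.empty? # the peer closed the connection; stop instead of spinning
        contents += buffer
      }
    rescue IO::EAGAINWaitReadable
      # nothing more to read right now
    end
    contents
  end
end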
The point that Steffen makes in the comments is well taken - TCP isn't designed to be used this way. This is a hacky (in the bad sense) method, and should be avoided.
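For comparison, a somewhat less hacky way to get "whatever has arrived so far" is to wait with IO.select and a timeout, then drain the socket with read_nonblock. This is a sketch, not a recommendation over proper message framing: read_available and its timeout: and chunk: parameters are names invented here.

```ruby
require 'socket'

# Collects bytes until the socket has been idle for `timeout` seconds
# or the peer closes the connection, whichever comes first.
def read_available(sock, timeout: 0.5, chunk: 4096)
  data = +''
  while IO.select([sock], nil, nil, timeout)
    piece = sock.read_nonblock(chunk, exception: false)
    break if piece.nil?             # EOF: the peer closed the connection
    next if piece == :wait_readable # readable event without data; wait again

    data << piece
  end
  data
end

# Demo: the client sends no newline and keeps the socket open.
server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])
connection = server.accept
client.write 'no newline, socket still open'
received = read_available(connection, timeout: 0.2)
puts received
```

The timeout is still a guess about when the sender is done, so the caveat above applies here too: TCP itself gives no signal that a message is complete.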

Net::HTTP – Flush or Close

I've written a consumer for a payment API. My code simply issues a POST request and gets a response from the API. I've implemented that with Net::HTTP, here are the relevant lines of code:
http = Net::HTTP.new(uri.host, 443)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_NONE
request = Net::HTTP::Post.new(uri.request_uri)
request.set_form_data(params)
response = http.request(request)
This worked for years, however, recently some requests have reached timeouts when the API is under stress. The API maintainer came up with this explanation:
We pass on the data to RabbitMQ synchronously after flushing the HTTP response. Apparently, some HTTP libs wait for the connection to be closed before the program continues on the consumer side and we think this is happening here. Please reconfigure your consumer not to wait for close but to continue right after the response has been flushed.
I'm not sure how Net::HTTP is implemented and whether it really waits for the close when the response has been flushed. The docs don't say anything about it nor is there a setting to control any of this. And to make matters worse, I don't really know how to simulate this.
Any ideas are very welcome!
I guess the following experiment (with Ruby 2.3) should give the answer, I post it here in case someone else stumbles across this question in the future.
server.rb:
require 'socket'
server = TCPServer.new('localhost', 2345)
loop do
socket = server.accept
request = socket.gets
STDERR.puts request
response = "Hello World at #{Time.now}!\n"
socket.print "HTTP/1.1 200 OK\r\n" +
"Content-Type: text/plain\r\n" +
"Content-Length: #{response.bytesize}\r\n" +
"Connection: close\r\n"
socket.print "\r\n"
socket.print response
socket.flush
sleep 10
socket.close
end
client.rb:
require 'net/http'
http = Net::HTTP.new('localhost', 2345)
request = Net::HTTP::Post.new('/')
response = http.request(request)
puts response.body
Running the server, the client will send one request and exit. It does so immediately, so the flush is sufficient to have the client code continue. Restarting the client within the server's 10-second wait causes the client to hang until the 10 seconds have fully elapsed; it then prints the Hello World and once again exits immediately.
In other words: Such a simple Net::HTTP client does not wait for the connection to close but continues to execute its code once the server has flushed.
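The same experiment can be squeezed into a single file that also measures how long the client waits. This is a sketch under a few assumptions: the server runs in a thread on an OS-assigned port, HOLD_OPEN is a made-up constant (shortened to 2 seconds), and the third argument nil to Net::HTTP.new only disables any proxy taken from the environment.

```ruby
require 'socket'
require 'net/http'

HOLD_OPEN = 2 # seconds the server keeps the socket open after flushing

server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
port = server.addr[1]

server_thread = Thread.new do
  socket = server.accept
  # Consume the request line and headers up to the blank line.
  loop do
    line = socket.gets
    break if line.nil? || line.chomp.empty?
  end
  body = "Hello World!\n"
  socket.print "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n" \
               "\r\n" + body
  socket.flush
  sleep HOLD_OPEN # response is complete, but the connection stays open
  socket.close
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
http = Net::HTTP.new('127.0.0.1', port, nil) # nil: do not use a proxy
response = http.request(Net::HTTP::Post.new('/'))
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts response.body
puts format('client returned after %.2fs (server holds the socket for %ds)', elapsed, HOLD_OPEN)
server_thread.join
server.close
```

If the client waited for the close, the measured time would be at least HOLD_OPEN seconds; with a Content-Length header present, Net::HTTP can stop reading as soon as it has that many body bytes.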

Clarification of the Ruby Socket Library Gets Method

I am working on an assignment where I have to develop a web server in Ruby using the socket library. I was able to get a simple web server up and running as seen in this thread here.
I am currently working on getting and storing the body of an HTTP request into a variable in my web server. The problem I am running into is trying to define a while loop that gets the entire body of an HTTP request.
I am attempting to get the body of an HTTP request by using the gets method. I could not find any documentation on this method (I saw it being used here) and was wondering whether there is more documentation online.
In my first post here, someone suggested that I use the Content-Length header to determine the size of the body and how much data to read from the socket. I don't really understand how I would go about implementing this because I am unsure how the gets method functions.
Since this is for an assignment, I don't think posting code would be a good idea. I am looking for more information on the gets method and any tips to point me towards the right direction.
You shouldn't be using gets for the body. gets tries to read complete lines (i.e. it reads up to a line separator), but there is no guarantee that an HTTP request body ends with a line separator.
Instead you should be using read, which lets you read a fixed number of bytes (as you mentioned, you can use the Content-Length header to know how much to read).
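As a rough illustration of that advice, the sketch below uses gets for the request line and headers (which are line-based) and read(n) for the body. read_request is a hypothetical helper, header names are lowercased for lookup, and a missing Content-Length is treated as "no body"; chunked bodies are not handled.

```ruby
require 'socket'

# Reads one HTTP request from io; returns [request_line, headers, body].
def read_request(io)
  request_line = io.gets.to_s.chomp
  headers = {}
  while (line = io.gets) && !line.chomp.empty?
    name, value = line.chomp.split(':', 2)
    headers[name.strip.downcase] = value.to_s.strip
  end
  length = headers.fetch('content-length', '0').to_i
  body = length.positive? ? io.read(length) : ''
  [request_line, headers, body]
end

# Demo: the client sends a body without a trailing newline and stays connected.
server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])
connection = server.accept

payload = 'name=ruby&x=1'
client.print "POST /form HTTP/1.1\r\n" \
             "Host: localhost\r\n" \
             "Content-Length: #{payload.bytesize}\r\n" \
             "\r\n" + payload

request_line, headers, body = read_request(connection)
puts request_line
puts body
```

The call returns even though the client never closes its socket, because read(length) stops after exactly Content-Length bytes.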
Your ultimate problem isn't related to gets, or even really anything in your code. But before we get to that, let's answer this question & explore sockets a little bit.
If you follow the chain up, you find that Ruby's TCPSocket class inherits from its IO class. It's IO that provides gets. gets will read, line-by-line, until there's nothing more to read. Let's create a simple client that connects to a port, spits out 4 lines of poetry, and then quits:
# poetry_sender.rb
require 'socket'
poem = ["'God save thee, ancient Mariner!",
"From the fiends, that plague thee thus!—",
"Why look'st thou so?'—With my cross-bow",
"I shot the ALBATROSS."]
puts "Client establishing connection..."
s = TCPSocket.new 'localhost', 2000
puts "Client sending poetry..."
poem.each { |line| s.puts line } # Print each line out on the socket
s.close # Close our socket
puts "All done."
And a simple server, that displays what the client sends us:
# poetry_receiver.rb
require 'socket'
server = TCPServer.new 2000 # Server bind to port 2000
loop do
puts "Server now awaiting some poetry..."
socket = server.accept # Wait for a client to connect
while line = socket.gets
puts "A client sent us this beautiful line: #{line}"
end
puts "They had nothing more to say; let's disconnect them."
socket.close
end
If you run the server (poetry_receiver.rb) first, and then the client, you'll see some output like this:
Server now awaiting some poetry...
A client sent us this beautiful line: 'God save thee, ancient Mariner!
A client sent us this beautiful line: From the fiends, that plague thee thus!—
A client sent us this beautiful line: Why look'st thou so?'—With my cross-bow
A client sent us this beautiful line: I shot the ALBATROSS.
They had nothing more to say; let's disconnect them.
Server now awaiting some poetry...
The last two lines are the important ones; they indicate that socket.gets returned nil and we exited the while loop.
So, how can we modify our poetry_sender.rb so the server doesn't detect the end of the poem? You might think it's got something to do with blank lines, but if you set poem = [] or poem = ["", "", ""] then you'll find that it still gets disconnected OK. But what if we added a delay before closing the socket in poetry_sender.rb?
sleep 60
s.close # Close our socket
puts "All done."
Now you'll see a big delay in the server output. The TCP server doesn't break out of its while loop until the TCP client closes its socket.
Now we can turn to your broader problem: you're trying to implement a simple HTTP server, but your server is getting hung up in a while loop when you try to connect via your web browser. It's because your web browser is keeping that socket open; it has to, otherwise you would have no way to send it back a response. So, how do we know when a client has finished sending us a request? The HTTP 1.1 spec says:
A client sends an HTTP request to a server in the form of a request message... followed by header fields... an empty line to indicate the end of the header section, and finally a message body containing the payload body (if any).
Let's not worry about the message body; how could we write a while loop that terminates if it has no more input, or if it receives a blank line? Here's one way, in a simple HTTP server that just sends back "Hello world" no matter what request it receives:
require 'socket'
server = TCPServer.new('localhost', 2345)
loop do
  socket = server.accept
  http_request = [] # We'll store the lines of this incoming request here.
  while (line = socket.gets) && line.chomp != '' # While the client is connected, and hasn't sent us a blank line yet...
    http_request << line
  end
  # Send response headers
  socket.print "HTTP/1.1 200 OK\r\n" +
               "Content-Type: text/plain\r\n" +
               "Connection: close\r\n" +
               "\r\n"
  # Send response body
  socket.print "Hello world!"
  socket.close
end
Quite late to the party, but I'm currently implementing my own rack app server (for fun).
Here you can see how I do it: https://github.com/tak1n/reifier/blob/master/lib/reifier/request.rb
The first line of an HTTP request is always the request line, which looks something like GET /test HTTP/1.1
After the request line come the headers, one per line, up to the first empty line (a bare \r\n).
After that you can read the body (for a PUT or POST request) using the CONTENT_LENGTH you parsed from the headers.
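The three steps above might look roughly like this. It is a loose imitation of Rack-style naming (REQUEST_METHOD, PATH_INFO, HTTP_* keys, CONTENT_LENGTH without the HTTP_ prefix), not the code from the linked repository; parse_head is a hypothetical helper and a StringIO stands in for the socket.

```ruby
require 'stringio'

# Parses the request line and headers into a Rack-style hash.
# This is a partial imitation, not a complete Rack environment.
def parse_head(io)
  verb, target, version = io.gets.to_s.chomp.split(' ', 3)
  path, query = target.to_s.split('?', 2)
  env = {
    'REQUEST_METHOD' => verb,
    'PATH_INFO' => path,
    'QUERY_STRING' => query.to_s,
    'SERVER_PROTOCOL' => version
  }
  while (line = io.gets) && !line.chomp.empty?
    name, value = line.chomp.split(':', 2)
    key = name.strip.upcase.tr('-', '_')
    key = "HTTP_#{key}" unless %w[CONTENT_LENGTH CONTENT_TYPE].include?(key)
    env[key] = value.to_s.strip
  end
  env
end

raw = "PUT /test?draft=1 HTTP/1.1\r\nHost: localhost\r\nContent-Length: 5\r\n\r\nhello"
io = StringIO.new(raw)
env = parse_head(io)
body = io.read(env.fetch('CONTENT_LENGTH', '0').to_i)
p env
p body
```

The same method works on a TCPSocket, since both classes provide gets and read.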

Ruby Web Server Hanging When Trying To Parse HTTP Request

I am working on an assignment which requires me to implement a web server in Ruby without using any libraries. I have a basic server setup to return a "Hello World" response and I am ready to move onto the next step.
The next step is to generate HTTP Responses based on the HTTP Requests. This is where I am having trouble, it seems that the while loop in my program causes the server to hang.
The code for the web server:
require 'socket'
server = TCPServer.new('localhost', 2345)
http_request = ""
loop do
socket = server.accept
request = socket.gets
while line = socket.gets
puts line
http_request << line
end
response = "Hello World!\n"
socket.print "HTTP/1.1 200 OK\r\n" +
"Content-Type: text/plain\r\n" +
"Content-Length: #{response.bytesize}\r\n" +
"Connection: close\r\n"
socket.print "\r\n"
socket.print response
puts "DONE with while loop!"
socket.close
end
In the code above, I am trying to put the HTTP request into the string http_request and parse that to determine which HTTP response I want to generate. I have tested my code without the while loop and was able to reach the Hello World page in my browser using localhost:2345/test. However, with the addition of the while loop, I am no longer able to load the page and the string "DONE with while loop!" is never printed into the console.
Does anyone know why my web server is hanging? Am I approaching the problem entirely wrong?
Your call to socket.gets will continue to wait for more data after the whole request has been sent, blocking any further progress. It has no way of knowing that this is an HTTP call and that the request has finished.
A HTTP request consists of the headers and then a blank line indicating the end of the headers. Your code needs to look out for this blank line. You could do this by changing your loop to something like this:
while (line = socket.gets) && line.chomp != ''
This will work for requests that don't have a body, such as GETs (the extra check on line also stops the loop if the client disconnects and gets returns nil), but things are more difficult when processing requests with bodies. In that case you will need to parse the headers for the Content-Length in order to know how much data to read from the socket. It is more complex still for chunked requests, though you may not need to go that far in your assignment.
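For a sense of what "more complex" means for chunked requests, here is a sketch of decoding a body sent with Transfer-Encoding: chunked: each chunk is a hexadecimal size line, that many bytes of data, and a CRLF, and a zero-size chunk ends the body. read_chunked_body is a hypothetical helper; chunk extensions and trailer fields are skipped rather than interpreted, and a StringIO stands in for the socket.

```ruby
require 'stringio'

# Decodes a chunked body from io and returns it as one string.
def read_chunked_body(io)
  body = +''
  loop do
    size_line = io.gets
    raise EOFError, 'connection closed in the middle of the body' if size_line.nil?

    size = size_line.split(';', 2).first.strip.to_i(16) # drop chunk extensions
    break if size.zero?

    body << io.read(size).to_s
    io.gets # consume the CRLF that follows each chunk's data
  end
  # Skip optional trailer fields up to the final blank line.
  loop do
    line = io.gets
    break if line.nil? || line.chomp.empty?
  end
  body
end

raw = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
decoded = read_chunked_body(StringIO.new(raw))
puts decoded
```

A server would call something like this instead of read(content_length) when the request headers contain Transfer-Encoding: chunked.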

How to prevent "The connection was reset" error?

I have a very basic TCP server implemented in Ruby. In general it does what it's supposed to, but every once in a while I get "The connection to the server was reset while the page was loading" error. I have a feeling that it has something to do with close terminating the connection too soon. If so, how do I wait for all the data to be sent? Or is it something else?
require 'socket'
server = TCPServer.new('', 80)
loop do
session = server.accept
begin
session.print Time.now
ensure
session.close
end
end
I'm not an expert in this area, but here is what I believe is happening....
The browser sends a GET request with the header field "Connection: keep-alive". So the browser is expecting to keep the connection alive at least until it receives a complete chunk of the response. Under this protocol, the server response must include a header specifying the length of the response, so that the browser knows when it has received the complete response. After this point, the connection can be closed without the browser caring.
The original example closes the connection too quickly, before the browser can validate that a complete response was received. Curiously, if I run that example and refresh my browser several times, it will load about every 1 in 10 tries. Maybe this erratic behavior is due to the browser occasionally executing fast enough to beat my server closing the connection.
Below is a code example that executes consistently in my browser:
require 'socket'
# The blank line between the headers and the body is required.
response = %{HTTP/1.1 200 OK
Content-Type: text;charset=utf-8
Content-Length: 12

Hello World!
}
server = TCPServer.open(80)
loop do
  client = server.accept
  client.puts response
  sleep 1
  client.close
end
I suspect it's because the browser is expecting an HTTP response with headers etc. Curiously, you can make the "reset" error happen every time if you put before the "ensure" a sleep of, say, one second.
How to fix it depends upon what you are after. If this is not to be an HTTP server, then don't use the browser to test it. Instead, use telnet or write a little program. If it is to be an HTTP server, then take a look at webrick, which shipped with Ruby MRI from 1.8 through 2.7 (since Ruby 3.0 it is a separate gem: gem install webrick). Here's how:
#!/usr/bin/ruby1.8
require 'webrick'
# This class handles time requests
class TimeServer < WEBrick::HTTPServlet::AbstractServlet
def do_GET(request, response)
response.status = 200
response['Content-Type'] = 'text/plain'
response.body = Time.now.to_s
end
end
# Create the server. There are many other options, if you need them.
server = WEBrick::HTTPServer.new(:Port=>8080)
# Whenever a request comes in for the root page, use TimeServer to handle it
server.mount('/', TimeServer)
# Finally, start the server. Does not normally return.
server.start
Also, I should note that including Connection: close in the response header doesn't seem to help me at all with this connection reset error in my browser (FFv3.6). I have to include both the Content-Length header field and a sleep that delays closing the connection in order to get a consistent response in my browser.
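One possible explanation that the answers above do not mention: on many systems, closing a TCP socket while unread data (here, the browser's request) is still sitting in its receive buffer makes the kernel send a reset instead of a normal close. Under that assumption, reading the request before replying may remove the need for the sleep. The sketch below is untested against real browsers; serve_time is a hypothetical helper and the demo uses a raw socket on an OS-assigned port.

```ruby
require 'socket'

# Handles one connection: drain the request head, send a complete response, close.
def serve_time(server)
  session = server.accept
  # Read up to the blank line so no unread request data is left when we close.
  loop do
    line = session.gets
    break if line.nil? || line.chomp.empty?
  end
  body = Time.now.to_s
  session.print "HTTP/1.1 200 OK\r\n" \
                "Content-Type: text/plain\r\n" \
                "Content-Length: #{body.bytesize}\r\n" \
                "Connection: close\r\n" \
                "\r\n" + body
  body
ensure
  session&.close
end

# Demo with a raw socket standing in for the browser.
server = TCPServer.new('127.0.0.1', 0)
browser = TCPSocket.new('127.0.0.1', server.addr[1])
browser.print "GET / HTTP/1.1\r\nHost: localhost\r\nConnection: keep-alive\r\n\r\n"
sent_body = serve_time(server)
reply = browser.read
browser.close
server.close
puts reply
```

In a real server the call to serve_time would sit inside the accept loop, as in the examples above.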

Resources