Ruby Web Server Hanging When Trying To Parse HTTP Request - ruby

I am working on an assignment which requires me to implement a web server in Ruby without using any libraries. I have a basic server setup to return a "Hello World" response and I am ready to move onto the next step.
The next step is to generate HTTP responses based on the HTTP requests. This is where I am having trouble: the while loop in my program seems to cause the server to hang.
The code for the web server:
require 'socket'
server = TCPServer.new('localhost', 2345)
http_request = ""
loop do
  socket = server.accept
  request = socket.gets
  while line = socket.gets
    puts line
    http_request << line
  end
  response = "Hello World!\n"
  socket.print "HTTP/1.1 200 OK\r\n" +
               "Content-Type: text/plain\r\n" +
               "Content-Length: #{response.bytesize}\r\n" +
               "Connection: close\r\n"
  socket.print "\r\n"
  socket.print response
  puts "DONE with while loop!"
  socket.close
end
In the code above, I am trying to put the HTTP request into the string http_request and parse that to determine which HTTP response I want to generate. I have tested my code without the while loop and was able to reach the Hello World page in my browser using localhost:2345/test. However, with the addition of the while loop, I am no longer able to load the page, and the string "DONE with while loop!" is never printed to the console.
Does anyone know why my web server is hanging? Am I approaching the problem entirely wrong?

Your call to socket.gets will continue to wait for more data after the whole request has been sent, blocking any further progress. It has no way of knowing that this is an HTTP call and that the request has finished.
An HTTP request consists of the headers and then a blank line indicating the end of the headers. Your code needs to look out for this blank line. You could do this by changing your loop to something like this:
while (line = socket.gets) && line.chomp != ''
This will work for requests that don’t have a body, such as GETs, but things are more difficult when processing requests with bodies. In that case you will need to parse the headers for the Content-Length in order to know how much data to read from the socket. Chunked requests are more complex still, but you may not need to go that far in your assignment.
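As a rough sketch of the Content-Length approach (not a complete HTTP parser), the helper below reads header lines up to the blank line and then reads exactly as many body bytes as the header announces. The method name read_http_request is made up for this example, and StringIO stands in for the socket so the snippet runs without a network connection; a TCPSocket responds to the same gets and read calls.

```ruby
require 'stringio'

# Reads one HTTP request from an IO-like object (a TCPSocket in a real server).
# Returns the header lines (including the request line) and the body string.
# The method name is hypothetical, chosen for this sketch.
def read_http_request(io)
  head = []
  # Stop at EOF (gets returns nil) or at the blank line that ends the headers.
  while (line = io.gets) && line.chomp != ''
    head << line.chomp
  end

  # Look for a Content-Length header; header names are case-insensitive.
  length = 0
  head.each do |h|
    if (m = h.match(/\AContent-Length:\s*(\d+)/i))
      length = m[1].to_i
    end
  end

  # read(n) blocks until n bytes have arrived (or the peer closes).
  body = length > 0 ? io.read(length).to_s : ''
  [head, body]
end

# Simulated POST request; in the server this would be the accepted socket.
raw = "POST /form HTTP/1.1\r\nHost: localhost\r\nContent-Length: 9\r\n\r\nname=ruby"
head, body = read_http_request(StringIO.new(raw))
puts head.first
puts body
```

Chunked request bodies are deliberately left out of this sketch.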

Related

Ruby Socket TCPServer send png image to client

I have an http server using sockets that gets all client data, and sends back data. I am successfully able to send back HTML to the client (my web browser) but whenever I try and send an image, I get a small white square no matter what image I send.
The code:
# Generate and send response
def response(client, response = 200, headers, data)
  client.print "HTTP/1.1 #{response.to_s}\r\n"
  headers_s = ""
  for h in headers do
    headers_s = headers_s + h + "\n"
  end
  client.print "#{headers_s}\r\n"
  client.print "\r\n"
  client.print data.to_s
end
response(client, 200, ["Content-Type: image/png"], File.read("./very_cool_picture.png"))
I probably am reading the image wrong, but I am not sure. Also, sending back other binary data such as executables does not work properly either even with the correct headers.
There is also more code that I did not show because it was excessive and irrelevant that accepts the clients, parses requests, etc.
You have an extra \r\n between the headers and the data. After each header you add a \n (it should really be a \r\n), then when you print them out you add another (client.print "#{headers_s}\r\n"), and then finally you write out yet another. These two extra bytes cause the browser to see invalid PNG data.
Removing the line client.print "\r\n" should fix your issue.
(You should probably also send a Content-Length header, although it will still work without one.)
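A minimal sketch of a response builder along these lines: every header ends in \r\n, exactly one blank line separates headers from data, and Content-Length is computed from the byte size of the payload. The name build_response and the extra Connection: close header are assumptions made for this example, not part of the original code.

```ruby
# Builds a complete HTTP/1.1 response as a binary string.
# `headers` is an array of "Name: value" strings; `body` may be binary data.
# The method name is hypothetical.
def build_response(status_line, headers, body)
  body = body.to_s.b                       # treat the payload as raw bytes
  all_headers = headers + ["Content-Length: #{body.bytesize}", "Connection: close"]
  head = "HTTP/1.1 #{status_line}\r\n" +
         all_headers.map { |h| "#{h}\r\n" }.join +
         "\r\n"                            # the single blank line before the data
  head.b + body
end

# The eight-byte PNG signature is used here as a stand-in for a real image.
png_signature = "\x89PNG\r\n\x1A\n".b
response = build_response("200 OK", ["Content-Type: image/png"], png_signature)
puts response.bytesize
```

In the server this might be used as client.write(build_response("200 OK", ["Content-Type: image/png"], File.binread("./very_cool_picture.png"))); File.binread reads the file in binary mode, which matters on platforms that translate line endings.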

Net::HTTP – Flush or Close

I've written a consumer for a payment API. My code simply issues a POST request and gets a response from the API. I've implemented that with Net::HTTP, here are the relevant lines of code:
http = Net::HTTP.new(uri.host, 443)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_NONE
request = Net::HTTP::Post.new(uri.request_uri)
request.set_form_data(params)
response = http.request(request)
This worked for years, however, recently some requests have reached timeouts when the API is under stress. The API maintainer came up with this explanation:
We pass on the data to RabbitMQ synchronously after flushing the HTTP response. Apparently, some HTTP libs wait for the connection to be closed before the program continues on the consumer side and we think this is happening here. Please reconfigure your consumer not to wait for close but to continue right after the response has been flushed.
I'm not sure how Net::HTTP is implemented and whether it really waits for the close when the response has been flushed. The docs don't say anything about it nor is there a setting to control any of this. And to make matters worse, I don't really know how to simulate this.
Any ideas are very welcome!
I guess the following experiment (with Ruby 2.3) should give the answer. I'm posting it here in case someone else stumbles across this question in the future.
server.rb:
require 'socket'
server = TCPServer.new('localhost', 2345)
loop do
  socket = server.accept
  request = socket.gets
  STDERR.puts request
  response = "Hello World at #{Time.now}!\n"
  socket.print "HTTP/1.1 200 OK\r\n" +
               "Content-Type: text/plain\r\n" +
               "Content-Length: #{response.bytesize}\r\n" +
               "Connection: close\r\n"
  socket.print "\r\n"
  socket.print response
  socket.flush
  sleep 10
  socket.close
end
client.rb:
require 'net/http'
http = Net::HTTP.new('localhost', 2345)
request = Net::HTTP::Post.new('/')
response = http.request(request)
puts response.body
Running the server, the client will send one request and exit. It does so immediately, so the flush is sufficient for the client code to continue. Restarting the client within the server's 10-second wait causes the client to hang until the 10 seconds have fully elapsed; it then prints the Hello World message and once again exits immediately.
In other words: such a simple Net::HTTP client does not wait for the connection to close but continues to execute its code once the server has flushed.

Clarification of the Ruby Socket Library Gets Method

I am working on an assignment where I have to develop a web server in Ruby using the socket library. I was able to get a simple web server up and running, as seen in an earlier thread.
I am currently working on getting and storing the body of an HTTP request into a variable in my web server. The problem I am running into is defining a while loop that gets the entire body of an HTTP request.
I am attempting to get the body of an HTTP request by using the gets method. I could not find any documentation on this method (I only saw it being used in an example) and was wondering whether there is more documentation online.
In my first post here, someone suggested that I use the Content-Length header to determine the size of the body and how much data to read from the socket. I don't really understand how I would go about implementing this because I am unsure how the gets method functions.
Since this is for an assignment, I don't think posting code would be a good idea. I am looking for more information on the gets method and any tips to point me towards the right direction.
You shouldn't be using gets. gets tries to read complete lines (i.e., it reads up to a line separator), but there is no guarantee that an HTTP request body ends with a line separator.
Instead you should be using read, which allows you to read an arbitrary amount of data (as you mentioned, you can use the Content-Length header to know how much to read).
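The difference can be seen in a small experiment. This is only a sketch: UNIXSocket.pair stands in for the browser/server connection (which assumes a Unix-like system), and the "request" is reduced to a single Content-Length header followed by a body with no trailing newline.

```ruby
require 'socket'

# A connected pair of sockets stands in for browser and server.
client, server_side = UNIXSocket.pair

body = "name=ruby&lang=en"            # note: no trailing newline
client.write("Content-Length: #{body.bytesize}\r\n\r\n#{body}")
# The client deliberately leaves its socket open, as a browser would.

header = server_side.gets             # fine: this line ends with \r\n
server_side.gets                      # consumes the blank line
length = header[/\d+/].to_i
received = server_side.read(length)   # returns once `length` bytes have arrived
puts received

client.close
server_side.close
```

Calling gets instead of read(length) for the body would block here, because the body has no line separator and the client has not closed its end.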
Your ultimate problem isn't related to gets, or even really anything in your code. But before we get to that, let's answer this question & explore sockets a little bit.
If you follow the chain up, you find that Ruby's TCPSocket class inherits from its IO class. It's IO that provides gets. gets will read, line-by-line, until there's nothing more to read. Let's create a simple client that connects to a port, spits out 4 lines of poetry, and then quits:
# poetry_sender.rb
require 'socket'
poem = ["'God save thee, ancient Mariner!",
        "From the fiends, that plague thee thus!—",
        "Why look'st thou so?'—With my cross-bow",
        "I shot the ALBATROSS."]
puts "Client establishing connection..."
s = TCPSocket.new 'localhost', 2000
puts "Client sending poetry..."
poem.each { |line| s.puts line } # Print each line out on the socket
s.close # Close our socket
puts "All done."
And a simple server, that displays what the client sends us:
# poetry_receiver.rb
require 'socket'
server = TCPServer.new 2000 # Server bind to port 2000
loop do
  puts "Server now awaiting some poetry..."
  socket = server.accept # Wait for a client to connect
  while line = socket.gets
    puts "A client sent us this beautiful line: #{line}"
  end
  puts "They had nothing more to say; let's disconnect them."
  socket.close
end
If you run the server (poetry_receiver.rb) first, and then the client, you'll see some output like this:
Server now awaiting some poetry...
A client sent us this beautiful line: 'God save thee, ancient Mariner!
A client sent us this beautiful line: From the fiends, that plague thee thus!—
A client sent us this beautiful line: Why look'st thou so?'—With my cross-bow
A client sent us this beautiful line: I shot the ALBATROSS.
They had nothing more to say; let's disconnect them.
Server now awaiting some poetry...
The last two lines are the important ones; they indicate that socket.gets returned nil and we exited the while loop.
So, how can we modify our poetry_sender.rb so the server doesn't detect the end of the poem? You might think it's got something to do with blank lines, but if you set poem = [] or poem = ["", "", ""] then you'll find that it still gets disconnected OK. But what if we added a delay before closing the socket in poetry_sender.rb?
sleep 60
s.close # Close our socket
puts "All done."
Now you'll see a big delay in the server output. The TCP server doesn't break out of its while loop until the TCP client closes its socket.
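As a side note, a client can signal "no more data" without fully closing the socket by half-closing it with close_write. The sketch below is an illustration under assumptions: UNIXSocket.pair stands in for the TCP client and server (Unix-like systems only), and browsers do not end HTTP requests this way.

```ruby
require 'socket'

# A connected socket pair stands in for poetry_sender and poetry_receiver.
sender, receiver = UNIXSocket.pair

sender.puts "I shot the ALBATROSS."
sender.close_write                    # "I'm done sending", but keep reading

lines = []
while (line = receiver.gets)          # gets returns nil once the sender half-closes
  lines << line
end
receiver.puts "Received #{lines.size} line(s)"
receiver.close

reply = sender.gets                   # the sender can still read the reply
puts reply
sender.close
```

The receiving loop ends as soon as the write side is shut down, even though the sender's socket is still open for reading.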
Now we can turn to your broader problem: you're trying to implement a simple HTTP server, but your server gets stuck in a while loop when you connect via your web browser. That's because your web browser is keeping the socket open; it has to, otherwise it would have no way to receive your response. So, how do we know when a client has finished sending us a request? The HTTP 1.1 spec says:
A client sends an HTTP request to a server in the form of a request message... followed by header fields... an empty line to indicate the end of the header section, and finally a message body containing the payload body (if any).
Let's not worry about the message body; how could we write a while loop that terminates if there is no more input, or if it receives a blank line? Here's one way, in a simple HTTP server that just sends back "Hello world" no matter what request it receives:
require 'socket'
server = TCPServer.new('localhost', 2345)
loop do
  socket = server.accept
  http_request = [] # We'll store the lines of this incoming request here.
  while (line = socket.gets) && line.chomp != '' # While the client is connected, and hasn't sent us a blank line yet...
    http_request << line
  end
  # Send response headers
  socket.print "HTTP/1.1 200 OK\r\n" +
               "Content-Type: text/plain\r\n" +
               "Connection: close\r\n" +
               "\r\n"
  # Send response body
  socket.print "Hello world!"
  socket.close
end
Quite late to the party, but I'm currently implementing my own rack app server (for fun).
Here you can see how I do it: https://github.com/tak1n/reifier/blob/master/lib/reifier/request.rb
The first line of an HTTP request is always the request line, which looks something like GET /test HTTP/1.1.
After the request line come the headers, up to the first empty line (a bare \r\n).
After that you can read the body (for a PUT or POST request) using the CONTENT_LENGTH value you parsed from the headers.
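The layout described above can be sketched as a small parser. This is not the code from the linked repository; the method name parse_request_head and the hash keys are invented for this example, and StringIO stands in for the client socket.

```ruby
require 'stringio'

# Parses the request line and headers from an IO-like object.
# Returns a hash, or nil if the client sent nothing.
# Method name and hash keys are this sketch's own choices.
def parse_request_head(io)
  request_line = io.gets
  return nil if request_line.nil?     # client closed without sending anything
  method, path, version = request_line.chomp.split(' ', 3)

  headers = {}
  while (line = io.gets) && line.chomp != ''
    name, value = line.chomp.split(':', 2)
    # Header names are case-insensitive, so normalise them to lower case.
    headers[name.strip.downcase] = value.to_s.strip
  end
  { method: method, path: path, version: version, headers: headers }
end

raw = "PUT /test HTTP/1.1\r\nHost: localhost:2345\r\nContent-Length: 5\r\n\r\nhello"
req = parse_request_head(StringIO.new(raw))
puts "#{req[:method]} #{req[:path]} #{req[:version]}"
puts req[:headers]['content-length']
```

After this call the IO is positioned at the start of the body, so the value under 'content-length' tells you how many bytes to pass to read.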

Simple HTTP server in Ruby using TCPServer

For a school assignment, I am trying to create a simple HTTP server using Ruby and the sockets library.
Right now, I can get it to respond to any connection with a simple hello:
require 'socket'
server = TCPServer.open 2000
puts "Listening on port 2000"
loop {
  client = server.accept()
  resp = "Hello?"
  headers = ["HTTP/1.1 200 OK",
             "Date: Tue, 14 Dec 2010 10:48:45 GMT",
             "Server: Ruby",
             "Content-Type: text/html; charset=iso-8859-1",
             "Content-Length: #{resp.length}\r\n\r\n"].join("\r\n")
  client.puts headers
  client.puts resp
  client.close
}
This works as expected. However, when I have the server tell me who just connected with
puts "Client: #{client.addr[2]}"
and use Chromium (browser) to connect to localhost:2000/ (just once), I get:
Client: 127.0.0.1
Client: 127.0.0.1
Client: 127.0.0.1
Client: 127.0.0.1
I assume this is Chromium requesting auxiliary files, like favicon.ico, and not my script doing something weird, so I wanted to investigate the incoming request. I replaced the resp = "Hello?" line with
resp = client.read()
And restarted the server. I resent the request in Chromium, and instead of it coming back right away, it just hung. Meanwhile, I got the output Client: 127.0.0.1 in my server output. I hit the "stop" button in Chromium, and then the server crashed with
server.rb:16:in `write': Broken pipe (Errno::EPIPE)
from server.rb:16:in `puts'
from server.rb:16:in `block in <main>'
from server.rb:6:in `loop'
from server.rb:6:in `<main>'
Obviously, I'm doing something wrong, as the expected behavior was sending the incoming request back as the response.
What am I missing?
I don't really know about chrome and the four connections, but I'll try to answer your questions on how to read the request properly.
First of all, IO#read won't work in this case. According to the documentation, read without any parameters reads until it encounters EOF, but on a socket EOF only arrives when the other side closes the connection, and the browser keeps it open while it waits for your response. A socket is an endless stream, so you won't be able to use that method to read in the entire message, since there is no "entire" message as far as the socket is concerned. You could use read with an integer, like read(100), but that will block at some point anyway.
Basically, reading a socket is very different from reading a file. A socket is updated asynchronously, completely independent of the time you try to read it. If you request 10 bytes, it's possible that, at this point in the code, only 5 bytes are available. With blocking IO, the read(10) call will then hang and wait until 5 more bytes are available, or until the connection is closed. This means that, if you try repeatedly reading packets of 10 bytes, at some point, it will still hang. Another way to read a socket is using non-blocking IO, but that's not very important in your case, and it's a long topic by itself.
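For comparison, IO#readpartial returns whatever is currently available (up to the requested maximum) instead of waiting for the full amount, and only blocks when nothing has arrived yet. A small sketch, using IO.pipe as a stand-in for a socket so that it runs without a network:

```ruby
# IO.pipe gives a reader/writer pair that behaves like a one-way socket.
reader, writer = IO.pipe
writer.write("12345")                 # only 5 bytes are available
writer.flush

chunk = reader.readpartial(10)        # returns the 5 bytes instead of waiting for 10
puts chunk.bytesize

writer.close
begin
  reader.readpartial(10)              # nothing left and the writer is closed
rescue EOFError
  puts "peer closed the connection"
end
reader.close
```

This still does not tell you where a message ends; it only avoids waiting for bytes that may never come.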
So here's an example of how you might access the data by using blocking IO:
loop {
  client = server.accept
  while line = client.gets
    puts line.chomp
    break if line =~ /^\s*$/
  end
  # rest of loop ...
}
The gets method tries to read from the socket until it encounters a newline. This will happen at some point for an HTTP request, so even if the entire message is transferred piece by piece, gets should return a single line from the input. The line.chomp call will cut off the final newline if it's present. If the line read is empty, that means the HTTP headers have been transferred and we can safely break the loop (you can put that in the while condition, of course). The request will be dumped to the console that the server has been started on. If you really want to send it back to the browser, the idea's the same; you just need to handle the lines differently:
loop {
  client = server.accept
  lines = []
  while (line = client.gets) && line !~ /^\s*$/
    lines << line.chomp
  end
  resp = lines.join("<br />")
  headers = ["HTTP/1.1 200 OK",
             "Date: Tue, 14 Dec 2010 10:48:45 GMT",
             "Server: Ruby",
             "Content-Type: text/html; charset=iso-8859-1",
             "Content-Length: #{resp.bytesize}\r\n\r\n"].join("\r\n")
  client.print headers # send the status line and headers to the client
  client.print resp    # print rather than puts, so the body matches Content-Length
  client.close
}
As for the broken pipe, that error occurs because the browser has already closed the connection by the time the server tries to write its response; as the backtrace shows, the exception is raised in puts, not in read.
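If you would rather have the server survive a client that disconnects early, one option is to rescue the error around the write. This is a sketch only: the method name write_or_give_up is hypothetical, and IO.pipe with a closed reader is used to simulate a browser that has gone away.

```ruby
# Writes data and reports whether the peer was still there to receive it.
# The method name is hypothetical.
def write_or_give_up(io, data)
  io.write(data)
  io.flush
  true
rescue Errno::EPIPE, Errno::ECONNRESET
  false                               # the other end has gone away
end

reader, writer = IO.pipe
reader.close                          # simulate the browser's "stop" button
puts write_or_give_up(writer, "Hello?")
```

In the server loop, a false result would simply mean closing the client socket and moving on to the next accept.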

How to prevent "The connection was reset" error?

I have a very basic TCP server implemented in Ruby. In general it does what it's supposed to, but every once in a while I get "The connection to the server was reset while the page was loading" error. I have a feeling that it has something to do with close terminating the connection too soon. If so, how do I wait for all the data to be sent? Or is it something else?
require 'socket'
server = TCPServer.new('', 80)
loop do
  session = server.accept
  begin
    session.print Time.now
  ensure
    session.close
  end
end
I'm not an expert in this area, but here is what I believe is happening....
The browser sends a GET request with the header field "Connection: keep-alive". So the browser is expecting to keep the connection alive at least until it receives a complete chunk of the response. Under this protocol, the server response must include a header specifying the length of the response, so that the browser knows when it has received the complete response. After this point, the connection can be closed without the browser caring.
The original example closes the connection too quickly, before the browser can validate that a complete response was received. Curiously, if I run that example and refresh my browser several times, it loads in only about 1 in 10 tries. Maybe this erratic behavior is due to the browser occasionally executing fast enough to beat my server closing the connection.
Below is a code example that executes consistently in my browser:
require 'socket'
response = %{HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Content-Length: 12

Hello World!
}
server = TCPServer.open(80)
loop do
  client = server.accept
  client.puts response
  sleep 1
  client.close
end
I suspect it's because the browser is expecting an HTTP response with headers, etc. Curiously, you can make the "reset" error happen every time if you put a sleep of, say, one second before the "ensure".
How to fix it depends upon what you are after. If this is not to be an HTTP server, then don't use the browser to test it. Instead, use telnet or write a little program. If it is to be an HTTP server, then take a look at webrick, which has shipped with Ruby MRI since 1.8 (from Ruby 3.0 on it is a separate gem that has to be installed). Here's how:
#!/usr/bin/ruby1.8
require 'webrick'
# This class handles time requests
class TimeServer < WEBrick::HTTPServlet::AbstractServlet
  def do_GET(request, response)
    response.status = 200
    response['Content-Type'] = 'text/plain'
    response.body = Time.now.to_s
  end
end
# Create the server. There are many other options, if you need them.
server = WEBrick::HTTPServer.new(:Port=>8080)
# Whenever a request comes in for the root page, use TimeServer to handle it
server.mount('/', TimeServer)
# Finally, start the server. Does not normally return.
server.start
I should also note that including Connection: close in the response header doesn't seem to help at all with this connection reset error in my browser (FFv3.6). I have to include the Content-Length header field and also add a sleep to delay closing the connection in order to get a consistent response in my browser.
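Another possible explanation, offered here as an assumption rather than a confirmed diagnosis of the code above: when a socket is closed while unread request data is still sitting in its receive buffer, the operating system may send a TCP reset instead of a normal close. Under that assumption, reading the request before answering should avoid the reset without a sleep. A minimal sketch, with the hypothetical method name serve_time:

```ruby
require 'socket'

# Reads and discards the request head, then sends a complete response.
# `serve_time` is a hypothetical name for this sketch.
def serve_time(session)
  # Consume the request up to the blank line so that no unread data
  # is left in the receive buffer when the socket is closed.
  loop do
    line = session.gets
    break if line.nil? || line.chomp.empty?
  end

  body = Time.now.to_s
  session.print "HTTP/1.1 200 OK\r\n" \
                "Content-Type: text/plain\r\n" \
                "Content-Length: #{body.bytesize}\r\n" \
                "Connection: close\r\n" \
                "\r\n" \
                "#{body}"
ensure
  session.close
end
```

In the original server, the body of the loop would then shrink to serve_time(server.accept).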

Resources