Ruby HTTP server without networking

I am trying to add an HTTP server to an existing Ruby application. The application is based around a select loop, and I want to handle incoming HTTP requests there too (it is important to process the requests in the same thread, or I have to jump through hoops to marshal them there).
Ruby has plenty of solutions for standalone HTTP servers, but I can't seem to find a library which implements an HTTP server on an existing socket. I don't want the HTTP library to open a port and wait, I want to feed it sockets.
The basic logic I'm looking for is this:
handler = SomeHTTPParsingLibrary.new
# set up handler callbacks, etc on handler...
while socket = get_incoming_connection()
  handler.handle_request(socket)
end
Are there any existing Ruby libraries that can work like this? HTTP is a simple enough protocol, but there are enough irritating details involved (I need cookies, basic auth, etc) that I'd rather not roll my own.

You may have to roll your sleeves up a bit to figure out what methods to call, but I'd suggest trying the HTTPParser class from within mongrel.

A quick glance through the code in httprequest.rb (WEBrick, from the Ruby stdlib) suggests it might suit your purpose.
A WEBrick::HTTPRequest object is able to accept a socket as an argument to its parse() method. It will then block, and return when the request object has been fully populated with the incoming HTTP request.
eg:
res = HTTPResponse.new(@config)
req = HTTPRequest.new(@config)
# some code to "select" a socket goes here
# sock is active, hand it over to the req object for reading.
req.parse(sock)
res.request_method = req.request_method
Of course, this assumes that this thread will block until the current request handling is complete.
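Putting the pieces together, a minimal sketch of that select-loop usage might look like this (get_incoming_connection is the stand-in from the question, and the response handling is purely illustrative):
require 'webrick'

config = WEBrick::Config::HTTP
while sock = get_incoming_connection()
  req = WEBrick::HTTPRequest.new(config)
  res = WEBrick::HTTPResponse.new(config)
  begin
    req.parse(sock)          # blocks until the request has been read
    res.request_method = req.request_method
    res.body = "OK\n"
    res.send_response(sock)  # writes the status line, headers and body
  rescue WEBrick::HTTPStatus::EOFError
    # client disconnected mid-request
  ensure
    sock.close
  end
end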
OTOH, something like tmm1/http_parser.rb might also fit your needs, though it sacrifices other things (like cookie handling) in favor of speed.
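For reference, a rough sketch of feeding raw socket data into http_parser.rb might look like the following (callback names are from the gem's README; treat the details as an approximation):
require 'http/parser'

parser = Http::Parser.new
parser.on_headers_complete = proc do
  # headers are parsed; cookies and auth would be yours to handle
  p [parser.http_method, parser.request_url, parser.headers]
end
parser.on_message_complete = proc do
  # the full request has arrived -- hand off to your application logic
end

# feed it whatever the select loop reads off the socket
parser << sock.readpartial(4096)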

Related

How to manage a slow callback function in the ESPAsyncWebServer library

I understand that delaying or yielding in the ESPAsyncWebServer library callbacks is a no-no. However, my callback function needs to query another device via the Serial port. This process is slow and will crash the ESP32 as a result.
Here is an example:
void getDeviceConfig(AsyncWebServerRequest *request) {
  AsyncResponseStream *response =
      request->beginResponseStream("application/json");
  StaticJsonDocument<1024> doc;
  JsonArray array = doc.createNestedArray("get");
  for (size_t i = 0; i < request->params(); i++)
    array.add(request->getParam(i)->value());
  serializeJson(doc, Serial);
  /* At this point, the remote device determines what is being asked for
     and builds a response. This can take a fair bit of time depending on
     what is being asked (>1sec) */
  response->print(Serial.readStringUntil('\n'));
  request->send(response);
}
I looked into building a response callback. However, I would need to know ahead of time how much data the remote device will generate. There's no way for me to know this.
I also looked into using a chunked response. In this case, the library will continuously call my callback function until I return 0 (which indicates that there is no more data). This is a good start - but doesn't quite fit. I can't inform the caller that there is definitely more data coming but that I just haven't received a single byte yet. All I can do here is return 0, which will stop the caller.
Is there an alternative approach I could use here?
The easiest way to do this without major changes to your code is to separate the request and the response and poll periodically for the results.
Your initial request, as you have it written, would initiate the work. The callback handler would set a global boolean variable indicating there was work to be done and, if there were any parameters for the work, would save them in globals. Then it would return, and the client would see the HTTP request complete but wouldn't have an answer.
In loop() you'd look for the boolean that there was work to be done, do the work, store any results in global variables, set a different global boolean indicating that the work was done, and set the original boolean that indicated work needed to be done to false.
You'd write a second HTTP request that checked to see if the work was complete, and issue that request periodically until you got an answer. The callback handler for the second request would check the "work was done" boolean and return either the results or an indication that the results weren't available yet.
Doing it this way would likely be considered hostile on a shared server or public API, but you have 100% of the ESP32 at your disposal so while it's wasteful it doesn't matter that it's wasteful.
It would also have problems if you ever issued a new request to do work before the first one was complete. If that is a possibility you'd need to move to a queueing system where each request created a queue entry for work, returned an ID for the request, and then the polling request to ask if work was complete would send the ID. That's much more complicated and a lot more work.
An alternate solution would be to use websockets. ESPAsyncWebServer supports async websockets. A websocket connection stays open indefinitely.
The server could listen for a websocket connection and then, instead of performing a new HTTP request for each query, the client would send an indication over the websocket that it wanted the server to do the work. The websocket callback would work much the same way as the regular HTTP server callback I wrote about above. But when the work was complete, the code doing it would just write the result back to the client over the websocket.
Like the polling approach this would get a lot more complicated if you could ever have two or more overlapping requests.

JSON RPC in Golang with AMQP

I use "github.com/streadway/amqp" for async processing requests via queue (RabbitMQ).
And I use "github.com/gorilla/rpc" to register my service without workaround, but I have to use ugly solution for conversion amqp.Delivery to http.Request (mux.Server can works with http.Request only).
Can I use more elegant solution for this task?
I can't find JSON RPC router for AMQP.
First, RPC and pub-sub (e.g. AMQP) are two very different beasts; trying to use one to implement the other isn't necessarily wrong or bad, but it's definitely suspicious, and implies that there could be a breakdown somewhere in the design. So I would highly recommend reconsidering the design starting with your business goals and make sure that what you're trying to implement is actually the correct way to achieve the desired functionality.
That said, what you're describing is basically possible, but you want to move your abstraction up a level. Trying to send an http.Request via AMQP is mixing protocols in a way that's only going to lead to more problems. The cleaner way to implement this behavior would be to have an HTTP handler that handles http.Requests (as normal) and an AMQP handler that handles amqp.Deliverys (as normal), and have each of those handlers call a shared business logic handler which deals only in your domain model.
So, your HTTP handler would parse an HTTP request and turn it into a domain object - you don't give any concrete details in the question so I'll invent something like maybe myapp.UserRegistration. Your HTTP handler would pass that to a myapp.UserService which would handle the actual business logic of registering a user, it would return a result, which you would then transform into the appropriate type, marshal to JSON, and send back to the client in an http.Response. myapp.UserService would know nothing about HTTP or AMQP; it operates only on your own domain types.
Your AMQP handler would pick up a message, parse it into the same myapp.UserRegistration type, pass it to the same myapp.UserService handler, and get the same response back - ensuring that the business logic for AMQP and HTTP behaves the same way. Then you'd get your response back, and... well, this is AMQP, so you don't get to send a response to the client. I don't know your setup, maybe you have another queue you can send the response back on, maybe you don't care about the response and can discard it. This is where the difference between RPC and AMQP is most apparent.
This also makes your business logic, HTTP handler, and AMQP handler more testable in isolation, because you're separating the protocol logic from the business logic, which can be helpful even when you aren't trying to deal with multiple protocols (i.e. it's not a bad idea even if you're only doing HTTP).
I hope that at least gives you enough info to put you on the right track in your implementation. Good luck!

Does http have to be a request/response protocol?

I have to ask a plaintive question. I know that http is normally request-response. Can it be request-done?
We have a situation where we would like to send an ajax call off to one server, and then when that completes post a form to another server. We can't send them both without coordinating them, because the post makes the browser navigate to another server, and we lose our context.
What I am currently doing is the first ajax call, and then in its callback I do document['order-form'].submit(). My boss pointed out that if the ajax call isn't completed for a while, the user will see his browser not making progress, even though it's still responsive. He wanted me to put a reasonable timeout on the ajax call.
But really, the ajax call is a "nice but not necessary" thing. My boss would be equally happy if we could send it and forget about it.
I'm having a lot of trouble formulating an appropriate query for Google. "Use HTTP like UDP" doesn't work. A lot of things don't work. Time to ask a human.
If you look at the ISO-OSI model of networking, HTTP is an application layer protocol and UDP is in the transport layer. HTTP typically uses TCP and rarely uses UDP. RTP (Realtime Transport Protocol), however, uses UDP and is used for media streaming. One more thing: UDP is not going to assure you 100% transport, whereas TCP tries to (when packet loss is detected, TCP attempts a re-transmission). So we expect drops in UDP. So when you say "fire and forget" - what happens when your packet fails to arrive?
So I guess you got confused between UDP and HTTP (and I am sorry if that's not the case and there really is something about HTTP using UDP for web pages that I am not aware of right now).
The best way, IMHO, to coordinate an asynchronous process like this is to have an AJAX call (with CORS enabled if required) like what you have written currently, coupled with a good UI/UX frontend which intelligently shows progress/status to the end user.
Also - maybe we could tune up whatever makes the AJAX response slow... say, a DB call which is supposed to return data could be tuned up a bit.
Here's what Eric Bidelman says:
// Listen to the upload progress.
var progressBar = document.querySelector('progress');
xhr.upload.onprogress = function(e) {
  if (e.lengthComputable) {
    progressBar.value = (e.loaded / e.total) * 100;
    progressBar.textContent = progressBar.value; // Fallback for unsupported browsers.
  }
};
I think this has the germ of an answer. 1) We can find out when the request has entirely gone. 2) We can choose not to have handlers for the response.
As soon as you have been informed that the request has gone out, you can take your next step, including navigating to another page.
I'm not sure, however, how many browsers support xhr.upload.onprogress.
If something is worth doing, surely it's worth knowing whether what you requested was done or not. Otherwise how can you debug problems, or give any kind of good user experience?
A response is any kind of response, it need not carry a message body. A simple 204 response could indicate that something succeeded, as opposed to a 403 or 401 which may require some more action.
I think I've figured out the answer. And it is extremely simple. Good across all browsers.
Just add xhr.timeout = 100; to your ajax call. If it takes the server a full second to respond, you don't care. You already moved on at 1/10 second.
So in my case, I put document['order-form'].submit() in my timeout handler. When the browser navigates away, I am assured that the request has finished going out.
Doesn't use any esoteric knowledge of protocols, or any recent innovations.

Server architecture: websocket multicast server?

What would be the simplest way to build a server that receives incoming connections via a websocket, and streams the data flowing in that socket out to n subscribers on other websockets. Think for example of a streaming application, where one person is broadcasting to n consumers.
Neglecting things like authentication, what would be the simplest way to build a server that can achieve this? I'm a little confused about what would happen when a chunk of data hits the server. It would go into a buffer in memory, then how would it be distributed to the n consumers waiting for it? Some sort of circular buffer? Are websockets an appropriate protocol for this? Thanks.
Here's one using the Ruby Plezi framework (I'm the author, so I'm biased):
require 'plezi'

class Client
  # Plezi recognizes websocket handlers by the presence of the
  # `on_message` callback.
  def on_message data
    true
  end

  protected

  # this will be our event.
  def publish data
    write data
  end
end

class Streamer
  def on_message data
    Client.broadcast :publish, data
  end
end

# the streamer will connect to the /streamer path
route '/streamer', Streamer
# the client will connect to the root path
route '/', Client

# on irb, we start the server by exiting the `irb` terminal
exit
You can test it with the Ruby terminal (irb) - it's that simple.
I tested the connections using the Websocket.org echo test with two browser windows, one "streaming" and the other listening.
use ws://localhost:3000/streamer for the streamer websocket connection
use ws://localhost:3000/ for the client's connection.
EDIT (relating to your comment regarding the Library and architecture)
The magic happens in the IO core, which I placed in a separate Ruby gem (Ruby libraries are referred to as 'gems') called Iodine.
Iodine leverages Ruby's Object Oriented approach (in Ruby, everything is an object) to handle broadcasting.
A good entry point for digging through that piece of the code is here. When you encounter the method each, note that it's inherited from the core Protocol and uses an Array derived from the IO map.
Iodine's websocket implementation iterates through the array of IO handlers (the value half of a key=>value map), and if an IO handler is a Websocket it will "broadcast" the message to it by invoking the on_broadcast callback. The callback is invoked asynchronously, and it locks the IO handler while being executed, to avoid conflicts.
Plezi leverages Iodine's broadcast method and uses the same concept so that the on_broadcast callback will filter out irrelevant messages.
Unicasting works a little bit differently, for performance reasons, but it's mostly similar.
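As a conceptual sketch only (this is not Iodine's actual code, and the names are illustrative), the broadcast pattern boils down to something like:
# conceptual only -- not Iodine's real API
def broadcast(io_map, event, data)
  io_map.each_value do |handler|
    next unless handler.respond_to?(event) # skip IO handlers that aren't websockets
    # fire the callback asynchronously so one slow client can't stall the rest
    Thread.new { handler.public_send(event, data) }
  end
end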
I'm sorry for using a lot of shorthand in my code... pre-Ruby habits I guess. I use the condition ? when_true : when_false shorthand a lot and tend to squish stuff into single lines... but it should be mostly readable.
Good luck!

Concurrent web requests with Ruby (Sinatra?)?

I have a Sinatra app that basically takes some input values and then finds data matching those values from external services like Flickr, Twitter, etc.
For example:
input:"Chattanooga Choo Choo"
Would go out and find images at Flickr on the Chattanooga Choo Choo and tweets from Twitter, etc.
Right now I have something like:
@images = Flickr::...find...images..
@tweets = Twitter::...find...tweets...
@results << @images
@results << @tweets
So my question is: is there an efficient way in Ruby to run those requests concurrently, instead of waiting for the images to finish before the tweets start?
Threads would work, but it's a crude tool. You could try something like this:
flickr_thread = Thread.start do
  @flickr_result = ... # make the Flickr request
end
twitter_thread = Thread.start do
  @twitter_result = ... # make the Twitter request
end
# this makes the main thread wait for the other two threads
# before continuing with its execution
flickr_thread.join
twitter_thread.join
# now both @flickr_result and @twitter_result have
# their values (unless an error occurred)
You'd have to tinker a bit with the code though, and add proper error detection. I can't remember right now whether instance variables work when first assigned inside the thread block; local variables wouldn't unless they were declared outside it first (see the sketch below).
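For instance, with local variables you'd need to declare them before starting the thread, so that the block closes over the outer variable instead of creating its own:
flickr_result = nil # declared outside the block, so the main thread can read it later
t = Thread.start do
  flickr_result = "...images..." # stand-in for the real Flickr call
end
t.join
puts flickr_result # visible here because the block closed over the outer local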
I wouldn't call this an elegant solution, but I think it works, and it's not too complex. In this case there is luckily no need for locking or synchronizations apart from the joins, so the code reads quite well.
Perhaps a tool like EventMachine (in particular the em-http-request subproject) might help you, if you do a lot of things like this. It could probably make it easier to code at a higher level. Threads are hard to get right.
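For example, a rough em-http-request sketch (written from memory of that gem's API, with placeholder URLs) might look like:
require 'em-http-request'

EventMachine.run do
  pending = 2
  done = proc { EventMachine.stop if (pending -= 1).zero? }

  flickr  = EventMachine::HttpRequest.new('https://api.flickr.com/...').get
  twitter = EventMachine::HttpRequest.new('https://api.twitter.com/...').get

  flickr.callback  { @images = flickr.response;  done.call }
  twitter.callback { @tweets = twitter.response; done.call }
  # errbacks omitted for brevity -- real code should handle failures too
end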
You might consider making a client side change to use asynchronous Ajax requests to get each type (image, twitter) independently. The problem with server threads (one of them anyway) is that if one service hangs, the entire request hangs waiting for that thread to finish. With Ajax, you can load an images section, a twitter section, etc, and if one hangs the others will still show their results; eventually you can timeout the requests and show a fail whale or something in that section only.
Yes, why not threads?
As I understand it, as soon as the user submits the form, you want to process all the requests in parallel, right? You could have one multithreaded controller (Ruby's thread support works really well) where you receive the request, execute the external service queries in parallel, and then answer back in a single response; or, on the client side, you could send one ajax post per service and process each result as it arrives (maybe each external service gets its own controller/actions?).
http://github.com/pauldix/typhoeus handles parallel/concurrent HTTP requests.
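A minimal sketch with Typhoeus' Hydra (the URLs are placeholders):
require 'typhoeus'

hydra   = Typhoeus::Hydra.new
flickr  = Typhoeus::Request.new("https://api.flickr.com/...")
twitter = Typhoeus::Request.new("https://api.twitter.com/...")
hydra.queue(flickr)
hydra.queue(twitter)
hydra.run # blocks until both requests have completed
puts flickr.response.code
puts twitter.response.code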
Consider using YQL for this. It supports subqueries, so that you can pull everything you need with a single (client-side, even) call that just spits out JSON of what you need to render. There are tons of tutorials out there already.
