Extract uri parameters from a HTTP connection on a Ruby TCPSocket - ruby

My first question here... so be gentle :D
I have the following code:
server = TCPServer.new('localhost', 8080)
loop do
socket = server.accept
# Do something with the URL parameters
response = "Hello world";
socket.print response
socket.close
end
The point is that I want to retrieve any parameters that were sent in the URL of the HTTP request.
Example:
From this request:
curl "http://localhost:8080/?id=1&content=test"
I want to be able to retrieve something like this:
{"id" => "1", "content" => "test"}
I've been looking for CGI::Parse[1] or similar solutions but I haven't found a way to extract that data from a TCPSocket.
[1] http://www.ruby-doc.org/stdlib-1.9.3/libdoc/cgi/rdoc/CGI.html#method-c-parse
FYI: My need is to have a minimal http server in order to receive a couple of parameters and wanted to avoid the use of gems and/or full HTTP wrappers/helpers like Rack.
Needless to say... but thanks in advance.

If you want to see a very minimal server, here is one. It reads the query string from the first line of the request and puts each "key=value" pair into an array as a string. You'll need to do more to split the pairs into keys and values and to decode them.
There is a fuller explanation of the server code at https://practicingruby.com/articles/implementing-an-http-file-server.
require "socket"
server = TCPServer.new('localhost', 8080)
loop do
socket = server.accept
request = socket.gets
# Here is the first line of the request. There are others.
# Your parsing code will need to figure out which are
# the ones you need, and extract what you want. Rack will do
# this for you and give you everything in a nice standard form.
# The request line looks like "GET /?id=1&content=test HTTP/1.1".
target = request.to_s.split(' ')[1].to_s # drop the verb and the HTTP version
paramstring = target.split('?', 2)[1].to_s # drop the path; empty if there is no query string
paramarray = paramstring.split('&') # one "key=value" string per parameter
# Do something with the URL parameters, which are in paramarray
# Build a response!
# you need to include the Content-Type and Content-Length headers
# to let the client know the size and type of data
# contained in the response. Note that HTTP is whitespace
# sensitive and expects each header line to end with CRLF (i.e. "\r\n")
response = "Hello world!"
socket.print "HTTP/1.1 200 OK\r\n" +
"Content-Type: text/plain\r\n" +
"Content-Length: #{response.bytesize}\r\n" +
"Connection: close\r\n"
# Print a blank line to separate the header from the response body,
# as required by the protocol.
socket.print "\r\n"
socket.print response
socket.close
end
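To get the hash the question asks for, the query string can be handed to URI.decode_www_form from the standard library, which also takes care of percent-decoding. The following is a minimal sketch, not a full request parser; parse_query_params is a hypothetical helper name, and it assumes it is given the first line of the request as returned by socket.gets.

```ruby
require "uri"

# Hypothetical helper: turns a request line such as
# "GET /?id=1&content=test HTTP/1.1" into a Hash of query parameters.
def parse_query_params(request_line)
  # The request line has three space-separated parts: verb, target, version.
  _verb, target, _version = request_line.to_s.split(" ", 3)
  return {} if target.nil?

  # Everything after the first "?" in the target is the query string.
  _path, query = target.split("?", 2)
  return {} if query.nil? || query.empty?

  # decode_www_form returns [[key, value], ...] with percent-decoding applied.
  URI.decode_www_form(query).to_h
end
```

Inside the accept loop this could be called as params = parse_query_params(socket.gets). If a key appears more than once, to_h keeps only the last value.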

Related

Net::HTTP Proxy list

I understand that you can use a proxy with Ruby's Net::HTTP. However, I have no idea how to do this with a list of proxies. I need Net::HTTP to switch to another proxy after every POST request and then send the next POST request through it. Also, is it possible to make Net::HTTP switch to another proxy if the previous proxy is not working? If so, how?
Code I'm trying to implement the script in:
require 'net/http'
sleep(8)
http = Net::HTTP.new('URLHERE', 80)
http.read_timeout = 5000
http.use_ssl = false
path = 'PATHHERE'
data = '(DATAHERE)'
headers = {
'Referer' => 'REFERER HERE',
'Content-Type' => 'application/x-www-form-urlencoded; charset=UTF-8',
'User-Agent' => '(USERAGENTHERE)'}
resp = http.post(path, data, headers)
# Output on the screen -> we should get either a 302 redirect (after a successful login) or an error page
puts 'Code = ' + resp.code
puts 'Message = ' + resp.message
resp.each {|key, val| puts key + ' = ' + val}
puts resp.body
Given an array of proxies, the following example will make a request through each proxy in the array until it receives a "302 Found" response. (This isn't actually a working example because Google doesn't accept POST requests, but it should work if you insert your own destination and working proxies.)
require 'net/http'
destination = URI.parse "http://www.google.com/search"
proxies = [
"http://proxy-example-1.net:8080",
"http://proxy-example-2.net:8080",
"http://proxy-example-3.net:8080"
]
# Create your POST request_object once
request_object = Net::HTTP::Post.new(destination.request_uri)
request_object.set_form_data({"q" => "stack overflow"})
proxies.each do |raw_proxy|
proxy = URI.parse raw_proxy
# Create a new http_object for each new proxy
http_object = Net::HTTP.new(destination.host, destination.port, proxy.host, proxy.port)
# Make the request
response = http_object.request(request_object)
# If we get a 302, report it and break
if response.code == "302"
puts "#{proxy.host}:#{proxy.port} responded with #{response.code} #{response.message}"
break
end
end
You should also probably do some error checking with begin ... rescue ... end each time you make a request. If you don't do any error checking and a proxy is down, control will never reach the line that checks for response.code == "302" -- the program will just fail with some type of connection timeout error.
See the Net::HTTPHeader docs for other methods that can be used to customize the Net::HTTP::Post object.
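The error checking described above could look roughly like this. It is a sketch under assumptions: post_via_proxies is a hypothetical helper name, the timeout values are arbitrary, and the list of rescued exceptions is a guess at the errors that usually mean "this proxy is unusable", not an exhaustive set.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: tries each proxy in turn and returns
# [proxy_url, response] for the first proxy that answers, or nil if none do.
def post_via_proxies(destination, form_data, proxies)
  proxies.each do |raw_proxy|
    proxy = URI.parse(raw_proxy)
    # A new Net::HTTP object is needed for each proxy.
    http = Net::HTTP.new(destination.host, destination.port, proxy.host, proxy.port)
    http.open_timeout = 5   # seconds to wait for the connection (arbitrary)
    http.read_timeout = 10  # seconds to wait for the response (arbitrary)

    request = Net::HTTP::Post.new(destination.request_uri)
    request.set_form_data(form_data)

    begin
      return [raw_proxy, http.request(request)]
    rescue Timeout::Error, SocketError, SystemCallError, EOFError => e
      # The proxy is down or misbehaving: report it and move on to the next one.
      warn "#{raw_proxy} failed: #{e.class}: #{e.message}"
    end
  end
  nil
end
```

Unlike the loop above, this returns the first response of any status, so the caller still has to check response.code == "302" and decide whether to continue with the remaining proxies.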

How to get the content from an HTTP POST request received in thin?

I am using thin to receive HTTP POST requests, my server code is this:
http_server = proc do |env|
# Want to make response dependent on content
response = "Hello World!"
[200, {"Connection" => "close", "Content-Length" => response.bytesize.to_s}, [response]]
end
Setting a breakpoint, I can see that I have received the content-type (json), and content length, but can't see the actual content. How can I retrieve the content from the request for processing?
You need to use the rack.input entry of the env object. From the Rack Spec:
The input stream is an IO-like object which contains the raw HTTP POST data. When applicable, its external encoding must be “ASCII-8BIT” and it must be opened in binary mode, for Ruby 1.9 compatibility. The input stream must respond to gets, each, read and rewind.
So you can call read on it like this:
http_server = proc do |env|
json_string = env['rack.input'].read
json_string.force_encoding 'utf-8' # since the body has ASCII-8BIT encoding,
# but we know this is json, we can use
# force_encoding to get the right encoding
# parse json_string and do your stuff
response = "Hello World!"
[200, {"Connection" => "close", "Content-Length" => response.bytesize.to_s}, [response]]
end
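For the "parse json_string and do your stuff" step, the json library that ships with Ruby is enough. The sketch below is only illustrative: handle_json_post is a hypothetical method name, and replying with 400 on malformed JSON is an assumption about what you want, not something thin or Rack requires.

```ruby
require "json"

# Hypothetical handler: reads the POST body from rack.input, parses it as
# JSON and builds a Rack response triplet that depends on the content.
def handle_json_post(env)
  raw = env["rack.input"].read.to_s
  payload = JSON.parse(raw)
  response = "Received: #{payload.inspect}"
  [200, {"Content-Type" => "text/plain", "Content-Length" => response.bytesize.to_s}, [response]]
rescue JSON::ParserError
  # The body was not valid JSON; tell the client instead of crashing.
  response = "Invalid JSON"
  [400, {"Content-Type" => "text/plain", "Content-Length" => response.bytesize.to_s}, [response]]
end
```

It can be wrapped in the same kind of proc as above: http_server = proc { |env| handle_json_post(env) }.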

Can't read HTTP Request Header correctly with Ruby 1.9.3

I'm writing a small webserver. I want to read the HTTP Request. It works when there is no body involved. But when a body is sent then I can't read the content of the body in a satisfying manner.
I read the data coming from the client via TCPSocket. The TCPSocket::gets method reads until the data for the body is received. There is no delimiter or EOF sent to signal the end of the HTTP request body. The HTTP/1.1 Specification - Section 4.4 lists five cases to get the message length. Point 1) works. Points 2) and 4) are not relevant for my application. Point 5) is not an option because I need to send a response.
I can read the value of the Content-Length field. But when I try to "persuade" the TCPSocket to read the last part of the HTTP request via read(contentlength) or recv(contentlength), I have no success. Reading line by line until the \r\n which separates header and body works, but after that I'm stuck - at least in the way I want to do it.
So my questions are:
Is it possible to do it the way I intended in the code?
Are there better ways to achieve my goal of reading the HTTP Request correctly (which I really hope for)?
Here is runnable code. The parts that I want to work are commented out.
#!/usr/bin/ruby
require 'socket'
server = TCPServer.new 2000
loop do
Thread.start(server.accept) do |client|
hascontent = false
contentlength = 0
content = ""
request = ""
#This seems to work, but I'm not really happy with it, too much is happening in
#the loop
while(buf = client.readpartial(4096))
request = request + buf
split = request.split("\r\n")
puts request
puts request.dump
puts split.length
puts split.inspect
# index returns nil until the blank line has arrived, so test with include?
if(request.include?("\r\n\r\n"))
break
end
end
#This part is commented out because it doesn't work
=begin
while(line = client.gets)
puts ":" + line
request = request + line
if(line.start_with?("Content-Length"))
hascontent = true
split = line.split(' ')
contentlength = split[1]
end
if(line == "\r\n" and !hascontent)
break
end
if(line == "\r\n" and hascontent)
puts "Trying to get content :P"
puts contentlength
puts content.length
puts client.inspect
#tried read, with and without parameter, recv, also with and
#without param and their nonblocking counterparts
#where does my thought process go in the wrong direction
while(readin = client.readpartial(contentlength))
puts readin
content = content + readin
end
break
end
end
=end
puts request
client.close
end
end
So... I have just had this issue for the past 2 hours also, and so I did some digging into the Socket API. Turns out Socket extends BasicSocket which has a method recvmsg. When I tried calling it I got the following:
["GET / HTTP/1.1\r\nHost: localhost:12357\r\nConnection: keep-alive\r\nCache-Control: max-age=0\r\nUpgrade-Insecure-Requests: 1\r\nUser-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\nAccept-Encoding: gzip, deflate, br\r\nAccept-Language: en-US,en;q=0.9\r\n\r\n", #<Addrinfo: empty-sockaddr SOCK_STREAM>, 0]
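That is, the complete HTTP request, the sender's address information, and any flags set on the received message.
recvmsg returns whatever data is currently available on the socket, which for a small request like this one is usually the entire HTTP request. The message string is the first element of the returned array:
raw_request, _sender, _flags = client.recvmsg
request = /(?<METHOD>\w+) \/(?<RESOURCE>[^ ]*) HTTP\/1\.\d\r\n(?<HEADERS>(.+\r\n)*)(?:\r\n)?(?<BODY>(.|\s)*)/i.match(raw_request)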
p request["BODY"]
I have no idea how to do it without recvmsg but I am glad the functionality exists.
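For what it's worth, the approach from the question (read the header lines, then read exactly Content-Length bytes) can also be made to work. One thing to watch in the commented-out code is that contentlength is still a String when it is passed to readpartial, and the byte count has to be an Integer. Below is a minimal sketch under assumptions: read_http_request is a hypothetical helper name, it expects an IO-like object such as the accepted TCPSocket, and it only covers bodies announced with Content-Length (no chunked encoding).

```ruby
# Hypothetical helper: reads one HTTP request from an IO-like object and
# returns its request line, headers and body.
def read_http_request(io)
  request_line = io.gets("\r\n")
  return nil if request_line.nil? # the client closed the connection

  headers = {}
  # Header lines end at the first empty line ("\r\n" on its own).
  while (line = io.gets("\r\n")) && line != "\r\n"
    name, value = line.chomp.split(":", 2)
    next if name.nil?
    headers[name.strip.downcase] = value.to_s.strip
  end

  # Content-Length is text on the wire; convert it before reading.
  length = headers["content-length"].to_i
  # read(n) blocks until exactly n bytes have arrived (or EOF).
  body = length > 0 ? io.read(length).to_s : ""

  { request_line: request_line.chomp, headers: headers, body: body }
end
```

In the server from the question this could be called as request = read_http_request(client) inside the Thread.start block.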

Retrieving full request string using Ruby curl

I intend to send a request like the following:
c = Curl::Easy.http_post("https://example.com", json_string
) do |curl|
curl.headers['Accept'] = 'application/json'
curl.headers['Content-Type'] = 'application/json'
curl.headers['Api-Version'] = '2.2'
end
I want to log the exact http request that is being made. Is there a way to get the actual request that was made (base path, query parameters, headers and body)?
The on_debug handler has helped me before. In your example you could try:
curl.on_debug do |type, data|
puts type, data
end
You can get at this information in different ways:
Inside your block you can put:
curl.verbose = true # that prints a detailed output of the connection
Or outside the block:
c.url # return the url with queries
c.total_time # retrieve the total time for the prev transfer (name resolving, TCP,...)
c.header_str # return the response header
c.headers # return your call header
c.body_str # return the body of the response
Remember to call c.perform (if it has not been performed yet) before calling these methods.
Many more options can be found here: http://curb.rubyforge.org/classes/Curl/Easy.html#M000001

Accessing Headers for Net::HTTP::Post in ruby

I have the following bit of code:
uri = URI.parse("https://rs.xxx-travel.com/wbsapi/RequestListenerServlet")
https = Net::HTTP.new(uri.host,uri.port)
https.use_ssl = true
req = Net::HTTP::Post.new(uri.path)
req.body = searchxml
req["Accept-Encoding"] ='gzip'
res = https.request(req)
This normally works fine but the server at the other side is complaining about something in my XML and the techies there need the xml message AND the headers that are being sent.
I've got the xml message, but I can't work out how to get at the Headers that are being sent with the above.
To access headers use the each_header method:
# Header being sent (the request object):
req.each_header do |header_name, header_value|
puts "#{header_name} : #{header_value}"
end
# Works with the response object as well:
res.each_header do |header_name, header_value|
puts "#{header_name} : #{header_value}"
end
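If the people on the other side want the headers and the XML in one piece, the loop above can be wrapped in a small helper. This is only a sketch: describe_request is a hypothetical name, and the result is an approximation of the request rather than the exact bytes on the wire, since Net::HTTP may add headers such as Host or Content-Length only while sending.

```ruby
require "net/http"

# Hypothetical helper: returns a text dump of a Net::HTTP request object
# (method, path, headers set so far, and body).
def describe_request(req)
  lines = ["#{req.method} #{req.path}"]
  # each_header yields header names in lower case.
  req.each_header { |name, value| lines << "#{name}: #{value}" }
  lines << "" << req.body.to_s
  lines.join("\n")
end
```

With the code from the question, puts describe_request(req) right before https.request(req) prints what has been set on the request up to that point.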
You can add:
https.set_debug_output $stderr
before the request, and you will see on the console the actual HTTP request sent to the server.
This is very useful for debugging this kind of scenario.
Take a look at the docs for Net::HTTP's post method. It takes the path of the uri value, the data (XML) you want to post, then the headers you want to set. In current Ruby versions it returns a single Net::HTTPResponse object, and the body is available through its body method.
I can't test this because you've obscured the host, and odds are good it takes a registered account, but the code looks correct from what I remember when using Net::HTTP.
require 'net/http'
require 'uri'
uri = URI.parse("https://rs.xxx-travel.com/wbsapi/RequestListenerServlet")
https = Net::HTTP.new(uri.host, uri.port)
https.use_ssl = true
res = https.post(uri.path, '<xml><blah></blah></xml>', {"Accept-Encoding" => 'gzip'})
puts "#{res.body.to_s.size} bytes received."
res.each{ |h,v| puts "#{h}: #{v}" } # headers of the response
Look at Typhoeus as an alternate, and, in my opinion, easier to use gem, especially the "Making Quick Requests" section.

Resources