Forward a file download in sinatra using streaming - ruby

I have built a Ruby/Sinatra website and need to let the user download a file.
The file is not hosted locally; it is hosted on a remote API. The end user must not see the true origin of the file.
get "/files/:elementKey/masterfile" do
content_type "application/octet-stream"
loadMasterfile(params[:elementKey])
end
With loadMasterfile:
http = Net::HTTP.new(plainURI,443)
http.use_ssl = true;
http.start do |http|
req = Net::HTTP::Get.new(resource, {"User-Agent" =>"API downloader"})
req.basic_auth(user.keytechUserName, user.keytechPassword)
response = http.request(req)
# return this as a file attachment
attachment( response["X-Filename"]) #Use the sinatra helper to set this as filename
response.body # Sinatra downloads the whole file first, then forwards the full content to the browser
end
This code works, but:
The file is first downloaded in full to the Ruby/Sinatra server and only then forwarded to the browser.
The user must wait until the download starts, and the browser seems to freeze in the meantime.
Is there a way to start a download from a remote API and forward the contents in one flow?
I found nothing about this, only solutions for local file downloads, but I must download the file from a remote API.
I also cannot cache the file locally or on Amazon AWS.
Any ideas?

To achieve this in a streaming fashion in which your app is the proxy, you'll need to send the client chunks as you are downloading chunks. This is not the default behavior of ruby / Net::HTTP, but it is possible.
From the ruby docs:
By default Net::HTTP reads an entire response into memory. If you are handling large files or wish to implement a progress bar you can instead stream the body directly to an IO.
Streaming is possible through read_body, though.
Net::HTTP Streaming Response Bodies
Example usage from the docs:
uri = URI('http://example.com/large_file')
Net::HTTP.start(uri.host, uri.port) do |http|
request = Net::HTTP::Get.new uri
http.request request do |response|
open 'large_file', 'w' do |io|
response.read_body do |chunk|
io.write chunk
end
end
end
end
This example from the docs writes the streaming data to a file, but you could replace it with writes to your response stream. In combination with Sinatra's streaming api, the code might look like this:
get "/files/:elementKey/masterfile" do
content_type "application/octet-stream"
stream do |out|
loadMasterfile(params[:elementKey]) do |chunk|
out << chunk
end
end
end
def loadMasterfile(resource, &block)
http = Net::HTTP.new(plainURI, 443)
http.use_ssl = true
http.start do |http|
req = Net::HTTP::Get.new(resource, {"User-Agent" =>"API downloader"})
req.basic_auth(user.keytechUserName, user.keytechPassword)
http.request(req) do |origin_response|
origin_response.read_body(&block)
end
end
end
I'm not sure how you'd set the filename. You'd also want to handle errors appropriately in the net calls and stream close. Also note that a front-end like nginx can affect the buffering / chunking of streaming responses.
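The error handling mentioned above could be sketched roughly as follows. This is only a sketch under assumptions: `UpstreamError` and `each_upstream_chunk` are hypothetical names, and the helper expects an already-started connection (`http`) plus a prepared request. It refuses to stream anything when the origin does not answer with a 2xx status, and it wraps common connection failures in a single error class.

```ruby
require "net/http"

# Hypothetical error class for anything that goes wrong while talking to the origin.
class UpstreamError < StandardError; end

# Hypothetical helper: yields body chunks from the origin as they arrive.
# `http` is a started Net::HTTP connection, `request` a prepared request object.
def each_upstream_chunk(http, request, &block)
  http.request(request) do |upstream|
    # Do not forward an error page from the origin as if it were the file.
    unless upstream.is_a?(Net::HTTPSuccess)
      raise UpstreamError, "upstream answered #{upstream.code} #{upstream.message}"
    end
    upstream.read_body(&block)
  end
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, SystemCallError => e
  # Collapse low-level network failures into one error the route can rescue.
  raise UpstreamError, "upstream connection failed: #{e.class}"
end
```

Inside the `stream` block you would rescue `UpstreamError`, log it, and let the stream end. By the time that block runs, the response headers have typically already been sent, so the status code can no longer be changed at that point.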

Related

Streaming data in Ruby net/http PUT request

In the Ruby-doc Net/HTTP there is a detailed example for streaming response bodies - it applies when you try to download a large file.
I am looking for an equivalent code snippet to upload a file via PUT. I spent quite a bit of time trying to make the code work, with no luck. I think I need to implement a particular interface and pass it to request.body_stream.
I need streaming because I want to alter the content of the file while it is being uploaded so I want to have access to the buffered chunks while the upload takes place. I would gladly use a library like http.rb or rest-client as long as I can use streaming.
Thanks in advance!
For reference, the following is the working non-streamed version:
uri = URI("http://localhost:1234/upload")
Net::HTTP.start(uri.host, uri.port) do |http|
request = Net::HTTP::Put.new uri
File.open("/home/files/somefile") do |f|
request.body = f.read()
end
# Ideally I would need to use **request.body_stream** instead of **body** to get streaming
http.request request do |response|
response.read_body do |result|
# display the operation result
puts result
end
end
end
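One possible direction, sketched under assumptions rather than taken from the thread: `request.body_stream` accepts any object that responds to `read`, so the file can be wrapped in a small reader that passes every chunk through a block before it is sent. `TransformingReader` and `streaming_put` are hypothetical names. Because the transformed size is not known in advance, the sketch uses chunked transfer encoding, which the receiving server must support.

```ruby
require "net/http"
require "uri"

# Hypothetical wrapper: reads from an underlying IO and passes every chunk
# through the given block before handing it to the caller (here: Net::HTTP).
class TransformingReader
  def initialize(io, &transform)
    @io = io
    @transform = transform
  end

  # Mirrors IO#read(length, outbuf): returns nil at EOF when a length is given.
  # Note: the transformed chunk may be longer or shorter than `length`.
  def read(length = nil, outbuf = nil)
    chunk = @io.read(length)
    if chunk.nil?
      outbuf&.clear
      return nil
    end
    data = @transform.call(chunk)
    # Callers that pass a buffer expect it to be filled in place.
    outbuf ? outbuf.replace(data) : data
  end
end

# Hypothetical helper: PUTs a file while transforming it chunk by chunk.
def streaming_put(uri, path, &transform)
  File.open(path, "rb") do |file|
    request = Net::HTTP::Put.new(uri)
    request["Transfer-Encoding"] = "chunked" # final length is unknown
    request.body_stream = TransformingReader.new(file, &transform)
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.request(request)
    end
  end
end

# Usage (not executed here), with the URL and path from the question:
#   streaming_put(URI("http://localhost:1234/upload"), "/home/files/somefile") { |chunk| chunk.upcase }
```

If the transformation keeps the size unchanged, setting `Content-Length` instead of chunked encoding should also work.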

Wait for selector to present

When doing web scraping with Nokogiri I occasionally get the following error message
undefined method `at_css' for nil:NilClass (NoMethodError)
I know that the selected element is present at some time, but the site is sometimes a bit slow to respond, and I guess this is the reason why I'm getting the error.
Is there some way to wait until a certain selector is present before proceeding with the script?
My current http request block looks like this
url = URL
body = BODY
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.read_timeout = 200 # default 60 seconds
http.open_timeout = 200 # default nil
http.use_ssl = true
request = Net::HTTP::Post.new(uri.request_uri)
request.body = body
request["Content-Type"] = "application/x-www-form-urlencoded"
begin
response = http.request(request)
doc = Nokogiri::HTML(response.body)
rescue
sleep 100
retry
end
While you can use a streaming Net::HTTP like @Stefan says in his comment, with an associated handler that includes Nokogiri, you can't parse a partial HTML document using a DOM model, which is Nokogiri's default, because it expects the full document.
You could use Nokogiri's SAX parser, but that's an entirely different programming style.
If you're retrieving an entire page, then use OpenURI instead of the lower-level Net::HTTP. It automatically handles a number of things that Net::HTTP will not do by default, such as redirection, which makes it a lot easier to retrieve pages and will greatly simplify your code.
I suspect the problem is either that the site is timing out, or the tag you're trying to find is dynamically loaded after the real page loads.
If it's timing out you'll need to increase your wait time.
If it's dynamically loading that markup, you can request the main page, locate the appropriate URL for the dynamic content and load it separately. Once you have it, you can either insert it into the first page if you need everything, or just parse it separately.
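If the server only sometimes returns an incomplete page, a bounded retry that treats a missing element as "not there yet" may be enough. The sketch below is an assumption-laden illustration, not part of the answer: `retry_until_present` is a hypothetical helper, and the Nokogiri/OpenURI usage is shown only in comments because those libraries are not loaded here. It will not help if the markup is injected by JavaScript after the page loads.

```ruby
# Hypothetical helper: runs the block until it returns a non-nil value,
# waiting `delay` seconds between attempts, and gives up after `attempts` tries.
def retry_until_present(attempts: 5, delay: 2)
  attempts.times do |i|
    result = yield
    return result unless result.nil?
    sleep delay if i < attempts - 1 # no point sleeping after the last try
  end
  nil
end

# Usage (not executed here; assumes the nokogiri gem and a GET-able page):
#   require "open-uri"
#   require "nokogiri"
#   node = retry_until_present(attempts: 3, delay: 10) do
#     Nokogiri::HTML(URI.open(url)).at_css("div.result")
#   end
#   abort "selector never appeared" if node.nil?
```

Unlike the `rescue` / `retry` loop in the question, this gives up after a fixed number of attempts instead of retrying forever.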

Azure rest API images missing when listing images

So when I list the images using the Ruby SDK, I get all of the publicly available ones, but the ones that I have created myself are not included. They do show up in the web console though... I've even tried using the REST API and constructed a Net::HTTP object as illustrated here. I get a 5xx error after setting the Content-Length (even though it isn't listed as a required header) to anything, including 0... I have had success using the same code on other Azure RESTful URLs, so I am unsure as to why this specific one is giving me an error....
Does anyone have any clue as to why my images aren't listed? Any experience with the endpoint linked above? Just FYI, here's my Ruby request code:
# HTTP request code
def get(uri)
uri = URI.parse(uri)
pem = File.read('/path/to/management_cert')
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.cert = OpenSSL::X509::Certificate.new(pem)
http.key = OpenSSL::PKey::RSA.new(pem)
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
request = Net::HTTP::Get.new(uri.request_uri)
request['x-ms-version'] = '2014-06-01'
request['Content-Length'] = 0
http.request(request)
end
Here is the calling code:
# The invoking line
get 'https://management.core.windows.net/<subscription-id>/services/vmimages'
Read through the API documentation first. Here is the link to the REST API reference: http://msdn.microsoft.com/en-us/library/azure/dn499770.aspx
You are making a POST request instead of a GET. The method should be GET:
request = Net::HTTP::Get.new(uri.request_uri)
You have to set Content-Length
I found the answer (kinda).... I guess the servers were having an issue that day as I re-ran the code and got the data I needed... The above code (now fixed) works!!!

React on HTTP response before response is done?

Is it possible with any Ruby library to start doing something when a pattern is matched from a HTTP response, before the HTTP session is finished/closed and before the entire result is fetched from the server?
Pseudo code:
http.get 'http://example.org/foo.json' do |response|
run_this_function if /\"field\":\"data\"/ =~ response.body_str
end
I want something similar to oboe.js, but in Ruby.
Normally Net::HTTP will pull the entire body into memory, but you can change that behavior into streaming. From the documentation:
Streaming Response Bodies
By default Net::HTTP reads an entire response into memory. If you are handling large files or wish to implement a progress bar you can instead stream the body directly to an IO.
uri = URI('http://example.com/large_file')
Net::HTTP.start(uri.host, uri.port) do |http|
request = Net::HTTP::Get.new uri
http.request request do |response|
open 'large_file', 'w' do |io|
response.read_body do |chunk|
io.write chunk
end
end
end
end
You'll want your code to camp out in the read_body block. See the documentation for read_body as there is additional information you should be aware of, but basically it says:
If a block is given, the body is passed to the block, and the body is provided in fragments, as it is read in from the socket.
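Since a pattern can straddle two fragments, matching each chunk in isolation may miss it. A minimal sketch of one way around that is to accumulate the fragments and test the buffer after each one. `StreamMatcher` is a hypothetical name, and the buffer here grows without bound, which a real implementation would probably want to cap.

```ruby
# Hypothetical helper: collects body fragments and fires the callback once,
# as soon as the pattern matches the data received so far.
class StreamMatcher
  def initialize(pattern, &on_match)
    @pattern = pattern
    @on_match = on_match
    @buffer = String.new # binary buffer, like the chunks Net::HTTP yields
    @matched = false
  end

  def matched?
    @matched
  end

  # Append one fragment; returns self so calls can be chained.
  def <<(chunk)
    return self if @matched
    @buffer << chunk
    if (match = @pattern.match(@buffer))
      @matched = true
      @on_match&.call(match)
    end
    self
  end
end

# Usage with Net::HTTP (not executed here):
#   matcher = StreamMatcher.new(/"field":"data"/) { run_this_function }
#   http.request(request) do |response|
#     response.read_body { |chunk| matcher << chunk }
#   end
```

The callback runs while the rest of the body is still arriving, so long-running work inside it delays reading from the socket.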

How do I access the Kippt API through Ruby without an external library?

I want to access the Kippt API through Ruby without any external libraries: anything that ships with Ruby, including the standard library, is fine, but nothing else.
How should I go about doing this? Please detail the process.
This is very basic access, showing it is possible:
require "net/https"
require "uri"
require "json" # needed for JSON.parse below
uri = URI.parse( 'https://kippt.com/api/users/1/' )
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER # verify the server certificate; VERIFY_NONE would disable that check
request = Net::HTTP::Get.new(uri.request_uri)
response = http.request(request)
data = JSON.parse( response.body )
=> {
"username"=>"jorilallo",
"bio"=>"Co-founder of Kippt. I love building products.",
"app_url"=>"/jorilallo",
"avatar_url"=>"https://d19weqihs4yh5u.cloudfront.net/avatars/147d86b9-0830-49d8-a449-0421a6a4bf05/160x160",
"twitter"=>"jorilallo",
"id"=>1, "github"=>"jorde",
"website_url"=>"http://about.me/jorilallo",
"full_name"=>"Jori Lallo",
"dribbble"=>"jorilallo",
"counts"=>{"follows"=>1192, "followed_by"=>23628},
"is_pro"=>true, "resource_uri"=>"/api/users/1/"
}
There is a fair amount of work to take this demonstration and put it into some re-usable code that copes with authentication, posting params, request failure and other standard issues for HTTP-based APIs.
I'd suggest reading http://www.rubyinside.com/nethttp-cheat-sheet-2940.html for some examples of how to build and process the requests in more detail. That's how I did the above (until writing the answer, I'd never used Ruby's net/http directly before, and I just grabbed a likely looking block of code from that site).
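As a rough idea of what the reusable version might look like, here is a small sketch using only the standard library. All helper names (`ApiError`, `build_get`, `parse_json_response`, `get_json`) are hypothetical, and the use of HTTP basic auth is an assumption about the target API, not something confirmed above.

```ruby
require "net/http"
require "json"
require "uri"

# Hypothetical error class for non-2xx answers.
class ApiError < StandardError; end

# Builds a GET request that asks for JSON; basic auth is optional and
# only an assumption about how the API authenticates.
def build_get(uri, username: nil, password: nil)
  request = Net::HTTP::Get.new(uri.request_uri)
  request["Accept"] = "application/json"
  request.basic_auth(username, password) if username
  request
end

# Raises on failure instead of trying to parse an error page as JSON.
def parse_json_response(response)
  unless response.is_a?(Net::HTTPSuccess)
    raise ApiError, "HTTP #{response.code} #{response.message}"
  end
  JSON.parse(response.body)
end

# Ties the pieces together for a single URL.
def get_json(url, **auth)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    parse_json_response(http.request(build_get(uri, **auth)))
  end
end

# Usage (not executed here):
#   data = get_json("https://kippt.com/api/users/1/")
#   puts data["username"]
```

Splitting request building from response handling keeps both parts easy to test without touching the network.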

Resources