Upload file in chunks with progress bar - ruby

I want to upload a file in chunks while updating a progress bar after each chunk, in Ruby, preferably without using any gems or plugins.
I have this POST:
require 'net/http'
uri = URI.parse("http://some/url")
http = Net::HTTP.new(uri.host, uri.port)
req = Net::HTTP::Post.new(uri.path)
req['some'] = 'header'
req.body_stream = File.new('some.file', 'rb') # binary mode, so bytes are sent unchanged
req.content_length = File.size('some.file')
res = http.request req
It uploads the file in one single piece in this line:
res = http.request req
I want to update a progress bar on the side.
The reverse, downloading with a progress bar in pure Ruby, is easy, and you can find references like this:
uri = URI('http://example.com/large_file')
Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri
  http.request request do |response|
    open 'large_file', 'w' do |io|
      response.read_body do |chunk|
        io.write chunk
      end
    end
  end
end
Is there a way to do something similar as above, but for uploads in Ruby?
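One possible approach, as a minimal sketch using only the standard library: wrap the file in an object that counts bytes as they are read, and hand that wrapper to body_stream. This assumes Net::HTTP pulls the request body by calling read on the stream (worth verifying for your Ruby version). ProgressIO is a hypothetical helper name, not part of Net::HTTP. The demonstration copies from memory with IO.copy_stream so it runs without a server; the commented lines show where the wrapper would plug into the POST above.

```ruby
require 'stringio'

# Hypothetical wrapper (not part of Net::HTTP): delegates reads to an IO
# and reports the running byte count to a callback after each read.
class ProgressIO
  def initialize(io, total, &on_progress)
    @io = io
    @total = total
    @sent = 0
    @on_progress = on_progress
  end

  # Same contract as IO#read: returns nil at EOF when a length is given.
  def read(length = nil, outbuf = nil)
    chunk = outbuf ? @io.read(length, outbuf) : @io.read(length)
    if chunk && !chunk.empty?
      @sent += chunk.bytesize
      @on_progress.call(@sent, @total)
    end
    chunk
  end
end

# Local demonstration without a network: copy 40,000 bytes through the wrapper.
data = 'x' * 40_000
progress = []
source = ProgressIO.new(StringIO.new(data), data.bytesize) do |sent, total|
  progress << sent
  puts format('%.1f%%', sent * 100.0 / total)
end
sink = StringIO.new
IO.copy_stream(source, sink)

# Assumed usage with the request from the question:
#   file = File.open('some.file', 'rb')
#   req.body_stream = ProgressIO.new(file, file.size) { |sent, total| ... }
#   req.content_length = file.size
#   res = http.request req
```

The size of each chunk is decided by whoever calls read, so the callback fires once per read rather than at a fixed interval.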

Related

Ruby Net::HTTP passing headers through the creation of request

Maybe I'm just blind, but many posts about passing headers in Net::HTTP follow the lines of
require 'net/http'
uri = URI("http://www.ruby-lang.org")
req = Net::HTTP::Get.new(uri)
req['some_header'] = "some_val"
res = Net::HTTP.start(uri.hostname, uri.port) {|http|
  http.request(req)
}
puts res.body
(From metaphori's answer to "Ruby - Send GET request with headers")
And from the Net::HTTP docs (https://docs.ruby-lang.org/en/2.0.0/Net/HTTP.html)
uri = URI('http://example.com/cached_response')
file = File.stat 'cached_response'
req = Net::HTTP::Get.new(uri)
req['If-Modified-Since'] = file.mtime.rfc2822
res = Net::HTTP.start(uri.hostname, uri.port) {|http|
  http.request(req)
}
open 'cached_response', 'w' do |io|
  io.write res.body
end if res.is_a?(Net::HTTPSuccess)
But what is the advantage of doing the above when you can pass the headers via the following way?
options = {
  'headers' => {
    'Content-Type' => 'application/json'
  }
}
request = Net::HTTP::Get.new('http://www.stackoverflow.com/', options['headers'])
This allows you to parameterize the headers and can allow for multiple headers very easily.
My main question is: what is the advantage of passing the headers when creating Net::HTTP::Get versus setting them after it has been created?
Net::HTTPHeader already goes ahead and assigns the headers in this function:
def initialize_http_header(initheader)
  @header = {}
  return unless initheader
  initheader.each do |key, value|
    warn "net/http: duplicated HTTP header: #{key}", uplevel: 1 if key?(key) and $VERBOSE
    if value.nil?
      warn "net/http: nil HTTP header: #{key}", uplevel: 1 if $VERBOSE
    else
      value = value.strip # raise error for invalid byte sequences
      if value.count("\r\n") > 0
        raise ArgumentError, 'header field value cannot include CR/LF'
      end
      @header[key.downcase] = [value]
    end
  end
end
So doing
request['some_header'] = "some_val" almost seems like code duplication.
There is no advantage to setting headers one way or the other, at least not that I can think of. It comes down to your own preference. In fact, if you take a look at what happens when you supply headers while initializing a new Net::HTTP::Get, you will find that internally, Ruby simply stores the headers in an @header instance variable:
https://github.com/ruby/ruby/blob/c5eb24349a4535948514fe765c3ddb0628d81004/lib/net/http/header.rb#L25
And if you set the headers using request[name] = value, you can see that Net::HTTP does the exact same thing, but in a different method:
https://github.com/ruby/ruby/blob/c5eb24349a4535948514fe765c3ddb0628d81004/lib/net/http/header.rb#L46
So the resulting object has the same configuration no matter which way you decide to pass the request headers.
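As a quick check of that claim, here is a small sketch that builds one request with headers passed at construction and another with headers assigned afterwards, then compares the stored values. The variable names at_init and after_init are illustrative only, and no network request is made.

```ruby
require 'net/http'

headers = { 'Content-Type' => 'application/json' }

# Headers passed at construction time.
at_init = Net::HTTP::Get.new('/', headers)

# Headers assigned one by one after construction.
after_init = Net::HTTP::Get.new('/')
headers.each { |name, value| after_init[name] = value }

# Lookups are case-insensitive because keys are stored downcased.
puts at_init['content-type']
puts after_init['content-type']
puts at_init.to_hash['content-type'] == after_init.to_hash['content-type']
```

Passing a hash at construction is convenient when the headers are already parameterized; assigning them afterwards reads well when only one or two are set conditionally.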

Issue while fetching data from nested json

I am trying to fetch data from a nested json. Not able to understand the issue over here. Please ignore the fields that I am passing to ChildArticle class. I can sort that out.
URL for JSON - http://api.nytimes.com/svc/mostpopular/v2/mostshared/all-sections/email/30.json?api-key=31fa4521f6572a0c05ad6822ae109b72:2:72729901
Below is my code:
require 'net/http'
require 'json'
require 'date'
url = 'http://api.nytimes.com'
#Define the HTTP object
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
#If the api being scraped uses https, then set use_ssl to true.
http.use_ssl = false
#Define the request_url
#Make a GET request to the given url
request = '/svc/mostpopular/v2/mostshared/all-sections/email/30.json?api-key=31fa4521f6572a0c05ad6822ae109b72:2:72729901'
response = http.send_request('GET', request)
#Parse the response body
forecast = JSON.parse(response.body)
forecast["results"]["result"].each do |item|
  date = Date.parse(item["published_date"].to_s)
  if (@start <= date) && (@end >= date)
    article = News::ChildArticle.new(author: item["author"], title: item["title"], summary: item["abstract"],
                                     images: item["images"], source: item["url"], date: item["published_date"],
                                     guid: item["guid"], link: item["link"], section: item["section"],
                                     item_type: item["item_type"], updated_date: item["updated_date"],
                                     created_date: item["created_date"],
                                     material_type_facet: item["material_type_facet"])
    @articles.concat([article])
  end
end
I get the error below:
`[]': no implicit conversion of String into Integer (TypeError)
at forecast["results"]["result"].each do |item|
Looks like forecast['results'] is simply an array, not a hash, so indexing it with the string "result" raises the TypeError.
Take a look at this slightly modified script. Give it a run in your terminal, and check out its output.
require 'net/http'
require 'json' # lowercase; 'JSON' fails to load on case-sensitive file systems
url = 'http://api.nytimes.com'
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = false
request = '/svc/mostpopular/v2/mostshared/all-sections/email/30.json?api-key=31fa4521f6572a0c05ad6822ae109b72:2:72729901'
response = http.send_request('GET', request)
forecast = JSON.parse(response.body)
forecast["results"].each.with_index do |item, i|
  puts "Item #{i}:"
  puts '--'
  item.each do |k, v|
    puts "#{k}: #{v}"
  end
  puts '----'
end
Also, you may want to inspect the JSON structure of the API return from that URL. If you go to that URL, open your JavaScript console, and paste in
JSON.parse(document.body.textContent)
you can inspect the JSON structure very easily.
Another option would be downloading the response to a JSON file, and inspecting it in your editor. You'll need a JSON prettifier though.
File.open('response.json', 'w') do |f|
  f.write(response.body)
end
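The same TypeError can be reproduced offline. The sketch below uses a made-up JSON string that only assumes the shape described above (a top-level "results" key holding an array of objects); the field values are invented for illustration and are not taken from the real API response.

```ruby
require 'json'

# Invented sample with the assumed shape: "results" is an array of objects.
body = <<~JSON
  {
    "status": "OK",
    "results": [
      { "title": "First article",  "published_date": "2015-01-01" },
      { "title": "Second article", "published_date": "2015-01-02" }
    ]
  }
JSON

forecast = JSON.parse(body)
puts forecast['results'].class

# Indexing an Array with a String is what raises the error from the question.
error_message = nil
begin
  forecast['results']['result']
rescue TypeError => e
  error_message = e.message
end
puts error_message

# Iterating the array directly works.
titles = forecast['results'].map { |item| item['title'] }
puts titles
```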

Net::HTTP.get show progress

require 'net/http'
File.write(file_name, Net::HTTP.get(URI.parse(url)))
I want to show the user what's happening here, something like a progress indicator, because the file can be large. It should show only the information the user is interested in, not all the debug information.
Does Net::HTTP.get have such an ability?
You can find information on that here: http://ruby-doc.org/stdlib-2.1.1/libdoc/net/http/rdoc/Net/HTTP.html#class-Net::HTTP-label-Streaming+Response+Bodies
The example snippet used in the docs for just such a thing is:
require 'net/http'
uri = URI("http://apps.sfgov.org/datafiles/view.php?file=sfgis/citylots.zip")
Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri
  http.request request do |response|
    file_size = response['content-length'].to_i
    amount_downloaded = 0
    open 'large_file', 'wb' do |io| # 'b' opens the file in binary mode
      response.read_body do |chunk|
        io.write chunk
        amount_downloaded += chunk.size
        # '%%' prints a literal percent sign; a lone trailing '%' is an incomplete format specifier
        puts "%.2f%%" % (amount_downloaded.to_f / file_size * 100)
      end
    end
  end
end

Ruby + Net::HTTP: How do I send two XML documents in one POST request?

I have to send two XML documents in my request to the UPS API (here's my original question What is the root of this XML document? )
How would I do this?
def make_initial_request
  uri = URI.parse(UPS_API['confirm_url'])
  https = Net::HTTP.new(uri.host, uri.port)
  https.use_ssl = true
  headers = {'Content-Type' => 'text/xml'}
  request = Net::HTTP::Post.new(uri.path, headers)
  request.body = xml_for_initial_request #<-- how do i split this into two documents?
  #request.body = second_xml_document #<-- i want something like that. could i just use << ?
  begin
    response = https.request(request)
  rescue
    return nil
  end
  puts "response: #{response.code} #{response.message}: #{response.body}"
  return nil if response.body.include?("Error")
end
You should use MIME multipart messages if the API supports them (Ruby gem).
Otherwise, just try concatenating the contents of the two documents:
request.body = "#{xml_for_initial_request}\n#{second_xml_document}"

Sinatra streaming response with headers

I want to proxy remote files through a Sinatra application. This requires streaming an HTTP response with headers from a remote source back to the client, but I can't figure out how to set the headers of the response while using the streaming API inside the block provided by Net::HTTP#get_response.
For example, this will not set response headers:
get '/file' do
  stream do |out|
    uri = URI("http://manuals.info.apple.com/en/ipad_user_guide.pdf")
    Net::HTTP.get_response(uri) do |file|
      headers 'Content-Type' => file.header['Content-Type']
      file.read_body { |chunk| out << chunk }
    end
  end
end
And this results in the error: Net::HTTPOK#read_body called twice (IOError):
get '/file' do
  response = nil
  uri = URI("http://manuals.info.apple.com/en/ipad_user_guide.pdf")
  Net::HTTP.get_response(uri) do |file|
    headers 'Content-Type' => file.header['Content-Type']
    response = stream do |out|
      file.read_body { |chunk| out << chunk }
    end
  end
  response
end
I could be wrong, but after thinking about this a bit, it appears that headers set from inside the stream helper block are not applied to the response because execution of that block is deferred. So the route block has probably finished evaluating, and the response headers have already been set, before the stream block begins executing.
A possible workaround for this is issuing a HEAD request before streaming back the contents of the file.
For example:
get '/file' do
  uri = URI('http://manuals.info.apple.com/en/ipad_user_guide.pdf')
  # get only header data
  head = Net::HTTP.start(uri.host, uri.port) do |http|
    http.head(uri.request_uri)
  end
  # set headers accordingly (all that apply)
  headers 'Content-Type' => head['Content-Type']
  # stream back the contents
  stream do |out|
    Net::HTTP.get_response(uri) do |f|
      f.read_body { |ch| out << ch }
    end
  end
end
It may not be ideal for your use case because of the additional request, but the HEAD response should be small enough that the delay is not much of a problem, and it has the added benefit that your app can react if that request fails before any data is sent back.
Hope it helps.

Resources