require 'net/http'
File.write(file_name, Net::HTTP.get(URI.parse(url)))
I want to show the user what's happening here, something like progress, because the file can be big. But only the information the user is interested in, not all the debug output.
Does Net::HTTP.get have such an ability?
You can find information on that here: http://ruby-doc.org/stdlib-2.1.1/libdoc/net/http/rdoc/Net/HTTP.html#class-Net::HTTP-label-Streaming+Response+Bodies
The example snippet used in the docs for just such a thing is:
require 'net/http'

uri = URI("http://apps.sfgov.org/datafiles/view.php?file=sfgis/citylots.zip")

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri

  http.request request do |response|
    file_size = response['content-length'].to_i
    amount_downloaded = 0

    open 'large_file', 'wb' do |io| # 'b' opens the file in binary mode
      response.read_body do |chunk|
        io.write chunk
        amount_downloaded += chunk.size
        puts "%.2f%%" % (amount_downloaded.to_f / file_size * 100) # '%%' prints a literal percent sign
      end
    end
  end
end
Related
I want to upload a file in chunks, updating a progress bar after each chunk, in Ruby, preferably without using any gems or plugins.
I have this POST:
uri = URI.parse("http://some/url")
http = Net::HTTP.new(uri.host, uri.port)
req = Net::HTTP::Post.new(uri.path)
req['some'] = 'header'
req.body_stream = File.new('some.file')
req.content_length = File.size('some.file')
res = http.request req
It uploads the file in one single piece in this line:
res = http.request req
I want to update a progress bar on the side.
The reverse, downloading with a progress bar in pure Ruby, is easy, and you can find references like this:
uri = URI('http://example.com/large_file')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri

  http.request request do |response|
    open 'large_file', 'wb' do |io| # 'b' opens the file in binary mode
      response.read_body do |chunk|
        io.write chunk
      end
    end
  end
end
Is there a way to do something similar to the above, but for uploads in Ruby?
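One gem-free approach is to hand Net::HTTP an object that wraps the file and counts the bytes pulled out of it: body_stream only needs something that responds to read, and Net::HTTP streams it in chunks. A minimal sketch, reusing the URL and header from the question (ProgressIO is a hypothetical helper name, not a library class):

require 'net/http'
require 'uri'

# Wraps a file and reports progress each time Net::HTTP reads a chunk.
class ProgressIO
  attr_reader :size

  def initialize(path, &on_progress)
    @io = File.open(path, 'rb')
    @size = File.size(path)
    @sent = 0
    @on_progress = on_progress
  end

  # Net::HTTP calls read(length, buffer) to stream the request body.
  def read(length = nil, outbuf = nil)
    chunk = @io.read(length, outbuf)
    if chunk
      @sent += chunk.bytesize
      @on_progress.call(@sent, @size)
    end
    chunk
  end
end

uri = URI.parse("http://some/url") # placeholder URL from the question
body = ProgressIO.new('some.file') do |sent, total|
  puts "%.2f%%" % (sent.to_f / total * 100)
end

req = Net::HTTP::Post.new(uri.path)
req['some'] = 'header'
req.body_stream = body
req.content_length = body.size

res = Net::HTTP.start(uri.host, uri.port) { |http| http.request(req) }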
With Ruby, no Rails, how can I call an API such as http://api.anapi.com/, then get a value from the response and check whether it is greater than 5?
Say the response contains an array called "anarray" which contains hashes; in one of those hashes I want to get the value of the key "key".
Right now I use:
require "net/https"
require "uri"
uri = URI.parse("http://api.cryptocoincharts.info/tradingPair/eth_btc")
http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Get.new(uri.request_uri)
response = http.request(request)
puts response.body
And I get: #<StringIO:0x2cadb90>
Figured it out:
# http://ruby-doc.org/stdlib-2.0.0/libdoc/open-uri/rdoc/OpenURI.html
require 'open-uri'
# https://github.com/flori/json
require 'json'

buffer = open('http://api.cryptocoincharts.info/tradingPair/eth_btc').read
result = JSON.parse(buffer)
puts result["markets"]
With Net::HTTP:
require 'net/http'
uri = URI('http://example.com/index.html?count=10')
Net::HTTP.get(uri) # => String
Then you can do whatever you want with the data. If, for example, the API returns JSON, you can parse the string with Ruby's JSON module.
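For instance, to answer the original question, a minimal sketch, assuming the API returns JSON shaped as the question describes ("anarray" holding hashes with a numeric "key"; both names are the hypothetical ones from the question):

require 'net/http'
require 'json'

uri = URI('http://api.anapi.com/') # the example API from the question
data = JSON.parse(Net::HTTP.get(uri))

# "anarray" and "key" are the hypothetical names from the question.
value = data['anarray'].first['key'].to_f
puts value > 5 ? 'greater than 5' : 'not greater than 5'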
Parsing a Net::HTTPResponse with Nokogiri
Hi, I am having trouble parsing Net::HTTPResponse objects with Nokogiri.
I use this function to fetch a website:
def fetch(uri_str, limit = 10)
  # You should choose better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0

  url = URI.parse(URI.encode(uri_str.strip))
  puts url

  # get path
  req = Net::HTTP::Get.new(url.path, headers)

  # start TCP/IP
  response = Net::HTTP.start(url.host, url.port) { |http|
    http.request(req)
  }

  case response
  when Net::HTTPSuccess
    # print final redirect to a file
    puts "this is location" + uri_str
    puts "this is the host #{url.host}"
    puts "this is the path #{url.path}"
    return response
  # if you get a 302 response
  when Net::HTTPRedirection
    puts "this is redirect" + response['location']
    return fetch(response['location'], aFile, limit - 1)
  else
    response.error!
  end
end

html = fetch("http://www.somewebsite.com/hahaha/")
puts html
noko = Nokogiri::HTML(html)
When I do this, html prints a whole bunch of gibberish, and Nokogiri complains that "node_set must be a Nokogiri::XML::NodeSet".
If anyone could offer help it would be much appreciated.
First thing: your fetch method returns a Net::HTTPResponse object, not just the body. You should pass the body to Nokogiri.
response = fetch("http://www.somewebsite.com/hahaha/")
puts response.body
noko = Nokogiri::HTML(response.body)
I've updated your script so it's runnable (below). A couple of things were undefined.
require 'nokogiri'
require 'net/http'

def fetch(uri_str, limit = 10)
  # You should choose better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0

  url = URI.parse(URI.encode(uri_str.strip))
  puts url

  # get path
  headers = {}
  req = Net::HTTP::Get.new(url.path, headers)

  # start TCP/IP
  response = Net::HTTP.start(url.host, url.port) { |http|
    http.request(req)
  }

  case response
  when Net::HTTPSuccess
    # print final redirect to a file
    puts "this is location" + uri_str
    puts "this is the host #{url.host}"
    puts "this is the path #{url.path}"
    return response
  # if you get a 302 response
  when Net::HTTPRedirection
    puts "this is redirect" + response['location']
    return fetch(response['location'], limit - 1)
  else
    response.error!
  end
end

response = fetch("http://www.google.com/")
puts response
noko = Nokogiri::HTML(response.body)
puts noko
The script gives no error and prints the content. You may be getting a Nokogiri error due to the content you're receiving; one common problem I've encountered with Nokogiri is character encoding. Without the exact error it's impossible to tell what's going on.
I'd recommend looking at the following Stack Overflow questions:
ruby 1.9: invalid byte sequence in UTF-8 (specifically this answer)
How to convert a Net::HTTP response to a certain encoding in Ruby 1.9.1?
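If encoding does turn out to be the culprit, a minimal sketch (UTF-8 is an assumption here; Nokogiri::HTML accepts an encoding name as its third argument):

# Force an encoding when parsing, in case the server mislabels its content.
noko = Nokogiri::HTML(response.body, nil, 'UTF-8')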
I've got a URL and I'm using HTTP GET to pass a query along to a page. What happens with the most recent attempt (using net/http) is that the script doesn't go beyond the 302 response. I've tried several different solutions: HTTPClient, net/http, Rest-Client, Patron...
I need a way to continue to the final page in order to validate an attribute tag in that page's HTML. The redirection happens because a mobile user agent hits a page that redirects to a mobile view, hence the mobile user agent in the header. Here is my code as it is today:
require 'uri'
require 'net/http'

class Check_Get_Page
  def more_http
    url = URI.parse('my_url')
    req = Net::HTTP::Get.new(url.path, {
      'User-Agent' => 'Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3_2 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8H7 Safari/6533.18.5'
    })
    res = Net::HTTP.start(url.host, url.port) { |http|
      http.request(req)
    }
    cookie = res.response['set-cookie']
    puts 'Body = ' + res.body
    puts 'Message = ' + res.message
    puts 'Code = ' + res.code
    puts "Cookie \n" + cookie
  end
end

m = Check_Get_Page.new
m.more_http
Any suggestions would be greatly appreciated!
To follow redirects, you can do something like this (taken from ruby-doc)
Following Redirection
require 'net/http'
require 'uri'

def fetch(uri_str, limit = 10)
  # You should choose better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0

  url = URI.parse(uri_str)
  req = Net::HTTP::Get.new(url.path, { 'User-Agent' => 'Mozilla/5.0 (etc...)' })
  # only enable TLS when the URL is actually https
  response = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') { |http|
    http.request(req)
  }

  case response
  when Net::HTTPSuccess then response
  when Net::HTTPRedirection then fetch(response['location'], limit - 1)
  else
    response.error!
  end
end

print fetch('http://www.ruby-lang.org/')
Given a URL that redirects
url = 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fhttpbin.org%2Fredirect-to%3Furl%3Dhttp%3A%2F%2Fexample.org'
A. Net::HTTP
begin
  response = Net::HTTP.get_response(URI.parse(url))
  url = response['location']
end while response.is_a?(Net::HTTPRedirection)
Make sure that you handle the case when there are too many redirects.
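For example, a minimal sketch that caps the loop above (the limit of 10 is arbitrary):

require 'net/http'

response = nil
10.times do
  response = Net::HTTP.get_response(URI.parse(url))
  break unless response.is_a?(Net::HTTPRedirection)
  url = response['location']
end
raise 'Too many HTTP redirects' if response.is_a?(Net::HTTPRedirection)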
B. OpenURI
require 'open-uri'
open(url).read
OpenURI::OpenRead#open follows redirects by default, but it doesn't limit the number of redirects.
I wrote another class for this based on the examples given here, thank you very much, everybody. I added cookies, parameters and exceptions, and finally got what I need: https://gist.github.com/sekrett/7dd4177d6c87cf8265cd
require 'uri'
require 'net/http'
require 'openssl'

class UrlResolver
  def self.resolve(uri_str, agent = 'curl/7.43.0', max_attempts = 10, timeout = 10)
    attempts = 0
    cookie = nil

    until attempts >= max_attempts
      attempts += 1

      url = URI.parse(uri_str)
      http = Net::HTTP.new(url.host, url.port)
      http.open_timeout = timeout
      http.read_timeout = timeout
      path = url.path
      path = '/' if path == ''
      path += '?' + url.query unless url.query.nil?

      params = { 'User-Agent' => agent, 'Accept' => '*/*' }
      params['Cookie'] = cookie unless cookie.nil?
      request = Net::HTTP::Get.new(path, params)

      if url.instance_of?(URI::HTTPS)
        http.use_ssl = true
        http.verify_mode = OpenSSL::SSL::VERIFY_NONE
      end

      response = http.request(request)

      case response
      when Net::HTTPSuccess
        break
      when Net::HTTPRedirection
        location = response['Location']
        cookie = response['Set-Cookie']
        new_uri = URI.parse(location)
        uri_str = if new_uri.relative?
                    (url + location).to_s # resolve relative redirects against the current URL
                  else
                    new_uri.to_s
                  end
      else
        raise 'Unexpected response: ' + response.inspect
      end
    end

    raise 'Too many http redirects' if attempts == max_attempts
    uri_str
    # response.body
  end
end

puts UrlResolver.resolve('http://www.ruby-lang.org')
puts UrlResolver.resolve('http://www.ruby-lang.org')
The reference that worked for me is here: http://shadow-file.blogspot.co.uk/2009/03/handling-http-redirection-in-ruby.html
Compared to most examples (including the accepted answer here), it's more robust: it handles URLs that are just a domain (http://example.com needs a trailing / added), handles SSL explicitly, and also handles relative URLs.
Of course you would be better off using a library like RESTClient in most cases, but sometimes the low-level detail is necessary.
Maybe you can use the curb-fu gem (https://github.com/gdi/curb-fu); the only thing it needs is some extra code to make it follow redirects. I've used the following before. Hope it helps.
require 'rubygems'
require 'curb-fu'

module CurbFu
  class Request
    module Base
      def new_meth(url_params, query_params = {})
        curb = old_meth url_params, query_params
        curb.follow_location = true
        curb
      end

      alias :old_meth :build
      alias :build :new_meth
    end
  end
end

# This should follow the redirect, because we instruct
# curb to follow_location = true
print CurbFu.get('http://<your path>/').body
If you do not need to care about the details at each redirection, you can use the Mechanize library:
require 'mechanize'

agent = Mechanize.new
begin
  response = agent.get(url)
rescue Mechanize::ResponseCodeError
  # response codes other than 200, 301, or 302
rescue Timeout::Error
rescue Mechanize::RedirectLimitReachedError
rescue StandardError
end
It will return the destination page.
Or you can turn off redirection like this:
agent.redirect_ok = false
Or you can optionally change some settings for the request:
agent.user_agent = "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Mobile Safari/537.36"
Can anybody help me to:
- get the file size before I start downloading
- display how much % was already downloaded
require 'net/http'
require 'uri'

url = "http://www.onalllevels.com/2009-12-02TheYangShow_Squidoo_Part 1.flv"
url_base = url.split('/')[2]
url_path = '/' + url.split('/')[3..-1].join('/')

Net::HTTP.start(url_base) do |http|
  resp = http.get(URI.escape(url_path))
  open("test.file", "wb") do |file|
    file.write(resp.body)
  end
end
puts "Done."
Use the request_head method, like this (here http is a Net::HTTP instance already connected to www.example.com):
response = http.request_head('/remote-file.ext')
file_size = response['content-length'].to_i
The file_size will be in bytes.
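A self-contained version of the same idea (the URL is a placeholder):

require 'net/http'

uri = URI('http://www.example.com/remote-file.ext') # placeholder URL
Net::HTTP.start(uri.host, uri.port) do |http|
  response = http.request_head(uri.path)
  puts "#{response['content-length'].to_i} bytes"
end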
Follow these two links for more info.
http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html#M000695
http://curl.haxx.se/mail/archive-2002-07/0070.html
So I made it work, even with the progress bar:
require 'net/http'
require 'uri'
require 'progressbar'

url = "url with some file"
url_base = url.split('/')[2]
url_path = '/' + url.split('/')[3..-1].join('/')
counter = 0

Net::HTTP.start(url_base) do |http|
  response = http.request_head(URI.escape(url_path))
  pbar = ProgressBar.new("file name:", response['content-length'].to_i)
  pbar.format_arguments = [:title, :percentage, :bar, :stat_for_file_transfer]

  File.open("test.file", 'wb') do |f|
    http.get(URI.escape(url_path)) do |str|
      f.write str
      counter += str.length
      pbar.set(counter)
    end
  end

  pbar.finish
end
puts "Done."
The file size is available in the HTTP Content-Length response header. If it is not present, you can't do anything. To calculate the percentage, just do the primary-school math: part / total * 100.
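For example (downloaded and total stand for whatever byte counters you keep; the values are made up):

downloaded = 52_428_800   # bytes received so far
total      = 104_857_600  # from the Content-Length header
puts "%.1f%%" % (downloaded.to_f / total * 100) # => 50.0%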
Here is the full code to get file details before downloading:
require 'net/http'

response = nil
uri = URI('http://hero.com/abc.mp4')
Net::HTTP.start(uri.host, uri.port) do |http|
  response = http.head(uri)
end
response.header.each_header { |key, value| puts "#{key} = #{value}" }