Making a URL in a string usable by Ruby's Net::HTTP - ruby

Ruby's Net::HTTP needs to be given a full URL in order to connect to the server and fetch the file properly. By "full URL" I mean a URL that includes the http:// part and the trailing slash if it needs one. For instance, Net::HTTP won't connect to a URL that looks like example.com, but will connect just fine to http://example.com/. Is there any way to make sure a URL is a full URL, and add the required parts if it isn't?
EDIT: Here is the code I am using:
require 'net/http'
require 'uri'
parsed_url = URI.parse(url)
req = Net::HTTP::Get.new(parsed_url.path)
res = Net::HTTP.start(parsed_url.host, parsed_url.port) {|http|
  http.request(req)
}

If this is only doing what the sample code shows, Open-URI would be an easier approach.
require 'open-uri'
# On Ruby 3.0 and later, Kernel#open no longer opens URLs; URI.open does.
res = URI.open(url).read

This does a simple check for http/https:
unless url =~ /^https?:/i
  url = "http://" + url
end
A more general check could handle multiple protocols (ftp, etc.):
# \w+ rather than \w, so that schemes longer than one character match
unless url =~ /^\w+:/i
  url = "http://" + url
end
In order to make sure parsed_url.path gives you a proper value (it should be / when no specific path was provided), you could do something like this:
req = Net::HTTP::Get.new(parsed_url.path.empty? ? '/' : parsed_url.path)
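The two fixes above (adding a missing scheme and defaulting an empty path to /) can be combined into one helper. This is a minimal sketch rather than a definitive implementation: normalize_url is a hypothetical name, and falling back to http:// when no scheme is present is an assumption carried over from the snippets above.

```ruby
require 'uri'

# Hypothetical helper: returns a URI object with a scheme and a non-empty path.
def normalize_url(url)
  # Assumption: anything without a "scheme://" prefix is meant to be plain HTTP.
  url = "http://#{url}" unless url =~ %r{\A[a-z][a-z0-9+.-]*://}i
  uri = URI.parse(url)
  # Net::HTTP::Get.new needs at least "/" as the request path.
  uri.path = '/' if uri.path.empty?
  uri
end

puts normalize_url('example.com')
puts normalize_url('https://example.com/a?b=1')
```

When building the request from the result, uri.request_uri may be a better fit than uri.path, because it keeps the query string as well as the path.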

Related

Ruby Post returns 404 URL Not found while curl works fine

I'm trying to write some Ruby code to update GitLab CI/CD variables using the REST endpoint for updating a variable. When I perform a curl with the same path, the same private token, and the same --form data, it updates the variable as expected. When I use the Ruby code that I put together based on reading Stack Overflow and the Net::HTTP docs, it fails with a 404 URL not found.
I can use a similar piece of code to create a new CI/CD variable successfully. I can also delete an existing variable and re-create it, but I would like to know the mistake I am making in the update call.
Can someone point out what I did wrong?
#!/usr/bin/env ruby
require 'net/http'
require 'uri'
token = File.read(__dir__ + '/.gitlab-token').chomp
host = 'https://gitlab.com/'
variables_path = 'api/v4/projects/123456/variables'
env_var = 'MY_VAR'
update_uri = URI(host + variables_path + '/' + env_var)
# I've written the above this way because my actual code
# has a delete and create in order to "update" the variable
response = Net::HTTP.start(update_uri.host, update_uri.port, use_ssl: true) do |http|
  update_request = Net::HTTP::Post.new(update_uri)
  update_request['PRIVATE-TOKEN'] = token
  form_data = [
    ['value', 'a new value']
  ]
  update_request.set_form(form_data, 'multipart/form-data')
  response = http.request(update_request)
  response.body
end
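One thing worth checking, offered as a hedged suggestion since no answer is recorded here: GitLab documents the update endpoint for a project variable as PUT /projects/:id/variables/:key, while the code above sends a POST to that path, which may be why the server answers 404. Below is a minimal sketch of building the same request with Net::HTTP::Put. The project ID, variable name, and token are placeholders, and build_update_request is a hypothetical helper name.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) the update request.
def build_update_request(uri, token, value)
  request = Net::HTTP::Put.new(uri)
  request['PRIVATE-TOKEN'] = token
  request.set_form([['value', value]], 'multipart/form-data')
  request
end

update_uri = URI('https://gitlab.com/api/v4/projects/123456/variables/MY_VAR')
request = build_update_request(update_uri, 'placeholder-token', 'a new value')
puts "#{request.method} #{request.path}"

# Sending it would look like this (commented out because it needs a real token):
# Net::HTTP.start(update_uri.host, update_uri.port, use_ssl: true) { |http| http.request(request) }
```

It may also help to compare against the exact curl command that works: if it passes --request PUT, that would confirm the method is the only difference.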

Ruby open "userinfo not supported": URL with basic auth

I have a lot of URLs in the following format
ftp://user:pass@example.com/some_image.jpg
https://user:pass@example.com/some_image.jpg
When I try to load the image with ruby's open method, it throws the following error for https, but works for ftp
open(URI.parse("ftp://user:pass@example.com/some_image.jpg")) # works
open(URI.parse("https://user:pass@example.com/some_image.jpg")) # throws error:
# ArgumentError: userinfo not supported. [RFC3986]
I found (JSON parse from a Remote URL which requires a username and password) that you can provide open with basic auth parameter like this
url = URI.parse(url)
open(url, http_basic_authentication: [url.user,url.password])
But this still throws the error, because the URL still contains the user / password info.
So, what would be an easy way to parse out the user / password info from the url? I tried it by concatenating the parts of the URL by myself like this:
uri = URI.parse(url)
uri_base = "#{uri.scheme}://#{uri.host}:#{uri.port}#{uri.path}"
uri_base += "?#{uri.query}" if uri.query
open(uri_base, http_basic_authentication: [uri.user,uri.password])
But this doesn't work for FTP; it throws a Net::FTPPermError: 530 User _ftp denied by SACL. error.
So is there an easy way to support open with optional HTTP basic authentication for https AND ftp?
Update
I came up with the following solution, but it looks kinda hacky, and I think there must be a better way:
def download(url)
  opts = {}
  uri = URI.parse(url)
  uri_base = "#{uri.scheme}://"
  if uri.scheme == "ftp"
    uri_base += "#{uri.user}:#{uri.password}@" if uri.user
  else
    opts[:http_basic_authentication] = [uri.user, uri.password] if uri.user
  end
  uri_base += "#{uri.host}:#{uri.port}/#{uri.path}"
  uri_base += "?#{uri.query}" if uri.query
  open(uri_base, opts)
end
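A possibly less hacky variant of the update above is to leave the URL string mostly alone and only cut out the userinfo part for HTTP(S). This is a sketch under assumptions, not a definitive answer: split_credentials is a hypothetical helper, the regular expression assumes any "@" inside the user name or password is percent-encoded, and URI.open is used because Kernel#open no longer opens URLs on Ruby 3.0 and later.

```ruby
require 'uri'
require 'open-uri'

# Hypothetical helper: returns [url_without_userinfo, user, password].
def split_credentials(url)
  uri = URI.parse(url)
  # Drop "user:pass@" directly after the scheme; everything else stays untouched.
  stripped = url.sub(%r{\A([a-z][a-z0-9+.-]*://)[^/@]*@}i, '\1')
  [stripped, uri.user, uri.password]
end

def download(url)
  stripped, user, password = split_credentials(url)
  if user.nil? || url.start_with?('ftp://')
    # FTP: open-uri takes the login from the URL itself, so pass it through unchanged.
    URI.open(url)
  else
    URI.open(stripped, http_basic_authentication: [user, password])
  end
end

p split_credentials('https://user:pass@example.com/some_image.jpg')
```

Because the query string and path are never reassembled by hand, this avoids the port and slash bookkeeping in the version above.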

Reading Withings API ruby

I have been trying for days to pull down activity data from the Withings API using the OAuth Ruby gem. Regardless of which method I try, I consistently get back a 503 error response (not enough params), even though I copied the example URI from the documentation, having of course swapped out the userid. Has anybody had any luck with this in the past? I hope it is just something stupid I am doing.
class Withings
  API_KEY = 'REMOVED'
  API_SECRET = 'REMOVED'
  CONFIGURATION = { site: 'https://oauth.withings.com', request_token_path: '/account/request_token',
                    access_token_path: '/account/access_token', authorize_path: '/account/authorize' }
  before do
    @consumer = OAuth::Consumer.new API_KEY, API_SECRET, CONFIGURATION
    @base_url ||= "#{request.env['rack.url_scheme']}://#{request.env['HTTP_HOST']}#{request.env['SCRIPT_NAME']}"
  end
  get '/' do
    @request_token = @consumer.get_request_token oauth_callback: "#{@base_url}/access_token"
    session[:token] = @request_token.token
    session[:secret] = @request_token.secret
    redirect @request_token.authorize_url
  end
  get '/access_token' do
    @request_token = OAuth::RequestToken.new @consumer, session[:token], session[:secret]
    @access_token = @request_token.get_access_token oauth_verifier: params[:oauth_verifier]
    session[:token] = @access_token.token
    session[:secret] = @access_token.secret
    session[:userid] = params[:userid]
    redirect "#{@base_url}/activity"
  end
  get '/activity' do
    @access_token = OAuth::AccessToken.new @consumer, session[:token], session[:secret]
    response = @access_token.get("http://wbsapi.withings.net/v2/measure?action=getactivity&userid=#{session[:userid]}&startdateymd=2014-01-01&enddateymd=2014-05-09")
    JSON.parse(response.body)
  end
end
For other API endpoints I get an error response of 247 - The userid provided is absent, or incorrect. This is really frustrating. Thanks
So I figured out the answer after a copious amount of Googling and gaining a better understanding of both the Withings API and the OAuth library I was using. Basically, Withings uses query strings to pass in API parameters. I thought I was passing these parameters correctly when making API calls, but apparently I needed to explicitly set the OAuth library to use the query string scheme, like so:
http_method: :get, scheme: :query_string
I appended this to my OAuth consumer configuration and everything worked immediately.
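For clarity, this is roughly where those two options end up. The sketch only builds the options hash, so it runs without the oauth gem; withings_consumer_options is a hypothetical method name, and the call to OAuth::Consumer.new from the question is shown as a comment.

```ruby
# Hypothetical helper returning the consumer options from the question
# plus the two settings from the answer.
def withings_consumer_options
  {
    site: 'https://oauth.withings.com',
    request_token_path: '/account/request_token',
    access_token_path: '/account/access_token',
    authorize_path: '/account/authorize',
    # Sign requests through the query string instead of the Authorization header.
    http_method: :get,
    scheme: :query_string
  }
end

# With the oauth gem loaded, the consumer is then created as in the question:
# consumer = OAuth::Consumer.new(API_KEY, API_SECRET, withings_consumer_options)
puts withings_consumer_options[:scheme]
```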

Is there a way to attach Ruby Net::HTTP request to a specific IP address / network interface?

I'm looking for a way to use a different IP address for each GET request with the standard Net::HTTP library. The server has 5 IP addresses, and I assume that some APIs block access when the request limit per IP is reached. So otherwise the only way to do it would be to use another server. I can't find anything about this in the Ruby docs.
For example, curl allows you to attach a request to a specific IP address (in PHP):
$req = curl_init($url);
curl_setopt($req, CURLOPT_INTERFACE, 'ip.address.goes.here');
$result = curl_exec($req);
Is there any way to do it with the Net::HTTP library? As an alternative there is CURB (the Ruby curl binding), but that will be the last thing I'll try.
Suggestions / Ideas?
P.S. The solution with CURB (with dirty tests, IPs being replaced):
require 'rubygems'
require 'curb'
ip_addresses = [
  '1.1.1.1',
  '2.2.2.2',
  '3.3.3.3',
  '4.4.4.4',
  '5.5.5.5'
]
ip_addresses.each do |address|
  url = 'http://www.ip-adress.com/'
  c = Curl::Easy.new(url)
  c.interface = address
  c.perform
  ip = c.body_str.scan(/<h2>My IP address is: ([\d\.]{1,})<\/h2>/).first
  puts "for #{address} got response: #{ip}"
end
I know this is old, but hopefully someone else finds this useful, as I needed this today. You can do the following:
http = Net::HTTP.new(uri.host, uri.port)
http.local_host = ip
response = http.request(request)
Note that I don't believe you can use Net::HTTP.start, as it doesn't accept local_host as an option.
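To tie this back to the question's list of addresses, here is a minimal sketch of one Net::HTTP object per source address. It only configures the objects and does not open any connection. http_from is a hypothetical helper, and the 192.0.2.x addresses are documentation placeholders that would need to be replaced with addresses actually assigned to the machine.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: a Net::HTTP object bound to one local source address.
def http_from(uri, local_ip)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  # local_host has to be set before the connection is opened.
  http.local_host = local_ip
  http
end

uri = URI('http://example.com/')
clients = ['192.0.2.10', '192.0.2.11'].map { |ip| http_from(uri, ip) }
clients.each { |http| puts "would connect from #{http.local_host}" }

# An actual request would then be:
# response = clients.first.request(Net::HTTP::Get.new(uri.request_uri))
```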
There is in fact a way to do this if you monkey patch TCPSocket:
https://gist.github.com/800214
Curb is awesome but won't work with JRuby, so I've been looking into alternatives...
It doesn't look like you can do it with Net::HTTP. Here's the source:
http://github.com/ruby/ruby/blob/trunk/lib/net/http.rb
Line 644 is where the connection is opened:
s = timeout(@open_timeout) { TCPSocket.open(conn_address(), conn_port()) }
The third and fourth arguments to TCPSocket.open are local_address and local_port, and since they're not specified, it's not possible. Looks like you'll have to go with curb.
Of course you can. I did as below:
# remote_host can be IP or hostname
uri = URI.parse( "http://" + remote_host )
http = Net::HTTP.new( uri.host, uri.port )
request = Net::HTTP::Get.new(uri.request_uri)
request.initialize_http_header( { "Host" => domain })
response = http.request( request )

Ruby open-uri, returns error when opening a png URL

I am making a crawler parsing images on the Gantz manga at http://manga.bleachexile.com/gantz-chapter-1.html and on.
I had success until my crawler tried to open an image (in chapter 273):
bad URI(is not URI?): http://static.bleachexile.com/manga/gantz/273/Gantz[0273]_p001[Whatever-Illuminati].png
But I guess this URL is valid, because I can open it in Firefox. Any thoughts?
Partial code:
img_link = nav.page.image_urls.find {|x| x.include?("manga/gantz")}
img_name = RAILS_ROOT+"/public/#{nome}/#{cap}/"+nome+((template).sub('::cap::', cap.to_s).sub('::pag::', i.to_s))
img = File.new( img_name, 'w' )
img.write( open(img_link) {|f| f.read} )
img.close
It is not a valid URI. Only certain characters are allowed in URIs. Firefox, like all browsers, tries to do as much as possible for the user instead of complaining when something does not look standards-compliant.
It is valid in the following form:
open("http://static.bleachexile.com/manga/gantz/273/Gantz%5B0273%5D_p001%5BWhatever-Illuminati%5D.png") # => #<File:/tmp/open-uri20100226-3342-clj08a-0>
You could try to escape it like this:
uri.gsub(/\/.*/) do |t|
  t.gsub(/[^.\/a-zA-Z0-9\-_ ]/) do |c|
    # c[0] is an Integer only on Ruby 1.8; encoding each byte also works on 1.9+
    c.bytes.map { |b| "%%%02X" % b }.join
  end.gsub(" ", "+")
end
But be careful: if the website already uses correctly escaped URIs and you escape them a second time, the URIs won't point to the same location anymore.
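One way to reduce the double-escaping risk is to keep "%" in the set of allowed characters, so sequences that are already escaped pass through unchanged. The following is a minimal sketch, not a complete URI encoder: escape_unsafe is a hypothetical helper, it assumes any literal "%" in the input already belongs to an escape sequence, and it does not handle bracketed IPv6 hosts.

```ruby
require 'uri'

# Hypothetical helper: percent-encodes only characters that may not appear in a URI.
def escape_unsafe(url)
  # '%' is in the allowed set, so existing escapes such as %5B are left alone.
  url.gsub(/[^A-Za-z0-9\-._~:\/?@!$&'()*+,;=%#]/) do |c|
    c.bytes.map { |b| format('%%%02X', b) }.join
  end
end

raw = 'http://static.bleachexile.com/manga/gantz/273/Gantz[0273]_p001[Whatever-Illuminati].png'
escaped = escape_unsafe(raw)
puts escaped
puts URI.parse(escaped).path
```

Running the helper on its own output returns the same string, which is the property that matters when some of the crawled links are already escaped.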

Resources