Ruby URL Validation - ruby

I wrote this script to parse a text file of URLs and return the HTTP response code for each one, but I can't get it to work. I'm able to open and read the file, but I'm unable to get the response code. Thanks in advance!
require 'net/http'
# Open URL from file
File.open("sample_input_file", "r") do |infile|
  while (URI = infile.gets)
  end
end
# Get HTTP response code
http = Net::HTTP.new
response = http.request_head(URI)
# Print result
if response.code != "200"
  puts URI + "Error"
else
  puts "Ok"
end

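gets returns a String, so you need to turn it into a URI object, for example by calling URI.parse. Also note that assigning to URI overwrites the URI constant itself (use a lowercase variable instead), and that the loop body is empty, so the request runs only once, after gets has already returned nil. Move the request inside the loop.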
http://www.ruby-doc.org/stdlib-1.9.3/libdoc/uri/rdoc/
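A minimal sketch of how the pieces could fit together, assuming one URL per line in the input file. The method names head_status and check_urls and the fetcher: parameter are made up for this illustration; only URI.parse, Net::HTTP.start and request_head come from the standard library.

```ruby
require 'net/http'
require 'uri'

# Default fetcher (hypothetical helper): send a HEAD request and
# return the status code as a string, e.g. "200".
def head_status(uri)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_head(uri.request_uri).code
  end
end

# Check every non-blank line and return an array of [url, status] pairs.
# The fetcher is injectable so the logic can be tried without a network.
def check_urls(lines, fetcher: method(:head_status))
  lines.map(&:strip).reject(&:empty?).map do |line|
    status =
      begin
        uri = URI.parse(line)
        uri.is_a?(URI::HTTP) ? fetcher.call(uri) : 'not an HTTP URL'
      rescue URI::InvalidURIError
        'invalid URI'
      end
    [line, status]
  end
end

# Usage against the real file (performs network requests, so left commented out):
#   check_urls(File.readlines('sample_input_file')).each do |url, status|
#     puts status == '200' ? "#{url} Ok" : "#{url} Error (#{status})"
#   end
```

Connection failures (timeouts, unknown hosts) are not rescued here; a real script would probably want to handle those as well.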

Related

Cannot make HTTP Delete request with Ruby's net/http library

I've been trying to make an API call to my server to delete a user record held on a dev database. When I use Fiddler to call the URL with the DELETE operation, I am able to delete the user record immediately. When I call that same URL, again with the DELETE operation, from my script below, I get this error:
{"Message":"The requested resource does not support http method 'DELETE'."}
I have changed the url in my script below. The url I am using is definitely correct. I suspect that there is a logical error in my code that I haven't caught. My script:
require 'net/http'
require 'json'
require 'pp'
require 'uri'
def deleteUserRole
  # prepare request
  url = "http://my.database.5002143.access" # dev
  uri = URI.parse(url)
  request = Net::HTTP::Delete.new(uri.path)
  http = Net::HTTP.new(uri.host, uri.port)
  # send the request
  response = http.request(request)
  puts "response: \n"
  puts response.body
  puts "response code: " + response.code + "\n \n"
  # parse response
  buffer = response.body
  result = JSON.parse(buffer)
  status = result["Success"]
  if status == true
    puts "passed"
  else
    puts "failed"
  end
end
deleteUserRole
It turns out that I was typing in the wrong command. I needed to change this line:
request = Net::HTTP::Delete.new(uri.path)
to this line:
request = Net::HTTP::Delete.new(uri)
By typing uri.path I was excluding part of the URL from the API call. When I was debugging, I would type puts uri and that would show me the full URL, so I was certain the URL was right. The URL was right, but I was not including the full URL in my DELETE call.
If you leave out the query parameters when making the DELETE request, it won't work.
You can add them like this:
uri = URI.parse('http://localhost/test')
http = Net::HTTP.new(uri.host, uri.port)
attribute_url = '?'
attribute_url << body.map{|k,v| "#{k}=#{v}"}.join('&')
request = Net::HTTP::Delete.new(uri.request_uri+attribute_url)
response = http.request(request)
Here body is a hash of query parameters; the code above joins them into the query string of the request URL. For example:
body = { :resname => 'res', :bucket_name => 'bucket', :uploaded_by => 'upload' }
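As a variation on the snippet above, here is a sketch that lets URI.encode_www_form build and escape the query string instead of joining the pairs by hand. build_delete is a hypothetical helper name, and the localhost URL and parameter values are placeholders.

```ruby
require 'net/http'
require 'uri'

# Build a DELETE request whose target includes the path and an escaped query string.
def build_delete(url, params = {})
  uri = URI.parse(url)
  # encode_www_form escapes keys and values and joins the pairs with "&"
  uri.query = URI.encode_www_form(params) unless params.empty?
  # request_uri is the path plus "?query"; uri.path alone would drop the query
  [uri, Net::HTTP::Delete.new(uri.request_uri)]
end

uri, request = build_delete('http://localhost/test',
                            resname: 'res', bucket_name: 'my bucket')
puts request.method  # DELETE
puts request.path    # /test?resname=res&bucket_name=my+bucket

# Sending it (not executed here):
#   response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
```

The point of going through request_uri is the same as in the accepted fix: whatever follows the path in the URL has to be part of what is handed to Net::HTTP::Delete.new.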

Net::HTTP Proxy list

I understand that you can use a proxy with Ruby's Net::HTTP. However, I have no idea how to do this with a list of proxies. I need Net::HTTP to switch to another proxy after every POST request and then send the next POST request. Also, is it possible to make Net::HTTP switch to another proxy if the previous proxy is not working? If so, how?
Code I'm trying to implement the script in:
require 'net/http'
sleep(8)
http = Net::HTTP.new('URLHERE', 80)
http.read_timeout = 5000
http.use_ssl = false
path = 'PATHHERE'
data = '(DATAHERE)'
headers = {
'Referer' => 'REFERER HERE',
'Content-Type' => 'application/x-www-form-urlencoded; charset=UTF-8',
'User-Agent' => '(USERAGENTHERE)'}
resp, data = http.post(path, data, headers)
# Output on the screen -> we should get either a 302 redirect (after a successful login) or an error page
puts 'Code = ' + resp.code
puts 'Message = ' + resp.message
resp.each {|key, val| puts key + ' = ' + val}
puts data
end
Given an array of proxies, the following example will make a request through each proxy in the array until it receives a "302 Found" response. (This isn't actually a working example because Google doesn't accept POST requests, but it should work if you insert your own destination and working proxies.)
require 'net/http'
destination = URI.parse "http://www.google.com/search"
proxies = [
"http://proxy-example-1.net:8080",
"http://proxy-example-2.net:8080",
"http://proxy-example-3.net:8080"
]
# Create your POST request_object once
request_object = Net::HTTP::Post.new(destination.request_uri)
request_object.set_form_data({"q" => "stack overflow"})
proxies.each do |raw_proxy|
  proxy = URI.parse raw_proxy
  # Create a new http_object for each new proxy
  http_object = Net::HTTP.new(destination.host, destination.port, proxy.host, proxy.port)
  # Make the request
  response = http_object.request(request_object)
  # If we get a 302, report it and break
  if response.code == "302"
    puts "#{proxy.host}:#{proxy.port} responded with #{response.code} #{response.message}"
    break
  end
end
You should also probably do some error checking with begin ... rescue ... end each time you make a request. If you don't do any error checking and a proxy is down, control will never reach the line that checks for response.code == "302" -- the program will just fail with some type of connection timeout error.
See the Net::HTTPHeader docs for other methods that can be used to customize the Net::HTTP::Post object.
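A rough sketch of the begin ... rescue ... end advice above. The method names (post_through_proxy, first_response), the timeout values, and the list of rescued error classes are assumptions chosen for illustration, not something from the question; adjust them for your setup.

```ruby
require 'net/http'
require 'uri'

# Errors that are treated here as "this proxy is unusable, try the next one".
# The selection is an assumption; extend it as needed.
PROXY_ERRORS = [
  Net::OpenTimeout, Net::ReadTimeout, SocketError,
  Errno::ECONNREFUSED, Errno::ECONNRESET, Errno::EHOSTUNREACH
].freeze

# Default sender (hypothetical helper): POST form data through a single proxy.
def post_through_proxy(destination, proxy, form)
  request = Net::HTTP::Post.new(destination.request_uri)
  request.set_form_data(form)
  http = Net::HTTP.new(destination.host, destination.port, proxy.host, proxy.port)
  http.open_timeout = 5   # seconds to wait for the connection
  http.read_timeout = 10  # seconds to wait for the response
  http.request(request)
end

# Try each proxy in order. Returns [proxy, response] for the first proxy
# that answers, or nil when every proxy fails.
def first_response(destination, proxies, form, sender: method(:post_through_proxy))
  proxies.each do |raw_proxy|
    proxy = URI.parse(raw_proxy)
    begin
      return [proxy, sender.call(destination, proxy, form)]
    rescue *PROXY_ERRORS => e
      warn "#{proxy.host}:#{proxy.port} failed (#{e.class}), trying next proxy"
    end
  end
  nil
end
```

To rotate proxies after every request rather than only on failure, the same loop could be driven by proxies.cycle and a counter, but that is left out to keep the sketch short.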

Ruby HTTP post with session cookie

I'm trying to write a Ruby script that uses the API of the image gallery site Piwigo. This requires you to log in first with one HTTP POST and then upload an image with another POST.
This is what I've got so far, but it doesn't work; it just returns a 401 error. Can anyone see where I am going wrong?
require 'net/http'
require 'pp'
http = Net::HTTP.new('mydomain.com',80)
path = '/piwigo/ws.php'
data = 'method=pwg.session.login&username=admin&password=password'
resp, data = http.post(path, data, {})
if (resp.code == '200')
cookie = resp.response['set-cookie']
data = 'method=pwg.images.addSimple&image=image.jpg&category=7'
headers = { "Cookie" => cookie }
resp, data = http.post(path, data, headers)
puts resp.code
puts resp.message
end
This gives the following response when run:
$ ruby piwigo.rb
401
Unauthorized
There is a Perl example on their API page, which I was trying to convert to Ruby: http://piwigo.org/doc/doku.php?id=dev:webapi:pwg.images.addsimple
You can do this with the nice_http gem: https://github.com/MarioRuiz/nice_http
NiceHttp takes care of your cookies, so you don't have to do anything yourself.
require 'nice_http'
path = '/piwigo/ws.php'
data = '?method=pwg.session.login&username=admin&password=password'
http = NiceHttp.new('http://example.com')
resp = http.get(path+data)
if resp.code == 200
resp = http.post(path)
puts resp.code
puts resp.message
end
Also if you want you can add your own cookies by using http.cookies
You can use a gem called mechanize. It handles cookies transparently.
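If you would rather stay with plain Net::HTTP, here is a sketch of the cookie handling on its own. It assumes the session cookie arrives in one or more Set-Cookie headers and that only the leading name=value pair should be sent back; cookie_header and login_and_upload are hypothetical names, and whether this alone resolves the 401 depends on what the Piwigo API expects for the upload.

```ruby
require 'net/http'

# Build a Cookie request header from the Set-Cookie headers of a response.
# Only the leading "name=value" pair of each Set-Cookie line is kept;
# attributes such as path or HttpOnly are not sent back to the server.
def cookie_header(response)
  set_cookies = response.get_fields('set-cookie') || []
  set_cookies.map { |line| line.split(';', 2).first.strip }.join('; ')
end

# Log in, then reuse the session cookie on the second POST.
# Defined but not called here, since it performs real requests.
def login_and_upload(host, path)
  Net::HTTP.start(host, 80) do |http|
    login = http.post(path, 'method=pwg.session.login&username=admin&password=password')
    return login unless login.code == '200'

    headers = { 'Cookie' => cookie_header(login) }
    http.post(path, 'method=pwg.images.addSimple&image=image.jpg&category=7', headers)
  end
end
```

Note that in current Ruby versions http.post returns only the response object, so the resp, data = http.post(...) form from the question leaves data set to nil.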

Ruby parsing HTTPresponse with Nokogiri

Hi, I am having trouble parsing HTTPResponse objects with Nokogiri.
I use this function to fetch a website:
# fetch a link
def fetch(uri_str, limit = 10)
# You should choose better exception.
raise ArgumentError, 'HTTP redirect too deep' if limit == 0
url = URI.parse(URI.encode(uri_str.strip))
puts url
#get path
req = Net::HTTP::Get.new(url.path,headers)
#start TCP/IP
response = Net::HTTP.start(url.host,url.port) { |http|
http.request(req)
}
case response
when Net::HTTPSuccess
then #print final redirect to a file
puts "this is location" + uri_str
puts "this is the host #{url.host}"
puts "this is the path #{url.path}"
return response
# if you get a 302 response
when Net::HTTPRedirection
then
puts "this is redirect" + response['location']
return fetch(response['location'],aFile, limit - 1)
else
response.error!
end
end
html = fetch("http://www.somewebsite.com/hahaha/")
puts html
noko = Nokogiri::HTML(html)
When I do this, html prints a whole bunch of gibberish and Nokogiri complains that "node_set must be a Nokogiri::XML::NodeSet".
If anyone could offer help, it would be much appreciated.
First thing. Your fetch method returns a Net::HTTPResponse object and not just the body. You should provide the body to Nokogiri.
response = fetch("http://www.somewebsite.com/hahaha/")
puts response.body
noko = Nokogiri::HTML(response.body)
I've updated your script so it's runnable (below). A couple of things were undefined.
require 'nokogiri'
require 'net/http'
require 'uri'
def fetch(uri_str, limit = 10)
  # You should choose a better exception.
  raise ArgumentError, 'HTTP redirect too deep' if limit == 0
  # URI.encode was removed in Ruby 3.0, so parse the stripped string directly
  url = URI.parse(uri_str.strip)
  puts url
  # Build the GET request for the path and query string
  headers = {}
  req = Net::HTTP::Get.new(url.request_uri, headers)
  # Open the connection (with TLS for https URLs) and send the request
  response = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') { |http|
    http.request(req)
  }
  case response
  when Net::HTTPSuccess
    puts "this is location " + uri_str
    puts "this is the host #{url.host}"
    puts "this is the path #{url.path}"
    return response
  # Follow the redirect if you get a 3xx response
  when Net::HTTPRedirection
    puts "this is redirect " + response['location']
    return fetch(response['location'], limit - 1)
  else
    response.error!
  end
end
response = fetch("http://www.google.com/")
puts response
noko = Nokogiri::HTML(response.body)
puts noko
The script gives no error and prints the content. You may be getting Nokogiri error due to the content you're receiving. One common problem I've encountered with Nokogiri is character encoding. Without the exact error it's impossible to tell what's going on.
I'd recommend looking at the following Stack Overflow questions:
ruby 1.9: invalid byte sequence in UTF-8 (specifically this answer)
How to convert a Net::HTTP response to a certain encoding in Ruby 1.9.1?
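To illustrate the encoding point, here is a small sketch of converting a response body to UTF-8 before handing it to Nokogiri. to_utf8 is a hypothetical helper, and it assumes you already know the source charset (for instance from the Content-Type header or a meta tag); it does not try to detect it.

```ruby
# Re-label the raw bytes with their source charset and transcode to UTF-8.
# Bytes that cannot be converted are replaced instead of raising an error.
def to_utf8(body, charset)
  body.dup.force_encoding(charset).encode('UTF-8', invalid: :replace, undef: :replace)
end

# Example: "café" as ISO-8859-1 bytes, the way a binary response body might hold it
raw = "caf\xE9".b
text = to_utf8(raw, 'ISO-8859-1')
puts text.encoding         # UTF-8
puts text.valid_encoding?  # true
```

With the fetch method above, this would be used as Nokogiri::HTML(to_utf8(response.body, 'ISO-8859-1')), with the charset replaced by whatever the site actually serves.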

How to handle 404 errors with Ruby HTTP::Net?

I'm trying to parse web pages but I sometimes get 404 errors. Here's the code I use to get the web page:
result = Net::HTTP::get URI.parse(URI.escape(url))
How do I test if result is a 404 error code?
Rewrite your code like this:
uri = URI.parse(url)
result = Net::HTTP.start(uri.host, uri.port) { |http| http.get(uri.path) }
puts result.code
puts result.body
That will print the status code followed by the body.
Your code will always return the response body, whether there is an error or not. To test the response code, use Theo's answer together with an if statement like the following, for example:
if result.code.to_i < 400
puts "success"
end
This example converts the code (which is a string) to an integer, and treats redirects (3xx) and all 2xx codes as successful.
See this for the various codes returned:
http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
You need to get the response:
response = Net::HTTP.get_response(URI(url))
error = response.is_a?(Net::HTTPNotFound)
result = response.body
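Building on the is_a? check, a sketch that sorts a response into a few coarse outcomes by its class. The method names outcome and body_unless_404 and the returned symbols are made up for this example; the response classes are the ones Net::HTTP defines.

```ruby
require 'net/http'
require 'uri'

# Map a Net::HTTPResponse to a coarse outcome based on its class.
def outcome(response)
  case response
  when Net::HTTPSuccess     then :ok         # 2xx
  when Net::HTTPRedirection then :redirect   # 3xx
  when Net::HTTPNotFound    then :not_found  # 404
  else :error                                # anything else
  end
end

# Fetch a page body, or nil when the server answers 404.
# Defined but not called here, since it performs a real request.
def body_unless_404(url)
  response = Net::HTTP.get_response(URI(url))
  outcome(response) == :not_found ? nil : response.body
end
```

Matching on the class avoids converting the code string to an integer, and it makes it easy to treat 404 differently from other client or server errors.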

Resources