How to check if a username is already in use on Facebook?
My solution was to request http://www.facebook.com/USER and check the HTTP status code (200 = OK; 404 = Not Found). I could use this code:
require 'open-uri'
require 'net/http'
def remote_file_exists?(url,httpcode)
url = URI.parse(url)
Net::HTTP.start(url.host, url.port) do |http|
return http.head(url.request_uri).code == httpcode
end
end
The problem is that Facebook always returns 302 (Found), then redirects to https://www.facebook.com/USER.
I can require net/https and create a new function:
def https_url_exists? (url,httpcode)
url = URI.parse(url)
net = Net::HTTP.new(url.host, url.port)
net.use_ssl = true
net.verify_mode = OpenSSL::SSL::VERIFY_NONE
net.start do |http|
return (http.head(url.request_uri).code == httpcode)
end
end
Now the problem is that some users have dots in their usernames. For example, the username might be user.name. Facebook uses redirects for this.
What's the best way to check if USERNAME exists on Facebook? How can I get USER.NAME if USERNAME redirects to it?
You can use https://graph.facebook.com/username. This returns a JSON response that tells you whether the name exists, along with enough information to identify it as a user or a page.
Once you have a valid user, you can get the first and last name using:
https://graph.facebook.com/{userId}?fields=first_name,last_name
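If you stay with the HEAD-request approach from the question, the second part (finding USER.NAME when USERNAME redirects to it) could be read from the Location header of the redirect. The following is only a sketch: the helper name username_from_location is hypothetical, and it assumes the redirect target has the form https://www.facebook.com/USER.NAME, which Facebook does not guarantee.

```ruby
require 'uri'

# Hypothetical helper: return the first path segment of a redirect's
# Location header value. Assumes the target looks like
# https://www.facebook.com/USER.NAME (an assumption, not documented behaviour).
def username_from_location(location)
  return nil if location.nil? || location.empty?

  path = URI.parse(location).path            # e.g. "/user.name"
  path.split('/').reject(&:empty?).first     # nil when the path is "/" or empty
end

puts username_from_location('https://www.facebook.com/user.name')
```

In practice you would pass it response['location'] from a 3xx response.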
After looking a lot, I've found some solutions that seem working, but not for me...
For example, I have this script:
require 'net/http'
require "net/https"
@http=Net::HTTP.new('www.xxxxxxx.net', 443)
@http.use_ssl = true
@http.verify_mode = OpenSSL::SSL::VERIFY_NONE
@http.start() {|http|
req = Net::HTTP::Get.new('/gb/PastSetupsXLS.asp?SR=31,6')
req.basic_auth 'my_user', 'my_password'
response = http.request(req)
print response.body
}
When I run it, I get a page that asks for authentication, but if I enter the following URL in the browser, I get into the website without problems:
https://my_user:my_password@www.xxxxxxx.net/gb/PastSetupsXLS.asp?SR=31,6
I have also tried with open-uri:
module OpenSSL
module SSL
remove_const :VERIFY_PEER
end
end
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE
def download(full_url, to_here)
writeOut = open(to_here, "wb")
writeOut.write(open(full_url, :http_basic_authentication=>["my_user", "my_password"]).read)
writeOut.close
end
download('https://www.xxxxxxx.net/gb/PastSetupsXLS.asp?SR=31,6', "target_file.html")
But the result is the same: the site asks for user authentication.
Any tips on what I am doing wrong? Must I encode the password in Base64?
I wrote a piece of code based on examples given in the Net::HTTP docs and tested it on my local WAMP server - it works fine. Here's what I have:
require 'net/http'
require 'openssl'
uri = URI('https://localhost/')
Net::HTTP.start(uri.host, uri.port,
:use_ssl => uri.scheme == 'https',
:verify_mode => OpenSSL::SSL::VERIFY_NONE) do |http|
request = Net::HTTP::Get.new uri.request_uri
request.basic_auth 'matt', 'secret'
response = http.request request # Net::HTTPResponse object
puts response
puts response.body
end
And my .htaccess file looks like this:
AuthName "Authorization required"
AuthUserFile c:/wamp/www/ssl/.htpasswd
AuthType basic
Require valid-user
My .htpasswd is just a one liner generated with htpasswd -c .htpasswd matt for password "secret". When I run my code I get "200 OK" and contents of index.html. If I remove the request.basic_auth line, I get 401 error.
UPDATE:
As indicated by @stereoscott in the comments, the :verify_mode value I used in the example (OpenSSL::SSL::VERIFY_NONE) is not safe for production.
The available options listed in the OpenSSL::SSL::SSLContext docs are VERIFY_NONE, VERIFY_PEER, VERIFY_CLIENT_ONCE and VERIFY_FAIL_IF_NO_PEER_CERT, of which (according to the OpenSSL docs) only the first two are used in client mode.
So VERIFY_PEER should be used in production. It is the default, so you can skip the option entirely.
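As a minimal sketch of the production-safe setup, the client below is built with verification spelled out explicitly. The helper name verified_client is made up for illustration; the object is only configured here, no connection is opened.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: build (but do not open) a client with certificate
# verification stated explicitly instead of relying on the default.
def verified_client(url)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER if http.use_ssl?
  http
end

http = verified_client('https://localhost/')
puts http.verify_mode == OpenSSL::SSL::VERIFY_PEER
```

With a self-signed certificate, as on a local WAMP server, a request through this client is expected to fail verification, which is the reason the original example switched it off.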
The following is what ended up working for me:
require "uri"
require "net/http"
require "base64"
url = URI("https://localhost/")
https = Net::HTTP.new(url.host, url.port)
https.use_ssl = true
request = Net::HTTP::Get.new(url)
# strict_encode64 adds no line breaks; a newline in a header value is rejected
request["Authorization"] = "Basic " + Base64.strict_encode64("my_user:my_password")
response = https.request(request)
puts response.read_body
I came up with this by building a new HTTP Request in Postman, specifying the URL, choosing an Authorization Type of "Basic Auth," and inputting the credentials.
Clicking the Code icon (</>) and selecting "Ruby - Net::HTTP" will then generate a code snippet, giving you the output above.
Postman took care of encoding the credentials, but this answer helped me to dynamically set these values. You also can likely omit the "cookie" key as part of the request.
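On the question of whether the password has to be Base64-encoded by hand: as far as I can tell, request.basic_auth already does this, so both styles should produce the same header. A small sketch to compare them offline; basic_auth_header is a hypothetical helper name, and pack('m0') is used as a dependency-free way to get Base64 without line breaks.

```ruby
require 'net/http'

# Hypothetical helper: build the Authorization value by hand.
# pack('m0') is Base64 encoding without any line breaks.
def basic_auth_header(user, password)
  'Basic ' + ["#{user}:#{password}"].pack('m0')
end

request = Net::HTTP::Get.new('/')
request.basic_auth('my_user', 'my_password')

# Both approaches yield the same Authorization header.
puts request['Authorization'] == basic_auth_header('my_user', 'my_password')
```

Setting the header manually is mainly useful when the credentials are assembled elsewhere; otherwise basic_auth is shorter.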
I'm trying to figure out how I can verify that what I'm feeding into CarrierWave is actually an image. The source I'm getting my image URLs from doesn't always give me live URLs; some of the images no longer exist. Unfortunately it doesn't return the right status codes either: I was using some code to check whether the remote file exists, and dead URLs were passing that check. So, to be on the safe side, I'd like a way to verify that I'm getting back a valid image file before I go ahead and download it.
Here is the remote file checking code I was using, just for reference, but I'd prefer something that can actually identify that the files are images.
require 'open-uri'
require 'net/http'
def remote_file_exists?(url)
url = URI.parse(url)
Net::HTTP.start(url.host, url.port) do |http|
return http.head(url.request_uri).code == "200"
end
end
I would check to see if the service returns the proper mime types in the Content-Type HTTP header. (here's a list of mime types)
For example, the Content-Type of the StackOverflow homepage is text/html; charset=utf-8, and the Content-Type of your gravatar image is image/png
To check the Content-Type header for image in ruby using Net::HTTP, you would use the following:
def remote_file_exists?(url)
url = URI.parse(url)
Net::HTTP.start(url.host, url.port) do |http|
return http.head(url.request_uri)['Content-Type'].start_with? 'image'
end
end
Rick Button's answer worked for me, but I needed to add SSL support:
def self.remote_image_exists?(url)
url = URI.parse(url)
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = (url.scheme == "https")
http.start do |http|
return http.head(url.request_uri)['Content-Type'].start_with? 'image'
end
end
I ended up using HTTParty for this. The Net::HTTP request from Rick Button's answer kept timing out.
require 'httparty'
def remote_file_exists?(url)
  response = HTTParty.get(url)
  # parentheses are required here: `a && b.start_with? 'image'` is a syntax error
  response.code == 200 && response.headers['Content-Type'].to_s.start_with?('image')
end
https://github.com/jnunemaker/httparty
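Whichever HTTP client is used, the Content-Type test itself can be pulled into a small helper that tolerates a missing header and different capitalisation. This is a sketch with a hypothetical name (image_content_type?); note that it only checks what the server claims, not the actual bytes of the file.

```ruby
# Hypothetical helper: true when a Content-Type header value names an image type.
# Accepts nil (header missing) and ignores case and surrounding whitespace.
def image_content_type?(content_type)
  content_type.to_s.strip.downcase.start_with?('image/')
end

puts image_content_type?('image/png')
puts image_content_type?('text/html; charset=utf-8')
puts image_content_type?(nil)
```

It could be called as image_content_type?(response['Content-Type']) with Net::HTTP, or with response.headers['Content-Type'] with HTTParty.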
How would you do a request to Facebook object graph to get the user's friends?
If you type in the url it works in the browser (replaced by valid user_id and access token):
"https://graph.facebook.com/user_id/friends?access_token=2227470867|2.AQDi3TbqnqrsPa0_.360"
When I try it from ruby code using Net::HTTP.get_response(URI.parse('url')) I get URI::InvalidURIError error message.
Your access token has some characters that are invalid for a URL. You have to CGI.escape them.
require 'cgi'
require 'net/http'
access_token = '2227470867|2.AQDi3TbqnqrsPa0_.360'
url = "https://graph.facebook.com/user_id/friends?access_token=#{CGI.escape(access_token)}"
uri = URI.parse(url)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
request = Net::HTTP::Get.new(uri.request_uri) # path plus query string
response = http.request(request)
data = response.body
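An alternative to escaping by hand is to let the standard library assemble the query string. The sketch below uses URI.encode_www_form and URI::HTTPS.build; friends_uri is a hypothetical helper, and user_id and the token are the placeholders from the question.

```ruby
require 'uri'

# Hypothetical helper: build the friends URL with the query string encoded
# by the standard library. Nothing is sent over the network here.
def friends_uri(user_id, access_token)
  URI::HTTPS.build(
    host:  'graph.facebook.com',
    path:  "/#{user_id}/friends",
    query: URI.encode_www_form(access_token: access_token)
  )
end

uri = friends_uri('user_id', '2227470867|2.AQDi3TbqnqrsPa0_.360')
puts uri # the "|" in the token appears as %7C
```

The resulting object can be passed to Net::HTTP in the same way as the parsed URL above.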
Maybe it has something to do with OAuth? I'd suggest using a library like Koala instead of rolling your own ad hoc solution.
How would I go about checking if a URL exists using Ruby?
For example, for the URL
https://google.com
the result should be truthy, but for the URLs
https://no.such.domain
or
https://stackoverflow.com/no/such/path
the result should be falsey
Use the Net::HTTP library.
require "net/http"
url = URI.parse("http://www.google.com/")
req = Net::HTTP.new(url.host, url.port)
res = req.request_head(url.path)
At this point res is a Net::HTTPResponse object containing the result of the request. You can then check the response code:
do_something_with_it(url) if res.code == "200"
Note: To check an https-based URL, the use_ssl attribute must be set to true:
require "net/http"
url = URI.parse("https://www.google.com/")
req = Net::HTTP.new(url.host, url.port)
req.use_ssl = true
res = req.request_head(url.path)
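One detail to watch with url.path: for a URL such as https://google.com (the example from the question) the path is an empty string, and any query string is dropped. A small sketch of the difference, using a hypothetical helper name head_target; URI#request_uri is the standard-library method that fills in "/" and keeps the query.

```ruby
require 'uri'

# Hypothetical helper: the request target to pass to request_head.
# request_uri returns "/" for an empty path and keeps the query string.
def head_target(url_string)
  URI.parse(url_string).request_uri
end

puts URI.parse('https://google.com').path.inspect  # the path alone is empty
puts head_target('https://google.com')
puts head_target('https://stackoverflow.com/no/such/path?tab=votes')
```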
Sorry for the late reply on this, but I think this deserves a better answer.
There are three ways to look at this question:
Strict check if the URL exists
Check if you are requesting the URL correctly
Check if you can request it correctly and the server can answer it correctly
1. Strict check if the URL exists
While 200 means that the server answers for that URL (thus, the URL exists), any other status code doesn't mean that the URL does not exist. For example, 302 (redirect) means that the URL exists and redirects to another one. While browsing, a 302 often looks the same as a 200 to the end user. Another status code that can be returned for an existing URL is 500 (internal server error). After all, if the URL did not exist, why would the application server have processed your request instead of simply returning 404 (not found)?
So there are actually only two cases in which a URL does not exist: the server does not exist, or the server exists but can't find the given URL path. Thus, the only way to check whether the URL exists is to check that the server answers and that the return code is not 404. The following code does just that.
require "net/http"
def url_exist?(url_string)
  url = URI.parse(url_string)
  req = Net::HTTP.new(url.host, url.port)
  req.use_ssl = (url.scheme == 'https')
  path = url.path unless url.path.empty? # plain Ruby; present? needs ActiveSupport
  res = req.request_head(path || '/')
  res.code != "404" # false if returns 404 - not found
rescue SocketError, SystemCallError # unknown host, connection refused, etc.
  false # false if can't find or reach the server
end
2. Check if you are requesting the URL correctly
However, most of the time we are not interested in whether a URL exists, but in whether we can access it. Fortunately, HTTP status codes are grouped into families, and the 4xx family stands for client errors (an error on your side: you are not requesting the page correctly, you don't have permission, and so on). This is a good family of errors to check to see whether you can access the page. From the wiki:
The 4xx class of status code is intended for cases in which the client seems to have erred. Except when responding to a HEAD request, the server should include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method. User agents should display any included entity to the user.
So the following code makes sure the URL exists and you can access it:
require "net/http"
def url_exist?(url_string)
  url = URI.parse(url_string)
  req = Net::HTTP.new(url.host, url.port)
  req.use_ssl = (url.scheme == 'https')
  path = url.path unless url.path.empty? # plain Ruby; present? needs ActiveSupport
  res = req.request_head(path || '/')
  if res.kind_of?(Net::HTTPRedirection)
    url_exist?(res['location']) # Go after any redirect and make sure you can access the redirected URL
  else
    res.code[0] != "4" # false if http code starts with 4 - error on your side.
  end
rescue SocketError, SystemCallError # unknown host, connection refused, etc.
  false # false if can't find or reach the server
end
3. Check if you can request it correctly and the server can answer it correctly
Just as the 4xx family tells you whether you can access the URL, the 5xx family tells you whether the server had a problem answering your request. An error in this family is usually caused by a problem on the server itself, and hopefully someone is working on fixing it. If you need to be able to access the page and get a correct answer now, you should make sure the answer is not from the 4xx or 5xx family and, if you were redirected, that the redirected page answers correctly. So, much as in (2), you can simply use the following code:
require "net/http"
def url_exist?(url_string)
  url = URI.parse(url_string)
  req = Net::HTTP.new(url.host, url.port)
  req.use_ssl = (url.scheme == 'https')
  path = url.path unless url.path.empty? # plain Ruby; present? needs ActiveSupport
  res = req.request_head(path || '/')
  if res.kind_of?(Net::HTTPRedirection)
    url_exist?(res['location']) # Go after any redirect and make sure you can access the redirected URL
  else
    ! %W(4 5).include?(res.code[0]) # Not from 4xx or 5xx families
  end
rescue SocketError, SystemCallError # unknown host, connection refused, etc.
  false # false if can't find or reach the server
end
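To compare the three interpretations side by side, the decision each one makes for a final (non-redirect) status code can be written as a small table-like helper. This is only an illustrative sketch; url_checks and its keys are made-up names, and redirects are assumed to have been followed already, as in (2) and (3).

```ruby
# Hypothetical helper: how each of the three checks treats a final status code.
#   :exists     -> check 1 (anything but 404)
#   :accessible -> check 2 (not in the 4xx family)
#   :healthy    -> check 3 (neither 4xx nor 5xx)
def url_checks(code)
  code = code.to_s
  family = code[0]
  {
    exists:     code != '404',
    accessible: family != '4',
    healthy:    !%w[4 5].include?(family)
  }
end

%w[200 403 404 500].each { |code| puts "#{code}: #{url_checks(code)}" }
```

A 403, for example, passes the strict existence check but fails the other two, while a 500 fails only the third.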
Net::HTTP works but if you can work outside stdlib, Faraday is better.
Faraday.head(the_url).status == 200
(200 is a success code, assuming that's what you meant by "exists".)
Simone's answer was very helpful to me.
Here is a version that returns true/false depending on URL validity, and which handles redirects:
require 'net/http'
require 'set'
def working_url?(url, max_redirects=6)
  response = nil
  seen = Set.new
  loop do
    url = URI.parse(url)
    break if seen.include? url.to_s
    break if seen.size > max_redirects
    seen.add(url.to_s)
    http = Net::HTTP.new(url.host, url.port)
    http.use_ssl = (url.scheme == 'https') # needed for https:// URLs
    response = http.request_head(url.request_uri) # "/" when the path is empty
    if response.kind_of?(Net::HTTPRedirection)
      url = response['location']
    else
      break
    end
  end
  response.kind_of?(Net::HTTPSuccess) && url.to_s
end
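One case a redirect loop like this can trip over is a relative Location header (for example /login), which URI.parse alone cannot turn into a host and port. A possible way to handle it, sketched with a hypothetical helper redirect_target, is to resolve the header against the URL that was just requested using URI.join.

```ruby
require 'uri'

# Hypothetical helper: resolve a Location header against the URL that
# produced it. Absolute locations pass through unchanged.
def redirect_target(current_url, location)
  URI.join(current_url, location).to_s
end

puts redirect_target('https://example.com/a/b', '/login')
puts redirect_target('http://example.com/', 'https://www.example.com/')
```

Inside the loop, the assignment from response['location'] would then go through this helper, using the URL of the request that returned the redirect.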
I'd like to add cookie support to a Ruby class that uses net/http to browse the web. Cookies have to be stored in a file so they survive after the script has ended. Of course I can read the specs and write some kind of handler, use some cookie.txt format and so on, but that seems like reinventing the wheel. Is there a better way to accomplish this task? Maybe some kind of cookie jar class to take care of cookies?
The accepted answer will not work if your server returns and expects multiple cookies. This could happen, for example, if the server returns a set of FedAuth[n] cookies. If this affects you, you might want to look into using something along the lines of the following instead:
require 'net/http'
http = Net::HTTP.new('example.com', 443) # host name only, without the scheme
http.use_ssl = true
path1 = '/index.html'
path2 = '/index2.html'
# make a request to get the server's cookies
response = http.get(path1)
if (response.code == '200')
  all_cookies = response.get_fields('set-cookie')
  cookies_array = Array.new
  all_cookies.each { | cookie |
    cookies_array.push(cookie.split('; ')[0])
  }
  cookies = cookies_array.join('; ')
  # now make a request using the cookies
  response = http.get(path2, { 'Cookie' => cookies })
end
Taken from DZone Snippets
require 'net/http'
http = Net::HTTP.new('profil.wp.pl', 443)
http.use_ssl = true
path = '/login.html'
# GET request -> so the host can set his cookies
resp = http.get(path) # current Ruby returns only the response object
cookie = resp['set-cookie'].split('; ')[0]
# POST request -> logging in
data = 'serwis=wp.pl&url=profil.html&tryLogin=1&countTest=1&logowaniessl=1&login_username=blah&login_password=blah'
headers = {
  'Cookie' => cookie,
  'Referer' => 'http://profil.wp.pl/login.html',
  'Content-Type' => 'application/x-www-form-urlencoded'
}
resp = http.post(path, data, headers)
# Output on the screen -> we should get either a 302 redirect (after a successful login) or an error page
puts 'Code = ' + resp.code
puts 'Message = ' + resp.message
resp.each {|key, val| puts key + ' = ' + val}
puts resp.body
update
# To save the cookies, you can use PStore
require 'pstore'
cookies = PStore.new("cookies.pstore")
# Save the cookie
cookies.transaction do
cookies[:some_identifier] = cookie
end
# Retrieve the cookie back
cookies.transaction do
cookie = cookies[:some_identifier]
end
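A self-contained version of the PStore idea, written as a sketch: save_cookie and load_cookie are hypothetical helper names, and the store lives in a temporary directory here so the example cleans up after itself. A real script would use a fixed path such as cookies.pstore.

```ruby
require 'pstore'
require 'tmpdir'

# Hypothetical helper: write one cookie string under a key.
def save_cookie(store_path, key, cookie)
  PStore.new(store_path).transaction { |store| store[key] = cookie }
end

# Hypothetical helper: read it back in a read-only transaction.
def load_cookie(store_path, key)
  PStore.new(store_path).transaction(true) { |store| store[key] }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'cookies.pstore')
  save_cookie(path, :some_identifier, 'sid=abc123')
  puts load_cookie(path, :some_identifier)
end
```

Because a fresh PStore object is opened for each call, the value survives between separate runs of the script.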
The accepted answer does not work. You need to access the internal representation of the response headers, where the multiple set-cookie values are stored separately, then remove everything after the first semicolon from each string and join them together. Here is code that works:
r = http.get(path)
cookie = {'Cookie'=>r.to_hash['set-cookie'].collect{|ea|ea[/^.*?;/]}.join}
r = http.get(next_path,cookie)
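The same idea can be written as a small helper that also copes with cookies that have no attributes (and therefore no semicolon) and with a response that sets no cookies at all. This is a sketch under those assumptions; cookie_header is a hypothetical name, and it ignores expiry, path and domain rules entirely.

```ruby
# Hypothetical helper: turn a list of Set-Cookie header values into a single
# Cookie header value, keeping only the name=value pair of each cookie.
def cookie_header(set_cookie_values)
  Array(set_cookie_values)
    .map { |value| value.split(';').first.to_s.strip }
    .reject(&:empty?)
    .join('; ')
end

puts cookie_header(['a=1; Path=/; HttpOnly', 'b=2'])
```

With Net::HTTP it could be fed from r.get_fields('set-cookie'), which returns nil when the header is absent.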
Use http-cookie, which implements RFC-compliant parsing and rendering, plus a jar.
A crude example that happens to follow a redirect post-login:
require 'uri'
require 'net/http'
require 'http-cookie'
uri = URI('...')
jar = HTTP::CookieJar.new
Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
req = Net::HTTP::Post.new uri
req.form_data = { ... }
res = http.request req
res.get_fields('Set-Cookie').each do |value|
jar.parse(value, req.uri)
end
fail unless res.code == '302'
req = Net::HTTP::Get.new(uri + res['Location'])
req['Cookie'] = HTTP::Cookie.cookie_value(jar.cookies(uri))
res = http.request req
end
Why do this? Because the answers above are incredibly insufficient and flat out don't work in many RFC-compliant scenarios (happened to me), so relying on the very lib implementing just what's needed is infinitely more robust if you want to handle more than one particular case.
I've used Curb and Mechanize for a similar project.
Just enable cookies support and save the cookies to a temp cookiejar...
If you're using net/http or packages without built-in cookie support, you will need to write your own cookie handling.
You can send and receive cookies using headers.
You can store the header in any persistence framework, whether that is some sort of database or files.