Log in to vk.com with Net::HTTP.post_form (Ruby)

I want to log in to vk.com or m.vk.com with Ruby, but my code doesn't work.
require 'net/http'
email = "qweqweqwe@gmail.com"
pass = "qeqqweqwe"
userUri = URI('m.vk.com/index.html')
Net::HTTP.get(userUri)
res = Net::HTTP.post_form(userUri, 'email' => email, 'pass' => pass)
puts res.body

First of all, you need to change userUri to the following:
userUri = URI('https://login.vk.com/?act=login')
This is where the vk site expects your login parameters.
I'm not very familiar with vk, but you probably need a way to handle the session cookie: both receiving it and sending it back on later requests. Can you elaborate on what you're doing after login?
Here is the net/http info for cookie handling:
# Headers
res['Set-Cookie'] # => String
res.get_fields('set-cookie') # => Array
res.to_hash['set-cookie'] # => Array
puts "Headers: #{res.to_hash.inspect}"
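To make the "receive it, then send it back" step concrete, here is a minimal sketch that turns Set-Cookie response headers into a Cookie request header. The helper name cookie_header and the sample cookie values are made up for illustration, and the sketch ignores cookie attributes such as Path, Domain and Expires, which a real client (or Mechanize) would honor.

```ruby
# Build a Cookie request header from Set-Cookie response headers.
# Only the leading name=value pair of each Set-Cookie line is kept;
# attributes such as Path, Expires and HttpOnly are dropped.
# cookie_header is a hypothetical helper, not part of net/http.
def cookie_header(set_cookie_fields)
  Array(set_cookie_fields).map { |field| field.split(';', 2).first.strip }.join('; ')
end

# Hypothetical usage against a live server (variables are placeholders):
#   res = Net::HTTP.post_form(login_uri, 'email' => email, 'pass' => pass)
#   req = Net::HTTP::Get.new(next_uri)
#   req['Cookie'] = cookie_header(res.get_fields('set-cookie'))

# Made-up sample headers so the sketch runs without a network.
sample = ['sid=abc123; path=/; HttpOnly',
          'lang=en; expires=Wed, 01 Jan 2031 00:00:00 GMT']
puts cookie_header(sample)
```

Because get_fields returns nil when the header is absent, the helper wraps its argument in Array() and returns an empty string in that case.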

This kind of task is exactly what Mechanize is for. Mechanize handles redirects and cookies automatically. You can do something like this:
require 'mechanize'
agent = Mechanize.new
url = "http://m.vk.com/login/"
page = agent.get(url)
form = page.forms[0]
form['email'] = "qweqweqwe@gmail.com"
form['pass'] = "qeqqweqwe"
form.submit
puts agent.page.body

Related

How do I follow URL redirection?

I have a URL and I need to retrieve the URL it redirects to (the number of redirections is arbitrary).
One real example I'm working on is:
https://www.google.com/url?q=http://m.zynga.com/about/privacy-center/privacy-policy&sa=D&usg=AFQjCNESJyXBeZenALhKWb52N1vHouAd5Q
which will eventually redirect to:
http://company.zynga.com/privacy/policy
which is the URL I'm interested in.
I tried with open-uri as follows:
privacy_url = "https://www.google.com/url?q=http://m.zynga.com/about/privacy-center/privacy-policy&sa=D&usg=AFQjCNESJyXBeZenALhKWb52N1vHouAd5Q"
final_url = nil
open(privacy_url) do |h|
puts "Redirecting to #{h.base_uri}"
final_url = h.base_uri
end
but I keep getting the original URL back, meaning that final_url is equal to privacy_url.
Is there any way to follow this kind of redirection and programmatically access the resulting URL?
I finally made it work using the Mechanize gem. The key is to enable the follow_meta_refresh option, which is disabled by default.
Here's how:
require 'mechanize'
browser = Mechanize.new
browser.follow_meta_refresh = true
start_url = "https://www.google.com/url?q=http://m.zynga.com/about/privacy-center/privacy-policy&sa=D&usg=AFQjCNESJyXBeZenALhKWb52N1vHouAd5Q"
final_url = nil
browser.get(start_url) do |page|
final_url = page.uri.to_s
end
puts final_url # => http://company.zynga.com/privacy/policy
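For context on what follow_meta_refresh deals with: a meta refresh is a redirect expressed in the HTML body rather than in a 3xx status and Location header, which would explain why a client that only follows HTTP redirects stays on the first URL. The sketch below pulls the target out of such a tag with a regular expression. The helper name, the regex and the sample HTML are illustrative assumptions, not Mechanize's implementation; the regex also assumes http-equiv appears before content in the tag.

```ruby
# Extract the target URL of an HTML meta refresh tag, or nil if none.
# A regex is enough for a sketch; a real HTML parser is more robust.
# Assumes the http-equiv attribute comes before the content attribute.
META_REFRESH = /<meta[^>]*http-equiv\s*=\s*["']?refresh["']?[^>]*content\s*=\s*["']\s*\d+\s*;\s*url\s*=\s*['"]?([^'">]+)/i

def meta_refresh_target(html)
  match = META_REFRESH.match(html)
  match && match[1].strip
end

# Made-up page that redirects through a meta refresh tag.
html = %q(<html><head><meta http-equiv="refresh" content="0;URL='http://example.com/next'"></head></html>)
puts meta_refresh_target(html)
```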

Writing autologin using Net::Http in Ruby

http://ruby-doc.org/stdlib-1.8.7/libdoc/net/http/rdoc/Net/HTTP.html
After reading the documentation carefully, I wrote the following code snippet for the autologin feature of my program:
url = URI.parse('http://localhost/login.aspx')
req = Net::HTTP::Post.new(url.path)
req.basic_auth 'username'
The target page asks only for the correct user name; no password is needed to log in. The basic_auth method requires two parameters, user name and password, and if I leave one out I get an error. I tried writing req.basic_auth 'username', '', but I still cannot log in.
Could anyone kindly give me a hint?
More info:
I know that req.basic_auth 'username', '' is not working because the next line, which does an automatic form submission, follows right after it:
x = Net::HTTP.post_form(URI.parse("http://localhost/NewTask.aspx"), params)
puts x.body
The output was the body of the page that redirects back to the login page.
You could consider using the Ruby Mechanize gem. A login example (adapted from the official site) is much simpler; for your case you will not need the agent certificate and private key settings:
require 'rubygems'
require 'mechanize'
# create Mechanize instance
agent = Mechanize.new
# set the path of the certificate file
agent.cert = 'example.cer'
# set the path of the private key file
agent.key = 'example.key'
# get the login form & fill it out with the username/password
login_form = agent.get("http://example.com/login_page").form('Login')
login_form.Userid = 'TestUser'
login_form.Password = 'TestPassword'
# submit login form
agent.submit(login_form, login_form.buttons.first)
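One plausible reason the basic_auth attempt had no effect, assuming login.aspx is an ordinary HTML form login (the question does not confirm this): basic_auth only adds an HTTP Authorization header, whereas a form login expects the user name as a POSTed field. The sketch below shows what the call actually puts on the request; the path is taken from the question and nothing is sent over the network.

```ruby
require 'net/http'

# Build the request from the question without sending it.
req = Net::HTTP::Post.new('/login.aspx')
req.basic_auth('username', '')

# basic_auth only sets this header; it does not fill in any form field.
auth_header = req['Authorization']
puts auth_header

# The part after "Basic " is Base64 of "username:" (empty password).
decoded = auth_header.split(' ', 2).last.unpack1('m')
puts decoded
```

If the page is a form login, submitting the form fields (as the Mechanize example does) is the approach that matches what the server reads.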

How to manually add a cookie to Mechanize state?

I'm working in Ruby, but my question is valid for other languages as well.
I have a Mechanize-driven application. The server I'm talking to sets a cookie using JavaScript (rather than standard set-cookie), so Mechanize doesn't catch the cookie. I need to pass that cookie back on the next GET request.
The good news is that I already know the value of the cookie, but I don't know how to tell Mechanize to include it in my next GET request.
I figured it out by extrapolation (and reading sources):
agent = Mechanize.new
...
cookie = Mechanize::Cookie.new(key, value)
cookie.domain = ".oddity.com"
cookie.path = "/"
agent.cookie_jar.add(cookie)
...
page = agent.get("https://www.oddity.com/etc")
Seems to do the job just fine.
update
As @Benjamin Manns points out, Mechanize now wants a URL in the add method. Here's the amended recipe, assuming that you've done a GET using the agent and that the last page visited is on the cookie's domain (this saves a URI.parse()):
agent = Mechanize.new
...
cookie = Mechanize::Cookie.new(key, value)
cookie.domain = ".oddity.com"
cookie.path = "/"
agent.cookie_jar.add(agent.history.last.uri, cookie)
These answers are old, so to bring this up to date, these days it looks more like this:
cookie = Mechanize::Cookie.new :domain => '.mydomain.com', :name => name, :value => value, :path => '/', :expires => (Date.today + 1).to_s
agent.cookie_jar << cookie
I wanted to add my experience of passing cookies specifically from Selenium to Mechanize.
Get the cookies from your Selenium driver:
sel_driver = Selenium::WebDriver.for :firefox
sel_driver.navigate.to('https://sample.com/javascript_login')
#login
sel_cookies = sel_driver.manage.all_cookies
The :expires value of a Selenium cookie is a DateTime object or blank.
However, the :expires value of a Mechanize cookie (a) must be a string and (b) cannot be blank:
sel_cookies.each do |c|
if c[:expires].blank?
c[:expires] = (DateTime.now + 10.years).to_s #arbitrary date in the future
else
c[:expires] = c[:expires].to_s
end
end
Now instantiate them as Mechanize cookies and place them in the cookie jar:
mech_agent = Mechanize.new
sel_cookies.each { |c| mech_agent.cookie_jar << Mechanize::Cookie.new(c) }
mech_agent.get 'https://sample.com/html_pages'
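Note that blank? and 10.years in the conversion step come from ActiveSupport, so that loop fails in a plain Ruby script. A standard-library-only sketch of the same normalization might look like this; the helper name stringify_expires, the one-year default and the sample cookie hashes are assumptions for illustration, and the Mechanize calls are left out so the block runs without any gems.

```ruby
require 'date'

# Return copies of the cookie hashes with :expires converted to a String.
# Missing or empty values fall back to default_expiry.
# stringify_expires is a hypothetical helper name.
def stringify_expires(cookies, default_expiry)
  cookies.map do |cookie|
    expires = cookie[:expires]
    expires = default_expiry if expires.nil? || expires.to_s.empty?
    cookie.merge(expires: expires.to_s)
  end
end

one_year_ahead = DateTime.now >> 12 # shift by 12 months; arbitrary choice

# Made-up hashes shaped like the cookies in the answer above.
sample = [
  { name: 'session', value: 'abc', expires: nil },
  { name: 'prefs', value: 'dark', expires: DateTime.new(2030, 1, 1) }
]
stringify_expires(sample, one_year_ahead).each { |c| puts c.inspect }
```

The helper returns new hashes instead of mutating the originals, so the Selenium cookie data stays untouched.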
You can also try this:
Mechanize::Cookie.parse(url, "SessionCookie=#{sessid}",
Logger.new(STDOUT)) { |c| agent.cookie_jar.add(url, c) }
source: http://twitter.com/#!/calebcrane/status/51683884341002240
response.to_hash.fetch("set-cookie").each do |c|
agent.cookie_jar.parse c
end
Here, response is a standard-library Net::HTTP response object, such as Net::HTTPOK.

post form parameters difference between Firefox and Ruby Mechanize

I am trying to figure out whether Mechanize sends the correct POST query.
I want to log in to a forum (see the HTML source and Mechanize log in my other question), but I only get the login page again. Looking into it, I can see that Firefox sends a POST with these parameters:
auth_username=myusername&auth_password=mypassword&auth_login=Login
but my script sends:
auth_username=radek&auth_password=mypassword
Is that OK, or must the &auth_login=Login part be present?
When I tried to add it using login_form['auth_login'] = 'Login', I got this error:
gems/mechanize-0.9.3/lib/www/mechanize/page.rb:13:in `meta': undefined method `search' for nil:NilClass (NoMethodError)
It seems to me that auth_login is a form button, not a field (I don't know whether that matters).
[#<WWW::Mechanize::Form
{name nil}
{method "POST"}
{action
"http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1"}
{fields
#<WWW::Mechanize::Form::Field:0x36946c0 @name="auth_username", @value="">
#<WWW::Mechanize::Form::Field:0x369451c @name="auth_password", @value="">}
{radiobuttons}
{checkboxes}
{file_uploads}
{buttons
#<WWW::Mechanize::Form::Button:0x36943b4
@name="auth_login",
@value="Login">}>
]
My script is as follows:
require 'rubygems'
require 'mechanize'
require 'logger'
agent = WWW::Mechanize.new {|a| a.log = Logger.new("loginYOTA.log") }
agent.follow_meta_refresh = true #Mechanize does not follow meta refreshes by default, we need to set that option.
page = agent.get("http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1")
login_form = page.form_with(:method => 'POST') #works
puts login_form.buttons.inspect
puts page.forms.inspect
STDIN.gets
login_form.fields.each { |f| puts "#{f.name} : #{f.value}" }
#STDIN.gets
login_form['auth_username'] = 'myusername'
login_form['auth_password'] = 'mypassword'
login_form['auth_login'] = 'Login'
STDIN.gets
page = agent.submit login_form
#Display message if logged in
puts page.parser.xpath("/html/body/div/div/div/table/tr/td[2]/div/strong").xpath('text()').to_s.strip
puts
puts page.parser.xpath("/html/body/div/div/div/table/tr/td[2]/div").xpath('text()').to_s.strip
output = File.open("login.html", "w") {|f| f.write(page.parser.to_html) }
You can find more code, HTML and log output in my other related question, "log in with browser and then ruby/mechanize takes it over?"
The absence of one parameter, compared with what Firefox sends in its POST, prevented Mechanize from logging in. Adding the missing parameter solved the problem, so it seems that the web server requires the &auth_login=Login parameter to be in the POST.
You can read how to add a new field to a Mechanize form in another question.
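To see the difference between the two request bodies, here is a small standard-library sketch that encodes the form fields with and without the button's name and value. The parameter names and values come from the question; the variable names are arbitrary.

```ruby
require 'uri'

# Fields Mechanize sent, and the button pair that Firefox also sent.
fields = { 'auth_username' => 'myusername', 'auth_password' => 'mypassword' }
button = { 'auth_login' => 'Login' }

without_button = URI.encode_www_form(fields)
with_button    = URI.encode_www_form(fields.merge(button))

puts without_button
puts with_button
```

A browser includes the clicked submit button's name and value in the POST body. With Mechanize, passing the button as the second argument to submit, as in agent.submit(login_form, login_form.buttons.first), is one way to have it included.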

How to get redirect log in Mechanize?

In Ruby, if you use Mechanize to follow 301/302 redirects like this:
require 'mechanize'
m = WWW::Mechanize.new
m.get('http://google.com')
how do I get the list of pages Mechanize was redirected through? (For example, http://google.com => http://www.google.com => http://google.com.ua)
Here is the code in Mechanize responsible for redirection:
elsif res_klass <= Net::HTTPRedirection
return page unless follow_redirect?
log.info("follow redirect to: #{ response['Location'] }") if log
from_uri = page.uri
raise RedirectLimitReachedError.new(page, redirects) if redirects + 1 > redirection_limit
redirect_verb = options[:verb] == :head ? :head : :get
page = fetch_page( :uri => response['Location'].to_s,
:referer => page,
:params => [],
:verb => redirect_verb,
:redirects => redirects + 1
)
@history.push(page, from_uri)
return page
but m.history.map {|p| puts p.uri} prints the URI of the last page three times.
The key here is to take advantage of Mechanize's built-in logging. Here's a full code sample using Ruby's standard Logger class.
require 'mechanize'
require 'logger'
mechanize_logger = Logger.new('log/mechanize.log')
mechanize_logger.level = Logger::INFO
url = 'http://google.com'
agent = Mechanize.new
agent.log = mechanize_logger
agent.get(url)
And then check the output of log/mechanize.log in your log directory and you'll see the whole mechanize process including the intermediate urls.
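If you need the redirect chain as data rather than reading the log by eye, one option is to scan the log text for the "follow redirect to:" message shown in the Mechanize source excerpt in the question. The sketch below assumes that message format (other Mechanize versions may word it differently) and simulates the log with a StringIO so it runs without Mechanize or a network; redirect_targets is a hypothetical helper name.

```ruby
require 'logger'
require 'stringio'

# Collect the targets of "follow redirect to: ..." lines from log text.
def redirect_targets(log_text)
  log_text.scan(/follow redirect to: (\S+)/).flatten
end

# Stand-in for a real Mechanize run: write two lines in the same format.
# With Mechanize you would set agent.log = logger before calling get.
io = StringIO.new
logger = Logger.new(io)
logger.info('follow redirect to: http://www.google.com/')
logger.info('follow redirect to: http://www.google.com.ua/')

puts redirect_targets(io.string)
```

For a log file on disk, redirect_targets(File.read('log/mechanize.log')) would apply the same scan.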
I'm not certain, but here are a couple of things to try.
First, see what's in m.history[i].uri after the get().
Second, you might need something like:
for m.redirection_limit in 0..99
begin
m.get(url)
break
rescue WWW::Mechanize::RedirectLimitReachedError
# code here could get control at
# intermediate redirection levels
end
end

Resources