How to add a new field to a Mechanize form (Ruby/Mechanize)

Is there a public method for adding a field to a Mechanize form?
I tried:
#login_form.field.new('auth_login','Login')
#login_form.field.new('auth_login','Login')
and both give me the error undefined method "new" for #<WWW::Mechanize::Form::Field:0x3683cbc> (NoMethodError)
I also tried login_form.field.new('auth_login','Login'), which gives me the error
mechanize-0.9.3/lib/www/mechanize/page.rb:13:in `meta': undefined method `search' for nil:NilClass (NoMethodError)
but only at the time I submit the form. The field does not exist in the HTML source. I want to add it so that the POST query sent by my script contains auth_username=myusername&auth_password=mypassword&auth_login=Login. So far it sends only auth_username=radek&auth_password=mypassword, which might be why I cannot log in. That is just my guess.
The script looks like
require 'rubygems'
require 'mechanize'
require 'logger'
agent = WWW::Mechanize.new {|a| a.log = Logger.new("loginYOTA.log") }
agent.follow_meta_refresh = true #Mechanize does not follow meta refreshes by default, we need to set that option.
page = agent.get("http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1")
login_form = page.form_with(:method => 'POST')
puts login_form.buttons.inspect
puts page.forms.inspect
#STDIN.gets
login_form.fields.each { |f| puts "#{f.name} : #{f.value}" }
login_form['auth_username'] = 'radeks'
login_form['auth_password'] = 'TestPass01'
#login_form['auth_login'] = 'Login'
#login_form.field.new('auth_login','Login')
#login_form.field.new('auth_login','Login')
#login_form.fields.each { |f| puts "#{f.name} : #{f.value}" }
#STDIN.gets
page = agent.submit login_form
#Display welcome message if logged in
puts page.parser.xpath("/html/body/div/div/div/table/tr/td[2]/div/strong").xpath('text()').to_s.strip
puts
puts page.parser.xpath("/html/body/div/div/div/table/tr/td[2]/div").xpath('text()').to_s.strip
output = File.open("login.html", "w") {|f| f.write(page.parser.to_html) }
The .inspect of the form looks like
[#<WWW::Mechanize::Form
{name nil}
{method "POST"}
{action
"http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1"}
{fields
#<WWW::Mechanize::Form::Field:0x36946c0 @name="auth_username", @value="">
#<WWW::Mechanize::Form::Field:0x369451c @name="auth_password", @value="">}
{radiobuttons}
{checkboxes}
{file_uploads}
{buttons
#<WWW::Mechanize::Form::Button:0x36943b4
@name="auth_login",
@value="Login">}>
]

I think what you're looking for is
login_form.add_field!(field_name, value = nil)
Here are the docs:
http://rdoc.info/projects/tenderlove/mechanize
There is little difference between this and calling WWW::Mechanize::Form::Field.new yourself; add_field! is simply one of the few ways to add a field to a form. Here is how add_field! is implemented, and it does exactly what you would expect: it instantiates a Field object, then appends it to the form's 'fields' array. You wouldn't be able to do this in your own code because the method "fields<<" is a private method inside "Form."
# File lib/www/mechanize/form.rb, line 65
def add_field!(field_name, value = nil)
fields << Field.new(field_name, value)
end
On a side note, according to the docs you should be able to do the first variation you proposed:
login_form['field_name']='value'
Hope this helps!

Another way to add a new field is to do so at the time of posting the form:
page = agent.post( url, {'auth_username'=>'myusername', #existing field
'auth_password'=>'mypassword', #existing field
'auth_login'=>'Login'}) #new field
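To see what request body such a hash produces before sending anything, the standard library can encode it. This is only a sketch: it uses URI.encode_www_form rather than Mechanize itself, on the assumption that the form is posted as ordinary URL-encoded data, and the values are the placeholders from the question.

```ruby
require 'uri'

# Placeholder credentials from the question; 'auth_login' is the extra
# parameter that the browser sends for the submit button.
params = {
  'auth_username' => 'myusername', # existing field
  'auth_password' => 'mypassword', # existing field
  'auth_login'    => 'Login'       # new field
}

# Encode the hash the way a URL-encoded form POST body is encoded.
body = URI.encode_www_form(params)
puts body
```

Comparing this string with the body the browser sends is a quick way to confirm that no parameter is missing.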

Related

Why am I getting an UnsupportedSchemeError?

I am writing a program that uses Mechanize to scrape a student's grades and classes from edline.net using a student account and return the data I need. However, after logging in, from the homepage I have to access a link (called 'Private Reports') which will then dynamically return a page of links to each of the student's classes and respective grades.
When testing, I create a new object my_account that has several instance variables, including the homepage (I pointed new variables to the instance variables to make this simpler to read):
result_page = agent.page.link_with(:text => 'Private Reports').click
I get:
Mechanize::UnsupportedSchemeError
But if I were to replace :click with :text it responds correctly and result_page will equal the link's text "Private Reports"
Why does it respond to :text correctly but give an error for :click? Is there a way to get around this or should I rethink my solution to this problem?
Here's the code:
class AccountFetcher
EDLINE_LOGIN_URL = 'http://edline.net/Index.page'
def self.fetch_form(agent)
# uses @agent of an Account object
page = agent.get(EDLINE_LOGIN_URL)
form = page.form('authenticationEntryForm')
end
# edline's login form has to have several pre-conditions met before submitting the form
def self.initialize_form(agent)
form = AccountFetcher.fetch_form(agent)
submit_event = form.field_with(:name => 'submitEvent')
enter_clicked = form.field_with(:name => 'enterClicked')
ajax_support = form.field_with(:name => 'ajaxSupported')
ajax_support.value = 'yes'
enter_clicked.value = true
submit_event.value = 1
return form
end
# logs the user in and returns the homepage
def self.fetch_homepage(u_username, u_password, agent)
form = AccountFetcher.initialize_form(agent)
username = form.field_with(:name => 'screenName')
password = form.field_with(:name => 'kclq')
username.value = u_username
password.value = u_password
form.submit
end
end
# class Account will be expanded later on but here are the bare bones for a user to log in to their account
class Account
attr_accessor :report, :agent, :username, :password
def initialize(u_username, u_password)
@agent = Mechanize.new
@username = u_username
@password = u_password
end
def login
page = AccountFetcher.fetch_homepage(self.username, self.password, self.agent)
@report = page
end
end
my_account = Account.new('ex_username', 'ex_password')
my_account.login
page = my_account.report
agent = my_account.agent
page = agent.page.link_with(:text => 'Private Reports').click
Where does the link point to? I.e., what's the href?
I ask because "scheme" usually refers to something like http or https or ftp, so maybe the link has a weird scheme that mechanize doesn't know how to handle, hence Mechanize::UnsupportedSchemeError
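One way to follow up on that suggestion is to look at the scheme of the href before clicking. The sketch below uses only the standard library; the helper name fetchable_href? and the list of schemes treated as fetchable are assumptions made for illustration, not Mechanize's own rules, and the javascript: href is a made-up example rather than the real edline link.

```ruby
require 'uri'

# Hypothetical helper: true when the href is relative (no scheme) or uses
# one of the given schemes. The default list is an assumption for this sketch.
def fetchable_href?(href, schemes = %w[http https file])
  scheme = URI.parse(href).scheme
  scheme.nil? || schemes.include?(scheme.downcase)
rescue URI::InvalidURIError
  # An href that cannot be parsed at all is not something we can fetch.
  false
end

puts fetchable_href?('http://edline.net/Index.page')
puts fetchable_href?('javascript:void(0)')
```

In the script from the question, the value to check would be the href of the link returned by link_with(:text => 'Private Reports').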

Mechanize Page.Form.Action POST for multiple INPUT tags with same NAME / VALUE

I need to post to an existing web page (no login required) and send the submit parameters, but the page contains multiple forms whose submit inputs have identical NAME and VALUE attributes. For example, this INPUT is repeated 3 times on the same page under different FORM tags:
<INPUT TYPE='Submit' NAME='submit_button' VALUE='Submit Query'>
My Ruby code runs OK when identifying the fields on the forms, but fails on the page.forms[x].action post with 405 HTTPMethodNotAllowed for https://pdb.nipr.com/html/PacNpnSearch -- unhandled response.
Ruby code:
class PostNIPR2
def post(url)
button_count = 0
agent = Mechanize.new
page = agent.get(url)
page.forms.each do |form|
form.buttons.each do |button|
if(button.value == 'Submit Query')
button_count = button_count + 1
if (button_count == 3)
btn_submit_license = button.name
puts button
puts btn_submit_license
puts button.value
end
end
end
end
begin
uform = page.forms[1]
uform.license = "0H20649"
uform.state = "CA"
uform.action = 'https://pdb.nipr.com/html/PacNpnSearch'
rescue Exception => e
error_page = e.page
end
page = agent.submit(uform)
end
url = "https://pdb.nipr.com/html/PacNpnSearch.html"
p = PostNIPR2.new
p.post(url)
end
Is your question how to select that button? If so:
form.button_with(:name => 'submit_button')
or submit the form like this:
next_page = form.submit form.button_with(:name => 'submit_button')
Also, you are changing the form's action for some reason, and that would explain the 405s.
You are correct, and sorry about the commented-out code. The question was how to update form.license and form.state with the input params and then have form.submit post the form with form.button_with(:name => 'Submit Query'). I did this and received the 405 HTTPMethodNotAllowed for https://pdb.nipr.com/html/PacNpnSearch -- unhandled response. I have now changed the code to use agent.page.form_with(:name => 'license_form'), which correctly finds the form I need to post to; then I get the button with form.button_with(:value => 'Submit Query') and call agent.submit(form, button). Now I get the correct result.

How to set the Referer header before loading a page with Ruby mechanize?

Is there a straightforward way to set custom headers with Mechanize 2.3?
I tried a former solution but get:
$agent = Mechanize.new
$agent.pre_connect_hooks << lambda { |p|
p[:request]['Referer'] = 'https://wwws.mysite.com/cgi-bin/apps/Main'
}
# ./mech.rb:30:in `<main>': undefined method `pre_connect_hooks' for nil:NilClass (NoMethodError)
The docs say:
get(uri, parameters = [], referer = nil, headers = {}) { |page| ... }
so for example:
agent.get 'http://www.google.com/', [], agent.page.uri, {'foo' => 'bar'}
alternatively you might like:
agent.request_headers = {'foo' => 'bar'}
agent.get url
You misunderstood the code you were copying. There was a newline in the example, but it disappeared in the formatting as it wasn't tagged as code. $agent contains nil since you're trying to use it before it has been initialized. You must initialize the object and then use it. Just try this:
$agent = Mechanize.new
$agent.pre_connect_hooks << lambda { |p| p[:request]['Referer'] = 'https://wwws.mysite.com/cgi-bin/apps/Main' }
For this question I noticed people seem to use:
page = agent.get("http://www.you.com/index_login/", :referer => "http://www.you.com/")
As an aside, now that I tested this answer, it seems this was not the issue behind my actual problem: that every visit to a site I'm scraping requires going through the login sequence pages again, even seconds later after the first logged-in visit, despite that I'm always loading and saving the complete cookie jar in yaml format. But that would lead to another question of course.
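The assignment inside the hook above is ordinary header assignment on a request object. A minimal sketch with plain Net::HTTP, assuming that p[:request] is a Net::HTTP request as the snippet implies, shows the same call without Mechanize and without sending anything over the network:

```ruby
require 'net/http'
require 'uri'

uri = URI('https://wwws.mysite.com/cgi-bin/apps/Main')

# Build a GET request object; nothing is sent at this point.
request = Net::HTTP::Get.new(uri)

# The same kind of assignment the hook performs on p[:request].
request['Referer'] = 'https://wwws.mysite.com/cgi-bin/apps/Main'

puts request['Referer']
```

Header names are matched case-insensitively on such a request, so request['referer'] reads the same value.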

How to submit formstack form using ruby?

I have a form similar to THIS and want to submit data to it from a CSV file using Ruby. Here is what I have been trying to do:
require 'uri'
require 'net/http'
params = {
'field15157482-first' => 'bip',
'field15157482-last' => 'bop',
'field15157485' => 'bip@bob.com',
'field15157487' => 'option1',
'fsSubmitButton1196962' => 'Submit'
}
x = Net::HTTP.post_form(URI.parse('http://www.formstack.com/forms/?1196833-GxMTxR20GK'), params)
I keep getting "A valid form ID was not supplied." I have a hunch I am using the wrong URL, but I don't know what to replace it with.
I would use the API, but I don't have access to the token, hence my stone-age approach. Any suggestions would be much appreciated.
The form uses hidden variables and cookies to attempt to maintain a "unique session". Fortunately, Mechanize makes handling 'sneaky' forms quite easy.
require "mechanize"
form_uri = "http://www.formstack.com/forms/?1196962-617Z6Foyif"
@agent = Mechanize.new
page = @agent.get form_uri
form = page.forms[0]
form.fields_with(:class => /fsField/).each do |field|
field.value = case field.name
when /first/ then "First Name"
when /last/ then "Last Name"
else "email@address.com"
end
end
page = form.submit form.buttons.first
puts
puts "=== Response Header"
puts
puts page.header
puts
puts "=== Response Body"
puts
puts page.body
Looking at the source of http://www.formstack.com/forms/?1196833-GxMTxR20GK and the example in your link, it appears that formstack forms post to index.php and require a form ID to be passed in to identify which form is being submitted. Looking at the forms in both examples, you'll see a field similar to this:
<input type="hidden" name="form" value="1196833" />
Try adding the following to your params hash:
'form' => '1196833' # or other appropriate form value
You may also need to include the other hidden fields for a valid submit.
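As a rough illustration of collecting those hidden fields, the sketch below pulls them out of a sample of markup with a regular expression and merges them into the params hash. The HTML string is a stand-in modeled on the tag quoted above, and the pattern assumes the attributes appear in exactly that order, so a real HTML parser would be more robust on a live page.

```ruby
require 'uri'

# Sample markup modeled on the hidden input quoted above (not the live page).
html = '<form><input type="hidden" name="form" value="1196833" /></form>'

# Fragile on purpose: assumes type, name, value appear in this exact order.
hidden_input = /<input\s+type="hidden"\s+name="([^"]*)"\s+value="([^"]*)"/i

# Turn each [name, value] match into a hash entry.
hidden = html.scan(hidden_input).to_h

# Merge the hidden fields into the visible form values from the question.
params = {
  'field15157482-first' => 'bip',
  'field15157482-last'  => 'bop'
}.merge(hidden)

puts URI.encode_www_form(params)
```

The resulting hash can then be passed to Net::HTTP.post_form in place of the original params.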

post form parameters difference between Firefox and Ruby Mechanize

I am trying to figure out if mechanize sends correct post query.
I want to log in to a forum (please see the HTML source and Mechanize log in my other question), but I only get the login page again. Looking into it, I can see that Firefox sends a POST with parameters like
auth_username=myusername&auth_password=mypassword&auth_login=Login but my script sends
auth_username=radek&auth_password=mypassword. Is that OK, or must the &auth_login=Login part be present?
When I tried to add it using login_form['auth_login'] = 'Login' I got the error gems/mechanize-0.9.3/lib/www/mechanize/page.rb:13:in `meta': undefined method `search' for nil:NilClass (NoMethodError)
It seems to me that auth_login is a form button, not a field (I don't know if it matters):
[#<WWW::Mechanize::Form
{name nil}
{method "POST"}
{action
"http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1"}
{fields
#<WWW::Mechanize::Form::Field:0x36946c0 @name="auth_username", @value="">
#<WWW::Mechanize::Form::Field:0x369451c @name="auth_password", @value="">}
{radiobuttons}
{checkboxes}
{file_uploads}
{buttons
#<WWW::Mechanize::Form::Button:0x36943b4
@name="auth_login",
@value="Login">}>
]
My script is as follows:
require 'rubygems'
require 'mechanize'
require 'logger'
agent = WWW::Mechanize.new {|a| a.log = Logger.new("loginYOTA.log") }
agent.follow_meta_refresh = true #Mechanize does not follow meta refreshes by default, we need to set that option.
page = agent.get("http://www.somedomain.com/login?auth_successurl=http://www.somedomain.com/forum/yota?baz_r=1")
login_form = page.form_with(:method => 'POST') #works
puts login_form.buttons.inspect
puts page.forms.inspect
STDIN.gets
login_form.fields.each { |f| puts "#{f.name} : #{f.value}" }
#STDIN.gets
login_form['auth_username'] = 'myusername'
login_form['auth_password'] = 'mypassword'
login_form['auth_login'] = 'Login'
STDIN.gets
page = agent.submit login_form
#Display message if logged in
puts page.parser.xpath("/html/body/div/div/div/table/tr/td[2]/div/strong").xpath('text()').to_s.strip
puts
puts page.parser.xpath("/html/body/div/div/div/table/tr/td[2]/div").xpath('text()').to_s.strip
output = File.open("login.html", "w") {|f| f.write(page.parser.to_html) }
You can find more code, HTML, and the log in my other related question, "log in with browser and then ruby/mechanize takes it over?"
The absence of one parameter, compared to what Firefox sends in its POST, caused Mechanize not to log in. Adding the new parameter solved the problem, so it seems that the web server requires the &auth_login=Login parameter to be in the POST.
You can read how to add a new field to a Mechanize form in another question.
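The comparison between the two POST bodies can also be done mechanically. This sketch uses only the standard library and the two query strings quoted in the question; it lists the parameter names that the browser sends but the script does not.

```ruby
require 'uri'

# The two request bodies quoted in the question.
browser_body = 'auth_username=myusername&auth_password=mypassword&auth_login=Login'
script_body  = 'auth_username=radek&auth_password=mypassword'

# Decode each body into [name, value] pairs and keep only the names.
browser_keys = URI.decode_www_form(browser_body).map(&:first)
script_keys  = URI.decode_www_form(script_body).map(&:first)

# Names present in the browser request but absent from the script's request.
missing = browser_keys - script_keys
puts "Missing from script: #{missing.join(', ')}"
```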
