Scrapy follow javascript input button - xpath

I have the following inputs on the page:
<input name="ct99" value="" id="ct99" class="GetData" type="submit">
<input name="ct92" value="" id="ct92" class="GetData" type="submit">
<input name="ct87" value="" id="ct87" class="GetData" type="submit">
The GetData class displays a clickable icon. Clicking it opens a new page, which some JavaScript takes care of. How can I follow this?
I already tried the code below just to see whether Scrapy follows the inputs, but without success.
# Methods of the spider class
def parse(self, response):
    sel = Selector(response)
    # Attributes are selected with @ in XPath
    links = sel.xpath("//input[@class='GetData']").extract()
    for data in links:
        yield scrapy.FormRequest.from_response(response,
            formdata={}, callback=self.after_click)

def after_click(self, response):
    url = response.url
    print('\nURL', url)

There are two common ways to approach the problem:
1) Using the browser developer tools (Network tab), inspect the requests sent when you click a particular button, and then simulate this request using scrapy.Request or scrapy.FormRequest.
2) Automate a browser using selenium: locate the button and click it, then grab the .page_source and instantiate a Selector instance. See samples here:
Scrapy with Selenium crawling but not scraping
How to use selenium along with scrapy to automate the process?
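For the first approach, a minimal sketch using only the Python standard library is shown here. It assumes that clicking a GetData button performs an ordinary form submission, in which case the browser includes the clicked button's name and value in the request. Whether that is what the page's JavaScript really sends has to be confirmed in the Network tab. The names SubmitButtonCollector, click_payloads and SAMPLE are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class SubmitButtonCollector(HTMLParser):
    """Collect (name, value) pairs of <input type="submit" class="GetData">."""

    def __init__(self):
        super().__init__()
        self.buttons = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "input" and attrs.get("type") == "submit" and "GetData" in classes:
            self.buttons.append((attrs.get("name"), attrs.get("value") or ""))


def click_payloads(html):
    """Return one form-data dict per GetData button found in the HTML."""
    collector = SubmitButtonCollector()
    collector.feed(html)
    return [{name: value} for name, value in collector.buttons]


# Markup copied from the question (two of the three inputs).
SAMPLE = '''
<input name="ct99" value="" id="ct99" class="GetData" type="submit">
<input name="ct92" value="" id="ct92" class="GetData" type="submit">
'''

payloads = click_payloads(SAMPLE)
# Form-encoded request bodies, one per simulated click.
bodies = [urlencode(payload) for payload in payloads]
print(bodies)
```

Each dict in payloads could then be passed as formdata to scrapy.FormRequest (or to FormRequest.from_response) so that one request is issued per button, assuming the request seen in the Network tab is indeed a plain form POST.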

Related

How to handle dynamic elements with cucumber and capybara

I am working on a project to write a scenario that tests a login feature. For some reason Capybara is not accessing dynamic elements.
Steps to Reproduce:
1) Visit Redfin.com (for example).
2) Click on the Sign In button.
3) A dynamic popup dialog appears.
4) Click on "Continue with Email", then try to enter details and click submit.
I am not able to find any of the elements with find('#<id>'), and in turn I cannot click submit or enter details.
Also, I believe the web app is built with React.
Please let me know how to handle this.
<div class="emailSignInButtonWrapper" style="position: relative;">
  <button class="button Button tertiary emailSignInButton v3" type="button" tabindex="0" data-rf-test-name="submitButton">
    <span>
      <span class="signInText">Continue with Email</span>
    </span>
  </button>
</div>
None of the elements you are talking about have ids, so a CSS find using #<id> (find('#my_button').click) isn't going to work. However, to click that button you should just be able to do
click_button('Continue with Email') # case of the text matters
or
click_button(class: 'emailSignInButton')
This all assumes you are using a driver with Capybara that supports JS - https://github.com/teamcapybara/capybara#drivers
Here's code that shows it works if using a JS capable driver
require 'capybara/dsl'
require 'selenium-webdriver'
session = Capybara::Session.new(:selenium_chrome)
session.visit "https://www.redfin.com"
session.click_link('Sign In', href: nil)
session.click_button('Continue with Email')
Here's what worked for me on the redfin site:
scenario "bring up signup form on redfin" do
  visit 'https://www.redfin.com'
  find('a', :text => 'Sign In').click
  click_button('Continue with Email')
end

change input hidden controls in ruby mechanize

When I click on a button:
<input type="button" onclick="document.lista_de_precios.opcion.value='por_categoria';showCat()" value="Por Categoría" class="btn btn-mini">
The value of a hidden input is changed to "por_categoria".
How do I change
<input type="hidden" value="" name="opcion">
to
<input type="hidden" value="por_categoria" name="opcion">
using the Ruby Mechanize gem? I already tried adapting Python examples to Ruby, with no success:
page.form.new_control('hidden','opcion',{'value': 'por_categoria'})
Update:
I investigated a little more and found this quote on a web page:
Sometimes mechanize won't pick up certain hidden form controls. Since mechanize doesn't pick up these controls, you will need to create them manually in order to get the form submission to work.
I think I will leave this post as is, because I don't know how to create form controls in this ruby code and mechanize.
You can ignore the advice on that page; it's talking about Python mechanize, which is a different library (and apparently not a very good one!).
Here's how to do it with ruby mechanize:
form = page.forms[0] # or some other number
form['opcion'] = 'por_categoria'

Finding an element by XPath in Selenium

I am trying to use Selenium to navigate a webpage. There is a button I am trying to get to via its xpath. For other buttons on the site, it works fine. But for this particular one, I keep getting the error that the element can't be located. Firebug is just giving me the xpath in this format: //*[@id="continueButton"].
I notice that the button has wrappers around it. They are structured like
<div class = "cButtonWrapper">
  <div class = "cButtonHolder">
    <input type="image" id="continueButton" name="Continue" alt="Continue" src="/store/images/btn_continue.gif" value="Continue">
  </div>
</div>
Could the wrappers around the button have anything to do with not being able to locate it?
Maybe the <input> element cannot be properly located by XPath because you are using invalid HTML. Try using <input id="continueButton"/> or <input id="continueButton"></input> in your page source.
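As a quick sanity check on the wrapper question, here is a small sketch that evaluates the same id-based XPath against the wrapped markup. It uses Python's xml.etree.ElementTree rather than Selenium, which is an assumption made only to keep the example self-contained. ElementTree supports just a subset of XPath and needs well-formed XML, which is why the input is self-closed in this copy of the markup. It does not reproduce how a browser or Selenium behaves.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the markup from the question. The <input> is
# self-closed only because ElementTree requires valid XML.
MARKUP = """
<div class="cButtonWrapper">
  <div class="cButtonHolder">
    <input type="image" id="continueButton" name="Continue"
           alt="Continue" src="/store/images/btn_continue.gif" value="Continue"/>
  </div>
</div>
"""

root = ET.fromstring(MARKUP)

# Same expression as in the question, written relative to the root element.
button = root.find(".//*[@id='continueButton']")

found_name = button.get("name") if button is not None else None
print(found_name)
```

In this sketch the two wrapper divs do not stop the expression from matching, which suggests that nesting by itself is unlikely to be the cause.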

How to store to browser auto-complete/auto-fill when using AJAX calls

I've noticed that browsers do not store form values until the form is submitted, which means that if you're using AJAX instead of a standard form submit, your browser's auto-fill is never populated. Is there a way to force populate your browsers auto-fill/auto-complete so that I can have this convenience with forms that are submitted via AJAX? It's annoying to go to my AJAX page and have to type in the same things in the form fields every time because the browser doesn't remember them.
My question is pretty much identical to this one, except that the accepted answer to that question only provides a workaround for Firefox. I'm looking for a solution that works in all major browsers (at least Chrome, FF, and IE), if there is one.
Note: I am not talking about AJAX auto-complete plugins, which is what almost always pops up when googling this question. I am talking about your browser's built-in auto-complete or auto-fill that helps you fill out forms by remembering what you entered in the past.
For anyone who's still trying to solve this, it seems I've found the answer.
Chromium tries to recognize the submit event, even if you preventDefault and handle the actual submission yourself.
That's it: you need to preventDefault the submit event, not the click event.
This worked on Chrome, Edge and IE 11 at the time of writing (I'm too lazy to download and test it on Firefox).
Here's your form:
<form method="POST" id="my-form">
<label>Email</label>
<input autocomplete="email" type="email" name="email">
<button type="submit">Subscribe</button>
</form>
Notice the autocomplete attribute. These are all the possible values that you can use for autocomplete.
In JavaScript, simply do this:
$("#my-form").on("submit", function (ev) {
    ev.preventDefault();
    // Do AJAX stuff here
});
The browser will remember whatever email you entered when you click the Subscribe button.
I have also come across this; there doesn't seem to be a great solution, certainly not a cross browser one, but here is one for IE I haven't seen anyone mention:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<SCRIPT>
function subForm()
{
    window.external.AutoCompleteSaveForm(f1);
    f1.submit();
}
</script>
</HEAD>
<BODY>
<FORM id=f1>
User ID : <input type=text name=id></input><br>
Password :<input type=password name=pw></input><br>
E-mail :<input type = text VCARD_NAME = "vCard.Email"> <br>
<input type=button value=submit onclick="subForm()">
</FORM>
</BODY>
</HTML>
From: http://support.microsoft.com/kb/329156
Use this Method:
AutoCompleteSaveForm = function(form){
    var iframe = document.createElement('iframe');
    iframe.name = 'uniqu_asdfaf';
    iframe.style.cssText = 'position:absolute; height:1px; top:-100px; left:-100px';
    document.body.appendChild(iframe);
    var oldTarget = form.target;
    var oldAction = form.action;
    form.target = 'uniqu_asdfaf';
    form.action = '/favicon.ico';
    form.submit();
    setTimeout(function(){
        form.target = oldTarget;
        form.action = oldAction;
        document.body.removeChild(iframe);
    });
}
Tested with IE 10 and the latest Firefox and Chrome.
Test yourself: http://jsbin.com/abuhICu/1
Have you tried the answer to my question that you mentioned?
The answer uses a hidden iframe, but its author seems to claim the idea did not work on IE and Chrome at that time.
Try taking that idea, but instead of using a hidden iframe, put the visible username/password/submit input elements in a form POST inside an iframe, so the user enters the login details directly into the iframe. With proper JavaScript you can show a loading image, get success or denied from the server, and update the parent or the whole page. I believe it should work in any browser.
Or, if you still want to use AJAX because you have probably already implemented the API on the server side, you can make the iframe send a dummy POST while sending the real user/pass to the AJAX URL at the same time.
Or go back to the hidden iframe, but rather than hiding it, move it to an invisible area, for example top: -1000px.
After several hours searching, I found a solution at Trigger autocomplete without submitting a form.
Basically, it uses a hidden iframe on the same page. Set the action of your form to the 'src' of the iframe and add a hidden submit button inside the form. When the user clicks the button that triggers the AJAX request, programmatically click the hidden button before sending the AJAX request. See the example below:
In your form page:
<iframe id="hidden_iframe" name="hidden_iframe" class="hidden" src="/content/blank"></iframe>
<form target="hidden_iframe" method="post" action="/content/blank" class="form-horizontal">
    <input type="text" name="name">
    <input type="text" name="age">
    ....
    <button id="submit_button" type="submit" class="hidden"></button>
    <button id="go_button" type="submit" class="hidden">Go</button>
</form>
Then the JavaScript:
$('#go_button').click(function(event){
    //submit the form to the hidden iframe
    $('#submit_button').click();
    //do your business here
    $.ajax({
        //whatever you want here
    });
});
Hope this helps.

Clicking a button with Ruby Mechanize

I have a particularly difficult form whose search button I am trying to click, and I can't seem to do it. Here is the code for the form from the page source:
<input type="image" name="" src="http://images.example.com/WOKRS53B4/images/search.gif" align="absmiddle" border="0" onclick="return check_form_inputs('UA_GeneralSearch_input_form','search');" title="Search" alt="Search" class="">
I am trying to do the standard mechanize click action:
login_page = agent.click(homepage.link_with(:text => "Search"))
Is this because the button uses javascript? If so, any suggestions?
I struggled with this too, especially since my form had multiple buttons.
There are multiple ways to submit a form (with many using a 'form_with' block), but this helped me:
# get the form
form = agent.page.form_with(:name => "my-form")
# get the button you want from the form
button = form.button_with(:value => "Search")
# submit the form using that button
agent.submit(form, button)
See more info here
Also, make sure you upgrade to the latest mechanize. I was using mechanize 1.x, which was giving me "undefined method" errors for the code above.
It is not a link, it is a button. What you need to do is look for the form (for example, with form_with) and then look for the ImageButton and submit it.
button = form.button_with(value: 'Search')
form.click_button(button)
