How to request an href only when it is an http(s) URL in Selenium Python - for loop

I'm trying to find broken links on a website using Selenium with Python. The script is almost done and runs fine, but I'm facing an issue when requesting each link to get its status code. On my website some anchor elements have href="javascript:void(0)", so when I read the href attribute and request it for a status code, the request fails because javascript:void(0) is not a URL. How can I solve this?
import requests

anchors = driver.find_elements_by_tag_name("a")
for anchor in anchors:
    href = anchor.get_attribute('href')
    # Skip hrefs that are not http(s) URLs, e.g. javascript:void(0)
    if not href or not href.startswith("http"):
        continue
    if requests.head(href).status_code < 400:
        print(href, "Valid link")
    else:
        print(href, "Broken link")

javascript:void(0) is just a dummy placeholder. Your website may have JavaScript which actually handles clicks on that link. You'll have to investigate your website to see what it's actually designed to do on those clicks.
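One simple way to decide which hrefs are worth requesting is to test the URL scheme with the standard library. A minimal sketch (`is_checkable` is a hypothetical helper, not part of Selenium or requests):

```python
from urllib.parse import urlparse

def is_checkable(href):
    """Return True only for links we can probe with an HTTP request."""
    if not href:
        return False
    return urlparse(href).scheme in ("http", "https")

print(is_checkable("javascript:void(0)"))   # False
print(is_checkable("https://example.com"))  # True
```

Anchors whose href fails this check (javascript:, mailto:, empty, etc.) can simply be skipped in the loop.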

Related

Problem interacting with elements inside an iframe

I'm able to retrieve data inside an iframe using
browser.iframe
But when I try to interact with elements inside this iframe, for example clicking a button, Watir won't locate any of them.
I've tried all kinds of elements inside this iframe, but nothing happens.
There could be more than one iframe on your page, so you might be targeting the wrong one (just writing browser.iframe targets the first iframe on the page).
You could inspect the page and check whether there is more than one iframe, or type:
puts "there are #{browser.iframes.count} iframes on the page"
and run the script again.
Ideally the iframe has an id, so you could write:
my_iframe = browser.iframe(id: "something")
my_iframe.buttons.last.click
# or
my_iframe.button(text: "ok").click
If it's not a secret, right-click => inspect the page you are automating and screenshot/paste the HTML here, or share the whole code that doesn't work for you.

Getting URL with javascript

Because of an HTTP_REFERER issue I need to pass a URL from an https site to http.
I have this bit of JavaScript, but it is not working:
Save this page as PDF
Can I also find out how to append the current site's URL to their API URL using JavaScript?
http://api.htm2pdf.co.uk/urltopdf?apikey=yourapikey&url=http://www.example.com
Any advice?
You need to block the anchor tag's default click event.
Save this page as PDF
I would use either javascript or the href attribute, not both. I don't see how they would work well together.
You can use .preventDefault() as noted, but why put the href attribute there in the first place?
Is this what you're looking for? It should work on both http and https sites.
<a onclick="window.open('http://api.htm2pdf.co.uk/urltopdf?apikey=yourapikey&url=' + window.location.href, '_blank', 'location=yes,scrollbars=yes,status=yes');">Save as PDF</a>
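One detail worth noting: the page URL appended as the url parameter should be percent-encoded, since it contains :// and possibly its own query string. A small Python sketch of the same construction (the API URL is the one from the question; example.com stands in for window.location.href):

```python
from urllib.parse import urlencode

api = "http://api.htm2pdf.co.uk/urltopdf"
current_page = "https://www.example.com/page?x=1"  # stand-in for window.location.href
url = api + "?" + urlencode({"apikey": "yourapikey", "url": current_page})
print(url)
# http://api.htm2pdf.co.uk/urltopdf?apikey=yourapikey&url=https%3A%2F%2Fwww.example.com%2Fpage%3Fx%3D1
```

In the JavaScript version the equivalent is wrapping the href in encodeURIComponent() before concatenating.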

Google ajax crawling not working with fetch as google

I am trying to test an Orchard website that has AJAX content with "Fetch as Google". Shouldn't Google replace http://cmbbeta.azurewebsites.net/#! with http://cmbbeta.azurewebsites.net/?_escaped_fragment_ (both links work)? When I hit my beta website with Fetch as Google, the preview shows me that the page is loading the AJAX content, not the static one.
Am I missing something?
The preview that appears when you put your mouse over the link always seems to show the dynamic website. The important thing to look at is the fetch result, which you can access by clicking the "Success" link in the "Fetch Status" column.
This is probably not affecting your site, but the Fetch as Google feature doesn't work for AJAX urls that are specified with the <meta> tag. See here.

Using Watir-webdriver how to check the URL of a page

I am new to watir-webdriver automation; apologies if this is a basic automation question. I am automating pagination of a website where the URL changes as the user changes pages.
Say the URL is www.example.co.uk/news, which has pagination; when the user clicks the next button in the pagination, the URL changes to www.example.co.uk/news?page=1.
I want to check the URL at this point to see if it is correct.
But I can't really find a way to get the URL of the current page.
browser.url returns the URL of the current page, so to check whether it is as expected, try something like this:
browser.url == "www.example.co.uk/news?page=1"
It will return true or false.
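The comparison above is an exact string match, which breaks if the site varies the query-parameter order. A looser check, sketched here in Python with the standard library (the same idea works in Ruby with URI.parse; `same_page` is a hypothetical helper), compares the parsed components instead:

```python
from urllib.parse import urlparse, parse_qs

def same_page(actual, expected):
    """Compare host, path and query parameters, ignoring parameter order."""
    a, e = urlparse(actual), urlparse(expected)
    return (a.netloc, a.path) == (e.netloc, e.path) and \
           parse_qs(a.query) == parse_qs(e.query)

print(same_page("http://www.example.co.uk/news?page=1",
                "http://www.example.co.uk/news?page=1"))  # True
```

Whether this flexibility is wanted depends on the test: for strict pagination checks, the plain equality shown above is fine.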

How does a website load only part of the page and still display full on URLs?

I am looking at the Gawker blogs (http://io9.com, http://lifehacker.com/) and I'm curious about how they are made.
When I click on a link, only the article part of the page reloads, displaying a loading icon while it does.
But what I can't figure out is that the links point to new URLs like io9.com/something/something, not what I usually see on AJAX pages, where JavaScript appends a site.com/#something fragment to the URL to mark the page after an AJAX request.
Can I change the full URL from JavaScript, or what is happening?
When that happens, the website is using the HTML5 History API. This API can change the URL (via JavaScript) without reloading the page.
See caniuse.com for browser support.
If you would like to implement it in your website, backbonejs.org would be very useful.
