XPath for a Google search URL returns no content - xpath

http://www.google.com/search?q=youtube
I'm trying to get the URL part www.youtube.com/ of the search result
using this XPath, but there is no output from it:
.//*[@id='rso']/li[1]/div/div[2]/div[1]/cite
I've also tried the CSS path below, with the same issue.

CSS: div[aria-label='Result details']+div>div cite.
By the way, your XPath works fine for me. If you use Selenium to retrieve the text, for example, you should write xpath=.// in this case, because Selenium recognizes a selector as XPath by the leading // symbols. Also note that //*[@id='rso']/li[1]/div/div[2]/div[1]/cite//text() will return three text nodes: www., youtube, and .com/.
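For illustration, here is a minimal sketch of that text-node splitting using Python's lxml; the HTML fragment is a hand-written stand-in for the live Google result markup (an assumption), not the real page:

from lxml import html

# Hand-written stand-in for the result's cite element (assumption).
fragment = "<div id='rso'><cite>www.<b>youtube</b>.com/</cite></div>"
tree = html.fromstring(fragment)

# //text() returns every descendant text node separately.
parts = tree.xpath("//cite//text()")
print(parts)            # ['www.', 'youtube', '.com/']

# Joining the nodes (or using string(...)) recovers the full URL.
print("".join(parts))   # www.youtube.com/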

Related

How to write XPath for the code displayed in the image below

The snapshot shows the field as well as the Inspect Element code. I always have problems writing XPath for table elements; the XPath copied from Mozilla Firebug works sometimes, but not always. Can anyone tell me how to write the XPath for the code above? Thanks.
You can use this XPath:
//table[@class='detailList']/tbody/tr/td[contains(text(),'Business Lease')]
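A minimal sketch of how that expression matches, using Python's lxml against a hand-written table; the class name comes from the expression above, while the row contents are assumptions standing in for the screenshot:

from lxml import html

# Hand-written stand-in for the table from the screenshot (assumption).
page = html.fromstring("""
<html><body>
<table class="detailList">
  <tbody>
    <tr><td>Business Lease</td><td>12 months</td></tr>
    <tr><td>Personal Lease</td><td>24 months</td></tr>
  </tbody>
</table>
</body></html>
""")

# contains(text(), ...) keeps only the td whose text includes 'Business Lease'.
cells = page.xpath("//table[@class='detailList']/tbody/tr/td[contains(text(),'Business Lease')]")
print([td.text for td in cells])   # ['Business Lease']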

Locating WebElements using XPath (NoSuchElementException)

I am having problems locating elements by XPath while trying to write automated web UI tests with Arquillian Drone + Graphene.
To figure things out I tried to locate the search button on the Google homepage. Even that I cannot get working, neither with an absolute nor with a relative XPath.
However, I am able to locate elements using IDs, or when the XPath string contains an ID, but only when the ID is a real ID and not a generated one. For example, on the Google homepage the Google logo has a real ID, "hplogo". I can locate this element directly by the ID or by an XPath expression containing the ID.
Why does locating the Google logo by the ID "hplogo" work, while the absolute XPath "/html/body/div[1]/div[5]/span/center/div[1]/div/div" fails?
I am really confused. What am I doing wrong? Any help is appreciated!
EDIT:
WebElement e = browser.findElement(By.xpath("/html/body/div[1]/div[5]/span/center/div[1]/div/div"));
is causing a NoSuchElementException.
Your expression works on Firefox, but on WebKit-based browsers (e.g., Chrome) the rendered DOM is a bit different. It may also depend on localization (google.co.uk for me). If I force google.com, the logo image for me is
/html/body/div/div[5]/span/center/div[1]/img on Firefox 37 and /html/body/div/div[6]/span/center/div[1]/img on Chrome 42.
EDIT:
After discussing in chat, we figured out that HtmlUnit is indeed creating a DOM that differs from the one real browsers render. The suggestion was to migrate to FirefoxDriver.
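As a rough illustration of both points (drive a real browser instead of HtmlUnit, and anchor the locator on a stable attribute rather than a deep absolute path), here is a sketch using Selenium's Python bindings; the name='btnK' attribute for the search button is an assumption and may change whenever Google updates its markup:

from selenium import webdriver
from selenium.webdriver.common.by import By

# A real browser driver; HtmlUnit's DOM can differ from what browsers render.
driver = webdriver.Firefox()
driver.get("https://www.google.com")

# Short, attribute-based XPath instead of a brittle absolute path.
# The name 'btnK' is an assumption about Google's current markup.
button = driver.find_element(By.XPATH, "//input[@name='btnK']")
print(button.get_attribute("aria-label"))

driver.quit()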

Scrapy XPath: select elements by class name

I have followed "How can I find an element by CSS class with XPath?", which gives the selector to use for selecting elements by class name. The problem is that when I use it, it retrieves an empty result, "[]", even though I know for a fact that there is a div with the class "zoomWindow" at the URL fed to the Scrapy shell.
My attempt:
scrapy shell "http://www.niceicdirect.com/epages/NICShop.sf/secAlIVFGjzzf2/?ObjectPath=/Shops/NICShop/Products/5696"
response.xpath("//*[contains(#class, 'zoomWindow')]")
I have looked at many resources that provide varied selectors. In my case the element has only one class, so I discarded the versions that use "concat" after trying them without success.
I have installed Ubuntu and Scrapy in a virtual machine just to make sure it was not a bug in my installation on Windows, but my attempt on Ubuntu gave the same results.
I don't know what else to try. Can you see any typo in the selector?
If you check response.body in the shell, you will see that it doesn't contain an element with class="zoomWindow":
In [3]: "zoomWindow" in response.body
Out[3]: False
But if you open the page in a browser and inspect the HTML source, you will see that the element is there. This means that the page load involves JavaScript logic or additional AJAX requests. Scrapy is not a browser and has no built-in JavaScript engine; in other words, it only downloads the initial HTML code of the page, without additionally downloading JS and CSS files and "executing" them.
What you can try, for starters, is the scrapyjs download handler and middleware.
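A rough sketch of that setup, assuming the scrapy-splash package (the current packaging of scrapyjs) and a Splash instance running locally on port 8050; the middleware priorities follow the scrapy-splash README:

# settings.py
SPLASH_URL = 'http://localhost:8050'
DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

# spider
import scrapy
from scrapy_splash import SplashRequest

class ZoomSpider(scrapy.Spider):
    name = 'zoom'

    def start_requests(self):
        url = ('http://www.niceicdirect.com/epages/NICShop.sf/secAlIVFGjzzf2/'
               '?ObjectPath=/Shops/NICShop/Products/5696')
        # Let Splash execute the page's javascript before handing back the HTML.
        yield SplashRequest(url, self.parse, args={'wait': 2})

    def parse(self, response):
        # After rendering, the zoomWindow div should be present in response.body.
        self.logger.info(response.xpath("//*[contains(@class, 'zoomWindow')]").extract())

This is a sketch rather than a drop-in fix; whether the zoomWindow element actually appears still depends on how the page builds it.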
The image you want to extract is also available in the img tag with id="PreviewImage":
In [4]: response.xpath("//img[#id='PreviewImage']/#src").extract()
Out[4]: [u'/WebRoot/NICEIC/Shops/NICShop/547F/0D9A/F434/5E4C/0759/0A0A/124C/58F7/5708.png']
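If the preview image is all you need, the relative src can be turned into an absolute URL in the same shell session; this assumes a Scrapy version that provides response.urljoin (otherwise urlparse.urljoin(response.url, src) does the same):

In [5]: src = response.xpath("//img[@id='PreviewImage']/@src").extract()[0]

In [6]: response.urljoin(src)
Out[6]: u'http://www.niceicdirect.com/WebRoot/NICEIC/Shops/NICShop/547F/0D9A/F434/5E4C/0759/0A0A/124C/58F7/5708.png'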

Get the current element using Simple HTML DOM

I'm trying to use Simple HTML DOM to find objects via XPath.
It's working pretty well, but I can't seem to get the current element:
$object->find('.');
$object->find('..');
$object->find('//');
all return an empty array
$object->innertext
returns a normal table with HTML, so the object IS valid.
Simple HTML DOM doesn't recognize '.' for getting the current element; in fact, it uses regular expressions rather than a real XPath engine to find elements.
To solve this problem I used DOMXPath instead of Simple HTML DOM, which has far more options and functionality.
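Simple HTML DOM and DOMXPath are PHP libraries, but the behaviour the question is after is generic XPath: '.' selects the context node itself, and expressions starting with './/' search only inside it. A small language-neutral sketch of that idea, using Python's lxml:

from lxml import html

doc = html.fromstring("<html><body><div><table><tr><td>a</td><td>b</td></tr></table></div></body></html>")
table = doc.xpath("//table")[0]

# '.' selects the context node itself...
print(table.xpath("."))                          # [<Element table>]

# ...and './/td' searches only within the context node's subtree.
print([td.text for td in table.xpath(".//td")])  # ['a', 'b']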

How to use XPath with libxml2

At this address I am trying to scrape a tag (the large price, which is the bold red one).
I am using libxml 2.2.
When I try to extract the tag with this XPath
//*[@class='priceLarge']
it works!
But to make queries easier I would like to use Firebug in Firefox.
Firebug gives me this XPath:
/html/body/div[2]/form/table[3]/tbody/tr/td/div/table/tbody/tr[2]/td[2]/span/b
Using this XPath it does not work; it seems this one does not give a complete query. How can I modify this XPath to scrape the item?
Firefox and other browsers generate tbody tags when rendering HTML.
In fact, the tbody is probably not in the source, so you can remove it from your XPath (/html/body/div[2]/form/table[3]/tr/td/div/table/tr[2]/td[2]/span/b). You can test this by saving the HTML from your application and viewing it in a text editor.
Since the intent is to pull information from a web page, however, your application will probably be more resistant to changes in the page if you use an XPath that depends less on the tree structure (e.g., //b[@class='priceLarge']).
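A small sketch of that difference, using Python's lxml on a hand-written fragment that mimics the raw, tbody-less HTML a server typically sends (the price value is an assumption):

from lxml import html

# Hand-written stand-in for the raw markup: no tbody, as served (assumption).
page = html.fromstring(
    "<html><body><table><tr><td><span>"
    "<b class='priceLarge'>$12.34</b>"
    "</span></td></tr></table></body></html>")

# The browser-generated path expects a tbody that is not in the source...
print(page.xpath("//table/tbody/tr/td/span/b"))       # []

# ...while the structure-independent expression still matches.
print(page.xpath("//b[@class='priceLarge']/text()"))  # ['$12.34']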
EDIT: It seems that, in addition to the tbody problem, Firefox renders the div element (ID: divsinglecolumnminwidth) as containing the form element (ID: handleBuy).
Looking at the HTML with an XML editor shows that the form element is actually a sibling of that div element, so the expression should start with /html/body/form/table[3].
One tool, among many others, to test your XPath expressions is HAP Testbed.
