loading elements with <object> tag in HtmlUnit - htmlunit

I am loading a webpage that has many Flash/GIF elements embedded within <object> tags. I can see those elements in my browser, but I cannot reach them in HtmlUnit.
page.getByXPath("//object"); // returns an empty list
waitForBackgroundJavaScript() doesn't help either.
Has anyone had the same problem before? Thanks!!

Related

Contents contained within <ul></ul> tags are not displayed by the browser

I have tried other tags and the browser renders them normally, but the content of the <ul></ul> tag pair, including any tags nested within it, is not rendered by the browser.
I hope to find a way to fix this. Thank you very much <3

How many types of DOM are there?

How many types of DOM are there? I know of only one, the HTML DOM. This question was asked in an interview. I googled it afterwards but found no satisfactory result. In fact, XML and SGML also have their own DOMs.
Please clarify.
Thanks in advance.
Types of DOM that I know of:
HTML DOM
CSS DOM
Shadow DOM
Virtual DOM

Confused about scrapy and Xpath

I am trying to scrape some data from the following website: https://xrpcharts.ripple.com/
The data I am interested in is Total XRP, which you can see immediately below or to the side (depending on your browser) of the circle diagram. I first inspected the element I am interested in and saw that it is inside <div class="stat">, in <span ng-bind="totalXRP | number:2" class="ng-binding">99,993,056,930.18</span>.
The number 99,993,056,930.18 is what I am interested in.
So I started in a scrapy shell and wrote:
fetch("https://xrpcharts.ripple.com")
I then used Chrome to copy the XPath by right-clicking on that place in the HTML code. The result Chrome gave me was:
/html/body/div[5]/div[3]/div/div/div[2]/div[3]/ul/li[1]/div/span
Then I used that XPath to extract the text:
response.xpath('/html/body/div[5]/div[3]/div/div/div[2]/div[3]/ul/li[1]/div/span/text()').extract()
but this gave me an empty list []. I really do not understand what I am doing wrong here. I think I am making an obvious mistake, but I don't see it. Thanks in advance!
The bottom line is: you cannot expect the page you see in the browser to be the same page Scrapy would download and have available to work with. Scrapy is not a browser.
This page is quite dynamic and complex and is constructed with the help of multiple asynchronous requests bringing in both the logic and the data. There is also JavaScript executed in the browser that plays an important role in forming and supporting the HTML document object tree.
Scrapy does none of this: what you get when you call fetch() is just the initial "bare bones" HTML page, without any of the "dynamic content".
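To make the point concrete, here is a minimal Python sketch that uses only the standard library. The markup in RAW_HTML is a hypothetical, simplified stand-in for what a non-browser client might receive before any JavaScript runs (the real page's source is not reproduced here), and extract_total is an illustrative helper name. The span carrying ng-bind is present but empty, so selecting its text finds nothing; only after the value has been filled in, as a browser would do, does the same query return the number from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the initial HTML a non-browser client receives:
# the AngularJS binding is declared, but no value has been rendered yet.
RAW_HTML = """
<html>
  <body>
    <div class="stat">
      <span ng-bind="totalXRP | number:2" class="ng-binding"></span>
    </div>
  </body>
</html>
"""

# The same markup after a browser has run the page's JavaScript
# (the value is the one quoted in the question).
RENDERED_HTML = RAW_HTML.replace("></span>", ">99,993,056,930.18</span>")


def extract_total(html):
    """Return the text of the span inside div.stat, or None if it is empty or missing."""
    root = ET.fromstring(html)
    span = root.find(".//div[@class='stat']/span")
    if span is None or span.text is None:
        return None
    return span.text.strip() or None


raw_total = extract_total(RAW_HTML)
rendered_total = extract_total(RENDERED_HTML)
print("raw:", raw_total)
print("rendered:", rendered_total)
```

With a real page, a quick check along the same lines is to look at the downloaded source itself (for example, response.text in the Scrapy shell) rather than at the DOM shown in the browser's inspector.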

XPath Next Page navigation

I'm using Chrome Data Miner and, so far, failing to extract the data from my query: http://www.allinlondon.co.uk/restaurants.php?type=name&rest=gluten+free
How do I write the Next Element XPath for this website? I tried every source I could find on the web, but nothing worked.
Thanks in advance!
You could look for <a> tags (//a) whose descendant::text() starts with "Next" and then get the href attribute of that <a> element.
% xpquery -p HTML '//a[starts-with(descendant::text(), "Next")]/@href' 'http://www.allinlondon.co.uk/restaurants.php?type=name&rest=gluten+free'
href="http://www.allinlondon.co.uk/restaurants.php?type=name&tube=0&rest=glutenfree&region=0&cuisine=0&start=30&ordering=&expand="
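The same idea can be sketched in Python with only the standard library. This is an illustration under assumptions: the PAGE markup is hypothetical (the real page's pagination markup is not shown in the answer), and next_page_href is an illustrative helper name. ElementTree's XPath subset has no starts-with() function, so the "starts with Next" test is done in Python over each link's descendant text instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup, for illustration only.
PAGE = """
<div class="pager">
  <a href="/restaurants.php?start=0">Previous</a>
  <a href="/restaurants.php?start=30"><b>Next</b> page</a>
</div>
"""


def next_page_href(markup):
    """Return the href of the first <a> whose text content starts with 'Next'."""
    root = ET.fromstring(markup)
    for link in root.iter("a"):
        # itertext() yields the element's own and descendant text nodes.
        text = "".join(link.itertext()).strip()
        if text.startswith("Next"):
            return link.get("href")
    return None


next_href = next_page_href(PAGE)
print(next_href)
```

Matching on the link text rather than on its position keeps the selector working when the number of page links changes from one results page to the next.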

Locating WebElements using XPATH (NoSuchElementException)

I am having problems with locating elements using xpath while trying to write automated webUI tests with Arquillian Drone + Graphene.
To figure things out, I tried to locate the search button on the Google homepage, but I cannot manage even that, with either an absolute or a relative XPath.
However, I am able to locate elements by ID, or when the XPath string has an ID in it, but only when the ID is a real ID and not a generated one. For example, on the Google homepage the Google logo has a real ID, "hplogo". I can locate this element using the ID directly or using the ID within the XPath expression.
Why is it possible to locate the Google logo using the ID "hplogo", while it fails with the absolute XPath "/html/body/div[1]/div[5]/span/center/div[1]/div/div"?
I am really confused. What am I doing wrong? Any help is appreciated!
EDIT:
WebElement e = browser.findElement(By.xpath("/html/body/div[1]/div[5]/span/center/div[1]/div/div"));
is causing a NoSuchElementException.
Your expression works on Firefox, but on WebKit/Blink-based browsers (e.g., Chrome) the rendered DOM is a bit different. It may also depend on localization (I get google.co.uk). If I force google.com, the logo image for me is at:
/html/body/div/div[5]/span/center/div[1]/img on Firefox 37 and /html/body/div/div[6]/span/center/div[1]/img on Chrome 42.
EDIT:
After discussing in chat, we figured out that HtmlUnit is indeed creating a DOM that differs from the one real browsers render. The suggestion was to migrate to FirefoxDriver.
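A small Python sketch (standard library only) of why the absolute path is fragile while the ID lookup is not. The two documents are hypothetical reductions of the Firefox and Chrome paths quoted above, built by the illustrative helper build_page; they are not Google's actual markup. The only difference between them is one extra sibling div before the logo's wrapper.

```python
import xml.etree.ElementTree as ET


def build_page(divs_before_logo):
    """Build a hypothetical page whose logo wrapper is preceded by N sibling divs."""
    filler = "<div/>" * divs_before_logo
    return ET.fromstring(
        "<html><body><div>"
        + filler
        + "<div><span><center><div><img id='hplogo'/></div></center></span></div>"
        + "</div></body></html>"
    )


# Paths are relative to the <html> root element.
POSITIONAL = "body/div/div[5]/span/center/div[1]/img"  # mirrors the Firefox path
BY_ID = ".//*[@id='hplogo']"

firefox_like = build_page(4)  # logo wrapper is div[5]
chrome_like = build_page(5)   # logo wrapper is div[6]

# For each DOM, record (found by positional path, found by ID).
results = {
    name: (dom.find(POSITIONAL) is not None, dom.find(BY_ID) is not None)
    for name, dom in [("firefox_like", firefox_like), ("chrome_like", chrome_like)]
}
print(results)
```

The positional path only matches the DOM it was copied from, whereas the ID-based expression matches both variants, which is consistent with the behaviour described in the question.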

Resources