Watir get browser html with iframe content included - ruby

Is there a way to get the HTML of the whole page, with the rendered HTML of its iframes included? I know you can access the browser HTML like so:
browser.html
but that only returns the HTML of the current DOM, without the iframe contents. I also know the iframe content can be retrieved like so:
browser.iframe(index: 0).html
But is there a way to get the page contents with the rendered content of all the iframes included? I'm asking because some iframes might have other iframes embedded, and so on, so it becomes cumbersome to parse.
Thanks.
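One possible approach is to walk the frame tree recursively. This is only a minimal sketch, not a built-in Watir feature: it assumes that both the browser and each iframe respond to html and iframes (as the calls in the question suggest). The helper name collect_frame_html and the "top/iframe[0]" path labels are made up for illustration.

```ruby
# Recursively collect the HTML of a page and of every nested iframe.
# `context` is anything that responds to #html and #iframes, for example
# a Watir::Browser or a Watir::IFrame (assumption: both expose these calls).
# Returns a Hash that maps an illustrative path label to that frame's HTML.
def collect_frame_html(context, path = "top")
  pages = { path => context.html }
  context.iframes.each_with_index do |frame, index|
    # Descend into each child frame and merge its results into ours.
    pages.merge!(collect_frame_html(frame, "#{path}/iframe[#{index}]"))
  end
  pages
end

# Usage with Watir (requires the watir gem and a browser driver):
#   require "watir"
#   browser = Watir::Browser.new
#   browser.goto "https://example.com"   # hypothetical URL
#   collect_frame_html(browser).each do |label, html|
#     puts "#{label}: #{html.length} characters"
#   end
```

The result keeps each frame's HTML separate instead of splicing it into the parent document, which avoids guessing where in the parent markup each iframe's content belongs.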

Related

Can I parse a webpage as a browser would and get back the HTML DOM element in SwiftUI?

I'm trying to scrape an Instagram page using SwiftSoup, but when I run let html = try String(contentsOf: URL(string: "https://www.instagram.com/sasawpi/")!), I get back a bunch of JS functions and CSS styles, probably because Instagram uses React and some package control to send files over from the server. My question is: can I render the page as an HTML DOM, like a browser would, and get it as text inside SwiftUI using SwiftSoup or some other library? WebView renders web pages, so one idea was to render the page with WebView, but I don't know how to get the HTML DOM as text from WebView either.

Browser Rendered Page Appears Different When Scraped

Opening this page, https://www.barchart.com/futures/quotes/ESH20/options,
in Nokogiri doesn't give the same elements as the rendered page in the browser.
How can I access the same source code as seen in the browser DevTools from the scraper library?
This element in particular: <div class="bc-datatable"...
Is a headless browser required to get the right page code first?
That data is coming from a JSON endpoint:
So luckily no headless browser is required this time.
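As a rough sketch of what reading such an endpoint directly could look like with Ruby's standard library: the actual endpoint URL is not given here, so the URL in the usage comment is a placeholder, and the "data" key is only an assumption about the payload shape. The helper names fetch_json_body and extract_rows are made up for illustration.

```ruby
require "json"
require "net/http"
require "uri"

# Fetch a document over HTTP and return the raw response body.
# The caller supplies the URL; the real one can usually be found in the
# browser DevTools Network tab while the page loads.
def fetch_json_body(url)
  response = Net::HTTP.get_response(URI(url))
  raise "Request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end

# Parse the JSON body and return the list stored under `key`.
# The default "data" key is an assumption, not taken from the real endpoint.
def extract_rows(body, key = "data")
  JSON.parse(body).fetch(key, [])
end

# Usage (placeholder URL, replace with the endpoint seen in DevTools):
#   rows = extract_rows(fetch_json_body("https://example.com/api/options.json"))
#   rows.each { |row| puts row.inspect }
```

Some endpoints also expect specific request headers or cookies, in which case a plain GET like this may be rejected.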

Scrape an HTML page after AJAX calls for elements not in the page source

I'm trying to scrape a webpage where the content I want loads after the DOM completes. The new content is fetched through AJAX calls.
So the fetched content isn't available in the page source, although I can see it when inspecting the page.
When I use cURL it doesn't find the elements on the page. What is the best method to get this content?
I'm trying to use PhantomJS for this, but I'm not sure if that can do it either.
Thanks.

What influences the time an iframe is requested from browser

Can someone explain what influences the order of requests before an iframe is loaded?
My observation on one of our websites (springerprofessional.de) seems to suggest that some script files are loaded before the iframe is requested, even though they come after the iframe tag in the HTML.
"login?service=" is the iframe content in the picture below.

How to view JS code loaded with AJAX in browser?

I have a JSP page, where some parts of the pages are loaded from the backend using AJAX. For example, when I first open the page, the URL is http://www.made-up-domain-name-because-of-stack-overflow-restrictions.com/listUsers.do. The page contains an "add user" button, which loads HTML content (containing a form etc.) from the backend to the div-element with id "addArea". The URL stays the same the whole time (naturally), as the request is done in the background.
The problem I have is that the content loaded using AJAX is not completely viewable by any means.
Using Firefox I can see the new HTML with the Firebug add-on and "Inspect element", but the content within the script-tags is not visible that way (also not in the "Script" tab in Firebug - only the originally loaded scripts appear there). If I use "View page source" in FF a page reload is executed and I don't see the newly generated content (I only see the content of page http://www.made-up-domain-name-because-of-stack-overflow-restrictions.com/listUsers.do as it was when first loaded).
With Chrome I have the same problem as with Firefox.
Using IE I see only the original source.
Of course I can work around this by adding debugging mechanisms to the JS code and working half-blind, or moving parts of the JS code to external files etc., but if by any means possible, I would prefer to just view the code loaded using AJAX. Any suggestions, perhaps using some add-on?
Update: There is a better way: see the accepted answer for this question: How to debug dynamically loaded javascript(with jquery) in the browser's debugger itself?
You can use the JavaScript Deobfuscator extension for that. It can show you what scripts are compiled/executed on a webpage - including the ones that were loaded dynamically.

Resources