GeckoFX - Wait until the page has loaded with a proxy

I am using this code (Visual Basic) to wait until the page has loaded:
Private Sub WaitForPageLoad()
    While GeckoWebBrowser1.IsBusy
        Application.DoEvents()
    End While
End Sub
It works well, but only when no proxy is involved. After connecting through a proxy (either by setting proxies directly in GeckoFX or by using an external program like CyberGhost), this code fails completely: it reports that the website has loaded while it is still loading, so the program starts executing instructions that should not run without a fully loaded page.
Does anyone have a solution for this? I would also be satisfied with C# code.

Use the DocumentCompleted event; IsBusy is not reliable.
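For example, a rough Visual Basic sketch of that idea (the exact event-args type differs between GeckoFX versions, so treat this as an outline rather than a drop-in answer):

' Sketch: wait for DocumentCompleted instead of polling IsBusy.
Private pageLoaded As Boolean

Private Sub WaitForPageLoad()
    pageLoaded = False
    AddHandler GeckoWebBrowser1.DocumentCompleted, AddressOf OnDocumentCompleted
    While Not pageLoaded
        Application.DoEvents() ' keep pumping UI messages while waiting
    End While
    RemoveHandler GeckoWebBrowser1.DocumentCompleted, AddressOf OnDocumentCompleted
End Sub

Private Sub OnDocumentCompleted(sender As Object, e As EventArgs)
    pageLoaded = True
End Sub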

Related

Selenium cannot find elements on a Google service while minimized or in headless mode

I am trying to create a program that automates fetching data from one of the Google services, using Chrome and Watir (a Ruby library built on top of Selenium). Everything works fine as long as I keep the browser window open. But when I minimize the window, my program cannot even get past the login process, because it cannot find certain elements. This is my code to log in:
@browser = Watir::Browser.new :chrome, options: { detach: true }
@browser.goto BASE_URL
@browser.text_field(name: 'identifier').set USER_EMAIL
@browser.element(xpath: '//*[@id="identifierNext"]').click
@browser.text_field(xpath: '//input[@type="password"]').set USER_PASSWORD
@browser.element(xpath: '//*[@id="passwordNext"]/div/button/div[2]').click
When my browser is minimized, during the attempt to set the password I get this error message:
*** Watir::Exception::UnknownObjectException Exception: element located, but timed out after 30 seconds, waiting for
#<Watir::TextField: located: true; {:xpath=>"//input[@type="password"]", :tag_name=>"input"}> to be present
With an open window it works just fine. Even if I maximize the window partway through the process, the program is suddenly able to locate the missing input fields. The same thing happens at many other points further on: the program cannot locate some elements unless the Chrome window is open.
Needless to say, it works even worse in headless mode, where I am basically unable to locate any of those elements in the HTML.
As far as I understand, the front end of the Google services is built with the Angular framework, which injects HTML dynamically. But shouldn't Selenium act like a regular user and trigger the same responses whether the window is minimized, open, or headless?
Is this some kind of blockade from Google to prevent this kind of automated process, and how can I bypass it?
Is this an issue with Chrome, and would switching to e.g. Firefox fix it?
Can I implement some additional actions to mimic human interaction and pretend that my Chrome window is open?
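One workaround commonly tried for this kind of problem (a sketch only, not from the original thread) is to give the headless session an explicit desktop-sized viewport and wait for each element explicitly, since some pages only render certain elements at larger window sizes:

# Sketch (assumed workaround, not from the thread): force a desktop-sized
# viewport in headless mode and wait for the field before interacting.
require 'watir'

browser = Watir::Browser.new :chrome,
                             headless: true,
                             options: { args: ['--window-size=1920,1080'] }
browser.goto BASE_URL
browser.text_field(name: 'identifier').wait_until(&:present?).set USER_EMAIL
browser.element(xpath: '//*[@id="identifierNext"]').click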

"New version available" with service worker and sw-precache

I'm trying to use sw-precache, but I must be doing something wrong!
I'm mostly using the demo code available from the github repo and can't seem to get updates to the app to come through. Once it's cached the first time, it never checks for new versions.
I was expecting that when I publish a new service worker, the browser would request the new service worker and update the cache accordingly in the background. Then using the registration code in the example, I would be able to prompt the user to refresh and get the latest version from their newly refreshed cache.
Would really appreciate if someone could please point me in the right direction.
Example
To demonstrate the problem, I've created an isolated example here:
https://github.com/stevenocchipinti/sw-precache-demo
The example uses a basic skeleton from create-react-app, which has a built-in build task that takes care of fingerprinting the filenames, etc.
I suspect the problem is that I'm caching everything with the following sw-precache config:
{
  "staticFileGlobs": [ "build/**/*.*" ],
  "stripPrefix": "build/"
}
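With a config like the one above, the service worker is regenerated after each build with the sw-precache CLI, roughly like this (the config file name here is just an assumption):

npm run build
sw-precache --config=sw-precache-config.json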
There are more accurate steps in the repo's readme, but the basic steps I'm taking to reproduce the problem are as follows (with my probably incorrect expectations).
Steps and Assumptions
Browse to the app for the first time
I should see Content is now available offline! in the console
Reload the page
The message in the console should not appear again because the service worker is installed, but the page should still work.
Go offline and reload the page
The page should still work
Make a visible change to the source code
Rebuild (run the build task and sw-precache)
This is where my understanding must be wrong
Reload the page
The service worker should update the cache in the background
When it's done, you should see New or updated content is available. in the console
The changes themselves should not be visible until the next reload
Reload the page again
The browser will use the new cache this time around
The changes should be visible now!
There shouldn't be any messages in the console
The problem
Once the app has been cached initially, it will never update unless you unregister the service worker or force a reload.
I'm not sure how to make this work - any help would be greatly appreciated!
After replicating your development hosting environment, I can see that you're serving your service-worker.js file with a browser HTTP cache lifetime of one hour.
There's more information as to why this is leading to the behavior you're seeing, along with best practices, in this previous answer. As mentioned at the top of that answer, browsers plan on changing their behavior to stop honoring the HTTP cache for the service worker file by default, mainly due to the type of confusion that you're experiencing here. For the time being, though, the production versions of both Chrome and Firefox continue to honor those headers.
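Until then, one way to avoid the problem is to serve service-worker.js with HTTP caching disabled. A minimal sketch, assuming an Express static server (not part of your repo's setup):

// Serve service-worker.js with no HTTP caching so the browser always
// re-checks it, while the fingerprinted assets stay cacheable.
const express = require('express');
const path = require('path');
const app = express();

app.get('/service-worker.js', (req, res) => {
  res.set('Cache-Control', 'no-cache, no-store, must-revalidate');
  res.sendFile(path.join(__dirname, 'build', 'service-worker.js'));
});

app.use(express.static('build'));
app.listen(3000);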

How to detect when Selenium loads a browser's error page

Is there a universal way to detect when a selenium browser opens an error page? For example, disable your internet connection and do
driver.get("http://google.com")
In Firefox, Selenium will load the 'Try Again' error page containing text like "Firefox can't establish a connection to the server at www.google.com." Selenium will NOT throw any errors.
Is there a browser-independent way to detect these cases? For firefox (python), I can do
if "errorPageContainer" in [ elem.get_attribute("id") for elem in driver.find_elements_by_css_selector("body > div") ]
But (1) this seems like computational overkill (see next point below) and (2) I must create custom code for every browser.
If you disable your internet connection and use HtmlUnit as the browser, you will get a page with the following HTML:
<html>
<head></head>
<body>Unknown host</body>
</html>
How can I detect this without doing
if driver.find_element_by_css_selector("body").text == "Unknown host"
It seems like this would be very expensive to check on every single page load since there would usually be a ton of text in the body.
Bonus points if you also know of a way to detect the type of load problem, for example no internet connection, unreachable host, etc.
The WebDriver API does not expose HTTP status codes, so if you want to detect or manage HTTP errors, you should use a debugging proxy.
See Jim's excellent post Implementing WebDriver HTTP Status on how to do exactly that.
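A rough Python sketch of that approach, assuming the browsermob-proxy bindings and the older FirefoxProfile-based Selenium API (the path and names are placeholders, not from the original post):

from browsermobproxy import Server
from selenium import webdriver

server = Server("/path/to/browsermob-proxy")  # placeholder path to the proxy launcher
server.start()
proxy = server.create_proxy()

profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)

proxy.new_har("page")
driver.get("http://google.com")

# Each HAR entry records the HTTP status of a response; a failed navigation
# shows up as a missing entry or an error status instead of 200/304.
for entry in proxy.har["log"]["entries"]:
    print(entry["request"]["url"], entry["response"]["status"])

driver.quit()
server.stop()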
If you just need to remote-control the Tor Browser, you might also consider the Marionette framework by Mozilla. Bonus: it fails when a page cannot be loaded (see navigate(url) in the API):
The command will return with a failure if there is an error loading
the document or the URL is blocked. This can occur if it fails to
reach the host, the URL is malformed, the page is restricted (about:*
pages), or if there is a certificate issue to name some examples.
Example use (copied from another answer):
To use with the Tor Browser, enable marionette at startup via
Browser/firefox -marionette
(inside the bundle). Then, you can connect via
from marionette import Marionette
client = Marionette('localhost', port=2828);
client.start_session()
and load a new page for example via
url='http://mozilla.org'
client.navigate(url);
For more examples, there is a tutorial.

Ajax Post Request blocks website loading

I have a strange problem with AJAX POST requests. I use the request to run an ImageMagick process directly on the command line via PHP's exec() function. The process takes about a minute and then responds with some variables. This works fine, except for one problem: during the execution time I cannot access other parts of the website hosted on the same web server (as if the server were unreachable). When the process finishes, everything works fine again.
I first thought this was due to an overloaded server. However, when you access the website from another browser, there are no problems, even while the process is running in the first browser. So it looks like the problem has something to do with the browser blocking other requests during the POST request.
Could anyone help me out here? What could be the root problem?
Found the solution, thanks to the help from kukipei! By adding session_write_close() to the file that handles the AJAX request (after it has read the user id and token), the session file is no longer locked and all pages are accessible again. The problem was that the session stayed locked for the whole execution time of the process, which was not necessary, since I only needed the session to read the user id and token. So before calling the ImageMagick operation, I now call session_write_close().
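A minimal sketch of the resulting request handler (the variable names and ImageMagick command are illustrative, not from the original post):

<?php
session_start();

// Read what we need from the session first...
$userId = $_SESSION['userid'];
$token  = $_SESSION['token'];

// ...then release the session lock so other requests from the same
// browser are not blocked while ImageMagick runs.
session_write_close();

exec('convert input.png -resize 800x600 output.png', $output, $status);

echo json_encode(array('status' => $status, 'output' => $output));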

Zend_Session and Zend_Log_Db are both writing to the database twice for every page load

There are plenty of examples of similar problems littered around the web but none of their solutions seem to fix this particular variation. Any suggestions would be appreciated.
Usually this problem occurs because a rogue link causes a request for a resource, such as a favicon or CSS file, to hit the dispatcher more than once, thus causing multiple dispatch processes and therefore multiple rows in your database.
I have checked that all the links on this very simple example page do actually resolve to the resource to which they point.
The session handler is setup as follows:
Zend_Db_Table_Abstract::setDefaultAdapter($db);
Zend_Session::setSaveHandler(
    new Zend_Session_SaveHandler_DbTable($config->session->toArray()));
The db logging is setup as follows:
$writer = new Zend_Log_Writer_Db($db, $config->log->tableName,
    $config->log->columnMap->toArray());
$logger = new Zend_Log($writer);
Both objects are set up correctly and can read from and write to the database. Only, everything happens twice. If I put a test log message anywhere in the application, it is written to the database twice. If I increment three variables on every call to the index action - one stored in the session, one passed around via a Zend_Registry object, and another local to indexAction - only the session variable is incremented by 2. The Apache access log shows the correct number of requests being fired by the page load, and all have good response codes of either 200 or 304 (unchanged).
I have tried disabling all head links.
I have tried disabling the layout entirely.
I have localised everything to the dispatcher and exited before dispatch is run.
In all cases the extra write/increment takes place.
Any thoughts?
Thanks in advance for any help.
I seem to have found and fixed the issue. Chrome (and possibly all WebKit browsers) issues an additional HEAD request on top of the GET, which means the application is hit twice and anything session-based is triggered by both requests. My temporary solution is to put the following code near the start of my index.php file.
if ("HEAD" == $_SERVER['REQUEST_METHOD']) {
exit;
}
I hope that helps anyone with the same issue.
Google Chrome also always requests favicon.ico, which generates additional requests to the server. Take this into account with Chrome.
For more information:
http://framework.zend.com/issues/browse/ZF-11502?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#issue-tabs
Thanks to Sebastian Galenski for the contribution.
