I am trying to download a file from a URL using Selenium and Firefox on Python 3, but I get an error in the geckodriver log file:
(firefox:13723): Gtk-WARNING **: 11:12:39.178: Theme parsing error: <data>:1:77: Expected ')' in color definition
1546945960048 Marionette INFO Listening on port 40601
1546945960132 Marionette WARN TLS certificate errors will be ignored for this session
console.error: BroadcastService:
receivedBroadcastMessage: handler for
remote-settings/monitor_changes
threw error:
Message: Error: Polling for changes failed: NetworkError when attempting to fetch resource..
Stack:
remoteSettingsFunction/remoteSettings.pollChanges#resource://services-settings/remote-settings.js:188:13
I use geckodriver version 0.22 and Firefox version 65.0, on Ubuntu 18 (SSH access only).
geckodriver is in the /usr/bin directory and has all the needed permissions.
I have read on Google that this might be because of CORS, but I don't really understand what CORS is or how to fix it (if that is the real problem).
Here is my code:
from os import getcwd
from pyvirtualdisplay import Display
from selenium import webdriver
# start the virtual display
display = Display(visible=0, size=(800, 600))
display.start()
# configure firefox profile to automatically save csv files in the current directory
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", getcwd())
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")
driver = webdriver.Firefox(firefox_profile=fp)
page = "https://www.thinkbroadband.com/download"
driver.get(page)
driver.find_element_by_xpath("//*[@id='main-col']/div/div/div[8]/p[2]/a[1]").click()
Do you have any idea what is going wrong?
This error message...
Message: Error: Polling for changes failed: NetworkError when attempting to fetch resource..
...implies that there was a NetworkError while attempting to fetch resource.
Here the main issue is probably related to Cross-Origin Resource Sharing (CORS).
Cross-Origin Resource Sharing (CORS) is a mechanism that uses additional HTTP headers to tell a browser to let a web application running at one origin (domain) have permission to access selected resources from a server at a different origin. A web application makes a cross-origin HTTP request when it requests a resource that has a different origin (domain, protocol, and port) than its own origin.
An example of a cross-origin request: The frontend JavaScript code for a web application served from http://domain-a.com uses XMLHttpRequest to make a request for http://api.domain-b.com/data.json.
For security reasons, browsers restrict cross-origin HTTP requests initiated from within scripts. For example, XMLHttpRequest and the Fetch API follow the same-origin policy. This means that a web application using those APIs can only request HTTP resources from the same origin the application was loaded from, unless the response from the other origin includes the right CORS headers.
Modern browsers handle the client-side components of cross-origin sharing, including headers and policy enforcement. But this new standard means servers have to handle new request and response headers.
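To make the notion of an origin concrete, here is a minimal Python sketch (not part of the original answer) that compares the scheme, host, and port of two URLs. The helper names origin and is_cross_origin are hypothetical, and real browsers apply additional rules, so treat it as an illustration only.

```python
from urllib.parse import urlsplit

# Default ports, used when a URL does not spell the port out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def is_cross_origin(page_url, resource_url):
    """True when the resource lives at a different origin than the page."""
    return origin(page_url) != origin(resource_url)


# The example from the text: a page on domain-a requesting data from domain-b.
print(is_cross_origin("http://domain-a.com/app", "http://api.domain-b.com/data.json"))
```

A request flagged as cross-origin by this check is the kind a browser only lets a script read when the response carries suitable CORS headers.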
Solution
You need to use WebDriverWait to wait until the desired element is clickable. You can use the following solution:
Code Block:
from selenium import webdriver
from os import getcwd
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# configure firefox profile to automatically save csv files in the current directory
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", getcwd())
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")
driver = webdriver.Firefox(firefox_profile=fp, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.thinkbroadband.com/download")
WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='specific-download-headline' and contains(., 'Extra Small File (5MB)')]//following::p[1]/a"))).click()
Reference: How to resolve “TypeError: NetworkError when attempting to fetch resource.”
I got the same error. After updating geckodriver to version 0.24.0 (2019-01-28), it worked fine for me. Try this:
xxxxx:~$ geckodriver --version
geckodriver 0.24.0 ( 2019-01-28)
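If you want to check the installed version from a script rather than by eye, a rough sketch could look like the following. The function names are hypothetical, it assumes geckodriver is on the PATH, and it assumes the first line of the output looks like the line shown above.

```python
import re
import subprocess


def parse_geckodriver_version(output):
    """Extract (major, minor, patch) from text such as 'geckodriver 0.24.0 ( 2019-01-28)'."""
    match = re.search(r"geckodriver\s+(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError("could not find a geckodriver version in: %r" % output)
    # A missing patch component (e.g. '0.22') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def installed_geckodriver_version(binary="geckodriver"):
    """Run '<binary> --version' and parse the result (assumes the binary is on PATH)."""
    result = subprocess.run([binary, "--version"], capture_output=True, text=True, check=True)
    return parse_geckodriver_version(result.stdout)


# Compare the version from the question with the one that worked here.
print(parse_geckodriver_version("geckodriver 0.22") < parse_geckodriver_version("geckodriver 0.24.0 ( 2019-01-28)"))
```

Tuples compare element by element, so the comparison works without any extra version-handling library.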
I have been trying to deploy a Python predictive model using a Flask API, but I am not able to receive the image request. It works fine when I test it with text, but I think I am missing something when it comes to dealing with images. I uploaded the image in Postman using form-data, with the key set to type 'file' and name 'image'.
This is the error I am getting in Postman:
werkzeug.exceptions.BadRequestKeyError: 400 Bad Request: The browser (or proxy) sent a request that this
server could not understand.
KeyError: 'file'
In my flask code, I am using this line of code to receive the request:
imagefile = request.files['image']
Please help me figure this out. Thank you
I had a similar issue before. You can try to add this to your code:
from flask import Flask
# Installation command: python3 -m pip install flask-cors
from flask_cors import CORS
# Pass your own app variable: CORS(your_app)
app = Flask(__name__)
CORS(app)
@app.route("/")
def helloWorld():
    return "Hello, cross-origin-world!"
Full documentation: https://flask-cors.readthedocs.io/en/latest/
Also if you are trying to use HTTPS just change it to HTTP.
I am trying to use the Hugging Face pipeline behind a proxy.
Consider the following line of code
from transformers import pipeline
sentimentAnalysis_pipeline = pipeline("sentiment-analysis")
The above code gives the following error.
HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json (Caused by ProxyError('Your proxy appears to only use HTTP and not HTTPS, try changing your proxy URL to be HTTP. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#https-proxy-error-http-proxy', SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1091)'))))
I checked the proxy settings on my machine (OS: Windows Server 2016 Datacenter) using the following code.
import urllib.request
print(urllib.request.getproxies())
The output is as follows:
{'http': 'http://12.10.10.12:8080', 'https': 'https://12.10.10.12:8080', 'ftp': 'ftp://12.10.10.12:8080'}
However, according to the urllib3 documentation, the above setting is incompatible and the problem lies in the https setting:
{
"http": "http://127.0.0.1:8888",
"https": "https://127.0.0.1:8888" # <--- This setting is the problem!
}
and the right setting is
{ # Everything is good here! :)
"http": "http://127.0.0.1:8888",
"https": "http://127.0.0.1:8888"
}
How can we change the proxy setting from "https": "https://127.0.0.1:8888" to "https": "http://127.0.0.1:8888" on Windows?
I tried setting a Windows environment variable named "https_proxy" with the value http://127.0.0.1:8888. However, that did not work.
I found the solution and it is pretty simple. Include the following lines in your Python script or notebook, changing proxy_url and proxy_port to match your setup. I hope it helps someone in the community.
import os
os.environ['HTTP_PROXY'] = 'http://proxy_url:proxy_port'
os.environ['HTTPS_PROXY'] = 'http://proxy_url:proxy_port'
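If you would rather derive the values from what the system already reports than type them by hand, something along these lines might work. It is only a sketch: both helper names are hypothetical, and it assumes the proxy itself speaks plain HTTP, as the urllib3 note in the question suggests.

```python
import os
from urllib.parse import urlsplit


def force_http_scheme_for_https_proxy(proxies):
    """Return a copy of proxies where the 'https' entry points at an http:// proxy URL."""
    fixed = dict(proxies)
    https_proxy = fixed.get("https")
    if https_proxy:
        parts = urlsplit(https_proxy)
        if parts.scheme == "https":
            # Keep host and port, change only the scheme used to reach the proxy.
            fixed["https"] = parts._replace(scheme="http").geturl()
    return fixed


def export_proxies(proxies, environ=os.environ):
    """Write the http/https entries into HTTP_PROXY / HTTPS_PROXY."""
    for scheme in ("http", "https"):
        if scheme in proxies:
            environ[scheme.upper() + "_PROXY"] = proxies[scheme]


# The mapping printed by urllib.request.getproxies() in the question.
detected = {
    "http": "http://12.10.10.12:8080",
    "https": "https://12.10.10.12:8080",
    "ftp": "ftp://12.10.10.12:8080",
}
print(force_http_scheme_for_https_proxy(detected)["https"])
```

Calling export_proxies(force_http_scheme_for_https_proxy(urllib.request.getproxies())) before importing transformers would then have the same effect as the two os.environ lines above.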
Method 1 (recommended): pypac
pip install pypac
Monkeypatch the requests library.
import requests
import pypac
def request(method, url, **kwargs):
    with pypac.PACSession() as session:
        return session.request(method=method, url=url, **kwargs)
requests.request = request
transformers should now work as expected.
from transformers import pipeline
sentimentAnalysis_pipeline = pipeline("sentiment-analysis")
Method 2: Disable SSL verification
WARNING: This method could expose you to malicious attacks.
Disabling SSL verification is a bad idea in general, but the story is even worse here because (afaik) transformers may download code and run exec on it. This opens the door for a man-in-the-middle to execute arbitrary code on your machine.
This is probably a very bad idea unless you really know what you're doing or you don't care at all about your machine's security. Otherwise, do not use this method.
import requests
import functools
requests.request = functools.partial(requests.request, verify=False)
Explanation
Setting the HTTP_PROXY and HTTPS_PROXY environment variables might not be enough to get through your corporate firewall. It may be using a .pac file to autoconfigure its proxy on your machine. Browsers pick this file up and use it automatically, as do some development tools (e.g. JetBrains). The requests library does not appear to do so.
Fortunately, there is a library called PyPAC that can do this for you. But you'll need to monkeypatch requests to use PyPAC's request method rather than its own.
You don't need to patch requests.get since that function delegates to requests.request anyway.
I am trying to implement the IIIF standard in order to show some papyri. I have configured Loris as an image server (here there is an info.json example: https://philhist-papyri-01.philhist.unibas.ch/loris/1/images/1.RectoIliad19th(T)book-IR-enh.jpg/info.json) and also I have configured Mirador. I am also serving manifests via an API (example: https://philhist-papyri-01.philhist.unibas.ch/api/iiif/11b4ca60-6bac-11eb-a1e6-005056b34690/manifest).
When I try to load the images in Mirador, I am getting an error:
Tile push../node_modules/openseadragon/build/openseadragon/openseadragon.js.$.Tile failed to load: https, https://philhist-papyri-01.philhist.unibas.ch, philhist-papyri-01.philhist.unibas.ch/6%2Fimages%2F6.VersoUnidentifiedLiteraryText-IR.jpg/full/4,/0/default.jpg - error: Image load aborted
Does anybody have any idea where this is coming from? The image can actually be retrieved from the URI in the manifest (https://philhist-papyri-01.philhist.unibas.ch/loris/1/images/1.RectoIliad19th(T)book-IR-enh.jpg/full/full/0/default.jpg), but it is not being shown in the Mirador window.
There might be an issue with the Loris resolver that causes the @id of the image not to be canonical, but I am not quite sure.
It looks like CORS may not be enabled for your info.json responses.
See: https://projectmirador.org/embed/?iiif-content=https://philhist-papyri-01.philhist.unibas.ch/api/iiif/11b4ca60-6bac-11eb-a1e6-005056b34690/manifest
Depending on how you use Loris to serve content, you will need to enable CORS for the IIIF requests.
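One rough way to confirm this from outside the browser is to request the info.json with an Origin header and look at the response headers. The sketch below is only an assumption about how you might test it: the helper names are hypothetical, and the check is simplified compared with what a browser enforces (it ignores credentials and preflight requests, for instance).

```python
from urllib.request import Request, urlopen


def allows_origin(headers, origin):
    """Simplified check: does Access-Control-Allow-Origin permit this origin?"""
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    allowed = normalized.get("access-control-allow-origin")
    return allowed == "*" or allowed == origin


def fetch_headers(url, origin):
    """Fetch url while sending an Origin header, and return the response headers as a dict."""
    request = Request(url, headers={"Origin": origin})
    with urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


# Header sets as a server with and without CORS enabled might return them.
print(allows_origin({"Access-Control-Allow-Origin": "*"}, "https://projectmirador.org"))
print(allows_origin({"Content-Type": "application/json"}, "https://projectmirador.org"))
```

To test a live server, pass the result of fetch_headers(info_json_url, "https://projectmirador.org") to allows_origin, where info_json_url is one of your info.json addresses.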
I've been in the process of making an Instagram bot for a few days now and I'm having trouble logging into Instagram with a headless Selenium browser.
The script I made works perfectly fine when run on my laptop, but when I run it on my DigitalOcean server, the browser fills in the login form and submits it, but nothing happens. I was able to print any console errors with this code:
for entry in self.browser.get_log('browser'):
    print(str(entry))
and this error comes up
{u'source': u'network', u'message': u'https://www.instagram.com/accounts/login/ajax/ - Failed to load resource: the server responded with a status of 400 ()', u'timestamp': 1546536937734, u'level': u'SEVERE'}
I am using ChromeDriver with Selenium and Python 2.7.
I'm not completely sure why this is happening. Thanks for any help!
This error message...
{u'source': u'network', u'message': u'https://www.instagram.com/accounts/login/ajax/ - Failed to load resource: the server responded with a status of 400 ()', u'timestamp': 1546536937734, u'level': u'SEVERE'}
...implies that the WebDriver was unable to initiate/spawn a new WebBrowsing Session as it failed to load the required resources through AJAX calls.
As per the HTML DOM of the Instagram - Log in page it is pretty clear the DOM Tree contains AJAX and JavaScript enabled elements.
So after invoking the get() method and before interacting with the elements on the page, you need to use WebDriverWait to wait for the desired elements to be clickable, which will ensure that:
The associated JavaScript and AJAX calls have completed.
The desired elements are enabled and visible, so they can receive click events propagated through Selenium.
You can find a relevant discussion in Google adsense responding the server responded with a status of 400 ()
Here you can find a relevant discussion on Filling in login forms in Instagram using selenium and webdriver (chrome) python OSX
I'm having a pretty weird problem with CORS on a webapp I'm trying to make.
I'm using Servlets (Tomcat 8.0) for the backend. It's a school project, so I can't use a framework.
A GET request to http://localhost:8080/FileBox/dashboard
returns a JSON payload (plain JSON, not JSONP, which I could use, but it's the same domain). I'm using Ajax to make the XHR, but it's being blocked by Chrome as CORS.
Should this be happening, since I'm making the XHR from the same domain (host + port)?
'localhost:8080/FileBox/dashboard.jsp'
to
'localhost:8080/FileBox/dashboard'
Please, and thank you for the help!
You aren't making a request to http://localhost:8080/FileBox/dashboard. The error message says you are making a cross-origin request using an unsupported scheme and that http is a supported scheme.
Presumably you have made the two mistakes of:
Getting the URL wrong
You should be using a relative URL:
/FileBox/dashboard
but are trying to use an absolute URL:
http://localhost:8080/FileBox/dashboard
but have typed it wrong and are actually requesting
localhost:8080/FileBox/dashboard
Not loading the page over HTTP to start with
Possibly by double clicking the file in your system file manager, you have bypassed your HTTP server and are loading something like file:///c:/users/you/yourproject/index.html
Combined with the previous mistake, you end up trying to request file:///c:/users/you/yourproject/localhost:8080/FileBox/dashboard with Ajax, and get a security violation.
Solution
Fix the URL to be a proper relative URL
Point your browser at http://localhost:8080 instead of double clicking files in your file manager
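The difference between the two URL forms can be illustrated with Python's URL parser. This is only a sketch: browsers use their own parser, which may differ in detail, and PAGE_URL is an assumed location for the page making the request.

```python
from urllib.parse import urljoin, urlsplit

# Assumed address of the page that issues the Ajax request.
PAGE_URL = "http://localhost:8080/FileBox/dashboard.jsp"

# A proper relative URL is resolved against the page's own scheme, host, and port.
resolved = urljoin(PAGE_URL, "/FileBox/dashboard")
print(resolved)

# Without "http://", the text before the first colon is parsed as the scheme,
# so the mistyped URL no longer uses a scheme that Ajax requests support.
mistyped = urlsplit("localhost:8080/FileBox/dashboard")
print(mistyped.scheme)
```

Using the relative form keeps the request on the same origin as the page no matter which host or port the app is deployed on.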