I am automating Internet Explorer using SHDocVW.dll and MSHTML with C#, and I wish to save an image from the page to the disk (JPEG format).
I can't use the WebClient class to download the image; if I do it, I end up downloading the site's login page. I can't print the screen either, because the browser has to remain invisible during this process, running in the background.
I have tried to do the following:
IHTMLImgElement imgElement = ...;
IHTMLControlRange imgRange = ...;
imgRange.add(imgElement as IHTMLControlElement);
imgRange.execCommand( "copy", false, null );
This does nothing. I am not able to extract anything from the clipboard. Every solution I found didn't work for me.
Your WebClient approach is probably missing cookies. See "How do I log into a site with WebClient?" for an example that handles cookies.
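The cookie point is not specific to C#. As a minimal sketch in Python (not the asker's WebClient code), the following shows that an image request only carries the login session once the cookie jar is applied to it. The host example.com, the cookie name "session", and its value are hypothetical, and the cookie is added by hand so the sketch runs without a network connection.

```python
import urllib.request
from http.cookiejar import Cookie, CookieJar


def make_session_cookie(name, value, domain):
    """Build a plain session cookie (hypothetical values, for illustration)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


# In a real run the jar is filled by the login response; here the cookie
# is inserted manually so nothing has to be fetched.
jar = CookieJar()
jar.set_cookie(make_session_cookie("session", "abc123", "example.com"))

# An opener built with this processor sends the jar's cookies on each request.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Show which Cookie header the image request would carry.
request = urllib.request.Request("http://example.com/image.jpg")
jar.add_cookie_header(request)
cookie_header = request.get_header("Cookie")
```

Without the jar, the same request has no Cookie header, which is consistent with the server answering with its login page instead of the image.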
Your code looks fine, except that the user has to change the security setting to enable clipboard access. If the image is cached on disk, you can parse the page for the image location and then dig the file out of the WinInet cache.
I'm creating a web extension, porting it from XUL. I used to be able to read files easily with
var dJsm = Components.utils.import("resource://gre/modules/Downloads.jsm").Downloads;
var tJsm = Components.utils.import("resource://gre/modules/Task.jsm").Task;
var fuJsm = Components.utils.import("resource://gre/modules/FileUtils.jsm").FileUtils;
var nsiPromptService = Components.classes["#mozilla.org/embedcomp/prompt-service;1"].getService(Components.interfaces.nsIPromptService);
....
NetUtil.asyncFetch(file, function(inputStream, status) {
if (!Components.isSuccessCode(status)) {
return;
}
var data = NetUtil.readInputStreamToString(inputStream, inputStream.available());
var data = window.btoa(data);
var encoded_data_to_send_via_xmlhttp = encodeURIComponent(data);
...
});
The approach above is being deprecated.
I can use downloads.download() to find out what the last download was, but I can NOT read the file and then produce the equivalent of encoded_data_to_send_via_xmlhttp.
Also, from Firefox 57 onwards the following is no longer supported, which means I would have to fake a user action such as a button click, or have the user upload the file:
Access to file:// URLs or reading files without any explicit user input
Isn't there an easy way to read the last downloaded file?
The WebExtension API no longer allows extensions to read local files. You could give the extension CORS privileges and read the content directly from the URL via fetch() or XMLHttpRequest() as a blob, store it in IndexedDB or in memory, then encode it and send it to the server. This comes with many restrictions and limitations, such as which origins you can read from.
It would also add potentially many unneeded steps. If the purpose is, as it seems to be in the question at the moment, to share the downloaded file with a server, I would instead suggest that you obtain the last DownloadItem object, extract the URL (.url) from it, and send that URL to the server.
This way the server can load directly from that URL (and encode it on server if needed). The network load will be about the same (a little less actually since there is no Base64 encoding involved which adds 33% to the size), and much less load on the client. The server would read the data as a binary/byte data stream; about the same as if the data was sent directly from the extension.
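The 33% figure follows from Base64 mapping every 3 input bytes to 4 output characters. A small Python sketch of that arithmetic, using an arbitrary 3000-byte payload chosen for illustration:

```python
import base64


def base64_overhead(raw: bytes) -> float:
    """Return the relative size increase caused by Base64-encoding raw."""
    encoded = base64.b64encode(raw)
    return len(encoded) / len(raw) - 1.0


# 3000 bytes is a multiple of 3, so no padding is needed.
payload = bytes(3000)
encoded_size = len(base64.b64encode(payload))
overhead = base64_overhead(payload)
```

Here encoded_size is 4000, so the overhead is one third. Passing the result through encodeURIComponent, as the question does, can only add to that, because characters such as + and / are percent-encoded.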
To obtain the last downloaded file you would do the following from a privileged script:
browser.downloads.search({
limit: 1,
orderBy: ["-startTime"]
})
.then(getLastDownload);
function getLastDownload(downloads) {
if (downloads.length) {
var url = downloads[0].url;
// ... send url to the server and let server fetch the data from it directly
}
}
According to this Mozilla support question:
(2) Local file security
Firefox limits access from pages on web servers to pages on local disk or UNC paths. [...]
Possible solutions:
Use the local-filesystem-links Firefox add-on (not tested)
and/or
run a small local web server on the client. Provided that server runs with sufficient privileges, you can then access any local content via http:// (though still not via file:///).
I am looking for a way to automatically download a file from a website.
Currently the process is entirely manual and tedious.
I go to a web page, enter my password and log in.
It opens a pop-up, where I have to click a download button to save a .zip file.
Do you have any advice on how I could automate this task?
I am on Windows 7 and can mainly use MS-DOS batch files or Python, but I am open to other ideas.
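You can use Selenium WebDriver to automate the download. The snippet below sets the browser's download preferences in Java (FirefoxProfile is in org.openqa.selenium.firefox).
FirefoxProfile profile = new FirefoxProfile();
// 2 = save downloads to the custom folder given in browser.download.dir
profile.setPreference("browser.download.folderList", 2);
profile.setPreference("browser.download.manager.showWhenStarting", false);
profile.setPreference("browser.download.dir", "C:\\downloads");
// MIME types that should be handled without prompting the user
profile.setPreference("browser.helperApps.neverAsk.openFile","text/csv,application/x-msexcel,application/excel,application/x-excel,application/vnd.ms-excel,text/html,text/plain,application/msword,application/xml");
// The question's file is a .zip; application/zip is an assumption about the MIME type the site sends
profile.setPreference("browser.helperApps.neverAsk.saveToDisk", "application/zip");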
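To handle the pop-up when it appears, use the Robot class (java.awt.Robot, with key codes from java.awt.event.KeyEvent):
// new Robot() can throw AWTException, so declare or catch it
Robot robot = new Robot();
// Press Down, then Enter, to choose the option and confirm the dialog
robot.keyPress(KeyEvent.VK_DOWN);
robot.keyRelease(KeyEvent.VK_DOWN);
robot.keyPress(KeyEvent.VK_ENTER);
robot.keyRelease(KeyEvent.VK_ENTER);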
You'll want to take a look at requests (to fetch the HTML and the file) and BeautifulSoup (to parse the HTML and find the links).
requests has built-in auth: http://docs.python-requests.org/en/latest/
BeautifulSoup is quite easy to use: http://www.crummy.com/software/BeautifulSoup/bs4/doc/
Pseudocode: use requests to download the site's HTML, authenticating as you go. Parse it and go through the links: if a link meets your criteria, save it in a list, otherwise continue. When all the links have been scraped, go through the list and download each file with requests: req = requests.get('url_to_file_here', auth=('username', 'password')); if req.status_code == 200, the file's bytes are in req.content (use content rather than text, because a .zip is binary).
If you can post the link to the site you want to download from, maybe we can do more.
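The link-collection step of that pseudocode can be sketched as follows. To keep it runnable without third-party packages, this version swaps BeautifulSoup for the standard library's html.parser. The sample HTML, the base URL, and the ".zip" criterion are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def find_zip_links(html, base_url):
    """Return the links in html that point at .zip files."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [link for link in collector.links if link.lower().endswith(".zip")]


# Hypothetical page content standing in for the real site.
sample_html = """
<a href="/files/report.zip">Report</a>
<a href="help.html">Help</a>
<a href="archive/2015.ZIP">Archive</a>
"""
zip_links = find_zip_links(sample_html, "http://example.com/downloads/")
```

Each collected link would then be fetched with requests.get(link, auth=(user, password)) and the response's content written to disk in binary mode.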
I am having an erratic problem using Azure blob storage where my images do not load consistently. The problem is that sometimes when I load a web page, the browser will not show the image, but if I refresh it will load correctly.
When the image doesn't load, the browser shows the default image placeholder. Here is an example:
If I check the hyperlink for the image placeholder, I find that it is the same as when the image loads successfully, except that the Shared Access Signature is different.
Sometimes the same image will fail to load for one link but load successfully for another link, even on the same page and in the same page load. The only difference in the URL is the Shared Access Signature.
Here is my code to build the URL with the shared access signature:
// Get reference to blob (file) that is to be downloaded
blob = blobContainer.GetBlobReference(blobURL.ToString());
// Get a shared access signature to download the file from the Azure blob (valid for 60 minutes from now)
signature = blob.GetSharedAccessSignature(new SharedAccessPolicy()
{
SharedAccessStartTime = null,
SharedAccessExpiryTime = DateTime.UtcNow.AddMinutes(60),
Permissions = SharedAccessPermissions.Read
});
// Append signature query string to blob / file that is to be downloaded
downloadURL = string.Format("{0}{1}", blob.Uri.AbsoluteUri, signature);
This is the final HTML image link on the web page, i.e. if I show source on the web page in the browser:
<img alt="Profile Picture" src="https://mystorageaccount2.blob.core.windows.net/abcdefg1-hi23-40b5-86de-a20b568f5626/1601/1234d664d1b74ce1aebf4403e5b74af7.jpg?se=2015-10-31T11%3A38%3A39Z&sr=b&sp=r&sig=SaiUToJg%5Ab3zcdef8EeOq84urHf6HQqS%2BAFt1dEQMNI%3D">
Has anyone else seen this problem? Any recommendations on what I might be doing wrong?
I suspect this could be related to the expiry period you have set on the image blob's shared access signature (SAS). Is there a good reason to keep the SAS lifetime so short (your code sets 60 minutes) when its permission is read-only?
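One way to test the expiry theory is to read the expiry time back out of a failing URL and compare it with the time of the request. The following is a minimal Python sketch, assuming the URL has the same shape as the one in the question, where the se query parameter holds the expiry as a UTC timestamp. The container name, blob name, signature value, and the "now" timestamp are placeholders.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def sas_expiry(url):
    """Return the expiry time encoded in the SAS 'se' query parameter."""
    query = parse_qs(urlsplit(url).query)
    raw = query["se"][0]  # e.g. "2015-10-31T11:38:39Z" after URL-decoding
    parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def seconds_remaining(url, now):
    """Seconds until the SAS expires; negative if it has already expired."""
    return (sas_expiry(url) - now).total_seconds()


# Placeholder URL: only the 'se' value is taken from the question.
url = ("https://mystorageaccount2.blob.core.windows.net/container/picture.jpg"
       "?se=2015-10-31T11%3A38%3A39Z&sr=b&sp=r&sig=PLACEHOLDER")
now = datetime(2015, 10, 31, 10, 38, 39, tzinfo=timezone.utc)
remaining = seconds_remaining(url, now)
```

If the links that fail still have plenty of time remaining when they are requested, the expiry period is probably not the cause and the signature itself is the next thing to compare.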
I want to add validation to a filefield in ExtJS 4 so that the user can only browse for .png and .jpeg image files. How should I do it?
{
xtype: 'filefield',
id:'photoUpload',
buttonOnly:true,
buttonText: 'Photo'
}
I think it is important to understand how file upload works, to save yourself trouble in the future.
For security reasons, the following applies:
Browsers cannot access the file system unless the user has explicitly clicked on an upload field.
The browser has minimal access to the file being uploaded. In particular, your JS code may be able to see the file name (the browser has to display it in the field), but nothing else (on most browsers the reported path is not the real one).
The upload process itself happens in these steps:
The user clicks on an upload field, initiating the file select dialog.
The browser implements access to the file system through the dialog, allowing the user to select a file.
Upon OK click, the browser sends the file to the server.
The server places the file in its temp directory (configured per server).
Once upload is complete, the upload script on the server is called with the file details, and that script will have full access to the uploaded file.
The last step is the only point where you have full access to the file details, including the real actual name, its size, and its content.
Anything the browser gives JavaScript is browser-dependent. All the browsers I know of do expose the actual file name (but not the real path), yet the details vary between browsers and you cannot rely on this to keep working in future versions. The file name is available only because it has to be displayed on the client side.
So the recommendation is this:
Do all file upload checks on the server side.
Again, you may get away with checking the file name in client-side JS, particularly if you know and can test which browsers your clients will use, but I'd strongly recommend doing this check on the server.
The last thing you have to remember is that users might upload a file ending with .png, but the file itself is a .zip with the extension changed - so to really confirm that the file is .png you need to actually look into the file data, which only the server can do.
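Looking into the file data usually means checking its leading "magic" bytes on the server. The following is a minimal Python sketch of that check, assuming the upload script already holds the file content as bytes. The function name and the sample byte strings are illustrative, and a real check may want to inspect more than the first few bytes.

```python
# Well-known file signatures: PNG files start with these 8 bytes,
# JPEG files start with FF D8 FF.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_PREFIX = b"\xff\xd8\xff"


def sniff_image_type(data: bytes):
    """Return "png" or "jpeg" based on the leading bytes, else None."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_PREFIX):
        return "jpeg"
    return None


# A .zip renamed to .png still starts with the ZIP signature "PK\x03\x04",
# so it is rejected regardless of its file name.
renamed_zip = b"PK\x03\x04" + b"\x00" * 16
real_png_start = PNG_SIGNATURE + b"\x00\x00\x00\rIHDR"
```

With these samples, sniff_image_type(renamed_zip) returns None and sniff_image_type(real_png_start) returns "png".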
{
xtype: 'filefield',
id:'photoUpload',
buttonOnly:true,
vtype:'fileUpload',
buttonText: 'Photo'
}
And this is the VType I used:
Ext.apply(Ext.form.VTypes, {
fileUpload: function(val, field) {
var fileName = /^.*\.(gif|png|bmp|jpg|jpeg)$/i;
return fileName.test(val);
},
fileUploadText: 'Image must be in .gif,.png,.bmp,.jpg,.jpeg format'
});
Try the following snippet in your 'filefield' xtype config (the dot is escaped so that it matches a literal "." before the extension):
regex : (/\.(gif|jpg|jpeg|png)$/i),
regexText : 'Only image files allowed for upload',
msgTarget : 'under'
I have a big problem with Internet Explorer 7 and 8.
SITUATION:
I have a form that builds a medical prescription. When I hit the save button, the script saves the DomainObject to the DB, sets a boolean property called "saved" (on the panel that contains the form) to true, and fills a byte[] property called PDF with the byte stream.
In the panel's renderHead, I read this boolean and, if it is true, I trigger a hidden button with this code:
String js = "$('#" + printPDF.getMarkupId() + "').click();";
response.renderOnDomReadyJavaScript(js);
The button executes this code:
ResourceStreamRequestHandler handler = new ResourceStreamRequestHandler(new ByteArrayResourceStream(pdf, "application/pdf"));
handler.setFileName("foo.pdf");
RequestCycle.get().scheduleRequestHandlerAfterCurrent(handler);
This code works perfectly in Firefox and Chrome: the browser's download window appears and the user can save the PDF to disk.
Unfortunately, Internet Explorer has a security behavior that is triggered when a site tries to start a download. The warning requires user confirmation: a yellow bar appears and the user is forced to click "Download".
screenshot http://imageshack.us/a/img198/1438/securityg.jpg
When I click "Download File", the form is submitted again in the exact state it had when I hit save the first time. It is as if the previous INSERT had never been committed, the session is reset to its previous state, and so on.
The result is a double INSERT of the domain object in the DB.
Any clue how to resolve this?
The problem is that you click the download link programmatically instead of redirecting the browser to a URL or opening the URL from JavaScript with window.open(url). A programmatic click looks like an unwanted operation, which browsers sometimes restrict.