How to disable downloads for Chrome headless using a startup flag - google-chrome-headless

I am using Google Chrome headless for some crawling.
When the crawler visits a page with a file like this: www.example.com/file.zip, the file is downloaded, and I don't want the file to be downloaded. Is there any way to disable downloads via startup flags? Or some other way?
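One possible approach, not from the original thread, is to ask the browser over the DevTools Protocol to deny all downloads instead of looking for a flag. A minimal sketch, assuming Python with Selenium 4 and ChromeDriver; Page.setDownloadBehavior is a CDP command, not a command-line switch:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless")

driver = webdriver.Chrome(options=options)

# Ask the browser to refuse every download for this session (CDP command, not a flag).
driver.execute_cdp_cmd("Page.setDownloadBehavior", {"behavior": "deny"})

driver.get("http://www.example.com/file.zip")  # nothing is saved to disk
driver.quit()

Other drivers such as Puppeteer expose the same DevTools command through their CDP sessions, so the idea is not tied to Selenium.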

Related

When I download a file from Box into Google Colab, HTML is downloaded

I am trying to download a file (or several files) from Box into Google Colab using "wget". But what is downloaded looks like an HTML page, not the file itself.
I am using the command:
!wget https://AAA.box.com/s/mh7xq8lou9ukb5i7lssz0frou554dupb -O script.py
Is there a problem with the URL that I am using? I got the URL by opening the file in Box and clicking "Get shared link".
You are trying to download from a sharing link, which is a web page, not a direct download link, so wget downloads the web page. As a simple trick, you can click Download in the browser and cancel it, then copy the URL from the browser's download manager and use that with wget.
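A rough Python alternative for the same trick inside Colab, once you have copied the direct URL from the download manager; the URL below is only a placeholder, and the requests library is my own choice, not part of the original answer:

import requests

# Placeholder: paste the direct URL copied from the browser's download manager here.
direct_url = "https://example.box.com/direct-download-placeholder"
resp = requests.get(direct_url)
resp.raise_for_status()

# Save the payload under the name you want, e.g. script.py as in the question.
with open("script.py", "wb") as f:
    f.write(resp.content)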

Firefox - How to enable an HTML (with Javascript) file to save itself locally?

Firefox - How to enable an HTML file to save itself locally?
I use Firefox to open and edit TiddlyWiki.html files.
https://github.com/Jermolene/TiddlyWiki5
These are HTML files with a JavaScript app packaged together in one file.
There is also a Firefox extension called TiddlyFox that can enable the TiddlyWiki.html file to save itself locally to the file you just opened. It first asks if you want this ability to be enabled on that particular file, and if you click Yes, it works.
I was wondering how this behavior is achieved via the Firefox extension? (example source code here: https://github.com/TiddlyWiki/TiddlyFox )
Google Chrome has the FileSystem API. I don't know if this is how TiddlyWiki does it, but this answer shows how to do it in Chrome:
http://stackoverflow.com/a/13779352/1828637

Using Watir and IE 7 to download files

Based on this discussion: Watir Web driver to download file
Is there any other way, similar to this, to download a file using Watir without using something like RAutomation or AutoIt? The website I have to use is only formatted for IE and breaks when using other browsers.

Export all http requests on a specific page to txt/csv

I use SIEGE to test my web server's performance. For a more realistic test, the best way to go would be to have SIEGE hit the web page (website.com/our-company) and all static assets (.css, .js, .png, .jpg): everything that you see in the Firefox / Chrome debugging tools, except of course resources loaded from external servers (cdn.facebook, apis.google.com).
I am running several tests, so it is a pain to manually collect all the asset URLs. Is there a tool that I can use to load a web page and export the URL of everything that was loaded?
The Firefox debugging tools show exactly this list; if I could export it to txt or csv, it would be perfect.
I tried curl on the Debian CLI, but I am no expert. Any tool will help; it doesn't have to be a Firefox / Chrome plugin.
Best regards.
In Chrome you can export this data to a HAR file (it's JSON-based) in one click: go to the "Network" tab, right-click, and choose "Save as HAR with content".
Here's a free command line application to convert HAR files to CSV. Hope it helps.
http://www.yamamoto.com.ar/blog/?p=201
EDIT: added the project to GitHub:
https://github.com/spcgh0st/HarTools
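Since a HAR file is just JSON, a few lines of script can also pull out the request URLs (or a small CSV) without an extra tool. A rough Python sketch; the file names are placeholders of my own:

import csv
import json

# Load the HAR file exported from the Network tab ("Save as HAR with content").
with open("requests.har", encoding="utf-8") as f:
    har = json.load(f)

entries = har["log"]["entries"]

# Plain URL list, e.g. to feed to siege with its -f/--file option.
with open("urls.txt", "w") as f:
    for entry in entries:
        f.write(entry["request"]["url"] + "\n")

# Or a small CSV with method, status and URL.
with open("requests.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["method", "status", "url"])
    for entry in entries:
        writer.writerow([
            entry["request"]["method"],
            entry["response"]["status"],
            entry["request"]["url"],
        ])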
On Windows you could use HttpWatch to do this with the free Basic Edition in IE or Firefox:
http://www.httpwatch.com/download/
The CSV export function will export the URLs and other fields to a CSV file.
** Disclaimer: This was posted by Simtec Limited the makers of HttpWatch **
I had the same requirement of exporting HAR files from Chrome DevTools or Firebug to do load testing with siege. Additionally, I wanted to replay POST requests.
Choose one of these solutions:
hardy # https://github.com/nbibler/hardy - ruby script
har2siege # https://gist.github.com/photopresentr/7974747 - node.js (my script)
Never mind.
I just found the very nice Live HTTP Headers extension for Firefox.
Best regards.
As you guys know, a HAR file is just a JSON file. So... I looked for a JSON to CSV converter and found this:
https://json-csv.com/
This worked for my HAR file that I got from GTmetrix.com. Enjoy!
You can export all HTTP requests from the Chrome developer console by going to the Network tab:
select one of the requests in the Network tab
press the right mouse button
from the pop-up menu select Copy -> Copy all as HAR (cURL and other formats are also available)
paste the result into a file

View PDFs with Chromium on Windows

Is it possible?
Is there a way to install the plugin?
I've been searching for a solution to this, but have found nothing.
Edit: Without installing anything from Adobe.
Grab the "pdf.dll" from the latest Google Chrome version. (Download here from version 25.0.1364.172)
Put it in Chromium's install directory ("C:\Program Files\Chromium\Application" or "%appdata%/Chromium/Application/VersionNumberHere/")
Restart any running instance of Chromium.
Type "chrome://plugins" in Chromium, make sure the plugin is enabled and any other PDF plugin is disabled.
Then test your browser at: http://www.google.com.br/search?q=pdf+test
Install Adobe Reader on the computer. The PDF will then display in Chrome.
