I do not have much experience with the command line, and although I have done my research, I still wasn't able to solve my problem.
I need to download a .txt file from a folder on box.com.
I attempted using:
$ curl -o FILE URL
However, all I got was an empty text file named with random numbers. I assume this happened because the URL of the file location does not end in .txt, since the file sits inside a folder on box.com.
I also attempted:
$ wget FILE URL
However, my Mac terminal doesn't seem to find that command.
Is there a different command that can download the file from box.com? Or am I missing something?
You need to put your URL in quotes to stop the shell from trying to parse it:
curl -o myfile.txt "http://example.com/"
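To see why the quotes matter, here is a quick illustration with a made-up URL: characters such as & and ? are special to the shell, so without quotes the command is silently split apart.

# Unquoted: the shell treats '&' as "run in the background", so curl only sees
# the part of the URL before it and the rest of the query string is lost.
curl -o myfile.txt http://example.com/get.php?user=a&token=b

# Quoted: the full URL, including the query string, reaches curl intact.
curl -o myfile.txt "http://example.com/get.php?user=a&token=b"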
Update: If the URL requires authentication
Modern browsers allow you to export requests as curl commands.
For example, in Chrome, you can:
open your file URL in a new tab
open Developer Tools (View -> Developer -> Developer Tools)
switch to the Network tab in the tools
refresh the page; a request should appear in the Network tab
right-click the request and choose "Copy -> Copy as cURL"
paste the command into the shell
Here's how it looks for this page for example:
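What you paste will look roughly like the sketch below; the URL, header, and cookie values here are placeholders rather than anything a real page sends, so use whatever your browser actually copied (I have added -o by hand to save the response to a file). The point is that the copied cookies and headers carry your authenticated session, which plain curl was missing:

curl "https://example.com/path/to/file.txt" \
  -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15)" \
  -H "Accept: text/plain" \
  -H "Cookie: session=PLACEHOLDER" \
  -o myfile.txt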
Hello friends, I am trying to download a CSV file from an external URL. I tried with the wget command but I get a 400 Bad Request error. If I paste the URL directly into the browser, I can download the CSV file. Is there another way to download this type of file, or some other solution? I need to end up with a file containing the CSV content.
Thanks
Have you escaped all special characters in the URL, such as & to \& or $ to \$?
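For instance (the URL below is a made-up placeholder), you can either escape each special character individually or simply quote the whole URL, which is usually easier:

# Escaping the individual special characters:
wget http://example.com/export.php?format=csv\&id=42

# Or quoting the whole URL, which handles &, $ and ? in one go:
wget "http://example.com/export.php?format=csv&id=42"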
If it doesn't have anything to do with authentication/cookies, run some tool in your browser (like Live HTTP Headers) to capture the request headers. If you then mock up those fields in wget, that would get your request very close to what your "browser" sends. That will also show you whether there is any difference in encoding between the wget data and the browser data.
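As a rough sketch of what "mocking up those fields" looks like (the header values and URL here are illustrative; copy the real ones your capture tool shows):

wget --header="User-Agent: Mozilla/5.0 (X11; Linux x86_64)" \
     --header="Accept: text/csv,*/*" \
     --header="Referer: http://example.com/report" \
     "http://example.com/export.php?format=csv&id=42"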
On the other hand, you could also watch the server log files (if you have access to them).
I need to download a project from SourceForge, but there is no easy visible way.
In the picture (linked below, as I don't have enough reputation to embed it), it is possible to download "the latest version", but that only includes the files from the first folder, and I need to download a different folder.
It is possible to download these files, but only manually, and because there are hundreds of files and subfolders it would be quite impractical.
Does anyone know of a way to download it? I didn't find much, only some mentions of wget, but I tried it without any success.
Link: http://s9.postimg.org/xk2upvbwv/example.jpg
In every Sourceforge project or project folder page there is an RSS link, as you can see in the example screenshot here.
Right-click the RSS icon on the page of the folder or project you want to download, copy the link, and use it in the following Bash script:
curl "<URL>" | grep "<link>.*</link>" | sed 's|<link>||;s|</link>||' | while read url; do url=`echo $url | sed 's|/download$||'`; wget $url ; done
replace "<URL>" with your RSS link for example : "https://sourceforge.net/projects/xdxf/rss?path=/dicts-babylon/001", and watch the magic happens, The RSS link will include all the files of the Sourceforge folder or project and it's sub-folders, so the script will download everything recursively.
If the above script doesn't work, try this one, which extracts the links from the HTML directly. Replace "<URL>" with the project's files URL, for example "https://sourceforge.net/projects/synwrite-addons/files/Lexers/":
curl "<URL>" | tr '"' "\n" | grep "sourceforge.net/projects/.*/download" | sort | uniq | while read url; do url=`echo $url | sed 's|/download$||'`; wget $url ; done
Good luck
Sometimes there is a download link on the Summary tab, but when there isn't, I don't know of a workaround, so I use this piece of code:
// Collect the file links (elements with class 'name') on the listing page
var urls = document.getElementsByClassName('name')
var txt = ""
// Build one wget command per linked file
for (var i = 0; i < urls.length; i++) {
    txt += "wget " + urls[i].href + "\n"
}
alert(txt)
You should open a console in your browser on the page where all the files are listed. Copy, paste and enter the code, and you will be shown a list of wget commands which you can copy, paste and enter in your terminal.
How to download all files from a particular folder in Sourceforge on Windows 10:
Step 1: Download the latest wget (zip file, not exe file) from https://eternallybored.org/misc/wget/.
Note: Search Google for 'wget 1.20.x' to find the proper link, if necessary. Download the 32-bit file if your system is Windows 10 32-bit, or the 64-bit file if your system is Windows 10 64-bit.
Step 2: Download the latest grep and coreutils installers from http://gnuwin32.sourceforge.net/packages.html.
Note: Search Google for 'gnuwin32' to find the proper link, if necessary. Only 32-bit installers are available.
Step 3: Extract everything from the downloaded wget zip file to C:\WgetWin32 or C:\WgetWin64.
Note: You can install wget virtually anywhere, but preferably in a folder without a space in its name.
Step 4: Install grep by double-clicking its installer and choosing a folder such as C:\GnuWin32.
Step 5: Install coreutils by double-clicking its installer and choosing the same folder where grep has been installed.
Note: You can install grep and coreutils in any order (i.e., first grep and then coreutils, or vice versa) and virtually anywhere, even in the default location suggested by the installer, but preferably in a folder without a space in its name.
Step 6: Right-click on the 'This PC' icon in the desktop. Select the 'properties' menu from the drop-down list. Select the 'Advanced system settings' from the pop-up 'System' window.
Select the 'Environment Variables...' from the pop-up 'System properties' window. Select 'Path' from the 'System variables' and click on the 'Edit...' button.
Click on the 'New' button in the 'Edit environment variables' pop-up window. Enter the path for the wget installation folder (e.g., 'C:\WgetWin32' or 'C:\WgetWin64' without quotes).
Click on the 'New' button in the 'Edit environment variables' pop-up window again. Enter the path for the grep and coreutils installation folder (e.g., 'C:\GnuWin32\bin' without quotes).
Now, keep clicking on the 'Ok' buttons in the 'Edit environment variables', 'System variables' and 'System properties' pop-up windows.
Step 7: Using a text editor, create a DOS batch file, 'wgetcmd.bat', in the wget installation folder (e.g., 'C:\WgetWin32' or 'C:\WgetWin64' without quotes) with the following lines.
cd C:\WgetWin32
cmd
(OR)
cd C:\WgetWin64
cmd
Step 8: Create a shortcut to this batch file on the desktop.
Step 9: Right-click on the shortcut and select the 'Run as administrator' from the drop-down list.
Step 10: Enter the following commands either one by one or all at once in the DOS prompt in the pop-up DOS command window.
wget -w 1 -np -m -A download <link_to_sourceforge_folder>
grep -Rh refresh sourceforge.net | grep -o "https[^\\?]*" > urllist
wget -P <folder_where_you_want_files_to_be_downloaded> -i urllist
That's all folks! This will download all the files from the Sourceforge folder specified.
Example for the above: Suppose that I want to download all the files from the Sourceforge folder https://sourceforge.net/projects/octave/files/Octave%20Forge%20Packages/Individual%20Package%20Releases/; then the following commands will do this.
wget -w 1 -np -m -A download https://sourceforge.net/projects/octave/files/Octave%20Forge%20Packages/Individual%20Package%20Releases/
grep -Rh refresh sourceforge.net | grep -o "https[^\\?]*" > urllist
wget -P OctaveForgePackages -i urllist
The Sourceforge folder mentioned above contains a lot of Octave packages as .tar.gz files. All those files would be downloaded to the folder 'OctaveForgePackages' locally!
If you have no wget or shell installed, do it with FileZilla:
sftp://yourname@web.sourceforge.net
Open the connection with SFTP and your password, then
browse to /home/pfs/.
After that path (it may show a ? mark), fill in the path of the folder you want to download on the remote site; in my case:
/home/pfs/project/maxbox/Examples/
This is the access path of the FRS:
File Release System: /home/frs/project/PROJECTNAME/
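If you do have a shell with an OpenSSH client available, the same transfer can be sketched on the command line; the username and project path below are placeholders for your own, and a reasonably recent OpenSSH sftp is assumed for the recursive get -r:

sftp yourname@web.sourceforge.net
# then, at the sftp> prompt, fetch the folder recursively into the current directory:
get -r /home/frs/project/PROJECTNAME/ .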
Here is a sample Python script you could use to download files from Sourceforge (it needs the requests and beautifulsoup4 packages installed):
import os
import requests
from bs4 import BeautifulSoup

r = requests.get("https://sourceforge.net/Files/filepath")  # Path to folder with files
soup = BeautifulSoup(r.content, 'html.parser')
download_dir = "download/directory/here"

# Each listed file's download link sits in a <th headers="files_name_h"> cell
files = [file_.a['href'] for file_ in soup.find_all('th', headers="files_name_h")]

for file_download_url in files:
    filename = file_download_url.split('/')[-2]
    if filename not in os.listdir(download_dir):
        r = requests.get(file_download_url)
        with open(os.path.join(download_dir, filename), "wb") as f:
            f.write(r.content)
        print(f"created file {os.path.join(download_dir, filename)}")
I have a problem using the curl command: I have to download a file from a site with a URL of the form "www.example.com:8000/get.php?username=xxxx&password=xxxx".
What happens is this: if I open the link in the browser and the file exists, the download starts automatically, but if the link is not correct, nothing happens and the page displayed is blank.
My problem is that, using the command
curl -o file.txt "www.example.com:8000/get.php?username=xxxx&password=xxxx"
the file is downloaded correctly if the link is correct, but if the link is not right, a 0-byte .txt file is generated.
How can I make it so that, if the link is not correct (and therefore there is no file to download), no file is generated?
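One common approach, sketched below on the assumption that the server returns an HTTP error status for a bad link, is curl's --fail option, which makes curl exit non-zero without writing the error page. If the server instead answers a bad link with an empty 200 response, you can simply delete any zero-byte result afterwards:

# --fail suppresses output on HTTP errors (4xx/5xx), so file.txt is only written on success
curl --fail -o file.txt "www.example.com:8000/get.php?username=xxxx&password=xxxx"
# If the server returns an empty page instead of an error, remove a zero-byte file
[ -s file.txt ] || rm -f file.txt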
I would like to use Automator to:
1- extract URLs from a text file with about 50 URLs
2- open each URL in Firefox
3- take a screenshot of the window
4- close the window
5- do it again for the next 49 URLs.
First step: I couldn't extract URLs from the text file; Automator gave me nothing when I tried.
Well, this is done now; my mistake, I had to use "get contents of TextEdit document" before extracting the URLs.
Second thing: I don't know how to make it work through the URLs one after another.
Right now it opens all the URLs at the same time in different tabs, which makes my Firefox shut down because of the number of tabs open at once.
How could I make it process them URL by URL?
It's the first time I've used Automator and I don't know anything about AppleScript.
Any help?
No need for Automator, just use webkit2png, which you can install easily with Homebrew like this:
brew install webkit2png
Then put a list of all your sites in a file called sites.txt that looks like this:
http://www.google.com
http://www.ibm.com
and then run webkit2png like this:
webkit2png - < sites.txt
Or, if you don't like that approach, you can do something like this with just the built-in tools in OS X. Save the following in a text file called GrabThem:
#!/bin/bash
# Read sites.txt line by line, open each URL, wait for it to load, then grab the screen
i=1
while read f
do
   echo "Processing $f..."
   open "$f"
   sleep 3
   screencapture "${i}.png"
   ((i++))
done < sites.txt
Then make it executable in Terminal (you only need do this once) with
chmod +x GrabThem
Then run it like this in Terminal:
./GrabThem
and the files will be called 1.png, 2.png etc.
You can see the newest files at the bottom of the list if you run:
ls -lrt
You may want to look at the options for screencapture, maybe the ones for selecting a specific window rather than the whole screen. You can look at the options by typing:
man screencapture
and hitting SPACE to go forwards a page and q to quit.
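For instance, inside the script above you could swap the screencapture line for something like this (a sketch only; check man screencapture on your own system, since the available flags vary between OS X versions):

# -x: no shutter sound, -T 2: wait a further 2 seconds before capturing
screencapture -x -T 2 "${i}.png"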
Dropbox makes it easy to programmatically download a single file via curl (e.g., curl -O https://dl.dropboxusercontent.com/s/file.ext). It is a little trickier for a folder (a regular directory folder, not a zipped one). The shared link for a folder, as opposed to a file, does not link directly to the zipped folder (Dropbox automatically zips the folder before it is downloaded). It would appear that you could just add ?dl=1 to the end of the link, as this will start the download directly in a browser. That, however, points to an intermediary HTML document that redirects to the actual zip file, and it does not seem to work with curl.

Is there any way to use curl to download a folder via a shared link? I realize that the best solution would be to use the Dropbox API, but for this project it is important to keep it as simple as possible. Additionally, the solution must be incorporated into a bash shell script.
It does appear to be possible with curl by using the -L option. This forces curl to follow the redirect. Additionally, it is important to specify an output name with a .zip extension, as the default will be a random alpha-numeric name with no extension. Finally, do not forget to add the ?dl=1 to the end of the link. Without it, curl will never reach the redirect page.
curl -L -o newName.zip https://www.dropbox.com/sh/[folderLink]?dl=1
Follow redirects (use -L). Your immediate problem is that Curl is not following redirects.
Set a filename. (Optional)
Dropbox already sends a Content-Disposition header with its Dropbox filename, so there is no reason to specify the filename if you use the correct curl flags.
Alternatively, you can force a filename of your own choosing.
Use one of these commands:
curl https://www.dropbox.com/sh/AAbbCCEeFF123?dl=1 -O -J -L
Preserve/write the remote filename (-O, -J) and follow any redirects (-L).
This same line works for both individually shared files or entire folders.
Folders will save as a .zip automatically (based on folder name).
Don't forget to change the parameter ?dl=0 to ?dl=1 (see comments).
OR:
curl https://www.dropbox.com/sh/AAbbCCEeFF123?dl=1 -L -o [filename]
Follow any redirects (-L) and set a filename (-o) of your choosing.
NOTE: Using the -J flag in general:
WARNING: Exercise judicious use of this option, especially on Windows. A rogue server could send you the name of a DLL or other file that could possibly be loaded automatically by Windows or some third party software.
Please consult: https://curl.haxx.se/docs/manpage.html#OPTIONS (See: -O, -J, -L, -o) for more.