Bash command to copy images from a remote URL - macOS

I'm using the macOS Terminal.
I want to copy images from a remote URL, http://media.pragprog.com/titles/rails4/code/depot_b/public/images/, to a local directory.
What's the command to do that?
Thanks,

You can use curl, for example:
curl -O "http://media.pragprog.com/titles/rails4/code/depot_b/public/images/*.jpg"
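One caveat: curl does not expand * wildcards against a remote server; its URL globbing uses bracket ranges and brace lists instead. If the images happened to be numbered sequentially (purely an assumption here, since the real filenames aren't known), the request could look something like:
curl -O "http://media.pragprog.com/titles/rails4/code/depot_b/public/images/photo[1-10].jpg"
Otherwise, you would list each image URL explicitly, or fall back to wget as described below.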

Alternatively, you may want all the images from a website. wget can do this with a recursive option, such as:
$ wget -r -A jpeg,jpg,bmp,png,gif,tiff,xpm,ico http://www.website.com/
This should download only files with the comma-delimited extensions, recursively, starting at the site index. It works like a web spider, so if an image isn't referenced anywhere on the site it will be missed.
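Applied to the URL in the question, a sketch of that approach (assuming the server exposes a directory index; --no-parent and -nd are just conveniences to stay in that directory and skip recreating the path locally) might be:
$ wget -r --no-parent -nd -A jpg,jpeg,png,gif -P ./images http://media.pragprog.com/titles/rails4/code/depot_b/public/images/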

wget will work, assuming the server has directory listing enabled:
wget -m http://media.pragprog.com/titles/rails4/code/depot_b/public/images

You can do this with Wget or cURL. cURL ships with OS X, but Wget doesn't come out of the box, so you may need to install it with MacPorts, Homebrew, or something similar.
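For example, with Homebrew (assuming it is already installed):
$ brew install wget
or with MacPorts:
$ sudo port install wget
Either command puts wget on your PATH; cURL is already available at /usr/bin/curl.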

wget command using brew in terminal (mac)

wget -E -H -k -K -p -e robots=off -P ./images/ -i./list.txt
./list.txt: No such file or directory
No URLs found in ./list.txt.
Converted links in 0 files in 0 seconds.
I downloaded and installed brew. Further, I installed wget, and it lets me download images one at a time. However, when I tried the command above to download images from multiple URLs, it's not doing anything. Can someone tell me what I could be doing wrong here?
wget is pretty lucid in its description of the issue:
./list.txt: No such file or directory
Apparently there is no file named list.txt inside the current directory. Try giving the full path to list.txt.
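A minimal sketch of the fix, assuming you create the URL list yourself (list.txt and the example.com URLs are placeholders):
$ printf '%s\n' 'http://example.com/a.jpg' 'http://example.com/b.jpg' > list.txt
$ wget -E -H -k -K -p -e robots=off -P ./images/ -i "$PWD/list.txt"
Using "$PWD/list.txt" (or any absolute path) removes the dependence on which directory you happen to run the command from.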

Considering a specific name for the downloaded file

I download a .tar.gz file with wget using this command:
wget hello.tar.gz
This is part of a long script. Sometimes an error occurs while downloading, and when the file is downloaded a second time its name changes to something like this:
hello.tar.gz.2
the third time:
hello.tar.gz.3
How can I make sure that, whatever wget would call the downloaded file, it is saved as hello.tar.gz?
In other words, I don't want the name of the downloaded file to be anything other than hello.tar.gz.
wget hello.tar.gz -O <fileName>
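That is, -O sets the output filename explicitly, and repeated runs overwrite the same file instead of appending .1, .2, and so on. A sketch, with the URL as a placeholder:
wget -O hello.tar.gz http://example.com/path/hello.tar.gz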
wget has internal options like -r and -p to change its default behavior.
So just try the following:
wget -p <url>
wget -r <url>
Since you have noticed the incremental change, discard any repeated files and rely on the following as the initial condition:
wget hello.tar.gz
mv hello.tar.gz.2 hello.tar.gz

How to curl for extracting valid .zip file from redirecting link

I am trying to automate a data-downloading process. For this purpose, my goal is to extract (using bash commands) the .zip from a redirecting link that can be seen here: https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303
I have seen that people suggest the -L flag with curl for redirections, but it doesn't seem to work in my case. The specific command I have tried is:
curl -L -o output.zip https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip
The command file output.zip shows that the downloaded .zip file is actually HTML document text. On the other hand, clicking the redirection link (the one used in the curl command) downloads the archive automatically via a browser.
Any ideas, tips, or suggestions on what I should try (or whether this is possible or not) will be highly appreciated!
If you execute curl with the --verbose option, you can see that it is a cookie-related problem: the cookie engine needs to be enabled. You can download the desired file as follows:
curl -b cookies.txt -L https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip -o test.zip
It doesn't matter if the file provided with the -b option doesn't exist. We just need to activate the cookie engine.
Refer to Send cookies with curl and Save cookies between two curl requests for further information.
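As a quick sanity check (reusing the command above), you can confirm that the result is a real archive before using it:
curl -b cookies.txt -L https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip -o test.zip
file test.zip       # should report "Zip archive data" rather than HTML
unzip -l test.zip   # list the contents without extracting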
You can download that file with wget on Linux
$ wget https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip
$ unzip Sambanis_Aug_06.zip
Archive: Sambanis_Aug_06.zip
inflating: Sambanis (Aug 06).dta
inflating: Sambanis Appendix (Aug 06).pdf

How do I use wget to download all images from a domain

Hello, I would like to download all the pictures from the www.demotywatory.pl website.
I have seen another question with an accepted answer, but it does not work for me at all.
The answer was:
wget -r -P /save/location -A jpeg,jpg,bmp,gif,png http://www.domain.com
So I tried that with several websites and always got the same result: it looks like it only tried to save a single file.
Have you tried doing this:
wget -r -A.jpg http://www.demotywatory.pl
It will download all .jpg files from the given URL.
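If you also want other image formats and a specific save location, as in the command quoted in the question, a sketch along the same lines might be:
wget -r -A jpg,jpeg,png,gif,bmp -P /save/location http://www.demotywatory.pl
Here -A takes a comma-separated accept list and -P sets the directory prefix to save into.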

Download file with different local name using wget

Hi, I have a text file where download links are given like this:
http://www.example.com/10.10.11/abc.jpg
http://www.example.com/10.10.12/abc.jpg
http://www.example.com/10.10.13/abc.jpg
Here 10.10.* is the date of the image.
I need to download all the images using wget so that each image's name is the corresponding date (e.g. 10.10.11.jpg).
PS. I tried using:
wget -i download.txt
So, any solution?
Thanks in advance
You can instruct Wget to create subdirectories based on the URL, and then do the renaming after the download has finished.
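A minimal sketch of that approach, assuming the links are in download.txt as in the question (the www.example.com path simply mirrors the sample URLs):
wget -x -i download.txt                   # -x (--force-directories) recreates the URL path locally
for f in www.example.com/*/abc.jpg; do
    d=$(basename "$(dirname "$f")")       # the date directory, e.g. 10.10.11
    mv "$f" "$d.jpg"                      # rename to 10.10.11.jpg
done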
I'd suggest a batch script that downloads the files one by one using the -O option, with a bit of sed/awk magic to get the names right.
But be careful: with the -O option you have to call wget on a per-file basis.
This should do the trick.
#!/bin/sh
# Read each URL from download.txt and save it as <date>.jpg,
# where <date> is the last directory component of the URL.
while read -r url; do
    urldir=${url%/*}      # e.g. http://www.example.com/10.10.11
    dir=${urldir##*/}     # e.g. 10.10.11
    wget -O "$dir.jpg" "$url"
done < download.txt
This might work for you:
sed '\|/\([^/]*\)/[^/]*\1[^/.]*.jpg|!d' download.txt | wget -i -
Explanation:
Filter the download.txt file to contain only those files which you require and then pass them on to wget.
I have developed a script that does just this: bulkGetter. It's super easy to use; you just need an input file with all the links you want to download, and to use the "-rb" option (refer to the link).
