How to unzip the latest file only - FTP

I am downloading a daily FTP file with the following command:
wget -mN --ftp-user=myuser --ftp-password=mypassword ftp://ftp2.link.com/ -P /home/usr/public_html/folder/folder2
My files are named like this:
Data_69111232_2016-01-29.zip
Data_69111232_2016-01-28.zip
Data_69111232_2016-01-27.zip
Can you please let me know how I can extract only the latest downloaded file?
Usually I use the following command to unzip the file, but I don't know what I should add to extract only the latest one:
unzip -o /home/user/public_html/folder/folder2/ftp2.directory/????.zip -d /home/user/public_html/folder/folder2/
Your help is really appreciated.
Thanks in Advance

Updated Answer
I thought your question was about FTP, but it seems to be about finding the newest file to unzip.
You can get the newest file like this:
newest=$(ls -t /home/user/public_html/folder/folder2/ftp2.directory/*zip | head -1)
and see the value like this:
echo $newest
and use it like this:
unzip -o "$newest" ...
Original Answer
You can probably string something together using lftp. For example, I can get a listing in reverse time order with the newest file at the bottom like this:
lftp -e 'cd path/to/daily/file; ls -lrt; bye' -u user,password host | tail -1
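As a rough sketch of stringing something together (assuming the same host, path, and credentials as above, and that the file name is the last column of the listing), you could pull the name out of that last line and then download just that file:
# name of the newest file, taken from the last line of the time-sorted listing
newest=$(lftp -e 'cd path/to/daily/file; ls -lrt; bye' -u user,password host | tail -1 | awk '{print $NF}')
# download only that file
lftp -e "cd path/to/daily/file; get $newest; bye" -u user,password host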

Related

How to define the output folder of curl while running a script

I have a command that executes when the script finishes and downloads files from a list. I use Termux on Android, and it says you can't use cd while running a script.
xargs -n 1 curl -O -C - <url
But it downloads all the files to the folder where I ran the script. How can I change the output directory?
PS: Only curl please. aria2c and wget will be ignored by me.
Okay, this is the script I use now:
while read -r url
do
    curl --create-dirs -o "$file_path/$name" "$url"
done < url
I use the "basename" of the URL for the name.
Please answer if you have better code.
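One curl-only way is to build the output path yourself from the basename of each URL (newer curl, 7.73.0 and later, also has an --output-dir option). A minimal sketch, assuming the URL list is in a file named url and the target directory /sdcard/Download is just a placeholder:
while read -r u
do
    # --create-dirs makes the target directory if it doesn't exist;
    # basename keeps the original file name from the URL
    curl --create-dirs -o "/sdcard/Download/$(basename "$u")" "$u"
done < url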

wget: Download Files with Specific Names

I am trying to download files from a website that is updated every day.
I managed to download all the .gz files with this command:
wget -r --no-parent -A.gz --no-directories -R robots http://www.web.com/some_path/
But now I want to download only the files with a specific name, e.g. "FullList". How do I do this?
Use the -A option followed by your string, as in -A "string*".
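For example, with the command from your question it might look like this (the FullList pattern is assumed from your description):
wget -r --no-parent -A "FullList*" --no-directories -R robots http://www.web.com/some_path/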
Use the -O option for wget. Example:
wget https://sample.com/movie/gameofthrones.mp4 -O Session01Part03.mp4

Download specific directories' data with wget

I am downloading data from an FTP server using wget from the Ubuntu command line. I know how to download all the directories from a specific URL. Is there any command to select and download only specific directories?
I am using "wget -c -r --no-parent 'URL' --user=xx --password=xx"
Now if I give a URL xx followed by the year, say 2009, the URL will be xx/2009. I want to download folders 100 to 365 within 2009, not all the folders (001 to 365). How can I do that?
You could specify the directories you want using a shell brace expansion:
wget -c -r --no-parent URL/2009/{100..365} --user=xx --password=xx
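Note that {100..365} is brace expansion done by the shell itself, so wget simply receives the list of URLs URL/2009/100, URL/2009/101, ..., URL/2009/365. If the folder names below 100 are zero-padded (001, 002, ...), a zero-padded range such as {001..099} keeps the padding in bash.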
If someone is searching for this type of problem, I now have another solution, which is better than
wget -c -r --no-parent URL/2009/{100..365} --user=xx --password=xx
That command works fine, but if the server is set up so that no one is allowed to download a large amount of data at a time, the following will do the trick:
for i in $(seq -w 144 365);do wget -r --no-parent URL/year/$i --user=XX --password=xx ; sleep 30; done

wget: do not download subdirectories, only all files in the specified directory [duplicate]

I am trying to download the files for a project using wget, as the SVN server for that project isn't running anymore and I am only able to access the files through a browser. The base URL for all the files is the same, like
http://abc.tamu.edu/projects/tzivi/repository/revisions/2/raw/tzivi/*
How can I use wget (or any other similar tool) to download all the files in this repository, where the "tzivi" folder is the root folder and there are several files and sub-folders (up to 2 or 3 levels) under it?
You may use this in shell:
wget -r --no-parent http://abc.tamu.edu/projects/tzivi/repository/revisions/2/raw/tzivi/
The parameters are:
-r: recursive download
--no-parent: don't download anything from the parent directory
If you don't want to download the entire content, you may use:
-l1: just download the directory (tzivi in your case)
-l2: download the directory and all level-1 subfolders ('tzivi/something' but not 'tzivi/something/foo')
And so on. If you give no -l option, wget will use -l 5 automatically.
If you pass -l 0 you'll download the whole Internet, because wget will follow every link it finds.
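For example, to mirror tzivi plus its first level of subfolders (per the -l2 description above), the command might look like this:
wget -r -l 2 --no-parent http://abc.tamu.edu/projects/tzivi/repository/revisions/2/raw/tzivi/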
You can use this in a shell:
wget -r -nH --cut-dirs=7 --reject="index.html*" \
http://abc.tamu.edu/projects/tzivi/repository/revisions/2/raw/tzivi/
The parameters are:
-r: recursive download
-nH (--no-host-directories): cuts out the hostname directory
--cut-dirs=X: cuts out X leading directory components
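As a worked example (with a hypothetical file tzivi/subdir/file.txt on the server): without these options a recursive wget would save it under abc.tamu.edu/projects/tzivi/repository/revisions/2/raw/tzivi/subdir/file.txt; -nH drops the abc.tamu.edu directory, and --cut-dirs=7 removes the seven leading path components (projects/tzivi/repository/revisions/2/raw/tzivi), so the file ends up at subdir/file.txt under the current directory.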
This link just gave me the best answer:
$ wget --no-clobber --convert-links --random-wait -r -p --level 1 -E -e robots=off -U mozilla http://base.site/dir/
Worked like a charm.
wget -r --no-parent URL --user=username --password=password
The last two options are needed only if the download requires a username and password; otherwise they can be omitted.
You can also see more options in the link https://www.howtogeek.com/281663/how-to-use-wget-the-ultimate-command-line-downloading-tool/
Use the command:
wget -m www.ilanni.com/nexus/content/
You can also use this command:
wget --mirror -pc --convert-links -P ./your-local-dir/ http://www.your-website.com
so that you get an exact mirror of the website you want to download.
Try this working code (30-08-2021):
wget --no-clobber --convert-links --random-wait -r -p --level 1 -E -e robots=off -U mozilla "your-web-directory-in-quotes"
I can't get this to work.
Whatever I try, I just get some http file.
Just looking at these commands for simply downloading a directory?
There must be a better way.
wget seems like the wrong tool for this task, unless it is a complete failure.
This works:
wget -m -np -c --no-check-certificate -R "index.html*" "https://the-eye.eu/public/AudioBooks/Edgar%20Allan%20Poe%20-%2"
This will help:
wget -m -np -c --level 0 --no-check-certificate -R "index.html*" http://www.your-websitepage.com/dir

Download data from FTP Website

There is something I am missing, or maybe it is the whole approach. I am trying to download NCDC data from the NCDC Datasets and I am unable to do it from the Unix box.
The command I have used so far is:
wget ftp://ftp.ncdc.noaa.gov:21/pub/data/noaa/1901/029070-99999-1901.gz">029070-99999-1901.gz
This is for one file, but I will be very happy if I can download the entire parent directory.
You seem to have a lonely " just before the >
To download everything, you can try this command to get the whole directory's content:
wget -r ftp://ftp.ncdc.noaa.gov:21/pub/data/noaa/1901/*
for i in {1990..1993}
do
    echo "$i"
    cd /home/chile/data
    # -nH  disable generation of host-prefixed directories
    # -nd  all files will get saved to the current directory
    # -np  do not ever ascend to the parent directory when retrieving recursively
    # -R "*.html,999999-99999-$i.gz*"  don't download files matching these patterns
    wget -r -nH -nd -np -R "*.html,999999-99999-$i.gz*" http://www1.ncdc.noaa.gov/pub/data/noaa/$i/
done
