bash script wget download files by date - bash

I'm new to the world of bash scripting. Hoping to seek some help here.
Been messing about with the 'wget' command and found that it is quite neat! At the moment, it gets all contents from a https site, including all directories, and saves them all accordingly. Here's the command:
wget -r -nH --cut-dirs=1 -R index.html -P /home/snoiniM/data/in/ https://www.someWebSite.com/folder/level2 --user=someUserName --password=P#ssword
/home/snoiniM/data/in/folder/level2/level2-2013-07-01.zip saved
/home/snoiniM/data/in/folder/level2/level2-2013-07-02.zip saved
/home/snoiniM/data/in/folder/level2/level2-2013-07-03.zip saved
/home/snoiniM/data/in/folder/level3/level3-2013-07-01.zip saved
/home/snoiniM/data/in/folder/level3/level3-2013-07-02.zip saved
/home/snoiniM/data/in/folder/level3/level3-2013-07-03.zip saved
That is fine for all intents and purposes. But what if I really just want to get a specific date from all of its directories? E.g. just levelx-2013-07-03.zip from all dirs within folder, saved to one local directory (e.g. all *.zip files end up in ...folder/).
Does anyone know how to do this?
I found that dropping --cut-dirs=1 and pointing the URL at www.someWebsite.com/folder/ is sufficient.
Also, with that in mind, added the -nd option. This means no directories -- "Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will get saved to the current directory, without clobbering."
That leaves one more part: how do I write a bash script that gets yesterday's date and passes it to the wget command as a parameter?
E.g.
wget -r -nH -nd -R index.html -A "*$yesterday.zip" -P /home/snoiniM/data/in/ https://www.someWebSite.com/folder/ --user=someUserName --password=P#ssword

Just the snippet you are looking for:
yesterday=$(date --date="@$(($(date +%s)-86400))" +%Y-%m-%d)
There is also no need for the * before $yesterday; wget treats an -A entry that contains no wildcard characters as a file name suffix.
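Putting the two pieces together, a minimal sketch could look like the following. It only prints the wget command (a dry run) so it can be checked before anything is downloaded. The URL, credentials and target directory are the placeholders from the question, and the fallback to `date -v-1d` is an assumption for BSD/macOS systems where `date -d` is not available.

```shell
#!/bin/bash
# Sketch: compute yesterday's date and build the wget command from the question.

# GNU date understands "-d yesterday"; BSD/macOS date uses "-v-1d" instead.
yesterday=$(date -d yesterday +%Y-%m-%d 2>/dev/null || date -v-1d +%Y-%m-%d)

# Store the command in the positional parameters so every argument stays intact.
# An -A entry without wildcard characters is treated as a file name suffix.
set -- wget -r -nH -nd -R index.html -A "$yesterday.zip" \
    -P /home/snoiniM/data/in/ \
    --user=someUserName --password='P#ssword' \
    https://www.someWebSite.com/folder/

# Dry run: show the command. To really execute it, replace this line with: "$@"
printf '%s\n' "$*"
```

Run from cron once a day, this would fetch only the archives whose names end in yesterday's date.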

Related

How to scp multiple files from remote to local in a folder other than ~/?

I'm trying to make a bash expect script that takes in input like user, host, password, and file names and then copies the files from remote to local. From what I've read so far, scp'ing multiple files from remote to local works just fine when you're assuming the files are coming from ~/, i.e:
scp -r user@host:"file1 file2 file3" .
would copy file1, file2, and file3 from ~/ into the current working directory. But I need to be able to pass in another directory as an argument, so my bash script looks like this (but doesn't work, I'll explain how):
eval spawn scp -oStrictHostKeyChecking=no -oCheckHostIP=no -r $user@$host:$dir/"$file1 $file2 $file3" ~/Downloads
This doesn't work after the first file; the shell raises a "No such file or directory" error after the first file, which I would assume means that the script only works on $dir for the first file, then kicks back into ~/ and of course can't find the files there. I've looked everywhere for an answer on this but can't find it, and it would be super tedious to do this one file at a time.
Assuming your remote login shell understands Brace Expansion, this should work
scp $user@$host:$dir/"{$file1,$file2,$file3}" ~/Downloads
If you want to download multiple files that match a specific pattern, for example all zip files, you can do the following:
scp -r user@host:/path/to/files/"*.zip" /your/local/path
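If the number of files varies, the brace-expansion argument can be assembled by a small helper. This is only a sketch: `remote_spec` is a hypothetical function name, and it relies on the same assumption as the answer above, namely that the remote login shell performs brace expansion.

```shell
#!/bin/bash
# Sketch: build "user@host:dir/{file1,file2,...}" from a list of file names.
# remote_spec is a hypothetical helper, not part of scp.
remote_spec() {
    # $1 = user@host, $2 = remote directory, remaining arguments = file names
    target=$1
    dir=$2
    shift 2
    list=$(printf '%s,' "$@")      # "file1,file2,file3,"
    printf '%s:%s/{%s}\n' "$target" "$dir" "${list%,}"   # drop the trailing comma
}

remote_spec user@host /var/log file1 file2 file3
# prints: user@host:/var/log/{file1,file2,file3}
```

It would be used as `scp "$(remote_spec "$user@$host" "$dir" "$file1" "$file2" "$file3")" ~/Downloads`. With a single file name the braces are not expanded by bash, so that case needs the plain `$dir/$file1` form instead.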

How to create tar files automatically

I would like to create tar files with bash to distribute some scripts.
Every script needs certain configuration files and libraries (or toolboxes),
e.g. a script called CheckTool.py needs Checks.ini, CheckToolbox.py and CommonToolbox.py to run. These are stored in specific folders on my hard disk and need to be copied into the same layout on the user's hard disk.
I can create a tar file manually for each script, but I would like something simpler.
My idea is to define a list of all the needed files and their paths for a specific script, and have a bash script read that list and create the tar file.
I started with:
#!/bin/bash
while IFS= read -r line
do
    echo "$line"
done < "$1"
This reads the files and their paths. In my example the lines are:
./CheckTools/CheckMesh.bs
./Configs/CheckMesh.ini
./Toolboxes/CommonToolbox.bs
./Toolboxes/CheckToolbox.bs
My question is how do I have to organize the data to make a tar file with the specified files using bash?
Or is there someone having a better idea?
No need for a complicated script, use option -T of tar. Every file listed in there will be added to the tar file:
-T, --files-from FILE
get names to extract or create from FILE
So your script becomes:
#!/bin/bash
tar -cvpf something.tar -T listoffiles.txt
The format of listoffiles.txt is simple: one file per line. You might want to use full paths to ensure you get the right files:
./CheckTools/CheckMesh.bs
./Configs/CheckMesh.ini
./Toolboxes/CommonToolbox.bs
./Toolboxes/CheckToolbox.bs
You can add more tar commands to the script as needed, or loop over several list files. From that point on, your imagination is the limit!
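To try the -T approach without touching real files, here is a small self-contained sketch. It creates empty stand-ins for the files from the question in a temporary directory; the archive and list file names are the ones used in the answer above.

```shell
#!/bin/bash
# Sketch: build a tar archive from a list file, using throwaway sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Empty stand-ins for the real scripts and configuration files.
mkdir -p CheckTools Configs Toolboxes
touch CheckTools/CheckMesh.bs Configs/CheckMesh.ini \
      Toolboxes/CommonToolbox.bs Toolboxes/CheckToolbox.bs

# One path per line, relative to the directory tar is run from.
cat > listoffiles.txt <<'EOF'
./CheckTools/CheckMesh.bs
./Configs/CheckMesh.ini
./Toolboxes/CommonToolbox.bs
./Toolboxes/CheckToolbox.bs
EOF

# -T reads the names of the files to archive from the list file.
tar -cpf something.tar -T listoffiles.txt

# List the archive to confirm which files were stored.
tar -tf something.tar
```

Keeping one list file per script means the same two tar lines can be reused for every distribution.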

LFTP to touch every file it's downloading

I'm using lftp to run a backup job, from one location to another. And it's the only possible solution. But it works really great. I'm using this command:
/usr/bin/lftp -c "open -p 9922 -u jdoe,passw0rd sftp://sftpsiteurl.com; mirror -c -e -R -L /source-folder /destination-folder/"
But I need to change the creation date or modification date of the files coming down. Right now the date on the files is taken from the remote location, and I'm not sure how to change that.
I can see that you can run some kind of script to validate the files coming down, but I'm unsure of the command and can't seem to find any examples.
Does anybody know if this is possible, and how to do it?

Mac zip compress without __MACOSX folder?

When I compress files with the built in zip compressor in Mac OSX, it causes an extra folder titled "__MACOSX" to be created in the extracted zip.
Can I adjust my settings to keep this folder from being created or do I need to purchase a third party compression tool?
UPDATE: I just found a freeware app for OSX that solves my problem: "YemuZip"
UPDATE 2: YemuZip is no longer freeware.
Can be fixed after the fact by zip -d filename.zip __MACOSX/\*
And, to also delete .DS_Store files: zip -d filename.zip \*/.DS_Store
When I had this problem, I did it from the command line:
zip file.zip uncompressed
EDIT, after many downvotes: I used this option some time ago and I don't remember where I learnt it, so I can't give you a better explanation. Chris Johnson's answer is correct, but I won't delete mine. As one comment says, it is closer to what the OP is asking, because it compresses without those files instead of removing them from an existing archive. I find it easier to remember, too.
Inside the folder you want to be compressed, in terminal:
zip -r -X Archive.zip *
Where -X means: Exclude those invisible Mac resource files such as “__MACOSX” or “._Filename” and .DS_Store files
Note: Will only work for the folder and subsequent folder tree you are in and has to have the * wildcard.
This command did it for me:
zip -r Target.zip Source -x "*.DS_Store"
Target.zip is the zip file to create. Source is the source file/folder to zip up. The -x parameter specifies the file/folder to exclude.
If the above doesn't work for whatever reason, try this instead:
zip -r Target.zip Source -x "*.DS_Store" -x "__MACOSX"
I'm using this Automator shell script to fix it afterwards.
It shows up as a contextual menu item (when right-clicking any file in Finder).
while read -r p; do
zip -d "$p" __MACOSX/\* || true
zip -d "$p" \*/.DS_Store || true
done
Create a new Service with Automator
Select "Files and Folders" in "Finder"
Add a "Shell Script Action"
zip -r "$destFileName.zip" "$srcFileName" -x "*/\__MACOSX" -x "*/\.*"
-x "*/\__MACOSX": ignore __MACOSX as you mention.
-x "*/\.*": ignore any hidden file, such as .DS_Store .
Quote the variables so that file names containing spaces are handled correctly.
You can also build an Automator Service to make this easy to use from Finder.
Check link below to see detail if you need.
Github
The unwanted folders can also be deleted in the following way:
zip -d filename.zip "__MACOSX*"
Works best for me
The zip command line utility never creates a __MACOSX directory, so you can just run a command like this:
zip directory.zip -x \*.DS_Store -r directory
In the output below, a.zip, which I created with the zip command line utility, does not contain a __MACOSX directory, but `a 2.zip`, which I created from Finder, does.
$ touch a
$ xattr -w somekey somevalue a
$ zip a.zip a
adding: a (stored 0%)
$ unzip -l a.zip
Archive: a.zip
Length Date Time Name
-------- ---- ---- ----
0 01-02-16 20:29 a
-------- -------
0 1 file
$ unzip -l a\ 2.zip # I created `a 2.zip` from Finder before this
Archive: a 2.zip
Length Date Time Name
-------- ---- ---- ----
0 01-02-16 20:29 a
0 01-02-16 20:31 __MACOSX/
149 01-02-16 20:29 __MACOSX/._a
-------- -------
149 3 files
-x .DS_Store does not exclude .DS_Store files inside directories but -x \*.DS_Store does.
The top-level entry of a zip archive with multiple files should usually be a single directory. If it is not, some unarchiving utilities (like unzip and 7z, but not Archive Utility, The Unarchiver, unar, or dtrx) do not create a containing directory for the files when the archive is extracted. That often makes the files difficult to find, and if several such archives are extracted at the same time, it can be difficult to tell which files belong to which archive.
Archive Utility only creates a __MACOSX directory when you create an archive where at least one file contains metadata such as extended attributes, file flags, or a resource fork. The __MACOSX directory contains AppleDouble files whose filename starts with ._ that are used to store OS X-specific metadata. The zip command line utility discards metadata such as extended attributes, file flags, and resource forks, which also means that metadata such as tags is lost, and that aliases stop working, because the information in an alias file is stored in a resource fork.
Normally you can just discard the OS X-specific metadata, but to see what metadata files contain, you can use xattr -l. xattr also includes resource forks and file flags, because even though they are not actually stored as extended attributes, they can be accessed through the extended attributes interface. Both Archive Utility and the zip command line utility discard ACLs.
You can't.
But what you can do is delete those unwanted folders after zipping. The command-line zip accepts several options; one of them, -d, deletes the entries that match a pattern (a shell-style wildcard, not a regex). So you can use it like this:
zip -d filename.zip __MACOSX/\*
Cleanup .zip from .DS_Store and __MACOSX, including subfolders:
zip -d archive.zip '__MACOSX/*' '*/__MACOSX/*' .DS_Store '*/.DS_Store'
Walkthrough:
Create .zip as usual by right-clicking on the file (or folder) and selecting "Compress ..."
Open Terminal app (search Terminal in Spotlight search)
Type zip in the Terminal (but don't hit enter)
Drag .zip to the Terminal so it converts to the path
Copy paste -d '__MACOSX/*' '*/__MACOSX/*' .DS_Store '*/.DS_Store'
Hit enter
Use zipinfo archive.zip to list files inside, to check (optional)
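The cleanup step can be rehearsed on a throwaway archive first. The sketch below fakes a Finder-style archive by adding a __MACOSX directory and a .DS_Store file by hand (the file names are made up for illustration), then removes those entries with zip -d. It assumes the Info-ZIP zip and unzip tools are installed.

```shell
#!/bin/bash
# Sketch: simulate a Finder-made archive, then strip the macOS metadata entries.
if ! command -v zip >/dev/null || ! command -v unzip >/dev/null; then
    echo "zip/unzip not installed, skipping" >&2
    exit 0
fi

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content plus the entries Finder would normally add.
mkdir -p project __MACOSX
echo hello > project/readme.txt
touch project/.DS_Store __MACOSX/._project
zip -qr archive.zip project __MACOSX

# Quote the patterns so that zip, not the shell, expands the wildcards.
zip -qd archive.zip '__MACOSX/*' '*/.DS_Store'

# List what is left in the archive.
unzip -l archive.zip
```

The patterns are quoted for the same reason as in the walkthrough: unquoted, the shell could expand them against local files before zip ever sees them.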
I found a better solution after reading all of the existing answers. Everything can be done by a workflow with a single right click.
NO additional software, NO complicated command-line work and NO shell tricks.
The automator workflow:
Input: files or folders from any application.
Step 1: Create Archive, the system builtin with default parameters.
Step 2: Run Shell command, with input as parameters. Copy command below.
zip -d "$@" "__MACOSX/*" || true
zip -d "$@" "*/.DS_Store" || true
Save it and we are done! Just right-click a folder or a group of files and choose the workflow from the Services menu. An archive with no metadata will be created alongside.
UPDATE: I chose "Quick Action" when creating the new workflow.
To skip all hidden files:
zip newzipname filename.any -x "\.*"
For this question, it would be:
zip newzipname filename.any -x "\__MACOSX"
It must be said, though, that the zip command run in a terminal compresses only the files it is given and adds nothing else, so this gives the same result:
zip newzipname filename.any
Keka does this. Just drag your directory over the app screen.
Do you mean the zip command-line tool or the Finder's Compress command?
For zip, you can try the --data-fork option. If that doesn't do it, you might try --no-extra, although that seems to ignore other file metadata that might be valuable, like uid/gid and file times.
For the Finder's Compress command, I don't believe there are any options to control its behavior. It's for the simple case.
The other tool, and maybe the one that the Finder actually uses under the hood, is ditto. With the -c -k options, it creates zip archives. With this tool, you can experiment with --norsrc, --noextattr, --noqtn, --noacl and/or simply leave off the --sequesterRsrc option (which, according to the man page, may be responsible for the __MACOSX subdirectory). Although, perhaps the absence of --sequesterRsrc simply means to use AppleDouble format, which would create ._ files all over the place instead of one __MACOSX directory.
This is how I avoid the __MACOSX directory when compressing files with the tar command:
$ cd dir-you-want-to-archive
$ find . | xargs xattr -l # <- list all files with special xattr attributes
...
./conf/clamav: com.apple.quarantine: 0083;5a9018b1;Safari;9DCAFF33-C7F5-4848-9A87-5E061E5E2D55
./conf/global: com.apple.quarantine: 0083;5a9018b1;Safari;9DCAFF33-C7F5-4848-9A87-5E061E5E2D55
./conf/web_server: com.apple.quarantine: 0083;5a9018b1;Safari;9DCAFF33-C7F5-4848-9A87-5E061E5E2D55
Delete the attribute first:
find . | xargs xattr -d com.apple.quarantine
Run find . | xargs xattr -l again to make sure no file still has the attribute. Then you're good to go:
tar cjvf file.tar.bz2 dir
Another shell script that could be used with the Automator tool (see also benedikt's answer on how to create the script) is:
while read -r f; do
d="$(dirname "$f")"
n="$(basename "$f")"
cd "$d"
zip "$n.zip" -x \*.DS_Store -r "$n"
done
The difference here is that this code directly compresses selected folders without macOS specific files (and not first compressing and afterwards deleting).

Bash script to archive files and then copy new ones

Need some help with this as my shell scripting skills are somewhat less than l337 :(
I need to gzip several files and then copy newer ones over the top from another location. I need to be able to call this script in the following manner from other scripts.
exec script.sh $oldfile $newfile
Can anyone point me in the right direction?
EDIT: To add more detail:
This script will be used for monthly updates of some documents uploaded to a folder. The old documents need to be archived into one compressed file, and the new documents, which may have different names, copied over the top of the old ones. The script needs to be called on a document-by-document basis from another script. The basic flow for this script should be:
Create a new gzip archive with a specified name (built from a prefix constant in the script and the current month and year, e.g. prefix.september.2009.tar.gz) only if it does not already exist; otherwise add to the existing one.
Copy the old file into the archive.
Replace the old file with the new one.
Thanks in advance,
Richard
EDIT: Added more detail on the archive filename
Here's the modified script based on your clarifications. I've used tar archives, compressed with gzip, to store the multiple files in a single archive (you can't store multiple files using gzip alone). This code is only superficially tested - it probably has one or two bugs, and you should add further code to check for command success etc. if you're using it in anger. But it should get you most of the way there.
#!/bin/bash
oldfile=$1
newfile=$2
month=$(date +%B)
year=$(date +%Y)
prefix="frozenskys"
archivefile="$prefix.$month.$year.tar"
# Check for existence of a compressed archive matching the naming convention
if [ -e "$archivefile.gz" ]
then
    echo "Archive file $archivefile.gz already exists..."
    echo "Adding file '$oldfile' to existing tar archive..."
    # Uncompress the archive, because you can't add a file to a
    # compressed archive
    gunzip "$archivefile.gz"
    # Add the file to the archive
    tar --append --file="$archivefile" "$oldfile"
    # Recompress the archive
    gzip "$archivefile"
# No existing archive - create a new one and add the file
else
    echo "Creating new archive file '$archivefile'..."
    tar --create --file="$archivefile" "$oldfile"
    gzip "$archivefile"
fi
# Update the files outside the archive (quoted so names with spaces work)
mv "$newfile" "$oldfile"
Save it as script.sh, then make it executable:
chmod +x script.sh
Then run like so:
./script.sh oldfile newfile
An archive named something like frozenskys.September.2009.tar.gz will be created, and newfile will replace oldfile. You can also call this script with exec from another script if you want. Just put this line in your second script:
exec ./script.sh "$1" "$2"
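To watch the create-then-append behaviour in isolation, the following sketch runs the same steps twice against two throwaway files in a temporary directory. The function name `add_to_archive`, the file names and the prefix are hypothetical; the short tar options (-c, -r, -t) are used because they are accepted by both GNU tar and bsdtar.

```shell
#!/bin/bash
# Sketch: append to a gzip-compressed tar archive by decompressing it first.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

archive="prefix.$(date +%B.%Y).tar"   # hypothetical prefix, current month and year
echo one > a.txt
echo two > b.txt

add_to_archive() {
    # $1 = file to store in the monthly archive
    if [ -e "$archive.gz" ]; then
        gunzip "$archive.gz"          # tar cannot append to a compressed archive
        tar -rf "$archive" "$1"       # -r appends to the existing archive
    else
        tar -cf "$archive" "$1"       # first file of the month: create the archive
    fi
    gzip "$archive"
}

add_to_archive a.txt
add_to_archive b.txt

# Both files should now be listed in the single compressed archive.
tar -tzf "$archive.gz"
```

The second call takes the append branch, which is the path the monthly job follows for every document after the first.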
A good reference for any bash scripting is the Advanced Bash-Scripting Guide.
This guide explains everything about bash scripting.
The basic approach I would take is:
Move the files you want to zip to a directory you create.
(commands mv and mkdir)
zip the directory. (command gzip, I assume)
Copy the new files to the desired location (command cp)
In my experience, bash scripting is mainly about knowing how to use these commands well; if you can run something on the command line, you can run it in your script.
Another command that might be useful is
pwd - this returns the current directory
Why don't you use version control? It's much easier; just check out, and compress.
(Apologies if that's not an option.)
