Improve my cron script for website change notification - bash

I've been using a script through cron to check if a web page has changed.
Everything is running smoothly but this is my first attempt and I know it could be better. I'm a noob so take it easy.
I've cobbled this together from many sources. The standard services that check whether a webpage has changed didn't work on my page of interest, because each visit created a new shopping cart ID. Instead, I look for a line containing something like article:modified_time (that's my TextOfInterest); the script grabs that whole line and compares it to the same line in the prior file.
Things to maybe do better:
I'd like to define each file (.txt, .html) and the website URL at the beginning.
I've also seen other code that appears to skip saving to a file and does the comparison in memory.
I'm currently saving to a flash drive (OpenMediaVault); I'd like to point the writes at a directory on a different drive.
Any other improvements are welcome.
Here is working code:
#!/bin/bash
cp Current.html Prior.html
wget -O Current.html 'https://websiteofinterest.com/'
awk '/TextOfInterest/' Prior.html > Prior.txt
awk '/TextOfInterest/' Current.html > Current.txt
diff -q Current.txt Prior.txt || diff Current.txt Prior.txt | mail -s "Website Changed" "Emailtosendto@email.com"
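For context, here is a rough, untested sketch of what I think those three changes might look like: files and URL defined once at the top, the .txt temp files replaced by process substitution so the comparison happens in memory, and a configurable working directory on another drive. All paths, the URL and the address are placeholders.
#!/bin/bash
# Placeholders: adjust the directory, URL, pattern and address to your own setup.
workdir="/srv/otherdrive/webwatch"        # directory on the drive you want the writes on
url="https://websiteofinterest.com/"
pattern="TextOfInterest"
mailto="Emailtosendto@email.com"

cd "$workdir" || exit 1
touch Prior.html                          # first run: make sure the file exists
[ -f Current.html ] && cp -f Current.html Prior.html
wget -q -O Current.html "$url"

# Compare only the matching lines, in memory, instead of writing Prior.txt/Current.txt.
changes=$(diff <(awk -v pat="$pattern" '$0 ~ pat' Prior.html) \
               <(awk -v pat="$pattern" '$0 ~ pat' Current.html))
if [ -n "$changes" ]; then
    printf '%s\n' "$changes" | mail -s "Website Changed" "$mailto"
fi
The cron entry itself would stay the same and just point at this script.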


Retrieving latest file in a directory from a remote server

I was hoping to crack this myself, but it seems I have fallen at the first hurdle because I can't make head nor tail of the other options I've read about.
I wish to access a database file hosted as follows (i.e. the hhsuite_dbs is a folder containing several databases)
http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pdb70_08Oct15.tgz
Periodically, they update these databases, so I want to download the latest version. My plan is to run a bash script via cron, most likely monthly (though I've yet to even tackle the scheduling aspect of the task).
I believe the database is refreshed fortnightly, so if my script runs monthly I can expect there to be a new version. I'll then be running downstream programs that require the database.
My question, then, is how do I go about retrieving this? (For a little more finesse, I'd perhaps like to check whether the remote file has changed in name or content, to avoid a large download if it's unnecessary.) Is the best approach to query the name of the file, or its last-modified date (given that they may change the naming syntax of the file too)? To my naive brain, some kind of globbing on pdb70 (something I think I can rely on being in the filename), then pulling it down with wget, was all I had come up with so far.
EDIT: Another confounding issue that has just occurred to me is that the file I want won't necessarily be the newest in the folder (as there are other types of databases there too); rather, I need the newest version of, in this case, the pdb70 database.
Solutions I've looked at so far have mentioned weex, lftp and curlftpls, but all of these seem to require logins/passwords for the server, which I don't have/need if I just download it via the web. I've also seen mention of rsync, but on a cursory read it seems like people are steering clear of it for FTP uses.
Quite a few barriers in your way for this.
My first suggestion is that rather than getting the filename itself, you simply mirror the directory using wget, which should already be installed on your Ubuntu system, and let wget figure out what to download.
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
And new files will be created in the "safe" directory.
But that just gets you your mirror. What you're still after is the "newest" file.
Luckily, wget sets the datestamp of files it downloads, if it can. So after mirroring, you might be able to do something like:
newestfile=$(ls -t /some/place/safe/pdb70*gz | head -1)
Note that this fails if ever there are newlines in the filename.
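If you'd rather avoid parsing ls output altogether, a plain glob plus bash's -nt test sidesteps that caveat; a small untested sketch, assuming the same /some/place/safe/ path:
#!/bin/bash
# Pick the most recently modified pdb70 archive without parsing ls output.
newestfile=
for f in /some/place/safe/pdb70*gz; do
    # -nt is true when $f is newer than $newestfile, or when $newestfile doesn't exist yet.
    [[ -e $f && $f -nt $newestfile ]] && newestfile=$f
done
echo "Newest pdb70 file: $newestfile"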
Another possibility might be to check the difference between the current file list and the last one. Something like this:
#!/bin/bash
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
rm index.html* *.gif # remove debris from mirroring an index
ls > /tmp/filelist.txt.$$
if [ -f /tmp/filelist.txt ]; then
    echo "Difference since last check:"
    diff /tmp/filelist.txt /tmp/filelist.txt.$$
fi
mv /tmp/filelist.txt.$$ /tmp/filelist.txt
You can parse the diff output (man diff for more options) to determine what file has been added.
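For example, inside the if block above (before the mv), you could capture just the added names, since they show up as the "> " lines of a normal diff:
# Entries only present in the new listing appear as "> name" lines in the diff.
added=$(diff /tmp/filelist.txt "/tmp/filelist.txt.$$" | sed -n 's/^> //p')
[ -n "$added" ] && printf 'New files:\n%s\n' "$added"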
Of course, with a solution like this, you could run your script every day and hopefully download a new update within a day of it being ready, rather than a fortnight later. The nice thing about --mirror is that it won't re-download files that are already on hand.
Oh, and I haven't tested what I've written here. That's one monstrously large file.

How to download multiple numbered images from a website in an easy manner?

I'd like to download multiple numbered images from a website.
The images are structured like this:
http://website.com/images/foo1bar.jpg
http://website.com/images/foo2bar.jpg
http://website.com/images/foo3bar.jpg
... And I'd like to download all of the images within a specific interval.
Are there simple browser addons that could do this, or should I use "wget" or the like?
Thank you for your time.
Crudely, on Unix-like systems:
#!/bin/bash
# Brace expansion like {1..3} needs bash, not plain sh.
for i in {1..3}
do
    wget "http://website.com/images/foo${i}bar.jpg"
done
Try googling "bash for loop".
Edit LOL! Indeed, in my haste I omitted the name of the very program that downloads the image files. Also, this goes into a text editor; then you save it under an arbitrary file name, make it executable with the command
chmod u+x the_file_name
and finally you run it with
./the_file_name
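Alternatively, curl can expand a numeric interval on its own, so you could skip the script entirely; a one-liner sketch (the 1-20 range is just an example):
# Downloads foo1bar.jpg through foo20bar.jpg, each saved under its remote name.
curl -O "http://website.com/images/foo[1-20]bar.jpg"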

How to create a batch file in Mac?

I need to find a solution at work to backup specific folders daily, hopefully to a RAR or ZIP file.
If it was on a PC, I would have done it already. But I don't have any idea how to approach it on a Mac.
What I basically want to achieve is an automated task, that can be run with an executable, that does:
compress a specific directory (/Volumes/Audio/Shoko) to a rar or zip file.
(in the zip file, exclude all *.wav files in all subdirectories and a directory named "Videos").
move it to a network share (/Volumes/Post Shared/Backup From Sound).
(or compress directly to this folder).
automate the file name of the Zip file with dynamic date and time (so no duplicate file names).
Shutdown Mac when finished.
I want to say again, I don't usually use a Mac, so things like what kind of file to use for the script and the like are not trivial for me yet.
I have tried to put Mark's bash lines (from the first answer, below) in a txt file and executed it, but it had errors and didn't work.
I also tried to use Automator, but it's too plain, no advanced options.
How can I accomplish this?
I would love a working example :)
Thank You,
Dave
You can just make a bash script that does the backup, and then you can either double-click it or run it on a schedule. I don't know your paths and/or tools of choice, but something along these lines:
#!/bin/bash
FILENAME=$(date +"/Volumes/path/to/network/share/Backup/%Y-%m-%d.tgz")
cd /directory/to/backup || exit 1
tar -cvzf "$FILENAME" .
You can save that on your Desktop as backup and then go in Terminal and type:
chmod +x ~/Desktop/backup
to make it executable. Then you can just double-click on it - obviously after changing the paths to reflect what you want to back up and where it should go.
Also, you may prefer to use some other tools - such as rsync but the method is the same.
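If you want to fold in the exclusions, the timestamped name and the shutdown from the question, the script could be fleshed out roughly like this; the paths follow the question, but the exclude patterns and the shutdown line are untested assumptions to check on your own Mac:
#!/bin/bash
src="/Volumes/Audio/Shoko"
dest="/Volumes/Post Shared/Backup From Sound"
stamp=$(date +"%Y-%m-%d_%H%M%S")                  # date + time keeps archive names unique
archive="$dest/Shoko_$stamp.tgz"

cd "$src" || exit 1
# macOS ships bsdtar; --exclude takes shell-style patterns.
# Adjust the Videos pattern if your tar matches directories differently.
tar -cvzf "$archive" --exclude='*.wav' --exclude='Videos' .

# Optional: power off when the backup finishes (asks for an admin password
# unless shutdown has been allowed in sudoers).
# sudo shutdown -h now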

Creating an executable file to download a file, then upload the file to new location

I'm having trouble finding the correct method to accomplish a relatively simple task.
I'm trying to make a simple executable that I can run or schedule to run, that:
1. Downloads a file from an intranet location (192.168.100.112/file.txt)
2. Uploads the new version of the file to the web (fpt.website.com/docs/file.txt)
There are 5 PDF files that auto-generate on an intranet, and I would like to keep the web versions updated. Ideally I'd create one executable that does all 5 files at once, while keeping the ability to do each one individually.
thanks
Use the Windows ftp command. It has a -s option for providing ftp "scripts". Basically, just add all the commands you need to accomplish your task to something.txt, for example:
open 192.168.100.112
get file.txt
close
open fpt.website.com
cd docs
put file.txt
close
bye
then do:
ftp -s:something.txt
You could make ftp scripts, one for each upload, then put all five ftp commands in a batch file.
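For example, with five hypothetical script names, the batch file is just the five calls back to back:
rem upload_all.bat - run each ftp script in turn (file names are placeholders)
ftp -s:file1.txt
ftp -s:file2.txt
ftp -s:file3.txt
ftp -s:file4.txt
ftp -s:file5.txt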

How to add copyrights to my current app in xcode using MAC OS

I am working on an app. I have finished the development module; now I am working on getting it ready for delivery to the client.
I have to add the client's copyright to all the .h and .m files in my project, and there are around 700+ files. I also don't want to use Find & Replace across the entire app to insert the required copyright content.
Is there any other approach I can look into for adding the copyright content? I've heard of creating some sort of batch file and then using the terminal to add the copyright to the project, but I am not sure how to implement this.
Any help would be appreciated.
Thanks
Hmm, this may need just a little shell scripting. In general, this is how I would attempt to tackle it:
Create a text file with your copyright.
Create a text file with the (code) file names you need to add the copyright to. (Should be pretty easy with something like ls *.m > script).
Open the script file with a text editor and modify each line so that it reads:
cat copyright 'filename' > 'filename'_out
Should be pretty easy to do with a macro
Set the script file's execution bit: chmod a+x script
Run the script by: ./script
Notice that this produces files with '_out' at the end. Change it to what fits you.
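If editing the script line by line is still too tedious for 700+ files spread across subdirectories, the whole thing can be collapsed into one loop; a rough sketch, assuming the copyright text sits in a file named copyright at the project root:
#!/bin/bash
# Prepend ./copyright to every .h and .m file in the project, writing *_out copies.
find . \( -name '*.h' -o -name '*.m' \) -print0 |
while IFS= read -r -d '' f; do
    cat copyright "$f" > "${f}_out"
done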
