Protocol buffer compilation using a remote file for Python (and others) - bash

I am looking for a way to convince bash that "I've passed you a file in the specified directory" without actually downloading the file and creating that directory on disk, OR to pass a remote file directly to my protocol buffer compiler without writing it to disk.
I would like my protocol buffers compiler to look at a remote file, compile my classes, then finish without the .proto ever being downloaded.
I have this script, which is getting us by for now:
git archive --remote=git@bitbucket.org:redzoneco/asset-pipeline-protocol-buffers.git master preflight_check_service/preflight.proto | tar -x
protoc --python_out=chalicelib/protos preflight_check_service/preflight.proto
rm -r preflight_check_service
but as you can see, we create the directory locally, run protoc, then remove it again. It seems like a waste.

I don't think you'll be able to avoid using a temp file/directory, as protoc doesn't accept its input via stdin. There's an open ticket for this, but it has received very little attention.
You could always just drop the .proto file into your destination and not worry about it. So, in your example, put it in chalicelib/protos. That'd help document what was created and why (for others if this is checked into source control, or just for yourself if not).
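If the temporary file really can't be avoided, a minimal sketch of the same pipeline that at least keeps the extraction out of the working tree, using mktemp and a trap for cleanup (same repository and output paths as above), might look like:
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
git archive --remote=git@bitbucket.org:redzoneco/asset-pipeline-protocol-buffers.git master preflight_check_service/preflight.proto | tar -x -C "$tmpdir"
protoc --proto_path="$tmpdir" --python_out=chalicelib/protos "$tmpdir/preflight_check_service/preflight.proto"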

Related

PSFTP rename file after transfer completed

I am transferring files to a 3rd-party server through PSFTP, using batch files. While transferring, due to buffering issues, files are being broken/not transferred fully.
As a remedy, the 3rd party asked us to name each file with a '.new' suffix before starting the transfer and to remove the '.new' once the file has been transferred fully/successfully.
Please let me know the batch script commands to implement the above, and let me know if you need additional info.
To rename a file, use the mv command (or its ren alias):
put c:\local\path\file /remote/path/file.new
mv /remote/path/file.new /remote/path/file
Though if you are transferring multiple files using a wildcard, this won't help you.
A relatively simple solution for multiple files is using a temporary upload folder. After the upload finishes, you can move all files at once to the target folder:
mput c:\local\path\* /temp/path
mv /temp/path/* /remote/path
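For completeness, a sketch of running those commands non-interactively (the host and script file name are placeholders): save them, followed by quit, in a file such as upload.scr and run:
psftp user@example.com -b upload.scr -batch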
For a similar discussion, see also SFTP file lock mechanism.
If you need to use the solution with extensions, you can use WinSCP, as it can automatically use a temporary file name for the upload. Though it uses a .filepart extension, not .new.
put -resumesupport=on c:\local\path\* /remote/path/
See WinSCP article on Uploading to temporary file name for more details.
The article also shows a (considerably more complicated) solution using the WinSCP .NET assembly that allows you to use even the .new extension.
If you choose to switch to WinSCP, there's a guide for converting psftp script to WinSCP.
(I'm the author of WinSCP)

How can I send an HTTPS request from a file?

Let's assume I have a file request.txt that looks like:
GET / HTTP/1.0
Some_header: value
text=blah
I tried:
cat request.txt | openssl s_client -connect server.com:443
Unfortunately it didn't work and I need to manually copy & paste the file contents. How can I do it within a script?
cat is not suited to downloading remote files; it's best used for files local to the system running the script. To download a remote file, there are other commands that handle this better.
If your environment has wget installed, you can download the file by URL. That would look like:
wget https://server.com/request.txt
If your environment has curl installed, you can also download the file by URL. That would look like:
curl -O https://server.com/request.txt
Note that if you want to store the response in a variable for further modification, you can do that as well with a bit more work.
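For example, a minimal sketch using the same placeholder URL:
response=$(curl -s https://server.com/request.txt)
echo "$response"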
Also worth noting: if you really must use cat to fetch a remote file, it is possible, but it usually means going through ssh, and I'm not a fan of that method since it requires ssh access to a file that is already publicly available over HTTP/S. I can't think of a practical reason to go about it this way, but for the sake of completeness I wanted to mention that it could be done, though it probably shouldn't be.
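If you did go that route, it would look something like the following (the user, host, and remote path are hypothetical):
ssh user@server.com cat /var/www/request.txt > request.txt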

Retrieving latest file in a directory from a remote server

I was hoping to crack this myself, but it seems I have fallen at the first hurdle because I can't make head nor tail of the other options I've read about.
I wish to access a database file hosted as follows (i.e. hhsuite_dbs is a folder containing several databases):
http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pdb70_08Oct15.tgz
Periodically, they update these databases, and so I want to download the latest version. My plan is to run a bash script via cron, most likely monthly (though I've yet to even tackle the scheduling aspect of the task).
I believe the database is refreshed fortnightly, so if my script runs monthly I can expect there to be a new version. I'll then be running downstream programs that require the database.
My question is then, how do I go about retrieving this (and for a little more finesse I'd perhaps like to be able to check whether the remote file has changed in name or content to avoid a large download if unnecessary)? Is the best approach to query the name of the file, or the file property of date last modified (given that they may change the naming syntax of the file too?). To my naive brain, some kind of globbing of the pdb70 (something I think I can rely on to be in the filename) then pulled down with wget was all I had come up with so far.
EDIT: Another confounding issue that has just occurred to me is that the file I want won't necessarily be the newest in the folder (as there are other types of databases there too); rather, I need the newest version of, in this case, the pdb70 database.
Solutions I've looked at so far have mentioned weex, lftp, and curlftpls, but all of these seem to require logins/passwords for the server, which I don't have/need if I just download it via the web. I've also seen mention of rsync, but on a cursory read it seems like people are steering clear of it for FTP use.
Quite a few barriers in your way for this.
My first suggestion is that rather than getting the filename itself, you simply mirror the directory using wget, which should already be installed on your Ubuntu system, and let wget figure out what to download.
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
And new files will be created in the "safe" directory.
But that just gets you your mirror. What you're still after is the "newest" file.
Luckily, wget sets the datestamp of files it downloads, if it can. So after mirroring, you might be able to do something like:
newestfile=$(ls -t /some/place/safe/pdb70*gz | head -1)
Note that this fails if ever there are newlines in the filename.
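If you want something more robust than parsing ls, a sketch using GNU find (same directory and glob assumed) would be:
newestfile=$(find /some/place/safe -maxdepth 1 -name 'pdb70*gz' -printf '%T@ %p\n' | sort -nr | head -1 | cut -d' ' -f2-)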
Another possibility might be to check the difference between the current file list and the last one. Something like this:
#!/bin/bash
base="http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/"
cd /some/place/safe/
wget --mirror -nd "$base"
rm index.html* *.gif # remove debris from mirroring an index
ls > /tmp/filelist.txt.$$
if [ -f /tmp/filelist.txt ]; then
echo "Difference since last check:"
diff /tmp/filelist.txt /tmp/filelist.txt.$$
fi
mv /tmp/filelist.txt.$$ /tmp/filelist.txt
You can parse the diff output (man diff for more options) to determine what file has been added.
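For example, run before the final mv, something like this (untested) would print just the added names:
diff /tmp/filelist.txt /tmp/filelist.txt.$$ | sed -n 's/^> //p'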
Of course, with a solution like this, you could run your script every day and hopefully download a new update within a day of it being ready, rather than a fortnight later. Nice thing about --mirror is that it won't download files that are already on-hand.
Oh, and I haven't tested what I've written here. That's one monstrously large file.
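As for the scheduling side mentioned in the question, a daily cron entry (the time and script path are placeholders) could look like:
0 3 * * * /home/user/bin/mirror_hhsuite.sh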

Rsync bash script and hard linking files

I am creating a bash script to backup my files with rsync.
Backups all come from a single directory.
I only want new or modified files to be backed up.
Currently, I am telling rsync to backup the dir, and to check the files compared to the last backup.
The way I am doing this is
THE_TIME=`date "+%Y-%m-%dT%H:%M:%S"`
rsync -aP --link-dest=/Backup/Current /usr/home/user/backup /Backup/Backup-$THE_TIME
rm -f /Backup/Current
ln -s /Backup/Backup-$THE_TIME /Backup/Current
I am pretty sure I have the syntax correct for this. Each backup will check against the "Current" folder, and upload only as necessary. It will then delete the Current symlink and re-create it, pointing at the newest backup it just made.
I am getting an error when I run the script:
rsync: link "/Backup/Backup-2010-08-04-12:21:15/dgs1200series_manual_310.pdf"
=> /Backup/Current/dgs1200series_manual_310.pdf
failed: Operation not supported (45)
The host OS is running HFS filesystem, which supports hard linking. I am trying to figure out if something else is not supporting this, or if I have a problem in my code.
Thanks for any help
Edit:
I am able to create a hard link on my local machine.
I am also able to create a hard link on the remote server (when logged in locally)
I am NOT able to create a hard link on the remote server when mounted via afp. Even if both files exist on the server.
I am guessing this is a limitation of afp.
Just in case your command line is only an example: Be sure to always specify the link-dest directory with an absolute pathname! That’s something which took me quite some time to figure out …
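To make that concrete: rsync resolves a relative --link-dest against the destination directory, not your working directory, so the safe form (a sketch with the question's paths, variable quoted) is:
# Absolute --link-dest; a relative one such as --link-dest=Current would be
# resolved against /Backup/Backup-$THE_TIME rather than where you ran the script.
rsync -aP --link-dest=/Backup/Current /usr/home/user/backup "/Backup/Backup-$THE_TIME"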
Two things from the man page stand out that are worth checking:
If files aren't linking, double-check their attributes. Also
check if some attributes are getting forced outside of rsync's
control, such as a mount option that squishes root to a single
user, or mounts a removable drive with generic ownership (such
as OS X's “Ignore ownership on this volume” option).
and
Note that rsync versions prior to 2.6.1 had a bug that could
prevent --link-dest from working properly for a non-super-user
when -o was specified (or implied by -a). You can work-around
this bug by avoiding the -o option when sending to an old rsync.
Do you have the "ignore ownership" option turned on? What version of rsync do you have?
Also, have you tried manually creating a similar hardlink using ln at the command line?
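For instance, something like this on the AFP-mounted volume (the test file name is just an example) would confirm whether hard links work there at all:
ln /Backup/Current/dgs1200series_manual_310.pdf /Backup/hardlink-test.pdf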
I don't know if this is the same issue, but I know that rsync can't sync a file when the destination is a FAT32 partition and the filename has a ":" (colon) in it. [The source filesystem is ext3, and the destination is FAT32]
Try reconfiguring the date command so that it doesn't use a colon and see if that makes a difference.
e.g.
THE_TIME=`date "+%Y-%m-%dT%H_%M_%S"`

Incremental deploy from a shell script

I have a project, where I'm forced to use ftp as a means of deploying the files to the live server.
I'm developing on linux, so I hacked together a bash script that makes a backup of the ftp server's contents,
deletes all the files on the ftp, and uploads all the fresh files from the mercurial repository.
(and it takes care of user-uploaded files and folders, makes post-deploy changes, etc.)
It's working well, but the project is starting to get big enough to make the deployment process too long.
I'd like to modify the script to look up which files have changed, and only deploy the modified files. (the backup is fine atm as it is)
I'm using mercurial as a VCS, so my idea is to somehow request the changed files between two revisions from it, iterate over the changed files,
and upload each modified file, and delete each removed file.
I can use hg log -vr rev1:rev2, and from the output, I can carve out the changed files with grep/sed/etc.
Two problems:
I have heard the horror stories that parsing the output of ls leads to insanity, so my guess is that the same applies here:
if I try to parse the output of hg log, the variables will undergo word-splitting and all kinds of transformations.
hg log doesn't tell me whether a file was modified, added, or deleted. Differentiating between modified and deleted files is the least I need.
So, what would be the correct way to do this? I'm using yafc as an ftp client, in case it's needed, but willing to switch.
You could use a custom style that does the parsing for you.
hg log --rev rev1:rev2 --style mystyle
Then pipe it to sort -u to get a unique list of files. The file "mystyle" would look like this:
changeset = '{file_mods}{file_adds}\n'
file_mod = '{file_mod}\n'
file_add = '{file_add}\n'
The file_mods and file_adds templates list the files that were modified or added. There are similar file_dels and file_del templates for deleted files.
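That is, something along these lines (placeholder revisions, arbitrary output file name):
hg log --rev rev1:rev2 --style mystyle | sort -u > changed_files.txt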
Alternatively, you could use hg status -ma --rev rev1-1:rev2, which adds an M or an A before modified/added files. You need to pass a different revision range, one less than rev1, as it is the status since that "baseline". Deleted (in Mercurial terms, removed) files are similar: you need the -r flag, and an R is added before each removed file.
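Tying that back to the deployment loop described in the question, a rough, untested sketch of driving uploads and deletions from that output (PREV/TIP are placeholder revisions, and the echo lines stand in for real yafc/ftp calls):
#!/bin/bash
# Sketch only: PREV is the revision before the first changeset being deployed,
# TIP is the last one; swap the echo lines for real upload/delete commands.
hg status -mar --rev "PREV:TIP" | while read -r code file; do
    case "$code" in
        M|A) echo "upload $file" ;;
        R)   echo "delete $file" ;;
    esac
done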
