Efficiently creating tar files - Windows

Note: I'm using Windows file servers and .NET
If I were to create a TAR file from files on a remote file server (meaning, the TAR file would be created on the remote file server, where the original files are), would the bytes need to come to my machine and then go back to the file server (since my machine is running the code that's generating the TAR), or would they stay on the file server? I'm asking about the best possible (theoretical) implementation.
Thank you!

The bytes need to be where they are processed.
If you process them on your own machine, they must be transferred across the network.
If you process them on the file server, they don't need to be transferred.
If your goal is to minimize bandwidth usage, your best bet would be to have a script on your server that will generate the tar files for you when triggered by your remote system.
The best possible implementation really depends on what your goals and constraints are.

The bytes would have to be read into your machine. The only way I know to do the TARring entirely on the remote server is to have the remote server generate the TAR itself. For example, you could connect via SSH and run a shell command on the remote server.

Unfortunately, in the scenario described, the TAR operation will use network bandwidth. You need to run the tar program on the file server itself to avoid that.
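To make that concrete, here is a minimal sketch, assuming the file server can accept SSH connections and has a tar utility available (the host name and paths are hypothetical); from .NET you would launch the same command through an SSH library or a remote process call rather than reading the files locally:

# tar runs on the file server, so the file bytes never leave it; only the
# short command and its output cross the network
$ ssh backupuser@fileserver01 "tar -cf /archives/projects.tar -C /data/projects ."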

Related

How to upload a file to a server that's not in the inventory?

Sometimes we need to upload logs of an application that's distributed across multiple local Unix machines to the vendor's server. The machines are all part of the same inventory and can archive the logs and upload the archives directly.
The server runs Unix and accepts only SCP and SFTP, so the synchronize module (which uses rsync) will not work.
There is a net_put module, but that seems intended for uploads to special network appliances; trying to use it, I get cryptic errors about ansible_network_os...
I can, of course, use the command module, but isn't there something specifically targeted at SCP and/or SFTP servers?
No, there is no module for scp or sftp, and I don't really see that it would provide a lot of value. sftp and scp are straightforward to use with the command module, and the underlying commands don't really support the things you might want a module to do, such as skipping an upload if the file on the remote side wouldn't change.
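For example, a minimal ad-hoc sketch with the command module (the group name, paths, and remote account are hypothetical, and key-based authentication is assumed so scp can run non-interactively):

# each managed machine copies its own archive straight to the vendor's server
$ ansible log_hosts -m command -a "scp -o BatchMode=yes /var/log/app/logs.tar.gz upload@vendor.example.com:/incoming/"

The same command line goes into a command task if you want it in a playbook.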

Syncing a file from a client to a server

I'm trying to keep a file updated with the server in real time. It's more like real-time syncing with a very small delay. Is there any application that lets me do this? Or would you suggest using a local host as a server?
I don't know how you are connected to your server, but I assume it's something like SCP / SFTP / FTP, and I don't know your OS. WinSCP will do exactly what you need: you can set it to watch your filesystem (a specified folder), and it will update the server files as soon as a file on your drive changes.
It also has command-line scripting features, so you can use it from within your own applications.
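As a rough sketch of that scripting interface (the host, account, and paths are hypothetical, and in practice you would also pass the server's -hostkey fingerprint to the open command), the keepuptodate command watches a local folder and uploads files as they change:

C:\> winscp.com /command "open sftp://backupuser@server.example.com/" "keepuptodate C:\data\watched /home/backupuser/watched" "exit"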

What's the best way to (programmatically) determine a file's network origin?

For an application I'm writing, I want to programmatically find out which computer on the network a file came from. How can I best accomplish this?
Do I need to monitor network transactions or is this data stored somewhere in Windows?
When a file is copied to the local system, Windows does not keep any record of where it was copied from. So unless the application that created it saved that information in the file, it is lost.
With file auditing, file and directory operations can be tracked, but I don't think that includes the source path for file copies (just who created the file and when).
Yes, it seems like you would either need to detect the file transfer by intercepting network traffic, or, if you have the ability to alter the file in some way, use public-key cryptography to sign files with a machine-specific key before they are transferred.
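As a sketch of the signing idea (the file and key names are hypothetical), each machine could detach-sign outgoing files with its own GPG key, and the destination verifies the signature to learn which machine produced the file:

# on the source machine, signing with a key that identifies that machine
$ gpg --local-user fileserver01 --detach-sign --output report.docx.sig report.docx
# on the destination machine, with the source machines' public keys imported
$ gpg --verify report.docx.sig report.docx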
Create a service on either the destination computer or the file-hosting computers that adds records to an alternate data stream attached to each file, much the way Windows records zone information (the Zone.Identifier stream) for files downloaded from the internet.
You can have a background process on machine A that "tags" each file as having been tagged by machine A on such-and-such a date and time. Then, when machine B downloads the file, it can see the tag from A, assuming both are using NTFS filesystems. Or, if you can't run a process on the server, you can add NTFS streams on the "client" side using the packet-sniffing methods others have described. The bonus here is that future file copies will retain the data as long as they stay between NTFS systems.
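For a quick illustration of the alternate-data-stream approach (the file and stream names are hypothetical), a tag can be written and read back from a Windows command prompt on an NTFS volume; a service would do the same thing programmatically:

C:\> echo origin=MACHINE-A > report.docx:origin.txt
C:\> more < report.docx:origin.txt
origin=MACHINE-A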
Alternative: create a requirement that all file transfers must go through a web portal (as opposed to network drag-and-drop), which gives you built-in logging, or some other type of file-retrieval proxy. Do you have control over procedures like this?

Encrypted FTP Storage

I guess this is kind of a programming question, because I'm going to write a program if this doesn't exist.
So I found a very cheap web-host (I don't really care about the actual web hosting). They will give me a domain name and ftp server with a ton of storage space. Anyway, I want to backup a few hundred gigs of data (mostly family photos and scans of important documents). I also want to backup any future family photos / documents. I don't care if everything on my local NAS dies in a fire, I just want to have the photos and important documents backed up off-site.
So I want some program that lets me select folders locally and schedule them to be backed up to the ftp server. I'm a bit of a security nut, so I'd like the files to be encrypted locally before being transferred up onto the server.
I know I can do this with TrueCrypt volumes, but I don't want to transfer an entire encrypted volume blob up to the server every time I change a file in it. I could use multiple TrueCrypt volumes, but that would be a pain to manage.
Also, this must be Mac/Linux compatible, although I'll primarily be on Linux.
I basically need rsync + truecrypt + cron + sftp all rolled into a cryptographically secure program.
I've been searching for days with no luck. Any ideas?
Mozy Backup does this - it doesn't use FTP, it has a custom uploader.
P.S. Remember that a typical home ADSL connection only does about 1Gb/day upstream.
A Linux option: out of the box, probably duplicity (for example, see http://www.howtoforge.com/creating-encrypted-ftp-backups-with-duplicity-and-ftplicity-on-debian-lenny).
Otherwise, if these are basically rarely-changed archive copies of files, I would roll my own: gnupg (or dpad) for individual file encryption, a file-change script, and ftp or rsync.
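To make the duplicity option concrete, here is a minimal sketch (the host, credentials, and paths are hypothetical) of an encrypted incremental backup to the FTP account; files are GPG-encrypted locally before anything is uploaded, and the same command can be run nightly from cron:

$ export PASSPHRASE='gpg-encryption-passphrase'
$ export FTP_PASSWORD='ftp-account-password'
$ duplicity /home/me/photos ftp://backupuser@ftp.example.com/backups/photos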

How do I verify the integrity of a Sybase dump file, without trying to load it?

Here's the scenario: a client uploads a Sybase dump file (gzipped) to our local FTP server. We have an automated process which picks these up and then moves them to a different server within the network, where the database server resides. Unfortunately, this transfer is over a WAN, which for large files takes a long time, and sometimes our clients forget to FTP in binary mode, which results in 10GB of transfer over our WAN all for nothing, as the dump file can't be loaded at the other end. What I'd like to do is verify the integrity of the dump file on the local server before sending it out over the WAN, but I can't just try to "load" the dump file, as we don't have Sybase installed (and can't install it). Are there any tools or bits of code that I can use to do this?
There are a few things you can do from the command line. The first, on the sending side, is to generate MD5 checksums of the files.
$ md5sum *.dmp
2bddf3cd8b04010183dd3295ce7594ff pubs_1.dmp
7510e0250c8d68bae3e0e794c211e60b pubs_2.dmp
091fe54fa5fd81d8c109cc7835d37f4a pubs_3.dmp
On the client side, they can run the same command and compare the results. Secondly, Sybase dumps are usually done with the compress option. If this option is used, you can also test the file integrity by uncompressing the files via the command line. This isn't as complete, but it will verify the CRC-32 checksum that is part of the gzip format.
$ gunzip --test *.dmp
gunzip: pubs_3.dmp: unexpected end of file
Neither of these methods validates that Sybase will be able to load the file, but they do help ensure the file isn't corrupt.
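Putting the two checks together, a small sketch of what the local FTP server could run before pushing anything over the WAN (the file names are hypothetical, and pubs.md5 is assumed to be a checksum list supplied by the client):

$ md5sum -c pubs.md5 && gunzip --test pubs_*.dmp && echo "OK to transfer"

An ASCII-mode upload will typically fail both the checksum comparison and the gzip CRC test.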
There is no way to really verify the integrity of the dump file without loading it in some way with a Backup Server. The client should know whether the dump was successful via the backup log or the output produced during the dump.
But to work around your problem, you should use SFTP or SCP: all transfers are done in binary, which eliminates the ASCII-mode issue.
Also ensure that they are using compression in the dump; a compression level of 1-3 is more than enough, and this should reduce your network traffic as well.
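For reference, a sketch of what the compressed dump looks like on the client's side, assuming Sybase ASE's compress:: dump syntax (the server name, database, and path are hypothetical):

$ isql -Usa -SCLIENT_ASE
1> dump database clientdb to "compress::2::/dumps/clientdb.dmp"
2> go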
