How do I verify the integrity of a Sybase dump file, without trying to load it? - ftp

Here's the scenario: a client uploads a Sybase dump file (gzipped) to our local FTP server. We have an automated process which picks these up and then moves them to a different server within the network, where the database server resides. Unfortunately, this transfer is over a WAN, which for large files takes a long time, and sometimes our clients forget to FTP in binary mode, which results in 10 GB of transfer over our WAN all for nothing because the dump file can't be loaded at the other end. What I'd like to do is verify the integrity of the dump file on the local server before sending it out over the WAN, but I can't just try to load the dump file, as we don't have Sybase installed (and can't install it). Are there any tools or bits of code that I can use to do this?

There are a few things you can do from the command line. The first, on the sending side, is to generate MD5 checksums of the files.
$ md5sum *.dmp
2bddf3cd8b04010183dd3295ce7594ff pubs_1.dmp
7510e0250c8d68bae3e0e794c211e60b pubs_2.dmp
091fe54fa5fd81d8c109cc7835d37f4a pubs_3.dmp
On the client side, they can run the same command and compare the results. Secondly, Sybase dumps are usually done with the compress option. If this option is used, you can also test the file integrity by uncompressing the files on the command line. This isn't a complete check, but it will verify the CRC-32 checksum and uncompressed length stored in the gzip trailer.
$ gunzip --test *.dmp
gunzip: pubs_3.dmp: unexpected end of file
Neither of these methods validates that Sybase will be able to load the file, but they do help ensure the file isn't corrupt.
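If you want to run that gzip integrity check from code rather than the shell, a minimal sketch in Java (the default file name is just a placeholder) could look like this. Reading the stream to the end makes GZIPInputStream verify the CRC-32 and length in the gzip trailer, which is essentially what gunzip --test does:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class DumpGzipCheck {
    public static void main(String[] args) {
        // Hypothetical file name; pass the real dump path as the first argument.
        String path = args.length > 0 ? args[0] : "pubs_1.dmp";
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = new GZIPInputStream(new FileInputStream(path))) {
            // Reading to end-of-stream forces GZIPInputStream to validate the
            // CRC-32 and uncompressed length stored in the gzip trailer.
            while (in.read(buffer) != -1) {
                // discard the decompressed bytes; we only care about integrity
            }
            System.out.println(path + ": gzip stream OK");
        } catch (IOException e) {
            System.out.println(path + ": corrupt or truncated (" + e.getMessage() + ")");
        }
    }
}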

There is no real way to verify the integrity of the dump file without loading it in some way via a Backup Server. The client should already know whether the dump succeeded from the backup server log or the output produced during the dump.
To solve your problem, though, you should switch to SFTP or SCP: all transfers are done in binary, which eliminates the ASCII-mode issue.
Also ensure that they are using compression in the dump; a level of 1-3 is more than enough, and it should reduce your network traffic as well.
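As a rough illustration of the SFTP route, assuming the client scripts their upload in Java with the JSch library (host, credentials, and paths below are placeholders):
import com.jcraft.jsch.ChannelSftp;
import com.jcraft.jsch.JSch;
import com.jcraft.jsch.Session;

public class SftpUpload {
    public static void main(String[] args) throws Exception {
        // All connection details below are placeholders.
        JSch jsch = new JSch();
        Session session = jsch.getSession("clientuser", "ftp.example.com", 22);
        session.setPassword("secret");
        // In a real deployment, load known_hosts instead of disabling the host key check.
        session.setConfig("StrictHostKeyChecking", "no");
        session.connect();

        ChannelSftp sftp = (ChannelSftp) session.openChannel("sftp");
        sftp.connect();
        // SFTP is always binary, so ASCII-mode corruption cannot happen.
        sftp.put("pubs_1.dmp", "/incoming/pubs_1.dmp");
        sftp.exit();
        session.disconnect();
    }
}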

Related

What is the correct way for an FTP server to prevent corrupted uploaded files because of a late append?

Using pure-ftpd, I uploaded 1% of a 1276541542-byte file, or about 15 megs. Then I killed the network connection abnormally to simulate a client getting kicked off their ISP. Then I waited an hour. Then I re-connected and issued an APPE (append) command and uploaded the rest of the file. The final size of the file on the server after the upload finished was 1292326238, i.e. about 15 megs MORE than it should be. A corrupt file. What is the correct way for an FTP server to prevent corrupted uploaded files because of a late append?
There is no way for the FTP server to prevent corrupted uploaded files because the server does not know what the file should be.
But the server can help the client do a proper upload by implementing the SIZE command. Using this command, the client can determine the current file size on the server and thus the position in the file where the upload should be continued. Of course, this logic has to be implemented on the client.
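A rough sketch of that client-side logic with Apache Commons Net (host, credentials, and file names are placeholders): ask the server for the current size, skip that many bytes in the local file, and append the remainder.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;

public class ResumeUpload {
    public static void main(String[] args) throws IOException {
        // Placeholder connection details and file names.
        String local = "bigfile.bin";
        String remote = "bigfile.bin";

        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com");
        ftp.login("user", "password");
        ftp.enterLocalPassiveMode();
        ftp.setFileType(FTPClient.BINARY_FILE_TYPE);

        // Ask the server how much of the file it already has.
        long remoteSize = 0;
        if (FTPReply.isPositiveCompletion(ftp.sendCommand("SIZE", remote))) {
            // The reply looks like "213 12765415"; the number is the size in bytes.
            remoteSize = Long.parseLong(ftp.getReplyString().split(" ")[1].trim());
        }

        try (InputStream in = new FileInputStream(local)) {
            // Skip the part that is already on the server, then APPE the rest.
            long skipped = 0;
            while (skipped < remoteSize) {
                long n = in.skip(remoteSize - skipped);
                if (n <= 0) {
                    throw new IOException("Could not seek to resume position");
                }
                skipped += n;
            }
            ftp.appendFile(remote, in);
        }
        ftp.logout();
        ftp.disconnect();
    }
}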
I have a pure-ftpd answer about its upload script.
I'm running pure-uploadscript --run /home/aa/done.rb --daemonize
and my done.rb program is
#!/usr/bin/env ruby
# Minimal upload hook: create a marker file so we can tell the script ran.
puts "done"
f = File.open("/home/aa/ddd.txt", "w")
f << "test"
f.close
and when I run pure-ftpd --uploadscript and upload a file, sure enough the done.rb program is run.
(I know it's run because there is a new file called ddd.txt.)
BUT when I upload a big file and kill the FTP client in the middle of the upload, done.rb is STILL run. (Yes, I deleted ddd.txt first.)
Therefore, the answer to the question is: EVEN pure-ftpd can't handle this, because of the limits of the FTP protocol.

Calculate file checksum in FTP server using Apache FtpClient

I am using FtpClient from Apache Commons Net to upload videos to an FTP server. To check whether a file has really been transferred successfully, I want to calculate the checksum of the remote file, but unfortunately I found there is no related API I could use.
My question is: is there a need to calculate a file checksum on the FTP server? If the answer is yes, how do I get the checksum with FtpClient?
If the answer is no, how does FtpClient know that the file has really been transferred successfully and completely?
With FTP, I'd recommend verifying the upload, if possible.
The problem is that there's no widespread standard API for calculating a checksum over FTP.
There are many proposals for a checksum calculation command for FTP. None has been accepted yet.
The latest proposal is:
https://datatracker.ietf.org/doc/html/draft-bryan-ftpext-hash-02
As a consequence, different FTP servers support different checksum commands, with a different syntax. HASH, XSHA1, XSHA256, XSHA512, XMD5, MD5, XCRC, to name some. You need to check what, and if any at all, your FTP server supports.
You can test this with WinSCP. WinSCP supports all the previously mentioned commands. Test its checksum calculation function or checksum scripting command. If they work, enable logging and check what command and what syntax WinSCP uses against your server.
> 2015-04-28 09:19:16.558 XSHA1 /test/file.dat
< 2015-04-28 09:19:22.778 213 a98faefdb2c36ca352a2d9b01668aec6b641cf4b
Then execute the command using Apache Commons Net sendCommand method:
if (FTPReply.isPositiveCompletion(ftpClient.sendCommand("XSHA1", "filename")))
{
    // The hash is in the reply text, e.g. "213 a98faefdb2c36ca352a2d9b01668aec6b641cf4b"
    String[] reply = ftpClient.getReplyStrings();
}
(I'm the author of WinSCP)
If your server does not support any of the checksum commands, you do not have many options:
Download the file back and check it locally.
When using encryption (TLS/SSL), chances of the file being corrupted during transfer are significantly lower. The receiving party (server in this case) would otherwise fail to decrypt the data. So if you are sure that the file transfer completed (no decryption errors and the size of the uploaded file is the same as size of the original local file), you can be pretty sure that the uploaded file is correct.
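A minimal sketch of the download-and-compare fallback with Apache Commons Net (connection details and file names are placeholders): stream the remote copy back, hash both sides, and compare.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import org.apache.commons.net.ftp.FTPClient;

public class DownloadBackCheck {
    static byte[] md5Of(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            byte[] buf = new byte[64 * 1024];
            while (dis.read(buf) != -1) {
                // just stream through to feed the digest
            }
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder connection details and file names.
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com");
        ftp.login("user", "password");
        ftp.enterLocalPassiveMode();
        ftp.setFileType(FTPClient.BINARY_FILE_TYPE);

        byte[] localHash = md5Of(new FileInputStream("video.mp4"));
        byte[] remoteHash = md5Of(ftp.retrieveFileStream("video.mp4"));
        ftp.completePendingCommand(); // finish the RETR transfer

        System.out.println("Upload verified: " + MessageDigest.isEqual(localHash, remoteHash));
        ftp.logout();
        ftp.disconnect();
    }
}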
Just an addition on how I implemented this. When dealing with standard FTP servers without any additional modules loaded for checksum checking, all I did was create a list of MD5 hashes for each file in an SFV-style file. Say it's called uploads.sfv (in the same format an SFV generator would produce). This allows you to do further checksum checks.
Examples of server-side checksum checking support:
PZS-ng for cuftpd, glftpd
mod_digest for ProFTPD
Of course, as @MartinPrikryl highlighted, none of these are standardized.
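The generator I used isn't shown here, but a minimal sketch of producing such a checksum list is below (file names are placeholders; classic SFV uses CRC-32, though the same loop works with MD5):
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.util.zip.CRC32;

public class SfvWriter {
    public static void main(String[] args) throws IOException {
        // Hypothetical file names; in practice, iterate over the upload set.
        String[] uploads = { "video1.mp4", "video2.mp4" };
        try (PrintWriter out = new PrintWriter("uploads.sfv")) {
            for (String name : uploads) {
                CRC32 crc = new CRC32();
                byte[] buf = new byte[64 * 1024];
                try (InputStream in = new FileInputStream(name)) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        crc.update(buf, 0, n);
                    }
                }
                // SFV line format: "<filename> <8-digit hex CRC-32>"
                out.printf("%s %08X%n", name, crc.getValue());
            }
        }
    }
}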
That's a long shot, but if the server supports PHP, you can exploit that.
Save the following as a php file (say, check.php), in the same folder as your name_of_file.txt file:
<?php
echo md5_file('name_of_file.txt');
?>
Then, visit the page check.php, and you should get the md5 hash of your file.
Related questions:
FTP: copy, check integrity and delete
How to perform checksums during a SFTP file transfer for data integrity?
https://serverfault.com/q/98597/401691

What's the best way to (programmatically) determine a file's network origin?

For an application I'm writing, I want to programmatically find out what computer on the network a file came from. How can I best accomplish this?
Do I need to monitor network transactions, or is this data stored somewhere in Windows?
When a file is copied to the local system, Windows does not keep any record of where it was copied from. So unless the application that created the file saved that information in it, the origin is lost.
With file auditing, file and directory operations can be tracked, but I don't think that will include the source path for file copies (just who created them and when).
Yes, it seems like you would either need to detect the file transfer based on interception of network traffic, or if you have the ability to alter the file in some way, use public key cryptography to sign files using a machine-specific key before they are transferred.
Create a service on either the destination computer or on the file-hosting computers which will add records to an alternate data stream attached to each file, much the way Windows handles the Zone.Identifier stream for files downloaded from the internet.
You can have a background process on machine A which "tags" each file as having been tagged by machine A on such-and-such a date and time. Then when machine B downloads the file, assuming we are using NTFS filesystems, it can see the tag from A. Or, if you can't have a process at the server, you can write the NTFS streams on the "client" side, detecting the origin via packet-sniffing methods as others have described. The bonus here is that future file copies will retain the data as long as they stay between NTFS systems.
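As a rough sketch of the tagging idea (Windows/NTFS only; the path, stream name, and tag contents are made up for illustration), an alternate data stream can be written with the ordinary java.io file APIs by appending ":streamname" to the path:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class OriginTagger {
    public static void main(String[] args) throws IOException {
        // Hypothetical file and stream name; works only on NTFS volumes on Windows.
        String adsPath = "C:\\share\\report.docx:origin.machine";
        String tag = "MACHINE-A;2015-04-28T09:19:16Z";

        // Write the tag into an alternate data stream attached to the file.
        try (OutputStream out = new FileOutputStream(adsPath)) {
            out.write(tag.getBytes(StandardCharsets.UTF_8));
        }

        // Read it back later, e.g. on machine B after an NTFS-to-NTFS copy.
        byte[] buf = new byte[4096];
        try (InputStream in = new FileInputStream(adsPath)) {
            int n = in.read(buf);
            System.out.println("File was tagged by: " + new String(buf, 0, n, StandardCharsets.UTF_8));
        }
    }
}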
Alternative: create a requirement that all file transfers must be done through a Web portal (as opposed to network drag-and-drop). Built in logging. Or other type of file retrieval proxy. Do you have control over procedures such as this?

utl_file.FCLOSE() is slow with large files

We are using utl_file in Oracle 10g to copy a BLOB from a table row to a file on the file system, and when we call utl_file.fclose() it takes a long time. It's a 10 MB file, not very big, and it takes just over a minute to complete. Anyone know why this would be so slow?
Thanks
EDIT
Looks like this is related to our file system; when we write to a local drive it works fine.
We have determined that our network file system mount is causing the issue. When we take it out of the picture and store the file on a local drive, it works fine. We were able to test this on another environment with the same configuration, and there it is fast and works as expected.
Now we need to get our network guys involved and see why transferring data over the NFS mount is so slow in this environment.
EDIT
It was the network speed between the Oracle server and the UNIX server. It was set to half duplex at 10 Mbit/s. We bumped it up to full duplex at 100 Mbit/s and it works great now!
Are you doing an fflush prior to that? If not, then fclose is performing the fflush for you, and that may be where the time goes. Check it by issuing an fflush prior to the close.

Efficiently creating tar files

Note: I'm using Windows file servers and .NET
If I were to create a TAR file from files on a remote file server (meaning, the TAR file would be created on the remote file server, where the original files are), would the bytes need to come to my machine and then go back to the file server (since my machine is running the code that's generating the TAR), or would they stay on the file server? I'm asking about the best possible (theoretical) implementation.
Thank you!
The bytes need to be where they are processed.
If you process them on your remote system, they must be transferred.
If you process them on your server, they don't need to be transferred.
If your goal is to minimize bandwidth usage, your best bet would be to have a script on your server that will generate the tar files for you when triggered by your remote system.
The best possible implementation really depends on what your goals and constraints are.
The bytes would have to be read onto your machine. The only way I know to do the tarring entirely on the remote server is to have the remote server generate the TAR itself; for example, you could connect via SSH and run a shell command on the remote server, as in the sketch below.
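The question is about .NET, but as a rough sketch of that SSH approach (hypothetical host and paths, and it assumes the file server runs an SSH server and has tar available), the idea looks like this; only the command and its console output cross the network:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class RemoteTar {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical host and paths; tar runs entirely on the file server.
        ProcessBuilder pb = new ProcessBuilder(
                "ssh", "fileserver.example.com",
                "tar", "-cf", "/archives/backup.tar", "/data/to/archive");
        pb.redirectErrorStream(true);

        Process p = pb.start();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line); // relay any ssh/tar output
            }
        }
        int exit = p.waitFor();
        System.out.println("remote tar exited with code " + exit);
        // The archived bytes themselves never leave the file server.
    }
}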
Unfortunately, in the scenario described, the TAR operation will use network bandwidth. You need to run the tar program on the file server to avoid using bandwidth.
