I am looking for a robust way to copy files over a Windows network share that is tolerant of intermittent connectivity. The application is often used on wireless, mobile workstations in large hospitals, and I'm assuming connectivity can be lost either momentarily or for several minutes at a time. The files involved are typically about 200KB - 500KB in size. The application is written in VB6 (ugh), but we frequently end up using Windows DLL calls.
Thanks!
I've used Robocopy for this with excellent results. By default, it will retry every 30 seconds until the file gets across.
I'm unclear as to what your actual problem is, so I'll throw out a few thoughts.
Do you want restartable copies (with such small file sizes, that doesn't seem like it'd be that big of a deal)? If so, look at CopyFileEx with COPY_FILE_RESTARTABLE.
Do you want verifiable copies? Sounds like you already have that by verifying hashes.
Do you want better performance? It's going to be tough, as it sounds like you can't run anything on the server. Otherwise, TransmitFile may help.
Do you just want a fire and forget operation? I suppose shelling out to robocopy, or TeraCopy or something would work - but it seems a bit hacky to me.
Do you want to know when the network comes back? IsNetworkAlive has your answer.
Based on what I know so far, I think the following pseudo-code would be my approach:
sourceFile = Compress("*.*");   // pseudo: bundle everything into one archive
destFile = "X:\files.zip";
DWORD copyFlags = COPY_FILE_FAIL_IF_EXISTS | COPY_FILE_RESTARTABLE;
DWORD netFlags;
while (CopyFileEx(sourceFile, destFile, NULL, NULL, NULL, copyFlags) == 0) {
    do {
        // optionally, increment a failure counter to break out at some point
        Sleep(1000);
    } while (!(IsNetworkAlive(&netFlags) && (netFlags & NETWORK_ALIVE_LAN)));
}
Compressing the files first saves you from having to track which files you've successfully copied and which you need to restart. It should also make the copy go faster (smaller total size, and one larger file instead of many small ones), at the expense of some CPU power on both sides. A simple batch file can decompress it on the server side.
Try using BITS (Background Intelligent Transfer Service). It's the infrastructure that Windows Update uses, is accessible via the Win32 API, and is built specifically to address this.
It's usually used for application updates, but should work well in any file moving situation.
http://www.codeproject.com/KB/IP/bitsman.aspx
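A minimal sketch of queuing a BITS job from C via its COM interfaces (the UNC path, local path, and job name are placeholders; note that pushing to a plain SMB share generally means either running a download job from the share, as below, or having a BITS upload server on the receiving end):

/* Sketch only: error handling is trimmed and the paths are assumptions.
 * BITS persists the job across reboots and network drops and resumes it
 * on its own, which is exactly the intermittent-connectivity case here. */
#define COBJMACROS
#include <windows.h>
#include <bits.h>

int main(void)
{
    IBackgroundCopyManager *mgr = NULL;
    IBackgroundCopyJob *job = NULL;
    GUID jobId;

    CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
    CoCreateInstance(&CLSID_BackgroundCopyManager, NULL, CLSCTX_LOCAL_SERVER,
                     &IID_IBackgroundCopyManager, (void **)&mgr);

    /* Create a download job and hand it one file. */
    IBackgroundCopyManager_CreateJob(mgr, L"WardFiles", BG_JOB_TYPE_DOWNLOAD,
                                     &jobId, &job);
    IBackgroundCopyJob_AddFile(job, L"\\\\server\\share\\report.dat",
                               L"C:\\incoming\\report.dat");
    IBackgroundCopyJob_Resume(job);   /* start (or restart) the transfer */

    /* A real program would poll IBackgroundCopyJob_GetState until it
       reaches BG_JOB_STATE_TRANSFERRED, then call IBackgroundCopyJob_Complete. */

    IBackgroundCopyJob_Release(job);
    IBackgroundCopyManager_Release(mgr);
    CoUninitialize();
    return 0;
}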
I agree with Robocopy as a solution... that's why the utility is called "Robust File Copy".
I've used Robocopy for this with excellent results. By default, it will retry every 30 seconds until the file gets across.
And by default, a million retries. That should be plenty for your intermittent connection.
It also does restartable transfers, and you can even throttle them by adding a gap between packets (the /IPG switch) in case you don't want to use all the bandwidth while other programs are sharing the same connection.
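For example, a call along these lines (paths and numbers are only illustrative) keeps retrying through outages, resumes partial files, and leaves room for other traffic on a shared wireless link:

robocopy C:\outbox \\server\share *.* /Z /R:1000000 /W:30 /IPG:50

/R and /W here just spell out the defaults (a million retries, 30 seconds apart); /Z makes the copy restartable after a drop, and /IPG:50 inserts a 50 ms pause between blocks.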
How about simply sending a hash before or after you send the file, and comparing it against a hash of the file you received? That should at least make sure you have a correct file.
If you want to go all out you could do the same process, but for small parts of the file. Then when you have all pieces, join them on the receiving end.
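If it helps, here is a rough sketch of the hashing side using the Win32 CryptoAPI (MD5 chosen only for illustration, the path is a placeholder, and error handling is trimmed); VB6 can reach the same functions through Declare statements:

/* Compute an MD5 digest of a file so sender and receiver can compare
 * digests after the copy. */
#include <windows.h>
#include <wincrypt.h>
#include <stdio.h>

int main(void)
{
    HCRYPTPROV prov;
    HCRYPTHASH hash;
    BYTE buf[64 * 1024], digest[16];
    DWORD read, digestLen = sizeof(digest);
    HANDLE file = CreateFileA("C:\\outbox\\scan001.dat", GENERIC_READ,
                              FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);

    CryptAcquireContext(&prov, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT);
    CryptCreateHash(prov, CALG_MD5, 0, 0, &hash);

    while (ReadFile(file, buf, sizeof(buf), &read, NULL) && read > 0)
        CryptHashData(hash, buf, read, 0);

    CryptGetHashParam(hash, HP_HASHVAL, digest, &digestLen, 0);
    for (DWORD i = 0; i < digestLen; i++)
        printf("%02x", digest[i]);      /* print the digest as hex */
    printf("\n");

    CryptDestroyHash(hash);
    CryptReleaseContext(prov, 0);
    CloseHandle(file);
    return 0;
}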
You could use Microsoft SyncToy (free).
http://www.microsoft.com/Downloads/details.aspx?familyid=C26EFA36-98E0-4EE9-A7C5-98D0592D8C52&displaylang=en
Hm, it seems rsync does it, and it does not need the server/daemon/install I thought it did: just $ rsync src dst.
SMS if it's available works.
Related
A tool I'm writing is responsible for downloading thousands of image files over a matter of many hours. Originally, using TIdHTTP, I would Get the file(s) into a TMemoryStream, and then save that to a file, so long as there were no exceptions. In order to improve speed, I changed the TMemoryStream to a TFileStream.
However, now if the resource is not found, or any other exception occurs that results in no actual file, it still saves an empty file.
Completely understandable, since I simply create a file stream just prior to the download...
FileStream := TFileStream.Create(FileName, fmCreate);
try
  Web.Get(AURL, FileStream);
finally
  FileStream.Free;
end;
I know I could simply delete the file if there was an exception. But it seems far too sloppy. I'm sure there's a more appropriate method of aborting such a situation.
How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?
How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?
This isn't possible in general. Errors and failures can happen at any step of the way, including partway through the download. Once this point is understood, you must accept that the file can be partially downloaded and then abandoned. At which point, where do you store it?
The obvious choices are memory and file. You don't want to store it in memory, which leaves a file.
This takes you back to your current solution.
I know I could simply delete the file if there was an exception.
This is the correct approach. There are a few variants on it. For instance, you might download to a temporary file that is created with flags arranging its deletion when it is closed; only if the download completes do you then copy it to the true destination. This is the approach a browser takes. But the basic idea is to download to a file and deal with any failure by tidying up.
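The same pattern sketched in Win32 C terms, just for concreteness (FetchUrlToFile is a hypothetical stand-in for whatever actually performs the HTTP GET, and the ".part" naming is my own assumption):

/* Download into a temporary ".part" file, then commit it with a rename
 * only on success, so a failed request never leaves an empty file behind. */
#include <windows.h>
#include <wchar.h>

BOOL FetchUrlToFile(const wchar_t *url, const wchar_t *path); /* hypothetical */

BOOL DownloadToFile(const wchar_t *url, const wchar_t *finalPath)
{
    wchar_t tempPath[MAX_PATH];
    swprintf(tempPath, MAX_PATH, L"%ls.part", finalPath);

    if (!FetchUrlToFile(url, tempPath)) {
        DeleteFileW(tempPath);          /* tidy up the partial or empty file */
        return FALSE;
    }
    /* Only a fully downloaded file ever appears under the real name. */
    return MoveFileExW(tempPath, finalPath, MOVEFILE_REPLACE_EXISTING);
}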
Instead of downloading the entire image in one go, you could consider using HTTP range requests if the server supports them. Then you could split the file into smaller parts, requesting the next part after the first finishes (or even requesting multiple parts at the same time to increase performance). If there is an exception you can then abort the future requests, so they never start in the first place.
YouTube and a number of streaming media sites started doing this a while ago. It used to be if you started playing a video, then paused it, then it would eventually cache the entire video. Now it only caches a little ahead of the current position. This saves a ton of bandwidth because of the abandon rate for videos.
You could write the partial file to disk or keep it in memory.
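A rough sketch of a single range request using WinHTTP (host, path, and byte range are placeholders; the same Range header could presumably be sent through TIdHTTP's request headers as well):

/* Fetch one chunk of a resource with an HTTP Range request. */
#include <windows.h>
#include <winhttp.h>

int main(void)
{
    HINTERNET ses = WinHttpOpen(L"ChunkFetcher/1.0",
                                WINHTTP_ACCESS_TYPE_DEFAULT_PROXY,
                                WINHTTP_NO_PROXY_NAME, WINHTTP_NO_PROXY_BYPASS, 0);
    HINTERNET con = WinHttpConnect(ses, L"images.example.com",
                                   INTERNET_DEFAULT_HTTP_PORT, 0);
    HINTERNET req = WinHttpOpenRequest(con, L"GET", L"/photos/12345.jpg",
                                       NULL, WINHTTP_NO_REFERER,
                                       WINHTTP_DEFAULT_ACCEPT_TYPES, 0);

    /* Ask for only the first 256 KB; a 206 response means the server
       honoured the range, a 200 means it sent the whole file anyway. */
    WinHttpSendRequest(req, L"Range: bytes=0-262143", (DWORD)-1,
                       WINHTTP_NO_REQUEST_DATA, 0, 0, 0);
    WinHttpReceiveResponse(req, NULL);

    BYTE buf[8192];
    DWORD got;
    while (WinHttpReadData(req, buf, sizeof(buf), &got) && got > 0) {
        /* append buf[0..got) to the output file here */
    }

    WinHttpCloseHandle(req);
    WinHttpCloseHandle(con);
    WinHttpCloseHandle(ses);
    return 0;
}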
I'm currently trying to upload some files via ZMODEM to a small system with an embedded Linux running BusyBox. While most files take a long time over the 9600 baud connection, there is one file that always fails (cramfs_cmc-pu2_v2.45.img). At about 4 MB, it is also the largest one. For the upload I use Le Putty, a PuTTY fork that supports ZMODEM. Unfortunately there is no other way to upload files, as the FTP server on that machine does not work properly.
The problem is that the upload always ends up with this strange stuff (after some hours of no feedback at all):
# /usr/bin/rz
Sending: cramfs_cmc-pu2_v2.45.img23be50
Bytes Sent: 0/4132864 BPS:0 ETA 00:00
®B#id##íÁ##htCJÁ®B#killíÁ##htCJ®B#killall#íÁ##htCJÁ®B#ln##íÁ##htCJ®B
#logger##íÁ##<H#Jº!#login###íÁ##htCJÁ®B#ls##íÁ##htCJ®B#md5sum##íÁ##¿
##JCø##mgfestart###íÁ##htCJ®B#mkdir###íÁ##htCJ®B#mknod###íÁ##htCJkH>
F¾#
I guessed that it runs out of flash memory but df gives me just
df: /proc/mounts: No such file or directory
Calculation of free space is difficult in that case anyway as the filesystem is jffs2.
Maybe someone has an idea of how to solve this problem with that ancient protocol. Thanks in advance.
Edit: Meanwhile I've split the file into many smaller ones and tried to upload them. It always fails after two files. This supports the suspicion that there is not enough free space.
A quite simple approach to check how much space is left, even if you have no "df":
I just copied an existing file several times and the result was: "No space left on the device". So I'm pretty sure that the strange behaviour described above happened because of this.
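Something along these lines works with nothing but the BusyBox shell, assuming the working directory sits on the jffs2 flash you are trying to fill (the filler file names are arbitrary; remember to delete them afterwards):

i=1
# keep copying until the write fails with "No space left on device"
while cp /etc/passwd ./fill.$i 2>/dev/null; do
    i=$((i+1))
done
echo "ran out of space after $i copies of /etc/passwd"
rm -f ./fill.*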
I want to be able to (programmatically) move (or copy and truncate) a file that is constantly in use and being written to, so that the file being written to never grows too large.
Is this possible? Either Windows or Linux is fine.
To be specific what I'm trying to do is log video with FFMPEG and create hour long videos.
It is possible in both Windows and Linux, but it would take cooperation between the applications involved. If the application that is writing the new data to the file is not aware of what the other application is doing, it probably would not work (well ... there is some possibility ... back to that in a moment).
In general, to get this to work, you would have to open the file shared. For example, if using the Windows API CreateFile, both applications would likely need to specify FILE_SHARE_READ and FILE_SHARE_WRITE. This would allow both (multiple) applications to read and write the file "concurrently".
Beyond sharing the file, though, it would also be necessary to coordinate the operations between the applications. You would need to use some kind of locking mechanism (either by locking some part of the file or some shared mutex/semaphore). Note that if you use file locking, you could lock some known offset in the file to act as a "semaphore" (it can even be a byte value beyond the physical end of the file). If one application were appending to the file at the same exact time that the other application were truncating it, then it would lead to unpredictable results.
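As a sketch of that lock-a-byte-beyond-EOF idea in Win32 C (the offset is an arbitrary value both programs would have to agree on; WithFileLocked and its callback are illustrative names, not an existing API):

/* Both the writer and the "rotator" open the file with full sharing and
 * then take an exclusive lock on one agreed-upon byte far past EOF before
 * touching the data. */
#include <windows.h>

#define SEMAPHORE_OFFSET 0xFFFFFFF0UL   /* assumed agreed-upon lock byte */

BOOL WithFileLocked(const wchar_t *path, void (*op)(HANDLE))
{
    HANDLE h = CreateFileW(path, GENERIC_READ | GENERIC_WRITE,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return FALSE;

    OVERLAPPED ov = {0};
    ov.Offset = SEMAPHORE_OFFSET;       /* lock position, not a data offset */

    /* Blocks until the other process releases its lock on the same byte. */
    if (!LockFileEx(h, LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &ov)) {
        CloseHandle(h);
        return FALSE;
    }

    op(h);                              /* append, or truncate-and-copy */

    UnlockFileEx(h, 0, 1, 0, &ov);
    CloseHandle(h);
    return TRUE;
}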
Back to the comment about both applications needing to be aware of each other: it is possible that if both applications opened the file exclusively, kept retrying the open until it succeeded, performed their operation, and then closed the file, they could essentially work without "knowledge" of each other. However, that would probably not work very well and would not be very efficient.
Having said all that, you might want to consider alternatives for efficiency reasons. For example, if it were possible to have the writing application write to new files periodically, it might be more efficient than having to "move" the data constantly out of one file to another. Also, if you needed to maintain some portion of the file (e.g., move out the first 100 MB to another file and then move the second 100 MB to the beginning) that could be a fairly expensive operation as well.
logrotate would be a good option in Linux; it comes stock on just about any distro. I'm sure there's a similar Windows service out there somewhere.
To restrict the scope, let's assume we are in the Windows world only.
Also assume we don't want to play with permission policy.
Is it possible for us to create a file that cannot be copied?
Thank you in advance.
"Trying to make digital files uncopyable is like trying to make water not wet." ~ Bruce Schneier
No. You can't create a file that a SYSADMIN can't copy. You could encrypt it, though.
Well, how about creating a file that uses up more than 50% of the total space on that machine and that is not compressible?
For instance, let us assume that you want to save a boolean (true or false) in such a fashion.
Depending on its value, you could then write a bit stream of ones or zeroes and encrypt said stream using some kind of encryption algorithm, such as AES in CBC mode. This gives you the added advantage of error correction: even in case of massive data corruption, you should be able to recover your boolean by checking whether ones or zeroes are prevalent in the decrypted stream.
In that case you cannot copy it around (completely) on the machine...
Of course, any type of external memory that can be added to the system would pose a problem in this scenario. But the file would be already encrypted, so don't worry about it too much...
Any file that can be read can have its contents written to another location (such as another file, i.e. copied).
The only thing you can do is limit who/what can read the file.
What is the motivation behind this? If it is a read-only file, you can have it as an embedded resource within your assembly.
Nice try, RIAA.
But seriously, no, you cannot. It is always possible to copy; you can just make it more difficult for people to make sense of the file, or try to hide it using something like encryption. Spotify does that.
If you really try hard, though, you could make a rootkit for Windows and use it to prevent Windows from even knowing about the file, and also prevent copies. The file would still be there and copyable by other tools, or by a Linux system accessing the NTFS volume.
If a running process opens a file and holds an exclusive lock, then other processes cannot read the file until you close the handle or your process terminates. However, an admin could still forcibly close the handle.
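For example, in Win32 C an exclusive open looks roughly like this (the path is a placeholder); any other attempt to open, and hence copy, the file fails with a sharing violation until the handle goes away:

/* Hold the file open with no sharing; other processes get
 * ERROR_SHARING_VIOLATION when they try to open it. */
#include <windows.h>

int main(void)
{
    HANDLE h = CreateFileW(L"C:\\secret\\data.bin",
                           GENERIC_READ | GENERIC_WRITE,
                           0,                      /* dwShareMode = 0: exclusive */
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    Sleep(INFINITE);                    /* keep the lock for the process lifetime */
    CloseHandle(h);                     /* never reached in this sketch */
    return 0;
}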
Short answer: No.
You can, of course, use security settings to limit who can read the file. But if someone can read it, then they can copy it. Even if you found some operating system trick to disable "ordinary" copying, if someone can read the file, they can extract the contents, store it in memory, and then write it somewhere else.
You can encrypt the contents so it's only useful to your own program, that knows how to decrypt it.
That's about it.
When using Windows 7 to copy some files from a hard drive, certain files popped up a message saying they could not be copied in their entirety; certain data would be omitted from the copy. I suspect that had something to do with slack space at the end of the files, though I thought the message was curious. I would have expected the copy operation to just ignore the slack space.
If you are running old (OLD) versions of Windows, there are certain characters you can put in the filename that make it invalid, not listed in folders, etc. They were used a lot in the old pub FTP days of file sharing ;)
In the old DOS days, you used to be able to flag disk sectors as bad and still read from them. This meant the OS ignored the sector in question but your application would know where to look and be able to get the data. Not sure this would work these days.
Another old MS-DOS trick was to put a space character in the middle of the filename (yes, spaces were valid characters for filenames). Since there was no method on the command line to escape a space, the file couldn't be copied using the DOS commands.
This answer is outside Windows so yeah
Don't know if it's already been said, but what about a file that is an inseparable part of the firmware, so that it is always on and running? Perhaps the firmware generates a sequence that is required by the rest of the system, and an incidental effect of its running is to prevent 80% or more of its code from being replicated. Let's say it's on an entirely different board, protected by surge protectors, heavy EM-proof shielding and anything else required to make it completely unerasable.
If it's possible to make a program that is always on and running as long as the copying software is running, then yes.
I have another way, and this IS with Windows: I will come to your house and give you a disk, and then proceed to destroy every single computer you put the disk into. This doesn't work on XP.
Well technically you could create and write to a write-only network share.
So, I am in this little predicament where I am stuck watching a few FTP folders to see if they have new files added to them. If they do, it needs to raise an event with the file name, thereby telling something else to download that file.
This is a pretty simple object to make, I was just curious if anyone knew how expensive this operation would be?
I plan on using the command NLIST because I don't need file size information, and there will be no sub-directories in the folder. Each file in the folder will have exactly 25 characters in its name.
There could be anywhere from 10 to 'maybe' a couple thousand (max around 2000) files per folder (usually on the lower end, 100-300, but currently growing).
The files are anywhere from 250 KB to a very, VERY unlikely 10 MB (usually within the 250 KB to 4 MB range).
There possibly could be up to a few hundred folders (in which case I could change the watch frequency depending on number of folders), but currently there are only a few (6-10ish).
There also would be multiple logins for the ftp server, different logins would have access to different folders.
I am not asking for an implementation, just whether anyone with first- or second-hand knowledge of FTP can tell me how this could affect my network.
I am not opposed to putting in file retention times or changing the frequency with which I check for new files.
Do you have any control over the remote servers? FTP isn't really optimized for this, and you could probably do a lot better with some sort of dedicated mini-server. You could use file system monitoring on the remote side and just send out the filenames when they arrive rather than continuously polling. You'd only need to have one connection open too, rather than the two that FTP requires.
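If the remote machine happens to be a Windows box you control (an assumption on my part), the push side could be a small watcher built on ReadDirectoryChangesW; shipping the names to the client is left out of this sketch, and the path is a placeholder:

/* Watch one folder for newly added files and print their names. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE dir = CreateFileW(L"D:\\ftproot\\folder1", FILE_LIST_DIRECTORY,
                             FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                             NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (dir == INVALID_HANDLE_VALUE)
        return 1;

    DWORD buf[16 * 1024];   /* DWORD-aligned buffer as the API requires */
    DWORD returned;
    while (ReadDirectoryChangesW(dir, buf, sizeof(buf), FALSE,
                                 FILE_NOTIFY_CHANGE_FILE_NAME, &returned, NULL, NULL)) {
        FILE_NOTIFY_INFORMATION *info = (FILE_NOTIFY_INFORMATION *)buf;
        for (;;) {
            if (info->Action == FILE_ACTION_ADDED) {
                /* FileName is not null-terminated; its length is in bytes. */
                wprintf(L"new file: %.*ls\n",
                        (int)(info->FileNameLength / sizeof(WCHAR)), info->FileName);
            }
            if (info->NextEntryOffset == 0)
                break;
            info = (FILE_NOTIFY_INFORMATION *)((BYTE *)info + info->NextEntryOffset);
        }
    }
    CloseHandle(dir);
    return 0;
}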