Handling possible errors with network drive file I/O - windows

I'm trying to make file I/O over a network drive (likely over a WAN or VPN) as reliable as possible for a native C++ Windows app...
What are the possible error conditions that I need to be able to handle?
How can I simulate these error conditions in testing?
How do I get detailed information on a particular error? For example, if fopen() fails, does errno tell me everything I need to know, or do I need to get at the GetLastError() value?
How do I reliably distinguish between "network drive access fully functional but the file doesn't exist" and various problems with the network or server?
One particular error condition that I've noticed on my desktop (not specific to the app we're developing) is that sometimes the first attempt to access a file on a network drive will fail, but it presumably causes the drive to be reconnected in the background, because subsequent connections work. I don't know what causes this. This is an example of the kind of error condition that I want to properly handle.
EDIT: This is for a legacy distributed application that uses files on network shares for communication between nodes. Some nodes may be unattended, so passing the error on to the end user may not be an option. The long term goal is to switch to a better protocol, but in the short term I'd like to make the file I/O as reliable as possible.

I believe you're approaching this from the wrong perspective. There's little one can do in the application itself to improve what is essentially a network filesystem driver problem, except perhaps implementing the networked I/O itself. That being said, you would be better off choosing a networked filesystem that suits your needs. Look at this on Wikipedia.
Generally, your application should behave as if the file were stored locally. Don't try too hard to handle network problems. If your choice of network filesystem is good, these problems can be mitigated automatically.
So I'd say you should settle for checking errno in case of errors. Perhaps fall back on local storage if writing a remote file fails (if the networked filesystem doesn't handle this itself).
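To make the "check errno" advice and the question's "first attempt fails, later ones work" observation concrete, here is a minimal sketch of a retrying open. The names open_with_retry, OpenStatus and OpenResult are hypothetical, and the policy is an assumption: ENOENT is reported immediately as "file not there", and every other errno is treated as possibly transient and retried with a doubling delay. On Windows the CRT may fold some network-path failures into ENOENT as well, so calling GetLastError() right after the failed call may be needed to tell those cases apart.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <cstdio>
#include <thread>

// Outcome of a sequence of open attempts.
enum class OpenStatus { Ok, NotFound, Failed };

struct OpenResult {
    OpenStatus status;
    std::FILE* file;   // non-null only when status == Ok
    int last_errno;    // errno of the last failed attempt, 0 on success
    int attempts;      // number of fopen() calls made
};

// Hypothetical helper, not a library API.
// Assumption: ENOENT means the file is absent and is not worth retrying;
// any other errno might be a transient network problem.
OpenResult open_with_retry(const char* path, const char* mode,
                           int max_attempts,
                           std::chrono::milliseconds delay) {
    assert(path && mode && max_attempts > 0);
    OpenResult r{OpenStatus::Failed, nullptr, 0, 0};
    for (int i = 0; i < max_attempts; ++i) {
        errno = 0;
        std::FILE* f = std::fopen(path, mode);
        ++r.attempts;
        if (f) {
            r.status = OpenStatus::Ok;
            r.file = f;
            r.last_errno = 0;
            return r;
        }
        r.last_errno = errno;
        if (r.last_errno == ENOENT) {
            r.status = OpenStatus::NotFound;
            return r;
        }
        if (i + 1 < max_attempts) {
            std::this_thread::sleep_for(delay);  // back off before retrying
            delay *= 2;
        }
    }
    return r;  // status stays Failed; last_errno says why
}
```

A caller that gets OpenStatus::Failed after all attempts could then fall back on local storage, as suggested above, and log last_errno for whoever maintains the unattended node.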

Related

Getting specific errors when TCP connections disconnect in Windows

I'm trying to improve the usefulness of the error reporting in a server I am working on. The server uses TCP sockets, and it runs on Windows.
The problem is that when a TCP link drops due to some sort of network failure, the error code that I can get from WSARecv() (or the other Windows socket APIs) is not very descriptive. For most network hiccups, I get either WSAECONNRESET (10054) or WSAETIMEDOUT (10060). But there are about a million things that can cause both of these: the local machine is having a problem, the remote machine or process is having a problem, some intermediate router has a problem, etc. This is a problem because the server operator doesn't have a definitive way to investigate the problem, because they don't necessarily even know where the problem is, or who might be responsible.
At the IP level, it's a different story. If the server operator happens to have a network sniffer attached when something bad happens, it's usually pretty easy to sort out what went wrong. For instance, if an intermediate router sent an ICMP unreachable, the router that sent it will put its IP address in there, and that's usually enough to track it down. Put another way, Windows killed the connection for a reason, probably because it got a specific packet that had a specific problem.
However, a large number of failures occur in the field, unexpectedly. It is not realistic to always have a network sniffer attached to a production server. There needs to be a way to track down problems that happen only rarely, intermittently, or randomly.
How can I solve this problem programmatically?
Is there a way to get Windows to cough up a more specific error message? Is there some easy way to capture and mine recent Windows events (perhaps the one Microsoft Network Monitor uses)? One way I've "solved it" before is to keep dumpcap (from Wireshark) running in ring buffer mode, and force it to stop capturing when a bad event happens, so that I have a capture I can mine later.
I'm also open to the possibility that this is not the right way to solve this problem. For instance, perhaps there is some special Windows mode that can be turned on to cause it to log useful information, that a network administrator could use to track this down after-the-fact.

How to guarantee file integrity without mandatory file lock on OS X?

AFAIK, OS X is a BSD derivative, which doesn't have actual mandatory file locking. If so, it seems that I have no way to prevent write access from other programs even while I am writing a file.
How can I guarantee file integrity in such an environment? I don't care about integrity after my program has exited, because that's the user's responsibility. But I think I need some kind of guarantee at least while my program is running.
How do other programs guarantee file content integrity without mandatory locking? Especially database programs. If there's a common technique or recommended practice, please let me know.
Update
I am looking for this for the data layer of a GUI application for non-engineer users. Currently, my program is in this situation.
The data is too big to fit in RAM, and is hard even to copy temporarily. So it cannot be read or written atomically, and has to be used directly from disk while the program is running.
It is a long-running professional GUI content editor used by people who are not engineers. Even so, they can still access the file simultaneously with Finder or other programs, so they can accidentally delete or overwrite a file that is currently in use. The problem is that users don't understand what is actually happening, and expect the program to protect file integrity at least while it is running.
I think the only way to guarantee the file's integrity in the current situation is:
Open file with system-wide exclusive mandatory lock. Now the file is program's responsibility.
Check for integrity.
Use the file like external memory while the program is running.
Write all the modifications.
Unlock. Now the file is user's responsibility.
Because OS X lacks a system-wide mandatory lock, I don't know what to do here. But I still believe there is a way to achieve this kind of file integrity that I just don't know about, and I want to know how everybody else handles this.
This question is not about my own programming errors; that's another problem. The current problem is protecting data from other programs that don't respect advisory file locks. Also, users are usually root and the program runs as the same user, so plain Unix file permissions are not useful.
You have to look at the problem that you are trying to actually solve with mandatory locking.
File content integrity is not guaranteed by mandatory locking. Unless you keep your file locked 24/7, file integrity will still depend on all processes observing file format/access conventions (and can still fail due to hard drive errors etc.).
What mandatory locking protects you against is programming errors that (by accident, not out of malice) fail to respect the proper locking protocols. At the same time, that protection is only partial, since failure to acquire a lock (mandatory or not) can still lead to file corruption. Mandatory locking can also reduce possible concurrency more than needed. In short, mandatory locking provides more protection than advisory locking against software defects, but the protection is not complete.
One solution to the problem of accidental corruption is to use a library that is aggressively tested for preserving data integrity. One such library (there are others) is SQLite (see also here and here for more information). On OS X, Core Data provides an abstraction layer over SQLite as a data store. Obviously, such an approach should be complemented by replication/backup so that you have protection against other causes of data corruption where the storage layer cannot help you (media failure, accidental deletion).
Additional protection can be gained by restricting file access to a database and allowing access only through a gateway (such as a socket or messaging library). Then you will just have a single process running that merely acquires a lock (and never releases it). This setup is fairly easy to test; the lock is merely to prevent having more than one instance of the gateway process running.
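The "acquire a lock and never release it" part of the gateway idea can be sketched as follows. This is only an illustration under stated assumptions: acquire_gateway_lock and the lock file path are hypothetical, and it uses the BSD flock() call, which is advisory and so only keeps out other instances of the gateway that also ask for the lock.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical helper: returns an open descriptor that holds an exclusive
// advisory lock on lock_path, or -1 if the file cannot be opened or the
// lock is already held through another descriptor.
int acquire_gateway_lock(const char* lock_path) {
    assert(lock_path != nullptr);
    int fd = open(lock_path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return -1;
    }
    // LOCK_NB makes the call fail at once instead of waiting for the holder.
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }
    // Keep fd open for the life of the gateway process; the lock goes away
    // when the descriptor is closed, including at process exit.
    return fd;
}
```

A gateway would call this once at startup and exit if it gets -1, since that most likely means another gateway instance is already serving the file.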
One simple solution would be to simply hide the file from the user until your program is done using it.
There are various ways to hide files. It depends on whether you're modifying an existing file that was previously visible to the user or creating a new file. Even if modifying an existing file, it might be best to create a hidden working copy and then atomically exchange its contents with the file that's visible to the user.
One approach to hiding a file is to create it in a location which is not normally visible to users. (That is, it's not necessary that the file be totally impossible for the user to reach, just out of the way so that they won't stumble on it.) You can obtain such a location using -[NSFileManager URLForDirectory:inDomain:appropriateForURL:create:error:] and passing NSItemReplacementDirectory and NSUserDomainMask for the first two parameters. See the -replaceItemAtURL:withItemAtURL:backupItemName:options:resultingItemURL:error: method for how to atomically move the file into its final place.
You can set a file to be hidden using various APIs. You can use -[NSURL setResourceValue:forKey:error:] with the key NSURLIsHiddenKey. You can use the chflags() system call to set UF_HIDDEN. The old Unix standby is to use a filename starting with a period ('.').
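As a rough, non-Cocoa illustration of the hidden working copy idea, the sketch below writes to a dot-prefixed file next to the visible one and then renames it into place. The helper name replace_via_hidden_copy and the ".working" suffix are made up for the example. It assumes POSIX rename() semantics, where renaming over an existing file replaces it atomically; note that this replaces the visible file rather than exchanging contents, and it does not call fsync(), so it says nothing about durability after a crash.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes data to dir/.<name>.working, then renames it
// over dir/<name>. Returns false (and removes the working copy) on failure.
// Assumption: POSIX rename(), which atomically replaces an existing target.
bool replace_via_hidden_copy(const std::string& dir,
                             const std::string& name,
                             const std::string& data) {
    assert(!name.empty());
    const std::string target = dir + "/" + name;
    const std::string hidden = dir + "/." + name + ".working";
    {
        std::ofstream out(hidden, std::ios::binary | std::ios::trunc);
        if (!out) {
            return false;
        }
        out << data;
        out.flush();
        if (!out) {
            out.close();
            std::remove(hidden.c_str());
            return false;
        }
    }  // the stream is closed here, before the rename
    if (std::rename(hidden.c_str(), target.c_str()) != 0) {
        std::remove(hidden.c_str());
        return false;
    }
    return true;
}
```

Readers of the visible file see either the old contents or the new contents, never a half-written file, as long as they open it by name after or before the rename.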
Here are some details about this topic:
https://developer.apple.com/library/ios/documentation/FileManagement/Conceptual/FileSystemProgrammingGuide/FileCoordinators/FileCoordinators.html
Now I think the basic policy on OS X is something like this.
Always allow access by any process.
Always be prepared for shared data file mutation.
Be notified when other processes mutate the file content, and respond appropriately. For example, you can display an error to end users if another process is trying to access the file. Users will then learn that it is a bad idea and will not do it again.
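The "be prepared for mutation" policy can be approximated even without the file coordination APIs from the linked guide. The sketch below is a polling stand-in, not the File Coordinator mechanism: it records size, modification time and inode with stat() and later checks whether any of them differ. FileSnapshot, take_snapshot, changed_since and warn_if_changed are hypothetical names, and a change that keeps all three values the same would go unnoticed.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/stat.h>

// What we remember about a file at one point in time.
struct FileSnapshot {
    bool exists;
    long long size;
    long long mtime;           // modification time in whole seconds
    unsigned long long inode;  // changes if the file is replaced by rename
};

// Hypothetical helper: captures the current state of path via stat().
FileSnapshot take_snapshot(const char* path) {
    assert(path != nullptr);
    struct stat st;
    if (stat(path, &st) != 0) {
        return FileSnapshot{false, 0, 0, 0};
    }
    return FileSnapshot{true,
                        static_cast<long long>(st.st_size),
                        static_cast<long long>(st.st_mtime),
                        static_cast<unsigned long long>(st.st_ino)};
}

// True if the file was removed, created, replaced, resized or re-timestamped.
bool changed_since(const FileSnapshot& before, const char* path) {
    const FileSnapshot now = take_snapshot(path);
    return now.exists != before.exists || now.size != before.size ||
           now.mtime != before.mtime || now.inode != before.inode;
}

// Prints a warning if the file no longer matches the snapshot.
bool warn_if_changed(const FileSnapshot& before, const char* path) {
    if (!changed_since(before, path)) {
        return false;
    }
    std::fprintf(stderr, "warning: %s changed since the snapshot was taken\n", path);
    return true;
}
```

An editor would take a fresh snapshot after each of its own writes and call warn_if_changed before the next one, turning the warning into whatever message makes sense to its users.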

NFS Client library

I'm looking for some stand alone library to access NFS shares.
I am not looking for mounting the shares, just browsing and accessing the files for reading.
Preferably something with a simple API similar to the regular POSIX operations such as opendir, scandir, read, etc.
Thanks in advance!
Here's a link to this NFS client library. It looks promising; to quote:
The NFS client handles only one connection at a time, but no connection takes
very long.
Read requests must be for under 8000 bytes. This has to do with packet size.
You don't want to know.
Once 256 files are open simultaneously -- by all applications, since the client
does not discriminate between requests in any way -- file handles begin to be
overwritten. The client prints an error.
If the client has problems opening sockets it quits gracefully, including
returning a message over the socket to the application. The exception is if
it is given a bad hostname to mount, in which case it just responds with failure
rather than quitting.
If the formatting of the code looks messed up, it's because the code was written
half on a Mac (tab = 4 spaces).
Here is another link that might explain the limitation of the 256 files opened simultaneously here on sourceforge.net, see B3 of the FAQ there on sourceforge...
Edit: Here's a question that was posted here on Stackoverflow in respect to recursively reading a directory that could be easily modified to scandir...
There is now a libnfs library on github: https://github.com/sahlberg/libnfs
I see it has Debian and FreeBSD packages.

Testing file transfer speed across LAN/WAN

Is there a utility for Windows that allows you to test different aspects of file transfer operations across a LAN or a WAN?
Example...
How long does it take to move a file of a known size (500 MB or 1 GB) from Server A (on site) to Server B (on site) or to Server C (off site-Satellite location)?
D-ITG will allow you to test many aspects of your links. It does not necessarily allow you to transfer a file directly, but it allows you to control almost all aspects of the transmission of data across the wire.
If all you are interested in is bulk transfer time (and not all the nitty-gritty details) you could just use a basic FTP application and time the transfer.
Probably nothing you've not already figured out. You could get some coarse-grained metrics using a batch file to coordinate:
start monitoring
copy file
stop monitoring
Copy file might just be initiating a file copy between two nodes on the LAN, or it might initiate an FTP copy between two nodes on the WAN.
Monitoring could be as basic as writing the current time to output or file, or it could be as complex as adding performance counter metrics from the network adapter on the two machines.
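The start monitoring / copy file / stop monitoring sequence can also be done in a few lines of code instead of a batch file. The following is a minimal sketch where "monitoring" is just a wall-clock timer around a plain stream copy; timed_copy, CopyTiming and megabytes_per_second are hypothetical names. It assumes the destination is simply a path on a mounted share, and the number it reports includes whatever caching the OS does, so the data may not have reached the remote server when the timer stops.

```cpp
#include <cassert>
#include <chrono>
#include <fstream>
#include <string>

struct CopyTiming {
    bool ok;                   // true if the whole source was copied
    unsigned long long bytes;  // bytes written to the destination
    double seconds;            // elapsed wall-clock time of the copy loop
};

// Hypothetical helper: copies src to dst and times the copy loop.
CopyTiming timed_copy(const std::string& src, const std::string& dst) {
    assert(!src.empty() && !dst.empty());
    CopyTiming t{false, 0, 0.0};
    std::ifstream in(src, std::ios::binary);
    if (!in) {
        return t;
    }
    std::ofstream out(dst, std::ios::binary | std::ios::trunc);
    if (!out) {
        return t;
    }
    char buf[64 * 1024];
    const auto start = std::chrono::steady_clock::now();  // start monitoring
    // The last read usually comes up short; gcount() still reports its size.
    while (in.read(buf, sizeof buf) || in.gcount() > 0) {
        out.write(buf, in.gcount());
        t.bytes += static_cast<unsigned long long>(in.gcount());
    }
    out.flush();
    const auto stop = std::chrono::steady_clock::now();   // stop monitoring
    t.seconds = std::chrono::duration<double>(stop - start).count();
    t.ok = in.eof() && static_cast<bool>(out);
    return t;
}

// Throughput in megabytes (10^6 bytes) per second; 0 if no time elapsed.
double megabytes_per_second(const CopyTiming& t) {
    return t.seconds > 0.0 ? (t.bytes / 1.0e6) / t.seconds : 0.0;
}
```

Running it against the 500 MB or 1 GB test file from the question, once per destination server, gives numbers that are comparable with each other even if they are not precise in absolute terms.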
A commercial WAN emulator would also give you the information you're looking for. I've used the Shunra Appliance successfully in the past. It's pretty expensive, so I'd really only recommend it if critical business success is riding on understanding how application behavior could change based on network conditions and it is something you could incorporate into regular testing activities.

Best secure single running app guard on windows

I would like to improve the way an application checks that another instance is not already running. Right now we are using named mutexes combined with a check of the running processes.
The goal is to prevent security attacks (as this is security software). My idea right now is that the only "bulletproof" solution is to write a driver that will serve this kind of information and will authenticate clients via signed binaries.
Has anyone solved such a problem?
What are your opinions and recommendations?
First, let me say that there is ultimately no way to protect your process from agents that have administrator or system access. Even if you write a rootkit driver that intercepts all system calls (a difficult and unsafe practice in and of itself), there are still ways to use admin access to get in. You have the wrong design if this is a requirement.
If you set up your secure process to run as a service, you can use the Service Control Manager to start it. The SCM will only start one instance, will monitor that it stays up, allow you to define actions to execute if it crashes, and allow you to query the current status. Since this is controlled by the SCM and the service database can only be modified by administrators, an attacking process would not be able to spoof it.
I don't think there's a secure way of doing this. No matter what kind of system-unique, or user-unique named object you use - malicious 3rd party software can still use the exact same name and that would prevent your application from starting at all.
If you check the currently executing processes to make sure no executable with the same name is running, you'd run into problems if the malicious software has the same executable name. If you also check the path of that executable, then it would be possible to run two copies of your app from different locations.
If you create/delete a file when starting/finishing - that might be tricked as well.
The only thing that comes to mind is that you may be able to achieve the desired effect by putting all the logic of your app into a COM object, and then having a GUI application interact with it through COM interfaces. This would only ensure that there is only one COM object; you would be able to run as many GUI clients as you want. Note that I'm not suggesting this as a bulletproof method. It may have its own holes (for example, someone could make your GUI client connect to a 3rd party COM object by simply editing the registry).
So, the short answer - there is no truly secure way of doing this.
I use a named pipe¹, where the name is derived from the conditions that must be unique:
Name of the application (this is not the file name of the executable)
Username of the user who launched the application
If the named pipe creation fails because a pipe with that name already exists, then I know an instance is already running. I use a second lock around this check for thread (process) safety. The named pipe is automatically closed when the application terminates (even if the termination was due to an End Process command).
¹ This may not be the best general option, but in my case I end up sending data on it at a later point in the application lifetime.
In pseudo code:
numberofapps = 0
for each process in processes
    if path to module file equals path to this module file
        increment numberofapps
if numberofapps > 1
    exit
See msdn.microsoft.com/en-us/library/ms682623(VS.85).aspx for details on how to enumerate processes.

Resources