MPI file view and IO - view

I am writing an MPI program that needs to read a part of a file into memory, one piece at a time, with each piece going to an available process; I am therefore using a shared file pointer. The first part of the file is a header which I want to read and distribute to all processes. I have managed to do this by reading it on the master process and broadcasting it to all other processes.
The next part of the file is a long (in theory up to several gigabytes) array of float triples. I want to set the fileview for all the processes so that it starts at the beginning of this array, and each process should be able to see the whole array. Furthermore, and this is my real problem, I do not want the processes to see beyond this array, so that after they encounter the last set of 3 floats they report EOF. So in practice each process just sees one long 3-float array and nothing else.
After the header has been read, this is my code:
MPI_Datatype particle_type;
MPI_Type_contiguous(3,MPI_FLOAT,&particle_type);
MPI_Type_commit(&particle_type);
MPI_Offset cur_file_pos;
MPI_File_get_position_shared(fh,&cur_file_pos);
MPI_File_set_view(fh, cur_file_pos, particle_type, particle_type, (char *) "native", MPI_INFO_NULL); /* fh is the file-handle from MPI_File_open */
As I understand it, this simply skips the header, but the file view does not stop after the array; it continues into the next part of the file, which I am not interested in. Can anyone help me with this simple problem? I have not been able to find any thorough explanations (with examples) of file views anywhere.

Unfortunately, MPI_File_set_view won't do this for you; once a read goes past one instance of the filetype, the filetype simply repeats (tiles) out to the end of the file. So while MPI_File_set_view will allow you to partition the view of the file between processes, it won't let you "truncate" the view of the file like this.
If you're using the shared file pointer, presumably the simplest thing to do is to loop until the new position == number of particles (once the view is set, the file pointer is in units of etypes).
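Something along these lines should work (an untested sketch: n_particles is assumed to have been read from the header, and fh, particle_type and the view are set up as in the question):

/* n_particles: number of particles, taken from the header (assumed available) */
float particle[3];
MPI_Offset pos;
MPI_Status status;

for (;;) {
    /* Shared file pointer position, in units of the etype, i.e. particles. */
    MPI_File_get_position_shared(fh, &pos);
    if (pos >= n_particles)
        break;                      /* logical "EOF": end of the particle array */
    MPI_File_read_shared(fh, particle, 1, particle_type, &status);
    /* ... process one particle ... */
}

Note that the position check and the read are not atomic across processes, so right at the end of the array a process can still race one read past n_particles; if that matters, the last few particles can be dealt out explicitly (e.g. by rank) instead.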

Related

How to protect a file with Win32 API from being corrupted if the power is reset?

In a C++ Win32 app I write a large file by appending blocks of about 64 KB using code like this:
auto h = ::CreateFile(
    "uncommited.dat",
    FILE_APPEND_DATA,       // open for appending
    FILE_SHARE_READ,        // share for reading
    NULL,                   // default security
    CREATE_NEW,             // create new file only
    FILE_ATTRIBUTE_NORMAL,  // normal file
    NULL);                  // no attr. template
char block[64 * 1024] = {};
DWORD written = 0;
for (int i = 0; i < 10000; ++i) {
    ::WriteFile(h, block, sizeof(block), &written, NULL);
}
As far as I can see, if the process is terminated unexpectedly, some blocks with numbers i >= N are lost, but blocks with numbers i < N are valid and I can read them when the app restarts, because the blocks themselves are not corrupted.
But what happens if the power is reset? Is it true that the entire file can be corrupted, or even end up with zero length?
Is it a good idea to do
FlushFileBuffers(h);
MoveFile("uncommited.dat", "commited.dat");
assuming that MoveFile is some kind of atomic operation, and, when the app restarts, open "commited.dat" as valid and delete "uncommited.dat" as corrupted? Or is there a better way?
MoveFile can work all right in the right situation. It has a few problems though; for example, the move fails if a file by the new name already exists.
If that might occur (you're basically updating an existing file that you want to ensure won't get corrupted, by making a copy, modifying the copy, then replacing the old with the new), then rather than MoveFile you probably want to use ReplaceFile.
With ReplaceFile, you write your data to the uncommitted.dat (or whatever name you prefer). Then yes, you probably want to do FlushFileBuffers, and finally ReplaceFile to replace the old file with the new one. This makes use of the NTFS journaling (which applies to file system metadata, not the contents of your files), assuring that only one of two possibilities can happen: either you have the old file (entirely intact) or else the new one (also entirely intact). If power dies in the middle of making a change, NTFS will use its journal to roll back the transaction.
NTFS does also support transactions, but Microsoft generally recommends against applications trying to use this directly. It apparently hasn't been used much since they added it (in Windows Vista), and MSDN hints that it's likely to be removed in some future version of Windows.
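For concreteness, the FlushFileBuffers-then-ReplaceFile step described above might look roughly like this (an untested sketch; CommitFile is a made-up helper name, and the file names are the ones from the question):

#include <windows.h>

/* CommitFile: illustrative helper, not a Win32 API.  h is the handle the
 * blocks were appended through. */
static BOOL CommitFile(HANDLE h)
{
    if (!FlushFileBuffers(h))            // push the appended data to disk
        return FALSE;
    CloseHandle(h);

    if (ReplaceFileA("commited.dat",     // file being replaced
                     "uncommited.dat",   // replacement holding the new data
                     NULL, 0, NULL, NULL))
        return TRUE;

    // ReplaceFile needs the target to exist already, so fall back to a
    // plain move the very first time around.
    if (GetLastError() == ERROR_FILE_NOT_FOUND)
        return MoveFileA("uncommited.dat", "commited.dat");

    return FALSE;
}

On a restart, the application then opens commited.dat, which is either the old version or the new one, but never a half-written mix of the two.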
For an append-only scenario you can split the data into blocks (of constant or variable size). Each block should be accompanied by some form of checksum (SHA, MD5, CRC).
After a crash you can read the blocks sequentially and verify each one's checksum. The first damaged block and everything after it should be treated as lost (though you may be able to inspect those blocks and recover data from them manually).
To append more data, truncate the file to the end of the last correct block.
You can also write two copies in parallel and, after a crash, pick the one with more good blocks.
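To make the block idea concrete, here is a rough sketch assuming a simple [length][CRC-32][payload] layout and zlib's crc32(); all names are illustrative:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <zlib.h>                     /* crc32() */

/* Illustrative on-disk layout: header followed by the payload bytes. */
struct block_header {
    uint32_t length;                  /* payload size in bytes */
    uint32_t crc;                     /* CRC-32 of the payload */
};

/* Append one block: header first, then the payload, then flush. */
static int append_block(FILE *f, const void *data, uint32_t len)
{
    struct block_header h = { len, (uint32_t)crc32(0L, (const Bytef *)data, len) };
    return fwrite(&h, sizeof h, 1, f) == 1 &&
           fwrite(data, len, 1, f) == 1 &&
           fflush(f) == 0;
}

/* Scan after a crash: return the offset just past the last good block,
 * so the caller can truncate the file there and resume appending. */
static long last_valid_offset(FILE *f)
{
    long good = 0;
    struct block_header h;

    while (fread(&h, sizeof h, 1, f) == 1) {
        unsigned char *buf = malloc(h.length);
        if (!buf || fread(buf, 1, h.length, f) != h.length ||
            crc32(0L, buf, h.length) != h.crc) {
            free(buf);
            break;                    /* first damaged block: everything after is lost */
        }
        free(buf);
        good = ftell(f);              /* end of a verified block */
    }
    return good;
}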

Worst-case scenario: launched two copies of a program which appends lines to a file

I have a Python program which performs a simple operation on a file:
with open(self.cache_filename_url, "a", encoding="utf8") as f:
    w = csv.writer(f, delimiter=',', quotechar='"', lineterminator='\n')
    w.writerow([cache_url, rpd_products])
As you can see it just opens the file and appends a CSV line to it. It does this a lot, in a loop.
I accidentally ran two copies of this program simultaneously, so I think they would have been appending to the file simultaneously. I am trying to determine the worst-case-scenario for file corruption.
Do you think the writes would at least be atomic operations in this case? For example this wouldn't be a problem for me:
old line
old line
new line written by instance 1
new line written by instance 2
new line written by one
This would be a problem for me:
old line
old line
[half of new line written by instance 1] [half of new line by instance 2]
etc
To put it another way, is it possible for the two append operations to "interfere" with each other?
EDIT: I am using Windows 7
Opening the same file multiple times in shared write mode can definitely be problematic. And if they don't open in shared mode, one of them will throw an exception because it cannot open the file.
If SHARED mode:
Both instances will have their own internal pointer. In most cases, they will probably write independently. You could get:
Process A opens file, sets pointer to end (byte 1024)
Process B opens file, sets pointer to end (byte 1024)
Process B writes at byte 1024 and closes file
Process A writes at byte 1024 and closes file.
Both processes will have written to the file at the same location, so you've basically lost the record from Process B: Process A overwrote it. Depending on how the close works (whether it truncates) and on the two lines having different lengths, you could still end up with part of Process B's record if its line was the longer one.
If it is in EXCLUSIVE mode, one process will fail to open the file, and whatever exception handling you have will kick in.
Which mode you are in can be system dependent, as Python doesn't seem to provide any mechanisms for controlling the share mode.
Update: I ran a check on my file, and I did indeed have corrupted partial lines (the case under "This would be a problem for me" in my question)
It's unfortunate, especially since it implies you could have problems even when you intend to share a file between two processes.
I am still interested in any pointers on how to avoid this outcome. I will hold off on marking an answer as accepted for now. (The other answer is good, but doesn't provide enough details on these modes or how to determine which will be used.)

Closing all pipes of a process

I am working on making a program that will act in a similar way as a shell, but supports only foreground processes and pipes. I have multiple processes writing to the same pipe and some other properties that differ from the normal usage of pipes. Anyhow, my question is,
Is there any easy (automatic) way to close all file descriptors of a process except the three basic ones?
I am asking this question since I have a lot of difficulty keeping track of all the file descriptors for every process, and sometimes they act in ways that are unpredictable to me. That could also be because I don't have a very thorough understanding of them.
Is there any easy (automatic) way to close all file descriptors of a process except the three basic ones?
The normal way to do this is to simply iterate over all of them and close them:
for (i = getdtablesize(); i > 3;) close(--i);
That's already a one-liner. It doesn't get any more "automatic" than that.
I am asking this question since I have a lot of difficulty keeping track of all file descriptors for every process.
It will be worth your time to think about the life cycle of each file descriptor you open, when it gets duplicated (e.g. dup2() and fork()), how it gets used, and make sure you account for how each one is going to get closed when it is no longer needed. Papering over a problem of leaked file descriptors by indiscriminately closing them all is not going to be sustainable.
I have multiple processes writing to the same pipe
If you do this, then you need to be aware that the order in which data arrive at the other end of the pipe is going to be unpredictable. It will be difficult to avoid corrupting the data stream.
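If multiple writers really are required, one common mitigation is to have every writer emit fixed-size records no larger than PIPE_BUF, since POSIX guarantees that a single write() of at most PIPE_BUF bytes to a pipe is not interleaved with writes from other processes. A rough sketch under that assumption (RECORD_SIZE is an arbitrary choice):

#include <unistd.h>
#include <limits.h>                   /* PIPE_BUF */
#include <string.h>

#define RECORD_SIZE 128               /* illustrative; must stay <= PIPE_BUF */

/* Pad each message to a fixed size so one write() carries one whole record. */
static int write_record(int fd, const char *msg)
{
    char record[RECORD_SIZE] = {0};

    strncpy(record, msg, RECORD_SIZE - 1);
    return write(fd, record, sizeof record) == (ssize_t)sizeof record ? 0 : -1;
}

The reader then pulls the stream apart in RECORD_SIZE chunks; records from different writers can still arrive in any order, but each one comes out whole.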
Use the closefrom(3) C library function.
From the manpage:
The closefrom() system call deletes all open file descriptors greater
than or equal to lowfd from the per-process object reference table.
Any errors encountered while closing file descriptors are ignored.
Example usage:
#include <stdio.h>
#include <unistd.h>

int main() {
    // Close everything except stdin, stdout and stderr
    closefrom(3); // where 3 is the lowest file descriptor you wish to close
    printf("Clear of all but the three basic file descriptors!\n");
    return 0;
}
This works in most unices, but requires the libbsd support library for Linux.

Implementation of Fifo in GNU-GUILE

I would like to do the following:
I want to implement the concept of a FIFO on top of normal files using Guile.
Two processes should communicate via a normal text file that a third process, if needed, can access.
The subordinate of the original two processes should write to the file line after line, that is, append. So far so good (implemented in C++).
The master process, however, should treat this file as a FIFO: it should read the first line, do something corresponding to it, and delete the first line while leaving the rest intact.
The problems are:
While the master is accessing the file, the subordinate may reach a point where it must write to it, leading to a conflict.
Popping the first line may require reading the whole file into a string, popping the first line from it, and then saving it back, which is memory-intensive, and the second saving action may again conflict with the child trying to write there.
I wanted to implement this in Guile because, since it is the official extension language of the OS, there might be better ways that address the above two issues.
But on the web I have not found much to orient myself. Please help, and sorry for the less than concrete question, as I don't have a code snippet to show.

Appending data to a file from the Linux Kernel

I'm trying to gather measurements of cycle counts for a particular syscall (sys_clone) in the Linux kernel. That said, my process won't be the only one calling it, and I can't know my pid ahead of time, so I'll have to record every invocation of it for every pid.
The problem that I've got is that the only ways I can figure out how to output this data (debugfs, sysfs, procfs) involve statically sized buffers, which will be quickly overwritten with irrelevant data from other processes calling sys_clone.
So, does anyone know how to append an arbitrary number of lines to a user space accessible file in linux?
You can take the printk()/klogd approach, and use a circular buffer that is exported via /proc. A user-space process blocks on reading your /proc file, and whatever it reads is removed from the buffer. In fact, you could take a look at whether klogd/syslogd can be modified to also read your /proc file, so that you wouldn't need to implement the user-space part.
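A rough sketch of that approach, assuming a kernel recent enough to have struct proc_ops (5.6+) and using a kfifo as the circular buffer (untested, and all names here are made up):

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kfifo.h>
#include <linux/proc_fs.h>
#include <linux/wait.h>
#include <linux/spinlock.h>
#include <linux/sched.h>

static DEFINE_KFIFO(clone_log, char, 1 << 14);      /* 16 KiB ring buffer */
static DECLARE_WAIT_QUEUE_HEAD(clone_log_wait);
static DEFINE_SPINLOCK(clone_log_lock);

/* Called from the instrumented sys_clone path with the measured cycle count. */
void clone_log_record(pid_t pid, u64 cycles)
{
    char line[64];
    unsigned long flags;
    int n = scnprintf(line, sizeof(line), "%d %llu\n", pid,
                      (unsigned long long)cycles);

    spin_lock_irqsave(&clone_log_lock, flags);
    kfifo_in(&clone_log, line, n);                   /* drops data if full */
    spin_unlock_irqrestore(&clone_log_lock, flags);
    wake_up_interruptible(&clone_log_wait);
}

static ssize_t clone_log_read(struct file *file, char __user *buf,
                              size_t count, loff_t *ppos)
{
    unsigned int copied;
    int ret;

    /* Block the (single) user-space reader until there is data to hand out. */
    ret = wait_event_interruptible(clone_log_wait, !kfifo_is_empty(&clone_log));
    if (ret)
        return ret;

    ret = kfifo_to_user(&clone_log, buf, count, &copied);
    return ret ? ret : copied;
}

static const struct proc_ops clone_log_proc_ops = {
    .proc_read = clone_log_read,
};

static int __init clone_log_init(void)
{
    proc_create("clone_cycles", 0444, NULL, &clone_log_proc_ops);
    return 0;
}
module_init(clone_log_init);
MODULE_LICENSE("GPL");

A user-space collector can then simply do something like cat /proc/clone_cycles >> clone_cycles.log and append at its leisure.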
If you are good with something simpler, just printk() your information in a normalized form with some prefix, and then just filter it out from your syslog using this prefix.
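For that simpler variant the instrumentation itself can be a one-liner (CLONE_CYCLES is an arbitrary prefix, and cycles stands for however the measurement was taken):

/* somewhere in the clone path, after the cycle count has been measured */
printk(KERN_INFO "CLONE_CYCLES pid=%d cycles=%llu\n",
       current->pid, (unsigned long long)cycles);

On the user-space side, something like grep CLONE_CYCLES /var/log/kern.log (or dmesg | grep CLONE_CYCLES) then pulls the records back out.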
There are a few more possibilities (e.g. using netlink to send messages to userspace), but writing to a file from the kernel is not something I'd recommend.
You could stash the counts in the right task_struct, and make them visible through a per-process file in /proc/<pid>/.

Resources