Where is send() implemented in OpenMPI? - parallel-processing

In OpenMPI, if I follow the call stack of any collective operation (e.g. MPI_Reduce) deep enough, I find that it calls a function called send().
After a lot of grepping, I'm not sure where send() is implemented. I suspect that send() may be buried inside of a macro or obscure shim layer of some sort.
Where are the implementation(s) of send() located in the OpenMPI codebase?
I'm looking at OpenMPI v1.8.1, though I suspect that the organization of the source tree hasn't changed that much between versions.

send(2) is the BSD socket system call for sending data over network sockets. It is ultimately used by the tcp BTL of Open MPI to perform the actual network transfer from one process to another and its implementation is to be found in the source code of the standard C library and in the OS kernel.
If you are interested in the actual higher-level mechanism that Open MPI uses to transmit messages from one rank to another over TCP/IP networks, then the tcp BTL itself is to be found in $OMPI_SOURCE/ompi/mca/btl/tcp/ (for older Open MPI versions) or in $OMPI_SOURCE/opal/mca/btl/tcp/ (for newer versions).
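If it helps to see the call in isolation, here is a minimal sketch (not Open MPI code; the address and port are placeholders) of the plain send(2) call that the tcp BTL ultimately relies on to push bytes onto the wire:

    // Minimal sketch, not Open MPI code: a plain BSD-socket send(2).
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        sockaddr_in peer{};
        peer.sin_family = AF_INET;
        peer.sin_port = htons(5000);                      // placeholder port
        inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);  // placeholder address

        if (connect(fd, reinterpret_cast<sockaddr*>(&peer), sizeof(peer)) == 0) {
            const char msg[] = "payload";
            ssize_t n = send(fd, msg, sizeof(msg), 0);    // the send(2) in question
            std::printf("sent %zd bytes\n", n);
        }
        close(fd);
        return 0;
    }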

Related

MPI client server connection with Singleton MPI_INIT

I want to implement (in C++) a feature, using MPI, in an existing (non-MPI) application. I am thinking of using mpich-3.4.1 for this.
I am planning to create a .so file for that feature, which the original application can link to. I initially thought to have a function in the .so file that starts with MPI_Init() and ends with MPI_Finalize() and, in between, calls all the required MPI APIs to do the parallel job. As part of the MPI job, the new feature makes the current application an MPI server by calling APIs like MPI_Open_port and MPI_Comm_accept. Other worker processes (possibly running on different machines) connect to this server, send/receive messages, and complete a heavy computation in parallel. The application then resumes its other non-MPI work.
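To make the plan concrete, here is a rough sketch of the server-side entry point I initially imagined exposing from the .so (run_parallel_feature is just an illustrative name; error handling and the mechanism for publishing the port name to workers are omitted):

    #include <mpi.h>
    #include <cstdio>

    // Illustrative entry point the .so might expose; the name is a placeholder.
    void run_parallel_feature() {
        MPI_Init(NULL, NULL);                      // singleton init: process not started by mpiexec

        char port_name[MPI_MAX_PORT_NAME];
        MPI_Open_port(MPI_INFO_NULL, port_name);   // server side of the connect/accept model
        std::printf("workers should connect to: %s\n", port_name);

        MPI_Comm worker_comm;
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &worker_comm);

        // ... send/receive messages and run the heavy computation over worker_comm ...

        MPI_Comm_disconnect(&worker_comm);
        MPI_Close_port(port_name);
        MPI_Finalize();
    }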
It seems to me that the singleton MPI_INIT mechanism will be useful for this. I found the following page on singleton init:
https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node254.htm
This page says, "A high-quality implementation will allow any process (including those not started with a 'parallel application' mechanism) to become an MPI process by calling MPI_INIT. Such a process can then connect to other MPI processes...".
However, the comments in mpich-3.4.1/src/mpi/init/init.c say, "The MPI standard does not say what a program can do before an 'MPI_INIT' or after an 'MPI_FINALIZE'. In the MPICH implementation, you should do as little as possible. In particular, avoid anything that changes the external state of the program, such as opening files, reading standard input or writing to standard output."
Based on the above comments, it seems we should not have MPI_Init(NULL, NULL) and MPI_Finalize() as part of any implementation in a library. In that case, I am thinking of having the init and finalize calls in the original application's main function, and having the rest of the API calls made from the .so file. My original application is a large, working piece of software, and in some situations it may not need to execute my MPI feature at all.
My questions are:
(1) Does it make sense to have MPI_Init(NULL, NULL) and MPI_Finalize() called in the main function of this application, and the rest of the MPI functionality in a .so file?
(2) Once MPI_Init(NULL, NULL) is called in the main, would it interfere with the normal execution of the software in any way? Would there be any performance impact on the existing application?
(3) Is there an MPI implementation that handles this better?
(4) Is MPI a good approach to handle this requirement, or are other mechanisms like ZeroMQ better? In the comments in the following link, Wesley Bland says that "MPI may not be right for you if you're looking for a client/server model. Yes, it's possible, but it's not really optimized for that use case and you might have better luck using a different communication mechanism". Is that still true in 2022?
client relationship within MPI server

How is FastCGI implemented under Windows?

The official FastCGI documentation says that stdin is repurposed as a listening socket when a FastCGI module is started. That's great on Linux, where stdin and sockets are both ints, but I don't think it could work on Windows, where stdin is a FILE* and a socket is a HANDLE.
Since Windows servers do support FastCGI, someone has either found a way to make them compatible, or redefined the system for that OS. My Google-fu doesn't seem to be up to locating how though. Where can I find documentation on it?
FastCGI defines only the message exchange protocol, but the people behind FastCGI also provide an implementation of that protocol for C++. In this implementation your app must use the provided FCGX_Request object to rewire the three provided FCGX_Stream objects to the usual ones (cin, cout, cerr). But I suspect that you don't have to rewire the streams and can use them directly. Check out this FastCGI Hello World to see how it's done.
So, your app does not see HANDLE or FILE*. It sees instead fcgi_streambuf, which inherits from std::streambuf. The way the previously mentioned protocol is implemented is just a detail that you're not supposed to be concerned with. The implementation gets hold of a stream of bytes and provides it to the app, and also the other way around.
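Roughly, a hello world with the fcgio C++ wrapper looks like this (a sketch only; the request loop and the wrapping of the FCGX_Streams are the parts to pay attention to):

    #include <fcgio.h>     // fcgi_streambuf and the FCGX_* API from libfcgi
    #include <iostream>

    int main() {
        FCGX_Request request;
        FCGX_Init();
        FCGX_InitRequest(&request, 0, 0);          // 0 = use the listening socket we were started with

        while (FCGX_Accept_r(&request) == 0) {     // one iteration per incoming request
            fcgi_streambuf in_buf(request.in);     // wrap the FCGX_Streams as std::streambufs
            fcgi_streambuf out_buf(request.out);
            std::istream fin(&in_buf);
            std::ostream fout(&out_buf);

            fout << "Content-Type: text/plain\r\n\r\n"
                 << "Hello from FastCGI\n";
        }
        return 0;
    }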

ZeroMQ and actor model

I'm having problems scaling up an application that uses the actor model and ZeroMQ. To put it simply: I'm trying to create thousands of threads that communicate via sockets, similar to what one would do with Erlang-style message passing. I'm not doing it for multicore/performance reasons, but because framing it this way gives me very clean code.
From a philosophical point of view it sounds as if this is what zmq developers would like to achieve, e.g.
http://zeromq.org/whitepapers:multithreading-magic
However, it seems as if there are some practical limitations. At 1024 inproc sockets I start getting the "ZMQError: Too many open files" error. TCP gives me the typical "Assertion failed: fds.size () <= FD_SETSIZE" crash.
Why do inproc sockets have this limit?
To get it to work I've had to group together items to share a socket. Is there a better way?
Is zmq just the wrong tool for this kind of job? I.e., is it still more of a network library than an actor-style message-passing library?
ZMQ uses file descriptors as the "resource unit" for inproc connections. There is a limit on file descriptors set by the OS; you should be able to modify that (a quick Google search turns up several potential avenues for Windows), though I don't know what the performance impact might be.
It looks like this is related to the ZMQ library using C code that is portable among systems for opening new files, rather than Windows native code that doesn't suffer from this same limitation.
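As a rough sketch of the two knobs involved, assuming a POSIX system and libzmq 3.2 or newer: the OS file-descriptor limit can be raised with setrlimit(), and ZeroMQ's own per-context socket cap with ZMQ_MAX_SOCKETS (which defaults to 1023, matching the ~1024-socket ceiling you're hitting):

    #include <zmq.h>
    #include <sys/resource.h>
    #include <cstdio>

    int main() {
        // Raise the per-process file descriptor limit (POSIX; raising the hard limit needs privileges).
        rlimit lim{};
        getrlimit(RLIMIT_NOFILE, &lim);
        lim.rlim_cur = lim.rlim_max;               // bump the soft limit up to the hard limit
        setrlimit(RLIMIT_NOFILE, &lim);

        // Raise ZeroMQ's own per-context socket cap before creating any sockets.
        void *ctx = zmq_ctx_new();
        zmq_ctx_set(ctx, ZMQ_MAX_SOCKETS, 8192);
        std::printf("max sockets now %d\n", zmq_ctx_get(ctx, ZMQ_MAX_SOCKETS));

        zmq_ctx_term(ctx);
        return 0;
    }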

How do I set a custom route for a process?

In my computer, there are two network adapters, connected to different subnets, as below:
adapter A: 10.20.30.201
adapter B: 10.20.31.201
I want to make all outgoing data of a particular process (for example, Process A) go through adapter A. That is, I want to make adapter A that process's default route.
I know I can modify the route table for specific destinations, but what I want to do here is quite different: Process A may communicate with many different IPs that I don't know in advance.
Winsock2 provides LSPs (Layered Service Providers) as a way to insert a DLL into the TCP/IP stack. I'm not familiar with LSPs and don't know whether an LSP can do what I want.
Can anybody give me some suggestions? Thanks.
A quick background on LSP:
An application which uses the Winsock2 API calls a combination of WSA-prefixed functions, e.g. WSAConnect, WSASocket, WSASend, WSARecv, etc.
If an application still uses the old Winsock functions, these are mapped to Winsock2 behind the scenes anyway. For instance, send() is mapped to WSASend(), recv() to WSARecv(), etc.
The WSA-prefixed functions internally call their corresponding WSP-prefixed functions provided by the LSP. For instance, WSASend() calls WSPSend(), WSASocket() calls WSPSocket(), etc. In short, WSAWhateverFunction() calls WSPWhateverFunction(), and their parameters and return values are also the same (not quite, but close).
An LSP is a DLL with these WSP-prefixed functions implemented, e.g. to modify outbound/inbound traffic, do filtering, etc. However, an LSP is still a userspace DLL. It is as limited as any other userspace program and has no higher privilege than its host application (e.g. an internet browser). It has access to the same set of system functions available to other programs, e.g. Winsock itself.
The conclusion is: if your program can direct outgoing traffic to a specific NIC, an LSP can do it too; if it can't, neither can an LSP. An LSP is therefore irrelevant to your problem.
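For what it's worth, the usual userspace way for a program to steer its own outgoing TCP traffic toward a particular adapter (on typical routing configurations) is to bind() the socket to that adapter's local address before connecting. A minimal Winsock sketch, using the adapter A address from your question:

    #include <winsock2.h>
    #include <ws2tcpip.h>

    // Sketch only: bind to adapter A's address before connecting, so the
    // connection is sourced from (and normally leaves through) that interface.
    // Assumes WSAStartup has already been called; error handling omitted.
    SOCKET connect_via_adapter_a(const sockaddr_in &destination) {
        SOCKET s = socket(AF_INET, SOCK_STREAM, 0);

        sockaddr_in local = {};
        local.sin_family = AF_INET;
        local.sin_port = 0;                                   // any local port
        inet_pton(AF_INET, "10.20.30.201", &local.sin_addr);  // adapter A from the question

        bind(s, reinterpret_cast<const sockaddr *>(&local), static_cast<int>(sizeof(local)));
        connect(s, reinterpret_cast<const sockaddr *>(&destination), static_cast<int>(sizeof(destination)));
        return s;
    }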

Porting Winsock to Linux Sockets

I have a program that does some networking using Winsock, and one of our requirements right now is to port our program over to Linux. The only thing stopping us from doing this is Winsock.
My question is: How easy can I port this over to a Linux implementation?
Are there any pitfalls I should be aware of, and if I simply include the appropriate header files, what sort of things will I have to be sure to handle?
Thanks for any help!
I'd post code but I can't unfortunately due to legal reasons.
But, our code does use the following:
WSAStartup(..)
WSACleanup(..)
Socket(..)
sendto(..)
recvfrom(..)
ioctlsocket(..)
setsockopt(..)
Based on that list of functions, things should more or less just work. Add #if _WIN32 around the calls to WSAStartup and WSACleanup (the Linux equivalent is to do nothing; the sockets library is initialized automatically).
You also might need some OS-dependent code when setting socket options; some of them are the same, some aren't, and the types might be different.
It will depend on whether you use any Windows-specific networking functionality or whether you're mostly just using the BSD-compatible API.
So, if you're using overlapped I/O, I/O completion ports and other advanced parts of the Winsock API, then things will be very difficult to port. If you're just using the BSD-compatible stuff, then it should be easy to write a thin translation layer, or even just keep the Winsock startup and shutdown stuff inside a Windows-specific ifdef...
This may help: http://tangentsoft.net/wskfaq/articles/bsd-compatibility.html
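A minimal sketch of what that thin, #ifdef-based translation layer might look like for the calls listed in the question (names like sockets_init are illustrative, not from any existing library):

    // portable_socket.h -- illustrative only; just enough to cover
    // WSAStartup, WSACleanup and ioctlsocket from the question's list.
    #ifdef _WIN32
      #include <winsock2.h>
      static inline int  sockets_init()    { WSADATA w; return WSAStartup(MAKEWORD(2, 2), &w); }
      static inline void sockets_cleanup() { WSACleanup(); }
      static inline int  set_nonblocking(SOCKET s, u_long on) { return ioctlsocket(s, FIONBIO, &on); }
    #else
      #include <sys/socket.h>
      #include <sys/ioctl.h>
      #include <unistd.h>
      typedef int SOCKET;
      static inline int  sockets_init()    { return 0; }   // nothing to do on POSIX
      static inline void sockets_cleanup() { }
      static inline int  set_nonblocking(SOCKET s, int on)  { return ioctl(s, FIONBIO, &on); }
    #endif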
The only calls that make porting difficult are the WSA* calls.
WSAStartup() -> nop
WSACleanup() -> nop
Socket/setsockopt -> socket/setsockopt
Under *nix, sockets are blocking by default, so the Winsock-specific call to fiddle with blocking mode usually isn't needed (if you do want non-blocking sockets, fcntl() with O_NONBLOCK or ioctl() with FIONBIO does the job).
ioctlsocket -> ioctl
Under *nix we don't like asynchronous sockets much and prefer to use the select() system call.
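For completeness, readiness checking with select() looks something like this (a sketch; much the same code also works under Winsock with only minor type differences):

    #include <sys/select.h>
    #include <sys/time.h>

    // Sketch: block for up to timeout_sec seconds waiting for fd to become readable.
    bool wait_readable(int fd, int timeout_sec) {
        fd_set readfds;
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        timeval tv = {};
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        // select() returns the number of ready descriptors, 0 on timeout, -1 on error.
        return select(fd + 1, &readfds, nullptr, nullptr, &tv) > 0;
    }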
---- Rest of this answer seems only to apply to Win95 compatible winsock ----
Unfortunately as the original socket() in Winsock was broken in some cases, you probably used WSASocket() and so have to convert those calls.
Without seeing code, it's tough to say how easy it is. But you should be able to replace winsock calls to analogs in sys/socket.h.
