wait fails in multithreaded application at 2.4 kernel - linux-kernel

I have a multithreaded application: one thread is responsible for collecting dead children with wait(), another thread spawns them with fork() upon request.
I found out that on one platform with a 2.4 kernel and LinuxThreads, wait() always fails with ECHILD. The problem appears to be the non-POSIX-compliant implementation of LinuxThreads on the 2.4 kernel, and the following discussion suggests that there is no way to solve it.
Still, I'd like to be sure that nobody knows of a solution. Even a kernel patch would be acceptable.
Given the application's design, I don't think it is possible to do both fork() and wait() in a single thread (or only with enormous effort).

It seems to me that this (obviously bogus) behavior is a feature of the LinuxThreads implementation. Under LinuxThreads every thread is a separate kernel process, so a child belongs only to the thread that forked it, and wait() in another thread finds no children.
There really seem to be only two ways out: either switch to NPTL (requires kernel 2.6) or avoid the multi-threaded fork/wait model. The latter was my solution, and though it made the architecture a bit more complicated, it was still manageable in a single day.
The following bare-bones example reproduces the bogus situation that fails on LinuxThreads.
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>  /* perror */

/* Runs in a thread other than the one that called fork(). */
void *wait_for_child(void *arg)
{
    int s;
    pid_t ret;

    ret = wait(&s);
    if (ret == -1 && errno == ECHILD)
        perror("Bogus LinuxThreads encountered");
    return NULL;
}

int main(int argc, char **argv)
{
    pid_t pid = fork();
    if (pid == -1)
        return 1;
    /* child waits and then dies */
    if (pid == 0) {
        sleep(3);
        return 0;
    }
    /* parent: reap the child from a second thread */
    pthread_t wt;
    pthread_create(&wt, NULL, wait_for_child, NULL);
    pthread_join(wt, NULL);
    return 0;
}
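The second way out mentioned above (keeping fork() and wait() in the same thread) could look roughly like the sketch below. It is only an illustration under assumptions: the helper names spawn_and_reap and run_on_worker are hypothetical and not from the original post, and it is written in C++ with std::thread for brevity. The point is that the thread that reaps the child is always the thread that created it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <thread>

// Hypothetical helper: fork a child that exits with `code`, then reap it
// from the very same thread. Returns the child's exit status, or -1 on error.
int spawn_and_reap(int code) {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) _exit(code);  // child: exit immediately with the given code
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Hypothetical helper: run fork and wait together on one dedicated worker
// thread, so no other thread ever has to wait for the child.
int run_on_worker(int code) {
    int result = -1;
    std::thread worker([&] { result = spawn_and_reap(code); });
    worker.join();
    return result;
}
```

In a real application the worker would presumably take spawn requests from a queue instead of being started once per child; that part is left out here.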

If you're starting to think about kernel patches, then it's time to think about upgrades. 2.4 is very long in the tooth.

Related

When MPI_Send doesn't block

I have used some code that implements manual MPI broadcast, basically a demo that unicasts an integer from root to all other nodes. Of course, unicasting to many nodes is less efficient than MPI_Bcast() but I just want to check how things work.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

void my_bcast(void* data, int count, MPI::Datatype datatype, int root, MPI::Intracomm communicator) {
    int world_size = communicator.Get_size();
    int world_rank = communicator.Get_rank();
    if (world_rank == root) {
        // If we are the root process, send our data to everyone
        int i;
        for (i = 0; i < world_size; i++) {
            if (i != world_rank) {
                communicator.Send(data, count, datatype, i, 0);
            }
        }
    } else {
        // If we are a receiver process, receive the data from the root
        communicator.Recv(data, count, datatype, root, 0);
    }
}

int main(int argc, char** argv) {
    MPI::Init();
    int world_rank = MPI::COMM_WORLD.Get_rank();
    int data;
    if (world_rank == 0) {
        data = 100;
        printf("Process 0 broadcasting data %d\n", data);
        my_bcast(&data, 1, MPI::INT, 0, MPI::COMM_WORLD);
    } else {
        my_bcast(&data, 1, MPI::INT, 0, MPI::COMM_WORLD);
        printf("Process %d received data %d from root process\n", world_rank, data);
    }
    MPI::Finalize();
}
What I noticed is that if I remove the check that the root doesn't send to itself,
if (i != world_rank) {
...
}
the program still works and doesn't block, whereas MPI_Send() is supposed to be blocking by default, i.e. to wait until the data has been received at the other end. But MPI_Recv() is never invoked by the root. Can someone explain why this is happening?
I run the code from the root with the following command (the cluster is set up on Amazon EC2, uses NFS as shared storage among the nodes, and all machines have Open MPI 1.10.2 installed):
mpirun -mca btl ^openib -mca plm_rsh_no_tree_spawn 1 /EC2_NFS/my_bcast
The C file is compiled with
mpic++ my_bcast.c
and mpic++ version is 5.4.0.
The code is taken from www.mpitutorial.com
You are mistaking blocking for synchronous behaviour. Blocking means that the call does not return until the operation has completed. The standard send operation (MPI_Send) completes once the supplied buffer is free to be reused by the program. This means either that the message is fully in transit to the receiver or that it was stored internally by the MPI library for later delivery (buffered send). The buffering behaviour is implementation-specific, but most libraries will buffer messages the size of a single integer. Force the synchronous mode by using MPI_Ssend (or the C++ equivalent) to have your program hang.
Please note that the C++ MPI bindings are no longer part of the standard (they were deprecated in MPI-2.2 and removed in MPI-3.0) and should not be used in the development of new software. Use the C bindings (the MPI_Xxx functions) instead.

G++ -fcilkplus random behavior with std::vectors

The following (reduced) code is handled very badly by several versions of GCC:
#include <vector>
#include <cilk/cilk.h>

void walk(std::vector<int> v, int bnd, unsigned size) {
    if (v.size() < size)
        for (int i=0; i<bnd; i++) {
            std::vector<int> vnew(v);
            vnew.push_back(i);
            cilk_spawn walk(vnew, bnd, size);
        }
}

int main(int argc, char **argv) {
    std::vector<int> v{};
    walk(v , 5, 5);
}
Specifically:
G++ 5.3.1 crashes:
red.cpp: In function ‘<built-in>’:
red.cpp:20:39: internal compiler error: in lower_stmt, at gimple-low.c:397
cilk_spawn walk(vnew, bnd, size);
G++ 6.3.1 generates code that works perfectly well when executed on one core, but with more cores it sometimes segfaults and sometimes reports a double free. A student running g++ 7 on Arch Linux reported a similar result.
My question: is there something wrong with this code? Am I invoking some undefined behavior, or is it simply a bug I should report?
Answering my own question:
According to https://gcc.gnu.org/ml/gcc-help/2017-03/msg00078.html, it is indeed a bug in GCC. In a cilk_spawn, the temporary is destroyed in the parent and not in the child. So if the spawned call really runs in another thread, the temporary might be destroyed too early.

How to use PAPI with C++11 std::thread?

I would like to use PAPI to get the overall counters of all C++11 std::thread threads in a program.
PAPI documentation on Threads says that:
Thread support in the PAPI library can be initialized by calling the following low-level function in C: int PAPI_thread_init(unsigned long(*handle)(void));
where the handle is a
Pointer to a routine that returns the current thread ID as an unsigned long.
For example, for pthreads the handle is pthread_self.
But I have no idea what it should be with C++11 std::thread, nor whether it makes more sense to use something other than PAPI.
C++11 threading support has the std::this_thread::get_id() function, which returns a std::thread::id instance that can be serialized to a stream. You could then try to read an unsigned long from the stream and return it. Something like this:
#include <thread>
#include <iostream>
#include <sstream>

unsigned long current_thread_id()
{
    std::stringstream id_stream;
    id_stream << std::this_thread::get_id();
    unsigned long id;
    id_stream >> id;
    return id;
}

int main(int argc, char** argv)
{
    std::cout << current_thread_id();
    return 0;
}
So in this snippet the current_thread_id function is what you are looking for, but you should add proper error handling (the thread id may not always be a number, in that case you will not be able to read a number from the stream and you should handle that accordingly).
That being said, since pthread_self is already platform-specific, it may be simpler to call the platform's own function directly (for example GetCurrentThreadId on Windows).
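As a possible alternative to parsing the streamed id, the sketch below hashes the std::thread::id instead, which avoids the case where the textual form is not a number. The function name hashed_thread_id is hypothetical. Note the assumptions: a hash is not strictly guaranteed to be unique across threads, and on platforms where unsigned long is narrower than std::size_t the value is truncated.

```cpp
#include <functional>
#include <thread>

// Hypothetical callback with the shape PAPI_thread_init expects:
// no arguments, returns an unsigned long identifying the calling thread.
unsigned long hashed_thread_id() {
    // std::hash is specialized for std::thread::id by the standard library.
    return static_cast<unsigned long>(
        std::hash<std::thread::id>{}(std::this_thread::get_id()));
}
```

Given the signature quoted in the question, this would presumably be registered with PAPI_thread_init(hashed_thread_id); that call is untested here.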

Create MPI processes on the fly with fork?

If I use MPI, I have a number of processes specified when I run the main program. However I would like to start with one process and dynamically decide at runtime if and when I need more, to fork more processes off. Is that or something similar possible?
Otherwise I would have to reinvent MPI which I would very much like to avoid.
It is not possible to use fork(), as the child process will not be able to use MPI functions. MPI has a simple mechanism for dynamically creating new processes: use the MPI_Comm_spawn function or MPI_Comm_spawn_multiple.
OpenMPI doc: http://www.open-mpi.org/doc/v1.4/man3/MPI_Comm_spawn.3.php
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

#define NUM_SPAWNS 2

int main( int argc, char *argv[] )
{
    int np = NUM_SPAWNS;
    int errcodes[NUM_SPAWNS];
    MPI_Comm parentcomm, intercomm;

    MPI_Init( &argc, &argv );
    MPI_Comm_get_parent( &parentcomm );
    if (parentcomm == MPI_COMM_NULL) {
        /* No parent: this is the original process, so spawn the others. */
        MPI_Comm_spawn( "spawn_example", MPI_ARGV_NULL, np, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &intercomm, errcodes );
        printf("I'm the parent.\n");
    } else {
        printf("I'm the spawned.\n");
    }
    fflush(stdout);
    MPI_Finalize();
    return 0;
}

OpenSSL and multi-threads

I've been reading about the requirement that if OpenSSL is used in a multi-threaded application, you have to register a thread identification function (and also a mutex creation function) with OpenSSL.
On Linux, according to the example provided by OpenSSL, a thread is normally identified by registering a function like this:
static unsigned long id_function(void){
return (unsigned long)pthread_self();
}
pthread_self() returns a pthread_t, and this works on Linux since pthread_t is just a typedef of unsigned long.
On Windows pthreads, FreeBSD, and other operating systems, pthread_t is a struct, with the following structure:
struct {
void * p; /* Pointer to actual object */
unsigned int x; /* Extra information - reuse count etc */
}
This can't be simply cast to an unsigned long, and when I try to do so, it throws a compile error. I tried taking the void *p and casting that to an unsigned long, on the theory that the memory pointer should be consistent and unique across threads, but this just causes my program to crash a lot.
What can I register with OpenSSL as the thread identification function when using Windows pthreads or FreeBSD or any of the other operating systems like this?
Also, as an additional question:
Does anyone know if this also needs to be done if OpenSSL is compiled into and used with Qt, and if so, how to register QThreads with OpenSSL? Surprisingly, I can't seem to find the answer in Qt's documentation.
I will just put this code here. It is not a panacea, as it doesn't deal with FreeBSD, but it is helpful in most cases when all you need is to support Windows and, say, Debian. Of course, the clean solution would use the CRYPTO_THREADID_* family introduced in OpenSSL 1.0.0 (to give an idea, it has a CRYPTO_THREADID_cmp callback, which can be mapped to pthread_equal).
/* These callbacks are only needed with OpenSSL older than 1.1.0;
   newer versions handle locking internally. */
#include <stdlib.h>          /* malloc, free */
#include <openssl/crypto.h>  /* CRYPTO_num_locks, CRYPTO_set_*_callback */
#include <openssl/err.h>

#if defined(WIN32)
#include <windows.h>
#define MUTEX_TYPE HANDLE
#define MUTEX_SETUP(x) (x) = CreateMutex(NULL, FALSE, NULL)
#define MUTEX_CLEANUP(x) CloseHandle(x)
#define MUTEX_LOCK(x) WaitForSingleObject((x), INFINITE)
#define MUTEX_UNLOCK(x) ReleaseMutex(x)
#define THREAD_ID GetCurrentThreadId()
#else
#include <pthread.h>
#define MUTEX_TYPE pthread_mutex_t
#define MUTEX_SETUP(x) pthread_mutex_init(&(x), NULL)
#define MUTEX_CLEANUP(x) pthread_mutex_destroy(&(x))
#define MUTEX_LOCK(x) pthread_mutex_lock(&(x))
#define MUTEX_UNLOCK(x) pthread_mutex_unlock(&(x))
#define THREAD_ID pthread_self()
#endif

/* This array will store all of the mutexes available to OpenSSL. */
static MUTEX_TYPE *mutex_buf=NULL;

static void locking_function(int mode, int n, const char * file, int line)
{
    if (mode & CRYPTO_LOCK)
        MUTEX_LOCK(mutex_buf[n]);
    else
        MUTEX_UNLOCK(mutex_buf[n]);
}

/* The cast only works where the thread ID is an integer or pointer type. */
static unsigned long id_function(void)
{
    return ((unsigned long)THREAD_ID);
}

int thread_setup(void)
{
    int i;

    mutex_buf = malloc(CRYPTO_num_locks() * sizeof(MUTEX_TYPE));
    if (!mutex_buf)
        return 0;
    for (i = 0; i < CRYPTO_num_locks( ); i++)
        MUTEX_SETUP(mutex_buf[i]);
    CRYPTO_set_id_callback(id_function);
    CRYPTO_set_locking_callback(locking_function);
    return 1;
}

int thread_cleanup(void)
{
    int i;

    if (!mutex_buf)
        return 0;
    CRYPTO_set_id_callback(NULL);
    CRYPTO_set_locking_callback(NULL);
    for (i = 0; i < CRYPTO_num_locks( ); i++)
        MUTEX_CLEANUP(mutex_buf[i]);
    free(mutex_buf);
    mutex_buf = NULL;
    return 1;
}
I only can answer the Qt part. Use QThread::currentThreadId(), or even QThread::currentThread() as the pointer value should be unique.
From the OpenSSL doc you linked:
threadid_func(CRYPTO_THREADID *id) is needed to record the currently-executing thread's identifier into id. The implementation of this callback should not fill in id directly, but should use CRYPTO_THREADID_set_numeric() if thread IDs are numeric, or CRYPTO_THREADID_set_pointer() if they are pointer-based. If the application does not register such a callback using CRYPTO_THREADID_set_callback(), then a default implementation is used - on Windows and BeOS this uses the system's default thread identifying APIs, and on all other platforms it uses the address of errno. The latter is satisfactory for thread-safety if and only if the platform has a thread-local error number facility.
As shown, providing your own ID is really only useful if you can provide a better ID than OpenSSL's default implementation.
The only fail-safe way to provide IDs, when you don't know whether pthread_t is a pointer or an integer, is to maintain your own per-thread IDs stored as a thread-local value.
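The per-thread ID idea can be sketched as follows, assuming a C++11 compiler with thread_local support. The function name my_thread_id is hypothetical and not part of OpenSSL; each thread gets the next value of a shared counter the first time it asks for its ID, and keeps that value for its lifetime.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: returns a small numeric ID that is unique per thread
// and independent of how pthread_t is defined on the platform.
unsigned long my_thread_id() {
    // Shared counter; starts at 1 so that 0 can mean "no ID".
    static std::atomic<unsigned long> next_id{1};
    // Initialized once per thread, on that thread's first call.
    thread_local unsigned long id = next_id.fetch_add(1);
    return id;
}
```

With the CRYPTO_THREADID API quoted above, the callback body would presumably just call CRYPTO_THREADID_set_numeric(id, my_thread_id()); with the older API, my_thread_id could be passed to CRYPTO_set_id_callback directly.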
