Error in MPI_Ineighbor_alltoall while using MPI_IN_PLACE

Is MPI_IN_PLACE supported for MPI_Ineighbor_alltoall, or is there a problem with my function call? https://www.open-mpi.org/doc/v2.0/man3/MPI_Neighbor_alltoall.3.php says that MPI_IN_PLACE is not supported for send. Does that implicitly mean it is not supported for receive as well?
If I switch the receive buffer to MPI_IN_PLACE, the code breaks. The first call below uses a separate receive buffer; the second uses MPI_IN_PLACE:
ierr = MPI_Ineighbor_alltoall( &P( 0 ), P.chunkSize, MPI_DOUBLE, &R(0), P.chunkSize, MPI_DOUBLE, nbrComm[0], &request );
ierr = MPI_Ineighbor_alltoall( &P( 0 ), P.chunkSize, MPI_DOUBLE, MPI_IN_PLACE, P.chunkSize, MPI_DOUBLE, nbrComm[0], &request );
The MPI_IN_PLACE call fails with the following error:
*** An error occurred in MPI_Ineighbor_alltoall
*** reported by process [3809738753,4575480487799160832]
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_ARG: invalid argument of some other kind
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)

Related

SetHandleInformation failed with error code ERROR_ACCESS_DENIED (5)

My code opens a virtual NIC device using CreateFile. After that I am calling SetHandleInformation on the handle returned by CreateFile to avoid leaking the handle to child processes. The problem is, SetHandleInformation fails with error code 5 (ERROR_ACCESS_DENIED).
Following is the piece of code where I am opening the device and calling SetHandleInformation:
handle = CreateFile(dev_name, GENERIC_READ | GENERIC_WRITE, 0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_SYSTEM | FILE_FLAG_OVERLAPPED, NULL);
if (handle != INVALID_HANDLE_VALUE) {
    /* Clear the inherit flag so child processes do not receive the handle. */
    if (!SetHandleInformation(handle, HANDLE_FLAG_INHERIT, 0)) {
        /* GetLastError() returns a DWORD (unsigned long), so print it with %lu. */
        printf("SetHandleInformation error, code: %lu\n", GetLastError());
    }
}
I am observing this failure on Windows 10. My application is running with Administrator privileges. I am able to successfully use the handle for read and write even after this failure, which indicates that CreateFile is working fine.
What could be the reason for this failure? I could not find much information about SetHandleInformation failing with error code 5.

MPI_COMM_RANK vs MPI_GROUP_RANK

I tried to split 8 processes into two subgroups. One of the subgroups contains 2 processes, say, with ranks 0 and 1. I don't need the other group for the current example. The code snippet below is what I used to reach this goal. However, I kept getting error messages.
One of the error messages I got is the following:
Fatal error in PMPI_Comm_rank: Invalid communicator, error stack:
PMPI_Comm_rank(121): MPI_Comm_rank(MPI_COMM_NULL, rank=0x7fff5a451e10) failed
PMPI_Comm_rank(73).: Null communicator.
If I change the statement in line 15 to call MPI_GROUP_RANK(...) then there is no error message shown. However, I don't know whether I can use group_rank as an input argument of the subroutines like MPI_SEND or MPI_RECV. Can anyone please tell me what is wrong with my code? Thanks.
Lee
01 program main
02 include 'mpif.h'
03 integer :: ierr, irank, num_procs, base_group
04 integer :: incl_list(2), new_group, new_comm, new_rank
05 call MPI_Init ( ierr )
06 call MPI_COMM_RANK( MPI_comm_world, irank, ierr )
07 call MPI_COMM_SIZE( MPI_comm_world, num_procs, ierr)
08 call MPI_COMM_GROUP( MPI_comm_world, base_group, ierr)
09
10 incl_list(1) = 0
11 incl_list(2) = 1
12
13 call MPI_GROUP_INCL( base_group, 2, incl_list, new_group, ierr )
14 call MPI_COMM_CREATE( MPI_COMM_WORLD, new_group, new_comm, ierr )
15 call MPI_COMM_RANK( new_comm, new_rank, ierr )
16 call MPI_Finalize ( ierr )
17 end program
MPI_COMM_CREATE returns MPI_COMM_NULL in those ranks that are not included in new_group. Calling MPI_COMM_RANK with MPI_COMM_NULL results in the error you are getting. You should use an IF statement to prevent it:
call MPI_COMM_CREATE(MPI_COMM_WORLD, new_group, new_comm, ierr)
if (new_comm /= MPI_COMM_NULL) then
   !
   ! The process is part of new_group - do something useful
   !
   call MPI_COMM_RANK(new_comm, new_rank, ierr)
   ! ...
else
   !
   ! The process is not part of new_group - do nothing
   !
end if

What is the reason for this sysgen error?

(Windows CE 7)
If I run a clean sysgen (blddemo clean -q), I get the following errors:
SYSGEN: BUILDMSG: Found localized resources for Languages ( 0404 0407 0409 040c 0410 0411 0412 0413 0416 0419 041d 0804 0c0a)
Res2Res: ERROR: Failed CreateFileW("C:\DOCUME~1\KESHAV~1.IWA\LOCALS~1\Temp\R2R1000.tmp", RW, RW, 0, Existing, Normal, 0), GetLastError = 5. {log="C:\WINCE700\build.log(18570)"}
Res2Res: ERROR: Failed CreateFileW("C:\DOCUME~1\KESHAV~1.IWA\LOCALS~1\Temp\R2R1000.tmp", RW, RW, 0, Existing, Normal, 0), GetLastError = 5. {log="C:\WINCE700\build.log(18571)"}
NMAKE : fatal error U1077: 'C:\WINCE700\public\common\oak\Bin\i386\res2res.EXE' : return code '0xffffffff' {log="C:\WINCE700\build.log(18572)"}
NMAKE : fatal error U1077: 'C:\WINCE700\sdk\bin\i386\nmake.exe' : return code '0x2' {log="C:\WINCE700\build.log(18574)"}
SYSGEN: ERROR: error(s) in sysgen phase ( ie7 ) {log="C:\WINCE700\build.log(18576)"}
(Error list:
)
If I then run a plain sysgen (blddemo -q), there are no errors and the build succeeds.
What is the reason for this error?
Error code 5 is ACCESS_DENIED. Are you running with administrative privileges on your build machine?
This error was caused by the antivirus agent (Symantec Endpoint Protection) running on my system.
Disabling SEP solved it.

Is my loop wrong? Do I misuse ReadFile() and the I/O completion port?

I want to implement a server/client using named pipes (for IPC). I'm using asynchronous (overlapped) connections and an I/O completion port (I searched a lot, and it seems to be the most efficient way to do that).
First, here is the code:
server: http://pastebin.com/XxeXdunC
and client: http://pastebin.com/fbCH2By8
The problem is in the server (I can improve the client, but I will do that once the server works).
I use the I/O completion port like this: I run a thread in which I call ReadFile(). If it returns TRUE, I get all the data; if it returns FALSE and the error is ERROR_IO_PENDING, I wait with GetQueuedCompletionStatus().
What is strange is that, even after I have read all the data, the last ReadFile() call fails and the error is ERROR_IO_PENDING.
The thread in which I call ReadFile() begins at line 64 of the server code.
The client sends 24 bytes (the string "salut, c'est le client !") and the ReadFile() buffer is 5 bytes long (to check how my server handles data that is larger than the ReadFile() buffer).
The output is:
waiting for client...
WaitForMultipleObjects : 0
client connected (1)
ReadFile 1 msg (5 -> 05) : salut
ReadFile 2 msg (5 -> 10) : salut, c'e
ReadFile 2 msg (5 -> 15) : salut, c'est le
ReadFile 2 msg (5 -> 20) : salut, c'est le clie
ReadFile 2 msg (4 -> 24) : salut, c'est le client !
ReadFile2: ERROR_IO_PENDING
GQCIOS 0 255 003D3A18
ReadFile3: ERROR_IO_PENDING
ReadFile1: ERROR_IO_PENDING
GQCIOS 5 255 003D3A2C
ReadFile3: ERROR_IO_PENDING
ReadFile1: ERROR_IO_PENDING
GQCIOS 5 255 003D3A2C
ReadFile3: ERROR_IO_PENDING
ReadFile1: ERROR_IO_PENDING
GQCIOS 5 255 003D3A2C
ReadFile3: ERROR_IO_PENDING
ReadFile1: ERROR_IO_PENDING
GQCIOS 5 255 003D3A2C
ReadFile3: ERROR_IO_PENDING
ReadFile1: ERROR_IO_PENDING
GQCIOS 4 255 003D3A2C
ReadFile3: ERROR_IO_PENDING
ReadFile1: ERROR_IO_PENDING
What I do not understand is why, even though I have read all the data, ReadFile() still reports a pending operation (the "ReadFile2: ERROR_IO_PENDING" message after the last "msg" output).
Is my loop wrong? Do I misuse ReadFile() / GetQueuedCompletionStatus()?
Thank you.
Where is your write-related function? It seems that your code is in the wrong order. In the _read_data_cb routine, GetQueuedCompletionStatus should be called first; then, depending on the lpOverlapped parameter, the received data should be ready in the buffer that you passed to ReadFile. Since you're calling ReadFile without checking whether the OVERLAPPED structure belongs to the send context or the receive context, you're not getting the expected output. The following code should clear things up:
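while (TRUE)
{
    // Block until a completion packet arrives. The completion key is written
    // into pClient (assumed here to be the per-client pointer that was
    // registered with CreateIoCompletionPort).
    bReturnValue = GetQueuedCompletionStatus(pIOCPServer->m_pIOCP, &dwBytesTransferred,
        reinterpret_cast<PULONG_PTR>(&pClient),
        reinterpret_cast<LPOVERLAPPED*>(&pOverlapped), INFINITE);
    if (!bReturnValue)
    {
        if (NULL == pOverlapped)
            continue;   // no packet was dequeued; wait again
        else
            break;      // a dequeued I/O operation failed
    }
    else
    {
        if (pOverlapped == NULL)
            continue;
    }
    if (dwBytesTransferred == 0)
        break;
    // Dispatch on which OVERLAPPED structure completed.
    if (pOverlapped == &(pClient->m_pRecvContext->overlapped))
        pClient->handleRecvEvent(dwBytesTransferred);
    else if (pOverlapped == &(pClient->m_pSendContext->overlapped))
        pClient->handleSendEvent(dwBytesTransferred);
}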
...

Win32 ::shutdown() returns -1, but WSAGetLastError() returns 0?

While porting some working unit tests from Linux to Windows, I'm running into a strange problem. When my tests shut down the server socket, shutdown() returns -1, but WSAGetLastError() returns 0 (and getsockopt() with SO_ERROR returns 0, and GetLastError() returns 0). So shutdown() tells me there is an error, but all of the normal calls for finding out what the problem was report "no problem!". Has anyone seen this before?
The code that calls shutdown looks like this:
int ret = ::shutdown( _sok, mode );
if( ret < 0 )
    X_THROW(( XSDK::ModuleId, XSDK::F_OS_ERROR, "Unable to shutdown socket."));
When I catch the exception, I call all those GetLastError() functions... Does throwing reset the last errors?
The answer ended up being that nearly any system call can clear Win32's last-error value. In my case, throwing an exception meant formatting and logging a message, which caused the error to be cleared. Even though I was calling WSAGetLastError() immediately in my catch(...), it was already too late.
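The ordering problem described in this answer can be sketched in portable C, using errno as a stand-in for the Win32 last-error value. This is only an illustration under stated assumptions: log_message is a hypothetical logging routine whose clobbering of the error state is simulated by resetting errno, and a strtol call on an overflowing value stands in for the failing shutdown() call.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a logging routine. Real logging code makes
   library calls that may overwrite the thread's error value; that side
   effect is simulated here by resetting errno. */
static void log_message(const char *text)
{
    (void)text;
    errno = 0;
}

/* A real call that fails: the value overflows long, so strtol sets
   errno to ERANGE. */
static long failing_call(void)
{
    errno = 0;
    return strtol("999999999999999999999999999999", NULL, 10);
}

/* Wrong order: the error is read only after logging has run. */
int error_read_after_logging(void)
{
    failing_call();
    log_message("operation failed");
    return errno;          /* already overwritten by log_message */
}

/* Right order: capture the error immediately, then log. */
int error_saved_before_logging(void)
{
    failing_call();
    int saved = errno;     /* capture before any other call */
    log_message("operation failed");
    return saved;
}
```

In the Windows code from the question, the equivalent would be to store the result of WSAGetLastError() in a local variable right after shutdown() returns -1 and pass that value into the exception, rather than querying it in the catch block.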
