I have a multithreaded app working quite well, except in the following scenario:
Initially I read 16 bytes from the pipe, then, depending on the header, I read the rest.
The problem is that sometimes the client writes a message (say 300 bytes long) and then closes the connection.
My server receives the first 16 bytes, then decides to read the remaining 284 bytes, but ReadFile returns error 233 (No process is on the other end of the pipe).
So where did those 284 bytes go? I suppose they should be in a pipe buffer or something.
The pipe is created as usual, like in all the examples on the net:
HANDLE h = CreateNamedPipe(
    name,                     // pipe name
    PIPE_ACCESS_DUPLEX |      // read/write access
    FILE_FLAG_OVERLAPPED,     // overlapped mode
    PIPE_TYPE_MESSAGE |       // message-type pipe
    PIPE_READMODE_MESSAGE |   // message read mode
    PIPE_WAIT,                // blocking mode
    PIPE_UNLIMITED_INSTANCES, // unlimited instances
    100000,                   // output buffer size
    100000,                   // input buffer size
    0,                        // client time-out
    lpSecurityAttributes);    // default security attributes
As WhozCraig noticed, the client does not call FlushFileBuffers before DisconnectNamedPipe. Unfortunately.
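For reference, a minimal sketch of what the client side could do instead (hPipe, buf and the 300-byte length are assumptions, not taken from the actual client): FlushFileBuffers blocks until the server has read everything written to the pipe, so closing afterwards does not strand the trailing 284 bytes.

// Hypothetical client-side teardown: flush before closing so the server
// can finish reading the whole 300-byte message.
DWORD written = 0;
if (WriteFile(hPipe, buf, 300, &written, NULL))
{
    FlushFileBuffers(hPipe); // blocks until the server has read the data
}
CloseHandle(hPipe);          // only now drop the connection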
I'm implementing a voice chat server which will be used in my Virtual Class e-learning application for Windows, which makes use of the Remote Desktop API.
So far I've been compressing the voice with Opus, and I've tested various options:
To pass the voice through the RDP virtual channel. This works, but it creates a lot of lag despite creating the channel with CHANNEL_PRIORITY_HI.
To use my own TCP (or UDP) voice server. For this option I have been wondering what the best implementation method would be.
Currently I'm forwarding each UDP datagram received to all other clients (later on I will do server-side mixing).
The problem with my current UDP voice server is that it has lag even within the same PC: one server and four clients connected, two of them with open mics, for example.
I get audible lag with this setup:
void VoiceServer(int port)
{
    XSOCKET Y = make_shared<XSOCKET>(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (!Y->Bind(port))
        return;
    auto VoiceServer2 = [&]()
    {
        OPUSBUFF o;
        char d[200] = { 0 };
        map<int, vector<char>> udps;
        for (;;)
        {
            // get datagram
            int sle = sizeof(sockaddr_in6);
            int r = recvfrom(*Y, o.d, 4000, 0, (sockaddr*)d, &sle);
            if (r <= 0)
                break;

            // a MESSAGE is a header followed by Opus data
            MESSAGE* m = (MESSAGE*)o.d;

            // have we received data from this client already?
            // m->arb holds the RDP ID of the user
            if (udps.find(m->arb) == udps.end())
            {
                vector<char>& uu = udps[m->arb];
                uu.resize(sle);
                memcpy(uu.data(), d, sle);
            }

            for (auto& att2 : aatts) // attendee list
            {
                long lxid = 0;
                att2->get_Id(&lxid);
#ifndef _DEBUG
                if (lxid == m->arb) // if same
                    continue;
#endif
                const vector<char>& uud = udps[lxid];
                sendto(*Y, o.d + sizeof(MESSAGE), r - sizeof(MESSAGE), 0,
                       (sockaddr*)uud.data(), uud.size());
            }
        }
    };

    // 9 detached threads plus this one: 10 receivers in total
    for (int i = 0; i < 9; i++)
    {
        std::thread t(VoiceServer2);
        t.detach();
    }
    VoiceServer2();
}
Each client runs a VoiceServer thread:
void VoiceServer()
{
    char b[4000] = { 0 };
    vector<char> d2;
    for (;;)
    {
        int r = recvfrom(Socket, b, 4000, 0, 0, 0);
        if (r <= 0)
            break;
        d2.resize(r);
        memcpy(d2.data(), b, r);
        if (audioin && wout)
            audioin->push(d2); // this pushes the buffer to a waveOut writing class
        SetEvent(hPlayEvent);
    }
}
Is this because I test on the same machine? With a TeamSpeak client I had set up in the past there was no lag whatsoever.
Thanks for your opinion.
From the documentation of sendto():
For message-oriented sockets, care must be taken not to exceed the maximum packet size of the underlying subnets, which can be obtained by using getsockopt to retrieve the value of socket option SO_MAX_MSG_SIZE. If the data is too long to pass atomically through the underlying protocol, the error WSAEMSGSIZE is returned and no data is transmitted.
A typical IPv4 header is 20 bytes and the UDP header is 8 bytes. The theoretical limit (on Windows) for the maximum size of a UDP packet is 65507 bytes (0xFFFF - 20 - 8 = 65507). Is it actually a good idea to send such a large packet? If the packet is too large, the lower layers of the network stack will fragment it at the IP layer. That consumes extra bandwidth and causes delay.
The MTU (maximum transmission unit) is determined by the link-layer protocol. An Ethernet II frame has the structure DMAC + SMAC + Type + Data + CRC; because of the electrical limitations of Ethernet transmission, each frame must be at least 64 bytes and may not exceed 1518 bytes. Frames smaller or larger than these limits are treated as errors. Since the largest Ethernet II frame is 1518 bytes, subtracting the 14-byte frame header and the 4-byte CRC trailer leaves 1500 bytes for the data field. That is the MTU.
With an MTU of 1500 bytes, the maximum UDP payload that avoids IP fragmentation is 1500 bytes - IP header (20 bytes) - UDP header (8 bytes) = 1472 bytes. However, since the standard MTU value on the Internet is 576 bytes, it is recommended to keep the UDP data length within 576 - 8 - 20 = 548 bytes per sendto/recvfrom when programming UDP for the Internet.
So reduce the number of bytes per send/receive and control the number of sends accordingly.
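As a rough illustration of that advice (s is an assumed, already-bound UDP socket; the 1472-byte figure comes from the Ethernet MTU calculation above), the per-datagram limit can be checked at runtime and the Opus payload kept below it:

// Query the largest datagram the stack will send on this socket.
unsigned int maxMsg = 0;
int optLen = sizeof(maxMsg);
if (getsockopt(s, SOL_SOCKET, SO_MAX_MSG_SIZE, (char*)&maxMsg, &optLen) == 0)
{
    // Stay below both the socket limit and the fragmentation-free
    // Ethernet payload (1500 - 20 - 8 = 1472 bytes).
    int chunk = (maxMsg < 1472) ? (int)maxMsg : 1472;
    // ... send the Opus data in pieces of at most 'chunk' bytes ...
}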
I'm reading a volume (logical drive) with ReadFile. I'm using DeviceIoControl with the FSCTL_ALLOW_EXTENDED_DASD_IO code because I want access to all bytes, including the last ones; I had an issue trying to read the last 512 bytes (ReadFile succeeded but reported 0 bytes read) and saw advice to use it. Unfortunately, ReadFile fails when called after that DeviceIoControl call.
In code it looks like this (all success checks are omitted for brevity):
HANDLE fd;
DWORD junk;
int lenToBeRead = 0x1000;
DWORD nread;
char* alignedBuf = new char[lenToBeRead];

fd = CreateFile("path to volume", FILE_READ_DATA,
                FILE_SHARE_READ | FILE_SHARE_WRITE,
                NULL, OPEN_EXISTING, 0, NULL);                 // success

DeviceIoControl(fd, FSCTL_ALLOW_EXTENDED_DASD_IO,
                NULL, 0, NULL, 0, &junk, (LPOVERLAPPED) NULL); // success

ReadFile(fd, alignedBuf, (DWORD) lenToBeRead, &nread, NULL);
// fails with code 0x57, ERROR_INVALID_PARAMETER
All work with the fd handle is synchronous.
EDIT: I solved the problem. I was trying to read the last bytes, so my volume was of length L = 0x...200 and my handle was at position pos = L - 0x200. Before I tried the FSCTL_ALLOW_EXTENDED_DASD_IO approach, I had cut lenToBeRead to fit in the remaining space (so, if it was 0x1000, it became 0x200), because I had found that ReadFile does not guarantee reading all the bytes up to the end when lenToBeRead is greater than the number of bytes remaining from the current handle position. That alone did not help: ReadFile was still returning success with 0 bytes read. I removed that fix and then used FSCTL_ALLOW_EXTENDED_DASD_IO, which left ReadFile failing with ERROR_INVALID_PARAMETER on lenToBeRead = 0x1000. I had completely forgotten about the first fix; I have now put it back, and it works.
I found the solution and added it to the question body.
What one has to keep in mind when working with ReadFile is to control the arguments (the length) so that the request does not cross the end of the file.
I had tried that as a fix before using FSCTL_ALLOW_EXTENDED_DASD_IO and it did not help. But the combination of FSCTL_ALLOW_EXTENDED_DASD_IO and the boundary check gave me the wanted result: I could read those last bytes.
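A minimal sketch of the combined fix, under the assumption that the volume length (volumeLen), the current position (pos) and the sector size (sectorSize) are already known; the request length is clamped to what remains and kept a multiple of the sector size:

// Never ask ReadFile for more than what is left on the volume.
ULONGLONG remaining = volumeLen - pos;            // volumeLen, pos: assumptions
DWORD toRead = (remaining < (ULONGLONG)lenToBeRead)
                   ? (DWORD)remaining
                   : (DWORD)lenToBeRead;
toRead -= toRead % sectorSize;                    // keep it sector-aligned

DWORD nread = 0;
if (!ReadFile(fd, alignedBuf, toRead, &nread, NULL))
{
    // handle GetLastError()
}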
I am building a Visual C++ WinSock TCP server using BindIoCompletionCallback. It works fine receiving and sending data, but I can't find a good way to detect timeouts: setsockopt with SO_RCVTIMEO/SO_SNDTIMEO has no effect on non-blocking sockets, and if the peer is not sending any data, the CompletionRoutine is not called at all.
I am thinking about using RegisterWaitForSingleObject with the hEvent field of OVERLAPPED. That might work, but then the CompletionRoutine is not needed at all. Am I still using IOCP? Is there a performance concern if I use only RegisterWaitForSingleObject and not BindIoCompletionCallback?
Update: Code Sample:
My first try:
bool CServer::Startup() {
    SOCKET ServerSocket = WSASocket(AF_INET, SOCK_STREAM, 0, NULL, 0, WSA_FLAG_OVERLAPPED);
    WSAEVENT ServerEvent = WSACreateEvent();
    WSAEventSelect(ServerSocket, ServerEvent, FD_ACCEPT);
    ......
    bind(ServerSocket......);
    listen(ServerSocket......);
    _beginthread(ListeningThread, 128 * 1024, (void*) this);
    ......
    ......
}

void __cdecl CServer::ListeningThread(void* param) // static
{
    CServer* server = (CServer*) param;
    while (true) {
        if (WSAWaitForMultipleEvents(1, &server->ServerEvent, FALSE, 100, FALSE) == WSA_WAIT_EVENT_0) {
            WSANETWORKEVENTS events = {};
            if (WSAEnumNetworkEvents(server->ServerSocket, server->ServerEvent, &events) != SOCKET_ERROR) {
                if ((events.lNetworkEvents & FD_ACCEPT) && (events.iErrorCode[FD_ACCEPT_BIT] == 0)) {
                    SOCKET socket = accept(server->ServerSocket, NULL, NULL);
                    if (socket != INVALID_SOCKET) {
                        BindIoCompletionCallback((HANDLE) socket, CompletionRoutine, 0);
                        ......
                    }
                }
            }
        }
    }
}
VOID CALLBACK CServer::CompletionRoutine( __in DWORD dwErrorCode, __in DWORD dwNumberOfBytesTransfered, __in LPOVERLAPPED lpOverlapped ) // static
{
......
BOOL res = GetOverlappedResult(......, TRUE);
......
}
class CIoOperation {
public:
OVERLAPPED Overlapped;
......
......
};
bool CServer::Receive(SOCKET socket, PBYTE buffer, DWORD length, void* context)
{
    if (socket != INVALID_SOCKET) {
        CIoOperation* io = new CIoOperation();
        WSABUF buf = { length, (PCHAR) buffer };
        DWORD flags = 0;
        if ((WSARecv(socket, &buf, 1, NULL, &flags, &io->Overlapped, NULL) != 0) && (WSAGetLastError() != WSA_IO_PENDING)) {
            delete io;
            return false;
        } else return true;
    }
    return false;
}
As I said, it works fine if the client is actually sending data to me: Receive does not block, the CompletionRoutine gets called, data is received. But here is the gotcha: if the client is not sending any data to me, how can I give up after a timeout?
Since setsockopt with SO_RCVTIMEO/SO_SNDTIMEO won't help here, I think I should use the hEvent field in the OVERLAPPED structure, which will be signaled when the I/O completes. But a WaitForSingleObject / WSAWaitForMultipleEvents on that would block the Receive call, and I want Receive to always return immediately, so I used RegisterWaitForSingleObject and a WAITORTIMERCALLBACK. It worked: the callback gets called after the timeout, or when the I/O completes. But now I have two callbacks for every single I/O operation, the CompletionRoutine and the WaitOrTimerCallback:
If the I/O completes, they are called simultaneously; if the I/O does not complete, the WaitOrTimerCallback is called, then I call CancelIoEx, which causes the CompletionRoutine to be called with an ABORTED error. But there is a race condition: the I/O may complete right before I cancel it, then ... blah blah. All in all it's quite complicated.
Then I realized I don't actually need BindIoCompletionCallback and the CompletionRoutine at all and could do everything from the WaitOrTimerCallback. That may work, but here is the interesting question: I wanted to build an IOCP-based WinSock server in the first place, and I thought BindIoCompletionCallback was the easiest way to do that, using the thread pool provided by Windows itself. Now I end up with a server without any IOCP code at all? Is it still IOCP? Or should I forget BindIoCompletionCallback and build my own IOCP thread-pool implementation? Why?
What I did was to force the timeout/completion notifications to enter a critical section in the socket object. Once in, the winner can set a socket state variable and perform its action, whatever that might be. If the I/O completion gets in first, the I/O buffer array is processed in the normal way and any timeout is directed by the state machine to restart. Similarly, if the timeout gets in first, the I/O gets CancelIoEx'd and any later queued completion notification is discarded by the state machine. Because of these possible 'late' notifications, I put released sockets onto a timeout queue and only recycle them onto the socket object pool after five minutes, in a similar way to how the TCP stack itself puts its sockets into 'TIME_WAIT'.
To do the timeouts, I have one thread that operates on FIFO delta-queues of timing-out objects, one queue for each timeout limit. The thread waits on an input queue for new objects with a timeout calculated from the smallest timeout-expiry-time of the objects at the head of the queues.
There were only a few timeouts used in the server, so I used queues fixed at compile-time. It would be fairly easy to add new queues or modify the timeout by sending appropriate 'command' messages to the thread input queue, mixed-in with the new sockets, but I didn't get that far.
Upon timeout, the thread called an event in the object which, in the case of a socket, would enter the socket object's CS-protected state machine (there was a TimeoutObject class which the socket descended from, amongst other things).
More:
I wait on the semaphore that controls the timeout thread input queue. If it's signaled, I get the new TimeoutObject from the input queue and add it to the end of whatever timeout queue it asks for. If the semaphore wait times out, I check the items at the heads of the timeout FIFO queues and recalculate their remaining interval by subtracting the current time from their timeout time. If the interval is zero or negative, the timeout event gets called. While iterating the queues and their heads, I keep in a local the minimum remaining interval before the next timeout. When all the head items in all the queues have a non-zero remaining interval, I go back to waiting on the queue semaphore using the minimum remaining interval I have accumulated.
The event call returns an enumeration. This enumeration instructs the timeout thread how to handle an object whose event it has just fired. One option is to restart the timeout by recalculating the timeout time and pushing the object back onto the end of its timeout queue.
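The original server was Delphi, so the following is only a C++ sketch of the single-queue case; TimeoutObject, OnTimeout, TIMEOUT_MS and the semaphore are invented names used to show the shape of the loop, not the actual implementation.

#include <windows.h>
#include <deque>

struct TimeoutObject {
    DWORD deadline;                       // GetTickCount() value to fire at
    virtual bool OnTimeout() = 0;         // return true to restart the timeout
};

static std::deque<TimeoutObject*> g_queue;   // guarded by g_cs
static CRITICAL_SECTION g_cs;
static HANDLE g_inputSem;                    // released when a new object is queued
static const DWORD TIMEOUT_MS = 30000;       // assumption: a single 30-second limit

static void TimeoutThread()
{
    DWORD wait = INFINITE;
    for (;;) {
        // Wake up either for a newly queued object or when the head item expires.
        WaitForSingleObject(g_inputSem, wait);
        DWORD now = GetTickCount();
        wait = INFINITE;
        EnterCriticalSection(&g_cs);
        while (!g_queue.empty()) {
            TimeoutObject* head = g_queue.front();
            LONG remaining = (LONG)(head->deadline - now);
            if (remaining > 0) {             // head not due yet: sleep until it is
                wait = (DWORD)remaining;
                break;
            }
            g_queue.pop_front();
            if (head->OnTimeout()) {         // object asked to be re-armed
                head->deadline = now + TIMEOUT_MS;
                g_queue.push_back(head);
            }
        }
        LeaveCriticalSection(&g_cs);
    }
}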
I did not use RegisterWaitForSingleObject() because it needed .NET and my Delphi server was all unmanaged, (I wrote my server a long time ago!).
That, and because, IIRC, it has a limit of 64 handles, like WaitForMultipleObjects(). My server had upwards of 23000 clients timing out. I found the single timeout thread and multiple FIFO queues to be more flexible - any old object could be timed out on it as long as it was descended from TimeoutObject - no extra OS calls/handles needed.
The basic idea is that, since you're using asynchronous I/O with the system thread pool, you shouldn't need to check for timeouts via events because you're not blocking any threads.
The recommended way to check for stale connections is to call getsockopt with the SO_CONNECT_TIME option. This returns the number of seconds that the socket has been connected. I know that's a poll operation, but if you're smart about how and when you query this value, it's actually a pretty good mechanism for managing connections. I explain below how this is done.
Typically I'll call getsockopt in two places: one is during my completion callback (so that I have a timestamp for the last time that an I/O completion occurred on that socket), and one is in my accept thread.
The accept thread monitors my socket backlog via WSAEventSelect and the FD_ACCEPT parameter. This means that the accept thread only executes when Windows determines that there are incoming connections that require accepting. At this time I enumerate my accepted sockets and query SO_CONNECT_TIME again for each socket. I subtract the timestamp of the connection's last I/O completion from this value, and if the difference is above a specified threshold my code deems the connection as having timed out.
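A small sketch of that check (s, lastIoSeconds and IDLE_LIMIT are assumptions for illustration; lastIoSeconds would be the SO_CONNECT_TIME value recorded in the completion callback):

// Seconds the socket has been connected; 0xFFFFFFFF means not connected.
DWORD connectedFor = 0;
int len = sizeof(connectedFor);
if (getsockopt(s, SOL_SOCKET, SO_CONNECT_TIME, (char*)&connectedFor, &len) == 0 &&
    connectedFor != 0xFFFFFFFF)
{
    // If no I/O has completed for longer than the threshold, deem the
    // connection stale.
    if (connectedFor - lastIoSeconds > IDLE_LIMIT)
        closesocket(s);   // or start a graceful shutdown instead
}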
I am trying to use a combination of the functions CreateFileMapping, MapViewOfFile and FlushViewOfFile.
The total buffer size is larger than the mapped view.
For example, the buffer is 50 KB and the mapped view is 2 KB. In such a scenario,
I want to write the whole buffer to a physical file using the above functions.
I am able to write the first part to the file, but how do I write the remaining part? I mean, how do I move to the next page and write the next part of the data?
#define MEM_UNIT_SIZE 100
First module - memory map creator:
GetTempPath(256, szTmpFile);
GetTempFileName(szTmpFile, pName, 0, szMMFile);
hFile = CreateFile(szMMFile, GENERIC_WRITE | GENERIC_READ, FILE_SHARE_WRITE,
                   NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_TEMPORARY, NULL);
HANDLE hFileMMF = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0,
                                    MEM_UNIT_SIZE, pName);
Second module - memory writer:
long lBinarySize = 1000;
long lPageSize = MEM_UNIT_SIZE;
HANDLE hFileMMF = OpenFileMapping(FILE_MAP_WRITE, FALSE, pMemName);
LPVOID pViewMMFFile = MapViewOfFile(hFileMMF, FILE_MAP_WRITE, 0, 0, lPageSize);
CMutex mutex(FALSE, _T("Writer"));
mutex.Lock();
try
{
    CopyMemory(pViewMMFFile, pBinary, lPageSize); // write
    FlushViewOfFile(pViewMMFFile, lPageSize);
    // first 100 bytes flushed to file.
    // how to move to the next location and write the next 900 bytes? <-- ??
}
catch (CException e)
{
    ...
}
Please share if you have any suggestions.
Thanks in advance,
haranadh
Repeat your call to MapViewOfFile with a different range.
As described in the following link:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366761(v=VS.85).aspx
please check "allocation granularity"; I think you should use this parameter to set the values for dwFileOffsetLow and dwFileOffsetHigh.
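Putting both answers together, a minimal sketch of writing the whole buffer through successive views; totalSize and pBinary are assumptions, and each view offset must be a multiple of the system allocation granularity. Note that the file mapping itself must also be created large enough (the maximum size passed to CreateFileMapping must cover totalSize), otherwise the later views cannot be mapped.

SYSTEM_INFO si;
GetSystemInfo(&si);
DWORD gran = si.dwAllocationGranularity;          // typically 64 KB

// Write 'totalSize' bytes from 'pBinary' through one view per chunk.
for (DWORD offset = 0; offset < totalSize; offset += gran)
{
    DWORD chunk = (totalSize - offset < gran) ? (totalSize - offset) : gran;
    LPVOID view = MapViewOfFile(hFileMMF, FILE_MAP_WRITE,
                                0,                // offset, high 32 bits
                                offset,           // offset, low 32 bits (multiple of gran)
                                chunk);
    if (view == NULL)
        break;                                    // check GetLastError()
    CopyMemory(view, pBinary + offset, chunk);
    FlushViewOfFile(view, chunk);
    UnmapViewOfFile(view);
}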
This code is for an HTTPS server using blocking sockets:
request := '';
start := gettickcount;
repeat
  if SSL_pending(ssl) > 0 then
  begin
    bytesin := SSL_read(ssl, buffer, sizeof(buffer)-1);
    if bytesin > 0 then
    begin
      buffer[bytesin] := #0;
      request := request + buffer;
    end
    else break; // read failed
  end; // pending
until (gettickcount - start) > LARGETIMEOUT;
// "request" is ready, though possibly empty
SSL_pending() always returns zero and the SSL_read() is never reached. If the SSL_pending() call is removed, SSL_read() is executed. Why doesn't SSL_pending() indicate how many bytes are available?
Note that if you call SSL_read() and the number of bytes returned is less than your buffer size, you've read everything and are done.
If the incoming data is larger than your buffer size, the first SSL_read() call fills the buffer, and you can repeat calling SSL_read() until you can't fill the buffer.
BUT if the incoming data is an exact multiple of your buffer size, the last chunk of data fills the buffer. If you attempt another SSL_read() thinking there might be more data on a blocking socket, it hangs indefinitely. Hence the desire to check SSL_pending() first. Yet that doesn't appear to work.
How do you avoid hanging on a final SSL_read()? (I can't imagine the answer is to go non-blocking, since that means you could never use SSL_read with blocking.)
UPDATE: The following works. Apparently SSL_pending() doesn't work until after the first SSL_read():
request := '';
repeat
  bytesin := SSL_read(ssl, buffer, sizeof(buffer)-1);
  if bytesin > 0 then
  begin
    buffer[bytesin] := #0;
    request := request + buffer;
  end
  else break; // read failed
until SSL_pending(ssl) <= 0;
// "request" is ready, though possibly empty
You are using SSL_pending() in completely the wrong way. OpenSSL uses a state machine, and SSL_pending() indicates whether the state machine has any pending bytes that have already been buffered and are awaiting processing. Since you never call SSL_read(), you never buffer any data or advance the state machine.
If the SSL_pending function returns a return code of 0, it does not necessarily mean that there is no data immediately available for reading on the SSL session. A return code of 0 indicates that there is no more data in the current SSL data record. However, more SSL data records may have been received from the network already. If the SSL_pending function returns a return code of 0, issue the select function, passing the file descriptor of the socket to check if the socket is readable. Readable means more data has been received from the network on the socket.
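A rough C++ sketch of that pattern with OpenSSL (sock, ssl and the 5-second timeout are assumptions; error handling trimmed): drain the current record with SSL_read() while SSL_pending() is positive, and when it drops to 0 use select() on the underlying socket to decide whether another blocking SSL_read() is safe.

char buf[4096];
for (;;)
{
    if (SSL_pending(ssl) > 0) {
        int n = SSL_read(ssl, buf, sizeof(buf));   // drain the buffered record
        if (n <= 0) break;                         // error or close_notify
        /* append n bytes of buf to the request */
        continue;
    }
    // Nothing buffered: ask the socket whether a new record has arrived.
    fd_set rd;
    FD_ZERO(&rd);
    FD_SET(sock, &rd);
    timeval tv = { 5, 0 };                         // 5-second timeout (assumption)
    if (select((int)sock + 1, &rd, NULL, NULL, &tv) <= 0)
        break;                                     // timeout or error: stop reading
    int n = SSL_read(ssl, buf, sizeof(buf));       // socket is readable, so this won't hang
    if (n <= 0) break;
    /* append n bytes of buf to the request */
}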