I am trying to understand the difference between the file descriptor returned by socket() and the one returned by accept().
If I call read(fd, buffer, buffersize) on each of these two descriptors, what will I get?
For a server, the descriptor returned by socket() represents the local socket that is listening for clients, whereas the descriptor returned by accept() represents a specific client that is connected to the listening socket. You cannot read/write using the listening descriptor, you must use the client descriptor instead.
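A minimal sketch of that distinction, written in Python with the standard socket module rather than C. The function name demo_listen_vs_accept is made up for illustration. It assumes a loopback interface is available, and the exact error raised when reading from the listening socket is platform dependent (ENOTCONN on Linux), so the sketch only records that some OSError occurred.

```python
import socket

def demo_listen_vs_accept():
    # Listening socket: the descriptor from socket(), after bind() and listen().
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    # A client connects and sends a few bytes.
    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(b"hello")

    # accept() returns a NEW socket that represents this one client.
    conn, _addr = listener.accept()
    data = conn.recv(1024)

    # Reading from the listening socket itself does not yield client data.
    listener.settimeout(0.5)  # guard in case a platform blocks instead of failing
    listener_error = None
    try:
        listener.recv(1024)
    except OSError as exc:
        listener_error = exc

    for s in (conn, client, listener):
        s.close()
    return data, listener_error

if __name__ == "__main__":
    print(demo_listen_vs_accept())
```

The client's bytes arrive only on the socket returned by accept(); the listening socket is only good for accepting further connections.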
I'm working with Gorilla Websocket and curious about how the WriteMessage and ReadMessage functions work.
Does the WriteMessage function send the byte data to the client synchronously? Or does ReadMessage actively fetch the data from the server? (According to the documentation, we need to create an event loop to call the ReadMessage function.)
What happens if the server keeps calling WriteMessage but no one reads the messages (the client calls ReadMessage in an event loop)? Is the data lost, or is it kept until the next read request comes? Thank you.
Does the WriteMessage function send the bytes data to the client synchronously?
WriteMessage writes the data to the underlying network connection.
The operating system network connection maintains a buffer of data to transmit to the peer. Data is removed from the buffer when the peer acknowledges that the peer received data.
Write to the operating system network connection returns after all of the application data is added to the buffer. Write can block waiting for space in the buffer.
It is almost always the case that the application write call returns before the peer receives the data. A successful call to WriteMessage does not imply that the peer application read the data.
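The buffering behavior can be observed with plain operating system sockets. The sketch below is Python on a local socket pair, not Gorilla's API, and the function name demo_write_returns_before_read is hypothetical. It assumes socket.socketpair() is available, which gives a connected pair of stream sockets that buffer in the kernel much as a TCP connection does.

```python
import socket

def demo_write_returns_before_read():
    a, b = socket.socketpair()
    try:
        # sendall() returns once the bytes sit in the kernel buffer.
        a.sendall(b"message-1")
        write_returned = True      # nobody has called recv() on b yet

        # The data was held in the buffer until the peer got around to reading.
        received = b.recv(1024)
        return write_returned, received
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    print(demo_write_returns_before_read())
```

The write completes while the reader has not read anything, so a successful write only tells you the data was handed to the operating system.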
Or does ReadMessage actively fetch the data from the server? (According to the documentation, we need to create an event loop to call the ReadMessage function.)
ReadMessage calls read on the underlying network connection.
The operating system buffers some amount of data received from the peer.
Read on the operating system network connection blocks until data is available in the buffer.
What happens if the server keeps calling WriteMessage, but no one reads the messages?
WriteMessage will eventually block waiting for space in the operating system transmit buffer.
Use a write deadline to protect against blocking forever on a dead or stuck peer.
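To illustrate a writer stalling on a peer that never reads, here is a rough Python sketch on a local socket pair. It is not Gorilla code: settimeout() stands in for the write deadline (Conn.SetWriteDeadline in Gorilla), and fill_until_deadline is a hypothetical helper name. The buffer size, and therefore the number of bytes accepted before the stall, depends on the operating system.

```python
import socket

def fill_until_deadline(timeout=0.2, chunk=b"x" * 4096):
    writer, reader = socket.socketpair()   # reader never calls recv()
    writer.settimeout(timeout)             # plays the role of a write deadline
    sent = 0
    timed_out = False
    try:
        while True:
            # Succeeds while there is room in the kernel buffers,
            # then blocks until the timeout expires.
            sent += writer.send(chunk)
    except socket.timeout:
        timed_out = True
    finally:
        writer.close()
        reader.close()
    return sent, timed_out

if __name__ == "__main__":
    print(fill_until_deadline())
```

Without the timeout, the loop would block forever on the first send that finds the buffer full.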
Is the data lost, or is it kept until the next read request comes?
The data is held in operating system transmit and receive buffers.
Application write to the websocket connection blocks when the transmit buffer is full.
The data is only lost if the peer application terminates before the peer application reads the data.
You can find source code for that function here: https://github.com/gorilla/websocket/blob/c3dd95aea9779669bb3daafbd84ee0530c8ce1c1/conn.go#L751-L774
It looks like this is a blocking/synchronous method.
Tracing through the code, they create the writer here: https://github.com/gorilla/websocket/blob/c3dd95aea9779669bb3daafbd84ee0530c8ce1c1/conn.go#L766
w, err := c.NextWriter(messageType)
Then they are writing the data:
if _, err = w.Write(data); err != nil {
return err
}
And this is blocking because they close the message writer (w.Close()) on the last line of that function, so the write must be done at that point.
This is the behavior of the io.WriteCloser interface returned in the w variable.
What happens if the server keeps calling WriteMessage but no one reads the messages (the client calls ReadMessage in an event loop)? Is the data lost, or is it kept until the next read request comes? Thank you.
You should set write and read timeouts.
The library does not resend data for you. You need to implement that logic in your application.
If the server is up and has accepted your connection, it will (probably) read your message, unless it stops before processing your data.
If you sent a message and the server was dead (it did not receive your message), your data is lost.
Additional reference:
The w.Write function: https://github.com/gorilla/websocket/blob/c3dd95aea9779669bb3daafbd84ee0530c8ce1c1/conn.go#L650-L675
The io.WriteCloser interface desc: https://golang.org/pkg/io/#WriteCloser
Gorilla Websocket timeouts: https://pkg.go.dev/github.com/gorilla/websocket#Conn.SetReadDeadline
Timeouts documentation for Gorilla: https://pkg.go.dev/github.com/gorilla/websocket#Conn.SetReadDeadline
I am experimenting with ZeroMQ, and I found it really interesting that it does not matter whether connect or bind happens first. I tried looking into the ZeroMQ source code, but it was too big for me to find anything.
The code is as follows.
# client side
import zmq
ctx = zmq.Context()
socket = ctx.socket(zmq.PAIR)
socket.connect('tcp://*:2345') # line [1]
# make it wait here
# server side
import zmq
ctx = zmq.Context()
socket = ctx.socket(zmq.PAIR)
socket.bind('tcp://localhost:2345')
# make it wait here
If I start client side first, the server has not been started yet, but magically the code is not blocked at line [1]. At this point, I checked with ss and made sure that the client is not listening on any port. Nor does it have any open connection. Then I start the server. Now the server is listening on port 2345, and magically the client is connected to it. My question is how does the client know the server is now online?
The best place to ask your question is the ZMQ mailing list, as many of the developers (and founders!) of the library are active there and can answer your question directly, but I'll give it a try. I'll admit that I'm not a C developer so my understanding of the source is limited, but here's what I gather, mostly from src/tcp_connector.cpp (other transports are covered in their respective files and may behave differently).
Line 214 starts the open() method, and here looks to be the meat of what's going on.
To answer your question about why the code is not blocked at Line [1], see line 258. It's specifically calling a method to make the socket behave asynchronously (for specifics on how unblock_socket() works you'll have to talk to someone more versed in C, it's defined here).
On line 278, it attempts to make the connection to the remote peer. If it's successful immediately, you're good, the bound socket was there and we've connected. If it wasn't, on line 294 it sets the error code to EINPROGRESS and fails.
To see what happens then, we go back to the start_connecting() method on line 161. This is where the open() method is called from, and where the EINPROGRESS error is used. My best understanding of what's happening here is that if at first it does not succeed, it tries again, asynchronously, until it finds its peer.
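The "try again in the background until the peer shows up" idea can be sketched with plain Python sockets. This is only an illustration of the general pattern, not ZeroMQ's actual implementation: the names connect_with_retry and demo_connect_before_bind, the retry interval, and the attempt limit are all made up for the example.

```python
import socket
import threading
import time

def connect_with_retry(address, result, interval=0.05, max_attempts=200):
    """Keep trying to connect until a listener appears, then send one message."""
    for attempt in range(1, max_attempts + 1):
        try:
            sock = socket.create_connection(address, timeout=1)
        except OSError:
            time.sleep(interval)       # nobody is listening yet; retry shortly
            continue
        with sock:
            if sock.getsockname() == sock.getpeername():
                continue               # rare loopback self-connect; not a real peer
            sock.sendall(b"hello from client")
        result["attempts"] = attempt
        return

def demo_connect_before_bind():
    # Reserve a port number, then release it so nothing is listening on it.
    probe = socket.socket()
    probe.bind(("127.0.0.1", 0))
    address = probe.getsockname()
    probe.close()

    result = {}
    client = threading.Thread(target=connect_with_retry, args=(address, result))
    client.start()                     # the "connect" side starts first
    time.sleep(0.3)                    # the server is still offline here

    server = socket.socket()
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(address)
    server.listen(1)
    server.settimeout(5)
    conn, _ = server.accept()
    with conn:
        message = conn.recv(1024)
    server.close()
    client.join()
    return message, result.get("attempts", 0)

if __name__ == "__main__":
    print(demo_connect_before_bind())
```

The caller is never blocked by the failed attempts because they happen on another thread, which is roughly what the user of a ZeroMQ socket observes.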
I think the best answer is in the ZeroMQ wiki:
When should I use bind and when connect?
As very general advice: use bind on the most stable points in your architecture and connect from the more volatile endpoints. For request/reply the service provider might be the point where you bind and the client uses connect. Like plain old TCP.
If you can't figure out which parts are more stable (i.e. peer-to-peer) think about a stable device in the middle, where both sides can connect to.
The question of bind or connect is often overemphasized. It's really just a matter of what the endpoints do and if they live long — or not. And this depends on your architecture. So build your architecture to fit your problem, not to fit the tool.
And
Why do I see different behavior when I bind a socket versus connect a socket?
ZeroMQ creates queues per underlying connection, e.g. if your socket is connected to 3 peer sockets there are 3 message queues.
With bind, you allow peers to connect to you, thus you don't know how many peers there will be in the future and you cannot create the queues in advance. Instead, queues are created as individual peers connect to the bound socket.
With connect, ZeroMQ knows that there's going to be at least a single peer and thus it can create a single queue immediately. This applies to all socket types except ROUTER, where queues are only created after the peer we connect to has acknowledged our connection.
Consequently, when sending a message to bound socket with no peers, or a ROUTER with no live connections, there's no queue to store the message to.
When you call socket.connect('tcp://*:2345') or socket.bind('tcp://localhost:2345') you are not calling these methods directly on an underlying TCP socket. All of ZMQ's IO - including connecting/binding underlying TCP sockets - happens in threads that are abstracted away from the user.
When these methods are called on a ZMQ socket it essentially queues these events within the IO threads. Once the IO threads begin to process them they will not return an error unless the event is truly impossible, otherwise they will continually attempt to connect/reconnect.
This means that a ZMQ socket call may return without an error even if socket.connect is not successful. In your example it would likely fail without error but then quickly reattempt and succeed once you run the server side of the script.
It may also allow you to send messages while in this state (depending on the state of the queue in this situation, rather than the state of the network) and will then attempt to transmit queued messages once the IO threads are able to successfully connect. This also applies if a working TCP connection is later lost. The queues may continue to accept messages for the unconnected socket while IO attempts to automatically resolve the lost connection in the background. If the endpoint takes a while to come back online it should still receive its messages.
To better explain, here's another example:
<?php
$pid = pcntl_fork();
if($pid)
{
    $context = new ZMQContext();
    $client = new ZMQSocket($context, ZMQ::SOCKET_REQ);
    try
    {
        $client->connect("tcp://0.0.0.0:9000");
    }catch (ZMQSocketException $e)
    {
        var_dump($e);
    }
    $client->send("request");
    $msg = $client->recv();
    var_dump($msg);
}else
{
    // in spawned process
    echo "waiting 2 seconds\n";
    sleep(2);
    $context = new ZMQContext();
    $server = new ZMQSocket($context, ZMQ::SOCKET_REP);
    try
    {
        $server->bind("tcp://0.0.0.0:9000");
    }catch (ZMQSocketException $e)
    {
        var_dump($e);
    }
    $msg = $server->recv();
    $server->send("response");
    var_dump($msg);
}
The binding process will not begin until 2 seconds later than the connecting process. But once the child process wakes and successfully binds the req/rep transaction will successfully take place without error.
jason@jason-VirtualBox:~/php-dev$ php play.php
waiting 2 seconds
string(7) "request"
string(8) "response"
If I were to replace tcp://0.0.0.0:9000 on the binding socket with tcp://0.0.0.0:2345, it would hang because the client is trying to connect to tcp://0.0.0.0:9000, yet still without error.
But if I replace both with tcp://localhost:2345, I get an error on my system because it can't bind on localhost, making the call truly impossible.
object(ZMQSocketException)#3 (7) {
["message":protected]=>
string(38) "Failed to bind the ZMQ: No such device"
["string":"Exception":private]=>
string(0) ""
["code":protected]=>
int(19)
["file":protected]=>
string(28) "/home/jason/php-dev/play.php"
["line":protected]=>
int(40)
["trace":"Exception":private]=>
array(1) {
[0]=>
array(6) {
["file"]=>
string(28) "/home/jason/php-dev/play.php"
["line"]=>
int(40)
["function"]=>
string(4) "bind"
["class"]=>
string(9) "ZMQSocket"
["type"]=>
string(2) "->"
["args"]=>
array(1) {
[0]=>
string(20) "tcp://localhost:2345"
}
}
}
["previous":"Exception":private]=>
NULL
}
If you need real-time information about the state of the underlying sockets, you should look into socket monitors. Using socket monitors along with ZMQ poll allows you to poll for both socket events and queue events.
Keep in mind that polling a monitor socket using ZMQ poll is not like polling a ZMQ_FD resource via select, epoll, etc. The ZMQ_FD is edge-triggered and therefore doesn't behave the way you would expect when polling network resources, whereas a monitor socket within ZMQ poll is level-triggered. Also, monitor sockets are very lightweight, and the latency between the system event and the resulting monitor event is typically sub-microsecond.
I have a server application that a telnet client connects to (i.e. telnet localhost _port_num, where the port number is the one the server application listens on).
My application works correctly, but I use recv as follows:
#define BUFLEN 512
char buf[BUFLEN];
iResult = recv(sd, (char *)buf, BUFLEN, 0);
Here the recv call returns as soon as any key is pressed in the connected telnet terminal, and most of the time iResult is 1 or sometimes 2. Even though I haven't pressed Enter, the telnet client sends a frame containing a single character to the server application.
How can I make sure that recv returns only after BUFLEN bytes have been read?
On Linux recv works as expected: it blocks until Enter is pressed.
Any help or pointers are greatly appreciated.
Q: How can I make sure that ... BUFLEN read ?
A: You read in a loop until you get all the characters you expect. Or until you get a timeout, or an error.
You need to call the recv function again and again until the desired amount of data has been received. Please note that with TCP sockets you cannot be sure of receiving all the data in a single receive call. Even if you send the data with a single TCP send() call, it is quite possible that you will receive it across multiple receives, because TCP sockets are stream sockets.
The recv() function returns the number of bytes received, so you can keep calling the function until you get all the bytes.
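The loop these answers describe might look like the following. It is a sketch in Python rather than the C/Winsock code from the question, and recv_exact and demo_recv_exact are hypothetical helper names. It assumes a connected stream socket and treats an early close by the peer as an error.

```python
import socket

def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)   # may return fewer bytes than requested
        if not chunk:                  # empty result means the peer closed
            raise ConnectionError(f"peer closed with {remaining} bytes still expected")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def demo_recv_exact():
    a, b = socket.socketpair()
    try:
        # The sender writes 512 bytes in three separate calls.
        a.sendall(b"a" * 100)
        a.sendall(b"b" * 400)
        a.sendall(b"c" * 12)
        return recv_exact(b, 512)
    finally:
        a.close()
        b.close()

if __name__ == "__main__":
    print(len(demo_recv_exact()))
```

A timeout set on the socket would make recv raise instead of blocking indefinitely, which covers the "until you get a timeout" case mentioned above.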
How can I find out whether TCP connection was torn down by the peer (by sending RST packet or similar) using Windows IOCP API? Specifically, I can't send or receive any data -- there's no overlapped operation going on. I just want to get an asynchronous notification. Is there a way to do that?
You need to have a read or write pending to detect connection closure. Either will return as Remy suggests on RST but with a pending read you'll also get notification of when the remote side closes the send side of its connection.
I suggest you always keep an overlapped read pending. If you don't want to tie up memory, you can make this a zero-byte read.
Your IOCP completion handler will be notified whether a socket operation succeeds or fails. The parameters tell you which is the case.
If you are using GetQueuedCompletionStatus(), it will return FALSE if any failure occurred. If it was a socket failure, *lpOverlapped will be set to the non-NULL pointer value of the OVERLAPPED operation that failed. If GetQueuedCompletionStatus() itself failed, *lpOverlapped will be set to NULL. If the peer disconnects gracefully, it will return TRUE and set *lpNumberOfBytes to 0 instead.
If you are using WSAGetOverlappedResult(), it will return FALSE if any failure occurs. Use WSAGetLastError() to determine if it was a socket failure or not. If the peer disconnects gracefully, it will return TRUE and set *lpcbTransfer to 0 instead.
So, I'm writing a small multiplayer game with blocking UDP and IO.select. My problem: in the server, reading from a UDP socket (packet, sender = @socket.recvfrom(1000)) that has just sent a packet to a dead client results in an ICMP unreachable (and the exception Errno::ECONNRESET in Ruby). I can't find any way to extract the IP address from that ICMP message so that I can clean out the dead client.
Anyone know how to achieve this?
thanks
You'll need to call recvmsg for the socket and pass MSG_ERRQUEUE as the flag (on Linux, the error queue is only populated once the IP_RECVERR socket option has been enabled).
The original destination address of the datagram that caused the error is supplied via msg_name.
It's worth noting that the source IP address of the ICMP packet will not always be the same address as your client. Any router that handles packets for this connection could be the source, and the payload of the ICMP packet would contain the IP header + the first 8 bytes of the packet it relates to.