UDP server-to-client communication - UDP being stateless, how to get responses through the client's router?

In a recent series of questions I have asked a lot about UDP, boost::asio and C++ in general.
My latest question, which doesn't seem to have an answer here at Stack Overflow, is this:
In a client/server application, it is quite okay to require that the server open a port in any firewall, so that messages are allowed in. However, requiring the same of clients is definitely not a great user experience.
TCP connections typically get around this because most routers support stateful packet inspection, allowing response packets through if the original request originated from the local host.
It is not quite clear to me how this would work with UDP, since UDP is stateless, and there is no such thing as "response packets" (to my knowledge). How should I account for this in my client application?
Thanks for any answers!

UDP itself is stateless, but the firewall typically is not. The convention on UDP is that if a request goes out from client:port_A to server:port_B, then the response will come back from server:port_B to client:port_A.
The firewall can take advantage of this. If it sees a UDP request go out from the client, it adds an entry to its state table that lets it recognise the response(s), to allow them in. Because UDP is stateless and has no indication of connection termination, the firewall will typically implement a timeout - if no traffic occurs between that UDP address pair for a certain amount of time, the association in the firewall's state table is removed.
So - to take advantage of this in your client application, simply ensure that your server sends responses back from the same port that it uses to receive the requests.
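For example, here's a minimal sketch in Python (the port number 9999 is just a stand-in) of a server that replies from the very socket, and therefore the very port, on which the request arrived:

    import socket

    # Bind one socket and use it for both receiving and replying, so the
    # reply's source port matches the firewall's state-table entry.
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("0.0.0.0", 9999))

    while True:
        data, client_addr = server.recvfrom(4096)   # client_addr is (ip, port)
        # Replying via the same socket means the response goes out from
        # port 9999 - exactly what the client's firewall expects back.
        server.sendto(b"response to: " + data, client_addr)

Because of the timeout mentioned above, a client that expects unsolicited server pushes over a long period would also need to re-send something periodically to keep the firewall's association alive.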

Related

ZeroMQ, async blocking sockets

I'm building a distributed system and I would like asynchronous send and recv from both sides, with blocking after the high-water mark.
PUSH/PULL sockets work great, but I wasn't able to bind a PUSH socket. That means I can't have a client-PUSH to server-PULL and a server-PUSH to client-PULL if the client is behind a firewall, since the server can't connect to the client.
In the book, the following is written, but I can't find an example of it.
"REQ to DEALER: you could in theory do this, but it would break if you added a second REQ because DEALER has no way of sending a reply to the original peer. Thus the REQ socket would get confused, and/or return messages meant for another client." http://zguide.zeromq.org/php:chapter3
I only need a one-to-one connection, so this would in theory work for me.
My question is, what is the best practice to obtain asynchronous send and recv with ZeroMQ without dropping packets?
Most ZeroMQ sockets can both bind (listen on a specific port, acting as a server) and connect (acting as a client). Which side binds is usually unrelated to the direction of data flow. See the guide for more info.
Try binding your server's PUSH socket and connecting to it from your client's PULL socket, as sketched below.
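For illustration, a minimal pyzmq sketch (ports 5557 and 5558 are arbitrary stand-ins) with both sockets bound on the server, so the client behind the firewall only ever makes outbound connections:

    import zmq

    ctx = zmq.Context()

    # --- server side: bind both sockets ---
    server_pull = ctx.socket(zmq.PULL)            # receives client -> server
    server_pull.bind("tcp://*:5557")
    server_push = ctx.socket(zmq.PUSH)            # sends server -> client
    server_push.bind("tcp://*:5558")

    # --- client side: connect both sockets (outbound only) ---
    client_push = ctx.socket(zmq.PUSH)
    client_push.connect("tcp://localhost:5557")
    client_pull = ctx.socket(zmq.PULL)
    client_pull.connect("tcp://localhost:5558")

    client_push.send(b"hello")
    print(server_pull.recv())                     # b'hello'
    server_push.send(b"world")
    print(client_pull.recv())                     # b'world'

With default settings a PUSH socket's send() blocks once the high-water mark is reached, which gives you the back-pressure you asked about without dropping messages.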

SNMP traffic captured by Wireshark, but source port and destination port are the same

As far as I know, in normal behavior a client will use a random source port to connect to the SNMP service port; here, however, the client uses the SNMP trap port (162) to communicate with the server.
My questions:
As for the client's settings: I have not configured any SNMP settings on the client, so why is Wireshark able to capture SNMP traffic coming from it?
Why does the client use the SNMP trap port (162) to communicate with the server, rather than a random port?
SNMP is usually transmitted over UDP, so there is actually no "connection", and speaking technically the source port doesn't matter. You can just send out datagrams (e.g. traps) without binding to a port.
However, even when run over UDP, SNMP does involve some two-way communication. If you are expecting a response (which a client does if it's sending an SNMP Get or Set request), the only place the other end knows to send it is back where the request came from, i.e. the original IP/port combination. There's no information in an SNMP packet that provides any alternative "return address".
So, in order to get a response on a predictable port, you'll send the request datagram from a bound socket. Typically the client will run its own listening "server" on port 162, send requests from there, and then it can receive responses there too. Otherwise you wouldn't see the responses. This also allows us to set up simple firewall rules (though you can often get away without firewall rules for the return path, due to hole punching*).
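To illustrate the client side, here's a rough Python sketch; the PDU bytes and addresses are placeholders (a real request would need a properly BER-encoded PDU, e.g. from a library like pysnmp), and port 10162 stands in for 162, which usually requires elevated privileges:

    import socket

    # Bind first, so responses (and firewall rules) have a predictable port.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 10162))
    sock.settimeout(5.0)

    request = b"...BER-encoded SNMP GetRequest..."   # placeholder, not a real PDU
    sock.sendto(request, ("192.0.2.1", 161))         # agent's standard port

    # The response comes back to our bound port, on this same socket.
    response, agent_addr = sock.recvfrom(65535)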
This is also true for the server, which emits traps and informs on a known, standard, predictable port not only so that you can configure your trap receiver and firewalls in a reliable way, but so that inform responses can be sent back to a known, standard, predictable port that you're listening on.
tl;dr: You can send your requests from an arbitrary port if you like, but it's not very useful.
* My SNMP implementation seemed buggy: the client/receiver only saw traps emitted during the ~15 minutes after it had last sent out some kind of request packet, and subsequent traps went completely missing. After much debugging on the server end, it turned out that we'd forgotten to open the correct inbound firewall port for the client, and were accidentally relying on hole punching, which has a time limit. :D
As for why Wireshark is seeing traffic from an unconfigured SNMP client, well, either your SNMP client actually is configured to send requests, or you're misinterpreting the results. Wireshark doesn't invent traffic. Without a more complete picture of your network setup, software setup, and those packets you're seeing, we could only speculate as to the exact cause of your confusion.

Why does HTTP/2 do multiplexing although TCP does the same thing?

As far as I know, TCP breaks a message down into segments. So why does HTTP/2 multiplex again? What are the benefits of multiplexing twice?
TCP isn’t multiplexed. TCP is just a guaranteed messaging stream (i.e. missing packets are re-requested and the TCP stream is basically temporarily blocked while this happens).
TCP, as a packet-based protocol, can be used for multiplexed connections if the higher-level application protocol (e.g. HTTP) allows multiple messages to be sent at once. Unfortunately HTTP/1.1 does not allow this: once an HTTP/1.1 message is sent, no other message can be sent on that connection until that message is returned in full (ignoring the badly supported pipelining concept). This means HTTP/1.1 is basically synchronous and, if the full bandwidth is not used while other HTTP messages are queued, it wastes capacity that could be used on the underlying TCP connection.
To get around this, more TCP connections can be opened, which basically allows HTTP/1.1 to act like a (limited) multiplexed protocol. If the network bandwidth were fully utilised, those extra connections would add no benefit; it is the fact that there is spare capacity on the under-utilised TCP connections that makes this worthwhile.
So HTTP/2 adds multiplexing to the protocol to allow a single TCP connection to be used for multiple in flight HTTP requests.
It does this by changing the text-based HTTP/1.1 protocol to a binary, packet-based protocol. These may look like TCP packets but that’s not really relevant (in the same way that saying TCP is similar to IP because it’s packet based is not relevant). Splitting messages into packets is really the only way of allowing multiple messages to be in flight at the same time.
HTTP/2 also adds the concept of streams so that packets can belong to different requests - TCP has no such concept - and this is what really makes HTTP/2 multiplexed.
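To make that concrete, here's a small Python sketch of the 9-byte frame header that HTTP/2 puts in front of every frame (RFC 7540, section 4.1); the 31-bit stream identifier at the end is what lets frames belonging to different requests be interleaved on one connection:

    import struct

    def frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
        """Pack an HTTP/2 frame header: 24-bit length, type, flags, stream id."""
        return struct.pack(">BHBBI",
                           (length >> 16) & 0xFF,    # length, high 8 bits
                           length & 0xFFFF,          # length, low 16 bits
                           frame_type,               # e.g. 0x0 = DATA, 0x1 = HEADERS
                           flags,
                           stream_id & 0x7FFFFFFF)   # high bit is reserved

    # Two DATA frames for two different streams can sit back to back on the wire:
    wire = (frame_header(5, 0x0, 0x0, 1) + b"req-1" +
            frame_header(5, 0x0, 0x0, 3) + b"req-2")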
In fact, because TCP doesn't allow separate, independent streams (i.e. multiplexing), and because delivery is guaranteed, a single dropped TCP packet holds up all the HTTP/2 streams on that connection, even though only one stream is really affected and the other streams should be able to carry on regardless. This can even make HTTP/2 slower in certain conditions. Google is experimenting with moving away from TCP to QUIC to address this.
More details on what multiplexing means under HTTP/2 (and why it is a good improvement!) in my answer here: What does multiplexing mean in HTTP/2
TCP doesn't do multiplexing. TCP segmentation just means that the (single) stream's data is chopped up into pieces that can be sent in IP packets. Each TCP segment is identified only by a stream offset (sequence number), not by anything that could identify separate streams. (We'll ignore the rarely useful Urgent Pointer.)
So to do multiplexing, you need to put something on top of TCP. Which HTTP/2 does.
HTTP & HTTP/2 are both application level protocols that must utilize a lower level protocol like TCP to actually talk on the Internet. The protocol of the Internet is generally TCP over IP over Ethernet.
The stack looks like this: HTTP on top of TCP, TCP on top of IP, IP on top of Ethernet.
As you can see, HTTP sits above TCP, and below TCP is IP, one of the main protocols of the Internet. IP itself deals with packets, which are switched/multiplexed, and I think that's where you might be getting the idea that TCP is multiplexed; it's not. Think of a TCP connection as a single-lane road tunnel where no one can pass, with one lane in each direction: you put data in one end, and it comes out the other in the same order it went in. That is TCP, and there is no multiplexing in it. However, TCP does provide a reliable connection protocol on which other protocols, such as HTTP, may be built, and reliability is essential for HTTP.
HTTP/1.1 is simply a request-response protocol, but as you know, it is not multiplexed: it allows only one outstanding request at a time and has to send each response in full before the next. Previously, browsers got around that limitation by creating multiple TCP connections (tunnels) to the server with which to make more requests.
HTTP/2 splits the data up again and multiplexes it over the one connection, so that no further connections need to be created. The server can start servicing multiple requests and multiplex the responses, so the browser can receive images, pages and other resources at the same time rather than one at a time.
Hope that makes it clear.

Shall I use WebSocket on ports other than 80?

Shall I use WebSocket on non-80 ports? Does it ruin the whole purpose of using existing web/HTTP infrastructures? And I think the name WebSocket no longer fits on non-80 ports.
If I use WebSocket over other ports, why not just use TCP directly? Or are there any special benefits in the WebSocket protocol itself?
And since the current WebSocket handshake takes the form of an HTTP Upgrade request, does it mean I have to enable the HTTP protocol on the port so that the WebSocket handshake can be accomplished?
Shall I use WebSocket on non-80 ports? Does it ruin the whole purpose of using existing web/HTTP infrastructures? And I think the name WebSocket no longer fits on non-80 ports.
You can run a webSocket server on any port that your host OS allows and that your client will be allowed to connect to.
However, there are a number of advantages to running it on port 80 (or 443).
Networking infrastructure is generally already deployed and open on port 80 for outbound connections from the places that clients live (like desktop computers, mobile devices, etc...) to the places that servers live (like data centers). So, new holes in the firewall or router configurations, etc... are usually not required in order to deploy a webSocket app on port 80. Configuration changes may be required to run on different ports. For example, many large corporate networks are very picky about what ports outbound connections can be made on and are configured only for certain standard and expected behaviors. Picking a non-standard port for a webSocket connection may not be allowed from some corporate networks. This is the BIG reason to use port 80 (maximum interoperability from private networks that have locked down configurations).
Many webSocket apps running from the browser wish to leverage existing security/login/auth infrastructure already being used on port 80 for the host web page. Using that exact same infrastructure to check authentication of a webSocket connection may be simpler if everything is on the same port.
Some server infrastructures for webSockets (such as socket.io in node.js) use a combined server infrastructure (single process, one listener) to support both HTTP requests and webSockets. This is simpler if both are on the same port.
If I use WebSocket over other ports, why not just use TCP directly? Or are there any special benefits in the WebSocket protocol itself?
The webSocket protocol was originally defined to work from a browser to a server. There is no generic TCP access from a browser so if you want a persistent socket without custom browser add-ons, then a webSocket is what is offered. As compared to a plain TCP connection, the webSocket protocol offers the ability to leverage HTTP authentication and cookies, a standard way of doing app-level and end-to-end keep-alive ping/pong (TCP offers hop-level keep-alive, but not end-to-end), a built in framing protocol (you'd have to design your own packet formats in TCP) and a lot of libraries that support these higher level features. Basically, webSocket works at a higher level than TCP (using TCP under the covers) and offers more built-in features that most people find useful. For example, if using TCP, one of the first things you have to do is get or design a protocol (a means of expressing your data). This is already built-in with webSocket.
And since the current WebSocket handshake takes the form of an HTTP Upgrade request, does it mean I have to enable the HTTP protocol on the port so that the WebSocket handshake can be accomplished?
You MUST have an HTTP server running on the port that you wish to use webSocket on, because all webSocket requests start with an HTTP request. It wouldn't have to be a fully featured HTTP server, but it does have to handle the initial HTTP request.
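To illustrate how little "HTTP server" is actually required, here's a Python sketch of the one non-trivial piece of the handshake, computing the Sec-WebSocket-Accept value as defined in RFC 6455:

    import base64
    import hashlib

    # Fixed GUID from RFC 6455; every server appends it to the client's key.
    WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

    def accept_key(client_key: str) -> str:
        """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
        digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
        return base64.b64encode(digest).decode("ascii")

    # Sample value straight from RFC 6455:
    print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))   # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

The server just has to parse the initial GET request, send back a "101 Switching Protocols" response containing that header, and from then on the connection speaks the WebSocket framing protocol rather than HTTP.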
Yes - use 443 (i.e. the HTTPS port) instead.
There's little reason these days to use port 80 (HTTP) for anything other than a redirect to port 443 (HTTPS), as certificates (via services like Let's Encrypt) are easy and free to set up.
The only possible exceptions to this rule are local development, and non-internet facing services.
Should I use a non-standard port?
I suspect this is the intent of your question. To this, I'd argue that doing so adds an unnecessary layer of complication with no obvious benefits. It doesn't add security, and it doesn't make anything easier.
But it does mean that specific firewall exceptions need to be made to host and connect to your websocket server. This means that people accessing your services from a corporate/school/locked down environment are probably not going to be able to use it, unless they can somehow convince management that it is mandatory. I doubt there are many good reasons to exclude your userbase in this way.
But there's nothing stopping you from doing it either...
In my opinion, yes, you can. 80 is the default port, but you can change it to any port you like.

In Windows, how do I find out which process is on the other end of a local network socket?

That is to say, if I have a server listening on 127.0.0.1, and a TCP connection comes in, how can I determine the process id of the client?
Also if there isn't an API for this, where would I be able to extract the information from in a more hackish manner?
(The purpose of this is to modify a local HTTP proxy server to accept or deny requests based on the requesting process.)
Edit: palacsint's answer below led me to find this answer to a similar question, which is just what's needed.
netstat -a -o
prints it. I suppose they are on the same machine because you are listening on 127.0.0.1.
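If you'd rather do the lookup programmatically than parse netstat output, here's a sketch using the third-party psutil module (which wraps the same kind of Windows API, e.g. GetExtendedTcpTable); the helper function name is mine:

    import psutil   # third-party: pip install psutil

    def pid_of_peer(peer_ip: str, peer_port: int):
        """Find the PID owning the local socket bound to (peer_ip, peer_port)."""
        for conn in psutil.net_connections(kind="tcp"):
            if conn.laddr and conn.laddr.ip == peer_ip and conn.laddr.port == peer_port:
                return conn.pid   # may be None without sufficient privileges
        return None

    # In the proxy, after: client_sock, (ip, port) = listener.accept()
    # pid = pid_of_peer(ip, port)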
The only way to do this at the application level is if the connecting process sends some sort of custom header that contains an identifier. This is because the networking layer is completely separated from the application layer (hint: the OSI model). That separation makes it possible to write lower-layer software without caring what happens above it, as long as the messages exchanged (read: network packets) follow a pre-determined format (read: use the same protocol).
