How does RFB and X11 protocols work together? - x11

I am trying to understand how a VNC works using X11 and RFB protocols.
I see my XVnc process listens on 59xx(RFB), 58xx(HTTP) and 60xx(X11) ports.
I do not know what is HTTP for, but I think we can exclude that
from picture and still can understand how Xvnc uses RFB and X11 protocols.
From the definition in here: https://www.rfc-editor.org/rfc/rfc6143, I understand that RFB protocol is for remote access through GUI and which uses framebuffers.
Each client session has a dedicated framebuffer which is accessed by client, modified on client request and updates on it are sent to client.
X11 is this protocol is to display any GUI applications like it tells how to display a window or text etc.
So, is my following understand correct?
The client communicates to server on :59xx port using RFB protocol
and sends any requests.
The requests are then processed by a component of Xvnc service running on :59xx port and a request to :60xx port is created in X11 protocol.
The Xvnc service component on :60xx port then provides an output and is processed by Xvnc service and updates the frame buffer.
The update is sent to client in RFB protocol in simplest way using different methods to reduce network data.

Related

How to listen to a TCP port to detect changes in the server?

I'm working on mac os x. I'm trying to build a cocoa app working on a storage server (similar to Dropbox) that does something whenever a file is added,removed..I have already a client app installed on the mac that shows all the files stored on the server and I need to listen to the port that the server is using to send changes notification to the app. I've started following some tutorials for Sockets but I get "Address already in use".
The Question: are sockets the only way to listen to a port and if yes is there a way to build a socket to listen to an already existing server/client connection?
If a process is already listening on a port then no other process can bind(2) to that port. Alternatives would include to have a proxy listen on that port that would deal with events and then pass them on elsewhere (the client app may not play well with this), or to use firewall rules to duplicate the packets to some other port that your app would then listen on, or maybe the client application issues notifications that then can be acted on.
https://github.com/thrig/lognots
Is one way to inspect the notifications available.
Remember that listening on a port is how you prepare to receive incoming connections. It is not necessary to receive data — once a connection is established, data can flow in both directions! It is almost never appropriate for a client application to listen on a port; that's usually only appropriate for server applications.
With that in mind: Your client application should connect to a port on the server, and the server will send data to the client as appropriate.

What exactly is X11 Channel

In all the documentations of X11 that I've found so far something like this is written
Communication between server and clients is done by exchanging packets over a channel. The connection is established by the client (how the client is started is not specified in the protocol). (from wikipedia)
I haven't been able to find what is this channel exactly? A network channel for example? Is it on a port? Is it a memory map? Any help is appreciated.
The phrasing of 'channel' is intentionally vague as it can be either over a local socket, a remote connection (such as SSH), a named pipe, or another method that allows client/server bidirectional communication. Which is to say, a 'channel' is simply a connection between two points that facilitates exchange of data.
When perform X11 forwarding over SSH, the channel is the SSH connection. See the SSH man page for example:
$ man ssh
X11 connections and arbitrary TCP/IP ports can also be forwarded over the secure channel.
or per the x.org documentation:
The communications channel between an X client and server is full-duplex: either side can send a message to the other at any time. This is canonically implemented over a TCP/IP socket interface, though other communications channels are often used, including Unix domain sockets, named pipes and shared memory. The channel must provide a reliable, ordered byte stream---the X protocol provides no mechanism for reordering or resending packets.
X11 support multiple forms of communication between client and server. These so called channels can be TCP sockets, UNIX sockets, and a bunch of other network mechanisms, such as DECnet, token ring etc. TCP and UNIX sockets are really the only ones used today.
The X server is a process that has access to the graphics hardware, keyboard, and mouse. Any application that produces graphics on the computer screen is called a client. Usually, a workstation has on X server running, and multiple X clients. The applications (clients) need to connect to the X-Server via a TCP socket (identified by IP address and port number) or via a UNIX socket (identified by a file name, e.g. /tmp/X0)
If both, server and clients, run on the same system they usually connect through the UNIX socket. However, one of great features of X11 is that server and clients do not have the reside on the same system, but rather connect through the network via TCP sockets. This allows us to run applications on different computers on the network, and bring their graphics output on a single screen. (A single application may also connect to multiple X server and distribute graphics content on multiple screens.)

How does WebSockets server architecture work?

I'm trying to get a better understanding of how the server-side architecture works for WebSockets with the goal of implementing it in an embedded application. It seems that there are 3 different server-side software components in play here: 1) the web server to serve static HTTP pages and handle upgrade request, 2) a WebSockets library such as libwebsockets to handle the "nuts and bolts" of WebSockets communications, and 3) my custom application to actually figure out what to do with incoming data. How do all these fit together? Is it common to have a separate web server and WebSocket handling piece, aka a WebSocket server/daemon?
How does my application communicate with the web server and/or WebSockets library to send/receive data? For example, with CGI, the web server uses environmental variables to send info to the custom application, and stdout to receive responses. What is the equivalent communication system here? Or do you typically link in a WebSocket library into the customer application? But then how would communication with the web server to the WebSocket library + custom application work? Or all 3 combined into a single component?
Here's why I am asking. I'm using the boa web server on a uClinux/no MMU platform on a Blackfin processor with limited memory. There is no native WebSocket support in boa, only CGI. I'm trying to figure out how I can add WebSockets support to that. I would prefer to use a compiled solution as opposed to something interpreted such as JavaScript, Python or PHP. My current application using long polling over CGI, which does not provide adequate performance for planned enhancements.
First off, it's important to understand how a webSocket connection is established because that plays into an important relationship between webSocket connections and your web server.
Every webSocket connection starts with an HTTP request. The browser sends an HTTP request to the host/port that the webSocket connection is requested on. That request might look something like this:
GET /chat HTTP/1.1
Host: example.com:8000
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
What distinguishes this request from any other HTTP request to that server is the Upgrade: websocket header in the request. This tells the HTTP server that this particular request is actually a request to initiate a webSocket connection. This header also allows the web server to tell the difference between a regular HTTP request and a request to open a webSocket connection. This allows something very important in the architecture and it was done this way entirely on purpose. This allows the exact same server and port to be used for both serving your web requests and for webSocket connections. All that is needed is a component on your web server that looks for this Upgrade header on all incoming HTTP connections and, if found, it takes over the connection and turns it into a webSocket connection.
Once the server recognizes this upgrade header, it responds with a legal HTTP response, but one that signals the client that the upgrade to the webSocket protocol has been accepted that looks like this:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
At that point, both client and server keep that socket from the original HTTP request open and both switch to the webSocket protocol.
Now, to your specific questions:
How does my application communicate with the web server and/or
WebSockets library to send/receive data?
Your application may use the built-in webSocket support in modern browsers and can initiate a webSocket connection like this:
var socket = new WebSocket("ws://www.example.com");
This will instruct the browser to initiate a webSocket connection to www.example.com use the same port that the current web page was connected with. Because of the built-in webSocket support in the browser, the above HTTP request and upgrade protocol is handled for you automatically from the client.
On the server-side of things, you need to make sure you are using a web server that has incoming webSocket support and that the support is enabled and configured. Because a webSocket connection is a continuous connection once established, it does not really follow the CGI model at all. There must be at least one long-running process handling live webSocket connections. In server models (like CGI), you would need some sort of webServer add-on that supports this long-running process for your webSocket connections. In a server environment like node.js which is already a long running process, the addition of webSockets is no change at all architecturally - but rather just an additional library to support the webSocket protocol.
I'd suggest you may find this article interesting as it discussions this transition from CGI-style single request handling to the continuous socket connections of webSocket:
Web Evolution: from CGI to Websockets (and how it will help you better monitor your cloud infrastructure)
If you really want to stick with the stdin/stdout model, there are libraries that model that for your for webSockets. Here's one such library. Their tagline is "It's like CGI, twenty years later, for WebSockets".
I'm trying to figure out how I can add WebSockets support to that. I
would prefer to use a compiled solution as opposed to something
interpreted such as JavaScript, Python or PHP.
Sorry, but I'm not familiar with that particular server environment. It will likely take some in-depth searching to find out what your options are. Since a webSocket connection is a continuous connection, then you will need a process that is running continuously that can be the server-side part of the webSocket connection. This can either be something built into your webServer or it can be an additional process that the webServer starts up and forwards incoming connections to.
FYI, I have a custom application at home here built on a Raspberry Pi that uses webSockets for real-time communication with browser web pages and it works just fine. I happen to be using node.js for the server environment and the socket.io library that runs on top of webSockets to give me a higher level interface on top of webSockets. My server code checks several hardware sensors on a regular interval and then whenever there is new/changed data to report, it sends messages down any open webSockets so the connected browsers get real-time updates on the sensor readings.
You would likely need some long-running application that incoming webSocket connections were passed from the web server to your long running process or you'd need to make the webSocket connections on a different port than your web server (so they could be fielded by a completely different server process) in which case you'd have a whole separate server to handle your webSocket requests and sockets (this server would also have to support CORS to enable browsers to connect to it since it would be a different port than your web pages).

Shall I use WebSocket on ports other than 80?

Shall I use WebSocket on non-80 ports? Does it ruin the whole purpose of using existing web/HTTP infrastructures? And I think it no longer fits the name WebSocket on non-80 ports.
If I use WebSocket over other ports, why not just use TCP directly? Or is there any special benefits in the WebSocket protocol itself?
And since current WebSocket handshake is in the form of a HTTP UPGRADE request, does it mean I have to enable HTTP protocol on the port so that WebSocket handshake can be accomplished?
Shall I use WebSocket on non-80 ports? Does it ruin the whole purpose
of using existing web/HTTP infrastructures? And I think it no longer
fits the name WebSocket on non-80 ports.
You can run a webSocket server on any port that your host OS allows and that your client will be allowed to connect to.
However, there are a number of advantages to running it on port 80 (or 443).
Networking infrastructure is generally already deployed and open on port 80 for outbound connections from the places that clients live (like desktop computers, mobile devices, etc...) to the places that servers live (like data centers). So, new holes in the firewall or router configurations, etc... are usually not required in order to deploy a webSocket app on port 80. Configuration changes may be required to run on different ports. For example, many large corporate networks are very picky about what ports outbound connections can be made on and are configured only for certain standard and expected behaviors. Picking a non-standard port for a webSocket connection may not be allowed from some corporate networks. This is the BIG reason to use port 80 (maximum interoperability from private networks that have locked down configurations).
Many webSocket apps running from the browser wish to leverage existing security/login/auth infrastructure already being used on port 80 for the host web page. Using that exact same infrastructure to check authentication of a webSocket connection may be simpler if everything is on the same port.
Some server infrastructures for webSockets (such as socket.io in node.js) use a combined server infrastructure (single process, one listener) to support both HTTP requests and webSockets. This is simpler if both are on the same port.
If I use WebSocket over other ports, why not just use TCP directly? Or
is there any special benefits in the WebSocket protocol itself?
The webSocket protocol was originally defined to work from a browser to a server. There is no generic TCP access from a browser so if you want a persistent socket without custom browser add-ons, then a webSocket is what is offered. As compared to a plain TCP connection, the webSocket protocol offers the ability to leverage HTTP authentication and cookies, a standard way of doing app-level and end-to-end keep-alive ping/pong (TCP offers hop-level keep-alive, but not end-to-end), a built in framing protocol (you'd have to design your own packet formats in TCP) and a lot of libraries that support these higher level features. Basically, webSocket works at a higher level than TCP (using TCP under the covers) and offers more built-in features that most people find useful. For example, if using TCP, one of the first things you have to do is get or design a protocol (a means of expressing your data). This is already built-in with webSocket.
And since current WebSocket handshake is in the form of a HTTP UPGRADE
request, does it mean I have to enable HTTP protocol on the port so
that WebSocket handshake can be accomplished?
You MUST have an HTTP server running on the port that you wish to use webSocket on because all webSocket requests start with an HTTP request. It wouldn't have to be heavily featured HTTP server, but it does have to handle the initial HTTP request.
Yes - Use 443 (ie, the HTTPS port) instead.
There's little reason these days to use port 80 (HTTP) for anything other than a redirection to port 443 (HTTPS), as certification (via services like LetsEncrypt) are easy and free to set up.
The only possible exceptions to this rule are local development, and non-internet facing services.
Should I use a non-standard port?
I suspect this is the intent of your question. To this, I'd argue that doing so adds an unnecessary layer of complication with no obvious benefits. It doesn't add security, and it doesn't make anything easier.
But it does mean that specific firewall exceptions need to be made to host and connect to your websocket server. This means that people accessing your services from a corporate/school/locked down environment are probably not going to be able to use it, unless they can somehow convince management that it is mandatory. I doubt there are many good reasons to exclude your userbase in this way.
But there's nothing stopping you from doing it either...
In my opinion, yes you can. 80 is the default port, but you can change it to any as you like.

Can I open a websocket connection to a local server running on an arbitrary port?

I have a local server outputting my real-time home sensor data, and I want to visualize it in my browser.
My question is, can I use a websocket to open the connection from my browser to the local server? How would I go about doing that?
The local server runs on a non-http designated port number, and I can't change that.
Yes and no.
No:
WebSockets are not raw TCP connections. They have an HTTP compatible handshake (for both security and compatibility with existing servers) and have some minimal framing for each packet to make WebSockets a message based protocol. Also, the current WebSocket API and protocol that exists in browsers as of today do not directly support binary data messages. They only UTF-8 encoded payloads.
Yes:
You can use websockify to proxy a WebSockets connection to a raw binary TCP server. websockify is a python proxy/bridge that has binary support and also includes a javascript library to make interacting with it easier. In addition, websockify includes the web-socket-js fallback/polyfill (implemented in Flash) for browser that do not have native WebSockets support. The downside is that you have to run websockify somewhere (either on the client system, the server system, or some other system). Also, websockify is Linux/UNIX only for now. On the plus side, websockify has a special mode that you can use to launch and wrap an existing service.
Disclaimer: I made websockify.

Resources