Can both sides send data on an FTP bi-directional data connection?

I always thought that when an FTP data connection opens, it transfers data in one direction only.
Now I found out that both sides can transfer data on the opened data connection.
My questions:
What is it used for? I read that it can be used to transfer files over SSL, so the bi-directional capability is used for negotiation, but then why not use FTPS?
A data connection is opened for file transfers and directory listings (anything else?). So what should the sending side do when it receives data from the other side? How would it process it?
Are there clients supporting this behavior?
Is it common?

You are correct that the FTP RFC does mention the possibility of using the data connection bi-directionally:
It ought to also be noted that the data connection may be used for simultaneous sending and receiving
But it's likely that the RFC authors just wanted to make sure such an option is available for future features of the protocol.
As far as I know, there is actually no feature that makes use of a bi-directional data connection.
The FTP protocol does not allow simultaneous transfers at all, neither in the same nor opposite direction.
Currently the data connection is used:
For downloads, where only the server sends data.
For uploads, where only the client sends data.
For directory listings, where only the server sends data.
Regarding FTPS: Indeed, if the data connection is encrypted using TLS/SSL, the connection is used bi-directionally on the TCP level while the client and the server negotiate the encryption. But I do not think this is what the RFC refers to, as SSL/TLS did not exist at the time and the negotiation is outside the scope of the FTP protocol anyway.

Related

Moving from socket.io to raw websockets?

Right now I'm using socket.io with mandatory websockets as the transport. I'm thinking about moving to raw websockets but I'm not clear on what functionality I will lose moving off of socket.io. Thanks for any guidance.
The socket.io library adds the following features beyond standard webSockets:
Automatic selection of long polling vs. webSocket if the browser does not support webSockets or if the network path has a proxy/firewall that blocks webSockets.
Automatic client reconnection if the connection goes down (even if the server restarts).
Automatic detection of a dead connection (by using regular pings to detect a non-functioning connection)
Message passing scheme with automatic conversion to/from JSON.
The server-side concept of rooms where it's easy to communicate with a group of connected users.
The notion of connecting to a namespace on the server rather than just connecting to the server. This can be used for a variety of different capabilities, but I use it to tell the server what types of information I want to subscribe to. It's like connecting to a particular channel.
Server-side data structures that automatically keep track of all connected clients so you can enumerate them at any time.
Middleware architecture built into the socket.io library that can be used to implement things like authentication with access to cookies from the original connection.
Automatic storage of the cookies and other headers present on the connection when it was first connected (very useful for identifying what user is connected).
Server-side broadcast capabilities to send a common message either to all connected clients, all clients in a room or all clients in a namespace.
Tagging of every message with a message name and routing of message names into an eventEmitter so you listen for incoming messages by listening on an eventEmitter for the desired message name.
The ability for either client or server to send a message and then wait for a response to that specific message (a reply feature or request/response model).
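To give a feel for what you would have to rebuild yourself on a raw webSocket, here is a minimal Python sketch of two items from the list above: message-name routing with JSON conversion, and the reply (request/response) feature. The `MessageRouter` class, the `event`/`id`/`reply_to` envelope fields and the `send_raw` callback are all made up for illustration; this is not how socket.io encodes its packets, just one plausible way to get similar behaviour.

```python
import json
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional


class MessageRouter:
    """Hypothetical helper: named JSON messages plus replies over a raw text channel."""

    def __init__(self, send_raw: Callable[[str], None]) -> None:
        # send_raw stands in for something like a webSocket's send() method.
        self._send_raw = send_raw
        self._handlers: Dict[str, List[Callable[[Any], Any]]] = defaultdict(list)
        self._pending: Dict[int, Callable[[Any], None]] = {}
        self._next_id = 0

    def on(self, event: str, handler: Callable[[Any], Any]) -> None:
        """Register a handler for an incoming message name."""
        self._handlers[event].append(handler)

    def emit(self, event: str, data: Any,
             ack: Optional[Callable[[Any], None]] = None) -> None:
        """Send a named message; if ack is given, expect a reply to it."""
        envelope: Dict[str, Any] = {"event": event, "data": data}
        if ack is not None:
            self._next_id += 1
            envelope["id"] = self._next_id
            self._pending[self._next_id] = ack
        self._send_raw(json.dumps(envelope))

    def receive(self, raw: str) -> None:
        """Feed one incoming raw text message into the router."""
        message = json.loads(raw)
        if "reply_to" in message:
            # This is a reply: hand it to the waiting callback, if any.
            callback = self._pending.pop(message["reply_to"], None)
            if callback is not None:
                callback(message.get("data"))
            return
        for handler in self._handlers.get(message["event"], []):
            result = handler(message.get("data"))
            if "id" in message:
                # The sender asked for a reply; send the handler's result back.
                self._send_raw(json.dumps({"reply_to": message["id"], "data": result}))


# In-memory demo: two routers wired together through plain lists.
client_out: List[str] = []
server_out: List[str] = []
client = MessageRouter(client_out.append)
server = MessageRouter(server_out.append)

server.on("add", lambda payload: payload["a"] + payload["b"])
replies: List[Any] = []
client.emit("add", {"a": 2, "b": 3}, ack=replies.append)
server.receive(client_out.pop(0))   # "network" delivery client -> server
client.receive(server_out.pop(0))   # "network" delivery server -> client
```

Reconnection, dead-connection detection, rooms and namespaces would each need similar hand-written code, which is the real cost of leaving socket.io.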

Does HTTP/2 make websockets obsolete?

I'm learning about HTTP/2 protocol. It's a binary protocol with small message frames. It allows stream multiplexing over single TCP connection. Conceptually it seems very similar to WebSockets.
Are there plans to obsolete websockets and replace them with some kind of headerless HTTP/2 requests and server-initiated push messages? Or will WebSockets complement HTTP/2?
Having just finished reading RFC 7540, I'd say HTTP/2 does obsolete websockets for all use cases except for pushing binary data from the server to a JS webclient. HTTP/2 fully supports binary bidi streaming (read on), but browser JS doesn't have an API for consuming binary data frames and AFAIK such an API is not planned.
For every other application of bidi streaming, HTTP/2 is as good or better than websockets, because (1) the spec does more work for you, and (2) in many cases it allows fewer TCP connections to be opened to an origin.
PUSH_PROMISE (colloquially known as server push) is not the issue here. That's just a performance optimization.
The main use case for Websockets in a browser is to enable bidirectional streaming of data. So, I think the OP's question becomes whether HTTP/2 does a better job of enabling bidirectional streaming in the browser, and I think that yes, it does.
First of all, it is bi-di. Just read the introduction to the streams section:
A "stream" is an independent, bidirectional sequence of frames
exchanged between the client and server within an HTTP/2 connection.
Streams have several important characteristics:
A single HTTP/2 connection can contain multiple concurrently open
streams, with either endpoint interleaving frames from multiple
streams.
Streams can be established and used unilaterally or shared by
either the client or server.
Streams can be closed by either endpoint.
Articles like this (linked in another answer) are wrong about this aspect of HTTP/2. They say it's not bidi. Look, there is one thing that can't happen with HTTP/2: After the connection is opened, the server can't initiate a regular stream, only a push stream. But once the client opens a stream by sending a request, both sides can send DATA frames across a persistent socket at any time - full bidi.
That's not much different from websockets: the client has to initiate a websocket upgrade request before the server can send data across, too.
The biggest difference is that, unlike websockets, HTTP/2 defines its own multiplexing semantics: how streams get identifiers and how frames carry the id of the stream they're on. HTTP/2 also defines flow control semantics for prioritizing streams. This is important in most real-world applications of bidi.
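To make "frames carry the id of the stream they're on" concrete, here is a small Python sketch that decodes the fixed 9-octet frame header described in RFC 7540 (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier). The `FrameHeader` and `parse_frame_header` names are mine, not from any library, and this only decodes the header, not the payload.

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    length: int      # payload length in octets (24 bits)
    frame_type: int  # e.g. 0x0 is DATA
    flags: int       # e.g. 0x1 on a DATA frame is END_STREAM
    stream_id: int   # 31-bit stream identifier; 0 means the connection itself


def parse_frame_header(header: bytes) -> FrameHeader:
    """Decode the 9-octet header that precedes every HTTP/2 frame."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 octets")
    # The 24-bit length is read as one byte plus one 16-bit word.
    length_hi, length_lo, frame_type, flags, stream_word = struct.unpack(
        ">BHBBI", header
    )
    length = (length_hi << 16) | length_lo
    # The top bit of the last word is reserved and must be ignored.
    stream_id = stream_word & 0x7FFFFFFF
    return FrameHeader(length, frame_type, flags, stream_id)


# A DATA frame header: 5 payload octets, END_STREAM set, on stream 3.
example = parse_frame_header(b"\x00\x00\x05\x00\x01\x00\x00\x00\x03")
```

Because every frame names its stream this way, either endpoint can interleave frames from many streams on one TCP connection and the receiver can still sort them out.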
(That wrong article also says that the Websocket standard has multiplexing. No, it doesn't. It's not really hard to find that out, just open the Websocket RFC 6455 and press ⌘-F, and type "multiplex". After you read
The protocol is intended to be extensible; future versions will likely introduce additional concepts such as multiplexing.
You will find that there is a 2013 draft extension for Websocket multiplexing. But I don't know which browsers, if any, support that. I wouldn't try to build my SPA webapp on the back of that extension, especially with HTTP/2 coming; the support may never arrive.)
Multiplexing is exactly the kind of thing that you normally have to do yourself whenever you open up a websocket for bidi, say, to power a reactively updating single page app. I'm glad it's in the HTTP/2 spec, taken care of once and for all.
If you want to know what HTTP/2 can do, just look at gRPC. gRPC is implemented across HTTP/2. Look specifically at the half and full duplex streaming options that gRPC offers. (Note that gRPC doesn't currently work in browsers, but that is actually because browsers (1) don't expose the HTTP/2 frame to the client javascript, and (2) don't generally support Trailers, which are used in the gRPC spec.)
Where might websockets still have a place? The big one is server->browser pushed binary data. HTTP/2 does allow server->browser pushed binary data, but it isn't exposed in browser JS. For applications like pushing audio and video frames, this is a reason to use websockets.
Edit: Jan 17 2020
Over time this answer has gradually risen up to the top (which is good, because this answer is more-or-less correct). However there are still occasional comments saying that it is not correct for various reasons, usually related to some confusion about PUSH_PROMISE or how to actually consume message-oriented server -> client push in a single page app.
If you need to build a real-time chat app, let's say, where you need to broadcast new chat messages to all the clients in the chat room that have open connections, you can (and probably should) do this without websockets.
You would use Server-Sent Events to push messages down and the Fetch api to send requests up. Server-Sent Events (SSE) is a little-known but well supported API that exposes a message-oriented server-to-client stream. Although it doesn't look like it to the client JavaScript, under the hood your browser (if it supports HTTP/2) will reuse a single TCP connection to multiplex all of those messages. There is no efficiency loss and in fact it's a gain over websockets because all the other requests on your page are also sharing that same TCP connection. Need multiple streams? Open multiple EventSources! They'll be automatically multiplexed for you.
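For a sense of how little there is to SSE on the wire, here is a rough Python sketch of the `text/event-stream` format: each event is a few `field: value` lines ended by a blank line. `format_sse` and `parse_sse` are illustrative names, and the parser is deliberately simplified (it handles only `event` and `data` fields and `\n` line endings, and ignores `id` and `retry`), so treat it as a sketch rather than a conforming implementation.

```python
from typing import Dict, Iterator, List, Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialize one event in text/event-stream form (server side)."""
    lines: List[str] = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become several data: lines.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> Iterator[Dict[str, str]]:
    """Simplified parser: yields {'event', 'data'} for each complete event."""
    event_name = "message"  # default event type when no event: field is sent
    data_lines: List[str] = []
    for line in stream.split("\n"):
        if line == "":
            # Blank line: dispatch the event collected so far, if any.
            if data_lines:
                yield {"event": event_name, "data": "\n".join(data_lines)}
            event_name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is stripped
        if field == "event":
            event_name = value
        elif field == "data":
            data_lines.append(value)


wire = format_sse("hello\nworld", event="chat")
events = list(parse_sse(wire))
```

In a browser you never write the parsing half yourself; `EventSource` does it and hands you one message event per block.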
Besides being more resource efficient and having less initial latency than a websocket handshake, Server-Sent Events have the nice property that they automatically fall back and work over HTTP/1.1. But when you have an HTTP/2 connection they work incredibly well.
Here's a good article with a real-world example of accomplishing the reactively-updating SPA.
From what I understood HTTP/2 is not a replacement for websocket but aims to standardize SPDY protocol.
In HTTP/2, server push is used behind the scenes to improve resource loading by the client from the browser. As a developer, you don't really care about it during your development. However, with Websocket, the developer is allowed to use an API that can consume and push messages over a single full-duplex connection.
These are not the same things, and they should complement each other.
I say Nay (Websockets aren't obsolete).
The first and most often ignored issue is that HTTP/2 push isn't enforceable and might be ignored by proxies, routers, other intermediaries or even the browser.
i.e. (from the HTTP2 draft):
An intermediary can receive pushes from the server and choose not to forward them on to the client. In other words, how to make use of the pushed information is up to that intermediary. Equally, the intermediary might choose to make additional pushes to the client, without any action taken by the server.
Hence, HTTP/2 Push can't replace WebSockets.
Also, HTTP/2 connections do close after a while.
It's true that the standard states that:
HTTP/2 connections are persistent. For best performance, it is expected that clients will not close connections until it is determined that no further communication with a server is necessary (for example, when a user navigates away from a particular web page) or until the server closes the connection.
But...
Servers are encouraged to maintain open connections for as long as possible but are permitted to terminate idle connections if necessary. When either endpoint chooses to close the transport-layer TCP connection, the terminating endpoint SHOULD first send a GOAWAY (Section 6.8) frame so that both endpoints can reliably determine whether previously sent frames have been processed and gracefully complete or terminate any necessary remaining tasks.
Even if the same connection allows for pushing content while it is open and even if HTTP/2 resolves some of the performance issues introduced by HTTP/1.1's 'keep-alive'... HTTP/2 connections aren't kept open indefinitely.
Nor can a webpage re-initiate an HTTP/2 connection once closed (unless we're back to long-polling, that is).
EDIT (2017, two years later)
Implementations of HTTP/2 show that multiple browser tabs/windows share a single HTTP/2 connection, meaning that push will never know which tab / window it belongs to, eliminating the use of push as a replacement for Websockets.
EDIT (2020)
I'm not sure why people started downvoting the answer. If anything, the years since the answer was initially posted proved that HTTP/2 can't replace WebSockets and wasn't designed to do so.
Granted, HTTP/2 might be used to tunnel WebSocket connections, but these tunneled connections will still require the WebSocket protocol and they will affect the way the HTTP/2 container behaves.
The answer is no. The goals of the two are very different. There is even an RFC for WebSocket over HTTP/2 which allows you to make multiple WebSocket connections over a single HTTP/2 TCP pipe.
WS over HTTP/2 will be a resource conservation play by decreasing the time to open new connections and allowing for more communication channels without the added expense of more sockets, soft IRQs, and buffers.
https://datatracker.ietf.org/doc/html/draft-hirano-httpbis-websocket-over-http2-01
Well, to quote from this InfoQ article:
Well, the answer is clearly no, for a simple reason: As we have seen above, HTTP/2 introduces Server Push which enables the server to proactively send resources to the client cache. It does not, however, allow for pushing data down to the client application itself. Server pushes are only processed by the browser and do not pop up to the application code, meaning there is no API for the application to get notifications for those events.
And so HTTP/2 push is really something between your browser and server, while Websockets really expose the APIs that can be used by both the client (JavaScript, if it's running in a browser) and the application code (running on the server) for transferring real-time data.
As of today, no.
HTTP/2, compared to HTTP, allows you to maintain a connection with a server. From there, you can have multiple streams of data at the same time. The intent is that you can push multiple things at the same time even without the client requesting it. For example, when a browser asks for an index.html, the server might want to also push index.css and index.js. The browser didn't ask for them, but the server might provide them without being asked because it can assume you're going to want them in a few seconds.
This is faster than the HTTP/1 alternative of getting index.html, parsing it, discovering it needs index.js and index.css and then building 2 other requests for those files. HTTP/2 lets the server push data the client hasn't even asked for.
In that context, it's similar to WebSocket, but not really by design. WebSocket is supposed to allow a bi-directional communication similar to a TCP connection, or a serial connection. It's a socket where both communicate with each other. Also, the major difference is that you can send any arbitrary data packets in raw bytes, not encapsulated in HTTP protocol. The concepts of headers, paths, query strings only happen during the handshake, but WebSocket opens up a data stream.
The other difference is you get a lot more fine-tuned access to WebSocket in Javascript, whereas with HTTP, it's handled by the browser. All you get with HTTP is whatever you can fit in XHR/fetch(). That also means the browser will get to intercept and modify HTTP headers without you being able to control it (eg: Origin, Cookies, etc). Also, what HTTP/2 is able to push is sent to the browser. That means JS doesn't always (if ever) know things are being pushed. Again, it makes sense for index.css and index.js because the browser will cache it, but not so much for data packets.
It's really all in the name. HTTP stands for HyperText Transfer Protocol. We're geared around the concept of transferring assets. WebSocket is about building a socket connection where binary data gets passed around bidirectionally.
The one we're not really discussing is SSE (Server-Sent Events). Pushing data to the application (JS) isn't HTTP/2's intent, but it is for SSE. SSE gets really strengthened with HTTP/2. But it's not a real replacement for WebSockets when what's important is the data itself, not the variable endpoints being reached. With WebSocket, a new data stream is created for each endpoint, but with SSE it's shared within the already existing HTTP/2 session.
Summarized here are the objectives for each:
HTTP - Respond to a request with one asset
HTTP/2 - Respond to a request with multiple assets
SSE - Respond with a unidirectional text (UTF-8) event stream
WebSocket - Create a bidirectional binary data stream
Message exchange and simple streaming (not audio or video streaming) can be done via both HTTP/2 multiplexing and WebSockets. So there is some overlap, but WebSockets have a well-established protocol, a lot of frameworks/APIs, and less header overhead.
Here is a nice article about the topic.
No, WebSockets are not obsolete. However, HTTP/2 breaks websockets as defined for HTTP/1.1 (mostly by forbidding protocol upgrades using the Upgrade header). Which is why this RFC:
https://datatracker.ietf.org/doc/html/rfc8441
defines a websocket bootstrapping procedure for HTTP/2.
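As a rough illustration of that bootstrapping, RFC 8441 has the server advertise the SETTINGS_ENABLE_CONNECT_PROTOCOL setting (0x8), after which the client opens a stream with an "extended CONNECT" request carrying a `:protocol` pseudo-header. The Python sketch below only builds the request header list; `websocket_connect_headers` is a hypothetical helper, not part of any HTTP/2 library, and sending the headers is left to whatever HTTP/2 stack you use.

```python
from typing import List, Sequence, Tuple

# Setting identifier from RFC 8441; a server sends it with value 1 to opt in.
SETTINGS_ENABLE_CONNECT_PROTOCOL = 0x8


def websocket_connect_headers(
    authority: str,
    path: str,
    scheme: str = "https",
    subprotocols: Sequence[str] = (),
) -> List[Tuple[str, str]]:
    """Build the header list for an RFC 8441 extended CONNECT request."""
    headers = [
        (":method", "CONNECT"),
        (":protocol", "websocket"),  # the new pseudo-header from RFC 8441
        (":scheme", scheme),
        (":path", path),
        (":authority", authority),
        ("sec-websocket-version", "13"),
    ]
    if subprotocols:
        headers.append(("sec-websocket-protocol", ", ".join(subprotocols)))
    return headers


request_headers = websocket_connect_headers(
    "server.example.com", "/chat", subprotocols=["chat", "superchat"]
)
```

Unlike the HTTP/1.1 handshake, there is no Upgrade header and no Sec-WebSocket-Key exchange here; the server accepts with a 2xx response on that stream, and the stream then carries the WebSocket frames.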
As of April 2020, HTTP/2 is not making WebSockets obsolete. The greatest advantage of WebSockets over HTTP/2 is that
HTTP/2 works only at the browser level, not at the application level
This means that HTTP/2 does not offer any JS API like WebSockets to allow communication and transfer of JSON or other data to the server directly from the application (e.g. a website). So, as far as I believe, HTTP/2 will only make WebSockets obsolete if it starts offering an API like WebSockets to talk to the server. Until then, it is just an updated and faster version of HTTP/1.1.
No, HTTP/2 does not make websockets obsolete, but SSE over HTTP/2 offers a viable alternative. The minor caveat is that SSE does not support unsolicited events from server to client (and neither does HTTP/2): i.e. the client has to explicitly subscribe by creating an EventSource instance specifying the event source endpoint. So you may have to slightly reorganise how the client arranges for events to be delivered - I can't think of a scenario where this is actually a technical barrier.
SSE works with HTTP/1.1. But HTTP/2 makes using SSE generally viable and competitive with websockets in terms of efficiency, instead of practically unusable as in the case of HTTP/1.1. Firstly, HTTP/2 multiplexes many event source connections (or rather "streams" in HTTP/2 terms) onto a single TCP connection, whereas in HTTP/1.1 you'd need one connection for each. According to the HTTP/2 spec, the number of concurrent streams per connection is unlimited by default, with a recommended (configurable) limit of no less than 100, whereas browsers may be severely limited in the number of TCP connections they can make. The second reason is efficiency: many streams in HTTP/2 require much less overhead than the many connections required in HTTP/1.1.
One final thing: if you want to replace websockets with SSE, you're forgoing some of the tools / middlewares built on top of websockets. In particular I'm thinking of socket.io (which is how a lot of people actually use websockets), but I'm sure there are plenty more.

Is an FTP data connection only for one file?

I have a client trying to upload multiple files to an FTP server in passive mode.
The client sends the PASV command and the server responds with the relevant IP and port.
Is it possible to send multiple files on this one data connection? Or does the client need to send the PASV command and get a new port for each file?
Since the only indicator of the end of file (in the default stream mode) is the close of the connection, and because you cannot transfer any more data after the connection has closed, you will not be able to transfer more than one file using the same data connection.
But maybe you meant to ask a different question: is it possible to have multiple data transfers (and thus multiple data connections) after a single PASV command? I can see nothing in RFC 959 that would directly prevent this and the reuse of the same target port on the server. And because access would be done from different source ports on the client, this should also not cause problems with TCP connection states. But in practice you will probably see problems if you try this from the client side, because lots of servers create the listener only for a single data connection. So you had better precede each data transfer with a new PASV command, like existing clients do.
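For reference, this is roughly what a client does with each PASV reply before opening the data connection. The Python sketch below assumes the common `227 ... (h1,h2,h3,h4,p1,p2)` reply layout with parentheses, which most servers use but which RFC 959 does not strictly guarantee; `parse_pasv_reply` is an illustrative name, not a library function.

```python
import re
from typing import Tuple

# Six comma-separated decimal numbers inside parentheses: four for the
# IPv4 address, two for the port (high byte, low byte).
_PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract (host, port) from a '227 Entering Passive Mode' reply."""
    if not reply.startswith("227"):
        raise ValueError("not a 227 reply: %r" % reply)
    match = _PASV_RE.search(reply)
    if match is None:
        raise ValueError("no address found in reply: %r" % reply)
    numbers = [int(group) for group in match.groups()]
    if any(number > 255 for number in numbers):
        raise ValueError("address component out of range: %r" % reply)
    host = ".".join(str(number) for number in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return host, port


host, port = parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,19,137)")
```

A client following the advice above would send PASV, parse the reply like this, connect to the returned host and port, transfer one file, let the data connection close, and repeat for the next file.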

Suggestions on keeping connections alive with FTP file listing via AJAX?

I have a multi-user Ruby on Rails web application that can interact with an FTP server via AJAX. The application allows the user to browse an FTP site. Javascript makes an AJAX call which communicates with a server-script that returns a list of files and directories within a given directory.
This works fine. However, each time a directory listing is requested, the server must re-establish a connection with the FTP server, which takes a lot of time. I'm looking for a way to leave the FTP connection open until some timeout (in seconds) has passed.
I could probably do this using threads (though, I'm completely open to other ideas) or some fancy connection-pooling scheme (perhaps via a daemon that manages this).
What are some ways I could persist and regain reference to connections in my ruby source?
Someone suggested using a "Connection: Keep-Alive" header, but I don't see how that would help in this case.
Not a complete answer, but if you did have some sort of daemon or something managing the connection, you could use TCP keepalives to keep the control connection alive for an extended period of time.
FTP uses two connections. A control connection is established client-to-server, and data connections are established server-to-client for each request. So each directory listing or GET would prompt another data connection to be opened for the duration of the request.
You shouldn't worry about keeping lots of listening sockets open because the data connections are negotiated over the control connection just prior to being established. (Also the data connections could be made client-to-server instead of server-to-client by using passive mode if you want, but it's still a separate connection.)
Either way, I think the source of sluggishness is more to do with closing and reopening the control connection (and authenticating) for each request. I think if you have some process that keeps the control connection open using TCP keepalives (SO_KEEPALIVE socket option), you'll see a big improvement.
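Turning on SO_KEEPALIVE is a one-line socket option in most languages. The question is about Ruby, but here is the idea as a minimal Python sketch; `enable_keepalive` is just an illustrative helper, and how you get at the control connection's underlying socket depends on the FTP library you use.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to send TCP keepalive probes on an otherwise idle connection."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


# Demo on a fresh, unconnected TCP socket; in practice this would be the
# socket behind the long-lived FTP control connection.
control_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(control_socket)
keepalive_on = control_socket.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
control_socket.close()
```

How often probes are sent is governed by operating system settings, so it is worth checking those against the FTP server's idle timeout.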

MSRP vs FTP

What is the difference between MSRP (Message Session Relay Protocol) and FTP in terms of transferring files?
What precisely do you mean by "operationally"?
FTP is a protocol for transferring files over IP.
MSRP is a protocol for sending instant messages.
They are not at all alike. You might as well ask what the operational differences between dogs and TheTXI are.
I assume you're asking in terms of transferring files.
FTP, or File Transfer Protocol, is designed for transferring files from a dedicated server.
MSRP, or Message Session Relay Protocol, is used for "transmitting a series of related instant messages". This can be used to transfer files to another user.
The big difference I can see is that FTP requires a server, while MSRP is user to user, or peer to peer.
