How does the server recognize the header compression of different clients in HTTP/2?

According to the HTTP/2 protocol, the client and the server need to maintain the same static and dynamic dictionary.
A server receives compressed headers from many different clients.
For the compressed header content sent by a client, does the server look it up only in that client's dictionary or in all dictionaries?
If the lookup happens only in a specific client's dictionary, how does the server determine which dictionary a given request belongs to?
In addition, when does a client's dictionary stored on the server expire?

Client and server maintain a per-connection state that stores the static and dynamic tables that contain the HTTP headers exchanged between the two peers.
There is no concept of "user", just that of connection.
For a client that opens multiple TCP connections, each connection will have a different dynamic table (the static table is the same for all connections from all clients and is defined by the HPACK specification, RFC 7541).
For all HTTP/2 streams in a connection, the dynamic table is updated by both client and server, so it's always in sync.
Being tied to a TCP connection, the dynamic table is thrown away when the connection is closed.
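A rough sketch of that per-connection state, assuming the third-party Python `hpack` package rather than a full HTTP/2 stack; the headers and variable names are made up for illustration:

```python
# Sketch only: one HPACK Encoder/Decoder pair per connection, assuming the
# third-party `hpack` package (pip install hpack). Headers are illustrative.
from hpack import Decoder, Encoder

# Client side of one connection: the encoder's dynamic table grows as
# headers are sent on that connection.
client_encoder = Encoder()
block_1 = client_encoder.encode([(":method", "GET"), (":path", "/"), ("cookie", "id=42")])
block_2 = client_encoder.encode([(":method", "GET"), (":path", "/img"), ("cookie", "id=42")])

# Server side: a Decoder is created per accepted connection and thrown away
# when the connection closes; it is never shared between clients.
decoder_for_this_connection = Decoder()
print(decoder_for_this_connection.decode(block_1))  # list of (name, value) pairs
print(decoder_for_this_connection.decode(block_2))  # repeated cookie arrives as a table index

# A fresh Decoder (i.e. a different connection) has a different dynamic table,
# so it could not correctly interpret the index references in block_2.
```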

Related

When should a web server call accept to create a new client, and when should it reuse the same client?

In a non-blocking, event-driven web server for a basic static website, I don't understand the mechanics I should implement for a "new client".
When a browser connects to my socket, I get the clientfd from accept and answer with an HTTP response, but when the browser is reloaded, should I accept a new connection and answer on that, or should I reuse the same connection and just send the new response?
I use poll to handle multiple fds, but when I reload the page it's the same connection (for me this makes sense), but then I open a new tab and it's still the same connection (it only does accept once). I'm not finding any documentation on this, and I don't have a way to test with multiple clients whether it reuses the same one every time.
You can't reuse a connection from another client, new connections must always be accepted as new connections. It doesn't matter what kind of server application you're writing.
However, if the client passes the header Connection: keep-alive you should not close the connection once the response is finished, but keep the connection open for future requests from the same client.
I hope I understand correctly, but anyway, what I personally do is create a map of sockets, where each socket is a client.
Every time a socket disconnects, it is removed from that map, and so on.
Whether to use a new connection is the browser's choice. You don't get much of a choice.
However, you can tell the browser that you don't allow it to reuse a connection, if you send Connection: close in the response. In this case, the browser is forced to open a new connection for the next request. This is the only control you have.
If you want to test several connections at the same time, you could open several different browsers, or you could use a different program, such as some HTTP load testing tool (there are many). You could also send it a web page with many images; browsers should try to download all the images using several connections at the same time.
A web server doesn't create clients. A web server has clients -- new clients trying to connect, and existing clients communicating on the sockets that it has already opened.
To handle new clients, a web server should pretty much be calling accept all the time, unless it's already handling the maximum number of clients that it's configured to handle.
As soon as you get a new connection from accept, hand it off to other threads to process and call accept again.
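A minimal sketch of that accept-whenever-the-listener-is-readable loop, in Python with the standard library's select.poll; it is single-threaded rather than handing connections off to other threads, and the port, buffer size and response are placeholders:

```python
# Sketch only: a poll-based loop that keeps accepting new clients while also
# serving existing keep-alive connections. Port and response are arbitrary.
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))
server.listen()
server.setblocking(False)

poller = select.poll()
poller.register(server, select.POLLIN)
clients = {}  # fd -> socket: one entry per accepted connection

while True:
    for fd, _event in poller.poll():
        if fd == server.fileno():
            # Readiness on the listening socket means a *new* client:
            # accept it and register its fd, but keep the listener registered
            # so further clients (new tabs, other browsers) can still connect.
            conn, _addr = server.accept()
            conn.setblocking(False)
            poller.register(conn, select.POLLIN)
            clients[conn.fileno()] = conn
        else:
            conn = clients[fd]
            data = conn.recv(4096)
            if not data:
                # The client closed the connection; forget it.
                poller.unregister(conn)
                conn.close()
                del clients[fd]
            else:
                # Serve the request; with Connection: keep-alive the socket
                # stays open for the browser's next request on this connection.
                body = b"hello"
                response = (
                    b"HTTP/1.1 200 OK\r\n"
                    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
                    b"Connection: keep-alive\r\n"
                    b"\r\n" + body
                )
                conn.sendall(response)
```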

HTTP/2: Session Span Query

Sorry for what I think might be a silly question but:
My understanding of HTTP/2 is that you create a connection, establish TLS if you want to, then upgrade to 2, and do lots of little queries on that one connection, therefore getting a speed increase because you're not re-establishing connections and TLS.
If this is true, when it comes to sessions, is each request over the connection treated as a unique request (and therefore has its own cookies and headers), or are they all treated as sub-requests of the original request?
This has an impact on proxies: could you merge requests from different clients into a single stream or not?
My understanding of HTTP/2 is that you create a connection, establish TLS if you want to, then upgrade to 2
Not quite. If you use TLS, then HTTP/2 is negotiated as part of the TLS handshake (via ALPN). If you are not using TLS, then you can upgrade, but browsers only support HTTP/2 over HTTPS, so that covers the majority of use cases.
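A small sketch of that ALPN step, using Python's standard ssl module as a client; example.com is just a stand-in, and whether "h2" is actually selected depends on the server:

```python
# Sketch only: offer "h2" via ALPN during the TLS handshake and see what the
# server picked. No Upgrade request is involved when TLS is used.
import socket
import ssl

ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])  # offered in the ClientHello

with socket.create_connection(("example.com", 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname="example.com") as tls:
        # If the server selected "h2", both sides speak HTTP/2 on this
        # connection from the first application byte onwards.
        print(tls.selected_alpn_protocol())
```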
When it comes to sessions, is each request over the connection treated as a unique request (and therefore has its own cookies and headers), or are they all treated as sub-requests of the original request?
Each request is part of what's called a stream in HTTP/2. It is a unique request, with its own cookies and headers, and is unrelated to the previous requests (though see the note below for some caveats on this). Conceptually it's really no different from the fact that HTTP/1.1 allows you to send multiple unique, unrelated requests on the one connection, but unlike HTTP/1.1, multiple requests can all be in transit at the same time in HTTP/2 thanks to multiplexing.
This diagram might help explain it: https://freecontent.manning.com/mental-model-graphic-how-is-http-1-1-different-from-http-2/
This has an impact on proxies: could you merge requests from different clients into a single stream or not?
I'm not sure what you mean by this?
Note: While HTTP/2 requests are independent of each other and HTTP is still stateless at a higher level, there are a few places, when you go lower level, where the strict wording of that could be challenged and where there is technically some connection between the requests. For example:
HTTP/2 uses HPACK header compression, so if the same header is sent twice on different requests (e.g. the same cookie), the second request will include a reference to the previously received header rather than repeating the data (illustrated in the sketch after this list).
Each request has a unique stream id, which is an increasing integer that is either odd (client-initiated) or even (server-initiated), so the stream ids must be managed by the HTTP/2 implementation across requests, which makes strict independence arguable.
HTTP/2 push resources are pushed in reference to a previous request's stream id.
But these are all really just connection-level issues and efficiencies. To an HTTP/2 user (e.g. a web developer), each HTTP/2 request is as independent as it was under HTTP/1.1, and HTTP is still stateless.
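The sketch promised above, showing the HPACK effect in isolation; it assumes the third-party Python `hpack` package and a made-up session cookie, not a full HTTP/2 stack:

```python
# Sketch only: the second request repeats the same cookie, but after the first
# transmission HPACK can refer to it by a dynamic-table index, so the second
# header block is much smaller on the wire. Assumes `pip install hpack`.
from hpack import Encoder

encoder = Encoder()
cookie = ("cookie", "session=abcdef0123456789abcdef0123456789")

first = encoder.encode([(":method", "GET"), (":path", "/a"), cookie])
second = encoder.encode([(":method", "GET"), (":path", "/b"), cookie])

print(len(first), len(second))  # the second block is noticeably smaller
```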

Moving from socket.io to raw websockets?

Right now I'm using socket.io with mandatory websockets as the transport. I'm thinking about moving to raw websockets but I'm not clear on what functionality I will lose moving off of socket.io. Thanks for any guidance.
The socket.io library adds the following features beyond standard webSockets:
Automatic selection of long polling vs. webSocket if the browser does not support webSockets or if the network path has a proxy/firewall that blocks webSockets.
Automatic client reconnection if the connection goes down (even if the server restarts).
Automatic detection of a dead connection (by using regular pings to detect a non-functioning connection)
Message passing scheme with automatic conversion to/from JSON.
The server-side concept of rooms where it's easy to communicate with a group of connected users.
The notion of connecting to a namespace on the server rather than just connecting to the server. This can be used for a variety of different capabilities, but I use it to tell the server what types of information I want to subscribe to. It's like connecting to a particular channel.
Server-side data structures that automatically keep track of all connected clients so you can enumerate them at any time.
Middleware architecture built-in to the socket.io library that can be used to implement things like authentication with access to cookies from the original connection.
Automatic storage of the cookies and other headers present on the connection when it was first connected (very useful for identifying what user is connected).
Server-side broadcast capabilities to send a common message to all connected clients, all clients in a room, or all clients in a namespace.
Tagging of every message with a message name and routing of message names into an eventEmitter so you listen for incoming messages by listening on an eventEmitter for the desired message name.
The ability for either client or server to send a message and then wait for a response to that specific message (a reply feature or request/response model).
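If you drop socket.io, features like the message-name scheme and JSON conversion from the list above have to be rebuilt by hand. A rough client-side sketch, assuming the third-party Python `websockets` package and a made-up URL and event name:

```python
# Sketch only: re-implementing just the "named events as JSON" part of
# socket.io's message scheme over a raw websocket. URL and event names are
# placeholders; reconnection, pings, rooms, etc. would also need rebuilding.
import asyncio
import json
import websockets

async def main():
    async with websockets.connect("ws://localhost:8080/ws") as ws:
        # socket.io's emit("chat", payload) becomes an explicit JSON envelope.
        await ws.send(json.dumps({"event": "chat", "data": {"text": "hello"}}))

        # Incoming frames are routed by hand on the event name instead of
        # being dispatched through socket.io's eventEmitter-style listeners.
        message = json.loads(await ws.recv())
        if message.get("event") == "chat":
            print("chat:", message.get("data"))

asyncio.run(main())
```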

ZMQ: Multiple request/reply-pairs

ZeroMQ's Pub/Sub pattern makes it easy for the server to reply to the right client. However, it is less obvious how to handle communication that cannot be resolved within two steps, i.e. protocols where multiple request/reply pairs are necessary.
For example, consider a case where the client is a worker which asks the server for new work of a specific type, the server replies with the parameters of the work, the client then sends the results, and the server checks these and replies whether they were correct.
Obviously, I can't just use recv, send, recv, send sequentially and assume that the first and the second recv are from the same client. What would be the idiomatic way to use multiple recv/send pairs without having to handle messages from other clients in between?
Multiple Request/Reply pairs can be made through the use of ZMQ_ROUTER sockets. I recommend using ZMQ_REQ sockets on the clients for bidirectional communication.
If you want to have multiple clients accessing a single server you could use a router socket on the server and request sockets on the clients.
Check out the ZMQ guide's section on this pattern:
http://zguide.zeromq.org/php:chapter3#The-Asynchronous-Client-Server-Pattern
All the clients will interact with the server in the same pattern as Pub/Subs except they will all point at a single server Router socket.
The server, on the other hand, will receive a three-part message for every single message a client sends. The parts represent:
Part 0 = identity of the connection (an opaque value identifying which client it is)
Part 1 = empty delimiter frame
Part 2 = data of the ZMQ message.
Reference:
http://zguide.zeromq.org/php:chapter3#ROUTER-Broker-and-REQ-Workers
The identity can be used to differentiate between clients accessing a single port. Sending the reply back on the router socket with the frames repacked in the same order (identity, empty frame, then a different data frame) will automatically route it to the client that sent the message.
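A minimal single-process sketch of this routing with pyzmq; the port and payloads are arbitrary placeholders:

```python
# Sketch only: ROUTER server and REQ client in one process. The ROUTER
# receives [identity, empty frame, data] and echoes the first two frames back
# so each reply reaches the client that asked.
import zmq

ctx = zmq.Context.instance()

router = ctx.socket(zmq.ROUTER)
router.bind("tcp://*:5555")

req = ctx.socket(zmq.REQ)
req.connect("tcp://localhost:5555")

# First request/reply pair: the worker asks for work.
req.send(b"give me work")
identity, empty, data = router.recv_multipart()
router.send_multipart([identity, empty, b"work parameters"])
print(req.recv())

# Second pair: the same worker submits results. The identity frame tells the
# server which client this is, so other clients could be served in between.
req.send(b"here are the results")
identity, empty, data = router.recv_multipart()
router.send_multipart([identity, empty, b"results accepted"])
print(req.recv())
```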

Can both sides send data on an FTP bi-directional data connection

I always thought that when an FTP data connection opens, it transfers data in one direction only.
Now I found out that both sides can transfer data on the opened data connection.
My questions:
What is it used for? I read that it can be used to transfer files over SSL, so the bi-directional channel is used for negotiation, but then why not use FTPS?
The data connection opens for the transfer of files and for listings (anything else?). So what should the sending side do when it receives data from the other side? How would it process it?
Are there clients supporting this behavior?
Is it common?
You are correct that the FTP RFC indeed mentions the possibility that the data connection may be used bi-directionally:
It ought to also be noted that the data connection may be used for simultaneous sending and receiving
But it's likely that the RFC authors just wanted to make sure such an option is available for future features of the protocol.
As far as I know, there's actually no such feature that makes use of a bi-directional data connection.
The FTP protocol does not allow simultaneous transfers at all, neither in the same nor opposite direction.
Currently the data connection is used:
For downloads, where only the server sends data.
For uploads, where only the client sends data.
For directory listings, where only the server sends data.
Regarding FTPS: Indeed, if the data connection is encrypted using TLS/SSL, the connection is used bi-directionally at the TCP level when the client and the server negotiate the encryption. But I do not think this is what the RFC refers to, as SSL/TLS did not exist at the time, and the negotiation is out of scope of the FTP protocol anyway.
