What's the best way to store websocket connections in golang?

I wrote a web chat app, using websockets. It uses one connection per page to push new messages to online users.
So, there are a lot of websocket.Conn to manage. I'm currently using a map.
onlineUser := make(map[int]*websocket.Conn)
I am very worried about the map when 1,000,000 pages are open.
Is there a better way to store all the websocket.Conn values?
In Erlang, the built-in database can be used to store a socket.
For Go, I had considered using "encoding/gob" to cache the socket in memcached or Redis. But then the websocket.Conn would have to be GOB-decoded before use, and that would consume too much CPU.

1,000,000 concurrent users is a nice problem to have. :)
Your primary problem is that the websocket spec requires the connection to be kept open. This means you can't serialize and cache them.
However, even if you could GOB-encode them, and every user sent one message a second (unrealistically high, IMO), a million GOB decodes per second isn't going to dominate your load.
Also a map can easily handle a million entries.
If you really want to worry about future scaling, figure out how to get two instances of your application to work together on the load. Then increase that to N instances.
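For reference, a minimal sketch of such a map-based registry, guarded by a mutex so it can be shared across handler goroutines. The gorilla/websocket import and all names here are assumptions for illustration; the question doesn't say which websocket package is in use, and any library exposing a *websocket.Conn works the same way.

package chat

import (
    "sync"

    "github.com/gorilla/websocket" // assumption: substitute whichever websocket package you use
)

// registry holds the online users' connections. A plain map copes fine
// with millions of entries; it only needs locking because HTTP handlers
// run in separate goroutines.
type registry struct {
    mu    sync.RWMutex
    conns map[int]*websocket.Conn
}

func newRegistry() *registry {
    return &registry{conns: make(map[int]*websocket.Conn)}
}

func (r *registry) add(userID int, c *websocket.Conn) {
    r.mu.Lock()
    defer r.mu.Unlock()
    r.conns[userID] = c
}

func (r *registry) remove(userID int) {
    r.mu.Lock()
    defer r.mu.Unlock()
    delete(r.conns, userID)
}

// push sends a message to one user if they are online. Note that gorilla
// allows only one concurrent writer per connection, so a real app would
// also serialize writes per connection.
func (r *registry) push(userID int, msg []byte) error {
    r.mu.RLock()
    c, ok := r.conns[userID]
    r.mu.RUnlock()
    if !ok {
        return nil // user not online
    }
    return c.WriteMessage(websocket.TextMessage, msg)
}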

Related

Manage multi-tenancy ArangoDB connection

I use ArangoDB with Go (using go-driver) and need to implement multi-tenancy, meaning every customer's data lives in a separate DB.
What I'm trying to figure out is how to make this multi-tenancy work. I understand that it's not sustainable to create a new DB connection for each request, which means I have to maintain a pool of connections (not a typical connection pool, though). Of course, I can't just create connections without limit; there has to be a cap. The more I think about it, the more I realize I need some advice. I'm new to Go, coming from the PHP world, and this is obviously a completely different paradigm from PHP.
Some details
I have an API (written in Go) which talks to ArangoDb using arangodb/go-driver. A standard way of creating a DB connection is
// create a connection
conn, err := graphHTTP.NewConnection(...)
// create a client
c, err := graphDriver.NewClient(...)
// create a DB connection
graphDB, err := p.cl.Database(...)
This works if there is only one DB and the DB connection is created when the API boots up.
In my case it's many, and, as previously suggested, I need to maintain a DB connections pool.
Where it gets fuzzy for me is how to maintain this pool, keeping in mind that the pool has to have a limit.
Say my pool is of size 5, and over time it has been filled up with connections. A new request comes in, and it needs a connection to a DB which is not in the pool.
The way I see it, I have only 2 options:
Kill one of the pooled connections, if it's not used
Wait till #1 can be done, or throw an error if waiting time is too long.
The biggest unknown for me, mainly because I've never done anything like this before, is how to track whether a connection is being used or not.
What makes things even more complex is that the DB connection has its own pool at the transport level.
Any recommendations on how to approach this task?
I implemented this in a Java proof of concept SaaS application a few months ago.
My approach can be described in a high level as:
Create a Concurrent Queue to hold the Java Driver instances (the Java driver has connection pooling built in)
Use subdomain to determine which SaaS client is being used (can use URL param but I don't like that approach)
Reference the correct Connection from Queue based on SaaS client or create a new one if not in the Queue.
Continue with request.
This was fairly trivial by naming each DB to match the subdomain, but a lookup from the _system database could also be used.
*Edit
The Concurrent Queue holds at most one Driver object per database and hence the size at most will match the number of databases. In my testing I did not manage the size of this Queue at all.
A good server should be able to hold hundreds of these or even thousands depending on memory, and a load balancing strategy can be used to split clients into different server clusters if scaling large enough. A worker thread could also be used to remove objects based on age but that might interfere with throughput.
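A rough Go sketch of the same idea: keep at most one go-driver client per tenant database in a guarded map and create it lazily on first use. As in the answer above, the size of the map isn't managed here (eviction, limits and waiting are left out), and the endpoint and tenant names are placeholders.

package tenants

import (
    "context"
    "sync"

    graphDriver "github.com/arangodb/go-driver"
    graphHTTP "github.com/arangodb/go-driver/http"
)

// clientPool keeps at most one go-driver client per tenant database.
// The driver pools HTTP connections on the transport level, so one
// client per tenant is usually enough.
type clientPool struct {
    mu        sync.Mutex
    endpoints []string
    clients   map[string]graphDriver.Client // keyed by tenant DB name
}

func newClientPool(endpoints []string) *clientPool {
    return &clientPool{endpoints: endpoints, clients: make(map[string]graphDriver.Client)}
}

// databaseFor returns a handle to the tenant's database, creating and
// caching the underlying client on first use.
func (p *clientPool) databaseFor(ctx context.Context, tenantDB string) (graphDriver.Database, error) {
    p.mu.Lock()
    c, ok := p.clients[tenantDB]
    if !ok {
        conn, err := graphHTTP.NewConnection(graphHTTP.ConnectionConfig{Endpoints: p.endpoints})
        if err != nil {
            p.mu.Unlock()
            return nil, err
        }
        c, err = graphDriver.NewClient(graphDriver.ClientConfig{Connection: conn})
        if err != nil {
            p.mu.Unlock()
            return nil, err
        }
        p.clients[tenantDB] = c
    }
    p.mu.Unlock()
    return c.Database(ctx, tenantDB)
}

The tenant database name would come from the subdomain (or a lookup against _system), exactly as described above.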

Where in kernel/socket memory to store long term information between network sessions

I'm trying to implement the QUIC protocol in the Linux kernel. QUIC works on top of UDP to provide connection-oriented, reliable data transfer.
QUIC was designed to reduce the number of handshakes required between sessions as compared to TCP.
Now, I need to store some data from my current QUIC session so that I can use it when the session ends, and later on use it to initiate a new session. I'm at a loss about where this data should be stored so that it's not deleted between sessions.
EDIT 1: The data needs to be stored for as long as the socket lives in memory. Once the socket has been destroyed, I don't need the data anymore.
As an aside, how can I store data even between different sockets? Just need a general answer to this as I don't need it for now.
Thank you.

Broadcast Server

I am writing a TCP Server that accepts connections from multiple clients, this server gathers data from the system that it's running on and transmits it to every connected client.
What design patterns would be best for this situation?
Example
Put all connections in an array, then loop through the array and send the data to each client one by one. Advantage: very easy to implement. Disadvantage: not very efficient when handling large amounts of data.
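Sketched in Go (the language used elsewhere on this page; the question doesn't name one), the loop approach looks roughly like this, with the connection set held in a mutex-guarded map and all names made up for the example:

package broadcast

import (
    "net"
    "sync"
)

// server keeps the set of connected clients and writes each payload
// to all of them in turn (the simple loop approach described above).
type server struct {
    mu    sync.Mutex
    conns map[net.Conn]struct{}
}

func newServer() *server {
    return &server{conns: make(map[net.Conn]struct{})}
}

func (s *server) add(c net.Conn)    { s.mu.Lock(); s.conns[c] = struct{}{}; s.mu.Unlock() }
func (s *server) remove(c net.Conn) { s.mu.Lock(); delete(s.conns, c); s.mu.Unlock() }

// broadcast writes data to every client sequentially; a slow client
// stalls everyone behind it, which is the drawback noted above.
func (s *server) broadcast(data []byte) {
    s.mu.Lock()
    defer s.mu.Unlock()
    for c := range s.conns {
        if _, err := c.Write(data); err != nil {
            c.Close()
            delete(s.conns, c)
        }
    }
}

The inefficiency shows up in broadcast: one slow client blocks everyone behind it, which is what the per-subscriber queues suggested below address.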
An easier way is to use some existing software to do this ... For example use https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=d5bedadd-e46f-4c97-af89-22d65ffee070 .
If you want to write it on your own, you will need a list (e.g. a linked list) to manage the connections.
Here is an example of a server http://examples.oreilly.com/jenut/Server.java
If you want to handle large amounts of data, one of the techniques is to have a queue associated with each of the subscribers at the server end. A multi-threaded program can send the data to the clients from those queues.
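A minimal sketch of that queue-per-subscriber idea in Go, using a buffered channel per client drained by its own goroutine (names and the queue size are illustrative, not from the answer):

package broadcast

import "net"

// subscriber owns a buffered queue; a dedicated goroutine drains it,
// so one slow client no longer blocks the broadcast loop.
type subscriber struct {
    conn  net.Conn
    queue chan []byte
}

func newSubscriber(conn net.Conn) *subscriber {
    s := &subscriber{conn: conn, queue: make(chan []byte, 64)}
    go s.writeLoop()
    return s
}

func (s *subscriber) writeLoop() {
    for msg := range s.queue {
        if _, err := s.conn.Write(msg); err != nil {
            s.conn.Close()
            return
        }
    }
}

// enqueue drops the message if the subscriber's queue is full rather than
// blocking the producer; other policies (disconnect, grow) are possible.
func (s *subscriber) enqueue(msg []byte) {
    select {
    case s.queue <- msg:
    default:
    }
}

The broadcast loop then just calls enqueue on every subscriber instead of writing to the socket directly.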
A number of patterns have been developed for distributed processing and servers, for instance in the ACE project: http://www.cs.wustl.edu/~schmidt/patterns-ace.html. The design might be focused around events which announce either that data has been received and may be read, or that buffers have been emptied and more data may now be written. At least in the days when a 32-bit address space was the rule, you could have many more open connections than threads, so you would typically have a small number of threads waiting for notifications that they could safely read or write without stalling until the other side co-operated. These notifications may come from events, or from calls such as select() or poll() returning. Patterns are also associated with ZeroMQ: http://zguide.zeromq.org/page:all#-MQ-in-a-Hundred-Words.

Is the mux in this golang socket.io example necessary?

In an app that I'm making, a user is always part of a 'game'. I'd like to set up a socket.io server to communicate with users in a game. I'm planning to use go-socket.io (http://godoc.org/github.com/madari/go-socket.io), which defines the newSocketIO function to create a new socketio instance.
Instead of creating one socketio instance, I thought it might be possible to create a map that maps game IDs to socket.io instances, and configure them so that they listen on a URL that represents the game ID.
This way, I can use methods such as broadcast and broadcastExcept to broadcast to all players within a single game. However, I'd have to start a new goroutine for every game, and I don't know enough about their performance characteristics to know if this is scalable. The request rate for a single socketio instance will be very low, about 1/second at peak times, but the connection might be idle for tens of seconds at other times (except for heartbeats, and possibly other communication specified by the socket.io protocol).
Would I be better off creating 1 socket.io instance, and tracking which connections belong to which games?
I'd have to start a new goroutine for every game, and I don't know enough about their performance characteristics to know if this is scalable
Fire away, the Go scheduler is built to efficiently handle thousands and even millions of goroutines.
The default net/http server in the Go standard library, for instance, spawns a goroutine for every client connection.
Just remember to return from your goroutines once they're done working. Otherwise you'll end up with a lot of stale ones.
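As a tiny illustration of that point, a per-game goroutine can be tied to a done channel (or a context) so it exits as soon as the game is over; the names and the 30-second ticker below are made up for the example:

package game

import "time"

// runGame owns everything for a single game and returns when the game
// is over, so the goroutine does not linger after the last player leaves.
func runGame(id string, events <-chan string, done <-chan struct{}) {
    ticker := time.NewTicker(30 * time.Second) // e.g. periodic housekeeping/heartbeat
    defer ticker.Stop()
    for {
        select {
        case ev := <-events:
            _ = ev // broadcast ev to the game's connections here
        case <-ticker.C:
            // the goroutine costs almost nothing while the game is idle
        case <-done:
            return // game finished: let the goroutine exit
        }
    }
}

Starting one of these per game (go runGame(id, events, done)) is well within what the Go scheduler handles comfortably.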
Would I be better off creating 1 socket.io instance, and tracking which connections belong to which games?
I'm not involved in the project but if it follows Go's "get sh*t done" philosophy, then it shouldn't matter. You can find out what works better by profiling both approaches though.

What is a good pattern for using AsyncSockets in .NET 3.5 when initiating several client connections?

I'm re-building an IM gateway and hope to take advantage of the new performance features in AsyncSockets for .NET 3.5.
My existing implementation simply creates packets and forwards IM requests from users to the various IM networks as required, handling request/response streams for each connected user's session (socket).
I presently have to cope with IAsyncResult and, as you know, it's not very pretty or scalable.
My confusion is this basically:
1) in using the new Begin/End and SocketAsyncEventArgs in 3.5 do we still need to create one SocketAsyncEventArgs per socket?
2) do we gain anything by pre-initializing say, 20000 client connections since we know the expected max_connections per server is 20000
3) do we still need to use a LOH (large object heap) allocated byte[] to handle receive data, as shown in the SocketServers example on MSDN? We are not building a server per se, but are still handling a lot of independent receives for each connected socket.
4) maybe there is a better pattern altogether for what I'm trying to achieve?
Thanks in advance.
Charles.
1) IAsyncResult/Begin/End is a completely different system from the "xAsync" methods that use SocketAsyncEventArgs. You're better off using SocketAsyncEventArgs and dropping Begin/End entirely.
2) Not really. Initialize a smaller number (50? 100?) and use an intermediate class (i.e. a "resource pool") to manage them. As more requests come in, grow the pool by another 50 or 100, for example. The tough part is efficiently "scaling down" the number of pooled items as resource requirements drop. A large number of sockets/buffers/etc. will consume a large amount of memory, so it's better to only allocate them in batches as the server requires (see the sketch after this list).
3) Don't need to use it, but it's still a good idea. The buffer will still be "pinned" during each call.
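The batch-grown pool idea from point 2 isn't specific to .NET; here is a minimal sketch in Go (the language used elsewhere on this page), with placeholder names and sizes, just to show the shape of it:

package pool

import "sync"

// bufferPool hands out fixed-size receive buffers and grows its free list
// in batches only when it runs dry, instead of pre-allocating the
// theoretical maximum (e.g. 20,000 connections' worth) up front.
type bufferPool struct {
    mu        sync.Mutex
    free      [][]byte
    batchSize int
    bufSize   int
}

func newBufferPool(batchSize, bufSize int) *bufferPool {
    return &bufferPool{batchSize: batchSize, bufSize: bufSize}
}

func (p *bufferPool) get() []byte {
    p.mu.Lock()
    defer p.mu.Unlock()
    if len(p.free) == 0 {
        for i := 0; i < p.batchSize; i++ { // grow by one batch
            p.free = append(p.free, make([]byte, p.bufSize))
        }
    }
    buf := p.free[len(p.free)-1]
    p.free = p.free[:len(p.free)-1]
    return buf
}

func (p *bufferPool) put(buf []byte) {
    p.mu.Lock()
    p.free = append(p.free, buf)
    p.mu.Unlock()
}

Scaling the pool back down as load drops (the hard part mentioned above) would need an additional policy, e.g. trimming the free list periodically.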
