Best way to save data using Socket IO - socket.io

I'm learning how to use Socket.IO, and I'm building a small game.
When someone creates a room, I save the room values in an array.
var clients = [], rooms = [];
...
rooms.push(JSON.parse(roomData));
But if the server crashes, it loses all the room data.
Is it a good idea to save the data in a database and repopulate the array from those values when a user connects to the server?
Thank you.

Restoring socket.io connection state after a server crash is a complicated topic that depends a lot on exactly what you're doing and what the state is. Sometimes the client can hold most of the state, sometimes it must be persisted on the server.
State can be stored on disk, in another in-memory process such as Redis, or on the client, which presents it when it reconnects.
You have to devise a sequence of events on your server for how everything gets restored when a client reconnects. You will also likely need persistent client IDs so you know which client is which when they reconnect.
There are many different ways to do it. So, yes, you could use a DB, or you could do it a different way. There is no single "best" way because it depends on your particular circumstances and the tools you are already using.

Related

Storing room data on Socket.io

I am making a web app with Socket.io and I want to store data for each of the rooms. The data includes some data about users, as well as the room itself, etc., all in a JavaScript object.
Now my question is if I simply have an array let rooms = [] on my server.js which I manipulate and use to store data, would that be OK?
If I deploy to production and have users on the site, would this be fine and work as expected? I am not sure if I need to implement a DB here. Thoughts?
It really depends on what you want to get out of it. Using local state (i.e. what you are doing with let rooms = []) will work just fine (I've done this and had success with it).
The downside is that your state will be in one server's memory. So if that server goes down or you restart it, you will lose all that state (all your rooms). Also, if you need to scale beyond one server then this won't work because each server would have a different list of room data. Your clients would get a different view of things depending on which server they connect to.
The reason this approach has worked for me previously was because my data was very transient and I could accept losing it. I also did not have scaling needs.
In summary, if your situation is such that:
- you won't have more users than you can handle on one server instance at any given time
- it's okay if your data gets reset
Then go ahead with this - it worked great for me! Otherwise, if you want to make sure your room data doesn't get reset or if you need more than one server, you will want something like a database.
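A minimal sketch of the single-server, in-memory approach described above (all names are illustrative); note that both closeRoom and a process restart lose the state for good:

```javascript
// Single-server room store held entirely in process memory.
const rooms = new Map(); // roomId -> { users: [...], meta: {...} }

function createRoom(roomId, meta) {
  rooms.set(roomId, { users: [], meta });
}

function joinRoom(roomId, userId) {
  const room = rooms.get(roomId);
  if (!room) throw new Error('no such room: ' + roomId);
  room.users.push(userId);
}

function closeRoom(roomId) {
  rooms.delete(roomId); // gone for good: a restart also loses everything
}
```

A Map keyed by room ID tends to be easier to work with than a bare array once you need lookups and deletions by room.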

Where in kernel/socket memory to store long term information between network sessions

I'm trying to implement the QUIC protocol in the Linux kernel. QUIC works on top of UDP to provide connection-oriented, reliable data transfer.
QUIC was designed to reduce the number of handshakes required between sessions as compared to TCP.
Now, I need to store some data from my current QUIC session so that I can use it after the session ends to initiate a new session. I'm at a loss about where this data should be stored so that it isn't deleted between sessions.
EDIT 1: The data needs to be stored for as long as the socket lives in memory. Once the socket has been destroyed, I don't need the data anymore.
As an aside, how can I store data even between different sockets? I just need a general answer to this, as I don't need it for now.
Thank you.

synchronize data structures between unreliable client and server when data is too large for client

Summary:
How do I synchronize a very large amount of data with a client which can't hold all the data in memory and keeps disconnecting?
Explanation:
I have a real-time (ajax/comet) app which will display some data on the web.
I like to think of this as the view being on the web and the model being on the server.
Say I have a large number of records on the server, all of them being added/removed/modified all the time. Here are the problems:
- This being the web, the client is likely to have many connections/disconnections. While the client is disconnected, data may have been modified, and the client will need to be updated when it reconnects. However, the client can't be sent ALL the data on every reconnection, since the data is so large.
- Since there is so much data, the client obviously can't be sent all of it. Think of a Gmail account with thousands of messages, or Google Maps with ... the whole world!
I realize that initially a complete snapshot of some relevant subset of the data will be sent to the client, and thereafter only incremental updates. This will likely be done through some sort of sequence numbers: the client will say "the last update I received was #234" and the server will send all updates between #234 and #current.
I also realize that the client-view will notify the server that it is 'displaying' records 100-200 "so only send me those" (perhaps 0-300, whatever the strategy).
However, I hate the idea of coding all of this myself. This is a general enough and common enough problem that there must be libraries (or at least step-by-step recipes) for it already.
I am looking to do this either in Java or node.js. If solutions are available in other languages, I'll be willing to switch.
Try a pub/sub solution. Subscribe the client at a given start time to your server events.
The server logs all data change events based on the time they occur.
After a given time, or on reconnect, the client asks for a list of all changed data rows since the last sync.
You can keep all the logic on the server and just sync the changes. Would result in a typical "select * from table where id in (select id from changed_rows where change_date > given_date)" statement on the server, which can be optimized.
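A rough sketch of that change-log idea in plain Node.js (no database; recordChange and changesSince are made-up names): each mutation is logged with a sequence number, and a reconnecting client asks for everything newer than the last sequence number it saw.

```javascript
// Server-side change log: every mutation gets a monotonically
// increasing sequence number (the "timestamp" in the answer above).
let nextSeq = 1;
const changeLog = []; // { seq, rowId }

function recordChange(rowId) {
  changeLog.push({ seq: nextSeq++, rowId });
}

// A reconnecting client says "the last update I received was #lastSeq".
// Return only the ids of rows changed since then, deduplicated, so a
// row modified ten times is sent once.
function changesSince(lastSeq) {
  const ids = new Set();
  for (const entry of changeLog) {
    if (entry.seq > lastSeq) ids.add(entry.rowId);
  }
  return [...ids];
}
```

In a database-backed version, changesSince becomes the "select ... where change_date > given_date" query from the answer, with the sequence number playing the role of the change date.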

Send data to browser

An example:
Say, I have an AJAX chat on a page where people can talk to each other.
How is it possible to display (send) the message sent by person A to persons B, C and D while they have the chat opened?
I understand that technically it works a bit differently: the chat (AJAX) reads from a DB (or another source), say every second, to find out whether there are new messages to display.
But I wonder whether there is a way to send the new message to the rest of the people right when it is sent, rather than loading the DB with thousands of reads every second.
Please note that the AJAX chat example is just an example to explain what I want, not something I want to build. I just need to know whether there is a way to let all the open browsers on a specific (AJAX) page know that there is new content on the server that should be fetched.
Since the server cannot respond to a client without a corresponding request, you need to keep state for each user's queued message. However, this is exactly what the database accomplishes. You cannot get around this by replacing the database with something that doesn't just accomplish the same thing in a different way. That said, there are surely optimizations you could do. Keep in mind, however, that you shouldn't prematurely optimize situations like this; databases are designed to handle extremely high traffic, and it's very possible (and in fact, likely), that the scenario described will be handled just fine by the database out of the box.
What you're describing is generally referred to as the 'Comet' concept. See the Wikipedia article for details, especially implementation options (long polling, etc.).
Another answer is to have the server push changes to connected clients, that way there is just one call to the database and then the server pushes the change to all the clients. This article indicates it is possible, however I have never tried this myself.
It's very basic, but if you want to stick with a standard AJAX solution, a simple means of reducing load on the server when polling would be to get the AJAX call to forward the last collected comment ID for that client - you then use that (with the appropriate escaping) in the lookup query on the server side to ensure you only return new comments.
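That last-comment-ID scheme can be sketched in a few lines (the names addComment and commentsAfter are illustrative, and the in-memory array stands in for the database query):

```javascript
// All comments, newest last; ids are monotonically increasing.
const comments = [];
let nextId = 1;

function addComment(text) {
  comments.push({ id: nextId++, text });
}

// The AJAX poll sends the last id the client has seen;
// the server returns only what is newer.
function commentsAfter(lastSeenId) {
  return comments.filter((c) => c.id > lastSeenId);
}
```

The polling cost stays the same, but each response shrinks to only the new rows, which is usually the expensive part.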

How to sync a list on a server with many clients?

Consider a poker game server which hosts many tables. While a player is in the lobby, he has a list of all the active tables and their stats. These stats constantly change as players join, play, and leave tables. Tables can be added and closed.
Somehow, these changes must be notified to the clients.
How would you implement this functionality?
Would you use TCP/UDP for the lobby (that is, should users connect to server to observe the lobby, or would you go for a request-response mechanism)?
Would the server notify clients about each event, or should the client poll the server?
Keep in mind: maybe the most important goal of such a system is scalability. It should be easy to add more servers to cope with a growing audience, while all the users should see one big list that spans multiple servers.
This specific issue is a manifestation of a very basic issue in your application design - how should clients be connecting to the server.
When scalability is a concern, go with a scalable design from the start, using non-blocking I/O patterns such as the Reactor design pattern. It is much preferable to use standard solutions that already have a working, tested implementation of such patterns.
Specifically in your case, which involves a fast-paced game that is constantly updating, it sounds reasonable to use a scalable (again, non-blocking I/O) server which holds a TCP connection to each client and updates it on the information it needs to know.
A request-response cycle sounds less appropriate for your case, but this should be verified against the exact specifications of your application.
That's my basic suggestion:
The server updates the list (adding, removing, and altering existing items) through an interface that keeps a fixed-length queue of the operations that have been applied to the list. Each operation is given a timestamp. When the queue is full, the oldest operations are progressively discarded.
When the user first needs to retrieve the list, it asks the server to send it the complete list. The server sends the list along with the current timestamp.
At some arbitrary interval (10-30 seconds?), the client asks the server to send it all the operations that have been applied to the list since the timestamp it got.
The server then checks whether that timestamp still appears in the queue (that is, it is not older than the timestamp of the first item). If so, it sends the client the list of operations that have occurred from that time to the present, plus the current timestamp. If it's too old, the server sends the complete list again.
UDP seems to suit this approach, since it's no biggie if an "update cycle" gets lost once in a while.
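A rough sketch of the fixed-length operation queue described above, using a logical clock in place of wall-clock timestamps (all names, and the MAX_OPS capacity, are illustrative):

```javascript
// Fixed-length operation queue; oldest ops are discarded when full.
const MAX_OPS = 100; // illustrative capacity
const ops = []; // { ts, op }
let clock = 0;  // logical timestamp

function applyOp(op) {
  ops.push({ ts: ++clock, op });
  if (ops.length > MAX_OPS) ops.shift(); // drop the oldest
}

// Client asks: "what changed since timestamp ts?"
// Returns { full: true } when ops after ts have already been
// discarded, telling the client to re-fetch the complete list.
function opsSince(ts) {
  if (ops.length > 0 && ts < ops[0].ts - 1) return { full: true };
  return { full: false, ops: ops.filter((e) => e.ts > ts), now: clock };
}
```

A logical clock sidesteps the clock-skew problems that real timestamps would introduce if the list is ever sharded across servers.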
