Network structure for an online programming game with WebSockets

Problem
I'm making a game where you provide a piece of code to serve as the agent program of an Intelligent Agent (think Robocode and the like), but browser-based. Being an AI/ML guy for the most part, my knowledge of web development was/is pretty lacking, so I'm having a bit of trouble implementing the whole architecture. Basically, after the upload of text (code), naturally handled client-side, the backend would be responsible for running the core logic and returning JSON data that would be parsed and used by the client, mainly for the drawing part. There isn't really a need for multiplayer support right now.
If I model this after Robocode's execution loop, I would need a separate process for each battle, which then assigns different agents (user-made or not) to different threads and gives each some execution time per loop, generating new information to be given to the agents as well as data for drawing the whole scene. I've tried to think of a good way to structure the multiple clients, servers/web servers/processes [...], and came up with multiple possible solutions.
Favored solution (as of right now)
Clients communicate with a Node.js server that works kind of like an interface (think websocketd) for unique processes running on the same (server) machine, keeping track of client and process via ID and forwarding the data (via WebSockets) accordingly. So an example scenario would be:
Client C1 requests new battle to server S and sends code (not necessarily a single step, I know);
S handles the code (e.g. compiling), executes a new battle and starts a connection with its process P1 (named pipes/FIFO?);
P1 generates JSON, sends to S;
S sees P1 is "connected" to C1, sends data to C1 (steps 3 and 4 will be repeated as long as the battle is active);
Client C2 requests new battle;
Previous steps are repeated; C2 is assigned to a new process P2;
Client C3 requests "watching" battle under P1 (using a unique URL or a token);
S finds P1's ID, compares to the received one and binds P1 to C3;
This way, the Server forwards received data from forked processes to all clients connected to each specific Battle.
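To make this concrete, here is a minimal sketch of S, assuming the `ws` npm package and a hypothetical `./battle.js` worker that reads one JSON command per line on stdin and writes one JSON frame per line on stdout (all names and message shapes are illustrative):

```javascript
// Relay server sketch: one child process per battle, WebSocket fan-out to
// every client bound to that battle. Not production-hardened.
const { WebSocketServer } = require('ws');
const { spawn } = require('child_process');
const crypto = require('crypto');

const battles = new Map(); // battleId -> { proc, clients: Set<WebSocket> }

function createBattle(code) {
  const id = crypto.randomUUID();
  const proc = spawn('node', ['./battle.js'], { stdio: ['pipe', 'pipe', 'inherit'] });
  const battle = { proc, clients: new Set() };
  battles.set(id, battle);
  proc.stdin.write(JSON.stringify({ type: 'init', code }) + '\n');

  let buf = '';
  proc.stdout.on('data', (chunk) => {
    buf += chunk; // split stdout into newline-delimited JSON frames
    let nl;
    while ((nl = buf.indexOf('\n')) >= 0) {
      const frame = buf.slice(0, nl);
      buf = buf.slice(nl + 1);
      for (const ws of battle.clients) ws.send(frame); // steps 3-4, repeated
    }
  });
  proc.on('exit', () => battles.delete(id));
  return id;
}

new WebSocketServer({ port: 8080 }).on('connection', (ws) => {
  ws.on('message', (raw) => {
    const msg = JSON.parse(raw);
    if (msg.type === 'new-battle') {   // C1 requests a new battle
      const id = createBattle(msg.code);
      battles.get(id).clients.add(ws);
      ws.send(JSON.stringify({ type: 'battle-created', id }));
    } else if (msg.type === 'watch') { // C3 watches an existing battle by id
      const battle = battles.get(msg.id);
      if (battle) battle.clients.add(ws);
    }
  });
});
```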
Questions
Regarding this approach:
Is it simple enough? Are there easier or even more elegant ways of doing it? Could scalability be a problem?
Is it secure enough (the whole compiling and running code — likely C++ — on the server)?
Is it fast enough (this one worries me the most for now)? It seems a bit counterintuitive to have a single server dealing with the entire traffic, but as far as I know, if I assigned each of these processes to a separate web server, I would need a different port for each of them, which seems even worse.

Since this is a theoretical and opinion-based question... I'll allow myself to throw the ball in different directions. I'll probably edit the answer as I think things over or read comments.
A process per battle?
Sounds expensive. Also, there is the issue of messages going back and forth between processes... we might as well be able to send the messages between machines and have a total separation of concerns.
Instead of forking battles, we could have them running on their own, allowing them to crash and reboot and do whatever they feel like without ever causing any of the other battles or our server any harm.
Javascript? Why just one language?
I would consider leveraging an Object Oriented approach or language - at least for the battles, if not for the server as well.
If we are separating the code, we can use different languages. Otherwise I would probably go with Ruby, as it's easy for me, but maybe I'm mistaken and delving deeper into Javascript's prototypes will do.
Oh... foreign code - sanitization is in order.
How safe is the foreign code? Should it run in a purpose-built, sandboxed language that promises safety, or on an existing language interpreter, which might allow the code to mess around with things it really shouldn't?
I would probably write my own "pseudo language" designed for the battles... or (if it were a very local project for me and mine) use Ruby with one of its sanitizing gems.
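Whichever language wins, a baseline containment step is to run the foreign code in a throwaway child process with hard limits. A sketch in Node (not a real sandbox on its own; containers, jails or seccomp would sit on top, and `binaryPath` is whatever the compile step produced):

```javascript
// Run an untrusted agent binary with a hard timeout and a capped output
// buffer, so a hostile or buggy agent can't hang or flood the server.
const { execFile } = require('child_process');

function runAgent(binaryPath, onResult) {
  execFile(binaryPath, [], { timeout: 5000, maxBuffer: 1024 * 1024 },
    (err, stdout) => {
      if (err) return onResult({ type: 'crashed', reason: err.message });
      onResult({ type: 'done', output: stdout });
    });
}
```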
Battles and the web-services might not scale at the same speed.
It seems to me that handling messages - both client->server->battle and battle->server->client - is fairly easy work. However, handling the battle seems more resource intensive.
I'm convincing myself that a separation of concerns is almost unavoidable.
Having a server backend and a different battle backend would allow you to scale the battle handlers up more rapidly and without wasting resources on scaling the web-server before there's any need.
Network disconnections.
Assuming we allow the players to go offline while their agents "fight" in the field... what happens when we need to send our user "Mitchel", who just reconnected to server X, a message from a battle he left raging on server Y?
Separating concerns would mean that right from the start we have a communication system that is ready to scale, allowing our users to connect to different endpoints and still get their messages.
Summing these up, I would consider this as my workflow:
Http workflow:
Client -> Web Server : requesting an agent with an identifier and optional battle data (battle data is used when creating an agent; omitting it limits the request to an existing agent, if one exists).
This step might be automated based on Client authentication / credentials (i.e. session data / cookie identifier or login process).
if battle data exists in the request (a request to create):
Web Server -> Battle instance : create the agent if it doesn't exist.
if battle data is missing from the request:
Web Server -> Battle Database, to check if agent exists.
Web Server -> Client : response about agent (exists / created vs none)
If the Agent exists or was created, initiate a WebSocket connection after setting up credentials for the connection (session data, a unique cookie identifier or a single-use unique token to be appended to the WebSocket request query; see the sketch after this list).
If the Agent doesn't exist, forward the client to a web form to fill in data such as agent code, battle type etc.
Websocket "workflows" (non linear):
Agent has data: Agent message -> (Battle communication manager) -> Web Server -> Client
It's possible to put Redis or a similar DB in there, to allow messages to stack while the user is offline and to allow multiple battle instances and multiple web server instances.
Client updates to Agent: Client message -> (Battle communication manager) -> Web Server -> Agent
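A minimal sketch of the credential handoff from the HTTP workflow to the WebSocket connection, assuming Express and the `ws` package (the `/agent` route, token lifetime and message shapes are all illustrative):

```javascript
// HTTP layer issues a short-lived single-use token; the WebSocket upgrade
// consumes it, so the socket starts out tied to a known agent/user.
const express = require('express');
const { WebSocketServer } = require('ws');
const crypto = require('crypto');

const tokens = new Map(); // token -> { agentId, expires }
const app = express();

app.post('/agent', express.json(), (req, res) => {
  // ... create or look up the agent here ...
  const token = crypto.randomBytes(16).toString('hex');
  tokens.set(token, { agentId: req.body.agentId, expires: Date.now() + 30000 });
  res.json({ token });
});

const server = app.listen(3000);
new WebSocketServer({ server }).on('connection', (ws, req) => {
  const token = new URL(req.url, 'http://localhost').searchParams.get('token');
  const entry = tokens.get(token);
  tokens.delete(token); // single use
  if (!entry || entry.expires < Date.now()) return ws.close(4001, 'bad token');
  // From here on, the socket is authenticated as entry.agentId's owner.
});
```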

Related

Online game's alive connections count

In online multiplayer games where the world around you changes frequently (the user gets updates from the server about that), how many alive connections are usually made?
For example, WebSockets can be used. Is it effective to send all data through one connection? You would have to check the type of every received message:
if it's info about the world -> make changes to the world around you;
if it's info about user's personal data -> make changes in your profile;
if it's a local chat message -> add a new message to the chat window.
..etc.
I think this if .. else if .. else if .. else if .. chain for every piece of incoming data hurts client-side performance. Wouldn't it be better to get world changes from a second WS connection? Then you wouldn't have to check the type every time. The other types are not so frequent, so the first connection could handle them.
So the question is how developers usually deal with connections count and message types to increase performance?
Thanks
It depends on client-side vs server-side load. You need to balance whether you want to place the load of having more open connections on the server, or the analysis of the payload on the client. If you have a simple game and your server is terrible, I would suggest placing more load client-side. However, for high-performance gaming backed by an excellent server, using more WebSockets would be the recommended approach.
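Either way, the per-message type check is cheap: a lookup table keeps dispatch constant-time without a second connection. A client-side sketch, with illustrative handler names and message shape:

```javascript
// Dispatch incoming messages via a lookup table instead of an if/else chain.
const handlers = {
  world:   (data) => { /* apply world diff */ },
  profile: (data) => { /* update profile UI */ },
  chat:    (data) => { /* append chat line */ },
};

const socket = new WebSocket('wss://example.com/game');
socket.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  (handlers[msg.type] || (() => {}))(msg.data); // unknown types are ignored
};
```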

Should I be using AJAX or WebSockets?

Oh, the joyous question of HTTP vs WebSockets is at it again. However, even after quite a bit of reading of the hundreds of versus blog posts, SO questions, etc., I'm still at a complete loss as to what I should be working towards for our application. In this post I will supply information on application functionality and the types of requests/responses used in our application currently.
Currently our application is a sloppy piece of work, thrown together using AngularJS and AJAX requests to an Apache server running PHP, namely XAMPP. With the launch of our application I've noticed that we're having problems with response times when the server is under any kind of load. This probably has something to do with the sloppy architecture of our server, the hardware, and the fact that our MySQL database isn't exactly optimized.
However, with such a loyal fanbase and investors seeing potential in our application and giving us a chance to roll out a 2.0, I've been studying hard into how to turn this application into a powerhouse of low-latency scalability. Honestly, the best option would be to hire someone with experience, but unfortunately I'm a hobbyist and a one-man army without much experience.
After some extensive research, I've decided on writing the backend using NodeJS this time. However, I'm having a hard time deciding on HTTP or WebSockets. Here are the types of transactions that are done between the server and client.
Client sends a request to the server in JSON format. The request has a few different things.
A request id (for processing logic based on the request)
The data associated with the request ID.
The server receives the request, polls the database (if necessary) and then responds to the client in JSON format. Sometimes the server serves files to the client, namely images in Base64 format.
Currently the application (When being used) sends a request to the server every time an interface is changed, which on average for our application is once every few seconds. Every action on our interfaces sends another request to the server. The application also sends requests to check for notifications/messages every 8 seconds, (or two seconds depending on if they're on the messaging interface).
Currently here are the benefits I see of a stated connection over a stateless connection with our application.
If the connection is stated, I can eliminate the requests for notifications and messages, as the server can just tell the client whenever one becomes available. This can eliminate x(n)/4 requests per second to the server alone.
Handling something like a disconnection from the server is as simple as attempting to reconnect, as opposed to handling timeouts/errors per request; this would only be handled on the socket.
Additional security can be obtained by removing security keys for database interaction; this should prevent the possible hijacking(?) of a session_key and using it to manipulate or access another user's data. The session_key is only needed because there is no state in the AJAX setup.
However, I'm someone who started learning programming through TCP game server emulation. So I understand some benefits of a STATED connection, while I don't understand the benefits of a STATELESS connection very much at all. I know they both have their benefits and quirks, but I'm curious what would be the best approach for us.
We're mainly looking for Scalability, as we had a local application launch and managed to bottleneck at nearly 10,000 users in under 48 hours. Luckily I announced this as a BETA and the users are cutting me a lot of slack after learning that I did it all on my own as a learning project. I've disabled registrations while looking into improving the application's front and backend.
IMPORTANT:
If using WebSockets, would we be able to asynchronously download pictures from the server like we can with AJAX? For example, I can make 5 requests to the server using AJAX for 5 different images, and they will all start downloading immediately; using a stated connection, would I have to wait for each photo to be streamed before moving to the next request? Would this bottleneck only a single user, or every user that is waiting on a request to be completed?
It all boils down to how your application works and how it needs to scale. I would use bare WebSockets rather than any wrapper, since it is an already easy-to-use API and your hands won't be tied when you need to scale out.
Here are some links that will give you insight, although not concrete answers to your questions because, as I said, it depends on your expectations.
Hard downsides of long polling?
WebSocket/REST: Client connections?
Websockets, and identifying unique peers[PHP]
How HTML5 Web Sockets Interact With Proxy Servers
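On the IMPORTANT question: a single WebSocket delivers frames in order, but you can still get AJAX-like parallelism by multiplexing logical requests, tagging each with an id so responses can be matched out of order. A sketch (message shape and endpoint are illustrative; note that one huge un-chunked response will still delay the frames queued behind it on the same TCP connection):

```javascript
// Multiplex several logical requests over one WebSocket using request ids.
const socket = new WebSocket('wss://example.com/api');

let nextId = 0;
const pending = new Map(); // id -> resolve callback

function request(payload) { // assumes the socket is already open
  return new Promise((resolve) => {
    const id = nextId++;
    pending.set(id, resolve);
    socket.send(JSON.stringify({ id, ...payload }));
  });
}

socket.onmessage = (event) => {
  const { id, data } = JSON.parse(event.data);
  const resolve = pending.get(id);
  if (resolve) { pending.delete(id); resolve(data); }
};

// Five images "in flight" at once, like five parallel AJAX calls:
// Promise.all([1, 2, 3, 4, 5].map((i) => request({ type: 'image', i })));
```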
If your question is "Should I use HTTP over WebSockets?", the response is: you should not.
Even if it is faster because you don't lose time opening the connection, you also lose everything the HTTP specification gives you: verbs (GET, POST, PATCH, PUT, ...), paths, bodies, responses and status codes. This seems simple, but you would have to re-implement all or part of these protocol features yourself.
So you should use Ajax, as long as it is a one-off request.
When you need to make an AJAX request every 2 seconds, what you actually need is for the server to send you data, not for YOU to ask the server whether anything changed. That is a sign that you should implement a WebSocket server.
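A sketch of that push model, replacing the 8-second notification poll from the question (uses the `ws` package; `authenticate` is an assumed helper and the message shape is illustrative):

```javascript
// Push notifications to connected users instead of letting them poll.
const { WebSocketServer } = require('ws');

const sockets = new Map(); // userId -> WebSocket

new WebSocketServer({ port: 8080 }).on('connection', (ws, req) => {
  const userId = authenticate(req); // assumed helper: cookie/token -> userId
  sockets.set(userId, ws);
  ws.on('close', () => sockets.delete(userId));
});

// Called by application code the moment a notification is created,
// instead of waiting for the client's next poll.
function pushNotification(userId, notification) {
  const ws = sockets.get(userId);
  if (ws && ws.readyState === ws.OPEN) {
    ws.send(JSON.stringify({ type: 'notification', data: notification }));
  }
}
```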

Best way to initialize the first connection with a server for REST calls?

I've been building some apps that connect to a SQL backend. I use ajax calls to hit WebMethods, a WebAPI, etc.
I notice that the first call to the SQL backend retrieves the data fairly slowly. I can only assume that this is because it must first negotiate credentials before retrieving the data. It probably caches this somewhere, and thus any calls made afterwards come back very fast.
I'm wondering if there's an ideal, or optimal way, to initialize this connection.
My thought was to make a simple GET call right when the page loads (grabbing something very small, like a single entry). I probably wouldn't be using the returned data in any useful way, other than to ensure that any calls afterwards come back faster.
Is this an okay way to approach fixing the initial delay? I'd love to hear how others handle this.
Cheers!
There are a number of reasons that your first call could be slower than subsequent ones:
Depending on your server platform, code may be compiled when first executed
You may not have an active DB connection in your connection pool
The database may not have cached indices or data on the first call
Some VM platforms may take a while to allocate sufficient resources to your server if it has been idle for a while.
One way I deal with those types of issues on the server side is to add startup code to my web service that fetches data likely to be used by many callers when the service first initializes (e.g. lookup tables, user credential tables, etc).
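As a sketch of that warm-up idea, assuming a Node service with the `pg` connection pool (the lookup-table query is a stand-in for whatever hot data your service reads most):

```javascript
// Warm the DB pool and prime caches at startup, before the first user call.
const { Pool } = require('pg');
const pool = new Pool(); // connection settings come from environment variables

async function warmUp() {
  await pool.query('SELECT 1');                    // opens a pooled connection
  await pool.query('SELECT * FROM lookup_values'); // primes caches for hot data
}

warmUp().then(
  () => console.log('service warm'),
  (err) => console.error('warm-up failed', err)
);
```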
If you only control the client, consider that you may well wish to monitor server health (I use the open source monitoring platform Zabbix. There are also many commercial web-based monitoring solutions). Exercising the server outside of end-user code is probably better than making an extra GET call from a page that an end user has loaded.

CPU bound/stateful distributed system design

I'm working on a web application frontend to a legacy system which involves a lot of CPU-bound background processing. The application is also stateful on the server side, and the domain objects need to be held in memory across the entire session as the user operates on them via the web-based interface. Think of it as something like a web UI frontend to Photoshop where each filter can take 20-30 seconds to execute on the server side, so the app still has to interact with the user in real time while they wait.
The main problem is that each instance of the server can only support around 4-8 instances of each "workspace" at once, and I need to support a few hundred concurrent users. I'm going to be building this on Amazon EC2 to make use of the auto-scaling functionality. So to summarize, the system is:
A web application frontend to a legacy backend system
Tasks performed are CPU-bound
Stateful, most calls will be some sort of RPC, the user will make multiple actions that interact with the stateful objects held in server side memory
Most tasks are semi-realtime, where they have to execute for 20-30 seconds and return the results to the user in the same session
Uses Amazon AWS auto scaling
I'm wondering what is the best way to make a system like this distributed.
Obviously I will need a web server to interact with the browser and then send the CPU-bound tasks from the web server to a bunch of dedicated servers that do the background processing. The question is how to best hook the two tiers together for my specific needs.
I've been looking at message queue systems such as RabbitMQ, but these seem to be geared towards one-time tasks where any worker node can simply grab a job from a queue, execute it and forget the state. My needs are a little different, since there could be multiple 'tasks' that need to be 'sticky': for example, if step 1 is started on node 1, then step 2 for the same workspace has to go to the same worker process.
Another problem I see is that most worker queue systems seem to be geared towards background tasks that can be processed at any time, rather than the kind of system I'm dealing with, which has to provide user feedback.
My question is, is there an off the shelf solution for something like this that will allow me to easily build a system that can scale? Would love to hear your thoughts.
RabbitMQ has an RPC tutorial. I haven't used this pattern in particular, but I am running RabbitMQ on a couple of nodes and it can handle hundreds of connections and millions of messages. With a little work in monitoring you can detect when there is more work to do than you have consumers for. Messages can also time out, so queues won't back up too greatly. To scale out capacity you can create multiple RabbitMQ nodes/clusters. You could have multiple rounds of RPC so that after the first response you include the information required to get the second message to the correct destination.
0MQ has this as a basic pattern which will fan out work as needed. I've only played with it, but it is simpler to code and possibly simpler to maintain (as it doesn't need a broker; devices can provide one, though). This may not handle stickiness by default, but it should be possible to write your own routing layer to handle it.
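The core of such a routing layer can be as small as hashing the workspace id onto a fixed queue, so every step for one workspace lands on the same worker. A sketch (queue names are illustrative):

```javascript
// Sticky routing: hash the workspace id to pick a worker queue, so step 1
// and step 2 for the same workspace always go to the same consumer.
const crypto = require('crypto');

const workerQueues = ['work.0', 'work.1', 'work.2', 'work.3'];

function queueFor(workspaceId) {
  const digest = crypto.createHash('md5').update(workspaceId).digest();
  return workerQueues[digest.readUInt32BE(0) % workerQueues.length];
}

// Same workspace id always resolves to the same queue:
console.log(queueFor('workspace-42') === queueFor('workspace-42')); // true
```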
Don't discount HTTP for this either. When you want request/reply, a strict throughput per backend node, and something that scales well, HTTP is well supported. With AWS you can use their ELB easily in front of an auto-scaling group to provide the routing from frontend to backend. ELB supports sticky sessions as well.
I'm a big fan of RabbitMQ but if this is the whole scope then HTTP would work nicely and have fewer moving parts in AWS than the other solutions.

User closes the browser without logging out

I am developing a social network in ASP.NET MVC 3. Every user must have the ability to see connected people.
What is the best way to do this?
I added a flag in the table Contact in my database, and I set it to true when the user logs in and set it to false when he logs out.
But the problem with this solution is when the user closes the browser without logging out, he will still remain connected.
The only way to truly know that a user is currently connected is to maintain some sort of connection between the user and the server. Two options immediately come to mind:
Use javascript to periodically call your server using ajax. You would have a special endpoint on your server that would be used to update a "last connected time" status, and you would have a second endpoint for users to poll to see who is online.
Use a websocket to maintain a persistent connection with your server
Option 1 should be fairly easy to implement. The main thing to keep in mind is that this will increase the number of requests coming into your server, and you will have to plan accordingly in order to handle the traffic this could generate. You will have some control over the amount of load on your server by configuring how often the JavaScript timer calls back to your server.
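A minimal sketch of option 1, assuming an Express backend (the route names and the 30-second "online" window are arbitrary choices):

```javascript
// Heartbeat endpoint plus an "online users" query based on last-seen times.
const express = require('express');
const app = express();

const lastSeen = new Map(); // userId -> timestamp in ms

app.post('/heartbeat/:userId', (req, res) => {
  lastSeen.set(req.params.userId, Date.now());
  res.sendStatus(204);
});

app.get('/online', (_req, res) => {
  const cutoff = Date.now() - 30000; // "connected" = seen in the last 30s
  res.json([...lastSeen].filter(([, t]) => t >= cutoff).map(([id]) => id));
});

app.listen(3000);
// Client side, every 10 seconds:
// setInterval(() => fetch('/heartbeat/' + userId, { method: 'POST' }), 10000);
```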
Option 2 could be a little more involved if you did it without library support. Of course there are libraries out there, such as SignalR, that make this really easy to do. This also has an impact on the performance of your site, since each user will be maintaining a persistent connection. The advantage of this approach is that it avoids the polling that option 1 relies on. If you use this approach, it would also be very easy to push a message to user A that user B has gone offline.
I should also mention a really easy third option. If you feel your site is pretty interactive, you could just track the last time each user made a request to your site. This of course may not give you enough accuracy to determine whether a user is "connected".
