Best practice in designing a client/server communication protocol

I am currently integrating server functionality into software that runs a complicated measuring system.
The client will be software from another company that will periodically ask my software for the current state of the system.
Now my question is: what is the best way to design the protocol that provides this state information? There are many different states that have to be transmitted.
I have seen solutions where a set of state flags is defined and then only a single number is transferred, for example a 32-bit value where each bit stands for a different state.
Example:
Bit 0 - System Is online
Bit 1 - Measurement in Progress
Bit 2 - Temperature stabilized
... and so on.
This solution produces very little traffic. However, it seems very inflexible to me and also very hard to debug.
The other way I think it could be done is to transfer each state preceded by the name of that state:
Example:
#SystemOnline#1#MeasurementInProgress#0#TemperatureInProgress#0#.....
This solution produces a lot more traffic, but it appears much more flexible because the order in which each state is transferred is irrelevant. It should also be a lot easier to debug.
Does anybody know from experience a good way to solve this problem, or does anybody know a good source of best practices? I just want to avoid reinventing the wheel.
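As a sketch, the two encodings being compared might look like the following in Python. The state names are taken from the examples above; a real system would have its own list, and the bit assignments are purely illustrative:

```python
# Illustrative state names and bit positions; a real protocol would define its own.
STATE_BITS = {
    "SystemOnline": 0,
    "MeasurementInProgress": 1,
    "TemperatureStabilized": 2,
}

def encode_bitmask(states):
    """Pack a dict of booleans into a single integer (the compact option)."""
    word = 0
    for name, bit in STATE_BITS.items():
        if states.get(name):
            word |= 1 << bit
    return word

def decode_bitmask(word):
    """Recover the named states from the packed integer."""
    return {name: bool((word >> bit) & 1) for name, bit in STATE_BITS.items()}

def encode_named(states):
    """Self-describing variant: '#Name#value#...' as in the question's example."""
    return "".join(f"#{name}#{int(value)}" for name, value in states.items()) + "#"

states = {"SystemOnline": True, "MeasurementInProgress": False, "TemperatureStabilized": True}
packed = encode_bitmask(states)        # -> 5 (bits 0 and 2 set)
assert decode_bitmask(packed) == states
```

The bitmask variant breaks silently if client and server disagree on bit assignments; the named variant carries its own schema in every message, which is what makes it easier to debug on the wire.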

Once you've made a network request to a remote system, waited for the response, and received and decoded it, it hardly matters whether the response is 32 bits or 32 KB. And how many times a second will you be generating this traffic? If less than once, it matters even less. So use whatever is easiest to implement and most natural for the client, be it a string or XML.

Related

Networking - Data To Be Sent To Server

I'm attempting to make my first multiplayer game (I'm doing this in Ruby with Gosu) and I'm wondering what information to send to the server and how many, if any, of the calculations should be done on the server.
Should the client be used simply for input gathering and drawing while leaving the server to compute everything else? Or should it be more evenly distributed than that?
I'm going to answer my own question with some more experience under my belt for the sake of anyone who might be interested or in need of an answer.
It will depend on what you're doing, but for most games it's best practice to have the client gather and send inputs to the server, which then does all the required calculations. This makes it much harder for players to cheat with software such as Cheat Engine, since the only values they'd be able to change would be local variables, which have no bearing on the game.
However, when sending state from the server to the client, be careful not to send too much, as it can create a lot of network overhead. Keep the data transferred to the bare minimum needed. On that same note, though, don't be afraid of adding data to your packets; just make sure you're being efficient.
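A minimal sketch of the input-only idea, in Python with made-up message fields (`seq`, `keys`): the client reports only what the player did, and the server recomputes the authoritative state itself rather than trusting any client-side values.

```python
import json

def make_input_packet(seq, pressed_keys):
    """Client side: serialize only the player's inputs, never game state."""
    return json.dumps({"seq": seq, "keys": sorted(pressed_keys)})

def apply_input(state, packet):
    """Server side: recompute movement from inputs; the client's only
    influence on the game is the keys it claims were pressed."""
    msg = json.loads(packet)
    if "left" in msg["keys"]:
        state["x"] -= 1
    if "right" in msg["keys"]:
        state["x"] += 1
    return state

state = {"x": 0}
state = apply_input(state, make_input_packet(1, {"right"}))  # server moves the player
```

The sequence number is there so the server can discard duplicated or out-of-order packets; a real game would also rate-limit inputs per client.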
Good luck with your projects everyone, and feel free to add to or debate my answer if something isn't up to scratch.

How do you mitigate proposal-number overflow attacks in Byzantine Paxos?

I've been doing a lot of research into Paxos recently, and there's one thing I've always wondered about that I'm not seeing any answers to, which means I have to ask.
Paxos includes an increasing proposal number (and possibly also a separate round number, depending on who wrote the paper you're reading). And of course, two would-be leaders can get into duels where each tries to out-increment the other in a vicious cycle. But as I'm working in a Byzantine, P2P environment, it makes me wonder what to do about proposers that would attempt to set the proposal number extremely high, for example to the maximum 32-bit or 64-bit value.
How should a language-agnostic, platform-agnostic Paxos-based protocol deal with integer maximums for proposal number and/or round number? Especially intentional/malicious cases, which make the modular-arithmetic approach of overflowing back to 0 a bit unattractive?
From what I've read, I think this is still an open question that isn't addressed in literature.
Byzantine Proposer Fast Paxos addresses denial of service, but only of the sort that would delay message sending through attacks not related to flooding with incrementing (proposal) counters.
Having said that, integer overflow is probably the least of your problems. Instead of thinking about integer overflow, you might want to consider membership attacks first (via DoS). Learning about membership after consensus from several nodes may be a viable strategy, but probably still vulnerable to Sybil attacks at some level.
Another strategy may be to incorporate some proof-of-work system for proposals to limit the flood of requests. However, it's difficult to know what to use as a balancing metric (for example, Bitcoin compensates miners with freshly minted currency when they mine a block). It really depends on what type of system you're trying to build. You should consider the value of information in your system, then create a proof-of-work scheme that costs slightly more to circumvent than that information is worth.
However, once you have the ability to slow down a proposal counter, you still need to worry about integer maximums in any system with a high number of (valid) operations. You should have a strategy for number wrapping, or a multiple-precision scheme, in place, where you can clearly determine how many years or decades your network can run before a fixed-precision counter overflows, even with malicious entities. If you can determine that your system will run for 100 years (or whatever) without blowing out your fixed-precision counter, then you can choose to simplify things.
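For the number-wrapping strategy, one well-known technique is serial number arithmetic in the style of RFC 1982 (used for DNS zone serials), which defines "greater than" modulo the counter width so that a counter can wrap past its maximum and keep working. A minimal sketch for 32-bit counters; note this only handles honest wraparound and does nothing by itself against a Byzantine proposer jumping ahead, which is exactly why the rate-limiting discussed above is still needed:

```python
WIDTH = 32
HALF = 1 << (WIDTH - 1)       # half the counter space
MASK = (1 << WIDTH) - 1

def seq_greater(a, b):
    """RFC 1982-style comparison: a is 'after' b if the forward distance
    from b to a (mod 2**WIDTH) is less than half the counter space."""
    return a != b and ((a - b) & MASK) < HALF

assert seq_greater(1, 0)          # normal case
assert seq_greater(0, MASK)       # 0 comes 'after' the maximum value: wraparound works
assert not seq_greater(0, 1)
```

The trade-off is that comparisons between counters more than half the space apart are ambiguous, so the scheme assumes no honest proposer ever lags that far behind.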
On another (important) note, the system model used in most papers doesn't reflect everything that makes a real-life implementation practical (Raft is a nice exception to this). If anything, some authors are guilty of creating a system model that is designed to avoid a hard problem they haven't found an answer to. So, if someone says that X will solve everything, please be aware that they only mean it solves everything in the very specific system model that they defined. On the other side of this, you should consider that the system model is closely tied to any statement that says "Y is impossible". A nice example of this concept is the completely asynchronous message passing of the Ben-Or consensus algorithm, which uses nondeterminism in the system model's state machine to avoid the limits specified by the FLP impossibility result (which says that consensus requires partially synchronous message passing when the system model's state machine is deterministic).
So, you should continue to consider the "impossible" after you read a proof that says it can't be done. Nancy Lynch did a nice writeup on this concept.
I guess what I'm really saying is that a good solution to your question doesn't really exist yet. If you figure it out, please publish it (or let me know if you find an existing paper).

Algorithm for detecting combinations

I am creating a simple intrusion detection system for an Information Security course using jpcap.
One of the features will be remote OS detection, in which I must implement an algorithm that detects when a host sends 5 packets within 20 seconds that have different ACK, SYN, and FIN combinations.
What would be a good method of detecting these different "combinations"? A brute-force algorithm would be time-consuming to implement, but I can't think of a better method.
Notes: jpcap's API allows one to know if the packet is ACK, SYN, and/or FIN. Also note that one doesn't need to know what ACK, SYN, and FIN are in order to understand the problem.
Thanks!
I built my own data structure based on vectors that hold "records" about the type of packet.
You need to keep state for each session, using hashtables: keep each SYN, ACK, and FIN/FIN-ACK. I wrote an open-source IDS sniffer a few years ago that does this; feel free to look at the code. It should be very easy to write an algorithm to do passive OS detection (google it). My open-source code is here: dnasystem

Website Response Times - General Performance Rules

I am currently in the process of performance tuning a Web Application and have been doing some research into what is considered 'Good' performance. I know this depends often on the application being built, target audience, plus many other factors, but wondered if people follow a general set of rules.
There is always the risk with tuning that there is no end to the job, and at some point one has to make a call on when to stop. But when is this? When can we be happy the job is done?
To kick off the discussion, I have been using the following rules, based on the Jakob Nielsen report (http://www.useit.com/alertbox/response-times.html), which says:
The 3 response-time limits are the same today as when I wrote about them in 1993 (based on 40-year-old research by human factors pioneers):
0.1 seconds gives the feeling of instantaneous response — that is, the outcome feels like it was caused by the user, not the computer. This level of responsiveness is essential to support the feeling of direct manipulation (direct manipulation is one of the key GUI techniques to increase user engagement and control — for more about it, see our Principles of Interface Design seminar).
1 second keeps the user's flow of thought seamless. Users can sense a delay, and thus know the computer is generating the outcome, but they still feel in control of the overall experience and that they're moving freely rather than waiting on the computer. This degree of responsiveness is needed for good navigation.
10 seconds keeps the user's attention. From 1–10 seconds, users definitely feel at the mercy of the computer and wish it was faster, but they can handle it. After 10 seconds, they start thinking about other things, making it harder to get their brains back on track once the computer finally does respond.
A 10-second delay will often make users leave a site immediately. And even if they stay, it's harder for them to understand what's going on, making it less likely that they'll succeed in any difficult tasks.
Even a few seconds' delay is enough to create an unpleasant user experience. Users are no longer in control, and they're consciously annoyed by having to wait for the computer. Thus, with repeated short delays, users will give up unless they're extremely committed to completing the task. The result? You can easily lose half your sales (to those less-committed customers) simply because your site is a few seconds too slow for each page.
If you have Apache as your web server, you can use the page-speed module made by Google.
Instead of waiting for developers to change legacy code, make use of the CPU and memory you have available to provide a better UX.
http://code.google.com/speed/page-speed/docs/module.html
It provides solutions to the most common pain points, with immediate effect: no coding, no changes to the legacy code of web applications.
The rules are pretty sensible. Indeed, one should aim for response times of 1 second or less, but sometimes the processing will really take longer (bad design, slow machines, waiting on third parties, intense data processing, etc.). In that case one can use various tips and tricks to improve the user experience:
use caching (both in the browser and for your frequently processed data)
use progressive loading of data via ajax where possible (and use progress indicators to give feedback that things are happening)
use tools such as Firebug, YSlow to detect potential issues with your html design and structure
etc etc
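The first tip, caching, can be sketched in a few lines. This is a toy framework-agnostic example, assuming a hypothetical `render_page()` that does the expensive work; the TTL and header values are illustrative:

```python
import time

_cache = {}          # path -> (expires_at, body)
CACHE_TTL = 60.0     # seconds; tune per page

def handle_request(path, render_page):
    """Serve from the server-side cache when fresh, and tell the browser
    it may cache the response too (Cache-Control: max-age)."""
    now = time.monotonic()
    hit = _cache.get(path)
    if hit and hit[0] > now:
        body = hit[1]                          # cache hit: skip the slow work
    else:
        body = render_page(path)               # the expensive part
        _cache[path] = (now + CACHE_TTL, body)
    headers = {"Cache-Control": f"max-age={int(CACHE_TTL)}"}
    return headers, body
```

Two layers do different jobs: the server-side cache saves CPU across all users, while the `Cache-Control` header removes the round trip entirely for a returning browser.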

Writing fool-proof applications vs writing for performance

I'm biased towards writing fool-proof applications. For example, with a PHP site, I validate all inputs on the client side using JS. On the server side I validate again. On both sides I validate for emptiness and other patterns (email, phone, URL, number, etc.). Then I strip malicious tags or characters and trim the input (server side). Later I convert the input into the desired formats/data types (string, int, float, etc.). If the library is meant for server-side use only, I even give developers a chance at graceful degradation: I tolerate the worst inputs and normalize them to acceptable ones (I have a predefined set of acceptable ones).
Now I'm reading a library that I wrote one and a half years ago, and I'm wondering whether developers are really so careless that I need to do so much graceful degradation, finding every possible way to make them right even when they give crappy input, which seriously harms performance. Or should I do minimal checking and expect developers to be able and willing to give proper input? I have no hope for end users, but should I trust developers more and give them an application/library with better performance?
Common policy is to validate on the server anything sent from the client because you can't be totally sure it really was your client that sent it. You don't want to "trust developers more" and in the process find that you've "trusted hackers of your site more".
Fixing invalid input automatically can be as much a curse as a blessing: you've essentially committed to accepting the invalid input as a valid part of your protocol (i.e., if a future version makes a change that breaks the invalid input you were correcting, it is no longer backwards compatible with the client code that has been written). In extremis, you might paint yourself into a corner that way. Also, invalid calls tend to propagate into new code: people often copy-and-paste example code and then modify it to meet their needs. If they've copied bad code that you've been correcting at the server, you might find you start getting proportionally more and more bad data coming in, as well as confusing new programmers who think "that just doesn't look like it should be right, but it's the example everyone is using — maybe I don't understand this after all".
Never expect diligence from developers. Always validate, if you can, any input that comes into your code, especially if it comes across a network.
End users (whether they're programmers using your tool, or non-programmers using your application) don't have to be stupid or evil to type the wrong thing in. As programmers we all too often make wrong assumptions about what's obvious for them.
That's the first thing, which justifies comprehensive validation all on its own. But validation isn't the same as guessing what they meant from what they typed, and inferring correct input from incorrect - unless the inference rules are also well known to the users (like Word's auto-correct, for instance).
But what is this performance you seek? There's no bit of client-side (or server-side, for that matter) validation that takes longer to run than the second or so that is an acceptable response time.
Validate, and ensure it doesn't break as the first priority. Then worry about making it clever enough to know (reliably) what they meant. After that, worry about how fast it is. In the real world, syntax validation doesn't make a measurable difference to anything where user input takes most of the total time.
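The "validate strictly, don't guess" stance can be sketched concretely. A toy server-side validator in Python, with an illustrative form and a deliberately simple email pattern (real email validation is considerably messier): invalid input is rejected with an explicit error instead of being silently corrected.

```python
import re

# Deliberately simple illustrative pattern: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_signup(form):
    """Return (cleaned, errors). Reject bad input with a clear message
    rather than inferring what the caller probably meant."""
    cleaned, errors = {}, []
    email = form.get("email", "").strip()   # trimming whitespace is the only 'fix' applied
    if not email:
        errors.append("email is required")
    elif not EMAIL_RE.match(email):
        errors.append("email is malformed")
    else:
        cleaned["email"] = email
    try:
        cleaned["age"] = int(form.get("age", ""))
    except ValueError:
        errors.append("age must be an integer")
    return cleaned, errors
```

Trimming whitespace is safe because it is a well-known, documented normalization; anything beyond that (guessing a mistyped domain, coercing "30 years" to 30) would commit the protocol to accepting bad input forever, as the answer above warns.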
Microsoft made the mistake of trusting programmers to do the right thing back in the days of Windows 3.1 and to a lesser extent Windows 95. You need only read a few posts from Raymond Chen to see where that road ultimately leads.
(P.S. This is not a dig against Microsoft - it's a statement of fact about how programmers abused the more liberal Win16, either deliberately or through ignorance.)
I think you are right to be biased toward fool-proof applications. I would not assume that degrades performance enough to be of much concern. Rather, I would address performance concerns separately, starting with profiling or my favorite method, stackshots. There must be a way to get those in PHP.
