NATS IO internal architecture - performance

Currently I am working on migrating TIBCO RV to NATS IO for a project. I am curious to know what internal architecture makes NATS IO superior in terms of performance, as claimed on their website http://nats.io/about/. I couldn't find any resources online explaining the internals of NATS. Could anyone please help me with this?

The protocol documentation references a good overview: a presentation given by Derek Collison, the creator of NATS. He covers some of the performance-critical areas of NATS, including the zero-allocation byte parser, the subject management algorithms, and Go-specific optimizations.
NATS is open source - implementation details can be found in the gnatsd repository. The protocol parser and the subject handling would be a few areas to look at.
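For a flavor of what "zero allocation" means in a byte parser, here is a minimal, illustrative sketch in Go (not the actual gnatsd code): the argument values come back as sub-slices of the input buffer, so the hot path makes no copies and no new allocations.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitPub extracts the subject and size arguments from a
// "PUB <subject> <size>\r\n" protocol line. Returning sub-slices of
// the input keeps the hot path allocation-free; this mirrors the
// idea, not the actual gnatsd parser.
func splitPub(line []byte) (subject, size []byte, ok bool) {
	if !bytes.HasPrefix(line, []byte("PUB ")) {
		return nil, nil, false
	}
	rest := bytes.TrimSuffix(line[4:], []byte("\r\n"))
	i := bytes.IndexByte(rest, ' ')
	if i < 0 {
		return nil, nil, false
	}
	return rest[:i], rest[i+1:], true
}

func main() {
	subject, size, ok := splitPub([]byte("PUB foo.bar 5\r\n"))
	fmt.Println(string(subject), string(size), ok) // foo.bar 5 true
}
```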

I was heavily involved in both RV and, of course, in NATS. I am not claiming that NATS is faster than RV; although I designed and built both, I have not tested RV's performance in many years. NATS should compare well. It is OSS, of course, and has a simple text-based protocol, versus a binary protocol for RV. Also, NATS is an overlay design using TCP/IP, similar to TIBCO's EMS, which I also designed; RV, however, can use multicast (PGM) or reliable broadcast, so RV will be more efficient at large fanout in most cases.
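The text protocol is simple enough to speak by hand. A minimal sketch, assuming a nats-server (gnatsd) listening on the default localhost:4222, that subscribes and publishes over a raw TCP connection; the exact responses depend on server settings:

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "localhost:4222")
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	r := bufio.NewReader(conn)

	info, _ := r.ReadString('\n') // the server greets with INFO {...}
	fmt.Print(info)

	// Subscribe to subject "greet" with sid 1, then publish 5 bytes to it.
	fmt.Fprint(conn, "SUB greet 1\r\n")
	fmt.Fprint(conn, "PUB greet 5\r\nhello\r\n")

	// In verbose mode expect "+OK", "+OK", then "MSG greet 1 5"
	// followed by the payload line.
	for i := 0; i < 4; i++ {
		line, _ := r.ReadString('\n')
		fmt.Print(line)
	}
}
```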

In general, a messaging system's performance is tied to 3 simple things IMO (a sketch of the first and last follows the list):
How many messages can be processed per IO call, i.e., per jump from user space to kernel space.
How fast you can route messages for distribution, e.g., subject-based distributors.
How efficiently the system copies data, coalescing messages to achieve #1.
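A minimal sketch of #1 and #3 in Go (the framing here is made up for illustration): batch many small messages through one buffered writer so they collapse into a few large write(2) syscalls instead of one syscall per message.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"os"
)

// sendCoalesced frames and writes msgs through one buffered writer,
// so many small messages collapse into a few large write(2) syscalls
// instead of paying one user-to-kernel jump per message.
func sendCoalesced(conn net.Conn, msgs [][]byte) error {
	w := bufio.NewWriterSize(conn, 64*1024) // batch up to 64 KiB
	for _, m := range msgs {
		if _, err := fmt.Fprintf(w, "MSG %d\r\n", len(m)); err != nil {
			return err
		}
		if _, err := w.Write(m); err != nil {
			return err
		}
		if _, err := w.WriteString("\r\n"); err != nil {
			return err
		}
	}
	return w.Flush() // one (or few) syscalls for the whole batch
}

func main() {
	client, server := net.Pipe()
	done := make(chan struct{})
	go func() { // drain the other end so the writes can complete
		io.Copy(os.Stdout, server)
		close(done)
	}()
	msgs := [][]byte{[]byte("a"), []byte("bb"), []byte("ccc")}
	if err := sendCoalesced(client, msgs); err != nil {
		panic(err)
	}
	client.Close()
	<-done
}
```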

Related

ZeroMQ: What are the possibilities of packet loss in PUSH/PULL method?

I'm using the ZeroMQ PUSH/PULL technique.
The PUSH socket blocks when no PULL sockets are available.
What are the different scenarios in which there's packet loss, and, if possible, how can we tackle them?
Yes, PUSH-side access may block, but it need not do so:
a distributed system's design ought to anticipate such a state and rather use a non-blocking mode of the send command, .send( payload, zmq.NOBLOCK ) (see the sketch below).
Some further means are available for detailed configuration via .setsockopt( ... ).
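A minimal sketch of the non-blocking variant in Go, assuming the github.com/pebbe/zmq4 binding, where zmq.DONTWAIT plays the role of zmq.NOBLOCK:

```go
package main

import (
	"log"
	"time"

	zmq "github.com/pebbe/zmq4"
)

func main() {
	push, err := zmq.NewSocket(zmq.PUSH)
	if err != nil {
		log.Fatal(err)
	}
	defer push.Close()

	// setsockopt-style tuning: cap the send queue and the send timeout.
	push.SetSndhwm(1000)
	push.SetSndtimeo(2 * time.Second)

	if err := push.Connect("tcp://localhost:5555"); err != nil {
		log.Fatal(err)
	}

	// With DONTWAIT the send returns an EAGAIN error instead of
	// blocking while no PULL peer is ready; the application decides
	// whether to queue, retry, or drop the message.
	if _, err := push.Send("payload", zmq.DONTWAIT); err != nil {
		log.Printf("send would block or failed, handle it: %v", err)
	}
}
```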
Zero-compromise:
There is no such thing as packet loss in the ZeroMQ Scalable Formal Communication Pattern design. Either ZeroMQ delivers a complete message payload, or nothing at all. There is zero alternative to this, so packet loss does not fit the set of ZeroMQ ideas and principles implemented under the hood. The .Context() instance is capable of handling most low-level transport and signalling situations, re-connecting lost abstract-socket transports and delivering either the data or nothing. Zero compromise on this.
The best next step:
As the messaging landscape is rich in features and new design paradigms, the best next step is to learn more about the technology and about designing such distributed systems. You may enjoy other ZeroMQ-related posts here, and the PDF book linked there, with many design issues and approaches to solutions from Pieter Hintjens himself, is worth one's time to read.

ZeroMQ: Can I use a ROUTER and a DEALER as server / client, instead of using them as proxies?

I have a server/client application, which uses the REQ/REP formal pattern, and I know this is synchronous.
Can I completely replace zmq.REQ / zmq.REP with zmq.ROUTER and zmq.DEALER?
Or do these have to be used only as intermediate proxies?
ZeroMQ is a box with a few smart and powerful building blocks.
However, only the Architect and the Designer decide how well or how poorly these get harnessed in a distributed application's architecture.
So synchronicity or asynchronicity is not an inherent feature of some particular ZeroMQ Scalable Formal Communication Pattern's access node; it depends on the real deployment, within some larger context of use.
Yes, ROUTER can talk to DEALER (a sketch follows), but ...
as one may read in detail in the ZeroMQ API specification tables, so-called compatible socket archetypes are listed for each named socket type. However, anyone can grasp much stronger powers from ZeroMQ by adopting the ZeroMQ way of thinking: spend more time on the ZeroMQ concept and its set of Zero-maxims: Zero-copy + (almost) Zero-latency + Zero-warranty + (almost) Zero-scaling degradation, etc.
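A minimal ROUTER/DEALER round trip as server/client, assuming the github.com/pebbe/zmq4 Go binding (both sockets live in one process here only to keep the sketch self-contained):

```go
package main

import (
	"fmt"
	"log"

	zmq "github.com/pebbe/zmq4"
)

func main() {
	// ROUTER plays the server: every inbound message arrives prefixed
	// with the sending peer's identity frame, so replies can be
	// addressed to any peer, in any order, asynchronously.
	router, err := zmq.NewSocket(zmq.ROUTER)
	if err != nil {
		log.Fatal(err)
	}
	defer router.Close()
	if err := router.Bind("tcp://127.0.0.1:5560"); err != nil {
		log.Fatal(err)
	}

	// DEALER plays the client: unlike REQ it enforces no strict
	// send/recv lockstep, so many requests can be in flight at once.
	dealer, err := zmq.NewSocket(zmq.DEALER)
	if err != nil {
		log.Fatal(err)
	}
	defer dealer.Close()
	if err := dealer.Connect("tcp://127.0.0.1:5560"); err != nil {
		log.Fatal(err)
	}

	dealer.SendMessage("hello")

	// ROUTER receives [identity, payload] and routes the reply back.
	msg, err := router.RecvMessage(0)
	if err != nil {
		log.Fatal(err)
	}
	router.SendMessage(msg[0], "re: "+msg[1])

	reply, _ := dealer.RecvMessage(0)
	fmt.Println(reply[0]) // "re: hello"
}
```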
The best next step:
IMHO, if you are serious about professional messaging, get the great book and source from it both the knowledge of elementary setups and the somewhat more complex multi-socket messaging layer designs with soft signaling, plus the further thoughts about the great powers of concurrent, heterogeneous, distributed processing, to advance your learning curve.
Pieter Hintjens' book "Code Connected, Volume 1" (available in PDF) is more than a recommended source for your issue.
There you will get the grounds for your further use of ZeroMQ.
ZeroMQ is a great tool, not just for the messaging layer itself, and worth the time and effort.

Cluster Computing in Go

Is there a framework for cluster computing in Go? (I wish to bring together multiple PCs for custom parallel computation, and wonder whether Go might be a suitable language to use.)
I don't know the level of connectedness you plan to have in your cluster, but Go's RPC package makes communication among nodes trivial. It will likely serve as the backbone of your work, and you can build abstractions on top of it (for instance, if you need to multicast requests to different nodes). The examples given in the doc assume your nodes will communicate over HTTP, but that bit is abstracted out in net/rpc to allow different transports.
http://golang.org/pkg/net/rpc/
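For illustration, a minimal net/rpc service and client in one process, following the conventions from the package docs (in a real cluster the client would of course dial another node):

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"net/rpc"
)

// Args and Arith follow net/rpc's convention: an exported type with
// exported methods of the form func (t *T) Method(args *A, reply *R) error.
type Args struct {
	A, B int
}

type Arith struct{}

func (t *Arith) Multiply(args *Args, reply *int) error {
	*reply = args.A * args.B
	return nil
}

func main() {
	// Server side: register the service and serve RPC over HTTP.
	rpc.Register(new(Arith))
	rpc.HandleHTTP()
	l, err := net.Listen("tcp", "127.0.0.1:1234")
	if err != nil {
		log.Fatal(err)
	}
	go http.Serve(l, nil)

	// Client side: dial the node and make a synchronous call.
	client, err := rpc.DialHTTP("tcp", "127.0.0.1:1234")
	if err != nil {
		log.Fatal(err)
	}
	var reply int
	if err := client.Call("Arith.Multiply", &Args{A: 7, B: 6}, &reply); err != nil {
		log.Fatal(err)
	}
	fmt.Println(reply) // 42
}
```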
You can use Hadoop Streaming with Go. See a (somewhat dated) example here.
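Hadoop Streaming only requires an executable that reads lines on stdin and emits tab-separated key/value pairs on stdout, so a Go mapper can be as small as this word-count sketch:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// A word-count mapper for Hadoop Streaming: read lines on stdin,
// emit "word\t1" on stdout. A reducer then sums the counts per word.
func main() {
	in := bufio.NewScanner(os.Stdin)
	out := bufio.NewWriter(os.Stdout)
	defer out.Flush()
	for in.Scan() {
		for _, word := range strings.Fields(in.Text()) {
			fmt.Fprintf(out, "%s\t1\n", word)
		}
	}
}
```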
You should have a look at Go Circuit.
Quoting from the introduction:
The circuit reduces the human development and sustenance costs of complex massively-scaled systems nearly to the level of their single-process counterparts. ...
... and:
For instance, we have been able to write large real-world cloud applications — e.g. streaming multi-stage MapReduce pipelines — in as many as 200 lines of code from the ground up.
Also, for some simpler use cases, you might want to check out Golem.
You can try https://github.com/bketelsen/skynet. This is a service-oriented framework based on Doozer.

Algorithm for detecting combinations

I am creating a simple intrusion detection system for an Information Security course using jpcap.
One of the features will be remote OS detection, in which I must implement an algorithm that detects when a host sends 5 packets within 20 seconds that have different ACK, SYN, and FIN combinations.
What would be a good method of detecting these different "combinations"? A brute-force algorithm would be time-consuming to implement, but I can't think of a better method.
Notes: jpcap's API allows one to know if the packet is ACK, SYN, and/or FIN. Also note that one doesn't need to know what ACK, SYN, and FIN are in order to understand the problem.
Thanks!
I built my own data structure based on vectors that hold "records" about the type of packet.
You need to keep state on each session, using hashtables. Keep each SYN, ACK, and FIN/FIN-ACK. I wrote an open-source IDS sniffer a few years ago that does this; feel free to look at the code. It should be very easy to write an algorithm to do passive OS detection (google it). My open-source code is here: dnasystem. A sketch of the sliding-window combination check follows.
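As a concrete sketch of that idea in Go (types, names, and thresholds are illustrative, not jpcap's): pack the ACK/SYN/FIN bits into a 3-bit mask, keep a per-host sliding window of the last 20 seconds in a hashtable, and flag a host once 5 packets in the window carry distinct masks. Counting distinct combinations with an 8-bit set avoids any brute-force enumeration.

```go
package main

import (
	"fmt"
	"time"
)

// flagMask packs the ACK/SYN/FIN bits into 3 bits, giving the 8
// possible combinations.
func flagMask(ack, syn, fin bool) uint8 {
	var m uint8
	if ack {
		m |= 1
	}
	if syn {
		m |= 2
	}
	if fin {
		m |= 4
	}
	return m
}

type stamped struct {
	at   time.Time
	mask uint8
}

// Detector keeps, per source host, the packets seen within `window`.
// A host is flagged when at least `threshold` packets inside the
// window carry pairwise-distinct flag combinations.
type Detector struct {
	window    time.Duration
	threshold int
	perHost   map[string][]stamped
}

func NewDetector() *Detector {
	return &Detector{window: 20 * time.Second, threshold: 5,
		perHost: map[string][]stamped{}}
}

// Observe records one packet (timestamps assumed non-decreasing) and
// reports whether the host has crossed the threshold.
func (d *Detector) Observe(host string, ack, syn, fin bool, at time.Time) bool {
	recent := d.perHost[host]
	// Drop entries that fell out of the 20-second window.
	i := 0
	for ; i < len(recent); i++ {
		if at.Sub(recent[i].at) <= d.window {
			break
		}
	}
	recent = append(recent[i:], stamped{at, flagMask(ack, syn, fin)})
	d.perHost[host] = recent

	// Count distinct combinations with an 8-bit set: no brute force.
	var seen uint8
	distinct := 0
	for _, p := range recent {
		if bit := uint8(1) << p.mask; seen&bit == 0 {
			seen |= bit
			distinct++
		}
	}
	return distinct >= d.threshold
}

func main() {
	d := NewDetector()
	t := time.Now()
	combos := [][3]bool{{false, true, false}, {true, true, false},
		{true, false, false}, {true, false, true}, {false, false, true}}
	for i, c := range combos {
		hit := d.Observe("10.0.0.7", c[0], c[1], c[2], t.Add(time.Duration(i)*time.Second))
		fmt.Println(i+1, "packets, flagged:", hit)
	}
}
```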

Would you recommend Google Protocol Buffers or Caucho Hessian for a cross-language over-the-wire binary format?

Would you recommend Google Protocol Buffers or Caucho Hessian for a cross-language over-the-wire binary format? Or anything else, for that matter - Facebook Thrift for example?
We use Caucho Hessian because of the reduced integration costs and simplicity. Its performance is very good, so it's perfect for most cases.
For a few apps where cross-language integration is not that important, there's an even faster library called Kryo that can squeeze out even more performance.
Unfortunately it's not that widely used, and its protocol is not a quasi-standard like Hessian's.
It depends on the use case. PB is much more tightly coupled, best used internally with closely coupled systems; it is not good for shared/public interfaces (as in interfaces shared between more than 2 specific systems).
Hessian is a bit more self-descriptive and has nice performance on Java. It was better than PB in my tests, but I'm sure that depends on the use case. PB seems to have trouble with textual data; perhaps it has been optimized for integer data.
I don't think either is particularly good for public interfaces, but given you want binary format, that is probably not a big problem.
EDIT: Hessian performance is actually not all that good, per the jvm-serializers benchmark. And PB is pretty fast as long as you make sure to add the flag that forces the use of fast options on Java.
And if PB is not good for public interfaces, what is? IMO, open formats like JSON are superior externally, and more often than not fast enough that performance does not matter a lot.
For me, Caucho Hessian is the best.
It is very easy to get started, and the performance is good. I have tested it locally: the latency is about 3 ms, and on a LAN you can expect about 10 ms.
With Hessian you don't have to write another file to define the model (we are using Java + Java). This saves a lot of time for development and maintenance.
If you need support to interconnect apps across many languages/platforms, then Hessian is the best. If you use only Java, then Kryo is even faster.
I'm looking into this myself... no good conclusions so far, but I found http://dewpoint.snagdata.com/2008/10/21/google-protocol-buffers/ which summarizes all the options.
Muscle has a binary message transport. Sorry that I can't comment on the others as I haven't tried them.
I tried Google Protocol Buffers. It works with C++/MFC, C#, PHP, and more languages (see: http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns) and works really well regardless of transport, as well as for saving to and loading from disk.
I would say that Protocol Buffers, Thrift, and Hessian are fairly similar as far as their binary formats are concerned, in that they all provide cross-language serialization support. The inherent serialization might show some small performance differences between them (size/space tradeoffs), but this is not the most important thing. Protocol Buffers is certainly a well-performing, IDL-defined format with features for extensibility which make it attractive.
HOWEVER, the use of "over-the-wire" in the question implies the use of a communications library. Here Google has provided an interface definition for protobuf RPC, which is equivalent to making a specification where all implementation details are left to the implementer. This is unfortunate because it means there is de facto NO cross-language implementation - unless you can find one among the third-party add-ons, probably mentioned at http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns. I have seen some RPC implementations which support Java and C, or C and C++, or Python and C, etc., but here you just have to find a library which satisfies your concrete requirements and evaluate it; otherwise you're likely to be disappointed. (At least I was disappointed enough to write protobuf-rpc-pro.)
Kryo is a serialization library like protobuf, but Java only. Kryo/Net is a Java-only RPC implementation using Kryo messages. So it's not a good choice for cross-language communication.
Today it would seem that ICE (http://www.zeroc.com/), and Thrift, which provides an RPC implementation out of the box, are the best cross-language RPC implementations out there.
